This is the start of the stable review cycle for the 5.15.95 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Feb 2023 13:35:35 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.95-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.95-rc1
Arnd Bergmann arnd@arndb.de platform/x86/amd: pmc: add CONFIG_SERIO dependency
Dan Carpenter error27@gmail.com net: sched: sch: Fix off by one in htb_activate_prios()
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak
Thomas Gleixner tglx@linutronix.de alarmtimer: Prevent starvation by small intervals and SIG_IGN
Greg Kroah-Hartman gregkh@linuxfoundation.org kvm: initialize all of the kvm_debugregs structure before sending it to userspace
Pedro Tammela pctammela@mojatatu.com net/sched: tcindex: search key must be 16 bits
Natalia Petrova n.petrova@fintech.ru i40e: Add checking for null for nlmsg_find_attr()
Pedro Tammela pctammela@mojatatu.com net/sched: act_ctinfo: use percpu stats
Baowen Zheng baowen.zheng@corigine.com flow_offload: fill flags to action structure
Matt Roper matthew.d.roper@intel.com drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
Raviteja Goud Talla ravitejax.goud.talla@intel.com drm/i915/gen11: Moving WAs to icl_gt_workarounds_init()
Qian Yingjin qian@ddn.com mm/filemap: fix page end in filemap_get_read_batch
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix underflow in second superblock position calculations
Guillaume Nault gnault@redhat.com ipv6: Fix tcp socket connection with DSCP.
Guillaume Nault gnault@redhat.com ipv6: Fix datagram socket connection with DSCP.
Jason Xing kernelxing@tencent.com ixgbe: add double of VLAN header when computing the max MTU
Jakub Kicinski kuba@kernel.org net: mpls: fix stale pointer if allocation fails during device rename
Cristian Ciocaltea cristian.ciocaltea@collabora.com net: stmmac: Restrict warning on disabling DMA store and fwd mode
Michael Chan michael.chan@broadcom.com bnxt_en: Fix mqprio and XDP ring checking logic
Johannes Zink j.zink@pengutronix.de net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence
Hangyu Hua hbh25y@gmail.com net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()
Miko Larsson mikoxyzzz@gmail.com net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path
Kuniyuki Iwashima kuniyu@amazon.com dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions.
Pedro Tammela pctammela@mojatatu.com net/sched: tcindex: update imperfect hash filters respecting rcu
Pietro Borrello borrello@diag.uniroma1.it sctp: sctp_sock_filter(): avoid list_entry() on possibly empty list
Siddharth Vadapalli s-vadapalli@ti.com net: ethernet: ti: am65-cpsw: Add RX DMA Channel Teardown Quirk
Rafał Miłecki rafal@milecki.pl net: bgmac: fix BCM5358 support by setting correct flags
Jason Xing kernelxing@tencent.com i40e: add double of VLAN header when computing the max MTU
Jason Xing kernelxing@tencent.com ixgbe: allow to increase MTU to 3K with XDP enabled
Andrew Morton akpm@linux-foundation.org revert "squashfs: harden sanity check in squashfs_read_xattr_id_table"
Felix Riemann felix.riemann@sma.de net: Fix unwanted sign extension in netdev_stats_to_stats64()
Aaron Thompson dev@aaront.org Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
Misono Tomohiro misono.tomohiro@jp.fujitsu.com selftest/lkdtm: Skip stack-entropy test if lkdtm is not available
Isaac J. Manjarres isaacmanjarres@google.com of: reserved_mem: Have kmemleak ignore dynamically allocated reserved mem
Mike Kravetz mike.kravetz@oracle.com hugetlb: check for undefined shift on 32 bit architectures
Munehisa Kamata kamatam@amazon.com sched/psi: Fix use-after-free in ep_remove_wait_queue()
Kailang Yang kailang@realtek.com ALSA: hda/realtek - fixed wrong gpio assigned
Bo Liu bo.liu@senarytech.com ALSA: hda/conexant: add a new hda codec SN6180
Yang Yingliang yangyingliang@huawei.com mmc: mmc_spi: fix error handling in mmc_spi_probe()
Yang Yingliang yangyingliang@huawei.com mmc: sdio: fix possible resource leaks in some error paths
Paul Cercueil paul@crapouillou.net mmc: jz4740: Work around bug on JZ4760(B)
Kuniyuki Iwashima kuniyu@amazon.com tcp: Fix listen() regression in 5.15.88.
Florian Westphal fw@strlen.de netfilter: nft_tproxy: restrict to prerouting hook
Mario Limonciello mario.limonciello@amd.com platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN
Mario Limonciello mario.limonciello@amd.com platform/x86: amd-pmc: Correct usage of SMU version
Hans de Goede hdegoede@redhat.com platform/x86: amd-pmc: Fix compilation when CONFIG_DEBUGFS is disabled
Sanket Goswami Sanket.Goswami@amd.com platform/x86: amd-pmc: Export Idlemask values based on the APU
Leo Li sunpeng.li@amd.com drm/amd/display: Fail atomic_check early on normalize_zpos error
Seth Jenkins sethjenkins@google.com aio: fix mremap after fork null-deref
Paolo Abeni pabeni@redhat.com mptcp: do not wait for bare sockets' timeout
Darrick J. Wong djwong@kernel.org xfs: don't leak btree cursor when insrec fails after a split
Darrick J. Wong djwong@kernel.org xfs: purge dquots after inode walk fails during quotacheck
Dave Chinner dchinner@redhat.com xfs: assert in xfs_btree_del_cursor should take into account error
Dave Chinner dchinner@redhat.com xfs: don't assert fail on perag references on teardown
Dave Chinner dchinner@redhat.com xfs: avoid unnecessary runtime sibling pointer endian conversions
Dave Chinner dchinner@redhat.com xfs: validate v5 feature fields
Dave Chinner dchinner@redhat.com xfs: set XFS_FEAT_NLINK correctly
Dave Chinner dchinner@redhat.com xfs: detect self referencing btree sibling pointers
Dave Chinner dchinner@redhat.com xfs: fix potential log item leak
Dave Chinner dchinner@redhat.com xfs: zero inode fork buffer at allocation
Russell King (Oracle) rmk+kernel@armlinux.org.uk nvmem: core: fix return value
Russell King (Oracle) rmk+kernel@armlinux.org.uk nvmem: core: fix registration vs use race
Russell King (Oracle) rmk+kernel@armlinux.org.uk nvmem: core: fix cleanup after dev_set_name()
Gaosheng Cui cuigaosheng1@huawei.com nvmem: core: add error handling for dev_set_name
Hans de Goede hdegoede@redhat.com platform/x86: touchscreen_dmi: Add Chuwi Vi8 (CWI501) DMI match
Alex Deucher alexander.deucher@amd.com drm/amd/display: Properly handle additional cases where DCN is not supported
Amit Engel Amit.Engel@dell.com nvme-fc: fix a missing queue put in nvmet_fc_ls_create_association
Vasily Gorbik gor@linux.ibm.com s390/decompressor: specify __decompress() buf len to avoid overflow
Kees Cook keescook@chromium.org net: sched: sch: Bounds check priority
Ben Skeggs bskeggs@redhat.com drm/nouveau/devinit/tu102-: wait for GFW_BOOT_PROGRESS == COMPLETED
Andrey Konovalov andrey.konovalov@linaro.org net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoC
Hyunwoo Kim v4bel@theori.io net/rose: Fix to not accept on connected socket
Shunsuke Mie mie@igel.co.jp tools/virtio: fix the vringh test for virtio ring changes
Arnd Bergmann arnd@arndb.de ASoC: cs42l56: fix DT probe
Jakub Sitnicki jakub@cloudflare.com bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself
Cezary Rojewski cezary.rojewski@intel.com ALSA: hda: Do not unset preset when cleaning up codec
Eduard Zingerman eddyz87@gmail.com selftests/bpf: Verify copy_register_state() preserves parent/live fields
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof_cs42l42: always set dpcm_capture for amplifiers
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof_rt5682: always set dpcm_capture for amplifiers
Mario Limonciello mario.limonciello@amd.com ACPI / x86: Add support for LPS0 callback handler
Guo Ren guoren@linux.alibaba.com riscv: kprobe: Fixup misaligned load text
Masami Hiramatsu mhiramat@kernel.org kprobes: treewide: Cleanup the error messages for kprobes
Paolo Abeni pabeni@redhat.com mptcp: fix locking for in-kernel listener creation
-------------
Diffstat:
Makefile | 4 +- arch/arm/probes/kprobes/core.c | 4 +- arch/arm64/kernel/probes/kprobes.c | 5 +- arch/csky/kernel/probes/kprobes.c | 10 +- arch/mips/kernel/kprobes.c | 11 +- arch/riscv/kernel/probes/kprobes.c | 17 +- arch/s390/boot/compressed/decompressor.c | 2 +- arch/s390/kernel/kprobes.c | 4 +- arch/x86/kvm/x86.c | 3 +- drivers/acpi/x86/s2idle.c | 40 +++++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +- drivers/gpu/drm/i915/gt/intel_workarounds.c | 32 ++-- .../gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c | 23 +++ drivers/mmc/core/sdio_bus.c | 17 +- drivers/mmc/core/sdio_cis.c | 12 -- drivers/mmc/host/jz4740_mmc.c | 10 ++ drivers/mmc/host/mmc_spi.c | 8 +- drivers/net/ethernet/broadcom/bgmac-bcma.c | 6 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 +- drivers/net/ethernet/intel/i40e/i40e_main.c | 4 +- drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 + drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 28 ++-- .../ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c | 2 + drivers/net/ethernet/stmicro/stmmac/dwmac5.c | 3 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 +- .../net/ethernet/stmicro/stmmac/stmmac_platform.c | 2 +- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 12 +- drivers/net/ethernet/ti/am65-cpsw-nuss.h | 1 + drivers/net/usb/kalmia.c | 8 +- drivers/nvme/target/fc.c | 4 +- drivers/nvmem/core.c | 41 ++--- drivers/of/of_reserved_mem.c | 3 +- drivers/platform/x86/Kconfig | 1 + drivers/platform/x86/amd-pmc.c | 116 ++++++++++++++ drivers/platform/x86/touchscreen_dmi.c | 9 ++ fs/aio.c | 4 + fs/nilfs2/ioctl.c | 7 + fs/nilfs2/super.c | 9 ++ fs/nilfs2/the_nilfs.c | 8 +- fs/squashfs/xattr_id.c | 2 +- fs/xfs/libxfs/xfs_ag.c | 3 +- fs/xfs/libxfs/xfs_btree.c | 175 ++++++++++++++++----- fs/xfs/libxfs/xfs_inode_fork.c | 12 +- fs/xfs/libxfs/xfs_sb.c | 70 +++++++-- fs/xfs/xfs_bmap_item.c | 2 + fs/xfs/xfs_icreate_item.c | 1 + fs/xfs/xfs_qm.c | 9 +- fs/xfs/xfs_refcount_item.c | 2 + fs/xfs/xfs_rmap_item.c | 2 + include/linux/acpi.h | 10 +- include/linux/hugetlb.h | 5 +- include/linux/stmmac.h | 1 + include/net/sock.h | 13 ++ kernel/kprobes.c | 36 ++--- kernel/sched/psi.c | 7 +- kernel/time/alarmtimer.c | 33 +++- mm/filemap.c | 5 +- mm/memblock.c | 8 +- net/core/dev.c | 2 +- net/core/sock_map.c | 61 +++---- net/dccp/ipv6.c | 7 +- net/ipv4/inet_connection_sock.c | 1 + net/ipv6/datagram.c | 2 +- net/ipv6/tcp_ipv6.c | 11 +- net/mpls/af_mpls.c | 4 + net/mptcp/pm_netlink.c | 10 +- net/mptcp/protocol.c | 9 ++ net/mptcp/subflow.c | 2 +- net/netfilter/nft_tproxy.c | 8 + net/openvswitch/meter.c | 4 +- net/rose/af_rose.c | 8 + net/sched/act_bpf.c | 2 +- net/sched/act_connmark.c | 2 +- net/sched/act_ctinfo.c | 6 +- net/sched/act_gate.c | 2 +- net/sched/act_ife.c | 2 +- net/sched/act_ipt.c | 2 +- net/sched/act_mpls.c | 2 +- net/sched/act_nat.c | 2 +- net/sched/act_pedit.c | 2 +- net/sched/act_police.c | 2 +- net/sched/act_sample.c | 2 +- net/sched/act_simple.c | 2 +- net/sched/act_skbedit.c | 2 +- net/sched/act_skbmod.c | 2 +- net/sched/cls_tcindex.c | 34 +++- net/sched/sch_htb.c | 5 +- net/sctp/diag.c | 4 +- sound/pci/hda/hda_bind.c | 2 + sound/pci/hda/hda_codec.c | 1 - sound/pci/hda/patch_conexant.c | 1 + sound/pci/hda/patch_realtek.c | 2 +- sound/soc/codecs/cs42l56.c | 6 - sound/soc/intel/boards/sof_cs42l42.c | 3 + sound/soc/intel/boards/sof_rt5682.c | 5 +- sound/soc/sof/intel/hda-dai.c | 8 +- .../selftests/bpf/verifier/search_pruning.c | 36 +++++ tools/testing/selftests/lkdtm/stack-entropy.sh | 16 +- tools/virtio/linux/bug.h | 8 +- tools/virtio/linux/build_bug.h | 7 + tools/virtio/linux/cpumask.h | 7 + tools/virtio/linux/gfp.h | 7 + tools/virtio/linux/kernel.h | 1 + tools/virtio/linux/kmsan.h | 12 ++ tools/virtio/linux/scatterlist.h | 1 + tools/virtio/linux/topology.h | 7 + 106 files changed, 935 insertions(+), 295 deletions(-)
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit ad2171009d968104ccda9dc517f5a3ba891515db ]
For consistency, in mptcp_pm_nl_create_listen_socket(), we need to call the __mptcp_nmpc_socket() under the msk socket lock.
Note that as a side effect, mptcp_subflow_create_socket() needs a 'nested' lockdep annotation, as it will acquire the subflow (kernel) socket lock under the in-kernel listener msk socket lock.
The current lack of locking is almost harmless, because the relevant socket is not exposed to the user space, but in future we will add more complexity to the mentioned helper, let's play safe.
Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/pm_netlink.c | 10 ++++++---- net/mptcp/subflow.c | 2 +- 2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 2b1b40199c617..3a1e8f2388665 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -891,8 +891,8 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, { int addrlen = sizeof(struct sockaddr_in); struct sockaddr_storage addr; - struct mptcp_sock *msk; struct socket *ssock; + struct sock *newsk; int backlog = 1024; int err;
@@ -901,13 +901,15 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, if (err) return err;
- msk = mptcp_sk(entry->lsk->sk); - if (!msk) { + newsk = entry->lsk->sk; + if (!newsk) { err = -EINVAL; goto out; }
- ssock = __mptcp_nmpc_socket(msk); + lock_sock(newsk); + ssock = __mptcp_nmpc_socket(mptcp_sk(newsk)); + release_sock(newsk); if (!ssock) { err = -EINVAL; goto out; diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 15dbaa202c7cf..b0e9548f00bf1 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1570,7 +1570,7 @@ int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock) if (err) return err;
- lock_sock(sf->sk); + lock_sock_nested(sf->sk, SINGLE_DEPTH_NESTING);
/* the newly created socket has to be in the same cgroup as its parent */ mptcp_attach_cgroup(sk, sf->sk);
From: Masami Hiramatsu mhiramat@kernel.org
[ Upstream commit 9c89bb8e327203bc27e09ebd82d8f61ac2ae8b24 ]
This clean up the error/notification messages in kprobes related code. Basically this defines 'pr_fmt()' macros for each files and update the messages which describes
- what happened, - what is the kernel going to do or not do, - is the kernel fine, - what can the user do about it.
Also, if the message is not needed (e.g. the function returns unique error code, or other error message is already shown.) remove it, and replace the message with WARN_*() macros if suitable.
Link: https://lkml.kernel.org/r/163163036568.489837.14085396178727185469.stgit@dev...
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Stable-dep-of: eb7423273cc9 ("riscv: kprobe: Fixup misaligned load text") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/probes/kprobes/core.c | 4 +++- arch/arm64/kernel/probes/kprobes.c | 5 ++++- arch/csky/kernel/probes/kprobes.c | 10 ++++----- arch/mips/kernel/kprobes.c | 11 +++++---- arch/riscv/kernel/probes/kprobes.c | 11 +++++---- arch/s390/kernel/kprobes.c | 4 +++- kernel/kprobes.c | 36 +++++++++++++----------------- 7 files changed, 41 insertions(+), 40 deletions(-)
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c index 9d8634e2f12f7..9bcae72dda440 100644 --- a/arch/arm/probes/kprobes/core.c +++ b/arch/arm/probes/kprobes/core.c @@ -11,6 +11,8 @@ * Copyright (C) 2007 Marvell Ltd. */
+#define pr_fmt(fmt) "kprobes: " fmt + #include <linux/kernel.h> #include <linux/kprobes.h> #include <linux/module.h> @@ -278,7 +280,7 @@ void __kprobes kprobe_handler(struct pt_regs *regs) break; case KPROBE_REENTER: /* A nested probe was hit in FIQ, it is a BUG */ - pr_warn("Unrecoverable kprobe detected.\n"); + pr_warn("Failed to recover from reentered kprobes.\n"); dump_kprobe(p); fallthrough; default: diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c index b7404dba0d623..2162b6fd7251d 100644 --- a/arch/arm64/kernel/probes/kprobes.c +++ b/arch/arm64/kernel/probes/kprobes.c @@ -7,6 +7,9 @@ * Copyright (C) 2013 Linaro Limited. * Author: Sandeepa Prabhu sandeepa.prabhu@linaro.org */ + +#define pr_fmt(fmt) "kprobes: " fmt + #include <linux/extable.h> #include <linux/kasan.h> #include <linux/kernel.h> @@ -218,7 +221,7 @@ static int __kprobes reenter_kprobe(struct kprobe *p, break; case KPROBE_HIT_SS: case KPROBE_REENTER: - pr_warn("Unrecoverable kprobe detected.\n"); + pr_warn("Failed to recover from reentered kprobes.\n"); dump_kprobe(p); BUG(); break; diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c index 584ed9f36290f..bd92ac376e157 100644 --- a/arch/csky/kernel/probes/kprobes.c +++ b/arch/csky/kernel/probes/kprobes.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0+
+#define pr_fmt(fmt) "kprobes: " fmt + #include <linux/kprobes.h> #include <linux/extable.h> #include <linux/slab.h> @@ -77,10 +79,8 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) { unsigned long probe_addr = (unsigned long)p->addr;
- if (probe_addr & 0x1) { - pr_warn("Address not aligned.\n"); - return -EINVAL; - } + if (probe_addr & 0x1) + return -EILSEQ;
/* copy instruction */ p->opcode = le32_to_cpu(*p->addr); @@ -229,7 +229,7 @@ static int __kprobes reenter_kprobe(struct kprobe *p, break; case KPROBE_HIT_SS: case KPROBE_REENTER: - pr_warn("Unrecoverable kprobe detected.\n"); + pr_warn("Failed to recover from reentered kprobes.\n"); dump_kprobe(p); BUG(); break; diff --git a/arch/mips/kernel/kprobes.c b/arch/mips/kernel/kprobes.c index 75bff0f773198..b0934a0d7aedd 100644 --- a/arch/mips/kernel/kprobes.c +++ b/arch/mips/kernel/kprobes.c @@ -11,6 +11,8 @@ * Copyright (C) IBM Corporation, 2002, 2004 */
+#define pr_fmt(fmt) "kprobes: " fmt + #include <linux/kprobes.h> #include <linux/preempt.h> #include <linux/uaccess.h> @@ -80,8 +82,7 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) insn = p->addr[0];
if (insn_has_ll_or_sc(insn)) { - pr_notice("Kprobes for ll and sc instructions are not" - "supported\n"); + pr_notice("Kprobes for ll and sc instructions are not supported\n"); ret = -EINVAL; goto out; } @@ -219,7 +220,7 @@ static int evaluate_branch_instruction(struct kprobe *p, struct pt_regs *regs, return 0;
unaligned: - pr_notice("%s: unaligned epc - sending SIGBUS.\n", current->comm); + pr_notice("Failed to emulate branch instruction because of unaligned epc - sending SIGBUS to %s.\n", current->comm); force_sig(SIGBUS); return -EFAULT;
@@ -238,10 +239,8 @@ static void prepare_singlestep(struct kprobe *p, struct pt_regs *regs, regs->cp0_epc = (unsigned long)p->addr; else if (insn_has_delayslot(p->opcode)) { ret = evaluate_branch_instruction(p, regs, kcb); - if (ret < 0) { - pr_notice("Kprobes: Error in evaluating branch\n"); + if (ret < 0) return; - } } regs->cp0_epc = (unsigned long)&p->ainsn.insn[0]; } diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c index 125241ce82d6a..b53aa0209e079 100644 --- a/arch/riscv/kernel/probes/kprobes.c +++ b/arch/riscv/kernel/probes/kprobes.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0+
+#define pr_fmt(fmt) "kprobes: " fmt + #include <linux/kprobes.h> #include <linux/extable.h> #include <linux/slab.h> @@ -65,11 +67,8 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) { unsigned long probe_addr = (unsigned long)p->addr;
- if (probe_addr & 0x1) { - pr_warn("Address not aligned.\n"); - - return -EINVAL; - } + if (probe_addr & 0x1) + return -EILSEQ;
if (!arch_check_kprobe(p)) return -EILSEQ; @@ -209,7 +208,7 @@ static int __kprobes reenter_kprobe(struct kprobe *p, break; case KPROBE_HIT_SS: case KPROBE_REENTER: - pr_warn("Unrecoverable kprobe detected.\n"); + pr_warn("Failed to recover from reentered kprobes.\n"); dump_kprobe(p); BUG(); break; diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c index 52d056a5f89fc..952d44b0610b0 100644 --- a/arch/s390/kernel/kprobes.c +++ b/arch/s390/kernel/kprobes.c @@ -7,6 +7,8 @@ * s390 port, used ppc64 as template. Mike Grundy grundym@us.ibm.com */
+#define pr_fmt(fmt) "kprobes: " fmt + #include <linux/moduleloader.h> #include <linux/kprobes.h> #include <linux/ptrace.h> @@ -259,7 +261,7 @@ static void kprobe_reenter_check(struct kprobe_ctlblk *kcb, struct kprobe *p) * is a BUG. The code path resides in the .kprobes.text * section and is executed with interrupts disabled. */ - pr_err("Invalid kprobe detected.\n"); + pr_err("Failed to recover from reentered kprobes.\n"); dump_kprobe(p); BUG(); } diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 23af2f8e8563e..8818f3a89fef3 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -18,6 +18,9 @@ * jkenisto@us.ibm.com and Prasanna S Panchamukhi * prasanna@in.ibm.com added function-return probes. */ + +#define pr_fmt(fmt) "kprobes: " fmt + #include <linux/kprobes.h> #include <linux/hash.h> #include <linux/init.h> @@ -892,7 +895,7 @@ static void optimize_all_kprobes(void) optimize_kprobe(p); } cpus_read_unlock(); - printk(KERN_INFO "Kprobes globally optimized\n"); + pr_info("kprobe jump-optimization is enabled. All kprobes are optimized if possible.\n"); out: mutex_unlock(&kprobe_mutex); } @@ -925,7 +928,7 @@ static void unoptimize_all_kprobes(void)
/* Wait for unoptimizing completion */ wait_for_kprobe_optimizer(); - printk(KERN_INFO "Kprobes globally unoptimized\n"); + pr_info("kprobe jump-optimization is disabled. All kprobes are based on software breakpoint.\n"); }
static DEFINE_MUTEX(kprobe_sysctl_mutex); @@ -1003,7 +1006,7 @@ static int reuse_unused_kprobe(struct kprobe *ap) * unregistered. * Thus there should be no chance to reuse unused kprobe. */ - printk(KERN_ERR "Error: There should be no unused kprobe here.\n"); + WARN_ON_ONCE(1); return -EINVAL; }
@@ -1049,18 +1052,13 @@ static int __arm_kprobe_ftrace(struct kprobe *p, struct ftrace_ops *ops, int ret = 0;
ret = ftrace_set_filter_ip(ops, (unsigned long)p->addr, 0, 0); - if (ret) { - pr_debug("Failed to arm kprobe-ftrace at %pS (%d)\n", - p->addr, ret); + if (WARN_ONCE(ret < 0, "Failed to arm kprobe-ftrace at %pS (error %d)\n", p->addr, ret)) return ret; - }
if (*cnt == 0) { ret = register_ftrace_function(ops); - if (ret) { - pr_debug("Failed to init kprobe-ftrace (%d)\n", ret); + if (WARN(ret < 0, "Failed to register kprobe-ftrace (error %d)\n", ret)) goto err_ftrace; - } }
(*cnt)++; @@ -1092,14 +1090,14 @@ static int __disarm_kprobe_ftrace(struct kprobe *p, struct ftrace_ops *ops,
if (*cnt == 1) { ret = unregister_ftrace_function(ops); - if (WARN(ret < 0, "Failed to unregister kprobe-ftrace (%d)\n", ret)) + if (WARN(ret < 0, "Failed to unregister kprobe-ftrace (error %d)\n", ret)) return ret; }
(*cnt)--;
ret = ftrace_set_filter_ip(ops, (unsigned long)p->addr, 1, 0); - WARN_ONCE(ret < 0, "Failed to disarm kprobe-ftrace at %pS (%d)\n", + WARN_ONCE(ret < 0, "Failed to disarm kprobe-ftrace at %pS (error %d)\n", p->addr, ret); return ret; } @@ -1894,7 +1892,7 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
node = node->next; } - pr_err("Oops! Kretprobe fails to find correct return address.\n"); + pr_err("kretprobe: Return address not found, not execute handler. Maybe there is a bug in the kernel.\n"); BUG_ON(1);
found: @@ -2229,8 +2227,7 @@ EXPORT_SYMBOL_GPL(enable_kprobe); /* Caller must NOT call this in usual path. This is only for critical case */ void dump_kprobe(struct kprobe *kp) { - pr_err("Dumping kprobe:\n"); - pr_err("Name: %s\nOffset: %x\nAddress: %pS\n", + pr_err("Dump kprobe:\n.symbol_name = %s, .offset = %x, .addr = %pS\n", kp->symbol_name, kp->offset, kp->addr); } NOKPROBE_SYMBOL(dump_kprobe); @@ -2493,8 +2490,7 @@ static int __init init_kprobes(void) err = populate_kprobe_blacklist(__start_kprobe_blacklist, __stop_kprobe_blacklist); if (err) { - pr_err("kprobes: failed to populate blacklist: %d\n", err); - pr_err("Please take care of using kprobes.\n"); + pr_err("Failed to populate blacklist (error %d), kprobes not restricted, be careful using them!\n", err); }
if (kretprobe_blacklist_size) { @@ -2503,7 +2499,7 @@ static int __init init_kprobes(void) kretprobe_blacklist[i].addr = kprobe_lookup_name(kretprobe_blacklist[i].name, 0); if (!kretprobe_blacklist[i].addr) - printk("kretprobe: lookup failed: %s\n", + pr_err("Failed to lookup symbol '%s' for kretprobe blacklist. Maybe the target function is removed or renamed.\n", kretprobe_blacklist[i].name); } } @@ -2707,7 +2703,7 @@ static int arm_all_kprobes(void) }
if (errors) - pr_warn("Kprobes globally enabled, but failed to arm %d out of %d probes\n", + pr_warn("Kprobes globally enabled, but failed to enable %d out of %d probes. Please check which kprobes are kept disabled via debugfs.\n", errors, total); else pr_info("Kprobes globally enabled\n"); @@ -2750,7 +2746,7 @@ static int disarm_all_kprobes(void) }
if (errors) - pr_warn("Kprobes globally disabled, but failed to disarm %d out of %d probes\n", + pr_warn("Kprobes globally disabled, but failed to disable %d out of %d probes. Please check which kprobes are kept enabled via debugfs.\n", errors, total); else pr_info("Kprobes globally disabled\n");
From: Guo Ren guoren@linux.alibaba.com
[ Upstream commit eb7423273cc9922ee2d05bf660c034d7d515bb91 ]
The current kprobe would cause a misaligned load for the probe point. This patch fixup it with two half-word loads instead.
Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Guo Ren guoren@linux.alibaba.com Signed-off-by: Guo Ren guoren@kernel.org Link: https://lore.kernel.org/linux-riscv/878rhig9zj.fsf@all.your.base.are.belong.... Reported-by: Bjorn Topel bjorn.topel@gmail.com Reviewed-by: Björn Töpel bjorn@kernel.org Link: https://lore.kernel.org/r/20230204063531.740220-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/probes/kprobes.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c index b53aa0209e079..7548b1d62509c 100644 --- a/arch/riscv/kernel/probes/kprobes.c +++ b/arch/riscv/kernel/probes/kprobes.c @@ -65,16 +65,18 @@ static bool __kprobes arch_check_kprobe(struct kprobe *p)
int __kprobes arch_prepare_kprobe(struct kprobe *p) { - unsigned long probe_addr = (unsigned long)p->addr; + u16 *insn = (u16 *)p->addr;
- if (probe_addr & 0x1) + if ((unsigned long)insn & 0x1) return -EILSEQ;
if (!arch_check_kprobe(p)) return -EILSEQ;
/* copy instruction */ - p->opcode = *p->addr; + p->opcode = (kprobe_opcode_t)(*insn++); + if (GET_INSN_LENGTH(p->opcode) == 4) + p->opcode |= (kprobe_opcode_t)(*insn) << 16;
/* decode instruction */ switch (riscv_probe_decode_insn(p->addr, &p->ainsn.api)) {
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 20e1d6402a71dba7ad2b81f332a3c14c7d3b939b ]
Currenty the latest thing run during a suspend to idle attempt is the LPS0 `prepare_late` callback and the earliest thing is the `resume_early` callback.
There is a desire for the `amd-pmc` driver to suspend later in the suspend process (ideally the very last thing), so create a callback that it or any other driver can hook into to do this.
Signed-off-by: Mario Limonciello mario.limonciello@amd.com Acked-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Link: https://lore.kernel.org/r/20220317141445.6498-1-mario.limonciello@amd.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Stable-dep-of: 8e60615e8932 ("platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/s2idle.c | 40 +++++++++++++++++++++++++++++++++++++++ include/linux/acpi.h | 10 +++++++++- 2 files changed, 49 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index 2af1ae1721021..4a11a38764321 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -86,6 +86,8 @@ struct lpi_device_constraint_amd { int min_dstate; };
+static LIST_HEAD(lps0_s2idle_devops_head); + static struct lpi_constraints *lpi_constraints_table; static int lpi_constraints_table_size; static int rev_id; @@ -434,6 +436,8 @@ static struct acpi_scan_handler lps0_handler = {
int acpi_s2idle_prepare_late(void) { + struct acpi_s2idle_dev_ops *handler; + if (!lps0_device_handle || sleep_no_lps0) return 0;
@@ -464,14 +468,26 @@ int acpi_s2idle_prepare_late(void) acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY, lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); } + + list_for_each_entry(handler, &lps0_s2idle_devops_head, list_node) { + if (handler->prepare) + handler->prepare(); + } + return 0; }
void acpi_s2idle_restore_early(void) { + struct acpi_s2idle_dev_ops *handler; + if (!lps0_device_handle || sleep_no_lps0) return;
+ list_for_each_entry(handler, &lps0_s2idle_devops_head, list_node) + if (handler->restore) + handler->restore(); + /* Modern standby exit */ if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_EXIT, @@ -514,4 +530,28 @@ void acpi_s2idle_setup(void) s2idle_set_ops(&acpi_s2idle_ops_lps0); }
+int acpi_register_lps0_dev(struct acpi_s2idle_dev_ops *arg) +{ + if (!lps0_device_handle || sleep_no_lps0) + return -ENODEV; + + lock_system_sleep(); + list_add(&arg->list_node, &lps0_s2idle_devops_head); + unlock_system_sleep(); + + return 0; +} +EXPORT_SYMBOL_GPL(acpi_register_lps0_dev); + +void acpi_unregister_lps0_dev(struct acpi_s2idle_dev_ops *arg) +{ + if (!lps0_device_handle || sleep_no_lps0) + return; + + lock_system_sleep(); + list_del(&arg->list_node); + unlock_system_sleep(); +} +EXPORT_SYMBOL_GPL(acpi_unregister_lps0_dev); + #endif /* CONFIG_SUSPEND */ diff --git a/include/linux/acpi.h b/include/linux/acpi.h index 6224b1e32681c..2d7df5cea2494 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -1005,7 +1005,15 @@ void acpi_os_set_prepare_extended_sleep(int (*func)(u8 sleep_state,
acpi_status acpi_os_prepare_extended_sleep(u8 sleep_state, u32 val_a, u32 val_b); - +#ifdef CONFIG_X86 +struct acpi_s2idle_dev_ops { + struct list_head list_node; + void (*prepare)(void); + void (*restore)(void); +}; +int acpi_register_lps0_dev(struct acpi_s2idle_dev_ops *arg); +void acpi_unregister_lps0_dev(struct acpi_s2idle_dev_ops *arg); +#endif /* CONFIG_X86 */ #ifndef CONFIG_IA64 void arch_reserve_mem_area(acpi_physical_address addr, size_t size); #else
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 324f065cdbaba1b879a63bf07e61ca156b789537 ]
The amplifier may provide hardware support for I/V feedback, or alternatively the firmware may generate an echo reference attached to the SSP and dailink used for the amplifier.
To avoid any issues with invalid/NULL substreams in the latter case, always unconditionally set dpcm_capture.
Link: https://github.com/thesofproject/linux/issues/4083 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Link: https://lore.kernel.org/r/20230119163459.2235843-2-kai.vehmanen@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/sof_rt5682.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/soc/intel/boards/sof_rt5682.c b/sound/soc/intel/boards/sof_rt5682.c index f096bd6d69be7..d0ce2f06b30c6 100644 --- a/sound/soc/intel/boards/sof_rt5682.c +++ b/sound/soc/intel/boards/sof_rt5682.c @@ -737,8 +737,6 @@ static struct snd_soc_dai_link *sof_card_dai_links_create(struct device *dev, links[id].num_codecs = ARRAY_SIZE(max_98373_components); links[id].init = max_98373_spk_codec_init; links[id].ops = &max_98373_ops; - /* feedback stream */ - links[id].dpcm_capture = 1; } else if (sof_rt5682_quirk & SOF_MAX98360A_SPEAKER_AMP_PRESENT) { max_98360a_dai_link(&links[id]); @@ -751,6 +749,9 @@ static struct snd_soc_dai_link *sof_card_dai_links_create(struct device *dev, links[id].platforms = platform_component; links[id].num_platforms = ARRAY_SIZE(platform_component); links[id].dpcm_playback = 1; + /* feedback stream or firmware-generated echo reference */ + links[id].dpcm_capture = 1; + links[id].no_pcm = 1; links[id].cpus = &cpus[id]; links[id].num_cpus = 1;
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit e0a52220344ab7defe25b9cdd58fe1dc1122e67c ]
The amplifier may provide hardware support for I/V feedback, or alternatively the firmware may generate an echo reference attached to the SSP and dailink used for the amplifier.
To avoid any issues with invalid/NULL substreams in the latter case, always unconditionally set dpcm_capture.
Link: https://github.com/thesofproject/linux/issues/4083 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Link: https://lore.kernel.org/r/20230119163459.2235843-3-kai.vehmanen@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/sof_cs42l42.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/intel/boards/sof_cs42l42.c b/sound/soc/intel/boards/sof_cs42l42.c index ce78c18798876..8061082d9fbf3 100644 --- a/sound/soc/intel/boards/sof_cs42l42.c +++ b/sound/soc/intel/boards/sof_cs42l42.c @@ -311,6 +311,9 @@ static int create_spk_amp_dai_links(struct device *dev, links[*id].platforms = platform_component; links[*id].num_platforms = ARRAY_SIZE(platform_component); links[*id].dpcm_playback = 1; + /* firmware-generated echo reference */ + links[*id].dpcm_capture = 1; + links[*id].no_pcm = 1; links[*id].cpus = &cpus[*id]; links[*id].num_cpus = 1;
From: Eduard Zingerman eddyz87@gmail.com
[ Upstream commit b9fa9bc839291020b362ab5392e5f18ba79657ac ]
A testcase to check that verifier.c:copy_register_state() preserves register parentage chain and livness information.
Signed-off-by: Eduard Zingerman eddyz87@gmail.com Link: https://lore.kernel.org/r/20230106142214.1040390-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/bpf/verifier/search_pruning.c | 36 +++++++++++++++++++ 1 file changed, 36 insertions(+)
diff --git a/tools/testing/selftests/bpf/verifier/search_pruning.c b/tools/testing/selftests/bpf/verifier/search_pruning.c index 7e50cb80873a5..7e36078f8f482 100644 --- a/tools/testing/selftests/bpf/verifier/search_pruning.c +++ b/tools/testing/selftests/bpf/verifier/search_pruning.c @@ -154,3 +154,39 @@ .result_unpriv = ACCEPT, .insn_processed = 15, }, +/* The test performs a conditional 64-bit write to a stack location + * fp[-8], this is followed by an unconditional 8-bit write to fp[-8], + * then data is read from fp[-8]. This sequence is unsafe. + * + * The test would be mistakenly marked as safe w/o dst register parent + * preservation in verifier.c:copy_register_state() function. + * + * Note the usage of BPF_F_TEST_STATE_FREQ to force creation of the + * checkpoint state after conditional 64-bit assignment. + */ +{ + "write tracking and register parent chain bug", + .insns = { + /* r6 = ktime_get_ns() */ + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_MOV64_REG(BPF_REG_6, BPF_REG_0), + /* r0 = ktime_get_ns() */ + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + /* if r0 > r6 goto +1 */ + BPF_JMP_REG(BPF_JGT, BPF_REG_0, BPF_REG_6, 1), + /* *(u64 *)(r10 - 8) = 0xdeadbeef */ + BPF_ST_MEM(BPF_DW, BPF_REG_FP, -8, 0xdeadbeef), + /* r1 = 42 */ + BPF_MOV64_IMM(BPF_REG_1, 42), + /* *(u8 *)(r10 - 8) = r1 */ + BPF_STX_MEM(BPF_B, BPF_REG_FP, BPF_REG_1, -8), + /* r2 = *(u64 *)(r10 - 8) */ + BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_FP, -8), + /* exit(0) */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .flags = BPF_F_TEST_STATE_FREQ, + .errstr = "invalid read from stack off -8+1 size 8", + .result = REJECT, +},
From: Cezary Rojewski cezary.rojewski@intel.com
[ Upstream commit 87978e6ad45a16835cc58234451111091be3c59a ]
Several functions that take part in codec's initialization and removal are re-used by ASoC codec drivers implementations. Drivers mimic the behavior of hda_codec_driver_probe/remove() found in sound/pci/hda/hda_bind.c with their component->probe/remove() instead.
One of the reasons for that is the expectation of snd_hda_codec_device_new() to receive a valid pointer to an instance of struct snd_card. This expectation can be met only once sound card components probing commences.
As ASoC sound card may be unbound without codec device being actually removed from the system, unsetting ->preset in snd_hda_codec_cleanup_for_unbind() interferes with module unload -> load scenario causing null-ptr-deref. Preset is assigned only once, during device/driver matching whereas ASoC codec driver's module reloading may occur several times throughout the lifetime of an audio stack.
Suggested-by: Takashi Iwai tiwai@suse.com Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com Link: https://lore.kernel.org/r/20230119143235.1159814-1-cezary.rojewski@intel.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_bind.c | 2 ++ sound/pci/hda/hda_codec.c | 1 - 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/sound/pci/hda/hda_bind.c b/sound/pci/hda/hda_bind.c index 7af2515735957..8e35009ec25cb 100644 --- a/sound/pci/hda/hda_bind.c +++ b/sound/pci/hda/hda_bind.c @@ -144,6 +144,7 @@ static int hda_codec_driver_probe(struct device *dev)
error: snd_hda_codec_cleanup_for_unbind(codec); + codec->preset = NULL; return err; }
@@ -166,6 +167,7 @@ static int hda_codec_driver_remove(struct device *dev) if (codec->patch_ops.free) codec->patch_ops.free(codec); snd_hda_codec_cleanup_for_unbind(codec); + codec->preset = NULL; module_put(dev->driver->owner); return 0; } diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index f552785d301e0..19be60bb57810 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -791,7 +791,6 @@ void snd_hda_codec_cleanup_for_unbind(struct hda_codec *codec) snd_array_free(&codec->cvt_setups); snd_array_free(&codec->spdif_out); snd_array_free(&codec->verbs); - codec->preset = NULL; codec->follower_dig_outs = NULL; codec->spdif_status_reset = 0; snd_array_free(&codec->mixers);
From: Jakub Sitnicki jakub@cloudflare.com
[ Upstream commit 5b4a79ba65a1ab479903fff2e604865d229b70a9 ]
sock_map proto callbacks should never call themselves by design. Protect against bugs like [1] and break out of the recursive loop to avoid a stack overflow in favor of a resource leak.
[1] https://lore.kernel.org/all/00000000000073b14905ef2e7401@google.com/
Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Jakub Sitnicki jakub@cloudflare.com Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/r/20230113-sockmap-fix-v2-1-1e0ee7ac2f90@cloudflare.... Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/sock_map.c | 61 +++++++++++++++++++++++++-------------------- 1 file changed, 34 insertions(+), 27 deletions(-)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c index ae6013a8bce53..86b4e8909ad1e 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -1514,15 +1514,16 @@ void sock_map_unhash(struct sock *sk) psock = sk_psock(sk); if (unlikely(!psock)) { rcu_read_unlock(); - if (sk->sk_prot->unhash) - sk->sk_prot->unhash(sk); - return; + saved_unhash = READ_ONCE(sk->sk_prot)->unhash; + } else { + saved_unhash = psock->saved_unhash; + sock_map_remove_links(sk, psock); + rcu_read_unlock(); } - - saved_unhash = psock->saved_unhash; - sock_map_remove_links(sk, psock); - rcu_read_unlock(); - saved_unhash(sk); + if (WARN_ON_ONCE(saved_unhash == sock_map_unhash)) + return; + if (saved_unhash) + saved_unhash(sk); } EXPORT_SYMBOL_GPL(sock_map_unhash);
@@ -1535,17 +1536,18 @@ void sock_map_destroy(struct sock *sk) psock = sk_psock_get(sk); if (unlikely(!psock)) { rcu_read_unlock(); - if (sk->sk_prot->destroy) - sk->sk_prot->destroy(sk); - return; + saved_destroy = READ_ONCE(sk->sk_prot)->destroy; + } else { + saved_destroy = psock->saved_destroy; + sock_map_remove_links(sk, psock); + rcu_read_unlock(); + sk_psock_stop(psock); + sk_psock_put(sk, psock); } - - saved_destroy = psock->saved_destroy; - sock_map_remove_links(sk, psock); - rcu_read_unlock(); - sk_psock_stop(psock); - sk_psock_put(sk, psock); - saved_destroy(sk); + if (WARN_ON_ONCE(saved_destroy == sock_map_destroy)) + return; + if (saved_destroy) + saved_destroy(sk); } EXPORT_SYMBOL_GPL(sock_map_destroy);
@@ -1560,16 +1562,21 @@ void sock_map_close(struct sock *sk, long timeout) if (unlikely(!psock)) { rcu_read_unlock(); release_sock(sk); - return sk->sk_prot->close(sk, timeout); + saved_close = READ_ONCE(sk->sk_prot)->close; + } else { + saved_close = psock->saved_close; + sock_map_remove_links(sk, psock); + rcu_read_unlock(); + sk_psock_stop(psock); + release_sock(sk); + cancel_work_sync(&psock->work); + sk_psock_put(sk, psock); } - - saved_close = psock->saved_close; - sock_map_remove_links(sk, psock); - rcu_read_unlock(); - sk_psock_stop(psock); - release_sock(sk); - cancel_work_sync(&psock->work); - sk_psock_put(sk, psock); + /* Make sure we do not recurse. This is a bug. + * Leak the socket instead of crashing on a stack overflow. + */ + if (WARN_ON_ONCE(saved_close == sock_map_close)) + return; saved_close(sk, timeout); } EXPORT_SYMBOL_GPL(sock_map_close);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit e18c6da62edc780e4f4f3c9ce07bdacd69505182 ]
While looking through legacy platform data users, I noticed that the DT probing never uses data from the DT properties, as the platform_data structure gets overwritten directly after it is initialized.
There have never been any boards defining the platform_data in the mainline kernel either, so this driver so far only worked with patched kernels or with the default values.
For the benefit of possible downstream users, fix the DT probe by no longer overwriting the data.
Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://lore.kernel.org/r/20230126162203.2986339-1-arnd@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/cs42l56.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/sound/soc/codecs/cs42l56.c b/sound/soc/codecs/cs42l56.c index b39c25409c239..f0af8c18e5efa 100644 --- a/sound/soc/codecs/cs42l56.c +++ b/sound/soc/codecs/cs42l56.c @@ -1193,18 +1193,12 @@ static int cs42l56_i2c_probe(struct i2c_client *i2c_client, if (pdata) { cs42l56->pdata = *pdata; } else { - pdata = devm_kzalloc(&i2c_client->dev, sizeof(*pdata), - GFP_KERNEL); - if (!pdata) - return -ENOMEM; - if (i2c_client->dev.of_node) { ret = cs42l56_handle_of_data(i2c_client, &cs42l56->pdata); if (ret != 0) return ret; } - cs42l56->pdata = *pdata; }
if (cs42l56->pdata.gpio_nreset) {
From: Shunsuke Mie mie@igel.co.jp
[ Upstream commit 3f7b75abf41cc4143aa295f62acbb060a012868d ]
Fix the build caused by missing kmsan_handle_dma() and is_power_of_2() that are used in drivers/virtio/virtio_ring.c.
Signed-off-by: Shunsuke Mie mie@igel.co.jp Message-Id: 20230110034310.779744-1-mie@igel.co.jp Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/virtio/linux/bug.h | 8 +++----- tools/virtio/linux/build_bug.h | 7 +++++++ tools/virtio/linux/cpumask.h | 7 +++++++ tools/virtio/linux/gfp.h | 7 +++++++ tools/virtio/linux/kernel.h | 1 + tools/virtio/linux/kmsan.h | 12 ++++++++++++ tools/virtio/linux/scatterlist.h | 1 + tools/virtio/linux/topology.h | 7 +++++++ 8 files changed, 45 insertions(+), 5 deletions(-) create mode 100644 tools/virtio/linux/build_bug.h create mode 100644 tools/virtio/linux/cpumask.h create mode 100644 tools/virtio/linux/gfp.h create mode 100644 tools/virtio/linux/kmsan.h create mode 100644 tools/virtio/linux/topology.h
diff --git a/tools/virtio/linux/bug.h b/tools/virtio/linux/bug.h index 813baf13f62a2..51a919083d9b8 100644 --- a/tools/virtio/linux/bug.h +++ b/tools/virtio/linux/bug.h @@ -1,13 +1,11 @@ /* SPDX-License-Identifier: GPL-2.0 */ -#ifndef BUG_H -#define BUG_H +#ifndef _LINUX_BUG_H +#define _LINUX_BUG_H
#include <asm/bug.h>
#define BUG_ON(__BUG_ON_cond) assert(!(__BUG_ON_cond))
-#define BUILD_BUG_ON(x) - #define BUG() abort()
-#endif /* BUG_H */ +#endif /* _LINUX_BUG_H */ diff --git a/tools/virtio/linux/build_bug.h b/tools/virtio/linux/build_bug.h new file mode 100644 index 0000000000000..cdbb75e28a604 --- /dev/null +++ b/tools/virtio/linux/build_bug.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_BUILD_BUG_H +#define _LINUX_BUILD_BUG_H + +#define BUILD_BUG_ON(x) + +#endif /* _LINUX_BUILD_BUG_H */ diff --git a/tools/virtio/linux/cpumask.h b/tools/virtio/linux/cpumask.h new file mode 100644 index 0000000000000..307da69d6b26c --- /dev/null +++ b/tools/virtio/linux/cpumask.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_CPUMASK_H +#define _LINUX_CPUMASK_H + +#include <linux/kernel.h> + +#endif /* _LINUX_CPUMASK_H */ diff --git a/tools/virtio/linux/gfp.h b/tools/virtio/linux/gfp.h new file mode 100644 index 0000000000000..43d146f236f14 --- /dev/null +++ b/tools/virtio/linux/gfp.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_GFP_H +#define __LINUX_GFP_H + +#include <linux/topology.h> + +#endif diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h index 0b493542e61a6..a4beb719d2174 100644 --- a/tools/virtio/linux/kernel.h +++ b/tools/virtio/linux/kernel.h @@ -10,6 +10,7 @@ #include <stdarg.h>
#include <linux/compiler.h> +#include <linux/log2.h> #include <linux/types.h> #include <linux/overflow.h> #include <linux/list.h> diff --git a/tools/virtio/linux/kmsan.h b/tools/virtio/linux/kmsan.h new file mode 100644 index 0000000000000..272b5aa285d5a --- /dev/null +++ b/tools/virtio/linux/kmsan.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_KMSAN_H +#define _LINUX_KMSAN_H + +#include <linux/gfp.h> + +inline void kmsan_handle_dma(struct page *page, size_t offset, size_t size, + enum dma_data_direction dir) +{ +} + +#endif /* _LINUX_KMSAN_H */ diff --git a/tools/virtio/linux/scatterlist.h b/tools/virtio/linux/scatterlist.h index 369ee308b6686..74d9e1825748e 100644 --- a/tools/virtio/linux/scatterlist.h +++ b/tools/virtio/linux/scatterlist.h @@ -2,6 +2,7 @@ #ifndef SCATTERLIST_H #define SCATTERLIST_H #include <linux/kernel.h> +#include <linux/bug.h>
struct scatterlist { unsigned long page_link; diff --git a/tools/virtio/linux/topology.h b/tools/virtio/linux/topology.h new file mode 100644 index 0000000000000..910794afb993a --- /dev/null +++ b/tools/virtio/linux/topology.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_TOPOLOGY_H +#define _LINUX_TOPOLOGY_H + +#include <linux/cpumask.h> + +#endif /* _LINUX_TOPOLOGY_H */
From: Hyunwoo Kim v4bel@theori.io
[ Upstream commit 14caefcf9837a2be765a566005ad82cd0d2a429f ]
If you call listen() and accept() on an already connect()ed rose socket, accept() can successfully connect. This is because when the peer socket sends data to sendmsg, the skb with its own sk stored in the connected socket's sk->sk_receive_queue is connected, and rose_accept() dequeues the skb waiting in the sk->sk_receive_queue.
This creates a child socket with the sk of the parent rose socket, which can cause confusion.
Fix rose_listen() to return -EINVAL if the socket has already been successfully connected, and add lock_sock to prevent this issue.
Signed-off-by: Hyunwoo Kim v4bel@theori.io Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://lore.kernel.org/r/20230125105944.GA133314@ubuntu Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/rose/af_rose.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c index 29a208ed8fb88..86c93cf1744b0 100644 --- a/net/rose/af_rose.c +++ b/net/rose/af_rose.c @@ -487,6 +487,12 @@ static int rose_listen(struct socket *sock, int backlog) { struct sock *sk = sock->sk;
+ lock_sock(sk); + if (sock->state != SS_UNCONNECTED) { + release_sock(sk); + return -EINVAL; + } + if (sk->sk_state != TCP_LISTEN) { struct rose_sock *rose = rose_sk(sk);
@@ -496,8 +502,10 @@ static int rose_listen(struct socket *sock, int backlog) memset(rose->dest_digis, 0, AX25_ADDR_LEN * ROSE_MAX_DIGIS); sk->sk_max_ack_backlog = backlog; sk->sk_state = TCP_LISTEN; + release_sock(sk); return 0; } + release_sock(sk);
return -EOPNOTSUPP; }
From: Andrey Konovalov andrey.konovalov@linaro.org
[ Upstream commit 54aa39a513dbf2164ca462a19f04519b2407a224 ]
Currently in phy_init_eee() the driver unconditionally configures the PHY to stop RX_CLK after entering Rx LPI state. This causes an LPI interrupt storm on my qcs404-base board.
Change the PHY initialization so that for "qcom,qcs404-ethqos" compatible device RX_CLK continues to run even in Rx LPI state.
Signed-off-by: Andrey Konovalov andrey.konovalov@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++- include/linux/stmmac.h | 1 + 3 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c index 6b1d9e8879f46..d0c7f22a4e55a 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c @@ -505,6 +505,8 @@ static int qcom_ethqos_probe(struct platform_device *pdev) plat_dat->has_gmac4 = 1; plat_dat->pmt = 1; plat_dat->tso_en = of_property_read_bool(np, "snps,tso"); + if (of_device_is_compatible(np, "qcom,qcs404-ethqos")) + plat_dat->rx_clk_runs_in_lpi = 1;
ret = stmmac_dvr_probe(&pdev->dev, plat_dat, &stmmac_res); if (ret) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 4191502d6472f..d56f65338ea66 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1174,7 +1174,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
stmmac_mac_set(priv, priv->ioaddr, true); if (phy && priv->dma_cap.eee) { - priv->eee_active = phy_init_eee(phy, 1) >= 0; + priv->eee_active = + phy_init_eee(phy, !priv->plat->rx_clk_runs_in_lpi) >= 0; priv->eee_enabled = stmmac_eee_init(priv); priv->tx_lpi_enabled = priv->eee_enabled; stmmac_set_eee_pls(priv, priv->hw, true); diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h index 48d015ed21752..cc338c6c74954 100644 --- a/include/linux/stmmac.h +++ b/include/linux/stmmac.h @@ -251,6 +251,7 @@ struct plat_stmmacenet_data { int rss_en; int mac_port_sel_speed; bool en_tx_lpi_clockgating; + bool rx_clk_runs_in_lpi; int has_xgmac; bool vlan_fail_q_en; u8 vlan_fail_q;
From: Ben Skeggs bskeggs@redhat.com
[ Upstream commit d22915d22ded21fd5b24b60d174775789f173997 ]
Starting from Turing, the driver is no longer responsible for initiating DEVINIT when required as the GPU started loading a FW image from ROM and executing DEVINIT itself after power-on.
However - we apparently still need to wait for it to complete.
This should correct some issues with runpm on some systems, where we get control of the HW before it's been fully reinitialised after resume from suspend.
Signed-off-by: Ben Skeggs bskeggs@redhat.com Reviewed-by: Lyude Paul lyude@redhat.com Signed-off-by: Lyude Paul lyude@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230130223715.1831509-1-bskeg... Signed-off-by: Sasha Levin sashal@kernel.org --- .../drm/nouveau/nvkm/subdev/devinit/tu102.c | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c index 634f64f88fc8b..81a1ad2c88a7e 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c @@ -65,10 +65,33 @@ tu102_devinit_pll_set(struct nvkm_devinit *init, u32 type, u32 freq) return ret; }
+static int +tu102_devinit_wait(struct nvkm_device *device) +{ + unsigned timeout = 50 + 2000; + + do { + if (nvkm_rd32(device, 0x118128) & 0x00000001) { + if ((nvkm_rd32(device, 0x118234) & 0x000000ff) == 0xff) + return 0; + } + + usleep_range(1000, 2000); + } while (timeout--); + + return -ETIMEDOUT; +} + int tu102_devinit_post(struct nvkm_devinit *base, bool post) { struct nv50_devinit *init = nv50_devinit(base); + int ret; + + ret = tu102_devinit_wait(init->base.subdev.device); + if (ret) + return ret; + gm200_devinit_preos(init, post); return 0; }
From: Kees Cook keescook@chromium.org
[ Upstream commit de5ca4c3852f896cacac2bf259597aab5e17d9e3 ]
Nothing was explicitly bounds checking the priority index used to access clpriop[]. WARN and bail out early if it's pathological. Seen with GCC 13:
../net/sched/sch_htb.c: In function 'htb_activate_prios': ../net/sched/sch_htb.c:437:44: warning: array subscript [0, 31] is outside array bounds of 'struct htb_prio[8]' [-Warray-bounds=] 437 | if (p->inner.clprio[prio].feed.rb_node) | ~~~~~~~~~~~~~~~^~~~~~ ../net/sched/sch_htb.c:131:41: note: while referencing 'clprio' 131 | struct htb_prio clprio[TC_HTB_NUMPRIO]; | ^~~~~~
Cc: Jamal Hadi Salim jhs@mojatatu.com Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Jiri Pirko jiri@resnulli.us Cc: "David S. Miller" davem@davemloft.net Cc: Eric Dumazet edumazet@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Paolo Abeni pabeni@redhat.com Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org Reviewed-by: Simon Horman simon.horman@corigine.com Reviewed-by: Cong Wang cong.wang@bytedance.com Link: https://lore.kernel.org/r/20230127224036.never.561-kees@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_htb.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 45b92e40082ef..7ea8c73ddeff0 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -427,7 +427,10 @@ static void htb_activate_prios(struct htb_sched *q, struct htb_class *cl) while (cl->cmode == HTB_MAY_BORROW && p && mask) { m = mask; while (m) { - int prio = ffz(~m); + unsigned int prio = ffz(~m); + + if (WARN_ON_ONCE(prio > ARRAY_SIZE(p->inner.clprio))) + break; m &= ~(1 << prio);
if (p->inner.clprio[prio].feed.rb_node)
From: Vasily Gorbik gor@linux.ibm.com
[ Upstream commit 7ab41c2c08a32132ba8c14624910e2fe8ce4ba4b ]
Historically calls to __decompress() didn't specify "out_len" parameter on many architectures including s390, expecting that no writes beyond uncompressed kernel image are performed. This has changed since commit 2aa14b1ab2c4 ("zstd: import usptream v1.5.2") which includes zstd library commit 6a7ede3dfccb ("Reduce size of dctx by reutilizing dst buffer (#2751)"). Now zstd decompression code might store literal buffer in the unwritten portion of the destination buffer. Since "out_len" is not set, it is considered to be unlimited and hence free to use for optimization needs. On s390 this might corrupt initrd or ipl report which are often placed right after the decompressor buffer. Luckily the size of uncompressed kernel image is already known to the decompressor, so to avoid the problem simply specify it in the "out_len" parameter.
Link: https://github.com/facebook/zstd/commit/6a7ede3dfccb Signed-off-by: Vasily Gorbik gor@linux.ibm.com Tested-by: Alexander Egorenkov egorenar@linux.ibm.com Link: https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-her... Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/boot/compressed/decompressor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/boot/compressed/decompressor.c b/arch/s390/boot/compressed/decompressor.c index e27c2140d6206..623f6775d01d7 100644 --- a/arch/s390/boot/compressed/decompressor.c +++ b/arch/s390/boot/compressed/decompressor.c @@ -80,6 +80,6 @@ void *decompress_kernel(void) void *output = (void *)decompress_offset;
__decompress(_compressed_start, _compressed_end - _compressed_start, - NULL, NULL, output, 0, NULL, error); + NULL, NULL, output, vmlinux.image_size, NULL, error); return output; }
From: Amit Engel Amit.Engel@dell.com
[ Upstream commit 0cab4404874f2de52617de8400c844891c6ea1ce ]
As part of nvmet_fc_ls_create_association there is a case where nvmet_fc_alloc_target_queue fails right after a new association with an admin queue is created. In this case, no one releases the get taken in nvmet_fc_alloc_target_assoc. This fix is adding the missing put.
Signed-off-by: Amit Engel Amit.Engel@dell.com Reviewed-by: James Smart jsmart2021@gmail.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/fc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c index c43bc5e1c7a28..00a2a591f5c1f 100644 --- a/drivers/nvme/target/fc.c +++ b/drivers/nvme/target/fc.c @@ -1685,8 +1685,10 @@ nvmet_fc_ls_create_association(struct nvmet_fc_tgtport *tgtport, else { queue = nvmet_fc_alloc_target_queue(iod->assoc, 0, be16_to_cpu(rqst->assoc_cmd.sqsize)); - if (!queue) + if (!queue) { ret = VERR_QUEUE_ALLOC_FAIL; + nvmet_fc_tgt_a_put(iod->assoc); + } } }
From: Alex Deucher alexander.deucher@amd.com
[ Upstream commit 6fc547a5a2ef5ce05b16924106663ab92f8f87a7 ]
There could be boards with DCN listed in IP discovery, but no display hardware actually wired up. In this case the vbios display table will not be populated. Detect this case and skip loading DM when we detect it.
v2: Mark DCN as harvested as well so other display checks elsewhere in the driver are handled properly.
Cc: Aurabindo Pillai aurabindo.pillai@amd.com Reviewed-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index e9797439bb0eb..3f325e1e440c9 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -4428,6 +4428,17 @@ DEVICE_ATTR_WO(s3_debug); static int dm_early_init(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_mode_info *mode_info = &adev->mode_info; + struct atom_context *ctx = mode_info->atom_context; + int index = GetIndexIntoMasterTable(DATA, Object_Header); + u16 data_offset; + + /* if there is no object header, skip DM */ + if (!amdgpu_atom_parse_data_header(ctx, index, NULL, NULL, NULL, &data_offset)) { + adev->harvest_ip_mask |= AMD_HARVEST_IP_DMU_MASK; + dev_info(adev->dev, "No object header, skipping DM\n"); + return -ENOENT; + }
switch (adev->asic_type) { #if defined(CONFIG_DRM_AMD_DC_SI)
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit eecf2acd4a580e9364e5087daf0effca60a240b7 ]
Add a DMI match for the CWI501 version of the Chuwi Vi8 tablet, pointing to the same chuwi_vi8_data as the existing CWI506 version DMI match.
Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230202103413.331459-1-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/touchscreen_dmi.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/platform/x86/touchscreen_dmi.c b/drivers/platform/x86/touchscreen_dmi.c index 93671037fd598..69ba2c5182610 100644 --- a/drivers/platform/x86/touchscreen_dmi.c +++ b/drivers/platform/x86/touchscreen_dmi.c @@ -1073,6 +1073,15 @@ const struct dmi_system_id touchscreen_dmi_table[] = { DMI_MATCH(DMI_BIOS_DATE, "05/07/2016"), }, }, + { + /* Chuwi Vi8 (CWI501) */ + .driver_data = (void *)&chuwi_vi8_data, + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Insyde"), + DMI_MATCH(DMI_PRODUCT_NAME, "i86"), + DMI_MATCH(DMI_BIOS_VERSION, "CHUWI.W86JLBNR01"), + }, + }, { /* Chuwi Vi8 (CWI506) */ .driver_data = (void *)&chuwi_vi8_data,
From: Gaosheng Cui cuigaosheng1@huawei.com
[ Upstream commit 5544e90c81261e82e02bbf7c6015a4b9c8c825ef ]
The type of return value of dev_set_name is int, which may return wrong result, so we add error handling for it to reclaim memory of nvmem resource, and return early when an error occurs.
Signed-off-by: Gaosheng Cui cuigaosheng1@huawei.com Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20220916122100.170016-4-srinivas.kandagatla@linaro... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: ab3428cfd9aa ("nvmem: core: fix registration vs use race") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvmem/core.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c index ee86022c4f2b8..51bec9f8a3bf9 100644 --- a/drivers/nvmem/core.c +++ b/drivers/nvmem/core.c @@ -804,18 +804,24 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
switch (config->id) { case NVMEM_DEVID_NONE: - dev_set_name(&nvmem->dev, "%s", config->name); + rval = dev_set_name(&nvmem->dev, "%s", config->name); break; case NVMEM_DEVID_AUTO: - dev_set_name(&nvmem->dev, "%s%d", config->name, nvmem->id); + rval = dev_set_name(&nvmem->dev, "%s%d", config->name, nvmem->id); break; default: - dev_set_name(&nvmem->dev, "%s%d", + rval = dev_set_name(&nvmem->dev, "%s%d", config->name ? : "nvmem", config->name ? config->id : nvmem->id); break; }
+ if (rval) { + ida_free(&nvmem_ida, nvmem->id); + kfree(nvmem); + return ERR_PTR(rval); + } + nvmem->read_only = device_property_present(config->dev, "read-only") || config->read_only || !nvmem->reg_write;
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
[ Upstream commit 560181d3ace61825f4ca9dd3481d6c0ee6709fa8 ]
If dev_set_name() fails, we leak nvmem->wp_gpio as the cleanup does not put this. While a minimal fix for this would be to add the gpiod_put() call, we can do better if we split device_register(), and use the tested nvmem_release() cleanup code by initialising the device early, and putting the device.
This results in a slightly larger fix, but results in clear code.
Note: this patch depends on "nvmem: core: initialise nvmem->id early" and "nvmem: core: remove nvmem_config wp_gpio".
Fixes: 5544e90c8126 ("nvmem: core: add error handling for dev_set_name") Cc: stable@vger.kernel.org Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter error27@gmail.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk [Srini: Fixed subject line and error code handing with wp_gpio while applying.] Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230127104015.23839-6-srinivas.kandagatla@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: ab3428cfd9aa ("nvmem: core: fix registration vs use race") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvmem/core.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c index 51bec9f8a3bf9..f06b65f0d410b 100644 --- a/drivers/nvmem/core.c +++ b/drivers/nvmem/core.c @@ -768,14 +768,18 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
nvmem->id = rval;
+ nvmem->dev.type = &nvmem_provider_type; + nvmem->dev.bus = &nvmem_bus_type; + nvmem->dev.parent = config->dev; + + device_initialize(&nvmem->dev); + if (!config->ignore_wp) nvmem->wp_gpio = gpiod_get_optional(config->dev, "wp", GPIOD_OUT_HIGH); if (IS_ERR(nvmem->wp_gpio)) { - ida_free(&nvmem_ida, nvmem->id); rval = PTR_ERR(nvmem->wp_gpio); - kfree(nvmem); - return ERR_PTR(rval); + goto err_put_device; }
kref_init(&nvmem->refcnt); @@ -787,9 +791,6 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config) nvmem->stride = config->stride ?: 1; nvmem->word_size = config->word_size ?: 1; nvmem->size = config->size; - nvmem->dev.type = &nvmem_provider_type; - nvmem->dev.bus = &nvmem_bus_type; - nvmem->dev.parent = config->dev; nvmem->root_only = config->root_only; nvmem->priv = config->priv; nvmem->type = config->type; @@ -816,11 +817,8 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config) break; }
- if (rval) { - ida_free(&nvmem_ida, nvmem->id); - kfree(nvmem); - return ERR_PTR(rval); - } + if (rval) + goto err_put_device;
nvmem->read_only = device_property_present(config->dev, "read-only") || config->read_only || !nvmem->reg_write; @@ -831,7 +829,7 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);
- rval = device_register(&nvmem->dev); + rval = device_add(&nvmem->dev); if (rval) goto err_put_device;
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
[ Upstream commit ab3428cfd9aa2f3463ee4b2909b5bb2193bd0c4a ]
The i.MX6 CPU frequency driver sometimes fails to register at boot time due to nvmem_cell_read_u32() sporadically returning -ENOENT.
This happens because there is a window where __nvmem_device_get() in of_nvmem_cell_get() is able to return the nvmem device, but as cells have been setup, nvmem_find_cell_entry_by_node() returns NULL.
The occurs because the nvmem core registration code violates one of the fundamental principles of kernel programming: do not publish data structures before their setup is complete.
Fix this by making nvmem core code conform with this principle.
Fixes: eace75cfdcf7 ("nvmem: Add a simple NVMEM framework for nvmem providers") Cc: stable@vger.kernel.org Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230127104015.23839-7-srinivas.kandagatla@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvmem/core.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c index f06b65f0d410b..6a74e38746057 100644 --- a/drivers/nvmem/core.c +++ b/drivers/nvmem/core.c @@ -827,22 +827,16 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config) nvmem->dev.groups = nvmem_dev_groups; #endif
- dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name); - - rval = device_add(&nvmem->dev); - if (rval) - goto err_put_device; - if (nvmem->nkeepout) { rval = nvmem_validate_keepouts(nvmem); if (rval) - goto err_device_del; + goto err_put_device; }
if (config->compat) { rval = nvmem_sysfs_setup_compat(nvmem, config); if (rval) - goto err_device_del; + goto err_put_device; }
if (config->cells) { @@ -859,6 +853,12 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config) if (rval) goto err_remove_cells;
+ dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name); + + rval = device_add(&nvmem->dev); + if (rval) + goto err_remove_cells; + blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);
return nvmem; @@ -867,8 +867,6 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config) nvmem_device_remove_all_cells(nvmem); if (config->compat) nvmem_sysfs_remove_compat(nvmem, config); -err_device_del: - device_del(&nvmem->dev); err_put_device: put_device(&nvmem->dev);
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
[ Upstream commit 0c4862b1c1465e473bc961a02765490578bf5c20 ]
Dan Carpenter points out that the return code was not set in commit 60c8b4aebd8e ("nvmem: core: fix cleanup after dev_set_name()"), but this is not the only issue - we also need to zero wp_gpio to prevent gpiod_put() being called on an error value.
Fixes: 560181d3ace6 ("nvmem: core: fix cleanup after dev_set_name()") Cc: stable@vger.kernel.org Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter error27@gmail.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230127104015.23839-10-srinivas.kandagatla@linaro... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvmem/core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c index 6a74e38746057..47c1487dcf8cc 100644 --- a/drivers/nvmem/core.c +++ b/drivers/nvmem/core.c @@ -779,6 +779,7 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config) GPIOD_OUT_HIGH); if (IS_ERR(nvmem->wp_gpio)) { rval = PTR_ERR(nvmem->wp_gpio); + nvmem->wp_gpio = NULL; goto err_put_device; }
From: Dave Chinner dchinner@redhat.com
[ Upstream commit cb512c921639613ce03f87e62c5e93ed9fe8c84d ]
When we first allocate or resize an inline inode fork, we round up the allocation to 4 byte alingment to make journal alignment constraints. We don't clear the unused bytes, so we can copy up to three uninitialised bytes into the journal. Zero those bytes so we only ever copy zeros into the journal.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Allison Henderson allison.henderson@oracle.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_inode_fork.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index 1d174909f9bdf..20095233d7bc0 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -50,8 +50,13 @@ xfs_init_local_fork( mem_size++;
if (size) { + /* + * As we round up the allocation here, we need to ensure the + * bytes we don't copy data into are zeroed because the log + * vectors still copy them into the journal. + */ real_size = roundup(mem_size, 4); - ifp->if_u1.if_data = kmem_alloc(real_size, KM_NOFS); + ifp->if_u1.if_data = kmem_zalloc(real_size, KM_NOFS); memcpy(ifp->if_u1.if_data, data, size); if (zero_terminate) ifp->if_u1.if_data[size] = '\0'; @@ -500,10 +505,11 @@ xfs_idata_realloc( /* * For inline data, the underlying buffer must be a multiple of 4 bytes * in size so that it can be logged and stay on word boundaries. - * We enforce that here. + * We enforce that here, and use __GFP_ZERO to ensure that size + * extensions always zero the unused roundup area. */ ifp->if_u1.if_data = krealloc(ifp->if_u1.if_data, roundup(new_size, 4), - GFP_NOFS | __GFP_NOFAIL); + GFP_NOFS | __GFP_NOFAIL | __GFP_ZERO); ifp->if_bytes = new_size; }
From: Dave Chinner dchinner@redhat.com
[ Upstream commit c230a4a85bcdbfc1a7415deec6caf04e8fca1301 ]
Ever since we added shadown format buffers to the log items, log items need to handle the item being released with shadow buffers attached. Due to the fact this requirement was added at the same time we added new rmap/reflink intents, we missed the cleanup of those items.
In theory, this means shadow buffers can be leaked in a very small window when a shutdown is initiated. Testing with KASAN shows this leak does not happen in practice - we haven't identified a single leak in several years of shutdown testing since ~v4.8 kernels.
However, the intent whiteout cleanup mechanism results in every cancelled intent in exactly the same state as this tiny race window creates and so if intents down clean up shadow buffers on final release we will leak the shadow buffer for just about every intent we create.
Hence we start with this patch to close this condition off and ensure that when whiteouts start to be used we don't leak lots of memory.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Allison Henderson allison.henderson@oracle.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_bmap_item.c | 2 ++ fs/xfs/xfs_icreate_item.c | 1 + fs/xfs/xfs_refcount_item.c | 2 ++ fs/xfs/xfs_rmap_item.c | 2 ++ 4 files changed, 7 insertions(+)
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index 03159970133ff..51ffdec5e4faa 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -39,6 +39,7 @@ STATIC void xfs_bui_item_free( struct xfs_bui_log_item *buip) { + kmem_free(buip->bui_item.li_lv_shadow); kmem_cache_free(xfs_bui_zone, buip); }
@@ -198,6 +199,7 @@ xfs_bud_item_release( struct xfs_bud_log_item *budp = BUD_ITEM(lip);
xfs_bui_release(budp->bud_buip); + kmem_free(budp->bud_item.li_lv_shadow); kmem_cache_free(xfs_bud_zone, budp); }
diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c index 017904a34c023..c265ae20946d5 100644 --- a/fs/xfs/xfs_icreate_item.c +++ b/fs/xfs/xfs_icreate_item.c @@ -63,6 +63,7 @@ STATIC void xfs_icreate_item_release( struct xfs_log_item *lip) { + kmem_free(ICR_ITEM(lip)->ic_item.li_lv_shadow); kmem_cache_free(xfs_icreate_zone, ICR_ITEM(lip)); }
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index 46904b793bd48..8ef842d17916a 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -35,6 +35,7 @@ STATIC void xfs_cui_item_free( struct xfs_cui_log_item *cuip) { + kmem_free(cuip->cui_item.li_lv_shadow); if (cuip->cui_format.cui_nextents > XFS_CUI_MAX_FAST_EXTENTS) kmem_free(cuip); else @@ -204,6 +205,7 @@ xfs_cud_item_release( struct xfs_cud_log_item *cudp = CUD_ITEM(lip);
xfs_cui_release(cudp->cud_cuip); + kmem_free(cudp->cud_item.li_lv_shadow); kmem_cache_free(xfs_cud_zone, cudp); }
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index 5f06959804678..15e7b01740a77 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -35,6 +35,7 @@ STATIC void xfs_rui_item_free( struct xfs_rui_log_item *ruip) { + kmem_free(ruip->rui_item.li_lv_shadow); if (ruip->rui_format.rui_nextents > XFS_RUI_MAX_FAST_EXTENTS) kmem_free(ruip); else @@ -227,6 +228,7 @@ xfs_rud_item_release( struct xfs_rud_log_item *rudp = RUD_ITEM(lip);
xfs_rui_release(rudp->rud_ruip); + kmem_free(rudp->rud_item.li_lv_shadow); kmem_cache_free(xfs_rud_zone, rudp); }
From: Dave Chinner dchinner@redhat.com
[ Upstream commit dc04db2aa7c9307e740d6d0e173085301c173b1a ]
To catch the obvious graph cycle problem and hence potential endless looping.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_btree.c | 140 ++++++++++++++++++++++++++++---------- 1 file changed, 105 insertions(+), 35 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 2983954817135..5bec048343b0c 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -51,6 +51,52 @@ xfs_btree_magic( return magic; }
+static xfs_failaddr_t +xfs_btree_check_lblock_siblings( + struct xfs_mount *mp, + struct xfs_btree_cur *cur, + int level, + xfs_fsblock_t fsb, + xfs_fsblock_t sibling) +{ + if (sibling == NULLFSBLOCK) + return NULL; + if (sibling == fsb) + return __this_address; + if (level >= 0) { + if (!xfs_btree_check_lptr(cur, sibling, level + 1)) + return __this_address; + } else { + if (!xfs_verify_fsbno(mp, sibling)) + return __this_address; + } + + return NULL; +} + +static xfs_failaddr_t +xfs_btree_check_sblock_siblings( + struct xfs_mount *mp, + struct xfs_btree_cur *cur, + int level, + xfs_agnumber_t agno, + xfs_agblock_t agbno, + xfs_agblock_t sibling) +{ + if (sibling == NULLAGBLOCK) + return NULL; + if (sibling == agbno) + return __this_address; + if (level >= 0) { + if (!xfs_btree_check_sptr(cur, sibling, level + 1)) + return __this_address; + } else { + if (!xfs_verify_agbno(mp, agno, sibling)) + return __this_address; + } + return NULL; +} + /* * Check a long btree block header. Return the address of the failing check, * or NULL if everything is ok. @@ -65,6 +111,8 @@ __xfs_btree_check_lblock( struct xfs_mount *mp = cur->bc_mp; xfs_btnum_t btnum = cur->bc_btnum; int crc = xfs_has_crc(mp); + xfs_failaddr_t fa; + xfs_fsblock_t fsb = NULLFSBLOCK;
if (crc) { if (!uuid_equal(&block->bb_u.l.bb_uuid, &mp->m_sb.sb_meta_uuid)) @@ -83,16 +131,16 @@ __xfs_btree_check_lblock( if (be16_to_cpu(block->bb_numrecs) > cur->bc_ops->get_maxrecs(cur, level)) return __this_address; - if (block->bb_u.l.bb_leftsib != cpu_to_be64(NULLFSBLOCK) && - !xfs_btree_check_lptr(cur, be64_to_cpu(block->bb_u.l.bb_leftsib), - level + 1)) - return __this_address; - if (block->bb_u.l.bb_rightsib != cpu_to_be64(NULLFSBLOCK) && - !xfs_btree_check_lptr(cur, be64_to_cpu(block->bb_u.l.bb_rightsib), - level + 1)) - return __this_address;
- return NULL; + if (bp) + fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp)); + + fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb, + be64_to_cpu(block->bb_u.l.bb_leftsib)); + if (!fa) + fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb, + be64_to_cpu(block->bb_u.l.bb_rightsib)); + return fa; }
/* Check a long btree block header. */ @@ -130,6 +178,9 @@ __xfs_btree_check_sblock( struct xfs_mount *mp = cur->bc_mp; xfs_btnum_t btnum = cur->bc_btnum; int crc = xfs_has_crc(mp); + xfs_failaddr_t fa; + xfs_agblock_t agbno = NULLAGBLOCK; + xfs_agnumber_t agno = NULLAGNUMBER;
if (crc) { if (!uuid_equal(&block->bb_u.s.bb_uuid, &mp->m_sb.sb_meta_uuid)) @@ -146,16 +197,18 @@ __xfs_btree_check_sblock( if (be16_to_cpu(block->bb_numrecs) > cur->bc_ops->get_maxrecs(cur, level)) return __this_address; - if (block->bb_u.s.bb_leftsib != cpu_to_be32(NULLAGBLOCK) && - !xfs_btree_check_sptr(cur, be32_to_cpu(block->bb_u.s.bb_leftsib), - level + 1)) - return __this_address; - if (block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK) && - !xfs_btree_check_sptr(cur, be32_to_cpu(block->bb_u.s.bb_rightsib), - level + 1)) - return __this_address;
- return NULL; + if (bp) { + agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp)); + agno = xfs_daddr_to_agno(mp, xfs_buf_daddr(bp)); + } + + fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, agbno, + be32_to_cpu(block->bb_u.s.bb_leftsib)); + if (!fa) + fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, + agbno, be32_to_cpu(block->bb_u.s.bb_rightsib)); + return fa; }
/* Check a short btree block header. */ @@ -4265,6 +4318,21 @@ xfs_btree_visit_block( if (xfs_btree_ptr_is_null(cur, &rptr)) return -ENOENT;
+ /* + * We only visit blocks once in this walk, so we have to avoid the + * internal xfs_btree_lookup_get_block() optimisation where it will + * return the same block without checking if the right sibling points + * back to us and creates a cyclic reference in the btree. + */ + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + if (be64_to_cpu(rptr.l) == XFS_DADDR_TO_FSB(cur->bc_mp, + xfs_buf_daddr(bp))) + return -EFSCORRUPTED; + } else { + if (be32_to_cpu(rptr.s) == xfs_daddr_to_agbno(cur->bc_mp, + xfs_buf_daddr(bp))) + return -EFSCORRUPTED; + } return xfs_btree_lookup_get_block(cur, level, &rptr, &block); }
@@ -4439,20 +4507,21 @@ xfs_btree_lblock_verify( { struct xfs_mount *mp = bp->b_mount; struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_fsblock_t fsb; + xfs_failaddr_t fa;
/* numrecs verification */ if (be16_to_cpu(block->bb_numrecs) > max_recs) return __this_address;
/* sibling pointer verification */ - if (block->bb_u.l.bb_leftsib != cpu_to_be64(NULLFSBLOCK) && - !xfs_verify_fsbno(mp, be64_to_cpu(block->bb_u.l.bb_leftsib))) - return __this_address; - if (block->bb_u.l.bb_rightsib != cpu_to_be64(NULLFSBLOCK) && - !xfs_verify_fsbno(mp, be64_to_cpu(block->bb_u.l.bb_rightsib))) - return __this_address; - - return NULL; + fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp)); + fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb, + be64_to_cpu(block->bb_u.l.bb_leftsib)); + if (!fa) + fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb, + be64_to_cpu(block->bb_u.l.bb_rightsib)); + return fa; }
/** @@ -4493,7 +4562,9 @@ xfs_btree_sblock_verify( { struct xfs_mount *mp = bp->b_mount; struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); - xfs_agblock_t agno; + xfs_agnumber_t agno; + xfs_agblock_t agbno; + xfs_failaddr_t fa;
/* numrecs verification */ if (be16_to_cpu(block->bb_numrecs) > max_recs) @@ -4501,14 +4572,13 @@ xfs_btree_sblock_verify(
/* sibling pointer verification */ agno = xfs_daddr_to_agno(mp, xfs_buf_daddr(bp)); - if (block->bb_u.s.bb_leftsib != cpu_to_be32(NULLAGBLOCK) && - !xfs_verify_agbno(mp, agno, be32_to_cpu(block->bb_u.s.bb_leftsib))) - return __this_address; - if (block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK) && - !xfs_verify_agbno(mp, agno, be32_to_cpu(block->bb_u.s.bb_rightsib))) - return __this_address; - - return NULL; + agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp)); + fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno, + be32_to_cpu(block->bb_u.s.bb_leftsib)); + if (!fa) + fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno, + be32_to_cpu(block->bb_u.s.bb_rightsib)); + return fa; }
/*
From: Dave Chinner dchinner@redhat.com
[ Upstream commit dd0d2f9755191690541b09e6385d0f8cd8bc9d8f ]
While xfs_has_nlink() is not used in kernel, it is used in userspace (e.g. by xfs_db) so we need to set the XFS_FEAT_NLINK flag correctly in xfs_sb_version_to_features().
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_sb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index e58349be78bd5..72c05485c8706 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -70,6 +70,8 @@ xfs_sb_version_to_features( /* optional V4 features */ if (sbp->sb_rblocks > 0) features |= XFS_FEAT_REALTIME; + if (sbp->sb_versionnum & XFS_SB_VERSION_NLINKBIT) + features |= XFS_FEAT_NLINK; if (sbp->sb_versionnum & XFS_SB_VERSION_ATTRBIT) features |= XFS_FEAT_ATTR; if (sbp->sb_versionnum & XFS_SB_VERSION_QUOTABIT)
From: Dave Chinner dchinner@redhat.com
[ Upstream commit f0f5f658065a5af09126ec892e4c383540a1c77f ]
We don't check that the v4 feature flags taht v5 requires to be set are actually set anywhere. Do this check when we see that the filesystem is a v5 filesystem.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_sb.c | 68 +++++++++++++++++++++++++++++++++++------- 1 file changed, 58 insertions(+), 10 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 72c05485c8706..04e2a57313fa0 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -30,6 +30,47 @@ * Physical superblock buffer manipulations. Shared with libxfs in userspace. */
+/* + * Check that all the V4 feature bits that the V5 filesystem format requires are + * correctly set. + */ +static bool +xfs_sb_validate_v5_features( + struct xfs_sb *sbp) +{ + /* We must not have any unknown V4 feature bits set */ + if (sbp->sb_versionnum & ~XFS_SB_VERSION_OKBITS) + return false; + + /* + * The CRC bit is considered an invalid V4 flag, so we have to add it + * manually to the OKBITS mask. + */ + if (sbp->sb_features2 & ~(XFS_SB_VERSION2_OKBITS | + XFS_SB_VERSION2_CRCBIT)) + return false; + + /* Now check all the required V4 feature flags are set. */ + +#define V5_VERS_FLAGS (XFS_SB_VERSION_NLINKBIT | \ + XFS_SB_VERSION_ALIGNBIT | \ + XFS_SB_VERSION_LOGV2BIT | \ + XFS_SB_VERSION_EXTFLGBIT | \ + XFS_SB_VERSION_DIRV2BIT | \ + XFS_SB_VERSION_MOREBITSBIT) + +#define V5_FEAT_FLAGS (XFS_SB_VERSION2_LAZYSBCOUNTBIT | \ + XFS_SB_VERSION2_ATTR2BIT | \ + XFS_SB_VERSION2_PROJID32BIT | \ + XFS_SB_VERSION2_CRCBIT) + + if ((sbp->sb_versionnum & V5_VERS_FLAGS) != V5_VERS_FLAGS) + return false; + if ((sbp->sb_features2 & V5_FEAT_FLAGS) != V5_FEAT_FLAGS) + return false; + return true; +} + /* * We support all XFS versions newer than a v4 superblock with V2 directories. */ @@ -37,9 +78,19 @@ bool xfs_sb_good_version( struct xfs_sb *sbp) { - /* all v5 filesystems are supported */ + /* + * All v5 filesystems are supported, but we must check that all the + * required v4 feature flags are enabled correctly as the code checks + * those flags and not for v5 support. + */ if (xfs_sb_is_v5(sbp)) - return true; + return xfs_sb_validate_v5_features(sbp); + + /* We must not have any unknown v4 feature bits set */ + if ((sbp->sb_versionnum & ~XFS_SB_VERSION_OKBITS) || + ((sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT) && + (sbp->sb_features2 & ~XFS_SB_VERSION2_OKBITS))) + return false;
/* versions prior to v4 are not supported */ if (XFS_SB_VERSION_NUM(sbp) < XFS_SB_VERSION_4) @@ -51,12 +102,6 @@ xfs_sb_good_version( if (!(sbp->sb_versionnum & XFS_SB_VERSION_EXTFLGBIT)) return false;
- /* And must not have any unknown v4 feature bits set */ - if ((sbp->sb_versionnum & ~XFS_SB_VERSION_OKBITS) || - ((sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT) && - (sbp->sb_features2 & ~XFS_SB_VERSION2_OKBITS))) - return false; - /* It's a supported v4 filesystem */ return true; } @@ -264,12 +309,15 @@ xfs_validate_sb_common( bool has_dalign;
if (!xfs_verify_magic(bp, dsb->sb_magicnum)) { - xfs_warn(mp, "bad magic number"); + xfs_warn(mp, +"Superblock has bad magic number 0x%x. Not an XFS filesystem?", + be32_to_cpu(dsb->sb_magicnum)); return -EWRONGFS; }
if (!xfs_sb_good_version(sbp)) { - xfs_warn(mp, "bad version"); + xfs_warn(mp, +"Superblock has unknown features enabled or corrupted feature masks."); return -EWRONGFS; }
From: Dave Chinner dchinner@redhat.com
[ Upstream commit 5672225e8f2a872a22b0cecedba7a6644af1fb84 ]
Commit dc04db2aa7c9 has caused a small aim7 regression, showing a small increase in CPU usage in __xfs_btree_check_sblock() as a result of the extra checking.
This is likely due to the endian conversion of the sibling poitners being unconditional instead of relying on the compiler to endian convert the NULL pointer at compile time and avoiding the runtime conversion for this common case.
Rework the checks so that endian conversion of the sibling pointers is only done if they are not null as the original code did.
.... and these need to be "inline" because the compiler completely fails to inline them automatically like it should be doing.
$ size fs/xfs/libxfs/xfs_btree.o* text data bss dec hex filename 51874 240 0 52114 cb92 fs/xfs/libxfs/xfs_btree.o.orig 51562 240 0 51802 ca5a fs/xfs/libxfs/xfs_btree.o.inline
Just when you think the tools have advanced sufficiently we don't have to care about stuff like this anymore, along comes a reminder that *our tools still suck*.
Fixes: dc04db2aa7c9 ("xfs: detect self referencing btree sibling pointers") Reported-by: kernel test robot oliver.sang@intel.com Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_btree.c | 47 +++++++++++++++++++++++++++------------ 1 file changed, 33 insertions(+), 14 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 5bec048343b0c..b4b5bf4bfed7f 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -51,16 +51,31 @@ xfs_btree_magic( return magic; }
-static xfs_failaddr_t +/* + * These sibling pointer checks are optimised for null sibling pointers. This + * happens a lot, and we don't need to byte swap at runtime if the sibling + * pointer is NULL. + * + * These are explicitly marked at inline because the cost of calling them as + * functions instead of inlining them is about 36 bytes extra code per call site + * on x86-64. Yes, gcc-11 fails to inline them, and explicit inlining of these + * two sibling check functions reduces the compiled code size by over 300 + * bytes. + */ +static inline xfs_failaddr_t xfs_btree_check_lblock_siblings( struct xfs_mount *mp, struct xfs_btree_cur *cur, int level, xfs_fsblock_t fsb, - xfs_fsblock_t sibling) + __be64 dsibling) { - if (sibling == NULLFSBLOCK) + xfs_fsblock_t sibling; + + if (dsibling == cpu_to_be64(NULLFSBLOCK)) return NULL; + + sibling = be64_to_cpu(dsibling); if (sibling == fsb) return __this_address; if (level >= 0) { @@ -74,17 +89,21 @@ xfs_btree_check_lblock_siblings( return NULL; }
-static xfs_failaddr_t +static inline xfs_failaddr_t xfs_btree_check_sblock_siblings( struct xfs_mount *mp, struct xfs_btree_cur *cur, int level, xfs_agnumber_t agno, xfs_agblock_t agbno, - xfs_agblock_t sibling) + __be32 dsibling) { - if (sibling == NULLAGBLOCK) + xfs_agblock_t sibling; + + if (dsibling == cpu_to_be32(NULLAGBLOCK)) return NULL; + + sibling = be32_to_cpu(dsibling); if (sibling == agbno) return __this_address; if (level >= 0) { @@ -136,10 +155,10 @@ __xfs_btree_check_lblock( fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb, - be64_to_cpu(block->bb_u.l.bb_leftsib)); + block->bb_u.l.bb_leftsib); if (!fa) fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb, - be64_to_cpu(block->bb_u.l.bb_rightsib)); + block->bb_u.l.bb_rightsib); return fa; }
@@ -204,10 +223,10 @@ __xfs_btree_check_sblock( }
fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, agbno, - be32_to_cpu(block->bb_u.s.bb_leftsib)); + block->bb_u.s.bb_leftsib); if (!fa) fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, - agbno, be32_to_cpu(block->bb_u.s.bb_rightsib)); + agbno, block->bb_u.s.bb_rightsib); return fa; }
@@ -4517,10 +4536,10 @@ xfs_btree_lblock_verify( /* sibling pointer verification */ fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp)); fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb, - be64_to_cpu(block->bb_u.l.bb_leftsib)); + block->bb_u.l.bb_leftsib); if (!fa) fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb, - be64_to_cpu(block->bb_u.l.bb_rightsib)); + block->bb_u.l.bb_rightsib); return fa; }
@@ -4574,10 +4593,10 @@ xfs_btree_sblock_verify( agno = xfs_daddr_to_agno(mp, xfs_buf_daddr(bp)); agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp)); fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno, - be32_to_cpu(block->bb_u.s.bb_leftsib)); + block->bb_u.s.bb_leftsib); if (!fa) fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno, - be32_to_cpu(block->bb_u.s.bb_rightsib)); + block->bb_u.s.bb_rightsib); return fa; }
From: Dave Chinner dchinner@redhat.com
[ Upstream commit 5b55cbc2d72632e874e50d2e36bce608e55aaaea ]
Not fatal, the assert is there to catch developer attention. I'm seeing this occasionally during recoveryloop testing after a shutdown, and I don't want this to stop an overnight recoveryloop run as it is currently doing.
Convert the ASSERT to a XFS_IS_CORRUPT() check so it will dump a corruption report into the log and cause a test failure that way, but it won't stop the machine dead.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_ag.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index 005abfd9fd347..aff6fb5281f63 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -173,7 +173,6 @@ __xfs_free_perag( struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
ASSERT(!delayed_work_pending(&pag->pag_blockgc_work)); - ASSERT(atomic_read(&pag->pag_ref) == 0); kmem_free(pag); }
@@ -192,7 +191,7 @@ xfs_free_perag( pag = radix_tree_delete(&mp->m_perag_tree, agno); spin_unlock(&mp->m_perag_lock); ASSERT(pag); - ASSERT(atomic_read(&pag->pag_ref) == 0); + XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0);
cancel_delayed_work_sync(&pag->pag_blockgc_work); xfs_iunlink_destroy(pag);
From: Dave Chinner dchinner@redhat.com
[ Upstream commit 56486f307100e8fc66efa2ebd8a71941fa10bf6f ]
xfs/538 on a 1kB block filesystem failed with this assert:
XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 || xfs_is_shutdown(cur->bc_mp), file: fs/xfs/libxfs/xfs_btree.c, line: 448
The problem was that an allocation failed unexpectedly in xfs_bmbt_alloc_block() after roughly 150,000 minlen allocation error injections, resulting in an EFSCORRUPTED error being returned to xfs_bmapi_write(). The error occurred on extent-to-btree format conversion allocating the new root block:
RIP: 0010:xfs_bmbt_alloc_block+0x177/0x210 Call Trace: <TASK> xfs_btree_new_iroot+0xdf/0x520 xfs_btree_make_block_unfull+0x10d/0x1c0 xfs_btree_insrec+0x364/0x790 xfs_btree_insert+0xaa/0x210 xfs_bmap_add_extent_hole_real+0x1fe/0x9a0 xfs_bmapi_allocate+0x34c/0x420 xfs_bmapi_write+0x53c/0x9c0 xfs_alloc_file_space+0xee/0x320 xfs_file_fallocate+0x36b/0x450 vfs_fallocate+0x148/0x340 __x64_sys_fallocate+0x3c/0x70 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa
Why the allocation failed at this point is unknown, but is likely that we ran the transaction out of reserved space and filesystem out of space with bmbt blocks because of all the minlen allocations being done causing worst case fragmentation of a large allocation.
Regardless of the cause, we've then called xfs_bmapi_finish() which calls xfs_btree_del_cursor(cur, error) to tear down the cursor.
So we have a failed operation, error != 0, cur->bc_ino.allocated > 0 and the filesystem is still up. The assert fails to take into account that allocation can fail with an error and the transaction teardown will shut the filesystem down if necessary. i.e. the assert needs to check "|| error != 0" as well, because at this point shutdown is pending because the current transaction is dirty....
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_btree.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index b4b5bf4bfed7f..482a4ccc65682 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -445,8 +445,14 @@ xfs_btree_del_cursor( break; }
+ /* + * If we are doing a BMBT update, the number of unaccounted blocks + * allocated during this cursor life time should be zero. If it's not + * zero, then we should be shut down or on our way to shutdown due to + * cancelling a dirty transaction on error. + */ ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 || - xfs_is_shutdown(cur->bc_mp)); + xfs_is_shutdown(cur->bc_mp) || error != 0); if (unlikely(cur->bc_flags & XFS_BTREE_STAGING)) kmem_free(cur->bc_ops); if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag)
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 86d40f1e49e9a909d25c35ba01bea80dbcd758cb ]
xfs/434 and xfs/436 have been reporting occasional memory leaks of xfs_dquot objects. These tests themselves were the messenger, not the culprit, since they unload the xfs module, which trips the slub debugging code while tearing down all the xfs slab caches:
============================================================================= BUG xfs_dquot (Tainted: G W ): Objects remaining in xfs_dquot on __kmem_cache_shutdown() -----------------------------------------------------------------------------
Slab 0xffffea000606de00 objects=30 used=5 fp=0xffff888181b78a78 flags=0x17ff80000010200(slab|head|node=0|zone=2|lastcpupid=0xfff) CPU: 0 PID: 3953166 Comm: modprobe Tainted: G W 5.18.0-rc6-djwx #rc6 d5824be9e46a2393677bda868f9b154d917ca6a7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20171121_152543-x86-ol7-builder-01.us.oracle.com-4.el7.1 04/01/2014
Since we don't generally rmmod the xfs module between fstests, this means that xfs/434 is really just the canary in the coal mine -- something leaked a dquot, but we don't know who. After days of pounding on fstests with kmemleak enabled, I finally got it to spit this out:
unreferenced object 0xffff8880465654c0 (size 536): comm "u10:4", pid 88, jiffies 4294935810 (age 29.512s) hex dump (first 32 bytes): 60 4a 56 46 80 88 ff ff 58 ea e4 5c 80 88 ff ff `JVF....X...... 00 e0 52 49 80 88 ff ff 01 00 01 00 00 00 00 00 ..RI............ backtrace: [<ffffffffa0740f6c>] xfs_dquot_alloc+0x2c/0x530 [xfs] [<ffffffffa07443df>] xfs_qm_dqread+0x6f/0x330 [xfs] [<ffffffffa07462a2>] xfs_qm_dqget+0x132/0x4e0 [xfs] [<ffffffffa0756bb0>] xfs_qm_quotacheck_dqadjust+0xa0/0x3e0 [xfs] [<ffffffffa075724d>] xfs_qm_dqusage_adjust+0x35d/0x4f0 [xfs] [<ffffffffa06c9068>] xfs_iwalk_ag_recs+0x348/0x5d0 [xfs] [<ffffffffa06c95d3>] xfs_iwalk_run_callbacks+0x273/0x540 [xfs] [<ffffffffa06c9e8d>] xfs_iwalk_ag+0x5ed/0x890 [xfs] [<ffffffffa06ca22f>] xfs_iwalk_ag_work+0xff/0x170 [xfs] [<ffffffffa06d22c9>] xfs_pwork_work+0x79/0x130 [xfs] [<ffffffff81170bb2>] process_one_work+0x672/0x1040 [<ffffffff81171b1b>] worker_thread+0x59b/0xec0 [<ffffffff8118711e>] kthread+0x29e/0x340 [<ffffffff810032bf>] ret_from_fork+0x1f/0x30
Now we know that quotacheck is at fault, but even this report was canaryish -- it was triggered by xfs/494, which doesn't actually mount any filesystems. (kmemleak can be a little slow to notice leaks, even with fstests repeatedly whacking it to look for them.) Looking at the *previous* fstest, however, showed that the test run before xfs/494 was xfs/117. The tipoff to the problem is in this excerpt from dmesg:
XFS (sda4): Quotacheck needed: Please wait. XFS (sda4): Metadata corruption detected at xfs_dinode_verify.part.0+0xdb/0x7b0 [xfs], inode 0x119 dinode XFS (sda4): Unmount and run xfs_repair XFS (sda4): First 128 bytes of corrupted metadata buffer: 00000000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00 IN.............. 00000010: 00 00 00 01 00 00 00 00 00 90 57 54 54 1a 4c 68 ..........WTT.Lh 00000020: 81 f9 7d e1 6d ee 16 00 34 bd 7d e1 6d ee 16 00 ..}.m...4.}.m... 00000030: 34 bd 7d e1 6d ee 16 00 00 00 00 00 00 00 00 00 4.}.m........... 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 96 80 f3 ab ................ 00000060: ff ff ff ff da 57 7b 11 00 00 00 00 00 00 00 03 .....W{......... 00000070: 00 00 00 01 00 00 00 10 00 00 00 00 00 00 00 08 ................ XFS (sda4): Quotacheck: Unsuccessful (Error -117): Disabling quotas.
The dinode verifier decided that the inode was corrupt, which causes iget to return with EFSCORRUPTED. Since this happened during quotacheck, it is obvious that the kernel aborted the inode walk on account of the corruption error and disabled quotas. Unfortunately, we neglect to purge the dquot cache before doing that, which is how the dquots leaked.
The problems started 10 years ago in commit b84a3a, when the dquot lists were converted to a radix tree, but the error handling behavior was not correctly preserved -- in that commit, if the bulkstat failed and usrquota was enabled, the bulkstat failure code would be overwritten by the result of flushing all the dquots to disk. As long as that succeeds, we'd continue the quota mount as if everything were ok, but instead we're now operating with a corrupt inode and incorrect quota usage counts. I didn't notice this bug in 2019 when I wrote commit ebd126a, which changed quotacheck to skip the dqflush when the scan doesn't complete due to inode walk failures.
Introduced-by: b84a3a96751f ("xfs: remove the per-filesystem list of dquots") Fixes: ebd126a651f8 ("xfs: convert quotacheck to use the new iwalk functions") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_qm.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index 5608066d6e539..623244650a2f0 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -1317,8 +1317,15 @@ xfs_qm_quotacheck(
error = xfs_iwalk_threaded(mp, 0, 0, xfs_qm_dqusage_adjust, 0, true, NULL); - if (error) + if (error) { + /* + * The inode walk may have partially populated the dquot + * caches. We must purge them before disabling quota and + * tearing down the quotainfo, or else the dquots will leak. + */ + xfs_qm_dqpurge_all(mp); goto error_return; + }
/* * We've made all the changes that we need to make incore. Flush them
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit a54f78def73d847cb060b18c4e4a3d1d26c9ca6d ]
The recent patch to improve btree cycle checking caused a regression when I rebased the in-memory btree branch atop the 5.19 for-next branch, because in-memory short-pointer btrees do not have AG numbers. This produced the following complaint from kmemleak:
unreferenced object 0xffff88803d47dde8 (size 264): comm "xfs_io", pid 4889, jiffies 4294906764 (age 24.072s) hex dump (first 32 bytes): 90 4d 0b 0f 80 88 ff ff 00 a0 bd 05 80 88 ff ff .M.............. e0 44 3a a0 ff ff ff ff 00 df 08 06 80 88 ff ff .D:............. backtrace: [<ffffffffa0388059>] xfbtree_dup_cursor+0x49/0xc0 [xfs] [<ffffffffa029887b>] xfs_btree_dup_cursor+0x3b/0x200 [xfs] [<ffffffffa029af5d>] __xfs_btree_split+0x6ad/0x820 [xfs] [<ffffffffa029b130>] xfs_btree_split+0x60/0x110 [xfs] [<ffffffffa029f6da>] xfs_btree_make_block_unfull+0x19a/0x1f0 [xfs] [<ffffffffa029fada>] xfs_btree_insrec+0x3aa/0x810 [xfs] [<ffffffffa029fff3>] xfs_btree_insert+0xb3/0x240 [xfs] [<ffffffffa02cb729>] xfs_rmap_insert+0x99/0x200 [xfs] [<ffffffffa02cf142>] xfs_rmap_map_shared+0x192/0x5f0 [xfs] [<ffffffffa02cf60b>] xfs_rmap_map_raw+0x6b/0x90 [xfs] [<ffffffffa0384a85>] xrep_rmap_stash+0xd5/0x1d0 [xfs] [<ffffffffa0384dc0>] xrep_rmap_visit_bmbt+0xa0/0xf0 [xfs] [<ffffffffa0384fb6>] xrep_rmap_scan_iext+0x56/0xa0 [xfs] [<ffffffffa03850d8>] xrep_rmap_scan_ifork+0xd8/0x160 [xfs] [<ffffffffa0385195>] xrep_rmap_scan_inode+0x35/0x80 [xfs] [<ffffffffa03852ee>] xrep_rmap_find_rmaps+0x10e/0x270 [xfs]
I noticed that xfs_btree_insrec has a bunch of debug code that return out of the function immediately, without freeing the "new" btree cursor that can be returned when _make_block_unfull calls xfs_btree_split. Fix the error return in this function to free the btree cursor.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_btree.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 482a4ccc65682..dffe4ca584935 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -3266,7 +3266,7 @@ xfs_btree_insrec( struct xfs_btree_block *block; /* btree block */ struct xfs_buf *bp; /* buffer for block */ union xfs_btree_ptr nptr; /* new block ptr */ - struct xfs_btree_cur *ncur; /* new btree cursor */ + struct xfs_btree_cur *ncur = NULL; /* new btree cursor */ union xfs_btree_key nkey; /* new block key */ union xfs_btree_key *lkey; int optr; /* old key/record index */ @@ -3346,7 +3346,7 @@ xfs_btree_insrec( #ifdef DEBUG error = xfs_btree_check_block(cur, block, level, bp); if (error) - return error; + goto error0; #endif
/* @@ -3366,7 +3366,7 @@ xfs_btree_insrec( for (i = numrecs - ptr; i >= 0; i--) { error = xfs_btree_debug_check_ptr(cur, pp, i, level); if (error) - return error; + goto error0; }
xfs_btree_shift_keys(cur, kp, 1, numrecs - ptr + 1); @@ -3451,6 +3451,8 @@ xfs_btree_insrec( return 0;
error0: + if (ncur) + xfs_btree_del_cursor(ncur, error); return error; }
From: Paolo Abeni pabeni@redhat.com
commit d4e85922e3e7ef2071f91f65e61629b60f3a9cf4 upstream.
If the peer closes all the existing subflows for a given mptcp socket and later the application closes it, the current implementation let it survive until the timewait timeout expires.
While the above is allowed by the protocol specification it consumes resources for almost no reason and additionally causes sporadic self-tests failures.
Let's move the mptcp socket to the TCP_CLOSE state when there are no alive subflows at close time, so that the allocated resources will be freed immediately.
Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/mptcp/protocol.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2726,6 +2726,7 @@ static void mptcp_close(struct sock *sk, { struct mptcp_subflow_context *subflow; bool do_cancel_work = false; + int subflows_alive = 0;
lock_sock(sk); sk->sk_shutdown = SHUTDOWN_MASK; @@ -2747,11 +2748,19 @@ cleanup: struct sock *ssk = mptcp_subflow_tcp_sock(subflow); bool slow = lock_sock_fast_nested(ssk);
+ subflows_alive += ssk->sk_state != TCP_CLOSE; + sock_orphan(ssk); unlock_sock_fast(ssk, slow); } sock_orphan(sk);
+ /* all the subflows are closed, only timeout can change the msk + * state, let's not keep resources busy for no reasons + */ + if (subflows_alive == 0) + inet_sk_state_store(sk, TCP_CLOSE); + sock_hold(sk); pr_debug("msk=%p state=%d", sk, sk->sk_state); if (sk->sk_state == TCP_CLOSE) {
From: Seth Jenkins sethjenkins@google.com
commit 81e9d6f8647650a7bead74c5f926e29970e834d1 upstream.
Commit e4a0d3e720e7 ("aio: Make it possible to remap aio ring") introduced a null-deref if mremap is called on an old aio mapping after fork as mm->ioctx_table will be set to NULL.
[jmoyer@redhat.com: fix 80 column issue] Link: https://lkml.kernel.org/r/x49sffq4nvg.fsf@segfault.boston.devel.redhat.com Fixes: e4a0d3e720e7 ("aio: Make it possible to remap aio ring") Signed-off-by: Seth Jenkins sethjenkins@google.com Signed-off-by: Jeff Moyer jmoyer@redhat.com Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: Benjamin LaHaise bcrl@kvack.org Cc: Jann Horn jannh@google.com Cc: Pavel Emelyanov xemul@parallels.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/aio.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/aio.c +++ b/fs/aio.c @@ -334,6 +334,9 @@ static int aio_ring_mremap(struct vm_are spin_lock(&mm->ioctx_lock); rcu_read_lock(); table = rcu_dereference(mm->ioctx_table); + if (!table) + goto out_unlock; + for (i = 0; i < table->nr; i++) { struct kioctx *ctx;
@@ -347,6 +350,7 @@ static int aio_ring_mremap(struct vm_are } }
+out_unlock: rcu_read_unlock(); spin_unlock(&mm->ioctx_lock); return res;
From: Leo Li sunpeng.li@amd.com
commit 2a00299e7447395d0898e7c6214817c06a61a8e8 upstream.
[Why]
drm_atomic_normalize_zpos() can return an error code when there's modeset lock contention. This was being ignored.
[How]
Bail out of atomic check if normalize_zpos() returns an error.
Fixes: b261509952bc ("drm/amd/display: Fix double cursor on non-video RGB MPO") Signed-off-by: Leo Li sunpeng.li@amd.com Tested-by: Mikhail Gavrilov mikhail.v.gavrilov@gmail.com Reviewed-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -10889,7 +10889,11 @@ static int amdgpu_dm_atomic_check(struct * `dcn10_can_pipe_disable_cursor`). By now, all modified planes are in * atomic state, so call drm helper to normalize zpos. */ - drm_atomic_normalize_zpos(dev, state); + ret = drm_atomic_normalize_zpos(dev, state); + if (ret) { + drm_dbg(dev, "drm_atomic_normalize_zpos() failed\n"); + goto fail; + }
/* Remove exiting planes if they are modified */ for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) {
From: Sanket Goswami Sanket.Goswami@amd.com
commit f6045de1f53268131ea75a99b210b869dcc150b2 upstream.
IdleMask is the metric used by the PM firmware to know the status of each of the Hardware IP blocks monitored by the PM firmware.
Knowing this value is key to get the information of s2idle suspend/resume status. This value is mapped to PMC scratch registers, retrieve them accordingly based on the CPU family and the underlying firmware support.
Co-developed-by: Shyam Sundar S K Shyam-sundar.S-k@amd.com Signed-off-by: Shyam Sundar S K Shyam-sundar.S-k@amd.com Signed-off-by: Sanket Goswami Sanket.Goswami@amd.com Reviewed-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20210916124002.2529-1-Sanket.Goswami@amd.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/amd-pmc.c | 76 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+)
--- a/drivers/platform/x86/amd-pmc.c +++ b/drivers/platform/x86/amd-pmc.c @@ -29,6 +29,10 @@ #define AMD_PMC_REGISTER_RESPONSE 0x980 #define AMD_PMC_REGISTER_ARGUMENT 0x9BC
+/* PMC Scratch Registers */ +#define AMD_PMC_SCRATCH_REG_CZN 0x94 +#define AMD_PMC_SCRATCH_REG_YC 0xD14 + /* Base address of SMU for mapping physical address to virtual address */ #define AMD_PMC_SMU_INDEX_ADDRESS 0xB8 #define AMD_PMC_SMU_INDEX_DATA 0xBC @@ -110,6 +114,10 @@ struct amd_pmc_dev { u32 base_addr; u32 cpu_id; u32 active_ips; +/* SMU version information */ + u16 major; + u16 minor; + u16 rev; struct device *dev; struct mutex lock; /* generic mutex lock */ #if IS_ENABLED(CONFIG_DEBUG_FS) @@ -201,6 +209,66 @@ static int s0ix_stats_show(struct seq_fi } DEFINE_SHOW_ATTRIBUTE(s0ix_stats);
+static int amd_pmc_get_smu_version(struct amd_pmc_dev *dev) +{ + int rc; + u32 val; + + rc = amd_pmc_send_cmd(dev, 0, &val, SMU_MSG_GETSMUVERSION, 1); + if (rc) + return rc; + + dev->major = (val >> 16) & GENMASK(15, 0); + dev->minor = (val >> 8) & GENMASK(7, 0); + dev->rev = (val >> 0) & GENMASK(7, 0); + + dev_dbg(dev->dev, "SMU version is %u.%u.%u\n", dev->major, dev->minor, dev->rev); + + return 0; +} + +static int amd_pmc_idlemask_read(struct amd_pmc_dev *pdev, struct device *dev, + struct seq_file *s) +{ + u32 val; + + switch (pdev->cpu_id) { + case AMD_CPU_ID_CZN: + val = amd_pmc_reg_read(pdev, AMD_PMC_SCRATCH_REG_CZN); + break; + case AMD_CPU_ID_YC: + val = amd_pmc_reg_read(pdev, AMD_PMC_SCRATCH_REG_YC); + break; + default: + return -EINVAL; + } + + if (dev) + dev_dbg(pdev->dev, "SMU idlemask s0i3: 0x%x\n", val); + + if (s) + seq_printf(s, "SMU idlemask : 0x%x\n", val); + + return 0; +} + +static int amd_pmc_idlemask_show(struct seq_file *s, void *unused) +{ + struct amd_pmc_dev *dev = s->private; + int rc; + + if (dev->major > 56 || (dev->major >= 55 && dev->minor >= 37)) { + rc = amd_pmc_idlemask_read(dev, NULL, s); + if (rc) + return rc; + } else { + seq_puts(s, "Unsupported SMU version for Idlemask\n"); + } + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(amd_pmc_idlemask); + static void amd_pmc_dbgfs_unregister(struct amd_pmc_dev *dev) { debugfs_remove_recursive(dev->dbgfs_dir); @@ -213,6 +281,8 @@ static void amd_pmc_dbgfs_register(struc &smu_fw_info_fops); debugfs_create_file("s0ix_stats", 0644, dev->dbgfs_dir, dev, &s0ix_stats_fops); + debugfs_create_file("amd_pmc_idlemask", 0644, dev->dbgfs_dir, dev, + &amd_pmc_idlemask_fops); } #else static inline void amd_pmc_dbgfs_register(struct amd_pmc_dev *dev) @@ -349,6 +419,8 @@ static int __maybe_unused amd_pmc_suspen amd_pmc_send_cmd(pdev, 0, NULL, SMU_MSG_LOG_RESET, 0); amd_pmc_send_cmd(pdev, 0, NULL, SMU_MSG_LOG_START, 0);
+ /* Dump the IdleMask before we send hint to SMU */ + amd_pmc_idlemask_read(pdev, dev, NULL); msg = amd_pmc_get_os_hint(pdev); rc = amd_pmc_send_cmd(pdev, 1, NULL, msg, 0); if (rc) @@ -371,6 +443,9 @@ static int __maybe_unused amd_pmc_resume if (rc) dev_err(pdev->dev, "resume failed\n");
+ /* Dump the IdleMask to see the blockers */ + amd_pmc_idlemask_read(pdev, dev, NULL); + return 0; }
@@ -458,6 +533,7 @@ static int amd_pmc_probe(struct platform if (err) dev_err(dev->dev, "SMU debugging info not supported on this platform\n");
+ amd_pmc_get_smu_version(dev); platform_set_drvdata(pdev, dev); amd_pmc_dbgfs_register(dev); return 0;
From: Hans de Goede hdegoede@redhat.com
commit 40635cd32f0d83573a558dc30e9ba3469e769249 upstream.
The amd_pmc_get_smu_version() and amd_pmc_idlemask_read() functions are used in the probe / suspend/resume code, so they are also used when CONFIG_DEBUGFS is disabled, move them outside of the #ifdef CONFIG_DEBUGFS block.
Note this purely moves the code to above the #ifdef CONFIG_DEBUGFS, the code is completely unchanged.
Fixes: f6045de1f532 ("platform/x86: amd-pmc: Export Idlemask values based on the APU") Cc: Shyam Sundar S K Shyam-sundar.S-k@amd.com Cc: Sanket Goswami Sanket.Goswami@amd.com Reported-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/amd-pmc.c | 86 ++++++++++++++++++++--------------------- 1 file changed, 43 insertions(+), 43 deletions(-)
--- a/drivers/platform/x86/amd-pmc.c +++ b/drivers/platform/x86/amd-pmc.c @@ -155,6 +155,49 @@ struct smu_metrics { u64 timecondition_notmet_totaltime[SOC_SUBSYSTEM_IP_MAX]; } __packed;
+static int amd_pmc_get_smu_version(struct amd_pmc_dev *dev) +{ + int rc; + u32 val; + + rc = amd_pmc_send_cmd(dev, 0, &val, SMU_MSG_GETSMUVERSION, 1); + if (rc) + return rc; + + dev->major = (val >> 16) & GENMASK(15, 0); + dev->minor = (val >> 8) & GENMASK(7, 0); + dev->rev = (val >> 0) & GENMASK(7, 0); + + dev_dbg(dev->dev, "SMU version is %u.%u.%u\n", dev->major, dev->minor, dev->rev); + + return 0; +} + +static int amd_pmc_idlemask_read(struct amd_pmc_dev *pdev, struct device *dev, + struct seq_file *s) +{ + u32 val; + + switch (pdev->cpu_id) { + case AMD_CPU_ID_CZN: + val = amd_pmc_reg_read(pdev, AMD_PMC_SCRATCH_REG_CZN); + break; + case AMD_CPU_ID_YC: + val = amd_pmc_reg_read(pdev, AMD_PMC_SCRATCH_REG_YC); + break; + default: + return -EINVAL; + } + + if (dev) + dev_dbg(pdev->dev, "SMU idlemask s0i3: 0x%x\n", val); + + if (s) + seq_printf(s, "SMU idlemask : 0x%x\n", val); + + return 0; +} + #ifdef CONFIG_DEBUG_FS static int smu_fw_info_show(struct seq_file *s, void *unused) { @@ -209,49 +252,6 @@ static int s0ix_stats_show(struct seq_fi } DEFINE_SHOW_ATTRIBUTE(s0ix_stats);
-static int amd_pmc_get_smu_version(struct amd_pmc_dev *dev) -{ - int rc; - u32 val; - - rc = amd_pmc_send_cmd(dev, 0, &val, SMU_MSG_GETSMUVERSION, 1); - if (rc) - return rc; - - dev->major = (val >> 16) & GENMASK(15, 0); - dev->minor = (val >> 8) & GENMASK(7, 0); - dev->rev = (val >> 0) & GENMASK(7, 0); - - dev_dbg(dev->dev, "SMU version is %u.%u.%u\n", dev->major, dev->minor, dev->rev); - - return 0; -} - -static int amd_pmc_idlemask_read(struct amd_pmc_dev *pdev, struct device *dev, - struct seq_file *s) -{ - u32 val; - - switch (pdev->cpu_id) { - case AMD_CPU_ID_CZN: - val = amd_pmc_reg_read(pdev, AMD_PMC_SCRATCH_REG_CZN); - break; - case AMD_CPU_ID_YC: - val = amd_pmc_reg_read(pdev, AMD_PMC_SCRATCH_REG_YC); - break; - default: - return -EINVAL; - } - - if (dev) - dev_dbg(pdev->dev, "SMU idlemask s0i3: 0x%x\n", val); - - if (s) - seq_printf(s, "SMU idlemask : 0x%x\n", val); - - return 0; -} - static int amd_pmc_idlemask_show(struct seq_file *s, void *unused) { struct amd_pmc_dev *dev = s->private;
From: Mario Limonciello mario.limonciello@amd.com
commit b8fb0d9b47660ddb8a8256412784aad7cee9f21a upstream.
Yellow carp has been outputting versions like `1093.24.0`, but this is supposed to be 69.24.0. That is the MSB is being interpreted incorrectly.
The MSB is not part of the major version, but has generally been treated that way thus far. It's actually the program, and used to distinguish between two programs from a similar family but different codebase.
Link: https://patchwork.freedesktop.org/patch/469993/ Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20220120174439.12770-1-mario.limonciello@amd.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/amd-pmc.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/platform/x86/amd-pmc.c +++ b/drivers/platform/x86/amd-pmc.c @@ -115,9 +115,10 @@ struct amd_pmc_dev { u32 cpu_id; u32 active_ips; /* SMU version information */ - u16 major; - u16 minor; - u16 rev; + u8 smu_program; + u8 major; + u8 minor; + u8 rev; struct device *dev; struct mutex lock; /* generic mutex lock */ #if IS_ENABLED(CONFIG_DEBUG_FS) @@ -164,11 +165,13 @@ static int amd_pmc_get_smu_version(struc if (rc) return rc;
- dev->major = (val >> 16) & GENMASK(15, 0); + dev->smu_program = (val >> 24) & GENMASK(7, 0); + dev->major = (val >> 16) & GENMASK(7, 0); dev->minor = (val >> 8) & GENMASK(7, 0); dev->rev = (val >> 0) & GENMASK(7, 0);
- dev_dbg(dev->dev, "SMU version is %u.%u.%u\n", dev->major, dev->minor, dev->rev); + dev_dbg(dev->dev, "SMU program %u version is %u.%u.%u\n", + dev->smu_program, dev->major, dev->minor, dev->rev);
return 0; }
From: Mario Limonciello mario.limonciello@amd.com
commit 8e60615e8932167057b363c11a7835da7f007106 upstream.
By default when the system is configured for low power idle in the FADT the keyboard is set up as a wake source. This matches the behavior that Windows uses for Modern Standby as well.
It has been reported that a variety of AMD based designs there are spurious wakeups are happening where two IRQ sources are active.
For example: ``` PM: Triggering wakeup from IRQ 9 PM: Triggering wakeup from IRQ 1 ```
In these designs IRQ 9 is the ACPI SCI and IRQ 1 is the keyboard. One way to trigger this problem is to suspend the laptop and then unplug the AC adapter. The SOC will be in a hardware sleep state and plugging in the AC adapter returns control to the kernel's s2idle loop.
Normally if just IRQ 9 was active the s2idle loop would advance any EC transactions and no other IRQ being active would cause the s2idle loop to put the SOC back into hardware sleep state.
When this bug occurred IRQ 1 is also active even if no keyboard activity occurred. This causes the s2idle loop to break and the system to wake.
This is a platform firmware bug triggering IRQ1 without keyboard activity. This occurs in Windows as well, but Windows will enter "SW DRIPS" and then with no activity enters back into "HW DRIPS" (hardware sleep state).
This issue affects Renoir, Lucienne, Cezanne, and Barcelo platforms. It does not happen on newer systems such as Mendocino or Rembrandt.
It's been fixed in newer platform firmware. To avoid triggering the bug on older systems check the SMU F/W version and adjust the policy at suspend time for s2idle wakeup from keyboard on these systems. A lot of thought and experimentation has been given around the timing of disabling IRQ1, and to make it work the "suspend" PM callback is restored.
Reported-by: Kai-Heng Feng kai.heng.feng@canonical.com Reported-by: Xaver Hugl xaver.hugl@gmail.com Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2115 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1951 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20230120191519.15926-1-mario.limonciello@amd.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com [ This has been hand modified for missing dependency commits. ] Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1921#note_1770257 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/amd-pmc.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+)
--- a/drivers/platform/x86/amd-pmc.c +++ b/drivers/platform/x86/amd-pmc.c @@ -20,6 +20,7 @@ #include <linux/module.h> #include <linux/pci.h> #include <linux/platform_device.h> +#include <linux/serio.h> #include <linux/suspend.h> #include <linux/seq_file.h> #include <linux/uaccess.h> @@ -412,12 +413,48 @@ static int amd_pmc_get_os_hint(struct am return -EINVAL; }
+static int amd_pmc_czn_wa_irq1(struct amd_pmc_dev *pdev) +{ + struct device *d; + int rc; + + if (!pdev->major) { + rc = amd_pmc_get_smu_version(pdev); + if (rc) + return rc; + } + + if (pdev->major > 64 || (pdev->major == 64 && pdev->minor > 65)) + return 0; + + d = bus_find_device_by_name(&serio_bus, NULL, "serio0"); + if (!d) + return 0; + if (device_may_wakeup(d)) { + dev_info_once(d, "Disabling IRQ1 wakeup source to avoid platform firmware bug\n"); + disable_irq_wake(1); + device_set_wakeup_enable(d, false); + } + put_device(d); + + return 0; +} + static int __maybe_unused amd_pmc_suspend(struct device *dev) { struct amd_pmc_dev *pdev = dev_get_drvdata(dev); int rc; u8 msg;
+ if (pdev->cpu_id == AMD_CPU_ID_CZN) { + int rc = amd_pmc_czn_wa_irq1(pdev); + + if (rc) { + dev_err(pdev->dev, "failed to adjust keyboard wakeup: %d\n", rc); + return rc; + } + } + /* Reset and Start SMU logging - to monitor the s0i3 stats */ amd_pmc_send_cmd(pdev, 0, NULL, SMU_MSG_LOG_RESET, 0); amd_pmc_send_cmd(pdev, 0, NULL, SMU_MSG_LOG_START, 0);
From: Florian Westphal fw@strlen.de
commit 18bbc3213383a82b05383827f4b1b882e3f0a5a5 upstream.
TPROXY is only allowed from prerouting, but nft_tproxy doesn't check this. This fixes a crash (null dereference) when using tproxy from e.g. output.
Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support") Reported-by: Shell Chen xierch@gmail.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Qingfang DENG dqfext@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nft_tproxy.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/net/netfilter/nft_tproxy.c +++ b/net/netfilter/nft_tproxy.c @@ -312,6 +312,13 @@ static int nft_tproxy_dump(struct sk_buf return 0; }
+static int nft_tproxy_validate(const struct nft_ctx *ctx, + const struct nft_expr *expr, + const struct nft_data **data) +{ + return nft_chain_validate_hooks(ctx->chain, 1 << NF_INET_PRE_ROUTING); +} + static struct nft_expr_type nft_tproxy_type; static const struct nft_expr_ops nft_tproxy_ops = { .type = &nft_tproxy_type, @@ -320,6 +327,7 @@ static const struct nft_expr_ops nft_tpr .init = nft_tproxy_init, .destroy = nft_tproxy_destroy, .dump = nft_tproxy_dump, + .validate = nft_tproxy_validate, };
static struct nft_expr_type nft_tproxy_type __read_mostly = {
From: Kuniyuki Iwashima kuniyu@amazon.com
When we backport dadd0dcaa67d ("net/ulp: prevent ULP without clone op from entering the LISTEN status"), we have accidentally backported a part of 7a7160edf1bf ("net: Return errno in sk->sk_prot->get_port().") and removed err = -EADDRINUSE in inet_csk_listen_start().
Thus, listen() no longer returns -EADDRINUSE even if ->get_port() failed as reported in [0].
We set -EADDRINUSE to err just before ->get_port() to fix the regression.
[0]: https://lore.kernel.org/stable/EF8A45D0-768A-4CD5-9A8A-0FA6E610ABF7@winter.c...
Reported-by: Winter winter@winter.cafe Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/inet_connection_sock.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -1070,6 +1070,7 @@ int inet_csk_listen_start(struct sock *s * It is OK, because this socket enters to hash table only * after validation is complete. */ + err = -EADDRINUSE; inet_sk_state_store(sk, TCP_LISTEN); if (!sk->sk_prot->get_port(sk, inet->inet_num)) { inet->inet_sport = htons(inet->inet_num);
From: Paul Cercueil paul@crapouillou.net
commit 3f18c5046e633cc4bbad396b74c05d46d353033d upstream.
On JZ4760 and JZ4760B, SD cards fail to run if the maximum clock rate is set to 50 MHz, even though the controller officially does support it.
Until the actual bug is found and fixed, limit the maximum clock rate to 24 MHz.
Signed-off-by: Paul Cercueil paul@crapouillou.net Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230131210229.68129-1-paul@crapouillou.net Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/jz4740_mmc.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/mmc/host/jz4740_mmc.c +++ b/drivers/mmc/host/jz4740_mmc.c @@ -1038,6 +1038,16 @@ static int jz4740_mmc_probe(struct platf mmc->ops = &jz4740_mmc_ops; if (!mmc->f_max) mmc->f_max = JZ_MMC_CLK_RATE; + + /* + * There seems to be a problem with this driver on the JZ4760 and + * JZ4760B SoCs. There, when using the maximum rate supported (50 MHz), + * the communication fails with many SD cards. + * Until this bug is sorted out, limit the maximum rate to 24 MHz. + */ + if (host->version == JZ_MMC_JZ4760 && mmc->f_max > JZ_MMC_CLK_RATE) + mmc->f_max = JZ_MMC_CLK_RATE; + mmc->f_min = mmc->f_max / 128; mmc->ocr_avail = MMC_VDD_32_33 | MMC_VDD_33_34;
From: Yang Yingliang yangyingliang@huawei.com
commit 605d9fb9556f8f5fb4566f4df1480f280f308ded upstream.
If sdio_add_func() or sdio_init_func() fails, sdio_remove_func() can not release the resources, because the sdio function is not presented in these two cases, it won't call of_node_put() or put_device().
To fix these leaks, make sdio_func_present() only control whether device_del() needs to be called or not, then always call of_node_put() and put_device().
In error case in sdio_init_func(), the reference of 'card->dev' is not get, to avoid redundant put in sdio_free_func_cis(), move the get_device() to sdio_alloc_func() and put_device() to sdio_release_func(), it can keep the get/put function be balanced.
Without this patch, while doing fault inject test, it can get the following leak reports, after this fix, the leak is gone.
unreferenced object 0xffff888112514000 (size 2048): comm "kworker/3:2", pid 65, jiffies 4294741614 (age 124.774s) hex dump (first 32 bytes): 00 e0 6f 12 81 88 ff ff 60 58 8d 06 81 88 ff ff ..o.....`X...... 10 40 51 12 81 88 ff ff 10 40 51 12 81 88 ff ff .@Q......@Q..... backtrace: [<000000009e5931da>] kmalloc_trace+0x21/0x110 [<000000002f839ccb>] mmc_alloc_card+0x38/0xb0 [mmc_core] [<0000000004adcbf6>] mmc_sdio_init_card+0xde/0x170 [mmc_core] [<000000007538fea0>] mmc_attach_sdio+0xcb/0x1b0 [mmc_core] [<00000000d4fdeba7>] mmc_rescan+0x54a/0x640 [mmc_core]
unreferenced object 0xffff888112511000 (size 2048): comm "kworker/3:2", pid 65, jiffies 4294741623 (age 124.766s) hex dump (first 32 bytes): 00 40 51 12 81 88 ff ff e0 58 8d 06 81 88 ff ff .@Q......X...... 10 10 51 12 81 88 ff ff 10 10 51 12 81 88 ff ff ..Q.......Q..... backtrace: [<000000009e5931da>] kmalloc_trace+0x21/0x110 [<00000000fcbe706c>] sdio_alloc_func+0x35/0x100 [mmc_core] [<00000000c68f4b50>] mmc_attach_sdio.cold.18+0xb1/0x395 [mmc_core] [<00000000d4fdeba7>] mmc_rescan+0x54a/0x640 [mmc_core]
Fixes: 3d10a1ba0d37 ("sdio: fix reference counting in sdio_remove_func()") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230130125808.3471254-1-yangyingliang@huawei.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/sdio_bus.c | 17 ++++++++++++++--- drivers/mmc/core/sdio_cis.c | 12 ------------ 2 files changed, 14 insertions(+), 15 deletions(-)
--- a/drivers/mmc/core/sdio_bus.c +++ b/drivers/mmc/core/sdio_bus.c @@ -293,6 +293,12 @@ static void sdio_release_func(struct dev if (!(func->card->quirks & MMC_QUIRK_NONSTD_SDIO)) sdio_free_func_cis(func);
+ /* + * We have now removed the link to the tuples in the + * card structure, so remove the reference. + */ + put_device(&func->card->dev); + kfree(func->info); kfree(func->tmpbuf); kfree(func); @@ -323,6 +329,12 @@ struct sdio_func *sdio_alloc_func(struct
device_initialize(&func->dev);
+ /* + * We may link to tuples in the card structure, + * we need make sure we have a reference to it. + */ + get_device(&func->card->dev); + func->dev.parent = &card->dev; func->dev.bus = &sdio_bus_type; func->dev.release = sdio_release_func; @@ -376,10 +388,9 @@ int sdio_add_func(struct sdio_func *func */ void sdio_remove_func(struct sdio_func *func) { - if (!sdio_func_present(func)) - return; + if (sdio_func_present(func)) + device_del(&func->dev);
- device_del(&func->dev); of_node_put(func->dev.of_node); put_device(&func->dev); } --- a/drivers/mmc/core/sdio_cis.c +++ b/drivers/mmc/core/sdio_cis.c @@ -404,12 +404,6 @@ int sdio_read_func_cis(struct sdio_func return ret;
/* - * Since we've linked to tuples in the card structure, - * we must make sure we have a reference to it. - */ - get_device(&func->card->dev); - - /* * Vendor/device id is optional for function CIS, so * copy it from the card structure as needed. */ @@ -434,11 +428,5 @@ void sdio_free_func_cis(struct sdio_func }
func->tuples = NULL; - - /* - * We have now removed the link to the tuples in the - * card structure, so remove the reference. - */ - put_device(&func->card->dev); }
From: Yang Yingliang yangyingliang@huawei.com
commit cf4c9d2ac1e42c7d18b921bec39486896645b714 upstream.
If mmc_add_host() fails, it doesn't need to call mmc_remove_host(), or it will cause null-ptr-deref, because of deleting a not added device in mmc_remove_host().
To fix this, goto label 'fail_glue_init', if mmc_add_host() fails, and change the label 'fail_add_host' to 'fail_gpiod_request'.
Fixes: 15a0580ced08 ("mmc_spi host driver") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Cc:stable@vger.kernel.org Link: https://lore.kernel.org/r/20230131013835.3564011-1-yangyingliang@huawei.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/mmc_spi.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/mmc/host/mmc_spi.c +++ b/drivers/mmc/host/mmc_spi.c @@ -1441,7 +1441,7 @@ static int mmc_spi_probe(struct spi_devi
status = mmc_add_host(mmc); if (status != 0) - goto fail_add_host; + goto fail_glue_init;
/* * Index 0 is card detect @@ -1449,7 +1449,7 @@ static int mmc_spi_probe(struct spi_devi */ status = mmc_gpiod_request_cd(mmc, NULL, 0, false, 1000); if (status == -EPROBE_DEFER) - goto fail_add_host; + goto fail_gpiod_request; if (!status) { /* * The platform has a CD GPIO signal that may support @@ -1464,7 +1464,7 @@ static int mmc_spi_probe(struct spi_devi /* Index 1 is write protect/read only */ status = mmc_gpiod_request_ro(mmc, NULL, 1, 0); if (status == -EPROBE_DEFER) - goto fail_add_host; + goto fail_gpiod_request; if (!status) has_ro = true;
@@ -1478,7 +1478,7 @@ static int mmc_spi_probe(struct spi_devi ? ", cd polling" : ""); return 0;
-fail_add_host: +fail_gpiod_request: mmc_remove_host(mmc); fail_glue_init: mmc_spi_dma_free(host);
From: Bo Liu bo.liu@senarytech.com
commit 18d7e16c917a08f08778ecf2b780d63648d5d923 upstream.
The current kernel does not support the SN6180 codec chip. Add the SN6180 codec configuration item to kernel.
Signed-off-by: Bo Liu bo.liu@senarytech.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1675908828-1012-1-git-send-email-bo.liu@senarytech... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_conexant.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -1124,6 +1124,7 @@ static const struct hda_device_id snd_hd HDA_CODEC_ENTRY(0x14f11f86, "CX8070", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f12008, "CX8200", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f120d0, "CX11970", patch_conexant_auto), + HDA_CODEC_ENTRY(0x14f120d1, "SN6180", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f15045, "CX20549 (Venice)", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f15047, "CX20551 (Waikiki)", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f15051, "CX20561 (Hermosa)", patch_conexant_auto),
From: Kailang Yang kailang@realtek.com
commit 2bdccfd290d421b50df4ec6a68d832dad1310748 upstream.
GPIO2 PIN use for output. Mask Dir and Data need to assign for 0x4. Not 0x3. This fixed was for Lenovo Desktop(0x17aa1056). GPIO2 use for AMP enable.
Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/8d02bb9ac8134f878cd08607fdf088fd@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -826,7 +826,7 @@ do_sku: alc_setup_gpio(codec, 0x02); break; case 7: - alc_setup_gpio(codec, 0x03); + alc_setup_gpio(codec, 0x04); break; case 5: default:
From: Munehisa Kamata kamatam@amazon.com
commit c2dbe32d5db5c4ead121cf86dabd5ab691fb47fe upstream.
If a non-root cgroup gets removed when there is a thread that registered trigger and is polling on a pressure file within the cgroup, the polling waitqueue gets freed in the following path:
do_rmdir cgroup_rmdir kernfs_drain_open_files cgroup_file_release cgroup_pressure_release psi_trigger_destroy
However, the polling thread still has a reference to the pressure file and will access the freed waitqueue when the file is closed or upon exit:
fput ep_eventpoll_release ep_free ep_remove_wait_queue remove_wait_queue
This results in use-after-free as pasted below.
The fundamental problem here is that cgroup_file_release() (and consequently waitqueue's lifetime) is not tied to the file's real lifetime. Using wake_up_pollfree() here might be less than ideal, but it is in line with the comment at commit 42288cb44c4b ("wait: add wake_up_pollfree()") since the waitqueue's lifetime is not tied to file's one and can be considered as another special case. While this would be fixable by somehow making cgroup_file_release() be tied to the fput(), it would require sizable refactoring at cgroups or higher layer which might be more justifiable if we identify more cases like this.
BUG: KASAN: use-after-free in _raw_spin_lock_irqsave+0x60/0xc0 Write of size 4 at addr ffff88810e625328 by task a.out/4404
CPU: 19 PID: 4404 Comm: a.out Not tainted 6.2.0-rc6 #38 Hardware name: Amazon EC2 c5a.8xlarge/, BIOS 1.0 10/16/2017 Call Trace: <TASK> dump_stack_lvl+0x73/0xa0 print_report+0x16c/0x4e0 kasan_report+0xc3/0xf0 kasan_check_range+0x2d2/0x310 _raw_spin_lock_irqsave+0x60/0xc0 remove_wait_queue+0x1a/0xa0 ep_free+0x12c/0x170 ep_eventpoll_release+0x26/0x30 __fput+0x202/0x400 task_work_run+0x11d/0x170 do_exit+0x495/0x1130 do_group_exit+0x100/0x100 get_signal+0xd67/0xde0 arch_do_signal_or_restart+0x2a/0x2b0 exit_to_user_mode_prepare+0x94/0x100 syscall_exit_to_user_mode+0x20/0x40 do_syscall_64+0x52/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK>
Allocated by task 4404:
kasan_set_track+0x3d/0x60 __kasan_kmalloc+0x85/0x90 psi_trigger_create+0x113/0x3e0 pressure_write+0x146/0x2e0 cgroup_file_write+0x11c/0x250 kernfs_fop_write_iter+0x186/0x220 vfs_write+0x3d8/0x5c0 ksys_write+0x90/0x110 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 4407:
kasan_set_track+0x3d/0x60 kasan_save_free_info+0x27/0x40 ____kasan_slab_free+0x11d/0x170 slab_free_freelist_hook+0x87/0x150 __kmem_cache_free+0xcb/0x180 psi_trigger_destroy+0x2e8/0x310 cgroup_file_release+0x4f/0xb0 kernfs_drain_open_files+0x165/0x1f0 kernfs_drain+0x162/0x1a0 __kernfs_remove+0x1fb/0x310 kernfs_remove_by_name_ns+0x95/0xe0 cgroup_addrm_files+0x67f/0x700 cgroup_destroy_locked+0x283/0x3c0 cgroup_rmdir+0x29/0x100 kernfs_iop_rmdir+0xd1/0x140 vfs_rmdir+0xfe/0x240 do_rmdir+0x13d/0x280 __x64_sys_rmdir+0x2c/0x30 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 0e94682b73bf ("psi: introduce psi monitor") Signed-off-by: Munehisa Kamata kamatam@amazon.com Signed-off-by: Mengchi Cheng mengcc@amazon.com Signed-off-by: Ingo Molnar mingo@kernel.org Acked-by: Suren Baghdasaryan surenb@google.com Acked-by: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/20230106224859.4123476-1-kamatam@amazon.com/ Link: https://lore.kernel.org/r/20230214212705.4058045-1-kamatam@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/psi.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -1169,10 +1169,11 @@ void psi_trigger_destroy(struct psi_trig
group = t->group; /* - * Wakeup waiters to stop polling. Can happen if cgroup is deleted - * from under a polling process. + * Wakeup waiters to stop polling and clear the queue to prevent it from + * being accessed later. Can happen if cgroup is deleted from under a + * polling process. */ - wake_up_interruptible(&t->event_wait); + wake_up_pollfree(&t->event_wait);
mutex_lock(&group->trigger_lock);
From: Mike Kravetz mike.kravetz@oracle.com
commit ec4288fe63966b26d53907212ecd05dfa81dd2cc upstream.
Users can specify the hugetlb page size in the mmap, shmget and memfd_create system calls. This is done by using 6 bits within the flags argument to encode the base-2 logarithm of the desired page size. The routine hstate_sizelog() uses the log2 value to find the corresponding hugetlb hstate structure. Converting the log2 value (page_size_log) to potential hugetlb page size is the simple statement:
1UL << page_size_log
Because only 6 bits are used for page_size_log, the left shift can not be greater than 63. This is fine on 64 bit architectures where a long is 64 bits. However, if a value greater than 31 is passed on a 32 bit architecture (where long is 32 bits) the shift will result in undefined behavior. This was generally not an issue as the result of the undefined shift had to exactly match hugetlb page size to proceed.
Recent improvements in runtime checking have resulted in this undefined behavior throwing errors such as reported below.
Fix by comparing page_size_log to BITS_PER_LONG before doing shift.
Link: https://lkml.kernel.org/r/20230216013542.138708-1-mike.kravetz@oracle.com Link: https://lore.kernel.org/lkml/CA+G9fYuei_Tr-vN9GS7SfFyU1y9hNysnf=PB7kT0=yv4Mi... Fixes: 42d7395feb56 ("mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB") Signed-off-by: Mike Kravetz mike.kravetz@oracle.com Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Reviewed-by: Jesper Juhl jesperjuhl76@gmail.com Acked-by: Muchun Song songmuchun@bytedance.com Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Naresh Kamboju naresh.kamboju@linaro.org Cc: Anders Roxell anders.roxell@linaro.org Cc: Andi Kleen ak@linux.intel.com Cc: Sasha Levin sashal@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/hugetlb.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -684,7 +684,10 @@ static inline struct hstate *hstate_size if (!page_size_log) return &default_hstate;
- return size_to_hstate(1UL << page_size_log); + if (page_size_log < BITS_PER_LONG) + return size_to_hstate(1UL << page_size_log); + + return NULL; }
static inline struct hstate *hstate_vma(struct vm_area_struct *vma)
From: Isaac J. Manjarres isaacmanjarres@google.com
commit ce4d9a1ea35ac5429e822c4106cb2859d5c71f3e upstream.
Patch series "Fix kmemleak crashes when scanning CMA regions", v2.
When trying to boot a device with an ARM64 kernel with the following config options enabled:
CONFIG_DEBUG_PAGEALLOC=y CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y CONFIG_DEBUG_KMEMLEAK=y
a crash is encountered when kmemleak starts to scan the list of gray or allocated objects that it maintains. Upon closer inspection, it was observed that these page-faults always occurred when kmemleak attempted to scan a CMA region.
At the moment, kmemleak is made aware of CMA regions that are specified through the devicetree to be dynamically allocated within a range of addresses. However, kmemleak should not need to scan CMA regions or any reserved memory region, as those regions can be used for DMA transfers between drivers and peripherals, and thus wouldn't contain anything useful for kmemleak.
Additionally, since CMA regions are unmapped from the kernel's address space when they are freed to the buddy allocator at boot when CONFIG_DEBUG_PAGEALLOC is enabled, kmemleak shouldn't attempt to access those memory regions, as that will trigger a crash. Thus, kmemleak should ignore all dynamically allocated reserved memory regions.
This patch (of 1):
Currently, kmemleak ignores dynamically allocated reserved memory regions that don't have a kernel mapping. However, regions that do retain a kernel mapping (e.g. CMA regions) do get scanned by kmemleak.
This is not ideal for two reasons:
1 kmemleak works by scanning memory regions for pointers to allocated objects to determine if those objects have been leaked or not. However, reserved memory regions can be used between drivers and peripherals for DMA transfers, and thus, would not contain pointers to allocated objects, making it unnecessary for kmemleak to scan these reserved memory regions.
2 When CONFIG_DEBUG_PAGEALLOC is enabled, along with kmemleak, the CMA reserved memory regions are unmapped from the kernel's address space when they are freed to buddy at boot. These CMA reserved regions are still tracked by kmemleak, however, and when kmemleak attempts to scan them, a crash will happen, as accessing the CMA region will result in a page-fault, since the regions are unmapped.
Thus, use kmemleak_ignore_phys() for all dynamically allocated reserved memory regions, instead of those that do not have a kernel mapping associated with them.
Link: https://lkml.kernel.org/r/20230208232001.2052777-1-isaacmanjarres@google.com Link: https://lkml.kernel.org/r/20230208232001.2052777-2-isaacmanjarres@google.com Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private") Signed-off-by: Isaac J. Manjarres isaacmanjarres@google.com Acked-by: Mike Rapoport (IBM) rppt@kernel.org Acked-by: Catalin Marinas catalin.marinas@arm.com Cc: Frank Rowand frowand.list@gmail.com Cc: Kirill A. Shutemov kirill.shtuemov@linux.intel.com Cc: Nick Kossifidis mick@ics.forth.gr Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Rob Herring robh@kernel.org Cc: Russell King (Oracle) rmk+kernel@armlinux.org.uk Cc: Saravana Kannan saravanak@google.com Cc: stable@vger.kernel.org [5.15+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/of_reserved_mem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/of/of_reserved_mem.c +++ b/drivers/of/of_reserved_mem.c @@ -47,9 +47,10 @@ static int __init early_init_dt_alloc_re err = memblock_mark_nomap(base, size); if (err) memblock_free(base, size); - kmemleak_ignore_phys(base); }
+ kmemleak_ignore_phys(base); + return err; }
From: Misono Tomohiro misono.tomohiro@jp.fujitsu.com
commit 90091c367e74d5b58d9ebe979cc363f7468f58d3 upstream.
Exit with return code 4 if lkdtm is not available like other tests in order to properly skip the test.
Signed-off-by: Misono Tomohiro misono.tomohiro@jp.fujitsu.com Signed-off-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/20210805101236.1140381-1-misono.tomohiro@jp.fujits... Cc: Andrew Paniakin apanyaki@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/lkdtm/stack-entropy.sh | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/lkdtm/stack-entropy.sh +++ b/tools/testing/selftests/lkdtm/stack-entropy.sh @@ -4,13 +4,27 @@ # Measure kernel stack entropy by sampling via LKDTM's REPORT_STACK test. set -e samples="${1:-1000}" +TRIGGER=/sys/kernel/debug/provoke-crash/DIRECT +KSELFTEST_SKIP_TEST=4 + +# Verify we have LKDTM available in the kernel. +if [ ! -r $TRIGGER ] ; then + /sbin/modprobe -q lkdtm || true + if [ ! -r $TRIGGER ] ; then + echo "Cannot find $TRIGGER (missing CONFIG_LKDTM?)" + else + echo "Cannot write $TRIGGER (need to run as root?)" + fi + # Skip this test + exit $KSELFTEST_SKIP_TEST +fi
# Capture dmesg continuously since it may fill up depending on sample size. log=$(mktemp -t stack-entropy-XXXXXX) dmesg --follow >"$log" & pid=$! report=-1 for i in $(seq 1 $samples); do - echo "REPORT_STACK" >/sys/kernel/debug/provoke-crash/DIRECT + echo "REPORT_STACK" > $TRIGGER if [ -t 1 ]; then percent=$(( 100 * $i / $samples )) if [ "$percent" -ne "$report" ]; then
From: Aaron Thompson dev@aaront.org
commit 647037adcad00f2bab8828d3d41cd0553d41f3bd upstream.
This reverts commit 115d9d77bb0f9152c60b6e8646369fa7f6167593.
The pages being freed by memblock_free_late() have already been initialized, but if they are in the deferred init range, __free_one_page() might access nearby uninitialized pages when trying to coalesce buddies. This can, for example, trigger this BUG:
BUG: unable to handle page fault for address: ffffe964c02580c8 RIP: 0010:__list_del_entry_valid+0x3f/0x70 <TASK> __free_one_page+0x139/0x410 __free_pages_ok+0x21d/0x450 memblock_free_late+0x8c/0xb9 efi_free_boot_services+0x16b/0x25c efi_enter_virtual_mode+0x403/0x446 start_kernel+0x678/0x714 secondary_startup_64_no_verify+0xd2/0xdb </TASK>
A proper fix will be more involved so revert this change for the time being.
Fixes: 115d9d77bb0f ("mm: Always release pages to the buddy allocator in memblock_free_late().") Signed-off-by: Aaron Thompson dev@aaront.org Link: https://lore.kernel.org/r/20230207082151.1303-1-dev@aaront.org Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memblock.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
--- a/mm/memblock.c +++ b/mm/memblock.c @@ -1615,13 +1615,7 @@ void __init __memblock_free_late(phys_ad end = PFN_DOWN(base + size);
for (; cursor < end; cursor++) { - /* - * Reserved pages are always initialized by the end of - * memblock_free_all() (by memmap_init() and, if deferred - * initialization is enabled, memmap_init_reserved_pages()), so - * these pages can be released directly to the buddy allocator. - */ - __free_pages_core(pfn_to_page(cursor), 0); + memblock_free_pages(pfn_to_page(cursor), cursor, 0); totalram_pages_inc(); } }
From: Felix Riemann felix.riemann@sma.de
commit 9b55d3f0a69af649c62cbc2633e6d695bb3cc583 upstream.
When converting net_device_stats to rtnl_link_stats64 sign extension is triggered on ILP32 machines as 6c1c509778 changed the previous "ulong -> u64" conversion to "long -> u64" by accessing the net_device_stats fields through a (signed) atomic_long_t.
This causes for example the received bytes counter to jump to 16EiB after having received 2^31 bytes. Casting the atomic value to "unsigned long" beforehand converting it into u64 avoids this.
Fixes: 6c1c5097781f ("net: add atomic_long_t to net_device_stats fields") Signed-off-by: Felix Riemann felix.riemann@sma.de Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -10646,7 +10646,7 @@ void netdev_stats_to_stats64(struct rtnl
BUILD_BUG_ON(n > sizeof(*stats64) / sizeof(u64)); for (i = 0; i < n; i++) - dst[i] = atomic_long_read(&src[i]); + dst[i] = (unsigned long)atomic_long_read(&src[i]); /* zero out counters that only exist in rtnl_link_stats64 */ memset((char *)stats64 + n * sizeof(u64), 0, sizeof(*stats64) - n * sizeof(u64));
From: Andrew Morton akpm@linux-foundation.org
commit a5b21d8d791cd4db609d0bbcaa9e0c7e019888d1 upstream.
This fix was nacked by Philip, for reasons identified in the email linked below.
Link: https://lkml.kernel.org/r/68f15d67-8945-2728-1f17-5b53a80ec52d@squashfs.org.... Fixes: 72e544b1b28325 ("squashfs: harden sanity check in squashfs_read_xattr_id_table") Cc: Alexey Khoroshilov khoroshilov@ispras.ru Cc: Fedor Pchelkin pchelkin@ispras.ru Cc: Phillip Lougher phillip@squashfs.org.uk Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/xattr_id.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/squashfs/xattr_id.c +++ b/fs/squashfs/xattr_id.c @@ -76,7 +76,7 @@ __le64 *squashfs_read_xattr_id_table(str /* Sanity check values */
/* there is always at least one xattr id */ - if (*xattr_ids <= 0) + if (*xattr_ids == 0) return ERR_PTR(-EINVAL);
len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
From: Jason Xing kernelxing@tencent.com
commit f9cd6a4418bac6a046ee78382423b1ae7565fb24 upstream.
Recently I encountered one case where I cannot increase the MTU size directly from 1500 to a much bigger value with XDP enabled if the server is equipped with IXGBE card, which happened on thousands of servers in production environment. After applying the current patch, we can set the maximum MTU size to 3K.
This patch follows the behavior of changing MTU as i40e/ice does.
[1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP") [2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP") Signed-off-by: Jason Xing kernelxing@tencent.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -6730,6 +6730,18 @@ static void ixgbe_free_all_rx_resources( }
/** + * ixgbe_max_xdp_frame_size - returns the maximum allowed frame size for XDP + * @adapter: device handle, pointer to adapter + */ +static int ixgbe_max_xdp_frame_size(struct ixgbe_adapter *adapter) +{ + if (PAGE_SIZE >= 8192 || adapter->flags2 & IXGBE_FLAG2_RX_LEGACY) + return IXGBE_RXBUFFER_2K; + else + return IXGBE_RXBUFFER_3K; +} + +/** * ixgbe_change_mtu - Change the Maximum Transfer Unit * @netdev: network interface device structure * @new_mtu: new value for maximum frame size @@ -6740,18 +6752,13 @@ static int ixgbe_change_mtu(struct net_d { struct ixgbe_adapter *adapter = netdev_priv(netdev);
- if (adapter->xdp_prog) { + if (ixgbe_enabled_xdp_adapter(adapter)) { int new_frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN; - int i; - - for (i = 0; i < adapter->num_rx_queues; i++) { - struct ixgbe_ring *ring = adapter->rx_ring[i];
- if (new_frame_size > ixgbe_rx_bufsz(ring)) { - e_warn(probe, "Requested MTU size is not supported with XDP\n"); - return -EINVAL; - } + if (new_frame_size > ixgbe_max_xdp_frame_size(adapter)) { + e_warn(probe, "Requested MTU size is not supported with XDP\n"); + return -EINVAL; } }
From: Jason Xing kernelxing@tencent.com
commit ce45ffb815e8e238f05de1630be3969b6bb15e4e upstream.
Include the second VLAN HLEN into account when computing the maximum MTU size as other drivers do.
Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Signed-off-by: Jason Xing kernelxing@tencent.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -2789,7 +2789,7 @@ static int i40e_change_mtu(struct net_de struct i40e_pf *pf = vsi->back;
if (i40e_enabled_xdp_vsi(vsi)) { - int frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN; + int frame_size = new_mtu + I40E_PACKET_HDR_PAD;
if (frame_size > i40e_max_xdp_frame_size(vsi)) return -EINVAL;
From: Rafał Miłecki rafal@milecki.pl
commit d61615c366a489646a1bfe5b33455f916762d5f4 upstream.
Code blocks handling BCMA_CHIP_ID_BCM5357 and BCMA_CHIP_ID_BCM53572 were incorrectly unified. Chip package values are not unique and cannot be checked independently. They are meaningful only in a context of a given chip.
Packages BCM5358 and BCM47188 share the same value but then belong to different chips. Code unification resulted in treating BCM5358 as BCM47188 and broke its initialization.
Link: https://github.com/openwrt/openwrt/issues/8278 Fixes: cb1b0f90acfe ("net: ethernet: bgmac: unify code of the same family") Cc: Jon Mason jdmason@kudzu.us Signed-off-by: Rafał Miłecki rafal@milecki.pl Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20230208091637.16291-1-zajec5@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bgmac-bcma.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/broadcom/bgmac-bcma.c +++ b/drivers/net/ethernet/broadcom/bgmac-bcma.c @@ -228,12 +228,12 @@ static int bgmac_probe(struct bcma_devic bgmac->feature_flags |= BGMAC_FEAT_CLKCTLST; bgmac->feature_flags |= BGMAC_FEAT_FLW_CTRL1; bgmac->feature_flags |= BGMAC_FEAT_SW_TYPE_PHY; - if (ci->pkg == BCMA_PKG_ID_BCM47188 || - ci->pkg == BCMA_PKG_ID_BCM47186) { + if ((ci->id == BCMA_CHIP_ID_BCM5357 && ci->pkg == BCMA_PKG_ID_BCM47186) || + (ci->id == BCMA_CHIP_ID_BCM53572 && ci->pkg == BCMA_PKG_ID_BCM47188)) { bgmac->feature_flags |= BGMAC_FEAT_SW_TYPE_RGMII; bgmac->feature_flags |= BGMAC_FEAT_IOST_ATTACHED; } - if (ci->pkg == BCMA_PKG_ID_BCM5358) + if (ci->id == BCMA_CHIP_ID_BCM5357 && ci->pkg == BCMA_PKG_ID_BCM5358) bgmac->feature_flags |= BGMAC_FEAT_SW_TYPE_EPHYRMII; break; case BCMA_CHIP_ID_BCM53573:
From: Siddharth Vadapalli s-vadapalli@ti.com
commit 0ed577e7e8e508c24e22ba07713ecc4903e147c3 upstream.
In TI's AM62x/AM64x SoCs, successful teardown of RX DMA Channel raises an interrupt. The process of servicing this interrupt involves flushing all pending RX DMA descriptors and clearing the teardown completion marker (TDCM). The am65_cpsw_nuss_rx_packets() function invoked from the RX NAPI callback services the interrupt. Thus, it is necessary to wait for this handler to run, drain all packets and clear TDCM, before calling napi_disable() in am65_cpsw_nuss_common_stop() function post channel teardown. If napi_disable() executes before ensuring that TDCM is cleared, the TDCM remains set when the interfaces are down, resulting in an interrupt storm when the interfaces are brought up again.
Since the interrupt raised to indicate the RX DMA Channel teardown is specific to the AM62x and AM64x SoCs, add a quirk for it.
Fixes: 4f7cce272403 ("net: ethernet: ti: am65-cpsw: add support for am64x cpsw3g") Co-developed-by: Vignesh Raghavendra vigneshr@ti.com Signed-off-by: Vignesh Raghavendra vigneshr@ti.com Signed-off-by: Siddharth Vadapalli s-vadapalli@ti.com Reviewed-by: Roger Quadros rogerq@kernel.org Link: https://lore.kernel.org/r/20230209084432.189222-1-s-vadapalli@ti.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 12 +++++++++++- drivers/net/ethernet/ti/am65-cpsw-nuss.h | 1 + 2 files changed, 12 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -564,7 +564,15 @@ static int am65_cpsw_nuss_common_stop(st k3_udma_glue_disable_tx_chn(common->tx_chns[i].tx_chn); }
+ reinit_completion(&common->tdown_complete); k3_udma_glue_tdown_rx_chn(common->rx_chns.rx_chn, true); + + if (common->pdata.quirks & AM64_CPSW_QUIRK_DMA_RX_TDOWN_IRQ) { + i = wait_for_completion_timeout(&common->tdown_complete, msecs_to_jiffies(1000)); + if (!i) + dev_err(common->dev, "rx teardown timeout\n"); + } + napi_disable(&common->napi_rx);
for (i = 0; i < AM65_CPSW_MAX_RX_FLOWS; i++) @@ -786,6 +794,8 @@ static int am65_cpsw_nuss_rx_packets(str
if (cppi5_desc_is_tdcm(desc_dma)) { dev_dbg(dev, "%s RX tdown flow: %u\n", __func__, flow_idx); + if (common->pdata.quirks & AM64_CPSW_QUIRK_DMA_RX_TDOWN_IRQ) + complete(&common->tdown_complete); return 0; }
@@ -2609,7 +2619,7 @@ static const struct am65_cpsw_pdata j721 };
static const struct am65_cpsw_pdata am64x_cpswxg_pdata = { - .quirks = 0, + .quirks = AM64_CPSW_QUIRK_DMA_RX_TDOWN_IRQ, .ale_dev_id = "am64-cpswxg", .fdqring_mode = K3_RINGACC_RING_MODE_RING, }; --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.h +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.h @@ -84,6 +84,7 @@ struct am65_cpsw_rx_chn { };
#define AM65_CPSW_QUIRK_I2027_NO_TX_CSUM BIT(0) +#define AM64_CPSW_QUIRK_DMA_RX_TDOWN_IRQ BIT(1)
struct am65_cpsw_pdata { u32 quirks;
From: Pietro Borrello borrello@diag.uniroma1.it
commit a1221703a0f75a9d81748c516457e0fc76951496 upstream.
Use list_is_first() to check whether tsp->asoc matches the first element of ep->asocs, as the list is not guaranteed to have an entry.
Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file") Signed-off-by: Pietro Borrello borrello@diag.uniroma1.it Acked-by: Xin Long lucien.xin@gmail.com Link: https://lore.kernel.org/r/20230208-sctp-filter-v2-1-6e1f4017f326@diag.unirom... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/diag.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/net/sctp/diag.c +++ b/net/sctp/diag.c @@ -343,11 +343,9 @@ static int sctp_sock_filter(struct sctp_ struct sctp_comm_param *commp = p; struct sock *sk = ep->base.sk; const struct inet_diag_req_v2 *r = commp->r; - struct sctp_association *assoc = - list_entry(ep->asocs.next, struct sctp_association, asocs);
/* find the ep only once through the transports by this condition */ - if (tsp->asoc != assoc) + if (!list_is_first(&tsp->asoc->asocs, &ep->asocs)) return 0;
if (r->sdiag_family != AF_UNSPEC && sk->sk_family != r->sdiag_family)
From: Pedro Tammela pctammela@mojatatu.com
commit ee059170b1f7e94e55fa6cadee544e176a6e59c2 upstream.
The imperfect hash area can be updated while packets are traversing, which will cause a use-after-free when 'tcf_exts_exec()' is called with the destroyed tcf_ext.
CPU 0: CPU 1: tcindex_set_parms tcindex_classify tcindex_lookup tcindex_lookup tcf_exts_change tcf_exts_exec [UAF]
Stop operating on the shared area directly, by using a local copy, and update the filter with 'rcu_replace_pointer()'. Delete the old filter version only after a rcu grace period elapsed.
Fixes: 9b0d4446b569 ("net: sched: avoid atomic swap in tcf_exts_change") Reported-by: valis sec@valis.email Suggested-by: valis sec@valis.email Signed-off-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Link: https://lore.kernel.org/r/20230209143739.279867-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/cls_tcindex.c | 34 ++++++++++++++++++++++++++++++---- 1 file changed, 30 insertions(+), 4 deletions(-)
--- a/net/sched/cls_tcindex.c +++ b/net/sched/cls_tcindex.c @@ -12,6 +12,7 @@ #include <linux/errno.h> #include <linux/slab.h> #include <linux/refcount.h> +#include <linux/rcupdate.h> #include <net/act_api.h> #include <net/netlink.h> #include <net/pkt_cls.h> @@ -338,6 +339,7 @@ tcindex_set_parms(struct net *net, struc struct tcf_result cr = {}; int err, balloc = 0; struct tcf_exts e; + bool update_h = false;
err = tcf_exts_init(&e, net, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE); if (err < 0) @@ -455,10 +457,13 @@ tcindex_set_parms(struct net *net, struc } }
- if (cp->perfect) + if (cp->perfect) { r = cp->perfect + handle; - else - r = tcindex_lookup(cp, handle) ? : &new_filter_result; + } else { + /* imperfect area is updated in-place using rcu */ + update_h = !!tcindex_lookup(cp, handle); + r = &new_filter_result; + }
if (r == &new_filter_result) { f = kzalloc(sizeof(*f), GFP_KERNEL); @@ -484,7 +489,28 @@ tcindex_set_parms(struct net *net, struc
rcu_assign_pointer(tp->root, cp);
- if (r == &new_filter_result) { + if (update_h) { + struct tcindex_filter __rcu **fp; + struct tcindex_filter *cf; + + f->result.res = r->res; + tcf_exts_change(&f->result.exts, &r->exts); + + /* imperfect area bucket */ + fp = cp->h + (handle % cp->hash); + + /* lookup the filter, guaranteed to exist */ + for (cf = rcu_dereference_bh_rtnl(*fp); cf; + fp = &cf->next, cf = rcu_dereference_bh_rtnl(*fp)) + if (cf->key == handle) + break; + + f->next = cf->next; + + cf = rcu_replace_pointer(*fp, f, 1); + tcf_exts_get_net(&cf->result.exts); + tcf_queue_work(&cf->rwork, tcindex_destroy_fexts_work); + } else if (r == &new_filter_result) { struct tcindex_filter *nfp; struct tcindex_filter __rcu **fp;
From: Kuniyuki Iwashima kuniyu@amazon.com
commit ca43ccf41224b023fc290073d5603a755fd12eed upstream.
Eric Dumazet pointed out [0] that when we call skb_set_owner_r() for ipv6_pinfo.pktoptions, sk_rmem_schedule() has not been called, resulting in a negative sk_forward_alloc.
We add a new helper which clones a skb and sets its owner only when sk_rmem_schedule() succeeds.
Note that we move skb_set_owner_r() forward in (dccp|tcp)_v6_do_rcv() because tcp_send_synack() can make sk_forward_alloc negative before ipv6_opt_accepted() in the crossed SYN-ACK or self-connect() cases.
[0]: https://lore.kernel.org/netdev/CANn89iK9oc20Jdi_41jb9URdF210r7d1Y-+uypbMSbOf...
Fixes: 323fbd0edf3f ("net: dccp: Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv()") Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sock.h | 13 +++++++++++++ net/dccp/ipv6.c | 7 ++----- net/ipv6/tcp_ipv6.c | 10 +++------- 3 files changed, 18 insertions(+), 12 deletions(-)
--- a/include/net/sock.h +++ b/include/net/sock.h @@ -2315,6 +2315,19 @@ static inline __must_check bool skb_set_ return false; }
+static inline struct sk_buff *skb_clone_and_charge_r(struct sk_buff *skb, struct sock *sk) +{ + skb = skb_clone(skb, sk_gfp_mask(sk, GFP_ATOMIC)); + if (skb) { + if (sk_rmem_schedule(sk, skb, skb->truesize)) { + skb_set_owner_r(skb, sk); + return skb; + } + __kfree_skb(skb); + } + return NULL; +} + static inline void skb_prepare_for_gro(struct sk_buff *skb) { if (skb->destructor != sock_wfree) { --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -551,11 +551,9 @@ static struct sock *dccp_v6_request_recv *own_req = inet_ehash_nolisten(newsk, req_to_sk(req_unhash), NULL); /* Clone pktoptions received with SYN, if we own the req */ if (*own_req && ireq->pktopts) { - newnp->pktoptions = skb_clone(ireq->pktopts, GFP_ATOMIC); + newnp->pktoptions = skb_clone_and_charge_r(ireq->pktopts, newsk); consume_skb(ireq->pktopts); ireq->pktopts = NULL; - if (newnp->pktoptions) - skb_set_owner_r(newnp->pktoptions, newsk); }
return newsk; @@ -615,7 +613,7 @@ static int dccp_v6_do_rcv(struct sock *s --ANK (980728) */ if (np->rxopt.all) - opt_skb = skb_clone(skb, GFP_ATOMIC); + opt_skb = skb_clone_and_charge_r(skb, sk);
if (sk->sk_state == DCCP_OPEN) { /* Fast path */ if (dccp_rcv_established(sk, skb, dccp_hdr(skb), skb->len)) @@ -679,7 +677,6 @@ ipv6_pktoptions: np->flow_label = ip6_flowlabel(ipv6_hdr(opt_skb)); if (ipv6_opt_accepted(sk, opt_skb, &DCCP_SKB_CB(opt_skb)->header.h6)) { - skb_set_owner_r(opt_skb, sk); memmove(IP6CB(opt_skb), &DCCP_SKB_CB(opt_skb)->header.h6, sizeof(struct inet6_skb_parm)); --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1429,14 +1429,11 @@ static struct sock *tcp_v6_syn_recv_sock
/* Clone pktoptions received with SYN, if we own the req */ if (ireq->pktopts) { - newnp->pktoptions = skb_clone(ireq->pktopts, - sk_gfp_mask(sk, GFP_ATOMIC)); + newnp->pktoptions = skb_clone_and_charge_r(ireq->pktopts, newsk); consume_skb(ireq->pktopts); ireq->pktopts = NULL; - if (newnp->pktoptions) { + if (newnp->pktoptions) tcp_v6_restore_cb(newnp->pktoptions); - skb_set_owner_r(newnp->pktoptions, newsk); - } } } else { if (!req_unhash && found_dup_sk) { @@ -1506,7 +1503,7 @@ static int tcp_v6_do_rcv(struct sock *sk --ANK (980728) */ if (np->rxopt.all) - opt_skb = skb_clone(skb, sk_gfp_mask(sk, GFP_ATOMIC)); + opt_skb = skb_clone_and_charge_r(skb, sk);
if (sk->sk_state == TCP_ESTABLISHED) { /* Fast path */ struct dst_entry *dst; @@ -1590,7 +1587,6 @@ ipv6_pktoptions: if (np->repflow) np->flow_label = ip6_flowlabel(ipv6_hdr(opt_skb)); if (ipv6_opt_accepted(sk, opt_skb, &TCP_SKB_CB(opt_skb)->header.h6)) { - skb_set_owner_r(opt_skb, sk); tcp_v6_restore_cb(opt_skb); opt_skb = xchg(&np->pktoptions, opt_skb); } else {
From: Miko Larsson mikoxyzzz@gmail.com
commit c68f345b7c425b38656e1791a0486769a8797016 upstream.
syzbot reported that act_len in kalmia_send_init_packet() is uninitialized when passing it to the first usb_bulk_msg error path. Jiri Pirko noted that it's pointless to pass it in the error path, and that the value that would be printed in the second error path would be the value of act_len from the first call to usb_bulk_msg.[1]
With this in mind, let's just not pass act_len to the usb_bulk_msg error paths.
1: https://lore.kernel.org/lkml/Y9pY61y1nwTuzMOa@nanopsycho/
Fixes: d40261236e8e ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730") Reported-and-tested-by: syzbot+cd80c5ef5121bfe85b55@syzkaller.appspotmail.com Signed-off-by: Miko Larsson mikoxyzzz@gmail.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/kalmia.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/net/usb/kalmia.c +++ b/drivers/net/usb/kalmia.c @@ -65,8 +65,8 @@ kalmia_send_init_packet(struct usbnet *d init_msg, init_msg_len, &act_len, KALMIA_USB_TIMEOUT); if (status != 0) { netdev_err(dev->net, - "Error sending init packet. Status %i, length %i\n", - status, act_len); + "Error sending init packet. Status %i\n", + status); return status; } else if (act_len != init_msg_len) { @@ -83,8 +83,8 @@ kalmia_send_init_packet(struct usbnet *d
if (status != 0) netdev_err(dev->net, - "Error receiving init result. Status %i, length %i\n", - status, act_len); + "Error receiving init result. Status %i\n", + status); else if (act_len != expected_len) netdev_err(dev->net, "Unexpected init result length: %i\n", act_len);
From: Hangyu Hua hbh25y@gmail.com
commit 2fa28f5c6fcbfc794340684f36d2581b4f2d20b5 upstream.
old_meter needs to be free after it is detached regardless of whether the new meter is successfully attached.
Fixes: c7c4c44c9a95 ("net: openvswitch: expand the meters supported number") Signed-off-by: Hangyu Hua hbh25y@gmail.com Acked-by: Eelco Chaudron echaudro@redhat.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/openvswitch/meter.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/openvswitch/meter.c +++ b/net/openvswitch/meter.c @@ -450,7 +450,7 @@ static int ovs_meter_cmd_set(struct sk_b
err = attach_meter(meter_tbl, meter); if (err) - goto exit_unlock; + goto exit_free_old_meter;
ovs_unlock();
@@ -473,6 +473,8 @@ static int ovs_meter_cmd_set(struct sk_b genlmsg_end(reply, ovs_reply_header); return genlmsg_reply(reply, info);
+exit_free_old_meter: + ovs_meter_free(old_meter); exit_unlock: ovs_unlock(); nlmsg_free(reply);
From: Johannes Zink j.zink@pengutronix.de
commit 4562c65ec852067c6196abdcf2d925f08841dcbc upstream.
So far changing the period by just setting new period values while running did not work.
The order as indicated by the publicly available reference manual of the i.MX8MP [1] indicates a sequence:
* initiate the programming sequence * set the values for PPS period and start time * start the pulse train generation.
This is currently not used in dwmac5_flex_pps_config(), which instead does:
* initiate the programming sequence and immediately start the pulse train generation * set the values for PPS period and start time
This caused the period values written not to take effect until the FlexPPS output was disabled and re-enabled again.
This patch fix the order and allows the period to be set immediately.
[1] https://www.nxp.com/webapp/Download?colCode=IMX8MPRM
Fixes: 9a8a02c9d46d ("net: stmmac: Add Flexible PPS support") Signed-off-by: Johannes Zink j.zink@pengutronix.de Link: https://lore.kernel.org/r/20230210143937.3427483-1-j.zink@pengutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac5.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac5.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac5.c @@ -541,9 +541,9 @@ int dwmac5_flex_pps_config(void __iomem return 0; }
- val |= PPSCMDx(index, 0x2); val |= TRGTMODSELx(index, 0x2); val |= PPSEN0; + writel(val, ioaddr + MAC_PPS_CONTROL);
writel(cfg->start.tv_sec, ioaddr + MAC_PPSx_TARGET_TIME_SEC(index));
@@ -568,6 +568,7 @@ int dwmac5_flex_pps_config(void __iomem writel(period - 1, ioaddr + MAC_PPSx_WIDTH(index));
/* Finally, activate it */ + val |= PPSCMDx(index, 0x2); writel(val, ioaddr + MAC_PPS_CONTROL); return 0; }
From: Michael Chan michael.chan@broadcom.com
commit 2038cc592811209de20c4e094ca08bfb1e6fbc6c upstream.
In bnxt_reserve_rings(), there is logic to check that the number of TX rings reserved is enough to cover all the mqprio TCs, but it fails to account for the TX XDP rings. So the check will always fail if there are mqprio TCs and TX XDP rings. As a result, the driver always fails to initialize after the XDP program is attached and the device will be brought down. A subsequent ifconfig up will also fail because the number of TX rings is set to an inconsistent number. Fix the check to properly account for TX XDP rings. If the check fails, set the number of TX rings back to a consistent number after calling netdev_reset_tc().
Fixes: 674f50a5b026 ("bnxt_en: Implement new method to reserve rings.") Reviewed-by: Hongguang Gao hongguang.gao@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -9023,10 +9023,14 @@ int bnxt_reserve_rings(struct bnxt *bp, netdev_err(bp->dev, "ring reservation/IRQ init failure rc: %d\n", rc); return rc; } - if (tcs && (bp->tx_nr_rings_per_tc * tcs != bp->tx_nr_rings)) { + if (tcs && (bp->tx_nr_rings_per_tc * tcs != + bp->tx_nr_rings - bp->tx_nr_rings_xdp)) { netdev_err(bp->dev, "tx ring reservation failure\n"); netdev_reset_tc(bp->dev); - bp->tx_nr_rings_per_tc = bp->tx_nr_rings; + if (bp->tx_nr_rings_xdp) + bp->tx_nr_rings_per_tc = bp->tx_nr_rings_xdp; + else + bp->tx_nr_rings_per_tc = bp->tx_nr_rings; return -ENOMEM; } return 0;
From: Cristian Ciocaltea cristian.ciocaltea@collabora.com
commit 05d7623a892a9da62da0e714428e38f09e4a64d8 upstream.
When setting 'snps,force_thresh_dma_mode' DT property, the following warning is always emitted, regardless the status of force_sf_dma_mode:
dwmac-starfive 10020000.ethernet: force_sf_dma_mode is ignored if force_thresh_dma_mode is set.
Do not print the rather misleading message when DMA store and forward mode is already disabled.
Fixes: e2a240c7d3bc ("driver:net:stmmac: Disable DMA store and forward mode if platform data force_thresh_dma_mode is set.") Signed-off-by: Cristian Ciocaltea cristian.ciocaltea@collabora.com Link: https://lore.kernel.org/r/20230210202126.877548-1-cristian.ciocaltea@collabo... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -558,7 +558,7 @@ stmmac_probe_config_dt(struct platform_d dma_cfg->mixed_burst = of_property_read_bool(np, "snps,mixed-burst");
plat->force_thresh_dma_mode = of_property_read_bool(np, "snps,force_thresh_dma_mode"); - if (plat->force_thresh_dma_mode) { + if (plat->force_thresh_dma_mode && plat->force_sf_dma_mode) { plat->force_sf_dma_mode = 0; dev_warn(&pdev->dev, "force_sf_dma_mode is ignored if force_thresh_dma_mode is set.\n");
From: Jakub Kicinski kuba@kernel.org
commit fda6c89fe3d9aca073495a664e1d5aea28cd4377 upstream.
lianhui reports that when MPLS fails to register the sysctl table under new location (during device rename) the old pointers won't get overwritten and may be freed again (double free).
Handle this gracefully. The best option would be unregistering the MPLS from the device completely on failure, but unfortunately mpls_ifdown() can fail. So failing fully is also unreliable.
Another option is to register the new table first then only remove old one if the new one succeeds. That requires more code, changes order of notifications and two tables may be visible at the same time.
sysctl point is not used in the rest of the code - set to NULL on failures and skip unregister if already NULL.
Reported-by: lianhui tang bluetlh@gmail.com Fixes: 0fae3bf018d9 ("mpls: handle device renames for per-device sysctls") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mpls/af_mpls.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/mpls/af_mpls.c +++ b/net/mpls/af_mpls.c @@ -1428,6 +1428,7 @@ static int mpls_dev_sysctl_register(stru free: kfree(table); out: + mdev->sysctl = NULL; return -ENOBUFS; }
@@ -1437,6 +1438,9 @@ static void mpls_dev_sysctl_unregister(s struct net *net = dev_net(dev); struct ctl_table *table;
+ if (!mdev->sysctl) + return; + table = mdev->sysctl->ctl_table_arg; unregister_net_sysctl_table(mdev->sysctl); kfree(table);
From: Jason Xing kernelxing@tencent.com
commit 0967bf837784a11c65d66060623a74e65211af0b upstream.
Include the second VLAN HLEN into account when computing the maximum MTU size as other drivers do.
Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP") Signed-off-by: Jason Xing kernelxing@tencent.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 ++ drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 +-- 2 files changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h @@ -67,6 +67,8 @@ #define IXGBE_RXBUFFER_4K 4096 #define IXGBE_MAX_RXBUFFER 16384 /* largest size for a single descriptor */
+#define IXGBE_PKT_HDR_PAD (ETH_HLEN + ETH_FCS_LEN + (VLAN_HLEN * 2)) + /* Attempt to maximize the headroom available for incoming frames. We * use a 2K buffer for receives and need 1536/1534 to store the data for * the frame. This leaves us with 512 bytes of room. From that we need --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -6753,8 +6753,7 @@ static int ixgbe_change_mtu(struct net_d struct ixgbe_adapter *adapter = netdev_priv(netdev);
if (ixgbe_enabled_xdp_adapter(adapter)) { - int new_frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN + - VLAN_HLEN; + int new_frame_size = new_mtu + IXGBE_PKT_HDR_PAD;
if (new_frame_size > ixgbe_max_xdp_frame_size(adapter)) { e_warn(probe, "Requested MTU size is not supported with XDP\n");
From: Guillaume Nault gnault@redhat.com
commit e010ae08c71fda8be3d6bda256837795a0b3ea41 upstream.
Take into account the IPV6_TCLASS socket option (DSCP) in ip6_datagram_flow_key_init(). Otherwise fib6_rule_match() can't properly match the DSCP value, resulting in invalid route lookup.
For example:
ip route add unreachable table main 2001:db8::10/124
ip route add table 100 2001:db8::10/124 dev eth0 ip -6 rule add dsfield 0x04 table 100
echo test | socat - UDP6:[2001:db8::11]:54321,ipv6-tclass=0x04
Without this patch, socat fails at connect() time ("No route to host") because the fib-rule doesn't jump to table 100 and the lookup ends up being done in the main table.
Fixes: 2cc67cc731d9 ("[IPV6] ROUTE: Routing by Traffic Class.") Signed-off-by: Guillaume Nault gnault@redhat.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/datagram.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -51,7 +51,7 @@ static void ip6_datagram_flow_key_init(s fl6->flowi6_mark = sk->sk_mark; fl6->fl6_dport = inet->inet_dport; fl6->fl6_sport = inet->inet_sport; - fl6->flowlabel = np->flow_label; + fl6->flowlabel = ip6_make_flowinfo(np->tclass, np->flow_label); fl6->flowi6_uid = sk->sk_uid;
if (!fl6->flowi6_oif)
From: Guillaume Nault gnault@redhat.com
commit 8230680f36fd1525303d1117768c8852314c488c upstream.
Take into account the IPV6_TCLASS socket option (DSCP) in tcp_v6_connect(). Otherwise fib6_rule_match() can't properly match the DSCP value, resulting in invalid route lookup.
For example:
ip route add unreachable table main 2001:db8::10/124
ip route add table 100 2001:db8::10/124 dev eth0 ip -6 rule add dsfield 0x04 table 100
echo test | socat - TCP6:[2001:db8::11]:54321,ipv6-tclass=0x04
Without this patch, socat fails at connect() time ("No route to host") because the fib-rule doesn't jump to table 100 and the lookup ends up being done in the main table.
Fixes: 2cc67cc731d9 ("[IPV6] ROUTE: Routing by Traffic Class.") Signed-off-by: Guillaume Nault gnault@redhat.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/tcp_ipv6.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -269,6 +269,7 @@ static int tcp_v6_connect(struct sock *s fl6.flowi6_proto = IPPROTO_TCP; fl6.daddr = sk->sk_v6_daddr; fl6.saddr = saddr ? *saddr : np->saddr; + fl6.flowlabel = ip6_make_flowinfo(np->tclass, np->flow_label); fl6.flowi6_oif = sk->sk_bound_dev_if; fl6.flowi6_mark = sk->sk_mark; fl6.fl6_dport = usin->sin6_port;
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit 99b9402a36f0799f25feee4465bfa4b8dfa74b4d upstream.
Macro NILFS_SB2_OFFSET_BYTES, which computes the position of the second superblock, underflows when the argument device size is less than 4096 bytes. Therefore, when using this macro, it is necessary to check in advance that the device size is not less than a lower limit, or at least that underflow does not occur.
The current nilfs2 implementation lacks this check, causing out-of-bound block access when mounting devices smaller than 4096 bytes:
I/O error, dev loop0, sector 36028797018963960 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2 NILFS (loop0): unable to read secondary superblock (blocksize = 1024)
In addition, when trying to resize the filesystem to a size below 4096 bytes, this underflow occurs in nilfs_resize_fs(), passing a huge number of segments to nilfs_sufile_resize(), corrupting parameters such as the number of segments in superblocks. This causes excessive loop iterations in nilfs_sufile_resize() during a subsequent resize ioctl, causing semaphore ns_segctor_sem to block for a long time and hang the writer thread:
INFO: task segctord:5067 blocked for more than 143 seconds. Not tainted 6.2.0-rc8-syzkaller-00015-gf6feea56f66d #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:segctord state:D stack:23456 pid:5067 ppid:2 flags:0x00004000 Call Trace: <TASK> context_switch kernel/sched/core.c:5293 [inline] __schedule+0x1409/0x43f0 kernel/sched/core.c:6606 schedule+0xc3/0x190 kernel/sched/core.c:6682 rwsem_down_write_slowpath+0xfcf/0x14a0 kernel/locking/rwsem.c:1190 nilfs_transaction_lock+0x25c/0x4f0 fs/nilfs2/segment.c:357 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2486 [inline] nilfs_segctor_thread+0x52f/0x1140 fs/nilfs2/segment.c:2570 kthread+0x270/0x300 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 </TASK> ... Call Trace: <TASK> folio_mark_accessed+0x51c/0xf00 mm/swap.c:515 __nilfs_get_page_block fs/nilfs2/page.c:42 [inline] nilfs_grab_buffer+0x3d3/0x540 fs/nilfs2/page.c:61 nilfs_mdt_submit_block+0xd7/0x8f0 fs/nilfs2/mdt.c:121 nilfs_mdt_read_block+0xeb/0x430 fs/nilfs2/mdt.c:176 nilfs_mdt_get_block+0x12d/0xbb0 fs/nilfs2/mdt.c:251 nilfs_sufile_get_segment_usage_block fs/nilfs2/sufile.c:92 [inline] nilfs_sufile_truncate_range fs/nilfs2/sufile.c:679 [inline] nilfs_sufile_resize+0x7a3/0x12b0 fs/nilfs2/sufile.c:777 nilfs_resize_fs+0x20c/0xed0 fs/nilfs2/super.c:422 nilfs_ioctl_resize fs/nilfs2/ioctl.c:1033 [inline] nilfs_ioctl+0x137c/0x2440 fs/nilfs2/ioctl.c:1301 ...
This fixes these issues by inserting appropriate minimum device size checks or anti-underflow checks, depending on where the macro is used.
Link: https://lkml.kernel.org/r/0000000000004e1dfa05f4a48e6b@google.com Link: https://lkml.kernel.org/r/20230214224043.24141-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+f0c4082ce5ebebdac63b@syzkaller.appspotmail.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nilfs2/ioctl.c | 7 +++++++ fs/nilfs2/super.c | 9 +++++++++ fs/nilfs2/the_nilfs.c | 8 +++++++- 3 files changed, 23 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/ioctl.c +++ b/fs/nilfs2/ioctl.c @@ -1114,7 +1114,14 @@ static int nilfs_ioctl_set_alloc_range(s
minseg = range[0] + segbytes - 1; do_div(minseg, segbytes); + + if (range[1] < 4096) + goto out; + maxseg = NILFS_SB2_OFFSET_BYTES(range[1]); + if (maxseg < segbytes) + goto out; + do_div(maxseg, segbytes); maxseg--;
--- a/fs/nilfs2/super.c +++ b/fs/nilfs2/super.c @@ -409,6 +409,15 @@ int nilfs_resize_fs(struct super_block * goto out;
/* + * Prevent underflow in second superblock position calculation. + * The exact minimum size check is done in nilfs_sufile_resize(). + */ + if (newsize < 4096) { + ret = -ENOSPC; + goto out; + } + + /* * Write lock is required to protect some functions depending * on the number of segments, the number of reserved segments, * and so forth. --- a/fs/nilfs2/the_nilfs.c +++ b/fs/nilfs2/the_nilfs.c @@ -544,9 +544,15 @@ static int nilfs_load_super_block(struct { struct nilfs_super_block **sbp = nilfs->ns_sbp; struct buffer_head **sbh = nilfs->ns_sbh; - u64 sb2off = NILFS_SB2_OFFSET_BYTES(nilfs->ns_bdev->bd_inode->i_size); + u64 sb2off, devsize = nilfs->ns_bdev->bd_inode->i_size; int valid[2], swp = 0;
+ if (devsize < NILFS_SEG_MIN_BLOCKS * NILFS_MIN_BLOCK_SIZE + 4096) { + nilfs_err(sb, "device size too small"); + return -EINVAL; + } + sb2off = NILFS_SB2_OFFSET_BYTES(devsize); + sbp[0] = nilfs_read_super_block(sb, NILFS_SB_OFFSET_BYTES, blocksize, &sbh[0]); sbp[1] = nilfs_read_super_block(sb, sb2off, blocksize, &sbh[1]);
From: Qian Yingjin qian@ddn.com
commit 5956592ce337330cdff0399a6f8b6a5aea397a8e upstream.
I was running traces of the read code against an RAID storage system to understand why read requests were being misaligned against the underlying RAID strips. I found that the page end offset calculation in filemap_get_read_batch() was off by one.
When a read is submitted with end offset 1048575, then it calculates the end page for read of 256 when it should be 255. "last_index" is the index of the page beyond the end of the read and it should be skipped when get a batch of pages for read in @filemap_get_read_batch().
The below simple patch fixes the problem. This code was introduced in kernel 5.12.
Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read") Signed-off-by: Qian Yingjin qian@ddn.com Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/filemap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/filemap.c +++ b/mm/filemap.c @@ -2538,18 +2538,19 @@ static int filemap_get_pages(struct kioc struct page *page; int err = 0;
+ /* "last_index" is the index of the page beyond the end of the read */ last_index = DIV_ROUND_UP(iocb->ki_pos + iter->count, PAGE_SIZE); retry: if (fatal_signal_pending(current)) return -EINTR;
- filemap_get_read_batch(mapping, index, last_index, pvec); + filemap_get_read_batch(mapping, index, last_index - 1, pvec); if (!pagevec_count(pvec)) { if (iocb->ki_flags & IOCB_NOIO) return -EAGAIN; page_cache_sync_readahead(mapping, ra, filp, index, last_index - index); - filemap_get_read_batch(mapping, index, last_index, pvec); + filemap_get_read_batch(mapping, index, last_index - 1, pvec); } if (!pagevec_count(pvec)) { if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
From: Raviteja Goud Talla ravitejax.goud.talla@intel.com
[ Upstream commit 67b858dd89932086ae0ee2d0ce4dd070a2c88bb3 ]
Bspec page says "Reset: BUS", Accordingly moving w/a's: Wa_1407352427,Wa_1406680159 to proper function icl_gt_workarounds_init() Which will resolve guc enabling error
v2: - Previous patch rev2 was created by email client which caused the Build failure, This v2 is to resolve the previous broken series
Reviewed-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Raviteja Goud Talla ravitejax.goud.talla@intel.com Signed-off-by: John Harrison John.C.Harrison@Intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20211203145603.4006937-1-ravit... Stable-dep-of: d5a1224aa68c ("drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 6b5ab19a2ada9..0dda8f6da4230 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1049,6 +1049,15 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) GAMT_CHKN_BIT_REG, GAMT_CHKN_DISABLE_L3_COH_PIPE);
+ /* Wa_1407352427:icl,ehl */ + wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE2, + PSDUNIT_CLKGATE_DIS); + + /* Wa_1406680159:icl,ehl */ + wa_write_or(wal, + SUBSLICE_UNIT_LEVEL_CLKGATE, + GWUNIT_CLKGATE_DIS); + /* Wa_1607087056:icl,ehl,jsl */ if (IS_ICELAKE(i915) || IS_JSL_EHL_GT_STEP(i915, STEP_A0, STEP_B0)) @@ -1745,15 +1754,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE, VSUNIT_CLKGATE_DIS | HSUNIT_CLKGATE_DIS);
- /* Wa_1407352427:icl,ehl */ - wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE2, - PSDUNIT_CLKGATE_DIS); - - /* Wa_1406680159:icl,ehl */ - wa_write_or(wal, - SUBSLICE_UNIT_LEVEL_CLKGATE, - GWUNIT_CLKGATE_DIS); - /* * Wa_1408767742:icl[a2..forever],ehl[all] * Wa_1605460711:icl[a0..c0]
From: Matt Roper matthew.d.roper@intel.com
[ Upstream commit d5a1224aa68c8b124a4c5c390186e571815ed390 ]
The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround has 'BUS' style reset, indicating that it does not lose its value on engine resets. Furthermore, this register is part of the GT forcewake domain rather than the RENDER domain, so it should not be impacted by RCS engine resets. As such, we should implement this on the GT workaround list rather than an engine list.
Bspec: 19219 Fixes: 3551ff928744 ("drm/i915/gen11: Moving WAs to rcs_engine_wa_init()") Signed-off-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Gustavo Sousa gustavo.sousa@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-2-matthe... (cherry picked from commit 5f21dc07b52eb54a908e66f5d6e05a87bcb5b049) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 0dda8f6da4230..de93a1e988f29 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1049,6 +1049,13 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) GAMT_CHKN_BIT_REG, GAMT_CHKN_DISABLE_L3_COH_PIPE);
+ /* + * Wa_1408615072:icl,ehl (vsunit) + * Wa_1407596294:icl,ehl (hsunit) + */ + wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE, + VSUNIT_CLKGATE_DIS | HSUNIT_CLKGATE_DIS); + /* Wa_1407352427:icl,ehl */ wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE2, PSDUNIT_CLKGATE_DIS); @@ -1747,13 +1754,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) wa_masked_en(wal, GEN9_CSFE_CHICKEN1_RCS, GEN11_ENABLE_32_PLANE_MODE);
- /* - * Wa_1408615072:icl,ehl (vsunit) - * Wa_1407596294:icl,ehl (hsunit) - */ - wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE, - VSUNIT_CLKGATE_DIS | HSUNIT_CLKGATE_DIS); - /* * Wa_1408767742:icl[a2..forever],ehl[all] * Wa_1605460711:icl[a0..c0]
From: Baowen Zheng baowen.zheng@corigine.com
[ Upstream commit 40bd094d65fc9f83941b024cde7c24516f036879 ]
Fill flags to action structure to allow user control if the action should be offloaded to hardware or not.
Signed-off-by: Baowen Zheng baowen.zheng@corigine.com Signed-off-by: Louis Peens louis.peens@corigine.com Signed-off-by: Simon Horman simon.horman@corigine.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 21c167aa0ba9 ("net/sched: act_ctinfo: use percpu stats") Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/act_bpf.c | 2 +- net/sched/act_connmark.c | 2 +- net/sched/act_ctinfo.c | 2 +- net/sched/act_gate.c | 2 +- net/sched/act_ife.c | 2 +- net/sched/act_ipt.c | 2 +- net/sched/act_mpls.c | 2 +- net/sched/act_nat.c | 2 +- net/sched/act_pedit.c | 2 +- net/sched/act_police.c | 2 +- net/sched/act_sample.c | 2 +- net/sched/act_simple.c | 2 +- net/sched/act_skbedit.c | 2 +- net/sched/act_skbmod.c | 2 +- 14 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c index 5c36013339e11..2a05bad56ef3e 100644 --- a/net/sched/act_bpf.c +++ b/net/sched/act_bpf.c @@ -305,7 +305,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla, ret = tcf_idr_check_alloc(tn, &index, act, bind); if (!ret) { ret = tcf_idr_create(tn, index, est, act, - &act_bpf_ops, bind, true, 0); + &act_bpf_ops, bind, true, flags); if (ret < 0) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c index 032ef927d0ebb..0deb4e96a6c2e 100644 --- a/net/sched/act_connmark.c +++ b/net/sched/act_connmark.c @@ -124,7 +124,7 @@ static int tcf_connmark_init(struct net *net, struct nlattr *nla, ret = tcf_idr_check_alloc(tn, &index, a, bind); if (!ret) { ret = tcf_idr_create(tn, index, est, a, - &act_connmark_ops, bind, false, 0); + &act_connmark_ops, bind, false, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_ctinfo.c b/net/sched/act_ctinfo.c index 2d75fe1223ac4..65a20f3c9514e 100644 --- a/net/sched/act_ctinfo.c +++ b/net/sched/act_ctinfo.c @@ -212,7 +212,7 @@ static int tcf_ctinfo_init(struct net *net, struct nlattr *nla, err = tcf_idr_check_alloc(tn, &index, a, bind); if (!err) { ret = tcf_idr_create(tn, index, est, a, - &act_ctinfo_ops, bind, false, 0); + &act_ctinfo_ops, bind, false, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c index 7df72a4197a3f..ac985c53ebafe 100644 --- a/net/sched/act_gate.c +++ b/net/sched/act_gate.c @@ -357,7 +357,7 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
if (!err) { ret = tcf_idr_create(tn, index, est, a, - &act_gate_ops, bind, false, 0); + &act_gate_ops, bind, false, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c index 7064a365a1a98..ec987ec758070 100644 --- a/net/sched/act_ife.c +++ b/net/sched/act_ife.c @@ -553,7 +553,7 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
if (!exists) { ret = tcf_idr_create(tn, index, est, a, &act_ife_ops, - bind, true, 0); + bind, true, flags); if (ret) { tcf_idr_cleanup(tn, index); kfree(p); diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index 265b1443e252f..2f3d507c24a1f 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -145,7 +145,7 @@ static int __tcf_ipt_init(struct net *net, unsigned int id, struct nlattr *nla,
if (!exists) { ret = tcf_idr_create(tn, index, est, a, ops, bind, - false, 0); + false, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_mpls.c b/net/sched/act_mpls.c index db0ef0486309b..980ad795727e9 100644 --- a/net/sched/act_mpls.c +++ b/net/sched/act_mpls.c @@ -254,7 +254,7 @@ static int tcf_mpls_init(struct net *net, struct nlattr *nla,
if (!exists) { ret = tcf_idr_create(tn, index, est, a, - &act_mpls_ops, bind, true, 0); + &act_mpls_ops, bind, true, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c index 7dd6b586ba7f6..2a39b3729e844 100644 --- a/net/sched/act_nat.c +++ b/net/sched/act_nat.c @@ -61,7 +61,7 @@ static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est, err = tcf_idr_check_alloc(tn, &index, a, bind); if (!err) { ret = tcf_idr_create(tn, index, est, a, - &act_nat_ops, bind, false, 0); + &act_nat_ops, bind, false, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c index 1262a84b725fc..4f72e6e7dbda5 100644 --- a/net/sched/act_pedit.c +++ b/net/sched/act_pedit.c @@ -189,7 +189,7 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla, err = tcf_idr_check_alloc(tn, &index, a, bind); if (!err) { ret = tcf_idr_create(tn, index, est, a, - &act_pedit_ops, bind, false, 0); + &act_pedit_ops, bind, false, flags); if (ret) { tcf_idr_cleanup(tn, index); goto out_free; diff --git a/net/sched/act_police.c b/net/sched/act_police.c index 5c0a3ea9fe120..d44b933b821d7 100644 --- a/net/sched/act_police.c +++ b/net/sched/act_police.c @@ -90,7 +90,7 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
if (!exists) { ret = tcf_idr_create(tn, index, NULL, a, - &act_police_ops, bind, true, 0); + &act_police_ops, bind, true, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c index 230501eb9e069..ab4ae24ab886f 100644 --- a/net/sched/act_sample.c +++ b/net/sched/act_sample.c @@ -70,7 +70,7 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla,
if (!exists) { ret = tcf_idr_create(tn, index, est, a, - &act_sample_ops, bind, true, 0); + &act_sample_ops, bind, true, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c index cbbe1861d3a20..7885271540259 100644 --- a/net/sched/act_simple.c +++ b/net/sched/act_simple.c @@ -128,7 +128,7 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
if (!exists) { ret = tcf_idr_create(tn, index, est, a, - &act_simp_ops, bind, false, 0); + &act_simp_ops, bind, false, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c index 6054185383474..6088ceaf582e8 100644 --- a/net/sched/act_skbedit.c +++ b/net/sched/act_skbedit.c @@ -176,7 +176,7 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
if (!exists) { ret = tcf_idr_create(tn, index, est, a, - &act_skbedit_ops, bind, true, 0); + &act_skbedit_ops, bind, true, act_flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c index ecb9ee6660954..ee9cc0abf9e10 100644 --- a/net/sched/act_skbmod.c +++ b/net/sched/act_skbmod.c @@ -168,7 +168,7 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
if (!exists) { ret = tcf_idr_create(tn, index, est, a, - &act_skbmod_ops, bind, true, 0); + &act_skbmod_ops, bind, true, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret;
From: Pedro Tammela pctammela@mojatatu.com
[ Upstream commit 21c167aa0ba943a7cac2f6969814f83bb701666b ]
The tc action act_ctinfo was using shared stats, fix it to use percpu stats since bstats_update() must be called with locks or with a percpu pointer argument.
tdc results: 1..12 ok 1 c826 - Add ctinfo action with default setting ok 2 0286 - Add ctinfo action with dscp ok 3 4938 - Add ctinfo action with valid cpmark and zone ok 4 7593 - Add ctinfo action with drop control ok 5 2961 - Replace ctinfo action zone and action control ok 6 e567 - Delete ctinfo action with valid index ok 7 6a91 - Delete ctinfo action with invalid index ok 8 5232 - List ctinfo actions ok 9 7702 - Flush ctinfo actions ok 10 3201 - Add ctinfo action with duplicate index ok 11 8295 - Add ctinfo action with invalid index ok 12 3964 - Replace ctinfo action with invalid goto_chain control
Fixes: 24ec483cec98 ("net: sched: Introduce act_ctinfo action") Reviewed-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Larysa Zaremba larysa.zaremba@intel.com Link: https://lore.kernel.org/r/20230210200824.444856-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/act_ctinfo.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/sched/act_ctinfo.c b/net/sched/act_ctinfo.c index 65a20f3c9514e..56e0a5eb64942 100644 --- a/net/sched/act_ctinfo.c +++ b/net/sched/act_ctinfo.c @@ -92,7 +92,7 @@ static int tcf_ctinfo_act(struct sk_buff *skb, const struct tc_action *a, cp = rcu_dereference_bh(ca->params);
tcf_lastuse_update(&ca->tcf_tm); - bstats_update(&ca->tcf_bstats, skb); + tcf_action_update_bstats(&ca->common, skb); action = READ_ONCE(ca->tcf_action);
wlen = skb_network_offset(skb); @@ -211,8 +211,8 @@ static int tcf_ctinfo_init(struct net *net, struct nlattr *nla, index = actparm->index; err = tcf_idr_check_alloc(tn, &index, a, bind); if (!err) { - ret = tcf_idr_create(tn, index, est, a, - &act_ctinfo_ops, bind, false, flags); + ret = tcf_idr_create_from_flags(tn, index, est, a, + &act_ctinfo_ops, bind, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret;
From: Natalia Petrova n.petrova@fintech.ru
[ Upstream commit 7fa0b526f865cb42aa33917fd02a92cb03746f4d ]
The result of nlmsg_find_attr() 'br_spec' is dereferenced in nla_for_each_nested(), but it can take NULL value in nla_find() function, which will result in an error.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 51616018dd1b ("i40e: Add support for getlink, setlink ndo ops") Signed-off-by: Natalia Petrova n.petrova@fintech.ru Reviewed-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Gurucharan G gurucharanx.g@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20230209172833.3596034-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 4a32f84dc0571..5ffcd3cc989f7 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -13005,6 +13005,8 @@ static int i40e_ndo_bridge_setlink(struct net_device *dev, }
br_spec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC); + if (!br_spec) + return -EINVAL;
nla_for_each_nested(attr, br_spec, rem) { __u16 mode;
From: Pedro Tammela pctammela@mojatatu.com
[ Upstream commit 42018a322bd453e38b3ffee294982243e50a484f ]
Syzkaller found an issue where a handle greater than 16 bits would trigger a null-ptr-deref in the imperfect hash area update.
general protection fault, probably for non-canonical address 0xdffffc0000000015: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af] CPU: 0 PID: 5070 Comm: syz-executor456 Not tainted 6.2.0-rc7-syzkaller-00112-gc68f345b7c42 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 RIP: 0010:tcindex_set_parms+0x1a6a/0x2990 net/sched/cls_tcindex.c:509 Code: 01 e9 e9 fe ff ff 4c 8b bd 28 fe ff ff e8 0e 57 7d f9 48 8d bb a8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 94 0c 00 00 48 8b 85 f8 fd ff ff 48 8b 9b a8 00 RSP: 0018:ffffc90003d3ef88 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000015 RSI: ffffffff8803a102 RDI: 00000000000000a8 RBP: ffffc90003d3f1d8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88801e2b10a8 R13: dffffc0000000000 R14: 0000000000030000 R15: ffff888017b3be00 FS: 00005555569af300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000056041c6d2000 CR3: 000000002bfca000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcindex_change+0x1ea/0x320 net/sched/cls_tcindex.c:572 tc_new_tfilter+0x96e/0x2220 net/sched/cls_api.c:2155 rtnetlink_rcv_msg+0x959/0xca0 net/core/rtnetlink.c:6132 netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2574 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xd3/0x120 net/socket.c:734 ____sys_sendmsg+0x334/0x8c0 net/socket.c:2476 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2530 __sys_sendmmsg+0x18f/0x460 net/socket.c:2616 __do_sys_sendmmsg net/socket.c:2645 [inline] __se_sys_sendmmsg net/socket.c:2642 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2642 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
Fixes: ee059170b1f7 ("net/sched: tcindex: update imperfect hash filters respecting rcu") Signed-off-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Reported-by: syzbot syzkaller@googlegroups.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_tcindex.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c index ac3deffc24bfb..54c5ff207fb1b 100644 --- a/net/sched/cls_tcindex.c +++ b/net/sched/cls_tcindex.c @@ -502,7 +502,7 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base, /* lookup the filter, guaranteed to exist */ for (cf = rcu_dereference_bh_rtnl(*fp); cf; fp = &cf->next, cf = rcu_dereference_bh_rtnl(*fp)) - if (cf->key == handle) + if (cf->key == (u16)handle) break;
f->next = cf->next;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 2c10b61421a28e95a46ab489fd56c0f442ff6952 upstream.
When calling the KVM_GET_DEBUGREGS ioctl, on some configurations, there might be some unitialized portions of the kvm_debugregs structure that could be copied to userspace. Prevent this as is done in the other kvm ioctls, by setting the whole structure to 0 before copying anything into it.
Bonus is that this reduces the lines of code as the explicit flag setting and reserved space zeroing out can be removed.
Cc: Sean Christopherson seanjc@google.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: x86@kernel.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: stable stable@kernel.org Reported-by: Xingyuan Mo hdthky0@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Message-Id: 20230214103304.3689213-1-gregkh@linuxfoundation.org Tested-by: Xingyuan Mo hdthky0@gmail.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/x86.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4821,12 +4821,11 @@ static void kvm_vcpu_ioctl_x86_get_debug { unsigned long val;
+ memset(dbgregs, 0, sizeof(*dbgregs)); memcpy(dbgregs->db, vcpu->arch.db, sizeof(vcpu->arch.db)); kvm_get_dr(vcpu, 6, &val); dbgregs->dr6 = val; dbgregs->dr7 = vcpu->arch.dr7; - dbgregs->flags = 0; - memset(&dbgregs->reserved, 0, sizeof(dbgregs->reserved)); }
static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
From: Thomas Gleixner tglx@linutronix.de
commit d125d1349abeb46945dc5e98f7824bf688266f13 upstream.
syzbot reported a RCU stall which is caused by setting up an alarmtimer with a very small interval and ignoring the signal. The reproducer arms the alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a problem per se, but that's an issue when the signal is ignored because then the timer is immediately rearmed because there is no way to delay that rearming to the signal delivery path. See posix_timer_fn() and commit 58229a189942 ("posix-timers: Prevent softirq starvation by small intervals and SIG_IGN") for details.
The reproducer does not set SIG_IGN explicitely, but it sets up the timers signal with SIGCONT. That has the same effect as explicitely setting SIG_IGN for a signal as SIGCONT is ignored if there is no handler set and the task is not ptraced.
The log clearly shows that:
[pid 5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---
It works because the tasks are traced and therefore the signal is queued so the tracer can see it, which delays the restart of the timer to the signal delivery path. But then the tracer is killed:
[pid 5087] kill(-5102, SIGKILL <unfinished ...> ... ./strace-static-x86_64: Process 5107 detached
and after it's gone the stall can be observed:
syzkaller login: [ 79.439102][ C0] hrtimer: interrupt took 68471 ns [ 184.460538][ C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: ... [ 184.658237][ C1] rcu: Stack dump where RCU GP kthread last ran: [ 184.664574][ C1] Sending NMI from CPU 1 to CPUs 0: [ 184.669821][ C0] NMI backtrace for cpu 0 [ 184.669831][ C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0 ... [ 184.670036][ C0] Call Trace: [ 184.670041][ C0] <IRQ> [ 184.670045][ C0] alarmtimer_fired+0x327/0x670
posix_timer_fn() prevents that by checking whether the interval for timers which have the signal ignored is smaller than a jiffie and artifically delay it by shifting the next expiry out by a jiffie. That's accurate vs. the overrun accounting, but slightly inaccurate vs. timer_gettimer(2).
The comment in that function says what needs to be done and there was a fix available for the regular userspace induced SIG_IGN mechanism, but that did not work due to the implicit ignore for SIGCONT and similar signals. This needs to be worked on, but for now the only available workaround is to do exactly what posix_timer_fn() does:
Increase the interval of self-rearming timers, which have their signal ignored, to at least a jiffie.
Interestingly this has been fixed before via commit ff86bf0c65f1 ("alarmtimer: Rate limit periodic intervals") already, but that fix got lost in a later rework.
Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com Fixes: f2c45807d399 ("alarmtimer: Switch over to generic set/get/rearm routine") Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: John Stultz jstultz@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/alarmtimer.c | 33 +++++++++++++++++++++++++++++---- 1 file changed, 29 insertions(+), 4 deletions(-)
--- a/kernel/time/alarmtimer.c +++ b/kernel/time/alarmtimer.c @@ -470,11 +470,35 @@ u64 alarm_forward(struct alarm *alarm, k } EXPORT_SYMBOL_GPL(alarm_forward);
-u64 alarm_forward_now(struct alarm *alarm, ktime_t interval) +static u64 __alarm_forward_now(struct alarm *alarm, ktime_t interval, bool throttle) { struct alarm_base *base = &alarm_bases[alarm->type]; + ktime_t now = base->get_ktime(); + + if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS) && throttle) { + /* + * Same issue as with posix_timer_fn(). Timers which are + * periodic but the signal is ignored can starve the system + * with a very small interval. The real fix which was + * promised in the context of posix_timer_fn() never + * materialized, but someone should really work on it. + * + * To prevent DOS fake @now to be 1 jiffie out which keeps + * the overrun accounting correct but creates an + * inconsistency vs. timer_gettime(2). + */ + ktime_t kj = NSEC_PER_SEC / HZ; + + if (interval < kj) + now = ktime_add(now, kj); + } + + return alarm_forward(alarm, now, interval); +}
- return alarm_forward(alarm, base->get_ktime(), interval); +u64 alarm_forward_now(struct alarm *alarm, ktime_t interval) +{ + return __alarm_forward_now(alarm, interval, false); } EXPORT_SYMBOL_GPL(alarm_forward_now);
@@ -551,9 +575,10 @@ static enum alarmtimer_restart alarm_han if (posix_timer_event(ptr, si_private) && ptr->it_interval) { /* * Handle ignored signals and rearm the timer. This will go - * away once we handle ignored signals proper. + * away once we handle ignored signals proper. Ensure that + * small intervals cannot starve the system. */ - ptr->it_overrun += alarm_forward_now(alarm, ptr->it_interval); + ptr->it_overrun += __alarm_forward_now(alarm, ptr->it_interval, true); ++ptr->it_requeue_pending; ptr->it_active = 1; result = ALARMTIMER_RESTART;
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
commit 1f810d2b6b2fbdc5279644d8b2c140b1f7c9d43d upstream.
The HDaudio stream allocation is done first, and in a second step the LOSIDV parameter is programmed for the multi-link used by a codec.
This leads to a possible stream_tag leak, e.g. if a DisplayAudio link is not used. This would happen when a non-Intel graphics card is used and userspace unconditionally uses the Intel Display Audio PCMs without checking if they are connected to a receiver with jack controls.
We should first check that there is a valid multi-link entry to configure before allocating a stream_tag. This change aligns the dma_assign and dma_cleanup phases.
Complements: b0cd60f3e9f5 ("ALSA/ASoC: hda: clarify bus_get_link() and bus_link_get() helpers") Link: https://github.com/thesofproject/linux/issues/4151 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Rander Wang rander.wang@intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230216162340.19480-1-peter.ujfalusi@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sof/intel/hda-dai.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/sound/soc/sof/intel/hda-dai.c +++ b/sound/soc/sof/intel/hda-dai.c @@ -212,6 +212,10 @@ static int hda_link_hw_params(struct snd int stream_tag; int ret;
+ link = snd_hdac_ext_bus_get_link(bus, codec_dai->component->name); + if (!link) + return -EINVAL; + /* get stored dma data if resuming from system suspend */ link_dev = snd_soc_dai_get_dma_data(dai, substream); if (!link_dev) { @@ -232,10 +236,6 @@ static int hda_link_hw_params(struct snd if (ret < 0) return ret;
- link = snd_hdac_ext_bus_get_link(bus, codec_dai->component->name); - if (!link) - return -EINVAL; - /* set the hdac_stream in the codec dai */ snd_soc_dai_set_stream(codec_dai, hdac_stream(link_dev), substream->stream);
From: Dan Carpenter error27@gmail.com
commit 9cec2aaffe969f2a3e18b5ec105fc20bb908e475 upstream.
The > needs be >= to prevent an out of bounds access.
Fixes: de5ca4c3852f ("net: sched: sch: Bounds check priority") Signed-off-by: Dan Carpenter error27@gmail.com Reviewed-by: Simon Horman simon.horman@corigine.com Reviewed-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/Y+D+KN18FQI2DKLq@kili Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_htb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -429,7 +429,7 @@ static void htb_activate_prios(struct ht while (m) { unsigned int prio = ffz(~m);
- if (WARN_ON_ONCE(prio > ARRAY_SIZE(p->inner.clprio))) + if (WARN_ON_ONCE(prio >= ARRAY_SIZE(p->inner.clprio))) break; m &= ~(1 << prio);
From: Arnd Bergmann arnd@arndb.de
commit abce209d18fd26e865b2406cc68819289db973f9 upstream.
Using the serio subsystem now requires the code to be reachable:
x86_64-linux-ld: drivers/platform/x86/amd/pmc.o: in function `amd_pmc_suspend_handler': pmc.c:(.text+0x86c): undefined reference to `serio_bus'
Add the usual dependency: as other users of serio use 'select' rather than 'depends on', use the same here.
Fixes: 8e60615e8932 ("platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN") Signed-off-by: Arnd Bergmann arnd@arndb.de Link: https://lore.kernel.org/r/20230127093950.2368575-1-arnd@kernel.org Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/platform/x86/Kconfig +++ b/drivers/platform/x86/Kconfig @@ -171,6 +171,7 @@ config ACER_WMI config AMD_PMC tristate "AMD SoC PMC driver" depends on ACPI && PCI + select SERIO help The driver provides support for AMD Power Management Controller primarily responsible for S2Idle transactions that are driven from
On Mon, Feb 20, 2023 at 02:35:33PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.95 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and powerpc (ps3_defconfig, GCC 12.2.0).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
On Mon, 20 Feb 2023 at 19:21, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.15.95 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Feb 2023 13:35:35 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.95-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.15.95-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-5.15.y * git commit: 76543d843499bc53e1360720c61967de1d3e0ef0 * git describe: v5.15.94-84-g76543d843499 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15....
## Test Regressions (compared to v5.15.90-205-g5605d15db022)
## Metric Regressions (compared to v5.15.90-205-g5605d15db022)
## Test Fixes (compared to v5.15.90-205-g5605d15db022)
## Metric Fixes (compared to v5.15.90-205-g5605d15db022)
## Test result summary total: 157706, pass: 133248, fail: 4182, skip: 19944, xfail: 332
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 149 total, 148 passed, 1 failed * arm64: 49 total, 47 passed, 2 failed * i386: 39 total, 35 passed, 4 failed * mips: 31 total, 29 passed, 2 failed * parisc: 8 total, 8 passed, 0 failed * powerpc: 34 total, 32 passed, 2 failed * riscv: 14 total, 14 passed, 0 failed * s390: 16 total, 14 passed, 2 failed * sh: 14 total, 12 passed, 2 failed * sparc: 8 total, 8 passed, 0 failed * x86_64: 42 total, 40 passed, 2 failed
## Test suites summary * boot * fwts * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-c * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-open-posix-tests * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * packetdrill * perf * rcutorture * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
On 2/20/23 5:35 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.95 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Feb 2023 13:35:35 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.95-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
Hi Greg,
On Mon, Feb 20, 2023 at 02:35:33PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.95 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Feb 2023 13:35:35 +0000. Anything received after that time might be too late.
Build test (gcc version 12.2.1 20230210): mips: 62 configs -> no failure arm: 99 configs -> no failure arm64: 3 configs -> no failure x86_64: 4 configs -> no failure alpha allmodconfig -> no failure csky allmodconfig -> no failure powerpc allmodconfig -> no failure riscv allmodconfig -> no failure s390 allmodconfig -> no failure xtensa allmodconfig -> no failure
Boot test: x86_64: Booted on my test laptop. No regression. x86_64: Booted on qemu. No regression. [1] arm64: Booted on rpi4b (4GB model). No regression. [2] mips: Booted on ci20 board. No regression. [3]
[1]. https://openqa.qa.codethink.co.uk/tests/2907 [2]. https://openqa.qa.codethink.co.uk/tests/2910 [3]. https://openqa.qa.codethink.co.uk/tests/2912
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
On Mon, Feb 20, 2023 at 02:35:33PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.95 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Feb 2023 13:35:35 +0000. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 492 pass: 492 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On 2/20/23 05:35, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.95 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Feb 2023 13:35:35 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.95-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On 2/20/23 06:35, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.95 release. There are 83 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Feb 2023 13:35:35 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.95-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org