This is the start of the stable review cycle for the 5.10.179 release. There are 68 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 26 Apr 2023 13:11:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.179-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.10.179-rc1
Ekaterina Orlova vorobushek.ok@gmail.com ASN.1: Fix check for strdup() success
Nikita Zhandarovich n.zhandarovich@fintech.ru ASoC: fsl_asrc_dma: fix potential null-ptr-deref
Dan Carpenter error27@gmail.com iio: adc: at91-sama5d2_adc: fix an error code in at91_adc_allocate_trigger()
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: hibvt: Explicitly set .polarity in .get_state()
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: iqs620a: Explicitly set .polarity in .get_state()
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: meson: Explicitly set .polarity in .get_state()
Kuniyuki Iwashima kuniyu@amazon.com sctp: Call inet6_destroy_sock() via sk->sk_destruct().
Kuniyuki Iwashima kuniyu@amazon.com dccp: Call inet6_destroy_sock() via sk->sk_destruct().
Kuniyuki Iwashima kuniyu@amazon.com inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy().
Kuniyuki Iwashima kuniyu@amazon.com tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
Kuniyuki Iwashima kuniyu@amazon.com udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
Baokun Li libaokun1@huawei.com ext4: fix use-after-free in ext4_xattr_set_entry
Ritesh Harjani riteshh@linux.ibm.com ext4: remove duplicate definition of ext4_xattr_ibody_inline_set()
Tudor Ambarus tudor.ambarus@linaro.org Revert "ext4: fix use-after-free in ext4_xattr_set_entry"
Miklos Szeredi mszeredi@redhat.com fuse: fix deadlock between atomic O_TRUNC and page invalidation
Jiachen Zhang zhangjiachen.jaycee@bytedance.com fuse: always revalidate rename target dentry
Miklos Szeredi mszeredi@redhat.com fuse: fix attr version comparison in fuse_read_update_size()
Miklos Szeredi mszeredi@redhat.com fuse: check s_root when destroying sb
Connor Kuehl ckuehl@redhat.com virtiofs: split requests that exceed virtqueue size
Miklos Szeredi mszeredi@redhat.com virtiofs: clean up error handling in virtio_fs_get_tree()
Alyssa Ross hi@alyssa.is purgatory: fix disabling debug info
Salvatore Bonaccorso carnil@debian.org docs: futex: Fix kernel-doc references after code split-up preparation
Jiaxun Yang jiaxun.yang@flygoat.com MIPS: Define RUNTIME_DISCARD_EXIT in LD script
Qais Yousef qyousef@layalina.io sched/fair: Fixes for capacity inversion detection
Qais Yousef qyousef@layalina.io sched/uclamp: Fix a uninitialized variable warnings
Qais Yousef qais.yousef@arm.com sched/fair: Consider capacity inversion in util_fits_cpu()
Qais Yousef qais.yousef@arm.com sched/fair: Detect capacity inversion
Qais Yousef qais.yousef@arm.com sched/uclamp: Cater for uclamp in find_energy_efficient_cpu()'s early exit condition
Qais Yousef qais.yousef@arm.com sched/uclamp: Make cpu_overutilized() use util_fits_cpu()
Qais Yousef qais.yousef@arm.com sched/uclamp: Make asym_fits_capacity() use util_fits_cpu()
Qais Yousef qais.yousef@arm.com sched/uclamp: Make select_idle_capacity() use util_fits_cpu()
Qais Yousef qais.yousef@arm.com sched/uclamp: Fix fits_capacity() check in feec()
Qais Yousef qais.yousef@arm.com sched/uclamp: Make task_fits_capacity() use util_fits_cpu()
Peter Xu peterx@redhat.com mm/khugepaged: check again on anon uffd-wp during isolation
Bhavya Kapoor b-kapoor@ti.com mmc: sdhci_am654: Set HIGH_SPEED_ENA for SDR12 and SDR25
Ondrej Mosnacek omosnace@redhat.com kernel/sys.c: fix and improve control flow in __sys_setres[ug]id()
Greg Kroah-Hartman gregkh@linuxfoundation.org memstick: fix memory leak if card device is never registered
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: initialize unused bytes in segment summary blocks
Brian Masney bmasney@redhat.com iio: light: tsl2772: fix reading proximity-diodes from device tree
Brian Foster bfoster@redhat.com xfs: drop submit side trans alloc for append ioends
Aneesh Kumar K.V aneesh.kumar@linux.ibm.com powerpc/doc: Fix htmldocs errors
Juergen Gross jgross@suse.com xen/netback: use same error messages for same errors
Sagi Grimberg sagi@grimberg.me nvme-tcp: fix a possible UAF when failing to allocate an io queue
Heiko Carstens hca@linux.ibm.com s390/ptrace: fix PTRACE_GET_LAST_BREAK error handling
Álvaro Fernández Rojas noltari@gmail.com net: dsa: b53: mmap: add phy ops
Damien Le Moal damien.lemoal@opensource.wdc.com scsi: core: Improve scsi_vpd_inquiry() checks
Tomas Henzl thenzl@redhat.com scsi: megaraid_sas: Fix fw_crash_buffer_show()
Nick Desaulniers ndesaulniers@google.com selftests: sigaltstack: fix -Wuninitialized
Jonathan Denose jdenose@chromium.org Input: i8042 - add quirk for Fujitsu Lifebook A574/H
Douglas Raillard douglas.raillard@arm.com f2fs: Fix f2fs_truncate_partial_nodes ftrace event
Sebastian Basierski sebastianx.basierski@intel.com e1000e: Disable TSO on i219-LM card to increase speed
Daniel Borkmann daniel@iogearbox.net bpf: Fix incorrect verifier pruning due to missing register precision taints
Ido Schimmel idosch@nvidia.com mlxsw: pci: Fix possible crash during initialization
Alexander Aring aahringo@redhat.com net: rpl: fix rpl header size calculation
Nikita Zhandarovich n.zhandarovich@fintech.ru mlxfw: fix null-ptr-deref in mlxfw_mfa2_tlv_next()
Aleksandr Loktionov aleksandr.loktionov@intel.com i40e: fix i40e_setup_misc_vector() error handling
Aleksandr Loktionov aleksandr.loktionov@intel.com i40e: fix accessing vsi->active_filters without holding lock
Florian Westphal fw@strlen.de netfilter: nf_tables: fix ifdef to also consider nf_tables=m
Ding Hui dinghui@sangfor.com.cn sfc: Fix use-after-free due to selftest_work
Jonathan Cooper jonathan.s.cooper@amd.com sfc: Split STATE_READY in to STATE_NET_DOWN and STATE_NET_UP.
Xuan Zhuo xuanzhuo@linux.alibaba.com virtio_net: bugfix overflow inside xdp_linearize_page()
Gwangun Jung exsociety@gmail.com net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg
Cristian Ciocaltea cristian.ciocaltea@collabora.com regulator: fan53555: Explicitly include bits header
Florian Westphal fw@strlen.de netfilter: br_netfilter: fix recent physdev match breakage
Peng Fan peng.fan@nxp.com arm64: dts: imx8mm-evk: correct pmic clock source
Marc Gonzalez mgonzalez@freebox.fr arm64: dts: meson-g12-common: specify full DMC range
Dmitry Baryshkov dmitry.baryshkov@linaro.org arm64: dts: qcom: ipq8074-hk01: enable QMP device, not the PHY node
Jianqun Xu jay.xu@rock-chips.com ARM: dts: rockchip: fix a typo error for rk3288 spdif node
-------------
Diffstat:
Documentation/kernel-hacking/locking.rst | 2 +- Documentation/powerpc/associativity.rst | 29 ++-- Documentation/powerpc/index.rst | 1 + .../translations/it_IT/kernel-hacking/locking.rst | 2 +- Makefile | 4 +- arch/arm/boot/dts/rk3288.dtsi | 2 +- arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi | 3 +- arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi | 2 +- arch/arm64/boot/dts/qcom/ipq8074-hk01.dts | 4 +- arch/mips/kernel/vmlinux.lds.S | 2 + arch/s390/kernel/ptrace.c | 8 +- arch/x86/purgatory/Makefile | 3 +- drivers/iio/adc/at91-sama5d2_adc.c | 2 +- drivers/iio/light/tsl2772.c | 1 + drivers/input/serio/i8042-x86ia64io.h | 8 + drivers/memstick/core/memstick.c | 5 +- drivers/mmc/host/sdhci_am654.c | 2 - drivers/net/dsa/b53/b53_mmap.c | 14 ++ drivers/net/ethernet/intel/e1000e/netdev.c | 51 +++--- drivers/net/ethernet/intel/i40e/i40e_main.c | 9 +- .../ethernet/mellanox/mlxfw/mlxfw_mfa2_tlv_multi.c | 2 + drivers/net/ethernet/mellanox/mlxsw/pci_hw.h | 2 +- drivers/net/ethernet/sfc/ef100_netdev.c | 6 +- drivers/net/ethernet/sfc/efx.c | 30 ++-- drivers/net/ethernet/sfc/efx_common.c | 12 +- drivers/net/ethernet/sfc/efx_common.h | 6 +- drivers/net/ethernet/sfc/ethtool_common.c | 2 +- drivers/net/ethernet/sfc/net_driver.h | 50 +++++- drivers/net/virtio_net.c | 8 +- drivers/net/xen-netback/netback.c | 6 +- drivers/nvme/host/tcp.c | 46 +++--- drivers/pwm/pwm-hibvt.c | 1 + drivers/pwm/pwm-iqs620a.c | 1 + drivers/pwm/pwm-meson.c | 7 + drivers/regulator/fan53555.c | 11 +- drivers/scsi/megaraid/megaraid_sas_base.c | 2 +- drivers/scsi/scsi.c | 11 +- fs/ext4/inline.c | 11 +- fs/ext4/xattr.c | 26 +-- fs/ext4/xattr.h | 6 +- fs/fuse/dir.c | 7 +- fs/fuse/file.c | 31 ++-- fs/fuse/fuse_i.h | 3 + fs/fuse/inode.c | 5 +- fs/fuse/virtio_fs.c | 46 ++++-- fs/nilfs2/segment.c | 20 +++ fs/xfs/xfs_aops.c | 45 +---- include/linux/skbuff.h | 5 +- include/net/ipv6.h | 2 + include/net/udp.h | 2 +- include/net/udplite.h | 8 - include/trace/events/f2fs.h | 2 +- kernel/bpf/verifier.c | 15 ++ kernel/sched/core.c | 10 +- kernel/sched/fair.c | 183 ++++++++++++++++----- kernel/sched/sched.h | 70 +++++++- kernel/sys.c | 69 ++++---- mm/khugepaged.c | 4 + net/bridge/br_netfilter_hooks.c | 17 +- net/dccp/dccp.h | 1 + net/dccp/ipv6.c | 15 +- net/dccp/proto.c | 8 +- net/ipv4/udp.c | 9 +- net/ipv4/udplite.c | 8 + net/ipv6/af_inet6.c | 15 +- net/ipv6/ipv6_sockglue.c | 20 +-- net/ipv6/ping.c | 6 - net/ipv6/raw.c | 2 - net/ipv6/rpl.c | 3 +- net/ipv6/tcp_ipv6.c | 8 +- net/ipv6/udp.c | 17 +- net/ipv6/udp_impl.h | 1 + net/ipv6/udplite.c | 9 +- net/l2tp/l2tp_ip6.c | 2 - net/mptcp/protocol.c | 7 - net/sched/sch_qfq.c | 13 +- net/sctp/socket.c | 29 +++- scripts/asn1_compiler.c | 2 +- sound/soc/fsl/fsl_asrc_dma.c | 11 +- .../selftests/sigaltstack/current_stack_pointer.h | 23 +++ tools/testing/selftests/sigaltstack/sas.c | 7 +- 81 files changed, 751 insertions(+), 409 deletions(-)
From: Jianqun Xu jay.xu@rock-chips.com
[ Upstream commit 02c84f91adb9a64b75ec97d772675c02a3e65ed7 ]
Fix the address in the spdif node name.
Fixes: 874e568e500a ("ARM: dts: rockchip: Add SPDIF transceiver for RK3288") Signed-off-by: Jianqun Xu jay.xu@rock-chips.com Reviewed-by: Sjoerd Simons sjoerd@collabora.com Link: https://lore.kernel.org/r/20230208091411.1603142-1-jay.xu@rock-chips.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/rk3288.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi index aab28161b9ae9..250a03a066a17 100644 --- a/arch/arm/boot/dts/rk3288.dtsi +++ b/arch/arm/boot/dts/rk3288.dtsi @@ -959,7 +959,7 @@ status = "disabled"; };
- spdif: sound@ff88b0000 { + spdif: sound@ff8b0000 { compatible = "rockchip,rk3288-spdif", "rockchip,rk3066-spdif"; reg = <0x0 0xff8b0000 0x0 0x10000>; #sound-dai-cells = <0>;
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 72630ba422b70ea0874fc90d526353cf71c72488 ]
Correct PCIe PHY enablement to refer the QMP device nodes rather than PHY device nodes. QMP nodes have 'status = "disabled"' property in the ipq8074.dtsi, while PHY nodes do not correspond to the actual device and do not have the status property.
Fixes: e8a7fdc505bb ("arm64: dts: ipq8074: qcom: Re-arrange dts nodes based on address") Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Bjorn Andersson andersson@kernel.org Link: https://lore.kernel.org/r/20230324021651.1799969-1-dmitry.baryshkov@linaro.o... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/ipq8074-hk01.dts | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/ipq8074-hk01.dts b/arch/arm64/boot/dts/qcom/ipq8074-hk01.dts index cc08dc4eb56a5..68698cdf56c46 100644 --- a/arch/arm64/boot/dts/qcom/ipq8074-hk01.dts +++ b/arch/arm64/boot/dts/qcom/ipq8074-hk01.dts @@ -60,11 +60,11 @@ perst-gpio = <&tlmm 58 0x1>; };
-&pcie_phy0 { +&pcie_qmp0 { status = "okay"; };
-&pcie_phy1 { +&pcie_qmp1 { status = "okay"; };
From: Marc Gonzalez mgonzalez@freebox.fr
[ Upstream commit aec4353114a408b3a831a22ba34942d05943e462 ]
According to S905X2 Datasheet - Revision 07: DRAM Memory Controller (DMC) register area spans ff638000-ff63a000.
According to DeviceTree Specification - Release v0.4-rc1: simple-bus nodes do not require reg property.
Fixes: 1499218c80c99a ("arm64: dts: move common G12A & G12B modes to meson-g12-common.dtsi") Signed-off-by: Marc Gonzalez mgonzalez@freebox.fr Reviewed-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Link: https://lore.kernel.org/r/20230327120932.2158389-2-mgonzalez@freebox.fr Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi index c0defb36592d0..9dd9f7715fbe6 100644 --- a/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi +++ b/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi @@ -1604,10 +1604,9 @@
dmc: bus@38000 { compatible = "simple-bus"; - reg = <0x0 0x38000 0x0 0x400>; #address-cells = <2>; #size-cells = <2>; - ranges = <0x0 0x0 0x0 0x38000 0x0 0x400>; + ranges = <0x0 0x0 0x0 0x38000 0x0 0x2000>;
canvas: video-lut@48 { compatible = "amlogic,canvas";
From: Peng Fan peng.fan@nxp.com
[ Upstream commit 85af7ffd24da38e416a14bd6bf207154d94faa83 ]
The osc_32k supports #clock-cells as 0, using an id is wrong, drop it.
Fixes: a6a355ede574 ("arm64: dts: imx8mm-evk: Add 32.768 kHz clock to PMIC") Signed-off-by: Peng Fan peng.fan@nxp.com Reviewed-by: Marco Felsch m.felsch@pengutronix.de Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi b/arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi index 521eb3a5a12ed..ed6d296bd6644 100644 --- a/arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi @@ -128,7 +128,7 @@ rohm,reset-snvs-powered;
#clock-cells = <0>; - clocks = <&osc_32k 0>; + clocks = <&osc_32k>; clock-output-names = "clk-32k-out";
regulators {
From: Florian Westphal fw@strlen.de
[ Upstream commit 94623f579ce338b5fa61b5acaa5beb8aa657fb9e ]
Recent attempt to ensure PREROUTING hook is executed again when a decrypted ipsec packet received on a bridge passes through the network stack a second time broke the physdev match in INPUT hook.
We can't discard the nf_bridge info strct from sabotage_in hook, as this is needed by the physdev match.
Keep the struct around and handle this with another conditional instead.
Fixes: 2b272bb558f1 ("netfilter: br_netfilter: disable sabotage_in hook after first suppression") Reported-and-tested-by: Farid BENAMROUCHE fariouche@yahoo.fr Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/skbuff.h | 1 + net/bridge/br_netfilter_hooks.c | 17 +++++++++++------ 2 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 39636fe7e8f0a..287999eedef45 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -258,6 +258,7 @@ struct nf_bridge_info { u8 pkt_otherhost:1; u8 in_prerouting:1; u8 bridged_dnat:1; + u8 sabotage_in_done:1; __u16 frag_max_size; struct net_device *physindev;
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c index f3c7cfba31e1b..f14beb9a62edb 100644 --- a/net/bridge/br_netfilter_hooks.c +++ b/net/bridge/br_netfilter_hooks.c @@ -868,12 +868,17 @@ static unsigned int ip_sabotage_in(void *priv, { struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
- if (nf_bridge && !nf_bridge->in_prerouting && - !netif_is_l3_master(skb->dev) && - !netif_is_l3_slave(skb->dev)) { - nf_bridge_info_free(skb); - state->okfn(state->net, state->sk, skb); - return NF_STOLEN; + if (nf_bridge) { + if (nf_bridge->sabotage_in_done) + return NF_ACCEPT; + + if (!nf_bridge->in_prerouting && + !netif_is_l3_master(skb->dev) && + !netif_is_l3_slave(skb->dev)) { + nf_bridge->sabotage_in_done = 1; + state->okfn(state->net, state->sk, skb); + return NF_STOLEN; + } }
return NF_ACCEPT;
From: Cristian Ciocaltea cristian.ciocaltea@collabora.com
[ Upstream commit 4fb9a5060f73627303bc531ceaab1b19d0a24aef ]
Since commit f2a9eb975ab2 ("regulator: fan53555: Add support for FAN53526") the driver makes use of the BIT() macro, but relies on the bits header being implicitly included.
Explicitly pull the header in to avoid potential build failures in some configurations.
While here, reorder include directives alphabetically.
Fixes: f2a9eb975ab2 ("regulator: fan53555: Add support for FAN53526") Signed-off-by: Cristian Ciocaltea cristian.ciocaltea@collabora.com Link: https://lore.kernel.org/r/20230406171806.948290-3-cristian.ciocaltea@collabo... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/fan53555.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/regulator/fan53555.c b/drivers/regulator/fan53555.c index aa426183b6a11..1af12074a75ab 100644 --- a/drivers/regulator/fan53555.c +++ b/drivers/regulator/fan53555.c @@ -8,18 +8,19 @@ // Copyright (c) 2012 Marvell Technology Ltd. // Yunfan Zhang yfzhang@marvell.com
+#include <linux/bits.h> +#include <linux/err.h> +#include <linux/i2c.h> #include <linux/module.h> +#include <linux/of_device.h> #include <linux/param.h> -#include <linux/err.h> #include <linux/platform_device.h> +#include <linux/regmap.h> #include <linux/regulator/driver.h> +#include <linux/regulator/fan53555.h> #include <linux/regulator/machine.h> #include <linux/regulator/of_regulator.h> -#include <linux/of_device.h> -#include <linux/i2c.h> #include <linux/slab.h> -#include <linux/regmap.h> -#include <linux/regulator/fan53555.h>
/* Voltage setting */ #define FAN53555_VSEL0 0x00
From: Gwangun Jung exsociety@gmail.com
[ Upstream commit 3037933448f60f9acb705997eae62013ecb81e0d ]
If the TCA_QFQ_LMAX value is not offered through nlattr, lmax is determined by the MTU value of the network device. The MTU of the loopback device can be set up to 2^31-1. As a result, it is possible to have an lmax value that exceeds QFQ_MIN_LMAX.
Due to the invalid lmax value, an index is generated that exceeds the QFQ_MAX_INDEX(=24) value, causing out-of-bounds read/write errors.
The following reports a oob access:
[ 84.582666] BUG: KASAN: slab-out-of-bounds in qfq_activate_agg.constprop.0 (net/sched/sch_qfq.c:1027 net/sched/sch_qfq.c:1060 net/sched/sch_qfq.c:1313) [ 84.583267] Read of size 4 at addr ffff88810f676948 by task ping/301 [ 84.583686] [ 84.583797] CPU: 3 PID: 301 Comm: ping Not tainted 6.3.0-rc5 #1 [ 84.584164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 84.584644] Call Trace: [ 84.584787] <TASK> [ 84.584906] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) [ 84.585108] print_report (mm/kasan/report.c:320 mm/kasan/report.c:430) [ 84.585570] kasan_report (mm/kasan/report.c:538) [ 84.585988] qfq_activate_agg.constprop.0 (net/sched/sch_qfq.c:1027 net/sched/sch_qfq.c:1060 net/sched/sch_qfq.c:1313) [ 84.586599] qfq_enqueue (net/sched/sch_qfq.c:1255) [ 84.587607] dev_qdisc_enqueue (net/core/dev.c:3776) [ 84.587749] __dev_queue_xmit (./include/net/sch_generic.h:186 net/core/dev.c:3865 net/core/dev.c:4212) [ 84.588763] ip_finish_output2 (./include/net/neighbour.h:546 net/ipv4/ip_output.c:228) [ 84.589460] ip_output (net/ipv4/ip_output.c:430) [ 84.590132] ip_push_pending_frames (./include/net/dst.h:444 net/ipv4/ip_output.c:126 net/ipv4/ip_output.c:1586 net/ipv4/ip_output.c:1606) [ 84.590285] raw_sendmsg (net/ipv4/raw.c:649) [ 84.591960] sock_sendmsg (net/socket.c:724 net/socket.c:747) [ 84.592084] __sys_sendto (net/socket.c:2142) [ 84.593306] __x64_sys_sendto (net/socket.c:2150) [ 84.593779] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 84.593902] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 84.594070] RIP: 0033:0x7fe568032066 [ 84.594192] Code: 0e 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c09[ 84.594796] RSP: 002b:00007ffce388b4e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
Code starting with the faulting instruction =========================================== [ 84.595047] RAX: ffffffffffffffda RBX: 00007ffce388cc70 RCX: 00007fe568032066 [ 84.595281] RDX: 0000000000000040 RSI: 00005605fdad6d10 RDI: 0000000000000003 [ 84.595515] RBP: 00005605fdad6d10 R08: 00007ffce388eeec R09: 0000000000000010 [ 84.595749] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 [ 84.595984] R13: 00007ffce388cc30 R14: 00007ffce388b4f0 R15: 0000001d00000001 [ 84.596218] </TASK> [ 84.596295] [ 84.596351] Allocated by task 291: [ 84.596467] kasan_save_stack (mm/kasan/common.c:46) [ 84.596597] kasan_set_track (mm/kasan/common.c:52) [ 84.596725] __kasan_kmalloc (mm/kasan/common.c:384) [ 84.596852] __kmalloc_node (./include/linux/kasan.h:196 mm/slab_common.c:967 mm/slab_common.c:974) [ 84.596979] qdisc_alloc (./include/linux/slab.h:610 ./include/linux/slab.h:731 net/sched/sch_generic.c:938) [ 84.597100] qdisc_create (net/sched/sch_api.c:1244) [ 84.597222] tc_modify_qdisc (net/sched/sch_api.c:1680) [ 84.597357] rtnetlink_rcv_msg (net/core/rtnetlink.c:6174) [ 84.597495] netlink_rcv_skb (net/netlink/af_netlink.c:2574) [ 84.597627] netlink_unicast (net/netlink/af_netlink.c:1340 net/netlink/af_netlink.c:1365) [ 84.597759] netlink_sendmsg (net/netlink/af_netlink.c:1942) [ 84.597891] sock_sendmsg (net/socket.c:724 net/socket.c:747) [ 84.598016] ____sys_sendmsg (net/socket.c:2501) [ 84.598147] ___sys_sendmsg (net/socket.c:2557) [ 84.598275] __sys_sendmsg (./include/linux/file.h:31 net/socket.c:2586) [ 84.598399] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 84.598520] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 84.598688] [ 84.598744] The buggy address belongs to the object at ffff88810f674000 [ 84.598744] which belongs to the cache kmalloc-8k of size 8192 [ 84.599135] The buggy address is located 2664 bytes to the right of [ 84.599135] allocated 7904-byte region [ffff88810f674000, ffff88810f675ee0) [ 84.599544] [ 84.599598] The buggy address belongs to the physical page: [ 84.599777] page:00000000e638567f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10f670 [ 84.600074] head:00000000e638567f order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 84.600330] flags: 0x200000000010200(slab|head|node=0|zone=2) [ 84.600517] raw: 0200000000010200 ffff888100043180 dead000000000122 0000000000000000 [ 84.600764] raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000 [ 84.601009] page dumped because: kasan: bad access detected [ 84.601187] [ 84.601241] Memory state around the buggy address: [ 84.601396] ffff88810f676800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.601620] ffff88810f676880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.601845] >ffff88810f676900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.602069] ^ [ 84.602243] ffff88810f676980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.602468] ffff88810f676a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 84.602693] ================================================================== [ 84.602924] Disabling lock debugging due to kernel taint
Fixes: 3015f3d2a3cd ("pkt_sched: enable QFQ to support TSO/GSO") Reported-by: Gwangun Jung exsociety@gmail.com Signed-off-by: Gwangun Jung exsociety@gmail.com Acked-by: Jamal Hadi Salimjhs@mojatatu.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_qfq.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index 1d1d81aeb389f..cad7deacf60a4 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -421,15 +421,16 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, } else weight = 1;
- if (tb[TCA_QFQ_LMAX]) { + if (tb[TCA_QFQ_LMAX]) lmax = nla_get_u32(tb[TCA_QFQ_LMAX]); - if (lmax < QFQ_MIN_LMAX || lmax > (1UL << QFQ_MTU_SHIFT)) { - pr_notice("qfq: invalid max length %u\n", lmax); - return -EINVAL; - } - } else + else lmax = psched_mtu(qdisc_dev(sch));
+ if (lmax < QFQ_MIN_LMAX || lmax > (1UL << QFQ_MTU_SHIFT)) { + pr_notice("qfq: invalid max length %u\n", lmax); + return -EINVAL; + } + inv_w = ONE_FP / weight; weight = ONE_FP / inv_w;
From: Xuan Zhuo xuanzhuo@linux.alibaba.com
[ Upstream commit 853618d5886bf94812f31228091cd37d308230f7 ]
Here we copy the data from the original buf to the new page. But we not check that it may be overflow.
As long as the size received(including vnethdr) is greater than 3840 (PAGE_SIZE -VIRTIO_XDP_HEADROOM). Then the memcpy will overflow.
And this is completely possible, as long as the MTU is large, such as 4096. In our test environment, this will cause crash. Since crash is caused by the written memory, it is meaningless, so I do not include it.
Fixes: 72979a6c3590 ("virtio_net: xdp, add slowpath case for non contiguous buffers") Signed-off-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Acked-by: Jason Wang jasowang@redhat.com Acked-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/virtio_net.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index d533211161366..47c9118cc92a3 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -646,8 +646,13 @@ static struct page *xdp_linearize_page(struct receive_queue *rq, int page_off, unsigned int *len) { - struct page *page = alloc_page(GFP_ATOMIC); + int tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + struct page *page;
+ if (page_off + *len + tailroom > PAGE_SIZE) + return NULL; + + page = alloc_page(GFP_ATOMIC); if (!page) return NULL;
@@ -655,7 +660,6 @@ static struct page *xdp_linearize_page(struct receive_queue *rq, page_off += *len;
while (--*num_buf) { - int tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); unsigned int buflen; void *buf; int off;
From: Jonathan Cooper jonathan.s.cooper@amd.com
[ Upstream commit 813cf9d1e753e1e0a247d3d685212a06141b483e ]
This patch splits the READY state in to NET_UP and NET_DOWN. This is to prepare for future work to delay resource allocation until interface up so that we can use resources more efficiently in SRIOV environments, and also to lay the ground work for an extra PROBED state where we don't create a network interface, for VDPA operation.
Signed-off-by: Jonathan Cooper jonathan.s.cooper@amd.com Acked-by: Martin Habets habetsm.xilinx@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: a80bb8e7233b ("sfc: Fix use-after-free due to selftest_work") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sfc/ef100_netdev.c | 6 ++- drivers/net/ethernet/sfc/efx.c | 29 ++++++------- drivers/net/ethernet/sfc/efx_common.c | 10 ++--- drivers/net/ethernet/sfc/efx_common.h | 6 +-- drivers/net/ethernet/sfc/ethtool_common.c | 2 +- drivers/net/ethernet/sfc/net_driver.h | 50 +++++++++++++++++++++-- 6 files changed, 72 insertions(+), 31 deletions(-)
diff --git a/drivers/net/ethernet/sfc/ef100_netdev.c b/drivers/net/ethernet/sfc/ef100_netdev.c index 63a44ee763be7..b9429e8faba1e 100644 --- a/drivers/net/ethernet/sfc/ef100_netdev.c +++ b/drivers/net/ethernet/sfc/ef100_netdev.c @@ -96,6 +96,8 @@ static int ef100_net_stop(struct net_device *net_dev) efx_mcdi_free_vis(efx); efx_remove_interrupts(efx);
+ efx->state = STATE_NET_DOWN; + return 0; }
@@ -172,6 +174,8 @@ static int ef100_net_open(struct net_device *net_dev) efx_link_status_changed(efx); mutex_unlock(&efx->mac_lock);
+ efx->state = STATE_NET_UP; + return 0;
fail: @@ -272,7 +276,7 @@ int ef100_register_netdev(struct efx_nic *efx) /* Always start with carrier off; PHY events will detect the link */ netif_carrier_off(net_dev);
- efx->state = STATE_READY; + efx->state = STATE_NET_DOWN; rtnl_unlock(); efx_init_mcdi_logging(efx);
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c index c069659c9e2d0..5f064f185d553 100644 --- a/drivers/net/ethernet/sfc/efx.c +++ b/drivers/net/ethernet/sfc/efx.c @@ -105,14 +105,6 @@ static int efx_xdp(struct net_device *dev, struct netdev_bpf *xdp); static int efx_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **xdpfs, u32 flags);
-#define EFX_ASSERT_RESET_SERIALISED(efx) \ - do { \ - if ((efx->state == STATE_READY) || \ - (efx->state == STATE_RECOVERY) || \ - (efx->state == STATE_DISABLED)) \ - ASSERT_RTNL(); \ - } while (0) - /************************************************************************** * * Port handling @@ -377,6 +369,8 @@ static int efx_probe_all(struct efx_nic *efx) if (rc) goto fail5;
+ efx->state = STATE_NET_DOWN; + return 0;
fail5: @@ -543,6 +537,9 @@ int efx_net_open(struct net_device *net_dev) efx_start_all(efx); if (efx->state == STATE_DISABLED || efx->reset_pending) netif_device_detach(efx->net_dev); + else + efx->state = STATE_NET_UP; + efx_selftest_async_start(efx); return 0; } @@ -721,8 +718,6 @@ static int efx_register_netdev(struct efx_nic *efx) * already requested. If so, the NIC is probably hosed so we * abort. */ - efx->state = STATE_READY; - smp_mb(); /* ensure we change state before checking reset_pending */ if (efx->reset_pending) { netif_err(efx, probe, efx->net_dev, "aborting probe due to scheduled reset\n"); @@ -750,6 +745,8 @@ static int efx_register_netdev(struct efx_nic *efx)
efx_associate(efx);
+ efx->state = STATE_NET_DOWN; + rtnl_unlock();
rc = device_create_file(&efx->pci_dev->dev, &dev_attr_phy_type); @@ -851,7 +848,7 @@ static void efx_pci_remove_main(struct efx_nic *efx) /* Flush reset_work. It can no longer be scheduled since we * are not READY. */ - BUG_ON(efx->state == STATE_READY); + WARN_ON(efx_net_active(efx->state)); efx_flush_reset_workqueue(efx);
efx_disable_interrupts(efx); @@ -1196,13 +1193,13 @@ static int efx_pm_freeze(struct device *dev)
rtnl_lock();
- if (efx->state != STATE_DISABLED) { - efx->state = STATE_UNINIT; - + if (efx_net_active(efx->state)) { efx_device_detach_sync(efx);
efx_stop_all(efx); efx_disable_interrupts(efx); + + efx->state = efx_freeze(efx->state); }
rtnl_unlock(); @@ -1217,7 +1214,7 @@ static int efx_pm_thaw(struct device *dev)
rtnl_lock();
- if (efx->state != STATE_DISABLED) { + if (efx_frozen(efx->state)) { rc = efx_enable_interrupts(efx); if (rc) goto fail; @@ -1230,7 +1227,7 @@ static int efx_pm_thaw(struct device *dev)
efx_device_attach_if_not_resetting(efx);
- efx->state = STATE_READY; + efx->state = efx_thaw(efx->state);
efx->type->resume_wol(efx); } diff --git a/drivers/net/ethernet/sfc/efx_common.c b/drivers/net/ethernet/sfc/efx_common.c index de797e1ac5a98..1527678b241cf 100644 --- a/drivers/net/ethernet/sfc/efx_common.c +++ b/drivers/net/ethernet/sfc/efx_common.c @@ -897,7 +897,7 @@ static void efx_reset_work(struct work_struct *data) * have changed by now. Now that we have the RTNL lock, * it cannot change again. */ - if (efx->state == STATE_READY) + if (efx_net_active(efx->state)) (void)efx_reset(efx, method);
rtnl_unlock(); @@ -907,7 +907,7 @@ void efx_schedule_reset(struct efx_nic *efx, enum reset_type type) { enum reset_type method;
- if (efx->state == STATE_RECOVERY) { + if (efx_recovering(efx->state)) { netif_dbg(efx, drv, efx->net_dev, "recovering: skip scheduling %s reset\n", RESET_TYPE(type)); @@ -942,7 +942,7 @@ void efx_schedule_reset(struct efx_nic *efx, enum reset_type type) /* If we're not READY then just leave the flags set as the cue * to abort probing or reschedule the reset later. */ - if (READ_ONCE(efx->state) != STATE_READY) + if (!efx_net_active(READ_ONCE(efx->state))) return;
/* efx_process_channel() will no longer read events once a @@ -1214,7 +1214,7 @@ static pci_ers_result_t efx_io_error_detected(struct pci_dev *pdev, rtnl_lock();
if (efx->state != STATE_DISABLED) { - efx->state = STATE_RECOVERY; + efx->state = efx_recover(efx->state); efx->reset_pending = 0;
efx_device_detach_sync(efx); @@ -1268,7 +1268,7 @@ static void efx_io_resume(struct pci_dev *pdev) netif_err(efx, hw, efx->net_dev, "efx_reset failed after PCI error (%d)\n", rc); } else { - efx->state = STATE_READY; + efx->state = efx_recovered(efx->state); netif_dbg(efx, hw, efx->net_dev, "Done resetting and resuming IO after PCI error.\n"); } diff --git a/drivers/net/ethernet/sfc/efx_common.h b/drivers/net/ethernet/sfc/efx_common.h index 65513fd0cf6c4..c72e819da8fd3 100644 --- a/drivers/net/ethernet/sfc/efx_common.h +++ b/drivers/net/ethernet/sfc/efx_common.h @@ -45,9 +45,7 @@ int efx_reconfigure_port(struct efx_nic *efx);
#define EFX_ASSERT_RESET_SERIALISED(efx) \ do { \ - if ((efx->state == STATE_READY) || \ - (efx->state == STATE_RECOVERY) || \ - (efx->state == STATE_DISABLED)) \ + if (efx->state != STATE_UNINIT) \ ASSERT_RTNL(); \ } while (0)
@@ -64,7 +62,7 @@ void efx_port_dummy_op_void(struct efx_nic *efx);
static inline int efx_check_disabled(struct efx_nic *efx) { - if (efx->state == STATE_DISABLED || efx->state == STATE_RECOVERY) { + if (efx->state == STATE_DISABLED || efx_recovering(efx->state)) { netif_err(efx, drv, efx->net_dev, "device is disabled due to earlier errors\n"); return -EIO; diff --git a/drivers/net/ethernet/sfc/ethtool_common.c b/drivers/net/ethernet/sfc/ethtool_common.c index bd552c7dffcb1..3846b76b89720 100644 --- a/drivers/net/ethernet/sfc/ethtool_common.c +++ b/drivers/net/ethernet/sfc/ethtool_common.c @@ -137,7 +137,7 @@ void efx_ethtool_self_test(struct net_device *net_dev, if (!efx_tests) goto fail;
- if (efx->state != STATE_READY) { + if (!efx_net_active(efx->state)) { rc = -EBUSY; goto out; } diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h index 8aecb4bd2c0d5..39f97929b3ffe 100644 --- a/drivers/net/ethernet/sfc/net_driver.h +++ b/drivers/net/ethernet/sfc/net_driver.h @@ -627,12 +627,54 @@ enum efx_int_mode { #define EFX_INT_MODE_USE_MSI(x) (((x)->interrupt_mode) <= EFX_INT_MODE_MSI)
enum nic_state { - STATE_UNINIT = 0, /* device being probed/removed or is frozen */ - STATE_READY = 1, /* hardware ready and netdev registered */ - STATE_DISABLED = 2, /* device disabled due to hardware errors */ - STATE_RECOVERY = 3, /* device recovering from PCI error */ + STATE_UNINIT = 0, /* device being probed/removed */ + STATE_NET_DOWN, /* hardware probed and netdev registered */ + STATE_NET_UP, /* ready for traffic */ + STATE_DISABLED, /* device disabled due to hardware errors */ + + STATE_RECOVERY = 0x100,/* recovering from PCI error */ + STATE_FROZEN = 0x200, /* frozen by power management */ };
+static inline bool efx_net_active(enum nic_state state) +{ + return state == STATE_NET_DOWN || state == STATE_NET_UP; +} + +static inline bool efx_frozen(enum nic_state state) +{ + return state & STATE_FROZEN; +} + +static inline bool efx_recovering(enum nic_state state) +{ + return state & STATE_RECOVERY; +} + +static inline enum nic_state efx_freeze(enum nic_state state) +{ + WARN_ON(!efx_net_active(state)); + return state | STATE_FROZEN; +} + +static inline enum nic_state efx_thaw(enum nic_state state) +{ + WARN_ON(!efx_frozen(state)); + return state & ~STATE_FROZEN; +} + +static inline enum nic_state efx_recover(enum nic_state state) +{ + WARN_ON(!efx_net_active(state)); + return state | STATE_RECOVERY; +} + +static inline enum nic_state efx_recovered(enum nic_state state) +{ + WARN_ON(!efx_recovering(state)); + return state & ~STATE_RECOVERY; +} + /* Forward declaration */ struct efx_nic;
From: Ding Hui dinghui@sangfor.com.cn
[ Upstream commit a80bb8e7233b2ad6ff119646b6e33fb3edcec37b ]
There is a use-after-free scenario that is:
When the NIC is down, user set mac address or vlan tag to VF, the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop() and efx_net_open(), since netif_running() is false, the port will not start and keep port_enabled false, but selftest_work is scheduled in efx_net_open().
If we remove the device before selftest_work run, the efx_stop_port() will not be called since the NIC is down, and then efx is freed, we will soon get a UAF in run_timer_softirq() like this:
[ 1178.907941] ================================================================== [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90 [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0 [ 1178.907950] [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022 [ 1178.907955] Call Trace: [ 1178.907956] <IRQ> [ 1178.907960] dump_stack+0x71/0xab [ 1178.907963] print_address_description+0x6b/0x290 [ 1178.907965] ? run_timer_softirq+0xdea/0xe90 [ 1178.907967] kasan_report+0x14a/0x2b0 [ 1178.907968] run_timer_softirq+0xdea/0xe90 [ 1178.907971] ? init_timer_key+0x170/0x170 [ 1178.907973] ? hrtimer_cancel+0x20/0x20 [ 1178.907976] ? sched_clock+0x5/0x10 [ 1178.907978] ? sched_clock_cpu+0x18/0x170 [ 1178.907981] __do_softirq+0x1c8/0x5fa [ 1178.907985] irq_exit+0x213/0x240 [ 1178.907987] smp_apic_timer_interrupt+0xd0/0x330 [ 1178.907989] apic_timer_interrupt+0xf/0x20 [ 1178.907990] </IRQ> [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
If the NIC is not actually brought up, there is no need to schedule selftest_work, so let's move invoking efx_selftest_async_start() into efx_start_all(), and it will be canceled by broughting down.
Fixes: dd40781e3a4e ("sfc: Run event/IRQ self-test asynchronously when interface is brought up") Fixes: e340be923012 ("sfc: add ndo_set_vf_mac() function for EF10") Debugged-by: Huang Cun huangcun@sangfor.com.cn Cc: Donglin Peng pengdonglin@sangfor.com.cn Suggested-by: Martin Habets habetsm.xilinx@gmail.com Signed-off-by: Ding Hui dinghui@sangfor.com.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sfc/efx.c | 1 - drivers/net/ethernet/sfc/efx_common.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c index 5f064f185d553..7cf52fcdb3078 100644 --- a/drivers/net/ethernet/sfc/efx.c +++ b/drivers/net/ethernet/sfc/efx.c @@ -540,7 +540,6 @@ int efx_net_open(struct net_device *net_dev) else efx->state = STATE_NET_UP;
- efx_selftest_async_start(efx); return 0; }
diff --git a/drivers/net/ethernet/sfc/efx_common.c b/drivers/net/ethernet/sfc/efx_common.c index 1527678b241cf..476ef1c976375 100644 --- a/drivers/net/ethernet/sfc/efx_common.c +++ b/drivers/net/ethernet/sfc/efx_common.c @@ -542,6 +542,8 @@ void efx_start_all(struct efx_nic *efx) /* Start the hardware monitor if there is one */ efx_start_monitor(efx);
+ efx_selftest_async_start(efx); + /* Link state detection is normally event-driven; we have * to poll now because we could have missed a change */
From: Florian Westphal fw@strlen.de
[ Upstream commit c55c0e91c813589dc55bea6bf9a9fbfaa10ae41d ]
nftables can be built as a module, so fix the preprocessor conditional accordingly.
Fixes: 478b360a47b7 ("netfilter: nf_tables: fix nf_trace always-on with XT_TRACE=n") Reported-by: Florian Fainelli f.fainelli@gmail.com Reported-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/skbuff.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 287999eedef45..a210f19958621 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -4277,7 +4277,7 @@ static inline void nf_reset_ct(struct sk_buff *skb)
static inline void nf_reset_trace(struct sk_buff *skb) { -#if IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TRACE) || defined(CONFIG_NF_TABLES) +#if IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TRACE) || IS_ENABLED(CONFIG_NF_TABLES) skb->nf_trace = 0; #endif } @@ -4297,7 +4297,7 @@ static inline void __nf_copy(struct sk_buff *dst, const struct sk_buff *src, dst->_nfct = src->_nfct; nf_conntrack_get(skb_nfct(src)); #endif -#if IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TRACE) || defined(CONFIG_NF_TABLES) +#if IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TRACE) || IS_ENABLED(CONFIG_NF_TABLES) if (copy) dst->nf_trace = src->nf_trace; #endif
From: Aleksandr Loktionov aleksandr.loktionov@intel.com
[ Upstream commit 8485d093b076e59baff424552e8aecfc5bd2d261 ]
Fix accessing vsi->active_filters without holding the mac_filter_hash_lock. Move vsi->active_filters = 0 inside critical section and move clear_bit(__I40E_VSI_OVERFLOW_PROMISC, vsi->state) after the critical section to ensure the new filters from other threads can be added only after filters cleaning in the critical section is finished.
Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 76481ff7074ba..3a93d538b2d73 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -13458,15 +13458,15 @@ static int i40e_add_vsi(struct i40e_vsi *vsi) vsi->id = ctxt.vsi_number; }
- vsi->active_filters = 0; - clear_bit(__I40E_VSI_OVERFLOW_PROMISC, vsi->state); spin_lock_bh(&vsi->mac_filter_hash_lock); + vsi->active_filters = 0; /* If macvlan filters already exist, force them to get loaded */ hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) { f->state = I40E_FILTER_NEW; f_count++; } spin_unlock_bh(&vsi->mac_filter_hash_lock); + clear_bit(__I40E_VSI_OVERFLOW_PROMISC, vsi->state);
if (f_count) { vsi->flags |= I40E_VSI_FLAG_FILTER_CHANGED;
From: Aleksandr Loktionov aleksandr.loktionov@intel.com
[ Upstream commit c86c00c6935505929cc9adb29ddb85e48c71f828 ]
Add error handling of i40e_setup_misc_vector() in i40e_rebuild(). In case interrupt vectors setup fails do not re-open vsi-s and do not bring up vf-s, we have no interrupts to serve a traffic anyway.
Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 3a93d538b2d73..d23a467d0d209 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -10448,8 +10448,11 @@ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired) pf->hw.aq.asq_last_status)); } /* reinit the misc interrupt */ - if (pf->flags & I40E_FLAG_MSIX_ENABLED) + if (pf->flags & I40E_FLAG_MSIX_ENABLED) { ret = i40e_setup_misc_vector(pf); + if (ret) + goto end_unlock; + }
/* Add a filter to drop all Flow control frames from any VSI from being * transmitted. By doing so we stop a malicious VF from sending out
From: Nikita Zhandarovich n.zhandarovich@fintech.ru
[ Upstream commit c0e73276f0fcbbd3d4736ba975d7dc7a48791b0c ]
Function mlxfw_mfa2_tlv_multi_get() returns NULL if 'tlv' in question does not pass checks in mlxfw_mfa2_tlv_payload_get(). This behaviour may lead to NULL pointer dereference in 'multi->total_len'. Fix this issue by testing mlxfw_mfa2_tlv_multi_get()'s return value against NULL.
Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE.
Fixes: 410ed13cae39 ("Add the mlxfw module for Mellanox firmware flash process") Co-developed-by: Natalia Petrova n.petrova@fintech.ru Signed-off-by: Nikita Zhandarovich n.zhandarovich@fintech.ru Reviewed-by: Ido Schimmel idosch@nvidia.com Link: https://lore.kernel.org/r/20230417120718.52325-1-n.zhandarovich@fintech.ru Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2_tlv_multi.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2_tlv_multi.c b/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2_tlv_multi.c index 017d68f1e1232..972c571b41587 100644 --- a/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2_tlv_multi.c +++ b/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2_tlv_multi.c @@ -31,6 +31,8 @@ mlxfw_mfa2_tlv_next(const struct mlxfw_mfa2_file *mfa2_file,
if (tlv->type == MLXFW_MFA2_TLV_MULTI_PART) { multi = mlxfw_mfa2_tlv_multi_get(mfa2_file, tlv); + if (!multi) + return NULL; tlv_len = NLA_ALIGN(tlv_len + be16_to_cpu(multi->total_len)); }
From: Alexander Aring aahringo@redhat.com
[ Upstream commit 4e006c7a6dac0ead4c1bf606000aa90a372fc253 ]
This patch fixes a missing 8 byte for the header size calculation. The ipv6_rpl_srh_size() is used to check a skb_pull() on skb->data which points to skb_transport_header(). Currently we only check on the calculated addresses fields using CmprI and CmprE fields, see:
https://www.rfc-editor.org/rfc/rfc6554#section-3
there is however a missing 8 byte inside the calculation which stands for the fields before the addresses field. Those 8 bytes are represented by sizeof(struct ipv6_rpl_sr_hdr) expression.
Fixes: 8610c7c6e3bd ("net: ipv6: add support for rpl sr exthdr") Signed-off-by: Alexander Aring aahringo@redhat.com Reported-by: maxpl0it maxpl0it@protonmail.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/rpl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/rpl.c b/net/ipv6/rpl.c index 307f336b5353e..3b0386437f69d 100644 --- a/net/ipv6/rpl.c +++ b/net/ipv6/rpl.c @@ -32,7 +32,8 @@ static void *ipv6_rpl_segdata_pos(const struct ipv6_rpl_sr_hdr *hdr, int i) size_t ipv6_rpl_srh_size(unsigned char n, unsigned char cmpri, unsigned char cmpre) { - return (n * IPV6_PFXTAIL_LEN(cmpri)) + IPV6_PFXTAIL_LEN(cmpre); + return sizeof(struct ipv6_rpl_sr_hdr) + (n * IPV6_PFXTAIL_LEN(cmpri)) + + IPV6_PFXTAIL_LEN(cmpre); }
void ipv6_rpl_srh_decompress(struct ipv6_rpl_sr_hdr *outhdr,
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 1f64757ee2bb22a93ec89b4c71707297e8cca0ba ]
During initialization the driver issues a reset command via its command interface in order to remove previous configuration from the device.
After issuing the reset, the driver waits for 200ms before polling on the "system_status" register using memory-mapped IO until the device reaches a ready state (0x5E). The wait is necessary because the reset command only triggers the reset, but the reset itself happens asynchronously. If the driver starts polling too soon, the read of the "system_status" register will never return and the system will crash [1].
The issue was discovered when the device was flashed with a development firmware version where the reset routine took longer to complete. The issue was fixed in the firmware, but it exposed the fact that the current wait time is borderline.
Fix by increasing the wait time from 200ms to 400ms. With this patch and the buggy firmware version, the issue did not reproduce in 10 reboots whereas without the patch the issue is reproduced quite consistently.
[1] mce: CPUs not responding to MCE broadcast (may include false positives): 0,4 mce: CPUs not responding to MCE broadcast (may include false positives): 0,4 Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler Shutting down cpus with NMI Kernel Offset: 0x12000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Fixes: ac004e84164e ("mlxsw: pci: Wait longer before accessing the device after reset") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Petr Machata petrm@nvidia.com Signed-off-by: Petr Machata petrm@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxsw/pci_hw.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h index a2c1fbd3e0d13..0225c8f1e5ea2 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h +++ b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h @@ -26,7 +26,7 @@ #define MLXSW_PCI_CIR_TIMEOUT_MSECS 1000
#define MLXSW_PCI_SW_RESET_TIMEOUT_MSECS 900000 -#define MLXSW_PCI_SW_RESET_WAIT_MSECS 200 +#define MLXSW_PCI_SW_RESET_WAIT_MSECS 400 #define MLXSW_PCI_FW_READY 0xA1844 #define MLXSW_PCI_FW_READY_MASK 0xFFFF #define MLXSW_PCI_FW_READY_MAGIC 0x5E
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit 71b547f561247897a0a14f3082730156c0533fed ]
Juan Jose et al reported an issue found via fuzzing where the verifier's pruning logic prematurely marks a program path as safe.
Consider the following program:
0: (b7) r6 = 1024 1: (b7) r7 = 0 2: (b7) r8 = 0 3: (b7) r9 = -2147483648 4: (97) r6 %= 1025 5: (05) goto pc+0 6: (bd) if r6 <= r9 goto pc+2 7: (97) r6 %= 1 8: (b7) r9 = 0 9: (bd) if r6 <= r9 goto pc+1 10: (b7) r6 = 0 11: (b7) r0 = 0 12: (63) *(u32 *)(r10 -4) = r0 13: (18) r4 = 0xffff888103693400 // map_ptr(ks=4,vs=48) 15: (bf) r1 = r4 16: (bf) r2 = r10 17: (07) r2 += -4 18: (85) call bpf_map_lookup_elem#1 19: (55) if r0 != 0x0 goto pc+1 20: (95) exit 21: (77) r6 >>= 10 22: (27) r6 *= 8192 23: (bf) r1 = r0 24: (0f) r0 += r6 25: (79) r3 = *(u64 *)(r0 +0) 26: (7b) *(u64 *)(r1 +0) = r3 27: (95) exit
The verifier treats this as safe, leading to oob read/write access due to an incorrect verifier conclusion:
func#0 @0 0: R1=ctx(off=0,imm=0) R10=fp0 0: (b7) r6 = 1024 ; R6_w=1024 1: (b7) r7 = 0 ; R7_w=0 2: (b7) r8 = 0 ; R8_w=0 3: (b7) r9 = -2147483648 ; R9_w=-2147483648 4: (97) r6 %= 1025 ; R6_w=scalar() 5: (05) goto pc+0 6: (bd) if r6 <= r9 goto pc+2 ; R6_w=scalar(umin=18446744071562067969,var_off=(0xffffffff00000000; 0xffffffff)) R9_w=-2147483648 7: (97) r6 %= 1 ; R6_w=scalar() 8: (b7) r9 = 0 ; R9=0 9: (bd) if r6 <= r9 goto pc+1 ; R6=scalar(umin=1) R9=0 10: (b7) r6 = 0 ; R6_w=0 11: (b7) r0 = 0 ; R0_w=0 12: (63) *(u32 *)(r10 -4) = r0 last_idx 12 first_idx 9 regs=1 stack=0 before 11: (b7) r0 = 0 13: R0_w=0 R10=fp0 fp-8=0000???? 13: (18) r4 = 0xffff8ad3886c2a00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 17: (07) r2 += -4 ; R2_w=fp-4 18: (85) call bpf_map_lookup_elem#1 ; R0=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0) 19: (55) if r0 != 0x0 goto pc+1 ; R0=0 20: (95) exit
from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm???? 21: (77) r6 >>= 10 ; R6_w=0 22: (27) r6 *= 8192 ; R6_w=0 23: (bf) r1 = r0 ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0) 24: (0f) r0 += r6 last_idx 24 first_idx 19 regs=40 stack=0 before 23: (bf) r1 = r0 regs=40 stack=0 before 22: (27) r6 *= 8192 regs=40 stack=0 before 21: (77) r6 >>= 10 regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1 parent didn't have regs=40 stack=0 marks: R0_rw=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0) R6_rw=P0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm???? last_idx 18 first_idx 9 regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1 regs=40 stack=0 before 17: (07) r2 += -4 regs=40 stack=0 before 16: (bf) r2 = r10 regs=40 stack=0 before 15: (bf) r1 = r4 regs=40 stack=0 before 13: (18) r4 = 0xffff8ad3886c2a00 regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0 regs=40 stack=0 before 11: (b7) r0 = 0 regs=40 stack=0 before 10: (b7) r6 = 0 25: (79) r3 = *(u64 *)(r0 +0) ; R0_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar() 26: (7b) *(u64 *)(r1 +0) = r3 ; R1_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar() 27: (95) exit
from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 11: (b7) r0 = 0 ; R0_w=0 12: (63) *(u32 *)(r10 -4) = r0 last_idx 12 first_idx 11 regs=1 stack=0 before 11: (b7) r0 = 0 13: R0_w=0 R10=fp0 fp-8=0000???? 13: (18) r4 = 0xffff8ad3886c2a00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 17: (07) r2 += -4 ; R2_w=fp-4 18: (85) call bpf_map_lookup_elem#1 frame 0: propagating r6 last_idx 19 first_idx 11 regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1 regs=40 stack=0 before 17: (07) r2 += -4 regs=40 stack=0 before 16: (bf) r2 = r10 regs=40 stack=0 before 15: (bf) r1 = r4 regs=40 stack=0 before 13: (18) r4 = 0xffff8ad3886c2a00 regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0 regs=40 stack=0 before 11: (b7) r0 = 0 parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_r=P0 R7=0 R8=0 R9=0 R10=fp0 last_idx 9 first_idx 9 regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1 parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar() R7_w=0 R8_w=0 R9_rw=0 R10=fp0 last_idx 8 first_idx 0 regs=40 stack=0 before 8: (b7) r9 = 0 regs=40 stack=0 before 7: (97) r6 %= 1 regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2 regs=40 stack=0 before 5: (05) goto pc+0 regs=40 stack=0 before 4: (97) r6 %= 1025 regs=40 stack=0 before 3: (b7) r9 = -2147483648 regs=40 stack=0 before 2: (b7) r8 = 0 regs=40 stack=0 before 1: (b7) r7 = 0 regs=40 stack=0 before 0: (b7) r6 = 1024 19: safe frame 0: propagating r6 last_idx 9 first_idx 0 regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2 regs=40 stack=0 before 5: (05) goto pc+0 regs=40 stack=0 before 4: (97) r6 %= 1025 regs=40 stack=0 before 3: (b7) r9 = -2147483648 regs=40 stack=0 before 2: (b7) r8 = 0 regs=40 stack=0 before 1: (b7) r7 = 0 regs=40 stack=0 before 0: (b7) r6 = 1024
from 6 to 9: safe verification time 110 usec stack depth 4 processed 36 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 2
The verifier considers this program as safe by mistakenly pruning unsafe code paths. In the above func#0, code lines 0-10 are of interest. In line 0-3 registers r6 to r9 are initialized with known scalar values. In line 4 the register r6 is reset to an unknown scalar given the verifier does not track modulo operations. Due to this, the verifier can also not determine precisely which branches in line 6 and 9 are taken, therefore it needs to explore them both.
As can be seen, the verifier starts with exploring the false/fall-through paths first. The 'from 19 to 21' path has both r6=0 and r9=0 and the pointer arithmetic on r0 += r6 is therefore considered safe. Given the arithmetic, r6 is correctly marked for precision tracking where backtracking kicks in where it walks back the current path all the way where r6 was set to 0 in the fall-through branch.
Next, the pruning logics pops the path 'from 9 to 11' from the stack. Also here, the state of the registers is the same, that is, r6=0 and r9=0, so that at line 19 the path can be pruned as it is considered safe. It is interesting to note that the conditional in line 9 turned r6 into a more precise state, that is, in the fall-through path at the beginning of line 10, it is R6=scalar(umin=1), and in the branch-taken path (which is analyzed here) at the beginning of line 11, r6 turned into a known const r6=0 as r9=0 prior to that and therefore (unsigned) r6 <= 0 concludes that r6 must be 0 (**):
[...] ; R6_w=scalar() 9: (bd) if r6 <= r9 goto pc+1 ; R6=scalar(umin=1) R9=0 [...]
from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 [...]
The next path is 'from 6 to 9'. The verifier considers the old and current state equivalent, and therefore prunes the search incorrectly. Looking into the two states which are being compared by the pruning logic at line 9, the old state consists of R6_rwD=Pscalar() R9_rwD=0 R10=fp0 and the new state consists of R1=ctx(off=0,imm=0) R6_w=scalar(umax=18446744071562067968) R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0. While r6 had the reg->precise flag correctly set in the old state, r9 did not. Both r6'es are considered as equivalent given the old one is a superset of the current, more precise one, however, r9's actual values (0 vs 0x80000000) mismatch. Given the old r9 did not have reg->precise flag set, the verifier does not consider the register as contributing to the precision state of r6, and therefore it considered both r9 states as equivalent. However, for this specific pruned path (which is also the actual path taken at runtime), register r6 will be 0x400 and r9 0x80000000 when reaching line 21, thus oob-accessing the map.
The purpose of precision tracking is to initially mark registers (including spilled ones) as imprecise to help verifier's pruning logic finding equivalent states it can then prune if they don't contribute to the program's safety aspects. For example, if registers are used for pointer arithmetic or to pass constant length to a helper, then the verifier sets reg->precise flag and backtracks the BPF program instruction sequence and chain of verifier states to ensure that the given register or stack slot including their dependencies are marked as precisely tracked scalar. This also includes any other registers and slots that contribute to a tracked state of given registers/stack slot. This backtracking relies on recorded jmp_history and is able to traverse entire chain of parent states. This process ends only when all the necessary registers/slots and their transitive dependencies are marked as precise.
The backtrack_insn() is called from the current instruction up to the first instruction, and its purpose is to compute a bitmask of registers and stack slots that need precision tracking in the parent's verifier state. For example, if a current instruction is r6 = r7, then r6 needs precision after this instruction and r7 needs precision before this instruction, that is, in the parent state. Hence for the latter r7 is marked and r6 unmarked.
For the class of jmp/jmp32 instructions, backtrack_insn() today only looks at call and exit instructions and for all other conditionals the masks remain as-is. However, in the given situation register r6 has a dependency on r9 (as described above in **), so also that one needs to be marked for precision tracking. In other words, if an imprecise register influences a precise one, then the imprecise register should also be marked precise. Meaning, in the parent state both dest and src register need to be tracked for precision and therefore the marking must be more conservative by setting reg->precise flag for both. The precision propagation needs to cover both for the conditional: if the src reg was marked but not the dst reg and vice versa.
After the fix the program is correctly rejected:
func#0 @0 0: R1=ctx(off=0,imm=0) R10=fp0 0: (b7) r6 = 1024 ; R6_w=1024 1: (b7) r7 = 0 ; R7_w=0 2: (b7) r8 = 0 ; R8_w=0 3: (b7) r9 = -2147483648 ; R9_w=-2147483648 4: (97) r6 %= 1025 ; R6_w=scalar() 5: (05) goto pc+0 6: (bd) if r6 <= r9 goto pc+2 ; R6_w=scalar(umin=18446744071562067969,var_off=(0xffffffff80000000; 0x7fffffff),u32_min=-2147483648) R9_w=-2147483648 7: (97) r6 %= 1 ; R6_w=scalar() 8: (b7) r9 = 0 ; R9=0 9: (bd) if r6 <= r9 goto pc+1 ; R6=scalar(umin=1) R9=0 10: (b7) r6 = 0 ; R6_w=0 11: (b7) r0 = 0 ; R0_w=0 12: (63) *(u32 *)(r10 -4) = r0 last_idx 12 first_idx 9 regs=1 stack=0 before 11: (b7) r0 = 0 13: R0_w=0 R10=fp0 fp-8=0000???? 13: (18) r4 = 0xffff9290dc5bfe00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 17: (07) r2 += -4 ; R2_w=fp-4 18: (85) call bpf_map_lookup_elem#1 ; R0=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0) 19: (55) if r0 != 0x0 goto pc+1 ; R0=0 20: (95) exit
from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm???? 21: (77) r6 >>= 10 ; R6_w=0 22: (27) r6 *= 8192 ; R6_w=0 23: (bf) r1 = r0 ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0) 24: (0f) r0 += r6 last_idx 24 first_idx 19 regs=40 stack=0 before 23: (bf) r1 = r0 regs=40 stack=0 before 22: (27) r6 *= 8192 regs=40 stack=0 before 21: (77) r6 >>= 10 regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1 parent didn't have regs=40 stack=0 marks: R0_rw=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0) R6_rw=P0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm???? last_idx 18 first_idx 9 regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1 regs=40 stack=0 before 17: (07) r2 += -4 regs=40 stack=0 before 16: (bf) r2 = r10 regs=40 stack=0 before 15: (bf) r1 = r4 regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00 regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0 regs=40 stack=0 before 11: (b7) r0 = 0 regs=40 stack=0 before 10: (b7) r6 = 0 25: (79) r3 = *(u64 *)(r0 +0) ; R0_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar() 26: (7b) *(u64 *)(r1 +0) = r3 ; R1_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar() 27: (95) exit
from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 11: (b7) r0 = 0 ; R0_w=0 12: (63) *(u32 *)(r10 -4) = r0 last_idx 12 first_idx 11 regs=1 stack=0 before 11: (b7) r0 = 0 13: R0_w=0 R10=fp0 fp-8=0000???? 13: (18) r4 = 0xffff9290dc5bfe00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 17: (07) r2 += -4 ; R2_w=fp-4 18: (85) call bpf_map_lookup_elem#1 frame 0: propagating r6 last_idx 19 first_idx 11 regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1 regs=40 stack=0 before 17: (07) r2 += -4 regs=40 stack=0 before 16: (bf) r2 = r10 regs=40 stack=0 before 15: (bf) r1 = r4 regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00 regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0 regs=40 stack=0 before 11: (b7) r0 = 0 parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_r=P0 R7=0 R8=0 R9=0 R10=fp0 last_idx 9 first_idx 9 regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1 parent didn't have regs=240 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar() R7_w=0 R8_w=0 R9_rw=P0 R10=fp0 last_idx 8 first_idx 0 regs=240 stack=0 before 8: (b7) r9 = 0 regs=40 stack=0 before 7: (97) r6 %= 1 regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2 regs=240 stack=0 before 5: (05) goto pc+0 regs=240 stack=0 before 4: (97) r6 %= 1025 regs=240 stack=0 before 3: (b7) r9 = -2147483648 regs=40 stack=0 before 2: (b7) r8 = 0 regs=40 stack=0 before 1: (b7) r7 = 0 regs=40 stack=0 before 0: (b7) r6 = 1024 19: safe
from 6 to 9: R1=ctx(off=0,imm=0) R6_w=scalar(umax=18446744071562067968) R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0 9: (bd) if r6 <= r9 goto pc+1 last_idx 9 first_idx 0 regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2 regs=240 stack=0 before 5: (05) goto pc+0 regs=240 stack=0 before 4: (97) r6 %= 1025 regs=240 stack=0 before 3: (b7) r9 = -2147483648 regs=40 stack=0 before 2: (b7) r8 = 0 regs=40 stack=0 before 1: (b7) r7 = 0 regs=40 stack=0 before 0: (b7) r6 = 1024 last_idx 9 first_idx 0 regs=200 stack=0 before 6: (bd) if r6 <= r9 goto pc+2 regs=240 stack=0 before 5: (05) goto pc+0 regs=240 stack=0 before 4: (97) r6 %= 1025 regs=240 stack=0 before 3: (b7) r9 = -2147483648 regs=40 stack=0 before 2: (b7) r8 = 0 regs=40 stack=0 before 1: (b7) r7 = 0 regs=40 stack=0 before 0: (b7) r6 = 1024 11: R6=scalar(umax=18446744071562067968) R9=-2147483648 11: (b7) r0 = 0 ; R0_w=0 12: (63) *(u32 *)(r10 -4) = r0 last_idx 12 first_idx 11 regs=1 stack=0 before 11: (b7) r0 = 0 13: R0_w=0 R10=fp0 fp-8=0000???? 13: (18) r4 = 0xffff9290dc5bfe00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0) 16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 17: (07) r2 += -4 ; R2_w=fp-4 18: (85) call bpf_map_lookup_elem#1 ; R0_w=map_value_or_null(id=3,off=0,ks=4,vs=48,imm=0) 19: (55) if r0 != 0x0 goto pc+1 ; R0_w=0 20: (95) exit
from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=scalar(umax=18446744071562067968) R7=0 R8=0 R9=-2147483648 R10=fp0 fp-8=mmmm???? 21: (77) r6 >>= 10 ; R6_w=scalar(umax=18014398507384832,var_off=(0x0; 0x3fffffffffffff)) 22: (27) r6 *= 8192 ; R6_w=scalar(smax=9223372036854767616,umax=18446744073709543424,var_off=(0x0; 0xffffffffffffe000),s32_max=2147475456,u32_max=-8192) 23: (bf) r1 = r0 ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0) 24: (0f) r0 += r6 last_idx 24 first_idx 21 regs=40 stack=0 before 23: (bf) r1 = r0 regs=40 stack=0 before 22: (27) r6 *= 8192 regs=40 stack=0 before 21: (77) r6 >>= 10 parent didn't have regs=40 stack=0 marks: R0_rw=map_value(off=0,ks=4,vs=48,imm=0) R6_r=Pscalar(umax=18446744071562067968) R7=0 R8=0 R9=-2147483648 R10=fp0 fp-8=mmmm???? last_idx 19 first_idx 11 regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1 regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1 regs=40 stack=0 before 17: (07) r2 += -4 regs=40 stack=0 before 16: (bf) r2 = r10 regs=40 stack=0 before 15: (bf) r1 = r4 regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00 regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0 regs=40 stack=0 before 11: (b7) r0 = 0 parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar(umax=18446744071562067968) R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0 last_idx 9 first_idx 0 regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1 regs=240 stack=0 before 6: (bd) if r6 <= r9 goto pc+2 regs=240 stack=0 before 5: (05) goto pc+0 regs=240 stack=0 before 4: (97) r6 %= 1025 regs=240 stack=0 before 3: (b7) r9 = -2147483648 regs=40 stack=0 before 2: (b7) r8 = 0 regs=40 stack=0 before 1: (b7) r7 = 0 regs=40 stack=0 before 0: (b7) r6 = 1024 math between map_value pointer and register with unbounded min value is not allowed verification time 886 usec stack depth 4 processed 49 insns (limit 1000000) max_states_per_insn 1 total_states 5 peak_states 5 mark_read 2
Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking") Reported-by: Juan Jose Lopez Jaimez jjlopezjaimez@google.com Reported-by: Meador Inge meadori@google.com Reported-by: Simon Scannell simonscannell@google.com Reported-by: Nenad Stojanovski thenenadx@google.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Co-developed-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Reviewed-by: John Fastabend john.fastabend@gmail.com Reviewed-by: Juan Jose Lopez Jaimez jjlopezjaimez@google.com Reviewed-by: Meador Inge meadori@google.com Reviewed-by: Simon Scannell simonscannell@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9e5f1ebe67d7f..5a96a9dd51e4c 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1931,6 +1931,21 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, } } else if (opcode == BPF_EXIT) { return -ENOTSUPP; + } else if (BPF_SRC(insn->code) == BPF_X) { + if (!(*reg_mask & (dreg | sreg))) + return 0; + /* dreg <cond> sreg + * Both dreg and sreg need precision before + * this insn. If only sreg was marked precise + * before it would be equally necessary to + * propagate it to dreg. + */ + *reg_mask |= (sreg | dreg); + /* else dreg <cond> K + * Only dreg still needs precision before + * this insn, so for the K-based conditional + * there is nothing new to be marked. + */ } } else if (class == BPF_LD) { if (!(*reg_mask & dreg))
From: Sebastian Basierski sebastianx.basierski@intel.com
[ Upstream commit 67d47b95119ad589b0a0b16b88b1dd9a04061ced ]
While using i219-LM card currently it was only possible to achieve about 60% of maximum speed due to regression introduced in Linux 5.8. This was caused by TSO not being disabled by default despite commit f29801030ac6 ("e1000e: Disable TSO for buffer overrun workaround"). Fix that by disabling TSO during driver probe.
Fixes: f29801030ac6 ("e1000e: Disable TSO for buffer overrun workaround") Signed-off-by: Sebastian Basierski sebastianx.basierski@intel.com Signed-off-by: Mateusz Palczewski mateusz.palczewski@intel.com Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230417205345.1030801-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/e1000e/netdev.c | 51 +++++++++++----------- 1 file changed, 26 insertions(+), 25 deletions(-)
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index ae0c9aaab48db..b700663a634d2 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -5294,31 +5294,6 @@ static void e1000_watchdog_task(struct work_struct *work) ew32(TARC(0), tarc0); }
- /* disable TSO for pcie and 10/100 speeds, to avoid - * some hardware issues - */ - if (!(adapter->flags & FLAG_TSO_FORCE)) { - switch (adapter->link_speed) { - case SPEED_10: - case SPEED_100: - e_info("10/100 speed: disabling TSO\n"); - netdev->features &= ~NETIF_F_TSO; - netdev->features &= ~NETIF_F_TSO6; - break; - case SPEED_1000: - netdev->features |= NETIF_F_TSO; - netdev->features |= NETIF_F_TSO6; - break; - default: - /* oops */ - break; - } - if (hw->mac.type == e1000_pch_spt) { - netdev->features &= ~NETIF_F_TSO; - netdev->features &= ~NETIF_F_TSO6; - } - } - /* enable transmits in the hardware, need to do this * after setting TARC(0) */ @@ -7477,6 +7452,32 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent) NETIF_F_RXCSUM | NETIF_F_HW_CSUM);
+ /* disable TSO for pcie and 10/100 speeds to avoid + * some hardware issues and for i219 to fix transfer + * speed being capped at 60% + */ + if (!(adapter->flags & FLAG_TSO_FORCE)) { + switch (adapter->link_speed) { + case SPEED_10: + case SPEED_100: + e_info("10/100 speed: disabling TSO\n"); + netdev->features &= ~NETIF_F_TSO; + netdev->features &= ~NETIF_F_TSO6; + break; + case SPEED_1000: + netdev->features |= NETIF_F_TSO; + netdev->features |= NETIF_F_TSO6; + break; + default: + /* oops */ + break; + } + if (hw->mac.type == e1000_pch_spt) { + netdev->features &= ~NETIF_F_TSO; + netdev->features &= ~NETIF_F_TSO6; + } + } + /* Set user-changeable features (subset of all device features) */ netdev->hw_features = netdev->features; netdev->hw_features |= NETIF_F_RXFCS;
From: Douglas Raillard douglas.raillard@arm.com
[ Upstream commit 0b04d4c0542e8573a837b1d81b94209e48723b25 ]
Fix the nid_t field so that its size is correctly reported in the text format embedded in trace.dat files. As it stands, it is reported as being of size 4:
field:nid_t nid[3]; offset:24; size:4; signed:0;
Instead of 12:
field:nid_t nid[3]; offset:24; size:12; signed:0;
This also fixes the reported offset of subsequent fields so that they match with the actual struct layout.
Signed-off-by: Douglas Raillard douglas.raillard@arm.com Reviewed-by: Mukesh Ojha quic_mojha@quicinc.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/trace/events/f2fs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index df293bc7f03b8..e8cd19e91de11 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -513,7 +513,7 @@ TRACE_EVENT(f2fs_truncate_partial_nodes, TP_STRUCT__entry( __field(dev_t, dev) __field(ino_t, ino) - __field(nid_t, nid[3]) + __array(nid_t, nid, 3) __field(int, depth) __field(int, err) ),
From: Jonathan Denose jdenose@chromium.org
[ Upstream commit f5bad62f9107b701a6def7cac1f5f65862219b83 ]
Fujitsu Lifebook A574/H requires the nomux option to properly probe the touchpad, especially when waking from sleep.
Signed-off-by: Jonathan Denose jdenose@google.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230303152623.45859-1-jdenose@google.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/serio/i8042-x86ia64io.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h index 65c0081838e3d..9dcdf21c50bdc 100644 --- a/drivers/input/serio/i8042-x86ia64io.h +++ b/drivers/input/serio/i8042-x86ia64io.h @@ -601,6 +601,14 @@ static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = { }, .driver_data = (void *)(SERIO_QUIRK_NOMUX) }, + { + /* Fujitsu Lifebook A574/H */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"), + DMI_MATCH(DMI_PRODUCT_NAME, "FMVA0501PZ"), + }, + .driver_data = (void *)(SERIO_QUIRK_NOMUX) + }, { /* Gigabyte M912 */ .matches = {
From: Nick Desaulniers ndesaulniers@google.com
[ Upstream commit 05107edc910135d27fe557267dc45be9630bf3dd ]
Building sigaltstack with clang via: $ ARCH=x86 make LLVM=1 -C tools/testing/selftests/sigaltstack/
produces the following warning: warning: variable 'sp' is uninitialized when used here [-Wuninitialized] if (sp < (unsigned long)sstack || ^~
Clang expects these to be declared at global scope; we've fixed this in the kernel proper by using the macro `current_stack_pointer`. This is defined in different headers for different target architectures, so just create a new header that defines the arch-specific register names for the stack pointer register, and define it for more targets (at least the ones that support current_stack_pointer/ARCH_HAS_CURRENT_STACK_POINTER).
Reported-by: Linux Kernel Functional Testing lkft@linaro.org Link: https://lore.kernel.org/lkml/CA+G9fYsi3OOu7yCsMutpzKDnBMAzJBCPimBp86LhGBa0eC... Signed-off-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Kees Cook keescook@chromium.org Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Anders Roxell anders.roxell@linaro.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../sigaltstack/current_stack_pointer.h | 23 +++++++++++++++++++ tools/testing/selftests/sigaltstack/sas.c | 7 +----- 2 files changed, 24 insertions(+), 6 deletions(-) create mode 100644 tools/testing/selftests/sigaltstack/current_stack_pointer.h
diff --git a/tools/testing/selftests/sigaltstack/current_stack_pointer.h b/tools/testing/selftests/sigaltstack/current_stack_pointer.h new file mode 100644 index 0000000000000..ea9bdf3a90b16 --- /dev/null +++ b/tools/testing/selftests/sigaltstack/current_stack_pointer.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#if __alpha__ +register unsigned long sp asm("$30"); +#elif __arm__ || __aarch64__ || __csky__ || __m68k__ || __mips__ || __riscv +register unsigned long sp asm("sp"); +#elif __i386__ +register unsigned long sp asm("esp"); +#elif __loongarch64 +register unsigned long sp asm("$sp"); +#elif __ppc__ +register unsigned long sp asm("r1"); +#elif __s390x__ +register unsigned long sp asm("%15"); +#elif __sh__ +register unsigned long sp asm("r15"); +#elif __x86_64__ +register unsigned long sp asm("rsp"); +#elif __XTENSA__ +register unsigned long sp asm("a1"); +#else +#error "implement current_stack_pointer equivalent" +#endif diff --git a/tools/testing/selftests/sigaltstack/sas.c b/tools/testing/selftests/sigaltstack/sas.c index 8934a3766d207..41646c22384a2 100644 --- a/tools/testing/selftests/sigaltstack/sas.c +++ b/tools/testing/selftests/sigaltstack/sas.c @@ -19,6 +19,7 @@ #include <errno.h>
#include "../kselftest.h" +#include "current_stack_pointer.h"
#ifndef SS_AUTODISARM #define SS_AUTODISARM (1U << 31) @@ -40,12 +41,6 @@ void my_usr1(int sig, siginfo_t *si, void *u) stack_t stk; struct stk_data *p;
-#if __s390x__ - register unsigned long sp asm("%15"); -#else - register unsigned long sp asm("sp"); -#endif - if (sp < (unsigned long)sstack || sp >= (unsigned long)sstack + SIGSTKSZ) { ksft_exit_fail_msg("SP is not on sigaltstack\n");
From: Tomas Henzl thenzl@redhat.com
[ Upstream commit 0808ed6ebbc292222ca069d339744870f6d801da ]
If crash_dump_buf is not allocated then crash dump can't be available. Replace logical 'and' with 'or'.
Signed-off-by: Tomas Henzl thenzl@redhat.com Link: https://lore.kernel.org/r/20230324135249.9733-1-thenzl@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/megaraid/megaraid_sas_base.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index 84a2e9292fd03..b5a74b237fd21 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -3248,7 +3248,7 @@ fw_crash_buffer_show(struct device *cdev,
spin_lock_irqsave(&instance->crashdump_lock, flags); buff_offset = instance->fw_crash_buffer_offset; - if (!instance->crash_dump_buf && + if (!instance->crash_dump_buf || !((instance->fw_crash_state == AVAILABLE) || (instance->fw_crash_state == COPYING))) { dev_err(&instance->pdev->dev,
From: Damien Le Moal damien.lemoal@opensource.wdc.com
[ Upstream commit f0aa59a33d2ac2267d260fe21eaf92500df8e7b4 ]
Some USB-SATA adapters have broken behavior when an unsupported VPD page is probed: Depending on the VPD page number, a 4-byte header with a valid VPD page number but with a 0 length is returned. Currently, scsi_vpd_inquiry() only checks that the page number is valid to determine if the page is valid, which results in receiving only the 4-byte header for the non-existent page. This error manifests itself very often with page 0xb9 for the Concurrent Positioning Ranges detection done by sd_read_cpr(), resulting in the following error message:
sd 0:0:0:0: [sda] Invalid Concurrent Positioning Ranges VPD page
Prevent such misleading error message by adding a check in scsi_vpd_inquiry() to verify that the page length is not 0.
Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Link: https://lore.kernel.org/r/20230322022211.116327-1-damien.lemoal@opensource.w... Reviewed-by: Benjamin Block bblock@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 6ad834d61d4c7..d6c25a88cebc9 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -317,11 +317,18 @@ static int scsi_vpd_inquiry(struct scsi_device *sdev, unsigned char *buffer, if (result) return -EIO;
- /* Sanity check that we got the page back that we asked for */ + /* + * Sanity check that we got the page back that we asked for and that + * the page size is not 0. + */ if (buffer[1] != page) return -EIO;
- return get_unaligned_be16(&buffer[2]) + 4; + result = get_unaligned_be16(&buffer[2]); + if (!result) + return -EIO; + + return result + 4; }
/**
From: Álvaro Fernández Rojas noltari@gmail.com
[ Upstream commit 45977e58ce65ed0459edc9a0466d9dfea09463f5 ]
Implement phy_read16() and phy_write16() ops for B53 MMAP to avoid accessing B53_PORT_MII_PAGE registers which hangs the device. This access should be done through the MDIO Mux bus controller.
Signed-off-by: Álvaro Fernández Rojas noltari@gmail.com Acked-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/b53/b53_mmap.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/drivers/net/dsa/b53/b53_mmap.c b/drivers/net/dsa/b53/b53_mmap.c index c628d0980c0b1..1d52cb3e46d52 100644 --- a/drivers/net/dsa/b53/b53_mmap.c +++ b/drivers/net/dsa/b53/b53_mmap.c @@ -215,6 +215,18 @@ static int b53_mmap_write64(struct b53_device *dev, u8 page, u8 reg, return 0; }
+static int b53_mmap_phy_read16(struct b53_device *dev, int addr, int reg, + u16 *value) +{ + return -EIO; +} + +static int b53_mmap_phy_write16(struct b53_device *dev, int addr, int reg, + u16 value) +{ + return -EIO; +} + static const struct b53_io_ops b53_mmap_ops = { .read8 = b53_mmap_read8, .read16 = b53_mmap_read16, @@ -226,6 +238,8 @@ static const struct b53_io_ops b53_mmap_ops = { .write32 = b53_mmap_write32, .write48 = b53_mmap_write48, .write64 = b53_mmap_write64, + .phy_read16 = b53_mmap_phy_read16, + .phy_write16 = b53_mmap_phy_write16, };
static int b53_mmap_probe(struct platform_device *pdev)
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit f9bbf25e7b2b74b52b2f269216a92657774f239c ]
Return -EFAULT if put_user() for the PTRACE_GET_LAST_BREAK request fails, instead of silently ignoring it.
Reviewed-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/ptrace.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c index a76dd27fb2e81..3009bb5272524 100644 --- a/arch/s390/kernel/ptrace.c +++ b/arch/s390/kernel/ptrace.c @@ -500,9 +500,7 @@ long arch_ptrace(struct task_struct *child, long request, } return 0; case PTRACE_GET_LAST_BREAK: - put_user(child->thread.last_break, - (unsigned long __user *) data); - return 0; + return put_user(child->thread.last_break, (unsigned long __user *)data); case PTRACE_ENABLE_TE: if (!MACHINE_HAS_TE) return -EIO; @@ -854,9 +852,7 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request, } return 0; case PTRACE_GET_LAST_BREAK: - put_user(child->thread.last_break, - (unsigned int __user *) data); - return 0; + return put_user(child->thread.last_break, (unsigned int __user *)data); } return compat_ptrace_request(child, request, addr, data); }
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 88eaba80328b31ef81813a1207b4056efd7006a6 ]
When we allocate a nvme-tcp queue, we set the data_ready callback before we actually need to use it. This creates the potential that if a stray controller sends us data on the socket before we connect, we can trigger the io_work and start consuming the socket.
In this case reported: we failed to allocate one of the io queues, and as we start releasing the queues that we already allocated, we get a UAF [1] from the io_work which is running before it should really.
Fix this by setting the socket ops callbacks only before we start the queue, so that we can't accidentally schedule the io_work in the initialization phase before the queue started. While we are at it, rename nvme_tcp_restore_sock_calls to pair with nvme_tcp_setup_sock_ops.
[1]: [16802.107284] nvme nvme4: starting error recovery [16802.109166] nvme nvme4: Reconnecting in 10 seconds... [16812.173535] nvme nvme4: failed to connect socket: -111 [16812.173745] nvme nvme4: Failed reconnect attempt 1 [16812.173747] nvme nvme4: Reconnecting in 10 seconds... [16822.413555] nvme nvme4: failed to connect socket: -111 [16822.413762] nvme nvme4: Failed reconnect attempt 2 [16822.413765] nvme nvme4: Reconnecting in 10 seconds... [16832.661274] nvme nvme4: creating 32 I/O queues. [16833.919887] BUG: kernel NULL pointer dereference, address: 0000000000000088 [16833.920068] nvme nvme4: Failed reconnect attempt 3 [16833.920094] #PF: supervisor write access in kernel mode [16833.920261] nvme nvme4: Reconnecting in 10 seconds... [16833.920368] #PF: error_code(0x0002) - not-present page [16833.921086] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp] [16833.921191] RIP: 0010:_raw_spin_lock_bh+0x17/0x30 ... [16833.923138] Call Trace: [16833.923271] <TASK> [16833.923402] lock_sock_nested+0x1e/0x50 [16833.923545] nvme_tcp_try_recv+0x40/0xa0 [nvme_tcp] [16833.923685] nvme_tcp_io_work+0x68/0xa0 [nvme_tcp] [16833.923824] process_one_work+0x1e8/0x390 [16833.923969] worker_thread+0x53/0x3d0 [16833.924104] ? process_one_work+0x390/0x390 [16833.924240] kthread+0x124/0x150 [16833.924376] ? set_kthread_struct+0x50/0x50 [16833.924518] ret_from_fork+0x1f/0x30 [16833.924655] </TASK>
Reported-by: Yanjun Zhang zhangyanjun@cestc.cn Signed-off-by: Sagi Grimberg sagi@grimberg.me Tested-by: Yanjun Zhang zhangyanjun@cestc.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/tcp.c | 46 +++++++++++++++++++++++------------------ 1 file changed, 26 insertions(+), 20 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 57df87def8c33..e6147a9220f9a 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1535,22 +1535,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, if (ret) goto err_init_connect;
- queue->rd_enabled = true; set_bit(NVME_TCP_Q_ALLOCATED, &queue->flags); - nvme_tcp_init_recv_ctx(queue); - - write_lock_bh(&queue->sock->sk->sk_callback_lock); - queue->sock->sk->sk_user_data = queue; - queue->state_change = queue->sock->sk->sk_state_change; - queue->data_ready = queue->sock->sk->sk_data_ready; - queue->write_space = queue->sock->sk->sk_write_space; - queue->sock->sk->sk_data_ready = nvme_tcp_data_ready; - queue->sock->sk->sk_state_change = nvme_tcp_state_change; - queue->sock->sk->sk_write_space = nvme_tcp_write_space; -#ifdef CONFIG_NET_RX_BUSY_POLL - queue->sock->sk->sk_ll_usec = 1; -#endif - write_unlock_bh(&queue->sock->sk->sk_callback_lock);
return 0;
@@ -1569,7 +1554,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, return ret; }
-static void nvme_tcp_restore_sock_calls(struct nvme_tcp_queue *queue) +static void nvme_tcp_restore_sock_ops(struct nvme_tcp_queue *queue) { struct socket *sock = queue->sock;
@@ -1584,7 +1569,7 @@ static void nvme_tcp_restore_sock_calls(struct nvme_tcp_queue *queue) static void __nvme_tcp_stop_queue(struct nvme_tcp_queue *queue) { kernel_sock_shutdown(queue->sock, SHUT_RDWR); - nvme_tcp_restore_sock_calls(queue); + nvme_tcp_restore_sock_ops(queue); cancel_work_sync(&queue->io_work); }
@@ -1599,21 +1584,42 @@ static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid) mutex_unlock(&queue->queue_lock); }
+static void nvme_tcp_setup_sock_ops(struct nvme_tcp_queue *queue) +{ + write_lock_bh(&queue->sock->sk->sk_callback_lock); + queue->sock->sk->sk_user_data = queue; + queue->state_change = queue->sock->sk->sk_state_change; + queue->data_ready = queue->sock->sk->sk_data_ready; + queue->write_space = queue->sock->sk->sk_write_space; + queue->sock->sk->sk_data_ready = nvme_tcp_data_ready; + queue->sock->sk->sk_state_change = nvme_tcp_state_change; + queue->sock->sk->sk_write_space = nvme_tcp_write_space; +#ifdef CONFIG_NET_RX_BUSY_POLL + queue->sock->sk->sk_ll_usec = 1; +#endif + write_unlock_bh(&queue->sock->sk->sk_callback_lock); +} + static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx) { struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl); + struct nvme_tcp_queue *queue = &ctrl->queues[idx]; int ret;
+ queue->rd_enabled = true; + nvme_tcp_init_recv_ctx(queue); + nvme_tcp_setup_sock_ops(queue); + if (idx) ret = nvmf_connect_io_queue(nctrl, idx, false); else ret = nvmf_connect_admin_queue(nctrl);
if (!ret) { - set_bit(NVME_TCP_Q_LIVE, &ctrl->queues[idx].flags); + set_bit(NVME_TCP_Q_LIVE, &queue->flags); } else { - if (test_bit(NVME_TCP_Q_ALLOCATED, &ctrl->queues[idx].flags)) - __nvme_tcp_stop_queue(&ctrl->queues[idx]); + if (test_bit(NVME_TCP_Q_ALLOCATED, &queue->flags)) + __nvme_tcp_stop_queue(queue); dev_err(nctrl->device, "failed to connect queue: %d ret=%d\n", idx, ret); }
From: Juergen Gross jgross@suse.com
[ Upstream commit 2eca98e5b24d01c02b46c67be05a5f98cc9789b1 ]
Issue the same error message in case an illegal page boundary crossing has been detected in both cases where this is tested.
Suggested-by: Jan Beulich jbeulich@suse.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Link: https://lore.kernel.org/r/20230329080259.14823-1-jgross@suse.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/xen-netback/netback.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 67614e7166ac8..379ac9ca60b70 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -996,10 +996,8 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
/* No crossing a page as the payload mustn't fragment. */ if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) { - netdev_err(queue->vif->dev, - "txreq.offset: %u, size: %u, end: %lu\n", - txreq.offset, txreq.size, - (unsigned long)(txreq.offset&~XEN_PAGE_MASK) + txreq.size); + netdev_err(queue->vif->dev, "Cross page boundary, txreq.offset: %u, size: %u\n", + txreq.offset, txreq.size); xenvif_fatal_tx_err(queue->vif); break; }
From: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com
commit f50da6edbf1ebf35dd8070847bfab5cb988d472b upstream.
Fix make htmldocs related errors with the newly added associativity.rst doc file.
Reported-by: Stephen Rothwell sfr@canb.auug.org.au Tested-by: Stephen Rothwell sfr@canb.auug.org.au # build test Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210825042447.106219-1-aneesh.kumar@linux.ibm.com Cc: Salvatore Bonaccorso carnil@debian.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/powerpc/associativity.rst | 29 +++++++++++++++-------------- Documentation/powerpc/index.rst | 1 + 2 files changed, 16 insertions(+), 14 deletions(-)
--- a/Documentation/powerpc/associativity.rst +++ b/Documentation/powerpc/associativity.rst @@ -1,6 +1,6 @@ ============================ NUMA resource associativity -============================= +============================
Associativity represents the groupings of the various platform resources into domains of substantially similar mean performance relative to resources outside @@ -20,11 +20,11 @@ A value of 1 indicates the usage of Form bit 2 of byte 5 in the "ibm,architecture-vec-5" property is used.
Form 0 ------ +------ Form 0 associativity supports only two NUMA distances (LOCAL and REMOTE).
Form 1 ------ +------ With Form 1 a combination of ibm,associativity-reference-points, and ibm,associativity device tree properties are used to determine the NUMA distance between resource groups/domains.
@@ -78,17 +78,18 @@ numa-lookup-index-table.
For ex: ibm,numa-lookup-index-table = <3 0 8 40>; -ibm,numa-distace-table = <9>, /bits/ 8 < 10 20 80 - 20 10 160 - 80 160 10>; - | 0 8 40 ---|------------ - | -0 | 10 20 80 - | -8 | 20 10 160 - | -40| 80 160 10 +ibm,numa-distace-table = <9>, /bits/ 8 < 10 20 80 20 10 160 80 160 10>; + +:: + + | 0 8 40 + --|------------ + | + 0 | 10 20 80 + | + 8 | 20 10 160 + | + 40| 80 160 10
A possible "ibm,associativity" property for resources in node 0, 8 and 40
--- a/Documentation/powerpc/index.rst +++ b/Documentation/powerpc/index.rst @@ -7,6 +7,7 @@ powerpc .. toctree:: :maxdepth: 1
+ associativity booting bootwrapper cpu_families
From: Brian Foster bfoster@redhat.com
commit 7cd3099f4925d7c15887d1940ebd65acd66100f5 upstream.
Per-inode ioend completion batching has a log reservation deadlock vector between preallocated append transactions and transactions that are acquired at completion time for other purposes (i.e., unwritten extent conversion or COW fork remaps). For example, if the ioend completion workqueue task executes on a batch of ioends that are sorted such that an append ioend sits at the tail, it's possible for the outstanding append transaction reservation to block allocation of transactions required to process preceding ioends in the list.
Append ioend completion is historically the common path for on-disk inode size updates. While file extending writes may have completed sometime earlier, the on-disk inode size is only updated after successful writeback completion. These transactions are preallocated serially from writeback context to mitigate concurrency and associated log reservation pressure across completions processed by multi-threaded workqueue tasks.
However, now that delalloc blocks unconditionally map to unwritten extents at physical block allocation time, size updates via append ioends are relatively rare. This means that inode size updates most commonly occur as part of the preexisting completion time transaction to convert unwritten extents. As a result, there is no longer a strong need to preallocate size update transactions.
Remove the preallocation of inode size update transactions to avoid the ioend completion processing log reservation deadlock. Instead, continue to send all potential size extending ioends to workqueue context for completion and allocate the transaction from that context. This ensures that no outstanding log reservation is owned by the ioend completion worker task when it begins to process ioends.
Signed-off-by: Brian Foster bfoster@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Reported-by: Christian Theune ct@flyingcircus.io Link: https://lore.kernel.org/linux-xfs/CAOQ4uxjj2UqA0h4Y31NbmpHksMhVrXfXjLG4Tnz3z... Signed-off-by: Amir Goldstein amir73il@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_aops.c | 45 +++------------------------------------------ 1 file changed, 3 insertions(+), 42 deletions(-)
--- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -39,33 +39,6 @@ static inline bool xfs_ioend_is_append(s XFS_I(ioend->io_inode)->i_d.di_size; }
-STATIC int -xfs_setfilesize_trans_alloc( - struct iomap_ioend *ioend) -{ - struct xfs_mount *mp = XFS_I(ioend->io_inode)->i_mount; - struct xfs_trans *tp; - int error; - - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_fsyncts, 0, 0, 0, &tp); - if (error) - return error; - - ioend->io_private = tp; - - /* - * We may pass freeze protection with a transaction. So tell lockdep - * we released it. - */ - __sb_writers_release(ioend->io_inode->i_sb, SB_FREEZE_FS); - /* - * We hand off the transaction to the completion thread now, so - * clear the flag here. - */ - xfs_trans_clear_context(tp); - return 0; -} - /* * Update on-disk file size now that data has been written to disk. */ @@ -191,12 +164,10 @@ xfs_end_ioend( error = xfs_reflink_end_cow(ip, offset, size); else if (ioend->io_type == IOMAP_UNWRITTEN) error = xfs_iomap_write_unwritten(ip, offset, size, false); - else - ASSERT(!xfs_ioend_is_append(ioend) || ioend->io_private);
+ if (!error && xfs_ioend_is_append(ioend)) + error = xfs_setfilesize(ip, ioend->io_offset, ioend->io_size); done: - if (ioend->io_private) - error = xfs_setfilesize_ioend(ioend, error); iomap_finish_ioends(ioend, error); memalloc_nofs_restore(nofs_flag); } @@ -246,7 +217,7 @@ xfs_end_io(
static inline bool xfs_ioend_needs_workqueue(struct iomap_ioend *ioend) { - return ioend->io_private || + return xfs_ioend_is_append(ioend) || ioend->io_type == IOMAP_UNWRITTEN || (ioend->io_flags & IOMAP_F_SHARED); } @@ -259,8 +230,6 @@ xfs_end_bio( struct xfs_inode *ip = XFS_I(ioend->io_inode); unsigned long flags;
- ASSERT(xfs_ioend_needs_workqueue(ioend)); - spin_lock_irqsave(&ip->i_ioend_lock, flags); if (list_empty(&ip->i_ioend_list)) WARN_ON_ONCE(!queue_work(ip->i_mount->m_unwritten_workqueue, @@ -510,14 +479,6 @@ xfs_prepare_ioend( ioend->io_offset, ioend->io_size); }
- /* Reserve log space if we might write beyond the on-disk inode size. */ - if (!status && - ((ioend->io_flags & IOMAP_F_SHARED) || - ioend->io_type != IOMAP_UNWRITTEN) && - xfs_ioend_is_append(ioend) && - !ioend->io_private) - status = xfs_setfilesize_trans_alloc(ioend); - memalloc_nofs_restore(nofs_flag);
if (xfs_ioend_needs_workqueue(ioend))
From: Brian Masney bmasney@redhat.com
commit b1cb00d51e361cf5af93649917d9790e1623647e upstream.
tsl2772_read_prox_diodes() will correctly parse the properties from device tree to determine which proximity diode(s) to read from, however it didn't actually set this value on the struct tsl2772_settings. Let's go ahead and fix that.
Reported-by: Tom Rix trix@redhat.com Link: https://lore.kernel.org/lkml/20230327120823.1369700-1-trix@redhat.com/ Fixes: 94cd1113aaa0 ("iio: tsl2772: add support for reading proximity led settings from device tree") Signed-off-by: Brian Masney bmasney@redhat.com Link: https://lore.kernel.org/r/20230404011455.339454-1-bmasney@redhat.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/light/tsl2772.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/iio/light/tsl2772.c +++ b/drivers/iio/light/tsl2772.c @@ -606,6 +606,7 @@ static int tsl2772_read_prox_diodes(stru return -EINVAL; } } + chip->settings.prox_diode = prox_diode_mask;
return 0; }
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit ef832747a82dfbc22a3702219cc716f449b24e4a upstream.
Syzbot still reports uninit-value in nilfs_add_checksums_on_logs() for KMSAN enabled kernels after applying commit 7397031622e0 ("nilfs2: initialize "struct nilfs_binfo_dat"->bi_pad field").
This is because the unused bytes at the end of each block in segment summaries are not initialized. So this fixes the issue by padding the unused bytes with null bytes.
Link: https://lkml.kernel.org/r/20230417173513.12598-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+048585f3f4227bb2b49b@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=048585f3f4227bb2b49b Cc: Alexander Potapenko glider@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/segment.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
--- a/fs/nilfs2/segment.c +++ b/fs/nilfs2/segment.c @@ -435,6 +435,23 @@ static int nilfs_segctor_reset_segment_b return 0; }
+/** + * nilfs_segctor_zeropad_segsum - zero pad the rest of the segment summary area + * @sci: segment constructor object + * + * nilfs_segctor_zeropad_segsum() zero-fills unallocated space at the end of + * the current segment summary block. + */ +static void nilfs_segctor_zeropad_segsum(struct nilfs_sc_info *sci) +{ + struct nilfs_segsum_pointer *ssp; + + ssp = sci->sc_blk_cnt > 0 ? &sci->sc_binfo_ptr : &sci->sc_finfo_ptr; + if (ssp->offset < ssp->bh->b_size) + memset(ssp->bh->b_data + ssp->offset, 0, + ssp->bh->b_size - ssp->offset); +} + static int nilfs_segctor_feed_segment(struct nilfs_sc_info *sci) { sci->sc_nblk_this_inc += sci->sc_curseg->sb_sum.nblocks; @@ -443,6 +460,7 @@ static int nilfs_segctor_feed_segment(st * The current segment is filled up * (internal code) */ + nilfs_segctor_zeropad_segsum(sci); sci->sc_curseg = NILFS_NEXT_SEGBUF(sci->sc_curseg); return nilfs_segctor_reset_segment_buffer(sci); } @@ -547,6 +565,7 @@ static int nilfs_segctor_add_file_block( goto retry; } if (unlikely(required)) { + nilfs_segctor_zeropad_segsum(sci); err = nilfs_segbuf_extend_segsum(segbuf); if (unlikely(err)) goto failed; @@ -1536,6 +1555,7 @@ static int nilfs_segctor_collect(struct nadd = min_t(int, nadd << 1, SC_MAX_SEGDELTA); sci->sc_stage = prev_stage; } + nilfs_segctor_zeropad_segsum(sci); nilfs_segctor_truncate_segments(sci, sci->sc_curseg, nilfs->ns_sufile); return 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 4b6d621c9d859ff89e68cebf6178652592676013 upstream.
When calling dev_set_name() memory is allocated for the name for the struct device. Once that structure device is registered, or attempted to be registerd, with the driver core, the driver core will handle cleaning up that memory when the device is removed from the system.
Unfortunatly for the memstick code, there is an error path that causes the struct device to never be registered, and so the memory allocated in dev_set_name will be leaked. Fix that leak by manually freeing it right before the memory for the device is freed.
Cc: Maxim Levitsky maximlevitsky@gmail.com Cc: Alex Dubov oakad@yahoo.com Cc: Ulf Hansson ulf.hansson@linaro.org Cc: "Rafael J. Wysocki" rafael@kernel.org Cc: Hans de Goede hdegoede@redhat.com Cc: Kay Sievers kay.sievers@vrfy.org Cc: linux-mmc@vger.kernel.org Fixes: 0252c3b4f018 ("memstick: struct device - replace bus_id with dev_name(), dev_set_name()") Cc: stable stable@kernel.org Co-developed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Co-developed-by: Mirsad Goran Todorovac mirsad.todorovac@alu.unizg.hr Signed-off-by: Mirsad Goran Todorovac mirsad.todorovac@alu.unizg.hr Link: https://lore.kernel.org/r/20230401200327.16800-1-gregkh@linuxfoundation.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/memstick/core/memstick.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/memstick/core/memstick.c +++ b/drivers/memstick/core/memstick.c @@ -412,6 +412,7 @@ static struct memstick_dev *memstick_all return card; err_out: host->card = old_card; + kfree_const(card->dev.kobj.name); kfree(card); return NULL; } @@ -470,8 +471,10 @@ static void memstick_check(struct work_s put_device(&card->dev); host->card = NULL; } - } else + } else { + kfree_const(card->dev.kobj.name); kfree(card); + } }
out_power_off:
From: Ondrej Mosnacek omosnace@redhat.com
commit 659c0ce1cb9efc7f58d380ca4bb2a51ae9e30553 upstream.
Linux Security Modules (LSMs) that implement the "capable" hook will usually emit an access denial message to the audit log whenever they "block" the current task from using the given capability based on their security policy.
The occurrence of a denial is used as an indication that the given task has attempted an operation that requires the given access permission, so the callers of functions that perform LSM permission checks must take care to avoid calling them too early (before it is decided if the permission is actually needed to perform the requested operation).
The __sys_setres[ug]id() functions violate this convention by first calling ns_capable_setid() and only then checking if the operation requires the capability or not. It means that any caller that has the capability granted by DAC (task's capability set) but not by MAC (LSMs) will generate a "denied" audit record, even if is doing an operation for which the capability is not required.
Fix this by reordering the checks such that ns_capable_setid() is checked last and -EPERM is returned immediately if it returns false.
While there, also do two small optimizations: * move the capability check before prepare_creds() and * bail out early in case of a no-op.
Link: https://lkml.kernel.org/r/20230217162154.837549-1-omosnace@redhat.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Cc: Eric W. Biederman ebiederm@xmission.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sys.c | 69 ++++++++++++++++++++++++++++++++++------------------------- 1 file changed, 40 insertions(+), 29 deletions(-)
--- a/kernel/sys.c +++ b/kernel/sys.c @@ -634,6 +634,7 @@ long __sys_setresuid(uid_t ruid, uid_t e struct cred *new; int retval; kuid_t kruid, keuid, ksuid; + bool ruid_new, euid_new, suid_new;
kruid = make_kuid(ns, ruid); keuid = make_kuid(ns, euid); @@ -648,25 +649,29 @@ long __sys_setresuid(uid_t ruid, uid_t e if ((suid != (uid_t) -1) && !uid_valid(ksuid)) return -EINVAL;
+ old = current_cred(); + + /* check for no-op */ + if ((ruid == (uid_t) -1 || uid_eq(kruid, old->uid)) && + (euid == (uid_t) -1 || (uid_eq(keuid, old->euid) && + uid_eq(keuid, old->fsuid))) && + (suid == (uid_t) -1 || uid_eq(ksuid, old->suid))) + return 0; + + ruid_new = ruid != (uid_t) -1 && !uid_eq(kruid, old->uid) && + !uid_eq(kruid, old->euid) && !uid_eq(kruid, old->suid); + euid_new = euid != (uid_t) -1 && !uid_eq(keuid, old->uid) && + !uid_eq(keuid, old->euid) && !uid_eq(keuid, old->suid); + suid_new = suid != (uid_t) -1 && !uid_eq(ksuid, old->uid) && + !uid_eq(ksuid, old->euid) && !uid_eq(ksuid, old->suid); + if ((ruid_new || euid_new || suid_new) && + !ns_capable_setid(old->user_ns, CAP_SETUID)) + return -EPERM; + new = prepare_creds(); if (!new) return -ENOMEM;
- old = current_cred(); - - retval = -EPERM; - if (!ns_capable_setid(old->user_ns, CAP_SETUID)) { - if (ruid != (uid_t) -1 && !uid_eq(kruid, old->uid) && - !uid_eq(kruid, old->euid) && !uid_eq(kruid, old->suid)) - goto error; - if (euid != (uid_t) -1 && !uid_eq(keuid, old->uid) && - !uid_eq(keuid, old->euid) && !uid_eq(keuid, old->suid)) - goto error; - if (suid != (uid_t) -1 && !uid_eq(ksuid, old->uid) && - !uid_eq(ksuid, old->euid) && !uid_eq(ksuid, old->suid)) - goto error; - } - if (ruid != (uid_t) -1) { new->uid = kruid; if (!uid_eq(kruid, old->uid)) { @@ -726,6 +731,7 @@ long __sys_setresgid(gid_t rgid, gid_t e struct cred *new; int retval; kgid_t krgid, kegid, ksgid; + bool rgid_new, egid_new, sgid_new;
krgid = make_kgid(ns, rgid); kegid = make_kgid(ns, egid); @@ -738,23 +744,28 @@ long __sys_setresgid(gid_t rgid, gid_t e if ((sgid != (gid_t) -1) && !gid_valid(ksgid)) return -EINVAL;
+ old = current_cred(); + + /* check for no-op */ + if ((rgid == (gid_t) -1 || gid_eq(krgid, old->gid)) && + (egid == (gid_t) -1 || (gid_eq(kegid, old->egid) && + gid_eq(kegid, old->fsgid))) && + (sgid == (gid_t) -1 || gid_eq(ksgid, old->sgid))) + return 0; + + rgid_new = rgid != (gid_t) -1 && !gid_eq(krgid, old->gid) && + !gid_eq(krgid, old->egid) && !gid_eq(krgid, old->sgid); + egid_new = egid != (gid_t) -1 && !gid_eq(kegid, old->gid) && + !gid_eq(kegid, old->egid) && !gid_eq(kegid, old->sgid); + sgid_new = sgid != (gid_t) -1 && !gid_eq(ksgid, old->gid) && + !gid_eq(ksgid, old->egid) && !gid_eq(ksgid, old->sgid); + if ((rgid_new || egid_new || sgid_new) && + !ns_capable_setid(old->user_ns, CAP_SETGID)) + return -EPERM; + new = prepare_creds(); if (!new) return -ENOMEM; - old = current_cred(); - - retval = -EPERM; - if (!ns_capable_setid(old->user_ns, CAP_SETGID)) { - if (rgid != (gid_t) -1 && !gid_eq(krgid, old->gid) && - !gid_eq(krgid, old->egid) && !gid_eq(krgid, old->sgid)) - goto error; - if (egid != (gid_t) -1 && !gid_eq(kegid, old->gid) && - !gid_eq(kegid, old->egid) && !gid_eq(kegid, old->sgid)) - goto error; - if (sgid != (gid_t) -1 && !gid_eq(ksgid, old->gid) && - !gid_eq(ksgid, old->egid) && !gid_eq(ksgid, old->sgid)) - goto error; - }
if (rgid != (gid_t) -1) new->gid = krgid;
From: Bhavya Kapoor b-kapoor@ti.com
commit 2265098fd6a6272fde3fd1be5761f2f5895bd99a upstream.
Timing Information in Datasheet assumes that HIGH_SPEED_ENA=1 should be set for SDR12 and SDR25 modes. But sdhci_am654 driver clears HIGH_SPEED_ENA register. Thus, Modify sdhci_am654 to not clear HIGH_SPEED_ENA (HOST_CONTROL[2]) bit for SDR12 and SDR25 speed modes.
Fixes: e374e87538f4 ("mmc: sdhci_am654: Clear HISPD_ENA in some lower speed modes") Signed-off-by: Bhavya Kapoor b-kapoor@ti.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230317092711.660897-1-b-kapoor@ti.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci_am654.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/mmc/host/sdhci_am654.c +++ b/drivers/mmc/host/sdhci_am654.c @@ -351,8 +351,6 @@ static void sdhci_am654_write_b(struct s */ case MMC_TIMING_SD_HS: case MMC_TIMING_MMC_HS: - case MMC_TIMING_UHS_SDR12: - case MMC_TIMING_UHS_SDR25: val &= ~SDHCI_CTRL_HISPD; } }
From: Peter Xu peterx@redhat.com
commit dd47ac428c3f5f3bcabe845f36be870fe6c20784 upstream.
Khugepaged collapse an anonymous thp in two rounds of scans. The 2nd round done in __collapse_huge_page_isolate() after hpage_collapse_scan_pmd(), during which all the locks will be released temporarily. It means the pgtable can change during this phase before 2nd round starts.
It's logically possible some ptes got wr-protected during this phase, and we can errornously collapse a thp without noticing some ptes are wr-protected by userfault. e1e267c7928f wanted to avoid it but it only did that for the 1st phase, not the 2nd phase.
Since __collapse_huge_page_isolate() happens after a round of small page swapins, we don't need to worry on any !present ptes - if it existed khugepaged will already bail out. So we only need to check present ptes with uffd-wp bit set there.
This is something I found only but never had a reproducer, I thought it was one caused a bug in Muhammad's recent pagemap new ioctl work, but it turns out it's not the cause of that but an userspace bug. However this seems to still be a real bug even with a very small race window, still worth to have it fixed and copy stable.
Link: https://lkml.kernel.org/r/20230405155120.3608140-1-peterx@redhat.com Fixes: e1e267c7928f ("khugepaged: skip collapse if uffd-wp detected") Signed-off-by: Peter Xu peterx@redhat.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Yang Shi shy828301@gmail.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Axel Rasmussen axelrasmussen@google.com Cc: Mike Rapoport rppt@linux.vnet.ibm.com Cc: Nadav Amit nadav.amit@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/khugepaged.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -622,6 +622,10 @@ static int __collapse_huge_page_isolate( result = SCAN_PTE_NON_PRESENT; goto out; } + if (pte_uffd_wp(pteval)) { + result = SCAN_PTE_UFFD_WP; + goto out; + } page = vm_normal_page(vma, address, pteval); if (unlikely(!page)) { result = SCAN_PAGE_NULL;
From: Qais Yousef qais.yousef@arm.com
commit b48e16a69792b5dc4a09d6807369d11b2970cc36 upstream.
So that the new uclamp rules in regard to migration margin and capacity pressure are taken into account correctly.
Fixes: a7008c07a568 ("sched/fair: Make task_fits_capacity() consider uclamp restrictions") Co-developed-by: Vincent Guittot vincent.guittot@linaro.org Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-3-qais.yousef@arm.com (cherry picked from commit b48e16a69792b5dc4a09d6807369d11b2970cc36) Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 26 ++++++++++++++++---------- kernel/sched/sched.h | 9 +++++++++ 2 files changed, 25 insertions(+), 10 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4197,10 +4197,12 @@ static inline int util_fits_cpu(unsigned return fits; }
-static inline int task_fits_capacity(struct task_struct *p, - unsigned long capacity) +static inline int task_fits_cpu(struct task_struct *p, int cpu) { - return fits_capacity(uclamp_task_util(p), capacity); + unsigned long uclamp_min = uclamp_eff_value(p, UCLAMP_MIN); + unsigned long uclamp_max = uclamp_eff_value(p, UCLAMP_MAX); + unsigned long util = task_util_est(p); + return util_fits_cpu(util, uclamp_min, uclamp_max, cpu); }
static inline void update_misfit_status(struct task_struct *p, struct rq *rq) @@ -4213,7 +4215,7 @@ static inline void update_misfit_status( return; }
- if (task_fits_capacity(p, capacity_of(cpu_of(rq)))) { + if (task_fits_cpu(p, cpu_of(rq))) { rq->misfit_task_load = 0; return; } @@ -7942,7 +7944,7 @@ static int detach_tasks(struct lb_env *e
case migrate_misfit: /* This is not a misfit task */ - if (task_fits_capacity(p, capacity_of(env->src_cpu))) + if (task_fits_cpu(p, env->src_cpu)) goto next;
env->imbalance = 0; @@ -8884,6 +8886,10 @@ static inline void update_sg_wakeup_stat
memset(sgs, 0, sizeof(*sgs));
+ /* Assume that task can't fit any CPU of the group */ + if (sd->flags & SD_ASYM_CPUCAPACITY) + sgs->group_misfit_task_load = 1; + for_each_cpu(i, sched_group_span(group)) { struct rq *rq = cpu_rq(i); unsigned int local; @@ -8903,12 +8909,12 @@ static inline void update_sg_wakeup_stat if (!nr_running && idle_cpu_without(i, p)) sgs->idle_cpus++;
- } + /* Check if task fits in the CPU */ + if (sd->flags & SD_ASYM_CPUCAPACITY && + sgs->group_misfit_task_load && + task_fits_cpu(p, i)) + sgs->group_misfit_task_load = 0;
- /* Check if task fits in the group */ - if (sd->flags & SD_ASYM_CPUCAPACITY && - !task_fits_capacity(p, group->sgc->max_capacity)) { - sgs->group_misfit_task_load = 1; }
sgs->group_capacity = group->sgc->capacity; --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2468,6 +2468,15 @@ static inline bool uclamp_is_used(void) return static_branch_likely(&sched_uclamp_used); } #else /* CONFIG_UCLAMP_TASK */ +static inline unsigned long uclamp_eff_value(struct task_struct *p, + enum uclamp_id clamp_id) +{ + if (clamp_id == UCLAMP_MIN) + return 0; + + return SCHED_CAPACITY_SCALE; +} + static inline unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util, struct task_struct *p)
From: Qais Yousef qais.yousef@arm.com
commit 244226035a1f9b2b6c326e55ae5188fab4f428cb upstream.
As reported by Yun Hsiang [1], if a task has its uclamp_min >= 0.8 * 1024, it'll always pick the previous CPU because fits_capacity() will always return false in this case.
The new util_fits_cpu() logic should handle this correctly for us beside more corner cases where similar failures could occur, like when using UCLAMP_MAX.
We open code uclamp_rq_util_with() except for the clamp() part, util_fits_cpu() needs the 'raw' values to be passed to it.
Also introduce uclamp_rq_{set, get}() shorthand accessors to get uclamp value for the rq. Makes the code more readable and ensures the right rules (use READ_ONCE/WRITE_ONCE) are respected transparently.
[1] https://lists.linaro.org/pipermail/eas-dev/2020-July/001488.html
Fixes: 1d42509e475c ("sched/fair: Make EAS wakeup placement consider uclamp restrictions") Reported-by: Yun Hsiang hsiang023167@gmail.com Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-4-qais.yousef@arm.com (cherry picked from commit 244226035a1f9b2b6c326e55ae5188fab4f428cb) [Fix trivial conflict in kernel/sched/fair.c due to new automatic variables in master vs 5.10] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 10 +++++----- kernel/sched/fair.c | 26 ++++++++++++++++++++++++-- kernel/sched/sched.h | 42 +++++++++++++++++++++++++++++++++++++++--- 3 files changed, 68 insertions(+), 10 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -980,7 +980,7 @@ static inline void uclamp_idle_reset(str if (!(rq->uclamp_flags & UCLAMP_FLAG_IDLE)) return;
- WRITE_ONCE(rq->uclamp[clamp_id].value, clamp_value); + uclamp_rq_set(rq, clamp_id, clamp_value); }
static inline @@ -1158,8 +1158,8 @@ static inline void uclamp_rq_inc_id(stru if (bucket->tasks == 1 || uc_se->value > bucket->value) bucket->value = uc_se->value;
- if (uc_se->value > READ_ONCE(uc_rq->value)) - WRITE_ONCE(uc_rq->value, uc_se->value); + if (uc_se->value > uclamp_rq_get(rq, clamp_id)) + uclamp_rq_set(rq, clamp_id, uc_se->value); }
/* @@ -1225,7 +1225,7 @@ static inline void uclamp_rq_dec_id(stru if (likely(bucket->tasks)) return;
- rq_clamp = READ_ONCE(uc_rq->value); + rq_clamp = uclamp_rq_get(rq, clamp_id); /* * Defensive programming: this should never happen. If it happens, * e.g. due to future modification, warn and fixup the expected value. @@ -1233,7 +1233,7 @@ static inline void uclamp_rq_dec_id(stru SCHED_WARN_ON(bucket->value > rq_clamp); if (bucket->value >= rq_clamp) { bkt_clamp = uclamp_rq_max_value(rq, clamp_id, uc_se->value); - WRITE_ONCE(uc_rq->value, bkt_clamp); + uclamp_rq_set(rq, clamp_id, bkt_clamp); } }
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6802,6 +6802,8 @@ compute_energy(struct task_struct *p, in static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu) { unsigned long prev_delta = ULONG_MAX, best_delta = ULONG_MAX; + unsigned long p_util_min = uclamp_is_used() ? uclamp_eff_value(p, UCLAMP_MIN) : 0; + unsigned long p_util_max = uclamp_is_used() ? uclamp_eff_value(p, UCLAMP_MAX) : 1024; struct root_domain *rd = cpu_rq(smp_processor_id())->rd; unsigned long cpu_cap, util, base_energy = 0; int cpu, best_energy_cpu = prev_cpu; @@ -6829,6 +6831,8 @@ static int find_energy_efficient_cpu(str
for (; pd; pd = pd->next) { unsigned long cur_delta, spare_cap, max_spare_cap = 0; + unsigned long rq_util_min, rq_util_max; + unsigned long util_min, util_max; unsigned long base_energy_pd; int max_spare_cap_cpu = -1;
@@ -6852,8 +6856,26 @@ static int find_energy_efficient_cpu(str * much capacity we can get out of the CPU; this is * aligned with schedutil_cpu_util(). */ - util = uclamp_rq_util_with(cpu_rq(cpu), util, p); - if (!fits_capacity(util, cpu_cap)) + if (uclamp_is_used()) { + if (uclamp_rq_is_idle(cpu_rq(cpu))) { + util_min = p_util_min; + util_max = p_util_max; + } else { + /* + * Open code uclamp_rq_util_with() except for + * the clamp() part. Ie: apply max aggregation + * only. util_fits_cpu() logic requires to + * operate on non clamped util but must use the + * max-aggregated uclamp_{min, max}. + */ + rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN); + rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX); + + util_min = max(rq_util_min, p_util_min); + util_max = max(rq_util_max, p_util_max); + } + } + if (!util_fits_cpu(util, util_min, util_max, cpu)) continue;
/* Always use prev_cpu as a candidate. */ --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2402,6 +2402,23 @@ static inline void cpufreq_update_util(s #ifdef CONFIG_UCLAMP_TASK unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
+static inline unsigned long uclamp_rq_get(struct rq *rq, + enum uclamp_id clamp_id) +{ + return READ_ONCE(rq->uclamp[clamp_id].value); +} + +static inline void uclamp_rq_set(struct rq *rq, enum uclamp_id clamp_id, + unsigned int value) +{ + WRITE_ONCE(rq->uclamp[clamp_id].value, value); +} + +static inline bool uclamp_rq_is_idle(struct rq *rq) +{ + return rq->uclamp_flags & UCLAMP_FLAG_IDLE; +} + /** * uclamp_rq_util_with - clamp @util with @rq and @p effective uclamp values. * @rq: The rq to clamp against. Must not be NULL. @@ -2437,12 +2454,12 @@ unsigned long uclamp_rq_util_with(struct * Ignore last runnable task's max clamp, as this task will * reset it. Similarly, no need to read the rq's min clamp. */ - if (rq->uclamp_flags & UCLAMP_FLAG_IDLE) + if (uclamp_rq_is_idle(rq)) goto out; }
- min_util = max_t(unsigned long, min_util, READ_ONCE(rq->uclamp[UCLAMP_MIN].value)); - max_util = max_t(unsigned long, max_util, READ_ONCE(rq->uclamp[UCLAMP_MAX].value)); + min_util = max_t(unsigned long, min_util, uclamp_rq_get(rq, UCLAMP_MIN)); + max_util = max_t(unsigned long, max_util, uclamp_rq_get(rq, UCLAMP_MAX)); out: /* * Since CPU's {min,max}_util clamps are MAX aggregated considering @@ -2488,6 +2505,25 @@ static inline bool uclamp_is_used(void) { return false; } + +static inline unsigned long uclamp_rq_get(struct rq *rq, + enum uclamp_id clamp_id) +{ + if (clamp_id == UCLAMP_MIN) + return 0; + + return SCHED_CAPACITY_SCALE; +} + +static inline void uclamp_rq_set(struct rq *rq, enum uclamp_id clamp_id, + unsigned int value) +{ +} + +static inline bool uclamp_rq_is_idle(struct rq *rq) +{ + return false; +} #endif /* CONFIG_UCLAMP_TASK */
#ifdef arch_scale_freq_capacity
From: Qais Yousef qais.yousef@arm.com
commit b759caa1d9f667b94727b2ad12589cbc4ce13a82 upstream.
Use the new util_fits_cpu() to ensure migration margin and capacity pressure are taken into account correctly when uclamp is being used otherwise we will fail to consider CPUs as fitting in scenarios where they should.
Fixes: b4c9c9f15649 ("sched/fair: Prefer prev cpu in asymmetric wakeup path") Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-5-qais.yousef@arm.com (cherry picked from commit b759caa1d9f667b94727b2ad12589cbc4ce13a82) Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6394,21 +6394,23 @@ static int select_idle_cpu(struct task_s static int select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target) { - unsigned long task_util, best_cap = 0; + unsigned long task_util, util_min, util_max, best_cap = 0; int cpu, best_cpu = -1; struct cpumask *cpus;
cpus = this_cpu_cpumask_var_ptr(select_idle_mask); cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
- task_util = uclamp_task_util(p); + task_util = task_util_est(p); + util_min = uclamp_eff_value(p, UCLAMP_MIN); + util_max = uclamp_eff_value(p, UCLAMP_MAX);
for_each_cpu_wrap(cpu, cpus, target) { unsigned long cpu_cap = capacity_of(cpu);
if (!available_idle_cpu(cpu) && !sched_idle_cpu(cpu)) continue; - if (fits_capacity(task_util, cpu_cap)) + if (util_fits_cpu(task_util, util_min, util_max, cpu)) return cpu;
if (cpu_cap > best_cap) {
From: Qais Yousef qais.yousef@arm.com
commit a2e7f03ed28fce26c78b985f87913b6ce3accf9d upstream.
Use the new util_fits_cpu() to ensure migration margin and capacity pressure are taken into account correctly when uclamp is being used otherwise we will fail to consider CPUs as fitting in scenarios where they should.
s/asym_fits_capacity/asym_fits_cpu/ to better reflect what it does now.
Fixes: b4c9c9f15649 ("sched/fair: Prefer prev cpu in asymmetric wakeup path") Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-6-qais.yousef@arm.com (cherry picked from commit a2e7f03ed28fce26c78b985f87913b6ce3accf9d) [Conflict in kernel/sched/fair.c due different name of static key wrapper function and slightly different if condition block in one of the asym_fits_cpu() call sites] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6422,10 +6422,13 @@ select_idle_capacity(struct task_struct return best_cpu; }
-static inline bool asym_fits_capacity(unsigned long task_util, int cpu) +static inline bool asym_fits_cpu(unsigned long util, + unsigned long util_min, + unsigned long util_max, + int cpu) { if (static_branch_unlikely(&sched_asym_cpucapacity)) - return fits_capacity(task_util, capacity_of(cpu)); + return util_fits_cpu(util, util_min, util_max, cpu);
return true; } @@ -6436,7 +6439,7 @@ static inline bool asym_fits_capacity(un static int select_idle_sibling(struct task_struct *p, int prev, int target) { struct sched_domain *sd; - unsigned long task_util; + unsigned long task_util, util_min, util_max; int i, recent_used_cpu;
/* @@ -6445,11 +6448,13 @@ static int select_idle_sibling(struct ta */ if (static_branch_unlikely(&sched_asym_cpucapacity)) { sync_entity_load_avg(&p->se); - task_util = uclamp_task_util(p); + task_util = task_util_est(p); + util_min = uclamp_eff_value(p, UCLAMP_MIN); + util_max = uclamp_eff_value(p, UCLAMP_MAX); }
if ((available_idle_cpu(target) || sched_idle_cpu(target)) && - asym_fits_capacity(task_util, target)) + asym_fits_cpu(task_util, util_min, util_max, target)) return target;
/* @@ -6457,7 +6462,7 @@ static int select_idle_sibling(struct ta */ if (prev != target && cpus_share_cache(prev, target) && (available_idle_cpu(prev) || sched_idle_cpu(prev)) && - asym_fits_capacity(task_util, prev)) + asym_fits_cpu(task_util, util_min, util_max, prev)) return prev;
/* @@ -6472,7 +6477,7 @@ static int select_idle_sibling(struct ta in_task() && prev == smp_processor_id() && this_rq()->nr_running <= 1 && - asym_fits_capacity(task_util, prev)) { + asym_fits_cpu(task_util, util_min, util_max, prev)) { return prev; }
@@ -6483,7 +6488,7 @@ static int select_idle_sibling(struct ta cpus_share_cache(recent_used_cpu, target) && (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) && cpumask_test_cpu(p->recent_used_cpu, p->cpus_ptr) && - asym_fits_capacity(task_util, recent_used_cpu)) { + asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) { /* * Replace recent_used_cpu with prev as it is a potential * candidate for the next wake:
From: Qais Yousef qais.yousef@arm.com
commit c56ab1b3506ba0e7a872509964b100912bde165d upstream.
So that it is now uclamp aware.
This fixes a major problem of busy tasks capped with UCLAMP_MAX keeping the system in overutilized state which disables EAS and leads to wasting energy in the long run.
Without this patch running a busy background activity like JIT compilation on Pixel 6 causes the system to be in overutilized state 74.5% of the time.
With this patch this goes down to 9.79%.
It also fixes another problem when long running tasks that have their UCLAMP_MIN changed while running such that they need to upmigrate to honour the new UCLAMP_MIN value. The upmigration doesn't get triggered because overutilized state never gets set in this state, hence misfit migration never happens at tick in this case until the task wakes up again.
Fixes: af24bde8df202 ("sched/uclamp: Add uclamp support to energy_compute()") Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-7-qais.yousef@arm.com (cherry picked from commit c56ab1b3506ba0e7a872509964b100912bde165d) [Conflict in kernel/sched/fair.c: use cpu_util() instead of cpu_util_cfs()] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5657,7 +5657,10 @@ static inline unsigned long cpu_util(int
static inline bool cpu_overutilized(int cpu) { - return !fits_capacity(cpu_util(cpu), capacity_of(cpu)); + unsigned long rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN); + unsigned long rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX); + + return !util_fits_cpu(cpu_util(cpu), rq_util_min, rq_util_max, cpu); }
static inline void update_overutilized_status(struct rq *rq)
From: Qais Yousef qais.yousef@arm.com
commit d81304bc6193554014d4372a01debdf65e1e9a4d upstream.
If the utilization of the woken up task is 0, we skip the energy calculation because it has no impact.
But if the task is boosted (uclamp_min != 0) will have an impact on task placement and frequency selection. Only skip if the util is truly 0 after applying uclamp values.
Change uclamp_task_cpu() signature to avoid unnecessary additional calls to uclamp_eff_get(). feec() is the only user now.
Fixes: 732cd75b8c920 ("sched/fair: Select an energy-efficient CPU on task wake-up") Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-8-qais.yousef@arm.com (cherry picked from commit d81304bc6193554014d4372a01debdf65e1e9a4d) Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3928,14 +3928,16 @@ static inline unsigned long task_util_es }
#ifdef CONFIG_UCLAMP_TASK -static inline unsigned long uclamp_task_util(struct task_struct *p) +static inline unsigned long uclamp_task_util(struct task_struct *p, + unsigned long uclamp_min, + unsigned long uclamp_max) { - return clamp(task_util_est(p), - uclamp_eff_value(p, UCLAMP_MIN), - uclamp_eff_value(p, UCLAMP_MAX)); + return clamp(task_util_est(p), uclamp_min, uclamp_max); } #else -static inline unsigned long uclamp_task_util(struct task_struct *p) +static inline unsigned long uclamp_task_util(struct task_struct *p, + unsigned long uclamp_min, + unsigned long uclamp_max) { return task_util_est(p); } @@ -6836,7 +6838,7 @@ static int find_energy_efficient_cpu(str goto fail;
sync_entity_load_avg(&p->se); - if (!task_util_est(p)) + if (!uclamp_task_util(p, p_util_min, p_util_max)) goto unlock;
for (; pd; pd = pd->next) {
From: Qais Yousef qais.yousef@arm.com
commit: 44c7b80bffc3a657a36857098d5d9c49d94e652b upstream.
Check each performance domain to see if thermal pressure is causing its capacity to be lower than another performance domain.
We assume that each performance domain has CPUs with the same capacities, which is similar to an assumption made in energy_model.c
We also assume that thermal pressure impacts all CPUs in a performance domain equally.
If there're multiple performance domains with the same capacity_orig, we will trigger a capacity inversion if the domain is under thermal pressure.
The new cpu_in_capacity_inversion() should help users to know when information about capacity_orig are not reliable and can opt in to use the inverted capacity as the 'actual' capacity_orig.
Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-9-qais.yousef@arm.com (cherry picked from commit 44c7b80bffc3a657a36857098d5d9c49d94e652b) [Trivial conflict in kernel/sched/fair.c and sched.h due to code shuffling] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++--- kernel/sched/sched.h | 19 +++++++++++++++ 2 files changed, 79 insertions(+), 3 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8376,16 +8376,73 @@ static unsigned long scale_rt_capacity(i
static void update_cpu_capacity(struct sched_domain *sd, int cpu) { + unsigned long capacity_orig = arch_scale_cpu_capacity(cpu); unsigned long capacity = scale_rt_capacity(cpu); struct sched_group *sdg = sd->groups; + struct rq *rq = cpu_rq(cpu);
- cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(cpu); + rq->cpu_capacity_orig = capacity_orig;
if (!capacity) capacity = 1;
- cpu_rq(cpu)->cpu_capacity = capacity; - trace_sched_cpu_capacity_tp(cpu_rq(cpu)); + rq->cpu_capacity = capacity; + + /* + * Detect if the performance domain is in capacity inversion state. + * + * Capacity inversion happens when another perf domain with equal or + * lower capacity_orig_of() ends up having higher capacity than this + * domain after subtracting thermal pressure. + * + * We only take into account thermal pressure in this detection as it's + * the only metric that actually results in *real* reduction of + * capacity due to performance points (OPPs) being dropped/become + * unreachable due to thermal throttling. + * + * We assume: + * * That all cpus in a perf domain have the same capacity_orig + * (same uArch). + * * Thermal pressure will impact all cpus in this perf domain + * equally. + */ + if (static_branch_unlikely(&sched_asym_cpucapacity)) { + unsigned long inv_cap = capacity_orig - thermal_load_avg(rq); + struct perf_domain *pd = rcu_dereference(rq->rd->pd); + + rq->cpu_capacity_inverted = 0; + + for (; pd; pd = pd->next) { + struct cpumask *pd_span = perf_domain_span(pd); + unsigned long pd_cap_orig, pd_cap; + + cpu = cpumask_any(pd_span); + pd_cap_orig = arch_scale_cpu_capacity(cpu); + + if (capacity_orig < pd_cap_orig) + continue; + + /* + * handle the case of multiple perf domains have the + * same capacity_orig but one of them is under higher + * thermal pressure. We record it as capacity + * inversion. + */ + if (capacity_orig == pd_cap_orig) { + pd_cap = pd_cap_orig - thermal_load_avg(cpu_rq(cpu)); + + if (pd_cap > inv_cap) { + rq->cpu_capacity_inverted = inv_cap; + break; + } + } else if (pd_cap_orig > inv_cap) { + rq->cpu_capacity_inverted = inv_cap; + break; + } + } + } + + trace_sched_cpu_capacity_tp(rq);
sdg->sgc->capacity = capacity; sdg->sgc->min_capacity = capacity; --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -973,6 +973,7 @@ struct rq {
unsigned long cpu_capacity; unsigned long cpu_capacity_orig; + unsigned long cpu_capacity_inverted;
struct callback_head *balance_callback;
@@ -2539,6 +2540,24 @@ static inline unsigned long capacity_ori { return cpu_rq(cpu)->cpu_capacity_orig; } + +/* + * Returns inverted capacity if the CPU is in capacity inversion state. + * 0 otherwise. + * + * Capacity inversion detection only considers thermal impact where actual + * performance points (OPPs) gets dropped. + * + * Capacity inversion state happens when another performance domain that has + * equal or lower capacity_orig_of() becomes effectively larger than the perf + * domain this CPU belongs to due to thermal pressure throttling it hard. + * + * See comment in update_cpu_capacity(). + */ +static inline unsigned long cpu_in_capacity_inversion(int cpu) +{ + return cpu_rq(cpu)->cpu_capacity_inverted; +} #endif
/**
From: Qais Yousef qais.yousef@arm.com
commit: aa69c36f31aadc1669bfa8a3de6a47b5e6c98ee8 upstream.
We do consider thermal pressure in util_fits_cpu() for uclamp_min only. With the exception of the biggest cores which by definition are the max performance point of the system and all tasks by definition should fit.
Even under thermal pressure, the capacity of the biggest CPU is the highest in the system and should still fit every task. Except when it reaches capacity inversion point, then this is no longer true.
We can handle this by using the inverted capacity as capacity_orig in util_fits_cpu(). Which not only addresses the problem above, but also ensure uclamp_max now considers the inverted capacity. Force fitting a task when a CPU is in this adverse state will contribute to making the thermal throttling last longer.
Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-10-qais.yousef@arm.com (cherry picked from commit aa69c36f31aadc1669bfa8a3de6a47b5e6c98ee8) Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4113,12 +4113,16 @@ static inline int util_fits_cpu(unsigned * For uclamp_max, we can tolerate a drop in performance level as the * goal is to cap the task. So it's okay if it's getting less. * - * In case of capacity inversion, which is not handled yet, we should - * honour the inverted capacity for both uclamp_min and uclamp_max all - * the time. + * In case of capacity inversion we should honour the inverted capacity + * for both uclamp_min and uclamp_max all the time. */ - capacity_orig = capacity_orig_of(cpu); - capacity_orig_thermal = capacity_orig - arch_scale_thermal_pressure(cpu); + capacity_orig = cpu_in_capacity_inversion(cpu); + if (capacity_orig) { + capacity_orig_thermal = capacity_orig; + } else { + capacity_orig = capacity_orig_of(cpu); + capacity_orig_thermal = capacity_orig - arch_scale_thermal_pressure(cpu); + }
/* * We want to force a task to fit a cpu as implied by uclamp_max.
From: Qais Yousef qyousef@layalina.io
commit e26fd28db82899be71b4b949527373d0a6be1e65 upstream.
Addresses the following warnings:
config: riscv-randconfig-m031-20221111 compiler: riscv64-linux-gcc (GCC) 12.1.0
smatch warnings: kernel/sched/fair.c:7263 find_energy_efficient_cpu() error: uninitialized symbol 'util_min'. kernel/sched/fair.c:7263 find_energy_efficient_cpu() error: uninitialized symbol 'util_max'.
Fixes: 244226035a1f ("sched/uclamp: Fix fits_capacity() check in feec()") Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter error27@gmail.com Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Vincent Guittot vincent.guittot@linaro.org Link: https://lore.kernel.org/r/20230112122708.330667-2-qyousef@layalina.io (cherry picked from commit e26fd28db82899be71b4b949527373d0a6be1e65) [Conflict in kernel/sched/fair.c due to new automatic variable in master vs 5.10 and new code around for loop] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 35 ++++++++++++++++------------------- 1 file changed, 16 insertions(+), 19 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6846,9 +6846,9 @@ static int find_energy_efficient_cpu(str goto unlock;
for (; pd; pd = pd->next) { + unsigned long util_min = p_util_min, util_max = p_util_max; unsigned long cur_delta, spare_cap, max_spare_cap = 0; unsigned long rq_util_min, rq_util_max; - unsigned long util_min, util_max; unsigned long base_energy_pd; int max_spare_cap_cpu = -1;
@@ -6857,6 +6857,8 @@ static int find_energy_efficient_cpu(str base_energy += base_energy_pd;
for_each_cpu_and(cpu, perf_domain_span(pd), sched_domain_span(sd)) { + struct rq *rq = cpu_rq(cpu); + if (!cpumask_test_cpu(cpu, p->cpus_ptr)) continue;
@@ -6872,24 +6874,19 @@ static int find_energy_efficient_cpu(str * much capacity we can get out of the CPU; this is * aligned with schedutil_cpu_util(). */ - if (uclamp_is_used()) { - if (uclamp_rq_is_idle(cpu_rq(cpu))) { - util_min = p_util_min; - util_max = p_util_max; - } else { - /* - * Open code uclamp_rq_util_with() except for - * the clamp() part. Ie: apply max aggregation - * only. util_fits_cpu() logic requires to - * operate on non clamped util but must use the - * max-aggregated uclamp_{min, max}. - */ - rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN); - rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX); - - util_min = max(rq_util_min, p_util_min); - util_max = max(rq_util_max, p_util_max); - } + if (uclamp_is_used() && !uclamp_rq_is_idle(rq)) { + /* + * Open code uclamp_rq_util_with() except for + * the clamp() part. Ie: apply max aggregation + * only. util_fits_cpu() logic requires to + * operate on non clamped util but must use the + * max-aggregated uclamp_{min, max}. + */ + rq_util_min = uclamp_rq_get(rq, UCLAMP_MIN); + rq_util_max = uclamp_rq_get(rq, UCLAMP_MAX); + + util_min = max(rq_util_min, p_util_min); + util_max = max(rq_util_max, p_util_max); } if (!util_fits_cpu(util, util_min, util_max, cpu)) continue;
From: Qais Yousef qyousef@layalina.io
commit da07d2f9c153e457e845d4dcfdd13568d71d18a4 upstream.
Traversing the Perf Domains requires rcu_read_lock() to be held and is conditional on sched_energy_enabled(). Ensure right protections applied.
Also skip capacity inversion detection for our own pd; which was an error.
Fixes: 44c7b80bffc3 ("sched/fair: Detect capacity inversion") Reported-by: Dietmar Eggemann dietmar.eggemann@arm.com Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Vincent Guittot vincent.guittot@linaro.org Link: https://lore.kernel.org/r/20230112122708.330667-3-qyousef@layalina.io (cherry picked from commit da07d2f9c153e457e845d4dcfdd13568d71d18a4) Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8407,16 +8407,23 @@ static void update_cpu_capacity(struct s * * Thermal pressure will impact all cpus in this perf domain * equally. */ - if (static_branch_unlikely(&sched_asym_cpucapacity)) { + if (sched_energy_enabled()) { unsigned long inv_cap = capacity_orig - thermal_load_avg(rq); - struct perf_domain *pd = rcu_dereference(rq->rd->pd); + struct perf_domain *pd;
+ rcu_read_lock(); + + pd = rcu_dereference(rq->rd->pd); rq->cpu_capacity_inverted = 0;
for (; pd; pd = pd->next) { struct cpumask *pd_span = perf_domain_span(pd); unsigned long pd_cap_orig, pd_cap;
+ /* We can't be inverted against our own pd */ + if (cpumask_test_cpu(cpu_of(rq), pd_span)) + continue; + cpu = cpumask_any(pd_span); pd_cap_orig = arch_scale_cpu_capacity(cpu);
@@ -8441,6 +8448,8 @@ static void update_cpu_capacity(struct s break; } } + + rcu_read_unlock(); }
trace_sched_cpu_capacity_tp(rq);
From: Jiaxun Yang jiaxun.yang@flygoat.com
commit 6dcbd0a69c84a8ae7a442840a8cf6b1379dc8f16 upstream.
MIPS's exit sections are discarded at runtime as well.
Fixes link error: `.exit.text' referenced in section `__jump_table' of fs/fuse/inode.o: defined in discarded section `.exit.text' of fs/fuse/inode.o
Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Reported-by: "kernelci.org bot" bot@kernelci.org Signed-off-by: Jiaxun Yang jiaxun.yang@flygoat.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/mips/kernel/vmlinux.lds.S | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/mips/kernel/vmlinux.lds.S +++ b/arch/mips/kernel/vmlinux.lds.S @@ -15,6 +15,8 @@ #define EMITS_PT_NOTE #endif
+#define RUNTIME_DISCARD_EXIT + #include <asm-generic/vmlinux.lds.h>
#undef mips
From: Salvatore Bonaccorso carnil@debian.org
In upstream commit 77e52ae35463 ("futex: Move to kernel/futex/") the futex code from kernel/futex.c was moved into kernel/futex/core.c in preparation of the split-up of the implementation in various files.
Point kernel-doc references to the new files as otherwise the documentation shows errors on build:
[...] Error: Cannot open file ./kernel/futex.c Error: Cannot open file ./kernel/futex.c [...] WARNING: kernel-doc './scripts/kernel-doc -rst -enable-lineno -sphinx-version 3.4.3 -internal ./kernel/futex.c' failed with return code 2
There is no direct upstream commit for this change. It is made in analogy to commit bc67f1c454fb ("docs: futex: Fix kernel-doc references") applied as consequence of the restructuring of the futex code.
Fixes: 77e52ae35463 ("futex: Move to kernel/futex/") Signed-off-by: Salvatore Bonaccorso carnil@debian.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/kernel-hacking/locking.rst | 2 +- Documentation/translations/it_IT/kernel-hacking/locking.rst | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/Documentation/kernel-hacking/locking.rst +++ b/Documentation/kernel-hacking/locking.rst @@ -1358,7 +1358,7 @@ Mutex API reference Futex API reference ===================
-.. kernel-doc:: kernel/futex.c +.. kernel-doc:: kernel/futex/core.c :internal:
Further reading --- a/Documentation/translations/it_IT/kernel-hacking/locking.rst +++ b/Documentation/translations/it_IT/kernel-hacking/locking.rst @@ -1400,7 +1400,7 @@ Riferimento per l'API dei Mutex Riferimento per l'API dei Futex ===============================
-.. kernel-doc:: kernel/futex.c +.. kernel-doc:: kernel/futex/core.c :internal:
Approfondimenti
From: Alyssa Ross hi@alyssa.is
commit d83806c4c0cccc0d6d3c3581a11983a9c186a138 upstream.
Since 32ef9e5054ec, -Wa,-gdwarf-2 is no longer used in KBUILD_AFLAGS. Instead, it includes -g, the appropriate -gdwarf-* flag, and also the -Wa versions of both of those if building with Clang and GNU as. As a result, debug info was being generated for the purgatory objects, even though the intention was that it not be.
Fixes: 32ef9e5054ec ("Makefile.debug: re-enable debug info for .S files") Signed-off-by: Alyssa Ross hi@alyssa.is Cc: stable@vger.kernel.org Acked-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/purgatory/Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/arch/x86/purgatory/Makefile +++ b/arch/x86/purgatory/Makefile @@ -64,8 +64,7 @@ CFLAGS_sha256.o += $(PURGATORY_CFLAGS) CFLAGS_REMOVE_string.o += $(PURGATORY_CFLAGS_REMOVE) CFLAGS_string.o += $(PURGATORY_CFLAGS)
-AFLAGS_REMOVE_setup-x86_$(BITS).o += -Wa,-gdwarf-2 -AFLAGS_REMOVE_entry64.o += -Wa,-gdwarf-2 +asflags-remove-y += -g -Wa,-gdwarf-2
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE $(call if_changed,ld)
From: Miklos Szeredi mszeredi@redhat.com
commit 833c5a42e28beeefa1f9bd476a63fe8050c1e8ca upstream.
Avoid duplicating error cleanup.
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Bo yb203166@antfin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/virtio_fs.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-)
--- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -1440,22 +1440,14 @@ static int virtio_fs_get_tree(struct fs_ return -EINVAL; }
+ err = -ENOMEM; fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL); - if (!fc) { - mutex_lock(&virtio_fs_mutex); - virtio_fs_put(fs); - mutex_unlock(&virtio_fs_mutex); - return -ENOMEM; - } + if (!fc) + goto out_err;
fm = kzalloc(sizeof(struct fuse_mount), GFP_KERNEL); - if (!fm) { - mutex_lock(&virtio_fs_mutex); - virtio_fs_put(fs); - mutex_unlock(&virtio_fs_mutex); - kfree(fc); - return -ENOMEM; - } + if (!fm) + goto out_err;
fuse_conn_init(fc, fm, fsc->user_ns, &virtio_fs_fiq_ops, fs); fc->release = fuse_free_conn; @@ -1483,6 +1475,13 @@ static int virtio_fs_get_tree(struct fs_ WARN_ON(fsc->root); fsc->root = dget(sb->s_root); return 0; + +out_err: + kfree(fc); + mutex_lock(&virtio_fs_mutex); + virtio_fs_put(fs); + mutex_unlock(&virtio_fs_mutex); + return err; }
static const struct fs_context_operations virtio_fs_context_ops = {
From: Connor Kuehl ckuehl@redhat.com
commit a7f0d7aab0b4f3f0780b1f77356e2fe7202ac0cb upstream.
If an incoming FUSE request can't fit on the virtqueue, the request is placed onto a workqueue so a worker can try to resubmit it later where there will (hopefully) be space for it next time.
This is fine for requests that aren't larger than a virtqueue's maximum capacity. However, if a request's size exceeds the maximum capacity of the virtqueue (even if the virtqueue is empty), it will be doomed to a life of being placed on the workqueue, removed, discovered it won't fit, and placed on the workqueue yet again.
Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect Descriptors) of the virtio spec:
"A driver MUST NOT create a descriptor chain longer than the Queue Size of the device."
To fix this, limit the number of pages FUSE will use for an overall request. This way, each request can realistically fit on the virtqueue when it is decomposed into a scattergather list and avoid violating section 2.6.5.3.1 of the virtio spec.
Signed-off-by: Connor Kuehl ckuehl@redhat.com Reviewed-by: Vivek Goyal vgoyal@redhat.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Bo yb203166@antfin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/fuse_i.h | 3 +++ fs/fuse/inode.c | 3 ++- fs/fuse/virtio_fs.c | 19 +++++++++++++++++-- 3 files changed, 22 insertions(+), 3 deletions(-)
--- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -556,6 +556,9 @@ struct fuse_conn { /** Maxmum number of pages that can be used in a single request */ unsigned int max_pages;
+ /** Constrain ->max_pages to this value during feature negotiation */ + unsigned int max_pages_limit; + /** Input queue */ struct fuse_iqueue iq;
--- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -710,6 +710,7 @@ void fuse_conn_init(struct fuse_conn *fc fc->pid_ns = get_pid_ns(task_active_pid_ns(current)); fc->user_ns = get_user_ns(user_ns); fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ; + fc->max_pages_limit = FUSE_MAX_MAX_PAGES;
INIT_LIST_HEAD(&fc->mounts); list_add(&fm->fc_entry, &fc->mounts); @@ -1056,7 +1057,7 @@ static void process_init_reply(struct fu fc->abort_err = 1; if (arg->flags & FUSE_MAX_PAGES) { fc->max_pages = - min_t(unsigned int, FUSE_MAX_MAX_PAGES, + min_t(unsigned int, fc->max_pages_limit, max_t(unsigned int, arg->max_pages, 1)); } if (IS_ENABLED(CONFIG_FUSE_DAX) && --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -18,6 +18,12 @@ #include <linux/uio.h> #include "fuse_i.h"
+/* Used to help calculate the FUSE connection's max_pages limit for a request's + * size. Parts of the struct fuse_req are sliced into scattergather lists in + * addition to the pages used, so this can help account for that overhead. + */ +#define FUSE_HEADER_OVERHEAD 4 + /* List of virtio-fs device instances and a lock for the list. Also provides * mutual exclusion in device removal and mounting path */ @@ -1426,9 +1432,10 @@ static int virtio_fs_get_tree(struct fs_ { struct virtio_fs *fs; struct super_block *sb; - struct fuse_conn *fc; + struct fuse_conn *fc = NULL; struct fuse_mount *fm; - int err; + unsigned int virtqueue_size; + int err = -EIO;
/* This gets a reference on virtio_fs object. This ptr gets installed * in fc->iq->priv. Once fuse_conn is going away, it calls ->put() @@ -1440,6 +1447,10 @@ static int virtio_fs_get_tree(struct fs_ return -EINVAL; }
+ virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq); + if (WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD)) + goto out_err; + err = -ENOMEM; fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL); if (!fc) @@ -1454,6 +1465,10 @@ static int virtio_fs_get_tree(struct fs_ fc->delete_stale = true; fc->auto_submounts = true;
+ /* Tell FUSE to split requests that exceed the virtqueue's size */ + fc->max_pages_limit = min_t(unsigned int, fc->max_pages_limit, + virtqueue_size - FUSE_HEADER_OVERHEAD); + fsc->s_fs_info = fm; sb = sget_fc(fsc, virtio_fs_test_super, virtio_fs_set_super); fuse_mount_put(fm);
From: Miklos Szeredi mszeredi@redhat.com
commit d534d31d6a45d71de61db22090b4820afb68fddc upstream.
Checking "fm" works because currently sb->s_fs_info is cleared on error paths; however, sb->s_root is what generic_shutdown_super() checks to determine whether the sb was fully initialized or not.
This change will allow cleanup of sb setup error paths.
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Bo yb203166@antfin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/inode.c | 2 +- fs/fuse/virtio_fs.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1596,7 +1596,7 @@ static void fuse_kill_sb_blk(struct supe struct fuse_mount *fm = get_fuse_mount_super(sb); bool last;
- if (fm) { + if (sb->s_root) { last = fuse_mount_remove(fm); if (last) fuse_conn_destroy(fm); --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -1399,7 +1399,7 @@ static void virtio_kill_sb(struct super_ bool last;
/* If mount failed, we can still be called without any fc */ - if (fm) { + if (sb->s_root) { last = fuse_mount_remove(fm); if (last) virtio_fs_conn_destroy(fm);
From: Miklos Szeredi mszeredi@redhat.com
commit 484ce65715b06aead8c4901f01ca32c5a240bc71 upstream.
A READ request returning a short count is taken as indication of EOF, and the cached file size is modified accordingly.
Fix the attribute version checking to allow for changes to fc->attr_version on other inodes.
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Bo yb203166@antfin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -782,7 +782,7 @@ static void fuse_read_update_size(struct struct fuse_inode *fi = get_fuse_inode(inode);
spin_lock(&fi->lock); - if (attr_ver == fi->attr_version && size < inode->i_size && + if (attr_ver >= fi->attr_version && size < inode->i_size && !test_bit(FUSE_I_SIZE_UNSTABLE, &fi->state)) { fi->attr_version = atomic64_inc_return(&fc->attr_version); i_size_write(inode, size);
From: Jiachen Zhang zhangjiachen.jaycee@bytedance.com
commit ccc031e26afe60d2a5a3d93dabd9c978210825fb upstream.
The previous commit df8629af2934 ("fuse: always revalidate if exclusive create") ensures that the dentries are revalidated on O_EXCL creates. This commit complements it by also performing revalidation for rename target dentries. Otherwise, a rename target file that only exists in kernel dentry cache but not in the filesystem will result in EEXIST if RENAME_NOREPLACE flag is used.
Signed-off-by: Jiachen Zhang zhangjiachen.jaycee@bytedance.com Signed-off-by: Zhang Tianci zhangtianci.1997@bytedance.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Bo yb203166@antfin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -205,7 +205,7 @@ static int fuse_dentry_revalidate(struct if (inode && fuse_is_bad(inode)) goto invalid; else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) || - (flags & (LOOKUP_EXCL | LOOKUP_REVAL))) { + (flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET))) { struct fuse_entry_out outarg; FUSE_ARGS(args); struct fuse_forget_link *forget;
From: Miklos Szeredi mszeredi@redhat.com
commit 2fdbb8dd01556e1501132b5ad3826e8f71e24a8b upstream.
fuse_finish_open() will be called with FUSE_NOWRITE set in case of atomic O_TRUNC open(), so commit 76224355db75 ("fuse: truncate pagecache on atomic_o_trunc") replaced invalidate_inode_pages2() by truncate_pagecache() in such a case to avoid the A-A deadlock. However, we found another A-B-B-A deadlock related to the case above, which will cause the xfstests generic/464 testcase hung in our virtio-fs test environment.
For example, consider two processes concurrently open one same file, one with O_TRUNC and another without O_TRUNC. The deadlock case is described below, if open(O_TRUNC) is already set_nowrite(acquired A), and is trying to lock a page (acquiring B), open() could have held the page lock (acquired B), and waiting on the page writeback (acquiring A). This would lead to deadlocks.
open(O_TRUNC) ---------------------------------------------------------------- fuse_open_common inode_lock [C acquire] fuse_set_nowrite [A acquire]
fuse_finish_open truncate_pagecache lock_page [B acquire] truncate_inode_page unlock_page [B release]
fuse_release_nowrite [A release] inode_unlock [C release] ----------------------------------------------------------------
open() ---------------------------------------------------------------- fuse_open_common fuse_finish_open invalidate_inode_pages2 lock_page [B acquire] fuse_launder_page fuse_wait_on_page_writeback [A acquire & release] unlock_page [B release] ----------------------------------------------------------------
Besides this case, all calls of invalidate_inode_pages2() and invalidate_inode_pages2_range() in fuse code also can deadlock with open(O_TRUNC).
Fix by moving the truncate_pagecache() call outside the nowrite protected region. The nowrite protection is only for delayed writeback (writeback_cache) case, where inode lock does not protect against truncation racing with writes on the server. Write syscalls racing with page cache truncation still get the inode lock protection.
This patch also changes the order of filemap_invalidate_lock() vs. fuse_set_nowrite() in fuse_open_common(). This new order matches the order found in fuse_file_fallocate() and fuse_do_setattr().
Reported-by: Jiachen Zhang zhangjiachen.jaycee@bytedance.com Tested-by: Jiachen Zhang zhangjiachen.jaycee@bytedance.com Fixes: e4648309b85a ("fuse: truncate pending writes on O_TRUNC") Cc: stable@vger.kernel.org Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Yang Bo yb203166@antfin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dir.c | 5 +++++ fs/fuse/file.c | 29 +++++++++++++++++------------ 2 files changed, 22 insertions(+), 12 deletions(-)
--- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -537,6 +537,7 @@ static int fuse_create_open(struct inode struct fuse_entry_out outentry; struct fuse_inode *fi; struct fuse_file *ff; + bool trunc = flags & O_TRUNC;
/* Userspace expects S_IFREG in create mode */ BUG_ON((mode & S_IFMT) != S_IFREG); @@ -604,6 +605,10 @@ static int fuse_create_open(struct inode } else { file->private_data = ff; fuse_finish_open(inode, file); + if (fm->fc->atomic_o_trunc && trunc) + truncate_pagecache(inode, 0); + else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) + invalidate_inode_pages2(inode->i_mapping); } return err;
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -206,14 +206,10 @@ void fuse_finish_open(struct inode *inod fi->attr_version = atomic64_inc_return(&fc->attr_version); i_size_write(inode, 0); spin_unlock(&fi->lock); - truncate_pagecache(inode, 0); fuse_invalidate_attr(inode); if (fc->writeback_cache) file_update_time(file); - } else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) { - invalidate_inode_pages2(inode->i_mapping); } - if ((file->f_mode & FMODE_WRITE) && fc->writeback_cache) fuse_link_write_file(file); } @@ -236,30 +232,39 @@ int fuse_open_common(struct inode *inode if (err) return err;
- if (is_wb_truncate || dax_truncate) { + if (is_wb_truncate || dax_truncate) inode_lock(inode); - fuse_set_nowrite(inode); - }
if (dax_truncate) { down_write(&get_fuse_inode(inode)->i_mmap_sem); err = fuse_dax_break_layouts(inode, 0, 0); if (err) - goto out; + goto out_inode_unlock; }
+ if (is_wb_truncate || dax_truncate) + fuse_set_nowrite(inode); + err = fuse_do_open(fm, get_node_id(inode), file, isdir); if (!err) fuse_finish_open(inode, file);
-out: + if (is_wb_truncate || dax_truncate) + fuse_release_nowrite(inode); + if (!err) { + struct fuse_file *ff = file->private_data; + + if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC)) + truncate_pagecache(inode, 0); + else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) + invalidate_inode_pages2(inode->i_mapping); + } if (dax_truncate) up_write(&get_fuse_inode(inode)->i_mmap_sem);
- if (is_wb_truncate | dax_truncate) { - fuse_release_nowrite(inode); +out_inode_unlock: + if (is_wb_truncate || dax_truncate) inode_unlock(inode); - }
return err; }
From: Tudor Ambarus tudor.ambarus@linaro.org
This reverts commit bb8592efcf8ef2f62947745d3182ea05b5256a15 which is commit 67d7d8ad99beccd9fe92d585b87f1760dc9018e3 upstream.
The order in which patches are queued to stable matters. This patch has a logical dependency on commit 310c097c2bdbea253d6ee4e064f3e65580ef93ac upstream, and failing to queue the latter results in a null-ptr-deref reported at the Link below.
In order to avoid conflicts on stable, revert the commit just so that we can queue its prerequisite patch first and then queue the same after.
Link: https://syzkaller.appspot.com/bug?extid=d5ebf56f3b1268136afd Signed-off-by: Tudor Ambarus tudor.ambarus@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/xattr.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -2193,9 +2193,8 @@ int ext4_xattr_ibody_find(struct inode * struct ext4_inode *raw_inode; int error;
- if (!EXT4_INODE_HAS_XATTR_SPACE(inode)) + if (EXT4_I(inode)->i_extra_isize == 0) return 0; - raw_inode = ext4_raw_inode(&is->iloc); header = IHDR(inode, raw_inode); is->s.base = is->s.first = IFIRST(header); @@ -2223,9 +2222,8 @@ int ext4_xattr_ibody_inline_set(handle_t struct ext4_xattr_search *s = &is->s; int error;
- if (!EXT4_INODE_HAS_XATTR_SPACE(inode)) + if (EXT4_I(inode)->i_extra_isize == 0) return -ENOSPC; - error = ext4_xattr_set_entry(i, s, handle, inode, false /* is_block */); if (error) return error;
From: Ritesh Harjani riteshh@linux.ibm.com
[ Upstream commit 310c097c2bdbea253d6ee4e064f3e65580ef93ac ]
ext4_xattr_ibody_inline_set() & ext4_xattr_ibody_set() have the exact same definition. Hence remove ext4_xattr_ibody_inline_set() and all its call references. Convert the callers of it to call ext4_xattr_ibody_set() instead.
[ Modified to preserve ext4_xattr_ibody_set() and remove ext4_xattr_ibody_inline_set() instead. -- TYT ]
Signed-off-by: Ritesh Harjani riteshh@linux.ibm.com Link: https://lore.kernel.org/r/fd566b799bbbbe9b668eb5eecde5b5e319e3694f.162268548... Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Tudor Ambarus tudor.ambarus@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/inline.c | 11 +++++------ fs/ext4/xattr.c | 26 +------------------------- fs/ext4/xattr.h | 6 +++--- 3 files changed, 9 insertions(+), 34 deletions(-)
--- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -206,7 +206,7 @@ out: /* * write the buffer to the inline inode. * If 'create' is set, we don't need to do the extra copy in the xattr - * value since it is already handled by ext4_xattr_ibody_inline_set. + * value since it is already handled by ext4_xattr_ibody_set. * That saves us one memcpy. */ static void ext4_write_inline_data(struct inode *inode, struct ext4_iloc *iloc, @@ -288,7 +288,7 @@ static int ext4_create_inline_data(handl
BUG_ON(!is.s.not_found);
- error = ext4_xattr_ibody_inline_set(handle, inode, &i, &is); + error = ext4_xattr_ibody_set(handle, inode, &i, &is); if (error) { if (error == -ENOSPC) ext4_clear_inode_state(inode, @@ -360,7 +360,7 @@ static int ext4_update_inline_data(handl i.value = value; i.value_len = len;
- error = ext4_xattr_ibody_inline_set(handle, inode, &i, &is); + error = ext4_xattr_ibody_set(handle, inode, &i, &is); if (error) goto out;
@@ -433,7 +433,7 @@ static int ext4_destroy_inline_data_nolo if (error) goto out;
- error = ext4_xattr_ibody_inline_set(handle, inode, &i, &is); + error = ext4_xattr_ibody_set(handle, inode, &i, &is); if (error) goto out;
@@ -1930,8 +1930,7 @@ int ext4_inline_data_truncate(struct ino i.value = value; i.value_len = i_size > EXT4_MIN_INLINE_DATA_SIZE ? i_size - EXT4_MIN_INLINE_DATA_SIZE : 0; - err = ext4_xattr_ibody_inline_set(handle, inode, - &i, &is); + err = ext4_xattr_ibody_set(handle, inode, &i, &is); if (err) goto out_error; } --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -2214,31 +2214,7 @@ int ext4_xattr_ibody_find(struct inode * return 0; }
-int ext4_xattr_ibody_inline_set(handle_t *handle, struct inode *inode, - struct ext4_xattr_info *i, - struct ext4_xattr_ibody_find *is) -{ - struct ext4_xattr_ibody_header *header; - struct ext4_xattr_search *s = &is->s; - int error; - - if (EXT4_I(inode)->i_extra_isize == 0) - return -ENOSPC; - error = ext4_xattr_set_entry(i, s, handle, inode, false /* is_block */); - if (error) - return error; - header = IHDR(inode, ext4_raw_inode(&is->iloc)); - if (!IS_LAST_ENTRY(s->first)) { - header->h_magic = cpu_to_le32(EXT4_XATTR_MAGIC); - ext4_set_inode_state(inode, EXT4_STATE_XATTR); - } else { - header->h_magic = cpu_to_le32(0); - ext4_clear_inode_state(inode, EXT4_STATE_XATTR); - } - return 0; -} - -static int ext4_xattr_ibody_set(handle_t *handle, struct inode *inode, +int ext4_xattr_ibody_set(handle_t *handle, struct inode *inode, struct ext4_xattr_info *i, struct ext4_xattr_ibody_find *is) { --- a/fs/ext4/xattr.h +++ b/fs/ext4/xattr.h @@ -200,9 +200,9 @@ extern int ext4_xattr_ibody_find(struct extern int ext4_xattr_ibody_get(struct inode *inode, int name_index, const char *name, void *buffer, size_t buffer_size); -extern int ext4_xattr_ibody_inline_set(handle_t *handle, struct inode *inode, - struct ext4_xattr_info *i, - struct ext4_xattr_ibody_find *is); +extern int ext4_xattr_ibody_set(handle_t *handle, struct inode *inode, + struct ext4_xattr_info *i, + struct ext4_xattr_ibody_find *is);
extern struct mb_cache *ext4_xattr_create_cache(void); extern void ext4_xattr_destroy_cache(struct mb_cache *);
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 67d7d8ad99beccd9fe92d585b87f1760dc9018e3 ]
Hulk Robot reported a issue: ================================================================== BUG: KASAN: use-after-free in ext4_xattr_set_entry+0x18ab/0x3500 Write of size 4105 at addr ffff8881675ef5f4 by task syz-executor.0/7092
CPU: 1 PID: 7092 Comm: syz-executor.0 Not tainted 4.19.90-dirty #17 Call Trace: [...] memcpy+0x34/0x50 mm/kasan/kasan.c:303 ext4_xattr_set_entry+0x18ab/0x3500 fs/ext4/xattr.c:1747 ext4_xattr_ibody_inline_set+0x86/0x2a0 fs/ext4/xattr.c:2205 ext4_xattr_set_handle+0x940/0x1300 fs/ext4/xattr.c:2386 ext4_xattr_set+0x1da/0x300 fs/ext4/xattr.c:2498 __vfs_setxattr+0x112/0x170 fs/xattr.c:149 __vfs_setxattr_noperm+0x11b/0x2a0 fs/xattr.c:180 __vfs_setxattr_locked+0x17b/0x250 fs/xattr.c:238 vfs_setxattr+0xed/0x270 fs/xattr.c:255 setxattr+0x235/0x330 fs/xattr.c:520 path_setxattr+0x176/0x190 fs/xattr.c:539 __do_sys_lsetxattr fs/xattr.c:561 [inline] __se_sys_lsetxattr fs/xattr.c:557 [inline] __x64_sys_lsetxattr+0xc2/0x160 fs/xattr.c:557 do_syscall_64+0xdf/0x530 arch/x86/entry/common.c:298 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x459fe9 RSP: 002b:00007fa5e54b4c08 EFLAGS: 00000246 ORIG_RAX: 00000000000000bd RAX: ffffffffffffffda RBX: 000000000051bf60 RCX: 0000000000459fe9 RDX: 00000000200003c0 RSI: 0000000020000180 RDI: 0000000020000140 RBP: 000000000051bf60 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000001009 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc73c93fc0 R14: 000000000051bf60 R15: 00007fa5e54b4d80 [...] ==================================================================
Above issue may happen as follows: ------------------------------------- ext4_xattr_set ext4_xattr_set_handle ext4_xattr_ibody_find >> s->end < s->base >> no EXT4_STATE_XATTR >> xattr_check_inode is not executed ext4_xattr_ibody_set ext4_xattr_set_entry >> size_t min_offs = s->end - s->base >> UAF in memcpy
we can easily reproduce this problem with the following commands: mkfs.ext4 -F /dev/sda mount -o debug_want_extra_isize=128 /dev/sda /mnt touch /mnt/file setfattr -n user.cat -v `seq -s z 4096|tr -d '[:digit:]'` /mnt/file
In ext4_xattr_ibody_find, we have the following assignment logic: header = IHDR(inode, raw_inode) = raw_inode + EXT4_GOOD_OLD_INODE_SIZE + i_extra_isize is->s.base = IFIRST(header) = header + sizeof(struct ext4_xattr_ibody_header) is->s.end = raw_inode + s_inode_size
In ext4_xattr_set_entry min_offs = s->end - s->base = s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize - sizeof(struct ext4_xattr_ibody_header) last = s->first free = min_offs - ((void *)last - s->base) - sizeof(__u32) = s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize - sizeof(struct ext4_xattr_ibody_header) - sizeof(__u32)
In the calculation formula, all values except s_inode_size and i_extra_size are fixed values. When i_extra_size is the maximum value s_inode_size - EXT4_GOOD_OLD_INODE_SIZE, min_offs is -4 and free is -8. The value overflows. As a result, the preceding issue is triggered when memcpy is executed.
Therefore, when finding xattr or setting xattr, check whether there is space for storing xattr in the inode to resolve this issue.
Cc: stable@kernel.org Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Ritesh Harjani (IBM) ritesh.list@gmail.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20220616021358.2504451-3-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Tudor Ambarus tudor.ambarus@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/xattr.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -2193,8 +2193,9 @@ int ext4_xattr_ibody_find(struct inode * struct ext4_inode *raw_inode; int error;
- if (EXT4_I(inode)->i_extra_isize == 0) + if (!EXT4_INODE_HAS_XATTR_SPACE(inode)) return 0; + raw_inode = ext4_raw_inode(&is->iloc); header = IHDR(inode, raw_inode); is->s.base = is->s.first = IFIRST(header); @@ -2222,8 +2223,9 @@ int ext4_xattr_ibody_set(handle_t *handl struct ext4_xattr_search *s = &is->s; int error;
- if (EXT4_I(inode)->i_extra_isize == 0) + if (!EXT4_INODE_HAS_XATTR_SPACE(inode)) return -ENOSPC; + error = ext4_xattr_set_entry(i, s, handle, inode, false /* is_block */); if (error) return error;
From: Kuniyuki Iwashima kuniyu@amazon.com
commit 21985f43376cee092702d6cb963ff97a9d2ede68 upstream.
Commit 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support") forgot to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6 socket into IPv4 with IPV6_ADDRFORM. After conversion, sk_prot is changed to udp_prot and ->destroy() never cleans it up, resulting in a memory leak.
This is due to the discrepancy between inet6_destroy_sock() and IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM to remove the difference.
However, this is not enough for now because rxpmtu can be changed without lock_sock() after commit 03485f2adcde ("udpv6: Add lockless sendmsg() support"). We will fix this case in the following patch.
Note we will rename inet6_destroy_sock() to inet6_cleanup_sock() and remove unnecessary inet6_destroy_sock() calls in sk_prot->destroy() in the future.
Fixes: 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/ipv6.h | 1 + net/ipv6/af_inet6.c | 6 ++++++ net/ipv6/ipv6_sockglue.c | 20 ++++++++------------ 3 files changed, 15 insertions(+), 12 deletions(-)
--- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -1109,6 +1109,7 @@ void ipv6_icmp_error(struct sock *sk, st void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info); void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
+void inet6_cleanup_sock(struct sock *sk); int inet6_release(struct socket *sock); int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len); int inet6_getname(struct socket *sock, struct sockaddr *uaddr, --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -503,6 +503,12 @@ void inet6_destroy_sock(struct sock *sk) } EXPORT_SYMBOL_GPL(inet6_destroy_sock);
+void inet6_cleanup_sock(struct sock *sk) +{ + inet6_destroy_sock(sk); +} +EXPORT_SYMBOL_GPL(inet6_cleanup_sock); + /* * This does both peername and sockname. */ --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -429,9 +429,6 @@ static int do_ipv6_setsockopt(struct soc if (optlen < sizeof(int)) goto e_inval; if (val == PF_INET) { - struct ipv6_txoptions *opt; - struct sk_buff *pktopt; - if (sk->sk_type == SOCK_RAW) break;
@@ -462,7 +459,6 @@ static int do_ipv6_setsockopt(struct soc break; }
- fl6_free_socklist(sk); __ipv6_sock_mc_close(sk); __ipv6_sock_ac_close(sk);
@@ -497,14 +493,14 @@ static int do_ipv6_setsockopt(struct soc sk->sk_socket->ops = &inet_dgram_ops; sk->sk_family = PF_INET; } - opt = xchg((__force struct ipv6_txoptions **)&np->opt, - NULL); - if (opt) { - atomic_sub(opt->tot_len, &sk->sk_omem_alloc); - txopt_put(opt); - } - pktopt = xchg(&np->pktoptions, NULL); - kfree_skb(pktopt); + + /* Disable all options not to allocate memory anymore, + * but there is still a race. See the lockless path + * in udpv6_sendmsg() and ipv6_local_rxpmtu(). + */ + np->rxopt.all = 0; + + inet6_cleanup_sock(sk);
/* * ... and add it to the refcnt debug socks count
From: Kuniyuki Iwashima kuniyu@amazon.com
commit d38afeec26ed4739c640bf286c270559aab2ba5f upstream.
Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were able to clean them up by calling inet6_destroy_sock() during the IPv6 -> IPv4 conversion by IPV6_ADDRFORM. However, commit 03485f2adcde ("udpv6: Add lockless sendmsg() support") added a lockless memory allocation path, which could cause a memory leak:
setsockopt(IPV6_ADDRFORM) sendmsg() +-----------------------+ +-------+ - do_ipv6_setsockopt(sk, ...) - udpv6_sendmsg(sk, ...) - sockopt_lock_sock(sk) ^._ called via udpv6_prot - lock_sock(sk) before WRITE_ONCE() - WRITE_ONCE(sk->sk_prot, &tcp_prot) - inet6_destroy_sock() - if (!corkreq) - sockopt_release_sock(sk) - ip6_make_skb(sk, ...) - release_sock(sk) ^._ lockless fast path for the non-corking case
- __ip6_append_data(sk, ...) - ipv6_local_rxpmtu(sk, ...) - xchg(&np->rxpmtu, skb) ^._ rxpmtu is never freed.
- goto out_no_dst;
- lock_sock(sk)
For now, rxpmtu is only the case, but not to miss the future change and a similar bug fixed in commit e27326009a3d ("net: ping6: Fix memleak in ipv6_renew_options()."), let's set a new function to IPv6 sk->sk_destruct() and call inet6_cleanup_sock() there. Since the conversion does not change sk->sk_destruct(), we can guarantee that we can clean up IPv6 resources finally.
We can now remove all inet6_destroy_sock() calls from IPv6 protocol specific ->destroy() functions, but such changes are invasive to backport. So they can be posted as a follow-up later for net-next.
Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/ipv6.h | 1 + include/net/udp.h | 2 +- include/net/udplite.h | 8 -------- net/ipv4/udp.c | 9 ++++++--- net/ipv4/udplite.c | 8 ++++++++ net/ipv6/af_inet6.c | 8 +++++++- net/ipv6/udp.c | 15 ++++++++++++++- net/ipv6/udp_impl.h | 1 + net/ipv6/udplite.c | 9 ++++++++- 9 files changed, 46 insertions(+), 15 deletions(-)
--- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -1110,6 +1110,7 @@ void ipv6_local_error(struct sock *sk, i void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
void inet6_cleanup_sock(struct sock *sk); +void inet6_sock_destruct(struct sock *sk); int inet6_release(struct socket *sock); int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len); int inet6_getname(struct socket *sock, struct sockaddr *uaddr, --- a/include/net/udp.h +++ b/include/net/udp.h @@ -268,7 +268,7 @@ static inline bool udp_sk_bound_dev_eq(s }
/* net/ipv4/udp.c */ -void udp_destruct_sock(struct sock *sk); +void udp_destruct_common(struct sock *sk); void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len); int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb); void udp_skb_destructor(struct sock *sk, struct sk_buff *skb); --- a/include/net/udplite.h +++ b/include/net/udplite.h @@ -24,14 +24,6 @@ static __inline__ int udplite_getfrag(vo return copy_from_iter_full(to, len, &msg->msg_iter) ? 0 : -EFAULT; }
-/* Designate sk as UDP-Lite socket */ -static inline int udplite_sk_init(struct sock *sk) -{ - udp_init_sock(sk); - udp_sk(sk)->pcflag = UDPLITE_BIT; - return 0; -} - /* * Checksumming routines */ --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1582,7 +1582,7 @@ drop: } EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
-void udp_destruct_sock(struct sock *sk) +void udp_destruct_common(struct sock *sk) { /* reclaim completely the forward allocated memory */ struct udp_sock *up = udp_sk(sk); @@ -1595,10 +1595,14 @@ void udp_destruct_sock(struct sock *sk) kfree_skb(skb); } udp_rmem_release(sk, total, 0, true); +} +EXPORT_SYMBOL_GPL(udp_destruct_common);
+static void udp_destruct_sock(struct sock *sk) +{ + udp_destruct_common(sk); inet_sock_destruct(sk); } -EXPORT_SYMBOL_GPL(udp_destruct_sock);
int udp_init_sock(struct sock *sk) { @@ -1606,7 +1610,6 @@ int udp_init_sock(struct sock *sk) sk->sk_destruct = udp_destruct_sock; return 0; } -EXPORT_SYMBOL_GPL(udp_init_sock);
void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len) { --- a/net/ipv4/udplite.c +++ b/net/ipv4/udplite.c @@ -17,6 +17,14 @@ struct udp_table udplite_table __read_mostly; EXPORT_SYMBOL(udplite_table);
+/* Designate sk as UDP-Lite socket */ +static int udplite_sk_init(struct sock *sk) +{ + udp_init_sock(sk); + udp_sk(sk)->pcflag = UDPLITE_BIT; + return 0; +} + static int udplite_rcv(struct sk_buff *skb) { return __udp4_lib_rcv(skb, &udplite_table, IPPROTO_UDPLITE); --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -107,6 +107,12 @@ static __inline__ struct ipv6_pinfo *ine return (struct ipv6_pinfo *)(((u8 *)sk) + offset); }
+void inet6_sock_destruct(struct sock *sk) +{ + inet6_cleanup_sock(sk); + inet_sock_destruct(sk); +} + static int inet6_create(struct net *net, struct socket *sock, int protocol, int kern) { @@ -199,7 +205,7 @@ lookup_protocol: inet->hdrincl = 1; }
- sk->sk_destruct = inet_sock_destruct; + sk->sk_destruct = inet6_sock_destruct; sk->sk_family = PF_INET6; sk->sk_protocol = protocol;
--- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -54,6 +54,19 @@ #include <trace/events/skb.h> #include "udp_impl.h"
+static void udpv6_destruct_sock(struct sock *sk) +{ + udp_destruct_common(sk); + inet6_sock_destruct(sk); +} + +int udpv6_init_sock(struct sock *sk) +{ + skb_queue_head_init(&udp_sk(sk)->reader_queue); + sk->sk_destruct = udpv6_destruct_sock; + return 0; +} + static u32 udp6_ehashfn(const struct net *net, const struct in6_addr *laddr, const u16 lport, @@ -1702,7 +1715,7 @@ struct proto udpv6_prot = { .connect = ip6_datagram_connect, .disconnect = udp_disconnect, .ioctl = udp_ioctl, - .init = udp_init_sock, + .init = udpv6_init_sock, .destroy = udpv6_destroy_sock, .setsockopt = udpv6_setsockopt, .getsockopt = udpv6_getsockopt, --- a/net/ipv6/udp_impl.h +++ b/net/ipv6/udp_impl.h @@ -12,6 +12,7 @@ int __udp6_lib_rcv(struct sk_buff *, str int __udp6_lib_err(struct sk_buff *, struct inet6_skb_parm *, u8, u8, int, __be32, struct udp_table *);
+int udpv6_init_sock(struct sock *sk); int udp_v6_get_port(struct sock *sk, unsigned short snum); void udp_v6_rehash(struct sock *sk);
--- a/net/ipv6/udplite.c +++ b/net/ipv6/udplite.c @@ -12,6 +12,13 @@ #include <linux/proc_fs.h> #include "udp_impl.h"
+static int udplitev6_sk_init(struct sock *sk) +{ + udpv6_init_sock(sk); + udp_sk(sk)->pcflag = UDPLITE_BIT; + return 0; +} + static int udplitev6_rcv(struct sk_buff *skb) { return __udp6_lib_rcv(skb, &udplite_table, IPPROTO_UDPLITE); @@ -38,7 +45,7 @@ struct proto udplitev6_prot = { .connect = ip6_datagram_connect, .disconnect = udp_disconnect, .ioctl = udp_ioctl, - .init = udplite_sk_init, + .init = udplitev6_sk_init, .destroy = udpv6_destroy_sock, .setsockopt = udpv6_setsockopt, .getsockopt = udpv6_getsockopt,
From: Kuniyuki Iwashima kuniyu@amazon.com
commit b5fc29233d28be7a3322848ebe73ac327559cdb9 upstream.
After commit d38afeec26ed ("tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct()."), we call inet6_destroy_sock() in sk->sk_destruct() by setting inet6_sock_destruct() to it to make sure we do not leak inet6-specific resources.
Now we can remove unnecessary inet6_destroy_sock() calls in sk->sk_prot->destroy().
DCCP and SCTP have their own sk->sk_destruct() function, so we change them separately in the following patches.
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ping.c | 6 ------ net/ipv6/raw.c | 2 -- net/ipv6/tcp_ipv6.c | 8 +------- net/ipv6/udp.c | 2 -- net/l2tp/l2tp_ip6.c | 2 -- net/mptcp/protocol.c | 7 ------- 6 files changed, 1 insertion(+), 26 deletions(-)
--- a/net/ipv6/ping.c +++ b/net/ipv6/ping.c @@ -22,11 +22,6 @@ #include <linux/proc_fs.h> #include <net/ping.h>
-static void ping_v6_destroy(struct sock *sk) -{ - inet6_destroy_sock(sk); -} - /* Compatibility glue so we can support IPv6 when it's compiled as a module */ static int dummy_ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) @@ -171,7 +166,6 @@ struct proto pingv6_prot = { .owner = THIS_MODULE, .init = ping_init_sock, .close = ping_close, - .destroy = ping_v6_destroy, .connect = ip6_datagram_connect_v6_only, .disconnect = __udp_disconnect, .setsockopt = ipv6_setsockopt, --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -1211,8 +1211,6 @@ static void raw6_destroy(struct sock *sk lock_sock(sk); ip6_flush_pending_frames(sk); release_sock(sk); - - inet6_destroy_sock(sk); }
static int rawv6_init_sk(struct sock *sk) --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1936,12 +1936,6 @@ static int tcp_v6_init_sock(struct sock return 0; }
-static void tcp_v6_destroy_sock(struct sock *sk) -{ - tcp_v4_destroy_sock(sk); - inet6_destroy_sock(sk); -} - #ifdef CONFIG_PROC_FS /* Proc filesystem TCPv6 sock list dumping. */ static void get_openreq6(struct seq_file *seq, @@ -2134,7 +2128,7 @@ struct proto tcpv6_prot = { .accept = inet_csk_accept, .ioctl = tcp_ioctl, .init = tcp_v6_init_sock, - .destroy = tcp_v6_destroy_sock, + .destroy = tcp_v4_destroy_sock, .shutdown = tcp_shutdown, .setsockopt = tcp_setsockopt, .getsockopt = tcp_getsockopt, --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1630,8 +1630,6 @@ void udpv6_destroy_sock(struct sock *sk) udp_encap_disable(); } } - - inet6_destroy_sock(sk); }
/* --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -255,8 +255,6 @@ static void l2tp_ip6_destroy_sock(struct
if (tunnel) l2tp_tunnel_delete(tunnel); - - inet6_destroy_sock(sk); }
static int l2tp_ip6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2863,12 +2863,6 @@ static const struct proto_ops mptcp_v6_s
static struct proto mptcp_v6_prot;
-static void mptcp_v6_destroy(struct sock *sk) -{ - mptcp_destroy(sk); - inet6_destroy_sock(sk); -} - static struct inet_protosw mptcp_v6_protosw = { .type = SOCK_STREAM, .protocol = IPPROTO_MPTCP, @@ -2884,7 +2878,6 @@ int __init mptcp_proto_v6_init(void) mptcp_v6_prot = mptcp_prot; strcpy(mptcp_v6_prot.name, "MPTCPv6"); mptcp_v6_prot.slab = NULL; - mptcp_v6_prot.destroy = mptcp_v6_destroy; mptcp_v6_prot.obj_size = sizeof(struct mptcp6_sock);
err = proto_register(&mptcp_v6_prot, 1);
From: Kuniyuki Iwashima kuniyu@amazon.com
commit 1651951ebea54970e0bda60c638fc2eee7a6218f upstream.
After commit d38afeec26ed ("tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct()."), we call inet6_destroy_sock() in sk->sk_destruct() by setting inet6_sock_destruct() to it to make sure we do not leak inet6-specific resources.
DCCP sets its own sk->sk_destruct() in the dccp_init_sock(), and DCCPv6 socket shares it by calling the same init function via dccp_v6_init_sock().
To call inet6_sock_destruct() from DCCPv6 sk->sk_destruct(), we export it and set dccp_v6_sk_destruct() in the init function.
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dccp/dccp.h | 1 + net/dccp/ipv6.c | 15 ++++++++------- net/dccp/proto.c | 8 +++++++- net/ipv6/af_inet6.c | 1 + 4 files changed, 17 insertions(+), 8 deletions(-)
--- a/net/dccp/dccp.h +++ b/net/dccp/dccp.h @@ -283,6 +283,7 @@ int dccp_rcv_state_process(struct sock * int dccp_rcv_established(struct sock *sk, struct sk_buff *skb, const struct dccp_hdr *dh, const unsigned int len);
+void dccp_destruct_common(struct sock *sk); int dccp_init_sock(struct sock *sk, const __u8 ctl_sock_initialized); void dccp_destroy_sock(struct sock *sk);
--- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -992,6 +992,12 @@ static const struct inet_connection_sock .sockaddr_len = sizeof(struct sockaddr_in6), };
+static void dccp_v6_sk_destruct(struct sock *sk) +{ + dccp_destruct_common(sk); + inet6_sock_destruct(sk); +} + /* NOTE: A lot of things set to zero explicitly by call to * sk_alloc() so need not be done here. */ @@ -1004,17 +1010,12 @@ static int dccp_v6_init_sock(struct sock if (unlikely(!dccp_v6_ctl_sock_initialized)) dccp_v6_ctl_sock_initialized = 1; inet_csk(sk)->icsk_af_ops = &dccp_ipv6_af_ops; + sk->sk_destruct = dccp_v6_sk_destruct; }
return err; }
-static void dccp_v6_destroy_sock(struct sock *sk) -{ - dccp_destroy_sock(sk); - inet6_destroy_sock(sk); -} - static struct timewait_sock_ops dccp6_timewait_sock_ops = { .twsk_obj_size = sizeof(struct dccp6_timewait_sock), }; @@ -1037,7 +1038,7 @@ static struct proto dccp_v6_prot = { .accept = inet_csk_accept, .get_port = inet_csk_get_port, .shutdown = dccp_shutdown, - .destroy = dccp_v6_destroy_sock, + .destroy = dccp_destroy_sock, .orphan_count = &dccp_orphan_count, .max_header = MAX_DCCP_HEADER, .obj_size = sizeof(struct dccp6_sock), --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -171,12 +171,18 @@ const char *dccp_packet_name(const int t
EXPORT_SYMBOL_GPL(dccp_packet_name);
-static void dccp_sk_destruct(struct sock *sk) +void dccp_destruct_common(struct sock *sk) { struct dccp_sock *dp = dccp_sk(sk);
ccid_hc_tx_delete(dp->dccps_hc_tx_ccid, sk); dp->dccps_hc_tx_ccid = NULL; +} +EXPORT_SYMBOL_GPL(dccp_destruct_common); + +static void dccp_sk_destruct(struct sock *sk) +{ + dccp_destruct_common(sk); inet_sock_destruct(sk); }
--- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -112,6 +112,7 @@ void inet6_sock_destruct(struct sock *sk inet6_cleanup_sock(sk); inet_sock_destruct(sk); } +EXPORT_SYMBOL_GPL(inet6_sock_destruct);
static int inet6_create(struct net *net, struct socket *sock, int protocol, int kern)
From: Kuniyuki Iwashima kuniyu@amazon.com
commit 6431b0f6ff1633ae598667e4cdd93830074a03e8 upstream.
After commit d38afeec26ed ("tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct()."), we call inet6_destroy_sock() in sk->sk_destruct() by setting inet6_sock_destruct() to it to make sure we do not leak inet6-specific resources.
SCTP sets its own sk->sk_destruct() in the sctp_init_sock(), and SCTPv6 socket reuses it as the init function.
To call inet6_sock_destruct() from SCTPv6 sk->sk_destruct(), we set sctp_v6_destruct_sock() in a new init function.
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/socket.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-)
--- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -4995,13 +4995,17 @@ static void sctp_destroy_sock(struct soc }
/* Triggered when there are no references on the socket anymore */ -static void sctp_destruct_sock(struct sock *sk) +static void sctp_destruct_common(struct sock *sk) { struct sctp_sock *sp = sctp_sk(sk);
/* Free up the HMAC transform. */ crypto_free_shash(sp->hmac); +}
+static void sctp_destruct_sock(struct sock *sk) +{ + sctp_destruct_common(sk); inet_sock_destruct(sk); }
@@ -9195,7 +9199,7 @@ void sctp_copy_sock(struct sock *newsk, sctp_sk(newsk)->reuse = sp->reuse;
newsk->sk_shutdown = sk->sk_shutdown; - newsk->sk_destruct = sctp_destruct_sock; + newsk->sk_destruct = sk->sk_destruct; newsk->sk_family = sk->sk_family; newsk->sk_protocol = IPPROTO_SCTP; newsk->sk_backlog_rcv = sk->sk_prot->backlog_rcv; @@ -9427,11 +9431,20 @@ struct proto sctp_prot = {
#if IS_ENABLED(CONFIG_IPV6)
-#include <net/transp_v6.h> -static void sctp_v6_destroy_sock(struct sock *sk) +static void sctp_v6_destruct_sock(struct sock *sk) +{ + sctp_destruct_common(sk); + inet6_sock_destruct(sk); +} + +static int sctp_v6_init_sock(struct sock *sk) { - sctp_destroy_sock(sk); - inet6_destroy_sock(sk); + int ret = sctp_init_sock(sk); + + if (!ret) + sk->sk_destruct = sctp_v6_destruct_sock; + + return ret; }
struct proto sctpv6_prot = { @@ -9441,8 +9454,8 @@ struct proto sctpv6_prot = { .disconnect = sctp_disconnect, .accept = sctp_accept, .ioctl = sctp_ioctl, - .init = sctp_init_sock, - .destroy = sctp_v6_destroy_sock, + .init = sctp_v6_init_sock, + .destroy = sctp_destroy_sock, .shutdown = sctp_shutdown, .setsockopt = sctp_setsockopt, .getsockopt = sctp_getsockopt,
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 8caa81eb950cb2e9d2d6959b37d853162d197f57 upstream.
The driver only supports normal polarity. Complete the implementation of .get_state() by setting .polarity accordingly.
This fixes a regression that was possible since commit c73a3107624d ("pwm: Handle .get_state() failures") which stopped to zero-initialize the state passed to the .get_state() callback. This was reported at https://forum.odroid.com/viewtopic.php?f=177&t=46360 . While this was an unintended side effect, the real issue is the driver's callback not setting the polarity.
There is a complicating fact, that the .apply() callback fakes support for inversed polarity. This is not (and cannot) be matched by .get_state(). As fixing this isn't easy, only point it out in a comment to prevent authors of other drivers from copying that approach.
Fixes: c375bcbaabdb ("pwm: meson: Read the full hardware state in meson_pwm_get_state()") Reported-by: Munehisa Kamata kamatam@amazon.com Acked-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Link: https://lore.kernel.org/r/20230310191405.2606296-1-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-meson.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/pwm/pwm-meson.c +++ b/drivers/pwm/pwm-meson.c @@ -168,6 +168,12 @@ static int meson_pwm_calc(struct meson_p duty = state->duty_cycle; period = state->period;
+ /* + * Note this is wrong. The result is an output wave that isn't really + * inverted and so is wrongly identified by .get_state as normal. + * Fixing this needs some care however as some machines might rely on + * this. + */ if (state->polarity == PWM_POLARITY_INVERSED) duty = period - duty;
@@ -366,6 +372,7 @@ static void meson_pwm_get_state(struct p state->period = 0; state->duty_cycle = 0; } + state->polarity = PWM_POLARITY_NORMAL; }
static const struct pwm_ops meson_pwm_ops = {
From: "Uwe Kleine-K�nig" u.kleine-koenig@pengutronix.de
[ Upstream commit b20b097128d9145fadcea1cbb45c4d186cb57466 ]
The driver only supports normal polarity. Complete the implementation of .get_state() by setting .polarity accordingly.
Fixes: 6f0841a8197b ("pwm: Add support for Azoteq IQS620A PWM generator") Reviewed-by: Jeff LaBundy jeff@labundy.com Link: https://lore.kernel.org/r/20230228135508.1798428-4-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-iqs620a.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/pwm/pwm-iqs620a.c +++ b/drivers/pwm/pwm-iqs620a.c @@ -132,6 +132,7 @@ static void iqs620_pwm_get_state(struct mutex_unlock(&iqs620_pwm->lock);
state->period = IQS620_PWM_PERIOD_NS; + state->polarity = PWM_POLARITY_NORMAL; }
static int iqs620_pwm_notifier(struct notifier_block *notifier,
From: "Uwe Kleine-K�nig" u.kleine-koenig@pengutronix.de
[ Upstream commit 6f57937980142715e927697a6ffd2050f38ed6f6 ]
The driver only both polarities. Complete the implementation of .get_state() by setting .polarity according to the configured hardware state.
Fixes: d09f00810850 ("pwm: Add PWM driver for HiSilicon BVT SOCs") Link: https://lore.kernel.org/r/20230228135508.1798428-2-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-hibvt.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/pwm/pwm-hibvt.c +++ b/drivers/pwm/pwm-hibvt.c @@ -146,6 +146,7 @@ static void hibvt_pwm_get_state(struct p
value = readl(base + PWM_CTRL_ADDR(pwm->hwpwm)); state->enabled = (PWM_ENABLE_MASK & value); + state->polarity = (PWM_POLARITY_MASK & value) ? PWM_POLARITY_INVERSED : PWM_POLARITY_NORMAL; }
static int hibvt_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
From: Dan Carpenter error27@gmail.com
commit 73a428b37b9b538f8f8fe61caa45e7f243bab87c upstream.
The at91_adc_allocate_trigger() function is supposed to return error pointers. Returning a NULL will cause an Oops.
Fixes: 5e1a1da0f8c9 ("iio: adc: at91-sama5d2_adc: add hw trigger and buffer support") Signed-off-by: Dan Carpenter error27@gmail.com Link: https://lore.kernel.org/r/5d728f9d-31d1-410d-a0b3-df6a63a2c8ba@kili.mountain Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/at91-sama5d2_adc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/at91-sama5d2_adc.c +++ b/drivers/iio/adc/at91-sama5d2_adc.c @@ -1002,7 +1002,7 @@ static struct iio_trigger *at91_adc_allo trig = devm_iio_trigger_alloc(&indio->dev, "%s-dev%d-%s", indio->name, indio->id, trigger_name); if (!trig) - return NULL; + return ERR_PTR(-ENOMEM);
trig->dev.parent = indio->dev.parent; iio_trigger_set_drvdata(trig, indio);
From: Nikita Zhandarovich n.zhandarovich@fintech.ru
commit 86a24e99c97234f87d9f70b528a691150e145197 upstream.
dma_request_slave_channel() may return NULL which will lead to NULL pointer dereference error in 'tmp_chan->private'.
Correct this behaviour by, first, switching from deprecated function dma_request_slave_channel() to dma_request_chan(). Secondly, enable sanity check for the resuling value of dma_request_chan(). Also, fix description that follows the enacted changes and that concerns the use of dma_request_slave_channel().
Fixes: 706e2c881158 ("ASoC: fsl_asrc_dma: Reuse the dma channel if available in Back-End") Co-developed-by: Natalia Petrova n.petrova@fintech.ru Signed-off-by: Nikita Zhandarovich n.zhandarovich@fintech.ru Acked-by: Shengjiu Wang shengjiu.wang@gmail.com Link: https://lore.kernel.org/r/20230417133242.53339-1-n.zhandarovich@fintech.ru Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/fsl/fsl_asrc_dma.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/sound/soc/fsl/fsl_asrc_dma.c +++ b/sound/soc/fsl/fsl_asrc_dma.c @@ -207,14 +207,19 @@ static int fsl_asrc_dma_hw_params(struct be_chan = soc_component_to_pcm(component_be)->chan[substream->stream]; tmp_chan = be_chan; } - if (!tmp_chan) - tmp_chan = dma_request_slave_channel(dev_be, tx ? "tx" : "rx"); + if (!tmp_chan) { + tmp_chan = dma_request_chan(dev_be, tx ? "tx" : "rx"); + if (IS_ERR(tmp_chan)) { + dev_err(dev, "failed to request DMA channel for Back-End\n"); + return -EINVAL; + } + }
/* * An EDMA DEV_TO_DEV channel is fixed and bound with DMA event of each * peripheral, unlike SDMA channel that is allocated dynamically. So no * need to configure dma_request and dma_request2, but get dma_chan of - * Back-End device directly via dma_request_slave_channel. + * Back-End device directly via dma_request_chan. */ if (!asrc->use_edma) { /* Get DMA request of Back-End */
From: Ekaterina Orlova vorobushek.ok@gmail.com
commit 5a43001c01691dcbd396541e6faa2c0077378f48 upstream.
It seems there is a misprint in the check of strdup() return code that can lead to NULL pointer dereference.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 4520c6a49af8 ("X.509: Add simple ASN.1 grammar compiler") Signed-off-by: Ekaterina Orlova vorobushek.ok@gmail.com Cc: David Woodhouse dwmw2@infradead.org Cc: James Bottomley jejb@linux.ibm.com Cc: Jarkko Sakkinen jarkko@kernel.org Cc: keyrings@vger.kernel.org Cc: linux-kbuild@vger.kernel.org Link: https://lore.kernel.org/r/20230315172130.140-1-vorobushek.ok@gmail.com/ Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/asn1_compiler.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/scripts/asn1_compiler.c +++ b/scripts/asn1_compiler.c @@ -625,7 +625,7 @@ int main(int argc, char **argv) p = strrchr(argv[1], '/'); p = p ? p + 1 : argv[1]; grammar_name = strdup(p); - if (!p) { + if (!grammar_name) { perror(NULL); exit(1); }
On Mon, Apr 24, 2023 at 03:17:31PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.179 release. There are 68 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 26 Apr 2023 13:11:11 +0000. Anything received after that time might be too late.
Build results: total: 162 pass: 162 fail: 0 Qemu test results: total: 485 pass: 485 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On Mon, 24 Apr 2023 at 14:34, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.10.179 release. There are 68 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 26 Apr 2023 13:11:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.179-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.10.179-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-5.10.y * git commit: 1ef2000b94cb2a0bc4a1e822fd21885e80d65646 * git describe: v5.10.176-363-g1ef2000b94cb * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.10.y/build/v5.10....
## Test Regressions (compared to v5.10.176-294-gabbd2e43cd45)
## Metric Regressions (compared to v5.10.176-294-gabbd2e43cd45)
## Test Fixes (compared to v5.10.176-294-gabbd2e43cd45)
## Metric Fixes (compared to v5.10.176-294-gabbd2e43cd45)
## Test result summary total: 130665, pass: 104853, fail: 3680, skip: 21897, xfail: 235
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 115 total, 114 passed, 1 failed * arm64: 43 total, 40 passed, 3 failed * i386: 33 total, 31 passed, 2 failed * mips: 27 total, 26 passed, 1 failed * parisc: 8 total, 8 passed, 0 failed * powerpc: 26 total, 20 passed, 6 failed * riscv: 12 total, 11 passed, 1 failed * s390: 12 total, 12 passed, 0 failed * sh: 14 total, 12 passed, 2 failed * sparc: 8 total, 8 passed, 0 failed * x86_64: 36 total, 34 passed, 2 failed
## Test suites summary * boot * fwts * igt-gpu-tools * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * perf * rcutorture * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
Hello Greg,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org Sent: Monday, April 24, 2023 2:18 PM
This is the start of the stable review cycle for the 5.10.179 release. There are 68 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 26 Apr 2023 13:11:11 +0000. Anything received after that time might be too late.
CIP configurations built and booted with Linux 5.10.179-rc1 (1ef2000b94cb): https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/84... https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/commits/linu...
Tested-by: Chris Paterson (CIP) chris.paterson2@renesas.com
Kind regards, Chris
On 4/24/2023 6:17 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.179 release. There are 68 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 26 Apr 2023 13:11:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.179-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On 4/24/23 07:17, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.179 release. There are 68 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 26 Apr 2023 13:11:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.179-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org