November 2024 - Linux-stable-mirror

[PATCH net v2 0/5] Make TCP-MD5-diag slightly less broken

by Dmitry Safonov via B4 Relay

Changes in v2: - Fixup for uninitilized md5sig_count stack variable (Oops! Kudos to kernel test robot <lkp(a)intel.com>) - Correct space damage, add a missing Fixes tag & reformat tcp_ulp_ops_size() (Kuniyuki Iwashima) - Take out patch for maximum attribute length, see (4) below. Going to send it later with the next TCP-AO-diag part (Kuniyuki Iwashima) - Link to v1: https://lore.kernel.org/r/20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmai… My original intent was to replace the last non-upstream Arista's TCP-AO piece. That is per-netns procfs seqfile which lists AO keys. In my view an acceptable upstream alternative would be TCP-AO-diag uAPI. So, I started by looking and reviewing TCP-MD5-diag code. And straight away I saw a bunch of issues: 1. Similarly to TCP_MD5SIG_EXT, which doesn't check tcpm_flags for unknown flags and so being non-extendable setsockopt(), the same way tcp_diag_put_md5sig() dumps md5 keys in an array of tcp_diag_md5sig, which makes it ABI non-extendable structure as userspace can't tolerate any new members in it. 2. Inet-diag allocates netlink message for sockets in inet_diag_dump_one_icsk(), which uses a TCP-diag callback .idiag_get_aux_size(), that pre-calculates the needed space for TCP-diag related information. But as neither socket lock nor rcu_readlock() are held between allocation and the actual TCP info filling, the TCP-related space requirement may change before reaching tcp_diag_put_md5sig(). I.e., the number of TCP-MD5 keys on a socket. Thankfully, TCP-MD5-diag won't overwrite the skb, but will return EMSGSIZE, triggering WARN_ON() in inet_diag_dump_one_icsk(). 3. Inet-diag "do" request* can create skb of any message required size. But "dump" request* the skb size, since d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()") is limited by 32 KB. Having in mind that sizeof(struct tcp_diag_md5sig) = 100 bytes, dumps for sockets that have more than 327 keys are going to fail (not counting other diag infos, which lower this limit futher). That is much lower than the number of TCP-MD5 keys that can be allocated on a socket with the current default optmem_max limit (128Kb). So, then I went and written selftests for TCP-MD5-diag and besides confirming that (2) and (3) are not theoretical issues, I also discovered another issues, that I didn't notice on code inspection: 4. nlattr::nla_len is __u16, which limits the largest netlink attibute by 64Kb or by 655 tcp_diag_md5sig keys in the diag array. What happens de-facto is that the netlink attribute gets u16 overflow, breaking the userspace parsing - RTA_NEXT(), that should point to the next attribute, points into the middle of md5 keys array. In this patch set issues (2) and (4) are addressed. (2) by not returning EMSGSIZE when the dump raced with modifying TCP-MD5 keys on a socket, but mark the dump inconsistent by setting NLM_F_DUMP_INTR nlmsg flag. Which changes uAPI in situations where previously kernel did WARN() and errored the dump. (4) by artificially limiting the maximum attribute size by U16_MAX - 1. In order to remove the new limit from (4) solution, my plan is to convert the dump of TCP-MD5 keys from an array to NL_ATTR_TYPE_NESTED_ARRAY (or alike), which should also address (1). And for (3), it's needed to teach tcp-diag how-to remember not only socket on which previous recvmsg() stopped, but potentially TCP-MD5 key as well. I plan in the next part of patch set address (3), (1) and the new limit for (4), together with adding new TCP-AO-diag. * Terminology from Documentation/userspace-api/netlink/intro.rst Signed-off-by: Dmitry Safonov <0x7f454c46(a)gmail.com> --- Dmitry Safonov (5): net/diag: Do not race on dumping MD5 keys with adding new MD5 keys net/diag: Warn only once on EMSGSIZE net/diag: Pre-allocate optional info only if requested net/diag: Always pre-allocate tcp_ulp info net/netlink: Correct the comment on netlink message max cap include/linux/inet_diag.h | 3 +- include/net/tcp.h | 1 - net/ipv4/inet_diag.c | 89 ++++++++++++++++++++++++++++++++++++++--------- net/ipv4/tcp_diag.c | 68 ++++++++++++++++++------------------ net/mptcp/diag.c | 20 ----------- net/netlink/af_netlink.c | 4 +-- net/tls/tls_main.c | 17 --------- 7 files changed, 110 insertions(+), 92 deletions(-) --- base-commit: f1b785f4c7870c42330b35522c2514e39a1e28e7 change-id: 20241106-tcp-md5-diag-prep-2f0dcf371d90 Best regards, -- Dmitry Safonov <0x7f454c46(a)gmail.com>

6 months, 3 weeks

5
8
0 0

[PATCH] fs/ceph/file: fix memory leaks in __ceph_sync_read()

by Max Kellermann

In two `break` statements, the call to ceph_release_page_vector() was missing, leaking the allocation from ceph_alloc_page_vector(). Cc: stable(a)vger.kernel.org Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com> --- fs/ceph/file.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 4b8d59ebda00..24d0f1cc9aac 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1134,6 +1134,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, extent_cnt = __ceph_sparse_read_ext_count(inode, read_len); ret = ceph_alloc_sparse_ext_map(op, extent_cnt); if (ret) { + ceph_release_page_vector(pages, num_pages); ceph_osdc_put_request(req); break; } @@ -1168,6 +1169,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, op->extent.sparse_ext_cnt); if (fret < 0) { ret = fret; + ceph_release_page_vector(pages, num_pages); ceph_osdc_put_request(req); break; } -- 2.45.2

6 months, 3 weeks

3
21
0 0

[PATCH 6.6.x] btrfs: add cancellation points to trim loops

by David Sterba

From: Luca Stefani <luca.stefani.ge1(a)gmail.com> There are reports that system cannot suspend due to running trim because the task responsible for trimming the device isn't able to finish in time, especially since we have a free extent discarding phase, which can trim a lot of unallocated space. There are no limits on the trim size (unlike the block group part). Since trime isn't a critical call it can be interrupted at any time, in such cases we stop the trim, report the amount of discarded bytes and return an error. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180 Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737 CC: stable(a)vger.kernel.org # 5.15+ Signed-off-by: Luca Stefani <luca.stefani.ge1(a)gmail.com> Reviewed-by: David Sterba <dsterba(a)suse.com> Signed-off-by: David Sterba <dsterba(a)suse.com> --- fs/btrfs/extent-tree.c | 7 ++++++- fs/btrfs/free-space-cache.c | 4 ++-- fs/btrfs/free-space-cache.h | 7 +++++++ 3 files changed, 15 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index b3680e1c7054..599407120513 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1319,6 +1319,11 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, start += bytes_to_discard; bytes_left -= bytes_to_discard; *discarded_bytes += bytes_to_discard; + + if (btrfs_trim_interrupted()) { + ret = -ERESTARTSYS; + break; + } } return ret; @@ -6094,7 +6099,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed) start += len; *trimmed += bytes; - if (fatal_signal_pending(current)) { + if (btrfs_trim_interrupted()) { ret = -ERESTARTSYS; break; } diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index 3bcf4a30cad7..9a6ec9344c3e 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -3808,7 +3808,7 @@ static int trim_no_bitmap(struct btrfs_block_group *block_group, if (async && *total_trimmed) break; - if (fatal_signal_pending(current)) { + if (btrfs_trim_interrupted()) { ret = -ERESTARTSYS; break; } @@ -3999,7 +3999,7 @@ static int trim_bitmaps(struct btrfs_block_group *block_group, } block_group->discard_cursor = start; - if (fatal_signal_pending(current)) { + if (btrfs_trim_interrupted()) { if (start != offset) reset_trimming_bitmap(ctl, offset); ret = -ERESTARTSYS; diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h index 33b4da3271b1..bd80c7b2af96 100644 --- a/fs/btrfs/free-space-cache.h +++ b/fs/btrfs/free-space-cache.h @@ -6,6 +6,8 @@ #ifndef BTRFS_FREE_SPACE_CACHE_H #define BTRFS_FREE_SPACE_CACHE_H +#include <linux/freezer.h> + /* * This is the trim state of an extent or bitmap. * @@ -43,6 +45,11 @@ static inline bool btrfs_free_space_trimming_bitmap( return (info->trim_state == BTRFS_TRIM_STATE_TRIMMING); } +static inline bool btrfs_trim_interrupted(void) +{ + return fatal_signal_pending(current) || freezing(current); +} + /* * Deltas are an effective way to populate global statistics. Give macro names * to make it clear what we're doing. An example is discard_extents in -- 2.45.0

6 months, 3 weeks

5
6
0 0

please revert backport of 44c76825d6eefee9eb7ce06c38e1a6632ac7eb7d

by Kees Cook

Hi stable tree maintainers, Please revert the backports of 44c76825d6ee ("x86: Increase brk randomness entropy for 64-bit systems") namely: 5.4: 03475167fda50b8511ef620a27409b08365882e1 5.10: 25d31baf922c1ee987efd6fcc9c7d4ab539c66b4 5.15: 06cb3463aa58906cfff72877eb7f50cb26e9ca93 6.1: b0cde867b80a5e81fcbc0383e138f5845f2005ee 6.6: 1a45994fb218d93dec48a3a86f68283db61e0936 There seems to be a bad interaction between this change and older PIE-built qemu-user-static (for aarch64) binaries[1]. Investigation continues to see if this will need to be reverted from 6.6, 6.11, and mainline. But for now, it's clearly a problem for older kernels with older qemu. Thanks! -Kees [1] https://lore.kernel.org/all/202411201000.F3313C02@keescook/ -- Kees Cook

6 months, 3 weeks

5
6
0 0

[PATCH v2 0/2] drm/msm/dpu: two fixes targeting 6.11

by Dmitry Baryshkov

Leonard Lausen reported an issue with suspend/resume of the sc7180 devices. Fix the WB atomic check, which caused the issue. Also make sure that DPU debugging logs are always directed to the drm_debug / DRIVER so that usual drm.debug masks work in an expected way. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org> --- Changes in v2: - Reworked the writeback to just drop the connector->status check. - Expanded commit message for the debugging patch. - Link to v1: https://lore.kernel.org/r/20240709-dpu-fix-wb-v1-0-448348bfd4cb@linaro.org --- Dmitry Baryshkov (2): drm/msm/dpu1: don't choke on disabling the writeback connector drm/msm/dpu: don't play tricks with debug macros drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 14 ++------------ drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c | 3 --- 2 files changed, 2 insertions(+), 15 deletions(-) --- base-commit: 668d33c9ff922c4590c58754ab064aaf53c387dd change-id: 20240709-dpu-fix-wb-6cd57e3eb182 Best regards, -- Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>

6 months, 3 weeks

5
19
0 0

[PATCH v2] mm: Respect mmap hint address when aligning for THP

by Kalesh Singh

Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries") updated __get_unmapped_area() to align the start address for the VMA to a PMD boundary if CONFIG_TRANSPARENT_HUGEPAGE=y. It does this by effectively looking up a region that is of size, request_size + PMD_SIZE, and aligning up the start to a PMD boundary. Commit 4ef9ad19e176 ("mm: huge_memory: don't force huge page alignment on 32 bit") opted out of this for 32bit due to regressions in mmap base randomization. Commit d4148aeab412 ("mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes") restricted this to only mmap sizes that are multiples of the PMD_SIZE due to reported regressions in some performance benchmarks -- which seemed mostly due to the reduced spatial locality of related mappings due to the forced PMD-alignment. Another unintended side effect has emerged: When a user specifies an mmap hint address, the THP alignment logic modifies the behavior, potentially ignoring the hint even if a sufficiently large gap exists at the requested hint location. Example Scenario: Consider the following simplified virtual address (VA) space: ... 0x200000-0x400000 --- VMA A 0x400000-0x600000 --- Hole 0x600000-0x800000 --- VMA B ... A call to mmap() with hint=0x400000 and len=0x200000 behaves differently: - Before THP alignment: The requested region (size 0x200000) fits into the gap at 0x400000, so the hint is respected. - After alignment: The logic searches for a region of size 0x400000 (len + PMD_SIZE) starting at 0x400000. This search fails due to the mapping at 0x600000 (VMA B), and the hint is ignored, falling back to arch_get_unmapped_area[_topdown](). In general the hint is effectively ignored, if there is any existing mapping in the below range: [mmap_hint + mmap_size, mmap_hint + mmap_size + PMD_SIZE) This changes the semantics of mmap hint; from ""Respect the hint if a sufficiently large gap exists at the requested location" to "Respect the hint only if an additional PMD-sized gap exists beyond the requested size". This has performance implications for allocators that allocate their heap using mmap but try to keep it "as contiguous as possible" by using the end of the exisiting heap as the address hint. With the new behavior it's more likely to get a much less contiguous heap, adding extra fragmentation and performance overhead. To restore the expected behavior; don't use thp_get_unmapped_area_vmflags() when the user provided a hint address, for anonymous mappings. Note: As, Yang Shi, pointed out: the issue still remains for filesystems which are using thp_get_unmapped_area() for their get_unmapped_area() op. It is unclear what worklaods will regress for if we ignore THP alignment when the hint address is provided for such file backed mappings -- so this fix will be handled separately. Cc: Andrew Morton <akpm(a)linux-foundation.org> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: Yang Shi <yang(a)os.amperecomputing.com> Cc: Rik van Riel <riel(a)surriel.com> Cc: Ryan Roberts <ryan.roberts(a)arm.com> Cc: Suren Baghdasaryan <surenb(a)google.com> Cc: Minchan Kim <minchan(a)kernel.org> Cc: Hans Boehm <hboehm(a)google.com> Cc: Lokesh Gidra <lokeshgidra(a)google.com> Cc: <stable(a)vger.kernel.org> Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries") Signed-off-by: Kalesh Singh <kaleshsingh(a)google.com> Reviewed-by: Rik van Riel <riel(a)surriel.com> Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz> --- Changes in v2: - Clarify the handling of file backed mappings, as highlighted by Yang - Collect Vlastimil's and Rik's Reviewed-by's mm/mmap.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/mmap.c b/mm/mmap.c index 79d541f1502b..2f01f1a8e304 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -901,6 +901,7 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, if (get_area) { addr = get_area(file, addr, len, pgoff, flags); } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) + && !addr /* no hint */ && IS_ALIGNED(len, PMD_SIZE)) { /* Ensures that larger anonymous mappings are THP aligned. */ addr = thp_get_unmapped_area_vmflags(file, addr, len, base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623 -- 2.47.0.338.g60cca15819-goog

6 months, 3 weeks

4
4
0 0

[PATCH v2] ARM: dts: dra7: Add bus_dma_limit for l4 cfg bus

by Romain Naour

From: Romain Naour <romain.naour(a)skf.com> A bus_dma_limit was added for l3 bus by commit cfb5d65f2595 ("ARM: dts: dra7: Add bus_dma_limit for L3 bus") to fix an issue observed only with SATA on DRA7-EVM with 4GB RAM and CONFIG_ARM_LPAE enabled. Since kernel 5.13, the SATA issue can be reproduced again following the SATA node move from L3 bus to L4_cfg in commit 8af15365a368 ("ARM: dts: Configure interconnect target module for dra7 sata"). Fix it by adding an empty dma-ranges property to l4_cfg and segment@100000 nodes (parent device tree node of SATA controller) to inherit the 2GB dma ranges limit from l3 bus node. Note: A similar fix was applied for PCIe controller by commit 90d4d3f4ea45 ("ARM: dts: dra7: Fix bus_dma_limit for PCIe"). Fixes: 8af15365a368 ("ARM: dts: Configure interconnect target module for dra7 sata"). Link: https://lore.kernel.org/linux-omap/c583e1bb-f56b-4489-8012-ce742e85f233@smi… Cc: <stable(a)vger.kernel.org> # 5.13 Signed-off-by: Romain Naour <romain.naour(a)skf.com> --- v2: add stable tag --- arch/arm/boot/dts/ti/omap/dra7-l4.dtsi | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm/boot/dts/ti/omap/dra7-l4.dtsi b/arch/arm/boot/dts/ti/omap/dra7-l4.dtsi index 6e67d99832ac..ba7fdaae9c6e 100644 --- a/arch/arm/boot/dts/ti/omap/dra7-l4.dtsi +++ b/arch/arm/boot/dts/ti/omap/dra7-l4.dtsi @@ -12,6 +12,7 @@ &l4_cfg { /* 0x4a000000 */ ranges = <0x00000000 0x4a000000 0x100000>, /* segment 0 */ <0x00100000 0x4a100000 0x100000>, /* segment 1 */ <0x00200000 0x4a200000 0x100000>; /* segment 2 */ + dma-ranges; segment@0 { /* 0x4a000000 */ compatible = "simple-pm-bus"; @@ -557,6 +558,7 @@ segment@100000 { /* 0x4a100000 */ <0x0007e000 0x0017e000 0x001000>, /* ap 124 */ <0x00059000 0x00159000 0x001000>, /* ap 125 */ <0x0005a000 0x0015a000 0x001000>; /* ap 126 */ + dma-ranges; target-module@2000 { /* 0x4a102000, ap 27 3c.0 */ compatible = "ti,sysc"; -- 2.45.0

6 months, 3 weeks

2
1
0 0

[PATCH] cpufreq: fix using cpufreq-dt as module

by Andreas Kemnade

E.g. omap2plus_defconfig compiles cpufreq-dt as module. As there is no module alias nor a module_init(), cpufreq-dt-platdev will not be used and therefore on several omap platforms there is no cpufreq. Enforce builtin compile of cpufreq-dt-platdev to make it effective. Fixes: 3b062a086984 ("cpufreq: dt-platdev: Support building as module") Cc: stable(a)vger.kernel.org Signed-off-by: Andreas Kemnade <andreas(a)kemnade.info> --- drivers/cpufreq/Kconfig | 2 +- drivers/cpufreq/cpufreq-dt-platdev.c | 2 -- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig index 2561b215432a8..4547adf5d2a7d 100644 --- a/drivers/cpufreq/Kconfig +++ b/drivers/cpufreq/Kconfig @@ -218,7 +218,7 @@ config CPUFREQ_DT If in doubt, say N. config CPUFREQ_DT_PLATDEV - tristate "Generic DT based cpufreq platdev driver" + bool "Generic DT based cpufreq platdev driver" depends on OF help This adds a generic DT based cpufreq platdev driver for frequency diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c index 18942bfe9c95f..78ad3221fe077 100644 --- a/drivers/cpufreq/cpufreq-dt-platdev.c +++ b/drivers/cpufreq/cpufreq-dt-platdev.c @@ -234,5 +234,3 @@ static int __init cpufreq_dt_platdev_init(void) sizeof(struct cpufreq_dt_platform_data))); } core_initcall(cpufreq_dt_platdev_init); -MODULE_DESCRIPTION("Generic DT based cpufreq platdev driver"); -MODULE_LICENSE("GPL"); -- 2.39.2

6 months, 3 weeks

3
4
0 0

[PATCH] nvme-pci: Remove O2 Queue Depth quirk

by Gwendal Grignou

PCI_DEVICE(0x1217, 0x8760) (O2 Micro, Inc. FORESEE E2M2 NVMe SSD) is a NMVe to eMMC bridge, that can be used with different eMMC memory devices. The NVMe device name contains the eMMC device name, for instance: `BAYHUB SanDisk-DA4128-91904055-128GB` The bridge is known to work with many eMMC devices, we need to limit the queue depth once we know which eMMC device is behind the bridge. Fixes: commit 83bdfcbdbe5d ("nvme-pci: qdepth 1 quirk") Cc: stable(a)vger.kernel.org Signed-off-by: Gwendal Grignou <gwendal(a)chromium.org> --- drivers/nvme/host/pci.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 4b9fda0b1d9a3..1c908e129fddf 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3448,8 +3448,6 @@ static const struct pci_device_id nvme_id_table[] = { NVME_QUIRK_BOGUS_NID, }, { PCI_VDEVICE(REDHAT, 0x0010), /* Qemu emulated controller */ .driver_data = NVME_QUIRK_BOGUS_NID, }, - { PCI_DEVICE(0x1217, 0x8760), /* O2 Micro 64GB Steam Deck */ - .driver_data = NVME_QUIRK_QDEPTH_ONE }, { PCI_DEVICE(0x126f, 0x2262), /* Silicon Motion generic */ .driver_data = NVME_QUIRK_NO_DEEPEST_PS | NVME_QUIRK_BOGUS_NID, }, -- 2.47.0.163.g1226f6d8fa-goog

6 months, 3 weeks

5
7
0 0

drm/amd/display: Pass pwrseq inst for backlight and ABM

by Alex Deucher

Hi Greg, Sasha, Please cherry pick upstream commit b17ef04bf3a4 ("drm/amd/display: Pass pwrseq inst for backlight and ABM") to stable kernel 6.6.x and newer. This fixes broken backlight adjustment on some AMD platforms with eDP panels. Thanks, Alex

6 months, 3 weeks

3
7
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror November 2024