July 2025 - Linux-stable-mirror

[PATCH] comedi: Fix use of uninitialized memory in do_insn_ioctl() and do_insnlist_ioctl()

by Ian Abbott

syzbot reports a KMSAN kernel-infoleak in `do_insn_ioctl()`. A kernel buffer is allocated to hold `insn->n` samples (each of which is an `unsigned int`). For some instruction types, `insn->n` samples are copied back to user-space, unless an error code is being returned. The problem is that not all the instruction handlers that need to return data to userspace fill in the whole `insn->n` samples, so that there is an information leak. There is a similar syzbot report for `do_insnlist_ioctl()`, although it does not have a reproducer for it at the time of writing. One culprit is `insn_rw_emulate_bits()` which is used as the handler for `INSN_READ` or `INSN_WRITE` instructions for subdevices that do not have a specific handler for that instruction, but do have an `INSN_BITS` handler. For `INSN_READ` it only fills in at most 1 sample, so if `insn->n` is greater than 1, the remaining `insn->n - 1` samples copied to userspace will be uninitialized kernel data. Another culprit is `vm80xx_ai_insn_read()` in the "vm80xx" driver. It never returns an error, even if it fails to fill the buffer. Fix it in `do_insn_ioctl()` and `do_insnlist_ioctl()` by making sure that uninitialized parts of the allocated buffer are zeroed before handling each instruction. Thanks to Arnaud Lecomte for their fix to `do_insn_ioctl()`. That fix replaced the call to `kmalloc_array()` with `kcalloc()`, but it is not always necessary to clear the whole buffer. Fixes: ed9eccbe8970 ("Staging: add comedi core") Reported-by: syzbot+a5e45f768aab5892da5d(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a5e45f768aab5892da5d Reported-by: syzbot+fb4362a104d45ab09cf9(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fb4362a104d45ab09cf9 Cc: <stable(a)vger.kernel.org> #5.13+ Cc: Arnaud Lecomte <contact(a)arnaud-lcm.com> Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk> --- Cherry picks to fix this for 5.4.y and 5.10.y not yet available. --- drivers/comedi/comedi_fops.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/comedi/comedi_fops.c b/drivers/comedi/comedi_fops.c index 23b7178522ae..7e2f2b1a1c36 100644 --- a/drivers/comedi/comedi_fops.c +++ b/drivers/comedi/comedi_fops.c @@ -1587,6 +1587,9 @@ static int do_insnlist_ioctl(struct comedi_device *dev, memset(&data[n], 0, (MIN_SAMPLES - n) * sizeof(unsigned int)); } + } else { + memset(data, 0, max_t(unsigned int, n, MIN_SAMPLES) * + sizeof(unsigned int)); } ret = parse_insn(dev, insns + i, data, file); if (ret < 0) @@ -1670,6 +1673,8 @@ static int do_insn_ioctl(struct comedi_device *dev, memset(&data[insn->n], 0, (MIN_SAMPLES - insn->n) * sizeof(unsigned int)); } + } else { + memset(data, 0, n_data * sizeof(unsigned int)); } ret = parse_insn(dev, insn, data, file); if (ret < 0) -- 2.47.2

2 months, 2 weeks

1
0
0 0

[stable backport request] x86/ioremap: Use is_ioremap_addr() in iounmap() for 5.15.y

by Arin Sun

Dear Stable Kernel Team and Maintainers, I am writing to request a backport of the following commit from the mainline kernel to the 5.15.y stable branch: Commit: x86/ioremap: Use is_ioremap_addr() in iounmap() ID: 50c6dbdfd16e312382842198a7919341ad480e05 Author: Max Ramanouski Merged in: Linux 6.11-rc1 (approximately August 2024) This commit fixes a bug in the iounmap() function for x86 architectures in kernel versions 5.x. Specifically, the original code uses a check against high_memory: if ((void __force *)addr <= high_memory) return; This can lead to memory leaks on certain x86 servers where ioremap() returns addresses that are not guaranteed to be greater than high_memory, causing the function to return early without properly unmapping the memory. The fix replaces this with is_ioremap_addr(), making the check more reliable: if (WARN_ON_ONCE(!is_ioremap_addr((void __force *)addr))) return; I have checked the 5.15.y branch logs and did not find this backport. This issue affects production environments, particularly on customer machines where we cannot easily deploy custom kernels. Backporting this to 5.15.y (and possibly other LTS branches like 5.10.y if applicable) would help resolve the memory leak without requiring users to upgrade to 6.x series. Do you have plans to backport this commit? If not, could you please consider it for inclusion in the stable releases? Thank you for your time and efforts in maintaining the stable kernels. Best regards, xin.sun

2 months, 2 weeks

2
1
0 0

[PATCH 6.6.y] arm64/cpufeatures/kvm: Add ARMv8.9 FEAT_ECBHB bits in ID_AA64MMFR1 register

by Roy, Patrick

From: Nianyao Tang <tangnianyao(a)huawei.com> [ upstream commit e8cde32f111f7f5681a7bad3ec747e9e697569a9 ] Enable ECBHB bits in ID_AA64MMFR1 register as per ARM DDI 0487K.a specification. When guest OS read ID_AA64MMFR1_EL1, kvm emulate this reg using ftr_id_aa64mmfr1 and always return ID_AA64MMFR1_EL1.ECBHB=0 to guest. It results in guest syscall jump to tramp ventry, which is not needed in implementation with ID_AA64MMFR1_EL1.ECBHB=1. Let's make the guest syscall process the same as the host. This fixes performance regressions introduced by commit 4117975672c4 ("arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists") for guests running on neoverse v2 hardware, which supports ECBHB. Signed-off-by: Nianyao Tang <tangnianyao(a)huawei.com> Link: https://lore.kernel.org/r/20240611122049.2758600-1-tangnianyao@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com> Signed-off-by: Patrick Roy <roypat(a)amazon.co.uk> --- arch/arm64/kernel/cpufeature.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index b6d381f743f3..2ce9ef9d924a 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -364,6 +364,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr0[] = { }; static const struct arm64_ftr_bits ftr_id_aa64mmfr1[] = { + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_ECBHB_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_TIDCP1_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_AFP_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_HCX_SHIFT, 4, 0), -- 2.50.1

2 months, 2 weeks

1
0
0 0

[PATCH] block: Fix bounce check logic in blk_queue_may_bounce()

by Hardeep Sharma

Buffer bouncing is needed only when memory exists above the lowmem region, i.e., when max_low_pfn < max_pfn. The previous check (max_low_pfn >= max_pfn) was inverted and prevented bouncing when it could actually be required. Note that bouncing depends on CONFIG_HIGHMEM, which is typically enabled on 32-bit ARM where not all memory is permanently mapped into the kernel’s lowmem region. Fixes: 9bb33f24abbd0 ("block: refactor the bounce buffering code") Cc: stable(a)vger.kernel.org Signed-off-by: Hardeep Sharma <quic_hardshar(a)quicinc.com> --- block/blk.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk.h b/block/blk.h index 67915b04b3c1..f8a1d64be5a2 100644 --- a/block/blk.h +++ b/block/blk.h @@ -383,7 +383,7 @@ static inline bool blk_queue_may_bounce(struct request_queue *q) { return IS_ENABLED(CONFIG_BOUNCE) && q->limits.bounce == BLK_BOUNCE_HIGH && - max_low_pfn >= max_pfn; + max_low_pfn < max_pfn; } static inline struct bio *blk_queue_bounce(struct bio *bio, -- 2.25.1

2 months, 2 weeks

1
0
0 0

[PATCH] sprintf.h requires stdarg.h

by Suchit Karunakaran

From: Stephen Rothwell <sfr(a)canb.auug.org.au> In file included from drivers/crypto/intel/qat/qat_common/adf_pm_dbgfs_utils.c:4: include/linux/sprintf.h:11:54: error: unknown type name 'va_list' 11 | __printf(2, 0) int vsprintf(char *buf, const char *, va_list); | ^~~~~~~ include/linux/sprintf.h:1:1: note: 'va_list' is defined in header '<stdarg.h>'; this is probably fixable by adding '#include <stdarg.h>' Link: https://lkml.kernel.org/r/20250721173754.42865913@canb.auug.org.au Fixes: 39ced19b9e60 ("lib/vsprintf: split out sprintf() and friends") Signed-off-by: Stephen Rothwell <sfr(a)canb.auug.org.au> Cc: Andriy Shevchenko <andriy.shevchenko(a)linux.intel.com> Cc: Herbert Xu <herbert(a)gondor.apana.org.au> Cc: Petr Mladek <pmladek(a)suse.com> Cc: Steven Rostedt <rostedt(a)goodmis.org> Cc: Rasmus Villemoes <linux(a)rasmusvillemoes.dk> Cc: Sergey Senozhatsky <senozhatsky(a)chromium.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> Signed-off-by: Suchit Karunakaran <suchitkarunakaran(a)gmail.com> --- include/linux/sprintf.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h index 51cab2def9ec..876130091384 100644 --- a/include/linux/sprintf.h +++ b/include/linux/sprintf.h @@ -4,6 +4,7 @@ #include <linux/compiler_attributes.h> #include <linux/types.h> +#include <linux/stdarg.h> int num_to_str(char *buf, int size, unsigned long long num, unsigned int width); -- 2.39.5

2 months, 2 weeks

1
0
0 0

[PATCH v5 1/8] mm/shmem, swap: improve cached mTHP handling and fix potential hung

by Kairui Song

From: Kairui Song <kasong(a)tencent.com> The current swap-in code assumes that, when a swap entry in shmem mapping is order 0, its cached folios (if present) must be order 0 too, which turns out not always correct. The problem is shmem_split_large_entry is called before verifying the folio will eventually be swapped in, one possible race is: CPU1 CPU2 shmem_swapin_folio /* swap in of order > 0 swap entry S1 */ folio = swap_cache_get_folio /* folio = NULL */ order = xa_get_order /* order > 0 */ folio = shmem_swap_alloc_folio /* mTHP alloc failure, folio = NULL */ <... Interrupted ...> shmem_swapin_folio /* S1 is swapped in */ shmem_writeout /* S1 is swapped out, folio cached */ shmem_split_large_entry(..., S1) /* S1 is split, but the folio covering it has order > 0 now */ Now any following swapin of S1 will hang: `xa_get_order` returns 0, and folio lookup will return a folio with order > 0. The `xa_get_order(&mapping->i_pages, index) != folio_order(folio)` will always return false causing swap-in to return -EEXIST. And this looks fragile. So fix this up by allowing seeing a larger folio in swap cache, and check the whole shmem mapping range covered by the swapin have the right swap value upon inserting the folio. And drop the redundant tree walks before the insertion. This will actually improve performance, as it avoids two redundant Xarray tree walks in the hot path, and the only side effect is that in the failure path, shmem may redundantly reallocate a few folios causing temporary slight memory pressure. And worth noting, it may seems the order and value check before inserting might help reducing the lock contention, which is not true. The swap cache layer ensures raced swapin will either see a swap cache folio or failed to do a swapin (we have SWAP_HAS_CACHE bit even if swap cache is bypassed), so holding the folio lock and checking the folio flag is already good enough for avoiding the lock contention. The chance that a folio passes the swap entry value check but the shmem mapping slot has changed should be very low. Fixes: 809bc86517cc ("mm: shmem: support large folio swap out") Signed-off-by: Kairui Song <kasong(a)tencent.com> Reviewed-by: Kemeng Shi <shikemeng(a)huaweicloud.com> Reviewed-by: Baolin Wang <baolin.wang(a)linux.alibaba.com> Tested-by: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: <stable(a)vger.kernel.org> --- mm/shmem.c | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index 334b7b4a61a0..e3c9a1365ff4 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -884,7 +884,9 @@ static int shmem_add_to_page_cache(struct folio *folio, pgoff_t index, void *expected, gfp_t gfp) { XA_STATE_ORDER(xas, &mapping->i_pages, index, folio_order(folio)); - long nr = folio_nr_pages(folio); + unsigned long nr = folio_nr_pages(folio); + swp_entry_t iter, swap; + void *entry; VM_BUG_ON_FOLIO(index != round_down(index, nr), folio); VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); @@ -896,14 +898,24 @@ static int shmem_add_to_page_cache(struct folio *folio, gfp &= GFP_RECLAIM_MASK; folio_throttle_swaprate(folio, gfp); + swap = iter = radix_to_swp_entry(expected); do { xas_lock_irq(&xas); - if (expected != xas_find_conflict(&xas)) { - xas_set_err(&xas, -EEXIST); - goto unlock; + xas_for_each_conflict(&xas, entry) { + /* + * The range must either be empty, or filled with + * expected swap entries. Shmem swap entries are never + * partially freed without split of both entry and + * folio, so there shouldn't be any holes. + */ + if (!expected || entry != swp_to_radix_entry(iter)) { + xas_set_err(&xas, -EEXIST); + goto unlock; + } + iter.val += 1 << xas_get_order(&xas); } - if (expected && xas_find_conflict(&xas)) { + if (expected && iter.val - nr != swap.val) { xas_set_err(&xas, -EEXIST); goto unlock; } @@ -2323,7 +2335,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index, error = -ENOMEM; goto failed; } - } else if (order != folio_order(folio)) { + } else if (order > folio_order(folio)) { /* * Swap readahead may swap in order 0 folios into swapcache * asynchronously, while the shmem mapping can still stores @@ -2348,15 +2360,15 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index, swap = swp_entry(swp_type(swap), swp_offset(swap) + offset); } + } else if (order < folio_order(folio)) { + swap.val = round_down(swap.val, 1 << folio_order(folio)); } alloced: /* We have to do this with folio locked to prevent races */ folio_lock(folio); if ((!skip_swapcache && !folio_test_swapcache(folio)) || - folio->swap.val != swap.val || - !shmem_confirm_swap(mapping, index, swap) || - xa_get_order(&mapping->i_pages, index) != folio_order(folio)) { + folio->swap.val != swap.val) { error = -EEXIST; goto unlock; } -- 2.50.0

2 months, 2 weeks

2
4
0 0

[merged mm-stable] mm-damon-ops-common-ignore-migration-request-to-invalid-nodes.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/damon/ops-common: ignore migration request to invalid nodes has been removed from the -mm tree. Its filename was mm-damon-ops-common-ignore-migration-request-to-invalid-nodes.patch This patch was dropped because it was merged into the mm-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: SeongJae Park <sj(a)kernel.org> Subject: mm/damon/ops-common: ignore migration request to invalid nodes Date: Sun, 20 Jul 2025 11:58:22 -0700 damon_migrate_pages() tries migration even if the target node is invalid. If users mistakenly make such invalid requests via DAMOS_MIGRATE_{HOT,COLD} action, the below kernel BUG can happen. [ 7831.883495] BUG: unable to handle page fault for address: 0000000000001f48 [ 7831.884160] #PF: supervisor read access in kernel mode [ 7831.884681] #PF: error_code(0x0000) - not-present page [ 7831.885203] PGD 0 P4D 0 [ 7831.885468] Oops: Oops: 0000 [#1] SMP PTI [ 7831.885852] CPU: 31 UID: 0 PID: 94202 Comm: kdamond.0 Not tainted 6.16.0-rc5-mm-new-damon+ #93 PREEMPT(voluntary) [ 7831.886913] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.el9 04/01/2014 [ 7831.887777] RIP: 0010:__alloc_frozen_pages_noprof (include/linux/mmzone.h:1724 include/linux/mmzone.h:1750 mm/page_alloc.c:4936 mm/page_alloc.c:5137) [...] [ 7831.895953] Call Trace: [ 7831.896195] <TASK> [ 7831.896397] __folio_alloc_noprof (mm/page_alloc.c:5183 mm/page_alloc.c:5192) [ 7831.896787] migrate_pages_batch (mm/migrate.c:1189 mm/migrate.c:1851) [ 7831.897228] ? __pfx_alloc_migration_target (mm/migrate.c:2137) [ 7831.897735] migrate_pages (mm/migrate.c:2078) [ 7831.898141] ? __pfx_alloc_migration_target (mm/migrate.c:2137) [ 7831.898664] damon_migrate_folio_list (mm/damon/ops-common.c:321 mm/damon/ops-common.c:354) [ 7831.899140] damon_migrate_pages (mm/damon/ops-common.c:405) [...] Add a target node validity check in damon_migrate_pages(). The validity check is stolen from that of do_pages_move(), which is being used for the move_pages() system call. Link: https://lkml.kernel.org/r/20250720185822.1451-1-sj@kernel.org Fixes: b51820ebea65 ("mm/damon/paddr: introduce DAMOS_MIGRATE_COLD action for demotion") [6.11.x] Signed-off-by: SeongJae Park <sj(a)kernel.org> Reviewed-by: Joshua Hahn <joshua.hahnjy(a)gmail.com> Cc: Honggyu Kim <honggyu.kim(a)sk.com> Cc: Hyeongtak Ji <hyeongtak.ji(a)sk.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/damon/ops-common.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/mm/damon/ops-common.c~mm-damon-ops-common-ignore-migration-request-to-invalid-nodes +++ a/mm/damon/ops-common.c @@ -383,6 +383,10 @@ unsigned long damon_migrate_pages(struct if (list_empty(folio_list)) return nr_migrated; + if (target_nid < 0 || target_nid >= MAX_NUMNODES || + !node_state(target_nid, N_MEMORY)) + return nr_migrated; + noreclaim_flag = memalloc_noreclaim_save(); nid = folio_nid(lru_to_folio(folio_list)); _ Patches currently in -mm which might be from sj(a)kernel.org are selftests-damon-sysfspy-stop-damon-for-dumping-failures.patch selftests-damon-_damon_sysfs-support-damos-watermarks-setup.patch selftests-damon-_damon_sysfs-support-damos-filters-setup.patch selftests-damon-_damon_sysfs-support-monitoring-intervals-goal-setup.patch selftests-damon-_damon_sysfs-support-damos-quota-weights-setup.patch selftests-damon-_damon_sysfs-support-damos-quota-goal-nid-setup.patch selftests-damon-_damon_sysfs-support-damos-action-dests-setup.patch selftests-damon-_damon_sysfs-support-damos-target_nid-setup.patch selftests-damon-_damon_sysfs-use-232-1-as-max-nr_accesses-and-age.patch selftests-damon-drgn_dump_damon_status-dump-damos-migrate_dests.patch selftests-damon-drgn_dump_damon_status-dump-ctx-opsid.patch selftests-damon-drgn_dump_damon_status-dump-damos-filters.patch selftests-damon-sysfspy-generalize-damos-watermarks-commit-assertion.patch selftests-damon-sysfspy-generalize-damosquota-commit-assertion.patch selftests-damon-sysfspy-test-quota-goal-commitment.patch selftests-damon-sysfspy-test-damos-destinations-commitment.patch selftests-damon-sysfspy-generalize-damos-scheme-commit-assertion.patch selftests-damon-sysfspy-test-damos-filters-commitment.patch selftests-damon-sysfspy-generalize-damos-schemes-commit-assertion.patch selftests-damon-sysfspy-generalize-monitoring-attributes-commit-assertion.patch selftests-damon-sysfspy-generalize-damon-context-commit-assertion.patch selftests-damon-sysfspy-test-non-default-parameters-runtime-commit.patch selftests-damon-sysfspy-test-runtime-reduction-of-damon-parameters.patch

2 months, 2 weeks

1
0
0 0

[merged mm-stable] mm-swap-fix-potensial-buffer-overflow-in-setup_clusters.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm: swap: fix potential buffer overflow in setup_clusters() has been removed from the -mm tree. Its filename was mm-swap-fix-potensial-buffer-overflow-in-setup_clusters.patch This patch was dropped because it was merged into the mm-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Kemeng Shi <shikemeng(a)huaweicloud.com> Subject: mm: swap: fix potential buffer overflow in setup_clusters() Date: Thu, 22 May 2025 20:25:53 +0800 In setup_swap_map(), we only ensure badpages are in range (0, last_page]. As maxpages might be < last_page, setup_clusters() will encounter a buffer overflow when a badpage is >= maxpages. Only call inc_cluster_info_page() for badpage which is < maxpages to fix the issue. Link: https://lkml.kernel.org/r/20250522122554.12209-4-shikemeng@huaweicloud.com Fixes: b843786b0bd0 ("mm: swapfile: fix SSD detection with swapfile on btrfs") Signed-off-by: Kemeng Shi <shikemeng(a)huaweicloud.com> Reviewed-by: Baoquan He <bhe(a)redhat.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Kairui Song <kasong(a)tencent.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/swapfile.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) --- a/mm/swapfile.c~mm-swap-fix-potensial-buffer-overflow-in-setup_clusters +++ a/mm/swapfile.c @@ -3208,9 +3208,13 @@ static struct swap_cluster_info *setup_c * and the EOF part of the last cluster. */ inc_cluster_info_page(si, cluster_info, 0); - for (i = 0; i < swap_header->info.nr_badpages; i++) - inc_cluster_info_page(si, cluster_info, - swap_header->info.badpages[i]); + for (i = 0; i < swap_header->info.nr_badpages; i++) { + unsigned int page_nr = swap_header->info.badpages[i]; + + if (page_nr >= maxpages) + continue; + inc_cluster_info_page(si, cluster_info, page_nr); + } for (i = maxpages; i < round_up(maxpages, SWAPFILE_CLUSTER); i++) inc_cluster_info_page(si, cluster_info, i); _ Patches currently in -mm which might be from shikemeng(a)huaweicloud.com are

2 months, 2 weeks

1
0
0 0

[merged mm-stable] mm-swap-correctly-use-maxpages-in-swapon-syscall-to-avoid-potensial-deadloop.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm: swap: correctly use maxpages in swapon syscall to avoid potential deadloop has been removed from the -mm tree. Its filename was mm-swap-correctly-use-maxpages-in-swapon-syscall-to-avoid-potensial-deadloop.patch This patch was dropped because it was merged into the mm-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Kemeng Shi <shikemeng(a)huaweicloud.com> Subject: mm: swap: correctly use maxpages in swapon syscall to avoid potential deadloop Date: Thu, 22 May 2025 20:25:52 +0800 We use maxpages from read_swap_header() to initialize swap_info_struct, however the maxpages might be reduced in setup_swap_extents() and the si->max is assigned with the reduced maxpages from the setup_swap_extents(). Obviously, this could lead to memory waste as we allocated memory based on larger maxpages, besides, this could lead to a potential deadloop as following: 1) When calling setup_clusters() with larger maxpages, unavailable pages within range [si->max, larger maxpages) are not accounted with inc_cluster_info_page(). As a result, these pages are assumed available but can not be allocated. The cluster contains these pages can be moved to frag_clusters list after it's all available pages were allocated. 2) When the cluster mentioned in 1) is the only cluster in frag_clusters list, cluster_alloc_swap_entry() assume order 0 allocation will never failed and will enter a deadloop by keep trying to allocate page from the only cluster in frag_clusters which contains no actually available page. Call setup_swap_extents() to get the final maxpages before swap_info_struct initialization to fix the issue. After this change, span will include badblocks and will become large value which I think is correct value: In summary, there are two kinds of swapfile_activate operations. 1. Filesystem style: Treat all blocks logical continuity and find usable physical extents in logical range. In this way, si->pages will be actual usable physical blocks and span will be "1 + highest_block - lowest_block". 2. Block device style: Treat all blocks physically continue and only one single extent is added. In this way, si->pages will be si->max and span will be "si->pages - 1". Actually, si->pages and si->max is only used in block device style and span value is set with si->pages. As a result, span value in block device style will become a larger value as you mentioned. I think larger value is correct based on: 1. Span value in filesystem style is "1 + highest_block - lowest_block" which is the range cover all possible phisical blocks including the badblocks. 2. For block device style, si->pages is the actual usable block number and is already in pr_info. The original span value before this patch is also refer to usable block number which is redundant in pr_info. [shikemeng(a)huaweicloud.com: ensure si->pages == si->max - 1 after setup_swap_extents()] Link: https://lkml.kernel.org/r/20250522122554.12209-3-shikemeng@huaweicloud.com Link: https://lkml.kernel.org/r/20250718065139.61989-1-shikemeng@huaweicloud.com Link: https://lkml.kernel.org/r/20250522122554.12209-3-shikemeng@huaweicloud.com Fixes: 661383c6111a ("mm: swap: relaim the cached parts that got scanned") Signed-off-by: Kemeng Shi <shikemeng(a)huaweicloud.com> Reviewed-by: Baoquan He <bhe(a)redhat.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Kairui Song <kasong(a)tencent.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/swapfile.c | 53 +++++++++++++++++++++++------------------------- 1 file changed, 26 insertions(+), 27 deletions(-) --- a/mm/swapfile.c~mm-swap-correctly-use-maxpages-in-swapon-syscall-to-avoid-potensial-deadloop +++ a/mm/swapfile.c @@ -3141,43 +3141,30 @@ static unsigned long read_swap_header(st return maxpages; } -static int setup_swap_map_and_extents(struct swap_info_struct *si, - union swap_header *swap_header, - unsigned char *swap_map, - unsigned long maxpages, - sector_t *span) +static int setup_swap_map(struct swap_info_struct *si, + union swap_header *swap_header, + unsigned char *swap_map, + unsigned long maxpages) { - unsigned int nr_good_pages; unsigned long i; - int nr_extents; - - nr_good_pages = maxpages - 1; /* omit header page */ + swap_map[0] = SWAP_MAP_BAD; /* omit header page */ for (i = 0; i < swap_header->info.nr_badpages; i++) { unsigned int page_nr = swap_header->info.badpages[i]; if (page_nr == 0 || page_nr > swap_header->info.last_page) return -EINVAL; if (page_nr < maxpages) { swap_map[page_nr] = SWAP_MAP_BAD; - nr_good_pages--; + si->pages--; } } - if (nr_good_pages) { - swap_map[0] = SWAP_MAP_BAD; - si->max = maxpages; - si->pages = nr_good_pages; - nr_extents = setup_swap_extents(si, span); - if (nr_extents < 0) - return nr_extents; - nr_good_pages = si->pages; - } - if (!nr_good_pages) { + if (!si->pages) { pr_warn("Empty swap-file\n"); return -EINVAL; } - return nr_extents; + return 0; } #define SWAP_CLUSTER_INFO_COLS \ @@ -3217,7 +3204,7 @@ static struct swap_cluster_info *setup_c * Mark unusable pages as unavailable. The clusters aren't * marked free yet, so no list operations are involved yet. * - * See setup_swap_map_and_extents(): header page, bad pages, + * See setup_swap_map(): header page, bad pages, * and the EOF part of the last cluster. */ inc_cluster_info_page(si, cluster_info, 0); @@ -3363,6 +3350,21 @@ SYSCALL_DEFINE2(swapon, const char __use goto bad_swap_unlock_inode; } + si->max = maxpages; + si->pages = maxpages - 1; + nr_extents = setup_swap_extents(si, &span); + if (nr_extents < 0) { + error = nr_extents; + goto bad_swap_unlock_inode; + } + if (si->pages != si->max - 1) { + pr_err("swap:%u != (max:%u - 1)\n", si->pages, si->max); + error = -EINVAL; + goto bad_swap_unlock_inode; + } + + maxpages = si->max; + /* OK, set up the swap map and apply the bad block list */ swap_map = vzalloc(maxpages); if (!swap_map) { @@ -3374,12 +3376,9 @@ SYSCALL_DEFINE2(swapon, const char __use if (error) goto bad_swap_unlock_inode; - nr_extents = setup_swap_map_and_extents(si, swap_header, swap_map, - maxpages, &span); - if (unlikely(nr_extents < 0)) { - error = nr_extents; + error = setup_swap_map(si, swap_header, swap_map, maxpages); + if (error) goto bad_swap_unlock_inode; - } /* * Use kvmalloc_array instead of bitmap_zalloc as the allocation order might _ Patches currently in -mm which might be from shikemeng(a)huaweicloud.com are

2 months, 2 weeks

1
0
0 0

[merged mm-stable] mm-swap-move-nr_swap_pages-counter-decrement-from-folio_alloc_swap-to-swap_range_alloc.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm: swap: move nr_swap_pages counter decrement from folio_alloc_swap() to swap_range_alloc() has been removed from the -mm tree. Its filename was mm-swap-move-nr_swap_pages-counter-decrement-from-folio_alloc_swap-to-swap_range_alloc.patch This patch was dropped because it was merged into the mm-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Kemeng Shi <shikemeng(a)huaweicloud.com> Subject: mm: swap: move nr_swap_pages counter decrement from folio_alloc_swap() to swap_range_alloc() Date: Thu, 22 May 2025 20:25:51 +0800 Patch series "Some randome fixes and cleanups to swapfile". Patch 0-3 are some random fixes. Patch 4 is a cleanup. More details can be found in respective patches. This patch (of 4): When folio_alloc_swap() encounters a failure in either mem_cgroup_try_charge_swap() or add_to_swap_cache(), nr_swap_pages counter is not decremented for allocated entry. However, the following put_swap_folio() will increase nr_swap_pages counter unpairly and lead to an imbalance. Move nr_swap_pages decrement from folio_alloc_swap() to swap_range_alloc() to pair the nr_swap_pages counting. Link: https://lkml.kernel.org/r/20250522122554.12209-1-shikemeng@huaweicloud.com Link: https://lkml.kernel.org/r/20250522122554.12209-2-shikemeng@huaweicloud.com Fixes: 0ff67f990bd4 ("mm, swap: remove swap slot cache") Signed-off-by: Kemeng Shi <shikemeng(a)huaweicloud.com> Reviewed-by: Kairui Song <kasong(a)tencent.com> Reviewed-by: Baoquan He <bhe(a)redhat.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/swapfile.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/swapfile.c~mm-swap-move-nr_swap_pages-counter-decrement-from-folio_alloc_swap-to-swap_range_alloc +++ a/mm/swapfile.c @@ -1115,6 +1115,7 @@ static void swap_range_alloc(struct swap if (vm_swap_full()) schedule_work(&si->reclaim_work); } + atomic_long_sub(nr_entries, &nr_swap_pages); } static void swap_range_free(struct swap_info_struct *si, unsigned long offset, @@ -1313,7 +1314,6 @@ int folio_alloc_swap(struct folio *folio if (add_to_swap_cache(folio, entry, gfp | __GFP_NOMEMALLOC, NULL)) goto out_free; - atomic_long_sub(size, &nr_swap_pages); return 0; out_free: _ Patches currently in -mm which might be from shikemeng(a)huaweicloud.com are

2 months, 2 weeks

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror July 2025