The quilt patch titled
Subject: mm: shmem: fix the shmem large folio allocation for the i915 driver
has been removed from the -mm tree. Its filename was
mm-shmem-fix-the-shmem-large-folio-allocation-for-the-i915-driver.patch
This patch was dropped because an alternative patch was or shall be merged
------------------------------------------------------
From: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Subject: mm: shmem: fix the shmem large folio allocation for the i915 driver
Date: Mon, 28 Jul 2025 16:03:53 +0800
After commit acd7ccb284b8 ("mm: shmem: add large folio support for
tmpfs"), we extend the 'huge=' option to allow any sized large folios for
tmpfs, which means tmpfs will allow getting a highest order hint based on
the size of write() and fallocate() paths, and then will try each
allowable large order.
However, when the i915 driver allocates shmem memory, it doesn't provide
hint information about the size of the large folio to be allocated,
resulting in the inability to allocate PMD-sized shmem, which in turn
affects GPU performance.
To fix this issue, add the 'end' information for shmem_read_folio_gfp() to
help allocate PMD-sized large folios. Additionally, use the maximum
allocation chunk (via mapping_max_folio_size()) to determine the size of
the large folios to allocate in the i915 driver.
Patryk added:
: In my tests, the performance drop ranges from a few percent up to 13%
: in Unigine Superposition under heavy memory usage on the CPU Core Ultra
: 155H with the Xe 128 EU GPU. Other users have reported performance
: impact up to 30% on certain workloads. Please find more in the
: regressions reports:
: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14645
: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13845
:
: I believe the change should be backported to all active kernel branches
: after version 6.12.
Link: https://lkml.kernel.org/r/0d734549d5ed073c80b11601da3abdd5223e1889.17536898…
Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reported-by: Patryk Kowalczyk <patryk(a)kowalczyk.ws>
Reported-by: Ville Syrj��l�� <ville.syrjala(a)linux.intel.com>
Tested-by: Patryk Kowalczyk <patryk(a)kowalczyk.ws>
Cc: Christan K��nig <christian.koenig(a)amd.com>
Cc: Dave Airlie <airlied(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Huang Ray <Ray.Huang(a)amd.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Jani Nikula <jani.nikula(a)linux.intel.com>
Cc: Jonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Mathew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: Thomas Zimemrmann <tzimmermann(a)suse.de>
Cc: Tvrtko Ursulin <tursulin(a)ursulin.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/gpu/drm/drm_gem.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 7 ++++++-
drivers/gpu/drm/ttm/ttm_backup.c | 2 +-
include/linux/shmem_fs.h | 4 ++--
mm/shmem.c | 7 ++++---
5 files changed, 14 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/drm_gem.c~mm-shmem-fix-the-shmem-large-folio-allocation-for-the-i915-driver
+++ a/drivers/gpu/drm/drm_gem.c
@@ -627,7 +627,7 @@ struct page **drm_gem_get_pages(struct d
i = 0;
while (i < npages) {
long nr;
- folio = shmem_read_folio_gfp(mapping, i,
+ folio = shmem_read_folio_gfp(mapping, i, 0,
mapping_gfp_mask(mapping));
if (IS_ERR(folio))
goto fail;
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c~mm-shmem-fix-the-shmem-large-folio-allocation-for-the-i915-driver
+++ a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -69,6 +69,7 @@ int shmem_sg_alloc_table(struct drm_i915
struct scatterlist *sg;
unsigned long next_pfn = 0; /* suppress gcc warning */
gfp_t noreclaim;
+ size_t chunk;
int ret;
if (overflows_type(size / PAGE_SIZE, page_count))
@@ -94,6 +95,7 @@ int shmem_sg_alloc_table(struct drm_i915
mapping_set_unevictable(mapping);
noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
+ chunk = mapping_max_folio_size(mapping);
sg = st->sgl;
st->nents = 0;
@@ -105,10 +107,13 @@ int shmem_sg_alloc_table(struct drm_i915
0,
}, *s = shrink;
gfp_t gfp = noreclaim;
+ loff_t bytes = (page_count - i) << PAGE_SHIFT;
+ loff_t pos = i << PAGE_SHIFT;
+ bytes = min_t(loff_t, chunk, bytes);
do {
cond_resched();
- folio = shmem_read_folio_gfp(mapping, i, gfp);
+ folio = shmem_read_folio_gfp(mapping, i, pos + bytes, gfp);
if (!IS_ERR(folio))
break;
--- a/drivers/gpu/drm/ttm/ttm_backup.c~mm-shmem-fix-the-shmem-large-folio-allocation-for-the-i915-driver
+++ a/drivers/gpu/drm/ttm/ttm_backup.c
@@ -100,7 +100,7 @@ ttm_backup_backup_page(struct file *back
struct folio *to_folio;
int ret;
- to_folio = shmem_read_folio_gfp(mapping, idx, alloc_gfp);
+ to_folio = shmem_read_folio_gfp(mapping, idx, 0, alloc_gfp);
if (IS_ERR(to_folio))
return PTR_ERR(to_folio);
--- a/include/linux/shmem_fs.h~mm-shmem-fix-the-shmem-large-folio-allocation-for-the-i915-driver
+++ a/include/linux/shmem_fs.h
@@ -153,12 +153,12 @@ enum sgp_type {
int shmem_get_folio(struct inode *inode, pgoff_t index, loff_t write_end,
struct folio **foliop, enum sgp_type sgp);
struct folio *shmem_read_folio_gfp(struct address_space *mapping,
- pgoff_t index, gfp_t gfp);
+ pgoff_t index, loff_t end, gfp_t gfp);
static inline struct folio *shmem_read_folio(struct address_space *mapping,
pgoff_t index)
{
- return shmem_read_folio_gfp(mapping, index, mapping_gfp_mask(mapping));
+ return shmem_read_folio_gfp(mapping, index, 0, mapping_gfp_mask(mapping));
}
static inline struct page *shmem_read_mapping_page(
--- a/mm/shmem.c~mm-shmem-fix-the-shmem-large-folio-allocation-for-the-i915-driver
+++ a/mm/shmem.c
@@ -5930,6 +5930,7 @@ int shmem_zero_setup(struct vm_area_stru
* shmem_read_folio_gfp - read into page cache, using specified page allocation flags.
* @mapping: the folio's address_space
* @index: the folio index
+ * @end: end of a read if allocating a new folio
* @gfp: the page allocator flags to use if allocating
*
* This behaves as a tmpfs "read_cache_page_gfp(mapping, index, gfp)",
@@ -5942,14 +5943,14 @@ int shmem_zero_setup(struct vm_area_stru
* with the mapping_gfp_mask(), to avoid OOMing the machine unnecessarily.
*/
struct folio *shmem_read_folio_gfp(struct address_space *mapping,
- pgoff_t index, gfp_t gfp)
+ pgoff_t index, loff_t end, gfp_t gfp)
{
#ifdef CONFIG_SHMEM
struct inode *inode = mapping->host;
struct folio *folio;
int error;
- error = shmem_get_folio_gfp(inode, index, 0, &folio, SGP_CACHE,
+ error = shmem_get_folio_gfp(inode, index, end, &folio, SGP_CACHE,
gfp, NULL, NULL);
if (error)
return ERR_PTR(error);
@@ -5968,7 +5969,7 @@ EXPORT_SYMBOL_GPL(shmem_read_folio_gfp);
struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
pgoff_t index, gfp_t gfp)
{
- struct folio *folio = shmem_read_folio_gfp(mapping, index, gfp);
+ struct folio *folio = shmem_read_folio_gfp(mapping, index, 0, gfp);
struct page *page;
if (IS_ERR(folio))
_
Patches currently in -mm which might be from baolin.wang(a)linux.alibaba.com are
The following changes since commit 347e9f5043c89695b01e66b3ed111755afcf1911:
Linux 6.16-rc6 (2025-07-13 14:25:58 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to 6693731487a8145a9b039bc983d77edc47693855:
vsock/virtio: Allocate nonlinear SKBs for handling large transmit buffers (2025-08-01 09:11:09 -0400)
Changes from v1:
drop commits that I put in there by mistake. Sorry!
----------------------------------------------------------------
virtio, vhost: features, fixes
vhost can now support legacy threading
if enabled in Kconfig
vsock memory allocation strategies for
large buffers have been improved,
reducing pressure on kmalloc
vhost now supports the in-order feature
guest bits missed the merge window
fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
----------------------------------------------------------------
Alok Tiwari (4):
virtio: Fix typo in register_virtio_device() doc comment
vhost-scsi: Fix typos and formatting in comments and logs
vhost: Fix typos
vhost-scsi: Fix check for inline_sg_cnt exceeding preallocated limit
Anders Roxell (1):
vdpa: Fix IDR memory leak in VDUSE module exit
Cindy Lu (1):
vhost: Reintroduce kthread API and add mode selection
Dr. David Alan Gilbert (2):
vhost: vringh: Remove unused iotlb functions
vhost: vringh: Remove unused functions
Dragos Tatulea (2):
vdpa/mlx5: Fix needs_teardown flag calculation
vdpa/mlx5: Fix release of uninitialized resources on error path
Gerd Hoffmann (1):
drm/virtio: implement virtio_gpu_shutdown
Jason Wang (3):
vhost: fail early when __vhost_add_used() fails
vhost: basic in order support
vhost_net: basic in_order support
Michael S. Tsirkin (2):
virtio: fix comments, readability
virtio: document ENOSPC
Mike Christie (1):
vhost-scsi: Fix log flooding with target does not exist errors
Pei Xiao (1):
vhost: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...))
Viresh Kumar (2):
virtio-mmio: Remove virtqueue list from mmio device
virtio-vdpa: Remove virtqueue list
WangYuli (1):
virtio: virtio_dma_buf: fix missing parameter documentation
Will Deacon (9):
vhost/vsock: Avoid allocating arbitrarily-sized SKBs
vsock/virtio: Validate length in packet header before skb_put()
vsock/virtio: Move length check to callers of virtio_vsock_skb_rx_put()
vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page
vsock/virtio: Rename virtio_vsock_alloc_skb()
vsock/virtio: Move SKB allocation lower-bound check to callers
vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers
vsock/virtio: Rename virtio_vsock_skb_rx_put()
vsock/virtio: Allocate nonlinear SKBs for handling large transmit buffers
drivers/gpu/drm/virtio/virtgpu_drv.c | 8 +-
drivers/vdpa/mlx5/core/mr.c | 3 +
drivers/vdpa/mlx5/net/mlx5_vnet.c | 12 +-
drivers/vdpa/vdpa_user/vduse_dev.c | 1 +
drivers/vhost/Kconfig | 18 ++
drivers/vhost/net.c | 88 +++++---
drivers/vhost/scsi.c | 24 +-
drivers/vhost/vhost.c | 377 ++++++++++++++++++++++++++++----
drivers/vhost/vhost.h | 30 ++-
drivers/vhost/vringh.c | 118 ----------
drivers/vhost/vsock.c | 15 +-
drivers/virtio/virtio.c | 7 +-
drivers/virtio/virtio_dma_buf.c | 2 +
drivers/virtio/virtio_mmio.c | 52 +----
drivers/virtio/virtio_ring.c | 4 +
drivers/virtio/virtio_vdpa.c | 44 +---
include/linux/virtio.h | 2 +-
include/linux/virtio_vsock.h | 46 +++-
include/linux/vringh.h | 12 -
include/uapi/linux/vhost.h | 29 +++
kernel/vhost_task.c | 2 +-
net/vmw_vsock/virtio_transport.c | 20 +-
net/vmw_vsock/virtio_transport_common.c | 3 +-
23 files changed, 575 insertions(+), 342 deletions(-)
The patch titled
Subject: mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-kmemleak-avoid-deadlock-by-moving-pr_warn-outside-kmemleak_lock.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock
Date: Thu, 31 Jul 2025 02:57:18 -0700
When netpoll is enabled, calling pr_warn_once() while holding
kmemleak_lock in mem_pool_alloc() can cause a deadlock due to lock
inversion with the netconsole subsystem. This occurs because
pr_warn_once() may trigger netpoll, which eventually leads to
__alloc_skb() and back into kmemleak code, attempting to reacquire
kmemleak_lock.
This is the path for the deadlock.
mem_pool_alloc()
-> raw_spin_lock_irqsave(&kmemleak_lock, flags);
-> pr_warn_once()
-> netconsole subsystem
-> netpoll
-> __alloc_skb
-> __create_object
-> raw_spin_lock_irqsave(&kmemleak_lock, flags);
Fix this by setting a flag and issuing the pr_warn_once() after
kmemleak_lock is released.
Link: https://lkml.kernel.org/r/20250731-kmemleak_lock-v1-1-728fd470198f@debian.o…
Fixes: c5665868183fec ("mm: kmemleak: use the memory pool for early allocations")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Reported-by: Jakub Kicinski <kuba(a)kernel.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kmemleak.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/kmemleak.c~mm-kmemleak-avoid-deadlock-by-moving-pr_warn-outside-kmemleak_lock
+++ a/mm/kmemleak.c
@@ -470,6 +470,7 @@ static struct kmemleak_object *mem_pool_
{
unsigned long flags;
struct kmemleak_object *object;
+ bool warn = false;
/* try the slab allocator first */
if (object_cache) {
@@ -488,8 +489,10 @@ static struct kmemleak_object *mem_pool_
else if (mem_pool_free_count)
object = &mem_pool[--mem_pool_free_count];
else
- pr_warn_once("Memory pool empty, consider increasing CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE\n");
+ warn = true;
raw_spin_unlock_irqrestore(&kmemleak_lock, flags);
+ if (warn)
+ pr_warn_once("Memory pool empty, consider increasing CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE\n");
return object;
}
_
Patches currently in -mm which might be from leitao(a)debian.org are
mm-kmemleak-avoid-deadlock-by-moving-pr_warn-outside-kmemleak_lock.patch
The patch titled
Subject: mm/userfaultfd: fix kmap_local LIFO ordering for CONFIG_HIGHPTE
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-userfaultfd-fix-kmap_local-lifo-ordering-for-config_highpte.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Sasha Levin <sashal(a)kernel.org>
Subject: mm/userfaultfd: fix kmap_local LIFO ordering for CONFIG_HIGHPTE
Date: Thu, 31 Jul 2025 10:44:31 -0400
With CONFIG_HIGHPTE on 32-bit ARM, move_pages_pte() maps PTE pages using
kmap_local_page(), which requires unmapping in Last-In-First-Out order.
The current code maps dst_pte first, then src_pte, but unmaps them in the
same order (dst_pte, src_pte), violating the LIFO requirement. This
causes the warning in kunmap_local_indexed():
WARNING: CPU: 0 PID: 604 at mm/highmem.c:622 kunmap_local_indexed+0x178/0x17c
addr \!= __fix_to_virt(FIX_KMAP_BEGIN + idx)
Fix this by reversing the unmap order to respect LIFO ordering.
This issue follows the same pattern as similar fixes:
- commit eca6828403b8 ("crypto: skcipher - fix mismatch between mapping and unmapping order")
- commit 8cf57c6df818 ("nilfs2: eliminate staggered calls to kunmap in nilfs_rename")
Both of which addressed the same fundamental requirement that kmap_local
operations must follow LIFO ordering.
Link: https://lkml.kernel.org/r/20250731144431.773923-1-sashal@kernel.org
Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Suren Baghdasaryan <surenb(a)google.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/userfaultfd.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/mm/userfaultfd.c~mm-userfaultfd-fix-kmap_local-lifo-ordering-for-config_highpte
+++ a/mm/userfaultfd.c
@@ -1453,10 +1453,15 @@ out:
folio_unlock(src_folio);
folio_put(src_folio);
}
- if (dst_pte)
- pte_unmap(dst_pte);
+ /*
+ * Unmap in reverse order (LIFO) to maintain proper kmap_local
+ * index ordering when CONFIG_HIGHPTE is enabled. We mapped dst_pte
+ * first, then src_pte, so we must unmap src_pte first, then dst_pte.
+ */
if (src_pte)
pte_unmap(src_pte);
+ if (dst_pte)
+ pte_unmap(dst_pte);
mmu_notifier_invalidate_range_end(&range);
if (si)
put_swap_device(si);
_
Patches currently in -mm which might be from sashal(a)kernel.org are
mm-userfaultfd-fix-kmap_local-lifo-ordering-for-config_highpte.patch
The patch titled
Subject: mm/debug_vm_pgtable: clear page table entries at destroy_args()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-debug_vm_pgtable-clear-page-table-entries-at-destroy_args.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Herton R. Krzesinski" <herton(a)redhat.com>
Subject: mm/debug_vm_pgtable: clear page table entries at destroy_args()
Date: Thu, 31 Jul 2025 18:40:51 -0300
The mm/debug_vm_pagetable test allocates manually page table entries for
the tests it runs, using also its manually allocated mm_struct. That in
itself is ok, but when it exits, at destroy_args() it fails to clear those
entries with the *_clear functions.
The problem is that leaves stale entries. If another process allocates an
mm_struct with a pgd at the same address, it may end up running into the
stale entry. This is happening in practice on a debug kernel with
CONFIG_DEBUG_VM_PGTABLE=y, for example this is the output with some extra
debugging I added (it prints a warning trace if pgtables_bytes goes
negative, in addition to the warning at check_mm() function):
[ 2.539353] debug_vm_pgtable: [get_random_vaddr ]: random_vaddr is 0x7ea247140000
[ 2.539366] kmem_cache info
[ 2.539374] kmem_cachep 0x000000002ce82385 - freelist 0x0000000000000000 - offset 0x508
[ 2.539447] debug_vm_pgtable: [init_args ]: args->mm is 0x000000002267cc9e
(...)
[ 2.552800] WARNING: CPU: 5 PID: 116 at include/linux/mm.h:2841 free_pud_range+0x8bc/0x8d0
[ 2.552816] Modules linked in:
[ 2.552843] CPU: 5 UID: 0 PID: 116 Comm: modprobe Not tainted 6.12.0-105.debug_vm2.el10.ppc64le+debug #1 VOLUNTARY
[ 2.552859] Hardware name: IBM,9009-41A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW910.00 (VL910_062) hv:phyp pSeries
[ 2.552872] NIP: c0000000007eef3c LR: c0000000007eef30 CTR: c0000000003d8c90
[ 2.552885] REGS: c0000000622e73b0 TRAP: 0700 Not tainted (6.12.0-105.debug_vm2.el10.ppc64le+debug)
[ 2.552899] MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24002822 XER: 0000000a
[ 2.552954] CFAR: c0000000008f03f0 IRQMASK: 0
[ 2.552954] GPR00: c0000000007eef30 c0000000622e7650 c000000002b1ac00 0000000000000001
[ 2.552954] GPR04: 0000000000000008 0000000000000000 c0000000007eef30 ffffffffffffffff
[ 2.552954] GPR08: 00000000ffff00f5 0000000000000001 0000000000000048 0000000000004000
[ 2.552954] GPR12: 00000003fa440000 c000000017ffa300 c0000000051d9f80 ffffffffffffffdb
[ 2.552954] GPR16: 0000000000000000 0000000000000008 000000000000000a 60000000000000e0
[ 2.552954] GPR20: 4080000000000000 c0000000113af038 00007fffcf130000 0000700000000000
[ 2.552954] GPR24: c000000062a6a000 0000000000000001 8000000062a68000 0000000000000001
[ 2.552954] GPR28: 000000000000000a c000000062ebc600 0000000000002000 c000000062ebc760
[ 2.553170] NIP [c0000000007eef3c] free_pud_range+0x8bc/0x8d0
[ 2.553185] LR [c0000000007eef30] free_pud_range+0x8b0/0x8d0
[ 2.553199] Call Trace:
[ 2.553207] [c0000000622e7650] [c0000000007eef30] free_pud_range+0x8b0/0x8d0 (unreliable)
[ 2.553229] [c0000000622e7750] [c0000000007f40b4] free_pgd_range+0x284/0x3b0
[ 2.553248] [c0000000622e7800] [c0000000007f4630] free_pgtables+0x450/0x570
[ 2.553274] [c0000000622e78e0] [c0000000008161c0] exit_mmap+0x250/0x650
[ 2.553292] [c0000000622e7a30] [c0000000001b95b8] __mmput+0x98/0x290
[ 2.558344] [c0000000622e7a80] [c0000000001d1018] exit_mm+0x118/0x1b0
[ 2.558361] [c0000000622e7ac0] [c0000000001d141c] do_exit+0x2ec/0x870
[ 2.558376] [c0000000622e7b60] [c0000000001d1ca8] do_group_exit+0x88/0x150
[ 2.558391] [c0000000622e7bb0] [c0000000001d1db8] sys_exit_group+0x48/0x50
[ 2.558407] [c0000000622e7be0] [c00000000003d810] system_call_exception+0x1e0/0x4c0
[ 2.558423] [c0000000622e7e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
(...)
[ 2.558892] ---[ end trace 0000000000000000 ]---
[ 2.559022] BUG: Bad rss-counter state mm:000000002267cc9e type:MM_ANONPAGES val:1
[ 2.559037] BUG: non-zero pgtables_bytes on freeing mm: -6144
Here the modprobe process ended up with an allocated mm_struct from the
mm_struct slab that was used before by the debug_vm_pgtable test. That is
not a problem, since the mm_struct is initialized again etc., however, if
it ends up using the same pgd table, it bumps into the old stale entry
when clearing/freeing the page table entries, so it tries to free an entry
already gone (that one which was allocated by the debug_vm_pgtable test),
which also explains the negative pgtables_bytes since it's accounting for
not allocated entries in the current process.
As far as I looked pgd_{alloc,free} etc. does not clear entries, and
clearing of the entries is explicitly done in the free_pgtables->
free_pgd_range->free_p4d_range->free_pud_range->free_pmd_range->
free_pte_range path. However, the debug_vm_pgtable test does not call
free_pgtables, since it allocates mm_struct and entries manually for its
test and eg. not goes through page faults. So it also should clear
manually the entries before exit at destroy_args().
This problem was noticed on a reboot X number of times test being done on
a powerpc host, with a debug kernel with CONFIG_DEBUG_VM_PGTABLE enabled.
Depends on the system, but on a 100 times reboot loop the problem could
manifest once or twice, if a process ends up getting the right mm->pgd
entry with the stale entries used by mm/debug_vm_pagetable. After using
this patch, I couldn't reproduce/experience the problems anymore. I was
able to reproduce the problem as well on latest upstream kernel (6.16).
I also modified destroy_args() to use mmput() instead of mmdrop(), there
is no reason to hold mm_users reference and not release the mm_struct
entirely, and in the output above with my debugging prints I already had
patched it to use mmput, it did not fix the problem, but helped in the
debugging as well.
Link: https://lkml.kernel.org/r/20250731214051.4115182-1-herton@redhat.com
Fixes: 3c9b84f044a9e ("mm/debug_vm_pgtable: introduce struct pgtable_debug_args")
Signed-off-by: Herton R. Krzesinski <herton(a)redhat.com>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: Gavin Shan <gshan(a)redhat.com>
Cc: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/debug_vm_pgtable.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-clear-page-table-entries-at-destroy_args
+++ a/mm/debug_vm_pgtable.c
@@ -1041,29 +1041,34 @@ static void __init destroy_args(struct p
/* Free page table entries */
if (args->start_ptep) {
+ pmd_clear(args->pmdp);
pte_free(args->mm, args->start_ptep);
mm_dec_nr_ptes(args->mm);
}
if (args->start_pmdp) {
+ pud_clear(args->pudp);
pmd_free(args->mm, args->start_pmdp);
mm_dec_nr_pmds(args->mm);
}
if (args->start_pudp) {
+ p4d_clear(args->p4dp);
pud_free(args->mm, args->start_pudp);
mm_dec_nr_puds(args->mm);
}
- if (args->start_p4dp)
+ if (args->start_p4dp) {
+ pgd_clear(args->pgdp);
p4d_free(args->mm, args->start_p4dp);
+ }
/* Free vma and mm struct */
if (args->vma)
vm_area_free(args->vma);
if (args->mm)
- mmdrop(args->mm);
+ mmput(args->mm);
}
static struct page * __init
_
Patches currently in -mm which might be from herton(a)redhat.com are
mm-debug_vm_pgtable-clear-page-table-entries-at-destroy_args.patch
To prevent timing attacks, HMAC value comparison needs to be constant
time. Replace the memcmp() with the correct function, crypto_memneq().
Fixes: 1085b8276bb4 ("tpm: Add the rest of the session HMAC API")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
---
drivers/char/tpm/Kconfig | 1 +
drivers/char/tpm/tpm2-sessions.c | 6 +++---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index dddd702b2454a..f9d8a4e966867 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -31,10 +31,11 @@ config TCG_TPM2_HMAC
bool "Use HMAC and encrypted transactions on the TPM bus"
default X86_64
select CRYPTO_ECDH
select CRYPTO_LIB_AESCFB
select CRYPTO_LIB_SHA256
+ select CRYPTO_LIB_UTILS
help
Setting this causes us to deploy a scheme which uses request
and response HMACs in addition to encryption for
communicating with the TPM to prevent or detect bus snooping
and interposer attacks (see tpm-security.rst). Saying Y
diff --git a/drivers/char/tpm/tpm2-sessions.c b/drivers/char/tpm/tpm2-sessions.c
index bdb119453dfbe..5fbd62ee50903 100644
--- a/drivers/char/tpm/tpm2-sessions.c
+++ b/drivers/char/tpm/tpm2-sessions.c
@@ -69,10 +69,11 @@
#include <linux/unaligned.h>
#include <crypto/kpp.h>
#include <crypto/ecdh.h>
#include <crypto/hash.h>
#include <crypto/hmac.h>
+#include <crypto/utils.h>
/* maximum number of names the TPM must remember for authorization */
#define AUTH_MAX_NAMES 3
#define AES_KEY_BYTES AES_KEYSIZE_128
@@ -827,16 +828,15 @@ int tpm_buf_check_hmac_response(struct tpm_chip *chip, struct tpm_buf *buf,
sha256_update(&sctx, auth->our_nonce, sizeof(auth->our_nonce));
sha256_update(&sctx, &auth->attrs, 1);
/* we're done with the rphash, so put our idea of the hmac there */
tpm2_hmac_final(&sctx, auth->session_key, sizeof(auth->session_key)
+ auth->passphrase_len, rphash);
- if (memcmp(rphash, &buf->data[offset_s], SHA256_DIGEST_SIZE) == 0) {
- rc = 0;
- } else {
+ if (crypto_memneq(rphash, &buf->data[offset_s], SHA256_DIGEST_SIZE)) {
dev_err(&chip->dev, "TPM: HMAC check failed\n");
goto out;
}
+ rc = 0;
/* now do response decryption */
if (auth->attrs & TPM2_SA_ENCRYPT) {
/* need key and IV */
tpm2_KDFa(auth->session_key, SHA256_DIGEST_SIZE
--
2.50.1
The patch titled
Subject: kunit: kasan_test: disable fortify string checker on kasan_strings() test
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
kunit-kasan_test-disable-fortify-string-checker-on-kasan_strings-test.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Yeoreum Yun <yeoreum.yun(a)arm.com>
Subject: kunit: kasan_test: disable fortify string checker on kasan_strings() test
Date: Fri, 1 Aug 2025 13:02:36 +0100
Similar to commit 09c6304e38e4 ("kasan: test: fix compatibility with
FORTIFY_SOURCE") the kernel is panicing in kasan_string().
This is due to the `src` and `ptr` not being hidden from the optimizer
which would disable the runtime fortify string checker.
Call trace:
__fortify_panic+0x10/0x20 (P)
kasan_strings+0x980/0x9b0
kunit_try_run_case+0x68/0x190
kunit_generic_run_threadfn_adapter+0x34/0x68
kthread+0x1c4/0x228
ret_from_fork+0x10/0x20
Code: d503233f a9bf7bfd 910003fd 9424b243 (d4210000)
---[ end trace 0000000000000000 ]---
note: kunit_try_catch[128] exited with irqs disabled
note: kunit_try_catch[128] exited with preempt_count 1
# kasan_strings: try faulted: last
** replaying previous printk message **
# kasan_strings: try faulted: last line seen mm/kasan/kasan_test_c.c:1600
# kasan_strings: internal error occurred preventing test case from running: -4
Link: https://lkml.kernel.org/r/20250801120236.2962642-1-yeoreum.yun@arm.com
Fixes: 73228c7ecc5e ("KASAN: port KASAN Tests to KUnit")
Signed-off-by: Yeoreum Yun <yeoreum.yun(a)arm.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan_test_c.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/kasan/kasan_test_c.c~kunit-kasan_test-disable-fortify-string-checker-on-kasan_strings-test
+++ a/mm/kasan/kasan_test_c.c
@@ -1578,9 +1578,11 @@ static void kasan_strings(struct kunit *
ptr = kmalloc(size, GFP_KERNEL | __GFP_ZERO);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ OPTIMIZER_HIDE_VAR(ptr);
src = kmalloc(KASAN_GRANULE_SIZE, GFP_KERNEL | __GFP_ZERO);
strscpy(src, "f0cacc1a0000000", KASAN_GRANULE_SIZE);
+ OPTIMIZER_HIDE_VAR(src);
/*
* Make sure that strscpy() does not trigger KASAN if it overreads into
_
Patches currently in -mm which might be from yeoreum.yun(a)arm.com are
kunit-kasan_test-disable-fortify-string-checker-on-kasan_strings-test.patch
The commit in the Fixes tag breaks my laptop (found by git bisect).
My home RJ45 LAN cable cannot connect after that commit.
The call to netif_carrier_on() should be done when netif_carrier_ok()
is false. Not when it's true. Because calling netif_carrier_on() when
__LINK_STATE_NOCARRIER is not set actually does nothing.
Cc: Armando Budianto <sprite(a)gnuweeb.org>
Cc: stable(a)vger.kernel.org
Closes: https://lore.kernel.org/netdev/0752dee6-43d6-4e1f-81d2-4248142cccd2@gnuweeb…
Fixes: 0d9cfc9b8cb1 ("net: usbnet: Avoid potential RCU stall on LINK_CHANGE event")
Signed-off-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
---
v2:
- Rebase on top of the latest netdev/net tree. The previous patch was
based on 0d9cfc9b8cb1. Line numbers have changed since then.
drivers/net/usb/usbnet.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index a38ffbf4b3f0..a1827684b92c 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1114,31 +1114,31 @@ static const struct ethtool_ops usbnet_ethtool_ops = {
};
/*-------------------------------------------------------------------------*/
static void __handle_link_change(struct usbnet *dev)
{
if (!test_bit(EVENT_DEV_OPEN, &dev->flags))
return;
if (!netif_carrier_ok(dev->net)) {
+ if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
+ netif_carrier_on(dev->net);
+
/* kill URBs for reading packets to save bus bandwidth */
unlink_urbs(dev, &dev->rxq);
/*
* tx_timeout will unlink URBs for sending packets and
* tx queue is stopped by netcore after link becomes off
*/
} else {
- if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
- netif_carrier_on(dev->net);
-
/* submitting URBs for reading packets */
queue_work(system_bh_wq, &dev->bh_work);
}
/* hard_mtu or rx_urb_size may change during link change */
usbnet_update_max_qlen(dev);
clear_bit(EVENT_LINK_CHANGE, &dev->flags);
}
--
Ammar Faizi
The commit in the Fixes tag breaks my laptop (found by git bisect).
My home RJ45 LAN cable cannot connect after that commit.
The call to netif_carrier_on() should be done when netif_carrier_ok()
is false. Not when it's true. Because calling netif_carrier_on() when
__LINK_STATE_NOCARRIER is not set actually does nothing.
Cc: Armando Budianto <sprite(a)gnuweeb.org>
Cc: stable(a)vger.kernel.org
Closes: https://lore.kernel.org/netdev/0752dee6-43d6-4e1f-81d2-4248142cccd2@gnuweeb…
Fixes: 0d9cfc9b8cb1 ("net: usbnet: Avoid potential RCU stall on LINK_CHANGE event")
Signed-off-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
---
drivers/net/usb/usbnet.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index bc1d8631ffe0..1eb98eeb64f9 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1107,31 +1107,31 @@ static const struct ethtool_ops usbnet_ethtool_ops = {
};
/*-------------------------------------------------------------------------*/
static void __handle_link_change(struct usbnet *dev)
{
if (!test_bit(EVENT_DEV_OPEN, &dev->flags))
return;
if (!netif_carrier_ok(dev->net)) {
+ if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
+ netif_carrier_on(dev->net);
+
/* kill URBs for reading packets to save bus bandwidth */
unlink_urbs(dev, &dev->rxq);
/*
* tx_timeout will unlink URBs for sending packets and
* tx queue is stopped by netcore after link becomes off
*/
} else {
- if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
- netif_carrier_on(dev->net);
-
/* submitting URBs for reading packets */
tasklet_schedule(&dev->bh);
}
/* hard_mtu or rx_urb_size may change during link change */
usbnet_update_max_qlen(dev);
clear_bit(EVENT_LINK_CHANGE, &dev->flags);
}
--
Ammar Faizi
The Gemalto Cinterion PLS83-W modem (cdc_ether) is emitting confusing link
up and down events when the WWAN interface is activated on the modem-side.
Interrupt URBs will in consecutive polls grab:
* Link Connected
* Link Disconnected
* Link Connected
Where the last Connected is then a stable link state.
When the system is under load this may cause the unlink_urbs() work in
__handle_link_change() to not complete before the next usbnet_link_change()
call turns the carrier on again, allowing rx_submit() to queue new SKBs.
In that event the URB queue is filled faster than it can drain, ending up
in a RCU stall:
rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 0-.... } 33108 jiffies s: 201 root: 0x1/.
rcu: blocking rcu_node structures (internal RCU debug):
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
Call trace:
arch_local_irq_enable+0x4/0x8
local_bh_enable+0x18/0x20
__netdev_alloc_skb+0x18c/0x1cc
rx_submit+0x68/0x1f8 [usbnet]
rx_alloc_submit+0x4c/0x74 [usbnet]
usbnet_bh+0x1d8/0x218 [usbnet]
usbnet_bh_tasklet+0x10/0x18 [usbnet]
tasklet_action_common+0xa8/0x110
tasklet_action+0x2c/0x34
handle_softirqs+0x2cc/0x3a0
__do_softirq+0x10/0x18
____do_softirq+0xc/0x14
call_on_irq_stack+0x24/0x34
do_softirq_own_stack+0x18/0x20
__irq_exit_rcu+0xa8/0xb8
irq_exit_rcu+0xc/0x30
el1_interrupt+0x34/0x48
el1h_64_irq_handler+0x14/0x1c
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x38/0x48
xhci_urb_dequeue+0x1ac/0x45c [xhci_hcd]
unlink1+0xd4/0xdc [usbcore]
usb_hcd_unlink_urb+0x70/0xb0 [usbcore]
usb_unlink_urb+0x24/0x44 [usbcore]
unlink_urbs.constprop.0.isra.0+0x64/0xa8 [usbnet]
__handle_link_change+0x34/0x70 [usbnet]
usbnet_deferred_kevent+0x1c0/0x320 [usbnet]
process_scheduled_works+0x2d0/0x48c
worker_thread+0x150/0x1dc
kthread+0xd8/0xe8
ret_from_fork+0x10/0x20
Get around the problem by delaying the carrier on to the scheduled work.
This needs a new flag to keep track of the necessary action.
The carrier ok check cannot be removed as it remains required for the
LINK_RESET event flow.
Fixes: 4b49f58fff00 ("usbnet: handle link change")
Cc: stable(a)vger.kernel.org
Signed-off-by: John Ernberg <john.ernberg(a)actia.se>
---
I've been testing this quite aggressively over a night, and seems equally
stable to my first approach. I'm a little bit concerned that the bit stuff
can now race (although much smaller) in the opposite direction, that a
carrier off can occur between test_and_clear_bit() and the carrier on
action in the handler. Leaving the carrier on when it shouldn't be.
v2:
- target tree in patch description.
- Drop Ming Lei from address list as their address bounces.
- Rework solution based on feedback by Jakub (let me know if you want a
Suggested-by tag, if we're keeping this direction)
v1: https://lore.kernel.org/netdev/20250710085028.1070922-1-john.ernberg@actia.…
Tested on 6.12.20 and forward ported. Stack trace from 6.12.20.
---
drivers/net/usb/usbnet.c | 11 ++++++++---
include/linux/usb/usbnet.h | 1 +
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index c04e715a4c2a..bc1d8631ffe0 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1122,6 +1122,9 @@ static void __handle_link_change(struct usbnet *dev)
* tx queue is stopped by netcore after link becomes off
*/
} else {
+ if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
+ netif_carrier_on(dev->net);
+
/* submitting URBs for reading packets */
tasklet_schedule(&dev->bh);
}
@@ -2009,10 +2012,12 @@ EXPORT_SYMBOL(usbnet_manage_power);
void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
{
/* update link after link is reseted */
- if (link && !need_reset)
- netif_carrier_on(dev->net);
- else
+ if (link && !need_reset) {
+ set_bit(EVENT_LINK_CARRIER_ON, &dev->flags);
+ } else {
+ clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags);
netif_carrier_off(dev->net);
+ }
if (need_reset && link)
usbnet_defer_kevent(dev, EVENT_LINK_RESET);
diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h
index 0b9f1e598e3a..4bc6bb01a0eb 100644
--- a/include/linux/usb/usbnet.h
+++ b/include/linux/usb/usbnet.h
@@ -76,6 +76,7 @@ struct usbnet {
# define EVENT_LINK_CHANGE 11
# define EVENT_SET_RX_MODE 12
# define EVENT_NO_IP_ALIGN 13
+# define EVENT_LINK_CARRIER_ON 14
/* This one is special, as it indicates that the device is going away
* there are cyclic dependencies between tasklet, timer and bh
* that must be broken
--
2.49.0