From: Lizhi Xu <lizhi.xu(a)windriver.com>
[ Upstream commit 66d938e89e940e512f4c3deac938ecef399c13f9 ]
The filio lock has been released here, so there is no need to jump to
error_folio_unlock to release it again.
Reported-by: syzbot+b73c7d94a151e2ee1e9b(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b73c7d94a151e2ee1e9b
Signed-off-by: Lizhi Xu <lizhi.xu(a)windriver.com>
Acked-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Paulo Alcantara (Red Hat) <pc(a)manguebit.org>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my comprehensive investigation, here is my analysis:
## Backport Decision: **YES**
### Detailed Analysis
#### Bug Description
This commit fixes a **critical double-unlock bug** in the netfs (Network
Filesystem Library) buffered write path. The bug was introduced in
commit 8f52de0077ba3b (v6.12-rc1) during a performance optimization
refactoring.
**The specific bug**: In the `flush_content` error path at
fs/netfs/buffered_write.c:346, the code unlocks and releases a folio,
then on line 350, if `filemap_write_and_wait_range()` fails, it jumps to
`error_folio_unlock` which attempts to unlock the **already unlocked**
folio again (line 407).
```c
flush_content:
folio_unlock(folio); // First unlock - line 346
folio_put(folio);
ret = filemap_write_and_wait_range(...);
if (ret < 0)
goto error_folio_unlock; // BUG: jumps to unlock again!
```
**The fix**: Changes line 350 from `goto error_folio_unlock` to `goto
out`, correctly bypassing the duplicate unlock.
#### Severity Assessment: **HIGH**
1. **Impact**:
- With `CONFIG_DEBUG_VM=y`: Immediate kernel panic via
`VM_BUG_ON_FOLIO()` at mm/filemap.c:1498
- With `CONFIG_DEBUG_VM=n`: Silent memory corruption, undefined
behavior, potential use-after-free
- Affects **all network filesystems**: 9p, AFS, Ceph, NFS, SMB/CIFS
2. **Syzbot Evidence**:
- Bug ID: syzbot+b73c7d94a151e2ee1e9b(a)syzkaller.appspotmail.com
- Title: "kernel BUG in netfs_perform_write"
- **17 crash instances** recorded
- Reproducers available (both C and syz formats)
- Affected multiple kernel versions (5.4, 5.10, 5.15, 6.1, 6.12)
3. **Triggering Conditions** (Moderate likelihood):
- Network filesystem write operation
- Incompatible write scenario (netfs_group mismatch or streaming
write conflict)
- I/O error from `filemap_write_and_wait_range()` (network failure,
memory pressure, etc.)
#### Backport Criteria Evaluation
✅ **Fixes important bug affecting users**: Yes - causes kernel panics
and potential memory corruption for all network filesystem users
✅ **Small and contained fix**: Yes - **single line change**, minimal
code modification
✅ **No architectural changes**: Yes - simple error path correction
✅ **Minimal regression risk**: Yes - obviously correct fix, well-
reviewed (Acked-by David Howells, Reviewed-by Paulo Alcantara)
✅ **Confined to subsystem**: Yes - only touches netfs buffered write
error path
✅ **Well-tested**: Yes - syzbot has reproducers, 17 crash instances
documented
#### Affected Stable Trees
**Bug introduced**: v6.12-rc1 (commit 8f52de0077ba3b)
**Bug fixed**: v6.17 (commit 66d938e89e940)
**Vulnerable stable kernels**: 6.12.x, 6.13.x, 6.14.x, 6.15.x, 6.16.x
#### Missing Metadata (Should be added)
The commit is **missing critical stable backport tags**:
- No `Fixes: 8f52de0077ba3b ("netfs: Reduce number of conditional
branches in netfs_perform_write()")`
- No `Cc: stable(a)vger.kernel.org`
This appears to be an oversight, as the fix clearly qualifies for stable
backporting.
### Conclusion
**Strong YES for backporting**. This is a textbook stable tree
candidate:
- Fixes a serious kernel panic/memory corruption bug
- One-line change with zero regression risk
- Affects production users of network filesystems
- Well-tested with reproducers
- Reviewed and acked by subsystem maintainers
The fix should be backported to **all stable kernels containing commit
8f52de0077ba3b** (6.12+).
fs/netfs/buffered_write.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index f27ea5099a681..09394ac2c180d 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -347,7 +347,7 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
folio_put(folio);
ret = filemap_write_and_wait_range(mapping, fpos, fpos + flen - 1);
if (ret < 0)
- goto error_folio_unlock;
+ goto out;
continue;
copied:
--
2.51.0
From: Lance Yang <lance.yang(a)linux.dev>
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops the soft-dirty
bit.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing this bit means modified pages are missed,
leading to inconsistent memory state after restore.
Preserve the soft-dirty bit from the old PTE when creating the zeropage
mapping to ensure modified pages are correctly tracked.
Cc: <stable(a)vger.kernel.org>
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
mm/migrate.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/migrate.c b/mm/migrate.c
index ce83c2c3c287..bf364ba07a3f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -322,6 +322,10 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
pvmw->vma->vm_page_prot));
+
+ if (pte_swp_soft_dirty(ptep_get(pvmw->pte)))
+ newpte = pte_mksoft_dirty(newpte);
+
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
--
2.49.0
Greetings,
I hope this email finds you well.
I am Mrs. Diana Owen contacting you from Reality Trading Group Ltd India.
.
As a new growing company, we are looking for reliable company of your product as seen in your website for urgent purchase order basis.
Could you please give me more details and specification of your products.
We also need your products catalog, MOQ, Production Lead Time and Unit price?
Best Regards,
Mrs. Diana Owen
Purchase Manager
Company: Reality Trading Group Ltd
DAMON's virtual address space operation set implementation (vaddr) calls
pte_offset_map_lock() inside the page table walk callback function.
This is for reading and writing page table accessed bits. If
pte_offset_map_lock() fails, it retries by returning the page table walk
callback function with ACTION_AGAIN.
pte_offset_map_lock() can continuously fail if the target is a pmd
migration entry, though. Hence it could cause an infinite page table
walk if the migration cannot be done until the page table walk is
finished. This indeed caused a soft lockup when CPU hotplugging and
DAMON were running in parallel.
Avoid the infinite loop by simply not retrying the page table walk.
DAMON is promising only a best-effort accuracy, so missing access to
such pages is no problem.
Reported-by: Xinyu Zheng <zhengxinyu6(a)huawei.com>
Closes: https://lore.kernel.org/20250918030029.2652607-1-zhengxinyu6@huawei.com
Fixes: 7780d04046a2 ("mm/pagewalkers: ACTION_AGAIN if pte_offset_map_lock() fails")
Cc: <stable(a)vger.kernel.org> # 6.5.x
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/vaddr.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 8c048f9b129e..7e834467b2d8 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -328,10 +328,8 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
}
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
- if (!pte) {
- walk->action = ACTION_AGAIN;
+ if (!pte)
return 0;
- }
if (!pte_present(ptep_get(pte)))
goto out;
damon_ptep_mkold(pte, walk->vma, addr);
@@ -481,10 +479,8 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
- if (!pte) {
- walk->action = ACTION_AGAIN;
+ if (!pte)
return 0;
- }
ptent = ptep_get(pte);
if (!pte_present(ptent))
goto out;
base-commit: 3169a901e935bc1f2d2eec0171abcf524b7747e4
--
2.39.5
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092932-output-egotism-4918@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7 Mon Sep 17 00:00:00 2001
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Date: Fri, 12 Sep 2025 15:41:39 +0800
Subject: [PATCH] mm/hugetlb: fix folio is still mapped when deleted
Migration may be raced with fallocating hole. remove_inode_single_folio
will unmap the folio if the folio is still mapped. However, it's called
without folio lock. If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it. Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again. As a result, we triggered BUG in filemap_unaccount_folio.
The log is as follows:
BUG: Bad page cache in process hugetlb pfn:156c00
page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f4(hugetlb)
page dumped because: still mapped when deleted
CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x70
filemap_unaccount_folio+0xc4/0x1c0
__filemap_remove_folio+0x38/0x1c0
filemap_remove_folio+0x41/0xd0
remove_inode_hugepages+0x142/0x250
hugetlbfs_fallocate+0x471/0x5a0
vfs_fallocate+0x149/0x380
Hold folio lock before checking if the folio is mapped to avold race with
migration.
Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 09d4baef29cf..be4be99304bc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -517,14 +517,16 @@ static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
/*
* If folio is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) while holding
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the folio.
+ * unmapped in caller or hugetlb_vmdelete_list() skips
+ * unmapping it due to fail to grab lock. Unmap (again)
+ * while holding the fault mutex. The mutex will prevent
+ * faults until we finish removing the folio. Hold folio
+ * lock to guarantee no concurrent migration.
*/
+ folio_lock(folio);
if (unlikely(folio_mapped(folio)))
hugetlb_unmap_file_folio(h, mapping, folio, index);
- folio_lock(folio);
/*
* We must remove the folio from page cache before removing
* the region/ reserve map (hugetlb_unreserve_pages). In
Hello there,
I hope this message finds you well. My name is Ronald Evergreen.
I am a legal representative based in London, I am contacting you
with a confidential business proposal that presents a unique
opportunity for partnership that will be of great benefits to
both sides financially. For more information reply back and I
will give you detailed information.
For more information please contact my personal email:
evergreenronald86(a)gamil.com.
Ronald Evergreen
The patch titled
Subject: mm: hugetlb: avoid soft lockup when mprotect to large memory area
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-avoid-soft-lockup-when-mprotect-to-large-memory-area.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Yang Shi <yang(a)os.amperecomputing.com>
Subject: mm: hugetlb: avoid soft lockup when mprotect to large memory area
Date: Mon, 29 Sep 2025 13:24:02 -0700
When calling mprotect() to a large hugetlb memory area in our customer's
workload (~300GB hugetlb memory), soft lockup was observed:
watchdog: BUG: soft lockup - CPU#98 stuck for 23s! [t2_new_sysv:126916]
CPU: 98 PID: 126916 Comm: t2_new_sysv Kdump: loaded Not tainted 6.17-rc7
Hardware name: GIGACOMPUTING R2A3-T40-AAV1/Jefferson CIO, BIOS 5.4.4.1 07/15/2025
pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc��: mte_clear_page_tags+0x14/0x24
lr��: mte_sync_tags+0x1c0/0x240
sp��: ffff80003150bb80
x29: ffff80003150bb80 x28: ffff00739e9705a8 x27: 0000ffd2d6a00000
x26: 0000ff8e4bc00000 x25: 00e80046cde00f45 x24: 0000000000022458
x23: 0000000000000000 x22: 0000000000000004 x21: 000000011b380000
x20: ffff000000000000 x19: 000000011b379f40 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc875e0aa5e2c
x8��: 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
x5��: fffffc01ce7a5c00 x4 : 00000000046cde00 x3 : fffffc0000000000
x2��: 0000000000000004 x1 : 0000000000000040 x0 : ffff0046cde7c000
Call trace:
����mte_clear_page_tags+0x14/0x24
����set_huge_pte_at+0x25c/0x280
����hugetlb_change_protection+0x220/0x430
����change_protection+0x5c/0x8c
����mprotect_fixup+0x10c/0x294
����do_mprotect_pkey.constprop.0+0x2e0/0x3d4
����__arm64_sys_mprotect+0x24/0x44
����invoke_syscall+0x50/0x160
����el0_svc_common+0x48/0x144
����do_el0_svc+0x30/0xe0
����el0_svc+0x30/0xf0
����el0t_64_sync_handler+0xc4/0x148
����el0t_64_sync+0x1a4/0x1a8
Soft lockup is not triggered with THP or base page because there is
cond_resched() called for each PMD size.
Although the soft lockup was triggered by MTE, it should be not MTE
specific. The other processing which takes long time in the loop may
trigger soft lockup too.
So add cond_resched() for hugetlb to avoid soft lockup.
Link: https://lkml.kernel.org/r/20250929202402.1663290-1-yang@os.amperecomputing.…
Fixes: 8f860591ffb2 ("[PATCH] Enable mprotect on huge pages")
Signed-off-by: Yang Shi <yang(a)os.amperecomputing.com>
Tested-by: Carl Worth <carl(a)os.amperecomputing.com>
Reviewed-by: Christoph Lameter (Ampere) <cl(a)gentwo.org>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/hugetlb.c~mm-hugetlb-avoid-soft-lockup-when-mprotect-to-large-memory-area
+++ a/mm/hugetlb.c
@@ -7203,6 +7203,8 @@ long hugetlb_change_protection(struct vm
psize);
}
spin_unlock(ptl);
+
+ cond_resched();
}
/*
* Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare
_
Patches currently in -mm which might be from yang(a)os.amperecomputing.com are
mm-hugetlb-avoid-soft-lockup-when-mprotect-to-large-memory-area.patch
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092931-vastness-jawed-6945@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7 Mon Sep 17 00:00:00 2001
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Date: Fri, 12 Sep 2025 15:41:39 +0800
Subject: [PATCH] mm/hugetlb: fix folio is still mapped when deleted
Migration may be raced with fallocating hole. remove_inode_single_folio
will unmap the folio if the folio is still mapped. However, it's called
without folio lock. If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it. Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again. As a result, we triggered BUG in filemap_unaccount_folio.
The log is as follows:
BUG: Bad page cache in process hugetlb pfn:156c00
page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f4(hugetlb)
page dumped because: still mapped when deleted
CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x70
filemap_unaccount_folio+0xc4/0x1c0
__filemap_remove_folio+0x38/0x1c0
filemap_remove_folio+0x41/0xd0
remove_inode_hugepages+0x142/0x250
hugetlbfs_fallocate+0x471/0x5a0
vfs_fallocate+0x149/0x380
Hold folio lock before checking if the folio is mapped to avold race with
migration.
Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 09d4baef29cf..be4be99304bc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -517,14 +517,16 @@ static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
/*
* If folio is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) while holding
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the folio.
+ * unmapped in caller or hugetlb_vmdelete_list() skips
+ * unmapping it due to fail to grab lock. Unmap (again)
+ * while holding the fault mutex. The mutex will prevent
+ * faults until we finish removing the folio. Hold folio
+ * lock to guarantee no concurrent migration.
*/
+ folio_lock(folio);
if (unlikely(folio_mapped(folio)))
hugetlb_unmap_file_folio(h, mapping, folio, index);
- folio_lock(folio);
/*
* We must remove the folio from page cache before removing
* the region/ reserve map (hugetlb_unreserve_pages). In
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092930-header-irritable-c8ed@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b7387650dcf2881fd8bb55bcf3c8bd6c9542dd7 Mon Sep 17 00:00:00 2001
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Date: Fri, 12 Sep 2025 15:41:39 +0800
Subject: [PATCH] mm/hugetlb: fix folio is still mapped when deleted
Migration may be raced with fallocating hole. remove_inode_single_folio
will unmap the folio if the folio is still mapped. However, it's called
without folio lock. If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it. Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again. As a result, we triggered BUG in filemap_unaccount_folio.
The log is as follows:
BUG: Bad page cache in process hugetlb pfn:156c00
page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f4(hugetlb)
page dumped because: still mapped when deleted
CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x70
filemap_unaccount_folio+0xc4/0x1c0
__filemap_remove_folio+0x38/0x1c0
filemap_remove_folio+0x41/0xd0
remove_inode_hugepages+0x142/0x250
hugetlbfs_fallocate+0x471/0x5a0
vfs_fallocate+0x149/0x380
Hold folio lock before checking if the folio is mapped to avold race with
migration.
Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 09d4baef29cf..be4be99304bc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -517,14 +517,16 @@ static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
/*
* If folio is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) while holding
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the folio.
+ * unmapped in caller or hugetlb_vmdelete_list() skips
+ * unmapping it due to fail to grab lock. Unmap (again)
+ * while holding the fault mutex. The mutex will prevent
+ * faults until we finish removing the folio. Hold folio
+ * lock to guarantee no concurrent migration.
*/
+ folio_lock(folio);
if (unlikely(folio_mapped(folio)))
hugetlb_unmap_file_folio(h, mapping, folio, index);
- folio_lock(folio);
/*
* We must remove the folio from page cache before removing
* the region/ reserve map (hugetlb_unreserve_pages). In