The patch titled
Subject: mm: swap: check for stable address space before operating on the VMA
has been added to the -mm mm-new branch. Its filename is
mm-swap-check-for-stable-address-space-before-operating-on-the-vma.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Charan Teja Kalla <charan.kalla(a)oss.qualcomm.com>
Subject: mm: swap: check for stable address space before operating on the VMA
Date: Wed, 24 Sep 2025 23:41:38 +0530
It is possible to hit a zero entry while traversing the vmas in unuse_mm()
called from swapoff path and accessing it causes the OOPS:
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000446--> Loading the memory from offset 0x40 on the
XA_ZERO_ENTRY as address.
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
The issue is manifested from the below race between the fork() on a
process and swapoff:
fork(dup_mmap()) swapoff(unuse_mm)
--------------- -----------------
1) Identical mtree is built using
__mt_dup().
2) copy_pte_range()-->
copy_nonpresent_pte():
The dst mm is added into the
mmlist to be visible to the
swapoff operation.
3) Fatal signal is sent to the parent
process(which is the current during the
fork) thus skip the duplication of the
vmas and mark the vma range with
XA_ZERO_ENTRY as a marker for this process
that helps during exit_mmap().
4) swapoff is tried on the
'mm' added to the 'mmlist' as
part of the 2.
5) unuse_mm(), that iterates
through the vma's of this 'mm'
will hit the non-NULL zero entry
and operating on this zero entry
as a vma is resulting into the
oops.
The proper fix would be around not exposing this partially-valid tree to
others when droping the mmap lock, which is being solved with [1]. A
simpler solution would be checking for MMF_UNSTABLE, as it is set if
mm_struct is not fully initialized in dup_mmap().
Thanks to Liam/Lorenzo/David for all the suggestions in fixing this
issue.
Link: https://lkml.kernel.org/r/20250924181138.1762750-1-charan.kalla@oss.qualcom…
Link: https://lore.kernel.org/all/20250815191031.3769540-1-Liam.Howlett@oracle.co… [1]
Fixes: d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
Signed-off-by: Charan Teja Kalla <charan.kalla(a)oss.qualcomm.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Kairui Song <kasong(a)tencent.com>
Cc: Kemeng Shi <shikemeng(a)huaweicloud.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Nhat Pham <nphamcs(a)gmail.com>
Cc: Peng Zhang <zhangpeng.00(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/swapfile.c~mm-swap-check-for-stable-address-space-before-operating-on-the-vma
+++ a/mm/swapfile.c
@@ -2389,6 +2389,8 @@ static int unuse_mm(struct mm_struct *mm
VMA_ITERATOR(vmi, mm, 0);
mmap_read_lock(mm);
+ if (check_stable_address_space(mm))
+ goto unlock;
for_each_vma(vmi, vma) {
if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
ret = unuse_vma(vma, type);
@@ -2398,6 +2400,7 @@ static int unuse_mm(struct mm_struct *mm
cond_resched();
}
+unlock:
mmap_read_unlock(mm);
return ret;
}
_
Patches currently in -mm which might be from charan.kalla(a)oss.qualcomm.com are
mm-swap-check-for-stable-address-space-before-operating-on-the-vma.patch
Commit 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in
af_alg_sendmsg") changed some fields from bool to 1-bit bitfields.
However, some assignments to these fields, specifically 'more' and
'merge', assign values greater than 1. These relied on C's implicit
conversion to bool, such that zero becomes false and nonzero becomes
true. With a 1-bit bitfield instead, mod 2 of the value is taken
instead, resulting in 0 being assigned in some cases when 1 was
intended. Fix this by restoring the bool type.
Fixes: 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
---
include/crypto/if_alg.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 0c70f3a55575..02fb7c1d9ef7 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -150,15 +150,15 @@ struct af_alg_ctx {
struct crypto_wait wait;
size_t used;
atomic_t rcvused;
- u32 more:1,
- merge:1,
- enc:1,
- write:1,
- init:1;
+ bool more;
+ bool merge;
+ bool enc;
+ bool write;
+ bool init;
unsigned int len;
unsigned int inflight;
};
base-commit: cec1e6e5d1ab33403b809f79cd20d6aff124ccfe
--
2.51.0.536.g15c5d4f767-goog
From: Allison Henderson <allison.henderson(a)oracle.com>
[ Upstream commit f103df763563ad6849307ed5985d1513acc586dd ]
With parent pointers enabled, a rename operation can update up to 5
inodes: src_dp, target_dp, src_ip, target_ip and wip. This causes
their dquots to a be attached to the transaction chain, so we need
to increase XFS_QM_TRANS_MAXDQS. This patch also add a helper
function xfs_dqlockn to lock an arbitrary number of dquots.
Signed-off-by: Allison Henderson <allison.henderson(a)oracle.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Darrick J. Wong <djwong(a)kernel.org>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
[amir: backport to kernels prior to parent pointers to fix an old bug]
A rename operation of a directory (i.e. mv A/C/ B/) may end up changing
three different dquot accounts under the following conditions:
1. user (or group) quotas are enabled
2. A/ B/ and C/ have different owner uids (or gids)
3. A/ blocks shrinks after remove of entry C/
4. B/ blocks grows before adding of entry C/
5. A/ ino <= XFS_DIR2_MAX_SHORT_INUM
6. B/ ino > XFS_DIR2_MAX_SHORT_INUM
7. C/ is converted from sf to block format, because its parent entry
needs to be stored as 8 bytes (see xfs_dir2_sf_replace_needblock)
When all conditions are met (observed in the wild) we get this assertion:
XFS: Assertion failed: qtrx, file: fs/xfs/xfs_trans_dquot.c, line: 207
The upstream commit fixed this bug as a side effect, so decided to apply
it as is rather than changing XFS_QM_TRANS_MAXDQS to 3 in stable kernels.
The Fixes commit below is NOT the commit that introduced the bug, but
for some reason, which is not explained in the commit message, it fixes
the comment to state that highest number of dquots of one type is 3 and
not 2 (which leads to the assertion), without actually fixing it.
The change of wording from "usr, grp OR prj" to "usr, grp and prj"
suggests that there may have been a confusion between "the number of
dquote of one type" and "the number of dquot types" (which is also 3),
so the comment change was only accidentally correct.
Fixes: 10f73d27c8e9 ("xfs: fix the comment explaining xfs_trans_dqlockedjoin")
Cc: stable(a)vger.kernel.org
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
---
Christoph,
This is a cognitive challenge. can you say what you where thinking in
2013 when making the comment change in the Fixes commit?
Is my speculation above correct?
Catherine and Leah,
I decided that cherry-pick this upstream commit as is with a commit
message addendum was the best stable tree strategy.
The commit applies cleanly to 5.15.y, so I assume it does for 6.6 and
6.1 as well. I ran my tests on 5.15.y and nothing fell out, but did not
try to reproduce these complex assertion in a test.
Could you take this candidate backport patch to a spin on your test
branch?
What do you all think about this?
Thanks,
Amir.
fs/xfs/xfs_dquot.c | 41 ++++++++++++++++++++++++++++++++++++++++
fs/xfs/xfs_dquot.h | 1 +
fs/xfs/xfs_qm.h | 2 +-
fs/xfs/xfs_trans_dquot.c | 15 ++++++++++-----
4 files changed, 53 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index c15d61d47a06..6b05d47aa19b 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1360,6 +1360,47 @@ xfs_dqlock2(
}
}
+static int
+xfs_dqtrx_cmp(
+ const void *a,
+ const void *b)
+{
+ const struct xfs_dqtrx *qa = a;
+ const struct xfs_dqtrx *qb = b;
+
+ if (qa->qt_dquot->q_id > qb->qt_dquot->q_id)
+ return 1;
+ if (qa->qt_dquot->q_id < qb->qt_dquot->q_id)
+ return -1;
+ return 0;
+}
+
+void
+xfs_dqlockn(
+ struct xfs_dqtrx *q)
+{
+ unsigned int i;
+
+ BUILD_BUG_ON(XFS_QM_TRANS_MAXDQS > MAX_LOCKDEP_SUBCLASSES);
+
+ /* Sort in order of dquot id, do not allow duplicates */
+ for (i = 0; i < XFS_QM_TRANS_MAXDQS && q[i].qt_dquot != NULL; i++) {
+ unsigned int j;
+
+ for (j = 0; j < i; j++)
+ ASSERT(q[i].qt_dquot != q[j].qt_dquot);
+ }
+ if (i == 0)
+ return;
+
+ sort(q, i, sizeof(struct xfs_dqtrx), xfs_dqtrx_cmp, NULL);
+
+ mutex_lock(&q[0].qt_dquot->q_qlock);
+ for (i = 1; i < XFS_QM_TRANS_MAXDQS && q[i].qt_dquot != NULL; i++)
+ mutex_lock_nested(&q[i].qt_dquot->q_qlock,
+ XFS_QLOCK_NESTED + i - 1);
+}
+
int __init
xfs_qm_init(void)
{
diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
index 6b5e3cf40c8b..0e954f88811f 100644
--- a/fs/xfs/xfs_dquot.h
+++ b/fs/xfs/xfs_dquot.h
@@ -231,6 +231,7 @@ int xfs_qm_dqget_uncached(struct xfs_mount *mp,
void xfs_qm_dqput(struct xfs_dquot *dqp);
void xfs_dqlock2(struct xfs_dquot *, struct xfs_dquot *);
+void xfs_dqlockn(struct xfs_dqtrx *q);
void xfs_dquot_set_prealloc_limits(struct xfs_dquot *);
diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
index 442a0f97a9d4..f75c12c4c6a0 100644
--- a/fs/xfs/xfs_qm.h
+++ b/fs/xfs/xfs_qm.h
@@ -121,7 +121,7 @@ enum {
XFS_QM_TRANS_PRJ,
XFS_QM_TRANS_DQTYPES
};
-#define XFS_QM_TRANS_MAXDQS 2
+#define XFS_QM_TRANS_MAXDQS 5
struct xfs_dquot_acct {
struct xfs_dqtrx dqs[XFS_QM_TRANS_DQTYPES][XFS_QM_TRANS_MAXDQS];
};
diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
index 955c457e585a..99a03acd4488 100644
--- a/fs/xfs/xfs_trans_dquot.c
+++ b/fs/xfs/xfs_trans_dquot.c
@@ -268,24 +268,29 @@ xfs_trans_mod_dquot(
/*
* Given an array of dqtrx structures, lock all the dquots associated and join
- * them to the transaction, provided they have been modified. We know that the
- * highest number of dquots of one type - usr, grp and prj - involved in a
- * transaction is 3 so we don't need to make this very generic.
+ * them to the transaction, provided they have been modified.
*/
STATIC void
xfs_trans_dqlockedjoin(
struct xfs_trans *tp,
struct xfs_dqtrx *q)
{
+ unsigned int i;
ASSERT(q[0].qt_dquot != NULL);
if (q[1].qt_dquot == NULL) {
xfs_dqlock(q[0].qt_dquot);
xfs_trans_dqjoin(tp, q[0].qt_dquot);
- } else {
- ASSERT(XFS_QM_TRANS_MAXDQS == 2);
+ } else if (q[2].qt_dquot == NULL) {
xfs_dqlock2(q[0].qt_dquot, q[1].qt_dquot);
xfs_trans_dqjoin(tp, q[0].qt_dquot);
xfs_trans_dqjoin(tp, q[1].qt_dquot);
+ } else {
+ xfs_dqlockn(q);
+ for (i = 0; i < XFS_QM_TRANS_MAXDQS; i++) {
+ if (q[i].qt_dquot == NULL)
+ break;
+ xfs_trans_dqjoin(tp, q[i].qt_dquot);
+ }
}
}
--
2.47.1
Guangshuo Li wrote:
> Hi Alison, Dave, and all,
>
> Thanks for the feedback. I’ve adopted your suggestions. Below is what I
> plan to take in v3.
I would just post v3. The review tags given on that version will be
picked up when the patch is merged if it is ok.
Thanks,
Ira
[snip]
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Commit 43c51bb573aa ("sc16is7xx: make sure device is in suspend once
probed") permanently enabled access to the enhanced features in
sc16is7xx_probe(), and it is never disabled after that.
Therefore, remove useless re-enable of enhanced features in
sc16is7xx_set_baud().
Fixes: 43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
drivers/tty/serial/sc16is7xx.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 1a2c4c14f6aac..c7435595dce13 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -588,13 +588,6 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
div /= prescaler;
}
- /* Enable enhanced features */
- sc16is7xx_efr_lock(port);
- sc16is7xx_port_update(port, SC16IS7XX_EFR_REG,
- SC16IS7XX_EFR_ENABLE_BIT,
- SC16IS7XX_EFR_ENABLE_BIT);
- sc16is7xx_efr_unlock(port);
-
/* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
--
2.39.5