The quilt patch titled
Subject: mm: close theoretical race where stale TLB entries could linger
has been removed from the -mm tree. Its filename was
mm-close-theoretical-race-where-stale-tlb-entries-could-linger.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryan Roberts <ryan.roberts(a)arm.com>
Subject: mm: close theoretical race where stale TLB entries could linger
Date: Fri, 6 Jun 2025 10:28:07 +0100
Commit 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with a
parallel reclaim leaving stale TLB entries") described a theoretical race
as such:
"""
Nadav Amit identified a theoretical race between page reclaim and mprotect
due to TLB flushes being batched outside of the PTL being held.
He described the race as follows:
CPU0 CPU1
---- ----
user accesses memory using RW PTE
[PTE now cached in TLB]
try_to_unmap_one()
==> ptep_get_and_clear()
==> set_tlb_ubc_flush_pending()
mprotect(addr, PROT_READ)
==> change_pte_range()
==> [ PTE non-present - no flush ]
user writes using cached RW PTE
...
try_to_unmap_flush()
The same type of race exists for reads when protecting for PROT_NONE and
also exists for operations that can leave an old TLB entry behind such as
munmap, mremap and madvise.
"""
The solution was to introduce flush_tlb_batched_pending() and call it
under the PTL from mprotect/madvise/munmap/mremap to complete any pending
tlb flushes.
However, while madvise_free_pte_range() and
madvise_cold_or_pageout_pte_range() were both retro-fitted to call
flush_tlb_batched_pending() immediately after initially acquiring the PTL,
they both temporarily release the PTL to split a large folio if they
stumble upon one. In this case, where re-acquiring the PTL
flush_tlb_batched_pending() must be called again, but it previously was
not. Let's fix that.
There are 2 Fixes: tags here: the first is the commit that fixed
madvise_free_pte_range(). The second is the commit that added
madvise_cold_or_pageout_pte_range(), which looks like it copy/pasted the
faulty pattern from madvise_free_pte_range().
This is a theoretical bug discovered during code review.
Link: https://lkml.kernel.org/r/20250606092809.4194056-1-ryan.roberts@arm.com
Fixes: 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries")
Fixes: 9c276cc65a58 ("mm: introduce MADV_COLD")
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
Reviewed-by: Jann Horn <jannh(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Mel Gorman <mgorman <mgorman(a)suse.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/madvise.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/madvise.c~mm-close-theoretical-race-where-stale-tlb-entries-could-linger
+++ a/mm/madvise.c
@@ -508,6 +508,7 @@ restart:
pte_offset_map_lock(mm, pmd, addr, &ptl);
if (!start_pte)
break;
+ flush_tlb_batched_pending(mm);
arch_enter_lazy_mmu_mode();
if (!err)
nr = 0;
@@ -741,6 +742,7 @@ static int madvise_free_pte_range(pmd_t
start_pte = pte;
if (!start_pte)
break;
+ flush_tlb_batched_pending(mm);
arch_enter_lazy_mmu_mode();
if (!err)
nr = 0;
_
Patches currently in -mm which might be from ryan.roberts(a)arm.com are
mm-readahead-honour-new_order-in-page_cache_ra_order.patch
mm-readahead-terminate-async-readahead-on-natural-boundary.patch
mm-readahead-make-space-in-struct-file_ra_state.patch
mm-readahead-store-folio-order-in-struct-file_ra_state.patch
mm-filemap-allow-arch-to-request-folio-size-for-exec-memory.patch
mm-remove-arch_flush_tlb_batched_pending-arch-helper.patch
The quilt patch titled
Subject: mm/vma: reset VMA iterator on commit_merge() OOM failure
has been removed from the -mm tree. Its filename was
mm-vma-reset-vma-iterator-on-commit_merge-oom-failure.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm/vma: reset VMA iterator on commit_merge() OOM failure
Date: Fri, 6 Jun 2025 13:50:32 +0100
While an OOM failure in commit_merge() isn't really feasible due to the
allocation which might fail (a maple tree pre-allocation) being 'too small
to fail', we do need to handle this case correctly regardless.
In vma_merge_existing_range(), we can theoretically encounter failures
which result in an OOM error in two ways - firstly dup_anon_vma() might
fail with an OOM error, and secondly commit_merge() failing, ultimately,
to pre-allocate a maple tree node.
The abort logic for dup_anon_vma() resets the VMA iterator to the initial
range, ensuring that any logic looping on this iterator will correctly
proceed to the next VMA.
However the commit_merge() abort logic does not do the same thing. This
resulted in a syzbot report occurring because mlockall() iterates through
VMAs, is tolerant of errors, but ended up with an incorrect previous VMA
being specified due to incorrect iterator state.
While making this change, it became apparent we are duplicating logic -
the logic introduced in commit 41e6ddcaa0f1 ("mm/vma: add give_up_on_oom
option on modify/merge, use in uffd release") duplicates the
vmg->give_up_on_oom check in both abort branches.
Additionally, we observe that we can perform the anon_dup check safely on
dup_anon_vma() failure, as this will not be modified should this call
fail.
Finally, we need to reset the iterator in both cases, so now we can simply
use the exact same code to abort for both.
We remove the VM_WARN_ON(err != -ENOMEM) as it would be silly for this to
be otherwise and it allows us to implement the abort check more neatly.
Link: https://lkml.kernel.org/r/20250606125032.164249-1-lorenzo.stoakes@oracle.com
Fixes: 47b16d0462a4 ("mm: abort vma_modify() on merge out of memory failure")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: syzbot+d16409ea9ecc16ed261a(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-mm/6842cc67.a00a0220.29ac89.003b.GAE@google.c…
Reviewed-by: Pedro Falcato <pfalcato(a)suse.de>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vma.c | 22 ++++------------------
1 file changed, 4 insertions(+), 18 deletions(-)
--- a/mm/vma.c~mm-vma-reset-vma-iterator-on-commit_merge-oom-failure
+++ a/mm/vma.c
@@ -967,26 +967,9 @@ static __must_check struct vm_area_struc
err = dup_anon_vma(next, middle, &anon_dup);
}
- if (err)
+ if (err || commit_merge(vmg))
goto abort;
- err = commit_merge(vmg);
- if (err) {
- VM_WARN_ON(err != -ENOMEM);
-
- if (anon_dup)
- unlink_anon_vmas(anon_dup);
-
- /*
- * We've cleaned up any cloned anon_vma's, no VMAs have been
- * modified, no harm no foul if the user requests that we not
- * report this and just give up, leaving the VMAs unmerged.
- */
- if (!vmg->give_up_on_oom)
- vmg->state = VMA_MERGE_ERROR_NOMEM;
- return NULL;
- }
-
khugepaged_enter_vma(vmg->target, vmg->flags);
vmg->state = VMA_MERGE_SUCCESS;
return vmg->target;
@@ -995,6 +978,9 @@ abort:
vma_iter_set(vmg->vmi, start);
vma_iter_load(vmg->vmi);
+ if (anon_dup)
+ unlink_anon_vmas(anon_dup);
+
/*
* This means we have failed to clone anon_vma's correctly, but no
* actual changes to VMAs have occurred, so no harm no foul - if the
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-add-mmap_prepare-compatibility-layer-for-nested-file-systems.patch
mm-add-mmap_prepare-compatibility-layer-for-nested-file-systems-fix-2.patch
docs-mm-expand-vma-doc-to-highlight-pte-freeing-non-vma-traversal.patch
mm-ksm-have-ksm-vma-checks-not-require-a-vma-pointer.patch
mm-ksm-refer-to-special-vmas-via-vm_special-in-ksm_compatible.patch
mm-prevent-ksm-from-breaking-vma-merging-for-new-vmas.patch
tools-testing-selftests-add-vma-merge-tests-for-ksm-merge.patch
mm-pagewalk-split-walk_page_range_novma-into-kernel-user-parts.patch
mm-mremap-introduce-more-mergeable-mremap-via-mremap_relocate_anon.patch
mm-mremap-add-mremap_must_relocate_anon.patch
mm-mremap-add-mremap_relocate_anon-support-for-large-folios.patch
tools-uapi-update-copy-of-linux-mmanh-from-the-kernel-sources.patch
tools-testing-selftests-add-sys_mremap-helper-to-vm_utilh.patch
tools-testing-selftests-add-mremap-cases-that-merge-normally.patch
tools-testing-selftests-add-mremap_relocate_anon-merge-test-cases.patch
tools-testing-selftests-expand-mremap-tests-for-mremap_relocate_anon.patch
tools-testing-selftests-have-cow-self-test-use-mremap_relocate_anon.patch
tools-testing-selftests-test-relocate-anon-in-split-huge-page-test.patch
tools-testing-selftests-add-mremap_relocate_anon-fork-tests.patch
With the conversion done by commit e88f03230dc0 ("clk: qcom: gcc-ipq8074:
rework nss_port5/6 clock to multiple conf") a Copy-Paste error was made
for the nss_port6_tx_clk_src frequency table.
This was caused by the wrong setting of the parent in
ftbl_nss_port6_tx_clk_src that was wrongly set to P_UNIPHY1_RX instead
of P_UNIPHY2_TX.
This cause the UNIPHY2 port to malfunction when it needs to be scaled to
higher clock. The malfunction was observed with the example scenario
with an Aquantia 10G PHY connected and a speed higher than 1G (example
2.5G)
Fix the broken frequency table to restore original functionality.
Cc: stable(a)vger.kernel.org
Fixes: e88f03230dc0 ("clk: qcom: gcc-ipq8074: rework nss_port5/6 clock to multiple conf")
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
drivers/clk/qcom/gcc-ipq8074.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/clk/qcom/gcc-ipq8074.c b/drivers/clk/qcom/gcc-ipq8074.c
index 7258ba5c0900..1329ea28d703 100644
--- a/drivers/clk/qcom/gcc-ipq8074.c
+++ b/drivers/clk/qcom/gcc-ipq8074.c
@@ -1895,10 +1895,10 @@ static const struct freq_conf ftbl_nss_port6_tx_clk_src_125[] = {
static const struct freq_multi_tbl ftbl_nss_port6_tx_clk_src[] = {
FMS(19200000, P_XO, 1, 0, 0),
FM(25000000, ftbl_nss_port6_tx_clk_src_25),
- FMS(78125000, P_UNIPHY1_RX, 4, 0, 0),
+ FMS(78125000, P_UNIPHY2_TX, 4, 0, 0),
FM(125000000, ftbl_nss_port6_tx_clk_src_125),
- FMS(156250000, P_UNIPHY1_RX, 2, 0, 0),
- FMS(312500000, P_UNIPHY1_RX, 1, 0, 0),
+ FMS(156250000, P_UNIPHY2_TX, 2, 0, 0),
+ FMS(312500000, P_UNIPHY2_TX, 1, 0, 0),
{ }
};
--
2.48.1
From: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
Hi,
Commit 653143ed73ec ("serial: sh-sci: Check if TX data was written
to device in .tx_empty()") doesn't apply cleanly on top of v5.10.y
stable tree. This series adjust it. Along with it, propose for
backporting other sh-sci fixes.
Please provide your feedback.
Thank you,
Claudiu Beznea
Claudiu Beznea (4):
serial: sh-sci: Check if TX data was written to device in .tx_empty()
serial: sh-sci: Move runtime PM enable to sci_probe_single()
serial: sh-sci: Clean sci_ports[0] after at earlycon exit
serial: sh-sci: Increment the runtime usage counter for the earlycon
device
drivers/tty/serial/sh-sci.c | 97 ++++++++++++++++++++++++++++++-------
1 file changed, 79 insertions(+), 18 deletions(-)
--
2.43.0
From: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
Hi,
Commit 653143ed73ec ("serial: sh-sci: Check if TX data was written
to device in .tx_empty()") doesn't apply cleanly on top of v6.1.y
stable tree. This series adjust it. Along with it, propose for
backporting other sh-sci fixes.
Please provide your feedback.
Thank you,
Claudiu Beznea
Claudiu Beznea (4):
serial: sh-sci: Check if TX data was written to device in .tx_empty()
serial: sh-sci: Move runtime PM enable to sci_probe_single()
serial: sh-sci: Clean sci_ports[0] after at earlycon exit
serial: sh-sci: Increment the runtime usage counter for the earlycon
device
drivers/tty/serial/sh-sci.c | 97 ++++++++++++++++++++++++++++++-------
1 file changed, 79 insertions(+), 18 deletions(-)
--
2.43.0
From: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
Hi,
Commit 653143ed73ec ("serial: sh-sci: Check if TX data was written
to device in .tx_empty()") doesn't apply cleanly on top of v5.15.y
stable tree. This series adjust it. Along with it, propose for
backporting other sh-sci fixes.
Please provide your feedback.
Thank you,
Claudiu Beznea
Claudiu Beznea (4):
serial: sh-sci: Check if TX data was written to device in .tx_empty()
serial: sh-sci: Move runtime PM enable to sci_probe_single()
serial: sh-sci: Clean sci_ports[0] after at earlycon exit
serial: sh-sci: Increment the runtime usage counter for the earlycon
device
drivers/tty/serial/sh-sci.c | 97 ++++++++++++++++++++++++++++++-------
1 file changed, 79 insertions(+), 18 deletions(-)
--
2.43.0
From: Chuck Lever <chuck.lever(a)oracle.com>
[ Upstream commit a648fdeb7c0e17177a2280344d015dba3fbe3314 ]
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.
Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().
Note that RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. We believe that NFSv3 reference implementations
also return NFS3ERR_FBIG when ia_size is too large.
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
(cherry picked from commit a648fdeb7c0e17177a2280344d015dba3fbe3314)
[Larry: backport to 5.4.y. Minor conflict resolved due to missing commit 9cde9360d18d
NFSD: Update the SETATTR3args decoder to use struct xdr_stream]
Signed-off-by: Larry Bassel <larry.bassel(a)oracle.com>
---
fs/nfsd/nfs3xdr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 03e8c45a52f3..25b6b4db0af2 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -122,7 +122,7 @@ decode_sattr3(__be32 *p, struct iattr *iap, struct user_namespace *userns)
iap->ia_valid |= ATTR_SIZE;
p = xdr_decode_hyper(p, &newsize);
- iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
+ iap->ia_size = newsize;
}
if ((tmp = ntohl(*p++)) == 1) { /* set to server time */
iap->ia_valid |= ATTR_ATIME;
--
2.46.0
From: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
Hi,
Commit 653143ed73ec ("serial: sh-sci: Check if TX data was written
to device in .tx_empty()") doesn't apply cleanly on top of v6.6.y
stable tree. This series adjust it. Along with it, propose for
backporting other sh-sci fixes.
Please provide your feedback.
Thank you,
Claudiu Beznea
Claudiu Beznea (4):
serial: sh-sci: Check if TX data was written to device in .tx_empty()
serial: sh-sci: Move runtime PM enable to sci_probe_single()
serial: sh-sci: Clean sci_ports[0] after at earlycon exit
serial: sh-sci: Increment the runtime usage counter for the earlycon
device
drivers/tty/serial/sh-sci.c | 97 ++++++++++++++++++++++++++++++-------
1 file changed, 79 insertions(+), 18 deletions(-)
--
2.43.0
From: Chuck Lever <chuck.lever(a)oracle.com>
[ Upstream commit e6faac3f58c7c4176b66f63def17a34232a17b0e ]
iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.
Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
(cherry picked from commit e6faac3f58c7c4176b66f63def17a34232a17b0e)
[Larry: backport to 5.4.y. Minor conflict resolved due to missing commit 2f221d6f7b88
attr: handle idmapped mounts]
Signed-off-by: Larry Bassel <larry.bassel(a)oracle.com>
---
fs/nfsd/vfs.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 6aa968bee0ce..bee4fdf6e239 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -448,6 +448,10 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
.ia_size = iap->ia_size,
};
+ host_err = -EFBIG;
+ if (iap->ia_size < 0)
+ goto out_unlock;
+
host_err = notify_change(dentry, &size_attr, NULL);
if (host_err)
goto out_unlock;
--
2.46.0