The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: ebfc8542ad62d066771e46c8aa30f5624b89cad8
Gitweb: https://git.kernel.org/tip/ebfc8542ad62d066771e46c8aa30f5624b89cad8
Author: Adrian Hunter <adrian.hunter(a)intel.com>
AuthorDate: Mon, 13 Oct 2025 10:22:42 +03:00
Committer: Peter Zijlstra <peterz(a)infradead.org>
CommitterDate: Tue, 14 Oct 2025 10:38:09 +02:00
perf/core: Fix address filter match with backing files
It was reported that Intel PT address filters do not work in Docker
containers. That relates to the use of overlayfs.
overlayfs records the backing file in struct vm_area_struct vm_file,
instead of the user file that the user mmapped. In order for an address
filter to match, it must compare to the user file inode. There is an
existing helper file_user_inode() for that situation.
Use file_user_inode() instead of file_inode() to get the inode for address
filter matching.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record --buildid-mmap -e intel_pt//u --filter 'filter * @ /root/test/merged/cat' -- /root/test/merged/cat /proc/self/maps
...
55d61d246000-55d61d2e1000 r-xp 00018000 00:1a 3418 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB perf.data ]
# perf buildid-cache --add /root/test/merged/cat
Before:
Address filter does not match so there are no control flow packets
# perf script --itrace=e
# perf script --itrace=b | wc -l
0
# perf script -D | grep 'TIP.PGE' | wc -l
0
#
After:
Address filter does match so there are control flow packets
# perf script --itrace=e
# perf script --itrace=b | wc -l
235
# perf script -D | grep 'TIP.PGE' | wc -l
57
#
With respect to stable kernels, overlayfs mmap function ovl_mmap() was
added in v4.19 but file_user_inode() was not added until v6.8 and never
back-ported to stable kernels. FMODE_BACKING that it depends on was added
in v6.5. This issue has gone largely unnoticed, so back-porting before
v6.8 is probably not worth it, so put 6.8 as the stable kernel prerequisite
version, although in practice the next long term kernel is 6.12.
Closes: https://lore.kernel.org/linux-perf-users/aBCwoq7w8ohBRQCh@fremen.lan
Reported-by: Edd Barrett <edd(a)theunixzoo.co.uk>
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Amir Goldstein <amir73il(a)gmail.com>
Cc: stable(a)vger.kernel.org # 6.8
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7541f6f..cd63ec8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9492,7 +9492,7 @@ static bool perf_addr_filter_match(struct perf_addr_filter *filter,
if (!filter->path.dentry)
return false;
- if (d_inode(filter->path.dentry) != file_inode(file))
+ if (d_inode(filter->path.dentry) != file_user_inode(file))
return false;
if (filter->offset > offset + size)
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: 8818f507a9391019a3ec7c57b1a32e4b386e48a5
Gitweb: https://git.kernel.org/tip/8818f507a9391019a3ec7c57b1a32e4b386e48a5
Author: Adrian Hunter <adrian.hunter(a)intel.com>
AuthorDate: Mon, 13 Oct 2025 10:22:43 +03:00
Committer: Peter Zijlstra <peterz(a)infradead.org>
CommitterDate: Tue, 14 Oct 2025 10:38:09 +02:00
perf/core: Fix MMAP event path names with backing files
Some file systems like FUSE-based ones or overlayfs may record the backing
file in struct vm_area_struct vm_file, instead of the user file that the
user mmapped.
Since commit def3ae83da02f ("fs: store real path instead of fake path in
backing file f_path"), file_path() no longer returns the user file path
when applied to a backing file. There is an existing helper
file_user_path() for that situation.
Use file_user_path() instead of file_path() to get the path for MMAP
and MMAP2 events.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record -e intel_pt//u -- /root/test/merged/cat /proc/self/maps
...
55b0ba399000-55b0ba434000 r-xp 00018000 00:1a 3419 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.060 MB perf.data ]
#
Before:
File name is wrong (/cat), so decoding fails:
# perf script --no-itrace --show-mmap-events
cat 367 [016] 100.491492: PERF_RECORD_MMAP2 367/367: [0x55b0ba399000(0x9b000) @ 0x18000 00:02 3419 489959280]: r-xp /cat
...
# perf script --itrace=e | wc -l
Warning:
19 instruction trace errors
19
#
After:
File name is correct (/root/test/merged/cat), so decoding is ok:
# perf script --no-itrace --show-mmap-events
cat 364 [016] 72.153006: PERF_RECORD_MMAP2 364/364: [0x55ce4003d000(0x9b000) @ 0x18000 00:02 3419 3132534314]: r-xp /root/test/merged/cat
# perf script --itrace=e
# perf script --itrace=e | wc -l
0
#
Fixes: def3ae83da02f ("fs: store real path instead of fake path in backing file f_path")
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Amir Goldstein <amir73il(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index cd63ec8..7b5c237 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9416,7 +9416,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
* need to add enough zero bytes after the string to handle
* the 64bit alignment we do later.
*/
- name = file_path(file, buf, PATH_MAX - sizeof(u64));
+ name = d_path(file_user_path(file), buf, PATH_MAX - sizeof(u64));
if (IS_ERR(name)) {
name = "//toolong";
goto cpy_name;
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: fa4f4bae893fbce8a3edfff1ab7ece0c01dc1328
Gitweb: https://git.kernel.org/tip/fa4f4bae893fbce8a3edfff1ab7ece0c01dc1328
Author: Adrian Hunter <adrian.hunter(a)intel.com>
AuthorDate: Mon, 13 Oct 2025 10:22:44 +03:00
Committer: Peter Zijlstra <peterz(a)infradead.org>
CommitterDate: Tue, 14 Oct 2025 10:38:10 +02:00
perf/core: Fix MMAP2 event device with backing files
Some file systems like FUSE-based ones or overlayfs may record the backing
file in struct vm_area_struct vm_file, instead of the user file that the
user mmapped.
That causes perf to misreport the device major/minor numbers of the file
system of the file, and the generation of the file, and potentially other
inode details. There is an existing helper file_user_inode() for that
situation.
Use file_user_inode() instead of file_inode() to get the inode for MMAP2
events.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record -e cycles:u -- /root/test/merged/cat /proc/self/maps
...
55b2c91d0000-55b2c926b000 r-xp 00018000 00:1a 3419 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.004 MB perf.data (5 samples) ]
#
# stat /root/test/merged/cat
File: /root/test/merged/cat
Size: 1127792 Blocks: 2208 IO Block: 4096 regular file
Device: 0,26 Inode: 3419 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2025-09-08 12:23:59.453309624 +0000
Modify: 2025-09-08 12:23:59.454309624 +0000
Change: 2025-09-08 12:23:59.454309624 +0000
Birth: 2025-09-08 12:23:59.453309624 +0000
Before:
Device reported 00:02 differs from stat output and /proc/self/maps
# perf script --show-mmap-events | grep /root/test/merged/cat
cat 377 [-01] 243.078558: PERF_RECORD_MMAP2 377/377: [0x55b2c91d0000(0x9b000) @ 0x18000 00:02 3419 2068525940]: r-xp /root/test/merged/cat
After:
Device reported 00:1a is the same as stat output and /proc/self/maps
# perf script --show-mmap-events | grep /root/test/merged/cat
cat 362 [-01] 127.755167: PERF_RECORD_MMAP2 362/362: [0x55ba6e781000(0x9b000) @ 0x18000 00:1a 3419 0]: r-xp /root/test/merged/cat
With respect to stable kernels, overlayfs mmap function ovl_mmap() was
added in v4.19 but file_user_inode() was not added until v6.8 and never
back-ported to stable kernels. FMODE_BACKING that it depends on was added
in v6.5. This issue has gone largely unnoticed, so back-porting before
v6.8 is probably not worth it, so put 6.8 as the stable kernel prerequisite
version, although in practice the next long term kernel is 6.12.
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Amir Goldstein <amir73il(a)gmail.com>
Cc: stable(a)vger.kernel.org # 6.8
---
kernel/events/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7b5c237..177e57c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9403,7 +9403,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
flags |= MAP_HUGETLB;
if (file) {
- struct inode *inode;
+ const struct inode *inode;
dev_t dev;
buf = kmalloc(PATH_MAX, GFP_KERNEL);
@@ -9421,7 +9421,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
name = "//toolong";
goto cpy_name;
}
- inode = file_inode(vma->vm_file);
+ inode = file_user_inode(vma->vm_file);
dev = inode->i_sb->s_dev;
ino = inode->i_ino;
gen = inode->i_generation;
We found some discrete AMD graphics devices hide funtion 0 and the whole
is not supposed to be probed.
Since our original purpose is to allow integrated devices (on bus 0) to
be probed without function 0, we can limit the islolated function probing
only on bus 0.
Cc: stable(a)vger.kernel.org
Fixes: a02fd05661d73a8 ("PCI: Extend isolated function probing to LoongArch")
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
drivers/pci/probe.c | 2 +-
include/linux/hypervisor.h | 8 +++++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c83e75a0ec12..da6a2aef823a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2883,7 +2883,7 @@ int pci_scan_slot(struct pci_bus *bus, int devfn)
* a hypervisor that passes through individual PCI
* functions.
*/
- if (!hypervisor_isolated_pci_functions())
+ if (!hypervisor_isolated_pci_functions(bus->number))
break;
}
fn = next_fn(bus, dev, fn);
diff --git a/include/linux/hypervisor.h b/include/linux/hypervisor.h
index be5417303ecf..30ece04a16d9 100644
--- a/include/linux/hypervisor.h
+++ b/include/linux/hypervisor.h
@@ -32,13 +32,15 @@ static inline bool jailhouse_paravirt(void)
#endif /* !CONFIG_X86 */
-static inline bool hypervisor_isolated_pci_functions(void)
+static inline bool hypervisor_isolated_pci_functions(int bus)
{
if (IS_ENABLED(CONFIG_S390))
return true;
- if (IS_ENABLED(CONFIG_LOONGARCH))
- return true;
+ if (IS_ENABLED(CONFIG_LOONGARCH)) {
+ if (bus == 0)
+ return true;
+ }
return jailhouse_paravirt();
}
--
2.47.3
Since the len argument value passed to exfat_ioctl_set_volume_label()
from exfat_nls_to_utf16() is passed 1 too large, an out-of-bounds read
occurs when dereferencing p_cstring in exfat_nls_to_ucs2() later.
And because of the NLS_NAME_OVERLEN macro, another error occurs when
creating a file with a period at the end using utf8 and other iocharsets,
so the NLS_NAME_OVERLEN macro should be removed and the len argument value
should be passed as FSLABEL_MAX - 1.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+98cc76a76de46b3714d4(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=98cc76a76de46b3714d4
Fixes: 370e812b3ec1 ("exfat: add nls operations")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
fs/exfat/file.c | 2 +-
fs/exfat/nls.c | 3 ---
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index f246cf439588..7ce0fb6f2564 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -521,7 +521,7 @@ static int exfat_ioctl_set_volume_label(struct super_block *sb,
memset(&uniname, 0, sizeof(uniname));
if (label[0]) {
- ret = exfat_nls_to_utf16(sb, label, FSLABEL_MAX,
+ ret = exfat_nls_to_utf16(sb, label, FSLABEL_MAX - 1,
&uniname, &lossy);
if (ret < 0)
return ret;
diff --git a/fs/exfat/nls.c b/fs/exfat/nls.c
index 8243d94ceaf4..57db08a5271c 100644
--- a/fs/exfat/nls.c
+++ b/fs/exfat/nls.c
@@ -616,9 +616,6 @@ static int exfat_nls_to_ucs2(struct super_block *sb,
unilen++;
}
- if (p_cstring[i] != '\0')
- lossy |= NLS_NAME_OVERLEN;
-
*uniname = '\0';
p_uniname->name_len = unilen;
p_uniname->name_hash = exfat_calc_chksum16(upname, unilen << 1, 0,
--
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0389c305ef56cbadca4cbef44affc0ec3213ed30
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101330-hemstitch-crimson-1681@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0389c305ef56cbadca4cbef44affc0ec3213ed30 Mon Sep 17 00:00:00 2001
From: Lance Yang <lance.yang(a)linux.dev>
Date: Wed, 17 Sep 2025 21:31:37 +0800
Subject: [PATCH] selftests/mm: skip soft-dirty tests when
CONFIG_MEM_SOFT_DIRTY is disabled
The madv_populate and soft-dirty kselftests currently fail on systems
where CONFIG_MEM_SOFT_DIRTY is disabled.
Introduce a new helper softdirty_supported() into vm_util.c/h to ensure
tests are properly skipped when the feature is not enabled.
Link: https://lkml.kernel.org/r/20250917133137.62802-1-lance.yang@linux.dev
Fixes: 9f3265db6ae8 ("selftests: vm: add test for Soft-Dirty PTE bit")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Gabriel Krisman Bertazi <krisman(a)collabora.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/tools/testing/selftests/mm/madv_populate.c b/tools/testing/selftests/mm/madv_populate.c
index b6fabd5c27ed..d8d11bc67ddc 100644
--- a/tools/testing/selftests/mm/madv_populate.c
+++ b/tools/testing/selftests/mm/madv_populate.c
@@ -264,23 +264,6 @@ static void test_softdirty(void)
munmap(addr, SIZE);
}
-static int system_has_softdirty(void)
-{
- /*
- * There is no way to check if the kernel supports soft-dirty, other
- * than by writing to a page and seeing if the bit was set. But the
- * tests are intended to check that the bit gets set when it should, so
- * doing that check would turn a potentially legitimate fail into a
- * skip. Fortunately, we know for sure that arm64 does not support
- * soft-dirty. So for now, let's just use the arch as a corse guide.
- */
-#if defined(__aarch64__)
- return 0;
-#else
- return 1;
-#endif
-}
-
int main(int argc, char **argv)
{
int nr_tests = 16;
@@ -288,7 +271,7 @@ int main(int argc, char **argv)
pagesize = getpagesize();
- if (system_has_softdirty())
+ if (softdirty_supported())
nr_tests += 5;
ksft_print_header();
@@ -300,7 +283,7 @@ int main(int argc, char **argv)
test_holes();
test_populate_read();
test_populate_write();
- if (system_has_softdirty())
+ if (softdirty_supported())
test_softdirty();
err = ksft_get_fail_cnt();
diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 8a3f2b4b2186..4ee4db3750c1 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -200,8 +200,11 @@ int main(int argc, char **argv)
int pagesize;
ksft_print_header();
- ksft_set_plan(15);
+ if (!softdirty_supported())
+ ksft_exit_skip("soft-dirty is not support\n");
+
+ ksft_set_plan(15);
pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY);
if (pagemap_fd < 0)
ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH);
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 56e9bd541edd..e33cda301dad 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -449,6 +449,23 @@ bool check_vmflag_pfnmap(void *addr)
return check_vmflag(addr, "pf");
}
+bool softdirty_supported(void)
+{
+ char *addr;
+ bool supported = false;
+ const size_t pagesize = getpagesize();
+
+ /* New mappings are expected to be marked with VM_SOFTDIRTY (sd). */
+ addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
+ if (!addr)
+ ksft_exit_fail_msg("mmap failed\n");
+
+ supported = check_vmflag(addr, "sd");
+ munmap(addr, pagesize);
+ return supported;
+}
+
/*
* Open an fd at /proc/$pid/maps and configure procmap_out ready for
* PROCMAP_QUERY query. Returns 0 on success, or an error code otherwise.
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 07c4acfd84b6..26c30fdc0241 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -104,6 +104,7 @@ bool find_vma_procmap(struct procmap_fd *procmap, void *address);
int close_procmap(struct procmap_fd *procmap);
int write_sysfs(const char *file_path, unsigned long val);
int read_sysfs(const char *file_path, unsigned long *val);
+bool softdirty_supported(void);
static inline int open_self_procmap(struct procmap_fd *procmap_out)
{
We add pmd folio into ds_queue on the first page fault in
__do_huge_pmd_anonymous_page(), so that we can split it in case of
memory pressure. This should be the same for a pmd folio during wp
page fault.
Commit 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault") miss
to add it to ds_queue, which means system may not reclaim enough memory
in case of memory pressure even the pmd folio is under used.
Move deferred_split_folio() into map_anon_folio_pmd() to make the pmd
folio installation consistent.
Fixes: 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Lance Yang <lance.yang(a)linux.dev>
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: <stable(a)vger.kernel.org>
---
v2:
* add fix, cc stable and put description about the flow of current
code
* move deferred_split_folio() into map_anon_folio_pmd()
---
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1b81680b4225..f13de93637bf 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1232,6 +1232,7 @@ static void map_anon_folio_pmd(struct folio *folio, pmd_t *pmd,
count_vm_event(THP_FAULT_ALLOC);
count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
+ deferred_split_folio(folio, false);
}
static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
@@ -1272,7 +1273,6 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
map_anon_folio_pmd(folio, vmf->pmd, vma, haddr);
mm_inc_nr_ptes(vma->vm_mm);
- deferred_split_folio(folio, false);
spin_unlock(vmf->ptl);
}
--
2.34.1