From: Steven Rostedt <rostedt@goodmis.org>
[ Upstream commit 90942f9fac05702065ff82ed0bade0d08168d4ea ]
To determine if a task is a kernel thread or not, it is more reliable to use (current->flags & (PF_KTHREAD|PF_USER_WORKER)) than to rely on current->mm being NULL. That is because some kernel tasks (io_uring helpers) may have an mm field.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250820180428.592367294@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my comprehensive analysis, here is my determination:
## **Backport Status: YES**
### Extensive Analysis:
#### **Context and Problem Being Fixed:**
This commit addresses a **correctness bug** in how the perf subsystem identifies kernel threads. The issue is that the code was using `current->mm == NULL` to determine if a task is a kernel thread, but this assumption is incorrect for **io_uring helper threads and other PF_USER_WORKER threads**, which have a non-NULL mm field despite being kernel-side workers.
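The disagreement between the two checks can be sketched in plain userspace C. This is a minimal illustration, not kernel code: `struct mock_task`, `is_kernel_task_by_mm()`, `is_kernel_task_by_flags()`, and `report()` are hypothetical names, and the flag values are copied from `include/linux/sched.h` purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Flag values mirror include/linux/sched.h; reproduced for illustration only. */
#define PF_USER_WORKER 0x00004000u
#define PF_KTHREAD     0x00200000u

/* Hypothetical stand-in for the two task_struct fields that matter here. */
struct mock_task {
	const char *comm;
	unsigned int flags;
	const void *mm;	/* non-NULL means the task has an address space */
};

/* Old heuristic: a task without an mm is treated as a kernel thread. */
static bool is_kernel_task_by_mm(const struct mock_task *t)
{
	return t->mm == NULL;
}

/* New check: rely on the task flags instead of the mm pointer. */
static bool is_kernel_task_by_flags(const struct mock_task *t)
{
	return (t->flags & (PF_KTHREAD | PF_USER_WORKER)) != 0;
}

/* Print both verdicts for one task; returns true when they disagree. */
static bool report(const struct mock_task *t)
{
	bool by_mm = is_kernel_task_by_mm(t);
	bool by_flags = is_kernel_task_by_flags(t);

	printf("%-10s by_mm=%d by_flags=%d\n", t->comm, by_mm, by_flags);
	return by_mm != by_flags;
}
```

For an ordinary user task and a plain kernel thread, both predicates agree. They differ only for a task that carries `PF_USER_WORKER` together with a non-NULL `mm`, which is the io_uring helper case described above.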
#### **Evidence from Code Investigation:**
1. **This is part of a fix series**: I found three related commits in upstream:
   - `16ed389227651`: "perf: Skip user unwind if the task is a kernel thread" (already being backported to stable as `823d7b9ec8616`)
   - `d77e3319e3109`: "perf: Simplify get_perf_callchain() user logic" (already in stable as `96681d3b99282`)
   - `90942f9fac057`: **This commit** - completes the fix by updating remaining locations
2. **Historical context**: PF_USER_WORKER was introduced in commit `54e6842d0775b` (March 2023) to handle io_uring and vhost workers that behave differently from regular kernel threads. These threads have mm contexts but shouldn't be treated as user threads for operations like register sampling.
3. **Real-world impact**: PowerPC already experienced crashes (commit `01849382373b8`) when trying to access pt_regs for PF_IO_WORKER tasks during coredump generation, demonstrating this class of bugs is real.
#### **Specific Code Changes Analysis:**
1. **kernel/events/callchain.c:247-250** (currently at line 245 in autosel-6.17):
   - **OLD**: `if (current->mm)` then use `task_pt_regs(current)`
   - **NEW**: `if (current->flags & (PF_KTHREAD | PF_USER_WORKER))` then skip user unwinding
   - **Impact**: Prevents perf from attempting to unwind user stack for io_uring helpers
2. **kernel/events/core.c:7455** (currently at line 7443 in autosel-6.17):
   - **OLD**: `!(current->flags & PF_KTHREAD)`
   - **NEW**: `!(current->flags & (PF_KTHREAD | PF_USER_WORKER))`
   - **Impact**: Correctly excludes user worker threads from user register sampling
3. **kernel/events/core.c:8095** (currently at line 8083 in autosel-6.17):
   - **OLD**: `if (current->mm != NULL)`
   - **NEW**: `if (!(current->flags & (PF_KTHREAD | PF_USER_WORKER)))`
   - **Impact**: Prevents incorrect page table walks for user worker threads in `perf_virt_to_phys()`
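The branch order shared by the callchain and register-sampling changes can be modelled as a small pure function. This is a sketch under stated assumptions: `enum user_regs_source` and `pick_user_regs()` are hypothetical names that do not exist in the kernel, the boolean argument stands in for `user_mode(regs)`, and the flag values are copied from `include/linux/sched.h` for illustration.

```c
#include <stdbool.h>

/* Flag values mirror include/linux/sched.h; reproduced for illustration only. */
#define PF_USER_WORKER 0x00004000u
#define PF_KTHREAD     0x00200000u

/* Hypothetical summary of where user registers would come from. */
enum user_regs_source {
	REGS_FROM_SAMPLE,	/* sample hit user mode: use the sampled regs */
	REGS_FROM_TASK,		/* sample hit kernel mode: use the task's saved user regs */
	REGS_NONE,		/* kernel thread or user worker: nothing to unwind or sample */
};

/*
 * Model of the post-patch decision: sampled_in_user_mode stands in for
 * user_mode(regs), task_flags for current->flags.
 */
static enum user_regs_source pick_user_regs(bool sampled_in_user_mode,
					    unsigned int task_flags)
{
	if (sampled_in_user_mode)
		return REGS_FROM_SAMPLE;
	if (task_flags & (PF_KTHREAD | PF_USER_WORKER))
		return REGS_NONE;
	return REGS_FROM_TASK;
}
```

Under this model, a task flagged `PF_USER_WORKER` that is sampled in kernel mode yields `REGS_NONE`, whereas the old `current->mm` test would have fallen through to the saved user registers.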
#### **Why This Qualifies for Backporting:**
1. **Fixes a real bug**: Perf incorrectly handles io_uring helper threads, leading to:
   - Incorrect callchain collection
   - Wrong register samples
   - Potential crashes or data corruption when walking page tables
2. **Affects widely-used functionality**: io_uring is heavily used in modern applications (databases, proxies, async I/O workloads), and perf profiling of these workloads would hit this bug
3. **Small and contained**: Only 3 conditional checks changed across 2 files - minimal risk
4. **Part of an upstream series already being backported**: The first commit in the series (`16ed389227651`) is already marked for stable backport, making this a natural follow-up
5. **No architectural changes**: Pure bug fix with no API changes or feature additions
6. **Low regression risk**: The new flag-based check is more correct than the mm-based check; any behavioral changes would be fixing incorrect behavior
#### **Verification in Current Tree:**
I confirmed that linux-autosel-6.17 still has the old buggy code:
- core.c, line 7443: Missing PF_USER_WORKER in the check
- core.c, line 8083: Still uses `current->mm != NULL`
- callchain.c, line 245: Still uses `if (current->mm)`
This confirms the fix is needed and not yet applied.
#### **Conclusion:**
This is a **clear YES for backporting**. It's a well-understood correctness fix for a real bug affecting perf profiling of io_uring workloads, with minimal risk and part of an ongoing stable backport series.
 kernel/events/callchain.c | 6 +++---
 kernel/events/core.c      | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 6c83ad674d010..decff7266cfbd 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -242,10 +242,10 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
 
 	if (user) {
 		if (!user_mode(regs)) {
-			if (current->mm)
-				regs = task_pt_regs(current);
-			else
+			if (current->flags & (PF_KTHREAD | PF_USER_WORKER))
 				regs = NULL;
+			else
+				regs = task_pt_regs(current);
 		}
 
 		if (regs) {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 820127536e62b..ea9ff856770be 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7440,7 +7440,7 @@ static void perf_sample_regs_user(struct perf_regs *regs_user,
 	if (user_mode(regs)) {
 		regs_user->abi = perf_reg_abi(current);
 		regs_user->regs = regs;
-	} else if (!(current->flags & PF_KTHREAD)) {
+	} else if (!(current->flags & (PF_KTHREAD | PF_USER_WORKER))) {
 		perf_get_regs_user(regs_user, regs);
 	} else {
 		regs_user->abi = PERF_SAMPLE_REGS_ABI_NONE;
@@ -8080,7 +8080,7 @@ static u64 perf_virt_to_phys(u64 virt)
 		 * Try IRQ-safe get_user_page_fast_only first.
 		 * If failed, leave phys_addr as 0.
 		 */
-		if (current->mm != NULL) {
+		if (!(current->flags & (PF_KTHREAD | PF_USER_WORKER))) {
 			struct page *p;
 
 			pagefault_disable();