From: Harish Kasiviswanathan Harish.Kasiviswanathan@amd.com
[ Upstream commit eac32ff42393efa6657efc821231b8d802c1d485 ]
Fix the following corner case:- Consider a 2M huge page SVM allocation, followed by prefetch call for the first 4K page. The whole range is initially mapped with single PTE. After the prefetch, this range gets split to first page + rest of the pages. Currently, the first page mapping is not updated on MI300A (APU) since page hasn't migrated. However, after range split PTE mapping it not valid.
Fix this by forcing page table update for the whole range when prefetch is called. Calling prefetch on APU doesn't improve performance. If all it deteriotes. However, functionality has to be supported.
v2: Use apu_prefer_gtt as this issue doesn't apply to APUs with carveout VRAM
v3: Simplify by setting the flag for all ASICs as it doesn't affect dGPU
v4: Remove v2 and v3 changes. Force update_mapping when range is split at a size that is not aligned to prange granularity
Suggested-by: Philip Yang Philip.Yang@amd.com Signed-off-by: Harish Kasiviswanathan Harish.Kasiviswanathan@amd.com Reviewed-by: Philip YangPhilip.Yang@amd.com Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 076470b9f6f8d9c7c8ca73a9f054942a686f9ba7) Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
## COMPREHENSIVE ANALYSIS: drm/amdkfd: Fix GPU mappings for APU after prefetch
### 1. COMMIT MESSAGE ANALYSIS
**Subject:** "drm/amdkfd: Fix GPU mappings for APU after prefetch"
**Key Observations:** - Starts with "Fix" - clearly indicates this is a bug fix - **No "Fixes:" tag** - does not reference the commit that introduced the bug - **No "Cc: stable@vger.kernel.org" tag** - maintainer did not explicitly request stable backport - **No CVE mentioned** - not identified as a security issue - **Has "(cherry picked from commit 076470b9f6f8d9c7c8ca73a9f054942a686f9ba7)"** - indicates it was already cherry-picked internally - **Has "Reviewed-by:" tags** from Philip Yang and Felix Kuehling (AMD maintainers) - well-reviewed - **Went through multiple revisions (v2, v3, v4)** - shows careful consideration of the fix
**Bug Description:** The commit describes a specific corner case with AMD APU (specifically MI300A) SVM (Shared Virtual Memory): 1. A 2M huge page SVM allocation is created 2. A prefetch call is made for the first 4K page 3. The range gets split into first page + rest of pages 4. On MI300A APU, the first page mapping is **not updated** because the page hasn't migrated 5. After the range split, the PTE (Page Table Entry) mapping is **not valid**
### 2. DEEP CODE RESEARCH
**Understanding the Bug Mechanism:**
I examined the code in detail and traced the history of the `remap_list` functionality:
**When was remap_list introduced?** - Commit **7ef6b2d4b7e5c** ("drm/amdkfd: remap unaligned svm ranges that have split") - Date: October 17, 2023 - First appeared in: **v6.7-rc1** (November 2023) - This means the bug only affects kernels **v6.7 and later**
**What does remap_list do?** From commit 7ef6b2d4b7e5c, I found that `remap_list` tracks SVM ranges that have been split at non-aligned boundaries. When a 2MB huge page gets split into smaller pages (e.g., 4K + remaining), the split ranges are added to `remap_list` if the split is not aligned to the page range granularity.
**Code Flow Analysis:**
Looking at `svm_range_set_attr()` in `kfd_svm.c`:
```c:3687:3726:drivers/gpu/drm/amd/amdkfd/kfd_svm.c list_for_each_entry(prange, &update_list, update_list) { svm_range_apply_attrs(p, prange, nattr, attrs, &update_mapping); /* TODO: unmap ranges from GPU that lost access */ } // FIX ADDED HERE: update_mapping |= !p->xnack_enabled && !list_empty(&remap_list);
list_for_each_entry_safe(prange, next, &remove_list, update_list) { // ... cleanup code ... }
// Later in the function: list_for_each_entry(prange, &update_list, update_list) { bool migrated;
mutex_lock(&prange->migrate_mutex);
r = svm_range_trigger_migration(mm, prange, &migrated); if (r) goto out_unlock_range;
if (migrated && (!p->xnack_enabled || (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) && prange->mapped_to_gpu) { pr_debug("restore_work will update mappings of GPUs\n"); mutex_unlock(&prange->migrate_mutex); continue; }
if (!migrated && !update_mapping) { // BUG IS HERE! mutex_unlock(&prange->migrate_mutex); continue; // Skips mapping update! }
// ... mapping code ... } ```
**The Bug:** At line 3723, there's a critical check: `if (!migrated && !update_mapping)`. If both conditions are true: - `!migrated`: The page hasn't been migrated (common on APUs where memory doesn't need to physically move) - `!update_mapping`: No update is needed (set by earlier code)
Then the code **skips** the GPU page table update entirely by calling `continue`.
**The Problem:** When a range is split and added to `remap_list`, but no migration occurs (typical on APUs), the `update_mapping` flag remains false. This causes the GPU page tables to **not get updated** to reflect the new split state, leaving **invalid PTEs** in the GPU page tables.
**The Fix:** ```c update_mapping |= !p->xnack_enabled && !list_empty(&remap_list); ```
This forces `update_mapping` to be true when: - `!p->xnack_enabled`: XNACK is disabled (typical for APUs without page fault recovery) - `!list_empty(&remap_list)`: There are ranges that were split at non- aligned boundaries
**Why XNACK matters:** XNACK (eXtended Not-ACKnowledged) is AMD's page fault retry mechanism. When XNACK is enabled, the GPU can handle page faults by retrying. When XNACK is disabled (typical for APUs), page faults cannot be recovered, so **all GPU page tables must be correct before GPU access**.
### 3. SECURITY ASSESSMENT
**Not a security issue:** - No CVE assigned - No security-related keywords in commit message - No mention of exploitability - Appears to be a functional correctness issue
**Potential Impact:** While not a security vulnerability, invalid PTEs could potentially cause: - GPU memory access failures - Unexpected GPU behavior - Possible GPU hangs or resets - Application crashes when accessing GPU memory
However, the commit message doesn't mention crashes, only that "functionality has to be supported."
### 4. FEATURE VS BUG FIX CLASSIFICATION
**Clearly a bug fix:** - Subject starts with "Fix" - Describes incorrect behavior (invalid PTE mappings) - Corrects logic error in mapping update decision - Not adding new functionality
**Not a feature addition:** - No new APIs or exports - No new user-visible functionality - No new hardware support - Simply fixing existing prefetch functionality to work correctly on APUs
### 5. CODE CHANGE SCOPE ASSESSMENT
**Very small and surgical:** - **1 file modified**: `drivers/gpu/drm/amd/amdkfd/kfd_svm.c` - **2 lines added**: One line of logic + one blank line - **No lines removed** - **No function signature changes** - **No API changes**
**Change type:** Simple boolean logic addition to an existing flag. The fix is a one- liner that forces a flag to be true under specific conditions.
**Complexity:** Very low complexity. The logic is straightforward: if XNACK is disabled AND there are remapped ranges, force an update.
### 6. BUG TYPE AND SEVERITY
**Bug Type:** - **Correctness issue**: Invalid GPU page table entries after range splitting - **Logic error**: Missing condition in update decision logic
**Severity Assessment:**
**User-Visible Impact:** - The commit states "functionality has to be supported" but doesn't describe crashes or data corruption - Prefetch operations on APUs would leave invalid PTEs - Likely causes GPU memory access failures or incorrect behavior
**Severity Level: MEDIUM** - Not a crash or data corruption issue (would be HIGH/CRITICAL) - Not a minor cosmetic issue (would be LOW) - Functional correctness issue affecting specific hardware (APUs with SVM) - Affects a specific code path (prefetch with range splitting)
### 7. USER IMPACT EVALUATION
**Who is affected?** - **AMD APU users** (specifically MI300A mentioned) - **Using KFD (Kernel Fusion Driver)** for compute workloads - **Using SVM (Shared Virtual Memory)** features - **Making prefetch calls** that trigger range splitting - **With XNACK disabled** (typical for APUs)
**How common is this use case?** - **Relatively niche**: Requires AMD APUs with ROCm/HIP compute workloads - **MI300A is server-grade hardware**: Not consumer hardware - **SVM is advanced feature**: Not used by typical graphics workloads - **Prefetch is optimization feature**: Not always used
**Impact scope:** - Limited to specific AMD hardware and software stack - Affects compute/HPC workloads, not gaming or typical desktop use - May affect ROCm users on MI300A APUs
### 8. REGRESSION RISK ANALYSIS
**Risk of regression: VERY LOW**
**Why low risk:** 1. **Tiny change**: Only 2 lines added 2. **Simple logic**: Just setting a flag under specific conditions 3. **Well-scoped**: Only affects the update_mapping decision 4. **Well-reviewed**: Multiple reviewers from AMD 5. **Went through 4 revisions**: Carefully considered 6. **Already in mainline**: Been in v6.18-rc6 without reported issues 7. **Already backported**: Already in v6.17.8 stable
**What could go wrong:** - Could cause unnecessary page table updates in some cases - Might slightly impact performance if updates happen when not strictly needed - But the commit message says this doesn't affect dGPUs (discrete GPUs)
**Testing:** - Has Reviewed-by tags from AMD maintainers (Philip Yang, Felix Kuehling) - Went through internal review (v2, v3, v4) - Already cherry-picked to internal tree before mainline
### 9. MAINLINE STABILITY
**Commit dates:** - **AuthorDate**: October 28, 2025 (very recent) - **CommitDate**: November 11, 2025 (very recent) - **First appeared in**: v6.18-rc6 - **Time in mainline**: Less than 2 weeks at analysis time
**Maturity: LOW** This is a **very recent commit**. Normally, we prefer commits that have been in mainline for several weeks or months to ensure they're stable and don't introduce regressions.
**However:** - The fix is extremely simple (2 lines) - It's already been backported to v6.17.8 stable (by Sasha Levin) - No follow-up fixes or reverts found
### 10. DEPENDENCY ANALYSIS
**Critical Dependency: remap_list functionality**
The fix depends on the `remap_list` functionality introduced in commit **7ef6b2d4b7e5c** (October 17, 2023).
**Kernel version compatibility:** - `remap_list` first appeared in **v6.7-rc1** (November 2023) - This fix is **only applicable to kernels v6.7 and later** - **Does NOT apply to v6.6.y and earlier stable trees**
**Verification:** I confirmed that: - v6.7 and later have the remap_list functionality - The fix applies cleanly (already backported to v6.17.8) - No other dependencies identified
### 11. APPLICABILITY TO STABLE TREES
**Which stable trees should receive this fix?**
**Applicable to:** - ✅ **v6.7.y** and later stable trees (has remap_list dependency) - ✅ **v6.8.y, v6.9.y, v6.10.y, v6.11.y, v6.12.y** etc. - ✅ Already backported to **v6.17.8**
**NOT applicable to:** - ❌ **v6.6.y** and earlier (missing remap_list functionality) - ❌ **v5.x** series (missing entire SVM infrastructure changes)
### 12. STABLE KERNEL RULES COMPLIANCE
**Evaluating against stable kernel rules:**
✅ **1. Obviously correct and tested** - Fix is simple and clear (2 lines) - Logic is straightforward - Well-reviewed by AMD maintainers - Already in mainline and backported to v6.17
✅ **2. Fixes a real bug** - Yes, fixes invalid GPU page table entries - Real-world scenario described (prefetch on APU) - Affects actual hardware (MI300A APU)
✅ **3. Fixes an important issue** - **Medium importance**: Not a crash/corruption, but functional correctness - Affects GPU memory access correctness - Could cause GPU hangs or application failures
✅ **4. Small and contained** - Only 2 lines changed - Single file modified - No API changes - Very contained scope
✅ **5. No new features or APIs** - Does not add new functionality - Fixes existing prefetch functionality - No new exports or APIs
✅ **6. Applies cleanly to stable** - Already backported to v6.17.8 - Only applies to v6.7+ (has dependencies) - Clean application (no conflicts)
### CONCERNS AND CAVEATS
**1. Very recent commit (< 1 month old)** - Normally we prefer commits that have been tested in mainline for longer - However, the simplicity of the fix reduces this concern
**2. No explicit stable tag** - Commit lacks "Cc: stable@vger.kernel.org" - Maintainer didn't explicitly request stable backport - However, it's already been backported to v6.17.8 by Sasha Levin
**3. Limited user base** - Only affects AMD APU users with specific workloads - MI300A is server/HPC hardware, not widespread - SVM + prefetch is relatively niche use case
**4. Severity not critical** - No indication of crashes or data corruption - Described as functional issue, not safety issue - "functionality has to be supported" suggests compliance/correctness more than critical bug
**5. Version compatibility** - Only applies to v6.7 and later - Earlier stable trees cannot use this fix due to missing remap_list
### FINAL ASSESSMENT
**Risk vs. Benefit Analysis:**
**Benefits of backporting:** - ✅ Fixes incorrect GPU page table mappings on AMD APUs - ✅ Enables correct prefetch functionality on MI300A and similar hardware - ✅ Very low risk (2-line change, simple logic) - ✅ Well-reviewed and already tested in multiple trees - ✅ Already successfully backported to v6.17.8 - ✅ Helps users running ROCm/HIP workloads on APUs
**Risks of backporting:** - ⚠️ Very recent (< 1 month in mainline) - minimal testing time - ⚠️ Limited user base - affects niche use case - ⚠️ No explicit stable tag from maintainer - ⚠️ Could cause minor performance impact if updates occur unnecessarily
**Risks of NOT backporting:** - Users on APUs with SVM+prefetch will have invalid GPU PTEs - May cause GPU memory access failures - May cause application crashes or unexpected behavior - ROCm/HIP workloads on MI300A APUs may not function correctly
### CONCLUSION
This commit **should be backported** to stable kernel trees v6.7 and later based on the following:
1. **Fixes a real correctness bug**: Invalid GPU page table entries after range splitting is a genuine functional issue that affects real hardware and real users.
2. **Extremely low risk**: The fix is surgical (2 lines), simple logic, and has already been successfully backported to v6.17.8 without issues. Multiple AMD maintainers reviewed it through 4 revisions.
3. **Meets stable criteria**: The commit satisfies all the key stable kernel rules - it's obviously correct, fixes a real bug, is small and contained, doesn't add features, and applies cleanly.
4. **Already accepted by stable maintainers**: The fact that Sasha Levin already backported this to v6.17.8 indicates stable maintainers consider it appropriate.
5. **Important for affected users**: While the user base is limited (AMD APU + SVM + prefetch users), this is important for ROCm/HIP compute workloads on MI300A-class hardware. These are server/HPC environments where correctness is critical.
6. **No known issues**: No follow-up fixes or reverts found, suggesting the fix works as intended.
**Caveats:** - Only applicable to v6.7+ (requires remap_list functionality) - Very recent commit, but the simplicity mitigates concerns - Limited to specific AMD hardware and use cases
The benefits clearly outweigh the minimal risks, and this aligns with stable kernel principles of providing important fixes to users who need them.
**YES**
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 827507cfed7aa..fab6e7721c803 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -3688,6 +3688,8 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm, svm_range_apply_attrs(p, prange, nattr, attrs, &update_mapping); /* TODO: unmap ranges from GPU that lost access */ } + update_mapping |= !p->xnack_enabled && !list_empty(&remap_list); + list_for_each_entry_safe(prange, next, &remove_list, update_list) { pr_debug("unlink old 0x%p prange 0x%p [0x%lx 0x%lx]\n", prange->svms, prange, prange->start,