From: Xi Ruoyao xry111@xry111.site
[ Upstream commit c97a7dccb3ed680031011cfc1457506e6de49c9a ]
dml21_map_dc_state_into_dml_display_cfg calls (the call is usually inlined by the compiler) populate_dml21_surface_config_from_plane_state and populate_dml21_plane_config_from_plane_state which may use FPU. In an x86-64 build:
  $ objdump --disassemble=dml21_map_dc_state_into_dml_display_cfg \
  > drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.o |
  > grep %xmm -c
  63
Thus it needs to be guarded with DC_FP_START. But we must note that the current code quality of the in-kernel FPU use in AMD dml2 is very much problematic: we are actually calling DC_FP_START in dml21_wrapper.c here, and this translation unit is built with CC_FLAGS_FPU. Strictly speaking this does not make any sense: with CC_FLAGS_FPU the compiler is allowed to generate FPU uses anywhere in the translated code, perhaps out of the DC_FP_START guard. This problematic pattern also occurs in at least dml2_wrapper.c, dcn35_fpu.c, and dcn351_fpu.c. Thus we really need a careful audit and refactor for the in-kernel FPU uses, and this patch is simply whacking a mole. However per the reporter, whacking this mole is enough to make a 9060XT "just work."
Reported-by: Asiacn 710187964@qq.com
Closes: https://github.com/loongson-community/discussions/issues/102
Tested-by: Asiacn 710187964@qq.com
Signed-off-by: Xi Ruoyao xry111@xry111.site
Reviewed-by: Huacai Chen chenhuacai@loongson.cn
Reviewed-by: Alex Hung alex.hung@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my comprehensive analysis of this commit, I can now provide my backporting recommendation.
## **BACKPORT DECISION: YES**
This commit **SHOULD be backported** to stable kernel trees.
---
## **DETAILED ANALYSIS AND JUSTIFICATION**
### **1. NATURE OF THE BUG**
The commit fixes **unprotected FPU (Floating Point Unit) usage** in the AMD display driver. The function `dml21_map_dc_state_into_dml_display_cfg()` contains floating-point operations but was being called without proper kernel FPU protection guards.
**Evidence from the code:**
- The commit message shows `objdump` analysis in which `grep -c` counts **63 disassembly lines referencing `%xmm` registers** (SSE/FPU instructions) in the compiled function
- In the source code at `drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.c:779-987`, I found the actual FPU operations:

```c
surface->dcc.informative.dcc_rate_plane0 = 1.0; // Line 791
surface->dcc.informative.dcc_rate_plane1 = 1.0; // Line 792

plane->composition.scaler_info.plane0.h_ratio = (double)scaler_data->ratios.horz.value / (1ULL << 32); // Line 903
plane->composition.scaler_info.plane0.v_ratio = (double)scaler_data->ratios.vert.value / (1ULL << 32); // Line 904
plane->composition.scaler_info.plane1.h_ratio = (double)scaler_data->ratios.horz_c.value / (1ULL << 32); // Line 905
plane->composition.scaler_info.plane1.v_ratio = (double)scaler_data->ratios.vert_c.value / (1ULL << 32); // Line 906
```
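To see why these assignments need the FPU, the ratio conversion can be reproduced in userspace. The following is a minimal sketch, assuming the `value` fields are 64-bit integers holding a fixed-point number with 32 fractional bits (which the `1ULL << 32` divisor suggests). `fixed_to_double` is a hypothetical helper name used only for illustration; it is not a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the driver): interpret `value` as a
 * fixed-point number with 32 fractional bits and convert it to double,
 * using the same expression shape as the quoted driver code.
 */
static double fixed_to_double(int64_t value)
{
	/*
	 * Both the integer-to-double cast and the division are
	 * floating-point operations, so the compiler has to emit FPU/SIMD
	 * instructions for this line (on x86-64, typically ones that
	 * operate on %xmm registers).
	 */
	return (double)value / (1ULL << 32);
}
```

With this helper, a raw value of `1 << 32` maps to `1.0` and `3 << 31` maps to `1.5`. In kernel context the same expression is only safe inside a region where FPU use has been explicitly enabled.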
### **2. ROOT CAUSE AND REGRESSION TIMELINE**
Through my investigation, I discovered this is a **regression fix**:
- **v6.15**: Commit `366e77cd4923c` ("Protect FPU in dml2_validate()/dml21_validate()") added DC_FP_START/END around the entire `dml21_validate()` function - **WORKING**
  - This commit had `Cc: stable@vger.kernel.org` tag
  - It fixed "do_fpu invoked from kernel context" crashes on LoongArch

- **v6.16**: Commit `fe3250f10819b` ("Call FP Protect Before Mode Programming/Mode Support") refactored the code and moved FP protection to individual calls
  - It protected `dml2_build_mode_programming()` and `dml2_check_mode_supported()`
  - **BUT IT MISSED `dml21_map_dc_state_into_dml_display_cfg()`** - **BROKEN**

- **v6.18-rc1**: Current commit `c97a7dccb3ed6` adds the missing protection - **FIXED**
**Affected kernel versions:** v6.16 and v6.17 (including all stable releases) have the regression.
### **3. THE FIX**
The fix is **minimal and surgical**:
```diff
@@ -224,7 +224,9 @@ static bool dml21_mode_check_and_programming(...)
     /* Populate stream, plane mappings and other fields in display config. */
+    DC_FP_START();
     result = dml21_map_dc_state_into_dml_display_cfg(in_dc, context, dml_ctx);
+    DC_FP_END();
     if (!result)
         return false;

@@ -279,7 +281,9 @@ static bool dml21_check_mode_support(...)
     mode_support->dml2_instance = dml_init->dml2_instance;
+    DC_FP_START();
     dml21_map_dc_state_into_dml_display_cfg(in_dc, context, dml_ctx);
+    DC_FP_END();
```
**Total change: 4 lines added** (2 × DC_FP_START, 2 × DC_FP_END)
The `DC_FP_START()` and `DC_FP_END()` macros call `kernel_fpu_begin()` and `kernel_fpu_end()` which:
1. Disable preemption
2. Save current FPU state
3. Allow safe FPU usage in kernel context
4. Restore FPU state afterward
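The guard pattern described above can be sketched in userspace. This is a simplified model under stated assumptions, not the driver's implementation: `mock_fpu_begin()` and `mock_fpu_end()` are hypothetical stand-ins that only record state, the nesting counter is an assumption of this sketch, and `scale_ratio()` and `map_config()` are illustrative names rather than kernel functions.

```c
#include <assert.h>

/* Mock state: in the kernel this would be real FPU context handling. */
static int fpu_enabled; /* 1 while a guarded section is open */
static int fpu_depth;   /* nesting depth of begin/end calls (assumed) */

/* Hypothetical stand-in for the "begin" side of the guard. */
static void mock_fpu_begin(void)
{
	if (fpu_depth++ == 0)
		fpu_enabled = 1;
}

/* Hypothetical stand-in for the "end" side of the guard. */
static void mock_fpu_end(void)
{
	assert(fpu_depth > 0); /* an end without a matching begin is a bug */
	if (--fpu_depth == 0)
		fpu_enabled = 0;
}

/* Floating-point helper that must only run inside a guarded section. */
static void scale_ratio(long long value, double *out)
{
	assert(fpu_enabled); /* models the failure seen without the guard */
	*out = (double)value / 4294967296.0; /* divide by 2^32 */
}

/* Caller shaped like the patched code: guard, FP work, unguard. */
static int map_config(long long value, double *out)
{
	mock_fpu_begin();
	scale_ratio(value, out);
	mock_fpu_end();
	return 1;
}
```

Calling `scale_ratio()` directly, without the surrounding begin/end pair, trips the assertion; that corresponds to the unguarded call this patch fixes. The caller itself performs no floating-point arithmetic and returns an integer status, which is the property the commit message notes cannot be guaranteed when the wrapper's own translation unit is built with CC_FLAGS_FPU.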
### **4. USER-VISIBLE IMPACT**
**Severity:** Hardware doesn't work or kernel crashes
**Affected users:**
- AMD Radeon GPU users on LoongArch systems (confirmed: makes 9060XT "just work")
- Potentially affects other architectures with strict FPU handling

**Evidence:**
- `Reported-by: Asiacn 710187964@qq.com`
- `Closes: https://github.com/loongson-community/discussions/issues/102`
- `Tested-by: Asiacn 710187964@qq.com` - Confirms it works
- Similar to commit 366e77cd4923c which showed kernel crashes with stack traces
### **5. BACKPORTING CRITERIA ASSESSMENT**
| Criterion | Status | Evidence |
|-----------|--------|----------|
| **Fixes important bug** | ✅ YES | Kernel crashes, hardware not working |
| **Small and contained** | ✅ YES | Only 4 lines, 1 file changed |
| **Minimal regression risk** | ✅ YES | Only adds protection guards around existing code |
| **Confined to subsystem** | ✅ YES | AMD display driver only |
| **Tested** | ✅ YES | Has Tested-by tag |
| **Reviewed** | ✅ YES | Reviewed by Loongson and AMD engineers |
| **Clear root cause** | ✅ YES | Detailed commit message with objdump evidence |
### **6. CODE QUALITY OBSERVATIONS**
The commit message honestly acknowledges broader architectural issues:
"the current code quality of the in-kernel FPU use in AMD dml2 is very
much problematic... this patch is simply whacking a mole"
However, it also states:
"whacking this mole is enough to make a 9060XT 'just work.'"
This pragmatic fix is **necessary and correct** even if larger refactoring is needed long-term.
### **7. MISSING STABLE TAGS (Should Have)**
- ❌ No `Fixes:` tag (should be: `Fixes: fe3250f10819b`)
- ❌ No `Cc: stable@vger.kernel.org`

**This appears to be an oversight**, not a deliberate exclusion, because:
- The earlier related commit 366e77cd4923c had `Cc: stable@vger.kernel.org`
- This is part of the same ongoing FPU protection effort
- It fixes a clear regression with user-visible impact
### **8. RISK ANALYSIS**
**Risk of backporting:** **VERY LOW**
- Change is minimal (only adds guards)
- Guards are well-established pattern used throughout the codebase
- No logic changes, no new features
- Matches pattern of already-backported commit 366e77cd4923c

**Risk of NOT backporting:** **HIGH**
- Users with AMD GPUs on LoongArch cannot use their hardware
- Potential kernel crashes and FPU state corruption
- v6.16 and v6.17 remain broken
---
## **CONCLUSION**
This is a **clear-cut backport candidate** that fixes a regression introduced in v6.16, has been tested, is minimal in scope, and has very low risk. The lack of stable tags appears to be an oversight rather than intentional exclusion.
**Recommended for backport to:**
- v6.17.x stable (actively maintained)
- v6.16.x stable (if still maintained)
 drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c
index 03de3cf06ae59..059ede6ff2561 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c
@@ -224,7 +224,9 @@ static bool dml21_mode_check_and_programming(const struct dc *in_dc, struct dc_s
 		dml_ctx->config.svp_pstate.callbacks.release_phantom_streams_and_planes(in_dc, context);
 
 	/* Populate stream, plane mappings and other fields in display config. */
+	DC_FP_START();
 	result = dml21_map_dc_state_into_dml_display_cfg(in_dc, context, dml_ctx);
+	DC_FP_END();
 	if (!result)
 		return false;
 
@@ -279,7 +281,9 @@ static bool dml21_check_mode_support(const struct dc *in_dc, struct dc_state *co
 		dml_ctx->config.svp_pstate.callbacks.release_phantom_streams_and_planes(in_dc, context);
 
 	mode_support->dml2_instance = dml_init->dml2_instance;
+	DC_FP_START();
 	dml21_map_dc_state_into_dml_display_cfg(in_dc, context, dml_ctx);
+	DC_FP_END();
 	dml_ctx->v21.mode_programming.dml2_instance->scratch.build_mode_programming_locals.mode_programming_params.programming = dml_ctx->v21.mode_programming.programming;
 	DC_FP_START();
 	is_supported = dml2_check_mode_supported(mode_support);