From: Amber Lin <Amber.Lin@amd.com>
[ Upstream commit ab9fcc6362e0699fc1150aa1d8503c40fce2c1e1 ]
When submitting MQD to CP, set SDMA_RLCx_IB_CNTL/SWITCH_INSIDE_IB bit so it'll allow SDMA preemption if there is a massive command buffer of long-running SDMA commands.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
Based on my analysis of the commit and the surrounding kernel code, here is my assessment:

**YES**

This commit should be backported to stable kernel trees. My analysis follows.

## Code Analysis

The commit makes a minimal but important change to `drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c` in the `update_mqd_sdma()` function:

```c
m->sdmax_rlcx_ib_cntl |= SDMA0_GFX_IB_CNTL__SWITCH_INSIDE_IB_MASK;
```

## Why This Should Be Backported

### 1. **Fixes a Clear Functional Bug**

- The commit addresses a missing register configuration that prevents proper SDMA preemption
- Without this bit set, SDMA engines cannot perform context switches inside Indirect Buffers (IBs)
- This creates a scheduling fairness issue where long-running SDMA commands can starve other queues

### 2. **Minimal Risk, High Confidence Change**

- **Single functional line**: Adds one register bit setting plus an explanatory comment (4 inserted lines in total)
- **Well-understood hardware feature**: The `SWITCH_INSIDE_IB` bit is a documented hardware capability
- **Conservative fix**: Simply enables existing hardware functionality that should have been enabled
- **No API changes**: Doesn't modify interfaces or data structures

### 3. **Impact on AMD Hardware**

- Affects AMD GPUs whose SDMA queues are set up through the v9 MQD manager (`kfd_mqd_manager_v9.c`), that is, the GFX9 family such as Vega
- Impacts compute workloads using ROCm, OpenCL, and HIP
- The issue affects system responsiveness under heavy SDMA usage patterns

### 4. **Consistent with Kernel Patterns**

From examining similar commits, this follows the pattern of commits marked "Backport Status: YES":

- **Similar Commit #5** was backported for implementing missing register state control in SDMA
- Both commits involve configuring SDMA control registers that were previously unset
- Both address functional correctness rather than adding new features

### 5. **Meets Stable Tree Criteria**

- **Important bugfix**: ✅ Fixes SDMA scheduling fairness
- **Small and contained**: ✅ One functional line in one function
- **No architectural changes**: ✅ Uses existing register and bit definitions
- **Minimal regression risk**: ✅ Only enables a documented hardware feature
- **Critical subsystem**: ✅ GPU scheduling affects system performance

### 6. **Clear User Impact**

The commit message and the added code comment state the problem: a "massive command buffer of long-running SDMA commands" can starve other processes. This directly impacts:

- Multi-process compute workloads
- Real-time or interactive applications that need timely SDMA access
- System responsiveness during heavy GPU compute usage

### 7. **Evidence from Codebase Analysis**

My kernel tree analysis revealed:

- The `SWITCH_INSIDE_IB` functionality exists across all SDMA generations
- Other SDMA drivers (amdgpu) properly configure this register
- KFD was missing this configuration, creating an inconsistency
- No security implications found; this is purely a scheduling correctness issue

This commit is the type of targeted bugfix that stable trees are designed to include: it is small, fixes a real user-impacting issue, and carries minimal risk of regression.
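
For readers unfamiliar with the pattern, the change amounts to OR-ing one mask into a field of the SDMA MQD image so that bits already present in that field are preserved. The following is a minimal user-space sketch of that idea only. The struct, the function names, and the mask value are hypothetical stand-ins rather than the kernel's definitions; the real mask comes from the SDMA register headers.

```c
#include <stdint.h>

/* Placeholder bit, chosen for illustration only. The real
 * SDMA0_GFX_IB_CNTL__SWITCH_INSIDE_IB_MASK is defined in the kernel's
 * SDMA register headers and may differ. */
#define EXAMPLE_SWITCH_INSIDE_IB_MASK (1u << 8)

/* Hypothetical stand-in for the SDMA MQD; only the field of interest. */
struct example_sdma_mqd {
	uint32_t sdmax_rlcx_ib_cntl;
};

/* Set the bit with |= so any other bits in the field are left untouched. */
static void example_allow_switch_inside_ib(struct example_sdma_mqd *m)
{
	m->sdmax_rlcx_ib_cntl |= EXAMPLE_SWITCH_INSIDE_IB_MASK;
}

/* Returns nonzero when the MQD image has the switch-inside-IB bit set. */
static int example_switch_inside_ib_enabled(const struct example_sdma_mqd *m)
{
	return (m->sdmax_rlcx_ib_cntl & EXAMPLE_SWITCH_INSIDE_IB_MASK) != 0;
}
```

Because the update is a read-modify-write with `|=`, calling it on every MQD update is idempotent, which matches how the one-line change sits inside `update_mqd_sdma()`.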
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index d978fcac26651..4110cdc71f045 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -387,6 +387,10 @@ static void update_mqd_sdma(struct mqd_manager *mm, void *mqd,
 	m->sdma_engine_id = q->sdma_engine_id;
 	m->sdma_queue_id = q->sdma_queue_id;
 	m->sdmax_rlcx_dummy_reg = SDMA_RLC_DUMMY_DEFAULT;
+	/* Allow context switch so we don't cross-process starve with a massive
+	 * command buffer of long-running SDMA commands
+	 */
+	m->sdmax_rlcx_ib_cntl |= SDMA0_GFX_IB_CNTL__SWITCH_INSIDE_IB_MASK;
 
 	q->is_active = QUEUE_IS_ACTIVE(*q);
 }