This reverts commit 71598a5a7797f0052aaa7bcff0b8d4b8f20f1441.
This commit introduced a regression, however the fix for the regression: aa5fc4362fac ("drm/amdgpu: fix task hang from failed job submission during process kill") depends on things not yet present in 6.12.y and older kernels. Since this commit is more of an optimization, just revert it for 6.12.y and older stable kernels.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x - 6.12.x
---
Please apply this revert to 6.1.x to 6.12.x stable trees. The newer stable trees and Linus' tree already have the regression fix.
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0adb106e2c42..37d53578825b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2292,11 +2292,13 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
  */
 long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)
 {
-	timeout = drm_sched_entity_flush(&vm->immediate, timeout);
+	timeout = dma_resv_wait_timeout(vm->root.bo->tbo.base.resv,
+					DMA_RESV_USAGE_BOOKKEEP,
+					true, timeout);
 	if (timeout <= 0)
 		return timeout;

-	return drm_sched_entity_flush(&vm->delayed, timeout);
+	return dma_fence_wait_timeout(vm->last_unlocked, true, timeout);
 }

 static void amdgpu_vm_destroy_task_info(struct kref *kref)
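
For readers following the hunk above: both the removed and the restored code use the same "chained timeout" shape, where the first wait returns whatever budget is left and the second wait only gets that remainder. Below is a minimal userspace sketch of that shape, not the driver code. fake_wait() and wait_both() are hypothetical names invented for illustration. The stub only assumes the usual kernel convention that a wait returns the remaining timeout (> 0) on success, 0 on timeout, and a negative value on error.

```c
/*
 * Hypothetical stand-in for a kernel wait primitive (not a real API).
 * "cost" is how many ticks the wait would take. Returns the remaining
 * budget (> 0) on success and 0 if the budget ran out first. Errors
 * (negative returns) are not modelled by this stub.
 */
static long fake_wait(long cost, long timeout)
{
	if (cost >= timeout)
		return 0;
	return timeout - cost;
}

/*
 * Same control flow as amdgpu_vm_wait_idle() in the hunk: wait once,
 * bail out on timeout or error, then spend only the remaining budget
 * on the second wait.
 */
static long wait_both(long cost_a, long cost_b, long timeout)
{
	timeout = fake_wait(cost_a, timeout);
	if (timeout <= 0)
		return timeout;

	return fake_wait(cost_b, timeout);
}
```

With a budget of 10 ticks, wait_both(3, 4, 10) returns 3 (both waits fit), while wait_both(3, 8, 10) and wait_both(12, 1, 10) both return 0 because the budget runs out in the second and first wait respectively.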
On Fri, Aug 29, 2025 at 03:36:52PM -0400, Alex Deucher wrote:
> This reverts commit 71598a5a7797f0052aaa7bcff0b8d4b8f20f1441.
> This commit introduced a regression, however the fix for the regression: aa5fc4362fac ("drm/amdgpu: fix task hang from failed job submission during process kill") depends on things not yet present in 6.12.y and older kernels. Since this commit is more of an optimization, just revert it for 6.12.y and older stable kernels.
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> Cc: stable@vger.kernel.org # 6.1.x - 6.12.x
> Please apply this revert to 6.1.x to 6.12.x stable trees. The newer stable trees and Linus' tree already have the regression fix.
What is the commit id in Linus's tree for this fix? Why can't we just take that one instead?
thanks,
greg k-h
[AMD Official Use Only - AMD Internal Distribution Only]
-----Original Message-----
From: Greg KH <gregkh@linuxfoundation.org>
Sent: Saturday, August 30, 2025 2:19 AM
To: Deucher, Alexander <Alexander.Deucher@amd.com>
Cc: stable@vger.kernel.org; sashal@kernel.org
Subject: Re: [PATCH] Revert "drm/amdgpu: Avoid extra evict-restore process."
> On Fri, Aug 29, 2025 at 03:36:52PM -0400, Alex Deucher wrote:
> > This reverts commit 71598a5a7797f0052aaa7bcff0b8d4b8f20f1441.
> > This commit introduced a regression, however the fix for the regression: aa5fc4362fac ("drm/amdgpu: fix task hang from failed job submission during process kill") depends on things not yet present in 6.12.y and older kernels. Since this commit is more of an optimization, just revert it for 6.12.y and older stable kernels.
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > Cc: stable@vger.kernel.org # 6.1.x - 6.12.x
> > Please apply this revert to 6.1.x to 6.12.x stable trees. The newer stable trees and Linus' tree already have the regression fix.
> What is the commit id in Linus's tree for this fix? Why can't we just take that one instead?
The fix from Linus' tree is:
aa5fc4362fac ("drm/amdgpu: fix task hang from failed job submission during process kill")
However, as I said above, this fix depends on changes that are not yet in 6.12.x and older. So for 6.12.x and older trees we should just revert the original patch.
Alex