On Wed, 2025-11-26 at 16:03 +0100, Christian König wrote:
On 11/26/25 13:37, Philipp Stanner wrote:
On Wed, 2025-11-26 at 13:31 +0100, Christian König wrote:
[…]
Well, the question is: how do you *reliably* detect that there is still forward progress?
My understanding is that that's impossible, since the internals of command submissions are only really understood by the userspace that submits them.
Right, but we can still try to do our best in the kernel to mitigate the situation.
I think for now amdgpu will implement something like checking whether the HW still makes progress after a timeout, but with only a limited number of re-tries until we say that's it and reset anyway.
Oh oh, isn't that our dear hang_limit? :)
We agree that you can never really know whether userspace just submitted a while(true) job, don't we? Even if some GPU register still indicates "progress".
I think the long-term solution can only be fully fledged GPU scheduling with preemption. That's why we don't need such a timeout mechanism for userspace processes on the CPU: the scheduler simply interrupts them and lets someone else run.
Yeah absolutely.
My hope would be that in the mid-term future we'd get firmware rings that can be preempted through a firmware call for all major hardware. Then a huge share of our problems would disappear.
At least on AMD HW, pre-emption is actually horribly unreliable as well.
Do you mean new GPUs with firmware scheduling, or what is "HW pre-emption"?
With firmware interfaces, my hope would be that you could simply tell the firmware:

stop_running_ring(nr_of_ring)   // time slice for someone else
start_running_ring(nr_of_ring)
Thereby getting real scheduling and all that. And eliminating many other problems we know well from drm/sched.
Userspace basically needs to co-operate and provide a buffer into which the state is saved on a pre-emption.
That's uncool. With CPU preemption, all that is done automatically via the process's pages.
P.