Am 21.03.2018 um 09:28 schrieb Daniel Vetter:
On Tue, Mar 20, 2018 at 06:47:57PM +0100, Christian König wrote:
Am 20.03.2018 um 15:08 schrieb Daniel Vetter:
[SNIP] For the in-driver reservation path (CS) having a slow-path that grabs a temporary reference, drops the vram lock and then locks the reservation normally (using the acquire context used already for the entire CS) is a bit tricky, but totally feasible. Ttm doesn't do that though.
That is exactly what we do in amdgpu as well, it's just not very efficient nor reliable to retry getting the right pages for a submission over and over again.
Out of curiosity, where's that code? I did read the ttm eviction code way back, and that one definitely didn't do that. Would be interesting to update my understanding.
That is in amdgpu_cs.c. amdgpu_cs_parser_bos() does a horrible dance with grabbing, releasing and regrabbing locks in a loop.
Then in amdgpu_cs_submit() we grab an lock preventing page table updates and check if all pages are still the one we want to have:
amdgpu_mn_lock(p->mn); if (p->bo_list) { for (i = p->bo_list->first_userptr; i < p->bo_list->num_entries; ++i) { struct amdgpu_bo *bo = p->bo_list->array[i].robj;
if (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm)) { amdgpu_mn_unlock(p->mn); return -ERESTARTSYS; } } }
If anything changed on the page tables we restart the whole IOCTL using -ERESTARTSYS and try again.
Regards, Christian.
-Daniel