Am 26.03.2018 um 17:42 schrieb Jerome Glisse:
On Mon, Mar 26, 2018 at 10:01:21AM +0200, Daniel Vetter wrote:
On Thu, Mar 22, 2018 at 10:58:55AM +0100, Christian König wrote:
Am 22.03.2018 um 08:18 schrieb Daniel Vetter: [SNIP] Key take away from that was that you can't take any locks from neither the MMU notifier nor the shrinker you also take while calling kmalloc (or simpler speaking get_user_pages()).
Additional to that in the MMU or shrinker callback all different kinds of locks might be held, so you basically can't assume that you do thinks like recursive page table walks or call dma_unmap_anything.
That sounds like a design bug in mmu_notifiers, since it would render them useless for KVM. And they were developed for that originally. I think I'll chat with Jerome to understand this, since it's all rather confusing.
Doing dma_unmap() during mmu_notifier callback should be fine, it was last time i check. However there is no formal contract that it is ok to do so.
As I said before dma_unmap() isn't the real problem here.
The issues is more that you can't take a lock in the MMU notifier which you would also take while allocating memory without GFP_NOIO.
That makes it rather tricky to do any command submission, e.g. you need to grab all the pages/memory/resources prehand, then make sure that you don't have a MMU notifier running concurrently and do the submission.
If any of the prerequisites isn't fulfilled we need to restart the operation.
[SNIP] A slightly better solution is using atomic counter: driver_range_start() { atomic_inc(&mydev->notifier_count);
...
Yeah, that is exactly what amdgpu is doing now. Sorry if my description didn't made that clear.
I would like to see driver using same code, as it means one place to fix issues. I had for a long time on my TODO list doing the above conversion to amd or radeon kernel driver. I am pushing up my todo list hopefully in next few weeks i can send an rfc so people can have a real sense of how it can look.
Certainly a good idea, but I think we might have that separate to HMM.
TTM suffered really from feature overload, e.g. trying to do everything in a single subsystem. And it would be rather nice to have coherent userptr handling for GPUs as separate feature.
Regards, Christian.