Quoting Tvrtko Ursulin (2019-09-30 11:33:22)
On 28/09/2019 09:25, Chris Wilson wrote:
Daniel Vetter uncovered a nasty cycle in using the mmu-notifiers to invalidate userptr objects which also happen to be pulled into GGTT mmaps. That is, when we unbind the userptr object (on mmu invalidation), we revoke all CPU mmaps, which may then recurse into mmu invalidation.
On the same object? Invalidate on userptr built from some mmap_gtt area, or standard userptr object mmapped via gtt? Or one userptr object built from a mmap_gtt of another userptr object?
Yup, think of the multiple partial mmaps we have on the same object. If we invalidate the mmap_offset, we may hit the same object again in mmu-invalidate. As far as my understanding goes, there is nothing in the munmap/invalidate that prevents this. Although, having the same pages mapped into different processes is not unusual, so you would think we are not alone in having device pages in multiple mappings? There might be something more at play here, but Daniel's lockdep patch is straightforward: no recursion allowed in mmu-invalidate.
Will anything change here after the struct mutex removal series? AFAIR you remove struct mutex from userptr invalidation there.
No. This, aiui, is purely an issue where we trigger an mmu-invalidate from inside the mmu_invalidate_range_start.
[snip]
I remember in the distant past we discussed whether or not to allow this. It is indeed a quite perverse setup so I am okay with disallowing it.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
P.S. I expect there will be some IGTs to be adjusted as well.
Yup. This was to start the ball rolling as come rc1 we will have some fire-fighting to do.
-Chris