On 19/02/2018 18:35, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-02-19 18:31:31)
On 19/02/2018 14:01, Chris Wilson wrote:
If we fail to unbind the vma (due to a signal on an active buffer that needs to be moved for the next execbuf), then we need to clear the persistent tracking state we set up for this execbuf.
Fixes: c7c6e46f913b ("drm/i915: Convert execbuf to use struct-of-array packing for critical fields")
Testcase: igt/gem_fenced_exec_thrash/no-spare-fences-busy*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org # v4.14+
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 51f3c32c64bf..4eb28e84fda4 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -505,6 +505,8 @@ eb_add_vma(struct i915_execbuffer *eb, unsigned int i, struct i915_vma *vma)
 		list_add_tail(&vma->exec_link, &eb->unbound);
 		if (drm_mm_node_allocated(&vma->node))
 			err = i915_vma_unbind(vma);
+		if (unlikely(err))
+			vma->exec_flags = NULL;
 	}
 	return err;
 }
I was trying to track down what actually explodes for like 15 minutes.
My track was:
eb_relocate -> eb_lookup_vmas fails -> eb_relocate -> eb_relocate_slow -> eb_reset_vmas -> second pass to eb_lookup_vmas -> resets vma->exec_flags. So no explosion.
So in other words I've failed to find what goes wrong and under which circumstances.
The first eb_relocate calls eb_lookup_vmas, which triggers the failure and the exit from execbuf. In that path, we mark the current index as the sentinel (err_vma: eb->vma[i] = NULL), which means we do not clear the last vma when unwinding in eb_release_vmas. So the vma->exec_flags was carried over into the next execbuf call from userspace.
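To make that sequence concrete, here is a rough userspace model of it. This is only a sketch and not the driver code: every model_* name, MAX_BUFFERS and the -1 error value are made up for illustration, and the release loop is assumed to stop at the NULL sentinel as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFFERS 4 /* hypothetical limit for this model */

/* Hypothetical stand-in for struct i915_vma: only the tracking pointer. */
struct model_vma {
	unsigned int *exec_flags; /* non-NULL while tracked by an execbuf */
};

/* Hypothetical stand-in for struct i915_execbuffer. */
struct model_eb {
	struct model_vma *vma[MAX_BUFFERS];
	unsigned int flags[MAX_BUFFERS];
	unsigned int buffer_count;
};

/*
 * Model of eb_add_vma(): set up the tracking state, then optionally fail
 * the unbind. With apply_fix set, the tracking pointer is cleared on
 * error, which is what the patch adds.
 */
int model_add_vma(struct model_eb *eb, unsigned int i, struct model_vma *vma,
		  bool unbind_fails, bool apply_fix)
{
	int err = 0;

	eb->vma[i] = vma;
	vma->exec_flags = &eb->flags[i];

	if (unbind_fails)
		err = -1; /* stands in for an unbind interrupted by a signal */
	if (err && apply_fix)
		vma->exec_flags = NULL;

	return err;
}

/* Model of the lookup loop: on error, slot i becomes the sentinel. */
int model_lookup_vmas(struct model_eb *eb, struct model_vma *vmas,
		      unsigned int count, unsigned int fail_at, bool apply_fix)
{
	unsigned int i;

	eb->buffer_count = count;
	for (i = 0; i < count; i++) {
		int err = model_add_vma(eb, i, &vmas[i], i == fail_at, apply_fix);

		if (err) {
			eb->vma[i] = NULL; /* sentinel, as in err_vma */
			return err;
		}
	}
	return 0;
}

/* Model of the unwind: it never reaches the vma behind the sentinel. */
void model_release_vmas(struct model_eb *eb)
{
	unsigned int i;

	for (i = 0; i < eb->buffer_count; i++) {
		struct model_vma *vma = eb->vma[i];

		if (!vma)
			break;

		assert(vma->exec_flags == &eb->flags[i]);
		vma->exec_flags = NULL;
		eb->vma[i] = NULL;
	}
}

/* True if the failing vma keeps stale tracking state after the unwind. */
bool model_leaks_exec_flags(bool apply_fix)
{
	struct model_eb eb = { 0 };
	struct model_vma vmas[3] = { { NULL }, { NULL }, { NULL } };

	if (model_lookup_vmas(&eb, vmas, 3, 1, apply_fix) == 0)
		return false;

	model_release_vmas(&eb);
	return vmas[1].exec_flags != NULL;
}
```

In this model, model_leaks_exec_flags(false) reports the stale pointer on the failing vma, and model_leaks_exec_flags(true) does not, because the error path clears it before the slot is turned into the sentinel.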
Ah yes, I missed the !vma continue bit in eb_release_vmas.
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Regards,
Tvrtko