On 4/15/26 03:08, Zack Rusin wrote:
On Tue, Apr 14, 2026 at 9:25 AM Christian König christian.koenig@amd.com wrote:
On 4/14/26 12:55, popov.nkv@gmail.com wrote:
From: Vladimir Popov popov.nkv@gmail.com
If the vmw_execbuf_fence_commands() call made from vmw_kms_helper_validation_finish() fails, it sets *p_fence = NULL. If ctx->bo_list is not empty, vmw_kms_helper_validation_finish() then passes that NULL fence down a chain of functions to dma_fence_is_array(), which dereferences it:
vmw_kms_helper_validation_finish()       // passes NULL fence
  vmw_validation_done()
    vmw_validation_bo_fence()
      ttm_eu_fence_buffer_objects()      // passes NULL fence
        dma_resv_add_fence()
          dma_fence_is_container()
            dma_fence_is_array()         // NULL deref
Well, good catch, but that is clearly not the right fix.
I'm not an expert on the vmwgfx code, but in case of an error vmw_validation_revert() should be called, and not vmw_kms_helper_validation_finish().
To me the patch looks correct. vmw_validation_revert() is explicitly for submission failure: it does BO backoff plus vmw_validation_res_unreserve(ctx, true), and the backoff=true branch skips committing dirty-state / backup-MOB changes, which is only correct if the commands were not committed. Here the commands have already been submitted; only fence creation failed. So unlocking the BO reservations without attaching a fence, and letting vmw_validation_done() keep taking the success path for the resources, looks correct to me.
Ah! I would just avoid adding more TTM exec code dependencies.
We also have the always signaled stub fence for such use cases. How about that change here:
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index e1f18020170a..8dcb8cd19e29 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -3843,7 +3843,7 @@ int vmw_execbuf_fence_commands(struct drm_file *file_priv,
 	if (unlikely(ret != 0 && !synced)) {
 		(void) vmw_fallback_wait(dev_priv, false, false, sequence,
 					 false, VMW_FENCE_WAIT_TIMEOUT);
-		*p_fence = NULL;
+		*p_fence = dma_fence_get_stub();
 	}

 	return ret;
iirc the same helper is used by execbuf, and since the fix lands in that shared helper it covers both paths, so this is probably not only a kms issue.
Untangling this code would make sense because it's confusing, but that's not something I'd expect Vladimir to do :)
Yeah, the fence memory allocation should definitely be moved before submitting the commands.
But that is clearly more work.
Thanks, Christian.
z