On 2025-11-25 at 19:06:32 GMT, Janusz Krzysztofik wrote:
Re-sending because of my response unintentionally HTML formatted, with correct email address of Tvrtko by the way.
Hi Krzysztof,
On Tuesday, 25 November 2025 14:33:38 CET Krzysztof Niemiec wrote:
Initialize eb->vma[].vma pointers to NULL when the eb structure is first set up.
During the execution of eb_lookup_vmas(), the eb->vma array is successively filled up with struct eb_vma objects. This process includes calling eb_add_vma(), which might fail; however, even in the event of failure, eb->vma[i].vma is set for the currently processed buffer.
If eb_add_vma() fails, eb_lookup_vmas() returns with an error, which prompts a call to eb_release_vmas() to clean up the mess. Since eb_lookup_vmas() might fail during processing any (possibly not first) buffer, eb_release_vmas() checks whether a buffer's vma is NULL to know at what point did the lookup function fail.
In eb_lookup_vmas(), eb->vma[i].vma is set to NULL if either the helper function eb_lookup_vma() or eb_validate_vma() fails. eb->vma[i+1].vma is set to NULL in case i915_gem_object_userptr_submit_init() fails; the current one needs to be cleaned up by eb_release_vmas() at this point, so the next one is set. If eb_add_vma() fails, neither the current nor the next vma is nullified, which is a source of a NULL deref bug described in [1].
When entering eb_lookup_vmas(), the vma pointers are set to the slab poison value, instead of NULL.
Your commit description still doesn't answer my question why the whole memory area allocated to the table of VMAs is not initialized to 0 on allocation, only left populated with poison values.
Becuase kvmalloc_array() is used. [1]
I guess one could swap it to a call to kvcalloc() or something similar; the thing is that the call actually handles both allocations of exec_list2 and the eb_vma array, the former doesn't need to be zero-initialized, the latter technically also doesn't but it simplifies error paths (and fixes the linked bug). I'm not sure if a zero-initializing *alloc() would be more readable or not here.
Thanks Krzysztof
[1] https://elixir.bootlin.com/linux/v6.17.9/source/drivers/gpu/drm/i915/gem/i91...
Thanks, Janusz