On Sun, Nov 23, 2025 at 10:51:21PM +0000, Pavel Begunkov wrote:
> +static inline struct dma_token *
> +dma_token_create(struct file *file, struct dma_token_params *params)
> +{
> + struct dma_token *res;
> +
> + if (!file->f_op->dma_map)
> + return ERR_PTR(-EOPNOTSUPP);
> + res = file->f_op->dma_map(file, params);
Calling the file operation ->dma_map feels really misleading.
create_token, as in the function name, is already much better, but
this is not just DMA-related but dmabuf-related, and that should really
be encoded in the name.
Also why not pass the dmabuf and direction directly instead of wrapping
it in the odd params struct making the whole thing hard to follow?
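To make the suggestion concrete, here is a rough userspace sketch of what
passing the dmabuf and direction directly could look like. The callback name
dmabuf_create_token, the helper name dmabuf_token_create, the demo provider
and the mock struct layouts are all made up for illustration and are not
taken from the posted series. The ERR_PTR() helpers are simplified imitations
of the kernel ones.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel types; every definition here is a mock-up. */
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };
struct dma_buf { size_t size; };
struct dma_token { struct dma_buf *dmabuf; enum dma_data_direction dir; };
struct file;

struct file_operations {
	/* Hypothetical name: mentions dmabuf and token, takes plain arguments. */
	struct dma_token *(*dmabuf_create_token)(struct file *file,
						 struct dma_buf *dmabuf,
						 enum dma_data_direction dir);
};
struct file { const struct file_operations *f_op; };

/* Simplified imitation of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Same shape as the quoted helper, minus the params struct. */
static inline struct dma_token *
dmabuf_token_create(struct file *file, struct dma_buf *dmabuf,
		    enum dma_data_direction dir)
{
	if (!file->f_op->dmabuf_create_token)
		return ERR_PTR(-EOPNOTSUPP);
	return file->f_op->dmabuf_create_token(file, dmabuf, dir);
}

/* Demo provider, only here to exercise the helper. */
static struct dma_token demo_token;
static struct dma_token *demo_create_token(struct file *file,
					   struct dma_buf *dmabuf,
					   enum dma_data_direction dir)
{
	(void)file;
	assert(dmabuf);
	demo_token.dmabuf = dmabuf;
	demo_token.dir = dir;
	return &demo_token;
}
```

With this shape the call site shows at a glance which buffer and which
direction are involved, without the reader having to look up how a params
struct was filled in.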
Add raw_dma_resv() for retrieving a pointer to the struct dma_resv of a
given GEM object. It lives in a new trait, BaseObjectPrivate, which we
automatically implement for all gem objects and don't expose to users
outside of the crate.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
---
rust/kernel/drm/gem/mod.rs | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs
index 5c215e83c1b09..ec3c1b1775196 100644
--- a/rust/kernel/drm/gem/mod.rs
+++ b/rust/kernel/drm/gem/mod.rs
@@ -199,6 +199,18 @@ fn create_mmap_offset(&self) -> Result<u64> {
impl<T: IntoGEMObject> BaseObject for T {}
+/// Crate-private base operations shared by all GEM object classes.
+#[expect(unused)]
+pub(crate) trait BaseObjectPrivate: IntoGEMObject {
+ /// Return a pointer to this object's dma_resv.
+ fn raw_dma_resv(&self) -> *mut bindings::dma_resv {
+ // SAFETY: `as_raw()` always returns a valid pointer to the base DRM gem object
+ unsafe { (*self.as_raw()).resv }
+ }
+}
+
+impl<T: IntoGEMObject> BaseObjectPrivate for T {}
+
/// A base GEM object.
///
/// # Invariants
--
2.52.0
On 11/28/25 11:10, Philipp Stanner wrote:
> On Fri, 2025-11-28 at 11:06 +0100, Christian König wrote:
>> On 11/27/25 12:10, Philipp Stanner wrote:
>>> On Thu, 2025-11-13 at 15:51 +0100, Christian König wrote:
>>>> This should allow amdkfd_fences to outlive the amdgpu module.
>>>>
>>>> v2: implement Felix suggestion to lock the fence while signaling it.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig(a)amd.com>
>>>> ---
>>>>
>>>>
>
> […]
>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>> index a085faac9fe1..8fac70b839ed 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>> @@ -1173,7 +1173,7 @@ static void kfd_process_wq_release(struct work_struct *work)
>>>> synchronize_rcu();
>>>> ef = rcu_access_pointer(p->ef);
>>>> if (ef)
>>>> - dma_fence_signal(ef);
>>>> + amdkfd_fence_signal(ef);
>>>>
>>>> kfd_process_remove_sysfs(p);
>>>> kfd_debugfs_remove_process(p);
>>>> @@ -1990,7 +1990,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p, struct kfd_node *node,
>>>> static int signal_eviction_fence(struct kfd_process *p)
>>>> {
>>>> struct dma_fence *ef;
>>>> - int ret;
>>>>
>>>> rcu_read_lock();
>>>> ef = dma_fence_get_rcu_safe(&p->ef);
>>>> @@ -1998,10 +1997,10 @@ static int signal_eviction_fence(struct kfd_process *p)
>>>> if (!ef)
>>>> return -EINVAL;
>>>>
>>>> - ret = dma_fence_signal(ef);
>>>> + amdkfd_fence_signal(ef);
>>>> dma_fence_put(ef);
>>>>
>>>> - return ret;
>>>> + return 0;
>>>
>>> Oh wait, that's the code I'm also touching in my return code series!
>>>
>>> https://lore.kernel.org/dri-devel/cef83fed-5994-4c77-962c-9c7aac9f7306@amd.…
>>>
>>>
>>> Does this series then solve the problem Felix pointed out in
>>> evict_process_worker()?
>>
>> No, it doesn't. I wasn't aware that the higher-level code actually needs the status. After all, Felix is the maintainer of this part.
>>
>> This patch here needs to be rebased on top of yours and changed accordingly to still return the fence status correctly.
>>
>> But thanks for pointing that out.
>
>
> Alright, so my (repaired, v2) status-code-removal series shall enter drm-misc-next first, and then your series here. ACK?
Works for me; I just need both merged to rebase the amdgpu patches on top.
Christian.
>
>
> P.
On 11/27/25 11:57, Philipp Stanner wrote:
> On Thu, 2025-11-13 at 15:51 +0100, Christian König wrote:
>> Calling dma_fence_is_signaled() here is illegal!
>
> OK, but why is that patch in this series?
Because the next patch depends on it, otherwise the series won't compile.
My plan is to push the amdgpu patches through amd-staging-drm-next as soon as Alex has rebased that branch on drm-next during the next cycle.
Regards,
Christian.
>
> P.
>
>>
>> Signed-off-by: Christian König <christian.koenig(a)amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 6 ------
>> 1 file changed, 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
>> index 1ef758ac5076..09c919f72b6c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
>> @@ -120,12 +120,6 @@ static bool amdkfd_fence_enable_signaling(struct dma_fence *f)
>> {
>> struct amdgpu_amdkfd_fence *fence = to_amdgpu_amdkfd_fence(f);
>>
>> - if (!fence)
>> - return false;
>> -
>> - if (dma_fence_is_signaled(f))
>> - return true;
>> -
>> if (!fence->svm_bo) {
>> if (!kgd2kfd_schedule_evict_and_restore_process(fence->mm, f))
>> return true;
>
On 11/27/25 10:48, Philipp Stanner wrote:
> On Wed, 2025-11-26 at 16:24 -0500, Kuehling, Felix wrote:
>>
>> On 2025-11-26 08:19, Philipp Stanner wrote:
>>> The return code of dma_fence_signal() is not really useful as there is
>>> nothing reasonable to do if a fence was already signaled. That return
>>> code shall be removed from the kernel.
>>>
>>> Ignore dma_fence_signal()'s return code.
>>
>> I think this is not correct. Looking at the comment in
>> evict_process_worker, we use the return value to resolve a race
>> condition where multiple threads are trying to signal the eviction
>> fence. Only one of them should schedule the restore work. And the other
>> ones need to increment the reference count to keep evictions balanced.
>
> Thank you for pointing that out. It seems, then, that amdkfd is the only
> user that actually relies on the feature. See below.
>
>>
>> Regards,
>> Felix
>>
>>
>>>
>>> Suggested-by: Christian König <christian.koenig(a)amd.com>
>>> Signed-off-by: Philipp Stanner <phasta(a)kernel.org>
>>> ---
>>> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index ddfe30c13e9d..950fafa4b3c3 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -1986,7 +1986,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p, struct kfd_node *node,
>>> static int signal_eviction_fence(struct kfd_process *p)
>>> {
>>> struct dma_fence *ef;
>>> - int ret;
>>>
>>> rcu_read_lock();
>>> ef = dma_fence_get_rcu_safe(&p->ef);
>>> @@ -1994,10 +1993,10 @@ static int signal_eviction_fence(struct kfd_process *p)
>>> if (!ef)
>>> return -EINVAL;
>>>
>>> - ret = dma_fence_signal(ef);
>>> + dma_fence_signal(ef);
>
> The issue now is that dma_fence_signal()'s return code is actually non-
> racy, because check + bit-set are protected by lock.
>
> Christian's new spinlock series would add a lock function for fences:
> https://lore.kernel.org/dri-devel/20251113145332.16805-5-christian.koenig@a…
>
>
> So I suppose this should work:
>
> dma_fence_lock_irqsave(ef, flags);
> if (dma_fence_test_signaled_flag(ef)) {
> dma_fence_unlock_irqrestore(ef, flags);
> return true;
> }
> dma_fence_signal_locked(ef);
> dma_fence_unlock_irqrestore(ef, flags);
>
> return false;
>
>
> + some cosmetic adjustments for the boolean of course.
>
>
> Would that fly and be reasonable? @Felix, Christian.
I was just about to reply with the same idea when your mail arrived.
So yes looks totally reasonable to me.
Regards,
Christian.
>
>
> P.
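For illustration, the locked test-and-signal sequence quoted above can be
modelled in plain userspace C. This is only a sketch: struct mock_fence and
mock_fence_check_and_signal() are invented names, and a pthread mutex stands
in for the fence lock. It uses none of the dma_fence API and only mirrors the
control flow, returning true when the fence had already been signaled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace model of a fence: a lock plus a "signaled" flag. */
struct mock_fence {
	pthread_mutex_t lock;
	bool signaled;
};

#define MOCK_FENCE_INIT { PTHREAD_MUTEX_INITIALIZER, false }

/*
 * Signal the fence unless somebody else already did.
 *
 * The check and the flag update happen under the same lock, so exactly one
 * caller can observe "not yet signaled" and get false back; every other
 * caller gets true.
 */
static bool mock_fence_check_and_signal(struct mock_fence *f)
{
	assert(f);

	pthread_mutex_lock(&f->lock);
	if (f->signaled) {
		pthread_mutex_unlock(&f->lock);
		return true;
	}
	f->signaled = true;
	pthread_mutex_unlock(&f->lock);

	return false;
}
```

A caller in the position of evict_process_worker() could then treat a false
return as "this thread signaled the fence" and a true return as "another
thread got there first", which is the distinction Felix describes.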
On 11/27/25 09:23, Viresh Kumar wrote:
> On 27-11-25, 09:07, Christian König wrote:
>> On 11/27/25 08:40, Viresh Kumar wrote:
>>> Move several dma-buf function declarations under
>>> CONFIG_DMA_SHARED_BUFFER and provide static inline no-op implementations
>>> for the disabled case to allow the callers to build when the feature is
>>> not compiled in.
>>
>> Good point, but which driver actually needs that?
>
> This broke some WIP stuff [1] which isn't posted upstream yet. That's why I
> didn't mention anything in the commit log, though I could have added a comment
> about that in the non-commit-log part.
Well then better send that out with the full patch set.
>> In other words there should be a concrete example of what breaks in the commit message.
>
> It will be a while before those changes are posted, and I am not sure
> whether they will be accepted. But either way, this change made sense in
> general, so I thought there was nothing wrong with getting it upstream
> right away.
Yeah, but when it is unused in the interim, that is usually a no-go, even if I agree that it makes sense.
>>> +static inline struct dma_buf *dma_buf_get(int fd)
>>> +{
>>> + return NULL;
>>
>> And here ERR_PTR(-EINVAL).
>
> I am not really sure if this should be EINVAL in this case. EOPNOTSUPP still
> makes sense as the API isn't supported.
When the API isn't compiled in, the fd can't be valid (because you can't create a dma_buf object in the first place).
So returning -EINVAL still makes a lot of sense.
Regards,
Christian.
>
>>> +static inline struct dma_buf *dma_buf_iter_begin(void)
>>> +{
>>> + return NULL;
>>> +}
>>> +
>>> +static inline struct dma_buf *dma_buf_iter_next(struct dma_buf *dmbuf)
>>> +{
>>> + return NULL;
>>> +}
>>
>> Those two are only for BPF and not driver use.
>
> Will drop them.
>
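As a rough userspace sketch of the stub shape discussed above for
dma_buf_get(): the stub returns ERR_PTR(-EINVAL) rather than NULL, so a
caller that checks the result with IS_ERR() takes its error path. The
ERR_PTR() helpers below are simplified imitations of the kernel ones,
demo_import() is a made-up caller, and CONFIG_DMA_SHARED_BUFFER is
deliberately left undefined so that the stub branch is the one compiled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct dma_buf;

/* Simplified imitation of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

#ifdef CONFIG_DMA_SHARED_BUFFER
struct dma_buf *dma_buf_get(int fd);
#else
/* No dma_buf can exist in this configuration, so no fd can refer to one. */
static inline struct dma_buf *dma_buf_get(int fd)
{
	(void)fd;
	return ERR_PTR(-EINVAL);
}
#endif

/* Hypothetical caller: propagates the error instead of using the pointer. */
static int demo_import(int fd)
{
	struct dma_buf *buf = dma_buf_get(fd);

	if (IS_ERR(buf))
		return (int)PTR_ERR(buf);
	return 0;
}
```

Had the stub returned NULL, the IS_ERR() check in demo_import() would not
fire and the caller would continue with a NULL buffer.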