On 2/11/26 11:06, Philipp Stanner wrote:
> On Tue, 2026-02-10 at 11:01 +0100, Christian König wrote:
>> At first glance it is counter intuitive to protect a constant function
>> pointer table by RCU, but this allows modules providing the function
>> table to unload by waiting for an RCU grace period.
>
> I think that someone who does not already have a deep understanding
> about dma-buf and fences will have much trouble understanding *why*
> this patch is in the log and *what it achieves*.
>
> Good commit messages are at least as important as good code. In
> drm/sched for example I've been trying so many times to figure out why
> certain hacks and changes were implemented, but all that git-blame ever
> gave me was one liners, often hinting at some driver internal work
> around ._.
How about something like this:
The fence ops of a dma_fence currently need to live as long as the dma_fence itself.
This means that the module which originally issued a dma_fence can't unload until all of its fences are freed.
As a first step to solving this issue, protect the fence ops with RCU.
While it is counter-intuitive to protect a constant function pointer table with RCU, it allows modules to wait for an RCU grace period to make sure that nobody is executing their functions any more.
>
>>
>> v2: make one the now duplicated lockdep warnings a comment instead.
>> v3: Add more documentation to ->wait and ->release callback.
>> v4: fix typo in documentation
>> v5: rebased on drm-tip
>>
>> Signed-off-by: Christian König <christian.koenig(a)amd.com>
>> ---
>> drivers/dma-buf/dma-fence.c | 69 +++++++++++++++++++++++++------------
>> include/linux/dma-fence.h | 29 ++++++++++++++--
>> 2 files changed, 73 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>> index e05beae6e407..de9bf18be3d4 100644
>> --- a/drivers/dma-buf/dma-fence.c
>> +++ b/drivers/dma-buf/dma-fence.c
>> @@ -522,6 +522,7 @@ EXPORT_SYMBOL(dma_fence_signal);
>> signed long
>> dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
>> {
>> + const struct dma_fence_ops *ops;
>> signed long ret;
>>
>> if (WARN_ON(timeout < 0))
>> @@ -533,15 +534,21 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
>>
>> dma_fence_enable_sw_signaling(fence);
>>
>> - if (trace_dma_fence_wait_start_enabled()) {
>
> Why can wait_start_enabled() be removed? Is that related to the life
> time decoupling or is it a separate topic?
It isn't removed. I've just dropped the "if (trace_dma_fence_wait_start_enabled())" optimization, which the tracing subsystem uses as self-patching code (longer story).
The trace_dma_fence_wait_start() trace point is still called a few lines below.
>> - rcu_read_lock();
>> - trace_dma_fence_wait_start(fence);
>> + rcu_read_lock();
>> + ops = rcu_dereference(fence->ops);
>> + trace_dma_fence_wait_start(fence);
>> + if (ops->wait) {
>> + /*
>> + * Implementing the wait ops is deprecated and not supported for
>> + * issuer independent fences, so it is ok to use the ops outside
>
> s/issuer/issuers of
Fixed.
> And how do we know that this here is an independent fence?
> What even is an "independent fence" – one with internal spinlock?
I rephrased the sentence a bit to make that clearer:
/*
* Implementing the wait ops is deprecated and not supported for
 * issuers of fences who want them to be independent of their
* module after they signal, so it is ok to use the ops outside
* the RCU protected section.
*/
>
>> + * the RCU protected section.
>> + */
>> + rcu_read_unlock();
>> + ret = ops->wait(fence, intr, timeout);
>> + } else {
>> rcu_read_unlock();
>> - }
>> - if (fence->ops->wait)
>> - ret = fence->ops->wait(fence, intr, timeout);
>> - else
>> ret = dma_fence_default_wait(fence, intr, timeout);
>> + }
>
> The git diff here looks awkward. Do you use git format-patch --
> histogram?
Nope, what's the matter?
>> if (trace_dma_fence_wait_end_enabled()) {
>> rcu_read_lock();
>> trace_dma_fence_wait_end(fence);
>>
>> @@ -1049,7 +1067,12 @@ __dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
>> BUG_ON(!ops || !ops->get_driver_name || !ops->get_timeline_name);
>>
>> kref_init(&fence->refcount);
>> - fence->ops = ops;
>> + /*
>> + * At first glance it is counter intuitive to protect a constant
>> + * function pointer table by RCU, but this allows modules providing the
>> + * function table to unload by waiting for an RCU grace period.
>
> Maybe add a sentence like "Fences can live longer than the module which
> issued them."
Going to use the same text as the commit message here as soon as we've synced up on that.
>
>> + */
>> + RCU_INIT_POINTER(fence->ops, ops);
>> INIT_LIST_HEAD(&fence->cb_list);
>> fence->lock = lock;
>> fence->context = context;
>> @@ -1129,11 +1152,12 @@ EXPORT_SYMBOL(dma_fence_init64);
>> */
>> const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
>> {
>> - RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
>> - "RCU protection is required for safe access to returned string");
>> + const struct dma_fence_ops *ops;
>>
>> + /* RCU protection is required for safe access to returned string */
>> + ops = rcu_dereference(fence->ops);
>> if (!dma_fence_test_signaled_flag(fence))
>> - return (const char __rcu *)fence->ops->get_driver_name(fence);
>> + return (const char __rcu *)ops->get_driver_name(fence);
>> else
>> return (const char __rcu *)"detached-driver";
>> }
>> @@ -1161,11 +1185,12 @@ EXPORT_SYMBOL(dma_fence_driver_name);
>> */
>> const char __rcu *dma_fence_timeline_name(struct dma_fence *fence)
>> {
>> - RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
>> - "RCU protection is required for safe access to returned string");
>> + const struct dma_fence_ops *ops;
>>
>> + /* RCU protection is required for safe access to returned string */
>> + ops = rcu_dereference(fence->ops);
>> if (!dma_fence_test_signaled_flag(fence))
>> - return (const char __rcu *)fence->ops->get_driver_name(fence);
>> + return (const char __rcu *)ops->get_driver_name(fence);
>> else
>> return (const char __rcu *)"signaled-timeline";
>> }
>
> Did we make any progress in our conversation about removing those two
> functions and callbacks? They're only used by i915.
Actually they are mostly used by the trace points and debugfs, so we certainly can't remove them.
But I'm really wondering why the heck i915 is using them?
>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>> index 9c4d25289239..6bf4feb0e01f 100644
>> --- a/include/linux/dma-fence.h
>> +++ b/include/linux/dma-fence.h
>> @@ -67,7 +67,7 @@ struct seq_file;
>> */
>> struct dma_fence {
>> spinlock_t *lock;
>> - const struct dma_fence_ops *ops;
>> + const struct dma_fence_ops __rcu *ops;
>> /*
>> * We clear the callback list on kref_put so that by the time we
>> * release the fence it is unused. No one should be adding to the
>> @@ -220,6 +220,10 @@ struct dma_fence_ops {
>> * timed out. Can also return other error values on custom implementations,
>> * which should be treated as if the fence is signaled. For example a hardware
>> * lockup could be reported like that.
>> + *
>> + * Implementing this callback prevents the fence from detaching after
>> + * signaling and so it is mandatory for the module providing the
>
> s/mandatory/necessary ?
Fixed.
>
>> + * dma_fence_ops to stay loaded as long as the dma_fence exists.
>> */
>> signed long (*wait)(struct dma_fence *fence,
>> bool intr, signed long timeout);
>> @@ -231,6 +235,13 @@ struct dma_fence_ops {
>> * Can be called from irq context. This callback is optional. If it is
>> * NULL, then dma_fence_free() is instead called as the default
>> * implementation.
>> + *
>> + * Implementing this callback prevents the fence from detaching after
>> + * signaling and so it is mandatory for the module providing the
>
> same
Fixed.
>
>> + * dma_fence_ops to stay loaded as long as the dma_fence exists.
>> + *
>> + * If the callback is implemented the memory backing the dma_fence
>> + * object must be freed RCU safe.
>> */
>> void (*release)(struct dma_fence *fence);
>>
>> @@ -454,13 +465,19 @@ dma_fence_test_signaled_flag(struct dma_fence *fence)
>> static inline bool
>> dma_fence_is_signaled_locked(struct dma_fence *fence)
>> {
>> + const struct dma_fence_ops *ops;
>> +
>> if (dma_fence_test_signaled_flag(fence))
>> return true;
>>
>> - if (fence->ops->signaled && fence->ops->signaled(fence)) {
>> + rcu_read_lock();
>> + ops = rcu_dereference(fence->ops);
>> + if (ops->signaled && ops->signaled(fence)) {
>
> Maybe you can educate me a bit about RCU here – couldn't this still
> race? If the ops were unloaded before you take rcu_read_lock(),
> rcu_dereference() would give you an invalid pointer here since you
> don't check for !ops, no?
Perfectly correct thinking, yes.
But the check for !ops is added in patch #2 when we actually start to set ops = NULL when the fence signals.
I intentionally separated that because it is basically the second step in making the RCU-based detaching of the fence ops from the module work.
We could merge the two patches together, but I think the separation actually makes sense should anybody start to complain about the additional RCU overhead.
Thanks,
Christian.
>
>
>> + rcu_read_unlock();
>> dma_fence_signal_locked(fence);
>> return true;
>> }
>> + rcu_read_unlock();
>>
>> return false;
>> }
>> @@ -484,13 +501,19 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
>> static inline bool
>> dma_fence_is_signaled(struct dma_fence *fence)
>> {
>> + const struct dma_fence_ops *ops;
>> +
>> if (dma_fence_test_signaled_flag(fence))
>> return true;
>>
>> - if (fence->ops->signaled && fence->ops->signaled(fence)) {
>> + rcu_read_lock();
>> + ops = rcu_dereference(fence->ops);
>> + if (ops->signaled && ops->signaled(fence)) {
>
> same
>
>
> Danke,
> P.
>
>> + rcu_read_unlock();
>> dma_fence_signal(fence);
>> return true;
>> }
>> + rcu_read_unlock();
>>
>> return false;
>> }
>
On 2/11/26 10:50, Philipp Stanner wrote:
> On Tue, 2026-02-10 at 11:01 +0100, Christian König wrote:
...
>> Using a per-fence spinlock allows completely decoupling spinlock producer
>> and consumer life times, simplifying the handling in most use cases.
>
> That's a good commit message btw, detailing what the motivation is.
> Would be great to see messages like that more frequently :]
Yeah, but they are not so easy to write.
>> trace_dma_fence_init(fence);
>> @@ -1091,7 +1094,7 @@ __dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
>> * dma_fence_init - Initialize a custom fence.
>> * @fence: the fence to initialize
>> * @ops: the dma_fence_ops for operations on this fence
>> - * @lock: the irqsafe spinlock to use for locking this fence
>> + * @lock: optional irqsafe spinlock to use for locking this fence
>> * @context: the execution context this fence is run on
>> * @seqno: a linear increasing sequence number for this context
>> *
>> @@ -1101,6 +1104,10 @@ __dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
>> *
>> * context and seqno are used for easy comparison between fences, allowing
>> * to check which fence is later by simply using dma_fence_later().
>> + *
>> + * It is strongly discouraged to provide an external lock. This is only allowed
>
> "strongly discouraged […] because this does not decouple lock and fence
> life times." ?
Good point, added some more text.
>> + * for legacy use cases when multiple fences need to be prevented from
>> + * signaling out of order.
>
> I think our previous discussions revealed that the external lock does
> not even help with that, does it?
Well, only when you provide a ->signaled() callback in the dma_fence_ops.
The reason we have so many different approaches in the dma_fence handling is that it is basically the unification of multiple different driver implementations, which all targeted more or less different use cases.
>> + * for legacy use cases when multiple fences need to be prevented from
>> + * signaling out of order.
>> */
>> void
>> dma_fence_init64(struct dma_fence *fence, const struct dma_fence_ops *ops,
>> diff --git a/drivers/dma-buf/sync_debug.h b/drivers/dma-buf/sync_debug.h
>> index 02af347293d0..c49324505b20 100644
>> --- a/drivers/dma-buf/sync_debug.h
>> +++ b/drivers/dma-buf/sync_debug.h
>> @@ -47,7 +47,7 @@ struct sync_timeline {
>>
>> static inline struct sync_timeline *dma_fence_parent(struct dma_fence *fence)
>> {
>> - return container_of(fence->lock, struct sync_timeline, lock);
>> + return container_of(fence->extern_lock, struct sync_timeline, lock);
>
> You're sure that this will never have to check for the flag?
Yes, the code would have crashed before if anything other than a sync_pt created by sync_pt_create was encountered here.
We could drop the wrapper, move the cast to the only place where it matters and document the why and what with a code comment... but this is all dead code which breaks some of the fundamental dma-fence rules, and it is only left here because we can't break the UAPI.
>> static const char *xe_hw_fence_get_driver_name(struct dma_fence *dma_fence)
>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>> index 88c842fc35d5..6eabbb1c471c 100644
>> --- a/include/linux/dma-fence.h
>> +++ b/include/linux/dma-fence.h
>> @@ -34,7 +34,8 @@ struct seq_file;
>> * @ops: dma_fence_ops associated with this fence
>> * @rcu: used for releasing fence with kfree_rcu
>> * @cb_list: list of all callbacks to call
>> - * @lock: spin_lock_irqsave used for locking
>> + * @extern_lock: external spin_lock_irqsave used for locking
>
> Add a "(deprecated)" ?
Done.
>
>> + * @inline_lock: alternative internal spin_lock_irqsave used for locking
>> * @context: execution context this fence belongs to, returned by
>> * dma_fence_context_alloc()
>> * @seqno: the sequence number of this fence inside the execution context,
>> @@ -49,6 +50,7 @@ struct seq_file;
>> * of the time.
>> *
>> * DMA_FENCE_FLAG_INITIALIZED_BIT - fence was initialized
>> + * DMA_FENCE_FLAG_INLINE_LOCK_BIT - use inline spinlock instead of external one
>> * DMA_FENCE_FLAG_SIGNALED_BIT - fence is already signaled
>> * DMA_FENCE_FLAG_TIMESTAMP_BIT - timestamp recorded for fence signaling
>> * DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called
>> @@ -66,7 +68,10 @@ struct seq_file;
>> * been completed, or never called at all.
>> */
>> struct dma_fence {
>> - spinlock_t *lock;
>> + union {
>> + spinlock_t *extern_lock;
>> + spinlock_t inline_lock;
>> + };
>> const struct dma_fence_ops __rcu *ops;
>> /*
>> * We clear the callback list on kref_put so that by the time we
>> @@ -100,6 +105,7 @@ struct dma_fence {
>>
>> enum dma_fence_flag_bits {
>> DMA_FENCE_FLAG_INITIALIZED_BIT,
>> + DMA_FENCE_FLAG_INLINE_LOCK_BIT,
>
> Just asking about a nit: what's the order here, always alphabetically?
They're ordered by when the flags are used in the code flow.
>> DMA_FENCE_FLAG_SEQNO64_BIT,
>> DMA_FENCE_FLAG_SIGNALED_BIT,
>> DMA_FENCE_FLAG_TIMESTAMP_BIT,
>> @@ -381,11 +387,12 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
>> * dma_fence_spinlock - return pointer to the spinlock protecting the fence
>> * @fence: the fence to get the lock from
>> *
>> - * Return the pointer to the extern lock.
>> + * Return either the pointer to the embedded or the external spin lock.
>> */
>> static inline spinlock_t *dma_fence_spinlock(struct dma_fence *fence)
>> {
>> - return fence->lock;
>> + return test_bit(DMA_FENCE_FLAG_INLINE_LOCK_BIT, &fence->flags) ?
>> + &fence->inline_lock : fence->extern_lock;
>
> I personally am not a fan of using '?' for anything longer than 1 line
> and think that
>
> if (condition)
> return a;
>
> return b;
>
> is much better readable.
Mhm, I disagree in this particular case. Having both possibilities side by side actually makes it more readable, I think.
Regards,
Christian.
On Tue, Feb 10, 2026 at 10:05:14AM +0100, Jiri Pirko wrote:
> Mon, Feb 09, 2026 at 09:08:03PM +0100, jstultz(a)google.com wrote:
> >On Mon, Feb 9, 2026 at 7:38 AM Jiri Pirko <jiri(a)resnulli.us> wrote:
> >>
> >> From: Jiri Pirko <jiri(a)nvidia.com>
> >>
> >> Currently the flags, which are unused, are validated for all heaps.
> >> Since the follow-up patch introduces a flag valid for only one of the
> >> heaps, allow to specify the valid flags per-heap.
> >
> >I'm not really in this space anymore, so take my feedback with a grain of salt.
> >
> >While the heap allocate flags argument is unused, it was intended to
> >be used for generic allocation flags that would apply to all or at
> >least a wide majority of heaps.
> >
> >It was definitely not added to allow for per-heap or heap specific
> >flags (as this patch tries to utilize it). That was the mess we had
> >with ION driver that we were trying to avoid.
> >
> >The intent of dma-buf heaps is to try to abstract all the different
> >device memory constraints so there only needs to be a [usage] ->
> >[heap] mapping, and otherwise userland can be generalized so that it
> >doesn't need to be re-written to work with different devices/memory
> >types. Adding heap-specific allocation flags prevents that
> >generalization.
> >
> >So instead of adding heap specific flags, the general advice has been
> >to add a separate heap name for the flag property.
>
> Right, my original idea was to add a separate heap. Then I spotted the
> flags and they seemed like a great fit. I was not aware of the history or
> the original intention. It would probably be good to document it for
> future generations.
>
> So instead of flag, I will add heap named something
> like "system_cc_decrypted" to implement this.
It is problematic to expose a user‑visible API that depends on a name.
Such a design limits our ability to extend the functionality in the
future, should new use cases arise.
Thanks
>
> Thanks!
Hi Jiri,
kernel test robot noticed the following build errors:
[auto build test ERROR on drm-misc/drm-misc-next]
[also build test ERROR on drm-tip/drm-tip trace/for-next linus/master v6.19]
[cannot apply to next-20260209]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Jiri-Pirko/dma-mapping-avoid…
base: https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link: https://lore.kernel.org/r/20260209153809.250835-6-jiri%40resnulli.us
patch subject: [PATCH 5/5] dma-buf: heaps: system: add an option to allocate explicitly decrypted memory
config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20260211/202602110149.tBUPP0bh-lkp@…)
compiler: s390-linux-gcc (GCC) 15.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260211/202602110149.tBUPP0bh-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602110149.tBUPP0bh-lkp@intel.com/
All errors (new ones prefixed by >>):
drivers/dma-buf/heaps/system_heap.c: In function 'system_heap_set_page_decrypted':
>> drivers/dma-buf/heaps/system_heap.c:66:15: error: implicit declaration of function 'set_memory_decrypted' [-Wimplicit-function-declaration]
66 | ret = set_memory_decrypted(addr, nr_pages);
| ^~~~~~~~~~~~~~~~~~~~
drivers/dma-buf/heaps/system_heap.c: In function 'system_heap_set_page_encrypted':
>> drivers/dma-buf/heaps/system_heap.c:80:15: error: implicit declaration of function 'set_memory_encrypted' [-Wimplicit-function-declaration]
80 | ret = set_memory_encrypted(addr, nr_pages);
| ^~~~~~~~~~~~~~~~~~~~
vim +/set_memory_decrypted +66 drivers/dma-buf/heaps/system_heap.c
59
60 static int system_heap_set_page_decrypted(struct page *page)
61 {
62 unsigned long addr = (unsigned long)page_address(page);
63 unsigned int nr_pages = 1 << compound_order(page);
64 int ret;
65
> 66 ret = set_memory_decrypted(addr, nr_pages);
67 if (ret)
68 pr_warn_ratelimited("dma-buf system heap: failed to decrypt page at %p\n",
69 page_address(page));
70
71 return ret;
72 }
73
74 static int system_heap_set_page_encrypted(struct page *page)
75 {
76 unsigned long addr = (unsigned long)page_address(page);
77 unsigned int nr_pages = 1 << compound_order(page);
78 int ret;
79
> 80 ret = set_memory_encrypted(addr, nr_pages);
81 if (ret)
82 pr_warn_ratelimited("dma-buf system heap: failed to re-encrypt page at %p, leaking memory\n",
83 page_address(page));
84
85 return ret;
86 }
87
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
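A plausible fix (an assumption on my part, not something stated in the report) is to include <linux/set_memory.h>, which declares set_memory_encrypted()/set_memory_decrypted() and provides no-op stubs on architectures without CONFIG_ARCH_HAS_MEM_ENCRYPT:

```diff
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@
+#include <linux/set_memory.h>
```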
On Tue, Feb 10, 2026 at 03:49:02PM +0100, Jiri Pirko wrote:
> Tue, Feb 10, 2026 at 01:43:57PM +0100, jgg(a)ziepe.ca wrote:
> >On Tue, Feb 10, 2026 at 10:14:08AM +0100, Jiri Pirko wrote:
> >
> >> >I'd advocate that the right design is for userspace to positively
> >> >signal via this flag that it wants/accepts shared memory and without
> >> >the flag shared memory should never be returned.
> >>
> >> We can have the same behaviour with the separate heap, can't we?
> >> Userspace positively signals it wants/accepts the shared memory by
> >> choosing "system_cc_decrypted" heap name.
> >
> >So what do the other heap names do? Always private? Do you ever get
> >heaps that are unknowably private or shared (eg MMIO backed?)
>
> If I understand the code correctly, you may get something like this:
> $ ls /dev/dma_heap/
> default_cma_region
> protected,secure-video
> protected,secure-video-record
> protected,trusted-ui
> system
>
> The "protected*" ones are created by tee. I believe they handle
> memory that is inaccessible to the CPU.
If that is the only list of options then maybe just the name will work OK.
I *think* CMA and system should be reliably CC private.
The protected ones seem to have their own internal definition, and
probably can't exist on CC VM systems.
Meaning we don't have any shared things leaking through, which would be
the point.
Jason
On Tue, Feb 10, 2026 at 10:14:08AM +0100, Jiri Pirko wrote:
> >I'd advocate that the right design is for userspace to positively
> >signal via this flag that it wants/accepts shared memory and without
> >the flag shared memory should never be returned.
>
> We can have the same behaviour with the separate heap, can't we?
> Userspace positively signals it wants/accepts the shared memory by
> choosing "system_cc_decrypted" heap name.
So what do the other heap names do? Always private? Do you ever get
heaps that are unknowably private or shared (eg MMIO backed?)
Jason
Hi Jiri,
kernel test robot noticed the following build errors:
[auto build test ERROR on drm-misc/drm-misc-next]
[also build test ERROR on drm-tip/drm-tip trace/for-next linus/master v6.19]
[cannot apply to next-20260209]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Jiri-Pirko/dma-mapping-avoid…
base: https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link: https://lore.kernel.org/r/20260209153809.250835-6-jiri%40resnulli.us
patch subject: [PATCH 5/5] dma-buf: heaps: system: add an option to allocate explicitly decrypted memory
config: s390-allmodconfig (https://download.01.org/0day-ci/archive/20260210/202602101926.lsquJdb1-lkp@…)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260210/202602101926.lsquJdb1-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602101926.lsquJdb1-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/dma-buf/heaps/system_heap.c:66:8: error: call to undeclared function 'set_memory_decrypted'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
66 | ret = set_memory_decrypted(addr, nr_pages);
| ^
>> drivers/dma-buf/heaps/system_heap.c:80:8: error: call to undeclared function 'set_memory_encrypted'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
80 | ret = set_memory_encrypted(addr, nr_pages);
| ^
2 errors generated.
vim +/set_memory_decrypted +66 drivers/dma-buf/heaps/system_heap.c
59
60 static int system_heap_set_page_decrypted(struct page *page)
61 {
62 unsigned long addr = (unsigned long)page_address(page);
63 unsigned int nr_pages = 1 << compound_order(page);
64 int ret;
65
> 66 ret = set_memory_decrypted(addr, nr_pages);
67 if (ret)
68 pr_warn_ratelimited("dma-buf system heap: failed to decrypt page at %p\n",
69 page_address(page));
70
71 return ret;
72 }
73
74 static int system_heap_set_page_encrypted(struct page *page)
75 {
76 unsigned long addr = (unsigned long)page_address(page);
77 unsigned int nr_pages = 1 << compound_order(page);
78 int ret;
79
> 80 ret = set_memory_encrypted(addr, nr_pages);
81 if (ret)
82 pr_warn_ratelimited("dma-buf system heap: failed to re-encrypt page at %p, leaking memory\n",
83 page_address(page));
84
85 return ret;
86 }
87
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Hi everyone,
dma_fences have always lived under the tyranny of their issuer's module
lifetime, leading to crashes should anybody still hold a reference to a
dma_fence when the issuer's module is unloaded.
The basic problem is that when buffers are shared between drivers,
dma_fence objects can leak into external drivers and stay there even
after they are signaled. The dma_resv object, for example, only lazily
releases dma_fences.
So what happens is that when the module which originally created the
dma_fence unloads, the dma_fence_ops function table becomes unavailable
as well, and any attempt to release the fence crashes the system.
Previously, various approaches have been discussed, including changing the
locking semantics of the dma_fence callbacks (by me) as well as using the
drm scheduler as an intermediate layer (by Sima) to disconnect dma_fences
from their actual users, but none of them actually solve all the problems.
Tvrtko did some really nice prerequisite work by protecting the returned
strings of the dma_fence_ops by RCU. This way dma_fence creators were
able to just wait for an RCU grace period after fence signaling before
it was safe to free those data structures.
Now this patch set goes a step further and protects the whole
dma_fence_ops structure by RCU, so that after the fence signals the
pointer to the dma_fence_ops is set to NULL when neither a wait nor a
release callback is given. All functionality which uses the dma_fence_ops
reference is put inside an RCU critical section, except for the
deprecated issuer-specific wait and of course the optional release
callback.
In addition to the RCU changes: the lock protecting the dma_fence state
previously had to be allocated externally. This set now makes that
external lock optional and allows dma_fences to use an inline lock and
be self-contained.
v4:
Rebased the whole set on upstream changes, especially the cleanup
from Philip in patch "drm/amdgpu: independence for the amdkfd_fence!".
Added two patches which bring the DMA-fence self tests up to date.
The first selftest change removes the mock_wait and so actually starts
testing the default behavior instead of some hacky implementation in the
test. This one got upstreamed independently of this set.
The second drops the mock_fence as well and tests the new RCU and inline
spinlock functionality.
v5:
Rebase on top of drm-misc-next instead of drm-tip, leave out all driver
changes for now since those should go through the driver specific paths
anyway.
Address a few more review comments, especially some rebase mess and
typos. And finally fix one more bug found by AMD's CI system.
v6:
Minor style changes, re-ordered patch #1, dropped the scheduler fence
change for now
v7:
The patch adding the dma_fence_was_initialized() function was pushed
upstream individually since that is really an independent cleanup.
Fixed some missing i915 bits in patch "dma-buf: abstract fence locking".
Please review and comment,
Christian.
On Mon, Feb 9, 2026 at 7:38 AM Jiri Pirko <jiri(a)resnulli.us> wrote:
>
> From: Jiri Pirko <jiri(a)nvidia.com>
>
> Currently the flags, which are unused, are validated for all heaps.
> Since the follow-up patch introduces a flag valid for only one of the
> heaps, allow to specify the valid flags per-heap.
I'm not really in this space anymore, so take my feedback with a grain of salt.
While the heap allocate flags argument is unused, it was intended to
be used for generic allocation flags that would apply to all or at
least a wide majority of heaps.
It was definitely not added to allow for per-heap or heap specific
flags (as this patch tries to utilize it). That was the mess we had
with ION driver that we were trying to avoid.
The intent of dma-buf heaps is to try to abstract all the different
device memory constraints so there only needs to be a [usage] ->
[heap] mapping, and otherwise userland can be generalized so that it
doesn't need to be re-written to work with different devices/memory
types. Adding heap-specific allocation flags prevents that
generalization.
So instead of adding heap specific flags, the general advice has been
to add a separate heap name for the flag property.
Now, there have been many discussions around "protected buffers" (which
doesn't seem to map exactly to this confidential computing primitive,
but sounds like it might be related), which have bounced between
being an allocation flag or a device specific heap without much
resolution. I appreciate that in this patch series you've pushed your
concept down into a DMA_ATTR_, as I do feel the kernel should have a
deeper sense of protected buffers (or any general property like this)
as a concept if it is going to be a generic allocation flag, instead
of it being a somewhat thin creation of the outer heap-driver layer.
But it seems like the use case here is still far too narrow for a top
level allocation flag.
So I'd advocate against introducing heap-specific flags like this.
thanks
-john