Hi Boris,
On Tue, May 05, 2026 at 05:20:48PM +0200, Boris Brezillon wrote:
Hi Ketil,
On Tue, 5 May 2026 16:05:07 +0200 Ketil Johnsen ketil.johnsen@arm.com wrote:
From: John Stultz jstultz@google.com
Add proper reference counting on the dma_heap structure. While existing heaps are built-in, we may eventually have heaps loaded from modules, and we'll need to be able to properly handle the references to the heaps
It's weird that this "heap as module" thing is mentioned here, but actual robustness to make this safe is not added in the commit or any of the following ones.
Signed-off-by: John Stultz jstultz@google.com Signed-off-by: T.J. Mercier tjmercier@google.com Signed-off-by: Yong Wu yong.wu@mediatek.com [Yong: Just add comment for "minor" and "refcount"] Signed-off-by: Yunfei Dong yunfei.dong@mediatek.com [Yunfei: Change reviewer's comments] Signed-off-by: Florent Tomasin florent.tomasin@arm.com [Florent: Rebase] Signed-off-by: Ketil Johnsen ketil.johnsen@arm.com [Ketil: Rebase]
drivers/dma-buf/dma-heap.c | 29 +++++++++++++++++++++++++++++ include/linux/dma-heap.h | 2 ++ 2 files changed, 31 insertions(+)
diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c index ac5f8685a6494..9fd365ddbd517 100644 --- a/drivers/dma-buf/dma-heap.c +++ b/drivers/dma-buf/dma-heap.c @@ -12,6 +12,7 @@ #include <linux/dma-heap.h> #include <linux/err.h> #include <linux/export.h> +#include <linux/kref.h> #include <linux/list.h> #include <linux/nospec.h> #include <linux/syscalls.h> @@ -31,6 +32,7 @@
- @heap_devt: heap device node
- @list: list head connecting to list of heaps
- @heap_cdev: heap char device
*/
- @refcount: reference counter for this heap device
- Represents a heap of memory from which buffers can be made.
@@ -41,6 +43,7 @@ struct dma_heap { dev_t heap_devt; struct list_head list; struct cdev heap_cdev;
- struct kref refcount;
}; static LIST_HEAD(heap_list); @@ -248,6 +251,7 @@ struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info) if (!heap) return ERR_PTR(-ENOMEM);
- kref_init(&heap->refcount); heap->name = exp_info->name; heap->ops = exp_info->ops; heap->priv = exp_info->priv;
@@ -313,6 +317,31 @@ struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info) } EXPORT_SYMBOL_NS_GPL(dma_heap_add, "DMA_BUF_HEAP"); +static void dma_heap_release(struct kref *ref) +{
- struct dma_heap *heap = container_of(ref, struct dma_heap, refcount);
- unsigned int minor = MINOR(heap->heap_devt);
- mutex_lock(&heap_list_lock);
- list_del(&heap->list);
- mutex_unlock(&heap_list_lock);
- device_destroy(dma_heap_class, heap->heap_devt);
- cdev_del(&heap->heap_cdev);
- xa_erase(&dma_heap_minors, minor);
- kfree(heap);
That's actually problematic, because cdev_del() doesn't guarantee that all opened FDs have been closed [1], it just guarantees that no new ones can materialize. In order to make that safe, we'd need a
- kref_get_unless_zero() in dma_heap_open(), with proper locking around the xa_load() to protect against the heap removal that's happening here
- a dma_heap_put() in a new dma_heap_close() implementation
- a guarantee that heap implementations won't go away until the last ref is dropped, which means ops and all the data needed for this heap to satisfy ioctl()s (and more generally every passed at dma_heap_add() time) have to stay valid until the last ref is dropped. Alternatively, we could restrict this only to in-flight ioctl()s, and have the ops replaced by some dummy ops using RCU or a rwlock. But I guess live dmabufs allocated on this heap have to retain the heap and its implementation anyway.
For record, #3 is already not satisfied by the current tee_heap implementation (tee_dma_heap objects can vanish before the dma_heap object is gone). The other implementations seem to be fine because they are statically linked, and they either have exp_info.priv set to NULL, or something that's never released.
That statement won't hold for long, see: https://lore.kernel.org/r/20260427-dma-buf-heaps-as-modules-v5-0-b6f5678feef...
However, all upstream heaps can be loaded as module, but not unloaded. So once you get a reference to it, you can assume it will live forever. That's why we didn't merge that patch before, even though it was discussed:
https://lore.kernel.org/all/CANDhNCqk9Uk4aXHhUsL4hR1GHNmWZnH3C9Np-A02wdi+J3D...
Maxime
linaro-mm-sig@lists.linaro.org