Hi,
This series is the follow-up to the discussion John and I had some time ago here:
https://lore.kernel.org/all/CANDhNCquJn6bH3KxKf65BWiTYLVqSd9892-xtFDHHqqyrro...
The initial problem we were discussing was that I'm currently working on a platform which has a memory layout with ECC enabled. However, enabling ECC has a number of drawbacks on that platform: lower performance, increased memory usage, etc. So for things like framebuffers, the trade-off isn't great, and there's thus a memory region with ECC disabled to allocate from for such use cases.
After a suggestion from John, I chose to first start with heap allocation flags to allow userspace to ask for a particular ECC setup. This is then backed by a new heap type that allocates from reserved memory chunks flagged as such, and by the existing DT properties to specify the ECC properties.
After further discussion, it was considered that flags were not the right solution, and that relying on the names of the heaps would be enough to let userspace know what kind of buffer it deals with.
Thus, even though the uAPI part of it has been dropped in this second version, we still need a driver to create heaps out of carved-out memory regions. In addition to the original use case, a similar driver can be found in BSPs from most vendors, so I believe it would be a useful addition to the kernel.
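To make the use case concrete, the kind of node this series ends up describing looks something like this (node name, address, and size are made up for the example):

reserved-memory {
        #address-cells = <1>;
        #size-cells = <1>;

        /* small chunk of RAM with ECC left disabled */
        framebuffer@60000000 {
                compatible = "carved-out";
                reg = <0x60000000 0x800000>;
        };
};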
Let me know what you think,
Maxime
Signed-off-by: Maxime Ripard <mripard@kernel.org>
---
Changes in v5:
- Rebased on 6.16-rc2
- Switch from property to dedicated binding
- Link to v4: https://lore.kernel.org/r/20250520-dma-buf-ecc-heap-v4-1-bd2e1f1bb42c@kernel...
Changes in v4:
- Rebased on 6.15-rc7
- Map buffers only when map is actually called, not at allocation time
- Deal with restricted-dma-pool and shared-dma-pool
- Reword Kconfig options
- Properly report dma_map_sgtable failures
- Link to v3: https://lore.kernel.org/r/20250407-dma-buf-ecc-heap-v3-0-97cdd36a5f29@kernel...
Changes in v3:
- Reworked global variable patch
- Link to v2: https://lore.kernel.org/r/20250401-dma-buf-ecc-heap-v2-0-043fd006a1af@kernel...
Changes in v2:
- Add vmap/vunmap operations
- Drop ECC flags uapi
- Rebase on top of 6.14
- Link to v1: https://lore.kernel.org/r/20240515-dma-buf-ecc-heap-v1-0-54cbbd049511@kernel...
---
Maxime Ripard (2):
      dt-bindings: reserved-memory: Introduce carved-out memory region binding
      dma-buf: heaps: Introduce a new heap for reserved memory

 .../bindings/reserved-memory/carved-out.yaml |  49 +++
 drivers/dma-buf/heaps/Kconfig                |   8 +
 drivers/dma-buf/heaps/Makefile               |   1 +
 drivers/dma-buf/heaps/carveout_heap.c        | 362 +++++++++++++++++++++
 4 files changed, 420 insertions(+)
---
base-commit: d076bed8cb108ba2236d4d49c92303fda4036893
change-id: 20240515-dma-buf-ecc-heap-28a311d2c94e
Best regards,
Maxime
Some parts of the memory can be dedicated to specific purposes and exposed as a dedicated memory allocator.
This is especially useful if that region has particular properties the rest of the memory doesn't have. For example, some platforms have their entire RAM covered by ECC but for a small area meant to be used by applications that don't need ECC and its associated overhead.
Let's introduce a binding to describe such a region and allow the OS to create a dedicated memory allocator for it.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
---
 .../bindings/reserved-memory/carved-out.yaml | 49 ++++++++++++++++++++++
 1 file changed, 49 insertions(+)
diff --git a/Documentation/devicetree/bindings/reserved-memory/carved-out.yaml b/Documentation/devicetree/bindings/reserved-memory/carved-out.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..9ab5d1ebd9ebd9111b7c064fabe1c45e752da83b
--- /dev/null
+++ b/Documentation/devicetree/bindings/reserved-memory/carved-out.yaml
@@ -0,0 +1,49 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/reserved-memory/carved-out.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Carved-out Memory Region
+
+description: |
+  Specifies that the reserved memory region has been carved out of the
+  main memory allocator, and is intended to be used by the OS as a
+  dedicated memory allocator.
+
+maintainers:
+  - Maxime Ripard <mripard@kernel.org>
+
+properties:
+  compatible:
+    const: carved-out
+
+  reg:
+    description: region of memory that is carved out.
+
+allOf:
+  - $ref: reserved-memory.yaml
+  - not:
+      required:
+        - reusable
+  - not:
+      required:
+        - no-map
+
+required:
+  - compatible
+  - reg
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    reserved-memory {
+        #address-cells = <1>;
+        #size-cells = <1>;
+
+        memory@12340000 {
+            compatible = "carved-out";
+            reg = <0x12340000 0x00800000>;
+        };
+    };
On Tue, Jun 17, 2025 at 02:25:40PM +0200, Maxime Ripard wrote:
Some parts of the memory can be dedicated to specific purposes and exposed as a dedicated memory allocator.
This is especially useful if that region has particular properties the rest of the memory doesn't have. For example, some platforms have their entire RAM covered by ECC but for a small area meant to be used by applications that don't need ECC and its associated overhead.
Let's introduce a binding to describe such a region and allow the OS to create a dedicated memory allocator for it.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
[...]
+title: Carved-out Memory Region
+description: |
Don't need '|'.
+  Specifies that the reserved memory region has been carved out of the
+  main memory allocator, and is intended to be used by the OS as a
+  dedicated memory allocator.
Other than the commit msg, it is completely lost that this is for ECC-less memory.
This description applies to CMA area as well. So what's the difference?
+maintainers:
+  - Maxime Ripard <mripard@kernel.org>
+properties:
+  compatible:
+    const: carved-out
Isn't everything in reserved-memory a carve-out for some purpose? I'm not sure if I'd add 'no ECC' or something more along the lines of how this is used. The latter might be useful on platforms which can't disable ECC or don't have ECC at all.
+  reg:
+    description: region of memory that is carved out.
Can be multiple regions? If not, need 'maxItems: 1'.
+allOf:
+  - $ref: reserved-memory.yaml
+  - not:
+      required:
+        - reusable
+  - not:
+      required:
+        - no-map
+required:
+  - compatible
+  - reg
+unevaluatedProperties: false
+examples:
+  - |
+    reserved-memory {
+        #address-cells = <1>;
+        #size-cells = <1>;
+
+        memory@12340000 {
+            compatible = "carved-out";
+            reg = <0x12340000 0x00800000>;
+        };
+    };
--
2.49.0
Hi Rob,
On Fri, Jun 27, 2025 at 02:31:32PM -0500, Rob Herring wrote:
On Tue, Jun 17, 2025 at 02:25:40PM +0200, Maxime Ripard wrote:
Some parts of the memory can be dedicated to specific purposes and exposed as a dedicated memory allocator.
This is especially useful if that region has particular properties the rest of the memory doesn't have. For example, some platforms have their entire RAM covered by ECC but for a small area meant to be used by applications that don't need ECC and its associated overhead.
Let's introduce a binding to describe such a region and allow the OS to create a dedicated memory allocator for it.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
[...]
+title: Carved-out Memory Region
+description: |
Don't need '|'.
+  Specifies that the reserved memory region has been carved out of the
+  main memory allocator, and is intended to be used by the OS as a
+  dedicated memory allocator.
Other than the commit msg, it is completely lost that this is for ECC-less memory.
Because it's not. One of the first pieces of feedback I got was that the way to identify what a heap provides is the heap name.
So, as far as the binding goes, a heap just exposes a chunk of memory the memory allocator wouldn't use. The actual semantics of that chunk of memory don't matter.
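For instance, picking a heap by name from userspace boils down to something like this (rough sketch, error paths trimmed; the heap name is assumed to match the reserved-memory node name, e.g. "memory@12340000"):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-heap.h>

static int alloc_from_named_heap(const char *name, size_t len)
{
        struct dma_heap_allocation_data data = {
                .len = len,
                .fd_flags = O_RDWR | O_CLOEXEC,
        };
        char path[128];
        int heap_fd;

        snprintf(path, sizeof(path), "/dev/dma_heap/%s", name);
        heap_fd = open(path, O_RDWR | O_CLOEXEC);
        if (heap_fd < 0)
                return -1;

        if (ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data) < 0) {
                close(heap_fd);
                return -1;
        }

        close(heap_fd);

        /* data.fd is now a dma-buf backed by that specific region */
        return data.fd;
}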
This description applies to CMA area as well. So what's the difference?
Yeah, I kind of agree, which is why I initially started with a property, and you then asked for a compatible.
CMA (assuming you mean the allocator, not the CMA heap) is still more than that, though: it only covers some shared-dma-pool memory regions.
+maintainers:
+  - Maxime Ripard <mripard@kernel.org>

+properties:
+  compatible:
+    const: carved-out
Isn't everything in reserved-memory a carve-out for some purpose? I'm not sure if I'd add 'no ECC' or something more along the lines of how this is used. The latter might be useful on platforms which can't disable ECC or don't have ECC at all.
I don't think we need any discriminant for ECC vs non-ECC. It's just a carved-out memory region at some offset, and the system won't use it.
Maxime
On Mon, Jun 30, 2025 at 06:41:38PM +0200, Maxime Ripard wrote:
Hi Rob,
On Fri, Jun 27, 2025 at 02:31:32PM -0500, Rob Herring wrote:
On Tue, Jun 17, 2025 at 02:25:40PM +0200, Maxime Ripard wrote:
Some parts of the memory can be dedicated to specific purposes and exposed as a dedicated memory allocator.
This is especially useful if that region has particular properties the rest of the memory doesn't have. For example, some platforms have their entire RAM covered by ECC but for a small area meant to be used by applications that don't need ECC and its associated overhead.
Let's introduce a binding to describe such a region and allow the OS to create a dedicated memory allocator for it.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
[...]
+title: Carved-out Memory Region
+description: |
Don't need '|'.
+  Specifies that the reserved memory region has been carved out of the
+  main memory allocator, and is intended to be used by the OS as a
+  dedicated memory allocator.
Other than the commit msg, it is completely lost that this is for ECC-less memory.
Because it's not. One of the first pieces of feedback I got was that the way to identify what a heap provides is the heap name.
So, as far as the binding goes, a heap just exposes a chunk of memory the memory allocator wouldn't use. The actual semantics of that chunk of memory don't matter.
But they do, because you use one carve-out for one thing and another carve-out for another purpose, and they probably aren't interchangeable. For the most part, everything in /reserved-memory is a carve-out from regular memory, though we failed to enforce that.
This description applies to CMA area as well. So what's the difference?
Yeah, I kind of agree, which is why I initially started with a property, and you then asked for a compatible.
My issue with properties is that we have to support N factorial cases for combinations of N properties. It's already fragile. Whereas a compatible is (hopefully) well defined as to what's needed, and is only 1 more case to support.
Rob
On Mon, Jun 30, 2025 at 05:08:19PM -0500, Rob Herring wrote:
On Mon, Jun 30, 2025 at 06:41:38PM +0200, Maxime Ripard wrote:
Hi Rob,
On Fri, Jun 27, 2025 at 02:31:32PM -0500, Rob Herring wrote:
On Tue, Jun 17, 2025 at 02:25:40PM +0200, Maxime Ripard wrote:
Some parts of the memory can be dedicated to specific purposes and exposed as a dedicated memory allocator.
This is especially useful if that region has particular properties the rest of the memory doesn't have. For example, some platforms have their entire RAM covered by ECC but for a small area meant to be used by applications that don't need ECC and its associated overhead.
Let's introduce a binding to describe such a region and allow the OS to create a dedicated memory allocator for it.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
[...]
+title: Carved-out Memory Region
+description: |
Don't need '|'.
+  Specifies that the reserved memory region has been carved out of the
+  main memory allocator, and is intended to be used by the OS as a
+  dedicated memory allocator.
Other than the commit msg, it is completely lost that this is for ECC-less memory.
Because it's not. One of the first pieces of feedback I got was that the way to identify what a heap provides is the heap name.
So, as far as the binding goes, a heap just exposes a chunk of memory the memory allocator wouldn't use. The actual semantics of that chunk of memory don't matter.
But they do, because you use one carve-out for one thing and another carve-out for another purpose, and they probably aren't interchangeable.
That was also my initial thought, but it was then discussed that the name of the region is enough of a discriminant. And it makes sense, too: the name is already sufficient for the device tree to uniquely identify a given memory region on a given platform, so we don't really need anything else.
For the most part, everything in /reserved-memory is a carve-out from regular memory, though we failed to enforce that.
This description applies to CMA area as well. So what's the difference?
Yeah, I kind of agree, which is why I initially started with a property, and you then asked for a compatible.
My issue with properties is that we have to support N factorial cases for combinations of N properties. It's already fragile. Whereas a compatible is (hopefully) well defined as to what's needed, and is only 1 more case to support.
I think that's also what John especially wanted to avoid. If we have a generic compatible but the attributes/properties/whatever of the buffers allocated from that region differ (like ECC vs non-ECC, protected vs non-protected, etc.), we will need properties in the device tree to describe them too.
Maxime
Hi Rob,
On Tue, Jul 01, 2025 at 09:12:18AM +0200, Maxime Ripard wrote:
On Mon, Jun 30, 2025 at 05:08:19PM -0500, Rob Herring wrote:
On Mon, Jun 30, 2025 at 06:41:38PM +0200, Maxime Ripard wrote:
Hi Rob,
On Fri, Jun 27, 2025 at 02:31:32PM -0500, Rob Herring wrote:
On Tue, Jun 17, 2025 at 02:25:40PM +0200, Maxime Ripard wrote:
Some parts of the memory can be dedicated to specific purposes and exposed as a dedicated memory allocator.
This is especially useful if that region has particular properties the rest of the memory doesn't have. For example, some platforms have their entire RAM covered by ECC but for a small area meant to be used by applications that don't need ECC and its associated overhead.
Let's introduce a binding to describe such a region and allow the OS to create a dedicated memory allocator for it.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
[...]
+title: Carved-out Memory Region
+description: |
Don't need '|'.
+  Specifies that the reserved memory region has been carved out of the
+  main memory allocator, and is intended to be used by the OS as a
+  dedicated memory allocator.
Other than the commit msg, it is completely lost that this is for ECC-less memory.
Because it's not. One of the first pieces of feedback I got was that the way to identify what a heap provides is the heap name.
So, as far as the binding goes, a heap just exposes a chunk of memory the memory allocator wouldn't use. The actual semantics of that chunk of memory don't matter.
But they do, because you use one carve-out for one thing and another carve-out for another purpose, and they probably aren't interchangeable.
That was also my initial thought, but it was then discussed that the name of the region is enough of a discriminant. And it makes sense, too: the name is already sufficient for the device tree to uniquely identify a given memory region on a given platform, so we don't really need anything else.
For the most part, everything in /reserved-memory is a carve-out from regular memory, though we failed to enforce that.
This description applies to CMA area as well. So what's the difference?
Yeah, I kind of agree, which is why I initially started with a property, and you then asked for a compatible.
My issue with properties is that we have to support N factorial cases for combinations of N properties. It's already fragile. Whereas a compatible is (hopefully) well defined as to what's needed, and is only 1 more case to support.
I think that's also what John especially wanted to avoid. If we have a generic compatible but the attributes/properties/whatever of the buffers allocated from that region differ (like ECC vs non-ECC, protected vs non-protected, etc.), we will need properties in the device tree to describe them too.
I thought about it some more over the weekend, and if a compatible isn't a great solution either, would it make sense to just create a heap for each shared-dma-pool region, so that we wouldn't need any extra compatible or property?
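As an illustration, a pool like this one (made-up node) would then automatically get a matching heap, without any new compatible or property:

reserved-memory {
        #address-cells = <1>;
        #size-cells = <1>;

        display-pool@78000000 {
                compatible = "shared-dma-pool";
                reg = <0x78000000 0x800000>;
                no-map;
        };
};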
Maxime
Some reserved memory regions might have a particular memory setup or attributes that make them good candidates for heaps.
Let's provide a heap type that will create a new heap for each reserved memory region flagged as such.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
---
 drivers/dma-buf/heaps/Kconfig         |   8 +
 drivers/dma-buf/heaps/Makefile        |   1 +
 drivers/dma-buf/heaps/carveout_heap.c | 362 ++++++++++++++++++++++++++++++++++
 3 files changed, 371 insertions(+)
diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index a5eef06c422644e8aadaf5aff2bd9a33c49c1ba3..1ce4f6828d8c06bfdd7bc2e5127707f1778586e6 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -1,5 +1,13 @@
+config DMABUF_HEAPS_CARVEOUT
+        bool "DMA-BUF Carveout Heaps"
+        depends on DMABUF_HEAPS
+        help
+          Choose this option to enable the carveout dmabuf heap. The carveout
+          heap is backed by pages from reserved memory regions flagged as
+          exportable. If in doubt, say Y.
+
 config DMABUF_HEAPS_SYSTEM
         bool "DMA-BUF System Heap"
         depends on DMABUF_HEAPS
         help
           Choose this option to enable the system dmabuf heap. The system heap
diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 974467791032ffb8a7aba17b1407d9a19b3f3b44..b734647ad5c84f449106748160258e372f153df2 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_DMABUF_HEAPS_CARVEOUT) += carveout_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_SYSTEM) += system_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_CMA) += cma_heap.o
diff --git a/drivers/dma-buf/heaps/carveout_heap.c b/drivers/dma-buf/heaps/carveout_heap.c
new file mode 100644
index 0000000000000000000000000000000000000000..c01abc72c09d4f2a373462fded2856d975a36c8f
--- /dev/null
+++ b/drivers/dma-buf/heaps/carveout_heap.c
@@ -0,0 +1,362 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/dma-buf.h>
+#include <linux/dma-heap.h>
+#include <linux/genalloc.h>
+#include <linux/highmem.h>
+#include <linux/of_reserved_mem.h>
+
+struct carveout_heap_priv {
+        struct dma_heap *heap;
+        struct gen_pool *pool;
+};
+
+struct carveout_heap_buffer_priv {
+        struct mutex lock;
+        struct list_head attachments;
+
+        unsigned long num_pages;
+        struct carveout_heap_priv *heap;
+        phys_addr_t paddr;
+        void *vaddr;
+        unsigned int vmap_cnt;
+};
+
+struct carveout_heap_attachment {
+        struct list_head head;
+        struct sg_table table;
+
+        struct device *dev;
+        bool mapped;
+};
+
+static int carveout_heap_attach(struct dma_buf *buf,
+                                struct dma_buf_attachment *attachment)
+{
+        struct carveout_heap_buffer_priv *priv = buf->priv;
+        struct carveout_heap_attachment *a;
+        struct sg_table *sgt;
+        unsigned long len = priv->num_pages * PAGE_SIZE;
+        int ret;
+
+        a = kzalloc(sizeof(*a), GFP_KERNEL);
+        if (!a)
+                return -ENOMEM;
+        INIT_LIST_HEAD(&a->head);
+        a->dev = attachment->dev;
+        attachment->priv = a;
+
+        sgt = &a->table;
+        ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+        if (ret)
+                goto err_cleanup_attach;
+
+        sg_set_buf(sgt->sgl, priv->vaddr, len);
+
+        mutex_lock(&priv->lock);
+        list_add(&a->head, &priv->attachments);
+        mutex_unlock(&priv->lock);
+
+        return 0;
+
+err_cleanup_attach:
+        kfree(a);
+        return ret;
+}
+
+static void carveout_heap_detach(struct dma_buf *dmabuf,
+                                 struct dma_buf_attachment *attachment)
+{
+        struct carveout_heap_buffer_priv *priv = dmabuf->priv;
+        struct carveout_heap_attachment *a = attachment->priv;
+
+        mutex_lock(&priv->lock);
+        list_del(&a->head);
+        mutex_unlock(&priv->lock);
+
+        sg_free_table(&a->table);
+        kfree(a);
+}
+
+static struct sg_table *
+carveout_heap_map_dma_buf(struct dma_buf_attachment *attachment,
+                          enum dma_data_direction direction)
+{
+        struct carveout_heap_attachment *a = attachment->priv;
+        struct sg_table *table = &a->table;
+        int ret;
+
+        ret = dma_map_sgtable(a->dev, table, direction, 0);
+        if (ret)
+                return ERR_PTR(ret);
+
+        a->mapped = true;
+
+        return table;
+}
+
+static void carveout_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
+                                        struct sg_table *table,
+                                        enum dma_data_direction direction)
+{
+        struct carveout_heap_attachment *a = attachment->priv;
+
+        a->mapped = false;
+        dma_unmap_sgtable(a->dev, table, direction, 0);
+}
+
+static int
+carveout_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
+                                       enum dma_data_direction direction)
+{
+        struct carveout_heap_buffer_priv *priv = dmabuf->priv;
+        struct carveout_heap_attachment *a;
+        unsigned long len = priv->num_pages * PAGE_SIZE;
+
+        mutex_lock(&priv->lock);
+
+        if (priv->vmap_cnt)
+                invalidate_kernel_vmap_range(priv->vaddr, len);
+
+        list_for_each_entry(a, &priv->attachments, head) {
+                if (!a->mapped)
+                        continue;
+
+                dma_sync_sgtable_for_cpu(a->dev, &a->table, direction);
+        }
+
+        mutex_unlock(&priv->lock);
+
+        return 0;
+}
+
+static int
+carveout_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
+                                     enum dma_data_direction direction)
+{
+        struct carveout_heap_buffer_priv *priv = dmabuf->priv;
+        struct carveout_heap_attachment *a;
+        unsigned long len = priv->num_pages * PAGE_SIZE;
+
+        mutex_lock(&priv->lock);
+
+        if (priv->vmap_cnt)
+                flush_kernel_vmap_range(priv->vaddr, len);
+
+        list_for_each_entry(a, &priv->attachments, head) {
+                if (!a->mapped)
+                        continue;
+
+                dma_sync_sgtable_for_device(a->dev, &a->table, direction);
+        }
+
+        mutex_unlock(&priv->lock);
+
+        return 0;
+}
+
+static int carveout_heap_mmap(struct dma_buf *dmabuf,
+                              struct vm_area_struct *vma)
+{
+        struct carveout_heap_buffer_priv *priv = dmabuf->priv;
+        unsigned long len = priv->num_pages * PAGE_SIZE;
+
+        return vm_iomap_memory(vma, priv->paddr, len);
+}
+
+static int carveout_heap_vmap(struct dma_buf *dmabuf, struct iosys_map *map)
+{
+        struct carveout_heap_buffer_priv *priv = dmabuf->priv;
+        unsigned long len = priv->num_pages * PAGE_SIZE;
+
+        mutex_lock(&priv->lock);
+
+        if (!priv->vmap_cnt) {
+                void *vaddr = memremap(priv->paddr, len, MEMREMAP_WB);
+
+                if (!vaddr) {
+                        mutex_unlock(&priv->lock);
+                        return -ENOMEM;
+                }
+
+                priv->vaddr = vaddr;
+        }
+
+        WARN_ON(!priv->vaddr);
+        iosys_map_set_vaddr(map, priv->vaddr);
+        priv->vmap_cnt++;
+
+        mutex_unlock(&priv->lock);
+
+        return 0;
+}
+
+static void carveout_heap_vunmap(struct dma_buf *dmabuf, struct iosys_map *map)
+{
+        struct carveout_heap_buffer_priv *priv = dmabuf->priv;
+
+        mutex_lock(&priv->lock);
+
+        priv->vmap_cnt--;
+        if (!priv->vmap_cnt) {
+                memunmap(priv->vaddr);
+                priv->vaddr = NULL;
+        }
+
+        mutex_unlock(&priv->lock);
+
+        iosys_map_clear(map);
+}
+
+static void carveout_heap_dma_buf_release(struct dma_buf *buf)
+{
+        struct carveout_heap_buffer_priv *buffer_priv = buf->priv;
+        struct carveout_heap_priv *heap_priv = buffer_priv->heap;
+        unsigned long len = buffer_priv->num_pages * PAGE_SIZE;
+
+        gen_pool_free(heap_priv->pool, buffer_priv->paddr, len);
+        kfree(buffer_priv);
+}
+
+static const struct dma_buf_ops carveout_heap_buf_ops = {
+        .attach = carveout_heap_attach,
+        .detach = carveout_heap_detach,
+        .map_dma_buf = carveout_heap_map_dma_buf,
+        .unmap_dma_buf = carveout_heap_unmap_dma_buf,
+        .begin_cpu_access = carveout_heap_dma_buf_begin_cpu_access,
+        .end_cpu_access = carveout_heap_dma_buf_end_cpu_access,
+        .mmap = carveout_heap_mmap,
+        .vmap = carveout_heap_vmap,
+        .vunmap = carveout_heap_vunmap,
+        .release = carveout_heap_dma_buf_release,
+};
+
+static struct dma_buf *carveout_heap_allocate(struct dma_heap *heap,
+                                              unsigned long len,
+                                              u32 fd_flags,
+                                              u64 heap_flags)
+{
+        struct carveout_heap_priv *heap_priv = dma_heap_get_drvdata(heap);
+        struct carveout_heap_buffer_priv *buffer_priv;
+        DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
+        struct dma_buf *buf;
+        phys_addr_t paddr;
+        /* len is guaranteed to be page-aligned by the framework, so we can use it as is. */
+        size_t size = len;
+        int ret;
+
+        buffer_priv = kzalloc(sizeof(*buffer_priv), GFP_KERNEL);
+        if (!buffer_priv)
+                return ERR_PTR(-ENOMEM);
+
+        INIT_LIST_HEAD(&buffer_priv->attachments);
+        mutex_init(&buffer_priv->lock);
+
+        paddr = gen_pool_alloc(heap_priv->pool, size);
+        if (!paddr) {
+                ret = -ENOMEM;
+                goto err_free_buffer_priv;
+        }
+
+        buffer_priv->paddr = paddr;
+        buffer_priv->heap = heap_priv;
+        buffer_priv->num_pages = size >> PAGE_SHIFT;
+
+        /* create the dmabuf */
+        exp_info.exp_name = dma_heap_get_name(heap);
+        exp_info.ops = &carveout_heap_buf_ops;
+        exp_info.size = size;
+        exp_info.flags = fd_flags;
+        exp_info.priv = buffer_priv;
+
+        buf = dma_buf_export(&exp_info);
+        if (IS_ERR(buf)) {
+                ret = PTR_ERR(buf);
+                goto err_free_buffer;
+        }
+
+        return buf;
+
+err_free_buffer:
+        gen_pool_free(heap_priv->pool, paddr, len);
+err_free_buffer_priv:
+        kfree(buffer_priv);
+
+        return ERR_PTR(ret);
+}
+
+static const struct dma_heap_ops carveout_heap_ops = {
+        .allocate = carveout_heap_allocate,
+};
+
+static int __init carveout_heap_setup(struct device_node *node)
+{
+        struct dma_heap_export_info exp_info = {};
+        const struct reserved_mem *rmem;
+        struct carveout_heap_priv *priv;
+        struct dma_heap *heap;
+        struct gen_pool *pool;
+        int ret;
+
+        rmem = of_reserved_mem_lookup(node);
+        if (!rmem)
+                return -EINVAL;
+
+        priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+        if (!priv)
+                return -ENOMEM;
+
+        pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE);
+        if (!pool) {
+                ret = -ENOMEM;
+                goto err_cleanup_heap;
+        }
+        priv->pool = pool;
+
+        ret = gen_pool_add(pool, rmem->base, rmem->size, NUMA_NO_NODE);
+        if (ret)
+                goto err_release_mem_region;
+
+        exp_info.name = node->full_name;
+        exp_info.ops = &carveout_heap_ops;
+        exp_info.priv = priv;
+
+        heap = dma_heap_add(&exp_info);
+        if (IS_ERR(heap)) {
+                ret = PTR_ERR(heap);
+                goto err_release_mem_region;
+        }
+        priv->heap = heap;
+
+        return 0;
+
+err_release_mem_region:
+        gen_pool_destroy(pool);
+err_cleanup_heap:
+        kfree(priv);
+        return ret;
+}
+
+static int __init carveout_heap_init(void)
+{
+        struct device_node *rmem_node;
+        struct device_node *node;
+        int ret;
+
+        rmem_node = of_find_node_by_path("/reserved-memory");
+        if (!rmem_node)
+                return 0;
+
+        for_each_child_of_node(rmem_node, node) {
+                if (!of_device_is_compatible(node, "carved-out"))
+                        continue;
+
+                ret = carveout_heap_setup(node);
+                if (ret)
+                        return ret;
+        }
+
+        return 0;
+}
+
+module_init(carveout_heap_init);
On Tue, Jun 17, 2025 at 02:25:41PM +0200, Maxime Ripard wrote:
Some reserved memory regions might have a particular memory setup or attributes that make them good candidates for heaps.
Let's provide a heap type that will create a new heap for each reserved memory region flagged as such.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
[...]

+static int __init carveout_heap_init(void)
+{
+        struct device_node *rmem_node;
+        struct device_node *node;
+        int ret;
+
+        rmem_node = of_find_node_by_path("/reserved-memory");
+        if (!rmem_node)
+                return 0;
+
+        for_each_child_of_node(rmem_node, node) {
+                if (!of_device_is_compatible(node, "carved-out"))
+                        continue;
+
+                ret = carveout_heap_setup(node);
+                if (ret)
+                        return ret;
+        }
/reserved-memory nodes get a platform_device, so why not make this a driver?
Rob
On Fri, Jun 27, 2025 at 02:23:47PM -0500, Rob Herring wrote:
+static int __init carveout_heap_init(void)
+{
+        struct device_node *rmem_node;
+        struct device_node *node;
+        int ret;
+
+        rmem_node = of_find_node_by_path("/reserved-memory");
+        if (!rmem_node)
+                return 0;
+
+        for_each_child_of_node(rmem_node, node) {
+                if (!of_device_is_compatible(node, "carved-out"))
+                        continue;
+
+                ret = carveout_heap_setup(node);
+                if (ret)
+                        return ret;
+        }
/reserved-memory nodes get a platform_device, so why not make this a driver?
Because I never realised we could :)
Thanks, I'll fix it for the next version
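Something like this, I suppose (untested sketch; carveout_heap_setup() would also need to drop its __init annotation):

#include <linux/mod_devicetable.h>
#include <linux/platform_device.h>

/* carveout_heap_setup() as in the patch above, minus __init */

static int carveout_heap_probe(struct platform_device *pdev)
{
        return carveout_heap_setup(pdev->dev.of_node);
}

static const struct of_device_id carveout_heap_of_match[] = {
        { .compatible = "carved-out" },
        { /* sentinel */ }
};

static struct platform_driver carveout_heap_driver = {
        .probe = carveout_heap_probe,
        .driver = {
                .name = "carveout_heap",
                .of_match_table = carveout_heap_of_match,
        },
};
builtin_platform_driver(carveout_heap_driver);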
Maxime