On Fri, Jul 25, 2025 at 10:30:46AM -0600, Logan Gunthorpe wrote:
>
>
> On 2025-07-24 02:13, Leon Romanovsky wrote:
> > On Thu, Jul 24, 2025 at 10:03:13AM +0200, Christoph Hellwig wrote:
> >> On Wed, Jul 23, 2025 at 04:00:06PM +0300, Leon Romanovsky wrote:
> >>> From: Leon Romanovsky <leonro(a)nvidia.com>
> >>>
> >>> Export the pci_p2pdma_map_type() function to allow external modules
> >>> and subsystems to determine the appropriate mapping type for P2PDMA
> >>> transfers between a provider and target device.
> >>
> >> External modules have no business doing this.
> >
> > VFIO PCI code is built as module. There is no way to access PCI p2p code
> > without exporting functions in it.
>
> The solution that would make more sense to me would be for either
> dma_iova_try_alloc() or another helper in dma-iommu.c to handle the
> P2PDMA case. dma-iommu.c already uses those same interfaces and thus
> there would be no need to export the low level helpers from the p2pdma code.
I had the same idea in early versions of the DMA phys API discussion, and
it was pointed out (absolutely rightly) that this is a layering violation.
At that time the remark wasn't so clear to me, because the HMM code
performs the p2p check on every page and calls dma_iova_try_alloc()
before that check. But this VFIO DMABUF code shows it much more clearly:
the p2p check is performed before any DMA calls, and in the case of the
PCI_P2PDMA_MAP_BUS_ADDR p2p type between the DMABUF exporter device and
the DMABUF importer device, we don't call dma_iova_try_alloc() or any
DMA API at all.
So unfortunately, I think that dma*.c|h is not the right place for the
p2p type check.
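To make that concrete, a minimal sketch (the map-type names are the
existing p2pdma ones; the exact signature and the provider/importer
variables are placeholders, not taken verbatim from the patchset):

switch (pci_p2pdma_map_type(provider, importer_dev)) {
case PCI_P2PDMA_MAP_BUS_ADDR:
        /* Traffic stays on the PCI fabric between the two devices:
         * use bus addresses directly, no dma_iova_try_alloc() or any
         * other DMA API call. */
        break;
case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
        /* Routed through the host bridge: fall back to the regular
         * DMA mapping path. */
        break;
default:
        return -EINVAL; /* p2p not supported between these devices */
}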
Thanks
>
> Logan
>
On Thu, Jul 24, 2025 at 05:13:49AM +0000, Kasireddy, Vivek wrote:
> Hi Leon,
>
> > Subject: [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO
> > regions
> >
> > From: Leon Romanovsky <leonro(a)nvidia.com>
> >
> > Add support for exporting PCI device MMIO regions through dma-buf,
> > enabling safe sharing of non-struct page memory with controlled
> > lifetime management. This allows RDMA and other subsystems to import
> > dma-buf FDs and build them into memory regions for PCI P2P operations.
> >
> > The implementation provides a revocable attachment mechanism using
> > dma-buf move operations. MMIO regions are normally pinned as BARs
> > don't change physical addresses, but access is revoked when the VFIO
> > device is closed or a PCI reset is issued. This ensures kernel
> > self-defense against potentially hostile userspace.
> >
> > Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy(a)intel.com>
> > Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
> > ---
> > drivers/vfio/pci/Kconfig | 20 ++
> > drivers/vfio/pci/Makefile | 2 +
> > drivers/vfio/pci/vfio_pci_config.c | 22 +-
> > drivers/vfio/pci/vfio_pci_core.c | 25 ++-
> > drivers/vfio/pci/vfio_pci_dmabuf.c | 321 +++++++++++++++++++++++++++++
> > drivers/vfio/pci/vfio_pci_priv.h | 23 +++
> > include/linux/dma-buf.h | 1 +
> > include/linux/vfio_pci_core.h | 3 +
> > include/uapi/linux/vfio.h | 19 ++
> > 9 files changed, 431 insertions(+), 5 deletions(-)
> > create mode 100644 drivers/vfio/pci/vfio_pci_dmabuf.c
<...>
> > +static int validate_dmabuf_input(struct vfio_pci_core_device *vdev,
> > + struct vfio_device_feature_dma_buf *dma_buf)
> > +{
> > + struct pci_dev *pdev = vdev->pdev;
> > + u32 bar = dma_buf->region_index;
> > + u64 offset = dma_buf->offset;
> > + u64 len = dma_buf->length;
> > + resource_size_t bar_size;
> > + u64 sum;
> > +
> > + /*
> > + * For PCI the region_index is the BAR number like everything else.
> > + */
> > + if (bar >= VFIO_PCI_ROM_REGION_INDEX)
> > + return -ENODEV;
<...>
> > +/**
> > + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
> > + * regions selected.
> > + *
> > + * open_flags are the typical flags passed to open(2), eg O_RDWR, O_CLOEXEC,
> > + * etc. offset/length specify a slice of the region to create the dmabuf from.
> > + * nr_ranges is the total number of (P2P DMA) ranges that comprise the dmabuf.
> Any particular reason why you dropped the option (nr_ranges) of creating a
> single dmabuf from multiple ranges of an MMIO region?
I did it for two reasons. First, I wanted to simplify the code in order
to speed up discussion over the patchset itself. Second, I failed to
find justification for the need for multiple ranges: the number of BARs
is limited by VFIO_PCI_ROM_REGION_INDEX (6), and the same functionality
can be achieved by multiple calls to DMABUF import.
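For what it's worth, a hedged userspace sketch of that (using the uapi
quoted above; the wrapper name is made up and error handling is
omitted):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static int vfio_bar_to_dmabuf(int device_fd, __u32 bar, __u64 offset,
                              __u64 len)
{
        char buf[sizeof(struct vfio_device_feature) +
                 sizeof(struct vfio_device_feature_dma_buf)] = {};
        struct vfio_device_feature *feat = (void *)buf;
        struct vfio_device_feature_dma_buf *get = (void *)feat->data;

        feat->argsz = sizeof(buf);
        feat->flags = VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_DMA_BUF;
        get->region_index = bar;
        get->open_flags = O_RDWR | O_CLOEXEC;
        get->offset = offset;
        get->length = len;

        /* On success the ioctl returns a new dma-buf fd. */
        return ioctl(device_fd, VFIO_DEVICE_FEATURE, feat);
}

A scattered buffer would then be N such calls, one fd per range.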
>
> Restricting the dmabuf to a single range (or having to create multiple dmabufs
> to represent multiple regions/ranges associated with a single scattered buffer)
> would be very limiting and may not work in all cases. For instance, in my use-case,
> I am trying to share a large (4k mode) framebuffer (FB) located in the GPU's VRAM
> between two (p2p compatible) GPU devices. And this would probably not work,
> given that allocating a large contiguous FB (nr_ranges = 1) in VRAM may not be
> feasible when there is memory pressure.
Can you please point me to the place in the code where this can fail?
I'm probably missing something basic, as there are no large allocations
in the current patchset.
>
> Furthermore, since you are adding a new UAPI with this patch/feature, as you know,
> we cannot go back and tweak it (to add support for nr_ranges > 1) should there
> be a need in the future, whereas you can always use nr_ranges = 1 anytime. Therefore,
> I think it makes sense to be flexible in terms of the number of ranges to include
> while creating a dmabuf instead of restricting ourselves to one range.
I'm not a big fan of over-engineering. Let's first understand if this
case is needed.
Thanks
>
> Thanks,
> Vivek
>
> > + *
> > + * Return: The fd number on success, -1 and errno is set on failure.
> > + */
> > +#define VFIO_DEVICE_FEATURE_DMA_BUF 11
> > +
> > +struct vfio_device_feature_dma_buf {
> > + __u32 region_index;
> > + __u32 open_flags;
> > + __u64 offset;
> > + __u64 length;
> > +};
> > +
> > /* -------- API for Type1 VFIO IOMMU -------- */
> >
> > /**
> > --
> > 2.50.1
>
On 22-07-25, 15:46, Dmitry Baryshkov wrote:
> On Tue, Jul 22, 2025 at 05:50:08PM +0530, Jyothi Kumar Seerapu wrote:
> > On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
> > > On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
> > > > On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
> > > > > On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
[Folks, it would be nice to trim replies]
> > > > Could you please confirm whether we can go with a similar approach of
> > > > unmapping the processed TREs based on a fixed threshold or constant
> > > > value, instead of unmapping them all at once?
> > >
> > > I'd still say, that's a bad idea. Please stay within the boundaries of
> > > the DMA API.
> > >
> > I agree with the approach you suggested: it's the GPI's responsibility to
> > manage the available TREs.
> >
> > However, I'm curious whether we can set a dynamic watermark value (perhaps
> > half the available TREs) to trigger unmapping of processed TREs. This would
> > allow the software to prepare the next set of TREs while the hardware
> > continues processing the remaining ones, enabling better parallelism and
> > throughput.
>
> Let's land the simple implementation first; it can then be improved.
> However, I don't see any way to return 'above the watermark' from the DMA
> controller. You might need to enhance the API.
Traditionally, we set up the DMA transfers for a watermark level and get
an interrupt. So you might want to set the callback for the watermark
level and then do the mapping/unmapping etc. in the callback. This is the
typical model for dmaengines; we should follow it here as well.
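Something like this generic pattern (a hedged sketch; the xfer
bookkeeping and the chunk-sizing policy are made up, not taken from the
GPI driver):

/* Completion callback for one watermark-sized chunk. */
static void watermark_done(void *param)
{
        struct xfer_state *xfer = param;        /* hypothetical bookkeeping */

        /* The hardware has consumed these TREs, so unmap them now... */
        dma_unmap_sg(xfer->dev, xfer->done_sgl, xfer->done_nents,
                     DMA_MEM_TO_DEV);
        /* ...and queue the next chunk while the engine keeps processing
         * the TREs that are already in flight. */
        submit_next_chunk(xfer);                /* hypothetical */
}

        /* Submission side: one descriptor per chunk, with an interrupt
         * requested on each completion. */
        desc = dmaengine_prep_slave_sg(chan, chunk_sgl, chunk_nents,
                                       DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT);
        desc->callback = watermark_done;
        desc->callback_param = xfer;
        dmaengine_submit(desc);
        dma_async_issue_pending(chan);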
BR
--
~Vinod
The Arm Ethos-U65/85 NPUs are designed for edge AI inference
applications[0].
The driver works with Mesa Teflon; WIP support is available here[1]. The
UAPI should also be compatible with the downstream driver stack[2] and
the Vela compiler, though that has not been implemented.
Testing so far has been on i.MX93 boards with the Ethos-U65. Support for
the U85 is still to do; only minor changes on the driver side will be
needed for it.
A git tree is here[3].
Rob
[0] https://www.arm.com/products/silicon-ip-cpu?families=ethos%20npus
[1] https://gitlab.freedesktop.org/tomeu/mesa.git ethos
[2] https://gitlab.arm.com/artificial-intelligence/ethos-u/
[3] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git ethos
Signed-off-by: Rob Herring (Arm) <robh(a)kernel.org>
---
Rob Herring (Arm) (2):
dt-bindings: npu: Add Arm Ethos-U65/U85
accel: Add Arm Ethos-U NPU driver
.../devicetree/bindings/npu/arm,ethos.yaml | 79 +++
MAINTAINERS | 9 +
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 1 +
drivers/accel/ethos/Kconfig | 10 +
drivers/accel/ethos/Makefile | 4 +
drivers/accel/ethos/ethos_device.h | 186 ++++++
drivers/accel/ethos/ethos_drv.c | 412 ++++++++++++
drivers/accel/ethos/ethos_drv.h | 15 +
drivers/accel/ethos/ethos_gem.c | 707 +++++++++++++++++++++
drivers/accel/ethos/ethos_gem.h | 46 ++
drivers/accel/ethos/ethos_job.c | 527 +++++++++++++++
drivers/accel/ethos/ethos_job.h | 41 ++
include/uapi/drm/ethos_accel.h | 262 ++++++++
14 files changed, 2300 insertions(+)
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250715-ethos-3fdd39ef6f19
Best regards,
--
Rob Herring (Arm) <robh(a)kernel.org>
Hi Jeff,
On Monday, 21 July 2025, 16:55:01 Central European Summer Time, Jeff Hugo wrote:
> On 7/21/2025 3:17 AM, Tomeu Vizoso wrote:
> > This series adds a new driver for the NPU that Rockchip includes in its
> > newer SoCs, developed by them on the NVDLA base.
> >
> > In its current form, it supports the specific NPU in the RK3588 SoC.
> >
> > The userspace driver is part of Mesa and an initial draft can be found at:
> >
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698
> >
> > Signed-off-by: Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
>
> This (and the userspace component) appear ready for merge from what I
> can tell. Tomeu is still working on his drm-misc access so I've offered
> to merge on his behalf. Planning on waiting until Friday for any final
> feedback to come in before doing so.
Sounds great.
Just to make sure: you're planning to merge patches 1-6 (driver + binding)
into drm-misc, and I'll pick up the "arm64: dts: " patches 7-10 afterwards?
Heiko
On Mon, 21 Jul 2025 11:17:33 +0200, Tomeu Vizoso wrote:
> Add the bindings for the Neural Processing Unit IP from Rockchip.
>
> v2:
> - Adapt to new node structure (one node per core, each with its own
> IOMMU)
> - Several misc. fixes from Sebastian Reichel
>
> v3:
> - Split register block in its constituent subblocks, and only require
> the ones that the kernel would ever use (Nicolas Frattaroli)
> - Group supplies (Rob Herring)
> - Explain the way in which the top core is special (Rob Herring)
>
> v4:
> - Change required node name to npu@ (Rob Herring and Krzysztof Kozlowski)
> - Remove unneeded items: (Krzysztof Kozlowski)
> - Fix use of minItems/maxItems (Krzysztof Kozlowski)
> - Add reg-names to list of required properties (Krzysztof Kozlowski)
> - Fix example (Krzysztof Kozlowski)
>
> v5:
> - Rename file to rockchip,rk3588-rknn-core.yaml (Krzysztof Kozlowski)
> - Streamline compatible property (Krzysztof Kozlowski)
>
> v6:
> - Remove mention of NVDLA, as the hardware is only incidentally related
> (Kever Yang)
> - Mark pclk and npu clocks as required by all clocks (Rob Herring)
>
> v7:
> - Remove allOf section, not needed now that all nodes require 4 clocks
> (Heiko Stübner)
>
> v8:
> - Remove notion of top core (Robin Murphy)
>
> Signed-off-by: Sebastian Reichel <sebastian.reichel(a)collabora.com>
> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
> Tested-by: Heiko Stuebner <heiko(a)sntech.de>
> Signed-off-by: Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
> ---
> .../bindings/npu/rockchip,rk3588-rknn-core.yaml | 112 +++++++++++++++++++++
> 1 file changed, 112 insertions(+)
>
Reviewed-by: Rob Herring (Arm) <robh(a)kernel.org>
Hi,
Here's another attempt at supporting user-space allocations from a
specific carved-out reserved memory region.
The initial problem we were discussing was that I'm currently working on
a platform which has a memory layout with ECC enabled. However, enabling
ECC has a number of drawbacks on that platform: lower performance,
increased memory usage, etc. So for things like framebuffers the
trade-off isn't great, and thus there's a memory region with ECC disabled
to allocate from for such use cases.
After a suggestion from John, I chose to first start using heap
allocation flags to allow userspace to ask for a particular ECC setup.
This was backed by a new heap type that runs from reserved memory
chunks flagged as such, plus the existing DT properties to specify the
ECC properties.
After further discussion, it was considered that flags were not the
right solution, and relying on the names of the heaps would be enough to
let userspace know the kind of buffer it deals with.
Thus, even though the uAPI part of it was dropped in that second
version, we still needed a driver to create heaps out of carved-out
memory regions. In addition to the original use case, a similar driver
can be found in BSPs from most vendors, so I believe it would be a
useful addition to the kernel.
Some extra discussion with Rob Herring [1] came to the conclusion that
a specific compatible for this isn't great either, and as such a new
driver probably isn't called for.
Some other discussions we had with John [2] also dropped some hints that
multiple CMA heaps might be a good idea, and some vendors seem to do
that too.
So here's another attempt that doesn't affect the device tree at all and
will just create a heap for every CMA reserved memory region.
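Roughly, using the existing dma-buf heap API (a hedged sketch: the
series actually registers from the reserved-memory/dma contiguous side,
as noted in the v7 changelog below, and cma_heap_ops stands in for the
existing CMA heap ops):

#include <linux/cma.h>
#include <linux/dma-heap.h>
#include <linux/err.h>
#include <linux/module.h>

static int __init add_one_cma_heap(struct cma *cma, void *data)
{
        struct dma_heap_export_info exp_info = {
                .name = cma_get_name(cma),
                .ops  = &cma_heap_ops,
                .priv = cma,
        };

        return PTR_ERR_OR_ZERO(dma_heap_add(&exp_info));
}

static int __init cma_heaps_init(void)
{
        /* One dma-buf heap per CMA reserved memory region. */
        return cma_for_each_area(add_one_cma_heap, NULL);
}
module_init(cma_heaps_init);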
It also falls nicely into the current plan we have to support cgroups in
DRM/KMS and v4l2, which is an additional benefit.
Let me know what you think,
Maxime
1: https://lore.kernel.org/all/20250707-cobalt-dingo-of-serenity-dbf92c@houat/
2: https://lore.kernel.org/all/CANDhNCroe6ZBtN_o=c71kzFFaWK-fF5rCdnr9P5h1sgPOW…
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Changes in v7:
- Invert the logic and register CMA heap from the reserved memory /
dma contiguous code, instead of iterating over them from the CMA heap.
- Link to v6: https://lore.kernel.org/r/20250709-dma-buf-ecc-heap-v6-0-dac9bf80f35d@kerne…
Changes in v6:
- Drop the new driver and allocate a CMA heap for each region now
- Dropped the binding
- Rebased on 6.16-rc5
- Link to v5: https://lore.kernel.org/r/20250617-dma-buf-ecc-heap-v5-0-0abdc5863a4f@kerne…
Changes in v5:
- Rebased on 6.16-rc2
- Switch from property to dedicated binding
- Link to v4: https://lore.kernel.org/r/20250520-dma-buf-ecc-heap-v4-1-bd2e1f1bb42c@kerne…
Changes in v4:
- Rebased on 6.15-rc7
- Map buffers only when map is actually called, not at allocation time
- Deal with restricted-dma-pool and shared-dma-pool
- Reword Kconfig options
- Properly report dma_map_sgtable failures
- Link to v3: https://lore.kernel.org/r/20250407-dma-buf-ecc-heap-v3-0-97cdd36a5f29@kerne…
Changes in v3:
- Reworked global variable patch
- Link to v2: https://lore.kernel.org/r/20250401-dma-buf-ecc-heap-v2-0-043fd006a1af@kerne…
Changes in v2:
- Add vmap/vunmap operations
- Drop ECC flags uapi
- Rebase on top of 6.14
- Link to v1: https://lore.kernel.org/r/20240515-dma-buf-ecc-heap-v1-0-54cbbd049511@kerne…
---
Maxime Ripard (5):
doc: dma-buf: List the heaps by name
dma-buf: heaps: cma: Register list of CMA regions at boot
dma: contiguous: Register reusable CMA regions at boot
dma: contiguous: Reserve default CMA heap
dma-buf: heaps: cma: Create CMA heap for each CMA reserved region
Documentation/userspace-api/dma-buf-heaps.rst | 24 ++++++++------
MAINTAINERS | 1 +
drivers/dma-buf/heaps/Kconfig | 10 ------
drivers/dma-buf/heaps/cma_heap.c | 47 +++++++++++++++++----------
include/linux/dma-buf/heaps/cma.h | 16 +++++++++
kernel/dma/contiguous.c | 11 +++++++
6 files changed, 72 insertions(+), 37 deletions(-)
---
base-commit: 47633099a672fc7bfe604ef454e4f116e2c954b1
change-id: 20240515-dma-buf-ecc-heap-28a311d2c94e
prerequisite-message-id: <20250610131231.1724627-1-jkangas(a)redhat.com>
prerequisite-patch-id: bc44be5968feb187f2bc1b8074af7209462b18e7
prerequisite-patch-id: f02a91b723e5ec01fbfedf3c3905218b43d432da
prerequisite-patch-id: e944d0a3e22f2cdf4d3b3906e5603af934696deb
Best regards,
--
Maxime Ripard <mripard(a)kernel.org>