On Tue, Nov 18, 2025 at 04:06:11PM -0800, Nicolin Chen wrote:
On Tue, Nov 11, 2025 at 11:57:48AM +0200, Leon Romanovsky wrote:
From: Leon Romanovsky <leonro@nvidia.com>
Add dma_buf_map() and dma_buf_unmap() helpers to convert an array of MMIO physical address ranges into scatter-gather tables with proper DMA mapping.
These common functions are a starting point and support any PCI driver creating mappings from its BAR's MMIO addresses. VFIO is one case; RDMA will shortly be another. Existing DRM drivers can be reviewed and refactored separately. We hope this will evolve into routines that help common DRM code handle mixed CPU and MMIO mappings.
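For orientation, here is a rough sketch of how an exporting PCI driver might call such helpers from its attachment mapping path. The phys_vec layout, the dma_buf_dma holder, and the prototypes of dma_buf_map()/dma_buf_unmap() below are assumptions made purely for illustration, not the actual definitions from the patch:

	/* Hypothetical shapes, for illustration only */
	struct phys_vec {
		phys_addr_t paddr;
		size_t len;
	};

	struct dma_buf_dma {
		struct sg_table sgt;		/* DMA-only sg_table */
		struct dma_iova_state *state;	/* set when an IOVA was allocated */
	};

	static int example_attach_map(struct dma_buf_attachment *attach,
				      struct phys_vec *ranges, unsigned int nr,
				      struct dma_buf_dma *dma)
	{
		int ret;

		/* Build the DMA mapping for an array of BAR MMIO ranges */
		ret = dma_buf_map(attach, ranges, nr, dma, DMA_BIDIRECTIONAL);
		if (ret)
			return ret;

		/* ... program the importer's DMA engine from dma->sgt ... */

		dma_buf_unmap(attach, dma, DMA_BIDIRECTIONAL);
		return 0;
	}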
Compared to the dma_map_resource() abuse, this implementation handles the complicated PCI P2P scenarios properly, especially when an IOMMU is enabled (a condensed sketch of the decision logic follows the list):

 - Direct bus address mapping without IOVA allocation for
   PCI_P2PDMA_MAP_BUS_ADDR, using pci_p2pdma_bus_addr_map(). This happens
   if the IOMMU is enabled but the PCIe switch ACS flags allow
   transactions to avoid the host bridge.

   Further, this handles the slightly obscure case of MMIO with a
   phys_addr_t that differs from the physical BAR programming (bus
   offset). The phys_addr_t is converted to a dma_addr_t that
   accommodates this offset. This enables certain real systems to work,
   especially on ARM platforms.

 - Mapping through the host bridge with IOVA allocation and the
   DMA_ATTR_MMIO attribute for MMIO memory regions
   (PCI_P2PDMA_MAP_THRU_HOST_BRIDGE). This happens when the IOMMU is
   enabled and the ACS flags force all traffic through the IOMMU, i.e.
   for virtualization systems.

 - Cases where P2P is not supported through the host bridge/CPU. The P2P
   subsystem is the proper place to detect this and block it.
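Here is that condensed sketch. The pci_p2pdma_map_type() / pci_p2pdma_bus_addr_map() signatures, 'struct p2pdma_provider', and the example_map_one() wrapper are assumptions for illustration, and the exact call sites in the patch may differ:

	static int example_map_one(struct dma_buf_attachment *attach,
				   struct p2pdma_provider *provider,
				   phys_addr_t phys, size_t len,
				   enum dma_data_direction dir,
				   dma_addr_t *out)
	{
		switch (pci_p2pdma_map_type(provider, attach->dev)) {
		case PCI_P2PDMA_MAP_BUS_ADDR:
			/* ACS lets P2P bypass the host bridge: no IOVA, apply the bus offset */
			*out = pci_p2pdma_bus_addr_map(provider, phys);
			return 0;
		case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
			/* Traffic is forced through the IOMMU: allocate IOVA, mark as MMIO */
			*out = dma_map_phys(attach->dev, phys, len, dir, DMA_ATTR_MMIO);
			return dma_mapping_error(attach->dev, *out) ? -ENOMEM : 0;
		default:
			/* P2P through this host bridge/CPU is not supported */
			return -EINVAL;
		}
	}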
Helper functions fill_sg_entry() and calc_sg_nents() handle the scatter-gather table construction, splitting large regions into UINT_MAX-sized chunks to fit within sg->length field limits.
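A minimal sketch of that splitting, assuming the helper names from the description above; each SG entry is capped at UINT_MAX bytes because the sg->length field is 32 bits wide:

	/* How many SG entries a region of 'length' bytes needs */
	static unsigned int calc_sg_nents(u64 length)
	{
		return DIV_ROUND_UP_ULL(length, UINT_MAX);
	}

	/* Fill consecutive entries with up to UINT_MAX bytes of one DMA range */
	static struct scatterlist *fill_sg_entry(struct scatterlist *sgl,
						 u64 length, dma_addr_t addr)
	{
		while (length) {
			unsigned int len = min_t(u64, length, UINT_MAX);

			sg_dma_address(sgl) = addr;
			sg_dma_len(sgl) = len;
			addr += len;
			length -= len;
			sgl = sg_next(sgl);
		}
		return sgl;
	}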
Since the physical-address-based DMA API forbids use of the CPU side of the scatterlist, these helpers produce a mangled scatterlist whose CPU list is empty: it reports zero length and every struct page pointer is NULL with a zero size. This is stronger and more robust than the existing mangle_sg_table() technique. Migrating DMABUF as a subsystem away from scatterlist for this data structure is a future project.
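A hedged illustration of what an importer would observe on such a table, assuming (per the description) that the CPU-visible side is left unusable while only the DMA side is populated; 'sgt' stands for the returned sg_table:

	struct scatterlist *sgl;
	int i;

	/* CPU side is deliberately unusable: no pages, zero lengths */
	for_each_sgtable_sg(sgt, sgl, i)
		WARN_ON(sg_page(sgl) || sgl->length);

	/* DMA side: the only valid view of the mapping */
	for_each_sgtable_dma_sg(sgt, sgl, i)
		pr_debug("entry %d: dma %pad len %u\n",
			 i, &sg_dma_address(sgl), sg_dma_len(sgl));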
Tested-by: Alex Mastro <amastro@fb.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
With a nit:
+err_unmap_dma:
+	if (!i || !dma->state) {
+		; /* Do nothing */
+	} else if (dma_use_iova(dma->state)) {
+		dma_iova_destroy(attach->dev, dma->state, mapped_len, dir,
+				 DMA_ATTR_MMIO);
+	} else {
+		for_each_sgtable_dma_sg(&dma->sgt, sgl, i)
+			dma_unmap_phys(attach->dev, sg_dma_address(sgl),
+				       sg_dma_len(sgl), dir, DMA_ATTR_MMIO);

Would it be safer to skip dma_unmap_phys() for the range [i, nents)?
The range [i, nents) is not supposed to be in the SG list that we are iterating here.
Thanks