On Tue, Nov 18, 2025 at 04:06:11PM -0800, Nicolin Chen wrote:
On Tue, Nov 11, 2025 at 11:57:48AM +0200, Leon Romanovsky wrote:
From: Leon Romanovsky <leonro@nvidia.com>
Add dma_buf_map() and dma_buf_unmap() helpers to convert an array of MMIO physical address ranges into scatter-gather tables with proper DMA mapping.
These common functions are a starting point and support any PCI driver creating mappings from its BAR's MMIO addresses. VFIO is one case; RDMA will shortly be another. Existing DRM drivers can be reviewed and refactored separately. We hope this will evolve into routines that help common DRM code handle mixed CPU and MMIO mappings.
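For orientation, here is a rough sketch of how an exporting PCI driver might call such helpers from its attachment mapping path. The phys_vec layout, the dma_buf_dma holder, and the prototypes of dma_buf_map()/dma_buf_unmap() below are assumptions made purely for illustration, not the actual definitions from the patch:

	/* Hypothetical shapes, for illustration only */
	struct phys_vec {
		phys_addr_t paddr;
		size_t len;
	};

	struct dma_buf_dma {
		struct sg_table sgt;		/* DMA-only sg_table */
		struct dma_iova_state *state;	/* set when an IOVA was allocated */
	};

	static int example_attach_map(struct dma_buf_attachment *attach,
				      struct phys_vec *ranges, unsigned int nr,
				      struct dma_buf_dma *dma)
	{
		int ret;

		/* Build the DMA mapping for an array of BAR MMIO ranges */
		ret = dma_buf_map(attach, ranges, nr, dma, DMA_BIDIRECTIONAL);
		if (ret)
			return ret;

		/* ... program the importer's DMA engine from dma->sgt ... */

		dma_buf_unmap(attach, dma, DMA_BIDIRECTIONAL);
		return 0;
	}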
Compared to the dma_map_resource() abuse, this implementation handles the complicated PCI P2P scenarios properly, especially when an IOMMU is enabled (a condensed sketch of the decision logic follows the list):

 - Direct bus address mapping without IOVA allocation for
   PCI_P2PDMA_MAP_BUS_ADDR, using pci_p2pdma_bus_addr_map(). This happens
   if the IOMMU is enabled but the PCIe switch ACS flags allow
   transactions to avoid the host bridge.

   Further, this handles the slightly obscure case of MMIO with a
   phys_addr_t that differs from the physical BAR programming (bus
   offset). The phys_addr_t is converted to a dma_addr_t that
   accommodates this offset. This enables certain real systems to work,
   especially on ARM platforms.

 - Mapping through the host bridge with IOVA allocation and the
   DMA_ATTR_MMIO attribute for MMIO memory regions
   (PCI_P2PDMA_MAP_THRU_HOST_BRIDGE). This happens when the IOMMU is
   enabled and the ACS flags force all traffic through the IOMMU, i.e.
   for virtualization systems.

 - Cases where P2P is not supported through the host bridge/CPU. The P2P
   subsystem is the proper place to detect this and block it.
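Here is that condensed sketch. The pci_p2pdma_map_type() / pci_p2pdma_bus_addr_map() signatures, 'struct p2pdma_provider', and the example_map_one() wrapper are assumptions for illustration, and the exact call sites in the patch may differ:

	static int example_map_one(struct dma_buf_attachment *attach,
				   struct p2pdma_provider *provider,
				   phys_addr_t phys, size_t len,
				   enum dma_data_direction dir,
				   dma_addr_t *out)
	{
		switch (pci_p2pdma_map_type(provider, attach->dev)) {
		case PCI_P2PDMA_MAP_BUS_ADDR:
			/* ACS lets P2P bypass the host bridge: no IOVA, apply the bus offset */
			*out = pci_p2pdma_bus_addr_map(provider, phys);
			return 0;
		case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
			/* Traffic is forced through the IOMMU: allocate IOVA, mark as MMIO */
			*out = dma_map_phys(attach->dev, phys, len, dir, DMA_ATTR_MMIO);
			return dma_mapping_error(attach->dev, *out) ? -ENOMEM : 0;
		default:
			/* P2P through this host bridge/CPU is not supported */
			return -EINVAL;
		}
	}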
Helper functions fill_sg_entry() and calc_sg_nents() handle the scatter-gather table construction, splitting large regions into UINT_MAX-sized chunks to fit within sg->length field limits.
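A minimal sketch of that splitting, assuming the helper names from the description above; each SG entry is capped at UINT_MAX bytes because the sg->length field is 32 bits wide:

	/* How many SG entries a region of 'length' bytes needs */
	static unsigned int calc_sg_nents(u64 length)
	{
		return DIV_ROUND_UP_ULL(length, UINT_MAX);
	}

	/* Fill consecutive entries with up to UINT_MAX bytes of one DMA range */
	static struct scatterlist *fill_sg_entry(struct scatterlist *sgl,
						 u64 length, dma_addr_t addr)
	{
		while (length) {
			unsigned int len = min_t(u64, length, UINT_MAX);

			sg_dma_address(sgl) = addr;
			sg_dma_len(sgl) = len;
			addr += len;
			length -= len;
			sgl = sg_next(sgl);
		}
		return sgl;
	}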
Since the physical-address-based DMA API forbids use of the CPU side of the scatterlist, these helpers produce a mangled scatterlist whose CPU list is empty: it reports zero length and every struct page pointer is NULL with a zero size. This is stronger and more robust than the existing mangle_sg_table() technique. Migrating DMABUF as a subsystem away from scatterlist for this data structure is a future project.
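A hedged illustration of what an importer would observe on such a table, assuming (per the description) that the CPU-visible side is left unusable while only the DMA side is populated; 'sgt' stands for the returned sg_table:

	struct scatterlist *sgl;
	int i;

	/* CPU side is deliberately unusable: no pages, zero lengths */
	for_each_sgtable_sg(sgt, sgl, i)
		WARN_ON(sg_page(sgl) || sgl->length);

	/* DMA side: the only valid view of the mapping */
	for_each_sgtable_dma_sg(sgt, sgl, i)
		pr_debug("entry %d: dma %pad len %u\n",
			 i, &sg_dma_address(sgl), sg_dma_len(sgl));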
Tested-by: Alex Mastro <amastro@fb.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
With a nit:
+err_unmap_dma:
+	if (!i || !dma->state) {
+		; /* Do nothing */
+	} else if (dma_use_iova(dma->state)) {
+		dma_iova_destroy(attach->dev, dma->state, mapped_len, dir,
+				 DMA_ATTR_MMIO);
+	} else {
+		for_each_sgtable_dma_sg(&dma->sgt, sgl, i)
+			dma_unmap_phys(attach->dev, sg_dma_address(sgl),
+				       sg_dma_len(sgl), dir, DMA_ATTR_MMIO);

Would it be safer to skip dma_unmap_phys() for the range [i, nents)?
The range [i, nents) is not supposed to be in the SG list that we are iterating here.
Thanks