This series is the start of adding full DMABUF support to
iommufd. Currently it is limited to only work with VFIO's DMABUF exporter.
It sits on top of Leon's series to add a DMABUF exporter to VFIO:
https://lore.kernel.org/all/20251120-dmabuf-vfio-v9-0-d7f71607f371@nvidia.c…
The existing IOMMU_IOAS_MAP_FILE is enhanced to detect DMABUF FDs, but
otherwise works the same as it does today for a memfd. The user can select
a slice of the FD to map into the ioas, and if the underlying alignment
requirements are met it will be placed in the iommu_domain.
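To make the slice-mapping idea concrete, here is a small userspace-style sketch. The struct layout and field names below are illustrative assumptions only; the real uAPI struct lives in include/uapi/linux/iommufd.h and may differ.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of the IOMMU_IOAS_MAP_FILE flow described above.
 * Field names and layout are assumptions, not the real uAPI.
 */
struct ioas_map_file_args {
	uint32_t size;
	uint32_t flags;
	uint32_t ioas_id;
	int32_t  fd;       /* memfd or DMABUF fd */
	uint64_t start;    /* offset of the slice within the FD */
	uint64_t length;   /* length of the slice */
	uint64_t iova;     /* where to place it in the ioas */
};

/* The slice must meet the iommu_domain's alignment to be mapped. */
static int slice_is_aligned(uint64_t start, uint64_t length, uint64_t pgsz)
{
	return !(start & (pgsz - 1)) && !(length & (pgsz - 1));
}

/* Fill the arguments for mapping a slice of a DMABUF into an ioas. */
static void prep_map(struct ioas_map_file_args *a, uint32_t ioas_id,
		     int dmabuf_fd, uint64_t start, uint64_t length,
		     uint64_t iova)
{
	a->size = sizeof(*a);
	a->flags = 0;
	a->ioas_id = ioas_id;
	a->fd = dmabuf_fd;
	a->start = start;
	a->length = length;
	a->iova = iova;
	/* Userspace would now call ioctl(iommufd, IOMMU_IOAS_MAP_FILE, a). */
}
```

The same call path is used for a memfd today; only the FD type detection differs.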
Though limited, it is enough to allow a VMM like QEMU to connect MMIO BAR
memory from VFIO to an iommu_domain controlled by iommufd. This is used
for PCI Peer to Peer support in VMs, and is the last feature that the VFIO
type 1 container has that iommufd couldn't do.
The VFIO type1 version extracts raw PFNs from VMAs, which has no lifetime
control and is a use-after-free security problem.
Instead iommufd relies on revocable DMABUFs. Whenever VFIO thinks there
should be no access to the MMIO it can shoot down the mapping in iommufd,
which will unmap it from the iommu_domain. There is no automatic remap;
this is a safety protocol so the kernel doesn't get stuck. Userspace is
expected to know when it is doing something that will revoke the DMABUF and
to map/unmap around the activity. E.g. when QEMU goes to issue FLR it
should do the map/unmap to iommufd.
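The protocol can be modeled with a toy state machine. This is purely illustrative of the ordering described above; none of the names here are real iommufd or VFIO APIs.

```c
/*
 * Illustrative model of the revoke protocol -- not a real API.
 * Userspace unmaps before an action that revokes the DMABUF (e.g. FLR)
 * and maps again afterwards; the kernel never remaps automatically.
 */
enum map_state { UNMAPPED, MAPPED, REVOKED };

/* VFIO revokes: any existing mapping is shot down, never restored. */
static enum map_state do_flr(enum map_state s)
{
	return (s == MAPPED) ? REVOKED : s;   /* mapping is gone for good */
}

/* A revoked mapping stays unusable until userspace maps again. */
static int usable(enum map_state s)
{
	return s == MAPPED;
}

/* The sequence a VMM is expected to follow around FLR. */
static enum map_state flr_with_protocol(enum map_state s)
{
	s = UNMAPPED;        /* unmap from the ioas before the reset */
	s = do_flr(s);       /* revoke fires while nothing is mapped */
	s = MAPPED;          /* map the DMABUF again afterwards */
	return s;
}
```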
Since DMABUF is missing some key general features for this use case it
relies on a "private interconnect" between VFIO and iommufd via the
vfio_pci_dma_buf_iommufd_map() call.
The call confirms the DMABUF has revoke semantics and delivers a phys_addr
for the memory suitable for use with iommu_map().
Medium term there is a desire to expand the supported DMABUFs to include
GPU drivers to support DPDK/SPDK type use cases so future series will work
to add a general concept of revoke and a general negotiation of
interconnect to remove vfio_pci_dma_buf_iommufd_map().
I also plan another series to modify iommufd's vfio_compat to
transparently pull a dmabuf out of a VFIO VMA to emulate more of the uAPI
of type1.
The latest series for interconnect negotiation to exchange a phys_addr is:
https://lore.kernel.org/r/20251027044712.1676175-1-vivek.kasireddy@intel.com
And the discussion for design of revoke is here:
https://lore.kernel.org/dri-devel/20250114173103.GE5556@nvidia.com/
This is on github: https://github.com/jgunthorpe/linux/commits/iommufd_dmabuf
v2:
- Rebase on Leon's v9
- Fix mislocking in an iopt_fill_domain() error path
- Revise the comments around how the sub page offset works
- Remove a useless WARN_ON in iopt_pages_rw_access()
- Fixed missed memory free in the selftest
v1: https://patch.msgid.link/r/0-v1-64bed2430cdb+31b-iommufd_dmabuf_jgg@nvidia.…
Jason Gunthorpe (9):
vfio/pci: Add vfio_pci_dma_buf_iommufd_map()
iommufd: Add DMABUF to iopt_pages
iommufd: Do not map/unmap revoked DMABUFs
iommufd: Allow a DMABUF to be revoked
iommufd: Allow MMIO pages in a batch
iommufd: Have pfn_reader process DMABUF iopt_pages
iommufd: Have iopt_map_file_pages convert the fd to a file
iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE
iommufd/selftest: Add some tests for the dmabuf flow
drivers/iommu/iommufd/io_pagetable.c | 78 +++-
drivers/iommu/iommufd/io_pagetable.h | 54 ++-
drivers/iommu/iommufd/ioas.c | 8 +-
drivers/iommu/iommufd/iommufd_private.h | 14 +-
drivers/iommu/iommufd/iommufd_test.h | 10 +
drivers/iommu/iommufd/main.c | 10 +
drivers/iommu/iommufd/pages.c | 414 ++++++++++++++++--
drivers/iommu/iommufd/selftest.c | 143 ++++++
drivers/vfio/pci/vfio_pci_dmabuf.c | 34 ++
include/linux/vfio_pci_core.h | 4 +
tools/testing/selftests/iommu/iommufd.c | 43 ++
tools/testing/selftests/iommu/iommufd_utils.h | 44 ++
12 files changed, 786 insertions(+), 70 deletions(-)
base-commit: f836737ed56db9e2d5b047c56a31e05af0f3f116
--
2.43.0
The drm/ttm patch modifies TTM to support multiple contexts for the
pipelined moves. Then amdgpu/ttm is updated to express dependencies between
jobs explicitly, instead of relying on the ordering of execution guaranteed
by the use of a single instance.
With all of this in place, we can use multiple entities, with each having access
to the available SDMA instances.
This rework also gives the opportunity to merge the clear functions into a
single one and to optimize GART usage a bit.
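The later "round robin through clear_entities" patch distributes work across the entities. A minimal sketch of that distribution pattern, with assumed names and counter scheme (the real code in amdgpu_ttm.c may differ):

```c
#include <stdint.h>

/*
 * Illustration of round-robin entity selection, as in
 * "drm/amdgpu: round robin through clear_entities in amdgpu_fill_buffer".
 * Names and the counter scheme are assumptions, not the actual code.
 */
#define NUM_CLEAR_ENTITIES 4

struct clear_pool {
	uint32_t next;                    /* monotonically increasing */
	int entity[NUM_CLEAR_ENTITIES];   /* stand-ins for drm_sched_entity */
};

/*
 * Spread jobs across entities so independent clears can run on
 * different SDMA instances instead of serializing on one.
 */
static int pick_clear_entity(struct clear_pool *p)
{
	return p->next++ % NUM_CLEAR_ENTITIES;
}
```

With dependencies between jobs now expressed explicitly via fences, correctness no longer relies on all jobs going through one entity.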
(The first patch of the series has already been merged through drm-misc but I'm
including it here to reduce conflicts)
For v3 I've kept the series as a whole but I've reorganized the patches so that
everything up to the drm/ttm change can be merged through amd-staging-drm-next
once reviewed.
v3:
- shuffled the patches: everything up to the drm/ttm patch has no dependency
on the ttm change and can be merged independently
- split "drm/amdgpu: pass the entity to use to ttm functions" in 2 commits
- moved AMDGPU_GTT_NUM_TRANSFER_WINDOWS removal to its own commit
- added a ttm job submission helper
- addressed comments from Christian and Felix
v2:
- addressed comments from Christian
- dropped "drm/amdgpu: prepare amdgpu_fill_buffer to use N entities" and
"drm/amdgpu: use multiple entities in amdgpu_fill_buffer"
- added "drm/admgpu: handle resv dependencies in amdgpu_ttm_map_buffer",
"drm/amdgpu: round robin through clear_entities in amdgpu_fill_buffer"
- reworked how sdma rings/scheds are passed to amdgpu_ttm
v1: https://lists.freedesktop.org/archives/dri-devel/2025-November/534517.html
Pierre-Eric Pelloux-Prayer (28):
drm/amdgpu: give each kernel job a unique id
drm/amdgpu: use ttm_resource_manager_cleanup
drm/amdgpu: remove direct_submit arg from amdgpu_copy_buffer
drm/amdgpu: remove the ring param from ttm functions
drm/amdgpu: introduce amdgpu_ttm_buffer_entity
drm/amdgpu: add amdgpu_ttm_job_submit helper
drm/amdgpu: fix error handling in amdgpu_copy_buffer
drm/amdgpu: pass the entity to use to amdgpu_ttm_map_buffer
drm/amdgpu: pass the entity to use to ttm public functions
drm/amdgpu: add amdgpu_device argument to ttm functions that need it
drm/amdgpu: statically assign gart windows to ttm entities
drm/amdgpu: remove AMDGPU_GTT_NUM_TRANSFER_WINDOWS
drm/amdgpu: add missing lock when using ttm entities
drm/amdgpu: check entity lock is held in amdgpu_ttm_job_submit
drm/amdgpu: double AMDGPU_GTT_MAX_TRANSFER_SIZE
drm/amdgpu: use larger gart window when possible
drm/amdgpu: introduce amdgpu_sdma_set_vm_pte_scheds
drm/amdgpu: move sched status check inside
amdgpu_ttm_set_buffer_funcs_status
drm/ttm: rework pipelined eviction fence handling
drm/amdgpu: allocate multiple clear entities
drm/amdgpu: allocate multiple move entities
drm/amdgpu: round robin through clear_entities in amdgpu_fill_buffer
drm/amdgpu: use TTM_NUM_MOVE_FENCES when reserving fences
drm/amdgpu: use multiple entities in amdgpu_move_blit
drm/amdgpu: pass all the sdma scheds to amdgpu_mman
drm/amdgpu: give ttm entities access to all the sdma scheds
drm/amdgpu: get rid of amdgpu_ttm_clear_buffer
drm/amdgpu: rename amdgpu_fill_buffer as amdgpu_ttm_clear_buffer
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 +
drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 14 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 19 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 16 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 493 +++++++++++-------
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 58 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 11 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 26 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c | 4 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 4 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 12 +-
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 34 +-
drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 34 +-
drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 34 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 41 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 41 +-
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 37 +-
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 37 +-
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 32 +-
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 32 +-
drivers/gpu/drm/amd/amdgpu/si_dma.c | 34 +-
drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 6 +-
drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 6 +-
drivers/gpu/drm/amd/amdgpu/vce_v1_0.c | 12 +-
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 33 +-
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 +-
.../amd/display/amdgpu_dm/amdgpu_dm_plane.c | 6 +-
.../drm/amd/display/amdgpu_dm/amdgpu_dm_wb.c | 6 +-
.../gpu/drm/ttm/tests/ttm_bo_validate_test.c | 11 +-
drivers/gpu/drm/ttm/tests/ttm_resource_test.c | 5 +-
drivers/gpu/drm/ttm/ttm_bo.c | 47 +-
drivers/gpu/drm/ttm/ttm_bo_util.c | 38 +-
drivers/gpu/drm/ttm/ttm_resource.c | 31 +-
include/drm/ttm/ttm_resource.h | 29 +-
47 files changed, 706 insertions(+), 615 deletions(-)
--
2.43.0
On Thu, Nov 20, 2025 at 05:04:13PM -0700, Alex Williamson wrote:
> @@ -2501,7 +2501,7 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
> err_undo:
> list_for_each_entry_from_reverse(vdev, &dev_set->device_list,
> vdev.dev_set_list) {
> - if (__vfio_pci_memory_enabled(vdev))
> + if (vdev->vdev.open_count && __vfio_pci_memory_enabled(vdev))
> vfio_pci_dma_buf_move(vdev, false);
> up_write(&vdev->memory_lock);
> }
>
> Any other suggestions? This should be the only reset path with this
> nuance of affecting non-opened devices. Thanks,
Seems reasonable, but should it be in __vfio_pci_memory_enabled() just
to be robust?
Jason
This series is the start of adding full DMABUF support to
iommufd. Currently it is limited to only work with VFIO's DMABUF exporter.
It sits on top of Leon's series to add a DMABUF exporter to VFIO:
https://lore.kernel.org/r/20251106-dmabuf-vfio-v7-0-2503bf390699@nvidia.com
The existing IOMMU_IOAS_MAP_FILE is enhanced to detect DMABUF FDs, but
otherwise works the same as it does today for a memfd. The user can select
a slice of the FD to map into the ioas, and if the underlying alignment
requirements are met it will be placed in the iommu_domain.
Though limited, it is enough to allow a VMM like QEMU to connect MMIO BAR
memory from VFIO to an iommu_domain controlled by iommufd. This is used
for PCI Peer to Peer support in VMs, and is the last feature that the VFIO
type 1 container has that iommufd couldn't do.
The VFIO type1 version extracts raw PFNs from VMAs, which has no lifetime
control and is a use-after-free security problem.
Instead iommufd relies on revocable DMABUFs. Whenever VFIO thinks there
should be no access to the MMIO it can shoot down the mapping in iommufd,
which will unmap it from the iommu_domain. There is no automatic remap;
this is a safety protocol so the kernel doesn't get stuck. Userspace is
expected to know when it is doing something that will revoke the DMABUF and
to map/unmap around the activity. E.g. when QEMU goes to issue FLR it
should do the map/unmap to iommufd.
Since DMABUF is missing some key general features for this use case it
relies on a "private interconnect" between VFIO and iommufd via the
vfio_pci_dma_buf_iommufd_map() call.
The call confirms the DMABUF has revoke semantics and delivers a phys_addr
for the memory suitable for use with iommu_map().
Medium term there is a desire to expand the supported DMABUFs to include
GPU drivers to support DPDK/SPDK type use cases so future series will work
to add a general concept of revoke and a general negotiation of
interconnect to remove vfio_pci_dma_buf_iommufd_map().
I also plan another series to modify iommufd's vfio_compat to
transparently pull a dmabuf out of a VFIO VMA to emulate more of the uAPI
of type1.
The latest series for interconnect negotiation to exchange a phys_addr is:
https://lore.kernel.org/r/20251027044712.1676175-1-vivek.kasireddy@intel.com
And the discussion for design of revoke is here:
https://lore.kernel.org/dri-devel/20250114173103.GE5556@nvidia.com/
This is on github: https://github.com/jgunthorpe/linux/commits/iommufd_dmabuf
v2:
- Rebase on Leon's v7
- Fix mislocking in an iopt_fill_domain() error path
v1: https://patch.msgid.link/r/0-v1-64bed2430cdb+31b-iommufd_dmabuf_jgg@nvidia.…
Jason Gunthorpe (9):
vfio/pci: Add vfio_pci_dma_buf_iommufd_map()
iommufd: Add DMABUF to iopt_pages
iommufd: Do not map/unmap revoked DMABUFs
iommufd: Allow a DMABUF to be revoked
iommufd: Allow MMIO pages in a batch
iommufd: Have pfn_reader process DMABUF iopt_pages
iommufd: Have iopt_map_file_pages convert the fd to a file
iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE
iommufd/selftest: Add some tests for the dmabuf flow
drivers/iommu/iommufd/io_pagetable.c | 78 +++-
drivers/iommu/iommufd/io_pagetable.h | 53 ++-
drivers/iommu/iommufd/ioas.c | 8 +-
drivers/iommu/iommufd/iommufd_private.h | 14 +-
drivers/iommu/iommufd/iommufd_test.h | 10 +
drivers/iommu/iommufd/main.c | 10 +
drivers/iommu/iommufd/pages.c | 407 ++++++++++++++++--
drivers/iommu/iommufd/selftest.c | 142 ++++++
drivers/vfio/pci/vfio_pci_dmabuf.c | 34 ++
include/linux/vfio_pci_core.h | 4 +
tools/testing/selftests/iommu/iommufd.c | 43 ++
tools/testing/selftests/iommu/iommufd_utils.h | 44 ++
12 files changed, 781 insertions(+), 66 deletions(-)
base-commit: bb04e92c86b44b3e36532099b68de1e889acfee7
--
2.43.0
WW mutexes, and the dma-resv objects which embed them, typically have a
number of locks belonging to the same lock class. However, code using
them typically wants to verify locking at object granularity, not
lock-class granularity.
This series adds ww_mutex functions to facilitate that (patch 1) and
uses these functions in the dma-resv lock checks.
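The class- vs object-granularity distinction can be shown with a toy model: a lockdep-style check can only say "some lock of this class is held", whereas a per-lock helper checks this particular object. The types and names below are stand-ins, not the real ww_mutex API from patch 1.

```c
/*
 * Toy illustration of object-granularity lock checking.
 * Not the real ww_mutex/dma-resv API.
 */
struct toy_ww_mutex {
	const void *owner_ctx;   /* NULL when unlocked */
};

static void toy_lock(struct toy_ww_mutex *m, const void *ctx)
{
	m->owner_ctx = ctx;
}

/* Per-lock check: is *this* mutex held by this acquire context?
 * A class-level check could not distinguish two objects of the
 * same lock class, which is the gap this series closes. */
static int toy_is_held_by(const struct toy_ww_mutex *m, const void *ctx)
{
	return m->owner_ctx == ctx;
}
```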
Thomas Hellström (2):
kernel/locking/ww_mutex: Add per-lock lock-check helpers
dma-buf/dma-resv: Improve the dma-resv lockdep checks
include/linux/dma-resv.h | 7 +++++--
include/linux/ww_mutex.h | 18 ++++++++++++++++++
kernel/locking/mutex.c | 10 ++++++++++
3 files changed, 33 insertions(+), 2 deletions(-)
--
2.51.1
Here is the third part of the unification of time printing in the kernel,
this time for struct timespec64. The first patch brings support into the
printf() implementation (test cases and documentation update included),
followed by the treewide conversion of the current users.
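The conversion pattern is the same everywhere: replace open-coded seconds.nanoseconds formatting with the new specifier. A userspace stand-in for the "before" form is shown below; the "after" form is given only as a comment since %ptSp is the kernel-side specifier this series introduces.

```c
#include <stdio.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct timespec64. */
struct timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Open-coded formatting of the kind the series replaces: */
static int fmt_old(char *buf, size_t n, const struct timespec64 *ts)
{
	return snprintf(buf, n, "%lld.%09ld",
			(long long)ts->tv_sec, ts->tv_nsec);
}

/*
 * After the series, kernel code would instead write:
 *	pr_info("%ptSp\n", &ts);
 * letting vsprintf() do the seconds.nanoseconds formatting centrally.
 */
```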
Petr, more than half of the patches have been Acked; I think if you are
okay with this, the patches that have been tagged can be applied.
Note, not everything was compile-tested. The Kunit test passes, though.
Changelog v3:
- fixed a compilation issue with fnic (LKP), also satisfied checkpatch
- collected more tags
Petr, I have not renamed 'p' to 'n', as that would mean a lot of rework
and noise for changes that have already been reviewed.
However, I addressed the documentation issues.
v2: <20251111122735.880607-1-andriy.shevchenko(a)linux.intel.com>
Changelog v2:
- dropped wrong patches (Hans, Takashi)
- fixed most of the checkpatch warnings (fdo CI, media CI)
- collected tags
v1: <20251110184727.666591-1-andriy.shevchenko(a)linux.intel.com>
Andy Shevchenko (21):
lib/vsprintf: Add specifier for printing struct timespec64
ceph: Switch to use %ptSp
libceph: Switch to use %ptSp
dma-buf: Switch to use %ptSp
drm/amdgpu: Switch to use %ptSp
drm/msm: Switch to use %ptSp
drm/vblank: Switch to use %ptSp
drm/xe: Switch to use %ptSp
e1000e: Switch to use %ptSp
igb: Switch to use %ptSp
ipmi: Switch to use %ptSp
media: av7110: Switch to use %ptSp
mmc: mmc_test: Switch to use %ptSp
net: dsa: sja1105: Switch to use %ptSp
PCI: epf-test: Switch to use %ptSp
pps: Switch to use %ptSp
ptp: ocp: Switch to use %ptSp
s390/dasd: Switch to use %ptSp
scsi: fnic: Switch to use %ptSp
scsi: snic: Switch to use %ptSp
tracing: Switch to use %ptSp
Documentation/core-api/printk-formats.rst | 11 +++-
drivers/char/ipmi/ipmi_si_intf.c | 3 +-
drivers/char/ipmi/ipmi_ssif.c | 6 +--
drivers/dma-buf/sync_debug.c | 2 +-
.../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 3 +-
drivers/gpu/drm/drm_vblank.c | 6 +--
.../gpu/drm/msm/disp/msm_disp_snapshot_util.c | 3 +-
drivers/gpu/drm/msm/msm_gpu.c | 3 +-
drivers/gpu/drm/xe/xe_devcoredump.c | 4 +-
drivers/mmc/core/mmc_test.c | 20 +++----
drivers/net/dsa/sja1105/sja1105_tas.c | 8 ++-
drivers/net/ethernet/intel/e1000e/ptp.c | 7 +--
drivers/net/ethernet/intel/igb/igb_ptp.c | 7 +--
drivers/pci/endpoint/functions/pci-epf-test.c | 5 +-
drivers/pps/generators/pps_gen_parport.c | 3 +-
drivers/pps/kapi.c | 3 +-
drivers/ptp/ptp_ocp.c | 13 ++---
drivers/s390/block/dasd.c | 3 +-
drivers/scsi/fnic/fnic_trace.c | 52 ++++++++-----------
drivers/scsi/snic/snic_debugfs.c | 10 ++--
drivers/scsi/snic/snic_trc.c | 5 +-
drivers/staging/media/av7110/av7110.c | 2 +-
fs/ceph/dir.c | 5 +-
fs/ceph/inode.c | 49 ++++++-----------
fs/ceph/xattr.c | 6 +--
kernel/trace/trace_output.c | 6 +--
lib/tests/printf_kunit.c | 4 ++
lib/vsprintf.c | 28 +++++++++-
net/ceph/messenger_v2.c | 6 +--
29 files changed, 130 insertions(+), 153 deletions(-)
--
2.50.1