Linaro-mm-sig November 2025

linaro-mm-sig@lists.linaro.org

70 participants
52 discussions

Re: [PATCH v3 19/21] scsi: fnic: Switch to use %ptSp

by Karan Tilak Kumar (kartilak)

On Friday, November 14, 2025 12:43 AM, Andy Shevchenko <andriy.shevchenko(a)linux.intel.com> wrote:
>
> On Thu, Nov 13, 2025 at 10:34:36PM +0000, Karan Tilak Kumar (kartilak) wrote:
> > On Thursday, November 13, 2025 6:33 AM, Andy Shevchenko <andriy.shevchenko(a)linux.intel.com> wrote:
>
> ...
>
> > Can you please advise how I can compile test this change?
>
> I have added the following to my x86_64_defconfig
>
> CONFIG_SCSI_FC_ATTRS=m
> CONFIG_LIBFC=m
> CONFIG_LIBFCOE=m
> CONFIG_FCOE_FNIC=m
>
> You can always add just the one (last) line to a configuration stanza that
> can be merged into the .config with the help of the merge_config tool. It will
> take care of all needed dependencies.
>
> --
> With Best Regards,
> Andy Shevchenko
>

Thank you Andy.

Regards,
Karan

5 months

[PATCH v7 00/11] vfio/pci: Allow MMIO regions to be exported through dma-buf

by Leon Romanovsky

Changelog:
v7:
 * Dropped restore_revoke flag and added vfio_pci_dma_buf_move to reverse loop.
 * Fixed spelling errors in documentation patch.
 * Rebased on top of v6.18-rc3.
 * Added include of stddef.h to vfio.h, to keep uapi header file independent.
v6: https://patch.msgid.link/20251102-dmabuf-vfio-v6-0-d773cff0db9f@nvidia.com
 * Fixed wrong error check from pcim_p2pdma_init().
 * Documented pcim_p2pdma_provider() function.
 * Improved commit messages.
 * Added VFIO DMA-BUF selftest, not sent yet.
 * Added __counted_by(nr_ranges) annotation to struct vfio_device_feature_dma_buf.
 * Fixed error unwind when dma_buf_fd() fails.
 * Document latest changes to p2pmem.
 * Removed EXPORT_SYMBOL_GPL from pci_p2pdma_map_type.
 * Moved DMA mapping logic to DMA-BUF.
 * Removed types patch to avoid dependencies between subsystems.
 * Moved vfio_pci_dma_buf_move() in err_undo block.
 * Added nvgrace patch.
v5: https://lore.kernel.org/all/cover.1760368250.git.leon@kernel.org
 * Rebased on top of v6.18-rc1.
 * Added more validation logic to make sure that DMA-BUF length doesn't overflow in various scenarios.
 * Hide kernel config from the users.
 * Fixed type conversion issue. DMA ranges are exposed with u64 length, but DMA-BUF uses "unsigned int" as a length for SG entries.
 * Added a check so that VFIO drivers which report a BAR size different from the PCI one do not use DMA-BUF functionality.
v4: https://lore.kernel.org/all/cover.1759070796.git.leon@kernel.org
 * Split pcim_p2pdma_provider() into two functions, one that initializes the array of providers and another that returns the right provider pointer.
v3: https://lore.kernel.org/all/cover.1758804980.git.leon@kernel.org
 * Changed pcim_p2pdma_enable() to be pcim_p2pdma_provider().
 * Cache provider in vfio_pci_dma_buf struct instead of BAR index.
 * Removed misleading comment from pcim_p2pdma_provider().
 * Moved MMIO check to be in pcim_p2pdma_provider().
v2: https://lore.kernel.org/all/cover.1757589589.git.leon@kernel.org/
 * Added extra patch which adds new CONFIG, so next patches can reuse it.
 * Squashed "PCI/P2PDMA: Remove redundant bus_offset from map state" into the other patch.
 * Fixed revoke calls to be aligned with true->false semantics.
 * Extended p2pdma_providers to be per-BAR and not global to whole device.
 * Fixed possible race between dmabuf states and revoke.
 * Moved revoke to PCI BAR zap block.
v1: https://lore.kernel.org/all/cover.1754311439.git.leon@kernel.org
 * Changed commit messages.
 * Reused DMA_ATTR_MMIO attribute.
 * Returned support for multiple DMA ranges per-DMABUF.
v0: https://lore.kernel.org/all/cover.1753274085.git.leonro@nvidia.com

---------------------------------------------------------------------------
Based on "[PATCH v6 00/16] dma-mapping: migrate to physical address-based API"
https://lore.kernel.org/all/cover.1757423202.git.leonro@nvidia.com/ series.
---------------------------------------------------------------------------

This series extends the VFIO PCI subsystem to support exporting MMIO regions from PCI device BARs as dma-buf objects, enabling safe sharing of non-struct page memory with controlled lifetime management. This allows RDMA and other subsystems to import dma-buf FDs and build them into memory regions for PCI P2P operations.

The series supports a use case for SPDK where an NVMe device will be owned by SPDK through VFIO but interacting with an RDMA device. The RDMA device may directly access the NVMe CMB or directly manipulate the NVMe device's doorbell using PCI P2P.

However, as a general mechanism, it can support many other scenarios with VFIO. This dmabuf approach can be used by iommufd as well for generic and safe P2P mappings.

In addition to the SPDK use-case mentioned above, the capability added in this patch series can also be useful when a buffer (located in device memory such as VRAM) needs to be shared between any two dGPU devices or instances (assuming one of them is bound to VFIO PCI) as long as they are P2P DMA compatible.

The implementation provides a revocable attachment mechanism using dma-buf move operations. MMIO regions are normally pinned as BARs don't change physical addresses, but access is revoked when the VFIO device is closed or a PCI reset is issued. This ensures kernel self-defense against potentially hostile userspace.

The series includes significant refactoring of the PCI P2PDMA subsystem to separate core P2P functionality from memory allocation features, making it more modular and suitable for VFIO use cases that don't need struct page support.

-----------------------------------------------------------------------
The series is based originally on
https://lore.kernel.org/all/20250307052248.405803-1-vivek.kasireddy@intel.c…
but heavily rewritten to be based on DMA physical API.
-----------------------------------------------------------------------
The WIP branch can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=…

Thanks

---
Jason Gunthorpe (2):
      PCI/P2PDMA: Document DMABUF model
      vfio/nvgrace: Support get_dmabuf_phys

Leon Romanovsky (7):
      PCI/P2PDMA: Separate the mmap() support from the core logic
      PCI/P2PDMA: Simplify bus address mapping API
      PCI/P2PDMA: Refactor to separate core P2P functionality from memory allocation
      PCI/P2PDMA: Provide an access to pci_p2pdma_map_type() function
      dma-buf: provide phys_vec to scatter-gather mapping routine
      vfio/pci: Enable peer-to-peer DMA transactions by default
      vfio/pci: Add dma-buf export support for MMIO regions

Vivek Kasireddy (2):
      vfio: Export vfio device get and put registration helpers
      vfio/pci: Share the core device pointer while invoking feature functions

 Documentation/driver-api/pci/p2pdma.rst |  95 +++++++---
 block/blk-mq-dma.c                      |   2 +-
 drivers/dma-buf/dma-buf.c               | 235 ++++++++++++++++++++++++
 drivers/iommu/dma-iommu.c               |   4 +-
 drivers/pci/p2pdma.c                    | 182 +++++++++++++-----
 drivers/vfio/pci/Kconfig                |   3 +
 drivers/vfio/pci/Makefile               |   1 +
 drivers/vfio/pci/nvgrace-gpu/main.c     |  56 ++++++
 drivers/vfio/pci/vfio_pci.c             |   5 +
 drivers/vfio/pci/vfio_pci_config.c      |  22 ++-
 drivers/vfio/pci/vfio_pci_core.c        |  53 ++++--
 drivers/vfio/pci/vfio_pci_dmabuf.c      | 315 ++++++++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_priv.h        |  23 +++
 drivers/vfio/vfio_main.c                |   2 +
 include/linux/dma-buf.h                 |  18 ++
 include/linux/pci-p2pdma.h              | 120 +++++++-----
 include/linux/vfio.h                    |   2 +
 include/linux/vfio_pci_core.h           |  42 +++++
 include/uapi/linux/vfio.h               |  28 +++
 kernel/dma/direct.c                     |   4 +-
 mm/hmm.c                                |   2 +-
 21 files changed, 1074 insertions(+), 140 deletions(-)
---
base-commit: dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa
change-id: 20251016-dmabuf-vfio-6cef732adf5a

Best regards,
--
Leon Romanovsky <leonro(a)nvidia.com>
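
The v5 changelog above notes that DMA ranges carry a u64 length while SG entries use an "unsigned int" length, and that overflow checks were added. The following is a minimal userspace sketch of that kind of check, not the code from the series: `struct demo_range`, `demo_count_sg_entries()`, and the 4 KiB page size are all assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range descriptor: offset and length in bytes, both 64-bit. */
struct demo_range {
	uint64_t offset;
	uint64_t length;
};

/* Assumed page size, used only to keep chunk boundaries page aligned. */
#define DEMO_PAGE_SIZE 4096u
/* Largest page-aligned chunk whose length still fits in an unsigned int. */
#define DEMO_MAX_CHUNK ((uint64_t)UINT_MAX & ~(uint64_t)(DEMO_PAGE_SIZE - 1))

/*
 * Count the SG entries needed to cover all ranges and sum their lengths.
 * Returns false if a range is empty, if offset + length wraps, or if the
 * running total would overflow 64 bits.
 */
bool demo_count_sg_entries(const struct demo_range *r, size_t nr,
			   uint64_t *total_len, size_t *nr_entries)
{
	uint64_t total = 0;
	size_t entries = 0;

	assert(total_len != NULL && nr_entries != NULL);

	for (size_t i = 0; i < nr; i++) {
		uint64_t len = r[i].length;

		if (len == 0 || r[i].offset > UINT64_MAX - len)
			return false;
		if (total > UINT64_MAX - len)
			return false;
		total += len;
		/* ceil(len / DEMO_MAX_CHUNK), written so it cannot overflow */
		entries += (size_t)(len / DEMO_MAX_CHUNK) +
			   (len % DEMO_MAX_CHUNK != 0);
	}

	*total_len = total;
	*nr_entries = entries;
	return true;
}
```

With these assumptions a 4 GiB range needs two entries, because a single entry can describe at most `DEMO_MAX_CHUNK` bytes.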

5 months

[PATCH net-next v6 0/6] Add AF_XDP zero copy support

by Meghana Malladi

This series adds AF_XDP zero copy support to the icssg driver.

Tests were performed on AM64x-EVM with the xdpsock application [1].

A clear improvement is seen in transmit (txonly) and receive (rxdrop) for 64 byte packets. The 1500 byte test seems to be limited by line rate (1G link), so no improvement is seen there in packet rate.

Having some issue with l2fwd, as the benchmarking numbers show 0 for 64 byte packets after forwarding the first batch of packets, and I am currently looking into it.

AF_XDP performance using 64 byte packets in Kpps.
Benchmark:  XDP-SKB  XDP-Native  XDP-Native(ZeroCopy)
rxdrop      253      473         656
txonly      350      354         855
l2fwd       178      240         0

AF_XDP performance using 1500 byte packets in Kpps.
Benchmark:  XDP-SKB  XDP-Native  XDP-Native(ZeroCopy)
rxdrop      82       82          82
txonly      81       82          82
l2fwd       81       82          82

[1]: https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-example

v5: https://lore.kernel.org/all/20251111101523.3160680-1-m-malladi@ti.com/

Meghana Malladi (6):
  net: ti: icssg-prueth: Add functions to create and destroy Rx/Tx queues
  net: ti: icssg-prueth: Add XSK pool helpers
  net: ti: icssg-prueth: Add AF_XDP zero copy for TX
  net: ti: icssg-prueth: Make emac_run_xdp function independent of page
  net: ti: icssg-prueth: Add AF_XDP zero copy for RX
  net: ti: icssg-prueth: Enable zero copy in XDP features

 drivers/net/ethernet/ti/icssg/icssg_common.c | 469 ++++++++++++++++---
 drivers/net/ethernet/ti/icssg/icssg_prueth.c | 394 +++++++++++++---
 drivers/net/ethernet/ti/icssg/icssg_prueth.h |  25 +-
 3 files changed, 739 insertions(+), 149 deletions(-)

base-commit: c9dfb92de0738eb7fe6a591ad1642333793e8b6e
--
2.43.0

5 months

[PATCH net-next v5 0/6] Add AF_XDP zero copy support

by Meghana Malladi

This series adds AF_XDP zero copy support to the icssg driver.

Tests were performed on AM64x-EVM with the xdpsock application [1].

A clear improvement is seen in transmit (txonly) and receive (rxdrop) for 64 byte packets. The 1500 byte test seems to be limited by line rate (1G link), so no improvement is seen there in packet rate.

Having some issue with l2fwd, as the benchmarking numbers show 0 for 64 byte packets after forwarding the first batch of packets, and I am currently looking into it.

AF_XDP performance using 64 byte packets in Kpps.
Benchmark:  XDP-SKB  XDP-Native  XDP-Native(ZeroCopy)
rxdrop      253      473         656
txonly      350      354         855
l2fwd       178      240         0

AF_XDP performance using 1500 byte packets in Kpps.
Benchmark:  XDP-SKB  XDP-Native  XDP-Native(ZeroCopy)
rxdrop      82       82          82
txonly      81       82          82
l2fwd       81       82          82

[1]: https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-example

v4: https://lore.kernel.org/all/20251023093927.1878411-1-m-malladi@ti.com/

v5-v4:
- Rebased to the latest tip
- Collected Reviewed-by from Jacob Keller <jacob.e.keller(a)intel.com>

Meghana Malladi (6):
  net: ti: icssg-prueth: Add functions to create and destroy Rx/Tx queues
  net: ti: icssg-prueth: Add XSK pool helpers
  net: ti: icssg-prueth: Add AF_XDP zero copy for TX
  net: ti: icssg-prueth: Make emac_run_xdp function independent of page
  net: ti: icssg-prueth: Add AF_XDP zero copy for RX
  net: ti: icssg-prueth: Enable zero copy in XDP features

 drivers/net/ethernet/ti/icssg/icssg_common.c | 471 ++++++++++++++++---
 drivers/net/ethernet/ti/icssg/icssg_prueth.c | 394 +++++++++++++---
 drivers/net/ethernet/ti/icssg/icssg_prueth.h |  25 +-
 3 files changed, 741 insertions(+), 149 deletions(-)

base-commit: b981e100c19dcd91ce8cca8562c3cdabd4fcf28c
--
2.43.0

5 months

Re: [PATCH] rust: bindings: add `rust_helper_wait_for_completion` helper function

by Byungchul Park

On Thu, Oct 02, 2025 at 12:27:53PM +0200, Danilo Krummrich wrote:
> On Thu Oct 2, 2025 at 12:06 PM CEST, Guangbo Cui wrote:
> > The DEPT patch series changed `wait_for_completion` into a macro.
> > Because bindgen cannot handle function-like macros, this caused
> > Rust build errors. Add a helper function to fix it.
>
> Good catch!
>
> Given that the latest version of this series was just posted, please squash this
> fix into patch "dept: assign unique dept_key to each distinct
> wait_for_completion() caller" [1].

Thank you all. I will squash this into the patch. Thanks again!

Byungchul

>
> Thanks,
> Danilo
>
> [1] https://lore.kernel.org/all/20251002081247.51255-37-byungchul@sk.com/

5 months, 1 week

[PATCH v2 00/21] treewide: Introduce %ptS for struct timespec64 and convert users

by Andy Shevchenko

Here is the third part of the unification of time printing in the kernel, this time for struct timespec64. The first patch adds support to the printf() implementation (test cases and documentation update included), followed by the treewide conversion of the current users.

The idea is to have one or a few of the biggest users included; the rest can be taken next release cycle on a per-subsystem basis, but I won't object if the respective maintainers already give their tags. Depending on the tags received it may go via a dedicated subsystem or via the PRINTK tree.

Petr, what do you think?

Note, not everything was compile-tested. The KUnit test has passed, though.

Changelog v2:
- dropped wrong patches (Hans, Takashi)
- fixed most of the checkpatch warnings (fdo CI, media CI)
- collected tags

v1: <20251110184727.666591-1-andriy.shevchenko(a)linux.intel.com>

Andy Shevchenko (21):
  lib/vsprintf: Add specifier for printing struct timespec64
  ceph: Switch to use %ptSp
  libceph: Switch to use %ptSp
  dma-buf: Switch to use %ptSp
  drm/amdgpu: Switch to use %ptSp
  drm/msm: Switch to use %ptSp
  drm/vblank: Switch to use %ptSp
  drm/xe: Switch to use %ptSp
  e1000e: Switch to use %ptSp
  igb: Switch to use %ptSp
  ipmi: Switch to use %ptSp
  media: av7110: Switch to use %ptSp
  mmc: mmc_test: Switch to use %ptSp
  net: dsa: sja1105: Switch to use %ptSp
  PCI: epf-test: Switch to use %ptSp
  pps: Switch to use %ptSp
  ptp: ocp: Switch to use %ptSp
  s390/dasd: Switch to use %ptSp
  scsi: fnic: Switch to use %ptS
  scsi: snic: Switch to use %ptSp
  tracing: Switch to use %ptSp

 Documentation/core-api/printk-formats.rst     | 11 ++++-
 drivers/char/ipmi/ipmi_si_intf.c              |  3 +-
 drivers/char/ipmi/ipmi_ssif.c                 |  6 +--
 drivers/dma-buf/sync_debug.c                  |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c  |  3 +-
 drivers/gpu/drm/drm_vblank.c                  |  6 +--
 .../gpu/drm/msm/disp/msm_disp_snapshot_util.c |  3 +-
 drivers/gpu/drm/msm/msm_gpu.c                 |  3 +-
 drivers/gpu/drm/xe/xe_devcoredump.c           |  4 +-
 drivers/mmc/core/mmc_test.c                   | 20 +++-----
 drivers/net/dsa/sja1105/sja1105_tas.c         |  8 ++-
 drivers/net/ethernet/intel/e1000e/ptp.c       |  7 +--
 drivers/net/ethernet/intel/igb/igb_ptp.c      |  7 +--
 drivers/pci/endpoint/functions/pci-epf-test.c |  5 +-
 drivers/pps/generators/pps_gen_parport.c      |  3 +-
 drivers/pps/kapi.c                            |  3 +-
 drivers/ptp/ptp_ocp.c                         | 13 ++---
 drivers/s390/block/dasd.c                     |  3 +-
 drivers/scsi/fnic/fnic_trace.c                | 46 ++++++++---------
 drivers/scsi/snic/snic_debugfs.c              | 10 ++--
 drivers/scsi/snic/snic_trc.c                  |  5 +-
 drivers/staging/media/av7110/av7110.c         |  2 +-
 fs/ceph/dir.c                                 |  5 +-
 fs/ceph/inode.c                               | 49 ++++++-------------
 fs/ceph/xattr.c                               |  6 +--
 kernel/trace/trace_output.c                   |  6 +--
 lib/tests/printf_kunit.c                      |  4 ++
 lib/vsprintf.c                                | 25 ++++++++++
 net/ceph/messenger_v2.c                       |  6 +--
 29 files changed, 126 insertions(+), 148 deletions(-)

--
2.50.1
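
For readers unfamiliar with what such a conversion replaces, here is a minimal userspace sketch of the open-coded pattern that kernel code commonly uses to print a timespec: seconds and nanoseconds passed as two separate arguments. It is an assumption that this is the pattern targeted by the patches; the exact output format of %ptSp is defined by the first patch of the series and is not shown in this archive. `demo_format_timespec()` is a hypothetical helper, and the userspace `struct timespec` stands in for `struct timespec64`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/*
 * Open-coded formatting: "<seconds>.<nanoseconds, zero padded to 9 digits>".
 * Both fields must be passed and cast by hand at every call site.
 */
int demo_format_timespec(char *buf, size_t size, const struct timespec *ts)
{
	assert(buf != NULL && ts != NULL);

	return snprintf(buf, size, "%lld.%09ld",
			(long long)ts->tv_sec, (long)ts->tv_nsec);
}
```

A dedicated specifier that takes a pointer to the structure lets each call site pass one argument instead of two and keeps the formatting in one place.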

5 months, 1 week

[PATCH v1 00/20] drm/amdgpu: use all SDMA instances for TTM clears and moves

by Pierre-Eric Pelloux-Prayer

The drm/ttm patch modifies TTM to support multiple contexts for the pipelined moves.

Then amdgpu/ttm is updated to express dependencies between jobs explicitly, instead of relying on the ordering of execution guaranteed by the use of a single instance. With all of this in place, we can use multiple entities, with each having access to the available SDMA instances.

This rework also gives the opportunity to merge the clear functions into a single one and to optimize GART usage a bit.

(The first patch of the series has already been merged through drm-misc but I'm including it here to reduce conflicts)

Pierre-Eric Pelloux-Prayer (20):
  drm/amdgpu: give each kernel job a unique id
  drm/ttm: rework pipelined eviction fence handling
  drm/amdgpu: remove direct_submit arg from amdgpu_copy_buffer
  drm/amdgpu: introduce amdgpu_ttm_entity
  drm/amdgpu: pass the entity to use to ttm functions
  drm/amdgpu: statically assign gart windows to ttm entities
  drm/amdgpu: allocate multiple clear entities
  drm/amdgpu: allocate multiple move entities
  drm/amdgpu: pass optional dependency to amdgpu_fill_buffer
  drm/amdgpu: prepare amdgpu_fill_buffer to use N entities
  drm/amdgpu: use multiple entities in amdgpu_fill_buffer
  drm/amdgpu: use TTM_FENCES_MAX_SLOT_COUNT
  drm/amdgpu: use multiple entities in amdgpu_move_blit
  drm/amdgpu: pass all the sdma rings to amdgpu_mman
  drm/amdgpu: introduce amdgpu_sdma_set_vm_pte_scheds
  drm/amdgpu: give ttm entities access to all the sdma scheds
  drm/amdgpu: get rid of amdgpu_ttm_clear_buffer
  drm/amdgpu: rename amdgpu_fill_buffer as amdgpu_clear_buffer
  drm/amdgpu: use larger gart window when possible
  drm/amdgpu: double AMDGPU_GTT_MAX_TRANSFER_SIZE

 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c |   7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c       |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  23 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h       |  19 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c      |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  19 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       | 496 ++++++++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h       |  51 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c      |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  24 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c    |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c     |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c   |  12 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c         |  10 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c        |  10 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c        |  10 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c        |  17 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c      |  17 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c        |  10 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c        |  10 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c        |  10 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c        |  10 +-
 drivers/gpu/drm/amd/amdgpu/si_dma.c           |  10 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c         |   6 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c         |   6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c      |  31 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c          |   2 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   |   2 +-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_wb.c  |   2 +-
 .../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |  13 +-
 drivers/gpu/drm/ttm/tests/ttm_resource_test.c |   5 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  |  56 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c             |  36 +-
 drivers/gpu/drm/ttm/ttm_resource.c            |  45 +-
 include/drm/ttm/ttm_resource.h                |  34 +-
 45 files changed, 651 insertions(+), 414 deletions(-)

--
2.43.0

5 months, 1 week

[PATCH v1 00/23] treewide: Introduce %ptS for struct timespec64 and convert users

by Andy Shevchenko

Here is the third part of the unification of time printing in the kernel, this time for struct timespec64. The first patch adds support to the printf() implementation (test cases and documentation update included), followed by the treewide conversion of the current users.

The idea is to have one or a few of the biggest users included; the rest can be taken next release cycle on a per-subsystem basis, but I won't object if the respective maintainers already give their tags. Depending on the tags received it may go via a dedicated subsystem or via the PRINTK tree.

Note, not everything was compile-tested. The KUnit test has passed, though.

Andy Shevchenko (23):
  lib/vsprintf: Add specifier for printing struct timespec64
  ALSA: seq: Switch to use %ptSp
  ceph: Switch to use %ptSp
  libceph: Switch to use %ptSp
  dma-buf: Switch to use %ptSp
  drm/amdgpu: Switch to use %ptSp
  drm/msm: Switch to use %ptSp
  drm/vblank: Switch to use %ptSp
  drm/xe: Switch to use %ptSp
  e1000e: Switch to use %ptSp
  igb: Switch to use %ptSp
  ipmi: Switch to use %ptSp
  media: av7110: Switch to use %ptSp
  media: v4l2-ioctl: Switch to use %ptSp
  mmc: mmc_test: Switch to use %ptSp
  net: dsa: sja1105: Switch to use %ptSp
  PCI: epf-test: Switch to use %ptSp
  pps: Switch to use %ptSp
  ptp: ocp: Switch to use %ptSp
  s390/dasd: Switch to use %ptSp
  scsi: fnic: Switch to use %ptS
  scsi: snic: Switch to use %ptSp
  tracing: Switch to use %ptSp

 Documentation/core-api/printk-formats.rst     | 11 ++++-
 drivers/char/ipmi/ipmi_si_intf.c              |  3 +-
 drivers/char/ipmi/ipmi_ssif.c                 |  6 +--
 drivers/dma-buf/sync_debug.c                  |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c  |  3 +-
 drivers/gpu/drm/drm_vblank.c                  |  6 +--
 .../gpu/drm/msm/disp/msm_disp_snapshot_util.c |  3 +-
 drivers/gpu/drm/msm/msm_gpu.c                 |  3 +-
 drivers/gpu/drm/xe/xe_devcoredump.c           |  4 +-
 drivers/media/v4l2-core/v4l2-ioctl.c          |  5 +-
 drivers/mmc/core/mmc_test.c                   | 18 +++----
 drivers/net/dsa/sja1105/sja1105_tas.c         |  8 ++--
 drivers/net/ethernet/intel/e1000e/ptp.c       |  7 +--
 drivers/net/ethernet/intel/igb/igb_ptp.c      |  7 +--
 drivers/pci/endpoint/functions/pci-epf-test.c |  5 +-
 drivers/pps/generators/pps_gen_parport.c      |  3 +-
 drivers/pps/kapi.c                            |  3 +-
 drivers/ptp/ptp_ocp.c                         | 15 +++---
 drivers/s390/block/dasd.c                     |  3 +-
 drivers/scsi/fnic/fnic_trace.c                | 46 ++++++++----------
 drivers/scsi/snic/snic_debugfs.c              | 10 ++--
 drivers/scsi/snic/snic_trc.c                  |  5 +-
 drivers/staging/media/av7110/av7110.c         |  2 +-
 fs/ceph/dir.c                                 |  5 +-
 fs/ceph/inode.c                               | 47 ++++++-------------
 fs/ceph/xattr.c                               |  6 +--
 kernel/trace/trace_output.c                   |  6 +--
 lib/tests/printf_kunit.c                      |  4 ++
 lib/vsprintf.c                                | 25 ++++++++++
 net/ceph/messenger_v2.c                       |  6 +--
 sound/core/seq/seq_queue.c                    |  2 +-
 sound/core/seq/seq_timer.c                    |  6 +--
 32 files changed, 131 insertions(+), 154 deletions(-)

--
2.50.1

5 months, 1 week

[PATCH 0/8] Initial DMABUF support for iommufd

by Jason Gunthorpe

This series is the start of adding full DMABUF support to iommufd. Currently it is limited to only work with VFIO's DMABUF exporter. It sits on top of Leon's series to add a DMABUF exporter to VFIO:

  https://lore.kernel.org/all/cover.1760368250.git.leon@kernel.org/

The existing IOMMU_IOAS_MAP_FILE is enhanced to detect DMABUF fd's, but otherwise works the same as it does today for a memfd. The user can select a slice of the FD to map into the ioas and if the underlying alignment requirements are met it will be placed in the iommu_domain.

Though limited, it is enough to allow a VMM like QEMU to connect MMIO BAR memory from VFIO to an iommu_domain controlled by iommufd. This is used for PCI Peer to Peer support in VMs, and is the last feature that the VFIO type 1 container has that iommufd couldn't do.

The VFIO type1 version extracts raw PFNs from VMAs, which has no lifetime control and is a use-after-free security problem.

Instead iommufd relies on revokable DMABUFs. Whenever VFIO thinks there should be no access to the MMIO it can shoot down the mapping in iommufd which will unmap it from the iommu_domain. There is no automatic remap, this is a safety protocol so the kernel doesn't get stuck. Userspace is expected to know it is doing something that will revoke the dmabuf and map/unmap it around the activity. E.g. when QEMU goes to issue FLR it should do the map/unmap to iommufd.

Since DMABUF is missing some key general features for this use case it relies on a "private interconnect" between VFIO and iommufd via the vfio_pci_dma_buf_iommufd_map() call.

The call confirms the DMABUF has revoke semantics and delivers a phys_addr for the memory suitable for use with iommu_map().

Medium term there is a desire to expand the supported DMABUFs to include GPU drivers to support DPDK/SPDK type use cases so future series will work to add a general concept of revoke and a general negotiation of interconnect to remove vfio_pci_dma_buf_iommufd_map().

I also plan another series to modify iommufd's vfio_compat to transparently pull a dmabuf out of a VFIO VMA to emulate more of the uAPI of type1.

The latest series for interconnect negotiation to exchange a phys_addr is:

  https://lore.kernel.org/r/20251027044712.1676175-1-vivek.kasireddy@intel.com

And the discussion for design of revoke is here:

  https://lore.kernel.org/dri-devel/20250114173103.GE5556@nvidia.com/

This is on github: https://github.com/jgunthorpe/linux/commits/iommufd_dmabuf

The branch has various modifications to Leon's series I've suggested.

Jason Gunthorpe (8):
  iommufd: Add DMABUF to iopt_pages
  iommufd: Do not map/unmap revoked DMABUFs
  iommufd: Allow a DMABUF to be revoked
  iommufd: Allow MMIO pages in a batch
  iommufd: Have pfn_reader process DMABUF iopt_pages
  iommufd: Have iopt_map_file_pages convert the fd to a file
  iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE
  iommufd/selftest: Add some tests for the dmabuf flow

 drivers/iommu/iommufd/io_pagetable.c          |  74 +++-
 drivers/iommu/iommufd/io_pagetable.h          |  53 ++-
 drivers/iommu/iommufd/ioas.c                  |   8 +-
 drivers/iommu/iommufd/iommufd_private.h       |  13 +-
 drivers/iommu/iommufd/iommufd_test.h          |  10 +
 drivers/iommu/iommufd/main.c                  |  10 +
 drivers/iommu/iommufd/pages.c                 | 407 ++++++++++++++++--
 drivers/iommu/iommufd/selftest.c              | 142 ++++++
 tools/testing/selftests/iommu/iommufd.c       |  43 ++
 tools/testing/selftests/iommu/iommufd_utils.h |  44 ++
 10 files changed, 741 insertions(+), 63 deletions(-)

base-commit: fc882154e421f82677925d33577226e776bb07a4
--
2.43.0
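
The cover letter says a slice of the FD is mapped only if the underlying alignment requirements are met. A minimal sketch of what such a bounds and alignment check can look like follows; it is an illustration only, not iommufd's actual logic. `demo_slice_is_mappable()` and its `granule` parameter (an assumed power-of-two mapping granularity) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: can the byte range [start, start + length) of a
 * buffer of buf_len bytes be mapped with the given power-of-two granule?
 */
bool demo_slice_is_mappable(uint64_t buf_len, uint64_t start,
			    uint64_t length, uint64_t granule)
{
	/* The granule must be a non-zero power of two. */
	assert(granule != 0 && (granule & (granule - 1)) == 0);

	/* Reject empty or out-of-bounds slices without computing start + length. */
	if (length == 0 || start > buf_len || length > buf_len - start)
		return false;

	/* Both ends of the slice must sit on a granule boundary. */
	return ((start | length) & (granule - 1)) == 0;
}
```

Comparing `length` against `buf_len - start` avoids the wrap-around that `start + length` could produce with user-supplied 64-bit values.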

5 months, 2 weeks

[PATCH v3] drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb

by Pierre-Eric Pelloux-Prayer

The Mesa issue referenced below pointed out a possible deadlock:

[ 1231.611031]  Possible interrupt unsafe locking scenario:

[ 1231.611033]        CPU0                    CPU1
[ 1231.611034]        ----                    ----
[ 1231.611035]   lock(&xa->xa_lock#17);
[ 1231.611038]                                local_irq_disable();
[ 1231.611039]                                lock(&fence->lock);
[ 1231.611041]                                lock(&xa->xa_lock#17);
[ 1231.611044]   <Interrupt>
[ 1231.611045]     lock(&fence->lock);
[ 1231.611047]  *** DEADLOCK ***

In this example, CPU0 would be any function accessing job->dependencies through the xa_* functions that doesn't disable interrupts (eg: drm_sched_job_add_dependency, drm_sched_entity_kill_jobs_cb).

CPU1 is executing drm_sched_entity_kill_jobs_cb as a fence signalling callback so in an interrupt context. It will deadlock when trying to grab the xa_lock which is already held by CPU0.

Replacing all xa_* usage by their xa_*_irq counterparts would fix this issue, but Christian pointed out another issue: dma_fence_signal takes fence.lock and so does dma_fence_add_callback.

  dma_fence_signal() // locks f1.lock
  -> drm_sched_entity_kill_jobs_cb()
  -> foreach dependencies
     -> dma_fence_add_callback() // locks f2.lock

This will deadlock if f1 and f2 share the same spinlock.

To fix both issues, the code iterating on dependencies and re-arming them is moved out to drm_sched_entity_kill_jobs_work.

v2: reworded commit message (Philipp)
v3: added Fixes tag (Philipp)

Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov(a)gmail.com>
Suggested-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer(a)amd.com>
---
 drivers/gpu/drm/scheduler/sched_entity.c | 34 +++++++++++++-----------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index c8e949f4a568..fe174a4857be 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -173,26 +173,15 @@ int drm_sched_entity_error(struct drm_sched_entity *entity)
 }
 EXPORT_SYMBOL(drm_sched_entity_error);
 
+static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
+					  struct dma_fence_cb *cb);
+
 static void drm_sched_entity_kill_jobs_work(struct work_struct *wrk)
 {
 	struct drm_sched_job *job = container_of(wrk, typeof(*job), work);
-
-	drm_sched_fence_scheduled(job->s_fence, NULL);
-	drm_sched_fence_finished(job->s_fence, -ESRCH);
-	WARN_ON(job->s_fence->parent);
-	job->sched->ops->free_job(job);
-}
-
-/* Signal the scheduler finished fence when the entity in question is killed. */
-static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
-					  struct dma_fence_cb *cb)
-{
-	struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
-						 finish_cb);
+	struct dma_fence *f;
 	unsigned long index;
 
-	dma_fence_put(f);
-
 	/* Wait for all dependencies to avoid data corruptions */
 	xa_for_each(&job->dependencies, index, f) {
 		struct drm_sched_fence *s_fence = to_drm_sched_fence(f);
@@ -220,6 +209,21 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
 		dma_fence_put(f);
 	}
 
+	drm_sched_fence_scheduled(job->s_fence, NULL);
+	drm_sched_fence_finished(job->s_fence, -ESRCH);
+	WARN_ON(job->s_fence->parent);
+	job->sched->ops->free_job(job);
+}
+
+/* Signal the scheduler finished fence when the entity in question is killed. */
+static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
+					  struct dma_fence_cb *cb)
+{
+	struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
+						 finish_cb);
+
+	dma_fence_put(f);
+
 	INIT_WORK(&job->work, drm_sched_entity_kill_jobs_work);
 	schedule_work(&job->work);
 }
--
2.43.0
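
The second problem described in the commit message (a callback that runs under f1's lock and then takes f2's lock, where both fences share one spinlock) can be modelled in a few lines of single-threaded userspace C. This is a toy sketch under stated assumptions, not the scheduler code: every `demo_*` name is hypothetical, the "lock" only records a re-acquisition instead of spinning, and the deferred work is run by hand rather than by a workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock: re-acquiring it is recorded as a deadlock. */
struct demo_lock {
	bool held;
	bool deadlocked;
};

void demo_lock_acquire(struct demo_lock *l)
{
	if (l->held)
		l->deadlocked = true;	/* a real spinlock would spin forever */
	l->held = true;
}

void demo_lock_release(struct demo_lock *l)
{
	l->held = false;
}

struct demo_fence;
typedef void (*demo_cb_t)(struct demo_fence *f);

struct demo_fence {
	struct demo_lock *lock;		/* may be shared between fences */
	demo_cb_t cb;
	struct demo_fence *dep;		/* dependency to re-arm on */
	bool work_pending;
};

/* Takes the fence lock; stands in for adding a callback to a fence. */
void demo_add_callback(struct demo_fence *f, demo_cb_t cb)
{
	demo_lock_acquire(f->lock);
	f->cb = cb;
	demo_lock_release(f->lock);
}

/* Takes the fence lock and runs the callback under it, like signalling. */
void demo_signal(struct demo_fence *f)
{
	demo_lock_acquire(f->lock);
	if (f->cb)
		f->cb(f);
	demo_lock_release(f->lock);
}

void demo_noop_cb(struct demo_fence *f)
{
	(void)f;
}

/* Unsafe: re-arms on the dependency directly from the callback. */
void demo_cb_direct(struct demo_fence *f)
{
	demo_add_callback(f->dep, demo_noop_cb);
}

/* Safe: the callback only records that work is needed. */
void demo_cb_deferred(struct demo_fence *f)
{
	f->work_pending = true;
}

/* Runs later, outside the signalling path, with no fence lock held. */
void demo_run_work(struct demo_fence *f)
{
	assert(f->dep != NULL);
	if (!f->work_pending)
		return;
	f->work_pending = false;
	demo_add_callback(f->dep, demo_noop_cb);
}
```

With two fences sharing one `demo_lock`, signalling a fence whose callback is `demo_cb_direct` marks the lock as deadlocked, while `demo_cb_deferred` followed by `demo_run_work()` re-arms the dependency without ever taking the lock twice. That is the same shape as moving the dependency loop from the callback into the work item.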

5 months, 2 weeks
