Re: [PATCH v4 11/23] iommufd/viommu: Add IOMMUFD_CMD_HW_QUEUE_ALLOC ioctl

19 May 2025

      On Mon, May 19, 2025 at 10:59:49PM +0530, Vasant Hegde wrote:
...
Jason, Nicolin, Kevin,
On 5/15/2025 9:36 PM, Jason Gunthorpe wrote:
...
On Thu, May 08, 2025 at 08:02:32PM -0700, Nicolin Chen wrote:
...
+/**
+ * struct iommu_hw_queue_alloc - ioctl(IOMMU_HW_QUEUE_ALLOC)
+ * @size: sizeof(struct iommu_hw_queue_alloc)
+ * @flags: Must be 0
+ * @viommu_id: Virtual IOMMU ID to associate the HW queue with
+ * @type: One of enum iommu_hw_queue_type
+ * @index: The logical index to the HW queue per virtual IOMMU for a multi-queue
+ *         model
+ * @out_hw_queue_id: The ID of the new HW queue
+ * @base_addr: Base address of the queue memory in guest physical address space
+ * @length: Length of the queue memory in the guest physical address space
+ *
+ * Allocate a HW queue object for a vIOMMU-specific HW-accelerated queue, which
+ * allows HW to access a guest queue memory described by @base_addr and @length.
+ * Upon success, the underlying physical pages of the guest queue memory will be
+ * pinned to prevent VMM from unmapping them in the IOAS until the HW queue gets
+ * destroyed.
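
For readers following along, the fields documented above could be laid out roughly as below. This is only a sketch: the struct name hw_queue_alloc_sketch, the helper hw_queue_alloc_request(), the field widths, and the field order are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical mirror of the documented fields. Widths and ordering are
 * assumptions: 32-bit IDs first, then the two 64-bit address fields.
 */
struct hw_queue_alloc_sketch {
	uint32_t size;            /* sizeof(struct hw_queue_alloc_sketch) */
	uint32_t flags;           /* must be 0 */
	uint32_t viommu_id;       /* vIOMMU the queue belongs to */
	uint32_t type;            /* queue type selector */
	uint32_t index;           /* logical queue index within the vIOMMU */
	uint32_t out_hw_queue_id; /* filled in on success */
	uint64_t base_addr;       /* guest physical base of the queue memory */
	uint64_t length;          /* length of the queue memory in bytes */
};

/* With this ordering the struct has no implicit padding. */
static_assert(sizeof(struct hw_queue_alloc_sketch) == 40,
	      "unexpected padding in sketch layout");

/* Fill a request the way a VMM might before issuing the ioctl. */
static struct hw_queue_alloc_sketch
hw_queue_alloc_request(uint32_t viommu_id, uint32_t type, uint32_t index,
		       uint64_t base_gpa, uint64_t length)
{
	struct hw_queue_alloc_sketch req = {
		.size = sizeof(req),
		.flags = 0,
		.viommu_id = viommu_id,
		.type = type,
		.index = index,
		.base_addr = base_gpa,
		.length = length,
	};

	return req;
}
```

Note that @base_addr and @length are guest physical values here, which is what makes the pinning question below relevant.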

Do we have a way to make the pinning optional?
As I understand AMD's system, the IOMMU HW itself translates the
base_addr through the S2 page table automatically, so it doesn't need
pinned memory and physical addresses, just the IOVA.
Correct. HW will translate GPA -> SPA automatically using the information below.
The AMD IOMMU needs a special device ID to be set up with the GPA -> SPA mapping
per VM, and it is programmed in the VF Control BAR (VFCntlMMIO Offset
{16'b[GuestID], 6'b01_0000} Guest Miscellaneous Control Register). IOMMU HW will
use this address for GPA to SPA translation for buffers like the command buffer.
So HW will use the base address (GPA) and the head/tail pointer to get the offset
from the base. Then it will apply the GPA -> SPA translation.
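
To make the two steps concrete (offset from the base GPA, then GPA -> SPA), here is a toy model. It is not how the AMD HW or driver implements anything: the flat table, the 4 KiB page size, and the names s2_entry, s2_translate() and queue_entry_spa() are all assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* One toy stage-2 mapping: guest page frame -> system page frame. */
struct s2_entry {
	uint64_t gfn;
	uint64_t sfn;
};

/*
 * Toy stage-2 lookup (linear search over a flat table).
 * Returns false when no mapping exists, where real HW would be
 * expected to report a translation fault.
 */
static bool s2_translate(const struct s2_entry *tbl, size_t n,
			 uint64_t gpa, uint64_t *spa)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].gfn == (gpa >> SKETCH_PAGE_SHIFT)) {
			*spa = (tbl[i].sfn << SKETCH_PAGE_SHIFT) |
			       (gpa & (SKETCH_PAGE_SIZE - 1));
			return true;
		}
	}
	return false;
}

/* Step 1: base GPA plus head/tail offset. Step 2: GPA -> SPA. */
static bool queue_entry_spa(const struct s2_entry *tbl, size_t n,
			    uint64_t base_gpa, uint64_t offset,
			    uint64_t *spa)
{
	return s2_translate(tbl, n, base_gpa + offset, spa);
}
```

In this model the queue never holds a system physical address, so an unmapped guest page shows up as a failed lookup rather than a stale access.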
...
Perhaps for this reason the pinning should be done with a function
call from the driver?
We still need to make sure the pages allocated for the queue are present in
memory so that IOMMU HW can access them.
Is pinning at the time of guest boot enough here, or do we need to take an
additional reference in the queue_alloc() path?
For NVIDIA's vCMDQ that reads host PA directly, pages should be
pinned once when stage 2 mappings are created for the guest RAM,
and iommu_hw_queue_alloc() should pin the pages again to prevent
the gPA from being unmapped in the stage 2 page table. Otherwise
it will be a security hole, as HW continues to read the unmapped
memory through physical address space.
I understand that AMD Command Buffer also needs the S2 mappings
to be present in order to work correctly. But what happens if the
queue memory isn't pinned (or even gets unmapped)? Will HW raise
a translation fault rather than read the unmapped memory?
If so, I think this is Jason's point: a security hole would be
unlikely, i.e. for AMD, iommu_hw_queue_alloc() pinning the
physical pages is likely optional.
Thanks
Nicolin
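
The distinction discussed above can be modelled in a few lines. This is a toy sketch only: enum queue_access, struct queue_mem, and the three helpers are hypothetical names, and whether AMD HW really faults on an unmapped queue is still the open question in this thread.

```c
#include <stdbool.h>

/* How the HW reaches queue memory (hypothetical classification). */
enum queue_access {
	QUEUE_ACCESS_HOST_PA,  /* HW reads host physical addresses directly */
	QUEUE_ACCESS_S2_XLATE, /* HW translates guest addresses via stage 2 */
};

/* Toy state of the guest memory backing one queue. */
struct queue_mem {
	int pin_count; /* extra pins taken by HW queues */
	bool mapped;   /* still present in the stage-2 page table */
};

/* Only HW that bypasses stage 2 needs the extra pin. */
static bool queue_needs_pin(enum queue_access access)
{
	return access == QUEUE_ACCESS_HOST_PA;
}

/* Queue allocation takes a pin only when the access model requires it. */
static void queue_alloc(struct queue_mem *m, enum queue_access access)
{
	if (queue_needs_pin(access))
		m->pin_count++;
}

/* Unmap is refused while a HW queue holds a pin: 0 on success, -1 if busy. */
static int queue_mem_unmap(struct queue_mem *m)
{
	if (m->pin_count > 0)
		return -1;
	m->mapped = false;
	return 0;
}
```

With a host-PA queue the unmap is blocked until the queue is destroyed; with a stage-2-translated queue the unmap succeeds and any later HW access would depend on the translation failing safely.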
