From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Thursday, June 20, 2024 10:34 PM
On Thu, Jun 20, 2024 at 04:14:23PM +0200, David Hildenbrand wrote:
- How would the device be able to grab/access "private memory", if not via the user page tables?
The approaches I'm aware of require the secure world to own the IOMMU and generate the IOMMU page tables. So we will not use a GUP approach with VFIO today as the kernel will not have any reason to generate a page table in the first place. Instead we will say "this PCI device translates through the secure world" and walk away.
The page table population would have to be done through the KVM path.
Sorry for noticing this discussion late. Dave pointed me to it in a related thread [1].
I had the impression that the above approach fits some trusted I/O architectures (e.g. TDX Connect, which has a special secure I/O page table format and requires sharing it between IOMMU and KVM), but not all.
For example, the SEV-TIO spec [2] (page 8) describes the IOMMU walking the existing I/O page tables to get the HPA and then verifying it against a new permission table (RMP) for access control.
That architecture may better fit a scheme in which the I/O page tables are still managed by VFIO/IOMMUFD and the RMP is managed by KVM, with an extension to the MAP_DMA call that accepts a [guest_memfd, offset] pair to find the pfn instead of using a host virtual address.
It looks like the Linux MM alignment session [3] did mention that "guest_memfd will take ownership of the hugepages, and provide interested parties (userspace, KVM, iommu) with pages to be used", which would support that extension?
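To make the MAP_DMA idea a bit more concrete, here is a rough userspace-only sketch of what such a request could carry. Everything in it is hypothetical: struct sketch_dma_map, the SKETCH_* flags and both helpers are invented for illustration and are not existing VFIO or IOMMUFD uAPI. The pfn table is only a toy stand-in for whatever lookup guest_memfd would actually provide.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ull << SKETCH_PAGE_SHIFT)

/* Hypothetical flags, not part of any existing uAPI. */
#define SKETCH_MAP_FLAG_READ   (1u << 0)
#define SKETCH_MAP_FLAG_WRITE  (1u << 1)
#define SKETCH_MAP_FLAG_GMEMFD (1u << 2) /* source is [guest_memfd, offset] */

/* Hypothetical request layout: the source is either a host VA or a gmem pair. */
struct sketch_dma_map {
	uint32_t argsz;
	uint32_t flags;
	union {
		uint64_t vaddr;          /* classic path: host virtual address */
		struct {
			int32_t  fd;     /* guest_memfd file descriptor */
			uint32_t pad;
			uint64_t offset; /* byte offset into the guest_memfd */
		} gmem;
	} src;
	uint64_t iova;
	uint64_t size;
};

/* Fill a request that names its source by [guest_memfd, offset]. */
static int sketch_fill_gmem_map(struct sketch_dma_map *m, int32_t fd,
				uint64_t offset, uint64_t iova, uint64_t size)
{
	const uint64_t mask = SKETCH_PAGE_SIZE - 1;

	if (!m || fd < 0 || size == 0)
		return -1;
	if ((offset | iova | size) & mask)	/* everything page aligned */
		return -1;
	if (offset + size < offset)		/* reject wrap-around */
		return -1;

	m->argsz = sizeof(*m);
	m->flags = SKETCH_MAP_FLAG_READ | SKETCH_MAP_FLAG_WRITE |
		   SKETCH_MAP_FLAG_GMEMFD;
	m->src.gmem.fd = fd;
	m->src.gmem.pad = 0;
	m->src.gmem.offset = offset;
	m->iova = iova;
	m->size = size;
	return 0;
}

/*
 * Toy stand-in for the kernel-side lookup: resolve an offset to a pfn
 * from a per-page table instead of walking the user page tables.
 */
static int sketch_resolve_pfn(const uint64_t *pfns, size_t npages,
			      uint64_t offset, uint64_t *pfn_out)
{
	uint64_t idx = offset >> SKETCH_PAGE_SHIFT;

	if (!pfns || !pfn_out || idx >= npages)
		return -1;
	*pfn_out = pfns[idx];
	return 0;
}
```

The point of the sketch is only that the pfn comes from the [fd, offset] pair, so no host virtual address (and no GUP) is involved on the mapping side, while RMP updates would stay with KVM.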
[1] https://lore.kernel.org/kvm/272e3dbf-ed4a-43f5-8b5f-56bf6d74930c@redhat.com/
[2] https://www.amd.com/system/files/documents/sev-tio-whitepaper.pdf
[3] https://lore.kernel.org/kvm/20240712232937.2861788-1-ackerleytng@google.com/
Thanks,
Kevin