From: Jason Gunthorpe jgg@nvidia.com Sent: Wednesday, October 8, 2025 12:12 AM
This is used by x86 CPUs and can be used in AMD/VT-D x86 IOMMUs. When a x86 IOMMU is running SVA it is using this page table format.
SVA does use this format but mentioning it is a bit confusing here as the SVA page table is managed by mm.
This implementation follows the AMD v2 io-pgtable version.
There is nothing remarkable here, the format can have 4 or 5 levels and limited support for different page sizes. No contiguous pages support.
x86 uses a sign extension mechanism where the top bits of the VA must match the sign bit. The core code supports this through PT_FEAT_SIGN_EXTEND which creates and upper and lower VA range. All the new operations will work correctly in both spaces, however currently there is no way to report the upper space to other layers. Future patches can improve that.
In principle this can support 3 page tables levels matching the 32 bit PAE table format, but no iommu driver needs this. The focus is on the modern 64 bit 4 and 5 level formats.
Comparing the performance of several operations to the existing version:
iommu_map() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 71,61 , 66,58 , -13.13 2^21, 66,60 , 61,55 , -10.10 2^30, 59,56 , 56,54 , -3.03 256*2^12, 392,1360 , 345,1289 , 73.73 256*2^21, 383,1159 , 335,1145 , 70.70 256*2^30, 378,965 , 331,892 , 62.62
iommu_unmap() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 77,71 , 73,68 , -7.07 2^21, 76,70 , 70,66 , -6.06 2^30, 69,66 , 66,63 , -4.04 256*2^12, 225,899 , 210,870 , 75.75 256*2^21, 262,722 , 248,710 , 65.65 256*2^30, 251,643 , 244,634 , 61.61
The small -ve values in the iommu_unmap() are due to the core code calling iommu_pgsize() before invoking the domain op. This is unncessary with this implementation. Future work optimizes this and gets to 2%, 4%, 3%.
Tested-by: Alejandro Jimenez alejandro.j.jimenez@oracle.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com
Reviewed-by: Kevin Tian kevin.tian@intel.com