From: Tian, Kevin kevin.tian@intel.com Sent: Monday, July 31, 2023 2:22 PM
From: Jason Gunthorpe jgg@nvidia.com Sent: Saturday, July 29, 2023 1:45 AM
On Mon, Jul 24, 2023 at 04:03:56AM -0700, Yi Liu wrote:
+/**
- struct iommu_resv_iova_ranges - ioctl(IOMMU_RESV_IOVA_RANGES)
- @size: sizeof(struct iommu_resv_iova_ranges)
- @dev_id: device to read resv iova ranges for
- @num_iovas: Input/Output total number of resv ranges for the device
- @__reserved: Must be 0
- @resv_iovas: Pointer to the output array of struct
iommu_resv_iova_range
- Query a device for ranges of reserved IOVAs. num_iovas will be set to
the
- total number of iovas and the resv_iovas[] will be filled in as space
- permits.
- On input num_iovas is the length of the resv_iovas array. On output it is
- the total number of iovas filled in. The ioctl will return -EMSGSIZE and
- set num_iovas to the required value if num_iovas is too small. In this
- case the caller should allocate a larger output array and re-issue the
- ioctl.
- Under nested translation, userspace should query the reserved IOVAs
for a
- given device, and report it to the stage-1 I/O page table owner to
exclude
- the reserved IOVAs. The reserved IOVAs can also be used to figure out
the
- allowed IOVA ranges for the IOAS that the device is attached to. For
detail
- see ioctl IOMMU_IOAS_IOVA_RANGES.
I'm not sure I like this, the other APIs here work with the *allowed* IOVAs, which is the inverse of this one which works with the *disallowed* IOVAs.
It means we can't take the output of this API and feed it into IOMMUFD_CMD_IOAS_ALLOW_IOVAS.. Though I suppose qemu isn't going to do this anyhow.
On the other hand, it is kind of hard to intersect an allowed list of multiple idevs into a single master list.
As it is, userspace will have to aggregate the list, sort it, merge adjacent overlapping reserved ranges then invert the list to get an allowed list. This is not entirely simple..
Did you already write an algorithm to do this in qemu someplace?
Qemu is optional to aggregate it for S2 given IOMMU_IOAS_IOVA_RANGES is still being used. If the only purpose of using this new cmd is to report per-device reserved ranges to the guest then aggregation is not required.
Arguably IOMMU_IOAS_IOVA_RANGES becomes redundant with this new cmd. But it's already there and as you said it's actually more convenient to be used if the user doesn't care about per-device reserved ranges...
Yes. it's not simple as userspace needs to do a lot ofwork to get the allowed iovas if multiple devices are attached to an IOAS.
Anyhow, this should be split out from this series. It seems simple enough to merge it now if someone can confirm what qemu needs.
Ok, so the reason is this new ioctl can be used to figure out allowd iovas. Right?
Regards, Yi Liu