On Mon, Jul 31, 2023 at 06:14:50AM +0000, Tian, Kevin wrote:
From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Saturday, July 29, 2023 1:17 AM
On Fri, Jul 28, 2023 at 10:07:58AM +0000, Tian, Kevin wrote:
From: Liu, Yi L <yi.l.liu@intel.com>
Sent: Monday, July 24, 2023 7:04 PM
This reports the device's reserved IOVA regions to userspace. This is needed for nested translation because userspace owns the stage-1 HWPT and needs to exclude the reserved IOVA regions from it, and hence from the device's DMA address space.
This can also be used to figure out allowed IOVAs of an IOAS.
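As a rough illustration of "figuring out allowed IOVAs" from a list of reserved regions, here is a minimal userspace sketch. It is not taken from the patch: struct iova_range and allowed_iovas() are hypothetical names, the struct is not the iommufd uAPI layout, and the reserved list is assumed to be sorted by start address and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive IOVA range; not the iommufd uAPI layout. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Subtract the reserved ranges (assumed sorted by start and
 * non-overlapping) from [0, max_iova] and store the usable ranges
 * in out[]. Returns the number of usable ranges, or -1 if out[]
 * is too small to hold them.
 */
static int allowed_iovas(const struct iova_range *resv, size_t nresv,
			 uint64_t max_iova,
			 struct iova_range *out, size_t nout)
{
	uint64_t cursor = 0;	/* lowest IOVA not yet classified */
	size_t n = 0;

	assert(out != NULL || nout == 0);

	for (size_t i = 0; i < nresv; i++) {
		if (resv[i].start > max_iova)
			break;
		/* Gap between the cursor and this reserved region is usable. */
		if (resv[i].start > cursor) {
			if (n == nout)
				return -1;
			out[n].start = cursor;
			out[n].last = resv[i].start - 1;
			n++;
		}
		/* Reserved region reaches the end of the address space. */
		if (resv[i].last >= max_iova)
			return (int)n;
		cursor = resv[i].last + 1;
	}

	/* Everything after the last reserved region is usable. */
	if (n == nout)
		return -1;
	out[n].start = cursor;
	out[n].last = max_iova;
	n++;
	return (int)n;
}
```

With two example reserved windows, 0x8000000-0x80fffff and 0xfee00000-0xfeefffff, in a 32-bit IOVA space, this yields three usable ranges on either side of and between the two windows. The addresses are illustrative values only, not what any particular device reports.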
We may need a special type to mark SW_MSI since it requires identity mapping in stage-1 instead of being reserved.
Only the kernel can do this, so there is no action for user space to take beyond knowing that it is not mappable IOVA.
The merit for "SW_MSI" may be to inform the rest of the system about the IOVA of the ITS page, but with the current situation that isn't required since only the kernel needs that information.
IIUC the guest kernel needs to know the "SW_MSI" region and then set up a 1:1 mapping for it in S1.
Yes, but qemu hardcodes this and for some reason people thought that was a good idea back when.
I think the long term way forward is to somehow arrange for the SW_MSI to not become mapped when creating the parent HWPT and instead cause the ITS page to be mapped through some explicit IOCTL.
Yes, this is a cleaner approach. Qemu selects the intermediate address of the vITS page and maps it to the physical ITS page in S2. Then the guest kernel just picks whatever "SW_MSI" address in S1 to map to the vITS, as it does today on bare metal.
Right, so I've been inclined to minimize the amount of special stuff created for this way of doing the MSI and hope we can reach a better way sooner rather than later.
Jason