On Thu, Jul 03, 2025 at 02:55:32PM -0300, Jason Gunthorpe wrote:
> On Thu, Jul 03, 2025 at 02:46:03PM +0000, Pranjal Shrivastava wrote:
> > Right.. I was however hoping we'd also trap commands like CMD_PRI_RESP and CMD_RESUME...I'm not sure if they should be accelerated via CMDQV.. I guess I'll need to look and understand a little more if they are..
> Right now these commands are not supported by vSMMUv3 in Linux.
> They probably should be trapped, but completing a PRI (or resuming a stall which we will treat the same) will go through the PRI/page fault logic in iommufd not the cache invalidate.
Ahh, thanks for this, that saved me a lot of time! And yes, I see some functions in eventq.c calling iopf_group_response(), which settles the CMD_RESUME. So.. I assume these resume commands would be trapped and *actually* executed through this or a similar path for vPRI.
Meh, I had been putting off reading up on the fault parts of iommufd, I guess I'll go through that too, now :)
> > > The goal of the SMMU driver when it detects CMDQV support is to route all supported invalidations to CMDQV queues and then balance those queues across CPUs to reduce lock contention.
> > I see.. that makes sense.. so it's a relatively small gain (but a nice one). Thanks for clarifying!
> On bare metal the gain is small (due to locking and balancing), while on virtualization the gain is huge (due to no trapping).
Ohh yes, I meant the bare metal gains here.. for virtualization, it's definitely huge (as reported too).
> Regardless the SMMU driver uses cmdqv support if the HW says it is there.
> Jason
Thanks!
Praan