Maximilian: you keep ignoring the reviewers that are listed in MAINTAINERS. This isn't acceptable. Next time, I will simply ignore your patches.
On Thu, 20 Nov 2025 14:02:49 +0000, Maximilian Dittgen <mdittgen@amazon.de> wrote:
At the moment, direct injection of vLPIs can only be enabled on an all-or-nothing, per-VM basis, causing unnecessary I/O performance loss in cases where a VM's vCPU count exceeds the host's available vPEs. This RFC introduces per-vCPU control over vLPI injection to recover that potential I/O performance gain in such situations.
Background
The value of dynamically enabling direct vLPI injection on a per-vCPU basis is the ability to run guest VMs with a mix of hardware-forwarded and software-forwarded message-signaled interrupts.
Currently, hardware-forwarded vLPI direct injection on a KVM guest requires GICv4 and is enabled on a per-VM, all-or-nothing basis. vLPI injection enablement happens in two stages:
1) At vGIC initialization, allocate direct injection structures for each vCPU (doorbell IRQ, vPE table entry, virtual pending table, vPEID).
2) When a PCI device is configured for passthrough, map its MSIs to vLPIs using the structures allocated in step 1.

Step 1 is all-or-nothing; if any vCPU cannot be configured with the vPE structures necessary for direct injection, the vPEs of all vCPUs are torn down and direct injection is disabled VM-wide (a simplified sketch of this flow follows).
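For illustration only, here is a minimal, self-contained C sketch of the all-or-nothing shape of step 1. The helper names (alloc_vpe_resources(), teardown_vpe_resources()) and the failure condition are made up for this example and are not the actual vgic code:

#include <stdbool.h>
#include <stdio.h>

#define NR_VCPUS	8

struct vpe {
	bool allocated;		/* doorbell IRQ, vPE table entry, VPT, vPEID */
};

/* Hypothetical stand-in for allocating one vCPU's direct-injection state. */
static bool alloc_vpe_resources(struct vpe *vpe, int vcpu)
{
	if (vcpu == NR_VCPUS - 1)	/* pretend the last vCPU gets no vPEID */
		return false;
	vpe->allocated = true;
	return true;
}

static void teardown_vpe_resources(struct vpe *vpe)
{
	vpe->allocated = false;
}

/*
 * Step 1 as described above: allocate for every vCPU; if any allocation
 * fails, tear down all of them and disable direct injection VM-wide.
 */
static bool init_direct_injection(struct vpe vpes[NR_VCPUS])
{
	for (int i = 0; i < NR_VCPUS; i++) {
		if (!alloc_vpe_resources(&vpes[i], i)) {
			for (int j = 0; j < i; j++)
				teardown_vpe_resources(&vpes[j]);
			return false;
		}
	}
	return true;
}

int main(void)
{
	struct vpe vpes[NR_VCPUS] = { 0 };

	printf("direct injection %s\n",
	       init_direct_injection(vpes) ? "enabled" : "disabled VM-wide");
	return 0;
}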
This VM-wide enablement of direct vLPI injection creates several issues, the most pressing being performance degradation on overcommitted hosts.
VM-wide vLPI enablement creates resource inefficiency when guest VMs have more vCPUs than the host has available vPEIDs. The number of vPEIDs (and consequently, vPEs) a host can allocate is constrained by hardware and defined by 2^(GICD_TYPER2.VID + 1) (ITS_MAX_VPEID). Since direct injection requires a vCPU to be assigned a vPEID, at most ITS_MAX_VPEID vCPUs can be configured for direct injection at a time. Because vLPI direct injection is all-or-nothing per VM, if a new guest VM would exhaust the remaining vPEIDs, all of that VM's vCPUs fall back to hypervisor-forwarded LPIs, causing considerable I/O performance degradation.
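To make the arithmetic concrete, here is a tiny standalone sketch; the VID value and the guest's vCPU count below are made up purely for illustration:

#include <stdio.h>

int main(void)
{
	unsigned int vid = 8;				/* made-up GICD_TYPER2.VID */
	unsigned int max_vpeid = 1u << (vid + 1);	/* ITS_MAX_VPEID = 2^(VID + 1) */
	unsigned int vm_vcpus = 600;			/* hypothetical overcommitted guest */

	printf("host vPEIDs: %u, guest vCPUs: %u\n", max_vpeid, vm_vcpus);

	/* Today this is all-or-nothing: exceeding the limit means every
	 * vCPU of the VM falls back to hypervisor-forwarded LPIs. */
	if (vm_vcpus > max_vpeid)
		printf("direct injection disabled for all %u vCPUs\n", vm_vcpus);

	return 0;
}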
Such degradation is most visible on hosts with CPU overcommitment: overcommitting an arbitrarily high number of vCPUs makes it easy for a VM's vCPU count to exceed the host's available vPEIDs.
Let it be crystal clear: GICv4 and overcommitment is a non-story. It isn't designed for that. If that's what you are trying to achieve, you clearly didn't get the memo.
Even with marginally more vCPUs than vPEIDs, the current all-or-nothing vLPI paradigm disables direct injection entirely. This creates two problems: first, a single many-vCPU overcommitted VM loses all direct injection despite the host still having vPEIDs available;
Are you saying that your HW is so undersized that you cannot create a *single VM* with direct injection? You really have fewer than 9 bits' worth of VPEIDs? I'm sorry, but that's laughable. Even a $200 dev board does better.
second, on multi-tenant hosts, VMs booted first consume all vPEIDs, leaving later VMs without direct injection regardless of their I/O intensity. Per-vCPU control would allow userspace to allocate available vPEIDs across VMs based on I/O workload rather than boot order or per-VM vCPU count. This per-vCPU granularity recovers most of the direct injection performance benefit instead of losing it completely.
To allow this per-vCPU granularity, this RFC introduces three new ioctls to the KVM API that let userspace activate and deactivate direct vLPI injection capability (and its associated resources) for individual vCPUs ad hoc during VM runtime.
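As an illustration of how a VMM might drive such an interface, here is a rough userspace sketch. The ioctl names and request numbers are placeholders invented for this example (the cover letter does not name the three ioctls here), and issuing them against the vCPU file descriptor is likewise an assumption; treat this purely as a sketch of the intended flow:

#include <stdio.h>
#include <sys/ioctl.h>

/* Placeholder request codes; 0xAE is the KVM ioctl type, but these
 * specific numbers are invented for illustration, not from the series. */
#define HYPOTHETICAL_VLPI_ENABLE	_IO(0xAE, 0xf0)
#define HYPOTHETICAL_VLPI_DISABLE	_IO(0xAE, 0xf1)

/* Enable direct vLPI injection only on the vCPUs that handle I/O-heavy
 * workloads, leaving the rest on hypervisor-forwarded LPIs. */
static int enable_vlpi_on_io_vcpus(const int *vcpu_fds, int nr_vcpus,
				   const int *io_heavy, int nr_io_heavy)
{
	for (int i = 0; i < nr_io_heavy; i++) {
		int idx = io_heavy[i];

		if (idx < 0 || idx >= nr_vcpus)
			continue;
		if (ioctl(vcpu_fds[idx], HYPOTHETICAL_VLPI_ENABLE) < 0) {
			perror("per-vCPU vLPI enable");
			return -1;
		}
	}
	return 0;
}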
How can that even work when changing the affinity of a (directly injected) vLPI to a vcpu that doesn't have direct injection enabled? You'd have to unmap the vLPI and plug it back as a normal LPI. Not only is this absolutely ridiculous from a performance perspective, but you are also guaranteed to lose interrupts that would have fired in the meantime. Losing interrupts is a total no-go.
Before I even look at the code, I want you to explain how you are dealing with this.
M.