On Wed, Oct 09, 2024, Oliver Upton wrote:
On Wed, Oct 09, 2024 at 07:36:03PM +0100, Marc Zyngier wrote:
As there is very little ordering in the KVM API, userspace can instantiate a half-baked GIC (missing its memory map, for example) at almost any time.
This means that, with the right timing, a thread running vcpu-0 can enter the kernel without a GIC configured and get a GIC created behind its back by another thread. Amusingly, it will pick up that GIC and start messing with the data structures without the GIC having been fully initialised.
Huh, I'm definitely missing something. Could you remind me where we open up this race between KVM_RUN && kvm_vgic_create()?
I'd thought the fact that the latter takes all the vCPU mutexes and checks if any vCPU in the VM has run would be enough to guard against such a race, but clearly not...
Any chance that fixing bugs where vCPU0 can be accessed (and run!) before it's fully online would help? E.g. if that closes the vCPU0 hole, maybe the vCPU1 case can be handled a bit more gracefully?
[*] https://lore.kernel.org/all/20241009150455.1057573-1-seanjc@google.com
Similarly, a thread running vcpu-1 can enter the kernel, and try to init the GIC that was previously created. Since this GIC isn't properly configured (no memory map), it fails to correctly initialise.
And that's the point where we decide to tear down the GIC, freeing all its resources. Behind vcpu-0's back. Things stop pretty abruptly, with a variety of symptoms. Clearly, this isn't good; we should be a bit more careful about this.
It is obvious that this guest is not viable, as it is missing some important part of its configuration. So instead of trying to tear bits of it down, let's just mark it as *dead*. It means that any further interaction from userspace will result in -EIO. The memory will be released on the "normal" path, when userspace gives up.
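For illustration only, the "mark it as dead" idea can be modelled in a few lines of plain C. This is a toy sketch, not the patch itself: struct toy_vm and every toy_* function are hypothetical stand-ins for the real KVM structures and entry points, and the -ENXIO returned by the failing init is an arbitrary choice for the example. Only the -EIO on later calls comes from the description above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a VM (not struct kvm). */
struct toy_vm {
    bool gic_created;
    bool gic_has_memory_map;
    bool gic_initialised;
    bool dead;              /* set once, never cleared */
};

/* Every userspace entry point checks the dead flag first. */
static int toy_vm_check_alive(const struct toy_vm *vm)
{
    return vm->dead ? -EIO : 0;
}

/* Userspace creates a GIC; nothing forces it to set a memory map. */
static int toy_vgic_create(struct toy_vm *vm)
{
    int ret = toy_vm_check_alive(vm);

    if (ret)
        return ret;
    if (vm->gic_created)
        return -EEXIST;
    vm->gic_created = true;
    return 0;
}

/*
 * Lazy init on first run. On failure, nothing is freed here: the VM is
 * only flagged as dead, so another thread still using the GIC never
 * sees its data structures disappear. Resources would be released on
 * the normal destroy path.
 */
static int toy_vgic_init(struct toy_vm *vm)
{
    if (!vm->gic_has_memory_map) {
        vm->dead = true;
        return -ENXIO;      /* error value chosen for illustration */
    }
    vm->gic_initialised = true;
    return 0;
}

/* Running a vCPU triggers the lazy GIC init if it is still pending. */
static int toy_vcpu_run(struct toy_vm *vm)
{
    int ret = toy_vm_check_alive(vm);

    if (ret)
        return ret;
    if (vm->gic_created && !vm->gic_initialised)
        return toy_vgic_init(vm);
    return 0;
}
```

In this model, creating a GIC without a memory map and then running a vCPU fails the init once; every later toy_vcpu_run() or toy_vgic_create() on the same VM returns -EIO.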