On Wed, Aug 03, 2022 at 03:51:24PM +0000, Sean Christopherson wrote:
On Wed, Aug 03, 2022, Chao Peng wrote:
On Tue, Aug 02, 2022 at 04:38:55PM +0000, Sean Christopherson wrote:
On Tue, Aug 02, 2022, Sean Christopherson wrote:
I think we should avoid UNMAPPABLE even on the KVM side of things for the core memslots functionality and instead be very literal, e.g.
KVM_HAS_FD_BASED_MEMSLOTS KVM_MEM_FD_VALID
We'll still need KVM_HAS_USER_UNMAPPABLE_MEMORY, but it won't be tied directly to the memslot. Decoupling the two thingis will require a bit of extra work, but the code impact should be quite small, e.g. explicitly query and propagate MEMFILE_F_USER_INACCESSIBLE to kvm_memory_slot to track if a memslot can be private. And unless I'm missing something, it won't require an additional memslot flag. The biggest oddity (if we don't also add KVM_MEM_PRIVATE) is that KVM would effectively ignore the hva for fd-based memslots for VM types that don't support private memory, i.e. userspace can't opt out of using the fd-based backing, but that doesn't seem like a deal breaker.
I actually love this idea. I don't mind adding extra code for potential usage other than confidential VMs if we can have a workable solution for it.
Hrm, but basing private memory on top of a generic FD_VALID would effectively require shared memory to use hva-based memslots for confidential VMs. That'd yield a very weird API, e.g. non-confidential VMs could be backed entirely by fd-based memslots, but confidential VMs would be forced to use hva-based memslots.
It would work if we can treat userspace_addr as optional for KVM_MEM_FD_VALID, e.g. userspace can opt in to decide whether needing the mappable part or not for a regular VM and we can enforce KVM for confidential VMs. But the u64 type of userspace_addr doesn't allow us to express a 'null' value so sounds like we will end up needing another flag anyway.
In concept, we could have three cofigurations here:
- hva-only: without any flag and use userspace_addr;
- fd-only: another new flag is needed and use fd/offset;
- hva/fd mixed: both userspace_addr and fd/offset is effective. KVM_MEM_PRIVATE is a subset of it for confidential VMs. Not sure regular VM also wants this.
My mental model breaks things down slightly differently, though the end result is more or less the same.
After this series, there will be two types of memory: private and "regular" (I'm trying to avoid "shared"). "Regular" memory is always hva-based (userspace_addr), and private always fd-based (fd+offset).
In the future, if we want to support fd-based memory for "regular" memory, then as you said we'd need to add a new flag, and a new fd+offset pair.
At that point, we'd have two new (relatively to current) flags:
KVM_MEM_PRIVATE_FD_VALID KVM_MEM_FD_VALID
along with two new pairs of fd+offset (private_* and "regular"). Mapping those to your above list:
I previously thought we could reuse the private_fd (name should be changed) for regular VM as well so only need one pair of fd+offset, the meaning of the fd can be decided by the flag. But introducing two pairs of them may support extra usages like one fd for regular memory and another private_fd for private memory, though unsure this is a useful configuration.
- Neither *_FD_VALID flag set.
2a. Both PRIVATE_FD_VALID and FD_VALID are set 2b. FD_VALID is set and the VM doesn't support private memory 3. Only PRIVATE_FD_VALID is set (which private memory support in the VM).
Thus, "regular" VMs can't have a mix in a single memslot because they can't use private memory.
There is no direct relationship between unmappable and fd-based since even fd-based can also be mappable for regular VM?
Hmm, yes, for private memory we have special treatment in page fault handler and that is not applied to regular VM.
Yep.
Ignore this idea for now. If there's an actual use case for generic fd-based memory then we'll want a separate flag, fd, and offset, i.e. that support could be added independent of KVM_MEM_PRIVATE.
If we ignore this idea now (which I'm also fine), do you still think we need change KVM_MEM_PRIVATE to KVM_MEM_USER_UNMAPPBLE?
Hmm, no. After working through this, I think it's safe to say KVM_MEM_USER_UNMAPPABLE is bad name because we could end up with "regular" memory that's backed by an inaccessible (unmappable) file.
One alternative would be to call it KVM_MEM_PROTECTED. That shouldn't cause problems for the known use of "private" (TDX and SNP), and it gives us a little wiggle room, e.g. if we ever get a use case where VMs can share memory that is otherwise protected.
That's a pretty big "if" though, and odds are good we'd need more memslot flags and fd+offset pairs to allow differentiating "private" vs. "protected-shared" without forcing userspace to punch holes in memslots, so I don't know that hedging now will buy us anything.
So I'd say that if people think KVM_MEM_PRIVATE brings additional and meaningful clarity over KVM_MEM_PROTECTECD, then lets go with PRIVATE. But if PROTECTED is just as good, go with PROTECTED as it gives us a wee bit of wiggle room for the future.
Then I'd stay with PRIVATE.
Note, regardless of what name we settle on, I think it makes to do the KVM_PRIVATE_MEM_SLOTS => KVM_INTERNAL_MEM_SLOTS rename.
Agreed.
Chao