On 04/04/2025 18:12, Liam R. Howlett wrote:
+To authors of v7 series referenced in [1]
* Nikita Kalyazin <kalyazin@amazon.com> [250404 11:44]:
This series is built on top of Fuad's v7 "mapping guest_memfd backed memory at the host" [1].
I didn't see their addresses in the to/cc, so I added them to my response as I reference the v7 patch set below.
Hi Liam,
Thanks for the feedback and for extending the list.
With James's KVM userfault [2], it is possible to handle stage-2 faults in guest_memfd in userspace. However, KVM itself also triggers faults in guest_memfd in some cases, for example: PV interfaces like kvmclock, PV EOI and page table walking code when fetching the MMIO instruction on x86. It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3] that KVM would be accessing those pages via userspace page tables.
Thanks for being open about the technical call, but it would be better to capture the reasons and not the call date. I explain why in the linking section as well.
Thanks for bringing that up. The document mostly contains the decision itself. The main alternative considered previously was to temporarily reintroduce the pages into the direct map whenever a KVM-internal access is required. That came with significant complexity in guaranteeing correctness in all cases [1]. Since the memslot structure already contains a guest memory pointer supplied by userspace, KVM can use it directly when in the VMM or vCPU context. I will add this to the cover letter for the next version.
[1] https://lore.kernel.org/kvm/20240709132041.3625501-1-roypat@amazon.co.uk/T/#...
In order for such faults to be handled in userspace, guest_memfd needs to support userfaultfd.
Changes since v2 [4]:
- James: Fix sgp type when calling shmem_get_folio_gfp
- James: Improved vm_ops->fault() error handling
- James: Add and make use of the can_userfault() VMA operation
- James: Add UFFD_FEATURE_MINOR_GUEST_MEMFD feature flag
- James: Fix typos and add more checks in the test
Nikita
Please slow down...
This patch is at v3, the v7 patch that you are building off has lockdep issues [1] reported by one of the authors, and (sorry for sounding harsh about the v7 of that patch) the cover letter reads a bit more like an RFC than a set ready to go into linux-mm.
AFAIK the lockdep issue was reported on a v7 of a different change. I'm basing my series on [2] ("KVM: Mapping guest_memfd backed memory at the host for software protected VMs"), while the issue was reported on [3] ("KVM: Restricted mapping of guest_memfd at the host and arm64 support"), which is itself built on top of [2]. Please correct me if I'm missing something.
The key feature my series requires is the ability to mmap guest_memfd when the VM type allows it. My understanding is that no one is opposed to that as of now, which is why I assumed it was safe to build on top of it.
[2] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
[3] https://lore.kernel.org/all/diqz1puanquh.fsf@ackerleytng-ctop.c.googlers.com...
Maybe the lockdep issue is just a patch ordering thing or removed in a later patch set, but that's not mentioned in the discovery email?
What exactly is the goal here and the path forward for the rest of us trying to build on this once it's in mm-new/mm-unstable?
Note that mm-unstable is shared with a lot of other people through linux-next, and we are really trying to stop breaking stuff on them.
Obviously v7 cannot go in until it works with lockdep. Otherwise none of us can use lockdep, which is not okay.
Also, I am concerned about the amount of testing in the v7 and v3 patch sets that did not bring up a lockdep issue.
[1] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/T...
[3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAos...
If there is anything we need to know about the decisions in the call and that document, can you please pull it into this change log?
I don't think anyone can ensure Google will not rename docs to some other office theme tomorrow - as they famously ditch basically every name and application.
Also, most of the community does not want to go to a 17 page (and growing) spreadsheet to hunt down the facts when there is an acceptable and ideal place to document them in git. It's another barrier to entry for reviewing your code as well.
But please, don't take this suggestion as carte blanche for copying a conversation from the doc, just give us the technical reasons for your decisions as briefly as possible.
[4] https://lore.kernel.org/kvm/20250402160721.97596-1-kalyazin@amazon.com/T/
[1] https://lore.kernel.org/all/diqz1puanquh.fsf@ackerleytng-ctop.c.googlers.com...
Thanks, Liam