On 12.09.25 16:48, Nikita Kalyazin wrote:
On 12/09/2025 14:36, David Hildenbrand wrote:
On 11.09.25 12:15, Nikita Kalyazin wrote:
On 10/09/2025 22:23, James Houghton wrote:
On Tue, Sep 2, 2025 at 4:20 AM Kalyazin, Nikita <kalyazin@amazon.co.uk> wrote:
From: Nikita Kalyazin <kalyazin@amazon.com>
Hi Nikita,
Hi James,
Thanks for the review!
The write syscall populates guest_memfd with user-supplied data in a generic way, i.e. no vendor-specific preparation is performed. It is intended for non-CoCo setups where guest memory is not hardware-encrypted.
What's meant to happen if we do use this for CoCo VMs? I would expect write() to fail, but I don't see why it would (seems like we need/want a check that we aren't write()ing to private memory).
I am not so sure that write() should fail even in CoCo VMs if we access not-yet-prepared pages. My understanding was that the CoCoisation of the memory occurs during "preparation". But I may be wrong here.
But how do you handle the case where a page is actually inaccessible and must not be touched?
IOW, with CXL you could crash the host.
There is likely some state check missing, or it should be restricted to VM types.
Sorry, I'm missing the link between VM types and CXL. How are they related?
I think what you explain below clarifies it.
My thinking was that it is a regular (accessible) page until it is "prepared" by the CoCo hardware, which is currently tracked by the up-to-date flag. So it is safe to assume that, until it is "prepared", the page is accessible: it was allocated by filemap_grab_folio() -> filemap_alloc_folio() and hasn't been taken over by the CoCo hardware. What scenario can you see where that doesn't apply as of now?
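To make the assumption above concrete, here is a minimal userspace sketch of the state check being discussed. It is only a model, not the patch: struct model_folio, model_mark_prepared() and model_gmem_write() are hypothetical names, the single bool stands in for what folio_test_uptodate() reports, and the -EEXIST return value is an arbitrary choice for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a guest_memfd folio; not a kernel type. */
struct model_folio {
	bool uptodate;                        /* models folio_test_uptodate() */
	unsigned char data[MODEL_PAGE_SIZE];  /* page contents */
};

/*
 * Models the effect of kvm_gmem_mark_prepared() as described in this
 * thread: once called, the page is treated as no longer accessible.
 */
void model_mark_prepared(struct model_folio *folio)
{
	folio->uptodate = true;
}

/*
 * Models a write() into the page. Returns the number of bytes copied,
 * -EINVAL if the range does not fit in the page, or -EEXIST (arbitrary
 * errno for this sketch) if the page has already been prepared.
 */
long model_gmem_write(struct model_folio *folio, size_t off,
		      const void *buf, size_t len)
{
	/* Reject ranges that fall outside the page. */
	if (off > MODEL_PAGE_SIZE || len > MODEL_PAGE_SIZE - off)
		return -EINVAL;

	/* Assumption under discussion: not up-to-date means still accessible. */
	if (folio->uptodate)
		return -EEXIST;

	memcpy(folio->data + off, buf, len);
	return (long)len;
}
```

In this model a write succeeds only while the flag is clear, which is the property the up-to-date check is relied on for here. If the flag's meaning changes (for example to "was zeroed"), the check no longer says anything about accessibility.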
Thanks for clarifying, see below.
I am aware of an attempt to remove preparation tracking from guest_memfd, but it is still at an RFC stage AFAIK [1].
Do we know how this would interact with the direct-map removal?
I'm using folio_test_uptodate() to determine whether the page has been removed from the direct map, as kvm_gmem_mark_prepared() is what currently removes the page from the direct map and marks it as up-to-date. [2] is a Firecracker feature branch where the two work in combination.
Ah, okay. Yes, I recall from [1] that we wanted to change these semantics to "uptodate: was zeroed", with preparation essentially handled by the arch backend.