On systems that support shared guest memory, write() is useful, for example, for population of the initial image. Even though the same can also be achieved via userspace mapping and memcpying from userspace, write() provides a more performant option because it does not need to set user page tables and it does not cause a page fault for every page like memcpy would. Note that memcpy cannot be accelerated via MADV_POPULATE_WRITE as it is not supported by guest_memfd and relies on GUP.
Populating 512MiB of guest_memfd on a x86 machine: - via memcpy: 436 ms - via write: 202 ms (-54%)
Only PAGE_ALIGNED offset and len are allowed. Even though non-aligned writes are technically possible, when in-place conversion support is implemented [1], the restriction makes handling of mixed shared/private huge pages simpler. write() will only be allowed to populate shared pages.
When direct map removal is implemented [2] - write() will not be allowed to access pages that have already been removed from direct map - on completion, write() will remove the populated pages from direct map
While it is technically possible to implement read() syscall on systems with shared guest memory, it is not supported as there is currently no use case for it.
[1] https://lore.kernel.org/kvm/cover.1760731772.git.ackerleytng@google.com [2] https://lore.kernel.org/kvm/20250924151101.2225820-1-patrick.roy@campus.lmu....
Nikita Kalyazin (2): KVM: guest_memfd: add generic population via write KVM: selftests: update guest_memfd write tests
Documentation/virt/kvm/api.rst | 2 + include/linux/kvm_host.h | 2 +- include/uapi/linux/kvm.h | 1 + .../testing/selftests/kvm/guest_memfd_test.c | 58 +++++++++++++++++-- virt/kvm/guest_memfd.c | 52 +++++++++++++++++ 5 files changed, 108 insertions(+), 7 deletions(-)
base-commit: 8a4821412cf2c1429fffa07c012dd150f2edf78c -- 2.50.1