On Tue, Feb 09, 2021 at 02:17:11PM +0100, Michal Hocko wrote:
On Tue 09-02-21 11:09:38, Mike Rapoport wrote:
On Tue, Feb 09, 2021 at 09:47:08AM +0100, Michal Hocko wrote:
OK, so IIUC this means that the model is to hand over memory from host to guest. I thought the guest would be under control of its address space and therefore it operates on the VMAs. This would benefit from an additional and more specific clarification.
How guest would operate on VMAs if the interface between host and guest is virtual hardware?
I have to say that I am not really familiar with this area so my view might be misleading or completely wrong. I thought that the HW address ranges are mapped to the guest process and therefore have a VMA.
There is a qemu process that currently has mappings of what guest sees as its physical memory, but qemu is a part of hypervisor, i.e. host.
Citing my older email:
I've hesitated whether to continue to use new flags to memfd_create() or to add a new system call and I've decided to use a new system call after I've started to look into man pages update. There would have been two completely independent descriptions and I think it would have been very confusing.
Could you elaborate? Unmapping from the kernel address space can work both for sealed or hugetlb memfds, no? Those features are completely orthogonal AFAICS. With a dedicated syscall you will need to introduce this functionality on top if that is required. Have you considered that? I mean hugetlb pages are used to back guest memory very often. Is this something that will be a secret memory usecase?
Please be really specific when giving arguments to back a new syscall decision.
Isn't "syscalls have completely independent description" specific enough?
We are talking about API here, not the implementation details whether secretmem supports large pages or not.
The purpose of memfd_create() is to create a file-like access to memory. The purpose of memfd_secret() is to create a way to access memory hidden from the kernel.
I don't think overloading memfd_create() with the secretmem flags because they happen to return a file descriptor will be better for users, but rather will be more confusing.