On Tue, 2021-02-16 at 18:16 +0100, David Hildenbrand wrote: [...]
The discussion regarding migratability only really popped up because this is a user-visible thing and not being able to migrate can be a real problem (fragmentation, ZONE_MOVABLE, ...).
I think the biggest use will potentially come from hardware acceleration. If it becomes simple to add say encryption to a secret page with no cost, then no flag needed. However, if we only have a limited number of keys so once we run out no more encrypted memory then it becomes a costly resource and users might want a choice of being backed by encryption or not.
Right. But wouldn't HW support with configurable keys etc. need more syscall parameters (meaning, even memefd_secret() as it is would not be sufficient?). I suspect the simplistic flag approach might not be sufficient. I might be wrong because I have no clue about MKTME and friends.
The theory I was operating under is key management is automatic and hidden, but key scarcity can't be, so if you flag requesting hardware backing then you either get success (the kernel found a key) or failure (the kernel is out of keys). If we actually want to specify the key then we need an extra argument and we *must* have a new system call.
Anyhow, I still think extending memfd_create() might just be good enough - at least for now.
I really think this is the wrong approach for a user space ABI. If we think we'll ever need to move to a separate syscall, we should begin with one. The pain of trying to shift userspace from memfd_create to a new syscall would be enormous. It's not impossible (see clone3) but it's a pain we should avoid if we know it's coming.
Things like HW support might have requirements we don't even know yet and that we cannot even model in memfd_secret() right now.
This is the annoying problem with our Linux unbreakable ABI policy: we get to plan when the ABI is introduced for stuff we don't yet even know about.
James