On Thu, Mar 30, 2023 at 3:27 PM Peter Xu peterx@redhat.com wrote:
On Thu, Mar 30, 2023 at 12:04:09PM -0700, Axel Rasmussen wrote:
On Thu, Mar 30, 2023 at 8:57 AM Peter Xu peterx@redhat.com wrote:
This is a proposal to revert commit 914eedcb9ba0ff53c33808.
I found this when writting a simple UFFDIO_API test to be the first unit test in this set. Two things breaks with the commit:
- UFFDIO_API check was lost and missing. According to man page, the
kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This check is needed if the api version will be extended in the future, or user app won't be able to identify which is a new kernel.
100% agreed, this was a mistake.
- Feature flags checks were removed, which means UFFDIO_API with a
feature that does not exist will also succeed. According to the man page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if unknown features passed in.
I still strongly disagree with reverting this part, my feeling is still that doing so makes things more complicated for no reason.
Re: David's point, it's clearly wrong to change semantics so a thing that used to work now fails. But this instead makes it more permissive
- existing userspace programs continue to work as-is, but *also* one
can achieve the same thing more simply (combine probing + configuration into one step). I don't see any problem with that, generally.
But, if David and others don't find my argument convincing, it isn't the end of the world. It just means I have to go update my userspace code to be a bit more complicated. :)
I'd say it's fine if it's your own program that you can in full control easily. :) Sorry again for not noticing that earlier.
There's one reason that we may consider keeping the behavior. IMHO it is when there're major softwares that uses the "wrong" ABI (let's say so; because it's not following the man pages). If you're aware any such major softwares (especially open sourced) will break due to this revert patch, please shoot.
Well, I did find one example, criu: https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/uffd.c#L266 It doesn't do the two-step probing process, it looks to me like it does what my patch was intending users to do:
It just asks for the requested features up-front, without any probing. And then it returns the subset of features it *actually* got, ostensibly so the caller can compare that vs. what it requested.
Then again, it looks like the caller doesn't *actually* compare the features it got vs. what it asked for. I don't know enough about criu to know if this is a bug, or if they actually just don't care. https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/uffd.c#L312
I also still think the man page is incorrect or at least incomplete no matter what we do here; we should be sure to update it as a follow-up.
So far it looks still fine if with this revert. Maybe I overlooked somewhere?
I'll add this into my todo, but with low priority. If you have suggestion already on how to improve the man page please do so before me!
My thinking is that it describes the bits and pieces, but doesn't explicitly describe end-to-end how to configure a userfaultfd using the two-step probing approach. (Or state that this is actually *required*, unless you only want to set features=0 in any case.)
Maybe it will be easiest if I just send a patch myself with what I'm thinking, and we can see what folks think. Always easier to just look at a patch vs. talking about it in the abstract. :)
-- Peter Xu