On 2/23/23 1:41 PM, Michał Mirosław wrote:
On Thu, 23 Feb 2023 at 07:44, Muhammad Usama Anjum usama.anjum@collabora.com wrote:
On 2/22/23 4:48 PM, Michał Mirosław wrote:
On Wed, 22 Feb 2023 at 12:06, Muhammad Usama Anjum usama.anjum@collabora.com wrote:
[...]
> BTW, I think I assumed that both conditions (all flags in > required_flags and at least one in anyof_flags is present) need to be > true for the page to be selected - is this your intention? All the masks are optional. If all or any of the 3 masks are specified, the page flags must pass these masks to get selected.
This explanation contradicts in part the introductory paragraph, but this version seems more useful as you can pass all masks zero to have all pages selected.
Sorry, I wrote it wrongly. (All the masks are not optional.) Let me rephrase. All or at least any 1 of the 3 masks (required, any, exclude) must be specified. The return_mask must always be specified. Error is returned if all 3 masks (required, anyof, exclude) are zero or return_mask is zero.
Why do you need those restrictions? I'd guess it is valid to request a list of all pages with zero return_mask - this will return a compact list of used ranges of the virtual address space.
At the time, we are supporting 4 flags (PAGE_IS_WRITTEN, PAGE_IS_FILE, PAGE_IS_PRESENT and PAGE_IS_SWAPPED). The idea is that user mention his flags of interest in the return_mask. If he wants only 1 flag, he'll specify it. Definitely if user wants only 1 flag, initially it doesn't make any sense to mention in the return mask. But we want uniformity. If user want, 2 or more flags in returned, return_mask becomes compulsory. So to keep things simple and generic for any number of flags of interest returned, the return_mask must be specified even if the flag of interest is only 1.
I'm not sure why do we want uniformity in the case of 1 flag? If a user specifies a single required flag, I'd expect he doesn't need to look at the flags returned as those will duplicate the information from mere presence of a page. A user might also require a single flag, but want all of them returned. Both requests - return 1 flag and return 0 flags would give meaningful output, so why force one way or the other? Allowing two will also enable users to express the intent: they need either just a list of pages, or they need a list with per-page flags - the need would follow from the code structure or other factors.
We can add as much flexibility as much people ask by keeping code simple. But it is going to be dirty to add error check which detects if return_mask = 0 and if there is only 1 flag of interest mentioned by the user. The following mentioned error check is essential to return deterministic output. Do you think this case is worth it to support and we don't want to go with the generality for both 1 or more flag cases?
if (return_mask == 0 && hweight_long(required_mask | any_mask) != 1) return error;
After taking a while to understand this and compare with already present flag system, `negated flags` is comparatively difficult to understand while already present flags seem easier.
Maybe replacing negated_flags in the API with matched_values = ~negated_flags would make this better?
We compare having to understand XOR vs having to understand ordering of required_flags and excluded_flags.
There is no ordering in current masks scheme. No mask is preferable. For a page to get selected, all the definitions of the masks must be fulfilled. You have come up with good example that what if required_mask = exclude_mask. In this case, no page will fulfill the criterion and hence no page would be selected. It is user's fault that he isn't understanding the definitions of these masks correctly.
Now thinking about it, I can add a error check which would return error if a bit in required and excluded masks matches. Would you like it? Lets put this check in place. (Previously I'd left it for user's wisdom not to do this. If he'll specify same masks in them, he'll get no addresses out of the syscall.)
This error case is (one of) the problems I propose avoiding. You also need much more text to describe the requred/excluded flags interactions and edge cases than saying that a flag must have a value equal to corresponding bit in ~negated_flags to be matched by requried/anyof masks.
I've found excluded_mask very intuitive as compared to negated_mask which is so difficult to understand that I don't know how to use it correctly. Lets take an example, I want pages which are PAGE_IS_WRITTEN and are not PAGE_IS_FILE. In addition, the pages must be PAGE_IS_PRESENT or PAGE_IS_SWAPPED. This can be specified as:
required_mask = PAGE_IS_WRITTEN excluded_mask = PAGE_IS_FILE anyof_mask = PAGE_IS_PRESETNT | PAGE_IS_SWAP
(a) assume page_flags = 0b1111 skip page as 0b1111 & 0b0010 = true
(b) assume page_flags = 0b1001 select page as 0b1001 & 0b0010 = false
It seemed intuitive. Right? How would you achieve same thing with negated_mask?
required_mask = PAGE_IS_WRITTEN negated_mask = PAGE_IS_FILE anyof_mask = PAGE_IS_PRESETNT | PAGE_IS_SWAP
(1) assume page_flags = 0b1111 tested_flags = 0b1111 ^ 0b0010 = 0b1101
(2) assume page_flags = 0b1001 tested_flags = 0b1001 ^ 0b0010 = 0b1011
In (1), we wanted to skip pages which have PAGE_IS_FILE set. But negated_mask has just masked it and page is still getting tested if it should be selected and it would get selected. It is wrong.
In (2), the PAGE_IS_FILE bit of page_flags was 0 and got updated to 1 or PAGE_IS_FILE in tested_flags.
I require flags PAGE_IS_WRITTEN=1, PAGE_IS_FILE=0, so:
required_mask = PAGE_IS_WRITTEN | PAGE_IS_FILE; negated_flags = PAGE_IS_FILE; // flags I want zero
You want PAGE_IS_FILE to be zero and at the same time you are requiring the PAGE_IS_FILE. It is confusing. Lets go with excluded mask and excluded_mask must never have any bit matching with required_mask. Lets stay with this as it is intuitive and would be easy to use from the user's perspective. Andrei and Danylo had suggested these mask scheme and have use cases for this. Andrei and Danylo can please comment as well.
I also require one of PAGE_IS_PRESENT=1 or PAGE_IS_SWAP=1, so:
anyof_mask = PAGE_IS_PRESENT | PAGE_IS_SWAP;
Another case: I want to analyse a process' working set:
required_mask = 0; negated_flags = PAGE_IS_FILE; anyof_mask = PAGE_IS_FILE | PAGE_IS_WRITTEN;
-> gathering pages modified [WRITTEN=1] or not backed by a file [FILE=0].
To clarify a bit: negated_flags doesn't mask anything: the field inverts values of the flags (marks some "active low", if you consider electronic signal analogy).
Best Regards Michał Mirosław