On Sun May 25, 2025 at 7:19 PM UTC, Ujwal Kundur wrote:
I'm afraid I'm too ignorant of this code to be able to suggest something good here. But can we just remove the comment and plumb the gopts through uffd_poll_thread()->uffd_handle_page_fault()->__copy_page()?
This is not pretty, but it lets us remove the global vars, which is clearly a step in the right direction.
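
To illustrate the plumbing idea, here is a minimal sketch of passing a context struct down a three-level call chain instead of reading a global. Everything in it is an assumption made for illustration: struct gopts_sketch, its fields other than the test_uffdio_copy_eexist flag name, and the *_sketch functions are hypothetical stand-ins, not the real selftest signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the options struct that replaces the globals. */
struct gopts_sketch {
	bool test_uffdio_copy_eexist;  /* flag name taken from this thread */
	size_t page_size;              /* assumed to be a power of two */
	unsigned long copies_done;     /* illustration-only counter */
	unsigned long eexist_retries;  /* illustration-only counter */
};

/* Innermost helper: consumes the flag from the context, not from a global. */
static int copy_page_sketch(struct gopts_sketch *gopts, unsigned long offset)
{
	(void)offset; /* a real helper would copy the page at this offset */
	gopts->copies_done++;
	if (gopts->test_uffdio_copy_eexist) {
		/* One-shot: clear the flag and record that a retry was exercised. */
		gopts->test_uffdio_copy_eexist = false;
		gopts->eexist_retries++;
	}
	return 0;
}

/* Middle layer: only forwards the context it was handed. */
static int handle_page_fault_sketch(struct gopts_sketch *gopts,
				    unsigned long addr)
{
	/* Round the faulting address down to its page boundary. */
	unsigned long offset = addr & ~(gopts->page_size - 1);

	return copy_page_sketch(gopts, offset);
}

/* Outer layer: the thread entry point would receive gopts as its argument. */
static int poll_thread_sketch(struct gopts_sketch *gopts,
			      const unsigned long *addrs, size_t n)
{
	assert(gopts != NULL);
	for (size_t i = 0; i < n; i++) {
		if (handle_page_fault_sketch(gopts, addrs[i]))
			return -1;
	}
	return 0;
}
```

The middle function gains a parameter it does not use itself, which is the "not pretty" part, but no state lives outside the struct.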
Perhaps Andrew can weigh in? If I understood this correctly, we're trying to assert that retrying a successful UFFDIO_COPY operation always results in EEXIST. This is being done in a somewhat racy fashion where a flag (test_uffdio_copy_eexist) is set every 10 seconds using alarm(2). IMO this is a flaky test; we should either:
- remove this variable and associated logic entirely (preferred)
- use a probability function to set this some percentage of the time instead of every 10 seconds
- use an async library that can replace the implementation without the use of global vars
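
As a rough sketch of the second option above, the decision to retry could come from a small PRNG whose state lives in a caller-owned struct, so neither alarm(2) nor a global is needed. The struct retry_policy_sketch, the xorshift32() helper, and should_retry_copy() are all hypothetical names introduced for illustration; nothing here is taken from the existing selftest code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-test policy: PRNG state plus the desired retry rate. */
struct retry_policy_sketch {
	uint32_t rng_state;   /* must be non-zero for xorshift32 */
	unsigned int percent; /* 0..100: how often to exercise the retry */
};

/* Marsaglia's 32-bit xorshift; deterministic for a given seed. */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Returns true on roughly `percent` out of every 100 calls. */
static bool should_retry_copy(struct retry_policy_sketch *p)
{
	assert(p->percent <= 100);
	return xorshift32(&p->rng_state) % 100 < p->percent;
}
```

Seeding rng_state from a fixed value would also make a failing run reproducible, which the timer-based flag does not offer.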
Sorry, I don't have an opinion on which of these is best (I can try to find some time to form an opinion on this later!), but:
Fixing the flakiness sounds great, but I would suggest decoupling that from the refactoring. If it's practical, focus on removing the globals first, while leaving the fundamental logic the same, even if it's bad. Then as a separate series, fix the logic.