On Fri, Dec 3, 2021 at 4:23 PM Eric Biggers <ebiggers@kernel.org> wrote:
> require another solution. This solution is for the queue to be cleared before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'.
Ugh.
I hate POLLFREE, and the more I look at this, the more I think it's broken.
And that
    wake_up_poll(wq, EPOLLHUP | POLLFREE);
in particular looks broken - the intent is that it should remove all the wait queue entries (because the wait queue head is going away), but wake_up_poll() itself actually does
    __wake_up(x, TASK_NORMAL, 1, poll_to_key(m))
where that '1' is the number of exclusive entries it will wake up.
So if there are two exclusive waiters, wake_up_poll() will simply stop waking things up after the first one.
Which defeats the whole POLLFREE thing too.
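To make the "nr_exclusive" point concrete, here is a rough userspace model of the wake-up loop. It is only a sketch: struct wq_entry, model_wake_up() and the "woken" field are made-up names for illustration, not the kernel's own definitions, and the model assumes every wake function succeeds. It walks the entries in order and stops once nr_exclusive exclusive entries have been woken, with 0 meaning "no limit".

```c
#include <stddef.h>

/* Simplified userspace model; NOT the kernel's actual data structures. */
#define WQ_FLAG_EXCLUSIVE 0x01

struct wq_entry {
	unsigned int flags;	/* WQ_FLAG_EXCLUSIVE or 0 */
	int woken;		/* set to 1 once the entry has been woken */
};

/*
 * Wake entries in queue order. Stop after nr_exclusive exclusive
 * entries have been woken; nr_exclusive == 0 never hits the break,
 * so every entry is woken. Returns the number of entries woken.
 */
static int model_wake_up(struct wq_entry *q, size_t n, int nr_exclusive)
{
	int woken = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		q[i].woken = 1;
		woken++;
		if ((q[i].flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
			break;
	}
	return woken;
}
```

With two exclusive entries and nr_exclusive == 1, the loop stops after the first one and the second entry is never touched, so it would never get to see the POLLFREE key. Passing 0 in this model reaches every entry.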
Maybe I'm missing something, but POLLFREE really is broken.
I'd argue that all of epoll() is broken, but I guess we're stuck with it.
Now, it's very possible that nobody actually uses exclusive waits for those wait queues, and my "nr_exclusive" argument is about something that isn't actually a bug in reality. But I think it's a sign of confusion, and it's just another issue with POLLFREE.
I really wish we could have some way to not have epoll and aio mess with the wait-queue lists and cache the wait queue head pointers that they don't own.
In the meantime, I don't think these patches make things worse, and they may fix things. But see above about "nr_exclusive" and how I think wait queue entries might end up avoiding POLLFREE handling..
Linus