On Mon, Dec 6, 2021 at 11:54 AM Eric Biggers <ebiggers@kernel.org> wrote:
It could be fixed by converting signalfd and binder to use something like this, right?
#define wake_up_pollfree(x) \
	__wake_up(x, TASK_NORMAL, 0, poll_to_key(EPOLLHUP | POLLFREE))
Yeah, that looks better to me.
That said, maybe it would be even better to then make it more explicit about what it does, and not make it look like just another wakeup with an odd parameter.
IOW, maybe that "pollfree()" function itself could very much do the waitqueue entry removal on each entry using list_del_init(), and not expect the wakeup code for the entry to do so.
I think that kind of explicit "this removes all entries from the wait list head" is better than "let's do a wakeup and expect all entries to magically implicitly remove themselves".
After all, that "implicitly remove themselves" was what didn't happen, and caused the bug in the first place.
And all the normal cases, that don't care about POLLFREE at all, because their waitqueues don't go away from under them, wouldn't care, because "list_del_init()" still leaves a valid self-pointing list in place, so if they do list_del() afterwards, nothing happens.
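To make the list_del_init() argument concrete, here is a minimal userspace sketch, not kernel code. The list helpers are simplified stand-ins for the ones in <linux/list.h> (the real list_del() also poisons the pointers), and pollfree_remove_all() is a hypothetical name for the "remove every entry from the wait list head" operation described above.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Make a head (or a detached entry) point at itself. */
static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Simplified unlink: no pointer poisoning, unlike the kernel version. */
static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Unlink and leave the entry as a valid self-pointing list. */
static void list_del_init(struct list_head *entry)
{
	list_del(entry);
	INIT_LIST_HEAD(entry);
}

/*
 * Hypothetical helper: explicitly detach every entry from the head
 * instead of relying on each entry's wakeup callback to remove itself.
 * Returns the number of entries removed.
 */
static int pollfree_remove_all(struct list_head *head)
{
	int removed = 0;

	while (!list_empty(head)) {
		list_del_init(head->next);
		removed++;
	}
	return removed;
}
```

An entry that has already been detached this way can still be passed to list_del() later: it only rewrites the entry's own self-pointers, so the (possibly freed) head is never touched.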
I dunno. But yes, that wake_up_pollfree() of yours certainly looks better than what we have now.
As for eliminating POLLFREE entirely, that would require that the waitqueue heads be moved to a location which has a longer lifetime.
Yeah, the problem with aio and epoll is exactly that they end up using waitqueue heads without knowing what they are.
I'm not at all convinced that there aren't other situations where the thing the waitqueue head is embedded in might have other lifetimes.
The *common* situation is obviously that it's associated with a file, and the file pointer ends up holding the reference to whatever device or something (global list in a loadable module, or whatever) it is.
Hmm. The poll_wait() callback function actually does get the 'struct file *' that the wait is associated with. I wonder if epoll queueing could actually increment the file ref when it creates its own wait entry, and release it at ep_remove_wait_queue()?
Maybe epoll could avoid having to remove entries entirely that way - simply by virtue of having a ref to the files - and remove the need for having the ->whead pointer entirely (and remove the need for POLLFREE handling)?
And maybe the signalfd case can do the same - instead of expecting exit() to clean up the list when sighand->count goes to zero, maybe the signalfd filp can just hold a ref to that 'struct sighand_struct', and it gets free'd whenever there are no signalfd users left?
That would involve making signalfd_ctx actually tied to one particular 'struct signal', but that might be the right thing to do regardless (instead of making it always act on 'current' like it does now).
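A rough userspace sketch of that lifetime idea, under heavy assumptions: every name here (sighand_sketch, signalfd_ctx_sketch and the helpers) is hypothetical, a plain int stands in for the kernel's reference counting, and the "freed" flag exists only so the lifetime can be observed. It shows nothing more than the object that would hold the waitqueue head staying alive until the last signalfd user drops its reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the object that embeds the waitqueue head. */
struct sighand_sketch {
	int count;	/* plain int here; not the kernel's refcounting */
	int *freed;	/* set to 1 on free, for demonstration only */
};

/* Hypothetical signalfd context tied to one particular sighand. */
struct signalfd_ctx_sketch {
	struct sighand_sketch *sighand;
};

/* Allocate with one reference, held by the owning task. */
static struct sighand_sketch *sighand_sketch_alloc(int *freed)
{
	struct sighand_sketch *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	s->count = 1;
	s->freed = freed;
	*freed = 0;
	return s;
}

static void sighand_sketch_get(struct sighand_sketch *s)
{
	s->count++;
}

/* Drop one reference; free the object when the last one goes away. */
static void sighand_sketch_put(struct sighand_sketch *s)
{
	assert(s->count > 0);
	if (--s->count == 0) {
		*s->freed = 1;
		free(s);
	}
}

/* The signalfd takes its own reference when it is created... */
static void signalfd_sketch_open(struct signalfd_ctx_sketch *ctx,
				 struct sighand_sketch *s)
{
	sighand_sketch_get(s);
	ctx->sighand = s;
}

/* ...and drops it when the file is released. */
static void signalfd_sketch_release(struct signalfd_ctx_sketch *ctx)
{
	sighand_sketch_put(ctx->sighand);
	ctx->sighand = NULL;
}
```

With this shape, the task exiting only drops the task's own reference; the waitqueue head cannot disappear from under a poller while any signalfd still points at it.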
So maybe with some re-organization, we could get rid of the need for POLLFREE entirely.. Anybody?
But your patches are certainly simpler in that they just fix the status quo.
Linus