Please apply to 4.4 and 4.9.
These kernel versions don't have aio poll, but the fix for POLLFREE with exclusive waiters is still applicable to them. This series resolves conflicts in all three patches, mostly due to POLLHUP and wait_queue_head_t having been renamed in more recent kernels.
Eric Biggers (3): wait: add wake_up_pollfree() binder: use wake_up_pollfree() signalfd: use wake_up_pollfree()
drivers/android/binder.c | 21 +++++++++------------ fs/signalfd.c | 12 +----------- include/linux/wait.h | 26 ++++++++++++++++++++++++++ kernel/sched/wait.c | 8 ++++++++ 4 files changed, 44 insertions(+), 23 deletions(-)
From: Eric Biggers ebiggers@google.com
commit 42288cb44c4b5fff7653bc392b583a2b8bd6a8c0 upstream.
Several ->poll() implementations are special in that they use a waitqueue whose lifetime is the current task, rather than the struct file as is normally the case. This is okay for blocking polls, since a blocking poll occurs within one task; however, non-blocking polls require another solution. This solution is for the queue to be cleared before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'.
However, that has a bug: wake_up_poll() calls __wake_up() with nr_exclusive=1. Therefore, if there are multiple "exclusive" waiters, and the wakeup function for the first one returns a positive value, only that one will be called. That's *not* what's needed for POLLFREE; POLLFREE is special in that it really needs to wake up everyone.
Considering the three non-blocking poll systems:
- io_uring poll doesn't handle POLLFREE at all, so it is broken anyway.
- aio poll is unaffected, since it doesn't support exclusive waits. However, that's fragile, as someone could add this feature later.
- epoll doesn't appear to be broken by this, since its wakeup function returns 0 when it sees POLLFREE. But this is fragile.
Although there is a workaround (see epoll), it's better to define a function which always sends POLLFREE to all waiters. Add such a function. Also make it verify that the queue really becomes empty after all waiters have been woken up.
Reported-by: Linus Torvalds torvalds@linux-foundation.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211209010455.42744-2-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com --- include/linux/wait.h | 26 ++++++++++++++++++++++++++ kernel/sched/wait.c | 8 ++++++++ 2 files changed, 34 insertions(+)
diff --git a/include/linux/wait.h b/include/linux/wait.h index 2408e8d5c05cc..e022ee95bd45c 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -202,6 +202,7 @@ void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode, void *key); void __wake_up_sync_key(wait_queue_head_t *q, unsigned int mode, int nr, void *key); void __wake_up_locked(wait_queue_head_t *q, unsigned int mode, int nr); void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr); +void __wake_up_pollfree(wait_queue_head_t *wq_head); void __wake_up_bit(wait_queue_head_t *, void *, int); int __wait_on_bit(wait_queue_head_t *, struct wait_bit_queue *, wait_bit_action_f *, unsigned); int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, wait_bit_action_f *, unsigned); @@ -236,6 +237,31 @@ wait_queue_head_t *bit_waitqueue(void *, int); #define wake_up_interruptible_sync_poll(x, m) \ __wake_up_sync_key((x), TASK_INTERRUPTIBLE, 1, (void *) (m))
+/** + * wake_up_pollfree - signal that a polled waitqueue is going away + * @wq_head: the wait queue head + * + * In the very rare cases where a ->poll() implementation uses a waitqueue whose + * lifetime is tied to a task rather than to the 'struct file' being polled, + * this function must be called before the waitqueue is freed so that + * non-blocking polls (e.g. epoll) are notified that the queue is going away. + * + * The caller must also RCU-delay the freeing of the wait_queue_head, e.g. via + * an explicit synchronize_rcu() or call_rcu(), or via SLAB_DESTROY_BY_RCU. + */ +static inline void wake_up_pollfree(wait_queue_head_t *wq_head) +{ + /* + * For performance reasons, we don't always take the queue lock here. + * Therefore, we might race with someone removing the last entry from + * the queue, and proceed while they still hold the queue lock. + * However, rcu_read_lock() is required to be held in such cases, so we + * can safely proceed with an RCU-delayed free. + */ + if (waitqueue_active(wq_head)) + __wake_up_pollfree(wq_head); +} + #define ___wait_cond_timeout(condition) \ ({ \ bool __cond = (condition); \ diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c index 9453efe9b25a6..133afaf05c3f4 100644 --- a/kernel/sched/wait.c +++ b/kernel/sched/wait.c @@ -10,6 +10,7 @@ #include <linux/wait.h> #include <linux/hash.h> #include <linux/kthread.h> +#include <linux/poll.h>
void __init_waitqueue_head(wait_queue_head_t *q, const char *name, struct lock_class_key *key) { @@ -156,6 +157,13 @@ void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr_exclusive) } EXPORT_SYMBOL_GPL(__wake_up_sync); /* For internal use only */
+void __wake_up_pollfree(wait_queue_head_t *wq_head) +{ + __wake_up(wq_head, TASK_NORMAL, 0, (void *)(POLLHUP | POLLFREE)); + /* POLLFREE must have cleared the queue. */ + WARN_ON_ONCE(waitqueue_active(wq_head)); +} + /* * Note: we use "set_current_state()" _after_ the wait-queue add, * because we need a memory barrier there on SMP, so that any
From: Eric Biggers ebiggers@google.com
commit a880b28a71e39013e357fd3adccd1d8a31bc69a8 upstream.
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll and aio poll are fortunately not affected by this, but it's very fragile. Thus, the new function wake_up_pollfree() has been introduced.
Convert binder to use wake_up_pollfree().
Reported-by: Linus Torvalds torvalds@linux-foundation.org Fixes: f5cb779ba163 ("ANDROID: binder: remove waitqueue when thread exits.") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211209010455.42744-3-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com --- drivers/android/binder.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c index bf047f16fce9e..31a204ebfa6c5 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -2641,21 +2641,18 @@ static int binder_free_thread(struct binder_proc *proc, }
/* - * If this thread used poll, make sure we remove the waitqueue - * from any epoll data structures holding it with POLLFREE. - * waitqueue_active() is safe to use here because we're holding - * the global lock. + * If this thread used poll, make sure we remove the waitqueue from any + * poll data structures holding it. */ - if ((thread->looper & BINDER_LOOPER_STATE_POLL) && - waitqueue_active(&thread->wait)) { - wake_up_poll(&thread->wait, POLLHUP | POLLFREE); - } + if (thread->looper & BINDER_LOOPER_STATE_POLL) + wake_up_pollfree(&thread->wait);
/* - * This is needed to avoid races between wake_up_poll() above and - * and ep_remove_waitqueue() called for other reasons (eg the epoll file - * descriptor being closed); ep_remove_waitqueue() holds an RCU read - * lock, so we can be sure it's done after calling synchronize_rcu(). + * This is needed to avoid races between wake_up_pollfree() above and + * someone else removing the last entry from the queue for other reasons + * (e.g. ep_remove_wait_queue() being called due to an epoll file + * descriptor being closed). Such other users hold an RCU read lock, so + * we can be sure they're done after we call synchronize_rcu(). */ if (thread->looper & BINDER_LOOPER_STATE_POLL) synchronize_rcu();
From: Eric Biggers ebiggers@google.com
commit 9537bae0da1f8d1e2361ab6d0479e8af7824e160 upstream.
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll and aio poll are fortunately not affected by this, but it's very fragile. Thus, the new function wake_up_pollfree() has been introduced.
Convert signalfd to use wake_up_pollfree().
Reported-by: Linus Torvalds torvalds@linux-foundation.org Fixes: d80e731ecab4 ("epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211209010455.42744-4-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com --- fs/signalfd.c | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-)
diff --git a/fs/signalfd.c b/fs/signalfd.c index 270221fcef42c..9c5fa0ab5e0fe 100644 --- a/fs/signalfd.c +++ b/fs/signalfd.c @@ -34,17 +34,7 @@
void signalfd_cleanup(struct sighand_struct *sighand) { - wait_queue_head_t *wqh = &sighand->signalfd_wqh; - /* - * The lockless check can race with remove_wait_queue() in progress, - * but in this case its caller should run under rcu_read_lock() and - * sighand_cachep is SLAB_DESTROY_BY_RCU, we can safely return. - */ - if (likely(!waitqueue_active(wqh))) - return; - - /* wait_queue_t->func(POLLFREE) should do remove_wait_queue() */ - wake_up_poll(wqh, POLLHUP | POLLFREE); + wake_up_pollfree(&sighand->signalfd_wqh); }
struct signalfd_ctx {
linux-stable-mirror@lists.linaro.org