If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning.
Add the waitqueue earlier to ensure that any wakeups from cancelations are seen, and switch to using uninterruptible sleep so that postponed task_work additions get seen and processed.
While in there, correct a comment that should be IFF, not IIF.
Reported-by: syzbot+21e6887c0be14181206d@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index b4d5b8d168bf..738076264436 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9826,7 +9826,7 @@ static __cold void io_uring_drop_tctx_refs(struct task_struct *task)

 /*
  * Find any io_uring ctx that this task has registered or done IO on, and cancel
- * requests. @sqd should be not-null IIF it's an SQPOLL thread cancellation.
+ * requests. @sqd should be not-null IFF it's an SQPOLL thread cancellation.
  */
 static __cold void io_uring_cancel_generic(bool cancel_all,
 					   struct io_sq_data *sqd)
@@ -9851,6 +9851,8 @@ static __cold void io_uring_cancel_generic(bool cancel_all,
 		if (!inflight)
 			break;

+		prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
+
 		if (!sqd) {
 			struct io_tctx_node *node;
 			unsigned long index;
@@ -9868,8 +9870,9 @@ static __cold void io_uring_cancel_generic(bool cancel_all,
 							     cancel_all);
 		}

-		prepare_to_wait(&tctx->wait, &wait, TASK_UNINTERRUPTIBLE);
+		io_run_task_work();
 		io_uring_drop_tctx_refs(current);
+
 		/*
 		 * If we've seen completions, retry without waiting. This
 		 * avoids a race where a completion comes in before we did
If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning.
Add the waitqueue earlier to ensure that any wakeups from cancelations are seen, and switch to using uninterruptible sleep so that postponed task_work additions get seen and processed.
While in there, correct a comment that should be IFF, not IIF.
Reported-by: syzbot+21e6887c0be14181206d@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
v2 - don't move prepare_to_wait(), it'll run into issues with locking etc, and we don't need to as the inflight tracking guards against missing a wakeup for a completion.
diff --git a/fs/io_uring.c b/fs/io_uring.c
index b4d5b8d168bf..111db33b940e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9826,7 +9826,7 @@ static __cold void io_uring_drop_tctx_refs(struct task_struct *task)

 /*
  * Find any io_uring ctx that this task has registered or done IO on, and cancel
- * requests. @sqd should be not-null IIF it's an SQPOLL thread cancellation.
+ * requests. @sqd should be not-null IFF it's an SQPOLL thread cancellation.
  */
 static __cold void io_uring_cancel_generic(bool cancel_all,
 					   struct io_sq_data *sqd)
@@ -9868,8 +9868,10 @@ static __cold void io_uring_cancel_generic(bool cancel_all,
 							     cancel_all);
 		}

-		prepare_to_wait(&tctx->wait, &wait, TASK_UNINTERRUPTIBLE);
+		prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
+		io_run_task_work();
 		io_uring_drop_tctx_refs(current);
+
 		/*
 		 * If we've seen completions, retry without waiting. This
 		 * avoids a race where a completion comes in before we did
On 2021/12/10 12:16 AM, Jens Axboe wrote:
> If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning.
> Add the waitqueue earlier to ensure that any wakeups from cancelations are seen, and switch to using uninterruptible sleep so that postponed
^ typo
> task_work additions get seen and processed.
> While in there, correct a comment that should be IFF, not IIF.
> Reported-by: syzbot+21e6887c0be14181206d@syzkaller.appspotmail.com
> Cc: stable@vger.kernel.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> v2 - don't move prepare_to_wait(), it'll run into issues with locking etc, and we don't need to as the inflight tracking guards against missing a wakeup for a completion.
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index b4d5b8d168bf..111db33b940e 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -9826,7 +9826,7 @@ static __cold void io_uring_drop_tctx_refs(struct task_struct *task)
>
>  /*
>   * Find any io_uring ctx that this task has registered or done IO on, and cancel
> - * requests. @sqd should be not-null IIF it's an SQPOLL thread cancellation.
> + * requests. @sqd should be not-null IFF it's an SQPOLL thread cancellation.
>   */
>  static __cold void io_uring_cancel_generic(bool cancel_all,
>  					   struct io_sq_data *sqd)
> @@ -9868,8 +9868,10 @@ static __cold void io_uring_cancel_generic(bool cancel_all,
>  							     cancel_all);
>  		}
>
> -		prepare_to_wait(&tctx->wait, &wait, TASK_UNINTERRUPTIBLE);
> +		prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
> +		io_run_task_work();
>  		io_uring_drop_tctx_refs(current);
> +
>  		/*
>  		 * If we've seen completions, retry without waiting. This
>  		 * avoids a race where a completion comes in before we did
On 12/9/21 8:29 PM, Hao Xu wrote:
> On 2021/12/10 12:16 AM, Jens Axboe wrote:
>> If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning.
>> Add the waitqueue earlier to ensure that any wakeups from cancelations are seen, and switch to using uninterruptible sleep so that postponed
> ^ typo
Not really a typo, but should be killed from v2 for sure. I'll do that.
On 2021/12/10 12:22 PM, Jens Axboe wrote:
> On 12/9/21 8:29 PM, Hao Xu wrote:
>> On 2021/12/10 12:16 AM, Jens Axboe wrote:
>>> If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning.
>>> Add the waitqueue earlier to ensure that any wakeups from cancelations are seen, and switch to using uninterruptible sleep so that postponed
>> ^ typo
> Not really a typo, but should be killed from v2 for sure. I'll do that.
Don't know why the ^ char doesn't align with 'uninterruptible' ... here I mean 'uninterruptible' is a typo
On 12/10/21 12:31 AM, Hao Xu wrote:
> On 2021/12/10 12:22 PM, Jens Axboe wrote:
>> On 12/9/21 8:29 PM, Hao Xu wrote:
>>> On 2021/12/10 12:16 AM, Jens Axboe wrote:
>>>> If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning.
>>>> Add the waitqueue earlier to ensure that any wakeups from cancelations are seen, and switch to using uninterruptible sleep so that postponed
>>> ^ typo
>> Not really a typo, but should be killed from v2 for sure. I'll do that.
> Don't know why the ^ char doesn't align with 'uninterruptible' ... here I mean 'uninterruptible' is a typo
Gotcha, I guess the end result is the same as I killed the section on moving the sleep.
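For readers following the thread, the stall described in the commit message can be illustrated with a rough single-threaded userspace model in C. This is only a sketch and not kernel code: `task_model`, `cancel_requests`, `run_task_work` and `cancel_loop` are hypothetical names invented for the illustration, and the model ignores wait queues, signals and locking. It assumes that every canceled request completes only when the canceling task itself runs its deferred work, which is the situation the patch addresses by calling io_run_task_work() before the inflight re-check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task state; not a kernel structure. */
struct task_model {
	int inflight;     /* requests that have not completed yet */
	int pending_work; /* completions deferred to the task itself */
};

/* Models cancelation: each request is canceled, but its completion is
 * only queued as deferred work that the same task has to run. */
static void cancel_requests(struct task_model *t)
{
	t->pending_work = t->inflight;
}

/* Models the role of io_run_task_work(): the task processes its own
 * deferred completions. Returns true if any work was run. */
static bool run_task_work(struct task_model *t)
{
	bool ran = t->pending_work > 0;

	assert(t->pending_work <= t->inflight);
	t->inflight -= t->pending_work;
	t->pending_work = 0;
	return ran;
}

/* Models the cancel loop. Returns the number of iterations needed to
 * drain all requests, or -1 if the task would go to sleep with nothing
 * left that could wake it up (the stall from the report). */
static int cancel_loop(struct task_model *t, bool run_work_before_sleep)
{
	int iterations = 0;

	for (;;) {
		int before = t->inflight;

		if (!before)
			return iterations;
		iterations++;

		cancel_requests(t);
		if (run_work_before_sleep)
			run_task_work(t);

		/* If we've seen completions, retry without waiting. */
		if (t->inflight != before)
			continue;

		/* A real task would sleep here. In this model only the
		 * task itself can finish the deferred work, so the
		 * sleep would never end. */
		return -1;
	}
}
```

With three requests in flight, calling `cancel_loop()` with `run_work_before_sleep` set to true drains everything in one iteration, while passing false returns -1 with all three requests still pending. The real fix also relies on the sleep being interruptible, which this model does not attempt to capture.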