Some setups, like SCSI, can throw spurious -EAGAIN off the softirq completion path. Normally we expect this to happen inline as part of submission, but apparently SCSI has a weird corner case where it can happen as part of normal completions.
This should be solved by having the -EAGAIN bubble back up the stack as part of submission, but previous attempts at this failed and we're just not quite there yet. Instead we currently use REQ_F_REISSUE to handle this case.
For now, catch it in io_rw_should_reissue() and prevent a reissue from a bogus path.
Cc: stable@vger.kernel.org
Reported-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6ba101cd4661..83f67d33bf67 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2447,6 +2447,12 @@ static bool io_rw_should_reissue(struct io_kiocb *req)
 	 */
 	if (percpu_ref_is_dying(&ctx->refs))
 		return false;
+	/*
+	 * Play it safe and assume not safe to re-import and reissue if we're
+	 * not in the original thread group (or in task context).
+	 */
+	if (!same_thread_group(req->task, current) || !in_task())
+		return false;
 	return true;
 }
 #else
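For readers who want to reason about the new guard outside the kernel, here is a minimal userspace sketch of the two checks visible in the hunk. It is only a model under stated assumptions: struct model_req, model_should_reissue() and model_selftest() are hypothetical names, the three booleans stand in for percpu_ref_is_dying(&ctx->refs), same_thread_group(req->task, current) and in_task(), and any earlier checks in io_rw_should_reissue() that the diff context does not show are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the state the kernel checks inspect. */
struct model_req {
	bool ctx_dying;         /* models percpu_ref_is_dying(&ctx->refs) */
	bool same_thread_group; /* models same_thread_group(req->task, current) */
	bool in_task;           /* models in_task() */
};

/*
 * Mirrors the order of the checks shown in the hunk. Checks that precede
 * them in the real function are outside the diff and are not modeled.
 */
static bool model_should_reissue(const struct model_req *req)
{
	if (req->ctx_dying)
		return false;
	/*
	 * The added guard: refuse the reissue unless we run in task context
	 * and in the thread group of the task that owns the request.
	 */
	if (!req->same_thread_group || !req->in_task)
		return false;
	return true;
}

/* Walks the four interesting cases; returns how many were checked. */
static int model_selftest(void)
{
	struct model_req inline_submit = { false, true, true };
	struct model_req softirq_completion = { false, true, false };
	struct model_req other_group = { false, false, true };
	struct model_req dying_ctx = { true, true, true };

	assert(model_should_reissue(&inline_submit));
	assert(!model_should_reissue(&softirq_completion));
	assert(!model_should_reissue(&other_group));
	assert(!model_should_reissue(&dying_ctx));
	return 4;
}
```

The softirq case is modeled with same_thread_group set to true on purpose: in softirq context, current is whichever task happened to be interrupted, so the in_task() half of the condition is what reliably rejects that path.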
On 27.07.21 at 18:58, Jens Axboe wrote:
Some setups, like SCSI, can throw spurious -EAGAIN off the softirq completion path. Normally we expect this to happen inline as part of submission, but apparently SCSI has a weird corner case where it can happen as part of normal completions.
This should be solved by having the -EAGAIN bubble back up the stack as part of submission, but previous attempts at this failed and we're just not quite there yet. Instead we currently use REQ_F_REISSUE to handle this case.
For now, catch it in io_rw_should_reissue() and prevent a reissue from a bogus path.
Cc: stable@vger.kernel.org
Reported-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6ba101cd4661..83f67d33bf67 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2447,6 +2447,12 @@ static bool io_rw_should_reissue(struct io_kiocb *req)
 	 */
 	if (percpu_ref_is_dying(&ctx->refs))
 		return false;
+	/*
+	 * Play it safe and assume not safe to re-import and reissue if we're
+	 * not in the original thread group (or in task context).
+	 */
+	if (!same_thread_group(req->task, current) || !in_task())
+		return false;
 	return true;
 }
 #else
Hi,
thank you for the fix! This does indeed prevent the panic (with 5.11.22) and hang (with 5.13.3) with my problematic workload.
Best Regards, Fabian
On 7/28/21 3:26 AM, Fabian Ebner wrote:
Hi,
thank you for the fix! This does indeed prevent the panic (with 5.11.22) and hang (with 5.13.3) with my problematic workload.
Perfect, thanks for re-testing! Can I add your Tested-by to the patch?
On 28.07.21 at 15:23, Jens Axboe wrote:
On 7/28/21 3:26 AM, Fabian Ebner wrote:
Hi,
thank you for the fix! This does indeed prevent the panic (with 5.11.22) and hang (with 5.13.3) with my problematic workload.
Perfect, thanks for re-testing! Can I add your Tested-by to the patch?
Sure, feel free to do so.
Best Regards, Fabian