Hi,
On Mon, Apr 20, 2020 at 1:23 AM John Garry <john.garry(a)huawei.com> wrote:
>
> On 18/04/2020 03:43, Bart Van Assche wrote:
> > On 2020-04-16 04:18, John Garry wrote:
> >> If in blk_mq_dispatch_rq_list() we find no budget, then we break of the
> >> dispatch loop, but the request may keep the driver tag, evaulated
> >> in 'nxt' in the previous loop iteration.
> >>
> >> Fix by putting the driver tag for that request.
> >>
> >> Signed-off-by: John Garry <john.garry(a)huawei.com>
> >>
> >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >> index 8e56884fd2e9..a7785df2c944 100644
> >> --- a/block/blk-mq.c
> >> +++ b/block/blk-mq.c
> >> @@ -1222,8 +1222,10 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
> >> rq = list_first_entry(list, struct request, queuelist);
> >>
> >> hctx = rq->mq_hctx;
> >> - if (!got_budget && !blk_mq_get_dispatch_budget(hctx))
> >> + if (!got_budget && !blk_mq_get_dispatch_budget(hctx)) {
> >> + blk_mq_put_driver_tag(rq);
> >> break;
> >> + }
> >>
> >> if (!blk_mq_get_driver_tag(rq)) {
> >> /*
> >
> > Is this something that can only happen if q->mq_ops->queue_rq(hctx, &bd)
> > returns another value than BLK_STS_OK, BLK_STS_RESOURCE and
> > BLK_STS_DEV_RESOURCE?
>
> Right, as that case is handled in blk_mq_handle_dev_resource()
>
> If so, please add a comment in the source code
> > that explains this.
>
> So important that we should now do this in an extra patch?
>
> >
> > Is this perhaps a bug fix for 0bca799b9280 ("blk-mq: order getting
> > budget and driver tag")? If so, please mention this and add Cc tags for
> > the people who were Cc-ed on that patch.
>
> So it looks like 0bca799b9280 had a flaw, but I am not sure if anything
> got broken there and worthy of stable backport.
>
> I found this issue while debugging Ming's blk-mq cpu hotplug patchset,
> which I feel is ready to merge.
>
> Having said that, this nasty issue did take > 1 day for me to debug...
> so let me know.
As per the above conversation, presumably this should go to stable
then for any kernel that has commit 0bca799b9280 ("blk-mq: order
getting budget and driver tag")? For instance, I think 4.19 would be
affected? When I picked it there I got a conflict due to not having
commit ea4f995ee8b8 ("blk-mq: cache request hardware queue mapping")
but I think it's just a context collision and easy to resolve.
I'm no expert in the block code, but I posted my backport to 4.19 at
<https://crrev.com/c/2163313>. I'm happy to send an email as a patch
to the list too or double-check that someone else's conflict
resolution matches mine.
-Doug
The patch titled
Subject: eventpoll: fix missing wakeup for ovflist in ep_poll_callback
has been added to the -mm tree. Its filename is
eventpoll-fix-missing-wakeup-for-ovflist-in-ep_poll_callback.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/eventpoll-fix-missing-wakeup-for-o…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/eventpoll-fix-missing-wakeup-for-o…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Khazhismel Kumykov <khazhy(a)google.com>
Subject: eventpoll: fix missing wakeup for ovflist in ep_poll_callback
In the event that we add to ovflist, before 339ddb53d373 we would be woken
up by ep_scan_ready_list, and did no wakeup in ep_poll_callback. With
that wakeup removed, if we add to ovflist here, we may never wake up.
Rather than adding back the ep_scan_ready_list wakeup - which was
resulting un uncessary wakeups, trigger a wake-up in ep_poll_callback.
We noticed that one of our workloads was missing wakeups starting with
339ddb53d373 and upon manual inspection, this wakeup seemed missing to me.
With this patch added, we no longer see missing wakeups. I haven't yet
tried to make a small reproducer, but the existing kselftests in
filesystem/epoll passed for me with this patch.
Link: http://lkml.kernel.org/r/20200424025057.118641-1-khazhy@google.com
Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll")
Signed-off-by: Khazhismel Kumykov <khazhy(a)google.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Roman Penyaev <rpenyaev(a)suse.de>
Cc: Heiher <r(a)hev.cc>
Cc: Jason Baron <jbaron(a)akamai.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/eventpoll.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/eventpoll.c~eventpoll-fix-missing-wakeup-for-ovflist-in-ep_poll_callback
+++ a/fs/eventpoll.c
@@ -1240,7 +1240,7 @@ static int ep_poll_callback(wait_queue_e
if (epi->next == EP_UNACTIVE_PTR &&
chain_epi_lockless(epi))
ep_pm_stay_awake_rcu(epi);
- goto out_unlock;
+ goto out_wakeup_unlock;
}
/* If this file is already in the ready list we exit soon */
@@ -1249,6 +1249,7 @@ static int ep_poll_callback(wait_queue_e
ep_pm_stay_awake_rcu(epi);
}
+out_wakeup_unlock:
/*
* Wake up ( if active ) both the eventpoll wait list and the ->poll()
* wait list.
_
Patches currently in -mm which might be from khazhy(a)google.com are
eventpoll-fix-missing-wakeup-for-ovflist-in-ep_poll_callback.patch