On 10/19/22 10:25 AM, Greg Kroah-Hartman wrote:
On Wed, Oct 19, 2022 at 08:06:26AM -0700, Hugh Dickins wrote:
On Wed, 19 Oct 2022, Greg Kroah-Hartman wrote:
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 040b83fcecfb86f3225d3a5de7fd9b3fbccf83b4 ]
There are two problems can lead to lost wakeup:
- invalid wakeup on the wrong waitqueue:
For example, 2 * wake_batch tags are put, while only wake_batch threads are woken:
__sbq_wake_up atomic_cmpxchg -> reset wait_cnt __sbq_wake_up -> decrease wait_cnt ... __sbq_wake_up -> wait_cnt is decreased to 0 again atomic_cmpxchg sbq_index_atomic_inc -> increase wake_index wake_up_nr -> wake up and waitqueue might be empty sbq_index_atomic_inc -> increase again, one waitqueue is skipped wake_up_nr -> invalid wake up because old wakequeue might be empty
To fix the problem, increasing 'wake_index' before resetting 'wait_cnt'.
- 'wait_cnt' can be decreased while waitqueue is empty
As pointed out by Jan Kara, following race is possible:
CPU1 CPU2 __sbq_wake_up __sbq_wake_up sbq_wake_ptr() sbq_wake_ptr() -> the same wait_cnt = atomic_dec_return() /* decreased to 0 */ sbq_index_atomic_inc() /* move to next waitqueue */ atomic_set() /* reset wait_cnt */ wake_up_nr() /* wake up on the old waitqueue */ wait_cnt = atomic_dec_return() /* * decrease wait_cnt in the old * waitqueue, while it can be * empty. */
Fix the problem by waking up before updating 'wake_index' and 'wait_cnt'.
With this patch, noted that 'wait_cnt' is still decreased in the old empty waitqueue, however, the wakeup is redirected to a active waitqueue, and the extra decrement on the old empty waitqueue is not handled.
Fixes: 88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library") Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20220803121504.212071-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org
I have no authority on linux-block, but I'll say NAK to this one (and 517/862), and let Jens and Jan overrule me if they disagree.
This was the first of several 6.1-rc1 commits which had given me lost wakeups never suffered before; was not tagged Cc stable; and (unless I've missed it on lore) never had AUTOSEL posted to linux-block or linux-kernel.
Ok, thanks for the review. I'll drop both of the sbitmap.c changes and if people report issues and want them back, I'll be glad to revisit them then.
Sorry for being late, did see Hugh respond to the original auto-select as well, and was surprised to see it moving forward after that. Let's please drop them for now.