On Tue, Dec 21, 2021 at 08:36:33AM -0700, Jens Axboe wrote:
On 12/21/21 8:35 AM, Michael Kelley (LINUX) wrote:
From: Sasha Levin <sashal@kernel.org> Sent: Monday, December 20, 2021 5:58 PM
From: Jens Axboe <axboe@kernel.dk>
[ Upstream commit cb2ac2912a9ca7d3d26291c511939a41361d2d83 ]
Dexuan reports that he's seeing spikes of very heavy CPU utilization when running 24 disks and using the 'none' scheduler. This happens off the sched restart path, because SCSI requires the queue to be restarted async, and hence we're hammering on mod_delayed_work_on() to ensure that the work item gets run appropriately.
Avoid hammering on the timer and just use queue_work_on() if no delay has been specified.
Reported-and-tested-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@B...
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
 block/blk-core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index c2d912d0c976c..a728434fcff87 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1625,6 +1625,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
 int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
 				unsigned long delay)
 {
+	if (!delay)
+		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
 	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
 }
 EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
-- 
2.34.1
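
For readers who want to see the control flow of the patch in isolation, here is a minimal user-space sketch. It is not kernel code. The struct layouts, the `mock_*` helpers, the counters, and `model_kblockd_mod_delayed_work_on()` are all hypothetical stand-ins invented for illustration, and they model only whether the timer path is touched. The sketch shows the intent of the change (skip the timer when no delay is requested) and says nothing about the performance regressions discussed later in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; NOT the real kernel definitions. */
struct work_struct {
	bool pending;		/* work item is sitting on the queue */
};

struct delayed_work {
	struct work_struct work;
	bool timer_armed;	/* a delay timer is outstanding */
	unsigned long expires;	/* last delay requested, in arbitrary ticks */
};

/* Counters so the two paths can be told apart. */
static unsigned int timer_touches;
static unsigned int direct_queues;

/* Mock of queue_work_on(): queue the item unless it is already pending. */
static bool mock_queue_work_on(int cpu, struct work_struct *work)
{
	(void)cpu;
	direct_queues++;
	if (work->pending)
		return false;	/* already queued, nothing to do */
	work->pending = true;
	return true;
}

/* Mock of mod_delayed_work_on(): every call goes through the timer path. */
static bool mock_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
				     unsigned long delay)
{
	bool was_pending = dwork->timer_armed || dwork->work.pending;

	(void)cpu;
	timer_touches++;
	dwork->expires = delay;
	if (delay) {
		dwork->timer_armed = true;
	} else {
		dwork->timer_armed = false;
		dwork->work.pending = true;
	}
	return was_pending;
}

/* Same shape as the patched function: bypass the timer when delay == 0. */
static int model_kblockd_mod_delayed_work_on(int cpu,
					     struct delayed_work *dwork,
					     unsigned long delay)
{
	assert(dwork != NULL);
	if (!delay)
		return mock_queue_work_on(cpu, &dwork->work);
	return mock_mod_delayed_work_on(cpu, dwork, delay);
}
```

Calling `model_kblockd_mod_delayed_work_on()` repeatedly with a zero delay in this model only increments `direct_queues` and never `timer_touches`, which is the behaviour the commit message describes as avoiding "hammering on the timer".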
Sasha -- there are reports of this patch causing performance problems. See https://lore.kernel.org/lkml/1639853092.524jxfaem2.none@localhost/. I would suggest *not* backporting it to any of the stable branches until the issues are fully sorted out.
Both this and the revert were backported, which arguably doesn't make a lot of sense, but at least it's consistent and won't cause any issues...
The logic behind it is that it makes it easy for us, as well as everyone else, to annotate why a certain patch might be "missing" from the trees - in this case because it was reverted.
It looks dumb now, but it saves a lot of time and mitigates the risk of the patch being picked up again at some point in the future.