On 16 Feb 2018, at 06:39, Mike Galbraith efault@gmx.de wrote:
On Thu, 2018-02-15 at 19:13 +0100, Paolo Valente wrote:
On 14 Feb 2018, at 16:44, Jens Axboe axboe@kernel.dk wrote:
On 2/14/18 8:39 AM, Paolo Valente wrote:
On 14 Feb 2018, at 16:19, Jens Axboe axboe@kernel.dk wrote:
On 2/14/18 1:56 AM, Paolo Valente wrote:
> On 14 Feb 2018, at 08:15, Mike Galbraith efault@gmx.de wrote:
>
> On Wed, 2018-02-14 at 08:04 +0100, Mike Galbraith wrote:
>>
>> And _of course_, roughly two minutes later, IO stalled.
>
> P.S.
>
> crash> bt 19117
> PID: 19117  TASK: ffff8803d2dcd280  CPU: 7  COMMAND: "kworker/7:2"
>  #0 [ffff8803f7207bb8] __schedule at ffffffff81595e18
>  #1 [ffff8803f7207c40] schedule at ffffffff81596422
>  #2 [ffff8803f7207c50] io_schedule at ffffffff8108a832
>  #3 [ffff8803f7207c60] blk_mq_get_tag at ffffffff8129cd1e
>  #4 [ffff8803f7207cc0] blk_mq_get_request at ffffffff812987cc
>  #5 [ffff8803f7207d00] blk_mq_alloc_request at ffffffff81298a9a
>  #6 [ffff8803f7207d38] blk_get_request_flags at ffffffff8128e674
>  #7 [ffff8803f7207d60] scsi_execute at ffffffffa0025b58 [scsi_mod]
>  #8 [ffff8803f7207d98] scsi_test_unit_ready at ffffffffa002611c [scsi_mod]
>  #9 [ffff8803f7207df8] sd_check_events at ffffffffa0212747 [sd_mod]
> #10 [ffff8803f7207e20] disk_check_events at ffffffff812a0f85
> #11 [ffff8803f7207e78] process_one_work at ffffffff81079867
> #12 [ffff8803f7207eb8] worker_thread at ffffffff8107a127
> #13 [ffff8803f7207f10] kthread at ffffffff8107ef48
> #14 [ffff8803f7207f50] ret_from_fork at ffffffff816001a5
> crash>
This evidently has to do with tag pressure. I've looked for a way to easily reduce the number of tags online, so as to put your system in the bad spot deterministically, but to no avail. Does anyone know a way to do it?
The key here might be that it's not a regular file system request, which I'm sure bfq probably handles differently. So it's possible that you are slowly leaking those tags, and we end up in this miserable situation after a while.
Could you elaborate more on this? My mental model of bfq hooks in this respect is that they do only side operations, which AFAIK cannot block the putting of a tag. IOW, tag getting and putting is done outside bfq, regardless of what bfq does with I/O requests. Is there a flaw in this?
In any case, is there any flag or the like, in requests passed to bfq, that I could make bfq check, to raise some warning?
I'm completely guessing, and I don't know if this trace is always what Mike sees when things hang. It just seems suspect that we end up with a "special" request here, since I'm sure the regular file system requests outnumber them greatly. That raises my suspicion that the type is related.
But no, there should be no special handling on the freeing side; my guess was that BFQ ends them a bit differently.
Hi Jens, whatever the exact cause of the leak is, a leak does sound like a reasonable cause for these hangs. And if a leak is the cause, it seems to me that reducing tags to just 1 might help trigger the problem quickly and reliably on Mike's machine. If you agree, Jens, what would be the quickest/easiest way to reduce tags?
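To make the reasoning above concrete, here is a toy model of a fixed-size tag pool in which some completions never return their tag. It is only a sketch under stated assumptions: `TagPool`, `requests_until_stall` and the `leak_every` parameter are hypothetical names, the leak rate is invented for illustration and not measured, and nothing here reflects how blk-mq or bfq actually manage tags.

```python
class TagPool:
    """Toy fixed-size tag pool (hypothetical; not the kernel's tag allocator)."""

    def __init__(self, depth):
        self.free = list(range(depth))

    def get(self):
        # Return a free tag, or None at the point where a real
        # allocator would have to sleep waiting for a tag.
        return self.free.pop() if self.free else None

    def put(self, tag):
        self.free.append(tag)


def requests_until_stall(depth, leak_every):
    """Issue requests one at a time; every leak_every-th request
    (an assumed, made-up leak rate) never puts its tag back.
    Returns how many requests were issued before allocation fails."""
    pool = TagPool(depth)
    issued = 0
    while True:
        tag = pool.get()
        if tag is None:
            return issued
        issued += 1
        if issued % leak_every != 0:
            pool.put(tag)  # normal completion returns the tag
        # else: the tag is leaked and the pool shrinks by one


if __name__ == "__main__":
    for depth in (1, 2, 64):
        print(depth, requests_until_stall(depth, leak_every=1000))
```

In this model the number of requests served before the stall grows linearly with the pool depth, which is why shrinking the pool to a single tag would be expected to expose a leak after the first leaked request rather than after many.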
Whatever the cause, seems this wants some instrumentation that can be left in place for a while. I turned on CONFIG_BLK_DEBUG_FS for Jens, but the little bugger didn't raise its ugly head all day long.
What you need most is more reproducers.
Yeah, which is out of our control, however. That's why I'm nagging Jens, and other knowledgeable people, for that trick to hopefully push your system to quick failure.
Paolo