On Sun, Dec 09, 2018 at 11:44:19AM -0500, Theodore Y. Ts'o wrote:
On Sun, Dec 09, 2018 at 12:30:39PM +0100, Greg KH wrote:
P.P.P.S. If I were king, I'd be asking for a huge number of kunit tests for block-mq to be developed, and then running them under a Thread Sanitizer.
Isn't that what xfstests and fio are? Aren't we running this all the time and reporting those issues? How did this bug not show up in those tests? Is it just because they didn't run long enough?
Because of those test suites, I was thinking that the block and filesystem paths were among the more well-tested things we had at the moment. Is this not true?
I'm pretty confident about the file system paths, and the "happy paths" for the block layer.
But with Kernel Bugzilla #201685, despite huge amounts of testing both before and after 4.19-rc1, nothing picked it up. It turned out to be very configuration specific, *and* only happened when you were under heavy memory pressure and/or I/O pressure.
I'm starting to try to use blktests, but it's not as mature as xfstests. It has portability issues, as it assumes a much newer userspace. So I can't even run it under some environments at all. The test coverage just isn't as broad. Compare:
ext4/4k: 441 tests, 1 failures, 42 skipped, 4387 seconds
  Failures: generic/388
Versus:
Run: block/001 block/002 block/003 block/004 block/005 block/006 block/009 block/010 block/012 block/013 block/014 block/015 block/016 block/017 block/018 block/020 block/021 block/023 block/024 loop/001 loop/002 loop/003 loop/004 loop/005 loop/006 nvme/002 nvme/003 nvme/004 nvme/006 nvme/007 nvme/008 nvme/009 nvme/010 nvme/011 nvme/012 nvme/013 nvme/014 nvme/015 nvme/016 nvme/017 nvme/019 nvme/020 nvme/021 nvme/022 nvme/023 nvme/024 nvme/025 nvme/026 nvme/027 nvme/028 scsi/001 scsi/002 scsi/003 scsi/004 scsi/005 scsi/006 srp/001 srp/002 srp/003 srp/004 srp/005 srp/006 srp/007 srp/008 srp/009 srp/010 srp/011 srp/012 srp/013
Failures: block/017 block/024 nvme/002 nvme/003 nvme/008 nvme/009 nvme/010 nvme/011 nvme/012 nvme/013 nvme/014 nvme/015 nvme/016 nvme/019 nvme/020 nvme/021 nvme/022 nvme/023 nvme/024 nvme/025 nvme/026 nvme/027 nvme/028 scsi/006 srp/001 srp/002 srp/003 srp/004 srp/005 srp/006 srp/007 srp/008 srp/009 srp/010 srp/011 srp/012 srp/013
Failed 37 of 69 tests
(Most of the failures are test portability issues that I still need to work through, not real failures. But just look at the number of tests....)
So you are saying quantity rules over quality? :)
It's really hard to judge this, given that xfstests are testing a whole range of other things (POSIX compliance and stressing the vfs api), while blktests are there to stress the block i/o api/interface.
So both would be best to run as we know xfstests also hits the block layer...
thanks,
greg k-h