On Wed, Dec 2, 2020 at 8:21 PM Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
On Wed, Dec 2, 2020 at 1:57 PM David Gow <davidgow@google.com> wrote:
On Wed, Dec 2, 2020 at 6:06 PM Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
On Wed, Dec 02, 2020 at 09:51:19AM +0530, Arpitha Raghunandan wrote:
...
What I'm talking about is the output. How it will be implemented (using the same variable or differently) is up to you. So the point is I want to see the statistics of success/total at the end.
I think this should be done in KUnit rather than in the individual test cases.
I tend to agree here that this really is something for KUnit. At the moment, the tools/testing/kunit/kunit.py script will parse the kernel log and generate these sorts of statistics. I know that needing to run it through a script might seem like a step backwards, but there's no formal place for statistics in the KTAP specification[1] being worked on to standardise kselftest/kunit output formats.
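As a rough illustration of the counting such a script does, here is a minimal Python sketch. It is not the kunit.py implementation: the log sample is a simplified TAP-like fragment, `count_results` is a hypothetical helper, and the two `hypothetical_case_*` names are invented for the example.

```python
import re

# Matches TAP-style result lines such as "ok 1 - name" or "not ok 2 - name".
# Leading whitespace is allowed so indented result lines are matched too.
RESULT_RE = re.compile(r"^\s*(ok|not ok)\s+\d+\s+-\s+(.*)$")


def count_results(log_text):
    """Return (passed, total, failed_names) for TAP-like result lines."""
    passed = 0
    failed_names = []
    for line in log_text.splitlines():
        match = RESULT_RE.match(line)
        if match is None:
            continue  # version line, plan line, diagnostics, other log noise
        if match.group(1) == "ok":
            passed += 1
        else:
            failed_names.append(match.group(2))
    return passed, passed + len(failed_names), failed_names


# Simplified sample; the last two test names are made up for illustration.
SAMPLE_LOG = """\
TAP version 14
1..4
ok 1 - test_hexdump_set
not ok 2 - test_hexdump_overflow_set_ascii
ok 3 - hypothetical_case_c
ok 4 - hypothetical_case_d
"""

passed, total, failed = count_results(SAMPLE_LOG)
print(f"{passed}/{total} passed; failed: {failed}")
```

The point of the sketch is only that success/total falls out of the result lines for free once each case is reported on its own line.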
Then it sucks. Fix the specification (in the long term). And does it have a comment style of message in which we can have these statistics printed (but maybe not parsed)?
I should clarify: there's nothing in the spec which explicitly defines a place for such statistics (nor anything which requires them). There are "diagnostic" lines which are not parsed, and so it'd be possible to output statistics there. KUnit itself doesn't do so at present, but allows individual tests to log diagnostic lines which could be such statistics, particularly in cases like this where the full structure of the tests isn't quite exposed to the framework.
Note that there are other parsers for TAP-like formats which are being used with KUnit results, so systems like LAVA could also sum up these statistics. It's also possible, as Arpitha alluded to, to have the test dump them out as a comment.
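To make the diagnostic-line idea concrete, here is a small Python sketch of how a statistics comment could be written and later picked up by a tool that chooses to look for it. The "passed N of M" wording is purely an assumption for this example (nothing in the spec defines it), and `format_stats` and `read_stats` are hypothetical helpers.

```python
import re

# Hypothetical layout for a statistics diagnostic: "# <name>: passed N of M".
# This layout is an assumption made for the example, not part of any spec.
STATS_RE = re.compile(
    r"^\s*#\s*(?P<name>\S+): passed (?P<passed>\d+) of (?P<total>\d+)$"
)


def format_stats(name, passed, total):
    """Build a '#'-prefixed diagnostic line carrying success/total counts."""
    return f"# {name}: passed {passed} of {total}"


def read_stats(log_text):
    """Collect {name: (passed, total)} from statistics diagnostics, if any."""
    stats = {}
    for line in log_text.splitlines():
        match = STATS_RE.match(line)
        if match:
            stats[match["name"]] = (int(match["passed"]), int(match["total"]))
    return stats


# A result parser that only looks at "ok"/"not ok" lines would skip the
# diagnostic, while a human (or a tool that opts in) can still read it.
log = "\n".join([
    "1..1",
    format_stats("hexdump", 3, 4),
    "not ok 1 - hexdump",
])
print(read_stats(log))
```

Because the line starts with "#", parsers that do not know about it should simply treat it as a comment.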
Fine by me.
This won't actually work for this test as-is, though, as the KUnit version is running as a single giant test case (so KUnit believes that 1/1 tests have passed, rather than having any more-detailed statistics). It looks like there are a few ways to split it up a bit which would make it neater (a test each for the for() loops in test_hexdump_init() seems sensible to me), but at the moment, there's not really a way of programmatically generating test cases which KUnit then counts.
Fix it, please. We rely on these statistics quite heavily.
The hope is that the Parameterised Test feature will make this possible (though, as mentioned, there are a few other issues around then making those statistics available, but we should be able to work through those).
It may be a silly question, but what is it that makes these statistics useful in this test? Maybe I'm misunderstanding something, but I'd've thought that the important things were whether or not _all_ tests had passed and, if not, _which_ ones had failed. Is the count of failing cases within a test like this really that useful for debugging, or is it more for comparing against different versions? Either way, we'll try to make sure they're available.
The "Parameterised Tests"[2] work Arpitha has been doing ought to go some way towards helping here, though it won't solve this completely in its initial version. The problem is that parameterised tests are not yet reported individually in a way the kunit.py parser can handle cleanly, so each will still only be counted as one test until that's changed (though, at least, that shouldn't require any test-specific work).
My suggestion for the ultimate state of the test would be:
- Split up the test into separate KUnit tests for the different "categories" of tests (e.g., test_hexdump_set, test_hexdump_overflow_set_ascii, etc.).
- Replace the for loops in test_hexdump_init() with parameters, so that KUnit is aware of the original runs.
- Once KUnit and the tooling support it, these will be reported as subtests. (In the meantime, the results will be listed individually, commented out.)
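For the interim state in the last point, a tool could still recover per-parameter counts from the commented-out results. The Python sketch below assumes a line layout of "# <test>: ok N - <description>"; that layout, the parameter descriptions, and the `count_param_results` helper are assumptions for illustration and may not match what the parameterised test patches actually print.

```python
import re

# Assumed layout of a commented-out per-parameter result line:
#     "# <test name>: ok N - <parameter description>"
# This is a guess for illustration, not a confirmed KUnit output format.
PARAM_RE = re.compile(
    r"^\s*#\s*(?P<test>\w+): (?P<status>ok|not ok) \d+ - (?P<param>.*)$"
)


def count_param_results(log_text):
    """Return {test name: (passed, total)} from commented-out param results."""
    counts = {}
    for line in log_text.splitlines():
        match = PARAM_RE.match(line)
        if match is None:
            continue  # real result lines and unrelated diagnostics
        passed, total = counts.get(match["test"], (0, 0))
        if match["status"] == "ok":
            passed += 1
        counts[match["test"]] = (passed, total + 1)
    return counts


# Made-up sample: one test case run with three hypothetical parameters.
SAMPLE = """\
    # test_hexdump_set: ok 1 - hypothetical param 0
    # test_hexdump_set: not ok 2 - hypothetical param 1
    # test_hexdump_set: ok 3 - hypothetical param 2
    not ok 1 - test_hexdump_set
"""

print(count_param_results(SAMPLE))
```

The test case itself is still counted once by a plain result parser; the per-parameter numbers come only from the comment lines.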
I'm fine as long as we have this information printed to the user.
Okay -- Arpitha: does this seem like a sensible approach to you?
If printing it to the kernel log is really essential, I'll have a look into how we can do that in KUnit.
Of course, it'll take a while before all of those KUnit pieces are in place. I personally think that a good compromise would be to just do the first of these for now, which would make kunit_tool give at least a 4/4 rather than 1/1 result. Then, once the parameterised testing work is merged (and perhaps the tooling fixes are finished), the tests could be updated to take advantage of that.
How can we guarantee it will not be forgotten?
Thinking about it further, if we can get the parameterised testing stuff in 5.11 as planned, then any tooling/output fixes done later would automatically apply. Maybe rather than doing an intermediate version with just the first splitting up of tests, we try it with the current parameterised test patches, and we can possibly prototype some kernel-side statistics reporting which should work. To be honest, though, the subtest support on the kunit_tool side is likely to take quite a while, so it would be nice to have something (like statistics in the kernel log) which ameliorates the problem in the meantime.
I'll have a bit of a play around with the test output and parsing code this week and see if there's a simple change that can get us most of the way there. I think something should be possible if the test uses the Parameterised testing feature.
Cheers, -- David