On Sat, Jun 20, 2020 at 11:03 PM Frank Rowand frowand.list@gmail.com wrote:
On 2020-06-20 01:44, David Gow wrote:
On Sat, Jun 20, 2020 at 1:58 AM Frank Rowand frowand.list@gmail.com wrote:
On 2020-06-16 07:08, Paolo Bonzini wrote:
On 15/06/20 21:07, Bird, Tim wrote:
> Finally,
> - Should a SKIP result be 'ok' (TAP13 spec) or 'not ok' (current kselftest practice)?
> See https://testanything.org/tap-version-13-specification.html
Oh! I totally missed this. Uhm. I think "not ok" makes sense to me: "it did not run successfully". ... but ... Uhhh ... how do XFAIL and SKIP relate? Neither SKIP nor XFAIL counts toward failure, though, so both should be "ok"? I guess we should change it to "ok".
See above for XFAIL.
I initially raised the issue with "SKIP" because I have a lot of tests that depend on hardware availability---for example, a test that does not run on some processor kinds (e.g. on AMD, or old Intel)---and for those SKIP should be considered a success.
No, SKIP should not be considered a success. It should also not be considered a failure. Please do not blur the lines between success, failure, and skipped.
I agree that skipped tests should be their own thing, separate from success and failure, but in practice they tend to behave more like a success than a failure.
I guess the important note here is that a suite of tests, some of which are SKIPped, can be listed as having passed, so long as none of them failed. So, the rule for "bubbling up" test results is that any failures cause the parent to fail, the parent is marked as skipped if _all_ subtests are skipped, and otherwise is marked as having succeeded. (Reversing the last part: having a suite be marked as skipped if _any_ of the subtests are skipped also makes sense, and has its advantages, but anecdotally seems less common in other systems.)
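As a rough sketch of the bubbling-up rule described above: the following is written in Python only because kunit.py is a Python script. The names `Status` and `bubble_up` are made up for illustration and are not KUnit's actual code, and treating an empty suite as passed is an assumption the thread does not address.

```python
from enum import Enum


class Status(Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"


def bubble_up(subtests):
    """Combine subtest statuses into a single parent status.

    Rule from the discussion: any failure fails the parent; the parent is
    skipped only if *all* subtests are skipped; otherwise it passes.
    """
    subtests = list(subtests)
    if any(s is Status.FAIL for s in subtests):
        return Status.FAIL
    # Assumption: an empty suite falls through and counts as passed.
    if subtests and all(s is Status.SKIP for s in subtests):
        return Status.SKIP
    return Status.PASS


if __name__ == "__main__":
    print(bubble_up([Status.PASS, Status.SKIP]))  # a partly skipped suite still passes
    print(bubble_up([Status.SKIP, Status.SKIP]))  # an entirely skipped suite is skipped
```

The alternative mentioned in the parenthetical (skip the parent if _any_ subtest is skipped) would only change the second check from `all` to `any`.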
That really caught my attention as something to be captured in the spec.
My initial response was that bubbling up results is the domain of the test analysis tools, not the test code.
KUnit is actually sitting in the middle. Results are bubbled up from individual tests to the test suites in-kernel (by the common KUnit code), as the suites are TAP tests (individual test cases being subtests), and so need to provide results. The kunit.py script then bubbles those results up (using the same rules) to print a summary.
If I were writing a test analysis tool, I would want the user to have the ability to configure the bubble up rules. Different use cases would desire different rules.
I tend to agree: it'd be nice if test analysis tools could implement different rules here. If we're using TAP subtests, though, the parent tests do need to return a result in the test code, so either that result needs to be test-specific (if the parent test is not just a simple union of its subtests), or it could be ignored by an analysis tool, which would follow its own rules. (In either case, it may make sense to be able to configure a test analysis tool to always fail or flag a test that has failed or skipped subtests, even if the test's own result is "ok", but not vice versa -- a test which failed would stay failed, even if all its subtests passed.)
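To make the "but not vice-versa" point concrete, here is a small Python sketch of how an analysis tool might re-evaluate a parent's reported result. The function name `reevaluate` and the `strict_skip` option are hypothetical and do not describe any existing tool; statuses are plain strings ("pass", "fail", "skip") to keep the example short.

```python
def reevaluate(reported, subtests, strict_skip=False):
    """Return the status an analysis tool would show for a parent test.

    reported    -- status the test code itself emitted for the parent
    subtests    -- list of subtest statuses
    strict_skip -- hypothetical option: flag the parent as skipped if any
                   subtest was skipped
    """
    # A parent that reported failure stays failed, whatever its subtests say.
    if reported == "fail":
        return "fail"
    # The tool may downgrade an "ok" parent that has a failed subtest.
    if "fail" in subtests:
        return "fail"
    # Optionally surface skipped subtests on an otherwise passing parent.
    if strict_skip and "skip" in subtests:
        return "skip"
    return reported


if __name__ == "__main__":
    print(reevaluate("pass", ["pass", "skip"]))
    print(reevaluate("pass", ["pass", "skip"], strict_skip=True))
    print(reevaluate("fail", ["pass", "pass"]))
```

The tool can only make a result stricter; it never upgrades a failure reported by the test code.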
My second response was to start thinking about whether the tests themselves should have any sort of bubble up implemented. I think it is a very interesting question. My current mindset is that each test is independent, and there is not a concept of an umbrella test that is the union of a set of subtests. But maybe there is value to umbrella tests. If there is a concept of umbrella tests then I think the spec should define how skip bubbles up.
KUnit suites are definitely that kind of "umbrella test" at the moment.
The other really brave thing one could do to break from the TAP specification would be to add a "skipped" value alongside "ok" and "not ok", and get rid of the whole "SKIP" directive/comment stuff. Possibly not worth the departure from the spec, but it would sidestep part of the problem.
I like being brave in this case. Elevating SKIP to be a peer of "ok" and "not ok" provides a clearer model in which SKIP is a first-class citizen. It also removes the muddled thinking that the current model promotes.
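For illustration only, a parser that accepts both styles might look like the Python sketch below. The `skipped N - description` line syntax is purely hypothetical (nothing in the thread or in TAP13 defines it), `parse_result` is a made-up name, and the sketch deliberately ignores TODO directives and the rest of the TAP grammar.

```python
import re

# "ok", "not ok", or the hypothetical first-class "skipped" status,
# followed by an optional test number and the rest of the line.
RESULT_RE = re.compile(r"(ok|not ok|skipped)\b\s*(\d+)?\s*(.*)")


def parse_result(line):
    """Parse one result line into (outcome, number, description).

    Returns None if the line is not a result line.
    """
    m = RESULT_RE.match(line.strip())
    if m is None:
        return None
    status, num, rest = m.groups()
    # Anything after '#' is a directive such as "SKIP <reason>".
    desc, _, directive = rest.partition("#")
    desc = desc.strip().lstrip("-").strip()
    if status == "skipped" or directive.strip().upper().startswith("SKIP"):
        # Covers 'ok ... # SKIP' (TAP13), 'not ok ... # SKIP' (kselftest
        # practice described above), and the hypothetical 'skipped'.
        outcome = "skip"
    elif status == "ok":
        outcome = "pass"
    else:
        outcome = "fail"
    return (outcome, int(num) if num else None, desc)


if __name__ == "__main__":
    for line in ("ok 1 - foo # SKIP no hardware",
                 "not ok 2 - bar",
                 "skipped 3 - baz"):
        print(parse_result(line))
```

With a dedicated status word, the outcome no longer depends on whether the producer chose "ok" or "not ok" in front of the SKIP directive.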
Cheers, -- David