-----Original Message-----
From: Brendan Higgins
On Wed, Jun 10, 2020 at 06:11:06PM +0000, Bird, Tim wrote:
Some months ago I started work on a document to formalize how kselftest implements the TAP specification. However, I didn't finish that work. Maybe it's time to do so now.
kselftest has developed a few differences from the original TAP specification, and some extensions that I believe are worth documenting.
Essentially, we have created our own KTAP (kernel TAP) format. I think it is worth documenting our conventions, in order to keep everyone on the same page.
Below is a partially completed document on my understanding of KTAP, based on examination of some of the kselftest test output. I have not reconciled this with the kunit output format, which I believe has some differences (which maybe we should resolve before we get too far into this).
I submit the document now, before it is finished, because a patch was recently introduced to alter one of the result conventions (from SKIP='not ok' to SKIP='ok').
See the document included inline below.
====== start of ktap-doc-rfc.txt ======
[...]
--- from here on is not-yet-organized material
Tip:
- don't change the test plan based on skipped tests.
- it is better to report that a test case was skipped, than to not report it
- that is, don't adjust the number of test cases based on skipped tests
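As a rough illustration of the tip above, here is a minimal Python sketch of a harness that always plans for every test case and reports skipped ones explicitly. The function name, the tuple layout, and the choice of SKIP='ok' are assumptions made for the example, not something kselftest or KUnit defines.

```python
def emit_results(cases):
    """Return TAP-style lines for `cases`, a list of (name, status, reason).

    `status` is one of "pass", "fail" or "skip" (hypothetical values).
    """
    # The plan counts every test case, whether or not it ends up skipped.
    lines = ["TAP version 13", f"1..{len(cases)}"]
    for number, (name, status, reason) in enumerate(cases, start=1):
        if status == "skip":
            # Assumes the SKIP='ok' convention; the skip is reported, not hidden.
            lines.append(f"ok {number} {name} # SKIP {reason}")
        elif status == "pass":
            lines.append(f"ok {number} {name}")
        else:
            lines.append(f"not ok {number} {name}")
    return lines


if __name__ == "__main__":
    demo = [("test_a", "pass", ""), ("test_b", "skip", "no hardware"), ("test_c", "fail", "")]
    print("\n".join(emit_results(demo)))
```

With this shape, the plan line stays at 1..3 no matter how many of the three cases are skipped on a given machine.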
Other things to mention: TAP13 elements not used:
- yaml for diagnostic messages
We talked about this before, but I would like some way to get failed expectation/assertion information in the test in a consistent machine-parsable way. Currently we do the following:
    # Subtest: example
    1..1
    # example_simple_test: initializing
    # example_simple_test: EXPECTATION FAILED at lib/kunit/kunit-example-test.c:29
    Expected 1 + 1 == 3, but
        1 + 1 == 2
        3 == 3
    not ok 1 - example_simple_test
not ok 5 - example
Technically not TAP compliant, but no one seems to mind. I am okay with keeping it the way it is, but if we don't want it in the KTAP spec, we will need some kind of recourse.
So far, most of the CI systems don't parse out diagnostic data, so it doesn't really matter what the format is. If it's useful for humans, it's valuable as is. However, it would be nice if that could change. But without some formalization of the format of the diagnostic data, it's an intractable problem for CI systems to parse it. So it's really a chicken and egg problem. To solve it, we would have to determine what exactly needs to be provided on a consistent basis for diagnostic data across many tests. I think that it's too big a problem to handle right now. I'm not opposed to migrating to some structure with yaml in the future, but free form text output seems OK for now.
- reason: try to keep things line-based, since output from other things
may be interspersed with messages from the test itself
- TODO directive
Is this more of stating a fact or desire? We don't use TODO either, but it looks like it could be useful.
Just stating a fact. I didn't find TODO in either KUnit or selftest in November when I initially wrote this up. If TODO serves as a kind of XFAIL, it could be useful. I have nothing against it.
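Coming back to the failed-expectation diagnostics discussed above: as long as the output stays line-based, a CI system could pull out the failure location with something like the following Python sketch. The regular expression is an assumption modelled only on the single KUnit diagnostic line quoted in this thread (the ASSERTION alternative is a guess), and the function name is hypothetical. It is not a format that either framework promises.

```python
import re

# Assumed pattern, based on the one diagnostic line quoted in this thread.
EXPECT_RE = re.compile(
    r"^\s*# (?P<test>\S+): (?P<kind>EXPECTATION|ASSERTION) FAILED "
    r"at (?P<file>[^:\s]+):(?P<line>\d+)\s*$"
)


def find_failures(output):
    """Return (test, kind, file, line) tuples found in diagnostic lines."""
    failures = []
    for raw in output.splitlines():
        match = EXPECT_RE.match(raw)
        if match:
            failures.append(
                (match["test"], match["kind"], match["file"], int(match["line"]))
            )
    return failures


if __name__ == "__main__":
    sample = (
        "    # example_simple_test: initializing\n"
        "    # example_simple_test: EXPECTATION FAILED at lib/kunit/kunit-example-test.c:29\n"
        "    not ok 1 - example_simple_test\n"
    )
    print(find_failures(sample))
```

Anything beyond the location (the "Expected ..." text, for instance) is free-form today, which is exactly the part that would need formalizing before CI systems could rely on it.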
KTAP Extensions beyond TAP13:
- nesting
- via indentation
- indentation makes it easier for humans to read
- test identifier
- multiple parts, separated by ':'
Can you elaborate on this more? I am not sure what you mean.
An individual test case can have a name that is scoped by a containing test or test suite. For example:
    selftests: cpufreq: main.sh
This test identifier consists of the test system (selftests), the test area (cpufreq), and the test case name (main.sh). This one's a bit weird because the test case name is just the name of the program in that test area. The program itself doesn't output data in TAP format, and the harness uses its exit code to detect PASS/FAIL. If main.sh had multiple test cases, it might produce test identifiers like this:
    selftests: cpufreq: main: check_change_afinity_mask
    selftests: cpufreq: main: check_permissions_for_mask_operation
(Or it might just produce the last part of these strings, the testcase names, and the testcase id might be something generated by the harness or CI system.)
The value of having a single string to identify the testcase (like a uniform resource locator), is that it's easier to use the string to correlate results produced from different CI systems that are executing the same test.
- summary lines
- can be skipped by CI systems that do their own calculations
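To make the multi-part test identifier concrete, here is a small Python sketch that builds and splits identifiers separated by ':'. Both helper names are hypothetical, and the sketch assumes that no individual part contains a ':' of its own.

```python
def make_test_id(*parts):
    """Join scoping parts (test system, area, test case) into one identifier."""
    return ": ".join(part.strip() for part in parts)


def split_test_id(test_id):
    """Split an identifier back into its parts, ignoring spaces around ':'."""
    return [part.strip() for part in test_id.split(":")]


if __name__ == "__main__":
    test_id = make_test_id("selftests", "cpufreq", "main.sh")
    print(test_id)
    print(split_test_id(test_id))
```

Because the identifier is a single string, two CI systems running the same test can use it directly as a lookup key when correlating their results.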
Other notes:
- automatic assignment of result status based on exit code
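A rough Python sketch of how a harness might assign a result from an exit code follows. The numeric values (0 for pass, 4 for skip) reflect my understanding of the kselftest exit-code conventions and should be treated as assumptions here. The `skip_ok` switch, the function name, and the exact text after '#' are illustrative only.

```python
KSFT_PASS = 0  # assumed value of the kselftest "pass" exit code
KSFT_SKIP = 4  # assumed value of the kselftest "skip" exit code


def result_line(number, test_id, exit_code, skip_ok=True):
    """Build a result line for a test program that emits no TAP itself."""
    if exit_code == KSFT_PASS:
        return f"ok {number} {test_id}"
    if exit_code == KSFT_SKIP:
        # skip_ok selects between the SKIP='ok' and SKIP='not ok' conventions.
        status = "ok" if skip_ok else "not ok"
        return f"{status} {number} {test_id} # SKIP"
    # Any other exit code is reported as a failure, with the code as a hint.
    return f"not ok {number} {test_id} # exit={exit_code}"


if __name__ == "__main__":
    for code in (0, 4, 1):
        print(result_line(1, "selftests: cpufreq: main.sh", code))
```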
Tips:
- do NOT describe the result in the test line
- the test case description should be the same whether the test succeeds or fails
- use diagnostic lines to describe or explain results, if this is desirable
- test numbers are considered harmful
- test harnesses should use the test description as the identifier
- test numbers change when testcases are added or removed
- which means that results can't be compared between different versions of the test
- recommendations for diagnostic messages:
- reason for failure
- reason for skip
- diagnostic data should always precede the result line
- problem: harness may emit result before test can do assessment to determine reason for result
- this is what the kernel uses
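The point about test numbers can be sketched in Python: if results are keyed by description rather than by number, two runs can still be compared after test cases are added or removed. The regular expression and function names below are assumptions for illustration, and the sketch ignores result lines that have no description.

```python
import re

# Assumed shape of a result line: status, number, optional "- ", description,
# optional "# ..." directive.
RESULT_RE = re.compile(
    r"^\s*(?P<status>ok|not ok)\s+(?P<num>\d+)\s+(?:-\s+)?"
    r"(?P<desc>[^#]*?)\s*(?:#.*)?$"
)


def results_by_description(output):
    """Map each test description to its status, discarding test numbers."""
    results = {}
    for line in output.splitlines():
        match = RESULT_RE.match(line)
        if match:
            results[match["desc"]] = match["status"]
    return results


def changed(old_output, new_output):
    """Return {description: (old_status, new_status)} for tests that differ."""
    old = results_by_description(old_output)
    new = results_by_description(new_output)
    return {d: (old[d], new[d]) for d in old.keys() & new.keys() if old[d] != new[d]}


if __name__ == "__main__":
    old_run = "1..2\nok 1 check_a\nok 2 check_b\n"
    new_run = "1..3\nok 1 check_new\nok 2 check_a\nnot ok 3 check_b\n"
    print(changed(old_run, new_run))
```

In the demo, check_a moves from number 1 to number 2 between runs, yet it is still matched correctly because only the description is used as the key. This also shows why the description must not change with the outcome.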
Differences between kernel test result format and TAP13:
- in KTAP the "# SKIP" directive is placed after the description on the test result line
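For illustration, a result line with the "# SKIP" directive after the description could be picked apart with a Python sketch like the one below. The pattern and the returned dictionary layout are assumptions, not part of any existing parser.

```python
import re

# Assumed layout: "<status> <number> <description> [# SKIP <reason>]"
LINE_RE = re.compile(
    r"^(?P<status>ok|not ok) (?P<num>\d+) (?P<desc>.*?)"
    r"(?:\s+# (?P<directive>SKIP)\b\s*(?P<reason>.*))?$"
)


def parse_result_line(line):
    """Return a dict describing one result line, or None if it is not one."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "status": match["status"],
        "number": int(match["num"]),
        "description": match["desc"],
        "skipped": match["directive"] is not None,
        "reason": match["reason"] or "",
    }


if __name__ == "__main__":
    print(parse_result_line("ok 3 selftests: cpufreq: main.sh # SKIP needs root"))
    print(parse_result_line("not ok 1 foo"))
```

Treating the directive as a separate field keeps the description identical whether the test passes, fails, or is skipped.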
====== end of ktap-doc-rfc.txt ======
OK - that's the end of the RFC doc.
Here are a few questions:
- is this document desired or not?
- is it too long or too short?
- if the document is desired, where should it be placed?
I like it. I don't think we can rely on the TAP people updating their stuff based on my interactions with them. So having a spec which is actually maintained would be nice.
Maybe in Documentation/dev-tools/ ?
I'm leaning towards Documentation/dev-tools/test-results_format.rst
I assume somewhere under Documentation, and put into .rst format. Suggestions for a name and location are welcome.
- is this document accurate? I think KUNIT does a few things differently than this description.
- is the intent to have kunit and kselftest have the same output format? if so, then these should be rationalized.
Yeah, I think it would be nice if all test frameworks/libraries for the kernel output tests in the same language.
Agreed.
-- Tim