Thanks so much for doing this! I think everyone agrees that we need _some_ way of documenting which tests to run, and I think this is our best option.
In any case, this patch does a lot of things, and I'll comment on them one by one. (It may be worth splitting this patch up into a few separate bits, if only so that we can better separate the uncontroversial bits from the open questions.)
On Thu, 16 Nov 2023 at 01:52, Nikolai Kondrashov Nikolai.Kondrashov@redhat.com wrote:
Introduce a new 'V:' ("Verify") field to MAINTAINERS. The field accepts a name of a test suite which is required to be executed for each contribution to the subsystem.
Yes -- this is exactly what I'd like. (As much as I'd love 'T' to have been available. Alas...)
The other thing discussed at plumbers was to include this in the 'maintainer profile', but having it as a separate MAINTAINERS entry is my preference, and is better for automation.
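To illustrate the automation point, here is a rough sketch of how a tool might pull 'V:' values out of MAINTAINERS-style text. It assumes the usual layout (entries separated by blank lines, a title line followed by "X:<tab>value" lines); the sample entries and the function name are made up for illustration and are not taken from the patch.

```python
import re

# Hypothetical sample in the MAINTAINERS layout; not real entries.
SAMPLE = (
    "EXAMPLE SUBSYSTEM\n"
    "M:\tA. Maintainer <maintainer@example.org>\n"
    "F:\tdrivers/example/\n"
    "V:\tkunit\n"
    "\n"
    "OTHER SUBSYSTEM\n"
    "M:\tB. Maintainer <other@example.org>\n"
    "F:\tdrivers/other/\n"
)


def v_fields_by_subsystem(maintainers_text):
    """Map each entry title to the values of its 'V:' lines."""
    result = {}
    # Entries are assumed to be separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", maintainers_text):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        title, fields = lines[0].strip(), lines[1:]
        values = [ln[2:].strip() for ln in fields if ln.startswith("V:")]
        if values:
            result[title] = values
    return result


print(v_fields_by_subsystem(SAMPLE))
```

Entries without a 'V:' line are simply left out of the result, so a caller can treat "no key" as "no required tests".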
The question of what the field actually contains brings us to...
Each referenced test suite is expected to be documented in the new Documentation/process/tests.rst file, which must have enough structure (documented inside) for the tools to make use of it. Apart from basic data, each test can refer to its "superset" - a test suite which this one is a part of. The expected use is to describe both a large test suite and its subsets, so the former would also be accepted, if a subsystem requires only a subset.
I think this could work, but it's a bit complicated.
My initial thought was to have this as a more free-form field, which contained either a:
- Path to a command to run (e.g. tools/testing/kunit/run_checks.py)
- Path to a documentation file describing the test
- URL to a page describing the test
- (Maybe) freeform text
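As a rough sketch of how tooling could tell those forms apart, something like the following might do. The heuristics, the `VKind` names, and `classify_v_field` are all hypothetical; no format has been agreed on, and this is only one way to slice it.

```python
import re
from enum import Enum


class VKind(Enum):
    URL = "url"
    COMMAND = "command"
    DOC = "doc"
    FREEFORM = "freeform"


# Hypothetical heuristics; nothing here reflects an agreed-upon format.
_URL_RE = re.compile(r"^https?://\S+$")
_PATH_RE = re.compile(r"^[\w.+-]+(/[\w.+-]+)+$")


def classify_v_field(value):
    """Guess which of the four forms a free-form 'V:' value takes."""
    value = value.strip()
    if _URL_RE.match(value):
        return VKind.URL
    if _PATH_RE.match(value):
        # Assumption: docs live under Documentation/ or end in .rst;
        # any other in-tree path is treated as a command to run.
        if value.startswith("Documentation/") or value.endswith(".rst"):
            return VKind.DOC
        return VKind.COMMAND
    return VKind.FREEFORM


for sample in (
    "tools/testing/kunit/run_checks.py",
    "Documentation/process/tests.rst",
    "https://example.org/how-to-test",
    "builds and boots on real hardware",
):
    print(sample, "->", classify_v_field(sample).value)
```

Anything that is neither a URL nor a slash-separated path falls through to freeform text, so the field stays permissive by default.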
It's probably worth also looking at this proposal to auto-generate similar documentation: https://lore.kernel.org/linux-kselftest/cover.1689171160.git.mchehab@kernel....
The other question is how to handle outdated results when a new patch revision is sent out. Personally, I think this is something we can solve similarly to 'Reviewed-by', depending on the extent of the changes and cost of the tests. I suspect for most automated tests, this would mean never carrying the 'Tested-with' tag over, but if testing it involved manually building and running kernels against 50 different hardware setups, I could imagine it making sense to not re-do this if a new revision just changed a doc typo. If a URL is used here, it could contain version info, too.
Introduce a new tag, 'Tested-with:', documented in the Documentation/process/submitting-patches.rst file. The tag is expected to reference the documented test suites, similarly to the 'V:' field, and to certify that the submitter executed the test suite on the change, and that it passed.
I'm also 100% for this, though I'd considered it separately from the MAINTAINERS change.
I think, in the ideal case, we want this to link to the results somehow. kcidb would seem to be the obvious choice there.
Again, as a fallback, a plain text field would be useful to describe cases where a patch was tested by some means other than a formal test suite. This might not be ideal, but I'd still rather have people describe that something "builds and boots on <x> hardware" than have to guess if a patch was tested at all.
Of course, it'd then be up to maintainers to decide what they'd accept: I'd expect that some would require there be a 'Tested-with' header which links to valid results for the tests described in MAINTAINERS.
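For illustration, a maintainer-side script could separate result links from plain-text descriptions along these lines. This is only a sketch: it assumes one "Tested-with: <value>" trailer per line, and the example URL, the regexes, and `split_tested_with` are hypothetical.

```python
import re

# Assumed trailer syntax: "Tested-with: <URL or free text>", one per line.
_TESTED_WITH_RE = re.compile(r"^Tested-with:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)
_URL_RE = re.compile(r"^https?://\S+$")


def split_tested_with(commit_message):
    """Return (links, notes) found in the message's Tested-with trailers."""
    links, notes = [], []
    for value in _TESTED_WITH_RE.findall(commit_message):
        (links if _URL_RE.match(value) else notes).append(value)
    return links, notes


MESSAGE = (
    "example: fix a thing\n"
    "\n"
    "Tested-with: https://results.example.org/run/1\n"
    "Tested-with: builds and boots on <x> hardware\n"
    "Signed-off-by: A Developer <dev@example.org>\n"
)

links, notes = split_tested_with(MESSAGE)
print("links:", links)
print("notes:", notes)
```

A stricter subsystem could then reject patches where `links` is empty, while a more relaxed one accepts either list being non-empty.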
Make scripts/checkpatch.pl ensure any added V: fields reference documented test suites only, and output a warning if a change to a subsystem doesn't certify the required test suites were executed, if any.
I'd definitely want something like this to run at some point in the patch-submission workflow. I think that, ultimately, we'll want to be able to run some tests automatically (e.g., a git hook could run the tests and add the 'Tested-with' line).
Personally, I'd like to require that all patches have a 'Tested-with' field, even if there's not a corresponding 'V' MAINTAINERS entry, as people should at least think of how something's tested, even if there's not a formal 'test suite' for it. Though that seems a longer-term goal.
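The kind of check being discussed could be sketched roughly as below. This is not how checkpatch.pl implements it (that is Perl, and lives in the patch); it is a simplified Python illustration that assumes 'Tested-with:' values are suite names matching the 'V:' names exactly, and it ignores the "superset" relationship entirely. The function names and warning text are made up.

```python
import re

# Assumed trailer syntax: one "Tested-with: <suite name>" per line.
_TESTED_WITH_RE = re.compile(r"^Tested-with:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def missing_suites(required, commit_message):
    """Return the required suites that no Tested-with trailer certifies."""
    certified = set(_TESTED_WITH_RE.findall(commit_message))
    return set(required) - certified


def check(required, commit_message):
    """Produce warnings (not errors) for each uncertified required suite."""
    return [
        f"WARNING: no Tested-with: tag for required test suite '{name}'"
        for name in sorted(missing_suites(required, commit_message))
    ]


MESSAGE = (
    "example: fix a thing\n"
    "\n"
    "Tested-with: kunit\n"
    "Signed-off-by: A Developer <dev@example.org>\n"
)

for warning in check({"kunit", "kselftest"}, MESSAGE):
    print(warning)
```

Passing an empty `required` set yields no warnings, which matches the case of a subsystem with no 'V:' entry.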
If the test suite description includes a "Command", then checkpatch.pl will output it as the one executing the suite. The command should run with only the kernel tree and the regular developer environment set up. But, at the same time, could simply output instructions for installing any extra dependencies (or pull some automatically). The idea is to get the developer into feedback loop quicker and easier, so they have something to run and iterate on, even if it involves installing some more stuff first. Therefore it's a good idea to add such wrappers to the kernel tree proper and refer to them from the tests.rst.
Extend scripts/get_maintainer.pl to support retrieving the V: fields, and scripts/parse-maintainers.pl to maintain their ordering.
Signed-off-by: Nikolai Kondrashov Nikolai.Kondrashov@redhat.com
The questions I think we need to answer to get this in are:
1. Do we want to split this up (and potentially land it piece-by-piece), or is it more valuable to have a stricter, more complete system from the get-go?
2. What format should the 'V' line take? If it is simply a test name, do we use a doc as suggested (or one generated in part from some other process), or something like a command name or URL? Can it just be freeform text?
3. Should 'Tested-with' be a test name in the same format as 'V', a URL to results (any URL, or just kcidb?), or freeform text? How does this evolve with multiple versions of patches?
4. How should this be enforced? A warning (not an 'error') from checkpatch? A separate script?
Personally, my gut feeling is that we should land the simplest, most minimal version of this (the 'V' field, as freeform text) now, and build on that as consensus and tooling permits. I'd probably also add the 'Tested-with' or similar tag, as freeform text, too. I don't think either of those would cause major problems if we needed to change or restrict the format later; I imagine there won't be a huge need to parse old commits for test data, and even if so, it wouldn't be too hard to ignore any which don't conform to any stricter future convention.
But I don't think there's anything fundamentally wrong with the full plan as-is, so if everyone's happy with it, I'd not object to having it.
Cheers,
-- David