On Fri, May 16, 2025 at 02:55:24PM +0200, David Hildenbrand wrote:
> On 16.05.25 14:29, Mark Brown wrote:
>> On Fri, May 16, 2025 at 10:02:16AM +0200, David Hildenbrand wrote:
>>> reason), what exactly is the problem with that?
>>> We run tests. If all pass, we're happy; if one fails, we
>>> investigate.
>> None of the tooling is able to distinguish between the multiple
>> tests that are being run in gup_longterm, or to compare the results
>> of multiple runs effectively. If all the tests run they report
>> themselves with the same name, so the individual results can't be
>> told apart.
> Okay, so this is purely to make tooling happy. Humans are smart
> enough to figure it out.
Not just the tools: humans interact with the selftests and their
results via tools (unless I'm actively working on something and
running the specific test for that thing, I'm unlikely to ever look
directly at the results...).
> What mechanism do we have in place to reliably prevent that from
> happening? And is this at least documented somewhere ("unique
> identifier for a test")?
It comes from TAP; I can't see a direct reference to anything in the
kernel documentation. The main thing enforcing this is people running
tooling noticing bad output, unfortunately.
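
To make that concrete, here's a made-up but representative fragment
of the kind of TAP output involved: the text after the result is the
only identifier a runner gets, so these three results are
indistinguishable from each other:

	ok 1 Should have worked
	ok 2 Should have worked
	not ok 3 Should have worked

The test number isn't a stable identifier either, since it depends on
how many tests happened to run beforehand.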
> I guess when using kselftest_harness, we get a single identifier
> per test (and much less output) automatically.
Nothing stops something using the harness from logging during the
test; harness tests actually tend to be a little chattier than a lot
of the things written directly to kselftest.h, since they log the
start and end of each test as well as the actual TAP result line as
standard.
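
For reference, this is roughly what a minimal harness test looks
like (a sketch with an invented test name, not anything in tree):

	/* Minimal kselftest_harness sketch: each TEST() produces
	 * exactly one TAP result line, here "global.single_result". */
	#include "../kselftest_harness.h"

	TEST(single_result)
	{
		/* A failing ASSERT/EXPECT is logged as a diagnostic,
		 * but the test still reports only one ok/not ok. */
		ASSERT_EQ(0, 0);
		EXPECT_NE(1, 0);
	}

	TEST_HARNESS_MAIN

On top of that single result line you get the start/end diagnostics
("# RUN ..."/"# OK ...") I mentioned, which is where the extra
chatter comes from.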
>> If a selftest is reporting multiple tests it should report them
>> with names that are stable and unique.
> I'm afraid we have other such tests that report duplicate
> conditions. cow.c is likely another candidate (written by me ;) ).
That one's not come up for me (this was one of four different patches for mm selftests I sent the other day cleaning up duplicate test names).
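
The pattern in those patches is basically to fold whatever varies
between runs into the reported string. A rough sketch of the idea
using plain kselftest.h (the case names here are invented, this isn't
the actual patch):

	/* Sketch: make each reported name unique by including the
	 * variant being exercised in the TAP result text. */
	#include "../kselftest.h"

	static void run_case(const char *variant)
	{
		/* ... exercise the case ... */
		ksft_test_result_pass("some test with %s\n", variant);
	}

	int main(void)
	{
		ksft_print_header();
		ksft_set_plan(2);
		run_case("MAP_SHARED");
		run_case("MAP_PRIVATE");
		ksft_finished();
	}

That keeps the reported names stable from run to run while making
each one unique, so results can actually be compared.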
> Probably the affected tests should be converted to use
> kselftest_harness, where we just report the result for a single
> test, and not the individual assertions.
> That would reduce the output of these tests drastically as well.
> So that is likely the way to clean this up properly and make
> tooling happy?
That'd certainly work, though doing that is more surgery on the test than I personally have the time/enthusiasm for right now.
Having the tests be chatty isn't a terrible thing, so long as they're
not so chatty that they cause execution time problems on a serial
console; it can be useful if they do blow up and you're looking at a
failure on a machine you only have automated access to.