On Thu, 15 Aug 2024, Petr Vorel wrote:
On Fri, 12 Jul 2024, Jeff Layton wrote:
On Fri, 2024-07-12 at 16:12 +1000, NeilBrown wrote:
My point is that if we are going to change the kernel to accommodate LTP at all, we should accommodate LTP as it is today. If we are going to change LTP to accommodate the kernel, then it should accommodate the kernel as it is today.
The problem is that there is no way for userland tell the difference between the older and newer behavior. That was what I was suggesting we add.
To make sure I wasn't talking through my hat, I had a look at the ltp code.
The test in question simply tests that the count of RPC calls increases.
It can get the count of RPC calls in one of 2 ways : 1/ "lhost" - look directly in /proc/net/rpc/{nfs,nfsd} 2/ "rhost" - ssh to the server and look in that file.
FYI "rhost" in LTP can be either using namespaces (Single Host Configuration [1]), which is run by default, or SSH based (Two Host Configuration [2]). IMHO most of the testers (including myself run tests simply via network namespaces).
NOTE: I suppose CONFIG_NAMESPACES=y is a must for 'ip netns' to be working, thus tests would hopefully failed early on kernel having that disabled.
The current test to "fix" this for kernels -ge "6.9" is to force the use of "rhost".
I'm guessing that always using "rhost" for the nfsd stats would always work.
FYI this old commit [3] allowed these tests to be working in network namespaces. It reads for network namespaces both /proc/net/rpc/{nfs,nfsd} from non-namespace ("lhost"). This is the subject of the change in 6.9, which now fails. And for SSH based we obviously look on "rhost" already.
That patch looks like a mistake. The author noticed that the rpc stats were not "namespacified" and instead of reporting the bug (and surely the whole point of a test suite is to report bugs), they made a change that seems completely unnecessary which had the effect of entrenching the bug. Unfortunately the commit message only says why it is same to make the change, not why it us useful.
Fully agree. With nowadays experiences I would have asked him to discuss that on this ML. Ironically, Alexey back then (as part of Oracle) did had report and even fix few network protocol related bugs, did some testing for NFS related fixes.
But if not, the code could get both the local and remote nfsd stats, and check that at least one of them increases (and neither decrease).
This sounds reasonable, thanks for a hint. I'll just look for client RPC calls (/proc/net/rpc/nfs) in both non-namespace and namespace. The only think is that we effectively give up checking where it should be (if it for whatever reason in the future changes again, we miss that). I'm not sure if this would be treated the same as the current situation (Josef Bacik had obvious reasons for this to be working).
Stats should always be visible in the relevant namespace. server stats should be visible in the name space where the server runs. client stats should be visible in the namespace where the filesystem is mounted. This has always been true (as long as we have had stats) and if it ever stops being true, that is a bug.
+1
I think the test suite should test for precisely this case. Testing if the stats are visible from a different namespace is not likely to be an interesting test - unless you want to guard against the possibility that we will one day accidentally de-namespaceify the stats (stranger things have happened).
There could be additional check for namespaces only (skip in SSH) that there is no information leak.
Kind regards, Petr
Thanks, NeilBrown
@Josef @NFS maintainers: WDYT?
Kind regards, Petr
So ltp doesn't need to know which kernel is being used - it can be written to work safely on either.
NeilBrown
[1] https://github.com/linux-test-project/ltp/tree/master/testcases/network#sing... [2] https://github.com/linux-test-project/ltp/tree/master/testcases/network#two-... [3] https://github.com/linux-test-project/ltp/commit/40958772f11d90e4b5052e7e772...
To be clear, I hold this opinion loosely. If the consensus is that we need to revert things then so be it. I just don't see the value of doing that in this particular situation. -- Jeff Layton jlayton@kernel.org