On 08.07.24 19:49, Chuck Lever III wrote:
On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
On 02/07/2024 5:54 pm, Calum Mackay wrote:
> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> I see that Josef's changes were backported, as far back as longterm v5.4,
[...]
I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
As a refresher for the stable folken, Josef's changes make nfsstats silo'd, so they no longer show counts from the whole system, but only for NFS operations relating to the local net namespace. That is a surprising change for some users, tools, and testing.
I'm not clear on whether there are any rules/guidelines around LTS backports causing behavior changes that user tools, like nfsstat, might be impacted by.
The same rules that apply for Linus's tree (i.e. no userspace regressions.)
[...] If the rule is no userspace regressions, should we revert Josef’s NFS client-side changes on LTS?
This sounds like a regression in Linus's tree too, so why isn't it reverted there first?
There is a change in behavior in the upstream code, but Josef's patches fix an information leak and make the statistics more sensible in container environments. I'm not certain that should be considered a regression, but confess I don't know the regression rules to this fine a degree of detail.
Chuck pointed me to this thread (I had an eye on it already anyway) and asked for advice. Take everything I write here with a grain of salt, as this is a somewhat tricky situation, which makes it hard to predict how Linus would actually want to see this handled. Maybe I should have CCed him, but I doubt he cares right now; we should maybe bring him in if an actual user complains.
With that out of the way, let me write a few thoughts:
* That some test breaks is not a regression, as regressions are about "practical issues", not some ABI/API changes that only some tests care about. So if it's just a test that broke, update it.
* If a user reported something like "this change broke my app", it obviously would be something totally different. That has not happened yet afaics -- or did it? But from the discussion it sounds like that is something that will likely happen down the road. If that's the case, I'd say it's best to prevent that from happening.
* Not sure how Linus would react if a user complained that some workflow broke because rpc_stat is now per net namespace and shows different numbers (e.g. using a format that does not break any apps). It would likely depend on the actual case and how bad he would consider the information leak.
If it is indeed a regression, how can we go about retaining both behaviors (selectable by Kconfig or perhaps administrative UI)?
That likely might be the best idea if users report an actual regression due to this. But switching the format of any existing file creates quite some trouble, as others already mentioned in this thread. So maybe providing the newer format in a different file and allowing the older one to be disabled through a Kconfig setting might be the best way forward. Sure, it would take years until people have switched over, but that's how it is with our "no regressions" rule.
Does that help?
Ciao, Thorsten