f5f9d4a314da moved the initialization of the reply cache into the nfsd startup, but it didn't account for the stats counters which can be accessed before nfsd is ever started, causing a NULL pointer dereference.
This is easy to trigger on some arches (like aarch64), but on x86_64, calling this_cpu_ptr(NULL) evidently returns a pointer to the fixed_percpu_data, which I guess this looks just enough like a newly initialized percpu var to allow nfsd_reply_cache_stats_show to access it without Oopsing.
Move the initialization of the per-net+per-cpu reply-cache counters back into nfsd_init_net, while leaving the rest of the reply cache allocations to be done at nfsd startup time.
Kudos to Eirik who did most of the legwork to track this down.
Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") Reported-and-Tested-by: Eirik Fuller efuller@redhat.com Signed-off-by: Jeff Layton jlayton@kernel.org --- fs/nfsd/cache.h | 2 ++ fs/nfsd/nfscache.c | 13 +++---------- fs/nfsd/nfsctl.c | 8 ++++++++ 3 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h index f21259ead64b..a4b12d6c41d3 100644 --- a/fs/nfsd/cache.h +++ b/fs/nfsd/cache.h @@ -80,6 +80,8 @@ enum {
int nfsd_drc_slab_create(void); void nfsd_drc_slab_free(void); +int nfsd_reply_cache_stats_init(struct nfsd_net *nn); +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn); int nfsd_reply_cache_init(struct nfsd_net *); void nfsd_reply_cache_shutdown(struct nfsd_net *); int nfsd_cache_lookup(struct svc_rqst *); diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 041faa13b852..b696dc629c0b 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void) kmem_cache_destroy(drc_slab); }
-static int nfsd_reply_cache_stats_init(struct nfsd_net *nn) +int nfsd_reply_cache_stats_init(struct nfsd_net *nn) { return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM); }
-static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) { nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM); } @@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) hashsize = nfsd_hashsize(nn->max_drc_entries); nn->maskbits = ilog2(hashsize);
- status = nfsd_reply_cache_stats_init(nn); - if (status) - goto out_nomem; - nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan; nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count; nn->nfsd_reply_cache_shrinker.seeks = 1; status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd-reply:%s", nn->nfsd_name); if (status) - goto out_stats_destroy; + return status;
nn->drc_hashtbl = kvzalloc(array_size(hashsize, sizeof(*nn->drc_hashtbl)), GFP_KERNEL); @@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) return 0; out_shrinker: unregister_shrinker(&nn->nfsd_reply_cache_shrinker); -out_stats_destroy: - nfsd_reply_cache_stats_destroy(nn); -out_nomem: printk(KERN_ERR "nfsd: failed to allocate reply cache\n"); return -ENOMEM; } diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 1489e0b703b4..7c837afcf615 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net) retval = nfsd_idmap_init(net); if (retval) goto out_idmap_error; + retval = nfsd_reply_cache_stats_init(nn); + if (retval) + goto out_repcache_error; nn->nfsd_versions = NULL; nn->nfsd4_minorversions = NULL; nfsd4_init_leases_net(nn); @@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net)
return 0;
+out_repcache_error: + nfsd_idmap_shutdown(net); out_idmap_error: nfsd_export_shutdown(net); out_export_error: @@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net)
static __net_exit void nfsd_exit_net(struct net *net) { + struct nfsd_net *nn = net_generic(net, nfsd_net_id); + + nfsd_reply_cache_stats_destroy(nn); nfsd_idmap_shutdown(net); nfsd_export_shutdown(net); nfsd_netns_free_versions(net_generic(net, nfsd_net_id));
Thanks Eirik and Jeff.
At this point in the release cycle, I plan to apply this for the next merge window (6.5).
A few cosmetic remarks below. I can take this as-is and make adjustments, or you can redrive. Just let me know.
On Fri, Jun 16, 2023 at 03:17:43PM -0400, Jeff Layton wrote:
f5f9d4a314da moved the initialization of the reply cache into the nfsd startup, but it didn't account for the stats counters which can be accessed before nfsd is ever started, causing a NULL pointer dereference.
This is easy to trigger on some arches (like aarch64), but on x86_64, calling this_cpu_ptr(NULL) evidently returns a pointer to the fixed_percpu_data, which I guess this looks just enough like a newly initialized percpu var to allow nfsd_reply_cache_stats_show to access it without Oopsing.
Move the initialization of the per-net+per-cpu reply-cache counters back into nfsd_init_net, while leaving the rest of the reply cache allocations to be done at nfsd startup time.
Kudos to Eirik who did most of the legwork to track this down.
Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
Why both Fixes: and Cc: stable?
Reported-and-Tested-by: Eirik Fuller efuller@redhat.com
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2215429 ?
Signed-off-by: Jeff Layton jlayton@kernel.org
fs/nfsd/cache.h | 2 ++ fs/nfsd/nfscache.c | 13 +++---------- fs/nfsd/nfsctl.c | 8 ++++++++ 3 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h index f21259ead64b..a4b12d6c41d3 100644 --- a/fs/nfsd/cache.h +++ b/fs/nfsd/cache.h @@ -80,6 +80,8 @@ enum { int nfsd_drc_slab_create(void); void nfsd_drc_slab_free(void); +int nfsd_reply_cache_stats_init(struct nfsd_net *nn); +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn); int nfsd_reply_cache_init(struct nfsd_net *); void nfsd_reply_cache_shutdown(struct nfsd_net *); int nfsd_cache_lookup(struct svc_rqst *); diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 041faa13b852..b696dc629c0b 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void) kmem_cache_destroy(drc_slab); } -static int nfsd_reply_cache_stats_init(struct nfsd_net *nn) +int nfsd_reply_cache_stats_init(struct nfsd_net *nn)
As part of making these two functions into a more public API, I would prefer to rename this nfsd_net_reply_cache_init(), and it should include a KDOC comment out front.
I'm having some trouble easily distinguishing between the purpose of
static __net_init int nfsd_init_net(struct net *net)
and
static int nfsd_startup_net(struct net *net, const struct cred *cred)
The former is invoked when a net namespace is created. The latter is called by write_threads, therefore /proc/fs/nfsd/ must already be mounted.
The function names are confusingly similar and there's no KDOC to be found. I might add a clean-up patch to this one.
{ return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM); } -static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)
Ditto, rename this nfsd_net_reply_cache_destroy(), plus KDOC.
{ nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM); } @@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) hashsize = nfsd_hashsize(nn->max_drc_entries); nn->maskbits = ilog2(hashsize);
- status = nfsd_reply_cache_stats_init(nn);
- if (status)
goto out_nomem;
- nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan; nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count; nn->nfsd_reply_cache_shrinker.seeks = 1; status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd-reply:%s", nn->nfsd_name); if (status)
goto out_stats_destroy;
return status;
nn->drc_hashtbl = kvzalloc(array_size(hashsize, sizeof(*nn->drc_hashtbl)), GFP_KERNEL); @@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) return 0; out_shrinker: unregister_shrinker(&nn->nfsd_reply_cache_shrinker); -out_stats_destroy:
- nfsd_reply_cache_stats_destroy(nn);
-out_nomem: printk(KERN_ERR "nfsd: failed to allocate reply cache\n"); return -ENOMEM; } diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 1489e0b703b4..7c837afcf615 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net) retval = nfsd_idmap_init(net); if (retval) goto out_idmap_error;
- retval = nfsd_reply_cache_stats_init(nn);
- if (retval)
nn->nfsd_versions = NULL; nn->nfsd4_minorversions = NULL; nfsd4_init_leases_net(nn);goto out_repcache_error;
@@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net) return 0; +out_repcache_error:
- nfsd_idmap_shutdown(net);
out_idmap_error: nfsd_export_shutdown(net); out_export_error: @@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net) static __net_exit void nfsd_exit_net(struct net *net) {
- struct nfsd_net *nn = net_generic(net, nfsd_net_id);
- nfsd_reply_cache_stats_destroy(nn); nfsd_idmap_shutdown(net); nfsd_export_shutdown(net); nfsd_netns_free_versions(net_generic(net, nfsd_net_id));
We should update this nfsd_netns_free_versions() call site to take the new @nn variable.
-- 2.40.1
On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
Thanks Eirik and Jeff.
At this point in the release cycle, I plan to apply this for the next merge window (6.5).
I think we should take this in sooner. This is a regression and a user-triggerable oops in the right situation. If:
- non-x86_64 arch - /proc/fs/nfsd is mounted in the namespace - nfsd is not started in the namespace - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
A few cosmetic remarks below. I can take this as-is and make adjustments, or you can redrive. Just let me know.
I'll plan to redrive it.
On Fri, Jun 16, 2023 at 03:17:43PM -0400, Jeff Layton wrote:
f5f9d4a314da moved the initialization of the reply cache into the nfsd startup, but it didn't account for the stats counters which can be accessed before nfsd is ever started, causing a NULL pointer dereference.
This is easy to trigger on some arches (like aarch64), but on x86_64, calling this_cpu_ptr(NULL) evidently returns a pointer to the fixed_percpu_data, which I guess this looks just enough like a newly initialized percpu var to allow nfsd_reply_cache_stats_show to access it without Oopsing.
Move the initialization of the per-net+per-cpu reply-cache counters back into nfsd_init_net, while leaving the rest of the reply cache allocations to be done at nfsd startup time.
Kudos to Eirik who did most of the legwork to track this down.
Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
Why both Fixes: and Cc: stable?
*shrug* : they mean different things. I can drop the Cc stable.
Reported-and-Tested-by: Eirik Fuller efuller@redhat.com
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2215429 ?
Signed-off-by: Jeff Layton jlayton@kernel.org
fs/nfsd/cache.h | 2 ++ fs/nfsd/nfscache.c | 13 +++---------- fs/nfsd/nfsctl.c | 8 ++++++++ 3 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h index f21259ead64b..a4b12d6c41d3 100644 --- a/fs/nfsd/cache.h +++ b/fs/nfsd/cache.h @@ -80,6 +80,8 @@ enum { int nfsd_drc_slab_create(void); void nfsd_drc_slab_free(void); +int nfsd_reply_cache_stats_init(struct nfsd_net *nn); +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn); int nfsd_reply_cache_init(struct nfsd_net *); void nfsd_reply_cache_shutdown(struct nfsd_net *); int nfsd_cache_lookup(struct svc_rqst *); diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 041faa13b852..b696dc629c0b 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void) kmem_cache_destroy(drc_slab); } -static int nfsd_reply_cache_stats_init(struct nfsd_net *nn) +int nfsd_reply_cache_stats_init(struct nfsd_net *nn)
As part of making these two functions into a more public API, I would prefer to rename this nfsd_net_reply_cache_init(), and it should include a KDOC comment out front.
I'm having some trouble easily distinguishing between the purpose of
static __net_init int nfsd_init_net(struct net *net)
and
static int nfsd_startup_net(struct net *net, const struct cred *cred)
The former is invoked when a net namespace is created. The latter is called by write_threads, therefore /proc/fs/nfsd/ must already be mounted.
The function names are confusingly similar and there's no KDOC to be found. I might add a clean-up patch to this one.
I can add some kdoc patches in v2.
{ return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM); } -static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)
Ditto, rename this nfsd_net_reply_cache_destroy(), plus KDOC.
Ok.
{ nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM); } @@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) hashsize = nfsd_hashsize(nn->max_drc_entries); nn->maskbits = ilog2(hashsize);
- status = nfsd_reply_cache_stats_init(nn);
- if (status)
goto out_nomem;
- nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan; nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count; nn->nfsd_reply_cache_shrinker.seeks = 1; status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd-reply:%s", nn->nfsd_name); if (status)
goto out_stats_destroy;
return status;
nn->drc_hashtbl = kvzalloc(array_size(hashsize, sizeof(*nn->drc_hashtbl)), GFP_KERNEL); @@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) return 0; out_shrinker: unregister_shrinker(&nn->nfsd_reply_cache_shrinker); -out_stats_destroy:
- nfsd_reply_cache_stats_destroy(nn);
-out_nomem: printk(KERN_ERR "nfsd: failed to allocate reply cache\n"); return -ENOMEM; } diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 1489e0b703b4..7c837afcf615 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net) retval = nfsd_idmap_init(net); if (retval) goto out_idmap_error;
- retval = nfsd_reply_cache_stats_init(nn);
- if (retval)
nn->nfsd_versions = NULL; nn->nfsd4_minorversions = NULL; nfsd4_init_leases_net(nn);goto out_repcache_error;
@@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net) return 0; +out_repcache_error:
- nfsd_idmap_shutdown(net);
out_idmap_error: nfsd_export_shutdown(net); out_export_error: @@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net) static __net_exit void nfsd_exit_net(struct net *net) {
- struct nfsd_net *nn = net_generic(net, nfsd_net_id);
- nfsd_reply_cache_stats_destroy(nn); nfsd_idmap_shutdown(net); nfsd_export_shutdown(net); nfsd_netns_free_versions(net_generic(net, nfsd_net_id));
We should update this nfsd_netns_free_versions() call site to take the new @nn variable.
Ahh yes. Will fix.
I'll plan to send a v2 with the changes you suggest.
Thanks,
On 16.06.23 22:54, Jeff Layton wrote:
On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
Thanks Eirik and Jeff.
At this point in the release cycle, I plan to apply this for the next merge window (6.5).
I think we should take this in sooner. This is a regression and a user-triggerable oops in the right situation. If:
- non-x86_64 arch
- /proc/fs/nfsd is mounted in the namespace
- nfsd is not started in the namespace
- unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
FWIW, might be worth to simply tell Linus about it and let him decide, that's totally fine and even documented in the old and the new docs for handling regressions[1].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/D...
Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
Why both Fixes: and Cc: stable?
*shrug* : they mean different things. I can drop the Cc stable.
Please leave it, only a stable tag ensures backporting; a fixes tag alone is not enough. See [1] above or these recent messages from Greg:
https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/ https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/
Ciao, Thorsten
On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote:
On 16.06.23 22:54, Jeff Layton wrote:
On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
Thanks Eirik and Jeff.
At this point in the release cycle, I plan to apply this for the next merge window (6.5).
I think we should take this in sooner. This is a regression and a user-triggerable oops in the right situation. If:
- non-x86_64 arch
- /proc/fs/nfsd is mounted in the namespace
- nfsd is not started in the namespace
- unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
FWIW, might be worth to simply tell Linus about it and let him decide, that's totally fine and even documented in the old and the new docs for handling regressions[1].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/D...
I'd rather Chuck make the final call here. The original patch description didn't point out how easy it is to trigger a panic with this, so I was hoping to convince him.
To further that argument too:
I have to wonder if this bug might cause (temporary?) memory corruption on x86_64. The code hits a spinlock in that struct, so there may be a window of time where it doesn't contain what's expected.
Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
Why both Fixes: and Cc: stable?
*shrug* : they mean different things. I can drop the Cc stable.
Please leave it, only a stable tag ensures backporting; a fixes tag alone is not enough. See [1] above or these recent messages from Greg:
https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/ https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/
Chuck and I also recently requested that the stable series not pick patches automatically for fs/nfsd. This does need to be backported though, so I cc'ed stable to make that clear.
On 18.06.23 14:09, Jeff Layton wrote:
On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote:
On 16.06.23 22:54, Jeff Layton wrote:
On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
FWIW, might be worth to simply tell Linus about it and let him decide, that's totally fine and even documented in the old and the new docs for handling regressions[1].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/D...
I'd rather Chuck make the final call here.
Totally fine for me, I just wanted to remind folks that this option exist, as I got the impression people forget it or fear it might annoy Linux. :D
Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
Why both Fixes: and Cc: stable?
*shrug* : they mean different things. I can drop the Cc stable.
Please leave it, only a stable tag ensures backporting; a fixes tag alone is not enough. See [1] above or these recent messages from Greg:
https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/ https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/
Chuck and I also recently requested that the stable series not pick patches automatically for fs/nfsd. This does need to be backported though, so I cc'ed stable to make that clear.
Great, many thx! Ciao, thorsten
On Jun 18, 2023, at 8:09 AM, Jeff Layton jlayton@kernel.org wrote:
On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote:
On 16.06.23 22:54, Jeff Layton wrote:
On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
Thanks Eirik and Jeff.
At this point in the release cycle, I plan to apply this for the next merge window (6.5).
I think we should take this in sooner. This is a regression and a user-triggerable oops in the right situation. If:
- non-x86_64 arch
- /proc/fs/nfsd is mounted in the namespace
- nfsd is not started in the namespace
- unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
FWIW, might be worth to simply tell Linus about it and let him decide, that's totally fine and even documented in the old and the new docs for handling regressions[1].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/D...
I'd rather Chuck make the final call here.
Thanks! I feel this one needs broader testing than we can manage in just a couple of days. If this were earlier in the -rc cycle I would pull the patch right into 6.4-rc without hesitation. It is obviously -rc material, but the timing is unfortunate.
I'm planning the nfsd for-6.5 pull request early in the merge window, so practically speaking it shouldn't delay the finalized upstream version of this patch by more than a few days.
The original patch description didn't point out how easy it is to trigger a panic with this,
I will add that information.
so I was hoping to convince him.
Oh, I agree it's significant. I just don't want to compound the problem by sending a possibly-buggy patch at the last moment in the 6.4 cycle.
When we have our shiny new CI infrastructure in place, we will be able to move faster and with more confidence on fixes this late in a cycle.
To further that argument too:
I have to wonder if this bug might cause (temporary?) memory corruption on x86_64. The code hits a spinlock in that struct, so there may be a window of time where it doesn't contain what's expected.
Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
Why both Fixes: and Cc: stable?
*shrug* : they mean different things. I can drop the Cc stable.
Please leave it, only a stable tag ensures backporting; a fixes tag alone is not enough. See [1] above or these recent messages from Greg:
https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/ https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/
Chuck and I also recently requested that the stable series not pick patches automatically for fs/nfsd.
To be clear, we requested that stable not pick up patches for fs/nfsd using AUTOSEL. Basically that means stable will pick up only fs/nfsd patches that are explicitly marked with Fixes: or Cc:stable.
This does need to be backported though, so I cc'ed stable to make that clear.
OK, I'll add the "cc: stable" back too.
My question wasn't so much a demand to drop the tag, but rather a request for an explanation of why both were needed. I'll try to be less terse next time.
Thorsten, if you've added this issue to the regbot database, please feel free to follow up with the correct tags to mark the issue closed as appropriate.
-- Chuck Lever
On Sun, 2023-06-18 at 15:59 +0000, Chuck Lever III wrote:
On Jun 18, 2023, at 8:09 AM, Jeff Layton jlayton@kernel.org wrote:
On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote:
On 16.06.23 22:54, Jeff Layton wrote:
On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
Thanks Eirik and Jeff.
At this point in the release cycle, I plan to apply this for the next merge window (6.5).
I think we should take this in sooner. This is a regression and a user-triggerable oops in the right situation. If:
- non-x86_64 arch
- /proc/fs/nfsd is mounted in the namespace
- nfsd is not started in the namespace
- unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
FWIW, might be worth to simply tell Linus about it and let him decide, that's totally fine and even documented in the old and the new docs for handling regressions[1].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/D...
I'd rather Chuck make the final call here.
Thanks! I feel this one needs broader testing than we can manage in just a couple of days. If this were earlier in the -rc cycle I would pull the patch right into 6.4-rc without hesitation. It is obviously -rc material, but the timing is unfortunate.
I'm planning the nfsd for-6.5 pull request early in the merge window, so practically speaking it shouldn't delay the finalized upstream version of this patch by more than a few days.
Ok. I'll trust your judgment then and just cultivate my patience!
Cheers,
linux-stable-mirror@lists.linaro.org