From: Chuck Lever chuck.lever@oracle.com
Following up on
https://lore.kernel.org/linux-nfs/d4b235df-4ee5-4824-9d48-e3b3c1f1f4d1@oracl...
Here is a backport series targeting origin/linux-6.1.y that closes the information leak described in the above thread.
I started with v6.1.y because that is the most recent LTS kernel and thus the closest to upstream. I plan to look at 5.15 and 5.10 LTS too if this series is applied to v6.1.y.
Review comments welcome.
Chuck Lever (6): NFSD: Refactor nfsd_reply_cache_free_locked() NFSD: Rename nfsd_reply_cache_alloc() NFSD: Replace nfsd_prune_bucket() NFSD: Refactor the duplicate reply cache shrinker NFSD: Rewrite synopsis of nfsd_percpu_counters_init() NFSD: Fix frame size warning in svc_export_parse()
Jeff Layton (2): nfsd: move reply cache initialization into nfsd startup nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
Josef Bacik (10): sunrpc: don't change ->sv_stats if it doesn't exist nfsd: stop setting ->pg_stats for unused stats sunrpc: pass in the sv_stats struct through svc_create_pooled sunrpc: remove ->pg_stats from svc_program sunrpc: use the struct net as the svc proc private nfsd: rename NFSD_NET_* to NFSD_STATS_* nfsd: expose /proc/net/sunrpc/nfsd in net namespaces nfsd: make all of the nfsd stats per-network namespace nfsd: remove nfsd_stats, make th_cnt a global counter nfsd: make svc_stat per-network namespace instead of global
fs/lockd/svc.c | 3 - fs/nfs/callback.c | 3 - fs/nfsd/export.c | 32 ++++-- fs/nfsd/export.h | 4 +- fs/nfsd/netns.h | 25 ++++- fs/nfsd/nfs4proc.c | 6 +- fs/nfsd/nfscache.c | 201 ++++++++++++++++++++++--------------- fs/nfsd/nfsctl.c | 24 ++--- fs/nfsd/nfsd.h | 1 + fs/nfsd/nfsfh.c | 3 +- fs/nfsd/nfssvc.c | 24 +++-- fs/nfsd/stats.c | 52 ++++------ fs/nfsd/stats.h | 83 ++++++--------- fs/nfsd/trace.h | 22 ++++ fs/nfsd/vfs.c | 6 +- include/linux/sunrpc/svc.h | 5 +- net/sunrpc/stats.c | 2 +- net/sunrpc/svc.c | 36 ++++--- 18 files changed, 301 insertions(+), 231 deletions(-)
From: Jeff Layton jlayton@kernel.org
[ Upstream commit f5f9d4a314da88c0a5faa6d168bf69081b7a25ae ]
There's no need to start the reply cache before nfsd is up and running, and doing so means that we register a shrinker for every net namespace instead of just the ones where nfsd is running.
Move it to the per-net nfsd startup instead.
Reported-by: Dai Ngo dai.ngo@oracle.com Signed-off-by: Jeff Layton jlayton@kernel.org Stable-dep-of: ed9ab7346e90 ("nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net") Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfsctl.c | 8 -------- fs/nfsd/nfssvc.c | 10 +++++++++- 2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 76a60e7a7509..f6637dbb9e18 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1453,16 +1453,11 @@ static __net_init int nfsd_init_net(struct net *net) nn->nfsd_versions = NULL; nn->nfsd4_minorversions = NULL; nfsd4_init_leases_net(nn); - retval = nfsd_reply_cache_init(nn); - if (retval) - goto out_cache_error; get_random_bytes(&nn->siphash_key, sizeof(nn->siphash_key)); seqlock_init(&nn->writeverf_lock);
return 0;
-out_cache_error: - nfsd_idmap_shutdown(net); out_idmap_error: nfsd_export_shutdown(net); out_export_error: @@ -1471,9 +1466,6 @@ static __net_init int nfsd_init_net(struct net *net)
static __net_exit void nfsd_exit_net(struct net *net) { - struct nfsd_net *nn = net_generic(net, nfsd_net_id); - - nfsd_reply_cache_shutdown(nn); nfsd_idmap_shutdown(net); nfsd_export_shutdown(net); nfsd_netns_free_versions(net_generic(net, nfsd_net_id)); diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index a8190caf77f1..d07ee284100f 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -427,16 +427,23 @@ static int nfsd_startup_net(struct net *net, const struct cred *cred) ret = nfsd_file_cache_start_net(net); if (ret) goto out_lockd; - ret = nfs4_state_start_net(net); + + ret = nfsd_reply_cache_init(nn); if (ret) goto out_filecache;
+ ret = nfs4_state_start_net(net); + if (ret) + goto out_reply_cache; + #ifdef CONFIG_NFSD_V4_2_INTER_SSC nfsd4_ssc_init_umount_work(nn); #endif nn->nfsd_net_up = true; return 0;
+out_reply_cache: + nfsd_reply_cache_shutdown(nn); out_filecache: nfsd_file_cache_shutdown_net(net); out_lockd: @@ -454,6 +461,7 @@ static void nfsd_shutdown_net(struct net *net) struct nfsd_net *nn = net_generic(net, nfsd_net_id);
nfs4_state_shutdown_net(net); + nfsd_reply_cache_shutdown(nn); nfsd_file_cache_shutdown_net(net); if (nn->lockd_up) { lockd_down(net);
From: Jeff Layton jlayton@kernel.org
[ Upstream commit ed9ab7346e908496816cffdecd46932035f66e2e ]
Commit f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") moved the initialization of the reply cache into nfsd startup, but didn't account for the stats counters, which can be accessed before nfsd is ever started. The result can be a NULL pointer dereference when someone accesses /proc/fs/nfsd/reply_cache_stats while nfsd is still shut down.
This is a regression and a user-triggerable oops in the right situation:
- non-x86_64 arch - /proc/fs/nfsd is mounted in the namespace - nfsd is not started in the namespace - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
Although this is easy to trigger on some arches (like aarch64), on x86_64, calling this_cpu_ptr(NULL) evidently returns a pointer to the fixed_percpu_data. That struct looks just enough like a newly initialized percpu var to allow nfsd_reply_cache_stats_show to access it without Oopsing.
Move the initialization of the per-net+per-cpu reply-cache counters back into nfsd_init_net, while leaving the rest of the reply cache allocations to be done at nfsd startup time.
Kudos to Eirik who did most of the legwork to track this down.
Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") Reported-and-tested-by: Eirik Fuller efuller@redhat.com Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2215429 Signed-off-by: Jeff Layton jlayton@kernel.org Stable-dep-of: 4b14885411f7 ("nfsd: make all of the nfsd stats per-network namespace") Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/cache.h | 2 ++ fs/nfsd/nfscache.c | 25 ++++++++++++++----------- fs/nfsd/nfsctl.c | 10 +++++++++- 3 files changed, 25 insertions(+), 12 deletions(-)
diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h index 3c07d587ae9e..cf7c5f557986 100644 --- a/fs/nfsd/cache.h +++ b/fs/nfsd/cache.h @@ -80,6 +80,8 @@ enum {
int nfsd_drc_slab_create(void); void nfsd_drc_slab_free(void); +int nfsd_net_reply_cache_init(struct nfsd_net *nn); +void nfsd_net_reply_cache_destroy(struct nfsd_net *nn); int nfsd_reply_cache_init(struct nfsd_net *); void nfsd_reply_cache_shutdown(struct nfsd_net *); int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index f53335ae0ab2..2c6942e2344c 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -148,12 +148,23 @@ void nfsd_drc_slab_free(void) kmem_cache_destroy(drc_slab); }
-static int nfsd_reply_cache_stats_init(struct nfsd_net *nn) +/** + * nfsd_net_reply_cache_init - per net namespace reply cache set-up + * @nn: nfsd_net being initialized + * + * Returns zero on succes; otherwise a negative errno is returned. + */ +int nfsd_net_reply_cache_init(struct nfsd_net *nn) { return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM); }
-static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) +/** + * nfsd_net_reply_cache_destroy - per net namespace reply cache tear-down + * @nn: nfsd_net being freed + * + */ +void nfsd_net_reply_cache_destroy(struct nfsd_net *nn) { nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM); } @@ -169,17 +180,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) hashsize = nfsd_hashsize(nn->max_drc_entries); nn->maskbits = ilog2(hashsize);
- status = nfsd_reply_cache_stats_init(nn); - if (status) - goto out_nomem; - nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan; nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count; nn->nfsd_reply_cache_shrinker.seeks = 1; status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd-reply:%s", nn->nfsd_name); if (status) - goto out_stats_destroy; + return status;
nn->drc_hashtbl = kvzalloc(array_size(hashsize, sizeof(*nn->drc_hashtbl)), GFP_KERNEL); @@ -195,9 +202,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) return 0; out_shrinker: unregister_shrinker(&nn->nfsd_reply_cache_shrinker); -out_stats_destroy: - nfsd_reply_cache_stats_destroy(nn); -out_nomem: printk(KERN_ERR "nfsd: failed to allocate reply cache\n"); return -ENOMEM; } @@ -217,7 +221,6 @@ void nfsd_reply_cache_shutdown(struct nfsd_net *nn) rp, nn); } } - nfsd_reply_cache_stats_destroy(nn);
kvfree(nn->drc_hashtbl); nn->drc_hashtbl = NULL; diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index f6637dbb9e18..20e603ff6998 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1450,6 +1450,9 @@ static __net_init int nfsd_init_net(struct net *net) retval = nfsd_idmap_init(net); if (retval) goto out_idmap_error; + retval = nfsd_net_reply_cache_init(nn); + if (retval) + goto out_repcache_error; nn->nfsd_versions = NULL; nn->nfsd4_minorversions = NULL; nfsd4_init_leases_net(nn); @@ -1458,6 +1461,8 @@ static __net_init int nfsd_init_net(struct net *net)
return 0;
+out_repcache_error: + nfsd_idmap_shutdown(net); out_idmap_error: nfsd_export_shutdown(net); out_export_error: @@ -1466,9 +1471,12 @@ static __net_init int nfsd_init_net(struct net *net)
static __net_exit void nfsd_exit_net(struct net *net) { + struct nfsd_net *nn = net_generic(net, nfsd_net_id); + + nfsd_net_reply_cache_destroy(nn); nfsd_idmap_shutdown(net); nfsd_export_shutdown(net); - nfsd_netns_free_versions(net_generic(net, nfsd_net_id)); + nfsd_netns_free_versions(nn); }
static struct pernet_operations nfsd_net_ops = {
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 35308e7f0fc3942edc87d9c6dc78c4a096428957 ]
To reduce contention on the bucket locks, we must avoid calling kfree() while each bucket lock is held.
Start by refactoring nfsd_reply_cache_free_locked() into a helper that removes an entry from the bucket (and must therefore run under the lock) and a second helper that frees the entry (which does not need to hold the lock).
For readability, rename the helpers nfsd_cacherep_<verb>.
Reviewed-by: Jeff Layton jlayton@kernel.org Stable-dep-of: a9507f6af145 ("NFSD: Replace nfsd_prune_bucket()") Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfscache.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 2c6942e2344c..11d00b2853ff 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -110,21 +110,33 @@ nfsd_reply_cache_alloc(struct svc_rqst *rqstp, __wsum csum, return rp; }
-static void -nfsd_reply_cache_free_locked(struct nfsd_drc_bucket *b, struct svc_cacherep *rp, - struct nfsd_net *nn) +static void nfsd_cacherep_free(struct svc_cacherep *rp) { - if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base) { - nfsd_stats_drc_mem_usage_sub(nn, rp->c_replvec.iov_len); + if (rp->c_type == RC_REPLBUFF) kfree(rp->c_replvec.iov_base); - } + kmem_cache_free(drc_slab, rp); +} + +static void +nfsd_cacherep_unlink_locked(struct nfsd_net *nn, struct nfsd_drc_bucket *b, + struct svc_cacherep *rp) +{ + if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base) + nfsd_stats_drc_mem_usage_sub(nn, rp->c_replvec.iov_len); if (rp->c_state != RC_UNUSED) { rb_erase(&rp->c_node, &b->rb_head); list_del(&rp->c_lru); atomic_dec(&nn->num_drc_entries); nfsd_stats_drc_mem_usage_sub(nn, sizeof(*rp)); } - kmem_cache_free(drc_slab, rp); +} + +static void +nfsd_reply_cache_free_locked(struct nfsd_drc_bucket *b, struct svc_cacherep *rp, + struct nfsd_net *nn) +{ + nfsd_cacherep_unlink_locked(nn, b, rp); + nfsd_cacherep_free(rp); }
static void @@ -132,8 +144,9 @@ nfsd_reply_cache_free(struct nfsd_drc_bucket *b, struct svc_cacherep *rp, struct nfsd_net *nn) { spin_lock(&b->cache_lock); - nfsd_reply_cache_free_locked(b, rp, nn); + nfsd_cacherep_unlink_locked(nn, b, rp); spin_unlock(&b->cache_lock); + nfsd_cacherep_free(rp); }
int nfsd_drc_slab_create(void)
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit ff0d169329768c1102b7b07eebe5a9839aa1c143 ]
For readability, rename to match the other helpers.
Reviewed-by: Jeff Layton jlayton@kernel.org Stable-dep-of: 4b14885411f7 ("nfsd: make all of the nfsd stats per-network namespace") Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfscache.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 11d00b2853ff..16e62e92964a 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -85,8 +85,8 @@ nfsd_hashsize(unsigned int limit) }
static struct svc_cacherep * -nfsd_reply_cache_alloc(struct svc_rqst *rqstp, __wsum csum, - struct nfsd_net *nn) +nfsd_cacherep_alloc(struct svc_rqst *rqstp, __wsum csum, + struct nfsd_net *nn) { struct svc_cacherep *rp;
@@ -481,7 +481,7 @@ int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, * preallocate an entry. */ nn = net_generic(SVC_NET(rqstp), nfsd_net_id); - rp = nfsd_reply_cache_alloc(rqstp, csum, nn); + rp = nfsd_cacherep_alloc(rqstp, csum, nn); if (!rp) goto out;
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit a9507f6af1450ed26a4a36d979af518f5bb21e5d ]
Enable nfsd_prune_bucket() to drop the bucket lock while calling kfree(). Use the same pattern that Jeff recently introduced in the NFSD filecache.
A few percpu operations are moved outside the lock since they temporarily disable local IRQs which is expensive and does not need to be done while the lock is held.
Reviewed-by: Jeff Layton jlayton@kernel.org Stable-dep-of: c135e1269f34 ("NFSD: Refactor the duplicate reply cache shrinker") Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfscache.c | 78 +++++++++++++++++++++++++++++++++++++--------- fs/nfsd/trace.h | 22 +++++++++++++ 2 files changed, 85 insertions(+), 15 deletions(-)
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 16e62e92964a..b553f2cece58 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -117,6 +117,21 @@ static void nfsd_cacherep_free(struct svc_cacherep *rp) kmem_cache_free(drc_slab, rp); }
+static unsigned long +nfsd_cacherep_dispose(struct list_head *dispose) +{ + struct svc_cacherep *rp; + unsigned long freed = 0; + + while (!list_empty(dispose)) { + rp = list_first_entry(dispose, struct svc_cacherep, c_lru); + list_del(&rp->c_lru); + nfsd_cacherep_free(rp); + freed++; + } + return freed; +} + static void nfsd_cacherep_unlink_locked(struct nfsd_net *nn, struct nfsd_drc_bucket *b, struct svc_cacherep *rp) @@ -260,6 +275,41 @@ nfsd_cache_bucket_find(__be32 xid, struct nfsd_net *nn) return &nn->drc_hashtbl[hash]; }
+/* + * Remove and return no more than @max expired entries in bucket @b. + * If @max is zero, do not limit the number of removed entries. + */ +static void +nfsd_prune_bucket_locked(struct nfsd_net *nn, struct nfsd_drc_bucket *b, + unsigned int max, struct list_head *dispose) +{ + unsigned long expiry = jiffies - RC_EXPIRE; + struct svc_cacherep *rp, *tmp; + unsigned int freed = 0; + + lockdep_assert_held(&b->cache_lock); + + /* The bucket LRU is ordered oldest-first. */ + list_for_each_entry_safe(rp, tmp, &b->lru_head, c_lru) { + /* + * Don't free entries attached to calls that are still + * in-progress, but do keep scanning the list. + */ + if (rp->c_state == RC_INPROG) + continue; + + if (atomic_read(&nn->num_drc_entries) <= nn->max_drc_entries && + time_before(expiry, rp->c_timestamp)) + break; + + nfsd_cacherep_unlink_locked(nn, b, rp); + list_add(&rp->c_lru, dispose); + + if (max && ++freed > max) + break; + } +} + static long prune_bucket(struct nfsd_drc_bucket *b, struct nfsd_net *nn, unsigned int max) { @@ -283,11 +333,6 @@ static long prune_bucket(struct nfsd_drc_bucket *b, struct nfsd_net *nn, return freed; }
-static long nfsd_prune_bucket(struct nfsd_drc_bucket *b, struct nfsd_net *nn) -{ - return prune_bucket(b, nn, 3); -} - /* * Walk the LRU list and prune off entries that are older than RC_EXPIRE. * Also prune the oldest ones when the total exceeds the max number of entries. @@ -466,6 +511,8 @@ int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, __wsum csum; struct nfsd_drc_bucket *b; int type = rqstp->rq_cachetype; + unsigned long freed; + LIST_HEAD(dispose); int rtn = RC_DOIT;
rqstp->rq_cacherep = NULL; @@ -490,20 +537,18 @@ int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, found = nfsd_cache_insert(b, rp, nn); if (found != rp) goto found_entry; - - nfsd_stats_rc_misses_inc(); rqstp->rq_cacherep = rp; rp->c_state = RC_INPROG; + nfsd_prune_bucket_locked(nn, b, 3, &dispose); + spin_unlock(&b->cache_lock);
+ freed = nfsd_cacherep_dispose(&dispose); + trace_nfsd_drc_gc(nn, freed); + + nfsd_stats_rc_misses_inc(); atomic_inc(&nn->num_drc_entries); nfsd_stats_drc_mem_usage_add(nn, sizeof(*rp)); - - nfsd_prune_bucket(b, nn); - -out_unlock: - spin_unlock(&b->cache_lock); -out: - return rtn; + goto out;
found_entry: /* We found a matching entry which is either in progress or done. */ @@ -541,7 +586,10 @@ int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start,
out_trace: trace_nfsd_drc_found(nn, rqstp, rtn); - goto out_unlock; +out_unlock: + spin_unlock(&b->cache_lock); +out: + return rtn; }
/** diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h index 84f26f281fe9..447b3483f94b 100644 --- a/fs/nfsd/trace.h +++ b/fs/nfsd/trace.h @@ -1261,6 +1261,28 @@ TRACE_EVENT(nfsd_drc_mismatch, __entry->ingress) );
+TRACE_EVENT_CONDITION(nfsd_drc_gc, + TP_PROTO( + const struct nfsd_net *nn, + unsigned long freed + ), + TP_ARGS(nn, freed), + TP_CONDITION(freed > 0), + TP_STRUCT__entry( + __field(unsigned long long, boot_time) + __field(unsigned long, freed) + __field(int, total) + ), + TP_fast_assign( + __entry->boot_time = nn->boot_time; + __entry->freed = freed; + __entry->total = atomic_read(&nn->num_drc_entries); + ), + TP_printk("boot_time=%16llx total=%d freed=%lu", + __entry->boot_time, __entry->total, __entry->freed + ) +); + TRACE_EVENT(nfsd_cb_args, TP_PROTO( const struct nfs4_client *clp,
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit c135e1269f34dfdea4bd94c11060c83a3c0b3c12 ]
Avoid holding the bucket lock while freeing cache entries. This change also caps the number of entries that are freed when the shrinker calls to reduce the shrinker's impact on the cache's effectiveness.
Reviewed-by: Jeff Layton jlayton@kernel.org [ cel: adjusted to apply to v6.1.y -- this one might not be necessary ] Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfscache.c | 85 ++++++++++++++++++++++------------------------ 1 file changed, 40 insertions(+), 45 deletions(-)
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index b553f2cece58..049565bbef2d 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -310,51 +310,16 @@ nfsd_prune_bucket_locked(struct nfsd_net *nn, struct nfsd_drc_bucket *b, } }
-static long prune_bucket(struct nfsd_drc_bucket *b, struct nfsd_net *nn, - unsigned int max) -{ - struct svc_cacherep *rp, *tmp; - long freed = 0; - - list_for_each_entry_safe(rp, tmp, &b->lru_head, c_lru) { - /* - * Don't free entries attached to calls that are still - * in-progress, but do keep scanning the list. - */ - if (rp->c_state == RC_INPROG) - continue; - if (atomic_read(&nn->num_drc_entries) <= nn->max_drc_entries && - time_before(jiffies, rp->c_timestamp + RC_EXPIRE)) - break; - nfsd_reply_cache_free_locked(b, rp, nn); - if (max && freed++ > max) - break; - } - return freed; -} - -/* - * Walk the LRU list and prune off entries that are older than RC_EXPIRE. - * Also prune the oldest ones when the total exceeds the max number of entries. +/** + * nfsd_reply_cache_count - count_objects method for the DRC shrinker + * @shrink: our registered shrinker context + * @sc: garbage collection parameters + * + * Returns the total number of entries in the duplicate reply cache. To + * keep things simple and quick, this is not the number of expired entries + * in the cache (ie, the number that would be removed by a call to + * nfsd_reply_cache_scan). */ -static long -prune_cache_entries(struct nfsd_net *nn) -{ - unsigned int i; - long freed = 0; - - for (i = 0; i < nn->drc_hashsize; i++) { - struct nfsd_drc_bucket *b = &nn->drc_hashtbl[i]; - - if (list_empty(&b->lru_head)) - continue; - spin_lock(&b->cache_lock); - freed += prune_bucket(b, nn, 0); - spin_unlock(&b->cache_lock); - } - return freed; -} - static unsigned long nfsd_reply_cache_count(struct shrinker *shrink, struct shrink_control *sc) { @@ -364,13 +329,43 @@ nfsd_reply_cache_count(struct shrinker *shrink, struct shrink_control *sc) return atomic_read(&nn->num_drc_entries); }
+/** + * nfsd_reply_cache_scan - scan_objects method for the DRC shrinker + * @shrink: our registered shrinker context + * @sc: garbage collection parameters + * + * Free expired entries on each bucket's LRU list until we've released + * nr_to_scan freed objects. Nothing will be released if the cache + * has not exceeded it's max_drc_entries limit. + * + * Returns the number of entries released by this call. + */ static unsigned long nfsd_reply_cache_scan(struct shrinker *shrink, struct shrink_control *sc) { struct nfsd_net *nn = container_of(shrink, struct nfsd_net, nfsd_reply_cache_shrinker); + unsigned long freed = 0; + LIST_HEAD(dispose); + unsigned int i;
- return prune_cache_entries(nn); + for (i = 0; i < nn->drc_hashsize; i++) { + struct nfsd_drc_bucket *b = &nn->drc_hashtbl[i]; + + if (list_empty(&b->lru_head)) + continue; + + spin_lock(&b->cache_lock); + nfsd_prune_bucket_locked(nn, b, 0, &dispose); + spin_unlock(&b->cache_lock); + + freed += nfsd_cacherep_dispose(&dispose); + if (freed > sc->nr_to_scan) + break; + } + + trace_nfsd_drc_gc(nn, freed); + return freed; }
/**
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 5ec39944f874e1ecc09f624a70dfaa8ac3bf9d08 ]
In function ‘export_stats_init’, inlined from ‘svc_export_alloc’ at fs/nfsd/export.c:866:6: fs/nfsd/export.c:337:16: warning: ‘nfsd_percpu_counters_init’ accessing 40 bytes in a region of size 0 [-Wstringop-overflow=] 337 | return nfsd_percpu_counters_init(&stats->counter, EXP_STATS_COUNTERS_NUM); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fs/nfsd/export.c:337:16: note: referencing argument 1 of type ‘struct percpu_counter[0]’ fs/nfsd/stats.h: In function ‘svc_export_alloc’: fs/nfsd/stats.h:40:5: note: in a call to function ‘nfsd_percpu_counters_init’ 40 | int nfsd_percpu_counters_init(struct percpu_counter counters[], int num); | ^~~~~~~~~~~~~~~~~~~~~~~~~
Cc: Amir Goldstein amir73il@gmail.com Reviewed-by: Jeff Layton jlayton@kernel.org Stable-dep-of: 93483ac5fec6 ("nfsd: expose /proc/net/sunrpc/nfsd in net namespaces") Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/stats.c | 2 +- fs/nfsd/stats.h | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/nfsd/stats.c b/fs/nfsd/stats.c index 777e24e5da33..1fe6488a1cf9 100644 --- a/fs/nfsd/stats.c +++ b/fs/nfsd/stats.c @@ -74,7 +74,7 @@ static int nfsd_show(struct seq_file *seq, void *v)
DEFINE_PROC_SHOW_ATTRIBUTE(nfsd);
-int nfsd_percpu_counters_init(struct percpu_counter counters[], int num) +int nfsd_percpu_counters_init(struct percpu_counter *counters, int num) { int i, err = 0;
diff --git a/fs/nfsd/stats.h b/fs/nfsd/stats.h index 9b43dc3d9991..c3abe1830da5 100644 --- a/fs/nfsd/stats.h +++ b/fs/nfsd/stats.h @@ -36,9 +36,9 @@ extern struct nfsd_stats nfsdstats;
extern struct svc_stat nfsd_svcstats;
-int nfsd_percpu_counters_init(struct percpu_counter counters[], int num); -void nfsd_percpu_counters_reset(struct percpu_counter counters[], int num); -void nfsd_percpu_counters_destroy(struct percpu_counter counters[], int num); +int nfsd_percpu_counters_init(struct percpu_counter *counters, int num); +void nfsd_percpu_counters_reset(struct percpu_counter *counters, int num); +void nfsd_percpu_counters_destroy(struct percpu_counter *counters, int num); int nfsd_stat_init(void); void nfsd_stat_shutdown(void);
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 6939ace1f22681fface7841cdbf34d3204cc94b5 ]
fs/nfsd/export.c: In function 'svc_export_parse': fs/nfsd/export.c:737:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=] 737 | }
On my systems, svc_export_parse() has a stack frame of over 800 bytes, not 1040, but nonetheless, it could do with some reduction.
When a struct svc_export is on the stack, it's a temporary structure used as an argument, and not visible as an actual exported FS. No need to reserve space for export_stats in such cases.
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202310012359.YEw5IrK6-lkp@intel.com/ Cc: Amir Goldstein amir73il@gmail.com Reviewed-by: Jeff Layton jlayton@kernel.org Stable-dep-of: 4b14885411f7 ("nfsd: make all of the nfsd stats per-network namespace") [ cel: adjusted to apply to v6.1.y ] Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/export.c | 32 +++++++++++++++++++++++--------- fs/nfsd/export.h | 4 ++-- fs/nfsd/stats.h | 12 ++++++------ 3 files changed, 31 insertions(+), 17 deletions(-)
diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c index 668c7527b17e..16fadade86cc 100644 --- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -339,12 +339,16 @@ static int export_stats_init(struct export_stats *stats)
static void export_stats_reset(struct export_stats *stats) { - nfsd_percpu_counters_reset(stats->counter, EXP_STATS_COUNTERS_NUM); + if (stats) + nfsd_percpu_counters_reset(stats->counter, + EXP_STATS_COUNTERS_NUM); }
static void export_stats_destroy(struct export_stats *stats) { - nfsd_percpu_counters_destroy(stats->counter, EXP_STATS_COUNTERS_NUM); + if (stats) + nfsd_percpu_counters_destroy(stats->counter, + EXP_STATS_COUNTERS_NUM); }
static void svc_export_put(struct kref *ref) @@ -353,7 +357,8 @@ static void svc_export_put(struct kref *ref) path_put(&exp->ex_path); auth_domain_put(exp->ex_client); nfsd4_fslocs_free(&exp->ex_fslocs); - export_stats_destroy(&exp->ex_stats); + export_stats_destroy(exp->ex_stats); + kfree(exp->ex_stats); kfree(exp->ex_uuid); kfree_rcu(exp, ex_rcu); } @@ -744,13 +749,15 @@ static int svc_export_show(struct seq_file *m, seq_putc(m, '\t'); seq_escape(m, exp->ex_client->name, " \t\n\"); if (export_stats) { - seq_printf(m, "\t%lld\n", exp->ex_stats.start_time); + struct percpu_counter *counter = exp->ex_stats->counter; + + seq_printf(m, "\t%lld\n", exp->ex_stats->start_time); seq_printf(m, "\tfh_stale: %lld\n", - percpu_counter_sum_positive(&exp->ex_stats.counter[EXP_STATS_FH_STALE])); + percpu_counter_sum_positive(&counter[EXP_STATS_FH_STALE])); seq_printf(m, "\tio_read: %lld\n", - percpu_counter_sum_positive(&exp->ex_stats.counter[EXP_STATS_IO_READ])); + percpu_counter_sum_positive(&counter[EXP_STATS_IO_READ])); seq_printf(m, "\tio_write: %lld\n", - percpu_counter_sum_positive(&exp->ex_stats.counter[EXP_STATS_IO_WRITE])); + percpu_counter_sum_positive(&counter[EXP_STATS_IO_WRITE])); seq_putc(m, '\n'); return 0; } @@ -796,7 +803,7 @@ static void svc_export_init(struct cache_head *cnew, struct cache_head *citem) new->ex_layout_types = 0; new->ex_uuid = NULL; new->cd = item->cd; - export_stats_reset(&new->ex_stats); + export_stats_reset(new->ex_stats); }
static void export_update(struct cache_head *cnew, struct cache_head *citem) @@ -832,7 +839,14 @@ static struct cache_head *svc_export_alloc(void) if (!i) return NULL;
- if (export_stats_init(&i->ex_stats)) { + i->ex_stats = kmalloc(sizeof(*(i->ex_stats)), GFP_KERNEL); + if (!i->ex_stats) { + kfree(i); + return NULL; + } + + if (export_stats_init(i->ex_stats)) { + kfree(i->ex_stats); kfree(i); return NULL; } diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h index d03f7f6a8642..f73e23bb24a1 100644 --- a/fs/nfsd/export.h +++ b/fs/nfsd/export.h @@ -64,10 +64,10 @@ struct svc_export { struct cache_head h; struct auth_domain * ex_client; int ex_flags; + int ex_fsid; struct path ex_path; kuid_t ex_anon_uid; kgid_t ex_anon_gid; - int ex_fsid; unsigned char * ex_uuid; /* 16 byte fsid */ struct nfsd4_fs_locations ex_fslocs; uint32_t ex_nflavors; @@ -76,7 +76,7 @@ struct svc_export { struct nfsd4_deviceid_map *ex_devid_map; struct cache_detail *cd; struct rcu_head ex_rcu; - struct export_stats ex_stats; + struct export_stats *ex_stats; };
/* an "export key" (expkey) maps a filehandlefragement to an diff --git a/fs/nfsd/stats.h b/fs/nfsd/stats.h index c3abe1830da5..ac58c4b2ab70 100644 --- a/fs/nfsd/stats.h +++ b/fs/nfsd/stats.h @@ -60,22 +60,22 @@ static inline void nfsd_stats_rc_nocache_inc(void) static inline void nfsd_stats_fh_stale_inc(struct svc_export *exp) { percpu_counter_inc(&nfsdstats.counter[NFSD_STATS_FH_STALE]); - if (exp) - percpu_counter_inc(&exp->ex_stats.counter[EXP_STATS_FH_STALE]); + if (exp && exp->ex_stats) + percpu_counter_inc(&exp->ex_stats->counter[EXP_STATS_FH_STALE]); }
static inline void nfsd_stats_io_read_add(struct svc_export *exp, s64 amount) { percpu_counter_add(&nfsdstats.counter[NFSD_STATS_IO_READ], amount); - if (exp) - percpu_counter_add(&exp->ex_stats.counter[EXP_STATS_IO_READ], amount); + if (exp && exp->ex_stats) + percpu_counter_add(&exp->ex_stats->counter[EXP_STATS_IO_READ], amount); }
static inline void nfsd_stats_io_write_add(struct svc_export *exp, s64 amount) { percpu_counter_add(&nfsdstats.counter[NFSD_STATS_IO_WRITE], amount); - if (exp) - percpu_counter_add(&exp->ex_stats.counter[EXP_STATS_IO_WRITE], amount); + if (exp && exp->ex_stats) + percpu_counter_add(&exp->ex_stats->counter[EXP_STATS_IO_WRITE], amount); }
static inline void nfsd_stats_payload_misses_inc(struct nfsd_net *nn)
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit ab42f4d9a26f1723dcfd6c93fcf768032b2bb5e7 ]
We check for the existence of ->sv_stats elsewhere except in the core processing code. It appears that only nfsd actual exports these values anywhere, everybody else just has a write only copy of sv_stats in their svc_program. Add a check for ->sv_stats before every adjustment to allow us to eliminate the stats struct from all the users who don't report the stats.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org [ cel: adjusted to apply to v6.1.y ] Signed-off-by: Chuck Lever chuck.lever@oracle.com --- net/sunrpc/svc.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c index 666d738bcf07..35c0651cc002 100644 --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c @@ -1324,7 +1324,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) goto err_bad_proc;
/* Syntactic check complete */ - serv->sv_stats->rpccnt++; + if (serv->sv_stats) + serv->sv_stats->rpccnt++; trace_svc_process(rqstp, progp->pg_name);
/* Build the reply header. */ @@ -1377,7 +1378,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) goto close_xprt;
err_bad_rpc: - serv->sv_stats->rpcbadfmt++; + if (serv->sv_stats) + serv->sv_stats->rpcbadfmt++; svc_putnl(resv, 1); /* REJECT */ svc_putnl(resv, 0); /* RPC_MISMATCH */ svc_putnl(resv, 2); /* Only RPCv2 supported */ @@ -1387,7 +1389,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) err_bad_auth: dprintk("svc: authentication failed (%d)\n", be32_to_cpu(rqstp->rq_auth_stat)); - serv->sv_stats->rpcbadauth++; + if (serv->sv_stats) + serv->sv_stats->rpcbadauth++; /* Restore write pointer to location of accept status: */ xdr_ressize_check(rqstp, reply_statp); svc_putnl(resv, 1); /* REJECT */ @@ -1397,7 +1400,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
err_bad_prog: dprintk("svc: unknown program %d\n", prog); - serv->sv_stats->rpcbadfmt++; + if (serv->sv_stats) + serv->sv_stats->rpcbadfmt++; svc_putnl(resv, RPC_PROG_UNAVAIL); goto sendit;
@@ -1405,7 +1409,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) svc_printk(rqstp, "unknown version (%d for prog %d, %s)\n", rqstp->rq_vers, rqstp->rq_prog, progp->pg_name);
- serv->sv_stats->rpcbadfmt++; + if (serv->sv_stats) + serv->sv_stats->rpcbadfmt++; svc_putnl(resv, RPC_PROG_MISMATCH); svc_putnl(resv, process.mismatch.lovers); svc_putnl(resv, process.mismatch.hivers); @@ -1414,7 +1419,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) err_bad_proc: svc_printk(rqstp, "unknown procedure (%d)\n", rqstp->rq_proc);
- serv->sv_stats->rpcbadfmt++; + if (serv->sv_stats) + serv->sv_stats->rpcbadfmt++; svc_putnl(resv, RPC_PROC_UNAVAIL); goto sendit;
@@ -1423,7 +1429,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
rpc_stat = rpc_garbage_args; err_bad: - serv->sv_stats->rpcbadfmt++; + if (serv->sv_stats) + serv->sv_stats->rpcbadfmt++; svc_putnl(resv, ntohl(rpc_stat)); goto sendit; } @@ -1469,7 +1476,8 @@ svc_process(struct svc_rqst *rqstp) out_baddir: svc_printk(rqstp, "bad direction 0x%08x, dropping request\n", be32_to_cpu(dir)); - rqstp->rq_server->sv_stats->rpcbadfmt++; + if (rqstp->rq_server->sv_stats) + rqstp->rq_server->sv_stats->rpcbadfmt++; out_drop: svc_drop(rqstp); return 0;
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit a2214ed588fb3c5b9824a21cff870482510372bb ]
A lot of places are setting a blank svc_stats in ->pg_stats and never utilizing these stats. Remove all of these extra structs as we're not reporting these stats anywhere.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/lockd/svc.c | 3 --- fs/nfs/callback.c | 3 --- fs/nfsd/nfssvc.c | 5 ----- 3 files changed, 11 deletions(-)
diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c index 5579e67da17d..c33f78513f00 100644 --- a/fs/lockd/svc.c +++ b/fs/lockd/svc.c @@ -759,8 +759,6 @@ static const struct svc_version *nlmsvc_version[] = { #endif };
-static struct svc_stat nlmsvc_stats; - #define NLM_NRVERS ARRAY_SIZE(nlmsvc_version) static struct svc_program nlmsvc_program = { .pg_prog = NLM_PROGRAM, /* program number */ @@ -768,7 +766,6 @@ static struct svc_program nlmsvc_program = { .pg_vers = nlmsvc_version, /* version table */ .pg_name = "lockd", /* service name */ .pg_class = "nfsd", /* share authentication with nfsd */ - .pg_stats = &nlmsvc_stats, /* stats table */ .pg_authenticate = &lockd_authenticate, /* export authentication */ .pg_init_request = svc_generic_init_request, .pg_rpcbind_set = svc_generic_rpcbind_set, diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c index 46a0a2d6962e..e6445b556ce1 100644 --- a/fs/nfs/callback.c +++ b/fs/nfs/callback.c @@ -407,15 +407,12 @@ static const struct svc_version *nfs4_callback_version[] = { [4] = &nfs4_callback_version4, };
-static struct svc_stat nfs4_callback_stats; - static struct svc_program nfs4_callback_program = { .pg_prog = NFS4_CALLBACK, /* RPC service number */ .pg_nvers = ARRAY_SIZE(nfs4_callback_version), /* Number of entries */ .pg_vers = nfs4_callback_version, /* version table */ .pg_name = "NFSv4 callback", /* service name */ .pg_class = "nfs", /* authentication class */ - .pg_stats = &nfs4_callback_stats, .pg_authenticate = nfs_callback_authenticate, .pg_init_request = svc_generic_init_request, .pg_rpcbind_set = svc_generic_rpcbind_set, diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index d07ee284100f..e1a07cfdab23 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -89,7 +89,6 @@ unsigned long nfsd_drc_max_mem; unsigned long nfsd_drc_mem_used;
#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) -static struct svc_stat nfsd_acl_svcstats; static const struct svc_version *nfsd_acl_version[] = { # if defined(CONFIG_NFSD_V2_ACL) [2] = &nfsd_acl_version2, @@ -108,15 +107,11 @@ static struct svc_program nfsd_acl_program = { .pg_vers = nfsd_acl_version, .pg_name = "nfsacl", .pg_class = "nfsd", - .pg_stats = &nfsd_acl_svcstats, .pg_authenticate = &svc_set_client, .pg_init_request = nfsd_acl_init_request, .pg_rpcbind_set = nfsd_acl_rpcbind_set, };
-static struct svc_stat nfsd_acl_svcstats = { - .program = &nfsd_acl_program, -}; #endif /* defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) */
static const struct svc_version *nfsd_version[] = {
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit f094323867668d50124886ad884b665de7319537 ]
Since only one service actually reports the rpc stats there's not much of a reason to have a pointer to it in the svc_program struct. Adjust the svc_create_pooled function to take the sv_stats as an argument and pass the struct through there as desired instead of getting it from the svc_program->pg_stats.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org [ cel: adjusted to apply to v6.1.y ] Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfssvc.c | 3 ++- include/linux/sunrpc/svc.h | 4 +++- net/sunrpc/svc.c | 12 +++++++----- 3 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index e1a07cfdab23..34d5906b844b 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -657,7 +657,8 @@ int nfsd_create_serv(struct net *net) if (nfsd_max_blksize == 0) nfsd_max_blksize = nfsd_get_default_max_blksize(); nfsd_reset_versions(nn); - serv = svc_create_pooled(&nfsd_program, nfsd_max_blksize, nfsd); + serv = svc_create_pooled(&nfsd_program, &nfsd_svcstats, + nfsd_max_blksize, nfsd); if (serv == NULL) return -ENOMEM;
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h index 88de45491376..3290b805f749 100644 --- a/include/linux/sunrpc/svc.h +++ b/include/linux/sunrpc/svc.h @@ -493,7 +493,9 @@ void svc_rqst_replace_page(struct svc_rqst *rqstp, struct page *page); void svc_rqst_free(struct svc_rqst *); void svc_exit_thread(struct svc_rqst *); -struct svc_serv * svc_create_pooled(struct svc_program *, unsigned int, +struct svc_serv * svc_create_pooled(struct svc_program *prog, + struct svc_stat *stats, + unsigned int bufsize, int (*threadfn)(void *data)); int svc_set_num_threads(struct svc_serv *, struct svc_pool *, int); int svc_pool_stats_open(struct svc_serv *serv, struct file *file); diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c index 35c0651cc002..9ae85347ab39 100644 --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c @@ -453,8 +453,8 @@ __svc_init_bc(struct svc_serv *serv) * Create an RPC service */ static struct svc_serv * -__svc_create(struct svc_program *prog, unsigned int bufsize, int npools, - int (*threadfn)(void *data)) +__svc_create(struct svc_program *prog, struct svc_stat *stats, + unsigned int bufsize, int npools, int (*threadfn)(void *data)) { struct svc_serv *serv; unsigned int vers; @@ -466,7 +466,7 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools, serv->sv_name = prog->pg_name; serv->sv_program = prog; kref_init(&serv->sv_refcnt); - serv->sv_stats = prog->pg_stats; + serv->sv_stats = stats; if (bufsize > RPCSVC_MAXPAYLOAD) bufsize = RPCSVC_MAXPAYLOAD; serv->sv_max_payload = bufsize? bufsize : 4096; @@ -528,26 +528,28 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools, struct svc_serv *svc_create(struct svc_program *prog, unsigned int bufsize, int (*threadfn)(void *data)) { - return __svc_create(prog, bufsize, 1, threadfn); + return __svc_create(prog, NULL, bufsize, 1, threadfn); } EXPORT_SYMBOL_GPL(svc_create);
/** * svc_create_pooled - Create an RPC service with pooled threads * @prog: the RPC program the new service will handle + * @stats: the stats struct if desired * @bufsize: maximum message size for @prog * @threadfn: a function to service RPC requests for @prog * * Returns an instantiated struct svc_serv object or NULL. */ struct svc_serv *svc_create_pooled(struct svc_program *prog, + struct svc_stat *stats, unsigned int bufsize, int (*threadfn)(void *data)) { struct svc_serv *serv; unsigned int npools = svc_pool_map_get();
- serv = __svc_create(prog, bufsize, npools, threadfn); + serv = __svc_create(prog, stats, bufsize, npools, threadfn); if (!serv) goto out_err; return serv;
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 3f6ef182f144dcc9a4d942f97b6a8ed969f13c95 ]
Now that this isn't used anywhere, remove it.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org [ cel: adjusted to apply to v6.1.y ] Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfssvc.c | 1 - include/linux/sunrpc/svc.h | 1 - 2 files changed, 2 deletions(-)
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index 34d5906b844b..5ddb1f36f82e 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -136,7 +136,6 @@ struct svc_program nfsd_program = { .pg_vers = nfsd_version, /* version table */ .pg_name = "nfsd", /* program name */ .pg_class = "nfsd", /* authentication class */ - .pg_stats = &nfsd_svcstats, /* version table */ .pg_authenticate = &svc_set_client, /* export authentication */ .pg_init_request = nfsd_init_request, .pg_rpcbind_set = nfsd_rpcbind_set, diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h index 3290b805f749..912da376ef9b 100644 --- a/include/linux/sunrpc/svc.h +++ b/include/linux/sunrpc/svc.h @@ -422,7 +422,6 @@ struct svc_program { const struct svc_version **pg_vers; /* version array */ char * pg_name; /* service name */ char * pg_class; /* class name: services sharing authentication */ - struct svc_stat * pg_stats; /* rpc statistics */ int (*pg_authenticate)(struct svc_rqst *); __be32 (*pg_init_request)(struct svc_rqst *, const struct svc_program *,
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 418b9687dece5bd763c09b5c27a801a7e3387be9 ]
nfsd is the only thing using this helper, and it doesn't use the private currently. When we switch to per-network namespace stats we will need the struct net * in order to get to the nfsd_net. Use the net as the proc private so we can utilize this when we make the switch over.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com --- net/sunrpc/stats.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sunrpc/stats.c b/net/sunrpc/stats.c index 52908f9e6eab..9a0b3e8cc62d 100644 --- a/net/sunrpc/stats.c +++ b/net/sunrpc/stats.c @@ -309,7 +309,7 @@ EXPORT_SYMBOL_GPL(rpc_proc_unregister); struct proc_dir_entry * svc_proc_register(struct net *net, struct svc_stat *statp, const struct proc_ops *proc_ops) { - return do_register(net, statp->program->pg_name, statp, proc_ops); + return do_register(net, statp->program->pg_name, net, proc_ops); } EXPORT_SYMBOL_GPL(svc_proc_register);
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit d98416cc2154053950610bb6880911e3dcbdf8c5 ]
We're going to merge the stats all into per network namespace in subsequent patches, rename these nn counters to be consistent with the rest of the stats.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/netns.h | 4 ++-- fs/nfsd/nfscache.c | 4 ++-- fs/nfsd/stats.h | 6 +++--- 3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index 51a4b7885cae..d1428f96aa5c 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -25,9 +25,9 @@ struct nfsd4_client_tracking_ops;
enum { /* cache misses due only to checksum comparison failures */ - NFSD_NET_PAYLOAD_MISSES, + NFSD_STATS_PAYLOAD_MISSES, /* amount of memory (in bytes) currently consumed by the DRC */ - NFSD_NET_DRC_MEM_USAGE, + NFSD_STATS_DRC_MEM_USAGE, NFSD_NET_COUNTERS_NUM };
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 049565bbef2d..178d7063fffc 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -696,7 +696,7 @@ int nfsd_reply_cache_stats_show(struct seq_file *m, void *v) atomic_read(&nn->num_drc_entries)); seq_printf(m, "hash buckets: %u\n", 1 << nn->maskbits); seq_printf(m, "mem usage: %lld\n", - percpu_counter_sum_positive(&nn->counter[NFSD_NET_DRC_MEM_USAGE])); + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_DRC_MEM_USAGE])); seq_printf(m, "cache hits: %lld\n", percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_HITS])); seq_printf(m, "cache misses: %lld\n", @@ -704,7 +704,7 @@ int nfsd_reply_cache_stats_show(struct seq_file *m, void *v) seq_printf(m, "not cached: %lld\n", percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_NOCACHE])); seq_printf(m, "payload misses: %lld\n", - percpu_counter_sum_positive(&nn->counter[NFSD_NET_PAYLOAD_MISSES])); + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_PAYLOAD_MISSES])); seq_printf(m, "longest chain len: %u\n", nn->longest_chain); seq_printf(m, "cachesize at longest: %u\n", nn->longest_chain_cachesize); return 0; diff --git a/fs/nfsd/stats.h b/fs/nfsd/stats.h index ac58c4b2ab70..a660f0fb799f 100644 --- a/fs/nfsd/stats.h +++ b/fs/nfsd/stats.h @@ -80,17 +80,17 @@ static inline void nfsd_stats_io_write_add(struct svc_export *exp, s64 amount)
static inline void nfsd_stats_payload_misses_inc(struct nfsd_net *nn) { - percpu_counter_inc(&nn->counter[NFSD_NET_PAYLOAD_MISSES]); + percpu_counter_inc(&nn->counter[NFSD_STATS_PAYLOAD_MISSES]); }
static inline void nfsd_stats_drc_mem_usage_add(struct nfsd_net *nn, s64 amount) { - percpu_counter_add(&nn->counter[NFSD_NET_DRC_MEM_USAGE], amount); + percpu_counter_add(&nn->counter[NFSD_STATS_DRC_MEM_USAGE], amount); }
static inline void nfsd_stats_drc_mem_usage_sub(struct nfsd_net *nn, s64 amount) { - percpu_counter_sub(&nn->counter[NFSD_NET_DRC_MEM_USAGE], amount); + percpu_counter_sub(&nn->counter[NFSD_STATS_DRC_MEM_USAGE], amount); }
#endif /* _NFSD_STATS_H */
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 93483ac5fec62cc1de166051b219d953bb5e4ef4 ]
We are running nfsd servers inside of containers with their own network namespace, and we want to monitor these services using the stats found in /proc. However these are not exposed in the proc inside of the container, so we have to bind mount the host /proc into our containers to get at this information.
Separate out the stat counters init and the proc registration, and move the proc registration into the pernet operations entry and exit points so that these stats can be exposed inside of network namespaces.
This is an intermediate step, this just exposes the global counters in the network namespace. Subsequent patches will move these counters into the per-network namespace container.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfsctl.c | 8 +++++--- fs/nfsd/stats.c | 21 ++++++--------------- fs/nfsd/stats.h | 6 ++++-- 3 files changed, 15 insertions(+), 20 deletions(-)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 20e603ff6998..fde309b26cbb 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1458,6 +1458,7 @@ static __net_init int nfsd_init_net(struct net *net) nfsd4_init_leases_net(nn); get_random_bytes(&nn->siphash_key, sizeof(nn->siphash_key)); seqlock_init(&nn->writeverf_lock); + nfsd_proc_stat_init(net);
return 0;
@@ -1473,6 +1474,7 @@ static __net_exit void nfsd_exit_net(struct net *net) { struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ nfsd_proc_stat_shutdown(net); nfsd_net_reply_cache_destroy(nn); nfsd_idmap_shutdown(net); nfsd_export_shutdown(net); @@ -1496,7 +1498,7 @@ static int __init init_nfsd(void) retval = nfsd4_init_pnfs(); if (retval) goto out_free_slabs; - retval = nfsd_stat_init(); /* Statistics */ + retval = nfsd_stat_counters_init(); /* Statistics */ if (retval) goto out_free_pnfs; retval = nfsd_drc_slab_create(); @@ -1532,7 +1534,7 @@ static int __init init_nfsd(void) nfsd_lockd_shutdown(); nfsd_drc_slab_free(); out_free_stat: - nfsd_stat_shutdown(); + nfsd_stat_counters_destroy(); out_free_pnfs: nfsd4_exit_pnfs(); out_free_slabs: @@ -1549,7 +1551,7 @@ static void __exit exit_nfsd(void) nfsd_drc_slab_free(); remove_proc_entry("fs/nfs/exports", NULL); remove_proc_entry("fs/nfs", NULL); - nfsd_stat_shutdown(); + nfsd_stat_counters_destroy(); nfsd_lockd_shutdown(); nfsd4_free_slabs(); nfsd4_exit_pnfs(); diff --git a/fs/nfsd/stats.c b/fs/nfsd/stats.c index 1fe6488a1cf9..22d57f92187e 100644 --- a/fs/nfsd/stats.c +++ b/fs/nfsd/stats.c @@ -106,31 +106,22 @@ void nfsd_percpu_counters_destroy(struct percpu_counter counters[], int num) percpu_counter_destroy(&counters[i]); }
-static int nfsd_stat_counters_init(void) +int nfsd_stat_counters_init(void) { return nfsd_percpu_counters_init(nfsdstats.counter, NFSD_STATS_COUNTERS_NUM); }
-static void nfsd_stat_counters_destroy(void) +void nfsd_stat_counters_destroy(void) { nfsd_percpu_counters_destroy(nfsdstats.counter, NFSD_STATS_COUNTERS_NUM); }
-int nfsd_stat_init(void) +void nfsd_proc_stat_init(struct net *net) { - int err; - - err = nfsd_stat_counters_init(); - if (err) - return err; - - svc_proc_register(&init_net, &nfsd_svcstats, &nfsd_proc_ops); - - return 0; + svc_proc_register(net, &nfsd_svcstats, &nfsd_proc_ops); }
-void nfsd_stat_shutdown(void) +void nfsd_proc_stat_shutdown(struct net *net) { - nfsd_stat_counters_destroy(); - svc_proc_unregister(&init_net, "nfsd"); + svc_proc_unregister(net, "nfsd"); } diff --git a/fs/nfsd/stats.h b/fs/nfsd/stats.h index a660f0fb799f..31756a9a8a0a 100644 --- a/fs/nfsd/stats.h +++ b/fs/nfsd/stats.h @@ -39,8 +39,10 @@ extern struct svc_stat nfsd_svcstats; int nfsd_percpu_counters_init(struct percpu_counter *counters, int num); void nfsd_percpu_counters_reset(struct percpu_counter *counters, int num); void nfsd_percpu_counters_destroy(struct percpu_counter *counters, int num); -int nfsd_stat_init(void); -void nfsd_stat_shutdown(void); +int nfsd_stat_counters_init(void); +void nfsd_stat_counters_destroy(void); +void nfsd_proc_stat_init(struct net *net); +void nfsd_proc_stat_shutdown(struct net *net);
static inline void nfsd_stats_rc_hits_inc(void) {
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 4b14885411f74b2b0ce0eb2b39d0fffe54e5ca0d ]
We have a global set of counters that we modify for all of the nfsd operations, but now that we're exposing these stats across all network namespaces we need to make the stats also be per-network namespace. We already have some caching stats that are per-network namespace, so move these definitions into the same counter and then adjust all the helpers and users of these stats to provide the appropriate nfsd_net struct so that the stats are maintained for the per-network namespace objects.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org [ cel: adjusted to apply to v6.1.y ] Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/cache.h | 2 -- fs/nfsd/netns.h | 17 ++++++++++++++-- fs/nfsd/nfs4proc.c | 6 +++--- fs/nfsd/nfscache.c | 36 +++++++--------------------------- fs/nfsd/nfsctl.c | 12 +++--------- fs/nfsd/nfsfh.c | 3 ++- fs/nfsd/stats.c | 24 ++++++++++++----------- fs/nfsd/stats.h | 49 ++++++++++++++++------------------------------ fs/nfsd/vfs.c | 6 ++++-- 9 files changed, 64 insertions(+), 91 deletions(-)
diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h index cf7c5f557986..3c07d587ae9e 100644 --- a/fs/nfsd/cache.h +++ b/fs/nfsd/cache.h @@ -80,8 +80,6 @@ enum {
int nfsd_drc_slab_create(void); void nfsd_drc_slab_free(void); -int nfsd_net_reply_cache_init(struct nfsd_net *nn); -void nfsd_net_reply_cache_destroy(struct nfsd_net *nn); int nfsd_reply_cache_init(struct nfsd_net *); void nfsd_reply_cache_shutdown(struct nfsd_net *); int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index d1428f96aa5c..55ab92326384 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -10,6 +10,7 @@
#include <net/net_namespace.h> #include <net/netns/generic.h> +#include <linux/nfs4.h> #include <linux/percpu_counter.h> #include <linux/siphash.h>
@@ -28,7 +29,19 @@ enum { NFSD_STATS_PAYLOAD_MISSES, /* amount of memory (in bytes) currently consumed by the DRC */ NFSD_STATS_DRC_MEM_USAGE, - NFSD_NET_COUNTERS_NUM + NFSD_STATS_RC_HITS, /* repcache hits */ + NFSD_STATS_RC_MISSES, /* repcache misses */ + NFSD_STATS_RC_NOCACHE, /* uncached reqs */ + NFSD_STATS_FH_STALE, /* FH stale error */ + NFSD_STATS_IO_READ, /* bytes returned to read requests */ + NFSD_STATS_IO_WRITE, /* bytes passed in write requests */ +#ifdef CONFIG_NFSD_V4 + NFSD_STATS_FIRST_NFS4_OP, /* count of individual nfsv4 operations */ + NFSD_STATS_LAST_NFS4_OP = NFSD_STATS_FIRST_NFS4_OP + LAST_NFS4_OP, +#define NFSD_STATS_NFS4_OP(op) (NFSD_STATS_FIRST_NFS4_OP + (op)) + NFSD_STATS_WDELEG_GETATTR, /* count of getattr conflict with wdeleg */ +#endif + NFSD_STATS_COUNTERS_NUM };
/* @@ -168,7 +181,7 @@ struct nfsd_net { atomic_t num_drc_entries;
/* Per-netns stats counters */ - struct percpu_counter counter[NFSD_NET_COUNTERS_NUM]; + struct percpu_counter counter[NFSD_STATS_COUNTERS_NUM];
/* longest hash chain seen */ unsigned int longest_chain; diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index b6d768bd5ccc..7451cd34710d 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -2430,10 +2430,10 @@ nfsd4_proc_null(struct svc_rqst *rqstp) return rpc_success; }
-static inline void nfsd4_increment_op_stats(u32 opnum) +static inline void nfsd4_increment_op_stats(struct nfsd_net *nn, u32 opnum) { if (opnum >= FIRST_NFS4_OP && opnum <= LAST_NFS4_OP) - percpu_counter_inc(&nfsdstats.counter[NFSD_STATS_NFS4_OP(opnum)]); + percpu_counter_inc(&nn->counter[NFSD_STATS_NFS4_OP(opnum)]); }
static const struct nfsd4_operation nfsd4_ops[]; @@ -2708,7 +2708,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) status, nfsd4_op_name(op->opnum));
nfsd4_cstate_clear_replay(cstate); - nfsd4_increment_op_stats(op->opnum); + nfsd4_increment_op_stats(nn, op->opnum); }
fh_put(current_fh); diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 178d7063fffc..50ed64a51551 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -176,27 +176,6 @@ void nfsd_drc_slab_free(void) kmem_cache_destroy(drc_slab); }
-/** - * nfsd_net_reply_cache_init - per net namespace reply cache set-up - * @nn: nfsd_net being initialized - * - * Returns zero on succes; otherwise a negative errno is returned. - */ -int nfsd_net_reply_cache_init(struct nfsd_net *nn) -{ - return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM); -} - -/** - * nfsd_net_reply_cache_destroy - per net namespace reply cache tear-down - * @nn: nfsd_net being freed - * - */ -void nfsd_net_reply_cache_destroy(struct nfsd_net *nn) -{ - nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM); -} - int nfsd_reply_cache_init(struct nfsd_net *nn) { unsigned int hashsize; @@ -501,7 +480,7 @@ nfsd_cache_insert(struct nfsd_drc_bucket *b, struct svc_cacherep *key, int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, unsigned int len) { - struct nfsd_net *nn; + struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); struct svc_cacherep *rp, *found; __wsum csum; struct nfsd_drc_bucket *b; @@ -512,7 +491,7 @@ int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start,
rqstp->rq_cacherep = NULL; if (type == RC_NOCACHE) { - nfsd_stats_rc_nocache_inc(); + nfsd_stats_rc_nocache_inc(nn); goto out; }
@@ -522,7 +501,6 @@ int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, * Since the common case is a cache miss followed by an insert, * preallocate an entry. */ - nn = net_generic(SVC_NET(rqstp), nfsd_net_id); rp = nfsd_cacherep_alloc(rqstp, csum, nn); if (!rp) goto out; @@ -540,7 +518,7 @@ int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, freed = nfsd_cacherep_dispose(&dispose); trace_nfsd_drc_gc(nn, freed);
- nfsd_stats_rc_misses_inc(); + nfsd_stats_rc_misses_inc(nn); atomic_inc(&nn->num_drc_entries); nfsd_stats_drc_mem_usage_add(nn, sizeof(*rp)); goto out; @@ -548,7 +526,7 @@ int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start, found_entry: /* We found a matching entry which is either in progress or done. */ nfsd_reply_cache_free_locked(NULL, rp, nn); - nfsd_stats_rc_hits_inc(); + nfsd_stats_rc_hits_inc(nn); rtn = RC_DROPIT; rp = found;
@@ -698,11 +676,11 @@ int nfsd_reply_cache_stats_show(struct seq_file *m, void *v) seq_printf(m, "mem usage: %lld\n", percpu_counter_sum_positive(&nn->counter[NFSD_STATS_DRC_MEM_USAGE])); seq_printf(m, "cache hits: %lld\n", - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_HITS])); + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_RC_HITS])); seq_printf(m, "cache misses: %lld\n", - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_MISSES])); + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_RC_MISSES])); seq_printf(m, "not cached: %lld\n", - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_NOCACHE])); + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_RC_NOCACHE])); seq_printf(m, "payload misses: %lld\n", percpu_counter_sum_positive(&nn->counter[NFSD_STATS_PAYLOAD_MISSES])); seq_printf(m, "longest chain len: %u\n", nn->longest_chain); diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index fde309b26cbb..d7a481aa1dac 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1450,7 +1450,7 @@ static __net_init int nfsd_init_net(struct net *net) retval = nfsd_idmap_init(net); if (retval) goto out_idmap_error; - retval = nfsd_net_reply_cache_init(nn); + retval = nfsd_stat_counters_init(nn); if (retval) goto out_repcache_error; nn->nfsd_versions = NULL; @@ -1475,7 +1475,7 @@ static __net_exit void nfsd_exit_net(struct net *net) struct nfsd_net *nn = net_generic(net, nfsd_net_id);
nfsd_proc_stat_shutdown(net); - nfsd_net_reply_cache_destroy(nn); + nfsd_stat_counters_destroy(nn); nfsd_idmap_shutdown(net); nfsd_export_shutdown(net); nfsd_netns_free_versions(nn); @@ -1498,12 +1498,9 @@ static int __init init_nfsd(void) retval = nfsd4_init_pnfs(); if (retval) goto out_free_slabs; - retval = nfsd_stat_counters_init(); /* Statistics */ - if (retval) - goto out_free_pnfs; retval = nfsd_drc_slab_create(); if (retval) - goto out_free_stat; + goto out_free_pnfs; nfsd_lockd_init(); /* lockd->nfsd callbacks */ retval = create_proc_exports_entry(); if (retval) @@ -1533,8 +1530,6 @@ static int __init init_nfsd(void) out_free_lockd: nfsd_lockd_shutdown(); nfsd_drc_slab_free(); -out_free_stat: - nfsd_stat_counters_destroy(); out_free_pnfs: nfsd4_exit_pnfs(); out_free_slabs: @@ -1551,7 +1546,6 @@ static void __exit exit_nfsd(void) nfsd_drc_slab_free(); remove_proc_entry("fs/nfs/exports", NULL); remove_proc_entry("fs/nfs", NULL); - nfsd_stat_counters_destroy(); nfsd_lockd_shutdown(); nfsd4_free_slabs(); nfsd4_exit_pnfs(); diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index 3a2ad88ae648..e73e9d44f1b0 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -327,6 +327,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) __be32 fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, int access) { + struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); struct svc_export *exp = NULL; struct dentry *dentry; __be32 error; @@ -395,7 +396,7 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, int access) out: trace_nfsd_fh_verify_err(rqstp, fhp, type, access, error); if (error == nfserr_stale) - nfsd_stats_fh_stale_inc(exp); + nfsd_stats_fh_stale_inc(nn, exp); return error; }
diff --git a/fs/nfsd/stats.c b/fs/nfsd/stats.c index 22d57f92187e..fdc090284c8d 100644 --- a/fs/nfsd/stats.c +++ b/fs/nfsd/stats.c @@ -34,15 +34,17 @@ struct svc_stat nfsd_svcstats = {
static int nfsd_show(struct seq_file *seq, void *v) { + struct net *net = pde_data(file_inode(seq->file)); + struct nfsd_net *nn = net_generic(net, nfsd_net_id); int i;
seq_printf(seq, "rc %lld %lld %lld\nfh %lld 0 0 0 0\nio %lld %lld\n", - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_HITS]), - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_MISSES]), - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_NOCACHE]), - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_FH_STALE]), - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_IO_READ]), - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_IO_WRITE])); + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_RC_HITS]), + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_RC_MISSES]), + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_RC_NOCACHE]), + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_FH_STALE]), + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_IO_READ]), + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_IO_WRITE]));
/* thread usage: */ seq_printf(seq, "th %u 0", atomic_read(&nfsdstats.th_cnt)); @@ -63,7 +65,7 @@ static int nfsd_show(struct seq_file *seq, void *v) seq_printf(seq,"proc4ops %u", LAST_NFS4_OP + 1); for (i = 0; i <= LAST_NFS4_OP; i++) { seq_printf(seq, " %lld", - percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_NFS4_OP(i)])); + percpu_counter_sum_positive(&nn->counter[NFSD_STATS_NFS4_OP(i)])); }
seq_putc(seq, '\n'); @@ -106,14 +108,14 @@ void nfsd_percpu_counters_destroy(struct percpu_counter counters[], int num) percpu_counter_destroy(&counters[i]); }
-int nfsd_stat_counters_init(void) +int nfsd_stat_counters_init(struct nfsd_net *nn) { - return nfsd_percpu_counters_init(nfsdstats.counter, NFSD_STATS_COUNTERS_NUM); + return nfsd_percpu_counters_init(nn->counter, NFSD_STATS_COUNTERS_NUM); }
-void nfsd_stat_counters_destroy(void) +void nfsd_stat_counters_destroy(struct nfsd_net *nn) { - nfsd_percpu_counters_destroy(nfsdstats.counter, NFSD_STATS_COUNTERS_NUM); + nfsd_percpu_counters_destroy(nn->counter, NFSD_STATS_COUNTERS_NUM); }
void nfsd_proc_stat_init(struct net *net) diff --git a/fs/nfsd/stats.h b/fs/nfsd/stats.h index 31756a9a8a0a..28f5c720e9b3 100644 --- a/fs/nfsd/stats.h +++ b/fs/nfsd/stats.h @@ -10,25 +10,7 @@ #include <uapi/linux/nfsd/stats.h> #include <linux/percpu_counter.h>
- -enum { - NFSD_STATS_RC_HITS, /* repcache hits */ - NFSD_STATS_RC_MISSES, /* repcache misses */ - NFSD_STATS_RC_NOCACHE, /* uncached reqs */ - NFSD_STATS_FH_STALE, /* FH stale error */ - NFSD_STATS_IO_READ, /* bytes returned to read requests */ - NFSD_STATS_IO_WRITE, /* bytes passed in write requests */ -#ifdef CONFIG_NFSD_V4 - NFSD_STATS_FIRST_NFS4_OP, /* count of individual nfsv4 operations */ - NFSD_STATS_LAST_NFS4_OP = NFSD_STATS_FIRST_NFS4_OP + LAST_NFS4_OP, -#define NFSD_STATS_NFS4_OP(op) (NFSD_STATS_FIRST_NFS4_OP + (op)) -#endif - NFSD_STATS_COUNTERS_NUM -}; - struct nfsd_stats { - struct percpu_counter counter[NFSD_STATS_COUNTERS_NUM]; - atomic_t th_cnt; /* number of available threads */ };
@@ -39,43 +21,46 @@ extern struct svc_stat nfsd_svcstats; int nfsd_percpu_counters_init(struct percpu_counter *counters, int num); void nfsd_percpu_counters_reset(struct percpu_counter *counters, int num); void nfsd_percpu_counters_destroy(struct percpu_counter *counters, int num); -int nfsd_stat_counters_init(void); -void nfsd_stat_counters_destroy(void); +int nfsd_stat_counters_init(struct nfsd_net *nn); +void nfsd_stat_counters_destroy(struct nfsd_net *nn); void nfsd_proc_stat_init(struct net *net); void nfsd_proc_stat_shutdown(struct net *net);
-static inline void nfsd_stats_rc_hits_inc(void) +static inline void nfsd_stats_rc_hits_inc(struct nfsd_net *nn) { - percpu_counter_inc(&nfsdstats.counter[NFSD_STATS_RC_HITS]); + percpu_counter_inc(&nn->counter[NFSD_STATS_RC_HITS]); }
-static inline void nfsd_stats_rc_misses_inc(void) +static inline void nfsd_stats_rc_misses_inc(struct nfsd_net *nn) { - percpu_counter_inc(&nfsdstats.counter[NFSD_STATS_RC_MISSES]); + percpu_counter_inc(&nn->counter[NFSD_STATS_RC_MISSES]); }
-static inline void nfsd_stats_rc_nocache_inc(void) +static inline void nfsd_stats_rc_nocache_inc(struct nfsd_net *nn) { - percpu_counter_inc(&nfsdstats.counter[NFSD_STATS_RC_NOCACHE]); + percpu_counter_inc(&nn->counter[NFSD_STATS_RC_NOCACHE]); }
-static inline void nfsd_stats_fh_stale_inc(struct svc_export *exp) +static inline void nfsd_stats_fh_stale_inc(struct nfsd_net *nn, + struct svc_export *exp) { - percpu_counter_inc(&nfsdstats.counter[NFSD_STATS_FH_STALE]); + percpu_counter_inc(&nn->counter[NFSD_STATS_FH_STALE]); if (exp && exp->ex_stats) percpu_counter_inc(&exp->ex_stats->counter[EXP_STATS_FH_STALE]); }
-static inline void nfsd_stats_io_read_add(struct svc_export *exp, s64 amount) +static inline void nfsd_stats_io_read_add(struct nfsd_net *nn, + struct svc_export *exp, s64 amount) { - percpu_counter_add(&nfsdstats.counter[NFSD_STATS_IO_READ], amount); + percpu_counter_add(&nn->counter[NFSD_STATS_IO_READ], amount); if (exp && exp->ex_stats) percpu_counter_add(&exp->ex_stats->counter[EXP_STATS_IO_READ], amount); }
-static inline void nfsd_stats_io_write_add(struct svc_export *exp, s64 amount) +static inline void nfsd_stats_io_write_add(struct nfsd_net *nn, + struct svc_export *exp, s64 amount) { - percpu_counter_add(&nfsdstats.counter[NFSD_STATS_IO_WRITE], amount); + percpu_counter_add(&nn->counter[NFSD_STATS_IO_WRITE], amount); if (exp && exp->ex_stats) percpu_counter_add(&exp->ex_stats->counter[EXP_STATS_IO_WRITE], amount); } diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 5d6a61d47a90..17e96e58e772 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -983,7 +983,9 @@ static __be32 nfsd_finish_read(struct svc_rqst *rqstp, struct svc_fh *fhp, unsigned long *count, u32 *eof, ssize_t host_err) { if (host_err >= 0) { - nfsd_stats_io_read_add(fhp->fh_export, host_err); + struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); + + nfsd_stats_io_read_add(nn, fhp->fh_export, host_err); *eof = nfsd_eof_on_read(file, offset, host_err, *count); *count = host_err; fsnotify_access(file); @@ -1126,7 +1128,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf, goto out_nfserr; } *cnt = host_err; - nfsd_stats_io_write_add(exp, *cnt); + nfsd_stats_io_write_add(nn, exp, *cnt); fsnotify_modify(file); host_err = filemap_check_wb_err(file->f_mapping, since); if (host_err < 0)
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit e41ee44cc6a473b1f414031782c3b4283d7f3e5f ]
This is the last global stat, take it out of the nfsd_stats struct and make it a global part of nfsd, report it the same as always.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/nfsd.h | 1 + fs/nfsd/nfssvc.c | 5 +++-- fs/nfsd/stats.c | 3 +-- fs/nfsd/stats.h | 6 ------ 4 files changed, 5 insertions(+), 10 deletions(-)
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h index fa0144a74267..0e557fb60a0e 100644 --- a/fs/nfsd/nfsd.h +++ b/fs/nfsd/nfsd.h @@ -69,6 +69,7 @@ extern struct mutex nfsd_mutex; extern spinlock_t nfsd_drc_lock; extern unsigned long nfsd_drc_max_mem; extern unsigned long nfsd_drc_mem_used; +extern atomic_t nfsd_th_cnt; /* number of available threads */
extern const struct seq_operations nfs_exports_op;
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index 5ddb1f36f82e..8b944620d798 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -34,6 +34,7 @@
#define NFSDDBG_FACILITY NFSDDBG_SVC
+atomic_t nfsd_th_cnt = ATOMIC_INIT(0); extern struct svc_program nfsd_program; static int nfsd(void *vrqstp); #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) @@ -955,7 +956,7 @@ nfsd(void *vrqstp)
current->fs->umask = 0;
- atomic_inc(&nfsdstats.th_cnt); + atomic_inc(&nfsd_th_cnt);
set_freezable();
@@ -979,7 +980,7 @@ nfsd(void *vrqstp) validate_process_creds(); }
- atomic_dec(&nfsdstats.th_cnt); + atomic_dec(&nfsd_th_cnt);
out: /* Take an extra ref so that the svc_put in svc_exit_thread() diff --git a/fs/nfsd/stats.c b/fs/nfsd/stats.c index fdc090284c8d..cd5e48382fba 100644 --- a/fs/nfsd/stats.c +++ b/fs/nfsd/stats.c @@ -27,7 +27,6 @@
#include "nfsd.h"
-struct nfsd_stats nfsdstats; struct svc_stat nfsd_svcstats = { .program = &nfsd_program, }; @@ -47,7 +46,7 @@ static int nfsd_show(struct seq_file *seq, void *v) percpu_counter_sum_positive(&nn->counter[NFSD_STATS_IO_WRITE]));
/* thread usage: */ - seq_printf(seq, "th %u 0", atomic_read(&nfsdstats.th_cnt)); + seq_printf(seq, "th %u 0", atomic_read(&nfsd_th_cnt));
/* deprecated thread usage histogram stats */ for (i = 0; i < 10; i++) diff --git a/fs/nfsd/stats.h b/fs/nfsd/stats.h index 28f5c720e9b3..9b22b1ae929f 100644 --- a/fs/nfsd/stats.h +++ b/fs/nfsd/stats.h @@ -10,12 +10,6 @@ #include <uapi/linux/nfsd/stats.h> #include <linux/percpu_counter.h>
-struct nfsd_stats { - atomic_t th_cnt; /* number of available threads */ -}; - -extern struct nfsd_stats nfsdstats; - extern struct svc_stat nfsd_svcstats;
int nfsd_percpu_counters_init(struct percpu_counter *counters, int num);
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 16fb9808ab2c99979f081987752abcbc5b092eac ]
The final bit of stats that is global is the rpc svc_stat. Move this into the nfsd_net struct and use that everywhere instead of the global struct. Remove the unused global struct.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com --- fs/nfsd/netns.h | 4 ++++ fs/nfsd/nfsctl.c | 2 ++ fs/nfsd/nfssvc.c | 2 +- fs/nfsd/stats.c | 10 ++++------ fs/nfsd/stats.h | 2 -- 5 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index 55ab92326384..548422b24a7d 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -13,6 +13,7 @@ #include <linux/nfs4.h> #include <linux/percpu_counter.h> #include <linux/siphash.h> +#include <linux/sunrpc/stats.h>
/* Hash tables for nfs4_clientid state */ #define CLIENT_HASH_BITS 4 @@ -183,6 +184,9 @@ struct nfsd_net { /* Per-netns stats counters */ struct percpu_counter counter[NFSD_STATS_COUNTERS_NUM];
+ /* sunrpc svc stats */ + struct svc_stat nfsd_svcstats; + /* longest hash chain seen */ unsigned int longest_chain;
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index d7a481aa1dac..813ae75e7128 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1453,6 +1453,8 @@ static __net_init int nfsd_init_net(struct net *net) retval = nfsd_stat_counters_init(nn); if (retval) goto out_repcache_error; + memset(&nn->nfsd_svcstats, 0, sizeof(nn->nfsd_svcstats)); + nn->nfsd_svcstats.program = &nfsd_program; nn->nfsd_versions = NULL; nn->nfsd4_minorversions = NULL; nfsd4_init_leases_net(nn); diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index 8b944620d798..9eb529969b22 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -657,7 +657,7 @@ int nfsd_create_serv(struct net *net) if (nfsd_max_blksize == 0) nfsd_max_blksize = nfsd_get_default_max_blksize(); nfsd_reset_versions(nn); - serv = svc_create_pooled(&nfsd_program, &nfsd_svcstats, + serv = svc_create_pooled(&nfsd_program, &nn->nfsd_svcstats, nfsd_max_blksize, nfsd); if (serv == NULL) return -ENOMEM; diff --git a/fs/nfsd/stats.c b/fs/nfsd/stats.c index cd5e48382fba..36f1373bbe3f 100644 --- a/fs/nfsd/stats.c +++ b/fs/nfsd/stats.c @@ -27,10 +27,6 @@
#include "nfsd.h"
-struct svc_stat nfsd_svcstats = { - .program = &nfsd_program, -}; - static int nfsd_show(struct seq_file *seq, void *v) { struct net *net = pde_data(file_inode(seq->file)); @@ -56,7 +52,7 @@ static int nfsd_show(struct seq_file *seq, void *v) seq_puts(seq, "\nra 0 0 0 0 0 0 0 0 0 0 0 0\n");
/* show my rpc info */ - svc_seq_show(seq, &nfsd_svcstats); + svc_seq_show(seq, &nn->nfsd_svcstats);
#ifdef CONFIG_NFSD_V4 /* Show count for individual nfsv4 operations */ @@ -119,7 +115,9 @@ void nfsd_stat_counters_destroy(struct nfsd_net *nn)
void nfsd_proc_stat_init(struct net *net) { - svc_proc_register(net, &nfsd_svcstats, &nfsd_proc_ops); + struct nfsd_net *nn = net_generic(net, nfsd_net_id); + + svc_proc_register(net, &nn->nfsd_svcstats, &nfsd_proc_ops); }
void nfsd_proc_stat_shutdown(struct net *net) diff --git a/fs/nfsd/stats.h b/fs/nfsd/stats.h index 9b22b1ae929f..14525e854cba 100644 --- a/fs/nfsd/stats.h +++ b/fs/nfsd/stats.h @@ -10,8 +10,6 @@ #include <uapi/linux/nfsd/stats.h> #include <linux/percpu_counter.h>
-extern struct svc_stat nfsd_svcstats; - int nfsd_percpu_counters_init(struct percpu_counter *counters, int num); void nfsd_percpu_counters_reset(struct percpu_counter *counters, int num); void nfsd_percpu_counters_destroy(struct percpu_counter *counters, int num);
On Sat, Aug 10, 2024 at 03:59:51PM -0400, cel@kernel.org wrote:
From: Chuck Lever chuck.lever@oracle.com
Following up on
https://lore.kernel.org/linux-nfs/d4b235df-4ee5-4824-9d48-e3b3c1f1f4d1@oracl...
Here is a backport series targeting origin/linux-6.1.y that closes the information leak described in the above thread.
I started with v6.1.y because that is the most recent LTS kernel and thus the closest to upstream. I plan to look at 5.15 and 5.10 LTS too if this series is applied to v6.1.y.
6.6 would be the most recent LTS, and we would need to have this series on top of 6.6 first before we can backport it to older trees :)
On Aug 12, 2024, at 7:34 AM, Sasha Levin sashal@kernel.org wrote:
On Sat, Aug 10, 2024 at 03:59:51PM -0400, cel@kernel.org wrote:
From: Chuck Lever chuck.lever@oracle.com
Following up on
https://lore.kernel.org/linux-nfs/d4b235df-4ee5-4824-9d48-e3b3c1f1f4d1@oracl...
Here is a backport series targeting origin/linux-6.1.y that closes the information leak described in the above thread.
I started with v6.1.y because that is the most recent LTS kernel and thus the closest to upstream. I plan to look at 5.15 and 5.10 LTS too if this series is applied to v6.1.y.
6.6 would be the most recent LTS, and we would need to have this series on top of 6.6 first before we can backport it to older trees :)
Fair enough -- I was told this work was already in 6.6.y, which is why I started with v6.1. I'll take care of it.
-- Chuck Lever
On Mon, Aug 12, 2024 at 01:52:18PM +0000, Chuck Lever III wrote:
On Aug 12, 2024, at 7:34 AM, Sasha Levin sashal@kernel.org wrote:
On Sat, Aug 10, 2024 at 03:59:51PM -0400, cel@kernel.org wrote:
From: Chuck Lever chuck.lever@oracle.com
Following up on
https://lore.kernel.org/linux-nfs/d4b235df-4ee5-4824-9d48-e3b3c1f1f4d1@oracl...
Here is a backport series targeting origin/linux-6.1.y that closes the information leak described in the above thread.
I started with v6.1.y because that is the most recent LTS kernel and thus the closest to upstream. I plan to look at 5.15 and 5.10 LTS too if this series is applied to v6.1.y.
6.6 would be the most recent LTS, and we would need to have this series on top of 6.6 first before we can backport it to older trees :)
Fair enough -- I was told this work was already in 6.6.y, which is why I started with v6.1. I'll take care of it.
Only the first few commits are in 6.6.y already, and those don't make much sense to take into 6.1.y now as they don't do anything on their own.
thanks,
greg k-h
On Sat, Aug 10, 2024 at 03:59:51PM -0400, cel@kernel.org wrote:
From: Chuck Lever chuck.lever@oracle.com
Following up on
https://lore.kernel.org/linux-nfs/d4b235df-4ee5-4824-9d48-e3b3c1f1f4d1@oracl...
Here is a backport series targeting origin/linux-6.1.y that closes the information leak described in the above thread.
I started with v6.1.y because that is the most recent LTS kernel and thus the closest to upstream. I plan to look at 5.15 and 5.10 LTS too if this series is applied to v6.1.y.
Review comments welcome.
Now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org