Commit 0ef625bba6fb ("vfs: support statx(..., NULL, AT_EMPTY_PATH, ...)") added support for passing in NULL when AT_EMPTY_PATH is given, improving performance when statx is used for fetching stat informantion from a given fd, which is especially important for 32-bit platforms. This commit also improved the performance when an empty string is given by short-circuiting the handling of such paths.
This series is based on the commits in the Linus’ tree. Sligth modifications are applied to the context of the patches for cleanly applying.
Tested-by: Xi Ruoyao xry111@xry111.site Signed-off-by: Miao Wang shankerwangmiao@gmail.com --- Christian Brauner (2): fs: new helper vfs_empty_path() stat: use vfs_empty_path() helper
Mateusz Guzik (1): vfs: support statx(..., NULL, AT_EMPTY_PATH, ...)
fs/internal.h | 14 ++++++ fs/namespace.c | 13 ------ fs/stat.c | 123 ++++++++++++++++++++++++++++++++++++----------------- include/linux/fs.h | 17 ++++++++ 4 files changed, 116 insertions(+), 51 deletions(-) --- base-commit: 049be94099ea5ba8338526c5a4f4f404b9dcaf54 change-id: 20240918-statx-stable-linux-6-10-y-17c3ec7497c8
Best regards,
From: Christian Brauner brauner@kernel.org
commit 1bc6d44 upstream.
Make it possible to quickly check whether AT_EMPTY_PATH is valid. Note, after some discussion we decided to also allow NULL to be passed instead of requiring the empty string.
Signed-off-by: Christian Brauner brauner@kernel.org
Cc: stable@vger.kernel.org # 6.10.x Signed-off-by: Miao Wang shankerwangmiao@gmail.com Tested-by: Xi Ruoyao xry111@xry111.site --- include/linux/fs.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h index 5f07c1c377df..0e334aad85c5 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3619,4 +3619,21 @@ extern int vfs_fadvise(struct file *file, loff_t offset, loff_t len, extern int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice);
+static inline bool vfs_empty_path(int dfd, const char __user *path) +{ + char c; + + if (dfd < 0) + return false; + + /* We now allow NULL to be used for empty path. */ + if (!path) + return true; + + if (unlikely(get_user(c, path))) + return false; + + return !c; +} + #endif /* _LINUX_FS_H */
From: Christian Brauner brauner@kernel.org
commit 27a2d0c upstream.
Use the newly added helper for this.
Signed-off-by: Christian Brauner brauner@kernel.org
Cc: stable@vger.kernel.org # 6.10.x Signed-off-by: Miao Wang shankerwangmiao@gmail.com Tested-by: Xi Ruoyao xry111@xry111.site --- fs/stat.c | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/fs/stat.c b/fs/stat.c index 70bd3e888cfa..0e8558f9c0b3 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -289,16 +289,8 @@ int vfs_fstatat(int dfd, const char __user *filename, * If AT_EMPTY_PATH is set, we expect the common case to be that * empty path, and avoid doing all the extra pathname work. */ - if (dfd >= 0 && flags == AT_EMPTY_PATH) { - char c; - - ret = get_user(c, filename); - if (unlikely(ret)) - return ret; - - if (likely(!c)) - return vfs_fstat(dfd, stat); - } + if (flags == AT_EMPTY_PATH && vfs_empty_path(dfd, filename)) + return vfs_fstat(dfd, stat);
name = getname_flags(filename, getname_statx_lookup_flags(statx_flags), NULL); ret = vfs_statx(dfd, name, statx_flags, stat, STATX_BASIC_STATS);
From: Mateusz Guzik mjguzik@gmail.com
commit 0ef625b upstream.
The newly used helper also checks for empty ("") paths.
NULL paths with any flag value other than AT_EMPTY_PATH go the usual route and end up with -EFAULT to retain compatibility (Rust is abusing calls of the sort to detect availability of statx).
This avoids path lookup code, lockref management, memory allocation and in case of NULL path userspace memory access (which can be quite expensive with SMAP on x86_64).
Benchmarked with statx(..., AT_EMPTY_PATH, ...) running on Sapphire Rapids, with the "" path for the first two cases and NULL for the last one.
Results in ops/s: stock: 4231237 pre-check: 5944063 (+40%) NULL path: 6601619 (+11%/+56%)
Signed-off-by: Mateusz Guzik mjguzik@gmail.com Link: https://lore.kernel.org/r/20240625151807.620812-1-mjguzik@gmail.com Tested-by: Xi Ruoyao xry111@xry111.site [brauner: use path_mounted() and other tweaks] Signed-off-by: Christian Brauner brauner@kernel.org
Cc: stable@vger.kernel.org # 6.10.x Signed-off-by: Miao Wang shankerwangmiao@gmail.com Tested-by: Xi Ruoyao xry111@xry111.site --- fs/internal.h | 14 ++++++++ fs/namespace.c | 13 ------- fs/stat.c | 111 ++++++++++++++++++++++++++++++++++++++++++--------------- 3 files changed, 97 insertions(+), 41 deletions(-)
diff --git a/fs/internal.h b/fs/internal.h index ab2225136f60..f26454c60a98 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -247,6 +247,8 @@ extern const struct dentry_operations ns_dentry_operations; int getname_statx_lookup_flags(int flags); int do_statx(int dfd, struct filename *filename, unsigned int flags, unsigned int mask, struct statx __user *buffer); +int do_statx_fd(int fd, unsigned int flags, unsigned int mask, + struct statx __user *buffer);
/* * fs/splice.c: @@ -321,3 +323,15 @@ struct stashed_operations { int path_from_stashed(struct dentry **stashed, struct vfsmount *mnt, void *data, struct path *path); void stashed_dentry_prune(struct dentry *dentry); +/** + * path_mounted - check whether path is mounted + * @path: path to check + * + * Determine whether @path refers to the root of a mount. + * + * Return: true if @path is the root of a mount, false if not. + */ +static inline bool path_mounted(const struct path *path) +{ + return path->mnt->mnt_root == path->dentry; +} diff --git a/fs/namespace.c b/fs/namespace.c index e1ced589d835..0134eda41b71 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1846,19 +1846,6 @@ bool may_mount(void) return ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN); }
-/** - * path_mounted - check whether path is mounted - * @path: path to check - * - * Determine whether @path refers to the root of a mount. - * - * Return: true if @path is the root of a mount, false if not. - */ -static inline bool path_mounted(const struct path *path) -{ - return path->mnt->mnt_root == path->dentry; -} - static void warn_mandlock(void) { pr_warn_once("=======================================================\n" diff --git a/fs/stat.c b/fs/stat.c index 0e8558f9c0b3..52fde6fe0086 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -214,6 +214,43 @@ int getname_statx_lookup_flags(int flags) return lookup_flags; }
+static int vfs_statx_path(struct path *path, int flags, struct kstat *stat, + u32 request_mask) +{ + int error = vfs_getattr(path, stat, request_mask, flags); + + if (request_mask & STATX_MNT_ID_UNIQUE) { + stat->mnt_id = real_mount(path->mnt)->mnt_id_unique; + stat->result_mask |= STATX_MNT_ID_UNIQUE; + } else { + stat->mnt_id = real_mount(path->mnt)->mnt_id; + stat->result_mask |= STATX_MNT_ID; + } + + if (path_mounted(path)) + stat->attributes |= STATX_ATTR_MOUNT_ROOT; + stat->attributes_mask |= STATX_ATTR_MOUNT_ROOT; + + /* Handle STATX_DIOALIGN for block devices. */ + if (request_mask & STATX_DIOALIGN) { + struct inode *inode = d_backing_inode(path->dentry); + + if (S_ISBLK(inode->i_mode)) + bdev_statx_dioalign(inode, stat); + } + + return error; +} + +static int vfs_statx_fd(int fd, int flags, struct kstat *stat, + u32 request_mask) +{ + CLASS(fd_raw, f)(fd); + if (!f.file) + return -EBADF; + return vfs_statx_path(&f.file->f_path, flags, stat, request_mask); +} + /** * vfs_statx - Get basic and extra attributes by filename * @dfd: A file descriptor representing the base dir for a relative filename @@ -243,36 +280,13 @@ static int vfs_statx(int dfd, struct filename *filename, int flags, retry: error = filename_lookup(dfd, filename, lookup_flags, &path, NULL); if (error) - goto out; - - error = vfs_getattr(&path, stat, request_mask, flags); - - if (request_mask & STATX_MNT_ID_UNIQUE) { - stat->mnt_id = real_mount(path.mnt)->mnt_id_unique; - stat->result_mask |= STATX_MNT_ID_UNIQUE; - } else { - stat->mnt_id = real_mount(path.mnt)->mnt_id; - stat->result_mask |= STATX_MNT_ID; - } - - if (path.mnt->mnt_root == path.dentry) - stat->attributes |= STATX_ATTR_MOUNT_ROOT; - stat->attributes_mask |= STATX_ATTR_MOUNT_ROOT; - - /* Handle STATX_DIOALIGN for block devices. */ - if (request_mask & STATX_DIOALIGN) { - struct inode *inode = d_backing_inode(path.dentry); - - if (S_ISBLK(inode->i_mode)) - bdev_statx_dioalign(inode, stat); - } - + return error; + error = vfs_statx_path(&path, flags, stat, request_mask); path_put(&path); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } -out: return error; }
@@ -666,7 +680,8 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL;
- /* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests + /* + * STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests * from userland. */ mask &= ~STATX_CHANGE_COOKIE; @@ -678,16 +693,41 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, return cp_statx(&stat, buffer); }
+int do_statx_fd(int fd, unsigned int flags, unsigned int mask, + struct statx __user *buffer) +{ + struct kstat stat; + int error; + + if (mask & STATX__RESERVED) + return -EINVAL; + if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) + return -EINVAL; + + /* + * STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests + * from userland. + */ + mask &= ~STATX_CHANGE_COOKIE; + + error = vfs_statx_fd(fd, flags, &stat, mask); + if (error) + return error; + + return cp_statx(&stat, buffer); +} + /** * sys_statx - System call to get enhanced stats * @dfd: Base directory to pathwalk from *or* fd to stat. - * @filename: File to stat or "" with AT_EMPTY_PATH + * @filename: File to stat or either NULL or "" with AT_EMPTY_PATH * @flags: AT_* flags to control pathwalk. * @mask: Parts of statx struct actually required. * @buffer: Result buffer. * * Note that fstat() can be emulated by setting dfd to the fd of interest, - * supplying "" as the filename and setting AT_EMPTY_PATH in the flags. + * supplying "" (or preferably NULL) as the filename and setting AT_EMPTY_PATH + * in the flags. */ SYSCALL_DEFINE5(statx, int, dfd, const char __user *, filename, unsigned, flags, @@ -695,8 +735,23 @@ SYSCALL_DEFINE5(statx, struct statx __user *, buffer) { int ret; + unsigned lflags; struct filename *name;
+ /* + * Short-circuit handling of NULL and "" paths. + * + * For a NULL path we require and accept only the AT_EMPTY_PATH flag + * (possibly |'d with AT_STATX flags). + * + * However, glibc on 32-bit architectures implements fstatat as statx + * with the "" pathname and AT_NO_AUTOMOUNT | AT_EMPTY_PATH flags. + * Supporting this results in the uglification below. + */ + lflags = flags & ~(AT_NO_AUTOMOUNT | AT_STATX_SYNC_TYPE); + if (lflags == AT_EMPTY_PATH && vfs_empty_path(dfd, filename)) + return do_statx_fd(dfd, flags & ~AT_NO_AUTOMOUNT, mask, buffer); + name = getname_flags(filename, getname_statx_lookup_flags(flags), NULL); ret = do_statx(dfd, name, flags, mask, buffer); putname(name);
On Wed, Sep 18, 2024 at 10:01:18PM +0800, Miao Wang via B4 Relay wrote:
Commit 0ef625bba6fb ("vfs: support statx(..., NULL, AT_EMPTY_PATH, ...)") added support for passing in NULL when AT_EMPTY_PATH is given, improving performance when statx is used for fetching stat informantion from a given fd, which is especially important for 32-bit platforms. This commit also improved the performance when an empty string is given by short-circuiting the handling of such paths.
This series is based on the commits in the Linus’ tree. Sligth modifications are applied to the context of the patches for cleanly applying.
Tested-by: Xi Ruoyao xry111@xry111.site Signed-off-by: Miao Wang shankerwangmiao@gmail.com
This really looks like a brand new feature wanting to be backported, so why does it qualify under the stable kernel rules as fixing something?
I am willing to take some kinds of "fixes performance issues" new features when the subsystem maintainers agree and ask for it, but that doesn't seem to be the case here, and so without their approval and agreement that this is relevant, we can't accept them.
thanks,
greg k-h
On Wed, 2024-09-18 at 19:33 +0200, Greg Kroah-Hartman wrote:
On Wed, Sep 18, 2024 at 10:01:18PM +0800, Miao Wang via B4 Relay wrote:
Commit 0ef625bba6fb ("vfs: support statx(..., NULL, AT_EMPTY_PATH, ...)") added support for passing in NULL when AT_EMPTY_PATH is given, improving performance when statx is used for fetching stat informantion from a given fd, which is especially important for 32-bit platforms. This commit also improved the performance when an empty string is given by short-circuiting the handling of such paths.
This series is based on the commits in the Linus’ tree. Sligth modifications are applied to the context of the patches for cleanly applying.
Tested-by: Xi Ruoyao xry111@xry111.site Signed-off-by: Miao Wang shankerwangmiao@gmail.com
This really looks like a brand new feature wanting to be backported, so why does it qualify under the stable kernel rules as fixing something?
I am willing to take some kinds of "fixes performance issues" new features when the subsystem maintainers agree and ask for it, but that doesn't seem to be the case here, and so without their approval and agreement that this is relevant, we can't accept them.
Unfortunately the performance issue fix and the new feature are in the same commit. Is it acceptable to separate out the performance fix part for stable? (Basically remove "if (!path) return true;" from the 1st patch.)
On Thu, Sep 19, 2024 at 01:37:17AM +0800, Xi Ruoyao wrote:
On Wed, 2024-09-18 at 19:33 +0200, Greg Kroah-Hartman wrote:
On Wed, Sep 18, 2024 at 10:01:18PM +0800, Miao Wang via B4 Relay wrote:
Commit 0ef625bba6fb ("vfs: support statx(..., NULL, AT_EMPTY_PATH, ...)") added support for passing in NULL when AT_EMPTY_PATH is given, improving performance when statx is used for fetching stat informantion from a given fd, which is especially important for 32-bit platforms. This commit also improved the performance when an empty string is given by short-circuiting the handling of such paths.
This series is based on the commits in the Linus’ tree. Sligth modifications are applied to the context of the patches for cleanly applying.
Tested-by: Xi Ruoyao xry111@xry111.site Signed-off-by: Miao Wang shankerwangmiao@gmail.com
This really looks like a brand new feature wanting to be backported, so why does it qualify under the stable kernel rules as fixing something?
I am willing to take some kinds of "fixes performance issues" new features when the subsystem maintainers agree and ask for it, but that doesn't seem to be the case here, and so without their approval and agreement that this is relevant, we can't accept them.
Unfortunately the performance issue fix and the new feature are in the same commit. Is it acceptable to separate out the performance fix part for stable? (Basically remove "if (!path) return true;" from the 1st patch.)
What prevents you, if you wish to have the increased performance, from just moving to a newer kernel version? We add new features and improvements like this all the time, why is this one so special to warrant doing backports. Especially with no maintainer or subsystem developer asking for this to be done?
thanks,
greg k-h
2024年9月19日 19:18,Greg Kroah-Hartman gregkh@linuxfoundation.org 写道:
On Thu, Sep 19, 2024 at 01:37:17AM +0800, Xi Ruoyao wrote:
On Wed, 2024-09-18 at 19:33 +0200, Greg Kroah-Hartman wrote:
On Wed, Sep 18, 2024 at 10:01:18PM +0800, Miao Wang via B4 Relay wrote:
Commit 0ef625bba6fb ("vfs: support statx(..., NULL, AT_EMPTY_PATH, ...)") added support for passing in NULL when AT_EMPTY_PATH is given, improving performance when statx is used for fetching stat informantion from a given fd, which is especially important for 32-bit platforms. This commit also improved the performance when an empty string is given by short-circuiting the handling of such paths.
This series is based on the commits in the Linus’ tree. Sligth modifications are applied to the context of the patches for cleanly applying.
Tested-by: Xi Ruoyao xry111@xry111.site Signed-off-by: Miao Wang shankerwangmiao@gmail.com
This really looks like a brand new feature wanting to be backported, so why does it qualify under the stable kernel rules as fixing something?
I am willing to take some kinds of "fixes performance issues" new features when the subsystem maintainers agree and ask for it, but that doesn't seem to be the case here, and so without their approval and agreement that this is relevant, we can't accept them.
Unfortunately the performance issue fix and the new feature are in the same commit. Is it acceptable to separate out the performance fix part for stable? (Basically remove "if (!path) return true;" from the 1st patch.)
What prevents you, if you wish to have the increased performance, from just moving to a newer kernel version? We add new features and improvements like this all the time, why is this one so special to warrant doing backports. Especially with no maintainer or subsystem developer asking for this to be done?
We all know the long process from a new improvement getting into the mainline kernel to landing in users' devices. Considering 32-bit archectures which lacks 64-bit time support in fstat(), statx(fd, AT_EMPTY_PATH) is heavily relied on. My intention on putting up this backport is that to quicken this process, benefiting these users.
Another reason is about loongarch. fstat() was not included in loongarch initially, until 6.11. Although the re-inclusion of fstat() is backported to stable releases, we still have implementation problems on the glibc side, that loongarch is the only architecture that may lack the support of fstat. If dynamic probing of the existence of fstat() is implemented, this code path would be only effective on loongarch, adding another layer of mess to the current fstat implementation and adding maintenance burden to glibc maintainers. Instead, if we choose to utilize statx(fd, NULL, AT_EMPTY_PATH), even if using dynamic probing, loongarch and all other 32-bit architectures using statx() for fstat() can benefit from this.
Based on the same reason why you have accepted the re-inclusion of fstat on loongarch into the stable trees, I believe it would be potentially possible to let the statx(..., NULL, AT_EMPTY_PATH, ...) get into the stable trees as well.
Thanks again for your considering this. Since VFS maintainers are being looped in the thread, if their approval is necessary, I'm happy to listen to their opinions.
Cheers,
Miao Wang
On Thu, Sep 19, 2024 at 08:18:10PM +0800, Miao Wang wrote:
2024年9月19日 19:18,Greg Kroah-Hartman gregkh@linuxfoundation.org 写道:
On Thu, Sep 19, 2024 at 01:37:17AM +0800, Xi Ruoyao wrote:
On Wed, 2024-09-18 at 19:33 +0200, Greg Kroah-Hartman wrote:
On Wed, Sep 18, 2024 at 10:01:18PM +0800, Miao Wang via B4 Relay wrote:
Commit 0ef625bba6fb ("vfs: support statx(..., NULL, AT_EMPTY_PATH, ...)") added support for passing in NULL when AT_EMPTY_PATH is given, improving performance when statx is used for fetching stat informantion from a given fd, which is especially important for 32-bit platforms. This commit also improved the performance when an empty string is given by short-circuiting the handling of such paths.
This series is based on the commits in the Linus’ tree. Sligth modifications are applied to the context of the patches for cleanly applying.
Tested-by: Xi Ruoyao xry111@xry111.site Signed-off-by: Miao Wang shankerwangmiao@gmail.com
This really looks like a brand new feature wanting to be backported, so why does it qualify under the stable kernel rules as fixing something?
I am willing to take some kinds of "fixes performance issues" new features when the subsystem maintainers agree and ask for it, but that doesn't seem to be the case here, and so without their approval and agreement that this is relevant, we can't accept them.
Unfortunately the performance issue fix and the new feature are in the same commit. Is it acceptable to separate out the performance fix part for stable? (Basically remove "if (!path) return true;" from the 1st patch.)
What prevents you, if you wish to have the increased performance, from just moving to a newer kernel version? We add new features and improvements like this all the time, why is this one so special to warrant doing backports. Especially with no maintainer or subsystem developer asking for this to be done?
We all know the long process from a new improvement getting into the mainline kernel to landing in users' devices. Considering 32-bit archectures which lacks 64-bit time support in fstat(), statx(fd, AT_EMPTY_PATH) is heavily relied on. My intention on putting up this backport is that to quicken this process, benefiting these users.
Another reason is about loongarch. fstat() was not included in loongarch initially, until 6.11. Although the re-inclusion of fstat() is backported to stable releases, we still have implementation problems on the glibc side, that loongarch is the only architecture that may lack the support of fstat. If dynamic probing of the existence of fstat() is implemented, this code path would be only effective on loongarch, adding another layer of mess to the current fstat implementation and adding maintenance burden to glibc maintainers. Instead, if we choose to utilize statx(fd, NULL, AT_EMPTY_PATH), even if using dynamic probing, loongarch and all other 32-bit architectures using statx() for fstat() can benefit from this.
Based on the same reason why you have accepted the re-inclusion of fstat on loongarch into the stable trees, I believe it would be potentially possible to let the statx(..., NULL, AT_EMPTY_PATH, ...) get into the stable trees as well.
That was a simple patch that only affected one arch, unlike this patch series which touches core kernel code used by everyone.
Please just encourage users of this hardware to use a newer kernel if they wish to take advantage of the performance increase.
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org