Changing the mode of symlinks is meaningless as the vfs doesn't take the mode of a symlink into account during path lookup permission checking.
However, the vfs doesn't block mode changes on symlinks. This however, has lead to an untenable mess roughly classifiable into the following two categories:
(1) Filesystems that don't implement a i_op->setattr() for symlinks.
Such filesystems may or may not know that without i_op->setattr() defined, notify_change() falls back to simple_setattr() causing the inode's mode in the inode cache to be changed.
That's a generic issue as this will affect all non-size changing inode attributes including ownership changes.
Example: afs
(2) Filesystems that fail with EOPNOTSUPP but change the mode of the symlink nonetheless.
Some filesystems will happily update the mode of a symlink but still return EOPNOTSUPP. This is the biggest source of confusion for userspace.
The EOPNOTSUPP in this case comes from POSIX ACLs. Specifically it comes from filesystems that call posix_acl_chmod(), e.g., btrfs via
if (!err && attr->ia_valid & ATTR_MODE) err = posix_acl_chmod(idmap, dentry, inode->i_mode);
Filesystems including btrfs don't implement i_op->set_acl() so posix_acl_chmod() will report EOPNOTSUPP.
When posix_acl_chmod() is called, most filesystems will have finished updating the inode.
Perversely, this has the consequences that this behavior may depend on two kconfig options and mount options:
* CONFIG_POSIX_ACL={y,n} * CONFIG_${FSTYPE}_POSIX_ACL={y,n} * Opt_acl, Opt_noacl
Example: btrfs, ext4, xfs
The only way to change the mode on a symlink currently involves abusing an O_PATH file descriptor in the following manner:
fd = openat(-1, "/path/to/link", O_CLOEXEC | O_PATH | O_NOFOLLOW);
char path[PATH_MAX]; snprintf(path, sizeof(path), "/proc/self/fd/%d", fd); chmod(path, 0000);
But for most major filesystems with POSIX ACL support such as btrfs, ext4, ceph, tmpfs, xfs and others this will fail with EOPNOTSUPP with the mode still updated due to the aforementioned posix_acl_chmod() nonsense.
So, given that for all major filesystems this would fail with EOPNOTSUPP and that both glibc (cf. [1]) and musl (cf. [2]) outright block mode changes on symlinks we should just try and block mode changes on symlinks directly in the vfs and have a clean break with this nonsense.
If this causes any regressions, we do the next best thing and fix up all filesystems that do return EOPNOTSUPP with the mode updated to not call posix_acl_chmod() on symlinks.
But as usual, let's try the clean cut solution first. It's a simple patch that can be easily reverted. Not marking this for backport as I'll do that manually if we're reasonably sure that this works and there are no strong objections.
We could block this in chmod_common() but it's more appropriate to do it notify_change() as it will also mean that we catch filesystems that change symlink permissions explicitly or accidently.
Similar proposals were floated in the past as in [3] and [4] and again recently in [5]. There's also a couple of bugs about this inconsistency as in [6] and [7].
Link: https://sourceware.org/git/?p=glibc.git%3Ba=blob%3Bf=sysdeps/unix/sysv/linux... [1] Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c [2] Link: https://lore.kernel.org/all/20200911065733.GA31579@infradead.org [3] Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00518.html [4] Link: https://lore.kernel.org/lkml/87lefmbppo.fsf@oldenburg.str.redhat.com [5] Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html [6] Link: https://sourceware.org/bugzilla/show_bug.cgi?id=14578#c17 [7] Cc: stable@vger.kernel.org # please backport to all LTSes but not before v6.6-rc2 is tagged Suggested-by: Christoph Hellwig hch@lst.de Suggested-by: Florian Weimer fweimer@redhat.com Reviewed-by: Aleksa Sarai cyphar@cyphar.com Signed-off-by: Christian Brauner brauner@kernel.org --- Changes in v2: - declarations first... - Link to v1: https://lore.kernel.org/r/20230712-vfs-chmod-symlinks-v1-1-27921df6011f@kern... ---
--- fs/attr.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c index d60dc1edb526..775bf77ce16f 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -394,9 +394,11 @@ int notify_change(struct mnt_idmap *idmap, struct dentry *dentry, return error;
if ((ia_valid & ATTR_MODE)) { - umode_t amode = attr->ia_mode; + if (S_ISLNK(inode->i_mode)) + return -EOPNOTSUPP; + /* Flag setting protected by i_mutex */ - if (is_sxid(amode)) + if (is_sxid(attr->ia_mode)) inode->i_flags &= ~S_NOSEC; }
--- base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5 change-id: 20230712-vfs-chmod-symlinks-a3ad1923a992
On Wed, 12 Jul 2023 20:58:49 +0200, Christian Brauner wrote:
Changing the mode of symlinks is meaningless as the vfs doesn't take the mode of a symlink into account during path lookup permission checking.
However, the vfs doesn't block mode changes on symlinks. This however, has lead to an untenable mess roughly classifiable into the following two categories:
[...]
Let's get this into -next and see whether there's any obvious immediate fallout because of this.
---
Applied to the vfs.misc branch of the vfs/vfs.git tree. Patches in the vfs.misc branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase, trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git branch: vfs.misc
[1/1] attr: block mode changes of symlinks https://git.kernel.org/vfs/vfs/c/6be357f00aad
Looks good:
Reviewed-by: Christoph Hellwig hch@lst.de
one minor nitpick below:
if ((ia_valid & ATTR_MODE)) {
if (S_ISLNK(inode->i_mode))
return -EOPNOTSUPP;
Maybe some of the rationale on why we have this check from the commit log should go here?
On Thu, Jul 13, 2023 at 01:40:29PM +0200, Christoph Hellwig wrote:
Looks good:
Reviewed-by: Christoph Hellwig hch@lst.de
one minor nitpick below:
if ((ia_valid & ATTR_MODE)) {
if (S_ISLNK(inode->i_mode))
return -EOPNOTSUPP;
Maybe some of the rationale on why we have this check from the commit log should go here?
I'll add a comment to the patch.
On Wed, Jul 12, 2023 at 08:58:49PM +0200, Christian Brauner wrote:
(1) Filesystems that don't implement a i_op->setattr() for symlinks.
Such filesystems may or may not know that without i_op->setattr() defined, notify_change() falls back to simple_setattr() causing the inode's mode in the inode cache to be changed.
Btw, I think this fallback is pretty harmful. At some point we should probably start auditing all instances and wire the ones up that should be using simple_setattr (probably mostly just in-memory file systems) and refuse attribute changes if .setattr is NULL.
On Thu, Jul 13, 2023 at 02:00:42PM +0200, Christoph Hellwig wrote:
On Wed, Jul 12, 2023 at 08:58:49PM +0200, Christian Brauner wrote:
(1) Filesystems that don't implement a i_op->setattr() for symlinks.
Such filesystems may or may not know that without i_op->setattr() defined, notify_change() falls back to simple_setattr() causing the inode's mode in the inode cache to be changed.
Btw, I think this fallback is pretty harmful. At some point we should probably start auditing all instances and wire the ones up that should be using simple_setattr (probably mostly just in-memory file systems) and refuse attribute changes if .setattr is NULL.
Yes, I agree. For example, it is an issue or at least a potential source for bugs for procfs files. If they don't have a i_op->setattr() handler they still get simple_setattr() which means that they accept ATTR_MODE changes which they were explicitly stopped from doing in 2006 in commit 6d76fa58b050 ("Don't allow chmod() on the /proc/<pid>/ files").
linux-stable-mirror@lists.linaro.org