From: Christian Brauner <christian.brauner@ubuntu.com>
commit e1bbcd277a53e08d619ffeec56c5c9287f2bf42f upstream.
Hold writers when changing a mount's idmapping to make it more robust.
The vfs layer takes care to retrieve the idmapping of a mount once, ensuring that the idmapping used for vfs permission checking is identical to the idmapping passed down to the filesystem.
For ioctl codepaths the filesystem itself is responsible for taking the idmapping into account if it needs to. While all filesystems with FS_ALLOW_IDMAP raised take the same precautions as the vfs, we should enforce this explicitly by making sure there are no active writers on the relevant mount while changing the idmapping.
This is similar to turning a mount ro, with one difference: a mount's idmapping can only ever be changed once, whereas a mount can transition between ro and rw as often as it wants.
This is a minor user-visible change, but it is extremely unlikely to matter. The caller must have created a detached mount via OPEN_TREE_CLONE and then handed that O_PATH fd to another process or thread, which must then have obtained a writable fd for that mount and started creating files in there while the caller is still changing mount properties. While not impossible, this will be an extremely rare corner case and should in general be considered a bug in the application. Consider, by analogy, making a mount MOUNT_ATTR_NOEXEC or MOUNT_ATTR_NODEV while allowing someone else to perform lookups or exec in parallel by handing them a copy of the OPEN_TREE_CLONE fd or another fd beneath that mount.
Link: https://lore.kernel.org/r/20220510095840.152264-1-brauner@kernel.org
Cc: Seth Forshee <seth.forshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/namespace.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3968,6 +3968,23 @@ static int can_idmap_mount(const struct
 	return 0;
 }
 
+/**
+ * mnt_allow_writers() - check whether the attribute change allows writers
+ * @kattr: the new mount attributes
+ * @mnt: the mount to which @kattr will be applied
+ *
+ * Check whether thew new mount attributes in @kattr allow concurrent writers.
+ *
+ * Return: true if writers need to be held, false if not
+ */
+static inline bool mnt_allow_writers(const struct mount_kattr *kattr,
+				     const struct mount *mnt)
+{
+	return (!(kattr->attr_set & MNT_READONLY) ||
+		(mnt->mnt.mnt_flags & MNT_READONLY)) &&
+	       !kattr->mnt_userns;
+}
+
 static struct mount *mount_setattr_prepare(struct mount_kattr *kattr,
 					   struct mount *mnt, int *err)
 {
@@ -3998,8 +4015,7 @@ static struct mount *mount_setattr_prepa
 
 		last = m;
 
-		if ((kattr->attr_set & MNT_READONLY) &&
-		    !(m->mnt.mnt_flags & MNT_READONLY)) {
+		if (!mnt_allow_writers(kattr, m)) {
 			*err = mnt_hold_writers(m);
 			if (*err)
 				goto out;
@@ -4050,13 +4066,8 @@ static void mount_setattr_commit(struct
 			WRITE_ONCE(m->mnt.mnt_flags, flags);
 		}
 
-		/*
-		 * We either set MNT_READONLY above so make it visible
-		 * before ~MNT_WRITE_HOLD or we failed to recursively
-		 * apply mount options.
-		 */
-		if ((kattr->attr_set & MNT_READONLY) &&
-		    (m->mnt.mnt_flags & MNT_WRITE_HOLD))
+		/* If we had to hold writers unblock them. */
+		if (m->mnt.mnt_flags & MNT_WRITE_HOLD)
 			mnt_unhold_writers(m);
 
 		if (!err && kattr->propagation)