This began as a one-line dma-buf fix for a path_noexec() warning added by commit 1e7ab6f67824 ("anon_inode: rework assertions"). Christoph pointed out that the fix belongs higher up: a pseudo filesystem has no reason not to set SB_I_NOEXEC by default. This series does that.
* Patch 1 sets SB_I_NOEXEC and SB_I_NODEV in init_pseudo(), so every pseudo filesystem gets both. This is the only patch that changes a flag, and the only one with Fixes:/Cc: stable.
* Patch 2 drops the assignments that are now redundant in the callers that set them by hand.
I audited every init_pseudo() caller; several already set one or both flags. The table below shows what patch 1 actually changes for each. The only visible effect is on dma-buf, where SB_I_NOEXEC silences the warning. SB_I_NODEV is never consulted on these SB_NOUSER mounts, and none of the filesystems that gain SB_I_NOEXEC has files that are executed.
  caller                       had       patch 1 adds
  ---------------------------  --------  --------------
  fs/anon_inodes.c             both      nothing new
  mm/secretmem.c               both      nothing new
  virt/kvm/guest_memfd.c       both      nothing new
  fs/nsfs.c                    both      nothing new
  fs/pidfs.c                   both      nothing new
  fs/aio.c                     NOEXEC    NODEV
  drivers/dma-buf/dma-buf.c    neither   NOEXEC + NODEV
  net/socket.c                 neither   NOEXEC + NODEV
  fs/pipe.c                    neither   NOEXEC + NODEV
  kernel/resource.c            neither   NOEXEC + NODEV
  fs/erofs/super.c             neither   NOEXEC + NODEV
  fs/btrfs/tests/...           neither   NOEXEC + NODEV
  drivers/vfio/vfio_main.c     neither   NOEXEC + NODEV
  drivers/gpu/drm/drm_drv.c    neither   NOEXEC + NODEV
  drivers/dax/super.c          neither   NOEXEC + NODEV
  block/bdev.c                 neither   NOEXEC + NODEV
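The "patch 1 adds" column is just the two default flags minus whatever the caller already had. A minimal userspace sketch of that arithmetic follows; the MODEL_ names and their bit values are stand-ins chosen for illustration, not taken from kernel headers.

```c
#include <stdio.h>

/* Illustrative stand-ins for the two s_iflags bits (assumed values). */
#define MODEL_SB_I_NOEXEC 0x2u
#define MODEL_SB_I_NODEV  0x4u
#define MODEL_BOTH        (MODEL_SB_I_NOEXEC | MODEL_SB_I_NODEV)

/* Flags that patch 1 newly adds for a caller that already set `had`. */
static unsigned int model_added_by_patch1(unsigned int had)
{
	return MODEL_BOTH & ~had;
}

/* Render a flag set the way the table's columns spell it. */
static const char *model_describe(unsigned int flags)
{
	switch (flags & MODEL_BOTH) {
	case MODEL_BOTH:        return "NOEXEC + NODEV";
	case MODEL_SB_I_NOEXEC: return "NOEXEC";
	case MODEL_SB_I_NODEV:  return "NODEV";
	default:                return "nothing new";
	}
}
```

For example, model_describe(model_added_by_patch1(MODEL_SB_I_NOEXEC)) corresponds to the fs/aio.c row, and passing 0 corresponds to the dma-buf row.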
John Hubbard (2):
  libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo()
  libfs: drop redundant SB_I_NOEXEC/SB_I_NODEV in init_pseudo() callers

 fs/aio.c               | 1 -
 fs/anon_inodes.c       | 2 --
 fs/libfs.c             | 1 +
 fs/nsfs.c              | 1 -
 fs/pidfs.c             | 2 --
 mm/secretmem.c         | 2 --
 virt/kvm/guest_memfd.c | 2 --
 7 files changed, 1 insertion(+), 10 deletions(-)
base-commit: ba3e43a9e601636f5edb54e259a74f96ca3b8fd8
Since commit 1e7ab6f67824 ("anon_inode: rework assertions"), path_noexec() warns when an anonymous-inode file is mmap'd from a superblock that has not set SB_I_NOEXEC. dma-buf backs its files this way and never set the flag, so mmap of any exported buffer trips the warning on a CONFIG_DEBUG_VFS=y kernel:
  WARNING: CPU: 11 PID: 121813 at fs/exec.c:118 path_noexec+0x47/0x50
   do_mmap+0x2b5/0x680
   vm_mmap_pgoff+0x129/0x210
   ksys_mmap_pgoff+0x177/0x240
   __x64_sys_mmap+0x33/0x70
init_pseudo() sets up internal SB_NOUSER mounts that are never path-reachable. Set both flags here so every pseudo filesystem gets them by default instead of each caller setting them.
SB_I_NODEV is inert for unreachable mounts. SB_I_NOEXEC has one visible effect: an executable mapping of a pseudo-fs fd, such as a dma-buf, now fails with -EPERM, which is the invariant the assertion enforces. No in-tree caller maps these executable.
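As a rough illustration of the behavior described above, here is a simplified userspace model of the check. Everything prefixed model_ or MODEL_ is hypothetical and written for this sketch, with assumed bit values; it mirrors only what this message states (the assertion fires for an anonymous-inode file on a superblock without SB_I_NOEXEC, and an executable mapping on a noexec superblock fails with -EPERM), not the actual kernel code.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins (assumed values, not kernel headers). */
#define MODEL_SB_I_NOEXEC 0x2u
#define MODEL_PROT_EXEC   0x4u

struct model_sb {
	unsigned int s_iflags;
};

/* Models the assertion: anon-inode file on a superblock lacking NOEXEC. */
static bool model_would_warn(bool is_anon_file, const struct model_sb *sb)
{
	return is_anon_file && !(sb->s_iflags & MODEL_SB_I_NOEXEC);
}

/* Models the noexec decision; mnt_noexec stands for a noexec mount flag. */
static bool model_path_noexec(bool mnt_noexec, const struct model_sb *sb)
{
	return mnt_noexec || (sb->s_iflags & MODEL_SB_I_NOEXEC);
}

/* Returns 0 if the mapping is allowed, -EPERM for exec on a noexec path. */
static int model_mmap_check(bool mnt_noexec, const struct model_sb *sb,
			    unsigned int prot)
{
	if (model_path_noexec(mnt_noexec, sb) && (prot & MODEL_PROT_EXEC))
		return -EPERM;
	return 0;
}
```

In this model, a dma-buf-like superblock with s_iflags == 0 triggers model_would_warn() for an anonymous-inode file; once MODEL_SB_I_NOEXEC is set, the warning condition goes away and only mappings that request MODEL_PROT_EXEC are refused.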
Reproduce on CONFIG_DEBUG_VFS=y:
  make -C tools/testing/selftests/dmabuf-heaps
  sudo ./tools/testing/selftests/dmabuf-heaps/dmabuf-heap -t system

Fixes: 1e7ab6f67824 ("anon_inode: rework assertions")
Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 fs/libfs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/libfs.c b/fs/libfs.c
index 1bbea5e7bae3..e8226b9e1bc8 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -736,6 +736,7 @@ struct pseudo_fs_context *init_pseudo(struct fs_context *fc,
 		fc->fs_private = ctx;
 		fc->ops = &pseudo_fs_context_ops;
 		fc->sb_flags |= SB_NOUSER;
+		fc->s_iflags |= SB_I_NOEXEC | SB_I_NODEV;
 		fc->global = true;
 	}
 	return ctx;
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Wed 03-06-26 19:53:14, John Hubbard wrote:
> Since commit 1e7ab6f67824 ("anon_inode: rework assertions"), path_noexec() warns when an anonymous-inode file is mmap'd from a superblock that has not set SB_I_NOEXEC. dma-buf backs its files this way and never set the flag, so mmap of any exported buffer trips the warning on a CONFIG_DEBUG_VFS=y kernel:
>
>   WARNING: CPU: 11 PID: 121813 at fs/exec.c:118 path_noexec+0x47/0x50
>    do_mmap+0x2b5/0x680
>    vm_mmap_pgoff+0x129/0x210
>    ksys_mmap_pgoff+0x177/0x240
>    __x64_sys_mmap+0x33/0x70
>
> init_pseudo() sets up internal SB_NOUSER mounts that are never path-reachable. Set both flags here so every pseudo filesystem gets them by default instead of each caller setting them.
>
> SB_I_NODEV is inert for unreachable mounts. SB_I_NOEXEC has one visible effect: an executable mapping of a pseudo-fs fd, such as a dma-buf, now fails with -EPERM, which is the invariant the assertion enforces. No in-tree caller maps these executable.
>
> Reproduce on CONFIG_DEBUG_VFS=y:
>
>   make -C tools/testing/selftests/dmabuf-heaps
>   sudo ./tools/testing/selftests/dmabuf-heaps/dmabuf-heap -t system
>
> Fixes: 1e7ab6f67824 ("anon_inode: rework assertions")
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Cc: stable@vger.kernel.org
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>
Honza
init_pseudo() now sets SB_I_NOEXEC and SB_I_NODEV by default, so the per-caller assignments are redundant. Drop them.
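To make the redundancy concrete, here is a small userspace model. The model_ names, the struct, and the bit values are hypothetical and exist only for this sketch. It shows that OR-ing in flags the initializer has already set leaves the result unchanged.

```c
/* Illustrative stand-ins (assumed values, not kernel headers). */
#define MODEL_SB_I_NOEXEC 0x2u
#define MODEL_SB_I_NODEV  0x4u

struct model_fc {
	unsigned int s_iflags;
};

/* Models init_pseudo() after patch 1: both flags set by default. */
static void model_init_pseudo(struct model_fc *fc)
{
	fc->s_iflags |= MODEL_SB_I_NOEXEC | MODEL_SB_I_NODEV;
}

/* Caller as it looked before this patch: sets the flags again by hand. */
static void model_caller_before(struct model_fc *fc)
{
	model_init_pseudo(fc);
	fc->s_iflags |= MODEL_SB_I_NOEXEC;	/* no-op: already set */
	fc->s_iflags |= MODEL_SB_I_NODEV;	/* no-op: already set */
}

/* Caller after this patch: relies on the default. */
static void model_caller_after(struct model_fc *fc)
{
	model_init_pseudo(fc);
}
```

Both callers end up with the same s_iflags value, which is why dropping the per-caller assignments is not expected to change behavior.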
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 fs/aio.c               | 1 -
 fs/anon_inodes.c       | 2 --
 fs/nsfs.c              | 1 -
 fs/pidfs.c             | 2 --
 mm/secretmem.c         | 2 --
 virt/kvm/guest_memfd.c | 2 --
 6 files changed, 10 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 722476560848..f57fa21a2503 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -318,7 +318,6 @@ static int aio_init_fs_context(struct fs_context *fc)
 	pfc = init_pseudo(fc, AIO_RING_MAGIC);
 	if (!pfc)
 		return -ENOMEM;
-	fc->s_iflags |= SB_I_NOEXEC;
 	pfc->ops = &aio_super_operations;
 	return 0;
 }
diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index b8381c7fb636..a7b9b948e33d 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -86,8 +86,6 @@ static int anon_inodefs_init_fs_context(struct fs_context *fc)
 	struct pseudo_fs_context *ctx = init_pseudo(fc, ANON_INODE_FS_MAGIC);
 	if (!ctx)
 		return -ENOMEM;
-	fc->s_iflags |= SB_I_NOEXEC;
-	fc->s_iflags |= SB_I_NODEV;
 	ctx->dops = &anon_inodefs_dentry_operations;
 	return 0;
 }
diff --git a/fs/nsfs.c b/fs/nsfs.c
index 160018c4fb36..c3b6ae76594a 100644
--- a/fs/nsfs.c
+++ b/fs/nsfs.c
@@ -664,7 +664,6 @@ static int nsfs_init_fs_context(struct fs_context *fc)
 	struct pseudo_fs_context *ctx = init_pseudo(fc, NSFS_MAGIC);
 	if (!ctx)
 		return -ENOMEM;
-	fc->s_iflags |= SB_I_NOEXEC | SB_I_NODEV;
 	ctx->s_d_flags |= DCACHE_DONTCACHE;
 	ctx->ops = &nsfs_ops;
 	ctx->eops = &nsfs_export_operations;
diff --git a/fs/pidfs.c b/fs/pidfs.c
index 1cce4f34a051..c363416766f1 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -1115,8 +1115,6 @@ static int pidfs_init_fs_context(struct fs_context *fc)
 	if (!ctx)
 		return -ENOMEM;

-	fc->s_iflags |= SB_I_NOEXEC;
-	fc->s_iflags |= SB_I_NODEV;
 	ctx->s_d_flags |= DCACHE_DONTCACHE;
 	ctx->ops = &pidfs_sops;
 	ctx->eops = &pidfs_export_operations;
diff --git a/mm/secretmem.c b/mm/secretmem.c
index 5f57ac4720d3..4877c262cb1f 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -245,8 +245,6 @@ static int secretmem_init_fs_context(struct fs_context *fc)
 	if (!ctx)
 		return -ENOMEM;

-	fc->s_iflags |= SB_I_NOEXEC;
-	fc->s_iflags |= SB_I_NODEV;
 	return 0;
 }

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 69c9d6d546b2..80f201035d77 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -973,8 +973,6 @@ static int kvm_gmem_init_fs_context(struct fs_context *fc)
 	if (!init_pseudo(fc, GUEST_MEMFD_MAGIC))
 		return -ENOMEM;

-	fc->s_iflags |= SB_I_NOEXEC;
-	fc->s_iflags |= SB_I_NODEV;
 	ctx = fc->fs_private;
 	ctx->ops = &kvm_gmem_super_operations;
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Wed 03-06-26 19:53:15, John Hubbard wrote:
> init_pseudo() now sets SB_I_NOEXEC and SB_I_NODEV by default, so the per-caller assignments are redundant. Drop them.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>
Honza