I have tested that patches 2 and 3 work using the following reproducers. I did not write a reproducer for the issue described in patch 1.
Reproducer for F_SEAL_FUTURE_WRITE not being respected:
```
#define _GNU_SOURCE
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/udmabuf.h>

#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

int main(void) {
  int memfd = SYSCHK(memfd_create("test", MFD_ALLOW_SEALING));
  SYSCHK(ftruncate(memfd, 0x1000));
  SYSCHK(fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK|F_SEAL_FUTURE_WRITE));
  int udmabuf_fd = SYSCHK(open("/dev/udmabuf", O_RDWR));
  struct udmabuf_create create_arg = {
    .memfd = memfd,
    .flags = 0,
    .offset = 0,
    .size = 0x1000
  };
  int buf_fd = SYSCHK(ioctl(udmabuf_fd, UDMABUF_CREATE, &create_arg));
  printf("created udmabuf buffer fd %d\n", buf_fd);
  char *map = SYSCHK(mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, buf_fd, 0));
  *map = 'a';
}
```
Reproducer for the memory leak (if you run this for a while, your memory usage will steadily go up, and /sys/kernel/debug/dma_buf/bufinfo will contain a ton of entries):
```
#define _GNU_SOURCE
#include <err.h>
#include <errno.h>
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <linux/udmabuf.h>

#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

int main(void) {
  int memfd = SYSCHK(memfd_create("test", MFD_ALLOW_SEALING));
  SYSCHK(ftruncate(memfd, 0x1000));
  SYSCHK(fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK));
  int udmabuf_fd = SYSCHK(open("/dev/udmabuf", O_RDWR));

  // prevent creating new FDs
  struct rlimit rlim = { .rlim_cur = 1, .rlim_max = 1 };
  SYSCHK(setrlimit(RLIMIT_NOFILE, &rlim));

  while (1) {
    struct udmabuf_create create_arg = {
      .memfd = memfd,
      .flags = 0,
      .offset = 0,
      .size = 0x1000
    };
    int buf_fd = ioctl(udmabuf_fd, UDMABUF_CREATE, &create_arg);
    assert(buf_fd == -1);
    assert(errno == EMFILE);
  }
}
```
Signed-off-by: Jann Horn <jannh@google.com>
---
Jann Horn (3):
      udmabuf: fix racy memfd sealing check
      udmabuf: also check for F_SEAL_FUTURE_WRITE
      udmabuf: fix memory leak on last export_udmabuf() error path

 drivers/dma-buf/udmabuf.c | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)
---
base-commit: b86545e02e8c22fb89218f29d381fa8e8b91d815
change-id: 20241203-udmabuf-fixes-d0435ebab663
The current check_memfd_seals() is racy: Since we first do check_memfd_seals() and then udmabuf_pin_folios() without holding any relevant lock across both, F_SEAL_WRITE can be set in between. This is problematic because we can end up holding pins to pages in a write-sealed memfd.
Fix it using the inode lock, that's probably the easiest way. In the future, we might want to consider moving this logic into memfd, especially if anyone else wants to use memfd_pin_folios().
Reported-by: Julian Orth <ju.orth@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219106
Closes: https://lore.kernel.org/r/CAG48ez0w8HrFEZtJkfmkVKFDhE5aP7nz=obrimeTgpD+StkV9...
Fixes: fbb0de795078 ("Add udmabuf misc device")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
---
 drivers/dma-buf/udmabuf.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 8ce1f074c2d32a0a9f59ff7184359e37d56548c6..662b9a26e06668bf59ab36d07c0648c7b02ee5ae 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -436,14 +436,15 @@ static long udmabuf_create(struct miscdevice *device,
 			goto err;
 		}

+		inode_lock_shared(memfd->f_inode);
 		ret = check_memfd_seals(memfd);
-		if (ret < 0) {
-			fput(memfd);
-			goto err;
-		}
+		if (ret)
+			goto out_unlock;

 		ret = udmabuf_pin_folios(ubuf, memfd, list[i].offset,
 					 list[i].size, folios);
+out_unlock:
+		inode_unlock_shared(memfd->f_inode);
 		fput(memfd);
 		if (ret)
 			goto err;
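For readers following the control flow of patch 1, here is a rough userspace sketch of the same shape: take the lock, check, pin, and leave through a single out_unlock label so the unlock and the reference drop happen on every path. This is not kernel code. struct toy_file and the toy_* helpers are hypothetical names made up for illustration, and plain counters stand in for the inode lock, fput() and the folio pins.

```c
#include <errno.h>

/* Hypothetical stand-in for a memfd; none of these names exist in the kernel. */
struct toy_file {
	int write_sealed;  /* nonzero models a seal that must be rejected */
	int lock_depth;    /* models holders of the shared inode lock */
	int refs;          /* models the file reference dropped by fput() */
	int pins;          /* models pinned folios */
};

static int toy_check_seals(const struct toy_file *f)
{
	return f->write_sealed ? -EINVAL : 0;
}

static int toy_pin(struct toy_file *f)
{
	f->pins++;
	return 0;
}

/* Check and pin under one lock; every path leaves through out_unlock. */
static int toy_create(struct toy_file *f)
{
	int ret;

	f->lock_depth++;        /* stands in for inode_lock_shared() */
	ret = toy_check_seals(f);
	if (ret)
		goto out_unlock;

	ret = toy_pin(f);
out_unlock:
	f->lock_depth--;        /* stands in for inode_unlock_shared() */
	f->refs--;              /* stands in for fput() */
	return ret;
}
```

Calling toy_create() on a sealed and on an unsealed toy_file leaves lock_depth at 0 and drops one reference in both cases; only the unsealed case ends up with a pin.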
Hi Jann,
> Subject: [PATCH 1/3] udmabuf: fix racy memfd sealing check
>
> The current check_memfd_seals() is racy: Since we first do
> check_memfd_seals() and then udmabuf_pin_folios() without holding any
> relevant lock across both, F_SEAL_WRITE can be set in between.
> This is problematic because we can end up holding pins to pages in a
> write-sealed memfd.
>
> Fix it using the inode lock, that's probably the easiest way.
> In the future, we might want to consider moving this logic into memfd,
> especially if anyone else wants to use memfd_pin_folios().
>
> Reported-by: Julian Orth <ju.orth@gmail.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219106
> Closes: https://lore.kernel.org/r/CAG48ez0w8HrFEZtJkfmkVKFDhE5aP7nz=obrimeTgpD+StkV9w@mail.gmail.com
> Fixes: fbb0de795078 ("Add udmabuf misc device")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
>  drivers/dma-buf/udmabuf.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index 8ce1f074c2d32a0a9f59ff7184359e37d56548c6..662b9a26e06668bf59ab36d07c0648c7b02ee5ae 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -436,14 +436,15 @@ static long udmabuf_create(struct miscdevice *device,
>  			goto err;
>  		}
>
> +		inode_lock_shared(memfd->f_inode);
I think having inode_lock_shared(file_inode(memfd)) looks a bit cleaner. Also, wouldn't it be more appropriate here to take the writer's lock instead of the reader's lock, given what we are doing (pinning) in udmabuf_create()?
Thanks, Vivek
>  		ret = check_memfd_seals(memfd);
> -		if (ret < 0) {
> -			fput(memfd);
> -			goto err;
> -		}
> +		if (ret)
> +			goto out_unlock;
>
>  		ret = udmabuf_pin_folios(ubuf, memfd, list[i].offset,
>  					 list[i].size, folios);
> +out_unlock:
> +		inode_unlock_shared(memfd->f_inode);
>  		fput(memfd);
>  		if (ret)
>  			goto err;
>
> --
> 2.47.0.338.g60cca15819-goog
On Wed, Dec 4, 2024 at 10:09 AM Kasireddy, Vivek vivek.kasireddy@intel.com wrote:
> > +		inode_lock_shared(memfd->f_inode);
>
> I think having inode_lock_shared(file_inode(memfd)) looks a bit cleaner.

Good idea, changed that.

> Also, wouldn't it be more appropriate here to take the writer's lock instead
> of the reader's lock, given what we are doing (pinning) in udmabuf_create()?

I don't see why that would require taking the inode lock in write mode. I am taking the inode lock to provide exclusion against memfd_add_seals(), which uses inode_lock(); in other words, the inode lock is there to keep the sealing status of the file from changing (which is a reader-like requirement). I'll add a comment in v2 to clarify this.
When F_SEAL_FUTURE_WRITE was introduced, it was overlooked that udmabuf must reject memfds with this flag, just like ones with F_SEAL_WRITE. Fix it by adding F_SEAL_FUTURE_WRITE to SEALS_DENIED.
Fixes: ab3948f58ff8 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
---
 drivers/dma-buf/udmabuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 662b9a26e06668bf59ab36d07c0648c7b02ee5ae..8ce77f5837d71a73be677cad014e05f29706057d 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -297,7 +297,7 @@ static const struct dma_buf_ops udmabuf_ops = {
 };

 #define SEALS_WANTED (F_SEAL_SHRINK)
-#define SEALS_DENIED (F_SEAL_WRITE)
+#define SEALS_DENIED (F_SEAL_WRITE|F_SEAL_FUTURE_WRITE)

 static int check_memfd_seals(struct file *memfd)
 {
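As an illustration of what the one-line change in patch 2 does to the seal check, here is a standalone sketch of the mask logic. It assumes the check requires every SEALS_WANTED bit to be set and rejects any SEALS_DENIED bit; the body of check_memfd_seals() is not shown in this thread, so treat that as an assumption. toy_check_seal_mask() is a hypothetical helper, and the fallback #defines use the seal values from the Linux UAPI headers in case the libc headers do not provide them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>

/* Fallbacks mirroring the Linux UAPI values, for older libc headers. */
#ifndef F_SEAL_SHRINK
#define F_SEAL_SHRINK 0x0002
#endif
#ifndef F_SEAL_WRITE
#define F_SEAL_WRITE 0x0008
#endif
#ifndef F_SEAL_FUTURE_WRITE
#define F_SEAL_FUTURE_WRITE 0x0010
#endif

#define SEALS_WANTED (F_SEAL_SHRINK)
#define SEALS_DENIED (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)

/* Hypothetical helper: apply both masks to an F_GET_SEALS-style bitmask. */
static int toy_check_seal_mask(int seals)
{
	if ((seals & SEALS_WANTED) != SEALS_WANTED)
		return -EINVAL;  /* a required seal is missing */
	if (seals & SEALS_DENIED)
		return -EINVAL;  /* a forbidden seal is present */
	return 0;
}
```

With these masks, F_SEAL_SHRINK alone passes, while F_SEAL_SHRINK combined with either F_SEAL_WRITE or F_SEAL_FUTURE_WRITE is rejected, which is the case the first reproducer in the cover letter exercises.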
> Subject: [PATCH 2/3] udmabuf: also check for F_SEAL_FUTURE_WRITE
>
> When F_SEAL_FUTURE_WRITE was introduced, it was overlooked that udmabuf
> must reject memfds with this flag, just like ones with F_SEAL_WRITE.
> Fix it by adding F_SEAL_FUTURE_WRITE to SEALS_DENIED.
>
> Fixes: ab3948f58ff8 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
>  drivers/dma-buf/udmabuf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index 662b9a26e06668bf59ab36d07c0648c7b02ee5ae..8ce77f5837d71a73be677cad014e05f29706057d 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -297,7 +297,7 @@ static const struct dma_buf_ops udmabuf_ops = {
>  };
>
>  #define SEALS_WANTED (F_SEAL_SHRINK)
> -#define SEALS_DENIED (F_SEAL_WRITE)
> +#define SEALS_DENIED (F_SEAL_WRITE|F_SEAL_FUTURE_WRITE)
Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
>
>  static int check_memfd_seals(struct file *memfd)
>  {
>
> --
> 2.47.0.338.g60cca15819-goog
linux-stable-mirror@lists.linaro.org