 
On 02.04.24 at 08:49, zhiguojiang wrote:
As far as I can see that's not because of the DMA-buf code, but because you are somehow using this interface incorrectly.
When dma_buf_poll() is called it is mandatory for the caller to hold a reference to the file descriptor on which the poll operation is executed.
So adding code like "if (!file_count(file))" in the beginning of dma_buf_poll() is certainly broken.
My best guess is that you have some unbalanced dma_buf_get()/dma_buf_put() somewhere instead.
Hi Christian,
Surely the kernel dma_buf_poll() code should not crash the system because of logic issues in user-mode usage?
What user mode logical issues are you talking about? Closing a file while polling on it is perfectly valid.
dma_buf_poll() is called by the filesystem layer and it's the filesystem layer which should make sure that a file can't be closed while polling for an event.
If that doesn't work then you have stumbled over a massive bug in the fs layer. And I have some doubts that this is actually the case.
What is more likely is that some driver messes up the reference count and because of this you see a UAF.
Anyway as far as I can see the DMA-buf code is correct regarding this.
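To illustrate the reference-count argument, here is a small userspace model, not the kernel implementation. The names model_file, model_get_unless_zero and model_put are hypothetical; the kernel's analogous primitive for the "get" side is atomic_long_inc_not_zero(). The sketch assumes the usual rule that an object is freed once its count reaches zero: a caller that already holds a reference can never observe zero, and a caller that does not hold one has no guarantee the memory is still valid when it reads the count.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for a refcounted object such as struct file. */
struct model_file {
	atomic_long count;
};

/* Take a reference only if the object is still alive (count > 0). */
bool model_get_unless_zero(struct model_file *f)
{
	long old = atomic_load(&f->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&f->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
bool model_put(struct model_file *f)
{
	return atomic_fetch_sub(&f->count, 1) == 1;
}
```

In this model a separate "is the count zero?" test followed by a plain increment would be racy, which is why the check and the increment are one atomic step. Even so, the step is only safe if something else keeps the memory itself valid while it runs.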
Regards, Christian.
Thanks
On 2024/4/1 20:22, Christian König wrote:
On 27.03.24 at 03:29, Zhiguo Jiang wrote:
The issue is a UAF (use-after-free) of the dmabuf file fd. Through debugging, we found that the dmabuf file fd is added to the epoll event listener list, and when it is released, it is not removed from the epoll list, which leads to the UAF.
As far as I can see that's not because of the DMA-buf code, but because you are somehow using this interface incorrectly.
When dma_buf_poll() is called it is mandatory for the caller to hold a reference to the file descriptor on which the poll operation is executed.
So adding code like "if (!file_count(file))" in the beginning of dma_buf_poll() is certainly broken.
My best guess is that you have some unbalanced dma_buf_get()/dma_buf_put() somewhere instead.
Regards, Christian.
The UAF issue can be solved by checking the dmabuf file->f_count value and skipping the poll operation for a closed dmabuf file in dma_buf_poll(). We have tested this patch multiple times and have not reproduced the UAF issue.
crash dump:
list_del corruption, ffffff8a6f143a90->next is LIST_POISON1 (dead000000000100)
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:55!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
pc : __list_del_entry_valid+0x98/0xd4
lr : __list_del_entry_valid+0x98/0xd4
sp : ffffffc01d413d00
x29: ffffffc01d413d00 x28: 00000000000000c0 x27: 0000000000000020
x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000080007
x23: ffffff8b22e5dcc0 x22: ffffff88a6be12d0 x21: ffffff8b22e572b0
x20: ffffff80254ed0a0 x19: ffffff8a6f143a00 x18: ffffffda5efed3c0
x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0000000000000018
x11: ffffff8b6c188000 x10: 00000000ffffffff x9 : 6c8413a194897b00
x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
x5 : ffffff8b6c3b2a3e x4 : ffffff8b6c3b2a40 x3 : ffff103000001005
x2 : 0000000000000001 x1 : 00000000000000c0 x0 : 000000000000004e
Call trace:
 __list_del_entry_valid+0x98/0xd4
 dma_buf_file_release+0x48/0x90
 __fput+0xf4/0x280
 ____fput+0x10/0x20
 task_work_run+0xcc/0xf4
 do_notify_resume+0x2a0/0x33c
 el0_svc+0x5c/0xa4
 el0t_64_sync_handler+0x68/0xb4
 el0t_64_sync+0x1a0/0x1a4
Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d4210000)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception
SMP: stopping secondary CPUs
Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
 drivers/dma-buf/dma-buf.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)
 mode change 100644 => 100755 drivers/dma-buf/dma-buf.c

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..e469dd9288cc
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -240,6 +240,10 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 	struct dma_resv *resv;
 	__poll_t events;
 
+	/* Check if the file exists */
+	if (!file_count(file))
+		return EPOLLERR;
+
 	dmabuf = file->private_data;
 	if (!dmabuf || !dmabuf->resv)
 		return EPOLLERR;
@@ -266,8 +270,15 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & EPOLLOUT) {
-			/* Paired with fput in dma_buf_poll_cb */
-			get_file(dmabuf->file);
+			/*
+			 * Paired with fput in dma_buf_poll_cb,
+			 * Skip poll for the closed file.
+			 */
+			if (!get_file_rcu(&dmabuf->file)) {
+				events &= ~EPOLLOUT;
+				dcb->active = 0;
+				goto clear_out_event;
+			}
 
 			if (!dma_buf_poll_add_cb(resv, true, dcb))
 				/* No callback queued, wake up any other waiters */
@@ -277,6 +288,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		}
 	}
 
+clear_out_event:
 	if (events & EPOLLIN) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in;
 
@@ -289,8 +301,15 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & EPOLLIN) {
-			/* Paired with fput in dma_buf_poll_cb */
-			get_file(dmabuf->file);
+			/*
+			 * Paired with fput in dma_buf_poll_cb,
+			 * Skip poll for the closed file.
+			 */
+			if (!get_file_rcu(&dmabuf->file)) {
+				events &= ~EPOLLIN;
+				dcb->active = 0;
+				goto clear_in_event;
+			}
 
 			if (!dma_buf_poll_add_cb(resv, false, dcb))
 				/* No callback queued, wake up any other waiters */
@@ -300,6 +319,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		}
 	}
 
+clear_in_event:
 	dma_resv_unlock(resv);
 	return events;
 }
 
On Tue, Apr 2, 2024 at 1:08 AM Christian König <christian.koenig@amd.com> wrote:
On 02.04.24 at 08:49, zhiguojiang wrote:
As far as I can see that's not because of the DMA-buf code, but because you are somehow using this interface incorrectly.
When dma_buf_poll() is called it is mandatory for the caller to hold a reference to the file descriptor on which the poll operation is executed.
So adding code like "if (!file_count(file))" in the beginning of dma_buf_poll() is certainly broken.
My best guess is that you have some unbalanced dma_buf_get()/dma_buf_put() somewhere instead.
Hi Christian,
Surely the kernel dma_buf_poll() code should not crash the system because of logic issues in user-mode usage?
What user mode logical issues are you talking about? Closing a file while polling on it is perfectly valid.
dma_buf_poll() is called by the filesystem layer and it's the filesystem layer which should make sure that a file can't be closed while polling for an event.
If that doesn't work then you have stumbled over a massive bug in the fs layer. And I have some doubts that this is actually the case.
What is more likely is that some driver messes up the reference count and because of this you see a UAF.
Anyway as far as I can see the DMA-buf code is correct regarding this.
Regards, Christian.
I tried to reproduce this problem but I couldn't. What I see is that a reference is taken when poll is first called, so subsequently closing the fd in userspace while it's being polled doesn't take the refcount all the way to 0. That only happens when dma_buf_poll_cb fires, either due to an event or when the fd is closed upon timeout.
I don't really see how this could be triggered from userspace so I am also suspicious of dma_buf_get/put.
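For anyone who wants to repeat the experiment, the userspace side of the sequence described above (register an fd with epoll, close it while it is still registered, then wait) could be sketched roughly as follows. The helper name poll_after_close is hypothetical, and the caller is assumed to pass in a real dma-buf fd obtained from some exporter. Nothing in the helper itself is dma-buf specific.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: watch fd with epoll, close it, then wait.
 * Takes ownership of fd. Returns the number of ready events
 * (0 on timeout) or -1 on error.
 */
int poll_after_close(int fd, int timeout_ms)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
	struct epoll_event out;
	int epfd, n;

	epfd = epoll_create1(EPOLL_CLOEXEC);
	if (epfd < 0) {
		close(fd);
		return -1;
	}

	/* Registering the fd makes epoll call the file's poll hook once. */
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(fd);
		close(epfd);
		return -1;
	}

	/* Drop the userspace reference while the fd is still registered. */
	close(fd);

	n = epoll_wait(epfd, &out, 1, timeout_ms);
	close(epfd);
	return n;
}
```

With an ordinary file such as a pipe, closing the last descriptor removes the registration from the epoll set, so the wait simply times out. With a dma-buf that has unsignaled fences, the extra reference taken in dma_buf_poll() is expected to keep the file alive until the callback fires.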
linaro-mm-sig@lists.linaro.org

