On 02.04.24 at 08:49, zhiguojiang wrote:
>> As far as I can see that's not because of the DMA-buf code, but
>> because you are somehow using this interface incorrectly.
>>
>> When dma_buf_poll() is called it is mandatory for the caller to hold
>> a reference to the file descriptor on which the poll operation is
>> executed.
>>
>> So adding code like "if (!file_count(file))" in the beginning of
>> dma_buf_poll() is certainly broken.
>>
>> My best guess is that you have some unbalanced
>> dma_buf_get()/dma_buf_put() somewhere instead.
>>
>>
> Hi Christian,
>
> Shouldn't the kernel dma_buf_poll() code avoid crashing the system
> because of usage logic issues in user mode?
What user mode logical issues are you talking about? Closing a file
while polling on it is perfectly valid.

dma_buf_poll() is called by the filesystem layer and it's the filesystem
layer which should make sure that a file can't be closed while polling
for an event.

If that doesn't work then you have stumbled over a massive bug in the fs
layer. And I have some doubts that this is actually the case.

What is more likely is that some driver messes up the reference count
and because of this you see a UAF.

Anyway as far as I can see the DMA-buf code is correct regarding this.
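
The reference rule discussed above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_file`, `toy_get()`, `toy_put()`, `toy_poll()` and `toy_poll_pinned()` are hypothetical names, not kernel APIs, and a plain int stands in for the atomic f_count. It shows the caller pinning the object for the whole poll call, so the callee can rely on the count being non-zero rather than testing it at run time.

```c
#include <assert.h>

/* Hypothetical stand-in for struct file; not a kernel type. */
struct toy_file {
	int count;	/* plays the role of f_count */
};

static void toy_get(struct toy_file *f)
{
	assert(f->count > 0);	/* taking a ref on a dead object is a bug */
	f->count++;
}

/* Returns 1 when this put dropped the last reference. */
static int toy_put(struct toy_file *f)
{
	assert(f->count > 0);	/* an unbalanced put trips here */
	return --f->count == 0;
}

/* Callee: relies on the caller's reference, reports "no events". */
static int toy_poll(const struct toy_file *f)
{
	assert(f->count > 0);	/* contract check, not a runtime branch */
	return 0;
}

/* Caller side: pin, poll, unpin. Returns the poll result. */
static int toy_poll_pinned(struct toy_file *f)
{
	int ret;

	toy_get(f);
	ret = toy_poll(f);
	toy_put(f);
	return ret;
}
```

In this model an extra toy_put() somewhere else would drop the count to zero early, and the next toy_get() would hit the assertion. That is the userspace analogue of the unbalanced get/put suspected in this thread.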
Regards,
Christian.
>
> Thanks
>
>
> On 2024/4/1 20:22, Christian König wrote:
>> On 27.03.24 at 03:29, Zhiguo Jiang wrote:
>>> The issue is a UAF of the dmabuf file fd. Through debugging, we found
>>> that the dmabuf file fd is added to the epoll event listener list, and
>>> when it is released, it is not removed from the epoll list, which leads
>>> to the UAF (use-after-free).
>>
>> As far as I can see that's not because of the DMA-buf code, but
>> because you are somehow using this interface incorrectly.
>>
>> When dma_buf_poll() is called it is mandatory for the caller to hold
>> a reference to the file descriptor on which the poll operation is
>> executed.
>>
>> So adding code like "if (!file_count(file))" in the beginning of
>> dma_buf_poll() is certainly broken.
>>
>> My best guess is that you have some unbalanced
>> dma_buf_get()/dma_buf_put() somewhere instead.
>>
>> Regards,
>> Christian.
>>
>>>
>>> The UAF issue can be solved by checking the dmabuf file->f_count value
>>> and skipping the poll operation for a closed dmabuf file in
>>> dma_buf_poll(). We have tested this patch multiple times and have not
>>> reproduced the UAF issue.
>>>
>>> crash dump:
>>> list_del corruption, ffffff8a6f143a90->next is LIST_POISON1
>>> (dead000000000100)
>>> ------------[ cut here ]------------
>>> kernel BUG at lib/list_debug.c:55!
>>> Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>>> pc : __list_del_entry_valid+0x98/0xd4
>>> lr : __list_del_entry_valid+0x98/0xd4
>>> sp : ffffffc01d413d00
>>> x29: ffffffc01d413d00 x28: 00000000000000c0 x27: 0000000000000020
>>> x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000080007
>>> x23: ffffff8b22e5dcc0 x22: ffffff88a6be12d0 x21: ffffff8b22e572b0
>>> x20: ffffff80254ed0a0 x19: ffffff8a6f143a00 x18: ffffffda5efed3c0
>>> x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
>>> x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0000000000000018
>>> x11: ffffff8b6c188000 x10: 00000000ffffffff x9 : 6c8413a194897b00
>>> x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
>>> x5 : ffffff8b6c3b2a3e x4 : ffffff8b6c3b2a40 x3 : ffff103000001005
>>> x2 : 0000000000000001 x1 : 00000000000000c0 x0 : 000000000000004e
>>> Call trace:
>>> __list_del_entry_valid+0x98/0xd4
>>> dma_buf_file_release+0x48/0x90
>>> __fput+0xf4/0x280
>>> ____fput+0x10/0x20
>>> task_work_run+0xcc/0xf4
>>> do_notify_resume+0x2a0/0x33c
>>> el0_svc+0x5c/0xa4
>>> el0t_64_sync_handler+0x68/0xb4
>>> el0t_64_sync+0x1a0/0x1a4
>>> Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d4210000)
>>> ---[ end trace 0000000000000000 ]---
>>> Kernel panic - not syncing: Oops - BUG: Fatal exception
>>> SMP: stopping secondary CPUs
>>>
>>> Signed-off-by: Zhiguo Jiang <justinjiang(a)vivo.com>
>>> ---
>>> drivers/dma-buf/dma-buf.c | 28 ++++++++++++++++++++++++----
>>> 1 file changed, 24 insertions(+), 4 deletions(-)
>>> mode change 100644 => 100755 drivers/dma-buf/dma-buf.c
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index 8fe5aa67b167..e469dd9288cc
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -240,6 +240,10 @@ static __poll_t dma_buf_poll(struct file *file,
>>> poll_table *poll)
>>> struct dma_resv *resv;
>>> __poll_t events;
>>> + /* Check if the file exists */
>>> + if (!file_count(file))
>>> + return EPOLLERR;
>>> +
>>> dmabuf = file->private_data;
>>> if (!dmabuf || !dmabuf->resv)
>>> return EPOLLERR;
>>> @@ -266,8 +270,15 @@ static __poll_t dma_buf_poll(struct file *file,
>>> poll_table *poll)
>>> spin_unlock_irq(&dmabuf->poll.lock);
>>> if (events & EPOLLOUT) {
>>> - /* Paired with fput in dma_buf_poll_cb */
>>> - get_file(dmabuf->file);
>>> + /*
>>> + * Paired with fput in dma_buf_poll_cb,
>>> + * Skip poll for the closed file.
>>> + */
>>> + if (!get_file_rcu(&dmabuf->file)) {
>>> + events &= ~EPOLLOUT;
>>> + dcb->active = 0;
>>> + goto clear_out_event;
>>> + }
>>> if (!dma_buf_poll_add_cb(resv, true, dcb))
>>> /* No callback queued, wake up any other waiters */
>>> @@ -277,6 +288,7 @@ static __poll_t dma_buf_poll(struct file *file,
>>> poll_table *poll)
>>> }
>>> }
>>> +clear_out_event:
>>> if (events & EPOLLIN) {
>>> struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in;
>>> @@ -289,8 +301,15 @@ static __poll_t dma_buf_poll(struct file
>>> *file, poll_table *poll)
>>> spin_unlock_irq(&dmabuf->poll.lock);
>>> if (events & EPOLLIN) {
>>> - /* Paired with fput in dma_buf_poll_cb */
>>> - get_file(dmabuf->file);
>>> + /*
>>> + * Paired with fput in dma_buf_poll_cb,
>>> + * Skip poll for the closed file.
>>> + */
>>> + if (!get_file_rcu(&dmabuf->file)) {
>>> + events &= ~EPOLLIN;
>>> + dcb->active = 0;
>>> + goto clear_in_event;
>>> + }
>>> if (!dma_buf_poll_add_cb(resv, false, dcb))
>>> /* No callback queued, wake up any other waiters */
>>> @@ -300,6 +319,7 @@ static __poll_t dma_buf_poll(struct file *file,
>>> poll_table *poll)
>>> }
>>> }
>>> +clear_in_event:
>>> dma_resv_unlock(resv);
>>> return events;
>>> }
>>
>
On 01.04.24 at 14:39, Tvrtko Ursulin wrote:
>
> On 29/03/2024 00:00, T.J. Mercier wrote:
>> On Thu, Mar 28, 2024 at 7:53 AM Tvrtko Ursulin <tursulin(a)igalia.com>
>> wrote:
>>>
>>> From: Tvrtko Ursulin <tursulin(a)ursulin.net>
>>>
>>> There is no point in compiling in the list and mutex operations which
>>> are only used from the dma-buf debugfs code, if debugfs is not compiled in.
>>>
>>> Put the code in question behind some Kconfig guards and so save some
>>> text and maybe even a pointer per object at runtime when not enabled.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tursulin(a)ursulin.net>
>>
>> Reviewed-by: T.J. Mercier <tjmercier(a)google.com>
>
> Thanks!
>
> How would patches to dma-buf be typically landed? Via what tree I
> mean? drm-misc-next?
That should go through drm-misc-next.
And feel free to add Reviewed-by: Christian König
<christian.koenig(a)amd.com> as well.
Regards,
Christian.
>
> Regards,
>
> Tvrtko
On Thu, Mar 28, 2024 at 12:06:56PM +0000, Naveen Mamindlapalli wrote:
> > diff --git a/drivers/net/ethernet/ti/k3-cppi-desc-pool.c b/drivers/net/ethernet/ti/k3-cppi-desc-pool.c
> > index 05cc7aab1ec8..fe8203c05731 100644
> > --- a/drivers/net/ethernet/ti/k3-cppi-desc-pool.c
> > +++ b/drivers/net/ethernet/ti/k3-cppi-desc-pool.c
> > @@ -132,5 +132,17 @@ size_t k3_cppi_desc_pool_avail(struct k3_cppi_desc_pool *pool)
> >  }
> >  EXPORT_SYMBOL_GPL(k3_cppi_desc_pool_avail);
> >
> > +size_t k3_cppi_desc_pool_desc_size(struct k3_cppi_desc_pool *pool)
> > +{
> > +	return pool->desc_size;
>
> Don't you need to add NULL check on pool ptr since this function is exported?
What bearing does exporting a function have on whether it should check
for NULL?

Given that this function returns size_t, it can't return an error
number. So what value would it return if "pool" were NULL? It can
only return a positive integer or zero.

Also, the argument should be const as the function doesn't modify the
contents of "pool".
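
The two review points above (no error return is possible through size_t, and the argument should be const) can be sketched as follows. This is a minimal illustration, not the driver's code: `toy_desc_pool` and `toy_desc_pool_desc_size()` are hypothetical stand-ins, and the struct is reduced to the one field the accessor reads.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct k3_cppi_desc_pool. */
struct toy_desc_pool {
	size_t desc_size;
};

/*
 * Read-only accessor: the const qualifier documents that the pool is
 * not modified. size_t has no spare value to signal an error, so the
 * contract is simply that the caller passes a valid pool.
 */
static size_t toy_desc_pool_desc_size(const struct toy_desc_pool *pool)
{
	return pool->desc_size;
}
```

With the const-qualified parameter, callers that only hold a `const struct toy_desc_pool *` can use the accessor without a cast.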
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Tue, Mar 26, 2024 at 7:29 PM Zhiguo Jiang <justinjiang(a)vivo.com> wrote:
>
> The issue is a UAF of the dmabuf file fd. Through debugging, we found
> that the dmabuf file fd is added to the epoll event listener list, and
> when it is released, it is not removed from the epoll list, which leads
> to the UAF (use-after-free).
>
> The UAF issue can be solved by checking the dmabuf file->f_count value
> and skipping the poll operation for a closed dmabuf file in
> dma_buf_poll(). We have tested this patch multiple times and have not
> reproduced the UAF issue.
>
Hi Zhiguo,
What is the most recent kernel version you've seen the bug on?
You are closing the dmabuf fd from another thread while it is still
part of the epoll interest list?
Thanks,
T.J.
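
For context on the question above: epoll(7) documents that a file is dropped from every epoll interest list once all descriptors referring to its open file description have been closed. The Linux-only sketch below shows that behaviour from userspace with a pipe instead of a dmabuf. `toy_epoll_ready_count()` is a hypothetical helper written for this illustration, and it does not reproduce the reported crash.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns how many ready events epoll reports for a readable pipe end,
 * either while the descriptor is still open (close_first == 0) or after
 * the only descriptor for that pipe end was closed (close_first != 0).
 * Returns -1 on error.
 */
static int toy_epoll_ready_count(int close_first)
{
	struct epoll_event ev = { .events = EPOLLIN }, got;
	int pfd[2], epfd, n = -1;

	if (pipe(pfd) < 0)
		return -1;
	epfd = epoll_create1(0);
	if (epfd < 0)
		goto out_pipe;

	ev.data.fd = pfd[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) < 0)
		goto out_epoll;
	if (write(pfd[1], "x", 1) != 1)	/* make the read end readable */
		goto out_epoll;

	if (close_first) {
		close(pfd[0]);	/* last reference: file leaves the interest list */
		pfd[0] = -1;
	}
	n = epoll_wait(epfd, &got, 1, 0);	/* timeout 0: do not block */

out_epoll:
	close(epfd);
out_pipe:
	if (pfd[0] >= 0)
		close(pfd[0]);
	close(pfd[1]);
	return n;
}
```

If the descriptor had been duplicated with dup() before the close, the open file description would stay alive and epoll would keep reporting events for it.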
On Thu, Mar 28, 2024 at 7:53 AM Tvrtko Ursulin <tursulin(a)igalia.com> wrote:
>
> From: Tvrtko Ursulin <tursulin(a)ursulin.net>
>
> There is no point in compiling in the list and mutex operations which are
> only used from the dma-buf debugfs code, if debugfs is not compiled in.
>
> Put the code in question behind some Kconfig guards and so save some text
> and maybe even a pointer per object at runtime when not enabled.
>
> Signed-off-by: Tvrtko Ursulin <tursulin(a)ursulin.net>
Reviewed-by: T.J. Mercier <tjmercier(a)google.com>
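
The guard pattern described in the commit message can be sketched in plain C as below. This is one common way to do it and not necessarily what the patch itself does. Every name is hypothetical: `TOY_DEBUG_FS` stands in for CONFIG_DEBUG_FS, and `toy_buf` for the real object. With the macro undefined, both the list pointer and the tracking code disappear from the build.

```c
#include <stddef.h>

/*
 * TOY_DEBUG_FS stands in for CONFIG_DEBUG_FS; build with -DTOY_DEBUG_FS
 * to compile the tracking code in.
 */
struct toy_buf {
	size_t size;
#ifdef TOY_DEBUG_FS
	struct toy_buf *next;	/* only needed for the debug listing */
#endif
};

#ifdef TOY_DEBUG_FS
static struct toy_buf *toy_debug_list;

static void toy_debug_add(struct toy_buf *buf)
{
	buf->next = toy_debug_list;
	toy_debug_list = buf;
}

static size_t toy_debug_count(void)
{
	size_t n = 0;

	for (const struct toy_buf *b = toy_debug_list; b; b = b->next)
		n++;
	return n;
}
#else
/* Empty stubs: callers stay unconditional and the calls compile away. */
static inline void toy_debug_add(struct toy_buf *buf) { (void)buf; }
static inline size_t toy_debug_count(void) { return 0; }
#endif

static void toy_buf_init(struct toy_buf *buf, size_t size)
{
	buf->size = size;
	toy_debug_add(buf);
}
```

Keeping the stubs as static inline functions means toy_buf_init() needs no #ifdef of its own, which is the usual reason for guarding helpers instead of call sites.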
From: Rob Clark <robdclark(a)chromium.org>
virtgpu "vram" GEM objects do not implement obj->get_sg_table(). But
they also don't use drm_gem_map_dma_buf(). In fact they may not even
have guest visible pages. But it is perfectly fine to export and share
with other virtual devices.
Reported-by: Dominik Behr <dbehr(a)chromium.org>
Fixes: 207395da5a97 ("drm/prime: reject DMA-BUF attach when get_sg_table is missing")
Signed-off-by: Rob Clark <robdclark(a)chromium.org>
---
drivers/gpu/drm/drm_prime.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7352bde299d5..64dd6276e828 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -582,7 +582,12 @@ int drm_gem_map_attach(struct dma_buf *dma_buf,
 {
 	struct drm_gem_object *obj = dma_buf->priv;
 
-	if (!obj->funcs->get_sg_table)
+	/*
+	 * drm_gem_map_dma_buf() requires obj->get_sg_table(), but drivers
+	 * that implement their own ->map_dma_buf() do not.
+	 */
+	if ((dma_buf->ops->map_dma_buf == drm_gem_map_dma_buf) &&
+	    !obj->funcs->get_sg_table)
 		return -ENOSYS;
 
 	return drm_gem_pin(obj);
--
2.44.0
This is actually a bit concerning: importing a host-page-backed
buffer without a guest mapping into a passthrough device probably
doesn't work and should be rejected earlier.
I do think we should relax the restriction (either taking my patch or
reverting the commit it fixes) until we work this out properly
(because the original patch is a regression), but importing a buffer
without guest pages into a passthru device can't possibly work
properly. Maybe it works by chance if the host buffer is mapped to
the guest, but that is not guaranteed.
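
The check in the patch compares a function pointer against the default helper, so that a missing hook is only fatal when the default path would actually call it. A reduced userspace sketch of that idea follows. All `toy_*` names are hypothetical and only loosely modelled on the DRM structures, and it says nothing about whether a given import can work at run time.

```c
#include <errno.h>
#include <stddef.h>

struct toy_obj;

/* Hypothetical ops tables, loosely modelled on the ones in the patch. */
struct toy_obj_funcs {
	int (*get_sg_table)(struct toy_obj *obj);
};

struct toy_buf_ops {
	int (*map)(struct toy_obj *obj);
};

struct toy_obj {
	const struct toy_obj_funcs *funcs;
	const struct toy_buf_ops *ops;
};

/* Default map helper: the only path that needs get_sg_table(). */
static int toy_default_map(struct toy_obj *obj)
{
	return obj->funcs->get_sg_table(obj);
}

/* Driver-specific map helper that never touches get_sg_table(). */
static int toy_custom_map(struct toy_obj *obj)
{
	(void)obj;
	return 0;
}

static int toy_attach(struct toy_obj *obj)
{
	/* Reject only when the default helper would hit a missing hook. */
	if (obj->ops->map == toy_default_map && !obj->funcs->get_sg_table)
		return -ENOSYS;
	return 0;
}
```

An object that uses toy_custom_map() attaches even without get_sg_table(), while one that relies on toy_default_map() is rejected up front.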
BR,
-R
On Mon, Mar 25, 2024 at 3:35 PM Dominik Behr <dbehr(a)chromium.org> wrote:
>
> It also fixes importing virtgpu blobs into real hardware, for instance amdgpu for DRI_PRIME rendering.
>
> On Fri, Mar 22, 2024 at 2:48 PM Rob Clark <robdclark(a)gmail.com> wrote:
>>
>> From: Rob Clark <robdclark(a)chromium.org>
>>
>> virtgpu "vram" GEM objects do not implement obj->get_sg_table(). But
>> they also don't use drm_gem_map_dma_buf(). In fact they may not even
>> have guest visible pages. But it is perfectly fine to export and share
>> with other virtual devices.
>>
>> Reported-by: Dominik Behr <dbehr(a)chromium.org>
>> Fixes: 207395da5a97 ("drm/prime: reject DMA-BUF attach when get_sg_table is missing")
>> Signed-off-by: Rob Clark <robdclark(a)chromium.org>
>> ---
>> drivers/gpu/drm/drm_prime.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>> index 7352bde299d5..64dd6276e828 100644
>> --- a/drivers/gpu/drm/drm_prime.c
>> +++ b/drivers/gpu/drm/drm_prime.c
>> @@ -582,7 +582,12 @@ int drm_gem_map_attach(struct dma_buf *dma_buf,
>>  {
>>  	struct drm_gem_object *obj = dma_buf->priv;
>>
>> -	if (!obj->funcs->get_sg_table)
>> +	/*
>> +	 * drm_gem_map_dma_buf() requires obj->get_sg_table(), but drivers
>> +	 * that implement their own ->map_dma_buf() do not.
>> +	 */
>> +	if ((dma_buf->ops->map_dma_buf == drm_gem_map_dma_buf) &&
>> +	    !obj->funcs->get_sg_table)
>>  		return -ENOSYS;
>>
>>  	return drm_gem_pin(obj);
>> --
>> 2.44.0
>>