Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error

3 Feb 2021


      On Wed, Feb 3, 2021 at 9:20 PM Suren Baghdasaryan surenb@google.com wrote:
...
On Wed, Feb 3, 2021 at 12:52 AM Daniel Vetter daniel.vetter@ffwll.ch wrote:
...
On Wed, Feb 3, 2021 at 2:57 AM Matthew Wilcox willy@infradead.org wrote:
...
On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
...
Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
WARN_ON_ONCE and returning an error. This is to ensure users of the
vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
and get an indication of an error without panicing the kernel.
This will help identifying drivers that need to clear VM_PFNMAP before
using dmabuf system heap which is moving to use vm_insert_page.
NACK.
The system may not _panic_, but it is clearly now _broken_.  The device
doesn't work, and so the system is useless.  You haven't really improved
anything here.  Just bloated the kernel with yet another _ONCE variable
that in a normal system will never ever ever be triggered.
Also, what the heck are you doing with your drivers? dma-buf mmap must
call dma_buf_mmap(), even for forwarded/redirected mmaps from driver
char nodes. If that doesn't work we have some issues with the calling
contract for that function, not in vm_insert_page.
The particular issue I observed (details were posted in
https://lore.kernel.org/patchwork/patch/1372409) is that DRM drivers
set VM_PFNMAP flag (via a call to drm_gem_mmap_obj) before calling
dma_buf_mmap. Some drivers clear that flag but some don't. I could not
find the answer to why VM_PFNMAP is required for dmabuf mappings and
maybe someone can explain that here?
If there is a reason to set this flag other than historical use of
carveout memory then we wanted to catch such cases and fix the drivers
that moved to using dmabuf heaps. However maybe there are other
reasons and if so I would be very grateful if someone could explain
them. That would help me to come up with a better solution.
...
Finally why exactly do we need to make this switch for system heap?
I've recently looked at gup usage by random drivers, and found a lot
of worrying things there. gup on dma-buf is really bad idea in
general.
The reason for the switch is to be able to account dmabufs allocated
using dmabuf heaps to the processes that map them. The next patch in
this series https://lore.kernel.org/patchwork/patch/1374851
implementing the switch contains more details and there is an active
discussion there. Would you mind joining that discussion to keep it in
one place?
How many semi-unrelated buffer accounting schemes does google come up with?
We're at three with this one.
And also we _cannot_ required that all dma-bufs are backed by struct
page, so requiring struct page to make this work is a no-go.
Second, we do not want to all get_user_pages and friends to work on
dma-buf, it causes all kinds of pain. Yes on SoC where dma-buf are
exclusively in system memory you can maybe get away with this, but
dma-buf is supposed to work in more places than just Android SoCs.
If you want to account dma-bufs, and gpu memory in general, I'd say
the solid solution is cgroups. There's patches floating around. And
given that Google Android can't even agree internally on what exactly
you want I'd say we just need to cut over to that and make it happen.
Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error