On Thu, Aug 27, 2020 at 09:31:27AM -0400, Laura Abbott wrote:
> On 8/27/20 8:36 AM, Greg Kroah-Hartman wrote:
> > The ION android code has long been marked to be removed, now that we
> > have dma-buf support merged into the real part of the kernel.
> >
> > It was thought that we could wait to remove the ion kernel at a later
> > time, but as the out-of-tree Android fork of the ion code has diverged
> > quite a bit, and any Android device using the ion interface uses that
> > forked version and not this in-tree version, the in-tree copy of the
> > code is abandoned and not used by anyone.
> >
> > Combine this abandoned codebase with the need to make changes to it in
> > order to keep the kernel building properly, which then causes merge
> > issues when merging those changes into the out-of-tree Android code, and
> > you end up with two different groups of people (the in-kernel-tree
> > developers, and the Android kernel developers) who are both annoyed at
> > the current situation. Because of this problem, just drop the in-kernel
> > copy of the ion code now, as it's not used, and is only causing problems
> > for everyone involved.
> >
> > Cc: "Arve Hjønnevåg" <arve(a)android.com>
> > Cc: "Christian König" <christian.koenig(a)amd.com>
> > Cc: Christian Brauner <christian(a)brauner.io>
> > Cc: Christoph Hellwig <hch(a)infradead.org>
> > Cc: Hridya Valsaraju <hridya(a)google.com>
> > Cc: Joel Fernandes <joel(a)joelfernandes.org>
> > Cc: John Stultz <john.stultz(a)linaro.org>
> > Cc: Laura Abbott <laura(a)labbott.name>
> > Cc: Martijn Coenen <maco(a)android.com>
> > Cc: Shuah Khan <shuah(a)kernel.org>
> > Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
> > Cc: Suren Baghdasaryan <surenb(a)google.com>
> > Cc: Todd Kjos <tkjos(a)android.com>
> > Cc: devel(a)driverdev.osuosl.org
> > Cc: dri-devel(a)lists.freedesktop.org
> > Cc: linaro-mm-sig(a)lists.linaro.org
> > Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
>
> We discussed this at the Android MC on Monday and the plan was to
> remove it after the next LTS release.
As 5.10 will be the next LTS release, I have now merged it to my
"testing" branch to go into 5.11-rc1.
thanks,
greg k-h
On Wed, Oct 14, 2020 at 09:16:01AM -0700, Jianxin Xiong wrote:
> The dma-buf API has been used under the assumption that the sg lists
> returned from dma_buf_map_attachment() are fully page aligned. Lots of
> stuff can break otherwise all over the place. Clarify this in the
> documentation and add a check when DMA API debug is enabled.
>
> Signed-off-by: Jianxin Xiong <jianxin.xiong(a)intel.com>
lgtm, thanks for creating this and giving it a spin.
I'll queue this up in drm-misc-next for 5.11, should show up in linux-next
after the merge window is closed.
Cheers, Daniel
> ---
> drivers/dma-buf/dma-buf.c | 21 +++++++++++++++++++++
> include/linux/dma-buf.h | 3 ++-
> 2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 844967f..7309c83 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -851,6 +851,9 @@ void dma_buf_unpin(struct dma_buf_attachment *attach)
> * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
> * on error. May return -EINTR if it is interrupted by a signal.
> *
> + * On success, the DMA addresses and lengths in the returned scatterlist are
> + * PAGE_SIZE aligned.
> + *
> * A mapping must be unmapped by using dma_buf_unmap_attachment(). Note that
> * the underlying backing storage is pinned for as long as a mapping exists,
> * therefore users/importers should not hold onto a mapping for undue amounts of
> @@ -904,6 +907,24 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
> attach->dir = direction;
> }
>
> +#ifdef CONFIG_DMA_API_DEBUG
> + {
> + struct scatterlist *sg;
> + u64 addr;
> + int len;
> + int i;
> +
> + for_each_sgtable_dma_sg(sg_table, sg, i) {
> + addr = sg_dma_address(sg);
> + len = sg_dma_len(sg);
> + if (!PAGE_ALIGNED(addr) || !PAGE_ALIGNED(len)) {
> + pr_debug("%s: addr %llx or len %x is not page aligned!\n",
> + __func__, addr, len);
> + }
> + }
> + }
> +#endif /* CONFIG_DMA_API_DEBUG */
> +
> return sg_table;
> }
> EXPORT_SYMBOL_GPL(dma_buf_map_attachment);
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index a2ca294e..4a5fa70 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -145,7 +145,8 @@ struct dma_buf_ops {
> *
> * A &sg_table scatter list of or the backing storage of the DMA buffer,
> * already mapped into the device address space of the &device attached
> - * with the provided &dma_buf_attachment.
> + * with the provided &dma_buf_attachment. The addresses and lengths in
> + * the scatter list are PAGE_SIZE aligned.
> *
> * On failure, returns a negative error value wrapped into a pointer.
> * May also return -EINTR when a signal was received while being
> --
> 1.8.3.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel(a)lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
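The debug check in the patch above can be tried outside the kernel. What follows is only a minimal userspace sketch, not the kernel code: the 4 KiB page size and every sketch_* name are assumptions made for illustration, and a plain array stands in for the scatterlist that for_each_sgtable_dma_sg() walks. It counts the segments whose DMA address or length is not page aligned, which is the condition the patch reports via pr_debug().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel takes PAGE_SIZE from the architecture. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct sketch_seg {
	uint64_t dma_addr;
	uint32_t dma_len;
};

/* Same idea as the kernel's PAGE_ALIGNED(): no bits set below the page size. */
static bool sketch_page_aligned(uint64_t value)
{
	return (value & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* Count the segments whose address or length is not page aligned. */
static size_t sketch_count_unaligned(const struct sketch_seg *segs, size_t n)
{
	size_t bad = 0;

	assert(segs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!sketch_page_aligned(segs[i].dma_addr) ||
		    !sketch_page_aligned(segs[i].dma_len))
			bad++;
	}
	return bad;
}
```

An exporter that hands out an address such as 0x20010, or a length of 100 bytes, would be flagged by this check, while segments of whole pages starting on a page boundary pass.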
From: Rob Clark <robdclark(a)chromium.org>
This doesn't remove *all* the struct_mutex, but it covers the worst
of it, ie. shrinker/madvise/free/retire. The submit path still uses
struct_mutex, but it still needs *something* to serialize a portion of
the submit path, and lock_stat mostly just shows the lock contention
there being with other submits. And there are a few other bits of
struct_mutex usage in less critical paths (debugfs, etc). But this
seems like a reasonable step in the right direction.
v2: teach lockdep about shrinker locking patterns (danvet) and
convert to obj->resv locking (danvet)
Rob Clark (22):
drm/msm/gem: Add obj->lock wrappers
drm/msm/gem: Rename internal get_iova_locked helper
drm/msm/gem: Move prototypes to msm_gem.h
drm/msm/gem: Add some _locked() helpers
drm/msm/gem: Move locking in shrinker path
drm/msm/submit: Move copy_from_user ahead of locking bos
drm/msm: Do rpm get sooner in the submit path
drm/msm/gem: Switch over to obj->resv for locking
drm/msm: Use correct drm_gem_object_put() in fail case
drm/msm: Drop chatty trace
drm/msm: Move update_fences()
drm/msm: Add priv->mm_lock to protect active/inactive lists
drm/msm: Document and rename preempt_lock
drm/msm: Protect ring->submits with it's own lock
drm/msm: Refcount submits
drm/msm: Remove obj->gpu
drm/msm: Drop struct_mutex from the retire path
drm/msm: Drop struct_mutex in free_object() path
drm/msm: remove msm_gem_free_work
drm/msm: drop struct_mutex in madvise path
drm/msm: Drop struct_mutex in shrinker path
drm/msm: Don't implicit-sync if only a single ring
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 12 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 1 +
drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 1 +
drivers/gpu/drm/msm/dsi/dsi_host.c | 1 +
drivers/gpu/drm/msm/msm_debugfs.c | 7 +
drivers/gpu/drm/msm/msm_drv.c | 21 +-
drivers/gpu/drm/msm/msm_drv.h | 73 ++-----
drivers/gpu/drm/msm/msm_fbdev.c | 1 +
drivers/gpu/drm/msm/msm_gem.c | 245 ++++++++++------------
drivers/gpu/drm/msm/msm_gem.h | 131 ++++++++++--
drivers/gpu/drm/msm/msm_gem_shrinker.c | 81 +++----
drivers/gpu/drm/msm/msm_gem_submit.c | 154 +++++++++-----
drivers/gpu/drm/msm/msm_gpu.c | 98 +++++----
drivers/gpu/drm/msm/msm_gpu.h | 5 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 3 +-
drivers/gpu/drm/msm/msm_ringbuffer.h | 13 +-
18 files changed, 459 insertions(+), 396 deletions(-)
--
2.26.2
Am 08.10.20 um 23:49 schrieb John Hubbard:
> On 10/8/20 4:23 AM, Christian König wrote:
>> Add the new vma_set_file() function to allow changing
>> vma->vm_file with the necessary refcount dance.
>>
>> v2: add more users of this.
>>
>> Signed-off-by: Christian König <christian.koenig(a)amd.com>
>> ---
>> drivers/dma-buf/dma-buf.c | 16 +++++-----------
>> drivers/gpu/drm/etnaviv/etnaviv_gem.c | 4 +---
>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 3 +--
>> drivers/gpu/drm/i915/gem/i915_gem_mman.c | 4 ++--
>> drivers/gpu/drm/msm/msm_gem.c | 4 +---
>> drivers/gpu/drm/omapdrm/omap_gem.c | 3 +--
>> drivers/gpu/drm/vgem/vgem_drv.c | 3 +--
>> drivers/staging/android/ashmem.c | 5 ++---
>> include/linux/mm.h | 2 ++
>> mm/mmap.c | 16 ++++++++++++++++
>> 10 files changed, 32 insertions(+), 28 deletions(-)
>
> Looks like a nice cleanup. Two comments below.
>
> ...
>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> index 3d69e51f3e4d..c9d5f1a38af3 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> @@ -893,8 +893,8 @@ int i915_gem_mmap(struct file *filp, struct
>> vm_area_struct *vma)
>> * requires avoiding extraneous references to their filp, hence
>> why
>> * we prefer to use an anonymous file for their mmaps.
>> */
>> - fput(vma->vm_file);
>> - vma->vm_file = anon;
>> + vma_set_file(vma, anon);
>> + fput(anon);
>
> That's one fput() too many, isn't it?
No, the other cases were replacing the vm_file with something
pre-allocated and also grabbed a new reference.
But this case here uses the freshly allocated anon file and so
vma_set_file() grabs another extra reference which we need to drop.
The alternative is to just keep it as it is. Opinions?
>
>
> ...
>
>> diff --git a/drivers/staging/android/ashmem.c
>> b/drivers/staging/android/ashmem.c
>> index 10b4be1f3e78..a51dc089896e 100644
>> --- a/drivers/staging/android/ashmem.c
>> +++ b/drivers/staging/android/ashmem.c
>> @@ -450,9 +450,8 @@ static int ashmem_mmap(struct file *file, struct
>> vm_area_struct *vma)
>> vma_set_anonymous(vma);
>> }
>> - if (vma->vm_file)
>> - fput(vma->vm_file);
>> - vma->vm_file = asma->file;
>> + vma_set_file(vma, asma->file);
>> + fput(asma->file);
>
> Same here: that fput() seems wrong, as it was already done within
> vma_set_file().
No, that case is correct as well. The Android code here has the matching
get_file() a few lines up, see the surrounding code.
I didn't want to replace that since it does some strange error
handling here, so the result is that we again need to drop the extra
reference.
We could also keep it like it is or maybe better put a TODO comment on it.
Regards,
Christian.
>
>
>
> thanks,
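The reference counting discussed in this thread can be followed with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: the sketch_* types and helpers are hypothetical stand-ins for struct file, struct vm_area_struct, get_file() and fput(), and the only behaviour taken from the discussion is that vma_set_file() grabs its own reference on the new file and drops the one held on the old file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and struct vm_area_struct. */
struct sketch_file {
	int refcount;
};

struct sketch_vma {
	struct sketch_file *vm_file;
};

static struct sketch_file *sketch_get_file(struct sketch_file *file)
{
	file->refcount++;
	return file;
}

static void sketch_fput(struct sketch_file *file)
{
	assert(file->refcount > 0);
	file->refcount--;
}

/*
 * Model of the helper: take a reference on the new file, install it,
 * then drop the reference the vma held on the old file.
 */
static void sketch_vma_set_file(struct sketch_vma *vma, struct sketch_file *file)
{
	struct sketch_file *old = vma->vm_file;

	assert(file != NULL);
	sketch_get_file(file);
	vma->vm_file = file;
	if (old)
		sketch_fput(old);
}
```

With a freshly allocated file that starts with one reference owned by its creator, the count is two after sketch_vma_set_file(). The creator's own sketch_fput() then leaves exactly one reference, owned by the vma, which is the situation described for the i915 and ashmem hunks.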
From: Rob Clark <robdclark(a)chromium.org>
This doesn't remove *all* the struct_mutex, but it covers the worst
of it, ie. shrinker/madvise/free/retire. The submit path still uses
struct_mutex, but it still needs *something* to serialize a portion of
the submit path, and lock_stat mostly just shows the lock contention
there being with other submits. And there are a few other bits of
struct_mutex usage in less critical paths (debugfs, etc). But this
seems like a reasonable step in the right direction.
Rob Clark (14):
drm/msm: Use correct drm_gem_object_put() in fail case
drm/msm: Drop chatty trace
drm/msm: Move update_fences()
drm/msm: Add priv->mm_lock to protect active/inactive lists
drm/msm: Document and rename preempt_lock
drm/msm: Protect ring->submits with it's own lock
drm/msm: Refcount submits
drm/msm: Remove obj->gpu
drm/msm: Drop struct_mutex from the retire path
drm/msm: Drop struct_mutex in free_object() path
drm/msm: remove msm_gem_free_work
drm/msm: drop struct_mutex in madvise path
drm/msm: Drop struct_mutex in shrinker path
drm/msm: Don't implicit-sync if only a single ring
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 12 +--
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 +-
drivers/gpu/drm/msm/msm_debugfs.c | 7 ++
drivers/gpu/drm/msm/msm_drv.c | 15 +---
drivers/gpu/drm/msm/msm_drv.h | 19 +++--
drivers/gpu/drm/msm/msm_gem.c | 76 ++++++------------
drivers/gpu/drm/msm/msm_gem.h | 53 +++++++++----
drivers/gpu/drm/msm/msm_gem_shrinker.c | 58 ++------------
drivers/gpu/drm/msm/msm_gem_submit.c | 17 ++--
drivers/gpu/drm/msm/msm_gpu.c | 96 ++++++++++++++---------
drivers/gpu/drm/msm/msm_gpu.h | 5 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 3 +-
drivers/gpu/drm/msm/msm_ringbuffer.h | 13 ++-
14 files changed, 188 insertions(+), 194 deletions(-)
--
2.26.2
On Thu, Oct 01, 2020 at 07:28:27PM +0200, Alexandre Bailon wrote:
> Hi Daniel,
>
> On 10/1/20 10:48 AM, Daniel Vetter wrote:
> > On Wed, Sep 30, 2020 at 01:53:46PM +0200, Alexandre Bailon wrote:
> > > This adds a RPMsg driver that implements communication between the CPU and an
> > > APU.
> > > This uses VirtIO buffer to exchange messages but for sharing data, this uses
> > > a dmabuf, mapped to be shared between CPU (userspace) and APU.
> > > The driver is relatively generic, and should work with any SoC implementing
> > > hardware accelerator for AI if they support remoteproc and VirtIO.
> > >
> > > For the people interested by the firmware or userspace library,
> > > the sources are available here:
> > > https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu
> > Since this has open userspace (from a very cursory look), and smells very
> > much like an acceleration driver, and seems to use dma-buf for memory
> > management: Why is this not just a drm driver?
>
> I never thought of DRM, since for me it was only an RPMsg driver.
> I don't know DRM well. Could you tell me how you would do it so I could have
> a look?
Well internally it would still be an rpmsg driver ... I'm assuming that's
kinda similar to how most gpu drivers sit on top of a pci_device or a
platform_device, it's just a means to get at your "device"?
The part I'm talking about here is the userspace api. You're creating an
entirely new chardev interface, which at least from a quick look seems to
be based on dma-buf buffers and used to submit commands to your device to
do some kind of computing/processing. That's exactly what drivers/gpu/drm
does (if you ignore the display/modeset side of things) - at the kernel
level gpus have nothing to do with graphics, but all with handling buffer
objects and throwing workloads at some kind of accelerator thing.
Of course that's just my guess of what's going on, after scrolling through
your driver and userspace a bit, I might be completely off. But if my
guess is roughly right, then your driver is internally an rpmsg
driver, but towards userspace it should be a drm driver.
Cheers, Daniel
>
> Thanks,
> Alexandre
>
> > -Daniel
> >
> > > Alexandre Bailon (3):
> > > Add a RPMSG driver for the APU in the mt8183
> > > rpmsg: apu_rpmsg: update the way to store IOMMU mapping
> > > rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
> > >
> > > Julien STEPHAN (1):
> > > rpmsg: apu_rpmsg: Add support for async apu request
> > >
> > > drivers/rpmsg/Kconfig | 9 +
> > > drivers/rpmsg/Makefile | 1 +
> > > drivers/rpmsg/apu_rpmsg.c | 752 +++++++++++++++++++++++++++++++++
> > > drivers/rpmsg/apu_rpmsg.h | 52 +++
> > > include/uapi/linux/apu_rpmsg.h | 47 +++
> > > 5 files changed, 861 insertions(+)
> > > create mode 100644 drivers/rpmsg/apu_rpmsg.c
> > > create mode 100644 drivers/rpmsg/apu_rpmsg.h
> > > create mode 100644 include/uapi/linux/apu_rpmsg.h
> > >
> > > --
> > > 2.26.2
> > >
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Wed, Sep 30, 2020 at 01:53:46PM +0200, Alexandre Bailon wrote:
> This adds a RPMsg driver that implements communication between the CPU and an
> APU.
> This uses VirtIO buffer to exchange messages but for sharing data, this uses
> a dmabuf, mapped to be shared between CPU (userspace) and APU.
> The driver is relatively generic, and should work with any SoC implementing
> hardware accelerator for AI if they support remoteproc and VirtIO.
>
> For the people interested by the firmware or userspace library,
> the sources are available here:
> https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu
Since this has open userspace (from a very cursory look), and smells very
much like an acceleration driver, and seems to use dma-buf for memory
management: Why is this not just a drm driver?
-Daniel
>
> Alexandre Bailon (3):
> Add a RPMSG driver for the APU in the mt8183
> rpmsg: apu_rpmsg: update the way to store IOMMU mapping
> rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
>
> Julien STEPHAN (1):
> rpmsg: apu_rpmsg: Add support for async apu request
>
> drivers/rpmsg/Kconfig | 9 +
> drivers/rpmsg/Makefile | 1 +
> drivers/rpmsg/apu_rpmsg.c | 752 +++++++++++++++++++++++++++++++++
> drivers/rpmsg/apu_rpmsg.h | 52 +++
> include/uapi/linux/apu_rpmsg.h | 47 +++
> 5 files changed, 861 insertions(+)
> create mode 100644 drivers/rpmsg/apu_rpmsg.c
> create mode 100644 drivers/rpmsg/apu_rpmsg.h
> create mode 100644 include/uapi/linux/apu_rpmsg.h
>
> --
> 2.26.2
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Hi Alex,
On 22.09.2020 01:15, Alex Goins wrote:
> Tested-by: Alex Goins <agoins(a)nvidia.com>
>
> This change fixes a regression with drm_prime_sg_to_page_addr_arrays() and
> AMDGPU in v5.9.
Thanks for testing!
> Commit 39913934 similarly revamped AMDGPU to use sgtable helper functions. When
> it changed from dma_map_sg_attrs() to dma_map_sgtable(), as a side effect it
> started correctly updating sgt->nents to the return value of dma_map_sg_attrs().
> However, drm_prime_sg_to_page_addr_arrays() incorrectly uses sgt->nents to
> iterate over pages, rather than sgt->orig_nents, resulting in it now returning
> the incorrect number of pages on AMDGPU.
>
> I had written a patch that changes drm_prime_sg_to_page_addr_arrays() to use
> for_each_sgtable_sg() instead of for_each_sg(), iterating using sgt->orig_nents:
>
> - for_each_sg(sgt->sgl, sg, sgt->nents, count) {
> + for_each_sgtable_sg(sgt, sg, count) {
>
> This patch takes it further, but still has the effect of fixing the number of
> pages that drm_prime_sg_to_page_addr_arrays() returns. Something like this
> should be included in v5.9 to prevent a regression with AMDGPU.
Probably the easiest way to handle a fix for v5.9 would be to simply
merge the latest version of this patch also to v5.9-rcX:
https://lore.kernel.org/dri-devel/20200904131711.12950-3-m.szyprowski@samsu…
This way we would get it fixed and avoid possible conflict in the -next.
Do you have any AMDGPU fixes for v5.9 in the queue? Maybe you can add
that patch to the queue? Dave: would it be okay that way?
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
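The nents versus orig_nents mix-up described above can be reproduced with a small userspace model. This is a sketch only, not the DRM code: the sketch_* structures are hypothetical simplifications of struct scatterlist and struct sg_table, a 4 KiB page size is assumed, and the merging of entries during DMA mapping is represented simply by a smaller nents value.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical CPU-side scatterlist entry: only the length matters here. */
struct sketch_sg {
	size_t length;
};

struct sketch_sg_table {
	const struct sketch_sg *sgl;
	unsigned int nents;      /* DMA-mapped entries, may shrink after merging */
	unsigned int orig_nents; /* CPU-side entries describing the pages */
};

/* Sum the pages described by the first 'entries' CPU-side entries. */
static size_t sketch_count_pages(const struct sketch_sg_table *sgt,
				 unsigned int entries)
{
	size_t pages = 0;

	assert(entries <= sgt->orig_nents);
	for (unsigned int i = 0; i < entries; i++)
		pages += sgt->sgl[i].length / SKETCH_PAGE_SIZE;
	return pages;
}
```

If four two-page entries are merged into two DMA entries, walking the CPU-side list with orig_nents finds all eight pages, while walking it with nents stops after four. That is the kind of shortfall reported for drm_prime_sg_to_page_addr_arrays().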
Am 28.09.20 um 09:37 schrieb Thomas Zimmermann:
> Hi
>
> Am 28.09.20 um 08:50 schrieb Christian König:
>> Am 27.09.20 um 21:16 schrieb Sam Ravnborg:
>>> Hi Thomas.
>>>
>>>>> struct simap {
>>>>> union {
>>>>> void __iomem *vaddr_iomem;
>>>>> void *vaddr;
>>>>> };
>>>>> bool is_iomem;
>>>>> };
>>>>>
>>>>> Where simap is a shorthand for system_iomem_map
>>>>> And it could all be stuffed into a include/linux/simap.h file.
>>>>>
>>>>> Not totally sold on the simap name - but wanted to come up with
>>>>> something.
>>>> Yes. Others, myself included, have suggested to use a name that does not
>>>> imply a connection to the dma-buf framework, but no one has come up with
>>>> a good name.
>>>>
>>>> I strongly dislike simap, as it's entirely non-obvious what it does.
>>>> dma-buf-map is not actually wrong. The structure represents the mapping
>>>> of a dma-able buffer in most cases.
>>>>
>>>>> With this approach users do not have to pull in dma-buf to use it and
>>>>> users will not confuse that this is only for dma-buf usage.
>>>> There's no need to enable dma-buf. It's all in the header file without
>>>> dependencies on dma-buf. It's really just the name.
>>>>
>>>> But there's something else to take into account. The whole issue here is
>>>> that the buffer is disconnected from its originating driver, so we don't
>>>> know which kind of memory ops we have to use. Thinking about it, I
>>>> realized that no one else seemed to have this problem until now.
>>>> Otherwise there would be a solution already. So maybe the dma-buf
>>>> framework *is* the native use case for this data structure.
>>> We have at least:
>>> linux/fb.h:
>>> union {
>>> char __iomem *screen_base; /* Virtual address */
>>> char *screen_buffer;
>>> };
>>>
>>> Which solves more or less the same problem.
> I thought this was for convenience. The important is_iomem bit is missing.
>
>> I also already noted that in TTM we have exactly the same problem and a
>> whole bunch of helpers to allow operations on those pointers.
> How do you call this within TTM?
ttm_bus_placement, but I really don't like that name.
>
> The data structure represents a pointer to either system or I/O memory,
> but not necessarily device memory. It contains raw data. That would
> give something like
>
> struct databuf_map
> struct databuf_ptr
> struct dbuf_map
> struct dbuf_ptr
>
> My favorite would be dbuf_ptr. It's short and the API names would make
> sense: dbuf_ptr_clear() for clearing, dbuf_ptr_set_vaddr() to set an
> address, dbuf_ptr_incr() to increment, etc. Also, the _ptr indicates
> that it's a single address; not an offset with length.
Puh, no idea. All of that doesn't sound like it 100% hits the underlying
meaning of the structure.
Christian.
>
> Best regards
> Thomas
>
>> Christian.
>>
>>>
>>>> Anyway, if a better name than dma-buf-map comes in, I'm willing to
>>>> rename the thing. Otherwise I intend to merge the patchset by the end of
>>>> the week.
>>> Well, the main thing is that I think this should be moved away from
>>> dma-buf. But if indeed dma-buf is the only relevant user in drm then
>>> I am totally fine with the current naming.
>>>
>>> One alternative name that popped up in my head: struct sys_io_map {}
>>> But again, if this is kept in dma-buf then I am fine with the current
>>> naming.
>>>
>>> In other words, if you continue to think this is mostly a dma-buf
>>> thing all three patches are:
>>> Acked-by: Sam Ravnborg <sam(a)ravnborg.org>
>>>
>>> Sam
Hi Thomas.
> >
> > struct simap {
> > union {
> > void __iomem *vaddr_iomem;
> > void *vaddr;
> > };
> > bool is_iomem;
> > };
> >
> > Where simap is a shorthand for system_iomem_map
> > And it could all be stuffed into a include/linux/simap.h file.
> >
> > Not totally sold on the simap name - but wanted to come up with
> > something.
>
> Yes. Others, myself included, have suggested to use a name that does not
> imply a connection to the dma-buf framework, but no one has come up with
> a good name.
>
> I strongly dislike simap, as it's entirely non-obvious what it does.
> dma-buf-map is not actually wrong. The structure represents the mapping
> of a dma-able buffer in most cases.
>
> >
> > With this approach users do not have to pull in dma-buf to use it and
> > users will not confuse that this is only for dma-buf usage.
>
> There's no need to enable dma-buf. It's all in the header file without
> dependencies on dma-buf. It's really just the name.
>
> But there's something else to take into account. The whole issue here is
> that the buffer is disconnected from its originating driver, so we don't
> know which kind of memory ops we have to use. Thinking about it, I
> realized that no one else seemed to have this problem until now.
> Otherwise there would be a solution already. So maybe the dma-buf
> framework *is* the native use case for this data structure.
We have at least:
linux/fb.h:
union {
char __iomem *screen_base; /* Virtual address */
char *screen_buffer;
};
Which solves more or less the same problem.
> Anyway, if a better name than dma-buf-map comes in, I'm willing to
> rename the thing. Otherwise I intend to merge the patchset by the end of
> the week.
Well, the main thing is that I think this should be moved away from
dma-buf. But if indeed dma-buf is the only relevant user in drm then
I am totally fine with the current naming.
One alternative name that popped up in my head: struct sys_io_map {}
But again, if this is kept in dma-buf then I am fine with the current
naming.
In other words, if you continue to think this is mostly a dma-buf
thing all three patches are:
Acked-by: Sam Ravnborg <sam(a)ravnborg.org>
Sam
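To make the naming discussion concrete, here is a minimal userspace sketch of the structure under debate together with the kind of helpers mentioned in the thread (clear, set an address, test for NULL). It is an illustration under assumptions, not the merged header: the sketch_buf_map name and all helper names are hypothetical, and SKETCH_IOMEM is an empty placeholder for the kernel's __iomem annotation, which only has meaning for sparse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder for the kernel's __iomem annotation; it expands to nothing here. */
#define SKETCH_IOMEM

/* A pointer to either system memory or I/O memory, plus a flag saying which. */
struct sketch_buf_map {
	union {
		void SKETCH_IOMEM *vaddr_iomem;
		void *vaddr;
	};
	bool is_iomem;
};

static void sketch_buf_map_clear(struct sketch_buf_map *map)
{
	assert(map != NULL);
	map->vaddr = NULL;
	map->is_iomem = false;
}

static void sketch_buf_map_set_vaddr(struct sketch_buf_map *map, void *vaddr)
{
	map->vaddr = vaddr;
	map->is_iomem = false;
}

static void sketch_buf_map_set_vaddr_iomem(struct sketch_buf_map *map,
					   void SKETCH_IOMEM *vaddr_iomem)
{
	map->vaddr_iomem = vaddr_iomem;
	map->is_iomem = true;
}

static bool sketch_buf_map_is_null(const struct sketch_buf_map *map)
{
	return map->is_iomem ? !map->vaddr_iomem : !map->vaddr;
}
```

The is_iomem flag is what the bare union in linux/fb.h lacks: a consumer that did not create the mapping can still decide which kind of memory access it has to use.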
On Fri, Sep 25, 2020 at 04:51:38PM +0800, Tian Tao wrote:
> drm_modeset_lock.h already declares struct drm_device, so there's no
> need to declare it in vc4_drv.h
>
> Signed-off-by: Tian Tao <tiantao6(a)hisilicon.com>
Just an aside, when submitting patches please use
scripts/get_maintainer.pl to generate the recipient list. Looking through
past few patches from you it seems fairly arbitrary and often misses the
actual maintainers for a given piece of code, which increases the odds the
patch will get lost a lot.
E.g. for this one I'm only like the 5th or so fallback person, and the
main maintainer isn't on the recipient list.
Cheers, Daniel
> ---
> drivers/gpu/drm/vc4/vc4_drv.h | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> index 8c8d96b..8717a1c 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.h
> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> @@ -19,7 +19,6 @@
>
> #include "uapi/drm/vc4_drm.h"
>
> -struct drm_device;
> struct drm_gem_object;
>
> /* Don't forget to update vc4_bo.c: bo_type_names[] when adding to
> --
> 2.7.4
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
A NULL pointer dereference is observed when exporting a dmabuf fails to
allocate the 'struct file'. The failure drops the dentry already allocated
for this file in the dmabuf fs, which ends up in dma_buf_release()
accessing the uninitialized dentry->d_fsdata.
Call stack on 5.4 is below:
dma_buf_release+0x2c/0x254 drivers/dma-buf/dma-buf.c:88
__dentry_kill+0x294/0x31c fs/dcache.c:584
dentry_kill fs/dcache.c:673 [inline]
dput+0x250/0x380 fs/dcache.c:859
path_put+0x24/0x40 fs/namei.c:485
alloc_file_pseudo+0x1a4/0x200 fs/file_table.c:235
dma_buf_getfile drivers/dma-buf/dma-buf.c:473 [inline]
dma_buf_export+0x25c/0x3ec drivers/dma-buf/dma-buf.c:585
Fix this by checking that dentry->d_fsdata holds a valid pointer.
Fixes: 4ab59c3c638c ("dma-buf: Move dma_buf_release() from fops to dentry_ops")
Cc: <stable(a)vger.kernel.org> [5.7+]
Signed-off-by: Charan Teja Reddy <charante(a)codeaurora.org>
---
drivers/dma-buf/dma-buf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 58564d82..844967f 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -59,6 +59,8 @@ static void dma_buf_release(struct dentry *dentry)
struct dma_buf *dmabuf;
dmabuf = dentry->d_fsdata;
+ if (unlikely(!dmabuf))
+ return;
BUG_ON(dmabuf->vmapping_counter);
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
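The failure path fixed by this patch can be modelled in a few lines of userspace C. This is a sketch only: the sketch_* types are hypothetical stand-ins for struct dentry and struct dma_buf, and the boolean return value is added purely so the behaviour can be observed (the kernel callback returns void).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct dma_buf and struct dentry. */
struct sketch_dmabuf {
	int vmapping_counter;
	bool released;
};

struct sketch_dentry {
	void *d_fsdata; /* stays NULL if the export failed before it was set */
};

/* Returns true if a buffer was actually released. */
static bool sketch_dma_buf_release(struct sketch_dentry *dentry)
{
	struct sketch_dmabuf *dmabuf;

	assert(dentry != NULL);
	dmabuf = dentry->d_fsdata;
	if (!dmabuf)
		return false; /* nothing was attached, so there is nothing to free */

	assert(dmabuf->vmapping_counter == 0); /* mirrors the BUG_ON() */
	dmabuf->released = true;
	return true;
}
```

A dentry whose d_fsdata was never filled in is now skipped, instead of being dereferenced as in the call stack shown above.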
Hi Andrew,
I'm the new DMA-buf maintainer and Daniel and others came up with patches extending the use of the dma_buf_mmap() function.
Now this function is doing something a bit odd by changing the vma->vm_file while installing a VMA in the mmap() system call.
The background here is that DMA-buf allows device drivers to export buffers which are then imported into another device driver. The mmap() handler of the importing device driver then finds that the pgoff belongs to the exporting device and so redirects the mmap() call there.
In other words, user space calls mmap() on one file descriptor, but gets a different one mapped into its virtual address space.
My question is now: Is that legal or can you think of something which breaks here?
If it's not legal we should probably block any new users of the dma_buf_mmap() function and consider what should happen with the two existing ones.
If that is legal I would like to document this by adding a new vma_set_file() function which does the necessary reference count dance.
Thanks in advance,
Christian.
GPU drivers need this in their shrinkers, to be able to throw out
mmap'ed buffers. Note that we also need dma_resv_lock in shrinkers,
but that loop is resolved by trylocking in shrinkers.
So full hierarchy is now (ignore some of the other branches we already
have primed):
mmap_read_lock -> dma_resv -> shrinkers -> i_mmap_lock_write
I hope that's not inconsistent with anything mm or fs does, adding
relevant people.
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: linux-xfs(a)vger.kernel.org
Cc: linux-fsdevel(a)vger.kernel.org
Cc: Thomas Hellström (Intel) <thomas_os(a)shipmail.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: linux-mm(a)kvack.org
Cc: linux-rdma(a)vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
---
drivers/dma-buf/dma-resv.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 0e6675ec1d11..9678162a4ac5 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -104,12 +104,14 @@ static int __init dma_resv_lockdep(void)
struct mm_struct *mm = mm_alloc();
struct ww_acquire_ctx ctx;
struct dma_resv obj;
+ struct address_space mapping;
int ret;
if (!mm)
return -ENOMEM;
dma_resv_init(&obj);
+ address_space_init_once(&mapping);
mmap_read_lock(mm);
ww_acquire_init(&ctx, &reservation_ww_class);
@@ -117,6 +119,9 @@ static int __init dma_resv_lockdep(void)
if (ret == -EDEADLK)
dma_resv_lock_slow(&obj, &ctx);
fs_reclaim_acquire(GFP_KERNEL);
+ /* for unmap_mapping_range on trylocked buffer objects in shrinkers */
+ i_mmap_lock_write(&mapping);
+ i_mmap_unlock_write(&mapping);
#ifdef CONFIG_MMU_NOTIFIER
lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
__dma_fence_might_wait();
--
2.27.0
Am 09.09.20 um 09:57 schrieb Tian Tao:
> Fix kernel-doc warnings.
> drivers/gpu/drm/scheduler/sched_fence.c:110: warning: Function parameter or
> member 'f' not described in 'drm_sched_fence_release_scheduled'
> drivers/gpu/drm/scheduler/sched_fence.c:110: warning: Excess function
> parameter 'fence' description in 'drm_sched_fence_release_scheduled'
>
> Signed-off-by: Tian Tao <tiantao6(a)hisilicon.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
> ---
> drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index 8b45c3a1b84..69de2c7 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -101,7 +101,7 @@ static void drm_sched_fence_free(struct rcu_head *rcu)
> /**
> * drm_sched_fence_release_scheduled - callback that fence can be freed
> *
> - * @fence: fence
> + * @f: fence
> *
> * This function is called when the reference count becomes zero.
> * It just RCU schedules freeing up the fence.
Hi Tomasz,
On 07.09.2020 15:07, Tomasz Figa wrote:
> On Fri, Sep 4, 2020 at 3:35 PM Marek Szyprowski
> <m.szyprowski(a)samsung.com> wrote:
>> Use recently introduced common wrappers operating directly on the struct
>> sg_table objects and scatterlist page iterators to make the code a bit
>> more compact, robust, easier to follow and copy/paste safe.
>>
>> No functional change, because the code already properly did all the
>> scatterlist related calls.
>>
>> Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
>> Reviewed-by: Robin Murphy <robin.murphy(a)arm.com>
>> ---
>> .../common/videobuf2/videobuf2-dma-contig.c | 34 ++++++++-----------
>> .../media/common/videobuf2/videobuf2-dma-sg.c | 32 +++++++----------
>> .../common/videobuf2/videobuf2-vmalloc.c | 12 +++----
>> 3 files changed, 31 insertions(+), 47 deletions(-)
>>
> Thanks for the patch! Please see my comments inline.
>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> index ec3446cc45b8..1b242d844dde 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> @@ -58,10 +58,10 @@ static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)
>> unsigned int i;
>> unsigned long size = 0;
>>
>> - for_each_sg(sgt->sgl, s, sgt->nents, i) {
>> + for_each_sgtable_dma_sg(sgt, s, i) {
>> if (sg_dma_address(s) != expected)
>> break;
>> - expected = sg_dma_address(s) + sg_dma_len(s);
>> + expected += sg_dma_len(s);
>> size += sg_dma_len(s);
>> }
>> return size;
>> @@ -103,8 +103,7 @@ static void vb2_dc_prepare(void *buf_priv)
>> if (!sgt)
>> return;
>>
>> - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
>> - buf->dma_dir);
>> + dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
>> }
>>
>> static void vb2_dc_finish(void *buf_priv)
>> @@ -115,7 +114,7 @@ static void vb2_dc_finish(void *buf_priv)
>> if (!sgt)
>> return;
>>
>> - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
>> + dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
>> }
>>
>> /*********************************************/
>> @@ -275,8 +274,8 @@ static void vb2_dc_dmabuf_ops_detach(struct dma_buf *dbuf,
>> * memory locations do not require any explicit cache
>> * maintenance prior or after being used by the device.
>> */
>> - dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
>> - attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
>> + dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
>> + DMA_ATTR_SKIP_CPU_SYNC);
>> sg_free_table(sgt);
>> kfree(attach);
>> db_attach->priv = NULL;
>> @@ -301,8 +300,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
>>
>> /* release any previous cache */
>> if (attach->dma_dir != DMA_NONE) {
>> - dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
>> - attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
>> + dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
>> + DMA_ATTR_SKIP_CPU_SYNC);
>> attach->dma_dir = DMA_NONE;
>> }
>>
>> @@ -310,9 +309,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
>> * mapping to the client with new direction, no cache sync
>> * required see comment in vb2_dc_dmabuf_ops_detach()
>> */
>> - sgt->nents = dma_map_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
>> - dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
>> - if (!sgt->nents) {
>> + if (dma_map_sgtable(db_attach->dev, sgt, dma_dir,
>> + DMA_ATTR_SKIP_CPU_SYNC)) {
>> pr_err("failed to map scatterlist\n");
>> mutex_unlock(lock);
>> return ERR_PTR(-EIO);
> As opposed to dma_map_sg_attrs(), dma_map_sgtable() now returns an
> error code on its own. Is it expected to ignore it and return -EIO?
Those errors are more or less propagated to userspace, and -EIO is
already widely documented in the V4L2 documentation as the error code for
most of the V4L2 ioctls. I don't want to change it. A possible
-EINVAL returned from dma_map_sgtable() is just one of the 'generic'
error codes and is not very descriptive in that case. Probably the main
problem here is that dma_map_sg() and friends don't return any error
codes...
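To illustrate the choice described above, here is a minimal userspace sketch, not the videobuf2 code itself. The names model_map_sgtable() and model_vb2_map() are hypothetical stand-ins; the only point is that whatever negative errno the mapping step reports, the caller hands the documented -EIO to userspace.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a mapping call that, like dma_map_sgtable(),
 * returns 0 on success or a negative errno on failure.
 */
static int model_map_sgtable(int simulate_failure)
{
	return simulate_failure ? -EINVAL : 0;
}

/*
 * Hypothetical caller: any mapping failure is collapsed to -EIO, because
 * that is the error code the reply says is documented for V4L2 ioctls.
 */
static int model_vb2_map(int simulate_failure)
{
	if (model_map_sgtable(simulate_failure))
		return -EIO;
	return 0;
}
```

The trade-off is that the more specific code from the mapping layer is lost, which the reply accepts in exchange for a stable userspace contract.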
> ...
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Dear All,
During the Exynos DRM GEM rework and fixing the issues in the
drm_prime_sg_to_page_addr_arrays() function [1] I've noticed that most
drivers in the DRM framework incorrectly use the nents and orig_nents
entries of the struct sg_table.
In most DMA-mapping implementations, exchanging those two
entries or using nents for all loops on the scatterlist is harmless,
because they both have the same value. There are, however, DMA-mapping
implementations for which such incorrect usage breaks things. The nents
returned by dma_map_sg() might be lower than the nents passed as its
parameter and this is perfectly fine. The DMA framework or IOMMU is allowed
to join consecutive chunks while mapping if such an operation is supported
by the underlying HW (bus, bridge, IOMMU, etc). An example of the case
where dma_map_sg() might return 1 'DMA' chunk for the 4 'physical' pages
is described here [2].
The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the number of the created entries in the DMA address space.
However, the subsequent calls to dma_sync_sg_for_{device,cpu} and
dma_unmap_sg must be made with the original number of entries passed to
dma_map_sg. The common pattern in DRM drivers was to assign the
dma_map_sg() return value to sg_table->nents and use that value for
the subsequent calls to dma_sync_sg_* or dma_unmap_sg functions. Also,
the code iterated nents times to access the pages stored in the
processed scatterlist, while it should use orig_nents as the number of
the page entries.
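The distinction between the two counters can be sketched with a small userspace model. This is an illustration only, not kernel code: struct model_sg, struct model_sgt and model_map() are hypothetical stand-ins for struct scatterlist, struct sg_table and dma_map_sg(), and the mapper simply assumes an IOMMU that places all chunks back to back in device address space.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for one scatterlist entry; NOT the kernel struct. */
struct model_sg {
	unsigned long phys;     /* physical address of the CPU chunk */
	unsigned long length;   /* length of the CPU chunk in bytes */
	unsigned long dma_addr; /* valid only for the first nents entries */
	unsigned long dma_len;  /* valid only for the first nents entries */
};

/* Mirrors the two counters of struct sg_table. */
struct model_sgt {
	struct model_sg *sgl;
	unsigned int nents;      /* DMA-mapped entries */
	unsigned int orig_nents; /* CPU page entries */
};

/*
 * Hypothetical IOMMU-like mapper: places all chunks back to back in a
 * device address range starting at iova, so the result is one DMA segment.
 * Returns the number of DMA entries created (0 for an empty table).
 */
static unsigned int model_map(struct model_sgt *sgt, unsigned long iova)
{
	unsigned long total = 0;
	unsigned int i;

	assert(sgt != NULL);
	if (sgt->orig_nents == 0)
		return 0;
	for (i = 0; i < sgt->orig_nents; i++)
		total += sgt->sgl[i].length;
	sgt->sgl[0].dma_addr = iova;
	sgt->sgl[0].dma_len = total;
	sgt->nents = 1;
	return sgt->nents;
}

/* CPU-side walk: always bounded by orig_nents. */
static unsigned long model_cpu_bytes(const struct model_sgt *sgt)
{
	unsigned long bytes = 0;
	unsigned int i;

	for (i = 0; i < sgt->orig_nents; i++)
		bytes += sgt->sgl[i].length;
	return bytes;
}

/* Device-side walk: always bounded by nents. */
static unsigned long model_dma_bytes(const struct model_sgt *sgt)
{
	unsigned long bytes = 0;
	unsigned int i;

	for (i = 0; i < sgt->nents; i++)
		bytes += sgt->sgl[i].dma_len;
	return bytes;
}
```

With four scattered 4 KiB pages, model_map() leaves orig_nents at 4 and sets nents to 1. Walking the CPU side with nents would then visit only the first page, which is the class of bug described above.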
I've tried to identify all such incorrect usage of sg_table->nents and
this is the result of my research. It looks like the incorrect pattern has
been copied across many drivers, mainly in the DRM subsystem. Too bad that
in most cases it even worked correctly if the system used a simple, linear
DMA-mapping implementation, for which swapping nents and orig_nents
doesn't make any difference. To avoid similar issues in the future, I've
introduced common wrappers for DMA-mapping calls, which operate directly
on the sg_table objects. I've also added wrappers for iterating over the
scatterlists stored in the sg_table objects and applied them where
possible. This, together with some common DRM prime helpers, allowed me
to get rid of almost all nents/orig_nents usage in the drivers. I hope
that such a change makes the code robust, easier to follow and copy/paste
safe.
The biggest TODO is the DRM/i915 driver and I don't feel brave enough to
fix it fully. The driver creatively uses sg_table->orig_nents to store the
size of the allocated scatterlist and ignores the number of entries
returned by the dma_map_sg function. In this patchset I only fixed the
sg_table objects exported by dmabuf related functions. I hope that I
didn't break anything there.
Patches are based on top of Linux next-20200903. The required changes to
the DMA-mapping framework have already been merged to v5.9-rc3.
If possible, I would like to ask for merging most of the patches via the
DRM tree.
Best regards,
Marek Szyprowski
References:
[1] https://lkml.org/lkml/2020/3/27/555
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt
[4] https://lore.kernel.org/linux-iommu/20200512121931.GD20393@lst.de/T/#ma18c9…
Changelog:
v10:
- addressed more issues pointed out by Robin Murphy in his review:
* prime: restored WARN_ON() in drm_prime_sg_to_page_addr_arrays()
* armada: simplified cleanup path
* msm: fixed arm64 build
* omapdrm: removed WARN_ON(), which is now in drm_prime_sg_to_page_addr_arrays()
* omapdrm: dropped the incorrect nents/orig_nents patch
* media: pci: also update to use modern DMA_FROM_DEVICE definitions
- dropped merged patches
v9: https://lore.kernel.org/linux-iommu/20200826063316.23486-1-m.szyprowski@sam…
- rebased onto Linux next-20200825, which is based on v5.9-rc2; fixed conflicts
- dropped merged patches
v8:
- rapidio: fixed issues pointed out by kbuild test robot (use of
uninitialized variable)
- vb2: rebased after recent changes in the code
v7: https://lore.kernel.org/linux-iommu/20200619103636.11974-1-m.szyprowski@sam…
- changed DMA page iterators to standard DMA SG iterators in drm/prime and
videobuf2-dma-contig as suggested by Robin Murphy
- fixed build issues
v6: https://lore.kernel.org/linux-iommu/20200618153956.29558-1-m.szyprowski@sam…
- rebased onto Linux next-20200618, which is based on v5.8-rc1; fixed conflicts
v5: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprowski@sams…
- fixed some minor style issues and typos
- fixed lack of the attrs argument in ion, dmabuf, rapidio, fastrpc and
vfio patches
v4: https://lore.kernel.org/linux-iommu/20200512121931.GD20393@lst.de/T/
- added for_each_sgtable_* wrappers and applied where possible
- added drm_prime_get_contiguous_size() and applied where possible
- applied drm_prime_sg_to_page_addr_arrays() where possible to remove page
extraction from sg_table objects
- added documentation for the introduced wrappers
- improved patches description a bit
v3: https://lore.kernel.org/dri-devel/20200505083926.28503-1-m.szyprowski@samsu…
- introduce dma_*_sgtable_* wrappers and use them in all patches
v2: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed out by kbuild test robot
- added wide cc: list for each patch
v1: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- initial version
Patch summary:
Marek Szyprowski (30):
drm: prime: add common helper to check scatterlist contiguity
drm: prime: use sgtable iterators in
drm_prime_sg_to_page_addr_arrays()
drm: core: fix common struct sg_table related issues
drm: armada: fix common struct sg_table related issues
drm: etnaviv: fix common struct sg_table related issues
drm: exynos: use common helper for a scatterlist contiguity check
drm: exynos: fix common struct sg_table related issues
drm: i915: fix common struct sg_table related issues
drm: lima: fix common struct sg_table related issues
drm: mediatek: use common helper for a scatterlist contiguity check
drm: mediatek: use common helper for extracting pages array
drm: msm: fix common struct sg_table related issues
drm: omapdrm: use common helper for extracting pages array
drm: panfrost: fix common struct sg_table related issues
drm: rockchip: use common helper for a scatterlist contiguity check
drm: rockchip: fix common struct sg_table related issues
drm: tegra: fix common struct sg_table related issues
drm: v3d: fix common struct sg_table related issues
drm: virtio: fix common struct sg_table related issues
drm: vmwgfx: fix common struct sg_table related issues
drm: xen: fix common struct sg_table related issues
xen: gntdev: fix common struct sg_table related issues
drm: host1x: fix common struct sg_table related issues
drm: rcar-du: fix common struct sg_table related issues
dmabuf: fix common struct sg_table related issues
staging: tegra-vde: fix common struct sg_table related issues
rapidio: fix common struct sg_table related issues
samples: vfio-mdev/mbochs: fix common struct sg_table related issues
media: pci: fix common ALSA DMA-mapping related codes
videobuf2: use sgtable-based scatterlist wrappers
drivers/dma-buf/heaps/heap-helpers.c | 13 ++-
drivers/dma-buf/udmabuf.c | 7 +-
drivers/gpu/drm/armada/armada_gem.c | 24 +++--
drivers/gpu/drm/drm_cache.c | 2 +-
drivers/gpu/drm/drm_gem_cma_helper.c | 23 +----
drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++-
drivers/gpu/drm/drm_prime.c | 91 +++++++++++--------
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 12 +--
drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 15 +--
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 10 +-
drivers/gpu/drm/exynos/exynos_drm_gem.c | 23 +----
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 11 +--
.../gpu/drm/i915/gem/selftests/mock_dmabuf.c | 7 +-
drivers/gpu/drm/lima/lima_gem.c | 11 ++-
drivers/gpu/drm/lima/lima_vm.c | 5 +-
drivers/gpu/drm/mediatek/mtk_drm_gem.c | 37 ++------
drivers/gpu/drm/msm/msm_gem.c | 13 +--
drivers/gpu/drm/msm/msm_gpummu.c | 15 ++-
drivers/gpu/drm/msm/msm_iommu.c | 2 +-
drivers/gpu/drm/omapdrm/omap_gem.c | 14 +--
drivers/gpu/drm/panfrost/panfrost_gem.c | 4 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 7 +-
drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 3 +-
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 42 +++------
drivers/gpu/drm/tegra/gem.c | 27 ++----
drivers/gpu/drm/tegra/plane.c | 15 +--
drivers/gpu/drm/v3d/v3d_mmu.c | 13 ++-
drivers/gpu/drm/virtio/virtgpu_object.c | 36 +++++---
drivers/gpu/drm/virtio/virtgpu_vq.c | 12 +--
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 17 +---
drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +-
drivers/gpu/host1x/job.c | 22 ++---
.../common/videobuf2/videobuf2-dma-contig.c | 34 +++----
.../media/common/videobuf2/videobuf2-dma-sg.c | 32 +++----
.../common/videobuf2/videobuf2-vmalloc.c | 12 +--
drivers/media/pci/cx23885/cx23885-alsa.c | 4 +-
drivers/media/pci/cx25821/cx25821-alsa.c | 4 +-
drivers/media/pci/cx88/cx88-alsa.c | 6 +-
drivers/media/pci/saa7134/saa7134-alsa.c | 4 +-
drivers/media/platform/vsp1/vsp1_drm.c | 8 +-
drivers/rapidio/devices/rio_mport_cdev.c | 11 +--
drivers/staging/media/tegra-vde/iommu.c | 4 +-
drivers/xen/gntdev-dmabuf.c | 13 ++-
include/drm/drm_prime.h | 2 +
samples/vfio-mdev/mbochs.c | 3 +-
45 files changed, 280 insertions(+), 406 deletions(-)
--
2.17.1
Hi Tomi,
On 02.09.2020 10:00, Tomi Valkeinen wrote:
> On 01/09/2020 22:33, Robin Murphy wrote:
>> On 2020-08-26 07:32, Marek Szyprowski wrote:
>>> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
>>> returns the number of the created entries in the DMA address space.
>>> However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
>>> dma_unmap_sg must be called with the original number of the entries
>>> passed to the dma_map_sg().
>>>
>>> struct sg_table is a common structure used for describing a non-contiguous
>>> memory buffer, used commonly in the DRM and graphics subsystems. It
>>> consists of a scatterlist with memory pages and DMA addresses (sgl entry),
>>> as well as the number of scatterlist entries: CPU pages (orig_nents entry)
>>> and DMA mapped pages (nents entry).
>>>
>>> It turned out that it was a common mistake to misuse nents and orig_nents
>>> entries, calling DMA-mapping functions with a wrong number of entries or
>>> ignoring the number of mapped entries returned by the dma_map_sg()
>>> function.
>>>
>>> Fix the code to refer to proper nents or orig_nents entries. This driver
>>> checks for a buffer contiguity in DMA address space, so it should test
>>> sg_table->nents entry.
>>>
>>> Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
>>> ---
>>> drivers/gpu/drm/omapdrm/omap_gem.c | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
>>> index ff0c4b0c3fd0..a7a9a0afe2b6 100644
>>> --- a/drivers/gpu/drm/omapdrm/omap_gem.c
>>> +++ b/drivers/gpu/drm/omapdrm/omap_gem.c
>>> @@ -48,7 +48,7 @@ struct omap_gem_object {
>>> * OMAP_BO_MEM_DMA_API flag set)
>>> *
>>> * - buffers imported from dmabuf (with the OMAP_BO_MEM_DMABUF flag set)
>>> - * if they are physically contiguous (when sgt->orig_nents == 1)
>>> + * if they are physically contiguous (when sgt->nents == 1)
>> Hmm, if this really does mean *physically* contiguous - i.e. if buffers might be shared between
>> DMA-translatable and non-DMA-translatable devices - then these changes might not be appropriate. If
>> not and it only actually means DMA-contiguous, then it would be good to clarify the comments to that
>> effect.
>>
>> Can anyone familiar with omapdrm clarify what exactly the case is here? I know that IOMMUs might be
>> involved to some degree, and I've skimmed the interconnect chapters of enough OMAP TRMs to be scared
>> by the reference to the tiler aperture in the context below :)
> DSS (like many other IPs in OMAP) does not have any MMU/PAT, and can only use contiguous buffers
> (contiguous in the RAM).
>
> There's a special case with TILER (which is not part of DSS but of the memory subsystem, but it's
> still handled internally by the omapdrm driver), which has a PAT. PAT can create a contiguous view
> of scattered pages, and DSS can then use this contiguous view ("tiler aperture", which to DSS looks
> just like normal contiguous memory).
>
> Note that omapdrm does not use dma_map_sg() & co. mentioned in the patch description.
>
> If there's no MMU/PAT, is orig_nents always the same as nents? Or can we have multiple physically
> contiguous pages listed separately in the sgt (so orig_nents > 1) but as the pages form one big
> contiguous area, nents == 1?
Well, when the DMA-mapping API is properly used, nents and orig_nents
differ only once the scatterlist has been mapped for DMA.
For the mentioned case, even if the creator of the buffer used only
pages that are consecutive in physical memory, he is free to choose
either to set nents/orig_nents to 1 and length to n*PAGE_SIZE or to set
nents/orig_nents to n and length to PAGE_SIZE for each. Then the buffer
chunks might be merged, but this is done by the DMA-mapping code. For
your case, without any call to DMA-mapping, you can only assume that the
buffer is contiguous in physical memory if orig_nents is 1.
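The two ways of describing the same physically consecutive pages can be sketched in a small userspace model. It is only an illustration under stated assumptions: struct model_chunk and both helpers are hypothetical, and omapdrm does not necessarily implement either check this way. The first helper is the conservative test from the paragraph above, while the second walks the unmapped entries and checks that each one starts where the previous one ended.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the CPU side of one scatterlist entry. */
struct model_chunk {
	unsigned long phys;   /* physical start address */
	unsigned long length; /* length in bytes */
};

/* Conservative check: a single CPU entry is contiguous by construction. */
static bool model_single_entry(unsigned int orig_nents)
{
	return orig_nents == 1;
}

/*
 * Stricter walk over the unmapped entries (bounded by orig_nents):
 * contiguous only if every chunk begins where the previous one ended.
 */
static bool model_phys_contiguous(const struct model_chunk *c,
				  unsigned int orig_nents)
{
	unsigned long expected;
	unsigned int i;

	if (c == NULL || orig_nents == 0)
		return false;
	expected = c[0].phys;
	for (i = 0; i < orig_nents; i++) {
		if (c[i].phys != expected)
			return false;
		expected += c[i].length;
	}
	return true;
}
```

For two adjacent pages listed separately, the single-entry test rejects the buffer while the walk accepts it, which is the case Tomi asks about.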
I've changed the use of nents to orig_nents to make things consistent -
this code operates only on the unmapped buffers. I want to ensure that
anyone who potentially copies this code won't make the
nents/orig_nents mistake in the future.
If you don't like it, we can drop this patch, because it won't change
the way the driver works.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Dear All,
During the Exynos DRM GEM rework and fixing the issues in the
drm_prime_sg_to_page_addr_arrays() function [1] I've noticed that most
drivers in the DRM framework incorrectly use the nents and orig_nents
entries of the struct sg_table.
In most DMA-mapping implementations, exchanging those two
entries or using nents for all loops on the scatterlist is harmless,
because they both have the same value. There are, however, DMA-mapping
implementations for which such incorrect usage breaks things. The nents
returned by dma_map_sg() might be lower than the nents passed as its
parameter and this is perfectly fine. The DMA framework or IOMMU is allowed
to join consecutive chunks while mapping if such an operation is supported
by the underlying HW (bus, bridge, IOMMU, etc). An example of the case
where dma_map_sg() might return 1 'DMA' chunk for the 4 'physical' pages
is described here [2].
The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the number of the created entries in the DMA address space.
However, the subsequent calls to dma_sync_sg_for_{device,cpu} and
dma_unmap_sg must be made with the original number of entries passed to
dma_map_sg. The common pattern in DRM drivers was to assign the
dma_map_sg() return value to sg_table->nents and use that value for
the subsequent calls to dma_sync_sg_* or dma_unmap_sg functions. Also,
the code iterated nents times to access the pages stored in the
processed scatterlist, while it should use orig_nents as the number of
the page entries.
I've tried to identify all such incorrect usage of sg_table->nents and
this is the result of my research. It looks like the incorrect pattern has
been copied across many drivers, mainly in the DRM subsystem. Too bad that
in most cases it even worked correctly if the system used a simple, linear
DMA-mapping implementation, for which swapping nents and orig_nents
doesn't make any difference. To avoid similar issues in the future, I've
introduced common wrappers for DMA-mapping calls, which operate directly
on the sg_table objects. I've also added wrappers for iterating over the
scatterlists stored in the sg_table objects and applied them where
possible. This, together with some common DRM prime helpers, allowed me
to get rid of almost all nents/orig_nents usage in the drivers. I hope
that such a change makes the code robust, easier to follow and copy/paste
safe.
The biggest TODO is the DRM/i915 driver and I don't feel brave enough to
fix it fully. The driver creatively uses sg_table->orig_nents to store the
size of the allocated scatterlist and ignores the number of entries
returned by the dma_map_sg function. In this patchset I only fixed the
sg_table objects exported by dmabuf related functions. I hope that I
didn't break anything there.
Patches are based on top of Linux next-20200825. The required changes to
the DMA-mapping framework have already been merged to v5.8-rc1.
I would like to ask for merging patches 1-27 via the DRM misc tree.
Best regards,
Marek Szyprowski
References:
[1] https://lkml.org/lkml/2020/3/27/555
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt
[4] https://lore.kernel.org/linux-iommu/20200512121931.GD20393@lst.de/T/#ma18c9…
Changelog:
v9:
- rebased onto Linux next-20200825, which is based on v5.9-rc2; fixed conflicts
- dropped merged patches
v8:
- rapidio: fixed issues pointed out by kbuild test robot (use of
uninitialized variable)
- vb2: rebased after recent changes in the code
v7: https://lore.kernel.org/linux-iommu/20200619103636.11974-1-m.szyprowski@sam…
- changed DMA page iterators to standard DMA SG iterators in drm/prime and
videobuf2-dma-contig as suggested by Robin Murphy
- fixed build issues
v6: https://lore.kernel.org/linux-iommu/20200618153956.29558-1-m.szyprowski@sam…
- rebased onto Linux next-20200618, which is based on v5.8-rc1; fixed conflicts
v5: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprowski@sams…
- fixed some minor style issues and typos
- fixed lack of the attrs argument in ion, dmabuf, rapidio, fastrpc and
vfio patches
v4: https://lore.kernel.org/linux-iommu/20200512121931.GD20393@lst.de/T/
- added for_each_sgtable_* wrappers and applied where possible
- added drm_prime_get_contiguous_size() and applied where possible
- applied drm_prime_sg_to_page_addr_arrays() where possible to remove page
extraction from sg_table objects
- added documentation for the introduced wrappers
- improved patches description a bit
v3: https://lore.kernel.org/dri-devel/20200505083926.28503-1-m.szyprowski@samsu…
- introduce dma_*_sgtable_* wrappers and use them in all patches
v2: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed out by kbuild test robot
- added wide cc: list for each patch
v1: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- initial version
Patch summary:
Marek Szyprowski (32):
drm: prime: add common helper to check scatterlist contiguity
drm: prime: use sgtable iterators in
drm_prime_sg_to_page_addr_arrays()
drm: core: fix common struct sg_table related issues
drm: armada: fix common struct sg_table related issues
drm: etnaviv: fix common struct sg_table related issues
drm: exynos: use common helper for a scatterlist contiguity check
drm: exynos: fix common struct sg_table related issues
drm: i915: fix common struct sg_table related issues
drm: lima: fix common struct sg_table related issues
drm: mediatek: use common helper for a scatterlist contiguity check
drm: mediatek: use common helper for extracting pages array
drm: msm: fix common struct sg_table related issues
drm: omapdrm: use common helper for extracting pages array
drm: omapdrm: fix common struct sg_table related issues
drm: panfrost: fix common struct sg_table related issues
drm: rockchip: use common helper for a scatterlist contiguity check
drm: rockchip: fix common struct sg_table related issues
drm: tegra: fix common struct sg_table related issues
drm: v3d: fix common struct sg_table related issues
drm: virtio: fix common struct sg_table related issues
drm: vmwgfx: fix common struct sg_table related issues
drm: xen: fix common struct sg_table related issues
xen: gntdev: fix common struct sg_table related issues
drm: host1x: fix common struct sg_table related issues
drm: rcar-du: fix common struct sg_table related issues
dmabuf: fix common struct sg_table related issues
staging: tegra-vde: fix common struct sg_table related issues
misc: fastrpc: fix common struct sg_table related issues
rapidio: fix common struct sg_table related issues
samples: vfio-mdev/mbochs: fix common struct sg_table related issues
media: pci: fix common ALSA DMA-mapping related codes
videobuf2: use sgtable-based scatterlist wrappers
drivers/dma-buf/heaps/heap-helpers.c | 13 ++-
drivers/dma-buf/udmabuf.c | 7 +-
drivers/gpu/drm/armada/armada_gem.c | 12 +--
drivers/gpu/drm/drm_cache.c | 2 +-
drivers/gpu/drm/drm_gem_cma_helper.c | 23 +----
drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++-
drivers/gpu/drm/drm_prime.c | 91 +++++++++++--------
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 12 +--
drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 13 +--
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 10 +-
drivers/gpu/drm/exynos/exynos_drm_gem.c | 23 +----
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 11 +--
.../gpu/drm/i915/gem/selftests/mock_dmabuf.c | 7 +-
drivers/gpu/drm/lima/lima_gem.c | 11 ++-
drivers/gpu/drm/lima/lima_vm.c | 5 +-
drivers/gpu/drm/mediatek/mtk_drm_gem.c | 37 ++------
drivers/gpu/drm/msm/msm_gem.c | 13 +--
drivers/gpu/drm/msm/msm_gpummu.c | 14 ++-
drivers/gpu/drm/msm/msm_iommu.c | 2 +-
drivers/gpu/drm/omapdrm/omap_gem.c | 20 ++--
drivers/gpu/drm/panfrost/panfrost_gem.c | 4 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 7 +-
drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 3 +-
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 42 +++------
drivers/gpu/drm/tegra/gem.c | 27 ++----
drivers/gpu/drm/tegra/plane.c | 15 +--
drivers/gpu/drm/v3d/v3d_mmu.c | 13 ++-
drivers/gpu/drm/virtio/virtgpu_object.c | 36 +++++---
drivers/gpu/drm/virtio/virtgpu_vq.c | 12 +--
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 17 +---
drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +-
drivers/gpu/host1x/job.c | 22 ++---
.../common/videobuf2/videobuf2-dma-contig.c | 34 +++----
.../media/common/videobuf2/videobuf2-dma-sg.c | 32 +++----
.../common/videobuf2/videobuf2-vmalloc.c | 12 +--
drivers/media/pci/cx23885/cx23885-alsa.c | 2 +-
drivers/media/pci/cx25821/cx25821-alsa.c | 2 +-
drivers/media/pci/cx88/cx88-alsa.c | 2 +-
drivers/media/pci/saa7134/saa7134-alsa.c | 2 +-
drivers/media/platform/vsp1/vsp1_drm.c | 8 +-
drivers/misc/fastrpc.c | 4 +-
drivers/rapidio/devices/rio_mport_cdev.c | 11 +--
drivers/staging/media/tegra-vde/iommu.c | 4 +-
drivers/xen/gntdev-dmabuf.c | 13 ++-
include/drm/drm_prime.h | 2 +
samples/vfio-mdev/mbochs.c | 3 +-
46 files changed, 273 insertions(+), 398 deletions(-)
--
2.17.1