On Wed, Jul 03, 2019 at 01:48:41PM +0530, Nishka Dasgupta wrote:
> Remove file ion_carveout_heap.c as its functions and definitions are not
> used anywhere.
> Issue found with Coccinelle.
>
> Signed-off-by: Nishka Dasgupta <nishkadg.linux(a)gmail.com>
> ---
> drivers/staging/android/ion/Kconfig | 9 --
> drivers/staging/android/ion/Makefile | 1 -
> .../staging/android/ion/ion_carveout_heap.c | 133 ------------------
I keep trying to do this, but others point out that the ion code is
"going to be fixed up soon" and that people rely on this interface now.
Well, "code outside of the kernel tree" relies on this, which is not ok,
but the "soon" people keep insisting on it...
Odds are I should just delete all of ION, as there hasn't been any
forward progress on it in a long time.
Hopefully that wakes some people up...
thanks,
greg k-h
On Wed, Jul 03, 2019 at 02:14:21PM +0530, Nishka Dasgupta wrote:
> On 03/07/19 2:07 PM, Greg KH wrote:
> > On Wed, Jul 03, 2019 at 01:48:41PM +0530, Nishka Dasgupta wrote:
> > > Remove file ion_carveout_heap.c as its functions and definitions are not
> > > used anywhere.
> > > Issue found with Coccinelle.
> > >
> > > Signed-off-by: Nishka Dasgupta <nishkadg.linux(a)gmail.com>
> > > ---
> > > drivers/staging/android/ion/Kconfig | 9 --
> > > drivers/staging/android/ion/Makefile | 1 -
> > > .../staging/android/ion/ion_carveout_heap.c | 133 ------------------
> >
> > I keep trying to do this, but others point out that the ion code is
> > "going to be fixed up soon" and that people rely on this interface now.
> > Well, "code outside of the kernel tree" relies on this, which is not ok,
> > but the "soon" people keep insisting on it...
> >
> > Odds are I should just delete all of ION, as there hasn't been any
> > forward progress on it in a long time.
>
> I'm sorry, I don't think I understand. Should I drop these patches from my
> tree then?
What "tree"? Let's see what the ION maintainer and developers say
before rushing to anything.
thanks,
greg k-h
On 07/06/2019 20:35, Andrew F. Davis wrote:
> Hello all,
>
> So I've got a new IP on our new SoC I'm looking to make use of and would
> like some help figuring out what framework best matches its function. The
> IP is called a "Page-based Address Translator" or PAT. A PAT instance
> (there are 5 of these things on our J721e device[0]) is basically a
> really simple IOMMU sitting on the interconnect between the device bus
> and what is effectively our northbridge called
> MSMC (DRAM/SRAM/L3-Cache/Coherency controller).
>
> Simplified it looks about like this:
>
>            CPUs
>             |
>  DRAM --- MSMC --- SRAM/L3
>             |
>           NAVSS - (PATs)
>             |
>    --- Device Bus ---------
>     |      |      |      |
>  Device Device Device  etc..
>
> Each PAT has a window in high memory (around the 0x48_0000_0000 area),
> and any transaction with an address targeting that window will be
> routed into the PAT. The PAT then does a simple calculation based on
> how far into the window the address is and the current page size,
> does a lookup in an internally held table of translations, then sends the
> transaction back out on the interconnect with a new address. Usually this
> address points somewhere in DRAM, but it can target some other device
> or even loop back into a PAT (I'm not sure there is a valid use-case for
> this, but it is a point of interest).
>
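Andrew's description of the translation step can be sketched as plain arithmetic. The following is a userspace-only model; the window base, the 4 KiB page size, and the flat lookup table are illustrative assumptions, not the real J721e PAT programming interface:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of the translation step described above. The window
 * base, the 4 KiB page size, and the flat lookup table are illustrative
 * assumptions, not the real J721e PAT programming interface. */
#define PAT_WINDOW_BASE 0x4800000000ULL
#define PAT_PAGE_SIZE   0x1000ULL

static uint64_t pat_translate(const uint64_t *table, uint64_t addr)
{
	uint64_t off     = addr - PAT_WINDOW_BASE; /* distance into the window */
	uint64_t page    = off / PAT_PAGE_SIZE;    /* index into the table */
	uint64_t in_page = off % PAT_PAGE_SIZE;    /* offset within the page */

	/* Outgoing address: translated page base plus in-page offset. */
	return table[page] + in_page;
}
```

With a two-entry table this turns an address one page plus four bytes into the window into the second table entry plus four bytes.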
> My gut reaction is that this is an IOMMU which belongs in the IOMMU
> subsystem. But there are a couple of oddities that make me less sure it
> is really suitable for the IOMMU framework. First, it doesn't sit in
> front of any particular device; it sits in front of *all* devices. This
> means we would have every device claim it as an IOMMU parent, even
> though many devices also have a traditional IOMMU attached. Second,
> there is only a limited window of address space per PAT, so we will get
> fragmentation and allocation failures on occasion; in this way it looks
> to me more like an AGP GART. Third, the window is in high memory, so
> unlike some IOMMUs which can be used to allow DMA to high-mem from
> low-mem-only devices, PAT can't be used for that. Lastly, it doesn't
> provide any isolation: if an access does not target the PAT window, the
> PAT is simply not used (that is not to say we don't have isolation, just
> that it is taken care of by other parts of the interconnect).
>
> This means, to me, that PAT has one main purpose: making
> physically-contiguous views of scattered pages in system memory for DMA.
> But it does that really well: the whole translation table is held in a
> PAT-internal SRAM, giving 1-bus-cycle latency at full bus bandwidth.
>
> So what are my options here, is IOMMU the right way to go or not?
FWIW, that sounds almost exactly like my (vague) understanding of other
GARTs, and as such should be pretty well manageable via the IOMMU API -
we already have tegra-gart, for example. The aperture contention issue
could certainly be mitigated by letting the firmware claim it's only
associated with the display and any other devices which really need it.
A further interesting avenue of investigation - now that Christoph's
recent work has made it much more possible - would be a second set of
IOMMU DMA ops tailored for "GART-like" domains where force_aperture=0,
which could behave as dma-direct wherever possible and only use IOMMU
remaps when absolutely necessary.
Robin.
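Robin's suggested split (behave as dma-direct wherever possible, use aperture remaps only when necessary) can be sketched in a few lines. This is a toy userspace model under stated assumptions: the function name, the aperture base, and the bump allocator are invented for illustration and bear no relation to the actual IOMMU DMA ops code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the policy sketched above for "GART-like" domains with
 * force_aperture == 0: a buffer that is already directly usable (here,
 * physically contiguous) gets its physical address back unchanged, as
 * dma-direct would do; only scattered buffers fall back to an aperture
 * remap. The aperture base and bump allocator are invented. */
#define APERTURE_BASE 0x4800000000ULL

static uint64_t next_page; /* trivial bump allocator for aperture pages */

static uint64_t map_for_dma(uint64_t phys, bool contiguous)
{
	if (contiguous)
		return phys;                        /* dma-direct path */
	return APERTURE_BASE + (next_page++ << 12); /* remap path */
}
```

The point of the design choice is that the scarce aperture is only consumed by buffers that actually need remapping, which directly mitigates the contention issue above.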
> Looking around the kernel I also see the char-dev AGP/GART interface,
> which looks like a good fit, but it also looks quite dated and, my
> guess is, deprecated at this point. Moving right along..
>
> Another thing I saw is the upstream support for the DMM device[1]
> available in some OMAPx/AM57x SoCs. I'm a little more familiar with this
> device. The DMM is a bundle of IPs, and in fact one of them is called
> "PAT" and even does basically the same thing this incarnation of "PAT"
> does. Its upstream integration is a bit questionable, unfortunately: the
> DMM support was integrated into the OMAPDRM display driver, which made
> some sense at the time given its support for rotation (using the TILER
> IP contained in the DMM). The issue with this was that the DMM/TILER/PAT
> IP was not part of our display IP, but instead out at the end of the
> shared device bus, inside the external memory controller. Like this new
> PAT, this meant that any IP could make use of it, but only the display
> framework could actually provide buffers backed by it. This meant, for
> instance, that if we wanted to decode some video buffer using our video
> decoder, we would have to allocate from the DRM framework and then pass
> that over to the V4L2 system. This doesn't make much sense and required
> user-space to know about this odd situation and allocate from the right
> spot, or else use up valuable CMA space or waste memory with dedicated
> carveouts.
>
> Another idea would be to have this as a special central allocator
> (exposed through DMA-BUF heaps[2] or ION) that would give out normal
> system memory as a DMA-BUF but remap it with PAT if a device that only
> supports contiguous memory tries to attach/map that DMA-BUF.
>
> One last option would be to allow user-space to choose to make the buffer
> contiguous when it needs. That's what the driver in this series allows.
> We expose a remapping device, user-space passes it a non-contiguous
> DMA-BUF handle and it passes a contiguous one back. Simple as that.
>
> So how do we use this? Let's take Android, for example: we don't know at
> allocation time if a rendering buffer will end up going back into the
> GPU for further processing or will be consumed directly by the display.
> This is a problem for us, as our GPU can work with non-contiguous
> buffers but our display cannot, so any buffer that could possibly go to
> the display at some point currently needs to be allocated as contiguous
> from the start. This leads to a lot of unneeded use of carveout/CMA
> memory. With this driver, on the other hand, we allocate regular
> non-contiguous system memory (again using DMA-BUF heaps, but ION could
> work here too), then only when a buffer is about to be sent to the
> display do we pass its DMA-BUF handle to our driver here and hand the
> returned handle to the display instead.
>
> As said, it is probably not the ideal solution but it does work and was
> used for some early testing of the IP.
>
> Well, sorry for the wall of text.
> Any and all suggestions very welcome and appreciated.
>
> Thanks,
> Andrew
>
> [0] http://www.ti.com/lit/pdf/spruil1
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/dri…
> [2] https://lkml.org/lkml/2019/6/6/1211
>
> Andrew F. Davis (2):
> dt-bindings: soc: ti: Add TI PAT bindings
> soc: ti: Add Support for the TI Page-based Address Translator (PAT)
>
> .../devicetree/bindings/misc/ti,pat.txt | 34 ++
> drivers/soc/ti/Kconfig | 9 +
> drivers/soc/ti/Makefile | 1 +
> drivers/soc/ti/ti-pat.c | 569 ++++++++++++++++++
> include/uapi/linux/ti-pat.h | 44 ++
> 5 files changed, 657 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/misc/ti,pat.txt
> create mode 100644 drivers/soc/ti/ti-pat.c
> create mode 100644 include/uapi/linux/ti-pat.h
>
Am 12.06.19 um 10:15 schrieb Nicolin Chen:
> Hi Christian,
>
> On Wed, Jun 12, 2019 at 08:05:53AM +0000, Koenig, Christian wrote:
>> Am 12.06.19 um 10:02 schrieb Nicolin Chen:
>> [SNIP]
>>> We haven't used DRM/DRM_PRIME yet, but I am also curious: would it
>>> benefit DRM as well if we reduce this overhead in dma_buf?
>> No, not at all.
> From your replies, in summary, does it mean that there won't be a case
> of DRM having a dma_buf attached to the same device, i.e. multiple calls
> of the drm_gem_prime_import() function with the same dev + dma_buf
> parameters?
Well, there are some cases where this happens, but in those cases we
intentionally want to get a new attachment :)
So, thinking more about it, you would actually break those, and that is
not something we can do.
> If so, we can just ignore/drop this patch. Sorry for the misunderstanding.
It might be interesting for things like P2P, but even then it might be
better to just cache the P2P settings instead of the full attachment.
Regards,
Christian.
>
> Thanks
> Nicolin
Am 12.06.19 um 10:02 schrieb Nicolin Chen:
> Hi Christian,
>
> Thanks for the quick reply.
>
> On Wed, Jun 12, 2019 at 07:45:38AM +0000, Koenig, Christian wrote:
>> Am 12.06.19 um 03:22 schrieb Nicolin Chen:
>>> Commit f13e143e7444 ("dma-buf: start caching of sg_table objects v2")
>>> added support for caching the sgt pointer in an attachment pointer to
>>> let users reuse the sgt pointer without another mapping. However, it
>>> might not fully work, as most dma-buf callers do attach()
>>> and map_attachment() back-to-back; using drm_prime.c as an example:
>>> drm_gem_prime_import_dev() {
>>> attach = dma_buf_attach() {
>>> /* Allocating a new attach */
>>> attach = kzalloc();
>>> /* .... */
>>> return attach;
>>> }
>>> dma_buf_map_attachment(attach, direction) {
>>> /* attach->sgt would be always empty as attach is new */
>>> if (attach->sgt) {
>>> /* Reuse attach->sgt */
>>> }
>>> /* Otherwise, map it */
>>> attach->sgt = map();
>>> }
>>> }
>>>
>>> So, for a cache_sgt_mapping use case, it would need to get the same
>>> attachment pointer in order to reuse its sgt pointer. So this patch
>>> adds a refcount to the attach() function and lets it search for the
>>> existing attach pointer by matching the dev pointer.
>> I don't think that this is a good idea.
>>
>> We use sgt caching as a workaround for locking order problems and want
>> to remove it again in the long term.
> Oh, I thought it was for performance-improvement purposes. It may
> be a misunderstanding then.
>
>> So what is the actual use case of this?
> We have some similar downstream changes to dma_buf to reduce the
> overhead of multiple clients of the same device doing attach()
> and map_attachment() calls for the same dma_buf.
I don't think that this is a good idea overall. A driver calling attach
for the same buffer is doing something wrong in the first place, and we
should not work around that in the DMA-buf handling.
> We haven't used DRM/DRM_PRIME yet, but I am also curious: would it
> benefit DRM as well if we reduce this overhead in dma_buf?
No, not at all.
Regards,
Christian.
>
> Thanks
> Nicolin
Am 12.06.19 um 03:22 schrieb Nicolin Chen:
> Commit f13e143e7444 ("dma-buf: start caching of sg_table objects v2")
> added support for caching the sgt pointer in an attachment pointer to
> let users reuse the sgt pointer without another mapping. However, it
> might not fully work, as most dma-buf callers do attach()
> and map_attachment() back-to-back; using drm_prime.c as an example:
> drm_gem_prime_import_dev() {
> attach = dma_buf_attach() {
> /* Allocating a new attach */
> attach = kzalloc();
> /* .... */
> return attach;
> }
> dma_buf_map_attachment(attach, direction) {
> /* attach->sgt would be always empty as attach is new */
> if (attach->sgt) {
> /* Reuse attach->sgt */
> }
> /* Otherwise, map it */
> attach->sgt = map();
> }
> }
>
> So, for a cache_sgt_mapping use case, it would need to get the same
> attachment pointer in order to reuse its sgt pointer. So this patch
> adds a refcount to the attach() function and lets it search for the
> existing attach pointer by matching the dev pointer.
I don't think that this is a good idea.
We use sgt caching as a workaround for locking order problems and want to
remove it again in the long term.
So what is the actual use case of this?
Regards,
Christian.
>
> Signed-off-by: Nicolin Chen <nicoleotsuka(a)gmail.com>
> ---
> drivers/dma-buf/dma-buf.c | 23 +++++++++++++++++++++++
> include/linux/dma-buf.h | 2 ++
> 2 files changed, 25 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index f4104a21b069..d0260553a31c 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -559,6 +559,21 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> if (WARN_ON(!dmabuf || !dev))
> return ERR_PTR(-EINVAL);
>
> + /* cache_sgt_mapping requires to reuse the same attachment pointer */
> + if (dmabuf->ops->cache_sgt_mapping) {
> + mutex_lock(&dmabuf->lock);
> +
> + /* Search for existing attachment and increase its refcount */
> + list_for_each_entry(attach, &dmabuf->attachments, node) {
> + if (dev != attach->dev)
> + continue;
> + atomic_inc_not_zero(&attach->refcount);
> + goto unlock_attach;
> + }
> +
> + mutex_unlock(&dmabuf->lock);
> + }
> +
> attach = kzalloc(sizeof(*attach), GFP_KERNEL);
> if (!attach)
> return ERR_PTR(-ENOMEM);
> @@ -575,6 +590,9 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> }
> list_add(&attach->node, &dmabuf->attachments);
>
> + atomic_set(&attach->refcount, 1);
> +
> +unlock_attach:
> mutex_unlock(&dmabuf->lock);
>
> return attach;
> @@ -599,6 +617,11 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
> if (WARN_ON(!dmabuf || !attach))
> return;
>
> + /* Decrease the refcount for cache_sgt_mapping use cases */
> + if (dmabuf->ops->cache_sgt_mapping &&
> + atomic_dec_return(&attach->refcount))
> + return;
> +
> if (attach->sgt)
> dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 8a327566d7f4..65f12212ca2e 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -333,6 +333,7 @@ struct dma_buf {
> * @dev: device attached to the buffer.
> * @node: list of dma_buf_attachment.
> * @sgt: cached mapping.
> + * @refcount: refcount of the attachment for the same device.
> * @dir: direction of cached mapping.
> * @priv: exporter specific attachment data.
> *
> @@ -350,6 +351,7 @@ struct dma_buf_attachment {
> struct device *dev;
> struct list_head node;
> struct sg_table *sgt;
> + atomic_t refcount;
> enum dma_data_direction dir;
> void *priv;
> };
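The reuse semantics the patch aims for can be modeled outside the kernel. The following is a toy single-threaded sketch under stated assumptions: toy_attach_get()/toy_detach() are invented names, and a fixed-size table stands in for the dmabuf->attachments list. A second attach for the same device returns the cached attachment, and detach frees only on the last reference:

```c
#include <assert.h>
#include <stddef.h>

/* Toy single-threaded model of the reuse semantics the patch above aims
 * for (all names here are invented for illustration, not kernel API):
 * the first attach for a device creates an attachment with refcount 1,
 * a second attach for the same device reuses it and bumps the count,
 * and detach only frees once the count drops back to zero. */
struct toy_attach {
	const void *dev;   /* stands in for struct device * */
	int refcount;
};

static struct toy_attach toy_table[8]; /* stands in for the attachments list */

static struct toy_attach *toy_attach_get(const void *dev)
{
	/* Search for an existing attachment for this device first. */
	for (size_t i = 0; i < 8; i++)
		if (toy_table[i].refcount && toy_table[i].dev == dev) {
			toy_table[i].refcount++;
			return &toy_table[i];
		}
	/* None cached: "allocate" a fresh one. */
	for (size_t i = 0; i < 8; i++)
		if (!toy_table[i].refcount) {
			toy_table[i].dev = dev;
			toy_table[i].refcount = 1;
			return &toy_table[i];
		}
	return NULL;
}

static void toy_detach(struct toy_attach *a)
{
	if (--a->refcount == 0)
		a->dev = NULL; /* last user: release the slot */
}
```

The model deliberately leaves out locking; in the real patch the lookup and refcount manipulation race with detach, which is part of what the review discussion is about.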
On Mon, 27 May 2019 18:56:20 +0800 Christian Koenig wrote:
> Thanks for the comments, but you are looking at a completely outdated patchset.
>
> If you are interested in the newest one please ping me and I'm going to CC you
> when I send out the next version.
>
Ping...
Thanks
Hillf
Hi everybody,
The core idea in this patch set is that DMA-buf importers can now provide an optional invalidate callback. Using this callback and the reservation object, exporters can now avoid pinning DMA-buf memory for a long time while sharing it between devices.
I already sent out an older version roughly a year ago, but didn't have time to look further into cleaning this up.
Last time, a major problem was that we would have had to fix up all drivers implementing DMA-buf at once.
Now I avoid this by allowing mappings to be cached in the DMA-buf attachment, so drivers can optionally move over to the new interface one by one.
This is also a prerequisite to my patchset enabling sharing of device memory with DMA-buf.
Please review and/or comment,
Christian.
Quoting Michael Yang (2019-05-14 08:55:37)
> On Thu, May 09, 2019 at 12:46:05PM +0100, Chris Wilson wrote:
> > Quoting Michael Yang (2019-05-09 05:34:11)
> > > If all the sync points were signaled in both fences a and b,
> > > there is only one sync point in the merged fence, which is a_fence[0].
> > > The Fence structure in the Android framework might be confused about
> > > the timestamp if any sync points were signaled after a_fence[0]. It
> > > might be more reasonable to use the timestamp of the last signaled
> > > sync point to represent the merged fence.
> > > The issue can be seen via the EGL extension ANDROID_get_frame_timestamps.
> > > Sometimes the return value of EGL_READS_DONE_TIME_ANDROID is ahead of
> > > the return value of EGL_RENDERING_COMPLETE_TIME_ANDROID.
> > > That means display/composition had completed before rendering had
> > > completed, which is incorrect.
> > >
> > > Some discussion can be found at:
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__android-2Dreview.googl…
> > >
> > > Signed-off-by: Michael Yang <michael.yang(a)imgtec.com>
> > > ---
> > > Hi,
> > > I didn't get response since I previously sent this a month ago.
> > > Could someone have a chance to look at it please?
> > > Thanks.
> > > drivers/dma-buf/sync_file.c | 25 +++++++++++++++++++++++--
> > > 1 file changed, 23 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > > index 4f6305c..d46bfe1 100644
> > > --- a/drivers/dma-buf/sync_file.c
> > > +++ b/drivers/dma-buf/sync_file.c
> > > @@ -274,8 +274,29 @@ static struct sync_file *sync_file_merge(const char *name, struct sync_file *a,
> > > for (; i_b < b_num_fences; i_b++)
> > > add_fence(fences, &i, b_fences[i_b]);
> > >
> > > - if (i == 0)
> > > - fences[i++] = dma_fence_get(a_fences[0]);
> > > + /* If all the sync pts were signaled, then adding the sync_pt who
> > > + * was the last signaled to the fence.
> > > + */
> > > + if (i == 0) {
> > > + struct dma_fence *last_signaled_sync_pt = a_fences[0];
> > > + int iter;
> > > +
> > > + for (iter = 1; iter < a_num_fences; iter++) {
> >
> > If there is more than one fence, sync_file->fence is a fence_array and
> > its timestamp is what you want. If there is one fence, sync_file->fence
> > is a pointer to that fence, and naturally has the right timestamp.
> >
> > In short, this should be handled by dma_fence_array_create() when given
> > a complete set of signaled fences, it too should inherit the signaled
> > status with the timestamp being taken from the last fence. It should
> > also be careful to inherit the error status.
> > -Chris
> Thanks Chris for the input. In this case, there will be only one fence
> in sync_file->fence after doing sync_file_merge(). In the current
> implementation, dma_fence_array_create() is not called, as num_fences is
> equal to 1. I was wondering, do you suggest that we pass a complete set
> of signaled fences to sync_file_set_fence() and handle it in
> dma_fence_array_create()?
> Thanks.
No, in the case there is only one fence, we just inherit its timestamp
along with its fence status. (A single fence is the degenerate case of
a fence array.)
-Chris
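The selection rule Chris describes (the merged fence inherits the timestamp of the fence that signaled last) reduces to taking the maximum timestamp over the signaled set. A minimal userspace sketch, with plain integers standing in for dma_fence timestamps rather than the actual sync_file/dma_fence_array code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal userspace sketch of the rule discussed above: when every
 * fence in the merged set has already signaled, the merged fence should
 * report the timestamp of the one that signaled last, i.e. the maximum
 * over the set. Plain integers stand in for dma_fence timestamps. */
static uint64_t last_signaled_timestamp(const uint64_t *ts, size_t n)
{
	uint64_t last = ts[0];

	for (size_t i = 1; i < n; i++)
		if (ts[i] > last)
			last = ts[i]; /* a later signal wins */
	return last;
}
```

For n == 1 this degenerates to the single fence's own timestamp, which is exactly the "single fence is the degenerate case of a fence array" point.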
Quoting Michael Yang (2019-05-09 05:34:11)
> If all the sync points were signaled in both fences a and b,
> there is only one sync point in the merged fence, which is a_fence[0].
> The Fence structure in the Android framework might be confused about
> the timestamp if any sync points were signaled after a_fence[0]. It
> might be more reasonable to use the timestamp of the last signaled
> sync point to represent the merged fence.
> The issue can be seen via the EGL extension ANDROID_get_frame_timestamps.
> Sometimes the return value of EGL_READS_DONE_TIME_ANDROID is ahead of
> the return value of EGL_RENDERING_COMPLETE_TIME_ANDROID.
> That means display/composition had completed before rendering had
> completed, which is incorrect.
>
> Some discussion can be found at:
> https://android-review.googlesource.com/c/kernel/common/+/907009
>
> Signed-off-by: Michael Yang <michael.yang(a)imgtec.com>
> ---
> Hi,
> I didn't get response since I previously sent this a month ago.
> Could someone have a chance to look at it please?
> Thanks.
> drivers/dma-buf/sync_file.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> index 4f6305c..d46bfe1 100644
> --- a/drivers/dma-buf/sync_file.c
> +++ b/drivers/dma-buf/sync_file.c
> @@ -274,8 +274,29 @@ static struct sync_file *sync_file_merge(const char *name, struct sync_file *a,
> for (; i_b < b_num_fences; i_b++)
> add_fence(fences, &i, b_fences[i_b]);
>
> - if (i == 0)
> - fences[i++] = dma_fence_get(a_fences[0]);
> + /* If all the sync pts were signaled, then adding the sync_pt who
> + * was the last signaled to the fence.
> + */
> + if (i == 0) {
> + struct dma_fence *last_signaled_sync_pt = a_fences[0];
> + int iter;
> +
> + for (iter = 1; iter < a_num_fences; iter++) {
If there is more than one fence, sync_file->fence is a fence_array and
its timestamp is what you want. If there is one fence, sync_file->fence
is a pointer to that fence, and naturally has the right timestamp.
In short, this should be handled by dma_fence_array_create() when given
a complete set of signaled fences, it too should inherit the signaled
status with the timestamp being taken from the last fence. It should
also be careful to inherit the error status.
-Chris
On Mon, Apr 22, 2019 at 08:49:27PM +0200, Oscar Gomez Fuente wrote:
> These changes fix "symbol was not declared" warnings for the functions
> ion_carveout_heap_create and ion_chunk_heap_create.
>
> Signed-off-by: Oscar Gomez Fuente <oscargomezf(a)gmail.com>
> ---
> drivers/staging/android/ion/ion_carveout_heap.c | 2 +-
> drivers/staging/android/ion/ion_chunk_heap.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion_carveout_heap.c b/drivers/staging/android/ion/ion_carveout_heap.c
> index bb9d614..3f359ae 100644
> --- a/drivers/staging/android/ion/ion_carveout_heap.c
> +++ b/drivers/staging/android/ion/ion_carveout_heap.c
> @@ -103,7 +103,7 @@ static struct ion_heap_ops carveout_heap_ops = {
> .unmap_kernel = ion_heap_unmap_kernel,
> };
>
> -struct ion_heap *ion_carveout_heap_create(phys_addr_t base, size_t size)
> +static inline struct ion_heap *ion_carveout_heap_create(phys_addr_t base, size_t size)
Why are you making it inline? Btw, normally we just leave it for the
compiler to choose which functions to make inline.
regards,
dan carpenter
On top of those I have 6 more patches in the pipeline to enable VRAM P2P
with DMA-buf.
So that is not the end of the patch set :)
Christian.
Am 17.04.19 um 15:52 schrieb Chunming Zhou:
> Thanks Christian, great job. I will verify it this week when I finish my
> current work on hand.
>
> -David
>
> 在 2019/4/17 2:38, Christian König wrote:
>> Hi everybody,
>>
>> The core idea in this patch set is that DMA-buf importers can now provide an optional invalidate callback. Using this callback and the reservation object, exporters can now avoid pinning DMA-buf memory for a long time while sharing it between devices.
>>
>> I already sent out an older version roughly a year ago, but didn't have time to look further into cleaning this up.
>>
>> Last time, a major problem was that we would have had to fix up all drivers implementing DMA-buf at once.
>>
>> Now I avoid this by allowing mappings to be cached in the DMA-buf attachment, so drivers can optionally move over to the new interface one by one.
>>
>> This is also a prerequisite to my patchset enabling sharing of device memory with DMA-buf.
>>
>> Please review and/or comment,
>> Christian.
>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel(a)lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 3/29/19 7:26 PM, Zengtao (B) wrote:
> Hi laura:
>
>> -----Original Message-----
>> From: Laura Abbott [mailto:labbott@redhat.com]
>> Sent: Friday, March 29, 2019 9:27 PM
>> To: Zengtao (B) <prime.zeng(a)hisilicon.com>; sumit.semwal(a)linaro.org
>> Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>; Arve Hjønnevåg
>> <arve(a)android.com>; Todd Kjos <tkjos(a)android.com>; Martijn Coenen
>> <maco(a)android.com>; Joel Fernandes <joel(a)joelfernandes.org>;
>> Christian Brauner <christian(a)brauner.io>; devel(a)driverdev.osuosl.org;
>> dri-devel(a)lists.freedesktop.org; linaro-mm-sig(a)lists.linaro.org;
>> linux-kernel(a)vger.kernel.org
>> Subject: Re: [PATCH] staging: android: ion: refactory ion_alloc for kernel
>> driver use
>>
>> On 3/29/19 11:40 AM, Zeng Tao wrote:
>>> There are two reasons for this patch:
>>> 1. There are some potential requirements for ion_alloc in kernel
>>> space: some media drivers need to allocate media buffers from ion
>>> instead of the buddy allocator or the dma framework. This is more
>>> convenient and cleaner for media drivers, and in that case ion is the
>>> only media buffer provider, so it's easier to maintain.
>>> 2. An fd is only needed by user processes, not by kernel space, so a
>>> dma_buf should be returned instead of an fd for kernel space, and
>>> dma_buf_fd should be called only for the userspace API.
>>>
>>>
>>
>> I really want to just NAK this because it doesn't seem like something
>> that's necessary. The purpose of Ion is to provide buffers to userspace
>> because there's no other way for userspace to get access to the memory.
>> The kernel already has other APIs to access the memory. This also
>> complicates the re-work that's been happening where the requirement is
>> only userspace.
>>
>> Can you be more detailed about which media drivers you are referring to
>> and why they can't just use other APIs?
>>
>
> I think I've got your point: ION was designed for userspace, but for
> kernel space we are really lacking something that plays the same role
> (allocating media memory, sharing the memory using dma_buf, providing
> debug and statistics for media memory).
>
> In fact, for kernel space, we have the dma framework, dma-buf, etc.,
> and we can work on top of such APIs, but there is some duplicated work
> (everyone has to maintain their own buffer sharing, debug and
> statistics). So we need something to do the common things (ION is the
> best choice now).
>
Keep in mind that Ion is a thin shell of what it was, as most of the
debugging and statistics were removed because they were buggy. Most of
that should end up going in the dma_buf layer, since it's really a
dma_buf allocation API.
> When ION was introduced, a lot of media memory frameworks existed and
> the dma framework was not so good, so ION heaps, integrated buffer
> sharing, statistics and a userspace API were the required features. But
> now the dma framework is more powerful, and we don't even need ION
> heaps anymore; the userspace API, buffer sharing and statistics are
> still needed, though, and the buffer sharing and statistics can be
> re-worked and exported to kernel space, not only used by userspace.
> That is my point.
>
I see what you are getting at but I don't think the same thing
applies to the kernel as it does userspace. We can enforce a
single way of using the dma_buf fd in userspace but the kernel
has a variety of ways to use dma_buf because each driver and
framework has its own needs. I'm still not convinced that adding
Ion APIs in the kernel is the right option since as you point out
we don't really need the heaps. That mostly leaves Ion as a wrapper
to handle doing the export. Maybe we could benefit from that
but I think it might require more thought.
I'd rather see a proposal in the media API itself showing what
you think is necessary but without using Ion. That would be
a good start so we could fully review what might make sense to
pull out of Ion into something common.
Thanks,
Laura
>>
>>> Signed-off-by: Zeng Tao <prime.zeng(a)hisilicon.com>
>>> ---
>>> drivers/staging/android/ion/ion.c | 32
>> +++++++++++++++++---------------
>>> 1 file changed, 17 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/staging/android/ion/ion.c
>>> b/drivers/staging/android/ion/ion.c
>>> index 92c2914..e93fb49 100644
>>> --- a/drivers/staging/android/ion/ion.c
>>> +++ b/drivers/staging/android/ion/ion.c
>>> @@ -387,13 +387,13 @@ static const struct dma_buf_ops
>> dma_buf_ops = {
>>> .unmap = ion_dma_buf_kunmap,
>>> };
>>>
>>> -static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned
>>> int flags)
>>> +struct dma_buf *ion_alloc(size_t len, unsigned int heap_id_mask,
>>> + unsigned int flags)
>>> {
>>> struct ion_device *dev = internal_dev;
>>> struct ion_buffer *buffer = NULL;
>>> struct ion_heap *heap;
>>> DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
>>> - int fd;
>>> struct dma_buf *dmabuf;
>>>
>>> pr_debug("%s: len %zu heap_id_mask %u flags %x\n", __func__,
>> @@
>>> -407,7 +407,7 @@ static int ion_alloc(size_t len, unsigned int
>> heap_id_mask, unsigned int flags)
>>> len = PAGE_ALIGN(len);
>>>
>>> if (!len)
>>> - return -EINVAL;
>>> + return ERR_PTR(-EINVAL);
>>>
>>> down_read(&dev->lock);
>>> plist_for_each_entry(heap, &dev->heaps, node) { @@ -421,10
>> +421,10
>>> @@ static int ion_alloc(size_t len, unsigned int heap_id_mask,
>> unsigned int flags)
>>> up_read(&dev->lock);
>>>
>>> if (!buffer)
>>> - return -ENODEV;
>>> + return ERR_PTR(-ENODEV);
>>>
>>> if (IS_ERR(buffer))
>>> - return PTR_ERR(buffer);
>>> + return ERR_PTR(PTR_ERR(buffer));
>>>
>>> exp_info.ops = &dma_buf_ops;
>>> exp_info.size = buffer->size;
>>> @@ -432,17 +432,12 @@ static int ion_alloc(size_t len, unsigned int
>> heap_id_mask, unsigned int flags)
>>> exp_info.priv = buffer;
>>>
>>> dmabuf = dma_buf_export(&exp_info);
>>> - if (IS_ERR(dmabuf)) {
>>> + if (IS_ERR(dmabuf))
>>> _ion_buffer_destroy(buffer);
>>> - return PTR_ERR(dmabuf);
>>> - }
>>>
>>> - fd = dma_buf_fd(dmabuf, O_CLOEXEC);
>>> - if (fd < 0)
>>> - dma_buf_put(dmabuf);
>>> -
>>> - return fd;
>>> + return dmabuf;
>>> }
>>> +EXPORT_SYMBOL(ion_alloc);
>>>
>>> static int ion_query_heaps(struct ion_heap_query *query)
>>> {
>>> @@ -539,12 +534,19 @@ static long ion_ioctl(struct file *filp, unsigned
>> int cmd, unsigned long arg)
>>> case ION_IOC_ALLOC:
>>> {
>>> int fd;
>>> + struct dma_buf *dmabuf;
>>>
>>> - fd = ion_alloc(data.allocation.len,
>>> + dmabuf = ion_alloc(data.allocation.len,
>>> data.allocation.heap_id_mask,
>>> data.allocation.flags);
>>> - if (fd < 0)
>>> + if (IS_ERR(dmabuf))
>>> + return PTR_ERR(dmabuf);
>>> +
>>> + fd = dma_buf_fd(dmabuf, O_CLOEXEC);
>>> + if (fd < 0) {
>>> + dma_buf_put(dmabuf);
>>> return fd;
>>> + }
>>>
>>> data.allocation.fd = fd;
>>>
>>>
>
Hi Zeng,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on staging/staging-testing]
[also build test WARNING on v5.1-rc2 next-20190329]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Zeng-Tao/staging-android-ion-refac…
coccinelle warnings: (new ones prefixed by >>)
>> drivers/staging/android/ion/ion.c:427:9-16: WARNING: ERR_CAST can be used with buffer
Please review and possibly fold the followup patch.
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
On Sat, Mar 30, 2019 at 02:32:35AM +0000, Zengtao (B) wrote:
> >-----Original Message-----
> >From: Greg Kroah-Hartman [mailto:gregkh@linuxfoundation.org]
> >Sent: Saturday, March 30, 2019 12:04 AM
> >To: Zengtao (B) <prime.zeng(a)hisilicon.com>
> >Cc: labbott(a)redhat.com; sumit.semwal(a)linaro.org;
> >devel(a)driverdev.osuosl.org; Todd Kjos <tkjos(a)android.com>;
> >linux-kernel(a)vger.kernel.org; dri-devel(a)lists.freedesktop.org;
> >linaro-mm-sig(a)lists.linaro.org; Arve Hjønnevåg <arve(a)android.com>;
> >Joel Fernandes <joel(a)joelfernandes.org>; Martijn Coenen
> ><maco(a)android.com>; Christian Brauner <christian(a)brauner.io>
> >Subject: Re: [PATCH] staging: android: ion: refactory ion_alloc for kernel
> >driver use
> >
> >On Sat, Mar 30, 2019 at 02:40:16AM +0800, Zeng Tao wrote:
> >> There are two reasons for this patch:
> >> 1. There are some potential requirements for ion_alloc in kernel
> >> space, some media drivers need to allocate media buffers from ion
> >> instead of the buddy or dma framework; this is more convenient and
> >> cleaner for media drivers. In that case, ion is the only media buffer
> >> provider, which is easier to maintain.
> >
> >As this really is just DMA, what is wrong with the existing dma framework
> >that makes it hard to use? You have seen all of the changes recently to it,
> >right?
>
> The current dma framework is powerful enough (to me, and rather complex ^_^);
> CMA and IOMMU are all integrated, which is good. But buffer sharing, statistics,
> and debug are not so friendly for media drivers (each driver has to implement
> them itself, duplicating work).
Then go add statistics and debugging to the dma code so that everyone
benefits!
thanks,
greg k-h
On Sat, Mar 30, 2019 at 02:40:16AM +0800, Zeng Tao wrote:
> There are two reasons for this patch:
> 1. There are some potential requirements for ion_alloc in kernel space,
> some media drivers need to allocate media buffers from ion instead of
> the buddy or dma framework; this is more convenient and cleaner for media
> drivers. In that case, ion is the only media buffer provider, which is
> easier to maintain.
As this really is just DMA, what is wrong with the existing dma
framework that makes it hard to use? You have seen all of the changes
recently to it, right?
thanks,
greg k-h
On Sat, Mar 30, 2019 at 02:40:16AM +0800, Zeng Tao wrote:
> There are two reasons for this patch:
> 1. There are some potential requirements for ion_alloc in kernel space,
> some media drivers need to allocate media buffers from ion instead of
> the buddy or dma framework; this is more convenient and cleaner for media
> drivers. In that case, ion is the only media buffer provider, which is
> easier to maintain.
> 2. An fd is only needed by user processes, not by kernel space, so a
> dma_buf should be returned instead of an fd for kernel callers, and
> dma_buf_fd should be called only from the userspace API.
>
> Signed-off-by: Zeng Tao <prime.zeng(a)hisilicon.com>
> ---
> drivers/staging/android/ion/ion.c | 32 +++++++++++++++++---------------
> 1 file changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> index 92c2914..e93fb49 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -387,13 +387,13 @@ static const struct dma_buf_ops dma_buf_ops = {
> .unmap = ion_dma_buf_kunmap,
> };
>
> -static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned int flags)
> +struct dma_buf *ion_alloc(size_t len, unsigned int heap_id_mask,
> + unsigned int flags)
> {
> struct ion_device *dev = internal_dev;
> struct ion_buffer *buffer = NULL;
> struct ion_heap *heap;
> DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
> - int fd;
> struct dma_buf *dmabuf;
>
> pr_debug("%s: len %zu heap_id_mask %u flags %x\n", __func__,
> @@ -407,7 +407,7 @@ static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned int flags)
> len = PAGE_ALIGN(len);
>
> if (!len)
> - return -EINVAL;
> + return ERR_PTR(-EINVAL);
>
> down_read(&dev->lock);
> plist_for_each_entry(heap, &dev->heaps, node) {
> @@ -421,10 +421,10 @@ static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned int flags)
> up_read(&dev->lock);
>
> if (!buffer)
> - return -ENODEV;
> + return ERR_PTR(-ENODEV);
>
> if (IS_ERR(buffer))
> - return PTR_ERR(buffer);
> + return ERR_PTR(PTR_ERR(buffer));
>
> exp_info.ops = &dma_buf_ops;
> exp_info.size = buffer->size;
> @@ -432,17 +432,12 @@ static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned int flags)
> exp_info.priv = buffer;
>
> dmabuf = dma_buf_export(&exp_info);
> - if (IS_ERR(dmabuf)) {
> + if (IS_ERR(dmabuf))
> _ion_buffer_destroy(buffer);
> - return PTR_ERR(dmabuf);
> - }
>
> - fd = dma_buf_fd(dmabuf, O_CLOEXEC);
> - if (fd < 0)
> - dma_buf_put(dmabuf);
> -
> - return fd;
> + return dmabuf;
> }
> +EXPORT_SYMBOL(ion_alloc);
If you are going to do this (and personally I'm with Laura in that I
don't think you need it) this should be EXPORT_SYMBOL_GPL() please.
thanks,
greg k-h
On 3/29/19 11:40 AM, Zeng Tao wrote:
> There are two reasons for this patch:
> 1. There are some potential requirements for ion_alloc in kernel space,
> some media drivers need to allocate media buffers from ion instead of
> the buddy or dma framework; this is more convenient and cleaner for media
> drivers. In that case, ion is the only media buffer provider, which is
> easier to maintain.
> 2. An fd is only needed by user processes, not by kernel space, so a
> dma_buf should be returned instead of an fd for kernel callers, and
> dma_buf_fd should be called only from the userspace API.
>
I really want to just NAK this because it doesn't seem like something
that's necessary. The purpose of Ion is to provide buffers to userspace
because there's no other way for userspace to get access to the memory.
The kernel already has other APIs to access the memory. This also
complicates the re-work that's been happening, which assumes that the
only consumers are in userspace.
Can you be more detailed about which media drivers you are referring
to and why they can't just use other APIs?
> Signed-off-by: Zeng Tao <prime.zeng(a)hisilicon.com>
> ---
> drivers/staging/android/ion/ion.c | 32 +++++++++++++++++---------------
> 1 file changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> index 92c2914..e93fb49 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -387,13 +387,13 @@ static const struct dma_buf_ops dma_buf_ops = {
> .unmap = ion_dma_buf_kunmap,
> };
>
> -static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned int flags)
> +struct dma_buf *ion_alloc(size_t len, unsigned int heap_id_mask,
> + unsigned int flags)
> {
> struct ion_device *dev = internal_dev;
> struct ion_buffer *buffer = NULL;
> struct ion_heap *heap;
> DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
> - int fd;
> struct dma_buf *dmabuf;
>
> pr_debug("%s: len %zu heap_id_mask %u flags %x\n", __func__,
> @@ -407,7 +407,7 @@ static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned int flags)
> len = PAGE_ALIGN(len);
>
> if (!len)
> - return -EINVAL;
> + return ERR_PTR(-EINVAL);
>
> down_read(&dev->lock);
> plist_for_each_entry(heap, &dev->heaps, node) {
> @@ -421,10 +421,10 @@ static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned int flags)
> up_read(&dev->lock);
>
> if (!buffer)
> - return -ENODEV;
> + return ERR_PTR(-ENODEV);
>
> if (IS_ERR(buffer))
> - return PTR_ERR(buffer);
> + return ERR_PTR(PTR_ERR(buffer));
>
> exp_info.ops = &dma_buf_ops;
> exp_info.size = buffer->size;
> @@ -432,17 +432,12 @@ static int ion_alloc(size_t len, unsigned int heap_id_mask, unsigned int flags)
> exp_info.priv = buffer;
>
> dmabuf = dma_buf_export(&exp_info);
> - if (IS_ERR(dmabuf)) {
> + if (IS_ERR(dmabuf))
> _ion_buffer_destroy(buffer);
> - return PTR_ERR(dmabuf);
> - }
>
> - fd = dma_buf_fd(dmabuf, O_CLOEXEC);
> - if (fd < 0)
> - dma_buf_put(dmabuf);
> -
> - return fd;
> + return dmabuf;
> }
> +EXPORT_SYMBOL(ion_alloc);
>
> static int ion_query_heaps(struct ion_heap_query *query)
> {
> @@ -539,12 +534,19 @@ static long ion_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> case ION_IOC_ALLOC:
> {
> int fd;
> + struct dma_buf *dmabuf;
>
> - fd = ion_alloc(data.allocation.len,
> + dmabuf = ion_alloc(data.allocation.len,
> data.allocation.heap_id_mask,
> data.allocation.flags);
> - if (fd < 0)
> + if (IS_ERR(dmabuf))
> + return PTR_ERR(dmabuf);
> +
> + fd = dma_buf_fd(dmabuf, O_CLOEXEC);
> + if (fd < 0) {
> + dma_buf_put(dmabuf);
> return fd;
> + }
>
> data.allocation.fd = fd;
>
>
If we're going to export ion_alloc() for other drivers to use then let's
make an ion_free() helper function as well.
void ion_free(struct dma_buf *dmabuf)
{
	dma_buf_put(dmabuf);
}
regards,
dan carpenter
On Thu, Feb 28, 2019 at 04:18:57PM -0800, Hyun Kwon wrote:
> Hi Daniel,
>
> On Thu, 2019-02-28 at 02:01:46 -0800, Daniel Vetter wrote:
> > On Wed, Feb 27, 2019 at 04:36:06PM -0800, Hyun Kwon wrote:
> > > Hi Daniel,
> > >
> > > On Wed, 2019-02-27 at 06:13:45 -0800, Daniel Vetter wrote:
> > > > On Tue, Feb 26, 2019 at 11:20 PM Hyun Kwon <hyun.kwon(a)xilinx.com> wrote:
> > > > >
> > > > > Hi Daniel,
> > > > >
> > > > > Thanks for the comment.
> > > > >
> > > > > On Tue, 2019-02-26 at 04:06:13 -0800, Daniel Vetter wrote:
> > > > > > On Tue, Feb 26, 2019 at 12:53 PM Greg Kroah-Hartman
> > > > > > <gregkh(a)linuxfoundation.org> wrote:
> > > > > > >
> > > > > > > On Sat, Feb 23, 2019 at 12:28:17PM -0800, Hyun Kwon wrote:
> > > > > > > > Add the dmabuf map / unmap interfaces. This allows the user driver
> > > > > > > > to be able to import the external dmabuf and use it from user space.
> > > > > > > >
> > > > > > > > Signed-off-by: Hyun Kwon <hyun.kwon(a)xilinx.com>
> > > > > > > > ---
> > > > > > > > drivers/uio/Makefile | 2 +-
> > > > > > > > drivers/uio/uio.c | 43 +++++++++
> > > > > > > > drivers/uio/uio_dmabuf.c | 210 +++++++++++++++++++++++++++++++++++++++++++
> > > > > > > > drivers/uio/uio_dmabuf.h | 26 ++++++
> > > > > > > > include/uapi/linux/uio/uio.h | 33 +++++++
> > > > > > > > 5 files changed, 313 insertions(+), 1 deletion(-)
> > > > > > > > create mode 100644 drivers/uio/uio_dmabuf.c
> > > > > > > > create mode 100644 drivers/uio/uio_dmabuf.h
> > > > > > > > create mode 100644 include/uapi/linux/uio/uio.h
> > > > > > > >
> > > > > > > > diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile
> > > > > > > > index c285dd2..5da16c7 100644
> > > > > > > > --- a/drivers/uio/Makefile
> > > > > > > > +++ b/drivers/uio/Makefile
> > > > > > > > @@ -1,5 +1,5 @@
>
> [snip]
>
> > > > > > Frankly looks like a ploy to sidestep review by graphics folks. We'd
> > > > > > ask for the userspace first :-)
> > > > >
> > > > > Please refer to pull request [1].
> > > > >
> > > > > For any interest in more details, the libmetal is the abstraction layer
> > > > > which provides platform independent APIs. The backend implementation
> > > > > can be selected per different platforms: ex, rtos, linux,
> > > > > standalone (xilinx),,,. For Linux, it supports UIO / vfio as of now.
> > > > > The actual user space drivers sit on top of libmetal. Such drivers can be
> > > > > found in [2]. This is why I try to avoid any device specific code in
> > > > > Linux kernel.
> > > > >
> > > > > >
> > > > > > Also, exporting dma_addr to userspace is considered a very bad idea.
> > > > >
> > > > > I agree, hence the RFC to pick some brains. :-) Would it make sense
> > > > > if this call doesn't export the physical address, but instead takes
> > > > > only the dmabuf fd and register offsets to be programmed?
> > > > >
> > > > > > If you want to do this properly, you need a minimal in-kernel memory
> > > > > > manager, and those tend to be based on top of drm_gem.c and merged
> > > > > > through the gpu tree. The last place where we accidentally leaked a
> > > > > > dma addr for gpu buffers was in the fbdev code, and we plugged that
> > > > > > one with
> > > > >
> > > > > Could you please help me understand how having a in-kernel memory manager
> > > > > helps? Isn't it just moving the same dmabuf import / paddr export functionality
> > > > > into different modules: kernel memory manager vs uio? In fact, Xilinx does have
> > > > > such memory manager based on drm gem in downstream. But for this time we took
> > > > > the approach of implementing this through the generic dmabuf allocator, ION, and
> > > > > enabling the import capability in the UIO infrastructure instead.
> > > >
> > > > There's a group of people working on upstreaming a xilinx drm driver
> > > > already. Which driver are we talking about? Can you pls provide a link
> > > > to that xilinx drm driver?
> > > >
> > >
> > > The one I was pushing [1] is implemented purely for display, and not
> > > intended for anything other than that as of now. What I'm referring to above
> > > is part of Xilinx FPGA (acceleration) runtime [2]. As far as I know,
> > > it's planned to be upstreamed, but not yet started. The Xilinx runtime
> > > software has its own in-kernel memory manager based on drm_cma_gem with
> > > its own ioctls [3].
> > >
> > > Thanks,
> > > -hyun
> > >
> > > [1] https://patchwork.kernel.org/patch/10513001/
> > > [2] https://github.com/Xilinx/XRT
> > > [3] https://github.com/Xilinx/XRT/tree/master/src/runtime_src/driver/zynq/drm
> >
> > I've done a very quick look only, and yes this is kinda what I'd expect.
> > Doing a small drm gem driver for an fpga/accelerator that needs lots of
> > memory is the right architecture, since at the low level of kernel
> > interfaces a gpu really isn't anything other than an accelerator.
> >
> > And from a very cursory look the gem driver you mentioned (I only scrolled
> > through the ioctl handler quickly) looks reasonable.
>
> Thanks for taking the time to look and share input. But I'd still like to
> understand why a similar ioctl is more reasonable in drm than in uio. Is
> it because such a drm ioctl is vendor specific?
We do have quite a pile of shared infrastructure in drm beyond just the
vendor specific ioctl. So putting accelerator drivers there makes sense,
whether the thing being programmed is a GPU, a neural network accelerator, an FPGA or
something else. The one issue is that we require open source userspace
together with your driver, since just the accelerator shim in the kernel
alone is fairly useless (both for review and for doing anything with it).
But there are also some kernel maintainers who disagree and happily take
drivers originally written for drm and then rewritten for non-drm for
upstream to avoid the drm folks (or at least it very much looks like that,
and happens fairly regularly).
Cheers, Daniel
>
> Thanks,
> -hyun
>
> > -Daniel
> > >
> > > > Thanks, Daniel
> > > >
> > > > > Thanks,
> > > > > -hyun
> > > > >
> > > > > [1] https://github.com/OpenAMP/libmetal/pull/82/commits/951e2762bd487c98919ad12…
> > > > > [2] https://github.com/Xilinx/embeddedsw/tree/master/XilinxProcessorIPLib/drive…
> > > > >
> > > > > >
> > > > > > commit 4be9bd10e22dfc7fc101c5cf5969ef2d3a042d8a (tag:
> > > > > > drm-misc-next-fixes-2018-10-03)
> > > > > > Author: Neil Armstrong <narmstrong(a)baylibre.com>
> > > > > > Date: Fri Sep 28 14:05:55 2018 +0200
> > > > > >
> > > > > > drm/fb_helper: Allow leaking fbdev smem_start
> > > > > >
> > > > > > Together with cuse the above patch should be enough to implement a drm
> > > > > > driver entirely in userspace at least.
> > > > > >
> > > > > > Cheers, Daniel
> > > > > > --
> > > > > > Daniel Vetter
> > > > > > Software Engineer, Intel Corporation
> > > > > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> > > >
> > > >
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Fri, 18 Jan 2019, Andrew F. Davis wrote:
> On 1/18/19 12:37 PM, Liam Mark wrote:
> > The ION begin_cpu_access and end_cpu_access functions use the
> > dma_sync_sg_for_cpu and dma_sync_sg_for_device APIs to perform cache
> > maintenance.
> >
> > Currently it is possible to apply cache maintenance, via the
> > begin_cpu_access and end_cpu_access APIs, to ION buffers which are not
> > dma mapped.
> >
> > The dma sync sg APIs should not be called on sg lists which have not been
> > dma mapped as this can result in cache maintenance being applied to the
> > wrong address. If an sg list has not been dma mapped then its dma_address
> > field has not been populated, some dma ops such as the swiotlb_dma_ops ops
> > use the dma_address field to calculate the address onto which to apply
> > cache maintenance.
> >
> > Also I don’t think we want CMOs to be applied to a buffer which is not
> > dma mapped as the memory should already be coherent for access from the
> > CPU. Any CMOs required for device access are taken care of in the
> > dma_buf_map_attachment and dma_buf_unmap_attachment calls.
> > So really it only makes sense for begin_cpu_access and end_cpu_access to
> > apply CMOs if the buffer is dma mapped.
> >
> > Fix the ION begin_cpu_access and end_cpu_access functions to only apply
> > cache maintenance to buffers which are dma mapped.
> >
> > Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
> > Signed-off-by: Liam Mark <lmark(a)codeaurora.org>
> > ---
> > drivers/staging/android/ion/ion.c | 26 +++++++++++++++++++++-----
> > 1 file changed, 21 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> > index 6f5afab7c1a1..1fe633a7fdba 100644
> > --- a/drivers/staging/android/ion/ion.c
> > +++ b/drivers/staging/android/ion/ion.c
> > @@ -210,6 +210,7 @@ struct ion_dma_buf_attachment {
> > struct device *dev;
> > struct sg_table *table;
> > struct list_head list;
> > + bool dma_mapped;
> > };
> >
> > static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> > @@ -231,6 +232,7 @@ static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> >
> > a->table = table;
> > a->dev = attachment->dev;
> > + a->dma_mapped = false;
> > INIT_LIST_HEAD(&a->list);
> >
> > attachment->priv = a;
> > @@ -261,12 +263,18 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
> > {
> > struct ion_dma_buf_attachment *a = attachment->priv;
> > struct sg_table *table;
> > + struct ion_buffer *buffer = attachment->dmabuf->priv;
> >
> > table = a->table;
> >
> > + mutex_lock(&buffer->lock);
> > if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> > - direction))
> > + direction)) {
> > + mutex_unlock(&buffer->lock);
> > return ERR_PTR(-ENOMEM);
> > + }
> > + a->dma_mapped = true;
> > + mutex_unlock(&buffer->lock);
> >
> > return table;
> > }
> > @@ -275,7 +283,13 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
> > struct sg_table *table,
> > enum dma_data_direction direction)
> > {
> > + struct ion_dma_buf_attachment *a = attachment->priv;
> > + struct ion_buffer *buffer = attachment->dmabuf->priv;
> > +
> > + mutex_lock(&buffer->lock);
> > dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
> > + a->dma_mapped = false;
> > + mutex_unlock(&buffer->lock);
> > }
> >
> > static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> > @@ -346,8 +360,9 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> >
> > mutex_lock(&buffer->lock);
> > list_for_each_entry(a, &buffer->attachments, list) {
>
> When no devices are attached then buffer->attachments is empty and the
> below does not run, so if I understand this patch correctly then what
> you are protecting against is CPU access in the window after
> dma_buf_attach but before dma_buf_map.
>
Yes
> This is the kind of thing that again makes me think a couple more
> ordering requirements on DMA-BUF ops are needed. DMA-BUFs do not require
> the backing memory to be allocated until map time, this is why the
> dma_address field would still be null as you note in the commit message.
> So why should the CPU be performing accesses on a buffer that is not
> actually backed yet?
>
> I can think of two solutions:
>
> 1) Only allow CPU access (mmap, kmap, {begin,end}_cpu_access) while at
> least one device is mapped.
>
Would be quite limiting to clients.
> 2) Treat the CPU access request like a device map request and
> trigger the allocation of backing memory just like if a device map had
> come in.
>
Which is, as you mention, pretty much what we have now (though the buffer
is allocated even earlier).
> I know the current Ion heaps (and most other DMA-BUF exporters) all do
> the allocation up front so the memory is already there, but DMA-BUF was
> designed with late allocation in mind. I have a use-case I'm working on
> that finally exercises this DMA-BUF functionality and I would like to
> have it export through ION. This patch doesn't prevent that, but seems
> like it is endorsing the idea that buffers always need to be backed,
> even before device attach/map has occurred.
>
I didn't interpret the DMA-buf contract as requiring the dma-map to be
called in order for a backing store to be provided, I interpreted it as
meaning there could be a backing store before the dma-map but at the
dma-map call the final backing store configuration would be decided
(perhaps involving migrating the memory to the final backing store).
I will let the dma-buf experts correct me on that.
Limiting userspace clients so that they cannot access buffers until after
they are dma-mapped seems unfortunate to me; dma-mapping usually means a
change of ownership of the memory from the CPU to the device. So generally
while a buffer is dma-mapped the device accesses it (though of course the
CPU is still permitted to access the buffer while it is dma-mapped), and
once the buffer is dma-unmapped the CPU can access it. This is how the DMA
APIs are frequently used, and the changes above make ION align more
closely with that usage. Basically, when the buffer is not dma-mapped the
CPU doesn't need any CMOs to access the buffer (and ION ensures no CMOs
are applied), but if the CPU does want to access the buffer while it is
dma-mapped then ION ensures that the appropriate CMOs are applied.
It seems like a legitimate use case to me to allow clients to access the
buffer before (and after) dma-mapping, for example post-processing of buffers.
> Either of the above two solutions would need to target the DMA-BUF
> framework,
>
> Sumit,
>
> Any comment?
>
> Thanks,
> Andrew
>
> > - dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
> > - direction);
> > + if (a->dma_mapped)
> > + dma_sync_sg_for_cpu(a->dev, a->table->sgl,
> > + a->table->nents, direction);
> > }
> >
> > unlock:
> > @@ -369,8 +384,9 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
> >
> > mutex_lock(&buffer->lock);
> > list_for_each_entry(a, &buffer->attachments, list) {
> > - dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
> > - direction);
> > + if (a->dma_mapped)
> > + dma_sync_sg_for_device(a->dev, a->table->sgl,
> > + a->table->nents, direction);
> > }
> > mutex_unlock(&buffer->lock);
> >
> >
>
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project