Hi Jonathan,
This is v6 of my patchset, which introduces a new interface based on
DMABUF objects.
The code was updated quite a bit, based on the feedback on the list for
this patchset, but also on the feedback I received on the FunctionFS
patchset that I'm upstreaming in parallel [1], where the DMABUF
handling code is very similar.
See below for the full changelog.
I decided to drop the scope-based memory management for dma_buf and
I hope you are OK with that. Christian wants the patch(es) to support
scope-based memory management in dma-buf as a separate patchset; once
it's in, I will gladly send a follow-up patch to use __free() where it
makes sense.
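As an illustration of what that follow-up could look like, here is a
minimal sketch, assuming a DEFINE_FREE() helper for dma_buf_put() gets
added to dma-buf first (the helper is my assumption, not an existing
API):

#include <linux/cleanup.h>
#include <linux/dma-buf.h>

/* Assumed to come from the separate dma-buf patchset; not upstream yet. */
DEFINE_FREE(dma_buf_put, struct dma_buf *,
	    if (!IS_ERR_OR_NULL(_T)) dma_buf_put(_T))

static int example_import(int fd)
{
	/* The reference is dropped automatically on every return path. */
	struct dma_buf *dmabuf __free(dma_buf_put) = dma_buf_get(fd);

	if (IS_ERR(dmabuf))
		return PTR_ERR(dmabuf);

	/* ... use dmabuf ... */
	return 0;
}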
For performance numbers, I'll point you to the cover letter for my v5
patchset [2].
This patchset was based on next-20240129.
Cheers,
-Paul
[1] https://lore.kernel.org/all/20230322092118.9213-1-paul@crapouillou.net/
[2] https://lore.kernel.org/linux-iio/219abc43b4fdd4a13b307ed2efaa0e6869e68e3f.…
---
Changelog:
* [2/6]:
- Use new prototype for axi_dmac_alloc_desc() as it changed upstream
* [3/6]:
- Remove dead code in iio_dma_resv_lock()
- Fix non-block actually blocking
- Cache dma_buf_attachment instead of mapping/unmapping it for every
transfer
- Return -EINVAL instead of IIO_IOCTL_UNHANDLED for unknown ioctl
- Make .block_enqueue() callback take a dma_fence pointer, which
will be passed to iio_buffer_signal_dmabuf_done() instead of the
dma_buf_attachment; and remove the backpointer from the priv
structure to the dma_fence.
- Use dma_fence_begin/end_signalling in the dma_fence critical
sections
- Unref dma_fence and dma_buf_attachment in worker, because they
might try to lock the dma_resv, which would deadlock.
- Add buffer ops to lock/unlock the queue. This is motivated by the
fact that once the dma_fence has been installed, we cannot lock
anything anymore - so the queue must be locked before the
dma_fence is installed (see the sketch after this changelog).
- Use 'long retl' variable to handle the return value of
dma_resv_wait_timeout()
- Protect dmabufs list access with a mutex
- Rework iio_buffer_find_attachment() to use the internal dmabufs
list, instead of messing with the dmabuf's private data.
- Add an atomically-increasing sequence number for fences
* [4/6]:
- Update iio_dma_buffer_enqueue_dmabuf() to take a dma_fence pointer
- Pass that dma_fence pointer along to
iio_buffer_signal_dmabuf_done()
- Add iio_dma_buffer_lock_queue() / iio_dma_buffer_unlock_queue()
- Do not lock the queue in iio_dma_buffer_enqueue_dmabuf().
The caller will ensure that it has been locked already.
- Replace "int += bool;" by "if (bool) int++;"
- Use dma_fence_begin/end_signalling in the dma_fence critical
sections
- Use one "num_dmabufs" fields instead of one "num_blocks" and one
"num_fileio_blocks". Make it an atomic_t, which makes it possible
to decrement it atomically in iio_buffer_block_release() without
having to lock the queue mutex; and in turn, it means that we
don't need to use iio_buffer_block_put_atomic() everywhere to
avoid locking the queue mutex twice.
- Use cleanup.h guard(mutex) when possible
- Explicitly list all states in the switch in
iio_dma_can_enqueue_block()
- Rename iio_dma_buffer_fileio_mode() to
iio_dma_buffer_can_use_fileio(), and add a comment explaining why
it cannot race vs. DMABUF.
* [5/6]:
- Populate .lock_queue / .unlock_queue callbacks
- Switch to atomic memory allocations in .submit_queue, because of
the dma_fence critical section
- Make sure that the size of the scatterlist is enough
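To illustrate the ordering described in the [3/6] and [4/6] notes
above, here is a rough sketch of the enqueue path (all names are
illustrative, not the actual patch code): the queue is locked through
the new buffer op before the fence is installed, and submission runs
inside a dma_fence signalling critical section.

static int example_enqueue_dmabuf(struct example_queue *q,
				  struct example_block *block,
				  struct dma_fence *fence)
{
	bool cookie;
	int ret;

	/*
	 * The queue must be locked *before* the fence is installed,
	 * because once it is, nothing may block on a lock anymore.
	 */
	q->ops->lock_queue(q);

	ret = example_install_fence(block, fence); /* hypothetical helper */
	if (ret)
		goto out_unlock;

	/*
	 * Everything between begin/end runs in the dma_fence signalling
	 * critical section; cleanup that could take dma_resv locks is
	 * deferred to a worker instead.
	 */
	cookie = dma_fence_begin_signalling();
	ret = q->ops->submit_block(q, block);
	dma_fence_end_signalling(cookie);

out_unlock:
	q->ops->unlock_queue(q);
	return ret;
}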
---
Paul Cercueil (6):
dmaengine: Add API function dmaengine_prep_slave_dma_vec()
dmaengine: dma-axi-dmac: Implement device_prep_slave_dma_vec
iio: core: Add new DMABUF interface infrastructure
iio: buffer-dma: Enable support for DMABUFs
iio: buffer-dmaengine: Support new DMABUF based userspace API
Documentation: iio: Document high-speed DMABUF based API
Documentation/iio/dmabuf_api.rst | 54 ++
Documentation/iio/index.rst | 2 +
drivers/dma/dma-axi-dmac.c | 40 ++
drivers/iio/buffer/industrialio-buffer-dma.c | 181 ++++++-
.../buffer/industrialio-buffer-dmaengine.c | 58 ++-
drivers/iio/industrialio-buffer.c | 462 ++++++++++++++++++
include/linux/dmaengine.h | 25 +
include/linux/iio/buffer-dma.h | 31 ++
include/linux/iio/buffer_impl.h | 33 ++
include/uapi/linux/iio/buffer.h | 22 +
10 files changed, 891 insertions(+), 17 deletions(-)
create mode 100644 Documentation/iio/dmabuf_api.rst
--
2.43.0
Hi,
This is v6 of my patchset, which adds a new DMABUF import interface to
FunctionFS.
The cache coherency issue discussed after my v5 is a tangential
problem, not directly related to this new interface, so I decided to
drop the dma_buf_begin/end_access() functions for now - but I'm open
to the idea of re-introducing them in a subsequent patchset.
The patchset was rebased on next-20240129.
Cheers,
-Paul
---
Changelog:
* Drop v5's patches [1/6] and [2/6].
* [3/4]:
- Drop use of dma_buf_begin/end_access(). We now make the assumption
that the devices attached to the DMABUFs must be coherent between
themselves. The cache coherency issue is a tangential problem, and
those functions can be re-introduced in a subsequent patchset.
- Unqueue pending requests on detach. Otherwise, when closing the data
endpoint the DMABUF will never be signaled.
- Use list_for_each_entry_safe() in ffs_dmabuf_detach(), because there
is a list_del() in there (see the sketch after this changelog).
- Use pr_vdebug() instead of pr_debug()
- Rename ffs_dmabuf_unmap_work() -> ffs_dmabuf_cleanup()
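For reference, the unqueue-on-detach change needs the _safe iterator
because entries are deleted while walking the list; a generic sketch
(the struct and function names are illustrative, not the f_fs code):

#include <linux/list.h>

struct pending_req {
	struct list_head list;
	/* ... */
};

static void example_unqueue_all(struct list_head *pending)
{
	struct pending_req *req, *tmp;

	/*
	 * The _safe variant keeps a lookahead pointer, so list_del()
	 * on the current entry does not break the iteration.
	 */
	list_for_each_entry_safe(req, tmp, pending, list) {
		list_del(&req->list);
		/* signal the DMABUF / complete the request here */
	}
}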
---
Paul Cercueil (4):
usb: gadget: Support already-mapped DMA SGs
usb: gadget: functionfs: Factorize wait-for-endpoint code
usb: gadget: functionfs: Add DMABUF import interface
Documentation: usb: Document FunctionFS DMABUF API
Documentation/usb/functionfs.rst | 36 ++
drivers/usb/gadget/Kconfig | 1 +
drivers/usb/gadget/function/f_fs.c | 513 ++++++++++++++++++++++++++--
drivers/usb/gadget/udc/core.c | 7 +-
include/linux/usb/gadget.h | 2 +
include/uapi/linux/usb/functionfs.h | 41 +++
6 files changed, 579 insertions(+), 21 deletions(-)
--
2.43.0
[V4 was: "iio: Add buffer write() support"][1]
Hi Jonathan,
This is a respin of V3 of my patchset, which introduced a new
interface based on DMABUF objects [2].
The V4 was a split of the patchset, to attempt to upstream buffer
write() support first. But since there is no current user upstream, it
was not merged. This V5 is about doing the opposite, and contains the
new DMABUF interface, without adding the buffer write() support. It can
already be used with the upstream adi-axi-adc driver.
In user-space, Libiio uses it to transfer blocks of samples back and
forth between the hardware and the applications, without having to
copy the data.
On a ZCU102 with a FMComms3 daughter board, running Libiio from the
pcercuei/dev-new-dmabuf-api branch [3], compiled with
WITH_LOCAL_DMABUF_API=OFF (so that it uses fileio):
sudo utils/iio_rwdev -b 4096 -B cf-ad9361-lpc
Throughput: 116 MiB/s
Same hardware, with the DMABUF API (WITH_LOCAL_DMABUF_API=ON):
sudo utils/iio_rwdev -b 4096 -B cf-ad9361-lpc
Throughput: 475 MiB/s
This benchmark only measures the speed at which the data can be fetched
to iio_rwdev's internal buffers, and does not actually try to read the
data (e.g. to pipe it to stdout). It shows that fetching the data is
more than 4x faster using the new interface.
When actually reading the data, the performance difference isn't that
impressive (maybe because in case of DMABUF the data is not in cache):
WITH_LOCAL_DMABUF_API=OFF (so that it uses fileio):
sudo utils/iio_rwdev -b 4096 cf-ad9361-lpc | dd of=/dev/zero status=progress
2446422528 bytes (2.4 GB, 2.3 GiB) copied, 22 s, 111 MB/s
WITH_LOCAL_DMABUF_API=ON:
sudo utils/iio_rwdev -b 4096 cf-ad9361-lpc | dd of=/dev/zero status=progress
2334388736 bytes (2.3 GB, 2.2 GiB) copied, 21 s, 114 MB/s
One interesting thing to note is that fileio is (currently) actually
faster than the DMABUF interface if you increase the buffer size a lot.
My explanation is that the cache invalidation routine takes more and
more time the bigger the DMABUF gets. This is because the DMABUF is
backed by small-size pages, so e.g. a 64 MiB DMABUF backed by 4 KiB
pages is up to 64 MiB / 4 KiB = 16384 pages, which have to be
invalidated one by one. This can be addressed by using huge pages, but
the udmabuf driver does not (yet) support creating DMABUFs backed by
huge pages.
Anyway, the real benefits happen when the DMABUFs are either shared
between IIO devices, or between the IIO subsystem and another
filesystem. In that case, the DMABUFs are simply passed around drivers,
without the data being copied at any moment.
We use that feature to transfer samples from our transceivers to USB,
using a DMABUF interface to FunctionFS [4].
This drastically increases the throughput, to about 274 MiB/s over a
USB3 link, vs. 127 MiB/s using IIO's fileio interface + write() to the
FunctionFS endpoints, for a lower CPU usage (0.85 vs. 0.65 load avg.).
Based on linux-next/next-20231219.
Cheers,
-Paul
[1] https://lore.kernel.org/all/20230807112113.47157-1-paul@crapouillou.net/
[2] https://lore.kernel.org/all/20230403154800.215924-1-paul@crapouillou.net/
[3] https://github.com/analogdevicesinc/libiio/tree/pcercuei/dev-new-dmabuf-api
[4] https://lore.kernel.org/all/20230322092118.9213-1-paul@crapouillou.net/
---
Changelog:
- [3/8]: Replace V3's dmaengine_prep_slave_dma_array() with a new
dmaengine_prep_slave_dma_vec(), which uses a new 'dma_vec' struct
(see the sketch after this changelog).
Note that at some point we will need to support cyclic transfers
using dmaengine_prep_slave_dma_vec(). Maybe with a new "flags"
parameter to the function?
- [4/8]: Implement .device_prep_slave_dma_vec() instead of V3's
.device_prep_slave_dma_array().
@Vinod: this patch will cause a small conflict with my other
patchset adding scatter-gather support to the axi-dmac driver.
This patch adds a call to axi_dmac_alloc_desc(num_sgs), but the
prototype of this function changed in my other patchset - it would
have to be passed the "chan" variable. I don't know how you prefer it
to be resolved. Worst case scenario (and if @Jonathan is okay with
that) this one patch can be re-sent later, but it would make this
patchset less "atomic".
- [5/8]:
- Use dev_err() instead of pr_err()
- Inline to_iio_dma_fence()
- Add comment to explain why we unref twice when detaching dmabuf
- Remove TODO comment. It is actually safe to free the file's
private data even when transfers are still pending because it
won't be accessed.
- Fix documentation of new fields in struct iio_buffer_access_funcs
- iio_dma_resv_lock() does not need to be exported, make it static
- [7/8]:
- Use the new dmaengine_prep_slave_dma_vec().
- Restrict to input buffers, since output buffers are not yet
supported by IIO buffers.
- [8/8]:
Use description lists for the documentation of the three new IOCTLs
instead of abusing subsections.
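For readers who have not seen [3/8] yet, the idea is a small vector
type describing pre-mapped DMA regions; the sketch below is my
paraphrase, so refer to the patch for the authoritative definitions:

/* Paraphrased from patch [3/8]; see the patch for the real code. */
struct dma_vec {
	dma_addr_t addr;
	size_t len;
};

/* Prepare one transfer covering several pre-mapped DMA regions. */
struct dma_async_tx_descriptor *
dmaengine_prep_slave_dma_vec(struct dma_chan *chan,
			     const struct dma_vec *vecs, size_t nents,
			     enum dma_transfer_direction dir);

Compared to V3's plain array of addresses, the length travels with
each address, which maps naturally onto hardware scatter-gather
descriptors.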
---
Alexandru Ardelean (1):
iio: buffer-dma: split iio_dma_buffer_fileio_free() function
Paul Cercueil (7):
iio: buffer-dma: Get rid of outgoing queue
dmaengine: Add API function dmaengine_prep_slave_dma_vec()
dmaengine: dma-axi-dmac: Implement device_prep_slave_dma_vec
iio: core: Add new DMABUF interface infrastructure
iio: buffer-dma: Enable support for DMABUFs
iio: buffer-dmaengine: Support new DMABUF based userspace API
Documentation: iio: Document high-speed DMABUF based API
Documentation/iio/dmabuf_api.rst | 54 +++
Documentation/iio/index.rst | 2 +
drivers/dma/dma-axi-dmac.c | 40 ++
drivers/iio/buffer/industrialio-buffer-dma.c | 242 ++++++++---
.../buffer/industrialio-buffer-dmaengine.c | 52 ++-
drivers/iio/industrialio-buffer.c | 402 ++++++++++++++++++
include/linux/dmaengine.h | 25 ++
include/linux/iio/buffer-dma.h | 33 +-
include/linux/iio/buffer_impl.h | 26 ++
include/uapi/linux/iio/buffer.h | 22 +
10 files changed, 836 insertions(+), 62 deletions(-)
create mode 100644 Documentation/iio/dmabuf_api.rst
--
2.43.0
On 1/26/24 1:25 AM, Kasireddy, Vivek wrote:
>>>> Currently this driver creates a SGT table using the CPU as the
>>>> target device, then performs the dma_sync operations against
>>>> that SGT. This is backwards to how DMA-BUFs are supposed to behave.
>>>> This may have worked for the case where these buffers were given
>>>> only back to the same CPU that produced them as in the QEMU case.
>>>> And only then because the original author had the dma_sync
>>>> operations also backwards, syncing for the "device" on begin_cpu.
>>>> This was noticed and "fixed" in this patch[0].
>>>>
>>>> That then meant we were sync'ing from the CPU to the CPU using
>>>> a pseudo-device "miscdevice". Which then caused another issue
>>>> due to the miscdevice not having a proper DMA mask (and why should
>>>> it, the CPU is not a DMA device). The fix for that was an even
>>>> more egregious hack[1] that declares the CPU is coherent with
>>>> itself and can access its own memory space..
>>>>
>>>> Unwind all this and perform the correct action by doing the dma_sync
>>>> operations for each device currently attached to the backing buffer.
>>> Makes sense.
>>>
>>>>
>>>> [0] commit 1ffe09590121 ("udmabuf: fix dma-buf cpu access")
>>>> [1] commit 9e9fa6a9198b ("udmabuf: Set the DMA mask for the udmabuf
>>>> device (v2)")
>>>>
>>>> Signed-off-by: Andrew Davis <afd@ti.com>
>>>> ---
>>>> drivers/dma-buf/udmabuf.c | 41 +++++++++++++++------------------------
>>>> 1 file changed, 16 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
>>>> index 3a23f0a7d112a..ab6764322523c 100644
>>>> --- a/drivers/dma-buf/udmabuf.c
>>>> +++ b/drivers/dma-buf/udmabuf.c
>>>> @@ -26,8 +26,6 @@ MODULE_PARM_DESC(size_limit_mb, "Max size of a
>>>> dmabuf, in megabytes. Default is
>>>> struct udmabuf {
>>>> pgoff_t pagecount;
>>>> struct page **pages;
>>>> - struct sg_table *sg;
>>>> - struct miscdevice *device;
>>>> struct list_head attachments;
>>>> struct mutex lock;
>>>> };
>>>> @@ -169,12 +167,8 @@ static void unmap_udmabuf(struct
>>>> dma_buf_attachment *at,
>>>> static void release_udmabuf(struct dma_buf *buf)
>>>> {
>>>> struct udmabuf *ubuf = buf->priv;
>>>> - struct device *dev = ubuf->device->this_device;
>>>> pgoff_t pg;
>>>>
>>>> - if (ubuf->sg)
>>>> - put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
>>> What happens if the last importer maps the dmabuf but erroneously
>>> closes it immediately? Would unmap somehow get called in this case?
>>>
>>
>> Good question, had to scan the framework code a bit here. I thought
>> closing a DMABUF handle would automatically unwind any current
>> attachments/mappings, but it seems nothing in the framework does that.
>>
>> Looks like that is up to the importing drivers[0]:
>>
>>> Once a driver is done with a shared buffer it needs to call
>>> dma_buf_detach() (after cleaning up any mappings) and then
>>> release the reference acquired with dma_buf_get() by
>>> calling dma_buf_put().
>>
>> So closing a DMABUF after mapping without first unmapping it would
>> be a bug in the importer; it is not the exporter's problem to check
> It may be a bug in the importer but wouldn't the memory associated
> with the sg table and attachment get leaked if unmap doesn't get called
> in this scenario?
>
Yes, the attachment data would be leaked if detach was not called,
but that is true for all DMABUF exporters. The .release() callback
is meant to be the mirror of the export function and it only cleans
up that. Same for attach/detach, map/unmap, etc. If these calls
are not balanced then yes, they can leak memory.
Since balanced calls are mandated by the API contract, checking the
balance should be done at that level, not in each and every exporter.
If your comment is that we should add those checks into the DMABUF
framework layer then I would agree.
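If the framework were to grow such checks, one hypothetical shape
(sketched here for discussion only, not existing code) is a warning in
the common release path while attachments are still live:

static void example_dma_buf_release(struct dma_buf *dmabuf)
{
	/*
	 * An importer that forgot dma_buf_detach() leaks its
	 * attachment; a check here would cover every exporter at once.
	 */
	WARN(!list_empty(&dmabuf->attachments),
	     "dma-buf released with live attachments\n");

	dmabuf->ops->release(dmabuf);
}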
Andrew
> Thanks,
> Vivek
>
>> for (although some more warnings in the framework checking for that
>> might not be a bad idea..).
>>
>> Andrew
>>
>> [0] https://www.kernel.org/doc/html/v6.7/driver-api/dma-buf.html
>>
>>> Thanks,
>>> Vivek
>>>
>>>> -
>>>> for (pg = 0; pg < ubuf->pagecount; pg++)
>>>> put_page(ubuf->pages[pg]);
>>>> kfree(ubuf->pages);
>>>> @@ -185,33 +179,31 @@ static int begin_cpu_udmabuf(struct dma_buf
>>>> *buf,
>>>> enum dma_data_direction direction)
>>>> {
>>>> struct udmabuf *ubuf = buf->priv;
>>>> - struct device *dev = ubuf->device->this_device;
>>>> - int ret = 0;
>>>> -
>>>> - if (!ubuf->sg) {
>>>> - ubuf->sg = get_sg_table(dev, buf, direction);
>>>> - if (IS_ERR(ubuf->sg)) {
>>>> - ret = PTR_ERR(ubuf->sg);
>>>> - ubuf->sg = NULL;
>>>> - }
>>>> - } else {
>>>> - dma_sync_sg_for_cpu(dev, ubuf->sg->sgl, ubuf->sg->nents,
>>>> - direction);
>>>> - }
>>>> + struct udmabuf_attachment *a;
>>>>
>>>> - return ret;
>>>> + mutex_lock(&ubuf->lock);
>>>> +
>>>> + list_for_each_entry(a, &ubuf->attachments, list)
>>>> + dma_sync_sgtable_for_cpu(a->dev, a->table, direction);
>>>> +
>>>> + mutex_unlock(&ubuf->lock);
>>>> +
>>>> + return 0;
>>>> }
>>>>
>>>> static int end_cpu_udmabuf(struct dma_buf *buf,
>>>> enum dma_data_direction direction)
>>>> {
>>>> struct udmabuf *ubuf = buf->priv;
>>>> - struct device *dev = ubuf->device->this_device;
>>>> + struct udmabuf_attachment *a;
>>>>
>>>> - if (!ubuf->sg)
>>>> - return -EINVAL;
>>>> + mutex_lock(&ubuf->lock);
>>>> +
>>>> + list_for_each_entry(a, &ubuf->attachments, list)
>>>> + dma_sync_sgtable_for_device(a->dev, a->table, direction);
>>>> +
>>>> + mutex_unlock(&ubuf->lock);
>>>>
>>>> - dma_sync_sg_for_device(dev, ubuf->sg->sgl, ubuf->sg->nents,
>>>> direction);
>>>> return 0;
>>>> }
>>>>
>>>> @@ -307,7 +299,6 @@ static long udmabuf_create(struct miscdevice
>>>> *device,
>>>> exp_info.priv = ubuf;
>>>> exp_info.flags = O_RDWR;
>>>>
>>>> - ubuf->device = device;
>>>> buf = dma_buf_export(&exp_info);
>>>> if (IS_ERR(buf)) {
>>>> ret = PTR_ERR(buf);
>>>> --
>>>> 2.39.2
>>>
On 1/24/24 5:05 PM, Kasireddy, Vivek wrote:
> Hi Andrew,
>
>> Currently this driver creates a SGT table using the CPU as the
>> target device, then performs the dma_sync operations against
>> that SGT. This is backwards to how DMA-BUFs are supposed to behave.
>> This may have worked for the case where these buffers were given
>> only back to the same CPU that produced them as in the QEMU case.
>> And only then because the original author had the dma_sync
>> operations also backwards, syncing for the "device" on begin_cpu.
>> This was noticed and "fixed" in this patch[0].
>>
>> That then meant we were sync'ing from the CPU to the CPU using
>> a pseudo-device "miscdevice". Which then caused another issue
>> due to the miscdevice not having a proper DMA mask (and why should
>> it, the CPU is not a DMA device). The fix for that was an even
>> more egregious hack[1] that declares the CPU is coherent with
>> itself and can access its own memory space..
>>
>> Unwind all this and perform the correct action by doing the dma_sync
>> operations for each device currently attached to the backing buffer.
> Makes sense.
>
>>
>> [0] commit 1ffe09590121 ("udmabuf: fix dma-buf cpu access")
>> [1] commit 9e9fa6a9198b ("udmabuf: Set the DMA mask for the udmabuf
>> device (v2)")
>>
>> Signed-off-by: Andrew Davis <afd@ti.com>
>> ---
>> drivers/dma-buf/udmabuf.c | 41 +++++++++++++++------------------------
>> 1 file changed, 16 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
>> index 3a23f0a7d112a..ab6764322523c 100644
>> --- a/drivers/dma-buf/udmabuf.c
>> +++ b/drivers/dma-buf/udmabuf.c
>> @@ -26,8 +26,6 @@ MODULE_PARM_DESC(size_limit_mb, "Max size of a
>> dmabuf, in megabytes. Default is
>> struct udmabuf {
>> pgoff_t pagecount;
>> struct page **pages;
>> - struct sg_table *sg;
>> - struct miscdevice *device;
>> struct list_head attachments;
>> struct mutex lock;
>> };
>> @@ -169,12 +167,8 @@ static void unmap_udmabuf(struct
>> dma_buf_attachment *at,
>> static void release_udmabuf(struct dma_buf *buf)
>> {
>> struct udmabuf *ubuf = buf->priv;
>> - struct device *dev = ubuf->device->this_device;
>> pgoff_t pg;
>>
>> - if (ubuf->sg)
>> - put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
> What happens if the last importer maps the dmabuf but erroneously
> closes it immediately? Would unmap somehow get called in this case?
>
Good question, had to scan the framework code a bit here. I thought
closing a DMABUF handle would automatically unwind any current
attachments/mappings, but it seems nothing in the framework does that.
Looks like that is up to the importing drivers[0]:
> Once a driver is done with a shared buffer it needs to call
> dma_buf_detach() (after cleaning up any mappings) and then
> release the reference acquired with dma_buf_get() by
> calling dma_buf_put().
So closing a DMABUF after mapping without first unmapping it would
be a bug in the importer; it is not the exporter's problem to check
for (although some more warnings in the framework checking for that
might not be a bad idea..).
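For reference, the balanced importer sequence the documentation
describes looks roughly like this (a minimal sketch using the core
dma-buf API, error handling elided):

static int example_import(struct device *dev, int fd)
{
	struct dma_buf *dmabuf = dma_buf_get(fd);
	struct dma_buf_attachment *attach = dma_buf_attach(dmabuf, dev);
	struct sg_table *sgt = dma_buf_map_attachment(attach,
						      DMA_BIDIRECTIONAL);

	/* ... DMA to/from the buffer ... */

	/*
	 * Unwind in reverse order: unmap, detach, then drop the
	 * reference. Skipping any of these is the importer bug
	 * discussed above.
	 */
	dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
	dma_buf_detach(dmabuf, attach);
	dma_buf_put(dmabuf);
	return 0;
}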
Andrew
[0] https://www.kernel.org/doc/html/v6.7/driver-api/dma-buf.html
> Thanks,
> Vivek
>
>> -
>> for (pg = 0; pg < ubuf->pagecount; pg++)
>> put_page(ubuf->pages[pg]);
>> kfree(ubuf->pages);
>> @@ -185,33 +179,31 @@ static int begin_cpu_udmabuf(struct dma_buf
>> *buf,
>> enum dma_data_direction direction)
>> {
>> struct udmabuf *ubuf = buf->priv;
>> - struct device *dev = ubuf->device->this_device;
>> - int ret = 0;
>> -
>> - if (!ubuf->sg) {
>> - ubuf->sg = get_sg_table(dev, buf, direction);
>> - if (IS_ERR(ubuf->sg)) {
>> - ret = PTR_ERR(ubuf->sg);
>> - ubuf->sg = NULL;
>> - }
>> - } else {
>> - dma_sync_sg_for_cpu(dev, ubuf->sg->sgl, ubuf->sg->nents,
>> - direction);
>> - }
>> + struct udmabuf_attachment *a;
>>
>> - return ret;
>> + mutex_lock(&ubuf->lock);
>> +
>> + list_for_each_entry(a, &ubuf->attachments, list)
>> + dma_sync_sgtable_for_cpu(a->dev, a->table, direction);
>> +
>> + mutex_unlock(&ubuf->lock);
>> +
>> + return 0;
>> }
>>
>> static int end_cpu_udmabuf(struct dma_buf *buf,
>> enum dma_data_direction direction)
>> {
>> struct udmabuf *ubuf = buf->priv;
>> - struct device *dev = ubuf->device->this_device;
>> + struct udmabuf_attachment *a;
>>
>> - if (!ubuf->sg)
>> - return -EINVAL;
>> + mutex_lock(&ubuf->lock);
>> +
>> + list_for_each_entry(a, &ubuf->attachments, list)
>> + dma_sync_sgtable_for_device(a->dev, a->table, direction);
>> +
>> + mutex_unlock(&ubuf->lock);
>>
>> - dma_sync_sg_for_device(dev, ubuf->sg->sgl, ubuf->sg->nents,
>> direction);
>> return 0;
>> }
>>
>> @@ -307,7 +299,6 @@ static long udmabuf_create(struct miscdevice
>> *device,
>> exp_info.priv = ubuf;
>> exp_info.flags = O_RDWR;
>>
>> - ubuf->device = device;
>> buf = dma_buf_export(&exp_info);
>> if (IS_ERR(buf)) {
>> ret = PTR_ERR(buf);
>> --
>> 2.39.2
>
From: Jason-jh Lin <jason-jh.lin@mediatek.corp-partner.google.com>
The property "mediatek,gce-events" is used for the GCE event IDs
corresponding to hardware event signals sent by the hardware or a
software driver.
If the mailbox providers or consumers want to manipulate the value of
the event ID, they need to know the specific event ID.
Since the mediatek,gce-events property is used by both mailbox
providers and consumers, we add a mediatek,gce-props.yaml to hold the
common GCE properties like mediatek,gce-events.
Changes in v4:
1. Fix some typos.
2. Change maxItems of gce-events from 1024 to 32.
Changes in v3:
1. Add more description and fix typos and grammar.
2. Fix $ref to use the full path.
Changes in v2:
1. Add mediatek,gce-props.yaml for other bindings to reference.
Jason-JH.Lin (3):
dt-bindings: mailbox: Add mediatek,gce-props.yaml
dt-bindings: media: mediatek: mdp: Change mediatek,gce-events to
reference
dt-bindings: soc: mediatek: Change mediatek,gce-events to reference
.../bindings/mailbox/mediatek,gce-props.yaml | 52 +++++++++++++++++++
.../bindings/media/mediatek,mdp3-rdma.yaml | 11 ++--
.../bindings/media/mediatek,mdp3-rsz.yaml | 12 ++---
.../bindings/media/mediatek,mdp3-wrot.yaml | 12 ++---
.../bindings/soc/mediatek/mediatek,ccorr.yaml | 12 ++---
.../bindings/soc/mediatek/mediatek,mutex.yaml | 11 ++--
.../bindings/soc/mediatek/mediatek,wdma.yaml | 12 ++---
7 files changed, 74 insertions(+), 48 deletions(-)
create mode 100644 Documentation/devicetree/bindings/mailbox/mediatek,gce-props.yaml
--
2.18.0
Hello Felix Kuehling,
The patch 1819200166ce: "drm/amdkfd: Export DMABufs from KFD using
GEM handles" from Aug 24, 2023 (linux-next), leads to the following
Smatch static checker warning:
drivers/dma-buf/dma-buf.c:729 dma_buf_get()
warn: fd used after fd_install() 'fd'
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
    809 static int kfd_mem_export_dmabuf(struct kgd_mem *mem)
    810 {
    811         if (!mem->dmabuf) {
    812                 struct amdgpu_device *bo_adev;
    813                 struct dma_buf *dmabuf;
    814                 int r, fd;
    815
    816                 bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
    817                 r = drm_gem_prime_handle_to_fd(&bo_adev->ddev, bo_adev->kfd.client.file,
    818                                                mem->gem_handle,
    819                                                mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
    820                                                DRM_RDWR : 0, &fd);
                                                                      ^^^
The drm_gem_prime_handle_to_fd() function does an fd_install() and
returns the result as "fd".

    821                 if (r)
    822                         return r;
    823                 dmabuf = dma_buf_get(fd);
                                             ^^
Then we do another fget() inside dma_buf_get(). I'm not an expert,
but this looks wrong. We can't assume that the dmabuf here is the
same one from drm_gem_prime_handle_to_fd(), because the user could
change it after the fd_install(). I suspect drm_gem_prime_handle_to_fd()
should pass the dmabuf back instead (a rough sketch of that idea
follows the listing).
We had several CVEs similar to this, such as CVE-2022-1998.

    824                 close_fd(fd);
    825                 if (WARN_ON_ONCE(IS_ERR(dmabuf)))
    826                         return PTR_ERR(dmabuf);
    827                 mem->dmabuf = dmabuf;
    828         }
    829
    830         return 0;
    831 }
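To make the suggestion concrete, a rough sketch of that idea (the
helper name and signature are invented here for illustration; nothing
like this exists yet):

/*
 * Hypothetical: like drm_gem_prime_handle_to_fd(), but hands the
 * dma_buf reference straight back to the caller, so a concurrent
 * close()/dup2() on an installed fd can no longer swap the file
 * out from under a later dma_buf_get().
 */
struct dma_buf *drm_gem_prime_handle_to_dmabuf(struct drm_device *dev,
					       struct drm_file *file_priv,
					       uint32_t handle,
					       uint32_t flags);

The KFD code above would then keep the returned reference directly in
mem->dmabuf, with no fd round-trip and therefore no window for the fd
table entry to change.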
regards,
dan carpenter
From: Jason-jh Lin <jason-jh.lin@mediatek.corp-partner.google.com>
The property "mediatek,gce-events" is used for the GCE event IDs
corresponding to hardware event signals sent by the hardware or a
software driver.
If the mailbox providers or consumers want to manipulate the value of
the event ID, they need to know the specific event ID.
Since the mediatek,gce-events property is used by both mailbox
providers and consumers, we add a mediatek,gce-props.yaml to hold the
common GCE properties like mediatek,gce-events.
Changes in v3:
1. Add more description and fix typos and grammar.
2. Fix $ref to use the full path.
Changes in v2:
1. Add mediatek,gce-props.yaml for other bindings to reference.
Jason-JH.Lin (3):
dt-bindings: mailbox: Add mediatek,gce-props.yaml
dt-bindings: media: mediatek: mdp: Change mediatek,gce-events to
reference
dt-bindings: soc: mediatek: Change mediatek,gce-events to reference
.../bindings/mailbox/mediatek,gce-props.yaml | 52 +++++++++++++++++++
.../bindings/media/mediatek,mdp3-rdma.yaml | 11 ++--
.../bindings/media/mediatek,mdp3-rsz.yaml | 12 ++---
.../bindings/media/mediatek,mdp3-wrot.yaml | 12 ++---
.../bindings/soc/mediatek/mediatek,ccorr.yaml | 12 ++---
.../bindings/soc/mediatek/mediatek,mutex.yaml | 11 ++--
.../bindings/soc/mediatek/mediatek,wdma.yaml | 12 ++---
7 files changed, 74 insertions(+), 48 deletions(-)
create mode 100644 Documentation/devicetree/bindings/mailbox/mediatek,gce-props.yaml
--
2.18.0