On 09/07/2018 08:20 AM, jun qian wrote:
> The values in the wrong positions will cause misunderstanding
> when the debug information is displayed.
>
I think the existing order is okay, it's just not separated
well. It's "$count pages of order $order". I also just acked a
patch to remove all this code because it's dead on mainline
anyway. For future work, we should look to make the debugfs
output clearer to avoid ambiguity.
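For illustration (assuming 4 KiB pages): a pool holding 16 order-4 highmem
pages currently prints

    16 order 4 highmem pages 1048576 total

which is meant to be read as "16 pages of order 4, 1048576 bytes total",
not as anything involving an order of 16.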
Thanks,
Laura
> Signed-off-by: jun qian <hangdianqj(a)163.com>
> ---
> drivers/staging/android/ion/ion_system_heap.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
> index 701eb9f3b0f1..54b8a7710958 100644
> --- a/drivers/staging/android/ion/ion_system_heap.c
> +++ b/drivers/staging/android/ion/ion_system_heap.c
> @@ -225,10 +225,10 @@ static int ion_system_heap_debug_show(struct ion_heap *heap, struct seq_file *s,
> pool = sys_heap->pools[i];
>
> seq_printf(s, "%d order %u highmem pages %lu total\n",
> - pool->high_count, pool->order,
> + pool->order, pool->high_count,
> (PAGE_SIZE << pool->order) * pool->high_count);
> seq_printf(s, "%d order %u lowmem pages %lu total\n",
> - pool->low_count, pool->order,
> + pool->order, pool->low_count,
> (PAGE_SIZE << pool->order) * pool->low_count);
> }
>
>
On Tue, Sep 04, 2018 at 02:07:49PM -0500, Gustavo A. R. Silva wrote:
> There is a potential execution path in which pointer memfd is NULL when
> passed as argument to fput(), hence there is a NULL pointer dereference
> in fput().
>
> Fix this by null checking *memfd* before calling fput().
>
> Addresses-Coverity-ID: 1473174 ("Explicit null dereferenced")
> Fixes: fbb0de795078 ("Add udmabuf misc device")
> Signed-off-by: Gustavo A. R. Silva <gustavo(a)embeddedor.com>
Pushed to drm-misc-next.
thanks,
Gerd
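For readers outside the thread, the guarded cleanup path looks roughly like
this (a sketch of the pattern being fixed, not the actual udmabuf code):

    memfd = fget(list[i].memfd);
    if (!memfd)
        goto err;               /* memfd stays NULL here ...              */
    ...
err:
    if (memfd)                  /* ... so the error path must check it:   */
        fput(memfd);            /* fput(NULL) dereferences a NULL pointer */
    return ret;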
Hi,
> > qemu can use memfd to allocate guest ram. Now, with the help of
> > udmabuf, qemu can create a *host* dma-buf for the *guest* graphics
> > buffer.
>
> Guess each physical address in the iovec in
> VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING can be passed as the offset in the
> udmabuf_create_item struct?
Exactly.
https://git.kraxel.org/cgit/qemu/commit/?h=sirius/udmabuf&id=515a5b9f1215ea…
> Are you thinking of anything else besides passing the winsrv protocol across
> the guest/host boundary? Just wondering if I'm missing something.
The patch above uses the dmabuf internally in qemu. It simply mmaps it,
so qemu has a linear representation of the resource and can use it as
pixman image backing storage without copying the pixel data.
So it is useful even without actually exporting the dmabuf to other
processes.
cheers,
Gerd
PS: Any chance you can review the v7 patch?
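For reference, the guest-iovec-to-host-dma-buf step discussed above could
look roughly like this on the qemu side (a sketch; the struct and ioctl names
follow the proposed udmabuf UAPI, while niov, guest_iov[], ram_memfd and
gpa_to_memfd_offset() are hypothetical helpers used only for illustration):

    struct udmabuf_create_list *create;
    int i, udmabuf_fd, dmabuf_fd;

    udmabuf_fd = open("/dev/udmabuf", O_RDWR);
    create = calloc(1, sizeof(*create) + niov * sizeof(create->list[0]));
    create->count = niov;
    for (i = 0; i < niov; i++) {
        create->list[i].memfd  = ram_memfd;   /* memfd backing guest ram */
        create->list[i].offset = gpa_to_memfd_offset(guest_iov[i].base);
        create->list[i].size   = guest_iov[i].len;
    }
    dmabuf_fd = ioctl(udmabuf_fd, UDMABUF_CREATE_LIST, create);

The returned dmabuf_fd is the *host* dma-buf for the scattered *guest*
resource.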
Since commit 09ea0dfbf972 ("dma-buf: make map_atomic and map function
pointers optional"), the core provides no-op functions when map and
map_atomic are not provided, so we no longer need to assert that they are
supplied by a dma-buf exporter.
Fixes: 09ea0dfbf972 ("dma-buf: make map_atomic and map function pointers optional")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: Gerd Hoffmann <kraxel(a)redhat.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
---
drivers/dma-buf/dma-buf.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 13884474d158..02f7f9a89979 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -405,7 +405,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
|| !exp_info->ops->map_dma_buf
|| !exp_info->ops->unmap_dma_buf
|| !exp_info->ops->release
- || !exp_info->ops->map
|| !exp_info->ops->mmap)) {
return ERR_PTR(-EINVAL);
}
--
2.18.0
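The practical effect for exporters is that .map (and .map_atomic) can simply
be left out of the ops table; a sketch (the my_* callbacks are hypothetical,
not taken from a real driver):

static const struct dma_buf_ops my_exporter_ops = {
	.map_dma_buf   = my_map_dma_buf,
	.unmap_dma_buf = my_unmap_dma_buf,
	.release       = my_release,
	.mmap          = my_mmap,
	/* .map / .map_atomic omitted: the core now provides no-op defaults */
};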
Dear All,
The CMA related functions cma_alloc() and dma_alloc_from_contiguous()
have a gfp_mask parameter, but sadly they only support the __GFP_NOWARN
flag. This gave their users the misleading impression that any standard
memory allocation flags are supported, which resulted in a security issue
when a caller set the __GFP_ZERO flag and expected the buffer to be cleared.
This patchset changes the gfp_mask parameter to a simple boolean no_warn
argument, which covers everything the underlying code actually supports.
This patchset is a result of the following discussion:
https://patchwork.kernel.org/patch/10461919/
Best regards
Marek Szyprowski
Samsung R&D Institute Poland
Patch summary:
Marek Szyprowski (2):
mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
dma: remove unsupported gfp_mask parameter from
dma_alloc_from_contiguous()
arch/arm/mm/dma-mapping.c | 5 +++--
arch/arm64/mm/dma-mapping.c | 4 ++--
arch/powerpc/kvm/book3s_hv_builtin.c | 2 +-
arch/xtensa/kernel/pci-dma.c | 2 +-
drivers/iommu/amd_iommu.c | 2 +-
drivers/iommu/intel-iommu.c | 3 ++-
drivers/s390/char/vmcp.c | 2 +-
drivers/staging/android/ion/ion_cma_heap.c | 2 +-
include/linux/cma.h | 2 +-
include/linux/dma-contiguous.h | 4 ++--
kernel/dma/contiguous.c | 6 +++---
kernel/dma/direct.c | 3 ++-
mm/cma.c | 8 ++++----
mm/cma_debug.c | 2 +-
14 files changed, 25 insertions(+), 22 deletions(-)
--
2.17.1
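For illustration, the interface change boils down to callers passing an
explicit boolean instead of a gfp mask (a sketch paraphrased from the
patches; see them for the exact prototypes):

    /* before: a full gfp_t was accepted, but only __GFP_NOWARN had any effect */
    page = cma_alloc(cma, count, align, GFP_KERNEL | __GFP_NOWARN);

    /* after: the one remaining knob is spelled out explicitly */
    page = cma_alloc(cma, count, align, true /* no_warn */);
    page = dma_alloc_from_contiguous(dev, count, align, false /* no_warn */);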
On Wed, Jul 04, 2018 at 09:26:39AM +0200, Tomeu Vizoso wrote:
> On 07/04/2018 07:53 AM, Gerd Hoffmann wrote:
> > On Tue, Jul 03, 2018 at 10:37:57AM +0200, Daniel Vetter wrote:
> > > On Tue, Jul 03, 2018 at 09:53:58AM +0200, Gerd Hoffmann wrote:
> > > > A driver to let userspace turn memfd regions into dma-bufs.
> > > >
> > > > Use case: Allows qemu to create dmabufs for the vga framebuffer or
> > > > virtio-gpu resources. Then they can be passed around to display
> > > > those guest things on the host: to the spice client for classic full
> > > > framebuffer display, and hopefully some day to a wayland server for
> > > > seamless guest window display.
> > > >
> > > > qemu test branch:
> > > > https://git.kraxel.org/cgit/qemu/log/?h=sirius/udmabuf
> > > >
> > > > Cc: David Airlie <airlied(a)linux.ie>
> > > > Cc: Tomeu Vizoso <tomeu.vizoso(a)collabora.com>
> > > > Cc: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
> > > > Cc: Daniel Vetter <daniel(a)ffwll.ch>
> > > > Signed-off-by: Gerd Hoffmann <kraxel(a)redhat.com>
> > >
> > > I think some ack for a 2nd use-case, like virtio-wl or whatever would be
> > > really cool. To give us some assurance that this is generically useful.
> >
> > Tomeu? Laurent?
>
> Sorry, but I think I will need some help to understand how this could help
> in the virtio-wl case [adding Zach Reizner to CC].
>
> Any graphics buffers that are allocated with memfd will be shared with the
> compositor via wl_shm, without need for dmabufs.
Within one machine, yes. Once virtualization is added to the mix things
become more complicated ...
When using virtio-gpu the guest will allocate graphics buffers from
normal (guest) ram, then register these buffers (which are allowed to be
scattered) with the host as a resource.
qemu can use memfd to allocate guest ram. Now, with the help of
udmabuf, qemu can create a *host* dma-buf for the *guest* graphics
buffer.
That dma-buf can be used by qemu internally (mmap it to get a linear
mapping of the resource, to avoid copying). It can be passed on to
spice-client, to display the guest framebuffer.
And I think it could also be quite useful to pass guest wayland windows
to the host compositor, without mapping host-allocated buffers into the
guest, so we don't have to deal with the "find some address space for
the mapping" issue in the first place. There are more things needed to
complete this of course, but it's a building block ...
cheers,
Gerd
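The "use it internally in qemu" step Gerd describes is essentially the
following (a sketch; dmabuf_fd, size, width, height and stride are
illustrative, and the pixman call is simplified):

    /* linear view of the scattered guest resource, no pixel copies */
    void *ptr = mmap(NULL, size, PROT_READ, MAP_SHARED, dmabuf_fd, 0);

    pixman_image_t *img =
        pixman_image_create_bits(PIXMAN_x8r8g8b8, width, height,
                                 (uint32_t *)ptr, stride);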
- Intro section that links to how this is exposed to userspace.
- Lots more hyperlinks.
- Minor clarifications and style polish
v2: Add misplaced hunk of kerneldoc from a different patch.
Signed-off-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Gustavo Padovan <gustavo(a)padovan.org>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
---
Documentation/driver-api/dma-buf.rst | 6 ++
drivers/dma-buf/dma-fence.c | 147 +++++++++++++++++++--------
2 files changed, 109 insertions(+), 44 deletions(-)
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index dc384f2f7f34..b541e97c7ab1 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -130,6 +130,12 @@ Reservation Objects
DMA Fences
----------
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+ :doc: DMA fences overview
+
+DMA Fences Functions Reference
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
.. kernel-doc:: drivers/dma-buf/dma-fence.c
:export:
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 7a92f85a4cec..1551ca7df394 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -38,12 +38,43 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
*/
static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(0);
+/**
+ * DOC: DMA fences overview
+ *
+ * DMA fences, represented by &struct dma_fence, are the kernel internal
+ * synchronization primitive for DMA operations like GPU rendering, video
+ * encoding/decoding, or displaying buffers on a screen.
+ *
+ * A fence is initialized using dma_fence_init() and completed using
+ * dma_fence_signal(). Fences are associated with a context, allocated through
+ * dma_fence_context_alloc(), and all fences on the same context are
+ * fully ordered.
+ *
+ * Since the purpose of fences is to facilitate cross-device and
+ * cross-application synchronization, there are multiple ways to use one:
+ *
+ * - Individual fences can be exposed as a &sync_file, accessed as a file
+ * descriptor from userspace, created by calling sync_file_create(). This is
+ * called explicit fencing, since userspace passes around explicit
+ * synchronization points.
+ *
+ * - Some subsystems also have their own explicit fencing primitives, like
+ * &drm_syncobj. Compared to &sync_file, a &drm_syncobj allows the underlying
+ * fence to be updated.
+ *
+ * - Then there's also implicit fencing, where the synchronization points are
+ * implicitly passed around as part of shared &dma_buf instances. Such
+ * implicit fences are stored in &struct reservation_object through the
+ * &dma_buf.resv pointer.
+ */
+
/**
* dma_fence_context_alloc - allocate an array of fence contexts
- * @num: [in] amount of contexts to allocate
+ * @num: amount of contexts to allocate
*
- * This function will return the first index of the number of fences allocated.
- * The fence context is used for setting fence->context to a unique number.
+ * This function will return the first index of the number of fence contexts
+ * allocated. The fence context is used for setting &dma_fence.context to a
+ * unique number by passing the context to dma_fence_init().
*/
u64 dma_fence_context_alloc(unsigned num)
{
@@ -59,10 +90,14 @@ EXPORT_SYMBOL(dma_fence_context_alloc);
* Signal completion for software callbacks on a fence, this will unblock
* dma_fence_wait() calls and run all the callbacks added with
* dma_fence_add_callback(). Can be called multiple times, but since a fence
- * can only go from unsignaled to signaled state, it will only be effective
- * the first time.
+ * can only go from the unsignaled to the signaled state and not back, it will
+ * only be effective the first time.
+ *
+ * Unlike dma_fence_signal(), this function must be called with &dma_fence.lock
+ * held.
*
- * Unlike dma_fence_signal, this function must be called with fence->lock held.
+ * Returns 0 on success and a negative error value when @fence has been
+ * signalled already.
*/
int dma_fence_signal_locked(struct dma_fence *fence)
{
@@ -102,8 +137,11 @@ EXPORT_SYMBOL(dma_fence_signal_locked);
* Signal completion for software callbacks on a fence, this will unblock
* dma_fence_wait() calls and run all the callbacks added with
* dma_fence_add_callback(). Can be called multiple times, but since a fence
- * can only go from unsignaled to signaled state, it will only be effective
- * the first time.
+ * can only go from the unsignaled to the signaled state and not back, it will
+ * only be effective the first time.
+ *
+ * Returns 0 on success and a negative error value when @fence has been
+ * signalled already.
*/
int dma_fence_signal(struct dma_fence *fence)
{
@@ -136,9 +174,9 @@ EXPORT_SYMBOL(dma_fence_signal);
/**
* dma_fence_wait_timeout - sleep until the fence gets signaled
* or until timeout elapses
- * @fence: [in] the fence to wait on
- * @intr: [in] if true, do an interruptible wait
- * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
+ * @fence: the fence to wait on
+ * @intr: if true, do an interruptible wait
+ * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
*
* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
* remaining timeout in jiffies on success. Other error values may be
@@ -148,6 +186,8 @@ EXPORT_SYMBOL(dma_fence_signal);
* directly or indirectly (buf-mgr between reservation and committing)
* holds a reference to the fence, otherwise the fence might be
* freed before return, resulting in undefined behavior.
+ *
+ * See also dma_fence_wait() and dma_fence_wait_any_timeout().
*/
signed long
dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
@@ -167,6 +207,13 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
}
EXPORT_SYMBOL(dma_fence_wait_timeout);
+/**
+ * dma_fence_release - default release function for fences
+ * @kref: &dma_fence.refcount
+ *
+ * This is the default release function for &dma_fence. Drivers shouldn't call
+ * this directly, but instead call dma_fence_put().
+ */
void dma_fence_release(struct kref *kref)
{
struct dma_fence *fence =
@@ -184,6 +231,13 @@ void dma_fence_release(struct kref *kref)
}
EXPORT_SYMBOL(dma_fence_release);
+/**
+ * dma_fence_free - default release function for &dma_fence.
+ * @fence: fence to release
+ *
+ * This is the default implementation for &dma_fence_ops.release. It calls
+ * kfree_rcu() on @fence.
+ */
void dma_fence_free(struct dma_fence *fence)
{
kfree_rcu(fence, rcu);
@@ -192,10 +246,11 @@ EXPORT_SYMBOL(dma_fence_free);
/**
* dma_fence_enable_sw_signaling - enable signaling on fence
- * @fence: [in] the fence to enable
+ * @fence: the fence to enable
*
- * this will request for sw signaling to be enabled, to make the fence
- * complete as soon as possible
+ * This will request for sw signaling to be enabled, to make the fence
+ * complete as soon as possible. This calls &dma_fence_ops.enable_signaling
+ * internally.
*/
void dma_fence_enable_sw_signaling(struct dma_fence *fence)
{
@@ -220,24 +275,24 @@ EXPORT_SYMBOL(dma_fence_enable_sw_signaling);
/**
* dma_fence_add_callback - add a callback to be called when the fence
* is signaled
- * @fence: [in] the fence to wait on
- * @cb: [in] the callback to register
- * @func: [in] the function to call
+ * @fence: the fence to wait on
+ * @cb: the callback to register
+ * @func: the function to call
*
- * cb will be initialized by dma_fence_add_callback, no initialization
+ * @cb will be initialized by dma_fence_add_callback(), no initialization
* by the caller is required. Any number of callbacks can be registered
* to a fence, but a callback can only be registered to one fence at a time.
*
* Note that the callback can be called from an atomic context. If
* fence is already signaled, this function will return -ENOENT (and
- * *not* call the callback)
+ * *not* call the callback).
*
* Add a software callback to the fence. Same restrictions apply to
- * refcount as it does to dma_fence_wait, however the caller doesn't need to
- * keep a refcount to fence afterwards: when software access is enabled,
- * the creator of the fence is required to keep the fence alive until
- * after it signals with dma_fence_signal. The callback itself can be called
- * from irq context.
+ * refcount as it does to dma_fence_wait(), however the caller doesn't need to
+ * keep a refcount to the fence after dma_fence_add_callback() has returned:
+ * when software access is enabled, the creator of the fence is required to keep
+ * the fence alive until after it signals with dma_fence_signal(). The callback
+ * itself can be called from irq context.
*
* Returns 0 in case of success, -ENOENT if the fence is already signaled
* and -EINVAL in case of error.
@@ -286,7 +341,7 @@ EXPORT_SYMBOL(dma_fence_add_callback);
/**
* dma_fence_get_status - returns the status upon completion
- * @fence: [in] the dma_fence to query
+ * @fence: the dma_fence to query
*
* This wraps dma_fence_get_status_locked() to return the error status
* condition on a signaled fence. See dma_fence_get_status_locked() for more
@@ -311,8 +366,8 @@ EXPORT_SYMBOL(dma_fence_get_status);
/**
* dma_fence_remove_callback - remove a callback from the signaling list
- * @fence: [in] the fence to wait on
- * @cb: [in] the callback to remove
+ * @fence: the fence to wait on
+ * @cb: the callback to remove
*
* Remove a previously queued callback from the fence. This function returns
* true if the callback is successfully removed, or false if the fence has
@@ -323,6 +378,9 @@ EXPORT_SYMBOL(dma_fence_get_status);
* doing, since deadlocks and race conditions could occur all too easily. For
* this reason, it should only ever be done on hardware lockup recovery,
* with a reference held to the fence.
+ *
+ * Behaviour is undefined if @cb has not been added to @fence using
+ * dma_fence_add_callback() beforehand.
*/
bool
dma_fence_remove_callback(struct dma_fence *fence, struct dma_fence_cb *cb)
@@ -359,9 +417,9 @@ dma_fence_default_wait_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
/**
* dma_fence_default_wait - default sleep until the fence gets signaled
* or until timeout elapses
- * @fence: [in] the fence to wait on
- * @intr: [in] if true, do an interruptible wait
- * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
+ * @fence: the fence to wait on
+ * @intr: if true, do an interruptible wait
+ * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
*
* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
* remaining timeout in jiffies on success. If timeout is zero the value one is
@@ -454,12 +512,12 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
/**
* dma_fence_wait_any_timeout - sleep until any fence gets signaled
* or until timeout elapses
- * @fences: [in] array of fences to wait on
- * @count: [in] number of fences to wait on
- * @intr: [in] if true, do an interruptible wait
- * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
- * @idx: [out] the first signaled fence index, meaningful only on
- * positive return
+ * @fences: array of fences to wait on
+ * @count: number of fences to wait on
+ * @intr: if true, do an interruptible wait
+ * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
+ * @idx: used to store the first signaled fence index, meaningful only on
+ * positive return
*
* Returns -EINVAL on custom fence wait implementation, -ERESTARTSYS if
* interrupted, 0 if the wait timed out, or the remaining timeout in jiffies
@@ -468,6 +526,8 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
* Synchronous waits for the first fence in the array to be signaled. The
* caller needs to hold a reference to all fences in the array, otherwise a
* fence might be freed before return, resulting in undefined behavior.
+ *
+ * See also dma_fence_wait() and dma_fence_wait_timeout().
*/
signed long
dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
@@ -540,19 +600,18 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
/**
* dma_fence_init - Initialize a custom fence.
- * @fence: [in] the fence to initialize
- * @ops: [in] the dma_fence_ops for operations on this fence
- * @lock: [in] the irqsafe spinlock to use for locking this fence
- * @context: [in] the execution context this fence is run on
- * @seqno: [in] a linear increasing sequence number for this context
+ * @fence: the fence to initialize
+ * @ops: the dma_fence_ops for operations on this fence
+ * @lock: the irqsafe spinlock to use for locking this fence
+ * @context: the execution context this fence is run on
+ * @seqno: a linear increasing sequence number for this context
*
* Initializes an allocated fence, the caller doesn't have to keep its
* refcount after committing with this fence, but it will need to hold a
- * refcount again if dma_fence_ops.enable_signaling gets called. This can
- * be used for other implementing other types of fence.
+ * refcount again if &dma_fence_ops.enable_signaling gets called.
*
* context and seqno are used for easy comparison between fences, allowing
- * to check which fence is later by simply using dma_fence_later.
+ * to check which fence is later by simply using dma_fence_later().
*/
void
dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
--
2.18.0
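As a reminder of the lifecycle the new overview documents, a driver-side
fence roughly goes through the following (minimal sketch; the my_* names are
hypothetical, and a real &dma_fence_ops must also provide the ops that are
still mandatory, e.g. .get_driver_name, .get_timeline_name and, depending on
kernel version, .wait/.enable_signaling):

    u64 ctx = dma_fence_context_alloc(1);  /* fences on one context are ordered */

    dma_fence_init(&f->base, &my_fence_ops, &f->lock, ctx, ++f->seqno);
    /* hand it out, e.g. wrap it in a sync_file via sync_file_create() */
    dma_fence_signal(&f->base);            /* unblocks dma_fence_wait(), runs callbacks */
    dma_fence_put(&f->base);               /* drop our reference */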
Am 04.07.2018 um 11:09 schrieb Michel Dänzer:
> On 2018-07-04 10:31 AM, Christian König wrote:
>> Am 26.06.2018 um 16:31 schrieb Michel Dänzer:
>>> From: Michel Dänzer <michel.daenzer(a)amd.com>
>>>
>>> Fixes the BUG_ON spuriously triggering under the following
>>> circumstances:
>>>
>>> * ttm_eu_reserve_buffers processes a list containing multiple BOs using
>>> the same reservation object, so it calls
>>> reservation_object_reserve_shared with that reservation object once
>>> for each such BO.
>>> * In reservation_object_reserve_shared, old->shared_count ==
>>> old->shared_max - 1, so obj->staged is freed in preparation of an
>>> in-place update.
>>> * ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence
>>> once for each of the BOs above, always with the same fence.
>>> * The first call adds the fence in the remaining free slot, after which
>>> old->shared_count == old->shared_max.
>> Well, the explanation here is not correct. For multiple BOs using the
>> same reservation object we won't call
>> reservation_object_add_shared_fence() multiple times because we move
>> those to the duplicates list in ttm_eu_reserve_buffers().
>>
>> But this bug can still happen because we call
>> reservation_object_add_shared_fence() manually with fences for the same
>> context in a couple of places.
>>
>> One prominent case which comes to my mind are for the VM BOs during
>> updates. Another possibility are VRAM BOs which need to be cleared.
> Thanks. How about the following:
>
> * ttm_eu_reserve_buffers calls reservation_object_reserve_shared.
> * In reservation_object_reserve_shared, shared_count == shared_max - 1,
> so obj->staged is freed in preparation of an in-place update.
> * ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence,
> after which shared_count == shared_max.
> * The amdgpu driver also calls reservation_object_add_shared_fence for
> the same reservation object, and the BUG_ON triggers.
I would rather completely drop the reference to the ttm_eu_* functions,
because those wrappers are completely unrelated to the problem.
Instead let's just note something like the following:
* When reservation_object_reserve_shared is called with shared_count ==
shared_max - 1, obj->staged is freed in preparation of an in-place update.
* Now reservation_object_add_shared_fence is called with the first fence,
and after that shared_count == shared_max.
* After that, reservation_object_add_shared_fence can be called with
follow-up fences from the same context, but since shared_count ==
shared_max we would run into this BUG_ON.
> However, nothing bad would happen in
> reservation_object_add_shared_inplace, since all fences use the same
> context, so they can only occupy a single slot.
>
> Prevent this by moving the BUG_ON to where an overflow would actually
> happen (e.g. if a buggy caller didn't call
> reservation_object_reserve_shared before).
>
>
> Also, I'll add a reference to https://bugs.freedesktop.org/106418 in v2,
> as I suspect this fix is necessary under the circumstances described
> there as well.
The rest sounds good to me.
Regards,
Christian.
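In call form, the sequence Christian describes above is roughly (a sketch;
fence1 and fence2 come from the same fence context):

    reservation_object_reserve_shared(resv);           /* shared_count == shared_max - 1,
                                                         * obj->staged gets freed */
    reservation_object_add_shared_fence(resv, fence1); /* now shared_count == shared_max */
    reservation_object_add_shared_fence(resv, fence2); /* same context, only reuses a slot,
                                                         * yet the old BUG_ON fires */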
Am 26.06.2018 um 16:31 schrieb Michel Dänzer:
> From: Michel Dänzer <michel.daenzer(a)amd.com>
>
> Fixes the BUG_ON spuriously triggering under the following
> circumstances:
>
> * ttm_eu_reserve_buffers processes a list containing multiple BOs using
> the same reservation object, so it calls
> reservation_object_reserve_shared with that reservation object once
> for each such BO.
> * In reservation_object_reserve_shared, old->shared_count ==
> old->shared_max - 1, so obj->staged is freed in preparation of an
> in-place update.
> * ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence
> once for each of the BOs above, always with the same fence.
> * The first call adds the fence in the remaining free slot, after which
> old->shared_count == old->shared_max.
Well, the explanation here is not correct. For multiple BOs using the
same reservation object we won't call
reservation_object_add_shared_fence() multiple times because we move
those to the duplicates list in ttm_eu_reserve_buffers().
But this bug can still happen because we call
reservation_object_add_shared_fence() manually with fences for the same
context in a couple of places.
One prominent case which comes to my mind are for the VM BOs during
updates. Another possibility are VRAM BOs which need to be cleared.
>
> In the next call to reservation_object_add_shared_fence, the BUG_ON
> triggers. However, nothing bad would happen in
> reservation_object_add_shared_inplace, since the fence is already in the
> reservation object.
>
> Prevent this by moving the BUG_ON to where an overflow would actually
> happen (e.g. if a buggy caller didn't call
> reservation_object_reserve_shared before).
>
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Michel Dänzer <michel.daenzer(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>.
Regards,
Christian.
> ---
> drivers/dma-buf/reservation.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 314eb1071cce..532545b9488e 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -141,6 +141,7 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
> if (signaled) {
> RCU_INIT_POINTER(fobj->shared[signaled_idx], fence);
> } else {
> + BUG_ON(fobj->shared_count >= fobj->shared_max);
> RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
> fobj->shared_count++;
> }
> @@ -230,10 +231,9 @@ void reservation_object_add_shared_fence(struct reservation_object *obj,
> old = reservation_object_get_list(obj);
> obj->staged = NULL;
>
> - if (!fobj) {
> - BUG_ON(old->shared_count >= old->shared_max);
> + if (!fobj)
> reservation_object_add_shared_inplace(obj, old, fence);
> - } else
> + else
> reservation_object_add_shared_replace(obj, old, fobj, fence);
> }
> EXPORT_SYMBOL(reservation_object_add_shared_fence);
[As requested by Daniel cross posting to intel-gfx as well].
This set is the first step towards allowing a DMA-buf to be used without actually pinning the underlying resources. This in turn is the groundwork for PCIe P2P operations as well as quite a bunch of other use cases.
The idea is that we build the support for unpinned operation around the already present reservation lock in the DMA-buf object. For this we now grab the reservation object lock while mapping and unmapping DMA-bufs.
The downside is that all implementations as well as users of DMA-buf need to be audited to make sure that we don't run into double locking or lock inversions.
So please test and/or comment and report back how badly lockdep complains :)
Thanks,
Christian.
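For importers, the pattern under discussion looks roughly like this (a
sketch using the _locked helpers added later in this series, error handling
trimmed):

    struct sg_table *sgt;

    reservation_object_lock(attach->dmabuf->resv, NULL);
    sgt = dma_buf_map_attachment_locked(attach, DMA_BIDIRECTIONAL);
    if (!IS_ERR(sgt)) {
        /* ... set up and issue DMA ... */
        dma_buf_unmap_attachment_locked(attach, sgt, DMA_BIDIRECTIONAL);
    }
    reservation_object_unlock(attach->dmabuf->resv);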
Am 28.06.2018 um 11:53 schrieb Zhang, Jerry (Junwei):
> On 06/22/2018 10:11 PM, Christian König wrote:
>> Add function variants which can be called with the reservation lock
>> already held.
>>
>> v2: reordered, add lockdep asserts, fix kerneldoc
>>
>> Signed-off-by: Christian König <christian.koenig(a)amd.com>
>> ---
>> drivers/dma-buf/dma-buf.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++
>> include/linux/dma-buf.h | 5 +++++
>> 2 files changed, 62 insertions(+)
>>
>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>> index 852a3928ee71..dc94e76e2e2a 100644
>> --- a/drivers/dma-buf/dma-buf.c
>> +++ b/drivers/dma-buf/dma-buf.c
>> @@ -606,6 +606,40 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
>> }
>> EXPORT_SYMBOL_GPL(dma_buf_detach);
>>
>> +/**
>> + * dma_buf_map_attachment_locked - Returns the scatterlist table of the
>> + * attachment; maps the buffer into _device_ address space with the
>> + * reservation lock held. Is a wrapper for map_dma_buf() of the dma_buf_ops.
>> + * @attach: [in] attachment whose scatterlist is to be returned
>> + * @direction: [in] direction of DMA transfer
>> + *
>> + * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
>> + * on error. May return -EINTR if it is interrupted by a signal.
>> + *
>> + * A mapping must be unmapped by using dma_buf_unmap_attachment_locked(). Note
>> + * that the underlying backing storage is pinned for as long as a mapping
>> + * exists, therefore users/importers should not hold onto a mapping for undue
>> + * amounts of time.
>> + */
>> +struct sg_table *
>> +dma_buf_map_attachment_locked(struct dma_buf_attachment *attach,
>> + enum dma_data_direction direction)
>> +{
>> + struct sg_table *sg_table;
>> +
>
> Perhaps better to add some error check, like dma_buf_map_attachment()
>
> WARN_ON(!attach || !attach->dmabuf)
Actually I wanted to remove those from the other functions as well.
WARN_ON and BUG_ON checks for NULL pointers before using them are
totally pointless because they have the same effect as a crash.
Regards,
Christian.
>
> Apart from that, it's
> Reviewed-by: Junwei Zhang <Jerry.Zhang(a)amd.com>
>
> Jerry
>
>> + might_sleep();
>> + reservation_object_assert_held(attach->dmabuf->resv);
>> +
>> + sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>> + if (!sg_table)
>> + sg_table = ERR_PTR(-ENOMEM);
>> +
>> + return sg_table;
>> +}
>> +EXPORT_SYMBOL_GPL(dma_buf_map_attachment_locked);
>> +
>> /**
>> * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
>> * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
>> @@ -639,6 +673,29 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>> }
>> EXPORT_SYMBOL_GPL(dma_buf_map_attachment);
>>
>> +/**
>> + * dma_buf_unmap_attachment_locked - unmaps the buffer with reservation lock
>> + * held, should deallocate the associated scatterlist. Is a wrapper for
>> + * unmap_dma_buf() of dma_buf_ops.
>> + * @attach: [in] attachment to unmap buffer from
>> + * @sg_table: [in] scatterlist info of the buffer to unmap
>> + * @direction: [in] direction of DMA transfer
>> + *
>> + * This unmaps a DMA mapping for @attached obtained by
>> + * dma_buf_map_attachment_locked().
>> + */
>> +void dma_buf_unmap_attachment_locked(struct dma_buf_attachment *attach,
>> + struct sg_table *sg_table,
>> + enum dma_data_direction direction)
>> +{
>> + might_sleep();
>> + reservation_object_assert_held(attach->dmabuf->resv);
>> +
>> + attach->dmabuf->ops->unmap_dma_buf(attach, sg_table,
>> + direction);
>> +}
>> +EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment_locked);
>> +
>> /**
>> * dma_buf_unmap_attachment - unmaps and decreases usecount of the buffer;might
>> * deallocate the scatterlist associated. Is a wrapper for unmap_dma_buf() of
>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>> index 991787a03199..a25e754ae2f7 100644
>> --- a/include/linux/dma-buf.h
>> +++ b/include/linux/dma-buf.h
>> @@ -384,8 +384,13 @@ int dma_buf_fd(struct dma_buf *dmabuf, int flags);
>> struct dma_buf *dma_buf_get(int fd);
>> void dma_buf_put(struct dma_buf *dmabuf);
>>
>> +struct sg_table *dma_buf_map_attachment_locked(struct dma_buf_attachment *,
>> + enum dma_data_direction);
>> struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
>> enum dma_data_direction);
>> +void dma_buf_unmap_attachment_locked(struct dma_buf_attachment *,
>> + struct sg_table *,
>> + enum dma_data_direction);
>> void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
>> enum dma_data_direction);
>> int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
>>
Almost everyone uses dma_fence_default_wait.
v2: Also remove the BUG_ON(!ops->wait) (Chris).
Reviewed-by: Christian König <christian.koenig(a)amd.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Gustavo Padovan <gustavo(a)padovan.org>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
---
drivers/dma-buf/dma-fence-array.c | 1 -
drivers/dma-buf/dma-fence.c | 8 +++++---
drivers/dma-buf/sw_sync.c | 1 -
include/linux/dma-fence.h | 13 ++++++++-----
4 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index dd1edfb27b61..a8c254497251 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -104,7 +104,6 @@ const struct dma_fence_ops dma_fence_array_ops = {
.get_timeline_name = dma_fence_array_get_timeline_name,
.enable_signaling = dma_fence_array_enable_signaling,
.signaled = dma_fence_array_signaled,
- .wait = dma_fence_default_wait,
.release = dma_fence_array_release,
};
EXPORT_SYMBOL(dma_fence_array_ops);
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 59049375bd19..41ec19c9efc7 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -158,7 +158,10 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
return -EINVAL;
trace_dma_fence_wait_start(fence);
- ret = fence->ops->wait(fence, intr, timeout);
+ if (fence->ops->wait)
+ ret = fence->ops->wait(fence, intr, timeout);
+ else
+ ret = dma_fence_default_wait(fence, intr, timeout);
trace_dma_fence_wait_end(fence);
return ret;
}
@@ -562,8 +565,7 @@ dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
spinlock_t *lock, u64 context, unsigned seqno)
{
BUG_ON(!lock);
- BUG_ON(!ops || !ops->wait ||
- !ops->get_driver_name || !ops->get_timeline_name);
+ BUG_ON(!ops || !ops->get_driver_name || !ops->get_timeline_name);
kref_init(&fence->refcount);
fence->ops = ops;
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 3d78ca89a605..53c1d6d36a64 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -188,7 +188,6 @@ static const struct dma_fence_ops timeline_fence_ops = {
.get_timeline_name = timeline_fence_get_timeline_name,
.enable_signaling = timeline_fence_enable_signaling,
.signaled = timeline_fence_signaled,
- .wait = dma_fence_default_wait,
.release = timeline_fence_release,
.fence_value_str = timeline_fence_value_str,
.timeline_value_str = timeline_fence_timeline_value_str,
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index c053d19e1e24..02dba8cd033d 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -191,11 +191,14 @@ struct dma_fence_ops {
/**
* @wait:
*
- * Custom wait implementation, or dma_fence_default_wait.
+ * Custom wait implementation, defaults to dma_fence_default_wait() if
+ * not set.
*
- * Must not be NULL, set to dma_fence_default_wait for default implementation.
- * the dma_fence_default_wait implementation should work for any fence, as long
- * as enable_signaling works correctly.
+ * The dma_fence_default_wait implementation should work for any fence, as long
+ * as @enable_signaling works correctly. This hook allows drivers to
+ * have an optimized version for the case where a process context is
+ * already available, e.g. if @enable_signaling for the general case
+ * needs to set up a worker thread.
*
* Must return -ERESTARTSYS if the wait is intr = true and the wait was
* interrupted, and remaining jiffies if fence has signaled, or 0 if wait
@@ -203,7 +206,7 @@ struct dma_fence_ops {
* which should be treated as if the fence is signaled. For example a hardware
* lockup could be reported like that.
*
- * This callback is mandatory.
+ * This callback is optional.
*/
signed long (*wait)(struct dma_fence *fence,
bool intr, signed long timeout);
--
2.17.0
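After this patch an ops table can simply leave .wait unset; a sketch (the
my_* callbacks are hypothetical):

static const struct dma_fence_ops my_fence_ops = {
	.get_driver_name   = my_get_driver_name,
	.get_timeline_name = my_get_timeline_name,
	.enable_signaling  = my_enable_signaling,
	.signaled          = my_signaled,
	/* .wait left unset: dma_fence_wait_timeout() falls back to
	 * dma_fence_default_wait() */
};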
Quoting Michel Dänzer (2018-06-26 15:31:47)
> From: Michel Dänzer <michel.daenzer(a)amd.com>
>
> Fixes the BUG_ON spuriously triggering under the following
> circumstances:
>
> * ttm_eu_reserve_buffers processes a list containing multiple BOs using
> the same reservation object, so it calls
> reservation_object_reserve_shared with that reservation object once
> for each such BO.
> * In reservation_object_reserve_shared, old->shared_count ==
> old->shared_max - 1, so obj->staged is freed in preparation of an
> in-place update.
> * ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence
> once for each of the BOs above, always with the same fence.
> * The first call adds the fence in the remaining free slot, after which
> old->shared_count == old->shared_max.
>
> In the next call to reservation_object_add_shared_fence, the BUG_ON
> triggers. However, nothing bad would happen in
> reservation_object_add_shared_inplace, since the fence is already in the
> reservation object.
>
> Prevent this by moving the BUG_ON to where an overflow would actually
> happen (e.g. if a buggy caller didn't call
> reservation_object_reserve_shared before).
>
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Michel Dänzer <michel.daenzer(a)amd.com>
I've convinced myself (or rather have not found a valid argument
against) that being able to call reserve_shared + add_shared multiple
times for the same fence is an intended part of the reservation_object API.
I'd double-check with Christian though.
Reviewed-by: Chris Wilson <chris(a)chris-wilson.co.uk>
> drivers/dma-buf/reservation.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 314eb1071cce..532545b9488e 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -141,6 +141,7 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
> if (signaled) {
> RCU_INIT_POINTER(fobj->shared[signaled_idx], fence);
> } else {
> + BUG_ON(fobj->shared_count >= fobj->shared_max);
Personally I would just let kasan detect this and throw away the BUG_ON
or at least move it behind some DMABUF_BUG_ON().
-Chris
On Mon, Jun 25, 2018 at 09:21:15PM +0530, Akhil P Oommen wrote:
>
>
> On 6/25/2018 1:20 PM, Daniel Vetter wrote:
> > On Fri, Jun 22, 2018 at 11:08:48AM +0100, Chris Wilson wrote:
> > > Quoting Gustavo Padovan (2018-06-22 11:04:16)
> > > > Hi Akhil,
> > > >
> > > > On Fri, 2018-06-22 at 15:10 +0530, Akhil P Oommen wrote:
> > > > > Each fence object holds function pointers of the module that initialized
> > > > > it. Allowing the module to unload before this fence's release is
> > > > > catastrophic. So, keep a refcount on the module until the fence is
> > > > > released.
> > > > >
> > > > > Signed-off-by: Akhil P Oommen <akhilpo(a)codeaurora.org>
> > > > > ---
> > > > > Changes in v2:
> > > > > - added description for the new function parameter.
> > > > >
> > > > > drivers/dma-buf/dma-fence.c | 16 +++++++++++++---
> > > > > include/linux/dma-fence.h | 10 ++++++++--
> > > > > 2 files changed, 21 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > > > > index 4edb9fd..2aaa44e 100644
> > > > > --- a/drivers/dma-buf/dma-fence.c
> > > > > +++ b/drivers/dma-buf/dma-fence.c
> > > > > @@ -18,6 +18,7 @@
> > > > > * more details.
> > > > > */
> > > > > +#include <linux/module.h>
> > > > > #include <linux/slab.h>
> > > > > #include <linux/export.h>
> > > > > #include <linux/atomic.h>
> > > > > @@ -168,6 +169,7 @@ void dma_fence_release(struct kref *kref)
> > > > > {
> > > > > struct dma_fence *fence =
> > > > > container_of(kref, struct dma_fence, refcount);
> > > > > + struct module *module = fence->owner;
> > > > > trace_dma_fence_destroy(fence);
> > > > > @@ -178,6 +180,8 @@ void dma_fence_release(struct kref *kref)
> > > > > fence->ops->release(fence);
> > > > > else
> > > > > dma_fence_free(fence);
> > > > > +
> > > > > + module_put(module);
> > > > > }
> > > > > EXPORT_SYMBOL(dma_fence_release);
> > > > > @@ -541,6 +545,7 @@ struct default_wait_cb {
> > > > > /**
> > > > > * dma_fence_init - Initialize a custom fence.
> > > > > + * @module: [in] the module that calls this API
> > > > > * @fence: [in] the fence to initialize
> > > > > * @ops: [in] the dma_fence_ops for operations on this fence
> > > > > * @lock: [in] the irqsafe spinlock to use for locking this fence
> > > > > @@ -556,8 +561,9 @@ struct default_wait_cb {
> > > > > * to check which fence is later by simply using dma_fence_later.
> > > > > */
> > > > > void
> > > > > -dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
> > > > > - spinlock_t *lock, u64 context, unsigned seqno)
> > > > > +_dma_fence_init(struct module *module, struct dma_fence *fence,
> > > > > + const struct dma_fence_ops *ops, spinlock_t *lock,
> > > > > + u64 context, unsigned seqno)
> > > > > {
> > > > > BUG_ON(!lock);
> > > > > BUG_ON(!ops || !ops->wait || !ops->enable_signaling ||
> > > > > @@ -571,7 +577,11 @@ struct default_wait_cb {
> > > > > fence->seqno = seqno;
> > > > > fence->flags = 0UL;
> > > > > fence->error = 0;
> > > > > + fence->owner = module;
> > > > > +
> > > > > + if (!try_module_get(module))
> > > > > + fence->owner = NULL;
> > > > > trace_dma_fence_init(fence);
> > > > > }
> > > > > -EXPORT_SYMBOL(dma_fence_init);
> > > > > +EXPORT_SYMBOL(_dma_fence_init);
> > > > Do we still need to export the symbol, it won't be called from outside
> > > > anymore? Other than that looks good to me:
> > > There's a big drawback in that a module reference is often insufficient,
> > > and that a reference on the driver (or whatever is required for the
> > > lifetime of the fence) will already hold the module reference.
> > >
> > > Considering that we want a few 100k fences in flight per second, is
> > > there no other way to only export a fence with a module reference?
> > We'd need to make the timeline a full-blown object (Maarten owes me one
> > for that design screw-up), and then we could stuff all these things in
> > there.
> >
> > And I think that's the right fix, since try_module_get for every
> > dma_fence_init just ain't cool really :-)
> > -Daniel
> Thanks for the feedback, Daniel.
> I see your point, but I am not sure how much impact an extra refcounting
> would create considering the whole effort of setting up a new fence. Also,
> this refcounting is not required for built-in modules.
>
> As of now, unloading a kernel module that uses fence_init() is an easy way
> to bring down the system. This patch simply fixes that. What you have
> suggested sounds like a non-trivial effort which someone who is more
> familiar with this code base can do a better job than me. Perhaps we can
> take this patch now to fix the issue at hand and later somebody else can
> share a more optimal solution. :)
Module unload is a developer-only feature for a reason. Given that, I don't
think fixing this with a hack is the right approach. And dma_fence_init()
is supposed to be really fast.
Also note that you can fix this already for your own driver by simply
waiting for all pending dma_fences to get released, so I don't think it's
super-important to land this asap.
Yes the real fix is a bit more involved, but shouldn't be too hard to pull
off really.
-Daniel
>
> @Gustavo & @Sumit, I would like the maintainers to take a decision here.
>
> -Akhil.
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch