- Linaro-mm-sig - lists.linaro.org

Re: [Linaro-mm-sig] [Intel-gfx] [PATCH 01/17] dma-fence: Some kerneldoc polish for dma-fence.h

by Daniel Vetter

On Mon, Apr 30, 2018 at 10:49:00AM -0700, Eric Anholt wrote: > Daniel Vetter <daniel.vetter(a)ffwll.ch> writes: > > + /** > > + * @fill_driver_data: > > + * > > + * Callback to fill in free-form debug info Returns amount of bytes > > + * filled, or negative error on failure. > > Maybe this "Returns" should be on a new line? Or at least a '.' in > between. Indeed I've missed this, thanks for spotting it. Done both&applied. Thanks, Daniel > > Other than that, > > Reviewed-by: Eric Anholt <eric(a)anholt.net> > > Thanks! -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch

7 years, 1 month

1
0
0 0

[PATCH 04/17] dma-fence: Allow wait_any_timeout for all fences

by Daniel Vetter

When this was introduced in commit a519435a96597d8cd96123246fea4ae5a6c90b02 Author: Christian König <christian.koenig(a)amd.com> Date: Tue Oct 20 16:34:16 2015 +0200 dma-buf/fence: add fence_wait_any_timeout function v2 there was a restriction added that this only works if the dma-fence uses the dma_fence_default_wait hook. Which works for amdgpu, which is the only caller. Well, until you share some buffers with e.g. i915, then you get an -EINVAL. But there's really no reason for this, because all drivers must support callbacks. The special ->wait hook is only as an optimization; if the driver needs to create a worker thread for an active callback, then it can avoid to do that if it knows that there's a process context available already. So ->wait is just an optimization, just using the logic in dma_fence_default_wait() should work for all drivers. Let's remove this restriction. Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com> Cc: Sumit Semwal <sumit.semwal(a)linaro.org> Cc: Gustavo Padovan <gustavo(a)padovan.org> Cc: linux-media(a)vger.kernel.org Cc: linaro-mm-sig(a)lists.linaro.org Cc: Christian König <christian.koenig(a)amd.com> Cc: Alex Deucher <alexander.deucher(a)amd.com> --- drivers/dma-buf/dma-fence.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 7b5b40d6b70e..59049375bd19 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -503,11 +503,6 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count, for (i = 0; i < count; ++i) { struct dma_fence *fence = fences[i]; - if (fence->ops->wait != dma_fence_default_wait) { - ret = -EINVAL; - goto fence_rm_cb; - } - cb[i].task = current; if (dma_fence_add_callback(fence, &cb[i].base, dma_fence_default_wait_cb)) { -- 2.17.0

7 years, 1 month

3
2
0 0

[PATCH 17/17] dma-fence: Polish kernel-doc for dma-fence.c

by Daniel Vetter

- Intro section that links to how this is exposed to userspace. - Lots more hyperlinks. - Minor clarifications and style polish Signed-off-by: Daniel Vetter <daniel.vetter(a)ffwll.ch> Cc: Sumit Semwal <sumit.semwal(a)linaro.org> Cc: Gustavo Padovan <gustavo(a)padovan.org> Cc: linux-media(a)vger.kernel.org Cc: linaro-mm-sig(a)lists.linaro.org --- Documentation/driver-api/dma-buf.rst | 6 ++ drivers/dma-buf/dma-fence.c | 140 ++++++++++++++++++--------- 2 files changed, 102 insertions(+), 44 deletions(-) diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index dc384f2f7f34..b541e97c7ab1 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -130,6 +130,12 @@ Reservation Objects DMA Fences ---------- +.. kernel-doc:: drivers/dma-buf/dma-fence.c + :doc: DMA fences overview + +DMA Fences Functions Reference +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + .. kernel-doc:: drivers/dma-buf/dma-fence.c :export: diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 30fcbe415ff4..4e931e1de198 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -38,12 +38,43 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); */ static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(0); +/** + * DOC: DMA fences overview + * + * DMA fences, represented by &struct dma_fence, are the kernel internal + * synchronization primitive for DMA operations like GPU rendering, video + * encoding/decoding, or displaying buffers on a screen. + * + * A fence is initialized using dma_fence_init() and completed using + * dma_fence_signal(). Fences are associated with a context, allocated through + * dma_fence_context_alloc(), and all fences on the same context are + * fully ordered. + * + * Since the purposes of fences is to facilitate cross-device and + * cross-application synchronization, there's multiple ways to use one: + * + * - Individual fences can be exposed as a &sync_file, accessed as a file + * descriptor from userspace, created by calling sync_file_create(). This is + * called explicit fencing, since userspace passes around explicit + * synchronization points. + * + * - Some subsystems also have their own explicit fencing primitives, like + * &drm_syncobj. Compared to &sync_file, a &drm_syncobj allows the underlying + * fence to be updated. + * + * - Then there's also implicit fencing, where the synchronization points are + * implicitly passed around as part of shared &dma_buf instances. Such + * implicit fences are stored in &struct reservation_object through the + * &dma_buf.resv pointer. + */ + /** * dma_fence_context_alloc - allocate an array of fence contexts - * @num: [in] amount of contexts to allocate + * @num: amount of contexts to allocate * - * This function will return the first index of the number of fences allocated. - * The fence context is used for setting fence->context to a unique number. + * This function will return the first index of the number of fence contexts + * allocated. The fence context is used for setting &dma_fence.context to a + * unique number by passing the context to dma_fence_init(). */ u64 dma_fence_context_alloc(unsigned num) { @@ -59,10 +90,14 @@ EXPORT_SYMBOL(dma_fence_context_alloc); * Signal completion for software callbacks on a fence, this will unblock * dma_fence_wait() calls and run all the callbacks added with * dma_fence_add_callback(). Can be called multiple times, but since a fence - * can only go from unsignaled to signaled state, it will only be effective - * the first time. + * can only go from the unsignaled to the signaled state and not back, it will + * only be effective the first time. * - * Unlike dma_fence_signal, this function must be called with fence->lock held. + * Unlike dma_fence_signal(), this function must be called with &dma_fence.lock + * held. + * + * Returns 0 on success and a negative error value when @fence has been + * signalled already. */ int dma_fence_signal_locked(struct dma_fence *fence) { @@ -102,8 +137,11 @@ EXPORT_SYMBOL(dma_fence_signal_locked); * Signal completion for software callbacks on a fence, this will unblock * dma_fence_wait() calls and run all the callbacks added with * dma_fence_add_callback(). Can be called multiple times, but since a fence - * can only go from unsignaled to signaled state, it will only be effective - * the first time. + * can only go from the unsignaled to the signaled state and not back, it will + * only be effective the first time. + * + * Returns 0 on success and a negative error value when @fence has been + * signalled already. */ int dma_fence_signal(struct dma_fence *fence) { @@ -136,9 +174,9 @@ EXPORT_SYMBOL(dma_fence_signal); /** * dma_fence_wait_timeout - sleep until the fence gets signaled * or until timeout elapses - * @fence: [in] the fence to wait on - * @intr: [in] if true, do an interruptible wait - * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT + * @fence: the fence to wait on + * @intr: if true, do an interruptible wait + * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT * * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the * remaining timeout in jiffies on success. Other error values may be @@ -148,6 +186,8 @@ EXPORT_SYMBOL(dma_fence_signal); * directly or indirectly (buf-mgr between reservation and committing) * holds a reference to the fence, otherwise the fence might be * freed before return, resulting in undefined behavior. + * + * See also dma_fence_wait() and dma_fence_wait_any_timeout(). */ signed long dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout) @@ -167,6 +207,13 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout) } EXPORT_SYMBOL(dma_fence_wait_timeout); +/** + * dma_fence_release - default relese function for fences + * @kref: &dma_fence.recfount + * + * This is the default release functions for &dma_fence. Drivers shouldn't call + * this directly, but instead call dma_fence_put(). + */ void dma_fence_release(struct kref *kref) { struct dma_fence *fence = @@ -199,10 +246,11 @@ EXPORT_SYMBOL(dma_fence_free); /** * dma_fence_enable_sw_signaling - enable signaling on fence - * @fence: [in] the fence to enable + * @fence: the fence to enable * - * this will request for sw signaling to be enabled, to make the fence - * complete as soon as possible + * This will request for sw signaling to be enabled, to make the fence + * complete as soon as possible. This calls &dma_fence_ops.enable_signaling + * internally. */ void dma_fence_enable_sw_signaling(struct dma_fence *fence) { @@ -226,24 +274,24 @@ EXPORT_SYMBOL(dma_fence_enable_sw_signaling); /** * dma_fence_add_callback - add a callback to be called when the fence * is signaled - * @fence: [in] the fence to wait on - * @cb: [in] the callback to register - * @func: [in] the function to call + * @fence: the fence to wait on + * @cb: the callback to register + * @func: the function to call * - * cb will be initialized by dma_fence_add_callback, no initialization + * @cb will be initialized by dma_fence_add_callback(), no initialization * by the caller is required. Any number of callbacks can be registered * to a fence, but a callback can only be registered to one fence at a time. * * Note that the callback can be called from an atomic context. If * fence is already signaled, this function will return -ENOENT (and - * *not* call the callback) + * *not* call the callback). * * Add a software callback to the fence. Same restrictions apply to - * refcount as it does to dma_fence_wait, however the caller doesn't need to - * keep a refcount to fence afterwards: when software access is enabled, - * the creator of the fence is required to keep the fence alive until - * after it signals with dma_fence_signal. The callback itself can be called - * from irq context. + * refcount as it does to dma_fence_wait(), however the caller doesn't need to + * keep a refcount to fence afterward dma_fence_add_callback() has returned: + * when software access is enabled, the creator of the fence is required to keep + * the fence alive until after it signals with dma_fence_signal(). The callback + * itself can be called from irq context. * * Returns 0 in case of success, -ENOENT if the fence is already signaled * and -EINVAL in case of error. @@ -292,7 +340,7 @@ EXPORT_SYMBOL(dma_fence_add_callback); /** * dma_fence_get_status - returns the status upon completion - * @fence: [in] the dma_fence to query + * @fence: the dma_fence to query * * This wraps dma_fence_get_status_locked() to return the error status * condition on a signaled fence. See dma_fence_get_status_locked() for more @@ -317,8 +365,8 @@ EXPORT_SYMBOL(dma_fence_get_status); /** * dma_fence_remove_callback - remove a callback from the signaling list - * @fence: [in] the fence to wait on - * @cb: [in] the callback to remove + * @fence: the fence to wait on + * @cb: the callback to remove * * Remove a previously queued callback from the fence. This function returns * true if the callback is successfully removed, or false if the fence has @@ -329,6 +377,9 @@ EXPORT_SYMBOL(dma_fence_get_status); * doing, since deadlocks and race conditions could occur all too easily. For * this reason, it should only ever be done on hardware lockup recovery, * with a reference held to the fence. + * + * Behaviour is undefined if @cb has not been added to @fence using + * dma_fence_add_callback() beforehand. */ bool dma_fence_remove_callback(struct dma_fence *fence, struct dma_fence_cb *cb) @@ -365,9 +416,9 @@ dma_fence_default_wait_cb(struct dma_fence *fence, struct dma_fence_cb *cb) /** * dma_fence_default_wait - default sleep until the fence gets signaled * or until timeout elapses - * @fence: [in] the fence to wait on - * @intr: [in] if true, do an interruptible wait - * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT + * @fence: the fence to wait on + * @intr: if true, do an interruptible wait + * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT * * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the * remaining timeout in jiffies on success. If timeout is zero the value one is @@ -460,12 +511,12 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count, /** * dma_fence_wait_any_timeout - sleep until any fence gets signaled * or until timeout elapses - * @fences: [in] array of fences to wait on - * @count: [in] number of fences to wait on - * @intr: [in] if true, do an interruptible wait - * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT - * @idx: [out] the first signaled fence index, meaningful only on - * positive return + * @fences: array of fences to wait on + * @count: number of fences to wait on + * @intr: if true, do an interruptible wait + * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT + * @idx: used to store the first signaled fence index, meaningful only on + * positive return * * Returns -EINVAL on custom fence wait implementation, -ERESTARTSYS if * interrupted, 0 if the wait timed out, or the remaining timeout in jiffies @@ -474,6 +525,8 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count, * Synchronous waits for the first fence in the array to be signaled. The * caller needs to hold a reference to all fences in the array, otherwise a * fence might be freed before return, resulting in undefined behavior. + * + * See also dma_fence_wait() and dma_fence_wait_timeout(). */ signed long dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count, @@ -546,19 +599,18 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout); /** * dma_fence_init - Initialize a custom fence. - * @fence: [in] the fence to initialize - * @ops: [in] the dma_fence_ops for operations on this fence - * @lock: [in] the irqsafe spinlock to use for locking this fence - * @context: [in] the execution context this fence is run on - * @seqno: [in] a linear increasing sequence number for this context + * @fence: the fence to initialize + * @ops: the dma_fence_ops for operations on this fence + * @lock: the irqsafe spinlock to use for locking this fence + * @context: the execution context this fence is run on + * @seqno: a linear increasing sequence number for this context * * Initializes an allocated fence, the caller doesn't have to keep its * refcount after committing with this fence, but it will need to hold a - * refcount again if dma_fence_ops.enable_signaling gets called. This can - * be used for other implementing other types of fence. + * refcount again if &dma_fence_ops.enable_signaling gets called. * * context and seqno are used for easy comparison between fences, allowing - * to check which fence is later by simply using dma_fence_later. + * to check which fence is later by simply using dma_fence_later(). */ void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, -- 2.17.0

7 years, 1 month

1
0
0 0

[PATCH 05/17] dma-fence: Make ->wait callback optional

by Daniel Vetter

Almost everyone uses dma_fence_default_wait. Signed-off-by: Daniel Vetter <daniel.vetter(a)ffwll.ch> Cc: Sumit Semwal <sumit.semwal(a)linaro.org> Cc: Gustavo Padovan <gustavo(a)padovan.org> Cc: linux-media(a)vger.kernel.org Cc: linaro-mm-sig(a)lists.linaro.org --- drivers/dma-buf/dma-fence-array.c | 1 - drivers/dma-buf/dma-fence.c | 5 ++++- drivers/dma-buf/sw_sync.c | 1 - include/linux/dma-fence.h | 13 ++++++++----- 4 files changed, 12 insertions(+), 8 deletions(-) diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c index dd1edfb27b61..a8c254497251 100644 --- a/drivers/dma-buf/dma-fence-array.c +++ b/drivers/dma-buf/dma-fence-array.c @@ -104,7 +104,6 @@ const struct dma_fence_ops dma_fence_array_ops = { .get_timeline_name = dma_fence_array_get_timeline_name, .enable_signaling = dma_fence_array_enable_signaling, .signaled = dma_fence_array_signaled, - .wait = dma_fence_default_wait, .release = dma_fence_array_release, }; EXPORT_SYMBOL(dma_fence_array_ops); diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 59049375bd19..30fcbe415ff4 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -158,7 +158,10 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout) return -EINVAL; trace_dma_fence_wait_start(fence); - ret = fence->ops->wait(fence, intr, timeout); + if (fence->ops->wait) + ret = fence->ops->wait(fence, intr, timeout); + else + ret = dma_fence_default_wait(fence, intr, timeout); trace_dma_fence_wait_end(fence); return ret; } diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c index 3d78ca89a605..53c1d6d36a64 100644 --- a/drivers/dma-buf/sw_sync.c +++ b/drivers/dma-buf/sw_sync.c @@ -188,7 +188,6 @@ static const struct dma_fence_ops timeline_fence_ops = { .get_timeline_name = timeline_fence_get_timeline_name, .enable_signaling = timeline_fence_enable_signaling, .signaled = timeline_fence_signaled, - .wait = dma_fence_default_wait, .release = timeline_fence_release, .fence_value_str = timeline_fence_value_str, .timeline_value_str = timeline_fence_timeline_value_str, diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index c730f569621a..d05496ff0d10 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -191,11 +191,14 @@ struct dma_fence_ops { /** * @wait: * - * Custom wait implementation, or dma_fence_default_wait. + * Custom wait implementation, defaults to dma_fence_default_wait() if + * not set. * - * Must not be NULL, set to dma_fence_default_wait for default implementation. - * the dma_fence_default_wait implementation should work for any fence, as long - * as enable_signaling works correctly. + * The dma_fence_default_wait implementation should work for any fence, as long + * as @enable_signaling works correctly. This hook allows drivers to + * have an optimized version for the case where a process context is + * already available, e.g. if @enable_signaling for the general case + * needs to set up a worker thread. * * Must return -ERESTARTSYS if the wait is intr = true and the wait was * interrupted, and remaining jiffies if fence has signaled, or 0 if wait @@ -203,7 +206,7 @@ struct dma_fence_ops { * which should be treated as if the fence is signaled. For example a hardware * lockup could be reported like that. * - * This callback is mandatory. + * This callback is optional. */ signed long (*wait)(struct dma_fence *fence, bool intr, signed long timeout); -- 2.17.0

7 years, 1 month

1
0
0 0

[PATCH 03/17] dma-fence: Make ->enable_signaling optional

by Daniel Vetter

Many drivers have a trivial implementation for ->enable_signaling. Let's make it optional by assuming that signalling is already available when the callback isn't present. Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com> Cc: Sumit Semwal <sumit.semwal(a)linaro.org> Cc: Gustavo Padovan <gustavo(a)padovan.org> Cc: linux-media(a)vger.kernel.org Cc: linaro-mm-sig(a)lists.linaro.org --- drivers/dma-buf/dma-fence.c | 13 ++++++++++++- include/linux/dma-fence.h | 3 ++- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 4edb9fd3cf47..7b5b40d6b70e 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -181,6 +181,13 @@ void dma_fence_release(struct kref *kref) } EXPORT_SYMBOL(dma_fence_release); +/** + * dma_fence_free - default release function for &dma_fence. + * @fence: fence to release + * + * This is the default implementation for &dma_fence_ops.release. It calls + * kfree_rcu() on @fence. + */ void dma_fence_free(struct dma_fence *fence) { kfree_rcu(fence, rcu); @@ -560,7 +567,7 @@ dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, spinlock_t *lock, u64 context, unsigned seqno) { BUG_ON(!lock); - BUG_ON(!ops || !ops->wait || !ops->enable_signaling || + BUG_ON(!ops || !ops->wait || !ops->get_driver_name || !ops->get_timeline_name); kref_init(&fence->refcount); @@ -572,6 +579,10 @@ dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, fence->flags = 0UL; fence->error = 0; + if (!ops->enable_signaling) + set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, + &fence->flags); + trace_dma_fence_init(fence); } EXPORT_SYMBOL(dma_fence_init); diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index f9a6848f8558..c730f569621a 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -166,7 +166,8 @@ struct dma_fence_ops { * released when the fence is signalled (through e.g. the interrupt * handler). * - * This callback is mandatory. + * This callback is optional. If this callback is not present, then the + * driver must always have signaling enabled. */ bool (*enable_signaling)(struct dma_fence *fence); -- 2.17.0

7 years, 1 month

1
0
0 0

[PATCH 01/17] dma-fence: Some kerneldoc polish for dma-fence.h

by Daniel Vetter

- Switch to inline member docs for dma_fence_ops. - Mild polish all around. - hyperlink all the things! v2: - Remove the various [in] annotations, they seem really uncommon in kerneldoc and look funny. Signed-off-by: Daniel Vetter <daniel.vetter(a)ffwll.ch> Cc: Sumit Semwal <sumit.semwal(a)linaro.org> Cc: linux-media(a)vger.kernel.org Cc: linaro-mm-sig(a)lists.linaro.org --- include/linux/dma-fence.h | 235 +++++++++++++++++++++++++------------- 1 file changed, 154 insertions(+), 81 deletions(-) diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 4c008170fe65..9d6f39bf2111 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -94,11 +94,11 @@ typedef void (*dma_fence_func_t)(struct dma_fence *fence, struct dma_fence_cb *cb); /** - * struct dma_fence_cb - callback for dma_fence_add_callback - * @node: used by dma_fence_add_callback to append this struct to fence::cb_list + * struct dma_fence_cb - callback for dma_fence_add_callback() + * @node: used by dma_fence_add_callback() to append this struct to fence::cb_list * @func: dma_fence_func_t to call * - * This struct will be initialized by dma_fence_add_callback, additional + * This struct will be initialized by dma_fence_add_callback(), additional * data can be passed along by embedding dma_fence_cb in another struct. */ struct dma_fence_cb { @@ -108,75 +108,142 @@ struct dma_fence_cb { /** * struct dma_fence_ops - operations implemented for fence - * @get_driver_name: returns the driver name. - * @get_timeline_name: return the name of the context this fence belongs to. - * @enable_signaling: enable software signaling of fence. - * @signaled: [optional] peek whether the fence is signaled, can be null. - * @wait: custom wait implementation, or dma_fence_default_wait. - * @release: [optional] called on destruction of fence, can be null - * @fill_driver_data: [optional] callback to fill in free-form debug info - * Returns amount of bytes filled, or -errno. - * @fence_value_str: [optional] fills in the value of the fence as a string - * @timeline_value_str: [optional] fills in the current value of the timeline - * as a string * - * Notes on enable_signaling: - * For fence implementations that have the capability for hw->hw - * signaling, they can implement this op to enable the necessary - * irqs, or insert commands into cmdstream, etc. This is called - * in the first wait() or add_callback() path to let the fence - * implementation know that there is another driver waiting on - * the signal (ie. hw->sw case). - * - * This function can be called from atomic context, but not - * from irq context, so normal spinlocks can be used. - * - * A return value of false indicates the fence already passed, - * or some failure occurred that made it impossible to enable - * signaling. True indicates successful enabling. - * - * fence->error may be set in enable_signaling, but only when false is - * returned. - * - * Calling dma_fence_signal before enable_signaling is called allows - * for a tiny race window in which enable_signaling is called during, - * before, or after dma_fence_signal. To fight this, it is recommended - * that before enable_signaling returns true an extra reference is - * taken on the fence, to be released when the fence is signaled. - * This will mean dma_fence_signal will still be called twice, but - * the second time will be a noop since it was already signaled. - * - * Notes on signaled: - * May set fence->error if returning true. - * - * Notes on wait: - * Must not be NULL, set to dma_fence_default_wait for default implementation. - * the dma_fence_default_wait implementation should work for any fence, as long - * as enable_signaling works correctly. - * - * Must return -ERESTARTSYS if the wait is intr = true and the wait was - * interrupted, and remaining jiffies if fence has signaled, or 0 if wait - * timed out. Can also return other error values on custom implementations, - * which should be treated as if the fence is signaled. For example a hardware - * lockup could be reported like that. - * - * Notes on release: - * Can be NULL, this function allows additional commands to run on - * destruction of the fence. Can be called from irq context. - * If pointer is set to NULL, kfree will get called instead. */ - struct dma_fence_ops { + /** + * @get_driver_name: + * + * Returns the driver name. This is a callback to allow drivers to + * compute the name at runtime, without having it to store permanently + * for each fence, or build a cache of some sort. + * + * This callback is mandatory. + */ const char * (*get_driver_name)(struct dma_fence *fence); + + /** + * @get_timeline_name: + * + * Return the name of the context this fence belongs to. This is a + * callback to allow drivers to compute the name at runtime, without + * having it to store permanently for each fence, or build a cache of + * some sort. + * + * This callback is mandatory. + */ const char * (*get_timeline_name)(struct dma_fence *fence); + + /** + * @enable_signaling: + * + * Enable software signaling of fence. + * + * For fence implementations that have the capability for hw->hw + * signaling, they can implement this op to enable the necessary + * interrupts, or insert commands into cmdstream, etc, to avoid these + * costly operations for the common case where only hw->hw + * synchronization is required. This is called in the first + * dma_fence_wait() or dma_fence_add_callback() path to let the fence + * implementation know that there is another driver waiting on the + * signal (ie. hw->sw case). + * + * This function can be called from atomic context, but not + * from irq context, so normal spinlocks can be used. + * + * A return value of false indicates the fence already passed, + * or some failure occurred that made it impossible to enable + * signaling. True indicates successful enabling. + * + * &dma_fence.error may be set in enable_signaling, but only when false + * is returned. + * + * Since many implementations can call dma_fence_signal() even when before + * @enable_signaling has been called there's a race window, where the + * dma_fence_signal() might result in the final fence reference being + * released and its memory freed. To avoid this, implementations of this + * callback should grab their own reference using dma_fence_get(), to be + * released when the fence is signalled (through e.g. the interrupt + * handler). + * + * This callback is mandatory. + */ bool (*enable_signaling)(struct dma_fence *fence); + + /** + * @signaled: + * + * Peek whether the fence is signaled, as a fastpath optimization for + * e.g. dma_fence_wait() or dma_fence_add_callback(). Note that this + * callback does not need to make any guarantees beyond that a fence + * once indicates as signalled must always return true from this + * callback. This callback may return false even if the fence has + * completed already, in this case information hasn't propogated throug + * the system yet. See also dma_fence_is_signaled(). + * + * May set &dma_fence.error if returning true. + * + * This callback is optional. + */ bool (*signaled)(struct dma_fence *fence); + + /** + * @wait: + * + * Custom wait implementation, or dma_fence_default_wait. + * + * Must not be NULL, set to dma_fence_default_wait for default implementation. + * the dma_fence_default_wait implementation should work for any fence, as long + * as enable_signaling works correctly. + * + * Must return -ERESTARTSYS if the wait is intr = true and the wait was + * interrupted, and remaining jiffies if fence has signaled, or 0 if wait + * timed out. Can also return other error values on custom implementations, + * which should be treated as if the fence is signaled. For example a hardware + * lockup could be reported like that. + * + * This callback is mandatory. + */ signed long (*wait)(struct dma_fence *fence, bool intr, signed long timeout); + + /** + * @release: + * + * Called on destruction of fence to release additional resources. + * Can be called from irq context. This callback is optional. If it is + * NULL, then dma_fence_free() is instead called as the default + * implementation. + */ void (*release)(struct dma_fence *fence); + /** + * @fill_driver_data: + * + * Callback to fill in free-form debug info Returns amount of bytes + * filled, or negative error on failure. + * + * This callback is optional. + */ int (*fill_driver_data)(struct dma_fence *fence, void *data, int size); + + /** + * @fence_value_str: + * + * Callback to fill in free-form debug info specific to this fence, like + * the sequence number. + * + * This callback is optional. + */ void (*fence_value_str)(struct dma_fence *fence, char *str, int size); + + /** + * @timeline_value_str: + * + * Fills in the current value of the timeline as a string, like the + * sequence number. This should match what @fill_driver_data prints for + * the most recently signalled fence (assuming no delayed signalling). + */ void (*timeline_value_str)(struct dma_fence *fence, char *str, int size); }; @@ -189,7 +256,7 @@ void dma_fence_free(struct dma_fence *fence); /** * dma_fence_put - decreases refcount of the fence - * @fence: [in] fence to reduce refcount of + * @fence: fence to reduce refcount of */ static inline void dma_fence_put(struct dma_fence *fence) { @@ -199,7 +266,7 @@ static inline void dma_fence_put(struct dma_fence *fence) /** * dma_fence_get - increases refcount of the fence - * @fence: [in] fence to increase refcount of + * @fence: fence to increase refcount of * * Returns the same fence, with refcount increased by 1. */ @@ -213,7 +280,7 @@ static inline struct dma_fence *dma_fence_get(struct dma_fence *fence) /** * dma_fence_get_rcu - get a fence from a reservation_object_list with * rcu read lock - * @fence: [in] fence to increase refcount of + * @fence: fence to increase refcount of * * Function returns NULL if no refcount could be obtained, or the fence. */ @@ -227,7 +294,7 @@ static inline struct dma_fence *dma_fence_get_rcu(struct dma_fence *fence) /** * dma_fence_get_rcu_safe - acquire a reference to an RCU tracked fence - * @fencep: [in] pointer to fence to increase refcount of + * @fencep: pointer to fence to increase refcount of * * Function returns NULL if no refcount could be obtained, or the fence. * This function handles acquiring a reference to a fence that may be @@ -289,14 +356,16 @@ void dma_fence_enable_sw_signaling(struct dma_fence *fence); /** * dma_fence_is_signaled_locked - Return an indication if the fence * is signaled yet. - * @fence: [in] the fence to check + * @fence: the fence to check * * Returns true if the fence was already signaled, false if not. Since this * function doesn't enable signaling, it is not guaranteed to ever return - * true if dma_fence_add_callback, dma_fence_wait or - * dma_fence_enable_sw_signaling haven't been called before. + * true if dma_fence_add_callback(), dma_fence_wait() or + * dma_fence_enable_sw_signaling() haven't been called before. * - * This function requires fence->lock to be held. + * This function requires &dma_fence.lock to be held. + * + * See also dma_fence_is_signaled(). */ static inline bool dma_fence_is_signaled_locked(struct dma_fence *fence) @@ -314,17 +383,19 @@ dma_fence_is_signaled_locked(struct dma_fence *fence) /** * dma_fence_is_signaled - Return an indication if the fence is signaled yet. - * @fence: [in] the fence to check + * @fence: the fence to check * * Returns true if the fence was already signaled, false if not. Since this * function doesn't enable signaling, it is not guaranteed to ever return - * true if dma_fence_add_callback, dma_fence_wait or - * dma_fence_enable_sw_signaling haven't been called before. + * true if dma_fence_add_callback(), dma_fence_wait() or + * dma_fence_enable_sw_signaling() haven't been called before. * * It's recommended for seqno fences to call dma_fence_signal when the * operation is complete, it makes it possible to prevent issues from * wraparound between time of issue and time of use by checking the return * value of this function before calling hardware-specific wait instructions. + * + * See also dma_fence_is_signaled_locked(). */ static inline bool dma_fence_is_signaled(struct dma_fence *fence) @@ -342,8 +413,8 @@ dma_fence_is_signaled(struct dma_fence *fence) /** * __dma_fence_is_later - return if f1 is chronologically later than f2 - * @f1: [in] the first fence's seqno - * @f2: [in] the second fence's seqno from the same context + * @f1: the first fence's seqno + * @f2: the second fence's seqno from the same context * * Returns true if f1 is chronologically later than f2. Both fences must be * from the same context, since a seqno is not common across contexts. @@ -355,8 +426,8 @@ static inline bool __dma_fence_is_later(u32 f1, u32 f2) /** * dma_fence_is_later - return if f1 is chronologically later than f2 - * @f1: [in] the first fence from the same context - * @f2: [in] the second fence from the same context + * @f1: the first fence from the same context + * @f2: the second fence from the same context * * Returns true if f1 is chronologically later than f2. Both fences must be * from the same context, since a seqno is not re-used across contexts. @@ -372,8 +443,8 @@ static inline bool dma_fence_is_later(struct dma_fence *f1, /** * dma_fence_later - return the chronologically later fence - * @f1: [in] the first fence from the same context - * @f2: [in] the second fence from the same context + * @f1: the first fence from the same context + * @f2: the second fence from the same context * * Returns NULL if both fences are signaled, otherwise the fence that would be * signaled last. Both fences must be from the same context, since a seqno is @@ -398,7 +469,7 @@ static inline struct dma_fence *dma_fence_later(struct dma_fence *f1, /** * dma_fence_get_status_locked - returns the status upon completion - * @fence: [in] the dma_fence to query + * @fence: the dma_fence to query * * Drivers can supply an optional error status condition before they signal * the fence (to indicate whether the fence was completed due to an error @@ -422,8 +493,8 @@ int dma_fence_get_status(struct dma_fence *fence); /** * dma_fence_set_error - flag an error condition on the fence - * @fence: [in] the dma_fence - * @error: [in] the error to store + * @fence: the dma_fence + * @error: the error to store * * Drivers can supply an optional error status condition before they signal * the fence, to indicate that the fence was completed due to an error @@ -449,8 +520,8 @@ signed long dma_fence_wait_any_timeout(struct dma_fence **fences, /** * dma_fence_wait - sleep until the fence gets signaled - * @fence: [in] the fence to wait on - * @intr: [in] if true, do an interruptible wait + * @fence: the fence to wait on + * @intr: if true, do an interruptible wait * * This function will return -ERESTARTSYS if interrupted by a signal, * or 0 if the fence was signaled. Other error values may be @@ -459,6 +530,8 @@ signed long dma_fence_wait_any_timeout(struct dma_fence **fences, * Performs a synchronous wait on this fence. It is assumed the caller * directly or indirectly holds a reference to the fence, otherwise the * fence might be freed before return, resulting in undefined behavior. + * + * See also dma_fence_wait_timeout() and dma_fence_wait_any_timeout(). */ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr) { -- 2.17.0

7 years, 1 month

1
0
0 0

[PATCH 1/5] dma-buf: use parameter structure for dma_buf_attach

by Christian König

Move the parameters into a structure to make it simpler to extend it in follow up patches. This also adds the importer private as parameter so that we can directly work with a completely filled in attachment structure. Signed-off-by: Christian König <christian.koenig(a)amd.com> --- drivers/dma-buf/dma-buf.c | 15 ++++++++------- drivers/gpu/drm/armada/armada_gem.c | 6 +++++- drivers/gpu/drm/drm_prime.c | 6 +++++- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 6 +++++- drivers/gpu/drm/tegra/gem.c | 6 +++++- drivers/gpu/drm/udl/udl_dmabuf.c | 6 +++++- drivers/media/common/videobuf2/videobuf2-dma-contig.c | 6 +++++- drivers/media/common/videobuf2/videobuf2-dma-sg.c | 6 +++++- drivers/staging/media/tegra-vde/tegra-vde.c | 6 +++++- include/linux/dma-buf.h | 17 +++++++++++++++-- 10 files changed, 63 insertions(+), 17 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index d78d5fc173dc..4b46982c6d9c 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -534,8 +534,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put); /** * dma_buf_attach - Add the device to dma_buf's attachments list; optionally, * calls attach() of dma_buf_ops to allow device-specific attach functionality - * @dmabuf: [in] buffer to attach device to. - * @dev: [in] device to be attached. + * @info: [in] holds all the attach related information provided + * by the importer. see &struct dma_buf_attach_info + * for further details. * * Returns struct dma_buf_attachment pointer for this attachment. Attachments * must be cleaned up by calling dma_buf_detach(). @@ -549,26 +550,26 @@ EXPORT_SYMBOL_GPL(dma_buf_put); * accessible to @dev, and cannot be moved to a more suitable place. This is * indicated with the error code -EBUSY. */ -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, - struct device *dev) +struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info) { + struct dma_buf *dmabuf = info->dmabuf; struct dma_buf_attachment *attach; int ret; - if (WARN_ON(!dmabuf || !dev)) + if (WARN_ON(!dmabuf || !info->dev)) return ERR_PTR(-EINVAL); attach = kzalloc(sizeof(*attach), GFP_KERNEL); if (!attach) return ERR_PTR(-ENOMEM); - attach->dev = dev; + attach->dev = info->dev; attach->dmabuf = dmabuf; mutex_lock(&dmabuf->lock); if (dmabuf->ops->attach) { - ret = dmabuf->ops->attach(dmabuf, dev, attach); + ret = dmabuf->ops->attach(dmabuf, info->dev, attach); if (ret) goto err_attach; } diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c index a97f509743a5..f4d1c11f57ea 100644 --- a/drivers/gpu/drm/armada/armada_gem.c +++ b/drivers/gpu/drm/armada/armada_gem.c @@ -514,6 +514,10 @@ armada_gem_prime_export(struct drm_device *dev, struct drm_gem_object *obj, struct drm_gem_object * armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) { + struct dma_buf_attach_info attach_info = { + .dev = dev->dev, + .dmabuf = buf + }; struct dma_buf_attachment *attach; struct armada_gem_object *dobj; @@ -529,7 +533,7 @@ armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) } } - attach = dma_buf_attach(buf, dev->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach); diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 7856a9b3f8a8..4da242de51c2 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -707,6 +707,10 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, struct dma_buf *dma_buf, struct device *attach_dev) { + struct dma_buf_attach_info attach_info = { + .dev = attach_dev, + .dmabuf = dma_buf + }; struct dma_buf_attachment *attach; struct sg_table *sgt; struct drm_gem_object *obj; @@ -727,7 +731,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, if (!dev->driver->gem_prime_import_sg_table) return ERR_PTR(-EINVAL); - attach = dma_buf_attach(dma_buf, attach_dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach); diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index 69a7aec49e84..7b737a883106 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -288,6 +288,10 @@ static const struct drm_i915_gem_object_ops i915_gem_object_dmabuf_ops = { struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf) { + struct dma_buf_attach_info attach_info = { + .dev = dev->dev, + .dmabuf = dma_buf + }; struct dma_buf_attachment *attach; struct drm_i915_gem_object *obj; int ret; @@ -306,7 +310,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, } /* need to attach */ - attach = dma_buf_attach(dma_buf, dev->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach); diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 8b0b4ff64bb4..1f2bee91e0ad 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -332,6 +332,10 @@ struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file, static struct tegra_bo *tegra_bo_import(struct drm_device *drm, struct dma_buf *buf) { + struct dma_buf_attach_info attach_info = { + .dev = drm->dev, + .dmabuf = buf + }; struct tegra_drm *tegra = drm->dev_private; struct dma_buf_attachment *attach; struct tegra_bo *bo; @@ -341,7 +345,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm, if (IS_ERR(bo)) return bo; - attach = dma_buf_attach(buf, drm->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) { err = PTR_ERR(attach); goto free; diff --git a/drivers/gpu/drm/udl/udl_dmabuf.c b/drivers/gpu/drm/udl/udl_dmabuf.c index 2867ed155ff6..c4db84abe231 100644 --- a/drivers/gpu/drm/udl/udl_dmabuf.c +++ b/drivers/gpu/drm/udl/udl_dmabuf.c @@ -243,6 +243,10 @@ static int udl_prime_create(struct drm_device *dev, struct drm_gem_object *udl_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf) { + struct dma_buf_attach_info attach_info = { + .dev = dev->dev, + .dmabuf = dma_buf + }; struct dma_buf_attachment *attach; struct sg_table *sg; struct udl_gem_object *uobj; @@ -250,7 +254,7 @@ struct drm_gem_object *udl_gem_prime_import(struct drm_device *dev, /* need to attach */ get_device(dev->dev); - attach = dma_buf_attach(dma_buf, dev->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) { put_device(dev->dev); return ERR_CAST(attach); diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c index f1178f6f434d..93bd1f40f756 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c @@ -677,6 +677,10 @@ static void vb2_dc_detach_dmabuf(void *mem_priv) static void *vb2_dc_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, unsigned long size, enum dma_data_direction dma_dir) { + struct dma_buf_attach_info attach_info = { + .dev = dev, + .dmabuf = dbuf + }; struct vb2_dc_buf *buf; struct dma_buf_attachment *dba; @@ -692,7 +696,7 @@ static void *vb2_dc_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, buf->dev = dev; /* create attachment for the dmabuf with the user device */ - dba = dma_buf_attach(dbuf, buf->dev); + dba = dma_buf_attach(&attach_info); if (IS_ERR(dba)) { pr_err("failed to attach dmabuf\n"); kfree(buf); diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c index 753ed3138dcc..4e61050ba87f 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c @@ -609,6 +609,10 @@ static void vb2_dma_sg_detach_dmabuf(void *mem_priv) static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, unsigned long size, enum dma_data_direction dma_dir) { + struct dma_buf_attach_info attach_info = { + .dev = dev, + .dmabuf = dbuf + }; struct vb2_dma_sg_buf *buf; struct dma_buf_attachment *dba; @@ -624,7 +628,7 @@ static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, buf->dev = dev; /* create attachment for the dmabuf with the user device */ - dba = dma_buf_attach(dbuf, buf->dev); + dba = dma_buf_attach(&attach_info); if (IS_ERR(dba)) { pr_err("failed to attach dmabuf\n"); kfree(buf); diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c index c47659e96089..25d112443b0d 100644 --- a/drivers/staging/media/tegra-vde/tegra-vde.c +++ b/drivers/staging/media/tegra-vde/tegra-vde.c @@ -529,6 +529,10 @@ static int tegra_vde_attach_dmabuf(struct device *dev, size_t *size, enum dma_data_direction dma_dir) { + struct dma_buf_attach_info attach_info = { + .dev = dev, + .dmabuf = dmabuf + }; struct dma_buf_attachment *attachment; struct dma_buf *dmabuf; struct sg_table *sgt; @@ -547,7 +551,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev, return -EINVAL; } - attachment = dma_buf_attach(dmabuf, dev); + attachment = dma_buf_attach(&attach_info); if (IS_ERR(attachment)) { dev_err(dev, "Failed to attach dmabuf\n"); err = PTR_ERR(attachment); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 085db2fee2d7..414b4dde5eb7 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -362,6 +362,19 @@ struct dma_buf_export_info { struct dma_buf_export_info name = { .exp_name = KBUILD_MODNAME, \ .owner = THIS_MODULE } +/** + * struct dma_buf_attach_info - holds information needed to attach to a dma_buf + * @dmabuf: the exported dma_buf + * @dev: the device which wants to import the attachment + * + * This structure holds the information required to attach to a buffer. Used + * with dma_buf_attach() only. + */ +struct dma_buf_attach_info { + struct dma_buf *dmabuf; + struct device *dev; +}; + /** * get_dma_buf - convenience wrapper for get_file. * @dmabuf: [in] pointer to dma_buf @@ -376,8 +389,8 @@ static inline void get_dma_buf(struct dma_buf *dmabuf) get_file(dmabuf->file); } -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, - struct device *dev); +struct dma_buf_attachment * +dma_buf_attach(const struct dma_buf_attach_info *info); void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *dmabuf_attach); -- 2.14.1

7 years, 1 month

2
6
0 0

[RfC PATCH] Add udmabuf misc device

by Gerd Hoffmann

A driver to let userspace turn iovecs into dma-bufs. Use case: Allows qemu pass around dmabufs for the guest framebuffer. https://www.kraxel.org/cgit/qemu/log/?h=sirius/udmabuf has an experimental patch. Also allows qemu to export guest virtio-gpu resources as host dmabufs. Should be possible to use it to display guest wayland windows on the host display server. virtio-gpu ressources can be chunked so we will actually need multiple iovec entries. UNTESTED. Want collect some feedback on the general approach with this RfC series. Can this work? If not, better ideas? Question: Must this be hooked into some kind of mlock accounting, to limit the amout of memory userspace is allowed to pin this way? Or will get_user_pages_fast() handle that for me? Known issue: Driver API isn't complete yet. Need add some flags, for example to support read-only buffers. Cc: David Airlie <airlied(a)linux.ie> Cc: Tomeu Vizoso <tomeu.vizoso(a)collabora.com> Signed-off-by: Gerd Hoffmann <kraxel(a)redhat.com> --- include/uapi/linux/udmabuf.h | 21 ++++ drivers/dma-buf/udmabuf.c | 250 +++++++++++++++++++++++++++++++++++++++++++ drivers/dma-buf/Kconfig | 7 ++ drivers/dma-buf/Makefile | 1 + 4 files changed, 279 insertions(+) create mode 100644 include/uapi/linux/udmabuf.h create mode 100644 drivers/dma-buf/udmabuf.c diff --git a/include/uapi/linux/udmabuf.h b/include/uapi/linux/udmabuf.h new file mode 100644 index 0000000000..fd2fa441fe --- /dev/null +++ b/include/uapi/linux/udmabuf.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_LINUX_UDMABUF_H +#define _UAPI_LINUX_UDMABUF_H + +#include <linux/types.h> +#include <linux/ioctl.h> + +struct udmabuf_iovec { + __u64 base; + __u64 len; +}; + +struct udmabuf_create { + __u32 flags; + __u32 niov; + struct udmabuf_iovec iovs[]; +}; + +#define UDMABUF_CREATE _IOW(0x42, 0x23, struct udmabuf_create) + +#endif /* _UAPI_LINUX_UDMABUF_H */ diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c new file mode 100644 index 0000000000..ec012d7ac7 --- /dev/null +++ b/drivers/dma-buf/udmabuf.c @@ -0,0 +1,250 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#include <linux/init.h> +#include <linux/module.h> +#include <linux/device.h> +#include <linux/kernel.h> +#include <linux/slab.h> +#include <linux/miscdevice.h> +#include <linux/dma-buf.h> +#include <linux/highmem.h> + +#include <uapi/linux/udmabuf.h> + +struct udmabuf { + u32 pagecount; + struct page **pages; +}; + +static int udmabuf_vm_fault(struct vm_fault *vmf) +{ + struct vm_area_struct *vma = vmf->vma; + struct udmabuf *ubuf = vma->vm_private_data; + + if (WARN_ON(vmf->pgoff >= ubuf->pagecount)) + return VM_FAULT_SIGBUS; + + vmf->page = ubuf->pages[vmf->pgoff]; + get_page(vmf->page); + return 0; +} + +static const struct vm_operations_struct udmabuf_vm_ops = { + .fault = udmabuf_vm_fault, +}; + +static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma) +{ + struct udmabuf *ubuf = buf->priv; + + if ((vma->vm_flags & VM_SHARED) == 0) + return -EINVAL; + + vma->vm_ops = &udmabuf_vm_ops; + vma->vm_private_data = ubuf; + return 0; +} + +static struct sg_table *map_udmabuf(struct dma_buf_attachment *at, + enum dma_data_direction direction) +{ + struct udmabuf *ubuf = at->dmabuf->priv; + struct sg_table *sg; + + sg = kzalloc(sizeof(*sg), GFP_KERNEL); + if (!sg) + goto err1; + if (sg_alloc_table_from_pages(sg, ubuf->pages, ubuf->pagecount, + 0, ubuf->pagecount << PAGE_SHIFT, + GFP_KERNEL) < 0) + goto err2; + if (!dma_map_sg(at->dev, sg->sgl, sg->nents, direction)) + goto err3; + + return sg; + +err3: + sg_free_table(sg); +err2: + kfree(sg); +err1: + return ERR_PTR(-ENOMEM); +} + +static void unmap_udmabuf(struct dma_buf_attachment *at, + struct sg_table *sg, + enum dma_data_direction direction) +{ + sg_free_table(sg); + kfree(sg); +} + +static void release_udmabuf(struct dma_buf *buf) +{ + struct udmabuf *ubuf = buf->priv; + pgoff_t pg; + + for (pg = 0; pg < ubuf->pagecount; pg++) + put_page(ubuf->pages[pg]); + kfree(ubuf->pages); + kfree(ubuf); +} + +static void *kmap_atomic_udmabuf(struct dma_buf *buf, unsigned long offset) +{ + struct udmabuf *ubuf = buf->priv; + struct page *page = ubuf->pages[offset >> PAGE_SHIFT]; + + return kmap_atomic(page); +} + +static void *kmap_udmabuf(struct dma_buf *buf, unsigned long offset) +{ + struct udmabuf *ubuf = buf->priv; + struct page *page = ubuf->pages[offset >> PAGE_SHIFT]; + + return kmap(page); +} + +static struct dma_buf_ops udmabuf_ops = { + .map_dma_buf = map_udmabuf, + .unmap_dma_buf = unmap_udmabuf, + .release = release_udmabuf, + .map_atomic = kmap_atomic_udmabuf, + .map = kmap_udmabuf, + .mmap = mmap_udmabuf, +}; + +static long udmabuf_ioctl_create(struct file *filp, unsigned long arg) +{ + struct udmabuf_create create; + struct udmabuf_iovec *iovs; + struct udmabuf *ubuf; + DEFINE_DMA_BUF_EXPORT_INFO(exp_info); + struct dma_buf *buf; + pgoff_t pgoff, pgcnt; + u32 iov; + int ret; + + if (copy_from_user(&create, (void __user *)arg, + sizeof(struct udmabuf_create))) + return -EFAULT; + + iovs = kmalloc_array(create.niov, sizeof(struct udmabuf_iovec), + GFP_KERNEL); + if (!iovs) + return -ENOMEM; + + arg += offsetof(struct udmabuf_create, iovs); + ret = -EFAULT; + if (copy_from_user(iovs, (void __user *)arg, + create.niov * sizeof(struct udmabuf_iovec))) + goto err_free_iovs; + + ubuf = kzalloc(sizeof(struct udmabuf), GFP_KERNEL); + if (!ubuf) + goto err_free_iovs; + + ret = -EINVAL; + for (iov = 0; iov < create.niov; iov++) { + if (!IS_ALIGNED(iovs[iov].base, PAGE_SIZE)) + goto err_free_iovs; + if (!IS_ALIGNED(iovs[iov].len, PAGE_SIZE)) + goto err_free_iovs; + ubuf->pagecount += iovs[iov].len >> PAGE_SHIFT; + } + + ret = -ENOMEM; + ubuf->pages = kmalloc_array(ubuf->pagecount, sizeof(struct page*), + GFP_KERNEL); + if (!ubuf->pages) + goto err_free_buf; + + pgoff = 0; + for (iov = 0; iov < create.niov; iov++) { + pgcnt = iovs[iov].len >> PAGE_SHIFT; + while (pgcnt > 0) { + ret = get_user_pages_fast(iovs[iov].base, pgcnt, + true, /* write */ + ubuf->pages + pgoff); + if (ret < 0) + goto err_put_pages; + pgoff += ret; + pgcnt -= ret; + } + } + + exp_info.ops = &udmabuf_ops; + exp_info.size = ubuf->pagecount << PAGE_SHIFT; + exp_info.priv = ubuf; + + buf = dma_buf_export(&exp_info); + if (IS_ERR(buf)) { + ret = PTR_ERR(buf); + goto err_put_pages; + } + + kfree(iovs); + return dma_buf_fd(buf, 0); + +err_put_pages: + while (pgoff > 0) + put_page(ubuf->pages[--pgoff]); +err_free_buf: + kfree(ubuf->pages); + kfree(ubuf); +err_free_iovs: + kfree(iovs); + return ret; +} + +static long udmabuf_ioctl(struct file *filp, unsigned int ioctl, + unsigned long arg) +{ + long ret; + + switch (ioctl) { + case UDMABUF_CREATE: + ret = udmabuf_ioctl_create(filp, arg); + break; + default: + ret = -EINVAL; + break; + } + return ret; +} + +static const struct file_operations udmabuf_fops = { + .owner = THIS_MODULE, + .unlocked_ioctl = udmabuf_ioctl, +}; + +static struct miscdevice udmabuf_misc = { + .minor = MISC_DYNAMIC_MINOR, + .name = "udmabuf", + .fops = &udmabuf_fops, +}; + +static int __init udmabuf_dev_init(void) +{ + int ret; + + ret = misc_register(&udmabuf_misc); + if (ret) + return ret; + + return 0; +} + +static void __exit udmabuf_dev_exit(void) +{ + misc_deregister(&udmabuf_misc); +} + +module_init(udmabuf_dev_init) +module_exit(udmabuf_dev_exit) + +MODULE_LICENSE("GPL v2"); diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig index ed3b785bae..5876b52554 100644 --- a/drivers/dma-buf/Kconfig +++ b/drivers/dma-buf/Kconfig @@ -30,4 +30,11 @@ config SW_SYNC WARNING: improper use of this can result in deadlocking kernel drivers from userspace. Intended for test and debug only. +config UDMABUF + tristate "userspace dmabuf misc driver" + default n + depends on DMA_SHARED_BUFFER + ---help--- + A driver to let userspace turn iovs into dma-bufs. + endmenu diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index c33bf88631..0913a6ccab 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,3 +1,4 @@ obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o +obj-$(CONFIG_UDMABUF) += udmabuf.o -- 2.9.3

7 years, 1 month

7
26
0 0

buffer sharing across VMs - xen-zcopy and hyper_dmabuf discussion

by Dongwon Kim

(changed subject and decoupling from udmabuf thread) On Wed, Apr 11, 2018 at 08:59:32AM +0300, Oleksandr Andrushchenko wrote: > On 04/10/2018 08:26 PM, Dongwon Kim wrote: > >On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr Andrushchenko wrote: > >>On 04/06/2018 09:57 PM, Dongwon Kim wrote: > >>>On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko wrote: > >>>>On 04/06/2018 02:57 PM, Gerd Hoffmann wrote: > >>>>> Hi, > >>>>> > >>>>>>>I fail to see any common ground for xen-zcopy and udmabuf ... > >>>>>>Does the above mean you can assume that xen-zcopy and udmabuf > >>>>>>can co-exist as two different solutions? > >>>>>Well, udmabuf route isn't fully clear yet, but yes. > >>>>> > >>>>>See also gvt (intel vgpu), where the hypervisor interface is abstracted > >>>>>away into a separate kernel modules even though most of the actual vgpu > >>>>>emulation code is common. > >>>>Thank you for your input, I'm just trying to figure out > >>>>which of the three z-copy solutions intersect and how much > >>>>>>And what about hyper-dmabuf? > >>>xen z-copy solution is pretty similar fundamentally to hyper_dmabuf > >>>in terms of these core sharing feature: > >>> > >>>1. the sharing process - import prime/dmabuf from the producer -> extract > >>>underlying pages and get those shared -> return references for shared pages > >Another thing is danvet was kind of against to the idea of importing existing > >dmabuf/prime buffer and forward it to the other domain due to synchronization > >issues. He proposed to make hyper_dmabuf only work as an exporter so that it > >can have a full control over the buffer. I think we need to talk about this > >further as well. > Yes, I saw this. But this limits the use-cases so much. I agree. Our current approach is a lot more flexible. You can find very similar feedback in my reply to those review messages. However, I also understand Daniel's concern as well. I believe we need more dicussion regarding this matter. > For instance, running Android as a Guest (which uses ION to allocate > buffers) means that finally HW composer will import dma-buf into > the DRM driver. Then, in case of xen-front for example, it needs to be > shared with the backend (Host side). Of course, we can change user-space > to make xen-front allocate the buffers (make it exporter), but what we try > to avoid is to change user-space which in normal world would have remain > unchanged otherwise. > So, I do think we have to support this use-case and just have to understand > the complexity. > > > > >danvet, can you comment on this topic? > > > >>>2. the page sharing mechanism - it uses Xen-grant-table. > >>> > >>>And to give you a quick summary of differences as far as I understand > >>>between two implementations (please correct me if I am wrong, Oleksandr.) > >>> > >>>1. xen-zcopy is DRM specific - can import only DRM prime buffer > >>>while hyper_dmabuf can export any dmabuf regardless of originator > >>Well, this is true. And at the same time this is just a matter > >>of extending the API: xen-zcopy is a helper driver designed for > >>xen-front/back use-case, so this is why it only has DRM PRIME API > >>>2. xen-zcopy doesn't seem to have dma-buf synchronization between two VMs > >>>while (as danvet called it as remote dmabuf api sharing) hyper_dmabuf sends > >>>out synchronization message to the exporting VM for synchronization. > >>This is true. Again, this is because of the use-cases it covers. > >>But having synchronization for a generic solution seems to be a good idea. > >Yeah, understood xen-zcopy works ok with your use case. But I am just curious > >if it is ok not to have any inter-domain synchronization in this sharing model. > The synchronization is done with displif protocol [1] > >The buffer being shared is technically dma-buf and originator needs to be able > >to keep track of it. > As I am working in DRM terms the tracking is done by the DRM core > for me for free. (This might be one of the reasons Daniel sees DRM > based implementation fit very good from code-reuse POV). yeah but once you have a DRM object (whether it's dmabuf or not) on a remote domain, it is totally new object and out of sync (correct me if I am wrong) with original DRM prime, isn't it? How could these two different but based on same pages be synchronized? > > > >>>3. 1-level references - when using grant-table for sharing pages, there will > >>>be same # of refs (each 8 byte) > >>To be precise, grant ref is 4 bytes > >You are right. Thanks for correction.;) > > > >>>as # of shared pages, which is passed to > >>>the userspace to be shared with importing VM in case of xen-zcopy. > >>The reason for that is that xen-zcopy is a helper driver, e.g. > >>the grant references come from the display backend [1], which implements > >>Xen display protocol [2]. So, effectively the backend extracts references > >>from frontend's requests and passes those to xen-zcopy as an array > >>of refs. > >>> Compared > >>>to this, hyper_dmabuf does multiple level addressing to generate only one > >>>reference id that represents all shared pages. > >>In the protocol [2] only one reference to the gref directory is passed > >>between VMs > >>(and the gref directory is a single-linked list of shared pages containing > >>all > >>of the grefs of the buffer). > >ok, good to know. I will look into its implementation in more details but is > >this gref directory (chained grefs) something that can be used for any general > >memory sharing use case or is it jsut for xen-display (in current code base)? > Not to mislead you: one grant ref is passed via displif protocol, > but the page it's referencing contains the rest of the grant refs. > I checked displif.h. I like the concept of chaining 2nd level grefs. As you should have already realized, our multi-level addressing is almost identical to gref directory except that we defined another level on top to address multiple 2nd level grefs instead of creating a linked list. And I see there would be an advantage in terms of memory saving in your method. Now I think why it should be remaining as one of displif features. I think we could expand this to any type of large buffer sharing use-case in Xen (possibly as an extension to grant-table driver?) > As to if this can be used for any memory: yes. It is the same for > sndif and displif Xen protocols, but defined twice as strictly speaking > sndif and displif are two separate protocols. > > While reviewing your RFC v2 one of the comments I had [2] was that if we > can start from defining such a generic protocol for hyper-dmabuf. > It can be a header file, which not only has the description part > (which then become a part of Documentation/...rst file), but also defines > all the required constants for requests, responses, defines message formats, > state diagrams etc. all at one place. Of course this protocol must not be > Xen specific, but be OS/hypervisor agnostic. > Having that will trigger a new round of discussion, so we have it all > designed > and discussed before we start implementing. > > Besides the protocol we have to design UAPI part as well and make sure > the hyper-dmabuf is not only accessible from user-space, but there will be > number > of kernel-space users as well. > > > >>>4. inter VM messaging (hype_dmabuf only) - hyper_dmabuf has inter-vm msg > >>>communication defined for dmabuf synchronization and private data (meta > >>>info that Matt Roper mentioned) exchange. > >>This is true, xen-zcopy has no means for inter VM sync and meta-data, > >>simply because it doesn't have any code for inter VM exchange in it, > >>e.g. the inter VM protocol is handled by the backend [1]. > >>>5. driver-to-driver notification (hyper_dmabuf only) - importing VM gets > >>>notified when newdmabuf is exported from other VM - uevent can be optionally > >>>generated when this happens. > >>> > >>>6. structure - hyper_dmabuf is targetting to provide a generic solution for > >>>inter-domain dmabuf sharing for most hypervisors, which is why it has two > >>>layers as mattrope mentioned, front-end that contains standard API and backend > >>>that is specific to hypervisor. > >>Again, xen-zcopy is decoupled from inter VM communication > >>>>>No idea, didn't look at it in detail. > >>>>> > >>>>>Looks pretty complex from a distant view. Maybe because it tries to > >>>>>build a communication framework using dma-bufs instead of a simple > >>>>>dma-buf passing mechanism. > >>>we started with simple dma-buf sharing but realized there are many > >>>things we need to consider in real use-case, so we added communication > >>>, notification and dma-buf synchronization then re-structured it to > >>>front-end and back-end (this made things more compicated..) since Xen > >>>was not our only target. Also, we thought passing the reference for the > >>>buffer (hyper_dmabuf_id) is not secure so added uvent mechanism later. > >>> > >>>>Yes, I am looking at it now, trying to figure out the full story > >>>>and its implementation. BTW, Intel guys were about to share some > >>>>test application for hyper-dmabuf, maybe I have missed one. > >>>>It could probably better explain the use-cases and the complexity > >>>>they have in hyper-dmabuf. > >>>One example is actually in github. If you want take a look at it, please > >>>visit: > >>> > >>>https://github.com/downor/linux_hyper_dmabuf_test/tree/xen/simple_export > >>Thank you, I'll have a look > >>>>>Like xen-zcopy it seems to depend on the idea that the hypervisor > >>>>>manages all memory it is easy for guests to share pages with the help of > >>>>>the hypervisor. > >>>>So, for xen-zcopy we were not trying to make it generic, > >>>>it just solves display (dumb) zero-copying use-cases for Xen. > >>>>We implemented it as a DRM helper driver because we can't see any > >>>>other use-cases as of now. > >>>>For example, we also have Xen para-virtualized sound driver, but > >>>>its buffer memory usage is not comparable to what display wants > >>>>and it works somewhat differently (e.g. there is no "frame done" > >>>>event, so one can't tell when the sound buffer can be "flipped"). > >>>>At the same time, we do not use virtio-gpu, so this could probably > >>>>be one more candidate for shared dma-bufs some day. > >>>>> Which simply isn't the case on kvm. > >>>>> > >>>>>hyper-dmabuf and xen-zcopy could maybe share code, or hyper-dmabuf build > >>>>>on top of xen-zcopy. > >>>>Hm, I can imagine that: xen-zcopy could be a library code for hyper-dmabuf > >>>>in terms of implementing all that page sharing fun in multiple directions, > >>>>e.g. Host->Guest, Guest->Host, Guest<->Guest. > >>>>But I'll let Matt and Dongwon to comment on that. > >>>I think we can definitely collaborate. Especially, maybe we are using some > >>>outdated sharing mechanism/grant-table mechanism in our Xen backend (thanks > >>>for bringing that up Oleksandr). However, the question is once we collaborate > >>>somehow, can xen-zcopy's usecase use the standard API that hyper_dmabuf > >>>provides? I don't think we need different IOCTLs that do the same in the final > >>>solution. > >>> > >>If you think of xen-zcopy as a library (which implements Xen > >>grant references mangling) and DRM PRIME wrapper on top of that > >>library, we can probably define proper API for that library, > >>so both xen-zcopy and hyper-dmabuf can use it. What is more, I am > >>about to start upstreaming Xen para-virtualized sound device driver soon, > >>which also uses similar code and gref passing mechanism [3]. > >>(Actually, I was about to upstream drm/xen-front, drm/xen-zcopy and > >>snd/xen-front and then propose a Xen helper library for sharing big buffers, > >>so common code of the above drivers can use the same code w/o code > >>duplication) > >I think it is possible to use your functions for memory sharing part in > >hyper_dmabuf's backend (this 'backend' means the layer that does page sharing > >and inter-vm communication with xen-specific way.), so why don't we work on > >"Xen helper library for sharing big buffers" first while we continue our > >discussion on the common API layer that can cover any dmabuf sharing cases. > > > Well, I would love we reuse the code that I have, but I also > understand that it was limited by my use-cases. So, I do not > insist we have to ;) > If we start designing and discussing hyper-dmabuf protocol we of course > can work on this helper library in parallel. > >>Thank you, > >>Oleksandr > >> > >>P.S. All, is it a good idea to move this out of udmabuf thread into a > >>dedicated one? > >Either way is fine with me. > So, if you can start designing the protocol we may have a dedicated mail > thread for that. I will try to help with the protocol as much as I can Sure thanks. We can talk about it. just FYI, I have prepared a application note that contains definition of hyper_dmabuf messages included in RFC v2 patch. That would be a great starting point. It will be great if you can review it. > > >>>>>cheers, > >>>>> Gerd > >>>>> > >>>>Thank you, > >>>>Oleksandr > >>>> > >>>>P.S. Sorry for making your original mail thread to discuss things much > >>>>broader than your RFC... > >>>> > >>[1] https://github.com/xen-troops/displ_be > >>[2] https://elixir.bootlin.com/linux/v4.16-rc7/source/include/xen/interface/io/… > >>[3] https://elixir.bootlin.com/linux/v4.16-rc7/source/include/xen/interface/io/… > >> > [1] https://elixir.bootlin.com/linux/v4.16-rc7/source/include/xen/interface/io/… > [2] > https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00685.html

7 years, 1 month

1
0
0 0

[RFC PATCH v2 0/9] hyper_dmabuf: Hyper_DMABUF driver

by Dongwon Kim

This patch series contains the implementation of a new device driver, hyper_DMABUF driver, which provides a way to expand the boundary of Linux DMA-BUF sharing to across different VM instances in Multi-OS platform enabled by a Hypervisor (e.g. XEN) This version 2 series is basically refactored version of old series starting with "[RFC PATCH 01/60] hyper_dmabuf: initial working version of hyper_dmabuf drv" Implementation details of this driver are described in the reference guide added by the second patch, "[RFC PATCH v2 2/5] hyper_dmabuf: architecture specification and reference guide". Attaching 'Overview' section here as a quick summary. ------------------------------------------------------------------------------ Section 1. Overview ------------------------------------------------------------------------------ Hyper_DMABUF driver is a Linux device driver running on multiple Virtual achines (VMs), which expands DMA-BUF sharing capability to the VM environment where multiple different OS instances need to share same physical data without data-copy across VMs. To share a DMA_BUF across VMs, an instance of the Hyper_DMABUF drv on the exporting VM (so called, “exporter”) imports a local DMA_BUF from the original producer of the buffer, then re-exports it with an unique ID, hyper_dmabuf_id for the buffer to the importing VM (so called, “importer”). Another instance of the Hyper_DMABUF driver on importer registers a hyper_dmabuf_id together with reference information for the shared physical pages associated with the DMA_BUF to its database when the export happens. The actual mapping of the DMA_BUF on the importer’s side is done by the Hyper_DMABUF driver when user space issues the IOCTL command to access the shared DMA_BUF. The Hyper_DMABUF driver works as both an importing and exporting driver as is, that is, no special configuration is required. Consequently, only a single module per VM is needed to enable cross-VM DMA_BUF exchange. ------------------------------------------------------------------------------ There is a git repository at github.com where this series of patches are all integrated in Linux kernel tree based on the commit: commit ae64f9bd1d3621b5e60d7363bc20afb46aede215 Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Date: Sun Dec 3 11:01:47 2018 -0500 Linux 4.15-rc2 https://github.com/downor/linux_hyper_dmabuf.git hyper_dmabuf_integration_v4 Dongwon Kim, Mateusz Polrola (9): hyper_dmabuf: initial upload of hyper_dmabuf drv core framework hyper_dmabuf: architecture specification and reference guide MAINTAINERS: adding Hyper_DMABUF driver section in MAINTAINERS hyper_dmabuf: user private data attached to hyper_DMABUF hyper_dmabuf: hyper_DMABUF synchronization across VM hyper_dmabuf: query ioctl for retreiving various hyper_DMABUF info hyper_dmabuf: event-polling mechanism for detecting a new hyper_DMABUF hyper_dmabuf: threaded interrupt in Xen-backend hyper_dmabuf: default backend for XEN hypervisor Documentation/hyper-dmabuf-sharing.txt | 734 ++++++++++++++++ MAINTAINERS | 11 + drivers/dma-buf/Kconfig | 2 + drivers/dma-buf/Makefile | 1 + drivers/dma-buf/hyper_dmabuf/Kconfig | 50 ++ drivers/dma-buf/hyper_dmabuf/Makefile | 44 + .../backends/xen/hyper_dmabuf_xen_comm.c | 944 +++++++++++++++++++++ .../backends/xen/hyper_dmabuf_xen_comm.h | 78 ++ .../backends/xen/hyper_dmabuf_xen_comm_list.c | 158 ++++ .../backends/xen/hyper_dmabuf_xen_comm_list.h | 67 ++ .../backends/xen/hyper_dmabuf_xen_drv.c | 46 + .../backends/xen/hyper_dmabuf_xen_drv.h | 53 ++ .../backends/xen/hyper_dmabuf_xen_shm.c | 525 ++++++++++++ .../backends/xen/hyper_dmabuf_xen_shm.h | 46 + drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c | 410 +++++++++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.h | 122 +++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.c | 122 +++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.h | 38 + drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.c | 135 +++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h | 53 ++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c | 794 +++++++++++++++++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.h | 52 ++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.c | 295 +++++++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.h | 73 ++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c | 416 +++++++++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h | 89 ++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.c | 415 +++++++++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.h | 34 + drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.c | 174 ++++ drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.h | 36 + .../hyper_dmabuf/hyper_dmabuf_remote_sync.c | 324 +++++++ .../hyper_dmabuf/hyper_dmabuf_remote_sync.h | 32 + .../dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.c | 257 ++++++ .../dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.h | 43 + drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_struct.h | 143 ++++ include/uapi/linux/hyper_dmabuf.h | 134 +++ 36 files changed, 6950 insertions(+) create mode 100644 Documentation/hyper-dmabuf-sharing.txt create mode 100644 drivers/dma-buf/hyper_dmabuf/Kconfig create mode 100644 drivers/dma-buf/hyper_dmabuf/Makefile create mode 100644 drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c create mode 100644 drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.h create mode 100644 drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm_list.c create mode 100644 drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm_list.h create mode 100644 drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_drv.c create mode 100644 drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_drv.h create mode 100644 drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_shm.c create mode 100644 drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_shm.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_remote_sync.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_remote_sync.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.c create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.h create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_struct.h create mode 100644 include/uapi/linux/hyper_dmabuf.h -- 2.16.1

7 years, 1 month

5
20
0 0

[PATCH v2] Add udmabuf misc device

by Gerd Hoffmann

A driver to let userspace turn iovecs into dma-bufs. Use case: Allows qemu create dmabufs for the vga framebuffer or virtio-gpu ressources. Then they can be passed around to display those guest things on the host. To spice client for classic full framebuffer display, and hopefully some day to wayland server for seamless guest window display. Those dma-bufs are accounted against user's shm mlock bucket as the pages are effectively locked in memory. Cc: David Airlie <airlied(a)linux.ie> Cc: Tomeu Vizoso <tomeu.vizoso(a)collabora.com> Cc: Daniel Vetter <daniel(a)ffwll.ch> Signed-off-by: Gerd Hoffmann <kraxel(a)redhat.com> --- include/uapi/linux/udmabuf.h | 23 ++ drivers/dma-buf/udmabuf.c | 261 ++++++++++++++++++++++ tools/testing/selftests/drivers/dma-buf/udmabuf.c | 69 ++++++ drivers/dma-buf/Kconfig | 7 + drivers/dma-buf/Makefile | 1 + tools/testing/selftests/drivers/dma-buf/Makefile | 5 + 6 files changed, 366 insertions(+) create mode 100644 include/uapi/linux/udmabuf.h create mode 100644 drivers/dma-buf/udmabuf.c create mode 100644 tools/testing/selftests/drivers/dma-buf/udmabuf.c create mode 100644 tools/testing/selftests/drivers/dma-buf/Makefile diff --git a/include/uapi/linux/udmabuf.h b/include/uapi/linux/udmabuf.h new file mode 100644 index 0000000000..54ceba203a --- /dev/null +++ b/include/uapi/linux/udmabuf.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_LINUX_UDMABUF_H +#define _UAPI_LINUX_UDMABUF_H + +#include <linux/types.h> +#include <linux/ioctl.h> + +struct udmabuf_iovec { + __u64 base; + __u64 len; +}; + +#define UDMABUF_FLAGS_CLOEXEC 0x01 + +struct udmabuf_create { + __u32 flags; + __u32 niov; + struct udmabuf_iovec iovs[]; +}; + +#define UDMABUF_CREATE _IOW(0x42, 0x23, struct udmabuf_create) + +#endif /* _UAPI_LINUX_UDMABUF_H */ diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c new file mode 100644 index 0000000000..664ab4ee4e --- /dev/null +++ b/drivers/dma-buf/udmabuf.c @@ -0,0 +1,261 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#include <linux/init.h> +#include <linux/module.h> +#include <linux/device.h> +#include <linux/kernel.h> +#include <linux/slab.h> +#include <linux/miscdevice.h> +#include <linux/dma-buf.h> +#include <linux/highmem.h> +#include <linux/cred.h> + +#include <uapi/linux/udmabuf.h> + +struct udmabuf { + u32 pagecount; + struct page **pages; + struct user_struct *owner; +}; + +static int udmabuf_vm_fault(struct vm_fault *vmf) +{ + struct vm_area_struct *vma = vmf->vma; + struct udmabuf *ubuf = vma->vm_private_data; + + if (WARN_ON(vmf->pgoff >= ubuf->pagecount)) + return VM_FAULT_SIGBUS; + + vmf->page = ubuf->pages[vmf->pgoff]; + get_page(vmf->page); + return 0; +} + +static const struct vm_operations_struct udmabuf_vm_ops = { + .fault = udmabuf_vm_fault, +}; + +static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma) +{ + struct udmabuf *ubuf = buf->priv; + + if ((vma->vm_flags & VM_SHARED) == 0) + return -EINVAL; + + vma->vm_ops = &udmabuf_vm_ops; + vma->vm_private_data = ubuf; + return 0; +} + +static struct sg_table *map_udmabuf(struct dma_buf_attachment *at, + enum dma_data_direction direction) +{ + struct udmabuf *ubuf = at->dmabuf->priv; + struct sg_table *sg; + + sg = kzalloc(sizeof(*sg), GFP_KERNEL); + if (!sg) + goto err1; + if (sg_alloc_table_from_pages(sg, ubuf->pages, ubuf->pagecount, + 0, ubuf->pagecount << PAGE_SHIFT, + GFP_KERNEL) < 0) + goto err2; + if (!dma_map_sg(at->dev, sg->sgl, sg->nents, direction)) + goto err3; + + return sg; + +err3: + sg_free_table(sg); +err2: + kfree(sg); +err1: + return ERR_PTR(-ENOMEM); +} + +static void unmap_udmabuf(struct dma_buf_attachment *at, + struct sg_table *sg, + enum dma_data_direction direction) +{ + sg_free_table(sg); + kfree(sg); +} + +static void release_udmabuf(struct dma_buf *buf) +{ + struct udmabuf *ubuf = buf->priv; + pgoff_t pg; + + for (pg = 0; pg < ubuf->pagecount; pg++) + put_page(ubuf->pages[pg]); + user_shm_unlock(ubuf->pagecount << PAGE_SHIFT, ubuf->owner); + free_uid(ubuf->owner); + kfree(ubuf->pages); + kfree(ubuf); +} + +static void *kmap_atomic_udmabuf(struct dma_buf *buf, unsigned long page_num) +{ + struct udmabuf *ubuf = buf->priv; + struct page *page = ubuf->pages[page_num]; + + return kmap_atomic(page); +} + +static void *kmap_udmabuf(struct dma_buf *buf, unsigned long page_num) +{ + struct udmabuf *ubuf = buf->priv; + struct page *page = ubuf->pages[page_num]; + + return kmap(page); +} + +static struct dma_buf_ops udmabuf_ops = { + .map_dma_buf = map_udmabuf, + .unmap_dma_buf = unmap_udmabuf, + .release = release_udmabuf, + .map_atomic = kmap_atomic_udmabuf, + .map = kmap_udmabuf, + .mmap = mmap_udmabuf, +}; + +static long udmabuf_ioctl_create(struct file *filp, unsigned long arg) +{ + struct udmabuf_create create; + struct udmabuf_iovec *iovs; + struct udmabuf *ubuf; + DEFINE_DMA_BUF_EXPORT_INFO(exp_info); + struct dma_buf *buf; + pgoff_t pgoff, pgcnt; + u32 iov, flags; + int ret; + + if (copy_from_user(&create, (void __user *)arg, + sizeof(struct udmabuf_create))) + return -EFAULT; + + iovs = kmalloc_array(create.niov, sizeof(struct udmabuf_iovec), + GFP_KERNEL); + if (!iovs) + return -ENOMEM; + + arg += offsetof(struct udmabuf_create, iovs); + ret = -EFAULT; + if (copy_from_user(iovs, (void __user *)arg, + create.niov * sizeof(struct udmabuf_iovec))) + goto err_free_iovs; + + ubuf = kzalloc(sizeof(struct udmabuf), GFP_KERNEL); + if (!ubuf) + goto err_free_iovs; + + ret = -EINVAL; + for (iov = 0; iov < create.niov; iov++) { + if (!IS_ALIGNED(iovs[iov].base, PAGE_SIZE)) + goto err_free_buf; + if (!IS_ALIGNED(iovs[iov].len, PAGE_SIZE)) + goto err_free_buf; + ubuf->pagecount += iovs[iov].len >> PAGE_SHIFT; + } + + /* this effectively mlocks the pages so account it accordingly */ + ret = -ENOMEM; + ubuf->owner = current_user(); + if (!user_shm_lock(ubuf->pagecount << PAGE_SHIFT, ubuf->owner)) + goto err_free_buf; + + ubuf->pages = kmalloc_array(ubuf->pagecount, sizeof(struct page*), + GFP_KERNEL); + if (!ubuf->pages) + goto err_shm_unlock; + + pgoff = 0; + for (iov = 0; iov < create.niov; iov++) { + pgcnt = iovs[iov].len >> PAGE_SHIFT; + while (pgcnt > 0) { + ret = get_user_pages_fast(iovs[iov].base, pgcnt, + true, /* write */ + ubuf->pages + pgoff); + if (ret < 0) + goto err_put_pages; + pgoff += ret; + pgcnt -= ret; + } + } + + exp_info.ops = &udmabuf_ops; + exp_info.size = ubuf->pagecount << PAGE_SHIFT; + exp_info.priv = ubuf; + + buf = dma_buf_export(&exp_info); + if (IS_ERR(buf)) { + ret = PTR_ERR(buf); + goto err_put_pages; + } + + flags = 0; + if (create.flags & UDMABUF_FLAGS_CLOEXEC) + flags |= O_CLOEXEC; + + kfree(iovs); + return dma_buf_fd(buf, flags); + +err_put_pages: + while (pgoff > 0) + put_page(ubuf->pages[--pgoff]); +err_shm_unlock: + user_shm_unlock(ubuf->pagecount << PAGE_SHIFT, ubuf->owner); +err_free_buf: + free_uid(ubuf->owner); + kfree(ubuf->pages); + kfree(ubuf); +err_free_iovs: + kfree(iovs); + return ret; +} + +static long udmabuf_ioctl(struct file *filp, unsigned int ioctl, + unsigned long arg) +{ + long ret; + + switch (ioctl) { + case UDMABUF_CREATE: + ret = udmabuf_ioctl_create(filp, arg); + break; + default: + ret = -EINVAL; + break; + } + return ret; +} + +static const struct file_operations udmabuf_fops = { + .owner = THIS_MODULE, + .unlocked_ioctl = udmabuf_ioctl, +}; + +static struct miscdevice udmabuf_misc = { + .minor = MISC_DYNAMIC_MINOR, + .name = "udmabuf", + .fops = &udmabuf_fops, +}; + +static int __init udmabuf_dev_init(void) +{ + return misc_register(&udmabuf_misc); +} + +static void __exit udmabuf_dev_exit(void) +{ + misc_deregister(&udmabuf_misc); +} + +module_init(udmabuf_dev_init) +module_exit(udmabuf_dev_exit) + +MODULE_AUTHOR("Gerd Hoffmann <kraxel(a)redhat.com>"); +MODULE_LICENSE("GPL v2"); diff --git a/tools/testing/selftests/drivers/dma-buf/udmabuf.c b/tools/testing/selftests/drivers/dma-buf/udmabuf.c new file mode 100644 index 0000000000..3472c8ee49 --- /dev/null +++ b/tools/testing/selftests/drivers/dma-buf/udmabuf.c @@ -0,0 +1,69 @@ +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <string.h> +#include <errno.h> +#include <fcntl.h> +#include <malloc.h> + +#include <sys/ioctl.h> +#include <linux/udmabuf.h> + +#define TEST_PREFIX "drivers/dma-buf/udmabuf" +#define NUM_PAGES 4 + +int main(int argc, char *argv[]) +{ + struct udmabuf_create *create; + void *mem; + int dev, fd; + + dev = open("/dev/udmabuf", O_RDWR); + if (dev < 0) { + printf("%s: [skip]\n", TEST_PREFIX); + exit(77); + } + + mem = memalign(getpagesize(), getpagesize() * NUM_PAGES); + if (mem == NULL) { + printf("%s: [FAIL]\n", TEST_PREFIX); + exit (1); + } + + create = malloc(sizeof(struct udmabuf_create) + + sizeof(struct udmabuf_iovec)); + create->flags = 0; + create->niov = 1; + + /* should fail (base not page aligned) */ + create->iovs[0].base = (intptr_t)mem + getpagesize()/2; + create->iovs[0].len = getpagesize(); + fd = ioctl(dev, UDMABUF_CREATE, create); + if (fd >= 0) { + printf("%s: [FAIL]\n", TEST_PREFIX); + exit(1); + } + + /* should fail (size not multiple of page) */ + create->iovs[0].base = (intptr_t)mem; + create->iovs[0].len = getpagesize()/2; + fd = ioctl(dev, UDMABUF_CREATE, create); + if (fd >= 0) { + printf("%s: [FAIL]\n", TEST_PREFIX); + exit(1); + } + + /* should work */ + create->iovs[0].base = (intptr_t)mem; + create->iovs[0].len = getpagesize() * NUM_PAGES; + fd = ioctl(dev, UDMABUF_CREATE, create); + if (fd < 0) { + printf("%s: [FAIL]\n", TEST_PREFIX); + exit(1); + } + close(fd); + + fprintf(stderr, "%s: ok\n", TEST_PREFIX); + close(dev); + return 0; +} diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig index ed3b785bae..19be3ec62d 100644 --- a/drivers/dma-buf/Kconfig +++ b/drivers/dma-buf/Kconfig @@ -30,4 +30,11 @@ config SW_SYNC WARNING: improper use of this can result in deadlocking kernel drivers from userspace. Intended for test and debug only. +config UDMABUF + bool "userspace dmabuf misc driver" + default n + depends on DMA_SHARED_BUFFER + ---help--- + A driver to let userspace turn iovs into dma-bufs. + endmenu diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index c33bf88631..0913a6ccab 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,3 +1,4 @@ obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o +obj-$(CONFIG_UDMABUF) += udmabuf.o diff --git a/tools/testing/selftests/drivers/dma-buf/Makefile b/tools/testing/selftests/drivers/dma-buf/Makefile new file mode 100644 index 0000000000..4154c3d7aa --- /dev/null +++ b/tools/testing/selftests/drivers/dma-buf/Makefile @@ -0,0 +1,5 @@ +CFLAGS += -I../../../../../usr/include/ + +TEST_GEN_PROGS := udmabuf + +include ../../lib.mk -- 2.9.3

7 years, 2 months

5
6
0 0

[PATCH] MAINTAINERS: add dri-devel&linaro-mm for Android ION

by Daniel Vetter

Most of the other cross-driver gfx infrastructure (dma_buf, dma_fence) also gets cross posted to all the relevant gfx/memory lists. Doing the same for ION means people won't miss relevant patches. Cc: Laura Abbott <labbott(a)redhat.com> Cc: Sumit Semwal <sumit.semwal(a)linaro.org> Cc: devel(a)driverdev.osuosl.org Cc: dri-devel(a)lists.freedesktop.org Cc: linaro-mm-sig(a)lists.linaro.org (moderated for non-subscribers) Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com> --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 555db72d4eb7..d43cdfca3eb5 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -902,6 +902,8 @@ ANDROID ION DRIVER M: Laura Abbott <labbott(a)redhat.com> M: Sumit Semwal <sumit.semwal(a)linaro.org> L: devel(a)driverdev.osuosl.org +L: dri-devel(a)lists.freedesktop.org +L: linaro-mm-sig(a)lists.linaro.org (moderated for non-subscribers) S: Supported F: drivers/staging/android/ion F: drivers/staging/android/uapi/ion.h -- 2.16.2

7 years, 2 months

3
4
0 0

Re: [Linaro-mm-sig] [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()

by Christoph Hellwig

On Thu, Mar 29, 2018 at 09:58:54PM -0400, Jerome Glisse wrote: > dma_map_resource() is the right API (thought its current implementation > is fill with x86 assumptions). So i would argue that arch can decide to > implement it or simply return dma error address which trigger fallback > path into the caller (at least for GPU drivers). SG variant can be added > on top. It isn't in general. It doesn't integrate with scatterlists (see my comment to page one), and it doesn't integrate with all the subsystems that also need a kernel virtual address.

7 years, 2 months

1
0
0 0

Re: [Linaro-mm-sig] [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()

by Christian König

Am 29.03.2018 um 18:25 schrieb Logan Gunthorpe: > > On 29/03/18 10:10 AM, Christian König wrote: >> Why not? I mean the dma_map_resource() function is for P2P while other >> dma_map_* functions are only for system memory. > Oh, hmm, I wasn't aware dma_map_resource was exclusively for mapping > P2P. Though it's a bit odd seeing we've been working under the > assumption that PCI P2P is different as it has to translate the PCI bus > address. Where as P2P for devices on other buses is a big unknown. Yeah, completely agree. On my TODO list (but rather far down) is actually supporting P2P with USB devices. And no, I don't have the slightest idea how to do this at the moment. >>> And this is necessary to >>> check if the DMA ops in use support it or not. We can't have the >>> dma_map_X() functions do the wrong thing because they don't support it yet. >> Well that sounds like we should just return an error from >> dma_map_resources() when an architecture doesn't support P2P yet as Alex >> suggested. > Yes, well except in our patch-set we can't easily use > dma_map_resources() as we either have SGLs to deal with or we need to > create whole new interfaces to a number of subsystems. Agree as well. I was also in clear favor of extending the SGLs to have a flag for this instead of the dma_map_resource() interface, but for some reason that didn't made it into the kernel. >> You don't seem to understand the implications: The devices do have a >> common upstream bridge! In other words your code would currently claim >> that P2P is supported, but in practice it doesn't work. > Do they? They don't on any of the Intel machines I'm looking at. The > previous version of the patchset not only required a common upstream > bridge but two layers of upstream bridges on both devices which would > effectively limit transfers to PCIe switches only. But Bjorn did not > like this. At least to me that sounds like a good idea, it would at least disable (the incorrect) auto detection of P2P for such devices. >> You need to include both drivers which participate in the P2P >> transaction to make sure that both supports this and give them >> opportunity to chicken out and in the case of AMD APUs even redirect the >> request to another location (e.g. participate in the DMA translation). > I don't think it's the drivers responsibility to reject P2P . The > topology is what governs support or not. The discussions we had with > Bjorn settled on if the devices are all behind the same bridge they can > communicate with each other. This is essentially guaranteed by the PCI spec. Well it is not only rejecting P2P, see the devices I need to worry about are essentially part of the CPU. Their resources looks like a PCI BAR to the BIOS and OS, but are actually backed by stolen system memory. So as crazy as it sounds what you get is an operation which starts as P2P, but then the GPU drivers sees it and says: Hey please don't write that to my PCIe BAR, but rather system memory location X. >> DMA-buf fortunately seems to handle all this already, that's why we >> choose it as base for our implementation. > Well, unfortunately DMA-buf doesn't help for the drivers we are working > with as neither the block layer nor the RDMA subsystem have any > interfaces for it. A fact that gives me quite some sleepless nights as well. I think we sooner or later need to extend those interfaces to work with DMA-bufs as well. I will try to give your patch set a review when I'm back from vacation and rebase my DMA-buf work on top of that. Regards, Christian. > > Logan

7 years, 2 months

1
0
0 0

Re: [Linaro-mm-sig] [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()

by Christian König

Am 29.03.2018 um 17:45 schrieb Logan Gunthorpe: > > On 29/03/18 05:44 AM, Christian König wrote: >> Am 28.03.2018 um 21:53 schrieb Logan Gunthorpe: >>> On 28/03/18 01:44 PM, Christian König wrote: >>>> Well, isn't that exactly what dma_map_resource() is good for? As far as >>>> I can see it makes sure IOMMU is aware of the access route and >>>> translates a CPU address into a PCI Bus address. >>>> I'm using that with the AMD IOMMU driver and at least there it works >>>> perfectly fine. >>> Yes, it would be nice, but no arch has implemented this yet. We are just >>> lucky in the x86 case because that arch is simple and doesn't need to do >>> anything for P2P (partially due to the Bus and CPU addresses being the >>> same). But in the general case, you can't rely on it. >> Well, that an arch hasn't implemented it doesn't mean that we don't have >> the right interface to do it. > Yes, but right now we don't have a performant way to check if we are > doing P2P or not in the dma_map_X() wrappers. Why not? I mean the dma_map_resource() function is for P2P while other dma_map_* functions are only for system memory. > And this is necessary to > check if the DMA ops in use support it or not. We can't have the > dma_map_X() functions do the wrong thing because they don't support it yet. Well that sounds like we should just return an error from dma_map_resources() when an architecture doesn't support P2P yet as Alex suggested. >> Devices integrated in the CPU usually only "claim" to be PCIe devices. >> In reality their memory request path go directly through the integrated >> north bridge. The reason for this is simple better throughput/latency. > These are just more reasons why our patchset restricts to devices behind > a switch. And more mess for someone to deal with if they need to relax > that restriction. You don't seem to understand the implications: The devices do have a common upstream bridge! In other words your code would currently claim that P2P is supported, but in practice it doesn't work. You need to include both drivers which participate in the P2P transaction to make sure that both supports this and give them opportunity to chicken out and in the case of AMD APUs even redirect the request to another location (e.g. participate in the DMA translation). DMA-buf fortunately seems to handle all this already, that's why we choose it as base for our implementation. Regards, Christian.

7 years, 2 months

1
0
0 0

Re: [Linaro-mm-sig] [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()

by Alex Deucher

Sorry, didn't mean to drop the lists here. re-adding. On Wed, Mar 28, 2018 at 4:05 PM, Alex Deucher <alexdeucher(a)gmail.com> wrote: > On Wed, Mar 28, 2018 at 3:53 PM, Logan Gunthorpe <logang(a)deltatee.com> wrote: >> >> >> On 28/03/18 01:44 PM, Christian König wrote: >>> Well, isn't that exactly what dma_map_resource() is good for? As far as >>> I can see it makes sure IOMMU is aware of the access route and >>> translates a CPU address into a PCI Bus address. >> >>> I'm using that with the AMD IOMMU driver and at least there it works >>> perfectly fine. >> >> Yes, it would be nice, but no arch has implemented this yet. We are just >> lucky in the x86 case because that arch is simple and doesn't need to do >> anything for P2P (partially due to the Bus and CPU addresses being the >> same). But in the general case, you can't rely on it. > > Could we do something for the arches where it works? I feel like peer > to peer has dragged out for years because everyone is trying to boil > the ocean for all arches. There are a huge number of use cases for > peer to peer on these "simple" architectures which actually represent > a good deal of the users that want this. > > Alex > >> >>>>> Yeah, but not for ours. See if you want to do real peer 2 peer you need >>>>> to keep both the operation as well as the direction into account. >>>> Not sure what you are saying here... I'm pretty sure we are doing "real" >>>> peer 2 peer... >>>> >>>>> For example when you can do writes between A and B that doesn't mean >>>>> that writes between B and A work. And reads are generally less likely to >>>>> work than writes. etc... >>>> If both devices are behind a switch then the PCI spec guarantees that A >>>> can both read and write B and vice versa. >>> >>> Sorry to say that, but I know a whole bunch of PCI devices which >>> horrible ignores that. >> >> Can you elaborate? As far as the device is concerned it shouldn't know >> whether a request comes from a peer or from the host. If it does do >> crazy stuff like that it's well out of spec. It's up to the switch (or >> root complex if good support exists) to route the request to the device >> and it's the root complex that tends to be what drops the load requests >> which causes the asymmetries. >> >> Logan >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx(a)lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

7 years, 2 months

1
0
0 0

Re: [Linaro-mm-sig] [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()

by Christian König

Am 28.03.2018 um 21:53 schrieb Logan Gunthorpe: > > On 28/03/18 01:44 PM, Christian König wrote: >> Well, isn't that exactly what dma_map_resource() is good for? As far as >> I can see it makes sure IOMMU is aware of the access route and >> translates a CPU address into a PCI Bus address. >> I'm using that with the AMD IOMMU driver and at least there it works >> perfectly fine. > Yes, it would be nice, but no arch has implemented this yet. We are just > lucky in the x86 case because that arch is simple and doesn't need to do > anything for P2P (partially due to the Bus and CPU addresses being the > same). But in the general case, you can't rely on it. Well, that an arch hasn't implemented it doesn't mean that we don't have the right interface to do it. >>>> Yeah, but not for ours. See if you want to do real peer 2 peer you need >>>> to keep both the operation as well as the direction into account. >>> Not sure what you are saying here... I'm pretty sure we are doing "real" >>> peer 2 peer... >>> >>>> For example when you can do writes between A and B that doesn't mean >>>> that writes between B and A work. And reads are generally less likely to >>>> work than writes. etc... >>> If both devices are behind a switch then the PCI spec guarantees that A >>> can both read and write B and vice versa. >> Sorry to say that, but I know a whole bunch of PCI devices which >> horrible ignores that. > Can you elaborate? As far as the device is concerned it shouldn't know > whether a request comes from a peer or from the host. If it does do > crazy stuff like that it's well out of spec. It's up to the switch (or > root complex if good support exists) to route the request to the device > and it's the root complex that tends to be what drops the load requests > which causes the asymmetries. Devices integrated in the CPU usually only "claim" to be PCIe devices. In reality their memory request path go directly through the integrated north bridge. The reason for this is simple better throughput/latency. That is hidden from the software, for example the BIOS just allocates address space for the BARs as if it's a normal PCIe device. The only crux is when you then do peer2peer your request simply go into nirvana and are not handled by anything because the BARs are only visible from the CPU side of the northbridge. Regards, Christian. > > Logan > _______________________________________________ > amd-gfx mailing list > amd-gfx(a)lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

7 years, 2 months

1
0
0 0

Re: [Linaro-mm-sig] [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()

by Christian König

Am 28.03.2018 um 20:57 schrieb Logan Gunthorpe: > > On 28/03/18 12:28 PM, Christian König wrote: >> I'm just using amdgpu as blueprint because I'm the co-maintainer of it >> and know it mostly inside out. > Ah, I see. > >> The resource addresses are translated using dma_map_resource(). As far >> as I know that should be sufficient to offload all the architecture >> specific stuff to the DMA subsystem. > It's not. The dma_map infrastructure currently has no concept of > peer-to-peer mappings and is designed for system memory only. No > architecture I'm aware of will translate PCI CPU addresses into PCI Bus > addresses which is necessary for any transfer that doesn't go through > the root complex (though on arches like x86 the CPU and Bus address > happen to be the same). There's a lot of people that would like to see > this change but it's likely going to be a long road before it does. Well, isn't that exactly what dma_map_resource() is good for? As far as I can see it makes sure IOMMU is aware of the access route and translates a CPU address into a PCI Bus address. > Furthermore, one of the reasons our patch-set avoids going through the > root complex at all is that IOMMU drivers will need to be made aware > that it is operating on P2P memory and do arch-specific things > accordingly. There will also need to be flags that indicate whether a > given IOMMU driver supports this. None of this work is done or easy. I'm using that with the AMD IOMMU driver and at least there it works perfectly fine. >> Yeah, but not for ours. See if you want to do real peer 2 peer you need >> to keep both the operation as well as the direction into account. > Not sure what you are saying here... I'm pretty sure we are doing "real" > peer 2 peer... > >> For example when you can do writes between A and B that doesn't mean >> that writes between B and A work. And reads are generally less likely to >> work than writes. etc... > If both devices are behind a switch then the PCI spec guarantees that A > can both read and write B and vice versa. Sorry to say that, but I know a whole bunch of PCI devices which horrible ignores that. For example all AMD APUs fall under that category... > Only once you involve root > complexes do you have this problem. Ie. you have unknown support which > may be no support, or partial support (stores but not loads); or > sometimes bad performance; or a combination of both... and you need some > way to figure out all this mess and that is hard. Whoever tries to > implement a white list will have to sort all this out. Yes, exactly and unfortunately it looks like I'm the poor guy who needs to do this :) Regards, Christian. > > Logan > _______________________________________________ > amd-gfx mailing list > amd-gfx(a)lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

7 years, 2 months

1
0
0 0

Re: [Linaro-mm-sig] [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()

by Christian König

Am 28.03.2018 um 18:25 schrieb Logan Gunthorpe: > > On 28/03/18 10:02 AM, Christian König wrote: >> Yeah, that looks very similar to what I picked up from the older >> patches, going to read up on that after my vacation. > Yeah, I was just reading through your patchset and there are a lot of > similarities. Though, I'm not sure what you're trying to accomplish as I > could not find a cover letter and it seems to only enable one driver. Yeah, it was the last day before my easter vacation and I wanted it out of the door. > Is it meant to enable DMA transactions only between two AMD GPUs? Not really, DMA-buf is a general framework for sharing buffers between device drivers. It is widely used in the GFX stack on laptops with both Intel+AMD, Intel+NVIDIA or AMD+AMD graphics devices. Additional to that ARM uses it quite massively for their GFX stacks because they have rendering and displaying device separated. I'm just using amdgpu as blueprint because I'm the co-maintainer of it and know it mostly inside out. > I also don't see where you've taken into account the PCI bus address. On > some architectures this is not the same as the CPU physical address. The resource addresses are translated using dma_map_resource(). As far as I know that should be sufficient to offload all the architecture specific stuff to the DMA subsystem. > >> Just in general why are you interested in the "distance" of the devices? > We've taken a general approach where some drivers may provide p2p memory > (ie. an NVMe card or an RDMA NIC) and other drivers make use of it (ie. > the NVMe-of driver). The orchestrator driver needs to find the most > applicable provider device for a transaction in a situation that may > have multiple providers and multiple clients. So the most applicable > provider is the one that's closest ("distance"-wise) to all the clients > for the P2P transaction. That seems to make sense. > >> And BTW: At least for writes that Peer 2 Peer transactions between >> different root complexes work is actually more common than the other way >> around. > Maybe on x86 with hardware made in the last few years. But on PowerPC, > ARM64, and likely a lot more the chance of support is *much* less. Also, > hardware that only supports P2P stores is hardly full support and is > insufficient for our needs. Yeah, but not for ours. See if you want to do real peer 2 peer you need to keep both the operation as well as the direction into account. For example when you can do writes between A and B that doesn't mean that writes between B and A work. And reads are generally less likely to work than writes. etc... Since the use case I'm targeting for is GFX or GFX+V4L (or GFX+NIC in the future) I really need to handle all such use cases as well. > >> So I'm a bit torn between using a blacklist or a whitelist. A whitelist >> is certainly more conservative approach, but that could get a bit long. > I think a whitelist approach is correct. Given old hardware and other > architectures, a black list is going to be too long and too difficult to > comprehensively populate. Yeah, it would certainly be better if we have something in the root complex capabilities. But you're right that a whitelist sounds the less painful way. Regards, Christian. > > Logan > _______________________________________________ > amd-gfx mailing list > amd-gfx(a)lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

7 years, 2 months

1
0
0 0

Re: [Linaro-mm-sig] [PATCH 2/8] PCI: Add pci_find_common_upstream_dev()

by Christian König

Am 28.03.2018 um 17:47 schrieb Logan Gunthorpe: > > On 28/03/18 09:07 AM, Christian König wrote: >> Am 28.03.2018 um 14:38 schrieb Christoph Hellwig: >>> On Sun, Mar 25, 2018 at 12:59:54PM +0200, Christian König wrote: >>>> From: "wdavis(a)nvidia.com" <wdavis(a)nvidia.com> >>>> >>>> Add an interface to find the first device which is upstream of both >>>> devices. >>> Please work with Logan and base this on top of the outstanding peer >>> to peer patchset. >> Can you point me to that? The last code I could find about that was from >> 2015. > The latest posted series is here: > > https://lkml.org/lkml/2018/3/12/830 > > However, we've made some significant changes to the area that's similar > to what you are doing. You can find lasted un-posted here: > > https://github.com/sbates130272/linux-p2pmem/tree/pci-p2p-v4-pre2 > > Specifically this function would be of interest to you: > > https://github.com/sbates130272/linux-p2pmem/blob/0e9468ae2a5a5198513dd1299… > > However, the difference between what we are doing is that we are > interested in the distance through the common upstream device and you > appear to be finding the actual common device. Yeah, that looks very similar to what I picked up from the older patches, going to read up on that after my vacation. Just in general why are you interested in the "distance" of the devices? And BTW: At least for writes that Peer 2 Peer transactions between different root complexes work is actually more common than the other way around. So I'm a bit torn between using a blacklist or a whitelist. A whitelist is certainly more conservative approach, but that could get a bit long. Thanks, Christian. > > Thanks, > > Logan

7 years, 2 months

1
0
0 0

[PATCH] dma-buf: use parameter structure for dma_buf_attach

by Christian König

Move the parameters into a structure to make it simpler to extend it in follow up patches. This also adds the importer private as parameter so that we can directly work with a completely filled in attachment structure. Signed-off-by: Christian König <christian.koenig(a)amd.com> --- drivers/dma-buf/dma-buf.c | 16 +++++++++------- drivers/gpu/drm/armada/armada_gem.c | 6 +++++- drivers/gpu/drm/drm_prime.c | 6 +++++- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 6 +++++- drivers/gpu/drm/tegra/gem.c | 6 +++++- drivers/gpu/drm/udl/udl_dmabuf.c | 6 +++++- drivers/media/common/videobuf2/videobuf2-dma-contig.c | 6 +++++- drivers/media/common/videobuf2/videobuf2-dma-sg.c | 6 +++++- drivers/staging/media/tegra-vde/tegra-vde.c | 6 +++++- include/linux/dma-buf.h | 19 +++++++++++++++++-- 10 files changed, 66 insertions(+), 17 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index d78d5fc173dc..d2e8ca0d9427 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -534,8 +534,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put); /** * dma_buf_attach - Add the device to dma_buf's attachments list; optionally, * calls attach() of dma_buf_ops to allow device-specific attach functionality - * @dmabuf: [in] buffer to attach device to. - * @dev: [in] device to be attached. + * @info: [in] holds all the attach related information provided + * by the importer. see &struct dma_buf_attach_info + * for further details. * * Returns struct dma_buf_attachment pointer for this attachment. Attachments * must be cleaned up by calling dma_buf_detach(). @@ -549,26 +550,27 @@ EXPORT_SYMBOL_GPL(dma_buf_put); * accessible to @dev, and cannot be moved to a more suitable place. This is * indicated with the error code -EBUSY. */ -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, - struct device *dev) +struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info) { + struct dma_buf *dmabuf = info->dmabuf; struct dma_buf_attachment *attach; int ret; - if (WARN_ON(!dmabuf || !dev)) + if (WARN_ON(!dmabuf || !info->dev)) return ERR_PTR(-EINVAL); attach = kzalloc(sizeof(*attach), GFP_KERNEL); if (!attach) return ERR_PTR(-ENOMEM); - attach->dev = dev; + attach->dev = info->dev; attach->dmabuf = dmabuf; + attach->priv = info->priv; mutex_lock(&dmabuf->lock); if (dmabuf->ops->attach) { - ret = dmabuf->ops->attach(dmabuf, dev, attach); + ret = dmabuf->ops->attach(dmabuf, info->dev, attach); if (ret) goto err_attach; } diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c index a97f509743a5..f4d1c11f57ea 100644 --- a/drivers/gpu/drm/armada/armada_gem.c +++ b/drivers/gpu/drm/armada/armada_gem.c @@ -514,6 +514,10 @@ armada_gem_prime_export(struct drm_device *dev, struct drm_gem_object *obj, struct drm_gem_object * armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) { + struct dma_buf_attach_info attach_info = { + .dev = dev->dev, + .dmabuf = buf + }; struct dma_buf_attachment *attach; struct armada_gem_object *dobj; @@ -529,7 +533,7 @@ armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) } } - attach = dma_buf_attach(buf, dev->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach); diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 7856a9b3f8a8..4da242de51c2 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -707,6 +707,10 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, struct dma_buf *dma_buf, struct device *attach_dev) { + struct dma_buf_attach_info attach_info = { + .dev = attach_dev, + .dmabuf = dma_buf + }; struct dma_buf_attachment *attach; struct sg_table *sgt; struct drm_gem_object *obj; @@ -727,7 +731,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, if (!dev->driver->gem_prime_import_sg_table) return ERR_PTR(-EINVAL); - attach = dma_buf_attach(dma_buf, attach_dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach); diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index 864439a214c8..94552ef3e5a7 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -288,6 +288,10 @@ static const struct drm_i915_gem_object_ops i915_gem_object_dmabuf_ops = { struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf) { + struct dma_buf_attach_info attach_info = { + .dev = dev->dev, + .dmabuf = dma_buf + }; struct dma_buf_attachment *attach; struct drm_i915_gem_object *obj; int ret; @@ -306,7 +310,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, } /* need to attach */ - attach = dma_buf_attach(dma_buf, dev->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach); diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 49b9bf28f872..462a4bac3f82 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -332,6 +332,10 @@ struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file, static struct tegra_bo *tegra_bo_import(struct drm_device *drm, struct dma_buf *buf) { + struct dma_buf_attach_info attach_info = { + .dev = drm->dev, + .dmabuf = buf + }; struct tegra_drm *tegra = drm->dev_private; struct dma_buf_attachment *attach; struct tegra_bo *bo; @@ -341,7 +345,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm, if (IS_ERR(bo)) return bo; - attach = dma_buf_attach(buf, drm->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) { err = PTR_ERR(attach); goto free; diff --git a/drivers/gpu/drm/udl/udl_dmabuf.c b/drivers/gpu/drm/udl/udl_dmabuf.c index 2867ed155ff6..c4db84abe231 100644 --- a/drivers/gpu/drm/udl/udl_dmabuf.c +++ b/drivers/gpu/drm/udl/udl_dmabuf.c @@ -243,6 +243,10 @@ static int udl_prime_create(struct drm_device *dev, struct drm_gem_object *udl_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf) { + struct dma_buf_attach_info attach_info = { + .dev = dev->dev, + .dmabuf = dma_buf + }; struct dma_buf_attachment *attach; struct sg_table *sg; struct udl_gem_object *uobj; @@ -250,7 +254,7 @@ struct drm_gem_object *udl_gem_prime_import(struct drm_device *dev, /* need to attach */ get_device(dev->dev); - attach = dma_buf_attach(dma_buf, dev->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) { put_device(dev->dev); return ERR_CAST(attach); diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c index f1178f6f434d..93bd1f40f756 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c @@ -677,6 +677,10 @@ static void vb2_dc_detach_dmabuf(void *mem_priv) static void *vb2_dc_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, unsigned long size, enum dma_data_direction dma_dir) { + struct dma_buf_attach_info attach_info = { + .dev = dev, + .dmabuf = dbuf + }; struct vb2_dc_buf *buf; struct dma_buf_attachment *dba; @@ -692,7 +696,7 @@ static void *vb2_dc_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, buf->dev = dev; /* create attachment for the dmabuf with the user device */ - dba = dma_buf_attach(dbuf, buf->dev); + dba = dma_buf_attach(&attach_info); if (IS_ERR(dba)) { pr_err("failed to attach dmabuf\n"); kfree(buf); diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c index 753ed3138dcc..4e61050ba87f 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c @@ -609,6 +609,10 @@ static void vb2_dma_sg_detach_dmabuf(void *mem_priv) static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, unsigned long size, enum dma_data_direction dma_dir) { + struct dma_buf_attach_info attach_info = { + .dev = dev, + .dmabuf = dbuf + }; struct vb2_dma_sg_buf *buf; struct dma_buf_attachment *dba; @@ -624,7 +628,7 @@ static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, buf->dev = dev; /* create attachment for the dmabuf with the user device */ - dba = dma_buf_attach(dbuf, buf->dev); + dba = dma_buf_attach(&attach_info); if (IS_ERR(dba)) { pr_err("failed to attach dmabuf\n"); kfree(buf); diff --git a/drivers/staging/media/tegra-vde/tegra-vde.c b/drivers/staging/media/tegra-vde/tegra-vde.c index c47659e96089..25d112443b0d 100644 --- a/drivers/staging/media/tegra-vde/tegra-vde.c +++ b/drivers/staging/media/tegra-vde/tegra-vde.c @@ -529,6 +529,10 @@ static int tegra_vde_attach_dmabuf(struct device *dev, size_t *size, enum dma_data_direction dma_dir) { + struct dma_buf_attach_info attach_info = { + .dev = dev, + .dmabuf = dmabuf + }; struct dma_buf_attachment *attachment; struct dma_buf *dmabuf; struct sg_table *sgt; @@ -547,7 +551,7 @@ static int tegra_vde_attach_dmabuf(struct device *dev, return -EINVAL; } - attachment = dma_buf_attach(dmabuf, dev); + attachment = dma_buf_attach(&attach_info); if (IS_ERR(attachment)) { dev_err(dev, "Failed to attach dmabuf\n"); err = PTR_ERR(attachment); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 085db2fee2d7..2c27568d44af 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -362,6 +362,21 @@ struct dma_buf_export_info { struct dma_buf_export_info name = { .exp_name = KBUILD_MODNAME, \ .owner = THIS_MODULE } +/** + * struct dma_buf_attach_info - holds information needed to attach to a dma_buf + * @dmabuf: the exported dma_buf + * @dev: the device which wants to import the attachment + * @priv: private data of importer to this attachment + * + * This structure holds the information required to attach to a buffer. Used + * with dma_buf_attach() only. + */ +struct dma_buf_attach_info { + struct dma_buf *dmabuf; + struct device *dev; + void *priv; +}; + /** * get_dma_buf - convenience wrapper for get_file. * @dmabuf: [in] pointer to dma_buf @@ -376,8 +391,8 @@ static inline void get_dma_buf(struct dma_buf *dmabuf) get_file(dmabuf->file); } -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, - struct device *dev); +struct dma_buf_attachment * +dma_buf_attach(const struct dma_buf_attach_info *info); void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *dmabuf_attach); -- 2.14.1

7 years, 2 months

3
6
0 0

RFC: unpinned DMA-buf exporting v2

by Christian König

Hi everybody, since I've got positive feedback from Daniel I continued working on this approach. A few issues are still open: 1. Daniel suggested that I make the invalidate_mappings callback a parameter of dma_buf_attach(). This approach unfortunately won't work because when the attachment is created the importer is not necessarily ready to handle invalidation events. E.g. in the amdgpu example we first need to setup the imported GEM/TMM objects and install that in the attachment. My solution is to introduce a separate function to grab the locks and set the callback, this function could then be used to pin the buffer later on if that turns out to be necessary after all. 2. With my example setup this currently results in a ping/pong situation because the exporter prefers a VRAM placement while the importer prefers a GTT placement. This results in quite a performance drop, but can be fixed by a simple mesa patch which allows shred BOs to be placed in both VRAM and GTT. Question is what should we do in the meantime? Accept the performance drop or only allow unpinned sharing with new Mesa? Please review and comment, Christian.

7 years, 2 months

6
30
0 0

[PATCH 1/5] dma-buf: add optional invalidate_mappings callback v3

by Christian König

Each importer can now provide an invalidate_mappings callback. This allows the exporter to provide the mappings without the need to pin the backing store. v2: don't try to invalidate mappings when the callback is NULL, lock the reservation obj while using the attachments, add helper to set the callback v3: move flag for invalidation support into the DMA-buf, use new attach_info structure to set the callback Signed-off-by: Christian König <christian.koenig(a)amd.com> --- drivers/dma-buf/dma-buf.c | 43 +++++++++++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 28 ++++++++++++++++++++++++++++ 2 files changed, 71 insertions(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index d2e8ca0d9427..ffaa2f9a9c2c 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -566,6 +566,7 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info attach->dev = info->dev; attach->dmabuf = dmabuf; attach->priv = info->priv; + attach->invalidate = info->invalidate; mutex_lock(&dmabuf->lock); @@ -574,7 +575,9 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info if (ret) goto err_attach; } + reservation_object_lock(dmabuf->resv, NULL); list_add(&attach->node, &dmabuf->attachments); + reservation_object_unlock(dmabuf->resv); mutex_unlock(&dmabuf->lock); return attach; @@ -600,7 +603,9 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) return; mutex_lock(&dmabuf->lock); + reservation_object_lock(dmabuf->resv, NULL); list_del(&attach->node); + reservation_object_unlock(dmabuf->resv); if (dmabuf->ops->detach) dmabuf->ops->detach(dmabuf, attach); @@ -634,10 +639,23 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL); + /* + * Mapping a DMA-buf can trigger its invalidation, prevent sending this + * event to the caller by temporary removing this attachment from the + * list. + */ + if (attach->invalidate) { + reservation_object_assert_held(attach->dmabuf->resv); + list_del(&attach->node); + } + sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction); if (!sg_table) sg_table = ERR_PTR(-ENOMEM); + if (attach->invalidate) + list_add(&attach->node, &attach->dmabuf->attachments); + return sg_table; } EXPORT_SYMBOL_GPL(dma_buf_map_attachment); @@ -658,6 +676,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, { might_sleep(); + if (attach->invalidate) + reservation_object_assert_held(attach->dmabuf->resv); + if (WARN_ON(!attach || !attach->dmabuf || !sg_table)) return; @@ -666,6 +687,26 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, } EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment); +/** + * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf + * + * @dmabuf: [in] buffer which mappings should be invalidated + * + * Informs all attachmenst that they need to destroy and recreated all their + * mappings. + */ +void dma_buf_invalidate_mappings(struct dma_buf *dmabuf) +{ + struct dma_buf_attachment *attach; + + reservation_object_assert_held(dmabuf->resv); + + list_for_each_entry(attach, &dmabuf->attachments, node) + if (attach->invalidate) + attach->invalidate(attach); +} +EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings); + /** * DOC: cpu access * @@ -1123,10 +1164,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused) seq_puts(s, "\tAttached Devices:\n"); attach_count = 0; + reservation_object_lock(buf_obj->resv, NULL); list_for_each_entry(attach_obj, &buf_obj->attachments, node) { seq_printf(s, "\t%s\n", dev_name(attach_obj->dev)); attach_count++; } + reservation_object_unlock(buf_obj->resv); seq_printf(s, "Total %d devices attached\n\n", attach_count); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 2c27568d44af..15dd8598bff1 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -270,6 +270,8 @@ struct dma_buf_ops { * @poll: for userspace poll support * @cb_excl: for userspace poll support * @cb_shared: for userspace poll support + * @invalidation_supported: True when the exporter supports unpinned operation + * using the reservation lock. * * This represents a shared buffer, created by calling dma_buf_export(). The * userspace representation is a normal file descriptor, which can be created by @@ -293,6 +295,7 @@ struct dma_buf { struct list_head list_node; void *priv; struct reservation_object *resv; + bool invalidation_supported; /* poll support */ wait_queue_head_t poll; @@ -326,6 +329,28 @@ struct dma_buf_attachment { struct device *dev; struct list_head node; void *priv; + + /** + * @invalidate: + * + * Optional callback provided by the importer of the dma-buf. + * + * If provided the exporter can avoid pinning the backing store while + * mappings exists. + * + * The function is called with the lock of the reservation object + * associated with the dma_buf held and the mapping function must be + * called with this lock held as well. This makes sure that no mapping + * is created concurrently with an ongoing invalidation. + * + * After the callback all existing mappings are still valid until all + * fences in the dma_bufs reservation object are signaled, but should be + * destroyed by the importer as soon as possible. + * + * New mappings can be created immediately, but can't be used before the + * exclusive fence in the dma_bufs reservation object is signaled. + */ + void (*invalidate)(struct dma_buf_attachment *attach); }; /** @@ -367,6 +392,7 @@ struct dma_buf_export_info { * @dmabuf: the exported dma_buf * @dev: the device which wants to import the attachment * @priv: private data of importer to this attachment + * @invalidate: callback to use for invalidating mappings * * This structure holds the information required to attach to a buffer. Used * with dma_buf_attach() only. @@ -375,6 +401,7 @@ struct dma_buf_attach_info { struct dma_buf *dmabuf; struct device *dev; void *priv; + void (*invalidate)(struct dma_buf_attachment *attach); }; /** @@ -406,6 +433,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *, enum dma_data_direction); void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *, enum dma_data_direction); +void dma_buf_invalidate_mappings(struct dma_buf *dma_buf); int dma_buf_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction dir); int dma_buf_end_cpu_access(struct dma_buf *dma_buf, -- 2.14.1

7 years, 2 months

1
4
0 0

RFC: unpinned DMA-buf exporting

by Christian König

This set of patches adds an option invalidate_mappings callback to each DMA-buf attachment which can be filled in by the importer. This callback allows the exporter to provided the DMA-buf content without pinning it. The reservation objects lock acts as synchronization point for buffer moves and creating mappings. This set includes an implementation for amdgpu which should be rather easily portable to other DRM drivers. Please comment, Christian.

7 years, 2 months

2
16
0 0

[PATCH 1/4] dma-buf: add optional invalidate_mappings callback

by Christian König

Each importer can now provide an invalidate_mappings callback. This allows the exporter to provide the mappings without the need to pin the backing store. Signed-off-by: Christian König <christian.koenig(a)amd.com> --- drivers/dma-buf/dma-buf.c | 25 +++++++++++++++++++++++++ include/linux/dma-buf.h | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index d78d5fc173dc..ed8d5844ae74 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -629,6 +629,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, might_sleep(); + if (attach->invalidate_mappings) + reservation_object_assert_held(attach->dmabuf->resv); + if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL); @@ -656,6 +659,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, { might_sleep(); + if (attach->invalidate_mappings) + reservation_object_assert_held(attach->dmabuf->resv); + if (WARN_ON(!attach || !attach->dmabuf || !sg_table)) return; @@ -664,6 +670,25 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, } EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment); +/** + * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf + * + * @dmabuf: [in] buffer which mappings should be invalidated + * + * Informs all attachmenst that they need to destroy and recreated all their + * mappings. + */ +void dma_buf_invalidate_mappings(struct dma_buf *dmabuf) +{ + struct dma_buf_attachment *attach; + + reservation_object_assert_held(dmabuf->resv); + + list_for_each_entry(attach, &dmabuf->attachments, node) + attach->invalidate_mappings(attach); +} +EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings); + /** * DOC: cpu access * diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 085db2fee2d7..c1e2f7d93509 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -91,6 +91,18 @@ struct dma_buf_ops { */ void (*detach)(struct dma_buf *, struct dma_buf_attachment *); + /** + * @supports_mapping_invalidation: + * + * True for exporters which supports unpinned DMA-buf operation using + * the reservation lock. + * + * When attachment->invalidate_mappings is set the @map_dma_buf and + * @unmap_dma_buf callbacks can be called with the reservation lock + * held. + */ + bool supports_mapping_invalidation; + /** * @map_dma_buf: * @@ -326,6 +338,29 @@ struct dma_buf_attachment { struct device *dev; struct list_head node; void *priv; + + /** + * @invalidate_mappings: + * + * Optional callback provided by the importer of the attachment which + * must be set before mappings are created. + * + * If provided the exporter can avoid pinning the backing store while + * mappings exists. + * + * The function is called with the lock of the reservation object + * associated with the dma_buf held and the mapping function must be + * called with this lock held as well. This makes sure that no mapping + * is created concurrently with an ongoing invalidation. + * + * After the callback all existing mappings are still valid until all + * fences in the dma_bufs reservation object are signaled, but should be + * destroyed by the importer as soon as possible. + * + * New mappings can be created immediately, but can't be used before the + * exclusive fence in the dma_bufs reservation object is signaled. + */ + void (*invalidate_mappings)(struct dma_buf_attachment *attach); }; /** @@ -391,6 +426,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *, enum dma_data_direction); void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *, enum dma_data_direction); +void dma_buf_invalidate_mappings(struct dma_buf *dma_buf); int dma_buf_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction dir); int dma_buf_end_cpu_access(struct dma_buf *dma_buf, -- 2.14.1

7 years, 3 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig