tldr; DMA buffers aren't normal memory, expecting that you can use
them like that (like calling get_user_pages works, or that they're
accounting like any other normal memory) cannot be guaranteed.
Since some userspace only runs on integrated devices, where all
buffers are actually all resident system memory, there's a huge
temptation to assume that a struct page is always present and useable
like for any more pagecache backed mmap. This has the potential to
result in a uapi nightmare.
To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
blocks get_user_pages and all the other struct page based
infrastructure for everyone. In spirit this is the uapi counterpart to
the kernel-internal CONFIG_DMABUF_DEBUG.
Motivated by a recent patch which wanted to swich the system dma-buf
heap to vm_insert_page instead of vm_insert_pfn.
v2:
Jason brought up that we also want to guarantee that all ptes have the
pte_special flag set, to catch fast get_user_pages (on architectures
that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>From auditing the various functions to insert pfn pte entires
(vm_insert_pfn_prot, remap_pfn_range and all it's callers like
dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
this should be the correct flag to check for.
References: https://lore.kernel.org/lkml/CAKMK7uHi+mG0z0HUmNt13QCCvutuRVjpcR0NjRL12k-Wb…
Acked-by: Christian König <christian.koenig(a)amd.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: John Stultz <john.stultz(a)linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
---
drivers/dma-buf/dma-buf.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index f264b70c383e..06cb1d2e9fdc 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -127,6 +127,7 @@ static struct file_system_type dma_buf_fs_type = {
static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
{
struct dma_buf *dmabuf;
+ int ret;
if (!is_dma_buf_file(file))
return -EINVAL;
@@ -142,7 +143,11 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
dmabuf->size >> PAGE_SHIFT)
return -EINVAL;
- return dmabuf->ops->mmap(dmabuf, vma);
+ ret = dmabuf->ops->mmap(dmabuf, vma);
+
+ WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+ return ret;
}
static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
@@ -1244,6 +1249,8 @@ EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
unsigned long pgoff)
{
+ int ret;
+
if (WARN_ON(!dmabuf || !vma))
return -EINVAL;
@@ -1264,7 +1271,11 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
vma_set_file(vma, dmabuf->file);
vma->vm_pgoff = pgoff;
- return dmabuf->ops->mmap(dmabuf, vma);
+ ret = dmabuf->ops->mmap(dmabuf, vma);
+
+ WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+ return ret;
}
EXPORT_SYMBOL_GPL(dma_buf_mmap);
--
2.31.0
On Fri, Apr 02, 2021 at 02:39:31PM +0530, Nava kishore Manne wrote:
> Add "fpga-config-from-dmabuf" property to allow the bitstream
> configuration from pre-allocated dma-buffer.
>
> Signed-off-by: Nava kishore Manne <nava.manne(a)xilinx.com>
> ---
> Documentation/devicetree/bindings/fpga/fpga-region.txt | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/fpga/fpga-region.txt b/Documentation/devicetree/bindings/fpga/fpga-region.txt
> index 969ca53bb65e..c573cf258d60 100644
> --- a/Documentation/devicetree/bindings/fpga/fpga-region.txt
> +++ b/Documentation/devicetree/bindings/fpga/fpga-region.txt
> @@ -177,6 +177,8 @@ Optional properties:
> it indicates that the FPGA has already been programmed with this image.
> If this property is in an overlay targeting a FPGA region, it is a
> request to program the FPGA with that image.
> +- fpga-config-from-dmabuf : boolean, set if the FPGA configured done from the
> + pre-allocated dma-buffer.
Sounds like an implementation detail of the OS. Doesn't belong in DT.
Rob
> - fpga-bridges : should contain a list of phandles to FPGA Bridges that must be
> controlled during FPGA programming along with the parent FPGA bridge.
> This property is optional if the FPGA Manager handles the bridges.
> --
> 2.18.0
>
From: David Stevens <stevensd(a)chromium.org>
Allocate a new private stub fence in drm_syncobj_assign_null_handle,
instead of using a static stub fence.
When userspace creates a fence with DRM_SYNCOBJ_CREATE_SIGNALED or when
userspace signals a fence via DRM_IOCTL_SYNCOBJ_SIGNAL, the timestamp
obtained when the fence is exported and queried with SYNC_IOC_FILE_INFO
should match when the fence's status was changed from the perspective of
userspace, which is during the respective ioctl.
When a static stub fence started being used in by these ioctls, this
behavior changed. Instead, the timestamp returned by SYNC_IOC_FILE_INFO
became the first time anything used the static stub fence, which has no
meaning to userspace.
Signed-off-by: David Stevens <stevensd(a)chromium.org>
---
drivers/dma-buf/dma-fence.c | 33 ++++++++++++++++++++++++++++++++-
drivers/gpu/drm/drm_syncobj.c | 28 ++++++++++++++++++++--------
include/linux/dma-fence.h | 1 +
3 files changed, 53 insertions(+), 9 deletions(-)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index d64fc03929be..6081eb962490 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -26,6 +26,11 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
static DEFINE_SPINLOCK(dma_fence_stub_lock);
static struct dma_fence dma_fence_stub;
+struct drm_fence_private_stub {
+ struct dma_fence base;
+ spinlock_t lock;
+};
+
/*
* fence context counter: each execution context should have its own
* fence context, this allows checking if fences belong to the same
@@ -123,7 +128,9 @@ static const struct dma_fence_ops dma_fence_stub_ops = {
/**
* dma_fence_get_stub - return a signaled fence
*
- * Return a stub fence which is already signaled.
+ * Return a stub fence which is already signaled. The fence's
+ * timestamp corresponds to the first time after boot this
+ * function is called.
*/
struct dma_fence *dma_fence_get_stub(void)
{
@@ -141,6 +148,30 @@ struct dma_fence *dma_fence_get_stub(void)
}
EXPORT_SYMBOL(dma_fence_get_stub);
+/**
+ * dma_fence_allocate_private_stub - return a private, signaled fence
+ *
+ * Return a newly allocated and signaled stub fence.
+ */
+struct dma_fence *dma_fence_allocate_private_stub(void)
+{
+ struct drm_fence_private_stub *fence;
+
+ fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+ if (fence == NULL)
+ return ERR_PTR(-ENOMEM);
+
+ spin_lock_init(&fence->lock);
+ dma_fence_init(&fence->base,
+ &dma_fence_stub_ops,
+ &fence->lock,
+ 0, 0);
+ dma_fence_signal(&fence->base);
+
+ return &fence->base;
+}
+EXPORT_SYMBOL(dma_fence_allocate_private_stub);
+
/**
* dma_fence_context_alloc - allocate an array of fence contexts
* @num: amount of contexts to allocate
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 349146049849..c6125e57ae37 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -350,12 +350,15 @@ EXPORT_SYMBOL(drm_syncobj_replace_fence);
*
* Assign a already signaled stub fence to the sync object.
*/
-static void drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
{
- struct dma_fence *fence = dma_fence_get_stub();
+ struct dma_fence *fence = dma_fence_allocate_private_stub();
+ if (IS_ERR(fence))
+ return PTR_ERR(fence);
- drm_syncobj_replace_fence(syncobj, fence);
- dma_fence_put(fence);
+ drm_syncobj_replace_fence(syncobj, fence);
+ dma_fence_put(fence);
+ return 0;
}
/* 5s default for wait submission */
@@ -469,6 +472,7 @@ EXPORT_SYMBOL(drm_syncobj_free);
int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
struct dma_fence *fence)
{
+ int ret;
struct drm_syncobj *syncobj;
syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL);
@@ -479,8 +483,13 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
INIT_LIST_HEAD(&syncobj->cb_list);
spin_lock_init(&syncobj->lock);
- if (flags & DRM_SYNCOBJ_CREATE_SIGNALED)
- drm_syncobj_assign_null_handle(syncobj);
+ if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
+ ret = drm_syncobj_assign_null_handle(syncobj);
+ if (ret < 0) {
+ drm_syncobj_put(syncobj);
+ return ret;
+ }
+ }
if (fence)
drm_syncobj_replace_fence(syncobj, fence);
@@ -1322,8 +1331,11 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
if (ret < 0)
return ret;
- for (i = 0; i < args->count_handles; i++)
- drm_syncobj_assign_null_handle(syncobjs[i]);
+ for (i = 0; i < args->count_handles; i++) {
+ ret = drm_syncobj_assign_null_handle(syncobjs[i]);
+ if (ret < 0)
+ break;
+ }
drm_syncobj_array_free(syncobjs, args->count_handles);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..6ffb4b2c6371 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -587,6 +587,7 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
}
struct dma_fence *dma_fence_get_stub(void);
+struct dma_fence *dma_fence_allocate_private_stub(void);
u64 dma_fence_context_alloc(unsigned num);
#define DMA_FENCE_TRACE(f, fmt, args...) \
--
2.31.0.208.g409f899ff0-goog
From: David Stevens <stevensd(a)chromium.org>
Allocate a new private stub fence in drm_syncobj_assign_null_handle,
instead of using a static stub fence.
When userspace creates a fence with DRM_SYNCOBJ_CREATE_SIGNALED or when
userspace signals a fence via DRM_IOCTL_SYNCOBJ_SIGNAL, the timestamp
obtained when the fence is exported and queried with SYNC_IOC_FILE_INFO
should match when the fence's status was changed from the perspective of
userspace, which is during the respective ioctl.
When a static stub fence started being used in by these ioctls, this
behavior changed. Instead, the timestamp returned by SYNC_IOC_FILE_INFO
became the first time anything used the static stub fence, which has no
meaning to userspace.
Signed-off-by: David Stevens <stevensd(a)chromium.org>
---
v2 -> v3:
- reuse the static stub spinlock
v1 -> v2:
- checkpatch style fixes
drivers/dma-buf/dma-fence.c | 27 ++++++++++++++++++++++++++-
drivers/gpu/drm/drm_syncobj.c | 25 +++++++++++++++++++------
include/linux/dma-fence.h | 1 +
3 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index d64fc03929be..ce0f5eff575d 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -123,7 +123,9 @@ static const struct dma_fence_ops dma_fence_stub_ops = {
/**
* dma_fence_get_stub - return a signaled fence
*
- * Return a stub fence which is already signaled.
+ * Return a stub fence which is already signaled. The fence's
+ * timestamp corresponds to the first time after boot this
+ * function is called.
*/
struct dma_fence *dma_fence_get_stub(void)
{
@@ -141,6 +143,29 @@ struct dma_fence *dma_fence_get_stub(void)
}
EXPORT_SYMBOL(dma_fence_get_stub);
+/**
+ * dma_fence_allocate_private_stub - return a private, signaled fence
+ *
+ * Return a newly allocated and signaled stub fence.
+ */
+struct dma_fence *dma_fence_allocate_private_stub(void)
+{
+ struct dma_fence *fence;
+
+ fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+ if (fence == NULL)
+ return ERR_PTR(-ENOMEM);
+
+ dma_fence_init(fence,
+ &dma_fence_stub_ops,
+ &dma_fence_stub_lock,
+ 0, 0);
+ dma_fence_signal(fence);
+
+ return fence;
+}
+EXPORT_SYMBOL(dma_fence_allocate_private_stub);
+
/**
* dma_fence_context_alloc - allocate an array of fence contexts
* @num: amount of contexts to allocate
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 349146049849..a54aa850d143 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -350,12 +350,16 @@ EXPORT_SYMBOL(drm_syncobj_replace_fence);
*
* Assign a already signaled stub fence to the sync object.
*/
-static void drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
{
- struct dma_fence *fence = dma_fence_get_stub();
+ struct dma_fence *fence = dma_fence_allocate_private_stub();
+
+ if (IS_ERR(fence))
+ return PTR_ERR(fence);
drm_syncobj_replace_fence(syncobj, fence);
dma_fence_put(fence);
+ return 0;
}
/* 5s default for wait submission */
@@ -469,6 +473,7 @@ EXPORT_SYMBOL(drm_syncobj_free);
int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
struct dma_fence *fence)
{
+ int ret;
struct drm_syncobj *syncobj;
syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL);
@@ -479,8 +484,13 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
INIT_LIST_HEAD(&syncobj->cb_list);
spin_lock_init(&syncobj->lock);
- if (flags & DRM_SYNCOBJ_CREATE_SIGNALED)
- drm_syncobj_assign_null_handle(syncobj);
+ if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
+ ret = drm_syncobj_assign_null_handle(syncobj);
+ if (ret < 0) {
+ drm_syncobj_put(syncobj);
+ return ret;
+ }
+ }
if (fence)
drm_syncobj_replace_fence(syncobj, fence);
@@ -1322,8 +1332,11 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
if (ret < 0)
return ret;
- for (i = 0; i < args->count_handles; i++)
- drm_syncobj_assign_null_handle(syncobjs[i]);
+ for (i = 0; i < args->count_handles; i++) {
+ ret = drm_syncobj_assign_null_handle(syncobjs[i]);
+ if (ret < 0)
+ break;
+ }
drm_syncobj_array_free(syncobjs, args->count_handles);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..6ffb4b2c6371 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -587,6 +587,7 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
}
struct dma_fence *dma_fence_get_stub(void);
+struct dma_fence *dma_fence_allocate_private_stub(void);
u64 dma_fence_context_alloc(unsigned num);
#define DMA_FENCE_TRACE(f, fmt, args...) \
--
2.31.0.208.g409f899ff0-goog
From: David Stevens <stevensd(a)chromium.org>
Allocate a new private stub fence in drm_syncobj_assign_null_handle,
instead of using a static stub fence.
When userspace creates a fence with DRM_SYNCOBJ_CREATE_SIGNALED or when
userspace signals a fence via DRM_IOCTL_SYNCOBJ_SIGNAL, the timestamp
obtained when the fence is exported and queried with SYNC_IOC_FILE_INFO
should match when the fence's status was changed from the perspective of
userspace, which is during the respective ioctl.
When a static stub fence started being used in by these ioctls, this
behavior changed. Instead, the timestamp returned by SYNC_IOC_FILE_INFO
became the first time anything used the static stub fence, which has no
meaning to userspace.
Signed-off-by: David Stevens <stevensd(a)chromium.org>
---
v1 -> v2:
- checkpatch style fixes
drivers/dma-buf/dma-fence.c | 33 ++++++++++++++++++++++++++++++++-
drivers/gpu/drm/drm_syncobj.c | 25 +++++++++++++++++++------
include/linux/dma-fence.h | 1 +
3 files changed, 52 insertions(+), 7 deletions(-)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index d64fc03929be..6081eb962490 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -26,6 +26,11 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
static DEFINE_SPINLOCK(dma_fence_stub_lock);
static struct dma_fence dma_fence_stub;
+struct drm_fence_private_stub {
+ struct dma_fence base;
+ spinlock_t lock;
+};
+
/*
* fence context counter: each execution context should have its own
* fence context, this allows checking if fences belong to the same
@@ -123,7 +128,9 @@ static const struct dma_fence_ops dma_fence_stub_ops = {
/**
* dma_fence_get_stub - return a signaled fence
*
- * Return a stub fence which is already signaled.
+ * Return a stub fence which is already signaled. The fence's
+ * timestamp corresponds to the first time after boot this
+ * function is called.
*/
struct dma_fence *dma_fence_get_stub(void)
{
@@ -141,6 +148,30 @@ struct dma_fence *dma_fence_get_stub(void)
}
EXPORT_SYMBOL(dma_fence_get_stub);
+/**
+ * dma_fence_allocate_private_stub - return a private, signaled fence
+ *
+ * Return a newly allocated and signaled stub fence.
+ */
+struct dma_fence *dma_fence_allocate_private_stub(void)
+{
+ struct drm_fence_private_stub *fence;
+
+ fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+ if (fence == NULL)
+ return ERR_PTR(-ENOMEM);
+
+ spin_lock_init(&fence->lock);
+ dma_fence_init(&fence->base,
+ &dma_fence_stub_ops,
+ &fence->lock,
+ 0, 0);
+ dma_fence_signal(&fence->base);
+
+ return &fence->base;
+}
+EXPORT_SYMBOL(dma_fence_allocate_private_stub);
+
/**
* dma_fence_context_alloc - allocate an array of fence contexts
* @num: amount of contexts to allocate
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 349146049849..a54aa850d143 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -350,12 +350,16 @@ EXPORT_SYMBOL(drm_syncobj_replace_fence);
*
* Assign a already signaled stub fence to the sync object.
*/
-static void drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
{
- struct dma_fence *fence = dma_fence_get_stub();
+ struct dma_fence *fence = dma_fence_allocate_private_stub();
+
+ if (IS_ERR(fence))
+ return PTR_ERR(fence);
drm_syncobj_replace_fence(syncobj, fence);
dma_fence_put(fence);
+ return 0;
}
/* 5s default for wait submission */
@@ -469,6 +473,7 @@ EXPORT_SYMBOL(drm_syncobj_free);
int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
struct dma_fence *fence)
{
+ int ret;
struct drm_syncobj *syncobj;
syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL);
@@ -479,8 +484,13 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
INIT_LIST_HEAD(&syncobj->cb_list);
spin_lock_init(&syncobj->lock);
- if (flags & DRM_SYNCOBJ_CREATE_SIGNALED)
- drm_syncobj_assign_null_handle(syncobj);
+ if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
+ ret = drm_syncobj_assign_null_handle(syncobj);
+ if (ret < 0) {
+ drm_syncobj_put(syncobj);
+ return ret;
+ }
+ }
if (fence)
drm_syncobj_replace_fence(syncobj, fence);
@@ -1322,8 +1332,11 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
if (ret < 0)
return ret;
- for (i = 0; i < args->count_handles; i++)
- drm_syncobj_assign_null_handle(syncobjs[i]);
+ for (i = 0; i < args->count_handles; i++) {
+ ret = drm_syncobj_assign_null_handle(syncobjs[i]);
+ if (ret < 0)
+ break;
+ }
drm_syncobj_array_free(syncobjs, args->count_handles);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..6ffb4b2c6371 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -587,6 +587,7 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
}
struct dma_fence *dma_fence_get_stub(void);
+struct dma_fence *dma_fence_allocate_private_stub(void);
u64 dma_fence_context_alloc(unsigned num);
#define DMA_FENCE_TRACE(f, fmt, args...) \
--
2.31.0.208.g409f899ff0-goog
Hi Qu,
Am 02.04.21 um 05:18 schrieb Qu Huang:
> Before dma_resv_lock(bo->base.resv, NULL) in amdgpu_bo_release_notify(),
> the bo->base.resv lock may be held by ttm_mem_evict_first(),
That can't happen since when bo_release_notify is called the BO has not
more references and is therefore deleted.
And we never evict a deleted BO, we just wait for it to become idle.
Regards,
Christian.
> and the VRAM mem will be evicted, mem region was replaced
> by Gtt mem region. amdgpu_bo_release_notify() will then
> hold the bo->base.resv lock, and SDMA will get an invalid
> address in amdgpu_fill_buffer(), resulting in a VMFAULT
> or memory corruption.
>
> To avoid it, we have to hold bo->base.resv lock first, and
> check whether the mem.mem_type is TTM_PL_VRAM.
>
> Signed-off-by: Qu Huang <jinsdb(a)126.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 4b29b82..8018574 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1300,12 +1300,16 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
> if (bo->base.resv == &bo->base._resv)
> amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
>
> - if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node ||
> - !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
> + if (!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
> return;
>
> dma_resv_lock(bo->base.resv, NULL);
>
> + if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node) {
> + dma_resv_unlock(bo->base.resv);
> + return;
> + }
> +
> r = amdgpu_fill_buffer(abo, AMDGPU_POISON, bo->base.resv, &fence);
> if (!WARN_ON(r)) {
> amdgpu_bo_fence(abo, fence, false);
> --
> 1.8.3.1
>
On Thu, Apr 1, 2021 at 8:34 AM Doug Anderson <dianders(a)chromium.org> wrote:
>
> Hi,
>
> On Wed, Mar 31, 2021 at 6:24 PM Rob Clark <robdclark(a)gmail.com> wrote:
> >
> > @@ -45,6 +30,9 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
> > list_for_each_entry(msm_obj, &priv->inactive_dontneed, mm_list) {
> > if (freed >= sc->nr_to_scan)
> > break;
> > + /* Use trylock, because we cannot block on a obj that
> > + * might be trying to acquire mm_lock
> > + */
>
> nit: I thought the above multi-line commenting style was only for
> "net" subsystem?
we do use the "net" style a fair bit already.. (OTOH I tend to not
really care what checkpatch says)
> > if (!msm_gem_trylock(&msm_obj->base))
> > continue;
> > if (is_purgeable(msm_obj)) {
> > @@ -56,8 +44,11 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
> >
> > mutex_unlock(&priv->mm_lock);
> >
> > - if (freed > 0)
> > + if (freed > 0) {
> > trace_msm_gem_purge(freed << PAGE_SHIFT);
> > + } else {
> > + return SHRINK_STOP;
> > + }
>
> It probably doesn't matter, but I wonder if we should still be
> returning SHRINK_STOP if we got any trylock failures. It could
> possibly be worth returning 0 in that case?
On the surface, you'd think that, but there be mm dragons.. we can hit
shrinker from the submit path when the obj is locked already and we
are trying to allocate backing pages. We don't want to tell vmscan to
keep trying, because we'll keep failing to grab that objects lock
>
> > @@ -75,6 +66,9 @@ vmap_shrink(struct list_head *mm_list)
> > unsigned unmapped = 0;
> >
> > list_for_each_entry(msm_obj, mm_list, mm_list) {
> > + /* Use trylock, because we cannot block on a obj that
> > + * might be trying to acquire mm_lock
> > + */
>
> If you end up changing the commenting style above, should also be here.
>
> At this point this seems fine to land to me. Though I'm not an expert
> on every interaction in this code, I've spent enough time starting at
> it that I'm comfortable with:
>
> Reviewed-by: Douglas Anderson <dianders(a)chromium.org>
thanks
BR,
-R
From: Rob Clark <robdclark(a)chromium.org>
I've been spending some time looking into how things behave under high
memory pressure. The first patch is a random cleanup I noticed along
the way. The second improves the situation significantly when we are
getting shrinker called from many threads in parallel. And the last
two are $debugfs/gem fixes I needed so I could monitor the state of GEM
objects (ie. how many are active/purgable/purged) while triggering high
memory pressure.
We could probably go a bit further with dropping the mm_lock in the
shrinker->scan() loop, but this is already a pretty big improvement.
The next step is probably actually to add support to unpin/evict
inactive objects. (We are part way there since we have already de-
coupled the iova lifetime from the pages lifetime, but there are a
few sharp corners to work through.)
Rob Clark (4):
drm/msm: Remove unused freed llist node
drm/msm: Avoid mutex in shrinker_count()
drm/msm: Fix debugfs deadlock
drm/msm: Improved debugfs gem stats
drivers/gpu/drm/msm/msm_debugfs.c | 14 ++---
drivers/gpu/drm/msm/msm_drv.c | 4 ++
drivers/gpu/drm/msm/msm_drv.h | 15 ++++--
drivers/gpu/drm/msm/msm_fb.c | 3 +-
drivers/gpu/drm/msm/msm_gem.c | 65 ++++++++++++++++++-----
drivers/gpu/drm/msm/msm_gem.h | 72 +++++++++++++++++++++++---
drivers/gpu/drm/msm/msm_gem_shrinker.c | 28 ++++------
7 files changed, 150 insertions(+), 51 deletions(-)
--
2.30.2