This is a note to let you know that I've just added the patch titled
drm/amdgpu: Properly allocate VM invalidate eng v2
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-amdgpu-properly-allocate-vm-invalidate-eng-v2.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c5066129af4436ab0da8eefe4289774a5409706d Mon Sep 17 00:00:00 2001
From: ozeng <oak.zeng(a)amd.com>
Date: Tue, 6 Jun 2017 10:53:28 -0500
Subject: drm/amdgpu: Properly allocate VM invalidate eng v2
From: ozeng <oak.zeng(a)amd.com>
commit c5066129af4436ab0da8eefe4289774a5409706d upstream.
v1: Properly allocate TLB invalidation engine to avoid conflict.
v2: Added comments to codes
Signed-off-by: Oak Zeng <Oak.Zeng(a)amd.com>
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Acked-by: Christian Konig <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -395,7 +395,16 @@ static int gmc_v9_0_early_init(void *han
static int gmc_v9_0_late_init(void *handle)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
- unsigned vm_inv_eng[AMDGPU_MAX_VMHUBS] = { 3, 3 };
+ /*
+ * The latest engine allocation on gfx9 is:
+ * Engine 0, 1: idle
+ * Engine 2, 3: firmware
+ * Engine 4~13: amdgpu ring, subject to change when ring number changes
+ * Engine 14~15: idle
+ * Engine 16: kfd tlb invalidation
+ * Engine 17: Gart flushes
+ */
+ unsigned vm_inv_eng[AMDGPU_MAX_VMHUBS] = { 4, 4 };
unsigned i;
for(i = 0; i < adev->num_rings; ++i) {
@@ -408,9 +417,9 @@ static int gmc_v9_0_late_init(void *hand
ring->funcs->vmhub);
}
- /* Engine 17 is used for GART flushes */
+ /* Engine 16 is used for KFD and 17 for GART flushes */
for(i = 0; i < AMDGPU_MAX_VMHUBS; ++i)
- BUG_ON(vm_inv_eng[i] > 17);
+ BUG_ON(vm_inv_eng[i] > 16);
return amdgpu_irq_get(adev, &adev->mc.vm_fault, 0);
}
Patches currently in stable-queue which might be from oak.zeng(a)amd.com are
queue-4.14/drm-amdgpu-properly-allocate-vm-invalidate-eng-v2.patch
This is a note to let you know that I've just added the patch titled
drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vm_update_directories.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 78aa02c713fcf19e9bc8511ab61a5fd6c877cc01 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Sat, 30 Sep 2017 11:14:13 +0300
Subject: drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Dan Carpenter <dan.carpenter(a)oracle.com>
commit 78aa02c713fcf19e9bc8511ab61a5fd6c877cc01 upstream.
After commit ea09729c9302 ("drm/amdgpu: rework page directory filling
v2") then it becomes a lot harder to verify that "r" is initialized. My
static checker complains and so I've reviewed the code. It does look
like it might be buggy... Anyway, it doesn't hurt to set "r" to zero
at the start.
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1201,7 +1201,7 @@ static void amdgpu_vm_invalidate_level(s
int amdgpu_vm_update_directories(struct amdgpu_device *adev,
struct amdgpu_vm *vm)
{
- int r;
+ int r = 0;
r = amdgpu_vm_update_level(adev, vm, &vm->root, 0);
if (r)
Patches currently in stable-queue which might be from dan.carpenter(a)oracle.com are
queue-4.14/drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vce_ring_parse_cs.patch
queue-4.14/drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vm_update_directories.patch
This is a note to let you know that I've just added the patch titled
drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vce_ring_parse_cs.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 40a9960b046290939b56ce8e51f365258f27f264 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Sat, 30 Sep 2017 11:13:28 +0300
Subject: drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Dan Carpenter <dan.carpenter(a)oracle.com>
commit 40a9960b046290939b56ce8e51f365258f27f264 upstream.
We shifted some code around in commit 9cca0b8e5df0 ("drm/amdgpu: move
amdgpu_cs_sysvm_access_required into find_mapping") and now my static
checker complains that "r" might not be initialized at the end of the
function. I've reviewed the code, and that seems possible, but it's
also possible I may have missed something.
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -647,7 +647,7 @@ int amdgpu_vce_ring_parse_cs(struct amdg
uint32_t allocated = 0;
uint32_t tmp, handle = 0;
uint32_t *size = &tmp;
- int i, r, idx = 0;
+ int i, r = 0, idx = 0;
p->job->vm = NULL;
ib->gpu_addr = amdgpu_sa_bo_gpu_addr(ib->sa_bo);
Patches currently in stable-queue which might be from dan.carpenter(a)oracle.com are
queue-4.14/drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vce_ring_parse_cs.patch
queue-4.14/drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vm_update_directories.patch
This is a note to let you know that I've just added the patch titled
drm/amdgpu: fix error handling in amdgpu_bo_do_create
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-amdgpu-fix-error-handling-in-amdgpu_bo_do_create.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From a695e43712242c354748e9bae5d137d4337a7694 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig(a)amd.com>
Date: Tue, 31 Oct 2017 09:36:13 +0100
Subject: drm/amdgpu: fix error handling in amdgpu_bo_do_create
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Christian König <christian.koenig(a)amd.com>
commit a695e43712242c354748e9bae5d137d4337a7694 upstream.
The bo structure is freed up in case of an error, so we can't do any
accounting if that happens.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -391,6 +391,9 @@ int amdgpu_bo_create_restricted(struct a
r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, type,
&bo->placement, page_align, !kernel, NULL,
acc_size, sg, resv, &amdgpu_ttm_bo_destroy);
+ if (unlikely(r != 0))
+ return r;
+
bytes_moved = atomic64_read(&adev->num_bytes_moved) -
initial_bytes_moved;
if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
@@ -400,9 +403,6 @@ int amdgpu_bo_create_restricted(struct a
else
amdgpu_cs_report_moved_bytes(adev, bytes_moved, 0);
- if (unlikely(r != 0))
- return r;
-
if (kernel)
bo->tbo.priority = 1;
Patches currently in stable-queue which might be from christian.koenig(a)amd.com are
queue-4.14/drm-ttm-fix-ttm_bo_cleanup_refs_or_queue-once-more.patch
queue-4.14/drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vce_ring_parse_cs.patch
queue-4.14/drm-amdgpu-properly-allocate-vm-invalidate-eng-v2.patch
queue-4.14/drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vm_update_directories.patch
queue-4.14/drm-amdgpu-fix-error-handling-in-amdgpu_bo_do_create.patch
queue-4.14/dma-buf-make-reservation_object_copy_fences-rcu-save.patch
This is a note to let you know that I've just added the patch titled
drm/amdgpu: correct reference clock value on vega10
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-amdgpu-correct-reference-clock-value-on-vega10.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 76d6172b6fab16455af4b67bb18a3f66011592f8 Mon Sep 17 00:00:00 2001
From: Ken Wang <Ken.Wang(a)amd.com>
Date: Fri, 29 Sep 2017 15:41:43 +0800
Subject: drm/amdgpu: correct reference clock value on vega10
From: Ken Wang <Ken.Wang(a)amd.com>
commit 76d6172b6fab16455af4b67bb18a3f66011592f8 upstream.
Old value from bringup was wrong.
Signed-off-by: Ken Wang <Ken.Wang(a)amd.com>
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/amd/amdgpu/soc15.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -279,10 +279,7 @@ static void soc15_init_golden_registers(
}
static u32 soc15_get_xclk(struct amdgpu_device *adev)
{
- if (adev->asic_type == CHIP_VEGA10)
- return adev->clock.spll.reference_freq/4;
- else
- return adev->clock.spll.reference_freq;
+ return adev->clock.spll.reference_freq;
}
Patches currently in stable-queue which might be from Ken.Wang(a)amd.com are
queue-4.14/drm-amdgpu-correct-reference-clock-value-on-vega10.patch
queue-4.14/drm-amdgpu-remove-check-which-is-not-valid-for-certain-vbios.patch
This is a note to let you know that I've just added the patch titled
dma-buf: make reservation_object_copy_fences rcu save
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dma-buf-make-reservation_object_copy_fences-rcu-save.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 39e16ba16c147e662bf9fbcee9a99d70d420382f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig(a)amd.com>
Date: Mon, 4 Sep 2017 21:02:45 +0200
Subject: dma-buf: make reservation_object_copy_fences rcu save
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Christian König <christian.koenig(a)amd.com>
commit 39e16ba16c147e662bf9fbcee9a99d70d420382f upstream.
Stop requiring that the src reservation object is locked for this operation.
Acked-by: Chunming Zhou <david1.zhou(a)amd.com>
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504551766-5093-1-git-send-em…
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/dma-buf/reservation.c | 56 +++++++++++++++++++++++++++++++-----------
1 file changed, 42 insertions(+), 14 deletions(-)
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -266,8 +266,7 @@ EXPORT_SYMBOL(reservation_object_add_exc
* @dst: the destination reservation object
* @src: the source reservation object
*
-* Copy all fences from src to dst. Both src->lock as well as dst-lock must be
-* held.
+* Copy all fences from src to dst. dst-lock must be held.
*/
int reservation_object_copy_fences(struct reservation_object *dst,
struct reservation_object *src)
@@ -277,33 +276,62 @@ int reservation_object_copy_fences(struc
size_t size;
unsigned i;
- src_list = reservation_object_get_list(src);
+ rcu_read_lock();
+ src_list = rcu_dereference(src->fence);
+retry:
if (src_list) {
- size = offsetof(typeof(*src_list),
- shared[src_list->shared_count]);
+ unsigned shared_count = src_list->shared_count;
+
+ size = offsetof(typeof(*src_list), shared[shared_count]);
+ rcu_read_unlock();
+
dst_list = kmalloc(size, GFP_KERNEL);
if (!dst_list)
return -ENOMEM;
- dst_list->shared_count = src_list->shared_count;
- dst_list->shared_max = src_list->shared_count;
- for (i = 0; i < src_list->shared_count; ++i)
- dst_list->shared[i] =
- dma_fence_get(src_list->shared[i]);
+ rcu_read_lock();
+ src_list = rcu_dereference(src->fence);
+ if (!src_list || src_list->shared_count > shared_count) {
+ kfree(dst_list);
+ goto retry;
+ }
+
+ dst_list->shared_count = 0;
+ dst_list->shared_max = shared_count;
+ for (i = 0; i < src_list->shared_count; ++i) {
+ struct dma_fence *fence;
+
+ fence = rcu_dereference(src_list->shared[i]);
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+ &fence->flags))
+ continue;
+
+ if (!dma_fence_get_rcu(fence)) {
+ kfree(dst_list);
+ src_list = rcu_dereference(src->fence);
+ goto retry;
+ }
+
+ if (dma_fence_is_signaled(fence)) {
+ dma_fence_put(fence);
+ continue;
+ }
+
+ dst_list->shared[dst_list->shared_count++] = fence;
+ }
} else {
dst_list = NULL;
}
+ new = dma_fence_get_rcu_safe(&src->fence_excl);
+ rcu_read_unlock();
+
kfree(dst->staged);
dst->staged = NULL;
src_list = reservation_object_get_list(dst);
-
old = reservation_object_get_excl(dst);
- new = reservation_object_get_excl(src);
-
- dma_fence_get(new);
preempt_disable();
write_seqcount_begin(&dst->seq);
Patches currently in stable-queue which might be from christian.koenig(a)amd.com are
queue-4.14/drm-ttm-fix-ttm_bo_cleanup_refs_or_queue-once-more.patch
queue-4.14/drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vce_ring_parse_cs.patch
queue-4.14/drm-amdgpu-properly-allocate-vm-invalidate-eng-v2.patch
queue-4.14/drm-amdgpu-potential-uninitialized-variable-in-amdgpu_vm_update_directories.patch
queue-4.14/drm-amdgpu-fix-error-handling-in-amdgpu_bo_do_create.patch
queue-4.14/dma-buf-make-reservation_object_copy_fences-rcu-save.patch
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4b277247b1df177a7a60f0601ebe3dc952e5dd54 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig(a)amd.com>
Date: Mon, 13 Nov 2017 17:20:50 +0100
Subject: [PATCH] drm/amdgpu: set f_mapping on exported DMA-bufs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Otherwise we can't correctly CPU map TTM buffers.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Acked-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index 90af8e82b16a..ae9c106979d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -169,10 +169,14 @@ struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
int flags)
{
struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj);
+ struct dma_buf *buf;
if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm) ||
bo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID)
return ERR_PTR(-EPERM);
- return drm_gem_prime_export(dev, gobj, flags);
+ buf = drm_gem_prime_export(dev, gobj, flags);
+ if (!IS_ERR(buf))
+ buf->file->f_mapping = dev->anon_inode->i_mapping;
+ return buf;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4b6b691ee38abae8842aed61d442dfb315c45789 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig(a)amd.com>
Date: Mon, 16 Oct 2017 10:32:04 +0200
Subject: [PATCH] drm/amdgpu: linear validate first then bind to GART
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
For VM emulation for old UVD/VCE we need to validate the BO with linear
VRAM flag set first and then eventually bind it to GART.
Validating with linear VRAM flag set can move the BO to GART making
UVD/VCE read/write from an unbound GART BO.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
CC: stable(a)vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 32cf83e2f2d9..f7fceb63413c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1582,14 +1582,14 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
if (READ_ONCE((*bo)->tbo.resv->lock.ctx) != &parser->ticket)
return -EINVAL;
- r = amdgpu_ttm_bind(&(*bo)->tbo, &(*bo)->tbo.mem);
- if (unlikely(r))
- return r;
-
- if ((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
- return 0;
+ if (!((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
+ (*bo)->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+ amdgpu_ttm_placement_from_domain(*bo, (*bo)->allowed_domains);
+ r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, false,
+ false);
+ if (r)
+ return r;
+ }
- (*bo)->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
- amdgpu_ttm_placement_from_domain(*bo, (*bo)->allowed_domains);
- return ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, false, false);
+ return amdgpu_ttm_bind(&(*bo)->tbo, &(*bo)->tbo.mem);
}
This is a note to let you know that I've just added the patch titled
nfsd: Fix stateid races between OPEN and CLOSE
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nfsd-fix-stateid-races-between-open-and-close.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 15ca08d3299682dc49bad73251677b2c5017ef08 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)primarydata.com>
Date: Fri, 3 Nov 2017 08:00:10 -0400
Subject: nfsd: Fix stateid races between OPEN and CLOSE
From: Trond Myklebust <trond.myklebust(a)primarydata.com>
commit 15ca08d3299682dc49bad73251677b2c5017ef08 upstream.
Open file stateids can linger on the nfs4_file list of stateids even
after they have been closed. In order to avoid reusing such a
stateid, and confusing the client, we need to recheck the
nfs4_stid's type after taking the mutex.
Otherwise, we risk reusing an old stateid that was already closed,
which will confuse clients that expect new stateids to conform to
RFC7530 Sections 9.1.4.2 and 16.2.5 or RFC5661 Sections 8.2.2 and 18.2.4.
Signed-off-by: Trond Myklebust <trond.myklebust(a)primarydata.com>
Signed-off-by: J. Bruce Fields <bfields(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/nfsd/nfs4state.c | 67 +++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 59 insertions(+), 8 deletions(-)
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3513,7 +3513,9 @@ nfsd4_find_existing_open(struct nfs4_fil
/* ignore lock owners */
if (local->st_stateowner->so_is_open_owner == 0)
continue;
- if (local->st_stateowner == &oo->oo_owner) {
+ if (local->st_stateowner != &oo->oo_owner)
+ continue;
+ if (local->st_stid.sc_type == NFS4_OPEN_STID) {
ret = local;
atomic_inc(&ret->st_stid.sc_count);
break;
@@ -3522,6 +3524,52 @@ nfsd4_find_existing_open(struct nfs4_fil
return ret;
}
+static __be32
+nfsd4_verify_open_stid(struct nfs4_stid *s)
+{
+ __be32 ret = nfs_ok;
+
+ switch (s->sc_type) {
+ default:
+ break;
+ case NFS4_CLOSED_STID:
+ case NFS4_CLOSED_DELEG_STID:
+ ret = nfserr_bad_stateid;
+ break;
+ case NFS4_REVOKED_DELEG_STID:
+ ret = nfserr_deleg_revoked;
+ }
+ return ret;
+}
+
+/* Lock the stateid st_mutex, and deal with races with CLOSE */
+static __be32
+nfsd4_lock_ol_stateid(struct nfs4_ol_stateid *stp)
+{
+ __be32 ret;
+
+ mutex_lock(&stp->st_mutex);
+ ret = nfsd4_verify_open_stid(&stp->st_stid);
+ if (ret != nfs_ok)
+ mutex_unlock(&stp->st_mutex);
+ return ret;
+}
+
+static struct nfs4_ol_stateid *
+nfsd4_find_and_lock_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
+{
+ struct nfs4_ol_stateid *stp;
+ for (;;) {
+ spin_lock(&fp->fi_lock);
+ stp = nfsd4_find_existing_open(fp, open);
+ spin_unlock(&fp->fi_lock);
+ if (!stp || nfsd4_lock_ol_stateid(stp) == nfs_ok)
+ break;
+ nfs4_put_stid(&stp->st_stid);
+ }
+ return stp;
+}
+
static struct nfs4_openowner *
alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
struct nfsd4_compound_state *cstate)
@@ -3566,6 +3614,7 @@ init_open_stateid(struct nfs4_file *fp,
mutex_init(&stp->st_mutex);
mutex_lock(&stp->st_mutex);
+retry:
spin_lock(&oo->oo_owner.so_client->cl_lock);
spin_lock(&fp->fi_lock);
@@ -3590,7 +3639,11 @@ out_unlock:
spin_unlock(&fp->fi_lock);
spin_unlock(&oo->oo_owner.so_client->cl_lock);
if (retstp) {
- mutex_lock(&retstp->st_mutex);
+ /* Handle races with CLOSE */
+ if (nfsd4_lock_ol_stateid(retstp) != nfs_ok) {
+ nfs4_put_stid(&retstp->st_stid);
+ goto retry;
+ }
/* To keep mutex tracking happy */
mutex_unlock(&stp->st_mutex);
stp = retstp;
@@ -4411,9 +4464,7 @@ nfsd4_process_open2(struct svc_rqst *rqs
status = nfs4_check_deleg(cl, open, &dp);
if (status)
goto out;
- spin_lock(&fp->fi_lock);
- stp = nfsd4_find_existing_open(fp, open);
- spin_unlock(&fp->fi_lock);
+ stp = nfsd4_find_and_lock_existing_open(fp, open);
} else {
open->op_file = NULL;
status = nfserr_bad_stateid;
@@ -4427,7 +4478,6 @@ nfsd4_process_open2(struct svc_rqst *rqs
*/
if (stp) {
/* Stateid was found, this is an OPEN upgrade */
- mutex_lock(&stp->st_mutex);
status = nfs4_upgrade_open(rqstp, fp, current_fh, stp, open);
if (status) {
mutex_unlock(&stp->st_mutex);
@@ -5314,7 +5364,6 @@ static void nfsd4_close_open_stateid(str
bool unhashed;
LIST_HEAD(reaplist);
- s->st_stid.sc_type = NFS4_CLOSED_STID;
spin_lock(&clp->cl_lock);
unhashed = unhash_open_stateid(s, &reaplist);
@@ -5353,10 +5402,12 @@ nfsd4_close(struct svc_rqst *rqstp, stru
nfsd4_bump_seqid(cstate, status);
if (status)
goto out;
+
+ stp->st_stid.sc_type = NFS4_CLOSED_STID;
nfs4_inc_and_copy_stateid(&close->cl_stateid, &stp->st_stid);
- mutex_unlock(&stp->st_mutex);
nfsd4_close_open_stateid(stp);
+ mutex_unlock(&stp->st_mutex);
/* put reference from nfs4_preprocess_seqid_op */
nfs4_put_stid(&stp->st_stid);
Patches currently in stable-queue which might be from trond.myklebust(a)primarydata.com are
queue-4.9/nfsd-fix-stateid-races-between-open-and-close.patch
queue-4.9/nfsd-fix-another-open-stateid-race.patch
This is a note to let you know that I've just added the patch titled
nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nfsd-fix-panic-in-posix_unblock_lock-called-from-nfs4_laundromat.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 64ebe12494fd5d193f014ce38e1fd83cc57883c8 Mon Sep 17 00:00:00 2001
From: Naofumi Honda <honda(a)math.sci.hokudai.ac.jp>
Date: Thu, 9 Nov 2017 10:57:16 -0500
Subject: nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat
From: Naofumi Honda <honda(a)math.sci.hokudai.ac.jp>
commit 64ebe12494fd5d193f014ce38e1fd83cc57883c8 upstream.
>From kernel 4.9, my two nfsv4 servers sometimes suffer from
"panic: unable to handle kernel page request"
in posix_unblock_lock() called from nfs4_laundromat().
These panics diseappear if we revert the commit "nfsd: add a LRU list
for blocked locks".
The cause appears to be a typo in nfs4_laundromat(), which is also
present in nfs4_state_shutdown_net().
Fixes: 7919d0a27f1e "nfsd: add a LRU list for blocked locks"
Cc: jlayton(a)redhat.com
Reveiwed-by: Jeff Layton <jlayton(a)redhat.com>
Signed-off-by: J. Bruce Fields <bfields(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/nfsd/nfs4state.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4732,7 +4732,7 @@ nfs4_laundromat(struct nfsd_net *nn)
spin_unlock(&nn->blocked_locks_lock);
while (!list_empty(&reaplist)) {
- nbl = list_first_entry(&nn->blocked_locks_lru,
+ nbl = list_first_entry(&reaplist,
struct nfsd4_blocked_lock, nbl_lru);
list_del_init(&nbl->nbl_lru);
posix_unblock_lock(&nbl->nbl_lock);
@@ -7143,7 +7143,7 @@ nfs4_state_shutdown_net(struct net *net)
spin_unlock(&nn->blocked_locks_lock);
while (!list_empty(&reaplist)) {
- nbl = list_first_entry(&nn->blocked_locks_lru,
+ nbl = list_first_entry(&reaplist,
struct nfsd4_blocked_lock, nbl_lru);
list_del_init(&nbl->nbl_lru);
posix_unblock_lock(&nbl->nbl_lock);
Patches currently in stable-queue which might be from honda(a)math.sci.hokudai.ac.jp are
queue-4.9/nfsd-fix-panic-in-posix_unblock_lock-called-from-nfs4_laundromat.patch