From: tuqiang <tu.qiang35@zte.com.cn>
The MR/QP restrack entry also needs to be released when it is deleted; otherwise it causes a memory leak, as the task struct won't be released.
This problem has been fixed by the commit <dac153f2802d> ("RDMA/restrack: Release MR restrack when delete"), but still exists in the linux-5.10.y branch.
Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
Signed-off-by: tuqiang <tu.qiang35@zte.com.cn>
Signed-off-by: Jiang Kun <jiang.kun2@zte.com.cn>
Cc: stable@vger.kernel.org
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/core/restrack.c | 2 --
 1 file changed, 2 deletions(-)
diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
index bbbbec5b1593..d5a69c4a1891 100644
--- a/drivers/infiniband/core/restrack.c
+++ b/drivers/infiniband/core/restrack.c
@@ -326,8 +326,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
 	rt = &dev->res[res->type];
 
 	old = xa_erase(&rt->xa, res->id);
-	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
-		return;
 	WARN_ON(old != res);
 	res->valid = false;
On Sat, Nov 16, 2024 at 05:57:48PM +0800, jiang.kun2@zte.com.cn wrote:
> From: tuqiang <tu.qiang35@zte.com.cn>
> The MR/QP restrack entry also needs to be released when it is deleted; otherwise it causes a memory leak, as the task struct won't be released.
> This problem has been fixed by the commit <dac153f2802d> ("RDMA/restrack: Release MR restrack when delete"), but still exists in the linux-5.10.y branch.
Why don't we just take the correct fix? Why is this needed instead?
thanks,
greg k-h
> On Sat, Nov 16, 2024 at 05:57:48PM +0800, jiang.kun2@zte.com.cn wrote:
>> From: tuqiang <tu.qiang35@zte.com.cn>
>> The MR/QP restrack entry also needs to be released when it is deleted; otherwise it causes a memory leak, as the task struct won't be released.
>> This problem has been fixed by the commit <dac153f2802d> ("RDMA/restrack: Release MR restrack when delete"), but still exists in the linux-5.10.y branch.
> Why don't we just take the correct fix? Why is this needed instead?
1. Reply: Why don't we just take the correct fix?
=================================================
Because the code context differs, the upstream commit cannot be cherry-picked directly onto the linux-5.10.y branch. Commit 514aee660df4 ("RDMA: Globally allocate and release QP memory") resolved the resource release issue for QP, but the MR issue remains unresolved.
2. Reply: Why is this needed instead?
=====================================
When a user allocates resources by executing MR/QP-related commands, a reference is taken on the task_struct object. However, when the object is destroyed, rdma_restrack_del() has no corresponding release for that reference.
Stack:
 0xffffffffb70df1d0 : get_task_struct+0x0/0x50 [kernel]
 0xffffffffc5b3a42c : rdma_restrack_attach_task.isra.6+0x2c/0x50 [ib_core]
 0xffffffffc748fd54 : ib_uverbs_reg_mr+0x194/0x260 [ib_uverbs]
 0xffffffffc749a049 : ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xb9/0x110 [ib_uverbs]
 0xffffffffc7496a1f : ib_uverbs_run_method+0x6ff/0x7b0 [ib_uverbs]
 0xffffffffc7496c65 : ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs]
 0xffffffffc7496ec3 : ib_uverbs_ioctl+0x93/0xe0 [ib_uverbs]
 0xffffffffb736bbe9 : __x64_sys_ioctl+0x89/0xc0 [kernel]
 0xffffffffb7a62a10 : do_syscall_64+0x30/0x40 [kernel]

 0xffffffffb70df1d0 : get_task_struct+0x0/0x50 [kernel]
 0xffffffffc5b3a42c : rdma_restrack_attach_task.isra.6+0x2c/0x50 [ib_core]
 0xffffffffc749bfea : ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0xaba/0xb40 [ib_uverbs]
 0xffffffffc7496a1f : ib_uverbs_run_method+0x6ff/0x7b0 [ib_uverbs]
 0xffffffffc7496c65 : ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs]
 0xffffffffc7496ec3 : ib_uverbs_ioctl+0x93/0xe0 [ib_uverbs]
 0xffffffffb736bbe9 : __x64_sys_ioctl+0x89/0xc0 [kernel]
 0xffffffffb7a62a10 : do_syscall_64+0x30/0x40 [kernel]
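To make the imbalance concrete, here is a minimal userspace sketch of the get/put pairing. It is not kernel code: every model_* name is hypothetical, the structures are simplified stand-ins, and it assumes that the step following the early return in rdma_restrack_del() is where the task reference would be dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins; these are NOT the kernel structures. */
enum model_type { MODEL_PD, MODEL_MR, MODEL_QP };

struct model_task {
	int usage;			/* models the task_struct reference count */
};

struct model_entry {
	enum model_type type;
	struct model_task *task;
	bool valid;
};

/* Models get_task_struct(): take one reference on the task. */
void model_get_task(struct model_task *t)
{
	t->usage++;
}

/* Models put_task_struct(): drop one reference on the task. */
void model_put_task(struct model_task *t)
{
	assert(t->usage > 0);
	t->usage--;
}

/* Models rdma_restrack_attach_task(): the entry pins its owning task. */
void model_attach_task(struct model_entry *e, struct model_task *t)
{
	model_get_task(t);
	e->task = t;
	e->valid = true;
}

/*
 * Models rdma_restrack_del(). With early_return set, MR/QP entries
 * bail out before the release step, like the two lines the patch removes.
 */
void model_del(struct model_entry *e, bool early_return)
{
	if (early_return && (e->type == MODEL_MR || e->type == MODEL_QP))
		return;

	e->valid = false;
	if (e->task) {
		model_put_task(e->task);
		e->task = NULL;
	}
}

/* Returns the task references still held after one attach/del cycle. */
int model_leaked_refs(enum model_type type, bool early_return)
{
	struct model_task task = { .usage = 0 };
	struct model_entry entry = { .type = type, .task = NULL, .valid = false };

	model_attach_task(&entry, &task);
	model_del(&entry, early_return);
	return task.usage;
}
```

In this model, model_leaked_refs(MODEL_MR, true) leaves one reference outstanding, while model_leaked_refs(MODEL_MR, false) returns the count to zero, which is the effect the patch is after.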
> thanks,
> greg k-h
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#opti...
Rule: The upstream commit ID must be specified with a separate line above the commit text.
Subject: [PATCH STABLE 5.10] RDMA/restrack: Release MR/QP restrack when delete
Link: https://lore.kernel.org/stable/20241116175748571awvOCFyR9lCLwe61IhOXL%40zte....
Please ignore this mail if the patch is not relevant for upstream.
linux-stable-mirror@lists.linaro.org