From: Jason Gunthorpe <jgg@nvidia.com>
commit 2ee9bf346fbfd1dad0933b9eb3a4c2c0979b633e upstream.
This three-thread race can result in the work being run once the callback becomes NULL:
       CPU1                 CPU2                      CPU3
 netevent_callback()
                     process_one_req()       rdma_addr_cancel()
                      [..]
  spin_lock_bh()
   set_timeout()
  spin_unlock_bh()

                                             spin_lock_bh()
                                             list_del_init(&req->list);
                                             spin_unlock_bh()

                     req->callback = NULL
                     spin_lock_bh()
                       if (!list_empty(&req->list))
                            // Skipped!
                            // cancel_delayed_work(&req->work);
                     spin_unlock_bh()

                     process_one_req() // again
                      req->callback() // BOOM
                                             cancel_delayed_work_sync()
The solution is to always cancel the work once it has completed, so that any in-between set_timeout() does not result in it running again.
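The requeue in question comes from the neighbour-event path: netevent_callback() walks req_list and re-arms each request's delayed work via set_timeout()/mod_delayed_work(), so a request that is still on the list can have its work rescheduled at any moment. Roughly, simplified from memory of drivers/infiniband/core/addr.c (the set_timeout() signature, the addr_wq/req_list/lock names and the elided neighbour-state check are assumptions here, not quoted code):

/*
 * Sketch: how the work gets re-armed while a request sits on req_list.
 * Names and details are approximate, not the literal upstream code.
 */
static void set_timeout(struct addr_req *req, unsigned long time)
{
	unsigned long delay;

	delay = time - jiffies;
	if ((long)delay < 0)
		delay = 0;

	/* mod_delayed_work() requeues req->work even if it just ran */
	mod_delayed_work(addr_wq, &req->work, delay);
}

static int netevent_callback(struct notifier_block *self, unsigned long event,
			     void *ctx)
{
	struct addr_req *req;

	if (event == NETEVENT_NEIGH_UPDATE) {
		spin_lock_bh(&lock);
		list_for_each_entry(req, &req_list, list)
			set_timeout(req, jiffies);
		spin_unlock_bh(&lock);
	}
	return 0;
}

With the old code, the cancel in process_one_req() was skipped whenever rdma_addr_cancel() had already unlinked the req, so a work item re-armed this way survived and ran process_one_req() a second time.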
Cc: stable@vger.kernel.org
Fixes: 44e75052bc2a ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence")
Link: https://lore.kernel.org/r/20200930072007.1009692-1-leon@kernel.org
Reported-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/core/addr.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -647,13 +647,12 @@ static void process_one_req(struct work_
 	req->callback = NULL;
 
 	spin_lock_bh(&lock);
+	/*
+	 * Although the work will normally have been canceled by the workqueue,
+	 * it can still be requeued as long as it is on the req_list.
+	 */
+	cancel_delayed_work(&req->work);
 	if (!list_empty(&req->list)) {
-		/*
-		 * Although the work will normally have been canceled by the
-		 * workqueue, it can still be requeued as long as it is on the
-		 * req_list.
-		 */
-		cancel_delayed_work(&req->work);
 		list_del_init(&req->list);
 		kfree(req);
 	}
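For reference, the CPU3 column is the fence introduced by the Fixes: commit: rdma_addr_cancel() takes ownership of the request by unlinking it from req_list, then sync-cancels the work and frees the request, so a second run of process_one_req() dereferences freed memory through a NULL callback. A rough from-memory sketch of that path (approximate, not the quoted upstream code):

/*
 * Sketch: the cancel-side fence the rerun races against.
 * Approximate reconstruction, not the literal upstream code.
 */
void rdma_addr_cancel(struct rdma_dev_addr *addr)
{
	struct addr_req *req, *temp_req;
	struct addr_req *found = NULL;

	spin_lock_bh(&lock);
	list_for_each_entry_safe(req, temp_req, &req_list, list) {
		if (req->addr == addr) {
			/* Unlinking the req transfers ownership to us */
			list_del_init(&req->list);
			found = req;
			break;
		}
	}
	spin_unlock_bh(&lock);

	if (!found)
		return;

	/*
	 * The sync cancel after the unlink is meant to guarantee the work
	 * is not running and will not be started again; the race above
	 * defeated that whenever a set_timeout() slipped in before
	 * process_one_req() reached its (previously conditional) cancel.
	 */
	cancel_delayed_work_sync(&found->work);
	kfree(found);
}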