From: Selvin Xavier <selvin.xavier@broadcom.com>
[ Upstream commit 8be3e5b0c96beeefe9d5486b96575d104d3e7d17 ]
Driver waits indefinitely for the fifo occupancy to go below a threshold as soon as the pacing interrupt is received. This can cause soft lockup on one of the processors, if the rate of DB is very high.
Add a loop count for FPGA and exit __wait_for_fifo_occupancy_below_th if the loop takes too long. Pacing continues until the occupancy is below the threshold. This is ensured by the checks in bnxt_re_pacing_timer_exp and by further scheduling of the pacing work based on the fifo occupancy.
Fixes: 2ad4e6303a6d ("RDMA/bnxt_re: Implement doorbell pacing algorithm")
Link: https://patch.msgid.link/r/1728373302-19530-7-git-send-email-selvin.xavier@b...
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
[ Add the declaration of variable pacing_data to make it work on 6.6.y ]
Signed-off-by: Alva Lan <alvalan9@foxmail.com>
---
 drivers/infiniband/hw/bnxt_re/main.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index c7e51cc2ea26..082a383c4913 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -485,6 +485,8 @@ static void bnxt_re_set_default_pacing_data(struct bnxt_re_dev *rdev)
 static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev)
 {
 	u32 read_val, fifo_occup;
+	struct bnxt_qplib_db_pacing_data *pacing_data = rdev->qplib_res.pacing_data;
+	u32 retry_fifo_check = 1000;
 
 	/* loop shouldn't run infintely as the occupancy usually goes
 	 * below pacing algo threshold as soon as pacing kicks in.
@@ -500,6 +502,14 @@ static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev)
 
 		if (fifo_occup < rdev->qplib_res.pacing_data->pacing_th)
 			break;
+		if (!retry_fifo_check--) {
+			dev_info_once(rdev_to_dev(rdev),
+				      "%s: fifo_occup = 0x%xfifo_max_depth = 0x%x pacing_th = 0x%x\n",
+				      __func__, fifo_occup, pacing_data->fifo_max_depth,
+				      pacing_data->pacing_th);
+			break;
+		}
+
 	}
 }
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 8be3e5b0c96beeefe9d5486b96575d104d3e7d17
WARNING: Author mismatch between patch and upstream commit:
Backport author: alvalan9@foxmail.com
Commit author: Selvin Xavier <selvin.xavier@broadcom.com>

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.6.y | Not found
Note: The patch differs from the upstream commit:
---
1:  8be3e5b0c96be ! 1:  ee8c4af490b25 RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop
    @@ Metadata
      ## Commit message ##
         RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop
     
    +    [ Upstream commit 8be3e5b0c96beeefe9d5486b96575d104d3e7d17 ]
    +
         Driver waits indefinitely for the fifo occupancy to go below a threshold as soon as the pacing interrupt is received. This can cause soft lockup on one of the processors, if the rate of DB is very high.
    @@ Commit message
         Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
         Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
         Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    +    [ Add the declaration of variable pacing_data to make it work on 6.6.y ]
    +    Signed-off-by: Alva Lan <alvalan9@foxmail.com>
     
      ## drivers/infiniband/hw/bnxt_re/main.c ##
    -@@ drivers/infiniband/hw/bnxt_re/main.c: static bool is_dbr_fifo_full(struct bnxt_re_dev *rdev)
    +@@ drivers/infiniband/hw/bnxt_re/main.c: static void bnxt_re_set_default_pacing_data(struct bnxt_re_dev *rdev)
      static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev)
      {
    - 	struct bnxt_qplib_db_pacing_data *pacing_data = rdev->qplib_res.pacing_data;
    + 	u32 read_val, fifo_occup;
    ++	struct bnxt_qplib_db_pacing_data *pacing_data = rdev->qplib_res.pacing_data;
     +	u32 retry_fifo_check = 1000;
    - 	u32 fifo_occup;
     
      	/* loop shouldn't run infintely as the occupancy usually goes
    + 	 * below pacing algo threshold as soon as pacing kicks in.
    @@ drivers/infiniband/hw/bnxt_re/main.c: static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev)
     
    - 		if (fifo_occup < pacing_data->pacing_th)
    + 		if (fifo_occup < rdev->qplib_res.pacing_data->pacing_th)
      			break;
     +		if (!retry_fifo_check--) {
     +			dev_info_once(rdev_to_dev(rdev),
---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.6.y        | Success     | Success    |
linux-stable-mirror@lists.linaro.org