From: Justin Tee <justin.tee@broadcom.com>
[ Upstream commit b5bf6d681fce69cd1a57bfc0f1bdbbb348035117 ]
The kref for Fabric_DID ndlps is not decremented after repeated FDISC failures and exhausting maximum allowed retries. This can leave the ndlp lingering unnecessarily. Add a test and set bit operation for the NLP_DROPPED flag. If not previously set, then a kref is decremented. The ndlp is freed when the remaining reference for the completing ELS is put.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-6-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES

- In the failure branch of `lpfc_cmpl_els_fdisc()` the driver used to log the exhausted retry and drop straight to `fdisc_failed`, leaving the fabric `ndlp`’s initial kref outstanding; only the completion-held reference is released later at `out:` (`drivers/scsi/lpfc/lpfc_els.c:11252-11271`).
- The new `test_and_set_bit(NLP_DROPPED, …)` + `lpfc_nlp_put(ndlp)` sequence (`drivers/scsi/lpfc/lpfc_els.c:11267-11269`) mirrors the established pattern for retiring nodes safely once that initial reference is no longer needed (`drivers/scsi/lpfc/lpfc_hbadisc.c:4949-4954`, with the meaning of `NLP_DROPPED` defined in `drivers/scsi/lpfc/lpfc_disc.h:197`).
- Without this drop, every fabric FDISC failure that exhausts retries leaks the `ndlp`, keeping discovery objects and their resources pinned; that is a real bug that can accumulate across repeated fabric login failures.
- The fix is small, localized to the terminal failure path, and guarded by the bit test so it cannot double-drop an already-released node, which keeps regression risk low.
- The affected logic exists unchanged in stable kernels, so backporting would directly eliminate the leak there without pulling in broader dependencies.
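The "drop the initial reference at most once" pattern described above can be modeled in user space. The following is only a rough sketch using C11 atomics, not the driver's code: `struct node`, `node_init()`, `node_get()`, `node_put()` and `node_drop_initial()` are hypothetical stand-ins for the ndlp, its kref, and the `NLP_DROPPED` bit, and the `freed` field merely marks the point where a real release function would run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted node; NOT struct lpfc_nodelist. */
struct node {
	atomic_int refcount;	/* models the kref */
	atomic_bool dropped;	/* models the NLP_DROPPED flag bit */
	bool freed;		/* set where a real release callback would run */
};

/* A new node starts with one "initial" reference. */
static void node_init(struct node *n)
{
	atomic_init(&n->refcount, 1);
	atomic_init(&n->dropped, false);
	n->freed = false;
}

/* Take an extra reference, e.g. for an outstanding request. */
static void node_get(struct node *n)
{
	atomic_fetch_add(&n->refcount, 1);
}

/* Release one reference; the last put marks the node as freed. */
static void node_put(struct node *n)
{
	int old = atomic_fetch_sub(&n->refcount, 1);

	assert(old > 0);	/* catch an unbalanced put */
	if (old == 1)
		n->freed = true;
}

/*
 * Drop the initial reference at most once. atomic_exchange() returns
 * the previous flag value, so only the first caller sees "false" and
 * performs the put; later callers do nothing.
 * Returns true if this call dropped the reference.
 */
static bool node_drop_initial(struct node *n)
{
	if (!atomic_exchange(&n->dropped, true)) {
		node_put(n);
		return true;
	}
	return false;
}
```

In this model, a node holding its initial reference plus one reference for an in-flight request ends up with a single reference after `node_drop_initial()`, a second call to `node_drop_initial()` changes nothing, and the final `node_put()` for the completing request is what frees the node.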
 drivers/scsi/lpfc/lpfc_els.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index fca81e0c7c2e1..4c405bade4f34 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -11259,6 +11259,11 @@ lpfc_cmpl_els_fdisc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 		lpfc_vlog_msg(vport, KERN_WARNING, LOG_ELS,
 			      "0126 FDISC cmpl status: x%x/x%x)\n",
 			      ulp_status, ulp_word4);
+
+		/* drop initial reference */
+		if (!test_and_set_bit(NLP_DROPPED, &ndlp->nlp_flag))
+			lpfc_nlp_put(ndlp);
+
 		goto fdisc_failed;
 	}