On Thu, Apr 15, 2021 at 10:51:42PM +0300, Anastasia Kovaleva wrote:
Hi Greg and Sasha,
QLogic FC adapters don’t work in P2P mode on the latest stable 5.4 (at least QLE2692, QLE2694, and QLE2742 are affected).
We’ve tested and bisected from 5.4 up to 5.4.112 and figured out the following:
- From 5.4 to 5.4.5 inclusive, direct mode doesn’t work at all.
Stable commit a82545b62e07 ("scsi: qla2xxx: Change discovery state before PLOGI") fixes the issue in 5.4.6
- From 5.4.6 to 5.4.68 inclusive, direct mode works, but the FC link cannot be recovered after issue_lip and all presented LUNs are lost.
The broken issue_lip is a result of a82545b62e07 being applied to stable without the upstream commit 65e920093805 ("scsi: qla2xxx: Fix device connect issues in P2P configuration.").
- From 5.4.69 up to now (5.4.112), direct mode doesn’t work again.
The issue was introduced by stable commit 74924e407bf7 ("scsi: qla2xxx: Retry PLOGI on FC-NVMe PRLI failure").
Upstream commit 84ed362ac40c ("scsi: qla2xxx: Dual FCP-NVMe target port support") fixes the issue.
So, in order to fix both P2P issues we need to apply upstream commits 65e920093805 and 84ed362ac40c.
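For quick reference, the bisection results above can be restated as a small lookup. This is only a sketch of the ranges reported in this mail: the function name and status strings are made up for illustration, and patch levels outside the tested range 5.4 to 5.4.112 are deliberately left unclassified.

```python
# Patch-level ranges within the 5.4 stable series, as reported in this thread.
# The names and status strings below are illustrative, not from any real tool.
P2P_STATUS_BY_RANGE = [
    ((0, 5), "P2P broken"),
    ((6, 68), "P2P works, issue_lip recovery broken"),
    ((69, 112), "P2P broken"),
]


def p2p_status(patch_level: int) -> str:
    """Return the reported P2P status for kernel 5.4.<patch_level>."""
    for (first, last), status in P2P_STATUS_BY_RANGE:
        if first <= patch_level <= last:
            return status
    # Nothing beyond 5.4.112 was tested in the report.
    return "not covered by the report"


if __name__ == "__main__":
    for level in (5, 6, 68, 69, 112):
        print(f"5.4.{level}: {p2p_status(level)}")
```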
That's a great analysis, thank you.
However, stable commits 0b84591fdd5e ("scsi: qla2xxx: Fix stuck login session using prli_pend_timer"), introduced in 5.4.19, and 74924e407bf7 ("scsi: qla2xxx: Retry PLOGI on FC-NVMe PRLI failure"), introduced in 5.4.69, were applied out of order relative to the required upstream fixes, both in upstream order and chronologically, so the fixes cannot be cherry-picked without a merge conflict.
Right, in particular, 74924e407bf7 ("scsi: qla2xxx: Retry PLOGI on FC-NVMe PRLI failure") was modified to work around the missing 84ed362ac40c ("scsi: qla2xxx: Dual FCP-NVMe target port support"), which is where the rest of the conflicts come from.
The series provides merge conflict resolution and resolves both the P2P discovery and issue_lip issues. It has been tested on Linux stable 5.4.112 and Ubuntu 20.04 kernel 5.4.0-71.79 (which is based on stable 5.4.101).
Please apply at your earliest convenience to 5.4 stable.
So instead of applying even more modified patches that would create similar issues in the future, I backed out 0b84591fdd5e and 74924e407bf7, and applied the 4 commits you've pointed out in the "correct" order. I also grabbed 27258a577144 ("scsi: qla2xxx: Add a shadow variable to hold disc_state history of fcport") for completeness.
Thanks for diagnosing this issue! Please let me know if something is still broken.