From: Justin Tee <justin.tee(a)broadcom.com>
[ Upstream commit 4ddf01f2f1504fa08b766e8cfeec558e9f8eef6c ]
There are cases after NPIV deletion where the fabric switch still believes
the NPIV is logged into the fabric. This occurs when a vport is
unregistered before the Remove All DA_ID CT and LOGO ELS are sent to the
fabric.
Currently fc_remove_host(), which calls dev_loss_tmo for all D_IDs including
the fabric D_ID, removes the last ndlp reference and frees the ndlp rport
object. This sometimes causes the race condition where the final DA_ID and
LOGO are skipped from being sent to the fabric switch.
Fix by moving the fc_remove_host() and scsi_remove_host() calls after DA_ID
and LOGO are sent.
Signed-off-by: Justin Tee <justin.tee(a)broadcom.com>
Link: https://lore.kernel.org/r/20240305200503.57317-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/scsi/lpfc/lpfc_vport.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_vport.c b/drivers/scsi/lpfc/lpfc_vport.c
index 4d171f5c213f7..6b4259894584f 100644
--- a/drivers/scsi/lpfc/lpfc_vport.c
+++ b/drivers/scsi/lpfc/lpfc_vport.c
@@ -693,10 +693,6 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
lpfc_free_sysfs_attr(vport);
lpfc_debugfs_terminate(vport);
- /* Remove FC host to break driver binding. */
- fc_remove_host(shost);
- scsi_remove_host(shost);
-
/* Send the DA_ID and Fabric LOGO to cleanup Nameserver entries. */
ndlp = lpfc_findnode_did(vport, Fabric_DID);
if (!ndlp)
@@ -740,6 +736,10 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
skip_logo:
+ /* Remove FC host to break driver binding. */
+ fc_remove_host(shost);
+ scsi_remove_host(shost);
+
lpfc_cleanup(vport);
/* Remove scsi host now. The nodes are cleaned up. */
--
2.43.0
From: Rohit Ner <rohitner(a)google.com>
[ Upstream commit 767712f91de76abd22a45184e6e3440120b8bfce ]
As per JEDEC Standard No. 223E Section 5.9.2, the max # active commands
value programmed by the host sw in MCQConfig.MAC should be one less than
the actual value.
Signed-off-by: Rohit Ner <rohitner(a)google.com>
Link: https://lore.kernel.org/r/20240220095637.2900067-1-rohitner@google.com
Reviewed-by: Peter Wang <peter.wang(a)mediatek.com>
Reviewed-by: Can Guo <quic_cang(a)quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/ufs/core/ufs-mcq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index 0787456c2b892..c873fd8239427 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -94,7 +94,7 @@ void ufshcd_mcq_config_mac(struct ufs_hba *hba, u32 max_active_cmds)
val = ufshcd_readl(hba, REG_UFS_MCQ_CFG);
val &= ~MCQ_CFG_MAC_MASK;
- val |= FIELD_PREP(MCQ_CFG_MAC_MASK, max_active_cmds);
+ val |= FIELD_PREP(MCQ_CFG_MAC_MASK, max_active_cmds - 1);
ufshcd_writel(hba, val, REG_UFS_MCQ_CFG);
}
EXPORT_SYMBOL_GPL(ufshcd_mcq_config_mac);
--
2.43.0
From: Rohit Ner <rohitner(a)google.com>
[ Upstream commit 767712f91de76abd22a45184e6e3440120b8bfce ]
As per JEDEC Standard No. 223E Section 5.9.2, the max # active commands
value programmed by the host sw in MCQConfig.MAC should be one less than
the actual value.
Signed-off-by: Rohit Ner <rohitner(a)google.com>
Link: https://lore.kernel.org/r/20240220095637.2900067-1-rohitner@google.com
Reviewed-by: Peter Wang <peter.wang(a)mediatek.com>
Reviewed-by: Can Guo <quic_cang(a)quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/ufs/core/ufs-mcq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index 0787456c2b892..c873fd8239427 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -94,7 +94,7 @@ void ufshcd_mcq_config_mac(struct ufs_hba *hba, u32 max_active_cmds)
val = ufshcd_readl(hba, REG_UFS_MCQ_CFG);
val &= ~MCQ_CFG_MAC_MASK;
- val |= FIELD_PREP(MCQ_CFG_MAC_MASK, max_active_cmds);
+ val |= FIELD_PREP(MCQ_CFG_MAC_MASK, max_active_cmds - 1);
ufshcd_writel(hba, val, REG_UFS_MCQ_CFG);
}
EXPORT_SYMBOL_GPL(ufshcd_mcq_config_mac);
--
2.43.0
I'm announcing the release of the 6.8.4 kernel.
All users of the 6.8 kernel series must upgrade.
The updated 6.8.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.8.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
include/linux/workqueue.h | 35 --
kernel/workqueue.c | 757 +++++++---------------------------------------
3 files changed, 131 insertions(+), 663 deletions(-)
Greg Kroah-Hartman (12):
Revert "workqueue: Shorten events_freezable_power_efficient name"
Revert "workqueue: Don't call cpumask_test_cpu() with -1 CPU in wq_update_node_max_active()"
Revert "workqueue: Implement system-wide nr_active enforcement for unbound workqueues"
Revert "workqueue: Introduce struct wq_node_nr_active"
Revert "workqueue: RCU protect wq->dfl_pwq and implement accessors for it"
Revert "workqueue: Make wq_adjust_max_active() round-robin pwqs while activating"
Revert "workqueue: Move nr_active handling into helpers"
Revert "workqueue: Replace pwq_activate_inactive_work() with [__]pwq_activate_work()"
Revert "workqueue: Factor out pwq_is_empty()"
Revert "workqueue: Move pwq->max_active to wq->max_active"
Revert "workqueue.c: Increase workqueue name length"
Linux 6.8.4
[cc stable to see if they have any ideas about fixing this]
On Sat, 2024-04-06 at 12:16 -0400, John David Anglin wrote:
> On 2024-04-06 11:06 a.m., James Bottomley wrote:
> > On Sat, 2024-04-06 at 10:30 -0400, John David Anglin wrote:
> > > On 2024-04-05 3:36 p.m., Bart Van Assche wrote:
> > > > On 4/4/24 13:07, John David Anglin wrote:
> > > > > On 2024-04-04 12:32 p.m., Bart Van Assche wrote:
> > > > > > Can you please help with verifying whether this kernel warn
> > > > > > ing is only triggered by the 6.1 stable kernel series or
> > > > > > whether it is also
> > > > > > triggered by a vanilla kernel, e.g. kernel v6.8? That will
> > > > > > tell us whether we
> > > > > > need to review the upstream changes or the backp
> > > > > > orts on the v6.1 branch.
> > > > > Stable kernel v6.8.3 is okay.
> > > > Would it be possible to bisect this issue on the linux-6.1.y
> > > > branch? That probably will be faster than reviewing all
> > > > backports
> > > > of SCSI patches on that branch.
> > > The warning triggers with v6.1.81. It doesn't trigger with
> > > v6.1.80.
> > It's this patch:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
> >
> > The specific problem being that the update to scsi_execute doesn't
> > set the sense_len that the WARN_ON is checking.
> >
> > This isn't a problem in mainline because we've converted all uses
> > of scsi_execute. Stable needs to either complete the conversion or
> > back out the inital patch. This change depends on the above change:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
>
> Thus, more than just the initial patch needs to be backed out.
OK, so the reason the bad patch got pulled in is because it's a
precursor of this fixes tagged backport:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
Which is presumably the other patch you had to back out to fix the
issue.
The problem is that Mike's series updating and then removing
scsi_execute() went into the tree as one series, so no-one notice the
first patch had this bug because the buggy routine got removed at the
end of the series. This also means there's nothing to fix and backport
in upstream.
The bug is also more widely spread than simply domain validation,
because every use of scsi_execute in the current stable tree will trip
this.
I'm not sure what the best fix is. I can certainly come up with a one
line fix for stable adding the missing length in the #define, but it
can't come from upstream as stated above. We could back the two
patches out then do a stable specific fix for the UAS problem (I don't
think we can leave the UAS patch backed out because the problem was
pretty serious).
What does stable want to do?
James