A general protection fault kernel panic is observed when the user performs a soft (ordered) HBA unplug operation while IOs are running on drives connected to the HBA.
When the user performs an ordered HBA removal, the kernel calls the PCI device's .remove() callback, where the driver flushes all outstanding SCSI IO commands with the DID_NO_CONNECT host byte and also unmaps the sg buffers allocated for these IO commands. But in the ordered HBA removal case (unlike a real HBA hot unplug) the HBA device is still alive, so the HBA hardware keeps performing DMA operations to those buffers in system memory, which were already unmapped while flushing the outstanding SCSI IO commands, and this leads to the kernel panic.
This bug was introduced by commit c666d3be99c000bb889a33353e9be0fa5808d3de ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload").
Fix: Don't flush the outstanding IOs from the .remove() path in the ordered HBA removal case, since the HBA is still alive and can complete the outstanding IOs itself. Flush the outstanding IOs only in the case of a physical HBA hot unplug, where there won't be any communication with the HBA.
During shutdown it is also possible for the HBA hardware to perform DMA operations on the outstanding IO buffers that the driver has already completed with DID_NO_CONNECT from .shutdown(). So the same fix is applied in the shutdown path as well.
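The rule described above can be sketched as a small user-space C model. This is only an illustration under stated assumptions: struct fake_hba, flush_running_cmds() and remove_or_shutdown() are hypothetical names, the present field stands in for the result of pci_device_is_present(pdev), and no real PCI or SCSI API is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for adapter state; not the driver's real structure. */
struct fake_hba {
	bool present;     /* models the result of pci_device_is_present() */
	int outstanding;  /* IOs whose buffers the hardware may still DMA into */
	int flushed;      /* IOs the driver failed itself (DID_NO_CONNECT) */
};

/* Models the flush: the driver completes every outstanding IO on its own. */
static void flush_running_cmds(struct fake_hba *hba)
{
	assert(hba != NULL);
	hba->flushed += hba->outstanding;
	hba->outstanding = 0;
}

/*
 * Models the decision taken in the remove and shutdown paths after the fix:
 * flush only when the device is really gone. Returns true if a flush ran.
 * When the device is still present, the outstanding IOs are left alone so
 * that the hardware can complete them.
 */
static bool remove_or_shutdown(struct fake_hba *hba)
{
	if (!hba->present) {
		flush_running_cmds(hba);
		return true;
	}
	return false;
}
```

With present set to true (ordered removal), remove_or_shutdown() leaves the outstanding count untouched; with present set to false (physical hot unplug), every outstanding IO is counted as flushed.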
Cc: stable@vger.kernel.org
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
---
v1: Update the patch description.

 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 778d5e6..04a40af 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -9908,8 +9908,8 @@ static void scsih_remove(struct pci_dev *pdev)
 	ioc->remove_host = 1;
-	mpt3sas_wait_for_commands_to_complete(ioc);
-	_scsih_flush_running_cmds(ioc);
+	if (!pci_device_is_present(pdev))
+		_scsih_flush_running_cmds(ioc);
 	_scsih_fw_event_cleanup_queue(ioc);
@@ -9992,8 +9992,8 @@ static void scsih_remove(struct pci_dev *pdev)
 	ioc->remove_host = 1;
-	mpt3sas_wait_for_commands_to_complete(ioc);
-	_scsih_flush_running_cmds(ioc);
+	if (!pci_device_is_present(pdev))
+		_scsih_flush_running_cmds(ioc);
 	_scsih_fw_event_cleanup_queue(ioc);
Hi
[This is an automated email]
This commit has been processed because it contains a -stable tag. The stable tag indicates that it's relevant for the following trees: all
The bot has tested the following trees: v5.5.9, v5.4.25, v4.19.109, v4.14.173, v4.9.216, v4.4.216.
v5.5.9: Build OK!
v5.4.25: Build OK!
v4.19.109: Build OK!
v4.14.173: Build OK!
v4.9.216: Failed to apply! Possible dependencies:
    c666d3be99c0 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
v4.4.216: Failed to apply! Possible dependencies:
    96902835e7e2 ("mpt3sas: Eliminate conditional locking in mpt3sas_scsih_issue_tm()")
    98c56ad32c33 ("mpt3sas: Eliminate dead sleep_flag code")
    c666d3be99c0 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
NOTE: The patch will not be queued to stable trees until it is upstream.
How should we proceed with this patch?
On Wed, Mar 18, 2020 at 4:00 AM Sasha Levin <sashal@kernel.org> wrote:
> Hi
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag. The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v5.5.9, v5.4.25, v4.19.109, v4.14.173, v4.9.216, v4.4.216.
>
> v5.5.9: Build OK!
> v5.4.25: Build OK!
> v4.19.109: Build OK!
> v4.14.173: Build OK!
> v4.9.216: Failed to apply! Possible dependencies:
>     c666d3be99c0 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
> v4.4.216: Failed to apply! Possible dependencies:
>     96902835e7e2 ("mpt3sas: Eliminate conditional locking in mpt3sas_scsih_issue_tm()")
>     98c56ad32c33 ("mpt3sas: Eliminate dead sleep_flag code")
>     c666d3be99c0 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
This fix is applicable only to the following stable kernels: v5.5.9, v5.4.25, v4.19.109, v4.14.173.
Please let me know if I need to resend this patch with the list of stable kernels to which it applies.
Thanks,
Sreekanth
--
Thanks,
Sasha