-----Original Message-----
From: Jason Gunthorpe jgg@ziepe.ca
Sent: Tuesday, May 19, 2020 8:27 PM
To: Dalessandro, Dennis dennis.dalessandro@intel.com
Cc: dledford@redhat.com; linux-rdma@vger.kernel.org; Marciniszyn, Mike mike.marciniszyn@intel.com; stable@vger.kernel.org; Wan, Kaike kaike.wan@intel.com
Subject: Re: [PATCH for-rc or next 1/3] IB/hfi1: Do not destroy hfi1_wq when the device is shut down
On Mon, May 11, 2020 at 11:13:15PM -0400, Dennis Dalessandro wrote:
From: Kaike Wan kaike.wan@intel.com
The workqueue hfi1_wq is destroyed in function shutdown_device(), which is called by either shutdown_one() or remove_one(). The function shutdown_one() is called when the kernel is rebooted while remove_one() is called when the hfi1 driver is unloaded. When the kernel is rebooted, hfi1_wq is destroyed while all qps are still active, leading to a kernel crash:
AFAIK the purpose of shutdown is to stop all in-progress DMAs.
If devices are wildly doing DMA during the shutdown process then all manner of things can fail, including kexecing into another kernel.
Do you achieve that with these shutdown handlers?
We did try to shut down the hardware and will address some software issues in the next revision.
It does make sense that the work queue would not be destroyed in shutdown, but I'm surprised it doesn't flush it?
Will do.
Kaike
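
For readers following the thread, the flush-versus-destroy distinction raised above can be sketched with a toy, single-threaded userspace model. This is not the hfi1 code and not the kernel workqueue API: struct toy_wq and every toy_* function are hypothetical names made up for illustration, standing in loosely for flush_workqueue() and destroy_workqueue(). The assumption being shown is only the shape of the idea: the reboot path runs pending work but leaves the queue usable, while the unload path also tears it down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a workqueue; NOT the kernel API. */
#define TOY_WQ_MAX 8

typedef void (*toy_work_fn)(void *arg);

struct toy_wq {
    toy_work_fn fn[TOY_WQ_MAX];
    void *arg[TOY_WQ_MAX];
    size_t pending;   /* number of queued, not yet executed items */
    bool alive;       /* false once the queue has been destroyed */
};

/* Queue one item; fails if the queue is destroyed or full. */
static bool toy_queue_work(struct toy_wq *wq, toy_work_fn fn, void *arg)
{
    if (!wq->alive || wq->pending == TOY_WQ_MAX)
        return false;
    wq->fn[wq->pending] = fn;
    wq->arg[wq->pending] = arg;
    wq->pending++;
    return true;
}

/* Run everything that is pending; the queue stays usable afterwards. */
static void toy_flush(struct toy_wq *wq)
{
    for (size_t i = 0; i < wq->pending; i++)
        wq->fn[i](wq->arg[i]);
    wq->pending = 0;
}

/* Run what is pending, then mark the queue unusable. */
static void toy_destroy(struct toy_wq *wq)
{
    toy_flush(wq);
    wq->alive = false;
}

/* Reboot path (assumed shape): flush only, because users of the
 * queue may still be active and may queue more work. */
static void toy_shutdown_one(struct toy_wq *wq)
{
    toy_flush(wq);
}

/* Driver-unload path (assumed shape): nothing may use the queue
 * afterwards, so it is safe to destroy it here. */
static void toy_remove_one(struct toy_wq *wq)
{
    toy_destroy(wq);
}

/* Example work item: increments the int it is given. */
static void count_work(void *arg)
{
    ++*(int *)arg;
}
```

In this model, queueing work after toy_shutdown_one() still succeeds, whereas queueing after toy_remove_one() is rejected. That is a deliberately simplified picture of why destroying the queue while qps are still active is a problem and flushing it is not.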