While mapping DMA for the scatter list when a SCSI command is queued, the existing call to dma_alloc_coherent() in our map_sg_data() function passes zero for the gfp_flags parameter. We are most definitely in atomic context at this point, as queue_command() is called in softirq context and, further, we are holding the SCSI host lock, which is a spinlock.

Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any sleeping-in-atomic-context deadlock.

Fixes: 4dddbc26c389 ("[SCSI] ibmvscsi: handle large scatter/gather lists")
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 1135e74..cb8535e 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -731,7 +731,7 @@ static int map_sg_data(struct scsi_cmnd *cmd,
 		evt_struct->ext_list = (struct srp_direct_buf *)
 			dma_alloc_coherent(dev,
 					   SG_ALL * sizeof(struct srp_direct_buf),
-					   &evt_struct->ext_list_token, 0);
+					   &evt_struct->ext_list_token, GFP_ATOMIC);
 		if (!evt_struct->ext_list) {
 			if (!firmware_has_feature(FW_FEATURE_CMO))
 				sdev_printk(KERN_ERR, cmd->device,

On 01/09/2019 08:58 PM, Tyrel Datwyler wrote:
> While mapping DMA for the scatter list when a SCSI command is queued, the existing call to dma_alloc_coherent() in our map_sg_data() function passes zero for the gfp_flags parameter. We are most definitely in atomic context at this point, as queue_command() is called in softirq context and, further, we are holding the SCSI host lock, which is a spinlock.
>
> Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any sleeping-in-atomic-context deadlock.
>
> Fixes: 4dddbc26c389 ("[SCSI] ibmvscsi: handle large scatter/gather lists")
> Cc: stable@vger.kernel.org
> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> ---
>  drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index 1135e74..cb8535e 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -731,7 +731,7 @@ static int map_sg_data(struct scsi_cmnd *cmd,
>  		evt_struct->ext_list = (struct srp_direct_buf *)
>  			dma_alloc_coherent(dev,
>  					   SG_ALL * sizeof(struct srp_direct_buf),
> -					   &evt_struct->ext_list_token, 0);
> +					   &evt_struct->ext_list_token, GFP_ATOMIC);
>  		if (!evt_struct->ext_list) {
>  			if (!firmware_has_feature(FW_FEATURE_CMO))
>  				sdev_printk(KERN_ERR, cmd->device,

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

On Wed, Jan 09, 2019 at 06:58:56PM -0800, Tyrel Datwyler wrote:
> While mapping DMA for the scatter list when a SCSI command is queued, the existing call to dma_alloc_coherent() in our map_sg_data() function passes zero for the gfp_flags parameter. We are most definitely in atomic context at this point, as queue_command() is called in softirq context and, further, we are holding the SCSI host lock, which is a spinlock.
>
> Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any sleeping-in-atomic-context deadlock.

This is a pretty clear sign you should not be using dma_alloc_coherent to start with. GFP_ATOMIC support in many of the implementations either doesn't work at all or is severely constrained. Given that the descriptor is written by the OS and read by the hardware exactly once, there is no point in having the coherent mapping to start with.

On 01/10/2019 07:07 AM, Christoph Hellwig wrote:
> On Wed, Jan 09, 2019 at 06:58:56PM -0800, Tyrel Datwyler wrote:
>> While mapping DMA for the scatter list when a SCSI command is queued, the existing call to dma_alloc_coherent() in our map_sg_data() function passes zero for the gfp_flags parameter. We are most definitely in atomic context at this point, as queue_command() is called in softirq context and, further, we are holding the SCSI host lock, which is a spinlock.
>>
>> Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any sleeping-in-atomic-context deadlock.
>
> This is a pretty clear sign you should not be using dma_alloc_coherent to start with. GFP_ATOMIC support in many of the implementations either doesn't work at all or is severely constrained. Given that the descriptor is written by the OS and read by the hardware exactly once, there is no point in having the coherent mapping to start with.

This allocation isn't a single-use allocation. The driver is just lazy about allocating our ext_list area for large SG lists (i.e. SG_ALL). When the driver was first written it only supported up to 10 indirect SRP buffers. James Bottomley added the large SG support back in 2005 with the commit referenced here in the Fixes tag, 4dddbc26c389. We only allocate the ext_list when we come across an SG list requiring more than 10 indirect buffers. Once allocated, we reuse it.

-Tyrel
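
The lazy allocation described above can be modelled in user space. This is only a sketch under stated assumptions, not the driver's code: `struct event`, `get_desc_list()`, `INLINE_DESCS` and `MAX_DESCS` are hypothetical names, `MAX_DESCS` is an illustrative stand-in for SG_ALL, and calloc() stands in for the dma_alloc_coherent() call that the driver makes in atomic context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from ibmvscsi.c. */
#define INLINE_DESCS 10   /* descriptors that fit without an extended list */
#define MAX_DESCS    128  /* illustrative stand-in for SG_ALL */

struct desc {
	unsigned long addr;
	unsigned int len;
};

struct event {
	struct desc inline_list[INLINE_DESCS];
	struct desc *ext_list;  /* NULL until a large SG list is first seen */
	int ext_allocs;         /* counts allocations, for illustration only */
};

/*
 * Return the descriptor array to fill for an SG list of nseg entries,
 * or NULL if nseg is out of range or the allocation fails.
 */
struct desc *get_desc_list(struct event *evt, int nseg)
{
	assert(evt != NULL);

	if (nseg <= 0 || nseg > MAX_DESCS)
		return NULL;

	/* Small lists never touch the extended area. */
	if (nseg <= INLINE_DESCS)
		return evt->inline_list;

	/* First large list: allocate once. Later large lists reuse it. */
	if (!evt->ext_list) {
		/* Model only: the driver allocates DMA-coherent memory here. */
		evt->ext_list = calloc(MAX_DESCS, sizeof(*evt->ext_list));
		if (!evt->ext_list)
			return NULL;
		evt->ext_allocs++;
	}
	return evt->ext_list;
}
```

In this model a caller that first passes 4 segments and then 64 and 100 triggers exactly one allocation, on the 64-segment call. That one-time allocation on the queueing path is the point where the real driver needs a non-sleeping allocation.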
On Thu, Jan 10, 2019 at 12:11:53PM -0800, Tyrel Datwyler wrote:
> This allocation isn't a single-use allocation. The driver is just lazy about allocating our ext_list area for large SG lists (i.e. SG_ALL). When the driver was first written it only supported up to 10 indirect SRP buffers. James Bottomley added the large SG support back in 2005 with the commit referenced here in the Fixes tag, 4dddbc26c389. We only allocate the ext_list when we come across an SG list requiring more than 10 indirect buffers. Once allocated, we reuse it.

I think the right fix is to just allocate the buffer for the ext_list as part of the scsi command using the .cmd_size field in the host template, and then dma map it in queuecommand and unmap it on completion.
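
The shape of that suggestion can be modelled in user space. This is a hedged sketch, not kernel code: `struct command`, `struct cmd_priv`, `alloc_command()`, `command_priv()`, `queue_command()` and `complete_command()` are hypothetical names, and the `mapped` flag merely marks where a real driver would call dma_map_single() and dma_unmap_single(). The private area is allocated together with the command, much as the SCSI midlayer reserves .cmd_size bytes behind each command, so the queueing path performs no allocation at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_DESCS 128  /* illustrative stand-in for SG_ALL */

struct desc {
	unsigned long addr;
	unsigned int len;
};

/* Per-command private data, analogous to what .cmd_size would reserve. */
struct cmd_priv {
	struct desc ext_list[MAX_DESCS];
	int nseg;
	int mapped;  /* model of "currently DMA mapped" */
};

/* Hypothetical command; the private area follows it in one allocation. */
struct command {
	int tag;
	max_align_t priv[];  /* keeps the private area suitably aligned */
};

/* Done once, in a context that may sleep, not on the queueing path. */
struct command *alloc_command(size_t priv_size)
{
	return calloc(1, sizeof(struct command) + priv_size);
}

void *command_priv(struct command *cmd)
{
	return cmd->priv;
}

/* Queueing path: fill and "map" the preallocated list, no allocation. */
int queue_command(struct command *cmd, int nseg)
{
	struct cmd_priv *priv = command_priv(cmd);

	if (nseg <= 0 || nseg > MAX_DESCS)
		return -1;

	for (int i = 0; i < nseg; i++) {
		priv->ext_list[i].addr = 0x1000UL * (unsigned long)(i + 1);
		priv->ext_list[i].len = 512;
	}
	priv->nseg = nseg;
	priv->mapped = 1;  /* a driver would dma_map_single() the list here */
	return 0;
}

/* Completion path: "unmap" the list; the memory stays with the command. */
void complete_command(struct command *cmd)
{
	struct cmd_priv *priv = command_priv(cmd);

	assert(priv->mapped);
	priv->mapped = 0;  /* a driver would dma_unmap_single() here */
}
```

Because the buffer lives for as long as the command does, each use is a map on queue and an unmap on completion, which is the streaming pattern rather than a coherent allocation.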
On 01/10/2019 07:07 AM, Christoph Hellwig wrote:
> On Wed, Jan 09, 2019 at 06:58:56PM -0800, Tyrel Datwyler wrote:
>> While mapping DMA for the scatter list when a SCSI command is queued, the existing call to dma_alloc_coherent() in our map_sg_data() function passes zero for the gfp_flags parameter. We are most definitely in atomic context at this point, as queue_command() is called in softirq context and, further, we are holding the SCSI host lock, which is a spinlock.
>>
>> Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any sleeping-in-atomic-context deadlock.
>
> This is a pretty clear sign you should not be using dma_alloc_coherent to start with. GFP_ATOMIC support in many of the implementations either doesn't work at all or is severely constrained.

On a secondary note, I was unaware of the GFP_ATOMIC limitations. Should this be added to the documentation somewhere? I don't see any mention of it here, from DMA-API-HOWTO.txt:

Using Consistent DMA mappings
=============================

To allocate and map large (PAGE_SIZE or so) consistent DMA regions, you should do::

	dma_addr_t dma_handle;

	cpu_addr = dma_alloc_coherent(dev, size, &dma_handle, gfp);

where device is a ``struct device *``. This may be called in interrupt context with the GFP_ATOMIC flag.

-Tyrel

> Given that the descriptor is written by the OS and read by the hardware exactly once there is no point in having the coherent mapping to start with.

On Thu, Jan 10, 2019 at 03:15:35PM -0800, Tyrel Datwyler wrote:
> On a secondary note, I was unaware of the GFP_ATOMIC limitations. Should this be added to the documentation somewhere? I don't see any mention of it here, from DMA-API-HOWTO.txt:

The DMA documentation unfortunately doesn't seem very good. It's been on my todo list to eventually update it, but I'm still discovering various warts.

GFP_ATOMIC allocations generally work fine on DMA-coherent architectures, but tend to cause problems on a lot of non-coherent ones, with the notable exceptions of arm and arm64, which go to great lengths to introduce special pools for them. But that code is rarely exercised, so I found various bugs, e.g. in the arm64 iommu code for this case.

But more importantly, there really should be no need for the coherent allocation from irq context - we only need coherent for descriptors that don't have clear ownership, and anything allocated in the I/O path generally has that.