Re: Intermittent storage (dm-crypt?) freeze - regression 6.4->6.5

2 Nov 2023


On Wed, Nov 01, 2023 at 07:23:05PM +0800, Ming Lei wrote:
...
On Wed, Nov 01, 2023 at 11:15:02AM +0100, Hannes Reinecke wrote:
...
...
nvme_queue_rq() on the above request.
And that is something I've been wondering (for quite some time now):
What _is_ the appropriate error handling for -ENOMEM?
It is just my guess.
Actually it shouldn't fail since the sgl allocation is backed with
memory pool, but there is also dma pool allocation and dma mapping.
...
At this time, we assume it to be a retryable error and re-run the queue
in the hope that things will sort themselves out.
It should not be hard to figure out why nvme_queue_rq() can't move on.
There are only a few reasons nvme_queue_rq() would return BLK_STS_RESOURCE
for a typical read/write command:

  DMA mapping error
  Can't allocate SGL from mempool
  Can't allocate PRP from dma_pool
  Controller stuck in resetting state

We should always be able to get at least one allocation from the memory
pools, so I think the only cases where the driver has no way to
guarantee eventual forward progress are the DMA mapping error
conditions. Is there some other limit that the driver needs to consider
when configuring its largest supported transfers?
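
To make the forward-progress argument above concrete, here is a minimal
userspace sketch. It is not kernel code: every toy_* name is hypothetical,
the order of the checks is arbitrary, and it only models two of the causes
listed above (pool exhaustion and a DMA mapping failure). It assumes that a
completing request returns its element to the pool, which is what lets a
re-run of the queue eventually succeed, while a mapping failure depends on
a resource the driver does not own.

```c
#include <stdbool.h>

/* Hypothetical model of a reserve pool (stands in for mempool/dma_pool). */
struct toy_pool {
	int free_elems;		/* elements currently available */
};

/* Hypothetical model of a DMA mapping resource outside driver control. */
struct toy_dma {
	bool can_map;		/* false models a persistent mapping error */
};

enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE };

static bool toy_pool_alloc(struct toy_pool *pool)
{
	if (pool->free_elems == 0)
		return false;
	pool->free_elems--;
	return true;
}

static void toy_pool_free(struct toy_pool *pool)
{
	pool->free_elems++;
}

/* Models one submission attempt: any shortage is reported as "resource". */
static enum toy_status toy_queue_rq(struct toy_pool *pool,
				    const struct toy_dma *dma)
{
	if (!dma->can_map)
		return TOY_STS_RESOURCE;
	if (!toy_pool_alloc(pool))
		return TOY_STS_RESOURCE;
	return TOY_STS_OK;
}

/*
 * Re-run the queue up to max_runs times. After each failed attempt, one
 * in-flight request (if any) completes and returns its pool element.
 * Returns the number of attempts needed, or -1 if no attempt succeeded.
 */
static int toy_rerun_queue(struct toy_pool *pool, const struct toy_dma *dma,
			   int *inflight, int max_runs)
{
	for (int run = 1; run <= max_runs; run++) {
		if (toy_queue_rq(pool, dma) == TOY_STS_OK) {
			(*inflight)++;
			return run;
		}
		if (*inflight > 0) {
			(*inflight)--;
			toy_pool_free(pool);
		}
	}
	return -1;
}
```

In this model an empty pool with one request in flight recovers on the
second attempt, because the completion refills the pool. With can_map set
to false, no number of re-runs helps, which is the case the question above
is about.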
