On Fri, Jan 1, 2021 at 10:00 AM Mikulas Patocka mpatocka@redhat.com wrote:
On Wed, 30 Dec 2020, Ignat Korchagin wrote:
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 53791138d78b..e4fd690c70e1 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -1539,7 +1549,10 @@ static blk_status_t crypt_convert(struct crypt_config *cc,
while (ctx->iter_in.bi_size && ctx->iter_out.bi_size) {
crypt_alloc_req(cc, ctx);
r = crypt_alloc_req(cc, ctx);
if (r)
return BLK_STS_RESOURCE;
atomic_inc(&ctx->cc_pending); if (crypt_integrity_aead(cc))
-- 2.20.1
I'm not quite convinced that returning BLK_STS_RESOURCE will help. The block layer will convert this value back to -ENOMEM and return it to the caller, resulting in an I/O error.
Note that GFP_ATOMIC allocations may fail anytime and you must handle allocation failure gracefully - i.e. process the request without any error.
An acceptable solution would be to punt the request to a workqueue and do GFP_NOIO allocation from the workqueue. Or add the request to some list and process the list when some other request completes.
We can do the workqueue, if that's the desired behaviour. The second patch from this patchset already adds the code for the request to be retried from the workqueue if crypt_convert returns BLK_STS_DEV_RESOURCE, so all is needed is to change returning BLK_STS_RESOURCE to BLK_STS_DEV_RESOURCE here. Does that sound reasonable?
You should write a test that simulates allocation failure and verify that the kernel handles it gracefully without any I/O error.
I already have the test for the second patch in the set, but will adapt it to make sure the allocation failure codepath is covered as well.
Mikulas