On 10/25/23 12:13, Mikulas Patocka wrote:
Hi,

So, I'm forwarding this to the memory management maintainers. What do you think? We have a problem: if dm-crypt allocates pages with order > 3 (PAGE_ALLOC_COSTLY_ORDER), the system occasionally freezes waiting in writeback.
Hmm, so I've checked the backtraces provided and none seems to show the actual allocating task doing the crypt_alloc_buffer(). Do we know what it's doing? Is it blocked, or perhaps spinning in some infinite loop?
dm-crypt allocates the pages with GFP_NOWAIT | __GFP_HIGHMEM | __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN | __GFP_COMP, so it shouldn't put any pressure on the system. If the allocation fails, it falls back to a smaller-order allocation, or to the mempool as a last resort.
Right. I noticed it may also fall back to __GFP_DIRECT_RECLAIM, but then it's only order-0.
When the freeze happens, there is "349264kB" of free memory, so the system is not short on memory.
Yeah, it may still be fragmented, although the sysrq show-memory report in [1] suggests it's not: pages do exist up to MAX_ORDER. Weird.
[1] https://lore.kernel.org/all/ZTiJ3CO8w0jauOzW@mail-itl/
Should we restrict the dm-crypt allocation size to PAGE_ALLOC_COSTLY_ORDER? Or is there a bug somewhere in the memory management system that needs to be fixed there?
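Just to illustrate the first option, the restriction I have in mind would be a one-line clamp of the starting order in crypt_alloc_buffer(), roughly (untested sketch):

	unsigned int order = min(MAX_ORDER - 1, PAGE_ALLOC_COSTLY_ORDER);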
Haven't found any. However, I'd like to point out some things I noticed in crypt_alloc_buffer(), although they are probably not related.
static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
{
	struct crypt_config *cc = io->cc;
	struct bio *clone;
	unsigned int nr_iovecs = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
	gfp_t gfp_mask = GFP_NOWAIT | __GFP_HIGHMEM;
	unsigned int remaining_size;
	unsigned int order = MAX_ORDER - 1;

retry:
	if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
		mutex_lock(&cc->bio_alloc_lock);
What if we end up in "goto retry" more than once? I don't see a matching unlock. Yes, it's very unlikely that an order-0 allocation in the page allocator with __GFP_DIRECT_RECLAIM would fail, but it's not impossible, and I also see that crypt_page_alloc() for the mempool can fail for another reason, due to a counter being too high. Looks dangerous.
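Just to illustrate what I mean, an untested sketch of one way out would be to drop the lock on the failure path before jumping back:

		pages = mempool_alloc(&cc->page_pool, gfp_mask);
		if (!pages) {
			crypt_free_buffer_pages(cc, clone);
			bio_put(clone);
			/*
			 * Drop the lock before retrying, so the
			 * mutex_lock() after the retry label can't
			 * deadlock on a mutex we already hold.
			 */
			if (gfp_mask & __GFP_DIRECT_RECLAIM)
				mutex_unlock(&cc->bio_alloc_lock);
			gfp_mask |= __GFP_DIRECT_RECLAIM;
			order = 0;
			goto retry;
		}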
	clone = bio_alloc_bioset(cc->dev->bdev, nr_iovecs, io->base_bio->bi_opf,
				 GFP_NOIO, &cc->bs);
	clone->bi_private = io;
	clone->bi_end_io = crypt_endio;

	remaining_size = size;

	while (remaining_size) {
		struct page *pages;
		unsigned size_to_add;
		unsigned remaining_order = __fls((remaining_size + PAGE_SIZE - 1) >> PAGE_SHIFT);
Tip: you can use get_order(remaining_size) here.
		order = min(order, remaining_order);

		while (order > 0) {
Is this intentionally > 0 and not >= 0? We could still succeed with an order-0 allocation and avoid the mempool...
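For illustration, a variant that also tries order-0 before falling back to the mempool could look like this (untested sketch; note that order is unsigned int here, so a literal "while (order >= 0)" would never terminate):

		for (;;) {
			pages = alloc_pages(gfp_mask | __GFP_NOMEMALLOC |
					    __GFP_NORETRY | __GFP_NOWARN |
					    __GFP_COMP, order);
			if (likely(pages != NULL))
				goto have_pages;
			if (!order)
				break;	/* exhausted, fall back to the mempool */
			order--;
		}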
			pages = alloc_pages(gfp_mask | __GFP_NOMEMALLOC |
					    __GFP_NORETRY | __GFP_NOWARN |
					    __GFP_COMP, order);
			if (likely(pages != NULL))
				goto have_pages;
			order--;
		}

		pages = mempool_alloc(&cc->page_pool, gfp_mask);
		if (!pages) {
			crypt_free_buffer_pages(cc, clone);
			bio_put(clone);
			gfp_mask |= __GFP_DIRECT_RECLAIM;
			order = 0;
			goto retry;
		}

have_pages:
		size_to_add = min((unsigned)PAGE_SIZE << order, remaining_size);
		__bio_add_page(clone, pages, size_to_add, 0);
		remaining_size -= size_to_add;
	}
	/* Allocate space for integrity tags */
	if (dm_crypt_integrity_io_alloc(io, clone)) {
		crypt_free_buffer_pages(cc, clone);
		bio_put(clone);
		clone = NULL;
	}

	if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
		mutex_unlock(&cc->bio_alloc_lock);

	return clone;
}