On Mon 30-10-23 12:49:01, Mikulas Patocka wrote:
On Mon, 30 Oct 2023, Jan Kara wrote:
What if we end up in "goto retry" more than once? I don't see a matching mutex_unlock before we would take cc->bio_alloc_lock again in that case.
It is impossible. Before we jump to the retry label, we set __GFP_DIRECT_RECLAIM. mempool_alloc can't ever fail if __GFP_DIRECT_RECLAIM is present (it will just wait until some other task frees some objects into the mempool).
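For reference, the tail of the page-allocation loop in crypt_alloc_buffer looks roughly like this (condensed, not verbatim):

	pages = mempool_alloc(&cc->page_pool, gfp_mask);
	if (!pages) {
		/* can only happen on the first, non-waiting pass */
		crypt_free_buffer_pages(cc, clone);
		bio_put(clone);
		gfp_mask |= __GFP_DIRECT_RECLAIM;
		order = 0;
		goto retry;
	}

On the second pass gfp_mask contains __GFP_DIRECT_RECLAIM, so mempool_alloc sleeps instead of returning NULL and the goto is never taken again.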
Ah, missed that. And the traces don't show that we would be waiting for that, so I'm starting to think the allocation itself is really not the issue here. I also don't think it deprives something else of large-order pages - per the sysrq listing, they still existed.
What I rather suspect is what happens next to the allocated bio: something that works fine when the bio contains order-0 pages, or pages up to the costly order, but deadlocks when the bio contains higher-order pages than that?
Hum, so in all the backtraces presented we see that we are waiting for page writeback to complete, but I don't see anything that would be preventing the bios from completing. Page writeback can submit quite large bios, so it kind of makes sense that it trips up some odd behavior.

Looking at the code I can see one possible problem in crypt_alloc_buffer(), although it doesn't explain why reducing the initial page order would help. Anyway: are we guaranteed the mempool has enough pages for an arbitrarily large bio that can enter crypt_alloc_buffer()? I can see crypt_page_alloc() limits the number of pages in the mempool to dm_crypt_pages_per_client, and I assume the percpu counter bias in cc->n_allocated_pages can reduce the effectively available number of pages even further. So if a single bio is large enough to trip the percpu_counter_read_positive(&cc->n_allocated_pages) >= dm_crypt_pages_per_client condition in crypt_page_alloc(), can we loop forever? But maybe this cannot happen for some reason...
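For reference, the mempool allocation callback I mean is roughly this (condensed from drivers/md/dm-crypt.c, not verbatim):

static void *crypt_page_alloc(gfp_t gfp_mask, void *pool_data)
{
	struct crypt_config *cc = pool_data;
	struct page *page;

	/* refuse opportunistic allocations above the per-client limit */
	if (unlikely(percpu_counter_read_positive(&cc->n_allocated_pages) >=
		     dm_crypt_pages_per_client) &&
	    likely(gfp_mask & __GFP_NORETRY))
		return NULL;

	page = alloc_page(gfp_mask);
	if (likely(page != NULL))
		percpu_counter_add(&cc->n_allocated_pages, 1);

	return page;
}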
If this is not it, I think we need to find out why the writeback bios are not completing. Probably I'd start by checking what kcryptd, which is presumably responsible for processing these bios, is doing.
Honza
cc->page_pool is initialized to hold BIO_MAX_VECS pages. crypt_map will restrict the bio size to BIO_MAX_VECS pages (see dm_accept_partial_bio being called from crypt_map).
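The clamping looks roughly like this (condensed - the check in the tree has a couple more conditions):

	/* split bios larger than BIO_MAX_VECS pages */
	if (unlikely(bio->bi_iter.bi_size > (BIO_MAX_VECS << PAGE_SHIFT)))
		dm_accept_partial_bio(bio,
			(BIO_MAX_VECS << PAGE_SHIFT) >> SECTOR_SHIFT);

dm_accept_partial_bio makes the target process only the first n sectors; device mapper then resubmits the rest as a new bio.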
When we allocate a buffer in crypt_alloc_buffer, we first try the allocation without waiting; if that fails, we grab the mutex and retry the allocation with waiting.
The mutex prevents a deadlock where two processes each allocate 128 pages concurrently and then wait for each other to free some pages.
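The serialization is just this around the allocation (condensed):

retry:
	if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
		mutex_lock(&cc->bio_alloc_lock);

	/* ... allocate the bio and its pages ... */

	if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
		mutex_unlock(&cc->bio_alloc_lock);
	return clone;

So only one task at a time performs the waiting allocation, and the pool's reserve of BIO_MAX_VECS pages is enough to satisfy it.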
The limit of dm_crypt_pages_per_client only applies to pages allocated from the kernel's page allocator - when this limit is reached, we can still take pages from the mempool's reserve, so it shouldn't cause deadlocks.
Ah, OK, I missed the limitation of the bio size in crypt_map(). Thanks for the explanation! So really the only advice I have now is to check what kcryptd is doing when the system is stuck, because we didn't see it in any of the stacktrace dumps.
Honza