On Mon, 30 Oct 2023, Vlastimil Babka wrote:
Ah, missed that. And the traces don't show that we would be waiting for that. I'm starting to think the allocation itself is really not the issue here. Also I don't think it deprives something else of large order pages, as per the sysrq listing they still existed.
What I rather suspect is that something happens next to the allocated bio, such that it works well with order-0 or up to costly_order pages, but there's some problem causing a deadlock if the bio contains larger pages than that?
Yes. There are many "if (order > PAGE_ALLOC_COSTLY_ORDER)" branches in the memory allocation code and I suppose that one of them does something bad and triggers this bug. But I don't know which one.
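For illustration only, here is a userspace sketch of the kind of test those branches make. PAGE_ALLOC_COSTLY_ORDER is 3 in include/linux/mmzone.h; the helper names is_costly_order() and order_to_bytes() are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same value as the kernel's definition in include/linux/mmzone.h. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Hypothetical helper (not a kernel function): true when an allocation
 * of this order would take the "costly" branches in the allocator.
 */
static bool is_costly_order(unsigned int order)
{
	return order > PAGE_ALLOC_COSTLY_ORDER;
}

/* Size in bytes of an order-N allocation for a given base page size. */
static size_t order_to_bytes(unsigned int order, size_t page_size)
{
	return page_size << order;
}
```

With 4 KiB pages, order 3 is 32 KiB and is still not "costly"; anything from order 4 (64 KiB) upwards takes the other side of those branches.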
Cc Honza. The thread starts here: https://lore.kernel.org/all/ZTNH0qtmint%2FzLJZ@mail-itl/
The linked qubes report has a number of blocked task listings that can be expanded: https://github.com/QubesOS/qubes-issues/issues/8575
Mikulas