Re: [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle

17 Jul 2025


Hi Greg,
On 7/16/25 2:34 PM, Greg Kroah-Hartman wrote:
...
On Tue, Jul 15, 2025 at 06:25:07PM +0500, Muhammad Usama Anjum wrote:
...
When there is memory pressure, dma_alloc_coherent() returns an error
at resume time, which in turn fails the firmware load, and hence
the driver crashes:
kernel: kworker/u33:5: page allocation failure: order:7,
mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
kernel: CPU: 1 UID: 0 PID: 7693 Comm: kworker/u33:5 Not tainted 6.11.11-valve17-1-neptune-611-g027868a0ac03 #1 3843143b92e9da0fa2d3d5f21f51beaed15c7d59
kernel: Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
kernel: Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
kernel: Call Trace:
kernel:  <TASK>
kernel:  dump_stack_lvl+0x4e/0x70
kernel:  warn_alloc+0x164/0x190
kernel:  ? srso_return_thunk+0x5/0x5f
kernel:  ? __alloc_pages_direct_compact+0xaf/0x360
kernel:  __alloc_pages_slowpath.constprop.0+0xc75/0xd70
kernel:  __alloc_pages_noprof+0x321/0x350
kernel:  __dma_direct_alloc_pages.isra.0+0x14a/0x290
kernel:  dma_direct_alloc+0x70/0x270
kernel:  mhi_fw_load_handler+0x126/0x340 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
kernel:  mhi_pm_st_worker+0x5e8/0xac0 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
kernel:  ? srso_return_thunk+0x5/0x5f
kernel:  process_one_work+0x17e/0x330
kernel:  worker_thread+0x2ce/0x3f0
kernel:  ? __pfx_worker_thread+0x10/0x10
kernel:  kthread+0xd2/0x100
kernel:  ? __pfx_kthread+0x10/0x10
kernel:  ret_from_fork+0x34/0x50
kernel:  ? __pfx_kthread+0x10/0x10
kernel:  ret_from_fork_asm+0x1a/0x30
kernel:  </TASK>
kernel: Mem-Info:
kernel: active_anon:513809 inactive_anon:152 isolated_anon:0
    active_file:359315 inactive_file:2487001 isolated_file:0
    unevictable:637 dirty:19 writeback:0
    slab_reclaimable:160391 slab_unreclaimable:39729
    mapped:175836 shmem:51039 pagetables:4415
    sec_pagetables:0 bounce:0
    kernel_misc_reclaimable:0
    free:125666 free_pcp:0 free_cma:0
This is not a "crash", it is a warning that your huge memory allocation
did not succeed.  Properly handle this issue (and if you know it's going
to happen, turn the warning off in your allocation), and you should be
fine.
Yes, the system itself is fine, but the wifi/sound drivers fail to reinitialize.
...
...
In the above example, the consumed memory sums to 15.5GB and free
memory is ~500MB out of a total of 16GB RAM. So memory is still
available, but all of the DMA memory has been exhausted or
fragmented.
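As a rough check of the numbers above, the Mem-Info counters from the log can be converted from pages to bytes. This is only a sketch: it assumes 4 KiB pages (not stated in the log), and the helper name parse_counters is made up for illustration. Summing every non-free counter is just one way to arrive at a "consumed" figure, and some of those counters may overlap, so treat the total as approximate.

```python
# Assumption: 4 KiB pages, as is typical on x86-64; the log does not state it.
PAGE_SIZE = 4096

# Counters copied from the Mem-Info dump in the warning above.
MEM_INFO = """
active_anon:513809 inactive_anon:152 isolated_anon:0
active_file:359315 inactive_file:2487001 isolated_file:0
unevictable:637 dirty:19 writeback:0
slab_reclaimable:160391 slab_unreclaimable:39729
mapped:175836 shmem:51039 pagetables:4415
sec_pagetables:0 bounce:0
kernel_misc_reclaimable:0
free:125666 free_pcp:0 free_cma:0
"""


def parse_counters(text):
    """Turn 'name:value' tokens into a dict of page counts."""
    counters = {}
    for token in text.split():
        name, _, value = token.partition(":")
        counters[name] = int(value)
    return counters


counters = parse_counters(MEM_INFO)

# Free memory reported by the allocator, in bytes.
free_bytes = counters["free"] * PAGE_SIZE

# An order:7 request asks for 2**7 physically contiguous pages.
order7_bytes = (1 << 7) * PAGE_SIZE

# Sum of every counter that is not a free counter (may double count).
in_use_pages = sum(v for k, v in counters.items() if not k.startswith("free"))
in_use_bytes = in_use_pages * PAGE_SIZE

if __name__ == "__main__":
    print(f"free:      {free_bytes / 1e6:.1f} MB")
    print(f"in use:    {in_use_bytes / 1e9:.2f} GB (approximate)")
    print(f"order:7 =  {order7_bytes // 1024} KiB contiguous")
```

Under that assumption the failed request is for 512 KiB of contiguous memory while roughly 515 MB is reported free, which is consistent with the point that the memory is present but not available as a large enough contiguous DMA32 block.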
What caused that to happen?
The page cache grows excessively when user-space applications read
large amounts of file data, and it stays large even after those files are
no longer being actively read. I haven't found any documentation on limiting
the size of the page cache or preventing it from occupying DMA-capable
memory; perhaps the MM developers can provide more insight.
I can reproduce this issue by running stress tests that create and
sequentially read files. On a system with 16GB of RAM, the page cache can
easily grow to 10–12GB. Since the kernel manages the page cache, it's unclear
why it doesn't reclaim inactive cache more aggressively.
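A stress test of the kind described above could look roughly like the following. This is not the actual test used here, only a minimal sketch: the function names, file names and sizes are hypothetical, and the default sizes are deliberately tiny, so they would need to be raised to several GB to put real pressure on the page cache. It creates files, reads them back sequentially, and reports the Cached value from /proc/meminfo where that file exists.

```python
import os
import tempfile


def write_files(directory, count, size_bytes, chunk=1 << 20):
    """Create `count` files of `size_bytes` each and return their paths."""
    block = b"\0" * chunk
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"stress_{i}.bin")
        with open(path, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                n = min(chunk, remaining)
                f.write(block[:n])
                remaining -= n
        paths.append(path)
    return paths


def read_sequentially(paths, chunk=1 << 20):
    """Read each file front to back; return the total bytes read."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            while True:
                data = f.read(chunk)
                if not data:
                    break
                total += len(data)
    return total


def cached_kib(meminfo_text):
    """Extract the 'Cached:' value (in kB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("Cached:"):
            return int(line.split()[1])
    return None


def run(count=4, size_bytes=1 << 20):
    """Write then sequentially read the files in a temporary directory."""
    with tempfile.TemporaryDirectory() as directory:
        paths = write_files(directory, count, size_bytes)
        return read_sequentially(paths)


if __name__ == "__main__":
    print("bytes read:", run())
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            print("Cached (kB):", cached_kib(f.read()))
```

Comparing the Cached value before and after a run with large sizes gives a rough idea of how much of the file data the kernel kept in the page cache.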
...
...
Fix it by allocating the buffer only once and then reusing the same
memory. Since the memory is allocated only once, it stays allocated
across suspend cycles.
As others said, no, don't consume memory for no good reason, that just
means that other devices will fail more frequently.  If all
devices/drivers did this, you wouldn't have memory to work with either.
Makes sense.
...
thanks,
greg k-h
