On 3 Oct 2025, at 9:49, Lance Yang wrote:
Hey Wei,
On 2025/10/2 09:38, Wei Yang wrote:
We add the pmd folio to the ds_queue on the first page fault in __do_huge_pmd_anonymous_page(), so that it can be split in case of memory pressure. The same should apply to a pmd folio installed during a wp page fault.
Commit 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault") missed adding it to the ds_queue, which means the system may not reclaim enough memory under memory pressure even when the pmd folio is underused.
IIRC, it was commit dafff3f4c850 ("mm: split underused THPs") that started unconditionally adding all new anon THPs to _deferred_list :)
Move deferred_split_folio() into map_anon_folio_pmd() to make the pmd folio installation consistent.
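For anyone following along, the ds_queue mechanism the changelog describes can be sketched as a toy userspace model: folios are queued at fault time, and a later scan (standing in for the shrinker) splits the ones that are mostly zero. This is an illustration only, not kernel code; all the toy_* names, the singly linked list, and the half-full threshold are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a folio that may sit on a deferred-split list. */
struct folio {
	struct folio *next;	/* singly linked for brevity */
	int used_subpages;	/* subpages holding non-zero data */
	int nr_subpages;	/* e.g. 512 for a 2M pmd folio */
	bool on_ds_queue;
};

struct ds_queue {
	struct folio *head;
	int len;
};

/* Stands in for deferred_split_folio(): queue the folio so a later
 * scan can consider splitting it under memory pressure. */
static void toy_deferred_split_folio(struct ds_queue *q, struct folio *f)
{
	if (f->on_ds_queue)
		return;
	f->next = q->head;
	q->head = f;
	q->len++;
	f->on_ds_queue = true;
}

/* Stands in for the shrinker scan: unqueue and "split" folios whose
 * subpages are mostly unused. Returns the number split. */
static int toy_ds_queue_scan(struct ds_queue *q)
{
	struct folio **pp = &q->head;
	int split = 0;

	while (*pp) {
		struct folio *f = *pp;

		/* Toy "underused" test: under half the subpages used. */
		if (f->used_subpages * 2 < f->nr_subpages) {
			*pp = f->next;
			f->on_ds_queue = false;
			q->len--;
			split++;
		} else {
			pp = &f->next;
		}
	}
	return split;
}
```

The point of the fix is simply that a folio never queued this way is invisible to the scan, so its zero-filled subpages cannot be reclaimed.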
Fixes: 1ced09e0331f ("mm: allocate THP on hugezeropage wp-fault")
Shouldn't this rather be the following?
Fixes: dafff3f4c850 ("mm: split underused THPs")
Yes, I agree. In this case, this patch looks more like an optimization for splitting underused THPs.
One observation on this change: right after a zero-pmd wp fault, the deferred split queue could be scanned, and the newly added pmd folio will be split since it is all zero except for one subpage. This means we should probably allocate a base folio for the zero-pmd wp fault and map the rest to the zero page from the beginning, when splitting underused THPs is enabled, to avoid this long round trip. The downside is that a user app cannot get a pmd folio even if it intends to write data into the entire folio.
Usama might be able to give some insight here.
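The trade-off described above can be made concrete with a small back-of-the-envelope model: after a write to one subpage of the huge zero page, the THP path allocates 512 subpages, the folio immediately qualifies as underused, and the split frees everything but the dirtied subpage. A hedged sketch, with made-up toy_* helpers and a simplified half-full threshold (the kernel's actual underused heuristic differs in detail):

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SUBPAGES_PER_PMD 512	/* 2M / 4K on x86-64 */

/* Toy "underused" test: at most half the subpages hold data. */
static bool toy_thp_is_underused(int used_subpages)
{
	return used_subpages * 2 <= TOY_SUBPAGES_PER_PMD;
}

/* Pages left resident once the deferred-split shrinker has run,
 * for the THP path vs. the proposed base-folio path. */
static int toy_pages_after_reclaim(bool alloc_thp, int used_subpages)
{
	if (!alloc_thp)
		return used_subpages;		/* base folios only */
	if (toy_thp_is_underused(used_subpages))
		return used_subpages;		/* split frees zero subpages */
	return TOY_SUBPAGES_PER_PMD;		/* THP survives intact */
}
```

Both paths converge on the same residency for a single dirtied subpage; the THP path just takes the detour through allocation, queueing, and splitting, which is the "long trip" the observation refers to.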
Thanks, Lance
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Dev Jain <dev.jain@arm.com>
Cc: stable@vger.kernel.org
v2:
- add Fixes tag, cc stable, and describe the flow of the current code
- move deferred_split_folio() into map_anon_folio_pmd()
 mm/huge_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1b81680b4225..f13de93637bf 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1232,6 +1232,7 @@ static void map_anon_folio_pmd(struct folio *folio, pmd_t *pmd,
 	count_vm_event(THP_FAULT_ALLOC);
 	count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
 	count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
+	deferred_split_folio(folio, false);
 }
 
 static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
@@ -1272,7 +1273,6 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
 		map_anon_folio_pmd(folio, vmf->pmd, vma, haddr);
 		mm_inc_nr_ptes(vma->vm_mm);
-		deferred_split_folio(folio, false);
 		spin_unlock(vmf->ptl);
 	}
Best Regards, Yan, Zi