> On May 22, 2025, at 13:34, Oscar Salvador <osalvador@suse.de> wrote:
>> On Thu, May 22, 2025 at 11:47:05AM +0800, Muchun Song wrote:
>>> Thanks for fixing this problem. BTW, in order to catch similar problems
>>> in the future, it would be better to add a WARN_ON to folio_hstate()
>>> asserting that hugetlb_lock is held whenever the folio's reference count
>>> is zero. For this fix, LGTM.
>>
>> Why can't we put all the burden on alloc_and_dissolve_hugetlb_folio(),
>> which will check everything again under the lock?
>
> I also considered that option, because there is another similar case in
> isolate_or_dissolve_huge_page() which could benefit from the change. I am
> fine with either approach. In any case, adding an assertion to
> folio_hstate() is an improvement for catching invalid users in the future:
> any user of folio_hstate() should hold a reference to the folio, or hold
> hugetlb_lock, to make sure it returns a valid hstate for a hugetlb folio.
>
> Thanks.
> Muchun.
I mean, I would be OK with saving cycles and checking upfront in
replace_free_hugepage_folios(), but the latter's only user is
alloc_contig_range(), which is not really a function we expect to optimize.
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bd8971388236..b4d937732256 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2924,13 +2924,6 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
 	while (start_pfn < end_pfn) {
 		folio = pfn_folio(start_pfn);
 
-		if (folio_test_hugetlb(folio)) {
-			h = folio_hstate(folio);
-		} else {
-			start_pfn++;
-			continue;
-		}
-
 		if (!folio_ref_count(folio)) {
 			ret = alloc_and_dissolve_hugetlb_folio(h, folio,
 							       &isolate_list);
-- 
Oscar Salvador
SUSE Labs