On 12/21/25 10:35, Li Wang wrote:
David Hildenbrand (Red Hat) <david@kernel.org> wrote:
On 12/21/25 09:58, Li Wang wrote:
The hugetlb cgroup usage wait loops in charge_reserved_hugetlb.sh were unbounded and could hang forever if the expected cgroup file value never appears (e.g. due to bugs, timing issues, or unexpected behavior).
Did you actually hit that in practice? Just wondering.
Yes.
On an aarch64 64k setup with 512MB hugepages, the test failed earlier (hugetlbfs got mounted with an effective size of 0 due to size=256M), so write_to_hugetlbfs couldn’t allocate the expected pages. After that, the script’s wait loops never observed the target value, so they spun forever.
Okay, so essentially what you fix in patch #3, correct?
It might make sense to reorder #2 and #3, and likely current #3 should get a Fixes: tag.
Then you can just briefly describe here that this was previously hit due to other test issues. Although I wonder how much value this patch still has once #3 is in. But it looks like a cleanup, and the timeout of 60s sounds reasonable.
I know the reservation of hugetlb folios can take a rather long time in some environments, though.
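For reference, a bounded wait could look roughly like the sketch below. This is only an illustration, not the code from the patch: the helper name wait_for_file_value and its arguments are made up, and the 60s default simply mirrors the timeout discussed above. Keeping the timeout a parameter would let slow environments raise it.

```shell
# Hypothetical helper (not from the patch): poll a file until it holds the
# expected value, giving up after a timeout instead of spinning forever.
#   $1 = file to poll (e.g. a cgroup usage file)
#   $2 = expected content
#   $3 = timeout in seconds (assumed default: 60)
wait_for_file_value() {
	local path="$1"
	local expect="$2"
	local timeout="${3:-60}"
	local elapsed=0

	while [ "$elapsed" -lt "$timeout" ]; do
		# Command substitution strips the trailing newline of the file.
		if [ "$(cat "$path" 2>/dev/null)" = "$expect" ]; then
			return 0
		fi
		sleep 1
		elapsed=$((elapsed + 1))
	done

	echo "timed out after ${timeout}s waiting for $path == $expect" >&2
	return 1
}
```

A caller would then fail the test on a non-zero return rather than hang, e.g. `wait_for_file_value "$usage_file" "$expected" 60 || exit 1`, where both variables are placeholders.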