On 28/07/2023 17:13, Yin Fengwei wrote:
In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(), folio_mapcount() is used to check whether the folio is shared. But this is not correct, as folio_mapcount() returns the total mapcount of a large folio.
Use folio_estimated_sharers() here as the estimated number is enough.
Yin Fengwei (2):
  madvise: don't use mapcount() against large folio for sharing check
  madvise: don't use mapcount() against large folio for sharing check

 mm/huge_memory.c | 2 +-
 mm/madvise.c     | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)
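To make the problem described in the cover letter concrete, here is a minimal userspace sketch, not kernel code. The struct and the model_* helpers are hypothetical names invented for illustration, and the sketch assumes folio_estimated_sharers() samples only the first subpage's mapcount, which is my understanding of the current implementation.

```c
#include <assert.h>

/* Hypothetical userspace model of a large folio; NOT the kernel's struct folio. */
#define MODEL_MAX_PAGES 16

struct model_folio {
	int nr_pages;                       /* number of subpages in the folio */
	int page_mapcount[MODEL_MAX_PAGES]; /* number of PTEs mapping each subpage */
};

/*
 * Models the "total mapcount" behaviour: sum over all subpages.
 * For a PTE-mapped large folio owned by ONE process this is nr_pages,
 * so a "mapcount != 1" test wrongly reports the folio as shared.
 */
static int model_total_mapcount(const struct model_folio *f)
{
	int total = 0;

	assert(f->nr_pages > 0 && f->nr_pages <= MODEL_MAX_PAGES);
	for (int i = 0; i < f->nr_pages; i++)
		total += f->page_mapcount[i];
	return total;
}

/*
 * Models the estimate (assumption: only the first subpage is sampled).
 * It is cheap, but can be wrong if the subpages are mapped unevenly.
 */
static int model_estimated_sharers(const struct model_folio *f)
{
	assert(f->nr_pages > 0 && f->nr_pages <= MODEL_MAX_PAGES);
	return f->page_mapcount[0];
}
```

For a 4-page folio where each subpage is mapped once by a single process, model_total_mapcount() returns 4, which a "!= 1" check would misread as shared, while model_estimated_sharers() returns 1.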
As a set of fixes, I agree this is definitely an improvement, so:
Reviewed-by: Ryan Roberts
But I have a couple of comments around further improvements:
Once we have the scheme that David is working on to provide precise exclusive vs shared info, we will probably want to move to that. However, that scheme will need access to the mm_struct of a process known to be mapping the folio. We have that info at these call sites, but it's not passed to folio_estimated_sharers(), so we can't just reimplement folio_estimated_sharers(); we will need to rework these call sites again.
Given the aspiration for most memory to be large folios going forwards, wouldn't it be better to avoid splitting the large folio when it is mapped entirely within the range of the madvise operation? Sorry if this has already been discussed and decided against; I didn't follow the RFC too closely. Or perhaps you plan to do this as a follow-up?
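As a rough illustration of the containment check I have in mind, here is a userspace sketch, not kernel code. model_folio_within_range() and its parameters are hypothetical, and it assumes the folio is mapped contiguously starting at folio_start; a real implementation would need to work from the VMA and page tables.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if a folio mapped contiguously at
 * [folio_start, folio_start + nr_pages * page_size) lies entirely inside
 * the madvise range [start, end). When true, the folio could in principle
 * be handled whole instead of being split.
 */
static bool model_folio_within_range(uintptr_t folio_start,
				     unsigned long nr_pages,
				     unsigned long page_size,
				     uintptr_t start, uintptr_t end)
{
	uintptr_t folio_end = folio_start + nr_pages * page_size;

	return folio_start >= start && folio_end <= end;
}
```

A folio that straddles either end of the range fails the check and would still need the existing split path.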
Thanks, Ryan