On 28.05.25 17:45, Oscar Salvador wrote:
On Wed, May 28, 2025 at 05:09:26PM +0200, David Hildenbrand wrote:
On 28.05.25 17:03, Peter Xu wrote:
So I'm not 100% sure we need the folio lock even for copy; IIUC a refcount would be enough?
The introducing patches seem to talk about blocking concurrent migration / rmap walks.
I thought the main reason was that PageLock protects us against writes, so when copying (in case of copying the underlying file), we want the file to be stable throughout the copy?
Well, we don't do the same for ordinary pages, so why should we do it for hugetlb?
See wp_page_copy().
If you have a MAP_PRIVATE mapping of a file and modify the pagecache pages concurrently (write to another MAP_SHARED mapping, write() ...), there are no guarantees about one observing any specific page state.
At least not that I am aware of ;)
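To make the MAP_PRIVATE point concrete, here is a minimal userspace sketch using ordinary pages (not hugetlb, and not kernel code). The helper name private_write_stays_private() and the temp-file path are made up for illustration. It deliberately does not check what the private mapping observes before its first write, since that is the part with no guarantees; it only checks the behaviour Linux provides after the private page has been copied.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 if the private copy stays isolated after copy-on-write,
 * 1 if it does not, and -1 on a setup error.
 */
int private_write_stays_private(void)
{
    char path[] = "/tmp/map_private_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file lives on until fd is closed */

    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0 || ftruncate(fd, psz) != 0) {
        close(fd);
        return -1;
    }

    char *priv = mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    char *shared = mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (priv == MAP_FAILED || shared == MAP_FAILED) {
        if (priv != MAP_FAILED)
            munmap(priv, psz);
        if (shared != MAP_FAILED)
            munmap(shared, psz);
        close(fd);
        return -1;
    }

    /*
     * Modify the pagecache page through the shared mapping. Whether
     * priv[0] shows 'B' at this point is not something to rely on,
     * so it is intentionally not checked.
     */
    shared[0] = 'B';

    /* First write through the private mapping triggers copy-on-write. */
    priv[0] = 'C';

    /* Later pagecache changes must not leak into the private copy ... */
    shared[0] = 'D';

    /* ... and the private write must not have reached the shared page. */
    int ok = (priv[0] == 'C' && shared[0] == 'D');

    munmap(priv, psz);
    munmap(shared, psz);
    close(fd);
    return ok ? 0 : 1;
}
```

Calling private_write_stays_private() from a small main() and checking for 0 is enough to try it. The window between the shared write and the private write is where a concurrent writer could be observed mid-copy, and nothing in this sketch takes a page lock to prevent that.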
Maybe also concurrent fallocate(PUNCH_HOLE) is a problem regarding reservations? Not sure ...
fallocate()->hugetlb_vmdelete_list() tries to grab the vma lock in write-mode, and hugetlb_wp() grabs the lock in read-mode, so we should be covered?
Yeah, maybe that's the case nowadays. Maybe it wasn't in the past ...
Also, hugetlbfs_punch_hole()->remove_inode_hugepages() will try to grab the mutex.
The only fishy thing I see is hugetlbfs_zero_partial_page().
But that is for old_page, and as I said, I thought the main reason was to protect us against writes during the copy.
See above, I really wouldn't understand why that is required.
For 2) I am also not sure if we need the pagecache folio locked; I doubt it ... but this code is not the easiest to follow.
I have been staring at that code and thinking about potential scenarios for a few days now, and I cannot convince myself that we need pagecache_folio's lock when pagecache_folio != old_folio, because as a matter of fact I cannot think of anything it protects us against.
I plan to rework this in a saner way, or at least a less obfuscated one, and then Galvin can fire his syzkaller to check whether we are good.