On Wed, May 28, 2025 at 11:45 AM Oscar Salvador <osalvador@suse.de> wrote:
On Wed, May 28, 2025 at 05:09:26PM +0200, David Hildenbrand wrote:
On 28.05.25 17:03, Peter Xu wrote:
So I'm not 100% sure we need the folio lock even for copy; IIUC a refcount would be enough?
The introducing patches seem to talk about blocking concurrent migration / rmap walks.
I thought the main reason was that PageLock protects us against writes, so when copying (in the case of copying the underlying file), we want the file to be stable throughout the copy?
Maybe also concurrent fallocate(PUNCH_HOLE) is a problem regarding reservations? Not sure ...
fallocate()->hugetlb_vmdelete_list() tries to grab the vma lock in write mode, and hugetlb_wp() grabs the lock in read mode, so we should be covered?
Also, hugetlbfs_punch_hole()->remove_inode_hugepages() will try to grab the mutex.
The only fishy thing I see is hugetlbfs_zero_partial_page().
But that is for old_page, and as I said, I thought the main reason was to protect us against writes during the copy.
For 2) I am also not sure if we need the pagecache folio locked; I doubt it ... but this code is not the easiest to follow.
I have been staring at that code and thinking about potential scenarios for a few days now, and I cannot convince myself that we need pagecache_folio's lock when pagecache_folio != old_folio, because as a matter of fact I cannot think of anything it protects us against.
Hi Oscar,
Have you thought about the UFFDIO_CONTINUE case (hugetlb_mfill_atomic_pte())?
I'm slightly concerned that, if you aren't holding pagecache_folio's lock, there might be issues where hugetlb_mfill_atomic_pte() proceeds to map a hugetlb page that it is not supposed to. (For example, if the fault handler does not generally hold pagecache_folio's lock, hugetlb_mfill_atomic_pte() will see a page in the pagecache and map it, even though it may not have been zeroed yet.)
I haven't had enough time to fully think through this case, but just want to make sure it has been considered.
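To make the concern concrete, here is a minimal single-threaded userspace model of that interleaving. It is only a sketch, not kernel code: struct model_folio, model_trylock(), model_continue_maps_stale() and model_race() are all hypothetical names, and it assumes the UFFDIO_CONTINUE side looks the folio up in the page cache and takes its lock before mapping it.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

/* Hypothetical stand-in for a hugetlb folio; not a kernel structure. */
struct model_folio {
	bool locked;        /* models the folio lock */
	bool in_pagecache;  /* models visibility through the page cache */
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Non-blocking lock attempt; returns false if the lock is already held. */
static bool model_trylock(struct model_folio *f)
{
	if (f->locked)
		return false;
	f->locked = true;
	return true;
}

static void model_unlock(struct model_folio *f)
{
	f->locked = false;
}

/*
 * Models the UFFDIO_CONTINUE side (assumption: lookup, lock, then map).
 * Returns true only if it "mapped" a folio that still held stale bytes.
 */
static bool model_continue_maps_stale(struct model_folio *f)
{
	bool stale = false;

	/* Nothing to map, or the lock is held and real code would sleep. */
	if (!f->in_pagecache || !model_trylock(f))
		return false;

	for (size_t i = 0; i < MODEL_PAGE_SIZE; i++)
		if (f->data[i] != 0)
			stale = true;

	model_unlock(f);
	return stale;
}

/*
 * Models the fault side: publish the folio in the page cache, then zero
 * it. The continue side is run in the window between the two steps.
 * Returns 1 if the continue side mapped stale data, 0 otherwise.
 */
int model_race(bool fault_holds_lock)
{
	struct model_folio f;
	bool stale;

	memset(f.data, 0xAA, sizeof(f.data)); /* stale, not-yet-zeroed contents */
	f.locked = false;
	f.in_pagecache = false;

	if (fault_holds_lock)
		model_trylock(&f);
	f.in_pagecache = true;                /* folio becomes visible */

	stale = model_continue_maps_stale(&f); /* the racing window */

	memset(f.data, 0, sizeof(f.data));    /* zeroing happens too late */
	if (fault_holds_lock)
		model_unlock(&f);

	return stale ? 1 : 0;
}
```

In this model, model_race(true) returns 0 because the continue side cannot take the lock while the folio is still being zeroed, and model_race(false) returns 1 because nothing stops it from mapping the stale contents. Whether the real fault path can actually expose such a window is exactly the open question.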
Thanks!
I plan to rework this in a more sane way, or at least less obfuscated, and then Galvin can fire his syzkaller to check whether we are good.
-- Oscar Salvador SUSE Labs