On Fri, Dec 21, 2018 at 10:28:25AM -0800, Mike Kravetz wrote:
On 12/21/18 2:28 AM, Kirill A. Shutemov wrote:
On Tue, Dec 18, 2018 at 02:35:57PM -0800, Mike Kravetz wrote:
Instead of writing the required complicated code for this rare occurrence, just eliminate the race. i_mmap_rwsem is now held in read mode for the duration of page fault processing. Hold i_mmap_rwsem longer in truncation and hole punch code to cover the call to remove_inode_hugepages.
One of the remove_inode_hugepages() callers is noticeably missing -- hugetlbfs_evict_inode(). Why?
It at least deserves a comment on why the lock rule doesn't apply to it.
In the case of hugetlbfs_evict_inode, the vfs layer guarantees there are no more users of the inode/file.
I'm not convinced that it is true. See documentation for ->evict_inode() in Documentation/filesystems/porting:
	"Caller does *not* evict the pagecache or inode-associated metadata buffers; the method has to use truncate_inode_pages_final() to get rid of those."
Is hugetlbfs special here?