On 06.08.22 01:13, Peter Xu wrote:
On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote:
On 08/05/22 20:57, David Hildenbrand wrote:
On 05.08.22 20:33, Mike Kravetz wrote:
On 08/05/22 20:25, David Hildenbrand wrote:
On 05.08.22 20:23, Mike Kravetz wrote:
On 08/05/22 14:14, Peter Xu wrote: > On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 61e6135c54ef..462a6b0344ac 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) >> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) >> return 0; >> >> + /* >> + * Hugetlb does not require/support writenotify; especially, it does not >> + * support softdirty tracking. >> + */ >> + if (is_vm_hugetlb_page(vma)) >> + return 0; > > I'm kind of confused here.. you seems to be fixing up soft-dirty for > hugetlb but here it's explicitly forbidden. > > Could you explain a bit more on why this patch is needed if (assume > there'll be a working) patch 2 being provided? >
No comments on the patch, but ...
Since it required little thought, I ran the test program on next-20220802 and was surprised that the issue did not recreate. Even added a simple printk to make sure we were getting into vma_wants_writenotify with a hugetlb vma. We were.
... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
No, Duh!
FYI - Some time back, I started looking at adding soft dirty support for hugetlb mappings. I did not finish that work. But, I seem to recall places where code was operating on hugetlb mappings when perhaps it should not.
Perhaps, it would also be good to just disable soft dirty for hugetlb at the source?
I thought about that as well. But I came to the conclusion that without patch #2, hugetlb VMAs cannot possibly support write-notify, so there is no need to bother in vma_wants_writenotify() at all.
The "root" would be places where we clear VM_SOFTDIRTY. That should only be fs/proc/task_mmu.c:clear_refs_write() IIRC.
So I don't particularly care, I consider this patch a bit cleaner and more generic, but I can adjust clear_refs_write() instead of there is a preference.
After a closer look, I agree that this may be the simplest/cleanest way to proceed. I was going to suggest that you note hugetlb does not support softdirty, but see you did in the comment.
Acked-by: Mike Kravetz mike.kravetz@oracle.com
Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to follow to me, since it's not clear why hugetlbfs never wants writenotify.
Well, because the write-fault handler as is cannot deal with write-faults in shared mappings. It cannot possibly work or ever have worked.
If it's only about soft-dirty, we could have added the hugetlbfs check into vma_soft_dirty_enabled(), then I think it'll achieve the same thing and much clearer - with the soft-dirty check constantly returning false for it, hugetlbfs shared vmas should have vma_wants_writenotify() naturally return 0 already.
I considered that an option, but went with this approach for simplicity and because I don't see a need to check for hugetlb in other code paths (especially, in the !hugetlb mprotect variant).
I mean, with patch #2 we can remove it anytime we do support softdirty tracking -- or if there is need for write-notify we can move it into the softdirty check only.
For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I remember Mike used to have that in todo. Since we've got patch 2 already, I feel like that's really much close (is the only missing piece the clear refs write part? or maybe some more that I didn't notice).
My gut feeling is that there isn't too much missing to have it working. Define a PTE bit, implement hugetlb variant of clearing, and set it when setting the PTE dirty.
Thanks!