On 08/05/22 19:13, Peter Xu wrote:
On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote:
On 08/05/22 20:57, David Hildenbrand wrote:
On 05.08.22 20:33, Mike Kravetz wrote:
On 08/05/22 20:25, David Hildenbrand wrote:
On 05.08.22 20:23, Mike Kravetz wrote:
On 08/05/22 14:14, Peter Xu wrote: > On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 61e6135c54ef..462a6b0344ac 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) >> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) >> return 0; >> >> + /* >> + * Hugetlb does not require/support writenotify; especially, it does not >> + * support softdirty tracking. >> + */ >> + if (is_vm_hugetlb_page(vma)) >> + return 0; > > I'm kind of confused here.. you seems to be fixing up soft-dirty for > hugetlb but here it's explicitly forbidden. > > Could you explain a bit more on why this patch is needed if (assume > there'll be a working) patch 2 being provided? >
No comments on the patch, but ...
Since it required little thought, I ran the test program on next-20220802 and was surprised that the issue did not recreate. Even added a simple printk to make sure we were getting into vma_wants_writenotify with a hugetlb vma. We were.
... does your config have CONFIG_MEM_SOFT_DIRTY enabled?
No, Duh!
FYI - Some time back, I started looking at adding soft dirty support for hugetlb mappings. I did not finish that work. But, I seem to recall places where code was operating on hugetlb mappings when perhaps it should not.
Perhaps, it would also be good to just disable soft dirty for hugetlb at the source?
I thought about that as well. But I came to the conclusion that without patch #2, hugetlb VMAs cannot possibly support write-notify, so there is no need to bother in vma_wants_writenotify() at all.
The "root" would be places where we clear VM_SOFTDIRTY. That should only be fs/proc/task_mmu.c:clear_refs_write() IIRC.
So I don't particularly care, I consider this patch a bit cleaner and more generic, but I can adjust clear_refs_write() instead of there is a preference.
After a closer look, I agree that this may be the simplest/cleanest way to proceed. I was going to suggest that you note hugetlb does not support softdirty, but see you did in the comment.
Acked-by: Mike Kravetz mike.kravetz@oracle.com
Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to follow to me, since it's not clear why hugetlbfs never wants writenotify.
If it's only about soft-dirty, we could have added the hugetlbfs check into vma_soft_dirty_enabled(), then I think it'll achieve the same thing and much clearer - with the soft-dirty check constantly returning false for it, hugetlbfs shared vmas should have vma_wants_writenotify() naturally return 0 already.
For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I remember Mike used to have that in todo. Since we've got patch 2 already, I feel like that's really much close (is the only missing piece the clear refs write part? or maybe some more that I didn't notice).
Then patch 1 (or IMHO equivalant check in vma_soft_dirty_enabled(), but maybe in stable trees we don't have vma_soft_dirty_enabled then it's exactly patch 1) can be a stable-only backport just to avoid the bug from triggering.
It looks like vma_soft_dirty_enabled is recent and not in any stable trees (or even 5.19).
Yes, I did start working on hugetlb softdirty support in the past. https://lore.kernel.org/lkml/20210211000322.159437-1-mike.kravetz@oracle.com...
Unfortunately, it got preempted by other things. I will try to move it up the priority list.