On Mar 6, 2023, at 5:03 PM, Peter Xu <peterx@redhat.com> wrote:
On Mon, Mar 06, 2023 at 02:50:21PM -0800, Axel Rasmussen wrote:
Quite a few userfaultfd functions took both mm and vma pointers as arguments. Since the mm is trivially accessible via vma->vm_mm, there's no reason to pass both; it just needlessly extends the already long argument list.
Get rid of the mm pointer, where possible, to shorten the argument list.
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
One nit below:
@@ -6277,7 +6276,7 @@ int hugetlb_mfill_atomic_pte(struct mm_struct *dst_mm,
 		folio_in_pagecache = true;
 	}
-	ptl = huge_pte_lock(h, dst_mm, dst_pte);
+	ptl = huge_pte_lock(h, dst_vma->vm_mm, dst_pte);
 	ret = -EIO;
 	if (folio_test_hwpoison(folio))
@@ -6319,9 +6318,9 @@ int hugetlb_mfill_atomic_pte(struct mm_struct *dst_mm,
 	if (wp_copy)
 		_dst_pte = huge_pte_mkuffd_wp(_dst_pte);
-	set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+	set_huge_pte_at(dst_vma->vm_mm, dst_addr, dst_pte, _dst_pte);
-	hugetlb_count_add(pages_per_huge_page(h), dst_mm);
+	hugetlb_count_add(pages_per_huge_page(h), dst_vma->vm_mm);
When vm_mm is referenced multiple times (say, >= 3?), let's still cache it in a temp var?
I'm not sure whether the compiler is smart enough to already do that with a register. Even if it is, a temp var may slightly improve readability too, imho, by sparing the reader the same indirection repeated several times.
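For illustration, a minimal sketch of what caching the pointer in a temp var could look like. The struct definitions and the mock_* helpers are hypothetical stand-ins written for this example; they are not the kernel's definitions, and the function is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct mm_struct {
	long hugetlb_usage;	/* pages accounted to this mm */
	int pte_updates;	/* number of calls to the mock PTE setter */
};

struct vm_area_struct {
	struct mm_struct *vm_mm;
};

/* Mock helpers, loosely named after the calls in the quoted hunk. */
static void mock_set_huge_pte_at(struct mm_struct *mm)
{
	mm->pte_updates++;
}

static void mock_hugetlb_count_add(long pages, struct mm_struct *mm)
{
	mm->hugetlb_usage += pages;
}

/* Read vm_mm from the vma once, then reuse the local for every call. */
static void mock_mfill(struct vm_area_struct *dst_vma, long pages)
{
	struct mm_struct *dst_mm;

	assert(dst_vma != NULL);
	dst_mm = dst_vma->vm_mm;

	mock_set_huge_pte_at(dst_mm);
	mock_hugetlb_count_add(pages, dst_mm);
}
```

With the local in place, the reader sees the indirection once, at the top of the function, instead of at every call site.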
I am not sure whether you were referring to this code specifically or to the general case. I once looked into it, and the compiler is really stupid in this regard and super conservative when it comes to aliasing. Even if you use the “restrict” keyword or the “__pure” or “__const” function attributes, in certain cases (function calls into other compilation units, or inline assembly, I don’t remember which) the compiler might ignore them. Worse, LLVM and GCC are inconsistent.
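A small sketch of why the compiler has to be conservative here. All names are hypothetical and made up for this example. opaque_call() stands in for a function whose body the compiler cannot see; here it deliberately retargets vm_mm to show what the compiler must allow for, which is not something the userfaultfd code does.

```c
struct mm_struct { int id; };
struct vm_area_struct { struct mm_struct *vm_mm; };

static struct mm_struct mm_a = { 1 }, mm_b = { 2 };

/* Stand-in for a call into another compilation unit. */
void opaque_call(struct vm_area_struct *vma)
{
	vma->vm_mm = &mm_b;
}

/* Dereferences vma->vm_mm again after the call, so it sees the new value. */
int sum_ids_reload(struct vm_area_struct *vma)
{
	int first = vma->vm_mm->id;

	opaque_call(vma);
	return first + vma->vm_mm->id;
}

/* Caches vm_mm in a local, so the call cannot change what mm points to. */
int sum_ids_cached(struct vm_area_struct *vma)
{
	struct mm_struct *mm = vma->vm_mm;
	int first = mm->id;

	opaque_call(vma);
	return first + mm->id;
}
```

The two functions can return different results, so a compiler that cannot prove the callee leaves vm_mm alone is not allowed to turn the first form into the second. A manual temp var states that assumption in the source instead.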
From a code-generation perspective, I did not see a clear-cut benefit to caching over not caching. From a performance perspective the impact is negligible. I mention all of this because I used to think it mattered, but it mostly does not.
That’s all to say that in most cases, I think whatever makes the code more readable should be preferred. I think you are correct in saying that “caching” it will make the code more readable, but performance-wise it is probably meaningless.