With recent changes necessitating mmap_lock to be held for write while expanding a stack, per-VMA locks should follow the same rules and be write-locked to prevent page faults into the VMA being expanded. Add the necessary locking.
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/mmap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 204ddcd52625..c66e4622a557 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1977,6 +1977,8 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 		return -ENOMEM;
 	}
 
+	/* Lock the VMA before expanding to prevent concurrent page faults */
+	vma_start_write(vma);
 	/*
 	 * vma->vm_start/vm_end cannot change under us because the caller
 	 * is required to hold the mmap_lock in read mode. We need the
@@ -2064,6 +2066,8 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
 		return -ENOMEM;
 	}
 
+	/* Lock the VMA before expanding to prevent concurrent page faults */
+	vma_start_write(vma);
 	/*
 	 * vma->vm_start/vm_end cannot change under us because the caller
 	 * is required to hold the mmap_lock in read mode. We need the
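For context, vma_start_write() works by tagging the VMA with the current
mmap_lock write sequence. The sketch below is modeled on the 6.4-era
implementation in include/linux/mm.h and is illustrative only; helper and
field names may differ between releases:

static inline void vma_start_write(struct vm_area_struct *vma)
{
	int mm_lock_seq;

	/* Already write-locked during this mmap_lock write cycle? */
	if (__is_vma_write_locked(vma, &mm_lock_seq))
		return;

	down_write(&vma->vm_lock->lock);
	/*
	 * Record the current mmap_lock sequence number. Page faults taking
	 * the per-VMA lock compare vma->vm_lock_seq against mm->mm_lock_seq
	 * and fall back to the mmap_lock path when the two match.
	 */
	vma->vm_lock_seq = mm_lock_seq;
	up_write(&vma->vm_lock->lock);
}

In that implementation, all write-locked VMAs are released together when
mmap_write_unlock() bumps mm->mm_lock_seq, which is why a plain
down_write()/up_write() pair suffices here.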
mmap_region adds a newly created VMA into VMA tree and might modify it afterwards before dropping the mmap_lock. This poses a problem for page faults handled under per-VMA locks because they don't take the mmap_lock and can stumble on this VMA while it's still being modified. Currently this does not pose a problem since post-addition modifications are done only for file-backed VMAs, which are not handled under per-VMA lock. However, once support for handling file-backed page faults with per-VMA locks is added, this will become a race.

Fix this by write-locking the VMA before inserting it into the VMA tree. Other places where a new VMA is added into VMA tree do not modify it after the insertion, so do not need the same locking.
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/mmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index c66e4622a557..84c71431a527 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2812,6 +2812,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	if (vma->vm_file)
 		i_mmap_lock_write(vma->vm_file->f_mapping);
 
+	/* Lock the VMA since it is modified after insertion into VMA tree */
+	vma_start_write(vma);
 	vma_iter_store(&vmi, vma);
 	mm->map_count++;
 	if (vma->vm_file) {
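The window this closes is on the fault side: once vma_iter_store()
publishes the VMA in the tree, lock_vma_under_rcu() can find it and try to
read-lock it. A simplified sketch of that read-side check, modeled on the
6.4-era vma_start_read() (illustrative; the exact code may differ):

static inline bool vma_start_read(struct vm_area_struct *vma)
{
	/* Writer holds this VMA: make the fault fall back to mmap_lock */
	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
		return false;

	if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
		return false;

	/* Recheck: a writer may have locked the VMA after the first test */
	if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
		up_read(&vma->vm_lock->lock);
		return false;
	}
	return true;
}

With the patch, vma_start_write() runs before the VMA becomes reachable,
so a concurrent fault can never read-lock it while mmap_region() is still
modifying it.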
* Suren Baghdasaryan <surenb@google.com> [230707 00:32]:
> mmap_region adds a newly created VMA into VMA tree and might modify it
> afterwards before dropping the mmap_lock. This poses a problem for page
> faults handled under per-VMA locks because they don't take the mmap_lock
> and can stumble on this VMA while it's still being modified. Currently
> this does not pose a problem since post-addition modifications are done
> only for file-backed VMAs, which are not handled under per-VMA lock.
> However, once support for handling file-backed page faults with per-VMA
> locks is added, this will become a race.
>
> Fix this by write-locking the VMA before inserting it into the VMA tree.
> Other places where a new VMA is added into VMA tree do not modify it
> after the insertion, so do not need the same locking.
>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  mm/mmap.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index c66e4622a557..84c71431a527 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2812,6 +2812,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  	if (vma->vm_file)
>  		i_mmap_lock_write(vma->vm_file->f_mapping);
>  
> +	/* Lock the VMA since it is modified after insertion into VMA tree */
So it is modified, but that i_mmap_lock_write() directly above this comment is potentially moving below the insert and that is why this lock is needed.
> +	vma_start_write(vma);
>  	vma_iter_store(&vmi, vma);
>  	mm->map_count++;
>  	if (vma->vm_file) {
>
> --
> 2.41.0.255.g8b1d071c50-goog
On Fri, Jul 7, 2023 at 7:48 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> * Suren Baghdasaryan <surenb@google.com> [230707 00:32]:
> > mmap_region adds a newly created VMA into VMA tree and might modify it
> > afterwards before dropping the mmap_lock. This poses a problem for page
> > faults handled under per-VMA locks because they don't take the mmap_lock
> > and can stumble on this VMA while it's still being modified. Currently
> > this does not pose a problem since post-addition modifications are done
> > only for file-backed VMAs, which are not handled under per-VMA lock.
> > However, once support for handling file-backed page faults with per-VMA
> > locks is added, this will become a race.
> >
> > Fix this by write-locking the VMA before inserting it into the VMA tree.
> > Other places where a new VMA is added into VMA tree do not modify it
> > after the insertion, so do not need the same locking.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > ---
> >  mm/mmap.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index c66e4622a557..84c71431a527 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2812,6 +2812,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> >  	if (vma->vm_file)
> >  		i_mmap_lock_write(vma->vm_file->f_mapping);
> >  
> > +	/* Lock the VMA since it is modified after insertion into VMA tree */
>
> So it is modified, but that i_mmap_lock_write() directly above this
> comment is potentially moving below the insert and that is why this
> lock is needed.
Correct, we should not rely on i_mmap_lock_write(), which can be moved (and is suggested to be moved in https://lore.kernel.org/all/20230606124939.93561-1-yu.ma@intel.com/).
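Put differently, the invariant is about ordering relative to the tree
insertion, not about i_mmap_lock_write(). An annotated restatement of the
patched code path:

	vma_start_write(vma);		/* must precede publication */
	vma_iter_store(&vmi, vma);	/* VMA now findable by lock_vma_under_rcu() */
	/*
	 * Post-insertion setup is now safe against per-VMA-lock faults,
	 * regardless of where i_mmap_lock_write() ends up.
	 */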
> > +	vma_start_write(vma);
> >  	vma_iter_store(&vmi, vma);
> >  	mm->map_count++;
> >  	if (vma->vm_file) {
> >
> > --
> > 2.41.0.255.g8b1d071c50-goog
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
Rule: 'Cc: stable@vger.kernel.org' or 'commit <sha1> upstream.'
Subject: [PATCH 1/2] mm: lock a vma before stack expansion
Link: https://lore.kernel.org/stable/20230707043211.3682710-1-surenb%40google.com
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
On Thu, 6 Jul 2023 21:32:10 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> With recent changes necessitating mmap_lock to be held for write while
> expanding a stack, per-VMA locks should follow the same rules and be
> write-locked to prevent page faults into the VMA being expanded. Add
> the necessary locking.
What are the possible runtime effects of this change?
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1977,6 +1977,8 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  		return -ENOMEM;
>  	}
>  
> +	/* Lock the VMA before expanding to prevent concurrent page faults */
> +	vma_start_write(vma);
>  	/*
>  	 * vma->vm_start/vm_end cannot change under us because the caller
>  	 * is required to hold the mmap_lock in read mode. We need the
> @@ -2064,6 +2066,8 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
>  		return -ENOMEM;
>  	}
>  
> +	/* Lock the VMA before expanding to prevent concurrent page faults */
> +	vma_start_write(vma);
>  	/*
>  	 * vma->vm_start/vm_end cannot change under us because the caller
>  	 * is required to hold the mmap_lock in read mode. We need the
> --
> 2.41.0.255.g8b1d071c50-goog
On Fri, Jul 7, 2023 at 7:27 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 6 Jul 2023 21:32:10 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > With recent changes necessitating mmap_lock to be held for write while
> > expanding a stack, per-VMA locks should follow the same rules and be
> > write-locked to prevent page faults into the VMA being expanded. Add
> > the necessary locking.
>
> What are the possible runtime effects of this change?
During the stack expansion, concurrent page faults would have to wait, and vice versa (stack expansion would have to wait for ongoing page faults to finish). I think the runtime effects are the same as those of the recent similar change requiring mmap_lock to be write-locked before stack expansion.
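Concretely, the serialization looks like this (illustrative interleaving,
not code from the patch):

	expand_upwards()                    page fault (per-VMA lock path)
	--------------------                ------------------------------
	vma_start_write(vma)                lock_vma_under_rcu() finds vma
	grow vma->vm_end                    vma_start_read() fails
	mmap_write_unlock(mm)               fault falls back to mmap_lock
	                                    and waits for the expansion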
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1977,6 +1977,8 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
> >  		return -ENOMEM;
> >  	}
> >  
> > +	/* Lock the VMA before expanding to prevent concurrent page faults */
> > +	vma_start_write(vma);
> >  	/*
> >  	 * vma->vm_start/vm_end cannot change under us because the caller
> >  	 * is required to hold the mmap_lock in read mode. We need the
> > @@ -2064,6 +2066,8 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
> >  		return -ENOMEM;
> >  	}
> >  
> > +	/* Lock the VMA before expanding to prevent concurrent page faults */
> > +	vma_start_write(vma);
> >  	/*
> >  	 * vma->vm_start/vm_end cannot change under us because the caller
> >  	 * is required to hold the mmap_lock in read mode. We need the
> >
> > --
> > 2.41.0.255.g8b1d071c50-goog