On Tue, Nov 19, 2024 at 12:59:45PM -0500, Liam R. Howlett wrote:
From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
The mmap_region() function tries to install a new vma, which requires a pre-allocation for the maple tree write due to the complex locking scenarios involved.
Recent efforts to simplify the error recovery required the relocation of the preallocation of the maple tree nodes (via vma_iter_prealloc() calling mas_preallocate()) higher in the function.
The relocation of the preallocation meant that, if there was a file associated with the vma and the driver call (mmap_file()) modified the vma flags, then a new merge of the new vma with existing vmas is attempted.
During the attempt to merge the existing vma with the new vma, the vma iterator is used - the same iterator that would be used for the next write attempt to the tree. If a further allocation is needed and that allocation fails, the vma iterator (and the maple state it contains) will be cleaned up, which includes freeing all previous allocations, and will be reset internally.
Upon returning to the __mmap_region() function, the error is available in the vma_merge_struct and can be used to detect the -ENOMEM status.
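As a rough illustration of that check, here is a userspace sketch in C. It is not the kernel code: merge_ctx, merge_state, try_merge() and merge_nomem() are hypothetical stand-ins for the vma_merge_struct, its recorded state, the merge attempt and vmg_nomem(), and the allocation result is passed in as a parameter rather than coming from a real allocator.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state recorded by a merge attempt. */
enum merge_state {
	MERGE_START,
	MERGE_ERROR_NOMEM,
	MERGE_NOMERGE,
	MERGE_SUCCESS,
};

/* Hypothetical stand-in for the vma_merge_struct. */
struct merge_ctx {
	enum merge_state state;
};

/* Models vmg_nomem(): did the last merge attempt fail for lack of memory? */
static bool merge_nomem(const struct merge_ctx *ctx)
{
	return ctx->state == MERGE_ERROR_NOMEM;
}

/*
 * Models a merge attempt.  'err' is the simulated result of the extra
 * allocation the merge needed: 0 on success, -ENOMEM on failure.
 * Returns true only if the merge happened; the reason for a failure is
 * left in ctx->state for the caller to inspect afterwards.
 */
static bool try_merge(struct merge_ctx *ctx, bool mergeable, int err)
{
	if (!mergeable) {
		ctx->state = MERGE_NOMERGE;
		return false;
	}
	if (err == -ENOMEM) {
		ctx->state = MERGE_ERROR_NOMEM;
		return false;
	}
	ctx->state = MERGE_SUCCESS;
	return true;
}
```

The point of the sketch is that a failed merge alone is ambiguous: the caller has to look at the recorded state to tell "nothing to merge with" apart from "ran out of memory while merging".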
Hitting an -ENOMEM scenario after the driver callback leaves the system in a state where undoing the mapping is worse than continuing by dipping into the reserve.
A preallocation should be performed when the merge fails with -ENOMEM and the earlier allocations were lost in the failure. The __GFP_NOFAIL flag is used in the allocation to ensure it succeeds, since the driver has already been implicitly told that the mapping was happening.
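The sequence described above (reserve early, lose the reservation in a failed merge, re-reserve with an allocation that must succeed) can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: struct iter, iter_prealloc(), iter_prealloc_nofail(), merge_fails_nomem(), iter_store() and map_region() are hypothetical names, malloc() stands in for the maple tree node allocator, and the retry loop is a crude imitation of __GFP_NOFAIL semantics.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NODE_SIZE 64	/* arbitrary size for the simulated node */

/* Hypothetical model of an iterator holding a preallocated node. */
struct iter {
	void *reserved;	/* NULL when nothing is reserved */
};

/* Models a normal preallocation, which is allowed to fail. */
static int iter_prealloc(struct iter *it)
{
	if (!it->reserved)
		it->reserved = malloc(NODE_SIZE);
	return it->reserved ? 0 : -ENOMEM;
}

/* Models a no-fail preallocation by retrying until memory is found. */
static void iter_prealloc_nofail(struct iter *it)
{
	while (!it->reserved)
		it->reserved = malloc(NODE_SIZE);
}

/*
 * Models a merge attempt that runs out of memory: the iterator is
 * cleaned up and the earlier reservation is freed.  Returns true to
 * signal that the failure was an out-of-memory one.
 */
static bool merge_fails_nomem(struct iter *it)
{
	free(it->reserved);
	it->reserved = NULL;
	return true;
}

/* Models the final store, which consumes the reservation. */
static bool iter_store(struct iter *it)
{
	if (!it->reserved)
		return false;	/* store attempted without a reservation */
	free(it->reserved);
	it->reserved = NULL;
	return true;
}

/* Models the overall flow; returns true if the store went through. */
static bool map_region(bool merge_hits_nomem)
{
	struct iter it = { NULL };

	if (iter_prealloc(&it))
		return false;	/* still safe to bail: driver not called yet */

	/* The driver callback would run here; past it, undoing is costly. */

	if (merge_hits_nomem && merge_fails_nomem(&it))
		iter_prealloc_nofail(&it);	/* re-reserve instead of undoing */

	return iter_store(&it);
}
```

In this model, map_region(true) still completes the store because the lost reservation is replaced before iter_store() runs; without the iter_prealloc_nofail() call the store would find nothing reserved.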
The range is already set in the vma_iter_store() call below, so the vma_iter_config() call is not necessary and is dropped.
Reported-by: syzbot+bc6bfc25a68b7a020ee1@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/x/log.txt?x=17b0ace8580000
Fixes: 5de195060b2e2 ("mm: resolve faulty mmap_region() error path behaviour")
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Looks good to me:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
I mean ideally we'd not have to handle this scenario, but 6.13 resolves this in a different way, and since this will almost never happen (perhaps actually never in reality), I think having an operation that will nearly always be a no-op beats out alternative solutions.
 mm/mmap.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)
Changes since v1:
- Don't bail out when the merge failure is -ENOMEM; force the allocation instead - Thanks Lorenzo
diff --git a/mm/mmap.c b/mm/mmap.c
index 79d541f1502b2..4f6e566d52faa 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1491,7 +1491,18 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
 				vm_flags = vma->vm_flags;
 				goto file_expanded;
 			}
-			vma_iter_config(&vmi, addr, end);
+
+			/*
+			 * In the unlikely event that more memory was needed, but
+			 * not available for the vma merge, the vma iterator
+			 * will have no memory reserved for the write we told
+			 * the driver was happening.  To keep up the ruse,
+			 * ensure the allocation for the store succeeds.
+			 */
+			if (vmg_nomem(&vmg)) {
+				mas_preallocate(&vmi.mas, vma,
+						GFP_KERNEL|__GFP_NOFAIL);
+			}
 		}
 
 		vm_flags = vma->vm_flags;
-- 
2.43.0