On Thu 13-12-18 17:04:00, Johannes Weiner wrote: [...]
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Thanks!
Just one nit:
@@ -2993,6 +2993,17 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret;

+	/*
+	 * Preallocate pte before we take page_lock because this might lead to
+	 * deadlocks for memcg reclaim which waits for pages under writeback.
+	 */
+	if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
+		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm, vmf->address);
+		if (!vmf->prealloc_pte)
+			return VM_FAULT_OOM;
+		smp_wmb(); /* See comment in __pte_alloc() */
+	}
Could you be more specific in the deadlock comment? git blame will work fine for a while, but it becomes a pain to find corresponding patches after stuff gets moved around for years.
In particular the race diagram between reclaim with a page lock held and the fs doing SetPageWriteback batches before kicking off IO would be useful directly in the code, IMO.
This?
diff --git a/mm/memory.c b/mm/memory.c
index bb78e90a9b70..ece221e4da6d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2995,7 +2995,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 	/*
 	 * Preallocate pte before we take page_lock because this might lead to
-	 * deadlocks for memcg reclaim which waits for pages under writeback.
+	 * deadlocks for memcg reclaim which waits for pages under writeback:
+	 *				lock_page(A)
+	 *				SetPageWriteback(A)
+	 *				unlock_page(A)
+	 * lock_page(B)
+	 *				lock_page(B)
+	 * pte_alloc_one
+	 *   shrink_page_list
+	 *     wait_on_page_writeback(A)
+	 *				SetPageWriteback(B)
+	 *				unlock_page(B)
+	 *				# flush A, B to clear the writeback
 	 */
 	if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
 		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm, vmf->address);