On 2022-12-15 02:51, Michal Hocko wrote:
On Wed 14-12-22 17:21:10, Mathieu Desnoyers wrote:
When encountering any vma in the range with policy other than MPOL_BIND or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put on the policy just allocated with mpol_dup().
This allows arbitrary users to leak kernel memory.
Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: linux-api@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org # 5.17+

Acked-by: Michal Hocko <mhocko@suse.com>
Thanks for catching this!
Btw., looking at the code again, it seems rather pointless to duplicate the policy just to throw it away anyway. A slightly bigger diff, but this looks more reasonable to me. What do you think? I can also send it as a cleanup on top of your fix.
I think it would be best if this comes as a cleanup on top of my fix. The diff is larger than the minimal change needed to fix the leak in stable branches.
Your approach looks fine, except for the vma_policy(vma) -> old change already spotted by Aneesh.
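
To make the difference between the two approaches concrete, here is a minimal userspace sketch, not kernel code. Every name in it (struct policy, policy_dup, policy_put, apply_policy, live_copies and the two set_home_node_* functions) is a hypothetical stand-in for illustration. It only assumes the general shape of the loop: one variant duplicates first and must drop the copy on the error path, the other checks the existing policy before duplicating anything.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
enum mode { MODE_DEFAULT, MODE_BIND, MODE_PREFERRED_MANY };

struct policy {
	enum mode mode;
	int home_node;
};

/* Outstanding copies: policy_dup() increments, policy_put() decrements. */
static int live_copies;

static struct policy *policy_dup(const struct policy *old)
{
	struct policy *new = malloc(sizeof(*new));

	if (!new)
		return NULL;
	*new = *old;
	live_copies++;
	return new;
}

static void policy_put(struct policy *pol)
{
	assert(live_copies > 0);
	free(pol);
	live_copies--;
}

static int mode_supported(enum mode m)
{
	return m == MODE_BIND || m == MODE_PREFERRED_MANY;
}

/* Stand-in for applying the new policy to a range: copy it into place. */
static void apply_policy(struct policy *dst, const struct policy *src)
{
	*dst = *src;
}

/* Minimal fix: keep dup-then-check, but drop the copy on the error path. */
static int set_home_node_minimal(struct policy *vmas[], int n, int node)
{
	for (int i = 0; i < n; i++) {
		struct policy *new;

		if (!vmas[i])
			continue;	/* no existing policy: nothing to update */
		new = policy_dup(vmas[i]);
		if (!new)
			return -ENOMEM;
		if (!mode_supported(new->mode)) {
			policy_put(new);	/* without this put, the copy leaks */
			return -EOPNOTSUPP;
		}
		new->home_node = node;
		apply_policy(vmas[i], new);
		policy_put(new);
	}
	return 0;
}

/* Cleanup variant: inspect the existing policy first, duplicate only if needed. */
static int set_home_node_check_first(struct policy *vmas[], int n, int node)
{
	for (int i = 0; i < n; i++) {
		struct policy *old = vmas[i];
		struct policy *new;

		if (!old)
			continue;
		if (!mode_supported(old->mode))
			return -EOPNOTSUPP;	/* nothing allocated yet, nothing to drop */
		new = policy_dup(old);
		if (!new)
			return -ENOMEM;
		new->home_node = node;
		apply_policy(vmas[i], new);
		policy_put(new);
	}
	return 0;
}
```

In this model both variants leave live_copies at zero after an error. The second one simply has no allocation to undo on the unsupported-mode path, which is why it reads better as a cleanup while the first stays minimal for stable.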
Thanks,
Mathieu
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 61aa9aedb728..918cdc8a7f0c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1489,7 +1489,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
-	struct mempolicy *new;
+	struct mempolicy *new, *old;
 	unsigned long vmstart;
 	unsigned long vmend;
 	unsigned long end;
@@ -1521,30 +1521,28 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
 		return 0;
 	mmap_write_lock(mm);
 	for_each_vma_range(vmi, vma, end) {
-		vmstart = max(start, vma->vm_start);
-		vmend = min(end, vma->vm_end);
-		new = mpol_dup(vma_policy(vma));
-		if (IS_ERR(new)) {
-			err = PTR_ERR(new);
-			break;
-		}
-		/*
-		 * Only update home node if there is an existing vma policy
-		 */
-		if (!new)
-			continue;
-
 		/*
 		 * If any vma in the range got policy other than MPOL_BIND
 		 * or MPOL_PREFERRED_MANY we return error. We don't reset
 		 * the home node for vmas we already updated before.
 		 */
-		if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) {
+		old = vma_policy(vma);
+		if (!old)
+			continue;
+		if (old->mode != MPOL_BIND && old->mode != MPOL_PREFERRED_MANY) {
 			err = -EOPNOTSUPP;
 			break;
 		}
 
+		new = mpol_dup(vma_policy(vma));
+		if (IS_ERR(new)) {
+			err = PTR_ERR(new);
+			break;
+		}
+
 		new->home_node = home_node;
+		vmstart = max(start, vma->vm_start);
+		vmend = min(end, vma->vm_end);
 		err = mbind_range(mm, vmstart, vmend, new);
 		mpol_put(new);
 		if (err)