RE: [PATCH v2 2/2] mm: zswap: disable migration while using per-CPU acomp_ctx

8 Jan 2025

...
-----Original Message-----
From: Yosry Ahmed yosryahmed@google.com
Sent: Tuesday, January 7, 2025 4:12 PM
To: Sridhar, Kanchana P kanchana.p.sridhar@intel.com
Cc: Andrew Morton akpm@linux-foundation.org; Johannes Weiner
hannes@cmpxchg.org; Nhat Pham nphamcs@gmail.com; Chengming
Zhou chengming.zhou@linux.dev; Vitaly Wool vitalywool@gmail.com;
Barry Song baohua@kernel.org; Sam Sun samsun1006219@gmail.com;
linux-mm@kvack.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org
Subject: Re: [PATCH v2 2/2] mm: zswap: disable migration while using per-
CPU acomp_ctx
On Tue, Jan 7, 2025 at 4:02 PM Sridhar, Kanchana P
kanchana.p.sridhar@intel.com wrote:
...
Hi Yosry,
...
-----Original Message-----
From: Yosry Ahmed yosryahmed@google.com
Sent: Tuesday, January 7, 2025 2:23 PM
To: Andrew Morton akpm@linux-foundation.org
Cc: Johannes Weiner hannes@cmpxchg.org; Nhat Pham
nphamcs@gmail.com; Chengming Zhou chengming.zhou@linux.dev;
Vitaly Wool vitalywool@gmail.com; Barry Song baohua@kernel.org;
Sam
...
...
Sun samsun1006219@gmail.com; Sridhar, Kanchana P
kanchana.p.sridhar@intel.com; linux-mm@kvack.org; linux-
kernel@vger.kernel.org; Yosry Ahmed yosryahmed@google.com;
stable@vger.kernel.org
Subject: [PATCH v2 2/2] mm: zswap: disable migration while using per-
CPU
...
...
acomp_ctx
In zswap_compress() and zswap_decompress(), the per-CPU acomp_ctx of
the
current CPU at the beginning of the operation is retrieved and used
throughout.  However, since neither preemption nor migration are
disabled,
...
...
it is possible that the operation continues on a different CPU.
If the original CPU is hotunplugged while the acomp_ctx is still in use,
we run into a UAF bug as the resources attached to the acomp_ctx are
freed
...
...
during hotunplug in zswap_cpu_comp_dead().
The problem was introduced in commit 1ec3b5fe6eec ("mm/zswap: move
to
...
...
use
crypto_acomp API for hardware acceleration") when the switch to the
crypto_acomp API was made.  Prior to that, the per-CPU crypto_comp
was
...
...
retrieved using get_cpu_ptr() which disables preemption and makes sure
the
...
...
CPU cannot go away from under us.  Preemption cannot be disabled with
the
...
...
crypto_acomp API as a sleepable context is needed.
Commit 8ba2f844f050 ("mm/zswap: change per-cpu mutex and buffer to
per-acomp_ctx") increased the UAF surface area by making the per-CPU
buffers dynamic, adding yet another resource that can be freed from
under
...
...
zswap compression/decompression by CPU hotunplug.
This cannot be fixed by holding cpus_read_lock(), as it is possible for
code already holding the lock to fall into reclaim and enter zswap
(causing a deadlock). It also cannot be fixed by wrapping the usage of
acomp_ctx in an SRCU critical section and using synchronize_srcu() in
zswap_cpu_comp_dead(), because synchronize_srcu() is not allowed in
CPU-hotplug notifiers (see
Documentation/RCU/Design/Requirements/Requirements.rst).
This can be fixed by refcounting the acomp_ctx, but it involves
complexity in handling the race between the refcount dropping to zero in
zswap_[de]compress() and the refcount being re-initialized when the CPU
is onlined.
Keep things simple for now and just disable migration while using the
per-CPU acomp_ctx to block CPU hotunplug until the usage is over.
Fixes: 1ec3b5fe6eec ("mm/zswap: move to use crypto_acomp API for
hardware acceleration")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed yosryahmed@google.com
Reported-by: Johannes Weiner hannes@cmpxchg.org
Closes:
https://lore.kernel.org/lkml/20241113213007.GB1564047@cmpxchg.org/
...
...
Reported-by: Sam Sun samsun1006219@gmail.com
Closes:
https://lore.kernel.org/lkml/CAEkJfYMtSdM5HceNsXUDf5haghD5+o2e7Qv4O
...
...
curuL4tPg6OaQ@mail.gmail.com/
mm/zswap.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index f6316b66fb236..ecd86153e8a32 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -880,6 +880,18 @@ static int zswap_cpu_comp_dead(unsigned int
cpu,
...
...
struct hlist_node *node)
      return 0;
 }
+/* Remain on the CPU while using its acomp_ctx to stop it from going
offline
...
...
*/
+static struct crypto_acomp_ctx *acomp_ctx_get_cpu(struct
crypto_acomp_ctx __percpu *acomp_ctx)
+{

migrate_disable();


return raw_cpu_ptr(acomp_ctx);



+}



+static void acomp_ctx_put_cpu(void)
+{

migrate_enable();



+}



static bool zswap_compress(struct page *page, struct zswap_entry
*entry,
...
...
                     struct zswap_pool *pool)

{
@@ -893,8 +905,7 @@ static bool zswap_compress(struct page *page,
struct
...
...
zswap_entry *entry,
      gfp_t gfp;
      u8 *dst;

acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);





acomp_ctx = acomp_ctx_get_cpu(pool->acomp_ctx);
mutex_lock(&acomp_ctx->mutex);

dst = acomp_ctx->buffer;



@@ -950,6 +961,7 @@ static bool zswap_compress(struct page *page,
struct
...
...
zswap_entry *entry,
              zswap_reject_alloc_fail++;
  mutex_unlock(&acomp_ctx->mutex);


acomp_ctx_put_cpu();



I have observed that disabling/enabling preemption in this portion of the
code
...
can result in scheduling while atomic errors. Is the same possible while
disabling/enabling migration?
IIUC no, migration disabled is not an atomic context.
Ok, thanks.
...
...
Couple of possibly related thoughts:

I have been thinking some more about the purpose of this per-cpu

acomp_ctx
...
 mutex. It appears that the main benefit is it causes task blocked errors

(which are
...
 useful to detect problems) if any computes in the code section it covers

take a
...
 long duration. Other than that, it does not protect a resource, nor

prevent
...
 cpu offlining from deleting its containing structure.

It does protect resources. Consider this case:

Process A runs on CPU #1, gets the acomp_ctx, and locks it, then is

migrated to CPU #2.

Process B runs on CPU #1, gets the same acomp_ctx, and tries to lock

it then waits for process A. Without the lock they would be using the
same lock.
It is also possible that process A simply gets preempted while running
on CPU #1 by another task that also tries to use the acomp_ctx. The
mutex also protects against this case.
Got it, thanks for the explanations. It seems with this patch, the mutex
would be redundant in the first example. Would this also be true of the
second example where process A gets preempted?  If not, is it worth
figuring out a solution that works for both migration and preemption?
...
...

Seems like the overall problem appears to be applicable to any per-cpu

data
...
 that is being used by a process, vis-a-vis cpu hotunplug. Could it be that a
 solution in cpu hotunplug can safe-guard more generally? Really not sure
 about the specifics of any solution, but it occurred to me that this may

not
...
 be a problem unique to zswap.

Not really. Static per-CPU data and data allocated with alloc_percpu()
should be available for all possible CPUs, regardless of whether they
are online or not, so CPU hotunplug is not relevant. It is relevant
here because we allocate the memory dynamically for online CPUs only
to save memory. I am not sure how important this is as I am not aware
what the difference between the number of online and possible CPUs can
be in real life deployments.
Thought I would clarify what I meant: the problem of per-cpu data that
gets allocated dynamically using cpu hotplug and deleted even while in use
by cpu hotunplug may not be unique to zswap. If so, I was wondering if
a more generic solution in the cpu hotunplug code would be feasible/worth
exploring.
Thanks,
Kanchana
...
...
Thanks,
Kanchana
...
  return comp_ret == 0 && alloc_ret == 0;

}
@@ -960,7 +972,7 @@ static void zswap_decompress(struct
zswap_entry
...
...
*entry, struct folio *folio)
      struct crypto_acomp_ctx *acomp_ctx;
      u8 *src;

acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);




acomp_ctx = acomp_ctx_get_cpu(entry->pool->acomp_ctx);
mutex_lock(&acomp_ctx->mutex);

src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);



@@ -990,6 +1002,7 @@ static void zswap_decompress(struct
zswap_entry
...
...
*entry, struct folio *folio)
  if (src != acomp_ctx->buffer)
          zpool_unmap_handle(zpool, entry->handle);


acomp_ctx_put_cpu();



}
/*********************************
2.47.1.613.gc27f4b7a9f-goog

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

RE: [PATCH v2 2/2] mm: zswap: disable migration while using per-CPU acomp_ctx

curuL4tPg6OaQ@mail.gmail.com/

/*********************************