The patch below does not apply to the 6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id, to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 4b5d1e47b69426c0f7491d97d73ad0152d02d437
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to 'stable@vger.kernel.org' --in-reply-to '2023081217-gender-font-a356@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
4b5d1e47b694 ("zsmalloc: fix races between modifications of fullness and isolated") c0547d0b6a4b ("zsmalloc: consolidate zs_pool's migrate_lock and size_class's locks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4b5d1e47b69426c0f7491d97d73ad0152d02d437 Mon Sep 17 00:00:00 2001
From: Andrew Yang <andrew.yang@mediatek.com>
Date: Fri, 21 Jul 2023 14:37:01 +0800
Subject: [PATCH] zsmalloc: fix races between modifications of fullness and isolated
We encountered many kernel exceptions of VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() and BUG_ON(!pages[1]) in zs_unmap_object() lately. This issue only occurs when migration and reclamation occur at the same time.
With our memory stress test, we can reproduce this issue several times a day. We have no idea why no one else has encountered it; for context, we switched to the kernel version containing this defect only a few months ago.
Since fullness and isolated share the same unsigned int, modifications of them should be protected by the same lock.
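To make the race concrete: fullness and isolated are bitfields packed into the same underlying unsigned int of struct zspage, so every update is a read-modify-write of that whole word. The standalone userspace sketch below (illustrative names and bit widths, not the kernel code) shows how guarding two such bitfields with two different locks lets the interleaved read-modify-write cycles lose updates:

/*
 * Userspace model of the bug (NOT kernel code): two bitfields share one
 * word, but each is guarded by its own lock, so the read-modify-write of
 * the shared word in one thread can overwrite the other thread's update.
 */
#include <pthread.h>
#include <stdio.h>

struct flags {
        unsigned int fullness : 2;      /* guarded by lock_a in this model */
        unsigned int isolated : 3;      /* guarded by lock_b in this model */
};

static struct flags f;
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

static void *toggle_fullness(void *unused)
{
        for (int i = 0; i < 1000000; i++) {
                pthread_mutex_lock(&lock_a);
                f.fullness ^= 1;        /* RMW of the whole shared word */
                pthread_mutex_unlock(&lock_a);
        }
        return NULL;
}

static void *toggle_isolated(void *unused)
{
        for (int i = 0; i < 1000000; i++) {
                pthread_mutex_lock(&lock_b);
                f.isolated ^= 1;        /* RMW of the same shared word */
                pthread_mutex_unlock(&lock_b);
        }
        return NULL;
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, toggle_fullness, NULL);
        pthread_create(&t2, NULL, toggle_isolated, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        /* Each field was toggled an even number of times: expect 0 and 0. */
        printf("fullness=%u isolated=%u\n", f.fullness, f.isolated);
        return 0;
}

Built with "gcc -O2 -pthread", on a multicore machine the final values frequently come out nonzero; making both critical sections take the same mutex, which is the analogue of this patch, makes the lost updates disappear.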
[andrew.yang@mediatek.com: move comment]
Link: https://lkml.kernel.org/r/20230727062910.6337-1-andrew.yang@mediatek.com
Link: https://lkml.kernel.org/r/20230721063705.11455-1-andrew.yang@mediatek.com
Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for migration")
Signed-off-by: Andrew Yang <andrew.yang@mediatek.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 3f057970504e..32916d28d9d9 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1798,6 +1798,7 @@ static void replace_sub_page(struct size_class *class, struct zspage *zspage,
 
 static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
 {
+        struct zs_pool *pool;
         struct zspage *zspage;
 
         /*
@@ -1807,9 +1808,10 @@ static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
         VM_BUG_ON_PAGE(PageIsolated(page), page);
 
         zspage = get_zspage(page);
-        migrate_write_lock(zspage);
+        pool = zspage->pool;
+        spin_lock(&pool->lock);
         inc_zspage_isolation(zspage);
-        migrate_write_unlock(zspage);
+        spin_unlock(&pool->lock);
 
         return true;
 }
@@ -1875,12 +1877,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
         kunmap_atomic(s_addr);
 
         replace_sub_page(class, zspage, newpage, page);
+        dec_zspage_isolation(zspage);
         /*
          * Since we complete the data copy and set up new zspage structure,
          * it's okay to release the pool's lock.
          */
         spin_unlock(&pool->lock);
-        dec_zspage_isolation(zspage);
         migrate_write_unlock(zspage);
 
         get_page(newpage);
@@ -1897,14 +1899,16 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 
 static void zs_page_putback(struct page *page)
 {
+        struct zs_pool *pool;
         struct zspage *zspage;
 
         VM_BUG_ON_PAGE(!PageIsolated(page), page);
 
         zspage = get_zspage(page);
-        migrate_write_lock(zspage);
+        pool = zspage->pool;
+        spin_lock(&pool->lock);
         dec_zspage_isolation(zspage);
-        migrate_write_unlock(zspage);
+        spin_unlock(&pool->lock);
 }
 
 static const struct movable_operations zsmalloc_mops = {
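Read back-to-back, the hunks above leave the two isolation hooks in roughly the following shape (a sketch assembled from the diff, with unchanged comment bodies elided, not the verbatim upstream file):

static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
{
        struct zs_pool *pool;
        struct zspage *zspage;

        /* ... unchanged comment ... */
        VM_BUG_ON_PAGE(PageIsolated(page), page);

        zspage = get_zspage(page);
        pool = zspage->pool;
        spin_lock(&pool->lock);
        inc_zspage_isolation(zspage);
        spin_unlock(&pool->lock);

        return true;
}

static void zs_page_putback(struct page *page)
{
        struct zs_pool *pool;
        struct zspage *zspage;

        VM_BUG_ON_PAGE(!PageIsolated(page), page);

        zspage = get_zspage(page);
        pool = zspage->pool;
        spin_lock(&pool->lock);
        dec_zspage_isolation(zspage);
        spin_unlock(&pool->lock);
}

The middle hunk is the subtle one: in zs_page_migrate() the dec_zspage_isolation() call moves up so that it runs before spin_unlock(&pool->lock), which means fullness and isolated are now only ever modified while pool->lock is held.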
We encountered many kernel exceptions of VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() and BUG_ON(!pages[1]) in zs_unmap_object() lately. This issue only occurs when migration and reclamation occur at the same time.
With our memory stress test, we can reproduce this issue several times a day. We have no idea why no one else has encountered it; for context, we switched to the kernel version containing this defect only a few months ago.
Since fullness and isolated share the same unsigned int, modifications of them should be protected by the same lock.
[andrew.yang@mediatek.com: move comment]
Link: https://lkml.kernel.org/r/20230727062910.6337-1-andrew.yang@mediatek.com
Link: https://lkml.kernel.org/r/20230721063705.11455-1-andrew.yang@mediatek.com
Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for migration")
Signed-off-by: Andrew Yang <andrew.yang@mediatek.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 4b5d1e47b69426c0f7491d97d73ad0152d02d437)
---
 mm/zsmalloc.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index d03941cace2c..aa1cb03ad72c 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1821,6 +1821,7 @@ static void replace_sub_page(struct size_class *class, struct zspage *zspage,
 
 static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
 {
+        struct size_class *class;
         struct zspage *zspage;
 
         /*
@@ -1831,9 +1832,10 @@ static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
         VM_BUG_ON_PAGE(PageIsolated(page), page);
 
         zspage = get_zspage(page);
-        migrate_write_lock(zspage);
+        class = zspage_class(zspage->pool, zspage);
+        spin_lock(&class->lock);
         inc_zspage_isolation(zspage);
-        migrate_write_unlock(zspage);
+        spin_unlock(&class->lock);
 
         return true;
 }
@@ -1909,8 +1911,8 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
          * it's okay to release migration_lock.
          */
         write_unlock(&pool->migrate_lock);
-        spin_unlock(&class->lock);
         dec_zspage_isolation(zspage);
+        spin_unlock(&class->lock);
         migrate_write_unlock(zspage);
 
         get_page(newpage);
@@ -1927,15 +1929,17 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 
 static void zs_page_putback(struct page *page)
 {
+        struct size_class *class;
         struct zspage *zspage;
 
         VM_BUG_ON_PAGE(!PageMovable(page), page);
         VM_BUG_ON_PAGE(!PageIsolated(page), page);
 
         zspage = get_zspage(page);
-        migrate_write_lock(zspage);
+        class = zspage_class(zspage->pool, zspage);
+        spin_lock(&class->lock);
         dec_zspage_isolation(zspage);
-        migrate_write_unlock(zspage);
+        spin_unlock(&class->lock);
 }
 
 static const struct movable_operations zsmalloc_mops = {
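The 6.1 conflict comes from the second dependency listed above: c0547d0b6a4b, which consolidated the pool's migrate_lock and the per-size-class locks, is not in 6.1, so there is no pool->lock to take there. This backport therefore looks up the class via zspage_class() and takes the per-class spinlock instead; assembled from the hunks (a sketch of the result, not the verbatim 6.1 file), the backported zs_page_putback() would read roughly:

static void zs_page_putback(struct page *page)
{
        struct size_class *class;
        struct zspage *zspage;

        VM_BUG_ON_PAGE(!PageMovable(page), page);
        VM_BUG_ON_PAGE(!PageIsolated(page), page);

        zspage = get_zspage(page);
        class = zspage_class(zspage->pool, zspage);
        spin_lock(&class->lock);
        dec_zspage_isolation(zspage);
        spin_unlock(&class->lock);
}

This preserves the invariant of the upstream fix, since in the pre-consolidation code base fullness transitions are already serialized by class->lock, so taking the same lock around the isolated count keeps both bitfields under one lock.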