From: Zijun Hu <quic_zijuhu(a)quicinc.com>
Remove the macro list_for_each_reverse() for the following reasons:
- It is identical to list_for_each_prev() (see the sketch below).
- It is not used anywhere in the current kernel tree.
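For reference, the surviving list_for_each_prev() covers exactly the
same backwards traversal; a minimal usage sketch (the struct and list
names here are illustrative):

	struct my_item {
		int value;
		struct list_head node;
	};

	struct list_head *pos;

	list_for_each_prev(pos, &my_list) {
		struct my_item *item = list_entry(pos, struct my_item, node);

		/* entries are visited from tail to head, exactly as
		 * list_for_each_reverse() would have visited them */
	}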
Fixes: 8bf0cdfac7f8 ("<linux/list.h>: Introduce the list_for_each_reverse() method")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
include/linux/list.h | 8 --------
1 file changed, 8 deletions(-)
diff --git a/include/linux/list.h b/include/linux/list.h
index 5f4b0a39cf46..29a375889fb8 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -686,14 +686,6 @@ static inline void list_splice_tail_init(struct list_head *list,
#define list_for_each(pos, head) \
for (pos = (head)->next; !list_is_head(pos, (head)); pos = pos->next)
-/**
- * list_for_each_reverse - iterate backwards over a list
- * @pos: the &struct list_head to use as a loop cursor.
- * @head: the head for your list.
- */
-#define list_for_each_reverse(pos, head) \
- for (pos = (head)->prev; pos != (head); pos = pos->prev)
-
/**
* list_for_each_rcu - Iterate over a list in an RCU-safe fashion
* @pos: the &struct list_head to use as a loop cursor.
---
base-commit: 6a36d828bdef0e02b1e6c12e2160f5b83be6aab5
change-id: 20240916-fix_list-553c447bde0f
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
From: Schspa Shi <schspa(a)gmail.com>
commit a5201d42e2f8a8e8062103170027840ee372742f upstream.
When num_reg_defaults > 0 but reg_defaults is NULL, there will be a
NULL pointer dereference.
No current code has such a usage, but as additional hardening, check
this to prevent any chance of crashing.
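For illustration, a (hypothetical) configuration that the new check
rejects would look like this; without the check, the defaults loop in
regcache_init() would dereference the NULL reg_defaults array:

	static const struct regmap_config bad_config = {
		.reg_bits	  = 8,
		.val_bits	  = 8,
		.cache_type	  = REGCACHE_RBTREE,
		.num_reg_defaults = 4,    /* claims four defaults... */
		.reg_defaults	  = NULL, /* ...but supplies none */
	};

With this patch, regcache_init() fails cleanly with -EINVAL instead.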
Signed-off-by: Schspa Shi <schspa(a)gmail.com>
Link: https://lore.kernel.org/r/20220629130951.63040-1-schspa@gmail.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Roman Smirnov <r.smirnov(a)omp.ru>
---
drivers/base/regmap/regcache.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/base/regmap/regcache.c b/drivers/base/regmap/regcache.c
index 7fdd702e564a..5ff79ba665ad 100644
--- a/drivers/base/regmap/regcache.c
+++ b/drivers/base/regmap/regcache.c
@@ -133,6 +133,12 @@ int regcache_init(struct regmap *map, const struct regmap_config *config)
return -EINVAL;
}
+ if (config->num_reg_defaults && !config->reg_defaults) {
+ dev_err(map->dev,
+ "Register defaults number are set without the reg!\n");
+ return -EINVAL;
+ }
+
for (i = 0; i < config->num_reg_defaults; i++)
if (config->reg_defaults[i].reg % map->reg_stride)
return -EINVAL;
--
2.34.1
From: Dandan Zhang <zhangdandan(a)uniontech.com>
[ Upstream commit 494b0792d962e8efac72b3a5b6d9bcd4e6fa8cf0 ]
The kvm_hypercall() set for LoongArch is limited to a1-a5, so the
mention of a6 in the comment is incorrect and needs to be rectified.
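For context, a hypercall under this convention passes the function ID
in a0 and up to five arguments in a1-a5; schematically (a sketch
modeled on the kvm_hypercall0() definition visible in the hunk below,
with KVM_HCALL_SERVICE assumed to be this header's hypercall code
macro):

	static __always_inline long kvm_hypercall1(u64 fid, unsigned long arg)
	{
		register long ret asm("a0");
		register unsigned long fun asm("a0") = fid; /* a0: function ID */
		register unsigned long a1  asm("a1") = arg; /* a1: first of up to 5 args */

		__asm__ __volatile__(
			"hvcl "__stringify(KVM_HCALL_SERVICE)
			: "=r" (ret)
			: "r" (fun), "r" (a1)
			: "memory"
			);

		return ret;
	}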
Reviewed-by: Bibo Mao <maobibo(a)loongson.cn>
Signed-off-by: Wentao Guan <guanwentao(a)uniontech.com>
Signed-off-by: Dandan Zhang <zhangdandan(a)uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
Signed-off-by: WangYuli <wangyuli(a)uniontech.com>
--
Changelog:
*v1 -> v2: Correct the commit-msg format.
---
arch/loongarch/include/asm/kvm_para.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/loongarch/include/asm/kvm_para.h b/arch/loongarch/include/asm/kvm_para.h
index 4ba2312e5f8c..6d5e9b6c5714 100644
--- a/arch/loongarch/include/asm/kvm_para.h
+++ b/arch/loongarch/include/asm/kvm_para.h
@@ -28,9 +28,9 @@
* Hypercall interface for KVM hypervisor
*
* a0: function identifier
- * a1-a6: args
+ * a1-a5: args
* Return value will be placed in a0.
- * Up to 6 arguments are passed in a1, a2, a3, a4, a5, a6.
+ * Up to 5 arguments are passed in a1, a2, a3, a4, a5.
*/
static __always_inline long kvm_hypercall0(u64 fid)
{
--
2.43.0
The quilt patch titled
Subject: zram: free secondary algorithms names
has been removed from the -mm tree. Its filename was
zram-free-secondary-algorithms-names.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Subject: zram: free secondary algorithms names
Date: Wed, 11 Sep 2024 11:54:56 +0900
We need to kfree() the secondary algorithms' names when resetting a
zram device that had multiple streams, otherwise we leak memory.
[senozhatsky(a)chromium.org: kfree(NULL) is legal]
Link: https://lkml.kernel.org/r/20240917013021.868769-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20240911025600.3681789-1-senozhatsky@chromium.org
Fixes: 001d92735701 ("zram: add recompression algorithm sysfs knob")
Signed-off-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/block/zram/zram_drv.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/block/zram/zram_drv.c~zram-free-secondary-algorithms-names
+++ a/drivers/block/zram/zram_drv.c
@@ -2112,6 +2112,11 @@ static void zram_destroy_comps(struct zr
zram->num_active_comps--;
}
+ for (prio = ZRAM_SECONDARY_COMP; prio < ZRAM_MAX_COMPS; prio++) {
+ kfree(zram->comp_algs[prio]);
+ zram->comp_algs[prio] = NULL;
+ }
+
zram_comp_params_reset(zram);
}
_
The quilt patch titled
Subject: mm: z3fold: deprecate CONFIG_Z3FOLD
has been removed from the -mm tree. Its filename was
mm-z3fold-deprecate-config_z3fold.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Yosry Ahmed <yosryahmed(a)google.com>
Subject: mm: z3fold: deprecate CONFIG_Z3FOLD
Date: Wed, 4 Sep 2024 23:33:43 +0000
The z3fold compressed pages allocator is rarely used; most users use
zsmalloc. The only disadvantage of zsmalloc in comparison is its
dependency on MMU, and zbud is a more common option for !MMU as it was
the default zswap allocator for a long time.
Historically, zsmalloc had worse latency than zbud and z3fold but offered
better memory savings. This is no longer the case as shown by a simple
recent analysis [1]. That analysis showed that z3fold does not have any
advantage over zsmalloc or zbud considering both performance and memory
usage. In a kernel build test on tmpfs in a limited cgroup, z3fold took
3% more time and used 1.8% more memory. The latency of zswap_load() was
7% higher, and that of zswap_store() was 10% higher. Zsmalloc is better
in all metrics.
Moreover, z3fold apparently has latent bugs, as was made noticeable by
a recent soft lockup bug report with z3fold [2]. Switching to zsmalloc
not only fixed the problem, but also reduced the swap usage from 6~8G to
1~2G. Other users have also reported being bitten by mistakenly enabling
z3fold.
Besides hurting users, z3fold repeatedly causes wasted engineering
effort. Apart from the investigation of the above bug, it came up in
multiple development discussions (e.g. [3]) as something we need to
handle, even though there aren't any legitimate users (at least not
intentional ones).
The natural course of action is to deprecate z3fold, and remove it in a
few cycles if no objections are raised by active users. Next on the
list should be zbud, as it offers marginal latency gains at the cost of
huge memory waste when compared to zsmalloc. That one will need to wait
until zsmalloc no longer depends on MMU.
Rename the user-visible config option from CONFIG_Z3FOLD to
CONFIG_Z3FOLD_DEPRECATED so that users with CONFIG_Z3FOLD=y get a new
prompt with explanation during make oldconfig. Also, remove
CONFIG_Z3FOLD=y from defconfigs.
[1]https://lore.kernel.org/lkml/CAJD7tkbRF6od-2x_L8-A1QL3=2Ww13sCj4S3i4bNndq…
[2]https://lore.kernel.org/lkml/EF0ABD3E-A239-4111-A8AB-5C442E759CF3@gmail.c…
[3]https://lore.kernel.org/lkml/CAJD7tkbnmeVugfunffSovJf9FAgy9rhBVt_tx=nxUve…
[arnd(a)arndb.de: deprecate ZSWAP_ZPOOL_DEFAULT_Z3FOLD as well]
Link: https://lkml.kernel.org/r/20240909202625.1054880-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20240904233343.933462-1-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed(a)google.com>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Acked-by: Chris Down <chris(a)chrisdown.name>
Acked-by: Nhat Pham <nphamcs(a)gmail.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Acked-by: Vitaly Wool <vitaly.wool(a)konsulko.com>
Acked-by: Christoph Hellwig <hch(a)lst.de>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)kernel.org>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: Huacai Chen <chenhuacai(a)kernel.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao(a)linux.ibm.com>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Cc: WANG Xuerui <kernel(a)xen0n.name>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/loongarch/configs/loongson3_defconfig | 1
arch/powerpc/configs/ppc64_defconfig | 1
mm/Kconfig | 25 ++++++++++++++-----
3 files changed, 19 insertions(+), 8 deletions(-)
--- a/arch/loongarch/configs/loongson3_defconfig~mm-z3fold-deprecate-config_z3fold
+++ a/arch/loongarch/configs/loongson3_defconfig
@@ -96,7 +96,6 @@ CONFIG_ZPOOL=y
CONFIG_ZSWAP=y
CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD=y
CONFIG_ZBUD=y
-CONFIG_Z3FOLD=y
CONFIG_ZSMALLOC=m
# CONFIG_COMPAT_BRK is not set
CONFIG_MEMORY_HOTPLUG=y
--- a/arch/powerpc/configs/ppc64_defconfig~mm-z3fold-deprecate-config_z3fold
+++ a/arch/powerpc/configs/ppc64_defconfig
@@ -81,7 +81,6 @@ CONFIG_MODULE_SIG_SHA512=y
CONFIG_PARTITION_ADVANCED=y
CONFIG_BINFMT_MISC=m
CONFIG_ZSWAP=y
-CONFIG_Z3FOLD=y
CONFIG_ZSMALLOC=y
# CONFIG_SLAB_MERGE_DEFAULT is not set
CONFIG_SLAB_FREELIST_RANDOM=y
--- a/mm/Kconfig~mm-z3fold-deprecate-config_z3fold
+++ a/mm/Kconfig
@@ -146,12 +146,15 @@ config ZSWAP_ZPOOL_DEFAULT_ZBUD
help
Use the zbud allocator as the default allocator.
-config ZSWAP_ZPOOL_DEFAULT_Z3FOLD
- bool "z3fold"
- select Z3FOLD
+config ZSWAP_ZPOOL_DEFAULT_Z3FOLD_DEPRECATED
+ bool "z3foldi (DEPRECATED)"
+ select Z3FOLD_DEPRECATED
help
Use the z3fold allocator as the default allocator.
+ Deprecated and scheduled for removal in a few cycles,
+ see CONFIG_Z3FOLD_DEPRECATED.
+
config ZSWAP_ZPOOL_DEFAULT_ZSMALLOC
bool "zsmalloc"
select ZSMALLOC
@@ -163,7 +166,7 @@ config ZSWAP_ZPOOL_DEFAULT
string
depends on ZSWAP
default "zbud" if ZSWAP_ZPOOL_DEFAULT_ZBUD
- default "z3fold" if ZSWAP_ZPOOL_DEFAULT_Z3FOLD
+ default "z3fold" if ZSWAP_ZPOOL_DEFAULT_Z3FOLD_DEPRECATED
default "zsmalloc" if ZSWAP_ZPOOL_DEFAULT_ZSMALLOC
default ""
@@ -177,15 +180,25 @@ config ZBUD
deterministic reclaim properties that make it preferable to a higher
density approach when reclaim will be used.
-config Z3FOLD
- tristate "3:1 compression allocator (z3fold)"
+config Z3FOLD_DEPRECATED
+ tristate "3:1 compression allocator (z3fold) (DEPRECATED)"
depends on ZSWAP
help
+ Deprecated and scheduled for removal in a few cycles. If you have
+ a good reason for using Z3FOLD over ZSMALLOC, please contact
+ linux-mm(a)kvack.org and the zswap maintainers.
+
A special purpose allocator for storing compressed pages.
It is designed to store up to three compressed pages per physical
page. It is a ZBUD derivative so the simplicity and determinism are
still there.
+config Z3FOLD
+ tristate
+ default y if Z3FOLD_DEPRECATED=y
+ default m if Z3FOLD_DEPRECATED=m
+ depends on Z3FOLD_DEPRECATED
+
config ZSMALLOC
tristate
prompt "N:1 compression allocator (zsmalloc)" if (ZSWAP || ZRAM)
_
The quilt patch titled
Subject: mm/huge_memory: ensure huge_zero_folio won't have large_rmappable flag set
has been removed from the -mm tree. Its filename was
mm-huge_memory-ensure-huge_zero_folio-wont-have-large_rmappable-flag-set.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/huge_memory: ensure huge_zero_folio won't have large_rmappable flag set
Date: Sat, 14 Sep 2024 09:53:06 +0800
Ensure huge_zero_folio won't have the large_rmappable flag set, so that
it can be reported correctly as thp,zero through stable_page_flags().
Link: https://lkml.kernel.org/r/20240914015306.3656791-1-linmiaohe@huawei.com
Fixes: 5691753d73a2 ("mm: convert huge_zero_page to huge_zero_folio")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/huge_memory.c~mm-huge_memory-ensure-huge_zero_folio-wont-have-large_rmappable-flag-set
+++ a/mm/huge_memory.c
@@ -220,6 +220,8 @@ retry:
count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
return false;
}
+ /* Ensure zero folio won't have large_rmappable flag set. */
+ folio_clear_large_rmappable(zero_folio);
preempt_disable();
if (cmpxchg(&huge_zero_folio, NULL, zero_folio)) {
preempt_enable();
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
mm-memory-failure-fix-vm_bug_on_pagepagepoisonedpage-when-unpoison-memory.patch
The quilt patch titled
Subject: mm/hugetlb.c: fix UAF of vma in hugetlb fault pathway
has been removed from the -mm tree. Its filename was
mm-hugetlbc-fix-uaf-of-vma-in-hugetlb-fault-pathway.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Vishal Moola (Oracle)" <vishal.moola(a)gmail.com>
Subject: mm/hugetlb.c: fix UAF of vma in hugetlb fault pathway
Date: Sat, 14 Sep 2024 12:41:19 -0700
Syzbot reports a UAF in hugetlb_fault(). This happens because
vmf_anon_prepare() could drop the per-VMA lock and allow the current VMA
to be freed before hugetlb_vma_unlock_read() is called.
We can fix this by using a modified version of vmf_anon_prepare() that
doesn't release the VMA lock on failure, and then releasing the lock
ourselves after hugetlb_vma_unlock_read().
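Schematically, the broken ordering being fixed was (an illustrative
condensation, not verbatim kernel code):

	ret = vmf_anon_prepare(vmf);	/* on failure this also did
					 * vma_end_read(vma), so another
					 * thread could free the VMA */
	...
	hugetlb_vma_unlock_read(vma);	/* UAF: vma may already be gone */

With __vmf_anon_prepare() the per-VMA lock survives the failure path,
and the VM_FAULT_RETRY checks added below drop it only after the last
use of vma.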
Link: https://lkml.kernel.org/r/20240914194243.245-2-vishal.moola@gmail.com
Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
Reported-by: syzbot+2dab93857ee95f2eeb08(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/
Signed-off-by: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlbc-fix-uaf-of-vma-in-hugetlb-fault-pathway
+++ a/mm/hugetlb.c
@@ -6048,7 +6048,7 @@ retry_avoidcopy:
* When the original hugepage is shared one, it does not have
* anon_vma prepared.
*/
- ret = vmf_anon_prepare(vmf);
+ ret = __vmf_anon_prepare(vmf);
if (unlikely(ret))
goto out_release_all;
@@ -6247,7 +6247,7 @@ static vm_fault_t hugetlb_no_page(struct
}
if (!(vma->vm_flags & VM_MAYSHARE)) {
- ret = vmf_anon_prepare(vmf);
+ ret = __vmf_anon_prepare(vmf);
if (unlikely(ret))
goto out;
}
@@ -6378,6 +6378,14 @@ static vm_fault_t hugetlb_no_page(struct
folio_unlock(folio);
out:
hugetlb_vma_unlock_read(vma);
+
+ /*
+ * We must check to release the per-VMA lock. __vmf_anon_prepare() is
+ * the only way ret can be set to VM_FAULT_RETRY.
+ */
+ if (unlikely(ret & VM_FAULT_RETRY))
+ vma_end_read(vma);
+
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
return ret;
@@ -6599,6 +6607,14 @@ out_ptl:
}
out_mutex:
hugetlb_vma_unlock_read(vma);
+
+ /*
+ * We must check to release the per-VMA lock. __vmf_anon_prepare() in
+ * hugetlb_wp() is the only way ret can be set to VM_FAULT_RETRY.
+ */
+ if (unlikely(ret & VM_FAULT_RETRY))
+ vma_end_read(vma);
+
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
/*
* Generally it's safe to hold refcount during waiting page lock. But
_
The quilt patch titled
Subject: mm: change vmf_anon_prepare() to __vmf_anon_prepare()
has been removed from the -mm tree. Its filename was
mm-change-vmf_anon_prepare-to-__vmf_anon_prepare.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Vishal Moola (Oracle)" <vishal.moola(a)gmail.com>
Subject: mm: change vmf_anon_prepare() to __vmf_anon_prepare()
Date: Sat, 14 Sep 2024 12:41:18 -0700
Some callers of vmf_anon_prepare() may not want us to release the per-VMA
lock ourselves. Rename vmf_anon_prepare() to __vmf_anon_prepare() and let
the callers drop the lock when desired.
Also, make vmf_anon_prepare() a wrapper that releases the per-VMA lock
itself for any callers that don't care.
This is in preparation for fixing the following bug reported by syzbot:
https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/
Link: https://lkml.kernel.org/r/20240914194243.245-1-vishal.moola@gmail.com
Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
Reported-by: syzbot+2dab93857ee95f2eeb08(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/
Signed-off-by: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/internal.h | 11 ++++++++++-
mm/memory.c | 8 +++-----
2 files changed, 13 insertions(+), 6 deletions(-)
--- a/mm/internal.h~mm-change-vmf_anon_prepare-to-__vmf_anon_prepare
+++ a/mm/internal.h
@@ -310,7 +310,16 @@ static inline void wake_throttle_isolate
wake_up(wqh);
}
-vm_fault_t vmf_anon_prepare(struct vm_fault *vmf);
+vm_fault_t __vmf_anon_prepare(struct vm_fault *vmf);
+static inline vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
+{
+ vm_fault_t ret = __vmf_anon_prepare(vmf);
+
+ if (unlikely(ret & VM_FAULT_RETRY))
+ vma_end_read(vmf->vma);
+ return ret;
+}
+
vm_fault_t do_swap_page(struct vm_fault *vmf);
void folio_rotate_reclaimable(struct folio *folio);
bool __folio_end_writeback(struct folio *folio);
--- a/mm/memory.c~mm-change-vmf_anon_prepare-to-__vmf_anon_prepare
+++ a/mm/memory.c
@@ -3259,7 +3259,7 @@ static inline vm_fault_t vmf_can_call_fa
}
/**
- * vmf_anon_prepare - Prepare to handle an anonymous fault.
+ * __vmf_anon_prepare - Prepare to handle an anonymous fault.
* @vmf: The vm_fault descriptor passed from the fault handler.
*
* When preparing to insert an anonymous page into a VMA from a
@@ -3273,7 +3273,7 @@ static inline vm_fault_t vmf_can_call_fa
* Return: 0 if fault handling can proceed. Any other value should be
* returned to the caller.
*/
-vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
+vm_fault_t __vmf_anon_prepare(struct vm_fault *vmf)
{
struct vm_area_struct *vma = vmf->vma;
vm_fault_t ret = 0;
@@ -3281,10 +3281,8 @@ vm_fault_t vmf_anon_prepare(struct vm_fa
if (likely(vma->anon_vma))
return 0;
if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
- if (!mmap_read_trylock(vma->vm_mm)) {
- vma_end_read(vma);
+ if (!mmap_read_trylock(vma->vm_mm))
return VM_FAULT_RETRY;
- }
}
if (__anon_vma_prepare(vma))
ret = VM_FAULT_OOM;
_
The quilt patch titled
Subject: resource: fix region_intersects() vs add_memory_driver_managed()
has been removed from the -mm tree. Its filename was
resource-fix-region_intersects-vs-add_memory_driver_managed.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Huang Ying <ying.huang(a)intel.com>
Subject: resource: fix region_intersects() vs add_memory_driver_managed()
Date: Fri, 6 Sep 2024 11:07:11 +0800
On a system with CXL memory, the resource tree (/proc/iomem) related to
CXL memory may look something like the following.
490000000-50fffffff : CXL Window 0
490000000-50fffffff : region0
490000000-50fffffff : dax0.0
490000000-50fffffff : System RAM (kmem)
This is because drivers/dax/kmem.c calls add_memory_driver_managed()
when onlining CXL memory, which makes "System RAM (kmem)" a descendant
of "CXL Window X". This confuses region_intersects(), which expects all
"System RAM" resources to be at the top level of iomem_resource, and can
lead to bugs.
For example, when the following command line is executed to write some
memory in the CXL memory range via /dev/mem,
$ dd if=data of=/dev/mem bs=$((1 << 10)) seek=$((0x490000000 >> 10)) count=1
dd: error writing '/dev/mem': Bad address
1+0 records in
0+0 records out
0 bytes copied, 0.0283507 s, 0.0 kB/s
the command fails as expected. However, the error code is wrong: it
should be "Operation not permitted" instead of "Bad address". More
seriously, the /dev/mem permission check in devmem_is_allowed() passes
incorrectly. Although the access is prevented later because ioremap()
isn't allowed to map System RAM, it is a potential security issue.
During command execution, the following warning is reported in the
kernel log for calling ioremap() on System RAM.
ioremap on RAM at 0x0000000490000000 - 0x0000000490000fff
WARNING: CPU: 2 PID: 416 at arch/x86/mm/ioremap.c:216 __ioremap_caller.constprop.0+0x131/0x35d
Call Trace:
memremap+0xcb/0x184
xlate_dev_mem_ptr+0x25/0x2f
write_mem+0x94/0xfb
vfs_write+0x128/0x26d
ksys_write+0xac/0xfe
do_syscall_64+0x9a/0xfd
entry_SYSCALL_64_after_hwframe+0x4b/0x53
The details of the command execution are as follows. In the above
resource tree, "System RAM" is a descendant of "CXL Window 0" instead of
a top-level resource. So, region_intersects() incorrectly reports no
System RAM resources in the CXL memory region, because it only checks
top-level resources. Consequently, devmem_is_allowed() incorrectly
returns 1 (allow access via /dev/mem) for the CXL memory region.
Fortunately, ioremap() doesn't allow mapping System RAM and rejects the
access.
So, region_intersects() needs to be fixed to work correctly with a
resource tree where "System RAM" is not at the top level, as above. To
fix it, when an unmatched resource is found at the top level, continue
searching for matched resources among its descendants, so that no
matched resource in the resource tree is missed anymore.
In the new implementation, an example resource tree
|------------- "CXL Window 0" ------------|
|-- "System RAM" --|
will behave similarly to the following fake resource tree for
region_intersects(, IORESOURCE_SYSTEM_RAM, ),
|-- "System RAM" --||-- "CXL Window 0a" --|
Where "CXL Window 0a" is part of the original "CXL Window 0" that
isn't covered by "System RAM".
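Concretely, with the example tree above, a query against the CXL range
(the constants are illustrative, taken from the /proc/iomem listing
above) now behaves as follows:

	/* 0x490000000 lies inside "CXL Window 0" and its
	 * "System RAM (kmem)" descendant */
	int rc = region_intersects(0x490000000, SZ_4K,
				   IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE);
	/* before the fix: REGION_DISJOINT, because the descendant
	 * "System RAM" was missed; after: REGION_INTERSECTS */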
Link: https://lkml.kernel.org/r/20240906030713.204292-2-ying.huang@intel.com
Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM")
Signed-off-by: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron(a)huawei.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Alison Schofield <alison.schofield(a)intel.com>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/resource.c | 58 +++++++++++++++++++++++++++++++++++++-------
1 file changed, 50 insertions(+), 8 deletions(-)
--- a/kernel/resource.c~resource-fix-region_intersects-vs-add_memory_driver_managed
+++ a/kernel/resource.c
@@ -540,20 +540,62 @@ static int __region_intersects(struct re
size_t size, unsigned long flags,
unsigned long desc)
{
- struct resource res;
+ resource_size_t ostart, oend;
int type = 0; int other = 0;
- struct resource *p;
+ struct resource *p, *dp;
+ bool is_type, covered;
+ struct resource res;
res.start = start;
res.end = start + size - 1;
for (p = parent->child; p ; p = p->sibling) {
- bool is_type = (((p->flags & flags) == flags) &&
- ((desc == IORES_DESC_NONE) ||
- (desc == p->desc)));
-
- if (resource_overlaps(p, &res))
- is_type ? type++ : other++;
+ if (!resource_overlaps(p, &res))
+ continue;
+ is_type = (p->flags & flags) == flags &&
+ (desc == IORES_DESC_NONE || desc == p->desc);
+ if (is_type) {
+ type++;
+ continue;
+ }
+ /*
+ * Continue to search in descendant resources as if the
+ * matched descendant resources cover some ranges of 'p'.
+ *
+ * |------------- "CXL Window 0" ------------|
+ * |-- "System RAM" --|
+ *
+ * will behave similar as the following fake resource
+ * tree when searching "System RAM".
+ *
+ * |-- "System RAM" --||-- "CXL Window 0a" --|
+ */
+ covered = false;
+ ostart = max(res.start, p->start);
+ oend = min(res.end, p->end);
+ for_each_resource(p, dp, false) {
+ if (!resource_overlaps(dp, &res))
+ continue;
+ is_type = (dp->flags & flags) == flags &&
+ (desc == IORES_DESC_NONE || desc == dp->desc);
+ if (is_type) {
+ type++;
+ /*
+ * Range from 'ostart' to 'dp->start'
+ * isn't covered by matched resource.
+ */
+ if (dp->start > ostart)
+ break;
+ if (dp->end >= oend) {
+ covered = true;
+ break;
+ }
+ /* Remove covered range */
+ ostart = max(ostart, dp->end + 1);
+ }
+ }
+ if (!covered)
+ other++;
}
if (type == 0)
_
Patches currently in -mm which might be from ying.huang(a)intel.com are
resource-make-alloc_free_mem_region-works-for-iomem_resource.patch
resource-kunit-add-test-case-for-region_intersects.patch