From: david regan <dregan(a)mail.com>
commit 36415a7964711822e63695ea67fede63979054d9 upstream
The brcmnand driver contains a bug in which if a page (example 2k byte)
is read from the parallel/ONFI NAND and within that page a subpage (512
byte) has correctable errors which is followed by a subpage with
uncorrectable errors, the page read will return the wrong status of
correctable (as opposed to the actual status of uncorrectable.)
The bug is in function brcmnand_read_by_pio where there is a check for
uncorrectable bits which will be preempted if a previous status for
correctable bits is detected.
The fix is to stop checking for bad bits only if we already have a bad
bits status.
Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
Signed-off-by: david regan <dregan(a)mail.com>
Reviewed-by: Florian Fainelli <f.fainelli(a)gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Link: https://lore.kernel.org/linux-mtd/trinity-478e0c09-9134-40e8-8f8c-31c371225…
[florian: make patch apply to 4.19]
Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
---
drivers/mtd/nand/raw/brcmnand/brcmnand.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
index 774ffa9e23f3..2b02f558b5e1 100644
--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
@@ -1637,7 +1637,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip,
mtd->oobsize / trans,
host->hwcfg.sector_size_1k);
- if (!ret) {
+ if (ret != -EBADMSG) {
*err_addr = brcmnand_read_reg(ctrl,
BRCMNAND_UNCORR_ADDR) |
((u64)(brcmnand_read_reg(ctrl,
--
2.25.1
Due to btrfs_item* helpers name changes in v5.17-rc1, here are two manual
backport patches. Already verified by running fstests.
Su Yue (2):
btrfs: tree-checker: check item_size for inode_item
btrfs: tree-checker: check item_size for dev_item
fs/btrfs/tree-checker.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
--
2.34.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 44cad52cc14ae10062f142ec16ede489bccf4469 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Mon, 14 Feb 2022 13:05:49 +0100
Subject: [PATCH] x86/ptrace: Fix xfpregs_set()'s incorrect xmm clearing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
xfpregs_set() handles 32-bit REGSET_XFP and 64-bit REGSET_FP. The actual
code treats these regsets as modern FX state (i.e. the beginning part of
XSTATE). The declarations of the regsets thought they were the legacy
i387 format. The code thought they were the 32-bit (no xmm8..15) variant
of XSTATE and, for good measure, made the high bits disappear by zeroing
the wrong part of the buffer. The latter broke ptrace, and everything
else confused anyone trying to understand the code. In particular, the
nonsense definitions of the regsets confused me when I wrote this code.
Clean this all up. Change the declarations to match reality (which
shouldn't change the generated code, let alone the ABI) and fix
xfpregs_set() to clear the correct bits and to only do so for 32-bit
callers.
Fixes: 6164331d15f7 ("x86/fpu: Rewrite xfpregs_set()")
Reported-by: Luís Ferreira <contact(a)lsferreira.net>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215524
Link: https://lore.kernel.org/r/YgpFnZpF01WwR8wU@zn.tnic
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 437d7c930c0b..75ffaef8c299 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -91,11 +91,9 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
const void *kbuf, const void __user *ubuf)
{
struct fpu *fpu = &target->thread.fpu;
- struct user32_fxsr_struct newstate;
+ struct fxregs_state newstate;
int ret;
- BUILD_BUG_ON(sizeof(newstate) != sizeof(struct fxregs_state));
-
if (!cpu_feature_enabled(X86_FEATURE_FXSR))
return -ENODEV;
@@ -116,9 +114,10 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
/* Copy the state */
memcpy(&fpu->fpstate->regs.fxsave, &newstate, sizeof(newstate));
- /* Clear xmm8..15 */
+ /* Clear xmm8..15 for 32-bit callers */
BUILD_BUG_ON(sizeof(fpu->__fpstate.regs.fxsave.xmm_space) != 16 * 16);
- memset(&fpu->fpstate->regs.fxsave.xmm_space[8], 0, 8 * 16);
+ if (in_ia32_syscall())
+ memset(&fpu->fpstate->regs.fxsave.xmm_space[8*4], 0, 8 * 16);
/* Mark FP and SSE as in use when XSAVE is enabled */
if (use_xsave())
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 6d2244c94799..8d2f2f995539 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -1224,7 +1224,7 @@ static struct user_regset x86_64_regsets[] __ro_after_init = {
},
[REGSET_FP] = {
.core_note_type = NT_PRFPREG,
- .n = sizeof(struct user_i387_struct) / sizeof(long),
+ .n = sizeof(struct fxregs_state) / sizeof(long),
.size = sizeof(long), .align = sizeof(long),
.active = regset_xregset_fpregs_active, .regset_get = xfpregs_get, .set = xfpregs_set
},
@@ -1271,7 +1271,7 @@ static struct user_regset x86_32_regsets[] __ro_after_init = {
},
[REGSET_XFP] = {
.core_note_type = NT_PRXFPREG,
- .n = sizeof(struct user32_fxsr_struct) / sizeof(u32),
+ .n = sizeof(struct fxregs_state) / sizeof(u32),
.size = sizeof(u32), .align = sizeof(u32),
.active = regset_xregset_fpregs_active, .regset_get = xfpregs_get, .set = xfpregs_set
},
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 467a726b754f474936980da793b4ff2ec3e382a7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Koutn=C3=BD?= <mkoutny(a)suse.com>
Date: Thu, 17 Feb 2022 17:11:28 +0100
Subject: [PATCH] cgroup-v1: Correct privileges check in release_agent writes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The idea is to check: a) the owning user_ns of cgroup_ns, b)
capabilities in init_user_ns.
The commit 24f600856418 ("cgroup-v1: Require capabilities to set
release_agent") got this wrong in the write handler of release_agent
since it checked user_ns of the opener (may be different from the owning
user_ns of cgroup_ns).
Secondly, to avoid possibly confused deputy, the capability of the
opener must be checked.
Fixes: 24f600856418 ("cgroup-v1: Require capabilities to set release_agent")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/stable/20220216121142.GB30035@blackbody.suse.cz/
Signed-off-by: Michal Koutný <mkoutny(a)suse.com>
Reviewed-by: Masami Ichikawa(CIP) <masami.ichikawa(a)cybertrust.co.jp>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 0e877dbcfeea..afc6c0e9c966 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -546,6 +546,7 @@ static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct cgroup *cgrp;
+ struct cgroup_file_ctx *ctx;
BUILD_BUG_ON(sizeof(cgrp->root->release_agent_path) < PATH_MAX);
@@ -553,8 +554,9 @@ static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
* Release agent gets called with all capabilities,
* require capabilities to set release agent.
*/
- if ((of->file->f_cred->user_ns != &init_user_ns) ||
- !capable(CAP_SYS_ADMIN))
+ ctx = of->priv;
+ if ((ctx->ns->user_ns != &init_user_ns) ||
+ !file_ns_capable(of->file, &init_user_ns, CAP_SYS_ADMIN))
return -EPERM;
cgrp = cgroup_kn_lock_live(of->kn, false);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 467a726b754f474936980da793b4ff2ec3e382a7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Koutn=C3=BD?= <mkoutny(a)suse.com>
Date: Thu, 17 Feb 2022 17:11:28 +0100
Subject: [PATCH] cgroup-v1: Correct privileges check in release_agent writes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The idea is to check: a) the owning user_ns of cgroup_ns, b)
capabilities in init_user_ns.
The commit 24f600856418 ("cgroup-v1: Require capabilities to set
release_agent") got this wrong in the write handler of release_agent
since it checked user_ns of the opener (may be different from the owning
user_ns of cgroup_ns).
Secondly, to avoid possibly confused deputy, the capability of the
opener must be checked.
Fixes: 24f600856418 ("cgroup-v1: Require capabilities to set release_agent")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/stable/20220216121142.GB30035@blackbody.suse.cz/
Signed-off-by: Michal Koutný <mkoutny(a)suse.com>
Reviewed-by: Masami Ichikawa(CIP) <masami.ichikawa(a)cybertrust.co.jp>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 0e877dbcfeea..afc6c0e9c966 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -546,6 +546,7 @@ static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct cgroup *cgrp;
+ struct cgroup_file_ctx *ctx;
BUILD_BUG_ON(sizeof(cgrp->root->release_agent_path) < PATH_MAX);
@@ -553,8 +554,9 @@ static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
* Release agent gets called with all capabilities,
* require capabilities to set release agent.
*/
- if ((of->file->f_cred->user_ns != &init_user_ns) ||
- !capable(CAP_SYS_ADMIN))
+ ctx = of->priv;
+ if ((ctx->ns->user_ns != &init_user_ns) ||
+ !file_ns_capable(of->file, &init_user_ns, CAP_SYS_ADMIN))
return -EPERM;
cgrp = cgroup_kn_lock_live(of->kn, false);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 467a726b754f474936980da793b4ff2ec3e382a7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Koutn=C3=BD?= <mkoutny(a)suse.com>
Date: Thu, 17 Feb 2022 17:11:28 +0100
Subject: [PATCH] cgroup-v1: Correct privileges check in release_agent writes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The idea is to check: a) the owning user_ns of cgroup_ns, b)
capabilities in init_user_ns.
The commit 24f600856418 ("cgroup-v1: Require capabilities to set
release_agent") got this wrong in the write handler of release_agent
since it checked user_ns of the opener (may be different from the owning
user_ns of cgroup_ns).
Secondly, to avoid possibly confused deputy, the capability of the
opener must be checked.
Fixes: 24f600856418 ("cgroup-v1: Require capabilities to set release_agent")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/stable/20220216121142.GB30035@blackbody.suse.cz/
Signed-off-by: Michal Koutný <mkoutny(a)suse.com>
Reviewed-by: Masami Ichikawa(CIP) <masami.ichikawa(a)cybertrust.co.jp>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 0e877dbcfeea..afc6c0e9c966 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -546,6 +546,7 @@ static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct cgroup *cgrp;
+ struct cgroup_file_ctx *ctx;
BUILD_BUG_ON(sizeof(cgrp->root->release_agent_path) < PATH_MAX);
@@ -553,8 +554,9 @@ static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
* Release agent gets called with all capabilities,
* require capabilities to set release agent.
*/
- if ((of->file->f_cred->user_ns != &init_user_ns) ||
- !capable(CAP_SYS_ADMIN))
+ ctx = of->priv;
+ if ((ctx->ns->user_ns != &init_user_ns) ||
+ !file_ns_capable(of->file, &init_user_ns, CAP_SYS_ADMIN))
return -EPERM;
cgrp = cgroup_kn_lock_live(of->kn, false);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 467a726b754f474936980da793b4ff2ec3e382a7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Koutn=C3=BD?= <mkoutny(a)suse.com>
Date: Thu, 17 Feb 2022 17:11:28 +0100
Subject: [PATCH] cgroup-v1: Correct privileges check in release_agent writes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The idea is to check: a) the owning user_ns of cgroup_ns, b)
capabilities in init_user_ns.
The commit 24f600856418 ("cgroup-v1: Require capabilities to set
release_agent") got this wrong in the write handler of release_agent
since it checked user_ns of the opener (may be different from the owning
user_ns of cgroup_ns).
Secondly, to avoid possibly confused deputy, the capability of the
opener must be checked.
Fixes: 24f600856418 ("cgroup-v1: Require capabilities to set release_agent")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/stable/20220216121142.GB30035@blackbody.suse.cz/
Signed-off-by: Michal Koutný <mkoutny(a)suse.com>
Reviewed-by: Masami Ichikawa(CIP) <masami.ichikawa(a)cybertrust.co.jp>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 0e877dbcfeea..afc6c0e9c966 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -546,6 +546,7 @@ static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct cgroup *cgrp;
+ struct cgroup_file_ctx *ctx;
BUILD_BUG_ON(sizeof(cgrp->root->release_agent_path) < PATH_MAX);
@@ -553,8 +554,9 @@ static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
* Release agent gets called with all capabilities,
* require capabilities to set release agent.
*/
- if ((of->file->f_cred->user_ns != &init_user_ns) ||
- !capable(CAP_SYS_ADMIN))
+ ctx = of->priv;
+ if ((ctx->ns->user_ns != &init_user_ns) ||
+ !file_ns_capable(of->file, &init_user_ns, CAP_SYS_ADMIN))
return -EPERM;
cgrp = cgroup_kn_lock_live(of->kn, false);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 05c7b7a92cc87ff8d7fde189d0fade250697573c Mon Sep 17 00:00:00 2001
From: Zhang Qiao <zhangqiao22(a)huawei.com>
Date: Fri, 21 Jan 2022 18:12:10 +0800
Subject: [PATCH] cgroup/cpuset: Fix a race between cpuset_attach() and cpu
hotplug
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
As previously discussed(https://lkml.org/lkml/2022/1/20/51),
cpuset_attach() is affected with similar cpu hotplug race,
as follow scenario:
cpuset_attach() cpu hotplug
--------------------------- ----------------------
down_write(cpuset_rwsem)
guarantee_online_cpus() // (load cpus_attach)
sched_cpu_deactivate
set_cpu_active()
// will change cpu_active_mask
set_cpus_allowed_ptr(cpus_attach)
__set_cpus_allowed_ptr_locked()
// (if the intersection of cpus_attach and
cpu_active_mask is empty, will return -EINVAL)
up_write(cpuset_rwsem)
To avoid races such as described above, protect cpuset_attach() call
with cpu_hotplug_lock.
Fixes: be367d099270 ("cgroups: let ss->can_attach and ss->attach do whole threadgroups at a time")
Cc: stable(a)vger.kernel.org # v2.6.32+
Reported-by: Zhao Gongyi <zhaogongyi(a)huawei.com>
Signed-off-by: Zhang Qiao <zhangqiao22(a)huawei.com>
Acked-by: Waiman Long <longman(a)redhat.com>
Reviewed-by: Michal Koutný <mkoutny(a)suse.com>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 4c7254e8f49a..97c53f3cc917 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2289,6 +2289,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
cgroup_taskset_first(tset, &css);
cs = css_cs(css);
+ cpus_read_lock();
percpu_down_write(&cpuset_rwsem);
guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
@@ -2342,6 +2343,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
wake_up(&cpuset_attach_wq);
percpu_up_write(&cpuset_rwsem);
+ cpus_read_unlock();
}
/* The various types of files and directories in a cpuset file system */
When a THP is present in the page cache, we can return it several times,
leading to userspace seeing the same data repeatedly if doing a read()
that crosses a 64-page boundary. This is probably not a security issue
(since the data all comes from the same file), but it can be interpreted
as a transient data corruption issue. Fortunately, it is very rare as
it can only occur when CONFIG_READ_ONLY_THP_FOR_FS is enabled, and it can
only happen to executables. We don't often call read() on executables.
This bug is fixed differently in v5.17 by commit 6b24ca4a1a8d
("mm: Use multi-index entries in the page cache"). That commit is
unsuitable for backporting, so fix this in the clearest way. It
sacrifices a little performance for clarity, but this should never
be a performance path in these kernel versions.
Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read")
Cc: stable(a)vger.kernel.org # v5.15, v5.16
Link: https://lore.kernel.org/r/df3b5d1c-a36b-2c73-3e27-99e74983de3a@suse.cz/
Analyzed-by: Adam Majer <amajer(a)suse.com>
Analyzed-by: Dirk Mueller <dmueller(a)suse.com>
Bisected-by: Takashi Iwai <tiwai(a)suse.de>
Reported-by: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
---
mm/filemap.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 82a17c35eb96..1293c3409e42 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2354,8 +2354,12 @@ static void filemap_get_read_batch(struct address_space *mapping,
break;
if (PageReadahead(head))
break;
- xas.xa_index = head->index + thp_nr_pages(head) - 1;
- xas.xa_offset = (xas.xa_index >> xas.xa_shift) & XA_CHUNK_MASK;
+ if (PageHead(head)) {
+ xas_set(&xas, head->index + thp_nr_pages(head));
+ /* Handle wrap correctly */
+ if (xas.xa_index - 1 >= max)
+ break;
+ }
continue;
put_page:
put_page(head);
--
2.34.1
SoCs such as the Ampere Altra define clusters but have no shared
processor-side cache. As of v5.16 with CONFIG_SCHED_CLUSTER and
CONFIG_SCHED_MC, build_sched_domain() will BUG() with:
BUG: arch topology borken
the CLS domain not a subset of the MC domain
for each CPU (160 times for a 2 socket 80 core Altra system). The MC
level cpu mask is then extended to that of the CLS child, and is later
removed entirely as redundant.
This change detects when all cpu_coregroup_mask weights=1 and uses an
alternative sched_domain_topology equivalent to the default if
CONFIG_SCHED_MC were disabled.
The final resulting sched domain topology is unchanged with or without
CONFIG_SCHED_CLUSTER, and the BUG is avoided:
For CPU0:
With CLS:
CLS [0-1]
DIE [0-79]
NUMA [0-159]
Without CLS:
DIE [0-79]
NUMA [0-159]
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: Barry Song <song.bao.hua(a)hisilicon.com>
Cc: Valentin Schneider <valentin.schneider(a)arm.com>
Cc: D. Scott Phillips <scott(a)os.amperecomputing.com>
Cc: Ilkka Koskinen <ilkka(a)os.amperecomputing.com>
Cc: <stable(a)vger.kernel.org> # 5.16.x
Signed-off-by: Darren Hart <darren(a)os.amperecomputing.com>
---
arch/arm64/kernel/smp.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 27df5c1e6baa..0a78ac5c8830 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -715,9 +715,22 @@ void __init smp_init_cpus(void)
}
}
+static struct sched_domain_topology_level arm64_no_mc_topology[] = {
+#ifdef CONFIG_SCHED_SMT
+ { cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
+#endif
+
+#ifdef CONFIG_SCHED_CLUSTER
+ { cpu_clustergroup_mask, cpu_cluster_flags, SD_INIT_NAME(CLS) },
+#endif
+ { cpu_cpu_mask, SD_INIT_NAME(DIE) },
+ { NULL, },
+};
+
void __init smp_prepare_cpus(unsigned int max_cpus)
{
const struct cpu_operations *ops;
+ bool use_no_mc_topology = true;
int err;
unsigned int cpu;
unsigned int this_cpu;
@@ -758,6 +771,25 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
set_cpu_present(cpu, true);
numa_store_cpu_info(cpu);
+
+ /*
+ * Only use no_mc topology if all cpu_coregroup_mask weights=1
+ */
+ if (cpumask_weight(cpu_coregroup_mask(cpu)) > 1)
+ use_no_mc_topology = false;
+ }
+
+ /*
+ * SoCs with no shared processor-side cache will have cpu_coregroup_mask
+ * weights=1. If they also define clusters with cpu_clustergroup_mask
+ * weights > 1, build_sched_domain() will trigger a BUG as the CLS
+ * cpu_mask will not be a subset of MC. It will extend the MC cpu_mask
+ * to match CLS, and later discard the MC level. Avoid the bug by using
+ * a topology without the MC if the cpu_coregroup_mask weights=1.
+ */
+ if (use_no_mc_topology) {
+ pr_info("cpu_coregroup_mask weights=1, skipping MC topology level");
+ set_sched_topology(arm64_no_mc_topology);
}
}
--
2.31.1