On Sat, Dec 14, 2024 at 10:56:13AM +0000, Marc Zyngier wrote:
[+ Mark] On Sat, 14 Dec 2024 00:52:08 +0000, Mark Brown broonie@kernel.org wrote:
In commit 892f7237b3ff ("arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}") we moved access to ZCR, SMCR and SMIDR later in the boot process in order to ensure that we don't attempt to interact with them if SVE or SME is disabled on the command line. Unfortunately when initialising the boot CPU in init_cpu_features() we work on a copy of the struct cpuinfo_arm64 for the boot CPU used only during boot, not the percpu copy used by the sysfs code.
Fix this by moving the handling for SMIDR_EL1 for the boot CPU to cpuinfo_store_boot_cpu() so it can operate on the percpu copy of the data. This reduces the potential for error that could come from having both the percpu and boot CPU copies in init_cpu_features().
This issue wasn't apparent when testing on emulated platforms that do not report values in this ID register.
Fixes: 892f7237b3ff ("arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}") Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org
arch/arm64/kernel/cpufeature.c | 6 ------ arch/arm64/kernel/cpuinfo.c | 11 +++++++++++ 2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 6ce71f444ed84f9056196bb21bbfac61c9687e30..b88102fd2c20f77e25af6df513fda09a484e882e 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1167,12 +1167,6 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info) id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1))) { unsigned long cpacr = cpacr_save_enable_kernel_sme();
/*
* We mask out SMPS since even if the hardware
* supports priorities the kernel does not at present
* and we block access to them.
*/
vec_init_vq_map(ARM64_VEC_SME);info->reg_smidr = read_cpuid(SMIDR_EL1) & ~SMIDR_EL1_SMPS;
cpacr_restore(cpacr); diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index d79e88fccdfce427507e7a34c5959ce6309cbd12..b7d403da71e5a01ed3943eb37e7a00af238771a2 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -499,4 +499,15 @@ void __init cpuinfo_store_boot_cpu(void) boot_cpu_data = *info; init_cpu_features(&boot_cpu_data);
- /* SMIDR_EL1 needs to be stored in the percpu data for sysfs */
- if (IS_ENABLED(CONFIG_ARM64_SME) &&
id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1))) {
/*
* We mask out SMPS since even if the hardware
* supports priorities the kernel does not at present
* and we block access to them.
*/
info->reg_smidr = read_cpuid(SMIDR_EL1) & ~SMIDR_EL1_SMPS;
- }
}
I don't understand the need to single out SMIDR_EL1. It seems to only make things even more fragile than they already are by adding more synchronisation phases.
Why isn't the following a good enough fix? It makes it plain that boot_cpu_data is only a copy of CPU0's initial boot state.
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index d79e88fccdfce..0cbb42fd48850 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -497,6 +497,6 @@ void __init cpuinfo_store_boot_cpu(void) struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0); __cpuinfo_store_cpu(info);
- init_cpu_features(info); boot_cpu_data = *info;
- init_cpu_features(&boot_cpu_data);
}
I think that change in isolation is fine, but I don't think that's the right fix.
I think that what we did in commit:
892f7237b3ff ("arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}")
... introduces an anti-pattern that'd be nice to avoid. That broke the existing split of __cpuinfo_store_cpu() and init_cpu_features(), where the former read the ID regs, and the latter set up the features *without* altering the copy of the ID regs that was read. i.e. init_cpu_features() shouldn't write to its info argument at all.
I understand that we have to do something as a bodge for broken FW which traps SME, but I'd much rather we did that within __cpuinfo_store_cpu().
Can we add something to check whether SME was disabled on the command line, and use that in __cpuinfo_store_cpu(), effectively reverting 892f7237b3ff?
Mark.