Hi Greg,
Here is the state of the current v5.4 backport. Now that the 32bit code has been merged, it doesn't conflict when KVM's shared 32bit/64bit code needs to use these constants.
I've fixed the two issues that were reported against the v5.10 backport.
I had a go at bringing all the pre-requisites in to add proton-pack.c to v5.4. Its currently 39 patches: https://git.gitlab.arm.com/linux-arm/linux-jm.git /bhb/alternative_backport/UNTESTED/v5.4.183 (or for web browsers: https://gitlab.arm.com/linux-arm/linux-jm/-/commits/bhb/alternative_backport... )
I've not managed to bring b881cdce77b4 "KVM: arm64: Allocate hyp vectors statically" in, as that depends on the hypervisor's per-cpu support. Its this patch that means those 'smccc_wa' templates are needed. I estimate it would be 60 patches in total if we go this way, I don't think its a good idea: All this still has to work around 541ad0150ca4 ("arm: Remove 32bit KVM host support") and 9ed24f4b712b ("KVM: arm64: Move virt/kvm/arm to arch/arm64") being missing. I think this approach creates more problems than it solves as the files have to be in different places because of 32bit. It also creates a significantly larger testing problem: I'm not looking forward to working out how to test KVM guest migration over the variant-4 KVM ABI changes in 29e8910a566a.
This version of the backport still adds the mitigation management code to cpu_errata.c, because that is where this stuff lived in v5.4.
If you prefer, I can try adding proton-pack.c as a new empty file. I think this would only confuse matters as the line-numbers would never match, and there is an interaction with the spectre-v2 mitigations that live in cpu_errata.c.
There is one patch not present in mainline 'KVM: arm64: Add templtes for BHB mitigation sequences'.
Thanks,
James
Anshuman Khandual (1): arm64: Add Cortex-X2 CPU part definition
James Morse (18): arm64: entry.S: Add ventry overflow sanity checks arm64: entry: Make the trampoline cleanup optional arm64: entry: Free up another register on kpti's tramp_exit path arm64: entry: Move the trampoline data page before the text page arm64: entry: Allow tramp_alias to access symbols after the 4K boundary arm64: entry: Don't assume tramp_vectors is the start of the vectors arm64: entry: Move trampoline macros out of ifdef'd section arm64: entry: Make the kpti trampoline's kpti sequence optional arm64: entry: Allow the trampoline text to occupy multiple pages arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations arm64: entry: Add vectors that have the bhb mitigation sequences arm64: entry: Add macro for reading symbol addresses from the trampoline arm64: Add percpu vectors for EL1 arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2 KVM: arm64: Add templates for BHB mitigation sequences arm64: Mitigate spectre style branch history side channels KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated arm64: Use the clearbhb instruction in mitigations
Joey Gouly (1): arm64: add ID_AA64ISAR2_EL1 sys register
Rob Herring (1): arm64: Add part number for Arm Cortex-A77
Suzuki K Poulose (1): arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
arch/arm/include/asm/kvm_host.h | 7 + arch/arm/include/uapi/asm/kvm.h | 6 + arch/arm64/Kconfig | 9 + arch/arm64/include/asm/assembler.h | 33 +++ arch/arm64/include/asm/cpu.h | 1 + arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/cpufeature.h | 40 +++ arch/arm64/include/asm/cputype.h | 16 ++ arch/arm64/include/asm/fixmap.h | 6 +- arch/arm64/include/asm/kvm_host.h | 5 + arch/arm64/include/asm/kvm_mmu.h | 6 +- arch/arm64/include/asm/mmu.h | 8 +- arch/arm64/include/asm/sections.h | 5 + arch/arm64/include/asm/sysreg.h | 17 ++ arch/arm64/include/asm/vectors.h | 73 ++++++ arch/arm64/include/uapi/asm/kvm.h | 5 + arch/arm64/kernel/cpu_errata.c | 385 +++++++++++++++++++++++++++- arch/arm64/kernel/cpufeature.c | 21 ++ arch/arm64/kernel/cpuinfo.c | 1 + arch/arm64/kernel/entry.S | 213 +++++++++++---- arch/arm64/kernel/vmlinux.lds.S | 2 +- arch/arm64/kvm/hyp/hyp-entry.S | 64 +++++ arch/arm64/kvm/hyp/switch.c | 8 +- arch/arm64/kvm/sys_regs.c | 2 +- arch/arm64/mm/mmu.c | 12 +- include/linux/arm-smccc.h | 5 + virt/kvm/arm/psci.c | 34 ++- 27 files changed, 912 insertions(+), 75 deletions(-) create mode 100644 arch/arm64/include/asm/vectors.h
From: Rob Herring robh@kernel.org
commit 8a6b88e66233f5f1779b0a1342aa9dc030dddcd5 upstream.
Add the MIDR part number info for the Arm Cortex-A77.
Signed-off-by: Rob Herring robh@kernel.org Acked-by: Catalin Marinas catalin.marinas@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Link: https://lore.kernel.org/r/20201028182839.166037-1-robh@kernel.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/cputype.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index aca07c2f6e6e..b009d4813537 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -71,6 +71,7 @@ #define ARM_CPU_PART_CORTEX_A55 0xD05 #define ARM_CPU_PART_CORTEX_A76 0xD0B #define ARM_CPU_PART_NEOVERSE_N1 0xD0C +#define ARM_CPU_PART_CORTEX_A77 0xD0D
#define APM_CPU_PART_POTENZA 0x000
@@ -102,6 +103,7 @@ #define MIDR_CORTEX_A55 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A55) #define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76) #define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1) +#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77) #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
From: Suzuki K Poulose suzuki.poulose@arm.com
commit 2d0d656700d67239a57afaf617439143d8dac9be upstream.
Add the CPU Partnumbers for the new Arm designs.
Cc: Catalin Marinas catalin.marinas@arm.com Cc: Mark Rutland mark.rutland@arm.com Cc: Will Deacon will@kernel.org Acked-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com Link: https://lore.kernel.org/r/20211019163153.3692640-2-suzuki.poulose@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/cputype.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index b009d4813537..6a0acbec77ae 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -72,6 +72,8 @@ #define ARM_CPU_PART_CORTEX_A76 0xD0B #define ARM_CPU_PART_NEOVERSE_N1 0xD0C #define ARM_CPU_PART_CORTEX_A77 0xD0D +#define ARM_CPU_PART_CORTEX_A710 0xD47 +#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define APM_CPU_PART_POTENZA 0x000
@@ -104,6 +106,8 @@ #define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76) #define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1) #define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77) +#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710) +#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2) #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
From: Joey Gouly joey.gouly@arm.com
commit 9e45365f1469ef2b934f9d035975dbc9ad352116 upstream.
This is a new ID register, introduced in 8.7.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Will Deacon will@kernel.org Cc: Marc Zyngier maz@kernel.org Cc: James Morse james.morse@arm.com Cc: Alexandru Elisei alexandru.elisei@arm.com Cc: Suzuki K Poulose suzuki.poulose@arm.com Cc: Reiji Watanabe reijiw@google.com Acked-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20211210165432.8106-3-joey.gouly@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/cpu.h | 1 + arch/arm64/include/asm/sysreg.h | 15 +++++++++++++++ arch/arm64/kernel/cpufeature.c | 9 +++++++++ arch/arm64/kernel/cpuinfo.c | 1 + arch/arm64/kvm/sys_regs.c | 2 +- 5 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h index d72d995b7e25..85cc06380e93 100644 --- a/arch/arm64/include/asm/cpu.h +++ b/arch/arm64/include/asm/cpu.h @@ -25,6 +25,7 @@ struct cpuinfo_arm64 { u64 reg_id_aa64dfr1; u64 reg_id_aa64isar0; u64 reg_id_aa64isar1; + u64 reg_id_aa64isar2; u64 reg_id_aa64mmfr0; u64 reg_id_aa64mmfr1; u64 reg_id_aa64mmfr2; diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 9b68f1b3915e..50ed2747c572 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -165,6 +165,7 @@
#define SYS_ID_AA64ISAR0_EL1 sys_reg(3, 0, 0, 6, 0) #define SYS_ID_AA64ISAR1_EL1 sys_reg(3, 0, 0, 6, 1) +#define SYS_ID_AA64ISAR2_EL1 sys_reg(3, 0, 0, 6, 2)
#define SYS_ID_AA64MMFR0_EL1 sys_reg(3, 0, 0, 7, 0) #define SYS_ID_AA64MMFR1_EL1 sys_reg(3, 0, 0, 7, 1) @@ -575,6 +576,20 @@ #define ID_AA64ISAR1_GPI_NI 0x0 #define ID_AA64ISAR1_GPI_IMP_DEF 0x1
+/* id_aa64isar2 */ +#define ID_AA64ISAR2_RPRES_SHIFT 4 +#define ID_AA64ISAR2_WFXT_SHIFT 0 + +#define ID_AA64ISAR2_RPRES_8BIT 0x0 +#define ID_AA64ISAR2_RPRES_12BIT 0x1 +/* + * Value 0x1 has been removed from the architecture, and is + * reserved, but has not yet been removed from the ARM ARM + * as of ARM DDI 0487G.b. + */ +#define ID_AA64ISAR2_WFXT_NI 0x0 +#define ID_AA64ISAR2_WFXT_SUPPORTED 0x2 + /* id_aa64pfr0 */ #define ID_AA64PFR0_CSV3_SHIFT 60 #define ID_AA64PFR0_CSV2_SHIFT 56 diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index acdef8d76c64..6b77e4942495 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -150,6 +150,10 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { ARM64_FTR_END, };
+static const struct arm64_ftr_bits ftr_id_aa64isar2[] = { + ARM64_FTR_END, +}; + static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0), @@ -410,6 +414,7 @@ static const struct __ftr_reg_entry { /* Op1 = 0, CRn = 0, CRm = 6 */ ARM64_FTR_REG(SYS_ID_AA64ISAR0_EL1, ftr_id_aa64isar0), ARM64_FTR_REG(SYS_ID_AA64ISAR1_EL1, ftr_id_aa64isar1), + ARM64_FTR_REG(SYS_ID_AA64ISAR2_EL1, ftr_id_aa64isar2),
/* Op1 = 0, CRn = 0, CRm = 7 */ ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0), @@ -581,6 +586,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info) init_cpu_ftr_reg(SYS_ID_AA64DFR1_EL1, info->reg_id_aa64dfr1); init_cpu_ftr_reg(SYS_ID_AA64ISAR0_EL1, info->reg_id_aa64isar0); init_cpu_ftr_reg(SYS_ID_AA64ISAR1_EL1, info->reg_id_aa64isar1); + init_cpu_ftr_reg(SYS_ID_AA64ISAR2_EL1, info->reg_id_aa64isar2); init_cpu_ftr_reg(SYS_ID_AA64MMFR0_EL1, info->reg_id_aa64mmfr0); init_cpu_ftr_reg(SYS_ID_AA64MMFR1_EL1, info->reg_id_aa64mmfr1); init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2); @@ -704,6 +710,8 @@ void update_cpu_features(int cpu, info->reg_id_aa64isar0, boot->reg_id_aa64isar0); taint |= check_update_ftr_reg(SYS_ID_AA64ISAR1_EL1, cpu, info->reg_id_aa64isar1, boot->reg_id_aa64isar1); + taint |= check_update_ftr_reg(SYS_ID_AA64ISAR2_EL1, cpu, + info->reg_id_aa64isar2, boot->reg_id_aa64isar2);
/* * Differing PARange support is fine as long as all peripherals and @@ -838,6 +846,7 @@ static u64 __read_sysreg_by_encoding(u32 sys_id) read_sysreg_case(SYS_ID_AA64MMFR2_EL1); read_sysreg_case(SYS_ID_AA64ISAR0_EL1); read_sysreg_case(SYS_ID_AA64ISAR1_EL1); + read_sysreg_case(SYS_ID_AA64ISAR2_EL1);
read_sysreg_case(SYS_CNTFRQ_EL0); read_sysreg_case(SYS_CTR_EL0); diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 05933c065732..90b35011a22f 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -344,6 +344,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info) info->reg_id_aa64dfr1 = read_cpuid(ID_AA64DFR1_EL1); info->reg_id_aa64isar0 = read_cpuid(ID_AA64ISAR0_EL1); info->reg_id_aa64isar1 = read_cpuid(ID_AA64ISAR1_EL1); + info->reg_id_aa64isar2 = read_cpuid(ID_AA64ISAR2_EL1); info->reg_id_aa64mmfr0 = read_cpuid(ID_AA64MMFR0_EL1); info->reg_id_aa64mmfr1 = read_cpuid(ID_AA64MMFR1_EL1); info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1); diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index da649e90240c..a25f737dfa0b 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1454,7 +1454,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { /* CRm=6 */ ID_SANITISED(ID_AA64ISAR0_EL1), ID_SANITISED(ID_AA64ISAR1_EL1), - ID_UNALLOCATED(6,2), + ID_SANITISED(ID_AA64ISAR2_EL1), ID_UNALLOCATED(6,3), ID_UNALLOCATED(6,4), ID_UNALLOCATED(6,5),
From: Anshuman Khandual anshuman.khandual@arm.com
commit 72bb9dcb6c33cfac80282713c2b4f2b254cd24d1 upstream.
Add the CPU Partnumbers for the new Arm designs.
Cc: Will Deacon will@kernel.org Cc: Suzuki Poulose suzuki.poulose@arm.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com Link: https://lore.kernel.org/r/1642994138-25887-2-git-send-email-anshuman.khandua... Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/cputype.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index 6a0acbec77ae..e4394be47d35 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -73,6 +73,7 @@ #define ARM_CPU_PART_NEOVERSE_N1 0xD0C #define ARM_CPU_PART_CORTEX_A77 0xD0D #define ARM_CPU_PART_CORTEX_A710 0xD47 +#define ARM_CPU_PART_CORTEX_X2 0xD48 #define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define APM_CPU_PART_POTENZA 0x000 @@ -107,6 +108,7 @@ #define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1) #define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77) #define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710) +#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2) #define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2) #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
commit 4330e2c5c04c27bebf89d34e0bc14e6943413067 upstream.
Subsequent patches add even more code to the ventry slots. Ensure kernels that overflow a ventry slot don't get built.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index db137746c6fa..98991aa9d0b1 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -59,6 +59,7 @@
.macro kernel_ventry, el, label, regsize = 64 .align 7 +.Lventry_start@: #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 alternative_if ARM64_UNMAP_KERNEL_AT_EL0 .if \el == 0 @@ -116,6 +117,7 @@ alternative_else_nop_endif mrs x0, tpidrro_el0 #endif b el()\el()_\label +.org .Lventry_start@ + 128 // Did we overflow the ventry slot? .endm
.macro tramp_alias, dst, sym @@ -1080,6 +1082,7 @@ alternative_else_nop_endif add x30, x30, #(1b - tramp_vectors) isb ret +.org 1b + 128 // Did we overflow the ventry slot? .endm
.macro tramp_exit, regsize = 64
commit d739da1694a0eaef0358a42b76904b611539b77b upstream.
Subsequent patches will add additional sets of vectors that use the same tricks as the kpti vectors to reach the full-fat vectors. The full-fat vectors contain some cleanup for kpti that is patched in by alternatives when kpti is in use. Once there are additional vectors, the cleanup will be needed in more cases.
But on big/little systems, the cleanup would be harmful if no trampoline vector were in use. Instead of forcing CPUs that don't need a trampoline vector to use one, make the trampoline cleanup optional.
Entry at the top of the vectors will skip the cleanup. The trampoline vectors can then skip the first instruction, triggering the cleanup to run.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 98991aa9d0b1..a6dcd68ce7de 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -61,16 +61,20 @@ .align 7 .Lventry_start@: #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 -alternative_if ARM64_UNMAP_KERNEL_AT_EL0 .if \el == 0 + /* + * This must be the first instruction of the EL0 vector entries. It is + * skipped by the trampoline vectors, to trigger the cleanup. + */ + b .Lskip_tramp_vectors_cleanup@ .if \regsize == 64 mrs x30, tpidrro_el0 msr tpidrro_el0, xzr .else mov x30, xzr .endif +.Lskip_tramp_vectors_cleanup@: .endif -alternative_else_nop_endif #endif
sub sp, sp, #S_FRAME_SIZE @@ -1079,7 +1083,7 @@ alternative_if_not ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM prfm plil1strm, [x30, #(1b - tramp_vectors)] alternative_else_nop_endif msr vbar_el1, x30 - add x30, x30, #(1b - tramp_vectors) + add x30, x30, #(1b - tramp_vectors + 4) isb ret .org 1b + 128 // Did we overflow the ventry slot?
commit 03aff3a77a58b5b52a77e00537a42090ad57b80b upstream.
Kpti stashes x30 in far_el1 while it uses x30 for all its work.
Making the vectors a per-cpu data structure will require a second register.
Allow tramp_exit two registers before it unmaps the kernel, by leaving x30 on the stack, and stashing x29 in far_el1.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index a6dcd68ce7de..7e52b6991bf1 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -367,14 +367,16 @@ alternative_else_nop_endif ldp x24, x25, [sp, #16 * 12] ldp x26, x27, [sp, #16 * 13] ldp x28, x29, [sp, #16 * 14] - ldr lr, [sp, #S_LR] - add sp, sp, #S_FRAME_SIZE // restore sp
.if \el == 0 -alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0 +alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0 + ldr lr, [sp, #S_LR] + add sp, sp, #S_FRAME_SIZE // restore sp + eret +alternative_else_nop_endif #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 bne 5f - msr far_el1, x30 + msr far_el1, x29 tramp_alias x30, tramp_exit_native br x30 5: @@ -382,6 +384,8 @@ alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0 br x30 #endif .else + ldr lr, [sp, #S_LR] + add sp, sp, #S_FRAME_SIZE // restore sp eret .endif sb @@ -1092,10 +1096,12 @@ alternative_else_nop_endif .macro tramp_exit, regsize = 64 adr x30, tramp_vectors msr vbar_el1, x30 - tramp_unmap_kernel x30 + ldr lr, [sp, #S_LR] + tramp_unmap_kernel x29 .if \regsize == 64 - mrs x30, far_el1 + mrs x29, far_el1 .endif + add sp, sp, #S_FRAME_SIZE // restore sp eret sb .endm
commit c091fb6ae059cda563b2a4d93fdbc548ef34e1d6 upstream.
The trampoline code has a data page that holds the address of the vectors, which is unmapped when running in user-space. This ensures that with CONFIG_RANDOMIZE_BASE, the randomised address of the kernel can't be discovered until after the kernel has been mapped.
If the trampoline text page is extended to include multiple sets of vectors, it will be larger than a single page, making it tricky to find the data page without knowing the size of the trampoline text pages, which will vary with PAGE_SIZE.
Move the data page to appear before the text page. This allows the data page to be found without knowing the size of the trampoline text pages. 'tramp_vectors' is used to refer to the beginning of the .entry.tramp.text section, do that explicitly.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/fixmap.h | 2 +- arch/arm64/kernel/entry.S | 9 +++++++-- 2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h index f987b8a8f325..2e0977c7564c 100644 --- a/arch/arm64/include/asm/fixmap.h +++ b/arch/arm64/include/asm/fixmap.h @@ -63,8 +63,8 @@ enum fixed_addresses { #endif /* CONFIG_ACPI_APEI_GHES */
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 - FIX_ENTRY_TRAMP_DATA, FIX_ENTRY_TRAMP_TEXT, + FIX_ENTRY_TRAMP_DATA, #define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT)) #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ __end_of_permanent_fixed_addresses, diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 7e52b6991bf1..7822ecc0e165 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1061,6 +1061,11 @@ alternative_else_nop_endif */ .endm
+ .macro tramp_data_page dst + adr \dst, .entry.tramp.text + sub \dst, \dst, PAGE_SIZE + .endm + .macro tramp_ventry, regsize = 64 .align 7 1: @@ -1077,7 +1082,7 @@ alternative_else_nop_endif 2: tramp_map_kernel x30 #ifdef CONFIG_RANDOMIZE_BASE - adr x30, tramp_vectors + PAGE_SIZE + tramp_data_page x30 alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003 ldr x30, [x30] #else @@ -1228,7 +1233,7 @@ ENTRY(__sdei_asm_entry_trampoline) 1: str x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
#ifdef CONFIG_RANDOMIZE_BASE - adr x4, tramp_vectors + PAGE_SIZE + tramp_data_page x4 add x4, x4, #:lo12:__sdei_asm_trampoline_next_handler ldr x4, [x4] #else
commit 6c5bf79b69f911560fbf82214c0971af6e58e682 upstream.
Systems using kpti enter and exit the kernel through a trampoline mapping that is always mapped, even when the kernel is not. tramp_valias is a macro to find the address of a symbol in the trampoline mapping.
Adding extra sets of vectors will expand the size of the entry.tramp.text section to beyond 4K. tramp_valias will be unable to generate addresses for symbols beyond 4K as it uses the 12 bit immediate of the add instruction.
As there are now two registers available when tramp_alias is called, use the extra register to avoid the 4K limit of the 12 bit immediate.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 7822ecc0e165..3489edd57c51 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -124,9 +124,12 @@ .org .Lventry_start@ + 128 // Did we overflow the ventry slot? .endm
- .macro tramp_alias, dst, sym + .macro tramp_alias, dst, sym, tmp mov_q \dst, TRAMP_VALIAS - add \dst, \dst, #(\sym - .entry.tramp.text) + adr_l \tmp, \sym + add \dst, \dst, \tmp + adr_l \tmp, .entry.tramp.text + sub \dst, \dst, \tmp .endm
// This macro corrupts x0-x3. It is the caller's duty @@ -377,10 +380,10 @@ alternative_else_nop_endif #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 bne 5f msr far_el1, x29 - tramp_alias x30, tramp_exit_native + tramp_alias x30, tramp_exit_native, x29 br x30 5: - tramp_alias x30, tramp_exit_compat + tramp_alias x30, tramp_exit_compat, x29 br x30 #endif .else @@ -1362,7 +1365,7 @@ alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0 alternative_else_nop_endif
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 - tramp_alias dst=x5, sym=__sdei_asm_exit_trampoline + tramp_alias dst=x5, sym=__sdei_asm_exit_trampoline, tmp=x3 br x5 #endif ENDPROC(__sdei_asm_handler)
commit ed50da7764535f1e24432ded289974f2bf2b0c5a upstream.
The tramp_ventry macro uses tramp_vectors as the address of the vectors when calculating which ventry in the 'full fat' vectors to branch to.
While there is one set of tramp_vectors, this will be true. Adding multiple sets of vectors will break this assumption.
Move the generation of the vectors to a macro, and pass the start of the vectors as an argument to tramp_ventry.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 3489edd57c51..09c78d6781a7 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1069,7 +1069,7 @@ alternative_else_nop_endif sub \dst, \dst, PAGE_SIZE .endm
- .macro tramp_ventry, regsize = 64 + .macro tramp_ventry, vector_start, regsize .align 7 1: .if \regsize == 64 @@ -1092,10 +1092,10 @@ alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003 ldr x30, =vectors #endif alternative_if_not ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM - prfm plil1strm, [x30, #(1b - tramp_vectors)] + prfm plil1strm, [x30, #(1b - \vector_start)] alternative_else_nop_endif msr vbar_el1, x30 - add x30, x30, #(1b - tramp_vectors + 4) + add x30, x30, #(1b - \vector_start + 4) isb ret .org 1b + 128 // Did we overflow the ventry slot? @@ -1114,19 +1114,21 @@ alternative_else_nop_endif sb .endm
- .align 11 -ENTRY(tramp_vectors) + .macro generate_tramp_vector +.Lvector_start@: .space 0x400
- tramp_ventry - tramp_ventry - tramp_ventry - tramp_ventry + .rept 4 + tramp_ventry .Lvector_start@, 64 + .endr + .rept 4 + tramp_ventry .Lvector_start@, 32 + .endr + .endm
- tramp_ventry 32 - tramp_ventry 32 - tramp_ventry 32 - tramp_ventry 32 + .align 11 +ENTRY(tramp_vectors) + generate_tramp_vector END(tramp_vectors)
ENTRY(tramp_exit_native)
commit 13d7a08352a83ef2252aeb464a5e08dfc06b5dfd upstream.
The macros for building the kpti trampoline are all behind CONFIG_UNMAP_KERNEL_AT_EL0, and in a region that outputs to the .entry.tramp.text section.
Move the macros out so they can be used to generate other kinds of trampoline. Only the symbols need to be guarded by CONFIG_UNMAP_KERNEL_AT_EL0 and appear in the .entry.tramp.text section.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 09c78d6781a7..a2ec7ef24402 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1025,12 +1025,6 @@ ENDPROC(el0_svc)
.popsection // .entry.text
-#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 -/* - * Exception vectors trampoline. - */ - .pushsection ".entry.tramp.text", "ax" - // Move from tramp_pg_dir to swapper_pg_dir .macro tramp_map_kernel, tmp mrs \tmp, ttbr1_el1 @@ -1126,6 +1120,11 @@ alternative_else_nop_endif .endr .endm
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +/* + * Exception vectors trampoline. + */ + .pushsection ".entry.tramp.text", "ax" .align 11 ENTRY(tramp_vectors) generate_tramp_vector
commit c47e4d04ba0f1ea17353d85d45f611277507e07a upstream.
Spectre-BHB needs to add sequences to the vectors. Having one global set of vectors is a problem for big/little systems where the sequence is costly on cpus that are not vulnerable.
Making the vectors per-cpu in the style of KVM's bh_harden_hyp_vecs requires the vectors to be generated by macros.
Make the kpti re-mapping of the kernel optional, so the macros can be used without kpti.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index a2ec7ef24402..bb456f596c43 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1063,9 +1063,10 @@ alternative_else_nop_endif sub \dst, \dst, PAGE_SIZE .endm
- .macro tramp_ventry, vector_start, regsize + .macro tramp_ventry, vector_start, regsize, kpti .align 7 1: + .if \kpti == 1 .if \regsize == 64 msr tpidrro_el0, x30 // Restored in kernel_ventry .endif @@ -1088,9 +1089,14 @@ alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003 alternative_if_not ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM prfm plil1strm, [x30, #(1b - \vector_start)] alternative_else_nop_endif + msr vbar_el1, x30 + isb + .else + ldr x30, =vectors + .endif // \kpti == 1 + add x30, x30, #(1b - \vector_start + 4) - isb ret .org 1b + 128 // Did we overflow the ventry slot? .endm @@ -1108,15 +1114,15 @@ alternative_else_nop_endif sb .endm
- .macro generate_tramp_vector + .macro generate_tramp_vector, kpti .Lvector_start@: .space 0x400
.rept 4 - tramp_ventry .Lvector_start@, 64 + tramp_ventry .Lvector_start@, 64, \kpti .endr .rept 4 - tramp_ventry .Lvector_start@, 32 + tramp_ventry .Lvector_start@, 32, \kpti .endr .endm
@@ -1127,7 +1133,7 @@ alternative_else_nop_endif .pushsection ".entry.tramp.text", "ax" .align 11 ENTRY(tramp_vectors) - generate_tramp_vector + generate_tramp_vector kpti=1 END(tramp_vectors)
ENTRY(tramp_exit_native)
commit a9c406e6462ff14956d690de7bbe5131a5677dc9 upstream.
Adding a second set of vectors to .entry.tramp.text will make it larger than a single 4K page.
Allow the trampoline text to occupy up to three pages by adding two more fixmap slots. Previous changes to tramp_valias allowed it to reach beyond a single page.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/fixmap.h | 6 ++++-- arch/arm64/include/asm/sections.h | 5 +++++ arch/arm64/kernel/entry.S | 2 +- arch/arm64/kernel/vmlinux.lds.S | 2 +- arch/arm64/mm/mmu.c | 12 +++++++++--- 5 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h index 2e0977c7564c..928a96b9b161 100644 --- a/arch/arm64/include/asm/fixmap.h +++ b/arch/arm64/include/asm/fixmap.h @@ -63,9 +63,11 @@ enum fixed_addresses { #endif /* CONFIG_ACPI_APEI_GHES */
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 - FIX_ENTRY_TRAMP_TEXT, + FIX_ENTRY_TRAMP_TEXT3, + FIX_ENTRY_TRAMP_TEXT2, + FIX_ENTRY_TRAMP_TEXT1, FIX_ENTRY_TRAMP_DATA, -#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT)) +#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT1)) #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ __end_of_permanent_fixed_addresses,
diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h index 25a73aab438f..a75f2882cc7c 100644 --- a/arch/arm64/include/asm/sections.h +++ b/arch/arm64/include/asm/sections.h @@ -20,4 +20,9 @@ extern char __irqentry_text_start[], __irqentry_text_end[]; extern char __mmuoff_data_start[], __mmuoff_data_end[]; extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
+static inline size_t entry_tramp_text_size(void) +{ + return __entry_tramp_text_end - __entry_tramp_text_start; +} + #endif /* __ASM_SECTIONS_H */ diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index bb456f596c43..c1cebaf68e0c 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1059,7 +1059,7 @@ alternative_else_nop_endif .endm
.macro tramp_data_page dst - adr \dst, .entry.tramp.text + adr_l \dst, .entry.tramp.text sub \dst, \dst, PAGE_SIZE .endm
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index 1f82cf631c3c..fbab044b3a39 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -276,7 +276,7 @@ ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1)) <= SZ_4K, "Hibernate exit text too big or misaligned") #endif #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 -ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE, +ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) <= 3*PAGE_SIZE, "Entry trampoline text too big") #endif /* diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 99bc0289ab2b..5cf575f23af2 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -583,6 +583,8 @@ early_param("rodata", parse_rodata); #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 static int __init map_entry_trampoline(void) { + int i; + pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC; phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
@@ -591,11 +593,15 @@ static int __init map_entry_trampoline(void)
/* Map only the text into the trampoline page table */ memset(tramp_pg_dir, 0, PGD_SIZE); - __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE, - prot, __pgd_pgtable_alloc, 0); + __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, + entry_tramp_text_size(), prot, + __pgd_pgtable_alloc, NO_BLOCK_MAPPINGS);
/* Map both the text and data into the kernel page table */ - __set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot); + for (i = 0; i < DIV_ROUND_UP(entry_tramp_text_size(), PAGE_SIZE); i++) + __set_fixmap(FIX_ENTRY_TRAMP_TEXT1 - i, + pa_start + i * PAGE_SIZE, prot); + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) { extern char __entry_tramp_data_start[];
commit aff65393fa1401e034656e349abd655cfe272de0 upstream.
kpti is an optional feature, for systems not using kpti a set of vectors for the spectre-bhb mitigations is needed.
Add another set of vectors, __bp_harden_el1_vectors, that will be used if a mitigation is needed and kpti is not in use.
The EL1 ventries are repeated verbatim as there is no additional work needed for entry from EL1.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 35 ++++++++++++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index c1cebaf68e0c..1bc33f506bb1 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1066,10 +1066,11 @@ alternative_else_nop_endif .macro tramp_ventry, vector_start, regsize, kpti .align 7 1: - .if \kpti == 1 .if \regsize == 64 msr tpidrro_el0, x30 // Restored in kernel_ventry .endif + + .if \kpti == 1 /* * Defend against branch aliasing attacks by pushing a dummy * entry onto the return stack and using a RET instruction to @@ -1156,6 +1157,38 @@ __entry_tramp_data_start: #endif /* CONFIG_RANDOMIZE_BASE */ #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+/* + * Exception vectors for spectre mitigations on entry from EL1 when + * kpti is not in use. + */ + .macro generate_el1_vector +.Lvector_start@: + kernel_ventry 1, sync_invalid // Synchronous EL1t + kernel_ventry 1, irq_invalid // IRQ EL1t + kernel_ventry 1, fiq_invalid // FIQ EL1t + kernel_ventry 1, error_invalid // Error EL1t + + kernel_ventry 1, sync // Synchronous EL1h + kernel_ventry 1, irq // IRQ EL1h + kernel_ventry 1, fiq_invalid // FIQ EL1h + kernel_ventry 1, error // Error EL1h + + .rept 4 + tramp_ventry .Lvector_start@, 64, kpti=0 + .endr + .rept 4 + tramp_ventry .Lvector_start@, 32, kpti=0 + .endr + .endm + + .pushsection ".entry.text", "ax" + .align 11 +SYM_CODE_START(__bp_harden_el1_vectors) + generate_el1_vector +SYM_CODE_END(__bp_harden_el1_vectors) + .popsection + + /* * Register switch for AArch64. The callee-saved registers need to be saved * and restored. On entry:
commit ba2689234be92024e5635d30fe744f4853ad97db upstream.
Some CPUs affected by Spectre-BHB need a sequence of branches, or a firmware call to be run before any indirect branch. This needs to go in the vectors. No CPU needs both.
While this can be patched in, it would run on all CPUs as there is a single set of vectors. If only one part of a big/little combination is affected, the unaffected CPUs have to run the mitigation too.
Create extra vectors that include the sequence. Subsequent patches will allow affected CPUs to select this set of vectors. Later patches will modify the loop count to match what the CPU requires.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/assembler.h | 24 ++++++++++++++ arch/arm64/include/asm/vectors.h | 34 +++++++++++++++++++ arch/arm64/kernel/entry.S | 53 +++++++++++++++++++++++++----- include/linux/arm-smccc.h | 5 +++ 4 files changed, 107 insertions(+), 9 deletions(-) create mode 100644 arch/arm64/include/asm/vectors.h
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 4a4258f17c86..1279e4f5bd8f 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -757,4 +757,28 @@ USER(\label, ic ivau, \tmp2) // invalidate I line PoU .Lyield_out_@ : .endm
+ .macro __mitigate_spectre_bhb_loop tmp +#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY + mov \tmp, #32 +.Lspectre_bhb_loop@: + b . + 4 + subs \tmp, \tmp, #1 + b.ne .Lspectre_bhb_loop@ + sb +#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */ + .endm + + /* Save/restores x0-x3 to the stack */ + .macro __mitigate_spectre_bhb_fw +#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY + stp x0, x1, [sp, #-16]! + stp x2, x3, [sp, #-16]! + mov w0, #ARM_SMCCC_ARCH_WORKAROUND_3 +alternative_cb smccc_patch_fw_mitigation_conduit + nop // Patched to SMC/HVC #0 +alternative_cb_end + ldp x2, x3, [sp], #16 + ldp x0, x1, [sp], #16 +#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */ + .endm #endif /* __ASM_ASSEMBLER_H */ diff --git a/arch/arm64/include/asm/vectors.h b/arch/arm64/include/asm/vectors.h new file mode 100644 index 000000000000..16ca74260375 --- /dev/null +++ b/arch/arm64/include/asm/vectors.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2022 ARM Ltd. + */ +#ifndef __ASM_VECTORS_H +#define __ASM_VECTORS_H + +/* + * Note: the order of this enum corresponds to two arrays in entry.S: + * tramp_vecs and __bp_harden_el1_vectors. By default the canonical + * 'full fat' vectors are used directly. + */ +enum arm64_bp_harden_el1_vectors { +#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY + /* + * Perform the BHB loop mitigation, before branching to the canonical + * vectors. + */ + EL1_VECTOR_BHB_LOOP, + + /* + * Make the SMC call for firmware mitigation, before branching to the + * canonical vectors. + */ + EL1_VECTOR_BHB_FW, +#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */ + + /* + * Remap the kernel before branching to the canonical vectors. + */ + EL1_VECTOR_KPTI, +}; + +#endif /* __ASM_VECTORS_H */ diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 1bc33f506bb1..14351ee5e812 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1063,13 +1063,26 @@ alternative_else_nop_endif sub \dst, \dst, PAGE_SIZE .endm
- .macro tramp_ventry, vector_start, regsize, kpti + +#define BHB_MITIGATION_NONE 0 +#define BHB_MITIGATION_LOOP 1 +#define BHB_MITIGATION_FW 2 + + .macro tramp_ventry, vector_start, regsize, kpti, bhb .align 7 1: .if \regsize == 64 msr tpidrro_el0, x30 // Restored in kernel_ventry .endif
+ .if \bhb == BHB_MITIGATION_LOOP + /* + * This sequence must appear before the first indirect branch. i.e. the + * ret out of tramp_ventry. It appears here because x30 is free. + */ + __mitigate_spectre_bhb_loop x30 + .endif // \bhb == BHB_MITIGATION_LOOP + .if \kpti == 1 /* * Defend against branch aliasing attacks by pushing a dummy @@ -1097,6 +1110,15 @@ alternative_else_nop_endif ldr x30, =vectors .endif // \kpti == 1
+ .if \bhb == BHB_MITIGATION_FW + /* + * The firmware sequence must appear before the first indirect branch. + * i.e. the ret out of tramp_ventry. But it also needs the stack to be + * mapped to save/restore the registers the SMC clobbers. + */ + __mitigate_spectre_bhb_fw + .endif // \bhb == BHB_MITIGATION_FW + add x30, x30, #(1b - \vector_start + 4) ret .org 1b + 128 // Did we overflow the ventry slot? @@ -1104,6 +1126,9 @@ alternative_else_nop_endif
.macro tramp_exit, regsize = 64 adr x30, tramp_vectors +#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY + add x30, x30, SZ_4K +#endif msr vbar_el1, x30 ldr lr, [sp, #S_LR] tramp_unmap_kernel x29 @@ -1115,26 +1140,32 @@ alternative_else_nop_endif sb .endm
- .macro generate_tramp_vector, kpti + .macro generate_tramp_vector, kpti, bhb .Lvector_start@: .space 0x400
.rept 4 - tramp_ventry .Lvector_start@, 64, \kpti + tramp_ventry .Lvector_start@, 64, \kpti, \bhb .endr .rept 4 - tramp_ventry .Lvector_start@, 32, \kpti + tramp_ventry .Lvector_start@, 32, \kpti, \bhb .endr .endm
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 /* * Exception vectors trampoline. + * The order must match __bp_harden_el1_vectors and the + * arm64_bp_harden_el1_vectors enum. */ .pushsection ".entry.tramp.text", "ax" .align 11 ENTRY(tramp_vectors) - generate_tramp_vector kpti=1 +#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY + generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_LOOP + generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_FW +#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */ + generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_NONE END(tramp_vectors)
ENTRY(tramp_exit_native) @@ -1161,7 +1192,7 @@ __entry_tramp_data_start: * Exception vectors for spectre mitigations on entry from EL1 when * kpti is not in use. */ - .macro generate_el1_vector + .macro generate_el1_vector, bhb .Lvector_start@: kernel_ventry 1, sync_invalid // Synchronous EL1t kernel_ventry 1, irq_invalid // IRQ EL1t @@ -1174,17 +1205,21 @@ __entry_tramp_data_start: kernel_ventry 1, error // Error EL1h
.rept 4 - tramp_ventry .Lvector_start@, 64, kpti=0 + tramp_ventry .Lvector_start@, 64, 0, \bhb .endr .rept 4 - tramp_ventry .Lvector_start@, 32, kpti=0 + tramp_ventry .Lvector_start@, 32, 0, \bhb .endr .endm
+/* The order must match tramp_vecs and the arm64_bp_harden_el1_vectors enum. */ .pushsection ".entry.text", "ax" .align 11 SYM_CODE_START(__bp_harden_el1_vectors) - generate_el1_vector +#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY + generate_el1_vector bhb=BHB_MITIGATION_LOOP + generate_el1_vector bhb=BHB_MITIGATION_FW +#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */ SYM_CODE_END(__bp_harden_el1_vectors) .popsection
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index 4e97ba64dbb4..3e6ef64e74d3 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -76,6 +76,11 @@ ARM_SMCCC_SMC_32, \ 0, 0x7fff)
+#define ARM_SMCCC_ARCH_WORKAROUND_3 \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_32, \ + 0, 0x3fff) + #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1
#ifndef __ASSEMBLY__
commit b28a8eebe81c186fdb1a0078263b30576c8e1f42 upstream.
The trampoline code needs to use the address of symbols in the wider kernel, e.g. vectors. PC-relative addressing wouldn't work as the trampoline code doesn't run at the address the linker expected.
tramp_ventry uses a literal pool, unless CONFIG_RANDOMIZE_BASE is set, in which case it uses the data page as a literal pool because the data page can be unmapped when running in user-space, which is required for CPUs vulnerable to meltdown.
Pull this logic out as a macro, instead of adding a third copy of it.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/kernel/entry.S | 35 ++++++++++++++++------------------- 1 file changed, 16 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 14351ee5e812..e4b5a15c2e2e 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1063,6 +1063,15 @@ alternative_else_nop_endif sub \dst, \dst, PAGE_SIZE .endm
+ .macro tramp_data_read_var dst, var +#ifdef CONFIG_RANDOMIZE_BASE + tramp_data_page \dst + add \dst, \dst, #:lo12:__entry_tramp_data_\var + ldr \dst, [\dst] +#else + ldr \dst, =\var +#endif + .endm
#define BHB_MITIGATION_NONE 0 #define BHB_MITIGATION_LOOP 1 @@ -1093,13 +1102,8 @@ alternative_else_nop_endif b . 2: tramp_map_kernel x30 -#ifdef CONFIG_RANDOMIZE_BASE - tramp_data_page x30 alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003 - ldr x30, [x30] -#else - ldr x30, =vectors -#endif + tramp_data_read_var x30, vectors alternative_if_not ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM prfm plil1strm, [x30, #(1b - \vector_start)] alternative_else_nop_endif @@ -1183,7 +1187,12 @@ END(tramp_exit_compat) .align PAGE_SHIFT .globl __entry_tramp_data_start __entry_tramp_data_start: +__entry_tramp_data_vectors: .quad vectors +#ifdef CONFIG_ARM_SDE_INTERFACE +__entry_tramp_data___sdei_asm_trampoline_next_handler: + .quad __sdei_asm_handler +#endif /* CONFIG_ARM_SDE_INTERFACE */ .popsection // .rodata #endif /* CONFIG_RANDOMIZE_BASE */ #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ @@ -1310,13 +1319,7 @@ ENTRY(__sdei_asm_entry_trampoline) */ 1: str x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
-#ifdef CONFIG_RANDOMIZE_BASE - tramp_data_page x4 - add x4, x4, #:lo12:__sdei_asm_trampoline_next_handler - ldr x4, [x4] -#else - ldr x4, =__sdei_asm_handler -#endif + tramp_data_read_var x4, __sdei_asm_trampoline_next_handler br x4 ENDPROC(__sdei_asm_entry_trampoline) NOKPROBE(__sdei_asm_entry_trampoline) @@ -1339,12 +1342,6 @@ ENDPROC(__sdei_asm_exit_trampoline) NOKPROBE(__sdei_asm_exit_trampoline) .ltorg .popsection // .entry.tramp.text -#ifdef CONFIG_RANDOMIZE_BASE -.pushsection ".rodata", "a" -__sdei_asm_trampoline_next_handler: - .quad __sdei_asm_handler -.popsection // .rodata -#endif /* CONFIG_RANDOMIZE_BASE */ #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
/*
commit bd09128d16fac3c34b80bd6a29088ac632e8ce09 upstream.
The Spectre-BHB workaround adds a firmware call to the vectors. This is needed on some CPUs, but not others. To avoid the unaffected CPU in a big/little pair from making the firmware call, create per cpu vectors.
The per-cpu vectors only apply when returning from EL0.
Systems using KPTI can use the canonical 'full-fat' vectors directly at EL1, the trampoline exit code will switch to this_cpu_vector on exit to EL0. Systems not using KPTI should always use this_cpu_vector.
this_cpu_vector will point at a vector in tramp_vecs or __bp_harden_el1_vectors, depending on whether KPTI is in use.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/mmu.h | 2 +- arch/arm64/include/asm/vectors.h | 27 +++++++++++++++++++++++++++ arch/arm64/kernel/cpufeature.c | 11 +++++++++++ arch/arm64/kernel/entry.S | 16 ++++++++++------ arch/arm64/kvm/hyp/switch.c | 8 ++++++-- 5 files changed, 55 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index f217e3292919..353450011e3d 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -29,7 +29,7 @@ typedef struct { */ #define ASID(mm) ((mm)->context.id.counter & 0xffff)
-static inline bool arm64_kernel_unmapped_at_el0(void) +static __always_inline bool arm64_kernel_unmapped_at_el0(void) { return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) && cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0); diff --git a/arch/arm64/include/asm/vectors.h b/arch/arm64/include/asm/vectors.h index 16ca74260375..3f76dfd9e074 100644 --- a/arch/arm64/include/asm/vectors.h +++ b/arch/arm64/include/asm/vectors.h @@ -5,6 +5,15 @@ #ifndef __ASM_VECTORS_H #define __ASM_VECTORS_H
+#include <linux/bug.h> +#include <linux/percpu.h> + +#include <asm/fixmap.h> + +extern char vectors[]; +extern char tramp_vectors[]; +extern char __bp_harden_el1_vectors[]; + /* * Note: the order of this enum corresponds to two arrays in entry.S: * tramp_vecs and __bp_harden_el1_vectors. By default the canonical @@ -31,4 +40,22 @@ enum arm64_bp_harden_el1_vectors { EL1_VECTOR_KPTI, };
+/* The vectors to use on return from EL0. e.g. to remap the kernel */ +DECLARE_PER_CPU_READ_MOSTLY(const char *, this_cpu_vector); + +#ifndef CONFIG_UNMAP_KERNEL_AT_EL0 +#define TRAMP_VALIAS 0 +#endif + +static inline const char * +arm64_get_bp_hardening_vector(enum arm64_bp_harden_el1_vectors slot) +{ + if (arm64_kernel_unmapped_at_el0()) + return (char *)TRAMP_VALIAS + SZ_2K * slot; + + WARN_ON_ONCE(slot == EL1_VECTOR_KPTI); + + return __bp_harden_el1_vectors + SZ_2K * slot; +} + #endif /* __ASM_VECTORS_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 6b77e4942495..0d89d535720f 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -10,11 +10,13 @@ #include <linux/bsearch.h> #include <linux/cpumask.h> #include <linux/crash_dump.h> +#include <linux/percpu.h> #include <linux/sort.h> #include <linux/stop_machine.h> #include <linux/types.h> #include <linux/mm.h> #include <linux/cpu.h> + #include <asm/cpu.h> #include <asm/cpufeature.h> #include <asm/cpu_ops.h> @@ -23,6 +25,7 @@ #include <asm/processor.h> #include <asm/sysreg.h> #include <asm/traps.h> +#include <asm/vectors.h> #include <asm/virt.h>
/* Kernel representation of AT_HWCAP and AT_HWCAP2 */ @@ -45,6 +48,8 @@ static struct arm64_cpu_capabilities const __ro_after_init *cpu_hwcaps_ptrs[ARM6 /* Need also bit for ARM64_CB_PATCH */ DECLARE_BITMAP(boot_capabilities, ARM64_NPATCHABLE);
+DEFINE_PER_CPU_READ_MOSTLY(const char *, this_cpu_vector) = vectors; + /* * Flag to indicate if we have computed the system wide * capabilities based on the boot time active CPUs. This @@ -1047,6 +1052,12 @@ kpti_install_ng_mappings(const struct arm64_cpu_capabilities *__unused) static bool kpti_applied = false; int cpu = smp_processor_id();
+ if (__this_cpu_read(this_cpu_vector) == vectors) { + const char *v = arm64_get_bp_hardening_vector(EL1_VECTOR_KPTI); + + __this_cpu_write(this_cpu_vector, v); + } + /* * We don't need to rewrite the page-tables if either we've done * it already or we have KASLR enabled and therefore have not diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index e4b5a15c2e2e..fcfbb2b009e2 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -60,7 +60,6 @@ .macro kernel_ventry, el, label, regsize = 64 .align 7 .Lventry_start@: -#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 .if \el == 0 /* * This must be the first instruction of the EL0 vector entries. It is @@ -75,7 +74,6 @@ .endif .Lskip_tramp_vectors_cleanup@: .endif -#endif
sub sp, sp, #S_FRAME_SIZE #ifdef CONFIG_VMAP_STACK @@ -1129,10 +1127,14 @@ alternative_else_nop_endif .endm
.macro tramp_exit, regsize = 64 - adr x30, tramp_vectors -#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY - add x30, x30, SZ_4K -#endif + tramp_data_read_var x30, this_cpu_vector +alternative_if_not ARM64_HAS_VIRT_HOST_EXTN + mrs x29, tpidr_el1 +alternative_else + mrs x29, tpidr_el2 +alternative_endif + ldr x30, [x30, x29] + msr vbar_el1, x30 ldr lr, [sp, #S_LR] tramp_unmap_kernel x29 @@ -1193,6 +1195,8 @@ __entry_tramp_data_vectors: __entry_tramp_data___sdei_asm_trampoline_next_handler: .quad __sdei_asm_handler #endif /* CONFIG_ARM_SDE_INTERFACE */ +__entry_tramp_data_this_cpu_vector: + .quad this_cpu_vector .popsection // .rodata #endif /* CONFIG_RANDOMIZE_BASE */ #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 14607fac7ca3..768983bd2326 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -25,6 +25,7 @@ #include <asm/debug-monitors.h> #include <asm/processor.h> #include <asm/thread_info.h> +#include <asm/vectors.h>
extern struct exception_table_entry __start___kvm_ex_table; extern struct exception_table_entry __stop___kvm_ex_table; @@ -152,7 +153,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
static void deactivate_traps_vhe(void) { - extern char vectors[]; /* kernel exception vectors */ + const char *host_vectors = vectors; write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
/* @@ -163,7 +164,10 @@ static void deactivate_traps_vhe(void) asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_1165522));
write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1); - write_sysreg(vectors, vbar_el1); + + if (!arm64_kernel_unmapped_at_el0()) + host_vectors = __this_cpu_read(this_cpu_vector); + write_sysreg(host_vectors, vbar_el1); } NOKPROBE_SYMBOL(deactivate_traps_vhe);
commit dee435be76f4117410bbd90573a881fd33488f37 upstream.
Speculation attacks against some high-performance processors can make use of branch history to influence future speculation as part of a spectre-v2 attack. This is not mitigated by CSV2, meaning CPUs that previously reported 'Not affected' are now moderately mitigated by CSV2.
Update the value in /sys/devices/system/cpu/vulnerabilities/spectre_v2 to also show the state of the BHB mitigation.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com [ code move to cpu_errata.c for backport ] Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/cpufeature.h | 9 +++++++ arch/arm64/kernel/cpu_errata.c | 41 ++++++++++++++++++++++++++--- 2 files changed, 46 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index ccae05da98a7..a798443ed76f 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -639,6 +639,15 @@ static inline int arm64_get_ssbd_state(void)
void arm64_set_ssbd_mitigation(bool state);
+/* Watch out, ordering is important here. */ +enum mitigation_state { + SPECTRE_UNAFFECTED, + SPECTRE_MITIGATED, + SPECTRE_VULNERABLE, +}; + +enum mitigation_state arm64_get_spectre_bhb_state(void); + extern int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
static inline u32 id_aa64mmfr0_parange_to_phys_shift(int parange) diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 1e16c4e00e77..182305000de3 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -989,15 +989,41 @@ ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, return sprintf(buf, "Mitigation: __user pointer sanitization\n"); }
+static const char *get_bhb_affected_string(enum mitigation_state bhb_state) +{ + switch (bhb_state) { + case SPECTRE_UNAFFECTED: + return ""; + default: + case SPECTRE_VULNERABLE: + return ", but not BHB"; + case SPECTRE_MITIGATED: + return ", BHB"; + } +} + ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf) { + enum mitigation_state bhb_state = arm64_get_spectre_bhb_state(); + const char *bhb_str = get_bhb_affected_string(bhb_state); + const char *v2_str = "Branch predictor hardening"; + switch (get_spectre_v2_workaround_state()) { case ARM64_BP_HARDEN_NOT_REQUIRED: - return sprintf(buf, "Not affected\n"); - case ARM64_BP_HARDEN_WA_NEEDED: - return sprintf(buf, "Mitigation: Branch predictor hardening\n"); - case ARM64_BP_HARDEN_UNKNOWN: + if (bhb_state == SPECTRE_UNAFFECTED) + return sprintf(buf, "Not affected\n"); + + /* + * Platforms affected by Spectre-BHB can't report + * "Not affected" for Spectre-v2. + */ + v2_str = "CSV2"; + fallthrough; + case ARM64_BP_HARDEN_WA_NEEDED: + return sprintf(buf, "Mitigation: %s%s\n", v2_str, bhb_str); + case ARM64_BP_HARDEN_UNKNOWN: + fallthrough; default: return sprintf(buf, "Vulnerable\n"); } @@ -1019,3 +1045,10 @@ ssize_t cpu_show_spec_store_bypass(struct device *dev,
return sprintf(buf, "Vulnerable\n"); } + +static enum mitigation_state spectre_bhb_state; + +enum mitigation_state arm64_get_spectre_bhb_state(void) +{ + return spectre_bhb_state; +}
KVM writes the Spectre-v2 mitigation template at the beginning of each vector when a CPU requires a specific sequence to run.
Because the template is copied, it can not be modified by the alternatives at runtime.
Add templates for calling ARCH_WORKAROUND_3 and one for each value of K in the brancy-loop. Instead of adding dummy functions for 'fn', which would disable the Spectre-v2 mitigation, add template_start to indicate that a template (and which one) is in use. Finally add a copy of install_bp_hardening_cb() that is able to install these.
Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/kvm_mmu.h | 6 ++- arch/arm64/include/asm/mmu.h | 6 +++ arch/arm64/kernel/cpu_errata.c | 65 +++++++++++++++++++++++++++++++- arch/arm64/kvm/hyp/hyp-entry.S | 54 ++++++++++++++++++++++++++ 5 files changed, 130 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 1dc3c762fdcb..4ffa86149d28 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -55,7 +55,8 @@ #define ARM64_WORKAROUND_CAVIUM_TX2_219_TVM 45 #define ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM 46 #define ARM64_WORKAROUND_1542419 47 +#define ARM64_SPECTRE_BHB 48
-#define ARM64_NCAPS 48 +#define ARM64_NCAPS 49
#endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index befe37d4bc0e..78d110667c0c 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -478,7 +478,8 @@ static inline void *kvm_get_hyp_vector(void) void *vect = kern_hyp_va(kvm_ksym_ref(__kvm_hyp_vector)); int slot = -1;
- if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) && data->fn) { + if ((cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) || + cpus_have_const_cap(ARM64_SPECTRE_BHB)) && data->template_start) { vect = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs_start)); slot = data->hyp_vectors_slot; } @@ -507,7 +508,8 @@ static inline int kvm_map_vectors(void) * !HBP + HEL2 -> allocate one vector slot and use exec mapping * HBP + HEL2 -> use hardened vertors and use exec mapping */ - if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR)) { + if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) || + cpus_have_const_cap(ARM64_SPECTRE_BHB)) { __kvm_bp_vect_base = kvm_ksym_ref(__bp_harden_hyp_vecs_start); __kvm_bp_vect_base = kern_hyp_va(__kvm_bp_vect_base); } diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 353450011e3d..1b9e49fb0e1b 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -82,6 +82,12 @@ typedef void (*bp_hardening_cb_t)(void); struct bp_hardening_data { int hyp_vectors_slot; bp_hardening_cb_t fn; + + /* + * template_start is only used by the BHB mitigation to identify the + * hyp_vectors_slot sequence. + */ + const char *template_start; };
#if (defined(CONFIG_HARDEN_BRANCH_PREDICTOR) || \ diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 182305000de3..30818b757d51 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -116,6 +116,14 @@ DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); #ifdef CONFIG_KVM_INDIRECT_VECTORS extern char __smccc_workaround_1_smc_start[]; extern char __smccc_workaround_1_smc_end[]; +extern char __smccc_workaround_3_smc_start[]; +extern char __smccc_workaround_3_smc_end[]; +extern char __spectre_bhb_loop_k8_start[]; +extern char __spectre_bhb_loop_k8_end[]; +extern char __spectre_bhb_loop_k24_start[]; +extern char __spectre_bhb_loop_k24_end[]; +extern char __spectre_bhb_loop_k32_start[]; +extern char __spectre_bhb_loop_k32_end[];
static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, const char *hyp_vecs_end) @@ -129,11 +137,11 @@ static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, __flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K); }
+static DEFINE_RAW_SPINLOCK(bp_lock); static void install_bp_hardening_cb(bp_hardening_cb_t fn, const char *hyp_vecs_start, const char *hyp_vecs_end) { - static DEFINE_RAW_SPINLOCK(bp_lock); int cpu, slot = -1;
/* @@ -161,6 +169,7 @@ static void install_bp_hardening_cb(bp_hardening_cb_t fn,
__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot); __this_cpu_write(bp_hardening_data.fn, fn); + __this_cpu_write(bp_hardening_data.template_start, hyp_vecs_start); raw_spin_unlock(&bp_lock); } #else @@ -1052,3 +1061,57 @@ enum mitigation_state arm64_get_spectre_bhb_state(void) { return spectre_bhb_state; } + +#ifdef CONFIG_KVM_INDIRECT_VECTORS +static const char *kvm_bhb_get_vecs_end(const char *start) +{ + if (start == __smccc_workaround_3_smc_start) + return __smccc_workaround_3_smc_end; + else if (start == __spectre_bhb_loop_k8_start) + return __spectre_bhb_loop_k8_end; + else if (start == __spectre_bhb_loop_k24_start) + return __spectre_bhb_loop_k24_end; + else if (start == __spectre_bhb_loop_k32_start) + return __spectre_bhb_loop_k32_end; + + return NULL; +} + +void kvm_setup_bhb_slot(const char *hyp_vecs_start) +{ + int cpu, slot = -1; + const char *hyp_vecs_end; + + if (!IS_ENABLED(CONFIG_KVM) || !is_hyp_mode_available()) + return; + + hyp_vecs_end = kvm_bhb_get_vecs_end(hyp_vecs_start); + if (WARN_ON_ONCE(!hyp_vecs_start || !hyp_vecs_end)) + return; + + raw_spin_lock(&bp_lock); + for_each_possible_cpu(cpu) { + if (per_cpu(bp_hardening_data.template_start, cpu) == hyp_vecs_start) { + slot = per_cpu(bp_hardening_data.hyp_vectors_slot, cpu); + break; + } + } + + if (slot == -1) { + slot = atomic_inc_return(&arm64_el2_vector_last_slot); + BUG_ON(slot >= BP_HARDEN_EL2_SLOTS); + __copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end); + } + + __this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot); + __this_cpu_write(bp_hardening_data.template_start, hyp_vecs_start); + raw_spin_unlock(&bp_lock); +} +#else +#define __smccc_workaround_3_smc_start NULL +#define __spectre_bhb_loop_k8_start NULL +#define __spectre_bhb_loop_k24_start NULL +#define __spectre_bhb_loop_k32_start NULL + +void kvm_setup_bhb_slot(const char *hyp_vecs_start) { } +#endif diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S index f36aad0f207b..2ad750208e33 100644 --- a/arch/arm64/kvm/hyp/hyp-entry.S +++ b/arch/arm64/kvm/hyp/hyp-entry.S @@ -347,4 +347,58 @@ ENTRY(__smccc_workaround_1_smc_start) ldp x0, x1, [sp, #(8 * 2)] add sp, sp, #(8 * 4) ENTRY(__smccc_workaround_1_smc_end) + +ENTRY(__smccc_workaround_3_smc_start) + esb + sub sp, sp, #(8 * 4) + stp x2, x3, [sp, #(8 * 0)] + stp x0, x1, [sp, #(8 * 2)] + mov w0, #ARM_SMCCC_ARCH_WORKAROUND_3 + smc #0 + ldp x2, x3, [sp, #(8 * 0)] + ldp x0, x1, [sp, #(8 * 2)] + add sp, sp, #(8 * 4) +ENTRY(__smccc_workaround_3_smc_end) + +ENTRY(__spectre_bhb_loop_k8_start) + esb + sub sp, sp, #(8 * 2) + stp x0, x1, [sp, #(8 * 0)] + mov x0, #8 +2: b . + 4 + subs x0, x0, #1 + b.ne 2b + dsb nsh + isb + ldp x0, x1, [sp, #(8 * 0)] + add sp, sp, #(8 * 2) +ENTRY(__spectre_bhb_loop_k8_end) + +ENTRY(__spectre_bhb_loop_k24_start) + esb + sub sp, sp, #(8 * 2) + stp x0, x1, [sp, #(8 * 0)] + mov x0, #24 +2: b . + 4 + subs x0, x0, #1 + b.ne 2b + dsb nsh + isb + ldp x0, x1, [sp, #(8 * 0)] + add sp, sp, #(8 * 2) +ENTRY(__spectre_bhb_loop_k24_end) + +ENTRY(__spectre_bhb_loop_k32_start) + esb + sub sp, sp, #(8 * 2) + stp x0, x1, [sp, #(8 * 0)] + mov x0, #32 +2: b . + 4 + subs x0, x0, #1 + b.ne 2b + dsb nsh + isb + ldp x0, x1, [sp, #(8 * 0)] + add sp, sp, #(8 * 2) +ENTRY(__spectre_bhb_loop_k32_end) #endif
commit 558c303c9734af5a813739cd284879227f7297d2 upstream.
Speculation attacks against some high-performance processors can make use of branch history to influence future speculation. When taking an exception from user-space, a sequence of branches or a firmware call overwrites or invalidates the branch history.
The sequence of branches is added to the vectors, and should appear before the first indirect branch. For systems using KPTI the sequence is added to the kpti trampoline where it has a free register as the exit from the trampoline is via a 'ret'. For systems not using KPTI, the same register tricks are used to free up a register in the vectors.
For the firmware call, arch-workaround-3 clobbers 4 registers, so there is no choice but to save them to the EL1 stack. This only happens for entry from EL0, so if we take an exception due to the stack access, it will not become re-entrant.
For KVM, the existing branch-predictor-hardening vectors are used. When a spectre version of these vectors is in use, the firmware call is sufficient to mitigate against Spectre-BHB. For the non-spectre versions, the sequence of branches is added to the indirect vector.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: stable@kernel.org # <v5.17.x 72bb9dcb6c33c arm64: Add Cortex-X2 CPU part definition Cc: stable@kernel.org # <v5.16.x 2d0d656700d67 arm64: Add Neoverse-N2, Cortex-A710 CPU part definition Cc: stable@kernel.org # <v5.10.x 8a6b88e66233f arm64: Add part number for Arm Cortex-A77 [ modified for stable, moved code to cpu_errata.c removed bitmap of mitigations, use kvm template infrastructure ] Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/Kconfig | 9 + arch/arm64/include/asm/assembler.h | 6 +- arch/arm64/include/asm/cpufeature.h | 18 ++ arch/arm64/include/asm/cputype.h | 8 + arch/arm64/include/asm/sysreg.h | 1 + arch/arm64/include/asm/vectors.h | 5 + arch/arm64/kernel/cpu_errata.c | 269 +++++++++++++++++++++++++++- arch/arm64/kvm/hyp/hyp-entry.S | 4 + 8 files changed, 316 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 9c8ea5939865..a1a828ca188c 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1139,6 +1139,15 @@ config ARM64_SSBD
If unsure, say Y.
+config MITIGATE_SPECTRE_BRANCH_HISTORY + bool "Mitigate Spectre style attacks against branch history" if EXPERT + default y + help + Speculation attacks against some high-performance processors can + make use of branch history to influence future speculation. + When taking an exception from user-space, a sequence of branches + or a firmware call overwrites the branch history. + config RODATA_FULL_DEFAULT_ENABLED bool "Apply r/o permissions of VM areas also to their linear aliases" default y diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 1279e4f5bd8f..4b13739ca518 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -759,7 +759,9 @@ USER(\label, ic ivau, \tmp2) // invalidate I line PoU
.macro __mitigate_spectre_bhb_loop tmp #ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY - mov \tmp, #32 +alternative_cb spectre_bhb_patch_loop_iter + mov \tmp, #32 // Patched to correct the immediate +alternative_cb_end .Lspectre_bhb_loop@: b . + 4 subs \tmp, \tmp, #1 @@ -774,7 +776,7 @@ USER(\label, ic ivau, \tmp2) // invalidate I line PoU stp x0, x1, [sp, #-16]! stp x2, x3, [sp, #-16]! mov w0, #ARM_SMCCC_ARCH_WORKAROUND_3 -alternative_cb smccc_patch_fw_mitigation_conduit +alternative_cb arm64_update_smccc_conduit nop // Patched to SMC/HVC #0 alternative_cb_end ldp x2, x3, [sp], #16 diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index a798443ed76f..40a5e48881af 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -508,6 +508,21 @@ static inline bool cpu_supports_mixed_endian_el0(void) return id_aa64mmfr0_mixed_endian_el0(read_cpuid(ID_AA64MMFR0_EL1)); }
+static inline bool supports_csv2p3(int scope) +{ + u64 pfr0; + u8 csv2_val; + + if (scope == SCOPE_LOCAL_CPU) + pfr0 = read_sysreg_s(SYS_ID_AA64PFR0_EL1); + else + pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1); + + csv2_val = cpuid_feature_extract_unsigned_field(pfr0, + ID_AA64PFR0_CSV2_SHIFT); + return csv2_val == 3; +} + static inline bool system_supports_32bit_el0(void) { return cpus_have_const_cap(ARM64_HAS_32BIT_EL0); @@ -647,6 +662,9 @@ enum mitigation_state { };
enum mitigation_state arm64_get_spectre_bhb_state(void); +bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope); +u8 spectre_bhb_loop_affected(int scope); +void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
extern int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index e4394be47d35..f0165df489a3 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -72,9 +72,13 @@ #define ARM_CPU_PART_CORTEX_A76 0xD0B #define ARM_CPU_PART_NEOVERSE_N1 0xD0C #define ARM_CPU_PART_CORTEX_A77 0xD0D +#define ARM_CPU_PART_NEOVERSE_V1 0xD40 +#define ARM_CPU_PART_CORTEX_A78 0xD41 +#define ARM_CPU_PART_CORTEX_X1 0xD44 #define ARM_CPU_PART_CORTEX_A710 0xD47 #define ARM_CPU_PART_CORTEX_X2 0xD48 #define ARM_CPU_PART_NEOVERSE_N2 0xD49 +#define ARM_CPU_PART_CORTEX_A78C 0xD4B
#define APM_CPU_PART_POTENZA 0x000
@@ -107,9 +111,13 @@ #define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76) #define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1) #define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77) +#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1) +#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78) +#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1) #define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710) #define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2) #define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2) +#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C) #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 50ed2747c572..b35579352856 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -661,6 +661,7 @@ #endif
/* id_aa64mmfr1 */ +#define ID_AA64MMFR1_ECBHB_SHIFT 60 #define ID_AA64MMFR1_PAN_SHIFT 20 #define ID_AA64MMFR1_LOR_SHIFT 16 #define ID_AA64MMFR1_HPD_SHIFT 12 diff --git a/arch/arm64/include/asm/vectors.h b/arch/arm64/include/asm/vectors.h index 3f76dfd9e074..1f65c37dc653 100644 --- a/arch/arm64/include/asm/vectors.h +++ b/arch/arm64/include/asm/vectors.h @@ -40,6 +40,11 @@ enum arm64_bp_harden_el1_vectors { EL1_VECTOR_KPTI, };
+#ifndef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY +#define EL1_VECTOR_BHB_LOOP -1 +#define EL1_VECTOR_BHB_FW -1 +#endif /* !CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */ + /* The vectors to use on return from EL0. e.g. to remap the kernel */ DECLARE_PER_CPU_READ_MOSTLY(const char *, this_cpu_vector);
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 30818b757d51..0f74dc2b13c0 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -13,6 +13,7 @@ #include <asm/cputype.h> #include <asm/cpufeature.h> #include <asm/smp_plat.h> +#include <asm/vectors.h>
static bool __maybe_unused is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope) @@ -936,6 +937,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = { .cpu_enable = cpu_enable_ssbd_mitigation, .midr_range_list = arm64_ssb_cpus, }, + { + .desc = "Spectre-BHB", + .capability = ARM64_SPECTRE_BHB, + .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, + .matches = is_spectre_bhb_affected, + .cpu_enable = spectre_bhb_enable_mitigation, + }, #ifdef CONFIG_ARM64_ERRATUM_1418040 { .desc = "ARM erratum 1418040", @@ -1055,6 +1063,33 @@ ssize_t cpu_show_spec_store_bypass(struct device *dev, return sprintf(buf, "Vulnerable\n"); }
+/* + * We try to ensure that the mitigation state can never change as the result of + * onlining a late CPU. + */ +static void update_mitigation_state(enum mitigation_state *oldp, + enum mitigation_state new) +{ + enum mitigation_state state; + + do { + state = READ_ONCE(*oldp); + if (new <= state) + break; + } while (cmpxchg_relaxed(oldp, state, new) != state); +} + +/* + * Spectre BHB. + * + * A CPU is either: + * - Mitigated by a branchy loop a CPU specific number of times, and listed + * in our "loop mitigated list". + * - Mitigated in software by the firmware Spectre v2 call. + * - Has the 'Exception Clears Branch History Buffer' (ECBHB) feature, so no + * software mitigation in the vectors is needed. + * - Has CSV2.3, so is unaffected. + */ static enum mitigation_state spectre_bhb_state;
enum mitigation_state arm64_get_spectre_bhb_state(void) @@ -1062,6 +1097,164 @@ enum mitigation_state arm64_get_spectre_bhb_state(void) return spectre_bhb_state; }
+/* + * This must be called with SCOPE_LOCAL_CPU for each type of CPU, before any + * SCOPE_SYSTEM call will give the right answer. + */ +u8 spectre_bhb_loop_affected(int scope) +{ + u8 k = 0; + static u8 max_bhb_k; + + if (scope == SCOPE_LOCAL_CPU) { + static const struct midr_range spectre_bhb_k32_list[] = { + MIDR_ALL_VERSIONS(MIDR_CORTEX_A78), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C), + MIDR_ALL_VERSIONS(MIDR_CORTEX_X1), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A710), + MIDR_ALL_VERSIONS(MIDR_CORTEX_X2), + MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), + MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), + {}, + }; + static const struct midr_range spectre_bhb_k24_list[] = { + MIDR_ALL_VERSIONS(MIDR_CORTEX_A76), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A77), + MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), + {}, + }; + static const struct midr_range spectre_bhb_k8_list[] = { + MIDR_ALL_VERSIONS(MIDR_CORTEX_A72), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A57), + {}, + }; + + if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list)) + k = 32; + else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list)) + k = 24; + else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list)) + k = 8; + + max_bhb_k = max(max_bhb_k, k); + } else { + k = max_bhb_k; + } + + return k; +} + +static enum mitigation_state spectre_bhb_get_cpu_fw_mitigation_state(void) +{ + int ret; + struct arm_smccc_res res; + + if (psci_ops.smccc_version == SMCCC_VERSION_1_0) + return SPECTRE_VULNERABLE; + + switch (psci_ops.conduit) { + case PSCI_CONDUIT_HVC: + arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_ARCH_WORKAROUND_3, &res); + break; + + case PSCI_CONDUIT_SMC: + arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_ARCH_WORKAROUND_3, &res); + break; + + default: + return SPECTRE_VULNERABLE; + } + + ret = res.a0; + switch (ret) { + case SMCCC_RET_SUCCESS: + return SPECTRE_MITIGATED; + case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED: + return SPECTRE_UNAFFECTED; + default: + fallthrough; + case SMCCC_RET_NOT_SUPPORTED: + return SPECTRE_VULNERABLE; + } +} + +static bool is_spectre_bhb_fw_affected(int scope) +{ + static bool system_affected; + enum mitigation_state fw_state; + bool has_smccc = (psci_ops.smccc_version >= SMCCC_VERSION_1_1); + static const struct midr_range spectre_bhb_firmware_mitigated_list[] = { + MIDR_ALL_VERSIONS(MIDR_CORTEX_A73), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A75), + {}, + }; + bool cpu_in_list = is_midr_in_range_list(read_cpuid_id(), + spectre_bhb_firmware_mitigated_list); + + if (scope != SCOPE_LOCAL_CPU) + return system_affected; + + fw_state = spectre_bhb_get_cpu_fw_mitigation_state(); + if (cpu_in_list || (has_smccc && fw_state == SPECTRE_MITIGATED)) { + system_affected = true; + return true; + } + + return false; +} + +static bool supports_ecbhb(int scope) +{ + u64 mmfr1; + + if (scope == SCOPE_LOCAL_CPU) + mmfr1 = read_sysreg_s(SYS_ID_AA64MMFR1_EL1); + else + mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); + + return cpuid_feature_extract_unsigned_field(mmfr1, + ID_AA64MMFR1_ECBHB_SHIFT); +} + +bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, + int scope) +{ + WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); + + if (supports_csv2p3(scope)) + return false; + + if (spectre_bhb_loop_affected(scope)) + return true; + + if (is_spectre_bhb_fw_affected(scope)) + return true; + + return false; +} + +static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot) +{ + const char *v = arm64_get_bp_hardening_vector(slot); + + if (slot < 0) + return; + + __this_cpu_write(this_cpu_vector, v); + + /* + * When KPTI is in use, the vectors are switched when exiting to + * user-space. + */ + if (arm64_kernel_unmapped_at_el0()) + return; + + write_sysreg(v, vbar_el1); + isb(); +} + #ifdef CONFIG_KVM_INDIRECT_VECTORS static const char *kvm_bhb_get_vecs_end(const char *start) { @@ -1077,7 +1270,7 @@ static const char *kvm_bhb_get_vecs_end(const char *start) return NULL; }
-void kvm_setup_bhb_slot(const char *hyp_vecs_start) +static void kvm_setup_bhb_slot(const char *hyp_vecs_start) { int cpu, slot = -1; const char *hyp_vecs_end; @@ -1113,5 +1306,77 @@ void kvm_setup_bhb_slot(const char *hyp_vecs_start) #define __spectre_bhb_loop_k24_start NULL #define __spectre_bhb_loop_k32_start NULL
-void kvm_setup_bhb_slot(const char *hyp_vecs_start) { } +static void kvm_setup_bhb_slot(const char *hyp_vecs_start) { } #endif + +void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry) +{ + enum mitigation_state fw_state, state = SPECTRE_VULNERABLE; + + if (!is_spectre_bhb_affected(entry, SCOPE_LOCAL_CPU)) + return; + + if (get_spectre_v2_workaround_state() == ARM64_BP_HARDEN_UNKNOWN) { + /* No point mitigating Spectre-BHB alone. */ + } else if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY)) { + pr_info_once("spectre-bhb mitigation disabled by compile time option\n"); + } else if (cpu_mitigations_off()) { + pr_info_once("spectre-bhb mitigation disabled by command line option\n"); + } else if (supports_ecbhb(SCOPE_LOCAL_CPU)) { + state = SPECTRE_MITIGATED; + } else if (spectre_bhb_loop_affected(SCOPE_LOCAL_CPU)) { + switch (spectre_bhb_loop_affected(SCOPE_SYSTEM)) { + case 8: + kvm_setup_bhb_slot(__spectre_bhb_loop_k8_start); + break; + case 24: + kvm_setup_bhb_slot(__spectre_bhb_loop_k24_start); + break; + case 32: + kvm_setup_bhb_slot(__spectre_bhb_loop_k32_start); + break; + default: + WARN_ON_ONCE(1); + } + this_cpu_set_vectors(EL1_VECTOR_BHB_LOOP); + + state = SPECTRE_MITIGATED; + } else if (is_spectre_bhb_fw_affected(SCOPE_LOCAL_CPU)) { + fw_state = spectre_bhb_get_cpu_fw_mitigation_state(); + if (fw_state == SPECTRE_MITIGATED) { + kvm_setup_bhb_slot(__smccc_workaround_3_smc_start); + this_cpu_set_vectors(EL1_VECTOR_BHB_FW); + + /* + * With WA3 in the vectors, the WA1 calls can be + * removed. + */ + __this_cpu_write(bp_hardening_data.fn, NULL); + + state = SPECTRE_MITIGATED; + } + } + + update_mitigation_state(&spectre_bhb_state, state); +} + +/* Patched to correct the immediate */ +void noinstr spectre_bhb_patch_loop_iter(struct alt_instr *alt, + __le32 *origptr, __le32 *updptr, int nr_inst) +{ + u8 rd; + u32 insn; + u16 loop_count = spectre_bhb_loop_affected(SCOPE_SYSTEM); + + BUG_ON(nr_inst != 1); /* MOV -> MOV */ + + if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY)) + return; + + insn = le32_to_cpu(*origptr); + rd = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RD, insn); + insn = aarch64_insn_gen_movewide(rd, loop_count, 0, + AARCH64_INSN_VARIANT_64BIT, + AARCH64_INSN_MOVEWIDE_ZERO); + *updptr++ = cpu_to_le32(insn); +} diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S index 2ad750208e33..b59b66f1f905 100644 --- a/arch/arm64/kvm/hyp/hyp-entry.S +++ b/arch/arm64/kvm/hyp/hyp-entry.S @@ -113,6 +113,10 @@ el1_hvc_guest: /* ARM_SMCCC_ARCH_WORKAROUND_2 handling */ eor w1, w1, #(ARM_SMCCC_ARCH_WORKAROUND_1 ^ \ ARM_SMCCC_ARCH_WORKAROUND_2) + cbz w1, wa_epilogue + + eor w1, w1, #(ARM_SMCCC_ARCH_WORKAROUND_2 ^ \ + ARM_SMCCC_ARCH_WORKAROUND_3) cbnz w1, el1_trap
#ifdef CONFIG_ARM64_SSBD
commit a5905d6af492ee6a4a2205f0d550b3f931b03d03 upstream.
KVM allows the guest to discover whether the ARCH_WORKAROUND SMCCC are implemented, and to preserve that state during migration through its firmware register interface.
Add the necessary boiler plate for SMCCC_ARCH_WORKAROUND_3.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com [ kvm code moved to virt/kvm/arm. ] Signed-off-by: James Morse james.morse@arm.com --- arch/arm/include/asm/kvm_host.h | 7 +++++++ arch/arm/include/uapi/asm/kvm.h | 6 ++++++ arch/arm64/include/asm/kvm_host.h | 5 +++++ arch/arm64/include/uapi/asm/kvm.h | 5 +++++ virt/kvm/arm/psci.c | 34 ++++++++++++++++++++++++++++--- 5 files changed, 54 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index 32564b017ba0..d8ac89879327 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -15,6 +15,7 @@ #include <asm/kvm_asm.h> #include <asm/kvm_mmio.h> #include <asm/fpstate.h> +#include <asm/spectre.h> #include <kvm/arm_arch_timer.h>
#define __KVM_HAVE_ARCH_INTC_INITIALIZED @@ -424,4 +425,10 @@ static inline bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
#define kvm_arm_vcpu_loaded(vcpu) (false)
+static inline int kvm_arm_get_spectre_bhb_state(void) +{ + /* 32bit guests don't need firmware for this */ + return SPECTRE_VULNERABLE; /* aka SMCCC_RET_NOT_SUPPORTED */ +} + #endif /* __ARM_KVM_HOST_H__ */ diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h index 2769360f195c..89b8e70068a1 100644 --- a/arch/arm/include/uapi/asm/kvm.h +++ b/arch/arm/include/uapi/asm/kvm.h @@ -227,6 +227,12 @@ struct kvm_vcpu_events { #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED 3 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED (1U << 4)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3 KVM_REG_ARM_FW_REG(3) + /* Higher values mean better protection. */ +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL 0 +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL 1 +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED 2 + /* Device Control API: ARM VGIC */ #define KVM_DEV_ARM_VGIC_GRP_ADDR 0 #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1 diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 697702a1a1ff..e6efdbe88c0a 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -684,4 +684,9 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
#define kvm_arm_vcpu_loaded(vcpu) ((vcpu)->arch.sysregs_loaded_on_cpu)
+static inline enum mitigation_state kvm_arm_get_spectre_bhb_state(void) +{ + return arm64_get_spectre_bhb_state(); +} + #endif /* __ARM64_KVM_HOST_H__ */ diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index 67c21f9bdbad..08440ce57a1c 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -240,6 +240,11 @@ struct kvm_vcpu_events { #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED 3 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED (1U << 4)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3 KVM_REG_ARM_FW_REG(3) +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL 0 +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL 1 +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED 2 + /* SVE registers */ #define KVM_REG_ARM64_SVE (0x15 << KVM_REG_ARM_COPROC_SHIFT)
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c index 48fde38d64c3..2f5dc7fb437b 100644 --- a/virt/kvm/arm/psci.c +++ b/virt/kvm/arm/psci.c @@ -426,6 +426,18 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu) break; } break; + case ARM_SMCCC_ARCH_WORKAROUND_3: + switch (kvm_arm_get_spectre_bhb_state()) { + case SPECTRE_VULNERABLE: + break; + case SPECTRE_MITIGATED: + val = SMCCC_RET_SUCCESS; + break; + case SPECTRE_UNAFFECTED: + val = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED; + break; + } + break; } break; default: @@ -438,7 +450,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu) { - return 3; /* PSCI version and two workaround registers */ + return 4; /* PSCI version and three workaround registers */ }
int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices) @@ -452,6 +464,9 @@ int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices) if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++)) return -EFAULT;
+ if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3, uindices++)) + return -EFAULT; + return 0; }
@@ -486,9 +501,20 @@ static int get_kernel_wa_level(u64 regid) return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED; case KVM_SSBD_UNKNOWN: default: - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN; + break; } - } + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN; + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3: + switch (kvm_arm_get_spectre_bhb_state()) { + case SPECTRE_VULNERABLE: + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL; + case SPECTRE_MITIGATED: + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL; + case SPECTRE_UNAFFECTED: + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED; + } + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL; + }
return -EINVAL; } @@ -503,6 +529,7 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) val = kvm_psci_version(vcpu, vcpu->kvm); break; case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1: + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3: val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK; break; case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2: @@ -555,6 +582,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) }
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1: + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3: if (val & ~KVM_REG_FEATURE_LEVEL_MASK) return -EINVAL;
commit 228a26b912287934789023b4132ba76065d9491c upstream.
Future CPUs may implement a clearbhb instruction that is sufficient to mitigate SpectreBHB. CPUs that implement this instruction, but not CSV2.3 must be affected by Spectre-BHB.
Add support to use this instruction as the BHB mitigation on CPUs that support it. The instruction is in the hint space, so it will be treated by a NOP as older CPUs.
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Catalin Marinas catalin.marinas@arm.com [ modified for stable: Use a KVM vector template instead of alternatives, removed bitmap of mitigations ] Signed-off-by: James Morse james.morse@arm.com --- arch/arm64/include/asm/assembler.h | 7 +++++++ arch/arm64/include/asm/cpufeature.h | 13 +++++++++++++ arch/arm64/include/asm/sysreg.h | 1 + arch/arm64/include/asm/vectors.h | 7 +++++++ arch/arm64/kernel/cpu_errata.c | 14 ++++++++++++++ arch/arm64/kernel/cpufeature.c | 1 + arch/arm64/kernel/entry.S | 8 ++++++++ arch/arm64/kvm/hyp/hyp-entry.S | 6 ++++++ 8 files changed, 57 insertions(+)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 4b13739ca518..01112f9767bc 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -110,6 +110,13 @@ hint #20 .endm
+/* + * Clear Branch History instruction + */ + .macro clearbhb + hint #22 + .endm + /* * Speculation barrier */ diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 40a5e48881af..f63438474dd5 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -523,6 +523,19 @@ static inline bool supports_csv2p3(int scope) return csv2_val == 3; }
+static inline bool supports_clearbhb(int scope) +{ + u64 isar2; + + if (scope == SCOPE_LOCAL_CPU) + isar2 = read_sysreg_s(SYS_ID_AA64ISAR2_EL1); + else + isar2 = read_sanitised_ftr_reg(SYS_ID_AA64ISAR2_EL1); + + return cpuid_feature_extract_unsigned_field(isar2, + ID_AA64ISAR2_CLEARBHB_SHIFT); +} + static inline bool system_supports_32bit_el0(void) { return cpus_have_const_cap(ARM64_HAS_32BIT_EL0); diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index b35579352856..5b3bdad66b27 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -577,6 +577,7 @@ #define ID_AA64ISAR1_GPI_IMP_DEF 0x1
/* id_aa64isar2 */ +#define ID_AA64ISAR2_CLEARBHB_SHIFT 28 #define ID_AA64ISAR2_RPRES_SHIFT 4 #define ID_AA64ISAR2_WFXT_SHIFT 0
diff --git a/arch/arm64/include/asm/vectors.h b/arch/arm64/include/asm/vectors.h index 1f65c37dc653..f64613a96d53 100644 --- a/arch/arm64/include/asm/vectors.h +++ b/arch/arm64/include/asm/vectors.h @@ -32,6 +32,12 @@ enum arm64_bp_harden_el1_vectors { * canonical vectors. */ EL1_VECTOR_BHB_FW, + + /* + * Use the ClearBHB instruction, before branching to the canonical + * vectors. + */ + EL1_VECTOR_BHB_CLEAR_INSN, #endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
/* @@ -43,6 +49,7 @@ enum arm64_bp_harden_el1_vectors { #ifndef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY #define EL1_VECTOR_BHB_LOOP -1 #define EL1_VECTOR_BHB_FW -1 +#define EL1_VECTOR_BHB_CLEAR_INSN -1 #endif /* !CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
/* The vectors to use on return from EL0. e.g. to remap the kernel */ diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 0f74dc2b13c0..33b33416fea4 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -125,6 +125,8 @@ extern char __spectre_bhb_loop_k24_start[]; extern char __spectre_bhb_loop_k24_end[]; extern char __spectre_bhb_loop_k32_start[]; extern char __spectre_bhb_loop_k32_end[]; +extern char __spectre_bhb_clearbhb_start[]; +extern char __spectre_bhb_clearbhb_end[];
static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, const char *hyp_vecs_end) @@ -1086,6 +1088,7 @@ static void update_mitigation_state(enum mitigation_state *oldp, * - Mitigated by a branchy loop a CPU specific number of times, and listed * in our "loop mitigated list". * - Mitigated in software by the firmware Spectre v2 call. + * - Has the ClearBHB instruction to perform the mitigation. * - Has the 'Exception Clears Branch History Buffer' (ECBHB) feature, so no * software mitigation in the vectors is needed. * - Has CSV2.3, so is unaffected. @@ -1226,6 +1229,9 @@ bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, if (supports_csv2p3(scope)) return false;
+ if (supports_clearbhb(scope)) + return true; + if (spectre_bhb_loop_affected(scope)) return true;
@@ -1266,6 +1272,8 @@ static const char *kvm_bhb_get_vecs_end(const char *start) return __spectre_bhb_loop_k24_end; else if (start == __spectre_bhb_loop_k32_start) return __spectre_bhb_loop_k32_end; + else if (start == __spectre_bhb_clearbhb_start) + return __spectre_bhb_clearbhb_end;
return NULL; } @@ -1305,6 +1313,7 @@ static void kvm_setup_bhb_slot(const char *hyp_vecs_start) #define __spectre_bhb_loop_k8_start NULL #define __spectre_bhb_loop_k24_start NULL #define __spectre_bhb_loop_k32_start NULL +#define __spectre_bhb_clearbhb_start NULL
static void kvm_setup_bhb_slot(const char *hyp_vecs_start) { } #endif @@ -1323,6 +1332,11 @@ void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry) } else if (cpu_mitigations_off()) { pr_info_once("spectre-bhb mitigation disabled by command line option\n"); } else if (supports_ecbhb(SCOPE_LOCAL_CPU)) { + state = SPECTRE_MITIGATED; + } else if (supports_clearbhb(SCOPE_LOCAL_CPU)) { + kvm_setup_bhb_slot(__spectre_bhb_clearbhb_start); + this_cpu_set_vectors(EL1_VECTOR_BHB_CLEAR_INSN); + state = SPECTRE_MITIGATED; } else if (spectre_bhb_loop_affected(SCOPE_LOCAL_CPU)) { switch (spectre_bhb_loop_affected(SCOPE_SYSTEM)) { diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 0d89d535720f..d07dadd6b8ff 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -156,6 +156,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { };
static const struct arm64_ftr_bits ftr_id_aa64isar2[] = { + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_CLEARBHB_SHIFT, 4, 0), ARM64_FTR_END, };
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index fcfbb2b009e2..296422119488 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -1074,6 +1074,7 @@ alternative_else_nop_endif #define BHB_MITIGATION_NONE 0 #define BHB_MITIGATION_LOOP 1 #define BHB_MITIGATION_FW 2 +#define BHB_MITIGATION_INSN 3
.macro tramp_ventry, vector_start, regsize, kpti, bhb .align 7 @@ -1090,6 +1091,11 @@ alternative_else_nop_endif __mitigate_spectre_bhb_loop x30 .endif // \bhb == BHB_MITIGATION_LOOP
+ .if \bhb == BHB_MITIGATION_INSN + clearbhb + isb + .endif // \bhb == BHB_MITIGATION_INSN + .if \kpti == 1 /* * Defend against branch aliasing attacks by pushing a dummy @@ -1170,6 +1176,7 @@ ENTRY(tramp_vectors) #ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_LOOP generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_FW + generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_INSN #endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */ generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_NONE END(tramp_vectors) @@ -1232,6 +1239,7 @@ SYM_CODE_START(__bp_harden_el1_vectors) #ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY generate_el1_vector bhb=BHB_MITIGATION_LOOP generate_el1_vector bhb=BHB_MITIGATION_FW + generate_el1_vector bhb=BHB_MITIGATION_INSN #endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */ SYM_CODE_END(__bp_harden_el1_vectors) .popsection diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S index b59b66f1f905..99b8ecaae810 100644 --- a/arch/arm64/kvm/hyp/hyp-entry.S +++ b/arch/arm64/kvm/hyp/hyp-entry.S @@ -405,4 +405,10 @@ ENTRY(__spectre_bhb_loop_k32_start) ldp x0, x1, [sp, #(8 * 0)] add sp, sp, #(8 * 2) ENTRY(__spectre_bhb_loop_k32_end) + +ENTRY(__spectre_bhb_clearbhb_start) + esb + clearbhb + isb +ENTRY(__spectre_bhb_clearbhb_end) #endif
On Tue, Mar 15, 2022 at 06:23:53PM +0000, James Morse wrote:
Hi Greg,
Here is the state of the current v5.4 backport. Now that the 32bit code has been merged, it doesn't conflict when KVM's shared 32bit/64bit code needs to use these constants.
I've fixed the two issues that were reported against the v5.10 backport.
I had a go at bringing all the pre-requisites in to add proton-pack.c to v5.4. Its currently 39 patches: https://git.gitlab.arm.com/linux-arm/linux-jm.git /bhb/alternative_backport/UNTESTED/v5.4.183 (or for web browsers: https://gitlab.arm.com/linux-arm/linux-jm/-/commits/bhb/alternative_backport... )
I've queued it up.
I've not managed to bring b881cdce77b4 "KVM: arm64: Allocate hyp vectors statically" in, as that depends on the hypervisor's per-cpu support. Its this patch that means those 'smccc_wa' templates are needed. I estimate it would be 60 patches in total if we go this way, I don't think its a good idea: All this still has to work around 541ad0150ca4 ("arm: Remove 32bit KVM host support") and 9ed24f4b712b ("KVM: arm64: Move virt/kvm/arm to arch/arm64") being missing. I think this approach creates more problems than it solves as the files have to be in different places because of 32bit. It also creates a significantly larger testing problem: I'm not looking forward to working out how to test KVM guest migration over the variant-4 KVM ABI changes in 29e8910a566a.
I think that the workaround you mention later on for this issue makes sense cost/benefit-wise.
This version of the backport still adds the mitigation management code to cpu_errata.c, because that is where this stuff lived in v5.4.
If you prefer, I can try adding proton-pack.c as a new empty file. I think this would only confuse matters as the line-numbers would never match, and there is an interaction with the spectre-v2 mitigations that live in cpu_errata.c.
There is one patch not present in mainline 'KVM: arm64: Add templtes for BHB mitigation sequences'.
Thank you!
Hi Sasha,
On 3/16/22 3:41 PM, Sasha Levin wrote:
On Tue, Mar 15, 2022 at 06:23:53PM +0000, James Morse wrote:
Hi Greg,
Here is the state of the current v5.4 backport. Now that the 32bit code has been merged, it doesn't conflict when KVM's shared 32bit/64bit code needs to use these constants.
I've fixed the two issues that were reported against the v5.10 backport.
I had a go at bringing all the pre-requisites in to add proton-pack.c to v5.4. Its currently 39 patches: https://git.gitlab.arm.com/linux-arm/linux-jm.git /bhb/alternative_backport/UNTESTED/v5.4.183 (or for web browsers: https://gitlab.arm.com/linux-arm/linux-jm/-/commits/bhb/alternative_backport... )
I've queued it up.
Just to clarify - you've queued this series that I posted, not the above 'UNTESTED' branch where I tried (unsuccessfully) to bring in the prerequisites.
(The position of your comment made me jump!)
The background is Greg commented on the BHB list that he thought it would be better to bring in the prerequisites, I said I'd have a go. As described here, I think it just creates new problems.
Thanks,
James
On Wed, Mar 16, 2022 at 05:38:34PM +0000, James Morse wrote:
Hi Sasha,
On 3/16/22 3:41 PM, Sasha Levin wrote:
On Tue, Mar 15, 2022 at 06:23:53PM +0000, James Morse wrote:
Hi Greg,
Here is the state of the current v5.4 backport. Now that the 32bit code has been merged, it doesn't conflict when KVM's shared 32bit/64bit code needs to use these constants.
I've fixed the two issues that were reported against the v5.10 backport.
I had a go at bringing all the pre-requisites in to add proton-pack.c to v5.4. Its currently 39 patches: https://git.gitlab.arm.com/linux-arm/linux-jm.git /bhb/alternative_backport/UNTESTED/v5.4.183 (or for web browsers: https://gitlab.arm.com/linux-arm/linux-jm/-/commits/bhb/alternative_backport... )
I've queued it up.
Just to clarify - you've queued this series that I posted, not the above 'UNTESTED' branch where I tried (unsuccessfully) to bring in the prerequisites.
(The position of your comment made me jump!)
Yes, sorry, the series :)
On Wed, Mar 16, 2022 at 11:41:07AM -0400, Sasha Levin wrote:
On Tue, Mar 15, 2022 at 06:23:53PM +0000, James Morse wrote:
Hi Greg,
Here is the state of the current v5.4 backport. Now that the 32bit code has been merged, it doesn't conflict when KVM's shared 32bit/64bit code needs to use these constants.
I've fixed the two issues that were reported against the v5.10 backport.
I had a go at bringing all the pre-requisites in to add proton-pack.c to v5.4. Its currently 39 patches: https://git.gitlab.arm.com/linux-arm/linux-jm.git /bhb/alternative_backport/UNTESTED/v5.4.183 (or for web browsers: https://gitlab.arm.com/linux-arm/linux-jm/-/commits/bhb/alternative_backport... )
I've queued it up.
Thanks for merging these, and James, thanks for the backports! I agree, this way is a bit "simpler", but thanks for trying the other way as well.
Does this mean that the 4.19 and older backports will be easier?
greg k-h
Hi Greg,
On 3/17/22 10:00 AM, Greg KH wrote:
On Wed, Mar 16, 2022 at 11:41:07AM -0400, Sasha Levin wrote:
On Tue, Mar 15, 2022 at 06:23:53PM +0000, James Morse wrote:
Hi Greg,
Here is the state of the current v5.4 backport. Now that the 32bit code has been merged, it doesn't conflict when KVM's shared 32bit/64bit code needs to use these constants.
I've fixed the two issues that were reported against the v5.10 backport.
I had a go at bringing all the pre-requisites in to add proton-pack.c to v5.4. Its currently 39 patches: https://git.gitlab.arm.com/linux-arm/linux-jm.git /bhb/alternative_backport/UNTESTED/v5.4.183 (or for web browsers: https://gitlab.arm.com/linux-arm/linux-jm/-/commits/bhb/alternative_backport... )
I've queued it up.
Thanks for merging these, and James, thanks for the backports! I agree, this way is a bit "simpler", but thanks for trying the other way as well.
Does this mean that the 4.19 and older backports will be easier?
It means they'll be based on the versions I posted on before. But: the last review comment that needs addressing is the A76 ID macros that came in with an errata workaround, but the workaround wasn't backported. I need to work out how easy that is to do.
Thanks,
James
linux-stable-mirror@lists.linaro.org