This is the start of the stable review cycle for the 4.19.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.234-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.19.234-rc2
Juergen Gross jgross@suse.com xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
Juergen Gross jgross@suse.com xen/gnttab: fix gnttab_end_foreign_access() without page specified
Juergen Gross jgross@suse.com xen/pvcalls: use alloc/free_pages_exact()
Juergen Gross jgross@suse.com xen/9p: use alloc/free_pages_exact()
Juergen Gross jgross@suse.com xen: remove gnttab_query_foreign_access()
Juergen Gross jgross@suse.com xen/gntalloc: don't use gnttab_query_foreign_access()
Juergen Gross jgross@suse.com xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross jgross@suse.com xen/netfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross jgross@suse.com xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
Juergen Gross jgross@suse.com xen/grant-table: add gnttab_try_end_foreign_access()
Juergen Gross jgross@suse.com xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Russell King (Oracle) rmk+kernel@armlinux.org.uk ARM: fix build warning in proc-v7-bugs.c
Nathan Chancellor nathan@kernel.org ARM: Do not use NOCROSSREFS directive with ld.lld
Russell King (Oracle) rmk+kernel@armlinux.org.uk ARM: fix co-processor register typo
Sami Tolvanen samitolvanen@google.com kbuild: add CONFIG_LD_IS_LLD
Emmanuel Gil Peyrot linkmauve@linkmauve.fr ARM: fix build error when BPF_SYSCALL is disabled
Russell King (Oracle) rmk+kernel@armlinux.org.uk ARM: include unprivileged BPF status in Spectre V2 reporting
Russell King (Oracle) rmk+kernel@armlinux.org.uk ARM: Spectre-BHB workaround
Russell King (Oracle) rmk+kernel@armlinux.org.uk ARM: use LOADADDR() to get load address of sections
Russell King (Oracle) rmk+kernel@armlinux.org.uk ARM: early traps initialisation
Russell King (Oracle) rmk+kernel@armlinux.org.uk ARM: report Spectre v2 status through sysfs
Mark Rutland mark.rutland@arm.com arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
Steven Price steven.price@arm.com arm/arm64: Provide a wrapper for SMCCC 1.1 calls
Josh Poimboeuf jpoimboe@redhat.com x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
Josh Poimboeuf jpoimboe@redhat.com x86/speculation: Warn about Spectre v2 LFENCE mitigation
Kim Phillips kim.phillips@amd.com x86/speculation: Update link to AMD speculation whitepaper
Kim Phillips kim.phillips@amd.com x86/speculation: Use generic retpoline by default on AMD
Josh Poimboeuf jpoimboe@redhat.com x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Peter Zijlstra peterz@infradead.org Documentation/hw-vuln: Update spectre doc
Peter Zijlstra peterz@infradead.org x86/speculation: Add eIBRS + Retpoline options
Peter Zijlstra (Intel) peterz@infradead.org x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
Peter Zijlstra peterz@infradead.org x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
Borislav Petkov bp@suse.de x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 48 ++++-- Documentation/admin-guide/kernel-parameters.txt | 8 +- Makefile | 4 +- arch/arm/include/asm/assembler.h | 10 ++ arch/arm/include/asm/spectre.h | 32 ++++ arch/arm/kernel/Makefile | 2 + arch/arm/kernel/entry-armv.S | 79 ++++++++- arch/arm/kernel/entry-common.S | 24 +++ arch/arm/kernel/spectre.c | 71 ++++++++ arch/arm/kernel/traps.c | 65 ++++++- arch/arm/kernel/vmlinux.lds.h | 43 ++++- arch/arm/mm/Kconfig | 11 ++ arch/arm/mm/proc-v7-bugs.c | 201 +++++++++++++++++++--- arch/x86/include/asm/cpufeatures.h | 2 +- arch/x86/include/asm/nospec-branch.h | 16 +- arch/x86/kernel/cpu/bugs.c | 214 +++++++++++++++++------- drivers/block/xen-blkfront.c | 63 ++++--- drivers/firmware/psci.c | 15 ++ drivers/net/xen-netfront.c | 54 +++--- drivers/scsi/xen-scsifront.c | 3 +- drivers/xen/gntalloc.c | 25 +-- drivers/xen/grant-table.c | 71 ++++---- drivers/xen/pvcalls-front.c | 8 +- drivers/xen/xenbus/xenbus_client.c | 24 ++- include/linux/arm-smccc.h | 74 ++++++++ include/linux/bpf.h | 11 ++ include/xen/grant_table.h | 19 ++- init/Kconfig | 3 + kernel/sysctl.c | 8 + net/9p/trans_xen.c | 14 +- tools/arch/x86/include/asm/cpufeatures.h | 2 +- 31 files changed, 963 insertions(+), 261 deletions(-)
From: Borislav Petkov bp@suse.de
commit a5ce9f2bb665d1d2b31f139a02dbaa2dfbb62fa6 upstream.
Merge the test whether the CPU supports STIBP into the test which determines whether STIBP is required. Thus try to simplify what is already an insane logic.
Remove a superfluous newline in a comment, while at it.
Signed-off-by: Borislav Petkov bp@suse.de Cc: Anthony Steinhauser asteinhauser@google.com Link: https://lkml.kernel.org/r/20200615065806.GB14668@zn.tnic [fllinden@amazon.com: fixed contextual conflict (comment) for 4.19] Signed-off-by: Frank van der Linden fllinden@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/bugs.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -756,10 +756,12 @@ spectre_v2_user_select_mitigation(enum s }
/* - * If enhanced IBRS is enabled or SMT impossible, STIBP is not + * If no STIBP, enhanced IBRS is enabled or SMT impossible, STIBP is not * required. */ - if (!smt_possible || spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) + if (!boot_cpu_has(X86_FEATURE_STIBP) || + !smt_possible || + spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) return;
/* @@ -771,12 +773,6 @@ spectre_v2_user_select_mitigation(enum s boot_cpu_has(X86_FEATURE_AMD_STIBP_ALWAYS_ON)) mode = SPECTRE_V2_USER_STRICT_PREFERRED;
- /* - * If STIBP is not available, clear the STIBP mode. - */ - if (!boot_cpu_has(X86_FEATURE_STIBP)) - mode = SPECTRE_V2_USER_NONE; - spectre_v2_user_stibp = mode;
set_mode: @@ -1255,7 +1251,6 @@ static int ib_prctl_set(struct task_stru if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE && spectre_v2_user_stibp == SPECTRE_V2_USER_NONE) return 0; - /* * With strict mode for both IBPB and STIBP, the instruction * code paths avoid checking this task flag and instead,
From: Peter Zijlstra peterz@infradead.org
commit f8a66d608a3e471e1202778c2a36cbdc96bae73b upstream.
Currently Linux prevents usage of retpoline,amd on !AMD hardware, this is unfriendly and gets in the way of testing. Remove this restriction.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Borislav Petkov bp@suse.de Acked-by: Josh Poimboeuf jpoimboe@redhat.com Tested-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/r/20211026120310.487348118@infradead.org [fllinden@amazon.com: backported to 4.19 (no Hygon in 4.19)] Signed-off-by: Frank van der Linden fllinden@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/bugs.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -839,12 +839,6 @@ static enum spectre_v2_mitigation_cmd __ return SPECTRE_V2_CMD_AUTO; }
- if (cmd == SPECTRE_V2_CMD_RETPOLINE_AMD && - boot_cpu_data.x86_vendor != X86_VENDOR_AMD) { - pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n"); - return SPECTRE_V2_CMD_AUTO; - } - spec_v2_print_cond(mitigation_options[i].option, mitigation_options[i].secure); return cmd;
From: "Peter Zijlstra (Intel)" peterz@infradead.org
commit d45476d9832409371537013ebdd8dc1a7781f97a upstream.
The RETPOLINE_AMD name is unfortunate since it isn't necessarily AMD only, in fact Hygon also uses it. Furthermore it will likely be sufficient for some Intel processors. Therefore rename the thing to RETPOLINE_LFENCE to better describe what it is.
Add the spectre_v2=retpoline,lfence option as an alias to spectre_v2=retpoline,amd to preserve existing setups. However, the output of /sys/devices/system/cpu/vulnerabilities/spectre_v2 will be changed.
[ bp: Fix typos, massage. ]
Co-developed-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Thomas Gleixner tglx@linutronix.de [fllinden@amazon.com: backported to 4.19] Signed-off-by: Frank van der Linden fllinden@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/cpufeatures.h | 2 +- arch/x86/include/asm/nospec-branch.h | 12 ++++++------ arch/x86/kernel/cpu/bugs.c | 29 ++++++++++++++++++----------- tools/arch/x86/include/asm/cpufeatures.h | 2 +- 4 files changed, 26 insertions(+), 19 deletions(-)
--- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -203,7 +203,7 @@ #define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */ #define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */ #define X86_FEATURE_RETPOLINE ( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */ -#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* "" AMD Retpoline mitigation for Spectre variant 2 */ +#define X86_FEATURE_RETPOLINE_LFENCE ( 7*32+13) /* "" Use LFENCE for Spectre variant 2 */ #define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */ #define X86_FEATURE_CDP_L2 ( 7*32+15) /* Code and Data Prioritization L2 */ #define X86_FEATURE_MSR_SPEC_CTRL ( 7*32+16) /* "" MSR SPEC_CTRL is implemented */ --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -119,7 +119,7 @@ ANNOTATE_NOSPEC_ALTERNATIVE ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \ __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_LFENCE #else jmp *\reg #endif @@ -130,7 +130,7 @@ ANNOTATE_NOSPEC_ALTERNATIVE ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \ __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\ - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_LFENCE #else call *\reg #endif @@ -181,7 +181,7 @@ "lfence;\n" \ ANNOTATE_RETPOLINE_SAFE \ "call *%[thunk_target]\n", \ - X86_FEATURE_RETPOLINE_AMD) + X86_FEATURE_RETPOLINE_LFENCE) # define THUNK_TARGET(addr) [thunk_target] "r" (addr)
#else /* CONFIG_X86_32 */ @@ -211,7 +211,7 @@ "lfence;\n" \ ANNOTATE_RETPOLINE_SAFE \ "call *%[thunk_target]\n", \ - X86_FEATURE_RETPOLINE_AMD) + X86_FEATURE_RETPOLINE_LFENCE)
# define THUNK_TARGET(addr) [thunk_target] "rm" (addr) #endif @@ -223,8 +223,8 @@ /* The Spectre V2 mitigation variants */ enum spectre_v2_mitigation { SPECTRE_V2_NONE, - SPECTRE_V2_RETPOLINE_GENERIC, - SPECTRE_V2_RETPOLINE_AMD, + SPECTRE_V2_RETPOLINE, + SPECTRE_V2_LFENCE, SPECTRE_V2_IBRS_ENHANCED, };
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -621,7 +621,7 @@ enum spectre_v2_mitigation_cmd { SPECTRE_V2_CMD_FORCE, SPECTRE_V2_CMD_RETPOLINE, SPECTRE_V2_CMD_RETPOLINE_GENERIC, - SPECTRE_V2_CMD_RETPOLINE_AMD, + SPECTRE_V2_CMD_RETPOLINE_LFENCE, };
enum spectre_v2_user_cmd { @@ -781,8 +781,8 @@ set_mode:
static const char * const spectre_v2_strings[] = { [SPECTRE_V2_NONE] = "Vulnerable", - [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline", - [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline", + [SPECTRE_V2_RETPOLINE] = "Mitigation: Retpolines", + [SPECTRE_V2_LFENCE] = "Mitigation: LFENCE", [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS", };
@@ -794,7 +794,8 @@ static const struct { { "off", SPECTRE_V2_CMD_NONE, false }, { "on", SPECTRE_V2_CMD_FORCE, true }, { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false }, - { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false }, + { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_LFENCE, false }, + { "retpoline,lfence", SPECTRE_V2_CMD_RETPOLINE_LFENCE, false }, { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false }, { "auto", SPECTRE_V2_CMD_AUTO, false }, }; @@ -832,13 +833,19 @@ static enum spectre_v2_mitigation_cmd __ }
if ((cmd == SPECTRE_V2_CMD_RETPOLINE || - cmd == SPECTRE_V2_CMD_RETPOLINE_AMD || + cmd == SPECTRE_V2_CMD_RETPOLINE_LFENCE || cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC) && !IS_ENABLED(CONFIG_RETPOLINE)) { pr_err("%s selected but not compiled in. Switching to AUTO select\n", mitigation_options[i].option); return SPECTRE_V2_CMD_AUTO; }
+ if ((cmd == SPECTRE_V2_CMD_RETPOLINE_LFENCE) && + !boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) { + pr_err("%s selected, but CPU doesn't have a serializing LFENCE. Switching to AUTO select\n", mitigation_options[i].option); + return SPECTRE_V2_CMD_AUTO; + } + spec_v2_print_cond(mitigation_options[i].option, mitigation_options[i].secure); return cmd; @@ -873,9 +880,9 @@ static void __init spectre_v2_select_mit if (IS_ENABLED(CONFIG_RETPOLINE)) goto retpoline_auto; break; - case SPECTRE_V2_CMD_RETPOLINE_AMD: + case SPECTRE_V2_CMD_RETPOLINE_LFENCE: if (IS_ENABLED(CONFIG_RETPOLINE)) - goto retpoline_amd; + goto retpoline_lfence; break; case SPECTRE_V2_CMD_RETPOLINE_GENERIC: if (IS_ENABLED(CONFIG_RETPOLINE)) @@ -891,17 +898,17 @@ static void __init spectre_v2_select_mit
retpoline_auto: if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { - retpoline_amd: + retpoline_lfence: if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) { pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n"); goto retpoline_generic; } - mode = SPECTRE_V2_RETPOLINE_AMD; - setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD); + mode = SPECTRE_V2_LFENCE; + setup_force_cpu_cap(X86_FEATURE_RETPOLINE_LFENCE); setup_force_cpu_cap(X86_FEATURE_RETPOLINE); } else { retpoline_generic: - mode = SPECTRE_V2_RETPOLINE_GENERIC; + mode = SPECTRE_V2_RETPOLINE; setup_force_cpu_cap(X86_FEATURE_RETPOLINE); }
--- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -203,7 +203,7 @@ #define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */ #define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */ #define X86_FEATURE_RETPOLINE ( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */ -#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* "" AMD Retpoline mitigation for Spectre variant 2 */ +#define X86_FEATURE_RETPOLINE_LFENCE ( 7*32+13) /* "" Use LFENCEs for Spectre variant 2 */ #define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */ #define X86_FEATURE_CDP_L2 ( 7*32+15) /* Code and Data Prioritization L2 */ #define X86_FEATURE_MSR_SPEC_CTRL ( 7*32+16) /* "" MSR SPEC_CTRL is implemented */
From: Peter Zijlstra peterz@infradead.org
commit 1e19da8522c81bf46b335f84137165741e0d82b7 upstream.
Thanks to the chaps at VUsec it is now clear that eIBRS is not sufficient, therefore allow enabling of retpolines along with eIBRS.
Add spectre_v2=eibrs, spectre_v2=eibrs,lfence and spectre_v2=eibrs,retpoline options to explicitly pick your preferred means of mitigation.
Since there's new mitigations there's also user visible changes in /sys/devices/system/cpu/vulnerabilities/spectre_v2 to reflect these new mitigations.
[ bp: Massage commit message, trim error messages, do more precise eIBRS mode checking. ]
Co-developed-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Patrick Colp patrick.colp@oracle.com Reviewed-by: Thomas Gleixner tglx@linutronix.de [fllinden@amazon.com: backported to 4.19 (no Hygon)] Signed-off-by: Frank van der Linden fllinden@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/nospec-branch.h | 4 - arch/x86/kernel/cpu/bugs.c | 131 +++++++++++++++++++++++++---------- 2 files changed, 98 insertions(+), 37 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -225,7 +225,9 @@ enum spectre_v2_mitigation { SPECTRE_V2_NONE, SPECTRE_V2_RETPOLINE, SPECTRE_V2_LFENCE, - SPECTRE_V2_IBRS_ENHANCED, + SPECTRE_V2_EIBRS, + SPECTRE_V2_EIBRS_RETPOLINE, + SPECTRE_V2_EIBRS_LFENCE, };
/* The indirect branch speculation control variants */ --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -622,6 +622,9 @@ enum spectre_v2_mitigation_cmd { SPECTRE_V2_CMD_RETPOLINE, SPECTRE_V2_CMD_RETPOLINE_GENERIC, SPECTRE_V2_CMD_RETPOLINE_LFENCE, + SPECTRE_V2_CMD_EIBRS, + SPECTRE_V2_CMD_EIBRS_RETPOLINE, + SPECTRE_V2_CMD_EIBRS_LFENCE, };
enum spectre_v2_user_cmd { @@ -694,6 +697,13 @@ spectre_v2_parse_user_cmdline(enum spect return SPECTRE_V2_USER_CMD_AUTO; }
+static inline bool spectre_v2_in_eibrs_mode(enum spectre_v2_mitigation mode) +{ + return (mode == SPECTRE_V2_EIBRS || + mode == SPECTRE_V2_EIBRS_RETPOLINE || + mode == SPECTRE_V2_EIBRS_LFENCE); +} + static void __init spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd) { @@ -761,7 +771,7 @@ spectre_v2_user_select_mitigation(enum s */ if (!boot_cpu_has(X86_FEATURE_STIBP) || !smt_possible || - spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) + spectre_v2_in_eibrs_mode(spectre_v2_enabled)) return;
/* @@ -783,7 +793,9 @@ static const char * const spectre_v2_str [SPECTRE_V2_NONE] = "Vulnerable", [SPECTRE_V2_RETPOLINE] = "Mitigation: Retpolines", [SPECTRE_V2_LFENCE] = "Mitigation: LFENCE", - [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS", + [SPECTRE_V2_EIBRS] = "Mitigation: Enhanced IBRS", + [SPECTRE_V2_EIBRS_LFENCE] = "Mitigation: Enhanced IBRS + LFENCE", + [SPECTRE_V2_EIBRS_RETPOLINE] = "Mitigation: Enhanced IBRS + Retpolines", };
static const struct { @@ -797,6 +809,9 @@ static const struct { { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_LFENCE, false }, { "retpoline,lfence", SPECTRE_V2_CMD_RETPOLINE_LFENCE, false }, { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false }, + { "eibrs", SPECTRE_V2_CMD_EIBRS, false }, + { "eibrs,lfence", SPECTRE_V2_CMD_EIBRS_LFENCE, false }, + { "eibrs,retpoline", SPECTRE_V2_CMD_EIBRS_RETPOLINE, false }, { "auto", SPECTRE_V2_CMD_AUTO, false }, };
@@ -834,15 +849,29 @@ static enum spectre_v2_mitigation_cmd __
if ((cmd == SPECTRE_V2_CMD_RETPOLINE || cmd == SPECTRE_V2_CMD_RETPOLINE_LFENCE || - cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC) && + cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC || + cmd == SPECTRE_V2_CMD_EIBRS_LFENCE || + cmd == SPECTRE_V2_CMD_EIBRS_RETPOLINE) && !IS_ENABLED(CONFIG_RETPOLINE)) { - pr_err("%s selected but not compiled in. Switching to AUTO select\n", mitigation_options[i].option); + pr_err("%s selected but not compiled in. Switching to AUTO select\n", + mitigation_options[i].option); + return SPECTRE_V2_CMD_AUTO; + } + + if ((cmd == SPECTRE_V2_CMD_EIBRS || + cmd == SPECTRE_V2_CMD_EIBRS_LFENCE || + cmd == SPECTRE_V2_CMD_EIBRS_RETPOLINE) && + !boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) { + pr_err("%s selected but CPU doesn't have eIBRS. Switching to AUTO select\n", + mitigation_options[i].option); return SPECTRE_V2_CMD_AUTO; }
- if ((cmd == SPECTRE_V2_CMD_RETPOLINE_LFENCE) && + if ((cmd == SPECTRE_V2_CMD_RETPOLINE_LFENCE || + cmd == SPECTRE_V2_CMD_EIBRS_LFENCE) && !boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) { - pr_err("%s selected, but CPU doesn't have a serializing LFENCE. Switching to AUTO select\n", mitigation_options[i].option); + pr_err("%s selected, but CPU doesn't have a serializing LFENCE. Switching to AUTO select\n", + mitigation_options[i].option); return SPECTRE_V2_CMD_AUTO; }
@@ -851,6 +880,24 @@ static enum spectre_v2_mitigation_cmd __ return cmd; }
+static enum spectre_v2_mitigation __init spectre_v2_select_retpoline(void) +{ + if (!IS_ENABLED(CONFIG_RETPOLINE)) { + pr_err("Kernel not compiled with retpoline; no mitigation available!"); + return SPECTRE_V2_NONE; + } + + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { + if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) { + pr_err("LFENCE not serializing, switching to generic retpoline\n"); + return SPECTRE_V2_RETPOLINE; + } + return SPECTRE_V2_LFENCE; + } + + return SPECTRE_V2_RETPOLINE; +} + static void __init spectre_v2_select_mitigation(void) { enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline(); @@ -871,48 +918,60 @@ static void __init spectre_v2_select_mit case SPECTRE_V2_CMD_FORCE: case SPECTRE_V2_CMD_AUTO: if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) { - mode = SPECTRE_V2_IBRS_ENHANCED; - /* Force it so VMEXIT will restore correctly */ - x86_spec_ctrl_base |= SPEC_CTRL_IBRS; - wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); - goto specv2_set_mode; + mode = SPECTRE_V2_EIBRS; + break; } - if (IS_ENABLED(CONFIG_RETPOLINE)) - goto retpoline_auto; + + mode = spectre_v2_select_retpoline(); break; + case SPECTRE_V2_CMD_RETPOLINE_LFENCE: - if (IS_ENABLED(CONFIG_RETPOLINE)) - goto retpoline_lfence; + mode = SPECTRE_V2_LFENCE; break; + case SPECTRE_V2_CMD_RETPOLINE_GENERIC: - if (IS_ENABLED(CONFIG_RETPOLINE)) - goto retpoline_generic; + mode = SPECTRE_V2_RETPOLINE; break; + case SPECTRE_V2_CMD_RETPOLINE: - if (IS_ENABLED(CONFIG_RETPOLINE)) - goto retpoline_auto; + mode = spectre_v2_select_retpoline(); + break; + + case SPECTRE_V2_CMD_EIBRS: + mode = SPECTRE_V2_EIBRS; + break; + + case SPECTRE_V2_CMD_EIBRS_LFENCE: + mode = SPECTRE_V2_EIBRS_LFENCE; + break; + + case SPECTRE_V2_CMD_EIBRS_RETPOLINE: + mode = SPECTRE_V2_EIBRS_RETPOLINE; break; } - pr_err("Spectre mitigation: kernel not compiled with retpoline; no mitigation available!"); - return;
-retpoline_auto: - if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { - retpoline_lfence: - if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) { - pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n"); - goto retpoline_generic; - } - mode = SPECTRE_V2_LFENCE; + if (spectre_v2_in_eibrs_mode(mode)) { + /* Force it so VMEXIT will restore correctly */ + x86_spec_ctrl_base |= SPEC_CTRL_IBRS; + wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); + } + + switch (mode) { + case SPECTRE_V2_NONE: + case SPECTRE_V2_EIBRS: + break; + + case SPECTRE_V2_LFENCE: + case SPECTRE_V2_EIBRS_LFENCE: setup_force_cpu_cap(X86_FEATURE_RETPOLINE_LFENCE); + /* fallthrough */ + + case SPECTRE_V2_RETPOLINE: + case SPECTRE_V2_EIBRS_RETPOLINE: setup_force_cpu_cap(X86_FEATURE_RETPOLINE); - } else { - retpoline_generic: - mode = SPECTRE_V2_RETPOLINE; - setup_force_cpu_cap(X86_FEATURE_RETPOLINE); + break; }
-specv2_set_mode: spectre_v2_enabled = mode; pr_info("%s\n", spectre_v2_strings[mode]);
@@ -938,7 +997,7 @@ specv2_set_mode: * the CPU supports Enhanced IBRS, kernel might un-intentionally not * enable IBRS around firmware calls. */ - if (boot_cpu_has(X86_FEATURE_IBRS) && mode != SPECTRE_V2_IBRS_ENHANCED) { + if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_eibrs_mode(mode)) { setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW); pr_info("Enabling Restricted Speculation for firmware calls\n"); } @@ -1596,7 +1655,7 @@ static ssize_t tsx_async_abort_show_stat
static char *stibp_state(void) { - if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) + if (spectre_v2_in_eibrs_mode(spectre_v2_enabled)) return "";
switch (spectre_v2_user_stibp) {
From: Peter Zijlstra peterz@infradead.org
commit 5ad3eb1132453b9795ce5fd4572b1c18b292cca9 upstream.
Update the doc with the new fun.
[ bp: Massage commit message. ]
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Thomas Gleixner tglx@linutronix.de [fllinden@amazon.com: backported to 4.19] Signed-off-by: Frank van der Linden fllinden@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/hw-vuln/spectre.rst | 42 ++++++++++++++++-------- Documentation/admin-guide/kernel-parameters.txt | 8 +++- 2 files changed, 35 insertions(+), 15 deletions(-)
--- a/Documentation/admin-guide/hw-vuln/spectre.rst +++ b/Documentation/admin-guide/hw-vuln/spectre.rst @@ -131,6 +131,19 @@ steer its indirect branch speculations t speculative execution's side effects left in level 1 cache to infer the victim's data.
+Yet another variant 2 attack vector is for the attacker to poison the +Branch History Buffer (BHB) to speculatively steer an indirect branch +to a specific Branch Target Buffer (BTB) entry, even if the entry isn't +associated with the source address of the indirect branch. Specifically, +the BHB might be shared across privilege levels even in the presence of +Enhanced IBRS. + +Currently the only known real-world BHB attack vector is via +unprivileged eBPF. Therefore, it's highly recommended to not enable +unprivileged eBPF, especially when eIBRS is used (without retpolines). +For a full mitigation against BHB attacks, it's recommended to use +retpolines (or eIBRS combined with retpolines). + Attack scenarios ----------------
@@ -364,13 +377,15 @@ The possible values in this file are:
- Kernel status:
- ==================================== ================================= - 'Not affected' The processor is not vulnerable - 'Vulnerable' Vulnerable, no mitigation - 'Mitigation: Full generic retpoline' Software-focused mitigation - 'Mitigation: Full AMD retpoline' AMD-specific software mitigation - 'Mitigation: Enhanced IBRS' Hardware-focused mitigation - ==================================== ================================= + ======================================== ================================= + 'Not affected' The processor is not vulnerable + 'Mitigation: None' Vulnerable, no mitigation + 'Mitigation: Retpolines' Use Retpoline thunks + 'Mitigation: LFENCE' Use LFENCE instructions + 'Mitigation: Enhanced IBRS' Hardware-focused mitigation + 'Mitigation: Enhanced IBRS + Retpolines' Hardware-focused + Retpolines + 'Mitigation: Enhanced IBRS + LFENCE' Hardware-focused + LFENCE + ======================================== =================================
- Firmware status: Show if Indirect Branch Restricted Speculation (IBRS) is used to protect against Spectre variant 2 attacks when calling firmware (x86 only). @@ -584,12 +599,13 @@ kernel command line.
Specific mitigations can also be selected manually:
- retpoline - replace indirect branches - retpoline,generic - google's original retpoline - retpoline,amd - AMD-specific minimal thunk + retpoline auto pick between generic,lfence + retpoline,generic Retpolines + retpoline,lfence LFENCE; indirect branch + retpoline,amd alias for retpoline,lfence + eibrs enhanced IBRS + eibrs,retpoline enhanced IBRS + Retpolines + eibrs,lfence enhanced IBRS + LFENCE
Not specifying this option is equivalent to spectre_v2=auto. --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4329,8 +4329,12 @@ Specific mitigations can also be selected manually:
retpoline - replace indirect branches - retpoline,generic - google's original retpoline - retpoline,amd - AMD-specific minimal thunk + retpoline,generic - Retpolines + retpoline,lfence - LFENCE; indirect branch + retpoline,amd - alias for retpoline,lfence + eibrs - enhanced IBRS + eibrs,retpoline - enhanced IBRS + Retpolines + eibrs,lfence - enhanced IBRS + LFENCE
Not specifying this option is equivalent to spectre_v2=auto.
From: Josh Poimboeuf jpoimboe@redhat.com
commit 44a3918c8245ab10c6c9719dd12e7a8d291980d8 upstream.
With unprivileged eBPF enabled, eIBRS (without retpoline) is vulnerable to Spectre v2 BHB-based attacks.
When both are enabled, print a warning message and report it in the 'spectre_v2' sysfs vulnerabilities file.
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Thomas Gleixner tglx@linutronix.de [fllinden@amazon.com: backported to 4.19] Signed-off-by: Frank van der Linden fllinden@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/bugs.c | 35 +++++++++++++++++++++++++++++------ include/linux/bpf.h | 11 +++++++++++ kernel/sysctl.c | 8 ++++++++ 3 files changed, 48 insertions(+), 6 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -31,6 +31,7 @@ #include <asm/intel-family.h> #include <asm/e820/api.h> #include <asm/hypervisor.h> +#include <linux/bpf.h>
#include "cpu.h"
@@ -607,6 +608,16 @@ static inline const char *spectre_v2_mod static inline const char *spectre_v2_module_string(void) { return ""; } #endif
+#define SPECTRE_V2_EIBRS_EBPF_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!\n" + +#ifdef CONFIG_BPF_SYSCALL +void unpriv_ebpf_notify(int new_state) +{ + if (spectre_v2_enabled == SPECTRE_V2_EIBRS && !new_state) + pr_err(SPECTRE_V2_EIBRS_EBPF_MSG); +} +#endif + static inline bool match_option(const char *arg, int arglen, const char *opt) { int len = strlen(opt); @@ -950,6 +961,9 @@ static void __init spectre_v2_select_mit break; }
+ if (mode == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled()) + pr_err(SPECTRE_V2_EIBRS_EBPF_MSG); + if (spectre_v2_in_eibrs_mode(mode)) { /* Force it so VMEXIT will restore correctly */ x86_spec_ctrl_base |= SPEC_CTRL_IBRS; @@ -1685,6 +1699,20 @@ static char *ibpb_state(void) return ""; }
+static ssize_t spectre_v2_show_state(char *buf) +{ + if (spectre_v2_enabled == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled()) + return sprintf(buf, "Vulnerable: Unprivileged eBPF enabled\n"); + + return sprintf(buf, "%s%s%s%s%s%s\n", + spectre_v2_strings[spectre_v2_enabled], + ibpb_state(), + boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", + stibp_state(), + boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "", + spectre_v2_module_string()); +} + static ssize_t srbds_show_state(char *buf) { return sprintf(buf, "%s\n", srbds_strings[srbds_mitigation]); @@ -1710,12 +1738,7 @@ static ssize_t cpu_show_common(struct de return sprintf(buf, "%s\n", spectre_v1_strings[spectre_v1_mitigation]);
case X86_BUG_SPECTRE_V2: - return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], - ibpb_state(), - boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", - stibp_state(), - boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "", - spectre_v2_module_string()); + return spectre_v2_show_state(buf);
case X86_BUG_SPEC_STORE_BYPASS: return sprintf(buf, "%s\n", ssb_strings[ssb_mode]); --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -533,6 +533,11 @@ static inline int bpf_map_attr_numa_node struct bpf_prog *bpf_prog_get_type_path(const char *name, enum bpf_prog_type type); int array_map_alloc_check(union bpf_attr *attr);
+static inline bool unprivileged_ebpf_enabled(void) +{ + return !sysctl_unprivileged_bpf_disabled; +} + #else /* !CONFIG_BPF_SYSCALL */ static inline struct bpf_prog *bpf_prog_get(u32 ufd) { @@ -644,6 +649,12 @@ static inline struct bpf_prog *bpf_prog_ { return ERR_PTR(-EOPNOTSUPP); } + +static inline bool unprivileged_ebpf_enabled(void) +{ + return false; +} + #endif /* CONFIG_BPF_SYSCALL */
static inline struct bpf_prog *bpf_prog_get_type(u32 ufd, --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -251,6 +251,11 @@ static int sysrq_sysctl_handler(struct c #endif
#ifdef CONFIG_BPF_SYSCALL + +void __weak unpriv_ebpf_notify(int new_state) +{ +} + static int bpf_unpriv_handler(struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { @@ -268,6 +273,9 @@ static int bpf_unpriv_handler(struct ctl return -EPERM; *(int *)table->data = unpriv_enable; } + + unpriv_ebpf_notify(unpriv_enable); + return ret; } #endif
From: Kim Phillips kim.phillips@amd.com
commit 244d00b5dd4755f8df892c86cab35fb2cfd4f14b upstream.
AMD retpoline may be susceptible to speculation. The speculation execution window for an incorrect indirect branch prediction using LFENCE/JMP sequence may potentially be large enough to allow exploitation using Spectre V2.
By default, don't use retpoline,lfence on AMD. Instead, use the generic retpoline.
Signed-off-by: Kim Phillips kim.phillips@amd.com Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/bugs.c | 8 -------- 1 file changed, 8 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -898,14 +898,6 @@ static enum spectre_v2_mitigation __init return SPECTRE_V2_NONE; }
- if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { - if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) { - pr_err("LFENCE not serializing, switching to generic retpoline\n"); - return SPECTRE_V2_RETPOLINE; - } - return SPECTRE_V2_LFENCE; - } - return SPECTRE_V2_RETPOLINE; }
From: Kim Phillips kim.phillips@amd.com
commit e9b6013a7ce31535b04b02ba99babefe8a8599fa upstream.
Update the link to the "Software Techniques for Managing Speculation on AMD Processors" whitepaper.
Signed-off-by: Kim Phillips kim.phillips@amd.com Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/hw-vuln/spectre.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/Documentation/admin-guide/hw-vuln/spectre.rst +++ b/Documentation/admin-guide/hw-vuln/spectre.rst @@ -60,8 +60,8 @@ privileged data touched during the specu Spectre variant 1 attacks take advantage of speculative execution of conditional branches, while Spectre variant 2 attacks use speculative execution of indirect branches to leak privileged memory. -See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[7] <spec_ref7>` -:ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`. +See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[6] <spec_ref6>` +:ref:`[7] <spec_ref7>` :ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
Spectre variant 1 (Bounds Check Bypass) --------------------------------------- @@ -746,7 +746,7 @@ AMD white papers:
.. _spec_ref6:
-[6] `Software techniques for managing speculation on AMD processors https://developer.amd.com/wp-content/resources/90343-B_SoftwareTechniquesforManagingSpeculation_WP_7-18Update_FNL.pdf`_. +[6] `Software techniques for managing speculation on AMD processors https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf`_.
ARM white papers:
From: Josh Poimboeuf jpoimboe@redhat.com
commit eafd987d4a82c7bb5aa12f0e3b4f8f3dea93e678 upstream.
With:
f8a66d608a3e ("x86,bugs: Unconditionally allow spectre_v2=retpoline,amd")
it became possible to enable the LFENCE "retpoline" on Intel. However, Intel doesn't recommend it, as it has some weaknesses compared to retpoline.
Now AMD doesn't recommend it either.
It can still be left available as a cmdline option. It's faster than retpoline but is weaker in certain scenarios -- particularly SMT, but even non-SMT may be vulnerable in some cases.
So just unconditionally warn if the user requests it on the cmdline.
[ bp: Massage commit message. ]
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/bugs.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -608,6 +608,7 @@ static inline const char *spectre_v2_mod static inline const char *spectre_v2_module_string(void) { return ""; } #endif
+#define SPECTRE_V2_LFENCE_MSG "WARNING: LFENCE mitigation is not recommended for this CPU, data leaks possible!\n" #define SPECTRE_V2_EIBRS_EBPF_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!\n"
#ifdef CONFIG_BPF_SYSCALL @@ -929,6 +930,7 @@ static void __init spectre_v2_select_mit break;
case SPECTRE_V2_CMD_RETPOLINE_LFENCE: + pr_err(SPECTRE_V2_LFENCE_MSG); mode = SPECTRE_V2_LFENCE; break;
@@ -1693,6 +1695,9 @@ static char *ibpb_state(void)
static ssize_t spectre_v2_show_state(char *buf) { + if (spectre_v2_enabled == SPECTRE_V2_LFENCE) + return sprintf(buf, "Vulnerable: LFENCE\n"); + if (spectre_v2_enabled == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled()) return sprintf(buf, "Vulnerable: Unprivileged eBPF enabled\n");
From: Josh Poimboeuf jpoimboe@redhat.com
commit 0de05d056afdb00eca8c7bbb0c79a3438daf700c upstream.
The commit
44a3918c8245 ("x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting")
added a warning for the "eIBRS + unprivileged eBPF" combination, which has been shown to be vulnerable against Spectre v2 BHB-based attacks.
However, there's no warning about the "eIBRS + LFENCE retpoline + unprivileged eBPF" combo. The LFENCE adds more protection by shortening the speculation window after a mispredicted branch. That makes an attack significantly more difficult, even with unprivileged eBPF. So at least for now the logic doesn't warn about that combination.
But if you then add SMT into the mix, the SMT attack angle weakens the effectiveness of the LFENCE considerably.
So extend the "eIBRS + unprivileged eBPF" warning to also include the "eIBRS + LFENCE + unprivileged eBPF + SMT" case.
[ bp: Massage commit message. ]
Suggested-by: Alyssa Milburn alyssa.milburn@linux.intel.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/bugs.c | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -610,12 +610,27 @@ static inline const char *spectre_v2_mod
#define SPECTRE_V2_LFENCE_MSG "WARNING: LFENCE mitigation is not recommended for this CPU, data leaks possible!\n" #define SPECTRE_V2_EIBRS_EBPF_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!\n" +#define SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS+LFENCE mitigation and SMT, data leaks possible via Spectre v2 BHB attacks!\n"
#ifdef CONFIG_BPF_SYSCALL void unpriv_ebpf_notify(int new_state) { - if (spectre_v2_enabled == SPECTRE_V2_EIBRS && !new_state) + if (new_state) + return; + + /* Unprivileged eBPF is enabled */ + + switch (spectre_v2_enabled) { + case SPECTRE_V2_EIBRS: pr_err(SPECTRE_V2_EIBRS_EBPF_MSG); + break; + case SPECTRE_V2_EIBRS_LFENCE: + if (sched_smt_active()) + pr_err(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG); + break; + default: + break; + } } #endif
@@ -1075,6 +1090,10 @@ void arch_smt_update(void) { mutex_lock(&spec_ctrl_mutex);
+ if (sched_smt_active() && unprivileged_ebpf_enabled() && + spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE) + pr_warn_once(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG); + switch (spectre_v2_user_stibp) { case SPECTRE_V2_USER_NONE: break; @@ -1699,7 +1718,11 @@ static ssize_t spectre_v2_show_state(cha return sprintf(buf, "Vulnerable: LFENCE\n");
if (spectre_v2_enabled == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled()) - return sprintf(buf, "Vulnerable: Unprivileged eBPF enabled\n"); + return sprintf(buf, "Vulnerable: eIBRS with unprivileged eBPF\n"); + + if (sched_smt_active() && unprivileged_ebpf_enabled() && + spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE) + return sprintf(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n");
return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
From: Steven Price steven.price@arm.com
commit 541625ac47ce9d0835efaee0fcbaa251b0000a37 upstream.
SMCCC 1.1 calls may use either HVC or SMC depending on the PSCI conduit. Rather than coding this in every call site, provide a macro which uses the correct instruction. The macro also handles the case where no conduit is configured/available returning a not supported error in res, along with returning the conduit used for the call.
This allow us to remove some duplicated code and will be useful later when adding paravirtualized time hypervisor calls.
Signed-off-by: Steven Price steven.price@arm.com Acked-by: Will Deacon will@kernel.org Signed-off-by: Marc Zyngier maz@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/arm-smccc.h | 58 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+)
--- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -311,5 +311,63 @@ asmlinkage void __arm_smccc_hvc(unsigned #define SMCCC_RET_NOT_SUPPORTED -1 #define SMCCC_RET_NOT_REQUIRED -2
+/* + * Like arm_smccc_1_1* but always returns SMCCC_RET_NOT_SUPPORTED. + * Used when the SMCCC conduit is not defined. The empty asm statement + * avoids compiler warnings about unused variables. + */ +#define __fail_smccc_1_1(...) \ + do { \ + __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \ + asm ("" __constraints(__count_args(__VA_ARGS__))); \ + if (___res) \ + ___res->a0 = SMCCC_RET_NOT_SUPPORTED; \ + } while (0) + +/* + * arm_smccc_1_1_invoke() - make an SMCCC v1.1 compliant call + * + * This is a variadic macro taking one to eight source arguments, and + * an optional return structure. + * + * @a0-a7: arguments passed in registers 0 to 7 + * @res: result values from registers 0 to 3 + * + * This macro will make either an HVC call or an SMC call depending on the + * current SMCCC conduit. If no valid conduit is available then -1 + * (SMCCC_RET_NOT_SUPPORTED) is returned in @res.a0 (if supplied). + * + * The return value also provides the conduit that was used. + */ +#define arm_smccc_1_1_invoke(...) ({ \ + int method = arm_smccc_1_1_get_conduit(); \ + switch (method) { \ + case SMCCC_CONDUIT_HVC: \ + arm_smccc_1_1_hvc(__VA_ARGS__); \ + break; \ + case SMCCC_CONDUIT_SMC: \ + arm_smccc_1_1_smc(__VA_ARGS__); \ + break; \ + default: \ + __fail_smccc_1_1(__VA_ARGS__); \ + method = SMCCC_CONDUIT_NONE; \ + break; \ + } \ + method; \ + }) + +/* Paravirtualised time calls (defined by ARM DEN0057A) */ +#define ARM_SMCCC_HV_PV_TIME_FEATURES \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x20) + +#define ARM_SMCCC_HV_PV_TIME_ST \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x21) + #endif /*__ASSEMBLY__*/ #endif /*__LINUX_ARM_SMCCC_H*/
From: Mark Rutland mark.rutland@arm.com
commit 6b7fe77c334ae59fed9500140e08f4f896b36871 upstream.
SMCCC callers are currently amassing a collection of enums for the SMCCC conduit, and are having to dig into the PSCI driver's internals in order to figure out what to do.
Let's clean this up, with common SMCCC_CONDUIT_* definitions, and an arm_smccc_1_1_get_conduit() helper that abstracts the PSCI driver's internal state.
We can kill off the PSCI_CONDUIT_* definitions once we've migrated users over to the new interface.
Signed-off-by: Mark Rutland mark.rutland@arm.com Acked-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Acked-by: Will Deacon will.deacon@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/psci.c | 15 +++++++++++++++ include/linux/arm-smccc.h | 16 ++++++++++++++++ 2 files changed, 31 insertions(+)
--- a/drivers/firmware/psci.c +++ b/drivers/firmware/psci.c @@ -64,6 +64,21 @@ struct psci_operations psci_ops = { .smccc_version = SMCCC_VERSION_1_0, };
+enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void) +{ + if (psci_ops.smccc_version < SMCCC_VERSION_1_1) + return SMCCC_CONDUIT_NONE; + + switch (psci_ops.conduit) { + case PSCI_CONDUIT_SMC: + return SMCCC_CONDUIT_SMC; + case PSCI_CONDUIT_HVC: + return SMCCC_CONDUIT_HVC; + default: + return SMCCC_CONDUIT_NONE; + } +} + typedef unsigned long (psci_fn)(unsigned long, unsigned long, unsigned long, unsigned long); static psci_fn *invoke_psci_fn; --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -89,6 +89,22 @@
#include <linux/linkage.h> #include <linux/types.h> + +enum arm_smccc_conduit { + SMCCC_CONDUIT_NONE, + SMCCC_CONDUIT_SMC, + SMCCC_CONDUIT_HVC, +}; + +/** + * arm_smccc_1_1_get_conduit() + * + * Returns the conduit to be used for SMCCCv1.1 or later. + * + * When SMCCCv1.1 is not present, returns SMCCC_CONDUIT_NONE. + */ +enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void); + /** * struct arm_smccc_res - Result from SMC/HVC call * @a0-a3 result values from registers 0 to 3
From: "Russell King (Oracle)" rmk+kernel@armlinux.org.uk
commit 9dd78194a3722fa6712192cdd4f7032d45112a9a upstream.
As per other architectures, add support for reporting the Spectre vulnerability status via sysfs CPU.
Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk [ preserve res variable and add SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED - gregkh ] Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/spectre.h | 28 ++++++++ arch/arm/kernel/Makefile | 2 arch/arm/kernel/spectre.c | 54 ++++++++++++++++ arch/arm/mm/Kconfig | 1 arch/arm/mm/proc-v7-bugs.c | 132 +++++++++++++++++++++++++++++++---------- 5 files changed, 186 insertions(+), 31 deletions(-) create mode 100644 arch/arm/include/asm/spectre.h create mode 100644 arch/arm/kernel/spectre.c
--- /dev/null +++ b/arch/arm/include/asm/spectre.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef __ASM_SPECTRE_H +#define __ASM_SPECTRE_H + +enum { + SPECTRE_UNAFFECTED, + SPECTRE_MITIGATED, + SPECTRE_VULNERABLE, +}; + +enum { + __SPECTRE_V2_METHOD_BPIALL, + __SPECTRE_V2_METHOD_ICIALLU, + __SPECTRE_V2_METHOD_SMC, + __SPECTRE_V2_METHOD_HVC, +}; + +enum { + SPECTRE_V2_METHOD_BPIALL = BIT(__SPECTRE_V2_METHOD_BPIALL), + SPECTRE_V2_METHOD_ICIALLU = BIT(__SPECTRE_V2_METHOD_ICIALLU), + SPECTRE_V2_METHOD_SMC = BIT(__SPECTRE_V2_METHOD_SMC), + SPECTRE_V2_METHOD_HVC = BIT(__SPECTRE_V2_METHOD_HVC), +}; + +void spectre_v2_update_state(unsigned int state, unsigned int methods); + +#endif --- a/arch/arm/kernel/Makefile +++ b/arch/arm/kernel/Makefile @@ -106,4 +106,6 @@ endif
obj-$(CONFIG_HAVE_ARM_SMCCC) += smccc-call.o
+obj-$(CONFIG_GENERIC_CPU_VULNERABILITIES) += spectre.o + extra-y := $(head-y) vmlinux.lds --- /dev/null +++ b/arch/arm/kernel/spectre.c @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <linux/cpu.h> +#include <linux/device.h> + +#include <asm/spectre.h> + +ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, + char *buf) +{ + return sprintf(buf, "Mitigation: __user pointer sanitization\n"); +} + +static unsigned int spectre_v2_state; +static unsigned int spectre_v2_methods; + +void spectre_v2_update_state(unsigned int state, unsigned int method) +{ + if (state > spectre_v2_state) + spectre_v2_state = state; + spectre_v2_methods |= method; +} + +ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, + char *buf) +{ + const char *method; + + if (spectre_v2_state == SPECTRE_UNAFFECTED) + return sprintf(buf, "%s\n", "Not affected"); + + if (spectre_v2_state != SPECTRE_MITIGATED) + return sprintf(buf, "%s\n", "Vulnerable"); + + switch (spectre_v2_methods) { + case SPECTRE_V2_METHOD_BPIALL: + method = "Branch predictor hardening"; + break; + + case SPECTRE_V2_METHOD_ICIALLU: + method = "I-cache invalidation"; + break; + + case SPECTRE_V2_METHOD_SMC: + case SPECTRE_V2_METHOD_HVC: + method = "Firmware call"; + break; + + default: + method = "Multiple mitigations"; + break; + } + + return sprintf(buf, "Mitigation: %s\n", method); +} --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -823,6 +823,7 @@ config CPU_BPREDICT_DISABLE
config CPU_SPECTRE bool + select GENERIC_CPU_VULNERABILITIES
config HARDEN_BRANCH_PREDICTOR bool "Harden the branch predictor against aliasing attacks" if EXPERT --- a/arch/arm/mm/proc-v7-bugs.c +++ b/arch/arm/mm/proc-v7-bugs.c @@ -7,8 +7,36 @@ #include <asm/cp15.h> #include <asm/cputype.h> #include <asm/proc-fns.h> +#include <asm/spectre.h> #include <asm/system_misc.h>
+#ifdef CONFIG_ARM_PSCI +#define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1 +static int __maybe_unused spectre_v2_get_cpu_fw_mitigation_state(void) +{ + struct arm_smccc_res res; + + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_ARCH_WORKAROUND_1, &res); + + switch ((int)res.a0) { + case SMCCC_RET_SUCCESS: + return SPECTRE_MITIGATED; + + case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED: + return SPECTRE_UNAFFECTED; + + default: + return SPECTRE_VULNERABLE; + } +} +#else +static int __maybe_unused spectre_v2_get_cpu_fw_mitigation_state(void) +{ + return SPECTRE_VULNERABLE; +} +#endif + #ifdef CONFIG_HARDEN_BRANCH_PREDICTOR DEFINE_PER_CPU(harden_branch_predictor_fn_t, harden_branch_predictor_fn);
@@ -37,13 +65,60 @@ static void __maybe_unused call_hvc_arch arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL); }
-static void cpu_v7_spectre_init(void) +static unsigned int spectre_v2_install_workaround(unsigned int method) { const char *spectre_v2_method = NULL; int cpu = smp_processor_id();
if (per_cpu(harden_branch_predictor_fn, cpu)) - return; + return SPECTRE_MITIGATED; + + switch (method) { + case SPECTRE_V2_METHOD_BPIALL: + per_cpu(harden_branch_predictor_fn, cpu) = + harden_branch_predictor_bpiall; + spectre_v2_method = "BPIALL"; + break; + + case SPECTRE_V2_METHOD_ICIALLU: + per_cpu(harden_branch_predictor_fn, cpu) = + harden_branch_predictor_iciallu; + spectre_v2_method = "ICIALLU"; + break; + + case SPECTRE_V2_METHOD_HVC: + per_cpu(harden_branch_predictor_fn, cpu) = + call_hvc_arch_workaround_1; + cpu_do_switch_mm = cpu_v7_hvc_switch_mm; + spectre_v2_method = "hypervisor"; + break; + + case SPECTRE_V2_METHOD_SMC: + per_cpu(harden_branch_predictor_fn, cpu) = + call_smc_arch_workaround_1; + cpu_do_switch_mm = cpu_v7_smc_switch_mm; + spectre_v2_method = "firmware"; + break; + } + + if (spectre_v2_method) + pr_info("CPU%u: Spectre v2: using %s workaround\n", + smp_processor_id(), spectre_v2_method); + + return SPECTRE_MITIGATED; +} +#else +static unsigned int spectre_v2_install_workaround(unsigned int method) +{ + pr_info("CPU%u: Spectre V2: workarounds disabled by configuration\n"); + + return SPECTRE_VULNERABLE; +} +#endif + +static void cpu_v7_spectre_v2_init(void) +{ + unsigned int state, method = 0;
switch (read_cpuid_part()) { case ARM_CPU_PART_CORTEX_A8: @@ -52,32 +127,37 @@ static void cpu_v7_spectre_init(void) case ARM_CPU_PART_CORTEX_A17: case ARM_CPU_PART_CORTEX_A73: case ARM_CPU_PART_CORTEX_A75: - per_cpu(harden_branch_predictor_fn, cpu) = - harden_branch_predictor_bpiall; - spectre_v2_method = "BPIALL"; + state = SPECTRE_MITIGATED; + method = SPECTRE_V2_METHOD_BPIALL; break;
case ARM_CPU_PART_CORTEX_A15: case ARM_CPU_PART_BRAHMA_B15: - per_cpu(harden_branch_predictor_fn, cpu) = - harden_branch_predictor_iciallu; - spectre_v2_method = "ICIALLU"; + state = SPECTRE_MITIGATED; + method = SPECTRE_V2_METHOD_ICIALLU; break;
-#ifdef CONFIG_ARM_PSCI case ARM_CPU_PART_BRAHMA_B53: /* Requires no workaround */ + state = SPECTRE_UNAFFECTED; break; + default: /* Other ARM CPUs require no workaround */ - if (read_cpuid_implementor() == ARM_CPU_IMP_ARM) + if (read_cpuid_implementor() == ARM_CPU_IMP_ARM) { + state = SPECTRE_UNAFFECTED; break; + } /* fallthrough */ - /* Cortex A57/A72 require firmware workaround */ + /* Cortex A57/A72 require firmware workaround */ case ARM_CPU_PART_CORTEX_A57: case ARM_CPU_PART_CORTEX_A72: { struct arm_smccc_res res;
+ state = spectre_v2_get_cpu_fw_mitigation_state(); + if (state != SPECTRE_MITIGATED) + break; + if (psci_ops.smccc_version == SMCCC_VERSION_1_0) break;
@@ -87,10 +167,7 @@ static void cpu_v7_spectre_init(void) ARM_SMCCC_ARCH_WORKAROUND_1, &res); if ((int)res.a0 != 0) break; - per_cpu(harden_branch_predictor_fn, cpu) = - call_hvc_arch_workaround_1; - cpu_do_switch_mm = cpu_v7_hvc_switch_mm; - spectre_v2_method = "hypervisor"; + method = SPECTRE_V2_METHOD_HVC; break;
case PSCI_CONDUIT_SMC: @@ -98,28 +175,21 @@ static void cpu_v7_spectre_init(void) ARM_SMCCC_ARCH_WORKAROUND_1, &res); if ((int)res.a0 != 0) break; - per_cpu(harden_branch_predictor_fn, cpu) = - call_smc_arch_workaround_1; - cpu_do_switch_mm = cpu_v7_smc_switch_mm; - spectre_v2_method = "firmware"; + method = SPECTRE_V2_METHOD_SMC; break;
default: + state = SPECTRE_VULNERABLE; break; } } -#endif }
- if (spectre_v2_method) - pr_info("CPU%u: Spectre v2: using %s workaround\n", - smp_processor_id(), spectre_v2_method); -} -#else -static void cpu_v7_spectre_init(void) -{ + if (state == SPECTRE_MITIGATED) + state = spectre_v2_install_workaround(method); + + spectre_v2_update_state(state, method); } -#endif
static __maybe_unused bool cpu_v7_check_auxcr_set(bool *warned, u32 mask, const char *msg) @@ -149,16 +219,16 @@ static bool check_spectre_auxcr(bool *wa void cpu_v7_ca8_ibe(void) { if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(6))) - cpu_v7_spectre_init(); + cpu_v7_spectre_v2_init(); }
void cpu_v7_ca15_ibe(void) { if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(0))) - cpu_v7_spectre_init(); + cpu_v7_spectre_v2_init(); }
void cpu_v7_bugs_init(void) { - cpu_v7_spectre_init(); + cpu_v7_spectre_v2_init(); }
From: "Russell King (Oracle)" rmk+kernel@armlinux.org.uk
commit 04e91b7324760a377a725e218b5ee783826d30f5 upstream.
Provide a couple of helpers to copy the vectors and stubs, and also to flush the copied vectors and stubs.
Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kernel/traps.c | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-)
--- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -830,10 +830,22 @@ static inline void __init kuser_init(voi } #endif
+#ifndef CONFIG_CPU_V7M +static void copy_from_lma(void *vma, void *lma_start, void *lma_end) +{ + memcpy(vma, lma_start, lma_end - lma_start); +} + +static void flush_vectors(void *vma, size_t offset, size_t size) +{ + unsigned long start = (unsigned long)vma + offset; + unsigned long end = start + size; + + flush_icache_range(start, end); +} + void __init early_trap_init(void *vectors_base) { -#ifndef CONFIG_CPU_V7M - unsigned long vectors = (unsigned long)vectors_base; extern char __stubs_start[], __stubs_end[]; extern char __vectors_start[], __vectors_end[]; unsigned i; @@ -854,17 +866,20 @@ void __init early_trap_init(void *vector * into the vector page, mapped at 0xffff0000, and ensure these * are visible to the instruction stream. */ - memcpy((void *)vectors, __vectors_start, __vectors_end - __vectors_start); - memcpy((void *)vectors + 0x1000, __stubs_start, __stubs_end - __stubs_start); + copy_from_lma(vectors_base, __vectors_start, __vectors_end); + copy_from_lma(vectors_base + 0x1000, __stubs_start, __stubs_end);
kuser_init(vectors_base);
- flush_icache_range(vectors, vectors + PAGE_SIZE * 2); + flush_vectors(vectors_base, 0, PAGE_SIZE * 2); +} #else /* ifndef CONFIG_CPU_V7M */ +void __init early_trap_init(void *vectors_base) +{ /* * on V7-M there is no need to copy the vector table to a dedicated * memory area. The address is configurable and so a table in the kernel * image can be used. */ -#endif } +#endif
From: "Russell King (Oracle)" rmk+kernel@armlinux.org.uk
commit 8d9d651ff2270a632e9dc497b142db31e8911315 upstream.
Use the linker's LOADADDR() macro to get the load address of the sections, and provide a macro to set the start and end symbols.
Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kernel/vmlinux.lds.h | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)
--- a/arch/arm/kernel/vmlinux.lds.h +++ b/arch/arm/kernel/vmlinux.lds.h @@ -25,6 +25,11 @@ #define ARM_MMU_DISCARD(x) x #endif
+/* Set start/end symbol names to the LMA for the section */ +#define ARM_LMA(sym, section) \ + sym##_start = LOADADDR(section); \ + sym##_end = LOADADDR(section) + SIZEOF(section) + #define PROC_INFO \ . = ALIGN(4); \ __proc_info_begin = .; \ @@ -100,19 +105,19 @@ * only thing that matters is their relative offsets */ #define ARM_VECTORS \ - __vectors_start = .; \ + __vectors_lma = .; \ .vectors 0xffff0000 : AT(__vectors_start) { \ *(.vectors) \ } \ - . = __vectors_start + SIZEOF(.vectors); \ - __vectors_end = .; \ + ARM_LMA(__vectors, .vectors); \ + . = __vectors_lma + SIZEOF(.vectors); \ \ - __stubs_start = .; \ - .stubs ADDR(.vectors) + 0x1000 : AT(__stubs_start) { \ + __stubs_lma = .; \ + .stubs ADDR(.vectors) + 0x1000 : AT(__stubs_lma) { \ *(.stubs) \ } \ - . = __stubs_start + SIZEOF(.stubs); \ - __stubs_end = .; \ + ARM_LMA(__stubs, .stubs); \ + . = __stubs_lma + SIZEOF(.stubs); \ \ PROVIDE(vector_fiq_offset = vector_fiq - ADDR(.vectors));
From: "Russell King (Oracle)" rmk+kernel@armlinux.org.uk
commit b9baf5c8c5c356757f4f9d8180b5e9d234065bc3 upstream.
Workaround the Spectre BHB issues for Cortex-A15, Cortex-A57, Cortex-A72, Cortex-A73 and Cortex-A75. We also include Brahma B15 as well to be safe, which is affected by Spectre V2 in the same ways as Cortex-A15.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk [changes due to lack of SYSTEM_FREEING_INITMEM - gregkh] Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/assembler.h | 10 ++++ arch/arm/include/asm/spectre.h | 4 + arch/arm/kernel/entry-armv.S | 79 ++++++++++++++++++++++++++++++++++++--- arch/arm/kernel/entry-common.S | 24 +++++++++++ arch/arm/kernel/spectre.c | 4 + arch/arm/kernel/traps.c | 38 ++++++++++++++++++ arch/arm/kernel/vmlinux.lds.h | 18 +++++++- arch/arm/mm/Kconfig | 10 ++++ arch/arm/mm/proc-v7-bugs.c | 76 +++++++++++++++++++++++++++++++++++++ 9 files changed, 254 insertions(+), 9 deletions(-)
--- a/arch/arm/include/asm/assembler.h +++ b/arch/arm/include/asm/assembler.h @@ -110,6 +110,16 @@ .endm #endif
+#if __LINUX_ARM_ARCH__ < 7 + .macro dsb, args + mcr p15, 0, r0, c7, c10, 4 + .endm + + .macro isb, args + mcr p15, 0, r0, c7, r5, 4 + .endm +#endif + .macro asm_trace_hardirqs_off, save=1 #if defined(CONFIG_TRACE_IRQFLAGS) .if \save --- a/arch/arm/include/asm/spectre.h +++ b/arch/arm/include/asm/spectre.h @@ -14,6 +14,7 @@ enum { __SPECTRE_V2_METHOD_ICIALLU, __SPECTRE_V2_METHOD_SMC, __SPECTRE_V2_METHOD_HVC, + __SPECTRE_V2_METHOD_LOOP8, };
enum { @@ -21,8 +22,11 @@ enum { SPECTRE_V2_METHOD_ICIALLU = BIT(__SPECTRE_V2_METHOD_ICIALLU), SPECTRE_V2_METHOD_SMC = BIT(__SPECTRE_V2_METHOD_SMC), SPECTRE_V2_METHOD_HVC = BIT(__SPECTRE_V2_METHOD_HVC), + SPECTRE_V2_METHOD_LOOP8 = BIT(__SPECTRE_V2_METHOD_LOOP8), };
void spectre_v2_update_state(unsigned int state, unsigned int methods);
+int spectre_bhb_update_vectors(unsigned int method); + #endif --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -1029,12 +1029,11 @@ vector_\name: sub lr, lr, #\correction .endif
- @ - @ Save r0, lr_<exception> (parent PC) and spsr_<exception> - @ (parent CPSR) - @ + @ Save r0, lr_<exception> (parent PC) stmia sp, {r0, lr} @ save r0, lr - mrs lr, spsr + + @ Save spsr_<exception> (parent CPSR) +2: mrs lr, spsr str lr, [sp, #8] @ save spsr
@ @@ -1055,6 +1054,44 @@ vector_\name: movs pc, lr @ branch to handler in SVC mode ENDPROC(vector_\name)
+#ifdef CONFIG_HARDEN_BRANCH_HISTORY + .subsection 1 + .align 5 +vector_bhb_loop8_\name: + .if \correction + sub lr, lr, #\correction + .endif + + @ Save r0, lr_<exception> (parent PC) + stmia sp, {r0, lr} + + @ bhb workaround + mov r0, #8 +1: b . + 4 + subs r0, r0, #1 + bne 1b + dsb + isb + b 2b +ENDPROC(vector_bhb_loop8_\name) + +vector_bhb_bpiall_\name: + .if \correction + sub lr, lr, #\correction + .endif + + @ Save r0, lr_<exception> (parent PC) + stmia sp, {r0, lr} + + @ bhb workaround + mcr p15, 0, r0, c7, c5, 6 @ BPIALL + @ isb not needed due to "movs pc, lr" in the vector stub + @ which gives a "context synchronisation". + b 2b +ENDPROC(vector_bhb_bpiall_\name) + .previous +#endif + .align 2 @ handler addresses follow this label 1: @@ -1063,6 +1100,10 @@ ENDPROC(vector_\name) .section .stubs, "ax", %progbits @ This must be the first word .word vector_swi +#ifdef CONFIG_HARDEN_BRANCH_HISTORY + .word vector_bhb_loop8_swi + .word vector_bhb_bpiall_swi +#endif
vector_rst: ARM( swi SYS_ERROR0 ) @@ -1177,8 +1218,10 @@ vector_addrexcptn: * FIQ "NMI" handler *----------------------------------------------------------------------------- * Handle a FIQ using the SVC stack allowing FIQ act like NMI on x86 - * systems. + * systems. This must be the last vector stub, so lets place it in its own + * subsection. */ + .subsection 2 vector_stub fiq, FIQ_MODE, 4
.long __fiq_usr @ 0 (USR_26 / USR_32) @@ -1211,6 +1254,30 @@ vector_addrexcptn: W(b) vector_irq W(b) vector_fiq
+#ifdef CONFIG_HARDEN_BRANCH_HISTORY + .section .vectors.bhb.loop8, "ax", %progbits +.L__vectors_bhb_loop8_start: + W(b) vector_rst + W(b) vector_bhb_loop8_und + W(ldr) pc, .L__vectors_bhb_loop8_start + 0x1004 + W(b) vector_bhb_loop8_pabt + W(b) vector_bhb_loop8_dabt + W(b) vector_addrexcptn + W(b) vector_bhb_loop8_irq + W(b) vector_bhb_loop8_fiq + + .section .vectors.bhb.bpiall, "ax", %progbits +.L__vectors_bhb_bpiall_start: + W(b) vector_rst + W(b) vector_bhb_bpiall_und + W(ldr) pc, .L__vectors_bhb_bpiall_start + 0x1008 + W(b) vector_bhb_bpiall_pabt + W(b) vector_bhb_bpiall_dabt + W(b) vector_addrexcptn + W(b) vector_bhb_bpiall_irq + W(b) vector_bhb_bpiall_fiq +#endif + .data .align 2
--- a/arch/arm/kernel/entry-common.S +++ b/arch/arm/kernel/entry-common.S @@ -166,12 +166,36 @@ ENDPROC(ret_from_fork) */
.align 5 +#ifdef CONFIG_HARDEN_BRANCH_HISTORY +ENTRY(vector_bhb_loop8_swi) + sub sp, sp, #PT_REGS_SIZE + stmia sp, {r0 - r12} + mov r8, #8 +1: b 2f +2: subs r8, r8, #1 + bne 1b + dsb + isb + b 3f +ENDPROC(vector_bhb_loop8_swi) + + .align 5 +ENTRY(vector_bhb_bpiall_swi) + sub sp, sp, #PT_REGS_SIZE + stmia sp, {r0 - r12} + mcr p15, 0, r8, c7, c5, 6 @ BPIALL + isb + b 3f +ENDPROC(vector_bhb_bpiall_swi) +#endif + .align 5 ENTRY(vector_swi) #ifdef CONFIG_CPU_V7M v7m_exception_entry #else sub sp, sp, #PT_REGS_SIZE stmia sp, {r0 - r12} @ Calling r0 - r12 +3: ARM( add r8, sp, #S_PC ) ARM( stmdb r8, {sp, lr}^ ) @ Calling sp, lr THUMB( mov r8, sp ) --- a/arch/arm/kernel/spectre.c +++ b/arch/arm/kernel/spectre.c @@ -45,6 +45,10 @@ ssize_t cpu_show_spectre_v2(struct devic method = "Firmware call"; break;
+ case SPECTRE_V2_METHOD_LOOP8: + method = "History overwrite"; + break; + default: method = "Multiple mitigations"; break; --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -33,6 +33,7 @@ #include <linux/atomic.h> #include <asm/cacheflush.h> #include <asm/exception.h> +#include <asm/spectre.h> #include <asm/unistd.h> #include <asm/traps.h> #include <asm/ptrace.h> @@ -844,6 +845,43 @@ static void flush_vectors(void *vma, siz flush_icache_range(start, end); }
+#ifdef CONFIG_HARDEN_BRANCH_HISTORY +int spectre_bhb_update_vectors(unsigned int method) +{ + extern char __vectors_bhb_bpiall_start[], __vectors_bhb_bpiall_end[]; + extern char __vectors_bhb_loop8_start[], __vectors_bhb_loop8_end[]; + void *vec_start, *vec_end; + + if (system_state > SYSTEM_SCHEDULING) { + pr_err("CPU%u: Spectre BHB workaround too late - system vulnerable\n", + smp_processor_id()); + return SPECTRE_VULNERABLE; + } + + switch (method) { + case SPECTRE_V2_METHOD_LOOP8: + vec_start = __vectors_bhb_loop8_start; + vec_end = __vectors_bhb_loop8_end; + break; + + case SPECTRE_V2_METHOD_BPIALL: + vec_start = __vectors_bhb_bpiall_start; + vec_end = __vectors_bhb_bpiall_end; + break; + + default: + pr_err("CPU%u: unknown Spectre BHB state %d\n", + smp_processor_id(), method); + return SPECTRE_VULNERABLE; + } + + copy_from_lma(vectors_page, vec_start, vec_end); + flush_vectors(vectors_page, 0, vec_end - vec_start); + + return SPECTRE_MITIGATED; +} +#endif + void __init early_trap_init(void *vectors_base) { extern char __stubs_start[], __stubs_end[]; --- a/arch/arm/kernel/vmlinux.lds.h +++ b/arch/arm/kernel/vmlinux.lds.h @@ -106,11 +106,23 @@ */ #define ARM_VECTORS \ __vectors_lma = .; \ - .vectors 0xffff0000 : AT(__vectors_start) { \ - *(.vectors) \ + OVERLAY 0xffff0000 : NOCROSSREFS AT(__vectors_lma) { \ + .vectors { \ + *(.vectors) \ + } \ + .vectors.bhb.loop8 { \ + *(.vectors.bhb.loop8) \ + } \ + .vectors.bhb.bpiall { \ + *(.vectors.bhb.bpiall) \ + } \ } \ ARM_LMA(__vectors, .vectors); \ - . = __vectors_lma + SIZEOF(.vectors); \ + ARM_LMA(__vectors_bhb_loop8, .vectors.bhb.loop8); \ + ARM_LMA(__vectors_bhb_bpiall, .vectors.bhb.bpiall); \ + . = __vectors_lma + SIZEOF(.vectors) + \ + SIZEOF(.vectors.bhb.loop8) + \ + SIZEOF(.vectors.bhb.bpiall); \ \ __stubs_lma = .; \ .stubs ADDR(.vectors) + 0x1000 : AT(__stubs_lma) { \ --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -844,6 +844,16 @@ config HARDEN_BRANCH_PREDICTOR
If unsure, say Y.
+config HARDEN_BRANCH_HISTORY + bool "Harden Spectre style attacks against branch history" if EXPERT + depends on CPU_SPECTRE + default y + help + Speculation attacks against some high-performance processors can + make use of branch history to influence future speculation. When + taking an exception, a sequence of branches overwrites the branch + history, or branch history is invalidated. + config TLS_REG_EMUL bool select NEED_KUSER_HELPERS --- a/arch/arm/mm/proc-v7-bugs.c +++ b/arch/arm/mm/proc-v7-bugs.c @@ -191,6 +191,81 @@ static void cpu_v7_spectre_v2_init(void) spectre_v2_update_state(state, method); }
+#ifdef CONFIG_HARDEN_BRANCH_HISTORY +static int spectre_bhb_method; + +static const char *spectre_bhb_method_name(int method) +{ + switch (method) { + case SPECTRE_V2_METHOD_LOOP8: + return "loop"; + + case SPECTRE_V2_METHOD_BPIALL: + return "BPIALL"; + + default: + return "unknown"; + } +} + +static int spectre_bhb_install_workaround(int method) +{ + if (spectre_bhb_method != method) { + if (spectre_bhb_method) { + pr_err("CPU%u: Spectre BHB: method disagreement, system vulnerable\n", + smp_processor_id()); + + return SPECTRE_VULNERABLE; + } + + if (spectre_bhb_update_vectors(method) == SPECTRE_VULNERABLE) + return SPECTRE_VULNERABLE; + + spectre_bhb_method = method; + } + + pr_info("CPU%u: Spectre BHB: using %s workaround\n", + smp_processor_id(), spectre_bhb_method_name(method)); + + return SPECTRE_MITIGATED; +} +#else +static int spectre_bhb_install_workaround(int method) +{ + return SPECTRE_VULNERABLE; +} +#endif + +static void cpu_v7_spectre_bhb_init(void) +{ + unsigned int state, method = 0; + + switch (read_cpuid_part()) { + case ARM_CPU_PART_CORTEX_A15: + case ARM_CPU_PART_BRAHMA_B15: + case ARM_CPU_PART_CORTEX_A57: + case ARM_CPU_PART_CORTEX_A72: + state = SPECTRE_MITIGATED; + method = SPECTRE_V2_METHOD_LOOP8; + break; + + case ARM_CPU_PART_CORTEX_A73: + case ARM_CPU_PART_CORTEX_A75: + state = SPECTRE_MITIGATED; + method = SPECTRE_V2_METHOD_BPIALL; + break; + + default: + state = SPECTRE_UNAFFECTED; + break; + } + + if (state == SPECTRE_MITIGATED) + state = spectre_bhb_install_workaround(method); + + spectre_v2_update_state(state, method); +} + static __maybe_unused bool cpu_v7_check_auxcr_set(bool *warned, u32 mask, const char *msg) { @@ -231,4 +306,5 @@ void cpu_v7_ca15_ibe(void) void cpu_v7_bugs_init(void) { cpu_v7_spectre_v2_init(); + cpu_v7_spectre_bhb_init(); }
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
commit 25875aa71dfefd1959f07e626c4d285b88b27ac2 upstream.
The mitigations for Spectre-BHB are only applied when an exception is taken, but when unprivileged BPF is enabled, userspace can load BPF programs that can be used to exploit the problem.
When unprivileged BPF is enabled, report the vulnerable status via the spectre_v2 sysfs file.
Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kernel/spectre.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/arch/arm/kernel/spectre.c +++ b/arch/arm/kernel/spectre.c @@ -1,9 +1,19 @@ // SPDX-License-Identifier: GPL-2.0-only +#include <linux/bpf.h> #include <linux/cpu.h> #include <linux/device.h>
#include <asm/spectre.h>
+static bool _unprivileged_ebpf_enabled(void) +{ +#ifdef CONFIG_BPF_SYSCALL + return !sysctl_unprivileged_bpf_disabled; +#else + return false +#endif +} + ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf) { @@ -31,6 +41,9 @@ ssize_t cpu_show_spectre_v2(struct devic if (spectre_v2_state != SPECTRE_MITIGATED) return sprintf(buf, "%s\n", "Vulnerable");
+ if (_unprivileged_ebpf_enabled()) + return sprintf(buf, "Vulnerable: Unprivileged eBPF enabled\n"); + switch (spectre_v2_methods) { case SPECTRE_V2_METHOD_BPIALL: method = "Branch predictor hardening";
From: Emmanuel Gil Peyrot linkmauve@linkmauve.fr
commit 330f4c53d3c2d8b11d86ec03a964b86dc81452f5 upstream.
It was missing a semicolon.
Signed-off-by: Emmanuel Gil Peyrot linkmauve@linkmauve.fr Reviewed-by: Nathan Chancellor nathan@kernel.org Fixes: 25875aa71dfe ("ARM: include unprivileged BPF status in Spectre V2 reporting"). Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kernel/spectre.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/kernel/spectre.c +++ b/arch/arm/kernel/spectre.c @@ -10,7 +10,7 @@ static bool _unprivileged_ebpf_enabled(v #ifdef CONFIG_BPF_SYSCALL return !sysctl_unprivileged_bpf_disabled; #else - return false + return false; #endif }
From: Sami Tolvanen samitolvanen@google.com
commit b744b43f79cc758127042e71f9ad7b1afda30f84 upstream.
Similarly to the CC_IS_CLANG config, add LD_IS_LLD to avoid GNU ld specific logic such as ld-version or ld-ifversion and gain the ability to select potential features that depend on the linker at configuration time such as LTO.
Signed-off-by: Sami Tolvanen samitolvanen@google.com Acked-by: Masahiro Yamada masahiroy@kernel.org [nc: Reword commit message] Signed-off-by: Nathan Chancellor natechancellor@gmail.com Tested-by: Sedat Dilek sedat.dilek@gmail.com Reviewed-by: Sedat Dilek sedat.dilek@gmail.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- init/Kconfig | 3 +++ 1 file changed, 3 insertions(+)
--- a/init/Kconfig +++ b/init/Kconfig @@ -19,6 +19,9 @@ config GCC_VERSION config CC_IS_CLANG def_bool $(success,$(CC) --version | head -n 1 | grep -q clang)
+config LD_IS_LLD + def_bool $(success,$(LD) -v | head -n 1 | grep -q LLD) + config CLANG_VERSION int default $(shell,$(srctree)/scripts/clang-version.sh $(CC))
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
commit 33970b031dc4653cc9dc80f2886976706c4c8ef1 upstream.
In the recent Spectre BHB patches, there was a typo that is only exposed in certain configurations: mcr p15,0,XX,c7,r5,4 should have been mcr p15,0,XX,c7,c5,4
Reported-by: kernel test robot lkp@intel.com Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/assembler.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/include/asm/assembler.h +++ b/arch/arm/include/asm/assembler.h @@ -116,7 +116,7 @@ .endm
.macro isb, args - mcr p15, 0, r0, c7, r5, 4 + mcr p15, 0, r0, c7, c5, 4 .endm #endif
From: Nathan Chancellor nathan@kernel.org
commit 36168e387fa7d0f1fe0cd5cf76c8cea7aee714fa upstream.
ld.lld does not support the NOCROSSREFS directive at the moment, which breaks the build after commit b9baf5c8c5c3 ("ARM: Spectre-BHB workaround"):
ld.lld: error: ./arch/arm/kernel/vmlinux.lds:34: AT expected, but got NOCROSSREFS
Support for this directive will eventually be implemented, at which point a version check can be added. To avoid breaking the build in the meantime, just define NOCROSSREFS to nothing when using ld.lld, with a link to the issue for tracking.
Cc: stable@vger.kernel.org Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Link: https://github.com/ClangBuiltLinux/linux/issues/1609 Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kernel/vmlinux.lds.h | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/arch/arm/kernel/vmlinux.lds.h +++ b/arch/arm/kernel/vmlinux.lds.h @@ -25,6 +25,14 @@ #define ARM_MMU_DISCARD(x) x #endif
+/* + * ld.lld does not support NOCROSSREFS: + * https://github.com/ClangBuiltLinux/linux/issues/1609 + */ +#ifdef CONFIG_LD_IS_LLD +#define NOCROSSREFS +#endif + /* Set start/end symbol names to the LMA for the section */ #define ARM_LMA(sym, section) \ sym##_start = LOADADDR(section); \
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
commit b1a384d2cbccb1eb3f84765020d25e2c1929706e upstream.
The kernel test robot discovered that building without HARDEN_BRANCH_PREDICTOR issues a warning due to a missing argument to pr_info().
Add the missing argument.
Reported-by: kernel test robot lkp@intel.com Fixes: 9dd78194a372 ("ARM: report Spectre v2 status through sysfs") Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mm/proc-v7-bugs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/arm/mm/proc-v7-bugs.c +++ b/arch/arm/mm/proc-v7-bugs.c @@ -110,7 +110,8 @@ static unsigned int spectre_v2_install_w #else static unsigned int spectre_v2_install_workaround(unsigned int method) { - pr_info("CPU%u: Spectre V2: workarounds disabled by configuration\n"); + pr_info("CPU%u: Spectre V2: workarounds disabled by configuration\n", + smp_processor_id());
return SPECTRE_VULNERABLE; }
From: Juergen Gross jgross@suse.com
Commit 3777ea7bac3113005b7180e6b9dadf16d19a5827 upstream.
Letting xenbus_grant_ring() tear down grants in the error case is problematic, as the other side could already have used these grants. Calling gnttab_end_foreign_access_ref() without checking success is resulting in an unclear situation for any caller of xenbus_grant_ring() as in the error case the memory pages of the ring page might be partially mapped. Freeing them would risk unwanted foreign access to them, while not freeing them would leak memory.
In order to remove the need to undo any gnttab_grant_foreign_access() calls, use gnttab_alloc_grant_references() to make sure no further error can occur in the loop granting access to the ring pages.
It should be noted that this way of handling removes leaking of grant entries in the error case, too.
This is CVE-2022-23040 / part of XSA-396.
Reported-by: Demi Marie Obenour demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/xenbus/xenbus_client.c | 24 +++++++++++------------- 1 file changed, 11 insertions(+), 13 deletions(-)
--- a/drivers/xen/xenbus/xenbus_client.c +++ b/drivers/xen/xenbus/xenbus_client.c @@ -368,7 +368,14 @@ int xenbus_grant_ring(struct xenbus_devi unsigned int nr_pages, grant_ref_t *grefs) { int err; - int i, j; + unsigned int i; + grant_ref_t gref_head; + + err = gnttab_alloc_grant_references(nr_pages, &gref_head); + if (err) { + xenbus_dev_fatal(dev, err, "granting access to ring page"); + return err; + }
for (i = 0; i < nr_pages; i++) { unsigned long gfn; @@ -378,23 +385,14 @@ int xenbus_grant_ring(struct xenbus_devi else gfn = virt_to_gfn(vaddr);
- err = gnttab_grant_foreign_access(dev->otherend_id, gfn, 0); - if (err < 0) { - xenbus_dev_fatal(dev, err, - "granting access to ring page"); - goto fail; - } - grefs[i] = err; + grefs[i] = gnttab_claim_grant_reference(&gref_head); + gnttab_grant_foreign_access_ref(grefs[i], dev->otherend_id, + gfn, 0);
vaddr = vaddr + XEN_PAGE_SIZE; }
return 0; - -fail: - for (j = 0; j < i; j++) - gnttab_end_foreign_access_ref(grefs[j], 0); - return err; } EXPORT_SYMBOL_GPL(xenbus_grant_ring);
From: Juergen Gross jgross@suse.com
Commit 6b1775f26a2da2b05a6dc8ec2b5d14e9a4701a1a upstream.
Add a new grant table function gnttab_try_end_foreign_access(), which will remove and free a grant if it is not in use.
Its main use case is to either free a grant if it is no longer in use, or to take some other action if it is still in use. This other action can be an error exit, or (e.g. in the case of blkfront persistent grant feature) some special handling.
This is CVE-2022-23036, CVE-2022-23038 / part of XSA-396.
Reported-by: Demi Marie Obenour demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/grant-table.c | 14 ++++++++++++-- include/xen/grant_table.h | 12 ++++++++++++ 2 files changed, 24 insertions(+), 2 deletions(-)
--- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -436,11 +436,21 @@ static void gnttab_add_deferred(grant_re what, ref, page ? page_to_pfn(page) : -1); }
+int gnttab_try_end_foreign_access(grant_ref_t ref) +{ + int ret = _gnttab_end_foreign_access_ref(ref, 0); + + if (ret) + put_free_entry(ref); + + return ret; +} +EXPORT_SYMBOL_GPL(gnttab_try_end_foreign_access); + void gnttab_end_foreign_access(grant_ref_t ref, int readonly, unsigned long page) { - if (gnttab_end_foreign_access_ref(ref, readonly)) { - put_free_entry(ref); + if (gnttab_try_end_foreign_access(ref)) { if (page != 0) put_page(virt_to_page(page)); } else --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -97,10 +97,22 @@ int gnttab_end_foreign_access_ref(grant_ * access has been ended, free the given page too. Access will be ended * immediately iff the grant entry is not in use, otherwise it will happen * some time later. page may be 0, in which case no freeing will occur. + * Note that the granted page might still be accessed (read or write) by the + * other side after gnttab_end_foreign_access() returns, so even if page was + * specified as 0 it is not allowed to just reuse the page for other + * purposes immediately. */ void gnttab_end_foreign_access(grant_ref_t ref, int readonly, unsigned long page);
+/* + * End access through the given grant reference, iff the grant entry is + * no longer in use. In case of success ending foreign access, the + * grant reference is deallocated. + * Return 1 if the grant entry was freed, 0 if it is still in use. + */ +int gnttab_try_end_foreign_access(grant_ref_t ref); + int gnttab_grant_foreign_transfer(domid_t domid, unsigned long pfn);
unsigned long gnttab_end_foreign_transfer_ref(grant_ref_t ref);
From: Juergen Gross jgross@suse.com
Commit abf1fd5919d6238ee3bc5eb4a9b6c3947caa6638 upstream.
It isn't enough to check whether a grant is still being in use by calling gnttab_query_foreign_access(), as a mapping could be realized by the other side just after having called that function.
In case the call was done in preparation of revoking a grant it is better to do so via gnttab_end_foreign_access_ref() and check the success of that operation instead.
For the ring allocation use alloc_pages_exact() in order to avoid high order pages in case of a multi-page ring.
If a grant wasn't unmapped by the backend without persistent grants being used, set the device state to "error".
This is CVE-2022-23036 / part of XSA-396.
Reported-by: Demi Marie Obenour demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Roger Pau Monné roger.pau@citrix.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/xen-blkfront.c | 63 +++++++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 26 deletions(-)
--- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -1344,7 +1344,8 @@ free_shadow: rinfo->ring_ref[i] = GRANT_INVALID_REF; } } - free_pages((unsigned long)rinfo->ring.sring, get_order(info->nr_ring_pages * XEN_PAGE_SIZE)); + free_pages_exact(rinfo->ring.sring, + info->nr_ring_pages * XEN_PAGE_SIZE); rinfo->ring.sring = NULL;
if (rinfo->irq) @@ -1428,9 +1429,15 @@ static int blkif_get_final_status(enum b return BLKIF_RSP_OKAY; }
-static bool blkif_completion(unsigned long *id, - struct blkfront_ring_info *rinfo, - struct blkif_response *bret) +/* + * Return values: + * 1 response processed. + * 0 missing further responses. + * -1 error while processing. + */ +static int blkif_completion(unsigned long *id, + struct blkfront_ring_info *rinfo, + struct blkif_response *bret) { int i = 0; struct scatterlist *sg; @@ -1453,7 +1460,7 @@ static bool blkif_completion(unsigned lo
/* Wait the second response if not yet here. */ if (s2->status < REQ_DONE) - return false; + return 0;
bret->status = blkif_get_final_status(s->status, s2->status); @@ -1504,42 +1511,43 @@ static bool blkif_completion(unsigned lo } /* Add the persistent grant into the list of free grants */ for (i = 0; i < num_grant; i++) { - if (gnttab_query_foreign_access(s->grants_used[i]->gref)) { + if (!gnttab_try_end_foreign_access(s->grants_used[i]->gref)) { /* * If the grant is still mapped by the backend (the * backend has chosen to make this grant persistent) * we add it at the head of the list, so it will be * reused first. */ - if (!info->feature_persistent) - pr_alert_ratelimited("backed has not unmapped grant: %u\n", - s->grants_used[i]->gref); + if (!info->feature_persistent) { + pr_alert("backed has not unmapped grant: %u\n", + s->grants_used[i]->gref); + return -1; + } list_add(&s->grants_used[i]->node, &rinfo->grants); rinfo->persistent_gnts_c++; } else { /* - * If the grant is not mapped by the backend we end the - * foreign access and add it to the tail of the list, - * so it will not be picked again unless we run out of - * persistent grants. + * If the grant is not mapped by the backend we add it + * to the tail of the list, so it will not be picked + * again unless we run out of persistent grants. */ - gnttab_end_foreign_access(s->grants_used[i]->gref, 0, 0UL); s->grants_used[i]->gref = GRANT_INVALID_REF; list_add_tail(&s->grants_used[i]->node, &rinfo->grants); } } if (s->req.operation == BLKIF_OP_INDIRECT) { for (i = 0; i < INDIRECT_GREFS(num_grant); i++) { - if (gnttab_query_foreign_access(s->indirect_grants[i]->gref)) { - if (!info->feature_persistent) - pr_alert_ratelimited("backed has not unmapped grant: %u\n", - s->indirect_grants[i]->gref); + if (!gnttab_try_end_foreign_access(s->indirect_grants[i]->gref)) { + if (!info->feature_persistent) { + pr_alert("backed has not unmapped grant: %u\n", + s->indirect_grants[i]->gref); + return -1; + } list_add(&s->indirect_grants[i]->node, &rinfo->grants); rinfo->persistent_gnts_c++; } else { struct page *indirect_page;
- gnttab_end_foreign_access(s->indirect_grants[i]->gref, 0, 0UL); /* * Add the used indirect page back to the list of * available pages for indirect grefs. @@ -1554,7 +1562,7 @@ static bool blkif_completion(unsigned lo } }
- return true; + return 1; }
static irqreturn_t blkif_interrupt(int irq, void *dev_id) @@ -1620,12 +1628,17 @@ static irqreturn_t blkif_interrupt(int i }
if (bret.operation != BLKIF_OP_DISCARD) { + int ret; + /* * We may need to wait for an extra response if the * I/O request is split in 2 */ - if (!blkif_completion(&id, rinfo, &bret)) + ret = blkif_completion(&id, rinfo, &bret); + if (!ret) continue; + if (unlikely(ret < 0)) + goto err; }
if (add_id_to_freelist(rinfo, id)) { @@ -1731,8 +1744,7 @@ static int setup_blkring(struct xenbus_d for (i = 0; i < info->nr_ring_pages; i++) rinfo->ring_ref[i] = GRANT_INVALID_REF;
- sring = (struct blkif_sring *)__get_free_pages(GFP_NOIO | __GFP_HIGH, - get_order(ring_size)); + sring = alloc_pages_exact(ring_size, GFP_NOIO); if (!sring) { xenbus_dev_fatal(dev, -ENOMEM, "allocating shared ring"); return -ENOMEM; @@ -1742,7 +1754,7 @@ static int setup_blkring(struct xenbus_d
err = xenbus_grant_ring(dev, rinfo->ring.sring, info->nr_ring_pages, gref); if (err < 0) { - free_pages((unsigned long)sring, get_order(ring_size)); + free_pages_exact(sring, ring_size); rinfo->ring.sring = NULL; goto fail; } @@ -2720,11 +2732,10 @@ static void purge_persistent_grants(stru list_for_each_entry_safe(gnt_list_entry, tmp, &rinfo->grants, node) { if (gnt_list_entry->gref == GRANT_INVALID_REF || - gnttab_query_foreign_access(gnt_list_entry->gref)) + !gnttab_try_end_foreign_access(gnt_list_entry->gref)) continue;
list_del(&gnt_list_entry->node); - gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL); rinfo->persistent_gnts_c--; gnt_list_entry->gref = GRANT_INVALID_REF; list_add_tail(&gnt_list_entry->node, &rinfo->grants);
From: Juergen Gross jgross@suse.com
Commit 31185df7e2b1d2fa1de4900247a12d7b9c7087eb upstream.
It isn't enough to check whether a grant is still being in use by calling gnttab_query_foreign_access(), as a mapping could be realized by the other side just after having called that function.
In case the call was done in preparation of revoking a grant it is better to do so via gnttab_end_foreign_access_ref() and check the success of that operation instead.
This is CVE-2022-23037 / part of XSA-396.
Reported-by: Demi Marie Obenour demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/xen-netfront.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -414,14 +414,12 @@ static bool xennet_tx_buf_gc(struct netf queue->tx_link[id] = TX_LINK_NONE; skb = queue->tx_skbs[id]; queue->tx_skbs[id] = NULL; - if (unlikely(gnttab_query_foreign_access( - queue->grant_tx_ref[id]) != 0)) { + if (unlikely(!gnttab_end_foreign_access_ref( + queue->grant_tx_ref[id], GNTMAP_readonly))) { dev_alert(dev, "Grant still in use by backend domain\n"); goto err; } - gnttab_end_foreign_access_ref( - queue->grant_tx_ref[id], GNTMAP_readonly); gnttab_release_grant_reference( &queue->gref_tx_head, queue->grant_tx_ref[id]); queue->grant_tx_ref[id] = GRANT_INVALID_REF;
From: Juergen Gross jgross@suse.com
Commit 33172ab50a53578a95691310f49567c9266968b0 upstream.
It isn't enough to check whether a grant is still being in use by calling gnttab_query_foreign_access(), as a mapping could be realized by the other side just after having called that function.
In case the call was done in preparation of revoking a grant it is better to do so via gnttab_try_end_foreign_access() and check the success of that operation instead.
This is CVE-2022-23038 / part of XSA-396.
Reported-by: Demi Marie Obenour demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/xen-scsifront.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/scsi/xen-scsifront.c +++ b/drivers/scsi/xen-scsifront.c @@ -233,12 +233,11 @@ static void scsifront_gnttab_done(struct return;
for (i = 0; i < shadow->nr_grants; i++) { - if (unlikely(gnttab_query_foreign_access(shadow->gref[i]))) { + if (unlikely(!gnttab_try_end_foreign_access(shadow->gref[i]))) { shost_printk(KERN_ALERT, info->host, KBUILD_MODNAME "grant still in use by backend\n"); BUG(); } - gnttab_end_foreign_access(shadow->gref[i], 0, 0UL); }
kfree(shadow->sg);
From: Juergen Gross jgross@suse.com
Commit d3b6372c5881cb54925212abb62c521df8ba4809 upstream.
Using gnttab_query_foreign_access() is unsafe, as it is racy by design.
The use case in the gntalloc driver is not needed at all. While at it replace the call of gnttab_end_foreign_access_ref() with a call of gnttab_end_foreign_access(), which is what is really wanted there. In case the grant wasn't used due to an allocation failure, just free the grant via gnttab_free_grant_reference().
This is CVE-2022-23039 / part of XSA-396.
Reported-by: Demi Marie Obenour demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/gntalloc.c | 25 +++++++------------------ 1 file changed, 7 insertions(+), 18 deletions(-)
--- a/drivers/xen/gntalloc.c +++ b/drivers/xen/gntalloc.c @@ -169,20 +169,14 @@ undo: __del_gref(gref); }
- /* It's possible for the target domain to map the just-allocated grant - * references by blindly guessing their IDs; if this is done, then - * __del_gref will leave them in the queue_gref list. They need to be - * added to the global list so that we can free them when they are no - * longer referenced. - */ - if (unlikely(!list_empty(&queue_gref))) - list_splice_tail(&queue_gref, &gref_list); mutex_unlock(&gref_mutex); return rc; }
static void __del_gref(struct gntalloc_gref *gref) { + unsigned long addr; + if (gref->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) { uint8_t *tmp = kmap(gref->page); tmp[gref->notify.pgoff] = 0; @@ -196,21 +190,16 @@ static void __del_gref(struct gntalloc_g gref->notify.flags = 0;
if (gref->gref_id) { - if (gnttab_query_foreign_access(gref->gref_id)) - return; - - if (!gnttab_end_foreign_access_ref(gref->gref_id, 0)) - return; - - gnttab_free_grant_reference(gref->gref_id); + if (gref->page) { + addr = (unsigned long)page_to_virt(gref->page); + gnttab_end_foreign_access(gref->gref_id, 0, addr); + } else + gnttab_free_grant_reference(gref->gref_id); }
gref_size--; list_del(&gref->next_gref);
- if (gref->page) - __free_page(gref->page); - kfree(gref); }
From: Juergen Gross jgross@suse.com
Commit 1dbd11ca75fe664d3e54607547771d021f531f59 upstream.
Remove gnttab_query_foreign_access(), as it is unused and unsafe to use.
All previous use cases assumed a grant would not be in use after gnttab_query_foreign_access() returned 0. This information is useless in best case, as it only refers to a situation in the past, which could have changed already.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/grant-table.c | 25 ------------------------- include/xen/grant_table.h | 2 -- 2 files changed, 27 deletions(-)
--- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -134,13 +134,6 @@ struct gnttab_ops { * return the frame. */ unsigned long (*end_foreign_transfer_ref)(grant_ref_t ref); - /* - * Query the status of a grant entry. Ref parameter is reference of - * queried grant entry, return value is the status of queried entry. - * Detailed status(writing/reading) can be gotten from the return value - * by bit operations. - */ - int (*query_foreign_access)(grant_ref_t ref); };
struct unmap_refs_callback_data { @@ -285,22 +278,6 @@ int gnttab_grant_foreign_access(domid_t } EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access);
-static int gnttab_query_foreign_access_v1(grant_ref_t ref) -{ - return gnttab_shared.v1[ref].flags & (GTF_reading|GTF_writing); -} - -static int gnttab_query_foreign_access_v2(grant_ref_t ref) -{ - return grstatus[ref] & (GTF_reading|GTF_writing); -} - -int gnttab_query_foreign_access(grant_ref_t ref) -{ - return gnttab_interface->query_foreign_access(ref); -} -EXPORT_SYMBOL_GPL(gnttab_query_foreign_access); - static int gnttab_end_foreign_access_ref_v1(grant_ref_t ref, int readonly) { u16 flags, nflags; @@ -1307,7 +1284,6 @@ static const struct gnttab_ops gnttab_v1 .update_entry = gnttab_update_entry_v1, .end_foreign_access_ref = gnttab_end_foreign_access_ref_v1, .end_foreign_transfer_ref = gnttab_end_foreign_transfer_ref_v1, - .query_foreign_access = gnttab_query_foreign_access_v1, };
static const struct gnttab_ops gnttab_v2_ops = { @@ -1319,7 +1295,6 @@ static const struct gnttab_ops gnttab_v2 .update_entry = gnttab_update_entry_v2, .end_foreign_access_ref = gnttab_end_foreign_access_ref_v2, .end_foreign_transfer_ref = gnttab_end_foreign_transfer_ref_v2, - .query_foreign_access = gnttab_query_foreign_access_v2, };
static bool gnttab_need_v2(void) --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -118,8 +118,6 @@ int gnttab_grant_foreign_transfer(domid_ unsigned long gnttab_end_foreign_transfer_ref(grant_ref_t ref); unsigned long gnttab_end_foreign_transfer(grant_ref_t ref);
-int gnttab_query_foreign_access(grant_ref_t ref); - /* * operations on reserved batches of grant references */
From: Juergen Gross jgross@suse.com
Commit 5cadd4bb1d7fc9ab201ac14620d1a478357e4ebd upstream.
Instead of __get_free_pages() and free_pages() use alloc_pages_exact() and free_pages_exact(). This is in preparation of a change of gnttab_end_foreign_access() which will prohibit use of high-order pages.
By using the local variable "order" instead of ring->intf->ring_order in the error path of xen_9pfs_front_alloc_dataring() another bug is fixed, as the error path can be entered before ring->intf->ring_order is being set.
By using alloc_pages_exact() the size in bytes is specified for the allocation, which fixes another bug for the case of order < (PAGE_SHIFT - XEN_PAGE_SHIFT).
This is part of CVE-2022-23041 / XSA-396.
Reported-by: Simon Gaiser simon@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/9p/trans_xen.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
--- a/net/9p/trans_xen.c +++ b/net/9p/trans_xen.c @@ -301,9 +301,9 @@ static void xen_9pfs_front_free(struct x ref = priv->rings[i].intf->ref[j]; gnttab_end_foreign_access(ref, 0, 0); } - free_pages((unsigned long)priv->rings[i].data.in, - XEN_9PFS_RING_ORDER - - (PAGE_SHIFT - XEN_PAGE_SHIFT)); + free_pages_exact(priv->rings[i].data.in, + 1UL << (XEN_9PFS_RING_ORDER + + XEN_PAGE_SHIFT)); } gnttab_end_foreign_access(priv->rings[i].ref, 0, 0); free_page((unsigned long)priv->rings[i].intf); @@ -341,8 +341,8 @@ static int xen_9pfs_front_alloc_dataring if (ret < 0) goto out; ring->ref = ret; - bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, - XEN_9PFS_RING_ORDER - (PAGE_SHIFT - XEN_PAGE_SHIFT)); + bytes = alloc_pages_exact(1UL << (XEN_9PFS_RING_ORDER + XEN_PAGE_SHIFT), + GFP_KERNEL | __GFP_ZERO); if (!bytes) { ret = -ENOMEM; goto out; @@ -373,9 +373,7 @@ out: if (bytes) { for (i--; i >= 0; i--) gnttab_end_foreign_access(ring->intf->ref[i], 0, 0); - free_pages((unsigned long)bytes, - XEN_9PFS_RING_ORDER - - (PAGE_SHIFT - XEN_PAGE_SHIFT)); + free_pages_exact(bytes, 1UL << (XEN_9PFS_RING_ORDER + XEN_PAGE_SHIFT)); } gnttab_end_foreign_access(ring->ref, 0, 0); free_page((unsigned long)ring->intf);
From: Juergen Gross jgross@suse.com
Commit b0576cc9c6b843d99c6982888d59a56209341888 upstream.
Instead of __get_free_pages() and free_pages() use alloc_pages_exact() and free_pages_exact(). This is in preparation of a change of gnttab_end_foreign_access() which will prohibit use of high-order pages.
This is part of CVE-2022-23041 / XSA-396.
Reported-by: Simon Gaiser simon@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/pvcalls-front.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -346,8 +346,8 @@ static void free_active_ring(struct sock if (!map->active.ring) return;
- free_pages((unsigned long)map->active.data.in, - map->active.ring->ring_order); + free_pages_exact(map->active.data.in, + PAGE_SIZE << map->active.ring->ring_order); free_page((unsigned long)map->active.ring); }
@@ -361,8 +361,8 @@ static int alloc_active_ring(struct sock goto out;
map->active.ring->ring_order = PVCALLS_RING_ORDER; - bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, - PVCALLS_RING_ORDER); + bytes = alloc_pages_exact(PAGE_SIZE << PVCALLS_RING_ORDER, + GFP_KERNEL | __GFP_ZERO); if (!bytes) goto out;
From: Juergen Gross jgross@suse.com
Commit 42baefac638f06314298087394b982ead9ec444b upstream.
gnttab_end_foreign_access() is used to free a grant reference and optionally to free the associated page. In case the grant is still in use by the other side processing is being deferred. This leads to a problem in case no page to be freed is specified by the caller: the caller doesn't know that the page is still mapped by the other side and thus should not be used for other purposes.
The correct way to handle this situation is to take an additional reference to the granted page in case handling is being deferred and to drop that reference when the grant reference could be freed finally.
This requires that there are no users of gnttab_end_foreign_access() left directly repurposing the granted page after the call, as this might result in clobbered data or information leaks via the not yet freed grant reference.
This is part of CVE-2022-23041 / XSA-396.
Reported-by: Simon Gaiser simon@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/grant-table.c | 36 +++++++++++++++++++++++++++++------- include/xen/grant_table.h | 7 ++++++- 2 files changed, 35 insertions(+), 8 deletions(-)
--- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -134,6 +134,10 @@ struct gnttab_ops { * return the frame. */ unsigned long (*end_foreign_transfer_ref)(grant_ref_t ref); + /* + * Read the frame number related to a given grant reference. + */ + unsigned long (*read_frame)(grant_ref_t ref); };
struct unmap_refs_callback_data { @@ -331,6 +335,16 @@ int gnttab_end_foreign_access_ref(grant_ } EXPORT_SYMBOL_GPL(gnttab_end_foreign_access_ref);
+static unsigned long gnttab_read_frame_v1(grant_ref_t ref) +{ + return gnttab_shared.v1[ref].frame; +} + +static unsigned long gnttab_read_frame_v2(grant_ref_t ref) +{ + return gnttab_shared.v2[ref].full_page.frame; +} + struct deferred_entry { struct list_head list; grant_ref_t ref; @@ -360,12 +374,9 @@ static void gnttab_handle_deferred(struc spin_unlock_irqrestore(&gnttab_list_lock, flags); if (_gnttab_end_foreign_access_ref(entry->ref, entry->ro)) { put_free_entry(entry->ref); - if (entry->page) { - pr_debug("freeing g.e. %#x (pfn %#lx)\n", - entry->ref, page_to_pfn(entry->page)); - put_page(entry->page); - } else - pr_info("freeing g.e. %#x\n", entry->ref); + pr_debug("freeing g.e. %#x (pfn %#lx)\n", + entry->ref, page_to_pfn(entry->page)); + put_page(entry->page); kfree(entry); entry = NULL; } else { @@ -390,9 +401,18 @@ static void gnttab_handle_deferred(struc static void gnttab_add_deferred(grant_ref_t ref, bool readonly, struct page *page) { - struct deferred_entry *entry = kmalloc(sizeof(*entry), GFP_ATOMIC); + struct deferred_entry *entry; + gfp_t gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL; const char *what = KERN_WARNING "leaking";
+ entry = kmalloc(sizeof(*entry), gfp); + if (!page) { + unsigned long gfn = gnttab_interface->read_frame(ref); + + page = pfn_to_page(gfn_to_pfn(gfn)); + get_page(page); + } + if (entry) { unsigned long flags;
@@ -1284,6 +1304,7 @@ static const struct gnttab_ops gnttab_v1 .update_entry = gnttab_update_entry_v1, .end_foreign_access_ref = gnttab_end_foreign_access_ref_v1, .end_foreign_transfer_ref = gnttab_end_foreign_transfer_ref_v1, + .read_frame = gnttab_read_frame_v1, };
static const struct gnttab_ops gnttab_v2_ops = { @@ -1295,6 +1316,7 @@ static const struct gnttab_ops gnttab_v2 .update_entry = gnttab_update_entry_v2, .end_foreign_access_ref = gnttab_end_foreign_access_ref_v2, .end_foreign_transfer_ref = gnttab_end_foreign_transfer_ref_v2, + .read_frame = gnttab_read_frame_v2, };
static bool gnttab_need_v2(void) --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -100,7 +100,12 @@ int gnttab_end_foreign_access_ref(grant_ * Note that the granted page might still be accessed (read or write) by the * other side after gnttab_end_foreign_access() returns, so even if page was * specified as 0 it is not allowed to just reuse the page for other - * purposes immediately. + * purposes immediately. gnttab_end_foreign_access() will take an additional + * reference to the granted page in this case, which is dropped only after + * the grant is no longer in use. + * This requires that multi page allocations for areas subject to + * gnttab_end_foreign_access() are done via alloc_pages_exact() (and freeing + * via free_pages_exact()) in order to avoid high order pages. */ void gnttab_end_foreign_access(grant_ref_t ref, int readonly, unsigned long page);
From: Juergen Gross jgross@suse.com
Commit 66e3531b33ee51dad17c463b4d9c9f52e341503d upstream.
When calling gnttab_end_foreign_access_ref() the returned value must be tested and the reaction to that value should be appropriate.
In case of failure in xennet_get_responses() the reaction should not be to crash the system, but to disable the network device.
The calls in setup_netfront() can be replaced by calls of gnttab_end_foreign_access(). While at it avoid double free of ring pages and grant references via xennet_disconnect_backend() in this case.
This is CVE-2022-23042 / part of XSA-396.
Reported-by: Demi Marie Obenour demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/xen-netfront.c | 48 +++++++++++++++++++++++++++++---------------- 1 file changed, 31 insertions(+), 17 deletions(-)
--- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -862,7 +862,6 @@ static int xennet_get_responses(struct n int max = XEN_NETIF_NR_SLOTS_MIN + (rx->status <= RX_COPY_THRESHOLD); int slots = 1; int err = 0; - unsigned long ret;
if (rx->flags & XEN_NETRXF_extra_info) { err = xennet_get_extras(queue, extras, rp); @@ -893,8 +892,13 @@ static int xennet_get_responses(struct n goto next; }
- ret = gnttab_end_foreign_access_ref(ref, 0); - BUG_ON(!ret); + if (!gnttab_end_foreign_access_ref(ref, 0)) { + dev_alert(dev, + "Grant still in use by backend domain\n"); + queue->info->broken = true; + dev_alert(dev, "Disabled for further use\n"); + return -EINVAL; + }
gnttab_release_grant_reference(&queue->gref_rx_head, ref);
@@ -1098,6 +1102,10 @@ static int xennet_poll(struct napi_struc err = xennet_get_responses(queue, &rinfo, rp, &tmpq);
if (unlikely(err)) { + if (queue->info->broken) { + spin_unlock(&queue->rx_lock); + return 0; + } err: while ((skb = __skb_dequeue(&tmpq))) __skb_queue_tail(&errq, skb); @@ -1676,7 +1684,7 @@ static int setup_netfront(struct xenbus_ struct netfront_queue *queue, unsigned int feature_split_evtchn) { struct xen_netif_tx_sring *txs; - struct xen_netif_rx_sring *rxs; + struct xen_netif_rx_sring *rxs = NULL; grant_ref_t gref; int err;
@@ -1696,21 +1704,21 @@ static int setup_netfront(struct xenbus_
err = xenbus_grant_ring(dev, txs, 1, &gref); if (err < 0) - goto grant_tx_ring_fail; + goto fail; queue->tx_ring_ref = gref;
rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH); if (!rxs) { err = -ENOMEM; xenbus_dev_fatal(dev, err, "allocating rx ring page"); - goto alloc_rx_ring_fail; + goto fail; } SHARED_RING_INIT(rxs); FRONT_RING_INIT(&queue->rx, rxs, XEN_PAGE_SIZE);
err = xenbus_grant_ring(dev, rxs, 1, &gref); if (err < 0) - goto grant_rx_ring_fail; + goto fail; queue->rx_ring_ref = gref;
if (feature_split_evtchn) @@ -1723,22 +1731,28 @@ static int setup_netfront(struct xenbus_ err = setup_netfront_single(queue);
if (err) - goto alloc_evtchn_fail; + goto fail;
return 0;
/* If we fail to setup netfront, it is safe to just revoke access to * granted pages because backend is not accessing it at this point. */ -alloc_evtchn_fail: - gnttab_end_foreign_access_ref(queue->rx_ring_ref, 0); -grant_rx_ring_fail: - free_page((unsigned long)rxs); -alloc_rx_ring_fail: - gnttab_end_foreign_access_ref(queue->tx_ring_ref, 0); -grant_tx_ring_fail: - free_page((unsigned long)txs); -fail: + fail: + if (queue->rx_ring_ref != GRANT_INVALID_REF) { + gnttab_end_foreign_access(queue->rx_ring_ref, 0, + (unsigned long)rxs); + queue->rx_ring_ref = GRANT_INVALID_REF; + } else { + free_page((unsigned long)rxs); + } + if (queue->tx_ring_ref != GRANT_INVALID_REF) { + gnttab_end_foreign_access(queue->tx_ring_ref, 0, + (unsigned long)txs); + queue->tx_ring_ref = GRANT_INVALID_REF; + } else { + free_page((unsigned long)txs); + } return err; }
Hi!
This is the start of the stable review cycle for the 4.19.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-4...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
On Thu, 10 Mar 2022 15:18:27 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.234-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v4.19: 10 builds: 10 pass, 0 fail 22 boots: 22 pass, 0 fail 40 tests: 40 pass, 0 fail
Linux version: 4.19.234-rc2-g7603caa5cc11 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On 3/10/22 7:18 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.234-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Thu, Mar 10, 2022 at 03:18:27PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000. Anything received after that time might be too late.
Build results: total: 156 pass: 156 fail: 0 Qemu test results: total: 425 pass: 425 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
Hi Greg,
On Thu, Mar 10, 2022 at 03:18:27PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000. Anything received after that time might be too late.
Build test: mips (gcc version 11.2.1 20220301): 63 configs -> no failure arm (gcc version 11.2.1 20220301): 116 configs -> no new failure arm64 (gcc version 11.2.1 20220301): 2 configs -> no failure x86_64 (gcc version 11.2.1 20220301): 4 configs -> no failure
Boot test: x86_64: Booted on my test laptop. No regression. x86_64: Booted on qemu. No regression. [1]
[1]. https://openqa.qa.codethink.co.uk/tests/860
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
-- Regards Sudip
On Thu, 10 Mar 2022 at 19:52, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.19.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 12 Mar 2022 14:07:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.234-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 4.19.234-rc2 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git branch: linux-4.19.y * git commit: 7603caa5cc1196665ba06ded1f0a6f615eeaebf5 * git describe: v4.19.233-34-g7603caa5cc11 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.19.y/build/v4.19....
## Test Regressions (compared to v4.19.233-19-g83f8068e02bc) No test regressions found.
## Metric Regressions (compared to v4.19.233-19-g83f8068e02bc) No metric regressions found.
## Test Fixes (compared to v4.19.233-19-g83f8068e02bc) No test fixes found.
## Metric Fixes (compared to v4.19.233-19-g83f8068e02bc) No metric fixes found.
## Test result summary total: 84857, pass: 68636, fail: 1023, skip: 13321, xfail: 1877
## Build Summary * arm: 281 total, 275 passed, 6 failed * arm64: 39 total, 39 passed, 0 failed * dragonboard-410c: 1 total, 1 passed, 0 failed * hi6220-hikey: 1 total, 1 passed, 0 failed * i386: 19 total, 19 passed, 0 failed * juno-r2: 1 total, 1 passed, 0 failed * mips: 27 total, 27 passed, 0 failed * powerpc: 60 total, 49 passed, 11 failed * s390: 12 total, 12 passed, 0 failed * sparc: 12 total, 12 passed, 0 failed * x15: 1 total, 1 passed, 0 failed * x86: 1 total, 1 passed, 0 failed * x86_64: 38 total, 38 passed, 0 failed
## Test suites summary * fwts * igt-gpu-tools * kselftest-android * kselftest-arm64 * kselftest-bpf * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kvm-unit-tests * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * packetdrill * perf * rcutorture * ssuite * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
linux-stable-mirror@lists.linaro.org