----------------- Note, PLEASE TEST this kernel if you are on the 5.4.y tree before using it in a real workload. This was a quick release due to the obvious security fixes in it, and as such, it has not had very much testing "in the wild". Please let us know of any problems seen. Also note that the user/kernel api for the new security mitigations might be changing over time, so do not get used to them being fixed in stone just yet. -----------------
I'm announcing the release of the 5.4.252 kernel.
All users of the 5.4 kernel series must upgrade.
The updated 5.4.y git tree can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y and can be browsed at the normal kernel.org git web browser: https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git%3Ba=summa...
thanks,
greg k-h
------------
Documentation/ABI/testing/sysfs-devices-system-cpu | 11 Documentation/admin-guide/hw-vuln/gather_data_sampling.rst | 109 ++++++ Documentation/admin-guide/hw-vuln/index.rst | 1 Documentation/admin-guide/kernel-parameters.txt | 39 +- Makefile | 2 arch/Kconfig | 3 arch/alpha/include/asm/bugs.h | 20 - arch/arm/Kconfig | 1 arch/arm/include/asm/bugs.h | 4 arch/arm/kernel/bugs.c | 3 arch/ia64/Kconfig | 1 arch/ia64/include/asm/bugs.h | 20 - arch/ia64/kernel/setup.c | 3 arch/m68k/Kconfig | 1 arch/m68k/include/asm/bugs.h | 21 - arch/m68k/kernel/setup_mm.c | 3 arch/mips/Kconfig | 1 arch/mips/include/asm/bugs.h | 17 - arch/mips/kernel/setup.c | 13 arch/parisc/include/asm/bugs.h | 20 - arch/powerpc/include/asm/bugs.h | 15 arch/sh/Kconfig | 1 arch/sh/include/asm/bugs.h | 78 ---- arch/sh/include/asm/processor.h | 2 arch/sh/kernel/idle.c | 1 arch/sh/kernel/setup.c | 55 +++ arch/sparc/Kconfig | 1 arch/sparc/include/asm/bugs.h | 18 - arch/sparc/kernel/setup_32.c | 7 arch/um/Kconfig | 1 arch/um/include/asm/bugs.h | 7 arch/um/kernel/um_arch.c | 3 arch/x86/Kconfig | 20 + arch/x86/include/asm/bugs.h | 2 arch/x86/include/asm/cpufeature.h | 10 arch/x86/include/asm/cpufeatures.h | 18 - arch/x86/include/asm/disabled-features.h | 4 arch/x86/include/asm/fpu/internal.h | 2 arch/x86/include/asm/mem_encrypt.h | 2 arch/x86/include/asm/msr-index.h | 12 arch/x86/include/asm/required-features.h | 4 arch/x86/kernel/cpu/amd.c | 3 arch/x86/kernel/cpu/bugs.c | 209 +++++++++---- arch/x86/kernel/cpu/common.c | 122 ++++++- arch/x86/kernel/cpu/cpu.h | 2 arch/x86/kernel/cpu/scattered.c | 3 arch/x86/kernel/fpu/init.c | 8 arch/x86/kernel/smpboot.c | 1 arch/x86/kvm/cpuid.h | 1 arch/x86/kvm/x86.c | 5 arch/x86/mm/init.c | 7 arch/x86/xen/smp_pv.c | 2 arch/xtensa/include/asm/bugs.h | 18 - drivers/base/cpu.c | 8 drivers/net/xen-netback/netback.c | 15 include/asm-generic/bugs.h | 11 include/linux/cpu.h | 6 include/linux/sched/task.h | 2 init/main.c | 21 - kernel/fork.c | 37 +- tools/arch/x86/include/asm/cpufeatures.h | 18 - tools/arch/x86/include/asm/disabled-features.h | 3 tools/arch/x86/include/asm/required-features.h | 3 63 files changed, 659 insertions(+), 402 deletions(-)
Arnaldo Carvalho de Melo (1): tools headers cpufeatures: Sync with the kernel sources
Borislav Petkov (AMD) (1): x86/bugs: Increase the x86 bugs vector size to two u32s
Daniel Sneddon (4): x86/speculation: Add Gather Data Sampling mitigation x86/speculation: Add force option to GDS mitigation x86/speculation: Add Kconfig option for GDS KVM: Add GDS_NO support to KVM
Dave Hansen (1): Documentation/x86: Fix backwards on/off logic about YMM support
Greg Kroah-Hartman (2): x86: fix backwards merge of GDS/SRSO bit Linux 5.4.252
Juergen Gross (2): x86/xen: Fix secondary processors' FPU initialization x86/mm: fix poking_init() for Xen PV guests
Kim Phillips (1): x86/cpu, kvm: Add support for CPUID_80000021_EAX
Peter Zijlstra (3): x86/mm: Use mm_alloc() in poking_init() mm: Move mm_cachep initialization to mm_init() x86/mm: Initialize text poking earlier
Ross Lagerwall (1): xen/netback: Fix buffer overrun triggered by unusual packet
Sean Christopherson (1): x86/cpufeatures: Assign dedicated feature word for CPUID_0x8000001F[EAX]
Thomas Gleixner (15): init: Provide arch_cpu_finalize_init() x86/cpu: Switch to arch_cpu_finalize_init() ARM: cpu: Switch to arch_cpu_finalize_init() ia64/cpu: Switch to arch_cpu_finalize_init() m68k/cpu: Switch to arch_cpu_finalize_init() mips/cpu: Switch to arch_cpu_finalize_init() sh/cpu: Switch to arch_cpu_finalize_init() sparc/cpu: Switch to arch_cpu_finalize_init() um/cpu: Switch to arch_cpu_finalize_init() init: Remove check_bugs() leftovers init: Invoke arch_cpu_finalize_init() earlier init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init() x86/fpu: Remove cpuinfo argument from init functions x86/fpu: Mark init functions __init x86/fpu: Move FPU initialization into arch_cpu_finalize_init()
Tom Lendacky (2): x86/cpufeatures: Add SEV-ES CPU feature x86/cpu: Add VM page flush MSR availablility as a CPUID feature
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 726ac2e01b77..08e153614e09 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -480,16 +480,17 @@ Description: information about CPUs heterogeneity. cpu_capacity: capacity of cpu#.
What: /sys/devices/system/cpu/vulnerabilities + /sys/devices/system/cpu/vulnerabilities/gather_data_sampling + /sys/devices/system/cpu/vulnerabilities/itlb_multihit + /sys/devices/system/cpu/vulnerabilities/l1tf + /sys/devices/system/cpu/vulnerabilities/mds /sys/devices/system/cpu/vulnerabilities/meltdown + /sys/devices/system/cpu/vulnerabilities/mmio_stale_data + /sys/devices/system/cpu/vulnerabilities/spec_store_bypass /sys/devices/system/cpu/vulnerabilities/spectre_v1 /sys/devices/system/cpu/vulnerabilities/spectre_v2 - /sys/devices/system/cpu/vulnerabilities/spec_store_bypass - /sys/devices/system/cpu/vulnerabilities/l1tf - /sys/devices/system/cpu/vulnerabilities/mds /sys/devices/system/cpu/vulnerabilities/srbds /sys/devices/system/cpu/vulnerabilities/tsx_async_abort - /sys/devices/system/cpu/vulnerabilities/itlb_multihit - /sys/devices/system/cpu/vulnerabilities/mmio_stale_data Date: January 2018 Contact: Linux kernel mailing list linux-kernel@vger.kernel.org Description: Information about CPU vulnerabilities diff --git a/Documentation/admin-guide/hw-vuln/gather_data_sampling.rst b/Documentation/admin-guide/hw-vuln/gather_data_sampling.rst new file mode 100644 index 000000000000..264bfa937f7d --- /dev/null +++ b/Documentation/admin-guide/hw-vuln/gather_data_sampling.rst @@ -0,0 +1,109 @@ +.. SPDX-License-Identifier: GPL-2.0 + +GDS - Gather Data Sampling +========================== + +Gather Data Sampling is a hardware vulnerability which allows unprivileged +speculative access to data which was previously stored in vector registers. + +Problem +------- +When a gather instruction performs loads from memory, different data elements +are merged into the destination vector register. However, when a gather +instruction that is transiently executed encounters a fault, stale data from +architectural or internal vector registers may get transiently forwarded to the +destination vector register instead. This will allow a malicious attacker to +infer stale data using typical side channel techniques like cache timing +attacks. GDS is a purely sampling-based attack. + +The attacker uses gather instructions to infer the stale vector register data. +The victim does not need to do anything special other than use the vector +registers. The victim does not need to use gather instructions to be +vulnerable. + +Because the buffers are shared between Hyper-Threads cross Hyper-Thread attacks +are possible. + +Attack scenarios +---------------- +Without mitigation, GDS can infer stale data across virtually all +permission boundaries: + + Non-enclaves can infer SGX enclave data + Userspace can infer kernel data + Guests can infer data from hosts + Guest can infer guest from other guests + Users can infer data from other users + +Because of this, it is important to ensure that the mitigation stays enabled in +lower-privilege contexts like guests and when running outside SGX enclaves. + +The hardware enforces the mitigation for SGX. Likewise, VMMs should ensure +that guests are not allowed to disable the GDS mitigation. If a host erred and +allowed this, a guest could theoretically disable GDS mitigation, mount an +attack, and re-enable it. + +Mitigation mechanism +-------------------- +This issue is mitigated in microcode. The microcode defines the following new +bits: + + ================================ === ============================ + IA32_ARCH_CAPABILITIES[GDS_CTRL] R/O Enumerates GDS vulnerability + and mitigation support. + IA32_ARCH_CAPABILITIES[GDS_NO] R/O Processor is not vulnerable. + IA32_MCU_OPT_CTRL[GDS_MITG_DIS] R/W Disables the mitigation + 0 by default. + IA32_MCU_OPT_CTRL[GDS_MITG_LOCK] R/W Locks GDS_MITG_DIS=0. Writes + to GDS_MITG_DIS are ignored + Can't be cleared once set. + ================================ === ============================ + +GDS can also be mitigated on systems that don't have updated microcode by +disabling AVX. This can be done by setting gather_data_sampling="force" or +"clearcpuid=avx" on the kernel command-line. + +If used, these options will disable AVX use by turning off XSAVE YMM support. +However, the processor will still enumerate AVX support. Userspace that +does not follow proper AVX enumeration to check both AVX *and* XSAVE YMM +support will break. + +Mitigation control on the kernel command line +--------------------------------------------- +The mitigation can be disabled by setting "gather_data_sampling=off" or +"mitigations=off" on the kernel command line. Not specifying either will default +to the mitigation being enabled. Specifying "gather_data_sampling=force" will +use the microcode mitigation when available or disable AVX on affected systems +where the microcode hasn't been updated to include the mitigation. + +GDS System Information +------------------------ +The kernel provides vulnerability status information through sysfs. For +GDS this can be accessed by the following sysfs file: + +/sys/devices/system/cpu/vulnerabilities/gather_data_sampling + +The possible values contained in this file are: + + ============================== ============================================= + Not affected Processor not vulnerable. + Vulnerable Processor vulnerable and mitigation disabled. + Vulnerable: No microcode Processor vulnerable and microcode is missing + mitigation. + Mitigation: AVX disabled, + no microcode Processor is vulnerable and microcode is missing + mitigation. AVX disabled as mitigation. + Mitigation: Microcode Processor is vulnerable and mitigation is in + effect. + Mitigation: Microcode (locked) Processor is vulnerable and mitigation is in + effect and cannot be disabled. + Unknown: Dependent on + hypervisor status Running on a virtual guest processor that is + affected but with no way to know if host + processor is mitigated or vulnerable. + ============================== ============================================= + +GDS Default mitigation +---------------------- +The updated microcode will enable the mitigation by default. The kernel's +default action is to leave the mitigation enabled. diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst index 2adec1e6520a..245468b0f2be 100644 --- a/Documentation/admin-guide/hw-vuln/index.rst +++ b/Documentation/admin-guide/hw-vuln/index.rst @@ -16,3 +16,4 @@ are configurable at compile, boot or run time. multihit.rst special-register-buffer-data-sampling.rst processor_mmio_stale_data.rst + gather_data_sampling.rst diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 9d2185616d1a..51f845419b9c 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1336,6 +1336,26 @@ Format: off | on default: on
+ gather_data_sampling= + [X86,INTEL] Control the Gather Data Sampling (GDS) + mitigation. + + Gather Data Sampling is a hardware vulnerability which + allows unprivileged speculative access to data which was + previously stored in vector registers. + + This issue is mitigated by default in updated microcode. + The mitigation may have a performance impact but can be + disabled. On systems without the microcode mitigation + disabling AVX serves as a mitigation. + + force: Disable AVX to mitigate systems without + microcode mitigation. No effect if the microcode + mitigation is present. Known to cause crashes in + userspace with buggy AVX enumeration. + + off: Disable GDS mitigation. + gcov_persist= [GCOV] When non-zero (default), profiling data for kernel modules is saved and remains accessible via debugfs, even when the module is unloaded/reloaded. @@ -2696,21 +2716,22 @@ Disable all optional CPU mitigations. This improves system performance, but it may also expose users to several CPU vulnerabilities. - Equivalent to: nopti [X86,PPC] + Equivalent to: gather_data_sampling=off [X86] kpti=0 [ARM64] - nospectre_v1 [X86,PPC] + kvm.nx_huge_pages=off [X86] + l1tf=off [X86] + mds=off [X86] + mmio_stale_data=off [X86] + no_entry_flush [PPC] + no_uaccess_flush [PPC] nobp=0 [S390] + nopti [X86,PPC] + nospectre_v1 [X86,PPC] nospectre_v2 [X86,PPC,S390,ARM64] - spectre_v2_user=off [X86] spec_store_bypass_disable=off [X86,PPC] + spectre_v2_user=off [X86] ssbd=force-off [ARM64] - l1tf=off [X86] - mds=off [X86] tsx_async_abort=off [X86] - kvm.nx_huge_pages=off [X86] - no_entry_flush [PPC] - no_uaccess_flush [PPC] - mmio_stale_data=off [X86]
Exceptions: This does not have any effect on diff --git a/Makefile b/Makefile index 0b17d6936c2f..be75dc3ae8de 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 5 PATCHLEVEL = 4 -SUBLEVEL = 251 +SUBLEVEL = 252 EXTRAVERSION = NAME = Kleptomaniac Octopus
diff --git a/arch/Kconfig b/arch/Kconfig index 2219a07dca1e..4d03616bf597 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -271,6 +271,9 @@ config ARCH_HAS_UNCACHED_SEGMENT select ARCH_HAS_DMA_PREP_COHERENT bool
+config ARCH_HAS_CPU_FINALIZE_INIT + bool + # Select if arch init_task must go in the __init_task_data section config ARCH_TASK_STRUCT_ON_STACK bool diff --git a/arch/alpha/include/asm/bugs.h b/arch/alpha/include/asm/bugs.h deleted file mode 100644 index 78030d1c7e7e..000000000000 --- a/arch/alpha/include/asm/bugs.h +++ /dev/null @@ -1,20 +0,0 @@ -/* - * include/asm-alpha/bugs.h - * - * Copyright (C) 1994 Linus Torvalds - */ - -/* - * This is included by init/main.c to check for architecture-dependent bugs. - * - * Needs: - * void check_bugs(void); - */ - -/* - * I don't know of any alpha bugs yet.. Nice chip - */ - -static void check_bugs(void) -{ -} diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index a70696a95b79..2feb5dade121 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -5,6 +5,7 @@ config ARM select ARCH_32BIT_OFF_T select ARCH_CLOCKSOURCE_DATA select ARCH_HAS_BINFMT_FLAT + select ARCH_HAS_CPU_FINALIZE_INIT if MMU select ARCH_HAS_DEBUG_VIRTUAL if MMU select ARCH_HAS_DEVMEM_IS_ALLOWED select ARCH_HAS_DMA_COHERENT_TO_PFN if SWIOTLB diff --git a/arch/arm/include/asm/bugs.h b/arch/arm/include/asm/bugs.h index 97a312ba0840..fe385551edec 100644 --- a/arch/arm/include/asm/bugs.h +++ b/arch/arm/include/asm/bugs.h @@ -1,7 +1,5 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* - * arch/arm/include/asm/bugs.h - * * Copyright (C) 1995-2003 Russell King */ #ifndef __ASM_BUGS_H @@ -10,10 +8,8 @@ extern void check_writebuffer_bugs(void);
#ifdef CONFIG_MMU -extern void check_bugs(void); extern void check_other_bugs(void); #else -#define check_bugs() do { } while (0) #define check_other_bugs() do { } while (0) #endif
diff --git a/arch/arm/kernel/bugs.c b/arch/arm/kernel/bugs.c index 14c8dbbb7d2d..087bce6ec8e9 100644 --- a/arch/arm/kernel/bugs.c +++ b/arch/arm/kernel/bugs.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/init.h> +#include <linux/cpu.h> #include <asm/bugs.h> #include <asm/proc-fns.h>
@@ -11,7 +12,7 @@ void check_other_bugs(void) #endif }
-void __init check_bugs(void) +void __init arch_cpu_finalize_init(void) { check_writebuffer_bugs(); check_other_bugs(); diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 6a6036f16abe..72dc0ac46d6b 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -8,6 +8,7 @@ menu "Processor type and features"
config IA64 bool + select ARCH_HAS_CPU_FINALIZE_INIT select ARCH_MIGHT_HAVE_PC_PARPORT select ARCH_MIGHT_HAVE_PC_SERIO select ACPI diff --git a/arch/ia64/include/asm/bugs.h b/arch/ia64/include/asm/bugs.h deleted file mode 100644 index 0d6b9bded56c..000000000000 --- a/arch/ia64/include/asm/bugs.h +++ /dev/null @@ -1,20 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* - * This is included by init/main.c to check for architecture-dependent bugs. - * - * Needs: - * void check_bugs(void); - * - * Based on <asm-alpha/bugs.h>. - * - * Modified 1998, 1999, 2003 - * David Mosberger-Tang davidm@hpl.hp.com, Hewlett-Packard Co. - */ -#ifndef _ASM_IA64_BUGS_H -#define _ASM_IA64_BUGS_H - -#include <asm/processor.h> - -extern void check_bugs (void); - -#endif /* _ASM_IA64_BUGS_H */ diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c index bb320c6d0cc9..6700f066f59c 100644 --- a/arch/ia64/kernel/setup.c +++ b/arch/ia64/kernel/setup.c @@ -1073,8 +1073,7 @@ cpu_init (void) } }
-void __init -check_bugs (void) +void __init arch_cpu_finalize_init(void) { ia64_patch_mckinley_e9((unsigned long) __start___mckinley_e9_bundles, (unsigned long) __end___mckinley_e9_bundles); diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig index 6663f1741798..d84a6a63d36a 100644 --- a/arch/m68k/Kconfig +++ b/arch/m68k/Kconfig @@ -4,6 +4,7 @@ config M68K default y select ARCH_32BIT_OFF_T select ARCH_HAS_BINFMT_FLAT + select ARCH_HAS_CPU_FINALIZE_INIT if MMU select ARCH_HAS_DMA_PREP_COHERENT if HAS_DMA && MMU && !COLDFIRE select ARCH_HAS_SYNC_DMA_FOR_DEVICE if HAS_DMA select ARCH_MIGHT_HAVE_PC_PARPORT if ISA diff --git a/arch/m68k/include/asm/bugs.h b/arch/m68k/include/asm/bugs.h deleted file mode 100644 index 745530651e0b..000000000000 --- a/arch/m68k/include/asm/bugs.h +++ /dev/null @@ -1,21 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* - * include/asm-m68k/bugs.h - * - * Copyright (C) 1994 Linus Torvalds - */ - -/* - * This is included by init/main.c to check for architecture-dependent bugs. - * - * Needs: - * void check_bugs(void); - */ - -#ifdef CONFIG_MMU -extern void check_bugs(void); /* in arch/m68k/kernel/setup.c */ -#else -static void check_bugs(void) -{ -} -#endif diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c index 528484feff80..a9ef8b6b01bc 100644 --- a/arch/m68k/kernel/setup_mm.c +++ b/arch/m68k/kernel/setup_mm.c @@ -10,6 +10,7 @@ */
#include <linux/kernel.h> +#include <linux/cpu.h> #include <linux/mm.h> #include <linux/sched.h> #include <linux/delay.h> @@ -527,7 +528,7 @@ static int __init proc_hardware_init(void) module_init(proc_hardware_init); #endif
-void check_bugs(void) +void __init arch_cpu_finalize_init(void) { #if defined(CONFIG_FPU) && !defined(CONFIG_M68KFPU_EMU) if (m68k_fputype == 0) { diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 2811ecc1f3c7..a353d1d1b457 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -5,6 +5,7 @@ config MIPS select ARCH_32BIT_OFF_T if !64BIT select ARCH_BINFMT_ELF_STATE if MIPS_FP_SUPPORT select ARCH_CLOCKSOURCE_DATA + select ARCH_HAS_CPU_FINALIZE_INIT select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_HAS_UBSAN_SANITIZE_ALL select ARCH_SUPPORTS_UPROBES diff --git a/arch/mips/include/asm/bugs.h b/arch/mips/include/asm/bugs.h index d8ab8b7129b5..6d04d7d3a8f2 100644 --- a/arch/mips/include/asm/bugs.h +++ b/arch/mips/include/asm/bugs.h @@ -1,17 +1,11 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* - * This is included by init/main.c to check for architecture-dependent bugs. - * * Copyright (C) 2007 Maciej W. Rozycki - * - * Needs: - * void check_bugs(void); */ #ifndef _ASM_BUGS_H #define _ASM_BUGS_H
#include <linux/bug.h> -#include <linux/delay.h> #include <linux/smp.h>
#include <asm/cpu.h> @@ -31,17 +25,6 @@ static inline void check_bugs_early(void) #endif }
-static inline void check_bugs(void) -{ - unsigned int cpu = smp_processor_id(); - - cpu_data[cpu].udelay_val = loops_per_jiffy; - check_bugs32(); -#ifdef CONFIG_64BIT - check_bugs64(); -#endif -} - static inline int r4k_daddiu_bug(void) { #ifdef CONFIG_64BIT diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c index d91b772214b5..1c4114f8f9aa 100644 --- a/arch/mips/kernel/setup.c +++ b/arch/mips/kernel/setup.c @@ -11,6 +11,8 @@ * Copyright (C) 2000, 2001, 2002, 2007 Maciej W. Rozycki */ #include <linux/init.h> +#include <linux/cpu.h> +#include <linux/delay.h> #include <linux/ioport.h> #include <linux/export.h> #include <linux/screen_info.h> @@ -812,3 +814,14 @@ static int __init setnocoherentio(char *str) } early_param("nocoherentio", setnocoherentio); #endif + +void __init arch_cpu_finalize_init(void) +{ + unsigned int cpu = smp_processor_id(); + + cpu_data[cpu].udelay_val = loops_per_jiffy; + check_bugs32(); + + if (IS_ENABLED(CONFIG_CPU_R4X00_BUGS64)) + check_bugs64(); +} diff --git a/arch/parisc/include/asm/bugs.h b/arch/parisc/include/asm/bugs.h deleted file mode 100644 index 0a7f9db6bd1c..000000000000 --- a/arch/parisc/include/asm/bugs.h +++ /dev/null @@ -1,20 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* - * include/asm-parisc/bugs.h - * - * Copyright (C) 1999 Mike Shaver - */ - -/* - * This is included by init/main.c to check for architecture-dependent bugs. - * - * Needs: - * void check_bugs(void); - */ - -#include <asm/processor.h> - -static inline void check_bugs(void) -{ -// identify_cpu(&boot_cpu_data); -} diff --git a/arch/powerpc/include/asm/bugs.h b/arch/powerpc/include/asm/bugs.h deleted file mode 100644 index 01b8f6ca4dbb..000000000000 --- a/arch/powerpc/include/asm/bugs.h +++ /dev/null @@ -1,15 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-or-later */ -#ifndef _ASM_POWERPC_BUGS_H -#define _ASM_POWERPC_BUGS_H - -/* - */ - -/* - * This file is included by 'init/main.c' to check for - * architecture-dependent bugs. - */ - -static inline void check_bugs(void) { } - -#endif /* _ASM_POWERPC_BUGS_H */ diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig index f356ee674d89..671897a680ca 100644 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@ -2,6 +2,7 @@ config SUPERH def_bool y select ARCH_HAS_BINFMT_FLAT if !MMU + select ARCH_HAS_CPU_FINALIZE_INIT select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_MIGHT_HAVE_PC_PARPORT diff --git a/arch/sh/include/asm/bugs.h b/arch/sh/include/asm/bugs.h deleted file mode 100644 index 030df56bfdb2..000000000000 --- a/arch/sh/include/asm/bugs.h +++ /dev/null @@ -1,78 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef __ASM_SH_BUGS_H -#define __ASM_SH_BUGS_H - -/* - * This is included by init/main.c to check for architecture-dependent bugs. - * - * Needs: - * void check_bugs(void); - */ - -/* - * I don't know of any Super-H bugs yet. - */ - -#include <asm/processor.h> - -extern void select_idle_routine(void); - -static void __init check_bugs(void) -{ - extern unsigned long loops_per_jiffy; - char *p = &init_utsname()->machine[2]; /* "sh" */ - - select_idle_routine(); - - current_cpu_data.loops_per_jiffy = loops_per_jiffy; - - switch (current_cpu_data.family) { - case CPU_FAMILY_SH2: - *p++ = '2'; - break; - case CPU_FAMILY_SH2A: - *p++ = '2'; - *p++ = 'a'; - break; - case CPU_FAMILY_SH3: - *p++ = '3'; - break; - case CPU_FAMILY_SH4: - *p++ = '4'; - break; - case CPU_FAMILY_SH4A: - *p++ = '4'; - *p++ = 'a'; - break; - case CPU_FAMILY_SH4AL_DSP: - *p++ = '4'; - *p++ = 'a'; - *p++ = 'l'; - *p++ = '-'; - *p++ = 'd'; - *p++ = 's'; - *p++ = 'p'; - break; - case CPU_FAMILY_SH5: - *p++ = '6'; - *p++ = '4'; - break; - case CPU_FAMILY_UNKNOWN: - /* - * Specifically use CPU_FAMILY_UNKNOWN rather than - * default:, so we're able to have the compiler whine - * about unhandled enumerations. - */ - break; - } - - printk("CPU: %s\n", get_cpu_subtype(¤t_cpu_data)); - -#ifndef __LITTLE_ENDIAN__ - /* 'eb' means 'Endian Big' */ - *p++ = 'e'; - *p++ = 'b'; -#endif - *p = '\0'; -} -#endif /* __ASM_SH_BUGS_H */ diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h index 6fbf8c80e498..386786b1594a 100644 --- a/arch/sh/include/asm/processor.h +++ b/arch/sh/include/asm/processor.h @@ -173,6 +173,8 @@ extern unsigned int instruction_size(unsigned int insn); #define instruction_size(insn) (4) #endif
+void select_idle_routine(void); + #endif /* __ASSEMBLY__ */
#ifdef CONFIG_SUPERH32 diff --git a/arch/sh/kernel/idle.c b/arch/sh/kernel/idle.c index c20fc5487e05..bcb8eabd7618 100644 --- a/arch/sh/kernel/idle.c +++ b/arch/sh/kernel/idle.c @@ -15,6 +15,7 @@ #include <linux/smp.h> #include <linux/atomic.h> #include <asm/pgalloc.h> +#include <asm/processor.h> #include <asm/smp.h> #include <asm/bl_bit.h>
diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c index c25ee383cb83..2221e057e6b4 100644 --- a/arch/sh/kernel/setup.c +++ b/arch/sh/kernel/setup.c @@ -43,6 +43,7 @@ #include <asm/smp.h> #include <asm/mmu_context.h> #include <asm/mmzone.h> +#include <asm/processor.h> #include <asm/sparsemem.h>
/* @@ -362,3 +363,57 @@ int test_mode_pin(int pin) { return sh_mv.mv_mode_pins() & pin; } + +void __init arch_cpu_finalize_init(void) +{ + char *p = &init_utsname()->machine[2]; /* "sh" */ + + select_idle_routine(); + + current_cpu_data.loops_per_jiffy = loops_per_jiffy; + + switch (current_cpu_data.family) { + case CPU_FAMILY_SH2: + *p++ = '2'; + break; + case CPU_FAMILY_SH2A: + *p++ = '2'; + *p++ = 'a'; + break; + case CPU_FAMILY_SH3: + *p++ = '3'; + break; + case CPU_FAMILY_SH4: + *p++ = '4'; + break; + case CPU_FAMILY_SH4A: + *p++ = '4'; + *p++ = 'a'; + break; + case CPU_FAMILY_SH4AL_DSP: + *p++ = '4'; + *p++ = 'a'; + *p++ = 'l'; + *p++ = '-'; + *p++ = 'd'; + *p++ = 's'; + *p++ = 'p'; + break; + case CPU_FAMILY_UNKNOWN: + /* + * Specifically use CPU_FAMILY_UNKNOWN rather than + * default:, so we're able to have the compiler whine + * about unhandled enumerations. + */ + break; + } + + pr_info("CPU: %s\n", get_cpu_subtype(¤t_cpu_data)); + +#ifndef __LITTLE_ENDIAN__ + /* 'eb' means 'Endian Big' */ + *p++ = 'e'; + *p++ = 'b'; +#endif + *p = '\0'; +} diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 8cb5bb020b4b..881f6a849148 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -52,6 +52,7 @@ config SPARC config SPARC32 def_bool !64BIT select ARCH_32BIT_OFF_T + select ARCH_HAS_CPU_FINALIZE_INIT if !SMP select ARCH_HAS_SYNC_DMA_FOR_CPU select GENERIC_ATOMIC64 select CLZ_TAB diff --git a/arch/sparc/include/asm/bugs.h b/arch/sparc/include/asm/bugs.h deleted file mode 100644 index 02fa369b9c21..000000000000 --- a/arch/sparc/include/asm/bugs.h +++ /dev/null @@ -1,18 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* include/asm/bugs.h: Sparc probes for various bugs. - * - * Copyright (C) 1996, 2007 David S. Miller (davem@davemloft.net) - */ - -#ifdef CONFIG_SPARC32 -#include <asm/cpudata.h> -#endif - -extern unsigned long loops_per_jiffy; - -static void __init check_bugs(void) -{ -#if defined(CONFIG_SPARC32) && !defined(CONFIG_SMP) - cpu_data(0).udelay_val = loops_per_jiffy; -#endif -} diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c index afe1592a6d08..4373c1d64ab8 100644 --- a/arch/sparc/kernel/setup_32.c +++ b/arch/sparc/kernel/setup_32.c @@ -422,3 +422,10 @@ static int __init topology_init(void) }
subsys_initcall(topology_init); + +#if defined(CONFIG_SPARC32) && !defined(CONFIG_SMP) +void __init arch_cpu_finalize_init(void) +{ + cpu_data(0).udelay_val = loops_per_jiffy; +} +#endif diff --git a/arch/um/Kconfig b/arch/um/Kconfig index c56d3526a3bd..468a5d63ef26 100644 --- a/arch/um/Kconfig +++ b/arch/um/Kconfig @@ -5,6 +5,7 @@ menu "UML-specific options" config UML bool default y + select ARCH_HAS_CPU_FINALIZE_INIT select ARCH_HAS_KCOV select ARCH_NO_PREEMPT select HAVE_ARCH_AUDITSYSCALL diff --git a/arch/um/include/asm/bugs.h b/arch/um/include/asm/bugs.h deleted file mode 100644 index 4473942a0839..000000000000 --- a/arch/um/include/asm/bugs.h +++ /dev/null @@ -1,7 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef __UM_BUGS_H -#define __UM_BUGS_H - -void check_bugs(void); - -#endif diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index 640c8e178502..004ef4ebb57d 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -3,6 +3,7 @@ * Copyright (C) 2000 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) */
+#include <linux/cpu.h> #include <linux/delay.h> #include <linux/init.h> #include <linux/mm.h> @@ -353,7 +354,7 @@ void __init setup_arch(char **cmdline_p) setup_hostinfo(host_info, sizeof host_info); }
-void __init check_bugs(void) +void __init arch_cpu_finalize_init(void) { arch_check_bugs(); os_check_bugs(); diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 6002252692af..df0a3a1b08ae 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -60,6 +60,7 @@ config X86 select ARCH_CLOCKSOURCE_DATA select ARCH_CLOCKSOURCE_INIT select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI + select ARCH_HAS_CPU_FINALIZE_INIT select ARCH_HAS_DEBUG_VIRTUAL select ARCH_HAS_DEVMEM_IS_ALLOWED select ARCH_HAS_ELF_RANDOMIZE @@ -2500,6 +2501,25 @@ config ARCH_ENABLE_SPLIT_PMD_PTLOCK def_bool y depends on X86_64 || X86_PAE
+config GDS_FORCE_MITIGATION + bool "Force GDS Mitigation" + depends on CPU_SUP_INTEL + default n + help + Gather Data Sampling (GDS) is a hardware vulnerability which allows + unprivileged speculative access to data which was previously stored in + vector registers. + + This option is equivalent to setting gather_data_sampling=force on the + command line. The microcode mitigation is used if present, otherwise + AVX is disabled as a mitigation. On affected systems that are missing + the microcode any userspace code that unconditionally uses AVX will + break with this option set. + + Setting this option on systems not vulnerable to GDS has no effect. + + If in doubt, say N. + config ARCH_ENABLE_HUGEPAGE_MIGRATION def_bool y depends on X86_64 && HUGETLB_PAGE && MIGRATION diff --git a/arch/x86/include/asm/bugs.h b/arch/x86/include/asm/bugs.h index 794eb2129bc6..6554ddb2ad49 100644 --- a/arch/x86/include/asm/bugs.h +++ b/arch/x86/include/asm/bugs.h @@ -4,8 +4,6 @@
#include <asm/processor.h>
-extern void check_bugs(void); - #if defined(CONFIG_CPU_SUP_INTEL) void check_mpx_erratum(struct cpuinfo_x86 *c); #else diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 619c1f80a2ab..4466a47b7608 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -30,6 +30,8 @@ enum cpuid_leafs CPUID_7_ECX, CPUID_8000_0007_EBX, CPUID_7_EDX, + CPUID_8000_001F_EAX, + CPUID_8000_0021_EAX, };
#ifdef CONFIG_X86_FEATURE_NAMES @@ -88,8 +90,10 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 19, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 20, feature_bit) || \ REQUIRED_MASK_CHECK || \ - BUILD_BUG_ON_ZERO(NCAPINTS != 19)) + BUILD_BUG_ON_ZERO(NCAPINTS != 21))
#define DISABLED_MASK_BIT_SET(feature_bit) \ ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 0, feature_bit) || \ @@ -111,8 +115,10 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 19, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 20, feature_bit) || \ DISABLED_MASK_CHECK || \ - BUILD_BUG_ON_ZERO(NCAPINTS != 19)) + BUILD_BUG_ON_ZERO(NCAPINTS != 21))
#define cpu_has(c, bit) \ (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 3e360dc07bae..f42286e9a2b1 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -13,8 +13,8 @@ /* * Defines x86 CPU feature bits */ -#define NCAPINTS 19 /* N 32-bit words worth of info */ -#define NBUGINTS 1 /* N 32-bit bug flags */ +#define NCAPINTS 21 /* N 32-bit words worth of info */ +#define NBUGINTS 2 /* N 32-bit bug flags */
/* * Note: If the comment begins with a quoted string, that string is used @@ -96,7 +96,7 @@ #define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in IA32 userspace */ #define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */ #define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */ -#define X86_FEATURE_SME_COHERENT ( 3*32+17) /* "" AMD hardware-enforced cache coherency */ +/* FREE! ( 3*32+17) */ #define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" LFENCE synchronizes RDTSC */ #define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */ #define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */ @@ -201,7 +201,7 @@ #define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 7) /* Effectively INVPCID && CR4.PCIDE=1 */ #define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */ #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */ -#define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */ +/* FREE! ( 7*32+10) */ #define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */ #define X86_FEATURE_KERNEL_IBRS ( 7*32+12) /* "" Set/clear IBRS on kernel entry/exit */ #define X86_FEATURE_RSB_VMEXIT ( 7*32+13) /* "" Fill RSB on VM-Exit */ @@ -211,7 +211,7 @@ #define X86_FEATURE_SSBD ( 7*32+17) /* Speculative Store Bypass Disable */ #define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */ #define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* "" Fill RSB on context switches */ -#define X86_FEATURE_SEV ( 7*32+20) /* AMD Secure Encrypted Virtualization */ +/* FREE! ( 7*32+20) */ #define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */ #define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */ #define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* "" Disable Speculative Store Bypass. */ @@ -375,6 +375,13 @@ #define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */ #define X86_FEATURE_SPEC_CTRL_SSBD (18*32+31) /* "" Speculative Store Bypass Disable */
+/* AMD-defined memory encryption features, CPUID level 0x8000001f (EAX), word 19 */ +#define X86_FEATURE_SME (19*32+ 0) /* AMD Secure Memory Encryption */ +#define X86_FEATURE_SEV (19*32+ 1) /* AMD Secure Encrypted Virtualization */ +#define X86_FEATURE_VM_PAGE_FLUSH (19*32+ 2) /* "" VM Page Flush MSR is supported */ +#define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */ +#define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */ + /* * BUG word(s) */ @@ -415,5 +422,6 @@ #define X86_BUG_RETBLEED X86_BUG(26) /* CPU is affected by RETBleed */ #define X86_BUG_EIBRS_PBRSB X86_BUG(27) /* EIBRS is vulnerable to Post Barrier RSB Predictions */ #define X86_BUG_MMIO_UNKNOWN X86_BUG(28) /* CPU is too old and its MMIO Stale Data status is unknown */ +#define X86_BUG_GDS X86_BUG(29) /* CPU is affected by Gather Data Sampling */
#endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index a5ea841cc6d2..8453260f6d9f 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -84,6 +84,8 @@ #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP) #define DISABLED_MASK17 0 #define DISABLED_MASK18 0 -#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19) +#define DISABLED_MASK19 0 +#define DISABLED_MASK20 0 +#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21)
#endif /* _ASM_X86_DISABLED_FEATURES_H */ diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 5ed702e2c55f..330841bad616 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -41,7 +41,7 @@ extern int dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate); extern void fpu__init_cpu(void); extern void fpu__init_system_xstate(void); extern void fpu__init_cpu_xstate(void); -extern void fpu__init_system(struct cpuinfo_x86 *c); +extern void fpu__init_system(void); extern void fpu__init_check_bugs(void); extern void fpu__resume_cpu(void); extern u64 fpu__get_supported_xfeatures_mask(void); diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 848ce43b9040..190c5ca537e9 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -77,6 +77,8 @@ early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; static inline int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
+static inline void mem_encrypt_init(void) { } + #define __bss_decrypted
#endif /* CONFIG_AMD_MEM_ENCRYPT */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index f3fa903c5b29..7137256f2c31 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -147,6 +147,15 @@ * Not susceptible to Post-Barrier * Return Stack Buffer Predictions. */ +#define ARCH_CAP_GDS_CTRL BIT(25) /* + * CPU is vulnerable to Gather + * Data Sampling (GDS) and + * has controls for mitigation. + */ +#define ARCH_CAP_GDS_NO BIT(26) /* + * CPU is not vulnerable to Gather + * Data Sampling (GDS). + */
#define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /* @@ -165,6 +174,8 @@ #define MSR_IA32_MCU_OPT_CTRL 0x00000123 #define RNGDS_MITG_DIS BIT(0) #define FB_CLEAR_DIS BIT(3) /* CPU Fill buffer clear disable */ +#define GDS_MITG_DIS BIT(4) /* Disable GDS mitigation */ +#define GDS_MITG_LOCKED BIT(5) /* GDS mitigation locked */
#define MSR_IA32_SYSENTER_CS 0x00000174 #define MSR_IA32_SYSENTER_ESP 0x00000175 @@ -484,6 +495,7 @@ #define MSR_AMD64_ICIBSEXTDCTL 0xc001103c #define MSR_AMD64_IBSOPDATA4 0xc001103d #define MSR_AMD64_IBS_REG_COUNT_MAX 8 /* includes MSR_AMD64_IBSBRTARGET */ +#define MSR_AMD64_VM_PAGE_FLUSH 0xc001011e #define MSR_AMD64_SEV 0xc0010131 #define MSR_AMD64_SEV_ENABLED_BIT 0 #define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT) diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h index 6847d85400a8..fb3d81347e33 100644 --- a/arch/x86/include/asm/required-features.h +++ b/arch/x86/include/asm/required-features.h @@ -101,6 +101,8 @@ #define REQUIRED_MASK16 0 #define REQUIRED_MASK17 0 #define REQUIRED_MASK18 0 -#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19) +#define REQUIRED_MASK19 0 +#define REQUIRED_MASK20 0 +#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21)
#endif /* _ASM_X86_REQUIRED_FEATURES_H */ diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 3f182c06b305..11f09df72f51 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -663,7 +663,7 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c) * If BIOS has not enabled SME then don't advertise the * SME feature (set in scattered.c). * For SEV: If BIOS has not enabled SEV then don't advertise the - * SEV feature (set in scattered.c). + * SEV and SEV_ES feature (set in scattered.c). * * In all cases, since support for SME and SEV requires long mode, * don't advertise the feature under CONFIG_X86_32. @@ -694,6 +694,7 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c) setup_clear_cpu_cap(X86_FEATURE_SME); clear_sev: setup_clear_cpu_cap(X86_FEATURE_SEV); + setup_clear_cpu_cap(X86_FEATURE_SEV_ES); } }
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 75ca28bb267c..48ae44cf7795 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -9,7 +9,6 @@ * - Andrew D. Balsa (code cleanup). */ #include <linux/init.h> -#include <linux/utsname.h> #include <linux/cpu.h> #include <linux/module.h> #include <linux/nospec.h> @@ -25,9 +24,7 @@ #include <asm/msr.h> #include <asm/vmx.h> #include <asm/paravirt.h> -#include <asm/alternative.h> #include <asm/pgtable.h> -#include <asm/set_memory.h> #include <asm/intel-family.h> #include <asm/e820/api.h> #include <asm/hypervisor.h> @@ -47,6 +44,7 @@ static void __init md_clear_select_mitigation(void); static void __init taa_select_mitigation(void); static void __init mmio_select_mitigation(void); static void __init srbds_select_mitigation(void); +static void __init gds_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR without task-specific bits set */ u64 x86_spec_ctrl_base; @@ -115,21 +113,8 @@ EXPORT_SYMBOL_GPL(mds_idle_clear); DEFINE_STATIC_KEY_FALSE(mmio_stale_data_clear); EXPORT_SYMBOL_GPL(mmio_stale_data_clear);
-void __init check_bugs(void) +void __init cpu_select_mitigations(void) { - identify_boot_cpu(); - - /* - * identify_boot_cpu() initialized SMT support information, let the - * core code know. - */ - cpu_smt_check_topology(); - - if (!IS_ENABLED(CONFIG_SMP)) { - pr_info("CPU: "); - print_cpu_info(&boot_cpu_data); - } - /* * Read the SPEC_CTRL MSR to account for reserved bits which may * have unknown values. AMD64_LS_CFG MSR is cached in the early AMD @@ -165,39 +150,7 @@ void __init check_bugs(void) l1tf_select_mitigation(); md_clear_select_mitigation(); srbds_select_mitigation(); - - arch_smt_update(); - -#ifdef CONFIG_X86_32 - /* - * Check whether we are able to run this kernel safely on SMP. - * - * - i386 is no longer supported. - * - In order to run on anything without a TSC, we need to be - * compiled for a i486. - */ - if (boot_cpu_data.x86 < 4) - panic("Kernel requires i486+ for 'invlpg' and other features"); - - init_utsname()->machine[1] = - '0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86); - alternative_instructions(); - - fpu__init_check_bugs(); -#else /* CONFIG_X86_64 */ - alternative_instructions(); - - /* - * Make sure the first 2MB area is not mapped by huge pages - * There are typically fixed size MTRRs in there and overlapping - * MTRRs into large pages causes slow downs. - * - * Right now we don't do that with gbpages because there seems - * very little benefit for that case. - */ - if (!direct_gbpages) - set_memory_4k((unsigned long)__va(0), 1); -#endif + gds_select_mitigation(); }
/* @@ -648,6 +601,149 @@ static int __init srbds_parse_cmdline(char *str) } early_param("srbds", srbds_parse_cmdline);
+#undef pr_fmt +#define pr_fmt(fmt) "GDS: " fmt + +enum gds_mitigations { + GDS_MITIGATION_OFF, + GDS_MITIGATION_UCODE_NEEDED, + GDS_MITIGATION_FORCE, + GDS_MITIGATION_FULL, + GDS_MITIGATION_FULL_LOCKED, + GDS_MITIGATION_HYPERVISOR, +}; + +#if IS_ENABLED(CONFIG_GDS_FORCE_MITIGATION) +static enum gds_mitigations gds_mitigation __ro_after_init = GDS_MITIGATION_FORCE; +#else +static enum gds_mitigations gds_mitigation __ro_after_init = GDS_MITIGATION_FULL; +#endif + +static const char * const gds_strings[] = { + [GDS_MITIGATION_OFF] = "Vulnerable", + [GDS_MITIGATION_UCODE_NEEDED] = "Vulnerable: No microcode", + [GDS_MITIGATION_FORCE] = "Mitigation: AVX disabled, no microcode", + [GDS_MITIGATION_FULL] = "Mitigation: Microcode", + [GDS_MITIGATION_FULL_LOCKED] = "Mitigation: Microcode (locked)", + [GDS_MITIGATION_HYPERVISOR] = "Unknown: Dependent on hypervisor status", +}; + +bool gds_ucode_mitigated(void) +{ + return (gds_mitigation == GDS_MITIGATION_FULL || + gds_mitigation == GDS_MITIGATION_FULL_LOCKED); +} +EXPORT_SYMBOL_GPL(gds_ucode_mitigated); + +void update_gds_msr(void) +{ + u64 mcu_ctrl_after; + u64 mcu_ctrl; + + switch (gds_mitigation) { + case GDS_MITIGATION_OFF: + rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl); + mcu_ctrl |= GDS_MITG_DIS; + break; + case GDS_MITIGATION_FULL_LOCKED: + /* + * The LOCKED state comes from the boot CPU. APs might not have + * the same state. Make sure the mitigation is enabled on all + * CPUs. + */ + case GDS_MITIGATION_FULL: + rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl); + mcu_ctrl &= ~GDS_MITG_DIS; + break; + case GDS_MITIGATION_FORCE: + case GDS_MITIGATION_UCODE_NEEDED: + case GDS_MITIGATION_HYPERVISOR: + return; + }; + + wrmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl); + + /* + * Check to make sure that the WRMSR value was not ignored. Writes to + * GDS_MITG_DIS will be ignored if this processor is locked but the boot + * processor was not. + */ + rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl_after); + WARN_ON_ONCE(mcu_ctrl != mcu_ctrl_after); +} + +static void __init gds_select_mitigation(void) +{ + u64 mcu_ctrl; + + if (!boot_cpu_has_bug(X86_BUG_GDS)) + return; + + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { + gds_mitigation = GDS_MITIGATION_HYPERVISOR; + goto out; + } + + if (cpu_mitigations_off()) + gds_mitigation = GDS_MITIGATION_OFF; + /* Will verify below that mitigation _can_ be disabled */ + + /* No microcode */ + if (!(x86_read_arch_cap_msr() & ARCH_CAP_GDS_CTRL)) { + if (gds_mitigation == GDS_MITIGATION_FORCE) { + /* + * This only needs to be done on the boot CPU so do it + * here rather than in update_gds_msr() + */ + setup_clear_cpu_cap(X86_FEATURE_AVX); + pr_warn("Microcode update needed! Disabling AVX as mitigation.\n"); + } else { + gds_mitigation = GDS_MITIGATION_UCODE_NEEDED; + } + goto out; + } + + /* Microcode has mitigation, use it */ + if (gds_mitigation == GDS_MITIGATION_FORCE) + gds_mitigation = GDS_MITIGATION_FULL; + + rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl); + if (mcu_ctrl & GDS_MITG_LOCKED) { + if (gds_mitigation == GDS_MITIGATION_OFF) + pr_warn("Mitigation locked. Disable failed.\n"); + + /* + * The mitigation is selected from the boot CPU. All other CPUs + * _should_ have the same state. If the boot CPU isn't locked + * but others are then update_gds_msr() will WARN() of the state + * mismatch. If the boot CPU is locked update_gds_msr() will + * ensure the other CPUs have the mitigation enabled. + */ + gds_mitigation = GDS_MITIGATION_FULL_LOCKED; + } + + update_gds_msr(); +out: + pr_info("%s\n", gds_strings[gds_mitigation]); +} + +static int __init gds_parse_cmdline(char *str) +{ + if (!str) + return -EINVAL; + + if (!boot_cpu_has_bug(X86_BUG_GDS)) + return 0; + + if (!strcmp(str, "off")) + gds_mitigation = GDS_MITIGATION_OFF; + else if (!strcmp(str, "force")) + gds_mitigation = GDS_MITIGATION_FORCE; + + return 0; +} +early_param("gather_data_sampling", gds_parse_cmdline); + #undef pr_fmt #define pr_fmt(fmt) "Spectre V1 : " fmt
@@ -2207,6 +2303,11 @@ static ssize_t retbleed_show_state(char *buf) return sprintf(buf, "%s\n", retbleed_strings[retbleed_mitigation]); }
+static ssize_t gds_show_state(char *buf) +{ + return sysfs_emit(buf, "%s\n", gds_strings[gds_mitigation]); +} + static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, char *buf, unsigned int bug) { @@ -2256,6 +2357,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr case X86_BUG_RETBLEED: return retbleed_show_state(buf);
+ case X86_BUG_GDS: + return gds_show_state(buf); + default: break; } @@ -2320,4 +2424,9 @@ ssize_t cpu_show_retbleed(struct device *dev, struct device_attribute *attr, cha { return cpu_show_common(dev, attr, buf, X86_BUG_RETBLEED); } + +ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_GDS); +} #endif diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index c8ccf5bfd534..fcfe891c1e8e 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -17,11 +17,16 @@ #include <linux/init.h> #include <linux/kprobes.h> #include <linux/kgdb.h> +#include <linux/mem_encrypt.h> #include <linux/smp.h> +#include <linux/cpu.h> #include <linux/io.h> #include <linux/syscore_ops.h>
#include <asm/stackprotector.h> +#include <linux/utsname.h> + +#include <asm/alternative.h> #include <asm/perf_event.h> #include <asm/mmu_context.h> #include <asm/archrandom.h> @@ -57,6 +62,7 @@ #ifdef CONFIG_X86_LOCAL_APIC #include <asm/uv/uv.h> #endif +#include <asm/set_memory.h>
#include "cpu.h"
@@ -961,6 +967,12 @@ void get_cpu_cap(struct cpuinfo_x86 *c) if (c->extended_cpuid_level >= 0x8000000a) c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
+ if (c->extended_cpuid_level >= 0x8000001f) + c->x86_capability[CPUID_8000_001F_EAX] = cpuid_eax(0x8000001f); + + if (c->extended_cpuid_level >= 0x80000021) + c->x86_capability[CPUID_8000_0021_EAX] = cpuid_eax(0x80000021); + init_scattered_cpuid_features(c); init_speculation_control(c); init_cqm(c); @@ -1123,6 +1135,12 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { #define MMIO_SBDS BIT(2) /* CPU is affected by RETbleed, speculating where you would not expect it */ #define RETBLEED BIT(3) +/* CPU is affected by SMT (cross-thread) return predictions */ +#define SMT_RSB BIT(4) +/* CPU is affected by SRSO */ +#define SRSO BIT(5) +/* CPU is affected by GDS */ +#define GDS BIT(6)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), @@ -1135,19 +1153,21 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO), VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS), VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED), - VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED), + VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS), VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED), - VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED), - VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED), + VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED | GDS), VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED), - VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), - VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO), - VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO), - VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), + VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS), + VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS), + VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED), - VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), + VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS), + VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS), VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), - VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED), + VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS), VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPING_ANY, MMIO | MMIO_SBDS), VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO), VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS), @@ -1273,6 +1293,16 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) !(ia32_cap & ARCH_CAP_PBRSB_NO)) setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
+ /* + * Check if CPU is vulnerable to GDS. If running in a virtual machine on + * an affected processor, the VMM may have disabled the use of GATHER by + * disabling AVX2. The only way to do this in HW is to clear XCR0[2], + * which means that AVX will be disabled. + */ + if (cpu_matches(cpu_vuln_blacklist, GDS) && !(ia32_cap & ARCH_CAP_GDS_NO) && + boot_cpu_has(X86_FEATURE_AVX)) + setup_force_cpu_bug(X86_BUG_GDS); + if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return;
@@ -1358,8 +1388,6 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
cpu_set_bug_bits(c);
- fpu__init_system(c); - #ifdef CONFIG_X86_32 /* * Regardless of whether PCID is enumerated, the SDM says @@ -1751,6 +1779,8 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c) validate_apic_and_package_id(c); x86_spec_ctrl_setup_ap(); update_srbds_msr(); + if (boot_cpu_has_bug(X86_BUG_GDS)) + update_gds_msr(); }
static __init int setup_noclflush(char *arg) @@ -2049,8 +2079,6 @@ void cpu_init(void) clear_all_debug_regs(); dbg_restore_debug_regs();
- fpu__init_cpu(); - if (is_uv_system()) uv_cpu_init();
@@ -2108,8 +2136,6 @@ void cpu_init(void) clear_all_debug_regs(); dbg_restore_debug_regs();
- fpu__init_cpu(); - load_fixmap_gdt(cpu); } #endif @@ -2156,3 +2182,69 @@ void arch_smt_update(void) /* Check whether IPI broadcasting can be enabled */ apic_smt_update(); } + +void __init arch_cpu_finalize_init(void) +{ + identify_boot_cpu(); + + /* + * identify_boot_cpu() initialized SMT support information, let the + * core code know. + */ + cpu_smt_check_topology(); + + if (!IS_ENABLED(CONFIG_SMP)) { + pr_info("CPU: "); + print_cpu_info(&boot_cpu_data); + } + + cpu_select_mitigations(); + + arch_smt_update(); + + if (IS_ENABLED(CONFIG_X86_32)) { + /* + * Check whether this is a real i386 which is not longer + * supported and fixup the utsname. + */ + if (boot_cpu_data.x86 < 4) + panic("Kernel requires i486+ for 'invlpg' and other features"); + + init_utsname()->machine[1] = + '0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86); + } + + /* + * Must be before alternatives because it might set or clear + * feature bits. + */ + fpu__init_system(); + fpu__init_cpu(); + + alternative_instructions(); + + if (IS_ENABLED(CONFIG_X86_64)) { + /* + * Make sure the first 2MB area is not mapped by huge pages + * There are typically fixed size MTRRs in there and overlapping + * MTRRs into large pages causes slow downs. + * + * Right now we don't do that with gbpages because there seems + * very little benefit for that case. + */ + if (!direct_gbpages) + set_memory_4k((unsigned long)__va(0), 1); + } else { + fpu__init_check_bugs(); + } + + /* + * This needs to be called before any devices perform DMA + * operations that might use the SWIOTLB bounce buffers. It will + * mark the bounce buffers as decrypted so that their usage will + * not cause "plain-text" data to be decrypted when accessed. It + * must be called after late_time_init() so that Hyper-V x86/x64 + * hypercalls work when the SWIOTLB bounce buffers are decrypted. + */ + mem_encrypt_init(); +} diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h index 4d04c127c4a7..8a64520b5310 100644 --- a/arch/x86/kernel/cpu/cpu.h +++ b/arch/x86/kernel/cpu/cpu.h @@ -76,9 +76,11 @@ extern void detect_ht(struct cpuinfo_x86 *c); extern void check_null_seg_clears_base(struct cpuinfo_x86 *c);
unsigned int aperfmperf_get_khz(int cpu); +void cpu_select_mitigations(void);
extern void x86_spec_ctrl_setup_ap(void); extern void update_srbds_msr(void); +extern void update_gds_msr(void);
extern u64 x86_read_arch_cap_msr(void);
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c index a03e309a0ac5..37f716eaf0e6 100644 --- a/arch/x86/kernel/cpu/scattered.c +++ b/arch/x86/kernel/cpu/scattered.c @@ -40,9 +40,6 @@ static const struct cpuid_bit cpuid_bits[] = { { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 }, { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 }, { X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 }, - { X86_FEATURE_SME, CPUID_EAX, 0, 0x8000001f, 0 }, - { X86_FEATURE_SEV, CPUID_EAX, 1, 0x8000001f, 0 }, - { X86_FEATURE_SME_COHERENT, CPUID_EAX, 10, 0x8000001f, 0 }, { 0, 0, 0, 0, 0 } };
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c index 17d092eb1934..5e710ad6a3c0 100644 --- a/arch/x86/kernel/fpu/init.c +++ b/arch/x86/kernel/fpu/init.c @@ -50,7 +50,7 @@ void fpu__init_cpu(void) fpu__init_cpu_xstate(); }
-static bool fpu__probe_without_cpuid(void) +static bool __init fpu__probe_without_cpuid(void) { unsigned long cr0; u16 fsw, fcw; @@ -68,7 +68,7 @@ static bool fpu__probe_without_cpuid(void) return fsw == 0 && (fcw & 0x103f) == 0x003f; }
-static void fpu__init_system_early_generic(struct cpuinfo_x86 *c) +static void __init fpu__init_system_early_generic(void) { if (!boot_cpu_has(X86_FEATURE_CPUID) && !test_bit(X86_FEATURE_FPU, (unsigned long *)cpu_caps_cleared)) { @@ -290,10 +290,10 @@ static void __init fpu__init_parse_early_param(void) * Called on the boot CPU once per system bootup, to set up the initial * FPU state that is later cloned into all processes: */ -void __init fpu__init_system(struct cpuinfo_x86 *c) +void __init fpu__init_system(void) { fpu__init_parse_early_param(); - fpu__init_system_early_generic(c); + fpu__init_system_early_generic();
/* * The FPU has to be operational for some of the diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 45e5ecb43393..d6a8efff9c61 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -235,6 +235,7 @@ static void notrace start_secondary(void *unused) #endif load_current_idt(); cpu_init(); + fpu__init_cpu(); rcu_cpu_starting(raw_smp_processor_id()); x86_cpuinit.early_percpu_clock_init(); preempt_disable(); diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index 7dec43b2c420..defae8082789 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -53,6 +53,7 @@ static const struct cpuid_reg reverse_cpuid[] = { [CPUID_7_ECX] = { 7, 0, CPUID_ECX}, [CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX}, [CPUID_7_EDX] = { 7, 0, CPUID_EDX}, + [CPUID_8000_0021_EAX] = {0x80000021, 0, CPUID_EAX}, };
static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index d152afdfa8b4..2ee3da99bc1d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -226,6 +226,8 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
u64 __read_mostly host_xcr0;
+extern bool gds_ucode_mitigated(void); + struct kmem_cache *x86_fpu_cache; EXPORT_SYMBOL_GPL(x86_fpu_cache);
@@ -1409,6 +1411,9 @@ static u64 kvm_get_arch_capabilities(void) /* Guests don't need to know "Fill buffer clear control" exists */ data &= ~ARCH_CAP_FB_CLEAR_CTRL;
+ if (!boot_cpu_has_bug(X86_BUG_GDS) || gds_ucode_mitigated()) + data |= ARCH_CAP_GDS_NO; + return data; }
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index 38e6798ce44f..086b274fa60f 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -7,6 +7,7 @@ #include <linux/swapops.h> #include <linux/kmemleak.h> #include <linux/sched/task.h> +#include <linux/sched/mm.h>
#include <asm/set_memory.h> #include <asm/cpu_device_id.h> @@ -26,6 +27,7 @@ #include <asm/cpufeature.h> #include <asm/pti.h> #include <asm/text-patching.h> +#include <asm/paravirt.h>
/* * We need to define the tracepoints somewhere, and tlb.c @@ -735,9 +737,12 @@ void __init poking_init(void) spinlock_t *ptl; pte_t *ptep;
- poking_mm = copy_init_mm(); + poking_mm = mm_alloc(); BUG_ON(!poking_mm);
+ /* Xen PV guests need the PGD to be pinned. */ + paravirt_arch_dup_mmap(NULL, poking_mm); + /* * Randomize the poking address, but make sure that the following page * will be mapped at the same PMD. We need 2 pages, so find space for 3, diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c index 928fbe63c96f..3a0a27d94c05 100644 --- a/arch/x86/xen/smp_pv.c +++ b/arch/x86/xen/smp_pv.c @@ -28,6 +28,7 @@ #include <asm/desc.h> #include <asm/pgtable.h> #include <asm/cpu.h> +#include <asm/fpu/internal.h>
#include <xen/interface/xen.h> #include <xen/interface/vcpu.h> @@ -61,6 +62,7 @@ static void cpu_bringup(void)
cr4_init(); cpu_init(); + fpu__init_cpu(); touch_softlockup_watchdog(); preempt_disable();
diff --git a/arch/xtensa/include/asm/bugs.h b/arch/xtensa/include/asm/bugs.h deleted file mode 100644 index 69b29d198249..000000000000 --- a/arch/xtensa/include/asm/bugs.h +++ /dev/null @@ -1,18 +0,0 @@ -/* - * include/asm-xtensa/bugs.h - * - * This is included by init/main.c to check for architecture-dependent bugs. - * - * Xtensa processors don't have any bugs. :) - * - * This file is subject to the terms and conditions of the GNU General - * Public License. See the file "COPYING" in the main directory of - * this archive for more details. - */ - -#ifndef _XTENSA_BUGS_H -#define _XTENSA_BUGS_H - -static void check_bugs(void) { } - -#endif /* _XTENSA_BUGS_H */ diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index 980e9a76e172..5e0c1fd27720 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -581,6 +581,12 @@ ssize_t __weak cpu_show_retbleed(struct device *dev, return sysfs_emit(buf, "Not affected\n"); }
+ssize_t __weak cpu_show_gds(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "Not affected\n"); +} + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); @@ -592,6 +598,7 @@ static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL); static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL); static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL); static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL); +static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@ -605,6 +612,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_srbds.attr, &dev_attr_mmio_stale_data.attr, &dev_attr_retbleed.attr, + &dev_attr_gather_data_sampling.attr, NULL };
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index a3078755939e..c3eadac893d8 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -389,7 +389,7 @@ static void xenvif_get_requests(struct xenvif_queue *queue, struct gnttab_map_grant_ref *gop = queue->tx_map_ops + *map_ops; struct xen_netif_tx_request *txp = first;
- nr_slots = shinfo->nr_frags + 1; + nr_slots = shinfo->nr_frags + frag_overflow + 1;
copy_count(skb) = 0; XENVIF_TX_CB(skb)->split_mask = 0; @@ -455,8 +455,8 @@ static void xenvif_get_requests(struct xenvif_queue *queue, } }
- for (shinfo->nr_frags = 0; shinfo->nr_frags < nr_slots; - shinfo->nr_frags++, gop++) { + for (shinfo->nr_frags = 0; nr_slots > 0 && shinfo->nr_frags < MAX_SKB_FRAGS; + shinfo->nr_frags++, gop++, nr_slots--) { index = pending_index(queue->pending_cons++); pending_idx = queue->pending_ring[index]; xenvif_tx_create_map_op(queue, pending_idx, txp, @@ -469,12 +469,12 @@ static void xenvif_get_requests(struct xenvif_queue *queue, txp++; }
- if (frag_overflow) { + if (nr_slots > 0) {
shinfo = skb_shinfo(nskb); frags = shinfo->frags;
- for (shinfo->nr_frags = 0; shinfo->nr_frags < frag_overflow; + for (shinfo->nr_frags = 0; shinfo->nr_frags < nr_slots; shinfo->nr_frags++, txp++, gop++) { index = pending_index(queue->pending_cons++); pending_idx = queue->pending_ring[index]; @@ -485,6 +485,11 @@ static void xenvif_get_requests(struct xenvif_queue *queue, }
skb_shinfo(skb)->frag_list = nskb; + } else if (nskb) { + /* A frag_list skb was allocated but it is no longer needed + * because enough slots were converted to copy ops above. + */ + kfree_skb(nskb); }
(*copy_ops) = cop - queue->tx_copy_ops; diff --git a/include/asm-generic/bugs.h b/include/asm-generic/bugs.h deleted file mode 100644 index 69021830f078..000000000000 --- a/include/asm-generic/bugs.h +++ /dev/null @@ -1,11 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef __ASM_GENERIC_BUGS_H -#define __ASM_GENERIC_BUGS_H -/* - * This file is included by 'init/main.c' to check for - * architecture-dependent bugs. - */ - -static inline void check_bugs(void) { } - -#endif /* __ASM_GENERIC_BUGS_H */ diff --git a/include/linux/cpu.h b/include/linux/cpu.h index b42e9c413447..782491dd1999 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -193,6 +193,12 @@ void arch_cpu_idle_enter(void); void arch_cpu_idle_exit(void); void arch_cpu_idle_dead(void);
+#ifdef CONFIG_ARCH_HAS_CPU_FINALIZE_INIT +void arch_cpu_finalize_init(void); +#else +static inline void arch_cpu_finalize_init(void) { } +#endif + int cpu_report_state(int cpu); int cpu_check_up_prepare(int cpu); void cpu_set_state_online(int cpu); diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h index 6f33a07858cf..853ab403e77b 100644 --- a/include/linux/sched/task.h +++ b/include/linux/sched/task.h @@ -53,6 +53,7 @@ extern void sched_dead(struct task_struct *p); void __noreturn do_task_dead(void); void __noreturn make_task_dead(int signr);
+extern void mm_cache_init(void); extern void proc_caches_init(void);
extern void fork_init(void); @@ -93,7 +94,6 @@ extern long _do_fork(struct kernel_clone_args *kargs); extern bool legacy_clone_args_valid(const struct kernel_clone_args *kargs); extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *); struct task_struct *fork_idle(int); -struct mm_struct *copy_init_mm(void); extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags); extern long kernel_wait4(pid_t, int __user *, int, struct rusage *);
diff --git a/init/main.c b/init/main.c index a17a111d9336..1db844b38810 100644 --- a/init/main.c +++ b/init/main.c @@ -93,10 +93,8 @@ #include <linux/cache.h> #include <linux/rodata_test.h> #include <linux/jump_label.h> -#include <linux/mem_encrypt.h>
#include <asm/io.h> -#include <asm/bugs.h> #include <asm/setup.h> #include <asm/sections.h> #include <asm/cacheflush.h> @@ -504,8 +502,6 @@ void __init __weak thread_stack_cache_init(void) } #endif
-void __init __weak mem_encrypt_init(void) { } - void __init __weak poking_init(void) { }
void __init __weak pgtable_cache_init(void) { } @@ -567,6 +563,7 @@ static void __init mm_init(void) init_espfix_bsp(); /* Should be run after espfix64 is set up. */ pti_init(); + mm_cache_init(); }
void __init __weak arch_call_rest_init(void) @@ -627,7 +624,7 @@ asmlinkage __visible void __init start_kernel(void) sort_main_extable(); trap_init(); mm_init(); - + poking_init(); ftrace_init();
/* trace_printk can be enabled here */ @@ -721,14 +718,6 @@ asmlinkage __visible void __init start_kernel(void) */ locking_selftest();
- /* - * This needs to be called before any devices perform DMA - * operations that might use the SWIOTLB bounce buffers. It will - * mark the bounce buffers as decrypted so that their usage will - * not cause "plain-text" data to be decrypted when accessed. - */ - mem_encrypt_init(); - #ifdef CONFIG_BLK_DEV_INITRD if (initrd_start && !initrd_below_start_ok && page_to_pfn(virt_to_page((void *)initrd_start)) < min_low_pfn) { @@ -745,6 +734,9 @@ asmlinkage __visible void __init start_kernel(void) late_time_init(); sched_clock_init(); calibrate_delay(); + + arch_cpu_finalize_init(); + pid_idr_init(); anon_vma_init(); #ifdef CONFIG_X86 @@ -771,9 +763,6 @@ asmlinkage __visible void __init start_kernel(void) taskstats_init_early(); delayacct_init();
- poking_init(); - check_bugs(); - acpi_subsystem_init(); arch_post_acpi_subsys_init(); sfi_init_late(); diff --git a/kernel/fork.c b/kernel/fork.c index 5b4a19682207..39134effb2bf 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2335,11 +2335,6 @@ struct task_struct *fork_idle(int cpu) return task; }
-struct mm_struct *copy_init_mm(void) -{ - return dup_mm(NULL, &init_mm); -} - /* * Ok, this is the main fork-routine. * @@ -2710,10 +2705,27 @@ static void sighand_ctor(void *data) init_waitqueue_head(&sighand->signalfd_wqh); }
-void __init proc_caches_init(void) +void __init mm_cache_init(void) { unsigned int mm_size;
+ /* + * The mm_cpumask is located at the end of mm_struct, and is + * dynamically sized based on the maximum CPU number this system + * can have, taking hotplug into account (nr_cpu_ids). + */ + mm_size = sizeof(struct mm_struct) + cpumask_size(); + + mm_cachep = kmem_cache_create_usercopy("mm_struct", + mm_size, ARCH_MIN_MMSTRUCT_ALIGN, + SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, + offsetof(struct mm_struct, saved_auxv), + sizeof_field(struct mm_struct, saved_auxv), + NULL); +} + +void __init proc_caches_init(void) +{ sighand_cachep = kmem_cache_create("sighand_cache", sizeof(struct sighand_struct), 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_TYPESAFE_BY_RCU| @@ -2731,19 +2743,6 @@ void __init proc_caches_init(void) SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);
- /* - * The mm_cpumask is located at the end of mm_struct, and is - * dynamically sized based on the maximum CPU number this system - * can have, taking hotplug into account (nr_cpu_ids). - */ - mm_size = sizeof(struct mm_struct) + cpumask_size(); - - mm_cachep = kmem_cache_create_usercopy("mm_struct", - mm_size, ARCH_MIN_MMSTRUCT_ALIGN, - SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, - offsetof(struct mm_struct, saved_auxv), - sizeof_field(struct mm_struct, saved_auxv), - NULL); vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT); mmap_init(); nsproxy_cache_init(); diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index 3efaf338d325..eea52cb0e6ab 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -13,8 +13,8 @@ /* * Defines x86 CPU feature bits */ -#define NCAPINTS 19 /* N 32-bit words worth of info */ -#define NBUGINTS 1 /* N 32-bit bug flags */ +#define NCAPINTS 20 /* N 32-bit words worth of info */ +#define NBUGINTS 2 /* N 32-bit bug flags */
/* * Note: If the comment begins with a quoted string, that string is used @@ -96,6 +96,7 @@ #define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in IA32 userspace */ #define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */ #define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */ +/* FREE! ( 3*32+17) */ #define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" LFENCE synchronizes RDTSC */ #define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */ #define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */ @@ -199,7 +200,7 @@ #define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 7) /* Effectively INVPCID && CR4.PCIDE=1 */ #define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */ #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */ -#define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */ +/* FREE! ( 7*32+10) */ #define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */ #define X86_FEATURE_RETPOLINE ( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */ #define X86_FEATURE_RETPOLINE_LFENCE ( 7*32+13) /* "" Use LFENCEs for Spectre variant 2 */ @@ -209,7 +210,7 @@ #define X86_FEATURE_SSBD ( 7*32+17) /* Speculative Store Bypass Disable */ #define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */ #define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* "" Fill RSB on context switches */ -#define X86_FEATURE_SEV ( 7*32+20) /* AMD Secure Encrypted Virtualization */ +/* FREE! ( 7*32+20) */ #define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */ #define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */ #define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* "" Disable Speculative Store Bypass. */ @@ -287,6 +288,7 @@ #define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM-Exit when EIBRS is enabled */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */ +#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */ #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */
/* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */ @@ -328,6 +330,7 @@ #define X86_FEATURE_AVIC (15*32+13) /* Virtual Interrupt Controller */ #define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */ #define X86_FEATURE_VGIF (15*32+16) /* Virtual GIF */ +#define X86_FEATURE_SVME_ADDR_CHK (15*32+28) /* "" SVME addr check */
/* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */ #define X86_FEATURE_AVX512VBMI (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/ @@ -367,6 +370,13 @@ #define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */ #define X86_FEATURE_SPEC_CTRL_SSBD (18*32+31) /* "" Speculative Store Bypass Disable */
+/* AMD-defined memory encryption features, CPUID level 0x8000001f (EAX), word 19 */ +#define X86_FEATURE_SME (19*32+ 0) /* AMD Secure Memory Encryption */ +#define X86_FEATURE_SEV (19*32+ 1) /* AMD Secure Encrypted Virtualization */ +#define X86_FEATURE_VM_PAGE_FLUSH (19*32+ 2) /* "" VM Page Flush MSR is supported */ +#define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */ +#define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */ + /* * BUG word(s) */ diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h index a5ea841cc6d2..f0f935f8d917 100644 --- a/tools/arch/x86/include/asm/disabled-features.h +++ b/tools/arch/x86/include/asm/disabled-features.h @@ -84,6 +84,7 @@ #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP) #define DISABLED_MASK17 0 #define DISABLED_MASK18 0 -#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19) +#define DISABLED_MASK19 0 +#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
#endif /* _ASM_X86_DISABLED_FEATURES_H */ diff --git a/tools/arch/x86/include/asm/required-features.h b/tools/arch/x86/include/asm/required-features.h index 6847d85400a8..fa5700097f64 100644 --- a/tools/arch/x86/include/asm/required-features.h +++ b/tools/arch/x86/include/asm/required-features.h @@ -101,6 +101,7 @@ #define REQUIRED_MASK16 0 #define REQUIRED_MASK17 0 #define REQUIRED_MASK18 0 -#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19) +#define REQUIRED_MASK19 0 +#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
#endif /* _ASM_X86_REQUIRED_FEATURES_H */
Hi Thomas/Greg
We are seeing “get of unsupported state” warnings during FPU initialization in the v5.4.252 and v5.10.189 kernel booted on AWS EC2 instances with Intel processors based on Nitro system. These warnings are observed in EC2 c5.18xlarge instance:
[ 1.204495] ------------[ cut here ]------------ [ 1.204495] get of unsupported state [ 1.204495] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:879 get_xsave_addr+0x81/0x90 [ 1.204495] Modules linked in: [ 1.204495] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.252 #10 [ 1.204495] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017 [ 1.204495] RIP: 0010:get_xsave_addr+0x81/0x90 [ 1.204495] Code: 5b c3 48 83 c4 08 31 c0 5b c3 80 3d 7c f0 78 01 00 75 c1 48 c7 c7 34 be 03 b2 89 4c 24 04 c6 05 68 f0 78 01 01 e8 ef 41 05 00 <0f> 0b 48 63 4c 24 04 eb a1 31 c0 c3 0f 1f 00 0f 1f 44 00 00 41 54 [ 1.204495] RSP: 0000:ffffffffb2603ed0 EFLAGS: 00010282 [ 1.204495] RAX: 0000000000000000 RBX: ffffffffb27ebe80 RCX: 0000000047cb2486 [ 1.204495] RDX: 0000000000000018 RSI: ffffffffb39e99a0 RDI: ffffffffb39e756c [ 1.204495] RBP: ffffffffb27ebd40 R08: 7520666f20746567 R09: 74726f707075736e [ 1.204495] R10: 00000000000962fc R11: 6574617473206465 R12: ffffffffb2d89b60 [ 1.204495] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000 [ 1.204495] FS: 0000000000000000(0000) GS:ffff96d031400000(0000) knlGS:0000000000000000 [ 1.204495] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.204495] CR2: ffff96e277fff000 CR3: 000000103060a001 CR4: 00000000007200b0 [ 1.204495] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.204495] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.204495] Call Trace: [ 1.204495] ? __warn+0x85/0xd0 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? report_bug+0xb6/0x130 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? fixup_bug.part.12+0x18/0x30 [ 1.204495] ? do_error_trap+0x95/0xb0 [ 1.204495] ? do_invalid_op+0x36/0x40 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? invalid_op+0x1e/0x30 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] identify_cpu+0x422/0x510 [ 1.204495] identify_boot_cpu+0xc/0x94 [ 1.204495] arch_cpu_finalize_init+0x5/0x47 [ 1.204495] start_kernel+0x468/0x511 [ 1.204495] secondary_startup_64+0xa4/0xb0 [ 1.204495] ---[ end trace dffac81ff531fcf2 ]---
The issue can be easily reproduced on both virtualized and bare metal instances but interesting thing is it can’t be found in other latest stable kernels v4.14, v4.19, v5.15 and newer. We tried to bisect between v5.4.251 and v5.4.252 and were able to find below commit to be the culprit. Also, reverting it in v5.4.252 and v5.10.189 resolved above warnings completely.
x86/fpu: Move FPU initialization into arch_cpu_finalize_init() commit b81fac906a8f9e682e513ddd95697ec7a20878d4 upstream
We used to speculate the fix might be similar to commit 3f8968f1f0ad (“x86/xen: Fix secondary processors' FPU initialization”) but since only kernel 5.4/5.10 are impacted, we’re not quite sure how this commit affects them in practice. Could you please take a look and share your insights?
Also put stack traces from v5.10.189:
[ 1.210910] ------------[ cut here ]------------ [ 1.210910] get of unsupported state [ 1.210910] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:974 get_xsave_addr+0x89/0xa0 [ 1.210910] Modules linked in: [ 1.210910] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.189 #4 [ 1.210910] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017 [ 1.210910] RIP: 0010:get_xsave_addr+0x89/0xa0 [ 1.210910] Code: c4 08 31 c0 5b e9 17 a4 bc 00 80 3d e7 75 eb 01 00 75 b9 48 c7 c7 b7 f4 09 ab 89 4c 24 04 c6 05 d3 75 eb 01 01 e8 17 98 05 00 <0f> 0b 48 63 4c 24 04 eb 99 31 c0 e9 e7 a3 bc 00 0f 1f 80 00 00 00 [ 1.210910] RSP: 0000:ffffffffab603ec8 EFLAGS: 00010286 [ 1.210910] RAX: 0000000000000000 RBX: ffffffffabf25bc0 RCX: 00000000fffeffff [ 1.210910] RDX: ffffffffab603cd0 RSI: 00000000fffeffff RDI: ffffffffad1a3dec [ 1.210910] RBP: ffffffffabf25a60 R08: 0000000000000000 R09: 0000000000000001 [ 1.210910] R10: 0000000000000000 R11: ffffffffab603cc8 R12: ffffffffac539b40 [ 1.210910] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000 [ 1.210910] FS: 0000000000000000(0000) GS:ffff9150f1600000(0000) knlGS:0000000000000000 [ 1.210910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.210910] CR2: ffff915702801000 CR3: 0000001780610001 CR4: 00000000007300b0 [ 1.210910] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.210910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.210910] Call Trace: [ 1.210910] ? __warn+0x7d/0xe0 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] ? report_bug+0xbb/0x140 [ 1.210910] ? handle_bug+0x3f/0x70 [ 1.210910] ? exc_invalid_op+0x13/0x60 [ 1.210910] ? asm_exc_invalid_op+0x12/0x20 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] identify_cpu+0x42a/0x550 [ 1.210910] identify_boot_cpu+0xc/0x94 [ 1.210910] arch_cpu_finalize_init+0x5/0x47 [ 1.210910] start_kernel+0x4bc/0x56b [ 1.210910] secondary_startup_64_no_verify+0xb0/0xbb [ 1.210910] ---[ end trace 14850c6f8ee0875d ]---
Thanks, Shaoying
On Tue, Aug 15, 2023 at 08:15:39PM +0000, Shaoying Xu wrote:
Hi Thomas/Greg
We are seeing “get of unsupported state” warnings during FPU initialization in the v5.4.252 and v5.10.189 kernel booted on AWS EC2 instances with Intel processors based on Nitro system. These warnings are observed in EC2 c5.18xlarge instance:
[ 1.204495] ------------[ cut here ]------------ [ 1.204495] get of unsupported state [ 1.204495] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:879 get_xsave_addr+0x81/0x90 [ 1.204495] Modules linked in: [ 1.204495] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.252 #10 [ 1.204495] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017 [ 1.204495] RIP: 0010:get_xsave_addr+0x81/0x90 [ 1.204495] Code: 5b c3 48 83 c4 08 31 c0 5b c3 80 3d 7c f0 78 01 00 75 c1 48 c7 c7 34 be 03 b2 89 4c 24 04 c6 05 68 f0 78 01 01 e8 ef 41 05 00 <0f> 0b 48 63 4c 24 04 eb a1 31 c0 c3 0f 1f 00 0f 1f 44 00 00 41 54 [ 1.204495] RSP: 0000:ffffffffb2603ed0 EFLAGS: 00010282 [ 1.204495] RAX: 0000000000000000 RBX: ffffffffb27ebe80 RCX: 0000000047cb2486 [ 1.204495] RDX: 0000000000000018 RSI: ffffffffb39e99a0 RDI: ffffffffb39e756c [ 1.204495] RBP: ffffffffb27ebd40 R08: 7520666f20746567 R09: 74726f707075736e [ 1.204495] R10: 00000000000962fc R11: 6574617473206465 R12: ffffffffb2d89b60 [ 1.204495] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000 [ 1.204495] FS: 0000000000000000(0000) GS:ffff96d031400000(0000) knlGS:0000000000000000 [ 1.204495] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.204495] CR2: ffff96e277fff000 CR3: 000000103060a001 CR4: 00000000007200b0 [ 1.204495] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.204495] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.204495] Call Trace: [ 1.204495] ? __warn+0x85/0xd0 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? report_bug+0xb6/0x130 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? fixup_bug.part.12+0x18/0x30 [ 1.204495] ? do_error_trap+0x95/0xb0 [ 1.204495] ? do_invalid_op+0x36/0x40 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? invalid_op+0x1e/0x30 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] identify_cpu+0x422/0x510 [ 1.204495] identify_boot_cpu+0xc/0x94 [ 1.204495] arch_cpu_finalize_init+0x5/0x47 [ 1.204495] start_kernel+0x468/0x511 [ 1.204495] secondary_startup_64+0xa4/0xb0 [ 1.204495] ---[ end trace dffac81ff531fcf2 ]---
The issue can be easily reproduced on both virtualized and bare metal instances but interesting thing is it can’t be found in other latest stable kernels v4.14, v4.19, v5.15 and newer. We tried to bisect between v5.4.251 and v5.4.252 and were able to find below commit to be the culprit. Also, reverting it in v5.4.252 and v5.10.189 resolved above warnings completely.
x86/fpu: Move FPU initialization into arch_cpu_finalize_init() commit b81fac906a8f9e682e513ddd95697ec7a20878d4 upstream
We used to speculate the fix might be similar to commit 3f8968f1f0ad (“x86/xen: Fix secondary processors' FPU initialization”) but since only kernel 5.4/5.10 are impacted, we’re not quite sure how this commit affects them in practice. Could you please take a look and share your insights?
Also put stack traces from v5.10.189:
[ 1.210910] ------------[ cut here ]------------ [ 1.210910] get of unsupported state [ 1.210910] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:974 get_xsave_addr+0x89/0xa0 [ 1.210910] Modules linked in: [ 1.210910] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.189 #4 [ 1.210910] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017 [ 1.210910] RIP: 0010:get_xsave_addr+0x89/0xa0 [ 1.210910] Code: c4 08 31 c0 5b e9 17 a4 bc 00 80 3d e7 75 eb 01 00 75 b9 48 c7 c7 b7 f4 09 ab 89 4c 24 04 c6 05 d3 75 eb 01 01 e8 17 98 05 00 <0f> 0b 48 63 4c 24 04 eb 99 31 c0 e9 e7 a3 bc 00 0f 1f 80 00 00 00 [ 1.210910] RSP: 0000:ffffffffab603ec8 EFLAGS: 00010286 [ 1.210910] RAX: 0000000000000000 RBX: ffffffffabf25bc0 RCX: 00000000fffeffff [ 1.210910] RDX: ffffffffab603cd0 RSI: 00000000fffeffff RDI: ffffffffad1a3dec [ 1.210910] RBP: ffffffffabf25a60 R08: 0000000000000000 R09: 0000000000000001 [ 1.210910] R10: 0000000000000000 R11: ffffffffab603cc8 R12: ffffffffac539b40 [ 1.210910] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000 [ 1.210910] FS: 0000000000000000(0000) GS:ffff9150f1600000(0000) knlGS:0000000000000000 [ 1.210910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.210910] CR2: ffff915702801000 CR3: 0000001780610001 CR4: 00000000007300b0 [ 1.210910] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.210910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.210910] Call Trace: [ 1.210910] ? __warn+0x7d/0xe0 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] ? report_bug+0xbb/0x140 [ 1.210910] ? handle_bug+0x3f/0x70 [ 1.210910] ? exc_invalid_op+0x13/0x60 [ 1.210910] ? asm_exc_invalid_op+0x12/0x20 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] identify_cpu+0x42a/0x550 [ 1.210910] identify_boot_cpu+0xc/0x94 [ 1.210910] arch_cpu_finalize_init+0x5/0x47 [ 1.210910] start_kernel+0x4bc/0x56b [ 1.210910] secondary_startup_64_no_verify+0xb0/0xbb [ 1.210910] ---[ end trace 14850c6f8ee0875d ]---
I think this is fixed with commit b3607269ff57 ("x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")"), which is queued up for the next 5.4 and 5.10.y releases to happen "soon". Can you test the released -rc1 versions of this to verify it is resolved or not?
thanks,
greg k-h
Hi,
On Tue, Aug 15, 2023 at 11:16:52PM +0200, Greg KH wrote:
On Tue, Aug 15, 2023 at 08:15:39PM +0000, Shaoying Xu wrote:
Hi Thomas/Greg
We are seeing “get of unsupported state” warnings during FPU initialization in the v5.4.252 and v5.10.189 kernel booted on AWS EC2 instances with Intel processors based on Nitro system. These warnings are observed in EC2 c5.18xlarge instance:
[ 1.204495] ------------[ cut here ]------------ [ 1.204495] get of unsupported state [ 1.204495] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:879 get_xsave_addr+0x81/0x90 [ 1.204495] Modules linked in: [ 1.204495] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.252 #10 [ 1.204495] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017 [ 1.204495] RIP: 0010:get_xsave_addr+0x81/0x90 [ 1.204495] Code: 5b c3 48 83 c4 08 31 c0 5b c3 80 3d 7c f0 78 01 00 75 c1 48 c7 c7 34 be 03 b2 89 4c 24 04 c6 05 68 f0 78 01 01 e8 ef 41 05 00 <0f> 0b 48 63 4c 24 04 eb a1 31 c0 c3 0f 1f 00 0f 1f 44 00 00 41 54 [ 1.204495] RSP: 0000:ffffffffb2603ed0 EFLAGS: 00010282 [ 1.204495] RAX: 0000000000000000 RBX: ffffffffb27ebe80 RCX: 0000000047cb2486 [ 1.204495] RDX: 0000000000000018 RSI: ffffffffb39e99a0 RDI: ffffffffb39e756c [ 1.204495] RBP: ffffffffb27ebd40 R08: 7520666f20746567 R09: 74726f707075736e [ 1.204495] R10: 00000000000962fc R11: 6574617473206465 R12: ffffffffb2d89b60 [ 1.204495] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000 [ 1.204495] FS: 0000000000000000(0000) GS:ffff96d031400000(0000) knlGS:0000000000000000 [ 1.204495] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.204495] CR2: ffff96e277fff000 CR3: 000000103060a001 CR4: 00000000007200b0 [ 1.204495] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.204495] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.204495] Call Trace: [ 1.204495] ? __warn+0x85/0xd0 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? report_bug+0xb6/0x130 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? fixup_bug.part.12+0x18/0x30 [ 1.204495] ? do_error_trap+0x95/0xb0 [ 1.204495] ? do_invalid_op+0x36/0x40 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? invalid_op+0x1e/0x30 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] identify_cpu+0x422/0x510 [ 1.204495] identify_boot_cpu+0xc/0x94 [ 1.204495] arch_cpu_finalize_init+0x5/0x47 [ 1.204495] start_kernel+0x468/0x511 [ 1.204495] secondary_startup_64+0xa4/0xb0 [ 1.204495] ---[ end trace dffac81ff531fcf2 ]---
The issue can be easily reproduced on both virtualized and bare metal instances but interesting thing is it can’t be found in other latest stable kernels v4.14, v4.19, v5.15 and newer. We tried to bisect between v5.4.251 and v5.4.252 and were able to find below commit to be the culprit. Also, reverting it in v5.4.252 and v5.10.189 resolved above warnings completely.
x86/fpu: Move FPU initialization into arch_cpu_finalize_init() commit b81fac906a8f9e682e513ddd95697ec7a20878d4 upstream
We used to speculate the fix might be similar to commit 3f8968f1f0ad (“x86/xen: Fix secondary processors' FPU initialization”) but since only kernel 5.4/5.10 are impacted, we’re not quite sure how this commit affects them in practice. Could you please take a look and share your insights?
Also put stack traces from v5.10.189:
[ 1.210910] ------------[ cut here ]------------ [ 1.210910] get of unsupported state [ 1.210910] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:974 get_xsave_addr+0x89/0xa0 [ 1.210910] Modules linked in: [ 1.210910] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.189 #4 [ 1.210910] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017 [ 1.210910] RIP: 0010:get_xsave_addr+0x89/0xa0 [ 1.210910] Code: c4 08 31 c0 5b e9 17 a4 bc 00 80 3d e7 75 eb 01 00 75 b9 48 c7 c7 b7 f4 09 ab 89 4c 24 04 c6 05 d3 75 eb 01 01 e8 17 98 05 00 <0f> 0b 48 63 4c 24 04 eb 99 31 c0 e9 e7 a3 bc 00 0f 1f 80 00 00 00 [ 1.210910] RSP: 0000:ffffffffab603ec8 EFLAGS: 00010286 [ 1.210910] RAX: 0000000000000000 RBX: ffffffffabf25bc0 RCX: 00000000fffeffff [ 1.210910] RDX: ffffffffab603cd0 RSI: 00000000fffeffff RDI: ffffffffad1a3dec [ 1.210910] RBP: ffffffffabf25a60 R08: 0000000000000000 R09: 0000000000000001 [ 1.210910] R10: 0000000000000000 R11: ffffffffab603cc8 R12: ffffffffac539b40 [ 1.210910] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000 [ 1.210910] FS: 0000000000000000(0000) GS:ffff9150f1600000(0000) knlGS:0000000000000000 [ 1.210910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.210910] CR2: ffff915702801000 CR3: 0000001780610001 CR4: 00000000007300b0 [ 1.210910] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.210910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.210910] Call Trace: [ 1.210910] ? __warn+0x7d/0xe0 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] ? report_bug+0xbb/0x140 [ 1.210910] ? handle_bug+0x3f/0x70 [ 1.210910] ? exc_invalid_op+0x13/0x60 [ 1.210910] ? asm_exc_invalid_op+0x12/0x20 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] identify_cpu+0x42a/0x550 [ 1.210910] identify_boot_cpu+0xc/0x94 [ 1.210910] arch_cpu_finalize_init+0x5/0x47 [ 1.210910] start_kernel+0x4bc/0x56b [ 1.210910] secondary_startup_64_no_verify+0xb0/0xbb [ 1.210910] ---[ end trace 14850c6f8ee0875d ]---
I think this is fixed with commit b3607269ff57 ("x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")"), which is queued up for the next 5.4 and 5.10.y releases to happen "soon". Can you test the released -rc1 versions of this to verify it is resolved or not?
While not the reporter, I can confirm that the commit resolved the issue.
The issue was reported some days back in Debian as https://bugs.debian.org/1044518
I verified that with 5.10.190-rc1 the above warning is gone: https://bugs.debian.org/1044518#34
(as it was fixed already with 5.10.190-rc1 pending, I did not botherred to as well report it upstream after it was clear it is not a Debian specific issue anymore).
Regards, Salvatore
On 2023-08-15, 2:17 PM, "Greg KH" <gregkh@linuxfoundation.org mailto:gregkh@linuxfoundation.org >> > wrote:
On Tue, Aug 15, 2023 at 08:15:39PM +0000, Shaoying Xu wrote:
Hi Thomas/Greg
We are seeing “get of unsupported state” warnings during FPU initialization in the v5.4.252 and v5.10.189 kernel booted on AWS EC2 instances with Intel processors based on Nitro system. These warnings are observed in EC2 c5.18xlarge instance:
[ 1.204495] ------------[ cut here ]------------ [ 1.204495] get of unsupported state [ 1.204495] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:879 get_xsave_addr+0x81/0x90 [ 1.204495] Modules linked in: [ 1.204495] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.252 #10 [ 1.204495] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017 [ 1.204495] RIP: 0010:get_xsave_addr+0x81/0x90 [ 1.204495] Code: 5b c3 48 83 c4 08 31 c0 5b c3 80 3d 7c f0 78 01 00 75 c1 48 c7 c7 34 be 03 b2 89 4c 24 04 c6 05 68 f0 78 01 01 e8 ef 41 05 00 <0f> > 0b 48 63 4c 24 04 eb a1 31 c0 c3 0f 1f 00 0f 1f 44 00 00 41 54 [ 1.204495] RSP: 0000:ffffffffb2603ed0 EFLAGS: 00010282 [ 1.204495] RAX: 0000000000000000 RBX: ffffffffb27ebe80 RCX: 0000000047cb2486 [ 1.204495] RDX: 0000000000000018 RSI: ffffffffb39e99a0 RDI: ffffffffb39e756c [ 1.204495] RBP: ffffffffb27ebd40 R08: 7520666f20746567 R09: 74726f707075736e [ 1.204495] R10: 00000000000962fc R11: 6574617473206465 R12: ffffffffb2d89b60 [ 1.204495] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000 [ 1.204495] FS: 0000000000000000(0000) GS:ffff96d031400000(0000) knlGS:0000000000000000 [ 1.204495] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.204495] CR2: ffff96e277fff000 CR3: 000000103060a001 CR4: 00000000007200b0 [ 1.204495] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.204495] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.204495] Call Trace: [ 1.204495] ? __warn+0x85/0xd0 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? report_bug+0xb6/0x130 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? fixup_bug.part.12+0x18/0x30 [ 1.204495] ? do_error_trap+0x95/0xb0 [ 1.204495] ? do_invalid_op+0x36/0x40 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] ? invalid_op+0x1e/0x30 [ 1.204495] ? get_xsave_addr+0x81/0x90 [ 1.204495] identify_cpu+0x422/0x510 [ 1.204495] identify_boot_cpu+0xc/0x94 [ 1.204495] arch_cpu_finalize_init+0x5/0x47 [ 1.204495] start_kernel+0x468/0x511 [ 1.204495] secondary_startup_64+0xa4/0xb0 [ 1.204495] ---[ end trace dffac81ff531fcf2 ]---
The issue can be easily reproduced on both virtualized and bare metal instances but interesting thing is it can’t be found in other latest stable kernels v4.14, v4.19, v5.15 and newer. We tried to bisect between v5.4.251 and v5.4.252 and were able to find below commit to be the culprit. Also, reverting it in v5.4.252 and v5.10.189 resolved above warnings completely.
x86/fpu: Move FPU initialization into arch_cpu_finalize_init() commit b81fac906a8f9e682e513ddd95697ec7a20878d4 upstream
We used to speculate the fix might be similar to commit 3f8968f1f0ad (“x86/xen: Fix secondary processors' FPU initialization”) but since only kernel 5.4/5.10 are impacted, we’re not quite sure how this commit affects them in practice. Could you please take a look and share your insights?
Also put stack traces from v5.10.189:
[ 1.210910] ------------[ cut here ]------------ [ 1.210910] get of unsupported state [ 1.210910] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:974 get_xsave_addr+0x89/0xa0 [ 1.210910] Modules linked in: [ 1.210910] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.189 #4 [ 1.210910] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017 [ 1.210910] RIP: 0010:get_xsave_addr+0x89/0xa0 [ 1.210910] Code: c4 08 31 c0 5b e9 17 a4 bc 00 80 3d e7 75 eb 01 00 75 b9 48 c7 c7 b7 f4 09 ab 89 4c 24 04 c6 05 d3 75 eb 01 01 e8 17 98 05 00 <0f> > 0b 48 63 4c 24 04 eb 99 31 c0 e9 e7 a3 bc 00 0f 1f 80 00 00 00 [ 1.210910] RSP: 0000:ffffffffab603ec8 EFLAGS: 00010286 [ 1.210910] RAX: 0000000000000000 RBX: ffffffffabf25bc0 RCX: 00000000fffeffff [ 1.210910] RDX: ffffffffab603cd0 RSI: 00000000fffeffff RDI: ffffffffad1a3dec [ 1.210910] RBP: ffffffffabf25a60 R08: 0000000000000000 R09: 0000000000000001 [ 1.210910] R10: 0000000000000000 R11: ffffffffab603cc8 R12: ffffffffac539b40 [ 1.210910] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000 [ 1.210910] FS: 0000000000000000(0000) GS:ffff9150f1600000(0000) knlGS:0000000000000000 [ 1.210910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.210910] CR2: ffff915702801000 CR3: 0000001780610001 CR4: 00000000007300b0 [ 1.210910] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.210910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.210910] Call Trace: [ 1.210910] ? __warn+0x7d/0xe0 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] ? report_bug+0xbb/0x140 [ 1.210910] ? handle_bug+0x3f/0x70 [ 1.210910] ? exc_invalid_op+0x13/0x60 [ 1.210910] ? asm_exc_invalid_op+0x12/0x20 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] ? get_xsave_addr+0x89/0xa0 [ 1.210910] identify_cpu+0x42a/0x550 [ 1.210910] identify_boot_cpu+0xc/0x94 [ 1.210910] arch_cpu_finalize_init+0x5/0x47 [ 1.210910] start_kernel+0x4bc/0x56b [ 1.210910] secondary_startup_64_no_verify+0xb0/0xbb [ 1.210910] ---[ end trace 14850c6f8ee0875d ]---
I think this is fixed with commit b3607269ff57 ("x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")"), which is queued up for the next 5.4 and 5.10.y releases to happen "soon". Can you test the released -rc1 versions of this to verify it is resolved or not?
thanks,
greg k-h
Hi Greg,
Thank you for pointing this commit. I verfied v5.4.254-rc1 and v5.10.191-rc1 on the same EC2 c5.18xlarge instance would no longer reproduce above warnings. In addition, I tested v5.4.252 and v5.10.189 plus commit b3607269ff57 ("x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")") and found fpu initilization was clean. It can be confirmed that this commit resolves the reported issue.
Thanks, Shaoying
linux-stable-mirror@lists.linaro.org