This is the start of the stable review cycle for the 3.16.56 release. There are 76 patches in this series, which will be posted as responses to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Mar 14 12:00:00 UTC 2018. Anything received after that time might be too late.
All the patches have also been committed to the linux-3.16.y-rc branch of https://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-stable-rc.git . A shortlog and diffstat can be found below.
Ben.
-------------
Andi Kleen (3):
      module/retpoline: Warn about missing retpoline in module [caf7501a1b4ec964190f31f9c3f163de252273b8]
      x86/retpoline/irq32: Convert assembler indirect jumps [7614e913db1f40fff819b36216484dc3808995d4]
      x86/retpoline: Optimize inline assembler for vmexit_fill_RSB [3f7d875566d8e79c5e0b2c9a413e91b2c29e0854]
Andrey Ryabinin (1): x86/asm: Use register variable to get stack pointer value [196bd485ee4f03ce4c690bfcf38138abfcd0a4bc]
Andy Lutomirski (3):
      x86/asm: Make asm/alternative.h safe from assembly [f005f5d860e0231fe212cfda8c1a3148b99609f4]
      x86/cpu: Factor out application of forced CPU caps [8bf1ebca215c262e48c15a4a15f175991776f57f]
      x86: Clean up current_stack_pointer [83653c16da91112236292871b820cb8b367220e3]
Arnd Bergmann (1): x86: fix build warnign with 32-bit PAE [not upstream; specific to KAISER]
Ben Hutchings (1): x86/syscall: Sanitize syscall table de-references under speculation [2fbd7af5af8665d18bcefae3e9700be07e22b681]
Borislav Petkov (6):
      x86/alternatives: Fix ALTERNATIVE_2 padding generation properly [dbe4058a6a44af4ca5d146aebe01b0a1f9b7fd2a]
      x86/alternatives: Fix optimize_nops() checking [612e8e9350fd19cae6900cf36ea0c6892d1a0dca]
      x86/alternatives: Guard NOPs optimization [69df353ff305805fc16082d0c5bfa6e20fa8b863]
      x86/bugs: Drop one "mitigation" from dmesg [55fa19d3e51f33d9cd4056d25836d93abf9438db]
      x86/cpu: Merge bugs.c and bugs_64.c [62a67e123e058a67db58bc6a14354dd037bafd0a]
      x86/nospec: Fix header guards names [7a32fc51ca938e67974cbb9db31e1a43f98345a9]
Colin Ian King (1): x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable" [e698dcdfcda41efd0984de539767b4cddd235f1e]
Dan Carpenter (1): x86/spectre: Fix an error message [9de29eac8d2189424d81c0d840cd0469aa3d41c8]
Dan Williams (13):
      array_index_nospec: Sanitize speculative array de-references [f3804203306e098dae9ca51540fcd5eb700d7f40]
      nl80211: Sanitize array index in parse_txq_params [259d8c1e984318497c84eef547bbb6b1d9f4eb05]
      nospec: Include <asm/barrier.h> dependency [eb6174f6d1be16b19cfa43dac296bfed003ce1a6]
      nospec: Kill array_index_nospec_mask_check() [1d91c1d2c80cb70e2e553845e278b87a960c04da]
      vfs, fdtable: Prevent bounds-check bypass via speculative execution [56c30ba7b348b90484969054d561f711ba196507]
      x86/get_user: Use pointer masking to limit speculation [c7f631cb07e7da06ac1d231ca178452339e32a94]
      x86/kvm: Update spectre-v1 mitigation [085331dfc6bbe3501fb936e657331ca943827600]
      x86/spectre: Report get_user mitigation for spectre_v1 [edfbae53dab8348fca778531be9f4855d2ca0360]
      x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec [304ec1b050310548db33063e567123fae8fd0301]
      x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end} [b5c4ae4f35325d520b230bab6eb3310613b72ac1]
      x86: Implement array_index_mask_nospec [babdde2698d482b6c0de1eab4f697cf5856c5859]
      x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec [b3bbfb3fb5d25776b8e3f361d2eedaabb0b496cd]
      x86: Introduce barrier_nospec [b3d7ad85b80bbc404635dca80f5b129f6242bc7a]
Darren Kenny (1): x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL [af189c95a371b59f493dbe0f50c0a09724868881]
Dave Hansen (2):
      x86/Documentation: Add PTI description [01c9b17bf673b05bb401b76ec763e9730ccf1376]
      x86/cpu/intel: Introduce macros for Intel family numbers [970442c599b22ccd644ebfe94d1d303bf6f87c05]
David Woodhouse (14):
      sysfs/cpu: Fix typos in vulnerability documentation [9ecccfaa7cb5249bd31bdceb93fcf5bedb8a24d8]
      x86/cpufeatures: Add X86_BUG_SPECTRE_V[12] [99c6fa2511d8a683e61468be91b83f85452115fa]
      x86/cpufeatures: Clean up Spectre v2 related CPUID flags [2961298efe1ea1b6fc0d7ee8b76018fa6c0bcef2]
      x86/retpoline/checksum32: Convert assembler indirect jumps [5096732f6f695001fa2d6f1335a2680b37912c69]
      x86/retpoline/crypto: Convert crypto assembler indirect jumps [9697fa39efd3fc3692f2949d4045f393ec58450b]
      x86/retpoline/entry: Convert entry assembler indirect jumps [2641f08bb7fc63a636a2b18173221d7040a3512e]
      x86/retpoline/ftrace: Convert ftrace assembler indirect jumps [9351803bd803cdbeb9b5a7850b7b6f464806e3db]
      x86/retpoline/hyperv: Convert assembler indirect jumps [e70e5892b28c18f517f29ab6e83bd57705104b31]
      x86/retpoline/xen: Convert Xen hypercall indirect jumps [ea08816d5b185ab3d09e95e393f265af54560350]
      x86/retpoline: Add initial retpoline support [76b043848fd22dbf7f8bf3a1452f8c70d557b860]
      x86/retpoline: Avoid retpolines for built-in __init functions [66f793099a636862a71c59d4a6ba91387b155e0c]
      x86/retpoline: Fill RSB on context switch for affected CPUs [c995efd5a740d9cbafbf58bde4973e8b50b4d761]
      x86/retpoline: Fill return stack buffer on vmexit [117cc7a908c83697b0b737d15ae1eb5943afe35b]
      x86/spectre: Add boot time option to select Spectre v2 mitigation [da285121560e769cc31797bba6422eea71d473e0]
Dou Liyang (1): x86/spectre: Check CONFIG_RETPOLINE in command line parser [9471eee9186a46893726e22ebb54cade3f9bc043]
Gustavo A. R. Silva (1): x86/cpu: Change type of x86_cache_size variable to unsigned int [24dbc6000f4b9b0ef5a9daecb161f1907733765a]
Jim Mattson (1): kvm: vmx: Scrub hardware GPRs at VM-exit [0cb5b30698fdc8f6b4646012e3acb4ddce430788]
Josh Poimboeuf (1): x86/paravirt: Remove 'noreplace-paravirt' cmdline option [12c69f1e94c89d40696e83804dd2f0965b5250cd]
KarimAllah Ahmed (1): x86/spectre: Simplify spectre_v2 command line parsing [9005c6834c0ffdfe46afa76656bd9276cca864f6]
Linus Torvalds (2):
      x86: fix SMAP in 32-bit environments [de9e478b9d49f3a0214310d921450cf5bb4a21e6]
      x86: reorganize SMAP handling in user space accesses [11f1a4b9755f5dbc3e822a96502ebe9b044b14d8]
Mark Rutland (1): Documentation: Document array_index_nospec [f84a56f73dddaeac1dba8045b007f742f61cd2da]
Masahiro Yamada (1): kconfig.h: use __is_defined() to check if MODULE is defined [4f920843d248946545415c1bf6120942048708ed]
Masami Hiramatsu (3):
      kprobes/x86: Blacklist indirect thunk functions for kprobes [c1804a236894ecc942da7dc6c5abe209e56cba93]
      kprobes/x86: Disable optimizing on the function jumps to indirect thunk [c86a32c09f8ced67971a2310e3b0dda4d1749007]
      retpoline: Introduce start/end markers of indirect thunk [736e80a4213e9bbce40a7c050337047128b472ac]
Peter Zijlstra (2):
      KVM: VMX: Make indirect call speculation safe [c940a3fb1e2e9b7d03228ab28f375fb5a47ff699]
      KVM: x86: Make indirect calls in emulator speculation safe [1a29b5b7f347a1a9230c1e0af5b37e3e571588ab]
Thomas Gleixner (8):
      sysfs/cpu: Add vulnerability folder [87590ce6e373d1a5401f6539f0c59ef92dd924a9]
      x86/alternatives: Make optimize_nops() interrupt safe and synced [66c117d7fa2ae429911e60d84bf31a90b2b96189]
      x86/cpu/bugs: Make retpoline module warning conditional [e383095c7fe8d218e00ec0f83e4b95ed4e627b02]
      x86/cpu: Implement CPU vulnerabilites sysfs functions [61dc0f555b5c761cdafb0ba5bd41ecf22d68a4c4]
      x86/cpufeatures: Add X86_BUG_CPU_INSECURE [a89f040fa34ec9cd682aed98b8f04e3c47d998bd]
      x86/cpufeatures: Make CPU bugs sticky [6cbd2171e89b13377261d15e64384df60ecb530e]
      x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN [de791821c295cc61419a06fe5562288417d1bc58]
      x86/retpoline: Remove compile time warning [b8b9ce4b5aec8de9e23cabb0a26b78641f9ab1d6]
Tom Lendacky (4):
      x86/cpu, x86/pti: Do not enable PTI on AMD processors [694d99d40972f12e59a3696effee8a376b79d7c8]
      x86/cpu/AMD: Make LFENCE a serializing instruction [e4d0e84e490790798691aaa0f2e598637f1867ec]
      x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC [9c6a73c75864ad9fa49e5fa6513e4c4071c0e29f]
      x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros [28d437d550e1e39f805d99f9f8ac399c778827b7]
Waiman Long (1): x86/retpoline: Remove the esp/rsp thunk [1df37383a8aeabb9b418698f0bcdffea01f4b1b2]
Will Deacon (1): nospec: Move array_index_nospec() parameter checking into separate macro [8fa80c503b484ddc1abbd10c7cb2ab81f3824a50]
Zhenwei.Pi (1): x86/pti: Document fix wrong index [98f0fceec7f84d80bc053e49e596088573086421]
 Documentation/ABI/testing/sysfs-devices-system-cpu |  16 ++
 Documentation/kernel-parameters.txt                |  51 +++-
 Documentation/speculation.txt                      |  90 +++++++
 Documentation/x86/pti.txt                          | 186 +++++++++++++
 Makefile                                           |   4 +-
 arch/x86/Kconfig                                   |  14 +
 arch/x86/Makefile                                  |   8 +
 arch/x86/crypto/aesni-intel_asm.S                  |   5 +-
 arch/x86/crypto/camellia-aesni-avx-asm_64.S        |   3 +-
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S       |   3 +-
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S          |   3 +-
 arch/x86/include/asm/alternative-asm.h             |  14 +-
 arch/x86/include/asm/alternative.h                 |  20 +-
 arch/x86/include/asm/asm.h                         |  11 +
 arch/x86/include/asm/barrier.h                     |  31 ++-
 arch/x86/include/asm/cpufeature.h                  |   8 +
 arch/x86/include/asm/intel-family.h                |  68 +++++
 arch/x86/include/asm/nospec-branch.h               | 198 ++++++++++++++
 arch/x86/include/asm/processor.h                   |   6 +-
 arch/x86/include/asm/switch_to.h                   |  38 +++
 arch/x86/include/asm/uaccess.h                     |  64 +++--
 arch/x86/include/asm/uaccess_32.h                  |  24 ++
 arch/x86/include/asm/uaccess_64.h                  |  94 +++++--
 arch/x86/include/asm/xen/hypercall.h               |   5 +-
 arch/x86/include/uapi/asm/msr-index.h              |   3 +
 arch/x86/kernel/alternative.c                      |  29 +-
 arch/x86/kernel/cpu/Makefile                       |   4 +-
 arch/x86/kernel/cpu/amd.c                          |  28 +-
 arch/x86/kernel/cpu/bugs.c                         | 299 ++++++++++++++++++++-
 arch/x86/kernel/cpu/bugs_64.c                      |  33 ---
 arch/x86/kernel/cpu/common.c                       |  32 ++-
 arch/x86/kernel/cpu/microcode/intel.c              |   2 +-
 arch/x86/kernel/cpu/proc.c                         |   4 +-
 arch/x86/kernel/entry_32.S                         |  15 +-
 arch/x86/kernel/entry_64.S                         |  29 +-
 arch/x86/kernel/irq_32.c                           |  16 +-
 arch/x86/kernel/kprobes/opt.c                      |  23 +-
 arch/x86/kernel/mcount_64.S                        |   8 +-
 arch/x86/kernel/vmlinux.lds.S                      |   6 +
 arch/x86/kvm/emulate.c                             |   9 +-
 arch/x86/kvm/svm.c                                 |  23 ++
 arch/x86/kvm/vmx.c                                 |  46 ++--
 arch/x86/lib/Makefile                              |   2 +
 arch/x86/lib/checksum_32.S                         |   7 +-
 arch/x86/lib/getuser.S                             |  10 +
 arch/x86/lib/retpoline-export.c                    |  24 ++
 arch/x86/lib/retpoline.S                           |  47 ++++
 arch/x86/lib/usercopy_32.c                         |  20 +-
 drivers/base/Kconfig                               |   3 +
 drivers/base/cpu.c                                 |  48 ++++
 drivers/hv/hv.c                                    |  25 +-
 include/linux/cpu.h                                |   7 +
 include/linux/fdtable.h                            |   5 +-
 include/linux/init.h                               |   9 +-
 include/linux/kaiser.h                             |   2 +-
 include/linux/kconfig.h                            |   9 +-
 include/linux/module.h                             |   9 +
 include/linux/nospec.h                             |  59 ++++
 kernel/module.c                                    |  11 +
 net/wireless/nl80211.c                             |   9 +-
 scripts/mod/modpost.c                              |   9 +
 61 files changed, 1662 insertions(+), 226 deletions(-)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit caf7501a1b4ec964190f31f9c3f163de252273b8 upstream.
There's a risk that a kernel which has full retpoline mitigations becomes vulnerable when a module gets loaded that hasn't been compiled with the right compiler or the right option.
To enable detection of that mismatch at module load time, add a module info string "retpoline" at build time when the module was compiled with retpoline support. This only covers compiled C source; assembler source and prebuilt object files are not checked.
If a retpoline enabled kernel detects a non retpoline protected module at load time, print a warning and report it in the sysfs vulnerability file.
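As an illustrative check only (the module name here is a placeholder, not something from this patch), a module built with a retpoline-capable compiler is expected to carry the new tag, which the kmod tools can query:

    $ modinfo -F retpoline example_module.ko
    Y

A module without the tag triggers the warning and the "vulnerable module loaded" annotation added below.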
[ tglx: Massaged changelog ]
Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: David Woodhouse dwmw2@infradead.org Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: jeyu@kernel.org Cc: arjan@linux.intel.com Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/cpu/bugs.c | 17 ++++++++++++++++- include/linux/module.h | 9 +++++++++ kernel/module.c | 11 +++++++++++ scripts/mod/modpost.c | 9 +++++++++ 4 files changed, 45 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -10,6 +10,7 @@ #include <linux/init.h> #include <linux/utsname.h> #include <linux/cpu.h> +#include <linux/module.h>
#include <asm/nospec-branch.h> #include <asm/cmdline.h> @@ -155,6 +156,19 @@ static const char *spectre_v2_strings[] #define pr_fmt(fmt) "Spectre V2 mitigation: " fmt
static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE; +static bool spectre_v2_bad_module; + +#ifdef RETPOLINE +bool retpoline_module_ok(bool has_retpoline) +{ + if (spectre_v2_enabled == SPECTRE_V2_NONE || has_retpoline) + return true; + + pr_err("System may be vunerable to spectre v2\n"); + spectre_v2_bad_module = true; + return false; +} +#endif
static void __init spec2_print_if_insecure(const char *reason) { @@ -340,6 +354,7 @@ ssize_t cpu_show_spectre_v2(struct devic if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) return sprintf(buf, "Not affected\n");
- return sprintf(buf, "%s\n", spectre_v2_strings[spectre_v2_enabled]); + return sprintf(buf, "%s%s\n", spectre_v2_strings[spectre_v2_enabled], + spectre_v2_bad_module ? " - vulnerable module loaded" : ""); } #endif --- a/include/linux/module.h +++ b/include/linux/module.h @@ -656,4 +656,13 @@ static inline void module_bug_finalize(c static inline void module_bug_cleanup(struct module *mod) {} #endif /* CONFIG_GENERIC_BUG */
+#ifdef RETPOLINE +extern bool retpoline_module_ok(bool has_retpoline); +#else +static inline bool retpoline_module_ok(bool has_retpoline) +{ + return true; +} +#endif + #endif /* _LINUX_MODULE_H */ --- a/kernel/module.c +++ b/kernel/module.c @@ -2494,6 +2494,15 @@ static int elf_header_check(struct load_ return 0; }
+static void check_modinfo_retpoline(struct module *mod, struct load_info *info) +{ + if (retpoline_module_ok(get_modinfo(info, "retpoline"))) + return; + + pr_warn("%s: loading module not compiled with retpoline compiler.\n", + mod->name); +} + /* Sets info->hdr and info->len. */ static int copy_module_from_user(const void __user *umod, unsigned long len, struct load_info *info) @@ -2698,6 +2707,8 @@ static int check_modinfo(struct module * if (!get_modinfo(info, "intree")) add_taint_module(mod, TAINT_OOT_MODULE, LOCKDEP_STILL_OK);
+ check_modinfo_retpoline(mod, info); + if (get_modinfo(info, "staging")) { add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK); pr_warn("%s: module is from the staging directory, the quality " --- a/scripts/mod/modpost.c +++ b/scripts/mod/modpost.c @@ -1952,6 +1952,14 @@ static void add_intree_flag(struct buffe buf_printf(b, "\nMODULE_INFO(intree, "Y");\n"); }
+/* Cannot check for assembler */ +static void add_retpoline(struct buffer *b) +{ + buf_printf(b, "\n#ifdef RETPOLINE\n"); + buf_printf(b, "MODULE_INFO(retpoline, "Y");\n"); + buf_printf(b, "#endif\n"); +} + static void add_staging_flag(struct buffer *b, const char *name) { static const char *staging_dir = "drivers/staging"; @@ -2282,6 +2290,7 @@ int main(int argc, char **argv)
add_header(&buf, mod); add_intree_flag(&buf, !external_module); + add_retpoline(&buf); add_staging_flag(&buf, mod->name); err |= add_versions(&buf, mod); add_depends(&buf, mod, modules);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit c995efd5a740d9cbafbf58bde4973e8b50b4d761 upstream.
On context switch from a shallow call stack to a deeper one, as the CPU does 'ret' up the deeper side it may encounter RSB entries (predictions for where the 'ret' goes to) which were populated in userspace.
This is problematic if neither SMEP nor KPTI (the latter of which marks userspace pages as NX for the kernel) are active, as malicious code in userspace may then be executed speculatively.
Overwrite the CPU's return prediction stack with calls which are predicted to return to an infinite loop, to "capture" speculation if this happens. This is required both for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI.
On Skylake+ the problem is slightly different, and an *underflow* of the RSB may cause errant branch predictions to occur. So there it's not so much overwrite, as *filling* the RSB to attempt to prevent it getting empty. This is only a partial solution for Skylake+ since there are many other conditions which may result in the RSB becoming empty. The full solution on Skylake+ is to use IBRS, which will prevent the problem even when the RSB becomes empty. With IBRS, the RSB-stuffing will not be required on context switch.
[ tglx: Added missing vendor check and slightly massaged comments and changelog ]
[js] backport to 4.4 -- __switch_to_asm does not exist there, we have to patch the switch_to macros for both x86_32 and x86_64.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Arjan van de Ven arjan@linux.intel.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: Jiri Slaby jslaby@suse.cz [bwh: Backported to 3.16: use the first available feature number] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/cpufeature.h | 1 + arch/x86/include/asm/switch_to.h | 38 ++++++++++++++++++++++++++++++++++++++ arch/x86/kernel/cpu/bugs.c | 36 ++++++++++++++++++++++++++++++++++++ 3 files changed, 75 insertions(+)
--- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -187,6 +187,7 @@ #define X86_FEATURE_HW_PSTATE (7*32+ 8) /* AMD HW-PState */ #define X86_FEATURE_PROC_FEEDBACK (7*32+ 9) /* AMD ProcFeedbackInterface */ #define X86_FEATURE_INVPCID_SINGLE (7*32+10) /* Effectively INVPCID && CR4.PCIDE=1 */ +#define X86_FEATURE_RSB_CTXSW (7*32+11) /* Fill RSB on context switches */
#define X86_FEATURE_RETPOLINE (7*32+29) /* Generic Retpoline mitigation for Spectre variant 2 */ #define X86_FEATURE_RETPOLINE_AMD (7*32+30) /* AMD Retpoline mitigation for Spectre variant 2 */ --- a/arch/x86/include/asm/switch_to.h +++ b/arch/x86/include/asm/switch_to.h @@ -1,6 +1,8 @@ #ifndef _ASM_X86_SWITCH_TO_H #define _ASM_X86_SWITCH_TO_H
+#include <asm/nospec-branch.h> + struct task_struct; /* one of the stranger aspects of C forward declarations */ __visible struct task_struct *__switch_to(struct task_struct *prev, struct task_struct *next); @@ -24,6 +26,23 @@ void __switch_to_xtra(struct task_struct #define __switch_canary_iparam #endif /* CC_STACKPROTECTOR */
+#ifdef CONFIG_RETPOLINE + /* + * When switching from a shallower to a deeper call stack + * the RSB may either underflow or use entries populated + * with userspace addresses. On CPUs where those concerns + * exist, overwrite the RSB with entries which capture + * speculative execution to prevent attack. + */ +#define __retpoline_fill_return_buffer \ + ALTERNATIVE("jmp 910f", \ + __stringify(__FILL_RETURN_BUFFER(%%ebx, RSB_CLEAR_LOOPS, %%esp)),\ + X86_FEATURE_RSB_CTXSW) \ + "910:\n\t" +#else +#define __retpoline_fill_return_buffer +#endif + /* * Saving eflags is important. It switches not only IOPL between tasks, * it also protects other tasks from NT leaking through sysenter etc. @@ -46,6 +65,7 @@ do { \ "movl $1f,%[prev_ip]\n\t" /* save EIP */ \ "pushl %[next_ip]\n\t" /* restore EIP */ \ __switch_canary \ + __retpoline_fill_return_buffer \ "jmp __switch_to\n" /* regparm call */ \ "1:\t" \ "popl %%ebp\n\t" /* restore EBP */ \ @@ -100,6 +120,23 @@ do { \ #define __switch_canary_iparam #endif /* CC_STACKPROTECTOR */
+#ifdef CONFIG_RETPOLINE + /* + * When switching from a shallower to a deeper call stack + * the RSB may either underflow or use entries populated + * with userspace addresses. On CPUs where those concerns + * exist, overwrite the RSB with entries which capture + * speculative execution to prevent attack. + */ +#define __retpoline_fill_return_buffer \ + ALTERNATIVE("jmp 910f", \ + __stringify(__FILL_RETURN_BUFFER(%%r12, RSB_CLEAR_LOOPS, %%rsp)),\ + X86_FEATURE_RSB_CTXSW) \ + "910:\n\t" +#else +#define __retpoline_fill_return_buffer +#endif + /* Save restore flags to clear handle leaking NT */ #define switch_to(prev, next, last) \ asm volatile(SAVE_CONTEXT \ @@ -108,6 +145,7 @@ do { \ "call __switch_to\n\t" \ "movq "__percpu_arg([current_task])",%%rsi\n\t" \ __switch_canary \ + __retpoline_fill_return_buffer \ "movq %P[thread_info](%%rsi),%%r8\n\t" \ "movq %%rax,%%rdi\n\t" \ "testl %[_tif_fork],%P[ti_flags](%%r8)\n\t" \ --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -22,6 +22,7 @@ #include <asm/alternative.h> #include <asm/pgtable.h> #include <asm/cacheflush.h> +#include <asm/intel-family.h>
static void __init spectre_v2_select_mitigation(void);
@@ -217,6 +218,23 @@ disable: return SPECTRE_V2_CMD_NONE; }
+/* Check for Skylake-like CPUs (for RSB handling) */ +static bool __init is_skylake_era(void) +{ + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && + boot_cpu_data.x86 == 6) { + switch (boot_cpu_data.x86_model) { + case INTEL_FAM6_SKYLAKE_MOBILE: + case INTEL_FAM6_SKYLAKE_DESKTOP: + case INTEL_FAM6_SKYLAKE_X: + case INTEL_FAM6_KABYLAKE_MOBILE: + case INTEL_FAM6_KABYLAKE_DESKTOP: + return true; + } + } + return false; +} + static void __init spectre_v2_select_mitigation(void) { enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline(); @@ -275,6 +293,24 @@ retpoline_auto:
spectre_v2_enabled = mode; pr_info("%s\n", spectre_v2_strings[mode]); + + /* + * If neither SMEP or KPTI are available, there is a risk of + * hitting userspace addresses in the RSB after a context switch + * from a shallow call stack to a deeper one. To prevent this fill + * the entire RSB, even when using IBRS. + * + * Skylake era CPUs have a separate issue with *underflow* of the + * RSB, when they will predict 'ret' targets from the generic BTB. + * The proper mitigation for this is IBRS. If IBRS is not supported + * or deactivated in favour of retpolines the RSB fill on context + * switch is required. + */ + if ((!boot_cpu_has(X86_FEATURE_KAISER) && + !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) { + setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW); + pr_info("Filling RSB on context switch\n"); + } }
#undef pr_fmt
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Linus Torvalds torvalds@linux-foundation.org
commit 11f1a4b9755f5dbc3e822a96502ebe9b044b14d8 upstream.
This reorganizes how we do the stac/clac instructions in the user access code. Instead of adding the instructions directly to the same inline asm that does the actual user level access and exception handling, add them at a higher level.
This is mainly preparation for the next step, where we will expose an interface to allow users to mark several accesses together as being user space accesses, but it does already clean up some code:
- the inlined trivial cases of copy_in_user() now do stac/clac just once over the accesses: they used to do one pair around the user space read, and another pair around the write-back.
- the {get,put}_user_ex() macros that are used with the catch/try handling don't do any stac/clac at all, because that happens in the try/catch surrounding them.
Other than those two cleanups that happened naturally from the re-organization, this should not make any difference. Yet.
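To illustrate where this is heading (a sketch only, not code from this patch; val1, val2, err and uptr are made-up locals, and the accessor macros are the ones touched below), grouping several accesses under one stac/clac pair at the caller level looks roughly like:

    __uaccess_begin();
    __get_user_size(val1, &uptr->a, sizeof(uptr->a), err, -EFAULT);
    if (!err)
            __get_user_size(val2, &uptr->b, sizeof(uptr->b), err, -EFAULT);
    __uaccess_end();

rather than paying for a separate STAC/CLAC pair around each individual access.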
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/uaccess.h | 53 ++++++++++++++-------- arch/x86/include/asm/uaccess_64.h | 94 +++++++++++++++++++++++++++------------ 2 files changed, 101 insertions(+), 46 deletions(-)
--- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -143,6 +143,9 @@ extern int __get_user_4(void); extern int __get_user_8(void); extern int __get_user_bad(void);
+#define __uaccess_begin() stac() +#define __uaccess_end() clac() + /* * This is a type: either unsigned long, if the argument fits into * that type, or otherwise unsigned long long. @@ -201,10 +204,10 @@ __typeof__(__builtin_choose_expr(sizeof(
#ifdef CONFIG_X86_32 #define __put_user_asm_u64(x, addr, err, errret) \ - asm volatile(ASM_STAC "\n" \ + asm volatile("\n" \ "1: movl %%eax,0(%2)\n" \ "2: movl %%edx,4(%2)\n" \ - "3: " ASM_CLAC "\n" \ + "3:" \ ".section .fixup,"ax"\n" \ "4: movl %3,%0\n" \ " jmp 3b\n" \ @@ -215,10 +218,10 @@ __typeof__(__builtin_choose_expr(sizeof( : "A" (x), "r" (addr), "i" (errret), "0" (err))
#define __put_user_asm_ex_u64(x, addr) \ - asm volatile(ASM_STAC "\n" \ + asm volatile("\n" \ "1: movl %%eax,0(%1)\n" \ "2: movl %%edx,4(%1)\n" \ - "3: " ASM_CLAC "\n" \ + "3:" \ _ASM_EXTABLE_EX(1b, 2b) \ _ASM_EXTABLE_EX(2b, 3b) \ : : "A" (x), "r" (addr)) @@ -311,6 +314,10 @@ do { \ } \ } while (0)
+/* + * This doesn't do __uaccess_begin/end - the exception handling + * around it must do that. + */ #define __put_user_size_ex(x, ptr, size) \ do { \ __chk_user_ptr(ptr); \ @@ -365,9 +372,9 @@ do { \ } while (0)
#define __get_user_asm(x, addr, err, itype, rtype, ltype, errret) \ - asm volatile(ASM_STAC "\n" \ + asm volatile("\n" \ "1: mov"itype" %2,%"rtype"1\n" \ - "2: " ASM_CLAC "\n" \ + "2:\n" \ ".section .fixup,"ax"\n" \ "3: mov %3,%0\n" \ " xor"itype" %"rtype"1,%"rtype"1\n" \ @@ -377,6 +384,10 @@ do { \ : "=r" (err), ltype(x) \ : "m" (__m(addr)), "i" (errret), "0" (err))
+/* + * This doesn't do __uaccess_begin/end - the exception handling + * around it must do that. + */ #define __get_user_size_ex(x, ptr, size) \ do { \ __chk_user_ptr(ptr); \ @@ -407,7 +418,9 @@ do { \ #define __put_user_nocheck(x, ptr, size) \ ({ \ int __pu_err; \ + __uaccess_begin(); \ __put_user_size((x), (ptr), (size), __pu_err, -EFAULT); \ + __uaccess_end(); \ __pu_err; \ })
@@ -415,7 +428,9 @@ do { \ ({ \ int __gu_err; \ unsigned long __gu_val; \ + __uaccess_begin(); \ __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT); \ + __uaccess_end(); \ (x) = (__force __typeof__(*(ptr)))__gu_val; \ __gu_err; \ }) @@ -430,9 +445,9 @@ struct __large_struct { unsigned long bu * aliasing issues. */ #define __put_user_asm(x, addr, err, itype, rtype, ltype, errret) \ - asm volatile(ASM_STAC "\n" \ + asm volatile("\n" \ "1: mov"itype" %"rtype"1,%2\n" \ - "2: " ASM_CLAC "\n" \ + "2:\n" \ ".section .fixup,"ax"\n" \ "3: mov %3,%0\n" \ " jmp 2b\n" \ @@ -452,11 +467,11 @@ struct __large_struct { unsigned long bu */ #define uaccess_try do { \ current_thread_info()->uaccess_err = 0; \ - stac(); \ + __uaccess_begin(); \ barrier();
#define uaccess_catch(err) \ - clac(); \ + __uaccess_end(); \ (err) |= (current_thread_info()->uaccess_err ? -EFAULT : 0); \ } while (0)
@@ -552,12 +567,13 @@ extern void __cmpxchg_wrong_size(void) __typeof__(ptr) __uval = (uval); \ __typeof__(*(ptr)) __old = (old); \ __typeof__(*(ptr)) __new = (new); \ + __uaccess_begin(); \ switch (size) { \ case 1: \ { \ - asm volatile("\t" ASM_STAC "\n" \ + asm volatile("\n" \ "1:\t" LOCK_PREFIX "cmpxchgb %4, %2\n" \ - "2:\t" ASM_CLAC "\n" \ + "2:\n" \ "\t.section .fixup, "ax"\n" \ "3:\tmov %3, %0\n" \ "\tjmp 2b\n" \ @@ -571,9 +587,9 @@ extern void __cmpxchg_wrong_size(void) } \ case 2: \ { \ - asm volatile("\t" ASM_STAC "\n" \ + asm volatile("\n" \ "1:\t" LOCK_PREFIX "cmpxchgw %4, %2\n" \ - "2:\t" ASM_CLAC "\n" \ + "2:\n" \ "\t.section .fixup, "ax"\n" \ "3:\tmov %3, %0\n" \ "\tjmp 2b\n" \ @@ -587,9 +603,9 @@ extern void __cmpxchg_wrong_size(void) } \ case 4: \ { \ - asm volatile("\t" ASM_STAC "\n" \ + asm volatile("\n" \ "1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n" \ - "2:\t" ASM_CLAC "\n" \ + "2:\n" \ "\t.section .fixup, "ax"\n" \ "3:\tmov %3, %0\n" \ "\tjmp 2b\n" \ @@ -606,9 +622,9 @@ extern void __cmpxchg_wrong_size(void) if (!IS_ENABLED(CONFIG_X86_64)) \ __cmpxchg_wrong_size(); \ \ - asm volatile("\t" ASM_STAC "\n" \ + asm volatile("\n" \ "1:\t" LOCK_PREFIX "cmpxchgq %4, %2\n" \ - "2:\t" ASM_CLAC "\n" \ + "2:\n" \ "\t.section .fixup, "ax"\n" \ "3:\tmov %3, %0\n" \ "\tjmp 2b\n" \ @@ -623,6 +639,7 @@ extern void __cmpxchg_wrong_size(void) default: \ __cmpxchg_wrong_size(); \ } \ + __uaccess_end(); \ *__uval = __old; \ __ret; \ }) --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -56,35 +56,49 @@ int __copy_from_user_nocheck(void *dst, if (!__builtin_constant_p(size)) return copy_user_generic(dst, (__force void *)src, size); switch (size) { - case 1:__get_user_asm(*(u8 *)dst, (u8 __user *)src, + case 1: + __uaccess_begin(); + __get_user_asm(*(u8 *)dst, (u8 __user *)src, ret, "b", "b", "=q", 1); + __uaccess_end(); return ret; - case 2:__get_user_asm(*(u16 *)dst, (u16 __user *)src, + case 2: + __uaccess_begin(); + __get_user_asm(*(u16 *)dst, (u16 __user *)src, ret, "w", "w", "=r", 2); + __uaccess_end(); return ret; - case 4:__get_user_asm(*(u32 *)dst, (u32 __user *)src, + case 4: + __uaccess_begin(); + __get_user_asm(*(u32 *)dst, (u32 __user *)src, ret, "l", "k", "=r", 4); + __uaccess_end(); return ret; - case 8:__get_user_asm(*(u64 *)dst, (u64 __user *)src, + case 8: + __uaccess_begin(); + __get_user_asm(*(u64 *)dst, (u64 __user *)src, ret, "q", "", "=r", 8); + __uaccess_end(); return ret; case 10: + __uaccess_begin(); __get_user_asm(*(u64 *)dst, (u64 __user *)src, ret, "q", "", "=r", 10); - if (unlikely(ret)) - return ret; - __get_user_asm(*(u16 *)(8 + (char *)dst), - (u16 __user *)(8 + (char __user *)src), - ret, "w", "w", "=r", 2); + if (likely(!ret)) + __get_user_asm(*(u16 *)(8 + (char *)dst), + (u16 __user *)(8 + (char __user *)src), + ret, "w", "w", "=r", 2); + __uaccess_end(); return ret; case 16: + __uaccess_begin(); __get_user_asm(*(u64 *)dst, (u64 __user *)src, ret, "q", "", "=r", 16); - if (unlikely(ret)) - return ret; - __get_user_asm(*(u64 *)(8 + (char *)dst), - (u64 __user *)(8 + (char __user *)src), - ret, "q", "", "=r", 8); + if (likely(!ret)) + __get_user_asm(*(u64 *)(8 + (char *)dst), + (u64 __user *)(8 + (char __user *)src), + ret, "q", "", "=r", 8); + __uaccess_end(); return ret; default: return copy_user_generic(dst, (__force void *)src, size); @@ -106,35 +120,51 @@ int __copy_to_user_nocheck(void __user * if (!__builtin_constant_p(size)) return copy_user_generic((__force void *)dst, src, size); switch (size) { - case 
1:__put_user_asm(*(u8 *)src, (u8 __user *)dst, + case 1: + __uaccess_begin(); + __put_user_asm(*(u8 *)src, (u8 __user *)dst, ret, "b", "b", "iq", 1); + __uaccess_end(); return ret; - case 2:__put_user_asm(*(u16 *)src, (u16 __user *)dst, + case 2: + __uaccess_begin(); + __put_user_asm(*(u16 *)src, (u16 __user *)dst, ret, "w", "w", "ir", 2); + __uaccess_end(); return ret; - case 4:__put_user_asm(*(u32 *)src, (u32 __user *)dst, + case 4: + __uaccess_begin(); + __put_user_asm(*(u32 *)src, (u32 __user *)dst, ret, "l", "k", "ir", 4); + __uaccess_end(); return ret; - case 8:__put_user_asm(*(u64 *)src, (u64 __user *)dst, + case 8: + __uaccess_begin(); + __put_user_asm(*(u64 *)src, (u64 __user *)dst, ret, "q", "", "er", 8); + __uaccess_end(); return ret; case 10: + __uaccess_begin(); __put_user_asm(*(u64 *)src, (u64 __user *)dst, ret, "q", "", "er", 10); - if (unlikely(ret)) - return ret; - asm("":::"memory"); - __put_user_asm(4[(u16 *)src], 4 + (u16 __user *)dst, - ret, "w", "w", "ir", 2); + if (likely(!ret)) { + asm("":::"memory"); + __put_user_asm(4[(u16 *)src], 4 + (u16 __user *)dst, + ret, "w", "w", "ir", 2); + } + __uaccess_end(); return ret; case 16: + __uaccess_begin(); __put_user_asm(*(u64 *)src, (u64 __user *)dst, ret, "q", "", "er", 16); - if (unlikely(ret)) - return ret; - asm("":::"memory"); - __put_user_asm(1[(u64 *)src], 1 + (u64 __user *)dst, - ret, "q", "", "er", 8); + if (likely(!ret)) { + asm("":::"memory"); + __put_user_asm(1[(u64 *)src], 1 + (u64 __user *)dst, + ret, "q", "", "er", 8); + } + __uaccess_end(); return ret; default: return copy_user_generic((__force void *)dst, src, size); @@ -160,39 +190,47 @@ int __copy_in_user(void __user *dst, con switch (size) { case 1: { u8 tmp; + __uaccess_begin(); __get_user_asm(tmp, (u8 __user *)src, ret, "b", "b", "=q", 1); if (likely(!ret)) __put_user_asm(tmp, (u8 __user *)dst, ret, "b", "b", "iq", 1); + __uaccess_end(); return ret; } case 2: { u16 tmp; + __uaccess_begin(); __get_user_asm(tmp, (u16 __user *)src, ret, "w", "w", "=r", 2); if (likely(!ret)) __put_user_asm(tmp, (u16 __user *)dst, ret, "w", "w", "ir", 2); + __uaccess_end(); return ret; }
case 4: { u32 tmp; + __uaccess_begin(); __get_user_asm(tmp, (u32 __user *)src, ret, "l", "k", "=r", 4); if (likely(!ret)) __put_user_asm(tmp, (u32 __user *)dst, ret, "l", "k", "ir", 4); + __uaccess_end(); return ret; } case 8: { u64 tmp; + __uaccess_begin(); __get_user_asm(tmp, (u64 __user *)src, ret, "q", "", "=r", 8); if (likely(!ret)) __put_user_asm(tmp, (u64 __user *)dst, ret, "q", "", "er", 8); + __uaccess_end(); return ret; } default:
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 9697fa39efd3fc3692f2949d4045f393ec58450b upstream.
Convert all indirect jumps in crypto assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Arjan van de Ven arjan@linux.intel.com Acked-by: Ingo Molnar mingo@kernel.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515707194-20531-6-git-send-email-dwmw@amazon.co.u... [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/crypto/aesni-intel_asm.S | 5 +++-- arch/x86/crypto/camellia-aesni-avx-asm_64.S | 3 ++- arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 3 ++- arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 3 ++- 4 files changed, 9 insertions(+), 5 deletions(-)
--- a/arch/x86/crypto/aesni-intel_asm.S +++ b/arch/x86/crypto/aesni-intel_asm.S @@ -31,6 +31,7 @@
#include <linux/linkage.h> #include <asm/inst.h> +#include <asm/nospec-branch.h>
#ifdef __x86_64__ .data @@ -2703,7 +2704,7 @@ ENTRY(aesni_xts_crypt8) pxor INC, STATE4 movdqu IV, 0x30(OUTP)
- call *%r11 + CALL_NOSPEC %r11
movdqu 0x00(OUTP), INC pxor INC, STATE1 @@ -2748,7 +2749,7 @@ ENTRY(aesni_xts_crypt8) _aesni_gf128mul_x_ble() movups IV, (IVP)
- call *%r11 + CALL_NOSPEC %r11
movdqu 0x40(OUTP), INC pxor INC, STATE1 --- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S +++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S @@ -16,6 +16,7 @@ */
#include <linux/linkage.h> +#include <asm/nospec-branch.h>
#define CAMELLIA_TABLE_BYTE_LEN 272
@@ -1210,7 +1211,7 @@ camellia_xts_crypt_16way: vpxor 14 * 16(%rax), %xmm15, %xmm14; vpxor 15 * 16(%rax), %xmm15, %xmm15;
- call *%r9; + CALL_NOSPEC %r9;
addq $(16 * 16), %rsp;
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S +++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S @@ -11,6 +11,7 @@ */
#include <linux/linkage.h> +#include <asm/nospec-branch.h>
#define CAMELLIA_TABLE_BYTE_LEN 272
@@ -1323,7 +1324,7 @@ camellia_xts_crypt_32way: vpxor 14 * 32(%rax), %ymm15, %ymm14; vpxor 15 * 32(%rax), %ymm15, %ymm15;
- call *%r9; + CALL_NOSPEC %r9;
addq $(16 * 32), %rsp;
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S +++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S @@ -45,6 +45,7 @@
#include <asm/inst.h> #include <linux/linkage.h> +#include <asm/nospec-branch.h>
## ISCSI CRC 32 Implementation with crc32 and pclmulqdq Instruction
@@ -171,7 +172,7 @@ continue_block: movzxw (bufp, %rax, 2), len offset=crc_array-jump_table lea offset(bufp, len, 1), bufp - jmp *bufp + JMP_NOSPEC bufp
################################################################ ## 2a) PROCESS FULL BLOCKS:
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit 1a29b5b7f347a1a9230c1e0af5b37e3e571588ab upstream.
Replace the indirect calls with CALL_NOSPEC.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: David Woodhouse dwmw@amazon.co.uk Cc: Andrea Arcangeli aarcange@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Ashok Raj ashok.raj@intel.com Cc: Greg KH gregkh@linuxfoundation.org Cc: Jun Nakajima jun.nakajima@intel.com Cc: David Woodhouse dwmw2@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: rga@amazon.de Cc: Dave Hansen dave.hansen@intel.com Cc: Asit Mallick asit.k.mallick@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Jason Baron jbaron@akamai.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Dan Williams dan.j.williams@intel.com Cc: Arjan Van De Ven arjan.van.de.ven@intel.com Cc: Tim Chen tim.c.chen@linux.intel.com Link: https://lkml.kernel.org/r/20180125095843.595615683@infradead.org [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kvm/emulate.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -25,6 +25,7 @@ #include <linux/module.h> #include <asm/kvm_emulate.h> #include <linux/stringify.h> +#include <asm/nospec-branch.h>
#include "x86.h" #include "tss.h" @@ -906,8 +907,8 @@ static u8 test_cc(unsigned int condition void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);
flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF; - asm("push %[flags]; popf; call *%[fastop]" - : "=a"(rc) : [fastop]"r"(fop), [flags]"r"(flags)); + asm("push %[flags]; popf; " CALL_NOSPEC + : "=a"(rc) : [thunk_target]"r"(fop), [flags]"r"(flags)); return rc; }
@@ -4622,9 +4623,9 @@ static int fastop(struct x86_emulate_ctx ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF; if (!(ctxt->d & ByteOp)) fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE; - asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n" + asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n" : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags), - [fastop]"+S"(fop) + [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT : "c"(ctxt->src2.val)); ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK); if (!fop) /* exception is returned in fop variable */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon will.deacon@arm.com
commit 8fa80c503b484ddc1abbd10c7cb2ab81f3824a50 upstream.
For architectures providing their own implementation of array_index_mask_nospec() in asm/barrier.h, attempting to use WARN_ONCE() to complain about out-of-range parameters results in a mess of mutually-dependent include files.
Rather than unpick the dependencies, simply have the core code in nospec.h perform the checking for us.
Signed-off-by: Will Deacon will.deacon@arm.com Acked-by: Thomas Gleixner tglx@linutronix.de Cc: Dan Williams dan.j.williams@intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lkml.kernel.org/r/1517840166-15399-1-git-send-email-will.deacon@arm.c... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- include/linux/nospec.h | 36 +++++++++++++++++++++--------------- 1 file changed, 21 insertions(+), 15 deletions(-)
--- a/include/linux/nospec.h +++ b/include/linux/nospec.h @@ -20,20 +20,6 @@ static inline unsigned long array_index_ unsigned long size) { /* - * Warn developers about inappropriate array_index_nospec() usage. - * - * Even if the CPU speculates past the WARN_ONCE branch, the - * sign bit of @index is taken into account when generating the - * mask. - * - * This warning is compiled out when the compiler can infer that - * @index and @size are less than LONG_MAX. - */ - if (WARN_ONCE(index > LONG_MAX || size > LONG_MAX, - "array_index_nospec() limited to range of [0, LONG_MAX]\n")) - return 0; - - /* * Always calculate and emit the mask even if the compiler * thinks the mask is not needed. The compiler does not take * into account the value of @index under speculation. @@ -44,6 +30,26 @@ static inline unsigned long array_index_ #endif
/* + * Warn developers about inappropriate array_index_nospec() usage. + * + * Even if the CPU speculates past the WARN_ONCE branch, the + * sign bit of @index is taken into account when generating the + * mask. + * + * This warning is compiled out when the compiler can infer that + * @index and @size are less than LONG_MAX. + */ +#define array_index_mask_nospec_check(index, size) \ +({ \ + if (WARN_ONCE(index > LONG_MAX || size > LONG_MAX, \ + "array_index_nospec() limited to range of [0, LONG_MAX]\n")) \ + _mask = 0; \ + else \ + _mask = array_index_mask_nospec(index, size); \ + _mask; \ +}) + +/* * array_index_nospec - sanitize an array index after a bounds check * * For a code sequence like: @@ -61,7 +67,7 @@ static inline unsigned long array_index_ ({ \ typeof(index) _i = (index); \ typeof(size) _s = (size); \ - unsigned long _mask = array_index_mask_nospec(_i, _s); \ + unsigned long _mask = array_index_mask_nospec_check(_i, _s); \ \ BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \ BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit 1d91c1d2c80cb70e2e553845e278b87a960c04da upstream.
There are multiple problems with the dynamic sanity checking in array_index_nospec_mask_check():
* It causes unnecessary overhead in the 32-bit case since integer sized @index values will no longer cause the check to be compiled away like in the 64-bit case.
* In the 32-bit case it may trigger with user-controllable input when the expectation is that it should only trigger during development of new kernel enabling.
* The macro reuses the input parameter in multiple locations which is broken if someone passes an expression like 'index++' to array_index_nospec().
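The last point is the classic macro double-evaluation hazard. A minimal, self-contained sketch of the failure mode (plain userspace C, not kernel code; CHECKED is a contrived stand-in for the removed checking macro):

    #include <stdio.h>

    /* Names its argument twice, like the checking macro being removed. */
    #define CHECKED(index, limit)  ((index) >= (limit) ? 0 : (index))

    int main(void)
    {
            int i = 3;
            int v = CHECKED(i++, 10);       /* 'i++' expands twice: i ends up at 5, not 4 */

            printf("v=%d i=%d\n", v, i);    /* prints v=4 i=5 */
            return 0;
    }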
Reported-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Dan Williams dan.j.williams@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Arjan van de Ven arjan@linux.intel.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: David Woodhouse dwmw2@infradead.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will.deacon@arm.com Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/151881604278.17395.6605847763178076520.stgit@dwilli... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- include/linux/nospec.h | 22 +--------------------- 1 file changed, 1 insertion(+), 21 deletions(-)
--- a/include/linux/nospec.h +++ b/include/linux/nospec.h @@ -30,26 +30,6 @@ static inline unsigned long array_index_ #endif
/* - * Warn developers about inappropriate array_index_nospec() usage. - * - * Even if the CPU speculates past the WARN_ONCE branch, the - * sign bit of @index is taken into account when generating the - * mask. - * - * This warning is compiled out when the compiler can infer that - * @index and @size are less than LONG_MAX. - */ -#define array_index_mask_nospec_check(index, size) \ -({ \ - if (WARN_ONCE(index > LONG_MAX || size > LONG_MAX, \ - "array_index_nospec() limited to range of [0, LONG_MAX]\n")) \ - _mask = 0; \ - else \ - _mask = array_index_mask_nospec(index, size); \ - _mask; \ -}) - -/* * array_index_nospec - sanitize an array index after a bounds check * * For a code sequence like: @@ -67,7 +47,7 @@ static inline unsigned long array_index_ ({ \ typeof(index) _i = (index); \ typeof(size) _s = (size); \ - unsigned long _mask = array_index_mask_nospec_check(_i, _s); \ + unsigned long _mask = array_index_mask_nospec(_i, _s); \ \ BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \ BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit 66c117d7fa2ae429911e60d84bf31a90b2b96189 upstream.
Richard reported the following crash:
[ 0.036000] BUG: unable to handle kernel paging request at 55501e06
[ 0.036000] IP: [<c0aae48b>] common_interrupt+0xb/0x38
[ 0.036000] Call Trace:
[ 0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
[ 0.036000]  [<c040a054>] apply_alternatives+0x274/0x630
Chuck decoded:
" 0: 8d 90 90 83 04 24 lea 0x24048390(%eax),%edx 6: 80 fc 0f cmp $0xf,%ah 9: a8 0f test $0xf,%al
b: a0 06 1e 50 55 mov 0x55501e06,%al
10: 57 push %edi 11: 56 push %esi
Interrupt 0x30 occurred while the alternatives code was replacing the initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized version, 0x8d,0x76,0x00. Only the first byte has been replaced so far, and it makes a mess out of the insn decoding."
optimize_nops() is buggy in two aspects:
 - It's not disabling interrupts across the modification
 - It's lacking a sync_core() call
Add both.
Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding' Reported-and-tested-by: "Richard W.M. Jones" rjones@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Richard W.M. Jones rjones@redhat.com Cc: Chuck Ebbert cebbert.lkml@gmail.com Cc: Borislav Petkov bp@alien8.de Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1509031232340.15006@nanos Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/alternative.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -325,10 +325,15 @@ done:
static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr) { + unsigned long flags; + if (instr[0] != 0x90) return;
+ local_irq_save(flags); add_nops(instr + (a->instrlen - a->padlen), a->padlen); + sync_core(); + local_irq_restore(flags);
DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ", instr, a->instrlen - a->padlen, a->padlen);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit 3f7d875566d8e79c5e0b2c9a413e91b2c29e0854 upstream.
The generated assembler for the C fill RSB inline asm operations has several issues:
- The C code sets up the loop register, which is then immediately overwritten in __FILL_RETURN_BUFFER with the same value again.
- The C code also passes in the iteration count in another register, which is not used at all.
Remove these two unnecessary operations. Just rely on the single constant passed to the macro for the iterations.
Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: dave.hansen@intel.com Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: arjan@linux.intel.com Link: https://lkml.kernel.org/r/20180117225328.15414-1-andi@firstfloor.org [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/nospec-branch.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -183,15 +183,16 @@ extern char __indirect_thunk_end[]; static inline void vmexit_fill_RSB(void) { #ifdef CONFIG_RETPOLINE - unsigned long loops = RSB_CLEAR_LOOPS / 2; + unsigned long loops;
asm volatile (ALTERNATIVE("jmp 910f", __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)), X86_FEATURE_RETPOLINE) "910:" - : "=&r" (loops), ASM_CALL_CONSTRAINT - : "r" (loops) : "memory" ); + : "=r" (loops), ASM_CALL_CONSTRAINT + : : "memory" ); #endif } + #endif /* __ASSEMBLY__ */ #endif /* __NOSPEC_BRANCH_H__ */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dave Hansen dave@sr71.net
commit 970442c599b22ccd644ebfe94d1d303bf6f87c05 upstream.
Problem:
We have a boatload of open-coded family-6 model numbers. Half of them have these model numbers in hex and the other half in decimal. This makes grepping for them tons of fun, if you were to try.
Solution:
Consolidate all the magic numbers. Put all the definitions in one header.
The names here are closely derived from the comments describing the models from arch/x86/events/intel/core.c. We could easily make them shorter by doing things like s/SANDYBRIDGE/SNB/, but they seemed fine even with the longer versions to me.
Do not take any of these names too literally, like "DESKTOP" or "MOBILE". These are all colloquial names and not precise descriptions of everywhere a given model will show up.
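As a usage sketch (an illustrative fragment only, mirroring the is_skylake_era() helper added earlier in this series rather than code from this patch), an open-coded model check such as 'x86_model == 0x5E' becomes:

    #include <asm/intel-family.h>

    if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
        boot_cpu_data.x86 == 6 &&
        boot_cpu_data.x86_model == INTEL_FAM6_SKYLAKE_DESKTOP)
            /* Skylake desktop (model 0x5E) specific handling */;

which is what makes these model numbers greppable.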
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Andy Lutomirski luto@amacapital.net Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Darren Hart dvhart@infradead.org Cc: Dave Hansen dave@sr71.net Cc: Denys Vlasenko dvlasenk@redhat.com Cc: Doug Thompson dougthompson@xmission.com Cc: Eduardo Valentin edubezval@gmail.com Cc: H. Peter Anvin hpa@zytor.com Cc: Jacob Pan jacob.jun.pan@linux.intel.com Cc: Kan Liang kan.liang@intel.com Cc: Len Brown lenb@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Mauro Carvalho Chehab mchehab@osg.samsung.com Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Rajneesh Bhardwaj rajneesh.bhardwaj@intel.com Cc: Souvik Kumar Chakravarty souvik.k.chakravarty@intel.com Cc: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Tony Luck tony.luck@intel.com Cc: Ulf Hansson ulf.hansson@linaro.org Cc: Viresh Kumar viresh.kumar@linaro.org Cc: Vishwanath Somayaji vishwanath.somayaji@intel.com Cc: Zhang Rui rui.zhang@intel.com Cc: jacob.jun.pan@intel.com Cc: linux-acpi@vger.kernel.org Cc: linux-edac@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: platform-driver-x86@vger.kernel.org Link: http://lkml.kernel.org/r/20160603001927.F2A7D828@viggo.jf.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/intel-family.h | 68 +++++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) create mode 100644 arch/x86/include/asm/intel-family.h
--- /dev/null +++ b/arch/x86/include/asm/intel-family.h @@ -0,0 +1,68 @@ +#ifndef _ASM_X86_INTEL_FAMILY_H +#define _ASM_X86_INTEL_FAMILY_H + +/* + * "Big Core" Processors (Branded as Core, Xeon, etc...) + * + * The "_X" parts are generally the EP and EX Xeons, or the + * "Extreme" ones, like Broadwell-E. + * + * Things ending in "2" are usually because we have no better + * name for them. There's no processor called "WESTMERE2". + */ + +#define INTEL_FAM6_CORE_YONAH 0x0E +#define INTEL_FAM6_CORE2_MEROM 0x0F +#define INTEL_FAM6_CORE2_MEROM_L 0x16 +#define INTEL_FAM6_CORE2_PENRYN 0x17 +#define INTEL_FAM6_CORE2_DUNNINGTON 0x1D + +#define INTEL_FAM6_NEHALEM 0x1E +#define INTEL_FAM6_NEHALEM_EP 0x1A +#define INTEL_FAM6_NEHALEM_EX 0x2E +#define INTEL_FAM6_WESTMERE 0x25 +#define INTEL_FAM6_WESTMERE2 0x1F +#define INTEL_FAM6_WESTMERE_EP 0x2C +#define INTEL_FAM6_WESTMERE_EX 0x2F + +#define INTEL_FAM6_SANDYBRIDGE 0x2A +#define INTEL_FAM6_SANDYBRIDGE_X 0x2D +#define INTEL_FAM6_IVYBRIDGE 0x3A +#define INTEL_FAM6_IVYBRIDGE_X 0x3E + +#define INTEL_FAM6_HASWELL_CORE 0x3C +#define INTEL_FAM6_HASWELL_X 0x3F +#define INTEL_FAM6_HASWELL_ULT 0x45 +#define INTEL_FAM6_HASWELL_GT3E 0x46 + +#define INTEL_FAM6_BROADWELL_CORE 0x3D +#define INTEL_FAM6_BROADWELL_XEON_D 0x56 +#define INTEL_FAM6_BROADWELL_GT3E 0x47 +#define INTEL_FAM6_BROADWELL_X 0x4F + +#define INTEL_FAM6_SKYLAKE_MOBILE 0x4E +#define INTEL_FAM6_SKYLAKE_DESKTOP 0x5E +#define INTEL_FAM6_SKYLAKE_X 0x55 +#define INTEL_FAM6_KABYLAKE_MOBILE 0x8E +#define INTEL_FAM6_KABYLAKE_DESKTOP 0x9E + +/* "Small Core" Processors (Atom) */ + +#define INTEL_FAM6_ATOM_PINEVIEW 0x1C +#define INTEL_FAM6_ATOM_LINCROFT 0x26 +#define INTEL_FAM6_ATOM_PENWELL 0x27 +#define INTEL_FAM6_ATOM_CLOVERVIEW 0x35 +#define INTEL_FAM6_ATOM_CEDARVIEW 0x36 +#define INTEL_FAM6_ATOM_SILVERMONT1 0x37 /* BayTrail/BYT / Valleyview */ +#define INTEL_FAM6_ATOM_SILVERMONT2 0x4D /* Avaton/Rangely */ +#define INTEL_FAM6_ATOM_AIRMONT 0x4C /* CherryTrail / Braswell */ +#define INTEL_FAM6_ATOM_MERRIFIELD1 0x4A /* Tangier */ +#define INTEL_FAM6_ATOM_MERRIFIELD2 0x5A /* Annidale */ +#define INTEL_FAM6_ATOM_GOLDMONT 0x5C +#define INTEL_FAM6_ATOM_DENVERTON 0x5F /* Goldmont Microserver */ + +/* Xeon Phi */ + +#define INTEL_FAM6_XEON_PHI_KNL 0x57 /* Knights Landing */ + +#endif /* _ASM_X86_INTEL_FAMILY_H */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit da285121560e769cc31797bba6422eea71d473e0 upstream.
Add a spectre_v2= option to select the mitigation used for the indirect branch speculation vulnerability.
Currently, the only option available is retpoline, in its various forms. This will be expanded to cover the new IBRS/IBPB microcode features.
The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a serializing instruction, which is indicated by the LFENCE_RDTSC feature.
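For example (illustrative only), booting with

    spectre_v2=retpoline,generic

on the kernel command line selects the full compiler-based (generic) retpoline, while spectre_v2=off (or nospectre_v2) disables the mitigation entirely; the full option table is added to Documentation/kernel-parameters.txt below.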
[ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS integration becomes simple ]
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.u... [bwh: Backported to 3.16: adjust filename, context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- Documentation/kernel-parameters.txt | 28 +++++ arch/x86/include/asm/nospec-branch.h | 10 ++ arch/x86/kernel/cpu/bugs.c | 158 +++++++++++++++++++++++- arch/x86/kernel/cpu/common.c | 4 - 4 files changed, 195 insertions(+), 5 deletions(-)
--- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2167,6 +2167,11 @@ bytes respectively. Such letter suffixes register save and restore. The kernel will only save legacy floating-point registers on task switch.
+ nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2 + (indirect branch prediction) vulnerability. System may + allow data leaks with this option, which is equivalent + to spectre_v2=off. + noxsave [BUGS=X86] Disables x86 extended register state save and restore using xsave. The kernel will fallback to enabling legacy floating-point and sse state. @@ -3168,6 +3173,29 @@ bytes respectively. Such letter suffixes sonypi.*= [HW] Sony Programmable I/O Control Device driver See Documentation/laptops/sonypi.txt
+ spectre_v2= [X86] Control mitigation of Spectre variant 2 + (indirect branch speculation) vulnerability. + + on - unconditionally enable + off - unconditionally disable + auto - kernel detects whether your CPU model is + vulnerable + + Selecting 'on' will, and 'auto' may, choose a + mitigation method at run time according to the + CPU, the available microcode, the setting of the + CONFIG_RETPOLINE configuration option, and the + compiler with which the kernel was built. + + Specific mitigations can also be selected manually: + + retpoline - replace indirect branches + retpoline,generic - google's original retpoline + retpoline,amd - AMD-specific minimal thunk + + Not specifying this option is equivalent to + spectre_v2=auto. + spia_io_base= [HW,MTD] spia_fio_base= spia_pedr= --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -102,5 +102,15 @@ # define THUNK_TARGET(addr) [thunk_target] "rm" (addr) #endif
+/* The Spectre V2 mitigation variants */ +enum spectre_v2_mitigation { + SPECTRE_V2_NONE, + SPECTRE_V2_RETPOLINE_MINIMAL, + SPECTRE_V2_RETPOLINE_MINIMAL_AMD, + SPECTRE_V2_RETPOLINE_GENERIC, + SPECTRE_V2_RETPOLINE_AMD, + SPECTRE_V2_IBRS, +}; + #endif /* __ASSEMBLY__ */ #endif /* __NOSPEC_BRANCH_H__ */ --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -10,6 +10,9 @@ #include <linux/init.h> #include <linux/utsname.h> #include <linux/cpu.h> + +#include <asm/nospec-branch.h> +#include <asm/cmdline.h> #include <asm/bugs.h> #include <asm/processor.h> #include <asm/processor-flags.h> @@ -20,6 +23,8 @@ #include <asm/pgtable.h> #include <asm/cacheflush.h>
+static void __init spectre_v2_select_mitigation(void); + #ifdef CONFIG_X86_32
static double __initdata x = 4195835.0; @@ -87,6 +92,9 @@ void __init check_bugs(void) print_cpu_info(&boot_cpu_data); }
+ /* Select the proper spectre mitigation before patching alternatives */ + spectre_v2_select_mitigation(); + #ifdef CONFIG_X86_32 /* * Check whether we are able to run this kernel safely on SMP. @@ -124,6 +132,153 @@ void __init check_bugs(void) #endif }
+/* The kernel command line selection */ +enum spectre_v2_mitigation_cmd { + SPECTRE_V2_CMD_NONE, + SPECTRE_V2_CMD_AUTO, + SPECTRE_V2_CMD_FORCE, + SPECTRE_V2_CMD_RETPOLINE, + SPECTRE_V2_CMD_RETPOLINE_GENERIC, + SPECTRE_V2_CMD_RETPOLINE_AMD, +}; + +static const char *spectre_v2_strings[] = { + [SPECTRE_V2_NONE] = "Vulnerable", + [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline", + [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline", + [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline", + [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline", +}; + +#undef pr_fmt +#define pr_fmt(fmt) "Spectre V2 mitigation: " fmt + +static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE; + +static void __init spec2_print_if_insecure(const char *reason) +{ + if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) + pr_info("%s\n", reason); +} + +static void __init spec2_print_if_secure(const char *reason) +{ + if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) + pr_info("%s\n", reason); +} + +static inline bool retp_compiler(void) +{ + return __is_defined(RETPOLINE); +} + +static inline bool match_option(const char *arg, int arglen, const char *opt) +{ + int len = strlen(opt); + + return len == arglen && !strncmp(arg, opt, len); +} + +static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void) +{ + char arg[20]; + int ret; + + ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, + sizeof(arg)); + if (ret > 0) { + if (match_option(arg, ret, "off")) { + goto disable; + } else if (match_option(arg, ret, "on")) { + spec2_print_if_secure("force enabled on command line."); + return SPECTRE_V2_CMD_FORCE; + } else if (match_option(arg, ret, "retpoline")) { + spec2_print_if_insecure("retpoline selected on command line."); + return SPECTRE_V2_CMD_RETPOLINE; + } else if (match_option(arg, ret, "retpoline,amd")) { + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) { + pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n"); + return SPECTRE_V2_CMD_AUTO; + } + spec2_print_if_insecure("AMD retpoline selected on command line."); + return SPECTRE_V2_CMD_RETPOLINE_AMD; + } else if (match_option(arg, ret, "retpoline,generic")) { + spec2_print_if_insecure("generic retpoline selected on command line."); + return SPECTRE_V2_CMD_RETPOLINE_GENERIC; + } else if (match_option(arg, ret, "auto")) { + return SPECTRE_V2_CMD_AUTO; + } + } + + if (!cmdline_find_option_bool(boot_command_line, "nospectre_v2")) + return SPECTRE_V2_CMD_AUTO; +disable: + spec2_print_if_insecure("disabled on command line."); + return SPECTRE_V2_CMD_NONE; +} + +static void __init spectre_v2_select_mitigation(void) +{ + enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline(); + enum spectre_v2_mitigation mode = SPECTRE_V2_NONE; + + /* + * If the CPU is not affected and the command line mode is NONE or AUTO + * then nothing to do. 
+ */ + if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2) && + (cmd == SPECTRE_V2_CMD_NONE || cmd == SPECTRE_V2_CMD_AUTO)) + return; + + switch (cmd) { + case SPECTRE_V2_CMD_NONE: + return; + + case SPECTRE_V2_CMD_FORCE: + /* FALLTRHU */ + case SPECTRE_V2_CMD_AUTO: + goto retpoline_auto; + + case SPECTRE_V2_CMD_RETPOLINE_AMD: + if (IS_ENABLED(CONFIG_RETPOLINE)) + goto retpoline_amd; + break; + case SPECTRE_V2_CMD_RETPOLINE_GENERIC: + if (IS_ENABLED(CONFIG_RETPOLINE)) + goto retpoline_generic; + break; + case SPECTRE_V2_CMD_RETPOLINE: + if (IS_ENABLED(CONFIG_RETPOLINE)) + goto retpoline_auto; + break; + } + pr_err("kernel not compiled with retpoline; no mitigation available!"); + return; + +retpoline_auto: + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { + retpoline_amd: + if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) { + pr_err("LFENCE not serializing. Switching to generic retpoline\n"); + goto retpoline_generic; + } + mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD : + SPECTRE_V2_RETPOLINE_MINIMAL_AMD; + setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD); + setup_force_cpu_cap(X86_FEATURE_RETPOLINE); + } else { + retpoline_generic: + mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_GENERIC : + SPECTRE_V2_RETPOLINE_MINIMAL; + setup_force_cpu_cap(X86_FEATURE_RETPOLINE); + } + + spectre_v2_enabled = mode; + pr_info("%s\n", spectre_v2_strings[mode]); +} + +#undef pr_fmt + #ifdef CONFIG_SYSFS ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf) @@ -148,6 +303,7 @@ ssize_t cpu_show_spectre_v2(struct devic { if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) return sprintf(buf, "Not affected\n"); - return sprintf(buf, "Vulnerable\n"); + + return sprintf(buf, "%s\n", spectre_v2_strings[spectre_v2_enabled]); } #endif --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -810,10 +810,6 @@ static void __init early_identify_cpu(st
setup_force_cpu_bug(X86_BUG_SPECTRE_V1); setup_force_cpu_bug(X86_BUG_SPECTRE_V2); - -#ifdef CONFIG_RETPOLINE - setup_force_cpu_cap(X86_FEATURE_RETPOLINE); -#endif }
void __init early_cpu_init(void)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit b3bbfb3fb5d25776b8e3f361d2eedaabb0b496cd upstream.
For __get_user() paths, do not allow the kernel to speculate on the value of a user controlled pointer. In addition to the 'stac' instruction for Supervisor Mode Access Protection (SMAP), a barrier_nospec() causes the access_ok() result to resolve in the pipeline before the CPU might take any speculative action on the pointer value. Given the cost of 'stac' the speculation barrier is placed after 'stac' to hopefully overlap the cost of disabling SMAP with the cost of flushing the instruction pipeline.
Since __get_user is a major kernel interface that deals with user controlled pointers, the __uaccess_begin_nospec() mechanism will prevent speculative execution past an access_ok() permission check. While speculative execution past access_ok() is not enough to lead to a kernel memory leak, it is a necessary precondition.
To be clear, __uaccess_begin_nospec() is addressing a class of potential problems near __get_user() usages.
Note that, while the barrier_nospec() in __uaccess_begin_nospec() is used to protect __get_user(), pointer masking similar to array_index_nospec() will be used for get_user(), since it incorporates a bounds check near the usage.
uaccess_try_nospec provides the same mechanism for get_user_try.
No functional changes.
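For readability, the ordering that the new macro establishes can be summarised as in the sketch below. This is illustrative only and assumes kernel context; the helper name is made up, fault handling is elided, and real users go through the normal uaccess helpers rather than a raw dereference:

	static int example_fetch_user_u32(u32 *dst, const u32 __user *uptr)
	{
		if (!access_ok(VERIFY_READ, uptr, sizeof(*uptr)))
			return -EFAULT;

		__uaccess_begin_nospec();	/* stac(); barrier_nospec(); */
		/* the dependent load now cannot run ahead of the check */
		*dst = *(const u32 __force *)uptr;
		__uaccess_end();		/* clac() */
		return 0;
	}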
Suggested-by: Linus Torvalds torvalds@linux-foundation.org Suggested-by: Andi Kleen ak@linux.intel.com Suggested-by: Ingo Molnar mingo@redhat.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Kees Cook keescook@chromium.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro viro@zeniv.linux.org.uk Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727415922.33451.5796614273104346583.stgit@dwill... [bwh: Backported to 3.16: use current_thread_info()] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/uaccess.h | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -145,6 +145,11 @@ extern int __get_user_bad(void);
#define __uaccess_begin() stac() #define __uaccess_end() clac() +#define __uaccess_begin_nospec() \ +({ \ + stac(); \ + barrier_nospec(); \ +})
/* * This is a type: either unsigned long, if the argument fits into @@ -470,6 +475,10 @@ struct __large_struct { unsigned long bu __uaccess_begin(); \ barrier();
+#define uaccess_try_nospec do { \ + current_thread_info()->uaccess_err = 0; \ + __uaccess_begin_nospec(); \ + #define uaccess_catch(err) \ __uaccess_end(); \ (err) |= (current_thread_info()->uaccess_err ? -EFAULT : 0); \
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 5096732f6f695001fa2d6f1335a2680b37912c69 upstream.
Convert all indirect jumps in 32bit checksum assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Arjan van de Ven arjan@linux.intel.com Acked-by: Ingo Molnar mingo@kernel.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515707194-20531-11-git-send-email-dwmw@amazon.co.... [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/lib/checksum_32.S | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/arch/x86/lib/checksum_32.S +++ b/arch/x86/lib/checksum_32.S @@ -29,7 +29,8 @@ #include <asm/dwarf2.h> #include <asm/errno.h> #include <asm/asm.h> - +#include <asm/nospec-branch.h> + /* * computes a partial checksum, e.g. for TCP/UDP fragments */ @@ -165,7 +166,7 @@ ENTRY(csum_partial) negl %ebx lea 45f(%ebx,%ebx,2), %ebx testl %esi, %esi - jmp *%ebx + JMP_NOSPEC %ebx
# Handle 2-byte-aligned regions 20: addw (%esi), %ax @@ -463,7 +464,7 @@ ENTRY(csum_partial_copy_generic) andl $-32,%edx lea 3f(%ebx,%ebx), %ebx testl %esi, %esi - jmp *%ebx + JMP_NOSPEC %ebx 1: addl $64,%esi addl $64,%edi SRC(movb -32(%edx),%bl) ; SRC(movb (%edx),%bl)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Mark Rutland mark.rutland@arm.com
commit f84a56f73dddaeac1dba8045b007f742f61cd2da upstream.
Document the rationale and usage of the new array_index_nospec() helper.
Signed-off-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Kees Cook keescook@chromium.org Cc: linux-arch@vger.kernel.org Cc: Jonathan Corbet corbet@lwn.net Cc: Peter Zijlstra peterz@infradead.org Cc: gregkh@linuxfoundation.org Cc: kernel-hardening@lists.openwall.com Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727413645.33451.15878817161436755393.stgit@dwil... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- Documentation/speculation.txt | 90 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 90 insertions(+) create mode 100644 Documentation/speculation.txt
--- /dev/null +++ b/Documentation/speculation.txt @@ -0,0 +1,90 @@ +This document explains potential effects of speculation, and how undesirable +effects can be mitigated portably using common APIs. + +=========== +Speculation +=========== + +To improve performance and minimize average latencies, many contemporary CPUs +employ speculative execution techniques such as branch prediction, performing +work which may be discarded at a later stage. + +Typically speculative execution cannot be observed from architectural state, +such as the contents of registers. However, in some cases it is possible to +observe its impact on microarchitectural state, such as the presence or +absence of data in caches. Such state may form side-channels which can be +observed to extract secret information. + +For example, in the presence of branch prediction, it is possible for bounds +checks to be ignored by code which is speculatively executed. Consider the +following code: + + int load_array(int *array, unsigned int index) + { + if (index >= MAX_ARRAY_ELEMS) + return 0; + else + return array[index]; + } + +Which, on arm64, may be compiled to an assembly sequence such as: + + CMP <index>, #MAX_ARRAY_ELEMS + B.LT less + MOV <returnval>, #0 + RET + less: + LDR <returnval>, [<array>, <index>] + RET + +It is possible that a CPU mis-predicts the conditional branch, and +speculatively loads array[index], even if index >= MAX_ARRAY_ELEMS. This +value will subsequently be discarded, but the speculated load may affect +microarchitectural state which can be subsequently measured. + +More complex sequences involving multiple dependent memory accesses may +result in sensitive information being leaked. Consider the following +code, building on the prior example: + + int load_dependent_arrays(int *arr1, int *arr2, int index) + { + int val1, val2, + + val1 = load_array(arr1, index); + val2 = load_array(arr2, val1); + + return val2; + } + +Under speculation, the first call to load_array() may return the value +of an out-of-bounds address, while the second call will influence +microarchitectural state dependent on this value. This may provide an +arbitrary read primitive. + +==================================== +Mitigating speculation side-channels +==================================== + +The kernel provides a generic API to ensure that bounds checks are +respected even under speculation. Architectures which are affected by +speculation-based side-channels are expected to implement these +primitives. + +The array_index_nospec() helper in <linux/nospec.h> can be used to +prevent information from being leaked via side-channels. + +A call to array_index_nospec(index, size) returns a sanitized index +value that is bounded to [0, size) even under cpu speculation +conditions. + +This can be used to protect the earlier load_array() example: + + int load_array(int *array, unsigned int index) + { + if (index >= MAX_ARRAY_ELEMS) + return 0; + else { + index = array_index_nospec(index, MAX_ARRAY_ELEMS); + return array[index]; + } + }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit e70e5892b28c18f517f29ab6e83bd57705104b31 upstream.
Convert all indirect jumps in hyperv inline asm code to use non-speculative sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Arjan van de Ven arjan@linux.intel.com Acked-by: Ingo Molnar mingo@kernel.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515707194-20531-9-git-send-email-dwmw@amazon.co.u... [bwh: Backported to 3.16: - Drop changes to hv_do_fast_hypercall8() - Include earlier updates to the asm constraints - Adjust filename, context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- drivers/hv/hv.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
--- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -29,6 +29,7 @@ #include <linux/version.h> #include <linux/interrupt.h> #include <asm/hyperv.h> +#include <asm/nospec-branch.h> #include "hyperv_vmbus.h"
/* The one and only */ @@ -93,10 +94,13 @@ static u64 do_hypercall(u64 control, voi u64 output_address = (output) ? virt_to_phys(output) : 0; void *hypercall_page = hv_context.hypercall_page;
- __asm__ __volatile__("mov %0, %%r8" : : "r" (output_address) : "r8"); - __asm__ __volatile__("call *%3" : "=a" (hv_status) : - "c" (control), "d" (input_address), - "m" (hypercall_page)); + __asm__ __volatile__("mov %4, %%r8\n" + CALL_NOSPEC + : "=a" (hv_status), ASM_CALL_CONSTRAINT, + "+c" (control), "+d" (input_address) + : "r" (output_address), + THUNK_TARGET(hypercall_page) + : "cc", "memory", "r8", "r9", "r10", "r11");
return hv_status;
@@ -114,11 +118,14 @@ static u64 do_hypercall(u64 control, voi u32 output_address_lo = output_address & 0xFFFFFFFF; void *hypercall_page = hv_context.hypercall_page;
- __asm__ __volatile__ ("call *%8" : "=d"(hv_status_hi), - "=a"(hv_status_lo) : "d" (control_hi), - "a" (control_lo), "b" (input_address_hi), - "c" (input_address_lo), "D"(output_address_hi), - "S"(output_address_lo), "m" (hypercall_page)); + __asm__ __volatile__(CALL_NOSPEC + : "=d" (hv_status_hi), "=a" (hv_status_lo), + "+c" (input_address_lo), ASM_CALL_CONSTRAINT + : "d" (control_hi), "a" (control_lo), + "b" (input_address_hi), + "D"(output_address_hi), "S"(output_address_lo), + THUNK_TARGET(hypercall_page) + : "cc", "memory");
return hv_status_lo | ((u64)hv_status_hi << 32); #endif /* !x86_64 */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 28d437d550e1e39f805d99f9f8ac399c778827b7 upstream.
The PAUSE instruction is currently used in the retpoline and RSB filling macros as a speculation trap. The use of PAUSE was originally suggested because it showed a very, very small difference in the amount of cycles/time used to execute the retpoline as compared to LFENCE. On AMD, the PAUSE instruction is not a serializing instruction, so the pause/jmp loop will use excess power while it is speculated over, waiting for the return to mispredict to the correct target.
The RSB filling macro is applicable to AMD, and, if software is unable to verify that LFENCE is serializing on AMD (possible when running under a hypervisor), the generic retpoline support will be used and, so, is also applicable to AMD. Keep the current usage of PAUSE for Intel, but add an LFENCE instruction to the speculation trap for AMD.
The same sequence has been adopted by GCC for the GCC generated retpolines.
Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@alien8.de Acked-by: David Woodhouse dwmw@amazon.co.uk Acked-by: Arjan van de Ven arjan@linux.intel.com Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Paul Turner pjt@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Jiri Kosina jikos@kernel.org Cc: Dave Hansen dave.hansen@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Dan Williams dan.j.williams@intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Kees Cook keescook@google.com Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdof... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/nospec-branch.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -11,7 +11,7 @@ * Fill the CPU return stack buffer. * * Each entry in the RSB, if used for a speculative 'ret', contains an - * infinite 'pause; jmp' loop to capture speculative execution. + * infinite 'pause; lfence; jmp' loop to capture speculative execution. * * This is required in various cases for retpoline and IBRS-based * mitigations for the Spectre variant 2 vulnerability. Sometimes to @@ -38,11 +38,13 @@ call 772f; \ 773: /* speculation trap */ \ pause; \ + lfence; \ jmp 773b; \ 772: \ call 774f; \ 775: /* speculation trap */ \ pause; \ + lfence; \ jmp 775b; \ 774: \ dec reg; \ @@ -60,6 +62,7 @@ call .Ldo_rop_@ .Lspec_trap_@: pause + lfence jmp .Lspec_trap_@ .Ldo_rop_@: mov \reg, (%_ASM_SP) @@ -142,6 +145,7 @@ " .align 16\n" \ "901: call 903f;\n" \ "902: pause;\n" \ + " lfence;\n" \ " jmp 902b;\n" \ " .align 16\n" \ "903: addl $4, %%esp;\n" \
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit b8b9ce4b5aec8de9e23cabb0a26b78641f9ab1d6 upstream.
Remove the compile-time warning when CONFIG_RETPOLINE=y and the compiler does not have retpoline support. Linus' rationale for this is:
It's wrong because it will just make people turn off RETPOLINE, and the asm updates - and return stack clearing - that are independent of the compiler are likely the most important parts because they are likely the ones easiest to target.
And it's annoying because most people won't be able to do anything about it. The number of people building their own compiler? Very small. So if their distro hasn't got a compiler yet (and pretty much nobody does), the warning is just annoying crap.
It is already properly reported as part of the sysfs interface. The compile-time warning only encourages bad things.
Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Requested-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: David Woodhouse dwmw@amazon.co.uk Cc: Peter Zijlstra (Intel) peterz@infradead.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAt... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/Makefile | 2 -- 1 file changed, 2 deletions(-)
--- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -182,8 +182,6 @@ ifdef CONFIG_RETPOLINE RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register) ifneq ($(RETPOLINE_CFLAGS),) KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE - else - $(warning CONFIG_RETPOLINE=y, but not supported by the compiler. Toolchain update recommended.) endif endif
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 117cc7a908c83697b0b737d15ae1eb5943afe35b upstream.
In accordance with the Intel and AMD documentation, we need to overwrite all entries in the RSB on exiting a guest, to prevent malicious branch target predictions from affecting the host kernel. This is needed both for retpoline and for IBRS.
[ak: numbers again for the RSB stuffing labels]
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515755487-8524-1-git-send-email-dwmw@amazon.co.uk [bwh: Backported to 3.16: - Drop the ANNOTATE_NOSPEC_ALTERNATIVEs - Adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -7,6 +7,48 @@ #include <asm/alternative-asm.h> #include <asm/cpufeature.h>
+/* + * Fill the CPU return stack buffer. + * + * Each entry in the RSB, if used for a speculative 'ret', contains an + * infinite 'pause; jmp' loop to capture speculative execution. + * + * This is required in various cases for retpoline and IBRS-based + * mitigations for the Spectre variant 2 vulnerability. Sometimes to + * eliminate potentially bogus entries from the RSB, and sometimes + * purely to ensure that it doesn't get empty, which on some CPUs would + * allow predictions from other (unwanted!) sources to be used. + * + * We define a CPP macro such that it can be used from both .S files and + * inline assembly. It's possible to do a .macro and then include that + * from C via asm(".include <asm/nospec-branch.h>") but let's not go there. + */ + +#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */ +#define RSB_FILL_LOOPS 16 /* To avoid underflow */ + +/* + * Google experimented with loop-unrolling and this turned out to be + * the optimal version — two calls, each with their own speculation + * trap should their return address end up getting used, in a loop. + */ +#define __FILL_RETURN_BUFFER(reg, nr, sp) \ + mov $(nr/2), reg; \ +771: \ + call 772f; \ +773: /* speculation trap */ \ + pause; \ + jmp 773b; \ +772: \ + call 774f; \ +775: /* speculation trap */ \ + pause; \ + jmp 775b; \ +774: \ + dec reg; \ + jnz 771b; \ + add $(BITS_PER_LONG/8) * nr, sp; + #ifdef __ASSEMBLY__
/* @@ -61,6 +103,19 @@ #endif .endm
+ /* + * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP + * monstrosity above, manually. + */ +.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req +#ifdef CONFIG_RETPOLINE + ALTERNATIVE "jmp .Lskip_rsb_@", \ + __stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)) \ + \ftr +.Lskip_rsb_@: +#endif +.endm + #else /* __ASSEMBLY__ */
#if defined(CONFIG_X86_64) && defined(RETPOLINE) @@ -97,7 +152,7 @@ X86_FEATURE_RETPOLINE)
# define THUNK_TARGET(addr) [thunk_target] "rm" (addr) -#else /* No retpoline */ +#else /* No retpoline for C / inline asm */ # define CALL_NOSPEC "call *%[thunk_target]\n" # define THUNK_TARGET(addr) [thunk_target] "rm" (addr) #endif @@ -112,5 +167,24 @@ enum spectre_v2_mitigation { SPECTRE_V2_IBRS, };
+/* + * On VMEXIT we must ensure that no RSB predictions learned in the guest + * can be followed in the host, by overwriting the RSB completely. Both + * retpoline and IBRS mitigations for Spectre v2 need this; only on future + * CPUs with IBRS_ATT *might* it be avoided. + */ +static inline void vmexit_fill_RSB(void) +{ +#ifdef CONFIG_RETPOLINE + unsigned long loops = RSB_CLEAR_LOOPS / 2; + + asm volatile (ALTERNATIVE("jmp 910f", + __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)), + X86_FEATURE_RETPOLINE) + "910:" + : "=&r" (loops), ASM_CALL_CONSTRAINT + : "r" (loops) : "memory" ); +#endif +} #endif /* __ASSEMBLY__ */ #endif /* __NOSPEC_BRANCH_H__ */ --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -36,6 +36,7 @@ #include <asm/desc.h> #include <asm/debugreg.h> #include <asm/kvm_para.h> +#include <asm/nospec-branch.h>
#include <asm/virtext.h> #include "trace.h" @@ -3963,6 +3964,9 @@ static void svm_vcpu_run(struct kvm_vcpu #endif );
+ /* Eliminate branch target predictions from guest mode */ + vmexit_fill_RSB(); + #ifdef CONFIG_X86_64 wrmsrl(MSR_GS_BASE, svm->host.gs_base); #else --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -45,6 +45,7 @@ #include <asm/perf_event.h> #include <asm/debugreg.h> #include <asm/kexec.h> +#include <asm/nospec-branch.h>
#include "trace.h"
@@ -7557,6 +7558,9 @@ static void __noclone vmx_vcpu_run(struc #endif );
+ /* Eliminate branch target predictions from guest mode */ + vmexit_fill_RSB(); + /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */ if (debugctlmsr) update_debugctlmsr(debugctlmsr);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 2961298efe1ea1b6fc0d7ee8b76018fa6c0bcef2 upstream.
We want to expose the hardware features simply in /proc/cpuinfo as "ibrs", "ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them as the user-visible bits.
When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP bit is set, set the AMD STIBP that's used for the generic hardware capability.
Hide the rest from /proc/cpuinfo by putting "" in the comments. Including RETPOLINE and RETPOLINE_AMD which shouldn't be visible there. There are patches to make the sysfs vulnerabilities information non-readable by non-root, and the same should apply to all information about which mitigations are actually in use. Those *shouldn't* appear in /proc/cpuinfo.
The feature bit for whether IBPB is actually used, which is needed for ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB.
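The part of the upstream commit that folds the Intel bits into the AMD-style bits is not included in this 3.16 backport, which only takes the /proc/cpuinfo hiding. For orientation, upstream it looks roughly like the sketch below; the SPEC_CTRL/IBRS/IBPB/STIBP feature names are the upstream ones and do not exist in this tree:

	/* Roughly what upstream does; not part of the hunk below */
	static void example_fold_speculation_ctrl_bits(struct cpuinfo_x86 *c)
	{
		if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
			set_cpu_cap(c, X86_FEATURE_IBRS);
			set_cpu_cap(c, X86_FEATURE_IBPB);
		}
		if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
			set_cpu_cap(c, X86_FEATURE_STIBP);
	}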
Originally-by: Borislav Petkov bp@suse.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: ak@linux.intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517070274-12128-2-git-send-email-dwmw@amazon.co.u... [bwh: For 3.16, just apply the part that hides fake CPU feature bits] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -187,10 +187,10 @@ #define X86_FEATURE_HW_PSTATE (7*32+ 8) /* AMD HW-PState */ #define X86_FEATURE_PROC_FEEDBACK (7*32+ 9) /* AMD ProcFeedbackInterface */ #define X86_FEATURE_INVPCID_SINGLE (7*32+10) /* Effectively INVPCID && CR4.PCIDE=1 */ -#define X86_FEATURE_RSB_CTXSW (7*32+11) /* Fill RSB on context switches */ +#define X86_FEATURE_RSB_CTXSW (7*32+11) /* "" Fill RSB on context switches */
-#define X86_FEATURE_RETPOLINE (7*32+29) /* Generic Retpoline mitigation for Spectre variant 2 */ -#define X86_FEATURE_RETPOLINE_AMD (7*32+30) /* AMD Retpoline mitigation for Spectre variant 2 */ +#define X86_FEATURE_RETPOLINE (7*32+29) /* "" Generic Retpoline mitigation for Spectre variant 2 */ +#define X86_FEATURE_RETPOLINE_AMD (7*32+30) /* "" AMD Retpoline mitigation for Spectre variant 2 */ /* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */ #define X86_FEATURE_KAISER (7*32+31) /* "" CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov bp@suse.de
commit 62a67e123e058a67db58bc6a14354dd037bafd0a upstream.
This should make the boot paths easier to follow. The split is probably a leftover from the x86 unification eons ago.
No functionality change.
Signed-off-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20161024173844.23038-3-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org [bwh: Backported to 3.16: - Add #ifdef around check_fpu(), which is not used on x86_64 - Adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -16,9 +16,7 @@ obj-y := intel_cacheinfo.o scattered.o obj-y += proc.o capflags.o powerflags.o common.o obj-y += rdrand.o obj-y += match.o - -obj-$(CONFIG_X86_32) += bugs.o -obj-$(CONFIG_X86_64) += bugs_64.o +obj-y += bugs.o
obj-$(CONFIG_CPU_SUP_INTEL) += intel.o obj-$(CONFIG_CPU_SUP_AMD) += amd.o --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -16,6 +16,10 @@ #include <asm/msr.h> #include <asm/paravirt.h> #include <asm/alternative.h> +#include <asm/pgtable.h> +#include <asm/cacheflush.h> + +#ifdef CONFIG_X86_32
static double __initdata x = 4195835.0; static double __initdata y = 3145727.0; @@ -63,6 +67,8 @@ static void __init check_fpu(void) } }
+#endif /* CONFIG_X86_32 */ + void __init check_bugs(void) { #ifdef CONFIG_X86_32 @@ -74,11 +80,13 @@ void __init check_bugs(void) #endif
identify_boot_cpu(); -#ifndef CONFIG_SMP - pr_info("CPU: "); - print_cpu_info(&boot_cpu_data); -#endif
+ if (!IS_ENABLED(CONFIG_SMP)) { + pr_info("CPU: "); + print_cpu_info(&boot_cpu_data); + } + +#ifdef CONFIG_X86_32 /* * Check whether we are able to run this kernel safely on SMP. * @@ -99,4 +107,18 @@ void __init check_bugs(void) */ if (cpu_has_fpu) check_fpu(); +#else /* CONFIG_X86_64 */ + alternative_instructions(); + + /* + * Make sure the first 2MB area is not mapped by huge pages + * There are typically fixed size MTRRs in there and overlapping + * MTRRs into large pages causes slow downs. + * + * Right now we don't do that with gbpages because there seems + * very little benefit for that case. + */ + if (!direct_gbpages) + set_memory_4k((unsigned long)__va(0), 1); +#endif } --- a/arch/x86/kernel/cpu/bugs_64.c +++ /dev/null @@ -1,33 +0,0 @@ -/* - * Copyright (C) 1994 Linus Torvalds - * Copyright (C) 2000 SuSE - */ - -#include <linux/kernel.h> -#include <linux/init.h> -#include <asm/alternative.h> -#include <asm/bugs.h> -#include <asm/processor.h> -#include <asm/mtrr.h> -#include <asm/cacheflush.h> - -void __init check_bugs(void) -{ - identify_boot_cpu(); -#if !defined(CONFIG_SMP) - printk(KERN_INFO "CPU: "); - print_cpu_info(&boot_cpu_data); -#endif - alternative_instructions(); - - /* - * Make sure the first 2MB area is not mapped by huge pages - * There are typically fixed size MTRRs in there and overlapping - * MTRRs into large pages causes slow downs. - * - * Right now we don't do that with gbpages because there seems - * very little benefit for that case. - */ - if (!direct_gbpages) - set_memory_4k((unsigned long)__va(0), 1); -}
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 76b043848fd22dbf7f8bf3a1452f8c70d557b860 upstream.
Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide the corresponding thunks. Provide assembler macros for invoking the thunks in the same way that GCC does, from native and inline assembler.
This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In some circumstances, IBRS microcode features may be used instead, and the retpoline can be disabled.
On AMD CPUs if lfence is serialising, the retpoline can be dramatically simplified to a simple "lfence; jmp *\reg". A future patch, after it has been verified that lfence really is serialising in all circumstances, can enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition to X86_FEATURE_RETPOLINE.
Do not align the retpoline in the altinstr section, because there is no guarantee that it stays aligned when it's copied over the oldinstr during alternative patching.
[ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks] [ tglx: Put actual function CALL/JMP in front of the macros, convert to symbolic labels ] [ dwmw2: Convert back to numeric labels, merge objtool fixes ]
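Besides the assembler macros, the new header gives C code CALL_NOSPEC and THUNK_TARGET for inline asm. The sketch below is a minimal, hypothetical usage example in the style used elsewhere in this series (ASM_CALL_CONSTRAINT comes from another patch in the series); a real caller must also list every register the called code may clobber:

	static unsigned long example_indirect_call(void *target)
	{
		unsigned long ret;

		/* Emits a plain "call *%[thunk_target]" without retpolines,
		 * or a call to the matching thunk when they are enabled. */
		asm volatile(CALL_NOSPEC
			     : "=a" (ret), ASM_CALL_CONSTRAINT
			     : THUNK_TARGET(target)
			     : "memory");
		return ret;
	}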
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Arjan van de Ven arjan@linux.intel.com Acked-by: Ingo Molnar mingo@kernel.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.u... [bwh: Backported to 3.16: - Add C source to export the thunk symbols - Drop ANNOTATE_NOSPEC_ALTERNATIVE since we don't have objtool - Use the first available feaure numbers - Adjust filename, context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -339,6 +339,19 @@ config GOLDFISH def_bool y depends on X86_GOLDFISH
+config RETPOLINE + bool "Avoid speculative indirect branches in kernel" + default y + help + Compile kernel with the retpoline compiler options to guard against + kernel-to-user data leaks by avoiding speculative indirect + branches. Requires a compiler with -mindirect-branch=thunk-extern + support for full protection. The kernel may run slower. + + Without compiler support, at least indirect branches in assembler + code are eliminated. Since this includes the syscall entry path, + it is not entirely pointless. + if X86_32 config X86_EXTENDED_PLATFORM bool "Support for extended (non-PC) x86 platforms" --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -177,6 +177,16 @@ KBUILD_CFLAGS += $(call cc-option,-mno-a KBUILD_CFLAGS += $(mflags-y) KBUILD_AFLAGS += $(mflags-y)
+# Avoid indirect branches in kernel to deal with Spectre +ifdef CONFIG_RETPOLINE + RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register) + ifneq ($(RETPOLINE_CFLAGS),) + KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE + else + $(warning CONFIG_RETPOLINE=y, but not supported by the compiler. Toolchain update recommended.) + endif +endif + archscripts: scripts_basic $(Q)$(MAKE) $(build)=arch/x86/tools relocs
--- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -188,6 +188,8 @@ #define X86_FEATURE_PROC_FEEDBACK (7*32+ 9) /* AMD ProcFeedbackInterface */ #define X86_FEATURE_INVPCID_SINGLE (7*32+10) /* Effectively INVPCID && CR4.PCIDE=1 */
+#define X86_FEATURE_RETPOLINE (7*32+29) /* Generic Retpoline mitigation for Spectre variant 2 */ +#define X86_FEATURE_RETPOLINE_AMD (7*32+30) /* AMD Retpoline mitigation for Spectre variant 2 */ /* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */ #define X86_FEATURE_KAISER (7*32+31) /* "" CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
--- /dev/null +++ b/arch/x86/include/asm/nospec-branch.h @@ -0,0 +1,106 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __NOSPEC_BRANCH_H__ +#define __NOSPEC_BRANCH_H__ + +#include <asm/alternative.h> +#include <asm/alternative-asm.h> +#include <asm/cpufeature.h> + +#ifdef __ASSEMBLY__ + +/* + * These are the bare retpoline primitives for indirect jmp and call. + * Do not use these directly; they only exist to make the ALTERNATIVE + * invocation below less ugly. + */ +.macro RETPOLINE_JMP reg:req + call .Ldo_rop_@ +.Lspec_trap_@: + pause + jmp .Lspec_trap_@ +.Ldo_rop_@: + mov \reg, (%_ASM_SP) + ret +.endm + +/* + * This is a wrapper around RETPOLINE_JMP so the called function in reg + * returns to the instruction after the macro. + */ +.macro RETPOLINE_CALL reg:req + jmp .Ldo_call_@ +.Ldo_retpoline_jmp_@: + RETPOLINE_JMP \reg +.Ldo_call_@: + call .Ldo_retpoline_jmp_@ +.endm + +/* + * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple + * indirect jmp/call which may be susceptible to the Spectre variant 2 + * attack. + */ +.macro JMP_NOSPEC reg:req +#ifdef CONFIG_RETPOLINE + ALTERNATIVE_2 __stringify(jmp *\reg), \ + __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \ + __stringify(lfence; jmp *\reg), X86_FEATURE_RETPOLINE_AMD +#else + jmp *\reg +#endif +.endm + +.macro CALL_NOSPEC reg:req +#ifdef CONFIG_RETPOLINE + ALTERNATIVE_2 __stringify(call *\reg), \ + __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\ + __stringify(lfence; call *\reg), X86_FEATURE_RETPOLINE_AMD +#else + call *\reg +#endif +.endm + +#else /* __ASSEMBLY__ */ + +#if defined(CONFIG_X86_64) && defined(RETPOLINE) + +/* + * Since the inline asm uses the %V modifier which is only in newer GCC, + * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE. + */ +# define CALL_NOSPEC \ + ALTERNATIVE( \ + "call *%[thunk_target]\n", \ + "call __x86_indirect_thunk_%V[thunk_target]\n", \ + X86_FEATURE_RETPOLINE) +# define THUNK_TARGET(addr) [thunk_target] "r" (addr) + +#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE) +/* + * For i386 we use the original ret-equivalent retpoline, because + * otherwise we'll run out of registers. We don't care about CET + * here, anyway. + */ +# define CALL_NOSPEC ALTERNATIVE("call *%[thunk_target]\n", \ + " jmp 904f;\n" \ + " .align 16\n" \ + "901: call 903f;\n" \ + "902: pause;\n" \ + " jmp 902b;\n" \ + " .align 16\n" \ + "903: addl $4, %%esp;\n" \ + " pushl %[thunk_target];\n" \ + " ret;\n" \ + " .align 16\n" \ + "904: call 901b;\n", \ + X86_FEATURE_RETPOLINE) + +# define THUNK_TARGET(addr) [thunk_target] "rm" (addr) +#else /* No retpoline */ +# define CALL_NOSPEC "call *%[thunk_target]\n" +# define THUNK_TARGET(addr) [thunk_target] "rm" (addr) +#endif + +#endif /* __ASSEMBLY__ */ +#endif /* __NOSPEC_BRANCH_H__ */ --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -810,6 +810,10 @@ static void __init early_identify_cpu(st
setup_force_cpu_bug(X86_BUG_SPECTRE_V1); setup_force_cpu_bug(X86_BUG_SPECTRE_V2); + +#ifdef CONFIG_RETPOLINE + setup_force_cpu_cap(X86_FEATURE_RETPOLINE); +#endif }
void __init early_cpu_init(void) --- a/arch/x86/lib/Makefile +++ b/arch/x86/lib/Makefile @@ -23,6 +23,8 @@ lib-y += memcpy_$(BITS).o lib-$(CONFIG_SMP) += rwlock.o lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o +lib-$(CONFIG_RETPOLINE) += retpoline.o +obj-$(CONFIG_RETPOLINE) += retpoline-export.o
obj-y += msr.o msr-reg.o msr-reg-export.o hash.o
--- /dev/null +++ b/arch/x86/lib/retpoline.S @@ -0,0 +1,46 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#include <linux/stringify.h> +#include <linux/linkage.h> +#include <asm/dwarf2.h> +#include <asm/cpufeature.h> +#include <asm/alternative-asm.h> +#include <asm/nospec-branch.h> + +.macro THUNK reg + .section .text.__x86.indirect_thunk.\reg + +ENTRY(__x86_indirect_thunk_\reg) + CFI_STARTPROC + JMP_NOSPEC %\reg + CFI_ENDPROC +ENDPROC(__x86_indirect_thunk_\reg) +.endm + +/* + * Despite being an assembler file we can't just use .irp here + * because __KSYM_DEPS__ only uses the C preprocessor and would + * only see one instance of "__x86_indirect_thunk_\reg" rather + * than one per register with the correct names. So we do it + * the simple and nasty way... + */ +#define GENERATE_THUNK(reg) THUNK reg + +GENERATE_THUNK(_ASM_AX) +GENERATE_THUNK(_ASM_BX) +GENERATE_THUNK(_ASM_CX) +GENERATE_THUNK(_ASM_DX) +GENERATE_THUNK(_ASM_SI) +GENERATE_THUNK(_ASM_DI) +GENERATE_THUNK(_ASM_BP) +GENERATE_THUNK(_ASM_SP) +#ifdef CONFIG_64BIT +GENERATE_THUNK(r8) +GENERATE_THUNK(r9) +GENERATE_THUNK(r10) +GENERATE_THUNK(r11) +GENERATE_THUNK(r12) +GENERATE_THUNK(r13) +GENERATE_THUNK(r14) +GENERATE_THUNK(r15) +#endif --- /dev/null +++ b/arch/x86/lib/retpoline-export.c @@ -0,0 +1,25 @@ +#include <linux/linkage.h> + +#ifdef CONFIG_RETPOLINE +#ifdef CONFIG_X86_32 +#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void); EXPORT_SYMBOL(__x86_indirect_thunk_e ## reg); +#else +#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void); EXPORT_SYMBOL(__x86_indirect_thunk_r ## reg); +INDIRECT_THUNK(8) +INDIRECT_THUNK(9) +INDIRECT_THUNK(10) +INDIRECT_THUNK(11) +INDIRECT_THUNK(12) +INDIRECT_THUNK(13) +INDIRECT_THUNK(14) +INDIRECT_THUNK(15) +#endif +INDIRECT_THUNK(ax) +INDIRECT_THUNK(bx) +INDIRECT_THUNK(cx) +INDIRECT_THUNK(dx) +INDIRECT_THUNK(si) +INDIRECT_THUNK(di) +INDIRECT_THUNK(bp) +INDIRECT_THUNK(sp) +#endif /* CONFIG_RETPOLINE */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit f3804203306e098dae9ca51540fcd5eb700d7f40 upstream.
array_index_nospec() is proposed as a generic mechanism to mitigate against Spectre-variant-1 attacks, i.e. an attack that bypasses boundary checks via speculative execution. The array_index_nospec() implementation is expected to be safe for current generation CPUs across multiple architectures (ARM, x86).
Based on an original implementation by Linus Torvalds, tweaked to remove speculative flows by Alexei Starovoitov, and tweaked again by Linus to introduce an x86 assembly implementation for the mask generation.
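The mask arithmetic is compact enough to verify by hand. The stand-alone user-space program below (illustrative only, not kernel code) reproduces the generic C fallback from the hunk below and shows the clamping behaviour; it assumes the usual arithmetic right shift of negative values, as the kernel does:

	#include <stdio.h>

	#define BITS_PER_LONG (8 * sizeof(long))

	/* Same arithmetic as the generic array_index_mask_nospec() */
	static unsigned long mask_nospec(unsigned long index, unsigned long size)
	{
		return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
	}

	int main(void)
	{
		/* In bounds: index 3 of 8 -> mask is ~0, index unchanged */
		printf("%lu\n", 3UL & mask_nospec(3, 8));	/* prints 3 */
		/* Out of bounds: index 12 of 8 -> mask is 0, index clamps */
		printf("%lu\n", 12UL & mask_nospec(12, 8));	/* prints 0 */
		return 0;
	}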
Co-developed-by: Linus Torvalds torvalds@linux-foundation.org Co-developed-by: Alexei Starovoitov ast@kernel.org Suggested-by: Cyril Novikov cnovikov@lynx.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Peter Zijlstra peterz@infradead.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Cc: Russell King linux@armlinux.org.uk Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727414229.33451.18411580953862676575.stgit@dwil... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- include/linux/nospec.h | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 72 insertions(+) create mode 100644 include/linux/nospec.h
--- /dev/null +++ b/include/linux/nospec.h @@ -0,0 +1,72 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright(c) 2018 Linus Torvalds. All rights reserved. +// Copyright(c) 2018 Alexei Starovoitov. All rights reserved. +// Copyright(c) 2018 Intel Corporation. All rights reserved. + +#ifndef _LINUX_NOSPEC_H +#define _LINUX_NOSPEC_H + +/** + * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise + * @index: array element index + * @size: number of elements in array + * + * When @index is out of bounds (@index >= @size), the sign bit will be + * set. Extend the sign bit to all bits and invert, giving a result of + * zero for an out of bounds index, or ~0 if within bounds [0, @size). + */ +#ifndef array_index_mask_nospec +static inline unsigned long array_index_mask_nospec(unsigned long index, + unsigned long size) +{ + /* + * Warn developers about inappropriate array_index_nospec() usage. + * + * Even if the CPU speculates past the WARN_ONCE branch, the + * sign bit of @index is taken into account when generating the + * mask. + * + * This warning is compiled out when the compiler can infer that + * @index and @size are less than LONG_MAX. + */ + if (WARN_ONCE(index > LONG_MAX || size > LONG_MAX, + "array_index_nospec() limited to range of [0, LONG_MAX]\n")) + return 0; + + /* + * Always calculate and emit the mask even if the compiler + * thinks the mask is not needed. The compiler does not take + * into account the value of @index under speculation. + */ + OPTIMIZER_HIDE_VAR(index); + return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1); +} +#endif + +/* + * array_index_nospec - sanitize an array index after a bounds check + * + * For a code sequence like: + * + * if (index < size) { + * index = array_index_nospec(index, size); + * val = array[index]; + * } + * + * ...if the CPU speculates past the bounds check then + * array_index_nospec() will clamp the index within the range of [0, + * size). + */ +#define array_index_nospec(index, size) \ +({ \ + typeof(index) _i = (index); \ + typeof(size) _s = (size); \ + unsigned long _mask = array_index_mask_nospec(_i, _s); \ + \ + BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \ + BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \ + \ + _i &= _mask; \ + _i; \ +}) +#endif /* _LINUX_NOSPEC_H */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov bp@suse.de
commit 55fa19d3e51f33d9cd4056d25836d93abf9438db upstream.
Make
[ 0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline
into
[ 0.031118] Spectre V2: Mitigation: Full generic retpoline
to avoid repeating "mitigation" in the string.
Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: riel@redhat.com Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: David Woodhouse dwmw2@infradead.org Cc: jikos@kernel.org Cc: luto@amacapital.net Cc: dave.hansen@intel.com Cc: torvalds@linux-foundation.org Cc: keescook@google.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: tim.c.chen@linux.intel.com Cc: pjt@google.com Link: https://lkml.kernel.org/r/20180126121139.31959-5-bp@alien8.de Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/cpu/bugs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -153,7 +153,7 @@ static const char *spectre_v2_strings[] };
#undef pr_fmt -#define pr_fmt(fmt) "Spectre V2 mitigation: " fmt +#define pr_fmt(fmt) "Spectre V2 : " fmt
static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE; static bool spectre_v2_bad_module;
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Ryabinin aryabinin@virtuozzo.com
commit 196bd485ee4f03ce4c690bfcf38138abfcd0a4bc upstream.
Currently we use the current_stack_pointer() function to get the value of the stack pointer register. Since commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
... we have a stack register variable declared. It can be used instead of the current_stack_pointer() function, which allows us to optimize away some excessive "mov %rsp, %<dst>" instructions:
-mov %rsp,%rdx -sub %rdx,%rax -cmp $0x3fff,%rax -ja ffffffff810722fd <ist_begin_non_atomic+0x2d>
+sub %rsp,%rax +cmp $0x3fff,%rax +ja ffffffff810722fa <ist_begin_non_atomic+0x2a>
Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer and use it instead of the removed function.
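A minimal sketch of what the declaration enables (kernel context assumed; the function names are made up): the stack pointer can be read directly, and any inline asm containing a call can list the register variable as an operand via ASM_CALL_CONSTRAINT.

	static inline unsigned long example_stack_base(void)
	{
		/* Direct read of the register variable, no mov needed */
		return current_stack_pointer & ~(THREAD_SIZE - 1);
	}

	static void example_asm_call(void (*fn)(void))
	{
		/* ASM_CALL_CONSTRAINT keeps the frame set up before the asm.
		 * Real code in this series would use CALL_NOSPEC and must
		 * also declare all registers the callee may clobber. */
		asm volatile("call *%1"
			     : ASM_CALL_CONSTRAINT
			     : "r" (fn)
			     : "memory");
	}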
Signed-off-by: Andrey Ryabinin aryabinin@virtuozzo.com Reviewed-by: Josh Poimboeuf jpoimboe@redhat.com Cc: Andy Lutomirski luto@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar mingo@kernel.org [dwmw2: We want ASM_CALL_CONSTRAINT for retpoline] Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Razvan Ghitulete rga@amazon.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [bwh: Backported to 3.16: drop change in ist_begin_non_atomic()] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/include/asm/asm.h +++ b/arch/x86/include/asm/asm.h @@ -80,4 +80,15 @@ /* For C file, we already have NOKPROBE_SYMBOL macro */ #endif
+#ifndef __ASSEMBLY__ +/* + * This output constraint should be used for any inline asm which has a "call" + * instruction. Otherwise the asm may be inserted before the frame pointer + * gets set up by the containing function. If you forget to do this, objtool + * may print a "call without frame pointer save/setup" warning. + */ +register unsigned long current_stack_pointer asm(_ASM_SP); +#define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer) +#endif + #endif /* _ASM_X86_ASM_H */ --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -170,17 +170,6 @@ static inline struct thread_info *curren return ti; }
-static inline unsigned long current_stack_pointer(void) -{ - unsigned long sp; -#ifdef CONFIG_X86_64 - asm("mov %%rsp,%0" : "=g" (sp)); -#else - asm("mov %%esp,%0" : "=g" (sp)); -#endif - return sp; -} - #else /* !__ASSEMBLY__ */
/* how to get the thread information struct from ASM */ --- a/arch/x86/kernel/irq_32.c +++ b/arch/x86/kernel/irq_32.c @@ -71,7 +71,7 @@ static void call_on_stack(void *func, vo
static inline void *current_stack(void) { - return (void *)(current_stack_pointer() & ~(THREAD_SIZE - 1)); + return (void *)(current_stack_pointer & ~(THREAD_SIZE - 1)); }
static inline int @@ -96,7 +96,7 @@ execute_on_irq_stack(int overflow, struc
/* Save the next esp at the bottom of the stack */ prev_esp = (u32 *)irqstk; - *prev_esp = current_stack_pointer(); + *prev_esp = current_stack_pointer;
if (unlikely(overflow)) call_on_stack(print_stack_overflow, isp); @@ -149,7 +149,7 @@ void do_softirq_own_stack(void)
/* Push the previous esp onto the stack */ prev_esp = (u32 *)irqstk; - *prev_esp = current_stack_pointer(); + *prev_esp = current_stack_pointer;
call_on_stack(__do_softirq, isp); }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Colin Ian King colin.king@canonical.com
commit e698dcdfcda41efd0984de539767b4cddd235f1e upstream.
Trivial fix to spelling mistake in pr_err error message text.
Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Andi Kleen ak@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: kernel-janitors@vger.kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@suse.de Cc: David Woodhouse dwmw@amazon.co.uk Link: https://lkml.kernel.org/r/20180130193218.9271-1-colin.king@canonical.com Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/cpu/bugs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -165,7 +165,7 @@ bool retpoline_module_ok(bool has_retpol if (spectre_v2_enabled == SPECTRE_V2_NONE || has_retpoline) return true;
- pr_err("System may be vunerable to spectre v2\n"); + pr_err("System may be vulnerable to spectre v2\n"); spectre_v2_bad_module = true; return false; }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit a89f040fa34ec9cd682aed98b8f04e3c47d998bd upstream.
Many x86 CPUs leak information to user space due to missing isolation of user space and kernel space page tables. There are many well documented ways to exploit that.
The upcoming software mitigation of isolating the user and kernel space page tables needs a misfeature flag so code can be made runtime conditional.
Add the BUG bit which indicates that the CPU is affected and add a feature bit which indicates that the software mitigation is enabled.
Assume for now that _ALL_ x86 CPUs are affected by this. Exceptions can be made later.
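With the bug bit forced, later code can make behaviour runtime conditional with the usual helpers. A trivial sketch (kernel context assumed, hypothetical function name):

	static bool example_needs_pti(void)
	{
		/* True on every CPU for now, per the blanket assumption above */
		return boot_cpu_has_bug(X86_BUG_CPU_INSECURE);
	}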
Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Andy Lutomirski luto@kernel.org Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: David Laight David.Laight@aculab.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: Eduardo Valentin eduval@amazon.com Cc: Greg KH gregkh@linuxfoundation.org Cc: H. Peter Anvin hpa@zytor.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Juergen Gross jgross@suse.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Will Deacon will.deacon@arm.com Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar mingo@kernel.org [bwh: Backported to 3.16: assign the first available bug number] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -241,6 +241,7 @@ #define X86_BUG_COMA X86_BUG(2) /* Cyrix 6x86 coma */ #define X86_BUG_AMD_TLB_MMATCH X86_BUG(3) /* AMD Erratum 383 */ #define X86_BUG_AMD_APIC_C1E X86_BUG(4) /* AMD Erratum 400 */ +#define X86_BUG_CPU_INSECURE X86_BUG(5) /* CPU is insecure and needs kernel page table isolation */
#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
--- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -804,6 +804,9 @@ static void __init early_identify_cpu(st this_cpu->c_bsp_init(c);
setup_force_cpu_cap(X86_FEATURE_ALWAYS); + + /* Assume for now that ALL x86 CPUs are insecure */ + setup_force_cpu_bug(X86_BUG_CPU_INSECURE); }
void __init early_cpu_init(void)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dave Hansen dave.hansen@linux.intel.com
commit 01c9b17bf673b05bb401b76ec763e9730ccf1376 upstream.
Add some details about how PTI works, what some of the downsides are, and how to debug it when things go wrong.
Also document the kernel parameter: 'pti/nopti'.
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Randy Dunlap rdunlap@infradead.org Reviewed-by: Kees Cook keescook@chromium.org Cc: Moritz Lipp moritz.lipp@iaik.tugraz.at Cc: Daniel Gruss daniel.gruss@iaik.tugraz.at Cc: Michael Schwarz michael.schwarz@iaik.tugraz.at Cc: Richard Fellner richard.fellner@student.tugraz.at Cc: Andy Lutomirski luto@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Hugh Dickins hughd@google.com Cc: Andi Lutomirsky luto@kernel.org Link: https://lkml.kernel.org/r/20180105174436.1BC6FA2B@viggo.jf.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- Documentation//kernel-parameters.txt | 21 ++- Documentation/x86/pti.txt | 186 ++++++++++++++++++++++++ 2 files changed, 200 insertions(+), 7 deletions(-) create mode 100644 Documentation/x86/pti.txt
--- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2229,8 +2229,6 @@ bytes respectively. Such letter suffixes
nojitter [IA-64] Disables jitter checking for ITC timers.
- nopti [X86-64] Disable KAISER isolation of kernel from user. - no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver
no-kvmapf [X86,KVM] Disable paravirtualized asynchronous page @@ -2752,11 +2750,20 @@ bytes respectively. Such letter suffixes pt. [PARIDE] See Documentation/blockdev/paride.txt.
- pti= [X86_64] - Control KAISER user/kernel address space isolation: - on - enable - off - disable - auto - default setting + pti= [X86_64] Control Page Table Isolation of user and + kernel address spaces. Disabling this feature + removes hardening, but improves performance of + system calls and interrupts. + + on - unconditionally enable + off - unconditionally disable + auto - kernel detects whether your CPU model is + vulnerable to issues that PTI mitigates + + Not specifying this option is equivalent to pti=auto. + + nopti [X86_64] + Equivalent to pti=off
pty.legacy_count= [KNL] Number of legacy pty's. Overwrites compiled-in --- /dev/null +++ b/Documentation/x86/pti.txt @@ -0,0 +1,186 @@ +Overview +======== + +Page Table Isolation (pti, previously known as KAISER[1]) is a +countermeasure against attacks on the shared user/kernel address +space such as the "Meltdown" approach[2]. + +To mitigate this class of attacks, we create an independent set of +page tables for use only when running userspace applications. When +the kernel is entered via syscalls, interrupts or exceptions, the +page tables are switched to the full "kernel" copy. When the system +switches back to user mode, the user copy is used again. + +The userspace page tables contain only a minimal amount of kernel +data: only what is needed to enter/exit the kernel such as the +entry/exit functions themselves and the interrupt descriptor table +(IDT). There are a few strictly unnecessary things that get mapped +such as the first C function when entering an interrupt (see +comments in pti.c). + +This approach helps to ensure that side-channel attacks leveraging +the paging structures do not function when PTI is enabled. It can be +enabled by setting CONFIG_PAGE_TABLE_ISOLATION=y at compile time. +Once enabled at compile-time, it can be disabled at boot with the +'nopti' or 'pti=' kernel parameters (see kernel-parameters.txt). + +Page Table Management +===================== + +When PTI is enabled, the kernel manages two sets of page tables. +The first set is very similar to the single set which is present in +kernels without PTI. This includes a complete mapping of userspace +that the kernel can use for things like copy_to_user(). + +Although _complete_, the user portion of the kernel page tables is +crippled by setting the NX bit in the top level. This ensures +that any missed kernel->user CR3 switch will immediately crash +userspace upon executing its first instruction. + +The userspace page tables map only the kernel data needed to enter +and exit the kernel. This data is entirely contained in the 'struct +cpu_entry_area' structure which is placed in the fixmap which gives +each CPU's copy of the area a compile-time-fixed virtual address. + +For new userspace mappings, the kernel makes the entries in its +page tables like normal. The only difference is when the kernel +makes entries in the top (PGD) level. In addition to setting the +entry in the main kernel PGD, a copy of the entry is made in the +userspace page tables' PGD. + +This sharing at the PGD level also inherently shares all the lower +layers of the page tables. This leaves a single, shared set of +userspace page tables to manage. One PTE to lock, one set of +accessed bits, dirty bits, etc... + +Overhead +======== + +Protection against side-channel attacks is important. But, +this protection comes at a cost: + +1. Increased Memory Use + a. Each process now needs an order-1 PGD instead of order-0. + (Consumes an additional 4k per process). + b. The 'cpu_entry_area' structure must be 2MB in size and 2MB + aligned so that it can be mapped by setting a single PMD + entry. This consumes nearly 2MB of RAM once the kernel + is decompressed, but no space in the kernel image itself. + +2. Runtime Cost + a. CR3 manipulation to switch between the page table copies + must be done at interrupt, syscall, and exception entry + and exit (it can be skipped when the kernel is interrupted, + though.) Moves to CR3 are on the order of a hundred + cycles, and are required at every entry and exit. + b. 
A "trampoline" must be used for SYSCALL entry. This + trampoline depends on a smaller set of resources than the + non-PTI SYSCALL entry code, so requires mapping fewer + things into the userspace page tables. The downside is + that stacks must be switched at entry time. + d. Global pages are disabled for all kernel structures not + mapped into both kernel and userspace page tables. This + feature of the MMU allows different processes to share TLB + entries mapping the kernel. Losing the feature means more + TLB misses after a context switch. The actual loss of + performance is very small, however, never exceeding 1%. + d. Process Context IDentifiers (PCID) is a CPU feature that + allows us to skip flushing the entire TLB when switching page + tables by setting a special bit in CR3 when the page tables + are changed. This makes switching the page tables (at context + switch, or kernel entry/exit) cheaper. But, on systems with + PCID support, the context switch code must flush both the user + and kernel entries out of the TLB. The user PCID TLB flush is + deferred until the exit to userspace, minimizing the cost. + See intel.com/sdm for the gory PCID/INVPCID details. + e. The userspace page tables must be populated for each new + process. Even without PTI, the shared kernel mappings + are created by copying top-level (PGD) entries into each + new process. But, with PTI, there are now *two* kernel + mappings: one in the kernel page tables that maps everything + and one for the entry/exit structures. At fork(), we need to + copy both. + f. In addition to the fork()-time copying, there must also + be an update to the userspace PGD any time a set_pgd() is done + on a PGD used to map userspace. This ensures that the kernel + and userspace copies always map the same userspace + memory. + g. On systems without PCID support, each CR3 write flushes + the entire TLB. That means that each syscall, interrupt + or exception flushes the TLB. + h. INVPCID is a TLB-flushing instruction which allows flushing + of TLB entries for non-current PCIDs. Some systems support + PCIDs, but do not support INVPCID. On these systems, addresses + can only be flushed from the TLB for the current PCID. When + flushing a kernel address, we need to flush all PCIDs, so a + single kernel address flush will require a TLB-flushing CR3 + write upon the next use of every PCID. + +Possible Future Work +==================== +1. We can be more careful about not actually writing to CR3 + unless its value is actually changed. +2. Allow PTI to be enabled/disabled at runtime in addition to the + boot-time switching. + +Testing +======== + +To test stability of PTI, the following test procedure is recommended, +ideally doing all of these in parallel: + +1. Set CONFIG_DEBUG_ENTRY=y +2. Run several copies of all of the tools/testing/selftests/x86/ tests + (excluding MPX and protection_keys) in a loop on multiple CPUs for + several minutes. These tests frequently uncover corner cases in the + kernel entry code. In general, old kernels might cause these tests + themselves to crash, but they should never crash the kernel. +3. Run the 'perf' tool in a mode (top or record) that generates many + frequent performance monitoring non-maskable interrupts (see "NMI" + in /proc/interrupts). This exercises the NMI entry/exit code which + is known to trigger bugs in code paths that did not expect to be + interrupted, including nested NMIs. 
Using "-c" boosts the rate of + NMIs, and using two -c with separate counters encourages nested NMIs + and less deterministic behavior. + + while true; do perf record -c 10000 -e instructions,cycles -a sleep 10; done + +4. Launch a KVM virtual machine. +5. Run 32-bit binaries on systems supporting the SYSCALL instruction. + This has been a lightly-tested code path and needs extra scrutiny. + +Debugging +========= + +Bugs in PTI cause a few different signatures of crashes +that are worth noting here. + + * Failures of the selftests/x86 code. Usually a bug in one of the + more obscure corners of entry_64.S + * Crashes in early boot, especially around CPU bringup. Bugs + in the trampoline code or mappings cause these. + * Crashes at the first interrupt. Caused by bugs in entry_64.S, + like screwing up a page table switch. Also caused by + incorrectly mapping the IRQ handler entry code. + * Crashes at the first NMI. The NMI code is separate from main + interrupt handlers and can have bugs that do not affect + normal interrupts. Also caused by incorrectly mapping NMI + code. NMIs that interrupt the entry code must be very + careful and can be the cause of crashes that show up when + running perf. + * Kernel crashes at the first exit to userspace. entry_64.S + bugs, or failing to map some of the exit code. + * Crashes at first interrupt that interrupts userspace. The paths + in entry_64.S that return to userspace are sometimes separate + from the ones that return to the kernel. + * Double faults: overflowing the kernel stack because of page + faults upon page faults. Caused by touching non-pti-mapped + data in the entry code, or forgetting to switch to kernel + CR3 before calling into C functions which are not pti-mapped. + * Userspace segfaults early in boot, sometimes manifesting + as mount(8) failing to mount the rootfs. These have + tended to be TLB invalidation issues. Usually invalidating + the wrong PCID, or otherwise missing an invalidation. + +1. https://gruss.cc/files/kaiser.pdf +2. https://meltdownattack.com/meltdown.pdf
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov bp@suse.de
commit dbe4058a6a44af4ca5d146aebe01b0a1f9b7fd2a upstream.
Quentin caught a corner case with the generation of instruction padding in the ALTERNATIVE_2 macro: if len(orig_insn) < len(alt1) < len(alt2), then not enough padding gets added and that is not good(tm) as we could overwrite the beginning of the next instruction.
Luckily, at the time of this writing, we don't have ALTERNATIVE_2() invocations which have that problem and even if we did, a simple fix would be to prepend the instructions with enough prefixes so that that corner case doesn't happen.
However, it would be best if we fixed it properly. See below for a simple, abstracted example of what we're doing.
So what we ended up doing is, we compute the
max(len(alt1), len(alt2)) - len(orig_insn)
and feed that value to the .skip gas directive. The max() cannot have conditionals due to gas limitations, thus the fancy integer math.
With this patch, all ALTERNATIVE_2 sites get padded correctly; generating obscure test cases pass too:
#define alt_max_short(a, b) ((a) ^ (((a) ^ (b)) & -(-((a) < (b)))))
#define gen_skip(orig, alt1, alt2, marker) \ .skip -((alt_max_short(alt1, alt2) - (orig)) > 0) * \ (alt_max_short(alt1, alt2) - (orig)),marker
.pushsection .text, "ax" .globl main main: gen_skip(1, 2, 4, 0x09) gen_skip(4, 1, 2, 0x10) ... .popsection
Thanks to Quentin for catching it and double-checking the fix!
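For readers unfamiliar with the bit trick, a C analogue of the branchless max may help (a sketch only; the in-tree macros are written for gas expression arithmetic and therefore look slightly different):

	/* Branchless max: selects b when a < b, otherwise a. */
	static unsigned int alt_max_short_c(unsigned int a, unsigned int b)
	{
		unsigned int mask = -(unsigned int)(a < b);	/* ~0U when a < b, else 0 */

		return a ^ ((a ^ b) & mask);
	}

	/* alt_max_short_c(2, 4) == 4, alt_max_short_c(4, 1) == 4 */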
Reported-by: Quentin Casasnovas quentin.casasnovas@oracle.com Signed-off-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Oleg Nesterov oleg@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20150404133443.GE21152@pd.tnic Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/alternative-asm.h | 14 ++++++++++++-- arch/x86/include/asm/alternative.h | 16 ++++++++++++---- arch/x86/kernel/alternative.c | 4 ++-- 3 files changed, 26 insertions(+), 8 deletions(-)
--- a/arch/x86/include/asm/alternative-asm.h +++ b/arch/x86/include/asm/alternative-asm.h @@ -45,12 +45,22 @@ .popsection .endm
+#define old_len 141b-140b +#define new_len1 144f-143f +#define new_len2 145f-144f + +/* + * max without conditionals. Idea adapted from: + * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax + */ +#define alt_max_short(a, b) ((a) ^ (((a) ^ (b)) & -(-((a) < (b))))) + .macro ALTERNATIVE_2 oldinstr, newinstr1, feature1, newinstr2, feature2 140: \oldinstr 141: - .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 - .skip -(((145f-144f)-(144f-143f)-(141b-140b)) > 0) * ((145f-144f)-(144f-143f)-(141b-140b)),0x90 + .skip -((alt_max_short(new_len1, new_len2) - (old_len)) > 0) * \ + (alt_max_short(new_len1, new_len2) - (old_len)),0x90 142:
.pushsection .altinstructions,"a" --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -96,13 +96,21 @@ static inline int alternatives_text_rese alt_end_marker ":\n"
/* + * max without conditionals. Idea adapted from: + * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax + * + * The additional "-" is needed because gas works with s32s. + */ +#define alt_max_short(a, b) "((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") - (" b ")))))" + +/* * Pad the second replacement alternative with additional NOPs if it is * additionally longer than the first replacement alternative. */ -#define OLDINSTR_2(oldinstr, num1, num2) \ - __OLDINSTR(oldinstr, num1) \ - ".skip -(((" alt_rlen(num2) ")-(" alt_rlen(num1) ")-(662b-661b)) > 0) * " \ - "((" alt_rlen(num2) ")-(" alt_rlen(num1) ")-(662b-661b)),0x90\n" \ +#define OLDINSTR_2(oldinstr, num1, num2) \ + "661:\n\t" oldinstr "\n662:\n" \ + ".skip -((" alt_max_short(alt_rlen(num1), alt_rlen(num2)) " - (" alt_slen ")) > 0) * " \ + "(" alt_max_short(alt_rlen(num1), alt_rlen(num2)) " - (" alt_slen ")), 0x90\n" \ alt_end_marker ":\n"
#define ALTINSTR_ENTRY(feature, num) \ --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -372,11 +372,11 @@ void __init_or_module apply_alternatives continue; }
- DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d)", + DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d), pad: %d", a->cpuid >> 5, a->cpuid & 0x1f, instr, a->instrlen, - replacement, a->replacementlen); + replacement, a->replacementlen, a->padlen);
DUMP_BYTES(instr, a->instrlen, "%p: old_insn: ", instr); DUMP_BYTES(replacement, a->replacementlen, "%p: rpl_insn: ", replacement);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 9c6a73c75864ad9fa49e5fa6513e4c4071c0e29f upstream.
With LFENCE now a serializing instruction, use LFENCE_RDTSC in preference to MFENCE_RDTSC. However, since the kernel could be running under a hypervisor that does not support writing that MSR, read the MSR back and verify that the bit has been set successfully. If the MSR can be read and the bit is set, then set the LFENCE_RDTSC feature, otherwise set the MFENCE_RDTSC feature.
Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Reviewed-by: Borislav Petkov bp@suse.de Cc: Peter Zijlstra peterz@infradead.org Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Dave Hansen dave.hansen@intel.com Cc: Borislav Petkov bp@alien8.de Cc: Dan Williams dan.j.williams@intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: David Woodhouse dwmw@amazon.co.uk Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/20180108220932.12580.52458.stgit@tlendack-t1.amdof... [bwh: Backported to 3.16: adjust filename, context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/uapi/asm/msr-index.h | 1 + arch/x86/kernel/cpu/amd.c | 18 ++++++++++++++++-- 2 files changed, 17 insertions(+), 2 deletions(-)
--- a/arch/x86/include/uapi/asm/msr-index.h +++ b/arch/x86/include/uapi/asm/msr-index.h @@ -225,6 +225,7 @@ #define MSR_FAM10H_NODE_ID 0xc001100c #define MSR_F10H_DECFG 0xc0011029 #define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1 +#define MSR_F10H_DECFG_LFENCE_SERIALIZE BIT_ULL(MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT)
/* K8 MSRs */ #define MSR_K8_TOP_MEM1 0xc001001a --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -673,6 +673,9 @@ static void init_amd(struct cpuinfo_x86 set_cpu_cap(c, X86_FEATURE_K8);
if (cpu_has_xmm2) { + unsigned long long val; + int ret; + /* * A serializing LFENCE has less overhead than MFENCE, so * use it for execution serialization. On families which @@ -683,8 +686,19 @@ static void init_amd(struct cpuinfo_x86 msr_set_bit(MSR_F10H_DECFG, MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
- /* MFENCE stops RDTSC speculation */ - set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC); + /* + * Verify that the MSR write was successful (could be running + * under a hypervisor) and only then assume that LFENCE is + * serializing. + */ + ret = rdmsrl_safe(MSR_F10H_DECFG, &val); + if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) { + /* A serializing LFENCE stops RDTSC speculation */ + set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC); + } else { + /* MFENCE stops RDTSC speculation */ + set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC); + } }
#ifdef CONFIG_X86_64
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Masahiro Yamada yamada.masahiro@socionext.com
commit 4f920843d248946545415c1bf6120942048708ed upstream.
The macro MODULE is not a config option, it is a per-file build option. So, config_enabled(MODULE) is not sensible. (There is another case in include/linux/export.h, where config_enabled() is used against a non-config option.)
This commit renames some macros in include/linux/kconfig.h for use with non-config macros and replaces config_enabled(MODULE) with __is_defined(MODULE).
I am keeping config_enabled() because it is still referenced from some places, but I expect it would be deprecated in the future.
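To make the trick concrete, here is a walk-through of the expansion (assuming MODULE is defined to 1 via -DMODULE for module builds, as kbuild does):

	/* Module build: the compiler is passed -DMODULE (i.e. MODULE == 1). */
	__is_defined(MODULE)
	  -> ____is_defined(__ARG_PLACEHOLDER_1)	/* token-pasted */
	  -> __take_second_arg(0, 1, 0)			/* placeholder expands to "0," */
	  -> 1

	/* Built-in build: MODULE is not defined at all. */
	__is_defined(MODULE)
	  -> ____is_defined(__ARG_PLACEHOLDER_MODULE)	/* not a macro, left as-is */
	  -> __take_second_arg(__ARG_PLACEHOLDER_MODULE 1, 0)
	  -> 0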
Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Michal Marek mmarek@suse.com [bwh: Backported to 3.16: drop change in IS_REACHABLE()] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/include/linux/kconfig.h +++ b/include/linux/kconfig.h @@ -17,10 +17,11 @@ * the last step cherry picks the 2nd arg, we get a zero. */ #define __ARG_PLACEHOLDER_1 0, -#define config_enabled(cfg) _config_enabled(cfg) -#define _config_enabled(value) __config_enabled(__ARG_PLACEHOLDER_##value) -#define __config_enabled(arg1_or_junk) ___config_enabled(arg1_or_junk 1, 0) -#define ___config_enabled(__ignored, val, ...) val +#define config_enabled(cfg) ___is_defined(cfg) +#define __is_defined(x) ___is_defined(x) +#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val) +#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0) +#define __take_second_arg(__ignored, val, ...) val
/* * IS_ENABLED(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y' or 'm',
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Andy Lutomirski luto@amacapital.net
commit 83653c16da91112236292871b820cb8b367220e3 upstream.
There's no good reason for it to be a macro, and x86_64 will want to use it, so it should be in a header.
Acked-by: Borislav Petkov bp@suse.de Signed-off-by: Andy Lutomirski luto@amacapital.net Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/thread_info.h | 11 +++++++++++ arch/x86/kernel/irq_32.c | 13 +++---------- 2 files changed, 14 insertions(+), 10 deletions(-)
--- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -170,6 +170,17 @@ static inline struct thread_info *curren return ti; }
+static inline unsigned long current_stack_pointer(void) +{ + unsigned long sp; +#ifdef CONFIG_X86_64 + asm("mov %%rsp,%0" : "=g" (sp)); +#else + asm("mov %%esp,%0" : "=g" (sp)); +#endif + return sp; +} + #else /* !__ASSEMBLY__ */
/* how to get the thread information struct from ASM */ --- a/arch/x86/kernel/irq_32.c +++ b/arch/x86/kernel/irq_32.c @@ -69,16 +69,9 @@ static void call_on_stack(void *func, vo : "memory", "cc", "edx", "ecx", "eax"); }
-/* how to get the current stack pointer from C */ -#define current_stack_pointer ({ \ - unsigned long sp; \ - asm("mov %%esp,%0" : "=g" (sp)); \ - sp; \ -}) - static inline void *current_stack(void) { - return (void *)(current_stack_pointer & ~(THREAD_SIZE - 1)); + return (void *)(current_stack_pointer() & ~(THREAD_SIZE - 1)); }
static inline int @@ -103,7 +96,7 @@ execute_on_irq_stack(int overflow, struc
/* Save the next esp at the bottom of the stack */ prev_esp = (u32 *)irqstk; - *prev_esp = current_stack_pointer; + *prev_esp = current_stack_pointer();
if (unlikely(overflow)) call_on_stack(print_stack_overflow, isp); @@ -156,7 +149,7 @@ void do_softirq_own_stack(void)
/* Push the previous esp onto the stack */ prev_esp = (u32 *)irqstk; - *prev_esp = current_stack_pointer; + *prev_esp = current_stack_pointer();
call_on_stack(__do_softirq, isp); }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit 87590ce6e373d1a5401f6539f0c59ef92dd924a9 upstream.
As the meltdown/spectre problem affects several CPU architectures, it makes sense to have a common way to express whether a system is affected by a particular vulnerability or not. If affected, the way to express the mitigation should be common as well.
Create /sys/devices/system/cpu/vulnerabilities folder and files for meltdown, spectre_v1 and spectre_v2.
Allow architectures to override the show function.
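A minimal sketch of what an architecture-specific override might look like (illustrative only; the real x86 implementation arrives later in this series and differs in detail):

	/* A strong definition in arch code overrides the __weak stub below. */
	ssize_t cpu_show_meltdown(struct device *dev,
				  struct device_attribute *attr, char *buf)
	{
		if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
			return sprintf(buf, "Not affected\n");

		return sprintf(buf, "Vulnerable\n");
	}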
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: Peter Zijlstra peterz@infradead.org Cc: Will Deacon will.deacon@arm.com Cc: Dave Hansen dave.hansen@intel.com Cc: Linus Torvalds torvalds@linuxfoundation.org Cc: Borislav Petkov bp@alien8.de Cc: David Woodhouse dwmw@amazon.co.uk Link: https://lkml.kernel.org/r/20180107214913.096657732@linutronix.de [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- Documentation/ABI/testing/sysfs-devices-system-cpu | 16 ++++++++ drivers/base/Kconfig | 3 ++ drivers/base/cpu.c | 48 ++++++++++++++++++++++ include/linux/cpu.h | 7 ++++ 4 files changed, 74 insertions(+)
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -224,3 +224,19 @@ Description: Parameters for the Intel P- frequency range.
More details can be found in Documentation/cpu-freq/intel-pstate.txt + +What: /sys/devices/system/cpu/vulnerabilities + /sys/devices/system/cpu/vulnerabilities/meltdown + /sys/devices/system/cpu/vulnerabilities/spectre_v1 + /sys/devices/system/cpu/vulnerabilities/spectre_v2 +Date: Januar 2018 +Contact: Linux kernel mailing list linux-kernel@vger.kernel.org +Description: Information about CPU vulnerabilities + + The files are named after the code names of CPU + vulnerabilities. The output of those files reflects the + state of the CPUs in the system. Possible output values: + + "Not affected" CPU is not affected by the vulnerability + "Vulnerable" CPU is affected and no mitigation in effect + "Mitigation: $M" CPU is affetcted and mitigation $M is in effect --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -193,6 +193,9 @@ config GENERIC_CPU_DEVICES config GENERIC_CPU_AUTOPROBE bool
+config GENERIC_CPU_VULNERABILITIES + bool + config SOC_BUS bool
--- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -418,10 +418,58 @@ static void __init cpu_dev_register_gene #endif }
+#ifdef CONFIG_GENERIC_CPU_VULNERABILITIES + +ssize_t __weak cpu_show_meltdown(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "Not affected\n"); +} + +ssize_t __weak cpu_show_spectre_v1(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "Not affected\n"); +} + +ssize_t __weak cpu_show_spectre_v2(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "Not affected\n"); +} + +static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); +static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); +static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); + +static struct attribute *cpu_root_vulnerabilities_attrs[] = { + &dev_attr_meltdown.attr, + &dev_attr_spectre_v1.attr, + &dev_attr_spectre_v2.attr, + NULL +}; + +static const struct attribute_group cpu_root_vulnerabilities_group = { + .name = "vulnerabilities", + .attrs = cpu_root_vulnerabilities_attrs, +}; + +static void __init cpu_register_vulnerabilities(void) +{ + if (sysfs_create_group(&cpu_subsys.dev_root->kobj, + &cpu_root_vulnerabilities_group)) + pr_err("Unable to register CPU vulnerabilities\n"); +} + +#else +static inline void cpu_register_vulnerabilities(void) { } +#endif + void __init cpu_dev_init(void) { if (subsys_system_register(&cpu_subsys, cpu_root_attr_groups)) panic("Failed to register CPU subsystem");
cpu_dev_register_generic(); + cpu_register_vulnerabilities(); } --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -39,6 +39,13 @@ extern void cpu_remove_dev_attr(struct d extern int cpu_add_dev_attr_group(struct attribute_group *attrs); extern void cpu_remove_dev_attr_group(struct attribute_group *attrs);
+extern ssize_t cpu_show_meltdown(struct device *dev, + struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_spectre_v1(struct device *dev, + struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_spectre_v2(struct device *dev, + struct device_attribute *attr, char *buf); + #ifdef CONFIG_HOTPLUG_CPU extern void unregister_cpu(struct cpu *cpu); extern ssize_t arch_cpu_probe(const char *, size_t);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
I ran into a 4.9 build warning in randconfig testing, starting with the KAISER patches:
arch/x86/kernel/ldt.c: In function 'alloc_ldt_struct': arch/x86/include/asm/pgtable_types.h:208:24: error: large integer implicitly truncated to unsigned type [-Werror=overflow] #define __PAGE_KERNEL (__PAGE_KERNEL_EXEC | _PAGE_NX) ^ arch/x86/kernel/ldt.c:81:6: note: in expansion of macro '__PAGE_KERNEL' __PAGE_KERNEL); ^~~~~~~~~~~~~
I originally ran into this last year when the patches were part of linux-next, and tried to work around it by using the proper 'pteval_t' types consistently, but that caused additional problems.
This takes a much simpler approach, and makes the argument type of the dummy helper always 64-bit, which is wide enough for any page table layout and won't hurt since this call is just an empty stub anyway.
Fixes: 8f0baadf2bea ("kaiser: merged update") Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Kees Cook keescook@chromium.org Acked-by: Hugh Dickins hughd@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- include/linux/kaiser.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/kaiser.h b/include/linux/kaiser.h index 58c55b1589d0..b56c19010480 100644 --- a/include/linux/kaiser.h +++ b/include/linux/kaiser.h @@ -32,7 +32,7 @@ static inline void kaiser_init(void) { } static inline int kaiser_add_mapping(unsigned long addr, - unsigned long size, unsigned long flags) + unsigned long size, u64 flags) { return 0; }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Darren Kenny darren.kenny@oracle.com
commit af189c95a371b59f493dbe0f50c0a09724868881 upstream.
Fixes: 117cc7a908c83 ("x86/retpoline: Fill return stack buffer on vmexit") Signed-off-by: Darren Kenny darren.kenny@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Andi Kleen ak@linux.intel.com Cc: Borislav Petkov bp@alien8.de Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Arjan van de Ven arjan@linux.intel.com Cc: David Woodhouse dwmw@amazon.co.uk Link: https://lkml.kernel.org/r/20180202191220.blvgkgutojecxr3b@starbug-vm.ie.orac... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/nospec-branch.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -178,7 +178,7 @@ extern char __indirect_thunk_end[]; * On VMEXIT we must ensure that no RSB predictions learned in the guest * can be followed in the host, by overwriting the RSB completely. Both * retpoline and IBRS mitigations for Spectre v2 need this; only on future - * CPUs with IBRS_ATT *might* it be avoided. + * CPUs with IBRS_ALL *might* it be avoided. */ static inline void vmexit_fill_RSB(void) {
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov bp@suse.de
commit 69df353ff305805fc16082d0c5bfa6e20fa8b863 upstream.
Take a look at the first instruction byte before optimizing the NOP - there might be something else there already, like the ALTERNATIVE_2() in rdtsc_barrier() which NOPs out on AMD even though we just patched in an MFENCE.
This happens because the alternatives code sees X86_FEATURE_MFENCE_RDTSC, which AMD CPUs set, so we patch in the MFENCE; right afterwards it sees X86_FEATURE_LFENCE_RDTSC, which AMD CPUs don't set, and we blindly optimize the NOP.
Checking whether at least the first byte is 0x90 prevents that.
Signed-off-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/1428181662-18020-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/alternative.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -325,6 +325,9 @@ done:
static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr) { + if (instr[0] != 0x90) + return; + add_nops(instr + (a->instrlen - a->padlen), a->padlen);
DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Andy Lutomirski luto@kernel.org
commit 8bf1ebca215c262e48c15a4a15f175991776f57f upstream.
There are multiple call sites that apply forced CPU caps. Factor them into a helper.
Signed-off-by: Andy Lutomirski luto@kernel.org Reviewed-by: Borislav Petkov bp@suse.de Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Fenghua Yu fenghua.yu@intel.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Matthew Whitehead tedheadster@gmail.com Cc: Oleg Nesterov oleg@redhat.com Cc: One Thousand Gnomes gnomes@lxorguk.ukuu.org.uk Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Yu-cheng Yu yu-cheng.yu@intel.com Link: http://lkml.kernel.org/r/623ff7555488122143e4417de09b18be2085ad06.1484705016... Signed-off-by: Ingo Molnar mingo@kernel.org [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/cpu/common.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-)
--- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -670,6 +670,16 @@ void cpu_detect(struct cpuinfo_x86 *c) } }
+static void apply_forced_caps(struct cpuinfo_x86 *c) +{ + int i; + + for (i = 0; i < NCAPINTS; i++) { + c->x86_capability[i] &= ~cpu_caps_cleared[i]; + c->x86_capability[i] |= cpu_caps_set[i]; + } +} + void get_cpu_cap(struct cpuinfo_x86 *c) { u32 tfms, xlvl; @@ -915,10 +925,7 @@ static void identify_cpu(struct cpuinfo_ this_cpu->c_identify(c);
/* Clear/Set all flags overriden by options, after probe */ - for (i = 0; i < NCAPINTS; i++) { - c->x86_capability[i] &= ~cpu_caps_cleared[i]; - c->x86_capability[i] |= cpu_caps_set[i]; - } + apply_forced_caps(c);
#ifdef CONFIG_X86_64 c->apicid = apic->phys_pkg_id(c->initial_apicid, 0); @@ -978,10 +985,7 @@ static void identify_cpu(struct cpuinfo_ * Clear/Set all flags overriden by options, need do it * before following smp all cpus cap AND. */ - for (i = 0; i < NCAPINTS; i++) { - c->x86_capability[i] &= ~cpu_caps_cleared[i]; - c->x86_capability[i] |= cpu_caps_set[i]; - } + apply_forced_caps(c);
/* * On SMP, boot_cpu_data holds the common feature set between
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov bp@suse.de
commit 612e8e9350fd19cae6900cf36ea0c6892d1a0dca upstream.
The alternatives code checks only whether the first byte is a NOP, but with NOPs in front of the payload and actual instructions after them, that "optimized" test breaks.
Make sure to scan all bytes before deciding to optimize the NOPs in there.
Reported-by: David Woodhouse dwmw2@infradead.org Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Andi Kleen ak@linux.intel.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Jiri Kosina jikos@kernel.org Cc: Dave Hansen dave.hansen@intel.com Cc: Andi Kleen andi@firstfloor.org Cc: Andrew Lutomirski luto@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/20180110112815.mgciyf5acwacphkq@pd.tnic Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/alternative.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -326,9 +326,12 @@ done: static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr) { unsigned long flags; + int i;
- if (instr[0] != 0x90) - return; + for (i = 0; i < a->padlen; i++) { + if (instr[i] != 0x90) + return; + }
local_irq_save(flags); add_nops(instr + (a->instrlen - a->padlen), a->padlen);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 694d99d40972f12e59a3696effee8a376b79d7c8 upstream.
AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault.
Disable page table isolation by default on AMD processors by not setting the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI is set.
Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Andy Lutomirski luto@kernel.org Link: https://lkml.kernel.org/r/20171227054354.20369.94587.stgit@tlendack-t1.amdof... [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/cpu/common.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -805,8 +805,8 @@ static void __init early_identify_cpu(st
setup_force_cpu_cap(X86_FEATURE_ALWAYS);
- /* Assume for now that ALL x86 CPUs are insecure */ - setup_force_cpu_bug(X86_BUG_CPU_INSECURE); + if (c->x86_vendor != X86_VENDOR_AMD) + setup_force_cpu_bug(X86_BUG_CPU_INSECURE); }
void __init early_cpu_init(void)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 99c6fa2511d8a683e61468be91b83f85452115fa upstream.
Add the bug bits for spectre v1/2 and force them unconditionally for all cpus.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515239374-23361-2-git-send-email-dwmw@amazon.co.u... [bwh: Backported to 3.16: assign the first available bug numbers] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/cpufeature.h | 2 ++ arch/x86/kernel/cpu/common.c | 3 +++ 2 files changed, 5 insertions(+)
--- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -242,6 +242,8 @@ #define X86_BUG_AMD_TLB_MMATCH X86_BUG(3) /* AMD Erratum 383 */ #define X86_BUG_AMD_APIC_C1E X86_BUG(4) /* AMD Erratum 400 */ #define X86_BUG_CPU_MELTDOWN X86_BUG(5) /* CPU is affected by meltdown attack and needs kernel page table isolation */ +#define X86_BUG_SPECTRE_V1 X86_BUG(6) /* CPU is affected by Spectre variant 1 attack with conditional branches */ +#define X86_BUG_SPECTRE_V2 X86_BUG(7) /* CPU is affected by Spectre variant 2 attack with indirect branches */
#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
--- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -807,6 +807,9 @@ static void __init early_identify_cpu(st
if (c->x86_vendor != X86_VENDOR_AMD) setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN); + + setup_force_cpu_bug(X86_BUG_SPECTRE_V1); + setup_force_cpu_bug(X86_BUG_SPECTRE_V2); }
void __init early_cpu_init(void)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit e4d0e84e490790798691aaa0f2e598637f1867ec upstream.
To aid in speculation control, make LFENCE a serializing instruction since it has less overhead than MFENCE. This is done by setting bit 1 of MSR 0xc0011029 (DE_CFG). Some families that support LFENCE do not have this MSR. For these families, the LFENCE instruction is already serializing.
Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Reviewed-by: Borislav Petkov bp@suse.de Cc: Peter Zijlstra peterz@infradead.org Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Dave Hansen dave.hansen@intel.com Cc: Borislav Petkov bp@alien8.de Cc: Dan Williams dan.j.williams@intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: David Woodhouse dwmw@amazon.co.uk Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/20180108220921.12580.71694.stgit@tlendack-t1.amdof... [bwh: Backported to 3.16: adjust filename, context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/uapi/asm/msr-index.h | 2 ++ arch/x86/kernel/cpu/amd.c | 10 ++++++++++ 2 files changed, 12 insertions(+)
--- a/arch/x86/include/uapi/asm/msr-index.h +++ b/arch/x86/include/uapi/asm/msr-index.h @@ -223,6 +223,8 @@ #define FAM10H_MMIO_CONF_BASE_MASK 0xfffffffULL #define FAM10H_MMIO_CONF_BASE_SHIFT 20 #define MSR_FAM10H_NODE_ID 0xc001100c +#define MSR_F10H_DECFG 0xc0011029 +#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1
/* K8 MSRs */ #define MSR_K8_TOP_MEM1 0xc001001a --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -673,6 +673,16 @@ static void init_amd(struct cpuinfo_x86 set_cpu_cap(c, X86_FEATURE_K8);
if (cpu_has_xmm2) { + /* + * A serializing LFENCE has less overhead than MFENCE, so + * use it for execution serialization. On families which + * don't have that MSR, LFENCE is already serializing. + * msr_set_bit() uses the safe accessors, too, even if the MSR + * is not present. + */ + msr_set_bit(MSR_F10H_DECFG, + MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT); + /* MFENCE stops RDTSC speculation */ set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC); }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit b5c4ae4f35325d520b230bab6eb3310613b72ac1 upstream.
In preparation for converting some __uaccess_begin() instances to __uaccess_begin_nospec(), make sure all 'from user' uaccess paths are using the _begin(), _end() helpers rather than open-coded stac() and clac().
No functional changes.
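For reference, a minimal sketch of the helpers being switched to (assuming their usual definitions): they are one-to-one wrappers around the SMAP instructions, which is why the change is purely mechanical:

	/* Sketch only, assuming the usual definitions of these helpers. */
	#define __uaccess_begin()	stac()	/* allow kernel access to user pages */
	#define __uaccess_end()		clac()	/* forbid it again */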
Suggested-by: Ingo Molnar mingo@redhat.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Kees Cook keescook@chromium.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro viro@zeniv.linux.org.uk Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416438.33451.17309465232057176966.stgit@dwil... [bwh: Backported to 3.16: - Convert several more functions to use __uaccess_begin_nospec(), that are just wrappers in mainline - Adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/lib/usercopy_32.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/arch/x86/lib/usercopy_32.c +++ b/arch/x86/lib/usercopy_32.c @@ -570,12 +570,12 @@ do { \ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned long n) { - stac(); + __uaccess_begin(); if (movsl_is_ok(to, from, n)) __copy_user(to, from, n); else n = __copy_user_intel(to, from, n); - clac(); + __uaccess_end(); return n; } EXPORT_SYMBOL(__copy_to_user_ll); @@ -583,12 +583,12 @@ EXPORT_SYMBOL(__copy_to_user_ll); unsigned long __copy_from_user_ll(void *to, const void __user *from, unsigned long n) { - stac(); + __uaccess_begin(); if (movsl_is_ok(to, from, n)) __copy_user_zeroing(to, from, n); else n = __copy_user_zeroing_intel(to, from, n); - clac(); + __uaccess_end(); return n; } EXPORT_SYMBOL(__copy_from_user_ll); @@ -596,13 +596,13 @@ EXPORT_SYMBOL(__copy_from_user_ll); unsigned long __copy_from_user_ll_nozero(void *to, const void __user *from, unsigned long n) { - stac(); + __uaccess_begin(); if (movsl_is_ok(to, from, n)) __copy_user(to, from, n); else n = __copy_user_intel((void __user *)to, (const void *)from, n); - clac(); + __uaccess_end(); return n; } EXPORT_SYMBOL(__copy_from_user_ll_nozero); @@ -610,7 +610,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nozero unsigned long __copy_from_user_ll_nocache(void *to, const void __user *from, unsigned long n) { - stac(); + __uaccess_begin(); #ifdef CONFIG_X86_INTEL_USERCOPY if (n > 64 && cpu_has_xmm2) n = __copy_user_zeroing_intel_nocache(to, from, n); @@ -619,7 +619,7 @@ unsigned long __copy_from_user_ll_nocach #else __copy_user_zeroing(to, from, n); #endif - clac(); + __uaccess_end(); return n; } EXPORT_SYMBOL(__copy_from_user_ll_nocache); @@ -627,7 +627,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocach unsigned long __copy_from_user_ll_nocache_nozero(void *to, const void __user *from, unsigned long n) { - stac(); + __uaccess_begin(); #ifdef CONFIG_X86_INTEL_USERCOPY if (n > 64 && cpu_has_xmm2) n = __copy_user_intel_nocache(to, from, n); @@ -636,7 +636,7 @@ unsigned long __copy_from_user_ll_nocach #else __copy_user(to, from, n); #endif - clac(); + __uaccess_end(); return n; } EXPORT_SYMBOL(__copy_from_user_ll_nocache_nozero);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit ea08816d5b185ab3d09e95e393f265af54560350 upstream.
Convert indirect call in Xen hypercall to use non-speculative sequence, when CONFIG_RETPOLINE is enabled.
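CALL_NOSPEC comes from <asm/nospec-branch.h>; roughly, it behaves like the sketch below (simplified illustration only; the real macro uses runtime alternatives keyed on X86_FEATURE_RETPOLINE rather than a plain #ifdef):

	/* Simplified sketch of the macro's effect. */
	#ifdef CONFIG_RETPOLINE
	# define CALL_NOSPEC	"call __x86_indirect_thunk_%V[thunk_target]\n"
	#else
	# define CALL_NOSPEC	"call *%[thunk_target]\n"
	#endif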
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Arjan van de Ven arjan@linux.intel.com Acked-by: Ingo Molnar mingo@kernel.org Reviewed-by: Juergen Gross jgross@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515707194-20531-10-git-send-email-dwmw@amazon.co.... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/xen/hypercall.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/xen/hypercall.h +++ b/arch/x86/include/asm/xen/hypercall.h @@ -44,6 +44,7 @@ #include <asm/page.h> #include <asm/pgtable.h> #include <asm/smap.h> +#include <asm/nospec-branch.h>
#include <xen/interface/xen.h> #include <xen/interface/sched.h> @@ -215,9 +216,9 @@ privcmd_call(unsigned call, __HYPERCALL_5ARG(a1, a2, a3, a4, a5);
stac(); - asm volatile("call *%[call]" + asm volatile(CALL_NOSPEC : __HYPERCALL_5PARAM - : [call] "a" (&hypercall_page[call]) + : [thunk_target] "a" (&hypercall_page[call]) : __HYPERCALL_CLOBBER5); clac();
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit babdde2698d482b6c0de1eab4f697cf5856c5859 upstream.
array_index_nospec() uses a mask to sanitize user-controllable array indexes, i.e. generate a 0 mask if 'index' >= 'size', and a ~0 mask otherwise. The default array_index_mask_nospec() handles the carry-bit from the (index - size) result in software.
The x86 array_index_mask_nospec() does the same, but the carry-bit is handled in the processor CF flag without conditional instructions in the control flow.
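As an illustration of how the mask is consumed (the caller-side names here are hypothetical; array_index_nospec() is the generic wrapper added elsewhere in this series):

	/* Hypothetical caller: clamp a user-controlled index under speculation. */
	if (index < size) {
		index = array_index_nospec(index, size);	/* index ANDed with the mask */
		val = array[index];
	}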
Suggested-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727414808.33451.1873237130672785331.stgit@dwill... [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/barrier.h | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
--- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -25,6 +25,30 @@ #endif
/** + * array_index_mask_nospec() - generate a mask that is ~0UL when the + * bounds check succeeds and 0 otherwise + * @index: array element index + * @size: number of elements in array + * + * Returns: + * 0 - (index < size) + */ +static inline unsigned long array_index_mask_nospec(unsigned long index, + unsigned long size) +{ + unsigned long mask; + + asm ("cmp %1,%2; sbb %0,%0;" + :"=r" (mask) + :"r"(size),"r" (index) + :"cc"); + return mask; +} + +/* Override the default implementation from linux/nospec.h. */ +#define array_index_mask_nospec array_index_mask_nospec + +/** * read_barrier_depends - Flush all pending reads that subsequents reads * depend on. *
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit 7614e913db1f40fff819b36216484dc3808995d4 upstream.
Convert all indirect jumps in 32bit irq inline asm code to use non speculative sequences.
Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Arjan van de Ven arjan@linux.intel.com Acked-by: Ingo Molnar mingo@kernel.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515707194-20531-12-git-send-email-dwmw@amazon.co.... [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/irq_32.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/arch/x86/kernel/irq_32.c +++ b/arch/x86/kernel/irq_32.c @@ -20,6 +20,7 @@ #include <linux/mm.h>
#include <asm/apic.h> +#include <asm/nospec-branch.h>
DEFINE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); EXPORT_PER_CPU_SYMBOL(irq_stat); @@ -61,11 +62,11 @@ DEFINE_PER_CPU(struct irq_stack *, softi static void call_on_stack(void *func, void *stack) { asm volatile("xchgl %%ebx,%%esp \n" - "call *%%edi \n" + CALL_NOSPEC "movl %%ebx,%%esp \n" : "=b" (stack) : "0" (stack), - "D"(func) + [thunk_target] "D"(func) : "memory", "cc", "edx", "ecx", "eax"); }
@@ -102,11 +103,11 @@ execute_on_irq_stack(int overflow, struc call_on_stack(print_stack_overflow, isp);
asm volatile("xchgl %%ebx,%%esp \n" - "call *%%edi \n" + CALL_NOSPEC "movl %%ebx,%%esp \n" : "=a" (arg1), "=d" (arg2), "=b" (isp) : "0" (irq), "1" (desc), "2" (isp), - "D" (desc->handle_irq) + [thunk_target] "D" (desc->handle_irq) : "memory", "cc", "ecx"); return 1; }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit c86a32c09f8ced67971a2310e3b0dda4d1749007 upstream.
Since indirect jump instructions will be replaced by jumps to __x86_indirect_thunk_*, those jmp instructions must be treated as indirect jumps. Since optprobe prohibits optimizing probes in functions which use an indirect jump, it also needs to detect functions which jump to __x86_indirect_thunk_* and disable their optimization.
Add a check that the jump target address is between the __indirect_thunk_start/end when optimizing kprobe.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: Andi Kleen ak@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Ananth N Mavinakayanahalli ananth@linux.vnet.ibm.com Cc: Arjan van de Ven arjan@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/151629212062.10241.6991266100233002273.stgit@devbo... [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/kprobes/opt.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/kprobes/opt.c +++ b/arch/x86/kernel/kprobes/opt.c @@ -36,6 +36,7 @@ #include <asm/alternative.h> #include <asm/insn.h> #include <asm/debugreg.h> +#include <asm/nospec-branch.h>
#include "common.h"
@@ -191,7 +192,7 @@ static int copy_optimized_instructions(u }
/* Check whether insn is indirect jump */ -static int insn_is_indirect_jump(struct insn *insn) +static int __insn_is_indirect_jump(struct insn *insn) { return ((insn->opcode.bytes[0] == 0xff && (X86_MODRM_REG(insn->modrm.value) & 6) == 4) || /* Jump */ @@ -225,6 +226,26 @@ static int insn_jump_into_range(struct i return (start <= target && target <= start + len); }
+static int insn_is_indirect_jump(struct insn *insn) +{ + int ret = __insn_is_indirect_jump(insn); + +#ifdef CONFIG_RETPOLINE + /* + * Jump to x86_indirect_thunk_* is treated as an indirect jump. + * Note that even with CONFIG_RETPOLINE=y, the kernel compiled with + * older gcc may use indirect jump. So we add this check instead of + * replace indirect-jump check. + */ + if (!ret) + ret = insn_jump_into_range(insn, + (unsigned long)__indirect_thunk_start, + (unsigned long)__indirect_thunk_end - + (unsigned long)__indirect_thunk_start); +#endif + return ret; +} + /* Decode whole function to ensure any instructions don't jump into target */ static int can_optimize(unsigned long paddr) {
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit 304ec1b050310548db33063e567123fae8fd0301 upstream.
Quoting Linus:
I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache.
__uaccess_begin_nospec() covers __get_user() and copy_from_iter() where the limit check is far away from the user pointer de-reference. In those cases a barrier_nospec() prevents speculation with a potential pointer to privileged memory. uaccess_try_nospec covers get_user_try.
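A minimal sketch of the helper (assuming the shape it has in this series): __uaccess_begin() plus a speculation barrier, so the user pointer cannot be dereferenced speculatively before the preceding limit check resolves:

	#define __uaccess_begin_nospec()	\
	({					\
		stac();				\
		barrier_nospec();		\
	})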
Suggested-by: Linus Torvalds torvalds@linux-foundation.org Suggested-by: Andi Kleen ak@linux.intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: Kees Cook keescook@chromium.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro viro@zeniv.linux.org.uk Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416953.33451.10508284228526170604.stgit@dwil... [bwh: Backported to 3.16: - Convert several more functions to use __uaccess_begin_nospec(), that are just wrappers in mainline - There's no 'case 8' in __copy_to_user_inatomic() - Adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -433,7 +433,7 @@ do { \ ({ \ int __gu_err; \ unsigned long __gu_val; \ - __uaccess_begin(); \ + __uaccess_begin_nospec(); \ __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT); \ __uaccess_end(); \ (x) = (__force __typeof__(*(ptr)))__gu_val; \ @@ -541,7 +541,7 @@ struct __large_struct { unsigned long bu * get_user_ex(...); * } get_user_catch(err) */ -#define get_user_try uaccess_try +#define get_user_try uaccess_try_nospec #define get_user_catch(err) uaccess_catch(err)
#define get_user_ex(x, ptr) do { \ @@ -576,7 +576,7 @@ extern void __cmpxchg_wrong_size(void) __typeof__(ptr) __uval = (uval); \ __typeof__(*(ptr)) __old = (old); \ __typeof__(*(ptr)) __new = (new); \ - __uaccess_begin(); \ + __uaccess_begin_nospec(); \ switch (size) { \ case 1: \ { \ --- a/arch/x86/include/asm/uaccess_32.h +++ b/arch/x86/include/asm/uaccess_32.h @@ -48,19 +48,19 @@ __copy_to_user_inatomic(void __user *to,
switch (n) { case 1: - __uaccess_begin(); + __uaccess_begin_nospec(); __put_user_size(*(u8 *)from, (u8 __user *)to, 1, ret, 1); __uaccess_end(); return ret; case 2: - __uaccess_begin(); + __uaccess_begin_nospec(); __put_user_size(*(u16 *)from, (u16 __user *)to, 2, ret, 2); __uaccess_end(); return ret; case 4: - __uaccess_begin(); + __uaccess_begin_nospec(); __put_user_size(*(u32 *)from, (u32 __user *)to, 4, ret, 4); __uaccess_end(); @@ -104,17 +104,17 @@ __copy_from_user_inatomic(void *to, cons
switch (n) { case 1: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u8 *)to, from, 1, ret, 1); __uaccess_end(); return ret; case 2: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u16 *)to, from, 2, ret, 2); __uaccess_end(); return ret; case 4: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u32 *)to, from, 4, ret, 4); __uaccess_end(); return ret; @@ -154,17 +154,17 @@ __copy_from_user(void *to, const void __
switch (n) { case 1: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u8 *)to, from, 1, ret, 1); __uaccess_end(); return ret; case 2: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u16 *)to, from, 2, ret, 2); __uaccess_end(); return ret; case 4: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u32 *)to, from, 4, ret, 4); __uaccess_end(); return ret; @@ -182,17 +182,17 @@ static __always_inline unsigned long __c
switch (n) { case 1: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u8 *)to, from, 1, ret, 1); __uaccess_end(); return ret; case 2: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u16 *)to, from, 2, ret, 2); __uaccess_end(); return ret; case 4: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_size(*(u32 *)to, from, 4, ret, 4); __uaccess_end(); return ret; --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -57,31 +57,31 @@ int __copy_from_user_nocheck(void *dst, return copy_user_generic(dst, (__force void *)src, size); switch (size) { case 1: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(*(u8 *)dst, (u8 __user *)src, ret, "b", "b", "=q", 1); __uaccess_end(); return ret; case 2: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(*(u16 *)dst, (u16 __user *)src, ret, "w", "w", "=r", 2); __uaccess_end(); return ret; case 4: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(*(u32 *)dst, (u32 __user *)src, ret, "l", "k", "=r", 4); __uaccess_end(); return ret; case 8: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(*(u64 *)dst, (u64 __user *)src, ret, "q", "", "=r", 8); __uaccess_end(); return ret; case 10: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(*(u64 *)dst, (u64 __user *)src, ret, "q", "", "=r", 10); if (likely(!ret)) @@ -91,7 +91,7 @@ int __copy_from_user_nocheck(void *dst, __uaccess_end(); return ret; case 16: - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(*(u64 *)dst, (u64 __user *)src, ret, "q", "", "=r", 16); if (likely(!ret)) @@ -190,7 +190,7 @@ int __copy_in_user(void __user *dst, con switch (size) { case 1: { u8 tmp; - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(tmp, (u8 __user *)src, ret, "b", "b", "=q", 1); if (likely(!ret)) @@ -201,7 +201,7 @@ int __copy_in_user(void __user *dst, con } case 2: { u16 tmp; - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(tmp, (u16 __user *)src, ret, "w", "w", "=r", 2); if (likely(!ret)) @@ -213,7 +213,7 @@ int __copy_in_user(void __user *dst, con
case 4: { u32 tmp; - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(tmp, (u32 __user *)src, ret, "l", "k", "=r", 4); if (likely(!ret)) @@ -224,7 +224,7 @@ int __copy_in_user(void __user *dst, con } case 8: { u64 tmp; - __uaccess_begin(); + __uaccess_begin_nospec(); __get_user_asm(tmp, (u64 __user *)src, ret, "q", "", "=r", 8); if (likely(!ret)) --- a/arch/x86/lib/usercopy_32.c +++ b/arch/x86/lib/usercopy_32.c @@ -570,7 +570,7 @@ do { \ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned long n) { - __uaccess_begin(); + __uaccess_begin_nospec(); if (movsl_is_ok(to, from, n)) __copy_user(to, from, n); else @@ -583,7 +583,7 @@ EXPORT_SYMBOL(__copy_to_user_ll); unsigned long __copy_from_user_ll(void *to, const void __user *from, unsigned long n) { - __uaccess_begin(); + __uaccess_begin_nospec(); if (movsl_is_ok(to, from, n)) __copy_user_zeroing(to, from, n); else @@ -596,7 +596,7 @@ EXPORT_SYMBOL(__copy_from_user_ll); unsigned long __copy_from_user_ll_nozero(void *to, const void __user *from, unsigned long n) { - __uaccess_begin(); + __uaccess_begin_nospec(); if (movsl_is_ok(to, from, n)) __copy_user(to, from, n); else @@ -610,7 +610,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nozero unsigned long __copy_from_user_ll_nocache(void *to, const void __user *from, unsigned long n) { - __uaccess_begin(); + __uaccess_begin_nospec(); #ifdef CONFIG_X86_INTEL_USERCOPY if (n > 64 && cpu_has_xmm2) n = __copy_user_zeroing_intel_nocache(to, from, n); @@ -627,7 +627,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocach unsigned long __copy_from_user_ll_nocache_nozero(void *to, const void __user *from, unsigned long n) { - __uaccess_begin(); + __uaccess_begin_nospec(); #ifdef CONFIG_X86_INTEL_USERCOPY if (n > 64 && cpu_has_xmm2) n = __copy_user_intel_nocache(to, from, n);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit c7f631cb07e7da06ac1d231ca178452339e32a94 upstream.
Quoting Linus:
I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache.
Unlike the __get_user() case get_user() includes the address limit check near the pointer de-reference. With that locality the speculation can be mitigated with pointer narrowing rather than a barrier, i.e. array_index_nospec(). Where the narrowing is performed by:
	cmp %limit, %ptr
	sbb %mask, %mask
	and %mask, %ptr
With respect to speculation the value of %ptr is either less than %limit or NULL.
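For illustration, the same narrowing expressed in C (a sketch only, not kernel code; 'limit' stands in for the task's addr_limit):

	/*
	 * Rough C equivalent of the cmp/sbb/and sequence above: the mask is
	 * ~0UL when ptr < limit (the compare borrows) and 0UL otherwise, so
	 * the pointer is either left intact or forced to NULL before use.
	 */
	static inline unsigned long mask_user_ptr(unsigned long ptr,
						  unsigned long limit)
	{
		unsigned long mask = 0UL - (ptr < limit);

		return ptr & mask;
	}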
Co-developed-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: Kees Cook keescook@chromium.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro viro@zeniv.linux.org.uk Cc: Andy Lutomirski luto@kernel.org Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727417469.33451.11804043010080838495.stgit@dwil... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/lib/getuser.S | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/arch/x86/lib/getuser.S +++ b/arch/x86/lib/getuser.S @@ -40,6 +40,8 @@ ENTRY(__get_user_1) GET_THREAD_INFO(%_ASM_DX) cmp TI_addr_limit(%_ASM_DX),%_ASM_AX jae bad_get_user + sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */ + and %_ASM_DX, %_ASM_AX ASM_STAC 1: movzbl (%_ASM_AX),%edx xor %eax,%eax @@ -55,6 +57,8 @@ ENTRY(__get_user_2) GET_THREAD_INFO(%_ASM_DX) cmp TI_addr_limit(%_ASM_DX),%_ASM_AX jae bad_get_user + sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */ + and %_ASM_DX, %_ASM_AX ASM_STAC 2: movzwl -1(%_ASM_AX),%edx xor %eax,%eax @@ -70,6 +74,8 @@ ENTRY(__get_user_4) GET_THREAD_INFO(%_ASM_DX) cmp TI_addr_limit(%_ASM_DX),%_ASM_AX jae bad_get_user + sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */ + and %_ASM_DX, %_ASM_AX ASM_STAC 3: movl -3(%_ASM_AX),%edx xor %eax,%eax @@ -86,6 +92,8 @@ ENTRY(__get_user_8) GET_THREAD_INFO(%_ASM_DX) cmp TI_addr_limit(%_ASM_DX),%_ASM_AX jae bad_get_user + sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */ + and %_ASM_DX, %_ASM_AX ASM_STAC 4: movq -7(%_ASM_AX),%rdx xor %eax,%eax @@ -97,6 +105,8 @@ ENTRY(__get_user_8) GET_THREAD_INFO(%_ASM_DX) cmp TI_addr_limit(%_ASM_DX),%_ASM_AX jae bad_get_user_8 + sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */ + and %_ASM_DX, %_ASM_AX ASM_STAC 4: movl -7(%_ASM_AX),%edx 5: movl -3(%_ASM_AX),%ecx
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 2641f08bb7fc63a636a2b18173221d7040a3512e upstream.
Convert indirect jumps in core 32/64bit entry assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled.
Don't use CALL_NOSPEC in entry_SYSCALL_64_fastpath because the return address after the 'call' instruction must be *precisely* at the .Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work, and the use of alternatives will mess that up unless we play horrid games to prepend with NOPs and make the variants the same length. It's not worth it; in the case where we ALTERNATIVE out the retpoline, the first instruction at __x86.indirect_thunk.rax is going to be a bare jmp *%rax anyway.
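For reference, the thunk mentioned above is (roughly) the standard retpoline sequence; the sketch below is simplified and the label names are illustrative -- the real body lives in arch/x86/lib/retpoline.S, and when the retpoline alternative is not patched in at runtime the thunk body is just the bare 'jmp *%rax' referred to above:

	__x86_indirect_thunk_rax:
		call	.Ldo_call		/* push a benign return address */
	.Lspec_trap:
		pause				/* capture any speculation here */
		lfence
		jmp	.Lspec_trap
	.Ldo_call:
		mov	%rax, (%rsp)		/* overwrite return address with target */
		ret				/* "return" to *%rax without an indirect jmp */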
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Ingo Molnar mingo@kernel.org Acked-by: Arjan van de Ven arjan@linux.intel.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515707194-20531-7-git-send-email-dwmw@amazon.co.u... Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Razvan Ghitulete rga@amazon.de [bwh: Backported to 3.16: adjust filenames, context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/kernel/entry_32.S +++ b/arch/x86/kernel/entry_32.S @@ -58,6 +58,7 @@ #include <asm/alternative-asm.h> #include <asm/asm.h> #include <asm/smap.h> +#include <asm/nospec-branch.h>
/* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this. */ #include <linux/elf-em.h> @@ -308,7 +309,8 @@ ENTRY(ret_from_kernel_thread) pushl_cfi $0x0202 # Reset kernel eflags popfl_cfi movl PT_EBP(%esp),%eax - call *PT_EBX(%esp) + movl PT_EBX(%esp), %edx + CALL_NOSPEC %edx movl $0,PT_EAX(%esp) jmp syscall_exit CFI_ENDPROC @@ -1277,7 +1279,7 @@ error_code: movl %ecx, %es TRACE_IRQS_OFF movl %esp,%eax # pt_regs pointer - call *%edi + CALL_NOSPEC %edi jmp ret_from_exception CFI_ENDPROC END(page_fault) --- a/arch/x86/kernel/entry_64.S +++ b/arch/x86/kernel/entry_64.S @@ -59,6 +59,7 @@ #include <asm/smap.h> #include <asm/pgtable_types.h> #include <asm/kaiser.h> +#include <asm/nospec-branch.h> #include <linux/err.h>
/* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this. */ @@ -375,7 +376,7 @@ ENTRY(ret_from_fork) subq $REST_SKIP, %rsp # leave space for volatiles CFI_ADJUST_CFA_OFFSET REST_SKIP movq %rbp, %rdi - call *%rbx + CALL_NOSPEC %rbx movl $0, RAX(%rsp) RESTORE_REST jmp int_ret_from_sys_call @@ -451,7 +452,12 @@ system_call_fastpath: #endif ja badsys movq %r10,%rcx +#ifdef CONFIG_RETPOLINE + movq sys_call_table(, %rax, 8), %rax + call __x86_indirect_thunk_rax +#else call *sys_call_table(,%rax,8) # XXX: rip relative +#endif movq %rax,RAX-ARGOFFSET(%rsp) /* * Syscall return path ending with SYSRET (fast path) @@ -578,7 +584,12 @@ tracesys: #endif ja int_ret_from_sys_call /* RAX(%rsp) set to -ENOSYS above */ movq %r10,%rcx /* fixup for C */ +#ifdef CONFIG_RETPOLINE + movq sys_call_table(, %rax, 8), %rax + call __x86_indirect_thunk_rax +#else call *sys_call_table(,%rax,8) +#endif movq %rax,RAX-ARGOFFSET(%rsp) /* Use IRET because user could have changed frame */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Josh Poimboeuf jpoimboe@redhat.com
commit 12c69f1e94c89d40696e83804dd2f0965b5250cd upstream.
The 'noreplace-paravirt' option disables paravirt patching, leaving the original pv indirect calls in place.
That's highly incompatible with retpolines, unless we want to uglify paravirt even further and convert the paravirt calls to retpolines.
As far as I can tell, the option doesn't seem to be useful for much other than introducing surprising corner cases and making the kernel vulnerable to Spectre v2. It was probably a debug option from the early paravirt days. So just remove it.
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Juergen Gross jgross@suse.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Andi Kleen ak@linux.intel.com Cc: Ashok Raj ashok.raj@intel.com Cc: Greg KH gregkh@linuxfoundation.org Cc: Jun Nakajima jun.nakajima@intel.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Rusty Russell rusty@rustcorp.com.au Cc: Dave Hansen dave.hansen@intel.com Cc: Asit Mallick asit.k.mallick@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jason Baron jbaron@akamai.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Alok Kataria akataria@vmware.com Cc: Arjan Van De Ven arjan.van.de.ven@intel.com Cc: David Woodhouse dwmw2@infradead.org Cc: Dan Williams dan.j.williams@intel.com Link: https://lkml.kernel.org/r/20180131041333.2x6blhxirc2kclrq@treble [bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- Documentation/kernel-parameters.txt | 2 -- arch/x86/kernel/alternative.c | 14 -------------- 2 files changed, 16 deletions(-)
--- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2271,8 +2271,6 @@ bytes respectively. Such letter suffixes norandmaps Don't use address space randomization. Equivalent to echo 0 > /proc/sys/kernel/randomize_va_space
- noreplace-paravirt [X86,IA-64,PV_OPS] Don't patch paravirt_ops - noreplace-smp [X86-32,SMP] Don't replace SMP instructions with UP alternatives
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -41,17 +41,6 @@ static int __init setup_noreplace_smp(ch } __setup("noreplace-smp", setup_noreplace_smp);
-#ifdef CONFIG_PARAVIRT -static int __initdata_or_module noreplace_paravirt = 0; - -static int __init setup_noreplace_paravirt(char *str) -{ - noreplace_paravirt = 1; - return 1; -} -__setup("noreplace-paravirt", setup_noreplace_paravirt); -#endif - #define DPRINTK(fmt, args...) \ do { \ if (debug_alternative) \ @@ -574,9 +563,6 @@ void __init_or_module apply_paravirt(str struct paravirt_patch_site *p; char insnbuf[MAX_PATCH_LEN];
- if (noreplace_paravirt) - return; - for (p = start; p < end; p++) { unsigned int used;
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit eb6174f6d1be16b19cfa43dac296bfed003ce1a6 upstream.
The nospec.h header expects the per-architecture header file <asm/barrier.h> to optionally define array_index_mask_nospec(). Include that dependency to prevent inadvertent fallback to the default array_index_mask_nospec() implementation.
The default implementation may not provide a full mitigation on architectures that perform data value speculation.
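For context, the default implementation referred to here is (roughly) the branchless mask below; whether that is sufficient depends on the architecture, which is exactly why <asm/barrier.h> is allowed to override it:

	/* Sketch of the generic fallback in <linux/nospec.h>: returns ~0UL
	 * when index < size and 0UL otherwise, without a conditional branch. */
	static inline unsigned long array_index_mask_nospec(unsigned long index,
							    unsigned long size)
	{
		return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
	}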
Reported-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Dan Williams dan.j.williams@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Arjan van de Ven arjan@linux.intel.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: David Woodhouse dwmw2@infradead.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will.deacon@arm.com Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/151881605404.17395.1341935530792574707.stgit@dwilli... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- include/linux/nospec.h | 1 + 1 file changed, 1 insertion(+)
--- a/include/linux/nospec.h +++ b/include/linux/nospec.h @@ -5,6 +5,7 @@
#ifndef _LINUX_NOSPEC_H #define _LINUX_NOSPEC_H +#include <asm/barrier.h>
/** * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Ben Hutchings ben@decadent.org.uk
commit 2fbd7af5af8665d18bcefae3e9700be07e22b681 upstream.
The upstream version of this, touching C code, was written by Dan Williams, with the following description:
The syscall table base is a user controlled function pointer in kernel space. Use array_index_nospec() to prevent any out of bounds speculation.
While retpoline prevents speculating into a userspace directed target it does not stop the pointer de-reference, the concern is leaking memory relative to the syscall table base, by observing instruction cache behavior.
The x86_64 assembly version for 4.4 was written by Jiri Slaby, with the following description:
In 4.4.118, we have commit c8961332d6da (x86/syscall: Sanitize syscall table de-references under speculation), which is a backport of upstream commit 2fbd7af5af86. But it fixed only the C part of the upstream patch -- the IA32 sysentry. So it completely omitted the assembly part -- the 64-bit sysentry.
Fix that in this patch by explicit array_index_mask_nospec written in assembly. The same was used in lib/getuser.S.
However, to have "sbb" working properly, we have to switch from "cmp" against (NR_syscalls-1) to (NR_syscalls), otherwise the last syscall number would be "and"ed by 0. It is because the original "ja" relies on "CF" or "ZF", but we rely only on "CF" in "sbb". That means: switch to "jae" conditional jump too.
Final note: use rcx for mask as this is exactly what is overwritten by the 4th syscall argument (r10) right after.
In 3.16 the x86_32 syscall table lookup is also written in assembly. So I've taken Jiri's version and added similar masking in entry_32.S, using edx as the temporary. edx is clobbered by SAVE_REGS and seems to be free at this point.
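To illustrate the boundary case with made-up numbers: if NR_syscalls were 300 and %rax held 299 (the last valid syscall), "cmp $300, %rax" borrows, so CF=1, "jae" falls through and "sbb" produces an all-ones mask that leaves %rax unchanged. Comparing against 299 instead would clear CF for that value, the sbb mask would be zero and the last syscall number would be ANDed down to 0, even though the original "ja" would still correctly fall through because ZF is set.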
Cc: Dan Williams dan.j.williams@intel.com Cc: Jiri Slaby jslaby@suse.cz Cc: Jan Beulich JBeulich@suse.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Andy Lutomirski luto@kernel.org Cc: alan@linux.intel.com Cc: Jinpu Wang jinpu.wang@profitbricks.com Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/kernel/entry_32.S +++ b/arch/x86/kernel/entry_32.S @@ -426,6 +426,8 @@ sysenter_past_esp: sysenter_do_call: cmpl $(NR_syscalls), %eax jae sysenter_badsys + sbb %edx, %edx /* array_index_mask_nospec() */ + and %edx, %eax call *sys_call_table(,%eax,4) sysenter_after_call: movl %eax,PT_EAX(%esp) @@ -503,6 +505,8 @@ ENTRY(system_call) cmpl $(NR_syscalls), %eax jae syscall_badsys syscall_call: + sbb %edx, %edx /* array_index_mask_nospec() */ + and %edx, %eax call *sys_call_table(,%eax,4) syscall_after_call: movl %eax,PT_EAX(%esp) # store the return value --- a/arch/x86/kernel/entry_64.S +++ b/arch/x86/kernel/entry_64.S @@ -445,12 +445,14 @@ GLOBAL(system_call_after_swapgs) jnz tracesys system_call_fastpath: #if __SYSCALL_MASK == ~0 - cmpq $__NR_syscall_max,%rax + cmpq $NR_syscalls, %rax #else andl $__SYSCALL_MASK,%eax - cmpl $__NR_syscall_max,%eax + cmpl $NR_syscalls, %eax #endif - ja badsys + jae badsys + sbb %rcx, %rcx /* array_index_mask_nospec() */ + and %rcx, %rax movq %r10,%rcx #ifdef CONFIG_RETPOLINE movq sys_call_table(, %rax, 8), %rax @@ -577,12 +579,14 @@ tracesys: LOAD_ARGS ARGOFFSET, 1 RESTORE_REST #if __SYSCALL_MASK == ~0 - cmpq $__NR_syscall_max,%rax + cmpq $NR_syscalls, %rax #else andl $__SYSCALL_MASK,%eax - cmpl $__NR_syscall_max,%eax + cmpl $NR_syscalls, %eax #endif - ja int_ret_from_sys_call /* RAX(%rsp) set to -ENOSYS above */ + jae int_ret_from_sys_call /* RAX(%rsp) set to -ENOSYS above */ + sbb %rcx, %rcx /* array_index_mask_nospec() */ + and %rcx, %rax movq %r10,%rcx /* fixup for C */ #ifdef CONFIG_RETPOLINE movq sys_call_table(, %rax, 8), %rax
On 03/12/2018, 04:06 AM, Ben Hutchings wrote:
In 3.16 the x86_32 syscall table lookup is also written in assembly. So I've taken Jiri's version and added similar masking in entry_32.S, using edx as the temporary. edx is clobbered by SAVE_REGS and seems to be free at this point.
I don't know the state in 3.16, but in 3.12, I had to fix the 32bit entry on 64bit in arch/x86/ia32/ia32entry.S (ia32_sysenter_target & others) too.
thanks,
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit 61dc0f555b5c761cdafb0ba5bd41ecf22d68a4c4 upstream.
Implement the CPU vulnerability show functions for meltdown, spectre_v1 and spectre_v2.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: Peter Zijlstra peterz@infradead.org Cc: Will Deacon will.deacon@arm.com Cc: Dave Hansen dave.hansen@intel.com Cc: Linus Torvalds torvalds@linuxfoundation.org Cc: Borislav Petkov bp@alien8.de Cc: David Woodhouse dwmw@amazon.co.uk Link: https://lkml.kernel.org/r/20180107214913.177414879@linutronix.de [bwh: Backported to 3.16: - Meltdown mitigation feature flag is KAISER - Adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/Kconfig | 1 + arch/x86/kernel/cpu/bugs.c | 29 +++++++++++++++++++++++++++++ 2 files changed, 30 insertions(+)
--- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -130,6 +130,7 @@ config X86 select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_64 select HAVE_CC_STACKPROTECTOR select GENERIC_CPU_AUTOPROBE + select GENERIC_CPU_VULNERABILITIES select HAVE_ARCH_AUDITSYSCALL select ARCH_SUPPORTS_ATOMIC_RMW
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -9,6 +9,7 @@ */ #include <linux/init.h> #include <linux/utsname.h> +#include <linux/cpu.h> #include <asm/bugs.h> #include <asm/processor.h> #include <asm/processor-flags.h> @@ -122,3 +123,31 @@ void __init check_bugs(void) set_memory_4k((unsigned long)__va(0), 1); #endif } + +#ifdef CONFIG_SYSFS +ssize_t cpu_show_meltdown(struct device *dev, + struct device_attribute *attr, char *buf) +{ + if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN)) + return sprintf(buf, "Not affected\n"); + if (boot_cpu_has(X86_FEATURE_KAISER)) + return sprintf(buf, "Mitigation: PTI\n"); + return sprintf(buf, "Vulnerable\n"); +} + +ssize_t cpu_show_spectre_v1(struct device *dev, + struct device_attribute *attr, char *buf) +{ + if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1)) + return sprintf(buf, "Not affected\n"); + return sprintf(buf, "Vulnerable\n"); +} + +ssize_t cpu_show_spectre_v2(struct device *dev, + struct device_attribute *attr, char *buf) +{ + if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) + return sprintf(buf, "Not affected\n"); + return sprintf(buf, "Vulnerable\n"); +} +#endif
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit edfbae53dab8348fca778531be9f4855d2ca0360 upstream.
Reflect the presence of get_user(), __get_user(), and 'syscall' protections in sysfs. The expectation is that new and better tooling will allow the kernel to grow more usages of array_index_nospec(), for now, only claim mitigation for __user pointer de-references.
Reported-by: Jiri Slaby jslaby@suse.cz Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727420158.33451.11658324346540434635.stgit@dwil... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/cpu/bugs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -353,7 +353,7 @@ ssize_t cpu_show_spectre_v1(struct devic { if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1)) return sprintf(buf, "Not affected\n"); - return sprintf(buf, "Vulnerable\n"); + return sprintf(buf, "Mitigation: __user pointer sanitization\n"); }
ssize_t cpu_show_spectre_v2(struct device *dev,
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 9ecccfaa7cb5249bd31bdceb93fcf5bedb8a24d8 upstream.
Fixes: 87590ce6e ("sysfs/cpu: Add vulnerability folder") Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Ben Hutchings ben@decadent.org.uk --- Documentation/ABI/testing/sysfs-devices-system-cpu | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -229,7 +229,7 @@ What: /sys/devices/system/cpu/vulnerabi /sys/devices/system/cpu/vulnerabilities/meltdown /sys/devices/system/cpu/vulnerabilities/spectre_v1 /sys/devices/system/cpu/vulnerabilities/spectre_v2 -Date: Januar 2018 +Date: January 2018 Contact: Linux kernel mailing list linux-kernel@vger.kernel.org Description: Information about CPU vulnerabilities
@@ -239,4 +239,4 @@ Description: Information about CPU vulne
"Not affected" CPU is not affected by the vulnerability "Vulnerable" CPU is affected and no mitigation in effect - "Mitigation: $M" CPU is affetcted and mitigation $M is in effect + "Mitigation: $M" CPU is affected and mitigation $M is in effect
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: KarimAllah Ahmed karahmed@amazon.de
commit 9005c6834c0ffdfe46afa76656bd9276cca864f6 upstream.
[dwmw2: Use ARRAY_SIZE]
Signed-off-by: KarimAllah Ahmed karahmed@amazon.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517484441-1420-3-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kernel/cpu/bugs.c | 86 ++++++++++++++++++++++++++++++---------------- 1 file changed, 56 insertions(+), 30 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -181,13 +181,13 @@ static inline const char *spectre_v2_mod static void __init spec2_print_if_insecure(const char *reason) { if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) - pr_info("%s\n", reason); + pr_info("%s selected on command line.\n", reason); }
static void __init spec2_print_if_secure(const char *reason) { if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) - pr_info("%s\n", reason); + pr_info("%s selected on command line.\n", reason); }
static inline bool retp_compiler(void) @@ -202,42 +202,68 @@ static inline bool match_option(const ch return len == arglen && !strncmp(arg, opt, len); }
+static const struct { + const char *option; + enum spectre_v2_mitigation_cmd cmd; + bool secure; +} mitigation_options[] = { + { "off", SPECTRE_V2_CMD_NONE, false }, + { "on", SPECTRE_V2_CMD_FORCE, true }, + { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false }, + { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false }, + { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false }, + { "auto", SPECTRE_V2_CMD_AUTO, false }, +}; + static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void) { char arg[20]; - int ret; + int ret, i; + enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO; + + if (cmdline_find_option_bool(boot_command_line, "nospectre_v2")) + return SPECTRE_V2_CMD_NONE; + else { + ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, + sizeof(arg)); + if (ret < 0) + return SPECTRE_V2_CMD_AUTO;
- ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, - sizeof(arg)); - if (ret > 0) { - if (match_option(arg, ret, "off")) { - goto disable; - } else if (match_option(arg, ret, "on")) { - spec2_print_if_secure("force enabled on command line."); - return SPECTRE_V2_CMD_FORCE; - } else if (match_option(arg, ret, "retpoline")) { - spec2_print_if_insecure("retpoline selected on command line."); - return SPECTRE_V2_CMD_RETPOLINE; - } else if (match_option(arg, ret, "retpoline,amd")) { - if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) { - pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n"); - return SPECTRE_V2_CMD_AUTO; - } - spec2_print_if_insecure("AMD retpoline selected on command line."); - return SPECTRE_V2_CMD_RETPOLINE_AMD; - } else if (match_option(arg, ret, "retpoline,generic")) { - spec2_print_if_insecure("generic retpoline selected on command line."); - return SPECTRE_V2_CMD_RETPOLINE_GENERIC; - } else if (match_option(arg, ret, "auto")) { + for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) { + if (!match_option(arg, ret, mitigation_options[i].option)) + continue; + cmd = mitigation_options[i].cmd; + break; + } + + if (i >= ARRAY_SIZE(mitigation_options)) { + pr_err("unknown option (%s). Switching to AUTO select\n", + mitigation_options[i].option); return SPECTRE_V2_CMD_AUTO; } }
- if (!cmdline_find_option_bool(boot_command_line, "nospectre_v2")) + if ((cmd == SPECTRE_V2_CMD_RETPOLINE || + cmd == SPECTRE_V2_CMD_RETPOLINE_AMD || + cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC) && + !IS_ENABLED(CONFIG_RETPOLINE)) { + pr_err("%s selected but not compiled in. Switching to AUTO select\n", + mitigation_options[i].option); return SPECTRE_V2_CMD_AUTO; -disable: - spec2_print_if_insecure("disabled on command line."); - return SPECTRE_V2_CMD_NONE; + } + + if (cmd == SPECTRE_V2_CMD_RETPOLINE_AMD && + boot_cpu_data.x86_vendor != X86_VENDOR_AMD) { + pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n"); + return SPECTRE_V2_CMD_AUTO; + } + + if (mitigation_options[i].secure) + spec2_print_if_secure(mitigation_options[i].option); + else + spec2_print_if_insecure(mitigation_options[i].option); + + return cmd; }
/* Check for Skylake-like CPUs (for RSB handling) */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit 6cbd2171e89b13377261d15e64384df60ecb530e upstream.
There is currently no way to force CPU bug bits like CPU feature bits. That makes it impossible to set a bug bit once at boot and have it stick for all upcoming CPUs.
Extend the force set/clear arrays to handle bug bits as well.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@kernel.org Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Borislav Petkov bp@alien8.de Cc: Borislav Petkov bpetkov@suse.de Cc: Brian Gerst brgerst@gmail.com Cc: Dave Hansen dave.hansen@intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: David Laight David.Laight@aculab.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: Eduardo Valentin eduval@amazon.com Cc: Greg KH gregkh@linuxfoundation.org Cc: H. Peter Anvin hpa@zytor.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Juergen Gross jgross@suse.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@redhat.com Cc: Will Deacon will.deacon@arm.com Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.992156574@linutronix.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/cpufeature.h | 2 ++ arch/x86/include/asm/processor.h | 4 ++-- arch/x86/kernel/cpu/common.c | 6 +++--- 3 files changed, 7 insertions(+), 5 deletions(-)
--- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -286,6 +286,8 @@ extern const char * const x86_power_flag set_bit(bit, (unsigned long *)cpu_caps_set); \ } while (0)
+#define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit) + #define cpu_has_fpu boot_cpu_has(X86_FEATURE_FPU) #define cpu_has_vme boot_cpu_has(X86_FEATURE_VME) #define cpu_has_de boot_cpu_has(X86_FEATURE_DE) --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -147,8 +147,8 @@ extern struct cpuinfo_x86 boot_cpu_data; extern struct cpuinfo_x86 new_cpu_data;
extern struct tss_struct doublefault_tss; -extern __u32 cpu_caps_cleared[NCAPINTS]; -extern __u32 cpu_caps_set[NCAPINTS]; +extern __u32 cpu_caps_cleared[NCAPINTS + NBUGINTS]; +extern __u32 cpu_caps_set[NCAPINTS + NBUGINTS];
#ifdef CONFIG_SMP DECLARE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info); --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -441,8 +441,8 @@ static const char *table_lookup_model(st return NULL; /* Not found */ }
-__u32 cpu_caps_cleared[NCAPINTS]; -__u32 cpu_caps_set[NCAPINTS]; +__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS]; +__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
void load_percpu_segment(int cpu) { @@ -674,7 +674,7 @@ static void apply_forced_caps(struct cpu { int i;
- for (i = 0; i < NCAPINTS; i++) { + for (i = 0; i < NCAPINTS + NBUGINTS; i++) { c->x86_capability[i] &= ~cpu_caps_cleared[i]; c->x86_capability[i] |= cpu_caps_set[i]; }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit c1804a236894ecc942da7dc6c5abe209e56cba93 upstream.
Mark __x86_indirect_thunk_* functions as blacklist for kprobes because those functions can be called from anywhere in the kernel including blacklist functions of kprobes.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: Andi Kleen ak@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Ananth N Mavinakayanahalli ananth@linux.vnet.ibm.com Cc: Arjan van de Ven arjan@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/151629209111.10241.5444852823378068683.stgit@devbo... [bwh: Backported to 3.16: exports are still done from C] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -24,7 +24,9 @@ ENDPROC(__x86_indirect_thunk_\reg) * than one per register with the correct names. So we do it * the simple and nasty way... */ -#define GENERATE_THUNK(reg) THUNK reg +#define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym) +#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg) +#define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
GENERATE_THUNK(_ASM_AX) GENERATE_THUNK(_ASM_BX)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Waiman Long longman@redhat.com
commit 1df37383a8aeabb9b418698f0bcdffea01f4b1b2 upstream.
It doesn't make sense to have an indirect call thunk with esp/rsp, as retpoline code won't work correctly with the stack pointer register. Removing it will help compiler writers to catch errors in case such a thunk call is emitted incorrectly.
Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Suggested-by: Jeff Law law@redhat.com Signed-off-by: Waiman Long longman@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Kees Cook keescook@google.com Cc: Andi Kleen ak@linux.intel.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Arjan van de Ven arjan@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.c... [bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/lib/retpoline-export.c | 1 - arch/x86/lib/retpoline.S | 1 - 2 files changed, 2 deletions(-)
--- a/arch/x86/lib/retpoline-export.c +++ b/arch/x86/lib/retpoline-export.c @@ -21,5 +21,4 @@ INDIRECT_THUNK(dx) INDIRECT_THUNK(si) INDIRECT_THUNK(di) INDIRECT_THUNK(bp) -INDIRECT_THUNK(sp) #endif /* CONFIG_RETPOLINE */ --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -35,7 +35,6 @@ GENERATE_THUNK(_ASM_DX) GENERATE_THUNK(_ASM_SI) GENERATE_THUNK(_ASM_DI) GENERATE_THUNK(_ASM_BP) -GENERATE_THUNK(_ASM_SP) #ifdef CONFIG_64BIT GENERATE_THUNK(r8) GENERATE_THUNK(r9)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit 56c30ba7b348b90484969054d561f711ba196507 upstream.
'fd' is a user controlled value that is used as a data dependency to read from the 'fdt->fd' array. In order to avoid potential leaks of kernel memory values, block speculative execution of the instruction stream that could issue reads based on an invalid 'file *' returned from __fcheck_files.
Co-developed-by: Elena Reshetova elena.reshetova@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro viro@zeniv.linux.org.uk Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727418500.33451.17392199002892248656.stgit@dwil... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- include/linux/fdtable.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -9,6 +9,7 @@ #include <linux/compiler.h> #include <linux/spinlock.h> #include <linux/rcupdate.h> +#include <linux/nospec.h> #include <linux/types.h> #include <linux/init.h> #include <linux/fs.h> @@ -76,8 +77,10 @@ static inline struct file *__fcheck_file { struct fdtable *fdt = rcu_dereference_raw(files->fdt);
- if (fd < fdt->max_fds) + if (fd < fdt->max_fds) { + fd = array_index_nospec(fd, fdt->max_fds); return rcu_dereference_raw(fdt->fd[fd]); + } return NULL; }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit b3d7ad85b80bbc404635dca80f5b129f6242bc7a upstream.
Rename the open coded form of this instruction sequence from rdtsc_ordered() into a generic barrier primitive, barrier_nospec().
One of the mitigations for Spectre variant1 vulnerabilities is to fence speculative execution after successfully validating a bounds check. I.e. force the result of a bounds check to resolve in the instruction pipeline to ensure speculative execution honors that result before potentially operating on out-of-bounds data.
No functional changes.
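As a usage sketch (illustrative only; 'array', 'index' and 'size' are placeholders, not code from this patch), the primitive is intended to sit between a bounds check and the dependent access:

	if (index < size) {
		barrier_nospec();	/* the check resolves before the load can speculate */
		val = array[index];
	}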
Suggested-by: Linus Torvalds torvalds@linux-foundation.org Suggested-by: Andi Kleen ak@linux.intel.com Suggested-by: Ingo Molnar mingo@redhat.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Kees Cook keescook@chromium.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro viro@zeniv.linux.org.uk Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727415361.33451.9049453007262764675.stgit@dwill... [bwh: Backported to 3.16: update rdtsc_barrier() instead of rdtsc_ordered()] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -48,6 +48,10 @@ static inline unsigned long array_index_ /* Override the default implementation from linux/nospec.h. */ #define array_index_mask_nospec array_index_mask_nospec
+/* Prevent speculative execution past this barrier. */ +#define barrier_nospec() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \ + "lfence", X86_FEATURE_LFENCE_RDTSC) + /** * read_barrier_depends - Flush all pending reads that subsequents reads * depend on. @@ -174,8 +178,7 @@ do { \ */ static __always_inline void rdtsc_barrier(void) { - alternative(ASM_NOP3, "mfence", X86_FEATURE_MFENCE_RDTSC); - alternative(ASM_NOP3, "lfence", X86_FEATURE_LFENCE_RDTSC); + barrier_nospec(); }
#endif /* _ASM_X86_BARRIER_H */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit 085331dfc6bbe3501fb936e657331ca943827600 upstream.
Commit 75f139aaf896 "KVM: x86: Add memory barrier on vmcs field lookup" added a raw 'asm("lfence");' to prevent a bounds check bypass of 'vmcs_field_to_offset_table'.
The lfence can be avoided in this path by using the array_index_nospec() helper designed for these types of fixes.
Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Paolo Bonzini pbonzini@redhat.com Cc: Andrew Honig ahonig@google.com Cc: kvm@vger.kernel.org Cc: Jim Mattson jmattson@google.com Link: https://lkml.kernel.org/r/151744959670.6342.3001723920950249067.stgit@dwilli... [bwh: Backported to 3.16: - Replace max_vmcs_field with the local size variable - Adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -32,6 +32,7 @@ #include <linux/slab.h> #include <linux/tboot.h> #include <linux/hrtimer.h> +#include <linux/nospec.h> #include "kvm_cache_regs.h" #include "x86.h"
@@ -695,23 +696,21 @@ static const unsigned short vmcs_field_t FIELD(HOST_RSP, host_rsp), FIELD(HOST_RIP, host_rip), }; -static const int max_vmcs_field = ARRAY_SIZE(vmcs_field_to_offset_table);
static inline short vmcs_field_to_offset(unsigned long field) { - if (field >= max_vmcs_field) - return -1; + const size_t size = ARRAY_SIZE(vmcs_field_to_offset_table); + unsigned short offset;
- /* - * FIXME: Mitigation for CVE-2017-5753. To be replaced with a - * generic mechanism. - */ - asm("lfence"); - - if (vmcs_field_to_offset_table[field] == 0) + BUILD_BUG_ON(size > SHRT_MAX); + if (field >= size) return -1;
- return vmcs_field_to_offset_table[field]; + field = array_index_nospec(field, size); + offset = vmcs_field_to_offset_table[field]; + if (offset == 0) + return -1; + return offset; }
static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit de791821c295cc61419a06fe5562288417d1bc58 upstream.
Use the name associated with the particular attack which needs page table isolation for mitigation.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: Alan Cox gnomes@lxorguk.ukuu.org.uk Cc: Jiri Koshina jikos@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Andi Lutomirski luto@amacapital.net Cc: Andi Kleen ak@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Paul Turner pjt@google.com Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Greg KH gregkh@linux-foundation.org Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801051525300.1724@nanos [bwh: Backported to 3.16: bug number is different] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -241,7 +241,7 @@ #define X86_BUG_COMA X86_BUG(2) /* Cyrix 6x86 coma */ #define X86_BUG_AMD_TLB_MMATCH X86_BUG(3) /* AMD Erratum 383 */ #define X86_BUG_AMD_APIC_C1E X86_BUG(4) /* AMD Erratum 400 */ -#define X86_BUG_CPU_INSECURE X86_BUG(5) /* CPU is insecure and needs kernel page table isolation */ +#define X86_BUG_CPU_MELTDOWN X86_BUG(5) /* CPU is affected by meltdown attack and needs kernel page table isolation */
#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
--- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -806,7 +806,7 @@ static void __init early_identify_cpu(st setup_force_cpu_cap(X86_FEATURE_ALWAYS);
if (c->x86_vendor != X86_VENDOR_AMD) - setup_force_cpu_bug(X86_BUG_CPU_INSECURE); + setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN); }
void __init early_cpu_init(void)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Jim Mattson jmattson@google.com
commit 0cb5b30698fdc8f6b4646012e3acb4ddce430788 upstream.
Guest GPR values are live in the hardware GPRs at VM-exit. Do not leave any guest values in hardware GPRs after the guest GPR values are saved to the vcpu_vmx structure.
This is a partial mitigation for CVE 2017-5715 and CVE 2017-5753. Specifically, it defeats the Project Zero PoC for CVE 2017-5715.
Suggested-by: Eric Northup digitaleric@google.com Signed-off-by: Jim Mattson jmattson@google.com Reviewed-by: Eric Northup digitaleric@google.com Reviewed-by: Benjamin Serebrin serebrin@google.com Reviewed-by: Andrew Honig ahonig@google.com [Paolo: Add AMD bits, Signed-off-by: Tom Lendacky thomas.lendacky@amd.com] Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/kvm/svm.c | 19 +++++++++++++++++++ arch/x86/kvm/vmx.c | 14 +++++++++++++- 2 files changed, 32 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -3915,6 +3915,25 @@ static void svm_vcpu_run(struct kvm_vcpu "mov %%r14, %c[r14](%[svm]) \n\t" "mov %%r15, %c[r15](%[svm]) \n\t" #endif + /* + * Clear host registers marked as clobbered to prevent + * speculative use. + */ + "xor %%" _ASM_BX ", %%" _ASM_BX " \n\t" + "xor %%" _ASM_CX ", %%" _ASM_CX " \n\t" + "xor %%" _ASM_DX ", %%" _ASM_DX " \n\t" + "xor %%" _ASM_SI ", %%" _ASM_SI " \n\t" + "xor %%" _ASM_DI ", %%" _ASM_DI " \n\t" +#ifdef CONFIG_X86_64 + "xor %%r8, %%r8 \n\t" + "xor %%r9, %%r9 \n\t" + "xor %%r10, %%r10 \n\t" + "xor %%r11, %%r11 \n\t" + "xor %%r12, %%r12 \n\t" + "xor %%r13, %%r13 \n\t" + "xor %%r14, %%r14 \n\t" + "xor %%r15, %%r15 \n\t" +#endif "pop %%" _ASM_BP : : [svm]"a"(svm), --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -7487,6 +7487,7 @@ static void __noclone vmx_vcpu_run(struc /* Save guest registers, load host registers, keep flags */ "mov %0, %c[wordsize](%%" _ASM_SP ") \n\t" "pop %0 \n\t" + "setbe %c[fail](%0)\n\t" "mov %%" _ASM_AX ", %c[rax](%0) \n\t" "mov %%" _ASM_BX ", %c[rbx](%0) \n\t" __ASM_SIZE(pop) " %c[rcx](%0) \n\t" @@ -7503,12 +7504,23 @@ static void __noclone vmx_vcpu_run(struc "mov %%r13, %c[r13](%0) \n\t" "mov %%r14, %c[r14](%0) \n\t" "mov %%r15, %c[r15](%0) \n\t" + "xor %%r8d, %%r8d \n\t" + "xor %%r9d, %%r9d \n\t" + "xor %%r10d, %%r10d \n\t" + "xor %%r11d, %%r11d \n\t" + "xor %%r12d, %%r12d \n\t" + "xor %%r13d, %%r13d \n\t" + "xor %%r14d, %%r14d \n\t" + "xor %%r15d, %%r15d \n\t" #endif "mov %%cr2, %%" _ASM_AX " \n\t" "mov %%" _ASM_AX ", %c[cr2](%0) \n\t"
+ "xor %%eax, %%eax \n\t" + "xor %%ebx, %%ebx \n\t" + "xor %%esi, %%esi \n\t" + "xor %%edi, %%edi \n\t" "pop %%" _ASM_BP "; pop %%" _ASM_DX " \n\t" - "setbe %c[fail](%0) \n\t" ".pushsection .rodata \n\t" ".global vmx_return \n\t" "vmx_return: " _ASM_PTR " 2b \n\t"
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 66f793099a636862a71c59d4a6ba91387b155e0c upstream.
There's no point in building init code with retpolines, since it runs before any potentially hostile userspace does. And before the retpoline is actually ALTERNATIVEd into place, for much of it.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: karahmed@amazon.de Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517484441-1420-2-git-send-email-dwmw@amazon.co.uk [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- include/linux/init.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/include/linux/init.h +++ b/include/linux/init.h @@ -4,6 +4,13 @@ #include <linux/compiler.h> #include <linux/types.h>
+/* Built-in __init functions needn't be compiled with retpoline */ +#if defined(RETPOLINE) && !defined(MODULE) +#define __noretpoline __attribute__((indirect_branch("keep"))) +#else +#define __noretpoline +#endif + /* These macros are used to mark some functions or * initialized data (doesn't apply to uninitialized data) * as `initialization' functions. The kernel can take this @@ -39,7 +46,7 @@
/* These are for everybody (although not all archs will actually discard it in modules) */ -#define __init __section(.init.text) __cold notrace +#define __init __section(.init.text) __cold notrace __noretpoline #define __initdata __section(.init.data) #define __initconst __constsection(.init.rodata) #define __exitdata __section(.exit.data)
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Andy Lutomirski luto@kernel.org
commit f005f5d860e0231fe212cfda8c1a3148b99609f4 upstream.
asm/alternative.h isn't directly useful from assembly, but it shouldn't break the build.
Signed-off-by: Andy Lutomirski luto@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/e5b693fcef99fe6e80341c9e97a002fb23871e91.1461698311... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/alternative.h | 4 ++++ 1 file changed, 4 insertions(+)
--- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -1,6 +1,8 @@ #ifndef _ASM_X86_ALTERNATIVE_H #define _ASM_X86_ALTERNATIVE_H
+#ifndef __ASSEMBLY__ + #include <linux/types.h> #include <linux/stddef.h> #include <linux/stringify.h> @@ -251,4 +253,6 @@ extern void *text_poke(void *addr, const extern int poke_int3_handler(struct pt_regs *regs); extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
+#endif /* __ASSEMBLY__ */ + #endif /* _ASM_X86_ALTERNATIVE_H */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit 259d8c1e984318497c84eef547bbb6b1d9f4eb05 upstream.
Wireless drivers rely on parse_txq_params to validate that txq_params->ac is less than NL80211_NUM_ACS by the time the low-level driver's ->conf_tx() handler is called. Use a new helper, array_index_nospec(), to sanitize txq_params->ac with respect to speculation. I.e. ensure that any speculation into ->conf_tx() handlers is done with a value of txq_params->ac that is within the bounds of [0, NL80211_NUM_ACS).
Reported-by: Christian Lamparter chunkeey@gmail.com Reported-by: Elena Reshetova elena.reshetova@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Johannes Berg johannes@sipsolutions.net Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: linux-wireless@vger.kernel.org Cc: torvalds@linux-foundation.org Cc: "David S. Miller" davem@davemloft.net Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727419584.33451.7700736761686184303.stgit@dwill... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- net/wireless/nl80211.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -14,6 +14,7 @@ #include <linux/nl80211.h> #include <linux/rtnetlink.h> #include <linux/netlink.h> +#include <linux/nospec.h> #include <linux/etherdevice.h> #include <net/net_namespace.h> #include <net/genetlink.h> @@ -1829,20 +1830,22 @@ static const struct nla_policy txq_param static int parse_txq_params(struct nlattr *tb[], struct ieee80211_txq_params *txq_params) { + u8 ac; + if (!tb[NL80211_TXQ_ATTR_AC] || !tb[NL80211_TXQ_ATTR_TXOP] || !tb[NL80211_TXQ_ATTR_CWMIN] || !tb[NL80211_TXQ_ATTR_CWMAX] || !tb[NL80211_TXQ_ATTR_AIFS]) return -EINVAL;
- txq_params->ac = nla_get_u8(tb[NL80211_TXQ_ATTR_AC]); + ac = nla_get_u8(tb[NL80211_TXQ_ATTR_AC]); txq_params->txop = nla_get_u16(tb[NL80211_TXQ_ATTR_TXOP]); txq_params->cwmin = nla_get_u16(tb[NL80211_TXQ_ATTR_CWMIN]); txq_params->cwmax = nla_get_u16(tb[NL80211_TXQ_ATTR_CWMAX]); txq_params->aifs = nla_get_u8(tb[NL80211_TXQ_ATTR_AIFS]);
- if (txq_params->ac >= NL80211_NUM_ACS) + if (ac >= NL80211_NUM_ACS) return -EINVAL; - + txq_params->ac = array_index_nospec(ac, NL80211_NUM_ACS); return 0; }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: "zhenwei.pi" zhenwei.pi@youruncloud.com
commit 98f0fceec7f84d80bc053e49e596088573086421 upstream.
In section <2. Runtime Cost>, fix wrong index.
Signed-off-by: zhenwei.pi zhenwei.pi@youruncloud.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: dave.hansen@linux.intel.com Link: https://lkml.kernel.org/r/1516237492-27739-1-git-send-email-zhenwei.pi@youru... Signed-off-by: Ben Hutchings ben@decadent.org.uk --- Documentation/x86/pti.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/x86/pti.txt +++ b/Documentation/x86/pti.txt @@ -78,7 +78,7 @@ this protection comes at a cost: non-PTI SYSCALL entry code, so requires mapping fewer things into the userspace page tables. The downside is that stacks must be switched at entry time. - d. Global pages are disabled for all kernel structures not + c. Global pages are disabled for all kernel structures not mapped into both kernel and userspace page tables. This feature of the MMU allows different processes to share TLB entries mapping the kernel. Losing the feature means more
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Linus Torvalds torvalds@linux-foundation.org
commit de9e478b9d49f3a0214310d921450cf5bb4a21e6 upstream.
In commit 11f1a4b9755f ("x86: reorganize SMAP handling in user space accesses") I changed how the stac/clac instructions were generated around the user space accesses, which then made it possible to do batched accesses efficiently for user string copies etc.
However, in doing so, I completely spaced out, and didn't even think about the 32-bit case. And nobody really even seemed to notice, because SMAP doesn't even exist until modern Skylake processors, and you'd have to be crazy to run 32-bit kernels on a modern CPU.
Which brings us to Andy Lutomirski.
He actually tested the 32-bit kernel on new hardware, and noticed that it doesn't work. My bad. The trivial fix is to add the required uaccess begin/end markers around the raw accesses in <asm/uaccess_32.h>.
I feel a bit bad about this patch, just because that header file really should be cleaned up to avoid all the duplicated code in it, and this commit just expands on the problem. But this just fixes the bug without any bigger cleanup surgery.
Reported-and-tested-by: Andy Lutomirski luto@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org [bwh: Backported to 3.16: There's no 'case 8' in __copy_to_user_inatomic()] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- --- a/arch/x86/include/asm/uaccess_32.h +++ b/arch/x86/include/asm/uaccess_32.h @@ -48,16 +48,22 @@ __copy_to_user_inatomic(void __user *to,
switch (n) { case 1: + __uaccess_begin(); __put_user_size(*(u8 *)from, (u8 __user *)to, 1, ret, 1); + __uaccess_end(); return ret; case 2: + __uaccess_begin(); __put_user_size(*(u16 *)from, (u16 __user *)to, 2, ret, 2); + __uaccess_end(); return ret; case 4: + __uaccess_begin(); __put_user_size(*(u32 *)from, (u32 __user *)to, 4, ret, 4); + __uaccess_end(); return ret; } } @@ -98,13 +104,19 @@ __copy_from_user_inatomic(void *to, cons
switch (n) { case 1: + __uaccess_begin(); __get_user_size(*(u8 *)to, from, 1, ret, 1); + __uaccess_end(); return ret; case 2: + __uaccess_begin(); __get_user_size(*(u16 *)to, from, 2, ret, 2); + __uaccess_end(); return ret; case 4: + __uaccess_begin(); __get_user_size(*(u32 *)to, from, 4, ret, 4); + __uaccess_end(); return ret; } } @@ -142,13 +154,19 @@ __copy_from_user(void *to, const void __
switch (n) { case 1: + __uaccess_begin(); __get_user_size(*(u8 *)to, from, 1, ret, 1); + __uaccess_end(); return ret; case 2: + __uaccess_begin(); __get_user_size(*(u16 *)to, from, 2, ret, 2); + __uaccess_end(); return ret; case 4: + __uaccess_begin(); __get_user_size(*(u32 *)to, from, 4, ret, 4); + __uaccess_end(); return ret; } } @@ -164,13 +182,19 @@ static __always_inline unsigned long __c
switch (n) { case 1: + __uaccess_begin(); __get_user_size(*(u8 *)to, from, 1, ret, 1); + __uaccess_end(); return ret; case 2: + __uaccess_begin(); __get_user_size(*(u16 *)to, from, 2, ret, 2); + __uaccess_end(); return ret; case 4: + __uaccess_begin(); __get_user_size(*(u32 *)to, from, 4, ret, 4); + __uaccess_end(); return ret; } }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
commit 9de29eac8d2189424d81c0d840cd0469aa3d41c8 upstream.
If i == ARRAY_SIZE(mitigation_options) then we accidentally print garbage from one space beyond the end of the mitigation_options[] array.
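To illustrate the off-by-one, here is a stand-alone sketch with made-up option names and a made-up parse_option() helper, not the real mitigation_options[] table: when the lookup loop falls off the end, i equals ARRAY_SIZE(), so indexing the table in the error path reads one element past its end; printing the user-supplied string avoids that.

#include <stdio.h>
#include <string.h>

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

static const struct {
	const char *option;
} mitigation_options[] = {
	{ "off" }, { "on" }, { "auto" },	/* illustrative entries only */
};

static void parse_option(const char *arg)
{
	unsigned int i;

	for (i = 0; i < ARRAY_SIZE(mitigation_options); i++)
		if (!strcmp(arg, mitigation_options[i].option))
			break;

	if (i >= ARRAY_SIZE(mitigation_options)) {
		/* Before the fix: mitigation_options[i].option here reads one
		 * element beyond the array.  After the fix: print arg instead. */
		printf("unknown option (%s). Switching to AUTO select\n", arg);
	}
}

int main(void)
{
	parse_option("bogus");
	return 0;
}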
Signed-off-by: Dan Carpenter dan.carpenter@oracle.com
Cc: Andy Lutomirski luto@kernel.org
Cc: Borislav Petkov bp@suse.de
Cc: David Woodhouse dwmw@amazon.co.uk
Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org
Cc: KarimAllah Ahmed karahmed@amazon.de
Cc: Linus Torvalds torvalds@linux-foundation.org
Cc: Peter Zijlstra peterz@infradead.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: kernel-janitors@vger.kernel.org
Fixes: 9005c6834c0f ("x86/spectre: Simplify spectre_v2 command line parsing")
Link: http://lkml.kernel.org/r/20180214071416.GA26677@mwanda
Signed-off-by: Ingo Molnar mingo@kernel.org
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings ben@decadent.org.uk
---
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -237,8 +237,7 @@ static enum spectre_v2_mitigation_cmd __
 	}
 
 	if (i >= ARRAY_SIZE(mitigation_options)) {
-		pr_err("unknown option (%s). Switching to AUTO select\n",
-		       mitigation_options[i].option);
+		pr_err("unknown option (%s). Switching to AUTO select\n", arg);
 		return SPECTRE_V2_CMD_AUTO;
 	}
 }
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: "Gustavo A. R. Silva" garsilva@embeddedor.com
commit 24dbc6000f4b9b0ef5a9daecb161f1907733765a upstream.
Currently, x86_cache_size is of type int, which makes no sense: a valid cache size can never be zero or negative. So instead of initializing this variable to -1, initialize it to 0 and make it unsigned.
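One detail worth spelling out (my reading of the microcode hunk below, not something stated in the commit message): even with x86_cache_size unsigned, multiplying by a plain 1024 still happens in 32-bit arithmetic, so the 1024ULL suffix is what widens the product to 64 bits before it lands in the u64. A stand-alone illustration with a hypothetical value:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Hypothetical cache size in KB, chosen large enough to overflow 32 bits. */
	unsigned int cache_kb = 5u * 1024 * 1024;

	uint64_t wraps  = cache_kb * 1024;	/* 32-bit multiply, result truncated */
	uint64_t widens = cache_kb * 1024ULL;	/* 64-bit multiply, result preserved */

	printf("%llu vs %llu\n",
	       (unsigned long long)wraps, (unsigned long long)widens);
	return 0;
}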
Suggested-by: Thomas Gleixner tglx@linutronix.de
Signed-off-by: Gustavo A. R. Silva garsilva@embeddedor.com
Cc: Borislav Petkov bp@alien8.de
Cc: Linus Torvalds torvalds@linux-foundation.org
Cc: Peter Zijlstra peterz@infradead.org
Addresses-Coverity-ID: 1464429
Link: http://lkml.kernel.org/r/20180213192208.GA26414@embeddedor.com
Signed-off-by: Ingo Molnar mingo@kernel.org
Signed-off-by: Ben Hutchings ben@decadent.org.uk
---
 arch/x86/include/asm/processor.h      | 2 +-
 arch/x86/kernel/cpu/common.c          | 2 +-
 arch/x86/kernel/cpu/microcode/intel.c | 2 +-
 arch/x86/kernel/cpu/proc.c            | 4 ++--
 4 files changed, 5 insertions(+), 5 deletions(-)
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -107,7 +107,7 @@ struct cpuinfo_x86 {
 	char		x86_vendor_id[16];
 	char		x86_model_id[64];
 	/* in KB - valid for CPUS which support this call: */
-	int		x86_cache_size;
+	unsigned int	x86_cache_size;
 	int		x86_cache_alignment;	/* In bytes */
 	int		x86_power;
 	unsigned long	loops_per_jiffy;
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -905,7 +905,7 @@ static void identify_cpu(struct cpuinfo_
 	int i;
 
 	c->loops_per_jiffy = loops_per_jiffy;
-	c->x86_cache_size = -1;
+	c->x86_cache_size = 0;
 	c->x86_vendor = X86_VENDOR_UNKNOWN;
 	c->x86_model = c->x86_mask = 0;	/* So far unknown... */
 	c->x86_vendor_id[0] = '\0';	/* Unset */
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -352,7 +352,7 @@ static struct microcode_ops microcode_in
 static int __init calc_llc_size_per_core(struct cpuinfo_x86 *c)
 {
-	u64 llc_size = c->x86_cache_size * 1024;
+	u64 llc_size = c->x86_cache_size * 1024ULL;
 
 	do_div(llc_size, c->x86_max_cores);
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -86,8 +86,8 @@ static int show_cpuinfo(struct seq_file
 	}
 
 	/* Cache size */
-	if (c->x86_cache_size >= 0)
-		seq_printf(m, "cache size\t: %d KB\n", c->x86_cache_size);
+	if (c->x86_cache_size)
+		seq_printf(m, "cache size\t: %u KB\n", c->x86_cache_size);
 
 	show_cpuinfo_core(m, c, cpu);
 	show_cpuinfo_misc(m, c);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit 736e80a4213e9bbce40a7c050337047128b472ac upstream.
Introduce start/end markers for the __x86_indirect_thunk_* functions. To make this easy, consolidate the per-register .text.__x86.indirect_thunk.* sections into a single .text.__x86.indirect_thunk section, place it at the end of the kernel text section, and add __indirect_thunk_start/end symbols so that other subsystems (e.g. kprobes) can identify the thunk region.
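A hedged sketch of how a consumer such as kprobes can use the two markers; the helper name below is made up for illustration, but the range comparison is the reason the symbols are exported:

#include <linux/types.h>

extern char __indirect_thunk_start[];
extern char __indirect_thunk_end[];

/* Illustrative helper: true if addr falls inside one of the
 * __x86_indirect_thunk_* bodies gathered into the new section. */
static bool addr_in_indirect_thunks(unsigned long addr)
{
	return addr >= (unsigned long)__indirect_thunk_start &&
	       addr <  (unsigned long)__indirect_thunk_end;
}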
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Acked-by: David Woodhouse dwmw@amazon.co.uk
Cc: Andi Kleen ak@linux.intel.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Ananth N Mavinakayanahalli ananth@linux.vnet.ibm.com
Cc: Arjan van de Ven arjan@linux.intel.com
Cc: Greg Kroah-Hartman gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/151629206178.10241.6828804696410044771.stgit@devbo...
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings ben@decadent.org.uk
---
 arch/x86/include/asm/nospec-branch.h | 3 +++
 arch/x86/kernel/vmlinux.lds.S        | 6 ++++++
 arch/x86/lib/retpoline.S             | 2 +-
 3 files changed, 10 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -171,6 +171,9 @@ enum spectre_v2_mitigation {
 	SPECTRE_V2_IBRS,
 };
 
+extern char __indirect_thunk_start[];
+extern char __indirect_thunk_end[];
+
 /*
  * On VMEXIT we must ensure that no RSB predictions learned in the guest
  * can be followed in the host, by overwriting the RSB completely. Both
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -104,6 +104,12 @@ SECTIONS
 		IRQENTRY_TEXT
 		*(.fixup)
 		*(.gnu.warning)
+#ifdef CONFIG_RETPOLINE
+		__indirect_thunk_start = .;
+		*(.text.__x86.indirect_thunk)
+		__indirect_thunk_end = .;
+#endif
+
 		/* End of text section */
 		_etext = .;
 	} :text = 0x9090
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -8,7 +8,7 @@
 #include <asm/nospec-branch.h>
 
 .macro THUNK reg
-	.section .text.__x86.indirect_thunk.\reg
+	.section .text.__x86.indirect_thunk
 
 ENTRY(__x86_indirect_thunk_\reg)
 	CFI_STARTPROC
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit c940a3fb1e2e9b7d03228ab28f375fb5a47ff699 upstream.
Replace indirect call with CALL_NOSPEC.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Reviewed-by: David Woodhouse dwmw@amazon.co.uk
Cc: Andrea Arcangeli aarcange@redhat.com
Cc: Andi Kleen ak@linux.intel.com
Cc: Ashok Raj ashok.raj@intel.com
Cc: Greg KH gregkh@linuxfoundation.org
Cc: Jun Nakajima jun.nakajima@intel.com
Cc: David Woodhouse dwmw2@infradead.org
Cc: Linus Torvalds torvalds@linux-foundation.org
Cc: rga@amazon.de
Cc: Dave Hansen dave.hansen@intel.com
Cc: Asit Mallick asit.k.mallick@intel.com
Cc: Andy Lutomirski luto@kernel.org
Cc: Josh Poimboeuf jpoimboe@redhat.com
Cc: Jason Baron jbaron@akamai.com
Cc: Paolo Bonzini pbonzini@redhat.com
Cc: Dan Williams dan.j.williams@intel.com
Cc: Arjan Van De Ven arjan.van.de.ven@intel.com
Cc: Tim Chen tim.c.chen@linux.intel.com
Link: https://lkml.kernel.org/r/20180125095843.645776917@infradead.org
[bwh: Backported to 3.16: also add ASM_CALL_CONSTRAINT]
Signed-off-by: Ben Hutchings ben@decadent.org.uk
---
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7248,13 +7248,14 @@ static void vmx_handle_external_intr(str
 			"pushf\n\t"
 			"orl $0x200, (%%" _ASM_SP ")\n\t"
 			__ASM_SIZE(push) " $%c[cs]\n\t"
-			"call *%[entry]\n\t"
+			CALL_NOSPEC
 			:
 #ifdef CONFIG_X86_64
-			[sp]"=&r"(tmp)
+			[sp]"=&r"(tmp),
 #endif
+			ASM_CALL_CONSTRAINT
 			:
-			[entry]"r"(entry),
+			THUNK_TARGET(entry),
 			[ss]"i"(__KERNEL_DS),
 			[cs]"i"(__KERNEL_CS)
 			);
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 9351803bd803cdbeb9b5a7850b7b6f464806e3db upstream.
Convert all indirect jumps in ftrace assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Acked-by: Arjan van de Ven arjan@linux.intel.com
Acked-by: Ingo Molnar mingo@kernel.org
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel riel@redhat.com
Cc: Andi Kleen ak@linux.intel.com
Cc: Josh Poimboeuf jpoimboe@redhat.com
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Linus Torvalds torvalds@linux-foundation.org
Cc: Jiri Kosina jikos@kernel.org
Cc: Andy Lutomirski luto@amacapital.net
Cc: Dave Hansen dave.hansen@intel.com
Cc: Kees Cook keescook@google.com
Cc: Tim Chen tim.c.chen@linux.intel.com
Cc: Greg Kroah-Hartman gregkh@linux-foundation.org
Cc: Paul Turner pjt@google.com
Link: https://lkml.kernel.org/r/1515707194-20531-8-git-send-email-dwmw@amazon.co.u...
[bwh: Backported to 3.16: adjust filenames, context]
Signed-off-by: Ben Hutchings ben@decadent.org.uk
---
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1189,7 +1189,8 @@ trace:
 	movl 0x4(%ebp), %edx
 	subl $MCOUNT_INSN_SIZE, %eax
 
-	call *ftrace_trace_function
+	movl ftrace_trace_function, %ecx
+	CALL_NOSPEC %ecx
 
 	popl %edx
 	popl %ecx
@@ -1224,7 +1225,7 @@ return_to_handler:
 	movl %eax, %ecx
 	popl %edx
 	popl %eax
-	jmp *%ecx
+	JMP_NOSPEC %ecx
 #endif
 
 #ifdef CONFIG_TRACING
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -7,7 +7,7 @@
 #include <linux/linkage.h>
 #include <asm/ptrace.h>
 #include <asm/ftrace.h>
-
+#include <asm/nospec-branch.h>
 
 	.code64
 	.section .entry.text, "ax"
@@ -210,8 +210,8 @@ trace:
 #endif
 	subq $MCOUNT_INSN_SIZE, %rdi
 
-	call *ftrace_trace_function
-
+	movq ftrace_trace_function, %r8
+	CALL_NOSPEC %r8
 	MCOUNT_RESTORE_FRAME
 
 	jmp ftrace_stub
@@ -254,5 +254,5 @@ GLOBAL(return_to_handler)
 	movq 8(%rsp), %rdx
 	movq (%rsp), %rax
 	addq $24, %rsp
-	jmp *%rdi
+	JMP_NOSPEC %rdi
 #endif
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov bp@suse.de
commit 7a32fc51ca938e67974cbb9db31e1a43f98345a9 upstream.
... to adhere to the _ASM_X86_ naming scheme.
No functional change.
Signed-off-by: Borislav Petkov bp@suse.de
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Cc: riel@redhat.com
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: David Woodhouse dwmw2@infradead.org
Cc: jikos@kernel.org
Cc: luto@amacapital.net
Cc: dave.hansen@intel.com
Cc: torvalds@linux-foundation.org
Cc: keescook@google.com
Cc: Josh Poimboeuf jpoimboe@redhat.com
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Cc: pjt@google.com
Link: https://lkml.kernel.org/r/20180126121139.31959-3-bp@alien8.de
Signed-off-by: Ben Hutchings ben@decadent.org.uk
---
 arch/x86/include/asm/nospec-branch.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
-#ifndef __NOSPEC_BRANCH_H__
-#define __NOSPEC_BRANCH_H__
+#ifndef _ASM_X86_NOSPEC_BRANCH_H_
+#define _ASM_X86_NOSPEC_BRANCH_H_
 
 #include <asm/alternative.h>
 #include <asm/alternative-asm.h>
@@ -195,4 +195,4 @@ static inline void vmexit_fill_RSB(void)
 }
 
 #endif /* __ASSEMBLY__ */
-#endif /* __NOSPEC_BRANCH_H__ */
+#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
3.16.56-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit e383095c7fe8d218e00ec0f83e4b95ed4e627b02 upstream.
If sysfs is disabled and RETPOLINE not defined:
arch/x86/kernel/cpu/bugs.c:97:13: warning: ‘spectre_v2_bad_module’ defined but not used [-Wunused-variable]
 static bool spectre_v2_bad_module;
Hide it.
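The shape of the fix is a common kernel idiom, sketched generically below with made-up names (the config symbol and identifiers are illustrative, not the ones in bugs.c): keep the state variable inside the #ifdef and expose a tiny accessor that collapses to a constant when the feature is compiled out, so callers need no #ifdefs of their own.

#include <linux/types.h>

#ifdef CONFIG_FEATURE_X			/* made-up config symbol, for illustration */
static bool feature_x_bad_module;

static inline const char *feature_x_module_string(void)
{
	return feature_x_bad_module ? " - vulnerable module loaded" : "";
}
#else
static inline const char *feature_x_module_string(void) { return ""; }
#endif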
Fixes: caf7501a1b4e ("module/retpoline: Warn about missing retpoline in module")
Reported-by: Borislav Petkov bp@alien8.de
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Cc: Andi Kleen ak@linux.intel.com
Cc: David Woodhouse dwmw2@infradead.org
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings ben@decadent.org.uk
---
 arch/x86/kernel/cpu/bugs.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -156,9 +156,10 @@ static const char *spectre_v2_strings[]
 #define pr_fmt(fmt) "Spectre V2 : " fmt
 
 static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
-static bool spectre_v2_bad_module;
 
 #ifdef RETPOLINE
+static bool spectre_v2_bad_module;
+
 bool retpoline_module_ok(bool has_retpoline)
 {
 	if (spectre_v2_enabled == SPECTRE_V2_NONE || has_retpoline)
@@ -168,6 +169,13 @@ bool retpoline_module_ok(bool has_retpol
 	spectre_v2_bad_module = true;
 	return false;
 }
+
+static inline const char *spectre_v2_module_string(void)
+{
+	return spectre_v2_bad_module ? " - vulnerable module loaded" : "";
+}
+#else
+static inline const char *spectre_v2_module_string(void) { return ""; }
 #endif
 
 static void __init spec2_print_if_insecure(const char *reason)
@@ -355,6 +363,6 @@ ssize_t cpu_show_spectre_v2(struct devic
 	return sprintf(buf, "Not affected\n");
 
 	return sprintf(buf, "%s%s\n", spectre_v2_strings[spectre_v2_enabled],
-		       spectre_v2_bad_module ? " - vulnerable module loaded" : "");
+		       spectre_v2_module_string());
 }
 #endif
On Mon, Mar 12, 2018 at 03:06:11AM +0000, Ben Hutchings wrote:
This is the start of the stable review cycle for the 3.16.56 release. There are 76 patches in this series, which will be posted as responses to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Mar 14 12:00:00 UTC 2018. Anything received after that time might be too late.
Build results:
	total: 136 pass: 136 fail: 0
Qemu test results:
	total: 115 pass: 112 fail: 3
Failed tests:
	mipsel:24Kf:malta_defconfig:smp:rootfs
	mipsel64:malta_defconfig:nosmp:rootfs
	mipsel64:malta_defconfig:smp:rootfs
The failures are due to newly added tests; the init process crashes. v3.16 passes those tests, so the problem was introduced later. I'll run a bisect later to see if I can find the culprit. If not, I'll drop the new tests from this kernel version.
Guenter
On Mon, Mar 12, 2018 at 08:00:53AM -0700, Guenter Roeck wrote:
On Mon, Mar 12, 2018 at 03:06:11AM +0000, Ben Hutchings wrote:
This is the start of the stable review cycle for the 3.16.56 release. There are 76 patches in this series, which will be posted as responses to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Mar 14 12:00:00 UTC 2018. Anything received after that time might be too late.
Build results:
	total: 136 pass: 136 fail: 0
Qemu test results:
	total: 115 pass: 112 fail: 3
Failed tests:
	mipsel:24Kf:malta_defconfig:smp:rootfs
	mipsel64:malta_defconfig:nosmp:rootfs
	mipsel64:malta_defconfig:smp:rootfs
The failures are due to newly added tests; the init process crashes. v3.16 passes those tests, so the problem was introduced later. I'll run a bisect later to see if I can find the culprit. If not, I'll drop the new tests from this kernel version.
Turns out I did the bisect already. Attached. It does suggest that there may be a real problem.
Guenter
---
# bad: [3e50cd97ed730bb0abfcdbc8c1a18871c2750c33] Linux 3.16.55
# good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16
git bisect start 'HEAD' 'v3.16'
# good: [d1afef76e102be87955151c93bd51fd04c1c0c01] arm64: mm: ensure that the zero page is visible to the page table walker
git bisect good d1afef76e102be87955151c93bd51fd04c1c0c01
# good: [b6927bd60d353de044584ab9400aaccd8694fe1e] can: Fix kernel panic at security_sock_rcv_skb
git bisect good b6927bd60d353de044584ab9400aaccd8694fe1e
# good: [aa9a2ec0e82b64db1851d96ab1e9c83f8ea17a39] ARM: kexec: Make .text R/W in machine_kexec
git bisect good aa9a2ec0e82b64db1851d96ab1e9c83f8ea17a39
# good: [bf5ac638a0ffa923ed03ba8cdb8241b812f5fe4f] can: gs_usb: fix busy loop if no more TX context is available
git bisect good bf5ac638a0ffa923ed03ba8cdb8241b812f5fe4f
# good: [957a3d249cb16292a199f73b7138d23ee44ca433] Revert "x86: kvmclock: Disable use from vDSO if KPTI is enabled"
git bisect good 957a3d249cb16292a199f73b7138d23ee44ca433
# bad: [66fe40226beb16fb7809d275aec362f479388935] USB: serial: option: adding support for YUGA CLM920-NC5
git bisect bad 66fe40226beb16fb7809d275aec362f479388935
# good: [0b6433856a149885470f2ab3a138e99347c323a4] arm64: fpsimd: Prevent registers leaking from dead tasks
git bisect good 0b6433856a149885470f2ab3a138e99347c323a4
# good: [d6e7dd39a7f036eb3e48032d68d9e70f2e9781cf] MIPS: Clear [MSA]FPE CSR.Cause after notify_die()
git bisect good d6e7dd39a7f036eb3e48032d68d9e70f2e9781cf
# bad: [d97c5dd698a37a6f4fcce8132853620f7390f797] MIPS: Fix an FCSR access API regression with NT_PRFPREG and MSA
git bisect bad d97c5dd698a37a6f4fcce8132853620f7390f797
# bad: [b18b5d55c0e8b2bccda919f5f227ec3ba1056f2a] MIPS: Fix a preemption issue with thread's FPU defaults
git bisect bad b18b5d55c0e8b2bccda919f5f227ec3ba1056f2a
# good: [0efd2f915bbc608f66065c36b291d37efe0a0b0f] MIPS: Always clear FCSR cause bits after emulation
git bisect good 0efd2f915bbc608f66065c36b291d37efe0a0b0f
# bad: [3127c502272aca5f46b04c0b11afb464ad4fcbaf] MIPS: math-emu: Define IEEE 754-2008 feature control bits
git bisect bad 3127c502272aca5f46b04c0b11afb464ad4fcbaf
# bad: [8605aa2fea28c0485aeb60c114a9d52df1455915] MIPS: Set `si_code' for SIGFPE signals sent from emulation too
git bisect bad 8605aa2fea28c0485aeb60c114a9d52df1455915
# first bad commit: [8605aa2fea28c0485aeb60c114a9d52df1455915] MIPS: Set `si_code' for SIGFPE signals sent from emulation too