On Fri, 2020-06-19 at 16:31 +0200, Greg Kroah-Hartman wrote:
From: Paolo Bonzini <pbonzini@redhat.com>
[ Upstream commit d43e2675e96fc6ae1a633b6a69d296394448cc32 ]
KVM stores the gfn in MMIO SPTEs as a caching optimization. These are split in two parts, as in "[high 11111 low]", to thwart any attempt to use these bits in an L1TF attack. This works as long as there are 5 free bits between MAXPHYADDR and bit 50 (inclusive), leaving bit 51 free so that the MMIO access triggers a reserved-bit-set page fault.
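For readers unfamiliar with the "[high 11111 low]" layout, here is a minimal user-space sketch of the split. It is not the kernel code: the helper names (bits, split_encode, split_decode) are made up for illustration, the mask length of 5 comes from the text above, and the 46-bit width in the note below is only an example value.

```c
#include <assert.h>
#include <stdint.h>

#define MASK_LEN   5    /* number of bits forced to 1 */
#define PAGE_SHIFT 12

/* Bits lo..hi (inclusive) set, similar to the kernel's GENMASK_ULL(hi, lo). */
static uint64_t bits(unsigned hi, unsigned lo)
{
	assert(lo <= hi && hi < 64);
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/* Encode a page-aligned guest physical address as "[high 11111 low]". */
static uint64_t split_encode(uint64_t gpa, unsigned cache_bits)
{
	uint64_t ones = bits(cache_bits - 1, cache_bits - MASK_LEN);
	uint64_t low  = bits(cache_bits - MASK_LEN - 1, PAGE_SHIFT);

	/* low part stays put, high part moves up past the run of ones */
	return (gpa & low) | ones | ((gpa & ones) << MASK_LEN);
}

/* Recover the guest physical address from the split encoding. */
static uint64_t split_decode(uint64_t spte, unsigned cache_bits)
{
	uint64_t ones = bits(cache_bits - 1, cache_bits - MASK_LEN);
	uint64_t low  = bits(cache_bits - MASK_LEN - 1, PAGE_SHIFT);

	return (spte & low) | ((spte >> MASK_LEN) & ones);
}
```

With cache_bits = 46, gfn bits 41..45 move up to bits 46..50, bits 41..45 are forced to 1, and the encoded gfn never touches bit 51.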
Hi, I'm now seeing this warning in VM bootup with 4.14.y
Not seen with 4.19.129 and 5.4.47 that also included this commit.
Any ideas what's missing in 4.14?
[ 2.294049] ------------[ cut here ]------------
[ 2.294621] WARNING: CPU: 43 PID: 856 at arch/x86/kvm/mmu.c:279 kvm_mmu_set_mmio_spte_mask+0x4e/0x60 [kvm]
[ 2.295583] Modules linked in: kvm_intel(+) kvm irqbypass bfq sch_fq_codel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ata_piix dm_mirror dm_region_hash dm_log dm_mod dax autofs4
[ 2.297269] CPU: 43 PID: 856 Comm: systemd-udevd Not tainted 4.14.185 #1
[ 2.297920] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
[ 2.298782] task: ffff9b2350b19dc0 task.stack: ffffa86344604000
[ 2.299390] RIP: 0010:kvm_mmu_set_mmio_spte_mask+0x4e/0x60 [kvm]
[ 2.299987] RSP: 0018:ffffa86344607c78 EFLAGS: 00010206
[ 2.300522] RAX: 0000000000000000 RBX: ffffffffc0457000 RCX: 0000000000000000
[ 2.301239] RDX: ffffffff00000001 RSI: 0008000000000001 RDI: 0008000000000001
[ 2.301935] RBP: ffffffffc03bd951 R08: ffff9b235f4e33a0 R09: ffff9b2355f57258
[ 2.302646] R10: 0000000000000164 R11: 00000000ffffffff R12: 0000000000000000
[ 2.303356] R13: ffffffffc0458780 R14: ffffa86344607ea0 R15: 0000000000000001
[ 2.304069] FS: 00007f3e95dedc40(0000) GS:ffff9b235f4c0000(0000) knlGS:0000000000000000
[ 2.304852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.305425] CR2: 000055bd35ff10d0 CR3: 000000081026a004 CR4: 00000000001606e0
[ 2.306137] Call Trace:
[ 2.306414] kvm_arch_init+0x90/0x130 [kvm]
[ 2.306852] kvm_init+0x1c/0x2b0 [kvm]
[ 2.307258] ? __slab_free+0x13a/0x2e0
[ 2.307649] ? hardware_setup+0x4ab/0x4ab [kvm_intel]
[ 2.308178] vmx_init+0x21/0x6af [kvm_intel]
[ 2.308604] ? hardware_setup+0x4ab/0x4ab [kvm_intel]
[ 2.309132] do_one_initcall+0x3e/0xf4
[ 2.309512] ? kmem_cache_alloc_trace+0xef/0x190
[ 2.309985] do_init_module+0x5c/0x1f0
[ 2.310386] load_module+0x1f31/0x2620
[ 2.310769] ? SYSC_finit_module+0x95/0xb0
[ 2.311202] SYSC_finit_module+0x95/0xb0
[ 2.311600] do_syscall_64+0x74/0x190
[ 2.311980] entry_SYSCALL_64_after_hwframe+0x41/0xa6
[ 2.312496] RIP: 0033:0x7f3e966b71bd
[ 2.312860] RSP: 002b:00007ffe0db584c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 2.313606] RAX: ffffffffffffffda RBX: 000055bd36027b10 RCX: 00007f3e966b71bd
[ 2.314314] RDX: 0000000000000000 RSI: 00007f3e962fe84d RDI: 000000000000000f
[ 2.315017] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000007
[ 2.315719] R10: 000000000000000f R11: 0000000000000246 R12: 00007f3e962fe84d
[ 2.316420] R13: 0000000000000000 R14: 000055bd3602f400 R15: 000055bd36027b10
[ 2.317130] Code: 29 25 06 00 75 25 48 b8 00 00 00 00 00 00 00 40 48 09 c6 48 09 c7 48 89 35 38 25 06 00 48 89 3d 39 25 06 00 c3 0f 0b 0f 0b eb d2 <0f> 0b eb d7 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44
[ 2.318933] ---[ end trace d933315308434918 ]---
$ head /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
The bit positions however were computed wrongly for AMD processors that have encryption support. In this case, x86_phys_bits is reduced (for example from 48 to 43, to account for the C bit at position 47 and four bits used internally to store the SEV ASID and other stuff) while x86_cache_bits would remain set to 48, and _all_ bits between the reduced MAXPHYADDR and bit 51 are set. Then low_phys_bits would also cover some of the bits that are set in the shadow_mmio_value, terribly confusing the gfn caching mechanism.
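The overlap can be checked with plain arithmetic. The sketch below replays, in user space, the pre-patch computation of low_phys_bits from kvm_mmu_reset_all_pte_masks(), using the 43/48 example from the paragraph above. The function names are hypothetical, and the MMIO value is modelled simply as "all bits from the reduced MAXPHYADDR up to bit 51", which is an assumption taken from the description rather than from the SVM code.

```c
#include <assert.h>
#include <stdint.h>

#define MASK_LEN   5
#define PAGE_SHIFT 12

/* Bits lo..hi (inclusive) set, similar to the kernel's GENMASK_ULL(hi, lo). */
static uint64_t bits(unsigned hi, unsigned lo)
{
	assert(lo <= hi && hi < 64);
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/* Pre-patch rule: start from x86_cache_bits, shrink only when splitting. */
static unsigned old_low_phys_bits(unsigned cache_bits)
{
	unsigned low = cache_bits;

	if (cache_bits < 52 - MASK_LEN)
		low -= MASK_LEN;
	return low;
}

/* gfn bits that collide with an MMIO value spanning phys_bits..51. */
static uint64_t old_overlap(unsigned phys_bits, unsigned cache_bits)
{
	uint64_t mmio_value = bits(51, phys_bits);
	uint64_t lower_gfn  = bits(old_low_phys_bits(cache_bits) - 1, PAGE_SHIFT);

	return mmio_value & lower_gfn;
}
```

old_overlap(43, 48) yields bits 43..47, the five gfn bits that collide with the MMIO value. With phys_bits == cache_bits == 46 the result is 0.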
To fix this, avoid splitting gfns as long as the processor does not have the L1TF bug (which includes all AMD processors). When there is no splitting, low_phys_bits can be set to the reduced MAXPHYADDR removing the overlap. This fixes "npt=0" operation on EPYC processors.
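A user-space model of the post-patch logic, again only a sketch: struct spte_masks and compute_masks() are invented names, has_l1tf stands in for boot_cpu_has_bug(X86_BUG_L1TF), and the WARN_ON_ONCE is reduced to a plain condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_LEN   5
#define PAGE_SHIFT 12

/* Bits lo..hi (inclusive) set, similar to the kernel's GENMASK_ULL(hi, lo). */
static uint64_t bits(unsigned hi, unsigned lo)
{
	assert(lo <= hi && hi < 64);
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

struct spte_masks {
	unsigned low_phys_bits;
	uint64_t rsvd_mask;      /* models shadow_nonpresent_or_rsvd_mask */
	uint64_t lower_gfn_mask; /* models shadow_nonpresent_or_rsvd_lower_gfn_mask */
};

static struct spte_masks compute_masks(unsigned phys_bits, unsigned cache_bits,
				       bool has_l1tf)
{
	/* Default: no splitting, gfn occupies everything below MAXPHYADDR. */
	struct spte_masks m = { .low_phys_bits = phys_bits, .rsvd_mask = 0 };

	/* Split only on L1TF-affected CPUs with room for the 5 extra bits. */
	if (has_l1tf && cache_bits < 52 - MASK_LEN) {
		m.low_phys_bits = cache_bits - MASK_LEN;
		m.rsvd_mask = bits(cache_bits - 1, m.low_phys_bits);
	}

	m.lower_gfn_mask = bits(m.low_phys_bits - 1, PAGE_SHIFT);
	return m;
}
```

For the AMD example (43, 48, no L1TF) this gives low_phys_bits = 43 and an empty reserved mask, so the gfn mask no longer reaches into bits 43..51. For a 46-bit L1TF-affected part it gives low_phys_bits = 41 and a reserved mask of bits 41..45.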
Thanks to Maxim Levitsky for bisecting this bug.
Cc: stable@vger.kernel.org
Fixes: 52918ed5fcf0 ("KVM: SVM: Override default MMIO mask if memory encryption is enabled")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
 arch/x86/kvm/mmu.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d8878266553c..7220ab210dcf 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -275,6 +275,8 @@ static bool is_executable_pte(u64 spte);
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask, u64 mmio_value)
 {
 	BUG_ON((mmio_mask & mmio_value) != mmio_value);
+	WARN_ON(mmio_value & (shadow_nonpresent_or_rsvd_mask << shadow_nonpresent_or_rsvd_mask_len));
+	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
 	shadow_mmio_value = mmio_value | SPTE_SPECIAL_MASK;
 	shadow_mmio_mask = mmio_mask | SPTE_SPECIAL_MASK;
 }
@@ -467,16 +469,15 @@ static void kvm_mmu_reset_all_pte_masks(void)
 	 * the most significant bits of legal physical address space.
 	 */
 	shadow_nonpresent_or_rsvd_mask = 0;
-	low_phys_bits = boot_cpu_data.x86_cache_bits;
-	if (boot_cpu_data.x86_cache_bits <
-	    52 - shadow_nonpresent_or_rsvd_mask_len) {
+	low_phys_bits = boot_cpu_data.x86_phys_bits;
+	if (boot_cpu_has_bug(X86_BUG_L1TF) &&
+	    !WARN_ON_ONCE(boot_cpu_data.x86_cache_bits >=
+			  52 - shadow_nonpresent_or_rsvd_mask_len)) {
+		low_phys_bits = boot_cpu_data.x86_cache_bits
+			- shadow_nonpresent_or_rsvd_mask_len;
 		shadow_nonpresent_or_rsvd_mask =
-			rsvd_bits(boot_cpu_data.x86_cache_bits -
-				  shadow_nonpresent_or_rsvd_mask_len,
-				  boot_cpu_data.x86_cache_bits - 1);
-		low_phys_bits -= shadow_nonpresent_or_rsvd_mask_len;
-	} else
-		WARN_ON_ONCE(boot_cpu_has_bug(X86_BUG_L1TF));
+			rsvd_bits(low_phys_bits, boot_cpu_data.x86_cache_bits - 1);
+	}
 
 	shadow_nonpresent_or_rsvd_lower_gfn_mask =
 		GENMASK_ULL(low_phys_bits - 1, PAGE_SHIFT);