This is a note to let you know that I've just added the patch titled
x86/mm: Fix vmalloc_fault to use pXd_large
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-fix-vmalloc_fault-to-use-pxd_large.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 18a955219bf7d9008ce480d4451b6b8bf4483a22 Mon Sep 17 00:00:00 2001
From: Toshi Kani <toshi.kani@hpe.com>
Date: Tue, 13 Mar 2018 11:03:46 -0600
Subject: x86/mm: Fix vmalloc_fault to use pXd_large
From: Toshi Kani <toshi.kani@hpe.com>
commit 18a955219bf7d9008ce480d4451b6b8bf4483a22 upstream.
Gratian Crisan reported that vmalloc_fault() crashes when CONFIG_HUGETLBFS
is not set, since the function inadvertently uses pXd_huge(), which always
returns 0 in this case. ioremap() does not depend on CONFIG_HUGETLBFS.
Fix vmalloc_fault() to call pXd_large() instead.
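For context: with CONFIG_HUGETLBFS=n the hugetlb header reduces the
pXd_huge() helpers to constant zero, so the old check could never take
the early return and vmalloc_fault() kept walking into large mappings
as if a PTE level existed. A simplified sketch of those stubs (not the
verbatim kernel header):

  #ifdef CONFIG_HUGETLBFS
  int pmd_huge(pmd_t pmd);        /* real, arch-provided implementations */
  int pud_huge(pud_t pud);
  #else
  #define pmd_huge(x)     0       /* "if (pmd_huge(*pmd_k))" is dead code */
  #define pud_huge(x)     0
  #endif

pXd_large(), by contrast, is implemented unconditionally by the x86
page-table code, so it also covers ioremap() mappings.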
Fixes: f4eafd8bcd52 ("x86/mm: Fix vmalloc_fault() to handle large pages properly")
Reported-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20180313170347.3829-2-toshi.kani@hpe.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/mm/fault.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -330,7 +330,7 @@ static noinline int vmalloc_fault(unsign
if (!pmd_k)
return -1;
- if (pmd_huge(*pmd_k))
+ if (pmd_large(*pmd_k))
return 0;
pte_k = pte_offset_kernel(pmd_k, address);
@@ -475,7 +475,7 @@ static noinline int vmalloc_fault(unsign
if (pud_none(*pud) || pud_pfn(*pud) != pud_pfn(*pud_ref))
BUG();
- if (pud_huge(*pud))
+ if (pud_large(*pud))
return 0;
pmd = pmd_offset(pud, address);
@@ -486,7 +486,7 @@ static noinline int vmalloc_fault(unsign
if (pmd_none(*pmd) || pmd_pfn(*pmd) != pmd_pfn(*pmd_ref))
BUG();
- if (pmd_huge(*pmd))
+ if (pmd_large(*pmd))
return 0;
pte_ref = pte_offset_kernel(pmd_ref, address);
Patches currently in stable-queue which might be from toshi.kani@hpe.com are
queue-4.15/x86-mm-fix-vmalloc_fault-to-use-pxd_large.patch
This is a note to let you know that I've just added the patch titled
selftests/x86/entry_from_vm86: Exit with 1 if we fail
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
selftests-x86-entry_from_vm86-exit-with-1-if-we-fail.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 327d53d005ca47b10eae940616ed11c569f75a9b Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto@kernel.org>
Date: Tue, 13 Mar 2018 22:03:10 -0700
Subject: selftests/x86/entry_from_vm86: Exit with 1 if we fail
From: Andy Lutomirski <luto@kernel.org>
commit 327d53d005ca47b10eae940616ed11c569f75a9b upstream.
Fix a logic error that caused the test to exit with 0 even if test
cases failed.
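The error follows from fork()'s return convention: the parent receives
the child's PID (> 0) and the child receives 0, so the old
"if (fork() > 0) return 0;" made the parent exit with status 0 no
matter how many test cases had failed. A minimal stand-alone
illustration of the convention:

  #include <stdio.h>
  #include <sys/types.h>
  #include <unistd.h>

  int main(void)
  {
          pid_t pid = fork();

          if (pid == 0)   /* child: exit quietly, as the fixed test does */
                  return 0;

          /* parent: pid > 0, so it falls through and can report results */
          printf("parent continues, child pid = %d\n", (int)pid);
          return 0;
  }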
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bartoldeman@gmail.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/b1cc37144038958a469c8f70a5f47a6a5638636a.152100360…
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/x86/entry_from_vm86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -318,7 +318,7 @@ int main(void)
clearhandler(SIGSEGV);
/* Make sure nothing explodes if we fork. */
- if (fork() > 0)
+ if (fork() == 0)
return 0;
return (nerrs == 0 ? 0 : 1);
Patches currently in stable-queue which might be from luto@kernel.org are
queue-4.15/x86-speculation-objtool-annotate-indirect-calls-jumps-for-objtool-on-32-bit-kernels.patch
queue-4.15/x86-vm86-32-fix-popf-emulation.patch
queue-4.15/x86-mm-fix-vmalloc_fault-to-use-pxd_large.patch
queue-4.15/selftests-x86-entry_from_vm86-add-test-cases-for-popf.patch
queue-4.15/selftests-x86-entry_from_vm86-exit-with-1-if-we-fail.patch
This is a note to let you know that I've just added the patch titled
RDMAVT: Fix synchronization around percpu_ref
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rdmavt-fix-synchronization-around-percpu_ref.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 74b44bbe80b4c62113ac1501482ea1ee40eb9d67 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 14 Mar 2018 12:10:18 -0700
Subject: RDMAVT: Fix synchronization around percpu_ref
From: Tejun Heo <tj@kernel.org>
commit 74b44bbe80b4c62113ac1501482ea1ee40eb9d67 upstream.
rvt_mregion uses percpu_ref for reference counting and RCU to protect
accesses from lkey_table. When a rvt_mregion needs to be freed, it
first gets unregistered from lkey_table and then rvt_check_refs() is
called to wait for in-flight usages before the rvt_mregion is freed.
rvt_check_refs() seems to have a couple issues.
* It has a fast exit path which tests percpu_ref_is_zero(). However,
a percpu_ref reading zero doesn't mean that the object can be
released. In fact, the ->release() callback might not even have
started executing yet. Proceeding with freeing can lead to
use-after-free.
* lkey_table is RCU protected but there is no RCU grace period in the
free path. percpu_ref uses RCU internally but it's sched-RCU whose
grace periods are different from regular RCU. Also, it generally
isn't a good idea to depend on internal behaviors like this.
To address the above issues, this patch removes the fast exit and adds
an explicit synchronize_rcu().
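The resulting shape of the teardown, as a generic sketch of the pattern
rather than the rvt code itself (struct and field names are
hypothetical):

  #include <linux/completion.h>
  #include <linux/kernel.h>
  #include <linux/percpu-refcount.h>
  #include <linux/rcupdate.h>
  #include <linux/slab.h>

  struct obj {
          struct percpu_ref refcount;
          struct completion freed;
  };

  /* registered as the ->release callback at percpu_ref_init() time */
  static void obj_release(struct percpu_ref *ref)
  {
          struct obj *o = container_of(ref, struct obj, refcount);

          complete(&o->freed);            /* runs after the last ref drops */
  }

  static void obj_free(struct obj *o)
  {
          percpu_ref_kill(&o->refcount);  /* drop the initial reference */
          wait_for_completion(&o->freed); /* ->release has fully run */
          synchronize_rcu();              /* RCU readers of the lookup table */
          kfree(o);
  }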
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: linux-rdma@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/infiniband/sw/rdmavt/mr.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -489,11 +489,13 @@ static int rvt_check_refs(struct rvt_mre
unsigned long timeout;
struct rvt_dev_info *rdi = ib_to_rvt(mr->pd->device);
- if (percpu_ref_is_zero(&mr->refcount))
- return 0;
- /* avoid dma mr */
- if (mr->lkey)
+ if (mr->lkey) {
+ /* avoid dma mr */
rvt_dereg_clean_qps(mr);
+ /* @mr was indexed on rcu protected @lkey_table */
+ synchronize_rcu();
+ }
+
timeout = wait_for_completion_timeout(&mr->comp, 5 * HZ);
if (!timeout) {
rvt_pr_err(rdi,
Patches currently in stable-queue which might be from tj@kernel.org are
queue-4.15/rdmavt-fix-synchronization-around-percpu_ref.patch
queue-4.15/fs-aio-add-explicit-rcu-grace-period-when-freeing-kioctx.patch
queue-4.15/fs-aio-use-rcu-accessors-for-kioctx_table-table.patch
This is a note to let you know that I've just added the patch titled
selftests/x86/entry_from_vm86: Add test cases for POPF
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
selftests-x86-entry_from_vm86-add-test-cases-for-popf.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 78393fdde2a456cafa414b171c90f26a3df98b20 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto@kernel.org>
Date: Tue, 13 Mar 2018 22:03:11 -0700
Subject: selftests/x86/entry_from_vm86: Add test cases for POPF
From: Andy Lutomirski <luto@kernel.org>
commit 78393fdde2a456cafa414b171c90f26a3df98b20 upstream.
POPF is currently broken -- add tests to catch the error. This
results in:
[RUN] POPF with VIP set and IF clear from vm86 mode
[INFO] Exited vm86 mode due to STI
[FAIL] Incorrect return reason (started at eip = 0xd, ended at eip = 0xf)
because POPF currently fails to check IF before reporting a pending
interrupt.
This patch also makes the FAIL message a bit more informative.
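The rule the new cases pin down: a pending virtual interrupt (VIP set)
may only be reported once the popped flags actually enable interrupts
(IF set). An illustrative predicate, not the kernel's emulation code:

  #include <stdbool.h>

  #define X86_EFLAGS_IF   (1u << 9)       /* interrupt enable flag */
  #define X86_EFLAGS_VIP  (1u << 20)      /* virtual interrupt pending */

  /* Should a vm86-mode POPF trap back to the monitor (VM86_STI)? */
  static bool popf_should_trap(unsigned int popped_flags, unsigned int vflags)
  {
          return (vflags & X86_EFLAGS_VIP) && (popped_flags & X86_EFLAGS_IF);
  }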
Reported-by: Bart Oldeman <bartoldeman@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/a16270b5cfe7832d6d00c479d0f871066cbdb52b.152100360…
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/x86/entry_from_vm86.c | 30 +++++++++++++++++++++++---
1 file changed, 27 insertions(+), 3 deletions(-)
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -95,6 +95,10 @@ asm (
"int3\n\t"
"vmcode_int80:\n\t"
"int $0x80\n\t"
+ "vmcode_popf_hlt:\n\t"
+ "push %ax\n\t"
+ "popf\n\t"
+ "hlt\n\t"
"vmcode_umip:\n\t"
/* addressing via displacements */
"smsw (2052)\n\t"
@@ -124,8 +128,8 @@ asm (
extern unsigned char vmcode[], end_vmcode[];
extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
- vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[],
- vmcode_umip_str[], vmcode_umip_sldt[];
+ vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_popf_hlt[],
+ vmcode_umip[], vmcode_umip_str[], vmcode_umip_sldt[];
/* Returns false if the test was skipped. */
static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -175,7 +179,7 @@ static bool do_test(struct vm86plus_stru
(VM86_TYPE(ret) == rettype && VM86_ARG(ret) == retarg)) {
printf("[OK]\tReturned correctly\n");
} else {
- printf("[FAIL]\tIncorrect return reason\n");
+ printf("[FAIL]\tIncorrect return reason (started at eip = 0x%lx, ended at eip = 0x%lx)\n", eip, v86->regs.eip);
nerrs++;
}
@@ -264,6 +268,9 @@ int main(void)
v86.regs.ds = load_addr / 16;
v86.regs.es = load_addr / 16;
+ /* Use the end of the page as our stack. */
+ v86.regs.esp = 4096;
+
assert((v86.regs.cs & 3) == 0); /* Looks like RPL = 0 */
/* #BR -- should deliver SIG??? */
@@ -295,6 +302,23 @@ int main(void)
v86.regs.eflags &= ~X86_EFLAGS_IF;
do_test(&v86, vmcode_sti - vmcode, VM86_STI, 0, "STI with VIP set");
+ /* POPF with VIP set but IF clear: should not trap */
+ v86.regs.eflags = X86_EFLAGS_VIP;
+ v86.regs.eax = 0;
+ do_test(&v86, vmcode_popf_hlt - vmcode, VM86_UNKNOWN, 0, "POPF with VIP set and IF clear");
+
+ /* POPF with VIP set and IF set: should trap */
+ v86.regs.eflags = X86_EFLAGS_VIP;
+ v86.regs.eax = X86_EFLAGS_IF;
+ do_test(&v86, vmcode_popf_hlt - vmcode, VM86_STI, 0, "POPF with VIP and IF set");
+
+ /* POPF with VIP clear and IF set: should not trap */
+ v86.regs.eflags = 0;
+ v86.regs.eax = X86_EFLAGS_IF;
+ do_test(&v86, vmcode_popf_hlt - vmcode, VM86_UNKNOWN, 0, "POPF with VIP clear and IF set");
+
+ v86.regs.eflags = 0;
+
/* INT3 -- should cause #BP */
do_test(&v86, vmcode_int3 - vmcode, VM86_TRAP, 3, "INT3");
Patches currently in stable-queue which might be from luto@kernel.org are
queue-4.15/x86-speculation-objtool-annotate-indirect-calls-jumps-for-objtool-on-32-bit-kernels.patch
queue-4.15/x86-vm86-32-fix-popf-emulation.patch
queue-4.15/x86-mm-fix-vmalloc_fault-to-use-pxd_large.patch
queue-4.15/selftests-x86-entry_from_vm86-add-test-cases-for-popf.patch
queue-4.15/selftests-x86-entry_from_vm86-exit-with-1-if-we-fail.patch
This is a note to let you know that I've just added the patch titled
parisc: Handle case where flush_cache_range is called with no context
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
parisc-handle-case-where-flush_cache_range-is-called-with-no-context.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 9ef0f88fe5466c2ca1d2975549ba6be502c464c1 Mon Sep 17 00:00:00 2001
From: John David Anglin <dave.anglin@bell.net>
Date: Wed, 7 Mar 2018 08:18:05 -0500
Subject: parisc: Handle case where flush_cache_range is called with no context
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: John David Anglin <dave.anglin@bell.net>
commit 9ef0f88fe5466c2ca1d2975549ba6be502c464c1 upstream.
Just when I had decided that flush_cache_range() was always called with
a valid context, Helge reported two cases where the
"BUG_ON(!vma->vm_mm->context);" was hit on the phantom buildd:
kernel BUG at /mnt/sdb6/linux/linux-4.15.4/arch/parisc/kernel/cache.c:587!
CPU: 1 PID: 3254 Comm: kworker/1:2 Tainted: G D 4.15.0-1-parisc64-smp #1 Debian 4.15.4-1+b1
Workqueue: events free_ioctx
IAOQ[0]: flush_cache_range+0x164/0x168
IAOQ[1]: flush_cache_page+0x0/0x1c8
RP(r2): unmap_page_range+0xae8/0xb88
Backtrace:
[<00000000404a6980>] unmap_page_range+0xae8/0xb88
[<00000000404a6ae0>] unmap_single_vma+0xc0/0x188
[<00000000404a6cdc>] zap_page_range_single+0x134/0x1f8
[<00000000404a702c>] unmap_mapping_range+0x1cc/0x208
[<0000000040461518>] truncate_pagecache+0x98/0x108
[<0000000040461624>] truncate_setsize+0x9c/0xb8
[<00000000405d7f30>] put_aio_ring_file+0x80/0x100
[<00000000405d803c>] aio_free_ring+0x8c/0x290
[<00000000405d82c0>] free_ioctx+0x80/0x180
[<0000000040284e6c>] process_one_work+0x21c/0x668
[<00000000402854c4>] worker_thread+0x20c/0x778
[<0000000040291d44>] kthread+0x2d4/0x2e0
[<0000000040204020>] end_fault_vector+0x20/0xc0
This indicates that we need to handle the no context case in
flush_cache_range() as we do in flush_cache_mm().
In thinking about this, I realized that we don't need to flush the TLB
when there is no context. So, I added context checks to the large flush
cases in flush_cache_mm() and flush_cache_range(). The large flush case
occurs frequently in flush_cache_mm() and the change should improve fork
performance.
The v2 version of this change removes the BUG_ON from flush_cache_page()
by skipping the TLB flush when there is no context. I also added code
to flush the TLB in flush_cache_mm() and flush_cache_range() when we
have a context that's not current. Now all three routines handle TLB
flushes in a similar manner.
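For readers unfamiliar with parisc: mm->context holds the mm's space ID,
and mfsp(3) reads space register %sr3, which holds the space ID of the
currently running user context. The resulting decision logic, sketched:

  if (vma->vm_mm->context == mfsp(3)) {
          /* current context: flush cache and TLB by address range */
  } else if (vma->vm_mm->context) {
          /* live but non-current context: walk the page table, flushing
             the TLB page by page alongside the cache */
  } else {
          /* no context: nothing can be in the TLB, skip TLB flushes */
  }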
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/parisc/kernel/cache.c | 41 ++++++++++++++++++++++++++++++++---------
1 file changed, 32 insertions(+), 9 deletions(-)
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -543,7 +543,8 @@ void flush_cache_mm(struct mm_struct *mm
rp3440, etc. So, avoid it if the mm isn't too big. */
if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
mm_total_size(mm) >= parisc_cache_flush_threshold) {
- flush_tlb_all();
+ if (mm->context)
+ flush_tlb_all();
flush_cache_all();
return;
}
@@ -571,6 +572,8 @@ void flush_cache_mm(struct mm_struct *mm
pfn = pte_pfn(*ptep);
if (!pfn_valid(pfn))
continue;
+ if (unlikely(mm->context))
+ flush_tlb_page(vma, addr);
__flush_cache_page(vma, addr, PFN_PHYS(pfn));
}
}
@@ -579,26 +582,46 @@ void flush_cache_mm(struct mm_struct *mm
void flush_cache_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end)
{
+ pgd_t *pgd;
+ unsigned long addr;
+
if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
end - start >= parisc_cache_flush_threshold) {
- flush_tlb_range(vma, start, end);
+ if (vma->vm_mm->context)
+ flush_tlb_range(vma, start, end);
flush_cache_all();
return;
}
- flush_user_dcache_range_asm(start, end);
- if (vma->vm_flags & VM_EXEC)
- flush_user_icache_range_asm(start, end);
- flush_tlb_range(vma, start, end);
+ if (vma->vm_mm->context == mfsp(3)) {
+ flush_user_dcache_range_asm(start, end);
+ if (vma->vm_flags & VM_EXEC)
+ flush_user_icache_range_asm(start, end);
+ flush_tlb_range(vma, start, end);
+ return;
+ }
+
+ pgd = vma->vm_mm->pgd;
+ for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
+ unsigned long pfn;
+ pte_t *ptep = get_ptep(pgd, addr);
+ if (!ptep)
+ continue;
+ pfn = pte_pfn(*ptep);
+ if (pfn_valid(pfn)) {
+ if (unlikely(vma->vm_mm->context))
+ flush_tlb_page(vma, addr);
+ __flush_cache_page(vma, addr, PFN_PHYS(pfn));
+ }
+ }
}
void
flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn)
{
- BUG_ON(!vma->vm_mm->context);
-
if (pfn_valid(pfn)) {
- flush_tlb_page(vma, vmaddr);
+ if (likely(vma->vm_mm->context))
+ flush_tlb_page(vma, vmaddr);
__flush_cache_page(vma, vmaddr, PFN_PHYS(pfn));
}
}
Patches currently in stable-queue which might be from dave.anglin@bell.net are
queue-4.15/parisc-handle-case-where-flush_cache_range-is-called-with-no-context.patch
This is a note to let you know that I've just added the patch titled
lock_parent() needs to recheck if dentry got __dentry_kill'ed under it
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
lock_parent-needs-to-recheck-if-dentry-got-__dentry_kill-ed-under-it.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 3b821409632ab778d46e807516b457dfa72736ed Mon Sep 17 00:00:00 2001
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Fri, 23 Feb 2018 20:47:17 -0500
Subject: lock_parent() needs to recheck if dentry got __dentry_kill'ed under it
From: Al Viro <viro@zeniv.linux.org.uk>
commit 3b821409632ab778d46e807516b457dfa72736ed upstream.
In case when dentry passed to lock_parent() is protected from freeing only
by the fact that it's on a shrink list and trylock of parent fails, we
could get hit by __dentry_kill() (and subsequent dentry_kill(parent))
between unlocking dentry and locking presumed parent. We need to recheck
that dentry is alive once we lock both it and parent *and* postpone
rcu_read_unlock() until after that point. Otherwise we could return
a pointer to struct dentry that already is rcu-scheduled for freeing, with
->d_lock held on it; caller's subsequent attempt to unlock it can end
up with memory corruption.
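The window, spelled out as a schematic interleaving (T1 is the
lock_parent() caller, T2 the killer; not literal code):

  /*
   * T1: spin_trylock(&parent->d_lock) fails
   * T1: spin_unlock(&dentry->d_lock)   [old code: rcu_read_unlock() too]
   * T2: __dentry_kill(dentry)          -> d_lockref.count goes negative,
   *                                       dentry scheduled for RCU freeing
   * T2: dentry_kill(parent)            -> parent may be torn down as well
   * T1: spin_lock(&parent->d_lock); spin_lock_nested(&dentry->d_lock, ...)
   * T1: returns parent with ->d_lock held on an already-dead dentry
   *
   * Hence the fix: recheck dentry->d_lockref.count < 0 with both locks
   * held, and keep the RCU read lock until after that recheck.
   */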
Cc: stable@vger.kernel.org # 3.12+, counting backports
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/dcache.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -644,11 +644,16 @@ again:
spin_unlock(&parent->d_lock);
goto again;
}
- rcu_read_unlock();
- if (parent != dentry)
+ if (parent != dentry) {
spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
- else
+ if (unlikely(dentry->d_lockref.count < 0)) {
+ spin_unlock(&parent->d_lock);
+ parent = NULL;
+ }
+ } else {
parent = NULL;
+ }
+ rcu_read_unlock();
return parent;
}
Patches currently in stable-queue which might be from viro@zeniv.linux.org.uk are
queue-4.15/fs-teach-path_connected-to-handle-nfs-filesystems-with-multiple-roots.patch
queue-4.15/lock_parent-needs-to-recheck-if-dentry-got-__dentry_kill-ed-under-it.patch
This is a note to let you know that I've just added the patch titled
kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-arm-arm64-vgic-v3-tighten-synchronization-for-guests-using-v2-on-v3.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 27e91ad1e746e341ca2312f29bccb9736be7b476 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <marc.zyngier@arm.com>
Date: Tue, 6 Mar 2018 21:44:37 +0000
Subject: kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
From: Marc Zyngier <marc.zyngier@arm.com>
commit 27e91ad1e746e341ca2312f29bccb9736be7b476 upstream.
On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
force synchronization between the memory-mapped guest view and
the system-register view that the hypervisor uses.
This is incorrect, as the spec calls out the need for "a DSB whose
required access type is both loads and stores with any Shareability
attribute", while we're only synchronizing stores.
We also lack an isb after the dsb to ensure that the latter has
actually been executed before we start reading stuff from the sysregs.
The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
just after.
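In ARM barrier terms, as a summary rather than a substitute for the
architecture spec:

  dsb(st);  /* waits only for outstanding *stores* to complete */
  dsb(sy);  /* full-system DSB: waits for loads and stores alike */
  isb();    /* pipeline flush: later instructions -- here the sysreg
               read of ICH_VMCR_EL2 -- cannot execute before the DSB
               has taken effect */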
Cc: stable@vger.kernel.org
Fixes: f68d2b1b73cc ("arm64: KVM: Implement vgic-v3 save/restore")
Acked-by: Christoffer Dall <cdall@kernel.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
virt/kvm/arm/hyp/vgic-v3-sr.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -215,7 +215,8 @@ void __hyp_text __vgic_v3_save_state(str
* are now visible to the system register interface.
*/
if (!cpu_if->vgic_sre) {
- dsb(st);
+ dsb(sy);
+ isb();
cpu_if->vgic_vmcr = read_gicreg(ICH_VMCR_EL2);
}
Patches currently in stable-queue which might be from marc.zyngier@arm.com are
queue-4.15/kvm-arm-arm64-vgic-don-t-populate-multiple-lrs-with-the-same-vintid.patch
queue-4.15/kvm-arm-arm64-vgic-v3-tighten-synchronization-for-guests-using-v2-on-v3.patch
queue-4.15/kvm-arm-arm64-reset-mapped-irqs-on-vm-reset.patch
queue-4.15/kvm-arm-arm64-reduce-verbosity-of-kvm-init-log.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: Fix device passthrough when SME is active
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-device-passthrough-when-sme-is-active.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From daaf216c06fba4ee4dc3f62715667da929d68774 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Thu, 8 Mar 2018 17:17:31 -0600
Subject: KVM: x86: Fix device passthrough when SME is active
From: Tom Lendacky <thomas.lendacky@amd.com>
commit daaf216c06fba4ee4dc3f62715667da929d68774 upstream.
When using device passthrough with SME active, the MMIO range that is
mapped for the device should not be mapped encrypted. Add a check in
set_spte() to ensure that a page is not mapped encrypted if that page
is a device MMIO page as indicated by kvm_is_mmio_pfn().
Cc: <stable@vger.kernel.org> # 4.14.x-
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/kvm/mmu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2758,8 +2758,10 @@ static int set_spte(struct kvm_vcpu *vcp
else
pte_access &= ~ACC_WRITE_MASK;
+ if (!kvm_is_mmio_pfn(pfn))
+ spte |= shadow_me_mask;
+
spte |= (u64)pfn << PAGE_SHIFT;
- spte |= shadow_me_mask;
if (pte_access & ACC_WRITE_MASK) {
Patches currently in stable-queue which might be from thomas.lendacky@amd.com are
queue-4.15/kvm-x86-fix-device-passthrough-when-sme-is-active.patch
queue-4.15/x86-cpufeatures-add-intel-pconfig-cpufeature.patch
queue-4.15/x86-cpufeatures-add-intel-total-memory-encryption-cpufeature.patch
This is a note to let you know that I've just added the patch titled
KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-arm-arm64-vgic-don-t-populate-multiple-lrs-with-the-same-vintid.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 16ca6a607d84bef0129698d8d808f501afd08d43 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <marc.zyngier@arm.com>
Date: Tue, 6 Mar 2018 21:48:01 +0000
Subject: KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
From: Marc Zyngier <marc.zyngier@arm.com>
commit 16ca6a607d84bef0129698d8d808f501afd08d43 upstream.
The vgic code is trying to be clever when injecting GICv2 SGIs,
and will happily populate LRs with the same interrupt number if
they come from multiple vcpus (after all, they are distinct
interrupt sources).
Unfortunately, this is against the letter of the architecture,
and the GICv2 architecture spec says "Each valid interrupt stored
in the List registers must have a unique VirtualID for that
virtual CPU interface.". GICv3 has similar (although slightly
ambiguous) restrictions.
This results in guests locking up when using GICv2-on-GICv3, for
example. The obvious fix is to stop trying so hard, and inject
a single vcpu per SGI per guest entry. After all, pending SGIs
with multiple source vcpus are pretty rare, and are mostly seen
in scenarios where the physical CPUs are severely overcommitted.
But as we now only inject a single instance of a multi-source SGI per
vcpu entry, we may delay those interrupts for longer than strictly
necessary, and run the risk of injecting lower priority interrupts
in the meantime.
In order to address this, we adopt a three-stage strategy:
- If we encounter a multi-source SGI in the AP list while computing
its depth, we force the list to be sorted
- When populating the LRs, we prevent the injection of any interrupt
of lower priority than that of the first multi-source SGI we've
injected.
- Finally, the injection of a multi-source SGI triggers the request
of a maintenance interrupt when there will be no pending interrupt
in the LRs (HCR_NPIE).
At the point where the last pending interrupt in the LRs switches
from Pending to Active, the maintenance interrupt will be delivered,
allowing us to add the remaining SGIs using the same process.
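For scale, irq->source is a byte with one bit per possible source vcpu,
so a single AP-list entry can consume several LRs. A worked
micro-example of the counting:

  /*
   * An SGI pending from source vcpus 0, 2 and 5:
   *   irq->source           == 0x25  (binary 00100101)
   *   hweight8(irq->source) == 3     -> this one entry needs three LRs
   */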
Cc: stable@vger.kernel.org
Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Acked-by: Christoffer Dall <cdall@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/irqchip/arm-gic-v3.h | 1
include/linux/irqchip/arm-gic.h | 1
virt/kvm/arm/vgic/vgic-v2.c | 9 ++++-
virt/kvm/arm/vgic/vgic-v3.c | 9 ++++-
virt/kvm/arm/vgic/vgic.c | 61 ++++++++++++++++++++++++++++---------
virt/kvm/arm/vgic/vgic.h | 2 +
6 files changed, 67 insertions(+), 16 deletions(-)
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -503,6 +503,7 @@
#define ICH_HCR_EN (1 << 0)
#define ICH_HCR_UIE (1 << 1)
+#define ICH_HCR_NPIE (1 << 3)
#define ICH_HCR_TC (1 << 10)
#define ICH_HCR_TALL0 (1 << 11)
#define ICH_HCR_TALL1 (1 << 12)
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -84,6 +84,7 @@
#define GICH_HCR_EN (1 << 0)
#define GICH_HCR_UIE (1 << 1)
+#define GICH_HCR_NPIE (1 << 3)
#define GICH_LR_VIRTUALID (0x3ff << 0)
#define GICH_LR_PHYSID_CPUID_SHIFT (10)
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -37,6 +37,13 @@ void vgic_v2_init_lrs(void)
vgic_v2_write_lr(i, 0);
}
+void vgic_v2_set_npie(struct kvm_vcpu *vcpu)
+{
+ struct vgic_v2_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v2;
+
+ cpuif->vgic_hcr |= GICH_HCR_NPIE;
+}
+
void vgic_v2_set_underflow(struct kvm_vcpu *vcpu)
{
struct vgic_v2_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v2;
@@ -64,7 +71,7 @@ void vgic_v2_fold_lr_state(struct kvm_vc
int lr;
unsigned long flags;
- cpuif->vgic_hcr &= ~GICH_HCR_UIE;
+ cpuif->vgic_hcr &= ~(GICH_HCR_UIE | GICH_HCR_NPIE);
for (lr = 0; lr < vgic_cpu->used_lrs; lr++) {
u32 val = cpuif->vgic_lr[lr];
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -26,6 +26,13 @@ static bool group1_trap;
static bool common_trap;
static bool gicv4_enable;
+void vgic_v3_set_npie(struct kvm_vcpu *vcpu)
+{
+ struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
+
+ cpuif->vgic_hcr |= ICH_HCR_NPIE;
+}
+
void vgic_v3_set_underflow(struct kvm_vcpu *vcpu)
{
struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
@@ -47,7 +54,7 @@ void vgic_v3_fold_lr_state(struct kvm_vc
int lr;
unsigned long flags;
- cpuif->vgic_hcr &= ~ICH_HCR_UIE;
+ cpuif->vgic_hcr &= ~(ICH_HCR_UIE | ICH_HCR_NPIE);
for (lr = 0; lr < vgic_cpu->used_lrs; lr++) {
u64 val = cpuif->vgic_lr[lr];
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -675,22 +675,37 @@ static inline void vgic_set_underflow(st
vgic_v3_set_underflow(vcpu);
}
+static inline void vgic_set_npie(struct kvm_vcpu *vcpu)
+{
+ if (kvm_vgic_global_state.type == VGIC_V2)
+ vgic_v2_set_npie(vcpu);
+ else
+ vgic_v3_set_npie(vcpu);
+}
+
/* Requires the ap_list_lock to be held. */
-static int compute_ap_list_depth(struct kvm_vcpu *vcpu)
+static int compute_ap_list_depth(struct kvm_vcpu *vcpu,
+ bool *multi_sgi)
{
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
struct vgic_irq *irq;
int count = 0;
+ *multi_sgi = false;
+
DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&vgic_cpu->ap_list_lock));
list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
spin_lock(&irq->irq_lock);
/* GICv2 SGIs can count for more than one... */
- if (vgic_irq_is_sgi(irq->intid) && irq->source)
- count += hweight8(irq->source);
- else
+ if (vgic_irq_is_sgi(irq->intid) && irq->source) {
+ int w = hweight8(irq->source);
+
+ count += w;
+ *multi_sgi |= (w > 1);
+ } else {
count++;
+ }
spin_unlock(&irq->irq_lock);
}
return count;
@@ -701,28 +716,43 @@ static void vgic_flush_lr_state(struct k
{
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
struct vgic_irq *irq;
- int count = 0;
+ int count;
+ bool npie = false;
+ bool multi_sgi;
+ u8 prio = 0xff;
DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&vgic_cpu->ap_list_lock));
- if (compute_ap_list_depth(vcpu) > kvm_vgic_global_state.nr_lr)
+ count = compute_ap_list_depth(vcpu, &multi_sgi);
+ if (count > kvm_vgic_global_state.nr_lr || multi_sgi)
vgic_sort_ap_list(vcpu);
+ count = 0;
+
list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
spin_lock(&irq->irq_lock);
- if (unlikely(vgic_target_oracle(irq) != vcpu))
- goto next;
-
/*
- * If we get an SGI with multiple sources, try to get
- * them in all at once.
+ * If we have multi-SGIs in the pipeline, we need to
+ * guarantee that they are all seen before any IRQ of
+ * lower priority. In that case, we need to filter out
+ * these interrupts by exiting early. This is easy as
+ * the AP list has been sorted already.
*/
- do {
+ if (multi_sgi && irq->priority > prio) {
+ spin_unlock(&irq->irq_lock);
+ break;
+ }
+
+ if (likely(vgic_target_oracle(irq) == vcpu)) {
vgic_populate_lr(vcpu, irq, count++);
- } while (irq->source && count < kvm_vgic_global_state.nr_lr);
-next:
+ if (irq->source) {
+ npie = true;
+ prio = irq->priority;
+ }
+ }
+
spin_unlock(&irq->irq_lock);
if (count == kvm_vgic_global_state.nr_lr) {
@@ -733,6 +763,9 @@ next:
}
}
+ if (npie)
+ vgic_set_npie(vcpu);
+
vcpu->arch.vgic_cpu.used_lrs = count;
/* Nuke remaining LRs */
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -151,6 +151,7 @@ void vgic_v2_fold_lr_state(struct kvm_vc
void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
void vgic_v2_clear_lr(struct kvm_vcpu *vcpu, int lr);
void vgic_v2_set_underflow(struct kvm_vcpu *vcpu);
+void vgic_v2_set_npie(struct kvm_vcpu *vcpu);
int vgic_v2_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr);
int vgic_v2_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write,
int offset, u32 *val);
@@ -180,6 +181,7 @@ void vgic_v3_fold_lr_state(struct kvm_vc
void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
void vgic_v3_clear_lr(struct kvm_vcpu *vcpu, int lr);
void vgic_v3_set_underflow(struct kvm_vcpu *vcpu);
+void vgic_v3_set_npie(struct kvm_vcpu *vcpu);
void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
void vgic_v3_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
void vgic_v3_enable(struct kvm_vcpu *vcpu);
Patches currently in stable-queue which might be from marc.zyngier@arm.com are
queue-4.15/kvm-arm-arm64-vgic-don-t-populate-multiple-lrs-with-the-same-vintid.patch
queue-4.15/kvm-arm-arm64-vgic-v3-tighten-synchronization-for-guests-using-v2-on-v3.patch
queue-4.15/kvm-arm-arm64-reset-mapped-irqs-on-vm-reset.patch
queue-4.15/kvm-arm-arm64-reduce-verbosity-of-kvm-init-log.patch
This is a note to let you know that I've just added the patch titled
KVM: arm/arm64: Reduce verbosity of KVM init log
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-arm-arm64-reduce-verbosity-of-kvm-init-log.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 76600428c3677659e3c3633bb4f2ea302220a275 Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Date: Fri, 2 Mar 2018 08:16:30 +0000
Subject: KVM: arm/arm64: Reduce verbosity of KVM init log
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
commit 76600428c3677659e3c3633bb4f2ea302220a275 upstream.
On my GICv3 system, the following is printed to the kernel log at boot:
kvm [1]: 8-bit VMID
kvm [1]: IDMAP page: d20e35000
kvm [1]: HYP VA range: 800000000000:ffffffffffff
kvm [1]: vgic-v2@2c020000
kvm [1]: GIC system register CPU interface enabled
kvm [1]: vgic interrupt IRQ1
kvm [1]: virtual timer IRQ4
kvm [1]: Hyp mode initialized successfully
The KVM IDMAP is a mapping of a statically allocated kernel structure,
and so printing its physical address leaks the physical placement of
the kernel when physical KASLR is in effect. So change the kvm_info()
to kvm_debug() to remove it from the log output.
While at it, trim the output a bit more: IRQ numbers can be found in
/proc/interrupts, and the HYP VA and vgic-v2 lines are not highly
informational either.
Cc: <stable@vger.kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Christoffer Dall <cdall@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
virt/kvm/arm/arch_timer.c | 2 +-
virt/kvm/arm/mmu.c | 6 +++---
virt/kvm/arm/vgic/vgic-v2.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -773,7 +773,7 @@ int kvm_timer_hyp_init(bool has_gic)
}
}
- kvm_info("virtual timer IRQ%d\n", host_vtimer_irq);
+ kvm_debug("virtual timer IRQ%d\n", host_vtimer_irq);
cpuhp_setup_state(CPUHP_AP_KVM_ARM_TIMER_STARTING,
"kvm/arm/timer:starting", kvm_timer_starting_cpu,
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1760,9 +1760,9 @@ int kvm_mmu_init(void)
*/
BUG_ON((hyp_idmap_start ^ (hyp_idmap_end - 1)) & PAGE_MASK);
- kvm_info("IDMAP page: %lx\n", hyp_idmap_start);
- kvm_info("HYP VA range: %lx:%lx\n",
- kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
+ kvm_debug("IDMAP page: %lx\n", hyp_idmap_start);
+ kvm_debug("HYP VA range: %lx:%lx\n",
+ kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
hyp_idmap_start < kern_hyp_va(~0UL) &&
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -381,7 +381,7 @@ int vgic_v2_probe(const struct gic_kvm_i
kvm_vgic_global_state.type = VGIC_V2;
kvm_vgic_global_state.max_gic_vcpus = VGIC_V2_MAX_CPUS;
- kvm_info("vgic-v2@%llx\n", info->vctrl.start);
+ kvm_debug("vgic-v2@%llx\n", info->vctrl.start);
return 0;
out:
Patches currently in stable-queue which might be from ard.biesheuvel@linaro.org are
queue-4.15/kvm-arm-arm64-reduce-verbosity-of-kvm-init-log.patch