Once an exception has been injected, any side effects related to
the exception (such as setting CR2 or DR6) have been taked place.
Therefore, once KVM sets the VM-entry interruption information
field or the AMD EVENTINJ field, the next VM-entry must deliver that
exception.
Pending interrupts are processed after injected exceptions, so
in theory it would not be a problem to use KVM_INTERRUPT when
an injected exception is present. However, DOSEMU is using
run->ready_for_interrupt_injection to detect interrupt windows
and then using KVM_SET_SREGS/KVM_SET_REGS to inject the
interrupt manually. For this to work, the interrupt window
must be delayed after the completion of the previous event
injection.
Cc: stable(a)vger.kernel.org
Reported-by: Stas Sergeev <stsp2(a)yandex.ru>
Tested-by: Stas Sergeev <stsp2(a)yandex.ru>
Fixes: 71cc849b7093 ("KVM: x86: Fix split-irqchip vs interrupt injection window request")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
(cherry picked from commit 043264d97e9ab74cc9661c8b1f9c00c8ce24cad9)
---
arch/x86/kvm/x86.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4116567f3d44..5e921f1e00db 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4358,8 +4358,18 @@ static int kvm_cpu_accept_dm_intr(struct kvm_vcpu *vcpu)
static int kvm_vcpu_ready_for_interrupt_injection(struct kvm_vcpu *vcpu)
{
+ /*
+ * Do not cause an interrupt window exit if an exception
+ * is pending or an event needs reinjection; userspace
+ * might want to inject the interrupt manually using KVM_SET_REGS
+ * or KVM_SET_SREGS. For that to work, we must be at an
+ * instruction boundary and with no events half-injected.
+ */
return kvm_arch_interrupt_allowed(vcpu) &&
- kvm_cpu_accept_dm_intr(vcpu);
+ kvm_cpu_accept_dm_intr(vcpu) &&
+ !kvm_event_needs_reinjection(vcpu)
+ !vcpu->arch.exception.pending;
+
}
static int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
--
2.27.0
Den 2021-07-27 kl. 20:06, skrev Ben Greear:
> On 7/27/21 9:50 AM, pgndev wrote:
>> embedded. checking...
>>
>>
>> iiuc, it's got an i2c. possible a uart is on Irq4 thru the i2c?
>>
>>
>> if so, might wanna take a look here:
>>
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=213031
>>
>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__bugzilla.kernel.org_sh…>
>>
>>
>>
>>
>> maybe related? at least shares symptoms...
>>
>>
>> ACPI subsystem lead sez in offlist thread re: that
>>
>>
>> "It looks like commit 96b15a0b45182f1c3da5a861196da27000da2e3c
>> needs to
>>
>> be reverted."
>
> I don't see that commit in linus tree nor my stable tree, can you check
> that hash and also
> show me the commit message and other info so I can track it down?
>
AFAIK, that commit is supposed to reference
0ec4e55e9f571f08970ed115ec0addc691eda613 in linus tree, landed in 5.13.2
'
and a matching thread:
"boot of J1900 (quad-core Celeron) mobo: kernel <= 5.12.15, OK; kernel
>= 5.12.17, 5.13.4, slow boot (>> 660 secs) + hang/FAIL"
on stable@ ml.
--
Thomas
From: Mike Rapoport <rppt(a)linux.ibm.com>
The memory reserved by console/PALcode or non-volatile memory is not added
to memblock.memory.
Since commit fa3354e4ea39 (mm: free_area_init: use maximal zone PFNs rather
than zone sizes) the initialization of the memory map relies on the
accuracy of memblock.memory to properly calculate zone sizes. The holes in
memblock.memory caused by absent regions reserved by the firmware cause
incorrect initialization of struct pages which leads to BUG() during the
initial page freeing:
BUG: Bad page state in process swapper pfn:2ffc53
page:fffffc000ecf14c0 refcount:0 mapcount:1 mapping:0000000000000000 index:0x0
flags: 0x0()
raw: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
raw: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-03841-gfa3354e4ea39-dirty #26
fffffc0001b5bd68 fffffc0001b5be80 fffffc00011cd148 fffffc000ecf14c0
fffffc00019803df fffffc0001b5be80 fffffc00011ce340 fffffc000ecf14c0
0000000000000000 fffffc0001b5be80 fffffc0001b482c0 fffffc00027d6618
fffffc00027da7d0 00000000002ff97a 0000000000000000 fffffc0001b5be80
fffffc00011d1abc fffffc000ecf14c0 fffffc0002d00000 fffffc0001b5be80
fffffc0001b2350c 0000000000300000 fffffc0001b48298 fffffc0001b482c0
Trace:
[<fffffc00011cd148>] bad_page+0x168/0x1b0
[<fffffc00011ce340>] free_pcp_prepare+0x1e0/0x290
[<fffffc00011d1abc>] free_unref_page+0x2c/0xa0
[<fffffc00014ee5f0>] cmp_ex_sort+0x0/0x30
[<fffffc00014ee5f0>] cmp_ex_sort+0x0/0x30
[<fffffc000101001c>] _stext+0x1c/0x20
Fix this by registering the reserved ranges in memblock.memory.
Link: https://lore.kernel.org/lkml/20210726192311.uffqnanxw3ac5wwi@ivybridge
Fixes: fa3354e4ea39 ("mm: free_area_init: use maximal zone PFNs rather than zone sizes")
Reported-by: Matt Turner <mattst88(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
---
arch/alpha/kernel/setup.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
index 7d56c217b235..b4fbbba30aa2 100644
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -319,18 +319,19 @@ setup_memory(void *kernel_end)
i, cluster->usage, cluster->start_pfn,
cluster->start_pfn + cluster->numpages);
- /* Bit 0 is console/PALcode reserved. Bit 1 is
- non-volatile memory -- we might want to mark
- this for later. */
- if (cluster->usage & 3)
- continue;
-
end = cluster->start_pfn + cluster->numpages;
if (end > max_low_pfn)
max_low_pfn = end;
memblock_add(PFN_PHYS(cluster->start_pfn),
cluster->numpages << PAGE_SHIFT);
+
+ /* Bit 0 is console/PALcode reserved. Bit 1 is
+ non-volatile memory -- we might want to mark
+ this for later. */
+ if (cluster->usage & 3)
+ memblock_reserve(PFN_PHYS(cluster->start_pfn),
+ cluster->numpages << PAGE_SHIFT);
}
/*
base-commit: d8079fac168168b25677dc16c00ffaf9fb7df723
--
2.28.0
The patch titled
Subject: hugetlbfs: fix mount mode command line processing
has been removed from the -mm tree. Its filename was
hugetlbfs-fix-mount-mode-command-line-processing.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: fix mount mode command line processing
In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing of
the mount mode string was changed from match_octal() to fsparam_u32. This
changed existing behavior as match_octal does not require octal values to
have a '0' prefix, but fsparam_u32 does.
Use fsparam_u32oct which provides the same behavior as match_octal.
Link: https://lkml.kernel.org/r/20210721183326.102716-1-mike.kravetz@oracle.com
Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Dennis Camera <bugs+kernel.org(a)dtnr.ch>
Reviewed-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-fix-mount-mode-command-line-processing
+++ a/fs/hugetlbfs/inode.c
@@ -77,7 +77,7 @@ enum hugetlb_param {
static const struct fs_parameter_spec hugetlb_fs_parameters[] = {
fsparam_u32 ("gid", Opt_gid),
fsparam_string("min_size", Opt_min_size),
- fsparam_u32 ("mode", Opt_mode),
+ fsparam_u32oct("mode", Opt_mode),
fsparam_string("nr_inodes", Opt_nr_inodes),
fsparam_string("pagesize", Opt_pagesize),
fsparam_string("size", Opt_size),
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlb-simplify-prep_compound_gigantic_page-ref-count-racing-code.patch
hugetlb-drop-ref-count-earlier-after-page-allocation.patch
hugetlb-before-freeing-hugetlb-page-set-dtor-to-appropriate-value.patch
The patch titled
Subject: mm: fix the deadlock in finish_fault()
has been removed from the -mm tree. Its filename was
mm-fix-the-deadlock-in-finish_fault.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Qi Zheng <zhengqi.arch(a)bytedance.com>
Subject: mm: fix the deadlock in finish_fault()
Commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock with writeback") fix
the following ABBA deadlock by pre-allocating the pte page table without
holding the page lock.
lock_page(A)
SetPageWriteback(A)
unlock_page(A)
lock_page(B)
lock_page(B)
pte_alloc_one
shrink_page_list
wait_on_page_writeback(A)
SetPageWriteback(B)
unlock_page(B)
# flush A, B to clear the writeback
Commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") rework the relevant code but ignore this race. This will
cause the deadlock above to appear again, so fix it.
Link: https://lkml.kernel.org/r/20210721074849.57004-1-zhengqi.arch@bytedance.com
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Qi Zheng <zhengqi.arch(a)bytedance.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
--- a/mm/memory.c~mm-fix-the-deadlock-in-finish_fault
+++ a/mm/memory.c
@@ -4026,8 +4026,17 @@ vm_fault_t finish_fault(struct vm_fault
return ret;
}
- if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd)))
+ if (vmf->prealloc_pte) {
+ vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
+ if (likely(pmd_none(*vmf->pmd))) {
+ mm_inc_nr_ptes(vma->vm_mm);
+ pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte);
+ vmf->prealloc_pte = NULL;
+ }
+ spin_unlock(vmf->ptl);
+ } else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) {
return VM_FAULT_OOM;
+ }
}
/* See comment in handle_pte_fault() */
_
Patches currently in -mm which might be from zhengqi.arch(a)bytedance.com are