From: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Subject: mm: sections are not offlined during memory hotremove
Memory hotplug and hotremove operate with per-block granularity. If the
machine has a large amount of memory (more than 64G), a memory block can
span multiple sections. By mistake, during hotremove we set only the
first section of the block to the offline state.
The bug was discovered because a kernel selftest started to fail:
https://lkml.kernel.org/r/20180423011247.GK5563@yexl-desktop
The failure appeared after commit "mm/memory_hotplug: optimize probe
routine", but the bug is older than that commit. The optimization also
added a check that sections are in the proper state during hotplug
operations, which is what exposed the bug.
Link: http://lkml.kernel.org/r/20180427145257.15222-1-pasha.tatashin@oracle.com
Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to have holes")
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Steven Sistare <steven.sistare(a)oracle.com>
Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/sparse.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/sparse.c~mm-sections-are-not-offlined-during-memory-hotremove mm/sparse.c
--- a/mm/sparse.c~mm-sections-are-not-offlined-during-memory-hotremove
+++ a/mm/sparse.c
@@ -629,7 +629,7 @@ void offline_mem_sections(unsigned long
unsigned long pfn;
for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
- unsigned long section_nr = pfn_to_section_nr(start_pfn);
+ unsigned long section_nr = pfn_to_section_nr(pfn);
struct mem_section *ms;
/*
_
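For illustration only, here is a userspace sketch of the loop bug fixed
above. It is not kernel code: PAGES_PER_SECTION and NR_SECTIONS are small
made-up values, and the section_online[] array is a hypothetical stand-in
for the per-section state kept in struct mem_section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy values; the real constants are architecture-specific. */
#define PAGES_PER_SECTION 8UL
#define NR_SECTIONS 16UL

/* One "online" flag per section, standing in for struct mem_section state. */
int section_online[NR_SECTIONS];

unsigned long pfn_to_section_nr(unsigned long pfn)
{
	return pfn / PAGES_PER_SECTION;
}

void online_all(void)
{
	for (size_t i = 0; i < NR_SECTIONS; i++)
		section_online[i] = 1;
}

/* Buggy variant: the section index is always derived from start_pfn. */
void offline_sections_buggy(unsigned long start_pfn, unsigned long end_pfn)
{
	assert(end_pfn <= NR_SECTIONS * PAGES_PER_SECTION);
	for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION)
		section_online[pfn_to_section_nr(start_pfn)] = 0;
}

/* Fixed variant: the section index follows the loop variable. */
void offline_sections_fixed(unsigned long start_pfn, unsigned long end_pfn)
{
	assert(end_pfn <= NR_SECTIONS * PAGES_PER_SECTION);
	for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION)
		section_online[pfn_to_section_nr(pfn)] = 0;
}

size_t count_online(void)
{
	size_t n = 0;

	for (size_t i = 0; i < NR_SECTIONS; i++)
		n += section_online[i] ? 1 : 0;
	return n;
}
```

With a block covering four toy sections, the buggy variant clears only the
first section's flag, while the fixed variant clears all four.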
Quoting Lionel Landwerlin (2018-05-11 18:41:28)
> On 11/05/18 16:51, Chris Wilson wrote:
>
> But I can't even startup a gdm on that machine with drm-tip. So maybe
> there is some much more broken...
>
> Don't leave us in suspense...
>
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890614
>
>
> Not our bug :)
You would not believe how relieved I am that someone else has bugs in
their code.
-Chris
Linus,
Working on some new updates to trace filtering, I noticed that the
regex_match_front() test was updated to be limited to the size
of the pattern instead of the full test string. But as the test string
is not guaranteed to be nul terminated, it still needs to consider
the size of the test string.
Please pull the latest trace-v4.17-rc4 tree, which can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-v4.17-rc4
Tag SHA1: cb82e07247fa5f1a91940a2928fb260256727a14
Head SHA1: dc432c3d7f9bceb3de6f5b44fb9c657c9810ed6d
Steven Rostedt (VMware) (1):
tracing: Fix regex_match_front() to not over compare the test string
----
kernel/trace/trace_events_filter.c | 3 +++
1 file changed, 3 insertions(+)
---------------------------
commit dc432c3d7f9bceb3de6f5b44fb9c657c9810ed6d
Author: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Date: Wed May 9 11:59:32 2018 -0400
tracing: Fix regex_match_front() to not over compare the test string
The regex match function regex_match_front() in the tracing filter logic
was fixed to compare only the length of the pattern instead of the entire
test string. That is, it went from strncmp(str, r->pattern, len) to
strncmp(str, r->pattern, r->len).
The issue is that str is not guaranteed to be nul terminated, and if r->len
is greater than the length of str, it can access more memory than is
allocated.
The solution is to add a simple test: if (len < r->len) return 0.
Cc: stable(a)vger.kernel.org
Fixes: 285caad415f45 ("tracing/filters: Fix MATCH_FRONT_ONLY filter matching")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 1f951b3df60c..7d306b74230f 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -762,6 +762,9 @@ static int regex_match_full(char *str, struct regex *r, int len)
static int regex_match_front(char *str, struct regex *r, int len)
{
+ if (len < r->len)
+ return 0;
+
if (strncmp(str, r->pattern, r->len) == 0)
return 1;
return 0;
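As a rough userspace sketch of the guarded front match above: struct
front_pattern and match_front() are hypothetical, reduced stand-ins for
the kernel's struct regex and regex_match_front(), and only the length
check and the strncmp() call mirror the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the kernel's struct regex. */
struct front_pattern {
	const char *pattern;
	int len;	/* length of pattern, without the trailing NUL */
};

/*
 * str points to a buffer of exactly len bytes that need not be
 * NUL terminated, so no more than len bytes may be read from it.
 */
int match_front(const char *str, const struct front_pattern *r, int len)
{
	assert(len >= 0 && r->len >= 0);

	/* A pattern longer than the buffer can never match its front. */
	if (len < r->len)
		return 0;

	/* Safe now: r->len <= len, so strncmp() stays inside the buffer. */
	return strncmp(str, r->pattern, (size_t)r->len) == 0;
}
```

Without the early return, a three-byte unterminated buffer compared against
a four-byte pattern would let strncmp() read one byte past the buffer.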
The patch below does not apply to the 4.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e538409257d0217a9bc715686100a5328db75a15 Mon Sep 17 00:00:00 2001
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Date: Wed, 4 Apr 2018 22:38:49 +0200
Subject: [PATCH] test_firmware: fix setting old custom fw path back on exit,
second try
Commit 65c79230576 tried to clear the custom firmware path on exit by
writing a single space to the firmware_class.path parameter. This
doesn't work because nothing strips this space from the value stored
and fw_get_filesystem_firmware() only ignores zero-length paths.
Instead, write a null byte.
Fixes: 0a8adf58475 ("test: add firmware_class loader test")
Fixes: 65c79230576 ("test_firmware: fix setting old custom fw path back on exit")
Signed-off-by: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Acked-by: Luis R. Rodriguez <mcgrof(a)kernel.org>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/tools/testing/selftests/firmware/fw_lib.sh b/tools/testing/selftests/firmware/fw_lib.sh
index 9ea31b57d71a..962d7f4ac627 100755
--- a/tools/testing/selftests/firmware/fw_lib.sh
+++ b/tools/testing/selftests/firmware/fw_lib.sh
@@ -154,11 +154,13 @@ test_finish()
if [ "$HAS_FW_LOADER_USER_HELPER" = "yes" ]; then
echo "$OLD_TIMEOUT" >/sys/class/firmware/timeout
fi
- if [ "$OLD_FWPATH" = "" ]; then
- OLD_FWPATH=" "
- fi
if [ "$TEST_REQS_FW_SET_CUSTOM_PATH" = "yes" ]; then
- echo -n "$OLD_FWPATH" >/sys/module/firmware_class/parameters/path
+ if [ "$OLD_FWPATH" = "" ]; then
+ # A zero-length write won't work; write a null byte
+ printf '\000' >/sys/module/firmware_class/parameters/path
+ else
+ echo -n "$OLD_FWPATH" >/sys/module/firmware_class/parameters/path
+ fi
fi
if [ -f $FW ]; then
rm -f "$FW"
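To illustrate why a space does not clear the path but a null byte does,
here is a toy C model. It is only a sketch under assumptions: store_path()
and path_is_ignored() are hypothetical names, and the model assumes the
stored value is kept as a C string and that only a zero-length value is
skipped, as the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PATH_MAX_LEN 256

/* Hypothetical stand-in for the stored firmware_class.path value. */
static char custom_path[PATH_MAX_LEN];

/* Toy parameter store: keep the written bytes up to the first NUL. */
void store_path(const char *buf, size_t count)
{
	const char *nul = memchr(buf, '\0', count);
	size_t n = nul ? (size_t)(nul - buf) : count;

	assert(n < PATH_MAX_LEN);
	memcpy(custom_path, buf, n);
	custom_path[n] = '\0';
}

/* Nothing strips whitespace: only a zero-length path is skipped. */
int path_is_ignored(void)
{
	return custom_path[0] == '\0';
}
```

In this model, storing " " leaves a one-character path that is still
searched, whereas storing a single null byte leaves an empty string.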
kvm_read_guest() eventually looks up kvm_memslots(), which requires
either holding the kvm->slots_lock or being inside a kvm->srcu critical
section.
In contrast to x86 and s390 we don't take the SRCU lock on every guest
exit, so we have to do it individually for each kvm_read_guest() call.
Use the newly introduced wrapper for that.
Cc: Stable <stable(a)vger.kernel.org> # 4.12+
Reported-by: Jan Glauber <jan.glauber(a)caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara(a)arm.com>
---
virt/kvm/arm/vgic/vgic-its.c | 4 ++--
virt/kvm/arm/vgic/vgic-v3.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 7cb060e01a76..4ed79c939fb4 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1897,7 +1897,7 @@ static int scan_its_table(struct vgic_its *its, gpa_t base, int size, int esz,
int next_offset;
size_t byte_offset;
- ret = kvm_read_guest(kvm, gpa, entry, esz);
+ ret = kvm_read_guest_lock(kvm, gpa, entry, esz);
if (ret)
return ret;
@@ -2267,7 +2267,7 @@ static int vgic_its_restore_cte(struct vgic_its *its, gpa_t gpa, int esz)
int ret;
BUG_ON(esz > sizeof(val));
- ret = kvm_read_guest(kvm, gpa, &val, esz);
+ ret = kvm_read_guest_lock(kvm, gpa, &val, esz);
if (ret)
return ret;
val = le64_to_cpu(val);
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index c7423f3768e5..bdcf8e7a6161 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -344,7 +344,7 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
bit_nr = irq->intid % BITS_PER_BYTE;
ptr = pendbase + byte_offset;
- ret = kvm_read_guest(kvm, ptr, &val, 1);
+ ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
if (ret)
return ret;
@@ -397,7 +397,7 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
ptr = pendbase + byte_offset;
if (byte_offset != last_byte_offset) {
- ret = kvm_read_guest(kvm, ptr, &val, 1);
+ ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
if (ret)
return ret;
last_byte_offset = byte_offset;
--
2.14.1
kvm_read_guest() eventually looks up kvm_memslots(), which requires
either holding the kvm->slots_lock or being inside a kvm->srcu critical
section.
In contrast to x86 and s390 we don't take the SRCU lock on every guest
exit, so we have to do it individually for each kvm_read_guest() call.
Provide a wrapper which does that and use that everywhere.
Note that ending the SRCU critical section before returning from the
kvm_read_guest() wrapper is safe, because the data has been *copied*, so
we don't need to rely on valid references to the memslot anymore.
Cc: Stable <stable(a)vger.kernel.org> # 4.8+
Reported-by: Jan Glauber <jan.glauber(a)caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara(a)arm.com>
---
arch/arm/include/asm/kvm_mmu.h | 16 ++++++++++++++++
arch/arm64/include/asm/kvm_mmu.h | 16 ++++++++++++++++
virt/kvm/arm/vgic/vgic-its.c | 15 ++++++++-------
3 files changed, 40 insertions(+), 7 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 707a1f06dc5d..f675162663f0 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -309,6 +309,22 @@ static inline unsigned int kvm_get_vmid_bits(void)
return 8;
}
+/*
+ * We are not in the kvm->srcu critical section most of the time, so we take
+ * the SRCU read lock here. Since we copy the data from the user page, we
+ * can immediately drop the lock again.
+ */
+static inline int kvm_read_guest_lock(struct kvm *kvm,
+ gpa_t gpa, void *data, unsigned long len)
+{
+ int srcu_idx = srcu_read_lock(&kvm->srcu);
+ int ret = kvm_read_guest(kvm, gpa, data, len);
+
+ srcu_read_unlock(&kvm->srcu, srcu_idx);
+
+ return ret;
+}
+
static inline void *kvm_get_hyp_vector(void)
{
return kvm_ksym_ref(__kvm_hyp_vector);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 082110993647..6128992c2ded 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -360,6 +360,22 @@ static inline unsigned int kvm_get_vmid_bits(void)
return (cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
}
+/*
+ * We are not in the kvm->srcu critical section most of the time, so we take
+ * the SRCU read lock here. Since we copy the data from the user page, we
+ * can immediately drop the lock again.
+ */
+static inline int kvm_read_guest_lock(struct kvm *kvm,
+ gpa_t gpa, void *data, unsigned long len)
+{
+ int srcu_idx = srcu_read_lock(&kvm->srcu);
+ int ret = kvm_read_guest(kvm, gpa, data, len);
+
+ srcu_read_unlock(&kvm->srcu, srcu_idx);
+
+ return ret;
+}
+
#ifdef CONFIG_KVM_INDIRECT_VECTORS
/*
* EL2 vectors can be mapped and rerouted in a number of ways,
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 51a80b600632..7cb060e01a76 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -281,8 +281,8 @@ static int update_lpi_config(struct kvm *kvm, struct vgic_irq *irq,
int ret;
unsigned long flags;
- ret = kvm_read_guest(kvm, propbase + irq->intid - GIC_LPI_OFFSET,
- &prop, 1);
+ ret = kvm_read_guest_lock(kvm, propbase + irq->intid - GIC_LPI_OFFSET,
+ &prop, 1);
if (ret)
return ret;
@@ -444,8 +444,9 @@ static int its_sync_lpi_pending_table(struct kvm_vcpu *vcpu)
* this very same byte in the last iteration. Reuse that.
*/
if (byte_offset != last_byte_offset) {
- ret = kvm_read_guest(vcpu->kvm, pendbase + byte_offset,
- &pendmask, 1);
+ ret = kvm_read_guest_lock(vcpu->kvm,
+ pendbase + byte_offset,
+ &pendmask, 1);
if (ret) {
kfree(intids);
return ret;
@@ -789,7 +790,7 @@ static bool vgic_its_check_id(struct vgic_its *its, u64 baser, u32 id,
return false;
/* Each 1st level entry is represented by a 64-bit value. */
- if (kvm_read_guest(its->dev->kvm,
+ if (kvm_read_guest_lock(its->dev->kvm,
BASER_ADDRESS(baser) + index * sizeof(indirect_ptr),
&indirect_ptr, sizeof(indirect_ptr)))
return false;
@@ -1370,8 +1371,8 @@ static void vgic_its_process_commands(struct kvm *kvm, struct vgic_its *its)
cbaser = CBASER_ADDRESS(its->cbaser);
while (its->cwriter != its->creadr) {
- int ret = kvm_read_guest(kvm, cbaser + its->creadr,
- cmd_buf, ITS_CMD_SIZE);
+ int ret = kvm_read_guest_lock(kvm, cbaser + its->creadr,
+ cmd_buf, ITS_CMD_SIZE);
/*
* If kvm_read_guest() fails, this could be due to the guest
* programming a bogus value in CBASER or something else going
--
2.14.1
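The lock, copy, unlock pattern of the wrapper can be modelled in plain
userspace C. This is only a sketch: struct toy_vm and the toy_* functions
are hypothetical, and a simple reader counter stands in for SRCU, which
it does not implement.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a VM: a read-side counter plus guest memory. */
struct toy_vm {
	int readers;		/* > 0 while inside a read-side section */
	unsigned char mem[64];	/* only valid to read while readers > 0 */
};

int toy_read_lock(struct toy_vm *vm)
{
	vm->readers++;
	return 0;	/* token passed back to unlock, like the SRCU index */
}

void toy_read_unlock(struct toy_vm *vm, int idx)
{
	(void)idx;
	vm->readers--;
}

/* Unlocked read: the caller must already be inside a read-side section. */
int toy_read_guest(struct toy_vm *vm, size_t gpa, void *data, size_t len)
{
	assert(vm->readers > 0);
	if (gpa > sizeof(vm->mem) || len > sizeof(vm->mem) - gpa)
		return -1;
	memcpy(data, vm->mem + gpa, len);
	return 0;
}

/* Wrapper: enter the section, copy the data out, leave the section. */
int toy_read_guest_lock(struct toy_vm *vm, size_t gpa, void *data, size_t len)
{
	int idx = toy_read_lock(vm);
	int ret = toy_read_guest(vm, gpa, data, len);

	toy_read_unlock(vm, idx);
	return ret;
}
```

Because the caller only ever sees its own copy in data, dropping the
read-side section before returning does not leave it holding a reference
into the protected memory, on both the success and the error path.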
Apparently the development of update_affinity() overlapped with the
promotion of irq_lock to be _irqsave, so the patch didn't convert this
lock over. This will make lockdep complain.
Fix this by disabling IRQs around the lock.
Cc: stable(a)vger.kernel.org
Fixes: 08c9fd042117 ("KVM: arm/arm64: vITS: Add a helper to update the affinity of an LPI")
Reported-by: Jan Glauber <jan.glauber(a)caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara(a)arm.com>
---
virt/kvm/arm/vgic/vgic-its.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 41abf92f2699..51a80b600632 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -350,10 +350,11 @@ static int vgic_copy_lpi_list(struct kvm_vcpu *vcpu, u32 **intid_ptr)
static int update_affinity(struct vgic_irq *irq, struct kvm_vcpu *vcpu)
{
int ret = 0;
+ unsigned long flags;
- spin_lock(&irq->irq_lock);
+ spin_lock_irqsave(&irq->irq_lock, flags);
irq->target_vcpu = vcpu;
- spin_unlock(&irq->irq_lock);
+ spin_unlock_irqrestore(&irq->irq_lock, flags);
if (irq->hw) {
struct its_vlpi_map map;
--
2.14.1
As Jan reported [1], lockdep complains about the VGIC not being
bulletproof. This seems to be due to two issues:
- When commit 006df0f34930 ("KVM: arm/arm64: Support calling
vgic_update_irq_pending from irq context") promoted irq_lock and
ap_list_lock to _irqsave, we forgot two instances of irq_lock.
lockdep seems to pick those up.
- If a lock is _irqsave, any other locks we take inside it should be
_irqsave as well. So the lpi_list_lock needs to be promoted also.
This fixes both issues by simply making the remaining instances of those
locks _irqsave.
One irq_lock is addressed in a separate patch, to simplify backporting.
Cc: stable(a)vger.kernel.org
Fixes: 006df0f34930 ("KVM: arm/arm64: Support calling vgic_update_irq_pending from irq context")
Reported-by: Jan Glauber <jan.glauber(a)caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara(a)arm.com>
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/575718.html
---
virt/kvm/arm/vgic/vgic-debug.c | 5 +++--
virt/kvm/arm/vgic/vgic-its.c | 10 ++++++----
virt/kvm/arm/vgic/vgic.c | 22 ++++++++++++++--------
3 files changed, 23 insertions(+), 14 deletions(-)
diff --git a/virt/kvm/arm/vgic/vgic-debug.c b/virt/kvm/arm/vgic/vgic-debug.c
index 10b38178cff2..4ffc0b5e6105 100644
--- a/virt/kvm/arm/vgic/vgic-debug.c
+++ b/virt/kvm/arm/vgic/vgic-debug.c
@@ -211,6 +211,7 @@ static int vgic_debug_show(struct seq_file *s, void *v)
struct vgic_state_iter *iter = (struct vgic_state_iter *)v;
struct vgic_irq *irq;
struct kvm_vcpu *vcpu = NULL;
+ unsigned long flags;
if (iter->dist_id == 0) {
print_dist_state(s, &kvm->arch.vgic);
@@ -227,9 +228,9 @@ static int vgic_debug_show(struct seq_file *s, void *v)
irq = &kvm->arch.vgic.spis[iter->intid - VGIC_NR_PRIVATE_IRQS];
}
- spin_lock(&irq->irq_lock);
+ spin_lock_irqsave(&irq->irq_lock, flags);
print_irq_state(s, irq, vcpu);
- spin_unlock(&irq->irq_lock);
+ spin_unlock_irqrestore(&irq->irq_lock, flags);
return 0;
}
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index a8f07243aa9f..41abf92f2699 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -52,6 +52,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
{
struct vgic_dist *dist = &kvm->arch.vgic;
struct vgic_irq *irq = vgic_get_irq(kvm, NULL, intid), *oldirq;
+ unsigned long flags;
int ret;
/* In this case there is no put, since we keep the reference. */
@@ -71,7 +72,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
irq->intid = intid;
irq->target_vcpu = vcpu;
- spin_lock(&dist->lpi_list_lock);
+ spin_lock_irqsave(&dist->lpi_list_lock, flags);
/*
* There could be a race with another vgic_add_lpi(), so we need to
@@ -99,7 +100,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
dist->lpi_list_count++;
out_unlock:
- spin_unlock(&dist->lpi_list_lock);
+ spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
/*
* We "cache" the configuration table entries in our struct vgic_irq's.
@@ -315,6 +316,7 @@ static int vgic_copy_lpi_list(struct kvm_vcpu *vcpu, u32 **intid_ptr)
{
struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
struct vgic_irq *irq;
+ unsigned long flags;
u32 *intids;
int irq_count, i = 0;
@@ -330,7 +332,7 @@ static int vgic_copy_lpi_list(struct kvm_vcpu *vcpu, u32 **intid_ptr)
if (!intids)
return -ENOMEM;
- spin_lock(&dist->lpi_list_lock);
+ spin_lock_irqsave(&dist->lpi_list_lock, flags);
list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
if (i == irq_count)
break;
@@ -339,7 +341,7 @@ static int vgic_copy_lpi_list(struct kvm_vcpu *vcpu, u32 **intid_ptr)
continue;
intids[i++] = irq->intid;
}
- spin_unlock(&dist->lpi_list_lock);
+ spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
*intid_ptr = intids;
return i;
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 27219313a406..1dfb5b2f1b12 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -43,9 +43,13 @@ struct vgic_global kvm_vgic_global_state __ro_after_init = {
* kvm->lock (mutex)
* its->cmd_lock (mutex)
* its->its_lock (mutex)
- * vgic_cpu->ap_list_lock
- * kvm->lpi_list_lock
- * vgic_irq->irq_lock
+ * vgic_cpu->ap_list_lock must be taken with IRQs disabled
+ * kvm->lpi_list_lock must be taken with IRQs disabled
+ * vgic_irq->irq_lock must be taken with IRQs disabled
+ *
+ * As the ap_list_lock might be taken from the timer interrupt handler,
+ * we have to disable IRQs before taking this lock and everything lower
+ * than it.
*
* If you need to take multiple locks, always take the upper lock first,
* then the lower ones, e.g. first take the its_lock, then the irq_lock.
@@ -72,8 +76,9 @@ static struct vgic_irq *vgic_get_lpi(struct kvm *kvm, u32 intid)
{
struct vgic_dist *dist = &kvm->arch.vgic;
struct vgic_irq *irq = NULL;
+ unsigned long flags;
- spin_lock(&dist->lpi_list_lock);
+ spin_lock_irqsave(&dist->lpi_list_lock, flags);
list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
if (irq->intid != intid)
@@ -89,7 +94,7 @@ static struct vgic_irq *vgic_get_lpi(struct kvm *kvm, u32 intid)
irq = NULL;
out_unlock:
- spin_unlock(&dist->lpi_list_lock);
+ spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
return irq;
}
@@ -134,19 +139,20 @@ static void vgic_irq_release(struct kref *ref)
void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
{
struct vgic_dist *dist = &kvm->arch.vgic;
+ unsigned long flags;
if (irq->intid < VGIC_MIN_LPI)
return;
- spin_lock(&dist->lpi_list_lock);
+ spin_lock_irqsave(&dist->lpi_list_lock, flags);
if (!kref_put(&irq->refcount, vgic_irq_release)) {
- spin_unlock(&dist->lpi_list_lock);
+ spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
return;
};
list_del(&irq->lpi_list);
dist->lpi_list_count--;
- spin_unlock(&dist->lpi_list_lock);
+ spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
kfree(irq);
}
--
2.14.1
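The save/restore discipline behind the _irqsave variants can be modelled
in userspace as follows. This is a sketch only: the toy_* names are
hypothetical, a single boolean stands in for one CPU's interrupt-enable
state, and nothing here models lockdep or real spinlocks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt-enable state. */
static bool irqs_enabled = true;

struct toy_lock {
	bool held;
};

bool toy_irqs_enabled(void)
{
	return irqs_enabled;
}

/* Remember the previous IRQ state, disable IRQs, then take the lock. */
unsigned long toy_lock_irqsave(struct toy_lock *l)
{
	unsigned long flags = irqs_enabled ? 1UL : 0UL;

	irqs_enabled = false;
	assert(!l->held);
	l->held = true;
	return flags;
}

/* Drop the lock, then put the IRQ state back to what the caller saw. */
void toy_unlock_irqrestore(struct toy_lock *l, unsigned long flags)
{
	assert(l->held);
	l->held = false;
	irqs_enabled = (flags != 0);
}
```

When two such locks are nested, releasing the inner one restores the
"disabled" state it saved, so interrupts stay off until the outer lock is
released as well.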
commit ece1397cbc89c51914fae1aec729539cfd8bd62b upstream
Some variants of the Arm Cortex-A55 cores (r0p0, r0p1, r1p0) suffer
from erratum 1024718, which causes incorrect updates when the DBM/AP
bits in a page table entry are modified without a break-before-make
sequence. The workaround is to disable the hardware DBM feature
on the affected cores. The hardware Access Flag management feature
is not affected.
The hardware DBM feature is a non-conflicting capability, i.e., the
kernel can handle cores that use the feature running at the same time
as cores that do not. So this workaround is detected
at early boot time, rather than delaying it until the CPUs are brought
up into the kernel with the MMU turned on. This also avoids other
complexities with late CPUs coming online, with or without the hardware
DBM feature.
Cc: stable(a)vger.kernel.org # v4.14
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Will Deacon <will.deacon(a)arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
---
Note: The upstream commit is on top of a reworked capability
infrastructure for arm64 heterogeneous systems, which allows
delaying the CPU model checks. This backport is based on the
original version of the patch [0], which checks the affected
CPU models during the early boot.
[0] https://lkml.kernel.org/r/20180116102323.3470-1-suzuki.poulose@arm.com
---
Documentation/arm64/silicon-errata.txt | 1 +
arch/arm64/Kconfig | 14 ++++++++++++
arch/arm64/include/asm/assembler.h | 40 ++++++++++++++++++++++++++++++++++
arch/arm64/include/asm/cputype.h | 2 ++
arch/arm64/mm/proc.S | 5 +++++
5 files changed, 62 insertions(+)
diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
index f3d0d31..e4fe6ad 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -55,6 +55,7 @@ stable kernels.
| ARM | Cortex-A57 | #834220 | ARM64_ERRATUM_834220 |
| ARM | Cortex-A72 | #853709 | N/A |
| ARM | Cortex-A73 | #858921 | ARM64_ERRATUM_858921 |
+| ARM | Cortex-A55 | #1024718 | ARM64_ERRATUM_1024718 |
| ARM | MMU-500 | #841119,#826419 | N/A |
| | | | |
| Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c2abb4e..2d5f7ac 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -443,6 +443,20 @@ config ARM64_ERRATUM_843419
If unsure, say Y.
+config ARM64_ERRATUM_1024718
+ bool "Cortex-A55: 1024718: Update of DBM/AP bits without break before make might result in incorrect update"
+ default y
+ help
+ This option adds work around for Arm Cortex-A55 Erratum 1024718.
+
+ Affected Cortex-A55 cores (r0p0, r0p1, r1p0) could cause incorrect
+ update of the hardware dirty bit when the DBM/AP bits are updated
+ without a break-before-make. The work around is to disable the usage
+ of hardware DBM locally on the affected cores. CPUs not affected by
+ erratum will continue to use the feature.
+
+ If unsure, say Y.
+
config CAVIUM_ERRATUM_22375
bool "Cavium erratum 22375, 24313"
default y
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 463619d..25b2a41 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -25,6 +25,7 @@
#include <asm/asm-offsets.h>
#include <asm/cpufeature.h>
+#include <asm/cputype.h>
#include <asm/page.h>
#include <asm/pgtable-hwdef.h>
#include <asm/ptrace.h>
@@ -495,4 +496,43 @@ alternative_endif
and \phys, \pte, #(((1 << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
.endm
+/*
+ * Check the MIDR_EL1 of the current CPU for a given model and a range of
+ * variant/revision. See asm/cputype.h for the macros used below.
+ *
+ * model: MIDR_CPU_MODEL of CPU
+ * rv_min: Minimum of MIDR_CPU_VAR_REV()
+ * rv_max: Maximum of MIDR_CPU_VAR_REV()
+ * res: Result register.
+ * tmp1, tmp2, tmp3: Temporary registers
+ *
+ * Corrupts: res, tmp1, tmp2, tmp3
+ * Returns: 0, if the CPU id doesn't match. Non-zero otherwise
+ */
+ .macro cpu_midr_match model, rv_min, rv_max, res, tmp1, tmp2, tmp3
+ mrs \res, midr_el1
+ mov_q \tmp1, (MIDR_REVISION_MASK | MIDR_VARIANT_MASK)
+ mov_q \tmp2, MIDR_CPU_MODEL_MASK
+ and \tmp3, \res, \tmp2 // Extract model
+ and \tmp1, \res, \tmp1 // rev & variant
+ mov_q \tmp2, \model
+ cmp \tmp3, \tmp2
+ cset \res, eq
+ cbz \res, .Ldone\@ // Model matches ?
+
+ .if (\rv_min != 0) // Skip min check if rv_min == 0
+ mov_q \tmp3, \rv_min
+ cmp \tmp1, \tmp3
+ cset \res, ge
+ .endif // \rv_min != 0
+ /* Skip rv_max check if rv_min == rv_max && rv_min != 0 */
+ .if ((\rv_min != \rv_max) || \rv_min == 0)
+ mov_q \tmp2, \rv_max
+ cmp \tmp1, \tmp2
+ cset \tmp2, le
+ and \res, \res, \tmp2
+ .endif
+.Ldone\@:
+ .endm
+
#endif /* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index be7bd19..30da091 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -78,6 +78,7 @@
#define ARM_CPU_PART_AEM_V8 0xD0F
#define ARM_CPU_PART_FOUNDATION 0xD00
+#define ARM_CPU_PART_CORTEX_A55 0xD05
#define ARM_CPU_PART_CORTEX_A57 0xD07
#define ARM_CPU_PART_CORTEX_A72 0xD08
#define ARM_CPU_PART_CORTEX_A53 0xD03
@@ -98,6 +99,7 @@
#define QCOM_CPU_PART_KRYO 0x200
#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
+#define MIDR_CORTEX_A55 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A55)
#define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
#define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
#define MIDR_CORTEX_A73 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A73)
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 139320a..e338165 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -438,6 +438,11 @@ ENTRY(__cpu_setup)
cbz x9, 2f
cmp x9, #2
b.lt 1f
+#ifdef CONFIG_ARM64_ERRATUM_1024718
+ /* Disable hardware DBM on Cortex-A55 r0p0, r0p1 & r1p0 */
+ cpu_midr_match MIDR_CORTEX_A55, MIDR_CPU_VAR_REV(0, 0), MIDR_CPU_VAR_REV(1, 0), x1, x2, x3, x4
+ cbnz x1, 1f
+#endif
orr x10, x10, #TCR_HD // hardware Dirty flag update
1: orr x10, x10, #TCR_HA // hardware Access flag update
2:
--
2.7.4
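For readers who find the assembler macro hard to follow, the same check
can be sketched in C. This is an illustrative equivalent, not code from
the patch: midr_match() and a55_has_erratum_1024718() are hypothetical
names, and the MIDR field masks are restated here on the assumption that
they match the definitions in asm/cputype.h.

```c
#include <assert.h>
#include <stdint.h>

/* MIDR_EL1 field layout, restated to keep the sketch self-contained. */
#define MIDR_REVISION_MASK	0xfu
#define MIDR_PARTNUM_SHIFT	4
#define MIDR_PARTNUM_MASK	(0xfffu << MIDR_PARTNUM_SHIFT)
#define MIDR_ARCHITECTURE_SHIFT	16
#define MIDR_ARCHITECTURE_MASK	(0xfu << MIDR_ARCHITECTURE_SHIFT)
#define MIDR_VARIANT_SHIFT	20
#define MIDR_VARIANT_MASK	(0xfu << MIDR_VARIANT_SHIFT)
#define MIDR_IMPLEMENTOR_SHIFT	24
#define MIDR_IMPLEMENTOR_MASK	(0xffu << MIDR_IMPLEMENTOR_SHIFT)

#define MIDR_CPU_MODEL_MASK \
	(MIDR_IMPLEMENTOR_MASK | MIDR_PARTNUM_MASK | MIDR_ARCHITECTURE_MASK)
#define MIDR_CPU_MODEL(imp, partnum) \
	(((uint32_t)(imp) << MIDR_IMPLEMENTOR_SHIFT) | \
	 (0xfu << MIDR_ARCHITECTURE_SHIFT) | \
	 ((uint32_t)(partnum) << MIDR_PARTNUM_SHIFT))
#define MIDR_CPU_VAR_REV(var, rev) \
	(((uint32_t)(var) << MIDR_VARIANT_SHIFT) | (uint32_t)(rev))

#define ARM_CPU_IMP_ARM		0x41
#define ARM_CPU_PART_CORTEX_A55	0xD05
#define MIDR_CORTEX_A55 \
	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A55)

/* Non-zero if midr is the given model with variant/revision in range. */
int midr_match(uint32_t midr, uint32_t model, uint32_t rv_min, uint32_t rv_max)
{
	uint32_t rv = midr & (MIDR_REVISION_MASK | MIDR_VARIANT_MASK);

	assert(rv_min <= rv_max);
	if ((midr & MIDR_CPU_MODEL_MASK) != model)
		return 0;
	return rv >= rv_min && rv <= rv_max;
}

/* Same range as the __cpu_setup hunk: Cortex-A55 r0p0 through r1p0. */
int a55_has_erratum_1024718(uint32_t midr)
{
	return midr_match(midr, MIDR_CORTEX_A55,
			  MIDR_CPU_VAR_REV(0, 0), MIDR_CPU_VAR_REV(1, 0));
}
```

Note that, like the macro, this is a range check on the combined
variant/revision value, so any revision between the two bounds matches,
not just the three listed in the commit message.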