Since commit 1a50d9403fb9 ("treewide: Fix probing of devices in DT
overlays"), when using device-tree overlays, the FWNODE_FLAG_NOT_DEVICE
is set on each overlay nodes. This flag is cleared when a struct device
is actually created for the DT node.
Also, when a device is created, the device DT node is parsed for known
phandle and devlinks consumer/supplier links are created between the
device (consumer) and the devices referenced by phandles (suppliers).
As these supplier device can have a struct device not already created,
the FWNODE_FLAG_NOT_DEVICE can be set for suppliers and leads the
devlink supplier point to the device's parent instead of the device
itself.
Avoid this situation clearing the supplier FWNODE_FLAG_NOT_DEVICE just
before the devlink creation if a device is supposed to be created and
handled later in the process.
Fixes: 1a50d9403fb9 ("treewide: Fix probing of devices in DT overlays")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
---
drivers/of/property.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/of/property.c b/drivers/of/property.c
index 641a40cf5cf3..ff5cac477dbe 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1097,6 +1097,7 @@ static void of_link_to_phandle(struct device_node *con_np,
struct device_node *sup_np)
{
struct device_node *tmp_np = of_node_get(sup_np);
+ struct fwnode_handle *sup_fwnode;
/* Check that sup_np and its ancestors are available. */
while (tmp_np) {
@@ -1113,7 +1114,20 @@ static void of_link_to_phandle(struct device_node *con_np,
tmp_np = of_get_next_parent(tmp_np);
}
- fwnode_link_add(of_fwnode_handle(con_np), of_fwnode_handle(sup_np));
+ /*
+ * In case of overlays, the fwnode are added with FWNODE_FLAG_NOT_DEVICE
+ * flag set. A node can have a phandle that references an other node
+ * added by the overlay.
+ * Clear the supplier's FWNODE_FLAG_NOT_DEVICE so that fw_devlink links
+ * to this supplier instead of linking to its parent.
+ */
+ sup_fwnode = of_fwnode_handle(sup_np);
+ if (sup_fwnode->flags & FWNODE_FLAG_NOT_DEVICE) {
+ if (of_property_present(sup_np, "compatible") &&
+ of_device_is_available(sup_np))
+ sup_fwnode->flags &= ~FWNODE_FLAG_NOT_DEVICE;
+ }
+ fwnode_link_add(of_fwnode_handle(con_np), sup_fwnode);
}
/**
--
2.43.0
The patchset fixes some warnings reported by the kernel during boot.
The cache size info is from Processor_Datasheet_v2XX.pdf [1], Section
2.2.1 Master Processor.
The cache line size and the set-associative info are from Cortex-A53
Documentation [2].
From the doc, it can be concluded that L1 i-cache is 4-way assoc, L1
d-cache is 2-way assoc and L2 cache is 16-way assoc. Calculate the dts
props accordingly.
Also, to use KVM's VGIC code, GICH, GICV registers spaces and maintenance
IRQ are added to the dts with verification.
[1]: https://github.com/96boards/documentation/blob/master/enterprise/poplar/har…
[2]: https://developer.arm.com/documentation/ddi0500/j/Level-1-Memory-System
Signed-off-by: Yang Xiwen <forbidden405(a)outlook.com>
---
Changes in v3:
- send patches to stable (Andrew Lunn)
- rewrite the commit logs more formally (Andrew Lunn)
- rename l2-cache0 to l2-cache (Krzysztof Kozlowski)
- Link to v2: https://lore.kernel.org/r/20240218-cache-v2-0-1fd919e2bd3e@outlook.com
Changes in v2:
- arm64: dts: hi3798cv200: add GICH, GICV register spces and
maintainance IRQ.
- Link to v1: https://lore.kernel.org/r/20240218-cache-v1-0-2c0a8a4472e7@outlook.com
---
Yang Xiwen (3):
arm64: dts: hi3798cv200: fix the size of GICR
arm64: dts: hi3798cv200: add GICH, GICV register space and irq
arm64: dts: hi3798cv200: add cache info
arch/arm64/boot/dts/hisilicon/hi3798cv200.dtsi | 43 +++++++++++++++++++++++++-
1 file changed, 42 insertions(+), 1 deletion(-)
---
base-commit: 8d3dea210042f54b952b481838c1e7dfc4ec751d
change-id: 20240218-cache-11c8bf7566c2
Best regards,
--
Yang Xiwen <forbidden405(a)outlook.com>
'nr' member of struct spmi_controller, which serves as an identifier
for the controller/bus. This value is a dynamic ID assigned in
spmi_controller_alloc, and overriding it from the driver results in an
ida_free error "ida_free called for id=xx which is not allocated".
Signed-off-by: Vamshi Gajjela <vamshigajjela(a)google.com>
Fixes: 70f59c90c819 ("staging: spmi: add Hikey 970 SPMI controller driver")
Cc: stable(a)vger.kernel.org
---
drivers/spmi/hisi-spmi-controller.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/spmi/hisi-spmi-controller.c b/drivers/spmi/hisi-spmi-controller.c
index 674a350cc676..fa068b34b040 100644
--- a/drivers/spmi/hisi-spmi-controller.c
+++ b/drivers/spmi/hisi-spmi-controller.c
@@ -300,7 +300,6 @@ static int spmi_controller_probe(struct platform_device *pdev)
spin_lock_init(&spmi_controller->lock);
- ctrl->nr = spmi_controller->channel;
ctrl->dev.parent = pdev->dev.parent;
ctrl->dev.of_node = of_node_get(pdev->dev.of_node);
--
2.44.0.rc1.240.g4c46232300-goog
There are few uses of CoCo that don't rely on working cryptography and
hence a working RNG. Unfortunately, the CoCo threat model means that the
VM host cannot be trusted and may actively work against guests to
extract secrets or manipulate computation. Since a malicious host can
modify or observe nearly all inputs to guests, the only remaining source
of entropy for CoCo guests is RDRAND.
If RDRAND is broken -- due to CPU hardware fault -- the RNG as a whole
is meant to gracefully continue on gathering entropy from other sources,
but since there aren't other sources on CoCo, this is catastrophic.
This is mostly a concern at boot time when initially seeding the RNG, as
after that the consequences of a broken RDRAND are much more
theoretical.
So, try at boot to seed the RNG using 256 bits of RDRAND output. If this
fails, panic(). This will also trigger if the system is booted without
RDRAND, as RDRAND is essential for a safe CoCo boot.
This patch is deliberately written to be "just a CoCo x86 driver
feature" and not part of the RNG itself. Many device drivers and
platforms have some desire to contribute something to the RNG, and
add_device_randomness() is specifically meant for this purpose. Any
driver can call this with seed data of any quality, or even garbage
quality, and it can only possibly make the quality of the RNG better or
have no effect, but can never make it worse. Rather than trying to
build something into the core of the RNG, this patch interprets the
particular CoCo issue as just a CoCo issue, and therefore separates this
all out into driver (well, arch/platform) code.
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Daniel P. Berrangé <berrange(a)redhat.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Reviewed-by: Elena Reshetova <elena.reshetova(a)intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reviewed-by: Theodore Ts'o <tytso(a)mit.edu>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
Changes v3->v4:
- Add stable@ tag and reviewed-by lines.
- Add comment for Dave explaining where the "32" comes from.
arch/x86/coco/core.c | 40 +++++++++++++++++++++++++++++++++++++
arch/x86/include/asm/coco.h | 2 ++
arch/x86/kernel/setup.c | 2 ++
3 files changed, 44 insertions(+)
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index eeec9986570e..0e988bff4aec 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -3,13 +3,16 @@
* Confidential Computing Platform Capability checks
*
* Copyright (C) 2021 Advanced Micro Devices, Inc.
+ * Copyright (C) 2024 Jason A. Donenfeld <Jason(a)zx2c4.com>. All Rights Reserved.
*
* Author: Tom Lendacky <thomas.lendacky(a)amd.com>
*/
#include <linux/export.h>
#include <linux/cc_platform.h>
+#include <linux/random.h>
+#include <asm/archrandom.h>
#include <asm/coco.h>
#include <asm/processor.h>
@@ -153,3 +156,40 @@ __init void cc_set_mask(u64 mask)
{
cc_mask = mask;
}
+
+__init void cc_random_init(void)
+{
+ /*
+ * The seed is 32 bytes (in units of longs), which is 256 bits, which
+ * is the security level that the RNG is targeting.
+ */
+ unsigned long rng_seed[32 / sizeof(long)];
+ size_t i, longs;
+
+ if (cc_vendor == CC_VENDOR_NONE)
+ return;
+
+ /*
+ * Since the CoCo threat model includes the host, the only reliable
+ * source of entropy that can be neither observed nor manipulated is
+ * RDRAND. Usually, RDRAND failure is considered tolerable, but since
+ * CoCo guests have no other unobservable source of entropy, it's
+ * important to at least ensure the RNG gets some initial random seeds.
+ */
+ for (i = 0; i < ARRAY_SIZE(rng_seed); i += longs) {
+ longs = arch_get_random_longs(&rng_seed[i], ARRAY_SIZE(rng_seed) - i);
+
+ /*
+ * A zero return value means that the guest doesn't have RDRAND
+ * or the CPU is physically broken, and in both cases that
+ * means most crypto inside of the CoCo instance will be
+ * broken, defeating the purpose of CoCo in the first place. So
+ * just panic here because it's absolutely unsafe to continue
+ * executing.
+ */
+ if (longs == 0)
+ panic("RDRAND is defective.");
+ }
+ add_device_randomness(rng_seed, sizeof(rng_seed));
+ memzero_explicit(rng_seed, sizeof(rng_seed));
+}
diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
index 76c310b19b11..e9d059449885 100644
--- a/arch/x86/include/asm/coco.h
+++ b/arch/x86/include/asm/coco.h
@@ -15,6 +15,7 @@ extern enum cc_vendor cc_vendor;
void cc_set_mask(u64 mask);
u64 cc_mkenc(u64 val);
u64 cc_mkdec(u64 val);
+void cc_random_init(void);
#else
#define cc_vendor (CC_VENDOR_NONE)
@@ -27,6 +28,7 @@ static inline u64 cc_mkdec(u64 val)
{
return val;
}
+static inline void cc_random_init(void) { }
#endif
#endif /* _ASM_X86_COCO_H */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 84201071dfac..30a653cfc7d2 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -36,6 +36,7 @@
#include <asm/bios_ebda.h>
#include <asm/bugs.h>
#include <asm/cacheinfo.h>
+#include <asm/coco.h>
#include <asm/cpu.h>
#include <asm/efi.h>
#include <asm/gart.h>
@@ -994,6 +995,7 @@ void __init setup_arch(char **cmdline_p)
* memory size.
*/
mem_encrypt_setup_arch();
+ cc_random_init();
efi_fake_memmap();
efi_find_mirror();
--
2.43.2
Here's the recently merged mds improvement patches adapted to latest stable tree.
I've only compile tested them, but since I have also done similar backports for
older kernels I'm sure they should work.
The main difference is in the definition of the CLEAR_CPU_BUFFERS macro since
5.4 doesn't contains the alternative relocation handling logic hence the verw
instruction is moved out of the alternative definition and instead we have a jump which
skips the verw instruction there. That way the relocation will be handled by the
toolchain rather than the kernel.
Since I don't know if I will have time to work on the other branches this patchset
can be used as basis for the rest of the stable kernels. The main difference would be
which bit is used for CLEAR_CPU_BUFFERS. For kernel 6.6 the 2nd patch can be used verbatim
from upstrem (unlike this modified version) since the alternative relocation
did land in v6.5. However, even if used as-is from this patchset it's not a problem.
V2:
Added upstream commit id to individual patches.
H. Peter Anvin (Intel) (1):
x86/asm: Add _ASM_RIP() macro for x86-64 (%rip) suffix
Pawan Gupta (5):
x86/bugs: Add asm helpers for executing VERW
x86/entry_64: Add VERW just before userspace transition
x86/entry_32: Add VERW just before userspace transition
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
KVM/VMX: Move VERW closer to VMentry for MDS mitigation
Sean Christopherson (1):
KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
Documentation/x86/mds.rst | 38 ++++++++++++++++++++--------
arch/x86/entry/Makefile | 2 +-
arch/x86/entry/common.c | 2 --
arch/x86/entry/entry.S | 23 +++++++++++++++++
arch/x86/entry/entry_32.S | 3 +++
arch/x86/entry/entry_64.S | 10 ++++++++
arch/x86/entry/entry_64_compat.S | 1 +
arch/x86/include/asm/asm.h | 6 ++++-
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/irqflags.h | 1 +
arch/x86/include/asm/nospec-branch.h | 26 ++++++++++---------
arch/x86/kernel/cpu/bugs.c | 15 +++++------
arch/x86/kernel/nmi.c | 3 ---
arch/x86/kvm/vmx/run_flags.h | 7 +++--
arch/x86/kvm/vmx/vmenter.S | 9 ++++---
arch/x86/kvm/vmx/vmx.c | 12 ++++++---
16 files changed, 111 insertions(+), 49 deletions(-)
create mode 100644 arch/x86/entry/entry.S
--
2.34.1
From: Charan Teja Kalla <quic_charante(a)quicinc.com>
This fix is applicable for LTS kernel, 6.1.y. In latest kernels, this race
issue is fixed by the patch series [1] and [2]. The right thing to do here
would have been propagating these changes from latest kernel to the stable
branch, 6.1.y. However, these changes seems too intrusive to be picked for
stable branches. Hence, the fix proposed can be taken as an alternative
instead of backporting the patch series.
[1] https://lore.kernel.org/all/0-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidi…
[2] https://lore.kernel.org/all/0-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nv…
Issue:
A race condition is observed when arm_smmu_device_probe and
modprobe of client devices happens in parallel. This results
in the allocation of a new default domain for the iommu group
even though it was previously allocated and the respective iova
domain(iovad) was initialized. However, for this newly allocated
default domain, iovad will not be initialized. As a result, for
devices requesting dma allocations, this uninitialized iovad will
be used, thereby causing NULL pointer dereference issue.
Flow:
- During arm_smmu_device_probe, bus_iommu_probe() will be called
as part of iommu_device_register(). This results in the device probe,
__iommu_probe_device().
- When the modprobe of the client device happens in parallel, it
sets up the DMA configuration for the device using of_dma_configure_id(),
which inturn calls iommu_probe_device(). Later, default domain is
allocated and attached using iommu_alloc_default_domain() and
__iommu_attach_device() respectively. It then ends up initializing a
mapping domain(IOVA domain) and rcaches for the device via
arch_setup_dma_ops()->iommu_setup_dma_ops().
- Now, in the bus_iommu_probe() path, it again tries to allocate
a default domain via probe_alloc_default_domain(). This results in
allocating a new default domain(along with IOVA domain) via
__iommu_domain_alloc(). However, this newly allocated IOVA domain
will not be initialized.
- Now, when the same client device tries dma allocations via
iommu_dma_alloc(), it ends up accessing the rcaches of the newly
allocated IOVA domain, which is not initialized. This results
into NULL pointer dereferencing.
Fix this issue by adding a check in probe_alloc_default_domain()
to see if the iommu_group already has a default domain allocated
and initialized.
Signed-off-by: Charan Teja Kalla <quic_charante(a)quicinc.com>
Co-developed-by: Nikhil V <quic_nprakash(a)quicinc.com>
Signed-off-by: Nikhil V <quic_nprakash(a)quicinc.com>
---
drivers/iommu/iommu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8b3897239477..83736824f17d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1741,6 +1741,9 @@ static void probe_alloc_default_domain(struct bus_type *bus,
{
struct __group_domain_type gtype;
+ if (group->default_domain)
+ return;
+
memset(>ype, 0, sizeof(gtype));
/* Ask for default domain requirements of all devices in the group */
--
2.17.1
RISC-V PLIC cannot "end-of-interrupt" (EOI) disabled interrupts, as
explained in the description of Interrupt Completion in the PLIC spec:
"The PLIC signals it has completed executing an interrupt handler by
writing the interrupt ID it received from the claim to the claim/complete
register. The PLIC does not check whether the completion ID is the same
as the last claim ID for that target. If the completion ID does not match
an interrupt source that *is currently enabled* for the target, the
completion is silently ignored."
Commit 69ea463021be ("irqchip/sifive-plic: Fixup EOI failed when masked")
ensured that EOI is successful by enabling interrupt first, before EOI.
Commit a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask
operations") removed the interrupt enabling code from the previous
commit, because it assumes that interrupt should already be enabled at the
point of EOI. However, this is incorrect: there is a window after a hart
claiming an interrupt and before irq_desc->lock getting acquired,
interrupt can be disabled during this window. Thus, EOI can be invoked
while the interrupt is disabled, effectively nullify this EOI. This
results in the interrupt never gets asserted again, and the device who
uses this interrupt appears frozen.
Make sure that interrupt is really enabled before EOI.
Fixes: a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask operations")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
---
v2:
- add unlikely() for optimization
- re-word commit message to make it clearer
drivers/irqchip/irq-sifive-plic.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index e1484905b7bd..0a233e9d9607 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -148,7 +148,13 @@ static void plic_irq_eoi(struct irq_data *d)
{
struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
- writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+ if (unlikely(irqd_irq_disabled(d))) {
+ plic_toggle(handler, d->hwirq, 1);
+ writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+ plic_toggle(handler, d->hwirq, 0);
+ } else {
+ writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+ }
}
#ifdef CONFIG_SMP
--
2.39.2