This series backports workarounds for a few arm64 errata to the 5.15 stable tree.
Cc: Catalin Marinas catalin.marinas@arm.com
Cc: Will Deacon will@kernel.org
Cc: Jonathan Corbet corbet@lwn.net
Cc: Robin Murphy robin.murphy@arm.com
Cc: Joerg Roedel joro@8bytes.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: iommu@lists.linux.dev
Changelog:
==========
v2 -> v3:
- Backport other relevant errata patches from the same series as MMU-700 erratum 2812531
- v2 link: https://lore.kernel.org/stable/20230724185017.1675459-1-eahariha@linux.micro...
v1 -> v2:
- Drop patch accepted as commit e4e7f67cc14e9638798f80513e84b8fb62cdb7e3 in v5.15.121
- Appropriate sign-offs
- v1 link: https://lore.kernel.org/stable/1689895414-17425-1-git-send-email-eahariha@li...
Robin Murphy (4):
  iommu/arm-smmu-v3: Work around MMU-600 erratum 1076982
  iommu/arm-smmu-v3: Document MMU-700 erratum 2812531
  iommu/arm-smmu-v3: Add explicit feature for nesting
  iommu/arm-smmu-v3: Document nesting-related errata
Suzuki K Poulose (2):
  arm64: errata: Add workaround for TSB flush failures
  arm64: errata: Add detection for TRBE write to out-of-range
 Documentation/arm64/silicon-errata.rst      | 12 ++++
 arch/arm64/Kconfig                          | 74 +++++++++++++++++++++
 arch/arm64/include/asm/barrier.h            | 16 ++++-
 arch/arm64/kernel/cpu_errata.c              | 39 +++++++++++
 arch/arm64/tools/cpucaps                    |  2 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 50 ++++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  8 +++
 7 files changed, 200 insertions(+), 1 deletion(-)
From: Suzuki K Poulose suzuki.poulose@arm.com
commit fa82d0b4b833790ac4572377fb777dcea24a9d69 upstream
Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffer from errata where a TSB (trace synchronization barrier) fails to flush the trace data completely when executed from a trace prohibited region. In Linux we always execute it after we have moved the PE to a trace prohibited region, so we can apply the workaround every time a TSB is executed.
The workaround is to issue two TSB instructions consecutively.
NOTE: This erratum is defined as a LOCAL_CPU_ERRATUM, implying that a late CPU could be blocked from booting if it is the first CPU that requires the workaround. This is because we do not allow setting cpu_hwcaps after the SMP boot. The other alternative is to use "this_cpu_has_cap()" instead of the faster system-wide check, which may be a bit of an overhead, given we may have to do this in the nVHE KVM host before a guest entry.
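For comparison, here is a minimal sketch of the two checks weighed in the NOTE above. The wrapper names are illustrative only; the real change is the tsb_csync() macro in the diff below.

#include <asm/barrier.h>	/* __tsb_csync() after this patch */
#include <asm/cpufeature.h>	/* cpus_have_final_cap(), this_cpu_has_cap() */

/* System-wide check (what the patch uses): effectively a static branch
 * once capabilities are finalised, so a late CPU cannot opt in. */
static inline void tsb_csync_system_wide(void)
{
	if (cpus_have_final_cap(ARM64_WORKAROUND_TSB_FLUSH_FAILURE))
		__tsb_csync();
	__tsb_csync();
}

/* Per-CPU alternative mentioned above: allows late CPUs, but checks the
 * current CPU's capability state at runtime, including on the nVHE KVM
 * host path before a guest entry. */
static inline void tsb_csync_per_cpu(void)
{
	if (this_cpu_has_cap(ARM64_WORKAROUND_TSB_FLUSH_FAILURE))
		__tsb_csync();
	__tsb_csync();
}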
Cc: Will Deacon will@kernel.org
Cc: Catalin Marinas catalin.marinas@arm.com
Cc: Mathieu Poirier mathieu.poirier@linaro.org
Cc: Mike Leach mike.leach@linaro.org
Cc: Mark Rutland mark.rutland@arm.com
Cc: Anshuman Khandual anshuman.khandual@arm.com
Cc: Marc Zyngier maz@kernel.org
Acked-by: Catalin Marinas catalin.marinas@arm.com
Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org
Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com
Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
Link: https://lore.kernel.org/r/20211019163153.3692640-4-suzuki.poulose@arm.com
Signed-off-by: Will Deacon will@kernel.org
Signed-off-by: Easwar Hariharan eahariha@linux.microsoft.com
---
 Documentation/arm64/silicon-errata.rst |  4 ++++
 arch/arm64/Kconfig                     | 33 ++++++++++++++++++++++++++
 arch/arm64/include/asm/barrier.h       | 16 ++++++++++++-
 arch/arm64/kernel/cpu_errata.c         | 19 +++++++++++++++
 arch/arm64/tools/cpucaps               |  1 +
 5 files changed, 72 insertions(+), 1 deletion(-)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index 076861b0f5ac..1de575fc135b 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -104,6 +104,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A710 | #2119858 | ARM64_ERRATUM_2119858 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A710 | #2054223 | ARM64_ERRATUM_2054223 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N1 | #1188873,1418040| ARM64_ERRATUM_1418040 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N1 | #1349291 | N/A | @@ -112,6 +114,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N2 | #2139208 | ARM64_ERRATUM_2139208 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Neoverse-N2 | #2067961 | ARM64_ERRATUM_2067961 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | MMU-500 | #841119,826419 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e5e35470647b..6dce6e56ee53 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -774,6 +774,39 @@ config ARM64_ERRATUM_2139208
If unsure, say Y.
+config ARM64_WORKAROUND_TSB_FLUSH_FAILURE + bool + +config ARM64_ERRATUM_2054223 + bool "Cortex-A710: 2054223: workaround TSB instruction failing to flush trace" + default y + select ARM64_WORKAROUND_TSB_FLUSH_FAILURE + help + Enable workaround for ARM Cortex-A710 erratum 2054223 + + Affected cores may fail to flush the trace data on a TSB instruction, when + the PE is in trace prohibited state. This will cause losing a few bytes + of the trace cached. + + Workaround is to issue two TSB consecutively on affected cores. + + If unsure, say Y. + +config ARM64_ERRATUM_2067961 + bool "Neoverse-N2: 2067961: workaround TSB instruction failing to flush trace" + default y + select ARM64_WORKAROUND_TSB_FLUSH_FAILURE + help + Enable workaround for ARM Neoverse-N2 erratum 2067961 + + Affected cores may fail to flush the trace data on a TSB instruction, when + the PE is in trace prohibited state. This will cause losing a few bytes + of the trace cached. + + Workaround is to issue two TSB consecutively on affected cores. + + If unsure, say Y. + config CAVIUM_ERRATUM_22375 bool "Cavium erratum 22375, 24313" default y diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h index 451e11e5fd23..1c5a00598458 100644 --- a/arch/arm64/include/asm/barrier.h +++ b/arch/arm64/include/asm/barrier.h @@ -23,7 +23,7 @@ #define dsb(opt) asm volatile("dsb " #opt : : : "memory")
#define psb_csync() asm volatile("hint #17" : : : "memory") -#define tsb_csync() asm volatile("hint #18" : : : "memory") +#define __tsb_csync() asm volatile("hint #18" : : : "memory") #define csdb() asm volatile("hint #20" : : : "memory")
#ifdef CONFIG_ARM64_PSEUDO_NMI @@ -46,6 +46,20 @@ #define dma_rmb() dmb(oshld) #define dma_wmb() dmb(oshst)
+ +#define tsb_csync() \ + do { \ + /* \ + * CPUs affected by Arm Erratum 2054223 or 2067961 needs \ + * another TSB to ensure the trace is flushed. The barriers \ + * don't have to be strictly back to back, as long as the \ + * CPU is in trace prohibited state. \ + */ \ + if (cpus_have_final_cap(ARM64_WORKAROUND_TSB_FLUSH_FAILURE)) \ + __tsb_csync(); \ + __tsb_csync(); \ + } while (0) + /* * Generate a mask for array_index__nospec() that is ~0UL when 0 <= idx < sz * and 0 otherwise. diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index d810d4b7b438..ab412b45732f 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -375,6 +375,18 @@ static const struct midr_range trbe_overwrite_fill_mode_cpus[] = { }; #endif /* CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE */
+#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE +static const struct midr_range tsb_flush_fail_cpus[] = { +#ifdef CONFIG_ARM64_ERRATUM_2067961 + MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), +#endif +#ifdef CONFIG_ARM64_ERRATUM_2054223 + MIDR_ALL_VERSIONS(MIDR_CORTEX_A710), +#endif + {}, +}; +#endif /* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */ + const struct arm64_cpu_capabilities arm64_errata[] = { #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE { @@ -606,6 +618,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = { .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE, CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus), }, +#endif +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE + { + .desc = "ARM erratum 2067961 or 2054223", + .capability = ARM64_WORKAROUND_TSB_FLUSH_FAILURE, + ERRATA_MIDR_RANGE_LIST(tsb_flush_fail_cpus), + }, #endif { } diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index 32fe50a3a26c..36ab307c69d4 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -57,6 +57,7 @@ WORKAROUND_1542419 WORKAROUND_1742098 WORKAROUND_2457168 WORKAROUND_TRBE_OVERWRITE_FILL_MODE +WORKAROUND_TSB_FLUSH_FAILURE WORKAROUND_CAVIUM_23154 WORKAROUND_CAVIUM_27456 WORKAROUND_CAVIUM_30115
From: Suzuki K Poulose suzuki.poulose@arm.com
commit 8d81b2a38ddfc4b03662d2359765648c8b4cc73c upstream
Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where the TRBE, under some circumstances, might write up to 64 bytes to an address after the Limit as programmed by TRBLIMITR_EL1.LIMIT. This might:

 - Corrupt a page in the ring buffer, which may corrupt trace from a previous session, consumed by userspace.
 - Hit the guard page at the end of the vmalloc area and raise a fault.
To keep the handling simpler, we always leave out the last page of the range that the TRBE is allowed to write. This can be achieved by ensuring that we always have more than a PAGE worth of space in the range while calculating the LIMIT for the TRBE, and then adjusting the LIMIT pointer to leave that PAGE (TRBLIMITR.LIMIT -= PAGE_SIZE) out of the TRBE range while enabling it. This makes sure that the TRBE will only write to an area within its allowed limit (i.e., [head, head+size]) and we do not have to handle address faults within the driver.
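A rough sketch of the limit adjustment described above; the helper name and the head/limit bookkeeping are assumptions for illustration, not the actual CoreSight TRBE driver code.

#include <linux/mm.h>	/* PAGE_SIZE */

/*
 * Keep the last page of the permitted range out of the programmed
 * LIMIT, so an errant write of up to 64 bytes past TRBLIMITR_EL1.LIMIT
 * still lands in memory owned by the driver.
 */
static unsigned long trbe_adjusted_limit(unsigned long head, unsigned long limit)
{
	/* Need strictly more than one page of space to run at all */
	if (limit - head <= PAGE_SIZE)
		return 0;			/* no room, skip this session */

	return limit - PAGE_SIZE;		/* TRBLIMITR.LIMIT -= PAGE_SIZE */
}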
Cc: Anshuman Khandual anshuman.khandual@arm.com
Cc: Mathieu Poirier mathieu.poirier@linaro.org
Cc: Mike Leach mike.leach@linaro.org
Cc: Leo Yan leo.yan@linaro.org
Cc: Will Deacon will@kernel.org
Cc: Mark Rutland mark.rutland@arm.com
Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com
Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org
Acked-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
Link: https://lore.kernel.org/r/20211019163153.3692640-5-suzuki.poulose@arm.com
Signed-off-by: Will Deacon will@kernel.org
Signed-off-by: Easwar Hariharan eahariha@linux.microsoft.com
---
 Documentation/arm64/silicon-errata.rst |  4 +++
 arch/arm64/Kconfig                     | 41 ++++++++++++++++++++++++++
 arch/arm64/kernel/cpu_errata.c         | 20 +++++++++++++
 arch/arm64/tools/cpucaps               |  1 +
 4 files changed, 66 insertions(+)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index 1de575fc135b..f64354f8a79f 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -106,6 +106,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A710 | #2054223 | ARM64_ERRATUM_2054223 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A710 | #2224489 | ARM64_ERRATUM_2224489 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N1 | #1188873,1418040| ARM64_ERRATUM_1418040 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N1 | #1349291 | N/A | @@ -116,6 +118,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N2 | #2067961 | ARM64_ERRATUM_2067961 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Neoverse-N2 | #2253138 | ARM64_ERRATUM_2253138 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | MMU-500 | #841119,826419 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 6dce6e56ee53..5ab4b0520eab 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -807,6 +807,47 @@ config ARM64_ERRATUM_2067961
If unsure, say Y.
+config ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE + bool + +config ARM64_ERRATUM_2253138 + bool "Neoverse-N2: 2253138: workaround TRBE writing to address out-of-range" + depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in + depends on CORESIGHT_TRBE + default y + select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE + help + This option adds the workaround for ARM Neoverse-N2 erratum 2253138. + + Affected Neoverse-N2 cores might write to an out-of-range address, not reserved + for TRBE. Under some conditions, the TRBE might generate a write to the next + virtually addressed page following the last page of the TRBE address space + (i.e., the TRBLIMITR_EL1.LIMIT), instead of wrapping around to the base. + + Work around this in the driver by always making sure that there is a + page beyond the TRBLIMITR_EL1.LIMIT, within the space allowed for the TRBE. + + If unsure, say Y. + +config ARM64_ERRATUM_2224489 + bool "Cortex-A710: 2224489: workaround TRBE writing to address out-of-range" + depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in + depends on CORESIGHT_TRBE + default y + select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE + help + This option adds the workaround for ARM Cortex-A710 erratum 2224489. + + Affected Cortex-A710 cores might write to an out-of-range address, not reserved + for TRBE. Under some conditions, the TRBE might generate a write to the next + virtually addressed page following the last page of the TRBE address space + (i.e., the TRBLIMITR_EL1.LIMIT), instead of wrapping around to the base. + + Work around this in the driver by always making sure that there is a + page beyond the TRBLIMITR_EL1.LIMIT, within the space allowed for the TRBE. + + If unsure, say Y. + config CAVIUM_ERRATUM_22375 bool "Cavium erratum 22375, 24313" default y diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index ab412b45732f..bf69a20bc27f 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -387,6 +387,18 @@ static const struct midr_range tsb_flush_fail_cpus[] = { }; #endif /* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
+#ifdef CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE +static struct midr_range trbe_write_out_of_range_cpus[] = { +#ifdef CONFIG_ARM64_ERRATUM_2253138 + MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), +#endif +#ifdef CONFIG_ARM64_ERRATUM_2224489 + MIDR_ALL_VERSIONS(MIDR_CORTEX_A710), +#endif + {}, +}; +#endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */ + const struct arm64_cpu_capabilities arm64_errata[] = { #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE { @@ -625,6 +637,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = { .capability = ARM64_WORKAROUND_TSB_FLUSH_FAILURE, ERRATA_MIDR_RANGE_LIST(tsb_flush_fail_cpus), }, +#endif +#ifdef CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE + { + .desc = "ARM erratum 2253138 or 2224489", + .capability = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE, + .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE, + CAP_MIDR_RANGE_LIST(trbe_write_out_of_range_cpus), + }, #endif { } diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index 36ab307c69d4..fcaeec5a5125 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -58,6 +58,7 @@ WORKAROUND_1742098 WORKAROUND_2457168 WORKAROUND_TRBE_OVERWRITE_FILL_MODE WORKAROUND_TSB_FLUSH_FAILURE +WORKAROUND_TRBE_WRITE_OUT_OF_RANGE WORKAROUND_CAVIUM_23154 WORKAROUND_CAVIUM_27456 WORKAROUND_CAVIUM_30115
From: Robin Murphy robin.murphy@arm.com
commit f322e8af35c7f23a8c08b595c38d6c855b2d836f upstream
MMU-600 versions prior to r1p0 fail to correctly generate a WFE wakeup event when the command queue transitions from full to non-full. We can easily work around this by simply hiding the SEV capability such that we fall back to polling for space in the queue - since MMU-600 implements MSIs we wouldn't expect to need SEV for sync completion either, so this should have little to no impact.
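Conceptually, clearing the SEV capability changes the queue-full wait as in the sketch below; cmdq_has_space() is a made-up helper and this is not the driver's actual polling loop, which lives in arm-smmu-v3.c.

#include "arm-smmu-v3.h"	/* struct arm_smmu_device, ARM_SMMU_FEAT_SEV */

/* With ARM_SMMU_FEAT_SEV cleared for MMU-600 r0px, the wait degrades
 * from a WFE-based wakeup to plain polling. */
static void wait_for_cmdq_space(struct arm_smmu_device *smmu)
{
	while (!cmdq_has_space(smmu)) {		/* hypothetical helper */
		if (smmu->features & ARM_SMMU_FEAT_SEV)
			wfe();			/* woken when the queue drains */
		else
			cpu_relax();		/* erratum 1076982: just poll */
	}
}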
Signed-off-by: Robin Murphy robin.murphy@arm.com
Reviewed-by: Nicolin Chen nicolinc@nvidia.com
Tested-by: Nicolin Chen nicolinc@nvidia.com
Link: https://lore.kernel.org/r/08adbe3d01024d8382a478325f73b56851f76e49.168373125...
Signed-off-by: Will Deacon will@kernel.org
Signed-off-by: Easwar Hariharan eahariha@linux.microsoft.com
---
 Documentation/arm64/silicon-errata.rst      |  2 ++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 +++++++++++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  6 +++++
 3 files changed, 37 insertions(+)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index f64354f8a79f..55e1e074dec1 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -122,6 +122,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | MMU-500 | #841119,826419 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | MMU-600 | #1076982 | N/A | ++----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 | +----------------+-----------------+-----------------+-----------------------------+ diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index bcdb2cbdda97..782d040a829c 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -3459,6 +3459,33 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) return 0; }
+#define IIDR_IMPLEMENTER_ARM 0x43b +#define IIDR_PRODUCTID_ARM_MMU_600 0x483 + +static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu) +{ + u32 reg; + unsigned int implementer, productid, variant, revision; + + reg = readl_relaxed(smmu->base + ARM_SMMU_IIDR); + implementer = FIELD_GET(IIDR_IMPLEMENTER, reg); + productid = FIELD_GET(IIDR_PRODUCTID, reg); + variant = FIELD_GET(IIDR_VARIANT, reg); + revision = FIELD_GET(IIDR_REVISION, reg); + + switch (implementer) { + case IIDR_IMPLEMENTER_ARM: + switch (productid) { + case IIDR_PRODUCTID_ARM_MMU_600: + /* Arm erratum 1076982 */ + if (variant == 0 && revision <= 2) + smmu->features &= ~ARM_SMMU_FEAT_SEV; + break; + } + break; + } +} + static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) { u32 reg; @@ -3664,6 +3691,8 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
smmu->ias = max(smmu->ias, smmu->oas);
+ arm_smmu_device_iidr_probe(smmu); + if (arm_smmu_sva_supported(smmu)) smmu->features |= ARM_SMMU_FEAT_SVA;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 4cb136f07914..5964e02c4e57 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -69,6 +69,12 @@ #define IDR5_VAX GENMASK(11, 10) #define IDR5_VAX_52_BIT 1
+#define ARM_SMMU_IIDR 0x18 +#define IIDR_PRODUCTID GENMASK(31, 20) +#define IIDR_VARIANT GENMASK(19, 16) +#define IIDR_REVISION GENMASK(15, 12) +#define IIDR_IMPLEMENTER GENMASK(11, 0) + #define ARM_SMMU_CR0 0x20 #define CR0_ATSCHK (1 << 4) #define CR0_CMDQEN (1 << 3)
From: Robin Murphy robin.murphy@arm.com
commit 309a15cb16bb075da1c99d46fb457db6a1a2669e upstream
To work around MMU-700 erratum 2812531 we need to ensure that certain sequences of commands cannot be issued without an intervening sync. In practice this falls out of our current command-batching machinery anyway - each batch only contains a single type of invalidation command, and ends with a sync. The only exception is when a batch is sufficiently large to need issuing across multiple command queue slots, wherein the earlier slots will not contain a sync and thus may in theory interleave with another batch being issued in parallel to create an affected sequence across the slot boundary.
Since MMU-700 supports range invalidate commands and thus we will prefer to use them (which also happens to avoid conditions for other errata), I'm not entirely sure it's even possible for a single high-level invalidate call to generate a batch of more than 63 commands, but for the sake of robustness and documentation, wire up an option to enforce that a sync is always inserted for every slot issued.
The other aspect is that the relative order of DVM commands cannot be controlled, so DVM cannot be used. Again that is already the status quo, but since we have at least defined ARM_SMMU_FEAT_BTM, we can explicitly disable it for documentation purposes even if it's not wired up anywhere yet.
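For clarity, the batching change amounts to the following, reproduced from the arm_smmu_cmdq_batch_add() hunk in this patch with explanatory comments added (no new logic):

static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
				    struct arm_smmu_cmdq_batch *cmds,
				    struct arm_smmu_cmdq_ent *cmd)
{
	/* Erratum 2812531: flush one slot early, with a forced CMD_SYNC,
	 * so that no slot is ever issued without a trailing sync. */
	if (cmds->num == CMDQ_BATCH_ENTRIES - 1 &&
	    (smmu->options & ARM_SMMU_OPT_CMDQ_FORCE_SYNC)) {
		arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true);
		cmds->num = 0;
	}

	/* Pre-existing behaviour: a full batch is flushed without a sync */
	if (cmds->num == CMDQ_BATCH_ENTRIES) {
		arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, false);
		cmds->num = 0;
	}

	/* ... rest of the function is unchanged ... */
}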
Signed-off-by: Robin Murphy robin.murphy@arm.com
Reviewed-by: Nicolin Chen nicolinc@nvidia.com
Link: https://lore.kernel.org/r/330221cdfd0003cd51b6c04e7ff3566741ad8374.168373125...
Signed-off-by: Will Deacon will@kernel.org
Signed-off-by: Easwar Hariharan eahariha@linux.microsoft.com
---
 Documentation/arm64/silicon-errata.rst      |  2 ++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 12 ++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 +
 3 files changed, 15 insertions(+)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index 55e1e074dec1..322df8abbc0e 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -124,6 +124,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | MMU-600 | #1076982 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | MMU-700 | #2812531 | N/A | ++----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 | +----------------+-----------------+-----------------+-----------------------------+ diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 782d040a829c..6a551a48d271 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -897,6 +897,12 @@ static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu, struct arm_smmu_cmdq_batch *cmds, struct arm_smmu_cmdq_ent *cmd) { + if (cmds->num == CMDQ_BATCH_ENTRIES - 1 && + (smmu->options & ARM_SMMU_OPT_CMDQ_FORCE_SYNC)) { + arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true); + cmds->num = 0; + } + if (cmds->num == CMDQ_BATCH_ENTRIES) { arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, false); cmds->num = 0; @@ -3461,6 +3467,7 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
#define IIDR_IMPLEMENTER_ARM 0x43b #define IIDR_PRODUCTID_ARM_MMU_600 0x483 +#define IIDR_PRODUCTID_ARM_MMU_700 0x487
static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu) { @@ -3481,6 +3488,11 @@ static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu) if (variant == 0 && revision <= 2) smmu->features &= ~ARM_SMMU_FEAT_SEV; break; + case IIDR_PRODUCTID_ARM_MMU_700: + /* Arm erratum 2812531 */ + smmu->features &= ~ARM_SMMU_FEAT_BTM; + smmu->options |= ARM_SMMU_OPT_CMDQ_FORCE_SYNC; + break; } break; } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 5964e02c4e57..abaecdf8d5d2 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -651,6 +651,7 @@ struct arm_smmu_device { #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) #define ARM_SMMU_OPT_PAGE0_REGS_ONLY (1 << 1) #define ARM_SMMU_OPT_MSIPOLL (1 << 2) +#define ARM_SMMU_OPT_CMDQ_FORCE_SYNC (1 << 3) u32 options;
struct arm_smmu_cmdq cmdq;
On 2023-08-02 18:02, Easwar Hariharan wrote:
> From: Robin Murphy robin.murphy@arm.com

> commit 309a15cb16bb075da1c99d46fb457db6a1a2669e upstream

> To work around MMU-700 erratum 2812531 we need to ensure that certain sequences of commands cannot be issued without an intervening sync. In practice this falls out of our current command-batching machinery anyway - each batch only contains a single type of invalidation command, and ends with a sync. The only exception is when a batch is sufficiently large to need issuing across multiple command queue slots, wherein the earlier slots will not contain a sync and thus may in theory interleave with another batch being issued in parallel to create an affected sequence across the slot boundary.

> Since MMU-700 supports range invalidate commands and thus we will prefer to use them (which also happens to avoid conditions for other errata), I'm not entirely sure it's even possible for a single high-level invalidate call to generate a batch of more than 63 commands,
Out of interest, have you observed a case where this actually happens?
> but for the sake of robustness and documentation, wire up an option to enforce that a sync is always inserted for every slot issued.

> The other aspect is that the relative order of DVM commands cannot be controlled, so DVM cannot be used. Again that is already the status quo, but since we have at least defined ARM_SMMU_FEAT_BTM, we can explicitly disable it for documentation purposes even if it's not wired up anywhere yet.
Note that there seems to be a slight issue with this patch that I missed, under discussion here:
https://lore.kernel.org/linux-iommu/27c895b8-1fb0-be88-8bc3-878d754684c8@hua...
Thanks, Robin.
<snip>
On Mon, 7 Aug 2023, Robin Murphy wrote:
> On 2023-08-02 18:02, Easwar Hariharan wrote:

>> From: Robin Murphy robin.murphy@arm.com

>> commit 309a15cb16bb075da1c99d46fb457db6a1a2669e upstream

>> To work around MMU-700 erratum 2812531 we need to ensure that certain sequences of commands cannot be issued without an intervening sync. In practice this falls out of our current command-batching machinery anyway - each batch only contains a single type of invalidation command, and ends with a sync. The only exception is when a batch is sufficiently large to need issuing across multiple command queue slots, wherein the earlier slots will not contain a sync and thus may in theory interleave with another batch being issued in parallel to create an affected sequence across the slot boundary.

>> Since MMU-700 supports range invalidate commands and thus we will prefer to use them (which also happens to avoid conditions for other errata), I'm not entirely sure it's even possible for a single high-level invalidate call to generate a batch of more than 63 commands,

> Out of interest, have you observed a case where this actually happens?
Not so far, but this was more of a proactive pickup since the erratum was published and we have some hardware being tested.
>> but for the sake of robustness and documentation, wire up an option to enforce that a sync is always inserted for every slot issued.

>> The other aspect is that the relative order of DVM commands cannot be controlled, so DVM cannot be used. Again that is already the status quo, but since we have at least defined ARM_SMMU_FEAT_BTM, we can explicitly disable it for documentation purposes even if it's not wired up anywhere yet.

> Note that there seems to be a slight issue with this patch that I missed, under discussion here:

> https://lore.kernel.org/linux-iommu/27c895b8-1fb0-be88-8bc3-878d754684c8@hua...
Thanks for the heads up. I did not backport the TTL patch for this series, or the ones for 6.1 or 6.4, but I'll track the discussion. Could you CC any fixup patch you may create for this to stable for 5.15+ as well?
Thanks, Easwar
> Thanks, Robin.
<snip>
From: Robin Murphy robin.murphy@arm.com
commit 1d9777b9f3d55b4b6faf186ba4f1d6fb560c0523 upstream
In certain cases we may want to refuse to allow nested translation even when both stages are implemented, so let's add an explicit feature for nesting support which we can control in its own right. For now this merely serves as documentation, but it means a nice convenient check will be ready and waiting for the future nesting code.
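As a purely hypothetical illustration of how future nesting code might consume the new flag (nothing in this series uses it yet):

#include "arm-smmu-v3.h"	/* struct arm_smmu_device, ARM_SMMU_FEAT_NESTING */
#include <linux/errno.h>

/* Hypothetical future check: refuse nested translation unless both
 * stages are implemented and no erratum has revoked nesting support. */
static int arm_smmu_check_nesting(struct arm_smmu_device *smmu)
{
	if (!(smmu->features & ARM_SMMU_FEAT_NESTING))
		return -EOPNOTSUPP;
	return 0;
}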
Signed-off-by: Robin Murphy robin.murphy@arm.com
Reviewed-by: Nicolin Chen nicolinc@nvidia.com
Link: https://lore.kernel.org/r/136c3f4a3a84cc14a5a1978ace57dfd3ed67b688.168373125...
Signed-off-by: Will Deacon will@kernel.org
Signed-off-by: Easwar Hariharan eahariha@linux.microsoft.com
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 ++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
 2 files changed, 5 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 6a551a48d271..7cec4a457d91 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -3703,6 +3703,10 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
smmu->ias = max(smmu->ias, smmu->oas);
+ if ((smmu->features & ARM_SMMU_FEAT_TRANS_S1) && + (smmu->features & ARM_SMMU_FEAT_TRANS_S2)) + smmu->features |= ARM_SMMU_FEAT_NESTING; + arm_smmu_device_iidr_probe(smmu);
if (arm_smmu_sva_supported(smmu)) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index abaecdf8d5d2..c594a9b46999 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -646,6 +646,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_BTM (1 << 16) #define ARM_SMMU_FEAT_SVA (1 << 17) #define ARM_SMMU_FEAT_E2H (1 << 18) +#define ARM_SMMU_FEAT_NESTING (1 << 19) u32 features;
#define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
From: Robin Murphy robin.murphy@arm.com
commit 0bfbfc526c70606bf0fad302e4821087cbecfaf4 upstream
Both MMU-600 and MMU-700 have similar errata around TLB invalidation while both stages of translation are active, which will need some consideration once nesting support is implemented. For now, though, it's very easy to make our implicit lack of nesting support explicit for those cases, so they're less likely to be missed in future.
Signed-off-by: Robin Murphy robin.murphy@arm.com
Reviewed-by: Nicolin Chen nicolinc@nvidia.com
Link: https://lore.kernel.org/r/696da78d32bb4491f898f11b0bb4d850a8aa7c6a.168373125...
Signed-off-by: Will Deacon will@kernel.org
Signed-off-by: Easwar Hariharan eahariha@linux.microsoft.com
---
 Documentation/arm64/silicon-errata.rst      | 4 ++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 +++++
 2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index 322df8abbc0e..83a75e16e54d 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -122,9 +122,9 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | MMU-500 | #841119,826419 | N/A | +----------------+-----------------+-----------------+-----------------------------+ -| ARM | MMU-600 | #1076982 | N/A | +| ARM | MMU-600 | #1076982,1209401| N/A | +----------------+-----------------+-----------------+-----------------------------+ -| ARM | MMU-700 | #2812531 | N/A | +| ARM | MMU-700 | #2268618,2812531| N/A | +----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 | diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 7cec4a457d91..340ef116d574 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -3487,11 +3487,16 @@ static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu) /* Arm erratum 1076982 */ if (variant == 0 && revision <= 2) smmu->features &= ~ARM_SMMU_FEAT_SEV; + /* Arm erratum 1209401 */ + if (variant < 2) + smmu->features &= ~ARM_SMMU_FEAT_NESTING; break; case IIDR_PRODUCTID_ARM_MMU_700: /* Arm erratum 2812531 */ smmu->features &= ~ARM_SMMU_FEAT_BTM; smmu->options |= ARM_SMMU_OPT_CMDQ_FORCE_SYNC; + /* Arm errata 2268618, 2812531 */ + smmu->features &= ~ARM_SMMU_FEAT_NESTING; break; } break;