The ID_AA64PFR1_EL1.MTE_frac field is currently hidden from KVM. However, when ID_AA64PFR1_EL1.MTE==2, ID_AA64PFR1_EL1.MTE_frac==0 indicates that MTE_ASYNC is supported. On a host with ID_AA64PFR1_EL1.MTE==2 but without MTE_ASYNC support a guest with the MTE capability enabled will incorrectly see MTE_ASYNC advertised as supported. This series fixes that.
This was found by inspection and the current behaviour is not known to break anything. Linux doesn't check MTE_frac, and wrongly, assumes MTE async faults can be generated whenever MTE is supported. This is a separate problem and not addressed here.
I am looking for feedback on whether this change is valuable or otherwise.
Changes since v1: Only pass MTE_Frac hw value to the guest when it is the exact failure case. Changed base commit to v6.15-rc5 but still applies on v6.16-rc2 as well.
Ben Horgan (3): arm64/sysreg: Expose MTE_frac so that it is visible to KVM KVM: arm64: Make MTE_frac masking conditional on MTE capability KVM: selftests: Confirm exposing MTE_frac does not break migration
arch/arm64/kernel/cpufeature.c | 1 + arch/arm64/kvm/sys_regs.c | 28 ++++++- .../testing/selftests/kvm/arm64/set_id_regs.c | 77 ++++++++++++++++++- 3 files changed, 103 insertions(+), 3 deletions(-)
base-commit: 92a09c47464d040866cf2b4cd052bc60555185fb
KVM exposes the sanitised ID registers to guests. Currently these ignore the ID_AA64PFR1_EL1.MTE_frac field, meaning guests always see a value of zero.
This is a problem for platforms without the MTE_ASYNC feature where ID_AA64PFR1_EL1.MTE==0x2 and ID_AA64PFR1_EL1.MTE_frac==0xf. KVM forces MTE_frac to zero, meaning the guest believes MTE_ASYNC is supported, when no async fault will ever occur.
Before KVM can fix this, the architecture needs to sanitise the ID register field for MTE_frac.
Linux itself does not use MTE_frac field and just assumes MTE async faults can be generated if MTE is supported.
Signed-off-by: Ben Horgan ben.horgan@arm.com --- arch/arm64/kernel/cpufeature.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 9c4d6d552b25..e952f4b07ce1 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -298,6 +298,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = { ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_GCS), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_EL1_GCS_SHIFT, 4, 0), + S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_EL1_MTE_frac_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_EL1_SME_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_EL1_MPAM_frac_SHIFT, 4, 0),
If MTE_frac is masked out unconditionally then the guest will always see ID_AA64PFR1_EL1_MTE_frac as 0. However, a value of 0 when ID_AA64PFR1_EL1_MTE is 2 indicates that MTE_ASYNC is supported. Hence, for a host with ID_AA64PFR1_EL1_MTE==2 and ID_AA64PFR1_EL1_MTE_frac==0xf (MTE_ASYNC unsupported) the guest would see MTE_ASYNC advertised as supported whilst the host does not support it. Hence, expose the sanitised value of MTE_frac to the guest and user-space.
As MTE_frac was previously hidden, always 0, and KVM must accept values from KVM provided by user-space, when ID_AA64PFR1_EL1.MTE is 2 allow user-space to set ID_AA64PFR1_EL1.MTE_frac to 0. However, ignore it to avoid incorrectly claiming hardware support for MTE_ASYNC in the guest.
Note that linux does not check the value of ID_AA64PFR1_EL1_MTE_frac and wrongly assumes that MTE async faults can be generated even on hardware that does nto support them. This issue is not addressed here.
Signed-off-by: Ben Horgan ben.horgan@arm.com --- arch/arm64/kvm/sys_regs.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 005ad28f7306..ee30c94c0ace 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1600,13 +1600,14 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu, val = sanitise_id_aa64pfr0_el1(vcpu, val); break; case SYS_ID_AA64PFR1_EL1: - if (!kvm_has_mte(vcpu->kvm)) + if (!kvm_has_mte(vcpu->kvm)) { val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE_frac); + }
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME); val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_RNDR_trap); val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_NMI); - val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE_frac); val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_GCS); val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_THE); val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTEX); @@ -1953,11 +1954,34 @@ static int set_id_aa64pfr1_el1(struct kvm_vcpu *vcpu, { u64 hw_val = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1); u64 mpam_mask = ID_AA64PFR1_EL1_MPAM_frac_MASK; + u8 mte = SYS_FIELD_GET(ID_AA64PFR1_EL1, MTE, hw_val); + u8 user_mte_frac = SYS_FIELD_GET(ID_AA64PFR1_EL1, MTE_frac, user_val); + u8 hw_mte_frac = SYS_FIELD_GET(ID_AA64PFR1_EL1, MTE_frac, hw_val);
/* See set_id_aa64pfr0_el1 for comment about MPAM */ if ((hw_val & mpam_mask) == (user_val & mpam_mask)) user_val &= ~ID_AA64PFR1_EL1_MPAM_frac_MASK;
+ /* + * Previously MTE_frac was hidden from guest. However, if the + * hardware supports MTE2 but not MTE_ASYM_FAULT then a value + * of 0 for this field indicates that the hardware supports + * MTE_ASYNC. Whereas, 0xf indicates MTE_ASYNC is not supported. + * + * As KVM must accept values from KVM provided by user-space, + * when ID_AA64PFR1_EL1.MTE is 2 allow user-space to set + * ID_AA64PFR1_EL1.MTE_frac to 0. However, ignore it to avoid + * incorrectly claiming hardware support for MTE_ASYNC in the + * guest. + */ + + if (mte == ID_AA64PFR1_EL1_MTE_MTE2 && + hw_mte_frac == ID_AA64PFR1_EL1_MTE_frac_NI && + user_mte_frac == ID_AA64PFR1_EL1_MTE_frac_ASYNC) { + user_val &= ~ID_AA64PFR1_EL1_MTE_frac_MASK; + user_val |= hw_val & ID_AA64PFR1_EL1_MTE_frac_MASK; + } + return set_id_reg(vcpu, rd, user_val); }
When MTE is supported but MTE_ASYMM is not (ID_AA64PFR1_EL1.MTE == 2) ID_AA64PFR1_EL1.MTE_frac == 0xF indicates MTE_ASYNC is unsupported and MTE_frac == 0 indicates it is supported.
As MTE_frac was previously unconditionally read as 0 from the guest and user-space, check that using SET_ONE_REG to set it to 0 succeeds but does not change MTE_frac from unsupported (0xF) to supported (0). This is required as values originating from KVM from user-space must be accepted to avoid breaking migration.
Also, to allow this MTE field to be tested, enable KVM_ARM_CAP_MTE for the set_id_regs test. No effect on existing tests is expected.
Signed-off-by: Ben Horgan ben.horgan@arm.com --- .../testing/selftests/kvm/arm64/set_id_regs.c | 77 ++++++++++++++++++- 1 file changed, 76 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/arm64/set_id_regs.c b/tools/testing/selftests/kvm/arm64/set_id_regs.c index 322b9d3b0125..34f4174e7285 100644 --- a/tools/testing/selftests/kvm/arm64/set_id_regs.c +++ b/tools/testing/selftests/kvm/arm64/set_id_regs.c @@ -15,6 +15,8 @@ #include "test_util.h" #include <linux/bitfield.h>
+bool have_cap_arm_mte; + enum ftr_type { FTR_EXACT, /* Use a predefined safe value */ FTR_LOWER_SAFE, /* Smaller value is safe */ @@ -543,6 +545,70 @@ static void test_user_set_mpam_reg(struct kvm_vcpu *vcpu) ksft_test_result_fail("ID_AA64PFR1_EL1.MPAM_frac value should not be ignored\n"); }
+#define MTE_IDREG_TEST 1 +static void test_user_set_mte_reg(struct kvm_vcpu *vcpu) +{ + uint64_t masks[KVM_ARM_FEATURE_ID_RANGE_SIZE]; + struct reg_mask_range range = { + .addr = (__u64)masks, + }; + uint64_t val; + uint64_t mte; + uint64_t mte_frac; + int idx, err; + + if (!have_cap_arm_mte) { + ksft_test_result_skip("MTE capability not supported, nothing to test\n"); + return; + } + + /* Get writable masks for feature ID registers */ + memset(range.reserved, 0, sizeof(range.reserved)); + vm_ioctl(vcpu->vm, KVM_ARM_GET_REG_WRITABLE_MASKS, &range); + + idx = encoding_to_range_idx(SYS_ID_AA64PFR1_EL1); + if ((masks[idx] & ID_AA64PFR1_EL1_MTE_frac_MASK) == ID_AA64PFR1_EL1_MTE_frac_MASK) { + ksft_test_result_skip("ID_AA64PFR1_EL1.MTE_frac is officially writable, nothing to test\n"); + return; + } + + /* + * When MTE is supported but MTE_ASYMM is not (ID_AA64PFR1_EL1.MTE == 2) + * ID_AA64PFR1_EL1.MTE_frac == 0xF indicates MTE_ASYNC is unsupported + * and MTE_frac == 0 indicates it is supported. + * + * As MTE_frac was previously unconditionally read as 0, check + * that the set to 0 succeeds but does not change MTE_frac + * from unsupported (0xF) to supported (0). + * + */ + val = vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_ID_AA64PFR1_EL1)); + + mte = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE), val); + mte_frac = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE_frac), val); + if (mte != ID_AA64PFR1_EL1_MTE_MTE2 || + mte_frac != ID_AA64PFR1_EL1_MTE_frac_NI) { + ksft_test_result_skip("MTE_ASYNC or MTE_ASYMM are supported, nothing to test\n"); + return; + } + + /* Try to set MTE_frac=0. */ + val &= ~ID_AA64PFR1_EL1_MTE_frac_MASK; + val |= FIELD_PREP(ID_AA64PFR1_EL1_MTE_frac_MASK, 0); + err = __vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_ID_AA64PFR1_EL1), val); + if (err) { + ksft_test_result_fail("ID_AA64PFR1_EL1.MTE_frac=0 was not accepted\n"); + return; + } + + val = vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_ID_AA64PFR1_EL1)); + mte_frac = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE_frac), val); + if (mte_frac == ID_AA64PFR1_EL1_MTE_frac_NI) + ksft_test_result_pass("ID_AA64PFR1_EL1.MTE_frac=0 accepted and still 0xF\n"); + else + ksft_test_result_pass("ID_AA64PFR1_EL1.MTE_frac no longer 0xF\n"); +} + static void test_guest_reg_read(struct kvm_vcpu *vcpu) { bool done = false; @@ -673,6 +739,14 @@ static void test_reset_preserves_id_regs(struct kvm_vcpu *vcpu) ksft_test_result_pass("%s\n", __func__); }
+void kvm_arch_vm_post_create(struct kvm_vm *vm) +{ + if (vm_check_cap(vm, KVM_CAP_ARM_MTE)) { + vm_enable_cap(vm, KVM_CAP_ARM_MTE, 0); + have_cap_arm_mte = true; + } +} + int main(void) { struct kvm_vcpu *vcpu; @@ -701,7 +775,7 @@ int main(void) ARRAY_SIZE(ftr_id_aa64pfr1_el1) + ARRAY_SIZE(ftr_id_aa64mmfr0_el1) + ARRAY_SIZE(ftr_id_aa64mmfr1_el1) + ARRAY_SIZE(ftr_id_aa64mmfr2_el1) + ARRAY_SIZE(ftr_id_aa64zfr0_el1) - ARRAY_SIZE(test_regs) + 3 + - MPAM_IDREG_TEST; + MPAM_IDREG_TEST + MTE_IDREG_TEST;
ksft_set_plan(test_cnt);
@@ -709,6 +783,7 @@ int main(void) test_vcpu_ftr_id_regs(vcpu); test_vcpu_non_ftr_id_regs(vcpu); test_user_set_mpam_reg(vcpu); + test_user_set_mte_reg(vcpu);
test_guest_reg_read(vcpu);
On Mon, 12 May 2025 12:41:09 +0100, Ben Horgan wrote:
The ID_AA64PFR1_EL1.MTE_frac field is currently hidden from KVM. However, when ID_AA64PFR1_EL1.MTE==2, ID_AA64PFR1_EL1.MTE_frac==0 indicates that MTE_ASYNC is supported. On a host with ID_AA64PFR1_EL1.MTE==2 but without MTE_ASYNC support a guest with the MTE capability enabled will incorrectly see MTE_ASYNC advertised as supported. This series fixes that.
[...]
Applied to next, thanks!
[1/3] arm64/sysreg: Expose MTE_frac so that it is visible to KVM commit: 5799a2983fea4f22f6aaa291aee3b704db81d29f [2/3] KVM: arm64: Make MTE_frac masking conditional on MTE capability commit: fe21ff5d4b46c06c4e43a310fdb781d1e6b5173e [3/3] KVM: selftests: Confirm exposing MTE_frac does not break migration commit: 69018866d2baa014413e340744bfb342ec0cdbcd
Cheers,
M.
linux-kselftest-mirror@lists.linaro.org