On Mon, Feb 21, 2022 at 04:07:06PM +0000, Szabolcs Nagy wrote:
The 02/21/2022 14:32, Catalin Marinas wrote:
On Mon, Feb 07, 2022 at 03:20:39PM +0000, Mark Brown wrote:
diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index b72ff17d600a..5626cf208000 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -259,6 +259,39 @@ HWCAP2_RPRES
     Functionality implied by ID_AA64ISAR2_EL1.RPRES == 0b0001.
+
+HWCAP2_SME
+
+    Functionality implied by ID_AA64PFR1_EL1.SME == 0b0001, as described
+    by Documentation/arm64/sme.rst.
+
+HWCAP2_SME_I16I64
+
+    Functionality implied by ID_AA64SMFR0_EL1.I16I64 == 0b1111.
+
+HWCAP2_SME_F64F64
+
+    Functionality implied by ID_AA64SMFR0_EL1.F64F64 == 0b1.
+
+HWCAP2_SME_I8I32
+
+    Functionality implied by ID_AA64SMFR0_EL1.I8I32 == 0b1111.
+
+HWCAP2_SME_F16F32
+
+    Functionality implied by ID_AA64SMFR0_EL1.F16F32 == 0b1.
+
+HWCAP2_SME_B16F32
+
+    Functionality implied by ID_AA64SMFR0_EL1.B16F32 == 0b1.
+
+HWCAP2_SME_F32F32
+
+    Functionality implied by ID_AA64SMFR0_EL1.F32F32 == 0b1.
+
+HWCAP2_SME_FA64
+
+    Functionality implied by ID_AA64SMFR0_EL1.FA64 == 0b1.
More of a question for the libc people: should we drop the fine-grained HWCAP corresponding to the new ID_AA64SMFR0_EL1 register (only keep HWCAP2_SME) and get the user space to use the MRS emulation? Would any ifunc resolver be affected?
good question.
within glibc HWCAP2_SME is enough (to decide if we need to deal with additional register state and the lazy ZA save scheme) but i guess user code that actually uses sme would need the details (including in ifunc resolvers in principle).
since we have mrs, there is no strict need for hwcaps. if ifunc resolvers using this info are not widespread then the mrs emulation overhead is acceptable, but i suspect hwcaps are nicer to use.
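For illustration, the MRS route means user code decodes the ID_AA64SMFR0_EL1 fields itself. The sketch below is a minimal, hedged example of that decoding: it takes the register value as a plain integer, so it says nothing about how the value is obtained (on arm64 Linux that would be an MRS instruction trapped and emulated by the kernel). The field positions are an assumption taken from my reading of the architecture definition and should be checked against the Arm ARM; the helper and struct names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed ID_AA64SMFR0_EL1 field positions (verify against the Arm ARM). */
#define SMFR0_F32F32_SHIFT 32 /* 1 bit  */
#define SMFR0_B16F32_SHIFT 34 /* 1 bit  */
#define SMFR0_F16F32_SHIFT 35 /* 1 bit  */
#define SMFR0_I8I32_SHIFT  36 /* 4 bits */
#define SMFR0_F64F64_SHIFT 48 /* 1 bit  */
#define SMFR0_I16I64_SHIFT 52 /* 4 bits */
#define SMFR0_FA64_SHIFT   63 /* 1 bit  */

/* Hypothetical container for the decoded feature flags. */
struct sme_features {
    bool f32f32, b16f32, f16f32, i8i32, f64f64, i16i64, fa64;
};

/* Extract an unsigned field of `width` bits starting at bit `shift`. */
static uint64_t smfr0_field(uint64_t reg, unsigned shift, unsigned width)
{
    assert(width >= 1 && width < 64 && shift + width <= 64);
    return (reg >> shift) & ((UINT64_C(1) << width) - 1);
}

/* Apply the same conditions the documentation patch lists for each hwcap. */
static struct sme_features decode_smfr0(uint64_t reg)
{
    struct sme_features f;

    f.f32f32 = smfr0_field(reg, SMFR0_F32F32_SHIFT, 1) == 0x1;
    f.b16f32 = smfr0_field(reg, SMFR0_B16F32_SHIFT, 1) == 0x1;
    f.f16f32 = smfr0_field(reg, SMFR0_F16F32_SHIFT, 1) == 0x1;
    f.i8i32  = smfr0_field(reg, SMFR0_I8I32_SHIFT, 4) == 0xF;
    f.f64f64 = smfr0_field(reg, SMFR0_F64F64_SHIFT, 1) == 0x1;
    f.i16i64 = smfr0_field(reg, SMFR0_I16I64_SHIFT, 4) == 0xF;
    f.fa64   = smfr0_field(reg, SMFR0_FA64_SHIFT, 1) == 0x1;
    return f;
}
```

Compared with testing a hwcap bit, every consumer has to carry the field layout and the per-field comparison, which is one reason hwcaps may be nicer to use.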
I presume the ifunc resolvers only run once, so the overhead won't be noticed. Anyway, happy to keep the new HWCAP2 bits if they are useful.
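As a rough sketch of the hwcap route, a resolver-style selection can be a single mask test done once. The example below is not the glibc ifunc interface; it is a portable stand-in with hypothetical names (`sum_fn`, `select_sum`, `sum_sme_f64_placeholder`). The bit values are an assumption matching what I believe the arm64 uapi header uses, and they are only defined here when the system headers do not already provide them. The "SME" variant is a placeholder written in plain C.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h> /* getauxval(), AT_HWCAP2 */
#endif

/* Assumed bit assignments; defined only if the headers lack them. */
#ifndef HWCAP2_SME
#define HWCAP2_SME (1UL << 23)
#endif
#ifndef HWCAP2_SME_F64F64
#define HWCAP2_SME_F64F64 (1UL << 25)
#endif

typedef double (*sum_fn)(const double *v, size_t n);

/* Baseline implementation, usable on any CPU. */
static double sum_generic(const double *v, size_t n)
{
    double s = 0.0;

    assert(n == 0 || v != NULL);
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Placeholder for an SME F64F64 kernel; real code would use SME here. */
static double sum_sme_f64_placeholder(const double *v, size_t n)
{
    return sum_generic(v, n);
}

/* Pick an implementation from an AT_HWCAP2 value; both bits are required. */
static sum_fn select_sum(unsigned long hwcap2)
{
    const unsigned long need = HWCAP2_SME | HWCAP2_SME_F64F64;

    return (hwcap2 & need) == need ? sum_sme_f64_placeholder : sum_generic;
}

#if defined(__linux__) && defined(__aarch64__)
/* Query the running system once, as a resolver would. */
static sum_fn select_sum_native(void)
{
    return select_sum(getauxval(AT_HWCAP2));
}
#endif
```

Because the selection runs once and the result is cached as a function pointer, the cost of the feature query (hwcap read or emulated MRS) is paid a single time per process.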
do we have a plan after hwcap2 bits run out? :)
HWCAP3 or we free up the top 32 bits in both HWCAP and HWCAP2 ranges. We did not extend into those upper bits because of the ILP32 discussions at the time.
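To make the ILP32 concern concrete: under ILP32 an unsigned long is 32 bits wide, so a hwcap bit at index 32 or above cannot be carried in a value of that type. The sketch below only demonstrates that arithmetic with hypothetical helper names; it does not reflect any kernel or libc interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a hwcap bit index is representable in a 32-bit word. */
static bool hwcap_bit_fits_32(unsigned bit)
{
    assert(bit < 64);
    return bit < 32;
}

/* Test bit `bit` of a hwcap word held in 64 bits. */
static bool hwcap_test_bit(uint64_t hwcap, unsigned bit)
{
    assert(bit < 64);
    return ((hwcap >> bit) & 1) != 0;
}

/* Model what a 32-bit unsigned long would retain of a 64-bit hwcap word. */
static uint64_t hwcap_as_seen_by_ilp32(uint64_t hwcap)
{
    return (uint32_t)hwcap; /* upper 32 bits are dropped */
}
```

A feature assigned to, say, bit 40 is visible in the full 64-bit word but is lost after truncation, which is why the upper halves of HWCAP and HWCAP2 were left unused while ILP32 was under discussion.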