This patch series ended up much larger than expected, please bear with me! The goal here is to support vendor extensions, starting at probing the device tree and ending with reporting to userspace.
The main design objective was to allow vendors to operate independently of each other. This is achieved by delegating vendor extensions to a new struct "hart_isa_vendor", a counterpart to "hart_isa".
Each vendor has their own list of extensions they support. Each vendor is given a "namespace" to themselves, occupying the key values 0x8000 - 0x8080. It is up to the vendor's discretion how they wish to allocate keys in that range for their vendor extensions.
Reporting to userspace follows a similar story, leveraging the hwprobe syscall. There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_0 that is used to request supported vendor extensions. The vendor extension keys are disambiguated by the vendor associated with the cpumask passed into hwprobe. The entire 64-bit key space is available to each vendor.
On to the xtheadvector specific code. xtheadvector is a custom extension based upon riscv vector version 0.7.1 [1]. All of the vector routines have been modified to support this alternative vector version, based upon whether xtheadvector was determined to be supported at boot. I have tested this with an Allwinner Nezha board. I ran into issues booting the board on 6.9-rc1, so I applied these patches to 6.8. A couple of minor merge conflicts do arise when doing that, so please let me know if you have been able to boot this board with a 6.9 kernel. I used SkiffOS [2] to manage building the image, but upgraded the U-Boot version to Samuel Holland's more up-to-date version [3] and changed out the device tree used by U-Boot with the device trees that are present in upstream Linux and this series. Thank you Samuel for all of the work you did to make this task possible.
To test the integration, I used the riscv vector kselftests. I modified the test cases to be able to more easily extend them, and then added an xtheadvector target that works by calling hwprobe and swapping out the vector asm if needed.
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361c...
[2] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[3] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9a...
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
Charlie Jenkins (18):
      dt-bindings: riscv: Add vendorid and archid
      riscv: cpufeature: Fix thead vector hwcap removal
      dt-bindings: riscv: Add xtheadvector ISA extension description
      riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
      riscv: Fix extension subset checking
      riscv: Extend cpufeature.c to detect vendor extensions
      riscv: Optimize riscv_cpu_isa_extension_(un)likely()
      riscv: Introduce vendor variants of extension helpers
      riscv: uaccess: Add alternative for xtheadvector uaccess
      riscv: csr: Add CSR encodings for VCSR_VXRM/VCSR_VXSAT
      riscv: Create xtheadvector file
      riscv: vector: Support xtheadvector save/restore
      riscv: hwprobe: Disambiguate vector and xtheadvector in hwprobe
      riscv: hwcap: Add v to hwcap if xtheadvector enabled
      riscv: hwprobe: Add vendor extension probing
      riscv: hwprobe: Document vendor extensions and xtheadvector extension
      selftests: riscv: Fix vector tests
      selftests: riscv: Support xtheadvector in vector tests
Heiko Stuebner (1): RISC-V: define the elements of the VCSR vector CSR
 Documentation/arch/riscv/hwprobe.rst               |  12 +
 Documentation/devicetree/bindings/riscv/cpus.yaml  |  11 +
 .../devicetree/bindings/riscv/extensions.yaml      |   9 +
 arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi      |   4 +-
 arch/riscv/include/asm/cpufeature.h                | 143 +++++++---
 arch/riscv/include/asm/csr.h                       |  13 +
 arch/riscv/include/asm/hwcap.h                     |  23 ++
 arch/riscv/include/asm/hwprobe.h                   |   4 +-
 arch/riscv/include/asm/sbi.h                       |   2 +
 arch/riscv/include/asm/vector.h                    | 228 ++++++++++++----
 arch/riscv/include/asm/xtheadvector.h              |  25 ++
 arch/riscv/include/uapi/asm/hwprobe.h              |  10 +-
 arch/riscv/kernel/cpu.c                            |  20 ++
 arch/riscv/kernel/cpufeature.c                     | 264 +++++++++++++++---
 arch/riscv/kernel/kernel_mode_vector.c             |   4 +-
 arch/riscv/kernel/sys_hwprobe.c                    |  59 ++++-
 arch/riscv/kernel/vector.c                         |  22 +-
 arch/riscv/lib/uaccess.S                           |   1 +
 tools/testing/selftests/riscv/vector/.gitignore    |   3 +-
 tools/testing/selftests/riscv/vector/Makefile      |  17 +-
 .../selftests/riscv/vector/v_exec_initval_nolibc.c |  93 +++++++
 tools/testing/selftests/riscv/vector/v_helpers.c   |  66 +++++
 tools/testing/selftests/riscv/vector/v_helpers.h   |   7 +
 tools/testing/selftests/riscv/vector/v_initval.c   |  22 ++
 .../selftests/riscv/vector/v_initval_nolibc.c      |  68 -----
 .../selftests/riscv/vector/vstate_exec_nolibc.c    |  20 +-
 tools/testing/selftests/riscv/vector/vstate_prctl.c | 295 ++++++++++++---------
 27 files changed, 1114 insertions(+), 331 deletions(-)
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240411-dev-charlie-support_thead_vector_6_9-1591fc2a431d
vendorid and marchid are required during devicetree parsing to determine known hardware capabilities. This parsing happens before the whole system has booted, so only the boot hart is online and able to report the value of its vendorid and archid.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 Documentation/devicetree/bindings/riscv/cpus.yaml | 11 +++++++++++
 1 file changed, 11 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
index d87dd50f1a4b..c21d7374636c 100644
--- a/Documentation/devicetree/bindings/riscv/cpus.yaml
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -94,6 +94,17 @@ properties:
       description:
         The blocksize in bytes for the Zicboz cache operations.
+  riscv,vendorid:
+    $ref: /schemas/types.yaml#/definitions/uint64
+    description:
+      Same value as the mvendorid CSR.
+
+  riscv,archid:
+    $ref: /schemas/types.yaml#/definitions/uint64
+    description:
+      Same value as the marchid CSR.
+
   # RISC-V has multiple properties for cache op block sizes as the sizes
   # differ between individual CBO extensions
   cache-op-block-size: false
On Thu, Apr 11, 2024 at 09:11:07PM -0700, Charlie Jenkins wrote:
vendorid and marchid are required during devicetree parsing to determine known hardware capabilities. This parsing happens before the whole system has booted, so only the boot hart is online and able to report the value of its vendorid and archid.
I'll comment on the kernel patch, but this is not needed.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
 Documentation/devicetree/bindings/riscv/cpus.yaml | 11 +++++++++++
 1 file changed, 11 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
index d87dd50f1a4b..c21d7374636c 100644
--- a/Documentation/devicetree/bindings/riscv/cpus.yaml
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -94,6 +94,17 @@ properties:
       description:
         The blocksize in bytes for the Zicboz cache operations.
+  riscv,vendorid:
+    $ref: /schemas/types.yaml#/definitions/uint64
+    description:
+      Same value as the mvendorid CSR.
+
+  riscv,archid:
+    $ref: /schemas/types.yaml#/definitions/uint64
+    description:
+      Same value as the marchid CSR.
+
   # RISC-V has multiple properties for cache op block sizes as the sizes
   # differ between individual CBO extensions
   cache-op-block-size: false
--
2.44.0
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted which happens after the DT parsing. Use the vendorid/archid values from the DT if available or assume all harts have the same values as the boot hart as a fallback.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/include/asm/sbi.h   |  2 ++
 arch/riscv/kernel/cpu.c        | 20 ++++++++++++++++++++
 arch/riscv/kernel/cpufeature.c | 22 ++++++++++++++++++++--
 3 files changed, 42 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 6e68f8dff76b..0fab508a65b3 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -370,6 +370,8 @@ static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
 static inline void sbi_init(void) {}
 #endif /* CONFIG_RISCV_SBI */
+unsigned long riscv_get_mvendorid(void);
+unsigned long riscv_get_marchid(void);
 unsigned long riscv_cached_mvendorid(unsigned int cpu_id);
 unsigned long riscv_cached_marchid(unsigned int cpu_id);
 unsigned long riscv_cached_mimpid(unsigned int cpu_id);
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index d11d6320fb0d..08319a819f32 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -139,6 +139,26 @@ int riscv_of_parent_hartid(struct device_node *node, unsigned long *hartid)
 	return -1;
 }
+unsigned long __init riscv_get_marchid(void)
+{
+#if IS_ENABLED(CONFIG_RISCV_SBI)
+	return sbi_spec_is_0_1() ? 0 : sbi_get_marchid();
+#elif IS_ENABLED(CONFIG_RISCV_M_MODE)
+	return csr_read(CSR_MARCHID);
+#endif
+	return 0;
+}
+
+unsigned long __init riscv_get_mvendorid(void)
+{
+#if IS_ENABLED(CONFIG_RISCV_SBI)
+	return sbi_spec_is_0_1() ? 0 : sbi_get_mvendorid();
+#elif IS_ENABLED(CONFIG_RISCV_M_MODE)
+	return csr_read(CSR_MVENDORID);
+#endif
+	return 0;
+}
+
 DEFINE_PER_CPU(struct riscv_cpuinfo, riscv_cpuinfo);
 unsigned long riscv_cached_mvendorid(unsigned int cpu_id)
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 3ed2359eae35..cd156adbeb66 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -490,6 +490,8 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 	struct acpi_table_header *rhct;
 	acpi_status status;
 	unsigned int cpu;
+	u64 boot_vendorid;
+	u64 boot_archid;
 	if (!acpi_disabled) {
 		status = acpi_get_table(ACPI_SIG_RHCT, 0, &rhct);
@@ -497,9 +499,14 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 			return;
 	}
+	boot_vendorid = riscv_get_mvendorid();
+	boot_archid = riscv_get_marchid();
+
 	for_each_possible_cpu(cpu) {
 		struct riscv_isainfo *isainfo = &hart_isa[cpu];
 		unsigned long this_hwcap = 0;
+		u64 this_vendorid;
+		u64 this_archid;
 		if (acpi_disabled) {
 			node = of_cpu_device_node_get(cpu);
@@ -514,12 +521,23 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 				pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
 				continue;
 			}
+
+			if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0) {
+				pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
+				this_vendorid = boot_vendorid;
+			}
+
+			if (of_property_read_u64(node, "riscv,archid", &this_archid) < 0) {
+				pr_warn("Unable to find \"riscv,archid\" devicetree entry, using boot hart marchid instead\n");
+				this_archid = boot_archid;
+			}
 		} else {
 			rc = acpi_get_riscv_isa(rhct, cpu, &isa);
 			if (rc < 0) {
 				pr_warn("Unable to get ISA for the hart - %d\n", cpu);
 				continue;
 			}
+			this_vendorid = boot_vendorid;
+			this_archid = boot_archid;
 		}
 		riscv_parse_isa_string(&this_hwcap, isainfo, isa2hwcap, isa);
@@ -544,8 +562,8 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 		 * CPU cores with the ratified spec will contain non-zero
 		 * marchid.
 		 */
-		if (acpi_disabled && riscv_cached_mvendorid(cpu) == THEAD_VENDOR_ID &&
-		    riscv_cached_marchid(cpu) == 0x0) {
+		if (acpi_disabled && this_vendorid == THEAD_VENDOR_ID &&
+		    this_archid == 0x0) {
 			this_hwcap &= ~isa2hwcap[RISCV_ISA_EXT_v];
 			clear_bit(RISCV_ISA_EXT_v, isainfo->isa);
 		}
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted which happens after the DT parsing. Use the vendorid/archid values from the DT if available or assume all harts have the same values as the boot hart as a fallback.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped. If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid repeating these ecalls on all systems.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
@@ -514,12 +521,23 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 				pr_warn("Unable to find \"riscv,isa\" devicetree entry\n");
 				continue;
 			}
+			if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0) {
+				pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
This should 100% not be a warning, it's not a required property in the binding.
Cheers, Conor.
+				this_vendorid = boot_vendorid;
+			}
On Fri, Apr 12, 2024 at 3:26 AM Conor Dooley <conor.dooley@microchip.com> wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted which happens after the DT parsing. Use the vendorid/archid values from the DT if available or assume all harts have the same values as the boot hart as a fallback.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped. If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid repeating these ecalls on all systems.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
It's possible I'm misinterpreting, but is the suggestion to apply the marchid/mvendorid we find on the boot CPU and assume it's the same on all other CPUs? Since we're reporting the marchid/mvendorid/mimpid to usermode in a per-hart way, it would be better IMO if we really do query marchid/mvendorid/mimpid on each hart. The problem with applying the boot CPU's value everywhere is if we're ever wrong in the future (ie that assumption doesn't hold on some machine), we'll only find out about it after the fact. Since we reported the wrong information to usermode via hwprobe, it'll be an ugly userspace ABI issue to clean up.
-Evan
On Fri, Apr 12, 2024 at 10:04:17AM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 3:26 AM Conor Dooley <conor.dooley@microchip.com> wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted which happens after the DT parsing. Use the vendorid/archid values from the DT if available or assume all harts have the same values as the boot hart as a fallback.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped. If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid repeating these ecalls on all systems.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
It's possible I'm misinterpreting, but is the suggestion to apply the marchid/mvendorid we find on the boot CPU and assume it's the same on all other CPUs? Since we're reporting the marchid/mvendorid/mimpid to usermode in a per-hart way, it would be better IMO if we really do query marchid/mvendorid/mimpid on each hart. The problem with applying the boot CPU's value everywhere is if we're ever wrong in the future (ie that assumption doesn't hold on some machine), we'll only find out about it after the fact. Since we reported the wrong information to usermode via hwprobe, it'll be an ugly userspace ABI issue to clean up.
You're misinterpreting, we do get the values on all individually as they're brought online. This is only used by the code that throws a bone to people with crappy vendor dtbs that put "v" in riscv,isa when they support the unratified version.
On Fri, Apr 12, 2024 at 07:38:04PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:04:17AM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 3:26 AM Conor Dooley <conor.dooley@microchip.com> wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted which happens after the DT parsing. Use the vendorid/archid values from the DT if available or assume all harts have the same values as the boot hart as a fallback.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped. If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid repeating these ecalls on all systems.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
It's possible I'm misinterpreting, but is the suggestion to apply the marchid/mvendorid we find on the boot CPU and assume it's the same on all other CPUs? Since we're reporting the marchid/mvendorid/mimpid to usermode in a per-hart way, it would be better IMO if we really do query marchid/mvendorid/mimpid on each hart. The problem with applying the boot CPU's value everywhere is if we're ever wrong in the future (ie that assumption doesn't hold on some machine), we'll only find out about it after the fact. Since we reported the wrong information to usermode via hwprobe, it'll be an ugly userspace ABI issue to clean up.
You're misinterpreting, we do get the values on all individually as they're brought online. This is only used by the code that throws a bone to people with crappy vendor dtbs that put "v" in riscv,isa when they support the unratified version.
Not quite, the alternatives are patched before the other cpus are booted, so the alternatives will have false positives resulting in broken kernels.
- Charlie
On Fri, Apr 12, 2024 at 11:46:21AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:38:04PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:04:17AM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 3:26 AM Conor Dooley <conor.dooley@microchip.com> wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted which happens after the DT parsing. Use the vendorid/archid values from the DT if available or assume all harts have the same values as the boot hart as a fallback.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped. If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid repeating these ecalls on all systems.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
It's possible I'm misinterpreting, but is the suggestion to apply the marchid/mvendorid we find on the boot CPU and assume it's the same on all other CPUs? Since we're reporting the marchid/mvendorid/mimpid to usermode in a per-hart way, it would be better IMO if we really do query marchid/mvendorid/mimpid on each hart. The problem with applying the boot CPU's value everywhere is if we're ever wrong in the future (ie that assumption doesn't hold on some machine), we'll only find out about it after the fact. Since we reported the wrong information to usermode via hwprobe, it'll be an ugly userspace ABI issue to clean up.
You're misinterpreting, we do get the values on all individually as they're brought online. This is only used by the code that throws a bone to people with crappy vendor dtbs that put "v" in riscv,isa when they support the unratified version.
Not quite,
Remember that this patch stands in isolation and the justification given in your commit message does not mention anything other than fixing my broken patch.
the alternatives are patched before the other cpus are booted, so the alternatives will have false positives resulting in broken kernels.
Over-eagerly disabling vector isn't going to break any kernels and really should not break a behaving userspace either. Under-eagerly disabling it (in a way that this approach could solve) is only going to happen on a system where the boot hart has non-zero values and claims support for v but a non-boot hart has zero values and claims support for v but actually doesn't implement the ratified version. If the boot hart doesn't support v, then we currently disable the extension as only homogeneous stuff is supported by Linux. If the boot hart claims support for "v" but doesn't actually implement the ratified version neither the intent of my original patch nor this fix for it are going to help avoid a broken kernel.
I think we do have a problem if the boot cpu having some erratum leads to the kernel being patched in a way that does not work for the other CPUs on the system, but I don't think this series addresses that sort of issue at all as you'd be adding code to the pi section if you were fixing it. I also don't think we should be making pre-emptive changes to the errata patching code either to solve that sort of problem, until an SoC shows up where things don't work.
Cheers, Conor.
On Fri, Apr 12, 2024 at 08:26:12PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 11:46:21AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:38:04PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:04:17AM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 3:26 AM Conor Dooley <conor.dooley@microchip.com> wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted which happens after the DT parsing. Use the vendorid/archid values from the DT if available or assume all harts have the same values as the boot hart as a fallback.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped. If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid repeating these ecalls on all systems.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
It's possible I'm misinterpreting, but is the suggestion to apply the marchid/mvendorid we find on the boot CPU and assume it's the same on all other CPUs? Since we're reporting the marchid/mvendorid/mimpid to usermode in a per-hart way, it would be better IMO if we really do query marchid/mvendorid/mimpid on each hart. The problem with applying the boot CPU's value everywhere is if we're ever wrong in the future (ie that assumption doesn't hold on some machine), we'll only find out about it after the fact. Since we reported the wrong information to usermode via hwprobe, it'll be an ugly userspace ABI issue to clean up.
You're misinterpreting, we do get the values on all individually as they're brought online. This is only used by the code that throws a bone to people with crappy vendor dtbs that put "v" in riscv,isa when they support the unratified version.
Not quite,
Remember that this patch stands in isolation and the justification given in your commit message does not mention anything other than fixing my broken patch.
Fixing the patch in the simplest sense would be to eagerly get the mvendorid/marchid without using the cached version. But this assumes that all harts have the same mvendorid/marchid. This is not something that I am strongly attached to. If it truly is detrimental to Linux to allow a user a way to specify different vendorids for different harts then I will remove that code.
- Charlie
the alternatives are patched before the other cpus are booted, so the alternatives will have false positives resulting in broken kernels.
Over-eagerly disabling vector isn't going to break any kernels and really should not break a behaving userspace either. Under-eagerly disabling it (in a way that this approach could solve) is only going to happen on a system where the boot hart has non-zero values and claims support for v but a non-boot hart has zero values and claims support for v but actually doesn't implement the ratified version. If the boot hart doesn't support v, then we currently disable the extension as only homogeneous stuff is supported by Linux. If the boot hart claims support for "v" but doesn't actually implement the ratified version neither the intent of my original patch nor this fix for it are going to help avoid a broken kernel.
I think we do have a problem if the boot cpu having some erratum leads to the kernel being patched in a way that does not work for the other CPUs on the system, but I don't think this series addresses that sort of issue at all as you'd be adding code to the pi section if you were fixing it. I also don't think we should be making pre-emptive changes to the errata patching code either to solve that sort of problem, until an SoC shows up where things don't work.

Cheers, Conor.
On Fri, Apr 12, 2024 at 01:34:43PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 08:26:12PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 11:46:21AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:38:04PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:04:17AM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 3:26 AM Conor Dooley <conor.dooley@microchip.com> wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:

The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted which happens after the DT parsing. Use the vendorid/archid values from the DT if available or assume all harts have the same values as the boot hart as a fallback.

Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped. If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid repeating these ecalls on all systems.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
It's possible I'm misinterpreting, but is the suggestion to apply the marchid/mvendorid we find on the boot CPU and assume it's the same on all other CPUs? Since we're reporting the marchid/mvendorid/mimpid to usermode in a per-hart way, it would be better IMO if we really do query marchid/mvendorid/mimpid on each hart. The problem with applying the boot CPU's value everywhere is if we're ever wrong in the future (ie that assumption doesn't hold on some machine), we'll only find out about it after the fact. Since we reported the wrong information to usermode via hwprobe, it'll be an ugly userspace ABI issue to clean up.
You're misinterpreting, we do get the values on all individually as they're brought online. This is only used by the code that throws a bone to people with crappy vendor dtbs that put "v" in riscv,isa when they support the unratified version.
Not quite,
Remember that this patch stands in isolation and the justification given in your commit message does not mention anything other than fixing my broken patch.
Fixing the patch in the simplest sense would be to eagerly get the mvendorid/marchid without using the cached version. But this assumes that all harts have the same mvendorid/marchid. This is not something that I am strongly attached to. If it truly is detrimental to Linux to allow a user a way to specify different vendorids for different harts then I will remove that code.
I think that the simple fix is all that we need to do here, perhaps updating the comment to point out how naive we are being.
The alternatives are patched before the other CPUs are booted, so the alternatives will have false positives, resulting in broken kernels.
Over-eagerly disabling vector isn't going to break any kernels and really should not break a behaving userspace either. Under-eagerly disabling it (in a way that this approach could solve) is only going to happen on a system where the boot hart has non-zero values and claims support for v but a non-boot hart has zero values and claims support for v but actually doesn't implement the ratified version. If the boot hart doesn't support v, then we currently disable the extension, as only homogeneous stuff is supported by Linux. If the boot hart claims support for "v" but doesn't actually implement the ratified version, neither the intent of my original patch nor this fix for it is going to help avoid a broken kernel.
I think we do have a problem if the boot cpu having some erratum leads to the kernel being patched in a way that does not work for the other CPUs on the system, but I don't think this series addresses that sort of issue at all as you'd be adding code to the pi section if you were fixing it. I also don't think we should be making pre-emptive changes to the errata patching code either to solve that sort of problem, until an SoC shows up where things don't work. Cheers, Conor.
On Fri, Apr 12, 2024 at 11:25:47AM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted, which happens after the DT parsing. Use the vendorid/archid values from the DT if available, or fall back to assuming all harts have the same values as the boot hart.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped.
Yes, the DTs those systems shipped with will not have the property, so they will fall back on the boot hart. The addition of the DT properties allows future heterogeneous systems to function.
If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), to avoid repeating these ecalls on all systems.
Yeah, that is a minor optimization that I can apply.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
They have an mvendorid of 0x5b7.
This is already falling back on the boot CPU, but that is not a solution that scales. Even though all systems currently have homogeneous marchid/mvendorid values, I am hesitant to assert that all systems are homogeneous without providing an option to override this. The overhead is looking for a field in the DT, which does not seem impactful enough to prevent the addition of this option.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
@@ -514,12 +521,23 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap) pr_warn("Unable to find \"riscv,isa\" devicetree entry\n"); continue; }
if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0) {
pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
This should 100% not be a warning, it's not a required property in the binding.
Yes definitely, thank you.
- Charlie
Cheers, Conor.
this_vendorid = boot_vendorid;
}
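Taking the review comment into account, the hunk might end up looking like the sketch below. `of_property_read_u64` is stubbed so the example is self-contained, and `resolve_vendorid` is an invented wrapper name, not the kernel's actual code:

```c
#include <assert.h>
#include <stdint.h>

/* Stub of the devicetree helper: returns 0 on success, negative if
 * the property is absent. Here it is always absent, modelling the
 * vendor dtbs under discussion, which never carry "riscv,vendorid". */
static int of_property_read_u64(const void *node, const char *prop,
				uint64_t *out)
{
	(void)node; (void)prop; (void)out;
	return -22; /* property not present */
}

/* Sketch of the fixed hunk: the property is optional in the binding,
 * so its absence is handled silently by falling back to the boot
 * hart's cached mvendorid, with no pr_warn(). */
static uint64_t resolve_vendorid(const void *node, uint64_t boot_vendorid)
{
	uint64_t this_vendorid;

	if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0)
		this_vendorid = boot_vendorid;

	return this_vendorid;
}
```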
On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 11:25:47AM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted, which happens after the DT parsing. Use the vendorid/archid values from the DT if available, or fall back to assuming all harts have the same values as the boot hart.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped.
Yes, the DTs those systems shipped with will not have the property, so they will fall back on the boot hart. The addition of the DT properties allows future heterogeneous systems to function.
I think you've kinda missed the point about what the original code was actually doing here. Really the kernel should not be doing validation of the devicetree at all, but I was trying to avoid people shooting themselves in the foot by doing something simple that would work for their (incorrect) vendor dtbs. Future heterogeneous systems should be using riscv,isa-extensions, which is totally unaffected by this codepath (and setting actual values for mimpid/marchid too, ideally!).
If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), to avoid repeating these ecalls on all systems.
Yeah, that is a minor optimization that I can apply.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
They have an mvendorid of 0x5b7.
That was a braino, clearly I should have typed "mimpid".
This is already falling back on the boot CPU, but that is not a solution that scales. Even though all systems currently have homogeneous marchid/mvendorid values, I am hesitant to assert that all systems are homogeneous without providing an option to override this.
There already is an option. Use the non-deprecated property in your new system for describing what extensions you support. We don't need to add any more properties (for now at least).
The overhead is looking for a field in the DT, which does not seem impactful enough to prevent the addition of this option.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
@@ -514,12 +521,23 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap) pr_warn("Unable to find \"riscv,isa\" devicetree entry\n"); continue; }
if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0) {
pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
This should 100% not be a warning, it's not a required property in the binding.
Yes definitely, thank you.
- Charlie
Cheers, Conor.
this_vendorid = boot_vendorid;
}
On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 11:25:47AM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted, which happens after the DT parsing. Use the vendorid/archid values from the DT if available, or fall back to assuming all harts have the same values as the boot hart.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped.
Yes, the DTs those systems shipped with will not have the property, so they will fall back on the boot hart. The addition of the DT properties allows future heterogeneous systems to function.
I think you've kinda missed the point about what the original code was actually doing here. Really the kernel should not be doing validation of the devicetree at all, but I was trying to avoid people shooting themselves in the foot by doing something simple that would work for their (incorrect) vendor dtbs. Future heterogeneous systems should be using riscv,isa-extensions, which is totally unaffected by this codepath (and setting actual values for mimpid/marchid too, ideally!).
I am on the same page with you about that.
If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), to avoid repeating these ecalls on all systems.
Yeah, that is a minor optimization that I can apply.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
They have an mvendorid of 0x5b7.
That was a braino, clearly I should have typed "mimpid".
This is already falling back on the boot CPU, but that is not a solution that scales. Even though all systems currently have homogeneous marchid/mvendorid values, I am hesitant to assert that all systems are homogeneous without providing an option to override this.
There already is an option. Use the non-deprecated property in your new system for describing what extensions you support. We don't need to add any more properties (for now at least).
The issue is that it is not possible to know which vendor extensions are associated with which vendor. That would require a global namespace where each extension can be looked up in a table. I have opted for a vendor-specific namespace so that vendors don't have to worry about stepping on other vendors' toes (or the other way around). To support that, the vendorid of the hart needs to be known first.
I know a rebuttal here is that this is taking away from the point of the original patch. I can split this patch up if so. The goal here is to allow vendor extensions to play nicely with the rest of the system. There are two uses of the mvendorid DT value: this fix, and the patch that adds vendor extension support. I felt that it was applicable to wrap the mvendorid DT value into this patch, but if you would prefer that it live separately from this fix, that is fine too.
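The vendor-namespace design being argued for here can be sketched as a per-vendor lookup table keyed by mvendorid. All struct and function names below are invented for illustration and do not match the series' actual code; only `xtheadvector` and the T-Head mvendorid 0x5b7 come from the thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vendor extension tables: each vendor owns its own
 * key space, so T-Head's key 0 and another vendor's key 0 can name
 * different extensions without a global registry. */
struct vendor_ext {
	uint32_t key;           /* vendor-local extension id */
	const char *name;
};

struct vendor_ns {
	uint64_t mvendorid;     /* selects the namespace */
	const struct vendor_ext *exts;
	size_t nr_exts;
};

static const struct vendor_ext thead_exts[] = {
	{ 0x0, "xtheadvector" },
};

static const struct vendor_ns vendor_table[] = {
	{ 0x5b7, thead_exts, 1 },   /* T-Head */
};

/* Resolve a vendor-local key; the hart's mvendorid must be known
 * first, which is why the DT/boot-hart fallback matters. */
static const char *lookup_vendor_ext(uint64_t mvendorid, uint32_t key)
{
	size_t i, j;

	for (i = 0; i < sizeof(vendor_table) / sizeof(vendor_table[0]); i++) {
		if (vendor_table[i].mvendorid != mvendorid)
			continue;
		for (j = 0; j < vendor_table[i].nr_exts; j++)
			if (vendor_table[i].exts[j].key == key)
				return vendor_table[i].exts[j].name;
	}
	return NULL;
}
```

A lookup with the wrong mvendorid finds nothing, which is exactly the coupling Conor objects to below.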
- Charlie
The overhead is looking for a field in the DT, which does not seem impactful enough to prevent the addition of this option.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
@@ -514,12 +521,23 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap) pr_warn("Unable to find \"riscv,isa\" devicetree entry\n"); continue; }
if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0) {
pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
This should 100% not be a warning, it's not a required property in the binding.
Yes definitely, thank you.
- Charlie
Cheers, Conor.
this_vendorid = boot_vendorid;
}
On Fri, Apr 12, 2024 at 01:48:46PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 11:25:47AM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted, which happens after the DT parsing. Use the vendorid/archid values from the DT if available, or fall back to assuming all harts have the same values as the boot hart.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped.
Yes, the DTs those systems shipped with will not have the property, so they will fall back on the boot hart. The addition of the DT properties allows future heterogeneous systems to function.
I think you've kinda missed the point about what the original code was actually doing here. Really the kernel should not be doing validation of the devicetree at all, but I was trying to avoid people shooting themselves in the foot by doing something simple that would work for their (incorrect) vendor dtbs. Future heterogeneous systems should be using riscv,isa-extensions, which is totally unaffected by this codepath (and setting actual values for mimpid/marchid too, ideally!).
I am on the same page with you about that.
If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), to avoid repeating these ecalls on all systems.
Yeah, that is a minor optimization that I can apply.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
They have an mvendorid of 0x5b7.
That was a braino, clearly I should have typed "mimpid".
This is already falling back on the boot CPU, but that is not a solution that scales. Even though all systems currently have homogeneous marchid/mvendorid values, I am hesitant to assert that all systems are homogeneous without providing an option to override this.
There already is an option. Use the non-deprecated property in your new system for describing what extensions you support. We don't need to add any more properties (for now at least).
The issue is that it is not possible to know which vendor extensions are associated with which vendor. That would require a global namespace where each extension can be looked up in a table. I have opted for a vendor-specific namespace so that vendors don't have to worry about stepping on other vendors' toes (or the other way around). To support that, the vendorid of the hart needs to be known first.
Nah, I think you're mixing up something like hwprobe and having namespaces there with needing namespacing on the devicetree probing side too. You don't need any vendor namespacing; it's perfectly fine (IMO) for a vendor to implement someone else's extension, and I think we should allow probing any vendor's extension on any CPU.
I know a rebuttal here is that this is taking away from the point of the original patch. I can split this patch up if so. The goal here is to allow vendor extensions to play nicely with the rest of the system. There are two uses of the mvendorid DT value: this fix, and the patch that adds vendor extension support. I felt that it was applicable to wrap the mvendorid DT value into this patch, but if you would prefer that it live separately from this fix, that is fine too.
- Charlie
The overhead is looking for a field in the DT, which does not seem impactful enough to prevent the addition of this option.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
@@ -514,12 +521,23 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap) pr_warn("Unable to find \"riscv,isa\" devicetree entry\n"); continue; }
if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0) {
pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
This should 100% not be a warning, it's not a required property in the binding.
Yes definitely, thank you.
- Charlie
Cheers, Conor.
this_vendorid = boot_vendorid;
}
On Fri, Apr 12, 2024 at 10:27:47PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 01:48:46PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 11:25:47AM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
The riscv_cpuinfo struct that contains mvendorid and marchid is not populated until all harts are booted, which happens after the DT parsing. Use the vendorid/archid values from the DT if available, or fall back to assuming all harts have the same values as the boot hart.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff from dt, then I don't think we should add it. None of the devicetrees that the commit you're fixing here addresses will have these properties and if they did have them, they'd then also be new enough to hopefully not have "v" either - the issue is they're using whatever crap the vendor shipped.
Yes, the DTs those systems shipped with will not have the property, so they will fall back on the boot hart. The addition of the DT properties allows future heterogeneous systems to function.
I think you've kinda missed the point about what the original code was actually doing here. Really the kernel should not be doing validation of the devicetree at all, but I was trying to avoid people shooting themselves in the foot by doing something simple that would work for their (incorrect) vendor dtbs. Future heterogeneous systems should be using riscv,isa-extensions, which is totally unaffected by this codepath (and setting actual values for mimpid/marchid too, ideally!).
I am on the same page with you about that.
If we're gonna get the information from DT, we already have something that we can look at to perform the disable as the cpu compatibles give us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid, since we already have to look at it in riscv_fill_cpu_mfr_info(), to avoid repeating these ecalls on all systems.
Yeah, that is a minor optimization that I can apply.
Perhaps for now we could just look at the boot CPU alone? To my knowledge the systems that this targets all have homogeneous marchid/mvendorid values of 0x0.
They have an mvendorid of 0x5b7.
That was a braino, clearly I should have typed "mimpid".
This is already falling back on the boot CPU, but that is not a solution that scales. Even though all systems currently have homogeneous marchid/mvendorid values, I am hesitant to assert that all systems are homogeneous without providing an option to override this.
There already is an option. Use the non-deprecated property in your new system for describing what extensions you support. We don't need to add any more properties (for now at least).
The issue is that it is not possible to know which vendor extensions are associated with which vendor. That would require a global namespace where each extension can be looked up in a table. I have opted for a vendor-specific namespace so that vendors don't have to worry about stepping on other vendors' toes (or the other way around). To support that, the vendorid of the hart needs to be known first.
Nah, I think you're mixing up something like hwprobe and having namespaces there with needing namespacing on the devicetree probing side too. You don't need any vendor namespacing; it's perfectly fine (IMO) for a vendor to implement someone else's extension, and I think we should allow probing any vendor's extension on any CPU.
I am not mixing it up. Sure, a vendor can implement somebody else's extension; they just need to add it to their namespace too.
- Charlie
I know a rebuttal here is that this is taking away from the point of the original patch. I can split this patch up if so. The goal here is to allow vendor extensions to play nicely with the rest of the system. There are two uses of the mvendorid DT value: this fix, and the patch that adds vendor extension support. I felt that it was applicable to wrap the mvendorid DT value into this patch, but if you would prefer that it live separately from this fix, that is fine too.
- Charlie
The overhead is looking for a field in the DT, which does not seem impactful enough to prevent the addition of this option.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
@@ -514,12 +521,23 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap) pr_warn("Unable to find \"riscv,isa\" devicetree entry\n"); continue; }
if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0) {
pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
This should 100% not be a warning, it's not a required property in the binding.
Yes definitely, thank you.
- Charlie
Cheers, Conor.
this_vendorid = boot_vendorid;
}
On Fri, Apr 12, 2024 at 02:31:42PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 10:27:47PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 01:48:46PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
This is already falling back on the boot CPU, but that is not a solution that scales. Even though all systems currently have homogeneous marchid/mvendorid values, I am hesitant to assert that all systems are homogeneous without providing an option to override this.
There already is an option. Use the non-deprecated property in your new system for describing what extensions you support. We don't need to add any more properties (for now at least).
The issue is that it is not possible to know which vendor extensions are associated with which vendor. That would require a global namespace where each extension can be looked up in a table. I have opted for a vendor-specific namespace so that vendors don't have to worry about stepping on other vendors' toes (or the other way around). To support that, the vendorid of the hart needs to be known first.
Nah, I think you're mixing up something like hwprobe and having namespaces there with needing namespacing on the devicetree probing side too. You don't need any vendor namespacing; it's perfectly fine (IMO) for a vendor to implement someone else's extension, and I think we should allow probing any vendor's extension on any CPU.
I am not mixing it up. Sure, a vendor can implement somebody else's extension; they just need to add it to their namespace too.
I didn't mean that you were mixing up how your implementation worked; my point was that you're mixing up the hwprobe stuff, which may need namespacing for $a{b,p}i_reason, and probing from DT, which does not. I don't think that the kernel should need to be changed at all if someone shows up and implements another vendor's extension - we already have far too many kernel changes required to display support for extensions and I don't welcome potential for more.
Another thing I just thought of was systems where the SoC vendor implements some extension that gets communicated in the ISA string but is not the vendor reported in mvendorid by their various CPUs. I wouldn't want to see several different entries in structs (or several different hwprobe keys, but that's another story) for this situation, because you're only allowing probing of what's in the struct matching the vendorid.
On Sat, Apr 13, 2024 at 12:40:26AM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 02:31:42PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 10:27:47PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 01:48:46PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
This is already falling back on the boot CPU, but that is not a solution that scales. Even though all systems currently have homogeneous marchid/mvendorid values, I am hesitant to assert that all systems are homogeneous without providing an option to override this.
There already is an option. Use the non-deprecated property in your new system for describing what extensions you support. We don't need to add any more properties (for now at least).
The issue is that it is not possible to know which vendor extensions are associated with which vendor. That would require a global namespace where each extension can be looked up in a table. I have opted for a vendor-specific namespace so that vendors don't have to worry about stepping on other vendors' toes (or the other way around). To support that, the vendorid of the hart needs to be known first.
Nah, I think you're mixing up something like hwprobe and having namespaces there with needing namespacing on the devicetree probing side too. You don't need any vendor namespacing; it's perfectly fine (IMO) for a vendor to implement someone else's extension, and I think we should allow probing any vendor's extension on any CPU.
I am not mixing it up. Sure, a vendor can implement somebody else's extension; they just need to add it to their namespace too.
I didn't mean that you were mixing up how your implementation worked; my point was that you're mixing up the hwprobe stuff, which may need namespacing for $a{b,p}i_reason, and probing from DT, which does not. I don't think that the kernel should need to be changed at all if someone shows up and implements another vendor's extension - we already have far too many kernel changes required to display support for extensions and I don't welcome potential for more.
Yes, I understand where you are coming from. We do not want it to require very many changes to add an extension. With this framework, adding a vendor extension requires the same number of changes as adding a standard extension. There is the upfront cost of creating the struct for a vendor's first extension, but after that the extension only needs to be added to the associated vendor's file (I am extracting this out to a vendor file in the next version). This is also a very easy task, since the fields from a different vendor can be copied and adapted.
Another thing I just thought of was systems where the SoC vendor implements some extension that gets communicated in the ISA string but is not the vendor reported in mvendorid by their various CPUs. I wouldn't want to see several different entries in structs (or several different hwprobe keys, but that's another story) for this situation, because you're only allowing probing of what's in the struct matching the vendorid.
Since the isa string is a per-hart field, the vendor associated with the hart will be used.
- Charlie
On Mon, Apr 15, 2024 at 08:34:05PM -0700, Charlie Jenkins wrote:
On Sat, Apr 13, 2024 at 12:40:26AM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 02:31:42PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 10:27:47PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 01:48:46PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
> This is already falling back on the boot CPU, but that is not a solution that scales. Even though all systems currently have homogeneous marchid/mvendorid values, I am hesitant to assert that all systems are homogeneous without providing an option to override this.
There already is an option. Use the non-deprecated property in your new system for describing what extensions you support. We don't need to add any more properties (for now at least).
The issue is that it is not possible to know which vendor extensions are associated with which vendor. That would require a global namespace where each extension can be looked up in a table. I have opted for a vendor-specific namespace so that vendors don't have to worry about stepping on other vendors' toes (or the other way around). To support that, the vendorid of the hart needs to be known first.
Nah, I think you're mixing up something like hwprobe and having namespaces there with needing namespacing on the devicetree probing side too. You don't need any vendor namespacing; it's perfectly fine (IMO) for a vendor to implement someone else's extension, and I think we should allow probing any vendor's extension on any CPU.
I am not mixing it up. Sure, a vendor can implement somebody else's extension; they just need to add it to their namespace too.
I didn't mean that you were mixing up how your implementation worked; my point was that you're mixing up the hwprobe stuff, which may need namespacing for $a{b,p}i_reason, and probing from DT, which does not. I don't think that the kernel should need to be changed at all if someone shows up and implements another vendor's extension - we already have far too many kernel changes required to display support for extensions and I don't welcome potential for more.
Yes, I understand where you are coming from. We do not want it to require very many changes to add an extension. With this framework, adding a vendor extension requires the same number of changes as adding a standard extension.
No, it is actually subtly different. Even if the kernel already supports the extension, it needs to be patched for each vendor.
There is the upfront cost of creating the struct for the first vendor extension from a vendor, but after that the extension only needs to be added to the associated vendor's file (I am extracting this out to a vendor file in the next version). This is also a very easy task since the fields from a different vendor can be copied and adapted.
Another thing I just thought of was systems where the SoC vendor implements some extension that gets communicated in the ISA string but is not the vendor reported in mvendorid by their various CPUs. I wouldn't want to see several different entries in structs (or several different hwprobe keys, but that's another story) for this situation, because you're only allowing probing of what's in the struct matching the vendorid.
Since the isa string is a per-hart field, the vendor associated with the hart will be used.
I don't know if you just didn't really read what I said or didn't understand it, but this response doesn't address my comment. Consider SoC vendor S buys CPUs from vendors A & B and asks both of them to implement Xsjam. The CPUs have the vendorid of either A or B, depending on who made it. This scenario should not result in two different hwprobe keys nor two different in-kernel riscv_has_vendor_ext() checks to see if the extension is supported. *If* the extension is vendor namespaced, it should be namespaced to the SoC vendor whose extension it is, not the individual CPU vendors that implemented it.
Additionally, consider that CPUs from both vendors are in the same SoC and all CPUs support Xsjam. Linux only supports homogeneous extensions, so we should be able to detect that all CPUs support the extension and use it in a driver etc., but that's either not going to work (or be difficult to orchestrate) with different mappings per CPU vendor.
I saw your v2 cover letter, in which you said: "Only patch vendor extension if all harts are associated with the same vendor. This is the best chance the kernel has for working properly if there are multiple vendors."
I don't think that level of paranoia is required: if firmware tells us that an extension is supported, then we can trust that those extensions have been implemented correctly. If the fear of implementation bugs is what is driving the namespacing that you've gone for, I don't think that it is required and we can simplify things, with the per-vendor structs being the vendor of the extension (so SoC vendor S in my example), not A and B who are the vendors of the CPU IP.
Thanks, Conor.
On Tue, Apr 16, 2024 at 08:36:33AM +0100, Conor Dooley wrote:
On Mon, Apr 15, 2024 at 08:34:05PM -0700, Charlie Jenkins wrote:
On Sat, Apr 13, 2024 at 12:40:26AM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 02:31:42PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 10:27:47PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 01:48:46PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote: > On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
> > This is already falling back on the boot CPU, but that is not a solution > > that scales. Even though all systems currently have homogenous > > marchid/mvendorid I am hesitant to assert that all systems are > > homogenous without providing an option to override this. > > There already is an option. Use the non-deprecated property in your > new system for describing what extensions you support. We don't need to > add any more properties (for now at least).
The issue is that it is not possible to know which vendor extensions are associated with a vendor. That requires a global namespace where each extension can be looked up in a table. I have opted to have a vendor-specific namespace so that vendors don't have to worry about stepping on other vendors' toes (or the other way around). In order to support that, the vendorid of the hart needs to be known beforehand.
Nah, I think you're mixing up something like hwprobe and having namespaces there with needing namespacing on the devicetree probing side too. You don't need any vendor namespacing, it's perfectly fine (IMO) for a vendor to implement someone else's extension and I think we should allow probing any vendor's extension on any CPU.
I am not mixing it up. Sure a vendor can implement somebody else's extension, they just need to add it to their namespace too.
I didn't mean that you were mixing up how your implementation worked, my point was that you're mixing up the hwprobe stuff which may need namespacing for $a{b,p}i_reason and probing from DT which does not. I don't think that the kernel should need to be changed at all if someone shows up and implements another vendor's extension - we already have far too many kernel changes required to display support for extensions and I don't welcome potential for more.
Yes, I understand where you are coming from. We do not want it to require very many changes to add an extension. With this framework, there are the same number of changes to add a vendor extension as there are to add a standard extension.
No, it is actually subtly different. Even if the kernel already supports the extension, it needs to be patched for each vendor
There is the upfront cost of creating the struct for the first vendor extension from a vendor, but after that the extension only needs to be added to the associated vendor's file (I am extracting this out to a vendor file in the next version). This is also a very easy task since the fields from a different vendor can be copied and adapted.
Another thing I just thought of was systems where the SoC vendor implements some extension that gets communicated in the ISA string but is not the vendor in mvendorid in their various CPUs. I wouldn't want to see several different entries in structs (or several different hwprobe keys, but that's another story) for this situation because you're only allowing probing what's in the struct matching the vendorid.
Since the isa string is a per-hart field, the vendor associated with the hart will be used.
I don't know if you just didn't really read what I said or didn't understand it, but this response doesn't address my comment.
I read what you said! This question seemed to me as another variant of "what happens when one vendor implements an extension from a different vendor", and since we already discussed that I was trying to figure out what you were actually asking.
Consider SoC vendor S buys CPUs from vendors A & B and asks both of them to implement Xsjam. The CPUs have the vendorid of either A or B, depending on who made it. This scenario should not result in two different hwprobe keys nor two different in-kernel riscv_has_vendor_ext() checks to see if the extension is supported. *If* the extension is vendor namespaced, it should be namespaced to the SoC vendor whose extension it is, not the individual CPU vendors that implemented it.
Additionally, consider that CPUs from both vendors are in the same SoC and all CPUs support Xsjam. Linux only supports homogeneous extensions so we should be able to detect that all CPUs support the extension and use it in a driver etc, but that's either not going to work (or be difficult to orchestrate) with different mappings per CPU vendor. I saw your v2 cover letter, in which you said: Only patch vendor extension if all harts are associated with the same vendor. This is the best chance the kernel has for working properly if there are multiple vendors. I don't think that level of paranoia is required: if firmware tells us that an extension is supported, then we can trust that those extensions have been implemented correctly. If the fear of implementation bugs is what is driving the namespacing that you've gone for, I don't think that it is required and we can simplify things, with the per-vendor structs being the vendor of the extension (so SoC vendor S in my example), not A and B who are the vendors of the CPU IP.
Thanks, Conor.
Thank you for expanding upon this idea further. This solution of indexing the extensions based on the vendor who proposed them does make a lot of sense. There are some key differences here of note. When vendors are able to mix vendor extensions, defining a bitmask that contains all of the vendor extensions gets a bit messier. I see two possible solutions.
1. Vendor keys cannot overlap between vendors. A set bit in the bitmask is associated with exactly one extension.
2. Vendor keys can overlap between vendors. There is a vendor bitmask per vendor. When setting/checking a vendor extension, first index into the vendor extension bitmask with the vendor associated with the extension and then with the key of the vendor extension.
A third option would be to use the standard extension framework. This causes the standard extension list to become populated with extensions that most harts will never implement so I am opposed to that.
This problem carries over into hwprobe since the schemes proposed by Evan and I both rely on the mvendorid of harts associated with the cpumask. To have this level of support in hwprobe for SoCs with a mix of vendors but the same extensions I again see two options:
1. Vendor keys cannot overlap between vendors. A set bit in the bitmask is associated with exactly one extension. This bitmask would be returned by the vendor extension hwprobe key.
2. Vendor keys can overlap between vendors. There is an hwprobe key per vendor. Automatic resolution of the vendor doesn't work because the vendor-specific feature being requested (extensions in this case) may be of a vendor that is different from the hart's vendor, in other words there are two variables necessary: the vendor and a way to ask hwprobe for a list of the vendor extensions. With hwprobe there is only the "key" that can be used to encode these variables simultaneously. We could have something like a HWPROBE_THEAD_EXT_0 key that would return all thead vendor extensions supported by the harts corresponding to the cpumask.
I didn't list the option that we shove all of the vendor extensions into the same fields that are used for standard extensions because that will fill up the standard extension probing with all of the vendor extensions that most SoCs will not care about.
The second option for hwprobe is nice because there are "only" 64 bits in the returned bitmask: if there end up being a lot of vendor extensions that need to be exposed, a single shared bitmask would leave a lot of unused bits on most systems.
For the internal kernel structures it matters less (or doesn't matter at all) since it's not exposed to userspace and it can always change. Having consistency is nice for developers though so it would be my preference to have schemes that reflect each other for the in-kernel structures and hwprobe.
Thank you for working this problem out with me. I know there is a lot of text I am pushing here, hopefully we can design something that doesn't need to be re-written in the future.
- Charlie
On Tue, Apr 16, 2024 at 9:25 PM Charlie Jenkins charlie@rivosinc.com wrote:
On Tue, Apr 16, 2024 at 08:36:33AM +0100, Conor Dooley wrote:
On Mon, Apr 15, 2024 at 08:34:05PM -0700, Charlie Jenkins wrote:
On Sat, Apr 13, 2024 at 12:40:26AM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 02:31:42PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 10:27:47PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 01:48:46PM -0700, Charlie Jenkins wrote: > On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote: > > On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
> > > This is already falling back on the boot CPU, but that is not a solution > > > that scales. Even though all systems currently have homogenous > > > marchid/mvendorid I am hesitant to assert that all systems are > > > homogenous without providing an option to override this. > > > > There already is an option. Use the non-deprecated property in your > > new system for describing what extensions you support. We don't need to > > add any more properties (for now at least). > > The issue is that it is not possible to know which vendor extensions are > associated with a vendor. That requires a global namespace where each > extension can be looked up in a table. I have opted to have a > vendor-specific namespace so that vendors don't have to worry about > stepping on other vendors' toes (or the other way around). In order to > support that, the vendorid of the hart needs to be known beforehand.
Nah, I think you're mixing up something like hwprobe and having namespaces there with needing namespacing on the devicetree probing side too. You don't need any vendor namespacing, it's perfectly fine (IMO) for a vendor to implement someone else's extension and I think we should allow probing any vendor's extension on any CPU.
I am not mixing it up. Sure a vendor can implement somebody else's extension, they just need to add it to their namespace too.
I didn't mean that you were mixing up how your implementation worked, my point was that you're mixing up the hwprobe stuff which may need namespacing for $a{b,p}i_reason and probing from DT which does not. I don't think that the kernel should need to be changed at all if someone shows up and implements another vendor's extension - we already have far too many kernel changes required to display support for extensions and I don't welcome potential for more.
Yes, I understand where you are coming from. We do not want it to require very many changes to add an extension. With this framework, there are the same number of changes to add a vendor extension as there are to add a standard extension.
No, it is actually subtly different. Even if the kernel already supports the extension, it needs to be patched for each vendor
There is the upfront cost of creating the struct for the first vendor extension from a vendor, but after that the extension only needs to be added to the associated vendor's file (I am extracting this out to a vendor file in the next version). This is also a very easy task since the fields from a different vendor can be copied and adapted.
Another thing I just thought of was systems where the SoC vendor implements some extension that gets communicated in the ISA string but is not the vendor in mvendorid in their various CPUs. I wouldn't want to see several different entries in structs (or several different hwprobe keys, but that's another story) for this situation because you're only allowing probing what's in the struct matching the vendorid.
Since the isa string is a per-hart field, the vendor associated with the hart will be used.
I don't know if you just didn't really read what I said or didn't understand it, but this response doesn't address my comment.
I read what you said! This question seemed to me as another variant of "what happens when one vendor implements an extension from a different vendor", and since we already discussed that I was trying to figure out what you were actually asking.
Consider SoC vendor S buys CPUs from vendors A & B and asks both of them to implement Xsjam. The CPUs have the vendorid of either A or B, depending on who made it. This scenario should not result in two different hwprobe keys nor two different in-kernel riscv_has_vendor_ext() checks to see if the extension is supported. *If* the extension is vendor namespaced, it should be namespaced to the SoC vendor whose extension it is, not the individual CPU vendors that implemented it.
Additionally, consider that CPUs from both vendors are in the same SoC and all CPUs support Xsjam. Linux only supports homogeneous extensions so we should be able to detect that all CPUs support the extension and use it in a driver etc, but that's either not going to work (or be difficult to orchestrate) with different mappings per CPU vendor. I saw your v2 cover letter, in which you said: Only patch vendor extension if all harts are associated with the same vendor. This is the best chance the kernel has for working properly if there are multiple vendors. I don't think that level of paranoia is required: if firmware tells us that an extension is supported, then we can trust that those extensions have been implemented correctly. If the fear of implementation bugs is what is driving the namespacing that you've gone for, I don't think that it is required and we can simplify things, with the per-vendor structs being the vendor of the extension (so SoC vendor S in my example), not A and B who are the vendors of the CPU IP.
Thanks, Conor.
Thank you for expanding upon this idea further. This solution of indexing the extensions based on the vendor who proposed them does make a lot of sense. There are some key differences here of note. When vendors are able to mix vendor extensions, defining a bitmask that contains all of the vendor extensions gets a bit messier. I see two possible solutions.
- Vendor keys cannot overlap between vendors. A set bit in the bitmask
is associated with exactly one extension.
- Vendor keys can overlap between vendors. There is a vendor bitmask
per vendor. When setting/checking a vendor extension, first index into the vendor extension bitmask with the vendor associated with the extension and then with the key of the vendor extension.
A third option would be to use the standard extension framework. This causes the standard extension list to become populated with extensions that most harts will never implement so I am opposed to that.
This problem carries over into hwprobe since the schemes proposed by Evan and I both rely on the mvendorid of harts associated with the cpumask. To have this level of support in hwprobe for SoCs with a mix of vendors but the same extensions I again see two options:
- Vendor keys cannot overlap between vendors. A set bit in the bitmask
is associated with exactly one extension. This bitmask would be returned by the vendor extension hwprobe key.
- Vendor keys can overlap between vendors. There is an hwprobe key per
vendor. Automatic resolution of the vendor doesn't work because the vendor-specific feature being requested (extensions in this case) may be of a vendor that is different from the hart's vendor, in other words there are two variables necessary: the vendor and a way to ask hwprobe for a list of the vendor extensions. With hwprobe there is only the "key" that can be used to encode these variables simultaneously. We could have something like a HWPROBE_THEAD_EXT_0 key that would return all thead vendor extensions supported by the harts corresponding to the cpumask.
I was a big proponent of the vendor namespacing in hwprobe, as I liked the tidiness of it, and felt it could handle most cases (including mix-n-matching multiple mvendorids in a single SoC). However my balloon lost its air after chatting with Palmer, as there's one case it really can't handle: white labeling. This is where I buy a THead (for instance) CPU for my SoC, including all its vendor extensions, and do nothing but change the mvendorid to my own. If this is a thing, then the vendor extensions basically have to be a single global namespace in hwprobe (sigh).
I do like Charlie's idea of at least letting vendors allocate a key at a time, eg HWPROBE_THEAD_EXT_0, rather than racing to allocate a bit at a time in a key like HWPROBE_VENDOR_EXT_0. That gives it some semblance of organization, and still gives us a chance of a cleanup/deprecation path for vendors that stop producing chips. -Evan
On Wed, Apr 17, 2024 at 09:02:05AM -0700, Evan Green wrote:
On Tue, Apr 16, 2024 at 9:25 PM Charlie Jenkins charlie@rivosinc.com wrote:
On Tue, Apr 16, 2024 at 08:36:33AM +0100, Conor Dooley wrote:
On Mon, Apr 15, 2024 at 08:34:05PM -0700, Charlie Jenkins wrote:
On Sat, Apr 13, 2024 at 12:40:26AM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 02:31:42PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 10:27:47PM +0100, Conor Dooley wrote: > On Fri, Apr 12, 2024 at 01:48:46PM -0700, Charlie Jenkins wrote: > > On Fri, Apr 12, 2024 at 07:47:48PM +0100, Conor Dooley wrote: > > > On Fri, Apr 12, 2024 at 10:12:20AM -0700, Charlie Jenkins wrote:
> > > > This is already falling back on the boot CPU, but that is not a solution > > > > that scales. Even though all systems currently have homogenous > > > > marchid/mvendorid I am hesitant to assert that all systems are > > > > homogenous without providing an option to override this. > > > > > > There already is an option. Use the non-deprecated property in your > > > new system for describing what extensions you support. We don't need to > > > add any more properties (for now at least). > > > > The issue is that it is not possible to know which vendor extensions are > > associated with a vendor. That requires a global namespace where each > > extension can be looked up in a table. I have opted to have a > > vendor-specific namespace so that vendors don't have to worry about > > stepping on other vendors' toes (or the other way around). In order to > > support that, the vendorid of the hart needs to be known beforehand. > > Nah, I think you're mixing up something like hwprobe and having > namespaces there with needing namespacing on the devicetree probing side > too. You don't need any vendor namespacing, it's perfectly fine (IMO) > for a vendor to implement someone else's extension and I think we should > allow probing any vendor's extension on any CPU.
I am not mixing it up. Sure a vendor can implement somebody else's extension, they just need to add it to their namespace too.
I didn't mean that you were mixing up how your implementation worked, my point was that you're mixing up the hwprobe stuff which may need namespacing for $a{b,p}i_reason and probing from DT which does not. I don't think that the kernel should need to be changed at all if someone shows up and implements another vendor's extension - we already have far too many kernel changes required to display support for extensions and I don't welcome potential for more.
Yes, I understand where you are coming from. We do not want it to require very many changes to add an extension. With this framework, there are the same number of changes to add a vendor extension as there are to add a standard extension.
No, it is actually subtly different. Even if the kernel already supports the extension, it needs to be patched for each vendor
There is the upfront cost of creating the struct for the first vendor extension from a vendor, but after that the extension only needs to be added to the associated vendor's file (I am extracting this out to a vendor file in the next version). This is also a very easy task since the fields from a different vendor can be copied and adapted.
Another thing I just thought of was systems where the SoC vendor implements some extension that gets communicated in the ISA string but is not the vendor in mvendorid in their various CPUs. I wouldn't want to see several different entries in structs (or several different hwprobe keys, but that's another story) for this situation because you're only allowing probing what's in the struct matching the vendorid.
Since the isa string is a per-hart field, the vendor associated with the hart will be used.
I don't know if you just didn't really read what I said or didn't understand it, but this response doesn't address my comment.
I read what you said! This question seemed to me as another variant of "what happens when one vendor implements an extension from a different vendor", and since we already discussed that I was trying to figure out what you were actually asking.
Consider SoC vendor S buys CPUs from vendors A & B and asks both of them to implement Xsjam. The CPUs have the vendorid of either A or B, depending on who made it. This scenario should not result in two different hwprobe keys nor two different in-kernel riscv_has_vendor_ext() checks to see if the extension is supported. *If* the extension is vendor namespaced, it should be namespaced to the SoC vendor whose extension it is, not the individual CPU vendors that implemented it.
Additionally, consider that CPUs from both vendors are in the same SoC and all CPUs support Xsjam. Linux only supports homogeneous extensions so we should be able to detect that all CPUs support the extension and use it in a driver etc, but that's either not going to work (or be difficult to orchestrate) with different mappings per CPU vendor. I saw your v2 cover letter, in which you said: Only patch vendor extension if all harts are associated with the same vendor. This is the best chance the kernel has for working properly if there are multiple vendors. I don't think that level of paranoia is required: if firmware tells us that an extension is supported, then we can trust that those extensions have been implemented correctly. If the fear of implementation bugs is what is driving the namespacing that you've gone for, I don't think that it is required and we can simplify things, with the per-vendor structs being the vendor of the extension (so SoC vendor S in my example), not A and B who are the vendors of the CPU IP.
Thanks, Conor.
Thank you for expanding upon this idea further. This solution of indexing the extensions based on the vendor who proposed them does make a lot of sense. There are some key differences here of note. When vendors are able to mix vendor extensions, defining a bitmask that contains all of the vendor extensions gets a bit messier. I see two possible solutions.
- Vendor keys cannot overlap between vendors. A set bit in the bitmask
is associated with exactly one extension.
- Vendor keys can overlap between vendors. There is a vendor bitmask
per vendor. When setting/checking a vendor extension, first index into the vendor extension bitmask with the vendor associated with the extension and then with the key of the vendor extension.
A third option would be to use the standard extension framework. This causes the standard extension list to become populated with extensions that most harts will never implement so I am opposed to that.
This problem carries over into hwprobe since the schemes proposed by Evan and I both rely on the mvendorid of harts associated with the cpumask. To have this level of support in hwprobe for SoCs with a mix of vendors but the same extensions I again see two options:
- Vendor keys cannot overlap between vendors. A set bit in the bitmask
is associated with exactly one extension. This bitmask would be returned by the vendor extension hwprobe key.
- Vendor keys can overlap between vendors. There is an hwprobe key per
vendor. Automatic resolution of the vendor doesn't work because the vendor-specific feature being requested (extensions in this case) may be of a vendor that is different from the hart's vendor, in other words there are two variables necessary: the vendor and a way to ask hwprobe for a list of the vendor extensions. With hwprobe there is only the "key" that can be used to encode these variables simultaneously. We could have something like a HWPROBE_THEAD_EXT_0 key that would return all thead vendor extensions supported by the harts corresponding to the cpumask.
I was a big proponent of the vendor namespacing in hwprobe, as I liked the tidiness of it, and felt it could handle most cases (including mix-n-matching multiple mvendorids in a single SoC). However my balloon lost its air after chatting with Palmer, as there's one case it really can't handle: white labeling. This is where I buy a THead (for instance) CPU for my SoC, including all its vendor extensions, and do nothing but change the mvendorid to my own. If this is a thing, then the vendor extensions basically have to be a single global namespace in hwprobe (sigh).
I do like Charlie's idea of at least letting vendors allocate a key at a time, eg HWPROBE_THEAD_EXT_0, rather than racing to allocate a bit at a time in a key like HWPROBE_VENDOR_EXT_0. That gives it some semblance of organization, and still gives us a chance of a cleanup/deprecation path for vendors that stop producing chips. -Evan
Okay I will send a v3 following that method!
- Charlie
The xtheadvector ISA extension is described on the T-Head extension spec Github page [1].
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector...
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 Documentation/devicetree/bindings/riscv/extensions.yaml | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 468c646247aa..3fd9dcf70662 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -477,6 +477,10 @@ properties:
            latency, as ratified in commit 56ed795 ("Update
            riscv-crypto-spec-vector.adoc") of riscv-crypto.

+  # vendor extensions, each extension sorted alphanumerically under the
+  # vendor they belong to. Vendors are sorted alphanumerically as well.
+
+  # Andes
        - const: xandespmu
          description:
            The Andes Technology performance monitor extension for counter overflow
@@ -484,5 +488,10 @@ properties:
            Registers in the AX45MP datasheet.
            https://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet....

+  # T-HEAD
+  - const: xtheadvector
+    description:
+      The T-HEAD specific 0.7.1 vector implementation.
+
 additionalProperties: true
...
On Thu, Apr 11, 2024 at 09:11:09PM -0700, Charlie Jenkins wrote:
The xtheadvector ISA extension is described on the T-Head extension spec Github page [1].
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector...
Link: <foo> [1]
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
Documentation/devicetree/bindings/riscv/extensions.yaml | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 468c646247aa..3fd9dcf70662 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -477,6 +477,10 @@ properties:
           latency, as ratified in commit 56ed795 ("Update
           riscv-crypto-spec-vector.adoc") of riscv-crypto.
# vendor extensions, each extension sorted alphanumerically under the
# vendor they belong to. Vendors are sorted alphanumerically as well.
# Andes
- const: xandespmu
  description:
    The Andes Technology performance monitor extension for counter overflow
@@ -484,5 +488,10 @@ properties:
    Registers in the AX45MP datasheet.
    https://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet....
# T-HEAD
- const: xtheadvector
description:
The T-HEAD specific 0.7.1 vector implementation.
This needs the link and a SHA or some other reference to the version of the document.
Thanks, Conor.
additionalProperties: true ...
-- 2.44.0
On Fri, Apr 12, 2024 at 11:27:23AM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:09PM -0700, Charlie Jenkins wrote:
The xtheadvector ISA extension is described on the T-Head extension spec Github page [1].
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector...
Link: <foo> [1]
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
Documentation/devicetree/bindings/riscv/extensions.yaml | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 468c646247aa..3fd9dcf70662 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -477,6 +477,10 @@ properties:
           latency, as ratified in commit 56ed795 ("Update
           riscv-crypto-spec-vector.adoc") of riscv-crypto.
# vendor extensions, each extension sorted alphanumerically under the
# vendor they belong to. Vendors are sorted alphanumerically as well.
# Andes
- const: xandespmu
  description:
    The Andes Technology performance monitor extension for counter overflow
@@ -484,5 +488,10 @@ properties:
    Registers in the AX45MP datasheet.
    https://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet....
# T-HEAD
- const: xtheadvector
description:
The T-HEAD specific 0.7.1 vector implementation.
This needs the link and a SHA or some other reference to the version of the document.
Okay will add, thank you.
- Charlie
Thanks, Conor.
additionalProperties: true ...
-- 2.44.0
The D1/D1s SoCs support xtheadvector which should be included in the devicetree. Also include vendorid and archid for the cpu.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi b/arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi
index 64c3c2e6cbe0..aee07d33a4d3 100644
--- a/arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi
+++ b/arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi
@@ -27,7 +27,9 @@ cpu0: cpu@0 {
 			riscv,isa = "rv64imafdc";
 			riscv,isa-base = "rv64i";
 			riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr",
-					       "zifencei", "zihpm";
+					       "zifencei", "zihpm", "xtheadvector";
+			riscv,vendorid = <0x00000000 0x0000005b7>;
+			riscv,archid = <0x00000000 0x000000000>;
 			#cooling-cells = <2>;
cpu0_intc: interrupt-controller {
This loop is supposed to check if ext->subset_ext_ids[j] is valid, rather than if ext->subset_ext_ids[i] is valid, before setting the extension id ext->subset_ext_ids[j] in isainfo->isa.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 0d8295ed975b ("riscv: add ISA extension parsing for scalar crypto")
---
 arch/riscv/kernel/cpufeature.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index cd156adbeb66..5eb52d270a9a 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -617,7 +617,7 @@ static int __init riscv_fill_hwcap_from_ext_list(unsigned long *isa2hwcap)

 		if (ext->subset_ext_size) {
 			for (int j = 0; j < ext->subset_ext_size; j++) {
-				if (riscv_isa_extension_check(ext->subset_ext_ids[i]))
+				if (riscv_isa_extension_check(ext->subset_ext_ids[j]))
 					set_bit(ext->subset_ext_ids[j], isainfo->isa);
 			}
 		}
On Thu, Apr 11, 2024 at 09:11:11PM -0700, Charlie Jenkins wrote:
This loop is supposed to check if ext->subset_ext_ids[j] is valid, rather than if ext->subset_ext_ids[i] is valid, before setting the extension id ext->subset_ext_ids[j] in isainfo->isa.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 0d8295ed975b ("riscv: add ISA extension parsing for scalar crypto")
Reviewed-by: Conor Dooley conor.dooley@microchip.com
Thanks, Conor.
 arch/riscv/kernel/cpufeature.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index cd156adbeb66..5eb52d270a9a 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -617,7 +617,7 @@ static int __init riscv_fill_hwcap_from_ext_list(unsigned long *isa2hwcap)
 			if (ext->subset_ext_size) {
 				for (int j = 0; j < ext->subset_ext_size; j++) {
-					if (riscv_isa_extension_check(ext->subset_ext_ids[i]))
+					if (riscv_isa_extension_check(ext->subset_ext_ids[j]))
 						set_bit(ext->subset_ext_ids[j], isainfo->isa);
 				}
 			}
-- 2.44.0
Create a private namespace for each vendor above 0x8000. During the probing of hardware capabilities, the vendorid of each hart is used to resolve the vendor extension compatibility.
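The key layout described above is easy to state in code. The following is an illustrative sketch, not code from the patch: the macro names and values mirror the series, while `vendor_ext_bit()` is a hypothetical helper showing how a vendor extension's logical ID, which lives at 0x8000 and above, maps to a bit position inside a per-vendor bitmap.

```c
#include <assert.h>

/* Values mirror the proposed scheme: logical IDs of vendor extensions
 * start at 0x8000, leaving 0-0x7fff for standard extensions, and each
 * vendor's bitmap stores IDs relative to that base. */
#define RISCV_ISA_VENDOR_EXT_BASE		0x8000
#define RISCV_ISA_VENDOR_EXT_XTHEADVECTOR	0x8000
#define RISCV_ISA_VENDOR_EXT_MAX		0x8080
#define RISCV_ISA_VENDOR_EXT_SIZE \
	(RISCV_ISA_VENDOR_EXT_MAX - RISCV_ISA_VENDOR_EXT_BASE)

/* Hypothetical helper: bit position of a vendor extension inside a
 * per-vendor bitmap (the patch open-codes this as `id - id_offset`). */
static inline unsigned int vendor_ext_bit(unsigned int id)
{
	assert(id >= RISCV_ISA_VENDOR_EXT_BASE && id < RISCV_ISA_VENDOR_EXT_MAX);
	return id - RISCV_ISA_VENDOR_EXT_BASE;
}
```

Because every vendor's IDs are stored relative to the same base, two vendors can reuse the same numeric keys without colliding: the vendor is disambiguated separately (by mvendorid), not by the key value.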
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/include/asm/cpufeature.h |   7 ++
 arch/riscv/include/asm/hwcap.h      |  23 ++++
 arch/riscv/kernel/cpufeature.c      | 203 ++++++++++++++++++++++++++++++------
 3 files changed, 200 insertions(+), 33 deletions(-)

diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index 347805446151..b5f4eedcfa86 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -26,11 +26,18 @@ struct riscv_isainfo {
 	DECLARE_BITMAP(isa, RISCV_ISA_EXT_MAX);
 };
 
+struct riscv_isavendorinfo {
+	DECLARE_BITMAP(isa, RISCV_ISA_VENDOR_EXT_SIZE);
+};
+
 DECLARE_PER_CPU(struct riscv_cpuinfo, riscv_cpuinfo);
 
 /* Per-cpu ISA extensions. */
 extern struct riscv_isainfo hart_isa[NR_CPUS];
 
+/* Per-cpu ISA vendor extensions. */
+extern struct riscv_isainfo hart_isa_vendor[NR_CPUS];
+
 void riscv_user_isa_enable(void);
 
 #if defined(CONFIG_RISCV_MISALIGNED)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index e17d0078a651..38157be5becd 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -87,6 +87,29 @@
 #define RISCV_ISA_EXT_MAX		128
 #define RISCV_ISA_EXT_INVALID		U32_MAX
 
+/*
+ * These macros represent the logical IDs of each vendor RISC-V ISA extension
+ * and are used in each vendor ISA bitmap. The logical IDs start from
+ * RISCV_ISA_VENDOR_EXT_BASE, which allows the 0-0x7fff range to be
+ * reserved for non-vendor extensions. The maximum, RISCV_ISA_VENDOR_EXT_MAX,
+ * is defined in order to allocate the bitmap and may be increased when
+ * necessary.
+ *
+ * Values are expected to overlap between vendors.
+ *
+ * New extensions should just be added to the bottom of the respective vendor,
+ * rather than added alphabetically, in order to avoid unnecessary shuffling.
+ */
+#define RISCV_ISA_VENDOR_EXT_BASE		0x8000
+
+/* THead Vendor Extensions */
+#define RISCV_ISA_VENDOR_EXT_XTHEADVECTOR	0x8000
+
+#define RISCV_ISA_VENDOR_EXT_MAX		0x8080
+#define RISCV_ISA_VENDOR_EXT_SIZE		(RISCV_ISA_VENDOR_EXT_MAX - RISCV_ISA_VENDOR_EXT_BASE)
+#define RISCV_ISA_VENDOR_EXT_INVALID		U32_MAX
+
 #ifdef CONFIG_RISCV_M_MODE
 #define RISCV_ISA_EXT_SxAIA		RISCV_ISA_EXT_SMAIA
 #else
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 5eb52d270a9a..f72fbdd0d7f5 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -32,9 +32,15 @@ unsigned long elf_hwcap __read_mostly;
 /* Host ISA bitmap */
 static DECLARE_BITMAP(riscv_isa, RISCV_ISA_EXT_MAX) __read_mostly;
 
+/* Host ISA vendor bitmap */
+static DECLARE_BITMAP(riscv_isa_vendor, RISCV_ISA_VENDOR_EXT_SIZE) __read_mostly;
+
 /* Per-cpu ISA extensions. */
 struct riscv_isainfo hart_isa[NR_CPUS];
 
+/* Per-cpu ISA vendor extensions. */
+struct riscv_isainfo hart_isa_vendor[NR_CPUS];
+
 /**
  * riscv_isa_extension_base() - Get base extension word
  *
@@ -309,8 +315,15 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
 
 const size_t riscv_isa_ext_count = ARRAY_SIZE(riscv_isa_ext);
 
+const struct riscv_isa_ext_data riscv_isa_vendor_ext_thead[] = {
+	__RISCV_ISA_EXT_DATA(xtheadvector, RISCV_ISA_VENDOR_EXT_XTHEADVECTOR),
+};
+
+const size_t riscv_isa_vendor_ext_count_thead = ARRAY_SIZE(riscv_isa_vendor_ext_thead);
+
 static void __init match_isa_ext(const struct riscv_isa_ext_data *ext, const char *name,
-				 const char *name_end, struct riscv_isainfo *isainfo)
+				 const char *name_end, struct riscv_isainfo *isainfo,
+				 unsigned int id_offset)
 {
 	if ((name_end - name == strlen(ext->name)) &&
 	     !strncasecmp(name, ext->name, name_end - name)) {
@@ -321,7 +334,7 @@ static void __init match_isa_ext(const struct riscv_isa_ext_data *ext, const cha
 		if (ext->subset_ext_size) {
 			for (int i = 0; i < ext->subset_ext_size; i++) {
 				if (riscv_isa_extension_check(ext->subset_ext_ids[i]))
-					set_bit(ext->subset_ext_ids[i], isainfo->isa);
+					set_bit(ext->subset_ext_ids[i] - id_offset, isainfo->isa);
 			}
 		}
 
@@ -330,12 +343,34 @@ static void __init match_isa_ext(const struct riscv_isa_ext_data *ext, const cha
 		 * (rejected by riscv_isa_extension_check()).
 		 */
 		if (riscv_isa_extension_check(ext->id))
-			set_bit(ext->id, isainfo->isa);
+			set_bit(ext->id - id_offset, isainfo->isa);
+	}
+}
+
+static bool __init get_isa_vendor_ext(unsigned long vendorid,
+				      const struct riscv_isa_ext_data **isa_vendor_ext,
+				      size_t *count)
+{
+	bool found_vendor = true;
+
+	switch (vendorid) {
+	case THEAD_VENDOR_ID:
+		*isa_vendor_ext = riscv_isa_vendor_ext_thead;
+		*count = riscv_isa_vendor_ext_count_thead;
+		break;
+	default:
+		*isa_vendor_ext = NULL;
+		*count = 0;
+		found_vendor = false;
+		break;
 	}
+
+	return found_vendor;
 }
 
 static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct riscv_isainfo *isainfo,
-					  unsigned long *isa2hwcap, const char *isa)
+					  struct riscv_isainfo *isavendorinfo, unsigned long vendorid,
+					  unsigned long *isa2hwcap, const char *isa)
 {
 	/*
 	 * For all possible cpus, we have already validated in
@@ -349,8 +384,30 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 		const char *ext = isa++;
 		const char *ext_end = isa;
 		bool ext_long = false, ext_err = false;
+		struct riscv_isainfo *selected_isainfo = isainfo;
+		const struct riscv_isa_ext_data *selected_riscv_isa_ext = riscv_isa_ext;
+		size_t selected_riscv_isa_ext_count = riscv_isa_ext_count;
+		unsigned int id_offset = 0;
 
 		switch (*ext) {
+		case 'x':
+		case 'X':
+			bool found;
+
+			found = get_isa_vendor_ext(vendorid,
+						   &selected_riscv_isa_ext,
+						   &selected_riscv_isa_ext_count);
+			selected_isainfo = isavendorinfo;
+			id_offset = RISCV_ISA_VENDOR_EXT_BASE;
+			if (!found) {
+				pr_warn("No associated vendor extensions with vendor id: %lx\n",
+					vendorid);
+				for (; *isa && *isa != '_'; ++isa)
+					;
+				ext_err = true;
+				break;
+			}
+			fallthrough;
 		case 's':
 			/*
 			 * Workaround for invalid single-letter 's' & 'u' (QEMU).
@@ -366,8 +423,6 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 			}
 			fallthrough;
 		case 'S':
-		case 'x':
-		case 'X':
 		case 'z':
 		case 'Z':
 			/*
@@ -476,8 +531,10 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 				set_bit(nr, isainfo->isa);
 			}
 		} else {
-			for (int i = 0; i < riscv_isa_ext_count; i++)
-				match_isa_ext(&riscv_isa_ext[i], ext, ext_end, isainfo);
+			for (int i = 0; i < selected_riscv_isa_ext_count; i++)
+				match_isa_ext(&selected_riscv_isa_ext[i], ext,
+					      ext_end, selected_isainfo,
+					      id_offset);
 		}
 	}
 }
@@ -490,8 +547,8 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 	struct acpi_table_header *rhct;
 	acpi_status status;
 	unsigned int cpu;
-	u64 boot_vendorid;
-	u64 boot_archid;
+	u64 boot_vendorid = ULL(-1), vendorid;
+	u64 boot_archid = ULL(-1);
 
 	if (!acpi_disabled) {
 		status = acpi_get_table(ACPI_SIG_RHCT, 0, &rhct);
@@ -499,11 +556,9 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 			return;
 	}
 
-	boot_vendorid = riscv_get_mvendorid();
-	boot_archid = riscv_get_marchid();
-
 	for_each_possible_cpu(cpu) {
 		struct riscv_isainfo *isainfo = &hart_isa[cpu];
+		struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];
 		unsigned long this_hwcap = 0;
 		u64 this_vendorid;
 		u64 this_archid;
@@ -523,11 +578,19 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 			}
 
 			if (of_property_read_u64(node, "riscv,vendorid", &this_vendorid) < 0) {
 				pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
+
+				if (boot_vendorid == -1)
+					boot_vendorid = riscv_get_mvendorid();
+
 				this_vendorid = boot_vendorid;
 			}
 
 			if (of_property_read_u64(node, "riscv,archid", &this_archid) < 0) {
 				pr_warn("Unable to find \"riscv,archid\" devicetree entry, using boot hart marchid instead\n");
+
+				if (boot_archid == -1)
+					boot_archid = riscv_get_marchid();
+
 				this_archid = boot_archid;
 			}
 		} else {
@@ -540,7 +603,8 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 			this_archid = boot_archid;
 		}
 
-		riscv_parse_isa_string(&this_hwcap, isainfo, isa2hwcap, isa);
+		riscv_parse_isa_string(&this_hwcap, isainfo, isavendorinfo,
+				       this_vendorid, isa2hwcap, isa);
 
 		/*
 		 * These ones were as they were part of the base ISA when the
@@ -582,21 +646,77 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 			bitmap_copy(riscv_isa, isainfo->isa, RISCV_ISA_EXT_MAX);
 		else
 			bitmap_and(riscv_isa, riscv_isa, isainfo->isa, RISCV_ISA_EXT_MAX);
+
+		/*
+		 * All harts must have the same vendor to have compatible
+		 * vendor extensions.
+		 */
+		if (bitmap_empty(riscv_isa_vendor, RISCV_ISA_VENDOR_EXT_SIZE)) {
+			vendorid = this_vendorid;
+			bitmap_copy(riscv_isa_vendor, isavendorinfo->isa,
+				    RISCV_ISA_VENDOR_EXT_SIZE);
+		} else if (vendorid != this_vendorid) {
+			vendorid = -1ULL;
+			bitmap_clear(riscv_isa_vendor, 0, RISCV_ISA_VENDOR_EXT_SIZE);
+		} else {
+			bitmap_and(riscv_isa_vendor, riscv_isa_vendor,
+				   isavendorinfo->isa,
+				   RISCV_ISA_VENDOR_EXT_SIZE);
+		}
 	}
 
 	if (!acpi_disabled && rhct)
 		acpi_put_table((struct acpi_table_header *)rhct);
 }
 
+static void __init riscv_add_cpu_ext(struct device_node *cpu_node,
+				     unsigned long *this_hwcap,
+				     unsigned long *isa2hwcap,
+				     const struct riscv_isa_ext_data *riscv_isa_ext_data,
+				     struct riscv_isainfo *isainfo,
+				     unsigned int id_offset,
+				     size_t riscv_isa_ext_count)
+{
+	for (int i = 0; i < riscv_isa_ext_count; i++) {
+		const struct riscv_isa_ext_data ext = riscv_isa_ext_data[i];
+
+		if (of_property_match_string(cpu_node, "riscv,isa-extensions",
+					     ext.property) < 0)
+			continue;
+
+		if (ext.subset_ext_size) {
+			for (int j = 0; j < ext.subset_ext_size; j++) {
+				if (riscv_isa_extension_check(ext.subset_ext_ids[j]))
+					set_bit(ext.subset_ext_ids[j] - id_offset, isainfo->isa);
+			}
+		}
+
+		if (riscv_isa_extension_check(ext.id)) {
+			set_bit(ext.id - id_offset, isainfo->isa);
+
+			/* Only single letter extensions get set in hwcap */
+			if (strnlen(ext.name, 2) == 1)
+				*this_hwcap |= isa2hwcap[ext.id];
+		}
+	}
+}
+
 static int __init riscv_fill_hwcap_from_ext_list(unsigned long *isa2hwcap)
 {
 	unsigned int cpu;
+	u64 boot_vendorid = -1ULL, vendorid;
 
 	for_each_possible_cpu(cpu) {
 		unsigned long this_hwcap = 0;
 		struct device_node *cpu_node;
 		struct riscv_isainfo *isainfo = &hart_isa[cpu];
+		struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];
+		size_t riscv_isa_vendor_ext_count;
+		const struct riscv_isa_ext_data *riscv_isa_vendor_ext;
+		u64 this_vendorid;
+		bool found_vendor;
 
 		cpu_node = of_cpu_device_node_get(cpu);
 		if (!cpu_node) {
 			pr_warn("Unable to find cpu node\n");
@@ -608,28 +728,28 @@ static int __init riscv_fill_hwcap_from_ext_list(unsigned long *isa2hwcap)
 			continue;
 		}
 
-		for (int i = 0; i < riscv_isa_ext_count; i++) {
-			const struct riscv_isa_ext_data *ext = &riscv_isa_ext[i];
+		riscv_add_cpu_ext(cpu_node, &this_hwcap, isa2hwcap,
+				  riscv_isa_ext, isainfo, 0,
+				  riscv_isa_ext_count);
 
-			if (of_property_match_string(cpu_node, "riscv,isa-extensions",
-						     ext->property) < 0)
-				continue;
-
-			if (ext->subset_ext_size) {
-				for (int j = 0; j < ext->subset_ext_size; j++) {
-					if (riscv_isa_extension_check(ext->subset_ext_ids[j]))
-						set_bit(ext->subset_ext_ids[j], isainfo->isa);
-				}
-			}
+		if (of_property_read_u64(cpu_node, "riscv,vendorid", &this_vendorid) < 0) {
+			pr_warn("Unable to find \"riscv,vendorid\" devicetree entry, using boot hart mvendorid instead\n");
+			if (boot_vendorid == -1)
+				boot_vendorid = riscv_get_mvendorid();
+			this_vendorid = boot_vendorid;
+		}
 
-			if (riscv_isa_extension_check(ext->id)) {
-				set_bit(ext->id, isainfo->isa);
+		found_vendor = get_isa_vendor_ext(this_vendorid,
+						  &riscv_isa_vendor_ext,
+						  &riscv_isa_vendor_ext_count);
 
-				/* Only single letter extensions get set in hwcap */
-				if (strnlen(riscv_isa_ext[i].name, 2) == 1)
-					this_hwcap |= isa2hwcap[riscv_isa_ext[i].id];
-			}
-		}
+		if (found_vendor)
+			riscv_add_cpu_ext(cpu_node, &this_hwcap, isa2hwcap,
+					  riscv_isa_vendor_ext, isavendorinfo,
+					  RISCV_ISA_VENDOR_EXT_BASE, riscv_isa_vendor_ext_count);
+		else
+			pr_warn("No associated vendor extensions with vendor id: %llx\n",
+				this_vendorid);
 
 		of_node_put(cpu_node);
 
@@ -646,6 +766,23 @@ static int __init riscv_fill_hwcap_from_ext_list(unsigned long *isa2hwcap)
 			bitmap_copy(riscv_isa, isainfo->isa, RISCV_ISA_EXT_MAX);
 		else
 			bitmap_and(riscv_isa, riscv_isa, isainfo->isa, RISCV_ISA_EXT_MAX);
+
+		/*
+		 * All harts must have the same vendorid to have compatible
+		 * vendor extensions.
+		 */
+		if (bitmap_empty(riscv_isa_vendor, RISCV_ISA_VENDOR_EXT_SIZE)) {
+			vendorid = this_vendorid;
+			bitmap_copy(riscv_isa_vendor, isavendorinfo->isa,
+				    RISCV_ISA_VENDOR_EXT_SIZE);
+		} else if (vendorid != this_vendorid) {
+			vendorid = -1ULL;
+			bitmap_clear(riscv_isa_vendor, 0,
+				     RISCV_ISA_VENDOR_EXT_SIZE);
+		} else {
+			bitmap_and(riscv_isa_vendor, riscv_isa_vendor,
+				   isavendorinfo->isa, RISCV_ISA_VENDOR_EXT_SIZE);
+		}
 	}
 
 	if (bitmap_empty(riscv_isa, RISCV_ISA_EXT_MAX))
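The cross-hart reduction this patch performs can be modelled in a few lines. The following is an illustrative model, not kernel code: `struct vendor_state` and `vendor_state_accumulate()` are hypothetical names, and an explicit `seeded` flag stands in for the patch's `bitmap_empty()` test (which, note, cannot distinguish "no hart seen yet" from "first hart had no vendor extensions").

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the reduction: the first hart seeds the global vendor
 * bitmap, a hart with a different vendorid clears it permanently,
 * and harts with a matching vendorid AND their bits into it. */
#define VENDOR_EXT_WORDS 2

struct vendor_state {
	uint64_t vendorid;		/* -1 once a mismatch is seen */
	uint64_t bits[VENDOR_EXT_WORDS];
	int seeded;
};

static void vendor_state_accumulate(struct vendor_state *s, uint64_t this_vendorid,
				    const uint64_t *hart_bits)
{
	if (!s->seeded) {
		/* First hart: adopt its vendorid and extension set. */
		s->vendorid = this_vendorid;
		memcpy(s->bits, hart_bits, sizeof(s->bits));
		s->seeded = 1;
	} else if (s->vendorid != this_vendorid) {
		/* Mixed-vendor system: no common vendor extensions. */
		s->vendorid = (uint64_t)-1;
		memset(s->bits, 0, sizeof(s->bits));
	} else {
		/* Same vendor: keep only extensions all harts share. */
		for (int i = 0; i < VENDOR_EXT_WORDS; i++)
			s->bits[i] &= hart_bits[i];
	}
}
```

One consequence visible in the model: once any two harts disagree on vendorid, the global vendor bitmap stays empty for good, which is the conservative behaviour the comment in the patch describes.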
On Thu, Apr 11, 2024 at 09:11:12PM -0700, Charlie Jenkins wrote:
 static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct riscv_isainfo *isainfo,
-					  unsigned long *isa2hwcap, const char *isa)
+					  struct riscv_isainfo *isavendorinfo, unsigned long vendorid,
+					  unsigned long *isa2hwcap, const char *isa)
 {
 	/*
 	 * For all possible cpus, we have already validated in
@@ -349,8 +384,30 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 		const char *ext = isa++;
 		const char *ext_end = isa;
 		bool ext_long = false, ext_err = false;
struct riscv_isainfo *selected_isainfo = isainfo;
const struct riscv_isa_ext_data *selected_riscv_isa_ext = riscv_isa_ext;
size_t selected_riscv_isa_ext_count = riscv_isa_ext_count;
unsigned int id_offset = 0;
switch (*ext) {
case 'x':
case 'X':
One quick remark is that we should not go and support this stuff via riscv,isa in my opinion, only allowing it for the riscv,isa-extensions parsing. We don't have a way to define meanings for vendor extensions in this way. ACPI also uses this codepath and at the moment the kernel's docs say we're gonna follow isa string parsing rules in a specific version of the ISA manual. While that manual provides a format for the string and meanings for standard extensions, there's nothing in there that allows us to get consistent meanings for specific vendor extensions, so I think we should avoid intentionally supporting this here.
I'd probably go as far as to actively skip vendor extensions in riscv_parse_isa_string() to avoid any potential issues.
bool found;
found = get_isa_vendor_ext(vendorid,
&selected_riscv_isa_ext,
&selected_riscv_isa_ext_count);
selected_isainfo = isavendorinfo;
id_offset = RISCV_ISA_VENDOR_EXT_BASE;
if (!found) {
pr_warn("No associated vendor extensions with vendor id: %lx\n",
vendorid);
This should not be a warning, anything we don't understand should be silently ignored to avoid spamming just because the kernel has not grown support for it yet.
Thanks, Conor.
for (; *isa && *isa != '_'; ++isa)
;
ext_err = true;
break;
}
 			fallthrough;
 		case 's':
 			/*
 			 * Workaround for invalid single-letter 's' & 'u' (QEMU).
@@ -366,8 +423,6 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 			}
 			fallthrough;
 		case 'S':
-		case 'x':
-		case 'X':
 		case 'z':
 		case 'Z':
 			/*
@@ -476,8 +531,10 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 				set_bit(nr, isainfo->isa);
 			}
 		} else {
-			for (int i = 0; i < riscv_isa_ext_count; i++)
-				match_isa_ext(&riscv_isa_ext[i], ext, ext_end, isainfo);
+			for (int i = 0; i < selected_riscv_isa_ext_count; i++)
+				match_isa_ext(&selected_riscv_isa_ext[i], ext,
+					      ext_end, selected_isainfo,
+					      id_offset);
 		}
 	}
}
On Fri, Apr 12, 2024 at 01:30:08PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:12PM -0700, Charlie Jenkins wrote:
 static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct riscv_isainfo *isainfo,
-					  unsigned long *isa2hwcap, const char *isa)
+					  struct riscv_isainfo *isavendorinfo, unsigned long vendorid,
+					  unsigned long *isa2hwcap, const char *isa)
 {
 	/*
 	 * For all possible cpus, we have already validated in
@@ -349,8 +384,30 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 		const char *ext = isa++;
 		const char *ext_end = isa;
 		bool ext_long = false, ext_err = false;
struct riscv_isainfo *selected_isainfo = isainfo;
const struct riscv_isa_ext_data *selected_riscv_isa_ext = riscv_isa_ext;
size_t selected_riscv_isa_ext_count = riscv_isa_ext_count;
unsigned int id_offset = 0;
switch (*ext) {
case 'x':
case 'X':
One quick remark is that we should not go and support this stuff via riscv,isa in my opinion, only allowing it for the riscv,isa-extensions parsing. We don't have a way to define meanings for vendor extensions in this way. ACPI also uses this codepath and at the moment the kernel's docs say we're gonna follow isa string parsing rules in a specific version of the ISA manual. While that manual provides a format for the string and meanings for standard extensions, there's nothing in there that allows us to get consistent meanings for specific vendor extensions, so I think we should avoid intentionally supporting this here.
Getting a "consistent meaning" is managed by a vendor. If a vendor supports a vendor extension and puts it in their DT/ACPI table it's up to them to ensure that it works. How does riscv,isa-extensions allow for a consistent meaning?
I'd probably go as far as to actively skip vendor extensions in riscv_parse_isa_string() to avoid any potential issues.
bool found;
found = get_isa_vendor_ext(vendorid,
&selected_riscv_isa_ext,
&selected_riscv_isa_ext_count);
selected_isainfo = isavendorinfo;
id_offset = RISCV_ISA_VENDOR_EXT_BASE;
if (!found) {
pr_warn("No associated vendor extensions with vendor id: %lx\n",
vendorid);
This should not be a warning, anything we don't understand should be silently ignored to avoid spamming just because the kernel has not grown support for it yet.
Sounds good.
- Charlie
Thanks, Conor.
for (; *isa && *isa != '_'; ++isa)
;
ext_err = true;
break;
}
 			fallthrough;
 		case 's':
 			/*
 			 * Workaround for invalid single-letter 's' & 'u' (QEMU).
@@ -366,8 +423,6 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 			}
 			fallthrough;
 		case 'S':
-		case 'x':
-		case 'X':
 		case 'z':
 		case 'Z':
 			/*
@@ -476,8 +531,10 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 				set_bit(nr, isainfo->isa);
 			}
 		} else {
-			for (int i = 0; i < riscv_isa_ext_count; i++)
-				match_isa_ext(&riscv_isa_ext[i], ext, ext_end, isainfo);
+			for (int i = 0; i < selected_riscv_isa_ext_count; i++)
+				match_isa_ext(&selected_riscv_isa_ext[i], ext,
+					      ext_end, selected_isainfo,
+					      id_offset);
 		}
 	}
}
On Fri, Apr 12, 2024 at 09:58:04AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 01:30:08PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:12PM -0700, Charlie Jenkins wrote:
 static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct riscv_isainfo *isainfo,
-					  unsigned long *isa2hwcap, const char *isa)
+					  struct riscv_isainfo *isavendorinfo, unsigned long vendorid,
+					  unsigned long *isa2hwcap, const char *isa)
 {
 	/*
 	 * For all possible cpus, we have already validated in
@@ -349,8 +384,30 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 		const char *ext = isa++;
 		const char *ext_end = isa;
 		bool ext_long = false, ext_err = false;
struct riscv_isainfo *selected_isainfo = isainfo;
const struct riscv_isa_ext_data *selected_riscv_isa_ext = riscv_isa_ext;
size_t selected_riscv_isa_ext_count = riscv_isa_ext_count;
unsigned int id_offset = 0;
switch (*ext) {
case 'x':
case 'X':
One quick remark is that we should not go and support this stuff via riscv,isa in my opinion, only allowing it for the riscv,isa-extensions parsing. We don't have a way to define meanings for vendor extensions in this way. ACPI also uses this codepath and at the moment the kernel's docs say we're gonna follow isa string parsing rules in a specific version of the ISA manual. While that manual provides a format for the string and meanings for standard extensions, there's nothing in there that allows us to get consistent meanings for specific vendor extensions, so I think we should avoid intentionally supporting this here.
Getting a "consistent meaning" is managed by a vendor.
IOW, there's absolutely no guarantee of a consistent meaning.
If a vendor supports a vendor extension and puts it in their DT/ACPI table it's up to them to ensure that it works. How does riscv,isa-extensions allow for a consistent meaning?
The definitions for each string contain links to exact versions of specifications that they correspond to.
I'd probably go as far as to actively skip vendor extensions in riscv_parse_isa_string() to avoid any potential issues.
bool found;
found = get_isa_vendor_ext(vendorid,
&selected_riscv_isa_ext,
&selected_riscv_isa_ext_count);
selected_isainfo = isavendorinfo;
id_offset = RISCV_ISA_VENDOR_EXT_BASE;
if (!found) {
pr_warn("No associated vendor extensions with vendor id: %lx\n",
vendorid);
This should not be a warning, anything we don't understand should be silently ignored to avoid spamming just because the kernel has not grown support for it yet.
Sounds good.
- Charlie
Thanks, Conor.
for (; *isa && *isa != '_'; ++isa)
;
ext_err = true;
break;
}
 			fallthrough;
 		case 's':
 			/*
 			 * Workaround for invalid single-letter 's' & 'u' (QEMU).
@@ -366,8 +423,6 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 			}
 			fallthrough;
 		case 'S':
-		case 'x':
-		case 'X':
 		case 'z':
 		case 'Z':
 			/*
@@ -476,8 +531,10 @@ static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct risc
 				set_bit(nr, isainfo->isa);
 			}
 		} else {
-			for (int i = 0; i < riscv_isa_ext_count; i++)
-				match_isa_ext(&riscv_isa_ext[i], ext, ext_end, isainfo);
+			for (int i = 0; i < selected_riscv_isa_ext_count; i++)
+				match_isa_ext(&selected_riscv_isa_ext[i], ext,
+					      ext_end, selected_isainfo,
+					      id_offset);
 		}
 	}
}
Hi Charlie,
kernel test robot noticed the following build warnings:
[auto build test WARNING on 4cece764965020c22cff7665b18a012006359095]
url:    https://github.com/intel-lab-lkp/linux/commits/Charlie-Jenkins/dt-bindings-r...
base:   4cece764965020c22cff7665b18a012006359095
patch link:    https://lore.kernel.org/r/20240411-dev-charlie-support_thead_vector_6_9-v1-6...
patch subject: [PATCH 06/19] riscv: Extend cpufeature.c to detect vendor extensions
config: riscv-defconfig (https://download.01.org/0day-ci/archive/20240412/202404122206.TkXKhj29-lkp@i...)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 8b3b4a92adee40483c27f26c478a384cd69c6f05)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240412/202404122206.TkXKhj29-lkp@i...)

If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202404122206.TkXKhj29-lkp@intel.com/
All warnings (new ones prefixed by >>):
   In file included from arch/riscv/kernel/cpufeature.c:20:
   In file included from arch/riscv/include/asm/cacheflush.h:9:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
>> arch/riscv/kernel/cpufeature.c:395:4: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
     395 |                         bool found;
         |                         ^
   2 warnings generated.
vim +395 arch/riscv/kernel/cpufeature.c
   370	
   371	static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct riscv_isainfo *isainfo,
   372						  struct riscv_isainfo *isavendorinfo, unsigned long vendorid,
   373						  unsigned long *isa2hwcap, const char *isa)
   374	{
   375		/*
   376		 * For all possible cpus, we have already validated in
   377		 * the boot process that they at least contain "rv" and
   378		 * whichever of "32"/"64" this kernel supports, and so this
   379		 * section can be skipped.
   380		 */
   381		isa += 4;
   382	
   383		while (*isa) {
   384			const char *ext = isa++;
   385			const char *ext_end = isa;
   386			bool ext_long = false, ext_err = false;
   387			struct riscv_isainfo *selected_isainfo = isainfo;
   388			const struct riscv_isa_ext_data *selected_riscv_isa_ext = riscv_isa_ext;
   389			size_t selected_riscv_isa_ext_count = riscv_isa_ext_count;
   390			unsigned int id_offset = 0;
   391	
   392			switch (*ext) {
   393			case 'x':
   394			case 'X':
 > 395				bool found;
   396	
   397				found = get_isa_vendor_ext(vendorid,
   398							   &selected_riscv_isa_ext,
   399							   &selected_riscv_isa_ext_count);
   400				selected_isainfo = isavendorinfo;
   401				id_offset = RISCV_ISA_VENDOR_EXT_BASE;
   402				if (!found) {
   403					pr_warn("No associated vendor extensions with vendor id: %lx\n",
   404						vendorid);
   405					for (; *isa && *isa != '_'; ++isa)
   406						;
   407					ext_err = true;
   408					break;
   409				}
   410				fallthrough;
   411			case 's':
   412				/*
   413				 * Workaround for invalid single-letter 's' & 'u' (QEMU).
   414				 * No need to set the bit in riscv_isa as 's' & 'u' are
   415				 * not valid ISA extensions. It works unless the first
   416				 * multi-letter extension in the ISA string begins with
   417				 * "Su" and is not prefixed with an underscore.
   418				 */
   419				if (ext[-1] != '_' && ext[1] == 'u') {
   420					++isa;
   421					ext_err = true;
   422					break;
   423				}
   424				fallthrough;
   425			case 'S':
   426			case 'z':
   427			case 'Z':
   428				/*
   429				 * Before attempting to parse the extension itself, we find its end.
   430				 * As multi-letter extensions must be split from other multi-letter
   431				 * extensions with an "_", the end of a multi-letter extension will
   432				 * either be the null character or the "_" at the start of the next
   433				 * multi-letter extension.
   434				 *
   435				 * Next, as the extensions version is currently ignored, we
   436				 * eliminate that portion. This is done by parsing backwards from
   437				 * the end of the extension, removing any numbers. This may be a
   438				 * major or minor number however, so the process is repeated if a
   439				 * minor number was found.
   440				 *
   441				 * ext_end is intended to represent the first character *after* the
   442				 * name portion of an extension, but will be decremented to the last
   443				 * character itself while eliminating the extensions version number.
   444				 * A simple re-increment solves this problem.
   445				 */
   446				ext_long = true;
   447				for (; *isa && *isa != '_'; ++isa)
   448					if (unlikely(!isalnum(*isa)))
   449						ext_err = true;
   450	
   451				ext_end = isa;
   452				if (unlikely(ext_err))
   453					break;
   454	
   455				if (!isdigit(ext_end[-1]))
   456					break;
   457	
   458				while (isdigit(*--ext_end))
   459					;
   460	
   461				if (tolower(ext_end[0]) != 'p' || !isdigit(ext_end[-1])) {
   462					++ext_end;
   463					break;
   464				}
   465	
   466				while (isdigit(*--ext_end))
   467					;
   468	
   469				++ext_end;
   470				break;
   471			default:
   472				/*
   473				 * Things are a little easier for single-letter extensions, as they
   474				 * are parsed forwards.
   475				 *
   476				 * After checking that our starting position is valid, we need to
   477				 * ensure that, when isa was incremented at the start of the loop,
   478				 * that it arrived at the start of the next extension.
   479				 *
   480				 * If we are already on a non-digit, there is nothing to do. Either
   481				 * we have a multi-letter extension's _, or the start of an
   482				 * extension.
   483				 *
   484				 * Otherwise we have found the current extension's major version
   485				 * number. Parse past it, and a subsequent p/minor version number
   486				 * if present. The `p` extension must not appear immediately after
   487				 * a number, so there is no fear of missing it.
   488				 *
   489				 */
   490				if (unlikely(!isalpha(*ext))) {
   491					ext_err = true;
   492					break;
   493				}
   494	
   495				if (!isdigit(*isa))
   496					break;
   497	
   498				while (isdigit(*++isa))
   499					;
   500	
   501				if (tolower(*isa) != 'p')
   502					break;
   503	
   504				if (!isdigit(*++isa)) {
   505					--isa;
   506					break;
   507				}
   508	
   509				while (isdigit(*++isa))
   510					;
   511	
   512				break;
   513			}
   514	
   515			/*
   516			 * The parser expects that at the start of an iteration isa points to the
   517			 * first character of the next extension. As we stop parsing an extension
   518			 * on meeting a non-alphanumeric character, an extra increment is needed
   519			 * where the succeeding extension is a multi-letter prefixed with an "_".
   520			 */
   521			if (*isa == '_')
   522				++isa;
   523	
   524			if (unlikely(ext_err))
   525				continue;
   526			if (!ext_long) {
   527				int nr = tolower(*ext) - 'a';
   528	
   529				if (riscv_isa_extension_check(nr)) {
   530					*this_hwcap |= isa2hwcap[nr];
   531					set_bit(nr, isainfo->isa);
   532				}
   533			} else {
   534				for (int i = 0; i < selected_riscv_isa_ext_count; i++)
   535					match_isa_ext(&selected_riscv_isa_ext[i], ext,
   536						      ext_end, selected_isainfo,
   537						      id_offset);
   538			}
   539		}
   540	}
   541	
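The robot's diagnostic stems from the `bool found;` declaration appearing immediately after a `case` label: a label must be followed by a statement, and allowing a declaration there only arrived as a C23 extension (hence the warning with clang 19 and the hard error with clang 17 below). A minimal reproduction and the usual fix, bracing the case body to open a scope (hoisting the declaration above the `switch` works too), looks like this; `classify()` is a made-up function for illustration:

```c
#include <assert.h>

/* Minimal reproduction of the -Wc23-extensions diagnostic: writing
 * `case 'X': int found = 1;` directly is invalid before C23. Braces
 * introduce a compound statement, so the declaration becomes legal
 * in any C standard. */
static int classify(char c)
{
	switch (c) {
	case 'x':
	case 'X': {		/* braces open a scope for `found` */
		int found = 1;
		return found;
	}
	default:
		return 0;
	}
}
```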
Hi Charlie,
kernel test robot noticed the following build errors:
[auto build test ERROR on 4cece764965020c22cff7665b18a012006359095]
url:    https://github.com/intel-lab-lkp/linux/commits/Charlie-Jenkins/dt-bindings-r...
base:   4cece764965020c22cff7665b18a012006359095
patch link:    https://lore.kernel.org/r/20240411-dev-charlie-support_thead_vector_6_9-v1-6...
patch subject: [PATCH 06/19] riscv: Extend cpufeature.c to detect vendor extensions
config: riscv-randconfig-r133-20240413 (https://download.01.org/0day-ci/archive/20240414/202404140621.x9B02eF8-lkp@i...)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce: (https://download.01.org/0day-ci/archive/20240414/202404140621.x9B02eF8-lkp@i...)

If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202404140621.x9B02eF8-lkp@intel.com/
All errors (new ones prefixed by >>):
>> arch/riscv/kernel/cpufeature.c:395:4: error: expected expression
     395 |                         bool found;
         |                         ^
>> arch/riscv/kernel/cpufeature.c:397:4: error: use of undeclared identifier 'found'
     397 |                         found = get_isa_vendor_ext(vendorid,
         |                         ^
   arch/riscv/kernel/cpufeature.c:402:9: error: use of undeclared identifier 'found'
     402 |                         if (!found) {
         |                              ^
   3 errors generated.
vim +395 arch/riscv/kernel/cpufeature.c
   370	
   371	static void __init riscv_parse_isa_string(unsigned long *this_hwcap, struct riscv_isainfo *isainfo,
   372						  struct riscv_isainfo *isavendorinfo, unsigned long vendorid,
   373						  unsigned long *isa2hwcap, const char *isa)
   374	{
   375		/*
   376		 * For all possible cpus, we have already validated in
   377		 * the boot process that they at least contain "rv" and
   378		 * whichever of "32"/"64" this kernel supports, and so this
   379		 * section can be skipped.
   380		 */
   381		isa += 4;
   382	
   383		while (*isa) {
   384			const char *ext = isa++;
   385			const char *ext_end = isa;
   386			bool ext_long = false, ext_err = false;
   387			struct riscv_isainfo *selected_isainfo = isainfo;
   388			const struct riscv_isa_ext_data *selected_riscv_isa_ext = riscv_isa_ext;
   389			size_t selected_riscv_isa_ext_count = riscv_isa_ext_count;
   390			unsigned int id_offset = 0;
   391	
   392			switch (*ext) {
   393			case 'x':
   394			case 'X':
 > 395				bool found;
   396	
 > 397				found = get_isa_vendor_ext(vendorid,
   398							   &selected_riscv_isa_ext,
   399							   &selected_riscv_isa_ext_count);
   400				selected_isainfo = isavendorinfo;
   401				id_offset = RISCV_ISA_VENDOR_EXT_BASE;
   402				if (!found) {
   403					pr_warn("No associated vendor extensions with vendor id: %lx\n",
   404						vendorid);
   405					for (; *isa && *isa != '_'; ++isa)
   406						;
   407					ext_err = true;
   408					break;
   409				}
   410				fallthrough;
   411			case 's':
   412				/*
   413				 * Workaround for invalid single-letter 's' & 'u' (QEMU).
   414				 * No need to set the bit in riscv_isa as 's' & 'u' are
   415				 * not valid ISA extensions. It works unless the first
   416				 * multi-letter extension in the ISA string begins with
   417				 * "Su" and is not prefixed with an underscore.
   418				 */
   419				if (ext[-1] != '_' && ext[1] == 'u') {
   420					++isa;
   421					ext_err = true;
   422					break;
   423				}
   424				fallthrough;
   425			case 'S':
   426			case 'z':
   427			case 'Z':
   428				/*
   429				 * Before attempting to parse the extension itself, we find its end.
   430				 * As multi-letter extensions must be split from other multi-letter
   431				 * extensions with an "_", the end of a multi-letter extension will
   432				 * either be the null character or the "_" at the start of the next
   433				 * multi-letter extension.
   434				 *
   435				 * Next, as the extensions version is currently ignored, we
   436				 * eliminate that portion. This is done by parsing backwards from
   437				 * the end of the extension, removing any numbers. This may be a
   438				 * major or minor number however, so the process is repeated if a
   439				 * minor number was found.
   440				 *
   441				 * ext_end is intended to represent the first character *after* the
   442				 * name portion of an extension, but will be decremented to the last
   443				 * character itself while eliminating the extensions version number.
   444				 * A simple re-increment solves this problem.
   445				 */
   446				ext_long = true;
   447				for (; *isa && *isa != '_'; ++isa)
   448					if (unlikely(!isalnum(*isa)))
   449						ext_err = true;
   450	
   451				ext_end = isa;
   452				if (unlikely(ext_err))
   453					break;
   454	
   455				if (!isdigit(ext_end[-1]))
   456					break;
   457	
   458				while (isdigit(*--ext_end))
   459					;
   460	
   461				if (tolower(ext_end[0]) != 'p' || !isdigit(ext_end[-1])) {
   462					++ext_end;
   463					break;
   464				}
   465	
   466				while (isdigit(*--ext_end))
   467					;
   468	
   469				++ext_end;
   470				break;
   471			default:
   472				/*
   473				 * Things are a little easier for single-letter extensions, as they
   474				 * are parsed forwards.
   475				 *
   476				 * After checking that our starting position is valid, we need to
   477				 * ensure that, when isa was incremented at the start of the loop,
   478				 * that it arrived at the start of the next extension.
   479				 *
   480				 * If we are already on a non-digit, there is nothing to do. Either
   481				 * we have a multi-letter extension's _, or the start of an
   482				 * extension.
   483				 *
   484				 * Otherwise we have found the current extension's major version
   485				 * number. Parse past it, and a subsequent p/minor version number
   486				 * if present. The `p` extension must not appear immediately after
   487				 * a number, so there is no fear of missing it.
   488				 *
   489				 */
   490				if (unlikely(!isalpha(*ext))) {
   491					ext_err = true;
   492					break;
   493				}
   494	
   495				if (!isdigit(*isa))
   496					break;
   497	
   498				while (isdigit(*++isa))
   499					;
   500	
   501				if (tolower(*isa) != 'p')
   502					break;
   503	
   504				if (!isdigit(*++isa)) {
   505					--isa;
   506					break;
   507				}
   508	
   509				while (isdigit(*++isa))
   510					;
   511	
   512				break;
   513			}
   514	
   515			/*
   516			 * The parser expects that at the start of an iteration isa points to the
   517			 * first character of the next extension. As we stop parsing an extension
   518			 * on meeting a non-alphanumeric character, an extra increment is needed
   519			 * where the succeeding extension is a multi-letter prefixed with an "_".
   520			 */
   521			if (*isa == '_')
   522				++isa;
   523	
   524			if (unlikely(ext_err))
   525				continue;
   526			if (!ext_long) {
   527				int nr = tolower(*ext) - 'a';
   528	
   529				if (riscv_isa_extension_check(nr)) {
   530					*this_hwcap |= isa2hwcap[nr];
   531					set_bit(nr, isainfo->isa);
   532				}
   533			} else {
   534				for (int i = 0; i < selected_riscv_isa_ext_count; i++)
   535					match_isa_ext(&selected_riscv_isa_ext[i], ext,
   536						      ext_end, selected_isainfo,
   537						      id_offset);
   538			}
   539		}
   540	}
   541	
When alternatives are disabled, riscv_cpu_has_extension_(un)likely() checks whether all cpus support the selected extension before falling back to checking whether the current cpu supports it. It is sufficient to check only whether the current cpu supports the extension.
The alternatives code that handles the case where all cpus support an extension is factored out into a new function to support this.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/include/asm/cpufeature.h | 84 +++++++++++++++++++++----------------
 1 file changed, 48 insertions(+), 36 deletions(-)
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index b5f4eedcfa86..db2ab037843a 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -90,22 +90,13 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, unsigned i
 	__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext)
 
 static __always_inline bool
-riscv_has_extension_likely(const unsigned long ext)
+__riscv_has_extension_likely_alternatives(const unsigned long ext)
 {
-	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
-			   "ext must be < RISCV_ISA_EXT_MAX");
-
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
-		asm goto(
-			ALTERNATIVE("j %l[l_no]", "nop", 0, %[ext], 1)
-			:
-			: [ext] "i" (ext)
-			:
-			: l_no);
-	} else {
-		if (!__riscv_isa_extension_available(NULL, ext))
-			goto l_no;
-	}
+	asm goto(ALTERNATIVE("j %l[l_no]", "nop", 0, %[ext], 1)
+		 :
+		 : [ext] "i" (ext)
+		 :
+		 : l_no);
 
 	return true;
 l_no:
@@ -113,42 +104,63 @@ riscv_has_extension_likely(const unsigned long ext)
 }
 
 static __always_inline bool
-riscv_has_extension_unlikely(const unsigned long ext)
+__riscv_has_extension_unlikely_alternatives(const unsigned long ext)
 {
-	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
-			   "ext must be < RISCV_ISA_EXT_MAX");
-
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
-		asm goto(
-			ALTERNATIVE("nop", "j %l[l_yes]", 0, %[ext], 1)
-			:
-			: [ext] "i" (ext)
-			:
-			: l_yes);
-	} else {
-		if (__riscv_isa_extension_available(NULL, ext))
-			goto l_yes;
-	}
+	asm goto(ALTERNATIVE("nop", "j %l[l_yes]", 0, %[ext], 1)
+		 :
+		 : [ext] "i" (ext)
+		 :
+		 : l_yes);
 
 	return false;
 l_yes:
 	return true;
 }
 
+static __always_inline bool
+riscv_has_extension_likely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_likely_alternatives(ext);
+	else
+		return __riscv_isa_extension_available(NULL, ext);
+}
+
+static __always_inline bool
+riscv_has_extension_unlikely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_unlikely_alternatives(ext);
+	else
+		return __riscv_isa_extension_available(NULL, ext);
+}
+
 static __always_inline bool riscv_cpu_has_extension_likely(int cpu, const unsigned long ext)
 {
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_likely(ext))
-		return true;
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");
 
-	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && __riscv_has_extension_likely_alternatives(ext))
+		return true;
+	else
+		return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
 }
 
 static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsigned long ext)
 {
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_unlikely(ext))
-		return true;
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");
 
-	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && __riscv_has_extension_unlikely_alternatives(ext))
+		return true;
+	else
+		return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
 }
 
 #endif
On Thu, Apr 11, 2024 at 09:11:13PM -0700, Charlie Jenkins wrote:
When alternatives are disabled, riscv_cpu_has_extension_(un)likely() checks whether all cpus support the selected extension before falling back to checking whether the current cpu supports it. It is sufficient to check only whether the current cpu supports the extension.
The alternatives code that handles the case where all cpus support an extension is factored out into a new function to support this.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsigned long ext)
{
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_unlikely(ext))
-		return true;
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");

-	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && __riscv_has_extension_unlikely_alternatives(ext))
+		return true;
+	else
+		return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
}

static __always_inline bool riscv_cpu_has_extension_likely(int cpu, const unsigned long ext)
{
	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_likely(ext))
		return true;

	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
}
This is the code as things stand. If alternatives are disabled, the if statement becomes if (0 && foo), which will lead to the function call getting constant folded away, and all you end up with is the call to __riscv_isa_extension_available(). Unless I am missing something, I don't think this patch has any effect?
Thanks, Conor.
On Fri, Apr 12, 2024 at 11:40:38AM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:13PM -0700, Charlie Jenkins wrote:
When alternatives are disabled, riscv_cpu_has_extension_(un)likely() checks whether all cpus support the selected extension before falling back to checking whether the current cpu supports it. It is sufficient to check only whether the current cpu supports the extension.
The alternatives code that handles the case where all cpus support an extension is factored out into a new function to support this.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsigned long ext)
{
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_unlikely(ext))
-		return true;
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");

-	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && __riscv_has_extension_unlikely_alternatives(ext))
+		return true;
+	else
+		return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
}

static __always_inline bool riscv_cpu_has_extension_likely(int cpu, const unsigned long ext)
{
	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_likely(ext))
		return true;

	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
}
This is the code as things stand. If alternatives are disabled, the if statement becomes if (0 && foo), which will lead to the function call getting constant folded away, and all you end up with is the call to __riscv_isa_extension_available(). Unless I am missing something, I don't think this patch has any effect?
Yeah I fumbled this one it appears. I got thrown off by the nested IS_ENABLED(CONFIG_RISCV_ALTERNATIVE). This patch eliminates the need for this and may avoid confusion in the future.
- Charlie
Thanks, Conor.
On Fri, Apr 12, 2024 at 10:34:28AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 11:40:38AM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:13PM -0700, Charlie Jenkins wrote:
When alternatives are disabled, riscv_cpu_has_extension_(un)likely() checks whether all cpus support the selected extension before falling back to checking whether the current cpu supports it. It is sufficient to check only whether the current cpu supports the extension.
The alternatives code that handles the case where all cpus support an extension is factored out into a new function to support this.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsigned long ext)
{
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_unlikely(ext))
-		return true;
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");

-	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && __riscv_has_extension_unlikely_alternatives(ext))
+		return true;
+	else
+		return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
}

static __always_inline bool riscv_cpu_has_extension_likely(int cpu, const unsigned long ext)
{
	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_likely(ext))
		return true;

	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
}
This is the code as things stand. If alternatives are disabled, the if statement becomes if (0 && foo), which will lead to the function call getting constant folded away, and all you end up with is the call to __riscv_isa_extension_available(). Unless I am missing something, I don't think this patch has any effect?
Yeah I fumbled this one it appears. I got thrown off by the nested IS_ENABLED(CONFIG_RISCV_ALTERNATIVE). This patch eliminates the need for this and may avoid confusion in the future.
I think it just creates unneeded functions and can/should be dropped.
Create vendor variants of the existing extension helpers. If the existing functions were instead modified to support vendor extensions, a branch based on the ext value being greater than RISCV_ISA_VENDOR_EXT_BASE would have to be introduced. This additional branch would have an unnecessary performance impact.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/include/asm/cpufeature.h | 54 +++++++++++++++++++++++++++++++++++++
 arch/riscv/kernel/cpufeature.c      | 34 ++++++++++++++++++++---
 2 files changed, 84 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index db2ab037843a..8f19e3681b4f 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -89,6 +89,10 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, unsigned i
 #define riscv_isa_extension_available(isa_bitmap, ext) \
 	__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext)
 
+bool __riscv_isa_vendor_extension_available(const unsigned long *vendor_isa_bitmap, unsigned int bit);
+#define riscv_isa_vendor_extension_available(isa_bitmap, ext) \
+	__riscv_isa_vendor_extension_available(isa_bitmap, RISCV_ISA_VENDOR_EXT_##ext)
+
 static __always_inline bool
 __riscv_has_extension_likely_alternatives(const unsigned long ext)
 {
@@ -117,6 +121,8 @@ __riscv_has_extension_unlikely_alternatives(const unsigned long ext)
 	return true;
 }
 
+/* Standard extension helpers */
+
 static __always_inline bool
 riscv_has_extension_likely(const unsigned long ext)
 {
@@ -163,4 +169,52 @@ static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsi
 	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
 }
 
+/* Vendor extension helpers */
+
+static __always_inline bool
+riscv_has_vendor_extension_likely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_likely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(NULL, ext);
+}
+
+static __always_inline bool
+riscv_has_vendor_extension_unlikely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_unlikely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(NULL, ext);
+}
+
+static __always_inline bool riscv_cpu_has_vendor_extension_likely(int cpu, const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_likely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(hart_isa_vendor[cpu].isa, ext);
+}
+
+static __always_inline bool riscv_cpu_has_vendor_extension_unlikely(int cpu, const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_unlikely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(hart_isa_vendor[cpu].isa, ext);
+}
+
 #endif
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index f72fbdd0d7f5..41a4d2028428 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -78,6 +78,29 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, unsigned i
 }
 EXPORT_SYMBOL_GPL(__riscv_isa_extension_available);
 
+/**
+ * __riscv_isa_vendor_extension_available() - Check whether given vendor
+ * extension is available or not
+ *
+ * @isa_bitmap: ISA bitmap to use
+ * @bit: bit position of the desired extension
+ * Return: true or false
+ *
+ * NOTE: If isa_bitmap is NULL then Host ISA bitmap will be used.
+ */
+bool __riscv_isa_vendor_extension_available(const unsigned long *isa_bitmap, unsigned int bit)
+{
+	const unsigned long *bmap = (isa_bitmap) ? isa_bitmap : riscv_isa_vendor;
+
+	bit -= RISCV_ISA_VENDOR_EXT_BASE;
+
+	if (bit < 0 || bit >= RISCV_ISA_VENDOR_EXT_MAX)
+		return false;
+
+	return test_bit(bit, bmap) ? true : false;
+}
+EXPORT_SYMBOL_GPL(__riscv_isa_vendor_extension_available);
+
 static bool riscv_isa_extension_check(int id)
 {
 	switch (id) {
@@ -930,14 +953,17 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
 
 	id = PATCH_ID_CPUFEATURE_ID(alt->patch_id);
 
-	if (id >= RISCV_ISA_EXT_MAX) {
+	if (id >= RISCV_ISA_VENDOR_EXT_BASE) {
+		if (!__riscv_isa_vendor_extension_available(NULL, id))
+			continue;
+	} else if (id < RISCV_ISA_EXT_MAX) {
+		if (!__riscv_isa_extension_available(NULL, id))
+			continue;
+	} else {
 		WARN(1, "This extension id:%d is not in ISA extension list", id);
 		continue;
 	}
 
-	if (!__riscv_isa_extension_available(NULL, id))
-		continue;
-
 	value = PATCH_ID_CPUFEATURE_VALUE(alt->patch_id);
 	if (!riscv_cpufeature_patch_check(id, value))
 		continue;
On Thu, Apr 11, 2024 at 09:11:14PM -0700, Charlie Jenkins wrote:
Create vendor variants of the existing extension helpers. If the existing functions were instead modified to support vendor extensions, a branch based on the ext value being greater than RISCV_ISA_VENDOR_EXT_BASE would have to be introduced. This additional branch would have an unnecessary performance impact.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
I've not looked at the "main" patch in the series that adds all of the probing and structures for representing this info yet beyond a cursory glance, but it feels like we're duplicating a bunch of infrastructure here before it is necessary. The IDs are all internal to Linux, so I'd rather we kept everything in the same structure until we have more than a handful of vendor extensions. With this patch (and the theadpmu stuff) we will have three vendor extensions which feels like a drop in the bucket compared to the standard ones.
 arch/riscv/include/asm/cpufeature.h | 54 +++++++++++++++++++++++++++++++++++++
 arch/riscv/kernel/cpufeature.c      | 34 ++++++++++++++++++++---
 2 files changed, 84 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index db2ab037843a..8f19e3681b4f 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -89,6 +89,10 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, unsigned i
 #define riscv_isa_extension_available(isa_bitmap, ext) \
 	__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext)
 
+bool __riscv_isa_vendor_extension_available(const unsigned long *vendor_isa_bitmap, unsigned int bit);
+#define riscv_isa_vendor_extension_available(isa_bitmap, ext) \
+	__riscv_isa_vendor_extension_available(isa_bitmap, RISCV_ISA_VENDOR_EXT_##ext)
+
 static __always_inline bool
 __riscv_has_extension_likely_alternatives(const unsigned long ext)
 {
@@ -117,6 +121,8 @@ __riscv_has_extension_unlikely_alternatives(const unsigned long ext)
 	return true;
 }
 
+/* Standard extension helpers */
+
 static __always_inline bool
 riscv_has_extension_likely(const unsigned long ext)
 {
@@ -163,4 +169,52 @@ static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsi
 	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
 }
 
+/* Vendor extension helpers */
+
+static __always_inline bool
+riscv_has_vendor_extension_likely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_likely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(NULL, ext);
+}
+
+static __always_inline bool
+riscv_has_vendor_extension_unlikely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_unlikely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(NULL, ext);
+}
+
+static __always_inline bool riscv_cpu_has_vendor_extension_likely(int cpu, const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_likely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(hart_isa_vendor[cpu].isa, ext);
+}
+
+static __always_inline bool riscv_cpu_has_vendor_extension_unlikely(int cpu, const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_unlikely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(hart_isa_vendor[cpu].isa, ext);
+}
Same stuff about constant folding applies to these, I think these should just mirror the existing functions (if needed at all).
Cheers, Conor.
On Fri, Apr 12, 2024 at 12:49:57PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:14PM -0700, Charlie Jenkins wrote:
Create vendor variants of the existing extension helpers. If the existing functions were instead modified to support vendor extensions, a branch based on the ext value being greater than RISCV_ISA_VENDOR_EXT_BASE would have to be introduced. This additional branch would have an unnecessary performance impact.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
I've not looked at the "main" patch in the series that adds all of the probing and structures for representing this info yet beyond a cursory glance, but it feels like we're duplicating a bunch of infrastructure here before it is necessary. The IDs are all internal to Linux, so I'd rather we kept everything in the same structure until we have more than a handful of vendor extensions. With this patch (and the theadpmu stuff) we will have three vendor extensions which feels like a drop in the bucket compared to the standard ones.
It is not duplicating infrastructure. If we merge this into the existing infrastructure, we would be littering __riscv_isa_extension_available() with if (ext > RISCV_ISA_VENDOR_EXT_BASE) checks. This is particularly important precisely because we have so few vendor extensions currently, so this check would be irrelevant in the vast majority of cases.
It is also unnecessary to push off the refactoring until we have some "sufficient" number of vendor extensions that would justify changing the infrastructure, when I already have the patch available here. This does not introduce any extra overhead to existing functions and will be able to support vendors into the future.
- Charlie
 arch/riscv/include/asm/cpufeature.h | 54 +++++++++++++++++++++++++++++++++++++
 arch/riscv/kernel/cpufeature.c      | 34 ++++++++++++++++++++---
 2 files changed, 84 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index db2ab037843a..8f19e3681b4f 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -89,6 +89,10 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, unsigned i
 #define riscv_isa_extension_available(isa_bitmap, ext) \
 	__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext)
 
+bool __riscv_isa_vendor_extension_available(const unsigned long *vendor_isa_bitmap, unsigned int bit);
+#define riscv_isa_vendor_extension_available(isa_bitmap, ext) \
+	__riscv_isa_vendor_extension_available(isa_bitmap, RISCV_ISA_VENDOR_EXT_##ext)
+
 static __always_inline bool
 __riscv_has_extension_likely_alternatives(const unsigned long ext)
 {
@@ -117,6 +121,8 @@ __riscv_has_extension_unlikely_alternatives(const unsigned long ext)
 	return true;
 }
 
+/* Standard extension helpers */
+
 static __always_inline bool
 riscv_has_extension_likely(const unsigned long ext)
 {
@@ -163,4 +169,52 @@ static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsi
 	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
 }
 
+/* Vendor extension helpers */
+
+static __always_inline bool
+riscv_has_vendor_extension_likely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_likely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(NULL, ext);
+}
+
+static __always_inline bool
+riscv_has_vendor_extension_unlikely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_unlikely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(NULL, ext);
+}
+
+static __always_inline bool riscv_cpu_has_vendor_extension_likely(int cpu, const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_likely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(hart_isa_vendor[cpu].isa, ext);
+}
+
+static __always_inline bool riscv_cpu_has_vendor_extension_unlikely(int cpu, const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_VENDOR_EXT_MAX,
+			   "ext must be < RISCV_ISA_VENDOR_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE))
+		return __riscv_has_extension_unlikely_alternatives(ext);
+	else
+		return __riscv_isa_vendor_extension_available(hart_isa_vendor[cpu].isa, ext);
+}
Same stuff about constant folding applies to these, I think these should just mirror the existing functions (if needed at all).
Cheers, Conor.
On Fri, Apr 12, 2024 at 10:43:02AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 12:49:57PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:14PM -0700, Charlie Jenkins wrote:
Create vendor variants of the existing extension helpers. If the existing functions were instead modified to support vendor extensions, a branch based on the ext value being greater than RISCV_ISA_VENDOR_EXT_BASE would have to be introduced. This additional branch would have an unnecessary performance impact.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
I've not looked at the "main" patch in the series that adds all of the probing and structures for representing this info yet beyond a cursory glance, but it feels like we're duplicating a bunch of infrastructure here before it is necessary. The IDs are all internal to Linux, so I'd rather we kept everything in the same structure until we have more than a handful of vendor extensions. With this patch (and the theadpmu stuff) we will have three vendor extensions which feels like a drop in the bucket compared to the standard ones.
It is not duplicating infrastructure. If we merge this into the existing infrastructure, we would be littering __riscv_isa_extension_available() with if (ext > RISCV_ISA_VENDOR_EXT_BASE) checks. This is particularly important precisely because we have so few vendor extensions currently, so this check would be irrelevant in the vast majority of cases.
That's only because of your implementation. The existing vendor extension works fine without this littering. That's another thing actually, you forgot to convert over the user we already have :)
It is also unnecessary to push off the refactoring until we have some "sufficient" number of vendor extensions that would justify changing the infrastructure, when I already have the patch available here. This does not introduce any extra overhead to existing functions and will be able to support vendors into the future.
Yeah, maybe that's true but this was my gut reaction before reading the other patch in detail (which I've still yet to do).
On Fri, Apr 12, 2024 at 09:40:03PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:43:02AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 12:49:57PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:14PM -0700, Charlie Jenkins wrote:
Create vendor variants of the existing extension helpers. If the existing functions were instead modified to support vendor extensions, a branch based on the ext value being greater than RISCV_ISA_VENDOR_EXT_BASE would have to be introduced. This additional branch would have an unnecessary performance impact.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
I've not looked at the "main" patch in the series that adds all of the probing and structures for representing this info yet beyond a cursory glance, but it feels like we're duplicating a bunch of infrastructure here before it is necessary. The IDs are all internal to Linux, so I'd rather we kept everything in the same structure until we have more than a handful of vendor extensions. With this patch (and the theadpmu stuff) we will have three vendor extensions which feels like a drop in the bucket compared to the standard ones.
It is not duplicating infrastructure. If we merge this into the existing infrastructure, we would be littering __riscv_isa_extension_available() with if (ext > RISCV_ISA_VENDOR_EXT_BASE) checks. This is particularly important precisely because we have so few vendor extensions currently, so this check would be irrelevant in the vast majority of cases.
That's only because of your implementation. The existing vendor extension works fine without this littering. That's another thing actually, you forgot to convert over the user we already have :)
Oh right, I will convert them over. The fundamental goal of this patch is to allow a way for vendors to support their own extensions without needing to populate riscv_isa_ext. This is to create separation between vendors so they do not impact each other.
xlinuxenvcfg does not fit into this scheme, however. This scheme assumes that a hart cannot have multiple vendors, which that extension breaks. xlinuxenvcfg is really filling a hole in the standard isa that is applicable to all vendors and does not appear in the device tree, so it is okay for that to live outside this scheme.
It is also unnecessary to push off the refactoring until we have some "sufficient" number of vendor extensions that would justify changing the infrastructure, when I already have the patch available here. This does not introduce any extra overhead to existing functions and will be able to support vendors into the future.
Yeah, maybe that's true but this was my gut reaction before reading the other patch in detail (which I've still yet to do).
- Charlie
On Fri, Apr 12, 2024 at 02:03:48PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 09:40:03PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:43:02AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 12:49:57PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:14PM -0700, Charlie Jenkins wrote:
Create vendor variants of the existing extension helpers. If the existing functions were instead modified to support vendor extensions, a branch based on the ext value being greater than RISCV_ISA_VENDOR_EXT_BASE would have to be introduced. This additional branch would have an unnecessary performance impact.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
I've not looked at the "main" patch in the series that adds all of the probing and structures for representing this info yet beyond a cursory glance, but it feels like we're duplicating a bunch of infrastructure here before it is necessary. The IDs are all internal to Linux, so I'd rather we kept everything in the same structure until we have more than a handful of vendor extensions. With this patch (and the theadpmu stuff) we will have three vendor extensions which feels like a drop in the bucket compared to the standard ones.
It is not duplicating infrastructure. If we merge this into the existing infrastructure, we would be littering __riscv_isa_extension_available() with if (ext > RISCV_ISA_VENDOR_EXT_BASE) checks. This is particularly important precisely because we have so few vendor extensions currently, so this check would be irrelevant in the vast majority of cases.
That's only because of your implementation. The existing vendor extension works fine without this littering. That's another thing actually, you forgot to convert over the user we already have :)
Oh right, I will convert them over. The fundamental goal of this patch is to allow a way for vendors to support their own extensions without needing to populate riscv_isa_ext. This is to create separation between vendors so they do not impact each other.
The one that needs converting is xandespmu. As I said on the other patch a minute ago, I don't think isolating vendors for the internal representation is needed and can be left in hwprobe. I also don't think we can rely on a behaviour of "SiFive CPUs will always have SiFive's mvendorid" or that kinda thing, I've heard talk of the SoC vendor getting their mvendorid for custom CPU cores instead of the CPU vendor and it's possible for the SBI implementation to "adjust" the values also.
xlinuxenvcfg does not fit into this scheme however. This scheme assumes that a hart cannot have multiple vendors, which that extension breaks. xlinuxenvcfg is really filling a hole in the standard isa that is applicable to all vendors and does not appear in the device tree, so it is okay for that to live outside this scheme.
Ye, xlinuxenvcfg is an internal pseudo-extension that should be treated more like a standard one than something vendor.
It is also unnecessary to push off the refactoring until we have some "sufficient" amount of vendor extensions to justify changing the infrastructure when I already have the patch available here. This does not introduce any extra overhead to existing functions and will be able to support vendors into the future.
Yeah, maybe that's true but this was my gut reaction before reading the other patch in detail (which I've still yet to do).
- Charlie
On Fri, Apr 12, 2024 at 10:34:10PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 02:03:48PM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 09:40:03PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 10:43:02AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 12:49:57PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:14PM -0700, Charlie Jenkins wrote:
Create vendor variants of the existing extension helpers. If the existing functions were instead modified to support vendor extensions, a branch based on the ext value being greater than RISCV_ISA_VENDOR_EXT_BASE would have to be introduced. This additional branch would have an unnecessary performance impact.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
I've not looked at the "main" patch in the series that adds all of the probing and structures for representing this info yet beyond a cursory glance, but it feels like we're duplicating a bunch of infrastructure here before it is necessary. The IDs are all internal to Linux, so I'd rather we kept everything in the same structure until we have more than a handful of vendor extensions. With this patch (and the theadpmu stuff) we will have three vendor extensions which feels like a drop in the bucket compared to the standard ones.
It is not duplicating infrastructure. If we merge this into the existing infrastructure, we would be littering if (ext > RISCV_ISA_VENDOR_EXT_BASE) in __riscv_isa_extension_available. This is particularly important exactly because we have so few vendor extensions currently, so this check would be irrelevant in the vast majority of cases.
That's only because of your implementation. The existing vendor extension works fine without this littering. That's another thing actually, you forgot to convert over the user we already have :)
Oh right, I will convert them over. The fundamental goal of this patch is to allow a way for vendors to support their own extensions without needing to populate riscv_isa_ext. This is to create separation between vendors so they do not impact each other.
The one that needs converting is xandespmu. As I said on the other patch a minute ago, I don't think isolating vendors for the internal representation is needed and can be left in hwprobe. I also don't think we can rely on a behaviour of "SiFive CPUs will always have SiFive's mvendorid" or that kinda thing, I've heard talk of the SoC vendor getting their mvendorid for custom CPU cores instead of the CPU vendor and it's possible for the SBI implementation to "adjust" the values also.
Okay, that may be possible, but that is up to the vendor when that happens. The vendor extensions are fundamentally different from the standard extensions and have even fewer guarantees of correctness, which seems like it would invite more errata if multiple vendors implement the same vendor extensions. I can extract the code into a different file for each vendor so that it is more clear.
- Charlie
xlinuxenvcfg does not fit into this scheme however. This scheme assumes that a hart cannot have multiple vendors, which that extension breaks. xlinuxenvcfg is really filling a hole in the standard isa that is applicable to all vendors and does not appear in the device tree, so it is okay for that to live outside this scheme.
Ye, xlinuxenvcfg is an internal pseudo-extension that should be treated more like a standard one than something vendor.
It is also unnecessary to push off the refactoring until we have some "sufficient" amount of vendor extensions to justify changing the infrastructure when I already have the patch available here. This does not introduce any extra overhead to existing functions and will be able to support vendors into the future.
Yeah, maybe that's true but this was my gut reaction before reading the other patch in detail (which I've still yet to do).
- Charlie
At this time, use the fallback uaccess routines rather than customizing the vectorized uaccess routines to be compatible with xtheadvector.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com --- arch/riscv/lib/uaccess.S | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S index bc22c078aba8..74bd75b673d7 100644 --- a/arch/riscv/lib/uaccess.S +++ b/arch/riscv/lib/uaccess.S @@ -15,6 +15,7 @@ SYM_FUNC_START(__asm_copy_to_user) #ifdef CONFIG_RISCV_ISA_V ALTERNATIVE("j fallback_scalar_usercopy", "nop", 0, RISCV_ISA_EXT_v, CONFIG_RISCV_ISA_V) + ALTERNATIVE("nop", "j fallback_scalar_usercopy", 0, RISCV_ISA_VENDOR_EXT_XTHEADVECTOR, CONFIG_RISCV_ISA_V) REG_L t0, riscv_v_usercopy_threshold bltu a2, t0, fallback_scalar_usercopy tail enter_vector_usercopy
From: Heiko Stuebner heiko@sntech.de
The VCSR CSR contains two elements VXRM[2:1] and VXSAT[0].
Define constants for those to access the elements in a readable way.
Acked-by: Guo Ren guoren@kernel.org Reviewed-by: Conor Dooley conor.dooley@microchip.com Signed-off-by: Heiko Stuebner heiko.stuebner@vrull.eu --- arch/riscv/include/asm/csr.h | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 2468c55933cd..13bc99c995d1 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -215,6 +215,11 @@ #define SMSTATEEN0_SSTATEEN0_SHIFT 63 #define SMSTATEEN0_SSTATEEN0 (_ULL(1) << SMSTATEEN0_SSTATEEN0_SHIFT)
+/* VCSR flags */ +#define VCSR_VXRM_MASK 3 +#define VCSR_VXRM_SHIFT 1 +#define VCSR_VXSAT_MASK 1 + /* symbolic CSR names: */ #define CSR_CYCLE 0xc00 #define CSR_TIME 0xc01
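For reference, accessing the two elements with these constants would look roughly like the following (a standalone sketch, not part of the patch):

```c
#include <stdint.h>

/* Constants from the patch: VCSR is [2:1] vxrm, [0] vxsat */
#define VCSR_VXRM_MASK  3
#define VCSR_VXRM_SHIFT 1
#define VCSR_VXSAT_MASK 1

/* Extract the rounding mode field from a VCSR value */
static inline unsigned long vcsr_vxrm(unsigned long vcsr)
{
	return (vcsr >> VCSR_VXRM_SHIFT) & VCSR_VXRM_MASK;
}

/* Extract the saturation flag from a VCSR value */
static inline unsigned long vcsr_vxsat(unsigned long vcsr)
{
	return vcsr & VCSR_VXSAT_MASK;
}
```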
On Thu, Apr 11, 2024 at 09:11:16PM -0700, Charlie Jenkins wrote:
From: Heiko Stuebner heiko@sntech.de
The VCSR CSR contains two elements VXRM[2:1] and VXSAT[0].
Define constants for those to access the elements in a readable way.
Acked-by: Guo Ren guoren@kernel.org Reviewed-by: Conor Dooley conor.dooley@microchip.com Signed-off-by: Heiko Stuebner heiko.stuebner@vrull.eu
You need to sign off on this as the submitter, Charlie.
On Fri, Apr 12, 2024 at 12:27:50PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:16PM -0700, Charlie Jenkins wrote:
From: Heiko Stuebner heiko@sntech.de
The VCSR CSR contains two elements VXRM[2:1] and VXSAT[0].
Define constants for those to access the elements in a readable way.
Acked-by: Guo Ren guoren@kernel.org Reviewed-by: Conor Dooley conor.dooley@microchip.com Signed-off-by: Heiko Stuebner heiko.stuebner@vrull.eu
You need to sign off on this as the submitter, Charlie.
I wasn't sure, thank you!
- Charlie
The VXRM vector csr for xtheadvector has an encoding of 0xa and VXSAT has an encoding of 0x9.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com --- arch/riscv/include/asm/csr.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 13bc99c995d1..e5a35efd56e0 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -219,6 +219,8 @@ #define VCSR_VXRM_MASK 3 #define VCSR_VXRM_SHIFT 1 #define VCSR_VXSAT_MASK 1 +#define VCSR_VXSAT 0x9 +#define VCSR_VXRM 0xa
/* symbolic CSR names: */ #define CSR_CYCLE 0xc00
These definitions didn't fit anywhere nicely, so create a new file to house various xtheadvector instruction encodings.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com --- arch/riscv/include/asm/xtheadvector.h | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/arch/riscv/include/asm/xtheadvector.h b/arch/riscv/include/asm/xtheadvector.h new file mode 100644 index 000000000000..348263ea164c --- /dev/null +++ b/arch/riscv/include/asm/xtheadvector.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +/* + * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older + * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for + * vsetvli t4, x0, e8, m8, d1 + */ +#define THEAD_VSETVLI_T4X0E8M8D1 ".long 0x00307ed7\n\t" +#define THEAD_VSETVLI_X0X0E8M8D1 ".long 0x00307057\n\t" + +/* + * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same + * encoding as the standard vse8.v and vle8.v, compilers seem to optimize + * the call resulting in a different encoding and then using a value for + * the "mop" field that is not part of vector-0.7.1 + * So encode specific variants for vstate_save and _restore. + */ +#define THEAD_VSB_V_V0T0 ".long 0x02028027\n\t" +#define THEAD_VSB_V_V8T0 ".long 0x02028427\n\t" +#define THEAD_VSB_V_V16T0 ".long 0x02028827\n\t" +#define THEAD_VSB_V_V24T0 ".long 0x02028c27\n\t" +#define THEAD_VLB_V_V0T0 ".long 0x012028007\n\t" +#define THEAD_VLB_V_V8T0 ".long 0x012028407\n\t" +#define THEAD_VLB_V_V16T0 ".long 0x012028807\n\t" +#define THEAD_VLB_V_V24T0 ".long 0x012028c07\n\t"
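(As a quick sanity check on the literal encodings above, the generic 32-bit RISC-V field layout can be used to confirm the destination register and major opcode — a standalone sketch, not part of the patch; t4 is x29, and 0x57/0x27/0x07 are the OP-V, STORE-FP and LOAD-FP major opcodes:)

```c
#include <stdint.h>

/* Generic RISC-V instruction fields (32-bit encodings) */
static inline unsigned int insn_opcode(uint32_t insn) { return insn & 0x7f; }
static inline unsigned int insn_rd(uint32_t insn)     { return (insn >> 7) & 0x1f; }
static inline unsigned int insn_funct3(uint32_t insn) { return (insn >> 12) & 0x7; }
```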
On Thu, Apr 11, 2024 at 09:11:18PM -0700, Charlie Jenkins wrote:
These definitions didn't fit anywhere nicely, so create a new file to house various xtheadvector instruction encodings.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
arch/riscv/include/asm/xtheadvector.h | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/arch/riscv/include/asm/xtheadvector.h b/arch/riscv/include/asm/xtheadvector.h new file mode 100644 index 000000000000..348263ea164c --- /dev/null +++ b/arch/riscv/include/asm/xtheadvector.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
+ * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
+ * vsetvli t4, x0, e8, m8, d1
+ */
+#define THEAD_VSETVLI_T4X0E8M8D1 ".long 0x00307ed7\n\t" +#define THEAD_VSETVLI_X0X0E8M8D1 ".long 0x00307057\n\t"
+/*
+ * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
+ * encoding as the standard vse8.v and vle8.v, compilers seem to optimize
+ * the call resulting in a different encoding and then using a value for
+ * the "mop" field that is not part of vector-0.7.1
+ * So encode specific variants for vstate_save and _restore.
This wording seems oddly familiar to me, did Heiko not write this?
+ */
+#define THEAD_VSB_V_V0T0 ".long 0x02028027\n\t" +#define THEAD_VSB_V_V8T0 ".long 0x02028427\n\t" +#define THEAD_VSB_V_V16T0 ".long 0x02028827\n\t" +#define THEAD_VSB_V_V24T0 ".long 0x02028c27\n\t" +#define THEAD_VLB_V_V0T0 ".long 0x012028007\n\t" +#define THEAD_VLB_V_V8T0 ".long 0x012028407\n\t" +#define THEAD_VLB_V_V16T0 ".long 0x012028807\n\t" +#define THEAD_VLB_V_V24T0 ".long 0x012028c07\n\t"
-- 2.44.0
On Fri, Apr 12, 2024 at 12:30:32PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:18PM -0700, Charlie Jenkins wrote:
These definitions didn't fit anywhere nicely, so create a new file to house various xtheadvector instruction encodings.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
arch/riscv/include/asm/xtheadvector.h | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/arch/riscv/include/asm/xtheadvector.h b/arch/riscv/include/asm/xtheadvector.h new file mode 100644 index 000000000000..348263ea164c --- /dev/null +++ b/arch/riscv/include/asm/xtheadvector.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
+ * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
+ * vsetvli t4, x0, e8, m8, d1
+ */
+#define THEAD_VSETVLI_T4X0E8M8D1 ".long 0x00307ed7\n\t" +#define THEAD_VSETVLI_X0X0E8M8D1 ".long 0x00307057\n\t"
+/*
+ * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
+ * encoding as the standard vse8.v and vle8.v, compilers seem to optimize
+ * the call resulting in a different encoding and then using a value for
+ * the "mop" field that is not part of vector-0.7.1
+ * So encode specific variants for vstate_save and _restore.
This wording seems oddly familiar to me, did Heiko not write this?
Yeah, I wasn't sure how to attribute him. He wrote almost all of the lines in this file, but I put it together into this file. What is the standard way of doing that?
- Charlie
+ */
+#define THEAD_VSB_V_V0T0 ".long 0x02028027\n\t" +#define THEAD_VSB_V_V8T0 ".long 0x02028427\n\t" +#define THEAD_VSB_V_V16T0 ".long 0x02028827\n\t" +#define THEAD_VSB_V_V24T0 ".long 0x02028c27\n\t" +#define THEAD_VLB_V_V0T0 ".long 0x012028007\n\t" +#define THEAD_VLB_V_V8T0 ".long 0x012028407\n\t" +#define THEAD_VLB_V_V16T0 ".long 0x012028807\n\t" +#define THEAD_VLB_V_V24T0 ".long 0x012028c07\n\t"
-- 2.44.0
On Fri, Apr 12, 2024 at 11:24:35AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 12:30:32PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:18PM -0700, Charlie Jenkins wrote:
These definitions didn't fit anywhere nicely, so create a new file to house various xtheadvector instruction encodings.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
arch/riscv/include/asm/xtheadvector.h | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/arch/riscv/include/asm/xtheadvector.h b/arch/riscv/include/asm/xtheadvector.h new file mode 100644 index 000000000000..348263ea164c --- /dev/null +++ b/arch/riscv/include/asm/xtheadvector.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
+ * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
+ * vsetvli t4, x0, e8, m8, d1
+ */
+#define THEAD_VSETVLI_T4X0E8M8D1 ".long 0x00307ed7\n\t" +#define THEAD_VSETVLI_X0X0E8M8D1 ".long 0x00307057\n\t"
+/*
+ * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
+ * encoding as the standard vse8.v and vle8.v, compilers seem to optimize
+ * the call resulting in a different encoding and then using a value for
+ * the "mop" field that is not part of vector-0.7.1
+ * So encode specific variants for vstate_save and _restore.
This wording seems oddly familiar to me, did Heiko not write this?
Yeah, I wasn't sure how to attribute him. He wrote almost all of the lines in this file, but I put it together into this file. What is the standard way of doing that?
The original patches have his sob and authorship, so I would at least expect co-developed-by.
On Fri, Apr 12, 2024 at 08:00:46PM +0100, Conor Dooley wrote:
On Fri, Apr 12, 2024 at 11:24:35AM -0700, Charlie Jenkins wrote:
On Fri, Apr 12, 2024 at 12:30:32PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:18PM -0700, Charlie Jenkins wrote:
These definitions didn't fit anywhere nicely, so create a new file to house various xtheadvector instruction encodings.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
arch/riscv/include/asm/xtheadvector.h | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/arch/riscv/include/asm/xtheadvector.h b/arch/riscv/include/asm/xtheadvector.h new file mode 100644 index 000000000000..348263ea164c --- /dev/null +++ b/arch/riscv/include/asm/xtheadvector.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
+ * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
+ * vsetvli t4, x0, e8, m8, d1
+ */
+#define THEAD_VSETVLI_T4X0E8M8D1 ".long 0x00307ed7\n\t" +#define THEAD_VSETVLI_X0X0E8M8D1 ".long 0x00307057\n\t"
+/*
+ * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
+ * encoding as the standard vse8.v and vle8.v, compilers seem to optimize
+ * the call resulting in a different encoding and then using a value for
+ * the "mop" field that is not part of vector-0.7.1
+ * So encode specific variants for vstate_save and _restore.
This wording seems oddly familiar to me, did Heiko not write this?
Yeah, I wasn't sure how to attribute him. He wrote almost all of the lines in this file, but I put it together into this file. What is the standard way of doing that?
The original patches have his sob and authorship, so I would at least expect co-developed-by.
Perfect, thank you for pointing me in the right direction.
- Charlie
Use alternatives to add support for xtheadvector vector save/restore routines.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com --- arch/riscv/include/asm/csr.h | 6 + arch/riscv/include/asm/vector.h | 228 +++++++++++++++++++++++++-------- arch/riscv/kernel/kernel_mode_vector.c | 4 +- arch/riscv/kernel/vector.c | 22 +++- 4 files changed, 203 insertions(+), 57 deletions(-)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index e5a35efd56e0..13657d096e7d 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -30,6 +30,12 @@ #define SR_VS_CLEAN _AC(0x00000400, UL) #define SR_VS_DIRTY _AC(0x00000600, UL)
+#define SR_VS_THEAD _AC(0x01800000, UL) /* xtheadvector Status */ +#define SR_VS_OFF_THEAD _AC(0x00000000, UL) +#define SR_VS_INITIAL_THEAD _AC(0x00800000, UL) +#define SR_VS_CLEAN_THEAD _AC(0x01000000, UL) +#define SR_VS_DIRTY_THEAD _AC(0x01800000, UL) + #define SR_XS _AC(0x00018000, UL) /* Extension Status */ #define SR_XS_OFF _AC(0x00000000, UL) #define SR_XS_INITIAL _AC(0x00008000, UL) diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h index 731dcd0ed4de..f6ca30dd7d86 100644 --- a/arch/riscv/include/asm/vector.h +++ b/arch/riscv/include/asm/vector.h @@ -18,6 +18,25 @@ #include <asm/cpufeature.h> #include <asm/csr.h> #include <asm/asm.h> +#include <asm/xtheadvector.h> + +#define __riscv_v_vstate_or(_val, TYPE) ({ \ + typeof(_val) _res = _val; \ + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) \ + _res = (_res & ~SR_VS_THEAD) | SR_VS_##TYPE##_THEAD; \ + else \ + _res = (_res & ~SR_VS) | SR_VS_##TYPE; \ + _res; \ +}) + +#define __riscv_v_vstate_check(_val, TYPE) ({ \ + bool _res; \ + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) \ + _res = ((_val) & SR_VS_THEAD) == SR_VS_##TYPE##_THEAD; \ + else \ + _res = ((_val) & SR_VS) == SR_VS_##TYPE; \ + _res; \ +})
extern unsigned long riscv_v_vsize; int riscv_v_setup_vsize(void); @@ -42,37 +61,43 @@ static __always_inline bool has_vector(void)
static inline void __riscv_v_vstate_clean(struct pt_regs *regs) { - regs->status = (regs->status & ~SR_VS) | SR_VS_CLEAN; + regs->status = __riscv_v_vstate_or(regs->status, CLEAN); }
static inline void __riscv_v_vstate_dirty(struct pt_regs *regs) { - regs->status = (regs->status & ~SR_VS) | SR_VS_DIRTY; + regs->status = __riscv_v_vstate_or(regs->status, DIRTY); }
static inline void riscv_v_vstate_off(struct pt_regs *regs) { - regs->status = (regs->status & ~SR_VS) | SR_VS_OFF; + regs->status = __riscv_v_vstate_or(regs->status, OFF); }
static inline void riscv_v_vstate_on(struct pt_regs *regs) { - regs->status = (regs->status & ~SR_VS) | SR_VS_INITIAL; + regs->status = __riscv_v_vstate_or(regs->status, INITIAL); }
static inline bool riscv_v_vstate_query(struct pt_regs *regs) { - return (regs->status & SR_VS) != 0; + return !__riscv_v_vstate_check(regs->status, OFF); }
static __always_inline void riscv_v_enable(void) { - csr_set(CSR_SSTATUS, SR_VS); + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) + csr_set(CSR_SSTATUS, SR_VS_THEAD); + else + csr_set(CSR_SSTATUS, SR_VS); }
static __always_inline void riscv_v_disable(void) { - csr_clear(CSR_SSTATUS, SR_VS); + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) + csr_clear(CSR_SSTATUS, SR_VS_THEAD); + else + csr_clear(CSR_SSTATUS, SR_VS); }
static __always_inline void __vstate_csr_save(struct __riscv_v_ext_state *dest) @@ -81,10 +106,47 @@ static __always_inline void __vstate_csr_save(struct __riscv_v_ext_state *dest) "csrr %0, " __stringify(CSR_VSTART) "\n\t" "csrr %1, " __stringify(CSR_VTYPE) "\n\t" "csrr %2, " __stringify(CSR_VL) "\n\t" - "csrr %3, " __stringify(CSR_VCSR) "\n\t" - "csrr %4, " __stringify(CSR_VLENB) "\n\t" : "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl), - "=r" (dest->vcsr), "=r" (dest->vlenb) : :); + "=r" (dest->vcsr) : :); + + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) { + u32 tmp_vcsr; + bool restore_fpu = false; + unsigned long status = csr_read(CSR_SSTATUS); + + /* + * CSR_VCSR is defined as + * [2:1] - vxrm[1:0] + * [0] - vxsat + * The earlier vector spec implemented by T-Head uses separate + * registers for the same bit-elements, so just combine those + * into the existing output field. + * + * Additionally T-Head cores need FS to be enabled when accessing + * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions. + * Though the cores do not implement the VXRM and VXSAT fields in the + * FCSR CSR that vector-0.7.1 specifies. + */ + if ((status & SR_FS) == SR_FS_OFF) { + csr_set(CSR_SSTATUS, (status & ~SR_FS) | SR_FS_CLEAN); + restore_fpu = true; + } + + asm volatile ( + "csrr %[tmp_vcsr], " __stringify(VCSR_VXRM) "\n\t" + "slliw %[vcsr], %[tmp_vcsr], " __stringify(VCSR_VXRM_SHIFT) "\n\t" + "csrr %[tmp_vcsr], " __stringify(VCSR_VXSAT) "\n\t" + "or %[vcsr], %[vcsr], %[tmp_vcsr]\n\t" + : [vcsr] "=r" (dest->vcsr), [tmp_vcsr] "=&r" (tmp_vcsr)); + + if (restore_fpu) + csr_set(CSR_SSTATUS, status); + } else { + asm volatile ( + "csrr %[vcsr], " __stringify(CSR_VCSR) "\n\t" + "csrr %[vlenb], " __stringify(CSR_VLENB) "\n\t" + : [vcsr] "=r" (dest->vcsr), [vlenb] "=r" (dest->vlenb)); + } }
static __always_inline void __vstate_csr_restore(struct __riscv_v_ext_state *src) @@ -95,9 +157,37 @@ static __always_inline void __vstate_csr_restore(struct __riscv_v_ext_state *src "vsetvl x0, %2, %1\n\t" ".option pop\n\t" "csrw " __stringify(CSR_VSTART) ", %0\n\t" - "csrw " __stringify(CSR_VCSR) ", %3\n\t" - : : "r" (src->vstart), "r" (src->vtype), "r" (src->vl), - "r" (src->vcsr) :); + : : "r" (src->vstart), "r" (src->vtype), "r" (src->vl)); + + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) { + u32 tmp_vcsr; + bool restore_fpu = false; + unsigned long status = csr_read(CSR_SSTATUS); + + /* + * Similar to __vstate_csr_save above, restore values for the + * separate VXRM and VXSAT CSRs from the vcsr variable. + */ + if ((status & SR_FS) == SR_FS_OFF) { + csr_set(CSR_SSTATUS, (status & ~SR_FS) | SR_FS_CLEAN); + restore_fpu = true; + } + + asm volatile ( + "srliw %[tmp_vcsr], %[vcsr], " __stringify(VCSR_VXRM_SHIFT) "\n\t" + "andi %[tmp_vcsr], %[tmp_vcsr], " __stringify(VCSR_VXRM_MASK) "\n\t" + "csrw " __stringify(VCSR_VXRM) ", %[tmp_vcsr]\n\t" + "andi %[tmp_vcsr], %[vcsr], " __stringify(VCSR_VXSAT_MASK) "\n\t" + "csrw " __stringify(VCSR_VXSAT) ", %[tmp_vcsr]\n\t" + : [tmp_vcsr] "=&r" (tmp_vcsr) : [vcsr] "r" (src->vcsr)); + + if (restore_fpu) + csr_set(CSR_SSTATUS, status); + } else { + asm volatile ( + "csrw " __stringify(CSR_VCSR) ", %[vcsr]\n\t" + : : [vcsr] "r" (src->vcsr)); + } }
static inline void __riscv_v_vstate_save(struct __riscv_v_ext_state *save_to, @@ -107,19 +197,33 @@ static inline void __riscv_v_vstate_save(struct __riscv_v_ext_state *save_to,
riscv_v_enable(); __vstate_csr_save(save_to); - asm volatile ( - ".option push\n\t" - ".option arch, +v\n\t" - "vsetvli %0, x0, e8, m8, ta, ma\n\t" - "vse8.v v0, (%1)\n\t" - "add %1, %1, %0\n\t" - "vse8.v v8, (%1)\n\t" - "add %1, %1, %0\n\t" - "vse8.v v16, (%1)\n\t" - "add %1, %1, %0\n\t" - "vse8.v v24, (%1)\n\t" - ".option pop\n\t" - : "=&r" (vl) : "r" (datap) : "memory"); + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) { + asm volatile ( + "mv t0, %0\n\t" + THEAD_VSETVLI_T4X0E8M8D1 + THEAD_VSB_V_V0T0 + "add t0, t0, t4\n\t" + THEAD_VSB_V_V8T0 + "add t0, t0, t4\n\t" + THEAD_VSB_V_V16T0 + "add t0, t0, t4\n\t" + THEAD_VSB_V_V24T0 + : : "r" (datap) : "memory", "t0", "t4"); + } else { + asm volatile ( + ".option push\n\t" + ".option arch, +v\n\t" + "vsetvli %0, x0, e8, m8, ta, ma\n\t" + "vse8.v v0, (%1)\n\t" + "add %1, %1, %0\n\t" + "vse8.v v8, (%1)\n\t" + "add %1, %1, %0\n\t" + "vse8.v v16, (%1)\n\t" + "add %1, %1, %0\n\t" + "vse8.v v24, (%1)\n\t" + ".option pop\n\t" + : "=&r" (vl) : "r" (datap) : "memory"); + } riscv_v_disable(); }
@@ -129,55 +233,77 @@ static inline void __riscv_v_vstate_restore(struct __riscv_v_ext_state *restore_ unsigned long vl;
riscv_v_enable(); - asm volatile ( - ".option push\n\t" - ".option arch, +v\n\t" - "vsetvli %0, x0, e8, m8, ta, ma\n\t" - "vle8.v v0, (%1)\n\t" - "add %1, %1, %0\n\t" - "vle8.v v8, (%1)\n\t" - "add %1, %1, %0\n\t" - "vle8.v v16, (%1)\n\t" - "add %1, %1, %0\n\t" - "vle8.v v24, (%1)\n\t" - ".option pop\n\t" - : "=&r" (vl) : "r" (datap) : "memory"); + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) { + asm volatile ( + "mv t0, %0\n\t" + THEAD_VSETVLI_T4X0E8M8D1 + THEAD_VLB_V_V0T0 + "add t0, t0, t4\n\t" + THEAD_VLB_V_V8T0 + "add t0, t0, t4\n\t" + THEAD_VLB_V_V16T0 + "add t0, t0, t4\n\t" + THEAD_VLB_V_V24T0 + : : "r" (datap) : "memory", "t0", "t4"); + } else { + asm volatile ( + ".option push\n\t" + ".option arch, +v\n\t" + "vsetvli %0, x0, e8, m8, ta, ma\n\t" + "vle8.v v0, (%1)\n\t" + "add %1, %1, %0\n\t" + "vle8.v v8, (%1)\n\t" + "add %1, %1, %0\n\t" + "vle8.v v16, (%1)\n\t" + "add %1, %1, %0\n\t" + "vle8.v v24, (%1)\n\t" + ".option pop\n\t" + : "=&r" (vl) : "r" (datap) : "memory"); + } __vstate_csr_restore(restore_from); riscv_v_disable(); }
static inline void __riscv_v_vstate_discard(void) { - unsigned long vl, vtype_inval = 1UL << (BITS_PER_LONG - 1); + unsigned long vtype_inval = 1UL << (BITS_PER_LONG - 1);
riscv_v_enable(); + if (riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) + asm volatile (THEAD_VSETVLI_X0X0E8M8D1); + else + asm volatile ( + ".option push\n\t" + ".option arch, +v\n\t" + "vsetvli x0, x0, e8, m8, ta, ma\n\t" + ".option pop\n\t"); + asm volatile ( ".option push\n\t" ".option arch, +v\n\t" - "vsetvli %0, x0, e8, m8, ta, ma\n\t" "vmv.v.i v0, -1\n\t" "vmv.v.i v8, -1\n\t" "vmv.v.i v16, -1\n\t" "vmv.v.i v24, -1\n\t" - "vsetvl %0, x0, %1\n\t" + "vsetvl x0, x0, %0\n\t" ".option pop\n\t" - : "=&r" (vl) : "r" (vtype_inval) : "memory"); + : : "r" (vtype_inval)); + riscv_v_disable(); }
static inline void riscv_v_vstate_discard(struct pt_regs *regs) { - if ((regs->status & SR_VS) == SR_VS_OFF) - return; - - __riscv_v_vstate_discard(); - __riscv_v_vstate_dirty(regs); + if (riscv_v_vstate_query(regs)) { + __riscv_v_vstate_discard(); + __riscv_v_vstate_dirty(regs); + } }
static inline void riscv_v_vstate_save(struct __riscv_v_ext_state *vstate, struct pt_regs *regs) { - if ((regs->status & SR_VS) == SR_VS_DIRTY) { + if (__riscv_v_vstate_check(regs->status, DIRTY)) { __riscv_v_vstate_save(vstate, vstate->datap); __riscv_v_vstate_clean(regs); } @@ -186,7 +312,7 @@ static inline void riscv_v_vstate_save(struct __riscv_v_ext_state *vstate, static inline void riscv_v_vstate_restore(struct __riscv_v_ext_state *vstate, struct pt_regs *regs) { - if ((regs->status & SR_VS) != SR_VS_OFF) { + if (riscv_v_vstate_query(regs)) { __riscv_v_vstate_restore(vstate, vstate->datap); __riscv_v_vstate_clean(regs); } @@ -195,7 +321,7 @@ static inline void riscv_v_vstate_restore(struct __riscv_v_ext_state *vstate, static inline void riscv_v_vstate_set_restore(struct task_struct *task, struct pt_regs *regs) { - if ((regs->status & SR_VS) != SR_VS_OFF) { + if (riscv_v_vstate_query(regs)) { set_tsk_thread_flag(task, TIF_RISCV_V_DEFER_RESTORE); riscv_v_vstate_on(regs); } diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c index 6afe80c7f03a..ad70fc581dbe 100644 --- a/arch/riscv/kernel/kernel_mode_vector.c +++ b/arch/riscv/kernel/kernel_mode_vector.c @@ -143,7 +143,7 @@ static int riscv_v_start_kernel_context(bool *is_nested)
/* Transfer the ownership of V from user to kernel, then save */ riscv_v_start(RISCV_PREEMPT_V | RISCV_PREEMPT_V_DIRTY); - if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) { + if (__riscv_v_vstate_check(task_pt_regs(current)->status, DIRTY)) { uvstate = ¤t->thread.vstate; __riscv_v_vstate_save(uvstate, uvstate->datap); } @@ -160,7 +160,7 @@ asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs) return;
depth = riscv_v_ctx_get_depth(); - if (depth == 0 && (regs->status & SR_VS) == SR_VS_DIRTY) + if (depth == 0 && __riscv_v_vstate_check(regs->status, DIRTY)) riscv_preempt_v_set_dirty();
riscv_v_ctx_depth_inc(); diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c index 6727d1d3b8f2..d8ec2757cc2e 100644 --- a/arch/riscv/kernel/vector.c +++ b/arch/riscv/kernel/vector.c @@ -33,10 +33,24 @@ int riscv_v_setup_vsize(void) { unsigned long this_vsize;
- /* There are 32 vector registers with vlenb length. */ - riscv_v_enable(); - this_vsize = csr_read(CSR_VLENB) * 32; - riscv_v_disable(); + /* + * This is called before alternatives have been patched so can't use + * riscv_has_vendor_extension_unlikely + */ + if (__riscv_isa_vendor_extension_available(NULL, RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) { + /* + * Although xtheadvector states that th.vlenb exists and + * overlaps with the vector 1.0 extension, an illegal + * instruction is raised if read. These systems all currently + * have a fixed vector length of 128, so hardcode that value. + */ + this_vsize = 128; + } else { + /* There are 32 vector registers with vlenb length. */ + riscv_v_enable(); + this_vsize = csr_read(CSR_VLENB) * 32; + riscv_v_disable(); + }
if (!riscv_v_vsize) { riscv_v_vsize = this_vsize;
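(The VCSR splitting and recombining performed in __vstate_csr_save()/__vstate_csr_restore() above reduces to the following round trip — a plain-C sketch of the bit manipulation only, with the CSR accesses omitted:)

```c
#include <stdint.h>

#define VCSR_VXRM_MASK  3
#define VCSR_VXRM_SHIFT 1
#define VCSR_VXSAT_MASK 1

/* Save path: combine the separate T-Head vxrm/vxsat CSR values into vcsr */
static unsigned long thead_pack_vcsr(unsigned long vxrm, unsigned long vxsat)
{
	return (vxrm << VCSR_VXRM_SHIFT) | (vxsat & VCSR_VXSAT_MASK);
}

/* Restore path: split vcsr back into the values written to the two CSRs */
static void thead_unpack_vcsr(unsigned long vcsr, unsigned long *vxrm,
			      unsigned long *vxsat)
{
	*vxrm = (vcsr >> VCSR_VXRM_SHIFT) & VCSR_VXRM_MASK;
	*vxsat = vcsr & VCSR_VXSAT_MASK;
}
```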
Ensure that hwprobe does not flag "v" when xtheadvector is present.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com --- arch/riscv/kernel/sys_hwprobe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c index 8cae41a502dd..e0a42c851511 100644 --- a/arch/riscv/kernel/sys_hwprobe.c +++ b/arch/riscv/kernel/sys_hwprobe.c @@ -69,7 +69,7 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, if (riscv_isa_extension_available(NULL, c)) pair->value |= RISCV_HWPROBE_IMA_C;
- if (has_vector()) + if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) pair->value |= RISCV_HWPROBE_IMA_V;
/* @@ -112,7 +112,7 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, EXT_KEY(ZACAS); EXT_KEY(ZICOND);
- if (has_vector()) { + if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) { EXT_KEY(ZVBB); EXT_KEY(ZVBC); EXT_KEY(ZVKB);
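(From userspace, the intended effect is simply that the V bit stays clear on xtheadvector systems; a consumer check reduces to bit tests like the following. This is a sketch — the RISCV_HWPROBE_IMA_* values are assumptions mirroring the uapi header, and the hwprobe syscall itself is omitted:)

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror asm/hwprobe.h; treat these values as illustrative */
#define RISCV_HWPROBE_IMA_FD (1 << 0)
#define RISCV_HWPROBE_IMA_C  (1 << 1)
#define RISCV_HWPROBE_IMA_V  (1 << 2)

/* Given the value returned for the IMA_EXT_0 key, check for ratified V */
static bool have_standard_vector(uint64_t ima_ext0)
{
	return ima_ext0 & RISCV_HWPROBE_IMA_V;
}
```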
On Thu, Apr 11, 2024 at 09:11:20PM -0700, Charlie Jenkins wrote:
-	if (has_vector())
+	if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
Hmm, I think this is "dangerous". has_vector() is used across the kernel now in several places for the in-kernel vector. I don't think that has_vector() should return true for the T-Head stuff given that & has_vector() should represent the ratified spec. I'll have to think about this one and how nasty this makes any of the save/restore code etc.
On Fri, Apr 12, 2024 at 4:35 AM Conor Dooley conor.dooley@microchip.com wrote:
On Thu, Apr 11, 2024 at 09:11:20PM -0700, Charlie Jenkins wrote:
-	if (has_vector())
+	if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
Hmm, I think this is "dangerous". has_vector() is used across the kernel now in several places for the in-kernel vector. I don't think that has_vector() should return true for the T-Head stuff given that & has_vector() should represent the ratified spec. I'll have to think about this one and how nasty this makes any of the save/restore code etc.
Yeah, my nose crinkled here as well. If you're having to do a vendorish thing in this generic spot, then others may too, suggesting perhaps this isn't the cleanest way to go about it. Ideally extensions are all additive, rather than subtractive, I guess?
On Fri, Apr 12, 2024 at 10:04:42AM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 4:35 AM Conor Dooley conor.dooley@microchip.com wrote:
On Thu, Apr 11, 2024 at 09:11:20PM -0700, Charlie Jenkins wrote:
-	if (has_vector())
+	if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
Hmm, I think this is "dangerous". has_vector() is used across the kernel now in several places for the in-kernel vector. I don't think that has_vector() should return true for the T-Head stuff given that & has_vector() should represent the ratified spec. I'll have to think about this one and how nasty this makes any of the save/restore code etc.
Yeah, my nose crinkled here as well. If you're having to do a vendorish thing in this generic spot, then others may too, suggesting perhaps this isn't the cleanest way to go about it. Ideally extensions are all additive, rather than subtractive, I guess?
This was the "easiest" way to support this but I agree this is not ideal. The vector code is naturally coupled with having support for "v" and I wanted to leverage that. The other concern is all of the ifdefs for having V enabled. I can make all of those V or XTHEADVECTOR; that will increase the surface area of xtheadvector but it is probably the right(?) way to go.
- Charlie
On Fri, Apr 12, 2024 at 11:22 AM Charlie Jenkins charlie@rivosinc.com wrote:
This was the "easiest" way to support this but I agree this is not ideal. The vector code is naturally coupled with having support for "v" and I wanted to leverage that. The other concern is all of the ifdefs for having V enabled. I can make all of those V or XTHEADVECTOR; that will increase the surface area of xtheadvector but it is probably the right(?) way to go.
For the ifdefs, if you've got a Kconfig somewhere for THEAD_VECTOR, can't that just depend on the V config? We'd end up with the limitation that you can't add V 0.7 support without also dragging in V1.0 support, but that's probably fine, right?
-Evan
On Fri, Apr 12, 2024 at 03:08:31PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 11:22 AM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 10:04:42AM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 4:35 AM Conor Dooley conor.dooley@microchip.com wrote:
On Thu, Apr 11, 2024 at 09:11:20PM -0700, Charlie Jenkins wrote:
Ensure that hwprobe does not flag "v" when xtheadvector is present.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
arch/riscv/kernel/sys_hwprobe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c index 8cae41a502dd..e0a42c851511 100644 --- a/arch/riscv/kernel/sys_hwprobe.c +++ b/arch/riscv/kernel/sys_hwprobe.c @@ -69,7 +69,7 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, if (riscv_isa_extension_available(NULL, c)) pair->value |= RISCV_HWPROBE_IMA_C;
if (has_vector())
if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
Hmm, I think this is "dangerous". has_vector() is used across the kernel now in several places for the in-kernel vector. I don't think that has_vector() should return true for the T-Head stuff given that & has_vector() should represent the ratified spec. I'll have to think about this one and how nasty this makes any of the save/restore code etc.
Yeah, my nose crinkled here as well. If you're having to do a vendorish thing in this generic spot, then others may too, suggesting perhaps this isn't the cleanest way to go about it. Ideally extensions are all additive, rather than subtractive, I guess?
This was the "easiest" way to support this but I agree this is not ideal. The vector code is naturally coupled with having support for "v" and I wanted to leverage that. The other concern is all of the ifdefs for having V enabled. I can make all of those V or XTHEADVECTOR; that will increase the surface area of xtheadvector but it is probably the right(?) way to go.
For the ifdefs, if you've got a Kconfig somewhere for THEAD_VECTOR, can't that just depend on the V config? We'd end up with the limitation that you can't add V 0.7 support without also dragging in V1.0 support, but that's probably fine, right?
That's a great idea, thank you for the suggestion.
- Charlie
-Evan
xtheadvector is not vector 1.0 compatible, but it can leverage all of the same save/restore routines as vector plus riscv_v_first_use_handler(). vector 1.0 and xtheadvector are mutually exclusive so there is no risk of overlap.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/kernel/cpufeature.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 41a4d2028428..59f628b1341c 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -647,9 +647,13 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
 		 * Many vendors with T-Head CPU cores which implement the 0.7.1
 		 * version of the vector specification put "v" into their DTs.
 		 * CPU cores with the ratified spec will contain non-zero
-		 * marchid.
+		 * marchid. Only allow "v" to be set if xtheadvector is present.
 		 */
-		if (acpi_disabled && this_vendorid == THEAD_VENDOR_ID &&
+		if (__riscv_isa_vendor_extension_available(isavendorinfo->isa,
+							   RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
+			this_hwcap |= isa2hwcap[RISCV_ISA_EXT_v];
+			set_bit(RISCV_ISA_EXT_v, isainfo->isa);
+		} else if (acpi_disabled && this_vendorid == THEAD_VENDOR_ID &&
 		    this_archid == 0x0) {
 			this_hwcap &= ~isa2hwcap[RISCV_ISA_EXT_v];
 			clear_bit(RISCV_ISA_EXT_v, isainfo->isa);
@@ -776,6 +780,15 @@ static int __init riscv_fill_hwcap_from_ext_list(unsigned long *isa2hwcap)

 		of_node_put(cpu_node);

+		/*
+		 * Enable kernel vector routines if xtheadvector is present
+		 */
+		if (__riscv_isa_vendor_extension_available(isavendorinfo->isa,
+							   RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
+			this_hwcap |= isa2hwcap[RISCV_ISA_EXT_v];
+			set_bit(RISCV_ISA_EXT_v, isainfo->isa);
+		}
+
 		/*
 		 * All "okay" harts should have same isa. Set HWCAP based on
 		 * common capabilities of every "okay" hart, in case they don't.
On Thu, Apr 11, 2024 at 09:11:21PM -0700, Charlie Jenkins wrote:
xtheadvector is not vector 1.0 compatible, but it can leverage all of the same save/restore routines as vector plus riscv_v_first_use_handler(). vector 1.0 and xtheadvector are mutually exclusive so there is no risk of overlap.
I think this is not okay to do - if a program checks hwcap to see if vector is supported, it'll get told it is on T-Head systems where only the 0.7.1 version is.
On Fri, Apr 12, 2024 at 12:37:08PM +0100, Conor Dooley wrote:
On Thu, Apr 11, 2024 at 09:11:21PM -0700, Charlie Jenkins wrote:
xtheadvector is not vector 1.0 compatible, but it can leverage all of the same save/restore routines as vector plus riscv_v_first_use_handler(). vector 1.0 and xtheadvector are mutually exclusive so there is no risk of overlap.
I think this not okay to do - if a program checks hwcap to see if vector is supported they'll get told it is on T-Head system where only the 0.7.1 is.
That's fair. I did remove it from the hwprobe result but this is kind of a gross way of doing it. I'll mess around with this so this isn't necessary.
- Charlie
Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_0" which allows userspace to probe for the new RISCV_ISA_VENDOR_EXT_XTHEADVECTOR vendor extension.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/include/asm/hwprobe.h      |  4 +--
 arch/riscv/include/uapi/asm/hwprobe.h | 10 +++++-
 arch/riscv/kernel/sys_hwprobe.c       | 59 +++++++++++++++++++++++++++++++++--
 3 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 630507dff5ea..e68496b4f8de 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
 /*
- * Copyright 2023 Rivos, Inc
+ * Copyright 2023-2024 Rivos, Inc
  */

 #ifndef _ASM_HWPROBE_H
@@ -8,7 +8,7 @@

 #include <uapi/asm/hwprobe.h>

-#define RISCV_HWPROBE_MAX_KEY 6
+#define RISCV_HWPROBE_MAX_KEY 7

 static inline bool riscv_hwprobe_key_is_valid(__s64 key)
 {
diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
index 9f2a8e3ff204..6614d3adfc75 100644
--- a/arch/riscv/include/uapi/asm/hwprobe.h
+++ b/arch/riscv/include/uapi/asm/hwprobe.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
 /*
- * Copyright 2023 Rivos, Inc
+ * Copyright 2023-2024 Rivos, Inc
  */

 #ifndef _UAPI_ASM_HWPROBE_H
@@ -67,6 +67,14 @@ struct riscv_hwprobe {
 #define  RISCV_HWPROBE_MISALIGNED_UNSUPPORTED	(4 << 0)
 #define  RISCV_HWPROBE_MISALIGNED_MASK		(7 << 0)
 #define RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE	6
+/*
+ * It is not possible for one CPU to have multiple vendor ids, so each vendor
+ * has its own vendor extension "namespace". The keys for each vendor start
+ * at zero.
+ */
+#define RISCV_HWPROBE_KEY_VENDOR_EXT_0	7
+/* T-Head */
+#define  RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR	(1 << 0)
 /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */
 /* Flags */
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index e0a42c851511..365ce7380443 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -69,7 +69,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
 		if (riscv_isa_extension_available(NULL, c))
 			pair->value |= RISCV_HWPROBE_IMA_C;

-	if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
+	if (has_vector() &&
+	    !__riscv_isa_vendor_extension_available(NULL, RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
 		pair->value |= RISCV_HWPROBE_IMA_V;

 	/*
@@ -112,7 +113,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
 		EXT_KEY(ZACAS);
 		EXT_KEY(ZICOND);

-	if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
+	if (has_vector() &&
+	    !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
 		EXT_KEY(ZVBB);
 		EXT_KEY(ZVBC);
 		EXT_KEY(ZVKB);
@@ -139,6 +141,55 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
 	pair->value &= ~missing;
 }

+static void hwprobe_isa_vendor_ext0(struct riscv_hwprobe *pair,
+				    const struct cpumask *cpus)
+{
+	int cpu;
+	u64 missing = 0;
+
+	pair->value = 0;
+
+	struct riscv_hwprobe mvendorid = {
+		.key = RISCV_HWPROBE_KEY_MVENDORID,
+		.value = 0
+	};
+
+	hwprobe_arch_id(&mvendorid, cpus);
+
+	/* Set value to zero if CPUs in the set do not have the same vendor. */
+	if (mvendorid.value == -1ULL)
+		return;
+
+	/*
+	 * Loop through and record vendor extensions that 1) anyone has, and
+	 * 2) anyone doesn't have.
+	 */
+	for_each_cpu(cpu, cpus) {
+		struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];
+
+#define VENDOR_EXT_KEY(ext)							\
+	do {									\
+		if (__riscv_isa_vendor_extension_available(isavendorinfo->isa,	\
+							   RISCV_ISA_VENDOR_EXT_##ext)) \
+			pair->value |= RISCV_HWPROBE_VENDOR_EXT_##ext;		\
+		else								\
+			missing |= RISCV_HWPROBE_VENDOR_EXT_##ext;		\
+	} while (false)
+
+		/*
+		 * Only use VENDOR_EXT_KEY() for extensions which can be exposed
+		 * to userspace, regardless of the kernel's configuration, as no
+		 * other checks, besides presence in the hart_vendor_isa bitmap,
+		 * are made.
+		 */
+		VENDOR_EXT_KEY(XTHEADVECTOR);
+
+#undef VENDOR_EXT_KEY
+	}
+
+	/* Now turn off reporting features if any CPU is missing it. */
+	pair->value &= ~missing;
+}
+
 static bool hwprobe_ext0_has(const struct cpumask *cpus, unsigned long ext)
 {
 	struct riscv_hwprobe pair;
@@ -216,6 +267,10 @@ static void hwprobe_one_pair(struct riscv_hwprobe *pair,
 		pair->value = riscv_cboz_block_size;
 		break;

+	case RISCV_HWPROBE_KEY_VENDOR_EXT_0:
+		hwprobe_isa_vendor_ext0(pair, cpus);
+		break;
+
 	/*
 	 * For forward compatibility, unknown keys don't fail the whole
 	 * call, but get their element key set to -1 and value set to 0
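For reference, the reporting rule this loop implements — an extension is reported only if every CPU in the mask has it — can be modeled in plain user-space C. This is an illustrative sketch, not kernel code; the helper name is made up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_EXT_XTHEADVECTOR (1ULL << 0)

/*
 * Model of the aggregation in hwprobe_isa_vendor_ext0(): OR together what
 * any CPU reports, track what any CPU lacks, and report only the
 * intersection. isa[] holds one vendor-extension bitmask per CPU.
 */
static uint64_t common_vendor_exts(const uint64_t *isa, size_t ncpus)
{
	uint64_t value = 0, missing = 0;

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		/* Record extensions that 1) anyone has, and 2) anyone doesn't. */
		value |= isa[cpu];
		missing |= ~isa[cpu];
	}

	/* Now turn off reporting of features any CPU is missing. */
	return value & ~missing;
}
```

The two accumulators mirror pair->value and missing in the patch; the final mask is the AND across all CPUs in the set.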
On Thu, Apr 11, 2024 at 09:11:22PM -0700, Charlie Jenkins wrote:
+		/*
+		 * Only use VENDOR_EXT_KEY() for extensions which can be exposed
+		 * to userspace, regardless of the kernel's configuration, as no
+		 * other checks, besides presence in the hart_vendor_isa bitmap,
+		 * are made.
+		 */
+		VENDOR_EXT_KEY(XTHEADVECTOR);
Reading the comment here, I don't think you can do this. All vector support in userspace is contingent on kernel configuration options.
On Thu, Apr 11, 2024 at 9:12 PM Charlie Jenkins charlie@rivosinc.com wrote:
Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_0" which allows userspace to probe for the new RISCV_ISA_VENDOR_EXT_XTHEADVECTOR vendor extension.
+		VENDOR_EXT_KEY(XTHEADVECTOR);
+
+#undef VENDOR_EXT_KEY
Hey Charlie,

Thanks for writing this up! At the very least I think the THEAD-specific stuff should probably end up in its own file, otherwise it'll get chaotic with vendors clamoring to add stuff right here. What do you think about this approach:

 * We leave RISCV_HWPROBE_MAX_KEY as the max key for the "generic world", eg 6-ish.
 * We define that any key above 0x8000000000000000 is in the vendor space, so the meaning of the keys depends first on the mvendorid value.
 * In the kernel code, each new vendor adds on to a global struct, which might look something like:

struct hwprobe_vendor_space vendor_space[] = {
	{
		.mvendorid = VENDOR_THEAD,
		.max_hwprobe_key = THEAD_MAX_HWPROBE_KEY, // currently 1, or 0x8000000000000001 with what you've got.
		.hwprobe_fn = thead_hwprobe
	},
	...
};

 * A hwprobe_thead.c implements thead_hwprobe(), and is called whenever the generic hwprobe encounters a key >= 0x8000000000000000.
 * Generic code for setting up the VDSO can then still call the vendor-specific hwprobe_fn() repeatedly with an "all CPUs" mask from the base to max_hwprobe_key and set up the cached tables in userspace.
 * Since the VDSO data has limited space, we may have to cap the number of vendor keys we cache to be lower than max_hwprobe_key. Since the data itself is not exposed to usermode, we can raise this cap later if needed.
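The proposed routing can be sketched in user-space C roughly as follows. The struct layout, VENDOR_THEAD/thead_hwprobe names, and the hwprobe_fn signature are hypothetical, following the sketch in this mail — not an existing kernel interface:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_KEY_BASE 0x8000000000000000ULL
#define VENDOR_THEAD    0x5b7ULL	/* T-Head's mvendorid */

/* One entry per vendor; keys >= VENDOR_KEY_BASE are dispatched through it. */
struct hwprobe_vendor_space {
	uint64_t mvendorid;
	uint64_t max_hwprobe_key;
	uint64_t (*hwprobe_fn)(uint64_t key);
};

static uint64_t thead_hwprobe(uint64_t key)
{
	/* Key 0 in T-Head's namespace: xtheadvector supported. */
	return (key == VENDOR_KEY_BASE) ? 1 : 0;
}

static const struct hwprobe_vendor_space vendor_space[] = {
	{ VENDOR_THEAD, VENDOR_KEY_BASE + 0, thead_hwprobe },
};

/* Generic hwprobe path: route vendor keys by the CPUs' common mvendorid. */
static uint64_t hwprobe_one(uint64_t mvendorid, uint64_t key)
{
	if (key < VENDOR_KEY_BASE)
		return 0;	/* generic keys handled elsewhere */

	for (size_t i = 0; i < sizeof(vendor_space) / sizeof(vendor_space[0]); i++) {
		const struct hwprobe_vendor_space *vs = &vendor_space[i];

		if (vs->mvendorid == mvendorid && key <= vs->max_hwprobe_key)
			return vs->hwprobe_fn(key);
	}

	return 0;	/* unknown vendor or out-of-range key */
}
```

A key in the vendor range resolves to a value only when the cpumask's vendor matches a table entry; everything else falls through to "not supported".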
-Evan
On Fri, Apr 12, 2024 at 10:05:21AM -0700, Evan Green wrote:
On Thu, Apr 11, 2024 at 9:12 PM Charlie Jenkins charlie@rivosinc.com wrote:
Hey Charlie, Thanks for writing this up! At the very least I think the THEAD-specific stuff should probably end up in its own file, otherwise it'll get chaotic with vendors clamoring to add stuff right here.
Great idea!
What do you think about this approach:

 * We define that any key above 0x8000000000000000 is in the vendor space, so the meaning of the keys depends first on the mvendorid value.

[...]
I know vendor extensions are kind of the "wild west" of riscv, but in spite of that I want to design a consistent API. The issue I have with using this "vendor space" to expose vendor extensions is that reporting extension support is inherently the same for all vendors. A vendor space like this seems more applicable to something like "RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE", where a vendor has a specific value they would like to expose. I do agree that having a vendor space is a good design choice, but I am not convinced that vendor extensions are the proper use case.
By having RISCV_HWPROBE_KEY_VENDOR_EXT_0 we can expose the vendor extensions in the same way that standard extensions are exposed, with a bitmask representing each extension. If these are instead in the vendor space, each vendor would probably be inclined to introduce a key like RISCV_HWPROBE_KEY_THEAD_EXT_0 that returns a bitmask of all of the thead vendor extensions. This duplicated effort is what I am trying to avoid. The alternative would be that vendors have a separate key for each vendor extension they would like to expose, but that is strictly less efficient than the existing bitmask probing.
Do you think that having the vendor space is appropriate for vendor extensions given my concerns?
- Charlie
-Evan
+	}
+
+	/* Now turn off reporting features if any CPU is missing it. */
+	pair->value &= ~missing;
+}
 static bool hwprobe_ext0_has(const struct cpumask *cpus, unsigned long ext)
 {
 	struct riscv_hwprobe pair;

@@ -216,6 +267,10 @@ static void hwprobe_one_pair(struct riscv_hwprobe *pair,
 		pair->value = riscv_cboz_block_size;
 		break;

+	case RISCV_HWPROBE_KEY_VENDOR_EXT_0:
+		hwprobe_isa_vendor_ext0(pair, cpus);
+		break;
+
 	/*
 	 * For forward compatibility, unknown keys don't fail the whole
 	 * call, but get their element key set to -1 and value set to 0
--
2.44.0
On Fri, Apr 12, 2024 at 11:17 AM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 10:05:21AM -0700, Evan Green wrote:
On Thu, Apr 11, 2024 at 9:12 PM Charlie Jenkins charlie@rivosinc.com wrote:
I know vendor extensions are kind of the "wild west" of riscv, but in spite of that I want to design a consistent API. The issue I had with having this "vendor space" for exposing vendor extensions was that this is something that is inherently the same for all vendors. I see a vendor space like this more applicable for something like "RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE" where a vendor has a specific value they would like to expose. I do agree that having a vendor space is a good design choice, but I am not convinced that vendor extensions are the proper use-case.
By having RISCV_HWPROBE_KEY_VENDOR_EXT_0 we can expose the vendor extensions in the same way that standard extensions are exposed, with a bitmask representing each extension. If these are instead in the vendor space, each vendor would probably be inclined to introduce a key like RISCV_HWPROBE_KEY_THEAD_EXT_0 that returns a bitmask of all of the thead vendor extensions. This duplicated effort is what I am trying to avoid. The alternative would be that vendors have a separate key for each vendor extension they would like to expose, but that is strictly less efficient than the existing bitmask probing.
Do you think that having the vendor space is appropriate for vendor extensions given my concerns?
I do see what you're going for. It's tidy for a bitmask to just let anyone allocate the next bit, but leaves you with the same problem when a vendor decides they want to expose an enum, or decides they want to expose a bazillion things. I think a generalized version of the approach you've written would be: simply let vendors allocate keys from the same global space we're already using. My worry was that it would turn into an expansive suburban sprawl of mostly dead bits, or in the case of vendor-specific keys, full of "if (mvendor_id() != MINE) return 0;". My hope with the vendored keyspace is it would keep the sprawl from polluting the general array of (hopefully valuable) info with stuff that's likely to become less relevant as time passes. It also lowers the bar a bit to make it easier for vendors to expose bits, as they don't consume global space for everyone for all of time, just themselves.
So yes, personally I'm still in the camp of siloing the vendor stuff off to its own area. -Evan
On Fri, Apr 12, 2024 at 12:07:46PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 11:17 AM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 10:05:21AM -0700, Evan Green wrote:
On Thu, Apr 11, 2024 at 9:12 PM Charlie Jenkins charlie@rivosinc.com wrote:
I do see what you're going for. It's tidy for a bitmask to just let anyone allocate the next bit, but leaves you with the same problem when a vendor decides they want to expose an enum, or decides they want to expose a bazillion things. I think a generalized version of
This patch is strictly to expose if a vendor extension is supported, how does exposing enums factor in here?
the approach you've written would be: simply let vendors allocate keys from the same global space we're already using. My worry was that it
I am missing how my proposal suggests allowing vendors to allocate keys in a global space.
would turn into an expansive suburban sprawl of mostly dead bits, or in the case of vendor-specific keys, full of "if (mvendor_id() != MINE) return 0;". My hope with the vendored keyspace is it would keep
An application will always need to check the vendorid before relying on a vendor-specific hwprobe feature. If that support is exposed as a key above 1 << 63, the application passes the vendor-specific key and interprets the vendor-specific value. If it is what I have proposed here, the user calls the standardized vendor-extension hwprobe key and then interprets the result based on the vendor of the cpumask. In both cases they need to check the vendorid of the cpumask. (In the test case I added I failed to check the vendorid, but I should have.)
the sprawl from polluting the general array of (hopefully valuable) info with stuff that's likely to become less relevant as time passes. It also lowers the bar a bit to make it easier for vendors to expose bits, as they don't consume global space for everyone for all of time, just themselves.
The vendor keys are tied directly to the vendor. So as it grows we would have something like:
#define RISCV_HWPROBE_KEY_VENDOR_EXT_0	7

/* T-Head */
#define RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR	(1 << 0)
#define RISCV_HWPROBE_VENDOR_EXT_XTHEAD2	(1 << 1)
#define RISCV_HWPROBE_VENDOR_EXT_XTHEAD3	(1 << 2)

/* Vendor 2 */
#define RISCV_HWPROBE_VENDOR_EXT_XVENDOR1	(1 << 0)
#define RISCV_HWPROBE_VENDOR_EXT_XVENDOR2	(1 << 1)

/* Vendor 3 */
...
The keys overlap between vendors. To determine which extension a vendor supports, hwprobe gets data from hart_isa_vendor[cpu]. If the vendor is vendor 2, it is not possible for a vendor extension from vendor 3 to end up in there. Only the extensions from that vendor can be supported by that vendor's hardware.
So yes, personally I'm still in the camp of siloing the vendor stuff off to its own area.
I don't quite see how what I have proposed doesn't "silo" the extensions that pertain to each vendor since the keys are specific to each vendor.
- Charlie
-Evan
On Fri, Apr 12, 2024 at 1:20 PM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 12:07:46PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 11:17 AM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 10:05:21AM -0700, Evan Green wrote:
On Thu, Apr 11, 2024 at 9:12 PM Charlie Jenkins charlie@rivosinc.com wrote:
Gotcha. You're right I had misinterpreted this, thinking XTHEADVECTOR was a valid bit regardless of mvendorid, and that other vendors would have to choose new bits for their features and always return 0 for XTHEADVECTOR. With your explanation, it seems like you're allocating keys (in no particular order) whose meaning will change based on mvendorid.
I guess I'm still not convinced that saving each vendor from having to add a VENDOR_EXT key in their keyspace is worth the sacrifice of spraying the vendor-specific keys across the generic keyspace. Are there advantages to having a single key whose category is similar but whose bits are entirely vendor-defined? Maybe if I were userspace and my feature could be satisfied equivalently by XTHEADVECTOR or XRIVOSOTHERTHING, then I could do one hwprobe call instead of two? But I don't think the vendors are going to be consistent enough for that equivalency to ever prove useful.

The advantages in my head of the separate vendor keyspace are:
* Keeps the kernel code simple: if key >= (1 << 63) vendor_config->do_hwprobe(), rather than having all these little calls in each specific switch case for vendor_config->do_vendor_ext0(), vendor_config->do_vendor_ext1(), etc.
* It extends easily into passing other forms of vendor hwprobe info later, rather than solving only the case of risc-v extensions now, and then having to do this all again for each additional category of vendor data.
* Similarly, it discourages future vendors from trying to squint and find a way to make a vaguely generic-sounding category for their own hwprobe key which will ultimately only ever be filled in by them anyway.
-Evan
On Fri, Apr 12, 2024 at 02:43:01PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 1:20 PM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 12:07:46PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 11:17 AM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 10:05:21AM -0700, Evan Green wrote:
On Thu, Apr 11, 2024 at 9:12 PM Charlie Jenkins charlie@rivosinc.com wrote:
Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_0" which allows userspace to probe for the new RISCV_ISA_VENDOR_EXT_XTHEADVECTOR vendor extension.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com
---
 arch/riscv/include/asm/hwprobe.h      |  4 +--
 arch/riscv/include/uapi/asm/hwprobe.h | 10 +++++-
 arch/riscv/kernel/sys_hwprobe.c       | 59 +++++++++++++++++++++++++++++++++--
 3 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 630507dff5ea..e68496b4f8de 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
 /*
- * Copyright 2023 Rivos, Inc
+ * Copyright 2023-2024 Rivos, Inc
  */

 #ifndef _ASM_HWPROBE_H
@@ -8,7 +8,7 @@

 #include <uapi/asm/hwprobe.h>

-#define RISCV_HWPROBE_MAX_KEY 6
+#define RISCV_HWPROBE_MAX_KEY 7

 static inline bool riscv_hwprobe_key_is_valid(__s64 key)
 {
diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
index 9f2a8e3ff204..6614d3adfc75 100644
--- a/arch/riscv/include/uapi/asm/hwprobe.h
+++ b/arch/riscv/include/uapi/asm/hwprobe.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
 /*
- * Copyright 2023 Rivos, Inc
+ * Copyright 2023-2024 Rivos, Inc
  */

 #ifndef _UAPI_ASM_HWPROBE_H
@@ -67,6 +67,14 @@ struct riscv_hwprobe {
 #define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED (4 << 0)
 #define RISCV_HWPROBE_MISALIGNED_MASK (7 << 0)
 #define RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE 6
+/*
+ * It is not possible for one CPU to have multiple vendor ids, so each vendor
+ * has its own vendor extension "namespace". The keys for each vendor starts
+ * at zero.
+ */
+#define RISCV_HWPROBE_KEY_VENDOR_EXT_0 7
+ /* T-Head */
+#define RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR (1 << 0)
 /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */

 /* Flags */
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index e0a42c851511..365ce7380443 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -69,7 +69,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
     if (riscv_isa_extension_available(NULL, c))
         pair->value |= RISCV_HWPROBE_IMA_C;

-    if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
+    if (has_vector() &&
+        !__riscv_isa_vendor_extension_available(NULL, RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
         pair->value |= RISCV_HWPROBE_IMA_V;

     /*
@@ -112,7 +113,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
         EXT_KEY(ZACAS);
         EXT_KEY(ZICOND);

-        if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
+        if (has_vector() &&
+            !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
             EXT_KEY(ZVBB);
             EXT_KEY(ZVBC);
             EXT_KEY(ZVKB);
@@ -139,6 +141,55 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
     pair->value &= ~missing;
 }

+static void hwprobe_isa_vendor_ext0(struct riscv_hwprobe *pair,
+                                    const struct cpumask *cpus)
+{
+    int cpu;
+    u64 missing = 0;
+
+    pair->value = 0;
+
+    struct riscv_hwprobe mvendorid = {
+        .key = RISCV_HWPROBE_KEY_MVENDORID,
+        .value = 0
+    };
+
+    hwprobe_arch_id(&mvendorid, cpus);
+
+    /* Set value to zero if CPUs in the set do not have the same vendor. */
+    if (mvendorid.value == -1ULL)
+        return;
+
+    /*
+     * Loop through and record vendor extensions that 1) anyone has, and
+     * 2) anyone doesn't have.
+     */
+    for_each_cpu(cpu, cpus) {
+        struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];
+
+#define VENDOR_EXT_KEY(ext) \
+    do { \
+        if (__riscv_isa_vendor_extension_available(isavendorinfo->isa, \
+                                                   RISCV_ISA_VENDOR_EXT_##ext)) \
+            pair->value |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
+        else \
+            missing |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
+    } while (false)
+
+        /*
+         * Only use VENDOR_EXT_KEY() for extensions which can be exposed to userspace,
+         * regardless of the kernel's configuration, as no other checks, besides
+         * presence in the hart_vendor_isa bitmap, are made.
+         */
+        VENDOR_EXT_KEY(XTHEADVECTOR);
+
+#undef VENDOR_EXT_KEY
Hey Charlie, Thanks for writing this up! At the very least I think the THEAD-specific stuff should probably end up in its own file, otherwise it'll get chaotic with vendors clamoring to add stuff right here.
Great idea!
What do you think about this approach:
- We leave RISCV_HWPROBE_MAX_KEY as the max key for the "generic
world", eg 6-ish
- We define that any key above 0x8000000000000000 is in the vendor
space, so the meaning of the keys depends first on the mvendorid value.
- In the kernel code, each new vendor adds on to a global struct,
which might look something like:

    struct hwprobe_vendor_space vendor_space[] = {
        {
            .mvendorid = VENDOR_THEAD,
            .max_hwprobe_key = THEAD_MAX_HWPROBE_KEY, // currently 1 or 0x8000000000000001 with what you've got.
            .hwprobe_fn = thead_hwprobe
        },
        ...
    };
- A hwprobe_thead.c implements thead_hwprobe(), and is called
whenever the generic hwprobe encounters a key >=0x8000000000000000.
- Generic code for setting up the VDSO can then still call the
vendor-specific hwprobe_fn() repeatedly with an "all CPUs" mask from the base to max_hwprobe_key and set up the cached tables in userspace.
- Since the VDSO data has limited space we may have to cap the number
of vendor keys we cache to be lower than max_hwprobe_key. Since the data itself is not exposed to usermode we can raise this cap later if needed.
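As a rough, compilable illustration of the dispatch Evan sketches above (the struct name, the `VENDOR_THEAD` value, the key layout, and the callback signature are all assumptions taken from this proposal, not real kernel code), the key-range routing could look like:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace simulation of the proposed dispatch: everything at or above
 * the vendor base key is routed through a per-vendor table.
 */
#define RISCV_HWPROBE_VENDOR_BASE 0x8000000000000000ULL
#define VENDOR_THEAD 0x5b7ULL /* T-Head's mvendorid (assumed) */

struct riscv_hwprobe { uint64_t key; uint64_t value; };

static int thead_hwprobe(struct riscv_hwprobe *pair)
{
    /* Vendor-local key 0 could carry a T-Head extension bitmask. */
    if (pair->key == RISCV_HWPROBE_VENDOR_BASE) {
        pair->value = 1 << 0; /* e.g. xtheadvector */
        return 0;
    }
    return -1;
}

struct hwprobe_vendor_space {
    uint64_t mvendorid;
    uint64_t max_hwprobe_key;
    int (*hwprobe_fn)(struct riscv_hwprobe *pair);
};

static const struct hwprobe_vendor_space vendor_space[] = {
    { VENDOR_THEAD, RISCV_HWPROBE_VENDOR_BASE + 1, thead_hwprobe },
};

/* Generic entry point: delegate vendor keys, zero them for other vendors. */
static int hwprobe_one(uint64_t mvendorid, struct riscv_hwprobe *pair)
{
    if (pair->key < RISCV_HWPROBE_VENDOR_BASE)
        return 0; /* generic keys keep their existing handling */

    for (size_t i = 0; i < sizeof(vendor_space) / sizeof(vendor_space[0]); i++) {
        if (vendor_space[i].mvendorid == mvendorid &&
            pair->key <= vendor_space[i].max_hwprobe_key)
            return vendor_space[i].hwprobe_fn(pair);
    }

    pair->value = 0; /* unknown vendor or key: report nothing */
    return 0;
}
```

Under this layout a new vendor only registers one `vendor_space` entry plus their own `hwprobe_thead.c`-style handler, and the generic switch statement never grows.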
I know vendor extensions are kind of the "wild west" of riscv, but in spite of that I want to design a consistent API. The issue I had with having this "vendor space" for exposing vendor extensions was that this is something that is inherently the same for all vendors. I see a vendor space like this more applicable for something like "RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE" where a vendor has a specific value they would like to expose. I do agree that having a vendor space is a good design choice, but I am not convinced that vendor extensions are the proper use-case.
By having RISCV_HWPROBE_KEY_VENDOR_EXT_0 we can expose the vendor extensions in the same way that standard extensions are exposed, with a bitmask representing each extension. If these are instead in the vendor space, each vendor would probably be inclined to introduce a key like RISCV_HWPROBE_KEY_THEAD_EXT_0 that returns a bitmask of all of the thead vendor extensions. This duplicated effort is what I am trying to avoid. The alternative would be that vendors have a separate key for each vendor extension they would like to expose, but that is strictly less efficient than the existing bitmask probing.
Do you think that having the vendor space is appropriate for vendor extensions given my concerns?
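A minimal userspace-side sketch of the bitmask flow being proposed here; the key values mirror the patch, while `THEAD_MVENDORID` and the two-pair call shape are assumptions for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Key values mirror the patch; THEAD_MVENDORID is an assumed constant. */
#define RISCV_HWPROBE_KEY_MVENDORID 0
#define RISCV_HWPROBE_KEY_VENDOR_EXT_0 7
#define RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR (1 << 0)
#define THEAD_MVENDORID 0x5b7ULL

struct riscv_hwprobe { int64_t key; uint64_t value; };

/*
 * After a single hwprobe call fills both pairs, the VENDOR_EXT_0 bitmask
 * is only meaningful once the mvendorid of the cpumask is known: the same
 * bit position names a different extension for a different vendor.
 */
static bool cpus_have_xtheadvector(const struct riscv_hwprobe pairs[2])
{
    if (pairs[0].key != RISCV_HWPROBE_KEY_MVENDORID ||
        pairs[0].value != THEAD_MVENDORID)
        return false;
    return pairs[1].value & RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR;
}
```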
I do see what you're going for. It's tidy for a bitmask to just let anyone allocate the next bit, but leaves you with the same problem when a vendor decides they want to expose an enum, or decides they want to expose a bazillion things. I think a generalized version of
This patch is strictly to expose if a vendor extension is supported, how does exposing enums factor in here?
the approach you've written would be: simply let vendors allocate keys from the same global space we're already using. My worry was that it
I am missing how my proposal suggests allowing vendors to allocate keys in a global space.
would turn into an expansive suburban sprawl of mostly dead bits, or in the case of vendor-specific keys, full of "if (mvendor_id() != MINE) return 0;". My hope with the vendored keyspace is it would keep
An application will always need to check the vendorid before using a vendor-specific hwprobe feature. If that support is exposed as a key above 1<<63, the application has to pass the vendor-specific key and interpret the vendor-specific value. With what I have proposed here, the user calls the standardized vendor extension hwprobe endpoint and then interprets the result based on the vendor of the cpumask. In both cases they need to check the vendorid of the cpumask. In the test case I added I failed to check the vendorid, but I should have.
the sprawl from polluting the general array of (hopefully valuable) info with stuff that's likely to become less relevant as time passes. It also lowers the bar a bit to make it easier for vendors to expose bits, as they don't consume global space for everyone for all of time, just themselves.
The vendor keys are tied directly to the vendor. So as it grows we would have something like:
#define RISCV_HWPROBE_KEY_VENDOR_EXT_0 7

/* T-Head */
#define RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR (1 << 0)
#define RISCV_HWPROBE_VENDOR_EXT_XTHEAD2      (1 << 1)
#define RISCV_HWPROBE_VENDOR_EXT_XTHEAD3      (1 << 2)

/* Vendor 2 */
#define RISCV_HWPROBE_VENDOR_EXT_XVENDOR1     (1 << 0)
#define RISCV_HWPROBE_VENDOR_EXT_XVENDOR2     (1 << 1)

/* Vendor 3 */
...
The keys overlap between vendors. To determine which extension a vendor supports, hwprobe gets data from hart_isa_vendor[cpu]. If the vendor is vendor 2, it is not possible for a vendor extension from vendor 3 to end up in there. Only the extensions from that vendor can be supported by that vendor's hardware.
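The overlap can be illustrated with a toy userspace decoder; the vendor ids and the second vendor's table are made up for the example, and only the T-Head id is borrowed (as an assumption) from real hardware:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per-vendor tables: the same bit resolves to a different extension. */
struct vendor_ext { uint64_t bit; const char *name; };

static const struct vendor_ext thead_exts[] = {
    { 1 << 0, "xtheadvector" },
};

static const struct vendor_ext vendor2_exts[] = { /* hypothetical vendor */
    { 1 << 0, "xvendor1" },
};

/* Resolve a VENDOR_EXT_0 bit to a name, disambiguated by mvendorid. */
static const char *ext_name(uint64_t mvendorid, uint64_t bit)
{
    const struct vendor_ext *tbl;
    size_t n;

    switch (mvendorid) {
    case 0x5b7: /* T-Head (assumed id) */
        tbl = thead_exts;
        n = sizeof(thead_exts) / sizeof(thead_exts[0]);
        break;
    case 0x999: /* made-up second vendor */
        tbl = vendor2_exts;
        n = sizeof(vendor2_exts) / sizeof(vendor2_exts[0]);
        break;
    default:
        return NULL;
    }

    for (size_t i = 0; i < n; i++)
        if (tbl[i].bit == bit)
            return tbl[i].name;
    return NULL;
}
```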
Gotcha. You're right I had misinterpreted this, thinking XTHEADVECTOR was a valid bit regardless of mvendorid, and that other vendors would have to choose new bits for their features and always return 0 for XTHEADVECTOR. With your explanation, it seems like you're allocating keys (in no particular order) whose meaning will change based on mvendorid.
I guess I'm still not convinced that saving each vendor from having to add a VENDOR_EXT key in their keyspace is worth the sacrifice of spraying the vendor-specific keys across the generic keyspace. Are there advantages to having a single key whose category is similar but whose bits are entirely vendor-defined? Maybe if I were userspace and my feature could be satisfied equivalently by XTHEADVECTOR or XRIVOSOTHERTHING, then I could do one hwprobe call instead of two? But I don't think the vendors are going to be consistent enough for that equivalency to ever prove useful. The advantages in my head of the separate vendor keyspace are:
- Keeps the kernel code simple: if key >= (1 << 63)
vendor_config->do_hwprobe(), rather than having all these little calls in each specific switch case for vendor_config->do_vendor_ext0(), vendor_config->do_vendor_ext1(), etc.
The consistency between vendors is guaranteed in this scheme. They just add the extension to hwprobe_isa_vendor_ext0. The following code is the critical code from the kernel:
for_each_cpu(cpu, cpus) {
    struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];

#define VENDOR_EXT_KEY(ext) \
    do { \
        if (__riscv_isa_vendor_extension_available(isavendorinfo->isa, \
                                                   RISCV_ISA_VENDOR_EXT_##ext)) \
            pair->value |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
        else \
            missing |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
    } while (false)

    /*
     * Only use VENDOR_EXT_KEY() for extensions which can be exposed to userspace,
     * regardless of the kernel's configuration, as no other checks, besides
     * presence in the hart_vendor_isa bitmap, are made.
     */
    VENDOR_EXT_KEY(XTHEADVECTOR);

#undef VENDOR_EXT_KEY
}

/* Now turn off reporting features if any CPU is missing it. */
pair->value &= ~missing;
The only thing a vendor will have to do is add an entry below VENDOR_EXT_KEY(XTHEADVECTOR) with their extension name (of course populating a value for the key as well). This existing code will then check if the extension is compatible with the hardware and appropriately populate the bitmask. All vendors get this functionality for "free" without needing to write the boilerplate code to expose vendor extensions through hwprobe.
Now that I write this out I do see that I overlooked that this code needs to check the vendorid to ensure that the given extension is actually associated with the vendorid. This would make this more complicated but still seems like a low barrier to entry for a new vendor, as well as a standard API for getting all vendor extensions that are available on the platform regardless of which platform is being used.
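The missing guard could be as simple as wrapping each vendor's VENDOR_EXT_KEY block in an mvendorid check. A standalone sketch of that shape, with illustrative constants rather than the real kernel values:

```c
#include <assert.h>
#include <stdint.h>

#define THEAD_MVENDORID 0x5b7ULL /* assumed T-Head mvendorid */
#define EXT_XTHEADVECTOR (1 << 0)

/*
 * Only report a vendor's bits when the probed CPUs actually carry that
 * vendor's mvendorid; every other vendor gets its own guarded block.
 */
static uint64_t vendor_ext0_value(uint64_t mvendorid, uint64_t isa_bitmap)
{
    uint64_t value = 0;

    if (mvendorid == THEAD_MVENDORID) {
        /* T-Head block: VENDOR_EXT_KEY(XTHEADVECTOR), ... */
        if (isa_bitmap & EXT_XTHEADVECTOR)
            value |= EXT_XTHEADVECTOR;
    }

    return value;
}
```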
- It extends easily into passing other forms of vendor hwprobe info
later, rather than solving only the case of risc-v extensions now, and then having to do this all again for each additional category of vendor data.
This is a great point. I do agree that a different solution will be necessary for arbitrary vendor data and I am all for making something future compatible. At the same time I don't want to get trapped into something that is suboptimal for the sake of doing less work later. There is no chance of any compatibility once we leave the realm of riscv extensions, so once a vendor needs something exported I would be happy to write the code to support that.
- Similarly, it discourages future vendors from trying to squint and
find a way to make a vaguely generic sounding category for their own hwprobe key which will ultimately only ever be filled in by them anyway.
What do you mean by this? There are no "categories" here, the vendor just writes out their extension VENDOR_EXT_KEY(XVENDOREXTENSION) and it gets shuttled to userspace on the hwprobe vendor call.
- Charlie
-Evan
On Fri, Apr 12, 2024 at 3:21 PM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 02:43:01PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 1:20 PM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 12:07:46PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 11:17 AM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 10:05:21AM -0700, Evan Green wrote:
On Thu, Apr 11, 2024 at 9:12 PM Charlie Jenkins charlie@rivosinc.com wrote:
>
> Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_0" which allows
> userspace to probe for the new RISCV_ISA_VENDOR_EXT_XTHEADVECTOR vendor
> extension.
>
> Signed-off-by: Charlie Jenkins charlie@rivosinc.com
> ---
>  arch/riscv/include/asm/hwprobe.h      |  4 +--
>  arch/riscv/include/uapi/asm/hwprobe.h | 10 +++++-
>  arch/riscv/kernel/sys_hwprobe.c       | 59 +++++++++++++++++++++++++++++++++--
>  3 files changed, 68 insertions(+), 5 deletions(-)
>
> diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
> index 630507dff5ea..e68496b4f8de 100644
> --- a/arch/riscv/include/asm/hwprobe.h
> +++ b/arch/riscv/include/asm/hwprobe.h
> @@ -1,6 +1,6 @@
>  /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>  /*
> - * Copyright 2023 Rivos, Inc
> + * Copyright 2023-2024 Rivos, Inc
>   */
>
>  #ifndef _ASM_HWPROBE_H
> @@ -8,7 +8,7 @@
>
>  #include <uapi/asm/hwprobe.h>
>
> -#define RISCV_HWPROBE_MAX_KEY 6
> +#define RISCV_HWPROBE_MAX_KEY 7
>
>  static inline bool riscv_hwprobe_key_is_valid(__s64 key)
>  {
> diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
> index 9f2a8e3ff204..6614d3adfc75 100644
> --- a/arch/riscv/include/uapi/asm/hwprobe.h
> +++ b/arch/riscv/include/uapi/asm/hwprobe.h
> @@ -1,6 +1,6 @@
>  /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>  /*
> - * Copyright 2023 Rivos, Inc
> + * Copyright 2023-2024 Rivos, Inc
>   */
>
>  #ifndef _UAPI_ASM_HWPROBE_H
> @@ -67,6 +67,14 @@ struct riscv_hwprobe {
>  #define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED (4 << 0)
>  #define RISCV_HWPROBE_MISALIGNED_MASK (7 << 0)
>  #define RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE 6
> +/*
> + * It is not possible for one CPU to have multiple vendor ids, so each vendor
> + * has its own vendor extension "namespace". The keys for each vendor starts
> + * at zero.
> + */
> +#define RISCV_HWPROBE_KEY_VENDOR_EXT_0 7
> + /* T-Head */
> +#define RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR (1 << 0)
>  /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */
>
>  /* Flags */
> diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
> index e0a42c851511..365ce7380443 100644
> --- a/arch/riscv/kernel/sys_hwprobe.c
> +++ b/arch/riscv/kernel/sys_hwprobe.c
> @@ -69,7 +69,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
>      if (riscv_isa_extension_available(NULL, c))
>          pair->value |= RISCV_HWPROBE_IMA_C;
>
> -    if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
> +    if (has_vector() &&
> +        !__riscv_isa_vendor_extension_available(NULL, RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
>          pair->value |= RISCV_HWPROBE_IMA_V;
>
>      /*
> @@ -112,7 +113,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
>          EXT_KEY(ZACAS);
>          EXT_KEY(ZICOND);
>
> -        if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
> +        if (has_vector() &&
> +            !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
>              EXT_KEY(ZVBB);
>              EXT_KEY(ZVBC);
>              EXT_KEY(ZVKB);
> @@ -139,6 +141,55 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
>      pair->value &= ~missing;
>  }
>
> +static void hwprobe_isa_vendor_ext0(struct riscv_hwprobe *pair,
> +                                    const struct cpumask *cpus)
> +{
> +    int cpu;
> +    u64 missing = 0;
> +
> +    pair->value = 0;
> +
> +    struct riscv_hwprobe mvendorid = {
> +        .key = RISCV_HWPROBE_KEY_MVENDORID,
> +        .value = 0
> +    };
> +
> +    hwprobe_arch_id(&mvendorid, cpus);
> +
> +    /* Set value to zero if CPUs in the set do not have the same vendor. */
> +    if (mvendorid.value == -1ULL)
> +        return;
> +
> +    /*
> +     * Loop through and record vendor extensions that 1) anyone has, and
> +     * 2) anyone doesn't have.
> +     */
> +    for_each_cpu(cpu, cpus) {
> +        struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];
> +
> +#define VENDOR_EXT_KEY(ext) \
> +    do { \
> +        if (__riscv_isa_vendor_extension_available(isavendorinfo->isa, \
> +                                                   RISCV_ISA_VENDOR_EXT_##ext)) \
> +            pair->value |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
> +        else \
> +            missing |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
> +    } while (false)
> +
> +        /*
> +         * Only use VENDOR_EXT_KEY() for extensions which can be exposed to userspace,
> +         * regardless of the kernel's configuration, as no other checks, besides
> +         * presence in the hart_vendor_isa bitmap, are made.
> +         */
> +        VENDOR_EXT_KEY(XTHEADVECTOR);
> +
> +#undef VENDOR_EXT_KEY
Hey Charlie, Thanks for writing this up! At the very least I think the THEAD-specific stuff should probably end up in its own file, otherwise it'll get chaotic with vendors clamoring to add stuff right here.
Great idea!
What do you think about this approach:
- We leave RISCV_HWPROBE_MAX_KEY as the max key for the "generic
world", eg 6-ish
- We define that any key above 0x8000000000000000 is in the vendor
space, so the meaning of the keys depends first on the mvendorid value.
- In the kernel code, each new vendor adds on to a global struct,
which might look something like:

    struct hwprobe_vendor_space vendor_space[] = {
        {
            .mvendorid = VENDOR_THEAD,
            .max_hwprobe_key = THEAD_MAX_HWPROBE_KEY, // currently 1 or 0x8000000000000001 with what you've got.
            .hwprobe_fn = thead_hwprobe
        },
        ...
    };
- A hwprobe_thead.c implements thead_hwprobe(), and is called
whenever the generic hwprobe encounters a key >=0x8000000000000000.
- Generic code for setting up the VDSO can then still call the
vendor-specific hwprobe_fn() repeatedly with an "all CPUs" mask from the base to max_hwprobe_key and set up the cached tables in userspace.
- Since the VDSO data has limited space we may have to cap the number
of vendor keys we cache to be lower than max_hwprobe_key. Since the data itself is not exposed to usermode we can raise this cap later if needed.
I know vendor extensions are kind of the "wild west" of riscv, but in spite of that I want to design a consistent API. The issue I had with having this "vendor space" for exposing vendor extensions was that this is something that is inherently the same for all vendors. I see a vendor space like this more applicable for something like "RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE" where a vendor has a specific value they would like to expose. I do agree that having a vendor space is a good design choice, but I am not convinced that vendor extensions are the proper use-case.
By having RISCV_HWPROBE_KEY_VENDOR_EXT_0 we can expose the vendor extensions in the same way that standard extensions are exposed, with a bitmask representing each extension. If these are instead in the vendor space, each vendor would probably be inclined to introduce a key like RISCV_HWPROBE_KEY_THEAD_EXT_0 that returns a bitmask of all of the thead vendor extensions. This duplicated effort is what I am trying to avoid. The alternative would be that vendors have a separate key for each vendor extension they would like to expose, but that is strictly less efficient than the existing bitmask probing.
Do you think that having the vendor space is appropriate for vendor extensions given my concerns?
I do see what you're going for. It's tidy for a bitmask to just let anyone allocate the next bit, but leaves you with the same problem when a vendor decides they want to expose an enum, or decides they want to expose a bazillion things. I think a generalized version of
This patch is strictly to expose if a vendor extension is supported, how does exposing enums factor in here?
the approach you've written would be: simply let vendors allocate keys from the same global space we're already using. My worry was that it
I am missing how my proposal suggests allowing vendors to allocate keys in a global space.
would turn into an expansive suburban sprawl of mostly dead bits, or in the case of vendor-specific keys, full of "if (mvendor_id() != MINE) return 0;". My hope with the vendored keyspace is it would keep
An application will always need to check vendorid before calling hwprobe with a vendor-specific feature? If that hwprobe support is a key above 1<<63, then the application will need to pass that vendor-specific key and interpret the vendor-specific value. If that hwprobe support is what I have proposed here, then the user calls the standardized vendor extension hwprobe endpoint and then needs to interpret the result based on the vendor of the cpumask. In both cases they need to check the vendorid of the cpumask. In the test case I added I failed to check the vendorid but I should have had that.
the sprawl from polluting the general array of (hopefully valuable) info with stuff that's likely to become less relevant as time passes. It also lowers the bar a bit to make it easier for vendors to expose bits, as they don't consume global space for everyone for all of time, just themselves.
The vendor keys are tied directly to the vendor. So as it grows we would have something like:
#define RISCV_HWPROBE_KEY_VENDOR_EXT_0 7

/* T-Head */
#define RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR (1 << 0)
#define RISCV_HWPROBE_VENDOR_EXT_XTHEAD2      (1 << 1)
#define RISCV_HWPROBE_VENDOR_EXT_XTHEAD3      (1 << 2)

/* Vendor 2 */
#define RISCV_HWPROBE_VENDOR_EXT_XVENDOR1 (1 << 0)
#define RISCV_HWPROBE_VENDOR_EXT_XVENDOR2 (1 << 1)

/* Vendor 3 */
...
The keys overlap between vendors. To determine which extension a vendor supports, hwprobe gets data from hart_isa_vendor[cpu]. If the vendor is vendor 2, it is not possible for a vendor extension from vendor 3 to end up in there. Only the extensions from that vendor can be supported by that vendor's hardware.
Gotcha. You're right I had misinterpreted this, thinking XTHEADVECTOR was a valid bit regardless of mvendorid, and that other vendors would have to choose new bits for their features and always return 0 for XTHEADVECTOR. With your explanation, it seems like you're allocating keys (in no particular order) whose meaning will change based on mvendorid.
I guess I'm still not convinced that saving each vendor from having to add a VENDOR_EXT key in their keyspace is worth the sacrifice of spraying the vendor-specific keys across the generic keyspace. Are there advantages to having a single key whose category is similar but whose bits are entirely vendor-defined? Maybe if I were userspace and my feature could be satisfied equivalently by XTHEADVECTOR or XRIVOSOTHERTHING, then I could do one hwprobe call instead of two? But I don't think the vendors are going to be consistent enough for that equivalency to ever prove useful. The advantages in my head of the separate vendor keyspace are:
- Keeps the kernel code simple: if key >= (1 << 63)
vendor_config->do_hwprobe(), rather than having all these little calls in each specific switch case for vendor_config->do_vendor_ext0(), vendor_config->do_vendor_ext1(), etc.
The consistency between vendors is guaranteed in this scheme. They just add the extension to hwprobe_isa_vendor_ext0. The following code is the critical code from the kernel:
for_each_cpu(cpu, cpus) {
    struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];

#define VENDOR_EXT_KEY(ext) \
    do { \
        if (__riscv_isa_vendor_extension_available(isavendorinfo->isa, \
                                                   RISCV_ISA_VENDOR_EXT_##ext)) \
            pair->value |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
        else \
            missing |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
    } while (false)

    /*
     * Only use VENDOR_EXT_KEY() for extensions which can be exposed to userspace,
     * regardless of the kernel's configuration, as no other checks, besides
     * presence in the hart_vendor_isa bitmap, are made.
     */
    VENDOR_EXT_KEY(XTHEADVECTOR);

#undef VENDOR_EXT_KEY
}

/* Now turn off reporting features if any CPU is missing it. */
pair->value &= ~missing;
The only thing a vendor will have to do is add an entry below VENDOR_EXT_KEY(XTHEADVECTOR) with their extension name (of course populating a value for the key as well). This existing code will then check if the extension is compatible with the hardware and appropriately populate the bitmask. All vendors get this functionality for "free" without needing to write the boilerplate code to expose vendor extensions through hwprobe.
Now that I write this out I do see that I overlooked that this code needs to check the vendorid to ensure that the given extension is actually associated with the vendorid. This would make this more complicated but still seems like a low barrier to entry for a new vendor, as well as a standard API for getting all vendor extensions that are available on the platform regardless of which platform is being used.
Maybe I'll reserve judgment until I see the next spin, since we need both the "conditionalize on mvendorid" part, and to move the vendor stuff into a thead-specific file as discussed earlier. I'll be trying to picture how this looks 10 years from now, when a bunch of vendors have added dozens of extensions, and 75% of them are at that point defunct baggage.
- It extends easily into passing other forms of vendor hwprobe info
later, rather than solving only the case of risc-v extensions now, and then having to do this all again for each additional category of vendor data.
This is a great point. I do agree that a different solution will be necessary for arbitrary vendor data and I am all for making something future compatible. At the same time I don't want to get trapped into something that is suboptimal for the sake of doing less work later. There is no chance of any compatibility once we leave the realm of riscv extensions, so once a vendor needs something exported I would be happy to write the code to support that.
- Similarly, it discourages future vendors from trying to squint and
find a way to make a vaguely generic sounding category for their own hwprobe key which will ultimately only ever be filled in by them anyway.
What do you mean by this? There are no "categories" here, the vendor just writes out their extension VENDOR_EXT_KEY(XVENDOREXTENSION) and it gets shuttled to userspace on the hwprobe vendor call.
The category in this case is RISC-V extensions, since you've defined a key whose contents are vendor-specific, but whose bits must all fit the category of being a risc-v vendor extension.
To frame it in another light, one equivalent version from an ABI perspective would be to say: ok, let's put this key up into the 1<<63 range, but carve out a "common key" range where all vendors implement the same key definitions, like this VENDOR_EXT_0 key. Is that useful, or is it unnecessary structure? I think I'm of the opinion it's unnecessary structure, but I'm still open to being convinced.

-Evan
- Charlie
-Evan
On Fri, Apr 12, 2024 at 03:50:05PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 3:21 PM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 02:43:01PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 1:20 PM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 12:07:46PM -0700, Evan Green wrote:
On Fri, Apr 12, 2024 at 11:17 AM Charlie Jenkins charlie@rivosinc.com wrote:
On Fri, Apr 12, 2024 at 10:05:21AM -0700, Evan Green wrote:
> On Thu, Apr 11, 2024 at 9:12 PM Charlie Jenkins charlie@rivosinc.com wrote:
> >
> > Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_0" which allows
> > userspace to probe for the new RISCV_ISA_VENDOR_EXT_XTHEADVECTOR vendor
> > extension.
> >
> > Signed-off-by: Charlie Jenkins charlie@rivosinc.com
> > ---
> >  arch/riscv/include/asm/hwprobe.h      |  4 +--
> >  arch/riscv/include/uapi/asm/hwprobe.h | 10 +++++-
> >  arch/riscv/kernel/sys_hwprobe.c       | 59 +++++++++++++++++++++++++++++++++--
> >  3 files changed, 68 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
> > index 630507dff5ea..e68496b4f8de 100644
> > --- a/arch/riscv/include/asm/hwprobe.h
> > +++ b/arch/riscv/include/asm/hwprobe.h
> > @@ -1,6 +1,6 @@
> >  /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> >  /*
> > - * Copyright 2023 Rivos, Inc
> > + * Copyright 2023-2024 Rivos, Inc
> >   */
> >
> >  #ifndef _ASM_HWPROBE_H
> > @@ -8,7 +8,7 @@
> >
> >  #include <uapi/asm/hwprobe.h>
> >
> > -#define RISCV_HWPROBE_MAX_KEY 6
> > +#define RISCV_HWPROBE_MAX_KEY 7
> >
> >  static inline bool riscv_hwprobe_key_is_valid(__s64 key)
> >  {
> > diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
> > index 9f2a8e3ff204..6614d3adfc75 100644
> > --- a/arch/riscv/include/uapi/asm/hwprobe.h
> > +++ b/arch/riscv/include/uapi/asm/hwprobe.h
> > @@ -1,6 +1,6 @@
> >  /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> >  /*
> > - * Copyright 2023 Rivos, Inc
> > + * Copyright 2023-2024 Rivos, Inc
> >   */
> >
> >  #ifndef _UAPI_ASM_HWPROBE_H
> > @@ -67,6 +67,14 @@ struct riscv_hwprobe {
> >  #define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED (4 << 0)
> >  #define RISCV_HWPROBE_MISALIGNED_MASK (7 << 0)
> >  #define RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE 6
> > +/*
> > + * It is not possible for one CPU to have multiple vendor ids, so each vendor
> > + * has its own vendor extension "namespace". The keys for each vendor starts
> > + * at zero.
> > + */
> > +#define RISCV_HWPROBE_KEY_VENDOR_EXT_0 7
> > + /* T-Head */
> > +#define RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR (1 << 0)
> >  /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */
> >
> >  /* Flags */
> > diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
> > index e0a42c851511..365ce7380443 100644
> > --- a/arch/riscv/kernel/sys_hwprobe.c
> > +++ b/arch/riscv/kernel/sys_hwprobe.c
> > @@ -69,7 +69,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
> >      if (riscv_isa_extension_available(NULL, c))
> >          pair->value |= RISCV_HWPROBE_IMA_C;
> >
> > -    if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
> > +    if (has_vector() &&
> > +        !__riscv_isa_vendor_extension_available(NULL, RISCV_ISA_VENDOR_EXT_XTHEADVECTOR))
> >          pair->value |= RISCV_HWPROBE_IMA_V;
> >
> >      /*
> > @@ -112,7 +113,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
> >          EXT_KEY(ZACAS);
> >          EXT_KEY(ZICOND);
> >
> > -        if (has_vector() && !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
> > +        if (has_vector() &&
> > +            !riscv_has_vendor_extension_unlikely(RISCV_ISA_VENDOR_EXT_XTHEADVECTOR)) {
> >              EXT_KEY(ZVBB);
> >              EXT_KEY(ZVBC);
> >              EXT_KEY(ZVKB);
> > @@ -139,6 +141,55 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
> >      pair->value &= ~missing;
> >  }
> >
> > +static void hwprobe_isa_vendor_ext0(struct riscv_hwprobe *pair,
> > +                                    const struct cpumask *cpus)
> > +{
> > +    int cpu;
> > +    u64 missing = 0;
> > +
> > +    pair->value = 0;
> > +
> > +    struct riscv_hwprobe mvendorid = {
> > +        .key = RISCV_HWPROBE_KEY_MVENDORID,
> > +        .value = 0
> > +    };
> > +
> > +    hwprobe_arch_id(&mvendorid, cpus);
> > +
> > +    /* Set value to zero if CPUs in the set do not have the same vendor. */
> > +    if (mvendorid.value == -1ULL)
> > +        return;
> > +
> > +    /*
> > +     * Loop through and record vendor extensions that 1) anyone has, and
> > +     * 2) anyone doesn't have.
> > +     */
> > +    for_each_cpu(cpu, cpus) {
> > +        struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];
> > +
> > +#define VENDOR_EXT_KEY(ext) \
> > +    do { \
> > +        if (__riscv_isa_vendor_extension_available(isavendorinfo->isa, \
> > +                                                   RISCV_ISA_VENDOR_EXT_##ext)) \
> > +            pair->value |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
> > +        else \
> > +            missing |= RISCV_HWPROBE_VENDOR_EXT_##ext; \
> > +    } while (false)
> > +
> > +        /*
> > +         * Only use VENDOR_EXT_KEY() for extensions which can be exposed to userspace,
> > +         * regardless of the kernel's configuration, as no other checks, besides
> > +         * presence in the hart_vendor_isa bitmap, are made.
> > +         */
> > +        VENDOR_EXT_KEY(XTHEADVECTOR);
> > +
> > +#undef VENDOR_EXT_KEY
>
> Hey Charlie,
> Thanks for writing this up! At the very least I think the
> THEAD-specific stuff should probably end up in its own file, otherwise
> it'll get chaotic with vendors clamoring to add stuff right here.
Great idea!
> What do you think about this approach: > * We leave RISCV_HWPROBE_MAX_KEY as the max key for the "generic > world", eg 6-ish > * We define that any key above 0x8000000000000000 is in the vendor > space, so the meaning of the keys depends first on the mvendorid > value. > * In the kernel code, each new vendor adds on to a global struct, > which might look something like: > struct hwprobe_vendor_space vendor_space[] = { > { > .mvendorid = VENDOR_THEAD, > .max_hwprobe_key = THEAD_MAX_HWPROBE_KEY, // currently > 1 or 0x8000000000000001 with what you've got. > .hwprobe_fn = thead_hwprobe > }, > ... > }; > > * A hwprobe_thead.c implements thead_hwprobe(), and is called > whenever the generic hwprobe encounters a key >=0x8000000000000000. > * Generic code for setting up the VDSO can then still call the > vendor-specific hwprobe_fn() repeatedly with an "all CPUs" mask from > the base to max_hwprobe_key and set up the cached tables in userspace. > * Since the VDSO data has limited space we may have to cap the number > of vendor keys we cache to be lower than max_hwprobe_key. Since the > data itself is not exposed to usermode we can raise this cap later if > needed.
I know vendor extensions are kind of the "wild west" of riscv, but in spite of that I want to design a consistent API. The issue I had with having this "vendor space" for exposing vendor extensions was that this is something that is inherently the same for all vendors. I see a vendor space like this as more applicable to something like "RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE", where a vendor has a specific value they would like to expose. I do agree that having a vendor space is a good design choice, but I am not convinced that vendor extensions are the proper use case.
By having RISCV_HWPROBE_KEY_VENDOR_EXT_0 we can expose the vendor extensions in the same way that standard extensions are exposed, with a bitmask representing each extension. If these are instead in the vendor space, each vendor would probably be inclined to introduce a key like RISCV_HWPROBE_KEY_THEAD_EXT_0 that returns a bitmask of all of the thead vendor extensions. This duplicated effort is what I am trying to avoid. The alternative would be that vendors have a separate key for each vendor extension they would like to expose, but that is strictly less efficient than the existing bitmask probing.
Do you think that having the vendor space is appropriate for vendor extensions given my concerns?
I do see what you're going for. It's tidy for a bitmask to just let anyone allocate the next bit, but leaves you with the same problem when a vendor decides they want to expose an enum, or decides they want to expose a bazillion things. I think a generalized version of
This patch is strictly to expose whether a vendor extension is supported; how does exposing enums factor in here?
the approach you've written would be: simply let vendors allocate keys from the same global space we're already using. My worry was that it
I am missing how my proposal suggests allowing vendors to allocate keys in a global space.
would turn into an expansive suburban sprawl of mostly dead bits, or in the case of vendor-specific keys, full of "if (mvendor_id() != MINE) return 0;". My hope with the vendored keyspace is it would keep
An application will always need to check the vendorid before calling hwprobe with a vendor-specific feature. If that hwprobe support is a key above 1 << 63, then the application will need to pass that vendor-specific key and interpret the vendor-specific value. If that hwprobe support is what I have proposed here, then the user calls the standardized vendor extension hwprobe endpoint and then interprets the result based on the vendor of the cpumask. In both cases they need to check the vendorid of the cpumask. In the test case I added I failed to check the vendorid, but I should have.
the sprawl from polluting the general array of (hopefully valuable) info with stuff that's likely to become less relevant as time passes. It also lowers the bar a bit to make it easier for vendors to expose bits, as they don't consume global space for everyone for all of time, just themselves.
The vendor keys are tied directly to the vendor. So as it grows we would have something like:
#define RISCV_HWPROBE_KEY_VENDOR_EXT_0 7
	/* T-Head */
#define RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR	(1 << 0)
#define RISCV_HWPROBE_VENDOR_EXT_XTHEAD2	(1 << 1)
#define RISCV_HWPROBE_VENDOR_EXT_XTHEAD3	(1 << 2)
	/* Vendor 2 */
#define RISCV_HWPROBE_VENDOR_EXT_XVENDOR1	(1 << 0)
#define RISCV_HWPROBE_VENDOR_EXT_XVENDOR2	(1 << 1)
	/* Vendor 3 */
...
The keys overlap between vendors. To determine which extension a vendor supports, hwprobe gets data from hart_isa_vendor[cpu]. If the vendor is vendor 2, it is not possible for a vendor extension from vendor 3 to end up in there. Only the extensions from that vendor can be supported by that vendor's hardware.
Gotcha. You're right I had misinterpreted this, thinking XTHEADVECTOR was a valid bit regardless of mvendorid, and that other vendors would have to choose new bits for their features and always return 0 for XTHEADVECTOR. With your explanation, it seems like you're allocating keys (in no particular order) whose meaning will change based on mvendorid.
I guess I'm still not convinced that saving each vendor from having to add a VENDOR_EXT key in their keyspace is worth the sacrifice of spraying the vendor-specific keys across the generic keyspace. Are there advantages to having a single key whose category is similar but whose bits are entirely vendor-defined? Maybe if I were userspace and my feature could be satisfied equivalently by XTHEADVECTOR or XRIVOSOTHERTHING, then I could do one hwprobe call instead of two? But I don't think the vendors are going to be consistent enough for that equivalency to ever prove useful. The advantages in my head of the separate vendor keyspace are:
- Keeps the kernel code simple: if key >= (1 << 63)
vendor_config->do_hwprobe(), rather than having all these little calls in each specific switch case for vendor_config->do_vendor_ext0(), vendor_config->do_vendor_ext1(), etc.
The consistency between vendors is guaranteed in this scheme: they just add their extension to hwprobe_isa_vendor_ext0(). The critical kernel code is:
	for_each_cpu(cpu, cpus) {
		struct riscv_isainfo *isavendorinfo = &hart_isa_vendor[cpu];

#define VENDOR_EXT_KEY(ext)							\
	do {									\
		if (__riscv_isa_vendor_extension_available(isavendorinfo->isa,	\
							   RISCV_ISA_VENDOR_EXT_##ext)) \
			pair->value |= RISCV_HWPROBE_VENDOR_EXT_##ext;		\
		else								\
			missing |= RISCV_HWPROBE_VENDOR_EXT_##ext;		\
	} while (false)

		/*
		 * Only use VENDOR_EXT_KEY() for extensions which can be exposed to
		 * userspace, regardless of the kernel's configuration, as no other
		 * checks, besides presence in the hart_vendor_isa bitmap, are made.
		 */
		VENDOR_EXT_KEY(XTHEADVECTOR);

#undef VENDOR_EXT_KEY
	}

	/* Now turn off reporting features if any CPU is missing it. */
	pair->value &= ~missing;
The only thing a vendor will have to do is add an entry below VENDOR_EXT_KEY(XTHEADVECTOR) with their extension name (of course populating a value for the key as well). This existing code will then check if the extension is compatible with the hardware and appropriately populate the bitmask. All vendors get this functionality for "free" without needing to write the boilerplate code to expose vendor extensions through hwprobe.
Now that I write this out I do see that I overlooked that this code needs to check the vendorid to ensure that the given extension is actually associated with the vendorid. This would make this more complicated but still seems like a low barrier to entry for a new vendor, as well as a standard API for getting all vendor extensions that are available on the platform regardless of which platform is being used.
Maybe I'll reserve judgment until I see the next spin, since we need both the "conditionalize on mvendorid" part, and to move the vendor stuff into a thead-specific file as discussed earlier. I'll be trying to picture how this looks 10 years from now, when a bunch of vendors have added dozens of extensions, and 75% of them are at that point defunct baggage.
Okay I will make some changes here and then we can continue this conversation :)
- It extends easily into passing other forms of vendor hwprobe info
later, rather than solving only the case of risc-v extensions now, and then having to do this all again for each additional category of vendor data.
This is a great point. I do agree that a different solution will be necessary for arbitrary vendor data and I am all for making something future compatible. At the same time I don't want to get trapped into something that is suboptimal for the sake of doing less work later. There is no chance of any compatibility once we leave the realm of riscv extensions, so once a vendor needs something exported I would be happy to write the code to support that.
- Similarly, it discourages future vendors from trying to squint and
find a way to make a vaguely generic sounding category for their own hwprobe key which will ultimately only ever be filled in by them anyway.
What do you mean by this? There are no "categories" here, the vendor just writes out their extension VENDOR_EXT_KEY(XVENDOREXTENSION) and it gets shuttled to userspace on the hwprobe vendor call.
The category in this case is RISC-V extensions, since you've defined a key whose contents are vendor-specific, but whose bits must all fit the category of being a risc-v vendor extension.
To frame it in another light, one equivalent version from an ABI perspective would be to say ok, let's put this key up into the 1<<63 range, but carve out a "common key" range where all vendors implement the same key definitions, like this VENDOR_EXT_0 key. Is that useful, or is it unnecessary structure? I think I'm of the opinion it's unnecessary structure, but I'm still open to being convinced.
That makes sense, thank you for clarifying, I appreciate that perspective. I am coming from the direction that I want to share as much as possible between vendors to minimize both kernel and userspace code. In that sense, it is unnecessary. It would be fine to have each vendor define their own way of probing which vendor extensions are available. My inclination is that would lead to more verbosity in the kernel and userspace, but I too am open to being convinced.
- Charlie
-Evan
- Charlie
-Evan
Document support for vendor extensions using the key RISCV_HWPROBE_KEY_VENDOR_EXT_0 and xtheadvector extension using the key RISCV_ISA_VENDOR_EXT_XTHEADVECTOR.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com --- Documentation/arch/riscv/hwprobe.rst | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst index b2bcc9eed9aa..38e1b0c7c38c 100644 --- a/Documentation/arch/riscv/hwprobe.rst +++ b/Documentation/arch/riscv/hwprobe.rst @@ -210,3 +210,15 @@ The following keys are defined:
* :c:macro:`RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE`: An unsigned int which represents the size of the Zicboz block in bytes. + +* :c:macro:`RISCV_HWPROBE_KEY_VENDOR_EXT_0`: A bitmask containing the vendor + extensions that are compatible with the + :c:macro:`RISCV_HWPROBE_BASE_BEHAVIOR_IMA`: base system behavior. A set of + CPUs is only compatible with a vendor extension if all CPUs in the set have + the same mvendorid and support the extension. + + * T-HEAD + + * :c:macro:`RISCV_ISA_VENDOR_EXT_XTHEADVECTOR`: The xtheadvector vendor + extension is supported in the T-Head ISA extensions spec starting from + commit a18c801634 ("Add T-Head VECTOR vendor extension. ").
Overhaul the riscv vector tests to use kselftest_harness to help the test cases correctly report the results and decouple the individual test cases from each other. With this refactoring, only run the test cases if vector is reported, and properly report the test cases as skipped otherwise. The v_initval_nolibc test was previously not checking if vector was supported and used a function (malloc) which invalidates the state of the vector registers.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com --- tools/testing/selftests/riscv/vector/.gitignore | 3 +- tools/testing/selftests/riscv/vector/Makefile | 17 +- .../selftests/riscv/vector/v_exec_initval_nolibc.c | 84 +++++++ tools/testing/selftests/riscv/vector/v_helpers.c | 56 +++++ tools/testing/selftests/riscv/vector/v_helpers.h | 5 + tools/testing/selftests/riscv/vector/v_initval.c | 16 ++ .../selftests/riscv/vector/v_initval_nolibc.c | 68 ------ .../testing/selftests/riscv/vector/vstate_prctl.c | 266 ++++++++++++--------- 8 files changed, 324 insertions(+), 191 deletions(-)
diff --git a/tools/testing/selftests/riscv/vector/.gitignore b/tools/testing/selftests/riscv/vector/.gitignore index 9ae7964491d5..7d9c87cd0649 100644 --- a/tools/testing/selftests/riscv/vector/.gitignore +++ b/tools/testing/selftests/riscv/vector/.gitignore @@ -1,3 +1,4 @@ vstate_exec_nolibc vstate_prctl -v_initval_nolibc +v_initval +v_exec_initval_nolibc diff --git a/tools/testing/selftests/riscv/vector/Makefile b/tools/testing/selftests/riscv/vector/Makefile index bfff0ff4f3be..995746359477 100644 --- a/tools/testing/selftests/riscv/vector/Makefile +++ b/tools/testing/selftests/riscv/vector/Makefile @@ -2,18 +2,27 @@ # Copyright (C) 2021 ARM Limited # Originally tools/testing/arm64/abi/Makefile
-TEST_GEN_PROGS := vstate_prctl v_initval_nolibc -TEST_GEN_PROGS_EXTENDED := vstate_exec_nolibc +TEST_GEN_PROGS := v_initval vstate_prctl +TEST_GEN_PROGS_EXTENDED := vstate_exec_nolibc v_exec_initval_nolibc sys_hwprobe.o v_helpers.o
include ../../lib.mk
-$(OUTPUT)/vstate_prctl: vstate_prctl.c ../hwprobe/sys_hwprobe.S +$(OUTPUT)/sys_hwprobe.o: ../hwprobe/sys_hwprobe.S + $(CC) -static -c -o$@ $(CFLAGS) $^ + +$(OUTPUT)/v_helpers.o: v_helpers.c + $(CC) -static -c -o$@ $(CFLAGS) $^ + +$(OUTPUT)/vstate_prctl: vstate_prctl.c $(OUTPUT)/sys_hwprobe.o $(OUTPUT)/v_helpers.o $(CC) -static -o$@ $(CFLAGS) $(LDFLAGS) $^
$(OUTPUT)/vstate_exec_nolibc: vstate_exec_nolibc.c $(CC) -nostdlib -static -include ../../../../include/nolibc/nolibc.h \ -Wall $(CFLAGS) $(LDFLAGS) $^ -o $@ -lgcc
-$(OUTPUT)/v_initval_nolibc: v_initval_nolibc.c +$(OUTPUT)/v_initval: v_initval.c $(OUTPUT)/sys_hwprobe.o $(OUTPUT)/v_helpers.o + $(CC) -static -o$@ $(CFLAGS) $(LDFLAGS) $^ + +$(OUTPUT)/v_exec_initval_nolibc: v_exec_initval_nolibc.c $(CC) -nostdlib -static -include ../../../../include/nolibc/nolibc.h \ -Wall $(CFLAGS) $(LDFLAGS) $^ -o $@ -lgcc diff --git a/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c b/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c new file mode 100644 index 000000000000..363727672704 --- /dev/null +++ b/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c @@ -0,0 +1,84 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Get values of vector registers as soon as the program starts to test if + * is properly cleaning the values before starting a new program. Vector + * registers are caller saved, so no function calls may happen before reading + * the values. To further ensure consistency, this file is compiled without + * libc and without auto-vectorization. + * + * To be "clean" all values must be either all ones or all zeroes. + */ + +#define __stringify_1(x...) #x +#define __stringify(x...) __stringify_1(x) + +int main(int argc, char **argv) +{ + char prev_value = 0, value; + unsigned long vl; + int first = 1; + + asm volatile ( + ".option push\n\t" + ".option arch, +v\n\t" + "vsetvli %[vl], x0, e8, m1, ta, ma\n\t" + ".option pop\n\t" + : [vl] "=r" (vl) + ); + +#define CHECK_VECTOR_REGISTER(register) ({ \ + for (int i = 0; i < vl; i++) { \ + asm volatile ( \ + ".option push\n\t" \ + ".option arch, +v\n\t" \ + "vmv.x.s %0, " __stringify(register) "\n\t" \ + "vsrl.vi " __stringify(register) ", " __stringify(register) ", 8\n\t" \ + ".option pop\n\t" \ + : "=r" (value)); \ + if (first) { \ + first = 0; \ + } else if (value != prev_value || !(value == 0x00 || value == 0xff)) { \ + printf("Register "__stringify(register)" values not clean! 
value: %u\n", value); \ + exit(-1); \ + } \ + prev_value = value; \ + } \ +}) + + CHECK_VECTOR_REGISTER(v0); + CHECK_VECTOR_REGISTER(v1); + CHECK_VECTOR_REGISTER(v2); + CHECK_VECTOR_REGISTER(v3); + CHECK_VECTOR_REGISTER(v4); + CHECK_VECTOR_REGISTER(v5); + CHECK_VECTOR_REGISTER(v6); + CHECK_VECTOR_REGISTER(v7); + CHECK_VECTOR_REGISTER(v8); + CHECK_VECTOR_REGISTER(v9); + CHECK_VECTOR_REGISTER(v10); + CHECK_VECTOR_REGISTER(v11); + CHECK_VECTOR_REGISTER(v12); + CHECK_VECTOR_REGISTER(v13); + CHECK_VECTOR_REGISTER(v14); + CHECK_VECTOR_REGISTER(v15); + CHECK_VECTOR_REGISTER(v16); + CHECK_VECTOR_REGISTER(v17); + CHECK_VECTOR_REGISTER(v18); + CHECK_VECTOR_REGISTER(v19); + CHECK_VECTOR_REGISTER(v20); + CHECK_VECTOR_REGISTER(v21); + CHECK_VECTOR_REGISTER(v22); + CHECK_VECTOR_REGISTER(v23); + CHECK_VECTOR_REGISTER(v24); + CHECK_VECTOR_REGISTER(v25); + CHECK_VECTOR_REGISTER(v26); + CHECK_VECTOR_REGISTER(v27); + CHECK_VECTOR_REGISTER(v28); + CHECK_VECTOR_REGISTER(v29); + CHECK_VECTOR_REGISTER(v30); + CHECK_VECTOR_REGISTER(v31); + +#undef CHECK_VECTOR_REGISTER + + return 0; +} diff --git a/tools/testing/selftests/riscv/vector/v_helpers.c b/tools/testing/selftests/riscv/vector/v_helpers.c new file mode 100644 index 000000000000..15c22318db72 --- /dev/null +++ b/tools/testing/selftests/riscv/vector/v_helpers.c @@ -0,0 +1,56 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "../hwprobe/hwprobe.h" +#include <stdlib.h> +#include <stdio.h> +#include <unistd.h> +#include <sys/wait.h> + +int is_vector_supported(void) +{ + struct riscv_hwprobe pair; + + pair.key = RISCV_HWPROBE_KEY_IMA_EXT_0; + riscv_hwprobe(&pair, 1, 0, NULL, 0); + return pair.value & RISCV_HWPROBE_IMA_V; +} + +int launch_test(char *next_program, int test_inherit) +{ + char *exec_argv[3], *exec_envp[1]; + int rc, pid, status; + + pid = fork(); + if (pid < 0) { + printf("fork failed %d", pid); + return -1; + } + + if (!pid) { + exec_argv[0] = next_program; + exec_argv[1] = test_inherit != 0 ? 
"x" : NULL; + exec_argv[2] = NULL; + exec_envp[0] = NULL; + /* launch the program again to check inherit */ + rc = execve(next_program, exec_argv, exec_envp); + if (rc) { + perror("execve"); + printf("child execve failed %d\n", rc); + exit(-1); + } + } + + rc = waitpid(-1, &status, 0); + if (rc < 0) { + printf("waitpid failed\n"); + return -3; + } + + if ((WIFEXITED(status) && WEXITSTATUS(status) == -1) || + WIFSIGNALED(status)) { + printf("child exited abnormally\n"); + return -4; + } + + return WEXITSTATUS(status); +} diff --git a/tools/testing/selftests/riscv/vector/v_helpers.h b/tools/testing/selftests/riscv/vector/v_helpers.h new file mode 100644 index 000000000000..88719c4be496 --- /dev/null +++ b/tools/testing/selftests/riscv/vector/v_helpers.h @@ -0,0 +1,5 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +int is_vector_supported(void); + +int launch_test(char *next_program, int test_inherit); diff --git a/tools/testing/selftests/riscv/vector/v_initval.c b/tools/testing/selftests/riscv/vector/v_initval.c new file mode 100644 index 000000000000..f38b5797fa31 --- /dev/null +++ b/tools/testing/selftests/riscv/vector/v_initval.c @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "../../kselftest_harness.h" +#include "v_helpers.h" + +#define NEXT_PROGRAM "./v_exec_initval_nolibc" + +TEST(v_initval) +{ + if (!is_vector_supported()) + SKIP(return, "Vector not supported"); + + ASSERT_EQ(0, launch_test(NEXT_PROGRAM, 0)); +} + +TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/riscv/vector/v_initval_nolibc.c b/tools/testing/selftests/riscv/vector/v_initval_nolibc.c deleted file mode 100644 index 1dd94197da30..000000000000 --- a/tools/testing/selftests/riscv/vector/v_initval_nolibc.c +++ /dev/null @@ -1,68 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only - -#include "../../kselftest.h" -#define MAX_VSIZE (8192 * 32) - -void dump(char *ptr, int size) -{ - int i = 0; - - for (i = 0; i < size; i++) { - if (i != 0) { - if (i % 16 == 0) - 
printf("\n"); - else if (i % 8 == 0) - printf(" "); - } - printf("%02x ", ptr[i]); - } - printf("\n"); -} - -int main(void) -{ - int i; - unsigned long vl; - char *datap, *tmp; - - datap = malloc(MAX_VSIZE); - if (!datap) { - ksft_test_result_fail("fail to allocate memory for size = %d\n", MAX_VSIZE); - exit(-1); - } - - tmp = datap; - asm volatile ( - ".option push\n\t" - ".option arch, +v\n\t" - "vsetvli %0, x0, e8, m8, ta, ma\n\t" - "vse8.v v0, (%2)\n\t" - "add %1, %2, %0\n\t" - "vse8.v v8, (%1)\n\t" - "add %1, %1, %0\n\t" - "vse8.v v16, (%1)\n\t" - "add %1, %1, %0\n\t" - "vse8.v v24, (%1)\n\t" - ".option pop\n\t" - : "=&r" (vl), "=r" (tmp) : "r" (datap) : "memory"); - - ksft_print_msg("vl = %lu\n", vl); - - if (datap[0] != 0x00 && datap[0] != 0xff) { - ksft_test_result_fail("v-regesters are not properly initialized\n"); - dump(datap, vl * 4); - exit(-1); - } - - for (i = 1; i < vl * 4; i++) { - if (datap[i] != datap[0]) { - ksft_test_result_fail("detect stale values on v-regesters\n"); - dump(datap, vl * 4); - exit(-2); - } - } - - free(datap); - ksft_exit_pass(); - return 0; -} diff --git a/tools/testing/selftests/riscv/vector/vstate_prctl.c b/tools/testing/selftests/riscv/vector/vstate_prctl.c index 27668fb3b6d0..528e8c544db0 100644 --- a/tools/testing/selftests/riscv/vector/vstate_prctl.c +++ b/tools/testing/selftests/riscv/vector/vstate_prctl.c @@ -3,50 +3,13 @@ #include <unistd.h> #include <errno.h> #include <sys/wait.h> +#include <sys/types.h> +#include <stdlib.h>
-#include "../hwprobe/hwprobe.h" -#include "../../kselftest.h" +#include "../../kselftest_harness.h" +#include "v_helpers.h"
#define NEXT_PROGRAM "./vstate_exec_nolibc" -static int launch_test(int test_inherit) -{ - char *exec_argv[3], *exec_envp[1]; - int rc, pid, status; - - pid = fork(); - if (pid < 0) { - ksft_test_result_fail("fork failed %d", pid); - return -1; - } - - if (!pid) { - exec_argv[0] = NEXT_PROGRAM; - exec_argv[1] = test_inherit != 0 ? "x" : NULL; - exec_argv[2] = NULL; - exec_envp[0] = NULL; - /* launch the program again to check inherit */ - rc = execve(NEXT_PROGRAM, exec_argv, exec_envp); - if (rc) { - perror("execve"); - ksft_test_result_fail("child execve failed %d\n", rc); - exit(-1); - } - } - - rc = waitpid(-1, &status, 0); - if (rc < 0) { - ksft_test_result_fail("waitpid failed\n"); - return -3; - } - - if ((WIFEXITED(status) && WEXITSTATUS(status) == -1) || - WIFSIGNALED(status)) { - ksft_test_result_fail("child exited abnormally\n"); - return -4; - } - - return WEXITSTATUS(status); -}
int test_and_compare_child(long provided, long expected, int inherit) { @@ -54,14 +17,13 @@ int test_and_compare_child(long provided, long expected, int inherit)
rc = prctl(PR_RISCV_V_SET_CONTROL, provided); if (rc != 0) { - ksft_test_result_fail("prctl with provided arg %lx failed with code %d\n", - provided, rc); + printf("prctl with provided arg %lx failed with code %d\n", + provided, rc); return -1; } - rc = launch_test(inherit); + rc = launch_test(NEXT_PROGRAM, inherit); if (rc != expected) { - ksft_test_result_fail("Test failed, check %d != %ld\n", rc, - expected); + printf("Test failed, check %d != %ld\n", rc, expected); return -2; } return 0; @@ -70,112 +32,180 @@ int test_and_compare_child(long provided, long expected, int inherit) #define PR_RISCV_V_VSTATE_CTRL_CUR_SHIFT 0 #define PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT 2
-int main(void) +TEST(get_control_no_v) { - struct riscv_hwprobe pair; - long flag, expected; long rc;
- pair.key = RISCV_HWPROBE_KEY_IMA_EXT_0; - rc = riscv_hwprobe(&pair, 1, 0, NULL, 0); - if (rc < 0) { - ksft_test_result_fail("hwprobe() failed with %ld\n", rc); - return -1; - } + if (is_vector_supported()) + SKIP(return, "Test expects vector to be not supported");
- if (pair.key != RISCV_HWPROBE_KEY_IMA_EXT_0) { - ksft_test_result_fail("hwprobe cannot probe RISCV_HWPROBE_KEY_IMA_EXT_0\n"); - return -2; - } + rc = prctl(PR_RISCV_V_GET_CONTROL); + EXPECT_EQ(-1, rc) TH_LOG("GET_CONTROL should fail on kernel/hw without V"); + EXPECT_EQ(EINVAL, errno) TH_LOG("GET_CONTROL should fail on kernel/hw without V"); +}
- if (!(pair.value & RISCV_HWPROBE_IMA_V)) { - rc = prctl(PR_RISCV_V_GET_CONTROL); - if (rc != -1 || errno != EINVAL) { - ksft_test_result_fail("GET_CONTROL should fail on kernel/hw without V\n"); - return -3; - } - - rc = prctl(PR_RISCV_V_SET_CONTROL, PR_RISCV_V_VSTATE_CTRL_ON); - if (rc != -1 || errno != EINVAL) { - ksft_test_result_fail("GET_CONTROL should fail on kernel/hw without V\n"); - return -4; - } - - ksft_test_result_skip("Vector not supported\n"); - return 0; - } +TEST(set_control_no_v) +{ + long rc; + + if (is_vector_supported()) + SKIP(return, "Test expects vector to be not supported"); + + rc = prctl(PR_RISCV_V_SET_CONTROL, PR_RISCV_V_VSTATE_CTRL_ON); + EXPECT_EQ(-1, rc) TH_LOG("SET_CONTROL should fail on kernel/hw without V"); + EXPECT_EQ(EINVAL, errno) TH_LOG("SET_CONTROL should fail on kernel/hw without V"); +} + +TEST(vstate_on_current) +{ + long flag; + long rc; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
flag = PR_RISCV_V_VSTATE_CTRL_ON; rc = prctl(PR_RISCV_V_SET_CONTROL, flag); - if (rc != 0) { - ksft_test_result_fail("Enabling V for current should always success\n"); - return -5; - } + EXPECT_EQ(0, rc) TH_LOG("Enabling V for current should always success"); +} + +TEST(vstate_off_eperm) +{ + long flag; + long rc; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
flag = PR_RISCV_V_VSTATE_CTRL_OFF; rc = prctl(PR_RISCV_V_SET_CONTROL, flag); - if (rc != -1 || errno != EPERM) { - ksft_test_result_fail("Disabling current's V alive must fail with EPERM(%d)\n", - errno); - return -5; - } + EXPECT_EQ(EPERM, errno) TH_LOG("Disabling current's V alive must fail with EPERM(%d)", errno); + EXPECT_EQ(-1, rc) TH_LOG("Disabling current's V alive must fail with EPERM(%d)", errno); +} + +TEST(vstate_on_no_nesting) +{ + long flag; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
/* Turn on next's vector explicitly and test */ flag = PR_RISCV_V_VSTATE_CTRL_ON << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT; - if (test_and_compare_child(flag, PR_RISCV_V_VSTATE_CTRL_ON, 0)) - return -6; + + EXPECT_EQ(0, test_and_compare_child(flag, PR_RISCV_V_VSTATE_CTRL_ON, 0)); +} + +TEST(vstate_off_nesting) +{ + long flag; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
/* Turn off next's vector explicitly and test */ flag = PR_RISCV_V_VSTATE_CTRL_OFF << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT; - if (test_and_compare_child(flag, PR_RISCV_V_VSTATE_CTRL_OFF, 0)) - return -7; + + EXPECT_EQ(0, test_and_compare_child(flag, PR_RISCV_V_VSTATE_CTRL_OFF, 1)); +} + +TEST(vstate_on_inherit_no_nesting) +{ + long flag, expected; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported"); + + /* Turn on next's vector explicitly and test no inherit */ + flag = PR_RISCV_V_VSTATE_CTRL_ON << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT; + flag |= PR_RISCV_V_VSTATE_CTRL_INHERIT; + expected = flag | PR_RISCV_V_VSTATE_CTRL_ON; + + EXPECT_EQ(0, test_and_compare_child(flag, expected, 0)); +} + +TEST(vstate_on_inherit) +{ + long flag, expected; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
/* Turn on next's vector explicitly and test inherit */ flag = PR_RISCV_V_VSTATE_CTRL_ON << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT; flag |= PR_RISCV_V_VSTATE_CTRL_INHERIT; expected = flag | PR_RISCV_V_VSTATE_CTRL_ON; - if (test_and_compare_child(flag, expected, 0)) - return -8;
- if (test_and_compare_child(flag, expected, 1)) - return -9; + EXPECT_EQ(0, test_and_compare_child(flag, expected, 1)); +} + +TEST(vstate_off_inherit_no_nesting) +{ + long flag, expected; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported"); + + /* Turn off next's vector explicitly and test no inherit */ + flag = PR_RISCV_V_VSTATE_CTRL_OFF << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT; + flag |= PR_RISCV_V_VSTATE_CTRL_INHERIT; + expected = flag | PR_RISCV_V_VSTATE_CTRL_OFF; + + EXPECT_EQ(0, test_and_compare_child(flag, expected, 0)); +} + +TEST(vstate_off_inherit) +{ + long flag, expected; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
/* Turn off next's vector explicitly and test inherit */ flag = PR_RISCV_V_VSTATE_CTRL_OFF << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT; flag |= PR_RISCV_V_VSTATE_CTRL_INHERIT; expected = flag | PR_RISCV_V_VSTATE_CTRL_OFF; - if (test_and_compare_child(flag, expected, 0)) - return -10;
- if (test_and_compare_child(flag, expected, 1)) - return -11; + EXPECT_EQ(0, test_and_compare_child(flag, expected, 1)); +} + +/* arguments should fail with EINVAL */ +TEST(inval_set_control_1) +{ + int rc; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
- /* arguments should fail with EINVAL */ rc = prctl(PR_RISCV_V_SET_CONTROL, 0xff0); - if (rc != -1 || errno != EINVAL) { - ksft_test_result_fail("Undefined control argument should return EINVAL\n"); - return -12; - } + EXPECT_EQ(-1, rc); + EXPECT_EQ(EINVAL, errno); +} + +/* arguments should fail with EINVAL */ +TEST(inval_set_control_2) +{ + int rc; + + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
rc = prctl(PR_RISCV_V_SET_CONTROL, 0x3); - if (rc != -1 || errno != EINVAL) { - ksft_test_result_fail("Undefined control argument should return EINVAL\n"); - return -12; - } + EXPECT_EQ(-1, rc); + EXPECT_EQ(EINVAL, errno); +}
- rc = prctl(PR_RISCV_V_SET_CONTROL, 0xc); - if (rc != -1 || errno != EINVAL) { - ksft_test_result_fail("Undefined control argument should return EINVAL\n"); - return -12; - } +/* arguments should fail with EINVAL */ +TEST(inval_set_control_3) +{ + int rc;
- rc = prctl(PR_RISCV_V_SET_CONTROL, 0xc); - if (rc != -1 || errno != EINVAL) { - ksft_test_result_fail("Undefined control argument should return EINVAL\n"); - return -12; - } + if (!is_vector_supported()) + SKIP(return, "Vector not supported");
- ksft_test_result_pass("tests for riscv_v_vstate_ctrl pass\n"); - ksft_exit_pass(); - return 0; + rc = prctl(PR_RISCV_V_SET_CONTROL, 0xc); + EXPECT_EQ(-1, rc); + EXPECT_EQ(EINVAL, errno); } + +TEST_HARNESS_MAIN
Extend existing vector tests to be compatible with the xtheadvector instruction set.
Signed-off-by: Charlie Jenkins charlie@rivosinc.com --- .../selftests/riscv/vector/v_exec_initval_nolibc.c | 23 ++++-- tools/testing/selftests/riscv/vector/v_helpers.c | 16 +++- tools/testing/selftests/riscv/vector/v_helpers.h | 4 +- tools/testing/selftests/riscv/vector/v_initval.c | 12 ++- .../selftests/riscv/vector/vstate_exec_nolibc.c | 20 +++-- .../testing/selftests/riscv/vector/vstate_prctl.c | 85 +++++++++++++++------- 6 files changed, 111 insertions(+), 49 deletions(-)
diff --git a/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c b/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c
index 363727672704..b6c79d3a92fc 100644
--- a/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c
+++ b/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c
@@ -18,13 +18,22 @@ int main(int argc, char **argv)
 	unsigned long vl;
 	int first = 1;
 
-	asm volatile (
-		".option push\n\t"
-		".option arch, +v\n\t"
-		"vsetvli %[vl], x0, e8, m1, ta, ma\n\t"
-		".option pop\n\t"
-		: [vl] "=r" (vl)
-	);
+	if (argc > 2 && strcmp(argv[2], "x"))
+		asm volatile (
+			// 0 | zimm[10:0] | rs1 | 1 1 1 | rd |1010111| vsetvli
+			// vsetvli t4, x0, e8, m1, d1
+			".insn 0b00000000000000000111111011010111\n\t"
+			"mv %[vl], t4\n\t"
+			: [vl] "=r" (vl) : : "t4"
+		);
+	else
+		asm volatile (
+			".option push\n\t"
+			".option arch, +v\n\t"
+			"vsetvli %[vl], x0, e8, m1, ta, ma\n\t"
+			".option pop\n\t"
+			: [vl] "=r" (vl)
+		);
 
 #define CHECK_VECTOR_REGISTER(register) ({ \
 	for (int i = 0; i < vl; i++) { \
diff --git a/tools/testing/selftests/riscv/vector/v_helpers.c b/tools/testing/selftests/riscv/vector/v_helpers.c
index 15c22318db72..fb6bece73119 100644
--- a/tools/testing/selftests/riscv/vector/v_helpers.c
+++ b/tools/testing/selftests/riscv/vector/v_helpers.c
@@ -6,6 +6,15 @@
 #include <unistd.h>
 #include <sys/wait.h>
 
+int is_xtheadvector_supported(void)
+{
+	struct riscv_hwprobe pair;
+
+	pair.key = RISCV_HWPROBE_KEY_VENDOR_EXT_0;
+	riscv_hwprobe(&pair, 1, 0, NULL, 0);
+	return pair.value & RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR;
+}
+
 int is_vector_supported(void)
 {
 	struct riscv_hwprobe pair;
@@ -15,9 +24,9 @@ int is_vector_supported(void)
 	return pair.value & RISCV_HWPROBE_IMA_V;
 }
 
-int launch_test(char *next_program, int test_inherit)
+int launch_test(char *next_program, int test_inherit, int xtheadvector)
 {
-	char *exec_argv[3], *exec_envp[1];
+	char *exec_argv[4], *exec_envp[1];
 	int rc, pid, status;
 
 	pid = fork();
@@ -29,7 +38,8 @@ int launch_test(char *next_program, int test_inherit)
 	if (!pid) {
 		exec_argv[0] = next_program;
 		exec_argv[1] = test_inherit != 0 ? "x" : NULL;
-		exec_argv[2] = NULL;
+		exec_argv[2] = xtheadvector != 0 ? "x" : NULL;
+		exec_argv[3] = NULL;
 		exec_envp[0] = NULL;
 		/* launch the program again to check inherit */
 		rc = execve(next_program, exec_argv, exec_envp);
diff --git a/tools/testing/selftests/riscv/vector/v_helpers.h b/tools/testing/selftests/riscv/vector/v_helpers.h
index 88719c4be496..67d41cb6f871 100644
--- a/tools/testing/selftests/riscv/vector/v_helpers.h
+++ b/tools/testing/selftests/riscv/vector/v_helpers.h
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+int is_xtheadvector_supported(void);
+
 int is_vector_supported(void);
 
-int launch_test(char *next_program, int test_inherit);
+int launch_test(char *next_program, int test_inherit, int xtheadvector);
diff --git a/tools/testing/selftests/riscv/vector/v_initval.c b/tools/testing/selftests/riscv/vector/v_initval.c
index f38b5797fa31..be9e1d18ad29 100644
--- a/tools/testing/selftests/riscv/vector/v_initval.c
+++ b/tools/testing/selftests/riscv/vector/v_initval.c
@@ -7,10 +7,16 @@
 
 TEST(v_initval)
 {
-	if (!is_vector_supported())
-		SKIP(return, "Vector not supported");
+	int xtheadvector = 0;
 
-	ASSERT_EQ(0, launch_test(NEXT_PROGRAM, 0));
+	if (!is_vector_supported()) {
+		if (is_xtheadvector_supported())
+			xtheadvector = 1;
+		else
+			SKIP(return, "Vector not supported");
+	}
+
+	ASSERT_EQ(0, launch_test(NEXT_PROGRAM, 0, xtheadvector));
 }
 
 TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c b/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
index 1f9969bed235..12d30d3b90fa 100644
--- a/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
+++ b/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
@@ -6,13 +6,16 @@
 
 int main(int argc, char **argv)
 {
-	int rc, pid, status, test_inherit = 0;
+	int rc, pid, status, test_inherit = 0, xtheadvector = 0;
 	long ctrl, ctrl_c;
 	char *exec_argv[2], *exec_envp[2];
 
-	if (argc > 1)
+	if (argc > 1 && strcmp(argv[1], "x"))
 		test_inherit = 1;
 
+	if (argc > 2 && strcmp(argv[2], "x"))
+		xtheadvector = 1;
+
 	ctrl = my_syscall1(__NR_prctl, PR_RISCV_V_GET_CONTROL);
 	if (ctrl < 0) {
 		puts("PR_RISCV_V_GET_CONTROL is not supported\n");
@@ -53,11 +56,14 @@ int main(int argc, char **argv)
 				puts("child's vstate_ctrl not equal to parent's\n");
 				exit(-1);
 			}
-			asm volatile (".option push\n\t"
-				      ".option arch, +v\n\t"
-				      "vsetvli x0, x0, e32, m8, ta, ma\n\t"
-				      ".option pop\n\t"
-			);
+			if (xtheadvector)
+				asm volatile (".insn 0x00007ed7");
+			else
+				asm volatile (".option push\n\t"
+					      ".option arch, +v\n\t"
+					      "vsetvli x0, x0, e32, m8, ta, ma\n\t"
+					      ".option pop\n\t"
+				);
 			exit(ctrl);
 		}
 	}
diff --git a/tools/testing/selftests/riscv/vector/vstate_prctl.c b/tools/testing/selftests/riscv/vector/vstate_prctl.c
index 528e8c544db0..dd3c5f06f800 100644
--- a/tools/testing/selftests/riscv/vector/vstate_prctl.c
+++ b/tools/testing/selftests/riscv/vector/vstate_prctl.c
@@ -11,7 +11,7 @@
 
 #define NEXT_PROGRAM "./vstate_exec_nolibc"
 
-int test_and_compare_child(long provided, long expected, int inherit)
+int test_and_compare_child(long provided, long expected, int inherit, int xtheadvector)
 {
 	int rc;
 
@@ -21,7 +21,7 @@ int test_and_compare_child(long provided, long expected, int inherit)
 		       provided, rc);
 		return -1;
 	}
-	rc = launch_test(NEXT_PROGRAM, inherit);
+	rc = launch_test(NEXT_PROGRAM, inherit, xtheadvector);
 	if (rc != expected) {
 		printf("Test failed, check %d != %ld\n", rc, expected);
 		return -2;
@@ -36,7 +36,7 @@ TEST(get_control_no_v)
 {
 	long rc;
 
-	if (is_vector_supported())
+	if (is_vector_supported() || is_xtheadvector_supported())
 		SKIP(return, "Test expects vector to be not supported");
 
 	rc = prctl(PR_RISCV_V_GET_CONTROL);
@@ -48,7 +48,7 @@ TEST(set_control_no_v)
 {
 	long rc;
 
-	if (is_vector_supported())
+	if (is_vector_supported() || is_xtheadvector_supported())
 		SKIP(return, "Test expects vector to be not supported");
 
 	rc = prctl(PR_RISCV_V_SET_CONTROL, PR_RISCV_V_VSTATE_CTRL_ON);
@@ -61,7 +61,7 @@ TEST(vstate_on_current)
 	long flag;
 	long rc;
 
-	if (!is_vector_supported())
+	if (!is_vector_supported() && !is_xtheadvector_supported())
 		SKIP(return, "Vector not supported");
 
 	flag = PR_RISCV_V_VSTATE_CTRL_ON;
@@ -74,7 +74,7 @@ TEST(vstate_off_eperm)
 	long flag;
 	long rc;
 
-	if (!is_vector_supported())
+	if (!is_vector_supported() && !is_xtheadvector_supported())
 		SKIP(return, "Vector not supported");
 
 	flag = PR_RISCV_V_VSTATE_CTRL_OFF;
@@ -86,87 +86,116 @@ TEST(vstate_off_eperm)
 TEST(vstate_on_no_nesting)
 {
 	long flag;
+	int xtheadvector = 0;
 
-	if (!is_vector_supported())
-		SKIP(return, "Vector not supported");
+	if (!is_vector_supported()) {
+		if (is_xtheadvector_supported())
+			xtheadvector = 1;
+		else
+			SKIP(return, "Vector not supported");
+	}
 
 	/* Turn on next's vector explicitly and test */
 	flag = PR_RISCV_V_VSTATE_CTRL_ON << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT;
 
-	EXPECT_EQ(0, test_and_compare_child(flag, PR_RISCV_V_VSTATE_CTRL_ON, 0));
+	EXPECT_EQ(0, test_and_compare_child(flag, PR_RISCV_V_VSTATE_CTRL_ON, 0, xtheadvector));
 }
 
 TEST(vstate_off_nesting)
 {
 	long flag;
+	int xtheadvector = 0;
 
-	if (!is_vector_supported())
-		SKIP(return, "Vector not supported");
+	if (!is_vector_supported()) {
+		if (is_xtheadvector_supported())
+			xtheadvector = 1;
+		else
+			SKIP(return, "Vector not supported");
+	}
 
 	/* Turn off next's vector explicitly and test */
 	flag = PR_RISCV_V_VSTATE_CTRL_OFF << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT;
 
-	EXPECT_EQ(0, test_and_compare_child(flag, PR_RISCV_V_VSTATE_CTRL_OFF, 1));
+	EXPECT_EQ(0, test_and_compare_child(flag, PR_RISCV_V_VSTATE_CTRL_OFF, 1, xtheadvector));
 }
 
 TEST(vstate_on_inherit_no_nesting)
 {
 	long flag, expected;
+	int xtheadvector = 0;
 
-	if (!is_vector_supported())
-		SKIP(return, "Vector not supported");
+	if (!is_vector_supported()) {
+		if (is_xtheadvector_supported())
+			xtheadvector = 1;
+		else
+			SKIP(return, "Vector not supported");
+	}
 
 	/* Turn on next's vector explicitly and test no inherit */
 	flag = PR_RISCV_V_VSTATE_CTRL_ON << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT;
 	flag |= PR_RISCV_V_VSTATE_CTRL_INHERIT;
 	expected = flag | PR_RISCV_V_VSTATE_CTRL_ON;
 
-	EXPECT_EQ(0, test_and_compare_child(flag, expected, 0));
+	EXPECT_EQ(0, test_and_compare_child(flag, expected, 0, xtheadvector));
 }
 
 TEST(vstate_on_inherit)
 {
 	long flag, expected;
+	int xtheadvector = 0;
 
-	if (!is_vector_supported())
-		SKIP(return, "Vector not supported");
+	if (!is_vector_supported()) {
+		if (is_xtheadvector_supported())
+			xtheadvector = 1;
+		else
+			SKIP(return, "Vector not supported");
+	}
 
 	/* Turn on next's vector explicitly and test inherit */
 	flag = PR_RISCV_V_VSTATE_CTRL_ON << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT;
 	flag |= PR_RISCV_V_VSTATE_CTRL_INHERIT;
 	expected = flag | PR_RISCV_V_VSTATE_CTRL_ON;
 
-	EXPECT_EQ(0, test_and_compare_child(flag, expected, 1));
+	EXPECT_EQ(0, test_and_compare_child(flag, expected, 1, xtheadvector));
 }
 
 TEST(vstate_off_inherit_no_nesting)
 {
 	long flag, expected;
+	int xtheadvector = 0;
 
-	if (!is_vector_supported())
-		SKIP(return, "Vector not supported");
-
+	if (!is_vector_supported()) {
+		if (is_xtheadvector_supported())
+			xtheadvector = 1;
+		else
+			SKIP(return, "Vector not supported");
+	}
 	/* Turn off next's vector explicitly and test no inherit */
 	flag = PR_RISCV_V_VSTATE_CTRL_OFF << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT;
 	flag |= PR_RISCV_V_VSTATE_CTRL_INHERIT;
 	expected = flag | PR_RISCV_V_VSTATE_CTRL_OFF;
 
-	EXPECT_EQ(0, test_and_compare_child(flag, expected, 0));
+	EXPECT_EQ(0, test_and_compare_child(flag, expected, 0, xtheadvector));
 }
 
 TEST(vstate_off_inherit)
 {
 	long flag, expected;
+	int xtheadvector = 0;
 
-	if (!is_vector_supported())
-		SKIP(return, "Vector not supported");
+	if (!is_vector_supported()) {
+		if (is_xtheadvector_supported())
+			xtheadvector = 1;
+		else
+			SKIP(return, "Vector not supported");
+	}
 
 	/* Turn off next's vector explicitly and test inherit */
 	flag = PR_RISCV_V_VSTATE_CTRL_OFF << PR_RISCV_V_VSTATE_CTRL_NEXT_SHIFT;
 	flag |= PR_RISCV_V_VSTATE_CTRL_INHERIT;
 	expected = flag | PR_RISCV_V_VSTATE_CTRL_OFF;
 
-	EXPECT_EQ(0, test_and_compare_child(flag, expected, 1));
+	EXPECT_EQ(0, test_and_compare_child(flag, expected, 1, xtheadvector));
 }
 
 /* arguments should fail with EINVAL */
@@ -174,7 +203,7 @@ TEST(inval_set_control_1)
 {
 	int rc;
 
-	if (!is_vector_supported())
+	if (!is_vector_supported() && !is_xtheadvector_supported())
 		SKIP(return, "Vector not supported");
 
 	rc = prctl(PR_RISCV_V_SET_CONTROL, 0xff0);
@@ -187,7 +216,7 @@ TEST(inval_set_control_2)
 {
 	int rc;
 
-	if (!is_vector_supported())
+	if (!is_vector_supported() && !is_xtheadvector_supported())
 		SKIP(return, "Vector not supported");
 
 	rc = prctl(PR_RISCV_V_SET_CONTROL, 0x3);
@@ -200,7 +229,7 @@ TEST(inval_set_control_3)
 {
 	int rc;
 
-	if (!is_vector_supported())
+	if (!is_vector_supported() && !is_xtheadvector_supported())
 		SKIP(return, "Vector not supported");
 
 	rc = prctl(PR_RISCV_V_SET_CONTROL, 0xc);