Re: [PATCH 02/19] riscv: cpufeature: Fix thead vector hwcap removal

12 Apr 2024


On Fri, Apr 12, 2024 at 08:26:12PM +0100, Conor Dooley wrote:
...
On Fri, Apr 12, 2024 at 11:46:21AM -0700, Charlie Jenkins wrote:
...
On Fri, Apr 12, 2024 at 07:38:04PM +0100, Conor Dooley wrote:
...
On Fri, Apr 12, 2024 at 10:04:17AM -0700, Evan Green wrote:
...
On Fri, Apr 12, 2024 at 3:26 AM Conor Dooley <conor.dooley@microchip.com> wrote:
...
On Thu, Apr 11, 2024 at 09:11:08PM -0700, Charlie Jenkins wrote:
...
The riscv_cpuinfo struct that contains mvendorid and marchid is not
populated until all harts are booted which happens after the DT parsing.
Use the vendorid/archid values from the DT if available or assume all
harts have the same values as the boot hart as a fallback.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
If this is our only use case for getting the mvendorid/marchid stuff
from dt, then I don't think we should add it. None of the devicetrees
that the commit you're fixing here addresses will have these properties
and if they did have them, they'd then also be new enough to hopefully
not have "v" either - the issue is they're using whatever crap the
vendor shipped.
If we're gonna get the information from DT, we already have something
that we can look at to perform the disable as the cpu compatibles give
us enough information to make the decision.
I also think that we could just cache the boot CPU's marchid/mvendorid,
since we already have to look at it in riscv_fill_cpu_mfr_info(), avoid
repeating these ecalls on all systems.
Perhaps for now we could just look at the boot CPU alone? To my
knowledge the systems that this targets all have homogeneous
marchid/mvendorid values of 0x0.
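
To make the "cache the boot CPU's values and look at the boot CPU alone" idea concrete, here is a minimal userspace sketch. It is not kernel code: the struct, the fake_read_*() helpers and ignore_v_from_riscv_isa() are hypothetical names made up for illustration, and the 0x5b7 vendor ID is assumed to match the kernel's THEAD_VENDOR_ID. The fake readers stand in for the ecalls and count how often they are made.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: same value as the kernel's THEAD_VENDOR_ID constant. */
#define THEAD_VENDOR_ID 0x5b7UL

/* Hypothetical cache for the boot hart's IDs; not a real kernel structure. */
struct boot_hart_ids {
	unsigned long mvendorid;
	unsigned long marchid;
	bool valid;
};

struct boot_hart_ids boot_ids;
unsigned int fake_ecall_count;

/* Hypothetical stand-ins for the ecalls that fetch the ID values. */
unsigned long fake_read_mvendorid(void)
{
	fake_ecall_count++;
	return THEAD_VENDOR_ID;
}

unsigned long fake_read_marchid(void)
{
	fake_ecall_count++;
	return 0x0;
}

/* Read the IDs once on the boot hart and remember them. */
void cache_boot_hart_ids(void)
{
	if (boot_ids.valid)
		return;
	boot_ids.mvendorid = fake_read_mvendorid();
	boot_ids.marchid = fake_read_marchid();
	boot_ids.valid = true;
}

/* True when "v" from riscv,isa should be ignored, judged by the boot hart alone. */
bool ignore_v_from_riscv_isa(void)
{
	cache_boot_hart_ids();
	return boot_ids.mvendorid == THEAD_VENDOR_ID &&
	       boot_ids.marchid == 0x0;
}
```

In this model, calling ignore_v_from_riscv_isa() any number of times costs two fake ecalls in total, since later calls reuse the cached values.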
It's possible I'm misinterpreting, but is the suggestion to apply the
marchid/mvendorid we find on the boot CPU and assume it's the same on
all other CPUs? Since we're reporting the marchid/mvendorid/mimpid to
usermode in a per-hart way, it would be better IMO if we really do
query marchid/mvendorid/mimpid on each hart. The problem with applying
the boot CPU's value everywhere is if we're ever wrong in the future
(ie that assumption doesn't hold on some machine), we'll only find out
about it after the fact. Since we reported the wrong information to
usermode via hwprobe, it'll be an ugly userspace ABI issue to clean
up.
You're misinterpreting, we do get the values on all harts individually as
they're brought online. This is only used by the code that throws a bone
to people with crappy vendor dtbs that put "v" in riscv,isa when they
support the unratified version.
Not quite,
Remember that this patch stands in isolation and the justification given
in your commit message does not mention anything other than fixing my
broken patch.
Fixing the patch in the simplest sense would be to eagerly get the
mvendorid/marchid without using the cached version. But this assumes
that all harts have the same mvendorid/marchid. This is not something
that I am strongly attached to. If it truly is detrimental to Linux to
allow a user a way to specify different vendorids for different harts
then I will remove that code.
- Charlie
...
...
the alternatives are patched before the other cpus are
booted, so the alternatives will have false positives resulting in
broken kernels.
Over-eagerly disabling vector isn't going to break any kernels and
really should not break a behaving userspace either.
Under-eagerly disabling it (in a way that this approach could solve) is
only going to happen on a system where the boot hart has non-zero values
and claims support for v but a non-boot hart has zero values and
claims support for v but actually doesn't implement the ratified version.
If the boot hart doesn't support v, then we currently disable the
extension as only homogeneous stuff is supported by Linux. If the boot
hart claims support for "v" but doesn't actually implement the ratified
version neither the intent of my original patch nor this fix for it are
going to help avoid a broken kernel.
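
The cases above can be laid out as a small model, purely as a sketch. Every name here is hypothetical, and the model assumes that "v" only survives if every hart claims it (the homogeneous-only rule) and that "old T-Head" means the T-Head vendor ID with a marchid of 0x0. It compares a check on the boot hart alone against a check on every hart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-hart description; index 0 is the boot hart. */
struct hart_model {
	bool old_thead_ids;	/* T-Head vendor ID with a marchid of 0x0 */
	bool dt_claims_v;	/* "v" appears in riscv,isa for this hart */
};

/* Assumed homogeneous-only rule: "v" is kept only if every hart claims it. */
bool all_claim_v(const struct hart_model *h, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!h[i].dt_claims_v)
			return false;
	return n > 0;
}

/* Decide from the boot hart's IDs alone. */
bool v_enabled_boot_only(const struct hart_model *h, size_t n)
{
	return all_claim_v(h, n) && !h[0].old_thead_ids;
}

/* Decide from every hart's IDs. */
bool v_enabled_per_hart(const struct hart_model *h, size_t n)
{
	if (!all_claim_v(h, n))
		return false;
	for (size_t i = 0; i < n; i++)
		if (h[i].old_thead_ids)
			return false;
	return true;
}

/* The two policies only disagree in the "under-eager" case. */
bool policies_differ(const struct hart_model *h, size_t n)
{
	return v_enabled_boot_only(h, n) != v_enabled_per_hart(h, n);
}
```

Under these assumptions the two policies disagree only when the boot hart has non-zero IDs and some other hart has the old T-Head IDs, with all harts claiming "v". Neither policy can catch a hart that claims "v" with non-matching IDs but does not implement the ratified version.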
I think we do have a problem if the boot cpu having some erratum leads
to the kernel being patched in a way that does not work for the other
CPUs on the system, but I don't think this series addresses that sort of
issue at all as you'd be adding code to the pi section if you were fixing
it. I also don't think we should be making pre-emptive changes to the
errata patching code either to solve that sort of problem, until an SoC
shows up where things don't work.
Cheers,
Conor.
