On Mon, 06 Jan 2025 12:03:44 +0000, Mark Rutland <mark.rutland@arm.com> wrote:
On Mon, Jan 06, 2025 at 11:12:53AM +0000, Marc Zyngier wrote:
On Mon, 06 Jan 2025 09:40:56 +0000, Mark Rutland <mark.rutland@arm.com> wrote:
On Fri, Jan 03, 2025 at 06:22:55PM +0000, Marc Zyngier wrote:
The hwcaps code that exposes SVE features to userspace only considers ID_AA64ZFR0_EL1, while this is only valid when ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the ID_AA64ZFR0_EL1 register is also 0. So far, so good.
Things become a bit more interesting if the HW implements SME. In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME* features. And these fields overlap with their SVE interpretations. But the architecture says that the SME and SVE feature sets must match, so we're still hunky-dory.
This goes wrong if the HW implements SME, but not SVE. In this case, we end up advertising some SVE features to userspace, even if the HW has none. That's because we never consider whether SVE is actually implemented. Oh well.
Ugh; this is a massive pain. :(
Was this found by inspection, or is some real software going wrong?
Catalin can comment on that -- I understand that he found existing SW latching on SVE2 being wrongly advertised as hwcaps.
Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE being non-zero.
Unfortunately, I'm not sure this fix is correct+complete.
We expose ID_AA64PFR0_EL1 and ID_AA64ZFR0_EL1 via ID register emulation, so any userspace software reading ID_AA64ZFR0_EL1 will encounter the same surprise. If we hide that, I'm worried we might hide some SME-only information that isn't exposed elsewhere, and I'm not sure we can reasonably hide ID_AA64ZFR0_EL1 emulation for SME-only (more on that below).
I don't understand where things go wrong. EL0 SW that looks at the ID registers should perform similar checks, and we are not trying to make things better on that front (we can't). Unless you invent time travel and fix the architecture 5 years ago... :-/
Fair enough; if we say software consuming ID_AA64ZFR0_EL1 must check ID_AA64PFR0_EL1.SVE or ID_AA64PFR1_EL1.SME first, and we leave the emulation of ID_AA64ZFR0_EL1 as-is, that's fine by me.
I think that's what the architecture forces on us, unfortunately.
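For illustration only, the check that a consumer of the emulated ID registers would need can be sketched as below. This is a minimal sketch and not code from the patch: the names `id_reg_field()`, `enum zfr0_view` and `classify_zfr0()` are hypothetical, the register values are passed in as plain integers rather than read with MRS, and the field positions (ID_AA64PFR0_EL1.SVE at bits [35:32], ID_AA64PFR1_EL1.SME at bits [27:24]) are taken from the Arm ARM.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions per the Arm ARM; each ID field is 4 bits wide. */
#define ID_AA64PFR0_SVE_SHIFT	32	/* ID_AA64PFR0_EL1.SVE */
#define ID_AA64PFR1_SME_SHIFT	24	/* ID_AA64PFR1_EL1.SME */

/* Hypothetical helper: extract an unsigned 4-bit ID register field. */
static inline unsigned int id_reg_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/* Hypothetical classification of how ID_AA64ZFR0_EL1 may be interpreted. */
enum zfr0_view {
	ZFR0_INVALID,	/* neither SVE nor SME: fields carry no meaning */
	ZFR0_SVE,	/* SVE present (with or without SME) */
	ZFR0_SME_ONLY,	/* SME without SVE: fields describe SME features only */
};

/* Decide how ID_AA64ZFR0_EL1 should be read, given PFR0 and PFR1. */
static inline enum zfr0_view classify_zfr0(uint64_t pfr0, uint64_t pfr1)
{
	if (id_reg_field(pfr0, ID_AA64PFR0_SVE_SHIFT))
		return ZFR0_SVE;
	if (id_reg_field(pfr1, ID_AA64PFR1_SME_SHIFT))
		return ZFR0_SME_ONLY;
	return ZFR0_INVALID;
}

/* Usage example with made-up register values. */
static inline void classify_zfr0_demo(void)
{
	uint64_t sve_only = 1ULL << ID_AA64PFR0_SVE_SHIFT;
	uint64_t sme_only = 1ULL << ID_AA64PFR1_SME_SHIFT;

	assert(classify_zfr0(sve_only, 0) == ZFR0_SVE);
	assert(classify_zfr0(0, sme_only) == ZFR0_SME_ONLY);
	assert(classify_zfr0(0, 0) == ZFR0_INVALID);
}
```

Only in the ZFR0_SVE case would a non-zero ID_AA64ZFR0_EL1 field imply the corresponding SVE feature.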
The hwcaps are effectively demultiplexing the ID registers, and they have to be exact, which is what this patch provides (SVE2 doesn't get wrongly advertised when not present).
Secondly, all our HWCAP documentation is written in the form:
| HWCAP2_SVEBF16 | Functionality implied by ID_AA64ZFR0_EL1.BF16 == 0b0001.
... so while the architectural behaviour is a surprise, the kernel is (technically) behaving exactly as documented prior to this patch. Maybe we need to change that documentation?
Again, I don't see what goes wrong here. BF16 is only implemented for SVE or SME+FA64, and FA64 requires SVE2. So at least for that one, we should be good.
That was probably a bad example. What I was trying to get at is that the HWCAPs are behaving exactly *as documented*, but that's not what we actually want them to describe. For example, SVE2 is described as:
| Functionality implied by ID_AA64ZFR0_EL1.SVEver == 0b0001.
... which is exactly what we check today, but that doesn't architecturally imply FEAT_SVE2 on SME-only HW where it can apparently be 0b0001 due to FEAT_SME alone.
So to match the code change we'd need to change that to something like:
| Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001 and
| ID_AA64ZFR0_EL1.SVEver == 0b0001
... with similar for other hwcaps.
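As a rough sketch of the difference between the two wordings (again not the actual patch), the SVE2 case can be written as a pair of predicates over raw register values. The function names are hypothetical, and the fields are compared as minimum values here, which is an assumption on my part: the documentation wording above uses `==`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PFR0_SVE_SHIFT		32	/* ID_AA64PFR0_EL1.SVE, bits [35:32] */
#define ZFR0_SVEVER_SHIFT	0	/* ID_AA64ZFR0_EL1.SVEver, bits [3:0] */

/* Hypothetical helper: extract an unsigned 4-bit ID register field. */
static inline unsigned int idreg_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/* Old wording: SVE2 is implied by ID_AA64ZFR0_EL1.SVEver alone. */
static inline bool sve2_hwcap_ungated(uint64_t zfr0)
{
	return idreg_field(zfr0, ZFR0_SVEVER_SHIFT) >= 1;
}

/* Proposed wording: additionally require ID_AA64PFR0_EL1.SVE. */
static inline bool sve2_hwcap_gated(uint64_t pfr0, uint64_t zfr0)
{
	return idreg_field(pfr0, PFR0_SVE_SHIFT) >= 1 &&
	       idreg_field(zfr0, ZFR0_SVEVER_SHIFT) >= 1;
}

/* Usage example: made-up values for an SME-only implementation. */
static inline void sve2_hwcap_demo(void)
{
	uint64_t pfr0 = 0;	/* SVE == 0: FEAT_SVE not implemented */
	uint64_t zfr0 = 1;	/* SVEver == 0b0001, due to SME alone */

	assert(sve2_hwcap_ungated(zfr0));	/* SVE2 wrongly reported */
	assert(!sve2_hwcap_gated(pfr0, zfr0));	/* filtered out */
}
```

On HW with both SVE and SVE2 the two predicates agree; they only differ in the SME-only case.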
Yeah, seems like a decent addition. I'll fold that in.
Do we have equivalent SME hwcaps for the relevant features?
... looking at:
https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID...
... I see that ID_AA64ZFR0_EL1.B16B16 >= 0b0010 implies the presence of SME BFMUL and BFSCALE instructions, but I don't see something equivalent in ID_AA64SMFR0_EL1 per:
https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID...
... so I suspect ID_AA64ZFR0_EL1 might be the only source of truth for this.
Indeed, and the SME HWCAPs are not doing the right thing either. Or rather, we have no way to tell userspace that BFMUL/BFSCALE are available.
To be clear, I'm happy to punt on adding SME-specific HWCAPs, I just want to make sure we're agreed as to whether the existing HWCAPs should be SVE-specific, which it sounds like we are?
I think we're aligned here. I'll respin something shortly, once I've made some progress on the state of my Inbox... :-/
Thanks,
M.