Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this.
(I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.)
Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling")
Cc: stable@vger.kernel.org
Reported-by: Diederik de Haas <diederik@cknow-tech.com>
Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.co...
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
If it's okay, I'd like to just take this via libcrypto-fixes.
 arch/arm64/crypto/ghash-ce-glue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index 7951557a285a..ef249d06c92c 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -131,11 +131,11 @@ static int ghash_finup(struct shash_desc *desc, const u8 *src,
 	if (len) {
 		u8 buf[GHASH_BLOCK_SIZE] = {};
 
 		memcpy(buf, src, len);
-		ghash_do_simd_update(1, ctx->digest, src, key, NULL,
+		ghash_do_simd_update(1, ctx->digest, buf, key, NULL,
 				     pmull_ghash_update_p8);
 		memzero_explicit(buf, sizeof(buf));
 	}
 	return ghash_export(desc, dst);
 }
base-commit: 7a3984bbd69055898add0fe22445f99435f33450
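To illustrate the class of bug the patch fixes, here is a minimal user-space sketch. It is not the kernel code: the toy_* names are hypothetical, and the per-block step is a plain XOR standing in for the real GHASH update (which also multiplies by the hash key in GF(2^128)). It only shows why consuming src instead of the zero-padded buf changes the result for a partial final block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Hypothetical stand-in for one block of GHASH: XOR a full 16-byte
 * block into the digest. The field multiplication is omitted. */
static void toy_absorb_block(uint8_t digest[BLOCK_SIZE], const uint8_t *block)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		digest[i] ^= block[i];
}

/* Buggy pattern: the padded copy is built, but the original pointer is
 * consumed, so BLOCK_SIZE - len bytes past the message are absorbed.
 * For this demo the caller guarantees src has BLOCK_SIZE readable bytes. */
static void toy_finup_buggy(uint8_t digest[BLOCK_SIZE], const uint8_t *src,
			    size_t len)
{
	assert(len < BLOCK_SIZE);
	if (len) {
		uint8_t buf[BLOCK_SIZE] = {0};

		memcpy(buf, src, len);
		toy_absorb_block(digest, src);	/* wrong buffer */
	}
}

/* Fixed pattern: absorb the zero-padded copy of the partial block. */
static void toy_finup_fixed(uint8_t digest[BLOCK_SIZE], const uint8_t *src,
			    size_t len)
{
	assert(len < BLOCK_SIZE);
	if (len) {
		uint8_t buf[BLOCK_SIZE] = {0};

		memcpy(buf, src, len);
		toy_absorb_block(digest, buf);	/* padded buffer */
	}
}

/* Returns 1 if the two variants disagree for a 5-byte message that is
 * followed in memory by unrelated nonzero bytes. */
static int toy_variants_differ(void)
{
	uint8_t backing[BLOCK_SIZE];
	uint8_t d_buggy[BLOCK_SIZE] = {0};
	uint8_t d_fixed[BLOCK_SIZE] = {0};

	memset(backing, 0xAA, sizeof(backing));	/* bytes after the message */
	memcpy(backing, "hello", 5);		/* the 5-byte message */

	toy_finup_buggy(d_buggy, backing, 5);
	toy_finup_fixed(d_fixed, backing, 5);
	return memcmp(d_buggy, d_fixed, BLOCK_SIZE) != 0;
}
```

In this sketch toy_variants_differ() returns 1, because the buggy variant absorbs the eleven 0xAA bytes that follow the message while the fixed variant absorbs zeroes. When the length is a multiple of the block size, len is 0 on entry and neither variant touches the digest, which matches the observation that only non-multiple-of-16 lengths were affected.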
On Tue, Dec 09, 2025 at 02:34:17PM -0800, Eric Biggers wrote:
Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this.
(I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.)
Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling")
Cc: stable@vger.kernel.org
Reported-by: Diederik de Haas <diederik@cknow-tech.com>
Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.co...
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
If it's okay, I'd like to just take this via libcrypto-fixes.
 arch/arm64/crypto/ghash-ce-glue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Thanks for catching this!
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
On Tue, Dec 09, 2025 at 02:34:17PM -0800, Eric Biggers wrote:
Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this.
(I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.)
Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling")
Cc: stable@vger.kernel.org
Reported-by: Diederik de Haas <diederik@cknow-tech.com>
Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.co...
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
If it's okay, I'd like to just take this via libcrypto-fixes.
 arch/arm64/crypto/ghash-ce-glue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Applied to https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git/log/?h=li...
(As always, additional reviews/acks still appreciated!)
- Eric
On Tue Dec 9, 2025 at 11:34 PM CET, Eric Biggers wrote:
Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this.
I was hoping to not have to do a 'git bisect', but this is much better :-D I can confirm that this patch fixes the error I was seeing, so
Tested-by: Diederik de Haas <diederik@cknow-tech.com>
(I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.)
https://www.qemu.org/docs/master/system/arm/raspi.html indicates it can emulate various Raspberry Pi models. I've only tested it with RPi 3B+ (bc of its wifi+bt chip), but I wouldn't be surprised if all RPi models would have this problem? Dunno if QEMU emulates that though.
Thanks for the quick fix!
Cheers, Diederik
Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling")
Cc: stable@vger.kernel.org
Reported-by: Diederik de Haas <diederik@cknow-tech.com>
Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.co...
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
If it's okay, I'd like to just take this via libcrypto-fixes.
 arch/arm64/crypto/ghash-ce-glue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index 7951557a285a..ef249d06c92c 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -131,11 +131,11 @@ static int ghash_finup(struct shash_desc *desc, const u8 *src,
 	if (len) {
 		u8 buf[GHASH_BLOCK_SIZE] = {};
 
 		memcpy(buf, src, len);
-		ghash_do_simd_update(1, ctx->digest, src, key, NULL,
+		ghash_do_simd_update(1, ctx->digest, buf, key, NULL,
 				     pmull_ghash_update_p8);
 		memzero_explicit(buf, sizeof(buf));
 	}
 	return ghash_export(desc, dst);
 }
base-commit: 7a3984bbd69055898add0fe22445f99435f33450
On Wed, 10 Dec 2025 at 18:22, Diederik de Haas <diederik@cknow-tech.com> wrote:
On Tue Dec 9, 2025 at 11:34 PM CET, Eric Biggers wrote:
Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this.
I was hoping to not have to do a 'git bisect', but this is much better :-D I can confirm that this patch fixes the error I was seeing, so
Tested-by: Diederik de Haas <diederik@cknow-tech.com>
(I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.)
https://www.qemu.org/docs/master/system/arm/raspi.html indicates it can emulate various Raspberry Pi models. I've only tested it with RPi 3B+ (bc of its wifi+bt chip), but I wouldn't be surprised if all RPi models would have this problem? Dunno if QEMU emulates that though.
All 64-bit RPi models except the RPi5 are affected by this, as those do not implement the crypto extensions. So I would expect QEMU to do the same.
It would be nice, though, if we could emulate this on the mach-virt machine model too. It should be fairly trivial to do, so if there is demand for this I can look into it.
On Wed, Dec 10, 2025 at 06:31:44PM +0900, Ard Biesheuvel wrote:
On Wed, 10 Dec 2025 at 18:22, Diederik de Haas <diederik@cknow-tech.com> wrote:
On Tue Dec 9, 2025 at 11:34 PM CET, Eric Biggers wrote:
Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this.
I was hoping to not have to do a 'git bisect', but this is much better :-D I can confirm that this patch fixes the error I was seeing, so
Tested-by: Diederik de Haas <diederik@cknow-tech.com>
(I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.)
https://www.qemu.org/docs/master/system/arm/raspi.html indicates it can emulate various Raspberry Pi models. I've only tested it with RPi 3B+ (bc of its wifi+bt chip), but I wouldn't be surprised if all RPi models would have this problem? Dunno if QEMU emulates that though.
All 64-bit RPi models except the RPi5 are affected by this, as those do not implement the crypto extensions. So I would expect QEMU to do the same.
It would be nice, though, if we could emulate this on the mach-virt machine model too. It should be fairly trivial to do, so if there is demand for this I can look into it.
I'm definitely interested in it. I'm already testing multiple "-cpu" options, and it's easy to add more.
With qemu-system-aarch64 I'm currently only using "-M virt", since the other machine models I've tried don't boot with arm64 defconfig, including "-M raspi3b" and "-M raspi4b".
There may be some tricks I'm missing. Regardless, expanding the selection of available CPUs for "-M virt" would be helpful. Either by adding "real" CPUs that have "interesting" combinations of features, or by just allowing turning features off like "-cpu max,aes=off,pmull=off,sha256=off". (Certain features like sve can already be turned off in that way, but not the ones relevant to us.)
- Eric
On Fri, 12 Dec 2025 at 06:40, Eric Biggers <ebiggers@kernel.org> wrote:
On Wed, Dec 10, 2025 at 06:31:44PM +0900, Ard Biesheuvel wrote:
On Wed, 10 Dec 2025 at 18:22, Diederik de Haas <diederik@cknow-tech.com> wrote:
On Tue Dec 9, 2025 at 11:34 PM CET, Eric Biggers wrote:
Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this.
I was hoping to not have to do a 'git bisect', but this is much better :-D I can confirm that this patch fixes the error I was seeing, so
Tested-by: Diederik de Haas <diederik@cknow-tech.com>
(I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.)
https://www.qemu.org/docs/master/system/arm/raspi.html indicates it can emulate various Raspberry Pi models. I've only tested it with RPi 3B+ (bc of its wifi+bt chip), but I wouldn't be surprised if all RPi models would have this problem? Dunno if QEMU emulates that though.
All 64-bit RPi models except the RPi5 are affected by this, as those do not implement the crypto extensions. So I would expect QEMU to do the same.
It would be nice, though, if we could emulate this on the mach-virt machine model too. It should be fairly trivial to do, so if there is demand for this I can look into it.
I'm definitely interested in it. I'm already testing multiple "-cpu" options, and it's easy to add more.
With qemu-system-aarch64 I'm currently only using "-M virt", since the other machine models I've tried don't boot with arm64 defconfig, including "-M raspi3b" and "-M raspi4b".
There may be some tricks I'm missing. Regardless, expanding the selection of available CPUs for "-M virt" would be helpful. Either by adding "real" CPUs that have "interesting" combinations of features, or by just allowing turning features off like "-cpu max,aes=off,pmull=off,sha256=off". (Certain features like sve can already be turned off in that way, but not the ones relevant to us.)
There are some architectural rules around which combinations of crypto extensions are permitted:
- PMULL implies AES, and there is no way for the ID registers to describe a CPU that has PMULL but not AES
- SHA256 implies SHA1 (but the ID register fields are independent)
- SHA3 and SHA512 both imply SHA256+SHA1
- SVE versions are not allowed to be implemented unless the plain NEON version is implemented as well
- FEAT_Crypto has different meanings for v8.0, v8.2 and v9.x
So it would be much easier, also in terms of future maintenance, to have a simple 'crypto=off' setting that applies to all emulated CPU models, given that disabling all crypto on any given compliant CPU will never result in something that the architecture does not permit.
Would that work for you?
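The argument above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the F_* flags and function names are hypothetical (they are not QEMU or kernel identifiers), and only the first three implication rules from the list are modelled; the SVE rule and the version-dependent meaning of FEAT_Crypto are left out. The point it shows is that an all-off mask satisfies every "A implies B" rule trivially, whereas turning off individual features can produce a combination the rules forbid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature flags, for illustration only. */
enum {
	F_AES    = 1u << 0,
	F_PMULL  = 1u << 1,
	F_SHA1   = 1u << 2,
	F_SHA256 = 1u << 3,
	F_SHA512 = 1u << 4,
	F_SHA3   = 1u << 5,
};

/* "a implies b": if a is present, every bit of b must be present too. */
static bool implies(unsigned int feats, unsigned int a, unsigned int b)
{
	return !(feats & a) || (feats & b) == b;
}

/* Checks only the implication rules quoted in the thread. */
static bool crypto_combo_permitted(unsigned int feats)
{
	return implies(feats, F_PMULL, F_AES) &&
	       implies(feats, F_SHA256, F_SHA1) &&
	       implies(feats, F_SHA512, F_SHA256 | F_SHA1) &&
	       implies(feats, F_SHA3, F_SHA256 | F_SHA1);
}

/* A few worked cases mirroring the discussion. */
static void check_examples(void)
{
	/* Everything off: no rule has a premise that holds. */
	assert(crypto_combo_permitted(0));
	/* PMULL without AES is not describable. */
	assert(!crypto_combo_permitted(F_PMULL));
	/* AES with PMULL is fine. */
	assert(crypto_combo_permitted(F_AES | F_PMULL));
	/* SHA3 with SHA256 but without SHA1 violates two rules. */
	assert(!crypto_combo_permitted(F_SHA3 | F_SHA256));
}
```

Under this model, a per-feature switch such as turning AES off while leaving PMULL on would need extra validation, while a single all-or-nothing switch never does.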
On Mon, Dec 15, 2025 at 04:54:34PM +0900, Ard Biesheuvel wrote:
All 64-bit RPi models except the RPi5 are affected by this, as those do not implement the crypto extensions. So I would expect QEMU to do the same.
It would be nice, though, if we could emulate this on the mach-virt machine model too. It should be fairly trivial to do, so if there is demand for this I can look into it.
I'm definitely interested in it. I'm already testing multiple "-cpu" options, and it's easy to add more.
With qemu-system-aarch64 I'm currently only using "-M virt", since the other machine models I've tried don't boot with arm64 defconfig, including "-M raspi3b" and "-M raspi4b".
There may be some tricks I'm missing. Regardless, expanding the selection of available CPUs for "-M virt" would be helpful. Either by adding "real" CPUs that have "interesting" combinations of features, or by just allowing turning features off like "-cpu max,aes=off,pmull=off,sha256=off". (Certain features like sve can already be turned off in that way, but not the ones relevant to us.)
There are some architectural rules around which combinations of crypto extensions are permitted:
- PMULL implies AES, and there is no way for the ID registers to
describe a CPU that has PMULL but not AES
- SHA256 implies SHA1 (but the ID register fields are independent)
- SHA3 and SHA512 both imply SHA256+SHA1
- SVE versions are not allowed to be implemented unless the plain NEON
version is implemented as well
- FEAT_Crypto has different meanings for v8.0, v8.2 and v9.x
So it would be much easier, also in terms of future maintenance, to have a simple 'crypto=off' setting that applies to all emulated CPU models, given that disabling all crypto on any given compliant CPU will never result in something that the architecture does not permit.
Would that work for you?
I thought it had been established that the "crypto" grouping of features (as implemented by gcc and clang) doesn't reflect the actual hardware feature fields and is misleading because additional crypto extensions continue to be added.
I'm not sure that applies here, but just something to consider.
There's certainly no need to support emulating combinations of features that no hardware actually implements. So yes, if that means "crypto" is the right choice, that sounds fine.
- Eric
On Mon, 15 Dec 2025 at 21:16, Eric Biggers <ebiggers@kernel.org> wrote:
On Mon, Dec 15, 2025 at 04:54:34PM +0900, Ard Biesheuvel wrote:
All 64-bit RPi models except the RPi5 are affected by this, as those do not implement the crypto extensions. So I would expect QEMU to do the same.
It would be nice, though, if we could emulate this on the mach-virt machine model too. It should be fairly trivial to do, so if there is demand for this I can look into it.
I'm definitely interested in it. I'm already testing multiple "-cpu" options, and it's easy to add more.
With qemu-system-aarch64 I'm currently only using "-M virt", since the other machine models I've tried don't boot with arm64 defconfig, including "-M raspi3b" and "-M raspi4b".
There may be some tricks I'm missing. Regardless, expanding the selection of available CPUs for "-M virt" would be helpful. Either by adding "real" CPUs that have "interesting" combinations of features, or by just allowing turning features off like "-cpu max,aes=off,pmull=off,sha256=off". (Certain features like sve can already be turned off in that way, but not the ones relevant to us.)
There are some architectural rules around which combinations of crypto extensions are permitted:
- PMULL implies AES, and there is no way for the ID registers to
describe a CPU that has PMULL but not AES
- SHA256 implies SHA1 (but the ID register fields are independent)
- SHA3 and SHA512 both imply SHA256+SHA1
- SVE versions are not allowed to be implemented unless the plain NEON
version is implemented as well
- FEAT_Crypto has different meanings for v8.0, v8.2 and v9.x
So it would be much easier, also in terms of future maintenance, to have a simple 'crypto=off' setting that applies to all emulated CPU models, given that disabling all crypto on any given compliant CPU will never result in something that the architecture does not permit.
Would that work for you?
I thought it had been established that the "crypto" grouping of features (as implemented by gcc and clang) doesn't reflect the actual hardware feature fields and is misleading because additional crypto extensions continue to be added.
I'm not sure that applies here, but just something to consider.
You are right, this is why 'crypto=on' can never mean anything other than 'do not disable the crypto extensions that this particular CPU type provides'. But that does not mean 'crypto=off' is equally problematic.
There's certainly no need to support emulating combinations of features that no hardware actually implements. So yes, if that means "crypto" is the right choice, that sounds fine.
OK, I'll have a stab at that and cc you on the patches.
linux-stable-mirror@lists.linaro.org