Re: [PATCH] x86: checksum: Fix unaligned checksums on < i686

13 Jul 2023


      On Mon, 10 Jul 2023 at 23:01, David Laight David.Laight@aculab.com wrote:
...
From: David Gow
...
Sent: 04 July 2023 09:32
The checksum_32 code was originally written to only handle 2-byte
aligned buffers, but was later extended to support arbitrary alignment.
However, the non-PPro variant doesn't apply the carry before jumping to
the 2- or 4-byte aligned versions, which clear CF.
....
...
I also tested it on a real 486DX2, with the same results.
Which cpu does anyone really care about?
The unrolled 'adcl' loop is horrid on intel cpu between
(about) 'core' and 'haswell' because each u-op can only
have two inputs and adc needs 3 - so is 2 u-ops.
First fixed by summing to alternate registers.
On anything modern (well I've not checked some Atom based
servers) misaligned accesses are pretty near zero cost.
So it really isn't worth the tests that align data.
(I suspect it all got better a long time ago except
for transfers that cross cache-line boundaries, with
adc taking two cycles even that might be free.)
I agree that the implementation here is suitably ancient far from
optimal, but think the alignment issue in this patch is a separate
correctness issue. (Even if it's one which is unlikely to result in
real-world problems at present.)
There's probably a valid argument around whether or not the
checksumming code should be alignment-agnostic, or whether a 2- or 4-
byte alignment is something callers have to guarantee, but since some
effort had gone into making these work with unaligned data, I think
it's sensible to make sure those cases actually work, and so for the
KUnit test to verify that all these different alignments all give
correct results.
If it seems worth cleaning up / optimising the code more significantly
(maybe some of the people who really care have TCP header checksum
offload anyway), that seems like a separate task to me. Personally,
while I care quite a bit that 32-bit x86 is still working (even on
ancient CPUs), I don't really have the time to spend optimising it.
(Worst-case, if maintaining it gets too rough, we could possibly fall
back to the C implementations, though I haven't benchmarked them.)
-- David

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH] x86: checksum: Fix unaligned checksums on < i686