After re-checking in the spec and comparing stack offsets with glibc, the last pushed argument must be 16-byte aligned (i.e. aligned before the call) so that in the callee esp+4 is a multiple of 16, so the principle is the 32-bit equivalent to what Ammar fixed for x86_64. It's possible that 32-bit code using SSE2 or MMX could have been affected. In addition the frame pointer ought to be zero at the deepest level.

Link: https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
Cc: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 tools/include/nolibc/nolibc.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 96b6d56acb57..7f300dc379e7 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -583,13 +583,21 @@ struct sys_stat_struct {
 })

 /* startup code */
+/*
+ * i386 System V ABI mandates:
+ * 1) last pushed argument must be 16-byte aligned.
+ * 2) The deepest stack frame should be set to zero
+ *
+ */
 asm(".section .text\n"
     ".global _start\n"
     "_start:\n"
     "pop %eax\n"                // argc   (first arg, %eax)
     "mov %esp, %ebx\n"          // argv[] (second arg, %ebx)
     "lea 4(%ebx,%eax,4),%ecx\n" // then a NULL then envp (third arg, %ecx)
-    "and $-16, %esp\n"          // x86 ABI : esp must be 16-byte aligned when
+    "xor %ebp, %ebp\n"          // zero the stack frame
+    "and $-16, %esp\n"          // x86 ABI : esp must be 16-byte aligned before
+    "sub $4, %esp\n"            // the call instruction (args are aligned)
     "push %ecx\n"               // push all registers on the stack so that we
     "push %ebx\n"               // support both regparm and plain stack modes
     "push %eax\n"

From: Willy Tarreau
Sent: 24 October 2021 18:28
> After re-checking in the spec and comparing stack offsets with glibc, the last pushed argument must be 16-byte aligned (i.e. aligned before the call) so that in the callee esp+4 is a multiple of 16, so the principle is the 32-bit equivalent to what Ammar fixed for x86_64. It's possible that 32-bit code using SSE2 or MMX could have been affected. In addition the frame pointer ought to be zero at the deepest level.
...
>  /* startup code */
> +/*
> + * i386 System V ABI mandates:
> + * 1) last pushed argument must be 16-byte aligned.
> + * 2) The deepest stack frame should be set to zero

I'm pretty sure that the historic SYSV i386 ABI only ever required 4-byte alignment for the stack.

At some point it got 'randomly' changed to 16-byte. I don't think this happened until after compiler support for SSE2 intrinsics was added. ISTR that NetBSD found that it was gcc that moved the goalposts.

David
- Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Mon, Oct 25, 2021 at 07:46:11AM +0000, David Laight wrote:
> From: Willy Tarreau
> > Sent: 24 October 2021 18:28
> >
> > After re-checking in the spec and comparing stack offsets with glibc, the last pushed argument must be 16-byte aligned (i.e. aligned before the call) so that in the callee esp+4 is a multiple of 16, so the principle is the 32-bit equivalent to what Ammar fixed for x86_64. It's possible that 32-bit code using SSE2 or MMX could have been affected. In addition the frame pointer ought to be zero at the deepest level.
> ...
> >  /* startup code */
> > +/*
> > + * i386 System V ABI mandates:
> > + * 1) last pushed argument must be 16-byte aligned.
> > + * 2) The deepest stack frame should be set to zero
>
> I'm pretty sure that the historic SYSV i386 ABI only ever required 4-byte alignment for the stack.
>
> At some point it got 'randomly' changed to 16-byte. I don't think this happened until after compiler support for SSE2 intrinsics was added.

It's very possible because I've done a number of tests and noticed that in some cases the called functions' stack doesn't seem to be more than 4-aligned. However the deepest function in the stack starts with an aligned stack so I prefer to follow this same rule.
Willy
From: Willy Tarreau
Sent: 25 October 2021 09:06
> On Mon, Oct 25, 2021 at 07:46:11AM +0000, David Laight wrote:
> > From: Willy Tarreau
> > > Sent: 24 October 2021 18:28
> > >
> > > After re-checking in the spec and comparing stack offsets with glibc, the last pushed argument must be 16-byte aligned (i.e. aligned before the call) so that in the callee esp+4 is a multiple of 16, so the principle is the 32-bit equivalent to what Ammar fixed for x86_64. It's possible that 32-bit code using SSE2 or MMX could have been affected. In addition the frame pointer ought to be zero at the deepest level.
> > ...
> > >  /* startup code */
> > > +/*
> > > + * i386 System V ABI mandates:
> > > + * 1) last pushed argument must be 16-byte aligned.
> > > + * 2) The deepest stack frame should be set to zero
> >
> > I'm pretty sure that the historic SYSV i386 ABI only ever required 4-byte alignment for the stack.
> >
> > At some point it got 'randomly' changed to 16-byte. I don't think this happened until after compiler support for SSE2 intrinsics was added.
>
> It's very possible because I've done a number of tests and noticed that in some cases the called functions' stack doesn't seem to be more than 4-aligned. However the deepest function in the stack starts with an aligned stack so I prefer to follow this same rule.

Any call through asm is unlikely to maintain the 16-byte alignment. But yes, starting off on the alignment required by gcc does no harm.
David