On 26.08.20 16:27, Thomas Gleixner wrote:
On Wed, Aug 26 2020 at 13:53, Alexander Graf wrote:
Commit 633260fa143 ("x86/irq: Convey vector as argument and not in ptregs") changed the handover logic of the vector identifier from ~vector in orig_ax to purely register based. Unfortunately, this field has another consumer in the APIC code which the commit did not touch. The net result was that IRQ balancing did not work and instead resulted in interrupt storms, slowing down the system.
The net result is an observationof the symptom but that does not explain what the underlying technical issue is.
This patch restores the original semantics that orig_ax contains the vector. When we receive an interrupt now, the actual vector number stays stored in the orig_ax field which then gets consumed by the APIC code.
To ensure that nobody else trips over this in the future, the patch also adds comments at strategic places to warn anyone who would refactor the code that there is another consumer of the field.
With this patch in place, IRQ balancing works as expected and performance levels are restored to previous levels.
There's a lot of 'This patch and we' in that changelog. Care to grep for 'This patch' in Documentation/process/ ?
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index df8c017..22e829c 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -727,7 +727,7 @@ SYM_CODE_START_LOCAL(asm_\cfunc) ENCODE_FRAME_POINTER movl %esp, %eax movl PT_ORIG_EAX(%esp), %edx /* get the vector from stack */
movl $-1, PT_ORIG_EAX(%esp) /* no syscall to restart */
/* keep vector on stack for APIC's irq_complete_move() */
Yes that's fixing your observed wreckage, but it introduces a worse one.
user space -> interrupt push vector into orig_ax (values are in the ranges of 0-127 and -128 - 255 except for the system vectors which do not go through this code) handle() ... exit_to_user_mode_loop() arch_do_signal() /* Did we come from a system call? */ if (syscall_get_nr(current, regs) >= 0) {
---> BOOM for any vector 0-127 because syscall_get_nr() resolves to regs->orig_ax
Going to be fun to debug.
Hah, that's the code flow I was looking for to understand why the value was negative in the first place. Thanks a lot for pointing it out!
The below nasty hack cures it, but I hate it with a passion. I'll look deeper for a sane variant.
An alternative (that doesn't make the code easier to read, but would fix the issue at hand) would be touse a pushq imm16 with vector | 0x8000 instead to always make the value negative, no?
Alex
Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879