From: Mark Rutland mark.rutland@arm.com
[ Upstream commit e85094c31ddb794ac41c299a5a7a68243148f829 ]
Due to some historical confusion, arm64's current_top_of_stack() isn't what the stackleak code expects. This could in theory result in a number of problems, and practically results in an unnecessary performance hit. We can avoid this by aligning the arm64 implementation with the x86 implementation.
The arm64 implementation of current_top_of_stack() was added specifically for stackleak in commit:
0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
This was intended to be equivalent to the x86 implementation, but the implementation, semantics, and performance characteristics differ wildly:
* On x86, current_top_of_stack() returns the top of the current task's task stack, regardless of which stack is in active use.
The implementation accesses a percpu variable which the x86 entry code maintains, and returns the location immediately above the pt_regs on the task stack (above which x86 has some padding).
* On arm64 current_top_of_stack() returns the top of the stack in active use (i.e. the one which is currently being used).
The implementation checks the SP against a number of potentially-accessible stacks, and will BUG() if no stack is found.
The core stackleak_erase() code determines the upper bound of stack to erase with:
| if (on_thread_stack()) | boundary = current_stack_pointer; | else | boundary = current_top_of_stack();
On arm64 stackleak_erase() is always called on a task stack, and on_thread_stack() should always be true. On x86, stackleak_erase() is mostly called on a trampoline stack, and is sometimes called on a task stack.
Currently, this results in a lot of unnecessary code being generated for arm64 for the impossible !on_thread_stack() case. Some of this is inlined, bloating stackleak_erase(), while portions of this are left out-of-line and permitted to be instrumented (which would be a functional problem if that code were reachable).
As a first step towards improving this, this patch aligns arm64's implementation of current_top_of_stack() with x86's, always returning the top of the current task's stack. With GCC 11.1.0 this results in the bulk of the unnecessary code being removed, including all of the out-of-line instrumentable code.
While I don't believe there's a functional problem in practice I've marked this as a fix since the semantic was clearly wrong, the fix itself is simple, and other code might rely upon this in future.
Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin") Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Alexander Popov alex.popov@linux.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Andy Lutomirski luto@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Kees Cook keescook@chromium.org Cc: Will Deacon will@kernel.org Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/20220427173128.2603085-2-mark.rutland@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/processor.h | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index ee2bdc1b9f5b..5e73d7f7d1e7 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -335,12 +335,10 @@ long get_tagged_addr_ctrl(struct task_struct *task); * of header definitions for the use of task_stack_page. */
-#define current_top_of_stack() \ -({ \ - struct stack_info _info; \ - BUG_ON(!on_accessible_stack(current, current_stack_pointer, 1, &_info)); \ - _info.high; \ -}) +/* + * The top of the current task's task stack + */ +#define current_top_of_stack() ((unsigned long)current->stack + THREAD_SIZE) #define on_thread_stack() (on_task_stack(current, current_stack_pointer, 1, NULL))
#endif /* __ASSEMBLY__ */