2025-05-16T08:34:25-07:00, Deepak Gupta debug@rivosinc.com:
On Thu, May 15, 2025 at 10:48:35AM +0200, Radim Krčmář wrote:
2025-05-15T09:28:25+02:00, Alexandre Ghiti alex@ghiti.fr:
On 06/05/2025 12:10, Radim Krčmář wrote:
2025-05-02T16:30:36-07:00, Deepak Gupta debug@rivosinc.com:
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S @@ -91,6 +91,32 @@ +.macro restore_userssp tmp
- ALTERNATIVE("nops(2)",
__stringify( \
REG_L \tmp, TASK_TI_USER_SSP(tp); \
csrw CSR_SSP, \tmp),
0,
RISCV_ISA_EXT_ZICFISS,
CONFIG_RISCV_USER_CFI)
+.endm
Do we need to emit the nops when CONFIG_RISCV_USER_CFI isn't selected?
(Why not put #ifdef CONFIG_RISCV_USER_CFI around the ALTERNATIVES?)
The alternatives are used to create a generic kernel that contains the code for a large number of extensions and only enable it at runtime depending on the platform capabilities. This way distros can ship a single kernel that works on all platforms.
Yup, and if a kernel is compiled without CONFIG_RISCV_USER_CFI, the nops will only enlarge the binary and potentially slow down execution. In other words, why we don't do something like this
(!CONFIG_RISCV_USER_CFI ? "" : (RISCV_ISA_EXT_ZICFISS ? __stringify(...) : "nops(x)"))
instead of the current
(CONFIG_RISCV_USER_CFI && RISCV_ISA_EXT_ZICFISS ? __stringify(...) : "nops(x)")
It could be a new preprocessor macro in case we wanted to make it nice, but it's probably not a common case, so an ifdef could work as well.
Do we just generally not care about such minor optimizations?
On its own just for this series, I am not sure if I would call it even a minor optimization.
This patch uses ifdef in thread_info, but not here.
Both places minimize the runtime impact on kernels that don't have CONFIG_RISCV_USER_CFI, so I would like to understand the reasoning behind the decision to include one and not the other.
But sure, it may (or may not) have noticeable effect if someone were to go around and muck with ALTERNATIVES macro and emit `old_c` only if config were selected. That should be a patch set on its own with data providing benefits from it.
The difference is small and each build and implementation can behave differently, so code analysis seems the most appropriate tool here. We must still do a lot of subjective guesswork, because it is hard to predict the future development.
We should be moving on the pareto front and there are 3 roughly optimization parameters in this case: the C code, the binary code, and the work done by the programmer. The current patch is forgoing the binary quality (nops are strictly worse). The ifdef and the macro solutions prefer binary quality, and then differ if they consider work minimization (ifdef) or nice C (macro).
Does the current patch represent the ideal compromise? (I can just recalibrate my values for future reviews...)
Thanks.