With NO_HZ_IDLE, we get CONTEXT_TRACKING_IDLE, so we get these transitions:
ct_idle_enter() ct_kernel_exit() ct_state_inc_clear_work()
ct_idle_exit() ct_kernel_enter() ct_work_flush()
With just CONTEXT_TRACKING_IDLE, ct_state_inc_clear_work() is just ct_state_inc() and ct_work_flush() is a no-op. However, making them be functional as if under CONTEXT_TRACKING_WORK would allow NO_HZ_IDLE to leverage IPI deferral to keep idle CPUs idle longer.
Having this enabled for NO_HZ_IDLE is a different argument than for having it for NO_HZ_FULL (power savings vs latency/performance), but the backing mechanism is identical.
Add a default-no option to enable IPI deferral with NO_HZ_IDLE.
Signed-off-by: Valentin Schneider vschneid@redhat.com --- kernel/time/Kconfig | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig index 04efc2b605823..a3cb903a4e022 100644 --- a/kernel/time/Kconfig +++ b/kernel/time/Kconfig @@ -188,9 +188,23 @@ config CONTEXT_TRACKING_USER_FORCE
config CONTEXT_TRACKING_WORK bool - depends on HAVE_CONTEXT_TRACKING_WORK && CONTEXT_TRACKING_USER + depends on HAVE_CONTEXT_TRACKING_WORK && (CONTEXT_TRACKING_USER || CONTEXT_TRACKING_WORK_IDLE) default y
+config CONTEXT_TRACKING_WORK_IDLE + bool + depends on HAVE_CONTEXT_TRACKING_WORK && CONTEXT_TRACKING_IDLE && !CONTEXT_TRACKING_USER + default n + help + This option enables deferral of some IPIs when they are targeted at CPUs + that are idle. This can help keep CPUs idle longer, but induces some + extra overhead to idle <-> kernel transitions and to IPI sending. + + Say Y if the power improvements are worth more to you than the added + overheads. + + Say N otherwise. + config NO_HZ bool "Old Idle dynticks config" help