Motivation of backport:
-----------------------
1. Commit cfcdef5e30469 ("rcu: Allow rcu_do_batch() to dynamically adjust batch sizes") broke the default behaviour of the "offloading rcu callbacks" setup. In that setup, the caller context used to check after each callback whether it had to be rescheduled, giving CPU time to others. After that change an "offloaded" setup can switch to time-based RCU callback processing, which can run too long for latency-sensitive workloads and SCHED_FIFO processes, i.e. callbacks are invoked for a long time with preemption kept off and without checking cond_resched().
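
The difference between the two behaviours can be sketched with a small userspace model. This is not kernel code: the names model_batch, batch_stats and batch_policy are hypothetical, the "fixed" policy is inferred only from the description above and the titles of the two patches, and the time limit is assumed never to expire within the batch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one callback batch; not kernel code. */
struct batch_stats {
	size_t invoked;        /* callbacks invoked in this batch */
	size_t resched_checks; /* points where the batch could yield the CPU */
};

enum batch_policy {
	POLICY_REGRESSED, /* time-based processing skips the reschedule check */
	POLICY_FIXED,     /* offloaded batches always check after each callback */
};

static struct batch_stats model_batch(size_t ncbs, bool offloaded,
				      bool time_limited,
				      enum batch_policy policy)
{
	struct batch_stats st = { 0, 0 };

	for (size_t i = 0; i < ncbs; i++) {
		st.invoked++; /* stand-in for invoking one callback */

		/*
		 * Regressed behaviour: once time-based processing is in
		 * effect, only the deadline is consulted (assumed here to
		 * never expire) and the reschedule check is skipped.
		 */
		if (policy == POLICY_REGRESSED && time_limited)
			continue;

		/* Stand-in for the cond_resched() check in offloaded mode. */
		if (offloaded)
			st.resched_checks++;
	}
	return st;
}
```

Under these assumptions, model_batch(2112, true, true, POLICY_REGRESSED) invokes all 2112 callbacks without a single reschedule check, while the same call with POLICY_FIXED reaches a reschedule check after every callback.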
2. Our devices, which run Android and a 5.10 kernel, have some critical areas that are sensitive to latency: low-latency audio, 8k video, the UI stack and so on. For example, below is a trace that illustrates the delay of the "irq/396-5-0072" RT task in completing IRQ processing:
<snip>
rcuop/6-54 [000] d.h2 183.752989: irq_handler_entry: irq=85 name=i2c_geni
rcuop/6-54 [000] d.h5 183.753007: sched_waking: comm=irq/396-5-0072 pid=12675 prio=49 target_cpu=000
rcuop/6-54 [000] dNh6 183.753014: sched_wakeup: irq/396-5-0072:12675 [49] success=1 CPU:000
rcuop/6-54 [000] dNh2 183.753015: irq_handler_exit: irq=85 ret=handled
rcuop/6-54 [000] .N.. 183.753018: rcu_invoke_callback: rcu_preempt rhp=0xffffff88ffd440b0 func=__d_free.cfi_jt
rcuop/6-54 [000] .N.. 183.753020: rcu_invoke_callback: rcu_preempt rhp=0xffffff892ffd8400 func=inode_free_by_rcu.cfi_jt
rcuop/6-54 [000] .N.. 183.753021: rcu_invoke_callback: rcu_preempt rhp=0xffffff89327cd708 func=i_callback.cfi_jt
...
rcuop/6-54 [000] .N.. 183.755941: rcu_invoke_callback: rcu_preempt rhp=0xffffff8993c5a968 func=i_callback.cfi_jt
rcuop/6-54 [000] .N.. 183.755942: rcu_invoke_callback: rcu_preempt rhp=0xffffff8993c4bd20 func=__d_free.cfi_jt
rcuop/6-54 [000] dN.. 183.755944: rcu_batch_end: rcu_preempt CBs-invoked=2112 idle=>c<>c<>c<>c<
rcuop/6-54 [000] dN.. 183.755946: rcu_utilization: Start context switch
rcuop/6-54 [000] dN.. 183.755946: rcu_utilization: End context switch
rcuop/6-54 [000] d..2 183.755959: sched_switch: rcuop/6:54 [120] R ==> migration/0:16 [0]
...
migratio-16 [000] d..2 183.756021: sched_switch: migration/0:16 [0] S ==> irq/396-5-0072:12675 [49]
<snip>
The "irq/396-5-0072:12675" task was delayed for ~3 milliseconds due to the introduced side effect. Please note that on our Android devices ~70 000 callbacks are registered to be invoked by the "rcuop/x" workers during a 1-second interval of regular handset usage. Latencies bigger than 3 milliseconds affect our high-resolution audio streaming over the LDAC/Bluetooth stack.
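
The ~3 millisecond figure can be rechecked from the trace timestamps. The sketch below assumes the delay is measured from the sched_wakeup event (183.753014) to the sched_switch into the IRQ thread (183.756021), and the batch from the first rcu_invoke_callback (183.753018) to rcu_batch_end (183.755944). The choice of these endpoints and the helper names are assumptions made for illustration.

```c
#include <stdint.h>

/* Convert a trace timestamp split into seconds and microseconds to microseconds. */
static uint64_t trace_ts_us(uint64_t sec, uint64_t usec)
{
	return sec * 1000000ULL + usec;
}

/* Time from sched_wakeup of irq/396-5-0072 until it is switched in. */
static uint64_t wakeup_to_run_us(void)
{
	return trace_ts_us(183, 756021) - trace_ts_us(183, 753014);
}

/* Time from the first rcu_invoke_callback shown to rcu_batch_end. */
static uint64_t batch_duration_us(void)
{
	return trace_ts_us(183, 755944) - trace_ts_us(183, 753018);
}

/* Average time per callback in nanoseconds (integer division), CBs-invoked=2112. */
static uint64_t avg_callback_ns(void)
{
	return batch_duration_us() * 1000ULL / 2112ULL;
}
```

With these endpoints the wakeup-to-run delay is 3007 microseconds, of which 2926 microseconds are spent in the callback batch, an average of roughly 1.4 microseconds per callback.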
Two patches depend on each other.
Frederic Weisbecker (2):
  rcu: Fix callbacks processing time limit retaining cond_resched()
  rcu: Apply callbacks processing time limit only on softirq
 kernel/rcu/tree.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)