On Tue, 5 May 2020 09:37:42 -0700 Eric Dumazet eric.dumazet@gmail.com wrote:
On 5/5/20 9:31 AM, Eric Dumazet wrote:
On 5/5/20 9:25 AM, Eric Dumazet wrote:
On 5/5/20 9:13 AM, SeongJae Park wrote:
On Tue, 5 May 2020 09:00:44 -0700 Eric Dumazet edumazet@google.com wrote:
On Tue, May 5, 2020 at 8:47 AM SeongJae Park sjpark@amazon.com wrote:
On Tue, 5 May 2020 08:20:50 -0700 Eric Dumazet eric.dumazet@gmail.com wrote:
On 5/5/20 8:07 AM, SeongJae Park wrote:
On Tue, 5 May 2020 07:53:39 -0700 Eric Dumazet edumazet@google.com wrote:
[...]
I would ask Paul's opinion on this issue, because we have many objects being freed after RCU grace periods.
If the RCU subsystem cannot keep up, I guess other workloads will also suffer.
Sure, we can revert patches here and there trying to work around the issue, but for objects allocated from process context, we should not have these problems.
I wonder if simply adjusting rcu_divisor to 6 or 5 would help.
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d9a49cd6065a20936edbda1b334136ab597cde52..fde833bac0f9f81e8536211b4dad6e7575c1219a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -427,7 +427,7 @@ module_param(qovld, long, 0444);
 static ulong jiffies_till_first_fqs = ULONG_MAX;
 static ulong jiffies_till_next_fqs = ULONG_MAX;
 static bool rcu_kick_kthreads;
-static int rcu_divisor = 7;
+static int rcu_divisor = 6;
 module_param(rcu_divisor, int, 0644);
 
 /* Force an exit from rcu_do_batch() after 3 milliseconds. */
To be clear, you can adjust the value without building a new kernel.
echo 6 >/sys/module/rcutree/parameters/rcu_divisor
I tried the values 6, 5, and 4, but none of them made the problem go away.
Thanks, SeongJae Park