On 3/8/22 6:47 PM, Peng Liu wrote:
When CONFIG_KFENCE_DYNAMIC_OBJECTS is set to a big number, kfence kunit-test-case test_gfpzero will eat up nearly all the CPU's resources and rcu_stall is reported as the following log which is cut from a physical server.
rcu: INFO: rcu_sched self-detected stall on CPU rcu: 68-....: (14422 ticks this GP) idle=6ce/1/0x4000000000000002 softirq=592/592 fqs=7500 (t=15004 jiffies g=10677 q=20019) Task dump for CPU 68: task:kunit_try_catch state:R running task stack: 0 pid: 9728 ppid: 2 flags:0x0000020a Call trace: dump_backtrace+0x0/0x1e4 show_stack+0x20/0x2c sched_show_task+0x148/0x170 ... rcu_sched_clock_irq+0x70/0x180 update_process_times+0x68/0xb0 tick_sched_handle+0x38/0x74 ... gic_handle_irq+0x78/0x2c0 el1_irq+0xb8/0x140 kfree+0xd8/0x53c test_alloc+0x264/0x310 [kfence_test] test_gfpzero+0xf4/0x840 [kfence_test] kunit_try_run_case+0x48/0x20c kunit_generic_run_threadfn_adapter+0x28/0x34 kthread+0x108/0x13c ret_from_fork+0x10/0x18
To avoid rcu_stall and unacceptable latency, a schedule point is added to test_gfpzero.
Signed-off-by: Peng Liu liupeng256@huawei.com
mm/kfence/kfence_test.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c index caed6b4eba94..1b50f70a4c0f 100644 --- a/mm/kfence/kfence_test.c +++ b/mm/kfence/kfence_test.c @@ -627,6 +627,7 @@ static void test_gfpzero(struct kunit *test) kunit_warn(test, "giving up ... cannot get same object back\n"); return; }
cond_resched();
This sounds like a band-aid - is there a better way to fix this?
} for (i = 0; i < size; i++)
thanks, -- Shuah