On Wed, May 6, 2020 at 5:59 AM SeongJae Park sjpark@amazon.com wrote:
TL;DR: It was not the kernel's fault, but the benchmark program's.
So, the problem is reproducible only with lebench[1]. I carefully read its code again.
Before running the "poll big" sub-test, where the problem occurred, lebench executes the "context switch" sub-test. For that test, it sets its own CPU affinity[2] and process priority[3] to '0' and '-20', respectively. However, it doesn't restore the original values even after the "context switch" sub-test has finished. As a result, the "select big" sub-test also runs bound to CPU 0 with the lowest nice value. It can therefore disturb the RCU callback thread for CPU 0, which processes the deferred deallocations of the sockets, and that in turn triggers the OOM.
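To illustrate the missing step, here is a minimal sketch of saving the affinity and priority before a pinned test and restoring them afterwards. It is written in Python for brevity and is not lebench's code (lebench is a C program); the helper name `pinned` is hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def pinned(cpu, nice=None):
    """Temporarily pin the caller to `cpu` and optionally change its nice value.

    Hypothetical helper for illustration only; not taken from lebench.
    """
    # Remember the current settings so they can be restored afterwards.
    saved_affinity = os.sched_getaffinity(0)
    saved_nice = os.getpriority(os.PRIO_PROCESS, 0)
    try:
        os.sched_setaffinity(0, {cpu})
        if nice is not None:
            # Lowering the nice value (e.g. to -20) needs CAP_SYS_NICE or root.
            os.setpriority(os.PRIO_PROCESS, 0, nice)
        yield
    finally:
        # This is the step the benchmark skips: undo both changes so that
        # later sub-tests do not inherit the pinned, high-priority settings.
        os.sched_setaffinity(0, saved_affinity)
        if nice is not None:
            os.setpriority(os.PRIO_PROCESS, 0, saved_nice)
```

With sufficient privileges, a sub-test could be wrapped as `with pinned(0, nice=-20): ...`, so that whatever runs afterwards sees the original affinity mask and nice value again.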
We confirmed that the problem disappears when the RCU callbacks are offloaded from CPU 0 using the rcu_nocbs=0 boot parameter, or when the affinity and/or priority is simply restored.
Someone _might_ still argue that this is a kernel problem, because it didn't occur on old kernels prior to Al's patches. However, setting the affinity and priority was possible only because the program had been given permission to do so. Therefore, it would be reasonable to blame the system administrators rather than the kernel.
So, please ignore this patchset, and apologies for the confusion. If you still have doubts or need more tests, please let me know.
[1] https://github.com/LinuxPerfStudy/LEBench
[2] https://github.com/LinuxPerfStudy/LEBench/blob/master/TEST_DIR/OS_Eval.c#L82...
[3] https://github.com/LinuxPerfStudy/LEBench/blob/master/TEST_DIR/OS_Eval.c#L82...
Thanks, SeongJae Park
No harm done, thanks for running more tests and root-causing the issue!