Linus (aka Greg),
Vaibhav Nagarnaik found that modifying the ring buffer size could cause huge latency in the system, because the resize frees pages in a loop without releasing the CPU (on non-preempt kernels). When there are hundreds of thousands of pages to free, it could actually cause a system stall. A properly placed cond_resched() solves this issue.
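To illustrate the pattern, here is a rough user-space analogue, not the kernel code itself. It frees a long linked list while offering the CPU to other tasks on each pass. The names page_node, build_list() and free_list_yielding() are made up for this sketch, and sched_yield() only stands in for cond_resched(): sched_yield() always yields, while cond_resched() reschedules only when a reschedule is pending.

```c
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ring buffer page; not the kernel's struct. */
struct page_node {
	struct page_node *next;
	char payload[64];
};

/*
 * Free every node in the list, yielding the CPU once per iteration.
 * sched_yield() is only a user-space approximation of cond_resched().
 * Returns the number of nodes freed.
 */
size_t free_list_yielding(struct page_node *head)
{
	size_t freed = 0;

	while (head) {
		struct page_node *next = head->next;

		sched_yield();	/* give other runnable tasks a chance */
		free(head);
		head = next;
		freed++;
	}
	return freed;
}

/* Build a list of n nodes; returns NULL if n is 0 or allocation fails. */
struct page_node *build_list(size_t n)
{
	struct page_node *head = NULL;

	for (size_t i = 0; i < n; i++) {
		struct page_node *node = calloc(1, sizeof(*node));

		if (!node) {
			free_list_yielding(head);	/* undo partial build */
			return NULL;
		}
		node->next = head;
		head = node;
	}
	return head;
}
```

The nodes are already unlinked from any shared structure by the time they are freed, which is why it is safe to let other work run between iterations.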
Please pull the latest trace-v4.19-rc4 tree, which can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git trace-v4.19-rc4
Tag SHA1: 977e4fb3741e24151a255ee13bd4a1224545ae4e
Head SHA1: 83f365554e47997ec68dc4eca3f5dce525cd15c3
Vaibhav Nagarnaik (1):
      ring-buffer: Allow for rescheduling when removing pages
----
 kernel/trace/ring_buffer.c | 2 ++
 1 file changed, 2 insertions(+)
---------------------------
commit 83f365554e47997ec68dc4eca3f5dce525cd15c3
Author: Vaibhav Nagarnaik <vnagarnaik@google.com>
Date:   Fri Sep 7 15:31:29 2018 -0700
ring-buffer: Allow for rescheduling when removing pages
When reducing ring buffer size, pages are removed by scheduling a work item on each CPU for the corresponding CPU ring buffer. After the pages are removed from the ring buffer linked list, the pages are free()d in a tight loop. The loop does not give up the CPU until all pages are removed. In the worst case, when a lot of pages are to be freed, it can cause a system stall.
After the pages are removed from the list, the free() can happen while the work is rescheduled. Call cond_resched() in the loop to prevent the system hangup.
Link: http://lkml.kernel.org/r/20180907223129.71994-1-vnagarnaik@google.com
Cc: stable@vger.kernel.org
Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic")
Reported-by: Jason Behmer <jbehmer@google.com>
Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1d92d4a982fd..65bd4616220d 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1546,6 +1546,8 @@ rb_remove_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned long nr_pages)
 	tmp_iter_page = first_page;

 	do {
+		cond_resched();
+
 		to_remove_page = tmp_iter_page;
 		rb_inc_page(cpu_buffer, &tmp_iter_page);