Linus (aka Greg),
Vaibhav Nagarnaik found that modifying the ring buffer size could cause a huge latency in the system because it does a while loop to free pages without releasing the CPU (on non-preempt kernels). In a case where there are hundreds of thousands of pages to free it could actually cause a system stall. A properly placed cond_resched() solves this issue.
Please pull the latest trace-v4.19-rc4 tree, which can be found at:
  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-v4.19-rc4
Tag SHA1: 977e4fb3741e24151a255ee13bd4a1224545ae4e
Head SHA1: 83f365554e47997ec68dc4eca3f5dce525cd15c3
Vaibhav Nagarnaik (1):
      ring-buffer: Allow for rescheduling when removing pages

----
 kernel/trace/ring_buffer.c | 2 ++
 1 file changed, 2 insertions(+)
---------------------------
commit 83f365554e47997ec68dc4eca3f5dce525cd15c3
Author: Vaibhav Nagarnaik <vnagarnaik@google.com>
Date:   Fri Sep 7 15:31:29 2018 -0700
ring-buffer: Allow for rescheduling when removing pages
When reducing the ring buffer size, pages are removed by scheduling a work item on each CPU for the corresponding CPU ring buffer. After the pages are removed from the ring buffer linked list, the pages are free()d in a tight loop. The loop does not give up the CPU until all pages are removed. In the worst case, when a lot of pages are to be freed, this can cause a system stall.
After the pages are removed from the list, the free() can happen while the work is rescheduled. Call cond_resched() in the loop to prevent the system hangup.
Link: http://lkml.kernel.org/r/20180907223129.71994-1-vnagarnaik@google.com
Cc: stable@vger.kernel.org
Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic")
Reported-by: Jason Behmer <jbehmer@google.com>
Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1d92d4a982fd..65bd4616220d 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1546,6 +1546,8 @@ rb_remove_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned long nr_pages)
 	tmp_iter_page = first_page;
 
 	do {
+		cond_resched();
+
 		to_remove_page = tmp_iter_page;
 		rb_inc_page(cpu_buffer, &tmp_iter_page);
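For readers outside the kernel, the idea behind the patch can be sketched in userspace. This is only an analogy under stated assumptions: POSIX sched_yield() stands in for the kernel's cond_resched(), and struct page_node, build_list(), free_pages_yielding() and self_check() are hypothetical names that do not exist in ring_buffer.c. The point it illustrates is the placement of the yield at the top of each iteration of a long freeing loop.

```c
/*
 * Userspace analogy only, NOT kernel code.
 * Assumption: sched_yield() stands in for cond_resched().
 * All type and function names below are hypothetical.
 */
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

struct page_node {
	struct page_node *next;
};

/* Build a singly linked list of nr nodes; returns NULL if nr is 0
 * or if an allocation fails (any partial list is released first). */
struct page_node *build_list(size_t nr)
{
	struct page_node *head = NULL;

	for (size_t i = 0; i < nr; i++) {
		struct page_node *n = malloc(sizeof(*n));

		if (!n) {
			while (head) {
				struct page_node *tmp = head;

				head = head->next;
				free(tmp);
			}
			return NULL;
		}
		n->next = head;
		head = n;
	}
	return head;
}

/* Free every node, offering the CPU to other tasks on each iteration.
 * Returns the number of nodes freed. */
size_t free_pages_yielding(struct page_node *head)
{
	size_t freed = 0;

	while (head) {
		struct page_node *to_remove = head;

		/* Same position as cond_resched() in the patch: yield
		 * before touching the next node to be freed. */
		sched_yield();

		head = head->next;
		free(to_remove);
		freed++;
	}
	return freed;
}

/* Small self-check: build 1000 nodes, free them, confirm the count. */
void self_check(void)
{
	struct page_node *list = build_list(1000);
	size_t freed;

	assert(list != NULL);
	freed = free_pages_yielding(list);
	assert(freed == 1000);
}
```

One difference worth keeping in mind: sched_yield() always asks the scheduler to run something else, whereas cond_resched() reschedules only when a reschedule is actually pending, so the kernel call is much cheaper in the common case.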
On Tue, Sep 18, 2018 at 07:14:13PM -0400, Steven Rostedt wrote:
> Linus (aka Greg),
>
> Vaibhav Nagarnaik found that modifying the ring buffer size could cause a huge latency in the system because it does a while loop to free pages without releasing the CPU (on non-preempt kernels). In a case where there are hundreds of thousands of pages to free it could actually cause a system stall. A properly placed cond_resched() solves this issue.
>
> Please pull the latest trace-v4.19-rc4 tree, which can be found at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> trace-v4.19-rc4
Ick, line wrapping makes it hard to cut/paste :(
Anyway, now pulled and pushed out.
greg k-h
On Wed, 19 Sep 2018 08:07:06 +0200 Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> On Tue, Sep 18, 2018 at 07:14:13PM -0400, Steven Rostedt wrote:
> > Linus (aka Greg),
> >
> > Vaibhav Nagarnaik found that modifying the ring buffer size could cause a huge latency in the system because it does a while loop to free pages without releasing the CPU (on non-preempt kernels). In a case where there are hundreds of thousands of pages to free it could actually cause a system stall. A properly placed cond_resched() solves this issue.
> >
> > Please pull the latest trace-v4.19-rc4 tree, which can be found at:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> > trace-v4.19-rc4
>
> Ick, line wrapping makes it hard to cut/paste :(
??
That's the way I have always posted pull requests. I place the branch on the second line. It's not line wrapped, it's a hard coded new line. Long ago I was told to do it that way.
Should that be changed? It would be trivial to update my scripts.
> Anyway, now pulled and pushed out.
Thanks,
-- Steve
On Wed, Sep 19, 2018 at 09:39:23AM -0400, Steven Rostedt wrote:
> On Wed, 19 Sep 2018 08:07:06 +0200 Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > On Tue, Sep 18, 2018 at 07:14:13PM -0400, Steven Rostedt wrote:
> > > Linus (aka Greg),
> > >
> > > Vaibhav Nagarnaik found that modifying the ring buffer size could cause a huge latency in the system because it does a while loop to free pages without releasing the CPU (on non-preempt kernels). In a case where there are hundreds of thousands of pages to free it could actually cause a system stall. A properly placed cond_resched() solves this issue.
> > >
> > > Please pull the latest trace-v4.19-rc4 tree, which can be found at:
> > >
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> > > trace-v4.19-rc4
> >
> > Ick, line wrapping makes it hard to cut/paste :(
>
> ??
>
> That's the way I have always posted pull requests. I place the branch on the second line. It's not line wrapped, it's a hard coded new line. Long ago I was told to do it that way.
>
> Should that be changed? It would be trivial to update my scripts.
Ah, ok, that's not what I have been doing for a long time, nor what the sub-maintainers that send stuff to me have done. Normally it is:
	git_url tag

Like this one for perf stuff:
	https://lore.kernel.org/lkml/20171027195047.27132-1-acme@kernel.org/
If you have been doing it this way to Linus, that's fine, I can adapt :)
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org