On Tue, Aug 11, 2020 at 09:57:20AM +0200, David Sterba wrote:
On Mon, Aug 10, 2020 at 03:14:30PM -0400, Sasha Levin wrote:
From: "Paul E. McKenney" <paulmck@kernel.org>
[ Upstream commit 9f47eb5461aaeb6cb8696f9d11503ae90e4d5cb0 ]
Very large I/Os can cause the following RCU CPU stall warning:
RIP: 0010:rb_prev+0x8/0x50
Code: 49 89 c0 49 89 d1 48 89 c2 48 89 f8 e9 e5 fd ff ff 4c 89 48 10 c3 4c 89 06 c3 4c 89 40 10 c3 0f 1f 00 48 8b 0f 48 39 cf 74 38 <48> 8b 47 10 48 85 c0 74 22 48 8b 50 08 48 85 d2 74 0c 48 89 d0 48
RSP: 0018:ffffc9002212bab0 EFLAGS: 00000287 ORIG_RAX: ffffffffffffff13
RAX: ffff888821f93630 RBX: ffff888821f93630 RCX: ffff888821f937e0
RDX: 0000000000000000 RSI: 0000000000102000 RDI: ffff888821f93630
RBP: 0000000000103000 R08: 000000000006c000 R09: 0000000000000238
R10: 0000000000102fff R11: ffffc9002212bac8 R12: 0000000000000001
R13: ffffffffffffffff R14: 0000000000102000 R15: ffff888821f937e0
 __lookup_extent_mapping+0xa0/0x110
 try_release_extent_mapping+0xdc/0x220
 btrfs_releasepage+0x45/0x70
 shrink_page_list+0xa39/0xb30
 shrink_inactive_list+0x18f/0x3b0
 shrink_lruvec+0x38e/0x6b0
 shrink_node+0x14d/0x690
 do_try_to_free_pages+0xc6/0x3e0
 try_to_free_mem_cgroup_pages+0xe6/0x1e0
 reclaim_high.constprop.73+0x87/0xc0
 mem_cgroup_handle_over_high+0x66/0x150
 exit_to_usermode_loop+0x82/0xd0
 do_syscall_64+0xd4/0x100
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
On a PREEMPT=n kernel, the try_release_extent_mapping() function's "while" loop might run for a very long time on a large I/O. This commit therefore adds a cond_resched() to this loop, providing RCU any needed quiescent states.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
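For readers without the diff at hand, the shape of the change described above can be sketched in userspace C. This is not the upstream diff: release_range_sketch(), cond_resched_sketch(), yield_count, and SKETCH_PAGE_SIZE are hypothetical names, and sched_yield() merely stands in for the kernel's cond_resched(). It only illustrates a long-running "while" loop that offers to give up the CPU once per iteration.

```c
#include <assert.h>
#include <sched.h>
#include <stdint.h>

/* Hypothetical step size for the walk; not taken from the patch. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Counts how often the loop offered to yield (for illustration only). */
unsigned long yield_count;

/*
 * Hypothetical userspace stand-in for the kernel's cond_resched():
 * it offers the CPU to other runnable tasks and then returns.
 */
void cond_resched_sketch(void)
{
	yield_count++;
	sched_yield();
}

/*
 * Walk the inclusive byte range [start, end] one chunk at a time,
 * mimicking a "while" loop that can run very long on a large I/O.
 * Returns the number of chunks visited.
 */
uint64_t release_range_sketch(uint64_t start, uint64_t end)
{
	uint64_t visited = 0;

	assert(start <= end);

	while (start <= end) {
		/* Per-chunk lookup and release work would go here. */
		visited++;
		start += SKETCH_PAGE_SIZE;

		/* The point of the change: a scheduling point per iteration. */
		cond_resched_sketch();
	}

	return visited;
}
```

Calling release_range_sketch(0, 10 * SKETCH_PAGE_SIZE - 1) visits ten chunks and offers to yield ten times. In the kernel, cond_resched() reschedules only when needed, so the per-iteration cost is small compared with the stall it avoids.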
Paul,
this patch was well hidden in some huge RCU pile (https://lore.kernel.org/lkml/20200623002147.25750-11-paulmck@kernel.org/)
I wonder why you haven't CCed linux-btrfs; I spotted the patch queued for stable only incidentally. The timestamp is from June, that's quite some time ago. We can deal with one more patch, and I tend to reply with acks quickly for easy patches like this so as not to block other people's work, but I'm a bit disappointed by the sidestepping of maintained subsystems. It's not just this patch; it happens from time to time, which only increases the disappointment.
My bad, and please accept my apologies. I clearly left out the step of adding proper Cc: lines. :-/
Thanx, Paul