On 2022/8/4 18:25, Wang Yugui wrote:
Hi,
xfstest btrfs/158 triggered a panic after these 2 patches were applied.
btrfs-158-dmesg.txt: dmesg output at the time of the panic
btrfs-158-dmesg-decoded.txt: dmesg output decoded by decode_stacktrace.sh, with some source code added too.
Reproduce rate: not 100%, but it happened 2 times here.
xfstest './check -g scrub' seems to reproduce this problem at a higher rate than './check test/btrfs/158'.
Also reproduced here running that in a loop.
linux kernel: 5.15.59 with some local backport patches too.
Got the reason pinned down, missing one dependency.
The code triggering the crash is "const u32 sectorsize = fs_info->sectorsize", and @fs_info is from bioc.
But bioc initialization doesn't ensure every bioc has its fs_info initialized.
That is only ensured by commit 731ccf15c952 ("btrfs: make sure btrfs_io_context::fs_info is always initialized").
So I also need to backport that patch.
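To illustrate the failure mode described above, here is a minimal userspace sketch. The mock_* structures and functions are hypothetical stand-ins, not the real btrfs_io_context or btrfs_fs_info definitions, and the sketch does not reproduce what commit 731ccf15c952 actually changes. It only shows why reading sectorsize through an fs_info pointer that an init path left unset goes wrong, and how a guard makes the problem visible instead of crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real kernel structures. */
struct mock_fs_info {
	uint32_t sectorsize;
};

struct mock_io_context {
	struct mock_fs_info *fs_info;	/* NULL if the init path skipped it */
	int num_stripes;
};

/* Init path that forgets fs_info (the situation described above). */
void mock_bioc_init_incomplete(struct mock_io_context *bioc, int num_stripes)
{
	bioc->fs_info = NULL;
	bioc->num_stripes = num_stripes;
}

/* Init path that always records fs_info. */
void mock_bioc_init_complete(struct mock_io_context *bioc,
			     struct mock_fs_info *fs_info, int num_stripes)
{
	bioc->fs_info = fs_info;
	bioc->num_stripes = num_stripes;
}

/*
 * Returns 0 and stores the sector size in *out, or -1 if fs_info is unset.
 * An unguarded "bioc->fs_info->sectorsize" would dereference NULL in the
 * incomplete case.
 */
int mock_get_sectorsize(const struct mock_io_context *bioc, uint32_t *out)
{
	assert(bioc != NULL && out != NULL);
	if (!bioc->fs_info)
		return -1;
	*out = bioc->fs_info->sectorsize;
	return 0;
}
```

With mock_bioc_init_incomplete() the lookup reports an error, while with mock_bioc_init_complete() it returns the configured sector size. In the kernel there is no such guard on that line, so the missing initialization shows up as a crash.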
Weirdly, I ran my tests with "-g raid -g replace -g scrub" but didn't trigger this, even on older branches.
I'll do more tests to make sure it doesn't cause problems.
Thanks,
Qu

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/08/04
Hi Greg and Sasha,
These two patches are backports for the v5.15 and v5.10 stable branches (for v5.10, conflicts can be auto-resolved).
(For older branches from v4.9 to v5.4, due to some naming changes, the patches can be applied with auto-resolve but won't compile.)
These two patches reduce the chance of a destructive RMW cycle, where btrfs can use corrupted data to generate new P/Q, thus making some repairable data unrepairable.
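As a rough illustration of why such an RMW cycle is destructive, here is a toy userspace sketch with two data sectors and one XOR parity sector. It is not the kernel code and does not model the fix in the two patches. The toy_* names and the 8-byte "sector" are assumptions made purely for illustration, and Q (RAID6) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_BYTES 8	/* tiny "sector", for illustration only */

/* Toy stripe: two data sectors plus XOR parity, P = D1 ^ D2. */
struct toy_stripe {
	uint8_t d1[SECTOR_BYTES];
	uint8_t d2[SECTOR_BYTES];
	uint8_t p[SECTOR_BYTES];
};

/* out[i] = a[i] ^ b[i]; used both to generate P and to rebuild a sector. */
void xor_sectors(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t len)
{
	assert(a != NULL && b != NULL && out != NULL);
	for (size_t i = 0; i < len; i++)
		out[i] = a[i] ^ b[i];
}

/*
 * Naive read-modify-write: overwrite D2, then regenerate P from whatever
 * D1 currently holds, without verifying D1 first.
 */
void toy_naive_rmw_write_d2(struct toy_stripe *s, const uint8_t *new_d2)
{
	memcpy(s->d2, new_d2, SECTOR_BYTES);
	xor_sectors(s->d1, s->d2, s->p, SECTOR_BYTES);
}

/* Rebuild D1 from D2 and P, as a repair would. */
void toy_rebuild_d1(const struct toy_stripe *s, uint8_t *out)
{
	xor_sectors(s->d2, s->p, out, SECTOR_BYTES);
}
```

If D1 is silently corrupted while P is still intact, toy_rebuild_d1() returns the original D1. Once toy_naive_rmw_write_d2() has regenerated P from the corrupted D1, the same rebuild returns the corrupted bytes, so the good copy is no longer recoverable from this stripe.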
These patches are more important than I initially thought, so unfortunately they were not CCed to stable when originally submitted.
Furthermore, due to recent refactors/renames, quite a few members touched by these patches have changed, so they have to be backported manually.
One of the fastest ways to verify the behavior is the existing btrfs/125 test case from fstests (not in the auto group AFAIK).
Qu Wenruo (2):
  btrfs: only write the sectors in the vertical stripe which has data stripes
  btrfs: raid56: don't trust any cached sector in __raid56_parity_recover()

 fs/btrfs/raid56.c | 74 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 57 insertions(+), 17 deletions(-)

--
2.37.0