[...]
Additionally, the only difference from the behavior before the fix is that the return value of make_request() is no longer handled. However, after the previous patch cleaned up md_write_start(), make_request() only returns an error in raid5_make_request() for dm-raid, see commit 41425f96d7aa ("dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape"). Since dm always splits data and flush operations into two separate IOs, the size of a flush IO submitted by dm is always 0, so make_request() will not be called in md_submit_flush_data(). To prevent future modifications from introducing issues, add a WARN_ON to ensure that make_request() returns no error in this context.
[...]
@@ -560,8 +552,20 @@ static void md_submit_flush_data(struct work_struct *ws)
 		bio_endio(bio);
 	} else {
 		bio->bi_opf &= ~REQ_PREFLUSH;
-		md_handle_request(mddev, bio);
+		/*
+		 * make_requst() will never return error here, it only
+		 * returns error in raid5_make_request() by dm-raid.
+		 * Since dm always splits data and flush operation into
+		 * two separate io, io size of flush submitted by dm
+		 * always is 0, make_request() will not be called here.
+		 */
+		if (WARN_ON_ONCE(!mddev->pers->make_request(mddev, bio)))
+			bio_io_error(bio);;
 	}
Hello,
It looks like we can hit this WARN_ON_ONCE(), after which the rootfs switches to read-only:
May 20 15:13:35 hostname kernel: WARNING: CPU: 35 PID: 1517323 at drivers/md/md.c:621 md_submit_flush_data+0x9b/0xe0
...
May 20 15:13:35 hostname kernel: XFS (md125): log I/O error -5
May 20 15:13:35 hostname kernel: XFS (md125): Filesystem has been shut down due to log error (0x2).
May 20 15:13:35 hostname kernel: XFS (md125): Please unmount the filesystem and rectify the problem(s).
Can you double-check whether the following regression is real?
Since neither the stable/linux-6.1.y nor the stable/linux-6.6.y branch has b75197e86e6d ("md: Remove flush handling"), there is a minor issue with this backport.
The statement that, after the previous patch cleaned up md_write_start(), make_request() only returns an error in raid5_make_request() for dm-raid does not hold for either branch, since 03e792eaf18e ("md: change the return value type of md_write_start to void") was not backported.
So we should either backport it, or handle the error properly instead of relying on WARN_ON_ONCE().
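To make the second option a bit more concrete, here is a rough userspace model of the two behaviors. This is not kernel code: struct mock_bio, struct mock_mddev, mock_make_request() and both submit_*() helpers are hypothetical stand-ins, and treating "error handling" as a bounded retry is only my assumption about what a fix could look like.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel types; NOT the real md structures. */
struct mock_bio {
	int status;     /* 0 = success, negative errno on failure */
	bool completed; /* set once the bio has been ended */
};

struct mock_mddev {
	int fail_count; /* how many times make_request() refuses the bio */
	int calls;      /* how many times make_request() was invoked */
};

/* Models pers->make_request(): false means the bio was not accepted. */
static bool mock_make_request(struct mock_mddev *mddev, struct mock_bio *bio)
{
	mddev->calls++;
	if (mddev->fail_count > 0) {
		mddev->fail_count--;
		return false;
	}
	bio->status = 0;
	bio->completed = true;
	return true;
}

/* Variant A: warn and fail the bio on the first refusal. */
static void submit_warn_and_fail(struct mock_mddev *mddev, struct mock_bio *bio)
{
	if (!mock_make_request(mddev, bio)) {
		fprintf(stderr, "WARNING: make_request() returned an error\n");
		bio->status = -5; /* -EIO, as seen in the XFS log above */
		bio->completed = true;
	}
}

/*
 * Variant B (assumption): retry a bounded number of times and only
 * fail the bio if every attempt is refused.
 */
static void submit_with_retry(struct mock_mddev *mddev, struct mock_bio *bio,
			      int max_retries)
{
	for (int attempt = 0; attempt <= max_retries; attempt++) {
		if (mock_make_request(mddev, bio))
			return;
	}
	bio->status = -5; /* -EIO after exhausting all attempts */
	bio->completed = true;
}
```

In this model a single transient refusal turns into an I/O error under variant A, while variant B completes the bio successfully on the second attempt.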
linux-stable-mirror@lists.linaro.org