Hi
This is a backport of patches d208b89401e0 ("dm: fix mempool NULL pointer race when completing IO") and 9f6dc6337610 ("dm: interlock pending dm_io and dm_wait_for_bios_completion") for the 5.10 kernel.
The bugs fixed by these patches can cause random crashes when reloading a dm table, so the fix is eligible for a stable backport.
This patch is different from the upstream patches because the code diverged significantly.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
---
 drivers/md/dm.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

Index: linux-stable/drivers/md/dm.c
===================================================================
--- linux-stable.orig/drivers/md/dm.c	2022-04-19 16:17:52.000000000 +0200
+++ linux-stable/drivers/md/dm.c	2022-04-19 16:23:23.000000000 +0200
@@ -607,19 +607,26 @@ static void start_io_acct(struct dm_io *
 				    false, 0, &io->stats_aux);
 }
 
+static void free_io(struct mapped_device *md, struct dm_io *io);
+
 static void end_io_acct(struct dm_io *io)
 {
 	struct mapped_device *md = io->md;
 	struct bio *bio = io->orig_bio;
-	unsigned long duration = jiffies - io->start_time;
-
-	bio_end_io_acct(bio, io->start_time);
+	unsigned long start_time = io->start_time;
+	unsigned long duration = jiffies - start_time;
 
 	if (unlikely(dm_stats_used(&md->stats)))
 		dm_stats_account_io(&md->stats, bio_data_dir(bio),
 				    bio->bi_iter.bi_sector, bio_sectors(bio),
 				    true, duration, &io->stats_aux);
 
+	free_io(md, io);
+
+	smp_wmb();
+
+	bio_end_io_acct(bio, start_time);
+
 	/* nudge anyone waiting on suspend queue */
 	if (unlikely(wq_has_sleeper(&md->wait)))
 		wake_up(&md->wait);
@@ -930,7 +937,6 @@ static void dec_pending(struct dm_io *io
 		io_error = io->status;
 		bio = io->orig_bio;
 		end_io_acct(io);
-		free_io(md, io);
 
 		if (io_error == BLK_STS_DM_REQUEUE)
 			return;
@@ -2345,6 +2351,8 @@ static int dm_wait_for_bios_completion(s
 	}
 	finish_wait(&md->wait, &wait);
 
+	smp_rmb();
+
 	return r;
 }
On Thu, Apr 21, 2022 at 02:08:30PM -0400, Mikulas Patocka wrote:
> Hi
Not really needed in a changelog text :)
> This is a backport of patches d208b89401e0 ("dm: fix mempool NULL pointer race when completing IO") and 9f6dc6337610 ("dm: interlock pending dm_io and dm_wait_for_bios_completion") for the 5.10 kernel.
Can you just make these 2 different patches?
> The bugs fixed by these patches can cause random crashes when reloading a dm table, so the fix is eligible for a stable backport.
> This patch is different from the upstream patches because the code diverged significantly.
This change is _VERY_ different. I would need acks from the maintainers of this code before I could accept this, along with a much more detailed description of why the original commits will not work here as well.
Same for the other backports.
thanks,
greg k-h
On Tue, 26 Apr 2022, Greg Kroah-Hartman wrote:
> On Thu, Apr 21, 2022 at 02:08:30PM -0400, Mikulas Patocka wrote:
> > Hi
> Not really needed in a changelog text :)
> > This is a backport of patches d208b89401e0 ("dm: fix mempool NULL pointer race when completing IO") and 9f6dc6337610 ("dm: interlock pending dm_io and dm_wait_for_bios_completion") for the 5.10 kernel.
> Can you just make these 2 different patches?
> > The bugs fixed by these patches can cause random crashes when reloading a dm table, so the fix is eligible for a stable backport.
> > This patch is different from the upstream patches because the code diverged significantly.
> This change is _VERY_ different. I would need acks from the maintainers of this code before I could accept this, along with a much more detailed description of why the original commits will not work here as well.
> Same for the other backports.
Regarding backporting of 9f6dc633:
My reasoning was that introducing "md->pending_io" in the backported stable kernels is useless - it will just degrade performance by consuming one more cache line per I/O without providing any gain.
In the upstream kernels, Mike needs that "md->pending_io" variable for other reasons (the I/O accounting was reworked there in order to avoid some spikes with dm-crypt), but there is no need for it in the stable kernels.
In order to fix that race condition, all we need to do is to make sure that dm_stats_account_io is called before bio_end_io_acct - and the patch does that - it swaps them.
Do you still insist that this useless percpu variable must be added to the stable kernels? If you do, I can make it, but I think it's better to just swap those two functions.
Mikulas
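The ordering argument above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the dm code: `model_dev`, `completer`, `waiter` and `run_model` are hypothetical names, a C11 release decrement stands in for `smp_wmb()` followed by `bio_end_io_acct()`, and an acquire load stands in for the wait loop followed by `smp_rmb()`. The point it tries to show is that once accounting and freeing happen before the in-flight count drops, a waiter that sees zero can tear state down safely.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; these names are not kernel APIs. */
struct model_dev {
	atomic_int in_flight;	/* stands in for the block layer's in-flight count */
	bool stats_alive;	/* stands in for md state torn down after the wait */
	int stats_hits;		/* accounting calls that saw live stats */
	int late_touches;	/* accounting calls that ran after teardown */
	int n_ios;
};

/* Completion side: account and free first, publish "I/O finished" last. */
static void *completer(void *arg)
{
	struct model_dev *md = arg;

	for (int i = 0; i < md->n_ios; i++) {
		int *io = malloc(sizeof(*io));	/* stands in for struct dm_io */

		if (md->stats_alive)		/* like dm_stats_account_io() */
			md->stats_hits++;
		else
			md->late_touches++;
		free(io);			/* like free_io() */

		/* Release ordering models smp_wmb() + bio_end_io_acct(). */
		atomic_fetch_sub_explicit(&md->in_flight, 1, memory_order_release);
	}
	return NULL;
}

/* Waiter side: tear down only after observing zero in-flight I/O. */
static void *waiter(void *arg)
{
	struct model_dev *md = arg;

	/* Acquire ordering models the wait loop followed by smp_rmb(). */
	while (atomic_load_explicit(&md->in_flight, memory_order_acquire) != 0)
		sched_yield();
	md->stats_alive = false;		/* like dropping the old table state */
	return NULL;
}

/* Returns how many accounting calls ran after teardown. */
static int run_model(int n_ios, int *hits_out)
{
	struct model_dev md = { .stats_alive = true, .n_ios = n_ios };
	pthread_t c, w;

	atomic_init(&md.in_flight, n_ios);
	pthread_create(&w, NULL, waiter, &md);
	pthread_create(&c, NULL, completer, &md);
	pthread_join(c, NULL);
	pthread_join(w, NULL);
	if (hits_out)
		*hits_out = md.stats_hits;
	return md.late_touches;
}
```

Calling `run_model(1000, &hits)` from a program linked with pthread support should report no late touches in this model. Moving the decrement ahead of the accounting step would let the waiter flip `stats_alive` while the completer is still using it, which is the shape of the race the patch closes.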
On Thu, Apr 28, 2022 at 12:22:26PM -0400, Mikulas Patocka wrote:
> On Tue, 26 Apr 2022, Greg Kroah-Hartman wrote:
> > On Thu, Apr 21, 2022 at 02:08:30PM -0400, Mikulas Patocka wrote:
> > > Hi
> > Not really needed in a changelog text :)
> > > This is a backport of patches d208b89401e0 ("dm: fix mempool NULL pointer race when completing IO") and 9f6dc6337610 ("dm: interlock pending dm_io and dm_wait_for_bios_completion") for the 5.10 kernel.
> > Can you just make these 2 different patches?
> > > The bugs fixed by these patches can cause random crashes when reloading a dm table, so the fix is eligible for a stable backport.
> > > This patch is different from the upstream patches because the code diverged significantly.
> > This change is _VERY_ different. I would need acks from the maintainers of this code before I could accept this, along with a much more detailed description of why the original commits will not work here as well.
> > Same for the other backports.
> Regarding backporting of 9f6dc633:
> My reasoning was that introducing "md->pending_io" in the backported stable kernels is useless - it will just degrade performance by consuming one more cache line per I/O without providing any gain.
> In the upstream kernels, Mike needs that "md->pending_io" variable for other reasons (the I/O accounting was reworked there in order to avoid some spikes with dm-crypt), but there is no need for it in the stable kernels.
> In order to fix that race condition, all we need to do is to make sure that dm_stats_account_io is called before bio_end_io_acct - and the patch does that - it swaps them.
> Do you still insist that this useless percpu variable must be added to the stable kernels? If you do, I can make it, but I think it's better to just swap those two functions.
I am not insisting on anything; I want the dm maintainers to agree that this change is acceptable to take, as it is not what is in Linus's tree. Every time we take a "not upstream" commit, the odds are 90% that it ends up being wrong, so I need extra review and assurances that it is acceptable before I can apply it.
thanks,
greg k-h
On Fri, Apr 29 2022 at 4:37P -0400, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> On Thu, Apr 28, 2022 at 12:22:26PM -0400, Mikulas Patocka wrote:
> > On Tue, 26 Apr 2022, Greg Kroah-Hartman wrote:
> > > On Thu, Apr 21, 2022 at 02:08:30PM -0400, Mikulas Patocka wrote:
> > > > Hi
> > > Not really needed in a changelog text :)
> > > > This is a backport of patches d208b89401e0 ("dm: fix mempool NULL pointer race when completing IO") and 9f6dc6337610 ("dm: interlock pending dm_io and dm_wait_for_bios_completion") for the 5.10 kernel.
> > > Can you just make these 2 different patches?
> > > > The bugs fixed by these patches can cause random crashes when reloading a dm table, so the fix is eligible for a stable backport.
> > > > This patch is different from the upstream patches because the code diverged significantly.
> > > This change is _VERY_ different. I would need acks from the maintainers of this code before I could accept this, along with a much more detailed description of why the original commits will not work here as well.
> > > Same for the other backports.
> > Regarding backporting of 9f6dc633:
> > My reasoning was that introducing "md->pending_io" in the backported stable kernels is useless - it will just degrade performance by consuming one more cache line per I/O without providing any gain.
> > In the upstream kernels, Mike needs that "md->pending_io" variable for other reasons (the I/O accounting was reworked there in order to avoid some spikes with dm-crypt), but there is no need for it in the stable kernels.
> > In order to fix that race condition, all we need to do is to make sure that dm_stats_account_io is called before bio_end_io_acct - and the patch does that - it swaps them.
> > Do you still insist that this useless percpu variable must be added to the stable kernels? If you do, I can make it, but I think it's better to just swap those two functions.
> I am not insisting on anything; I want the dm maintainers to agree that this change is acceptable to take, as it is not what is in Linus's tree. Every time we take a "not upstream" commit, the odds are 90% that it ends up being wrong, so I need extra review and assurances that it is acceptable before I can apply it.
FYI, I've reviewed Mikulas's latest stable backport patches (not yet posted) and provided my Reviewed-by. So once you see them you can trust I've looked at the changes and am fine with you picking them up.
Thanks, Mike