From: Christoph Hellwig <hch@lst.de>
Upstream commit: 3175199ab0ac ("block: split bio_kmalloc from bio_alloc_bioset")
This is a backport to stable 5.10. It fixes an issue reported by syzbot.
Link: https://syzkaller.appspot.com/bug?id=a3416231e37024a75f2b95bd95db0d8ce8132a8...
bio_kmalloc shares almost no logic with the bio_set based fast path in bio_alloc_bioset. Split it into an entirely separate implementation.
Reported-by: syzbot+4f441e6ca0fcad141421@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
---
 block/bio.c         | 166 +++++++++++++++++++++++---------------------
 include/linux/bio.h |   6 +-
 2 files changed, 86 insertions(+), 86 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index f8d26ce7b61b..be59276e462e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -405,122 +405,101 @@ static void punt_bios_to_rescuer(struct bio_set *bs)
  * @nr_iovecs:	number of iovecs to pre-allocate
  * @bs:		the bio_set to allocate from.
  *
- * Description:
- *   If @bs is NULL, uses kmalloc() to allocate the bio; else the allocation is
- *   backed by the @bs's mempool.
+ * Allocate a bio from the mempools in @bs.
  *
- *   When @bs is not NULL, if %__GFP_DIRECT_RECLAIM is set then bio_alloc will
- *   always be able to allocate a bio. This is due to the mempool guarantees.
- *   To make this work, callers must never allocate more than 1 bio at a time
- *   from this pool. Callers that need to allocate more than 1 bio must always
- *   submit the previously allocated bio for IO before attempting to allocate
- *   a new one. Failure to do so can cause deadlocks under memory pressure.
+ * If %__GFP_DIRECT_RECLAIM is set then bio_alloc will always be able to
+ * allocate a bio. This is due to the mempool guarantees. To make this work,
+ * callers must never allocate more than 1 bio at a time from the general pool.
+ * Callers that need to allocate more than 1 bio must always submit the
+ * previously allocated bio for IO before attempting to allocate a new one.
+ * Failure to do so can cause deadlocks under memory pressure.
  *
- *   Note that when running under submit_bio_noacct() (i.e. any block
- *   driver), bios are not submitted until after you return - see the code in
- *   submit_bio_noacct() that converts recursion into iteration, to prevent
- *   stack overflows.
+ * Note that when running under submit_bio_noacct() (i.e. any block driver),
+ * bios are not submitted until after you return - see the code in
+ * submit_bio_noacct() that converts recursion into iteration, to prevent
+ * stack overflows.
  *
- *   This would normally mean allocating multiple bios under
- *   submit_bio_noacct() would be susceptible to deadlocks, but we have
- *   deadlock avoidance code that resubmits any blocked bios from a rescuer
- *   thread.
+ * This would normally mean allocating multiple bios under submit_bio_noacct()
+ * would be susceptible to deadlocks, but we have
+ * deadlock avoidance code that resubmits any blocked bios from a rescuer
+ * thread.
  *
- *   However, we do not guarantee forward progress for allocations from other
- *   mempools. Doing multiple allocations from the same mempool under
- *   submit_bio_noacct() should be avoided - instead, use bio_set's front_pad
- *   for per bio allocations.
+ * However, we do not guarantee forward progress for allocations from other
+ * mempools. Doing multiple allocations from the same mempool under
+ * submit_bio_noacct() should be avoided - instead, use bio_set's front_pad
+ * for per bio allocations.
  *
- *   RETURNS:
- *   Pointer to new bio on success, NULL on failure.
+ * Returns: Pointer to new bio on success, NULL on failure.
  */
 struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
 			     struct bio_set *bs)
 {
 	gfp_t saved_gfp = gfp_mask;
-	unsigned front_pad;
-	unsigned inline_vecs;
-	struct bio_vec *bvl = NULL;
 	struct bio *bio;
 	void *p;

-	if (!bs) {
-		if (nr_iovecs > UIO_MAXIOV)
-			return NULL;
-
-		p = kmalloc(struct_size(bio, bi_inline_vecs, nr_iovecs), gfp_mask);
-		front_pad = 0;
-		inline_vecs = nr_iovecs;
-	} else {
-		/* should not use nobvec bioset for nr_iovecs > 0 */
-		if (WARN_ON_ONCE(!mempool_initialized(&bs->bvec_pool) &&
-				 nr_iovecs > 0))
-			return NULL;
-		/*
-		 * submit_bio_noacct() converts recursion to iteration; this
-		 * means if we're running beneath it, any bios we allocate and
-		 * submit will not be submitted (and thus freed) until after we
-		 * return.
-		 *
-		 * This exposes us to a potential deadlock if we allocate
-		 * multiple bios from the same bio_set() while running
-		 * underneath submit_bio_noacct(). If we were to allocate
-		 * multiple bios (say a stacking block driver that was splitting
-		 * bios), we would deadlock if we exhausted the mempool's
-		 * reserve.
-		 *
-		 * We solve this, and guarantee forward progress, with a rescuer
-		 * workqueue per bio_set. If we go to allocate and there are
-		 * bios on current->bio_list, we first try the allocation
-		 * without __GFP_DIRECT_RECLAIM; if that fails, we punt those
-		 * bios we would be blocking to the rescuer workqueue before
-		 * we retry with the original gfp_flags.
-		 */
-
-		if (current->bio_list &&
-		    (!bio_list_empty(&current->bio_list[0]) ||
-		     !bio_list_empty(&current->bio_list[1])) &&
-		    bs->rescue_workqueue)
-			gfp_mask &= ~__GFP_DIRECT_RECLAIM;
+	/* should not use nobvec bioset for nr_iovecs > 0 */
+	if (WARN_ON_ONCE(!mempool_initialized(&bs->bvec_pool) && nr_iovecs > 0))
+		return NULL;

+	/*
+	 * submit_bio_noacct() converts recursion to iteration; this means if
+	 * we're running beneath it, any bios we allocate and submit will not be
+	 * submitted (and thus freed) until after we return.
+	 *
+	 * This exposes us to a potential deadlock if we allocate multiple bios
+	 * from the same bio_set() while running underneath submit_bio_noacct().
+	 * If we were to allocate multiple bios (say a stacking block driver
+	 * that was splitting bios), we would deadlock if we exhausted the
+	 * mempool's reserve.
+	 *
+	 * We solve this, and guarantee forward progress, with a rescuer
+	 * workqueue per bio_set. If we go to allocate and there are bios on
+	 * current->bio_list, we first try the allocation without
+	 * __GFP_DIRECT_RECLAIM; if that fails, we punt those bios we would be
+	 * blocking to the rescuer workqueue before we retry with the original
+	 * gfp_flags.
+	 */
+	if (current->bio_list &&
+	    (!bio_list_empty(&current->bio_list[0]) ||
+	     !bio_list_empty(&current->bio_list[1])) &&
+	    bs->rescue_workqueue)
+		gfp_mask &= ~__GFP_DIRECT_RECLAIM;
+
+	p = mempool_alloc(&bs->bio_pool, gfp_mask);
+	if (!p && gfp_mask != saved_gfp) {
+		punt_bios_to_rescuer(bs);
+		gfp_mask = saved_gfp;
 		p = mempool_alloc(&bs->bio_pool, gfp_mask);
-		if (!p && gfp_mask != saved_gfp) {
-			punt_bios_to_rescuer(bs);
-			gfp_mask = saved_gfp;
-			p = mempool_alloc(&bs->bio_pool, gfp_mask);
-		}
-
-		front_pad = bs->front_pad;
-		inline_vecs = BIO_INLINE_VECS;
 	}
-
 	if (unlikely(!p))
 		return NULL;

-	bio = p + front_pad;
-	bio_init(bio, NULL, 0);
-
-	if (nr_iovecs > inline_vecs) {
+	bio = p + bs->front_pad;
+	if (nr_iovecs > BIO_INLINE_VECS) {
 		unsigned long idx = 0;
+		struct bio_vec *bvl = NULL;

 		bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx, &bs->bvec_pool);
 		if (!bvl && gfp_mask != saved_gfp) {
 			punt_bios_to_rescuer(bs);
 			gfp_mask = saved_gfp;
-			bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx, &bs->bvec_pool);
+			bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx,
+					 &bs->bvec_pool);
 		}

 		if (unlikely(!bvl))
 			goto err_free;

 		bio->bi_flags |= idx << BVEC_POOL_OFFSET;
+		bio_init(bio, bvl, bvec_nr_vecs(idx));
 	} else if (nr_iovecs) {
-		bvl = bio->bi_inline_vecs;
+		bio_init(bio, bio->bi_inline_vecs, BIO_INLINE_VECS);
+	} else {
+		bio_init(bio, NULL, 0);
 	}

 	bio->bi_pool = bs;
-	bio->bi_max_vecs = nr_iovecs;
-	bio->bi_io_vec = bvl;
 	return bio;

 err_free:
@@ -529,6 +508,31 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
 }
 EXPORT_SYMBOL(bio_alloc_bioset);

+/**
+ * bio_kmalloc - kmalloc a bio for I/O
+ * @gfp_mask:   the GFP_* mask given to the slab allocator
+ * @nr_iovecs:	number of iovecs to pre-allocate
+ *
+ * Use kmalloc to allocate and initialize a bio.
+ *
+ * Returns: Pointer to new bio on success, NULL on failure.
+ */
+struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs)
+{
+	struct bio *bio;
+
+	if (nr_iovecs > UIO_MAXIOV)
+		return NULL;
+
+	bio = kmalloc(struct_size(bio, bi_inline_vecs, nr_iovecs), gfp_mask);
+	if (unlikely(!bio))
+		return NULL;
+	bio_init(bio, nr_iovecs ? bio->bi_inline_vecs : NULL, nr_iovecs);
+	bio->bi_pool = NULL;
+	return bio;
+}
+EXPORT_SYMBOL(bio_kmalloc);
+
 void zero_fill_bio_iter(struct bio *bio, struct bvec_iter start)
 {
 	unsigned long flags;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 23b7a73cd757..1c790e48dcef 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -390,6 +390,7 @@ extern int biovec_init_pool(mempool_t *pool, int pool_entries);
 extern int bioset_init_from_src(struct bio_set *bs, struct bio_set *src);

 extern struct bio *bio_alloc_bioset(gfp_t, unsigned int, struct bio_set *);
+struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs);
 extern void bio_put(struct bio *);

 extern void __bio_clone_fast(struct bio *, struct bio *);
@@ -402,11 +403,6 @@ static inline struct bio *bio_alloc(gfp_t gfp_mask, unsigned int nr_iovecs)
 	return bio_alloc_bioset(gfp_mask, nr_iovecs, &fs_bio_set);
 }

-static inline struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs)
-{
-	return bio_alloc_bioset(gfp_mask, nr_iovecs, NULL);
-}
-
 extern blk_qc_t submit_bio(struct bio *);

 extern void bio_endio(struct bio *);
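
For readers following the new bio_kmalloc() outside a kernel tree, here is a minimal userspace sketch of its allocation shape: reject oversized vector counts, size one allocation as header plus trailing vectors, and point the vector table at the inline array only when vectors were requested. Every model_* name and MODEL_UIO_MAXIOV is a hypothetical stand-in invented for this illustration, malloc() stands in for kmalloc(), and the overflow guard only approximates what struct_size() does. It is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's UIO_MAXIOV limit. */
#define MODEL_UIO_MAXIOV 1024u

/* Hypothetical stand-in for struct bio_vec. */
struct model_vec {
	void *page;
	unsigned int len;
	unsigned int offset;
};

/* Hypothetical stand-in for struct bio, reduced to the fields used here. */
struct model_bio {
	void *pool;                     /* NULL: not backed by any pool */
	unsigned short max_vecs;        /* capacity of the vector table */
	struct model_vec *io_vec;       /* vector table, or NULL if none */
	struct model_vec inline_vecs[]; /* flexible array member */
};

/* Header plus n trailing vectors; SIZE_MAX if the sum would overflow. */
static size_t model_struct_size(size_t n)
{
	if (n > (SIZE_MAX - sizeof(struct model_bio)) / sizeof(struct model_vec))
		return SIZE_MAX;
	return sizeof(struct model_bio) + n * sizeof(struct model_vec);
}

/* Models the shape of bio_kmalloc(); malloc() replaces kmalloc(). */
static struct model_bio *model_bio_kmalloc(unsigned int nr_iovecs)
{
	struct model_bio *bio;

	if (nr_iovecs > MODEL_UIO_MAXIOV)
		return NULL;

	/* The limit check above keeps the size far away from overflow. */
	assert(model_struct_size(nr_iovecs) != SIZE_MAX);

	bio = malloc(model_struct_size(nr_iovecs));
	if (!bio)
		return NULL;

	bio->pool = NULL;
	bio->max_vecs = (unsigned short)nr_iovecs;
	bio->io_vec = nr_iovecs ? bio->inline_vecs : NULL;
	return bio;
}
```

Because the header and the vectors live in a single allocation, one free() releases the whole object in this model.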
From: Christoph Hellwig <hch@lst.de>
Upstream commit: b90994c6ab62 ("block: fix bounce_clone_bio for passthrough bios")
This is a backport to stable 5.10. It fixes an issue reported by syzbot.
Link: https://syzkaller.appspot.com/bug?id=a3416231e37024a75f2b95bd95db0d8ce8132a8...
Now that bio_alloc_bioset does not fall back to kmalloc for a NULL bio_set, handle that case explicitly and simplify the calling conventions.
Based on an earlier patch from Chaitanya Kulkarni.
Fixes: 3175199ab0ac ("block: split bio_kmalloc from bio_alloc_bioset")
Reported-by: syzbot+4f441e6ca0fcad141421@syzkaller.appspotmail.com
Reported-by: Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
---
 block/bounce.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/block/bounce.c b/block/bounce.c
index 162a6eee8999..4da429de78a2 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -214,8 +214,7 @@ static void bounce_end_io_read_isa(struct bio *bio)
 	__bounce_end_io_read(bio, &isa_page_pool);
 }

-static struct bio *bounce_clone_bio(struct bio *bio_src, gfp_t gfp_mask,
-		struct bio_set *bs)
+static struct bio *bounce_clone_bio(struct bio *bio_src, gfp_t gfp_mask)
 {
 	struct bvec_iter iter;
 	struct bio_vec bv;
@@ -242,8 +241,11 @@ static struct bio *bounce_clone_bio(struct bio *bio_src, gfp_t gfp_mask,
 	 * asking for trouble and would force extra work on
 	 * __bio_clone_fast() anyways.
 	 */
-
-	bio = bio_alloc_bioset(gfp_mask, bio_segments(bio_src), bs);
+	if (bio_is_passthrough(bio_src))
+		bio = bio_kmalloc(gfp_mask, bio_segments(bio_src));
+	else
+		bio = bio_alloc_bioset(gfp_mask, bio_segments(bio_src),
+				       &bounce_bio_set);
 	if (!bio)
 		return NULL;
 	bio->bi_disk = bio_src->bi_disk;
@@ -294,7 +296,6 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
 	unsigned i = 0;
 	bool bounce = false;
 	int sectors = 0;
-	bool passthrough = bio_is_passthrough(*bio_orig);

 	bio_for_each_segment(from, *bio_orig, iter) {
 		if (i++ < BIO_MAX_PAGES)
@@ -305,14 +306,14 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
 	if (!bounce)
 		return;

-	if (!passthrough && sectors < bio_sectors(*bio_orig)) {
+	if (!bio_is_passthrough(*bio_orig) &&
+	    sectors < bio_sectors(*bio_orig)) {
 		bio = bio_split(*bio_orig, sectors, GFP_NOIO, &bounce_bio_split);
 		bio_chain(bio, *bio_orig);
 		submit_bio_noacct(*bio_orig);
 		*bio_orig = bio;
 	}
-	bio = bounce_clone_bio(*bio_orig, GFP_NOIO, passthrough ? NULL :
-			&bounce_bio_set);
+	bio = bounce_clone_bio(*bio_orig, GFP_NOIO);

 	/*
 	 * Bvec table can't be updated by bio_for_each_segment_all(),
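
The calling-convention change in bounce_clone_bio() can be modelled in a few lines of userspace C. In the sketch below every model_* identifier is hypothetical and invented for illustration. The old convention passes NULL as the bio_set to request a kmalloc-backed bio, which stops working once the allocator requires a real bio_set. The new convention has the caller choose the allocator. The model reports a failure for a NULL set; as far as the patched code above shows, the real bio_alloc_bioset() has no such check and would dereference the pointer instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio_set. */
struct model_bio_set {
	const char *name;
};

/* Where a modelled bio came from. */
enum model_origin {
	MODEL_ALLOC_FAILED,
	MODEL_FROM_KMALLOC,
	MODEL_FROM_BIO_SET,
};

static struct model_bio_set model_bounce_bio_set = { "bounce" };

/*
 * Models the post-split bio_alloc_bioset(): a bio_set is mandatory.
 * Returning a failure code for NULL is a simplification of this model.
 */
static enum model_origin model_alloc_bioset(const struct model_bio_set *bs)
{
	return bs ? MODEL_FROM_BIO_SET : MODEL_ALLOC_FAILED;
}

/* Models bio_kmalloc(): no bio_set is involved at all. */
static enum model_origin model_kmalloc(void)
{
	return MODEL_FROM_KMALLOC;
}

/* Old convention: NULL in the bio_set argument asks for the kmalloc path. */
static enum model_origin model_clone_old(bool passthrough)
{
	return model_alloc_bioset(passthrough ? NULL : &model_bounce_bio_set);
}

/* New convention: the caller picks the allocator explicitly. */
static enum model_origin model_clone_new(bool passthrough)
{
	if (passthrough)
		return model_kmalloc();
	return model_alloc_bioset(&model_bounce_bio_set);
}
```

With the old convention a passthrough clone fails in this model, while the new convention routes it to the kmalloc-style path and leaves the non-passthrough case unchanged.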
On Mon, Jul 18, 2022 at 02:12:25PM -0700, Tadeusz Struk wrote:
From: Christoph Hellwig <hch@lst.de>
Upstream commit: 3175199ab0ac ("block: split bio_kmalloc from bio_alloc_bioset")
This is a backport to stable 5.10. It fixes an issue reported by syzbot.
Link: https://syzkaller.appspot.com/bug?id=a3416231e37024a75f2b95bd95db0d8ce8132a8...
Both now queued up, thanks.
greg k-h
On Sat, Jul 23, 2022 at 04:59:42PM +0200, Greg KH wrote:
On Mon, Jul 18, 2022 at 02:12:25PM -0700, Tadeusz Struk wrote:
From: Christoph Hellwig <hch@lst.de>
Upstream commit: 3175199ab0ac ("block: split bio_kmalloc from bio_alloc_bioset")
This is a backport to stable 5.10. It fixes an issue reported by syzbot.
Link: https://syzkaller.appspot.com/bug?id=a3416231e37024a75f2b95bd95db0d8ce8132a8...
Both now queued up, thanks.
As was just reported, this breaks things all over the place: https://lore.kernel.org/r/219030d8-3408-cc9d-7aec-1fb14ab891ce@roeck-us.net
Note, I also had to add lots of fix-up patches on top of these two that you missed, so odds are there are other fix-ups that I also missed.
Please go and test this again, and submit ALL patches that are needed after they pass the proper testing and I will be glad to reconsider them again.
thanks,
greg k-h
On Thu, Jul 28, 2022 at 04:42:35PM +0200, Greg KH wrote:
Link: https://syzkaller.appspot.com/bug?id=a3416231e37024a75f2b95bd95db0d8ce8132a8...
Both now queued up, thanks.
As was just reported, this breaks things all over the place: https://lore.kernel.org/r/219030d8-3408-cc9d-7aec-1fb14ab891ce@roeck-us.net
Note, I also had to add lots of fix-up patches on top of these two that you missed, so odds are there are other fix-ups that I also missed.
Please go and test this again, and submit ALL patches that are needed after they pass the proper testing and I will be glad to reconsider them again.
Why did this even get backported? It was a cleanup that required a lot of prep work, and should not by itself fix anything.
On Thu, Jul 28, 2022 at 04:45:20PM +0200, Christoph Hellwig wrote:
On Thu, Jul 28, 2022 at 04:42:35PM +0200, Greg KH wrote:
Link: https://syzkaller.appspot.com/bug?id=a3416231e37024a75f2b95bd95db0d8ce8132a8...
Both now queued up, thanks.
As was just reported, this breaks things all over the place: https://lore.kernel.org/r/219030d8-3408-cc9d-7aec-1fb14ab891ce@roeck-us.net
Note, I also had to add lots of fix-up patches on top of these two that you missed, so odds are there are other fix-ups that I also missed.
Please go and test this again, and submit ALL patches that are needed after they pass the proper testing and I will be glad to reconsider them again.
Why did this even get backported? It was a cleanup that required a lot of prep work, and should not by itself fix anything.
Looks like syzkaller is reporting something odd...
Tadeusz, how was this tested?
thanks,
greg k-h
On 7/28/22 08:00, Greg KH wrote:
On Thu, Jul 28, 2022 at 04:45:20PM +0200, Christoph Hellwig wrote:
On Thu, Jul 28, 2022 at 04:42:35PM +0200, Greg KH wrote:
Link: https://syzkaller.appspot.com/bug?id=a3416231e37024a75f2b95bd95db0d8ce8132a8...
Both now queued up, thanks.
As was just reported, this breaks things all over the place: https://lore.kernel.org/r/219030d8-3408-cc9d-7aec-1fb14ab891ce@roeck-us.net
Note, I also had to add lots of fix-up patches on top of these two that you missed, so odds are there are other fix-ups that I also missed.
Please go and test this again, and submit ALL patches that are needed after they pass the proper testing and I will be glad to reconsider them again.
Why did this even get backported? It was a cleanup that required a lot of prep work, and should not by itself fix anything.
Looks like syzkaller is reporting something odd...
Tadeusz, how was this tested?
Yes, I tested it with syzbot and locally, and it fixed the syzbot-reported "kernel BUG at block/blk-mq.c:567" issue:
https://syzkaller.appspot.com/bug?id=a3416231e37024a75f2b95bd95db0d8ce8132a8...
I only tested it by booting from an ext4 filesystem, as I don't have a btrfs setup.
linux-stable-mirror@lists.linaro.org