'bio->bi_iter.bi_size' is 'unsigned int', which can hold at most 4G - 1 bytes.

Before 07173c3ec276 ("block: enable multipage bvecs"), one bio could include only a very limited number of pages, usually at most 256, so an fs bio was seldom bigger than 1M bytes.

Since multi-page bvecs are supported, in theory more than 1M pages can be added to one fs bio, especially in the case of hugepages, or big writeback with too many dirty pages. Then there is a chance that .bi_size overflows.

Fix this issue by using bio_full() to check whether the segment to be added may overflow .bi_size.
Cc: Liu Yiding <liuyd.fnst@cn.fujitsu.com>
Cc: kernel test robot <rong.a.chen@intel.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Fixes: 07173c3ec276 ("block: enable multipage bvecs")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c         | 10 +++++-----
 fs/iomap.c          |  2 +-
 fs/xfs/xfs_aops.c   |  2 +-
 include/linux/bio.h | 18 ++++++++++++++++--
 4 files changed, 23 insertions(+), 9 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index ce797d73bb43..67bba12d273b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -731,7 +731,7 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		}
 	}

-	if (bio_full(bio))
+	if (bio_full(bio, len))
 		return 0;

 	if (bio->bi_phys_segments >= queue_max_segments(q))
@@ -807,7 +807,7 @@ void __bio_add_page(struct bio *bio, struct page *page,
 	struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt];

 	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
-	WARN_ON_ONCE(bio_full(bio));
+	WARN_ON_ONCE(bio_full(bio, len));

 	bv->bv_page = page;
 	bv->bv_offset = off;
@@ -834,7 +834,7 @@ int bio_add_page(struct bio *bio, struct page *page,
 	bool same_page = false;

 	if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
-		if (bio_full(bio))
+		if (bio_full(bio, len))
 			return 0;
 		__bio_add_page(bio, page, len, offset);
 	}
@@ -922,7 +922,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 			if (same_page)
 				put_page(page);
 		} else {
-			if (WARN_ON_ONCE(bio_full(bio)))
+			if (WARN_ON_ONCE(bio_full(bio, len)))
 				return -EINVAL;
 			__bio_add_page(bio, page, len, offset);
 		}
@@ -966,7 +966,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 			ret = __bio_iov_bvec_add_pages(bio, iter);
 		else
 			ret = __bio_iov_iter_get_pages(bio, iter);
-	} while (!ret && iov_iter_count(iter) && !bio_full(bio));
+	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));

 	if (iov_iter_bvec_no_ref(iter))
 		bio_set_flag(bio, BIO_NO_PAGE_REF);
diff --git a/fs/iomap.c b/fs/iomap.c
index 12654c2e78f8..da961fca3180 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -333,7 +333,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	if (iop)
 		atomic_inc(&iop->read_count);

-	if (!ctx->bio || !is_contig || bio_full(ctx->bio)) {
+	if (!ctx->bio || !is_contig || bio_full(ctx->bio, plen)) {
 		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
 		int nr_vecs = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 8da5e6637771..11f703d4a605 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -782,7 +782,7 @@ xfs_add_to_ioend(
 		atomic_inc(&iop->write_count);

 	if (!merged) {
-		if (bio_full(wpc->ioend->io_bio))
+		if (bio_full(wpc->ioend->io_bio, len))
 			xfs_chain_bio(wpc->ioend, wbc, bdev, sector);
 		bio_add_page(wpc->ioend->io_bio, page, len, poff);
 	}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index f87abaa898f0..e36b8fc1b1c3 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -102,9 +102,23 @@ static inline void *bio_data(struct bio *bio)
 	return NULL;
 }

-static inline bool bio_full(struct bio *bio)
+/**
+ * bio_full - check if the bio is full
+ * @bio:	bio to check
+ * @len:	length of one segment to be added
+ *
+ * Return true if @bio is full and one segment with @len bytes can't be
+ * added to the bio, otherwise return false
+ */
+static inline bool bio_full(struct bio *bio, unsigned len)
 {
-	return bio->bi_vcnt >= bio->bi_max_vecs;
+	if (bio->bi_vcnt >= bio->bi_max_vecs)
+		return true;
+
+	if (bio->bi_iter.bi_size > UINT_MAX - len)
+		return true;
+
+	return false;
 }

 static inline bool bio_next_segment(const struct bio *bio,
On 7/1/19 1:14 AM, Ming Lei wrote:
> 'bio->bi_iter.bi_size' is 'unsigned int', which at most hold 4G - 1 bytes.
>
> Before 07173c3ec276 ("block: enable multipage bvecs"), one bio can include very limited pages, and usually at most 256, so the fs bio size won't be bigger than 1M bytes most of times.
>
> Since we support multi-page bvec, in theory one fs bio really can be added > 1M pages, especially in case of hugepage, or big writeback with too many dirty pages. Then there is chance in which .bi_size is overflowed.
>
> Fixes this issue by using bio_full() to check if the added segment may overflow .bi_size.

Any objections to queuing this up for 5.3? It's not a new regression in this series.
On 7/1/19 8:05 AM, Jens Axboe wrote:
> On 7/1/19 1:14 AM, Ming Lei wrote:
>> [...]
>
> Any objections to queuing this up for 5.3? It's not a new regression this series.

I took a closer look, applied it for 5.3, and removed the stable tag. We'll need to apply your patch for stable, and I added an adapted one for 5.3. I don't want a huge merge hassle because of this.
On 7/1/19 8:14 AM, Jens Axboe wrote:
> On 7/1/19 8:05 AM, Jens Axboe wrote:
>> On 7/1/19 1:14 AM, Ming Lei wrote:
>>> [...]
>>
>> Any objections to queuing this up for 5.3? It's not a new regression this series.
>
> I took a closer look, and applied for 5.3 and removed the stable tag. We'll need to apply your patch for stable, and I added an adapted one for 5.3. I don't want a huge merge hassle because of this.

OK, so we still get conflicts with that, due to both the same-page merge fix and Christoph's 5.3 changes.

I ended up pulling 5.2-rc6 into for-5.3/block, which resolves at least most of it, and kept the stable tag since it's now possible to backport without too much trouble.
On Mon, Jul 01, 2019 at 08:20:13AM -0600, Jens Axboe wrote:
> On 7/1/19 8:14 AM, Jens Axboe wrote:
>> [...]
>
> OK, so we still get conflicts with that, due to both the same page merge fix, and Christophs 5.3 changes.
>
> I ended up pulling in 5.2-rc6 in for-5.3/block, which resolves at least most of it, and kept the stable tag since now it's possible to backport without too much trouble.

Thanks for merging it.

BTW, we need the -stable tag: Yiding has a test case that reproduces the issue reliably; it just needs one big machine with enough memory and fast storage, I guess.

Thanks,
Ming
On 7/1/19 7:38 PM, Ming Lei wrote:
> On Mon, Jul 01, 2019 at 08:20:13AM -0600, Jens Axboe wrote:
>> [...]
>
> Thanks for merging it.
>
> BTW, we need the -stable tag, since Yiding has test case to reproduce the issue reliably, which just needs one big machine with enough memory, and fast storage, I guess.

Just to be clear, I wasn't saying it shouldn't go to stable. But it's pointless to mark something for stable if you know it will be rejected and won't be easily fixable by the person applying it. In that case, it's better NOT to CC stable and just send an appropriate patch instead.

But that's all moot now, per the last section of the email you are replying to.