So we've had this regression in 9p for.. almost a year, which is way too long, but there was no "easy" reproducer until yesterday (thank you again!!)
It turned out to be a bug with iov_iter on folios, iov_iter_get_pages_alloc2() would advance the iov_iter correctly up to the end edge of a folio and the later copy_to_iter() fails on the iterate_folioq() bug.
Happy to consider alternative ways of fixing this, now there's a reproducer it's all much clearer; for the bug to be visible we basically need to make an IO with non-contiguous folios in the iov_iter, which is not obvious to test with synthetic VMs, with a size that triggers a zero-copy read followed by a non-zero-copy read.
Signed-off-by: Dominique Martinet asmadeus@codewreck.org
---
Changes in v2:
- Fixed 'remain' being used uninitialized in iterate_folioq when going
  through the goto
- s/forwarded/advanced in commit message
- Link to v1: https://lore.kernel.org/r/20250811-iot_iter_folio-v1-0-d9c223adf93c@codewrec...
---
Dominique Martinet (2):
      iov_iter: iterate_folioq: fix handling of offset >= folio size
      iov_iter: iov_folioq_get_pages: don't leave empty slot behind

 include/linux/iov_iter.h | 5 ++++-
 lib/iov_iter.c           | 6 +++---
 2 files changed, 7 insertions(+), 4 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250811-iot_iter_folio-1b7849f88fed
Best regards,
From: Dominique Martinet asmadeus@codewreck.org
It's apparently possible to get an iov advanced all the way up to the end of the current page we're looking at, e.g.
(gdb) p *iter
$24 = {iter_type = 4 '\004', nofault = false, data_source = false,
  iov_offset = 4096, {__ubuf_iovec = {iov_base = 0xffff88800f5bc000,
      iov_len = 655}, {{__iov = 0xffff88800f5bc000,
        kvec = 0xffff88800f5bc000, bvec = 0xffff88800f5bc000,
        folioq = 0xffff88800f5bc000, xarray = 0xffff88800f5bc000,
        ubuf = 0xffff88800f5bc000}, count = 655}},
  {nr_segs = 2, folioq_slot = 2 '\002', xarray_start = 2}}
Where iov_offset is 4k with 4k-sized folios
This should have been fine because we're only in the 2nd slot and there's another one after this, but iterate_folioq should not try to map a folio that skips the whole size, and more importantly part here does not end up zero (because 'PAGE_SIZE - skip % PAGE_SIZE' ends up PAGE_SIZE and not zero..), so skip forward to the "advance to next folio" code
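The arithmetic above can be checked in isolation. This is only a minimal userspace sketch, assuming 4k pages: part_for() and check_end_of_folio() are hypothetical helpers that mirror the umin() expression in iterate_folioq(), and the first pair of values comes from the gdb dump (count = 655, iov_offset = 4096).

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u	/* assumption: 4k pages, as in the report */

/* Hypothetical helper mirroring 'umin(len, PAGE_SIZE - skip % PAGE_SIZE)'. */
static size_t part_for(size_t len, size_t skip)
{
	size_t room = PAGE_SIZE - skip % PAGE_SIZE;

	return len < room ? len : room;
}

/* With skip == PAGE_SIZE the expression yields up to a full page, never zero. */
static void check_end_of_folio(void)
{
	assert(part_for(655, 4096) == 655);	/* values from the dump: non-zero */
	assert(part_for(8192, 4096) == 4096);	/* larger read: a whole page */
	assert(part_for(655, 4095) == 1);	/* one byte left in the page */
}
```

So when skip sits exactly at the end of a 4k folio, the modulo wraps to 0 and the code would go on to process 'part' bytes from a mapping that starts past the end of the folio, which is why the skip >= fsize check has to come before the mapping.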
Reported-by: Maximilian Bosch maximilian@mbosch.me
Reported-by: Ryan Lahfa ryan@lahfa.xyz
Reported-by: Christian Theune ct@flyingcircus.io
Reported-by: Arnout Engelen arnout@bzzt.net
Link: https://lkml.kernel.org/r/D4LHHUNLG79Y.12PI0X6BEHRHW@mbosch.me/
Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Cc: stable@vger.kernel.org # v6.12+
Acked-by: David Howells dhowells@redhat.com
Signed-off-by: Dominique Martinet asmadeus@codewreck.org
---
 include/linux/iov_iter.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/iov_iter.h b/include/linux/iov_iter.h
index c4aa58032faf874ee5b29bd37f9e23c479741bef..a77ff9c7e4b21eacd166adb506b79e7ddd723aa1 100644
--- a/include/linux/iov_iter.h
+++ b/include/linux/iov_iter.h
@@ -160,7 +160,7 @@ size_t iterate_folioq(struct iov_iter *iter, size_t len, void *priv, void *priv2
 
 	do {
 		struct folio *folio = folioq_folio(folioq, slot);
-		size_t part, remain, consumed;
+		size_t part, remain = 0, consumed;
 		size_t fsize;
 		void *base;
 
@@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *iter, size_t len, void *priv, void *priv2
 			break;
 
 		fsize = folioq_folio_size(folioq, slot);
+		if (skip >= fsize)
+			goto next;
 		base = kmap_local_folio(folio, skip);
 		part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
 		remain = step(base, progress, part, priv, priv2);
@@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *iter, size_t len, void *priv, void *priv2
 		progress += consumed;
 		skip += consumed;
 		if (skip >= fsize) {
+next:
 			skip = 0;
 			slot++;
 			if (slot == folioq_nr_slots(folioq) && folioq->next) {
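
For illustration, the check and label added by this patch can be exercised in a simplified userspace model of the loop. This is not the kernel code: folios are replaced by an array of sizes (assumed to be multiples of PAGE_SIZE), kmap_local_folio() and step() are dropped, every step is assumed to consume all of 'part', there is no folioq->next chaining, and struct model_iter / model_iterate() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u	/* assumption: 4k pages */

/* Hypothetical stand-in for the folioq position kept in struct iov_iter. */
struct model_iter {
	const size_t *fsizes;	/* size of the folio in each slot */
	size_t nr_slots;
	size_t slot;		/* plays the role of folioq_slot */
	size_t skip;		/* plays the role of iov_offset */
};

/*
 * Model of the patched loop: walk up to len bytes and return how many
 * were processed. A real step() may stop short; this one never does.
 */
static size_t model_iterate(struct model_iter *it, size_t len)
{
	size_t progress = 0;

	while (len && it->slot < it->nr_slots) {
		size_t fsize = it->fsizes[it->slot];
		size_t part, room;

		if (it->skip >= fsize)
			goto next;	/* new check: nothing left in this folio */

		room = PAGE_SIZE - it->skip % PAGE_SIZE;
		part = len < room ? len : room;
		assert(it->skip + part <= fsize);	/* never past the folio */
		len -= part;
		progress += part;
		it->skip += part;
		if (it->skip >= fsize) {
next:
			it->skip = 0;
			it->slot++;
		}
	}
	return progress;
}
```

Starting from a state like the one in the dump (offset 4096 in a 4096-byte folio, one more slot after it, 655 bytes to go), the model first moves to the next slot and then processes the 655 bytes from offset 0 there, instead of touching anything past the end of the current folio.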