On 5/9/23 22:18, Jeff Layton wrote:
On Tue, 2023-05-09 at 08:57 +0800, xiubli@redhat.com wrote:
From: Xiubo Li xiubli@redhat.com
Blindly expanding the readahead windows will cause unnecessary pagecache thrashing and will also introduce extra network workload. We should not expand the window at all if readahead is disabled, and we shouldn't expand it too much either.
Expand forward first, rather than backward, to favor possible sequential reads.
Bound `rreq->len` to the actual file size to restore the previous page cache usage.
Cc: stable@vger.kernel.org
Fixes: 49870056005c ("ceph: convert ceph_readpages to ceph_readahead")
URL: https://lore.kernel.org/ceph-devel/20230504082510.247-1-sehuww@mail.scut.edu...
URL: https://www.spinics.net/lists/ceph-users/msg76183.html
Cc: Hu Weiwen sehuww@mail.scut.edu.cn
Signed-off-by: Xiubo Li xiubli@redhat.com
V4:
- two small cleanups from Ilya's comments. Thanks
(cc'ing Steve French since he was asking me about ceph readahead yesterday)
FWIW, the original idea here was to try to read whole OSD objects when we can. I can see that that may have been overzealous though, so ramping up the size more slowly makes sense.
 fs/ceph/addr.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index ca4dc6450887..683ba9fbd590 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -188,16 +188,30 @@ static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
 	struct inode *inode = rreq->inode;
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_file_layout *lo = &ci->i_layout;
+	unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
+	unsigned long max_len = max_pages << PAGE_SHIFT;
+	loff_t end = rreq->start + rreq->len, new_end;
 	u32 blockoff;
-	u64 blockno;
-	/* Expand the start downward */
-	blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
-	rreq->start = blockno * lo->stripe_unit;
-	rreq->len += blockoff;
+	/* Readahead is disabled */
+	if (!max_pages)
+		return;
-	/* Now, round up the length to the next block */
-	rreq->len = roundup(rreq->len, lo->stripe_unit);
+	/*
+	 * Try to expand the length forward by rounding up it to the next
+	 * block, but do not exceed the file size, unless the original
+	 * request already exceeds it.
+	 */
Hi Jeff,
Do you really need to clamp this to the i_size? Is it ever possible for the client to be out of date as to the real file size? If so then you might end up with a short read when there is actually more data.
I guess by doing this, you're trying to avoid having to call the OSD back after a short read and get a zero-length read when the file is shorter than the requested read?
This is folded in from another fix by Weiwen: https://patchwork.kernel.org/project/ceph-devel/patch/20230504082510.247-1-s....
For the small-file use case, the unclamped expansion really could cause unnecessary network workload and inefficient use of the page cache.
Thanks
- Xiubo
+	new_end = min(round_up(end, lo->stripe_unit), rreq->i_size);
+	if (new_end > end && new_end <= rreq->start + max_len)
+		rreq->len = new_end - rreq->start;
+	/* Try to expand the start downward */
+	div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
+	if (rreq->len + blockoff <= max_len) {
+		rreq->start -= blockoff;
+		rreq->len += blockoff;
+	}
 }
static bool ceph_netfs_clamp_length(struct netfs_io_subrequest *subreq)
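
For readers following the arithmetic, the expansion logic in the hunk above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the kernel code: `struct req`, `round_up_u64()` and `expand_readahead()` are hypothetical stand-ins for `struct netfs_io_request`, `round_up()` and `ceph_netfs_expand_readahead()`, and the stripe unit and readahead window are passed in as plain byte counts instead of being read from the inode layout and the BDI.

```c
#include <stdint.h>

/* Hypothetical stand-in for the netfs_io_request fields used by the patch. */
struct req {
	uint64_t start;  /* byte offset where the request begins */
	uint64_t len;    /* request length in bytes */
	uint64_t i_size; /* file size as currently known to the client */
};

/* Round x up to the next multiple of unit (unit must be non-zero). */
static uint64_t round_up_u64(uint64_t x, uint64_t unit)
{
	return ((x + unit - 1) / unit) * unit;
}

/*
 * Sketch of the patched expansion: max_len plays the role of
 * ra_pages << PAGE_SHIFT, stripe_unit the role of lo->stripe_unit.
 */
static void expand_readahead(struct req *r, uint64_t stripe_unit,
			     uint64_t max_len)
{
	uint64_t end = r->start + r->len;
	uint64_t new_end, blockoff;

	/* Readahead is disabled: leave the request untouched. */
	if (max_len == 0)
		return;

	/*
	 * Expand forward to the next stripe-unit boundary, clamped to the
	 * file size, and only if the result fits in the readahead window.
	 */
	new_end = round_up_u64(end, stripe_unit);
	if (new_end > r->i_size)
		new_end = r->i_size;
	if (new_end > end && new_end <= r->start + max_len)
		r->len = new_end - r->start;

	/* Expand the start downward only if the window still allows it. */
	blockoff = r->start % stripe_unit;
	if (r->len + blockoff <= max_len) {
		r->start -= blockoff;
		r->len += blockoff;
	}
}
```

As a worked case, assume a 4 MiB stripe unit, a 128 KiB readahead window and a 20480-byte file: a 4096-byte request at offset 8192 ends up as start 0, length 20480, i.e. it covers the whole small file instead of a full 4 MiB block.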