Re: [PATCH v2] mm/readahead: Fix large folio support in async readahead

11 Nov 2024

On 11.11.24 15:28, Yafang Shao wrote:
...
On Mon, Nov 11, 2024 at 6:33 PM David Hildenbrand david@redhat.com wrote:
...
On 08.11.24 15:17, Yafang Shao wrote:
...
When testing large folio support with XFS on our servers, we observed that
only a few large folios are mapped when reading large files via mmap.
After a thorough analysis, I identified it was caused by the
`/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this
parameter is set to 128KB. After I tune it to 2MB, the large folio can
work as expected. However, I believe the large folio behavior should not be
dependent on the value of read_ahead_kb. It would be more robust if the
kernel can automatically adopt to it.
Now I am extremely confused.
Documentation/ABI/stable/sysfs-block:
"[RW] Maximum number of kilobytes to read-ahead for filesystems on this
block device."
So, with your patch, will we also be changing the readahead size to
exceed that, or simply allocate larger folios and not exceeding the
readahead size (e.g., leaving them partially non-filled)?
Exceeding the readahead size for the MADV_HUGEPAGE case is
straightforward; this is what the current patch accomplishes.
Okay, so this only applies with MADV_HUGEPAGE I assume. Likely we should 
also make that clearer in the subject.
mm/readahead: allow exceeding configured read_ahead_kb with MADV_HUGEPAGE
If this is really a fix, especially one that deserves CC-stable, I 
cannot tell. Willy is the obvious expert :)
...
...
If you're also changing the readahead behavior to exceed the
configuration parameter it would sound to me like "I am pushing the
brake pedal and my care brakes; fix the brakes to adopt whether to brake
automatically" :)
Likely I am missing something here, and how the read_ahead_kb parameter
is used after your patch.
The read_ahead_kb parameter continues to function for
non-MADV_HUGEPAGE scenarios, whereas special handling is required for
the MADV_HUGEPAGE case. It appears that we ought to update the
Documentation/ABI/stable/sysfs-block to reflect the changes related to
large folios, correct?
Yes, how it related to MADV_HUGEPAGE. I would assume that it would get 
ignored, but ...
... staring at get_next_ra_size(), it's not quite ignored, because we 
still us it as a baseline to detect how much we want to bump up the 
limit when the requested size is small? (*2 vs *4 etc) :/
So the semantics are really starting to get weird, unless I am missing 
something important.
[...]
...
Perhaps a more straightforward solution would be to implement it
directly at the callsite, as demonstrated below?
Likely something into this direction might be better, but Willy is the 
expert that code.
...

diff --git a/mm/readahead.c b/mm/readahead.c
index 3dc6c7a128dd..187efae95b02 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -642,7 +642,11 @@ void page_cache_async_ra(struct readahead_control *ractl,
                         1UL << order);
         if (index == expected) {
                 ra->start += ra->size;

          ra->size = get_next_ra_size(ra, max_pages);




          /*


           * Allow the actual size to exceed the readahead window for a


           * large folio.



"a large folio" -> "with MADV_HUGEPAGE" ? Or can this be hit on 
different paths that are not covered in the patch description?
-- 
Cheers,

David / dhildenb



    

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH v2] mm/readahead: Fix large folio support in async readahead