On Fri, Jun 21, 2024 at 4:50 PM Chris Li chrisl@kernel.org wrote:
On Fri, Jun 21, 2024 at 12:47 AM Barry Song 21cnbao@gmail.com wrote:
On Fri, Jun 21, 2024 at 7:25 PM Ryan Roberts ryan.roberts@arm.com wrote:
On 20/06/2024 12:34, David Hildenbrand wrote:
On 20.06.24 11:04, Ryan Roberts wrote:
On 20/06/2024 01:26, Barry Song wrote:
From: Barry Song v-songbaohua@oppo.com
Both Ryan and Chris have been utilizing the small test program to aid in debugging and identifying issues with swap entry allocation. While a real or intricate workload might be more suitable for assessing the correctness and effectiveness of the swap allocation policy, a small test program presents a simpler means of understanding the problem and initially verifying the improvements being made.
Let's endeavor to integrate it into the self-test suite. Although it presently only accommodates 64KB and 4KB, I'm optimistic that we can expand its capabilities to support multiple sizes and simulate more complex systems in the future as required.
I'll try to summarize the thread with Huang Ying by suggesting this test program is "necessary but not sufficient" to exhaustively test the mTHP swap-out path. I've certainly found it useful and think it would be a valuable addition to the tree.
That said, I'm not convinced it is a selftest; IMO a selftest should provide a clear pass/fail result against some criteria and must be able to be run automatically by (e.g.) a CI system.
Likely we should then consider moving other such performance-related thingies out of the selftests?
Yes, that would get my vote. But of the 4 tests you mentioned that use clock_gettime(), it looks like transhuge-stress is the only one that doesn't have a pass/fail result, so is probably the only candidate for moving.
The others either use the times as a timeout and determine failure if the action didn't occur within the timeout (e.g. ksm_tests.c) or use it to add some supplemental performance information to an otherwise functionality-oriented test.
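For illustration, the "time as a timeout" pattern can be sketched as follows. This is a minimal sketch and not code taken from ksm_tests.c or any other selftest: wait_for(), elapsed_sec() and the two example predicates are hypothetical names introduced here. The point is only that clock_gettime() bounds the wait, and the boolean result is what gives a CI system a clear pass/fail.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Seconds elapsed between two CLOCK_MONOTONIC samples. */
static double elapsed_sec(const struct timespec *start,
			  const struct timespec *end)
{
	return (double)(end->tv_sec - start->tv_sec) +
	       (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/*
 * Hypothetical helper: poll done(arg) until it returns true or
 * timeout_sec has elapsed. Returns true (pass) if the condition was
 * met in time, false (fail) otherwise.
 */
static bool wait_for(bool (*done)(void *), void *arg, double timeout_sec)
{
	struct timespec start, now;

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		if (done(arg))
			return true;
		clock_gettime(CLOCK_MONOTONIC, &now);
	} while (elapsed_sec(&start, &now) < timeout_sec);

	return false;
}

/* Example predicate: succeeds on the third poll. */
static bool polled_three_times(void *arg)
{
	int *polls = arg;

	return ++(*polls) >= 3;
}

/* Example predicate: never succeeds, so wait_for() must time out. */
static bool never_done(void *arg)
{
	(void)arg;
	return false;
}
```

A test built this way would report failure when wait_for() returns false, whereas a pure stress tool only prints timings and leaves the interpretation to the reader.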
Thank you very much, Ryan. I think you've found a better home for this tool. I will send v2, relocating it to tools/mm and adding a function to swap in either the whole mTHPs or a portion of mTHPs by "-a" (aligned swap-in).
So basically, we will have
1. Use MADV_PAGEOUT for rapid swap-out, putting the swap allocation code under
heavy exercise in a short time.
2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in freeing
memory, as well as for munmap, app exits, or OOM killer scenarios. This ensures new mTHP is always generated, released or swapped out, similar to the behavior on a PC or Android phone where many applications are frequently started and terminated.
Will this cover the case that the ratio of order 0 and order 4 swap requests change during LMK, and swapfile is almost full?
If not, please add that :-)
Due to 2, we ensure a certain proportion of mTHP. Similarly, because of 3, we maintain a certain proportion of small folios, as we don't support large folios swap-in, meaning any swap-in will immediately result in small folios. Therefore, with both 2 and 3, we automatically achieve a system containing both mTHP and small folios. Additionally, 1 provides the ability to continuously swap them out. If we set the same sizes for 2 and 3, we'll achieve a 1:1 ratio of large folios to small folios. How about starting with a 1:1 ratio?
To meet the requirement that the swapfile is almost full, I can increase the memory to ensure the total size is quite close to zRAM. This way, we give the small folios a chance to perform a slow scan and observe the impact.
3. Swap in with or without the "-a" option to observe how fragments
due to swap-in and the incoming swap-in of large folios will impact swap-out fallback.
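The three steps above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the actual test program: alloc_and_touch(), swap_out(), discard() and swap_in() are hypothetical helper names, the 64KB chunk size is the one mentioned earlier in the thread, and the "aligned" flag only approximates the intent of "-a" (touch whole 64KB chunks versus only the first page of each chunk). Whether the kernel actually backs the region with mTHP depends on the system's mTHP settings, which this sketch does not configure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* Linux UAPI value, for older libc headers */
#endif

#define MTHP_SIZE (64UL * 1024) /* 64KB chunk size assumed from the thread */

/* Map nr_chunks * 64KB of anonymous memory, 64KB-aligned, and fault it in. */
static char *alloc_and_touch(size_t nr_chunks)
{
	size_t len = nr_chunks * MTHP_SIZE;
	char *raw, *p;

	assert(nr_chunks > 0);
	/* Over-map by one chunk so the start can be rounded up to 64KB. */
	raw = mmap(NULL, len + MTHP_SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;
	p = (char *)(((uintptr_t)raw + MTHP_SIZE - 1) &
		     ~(uintptr_t)(MTHP_SIZE - 1));
	memset(p, 0x5a, len);
	return p;
}

/* Step 1: ask the kernel to reclaim (swap out) the range right away. */
static int swap_out(char *p, size_t len)
{
	return madvise(p, len, MADV_PAGEOUT);
}

/* Step 2: drop the range, as a heap free, munmap or process exit would. */
static int discard(char *p, size_t len)
{
	return madvise(p, len, MADV_DONTNEED);
}

/*
 * Step 3: read the range back. With aligned != 0 every page of every
 * 64KB chunk is touched; otherwise only the first page of each chunk.
 * Returns the number of pages touched.
 */
static size_t swap_in(char *p, size_t len, int aligned)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t touched = 0;

	for (size_t chunk = 0; chunk < len; chunk += MTHP_SIZE) {
		size_t span = aligned ? MTHP_SIZE : page;

		for (size_t off = 0; off < span; off += page) {
			(void)*(volatile char *)(p + chunk + off);
			touched++;
		}
	}
	return touched;
}
```

Running swap_out() and swap_in() in a loop on one region while repeatedly calling discard() and re-touching another region of the same size would give the mix of large and small folios described above. Note that MADV_PAGEOUT is only a request: without a swap device configured the call can succeed while the pages stay resident.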
And many thanks to Chris for the suggestion on improving it within selftest, though I prefer to place it in tools/mm.
I am perfectly fine with that. Looking forward to your V2.
Chris
Thanks Barry