From: David Howells
Sent: 15 September 2023 11:10
> David Laight <David.Laight@ACULAB.COM> wrote:
> > > Add kunit tests to benchmark 256MiB copies to a UBUF iterator and an IOVEC iterator. This attaches a userspace VM with a mapped file in it temporarily to the test thread.
> > Isn't that going to be completely dominated by the cache fills from memory?
> Yes... but it should be consistent in the amount of time that consumes since no device drivers are involved. I can try adding the same folio to the anon_file multiple times - it might work especially if I don't put the pages on the LRU (if that's even possible) - but I wanted separate pages for the extraction test.
You could also just not do the copy! Although you need (say) asm volatile("\n" ::: "memory") to stop it all being completely optimised away. That might show up a difference in the 'out_of_line' test, where 15% on top of the data copies is massive; it may be that the data cache behaviour is very different for the two cases.
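A minimal userspace sketch of that idea, assuming GCC or Clang inline asm. The walk_segments() helper and its parameters are hypothetical stand-ins for the per-segment iteration step, not code from the patch. The empty asm with a "memory" clobber emits no instructions, but it forces the compiler to keep the loop that would otherwise be optimised away once the copy is removed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compiler barrier: generates no instructions, but the "memory" clobber
 * tells the compiler that memory may have changed, so the enclosing loop
 * cannot be deleted or collapsed. In the kernel, barrier() from
 * <linux/compiler.h> serves the same purpose.
 */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical stand-in for an iteration loop with the copy removed:
 * walk 'total' bytes in segments of at most 'seg_len' bytes and return
 * the number of segments visited.
 */
static size_t walk_segments(size_t total, size_t seg_len)
{
	size_t done = 0, steps = 0;

	assert(seg_len > 0);

	while (done < total) {
		size_t n = total - done < seg_len ? total - done : seg_len;

		compiler_barrier();	/* where the data copy would have been */
		done += n;
		steps++;
	}
	return steps;
}
```

Timing such a loop with and without the copy would separate the iteration overhead from the memory traffic.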
...
> > Some measurements can be made using readv() and writev() on /dev/zero and /dev/null.
> Forget /dev/null; that doesn't actually engage any iteration code. The same for writing to /dev/zero. Reading from /dev/zero does its own iteration thing rather than using iterate_and_advance(), presumably because it checks for signals and resched.
Using /dev/null does exercise the 'copy iov from user' code. Last time I looked at that, the 32-bit compat code was faster than the 64-bit code on x86!
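A rough userspace sketch of that kind of measurement, assuming Linux and POSIX readv()/writev(). The time_vectored_io() helper and its parameters are made up for illustration. As noted above, writev() to /dev/null would mostly time the import of the iovec array from userspace, and readv() from /dev/zero would time that driver's own iteration loop, so neither is a measure of iterate_and_advance() itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue 'calls' vectored I/Os, each of 'nseg'
 * segments of 'seg_len' bytes, against 'path'.
 * Returns the elapsed time in nanoseconds, or -1 on error.
 */
static long long time_vectored_io(const char *path, int do_write,
				  int nseg, size_t seg_len, int calls)
{
	struct timespec t0, t1;
	struct iovec *iov;
	long long ns = -1;
	char *buf;
	int fd, i;

	/* Linux caps the segment count at UIO_MAXIOV (1024). */
	assert(nseg > 0 && nseg <= 1024 && seg_len > 0);

	fd = open(path, do_write ? O_WRONLY : O_RDONLY);
	if (fd < 0)
		return -1;

	buf = calloc(nseg, seg_len);
	iov = calloc(nseg, sizeof(*iov));
	if (!buf || !iov)
		goto out;

	/* Point each segment at its own slice of the buffer. */
	for (i = 0; i < nseg; i++) {
		iov[i].iov_base = buf + (size_t)i * seg_len;
		iov[i].iov_len = seg_len;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < calls; i++) {
		ssize_t ret = do_write ? writev(fd, iov, nseg)
				       : readv(fd, iov, nseg);
		if (ret < 0)
			goto out;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	     (t1.tv_nsec - t0.tv_nsec);
out:
	free(iov);
	free(buf);
	close(fd);
	return ns;
}
```

For example, time_vectored_io("/dev/null", 1, 1024, 1, 100000) would keep the data size tiny so that the per-segment iovec handling dominates; comparing a 32-bit and a 64-bit build of the same program would be one way to revisit the compat observation.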
David