Hi Ilpo,
On 11/20/2023 3:13 AM, Ilpo Järvinen wrote:
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results.
Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo.
Using a prime multiplier with modulo ensures the entire buffer is eventually read. 23 is small enough that the reads are spread out but wrapping does not occur very frequently (wrapping too often can trigger L2 hits more frequently which causes noise to the test because getting the data from LLC is not required).
It was discovered that not all primes work equally well and some can cause wildly unstable results (e.g., in an earlier version of this patch, the reads were done in reversed order and 59 was used as the prime resulting in unacceptably high and unstable results in MBA and MBM test on some architectures).
Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B4... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
I am not very comfortable with all the uncertainty involved in this patch. A consolation is that this is surely an improvement.
Reviewed-by: Reinette Chatre reinette.chatre@intel.com
Reinette