Commit 3f235279828c ("x86/cpu: Restore AMD's DE_CFG MSR after resume")
renamed the MSR_F10H_DECFG_LFENCE_SERIALIZE macro to
MSR_AMD64_DE_CFG_LFENCE_SERIALIZE. In init_amd(), however, it replaced
MSR_F10H_DECFG_LFENCE_SERIALIZE with
MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT instead of
MSR_AMD64_DE_CFG_LFENCE_SERIALIZE. The _BIT macro is the bit position
rather than the bit mask, so the LFENCE serialization check in
init_amd() tests the wrong bit.

This causes a ~16% regression in the sysbench memory test when running:

  sysbench --test=memory run

Fixes: 3f235279828c ("x86/cpu: Restore AMD's DE_CFG MSR after resume")
Signed-off-by: Rhythm Mahajan <rhythm.m.mahajan@oracle.com>
---
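For illustration only, a minimal userspace sketch of why the two macros
in the check are not interchangeable. The macro names and values below
are local stand-ins assumed to mirror the kernel's definitions (bit
position 1, mask 1 << 1), and the helper functions are hypothetical and
not part of this patch:

```c
#include <stdint.h>

/* Local stand-ins (assumed values): the _BIT macro is a bit position... */
#define DEMO_LFENCE_SERIALIZE_BIT 1
/* ...while the plain macro is the mask derived from that position. */
#define DEMO_LFENCE_SERIALIZE (1ULL << DEMO_LFENCE_SERIALIZE_BIT)

/* Hypothetical helper: the check as written with the _BIT macro.
 * ANDing with the bit position (1) actually tests bit 0. */
static inline int lfence_check_with_bit_index(uint64_t val)
{
	return (val & DEMO_LFENCE_SERIALIZE_BIT) != 0;
}

/* Hypothetical helper: the check as written with the mask macro.
 * ANDing with (1 << 1) tests bit 1, the intended bit. */
static inline int lfence_check_with_mask(uint64_t val)
{
	return (val & DEMO_LFENCE_SERIALIZE) != 0;
}
```

With a register value of 0x2 (only bit 1 set),
lfence_check_with_bit_index() returns 0 while lfence_check_with_mask()
returns 1, so under these assumptions the first form would miss a set
serialize bit.
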
The test result before the commit 3f235279828c ("x86/cpu: Restore AMD's
DE_CFG MSR after resume"):

$ sysbench --test=memory run
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 27466829 (2746182.07 per second)

26823.08 MiB transferred (2681.82 MiB/sec)

General statistics:
    total time:                          10.0001s
    total number of events:              27466829

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    0.20
         95th percentile:                        0.00
         sum:                                 4041.60

Threads fairness:
    events (avg/stddev):           27466829.0000/0.00
    execution time (avg/stddev):   4.0416/0.00

The test result after the commit 3f235279828c ("x86/cpu: Restore AMD's
DE_CFG MSR after resume"):

$ sysbench --test=memory run
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 33758407 (3375232.84 per second)

32967.19 MiB transferred (3296.13 MiB/sec)

General statistics:
    total time:                          10.0001s
    total number of events:              33758407

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    0.06
         95th percentile:                        0.00
         sum:                                 4115.95

Threads fairness:
    events (avg/stddev):           33758407.0000/0.00
    execution time (avg/stddev):   4.1160/0.00
---
 arch/x86/kernel/cpu/amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index ee5d0f943ec8c..4122afeaaaff5 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -941,7 +941,7 @@ static void init_amd(struct cpuinfo_x86 *c)
 		 * serializing.
 		 */
 		ret = rdmsrl_safe(MSR_AMD64_DE_CFG, &val);
-		if (!ret && (val & MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT)) {
+		if (!ret && (val & MSR_AMD64_DE_CFG_LFENCE_SERIALIZE)) {
 			/* A serializing LFENCE stops RDTSC speculation */
 			set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
 		} else {