On Tue, Jan 16, 2024 at 09:44:01PM +0100, Pavel Machek wrote:
Hi!
From: "Borislav Petkov (AMD)" <bp@alien8.de>
[ Upstream commit 04c3024560d3a14acd18d0a51a1d0a89d29b7eb5 ]
AMD does not have the requirement for a synchronization barrier when accessing a certain group of MSRs. Do not incur that unnecessary penalty there.
...
Performance captured using an unmodified ipi-bench using the 'mesh-ipi' option with and without weak_wrmsr_fence() on a Zen4 system also showed a significant performance improvement without weak_wrmsr_fence(). The 'mesh-ipi' option ignores CCX or CCD and just picks a random vCPU.
Average throughput (10 iterations) with weak_wrmsr_fence(), Cumulative throughput: 4933374 IPI/s
Average throughput (10 iterations) without weak_wrmsr_fence(), Cumulative throughput: 6355156 IPI/s
[1] https://github.com/bytedance/kvm-utils/tree/master/microbenchmark/ipi-bench
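For scale, the two averages quoted above can be turned into a relative gain. This is only a quick arithmetic sketch in Python; the helper name relative_gain is made up for illustration and is not part of ipi-bench or the patch.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Return the improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Cumulative throughput figures taken from the quoted commit message (IPI/s).
with_fence = 4_933_374      # with weak_wrmsr_fence()
without_fence = 6_355_156   # without weak_wrmsr_fence()

gain_pct = relative_gain(with_fence, without_fence)
print(f"{gain_pct:.1f}% more IPI/s without the fence")  # prints 28.8% ...
```

That is roughly 1.4 million additional IPI/s on the reported Zen4 run, which is a throughput gain rather than a correctness fix.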
Speed improvement, not a bugfix. Please drop.
Dropped, thanks!