#regzbot introduced: 129dab6e1286
Hello everyone,
We've identified a performance regression that starts with Linux kernel 6.10 and persists through 6.16 (tested at commit e540341508ce). Bisection points to commit 129dab6e1286 ("iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map").
The issue occurs when running fio against two NVMe devices located under the same PCIe bridge (dual-port NVMe configuration). Performance drops compared to configurations where the devices are on different bridges.
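For anyone who wants to check whether two NVMe controllers on their machine sit under the same PCIe bridge, here is a minimal Python sketch. It assumes the usual sysfs layout, where the resolved path of /sys/class/nvme/<ctrl> contains the chain of PCI addresses from the root port down to the NVMe function. The helper names and the controller names nvme0/nvme1 are placeholders for illustration, not part of the reproducer in [1].

```python
import os
import re

# A PCI address as it appears in sysfs paths: domain:bus:device.function
PCI_ADDR = re.compile(r"^[0-9a-f]{4,}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def pci_chain(sysfs_path):
    """Return the PCI addresses in a resolved sysfs device path, root first."""
    return [part for part in sysfs_path.split("/") if PCI_ADDR.match(part)]


def upstream_bridge(sysfs_path):
    """Return the PCI address directly above the device, or None if absent."""
    chain = pci_chain(sysfs_path)
    # The last entry is the NVMe function itself; the one before it is
    # the bridge (root port or switch port) the device hangs off.
    return chain[-2] if len(chain) >= 2 else None


def same_bridge(path_a, path_b):
    """True if both devices report the same immediate upstream bridge."""
    a, b = upstream_bridge(path_a), upstream_bridge(path_b)
    return a is not None and a == b


if __name__ == "__main__":
    # Hypothetical controller names; adjust to the devices under test.
    for ctrl in ("nvme0", "nvme1"):
        path = os.path.realpath(f"/sys/class/nvme/{ctrl}")
        print(ctrl, "upstream bridge:", upstream_bridge(path))
```

If both controllers print the same upstream bridge address, they match the "same PCIe bridge" case described above.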
Observed performance:
- Before the commit: ~6150 MiB/s, regardless of NVMe device placement.
- After the commit:
  -- Same PCIe bridge: ~4985 MiB/s
  -- Different PCIe bridges: ~6150 MiB/s
Currently we can only reproduce the issue on a Z3 metal instance on GCP. I suspect it is reproducible on any machine with a dual-port NVMe device. A more detailed description of the issue and of the reproducer is available at [1].
Could you please advise on the appropriate path forward to mitigate or address this regression?
Thanks, Jo
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2115738