On Mon, Dec 11, 2023 at 5:05 AM Simon Kaegi <simon.kaegi@gmail.com> wrote:
#regzbot introduced v6.1.62..v6.1.63
#regzbot introduced: baddcc2c71572968cdaeee1c4ab3dc0ad90fa765
We hit this regression when updating our guest VM kernel from 6.1.62 to 6.1.63. Bisecting shows that the problem was introduced in baddcc2c71572968cdaeee1c4ab3dc0ad90fa765 ("virtio/vsock: replace virtio_vsock_pkt with sk_buff"): https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v...
We're getting a timeout when trying to connect to the vsock socket in the guest VM when launching a Kata Containers 3.2.0 agent. We haven't investigated the problem any further at this point.
It looks like the same issue described here: https://github.com/rust-vmm/vm-virtio/issues/204
In summary, that patch also contains a performance improvement: by switching to sk_buffs, we can use a single descriptor for the whole packet (header + payload), whereas before we used two descriptors per packet. Some devices (e.g. rust-vmm's vsock) mistakenly always expect two descriptors, which is a violation of the VIRTIO specification.
Which device are you using?
Can you confirm that your device conforms to the specification?
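To illustrate the descriptor-layout point above, here is a minimal sketch of a device-side parser that does not depend on how the driver splits a packet across descriptors. It is not taken from any real device: the descriptor chain is modelled as a plain list of bytes objects, and split_packet and example_chains are hypothetical names. The 44-byte header format is my reading of struct virtio_vsock_hdr, so please check it against the spec before relying on it.

```python
import struct

# Assumed layout of struct virtio_vsock_hdr (little-endian, packed):
# src_cid, dst_cid (u64); src_port, dst_port, len (u32);
# type, op (u16); flags, buf_alloc, fwd_cnt (u32).
HDR_FMT = "<QQIIIHHIII"
HDR_SIZE = struct.calcsize(HDR_FMT)


def split_packet(chain):
    """Return (header_fields, payload) for one packet.

    `chain` is a hypothetical stand-in for the device-readable buffers
    of a descriptor chain: a list of bytes objects, one per descriptor.
    The parser treats the chain as one byte stream, so it does not care
    whether the header has a descriptor of its own.
    """
    data = b"".join(chain)
    if len(data) < HDR_SIZE:
        raise ValueError("chain too short for a vsock header")
    fields = struct.unpack_from(HDR_FMT, data)
    payload_len = fields[4]  # the header's `len` field
    payload = data[HDR_SIZE:HDR_SIZE + payload_len]
    if len(payload) != payload_len:
        raise ValueError("chain shorter than the length in the header")
    return fields, payload


def example_chains():
    """Build the same packet in the old and the new descriptor layout."""
    payload = b"hello"
    # Field values are made up for illustration only.
    hdr = struct.pack(HDR_FMT, 3, 2, 1234, 1024, len(payload), 1, 5, 0, 65536, 0)
    two_descriptors = [hdr, payload]   # header and payload split (old layout)
    one_descriptor = [hdr + payload]   # header and payload together (new layout)
    return two_descriptors, one_descriptor
```

A device written this way parses both layouts identically, whereas one that reads the header from the first descriptor and the payload from the second finds no payload in the single-descriptor case.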
Stefano
We can reproduce this 100% of the time, but we don't currently have a simple reproducer, as the problem was found in our build service, which uses kata-containers (with cloud-hypervisor).
We have not checked mainline, as we are currently tied to 6.1.x.
-Simon