On Mon, Dec 11, 2023 at 3:20 PM Simon Kaegi <simon.kaegi@gmail.com> wrote:
Thanks Greg, Stefano,
tl;dr: withdrawing the regression report -- rust-vmm vsock mistake
We're not strictly tied to the 6.1.x tree but generally stick with the long term releases because we patch every week or so and want less to change if possible.
I think you're exactly right re: rust-vmm's vsock. We're using cloud hypervisor and just tried updating to a fixed version and everything is working as expected. https://github.com/rust-vmm/vm-virtio/issues/204 (thanks Stefano)
Cool, thanks for confirming!
Stefano
Thanks all... nothing to see.
- Simon
On Mon, Dec 11, 2023 at 3:39 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
On Mon, Dec 11, 2023 at 5:05 AM Simon Kaegi <simon.kaegi@gmail.com> wrote:
#regzbot introduced v6.1.62..v6.1.63
#regzbot introduced: baddcc2c71572968cdaeee1c4ab3dc0ad90fa765
We hit this regression when updating our guest vm kernel from 6.1.62 to 6.1.63 -- bisecting, this problem was introduced in baddcc2c71572968cdaeee1c4ab3dc0ad90fa765 -- virtio/vsock: replace virtio_vsock_pkt with sk_buff -- https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v...
We're getting a timeout when trying to connect to the vsocket in the guest VM when launching a kata containers 3.2.0 agent. We haven't done much more to understand the problem at this point.
It looks like the same issue described here: https://github.com/rust-vmm/vm-virtio/issues/204
In summary, that patch also contains a performance improvement: by switching to sk_buffs, we can use a single descriptor for the whole packet (header + payload), whereas before we used two for each packet. Some devices (e.g. rust-vmm's vsock) mistakenly always expect two descriptors, which is a violation of the VIRTIO specification.
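To illustrate the point about descriptors, here is a minimal Python sketch of a device-side parser that does not care how the driver splits a packet across descriptors. It is not the rust-vmm or kernel code. The function name `parse_tx_chain` and the "list of bytes objects" stand-in for a descriptor chain are assumptions made for the example. Only the 44-byte little-endian layout of `struct virtio_vsock_hdr` is taken from the VIRTIO vsock device definition.

```python
import struct

# Layout of struct virtio_vsock_hdr (little-endian, 44 bytes):
# src_cid, dst_cid, src_port, dst_port, len, type, op, flags, buf_alloc, fwd_cnt
VSOCK_HDR = struct.Struct("<QQIIIHHIII")


def parse_tx_chain(descriptors):
    """Split a driver-to-device chain into (header fields, payload).

    `descriptors` is a hypothetical stand-in for a descriptor chain:
    a list of bytes objects, one per descriptor, in chain order.
    """
    # Framing must not depend on where the descriptor boundaries fall,
    # so treat the chain as one contiguous byte stream.
    data = b"".join(descriptors)
    if len(data) < VSOCK_HDR.size:
        raise ValueError("chain shorter than the vsock header")
    fields = VSOCK_HDR.unpack_from(data, 0)
    payload_len = fields[4]  # the header's `len` field
    payload = data[VSOCK_HDR.size:VSOCK_HDR.size + payload_len]
    if len(payload) != payload_len:
        raise ValueError("chain shorter than header.len")
    return fields, payload


# Illustrative values only: CIDs 3 -> 2, stream type (1), RW operation (5).
hdr = VSOCK_HDR.pack(3, 2, 1234, 1024, 5, 1, 5, 0, 65536, 0)
packet = hdr + b"hello"

one_desc = parse_tx_chain([packet])                     # header + payload together
two_desc = parse_tx_chain([hdr, b"hello"])              # the older two-descriptor layout
odd_split = parse_tx_chain([packet[:10], packet[10:]])  # arbitrary split
```

All three calls yield the same header fields and payload. A device that instead reads the header from descriptor 0 and the payload from descriptor 1 only handles the second layout, which would be consistent with the timeout reported above.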
Which device are you using?
Can you confirm that your device conforms to the specification?
Stefano
We can reproduce 100% of the time but don't currently have a simple reproducer as the problem was found in our build service which uses kata-containers (with cloud-hypervisor).
We have not checked the mainline as we currently are tied to 6.1.x.
-Simon