[TLDR: This mail is primarily relevant for Linux kernel regression tracking. See link in footer if these mails annoy you.]
On 11.12.23 16:23, Stefano Garzarella wrote:
On Mon, Dec 11, 2023 at 3:20 PM Simon Kaegi simon.kaegi@gmail.com wrote:
Thanks Greg, Stefano,
tldr; withdrawing the regression -- rust-vmm vsock mistake
In that case:
#regzbot resolve: reporter withdrew the report
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
Thanks all... nothing to see.
- Simon
On Mon, Dec 11, 2023 at 3:39 AM Stefano Garzarella sgarzare@redhat.com wrote:
On Mon, Dec 11, 2023 at 5:05 AM Simon Kaegi simon.kaegi@gmail.com wrote:
#regzbot introduced v6.1.62..v6.1.63
#regzbot introduced: baddcc2c71572968cdaeee1c4ab3dc0ad90fa765
We hit this regression when updating our guest vm kernel from 6.1.62 to 6.1.63 -- bisecting, this problem was introduced in baddcc2c71572968cdaeee1c4ab3dc0ad90fa765 -- virtio/vsock: replace virtio_vsock_pkt with sk_buff -- https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v...
We're getting a timeout when trying to connect to the vsocket in the guest VM when launching a kata containers 3.2.0 agent. We haven't done much more to understand the problem at this point.
It looks like the same issue described here: https://github.com/rust-vmm/vm-virtio/issues/204
In summary, that patch also contains a performance improvement: by switching to sk_buffs, we can use a single descriptor for the whole packet (header + payload), whereas before we used two for each packet. Some devices (e.g. rust-vmm's vsock) mistakenly always expect 2 descriptors, which is a violation of the VIRTIO specification.
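To illustrate the point about descriptors, here is a minimal sketch of how a device could read a vsock packet without assuming any particular descriptor arrangement. It is not taken from rust-vmm or any real device: the function name `parse_vsock_packet` and the representation of a descriptor chain as a list of byte buffers are hypothetical, and the 44-byte header layout is my reading of `struct virtio_vsock_hdr`, which should be checked against the specification.

```python
import struct

# Assumed layout of struct virtio_vsock_hdr (little-endian, packed):
# src_cid, dst_cid (u64), src_port, dst_port, len (u32),
# type, op (u16), flags, buf_alloc, fwd_cnt (u32) -> 44 bytes.
VSOCK_HDR = struct.Struct("<QQIIIHHIII")
FIELDS = ("src_cid", "dst_cid", "src_port", "dst_port", "len",
          "type", "op", "flags", "buf_alloc", "fwd_cnt")


def parse_vsock_packet(chain):
    """Parse one packet from a descriptor chain.

    `chain` is a hypothetical stand-in for a virtqueue descriptor chain:
    a list of bytes-like buffers, in order. The header and payload may
    share one buffer or be split across several at any offset.
    """
    # Treat the chain as one contiguous byte stream instead of
    # assuming "descriptor 0 = header, descriptor 1 = payload".
    data = b"".join(bytes(buf) for buf in chain)
    if len(data) < VSOCK_HDR.size:
        raise ValueError("chain is shorter than the vsock header")

    hdr = dict(zip(FIELDS, VSOCK_HDR.unpack_from(data)))
    payload = data[VSOCK_HDR.size:VSOCK_HDR.size + hdr["len"]]
    if len(payload) != hdr["len"]:
        raise ValueError("chain is shorter than the length in the header")
    return hdr, payload
```

With this approach a chain of `[header + payload]` (the layout after the sk_buff change) and a chain of `[header, payload]` (the old layout) parse to the same result. A real device would avoid the copy and walk the buffers in place, but the principle is the same.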
Which device are you using?
Can you confirm that your device conforms to the specification?
Stefano
We can reproduce 100% of the time but don't currently have a simple reproducer as the problem was found in our build service which uses kata-containers (with cloud-hypervisor).
We have not checked the mainline as we currently are tied to 6.1.x.
-Simon