On Wed, Apr 09, 2025 at 01:12:19PM +0200, David Hildenbrand wrote:
On 09.04.25 12:56, Michael S. Tsirkin wrote:
On Wed, Apr 09, 2025 at 12:46:41PM +0200, David Hildenbrand wrote:
On 07.04.25 23:20, Michael S. Tsirkin wrote:
On Mon, Apr 07, 2025 at 08:47:05PM +0200, David Hildenbrand wrote:
In my opinion, it makes the most sense to keep the spec as it is and change QEMU and the kernel to match, but obviously that's not trivial to do in a way that doesn't break existing devices and drivers.
If only it would be limited to QEMU and Linux ... :)
Out of curiosity, assuming we'd make the spec match the current QEMU/Linux implementation at least for the 3 involved features only, would there be a way to adjust crosvm without any disruption?
I still have the feeling that it will be rather hard to get all implementations to match the spec ... For new features+queues it will be easy to force the usage of fixed virtqueue numbers, but for free-page-hinting and reporting, it's a mess :(
Still thinking about a way to fix drivers... We can discuss this theoretically, maybe?
Yes, absolutely. I took the time to do some more digging; regarding drivers only Linux seems to be problematic.
virtio-win, FreeBSD, NetBSD and OpenBSD don't seem to support the problematic features (free page hinting, free page reporting) in their virtio-balloon implementations.
So from the known drivers, only Linux is applicable.
reporting_vq is either at idx 4/3/2
free_page_vq is either at idx 3/2
statsq is at idx 2 (only relevant if the feature is offered)
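The index ranges above follow from how the two layouts number the queues. A minimal sketch in C, assuming the feature bit numbers from the Linux uapi header virtio_balloon.h; the enum, the helper names and the "fixed" flag are hypothetical and only illustrate the arithmetic, they are not taken from the driver or from QEMU:

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as in the Linux uapi header virtio_balloon.h. */
#define VIRTIO_BALLOON_F_STATS_VQ       1
#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3
#define VIRTIO_BALLOON_F_REPORTING      5

/* Hypothetical names; the enum values double as the fixed (spec) indices. */
enum balloon_vq {
    BALLOON_VQ_INFLATE,
    BALLOON_VQ_DEFLATE,
    BALLOON_VQ_STATS,
    BALLOON_VQ_FREE_PAGE,
    BALLOON_VQ_REPORTING,
    BALLOON_VQ_MAX
};

/* Does this queue exist for the given feature set? */
static bool balloon_vq_present(uint64_t features, enum balloon_vq vq)
{
    switch (vq) {
    case BALLOON_VQ_INFLATE:
    case BALLOON_VQ_DEFLATE:
        return true;
    case BALLOON_VQ_STATS:
        return (features >> VIRTIO_BALLOON_F_STATS_VQ) & 1;
    case BALLOON_VQ_FREE_PAGE:
        return (features >> VIRTIO_BALLOON_F_FREE_PAGE_HINT) & 1;
    case BALLOON_VQ_REPORTING:
        return (features >> VIRTIO_BALLOON_F_REPORTING) & 1;
    default:
        return false;
    }
}

/*
 * Returns the queue index, or -1 if the queue does not exist.
 * fixed == true:  every queue keeps its own slot, gaps are allowed.
 * fixed == false: "compressed" layout, absent queues leave no gap.
 */
static int balloon_vq_index(uint64_t features, enum balloon_vq vq, bool fixed)
{
    int idx = 0;

    if (!balloon_vq_present(features, vq))
        return -1;
    if (fixed)
        return (int)vq;
    /* Count the queues that exist in front of this one. */
    for (int q = 0; q < (int)vq; q++)
        if (balloon_vq_present(features, (enum balloon_vq)q))
            idx++;
    return idx;
}
```

With only the reporting feature set, the compressed layout puts reporting_vq at idx 2 while the fixed layout keeps it at idx 4; the two layouts agree whenever all three features are present.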
So if we could test for the existence of a virtqueue at an idx easily, we could test from highest-to-smallest idx.
But I recall that testing for the existence of a virtqueue on s390x resulted in the problem/deadlock in the first place ...
-- Cheers,
David / dhildenb
So let's talk about a new feature bit?
Are you thinking about a new feature that switches between "fixed queue indices" and "compressed queue indices", whereby the latter would be the legacy default and we would expect all devices to switch to the new fixed-queue-indices layout?
We could make all new features require "fixed-queue-indices".
I see two ways:
1. we make driver behave correctly with in spec and out of spec devices
   and we make qemu behave correctly with in spec and out of spec devices
2. a new feature bit
I prefer 1, and when we add a new feature we can also document that it should be in spec if negotiated.
My question is if 1 is practical.
Since vqs are probed after feature negotiation, it looks like we could have a feature bit trigger sane behaviour, right?
In the Linux driver, yes. In QEMU (devices), we add the queues when realizing, so we'd need some mechanism to adjust the queue indices based on feature negotiation I guess?
Well we can add queues later, nothing prevents that.
For virtio-balloon it might be doable to simply always create+indicate free-page hinting to resolve the issue easily.
OK, so - for devices, we suggest that basically VIRTIO_BALLOON_F_REPORTING is only created together with VIRTIO_BALLOON_F_FREE_PAGE_HINT, and VIRTIO_BALLOON_F_FREE_PAGE_HINT is only created together with VIRTIO_BALLOON_F_STATS_VQ.
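The device-side rule can be sketched as a small helper that closes the offered feature set under these two dependencies, so that no queue is ever preceded by a gap. This is only an illustration: the function name and the BALLOON_F_* macros are hypothetical, and whether a device should silently add the missing features or reject the configuration instead is left open here.

```c
#include <stdint.h>

/* Bit masks derived from the Linux uapi bit numbers (1, 3, 5). */
#define BALLOON_F_STATS_VQ       (1ULL << 1)
#define BALLOON_F_FREE_PAGE_HINT (1ULL << 3)
#define BALLOON_F_REPORTING      (1ULL << 5)

/*
 * Hypothetical helper: make sure REPORTING is only offered together with
 * FREE_PAGE_HINT, and FREE_PAGE_HINT only together with STATS_VQ.
 * Other feature bits pass through unchanged.
 */
static uint64_t balloon_close_feature_deps(uint64_t offered)
{
    if (offered & BALLOON_F_REPORTING)
        offered |= BALLOON_F_FREE_PAGE_HINT;
    if (offered & BALLOON_F_FREE_PAGE_HINT)
        offered |= BALLOON_F_STATS_VQ;
    return offered;
}
```

With the dependencies closed this way, the compressed and the fixed numbering produce the same indices for every queue.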
I got that.
Now, for drivers.
If the dependency is satisfied as above, no difference.
What should drivers do if not?
I think the thing to do would be to first probe spec compliant vq numbers? If not there, try with the non compliant version?
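One way such a probe could look, as a rough sketch only: check whether a queue exists at the highest spec-compliant index and fall back to the compressed numbering if it does not. It assumes that the transport can report a size of 0 for a queue that does not exist and that doing so is safe, which is exactly the open question for s390x. The vq_num_max_fn hook, struct fake_dev and all other names are hypothetical stand-ins, not driver or QEMU APIs.

```c
#include <stdint.h>

#define BALLOON_F_STATS_VQ       (1ULL << 1)
#define BALLOON_F_FREE_PAGE_HINT (1ULL << 3)
#define BALLOON_F_REPORTING      (1ULL << 5)

enum vq_layout { LAYOUT_FIXED, LAYOUT_COMPRESSED };

/* Hypothetical transport hook: maximum size of queue idx, 0 if absent. */
typedef uint16_t (*vq_num_max_fn)(const void *dev, unsigned idx);

/* Toy device used only to exercise the sketch. */
struct fake_dev { uint16_t num_max[8]; };

static uint16_t fake_num_max(const void *dev, unsigned idx)
{
    const struct fake_dev *d = dev;
    return idx < 8 ? d->num_max[idx] : 0;
}

/*
 * Assumes the set of queues follows the feature set passed in.
 * Inflate (0) and deflate (1) always exist.
 */
static enum vq_layout balloon_detect_layout(const void *dev,
                                            vq_num_max_fn num_max,
                                            uint64_t features)
{
    unsigned highest_fixed = 1;
    unsigned nqueues = 2;

    if (features & BALLOON_F_STATS_VQ)       { highest_fixed = 2; nqueues++; }
    if (features & BALLOON_F_FREE_PAGE_HINT) { highest_fixed = 3; nqueues++; }
    if (features & BALLOON_F_REPORTING)      { highest_fixed = 4; nqueues++; }

    /* No gaps: both layouts number the queues identically. */
    if (highest_fixed == nqueues - 1)
        return LAYOUT_FIXED;

    /*
     * With gaps, a compressed device only has indices 0..nqueues-1,
     * all below highest_fixed, so a queue there implies the fixed layout.
     */
    return num_max(dev, highest_fixed) != 0 ? LAYOUT_FIXED : LAYOUT_COMPRESSED;
}
```

Probing only the single highest index avoids the ambiguity of probing each queue separately: on a compressed device, a lower spec-compliant index can be occupied by a different queue.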
However, you wrote:
But I recall that testing for the existence of a virtqueue on s390x resulted in the problem/deadlock in the first place ...
I think the deadlock was if trying to *use* a non-existent virtqueue?
This is qemu code:
    case CCW_CMD_READ_VQ_CONF:
        if (check_len) {
            if (ccw.count != sizeof(vq_config)) {
                ret = -EINVAL;
                break;
            }
        } else if (ccw.count < sizeof(vq_config)) {
            /* Can't execute command. */
            ret = -EINVAL;
            break;
        }
        if (!ccw.cda) {
            ret = -EFAULT;
        } else {
            ret = ccw_dstream_read(&sch->cds, vq_config.index);
            if (ret) {
                break;
            }
            vq_config.index = be16_to_cpu(vq_config.index);
            if (vq_config.index >= VIRTIO_QUEUE_MAX) {
                ret = -EINVAL;
                break;
            }
            vq_config.num_max = virtio_queue_get_num(vdev,
                                                     vq_config.index);
            vq_config.num_max = cpu_to_be16(vq_config.num_max);
            ret = ccw_dstream_write(&sch->cds, vq_config.num_max);
            if (!ret) {
                sch->curr_status.scsw.count = ccw.count - sizeof(vq_config);
            }
        }
and
    int virtio_queue_get_num(VirtIODevice *vdev, int n)
    {
        return vdev->vq[n].vring.num;
    }
it seems to happily return vq size with no issues?
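To make the point concrete, here is a toy model of that lookup, not QEMU code: struct toy_vdev and toy_read_vq_conf are made up for illustration, and the value 1024 for VIRTIO_QUEUE_MAX is assumed to match QEMU. It shows that a queue which was never added simply reads back a size of 0, and only an out-of-range index is rejected.

```c
#include <errno.h>
#include <stdint.h>

#define VIRTIO_QUEUE_MAX 1024 /* assumed to match QEMU's definition */

/* Toy stand-in for the per-queue vring.num values of a device. */
struct toy_vdev {
    uint16_t vring_num[VIRTIO_QUEUE_MAX];
};

/*
 * Mirrors the shape of the quoted READ_VQ_CONF handling: bounds-check the
 * index, then report whatever size is stored for that queue.
 */
static int toy_read_vq_conf(const struct toy_vdev *vdev, uint16_t index,
                            uint16_t *num_max)
{
    if (index >= VIRTIO_QUEUE_MAX)
        return -EINVAL;
    *num_max = vdev->vring_num[index]; /* 0 if the queue was never added */
    return 0;
}
```

In this model a driver could treat num_max == 0 as "no queue at this index", under the assumption that the real transport behaves the same way.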
For virtio-fs it might not be that easy.
virtio fs? But it has no features?
I kind of dislike it that we have a feature bit for bugs though. What would be a minimal new feature to add so it does not feel wrong?
Probably as above: fixed vs. compressed virtqueue indices?
-- Cheers,
David / dhildenb