Heh, but that one said:
+\item[ VIRTIO_BALLOON_F_WS_REPORTING(6) ] The device has support for Working Set
Which does not seem to reflect reality ...
Please feel free to disregard these features and reuse their bits and queue indexes; as far as I know, they are not actually enabled anywhere currently and the corresponding guest patches were only applied to some (no-longer-used) ChromeOS kernel trees, so the compatibility impact should be minimal. I will also try to clean up the leftover bits on the crosvm side just to clear things up.
Thanks for your reply, and thanks for clarifying+cleaning it up.
I dug a bit more into crosvm, because that one seems to be the only one out there that does not behave like everybody else I found (maybe good, maybe bad :) ).
- There was temporarily even another feature (VIRTIO_BALLOON_F_EVENTS_VQ)
and another queue.
It got removed from crosvm in:
commit 9ba634b82b55ba762dc8724676b2cf9419460145
Author: Daniel Verkamp <dverkamp@chromium.org>
Date:   Thu Jul 11 11:29:52 2024 -0700

    devices: virtio-balloon: remove event queue support

    VIRTIO_BALLOON_F_EVENTS_VQ was part of a proposed virtio spec change.
    It is not currently supported by upstream Linux, so removing this
    should have no effect except for guest kernels that had CHROMIUM
    patches applied.

    The virtqueue indexes for the ws-related queues are decremented to
    fill the hole left by the removal of the event VQ; these are
    non-standard as well, so they do not have virtqueue indexes assigned
    in the virtio spec, but the proposed spec extension did actually use
    vq indexes 5 and 6.

    BUG=b:214864326
- crosvm is aware of the upstream Linux driver
They thought your fix would go upstream; it didn't.
commit a2fa119e759d0238a42ff15a9aff0dfd122afebd
Author: Daniel Verkamp <dverkamp@chromium.org>
Date:   Wed Jul 10 16:16:28 2024 -0700

    devices: virtio-balloon: warn about queue index mismatches

    The Linux kernel virtio-balloon driver spec non-compliance related to
    queue numbering is being fixed; add some diagnostics to our device
    that help to check if everything is working as expected.

    <https://lore.kernel.org/virtualization/CACGkMEsg0+vpav1Fo8JF1isq4Ef8t4_CFN1scyztDO8bXzRLBQ@mail.gmail.com/T/>

    Additionally, replace the num_expected_queues() function with
    per-queue checking to avoid the need for the duplicate feature checks
    and queue count calculation; each pop_queue() call will be checked
    using the `?` operator and return a more useful error message if a
    particular queue is missing.

    BUG=None
    TEST=crosvm run --balloon-page-reporting ...
IIRC, in that commit they switched to the "spec" behavior.
That's when they started hard-coding the queue indexes.
CCing Daniel. All Linux versions should be incompatible with crosvm regarding free page reporting. How is that handled?
In practice, it only works because nobody calls crosvm with --balloon-page-reporting (it's off by default), so the balloon device does not advertise the VIRTIO_BALLOON_F_PAGE_REPORTING feature.
(I just went searching now, and it does seem like there is actually one user in Android that does try to enable page reporting[1], which I'll have to look into...)
In my opinion, it makes the most sense to keep the spec as it is and change QEMU and the kernel to match, but obviously that's not trivial to do in a way that doesn't break existing devices and drivers.
If only it were limited to QEMU and Linux ... :)
Out of curiosity, assuming we'd make the spec match the current QEMU/Linux implementation for just the three features involved, would there be a way to adjust crosvm without any disruption?
I still have the feeling that it will be rather hard to get all implementations to match the spec ... For new features+queues it will be easy to force the usage of fixed virtqueue numbers, but for free-page-hinting and reporting, it's a mess :(
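
To make the mismatch concrete, here is a rough sketch in C of the two numbering schemes for the reporting queue. It is only an illustration, not code from Linux, QEMU, or crosvm: the helper names (has_feature, reporting_vq_index_fixed, reporting_vq_index_compact) are made up, and the "compact" variant assumes, as discussed in this thread, that existing implementations only assign indexes to queues whose features were actually negotiated. The feature bit numbers and the fixed index 4 for the reporting queue are the ones from the virtio spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as listed in the virtio-balloon spec section. */
#define VIRTIO_BALLOON_F_STATS_VQ        1
#define VIRTIO_BALLOON_F_FREE_PAGE_HINT  3
#define VIRTIO_BALLOON_F_PAGE_REPORTING  5

/* Hypothetical helper: test a single negotiated feature bit. */
static bool has_feature(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1ULL;
}

/*
 * "Spec" numbering: the reporting queue always sits at index 4
 * (inflateq = 0, deflateq = 1, statsq = 2, free_page_vq = 3),
 * no matter which of the optional queues exist.
 * Returns -1 if page reporting was not negotiated.
 */
static int reporting_vq_index_fixed(uint64_t features)
{
	if (!has_feature(features, VIRTIO_BALLOON_F_PAGE_REPORTING))
		return -1;
	return 4;
}

/*
 * "Compact" numbering (assumed model of the existing behavior):
 * optional queues are numbered consecutively after inflateq and
 * deflateq, skipping queues whose feature was not negotiated.
 * Returns -1 if page reporting was not negotiated.
 */
static int reporting_vq_index_compact(uint64_t features)
{
	int index = 2; /* first index after inflateq (0) and deflateq (1) */

	if (!has_feature(features, VIRTIO_BALLOON_F_PAGE_REPORTING))
		return -1;
	if (has_feature(features, VIRTIO_BALLOON_F_STATS_VQ))
		index++;
	if (has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
		index++;
	return index;
}
```

Under this model the two schemes only agree when both the stats queue and free page hinting are negotiated along with reporting; in every other combination a device using fixed indexes and a driver using compact ones would look for the reporting queue in different places.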