Happy new year everyone,
I have a few questions regarding the design of the memory-sharing-over-Vsock
work and am looking for some feedback before I spend time making the
changes.
The initial design (from Nov):
- Initially I implemented an FFA-based DMA heap (no DMA ops used), which would
allocate memory and make direct FFA calls to send the memory to the other
endpoint.
- Userspace would open this heap, allocate a dma-buf from it and pass its FD
over Vsock.
- The Vsock layer would then call ->shmem() (a new callback I added to
dma-buf/heap, which sends the memory over FFA and returns the FFA bus address)
to get the metadata to be shared over Vsock; a rough sketch follows below.
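For reference, the shape of that callback was roughly the following (a
minimal sketch with illustrative names, not the exact code): the exporter
shares the buffer over FFA and fills in the metadata that the Vsock layer
then forwards to the peer.

    #include <linux/dma-buf.h>

    /* Metadata handed back to the Vsock layer once the buffer is shared. */
    struct ffa_shmem_data {
            u64 ffa_addr;   /* FFA "bus address" returned by the mem-share call */
            u64 size;
    };

    /* Hypothetical hook on the dma-buf/heap side, invoked by the Vsock layer. */
    struct ffa_shmem_ops {
            int (*shmem)(struct dma_buf *dmabuf, struct ffa_shmem_data *data);
    };
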
The current design:
- In one of the calls Bertrand suggested not creating parallel paths for
sending memory over FFA; instead we should somehow reuse the existing dma-ops
for FFA (used for the virtqueues and reserved-mem).
- I created a platform device (as a child of the FFA device) and assigned a new
set of DMA ops to it (the only difference from the reserved-mem ops is that we
don't do swiotlb here and allocate fresh memory instead of using the reserved
region); a rough sketch of this setup follows the list below.
- This pdev is used by the DMA heap to allocate memory using
dma_alloc_coherent() and that made sure everything got mapped correctly.
- I still need a dma-buf helper to get the metadata to send over Vsock. The
existing helper was renamed as s/shmem/shmem_data.
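For completeness, here is roughly what the current setup boils down to (a
minimal sketch; the ffa_* names are illustrative and error handling is
dropped):

    #include <linux/dma-map-ops.h>
    #include <linux/dma-mapping.h>
    #include <linux/platform_device.h>

    /* Same as the reserved-mem ops minus swiotlb; allocations come from
     * regular memory instead of the reserved region (contents elided). */
    static const struct dma_map_ops ffa_dma_ops = {
    };

    static struct platform_device *ffa_heap_pdev;

    /* Register a child platform device under the FFA device and give it the
     * FFA dma-ops, so allocations on it get mapped via the FFA path. */
    static int ffa_heap_setup(struct device *ffa_dev)
    {
            ffa_heap_pdev = platform_device_register_data(ffa_dev, "ffa-dma-heap",
                                                          PLATFORM_DEVID_NONE,
                                                          NULL, 0);
            if (IS_ERR(ffa_heap_pdev))
                    return PTR_ERR(ffa_heap_pdev);

            set_dma_ops(&ffa_heap_pdev->dev, &ffa_dma_ops);
            return 0;
    }

    /* The heap then allocates through the normal DMA API. */
    static void *ffa_heap_alloc(size_t size, dma_addr_t *dma_addr)
    {
            return dma_alloc_coherent(&ffa_heap_pdev->dev, size, dma_addr,
                                      GFP_KERNEL);
    }
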
The future design (that I have questions about):
- The FFA-specific DMA heap I now have doesn't do anything special compared to
the system heap; it is mostly exactly the same.
- That made me realize that I probably shouldn't add a new heap (Google can add
one later if they really want) and the solution should work with any heap /
dma-buf.
- So, userspace should allocate from the system heap, get a dma-buf from it and
send its FD.
- The vsock layer should then attach this dma-buf to a `struct device` somehow
and call map_dma_buf() for it. This requires the dma-ops of the device to be
set to the FFA-based dma-ops, and then it should just work.
- The tricky point is finding that device struct (as Vsock can't get it from
the dma-buf or from userspace).
- One way, I think (still needs exploring but should be possible) is to use the
struct device of the virtio-msg device over which Vsock is implemented. We can
set the dma-ops of the virtio-msg device accordingly.
- The system heap doesn't guarantee a contiguous allocation though (which my
FFA heap did), so we will need to send a scatter-gather list over Vsock instead
of just one address and size (which is what I am doing right now); see the
sketch after this list.
- Does this make sense? Or is there a use-case that this won't solve?
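To make the last few points more concrete, here is a minimal sketch of the
Vsock-side flow I have in mind (illustrative only, error/cleanup paths
dropped; vmsg_dev stands for the virtio-msg device whose dma-ops would be set
to the FFA ones, which is the part that still needs exploring):

    #include <linux/dma-buf.h>
    #include <linux/scatterlist.h>

    static int vsock_share_dmabuf(struct device *vmsg_dev, int fd)
    {
            struct dma_buf *dmabuf = dma_buf_get(fd);
            struct dma_buf_attachment *attach;
            struct sg_table *sgt;
            struct scatterlist *sg;
            int i;

            /* Attach to the device carrying the FFA dma-ops and map the buffer. */
            attach = dma_buf_attach(dmabuf, vmsg_dev);
            sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);

            /*
             * With the system heap the buffer may not be contiguous, so each
             * (address, length) pair has to be sent over Vsock instead of a
             * single address + size.
             */
            for_each_sgtable_dma_sg(sgt, sg, i) {
                    /* send sg_dma_address(sg) / sg_dma_len(sg) to the peer */
            }

            /* (unmap/detach/put on teardown omitted) */
            return 0;
    }
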
--
viresh
Hello,
I have some questions / feedback on the Xen side of virtio-msg.
AFAIU virtio-msg defines a protocol to deal with discovery (optional) and
configuration of the PV devices, but it leaves undefined what a "memory
address" actually is.
In the Xen memory model with grants, each guest has its own memory space.
The frontend shares memory pages with the backend through granted pages
identified by grant references.
So in a design based on grants, virtio addresses can't (or at least
shouldn't) be guest physical addresses; they need to be something derived
from grants. An earlier design [1] forged an address from the grant
reference, but I feel that's not great: it forces "map+unmap" cycles for
temporary buffers, which is the same performance problem that Xen PV
drivers have without persistent grants. It would also make the address
space very fragmented and often limited to 4KB buffers.
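(Just to illustrate what I mean by "forging" an address; this is not
necessarily the exact encoding used in [1]:)

    /* Purely illustrative: the grant reference packed above a 4KB page
     * offset. Each such address can only cover a single page, hence the
     * fragmentation and 4KB limitation mentioned above. */
    #define GRANT_FORGED_ADDR(gref, offset) \
            (((uint64_t)(gref) << 12) | ((offset) & 0xfff))
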
One idea, already used by Xen displif (PV Display), is to have a
"gref directory" and describe the address space with it. The gref
directory is roughly an array of grant references (shared page infos)
that describes an address space starting at 0, where each page is
defined by a grant reference from the directory. That way, the backend can
freely keep all or part of the address space persistently mapped (or
even map it all at once), and the address space is also contiguous,
which would help with >4KB buffers.
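Roughly something along these lines (a sketch with illustrative field names;
the real layout used by displif lives in the Xen public io/displif.h header):

    #include <xen/interface/grant_table.h>  /* grant_ref_t */
    #include <xen/page.h>                   /* XEN_PAGE_SHIFT / XEN_PAGE_MASK */

    /* A chain of shared pages, each holding an array of grant references
     * that back a flat, contiguous address space starting at 0. */
    struct gref_directory {
            grant_ref_t gref_dir_next_page; /* next directory page, 0 if last */
            grant_ref_t gref[];             /* grants backing pages 0, 1, 2, ... */
    };

    /* Backend side: an address in that flat space translates back to a grant
     * reference plus an in-page offset (walking chained directory pages is
     * elided), so the backend may keep any part of it persistently mapped. */
    static grant_ref_t addr_to_gref(struct gref_directory *dir, uint64_t addr,
                                    unsigned int *offset)
    {
            unsigned long idx = addr >> XEN_PAGE_SHIFT;

            *offset = addr & ~XEN_PAGE_MASK;
            return dir->gref[idx];
    }
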
Any thoughts?
[1]
https://static.sched.com/hosted_files/xen2021/bf/Thursday_2021-Xen-Summit-v…
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech