On Tue, Jun 13, 2023 at 08:45:31AM +0200, Juergen Gross wrote:
On 12.06.23 22:09, Demi Marie Obenour wrote:
On Mon, Jun 12, 2023 at 08:27:59AM +0200, Juergen Gross wrote:
On 10.06.23 17:32, Demi Marie Obenour wrote:
When a grant entry is still in use by the remote domain, Linux must put it on a deferred list.
This lacks quite some context.
The main problem is related to the grant not having been unmapped after the end of a request, but the side granting the access is assuming this should be the case.
The GUI agent has relied on deferred grant reclaim for as long as it has existed. One could argue that doing so means that the agent is misusing gntalloc, but this is not documented anywhere. A better fix would be to use IOCTL_GNTDEV_SET_UNMAP_NOTIFY in the GUI daemon.
I don't think this is a gntalloc specific problem. You could create the same problem with a kernel implementation.
And yes, using IOCTL_GNTDEV_SET_UNMAP_NOTIFY might be a good idea.
Just a comment to this one: while in practice it might work most of the time, in theory there could be hundreds or even thousands of mappings at the same time and event channels seems to be too scarce resource to get unmap notification for all of those (even if you'd use an event channel per grant area, not individual page). As said to Demi elsewhere, our GUI protocol might have a way to signal when unmapping happened, but the backend (the side mapping the grants) would need to know when they actually were unmapped by another process (in some configurations, like Wayland mentioned below, the process itself doesn't give us this info). Ideally, for this we'd need something similar to IOCTL_GNTDEV_SET_UNMAP_NOTIFY that notifies the mapping side, not the other domain. I have no idea how such API should look (just poll()/read() on the gntdev FD used for that IOCTL?) or whether it's really worth adding such API. Or maybe something like this already exists? In worst case, even polling for this info (so - a way to check if process P has grant G from domain D still mapped) would do, although that wouldn't be ideal.
In general this means that the two sides implementing the protocol don't agree how it should work, or that the protocol itself has a flaw.
What would a better solution be? This is going to be particularly tricky with Wayland, as the wl_shm protocol makes absolutely no guarantees that compositors will promptly release the mapping and provides no way whatsoever for Wayland clients to know when this has happened. Relying on an LD_PRELOAD hack is not sustainable.
Anyway, besides trying to fix our usage of grant tables, I think it's still worthwhile to improve deferred reclaim, either by making the limit per iteration configurable (sysfs node / writable module parameter is fine) or by making the limit not needed by using workqueue. The former sounds like a smaller code change.