On Sun, Sep 12, 2021 at 07:53:07PM +0300, Oded Gabbay wrote:
Hi, Re-sending this patch-set following the release of our user-space TPC compiler and runtime library.
I would appreciate a review on this.
I think the big open issue we have is the entire revoke discussion. Having the option of dma-bufs hanging around that map to random local memory ranges, without a clear ownership link and a way to kill them, sounds bad to me.
I think there are a few options:
- We require revoke support. But I've heard RDMA really doesn't like that, I guess because taking out an MR while holding dma_resv_lock would be a lock inversion, so it can't be done. Jason, can you recap what exactly the hold-up was that makes this a no-go?
- The other option I discussed is a bit more the exclusive device ownership model we've had for GPUs in drm of the really old kind. Roughly this would work like this, in terms of drm_device (a rough code sketch follows after this list):
  - Only the current owner (drm_master in current drm code, but we should probably rename that to drm_owner) is allowed to use the accel driver, so all ioctls would fail if you're not drm_master.
  - On dropmaster/file close we'd revoke as much as possible, e.g. in-flight commands, mmaps, anything really that can be revoked.
  - For non-revokable things like these dma-bufs we'd keep a drm_master reference around. This would prevent the next open from acquiring ownership rights, which at least prevents all the nasty potential problems.
  - The admin (or, well, the container orchestrator) then has the responsibility to shoot down processes until the problem goes away (i.e. until you hit the one with the RDMA MR which keeps the dma-buf alive).
- I'm not sure there's another reasonable way to do this without inviting problems once we get outside of the "single kernel instance per tenant" use-case.
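A minimal sketch of what that ownership check could look like, just to make the idea concrete (accel_device, accel_acquire_ownership and accel_ioctl are made-up names for illustration, not existing drm or habanalabs code):

#include <linux/fs.h>
#include <linux/mutex.h>

/*
 * Sketch only: exclusive-ownership model, all names invented for
 * illustration.
 */
struct accel_device {
	struct mutex owner_lock;
	struct file *owner;		/* current owner, NULL if unowned */
	unsigned int export_holds;	/* live non-revokable dma-buf exports */
};

/* Called on open/setmaster: refuse ownership while old exports linger. */
static int accel_acquire_ownership(struct accel_device *adev, struct file *f)
{
	int ret = 0;

	mutex_lock(&adev->owner_lock);
	if (adev->owner || adev->export_holds)
		ret = -EBUSY;
	else
		adev->owner = f;
	mutex_unlock(&adev->owner_lock);
	return ret;
}

/* Every ioctl is rejected unless the caller is the current owner. */
static long accel_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
{
	struct accel_device *adev = f->private_data; /* simplified */
	long ret = -EACCES;

	mutex_lock(&adev->owner_lock);
	if (adev->owner == f)
		ret = 0;	/* ... dispatch to the real handlers ... */
	mutex_unlock(&adev->owner_lock);
	return ret;
}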
Wrt implementation, there's the trouble that this reinvents a bunch of drm stuff and concepts, but that's maybe for after we've figured out the semantics.
Also, it would be great if you have a pull request for the userspace runtime that shows a bit of how this all gets used and tied together. Or maybe some pointers, since I guess retconning a PR on github is maybe a bit much.
Cheers, Daniel
Thanks, Oded
Oded Gabbay (1): habanalabs: define uAPI to export FD for DMA-BUF
Tomer Tayar (1): habanalabs: add support for dma-buf exporter
 drivers/misc/habanalabs/Kconfig             |   1 +
 drivers/misc/habanalabs/common/habanalabs.h |  22 +
 drivers/misc/habanalabs/common/memory.c     | 522 +++++++++++++++++++-
 drivers/misc/habanalabs/gaudi/gaudi.c       |   1 +
 drivers/misc/habanalabs/goya/goya.c         |   1 +
 include/uapi/misc/habanalabs.h              |  28 +-
 6 files changed, 570 insertions(+), 5 deletions(-)
--
2.17.1
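For readers who haven't looked at the patches, the exporter side of the dma-buf API that the second patch builds on looks roughly like the sketch below; dma_buf_export() and dma_buf_fd() are the real kernel entry points, while everything named example_* is illustrative and not the code in this series:

#include <linux/dma-buf.h>
#include <linux/err.h>

/* Stub ops for the sketch; a real exporter builds an sg_table describing
 * the device-local memory it wants to share.
 */
static struct sg_table *example_map_dma_buf(struct dma_buf_attachment *attach,
					    enum dma_data_direction dir)
{
	return ERR_PTR(-EOPNOTSUPP);	/* stub for the sketch */
}

static void example_unmap_dma_buf(struct dma_buf_attachment *attach,
				  struct sg_table *sgt,
				  enum dma_data_direction dir)
{
}

static void example_release(struct dma_buf *dmabuf)
{
	/* drop the driver's reference on the backing device memory */
}

static const struct dma_buf_ops example_dmabuf_ops = {
	.map_dma_buf	= example_map_dma_buf,
	.unmap_dma_buf	= example_unmap_dma_buf,
	.release	= example_release,
};

/* Wrap a chunk of device memory in a dma-buf and hand userspace an fd. */
static int example_export_dmabuf_fd(void *drv_priv, size_t size, int o_flags)
{
	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
	struct dma_buf *dmabuf;
	int fd;

	exp_info.ops = &example_dmabuf_ops;
	exp_info.size = size;
	exp_info.flags = o_flags;	/* e.g. O_RDWR | O_CLOEXEC */
	exp_info.priv = drv_priv;

	dmabuf = dma_buf_export(&exp_info);
	if (IS_ERR(dmabuf))
		return PTR_ERR(dmabuf);

	fd = dma_buf_fd(dmabuf, o_flags);
	if (fd < 0)
		dma_buf_put(dmabuf);
	return fd;
}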
On Tue, Sep 14, 2021 at 04:18:31PM +0200, Daniel Vetter wrote:
On Sun, Sep 12, 2021 at 07:53:07PM +0300, Oded Gabbay wrote:
Hi, Re-sending this patch-set following the release of our user-space TPC compiler and runtime library.
I would appreciate a review on this.
I think the big open issue we have is the entire revoke discussion. Having the option of dma-bufs hanging around that map to random local memory ranges, without a clear ownership link and a way to kill them, sounds bad to me.
I think there are a few options:
- We require revoke support. But I've heard RDMA really doesn't like that, I guess because taking out an MR while holding dma_resv_lock would be a lock inversion, so it can't be done. Jason, can you recap what exactly the hold-up was that makes this a no-go?
RDMA HW can't do revoke.
So to enable a revoke operation we would have to exclude almost all the HW and several interesting use cases.
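To make that constraint concrete: a revoke would essentially have to go through the dynamic-importer path sketched below, where move_notify() is called with the exporter's dma_resv lock held and the importer has to stop using the mapping on the spot. dma_buf_dynamic_attach() and struct dma_buf_attach_ops are the real interfaces; the rdma_example_* names are made up for illustration:

#include <linux/dma-buf.h>
#include <linux/err.h>

/* Made-up importer context, for illustration only. */
struct rdma_example_importer {
	struct dma_buf_attachment *attach;
};

static void rdma_example_move_notify(struct dma_buf_attachment *attach)
{
	/*
	 * Called by the exporter with the dma-buf's dma_resv lock held.
	 * A dynamic importer must stop using the old mapping here and
	 * re-map later. An RDMA MR is programmed into the NIC and handed
	 * out to remote peers, so there is no way to "pause" it from this
	 * callback -- which is why revoke doesn't work for this HW.
	 */
}

static const struct dma_buf_attach_ops rdma_example_attach_ops = {
	.allow_peer2peer = true,
	.move_notify = rdma_example_move_notify,
};

static int rdma_example_attach(struct rdma_example_importer *imp,
			       struct dma_buf *dmabuf, struct device *dev)
{
	imp->attach = dma_buf_dynamic_attach(dmabuf, dev,
					     &rdma_example_attach_ops, imp);
	return PTR_ERR_OR_ZERO(imp->attach);
}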
- For non-revokable things like these dma-bufs we'd keep a drm_master reference around. This would prevent the next open from acquiring ownership rights, which at least prevents all the nasty potential problems.
This is what I would generally expect: the DMABUF FD and its DMA memory just float about until the unrevokable user releases it, which happens when the FD that is driving the import eventually gets closed.
I still don't think any of this complexity is needed; pinnable memory is a thing in Linux, just account for it in mlocked and that is enough.
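For reference, the kernel already has a helper for exactly this kind of accounting. A minimal sketch, where account_locked_vm() is the existing helper and the example_* wrappers are made-up names:

#include <linux/mm.h>
#include <linux/sched/mm.h>

/*
 * Charge "npages" of pinned memory to the current process, in the same
 * bucket that mlock()/RLIMIT_MEMLOCK uses. Illustrative wrapper only.
 */
static int example_pin_account(unsigned long npages)
{
	/* Fails with -ENOMEM if the caller would exceed RLIMIT_MEMLOCK. */
	return account_locked_vm(current->mm, npages, true);
}

static void example_pin_unaccount(unsigned long npages)
{
	account_locked_vm(current->mm, npages, false);
}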
Jason