ib_umem_dmabuf_map_pages() returns 0 on success and -ERRNO on failure.
dma_resv_wait_timeout() uses a different scheme:
* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or * greater than zero on success.
This results in ib_umem_dmabuf_map_pages() being non-functional as a positive return will be understood to be an error by drivers.
Fixes: f30bceab16d1 ("RDMA: use dma_resv_wait() instead of extracting the fence") Cc: stable@kernel.org Tested-by: Maor Gottlieb maorg@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/infiniband/core/umem_dmabuf.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
Oded, I assume the Habana driver will hit this as well - does this mean you are not testing upstream kernels?
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c index fce80a4a5147cd..04c04e6d24c358 100644 --- a/drivers/infiniband/core/umem_dmabuf.c +++ b/drivers/infiniband/core/umem_dmabuf.c @@ -18,6 +18,7 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) struct scatterlist *sg; unsigned long start, end, cur = 0; unsigned int nmap = 0; + long ret; int i;
dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); @@ -67,9 +68,14 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) * may be not up-to-date. Wait for the exporter to finish * the migration. */ - return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, + ret = dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, DMA_RESV_USAGE_KERNEL, false, MAX_SCHEDULE_TIMEOUT); + if (ret < 0) + return ret; + if (ret == 0) + return -ETIMEDOUT; + return 0; } EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);
base-commit: 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
On Tue, Aug 16, 2022 at 11:03:20AM -0300, Jason Gunthorpe wrote:
ib_umem_dmabuf_map_pages() returns 0 on success and -ERRNO on failure.
dma_resv_wait_timeout() uses a different scheme:
- Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
- greater than zero on success.
This results in ib_umem_dmabuf_map_pages() being non-functional as a positive return will be understood to be an error by drivers.
Fixes: f30bceab16d1 ("RDMA: use dma_resv_wait() instead of extracting the fence") Cc: stable@kernel.org Tested-by: Maor Gottlieb maorg@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com
drivers/infiniband/core/umem_dmabuf.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
Thanks, applied to -rc.
On Tue, Aug 16, 2022 at 5:03 PM Jason Gunthorpe jgg@nvidia.com wrote:
ib_umem_dmabuf_map_pages() returns 0 on success and -ERRNO on failure.
dma_resv_wait_timeout() uses a different scheme:
- Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
- greater than zero on success.
This results in ib_umem_dmabuf_map_pages() being non-functional as a positive return will be understood to be an error by drivers.
Fixes: f30bceab16d1 ("RDMA: use dma_resv_wait() instead of extracting the fence") Cc: stable@kernel.org Tested-by: Maor Gottlieb maorg@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com
drivers/infiniband/core/umem_dmabuf.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
Oded, I assume the Habana driver will hit this as well - does this mean you are not testing upstream kernels?
Thanks Jason for letting me know.
You are correct, we don't use upstream kernels. We use a back-ported EFA driver for 5.15 which in that version, ib_umem_dmabuf_map_pages() calls dma_resv_excl_fence(). So I guess that's why we didn't encounter this issue.
Thanks, oded
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c index fce80a4a5147cd..04c04e6d24c358 100644 --- a/drivers/infiniband/core/umem_dmabuf.c +++ b/drivers/infiniband/core/umem_dmabuf.c @@ -18,6 +18,7 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) struct scatterlist *sg; unsigned long start, end, cur = 0; unsigned int nmap = 0;
long ret; int i; dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
@@ -67,9 +68,14 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) * may be not up-to-date. Wait for the exporter to finish * the migration. */
return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv,
ret = dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, DMA_RESV_USAGE_KERNEL, false, MAX_SCHEDULE_TIMEOUT);
if (ret < 0)
return ret;
if (ret == 0)
return -ETIMEDOUT;
return 0;
} EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);
base-commit: 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
2.37.2
linaro-mm-sig@lists.linaro.org