On Wed, May 06, 2026 at 04:55:27PM +0100, Matt Evans wrote:
Hi Leon,
On 06/05/2026 16:29, Leon Romanovsky wrote:
On Wed, May 06, 2026 at 02:53:31PM +0100, Matt Evans wrote:
Hi Alex,
On 01/05/2026 20:12, Alex Williamson wrote:
On Thu, 16 Apr 2026 06:17:44 -0700 Matt Evans <mattev@meta.com> wrote:
vfio_pci_dma_buf_cleanup() assumed all VFIO device DMABUFs need to be revoked. However, if vfio_pci_dma_buf_move() revokes DMABUFs before the fd/device closes, then vfio_pci_dma_buf_cleanup() would do a second/underflowing kref_put() then wait_for_completion() on a completion that never fires. Fixed by predicating on revocation status.
This could happen if PCI_COMMAND_MEMORY is cleared before closing the device fd (but the scenario is more likely to hit when future commits add more methods to revoke DMABUFs).
Fixes: 1a8a5227f2299 ("vfio: Wait for dma-buf invalidation to complete")
Signed-off-by: Matt Evans <mattev@meta.com>
(Just a fix, but later "vfio/pci: Convert BAR mmap() to use a DMABUF" and "vfio/pci: Permanently revoke a DMABUF on request" depend on this context, so including in this series.)
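The double-put described in the patch can be illustrated with a small userspace sketch. It is not the kernel code: `struct sketch_priv`, `sketch_revoke()`, `sketch_move()` and `sketch_cleanup()` are hypothetical names, and a plain integer stands in for the kref. It only shows why cleanup has to check the revocation status before dropping the reference a second time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-DMABUF private state. */
struct sketch_priv {
	int refs;      /* stand-in for the kref count */
	bool revoked;  /* set once the buffer has been revoked */
	int puts;      /* how many times the revoke path dropped its reference */
};

/* Revoke at most once: drop the single reference owned by the revoke path. */
static void sketch_revoke(struct sketch_priv *p)
{
	if (p->revoked)
		return;            /* already revoked: a second put would underflow */
	p->revoked = true;
	assert(p->refs > 0);       /* an underflow here is the bug being avoided */
	p->refs--;
	p->puts++;
}

/* Early revocation, e.g. when memory decoding is disabled. */
static void sketch_move(struct sketch_priv *p)
{
	sketch_revoke(p);
}

/* Cleanup at fd/device close: only revokes what is still live. */
static void sketch_cleanup(struct sketch_priv *p)
{
	sketch_revoke(p);
}
```

Calling `sketch_move()` and then `sketch_cleanup()` on a buffer that starts with one reference drops that reference exactly once. Without the `revoked` check, the second call would decrement past zero, which corresponds to the underflowing kref_put() followed by a wait that never completes.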
We really need a fix for this split out from this series. It's already been shown[1] that this is trivially reachable. Carlos proposed[2] a similar solution to the one below. I was concurrently working on the issue and suggested an alternative[3]. Let's pick a solution for 7.1-rc. Thanks,
It looks like [3] is progressing, so I'll drop this one when I can rebase onto it.
I noticed [3] removes the dma_resv_lock(priv->dmabuf->resv) around the priv->vdev = NULL, and this series' vfio_pci_mmap_huge_fault() relies on vdev only changing whilst resv is held to resolve a race between a fault and cleanup (see patch 7 of this series). The handler takes resv so that it can stably test vdev in order to take memory_lock.
I think that you should rely on priv->revoked and not on priv->vdev.
Needs both unfortunately, as the fault handler ultimately needs to take vdev->memory_lock.
One can argue that if priv->revoked == true, all accesses to the device should be denied and treated as if priv->vdev == NULL.
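A rough userspace sketch of the pattern under discussion, with both fields tested under one lock, might look like the following. Everything here is an assumption for illustration: `struct sketch_buf`, `struct sketch_dev`, `sketch_fault()` and `sketch_cleanup_buf()` are hypothetical names, pthread mutexes stand in for the reservation lock and memory_lock, and taking memory_lock while still holding the reservation lock is a choice made for the sketch, not something confirmed by the series.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device with its own lock, standing in for vdev->memory_lock. */
struct sketch_dev {
	pthread_mutex_t memory_lock;
	int faults_served;
};

/* Hypothetical buffer state; resv stands in for the reservation lock. */
struct sketch_buf {
	pthread_mutex_t resv;
	bool revoked;
	struct sketch_dev *vdev;   /* cleared by cleanup under resv */
};

/* Returns 0 if the fault was served, -1 if access is refused. */
static int sketch_fault(struct sketch_buf *b)
{
	struct sketch_dev *vdev;
	int ret = -1;

	pthread_mutex_lock(&b->resv);
	/* Both fields are read under resv, so neither can change here. */
	vdev = b->revoked ? NULL : b->vdev;  /* revoked is treated like vdev == NULL */
	if (vdev) {
		pthread_mutex_lock(&vdev->memory_lock);
		vdev->faults_served++;
		pthread_mutex_unlock(&vdev->memory_lock);
		ret = 0;
	}
	pthread_mutex_unlock(&b->resv);
	return ret;
}

/* Cleanup changes revoked and vdev only while holding resv. */
static void sketch_cleanup_buf(struct sketch_buf *b)
{
	pthread_mutex_lock(&b->resv);
	b->revoked = true;
	b->vdev = NULL;
	pthread_mutex_unlock(&b->resv);
}
```

In this sketch a fault before cleanup is served, and a fault after cleanup is refused without ever dereferencing the device pointer. The property it depends on is the one raised earlier in the thread: the fields tested by the fault path must only change while the same lock is held.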
Thanks