TTM, GEM, DRM or the core DMA-buf framework need to enable
software signaling before the fence is signaled. The core
DMA-buf framework can forget to call enable_signaling before
the fence is signaled. That is, framework code can forget to
call dma_fence_enable_sw_signaling() before calling
dma_fence_is_signaled(). To avoid this scenario on debug
kernels, check the DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT bit status
before checking the DMA_FENCE_FLAG_SIGNALED_BIT bit status
to confirm that software signaling is enabled.
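The ordering of the two bit checks can be illustrated with a small
userspace model. This is only a sketch of the idea described above, not
the code from the patches: struct model_fence, the MODEL_* bit names and
all model_* helpers are hypothetical stand-ins for struct dma_fence, its
flag bits and the kernel helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel's fence flag bits. */
enum model_fence_flag_bits {
	MODEL_FLAG_SIGNALED_BIT,
	MODEL_FLAG_ENABLE_SIGNAL_BIT,
};

/* Hypothetical stand-in for struct dma_fence; only the flags word is modeled. */
struct model_fence {
	unsigned long flags;
};

static bool model_test_bit(int bit, const unsigned long *flags)
{
	assert(bit >= 0 && bit < (int)(sizeof(*flags) * 8));
	return (*flags >> bit) & 1UL;
}

/* Models dma_fence_enable_sw_signaling(): records that signaling was enabled. */
static void model_enable_sw_signaling(struct model_fence *f)
{
	f->flags |= 1UL << MODEL_FLAG_ENABLE_SIGNAL_BIT;
}

/* Models the fence being signaled. */
static void model_signal(struct model_fence *f)
{
	f->flags |= 1UL << MODEL_FLAG_SIGNALED_BIT;
}

/*
 * Debug-style check: look at the enable-signaling bit first and report
 * "not signaled" if the caller forgot to enable software signaling,
 * even when the signaled bit is already set.
 */
static bool model_is_signaled_debug(const struct model_fence *f)
{
	if (!model_test_bit(MODEL_FLAG_ENABLE_SIGNAL_BIT, &f->flags))
		return false;

	return model_test_bit(MODEL_FLAG_SIGNALED_BIT, &f->flags);
}
```

In this model, a caller that signals the fence without first calling
model_enable_sw_signaling() never sees it as signaled, which makes the
missing call visible during testing.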
Arvind Yadav (4):
dma-buf: Check status of enable-signaling bit on debug
drm/sched: Add callback and enable signaling on debug
dma-buf: Add callback and enable signaling on debug
dma-buf: Add callback and enable signaling on debug
drivers/dma-buf/dma-fence.c | 17 ++++++++
drivers/dma-buf/st-dma-fence-chain.c | 17 ++++++++
drivers/dma-buf/st-dma-fence-unwrap.c | 54 +++++++++++++++++++++++++
drivers/dma-buf/st-dma-fence.c | 34 +++++++++++++++-
drivers/dma-buf/st-dma-resv.c | 30 ++++++++++++++
drivers/gpu/drm/scheduler/sched_fence.c | 12 ++++++
drivers/gpu/drm/scheduler/sched_main.c | 4 +-
include/linux/dma-fence.h | 5 +++
8 files changed, 171 insertions(+), 2 deletions(-)
--
2.25.1
Hi Daniel Vetter,
The patch https://patchwork.freedesktop.org/patch/414455/:
"dma-buf: Add debug option" from Jan. 15, 2021, leads to the following exception:
Backtrace:
[<ffffffc0081a2258>] atomic_notifier_call_chain+0x9c/0xe8
[<ffffffc0081a2d54>] notify_die+0x114/0x19c
[<ffffffc0080348d8>] __die+0xec/0x468
[<ffffffc008034648>] die+0x54/0x1f8
[<ffffffc0080631e8>] die_kernel_fault+0x80/0xbc
[<ffffffc0080630fc>] __do_kernel_fault+0x268/0x2d4
[<ffffffc008062c4c>] do_bad_area+0x68/0x148
[<ffffffc00a6dab34>] do_translation_fault+0xbc/0x108
[<ffffffc0080619f8>] do_mem_abort+0x6c/0x1e8
[<ffffffc00a68f5cc>] el1_abort+0x3c/0x64
[<ffffffc00a68f54c>] el1h_64_sync_handler+0x5c/0xa0
[<ffffffc008011ae4>] el1h_64_sync+0x78/0x80
[<ffffffc008063b9c>] dcache_inval_poc+0x40/0x58
[<ffffffc009236104>] iommu_dma_sync_sg_for_cpu+0x144/0x280
[<ffffffc0082b4870>] dma_sync_sg_for_cpu+0xbc/0x110
[<ffffffc002c7538c>] system_heap_dma_buf_begin_cpu_access+0x144/0x1e0 [system_heap]
[<ffffffc0094154e4>] dma_buf_begin_cpu_access+0xa4/0x10c
[<ffffffc004888df4>] isp71_allocate_working_buffer+0x3b0/0xe8c [mtk_hcp]
[<ffffffc004884a20>] mtk_hcp_allocate_working_buffer+0xc0/0x108 [mtk_hcp]
CONFIG_DMABUF_DEBUG is enabled by default when DMA_API_DEBUG is enabled.
On devices that are not DMA coherent, the main reason a user calls
dma_buf_begin_cpu_access() and dma_buf_end_cpu_access() is to do cache sync
between dma_buf_map_attachment() and dma_buf_unmap_attachment(). In that
window, sg_phys(sg) returns a wrong PA from the sg table, which leads to
the exception.
1. dma_buf_map_attachment()
   --> mangle_sg_table(sg) // "sg->page_link ^= ~0xffUL" scrambles the PA in this patch.
2. dma_buf_begin_cpu_access()
   --> system_heap_dma_buf_begin_cpu_access() in system_heap.c // does cache sync if an attachment was mapped before
   --> iommu_dma_sync_sg_for_cpu() in dma-iommu.c
   --> arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir) // gets a wrong PA since the PA is scrambled
3. dma_buf_end_cpu_access() is similar to dma_buf_begin_cpu_access().
4. dma_buf_unmap_attachment()
   --> mangle_sg_table(sg) // "sg->page_link ^= ~0xffUL" restores the PA
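The effect of the XOR in steps 1 and 4 can be reproduced in userspace.
The sketch below is only a model: struct model_sg and the model_*
helpers are hypothetical, and the way the address is read back is
simplified compared to the real scatterlist accessors.

```c
#include <assert.h>

/* Hypothetical stand-in for the page_link word of a struct scatterlist. */
struct model_sg {
	unsigned long page_link;
};

/* Same operation as in mangle_sg_table(): flip every bit above the low byte. */
static void model_mangle(struct model_sg *sg)
{
	sg->page_link ^= ~0xffUL;
}

/* Simplified model of reading the page address: everything above the low byte. */
static unsigned long model_page_addr(const struct model_sg *sg)
{
	return sg->page_link & ~0xffUL;
}

/* The low byte, which the XOR mask leaves untouched. */
static unsigned long model_low_bits(const struct model_sg *sg)
{
	return sg->page_link & 0xffUL;
}
```

Calling model_mangle() once changes model_page_addr() while
model_low_bits() stays the same, and calling it a second time restores
the original value. Any address derived from the entry between the two
calls, as in step 2, is therefore bogus.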
drivers/dma-buf/Kconfig:
config DMABUF_DEBUG
	bool "DMA-BUF debug checks"
	default y if DMA_API_DEBUG

drivers/dma-buf/dma-buf.c:
static void mangle_sg_table(struct sg_table *sg_table)
{
#ifdef CONFIG_DMABUF_DEBUG
	int i;
	struct scatterlist *sg;

	/* To catch abuse of the underlying struct page by importers mix
	 * up the bits, but take care to preserve the low SG_ bits to
	 * not corrupt the sgt. The mixing is undone in __unmap_dma_buf
	 * before passing the sgt back to the exporter. */
	for_each_sgtable_sg(sg_table, sg, i)
		sg->page_link ^= ~0xffUL;
#endif
}

drivers/iommu/dma-iommu.c:
static void iommu_dma_sync_sg_for_cpu(struct device *dev,
		struct scatterlist *sgl, int nelems,
		enum dma_data_direction dir)
{
	struct scatterlist *sg;
	int i;

	if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
		return;

	for_each_sg(sgl, sg, nelems, i) {
		if (!dev_is_dma_coherent(dev))
			arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);

		if (is_swiotlb_buffer(sg_phys(sg)))
			swiotlb_tbl_sync_single(dev, sg_phys(sg), sg->length,
						dir, SYNC_FOR_CPU);
	}
}
Thanks,
Yunfei.
Hello,
This series moves all drivers to a dynamic dma-buf locking specification.
From now on all dma-buf importers are made responsible for holding
dma-buf's reservation lock around all operations performed over dma-bufs
in accordance with the locking specification. This allows us to utilize
the reservation lock more broadly around the kernel without fear of
potential deadlocks.
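The convention can be modeled in userspace to show how a locked function
and its _unlocked counterpart relate. This is a single-threaded sketch
under stated assumptions: struct model_resv, struct model_buf and all
model_* functions are hypothetical and only mimic the naming used in the
patch titles of this series; a boolean replaces the real reservation
lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded stand-in for a reservation lock. */
struct model_resv {
	bool held;
};

/* Hypothetical stand-in for a dma-buf with a kernel mapping counter. */
struct model_buf {
	struct model_resv resv;
	char data[16];
	int vmap_count;
};

static void model_resv_lock(struct model_resv *r)
{
	assert(!r->held);	/* a real lock would block or back off here */
	r->held = true;
}

static void model_resv_unlock(struct model_resv *r)
{
	assert(r->held);
	r->held = false;
}

/* Locked variant: the caller must already hold the reservation lock. */
static void *model_buf_vmap(struct model_buf *buf)
{
	assert(buf->resv.held);
	buf->vmap_count++;
	return buf->data;
}

/* _unlocked variant: takes and drops the lock on the caller's behalf. */
static void *model_buf_vmap_unlocked(struct model_buf *buf)
{
	void *vaddr;

	model_resv_lock(&buf->resv);
	vaddr = model_buf_vmap(buf);
	model_resv_unlock(&buf->resv);

	return vaddr;
}
```

An importer that already holds the lock calls model_buf_vmap() directly,
while a caller outside any locked section uses model_buf_vmap_unlocked().
The assertion in the locked variant plays the role of a lock-held check
and catches callers that skip the lock.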
This patchset passes all i915 selftests. It was also tested using VirtIO,
Panfrost, Lima, Tegra, udmabuf, AMDGPU and Nouveau drivers. I tested cases
of display+GPU, display+V4L and GPU+V4L dma-buf sharing (where appropriate),
which covers the majority of kernel drivers, since the rest of the drivers
share the same or similar code paths.
Changelog:
v3: - Factored out dma_buf_mmap_unlocked() and attachment functions
      into separate patches, as suggested by Christian König.
    - Corrected and factored out dma-buf locking documentation into
      a separate patch, as suggested by Christian König.
    - Intel driver dropped the reservation locking a few days ago from
      its BO-release code path, but we need that locking for the imported
      GEMs because in the end that code path unmaps the imported GEM.
      So I added back the locking needed by the imported GEMs, updating
      the "dma-buf attachment locking specification" patch appropriately.
    - Tested Nouveau+Intel dma-buf import/export combo.
    - Tested udmabuf import to i915/Nouveau/AMDGPU.
    - Fixed a few places in Etnaviv, Panfrost and Lima drivers that I missed
      switching to locked dma-buf vmapping in the "drm/gem: Take reservation
      lock for vmap/vunmap operations" patch. As a result, this invalidated
      the r-b that Christian gave to v2.
    - Added locked dma-buf vmap/vunmap functions that are needed for fixing
      vmapping of the Etnaviv, Panfrost and Lima drivers mentioned above.
      I actually had this change stashed for the drm-shmem shrinker patchset,
      but then realized that it's already needed by the dma-buf patches.
      Also improved my tests to better cover these code paths.
v2: - Changed locking specification to avoid problems with cross-driver
      ww locking, as suggested by Christian König. Now the attach/detach
      callbacks are invoked without the lock held and the exporter should
      take the lock.
    - Added "locking convention" documentation that explains which dma-buf
      functions and callbacks are locked/unlocked for importers and exporters,
      which was requested by Christian König.
    - Added the ack that Tomasz Figa gave to the V4L patches in v1.
Dmitry Osipenko (9):
dma-buf: Add _unlocked postfix to function names
dma-buf: Add locked variant of dma_buf_vmap/vunmap()
drm/gem: Take reservation lock for vmap/vunmap operations
dma-buf: Move dma_buf_vmap/vunmap_unlocked() to dynamic locking
specification
dma-buf: Move dma_buf_mmap_unlocked() to dynamic locking specification
dma-buf: Move dma-buf attachment to dynamic locking specification
dma-buf: Document dynamic locking convention
media: videobuf2: Stop using internal dma-buf lock
dma-buf: Remove internal lock
Documentation/driver-api/dma-buf.rst | 6 +
drivers/dma-buf/dma-buf.c | 276 ++++++++++++++----
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 +-
drivers/gpu/drm/armada/armada_gem.c | 14 +-
drivers/gpu/drm/drm_client.c | 4 +-
drivers/gpu/drm/drm_gem.c | 24 ++
drivers/gpu/drm/drm_gem_dma_helper.c | 6 +-
drivers/gpu/drm/drm_gem_framebuffer_helper.c | 6 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +-
drivers/gpu/drm/drm_gem_ttm_helper.c | 9 +-
drivers/gpu/drm/drm_prime.c | 12 +-
drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 4 +-
drivers/gpu/drm/exynos/exynos_drm_gem.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 6 +-
drivers/gpu/drm/i915/gem/i915_gem_object.c | 12 +
.../drm/i915/gem/selftests/i915_gem_dmabuf.c | 20 +-
drivers/gpu/drm/lima/lima_sched.c | 4 +-
drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 8 +-
drivers/gpu/drm/panfrost/panfrost_dump.c | 4 +-
drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 6 +-
drivers/gpu/drm/qxl/qxl_object.c | 17 +-
drivers/gpu/drm/qxl/qxl_prime.c | 4 +-
drivers/gpu/drm/tegra/gem.c | 27 +-
drivers/infiniband/core/umem_dmabuf.c | 11 +-
.../common/videobuf2/videobuf2-dma-contig.c | 26 +-
.../media/common/videobuf2/videobuf2-dma-sg.c | 23 +-
.../common/videobuf2/videobuf2-vmalloc.c | 17 +-
.../platform/nvidia/tegra-vde/dmabuf-cache.c | 12 +-
drivers/misc/fastrpc.c | 12 +-
drivers/xen/gntdev-dmabuf.c | 14 +-
include/drm/drm_gem.h | 3 +
include/linux/dma-buf.h | 57 ++--
32 files changed, 410 insertions(+), 242 deletions(-)
--
2.37.2