Linaro-mm-sig

linaro-mm-sig@lists.linaro.org

9 participants
3222 discussions

Re: [PATCH v3 09/28] drm/amdgpu: pass the entity to use to ttm public functions

by Christian König

On 11/21/25 11:12, Pierre-Eric Pelloux-Prayer wrote: > This way the caller can select the one it wants to use. > > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer(a)amd.com> I'm wondering if it wouldn't make sense to put a pointer to adev into each amdgpu_ttm_buffer_entity. But that is maybe something for another patch. For now: Reviewed-by: Christian König <christian.koenig(a)amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c | 3 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 +-- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 34 +++++++++---------- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 16 +++++---- > drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 3 +- > 5 files changed, 32 insertions(+), 28 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > index 3636b757c974..a050167e76a4 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > @@ -37,7 +37,8 @@ static int amdgpu_benchmark_do_move(struct amdgpu_device *adev, unsigned size, > > stime = ktime_get(); > for (i = 0; i < n; i++) { > - r = amdgpu_copy_buffer(adev, saddr, daddr, size, NULL, &fence, > + r = amdgpu_copy_buffer(adev, &adev->mman.default_entity, > + saddr, daddr, size, NULL, &fence, > false, 0); > if (r) > goto exit_do_move; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > index 926a3f09a776..858eb9fa061b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > @@ -1322,8 +1322,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo) > if (r) > goto out; > > - r = amdgpu_fill_buffer(abo, 0, &bo->base._resv, &fence, true, > - AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE); > + r = amdgpu_fill_buffer(&adev->mman.clear_entity, abo, 0, &bo->base._resv, > + &fence, AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE); > if (WARN_ON(r)) > goto out; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index 3d850893b97f..1d3afad885da 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -359,7 +359,7 @@ static int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev, > write_compress_disable)); > } > > - r = amdgpu_copy_buffer(adev, from, to, cur_size, resv, > + r = amdgpu_copy_buffer(adev, entity, from, to, cur_size, resv, > &next, true, copy_flags); > if (r) > goto error; > @@ -414,8 +414,9 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, > (abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE)) { > struct dma_fence *wipe_fence = NULL; > > - r = amdgpu_fill_buffer(abo, 0, NULL, &wipe_fence, > - false, AMDGPU_KERNEL_JOB_ID_MOVE_BLIT); > + r = amdgpu_fill_buffer(&adev->mman.move_entity, > + abo, 0, NULL, &wipe_fence, > + AMDGPU_KERNEL_JOB_ID_MOVE_BLIT); > if (r) { > goto error; > } else if (wipe_fence) { > @@ -2258,7 +2259,9 @@ static int amdgpu_ttm_prepare_job(struct amdgpu_device *adev, > DMA_RESV_USAGE_BOOKKEEP); > } > > -int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > +int amdgpu_copy_buffer(struct amdgpu_device *adev, > + struct amdgpu_ttm_buffer_entity *entity, > + uint64_t src_offset, > uint64_t dst_offset, uint32_t byte_count, > struct dma_resv *resv, > struct dma_fence **fence, > @@ -2282,7 +2285,7 @@ int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > max_bytes = adev->mman.buffer_funcs->copy_max_bytes; > num_loops = DIV_ROUND_UP(byte_count, max_bytes); > num_dw = ALIGN(num_loops * adev->mman.buffer_funcs->copy_num_dw, 8); > - r = amdgpu_ttm_prepare_job(adev, &adev->mman.move_entity, num_dw, > + r = amdgpu_ttm_prepare_job(adev, entity, num_dw, > resv, vm_needs_flush, &job, > AMDGPU_KERNEL_JOB_ID_TTM_COPY_BUFFER); > if (r) > @@ -2411,22 +2414,18 @@ int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo, > return r; > } > > -int amdgpu_fill_buffer(struct amdgpu_bo *bo, > - uint32_t src_data, > - struct dma_resv *resv, > - struct dma_fence **f, > - bool delayed, > - u64 k_job_id) > +int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity, > + struct amdgpu_bo *bo, > + uint32_t src_data, > + struct dma_resv *resv, > + struct dma_fence **f, > + u64 k_job_id) > { > struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); > - struct amdgpu_ttm_buffer_entity *entity; > struct dma_fence *fence = NULL; > struct amdgpu_res_cursor dst; > int r; > > - entity = delayed ? &adev->mman.clear_entity : > - &adev->mman.move_entity; > - > if (!adev->mman.buffer_funcs_enabled) { > dev_err(adev->dev, > "Trying to clear memory with ring turned off.\n"); > @@ -2443,13 +2442,14 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo, > /* Never fill more than 256MiB at once to avoid timeouts */ > cur_size = min(dst.size, 256ULL << 20); > > - r = amdgpu_ttm_map_buffer(adev, &adev->mman.default_entity, > + r = amdgpu_ttm_map_buffer(adev, entity, > &bo->tbo, bo->tbo.resource, &dst, > 1, false, &cur_size, &to); > if (r) > goto error; > > - r = amdgpu_ttm_fill_mem(adev, entity, src_data, to, cur_size, resv, > + r = amdgpu_ttm_fill_mem(adev, entity, > + src_data, to, cur_size, resv, > &next, true, k_job_id); > if (r) > goto error; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > index 41bbc25680a2..9288599c9c46 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > @@ -167,7 +167,9 @@ int amdgpu_ttm_init(struct amdgpu_device *adev); > void amdgpu_ttm_fini(struct amdgpu_device *adev); > void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, > bool enable); > -int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > +int amdgpu_copy_buffer(struct amdgpu_device *adev, > + struct amdgpu_ttm_buffer_entity *entity, > + uint64_t src_offset, > uint64_t dst_offset, uint32_t byte_count, > struct dma_resv *resv, > struct dma_fence **fence, > @@ -175,12 +177,12 @@ int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo, > struct dma_resv *resv, > struct dma_fence **fence); > -int amdgpu_fill_buffer(struct amdgpu_bo *bo, > - uint32_t src_data, > - struct dma_resv *resv, > - struct dma_fence **fence, > - bool delayed, > - u64 k_job_id); > +int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity, > + struct amdgpu_bo *bo, > + uint32_t src_data, > + struct dma_resv *resv, > + struct dma_fence **f, > + u64 k_job_id); > > int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo); > void amdgpu_ttm_recover_gart(struct ttm_buffer_object *tbo); > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > index ade1d4068d29..9c76f1ba0e55 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > @@ -157,7 +157,8 @@ svm_migrate_copy_memory_gart(struct amdgpu_device *adev, dma_addr_t *sys, > goto out_unlock; > } > > - r = amdgpu_copy_buffer(adev, gart_s, gart_d, size * PAGE_SIZE, > + r = amdgpu_copy_buffer(adev, entity, > + gart_s, gart_d, size * PAGE_SIZE, > NULL, &next, true, 0); > if (r) { > dev_err(adev->dev, "fail %d to copy memory\n", r);

2 weeks, 4 days

Re: [PATCH v3 06/28] drm/amdgpu: add amdgpu_ttm_job_submit helper

by Christian König

On 11/21/25 11:12, Pierre-Eric Pelloux-Prayer wrote: > Deduplicate the IB padding code and will also be used > later to check locking. > > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer(a)amd.com> Reviewed-by: Christian König <christian.koenig(a)amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 34 ++++++++++++------------- > 1 file changed, 16 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index 17e1892c44a2..be1232b2d55e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -162,6 +162,18 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo, > *placement = abo->placement; > } > > +static struct dma_fence * > +amdgpu_ttm_job_submit(struct amdgpu_device *adev, struct amdgpu_job *job, u32 num_dw) > +{ > + struct amdgpu_ring *ring; > + > + ring = adev->mman.buffer_funcs_ring; > + amdgpu_ring_pad_ib(ring, &job->ibs[0]); > + WARN_ON(job->ibs[0].length_dw > num_dw); > + > + return amdgpu_job_submit(job); > +} > + > /** > * amdgpu_ttm_map_buffer - Map memory into the GART windows > * @adev: the device being used > @@ -185,7 +197,6 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_device *adev, > { > unsigned int offset, num_pages, num_dw, num_bytes; > uint64_t src_addr, dst_addr; > - struct amdgpu_ring *ring; > struct amdgpu_job *job; > void *cpu_addr; > uint64_t flags; > @@ -240,10 +251,6 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_device *adev, > amdgpu_emit_copy_buffer(adev, &job->ibs[0], src_addr, > dst_addr, num_bytes, 0); > > - ring = adev->mman.buffer_funcs_ring; > - amdgpu_ring_pad_ib(ring, &job->ibs[0]); > - WARN_ON(job->ibs[0].length_dw > num_dw); > - > flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, mem); > if (tmz) > flags |= AMDGPU_PTE_TMZ; > @@ -261,7 +268,7 @@ static int amdgpu_ttm_map_buffer(struct amdgpu_device *adev, > amdgpu_gart_map_vram_range(adev, pa, 0, num_pages, flags, cpu_addr); > } > > - dma_fence_put(amdgpu_job_submit(job)); > + dma_fence_put(amdgpu_ttm_job_submit(adev, job, num_dw)); > return 0; > } > > @@ -1497,10 +1504,7 @@ static int amdgpu_ttm_access_memory_sdma(struct ttm_buffer_object *bo, > amdgpu_emit_copy_buffer(adev, &job->ibs[0], src_addr, dst_addr, > PAGE_SIZE, 0); > > - amdgpu_ring_pad_ib(adev->mman.buffer_funcs_ring, &job->ibs[0]); > - WARN_ON(job->ibs[0].length_dw > num_dw); > - > - fence = amdgpu_job_submit(job); > + fence = amdgpu_ttm_job_submit(adev, job, num_dw); > > if (!dma_fence_wait_timeout(fence, false, adev->sdma_timeout)) > r = -ETIMEDOUT; > @@ -2285,11 +2289,9 @@ int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > byte_count -= cur_size_in_bytes; > } > > - amdgpu_ring_pad_ib(ring, &job->ibs[0]); > - WARN_ON(job->ibs[0].length_dw > num_dw); > - *fence = amdgpu_job_submit(job); > if (r) > goto error_free; > + *fence = amdgpu_ttm_job_submit(adev, job, num_dw); > > return r; > > @@ -2307,7 +2309,6 @@ static int amdgpu_ttm_fill_mem(struct amdgpu_device *adev, uint32_t src_data, > u64 k_job_id) > { > unsigned int num_loops, num_dw; > - struct amdgpu_ring *ring; > struct amdgpu_job *job; > uint32_t max_bytes; > unsigned int i; > @@ -2331,10 +2332,7 @@ static int amdgpu_ttm_fill_mem(struct amdgpu_device *adev, uint32_t src_data, > byte_count -= cur_size; > } > > - ring = adev->mman.buffer_funcs_ring; > - amdgpu_ring_pad_ib(ring, &job->ibs[0]); > - WARN_ON(job->ibs[0].length_dw > num_dw); > - *fence = amdgpu_job_submit(job); > + *fence = amdgpu_ttm_job_submit(adev, job, num_dw); > return 0; > } >

2 weeks, 4 days

Re: [PATCH v9 10/11] vfio/pci: Add dma-buf export support for MMIO regions

by Leon Romanovsky

On Thu, Nov 20, 2025 at 05:04:13PM -0700, Alex Williamson wrote: > On Thu, 20 Nov 2025 11:28:29 +0200 > Leon Romanovsky <leon(a)kernel.org> wrote: > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c > > index 142b84b3f225..51a3bcc26f8b 100644 > > --- a/drivers/vfio/pci/vfio_pci_core.c > > +++ b/drivers/vfio/pci/vfio_pci_core.c > ... > > @@ -2487,8 +2500,11 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, > > > > err_undo: > > list_for_each_entry_from_reverse(vdev, &dev_set->device_list, > > - vdev.dev_set_list) > > + vdev.dev_set_list) { > > + if (__vfio_pci_memory_enabled(vdev)) > > + vfio_pci_dma_buf_move(vdev, false); > > up_write(&vdev->memory_lock); > > + } > > I ran into a bug here. In the hot reset path we can have dev_sets > where one or more devices are not opened by the user. The vconfig > buffer for the device is established on open. However: > > bool __vfio_pci_memory_enabled(struct vfio_pci_core_device *vdev) > { > struct pci_dev *pdev = vdev->pdev; > u16 cmd = le16_to_cpu(*(__le16 *)&vdev->vconfig[PCI_COMMAND]); > ... > > Leads to a NULL pointer dereference. > > I think the most straightforward fix is simply to test the open_count > on the vfio_device, which is also protected by the dev_set->lock that > we already hold here: > > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -2501,7 +2501,7 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, > err_undo: > list_for_each_entry_from_reverse(vdev, &dev_set->device_list, > vdev.dev_set_list) { > - if (__vfio_pci_memory_enabled(vdev)) > + if (vdev->vdev.open_count && __vfio_pci_memory_enabled(vdev)) > vfio_pci_dma_buf_move(vdev, false); > up_write(&vdev->memory_lock); > } > > Any other suggestions? This should be the only reset path with this > nuance of affecting non-opened devices. Thanks, It seems right to me. Thanks > > Alex

2 weeks, 4 days

Re: [PATCH v2] dma-buf: system_heap: use larger contiguous mappings instead of per-page mmap

by Sumit Semwal

Hi Barry, On Fri, 21 Nov 2025 at 06:54, Barry Song <21cnbao(a)gmail.com> wrote: > > Hi Sumit, > > > > > Using the micro-benchmark below, we see that mmap becomes > > 3.5X faster: > > > Marcin pointed out to me off-tree that it is actually 35x faster, > not 3.5x faster. Sorry for my poor math. I assume you can fix this > when merging it? Sure, I corrected this, and is merged to drm-misc-next Thanks, Sumit. > > > > > W/ patch: > > > > ~ # ./a.out > > mmap 512MB took 200266.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 198151.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 197069.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 196781.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 198102.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 195552.000 us, verify OK > > > > W/o patch: > > > > ~ # ./a.out > > mmap 512MB took 6987470.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 6970739.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 6984383.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 6971311.000 us, verify OK > > ~ # ./a.out > > mmap 512MB took 6991680.000 us, verify OK > > > Thanks > Barry

2 weeks, 4 days

Re: [PATCH v9 10/11] vfio/pci: Add dma-buf export support for MMIO regions

by Jason Gunthorpe

On Thu, Nov 20, 2025 at 05:04:13PM -0700, Alex Williamson wrote: > @@ -2501,7 +2501,7 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, > err_undo: > list_for_each_entry_from_reverse(vdev, &dev_set->device_list, > vdev.dev_set_list) { > - if (__vfio_pci_memory_enabled(vdev)) > + if (vdev->vdev.open_count && __vfio_pci_memory_enabled(vdev)) > vfio_pci_dma_buf_move(vdev, false); > up_write(&vdev->memory_lock); > } > > Any other suggestions? This should be the only reset path with this > nuance of affecting non-opened devices. Thanks, Seems reasonable, but should it be in __vfio_pci_memory_enabled() just to be robust? Jason

2 weeks, 5 days

Re: [PATCH] drm/xe: Fix memory leak when handling pagefault vma

by Thomas Hellström

On Thu, 2025-11-20 at 18:14 +0200, Mika Kuoppala wrote: > When the pagefault handling code was moved to a new file, an extra > drm_exec_init() was added to the VMA path. This call is unnecessary > because > xe_validation_ctx_init() already performs a drm_exec_init(), > resulting in a > memory leak reported by kmemleak. > > Remove the redundant drm_exec_init() from the VMA pagefault handling > code. > > Fixes: fb544b844508 ("drm/xe: Implement xe_pagefault_queue_work") > Cc: Matthew Brost <matthew.brost(a)intel.com> > Cc: Stuart Summers <stuart.summers(a)intel.com> > Cc: Lucas De Marchi <lucas.demarchi(a)intel.com> > Cc: "Thomas Hellström" <thomas.hellstrom(a)linux.intel.com> > Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com> > Cc: Sumit Semwal <sumit.semwal(a)linaro.org> > Cc: "Christian König" <christian.koenig(a)amd.com> > Cc: intel-xe(a)lists.freedesktop.org > Cc: linux-media(a)vger.kernel.org > Cc: dri-devel(a)lists.freedesktop.org > Cc: linaro-mm-sig(a)lists.linaro.org > Signed-off-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com> > --- > drivers/gpu/drm/xe/xe_pagefault.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/drivers/gpu/drm/xe/xe_pagefault.c > b/drivers/gpu/drm/xe/xe_pagefault.c > index fe3e40145012..afb06598b6e1 100644 > --- a/drivers/gpu/drm/xe/xe_pagefault.c > +++ b/drivers/gpu/drm/xe/xe_pagefault.c > @@ -102,7 +102,6 @@ static int xe_pagefault_handle_vma(struct xe_gt > *gt, struct xe_vma *vma, > > /* Lock VM and BOs dma-resv */ > xe_validation_ctx_init(&ctx, &vm->xe->val, &exec, (struct > xe_val_flags) {}); > - drm_exec_init(&exec, 0, 0); > drm_exec_until_all_locked(&exec) { > err = xe_pagefault_begin(&exec, vma, tile->mem.vram, > needs_vram == 1);

2 weeks, 5 days

Re: [PATCH 5/9] iommufd: Allow MMIO pages in a batch

by Jason Gunthorpe

On Thu, Nov 20, 2025 at 07:59:19AM +0000, Tian, Kevin wrote: > > From: Jason Gunthorpe <jgg(a)nvidia.com> > > Sent: Saturday, November 8, 2025 12:50 AM > > > > +enum batch_kind { > > + BATCH_CPU_MEMORY = 0, > > + BATCH_MMIO, > > +}; > > with 'CPU_MEMORY' (instead of plain 'MEMORY') implies future > support of 'DEV_MEMORY'? Maybe, but I don't have an immediate thought on this. CXL "MMIO" that is cachable is a thing but we can also label it as CPU_MEMORY. We might have something for CC shared/protected memory down the road. Thanks, Jason

2 weeks, 5 days

Re: [PATCH net-next v6 0/6] Add AF_XDP zero copy support

by patchwork-bot+netdevbpf＠kernel.org

Hello: This series was applied to netdev/net-next.git (main) by Paolo Abeni <pabeni(a)redhat.com>: On Tue, 18 Nov 2025 19:25:36 +0530 you wrote: > This series adds AF_XDP zero coppy support to icssg driver. > > Tests were performed on AM64x-EVM with xdpsock application [1]. > > A clear improvement is seen Transmit (txonly) and receive (rxdrop) > for 64 byte packets. 1500 byte test seems to be limited by line > rate (1G link) so no improvement seen there in packet rate > > [...] Here is the summary with links: - [net-next,v6,1/6] net: ti: icssg-prueth: Add functions to create and destroy Rx/Tx queues https://git.kernel.org/netdev/net-next/c/41dde7f1d013 - [net-next,v6,2/6] net: ti: icssg-prueth: Add XSK pool helpers https://git.kernel.org/netdev/net-next/c/7dfd7597911f - [net-next,v6,3/6] net: ti: icssg-prueth: Add AF_XDP zero copy for TX https://git.kernel.org/netdev/net-next/c/8756ef2eb078 - [net-next,v6,4/6] net: ti: icssg-prueth: Make emac_run_xdp function independent of page https://git.kernel.org/netdev/net-next/c/121133163c9f - [net-next,v6,5/6] net: ti: icssg-prueth: Add AF_XDP zero copy for RX https://git.kernel.org/netdev/net-next/c/7a64bb388df3 - [net-next,v6,6/6] net: ti: icssg-prueth: Enable zero copy in XDP features https://git.kernel.org/netdev/net-next/c/c6a1ec1870e6 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html

2 weeks, 5 days

Re: [PATCH 02/18] dma-buf: protected fence ops by RCU v3

by Christian König

On 11/18/25 17:03, Tvrtko Ursulin wrote: >>>> @@ -448,13 +465,19 @@ dma_fence_is_signaled_locked(struct dma_fence *fence) >>>> static inline bool >>>> dma_fence_is_signaled(struct dma_fence *fence) >>>> { >>>> + const struct dma_fence_ops *ops; >>>> + >>>> if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>>> return true; >>>> - if (fence->ops->signaled && fence->ops->signaled(fence)) { >>>> + rcu_read_lock(); >>>> + ops = rcu_dereference(fence->ops); >>>> + if (ops->signaled && ops->signaled(fence)) { >>>> + rcu_read_unlock(); >>> >>> With the unlocked version two threads could race and one could make the fence->lock go away just around here, before the dma_fence_signal below will take it. It seems it is only safe to rcu_read_unlock before signaling if using the embedded fence (later in the series). Can you think of a downside to holding the rcu read lock to after signaling? that would make it safe I think. >> >> Well it's good to talk about it but I think that it is not necessary to protect the lock in this particular case. >> >> See the RCU protection is only for the fence->ops pointer, but the lock can be taken way after the fence is already signaled. >> >> That's why I came up with the patch to move the lock into the fence in the first place. > > Right. And you think there is nothing to gain with the option of keeping the rcu_read_unlock() to after signalling? Ie. why not plug a potential race if we can for no negative effect. I thought quite a bit over that, but at least of hand I can't come up with a reason why we should do this. The signaling path doesn't need the RCU read side lock as far as I can see. Regards, Christian. > > Regards, > > Tvrtko

2 weeks, 5 days

[RFC PATCH 0/2] locking/ww_mutex, dma-buf/dma-resv: Improve detection of unheld locks

by Thomas Hellström

WW mutexes and dma-resv objects, which embed them, typically have a number of locks belocking to the same lock class. However code using them typically want to verify the locking on object granularity, not lock-class granularity. This series add ww_mutex functions to facilitate that, (patch 1) and utilizes these functions in the dma-resv lock checks. Thomas Hellström (2): kernel/locking/ww_mutex: Add per-lock lock-check helpers dma-buf/dma-resv: Improve the dma-resv lockdep checks include/linux/dma-resv.h | 7 +++++-- include/linux/ww_mutex.h | 18 ++++++++++++++++++ kernel/locking/mutex.c | 10 ++++++++++ 3 files changed, 33 insertions(+), 2 deletions(-) -- 2.51.1

2 weeks, 5 days

← Newer
1
2
3
4
5
6
7
8
9
...
323
Older →

Jump to page:

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig