This is rather overkill since currently all drivers call this from
hardirq (or at least timers). But maybe in the future we're going to
have thread irq handlers and what not, doesn't hurt to be prepared.
Plus this is an easy start for sprinkling these fence annotations into
shared code.
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: linux-rdma(a)vger.kernel.org
Cc: amd-gfx(a)lists.freedesktop.org
Cc: intel-gfx(a)lists.freedesktop.org
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
---
drivers/gpu/drm/drm_vblank.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 42a84eb4cc8c..d681ab09963c 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -24,6 +24,7 @@
* OTHER DEALINGS IN THE SOFTWARE.
*/
+#include <linux/dma-fence.h>
#include <linux/export.h>
#include <linux/moduleparam.h>
@@ -1909,7 +1910,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
{
struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
unsigned long irqflags;
- bool disable_irq;
+ bool disable_irq, fence_cookie;
if (drm_WARN_ON_ONCE(dev, !drm_dev_has_vblank(dev)))
return false;
@@ -1917,6 +1918,8 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
return false;
+ fence_cookie = dma_fence_begin_signalling();
+
spin_lock_irqsave(&dev->event_lock, irqflags);
/* Need timestamp lock to prevent concurrent execution with
@@ -1929,6 +1932,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
if (!vblank->enabled) {
spin_unlock(&dev->vblank_time_lock);
spin_unlock_irqrestore(&dev->event_lock, irqflags);
+ dma_fence_end_signalling(fence_cookie);
return false;
}
@@ -1954,6 +1958,8 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
if (disable_irq)
vblank_disable_fn(&vblank->disable_timer);
+ dma_fence_end_signalling(fence_cookie);
+
return true;
}
EXPORT_SYMBOL(drm_handle_vblank);
--
2.27.0
From: Xiyu Yang <xiyuyang19(a)fudan.edu.cn>
[ Upstream commit 11425c4519e2c974a100fc984867046d905b9380 ]
ttm_bo_add_move_fence() invokes dma_fence_get(), which returns a
reference of the specified dma_fence object to "fence" with increased
refcnt.
When ttm_bo_add_move_fence() returns, local variable "fence" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
ttm_bo_add_move_fence(). When no_wait_gpu flag is equals to true, the
function forgets to decrease the refcnt increased by dma_fence_get(),
causing a refcnt leak.
Fix this issue by calling dma_fence_put() when no_wait_gpu flag is
equals to true.
Signed-off-by: Xiyu Yang <xiyuyang19(a)fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf(a)gmail.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/370221/
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/ttm/ttm_bo.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index abf165b2f64fc..3ce8ad7603c7f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -941,8 +941,10 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
if (!fence)
return 0;
- if (no_wait_gpu)
+ if (no_wait_gpu) {
+ dma_fence_put(fence);
return -EBUSY;
+ }
dma_resv_add_shared_fence(bo->base.resv, fence);
--
2.25.1
From: Xiyu Yang <xiyuyang19(a)fudan.edu.cn>
[ Upstream commit 11425c4519e2c974a100fc984867046d905b9380 ]
ttm_bo_add_move_fence() invokes dma_fence_get(), which returns a
reference of the specified dma_fence object to "fence" with increased
refcnt.
When ttm_bo_add_move_fence() returns, local variable "fence" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
ttm_bo_add_move_fence(). When no_wait_gpu flag is equals to true, the
function forgets to decrease the refcnt increased by dma_fence_get(),
causing a refcnt leak.
Fix this issue by calling dma_fence_put() when no_wait_gpu flag is
equals to true.
Signed-off-by: Xiyu Yang <xiyuyang19(a)fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf(a)gmail.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/370221/
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/ttm/ttm_bo.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 9e07c3f75156b..ef5bc00c73e23 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -881,8 +881,10 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
if (!fence)
return 0;
- if (no_wait_gpu)
+ if (no_wait_gpu) {
+ dma_fence_put(fence);
return -EBUSY;
+ }
dma_resv_add_shared_fence(bo->base.resv, fence);
--
2.25.1
From: Xiyu Yang <xiyuyang19(a)fudan.edu.cn>
[ Upstream commit 37cc4b95d13f311c04aa8e9daacca3905ad45ca7 ]
ttm_bo_vm_fault_reserved() invokes dma_fence_get(), which returns a
reference of the specified dma_fence object to "moving" with increased
refcnt.
When ttm_bo_vm_fault_reserved() returns, local variable "moving" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in several exception handling paths
of ttm_bo_vm_fault_reserved(). When those error scenarios occur such as
"err" equals to -EBUSY, the function forgets to decrease the refcnt
increased by dma_fence_get(), causing a refcnt leak.
Fix this issue by calling dma_fence_put() when no_wait_gpu flag is
equals to true.
Signed-off-by: Xiyu Yang <xiyuyang19(a)fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf(a)gmail.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/370219/
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 0ad30b1129821..72100b84c7a90 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -300,8 +300,10 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
break;
case -EBUSY:
case -ERESTARTSYS:
+ dma_fence_put(moving);
return VM_FAULT_NOPAGE;
default:
+ dma_fence_put(moving);
return VM_FAULT_SIGBUS;
}
--
2.25.1
On 21.06.2020 06:00, Dmitry Osipenko wrote:
> В Fri, 19 Jun 2020 12:36:31 +0200
> Marek Szyprowski <m.szyprowski(a)samsung.com> пишет:
>
>> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg()
>> function returns the number of the created entries in the DMA address
>> space. However the subsequent calls to the
>> dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with
>> the original number of the entries passed to the dma_map_sg().
>>
>> struct sg_table is a common structure used for describing a
>> non-contiguous memory buffer, used commonly in the DRM and graphics
>> subsystems. It consists of a scatterlist with memory pages and DMA
>> addresses (sgl entry), as well as the number of scatterlist entries:
>> CPU pages (orig_nents entry) and DMA mapped pages (nents entry).
>>
>> It turned out that it was a common mistake to misuse nents and
>> orig_nents entries, calling DMA-mapping functions with a wrong number
>> of entries or ignoring the number of mapped entries returned by the
>> dma_map_sg() function.
>>
>> To avoid such issues, lets use a common dma-mapping wrappers operating
>> directly on the struct sg_table objects and use scatterlist page
>> iterators where possible. This, almost always, hides references to the
>> nents and orig_nents entries, making the code robust, easier to follow
>> and copy/paste safe.
>>
>> Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
>> Reviewed-by: Dmitry Osipenko <digetx(a)gmail.com>
>> ---
>> drivers/staging/media/tegra-vde/iommu.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/staging/media/tegra-vde/iommu.c
>> b/drivers/staging/media/tegra-vde/iommu.c index
>> 6af863d92123..adf8dc7ee25c 100644 ---
>> a/drivers/staging/media/tegra-vde/iommu.c +++
>> b/drivers/staging/media/tegra-vde/iommu.c @@ -36,8 +36,8 @@ int
>> tegra_vde_iommu_map(struct tegra_vde *vde,
>> addr = iova_dma_addr(&vde->iova, iova);
>>
>> - size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
>> - IOMMU_READ | IOMMU_WRITE);
>> + size = iommu_map_sgtable(vde->domain, addr, sgt,
>> + IOMMU_READ | IOMMU_WRITE);
>> if (!size) {
>> __free_iova(&vde->iova, iova);
>> return -ENXIO;
> Ahh, I saw the build failure report. You're changing the DMA API in
> this series, while DMA API isn't used by this driver, it uses IOMMU
> API. Hence there is no need to touch this code. Similar problem in the
> host1x driver patch.
The issue is caused by the lack of iommu_map_sgtable() stub when no
IOMMU support is configured. I've posted a patch for this:
https://lore.kernel.org/lkml/20200630081756.18526-1-m.szyprowski@samsung.co…
The patch for this driver is fine, we have to wait until the above fix
gets merged and then it can be applied during the next release cycle.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Wed, May 20, 2020 at 03:39:31PM +0200, Erwan Le Ray wrote:
> Add support of generic DT binding for annoucing RTS/CTS lines. The initial
> binding 'st,hw-flow-control' is not needed anymore since generic binding
> is available, but is kept for backward compatibility.
>
> Signed-off-by: Erwan Le Ray <erwan.leray(a)st.com>
>
> diff --git a/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml b/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> index 75b8521eb7cb..06d5f251ec88 100644
> --- a/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> +++ b/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> @@ -35,9 +35,11 @@ properties:
> description: label associated with this uart
>
> st,hw-flow-ctrl:
> - description: enable hardware flow control
> + description: enable hardware flow control (deprecated)
> $ref: /schemas/types.yaml#/definitions/flag
>
> + uart-has-rtscts: true
> +
> dmas:
> minItems: 1
> maxItems: 2
> --
> 2.17.1
>
Did this get ignored by the DT maintainers? :(
On Thu, Jun 11, 2020 at 09:30:12AM +0200, Thomas Hellström (Intel) wrote:
>
> On 6/4/20 10:12 AM, Daniel Vetter wrote:
> > Two in one go:
> > - it is allowed to call dma_fence_wait() while holding a
> > dma_resv_lock(). This is fundamental to how eviction works with ttm,
> > so required.
> >
> > - it is allowed to call dma_fence_wait() from memory reclaim contexts,
> > specifically from shrinker callbacks (which i915 does), and from mmu
> > notifier callbacks (which amdgpu does, and which i915 sometimes also
> > does, and probably always should, but that's kinda a debate). Also
> > for stuff like HMM we really need to be able to do this, or things
> > get real dicey.
> >
> > Consequence is that any critical path necessary to get to a
> > dma_fence_signal for a fence must never a) call dma_resv_lock nor b)
> > allocate memory with GFP_KERNEL. Also by implication of
> > dma_resv_lock(), no userspace faulting allowed. That's some supremely
> > obnoxious limitations, which is why we need to sprinkle the right
> > annotations to all relevant paths.
> >
> > The one big locking context we're leaving out here is mmu notifiers,
> > added in
> >
> > commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
> > Author: Daniel Vetter <daniel.vetter(a)ffwll.ch>
> > Date: Mon Aug 26 22:14:21 2019 +0200
> >
> > mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
> >
> > that one covers a lot of other callsites, and it's also allowed to
> > wait on dma-fences from mmu notifiers. But there's no ready-made
> > functions exposed to prime this, so I've left it out for now.
> >
> > v2: Also track against mmu notifier context.
> >
> > v3: kerneldoc to spec the cross-driver contract. Note that currently
> > i915 throws in a hard-coded 10s timeout on foreign fences (not sure
> > why that was done, but it's there), which is why that rule is worded
> > with SHOULD instead of MUST.
> >
> > Also some of the mmu_notifier/shrinker rules might surprise SoC
> > drivers, I haven't fully audited them all. Which is infeasible anyway,
> > we'll need to run them with lockdep and dma-fence annotations and see
> > what goes boom.
> >
> > v4: A spelling fix from Mika
> >
> > Cc: Mika Kuoppala <mika.kuoppala(a)intel.com>
> > Cc: Thomas Hellstrom <thomas.hellstrom(a)intel.com>
> > Cc: linux-media(a)vger.kernel.org
> > Cc: linaro-mm-sig(a)lists.linaro.org
> > Cc: linux-rdma(a)vger.kernel.org
> > Cc: amd-gfx(a)lists.freedesktop.org
> > Cc: intel-gfx(a)lists.freedesktop.org
> > Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
> > Cc: Christian König <christian.koenig(a)amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
> > ---
> > Documentation/driver-api/dma-buf.rst | 6 ++++
> > drivers/dma-buf/dma-fence.c | 41 ++++++++++++++++++++++++++++
> > drivers/dma-buf/dma-resv.c | 4 +++
> > include/linux/dma-fence.h | 1 +
> > 4 files changed, 52 insertions(+)
>
> I still have my doubts about allowing fence waiting from within shrinkers.
> IMO ideally they should use a trywait approach, in order to allow memory
> allocation during command submission for drivers that
> publish fences before command submission. (Since early reservation object
> release requires that).
Yeah it is a bit annoying, e.g. for drm/scheduler I think we'll end up
with a mempool to make sure it can handle it's allocations.
> But since drivers are already waiting from within shrinkers and I take your
> word for HMM requiring this,
Yeah the big trouble is HMM and mmu notifiers. That's the really awkward
one, the shrinker one is a lot less established.
I do wonder whether the mmu notifier constraint should only be set when
mmu notifiers are enabled, since on a bunch of arm-soc gpu drivers that
stuff just doesn't matter. But I expect that sooner or later these arm
gpus will show up in bigger arm cores, where you might want to have kvm
and maybe device virtualization and stuff, and then you need mmu
notifiers.
Plus having a very clear and consistent cross-driver api contract is imo
better than leaving this up to drivers and then having incompatible
assumptions.
I've pinged a bunch of armsoc gpu driver people and ask them how much this
hurts, so that we have a clear answer. On x86 I don't think we have much
of a choice on this, with userptr in amd and i915 and hmm work in nouveau
(but nouveau I think doesn't use dma_fence in there). I think it'll take
us a while to really bottom out on this specific question here.
-Daniel
>
> Reviewed-by: Thomas Hellström <thomas.hellstrom(a)intel.com>
>
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch