On Tue, Oct 19, 2021 at 08:23:45PM +0800, guangming.cao(a)mediatek.com wrote:
> From: Guangming Cao <Guangming.Cao(a)mediatek.com>
>
> There is no mandatory check for attachments in dma_buf_release, so
> a dma_buf can be released while an attachment that still points to it
> is in use, which may cause unexpected issues.
>
> With an IOMMU, when this case occurs, IOMMU address translation
> fault(s) will be followed by this warning.
> I think it's useful for DMA devices when debugging such issues.
>
> Signed-off-by: Guangming Cao <Guangming.Cao(a)mediatek.com>
This feels a lot like hand-rolling kobject debugging. If you want to do
this then I think adding kobject debug support to
dma_buf/dma_buf_attachment would be better than hand-rolling something
bespoke here.
Also on the patch itself: You don't need the trylock. For correctly
working code no one else can get at the dma-buf, so no locking is needed to
iterate through the attachment list. For incorrect code the kernel will be
on fire pretty soon anyway, trying to do locking won't help :-) And
without the trylock we can catch more bugs (e.g. if you also forgot to
unlock and not just forgot to detach).
-Daniel
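
A minimal userspace sketch of the point about dropping the trylock: by the time the release path runs, no other reference to the buffer should exist, so the leftover attachments can be walked and reported without any locking. The `model_*` types and the function name are hypothetical stand-ins, not the kernel's structures.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; not the kernel's structures. */
struct model_attachment {
    const char *dev_name;            /* device that is still attached */
    struct model_attachment *next;   /* singly linked for brevity */
};

struct model_dmabuf {
    const char *exp_name;
    struct model_attachment *attachments; /* NULL when fully detached */
};

/*
 * Called from the release path. No other reference to the buffer should
 * exist here, so the list is walked without taking any lock.
 * Returns the number of attachments that were never detached.
 */
static int model_report_leaked_attachments(const struct model_dmabuf *buf)
{
    const struct model_attachment *a;
    int count = 0;

    for (a = buf->attachments; a; a = a->next) {
        fprintf(stderr, "attach[%d]: dev:%s\n", count, a->dev_name);
        count++;
    }
    if (count)
        fprintf(stderr, "%s: released with %d attachment(s)\n",
                buf->exp_name, count);
    return count;
}
```

Because nothing is skipped when a lock happens to be held, a buffer that was left both attached and locked is still reported.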
> ---
> drivers/dma-buf/dma-buf.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 511fe0d217a0..672404857d6a 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -74,6 +74,29 @@ static void dma_buf_release(struct dentry *dentry)
> */
> BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
>
> + /* attachment check */
> + if (dma_resv_trylock(dmabuf->resv) && WARN(!list_empty(&dmabuf->attachments),
> + "%s err, inode:%08lu size:%08zu name:%s exp_name:%s flags:0x%08x mode:0x%08x, %s\n",
> + __func__, file_inode(dmabuf->file)->i_ino, dmabuf->size,
> + dmabuf->name, dmabuf->exp_name,
> + dmabuf->file->f_flags, dmabuf->file->f_mode,
> + "Release dmabuf before detach all attachments, dump attach:\n")) {
> + int attach_cnt = 0;
> + dma_addr_t dma_addr;
> + struct dma_buf_attachment *attach_obj;
> + /* dump all attachment info */
> + list_for_each_entry(attach_obj, &dmabuf->attachments, node) {
> + dma_addr = (dma_addr_t)0;
> + if (attach_obj->sgt)
> + dma_addr = sg_dma_address(attach_obj->sgt->sgl);
> + pr_err("attach[%d]: dev:%s dma_addr:0x%-12lx\n",
> + attach_cnt, dev_name(attach_obj->dev), dma_addr);
> + attach_cnt++;
> + }
> + pr_err("Total %d devices attached\n\n", attach_cnt);
> + dma_resv_unlock(dmabuf->resv);
> + }
> +
> dmabuf->ops->release(dmabuf);
>
> if (dmabuf->resv == (struct dma_resv *)&dmabuf[1])
> --
> 2.17.1
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On 08.10.21 at 13:20, Shunsuke Mie wrote:
> The comment for dma_buf_vmap/vunmap() has not caught up with the
> corresponding implementation.
>
> Signed-off-by: Shunsuke Mie <mie(a)igel.co.jp>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
> ---
> drivers/dma-buf/dma-buf.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index beb504a92d60..7b619998f03a 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1052,8 +1052,8 @@ EXPORT_SYMBOL_GPL(dma_buf_move_notify);
> *
> * Interfaces::
> *
> - * void \*dma_buf_vmap(struct dma_buf \*dmabuf)
> - * void dma_buf_vunmap(struct dma_buf \*dmabuf, void \*vaddr)
> + * void \*dma_buf_vmap(struct dma_buf \*dmabuf, struct dma_buf_map \*map)
> + * void dma_buf_vunmap(struct dma_buf \*dmabuf, struct dma_buf_map \*map)
> *
> * The vmap call can fail if there is no vmap support in the exporter, or if
> * it runs out of vmalloc space. Note that the dma-buf layer keeps a reference
Hello Guangming, Christian,
On Tue, 12 Oct 2021, 14:09 , <guangming.cao(a)mediatek.com> wrote:
> From: Guangming Cao <Guangming.Cao(a)mediatek.com>
>
> > On 09.10.21 at 07:55, guangming.cao(a)mediatek.com wrote:
> > From: Guangming Cao <Guangming.Cao(a)mediatek.com>
> > >
> > > If dma-buf doesn't want userspace users to touch the dmabuf buffer,
> > > it seems we should add this restriction to dma_buf_ops.mmap,
> > > not to this IOCTL:DMA_BUF_SET_NAME.
> > >
> > > With this restriction, we can only know the kernel users of the dmabuf
> > > by attachments.
> > > However, many userspace users, such as userspace users of dma_heap,
> > > also need to mark the usage of the dma-buf, and they don't care about
> > > who is attached to this dmabuf, so it seems pointless to be waiting for
> > > IOCTL:DMA_BUF_SET_NAME rather than mmap.
> >
> > Sounds valid to me, but I have no idea why this restriction was added in
> > the first place.
> >
> > Can you double check the git history and maybe identify when that was
> > added? Mentioning this change in the commit message then might make
> > things a bit easier to understand.
> >
> > Thanks,
> > Christian.
> It was added in this patch: https://patchwork.freedesktop.org/patch/310349/.
> However, there is no explanation for it.
> I guess it wants users to call set_name when there are no attachments on
> the dmabuf; for the case with attachments, we can find the owner by the
> device in the attachments.
> But as I said in the commit message, this might not be a good idea.
>
> Do you have any idea?
>
For the original series, the idea was that allowing name change mid-use
could confuse the users about the dma-buf. However, the rest of the series
also makes sure each dma-buf has a unique inode, and any accounting should
probably use that, without relying on the name as much.
So I don't have an objection to this change.
Best,
Sumit.
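
A rough userspace sketch of accounting by inode rather than by name, assuming that `fstat()` on a dma-buf fd reports the per-buffer inode mentioned above. The helper name `fds_share_inode` is made up for illustration; only `fstat()` and `struct stat` are real interfaces here.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Returns true when both descriptors refer to the same underlying file,
 * judged by device and inode number rather than by any name.
 * Returns false if either fstat() call fails.
 */
static bool fds_share_inode(int fd_a, int fd_b)
{
    struct stat sa, sb;

    if (fstat(fd_a, &sa) != 0 || fstat(fd_b, &sb) != 0)
        return false;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

A tool that tracks buffers this way keeps working when the name changes mid-use, which is why relaxing the naming restriction should not hurt accounting.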
> >
> > >
> > > Signed-off-by: Guangming Cao <Guangming.Cao(a)mediatek.com>
> > > ---
> > > drivers/dma-buf/dma-buf.c | 14 ++------------
> > > 1 file changed, 2 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > > index 511fe0d217a0..db2f4efdec32 100644
> > > --- a/drivers/dma-buf/dma-buf.c
> > > +++ b/drivers/dma-buf/dma-buf.c
> > > @@ -325,10 +325,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
> > >
> > > /**
> > > * dma_buf_set_name - Set a name to a specific dma_buf to track the usage.
> > > - * The name of the dma-buf buffer can only be set when the dma-buf is not
> > > - * attached to any devices. It could theoritically support changing the
> > > - * name of the dma-buf if the same piece of memory is used for multiple
> > > - * purpose between different devices.
> > > + * It could theoretically support changing the name of the dma-buf if the same
> > > + * piece of memory is used for multiple purpose between different devices.
> > > *
> > > * @dmabuf: [in] dmabuf buffer that will be renamed.
> > > * @buf: [in] A piece of userspace memory that contains the name of
> > > @@ -346,19 +344,11 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
> > > if (IS_ERR(name))
> > > return PTR_ERR(name);
> > >
> > > - dma_resv_lock(dmabuf->resv, NULL);
> > > - if (!list_empty(&dmabuf->attachments)) {
> > > - ret = -EBUSY;
> > > - kfree(name);
> > > - goto out_unlock;
> > > - }
> > > spin_lock(&dmabuf->name_lock);
> > > kfree(dmabuf->name);
> > > dmabuf->name = name;
> > > spin_unlock(&dmabuf->name_lock);
> > >
> > > -out_unlock:
> > > - dma_resv_unlock(dmabuf->resv);
> > > return ret;
> > > }
> > >
>
On 09.10.21 at 07:55, guangming.cao(a)mediatek.com wrote:
> From: Guangming Cao <Guangming.Cao(a)mediatek.com>
>
> If dma-buf doesn't want userspace users to touch the dmabuf buffer,
> it seems we should add this restriction to dma_buf_ops.mmap,
> not to this IOCTL:DMA_BUF_SET_NAME.
>
> With this restriction, we can only know the kernel users of the dmabuf
> by attachments.
> However, many userspace users, such as userspace users of dma_heap,
> also need to mark the usage of the dma-buf, and they don't care about
> who is attached to this dmabuf, so it seems pointless to be waiting for
> IOCTL:DMA_BUF_SET_NAME rather than mmap.
Sounds valid to me, but I have no idea why this restriction was added in
the first place.
Can you double check the git history and maybe identify when that was
added? Mentioning this change in the commit message then might make
things a bit easier to understand.
Thanks,
Christian.
>
> Signed-off-by: Guangming Cao <Guangming.Cao(a)mediatek.com>
> ---
> drivers/dma-buf/dma-buf.c | 14 ++------------
> 1 file changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 511fe0d217a0..db2f4efdec32 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -325,10 +325,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>
> /**
> * dma_buf_set_name - Set a name to a specific dma_buf to track the usage.
> - * The name of the dma-buf buffer can only be set when the dma-buf is not
> - * attached to any devices. It could theoritically support changing the
> - * name of the dma-buf if the same piece of memory is used for multiple
> - * purpose between different devices.
> + * It could theoretically support changing the name of the dma-buf if the same
> + * piece of memory is used for multiple purpose between different devices.
> *
> * @dmabuf: [in] dmabuf buffer that will be renamed.
> * @buf: [in] A piece of userspace memory that contains the name of
> @@ -346,19 +344,11 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
> if (IS_ERR(name))
> return PTR_ERR(name);
>
> - dma_resv_lock(dmabuf->resv, NULL);
> - if (!list_empty(&dmabuf->attachments)) {
> - ret = -EBUSY;
> - kfree(name);
> - goto out_unlock;
> - }
> spin_lock(&dmabuf->name_lock);
> kfree(dmabuf->name);
> dmabuf->name = name;
> spin_unlock(&dmabuf->name_lock);
>
> -out_unlock:
> - dma_resv_unlock(dmabuf->resv);
> return ret;
> }
>
On 10/8/21 7:47 PM, guangming.cao(a)mediatek.com wrote:
> From: Guangming Cao <Guangming.Cao(a)mediatek.com>
>
> If dma-buf don't want userspace users to touch the dmabuf buffer,
> it seems we should add this restriction into dma_buf_ops.mmap,
> not in this IOCTL:DMA_BUF_SET_NAME.
>
> With this restriction, we can only know the kernel users of the dmabuf
> by attachments.
> However, for many userspace users, such as userspace users of dma_heap,
> they also need to mark the usage of dma-buf, and they don't care about
> who attached to this dmabuf, and seems it's no meaning to waitting for
to be waiting for
> IOCTL:DMA_BUF_SET_NAME rather than mmap.
>
> Signed-off-by: Guangming Cao <Guangming.Cao(a)mediatek.com>
> ---
> drivers/dma-buf/dma-buf.c | 14 ++------------
> 1 file changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 511fe0d217a0..afbd0a226639 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -325,10 +325,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>
> /**
> * dma_buf_set_name - Set a name to a specific dma_buf to track the usage.
> - * The name of the dma-buf buffer can only be set when the dma-buf is not
> - * attached to any devices. It could theoritically support changing the
> - * name of the dma-buf if the same piece of memory is used for multiple
> - * purpose between different devices.
> + * It could theoritically support changing the name of the dma-buf if the same
theoretically
(yes, it was incorrect before this change.)
> + * piece of memory is used for multiple purpose between different devices.
> *
> * @dmabuf: [in] dmabuf buffer that will be renamed.
> * @buf: [in] A piece of userspace memory that contains the name of
> @@ -346,19 +344,11 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
> if (IS_ERR(name))
> return PTR_ERR(name);
>
> - dma_resv_lock(dmabuf->resv, NULL);
> - if (!list_empty(&dmabuf->attachments)) {
> - ret = -EBUSY;
> - kfree(name);
> - goto out_unlock;
> - }
> spin_lock(&dmabuf->name_lock);
> kfree(dmabuf->name);
> dmabuf->name = name;
> spin_unlock(&dmabuf->name_lock);
>
> -out_unlock:
> - dma_resv_unlock(dmabuf->resv);
> return ret;
> }
>
>
--
~Randy
On 08.10.21 at 08:29, guangming.cao(a)mediatek.com wrote:
> From: Guangming Cao <Guangming.Cao(a)mediatek.com>
>
> Because dma-buf.name can be freed in dma_buf_set_name(), we need to
> acquire the lock before we read or write dma_buf.name to prevent a
> use-after-free (UAF) issue.
>
> Signed-off-by: Guangming Cao <Guangming.Cao(a)mediatek.com>
> ---
> drivers/dma-buf/dma-buf.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 511fe0d217a0..aebb51b3ff52 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -80,7 +80,9 @@ static void dma_buf_release(struct dentry *dentry)
> dma_resv_fini(dmabuf->resv);
>
> module_put(dmabuf->owner);
> + spin_lock(&dmabuf->name_lock);
> kfree(dmabuf->name);
> + spin_unlock(&dmabuf->name_lock);
That here is certainly a NAK. This is the release function; if somebody
is changing the name on a released DMA-buf, we have much bigger problems.
> kfree(dmabuf);
> }
>
> @@ -1372,6 +1374,8 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
> if (ret)
> goto error_unlock;
>
> +
> + spin_lock(&dmabuf->name_lock);
> seq_printf(s, "%08zu\t%08x\t%08x\t%08ld\t%s\t%08lu\t%s\n",
> buf_obj->size,
> buf_obj->file->f_flags, buf_obj->file->f_mode,
> @@ -1379,6 +1383,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
> buf_obj->exp_name,
> file_inode(buf_obj->file)->i_ino,
> buf_obj->name ?: "");
> + spin_unlock(&dmabuf->name_lock);
Yeah, that part looks like a good idea to me as well.
Christian.
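
The distinction drawn in the two review comments can be sketched in userspace as follows. This is a simplified model under stated assumptions: `struct model_buf`, the `model_*` helpers and the `atomic_flag` spinlock are illustrative stand-ins, not the kernel's `name_lock` implementation. A reader that can race with a rename copies the name while holding the lock, whereas the final release needs no lock because nobody else can reach the object any more.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model of a buffer whose name can be replaced. */
struct model_buf {
    atomic_flag name_lock;   /* stands in for the kernel spinlock */
    char *name;              /* may be freed and replaced by set_name */
};

static void model_lock(struct model_buf *b)
{
    while (atomic_flag_test_and_set_explicit(&b->name_lock,
                                             memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void model_unlock(struct model_buf *b)
{
    atomic_flag_clear_explicit(&b->name_lock, memory_order_release);
}

/* Writer: swaps in a copy of new_name, then frees the old string. */
static int model_set_name(struct model_buf *b, const char *new_name)
{
    char *copy = malloc(strlen(new_name) + 1);
    char *old;

    if (!copy)
        return -1;
    strcpy(copy, new_name);

    model_lock(b);
    old = b->name;
    b->name = copy;
    model_unlock(b);

    free(old);   /* readers only dereference the name under the lock */
    return 0;
}

/* Reader: copies the name out while holding the lock. */
static void model_get_name(struct model_buf *b, char *dst, size_t len)
{
    model_lock(b);
    snprintf(dst, len, "%s", b->name ? b->name : "");
    model_unlock(b);
}
```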
>
> robj = buf_obj->resv;
> fence = dma_resv_excl_fence(robj);
A simpler version of the iterator to be used when the dma_resv object is
locked.
v2: fix index check here as well
v3: minor coding improvement, some documentation cleanup
Signed-off-by: Christian König <christian.koenig(a)amd.com>
---
drivers/dma-buf/dma-resv.c | 51 ++++++++++++++++++++++++++++++++++++++
include/linux/dma-resv.h | 20 +++++++++++++++
2 files changed, 71 insertions(+)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index a480af9581bd..2f98caa68ae5 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -423,6 +423,57 @@ struct dma_fence *dma_resv_iter_next_unlocked(struct dma_resv_iter *cursor)
}
EXPORT_SYMBOL(dma_resv_iter_next_unlocked);
+/**
+ * dma_resv_iter_first - first fence from a locked dma_resv object
+ * @cursor: cursor to record the current position
+ *
+ * Return the first fence in the dma_resv object while holding the
+ * &dma_resv.lock.
+ */
+struct dma_fence *dma_resv_iter_first(struct dma_resv_iter *cursor)
+{
+ struct dma_fence *fence;
+
+ dma_resv_assert_held(cursor->obj);
+
+ cursor->index = 0;
+ if (cursor->all_fences)
+ cursor->fences = dma_resv_shared_list(cursor->obj);
+ else
+ cursor->fences = NULL;
+
+ fence = dma_resv_excl_fence(cursor->obj);
+ if (!fence)
+ fence = dma_resv_iter_next(cursor);
+
+ cursor->is_restarted = true;
+ return fence;
+}
+EXPORT_SYMBOL_GPL(dma_resv_iter_first);
+
+/**
+ * dma_resv_iter_next - next fence from a locked dma_resv object
+ * @cursor: cursor to record the current position
+ *
+ * Return the next fences from the dma_resv object while holding the
+ * &dma_resv.lock.
+ */
+struct dma_fence *dma_resv_iter_next(struct dma_resv_iter *cursor)
+{
+ unsigned int idx;
+
+ dma_resv_assert_held(cursor->obj);
+
+ cursor->is_restarted = false;
+ if (!cursor->fences || cursor->index >= cursor->fences->shared_count)
+ return NULL;
+
+ idx = cursor->index++;
+ return rcu_dereference_protected(cursor->fences->shared[idx],
+ dma_resv_held(cursor->obj));
+}
+EXPORT_SYMBOL_GPL(dma_resv_iter_next);
+
/**
* dma_resv_copy_fences - Copy all fences from src to dst.
* @dst: the destination reservation object
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 764138ad8583..491359cea54c 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -179,6 +179,8 @@ struct dma_resv_iter {
struct dma_fence *dma_resv_iter_first_unlocked(struct dma_resv_iter *cursor);
struct dma_fence *dma_resv_iter_next_unlocked(struct dma_resv_iter *cursor);
+struct dma_fence *dma_resv_iter_first(struct dma_resv_iter *cursor);
+struct dma_fence *dma_resv_iter_next(struct dma_resv_iter *cursor);
/**
* dma_resv_iter_begin - initialize a dma_resv_iter object
@@ -244,6 +246,24 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
for (fence = dma_resv_iter_first_unlocked(cursor); \
fence; fence = dma_resv_iter_next_unlocked(cursor))
+/**
+ * dma_resv_for_each_fence - fence iterator
+ * @cursor: a struct dma_resv_iter pointer
+ * @obj: a dma_resv object pointer
+ * @all_fences: true if all fences should be returned
+ * @fence: the current fence
+ *
+ * Iterate over the fences in a struct dma_resv object while holding the
+ * &dma_resv.lock. @all_fences controls if the shared fences are returned as
+ * well. The cursor initialisation is part of the iterator and the fence stays
+ * valid as long as the lock is held and so no extra reference to the fence is
+ * taken.
+ */
+#define dma_resv_for_each_fence(cursor, obj, all_fences, fence) \
+ for (dma_resv_iter_begin(cursor, obj, all_fences), \
+ fence = dma_resv_iter_first(cursor); fence; \
+ fence = dma_resv_iter_next(cursor))
+
#define dma_resv_held(obj) lockdep_is_held(&(obj)->lock.base)
#define dma_resv_assert_held(obj) lockdep_assert_held(&(obj)->lock.base)
--
2.25.1
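
To make the iteration order of the locked iterator easier to follow, here is a small userspace model. It is only a sketch: the `model_*` types and macro are hypothetical, the lock assertion and RCU accessors of the real patch are left out, and the caller is simply assumed to hold the lock. The exclusive fence is returned first, followed by the shared fences when `all_fences` is set.

```c
#include <stddef.h>

/* Hypothetical userspace model; types and names are illustrative only. */
struct model_fence { int id; };

struct model_resv {
    struct model_fence *excl;        /* exclusive fence, may be NULL */
    struct model_fence **shared;     /* array of shared fences */
    unsigned int shared_count;
};

struct model_iter {
    struct model_resv *obj;
    int all_fences;                  /* also walk the shared fences? */
    unsigned int index;              /* next shared slot to return */
};

static struct model_fence *model_iter_next(struct model_iter *it)
{
    if (!it->all_fences || it->index >= it->obj->shared_count)
        return NULL;
    return it->obj->shared[it->index++];
}

static struct model_fence *model_iter_first(struct model_iter *it)
{
    it->index = 0;
    /* The exclusive fence comes first; fall back to shared if absent. */
    return it->obj->excl ? it->obj->excl : model_iter_next(it);
}

/* Cursor setup is part of the loop header, as in the patch's macro. */
#define model_for_each_fence(it, resv, all, fence)                   \
    for ((it)->obj = (resv), (it)->all_fences = (all),               \
         (fence) = model_iter_first(it); (fence);                    \
         (fence) = model_iter_next(it))
```

Since the lock is held for the whole walk, the model never restarts and takes no extra fence references, which is what makes this variant simpler than the unlocked one.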
Hi guys,
I've fixed up the lockdep splat in the new selftests found by the CI
systems and added another path for dma_resv_poll.
I know you guys are flooded, but can we get at least the first few patches
committed? The patches to change the individual drivers could also be pushed later on I think.
Thanks,
Christian.
On Thu, Sep 30, 2021 at 03:46:35PM +0300, Oded Gabbay wrote:
> After reading the kernel iommu code, I think this is not relevant
> here, and I'll add a comment appropriately but I'll also write it
> here, and please correct me if my understanding is wrong.
>
> The memory behind this specific dma-buf has *always* resided on the
> device itself, i.e. it lives only in the 'device' domain (after all,
> it maps a PCI bar address which points to the device memory).
> Therefore, it was never in the 'CPU' domain and hence, there is no
> need to perform a sync of the memory to the CPU's cache, as it was
> never inside that cache to begin with.
>
> This is not the same case as with regular memory which is dma-mapped
> and then copied into the device using a dma engine. In that case,
> the memory started in the 'CPU' domain and moved to the 'device'
> domain. When it is unmapped it will indeed be recycled to be used
> for another purpose and therefore we need to sync the CPU cache.
>
> Is my understanding correct ?
It makes sense to me
Jason
On Wed, Sep 29, 2021 at 12:17:35AM +0300, Oded Gabbay wrote:
> On Tue, Sep 28, 2021 at 8:36 PM Jason Gunthorpe <jgg(a)ziepe.ca> wrote:
> >
> > On Sun, Sep 12, 2021 at 07:53:09PM +0300, Oded Gabbay wrote:
> > > From: Tomer Tayar <ttayar(a)habana.ai>
> > >
> > > Implement the calls to the dma-buf kernel api to create a dma-buf
> > > object backed by FD.
> > >
> > > We block the option to mmap the DMA-BUF object because we don't support
> > > DIRECT_IO and implicit P2P.
> >
> > This statement doesn't make sense, you can mmap your dmabuf if you
> > like. All dmabuf mmaps are supposed to set the special bit/etc to
> > exclude them from get_user_pages() anyhow - and since this is BAR
> > memory not struct page memory this driver would be doing it anyhow.
> >
> But we block mmap the dmabuf fd from user-space.
> If you try to do it, you will get MAP_FAILED.
You do, I'm saying the above paragraph explaining *why* that was done
is not correct.
> > > We check the p2p distance using pci_p2pdma_distance_many() and refusing
> > > to map dmabuf in case the distance doesn't allow p2p.
> >
> > Does this actually allow the p2p transfer for your intended use cases?
>
> It depends on the system. If we are working bare-metal, then yes, it allows.
> If inside a VM, then no. The virtualized root complex is not
> white-listed and the kernel can't know the distance.
> But I remember you asked me to add this check, in v3 of the review IIRC.
> I don't mind removing this check if you don't object.
Yes, it is the right code. I was curious how far along things have
gotten.
> > Don't write to the kernel log from user space triggered actions
> at all ?
At all.
> It's the first time I hear about this limitation...
Oh? It is a security issue, we don't want to allow userspace to DoS
the kernel logging.
> How do you tell the user it has done something wrong ?
dev_dbg is the usual way and then users doing debugging can opt in to
the logging.
> > Why doesn't this return a sg_table * and an ERR_PTR?
> Basically I modeled this function after amdgpu_vram_mgr_alloc_sgt()
> And in that function they also return int and pass the sg_table as **
>
> If it's critical I can change.
Please follow the normal kernel style
Jason
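
For readers unfamiliar with the style requested in the last comment, the following userspace sketch re-creates the error-pointer idiom: the function returns the object itself and encodes a negative errno in the pointer on failure, instead of returning an int and filling in a double pointer. The `model_*` helpers mimic the kernel's `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()`; `struct model_table` is a hypothetical stand-in for the table being allocated.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace re-creation of the kernel's error-pointer helpers. */
#define MODEL_MAX_ERRNO 4095

static inline void *model_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

static inline long model_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int model_is_err(const void *ptr)
{
    /* The top MODEL_MAX_ERRNO addresses are reserved for error codes. */
    return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Hypothetical table type standing in for the real scatter table. */
struct model_table {
    unsigned int nents;
};

/* Returns a valid table, or an encoded negative errno on failure. */
static struct model_table *model_alloc_table(unsigned int nents)
{
    struct model_table *t;

    if (nents == 0)
        return model_err_ptr(-EINVAL);

    t = malloc(sizeof(*t));
    if (!t)
        return model_err_ptr(-ENOMEM);

    t->nents = nents;
    return t;
}
```

A caller then checks `model_is_err(t)` and, on failure, propagates `model_ptr_err(t)`.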
On Tue, Sep 28, 2021 at 10:04:29AM +0300, Oded Gabbay wrote:
> On Thu, Sep 23, 2021 at 12:22 PM Oded Gabbay <ogabbay(a)kernel.org> wrote:
> >
> > On Sat, Sep 18, 2021 at 11:38 AM Oded Gabbay <ogabbay(a)kernel.org> wrote:
> > >
> > > On Fri, Sep 17, 2021 at 3:30 PM Daniel Vetter <daniel(a)ffwll.ch> wrote:
> > > >
> > > > On Thu, Sep 16, 2021 at 10:10:14AM -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Sep 16, 2021 at 02:31:34PM +0200, Daniel Vetter wrote:
> > > > > > On Wed, Sep 15, 2021 at 10:45:36AM +0300, Oded Gabbay wrote:
> > > > > > > On Tue, Sep 14, 2021 at 7:12 PM Jason Gunthorpe <jgg(a)ziepe.ca> wrote:
> > > > > > > >
> > > > > > > > On Tue, Sep 14, 2021 at 04:18:31PM +0200, Daniel Vetter wrote:
> > > > > > > > > On Sun, Sep 12, 2021 at 07:53:07PM +0300, Oded Gabbay wrote:
> > > > > > > > > > Hi,
> > > > > > > > > > Re-sending this patch-set following the release of our user-space TPC
> > > > > > > > > > compiler and runtime library.
> > > > > > > > > >
> > > > > > > > > > I would appreciate a review on this.
> > > > > > > > >
> > > > > > > > > I think the big open we have is the entire revoke discussions. Having the
> > > > > > > > > option to let dma-buf hang around which map to random local memory ranges,
> > > > > > > > > without clear ownership link and a way to kill it sounds bad to me.
> > > > > > > > >
> > > > > > > > > I think there's a few options:
> > > > > > > > > - We require revoke support. But I've heard rdma really doesn't like that,
> > > > > > > > > I guess because taking out an MR while holding the dma_resv_lock would
> > > > > > > > > be an inversion, so can't be done. Jason, can you recap what exactly the
> > > > > > > > > hold-up was again that makes this a no-go?
> > > > > > > >
> > > > > > > > RDMA HW can't do revoke.
> > > > > >
> > > > > > Like why? I'm assuming when the final open handle or whatever for that MR
> > > > > > is closed, you do clean up everything? Or does that MR still stick around
> > > > > > forever too?
> > > > >
> > > > > It is a combination of uAPI and HW specification.
> > > > >
> > > > > revoke here means you take a MR object and tell it to stop doing DMA
> > > > > without causing the MR object to be destructed.
> > > > >
> > > > > All the drivers can of course destruct the MR, but doing such a
> > > > > destruction without explicit synchronization with user space opens
> > > > > things up to a serious use-after potential that could be a security
> > > > > issue.
> > > > >
> > > > > When the open handle closes the userspace is synchronized with the
> > > > > kernel and we can destruct the HW objects safely.
> > > > >
> > > > > So, the special HW feature required is 'stop doing DMA but keep the
> > > > > object in an error state' which isn't really implemented, and doesn't
> > > > > extend very well to other object types beyond simple MRs.
> > > >
> > > > Yeah revoke without destroying the MR doesn't work, and it sounds like
> > > > revoke by destroying the MR just moves the can of worms around to another
> > > > place.
> > > >
> > > > > > 1. User A opens gaudi device, sets up dma-buf export
> > > > > >
> > > > > > 2. User A registers that with RDMA, or anything else that doesn't support
> > > > > > revoke.
> > > > > >
> > > > > > 3. User A closes gaudi device
> > > > > >
> > > > > > 4. User B opens gaudi device, assumes that it has full control over the
> > > > > > device and uploads some secrets, which happen to end up in the dma-buf
> > > > > > region user A set up
> > > > >
> > > > > I would expect this is blocked so long as the DMABUF exists - eg the
> > > > > DMABUF will hold a fget on the FD of #1 until the DMABUF is closed, so
> > > > > that #3 can't actually happen.
> > > > >
> > > > > > It's not mlocked memory, it's mlocked memory and I can exfiltrate
> > > > > > it.
> > > > >
> > > > > That's just bug, don't make buggy drivers :)
> > > >
> > > > Well yeah, but given that habanalabs hand rolled this I can't just check
> > > > for the usual things we have to enforce this in drm. And generally you can
> > > > just open chardevs arbitrarily, and multiple users fighting over each
> > > > another. The troubles only start when you have private state or memory
> > > > allocations of some kind attached to the struct file (instead of the
> > > > underlying device), or something else that requires device exclusivity.
> > > > There's no standard way to do that.
> > > >
> > > > Plus in many cases you really want revoke on top (can't get that here
> > > > unfortunately it seems), and the attempts to get towards a generic
> > > > revoke() just never went anywhere. So again it's all hand-rolled
> > > > per-subsystem. *insert lament about us not having done this through a
> > > > proper subsystem*
> > > >
> > > > Anyway it sounds like the code takes care of that.
> > > > -Daniel
> > >
> > > Daniel, Jason,
> > > Thanks for reviewing this code.
> > >
> > > Can I get an R-B / A-B from you for this patch-set ?
> > >
> > > Thanks,
> > > Oded
> >
> > A kind reminder.
> >
> > Thanks,
> > Oded
>
> Hi,
> I know last week was LPC and maybe this got lost in the inbox, so I'm
> sending it again to make sure you got my request for R-B / A-B.
I was waiting for some clarity from the maintainers summit, but that's
still about as unclear as it gets. Either way technically it sounds ok,
but I'm a bit buried so didn't look at the code.
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
But looking beyond the strict lens of dma-buf I'm still impressed by the
mess this created, to get to the same endpoint of "we open our stack" in
the same time it takes others to sort this out. I'm still looking for some
kind of plan to fix this.
Also you probably want to get Dave to ack this too, I pinged him on irc
last week about this after maintainer summit.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Sun, Sep 12, 2021 at 07:53:08PM +0300, Oded Gabbay wrote:
> /* HL_MEM_OP_* */
> __u32 op;
> - /* HL_MEM_* flags */
> + /* HL_MEM_* flags.
> + * For the HL_MEM_OP_EXPORT_DMABUF_FD opcode, this field holds the
> + * DMA-BUF file/FD flags.
> + */
> __u32 flags;
> /* Context ID - Currently not in use */
> __u32 ctx_id;
> @@ -1072,6 +1091,13 @@ struct hl_mem_out {
>
> __u32 pad;
> };
> +
> + /* Returned in HL_MEM_OP_EXPORT_DMABUF_FD. Represents the
> + * DMA-BUF object that was created to describe a memory
> + * allocation on the device's memory space. The FD should be
> + * passed to the importer driver
> + */
> + __u64 fd;
FDs should be an s32 type in a fixed-width uapi.
I usually expect to see the uapi changes inside the commit that
consumes them, splitting the patch like this seems strange but
harmless.
Jason
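
A sketch of what the fd comment amounts to, written with `<stdint.h>` types so it compiles in userspace; a kernel uapi header would use `__s32` and `__u32` instead. `struct model_export_out` and its `pad` field are hypothetical and do not reflect the driver's actual layout.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical output struct, not the driver's real layout. File
 * descriptors are plain ints, so a signed 32-bit field is enough, and
 * explicit padding keeps the size identical for 32-bit and 64-bit
 * userspace.
 */
struct model_export_out {
    int32_t fd;     /* file descriptor handed back to userspace */
    uint32_t pad;   /* reserved so the struct stays 8 bytes */
};

_Static_assert(sizeof(struct model_export_out) == 8,
               "layout must not depend on the ABI");
_Static_assert(offsetof(struct model_export_out, pad) == 4,
               "no hidden padding after fd");
```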
From: Rob Clark <robdclark(a)chromium.org>
This series adds deadline awareness to fences, so realtime deadlines
such as vblank can be communicated to the fence signaller for power/
frequency management decisions.
This is partially inspired by a trick i915 does, but implemented
via dma-fence for a couple of reasons:
1) To continue to be able to use the atomic helpers
2) To support cases where display and gpu are different drivers
This iteration adds a dma-fence ioctl to set a deadline (both to
support igt-tests, and compositors which delay decisions about which
client buffer to display), and a sw_sync ioctl to read back the
deadline. IGT tests utilizing these can be found at:
https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadl…
v1: https://patchwork.freedesktop.org/series/93035/
v2: Move filtering out of later deadlines to fence implementation
to avoid increasing the size of dma_fence
v3: Add support in fence-array and fence-chain; Add some uabi to
support igt tests and userspace compositors.
Rob Clark (9):
dma-fence: Add deadline awareness
drm/vblank: Add helper to get next vblank time
drm/atomic-helper: Set fence deadline for vblank
drm/scheduler: Add fence deadline support
drm/msm: Add deadline based boost support
dma-buf/fence-array: Add fence deadline support
dma-buf/fence-chain: Add fence deadline support
dma-buf/sync_file: Add SET_DEADLINE ioctl
dma-buf/sw_sync: Add fence deadline support
drivers/dma-buf/dma-fence-array.c | 11 ++++
drivers/dma-buf/dma-fence-chain.c | 13 +++++
drivers/dma-buf/dma-fence.c | 20 +++++++
drivers/dma-buf/sw_sync.c | 58 +++++++++++++++++++
drivers/dma-buf/sync_debug.h | 2 +
drivers/dma-buf/sync_file.c | 19 +++++++
drivers/gpu/drm/drm_atomic_helper.c | 36 ++++++++++++
drivers/gpu/drm/drm_vblank.c | 32 +++++++++++
drivers/gpu/drm/msm/msm_fence.c | 76 +++++++++++++++++++++++++
drivers/gpu/drm/msm/msm_fence.h | 20 +++++++
drivers/gpu/drm/msm/msm_gpu.h | 1 +
drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++++++
drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++
drivers/gpu/drm/scheduler/sched_main.c | 2 +-
include/drm/drm_vblank.h | 1 +
include/drm/gpu_scheduler.h | 8 +++
include/linux/dma-fence.h | 16 ++++++
include/uapi/linux/sync_file.h | 20 +++++++
18 files changed, 388 insertions(+), 1 deletion(-)
--
2.31.1
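
The v2 changelog entry about filtering out later deadlines in the fence implementation could look roughly like the model below. This is an assumption-laden sketch, not the series' code: `struct model_fence_impl` and `model_set_deadline` are invented names, and the state lives in the implementation's own structure so that the common fence structure does not grow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-implementation fence state; names are illustrative. */
struct model_fence_impl {
    bool has_deadline;
    uint64_t deadline_ns;   /* earliest deadline requested so far */
};

/*
 * Records a deadline hint. Returns true if the stored deadline moved
 * earlier (so the signaller may want to boost), false if the hint was
 * not earlier than what is already known and was ignored.
 */
static bool model_set_deadline(struct model_fence_impl *f, uint64_t t_ns)
{
    if (f->has_deadline && t_ns >= f->deadline_ns)
        return false;

    f->has_deadline = true;
    f->deadline_ns = t_ns;
    return true;
}
```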
> > > +static int bcmasp_set_priv_flags(struct net_device *dev, u32 flags)
> > > +{
> > > + struct bcmasp_intf *intf = netdev_priv(dev);
> > > +
> > > + intf->wol_keep_rx_en = flags & BCMASP_WOL_KEEP_RX_EN ? 1 : 0;
> > > +
> > > + return 0;
> >
> > Please could you explain this some more. How can you disable RX and
> > still have WoL working?
>
> Wake-on-LAN using Magic Packets and network filters requires keeping the
> UniMAC's receiver turned on, and then the packets feed into the Magic Packet
> Detector (MPD) block or the network filter block. In that mode DRAM is in
> self refresh and there is local matching of frames into a tiny FIFO however
> in the case of magic packets the packets leading to a wake-up are dropped as
> there is nowhere to store them. In the case of a network filter match (e.g.:
> matching a multicast IP address plus protocol, plus source/destination
> ports) the packets are also discarded because the receive DMA was shut down.
>
> When the wol_keep_rx_en flag is set, the above happens but we also allow the
> packets that did match a network filter to reach the small FIFO (Justin
> would know how many entries are there) that is used to push the packets to
> DRAM. The packet contents are held in there until the system wakes up which
> is usually just a few hundreds of micro seconds after we received a packet
> that triggered a wake-up. Once we overflow the receive DMA FIFO capacity
> subsequent packets get dropped which is fine since we are usually talking
> about very low bit rates, and we only try to push to DRAM the packets of
> interest, that is those for which we have a network filter.
>
> This is convenient in scenarios where you want to wake-up from multicast DNS
> (e.g.: wake on Googlecast, Bonjour etc.) because then the packet that
> resulted in the system wake-up is not discarded but is then delivered to the
> network stack.
Thanks for the explanation. It would be easier for the user if you
automate this. Enable it by default for WoL types which have user
content?
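
One way to read that suggestion is sketched below: instead of a private flag, the keep-RX behaviour is derived from the configured wake-up options. The helper name is hypothetical, and the bit values are local copies modelled on ethtool's `WAKE_MAGIC` and `WAKE_FILTER`; a real driver would take them from `<linux/ethtool.h>`.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative bit values modelled on ethtool's WAKE_* options; a real
 * driver would use the definitions from <linux/ethtool.h>.
 */
#define MODEL_WAKE_MAGIC   (1u << 5)
#define MODEL_WAKE_FILTER  (1u << 7)

/*
 * A magic packet that wakes the system is dropped anyway, while a
 * network filter match is a packet the stack may want after wake-up,
 * so only filter-based wake-up asks for RX to be kept enabled.
 */
static bool model_wol_keep_rx(uint32_t wolopts)
{
    return (wolopts & MODEL_WAKE_FILTER) != 0;
}
```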
> > > + /* Per ch */
> > > + intf->tx_spb_dma = priv->base + TX_SPB_DMA_OFFSET(intf);
> > > + intf->res.tx_spb_ctrl = priv->base + TX_SPB_CTRL_OFFSET(intf);
> > > + /*
> > > + * Stop gap solution. This should be removed when 72165a0 is
> > > + * deprecated
> > > + */
> >
> > Is that an internal commit?
>
> Yes this is a revision of the silicon that is not meant to see the light of
> day.
So this can all be removed?
Andrew
On Fri, 24 Sep 2021 14:44:48 -0700, Justin Chen wrote:
> The ASP 2.0 Ethernet controller uses a brcm unimac.
>
> Signed-off-by: Justin Chen <justinpopo6(a)gmail.com>
> Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
> ---
> Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml | 1 +
> 1 file changed, 1 insertion(+)
>
Running 'make dtbs_check' with the schema in this patch gives the
following warnings. Consider if they are expected or the schema is
incorrect. These may not be new warnings.
Note that it is not yet a requirement to have 0 warnings for dtbs_check.
This will change in the future.
Full log is available here: https://patchwork.ozlabs.org/patch/1532529
mdio@e14: #address-cells:0:0: 1 was expected
arch/arm64/boot/dts/broadcom/bcm2711-rpi-400.dt.yaml
arch/arm64/boot/dts/broadcom/bcm2711-rpi-4-b.dt.yaml
arch/arm/boot/dts/bcm2711-rpi-400.dt.yaml
arch/arm/boot/dts/bcm2711-rpi-4-b.dt.yaml
mdio@e14: #size-cells:0:0: 0 was expected
arch/arm64/boot/dts/broadcom/bcm2711-rpi-400.dt.yaml
arch/arm64/boot/dts/broadcom/bcm2711-rpi-4-b.dt.yaml
arch/arm/boot/dts/bcm2711-rpi-400.dt.yaml
arch/arm/boot/dts/bcm2711-rpi-4-b.dt.yaml
On Fri, 24 Sep 2021 14:44:47 -0700, Justin Chen wrote:
> From: Florian Fainelli <f.fainelli(a)gmail.com>
>
> Add a binding document for the Broadcom ASP 2.0 Ethernet controller.
>
> Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
> Signed-off-by: Justin Chen <justinpopo6(a)gmail.com>
> ---
> .../devicetree/bindings/net/brcm,asp-v2.0.yaml | 147 +++++++++++++++++++++
> 1 file changed, 147 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml
>
My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):
yamllint warnings/errors:
./Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml:79:10: [warning] wrong indentation: expected 10 but found 9 (indentation)
dtschema/dtc warnings/errors:
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/net/brcm,asp-v2.0.example.dt.yaml: asp@9c00000: 'mdio@c614', 'mdio@ce14' do not match any of the regexes: 'pinctrl-[0-9]+'
From schema: /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml
Documentation/devicetree/bindings/net/brcm,asp-v2.0.example.dt.yaml:0:0: /example-0/asp@9c00000/mdio@c614: failed to match any schema with compatible: ['brcm,asp-v2.0-mdio']
Documentation/devicetree/bindings/net/brcm,asp-v2.0.example.dt.yaml:0:0: /example-0/asp@9c00000/mdio@ce14: failed to match any schema with compatible: ['brcm,asp-v2.0-mdio']
doc reference errors (make refcheckdocs):
See https://patchwork.ozlabs.org/patch/1532528
This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.
If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:
pip3 install dtschema --upgrade
Please check and re-submit.