Linaro-mm-sig

linaro-mm-sig@lists.linaro.org

8 participants
3221 discussions

Re: [PATCH 1/4] dma-buf/fence: give some reasonable maximum signaling timeout

by Christian König

On 11/25/25 08:55, Philipp Stanner wrote: > On Thu, 2025-11-20 at 15:41 +0100, Christian König wrote: >> Add a define implementations can use as reasonable maximum signaling >> timeout. Document that if this timeout is exceeded by config options >> implementations should taint the kernel. >> >> Tainting the kernel is important for bug reports to detect that end >> users might be using a problematic configuration. >> >> Signed-off-by: Christian König <christian.koenig(a)amd.com> >> --- >> include/linux/dma-fence.h | 14 ++++++++++++++ >> 1 file changed, 14 insertions(+) >> >> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h >> index 64639e104110..b31dfa501c84 100644 >> --- a/include/linux/dma-fence.h >> +++ b/include/linux/dma-fence.h >> @@ -28,6 +28,20 @@ struct dma_fence_ops; >> struct dma_fence_cb; >> struct seq_file; >> >> +/** >> + * define DMA_FENCE_MAX_REASONABLE_TIMEOUT - max reasonable signaling timeout >> + * >> + * The dma_fence object has a deep inter dependency with core memory >> + * management, for a detailed explanation see section DMA Fences under >> + * Documentation/driver-api/dma-buf.rst. >> + * >> + * Because of this all dma_fence implementations must guarantee that each fence >> + * completes in a finite time. This define here now gives a reasonable value for >> + * the timeout to use. It is possible to use a longer timeout in an >> + * implementation but that should taint the kernel. >> + */ >> +#define DMA_FENCE_MAX_REASONABLE_TIMEOUT (2*HZ) > > HZ can change depending on the config. Is that really a good choice? I > could see racy situations arising in some configs vs others 2*HZ is always two seconds expressed in number of jiffies, I can use msecs_to_jiffies(2000) to make that more obvious. The GPU scheduler has a very similar define, MAX_WAIT_SCHED_ENTITY_Q_EMPTY which is currently just 1 second. The real question is what is the maximum amount of time we can wait for the HW before we should trigger a timeout? Some AMD internal team is pushing for 10 seconds, but that also means that for example we wait 10 seconds for the OOM killer to do something. That sounds like way to long. Regards, Christian. > > P.

1 week, 6 days

Re: 答复: [External Mail]Re: [PATCH] dma-buf: add some tracepoints to debug.

by Christian König

On 11/25/25 03:01, 高翔 wrote: >> In general quite nice to have, but this needs a bit more explanation why you need it. > > I want to track the status of dmabuf in real time in the production environment. Please add that to the commit message. > >> I have strongly doubts that this is your legal name :) >> Please google the requirements for a Signed-off-by line. > That's my real name, gaoxiang. However, there are quite a few people with the same name in the company, and I rank 17th. It doesn't matter what your eMail address is, but the name in a Signed-off-by must be your legal name. In other words the same what you would use on a contract, usually including first and last name. It could be that gaoxiang is perfectly fine in your restrication, but the number 17 certainly looks odd. > And the previous patches all used this name. > >> dmabuf->name can't be accessed without holding the appropriate lock, same of all other cases. > ok, thanks. It can be accessed with holding dmabuf->name_lock. That should work, yes. Regards, Christian. > > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > *发件人:* Christian König <christian.koenig(a)amd.com> > *发送时间:* 2025年11月24日 22:21:57 > *收件人:* Xiang Gao; sumit.semwal(a)linaro.org; rostedt(a)goodmis.org; mhiramat(a)kernel.org > *抄送:* linux-media(a)vger.kernel.org; dri-devel(a)lists.freedesktop.org; linaro-mm-sig(a)lists.linaro.org; linux-kernel(a)vger.kernel.org; mathieu.desnoyers(a)efficios.com; dhowells(a)redhat.com; kuba(a)kernel.org; brauner(a)kernel.org; akpm(a)linux-foundation.org; linux-trace-kernel(a)vger.kernel.org; 高翔 > *主题:* [External Mail]Re: [PATCH] dma-buf: add some tracepoints to debug. > > [外部邮件] 此邮件来源于小米公司外部，请谨慎处理。若对邮件安全性存疑，请将邮件转发给misec(a)xiaomi.xn--com-iw3ew31vyqjqpq > > On 11/24/25 14:36, Xiang Gao wrote: >> From: gaoxiang17 <gaoxiang17(a)xiaomi.com> >> >> With these tracepoints, we can track dmabuf in real time. >> >> For example: >> binder:3025_3-10524 [000] ..... 553.310313: dma_buf_export: exp_name=qcom,system name=(null) size=12771328 ino=2799 >> binder:3025_3-10524 [000] ..... 553.310318: dma_buf_fd: exp_name=qcom,system name=(null) size=12771328 ino=2799 fd=8 >> RenderThread-9307 [000] ..... 553.310869: dma_buf_get: exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 fd=673 f_ref=4 >> RenderThread-9307 [000] ..... 553.310871: dma_buf_attach: dev_name=kgsl-3d0 exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 >> RenderThread-9307 [000] ..... 553.310946: dma_buf_mmap_internal: exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 >> RenderThread-9307 [004] ..... 553.315084: dma_buf_detach: exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 >> RenderThread-9307 [004] ..... 553.315084: dma_buf_put: exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 f_ref=5 > > In general quite nice to have, but this needs a bit more explanation why you need it. > >> >> Signed-off-by: gaoxiang17 <gaoxiang17(a)xiaomi.com> > > I have strongly doubts that this is your legal name :) > > Please google the requirements for a Signed-off-by line. > >> --- >> drivers/dma-buf/dma-buf.c | 19 +++ >> include/trace/events/dma_buf.h | 245 +++++++++++++++++++++++++++++++++ >> 2 files changed, 264 insertions(+) >> create mode 100644 include/trace/events/dma_buf.h >> >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >> index 2bcf9ceca997..8b5af73f0218 100644 >> --- a/drivers/dma-buf/dma-buf.c >> +++ b/drivers/dma-buf/dma-buf.c >> @@ -35,6 +35,9 @@ >> >> #include "dma-buf-sysfs-stats.h" >> >> +#define CREATE_TRACE_POINTS >> +#include <trace/events/dma_buf.h> >> + >> static inline int is_dma_buf_file(struct file *); >> >> static DEFINE_MUTEX(dmabuf_list_mutex); >> @@ -220,6 +223,8 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma) >> dmabuf->size >> PAGE_SHIFT) >> return -EINVAL; >> >> + trace_dma_buf_mmap_internal(dmabuf); >> + >> return dmabuf->ops->mmap(dmabuf, vma); >> } >> >> @@ -745,6 +750,8 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) >> >> __dma_buf_list_add(dmabuf); >> >> + trace_dma_buf_export(dmabuf); >> + >> return dmabuf; >> >> err_dmabuf: >> @@ -779,6 +786,8 @@ int dma_buf_fd(struct dma_buf *dmabuf, int flags) >> >> fd_install(fd, dmabuf->file); >> >> + trace_dma_buf_fd(dmabuf, fd); >> + >> return fd; >> } >> EXPORT_SYMBOL_NS_GPL(dma_buf_fd, "DMA_BUF"); >> @@ -805,6 +814,8 @@ struct dma_buf *dma_buf_get(int fd) >> return ERR_PTR(-EINVAL); >> } >> >> + trace_dma_buf_get(fd, file); >> + >> return file->private_data; >> } >> EXPORT_SYMBOL_NS_GPL(dma_buf_get, "DMA_BUF"); >> @@ -825,6 +836,8 @@ void dma_buf_put(struct dma_buf *dmabuf) >> return; >> >> fput(dmabuf->file); >> + >> + trace_dma_buf_put(dmabuf); >> } >> EXPORT_SYMBOL_NS_GPL(dma_buf_put, "DMA_BUF"); >> >> @@ -998,6 +1011,8 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, "DMA_BUF"); >> struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, >> struct device *dev) >> { >> + trace_dma_buf_attach(dmabuf, dev); >> + >> return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL); >> } >> EXPORT_SYMBOL_NS_GPL(dma_buf_attach, "DMA_BUF"); >> @@ -1024,6 +1039,8 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) >> dmabuf->ops->detach(dmabuf, attach); >> >> kfree(attach); >> + >> + trace_dma_buf_detach(dmabuf); >> } >> EXPORT_SYMBOL_NS_GPL(dma_buf_detach, "DMA_BUF"); >> >> @@ -1488,6 +1505,8 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma, >> vma_set_file(vma, dmabuf->file); >> vma->vm_pgoff = pgoff; >> >> + trace_dma_buf_mmap(dmabuf); >> + >> return dmabuf->ops->mmap(dmabuf, vma); >> } >> EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, "DMA_BUF"); >> diff --git a/include/trace/events/dma_buf.h b/include/trace/events/dma_buf.h >> new file mode 100644 >> index 000000000000..796ae444f6ae >> --- /dev/null >> +++ b/include/trace/events/dma_buf.h >> @@ -0,0 +1,245 @@ >> +/* SPDX-License-Identifier: GPL-2.0 */ >> +#undef TRACE_SYSTEM >> +#define TRACE_SYSTEM dma_buf >> + >> +#if !defined(_TRACE_DMA_BUF_H) || defined(TRACE_HEADER_MULTI_READ) >> +#define _TRACE_DMA_BUF_H >> + >> +#include <linux/dma-buf.h> >> +#include <linux/tracepoint.h> >> + >> +TRACE_EVENT(dma_buf_export, >> + >> + TP_PROTO(struct dma_buf *dmabuf), >> + >> + TP_ARGS(dmabuf), >> + >> + TP_STRUCT__entry( >> + __string(exp_name, dmabuf->exp_name) >> + __string(name, dmabuf->name) > > dmabuf->name can't be accessed without holding the appropriate lock, same of all other cases. > > Regards, > Christian. > >> + __field(size_t, size) >> + __field(ino_t, ino) >> + ), >> + >> + TP_fast_assign( >> + __assign_str(exp_name); >> + __assign_str(name); >> + __entry->size = dmabuf->size; >> + __entry->ino = dmabuf->file->f_inode->i_ino; >> + ), >> + >> + TP_printk("exp_name=%s name=%s size=%zu ino=%lu", >> + __get_str(exp_name), >> + __get_str(name), >> + __entry->size, >> + __entry->ino) >> +); >> + >> +TRACE_EVENT(dma_buf_fd, >> + >> + TP_PROTO(struct dma_buf *dmabuf, int fd), >> + >> + TP_ARGS(dmabuf, fd), >> + >> + TP_STRUCT__entry( >> + __string(exp_name, dmabuf->exp_name) >> + __string(name, dmabuf->name) >> + __field(size_t, size) >> + __field(ino_t, ino) >> + __field(int, fd) >> + ), >> + >> + TP_fast_assign( >> + __assign_str(exp_name); >> + __assign_str(name); >> + __entry->size = dmabuf->size; >> + __entry->ino = dmabuf->file->f_inode->i_ino; >> + __entry->fd = fd; >> + ), >> + >> + TP_printk("exp_name=%s name=%s size=%zu ino=%lu fd=%d", >> + __get_str(exp_name), >> + __get_str(name), >> + __entry->size, >> + __entry->ino, >> + __entry->fd) >> +); >> + >> +TRACE_EVENT(dma_buf_mmap_internal, >> + >> + TP_PROTO(struct dma_buf *dmabuf), >> + >> + TP_ARGS(dmabuf), >> + >> + TP_STRUCT__entry( >> + __string(exp_name, dmabuf->exp_name) >> + __string(name, dmabuf->name) >> + __field(size_t, size) >> + __field(ino_t, ino) >> + ), >> + >> + TP_fast_assign( >> + __assign_str(exp_name); >> + __assign_str(name); >> + __entry->size = dmabuf->size; >> + __entry->ino = dmabuf->file->f_inode->i_ino; >> + ), >> + >> + TP_printk("exp_name=%s name=%s size=%zu ino=%lu", >> + __get_str(exp_name), >> + __get_str(name), >> + __entry->size, >> + __entry->ino) >> +); >> + >> +TRACE_EVENT(dma_buf_mmap, >> + >> + TP_PROTO(struct dma_buf *dmabuf), >> + >> + TP_ARGS(dmabuf), >> + >> + TP_STRUCT__entry( >> + __string(exp_name, dmabuf->exp_name) >> + __string(name, dmabuf->name) >> + __field(size_t, size) >> + __field(ino_t, ino) >> + ), >> + >> + TP_fast_assign( >> + __assign_str(exp_name); >> + __assign_str(name); >> + __entry->size = dmabuf->size; >> + __entry->ino = dmabuf->file->f_inode->i_ino; >> + ), >> + >> + TP_printk("exp_name=%s name=%s size=%zu ino=%lu", >> + __get_str(exp_name), >> + __get_str(name), >> + __entry->size, >> + __entry->ino) >> +); >> + >> +TRACE_EVENT(dma_buf_attach, >> + >> + TP_PROTO(struct dma_buf *dmabuf, struct device *dev), >> + >> + TP_ARGS(dmabuf, dev), >> + >> + TP_STRUCT__entry( >> + __string(dname, dev_name(dev)) >> + __string(exp_name, dmabuf->exp_name) >> + __string(name, dmabuf->name) >> + __field(size_t, size) >> + __field(ino_t, ino) >> + ), >> + >> + TP_fast_assign( >> + __assign_str(dname); >> + __assign_str(exp_name); >> + __assign_str(name); >> + __entry->size = dmabuf->size; >> + __entry->ino = dmabuf->file->f_inode->i_ino; >> + ), >> + >> + TP_printk("dev_name=%s exp_name=%s name=%s size=%zu ino=%lu", >> + __get_str(dname), >> + __get_str(exp_name), >> + __get_str(name), >> + __entry->size, >> + __entry->ino) >> +); >> + >> +TRACE_EVENT(dma_buf_detach, >> + >> + TP_PROTO(struct dma_buf *dmabuf), >> + >> + TP_ARGS(dmabuf), >> + >> + TP_STRUCT__entry( >> + __string(exp_name, dmabuf->exp_name) >> + __string(name, dmabuf->name) >> + __field(size_t, size) >> + __field(ino_t, ino) >> + ), >> + >> + TP_fast_assign( >> + __assign_str(exp_name); >> + __assign_str(name); >> + __entry->size = dmabuf->size; >> + __entry->ino = dmabuf->file->f_inode->i_ino; >> + ), >> + >> + TP_printk("exp_name=%s name=%s size=%zu ino=%lu", >> + __get_str(exp_name), >> + __get_str(name), >> + __entry->size, >> + __entry->ino) >> +); >> + >> +TRACE_EVENT(dma_buf_get, >> + >> + TP_PROTO(int fd, struct file *file), >> + >> + TP_ARGS(fd, file), >> + >> + TP_STRUCT__entry( >> + __string(exp_name, ((struct dma_buf *)file->private_data)->exp_name) >> + __string(name, ((struct dma_buf *)file->private_data)->name) >> + __field(size_t, size) >> + __field(ino_t, ino) >> + __field(int, fd) >> + __field(long, f_ref) >> + ), >> + >> + TP_fast_assign( >> + __assign_str(exp_name); >> + __assign_str(name); >> + __entry->size = ((struct dma_buf *)file->private_data)->size; >> + __entry->ino = ((struct dma_buf *)file->private_data)->file->f_inode->i_ino; >> + __entry->fd = fd; >> + __entry->f_ref = file_ref_get(&file->f_ref); >> + ), >> + >> + TP_printk("exp_name=%s name=%s size=%zu ino=%lu fd=%d f_ref=%ld", >> + __get_str(exp_name), >> + __get_str(name), >> + __entry->size, >> + __entry->ino, >> + __entry->fd, >> + __entry->f_ref) >> +); >> + >> +TRACE_EVENT(dma_buf_put, >> + >> + TP_PROTO(struct dma_buf *dmabuf), >> + >> + TP_ARGS(dmabuf), >> + >> + TP_STRUCT__entry( >> + __string(exp_name, dmabuf->exp_name) >> + __string(name, dmabuf->name) >> + __field(size_t, size) >> + __field(ino_t, ino) >> + __field(long, f_ref) >> + ), >> + >> + TP_fast_assign( >> + __assign_str(exp_name); >> + __assign_str(name); >> + __entry->size = dmabuf->size; >> + __entry->ino = dmabuf->file->f_inode->i_ino; >> + __entry->f_ref = file_ref_get(&dmabuf->file->f_ref); >> + ), >> + >> + TP_printk("exp_name=%s name=%s size=%zu ino=%lu f_ref=%ld", >> + __get_str(exp_name), >> + __get_str(name), >> + __entry->size, >> + __entry->ino, >> + __entry->f_ref) >> +); >> + >> +#endif /* _TRACE_DMA_BUF_H */ >> + >> +/* This part must be outside protection */ >> +#include <trace/define_trace.h> >

1 week, 6 days

Reasonable maximum signaling timeout for dma_fences

by Christian König

Hi everybody, we have documented here https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html#dma-fence-cr… that dma_fence objects must signal in a reasonable amount of time, but at the same time note that drivers might have a different idea of what reasonable means. Recently I realized that this is actually not a good idea. Background is that the wall clock timeout means that for example the OOM killer might actually wait for this timeout to be able to terminate a process and reclaim the memory used. And this is just an example of how general kernel features might depend on that. Some drivers and fence implementations used 10 seconds and that raised complains by end users. So at least amdgpu recently switched to 2 second which triggered an internal discussion about it. This patch set here now adds a define to the dma_fence header which gives 2 seconds as reasonable amount of time. SW-sync is modified to always taint the kernel (since it doesn't has a timeout), VGEM is switched over to the new define and the scheduler gets a warning and taints the kernel if a driver uses a timeout longer than that. I have not much intention of actually committing the patches (maybe except the SW-sync one), but question is if 2 seconds are reasonable? Regards, Christian.

1 week, 6 days

Re: [PATCH] dma-buf: add some tracepoints to debug.

by Christian König

On 11/24/25 14:36, Xiang Gao wrote: > From: gaoxiang17 <gaoxiang17(a)xiaomi.com> > > With these tracepoints, we can track dmabuf in real time. > > For example: > binder:3025_3-10524 [000] ..... 553.310313: dma_buf_export: exp_name=qcom,system name=(null) size=12771328 ino=2799 > binder:3025_3-10524 [000] ..... 553.310318: dma_buf_fd: exp_name=qcom,system name=(null) size=12771328 ino=2799 fd=8 > RenderThread-9307 [000] ..... 553.310869: dma_buf_get: exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 fd=673 f_ref=4 > RenderThread-9307 [000] ..... 553.310871: dma_buf_attach: dev_name=kgsl-3d0 exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 > RenderThread-9307 [000] ..... 553.310946: dma_buf_mmap_internal: exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 > RenderThread-9307 [004] ..... 553.315084: dma_buf_detach: exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 > RenderThread-9307 [004] ..... 553.315084: dma_buf_put: exp_name=qcom,system name=blastBufferQueue for scaleUpDow size=12771328 ino=2799 f_ref=5 In general quite nice to have, but this needs a bit more explanation why you need it. > > Signed-off-by: gaoxiang17 <gaoxiang17(a)xiaomi.com> I have strongly doubts that this is your legal name :) Please google the requirements for a Signed-off-by line. > --- > drivers/dma-buf/dma-buf.c | 19 +++ > include/trace/events/dma_buf.h | 245 +++++++++++++++++++++++++++++++++ > 2 files changed, 264 insertions(+) > create mode 100644 include/trace/events/dma_buf.h > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 2bcf9ceca997..8b5af73f0218 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -35,6 +35,9 @@ > > #include "dma-buf-sysfs-stats.h" > > +#define CREATE_TRACE_POINTS > +#include <trace/events/dma_buf.h> > + > static inline int is_dma_buf_file(struct file *); > > static DEFINE_MUTEX(dmabuf_list_mutex); > @@ -220,6 +223,8 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma) > dmabuf->size >> PAGE_SHIFT) > return -EINVAL; > > + trace_dma_buf_mmap_internal(dmabuf); > + > return dmabuf->ops->mmap(dmabuf, vma); > } > > @@ -745,6 +750,8 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) > > __dma_buf_list_add(dmabuf); > > + trace_dma_buf_export(dmabuf); > + > return dmabuf; > > err_dmabuf: > @@ -779,6 +786,8 @@ int dma_buf_fd(struct dma_buf *dmabuf, int flags) > > fd_install(fd, dmabuf->file); > > + trace_dma_buf_fd(dmabuf, fd); > + > return fd; > } > EXPORT_SYMBOL_NS_GPL(dma_buf_fd, "DMA_BUF"); > @@ -805,6 +814,8 @@ struct dma_buf *dma_buf_get(int fd) > return ERR_PTR(-EINVAL); > } > > + trace_dma_buf_get(fd, file); > + > return file->private_data; > } > EXPORT_SYMBOL_NS_GPL(dma_buf_get, "DMA_BUF"); > @@ -825,6 +836,8 @@ void dma_buf_put(struct dma_buf *dmabuf) > return; > > fput(dmabuf->file); > + > + trace_dma_buf_put(dmabuf); > } > EXPORT_SYMBOL_NS_GPL(dma_buf_put, "DMA_BUF"); > > @@ -998,6 +1011,8 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, "DMA_BUF"); > struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > struct device *dev) > { > + trace_dma_buf_attach(dmabuf, dev); > + > return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL); > } > EXPORT_SYMBOL_NS_GPL(dma_buf_attach, "DMA_BUF"); > @@ -1024,6 +1039,8 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) > dmabuf->ops->detach(dmabuf, attach); > > kfree(attach); > + > + trace_dma_buf_detach(dmabuf); > } > EXPORT_SYMBOL_NS_GPL(dma_buf_detach, "DMA_BUF"); > > @@ -1488,6 +1505,8 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma, > vma_set_file(vma, dmabuf->file); > vma->vm_pgoff = pgoff; > > + trace_dma_buf_mmap(dmabuf); > + > return dmabuf->ops->mmap(dmabuf, vma); > } > EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, "DMA_BUF"); > diff --git a/include/trace/events/dma_buf.h b/include/trace/events/dma_buf.h > new file mode 100644 > index 000000000000..796ae444f6ae > --- /dev/null > +++ b/include/trace/events/dma_buf.h > @@ -0,0 +1,245 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#undef TRACE_SYSTEM > +#define TRACE_SYSTEM dma_buf > + > +#if !defined(_TRACE_DMA_BUF_H) || defined(TRACE_HEADER_MULTI_READ) > +#define _TRACE_DMA_BUF_H > + > +#include <linux/dma-buf.h> > +#include <linux/tracepoint.h> > + > +TRACE_EVENT(dma_buf_export, > + > + TP_PROTO(struct dma_buf *dmabuf), > + > + TP_ARGS(dmabuf), > + > + TP_STRUCT__entry( > + __string(exp_name, dmabuf->exp_name) > + __string(name, dmabuf->name) dmabuf->name can't be accessed without holding the appropriate lock, same of all other cases. Regards, Christian. > + __field(size_t, size) > + __field(ino_t, ino) > + ), > + > + TP_fast_assign( > + __assign_str(exp_name); > + __assign_str(name); > + __entry->size = dmabuf->size; > + __entry->ino = dmabuf->file->f_inode->i_ino; > + ), > + > + TP_printk("exp_name=%s name=%s size=%zu ino=%lu", > + __get_str(exp_name), > + __get_str(name), > + __entry->size, > + __entry->ino) > +); > + > +TRACE_EVENT(dma_buf_fd, > + > + TP_PROTO(struct dma_buf *dmabuf, int fd), > + > + TP_ARGS(dmabuf, fd), > + > + TP_STRUCT__entry( > + __string(exp_name, dmabuf->exp_name) > + __string(name, dmabuf->name) > + __field(size_t, size) > + __field(ino_t, ino) > + __field(int, fd) > + ), > + > + TP_fast_assign( > + __assign_str(exp_name); > + __assign_str(name); > + __entry->size = dmabuf->size; > + __entry->ino = dmabuf->file->f_inode->i_ino; > + __entry->fd = fd; > + ), > + > + TP_printk("exp_name=%s name=%s size=%zu ino=%lu fd=%d", > + __get_str(exp_name), > + __get_str(name), > + __entry->size, > + __entry->ino, > + __entry->fd) > +); > + > +TRACE_EVENT(dma_buf_mmap_internal, > + > + TP_PROTO(struct dma_buf *dmabuf), > + > + TP_ARGS(dmabuf), > + > + TP_STRUCT__entry( > + __string(exp_name, dmabuf->exp_name) > + __string(name, dmabuf->name) > + __field(size_t, size) > + __field(ino_t, ino) > + ), > + > + TP_fast_assign( > + __assign_str(exp_name); > + __assign_str(name); > + __entry->size = dmabuf->size; > + __entry->ino = dmabuf->file->f_inode->i_ino; > + ), > + > + TP_printk("exp_name=%s name=%s size=%zu ino=%lu", > + __get_str(exp_name), > + __get_str(name), > + __entry->size, > + __entry->ino) > +); > + > +TRACE_EVENT(dma_buf_mmap, > + > + TP_PROTO(struct dma_buf *dmabuf), > + > + TP_ARGS(dmabuf), > + > + TP_STRUCT__entry( > + __string(exp_name, dmabuf->exp_name) > + __string(name, dmabuf->name) > + __field(size_t, size) > + __field(ino_t, ino) > + ), > + > + TP_fast_assign( > + __assign_str(exp_name); > + __assign_str(name); > + __entry->size = dmabuf->size; > + __entry->ino = dmabuf->file->f_inode->i_ino; > + ), > + > + TP_printk("exp_name=%s name=%s size=%zu ino=%lu", > + __get_str(exp_name), > + __get_str(name), > + __entry->size, > + __entry->ino) > +); > + > +TRACE_EVENT(dma_buf_attach, > + > + TP_PROTO(struct dma_buf *dmabuf, struct device *dev), > + > + TP_ARGS(dmabuf, dev), > + > + TP_STRUCT__entry( > + __string(dname, dev_name(dev)) > + __string(exp_name, dmabuf->exp_name) > + __string(name, dmabuf->name) > + __field(size_t, size) > + __field(ino_t, ino) > + ), > + > + TP_fast_assign( > + __assign_str(dname); > + __assign_str(exp_name); > + __assign_str(name); > + __entry->size = dmabuf->size; > + __entry->ino = dmabuf->file->f_inode->i_ino; > + ), > + > + TP_printk("dev_name=%s exp_name=%s name=%s size=%zu ino=%lu", > + __get_str(dname), > + __get_str(exp_name), > + __get_str(name), > + __entry->size, > + __entry->ino) > +); > + > +TRACE_EVENT(dma_buf_detach, > + > + TP_PROTO(struct dma_buf *dmabuf), > + > + TP_ARGS(dmabuf), > + > + TP_STRUCT__entry( > + __string(exp_name, dmabuf->exp_name) > + __string(name, dmabuf->name) > + __field(size_t, size) > + __field(ino_t, ino) > + ), > + > + TP_fast_assign( > + __assign_str(exp_name); > + __assign_str(name); > + __entry->size = dmabuf->size; > + __entry->ino = dmabuf->file->f_inode->i_ino; > + ), > + > + TP_printk("exp_name=%s name=%s size=%zu ino=%lu", > + __get_str(exp_name), > + __get_str(name), > + __entry->size, > + __entry->ino) > +); > + > +TRACE_EVENT(dma_buf_get, > + > + TP_PROTO(int fd, struct file *file), > + > + TP_ARGS(fd, file), > + > + TP_STRUCT__entry( > + __string(exp_name, ((struct dma_buf *)file->private_data)->exp_name) > + __string(name, ((struct dma_buf *)file->private_data)->name) > + __field(size_t, size) > + __field(ino_t, ino) > + __field(int, fd) > + __field(long, f_ref) > + ), > + > + TP_fast_assign( > + __assign_str(exp_name); > + __assign_str(name); > + __entry->size = ((struct dma_buf *)file->private_data)->size; > + __entry->ino = ((struct dma_buf *)file->private_data)->file->f_inode->i_ino; > + __entry->fd = fd; > + __entry->f_ref = file_ref_get(&file->f_ref); > + ), > + > + TP_printk("exp_name=%s name=%s size=%zu ino=%lu fd=%d f_ref=%ld", > + __get_str(exp_name), > + __get_str(name), > + __entry->size, > + __entry->ino, > + __entry->fd, > + __entry->f_ref) > +); > + > +TRACE_EVENT(dma_buf_put, > + > + TP_PROTO(struct dma_buf *dmabuf), > + > + TP_ARGS(dmabuf), > + > + TP_STRUCT__entry( > + __string(exp_name, dmabuf->exp_name) > + __string(name, dmabuf->name) > + __field(size_t, size) > + __field(ino_t, ino) > + __field(long, f_ref) > + ), > + > + TP_fast_assign( > + __assign_str(exp_name); > + __assign_str(name); > + __entry->size = dmabuf->size; > + __entry->ino = dmabuf->file->f_inode->i_ino; > + __entry->f_ref = file_ref_get(&dmabuf->file->f_ref); > + ), > + > + TP_printk("exp_name=%s name=%s size=%zu ino=%lu f_ref=%ld", > + __get_str(exp_name), > + __get_str(name), > + __entry->size, > + __entry->ino, > + __entry->f_ref) > +); > + > +#endif /* _TRACE_DMA_BUF_H */ > + > +/* This part must be outside protection */ > +#include <trace/define_trace.h>

2 weeks

Re: [RFC v2 00/11] Add dmabuf read/write via io_uring

by Christian König

On 11/24/25 12:30, Pavel Begunkov wrote: > On 11/24/25 10:33, Christian König wrote: >> On 11/23/25 23:51, Pavel Begunkov wrote: >>> Picking up the work on supporting dmabuf in the read/write path. >> >> IIRC that work was completely stopped because it violated core dma_fence and DMA-buf rules and after some private discussion was considered not doable in general. >> >> Or am I mixing something up here? > > The time gap is purely due to me being busy. I wasn't CC'ed to those private > discussions you mentioned, but the v1 feedback was to use dynamic attachments > and avoid passing dma address arrays directly. > > https://lore.kernel.org/all/cover.1751035820.git.asml.silence@gmail.com/ > > I'm lost on what part is not doable. Can you elaborate on the core > dma-fence dma-buf rules? I most likely mixed that up, in other words that was a different discussion. When you use dma_fences to indicate async completion of events you need to be super duper careful that you only do this for in flight events, have the fence creation in the right order etc... For example once the fence is created you can't make any memory allocations any more, that's why we have this dance of reserving fence slots, creating the fence and then adding it. >> Since I don't see any dma_fence implementation at all that might actually be the case. > > See Patch 5, struct blk_mq_dma_fence. It's used in the move_notify > callback and is signaled when all inflight IO using the current > mapping are complete. All new IO requests will try to recreate the > mapping, and hence potentially wait with dma_resv_wait_timeout(). Without looking at the code that approach sounds more or less correct to me. >> On the other hand we have direct I/O from DMA-buf working for quite a while, just not upstream and without io_uring support. > > Have any reference? There is a WIP feature in AMDs GPU driver package for ROCm. But that can't be used as general purpose DMA-buf approach, because it makes use of internal knowledge about how the GPU driver is using the backing store. BTW when you use DMA addresses from DMA-buf always keep in mind that this memory can be written by others at the same time, e.g. you can't do things like compute a CRC first, then write to backing store and finally compare CRC. Regards, Christian.

2 weeks

Re: [RFC v2 00/11] Add dmabuf read/write via io_uring

by Christian König

On 11/23/25 23:51, Pavel Begunkov wrote: > Picking up the work on supporting dmabuf in the read/write path. IIRC that work was completely stopped because it violated core dma_fence and DMA-buf rules and after some private discussion was considered not doable in general. Or am I mixing something up here? Since I don't see any dma_fence implementation at all that might actually be the case. On the other hand we have direct I/O from DMA-buf working for quite a while, just not upstream and without io_uring support. Regards, Christian. > There > are two main changes. First, it doesn't pass a dma addresss directly by > rather wraps it into an opaque structure, which is extended and > understood by the target driver. > > The second big change is support for dynamic attachments, which added a > good part of complexity (see Patch 5). I kept the main machinery in nvme > at first, but move_notify can ask to kill the dma mapping asynchronously, > and any new IO would need to wait during submission, thus it was moved > to blk-mq. That also introduced an extra callback layer b/w driver and > blk-mq. > > There are some rough corners, and I'm not perfectly happy about the > complexity and layering. For v3 I'll try to move the waiting up in the > stack to io_uring wrapped into library helpers. > > For now, I'm interested what is the best way to test move_notify? And > how dma_resv_reserve_fences() errors should be handled in move_notify? > > The uapi didn't change, after registration it looks like a normal > io_uring registered buffer and can be used as such. Only non-vectored > fixed reads/writes are allowed. Pseudo code: > > // registration > reg_buf_idx = 0; > io_uring_update_buffer(ring, reg_buf_idx, { dma_buf_fd, file_fd }); > > // request creation > io_uring_prep_read_fixed(sqe, file_fd, buffer_offset, > buffer_size, file_offset, reg_buf_idx); > > And as previously, a good bunch of code was taken from Keith's series [1]. > > liburing based example: > > git: https://github.com/isilence/liburing.git dmabuf-rw > link: https://github.com/isilence/liburing/tree/dmabuf-rw > > [1] https://lore.kernel.org/io-uring/20220805162444.3985535-1-kbusch@fb.com/ > > Pavel Begunkov (11): > file: add callback for pre-mapping dmabuf > iov_iter: introduce iter type for pre-registered dma > block: move around bio flagging helpers > block: introduce dma token backed bio type > block: add infra to handle dmabuf tokens > nvme-pci: add support for dmabuf reggistration > nvme-pci: implement dma_token backed requests > io_uring/rsrc: add imu flags > io_uring/rsrc: extended reg buffer registration > io_uring/rsrc: add dmabuf-backed buffer registeration > io_uring/rsrc: implement dmabuf regbuf import > > block/Makefile | 1 + > block/bdev.c | 14 ++ > block/bio.c | 21 +++ > block/blk-merge.c | 23 +++ > block/blk-mq-dma-token.c | 236 +++++++++++++++++++++++++++++++ > block/blk-mq.c | 20 +++ > block/blk.h | 3 +- > block/fops.c | 3 + > drivers/nvme/host/pci.c | 217 ++++++++++++++++++++++++++++ > include/linux/bio.h | 49 ++++--- > include/linux/blk-mq-dma-token.h | 60 ++++++++ > include/linux/blk-mq.h | 21 +++ > include/linux/blk_types.h | 8 +- > include/linux/blkdev.h | 3 + > include/linux/dma_token.h | 35 +++++ > include/linux/fs.h | 4 + > include/linux/uio.h | 10 ++ > include/uapi/linux/io_uring.h | 13 +- > io_uring/rsrc.c | 201 +++++++++++++++++++++++--- > io_uring/rsrc.h | 23 ++- > io_uring/rw.c | 7 +- > lib/iov_iter.c | 30 +++- > 22 files changed, 948 insertions(+), 54 deletions(-) > create mode 100644 block/blk-mq-dma-token.c > create mode 100644 include/linux/blk-mq-dma-token.h > create mode 100644 include/linux/dma_token.h >

2 weeks

Re: [PATCH v3 27/28] drm/amdgpu: get rid of amdgpu_ttm_clear_buffer

by Christian König

On 11/21/25 11:12, Pierre-Eric Pelloux-Prayer wrote: > It's doing the same thing as amdgpu_fill_buffer(src_data=0), so drop it. > > The only caveat is that amdgpu_res_cleared() return value is only valid > right after allocation. > > --- > v2: introduce new "bool consider_clear_status" arg > --- > > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer(a)amd.com> It would be better to have that ealier in the patch set, but I guess that gives you rebasing problems? Christian. > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 16 ++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 90 +++++----------------- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 7 +- > 3 files changed, 33 insertions(+), 80 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > index 7d8d70135cc2..dccc31d0128e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > @@ -725,13 +725,17 @@ int amdgpu_bo_create(struct amdgpu_device *adev, > bo->tbo.resource->mem_type == TTM_PL_VRAM) { > struct dma_fence *fence; > > - r = amdgpu_ttm_clear_buffer(adev, bo, bo->tbo.base.resv, &fence); > + r = amdgpu_fill_buffer(adev, amdgpu_ttm_next_clear_entity(adev), > + bo, 0, NULL, &fence, > + true, AMDGPU_KERNEL_JOB_ID_TTM_CLEAR_BUFFER); > if (unlikely(r)) > goto fail_unreserve; > > - dma_resv_add_fence(bo->tbo.base.resv, fence, > - DMA_RESV_USAGE_KERNEL); > - dma_fence_put(fence); > + if (fence) { > + dma_resv_add_fence(bo->tbo.base.resv, fence, > + DMA_RESV_USAGE_KERNEL); > + dma_fence_put(fence); > + } > } > if (!bp->resv) > amdgpu_bo_unreserve(bo); > @@ -1323,8 +1327,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo) > goto out; > > r = amdgpu_fill_buffer(adev, amdgpu_ttm_next_clear_entity(adev), > - abo, 0, &bo->base._resv, > - &fence, AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE); > + abo, 0, &bo->base._resv, &fence, > + false, AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE); > if (WARN_ON(r)) > goto out; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index 39cfe2dbdf03..c65c411ce26e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -459,7 +459,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, > > r = amdgpu_fill_buffer(adev, entity, > abo, 0, NULL, &wipe_fence, > - AMDGPU_KERNEL_JOB_ID_MOVE_BLIT); > + false, AMDGPU_KERNEL_JOB_ID_MOVE_BLIT); > if (r) { > goto error; > } else if (wipe_fence) { > @@ -2459,79 +2459,28 @@ static int amdgpu_ttm_fill_mem(struct amdgpu_device *adev, > } > > /** > - * amdgpu_ttm_clear_buffer - clear memory buffers > + * amdgpu_fill_buffer - fill a buffer with a given value > * @adev: amdgpu device object > - * @bo: amdgpu buffer object > - * @resv: reservation object > - * @fence: dma_fence associated with the operation > + * @entity: optional entity to use. If NULL, the clearing entities will be > + * used to load-balance the partial clears > + * @bo: the bo to fill > + * @src_data: the value to set > + * @resv: fences contained in this reservation will be used as dependencies. > + * @out_fence: the fence from the last clear will be stored here. It might be > + * NULL if no job was run. > + * @dependency: optional input dependency fence. > + * @consider_clear_status: true if region reported as cleared by amdgpu_res_cleared() > + * are skipped. > + * @k_job_id: trace id > * > - * Clear the memory buffer resource. > - * > - * Returns: > - * 0 for success or a negative error code on failure. > */ > -int amdgpu_ttm_clear_buffer(struct amdgpu_device *adev, > - struct amdgpu_bo *bo, > - struct dma_resv *resv, > - struct dma_fence **fence) > -{ > - struct amdgpu_ttm_buffer_entity *entity; > - struct amdgpu_res_cursor cursor; > - u64 addr; > - int r = 0; > - > - if (!adev->mman.buffer_funcs_enabled) > - return -EINVAL; > - > - if (!fence) > - return -EINVAL; > - entity = &adev->mman.clear_entities[0]; > - *fence = dma_fence_get_stub(); > - > - amdgpu_res_first(bo->tbo.resource, 0, amdgpu_bo_size(bo), &cursor); > - > - mutex_lock(&entity->lock); > - while (cursor.remaining) { > - struct dma_fence *next = NULL; > - u64 size; > - > - if (amdgpu_res_cleared(&cursor)) { > - amdgpu_res_next(&cursor, cursor.size); > - continue; > - } > - > - /* Never clear more than 256MiB at once to avoid timeouts */ > - size = min(cursor.size, 256ULL << 20); > - > - r = amdgpu_ttm_map_buffer(adev, entity, > - &bo->tbo, bo->tbo.resource, &cursor, > - 1, false, false, &size, &addr); > - if (r) > - goto err; > - > - r = amdgpu_ttm_fill_mem(adev, entity, 0, addr, size, resv, > - &next, true, > - AMDGPU_KERNEL_JOB_ID_TTM_CLEAR_BUFFER); > - if (r) > - goto err; > - > - dma_fence_put(*fence); > - *fence = next; > - > - amdgpu_res_next(&cursor, size); > - } > -err: > - mutex_unlock(&entity->lock); > - > - return r; > -} > - > int amdgpu_fill_buffer(struct amdgpu_device *adev, > struct amdgpu_ttm_buffer_entity *entity, > struct amdgpu_bo *bo, > uint32_t src_data, > struct dma_resv *resv, > - struct dma_fence **f, > + struct dma_fence **out_fence, > + bool consider_clear_status, > u64 k_job_id) > { > struct dma_fence *fence = NULL; > @@ -2551,6 +2500,11 @@ int amdgpu_fill_buffer(struct amdgpu_device *adev, > struct dma_fence *next; > uint64_t cur_size, to; > > + if (consider_clear_status && amdgpu_res_cleared(&dst)) { > + amdgpu_res_next(&dst, dst.size); > + continue; > + } > + > /* Never fill more than 256MiB at once to avoid timeouts */ > cur_size = min(dst.size, 256ULL << 20); > > @@ -2574,9 +2528,7 @@ int amdgpu_fill_buffer(struct amdgpu_device *adev, > } > error: > mutex_unlock(&entity->lock); > - if (f) > - *f = dma_fence_get(fence); > - dma_fence_put(fence); > + *out_fence = fence; > return r; > } > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > index 653a4d17543e..f3bdbcec9afc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > @@ -181,16 +181,13 @@ int amdgpu_copy_buffer(struct amdgpu_device *adev, > struct dma_resv *resv, > struct dma_fence **fence, > bool vm_needs_flush, uint32_t copy_flags); > -int amdgpu_ttm_clear_buffer(struct amdgpu_device *adev, > - struct amdgpu_bo *bo, > - struct dma_resv *resv, > - struct dma_fence **fence); > int amdgpu_fill_buffer(struct amdgpu_device *adev, > struct amdgpu_ttm_buffer_entity *entity, > struct amdgpu_bo *bo, > uint32_t src_data, > struct dma_resv *resv, > - struct dma_fence **f, > + struct dma_fence **out_fence, > + bool consider_clear_status, > u64 k_job_id); > struct amdgpu_ttm_buffer_entity *amdgpu_ttm_next_clear_entity(struct amdgpu_device *adev); >

2 weeks, 3 days

Re: [PATCH 6/9] iommufd: Have pfn_reader process DMABUF iopt_pages

by Jason Gunthorpe

On Thu, Nov 20, 2025 at 08:04:37AM +0000, Tian, Kevin wrote: > > From: Jason Gunthorpe <jgg(a)nvidia.com> > > Sent: Saturday, November 8, 2025 12:50 AM > > + > > +static int pfn_reader_fill_dmabuf(struct pfn_reader_dmabuf *dmabuf, > > + struct pfn_batch *batch, > > + unsigned long start_index, > > + unsigned long last_index) > > +{ > > + unsigned long start = dmabuf->start_offset + start_index * PAGE_SIZE; > > + > > + /* > > + * This works in PAGE_SIZE indexes, if the dmabuf is sliced and > > + * starts/ends at a sub page offset then the batch to domain code will > > + * adjust it. > > + */ > > dmabuf->start_offset comes from pages->dmabuf.start, which is initialized as: > > pages->dmabuf.start = start - start_byte; > > so it's always page-aligned. Where is the sub-page offset coming from? I need to go over this again to check it, this sub-page stuff is a bit convoluted. start_offset should include the sub page offset here.. > > @@ -1687,6 +1737,12 @@ static void __iopt_area_unfill_domain(struct > > iopt_area *area, > > > > lockdep_assert_held(&pages->mutex); > > > > + if (iopt_is_dmabuf(pages)) { > > + iopt_area_unmap_domain_range(area, domain, start_index, > > + last_index); > > + return; > > + } > > + > > this belongs to patch3? This is part of programming the domain with the dmabuf, the patch3 was about the revoke which is a slightly different topic though they are both similar. Thanks, Jason

2 weeks, 3 days

Re: [PATCH 2/9] iommufd: Add DMABUF to iopt_pages

by Jason Gunthorpe

On Thu, Nov 20, 2025 at 07:55:04AM +0000, Tian, Kevin wrote: > > From: Jason Gunthorpe <jgg(a)nvidia.com> > > Sent: Saturday, November 8, 2025 12:50 AM > > > > > > @@ -2031,7 +2155,10 @@ int iopt_pages_rw_access(struct iopt_pages > > *pages, unsigned long start_byte, > > if ((flags & IOMMUFD_ACCESS_RW_WRITE) && !pages->writable) > > return -EPERM; > > > > - if (pages->type == IOPT_ADDRESS_FILE) > > + if (iopt_is_dmabuf(pages)) > > + return -EINVAL; > > + > > probably also add helpers for other types, e.g.: > > iopt_is_user() > iopt_is_memfd() The helper was to integrate the IS_ENABLED() check for DMABUF, there are not so many others uses, I think leave it to not bloat the patch. > > + if (pages->type != IOPT_ADDRESS_USER) > > return iopt_pages_rw_slow(pages, start_index, last_index, > > start_byte % PAGE_SIZE, data, > > length, > > flags); > > -- > > then the following WARN_ON() becomes useless: > > if (IS_ENABLED(CONFIG_IOMMUFD_TEST) && > WARN_ON(pages->type != IOPT_ADDRESS_USER)) > return -EINVAL; Yep Thanks, Jason

2 weeks, 3 days

Re: [PATCH v3 09/28] drm/amdgpu: pass the entity to use to ttm public functions

by Christian König

On 11/21/25 11:12, Pierre-Eric Pelloux-Prayer wrote: > This way the caller can select the one it wants to use. > > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer(a)amd.com> I'm wondering if it wouldn't make sense to put a pointer to adev into each amdgpu_ttm_buffer_entity. But that is maybe something for another patch. For now: Reviewed-by: Christian König <christian.koenig(a)amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c | 3 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 +-- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 34 +++++++++---------- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 16 +++++---- > drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 3 +- > 5 files changed, 32 insertions(+), 28 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > index 3636b757c974..a050167e76a4 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > @@ -37,7 +37,8 @@ static int amdgpu_benchmark_do_move(struct amdgpu_device *adev, unsigned size, > > stime = ktime_get(); > for (i = 0; i < n; i++) { > - r = amdgpu_copy_buffer(adev, saddr, daddr, size, NULL, &fence, > + r = amdgpu_copy_buffer(adev, &adev->mman.default_entity, > + saddr, daddr, size, NULL, &fence, > false, 0); > if (r) > goto exit_do_move; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > index 926a3f09a776..858eb9fa061b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > @@ -1322,8 +1322,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo) > if (r) > goto out; > > - r = amdgpu_fill_buffer(abo, 0, &bo->base._resv, &fence, true, > - AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE); > + r = amdgpu_fill_buffer(&adev->mman.clear_entity, abo, 0, &bo->base._resv, > + &fence, AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE); > if (WARN_ON(r)) > goto out; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index 3d850893b97f..1d3afad885da 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -359,7 +359,7 @@ static int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev, > write_compress_disable)); > } > > - r = amdgpu_copy_buffer(adev, from, to, cur_size, resv, > + r = amdgpu_copy_buffer(adev, entity, from, to, cur_size, resv, > &next, true, copy_flags); > if (r) > goto error; > @@ -414,8 +414,9 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, > (abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE)) { > struct dma_fence *wipe_fence = NULL; > > - r = amdgpu_fill_buffer(abo, 0, NULL, &wipe_fence, > - false, AMDGPU_KERNEL_JOB_ID_MOVE_BLIT); > + r = amdgpu_fill_buffer(&adev->mman.move_entity, > + abo, 0, NULL, &wipe_fence, > + AMDGPU_KERNEL_JOB_ID_MOVE_BLIT); > if (r) { > goto error; > } else if (wipe_fence) { > @@ -2258,7 +2259,9 @@ static int amdgpu_ttm_prepare_job(struct amdgpu_device *adev, > DMA_RESV_USAGE_BOOKKEEP); > } > > -int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > +int amdgpu_copy_buffer(struct amdgpu_device *adev, > + struct amdgpu_ttm_buffer_entity *entity, > + uint64_t src_offset, > uint64_t dst_offset, uint32_t byte_count, > struct dma_resv *resv, > struct dma_fence **fence, > @@ -2282,7 +2285,7 @@ int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > max_bytes = adev->mman.buffer_funcs->copy_max_bytes; > num_loops = DIV_ROUND_UP(byte_count, max_bytes); > num_dw = ALIGN(num_loops * adev->mman.buffer_funcs->copy_num_dw, 8); > - r = amdgpu_ttm_prepare_job(adev, &adev->mman.move_entity, num_dw, > + r = amdgpu_ttm_prepare_job(adev, entity, num_dw, > resv, vm_needs_flush, &job, > AMDGPU_KERNEL_JOB_ID_TTM_COPY_BUFFER); > if (r) > @@ -2411,22 +2414,18 @@ int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo, > return r; > } > > -int amdgpu_fill_buffer(struct amdgpu_bo *bo, > - uint32_t src_data, > - struct dma_resv *resv, > - struct dma_fence **f, > - bool delayed, > - u64 k_job_id) > +int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity, > + struct amdgpu_bo *bo, > + uint32_t src_data, > + struct dma_resv *resv, > + struct dma_fence **f, > + u64 k_job_id) > { > struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); > - struct amdgpu_ttm_buffer_entity *entity; > struct dma_fence *fence = NULL; > struct amdgpu_res_cursor dst; > int r; > > - entity = delayed ? &adev->mman.clear_entity : > - &adev->mman.move_entity; > - > if (!adev->mman.buffer_funcs_enabled) { > dev_err(adev->dev, > "Trying to clear memory with ring turned off.\n"); > @@ -2443,13 +2442,14 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo, > /* Never fill more than 256MiB at once to avoid timeouts */ > cur_size = min(dst.size, 256ULL << 20); > > - r = amdgpu_ttm_map_buffer(adev, &adev->mman.default_entity, > + r = amdgpu_ttm_map_buffer(adev, entity, > &bo->tbo, bo->tbo.resource, &dst, > 1, false, &cur_size, &to); > if (r) > goto error; > > - r = amdgpu_ttm_fill_mem(adev, entity, src_data, to, cur_size, resv, > + r = amdgpu_ttm_fill_mem(adev, entity, > + src_data, to, cur_size, resv, > &next, true, k_job_id); > if (r) > goto error; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > index 41bbc25680a2..9288599c9c46 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > @@ -167,7 +167,9 @@ int amdgpu_ttm_init(struct amdgpu_device *adev); > void amdgpu_ttm_fini(struct amdgpu_device *adev); > void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, > bool enable); > -int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > +int amdgpu_copy_buffer(struct amdgpu_device *adev, > + struct amdgpu_ttm_buffer_entity *entity, > + uint64_t src_offset, > uint64_t dst_offset, uint32_t byte_count, > struct dma_resv *resv, > struct dma_fence **fence, > @@ -175,12 +177,12 @@ int amdgpu_copy_buffer(struct amdgpu_device *adev, uint64_t src_offset, > int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo, > struct dma_resv *resv, > struct dma_fence **fence); > -int amdgpu_fill_buffer(struct amdgpu_bo *bo, > - uint32_t src_data, > - struct dma_resv *resv, > - struct dma_fence **fence, > - bool delayed, > - u64 k_job_id); > +int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity, > + struct amdgpu_bo *bo, > + uint32_t src_data, > + struct dma_resv *resv, > + struct dma_fence **f, > + u64 k_job_id); > > int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo); > void amdgpu_ttm_recover_gart(struct ttm_buffer_object *tbo); > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > index ade1d4068d29..9c76f1ba0e55 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > @@ -157,7 +157,8 @@ svm_migrate_copy_memory_gart(struct amdgpu_device *adev, dma_addr_t *sys, > goto out_unlock; > } > > - r = amdgpu_copy_buffer(adev, gart_s, gart_d, size * PAGE_SIZE, > + r = amdgpu_copy_buffer(adev, entity, > + gart_s, gart_d, size * PAGE_SIZE, > NULL, &next, true, 0); > if (r) { > dev_err(adev->dev, "fail %d to copy memory\n", r);

2 weeks, 3 days

← Newer
1
2
3
4
5
6
7
8
...
323
Older →

Jump to page:

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig