On Mon, 21 Jul 2025 11:17:27 +0200, Tomeu Vizoso wrote:
> This series adds a new driver for the NPU that Rockchip includes in its
> newer SoCs, developed by them on the NVDLA base.
>
> In its current form, it supports the specific NPU in the RK3588 SoC.
>
> The userspace driver is part of Mesa and an initial draft can be found at:
>
> [...]
Applied, thanks!
[07/10] arm64: dts: rockchip: add pd_npu label for RK3588 power domains
commit: 6d64bceb97a1c93b3cc2131f7e023ef2f9cf33f2
[08/10] arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base
commit: a31dfc060a747f08705ace36d8de006bc13318fa
[09/10] arm64: dts: rockchip: Enable the NPU on quartzpro64
commit: 640366d644b1e282771a09c72be37162b6eda438
[10/10] arm64: dts: rockchip: enable NPU on ROCK 5B
commit: 3af6a83fc85033e44ce5cd0e1de54dc20b7e15af
Best regards,
--
Heiko Stuebner <heiko(a)sntech.de>
Hi Ling,
kernel test robot noticed the following build warnings:
[auto build test WARNING on char-misc/char-misc-testing]
[also build test WARNING on char-misc/char-misc-next char-misc/char-misc-linus linus/master v6.16 next-20250806]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ling-Xu/misc-fastrpc-Save-ac…
base: char-misc/char-misc-testing
patch link: https://lore.kernel.org/r/20250806115114.688814-5-quic_lxu5%40quicinc.com
patch subject: [PATCH v2 4/4] misc: fastrpc: Skip reference for DMA handles
config: hexagon-randconfig-002-20250807 (https://download.01.org/0day-ci/archive/20250807/202508070731.S30957lV-lkp@…)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 7b8dea265e72c3037b6b1e54d5ab51b7e14f328b)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250807/202508070731.S30957lV-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508070731.S30957lV-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/misc/fastrpc.c:368:30: warning: unused variable 'sess' [-Wunused-variable]
368 | struct fastrpc_session_ctx *sess = fl->sctx;
| ^~~~
1 warning generated.
vim +/sess +368 drivers/misc/fastrpc.c
c68cfb718c8f97 Srinivas Kandagatla 2019-02-08 363
8f6c1d8c4f0cc3 Vamsi Krishna Gattupalli 2022-02-14 364
8f6c1d8c4f0cc3 Vamsi Krishna Gattupalli 2022-02-14 365 static int fastrpc_map_lookup(struct fastrpc_user *fl, int fd,
1922c68c56c660 Ling Xu 2025-08-06 366 struct fastrpc_map **ppmap)
c68cfb718c8f97 Srinivas Kandagatla 2019-02-08 367 {
9446fa1683a7e3 Abel Vesa 2022-11-24 @368 struct fastrpc_session_ctx *sess = fl->sctx;
c68cfb718c8f97 Srinivas Kandagatla 2019-02-08 369 struct fastrpc_map *map = NULL;
d259063578ed76 Ling Xu 2025-08-06 370 struct dma_buf *buf;
9446fa1683a7e3 Abel Vesa 2022-11-24 371 int ret = -ENOENT;
c68cfb718c8f97 Srinivas Kandagatla 2019-02-08 372
d259063578ed76 Ling Xu 2025-08-06 373 buf = dma_buf_get(fd);
d259063578ed76 Ling Xu 2025-08-06 374 if (IS_ERR(buf))
d259063578ed76 Ling Xu 2025-08-06 375 return PTR_ERR(buf);
d259063578ed76 Ling Xu 2025-08-06 376
9446fa1683a7e3 Abel Vesa 2022-11-24 377 spin_lock(&fl->lock);
c68cfb718c8f97 Srinivas Kandagatla 2019-02-08 378 list_for_each_entry(map, &fl->maps, node) {
d259063578ed76 Ling Xu 2025-08-06 379 if (map->fd != fd || map->buf != buf)
9446fa1683a7e3 Abel Vesa 2022-11-24 380 continue;
9446fa1683a7e3 Abel Vesa 2022-11-24 381
9446fa1683a7e3 Abel Vesa 2022-11-24 382 *ppmap = map;
9446fa1683a7e3 Abel Vesa 2022-11-24 383 ret = 0;
9446fa1683a7e3 Abel Vesa 2022-11-24 384 break;
c68cfb718c8f97 Srinivas Kandagatla 2019-02-08 385 }
9446fa1683a7e3 Abel Vesa 2022-11-24 386 spin_unlock(&fl->lock);
8f6c1d8c4f0cc3 Vamsi Krishna Gattupalli 2022-02-14 387
8f6c1d8c4f0cc3 Vamsi Krishna Gattupalli 2022-02-14 388 return ret;
8f6c1d8c4f0cc3 Vamsi Krishna Gattupalli 2022-02-14 389 }
8f6c1d8c4f0cc3 Vamsi Krishna Gattupalli 2022-02-14 390
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On Mon, Aug 04, 2025 at 10:10:32AM -0400, Benjamin LaHaise wrote:
> FYI: this entire patch series was rejected as spam by large numbers of
> linux-mm subscribers using @gmail.com email addresses.
Thanks for the heads-up. Are you aware of any issues from my side?
I'm sending patches with git-send-email through mail.kernel.org SMTP.
Thanks
>
> -ben (owner-linux-mm)
From: Leon Romanovsky <leonro(a)nvidia.com>
---------------------------------------------------------------------------
Based on blk and DMA patches which will be sent during coming merge window.
---------------------------------------------------------------------------
This series extends the VFIO PCI subsystem to support exporting MMIO regions
from PCI device BARs as dma-buf objects, enabling safe sharing of non-struct
page memory with controlled lifetime management. This allows RDMA and other
subsystems to import dma-buf FDs and build them into memory regions for PCI
P2P operations.
The series supports a use case for SPDK where a NVMe device will be owned
by SPDK through VFIO but interacting with a RDMA device. The RDMA device
may directly access the NVMe CMB or directly manipulate the NVMe device's
doorbell using PCI P2P.
However, as a general mechanism, it can support many other scenarios with
VFIO. This dmabuf approach can be usable by iommufd as well for generic
and safe P2P mappings.
In addition to the SPDK use-case mentioned above, the capability added
in this patch series can also be useful when a buffer (located in device
memory such as VRAM) needs to be shared between any two dGPU devices or
instances (assuming one of them is bound to VFIO PCI) as long as they
are P2P DMA compatible.
The implementation provides a revocable attachment mechanism using dma-buf
move operations. MMIO regions are normally pinned as BARs don't change
physical addresses, but access is revoked when the VFIO device is closed
or a PCI reset is issued. This ensures kernel self-defense against
potentially hostile userspace.
The series includes significant refactoring of the PCI P2PDMA subsystem
to separate core P2P functionality from memory allocation features,
making it more modular and suitable for VFIO use cases that don't need
struct page support.
-----------------------------------------------------------------------
This is based on
https://lore.kernel.org/all/20250307052248.405803-1-vivek.kasireddy@intel.c…
but heavily rewritten to be based on DMA physical API.
-----------------------------------------------------------------------
The WIP branch can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=…
Thanks
Leon Romanovsky (8):
PCI/P2PDMA: Remove redundant bus_offset from map state
PCI/P2PDMA: Introduce p2pdma_provider structure for cleaner
abstraction
PCI/P2PDMA: Simplify bus address mapping API
PCI/P2PDMA: Refactor to separate core P2P functionality from memory
allocation
PCI/P2PDMA: Export pci_p2pdma_map_type() function
types: move phys_vec definition to common header
vfio/pci: Enable peer-to-peer DMA transactions by default
vfio/pci: Add dma-buf export support for MMIO regions
Vivek Kasireddy (2):
vfio: Export vfio device get and put registration helpers
vfio/pci: Share the core device pointer while invoking feature
functions
block/blk-mq-dma.c | 7 +-
drivers/iommu/dma-iommu.c | 4 +-
drivers/pci/p2pdma.c | 144 +++++++++----
drivers/vfio/pci/Kconfig | 20 ++
drivers/vfio/pci/Makefile | 2 +
drivers/vfio/pci/vfio_pci_config.c | 22 +-
drivers/vfio/pci/vfio_pci_core.c | 59 ++++--
drivers/vfio/pci/vfio_pci_dmabuf.c | 321 +++++++++++++++++++++++++++++
drivers/vfio/pci/vfio_pci_priv.h | 23 +++
drivers/vfio/vfio_main.c | 2 +
include/linux/dma-buf.h | 1 +
include/linux/pci-p2pdma.h | 114 +++++-----
include/linux/types.h | 5 +
include/linux/vfio.h | 2 +
include/linux/vfio_pci_core.h | 4 +
include/uapi/linux/vfio.h | 19 ++
kernel/dma/direct.c | 4 +-
mm/hmm.c | 2 +-
18 files changed, 631 insertions(+), 124 deletions(-)
create mode 100644 drivers/vfio/pci/vfio_pci_dmabuf.c
--
2.50.1
On Tue, Jul 29, 2025 at 02:54:13PM -0600, Logan Gunthorpe wrote:
>
>
> On 2025-07-28 17:11, Jason Gunthorpe wrote:
> >> If the dma mapping for P2P memory doesn't need to create an iommu
> >> mapping then that's fine. But it should be the dma-iommu layer to decide
> >> that.
> >
> > So above, we can't use dma-iommu.c, it might not be compiled into the
> > kernel but the dma_map_phys() path is still valid.
>
> This is an easily solved problem. I did a very rough sketch below to say
> it's really not that hard. (Note it has some rough edges that could be
> cleaned up and I based it off Leon's git repo which appears to not be
> the same as what was posted, but the core concept is sound).
I started to prepare v2, this is why posted version is slightly
different from dmabuf-vfio branch.
In addition to what Jason wrote. there is an extra complexity with using
state. The wrappers which operate on dma_iova_state assume that all memory,
which is going to be mapped, is the same type: or p2p or not.
This is not the cased for HMM/RDMA users, there you create state in
advance and get mixed type of pages.
Thanks
On Tue, Jul 29, 2025 at 02:54:13PM -0600, Logan Gunthorpe wrote:
>
>
> On 2025-07-28 17:11, Jason Gunthorpe wrote:
> >> If the dma mapping for P2P memory doesn't need to create an iommu
> >> mapping then that's fine. But it should be the dma-iommu layer to decide
> >> that.
> >
> > So above, we can't use dma-iommu.c, it might not be compiled into the
> > kernel but the dma_map_phys() path is still valid.
>
> This is an easily solved problem. I did a very rough sketch below to say
> it's really not that hard. (Note it has some rough edges that could be
> cleaned up and I based it off Leon's git repo which appears to not be
> the same as what was posted, but the core concept is sound).
I did hope for something like this in the early days, but it proved
not so easy to get agreements on details :(
My feeling was we should get some actual examples of using this thing
and then it is far easier to discuss ideas, like yours here, to
improve it. Many of the discussions kind of got confused without
enough actual usering code for everyone to refer to.
For instance the nvme use case is a big driver for the API design, and
it is quite different from these simpler flows, this idea needs to see
how it would work there.
Maybe this idea could also have provider = NULL meaning it is CPU
cachable memory?
> +static inline void dma_iova_try_alloc_p2p(struct p2pdma_provider *provider,
> + struct device *dev, struct dma_iova_state *state, phys_addr_t phys,
> + size_t size)
> +{
> +}
Can't be empty - PCI_P2PDMA_MAP_THRU_HOST_BRIDGE vs
PCI_P2PDMA_MAP_BUS_ADDR still matters so it still must set
dma_iova_state::bus_addr to get dma_map_phys_prealloc() to do the
right thing.
Still, it would make sense to put something like that in dma/mapping.c
and rely on the static inline stub for dma_iova_try_alloc()..
> for (i = 0; i < priv->nr_ranges; i++) {
> - if (!state) {
> - addr = pci_p2pdma_bus_addr_map(provider,
> - phys_vec[i].paddr);
> - } else if (dma_use_iova(state)) {
> - ret = dma_iova_link(attachment->dev, state,
> - phys_vec[i].paddr, 0,
> - phys_vec[i].len, dir, attrs);
> - if (ret)
> - goto err_unmap_dma;
> -
> - mapped_len += phys_vec[i].len;
> - } else {
> - addr = dma_map_phys(attachment->dev, phys_vec[i].paddr,
> - phys_vec[i].len, dir, attrs);
> - ret = dma_mapping_error(attachment->dev, addr);
> - if (ret)
> - goto err_unmap_dma;
> - }
> + addr = dma_map_phys_prealloc(attachment->dev, phys_vec[i].paddr,
> + phys_vec[i].len, dir, attrs, state,
> + provider);
There was a draft of something like this at some point. The
DMA_MAPPING_USE_IOVA is a new twist though
> #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
> struct dma_iova_state {
> dma_addr_t addr;
> u64 __size;
> + bool bus_addr;
> };
Gowing this structure has been strongly pushed back on. This probably
can be solved in some other way, a bitfield on size perhaps..
> +dma_addr_t dma_map_phys_prealloc(struct device *dev, phys_addr_t phys,
> size_t size,
> + enum dma_data_direction dir, unsigned long attrs,
> + struct dma_iova_state *state, struct p2pdma_provider *provider)
> +{
> + int ret;
> +
> + if (state->bus_addr)
> + return pci_p2pdma_bus_addr_map(provider, phys);
> +
> + if (dma_use_iova(state)) {
> + ret = dma_iova_link(dev, state, phys, 0, size, dir, attrs);
> + if (ret)
> + return DMA_MAPPING_ERROR;
> +
> + return DMA_MAPPING_USE_IOVA;
> + }
> +
> + return dma_map_phys(dev, phys, size, dir, attrs);
> +}
> +EXPORT_SYMBOL_GPL(dma_map_phys_prealloc);
I would be tempted to inline this
Overall, yeah I would certainly welcome improvements like this if
everyone can agree, but I'd really like to see nvme merged before we
start working on ideas. That way the proposal can be properly
evaluated by all the stake holders.
Jason
On Mon, Jul 28, 2025 at 11:07:34AM -0600, Logan Gunthorpe wrote:
>
>
> On 2025-07-28 10:41, Leon Romanovsky wrote:
> > On Mon, Jul 28, 2025 at 10:12:31AM -0600, Logan Gunthorpe wrote:
> >>
> >>
> >> On 2025-07-27 13:05, Jason Gunthorpe wrote:
> >>> On Fri, Jul 25, 2025 at 10:30:46AM -0600, Logan Gunthorpe wrote:
> >>>>
> >>>>
> >>>> On 2025-07-24 02:13, Leon Romanovsky wrote:
> >>>>> On Thu, Jul 24, 2025 at 10:03:13AM +0200, Christoph Hellwig wrote:
> >>>>>> On Wed, Jul 23, 2025 at 04:00:06PM +0300, Leon Romanovsky wrote:
> >>>>>>> From: Leon Romanovsky <leonro(a)nvidia.com>
> >>>>>>>
> >>>>>>> Export the pci_p2pdma_map_type() function to allow external modules
> >>>>>>> and subsystems to determine the appropriate mapping type for P2PDMA
> >>>>>>> transfers between a provider and target device.
> >>>>>>
> >>>>>> External modules have no business doing this.
> >>>>>
> >>>>> VFIO PCI code is built as module. There is no way to access PCI p2p code
> >>>>> without exporting functions in it.
> >>>>
> >>>> The solution that would make more sense to me would be for either
> >>>> dma_iova_try_alloc() or another helper in dma-iommu.c to handle the
> >>>> P2PDMA case.
> >>>
> >>> This has nothing to do with dma-iommu.c, the decisions here still need
> >>> to be made even if dma-iommu.c is not compiled in.
> >>
> >> Doesn't it though? Every single call in patch 10 to the newly exported
> >> PCI functions calls into the the dma-iommu functions.
Patch 10 has lots of flows, only one will end up in dma-iommu.c
vfio_pci_dma_buf_map() calls pci_p2pdma_bus_addr_map(),
dma_iova_link(), dma_map_phys().
Only iova_link would call to dma-iommu.c - if dma_map_phys() is called
we know that dma-iommu.c won't be called by it.
> >> If there were non-iommu paths then I would expect the code would
> >> use the regular DMA api directly which would then call in to
> >> dma-iommu.
> >
> > If p2p type is PCI_P2PDMA_MAP_BUS_ADDR, there will no dma-iommu and DMA
> > at all.
>
> I understand that and it is completely beside my point.
>
> If the dma mapping for P2P memory doesn't need to create an iommu
> mapping then that's fine. But it should be the dma-iommu layer to decide
> that.
So above, we can't use dma-iommu.c, it might not be compiled into the
kernel but the dma_map_phys() path is still valid.
> It's not a decision that should be made by every driver doing this
> kind of thing.
Sort of, I think we are trying to get to some place where there are
subsystem, or at least data structure specific helpers that do this
(ie nvme has BIO helpers), but the helpers should be running this
logic directly for performance. Leon hasn't done it but I think we
should see helpers for DMABUF too encapsulating the logic shown in
patch 10. I think we need to prove it out these basic points first
before trying to go and convert a bunch of GPU drivers.
The vfio in patch 10 is not the full example since it only has a
single scatter/gather" effectively, but the generalized version loops
over pci_p2pdma_bus_addr_map(), dma_iova_link(), dma_map_phys() for
each page.
Part of the new API design is to only do one kind of mapping operation
at once, and part of the design is we know that the P2P type is fixed.
It makes no performance sense to check the type inside the
pci_p2pdma_bus_addr_map()/ dma_iova_link()/dma_map_phys() within the
per-page loop.
I do think some level of abstraction has been lost here in pursuit of
performance. If someone does have a better way to structure this
without a performance hit then fantastic, but thats going back and
revising the new DMA API. This just builds on top of that, and yes, it
is not so abstract.
Jason
On Mon, Jul 28, 2025 at 10:12:31AM -0600, Logan Gunthorpe wrote:
>
>
> On 2025-07-27 13:05, Jason Gunthorpe wrote:
> > On Fri, Jul 25, 2025 at 10:30:46AM -0600, Logan Gunthorpe wrote:
> >>
> >>
> >> On 2025-07-24 02:13, Leon Romanovsky wrote:
> >>> On Thu, Jul 24, 2025 at 10:03:13AM +0200, Christoph Hellwig wrote:
> >>>> On Wed, Jul 23, 2025 at 04:00:06PM +0300, Leon Romanovsky wrote:
> >>>>> From: Leon Romanovsky <leonro(a)nvidia.com>
> >>>>>
> >>>>> Export the pci_p2pdma_map_type() function to allow external modules
> >>>>> and subsystems to determine the appropriate mapping type for P2PDMA
> >>>>> transfers between a provider and target device.
> >>>>
> >>>> External modules have no business doing this.
> >>>
> >>> VFIO PCI code is built as module. There is no way to access PCI p2p code
> >>> without exporting functions in it.
> >>
> >> The solution that would make more sense to me would be for either
> >> dma_iova_try_alloc() or another helper in dma-iommu.c to handle the
> >> P2PDMA case.
> >
> > This has nothing to do with dma-iommu.c, the decisions here still need
> > to be made even if dma-iommu.c is not compiled in.
>
> Doesn't it though? Every single call in patch 10 to the newly exported
> PCI functions calls into the the dma-iommu functions. If there were
> non-iommu paths then I would expect the code would use the regular DMA
> api directly which would then call in to dma-iommu.
If p2p type is PCI_P2PDMA_MAP_BUS_ADDR, there will no dma-iommu and DMA
at all.
+static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
+ struct dma_buf_attachment *attachment)
+{
+ struct vfio_pci_dma_buf *priv = dmabuf->priv;
+
+ if (!attachment->peer2peer)
+ return -EOPNOTSUPP;
+
+ if (priv->revoked)
+ return -ENODEV;
+
+ switch (pci_p2pdma_map_type(priv->vdev->provider, attachment->dev)) {
+ case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
+ break;
+ case PCI_P2PDMA_MAP_BUS_ADDR:
+ /*
+ * There is no need in IOVA at all for this flow.
+ * We rely on attachment->priv == NULL as a marker
+ * for this mode.
+ */
+ return 0;
+ default:
+ return -EINVAL;
+ }
+
+ attachment->priv = kzalloc(sizeof(struct dma_iova_state), GFP_KERNEL);
+ if (!attachment->priv)
+ return -ENOMEM;
+
+ dma_iova_try_alloc(attachment->dev, attachment->priv, 0, priv->phys_vec.len);
+ return 0;
+}
We've discussed a number of times of how some heap names are bad, but
not really what makes a good heap name.
Let's document what we expect the heap names to look like.
Reviewed-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Changes in v3:
- Grammar, spelling fixes
- Remove the cacheable / uncacheable name suggestion
- Link to v2: https://lore.kernel.org/r/20250616-dma-buf-heap-names-doc-v2-1-8ae43174cdbf…
Changes in v2:
- Added justifications for each requirement / suggestions
- Added a mention and example of buffer attributes
- Link to v1: https://lore.kernel.org/r/20250520-dma-buf-heap-names-doc-v1-1-ab31f74809ee…
---
Documentation/userspace-api/dma-buf-heaps.rst | 35 +++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/Documentation/userspace-api/dma-buf-heaps.rst b/Documentation/userspace-api/dma-buf-heaps.rst
index 535f49047ce6450796bf4380c989e109355efc05..3ee4e7961fe390ba356a2125d53b060546c3e4a6 100644
--- a/Documentation/userspace-api/dma-buf-heaps.rst
+++ b/Documentation/userspace-api/dma-buf-heaps.rst
@@ -21,5 +21,40 @@ following heaps:
usually created either through the kernel commandline through the
`cma` parameter, a memory region Device-Tree node with the
`linux,cma-default` property set, or through the `CMA_SIZE_MBYTES` or
`CMA_SIZE_PERCENTAGE` Kconfig options. Depending on the platform, it
might be called ``reserved``, ``linux,cma``, or ``default-pool``.
+
+Naming Convention
+=================
+
+``dma-buf`` heaps name should meet a number of constraints:
+
+- The name must be stable, and must not change from one version to the other.
+ Userspace identifies heaps by their name, so if the names ever change, we
+ would be likely to introduce regressions.
+
+- The name must describe the memory region the heap will allocate from, and
+ must uniquely identify it in a given platform. Since userspace applications
+ use the heap name as the discriminant, it must be able to tell which heap it
+ wants to use reliably if there's multiple heaps.
+
+- The name must not mention implementation details, such as the allocator. The
+ heap driver will change over time, and implementation details when it was
+ introduced might not be relevant in the future.
+
+- The name should describe properties of the buffers that would be allocated.
+ Doing so will make heap identification easier for userspace. Such properties
+ are:
+
+ - ``contiguous`` for physically contiguous buffers;
+
+ - ``protected`` for encrypted buffers not accessible the OS;
+
+- The name may describe intended usage. Doing so will make heap identification
+ easier for userspace applications and users.
+
+For example, assuming a platform with a reserved memory region located at the
+RAM address 0x42000000, intended to allocate video framebuffers, physically
+contiguous, and backed by the CMA kernel allocator, good names would be
+``memory@42000000-cacheable-contiguous`` or ``video@42000000``, but
+``cma-video`` wouldn't.
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250520-dma-buf-heap-names-doc-31261aa0cfe6
Best regards,
--
Maxime Ripard <mripard(a)kernel.org>
On Fri, Jul 25, 2025 at 10:30:46AM -0600, Logan Gunthorpe wrote:
>
>
> On 2025-07-24 02:13, Leon Romanovsky wrote:
> > On Thu, Jul 24, 2025 at 10:03:13AM +0200, Christoph Hellwig wrote:
> >> On Wed, Jul 23, 2025 at 04:00:06PM +0300, Leon Romanovsky wrote:
> >>> From: Leon Romanovsky <leonro(a)nvidia.com>
> >>>
> >>> Export the pci_p2pdma_map_type() function to allow external modules
> >>> and subsystems to determine the appropriate mapping type for P2PDMA
> >>> transfers between a provider and target device.
> >>
> >> External modules have no business doing this.
> >
> > VFIO PCI code is built as module. There is no way to access PCI p2p code
> > without exporting functions in it.
>
> The solution that would make more sense to me would be for either
> dma_iova_try_alloc() or another helper in dma-iommu.c to handle the
> P2PDMA case.
This has nothing to do with dma-iommu.c, the decisions here still need
to be made even if dma-iommu.c is not compiled in.
It could be exported from the main dma code, but I think it would just
be a 1 line wrapper around the existing function? I'd rather rename
the functions and leave them in the p2pdma.c files...
Jason
On Fri, Jul 25, 2025 at 05:34:40AM +0000, Kasireddy, Vivek wrote:
> Hi Leon,
>
> > Subject: Re: [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO
> > regions
> >
> > > >
> > > > From: Leon Romanovsky <leonro(a)nvidia.com>
> > > >
> > > > Add support for exporting PCI device MMIO regions through dma-buf,
> > > > enabling safe sharing of non-struct page memory with controlled
> > > > lifetime management. This allows RDMA and other subsystems to
> > import
> > > > dma-buf FDs and build them into memory regions for PCI P2P
> > operations.
> > > >
> > > > The implementation provides a revocable attachment mechanism using
> > > > dma-buf move operations. MMIO regions are normally pinned as BARs
> > > > don't change physical addresses, but access is revoked when the VFIO
> > > > device is closed or a PCI reset is issued. This ensures kernel
> > > > self-defense against potentially hostile userspace.
> > > >
> > > > Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
> > > > Signed-off-by: Vivek Kasireddy <vivek.kasireddy(a)intel.com>
> > > > Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
> > > > ---
> > > > drivers/vfio/pci/Kconfig | 20 ++
> > > > drivers/vfio/pci/Makefile | 2 +
> > > > drivers/vfio/pci/vfio_pci_config.c | 22 +-
> > > > drivers/vfio/pci/vfio_pci_core.c | 25 ++-
> > > > drivers/vfio/pci/vfio_pci_dmabuf.c | 321
> > +++++++++++++++++++++++++++++
> > > > drivers/vfio/pci/vfio_pci_priv.h | 23 +++
> > > > include/linux/dma-buf.h | 1 +
> > > > include/linux/vfio_pci_core.h | 3 +
> > > > include/uapi/linux/vfio.h | 19 ++
> > > > 9 files changed, 431 insertions(+), 5 deletions(-)
> > > > create mode 100644 drivers/vfio/pci/vfio_pci_dmabuf.c
> >
> > <...>
> >
> > > > +static int validate_dmabuf_input(struct vfio_pci_core_device *vdev,
> > > > + struct vfio_device_feature_dma_buf
> > *dma_buf)
> > > > +{
> > > > + struct pci_dev *pdev = vdev->pdev;
> > > > + u32 bar = dma_buf->region_index;
> > > > + u64 offset = dma_buf->offset;
> > > > + u64 len = dma_buf->length;
> > > > + resource_size_t bar_size;
> > > > + u64 sum;
> > > > +
> > > > + /*
> > > > + * For PCI the region_index is the BAR number like everything else.
> > > > + */
> > > > + if (bar >= VFIO_PCI_ROM_REGION_INDEX)
> > > > + return -ENODEV;
> >
> > <...>
> >
> > > > +/**
> > > > + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
> > > > + * regions selected.
> > > > + *
> > > > + * open_flags are the typical flags passed to open(2), eg O_RDWR,
> > > > O_CLOEXEC,
> > > > + * etc. offset/length specify a slice of the region to create the dmabuf
> > from.
> > > > + * nr_ranges is the total number of (P2P DMA) ranges that comprise the
> > > > dmabuf.
> > > Any particular reason why you dropped the option (nr_ranges) of creating
> > a
> > > single dmabuf from multiple ranges of an MMIO region?
> >
> > I did it for two reasons. First, I wanted to simplify the code in order
> > to speed-up discussion over the patchset itself. Second, I failed to
> > find justification for need of multiple ranges, as the number of BARs
> > are limited by VFIO_PCI_ROM_REGION_INDEX (6) and same functionality
> > can be achieved by multiple calls to DMABUF import.
> I don't think the same functionality can be achieved by multiple calls to
> dmabuf import. AFAIU, a dmabuf (as of today) is backed by a SGL that can
> have multiple entries because it represents a scattered buffer (multiple
> non-contiguous entries in System RAM or an MMIO region).
I don't know all the reasons why SG was chosen, but one of the main
reasons is that DMA SG API was the only one possible way to handle p2p
transfers (peer2peer flag).
> But in this patch you are constraining it such that only one entry associated with a
> buffer would be included, which effectively means that we cannot create
> a dmabuf to represent scattered buffers (located in a single MMIO region
> such as VRAM or other device memory) anymore.
Yes
>
> >
> > >
> > > Restricting the dmabuf to a single range (or having to create multiple
> > dmabufs
> > > to represent multiple regions/ranges associated with a single scattered
> > buffer)
> > > would be very limiting and may not work in all cases. For instance, in my
> > use-case,
> > > I am trying to share a large (4k mode) framebuffer (FB) located in GPU's
> > VRAM
> > > between two (p2p compatible) GPU devices. And, this would probably not
> > work
> > > given that allocating a large contiguous FB (nr_ranges = 1) in VRAM may
> > not be
> > > feasible when there is memory pressure.
> >
> > Can you please help me and point to the place in the code where this can
> > fail?
> > I'm probably missing something basic as there are no large allocations
> > in the current patchset.
> Sorry, I was not very clear. What I meant is that it is not prudent to assume that
> there will only be one range associated with an MMIO region which we need to
> consider while creating a dmabuf. And, I was pointing out my use-case as an
> example where vfio-pci needs to create a dmabuf for a large buffer (FB) that
> would likely be scattered (and not contiguous) in an MMIO region (such as VRAM).
>
> Let me further explain with my use-case. Here is a link to my Qemu-based test:
> https://gitlab.freedesktop.org/Vivek/qemu/-/commit/b2bdb16d9cfaf55384c95b1f…
Ohh, thanks. I'll add nr_ranges in next version. I see that you are
using same region_index for all ranges and this is how I would like to
keep it: "multiple nr_ranges, same region_index".
Thanks
On Fri, Jul 25, 2025 at 01:12:35PM -0600, Logan Gunthorpe wrote:
>
>
> On 2025-07-25 12:54, Leon Romanovsky wrote:
> >> The solution that would make more sense to me would be for either
> >> dma_iova_try_alloc() or another helper in dma-iommu.c to handle the
> >> P2PDMA case. dma-iommu.c already uses those same interfaces and thus
> >> there would be no need to export the low level helpers from the p2pdma code.
> >
> > I had same idea in early versions of DMA phys API discussion and it was
> > pointed (absolutely right) that this is layering violation.
>
> Respectfully, I have to disagree with this. Having the layer (ie.
> dma-iommu) that normally checks how to handle a P2PDMA request now check
> how to handle these DMA requests is the exact opposite of a layering
> violation.
I'm aware of your implementation and have feeling that it was very
influenced by NVMe requirements, so the end result is very tailored
for it. Other users have very different paths if p2p is taken. Just
see last VFIO patch in this series, it skips all DMA logic.
> Expecting every driver that wants to do P2PDMA to have to
> figure out for themselves how to map the memory before calling into the
> DMA API doesn't seem like a good design choice to me.
We had this discussion earlier too on previous versions. The summary is
that p2p capable devices are very special anyway. They need to work with
p2p natively. BTW, the implementation is not supposed to be in the
drivers, but in their respective subsystems.
>
> > So unfortunately, I think that dma*.c|h is not right place for p2p
> > type check.
>
> dma*.c is already where those checks are done. I'm not sure patches to
> remove the code from that layer and put it into the NVMe driver would
> make a lot of sense (and then, of course, we'd have to put it into every
> other driver that wants to participate in p2p transactions).
I don't have plans to remove existing checks right now, but NVMe was already
converted to new DMA phys API.
Thanks
>
> Logan
>
>
On Fri, Jul 25, 2025 at 10:30:46AM -0600, Logan Gunthorpe wrote:
>
>
> On 2025-07-24 02:13, Leon Romanovsky wrote:
> > On Thu, Jul 24, 2025 at 10:03:13AM +0200, Christoph Hellwig wrote:
> >> On Wed, Jul 23, 2025 at 04:00:06PM +0300, Leon Romanovsky wrote:
> >>> From: Leon Romanovsky <leonro(a)nvidia.com>
> >>>
> >>> Export the pci_p2pdma_map_type() function to allow external modules
> >>> and subsystems to determine the appropriate mapping type for P2PDMA
> >>> transfers between a provider and target device.
> >>
> >> External modules have no business doing this.
> >
> > VFIO PCI code is built as module. There is no way to access PCI p2p code
> > without exporting functions in it.
>
> The solution that would make more sense to me would be for either
> dma_iova_try_alloc() or another helper in dma-iommu.c to handle the
> P2PDMA case. dma-iommu.c already uses those same interfaces and thus
> there would be no need to export the low level helpers from the p2pdma code.
I had same idea in early versions of DMA phys API discussion and it was
pointed (absolutely right) that this is layering violation.
At that time, that remark wasn't such clear to me because HMM code
performs check for p2p on every page and has call to dma_iova_try_alloc()
before that check. But this VFIO DMABUF code shows it much more clearer.
The p2p check is performed before any DMA calls and in case of PCI_P2PDMA_MAP_BUS_ADDR
p2p type between DMABUF exporter device and DMABUF importer device, we
don't call dma_iova_try_alloc() or any DMA API at all.
So unfortunately, I think that dma*.c|h is not right place for p2p
type check.
Thanks
>
> Logan
>
On Thu, Jul 24, 2025 at 05:13:49AM +0000, Kasireddy, Vivek wrote:
> Hi Leon,
>
> > Subject: [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO
> > regions
> >
> > From: Leon Romanovsky <leonro(a)nvidia.com>
> >
> > Add support for exporting PCI device MMIO regions through dma-buf,
> > enabling safe sharing of non-struct page memory with controlled
> > lifetime management. This allows RDMA and other subsystems to import
> > dma-buf FDs and build them into memory regions for PCI P2P operations.
> >
> > The implementation provides a revocable attachment mechanism using
> > dma-buf move operations. MMIO regions are normally pinned as BARs
> > don't change physical addresses, but access is revoked when the VFIO
> > device is closed or a PCI reset is issued. This ensures kernel
> > self-defense against potentially hostile userspace.
> >
> > Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy(a)intel.com>
> > Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
> > ---
> > drivers/vfio/pci/Kconfig | 20 ++
> > drivers/vfio/pci/Makefile | 2 +
> > drivers/vfio/pci/vfio_pci_config.c | 22 +-
> > drivers/vfio/pci/vfio_pci_core.c | 25 ++-
> > drivers/vfio/pci/vfio_pci_dmabuf.c | 321 +++++++++++++++++++++++++++++
> > drivers/vfio/pci/vfio_pci_priv.h | 23 +++
> > include/linux/dma-buf.h | 1 +
> > include/linux/vfio_pci_core.h | 3 +
> > include/uapi/linux/vfio.h | 19 ++
> > 9 files changed, 431 insertions(+), 5 deletions(-)
> > create mode 100644 drivers/vfio/pci/vfio_pci_dmabuf.c
<...>
> > +static int validate_dmabuf_input(struct vfio_pci_core_device *vdev,
> > + struct vfio_device_feature_dma_buf *dma_buf)
> > +{
> > + struct pci_dev *pdev = vdev->pdev;
> > + u32 bar = dma_buf->region_index;
> > + u64 offset = dma_buf->offset;
> > + u64 len = dma_buf->length;
> > + resource_size_t bar_size;
> > + u64 sum;
> > +
> > + /*
> > + * For PCI the region_index is the BAR number like everything else.
> > + */
> > + if (bar >= VFIO_PCI_ROM_REGION_INDEX)
> > + return -ENODEV;
<...>
> > +/**
> > + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
> > + * regions selected.
> > + *
> > + * open_flags are the typical flags passed to open(2), eg O_RDWR,
> > O_CLOEXEC,
> > + * etc. offset/length specify a slice of the region to create the dmabuf from.
> > + * nr_ranges is the total number of (P2P DMA) ranges that comprise the
> > dmabuf.
> Any particular reason why you dropped the option (nr_ranges) of creating a
> single dmabuf from multiple ranges of an MMIO region?
I did it for two reasons. First, I wanted to simplify the code in order
to speed-up discussion over the patchset itself. Second, I failed to
find justification for need of multiple ranges, as the number of BARs
are limited by VFIO_PCI_ROM_REGION_INDEX (6) and same functionality
can be achieved by multiple calls to DMABUF import.
>
> Restricting the dmabuf to a single range (or having to create multiple dmabufs
> to represent multiple regions/ranges associated with a single scattered buffer)
> would be very limiting and may not work in all cases. For instance, in my use-case,
> I am trying to share a large (4k mode) framebuffer (FB) located in GPU's VRAM
> between two (p2p compatible) GPU devices. And, this would probably not work
> given that allocating a large contiguous FB (nr_ranges = 1) in VRAM may not be
> feasible when there is memory pressure.
Can you please help me and point to the place in the code where this can fail?
I'm probably missing something basic as there are no large allocations
in the current patchset.
>
> Furthermore, since you are adding a new UAPI with this patch/feature, as you know,
> we cannot go back and tweak it (to add support for nr_ranges > 1) should there
> be a need in the future, but you can always use nr_ranges = 1 anytime. Therefore,
> I think it makes sense to be flexible in terms of the number of ranges to include
> while creating a dmabuf instead of restricting ourselves to one range.
I'm not a big fan of over-engineering. Let's first understand if this
case is needed.
Thanks
>
> Thanks,
> Vivek
>
> > + *
> > + * Return: The fd number on success, -1 and errno is set on failure.
> > + */
> > +#define VFIO_DEVICE_FEATURE_DMA_BUF 11
> > +
> > +struct vfio_device_feature_dma_buf {
> > + __u32 region_index;
> > + __u32 open_flags;
> > + __u64 offset;
> > + __u64 length;
> > +};
> > +
> > /* -------- API for Type1 VFIO IOMMU -------- */
> >
> > /**
> > --
> > 2.50.1
>
On 22-07-25, 15:46, Dmitry Baryshkov wrote:
> On Tue, Jul 22, 2025 at 05:50:08PM +0530, Jyothi Kumar Seerapu wrote:
> > On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
> > > On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
> > > > On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
> > > > > On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
[Folks, would be nice to trim replies]
> > > > Could you please confirm if can go with the similar approach of unmap the
> > > > processed TREs based on a fixed threshold or constant value, instead of
> > > > unmapping them all at once?
> > >
> > > I'd still say, that's a bad idea. Please stay within the boundaries of
> > > the DMA API.
> > >
> > I agree with the approach you suggested—it's the GPI's responsibility to
> > manage the available TREs.
> >
> > However, I'm curious whether can we set a dynamic watermark value perhaps
> > half the available TREs) to trigger unmapping of processed TREs ? This would
> > allow the software to prepare the next set of TREs while the hardware
> > continues processing the remaining ones, enabling better parallelism and
> > throughput.
>
> Let's land the simple implementation first, which can then be improved.
> However I don't see any way to return 'above the watermark' from the DMA
> controller. You might need to enhance the API.
Traditionally, we set the dma transfers for watermark level and we get a
interrupt. So you might want to set the callback for watermark level
and then do mapping/unmapping etc in the callback. This is typical model
for dmaengines, we should follow that well
BR
--
~Vinod
The Arm Ethos-U65/85 NPUs are designed for edge AI inference
applications[0].
The driver works with Mesa Teflon. WIP support is available here[1]. The
UAPI should also be compatible with the downstream driver stack[2] and
Vela compiler though that has not been implemented.
Testing so far has been on i.MX93 boards with Ethos-U65. Support for U85
is still todo. Only minor changes on driver side will be needed for U85
support.
A git tree is here[3].
Rob
[0] https://www.arm.com/products/silicon-ip-cpu?families=ethos%20npus
[1] https://gitlab.freedesktop.org/tomeu/mesa.git ethos
[2] https://gitlab.arm.com/artificial-intelligence/ethos-u/
[3] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git ethos
Signed-off-by: Rob Herring (Arm) <robh(a)kernel.org>
---
Rob Herring (Arm) (2):
dt-bindings: npu: Add Arm Ethos-U65/U85
accel: Add Arm Ethos-U NPU driver
.../devicetree/bindings/npu/arm,ethos.yaml | 79 +++
MAINTAINERS | 9 +
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 1 +
drivers/accel/ethos/Kconfig | 10 +
drivers/accel/ethos/Makefile | 4 +
drivers/accel/ethos/ethos_device.h | 186 ++++++
drivers/accel/ethos/ethos_drv.c | 412 ++++++++++++
drivers/accel/ethos/ethos_drv.h | 15 +
drivers/accel/ethos/ethos_gem.c | 707 +++++++++++++++++++++
drivers/accel/ethos/ethos_gem.h | 46 ++
drivers/accel/ethos/ethos_job.c | 527 +++++++++++++++
drivers/accel/ethos/ethos_job.h | 41 ++
include/uapi/drm/ethos_accel.h | 262 ++++++++
14 files changed, 2300 insertions(+)
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250715-ethos-3fdd39ef6f19
Best regards,
--
Rob Herring (Arm) <robh(a)kernel.org>
Hi Jeff,
Am Montag, 21. Juli 2025, 16:55:01 Mitteleuropäische Sommerzeit schrieb Jeff Hugo:
> On 7/21/2025 3:17 AM, Tomeu Vizoso wrote:
> > This series adds a new driver for the NPU that Rockchip includes in its
> > newer SoCs, developed by them on the NVDLA base.
> >
> > In its current form, it supports the specific NPU in the RK3588 SoC.
> >
> > The userspace driver is part of Mesa and an initial draft can be found at:
> >
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698
> >
> > Signed-off-by: Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
>
> This (and the userspace component) appear ready for merge from what I
> can tell. Tomeu is still working on his drm-misc access so I've offered
> to merge on his behalf. Planning on waiting until Friday for any final
> feedback to come in before doing so.
sounds great.
Just to make sure, you're planning to merge patches 1-6 (driver + binding)
into drm-misc and I'll pick up the "arm64: dts: " patches 7-10 afterwards?
Heiko
On Mon, 21 Jul 2025 11:17:33 +0200, Tomeu Vizoso wrote:
> Add the bindings for the Neural Processing Unit IP from Rockchip.
>
> v2:
> - Adapt to new node structure (one node per core, each with its own
> IOMMU)
> - Several misc. fixes from Sebastian Reichel
>
> v3:
> - Split register block in its constituent subblocks, and only require
> the ones that the kernel would ever use (Nicolas Frattaroli)
> - Group supplies (Rob Herring)
> - Explain the way in which the top core is special (Rob Herring)
>
> v4:
> - Change required node name to npu@ (Rob Herring and Krzysztof Kozlowski)
> - Remove unneeded items: (Krzysztof Kozlowski)
> - Fix use of minItems/maxItems (Krzysztof Kozlowski)
> - Add reg-names to list of required properties (Krzysztof Kozlowski)
> - Fix example (Krzysztof Kozlowski)
>
> v5:
> - Rename file to rockchip,rk3588-rknn-core.yaml (Krzysztof Kozlowski)
> - Streamline compatible property (Krzysztof Kozlowski)
>
> v6:
> - Remove mention to NVDLA, as the hardware is only incidentally related
> (Kever Yang)
> - Mark pclk and npu clocks as required by all clocks (Rob Herring)
>
> v7:
> - Remove allOf section, not needed now that all nodes require 4 clocks
> (Heiko Stübner)
>
> v8:
> - Remove notion of top core (Robin Murphy)
>
> Signed-off-by: Sebastian Reichel <sebastian.reichel(a)collabora.com>
> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
> Tested-by: Heiko Stuebner <heiko(a)sntech.de>
> Signed-off-by: Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
> ---
> .../bindings/npu/rockchip,rk3588-rknn-core.yaml | 112 +++++++++++++++++++++
> 1 file changed, 112 insertions(+)
>
Reviewed-by: Rob Herring (Arm) <robh(a)kernel.org>
6.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Zimmermann <tzimmermann(a)suse.de>
commit f6bfc9afc7510cb5e6fbe0a17c507917b0120280 upstream.
Acquire GEM handles in drm_framebuffer_init() and release them in
the corresponding drm_framebuffer_cleanup(). Ties the handle's
lifetime to the framebuffer. Not all GEM buffer objects have GEM
handles. If not set, no refcounting takes place. This is the case
for some fbdev emulation. This is not a problem as these GEM objects
do not use dma-bufs and drivers will not release them while fbdev
emulation is running. Framebuffer flags keep a bit per color plane
of which the framebuffer holds a GEM handle reference.
As all drivers use drm_framebuffer_init(), they will now all hold
dma-buf references as fixed in commit 5307dce878d4 ("drm/gem: Acquire
references on GEM handles for framebuffers").
In the GEM framebuffer helpers, restore the original ref counting
on buffer objects. As the helpers for handle refcounting are now
no longer called from outside the DRM core, unexport the symbols.
v3:
- don't mix internal flags with mode flags (Christian)
v2:
- track framebuffer handle refs by flag
- drop gma500 cleanup (Christian)
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers")
Reported-by: Bert Karwatzki <spasswolf(a)web.de>
Closes: https://lore.kernel.org/dri-devel/20250703115915.3096-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf(a)web.de>
Tested-by: Mario Limonciello <superm1(a)kernel.org>
Tested-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Anusha Srivatsa <asrivats(a)redhat.com>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://lore.kernel.org/r/20250707131224.249496-1-tzimmermann@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/drm_framebuffer.c | 31 ++++++++++++++++++++--
drivers/gpu/drm/drm_gem.c | 38 +++++++++++++++++----------
drivers/gpu/drm/drm_gem_framebuffer_helper.c | 16 ++++-------
drivers/gpu/drm/drm_internal.h | 2 -
include/drm/drm_framebuffer.h | 7 ++++
5 files changed, 68 insertions(+), 26 deletions(-)
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -862,11 +862,23 @@ EXPORT_SYMBOL_FOR_TESTS_ONLY(drm_framebu
int drm_framebuffer_init(struct drm_device *dev, struct drm_framebuffer *fb,
const struct drm_framebuffer_funcs *funcs)
{
+ unsigned int i;
int ret;
+ bool exists;
if (WARN_ON_ONCE(fb->dev != dev || !fb->format))
return -EINVAL;
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (drm_WARN_ON_ONCE(dev, fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i)))
+ fb->internal_flags &= ~DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ if (fb->obj[i]) {
+ exists = drm_gem_object_handle_get_if_exists_unlocked(fb->obj[i]);
+ if (exists)
+ fb->internal_flags |= DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ }
+ }
+
INIT_LIST_HEAD(&fb->filp_head);
fb->funcs = funcs;
@@ -875,7 +887,7 @@ int drm_framebuffer_init(struct drm_devi
ret = __drm_mode_object_add(dev, &fb->base, DRM_MODE_OBJECT_FB,
false, drm_framebuffer_free);
if (ret)
- goto out;
+ goto err;
mutex_lock(&dev->mode_config.fb_lock);
dev->mode_config.num_fb++;
@@ -883,7 +895,16 @@ int drm_framebuffer_init(struct drm_devi
mutex_unlock(&dev->mode_config.fb_lock);
drm_mode_object_register(dev, &fb->base);
-out:
+
+ return 0;
+
+err:
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i)) {
+ drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ fb->internal_flags &= ~DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ }
+ }
return ret;
}
EXPORT_SYMBOL(drm_framebuffer_init);
@@ -960,6 +981,12 @@ EXPORT_SYMBOL(drm_framebuffer_unregister
void drm_framebuffer_cleanup(struct drm_framebuffer *fb)
{
struct drm_device *dev = fb->dev;
+ unsigned int i;
+
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i))
+ drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ }
mutex_lock(&dev->mode_config.fb_lock);
list_del(&fb->head);
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -223,23 +223,34 @@ static void drm_gem_object_handle_get(st
}
/**
- * drm_gem_object_handle_get_unlocked - acquire reference on user-space handles
+ * drm_gem_object_handle_get_if_exists_unlocked - acquire reference on user-space handle, if any
* @obj: GEM object
*
- * Acquires a reference on the GEM buffer object's handle. Required
- * to keep the GEM object alive. Call drm_gem_object_handle_put_unlocked()
- * to release the reference.
+ * Acquires a reference on the GEM buffer object's handle. Required to keep
+ * the GEM object alive. Call drm_gem_object_handle_put_if_exists_unlocked()
+ * to release the reference. Does nothing if the buffer object has no handle.
+ *
+ * Returns:
+ * True if a handle exists, or false otherwise
*/
-void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj)
+bool drm_gem_object_handle_get_if_exists_unlocked(struct drm_gem_object *obj)
{
struct drm_device *dev = obj->dev;
guard(mutex)(&dev->object_name_lock);
- drm_WARN_ON(dev, !obj->handle_count); /* first ref taken in create-tail helper */
+ /*
+ * First ref taken during GEM object creation, if any. Some
+ * drivers set up internal framebuffers with GEM objects that
+ * do not have a GEM handle. Hence, this counter can be zero.
+ */
+ if (!obj->handle_count)
+ return false;
+
drm_gem_object_handle_get(obj);
+
+ return true;
}
-EXPORT_SYMBOL(drm_gem_object_handle_get_unlocked);
/**
* drm_gem_object_handle_free - release resources bound to userspace handles
@@ -272,7 +283,7 @@ static void drm_gem_object_exported_dma_
}
/**
- * drm_gem_object_handle_put_unlocked - releases reference on user-space handles
+ * drm_gem_object_handle_put_unlocked - releases reference on user-space handle
* @obj: GEM object
*
* Releases a reference on the GEM buffer object's handle. Possibly releases
@@ -283,14 +294,14 @@ void drm_gem_object_handle_put_unlocked(
struct drm_device *dev = obj->dev;
bool final = false;
- if (WARN_ON(READ_ONCE(obj->handle_count) == 0))
+ if (drm_WARN_ON(dev, READ_ONCE(obj->handle_count) == 0))
return;
/*
- * Must bump handle count first as this may be the last
- * ref, in which case the object would disappear before we
- * checked for a name
- */
+ * Must bump handle count first as this may be the last
+ * ref, in which case the object would disappear before
+ * we checked for a name.
+ */
mutex_lock(&dev->object_name_lock);
if (--obj->handle_count == 0) {
@@ -303,7 +314,6 @@ void drm_gem_object_handle_put_unlocked(
if (final)
drm_gem_object_put(obj);
}
-EXPORT_SYMBOL(drm_gem_object_handle_put_unlocked);
/*
* Called at device or object close to release the file's
--- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c
+++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
@@ -99,7 +99,7 @@ void drm_gem_fb_destroy(struct drm_frame
unsigned int i;
for (i = 0; i < fb->format->num_planes; i++)
- drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ drm_gem_object_put(fb->obj[i]);
drm_framebuffer_cleanup(fb);
kfree(fb);
@@ -182,10 +182,8 @@ int drm_gem_fb_init_with_funcs(struct dr
if (!objs[i]) {
drm_dbg_kms(dev, "Failed to lookup GEM object\n");
ret = -ENOENT;
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
}
- drm_gem_object_handle_get_unlocked(objs[i]);
- drm_gem_object_put(objs[i]);
min_size = (height - 1) * mode_cmd->pitches[i]
+ drm_format_info_min_pitch(info, i, width)
@@ -195,22 +193,22 @@ int drm_gem_fb_init_with_funcs(struct dr
drm_dbg_kms(dev,
"GEM object size (%zu) smaller than minimum size (%u) for plane %d\n",
objs[i]->size, min_size, i);
- drm_gem_object_handle_put_unlocked(objs[i]);
+ drm_gem_object_put(objs[i]);
ret = -EINVAL;
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
}
}
ret = drm_gem_fb_init(dev, fb, mode_cmd, objs, i, funcs);
if (ret)
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
return 0;
-err_gem_object_handle_put_unlocked:
+err_gem_object_put:
while (i > 0) {
--i;
- drm_gem_object_handle_put_unlocked(objs[i]);
+ drm_gem_object_put(objs[i]);
}
return ret;
}
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -161,7 +161,7 @@ void drm_sysfs_lease_event(struct drm_de
/* drm_gem.c */
int drm_gem_init(struct drm_device *dev);
-void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj);
+bool drm_gem_object_handle_get_if_exists_unlocked(struct drm_gem_object *obj);
void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj);
int drm_gem_handle_create_tail(struct drm_file *file_priv,
struct drm_gem_object *obj,
--- a/include/drm/drm_framebuffer.h
+++ b/include/drm/drm_framebuffer.h
@@ -23,6 +23,7 @@
#ifndef __DRM_FRAMEBUFFER_H__
#define __DRM_FRAMEBUFFER_H__
+#include <linux/bits.h>
#include <linux/ctype.h>
#include <linux/list.h>
#include <linux/sched.h>
@@ -100,6 +101,8 @@ struct drm_framebuffer_funcs {
unsigned num_clips);
};
+#define DRM_FRAMEBUFFER_HAS_HANDLE_REF(_i) BIT(0u + (_i))
+
/**
* struct drm_framebuffer - frame buffer object
*
@@ -189,6 +192,10 @@ struct drm_framebuffer {
*/
int flags;
/**
+ * @internal_flags: Framebuffer flags like DRM_FRAMEBUFFER_HAS_HANDLE_REF.
+ */
+ unsigned int internal_flags;
+ /**
* @filp_head: Placed on &drm_file.fbs, protected by &drm_file.fbs_lock.
*/
struct list_head filp_head;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Zimmermann <tzimmermann(a)suse.de>
commit f6bfc9afc7510cb5e6fbe0a17c507917b0120280 upstream.
Acquire GEM handles in drm_framebuffer_init() and release them in
the corresponding drm_framebuffer_cleanup(). Ties the handle's
lifetime to the framebuffer. Not all GEM buffer objects have GEM
handles. If not set, no refcounting takes place. This is the case
for some fbdev emulation. This is not a problem as these GEM objects
do not use dma-bufs and drivers will not release them while fbdev
emulation is running. Framebuffer flags keep a bit per color plane
of which the framebuffer holds a GEM handle reference.
As all drivers use drm_framebuffer_init(), they will now all hold
dma-buf references as fixed in commit 5307dce878d4 ("drm/gem: Acquire
references on GEM handles for framebuffers").
In the GEM framebuffer helpers, restore the original ref counting
on buffer objects. As the helpers for handle refcounting are now
no longer called from outside the DRM core, unexport the symbols.
v3:
- don't mix internal flags with mode flags (Christian)
v2:
- track framebuffer handle refs by flag
- drop gma500 cleanup (Christian)
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers")
Reported-by: Bert Karwatzki <spasswolf(a)web.de>
Closes: https://lore.kernel.org/dri-devel/20250703115915.3096-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf(a)web.de>
Tested-by: Mario Limonciello <superm1(a)kernel.org>
Tested-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Anusha Srivatsa <asrivats(a)redhat.com>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://lore.kernel.org/r/20250707131224.249496-1-tzimmermann@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/drm_framebuffer.c | 31 ++++++++++++++++++++--
drivers/gpu/drm/drm_gem.c | 38 +++++++++++++++++----------
drivers/gpu/drm/drm_gem_framebuffer_helper.c | 16 ++++-------
drivers/gpu/drm/drm_internal.h | 2 -
include/drm/drm_framebuffer.h | 7 ++++
5 files changed, 68 insertions(+), 26 deletions(-)
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -844,11 +844,23 @@ void drm_framebuffer_free(struct kref *k
int drm_framebuffer_init(struct drm_device *dev, struct drm_framebuffer *fb,
const struct drm_framebuffer_funcs *funcs)
{
+ unsigned int i;
int ret;
+ bool exists;
if (WARN_ON_ONCE(fb->dev != dev || !fb->format))
return -EINVAL;
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (drm_WARN_ON_ONCE(dev, fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i)))
+ fb->internal_flags &= ~DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ if (fb->obj[i]) {
+ exists = drm_gem_object_handle_get_if_exists_unlocked(fb->obj[i]);
+ if (exists)
+ fb->internal_flags |= DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ }
+ }
+
INIT_LIST_HEAD(&fb->filp_head);
fb->funcs = funcs;
@@ -857,7 +869,7 @@ int drm_framebuffer_init(struct drm_devi
ret = __drm_mode_object_add(dev, &fb->base, DRM_MODE_OBJECT_FB,
false, drm_framebuffer_free);
if (ret)
- goto out;
+ goto err;
mutex_lock(&dev->mode_config.fb_lock);
dev->mode_config.num_fb++;
@@ -865,7 +877,16 @@ int drm_framebuffer_init(struct drm_devi
mutex_unlock(&dev->mode_config.fb_lock);
drm_mode_object_register(dev, &fb->base);
-out:
+
+ return 0;
+
+err:
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i)) {
+ drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ fb->internal_flags &= ~DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ }
+ }
return ret;
}
EXPORT_SYMBOL(drm_framebuffer_init);
@@ -942,6 +963,12 @@ EXPORT_SYMBOL(drm_framebuffer_unregister
void drm_framebuffer_cleanup(struct drm_framebuffer *fb)
{
struct drm_device *dev = fb->dev;
+ unsigned int i;
+
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i))
+ drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ }
mutex_lock(&dev->mode_config.fb_lock);
list_del(&fb->head);
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -197,23 +197,34 @@ static void drm_gem_object_handle_get(st
}
/**
- * drm_gem_object_handle_get_unlocked - acquire reference on user-space handles
+ * drm_gem_object_handle_get_if_exists_unlocked - acquire reference on user-space handle, if any
* @obj: GEM object
*
- * Acquires a reference on the GEM buffer object's handle. Required
- * to keep the GEM object alive. Call drm_gem_object_handle_put_unlocked()
- * to release the reference.
+ * Acquires a reference on the GEM buffer object's handle. Required to keep
+ * the GEM object alive. Call drm_gem_object_handle_put_if_exists_unlocked()
+ * to release the reference. Does nothing if the buffer object has no handle.
+ *
+ * Returns:
+ * True if a handle exists, or false otherwise
*/
-void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj)
+bool drm_gem_object_handle_get_if_exists_unlocked(struct drm_gem_object *obj)
{
struct drm_device *dev = obj->dev;
guard(mutex)(&dev->object_name_lock);
- drm_WARN_ON(dev, !obj->handle_count); /* first ref taken in create-tail helper */
+ /*
+ * First ref taken during GEM object creation, if any. Some
+ * drivers set up internal framebuffers with GEM objects that
+ * do not have a GEM handle. Hence, this counter can be zero.
+ */
+ if (!obj->handle_count)
+ return false;
+
drm_gem_object_handle_get(obj);
+
+ return true;
}
-EXPORT_SYMBOL(drm_gem_object_handle_get_unlocked);
/**
* drm_gem_object_handle_free - release resources bound to userspace handles
@@ -246,7 +257,7 @@ static void drm_gem_object_exported_dma_
}
/**
- * drm_gem_object_handle_put_unlocked - releases reference on user-space handles
+ * drm_gem_object_handle_put_unlocked - releases reference on user-space handle
* @obj: GEM object
*
* Releases a reference on the GEM buffer object's handle. Possibly releases
@@ -257,14 +268,14 @@ void drm_gem_object_handle_put_unlocked(
struct drm_device *dev = obj->dev;
bool final = false;
- if (WARN_ON(READ_ONCE(obj->handle_count) == 0))
+ if (drm_WARN_ON(dev, READ_ONCE(obj->handle_count) == 0))
return;
/*
- * Must bump handle count first as this may be the last
- * ref, in which case the object would disappear before we
- * checked for a name
- */
+ * Must bump handle count first as this may be the last
+ * ref, in which case the object would disappear before
+ * we checked for a name.
+ */
mutex_lock(&dev->object_name_lock);
if (--obj->handle_count == 0) {
@@ -277,7 +288,6 @@ void drm_gem_object_handle_put_unlocked(
if (final)
drm_gem_object_put(obj);
}
-EXPORT_SYMBOL(drm_gem_object_handle_put_unlocked);
/*
* Called at device or object close to release the file's
--- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c
+++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
@@ -99,7 +99,7 @@ void drm_gem_fb_destroy(struct drm_frame
unsigned int i;
for (i = 0; i < fb->format->num_planes; i++)
- drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ drm_gem_object_put(fb->obj[i]);
drm_framebuffer_cleanup(fb);
kfree(fb);
@@ -182,10 +182,8 @@ int drm_gem_fb_init_with_funcs(struct dr
if (!objs[i]) {
drm_dbg_kms(dev, "Failed to lookup GEM object\n");
ret = -ENOENT;
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
}
- drm_gem_object_handle_get_unlocked(objs[i]);
- drm_gem_object_put(objs[i]);
min_size = (height - 1) * mode_cmd->pitches[i]
+ drm_format_info_min_pitch(info, i, width)
@@ -195,22 +193,22 @@ int drm_gem_fb_init_with_funcs(struct dr
drm_dbg_kms(dev,
"GEM object size (%zu) smaller than minimum size (%u) for plane %d\n",
objs[i]->size, min_size, i);
- drm_gem_object_handle_put_unlocked(objs[i]);
+ drm_gem_object_put(objs[i]);
ret = -EINVAL;
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
}
}
ret = drm_gem_fb_init(dev, fb, mode_cmd, objs, i, funcs);
if (ret)
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
return 0;
-err_gem_object_handle_put_unlocked:
+err_gem_object_put:
while (i > 0) {
--i;
- drm_gem_object_handle_put_unlocked(objs[i]);
+ drm_gem_object_put(objs[i]);
}
return ret;
}
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -155,7 +155,7 @@ void drm_sysfs_lease_event(struct drm_de
/* drm_gem.c */
int drm_gem_init(struct drm_device *dev);
-void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj);
+bool drm_gem_object_handle_get_if_exists_unlocked(struct drm_gem_object *obj);
void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj);
int drm_gem_handle_create_tail(struct drm_file *file_priv,
struct drm_gem_object *obj,
--- a/include/drm/drm_framebuffer.h
+++ b/include/drm/drm_framebuffer.h
@@ -23,6 +23,7 @@
#ifndef __DRM_FRAMEBUFFER_H__
#define __DRM_FRAMEBUFFER_H__
+#include <linux/bits.h>
#include <linux/ctype.h>
#include <linux/list.h>
#include <linux/sched.h>
@@ -100,6 +101,8 @@ struct drm_framebuffer_funcs {
unsigned num_clips);
};
+#define DRM_FRAMEBUFFER_HAS_HANDLE_REF(_i) BIT(0u + (_i))
+
/**
* struct drm_framebuffer - frame buffer object
*
@@ -189,6 +192,10 @@ struct drm_framebuffer {
*/
int flags;
/**
+ * @internal_flags: Framebuffer flags like DRM_FRAMEBUFFER_HAS_HANDLE_REF.
+ */
+ unsigned int internal_flags;
+ /**
* @hot_x: X coordinate of the cursor hotspot. Used by the legacy cursor
* IOCTL when the driver supports cursor through a DRM_PLANE_TYPE_CURSOR
* universal plane.
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Zimmermann <tzimmermann(a)suse.de>
commit f6bfc9afc7510cb5e6fbe0a17c507917b0120280 upstream.
Acquire GEM handles in drm_framebuffer_init() and release them in
the corresponding drm_framebuffer_cleanup(). Ties the handle's
lifetime to the framebuffer. Not all GEM buffer objects have GEM
handles. If not set, no refcounting takes place. This is the case
for some fbdev emulation. This is not a problem as these GEM objects
do not use dma-bufs and drivers will not release them while fbdev
emulation is running. Framebuffer flags keep a bit per color plane
of which the framebuffer holds a GEM handle reference.
As all drivers use drm_framebuffer_init(), they will now all hold
dma-buf references as fixed in commit 5307dce878d4 ("drm/gem: Acquire
references on GEM handles for framebuffers").
In the GEM framebuffer helpers, restore the original ref counting
on buffer objects. As the helpers for handle refcounting are now
no longer called from outside the DRM core, unexport the symbols.
v3:
- don't mix internal flags with mode flags (Christian)
v2:
- track framebuffer handle refs by flag
- drop gma500 cleanup (Christian)
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers")
Reported-by: Bert Karwatzki <spasswolf(a)web.de>
Closes: https://lore.kernel.org/dri-devel/20250703115915.3096-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf(a)web.de>
Tested-by: Mario Limonciello <superm1(a)kernel.org>
Tested-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Anusha Srivatsa <asrivats(a)redhat.com>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://lore.kernel.org/r/20250707131224.249496-1-tzimmermann@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/drm_framebuffer.c | 31 ++++++++++++++++++++--
drivers/gpu/drm/drm_gem.c | 38 +++++++++++++++++----------
drivers/gpu/drm/drm_gem_framebuffer_helper.c | 16 ++++-------
drivers/gpu/drm/drm_internal.h | 2 -
include/drm/drm_framebuffer.h | 7 ++++
5 files changed, 68 insertions(+), 26 deletions(-)
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -860,11 +860,23 @@ void drm_framebuffer_free(struct kref *k
int drm_framebuffer_init(struct drm_device *dev, struct drm_framebuffer *fb,
const struct drm_framebuffer_funcs *funcs)
{
+ unsigned int i;
int ret;
+ bool exists;
if (WARN_ON_ONCE(fb->dev != dev || !fb->format))
return -EINVAL;
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (drm_WARN_ON_ONCE(dev, fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i)))
+ fb->internal_flags &= ~DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ if (fb->obj[i]) {
+ exists = drm_gem_object_handle_get_if_exists_unlocked(fb->obj[i]);
+ if (exists)
+ fb->internal_flags |= DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ }
+ }
+
INIT_LIST_HEAD(&fb->filp_head);
fb->funcs = funcs;
@@ -873,7 +885,7 @@ int drm_framebuffer_init(struct drm_devi
ret = __drm_mode_object_add(dev, &fb->base, DRM_MODE_OBJECT_FB,
false, drm_framebuffer_free);
if (ret)
- goto out;
+ goto err;
mutex_lock(&dev->mode_config.fb_lock);
dev->mode_config.num_fb++;
@@ -881,7 +893,16 @@ int drm_framebuffer_init(struct drm_devi
mutex_unlock(&dev->mode_config.fb_lock);
drm_mode_object_register(dev, &fb->base);
-out:
+
+ return 0;
+
+err:
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i)) {
+ drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ fb->internal_flags &= ~DRM_FRAMEBUFFER_HAS_HANDLE_REF(i);
+ }
+ }
return ret;
}
EXPORT_SYMBOL(drm_framebuffer_init);
@@ -958,6 +979,12 @@ EXPORT_SYMBOL(drm_framebuffer_unregister
void drm_framebuffer_cleanup(struct drm_framebuffer *fb)
{
struct drm_device *dev = fb->dev;
+ unsigned int i;
+
+ for (i = 0; i < fb->format->num_planes; i++) {
+ if (fb->internal_flags & DRM_FRAMEBUFFER_HAS_HANDLE_REF(i))
+ drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ }
mutex_lock(&dev->mode_config.fb_lock);
list_del(&fb->head);
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -197,23 +197,34 @@ static void drm_gem_object_handle_get(st
}
/**
- * drm_gem_object_handle_get_unlocked - acquire reference on user-space handles
+ * drm_gem_object_handle_get_if_exists_unlocked - acquire reference on user-space handle, if any
* @obj: GEM object
*
- * Acquires a reference on the GEM buffer object's handle. Required
- * to keep the GEM object alive. Call drm_gem_object_handle_put_unlocked()
- * to release the reference.
+ * Acquires a reference on the GEM buffer object's handle. Required to keep
+ * the GEM object alive. Call drm_gem_object_handle_put_if_exists_unlocked()
+ * to release the reference. Does nothing if the buffer object has no handle.
+ *
+ * Returns:
+ * True if a handle exists, or false otherwise
*/
-void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj)
+bool drm_gem_object_handle_get_if_exists_unlocked(struct drm_gem_object *obj)
{
struct drm_device *dev = obj->dev;
guard(mutex)(&dev->object_name_lock);
- drm_WARN_ON(dev, !obj->handle_count); /* first ref taken in create-tail helper */
+ /*
+ * First ref taken during GEM object creation, if any. Some
+ * drivers set up internal framebuffers with GEM objects that
+ * do not have a GEM handle. Hence, this counter can be zero.
+ */
+ if (!obj->handle_count)
+ return false;
+
drm_gem_object_handle_get(obj);
+
+ return true;
}
-EXPORT_SYMBOL(drm_gem_object_handle_get_unlocked);
/**
* drm_gem_object_handle_free - release resources bound to userspace handles
@@ -246,7 +257,7 @@ static void drm_gem_object_exported_dma_
}
/**
- * drm_gem_object_handle_put_unlocked - releases reference on user-space handles
+ * drm_gem_object_handle_put_unlocked - releases reference on user-space handle
* @obj: GEM object
*
* Releases a reference on the GEM buffer object's handle. Possibly releases
@@ -257,14 +268,14 @@ void drm_gem_object_handle_put_unlocked(
struct drm_device *dev = obj->dev;
bool final = false;
- if (WARN_ON(READ_ONCE(obj->handle_count) == 0))
+ if (drm_WARN_ON(dev, READ_ONCE(obj->handle_count) == 0))
return;
/*
- * Must bump handle count first as this may be the last
- * ref, in which case the object would disappear before we
- * checked for a name
- */
+ * Must bump handle count first as this may be the last
+ * ref, in which case the object would disappear before
+ * we checked for a name.
+ */
mutex_lock(&dev->object_name_lock);
if (--obj->handle_count == 0) {
@@ -277,7 +288,6 @@ void drm_gem_object_handle_put_unlocked(
if (final)
drm_gem_object_put(obj);
}
-EXPORT_SYMBOL(drm_gem_object_handle_put_unlocked);
/*
* Called at device or object close to release the file's
--- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c
+++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
@@ -99,7 +99,7 @@ void drm_gem_fb_destroy(struct drm_frame
unsigned int i;
for (i = 0; i < fb->format->num_planes; i++)
- drm_gem_object_handle_put_unlocked(fb->obj[i]);
+ drm_gem_object_put(fb->obj[i]);
drm_framebuffer_cleanup(fb);
kfree(fb);
@@ -182,10 +182,8 @@ int drm_gem_fb_init_with_funcs(struct dr
if (!objs[i]) {
drm_dbg_kms(dev, "Failed to lookup GEM object\n");
ret = -ENOENT;
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
}
- drm_gem_object_handle_get_unlocked(objs[i]);
- drm_gem_object_put(objs[i]);
min_size = (height - 1) * mode_cmd->pitches[i]
+ drm_format_info_min_pitch(info, i, width)
@@ -195,22 +193,22 @@ int drm_gem_fb_init_with_funcs(struct dr
drm_dbg_kms(dev,
"GEM object size (%zu) smaller than minimum size (%u) for plane %d\n",
objs[i]->size, min_size, i);
- drm_gem_object_handle_put_unlocked(objs[i]);
+ drm_gem_object_put(objs[i]);
ret = -EINVAL;
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
}
}
ret = drm_gem_fb_init(dev, fb, mode_cmd, objs, i, funcs);
if (ret)
- goto err_gem_object_handle_put_unlocked;
+ goto err_gem_object_put;
return 0;
-err_gem_object_handle_put_unlocked:
+err_gem_object_put:
while (i > 0) {
--i;
- drm_gem_object_handle_put_unlocked(objs[i]);
+ drm_gem_object_put(objs[i]);
}
return ret;
}
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -153,7 +153,7 @@ void drm_sysfs_lease_event(struct drm_de
/* drm_gem.c */
int drm_gem_init(struct drm_device *dev);
-void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj);
+bool drm_gem_object_handle_get_if_exists_unlocked(struct drm_gem_object *obj);
void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj);
int drm_gem_handle_create_tail(struct drm_file *file_priv,
struct drm_gem_object *obj,
--- a/include/drm/drm_framebuffer.h
+++ b/include/drm/drm_framebuffer.h
@@ -23,6 +23,7 @@
#ifndef __DRM_FRAMEBUFFER_H__
#define __DRM_FRAMEBUFFER_H__
+#include <linux/bits.h>
#include <linux/ctype.h>
#include <linux/list.h>
#include <linux/sched.h>
@@ -100,6 +101,8 @@ struct drm_framebuffer_funcs {
unsigned num_clips);
};
+#define DRM_FRAMEBUFFER_HAS_HANDLE_REF(_i) BIT(0u + (_i))
+
/**
* struct drm_framebuffer - frame buffer object
*
@@ -189,6 +192,10 @@ struct drm_framebuffer {
*/
int flags;
/**
+ * @internal_flags: Framebuffer flags like DRM_FRAMEBUFFER_HAS_HANDLE_REF.
+ */
+ unsigned int internal_flags;
+ /**
* @filp_head: Placed on &drm_file.fbs, protected by &drm_file.fbs_lock.
*/
struct list_head filp_head;