Hi,
On Mon, Oct 23, 2023 at 10:25:50AM -0700, Doug Anderson wrote:
> On Mon, Oct 23, 2023 at 9:31 AM Yuran Pereira <yuran.pereira(a)hotmail.com> wrote:
> >
> > Since "Clean up checks for already prepared/enabled in panels" has
> > already been done and merged [1], I think there is no longer a need
> > for this item to be in the gpu TODO.
> >
> > [1] https://patchwork.freedesktop.org/patch/551421/
> >
> > Signed-off-by: Yuran Pereira <yuran.pereira(a)hotmail.com>
> > ---
> > Documentation/gpu/todo.rst | 25 -------------------------
> > 1 file changed, 25 deletions(-)
>
> It's not actually all done. It's in a bit of a limbo state right now,
> unfortunately. I landed all of the "simple" cases where panels were
> needlessly tracking prepare/enable, but the less simple cases are
> still outstanding.
>
> Specifically the issue is that many panels have code to properly power
> cycle themselves off at shutdown time and in order to do that they
> need to keep track of the prepare/enable state. After a big, long
> discussion [1] it was decided that we could get rid of all the panel
> code handling shutdown if only all relevant DRM KMS drivers would
> properly call drm_atomic_helper_shutdown().
>
> I made an attempt to get DRM KMS drivers to call
> drm_atomic_helper_shutdown() [2] [3] [4]. I was able to land the
> patches that went through drm-misc, but currently many of the
> non-drm-misc ones are blocked waiting for attention.
>
> ...so things that could be done to help out:
>
> a) Could review patches that haven't landed in [4]. Maybe adding a
> Reviewed-by tag would help wake up maintainers?
>
> b) Could see if you can identify panels that are exclusively used w/
> DRM drivers that have already been converted and then we could post
> patches for just those panels. I have no idea how easy this task would
> be. Is it enough to look at upstream dts files by "compatible" string?
I think it is, yes.
Maxime
Hi all,
This series is based on previous RFCs/discussions:
Tech topic: https://lore.kernel.org/linux-iommu/20250918214425.2677057-1-amastro@fb.com/
RFCv1: https://lore.kernel.org/all/20260226202211.929005-1-mattev@meta.com/
RFCv2: https://lore.kernel.org/kvm/20260312184613.3710705-1-mattev@meta.com/
The background/rationale is covered in more detail in the RFC cover
letters. The TL;DR is:
The goal is to enable userspace driver designs that use VFIO to export
DMABUFs representing subsets of PCI device BARs, and "vend" those
buffers from a primary process to other subordinate processes by fd.
These processes then mmap() the buffers and their access to the device
is isolated to the exported ranges. This is an improvement on sharing
the VFIO device fd to subordinate processes, which would allow
unfettered access.
This is achieved in two parts. First, mmap() is enabled for vfio-pci
DMABUFs, which are passed by fd to subordinate processes. Second, a
new ioctl()-based revocation mechanism is added to allow the primary
process to forcibly revoke access to previously-shared BAR spans, even
if the subordinate processes haven't cleanly exited.
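As an illustration of the "vend by fd" step only, here is a minimal userspace sketch of passing a file descriptor between processes over an AF_UNIX socket with SCM_RIGHTS. It is not taken from the series: send_fd() and recv_fd() are hypothetical helper names, and any fd can stand in for the exported DMABUF fd.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send one fd over a connected AF_UNIX socket. */
int send_fd(int sock, int fd)
{
	char byte = 'F';	/* one-byte payload that carries the control message */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment of buf */
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd >= 0);
	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd; returns the new fd or -1. */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the design described above, the receiving process would then mmap() the received fd; that step depends on the kernel changes in this series and is not shown.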
(The related topic of safe delegation of iommufd control to the
subordinate processes is not addressed here, and is follow-up work.)
As well as isolation and revocation, another advantage to accessing a
BAR through a VMA backed by a DMABUF is that it's straightforward to
mmap() the buffer with access attributes, such as write-combining.
Feedback from the RFCs requested that, instead of creating
DMABUF-specific vm_ops and .fault paths, we go the whole way and
migrate the existing VFIO PCI BAR mmap() to be backed by a DMABUF too,
resulting in a common vm_ops and fault handler for mmap()s of both the
VFIO device and explicitly-exported DMABUFs. This will help future
iommufd emulation of VFIO Type1 peer-to-peer, making it easier to get
a DMABUF for a VFIO BAR as a DMA target.
mmap() conversion to use DMABUF underneath has been done for vfio-pci,
but not sub-drivers:
nvgrace-gpu's mmap() override path is unchanged; I kept this out of
scope for now not least because I don't have a thorough test setup
for this system. I would prefer to help the nvgrace-gpu maintainers
enable BAR mmap() DMABUFs themselves.
Notes on patches
================
PCI/P2PDMA: Add CONFIG_PCI_P2PDMA_CORE
Later in the series, vfio-pci's mmap() is going to depend on
pcim_p2pdma_provider() which depended on CONFIG_PCI_P2PDMA, which
in turn depended on ZONE_DEVICE (which isn't available on 32-bit
and some archs, because they lack MEMORY_HOTPLUG and friends).
VFIO does _not_ require actual P2P to be present for basic mmap()
functionality, only for the optional CONFIG_DMA_SHARED_BUFFER
feature.
This splits P2PDMA into a CONFIG_PCI_P2PDMA_CORE (which currently
contains pcim_p2pdma_provider()) and an optional CONFIG_PCI_P2PDMA
(which depends on ZONE_DEVICE etc., and provides P2P
functionality).
vfio/pci: Add a helper to look up PFNs for DMABUFs
vfio/pci: Add a helper to create a DMABUF for a BAR-map VMA
The first patch adds a helper that lets a DMABUF VMA fault handler
determine arbitrary-sized PFNs from the ranges in a DMABUF. The
second refactors DMABUF export for use by the existing export
feature and adds a new helper that creates a DMABUF corresponding to
a VFIO BAR mmap() request.
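To picture the PFN lookup, here is a simplified userspace sketch, not the code from the patch. It assumes a DMABUF is described by an ordered array of page-aligned physical ranges and resolves base pages only (no huge-page orders); struct sketch_range, sketch_find_pfn() and SKETCH_PAGE_SHIFT are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/* Hypothetical description of one exported range. */
struct sketch_range {
	uint64_t phys;	/* page-aligned physical start address */
	uint64_t len;	/* page-aligned length in bytes */
};

/*
 * Translate a page offset into the buffer (the ranges laid end to end)
 * into a PFN. Returns 0 on success, -1 if pgoff is past the end.
 */
int sketch_find_pfn(const struct sketch_range *r, size_t n,
		    uint64_t pgoff, uint64_t *pfn)
{
	assert(r != NULL && pfn != NULL);

	for (size_t i = 0; i < n; i++) {
		uint64_t pages = r[i].len >> SKETCH_PAGE_SHIFT;

		if (pgoff < pages) {
			*pfn = (r[i].phys >> SKETCH_PAGE_SHIFT) + pgoff;
			return 0;
		}
		pgoff -= pages;	/* skip this range and keep walking */
	}
	return -1;
}
```

A fault handler built on such a lookup would compute pgoff from the faulting address and vm_pgoff, then insert the resulting PFN.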
vfio/pci: Convert BAR mmap() to use a DMABUF
The vfio-pci core mmap() creates a DMABUF with the helper, and the
vm_ops fault handler uses the other helper to resolve the fault.
Because this depends on DMABUF structs/code, CONFIG_VFIO_PCI_CORE
needs to depend on CONFIG_DMA_SHARED_BUFFER. The
CONFIG_VFIO_PCI_DMABUF still conditionally enables the export
support code.
NOTE: The user mmap()s a device fd, but the resulting VMA's vm_file
becomes that of the DMABUF which takes ownership of the device and
puts it on release. This maintains the existing behaviour of a VMA
keeping the VFIO device open.
BAR zapping then happens via the existing vfio_pci_dma_buf_move()
path, which now needs to unmap PTEs in the DMABUF's address_space.
vfio/pci: Provide a user-facing name for BAR mappings
There was a request for decent debug naming in /proc/<pid>/maps
etc. comparable to the existing VFIO names: since the VMAs are
DMABUFs, they have a "dmabuf:" prefix and can't be 100% identical
to before. This is a user-visible change, but this patch at least
now gives us extra info on the BDF & BAR being mapped.
vfio/pci: Clean up BAR zap and revocation
In general (see NOTE!) the vfio_pci_zap_bars() is now obsolete,
since it unmaps PTEs in the VFIO device address_space which is now
unused. This consolidates all calls (e.g. around reset) with the
neighbouring vfio_pci_dma_buf_move()s into new functions, to
revoke-zap/unrevoke.
!!! NOTE: the nvgrace-gpu driver continues to use its own private
vm_ops, fault handler, etc. for its special memregions, and these
DO still add PTEs to the VFIO device address_space. So, a
temporary flag, vdev->bar_needs_zap, maintains the old behaviour
for this use. At least this patch's consolidation makes it easy to
remove the remaining zap when this need goes away; a FIXME reminds
that this can be removed when nvgrace-gpu is converted.
vfio/pci: Support mmap() of a VFIO DMABUF
Adds mmap() for a DMABUF fd exported from vfio-pci.
It was a goal to keep the VFIO device fd lifetime behaviour
unchanged with respect to the DMABUFs. An application can close
all device fds, and this will revoke/clean up all DMABUFs; no
mappings or other access can be performed now. When enabling
mmap() of the DMABUFs, this means access through the VMA is also
revoked. This complicates the fault handler because whilst the
DMABUF exists, it has no guarantee that the corresponding VFIO
device is still alive. Adds synchronisation ensuring the vdev is
available before vdev->memory_lock is touched; this holds the
device registration so that even if the buffer has been cleaned up,
vdev hasn't been freed and so the lock can be safely taken.
(I decided against the alternative of preventing cleanup by holding
the VFIO device open if any DMABUFs exist, because it's both a
change of behaviour and less clean overall.)
I've added a chonky comment in place, happy to clarify more if you
have ideas.
This commit makes VFIO_PCI_CORE depend on PCI_P2PDMA_CORE (commit
1) to bring in (only) the P2PDMA provider code.
vfio/pci: Permanently revoke a DMABUF on request
By weight, this is mostly a rename of revoked to an enum, status.
A buffer now has three states: usable, temporarily revoked, and
permanently revoked. A new VFIO device ioctl is added,
VFIO_DEVICE_PCI_DMABUF_REVOKE, which passes a DMABUF (exported from
that device) and permanently revokes it. Thus a userspace driver
can guarantee any downstream consumers of a shared fd are prevented
from accessing a BAR range, and that range can be reused.
The code doing revocation in vfio_pci_dma_buf_move() is moved,
unchanged, to a common function for use by _move() and the new
ioctl path.
Q: I can't think of a good reason to temporarily revoke/unrevoke
buffers from userspace, so didn't add a 'flags' field to the ioctl
struct. Easy to add if people think it's worthwhile for future
use.
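The three-state model can be sketched as a tiny state machine. This is an illustration under assumptions, not the patch's code: the enum and function names are hypothetical, and it assumes a permanent revoke is sticky, so a later unrevoke has no effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three buffer states. */
enum sketch_status {
	SKETCH_USABLE,
	SKETCH_REVOKED_TEMP,	/* e.g. around a reset, can be undone */
	SKETCH_REVOKED_PERM,	/* requested by ioctl, never undone */
};

/* Temporary revoke: only downgrades a usable buffer. */
void sketch_revoke_temp(enum sketch_status *s)
{
	assert(s != NULL);
	if (*s == SKETCH_USABLE)
		*s = SKETCH_REVOKED_TEMP;
}

/* Unrevoke: restores access only after a temporary revoke. */
void sketch_unrevoke(enum sketch_status *s)
{
	assert(s != NULL);
	if (*s == SKETCH_REVOKED_TEMP)
		*s = SKETCH_USABLE;
}

/* Permanent revoke: overrides any other state. */
void sketch_revoke_perm(enum sketch_status *s)
{
	assert(s != NULL);
	*s = SKETCH_REVOKED_PERM;
}

/* A mapping or fault is only allowed while the buffer is usable. */
int sketch_can_access(enum sketch_status s)
{
	return s == SKETCH_USABLE;
}
```

Using a scalar state rather than two booleans keeps the "revoked permanently, then unrevoked by a reset path" case from accidentally restoring access.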
vfio/pci: Add mmap() attributes to DMABUF feature
Adds a new VFIO feature, VFIO_DEVICE_FEATURE_DMA_BUF_MEMATTR.
After a DMABUF is exported, this feature ioctl() is used to set a
memory attribute that will be used by future mmap()s of the DMABUF
fd (i.e. it does nothing for any existing maps).
The default is UC, and via the feature one can specify CPU access
as WC. The attribute is an enum/scalar rather than
bitmap/cumulative. The attributes follow a "try-fail" model where
a client can request an attribute and either succeed or fail with
ENOTSUPP if it's unknown; if future attributes are
platform-specific then their support can be probed.
(Since it's just UC/WC for now, there is no reservation or numeric
structure to the namespace yet, but we could support
system/arch-specific values in future by carving out base +
arch-specific + IMPDEF ranges.)
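The "try-fail" model for attributes can be sketched as below. This is a userspace illustration only: the enum values, struct and function names are hypothetical and are not the uAPI from the patch. SKETCH_ENOTSUPP mirrors the kernel-internal ENOTSUPP value (524), which userspace errno.h does not define.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ENOTSUPP 524	/* kernel-internal ENOTSUPP, assumed here */

/* Scalar attribute values, not a bitmap (hypothetical numbering). */
enum sketch_memattr {
	SKETCH_MEMATTR_UC = 0,	/* default: uncached */
	SKETCH_MEMATTR_WC = 1,	/* write-combining */
};

struct sketch_dmabuf {
	enum sketch_memattr memattr;	/* applied to future mmap()s only */
};

void sketch_dmabuf_init(struct sketch_dmabuf *b)
{
	assert(b != NULL);
	b->memattr = SKETCH_MEMATTR_UC;
}

/* Accept known attributes; reject anything else without side effects. */
int sketch_set_memattr(struct sketch_dmabuf *b, unsigned int requested)
{
	assert(b != NULL);

	switch (requested) {
	case SKETCH_MEMATTR_UC:
	case SKETCH_MEMATTR_WC:
		b->memattr = (enum sketch_memattr)requested;
		return 0;
	default:
		return -SKETCH_ENOTSUPP;
	}
}
```

A client probing for a platform-specific attribute would simply request it and fall back to a known value on failure.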
Testing
=======
(The [RFC ONLY] userspace test program, for QEMU edu-plus, has been
dropped from the series, but can be found in the GitHub branch below.
It at least illustrates how the export, map, revoke, attribute, and
close semantics interoperate.)
This code has been tested with DMABUFs of single/multiple ranges,
aliasing mmap()s, aliasing ranges across DMABUFs, vm_pgoff > 0,
revocation, and shutdown/cleanup scenarios; hugepage mappings also seem
to work correctly. I've lightly tested WC mappings also (by observing
resulting PTEs as having the correct attributes...). No regressions
observed on the VFIO selftests, or on our internal vfio-pci
applications.
End
===
This is based on VFIO next (e.g. at b9285405c5f6).
These commits are on GitHub for easier browsing, along with
"[RFC ONLY] selftests: vfio: Add standalone vfio_dmabuf_mmap_test":
https://github.com/metamev/linux/compare/b9285405c5f6...metamev:linux:dev/m…
Thanks for reading,
Matt
================================================================================
Change log:
v2:
- Rebase on VFIO next, picking up Alex's
vfio_pci_dma_buf_move()/vfio_pci_dma_buf_cleanup() fixes, and
dropping "vfio/pci: Fix vfio_pci_dma_buf_cleanup() double-put"
- Added "PCI/P2PDMA: Add CONFIG_PCI_P2PDMA_CORE" so that the
newly-added vfio-pci hard dependency on the P2PDMA provider instead
pulls in the _CORE variant and not the full-fat CONFIG_PCI_P2PDMA.
This means that the core of vfio-pci does not need ZONE_DEVICE, but
if it's available then enabling P2PDMA in turn enables DMABUF
export. Fixes basic VFIO operation on 32b or other platforms without
ZONE_DEVICE.
- Fixed comment inaccuracy in vfio_pci_dma_buf_revoke() and cleaned
up vdev validity test.
- vfio_pci_dma_buf_find_pfn(): use PAGE_ALIGN(), better span variable
naming, OVF check
- Made vm_pgoff use consistent (keeping the resource index at the
top and masking where offset is used). For BAR mmap, use new
vma_pgoff_adjust to create the DMABUF with the exact mmap()ed span
instead of from the start of the BAR with an invisible portion
before the mapping.
- Added VFIO_DEVICE_FEATURE_DMA_BUF_MEMATTR to set memory attributes,
instead of using the export `flags` field.
- vfio_pci_ioctl_reset: Moved vfio_pci_zap_revoke_bars()
(effectively, vfio_pci_dma_buf_move()) back after D0 transition.
Note, if a BAR zap is needed, it's done in this function, so it now
happens with the _move after the D0 transition; previously it was
done before the transition, when memory_lock was taken.
- Minimised vfio_pci_dma_buf_mmap() (removed redundant span check),
added READ_ONCE for memattr
- Misc fixes: comment in DMABUF name generation, removed superfluous
READ_ONCE from faulthandler
v1:
https://lore.kernel.org/kvm/20260416131815.2729131-1-mattev@meta.com/
- Cleanup of the common DMABUF-aware VMA vm_ops fault handler and
export code.
- Fixed a lot of races, particularly faults racing with DMABUF
cleanup (if the VFIO device fds close, for example).
- Added nicer human-readable names for VFIO mmap() VMAs
RFCv2: Respin based on the feedback/suggestions:
https://lore.kernel.org/kvm/20260312184613.3710705-1-mattev@meta.com/
- Transform the existing VFIO BAR mmap path to also use DMABUFs
behind the scenes, and then simply share that code for
explicitly-mapped DMABUFs. Jason wanted to go that direction to
enable iommufd VFIO type 1 emulation to pick up a DMABUF for an IO
mapping.
- Revoke buffers using a VFIO device fd ioctl
RFCv1:
https://lore.kernel.org/all/20260226202211.929005-1-mattev@meta.com/
Matt Evans (9):
PCI/P2PDMA: Add CONFIG_PCI_P2PDMA_CORE
vfio/pci: Add a helper to look up PFNs for DMABUFs
vfio/pci: Add a helper to create a DMABUF for a BAR-map VMA
vfio/pci: Convert BAR mmap() to use a DMABUF
vfio/pci: Provide a user-facing name for BAR mappings
vfio/pci: Clean up BAR zap and revocation
vfio/pci: Support mmap() of a VFIO DMABUF
vfio/pci: Permanently revoke a DMABUF on request
vfio/pci: Add mmap() attributes to DMABUF feature
drivers/pci/Kconfig | 10 +-
drivers/pci/Makefile | 2 +-
drivers/pci/p2pdma.c | 16 +
drivers/vfio/pci/Kconfig | 4 +-
drivers/vfio/pci/Makefile | 3 +-
drivers/vfio/pci/nvgrace-gpu/main.c | 5 +
drivers/vfio/pci/vfio_pci_config.c | 30 +-
drivers/vfio/pci/vfio_pci_core.c | 225 +++++++++---
drivers/vfio/pci/vfio_pci_dmabuf.c | 548 ++++++++++++++++++++++++----
drivers/vfio/pci/vfio_pci_priv.h | 57 ++-
include/linux/pci-p2pdma.h | 24 +-
include/linux/pci.h | 2 +-
include/linux/vfio_pci_core.h | 1 +
include/uapi/linux/vfio.h | 57 +++
14 files changed, 815 insertions(+), 169 deletions(-)
--
2.47.3
This patch series introduces the Qualcomm DSP Accelerator (QDA) driver,
a DRM-based accelerator driver for Qualcomm DSPs. The driver provides a
standardized interface for offloading computational tasks to DSPs found
on Qualcomm SoCs, supporting all DSP domains.
The QDA driver implements the FastRPC protocol over the DRM accel
subsystem. It uses the same device-tree node structure as the existing
fastrpc driver in drivers/misc/. The approach for binding the QDA driver
to device-tree nodes while coexisting with the fastrpc driver is an open
item described below.
RFC thread: https://lore.kernel.org/dri-devel/20260224-qda-firstpost-v1-0-fe46a9c1a046@…
User-space staging branch
=========================
https://github.com/qualcomm/fastrpc/tree/accel/staging
Key Features
============
* Standard DRM accelerator interface via /dev/accel/accelN
* GEM-based buffer management with DMA-BUF import/export (PRIME)
* IOMMU-based memory isolation using per-process context banks
* FastRPC protocol implementation for DSP communication
* RPMsg transport layer for reliable message passing
* Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
* DRM IOCTL interface for DSP session management, buffer allocation,
and remote procedure invocation
Architecture
============
1. DRM Accelerator Framework Integration
The driver registers as a DRM accel device, exposing a standard
/dev/accel/accelN character device node. This provides established
DRM infrastructure for device management, file operations, and
IOCTL dispatch.
2. Memory Management
Buffers are managed as GEM objects with full PRIME support for
DMA-BUF import/export. This enables seamless buffer sharing with
other DRM drivers (GPU, camera, video) using standard kernel
mechanisms.
3. IOMMU Context Bank Management
IOMMU context banks (CBs) are represented as proper struct device
instances on a custom virtual bus (qda-compute-cb). Each CB device
is registered with the IOMMU subsystem and receives its own IOMMU
domain, enabling per-session address space isolation. The custom
bus was introduced because IOMMU context banks are synthetic
constructs — not real platform devices — and to ensure CB device
lifetime is strictly subordinate to the parent QDA device.
See also: https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualco…
4. Memory Manager Architecture
A pluggable memory manager coordinates IOMMU device assignment and
buffer allocation. The current implementation uses a DMA-coherent
backend with SID-prefixed DMA addresses for DSP firmware
compatibility.
5. Transport Layer
RPMsg communication is handled in a dedicated transport layer
(qda_rpmsg.c), separate from the core DRM driver logic.
6. Code Organization
The driver is organized across multiple files (~4600 lines total):
* qda_drv.c: Core driver and DRM integration
* qda_rpmsg.c: RPMsg transport layer
* qda_cb.c: Context bank device management
* qda_compute_bus.c: Custom virtual bus for CB devices
* qda_gem.c: GEM object management
* qda_prime.c: DMA-BUF import (PRIME)
* qda_memory_manager.c: IOMMU device registry and allocation
* qda_memory_dma.c: DMA-coherent allocation backend
* qda_fastrpc.c: FastRPC protocol implementation
* qda_ioctl.c: IOCTL dispatch
7. UAPI Design
The driver exposes DRM-style IOCTLs defined in
include/uapi/drm/qda_accel.h, following DRM UAPI conventions
(__u32/__u64 types, C++ guard, GPL-2.0-only WITH Linux-syscall-note).
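As an illustration of the "SID-prefixed DMA addresses" mentioned under item 4, here is a small sketch. The cover letter does not state the layout, so this assumes the stream ID occupies the bits above bit 32 and the DMA address fits in the low 32 bits; all names and the shift are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SID_SHIFT 32	/* assumed position of the SID field */

/* Combine a stream ID and a 32-bit DMA address into one 64-bit value. */
uint64_t sketch_sid_prefix(uint32_t sid, uint64_t dma_addr)
{
	assert(dma_addr <= UINT32_MAX);	/* low half must not overlap the SID */
	return ((uint64_t)sid << SKETCH_SID_SHIFT) | dma_addr;
}

/* Recover the stream ID from a prefixed address. */
uint32_t sketch_sid_of(uint64_t prefixed)
{
	return (uint32_t)(prefixed >> SKETCH_SID_SHIFT);
}

/* Recover the plain DMA address from a prefixed address. */
uint64_t sketch_addr_of(uint64_t prefixed)
{
	return prefixed & UINT32_MAX;
}
```

Under this assumed layout, the firmware can tell which context bank an address belongs to from the value alone.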
Patch Series Organization
=========================
Patch 01: MAINTAINERS entry
Patch 02: Driver documentation (Documentation/accel/qda/)
Patches 03-04: Core driver skeleton and compute bus
Patch 05: iommu: Register qda-compute-cb bus with IOMMU subsystem
Patches 06-07: CB device enumeration and memory manager
Patch 08: QUERY IOCTL and UAPI header
Patches 09-11: GEM buffer management and PRIME import
Patches 12-15: FastRPC protocol (invoke, session create/release,
map/unmap)
Open Items
==========
1. Device-Tree Compatible String
The QDA driver uses the same device-tree node structure and
properties as the existing fastrpc driver in drivers/misc/. A
mechanism is needed to allow the QDA driver to bind to its device
node independently of the fastrpc driver.
The intended coexistence model is: platforms that require the
complete fastrpc feature set continue to use "qcom,fastrpc"; new
platforms where a feature available only in QDA takes priority, or
where QDA's current feature set is sufficient, use a QDA-specific
compatible string. New feature development is directed toward QDA
rather than the existing fastrpc driver. As QDA matures toward
feature parity with fastrpc, platforms can adopt the QDA-specific
compatible string exclusively.
The options under consideration are:
a) Add a new "qcom,qda" compatible string to the existing
qcom,fastrpc.yaml binding, since the DT node structure and
properties are identical. This avoids a separate binding file
but adds a QDA-specific string to a fastrpc binding.
b) Introduce a separate qcom,qda.yaml binding that references or
inherits the fastrpc binding properties.
Seeking guidance from DT binding maintainers on the preferred
approach.
2. Privilege Level Management
Currently, daemon processes and user processes have the same access
level as both use the same accel device node. This needs to be
addressed as daemons attach to privileged DSP protection domains
and require higher privilege levels for system-level operations.
Seeking guidance on the best approach: separate device nodes,
capability-based checks, or DRM master/authentication mechanisms.
3. UAPI Compatibility Layer
A compatibility layer is needed to facilitate migration of client
applications from the existing FastRPC UAPI to the new QDA UAPI,
ensuring a smooth transition for existing userspace code. Seeking
guidance on the preferred implementation approach: in-kernel
translation layer, userspace wrapper library, or hybrid solution.
An initial evaluation of an in-kernel translation shim was
performed, where legacy FastRPC device nodes (/dev/fastrpc-*) are
exposed and requests are internally routed to the QDA accel driver.
The goal was to keep the compatibility layer minimal, reuse existing
QDA helper paths (attach, buffer allocation, mapping, etc.), and
avoid duplication of GEM and buffer management logic.
However, the following challenges were identified:
a) Dependency on drm_file for QDA helpers
QDA relies on GEM-backed allocations and per-client handle
namespaces, which require a valid struct drm_file. Since GEM
handles are scoped per drm_file, the compatibility layer cannot
directly reuse QDA helper paths without establishing a proper
drm_file context for each client.
b) Lack of public API for drm_file creation
Creating a drm_file directly (similar to mock_drm_getfile()-style
approaches) is not feasible, as the required helpers
(drm_file_alloc(), drm_file_free(), etc.) are internal to the DRM
core and not exported. This prevents external drivers from safely
constructing and managing drm_file instances.
c) VFS-based open is not a viable solution
Opening the underlying accel device (/dev/accel/accelN) from the
compatibility driver via filp_open() does provide a valid
drm_file, but introduces reliance on userspace-visible device
paths, lack of stability in containerized or chroot environments,
and no clean mapping between legacy device nodes and accel
devices.
d) Userspace proxy limitations (CUSE)
A CUSE-based userspace proxy was evaluated. However, DMA-buf file
descriptors passed by legacy applications cannot be directly
reused in the CUSE daemon (file descriptors are process-specific),
which breaks buffer sharing semantics.
e) drm_client-based approaches do not match requirements
drm_client APIs (used for fbdev emulation) rely on a shared
drm_file and do not provide the per-client isolation required by
FastRPC semantics.
Due to the above constraints, it is currently unclear how to
implement an in-kernel compatibility layer that correctly handles
per-client drm_file contexts without relying on VFS paths or
non-exported DRM internals.
4. Documentation Improvements
Add detailed IOCTL usage examples, document DSP firmware interface
requirements, and create a migration guide from the existing FastRPC
driver.
5. Per-Session Memory Allocation
Develop a userspace API to support memory allocation on a per-session
basis, enabling session-specific memory management.
6. Audio and Sensors PD Support
The current series does not handle Audio PD and Sensors PD
functionalities. These specialized protection domains require
additional support for real-time constraints and power management.
Interface Compatibility
=======================
The QDA driver uses the same device-tree node structure and child node
layout (including "qcom,fastrpc-compute-cb" child nodes) as the
existing fastrpc driver. The underlying FastRPC protocol and DSP
firmware interface are compatible with the existing fastrpc driver,
ensuring that DSP firmware and libraries continue to work without
modification.
References
==========
Previous discussions on this migration:
- https://lkml.org/lkml/2024/6/24/479
- https://lkml.org/lkml/2024/6/21/1252
Testing
=======
The driver has been tested on Qualcomm platforms with:
- Basic FastRPC attach/release operations
- DSP process creation and initialization
- Memory mapping/unmapping operations
- Dynamic invocation with various buffer types
- GEM buffer allocation and mmap
- PRIME buffer import from other subsystems
Signed-off-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
---
Ekansh Gupta (15):
MAINTAINERS: Add entry for Qualcomm DSP Accelerator (QDA) driver
accel/qda: Add QDA driver documentation
accel/qda: Add initial QDA DRM accelerator driver
accel/qda: Add compute bus for QDA context banks
iommu: Add QDA compute context bank bus to iommu_buses
accel/qda: Create compute context bank devices on QDA compute bus
accel/qda: Add memory manager for CB devices
accel/qda: Add QUERY IOCTL and QDA UAPI header
accel/qda: Add DMA-backed GEM objects and memory manager integration
accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
accel/qda: Add PRIME DMA-BUF import support
accel/qda: Add FastRPC invocation support
accel/qda: Add DSP process creation and release
accel/qda: Add remote memory mapping to DSP address space
accel/qda: Add remote memory unmap from DSP address space
Documentation/accel/index.rst | 1 +
Documentation/accel/qda/index.rst | 13 +
Documentation/accel/qda/qda.rst | 146 +++++
MAINTAINERS | 9 +
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 2 +
drivers/accel/qda/Kconfig | 34 +
drivers/accel/qda/Makefile | 19 +
drivers/accel/qda/qda_cb.c | 146 +++++
drivers/accel/qda/qda_cb.h | 32 +
drivers/accel/qda/qda_compute_bus.c | 68 ++
drivers/accel/qda/qda_drv.c | 192 ++++++
drivers/accel/qda/qda_drv.h | 91 +++
drivers/accel/qda/qda_fastrpc.c | 1058 ++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_fastrpc.h | 390 ++++++++++++
drivers/accel/qda/qda_gem.c | 177 ++++++
drivers/accel/qda/qda_gem.h | 62 ++
drivers/accel/qda/qda_ioctl.c | 296 +++++++++
drivers/accel/qda/qda_ioctl.h | 19 +
drivers/accel/qda/qda_memory_dma.c | 110 ++++
drivers/accel/qda/qda_memory_dma.h | 17 +
drivers/accel/qda/qda_memory_manager.c | 380 ++++++++++++
drivers/accel/qda/qda_memory_manager.h | 75 +++
drivers/accel/qda/qda_prime.c | 184 ++++++
drivers/accel/qda/qda_prime.h | 18 +
drivers/accel/qda/qda_rpmsg.c | 248 ++++++++
drivers/accel/qda/qda_rpmsg.h | 30 +
drivers/iommu/iommu.c | 4 +
include/linux/qda_compute_bus.h | 32 +
include/uapi/drm/qda_accel.h | 229 +++++++
30 files changed, 4083 insertions(+)
---
base-commit: 80dd246accce631c328ea43294e53b2b2dd2aa32
change-id: 20260519-qda-series-78c2bf0ed78b
Best regards,
--
Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>
dma_buf_unpin() requires the caller to hold the exporter's dma_resv
lock:
void dma_buf_unpin(struct dma_buf_attachment *attach)
{
...
dma_resv_assert_held(dmabuf->resv);
...
}
iopt_release_pages() calls dma_buf_unpin() without taking that lock,
so every iommufd_ioas_destroy()/iommufd_ioas_unmap() that releases
the last reference on a DMABUF-backed iopt_pages triggers a WARN.
This was hit while running tools/testing/selftests/iommu/iommufd:
WARNING: drivers/dma-buf/dma-buf.c:1137 at dma_buf_unpin+0x62/0x70
RIP: 0010:dma_buf_unpin+0x62/0x70
Call Trace:
<TASK>
dma_buf_unpin+0x62/0x70
iopt_release_pages+0xe4/0x190
iopt_unmap_iova_range+0x1c7/0x290
iopt_unmap_all+0x1a/0x30
iommufd_ioas_destroy+0x1d/0x50
iommufd_fops_release+0x93/0x150
__fput+0xfc/0x2c0
__x64_sys_close+0x3d/0x80
do_syscall_64+0x65/0x180
</TASK>
Take the dma_resv lock around dma_buf_unpin() in iopt_release_pages(),
matching the iopt_map_dmabuf() convention. dma_buf_detach() acquires the
reservation lock internally, so it must remain outside the locked region.
Fixes: 8c5f9645c389 ("iommufd: Add dma_buf_pin()")
Reported-by: Ankit Soni <Ankit.Soni(a)amd.com>
Signed-off-by: Ankit Soni <Ankit.Soni(a)amd.com>
---
drivers/iommu/iommufd/pages.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 9bdb2945afe1..7b64002e54b9 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1663,7 +1663,9 @@ void iopt_release_pages(struct kref *kref)
if (iopt_is_dmabuf(pages) && pages->dmabuf.attach) {
struct dma_buf *dmabuf = pages->dmabuf.attach->dmabuf;
+ dma_resv_lock(dmabuf->resv, NULL);
dma_buf_unpin(pages->dmabuf.attach);
+ dma_resv_unlock(dmabuf->resv);
dma_buf_detach(dmabuf, pages->dmabuf.attach);
dma_buf_put(dmabuf);
WARN_ON(!list_empty(&pages->dmabuf.tracker));
--
2.43.0
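The locking rule behind this fix can be modelled in userspace. The sketch below is a toy model, not kernel code: every name is hypothetical, the lock is a plain flag, and a counter stands in for the kernel WARN. It assumes, as the commit message states, that unpin requires the lock to be held and detach takes it internally.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the reservation lock and an attachment. */
struct sketch_resv {
	bool held;
};

struct sketch_attach {
	struct sketch_resv *resv;
	int pin_count;
	int warnings;	/* counts what would be a WARN in the kernel */
	bool attached;
};

void sketch_resv_lock(struct sketch_resv *r)
{
	assert(!r->held);	/* not recursive: taking it twice is a bug */
	r->held = true;
}

void sketch_resv_unlock(struct sketch_resv *r)
{
	assert(r->held);
	r->held = false;
}

/* Models unpin: the caller must already hold the lock. */
void sketch_unpin(struct sketch_attach *a)
{
	if (!a->resv->held)
		a->warnings++;
	a->pin_count--;
}

/* Models detach: takes the lock itself, so callers must not hold it. */
void sketch_detach(struct sketch_attach *a)
{
	sketch_resv_lock(a->resv);
	a->attached = false;
	sketch_resv_unlock(a->resv);
}

/* Release path before the fix: unpin without the lock. */
void sketch_release_unlocked(struct sketch_attach *a)
{
	sketch_unpin(a);
	sketch_detach(a);
}

/* Release path after the fix: lock only around unpin. */
void sketch_release_locked(struct sketch_attach *a)
{
	sketch_resv_lock(a->resv);
	sketch_unpin(a);
	sketch_resv_unlock(a->resv);
	sketch_detach(a);	/* outside the locked region on purpose */
}
```

Moving sketch_detach() inside the locked region would trip the assertion in sketch_resv_lock(), which is the toy equivalent of the reason the patch keeps dma_buf_detach() outside the lock.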
Changes since the RFC:
- Include support for ForeignOwnable for ARef, so that a Fence can be
stuffed into an XArray et al. (Code by Danilo)
- Implement ForeignOwnable (with new borrow type) for DriverFence, so
that it can be stuffed into an XArray.
- Include the rcu::RcuBox data type to defer dropping data with RCU
(Code by Alice)
- Port DmaFence to RcuBox to make UAF bugs through later, new dma_fence
callbacks (backend_ops) impossible.
- Force users to pass their fence data in an RcuBox (or have it not
need drop()) through a Sealed trait.
- Document the rules for the user's DriverFence::data's drop
implementation very clearly (deadlock danger).
- rustfmt, Clippy.
- Various style suggestions, safety comments, etc. (Önur)
- Add __rust_helper prefix to helper functions. (Önur)
Changes in RFC v3:
- Omit JobQueue patches for now
- Completely redesign the memory layout: Instead of a Fence
refcounting a DriverFence, both now live in the same allocation to
allow for future support of the dma_fence backend_ops callbacks which
need to do container_of. (mostly Boris's feedback)
- Allow for pre-allocating fences to avoid deadlocks when submitting
jobs to a GPU. (Boris)
- Simultaneously, allow for pre-preparing fence callback objects, so
the driver can allocate them when it sees fit. (code largely stolen
and inspired by Daniel).
- Signal fences on drop, ensure synchronization.
- Force users to set an error code when signalling.
- Write more documentation
- A ton of minor other changes.
Alright, so since the last RFCs did not reveal significant design
issues, I decided to transition this series to a v1 and hope that we can
get it upstream.
This now includes code for more common infrastructure that dma_fence
needs, contributed by Danilo and Alice.
---
Old cover letter for RFC:
So, this is the spiritual successor of the first / second RFC [1]. v2
also contained code for drm::JobQueue, but mostly to show how the fence
code would be used. JobQueue is under heavy rework right now, so I don't
want to bother your eyes with it. The docstring examples should show how
Rust fences are supposed to be used, though.
This v3 contains a huge amount of highly valuable feedback from a
variety of people, notably Boris, but also from Alice, Gary and Danilo.
There are some TODOs open (a better trait for fence backend_ops and RCU
support), but my hope is that this effort is now finally approaching its
end.
I would greatly appreciate feedback and especially more information
about what might be missing to make this usable, which is obviously
where Daniel's and Boris's feedback will be valuable once more.
Please regard this patch just as what it's titled: an RFC, to discuss a
bit more and to inform a broader community about what the current state
is and where this is heading.
Many regards,
Philipp
[1] https://lore.kernel.org/rust-for-linux/20260203081403.68733-2-phasta@kernel…
Alice Ryhl (1):
rust: rcu: add RcuBox type
Danilo Krummrich (1):
rust: types: implement ForeignOwnable for ARef<T>
Philipp Stanner (2):
rust: Add dma_fence abstractions
MAINTAINERS: Add entry for Rust dma-buf
MAINTAINERS | 2 +
rust/bindings/bindings_helper.h | 2 +
rust/helpers/dma_fence.c | 48 ++
rust/helpers/helpers.c | 1 +
rust/kernel/dma_buf/dma_fence.rs | 821 +++++++++++++++++++++++++++++++
rust/kernel/dma_buf/mod.rs | 13 +
rust/kernel/lib.rs | 1 +
rust/kernel/sync/aref.rs | 39 ++
rust/kernel/sync/rcu.rs | 31 +-
rust/kernel/sync/rcu/rcu_box.rs | 145 ++++++
10 files changed, 1102 insertions(+), 1 deletion(-)
create mode 100644 rust/helpers/dma_fence.c
create mode 100644 rust/kernel/dma_buf/dma_fence.rs
create mode 100644 rust/kernel/dma_buf/mod.rs
create mode 100644 rust/kernel/sync/rcu/rcu_box.rs
--
2.54.0
Feeling the need for speed and a bit of winter fun, even when the weather outside is frightful? Then maybe it’s time to check out Snow Rider 3D. This simple but surprisingly addictive game offers a thrill of downhill skiing and snowboarding right from your browser, no downloads required. Let’s break down how to jump in and start enjoying this surprisingly engaging title.
https://snowriderfree.com/
Gameplay: Simple Controls, Endless Possibilities
The core gameplay of Snow Rider 3D is deceptively straightforward. You control your character's direction using the left and right arrow keys (or A and D). Your objective? Navigate through a series of procedurally generated slopes littered with obstacles. These obstacles range from simple ramps and rails to more challenging hazards like trees, snowdrifts, and even abandoned shacks.
The beauty of Snow Rider 3D lies in its physics. While simple, they feel surprisingly realistic. You'll need to anticipate turns, adjust your speed, and time your jumps to successfully navigate the terrain. A crash will reset you to the beginning of the course, so precision and patience are key.
The game offers different levels, each presenting a unique challenge. Some focus on speed and long jumps, while others demand skillful maneuvering through tight spaces. As you progress, you unlock new skins and sleds, adding a touch of customization to your experience. Think of it as a casual time-killer that can quickly turn into an hour-long obsession!
Tips for Mastering the Mountain:
Alright, so you're ready to hit the slopes. Here are a few tips to help you improve your runs and avoid those frustrating wipeouts:
Practice Makes Perfect: Don't get discouraged by early crashes. The more you play, the better you'll understand the physics and learn to anticipate the terrain.
Master the Turns: Smooth, controlled turns are essential for maintaining speed and avoiding obstacles. Practice feathering the arrow keys to make subtle adjustments.
Timing is Everything: When approaching jumps and ramps, pay close attention to your speed and angle. A well-timed jump can make all the difference.
Don't Be Afraid to Slow Down: Sometimes, the fastest route isn't the safest. Don't be afraid to ease off the gas and navigate tricky sections with caution. Consider looking up guides for specific levels of Snow Rider 3D if you’re really struggling.
Experiment with Sleds and Skins: Different sleds may offer slight variations in handling. Try out different options to find one that suits your playstyle.
Conclusion: A Fun and Accessible Winter Escape
Snow Rider 3D is a surprisingly addictive and accessible game that’s perfect for a quick dose of winter fun. Its simple controls and challenging gameplay make it easy to pick up and play, while its procedural generation ensures that each run is a unique experience. So, whether you're looking for a casual time-killer or a challenging skill-based game, Snow Rider 3D is definitely worth checking out.
Ready to unleash your inner fruit ninja without the mess? Then get ready to dive into the addictively simple, yet surprisingly challenging world of Slice Master. This game, readily available online, is perfect for a quick burst of fun or a more extended gaming session. It’s a testament to the fact that gameplay doesn't need to be complex to be engaging.
https://slicemasterfree.com
Gameplay: Simple Mechanics, Endless Fun
The core concept of Slice Master is refreshingly straightforward. Colorful fruits are launched into the air, and your mission is to slice them into pieces before they fall off the screen. You control a virtual blade with your mouse or finger (depending on the platform), and drawing lines through the fruit initiates the slicing action.
The catch? You have limited lives, and letting too many fruits fall untouched will result in a game over. Occasionally, you'll also encounter bombs mixed in with the fruit barrage. Accidentally slicing a bomb will end your run instantly, adding a layer of strategic thinking to the rapid-fire action.
As you progress, the game throws different types of fruit at you, some requiring multiple slices, and the speed increases gradually, demanding faster reflexes and more precise movements. Special fruits might offer score multipliers or other benefits, adding further depth to the gameplay. It’s a game where practice truly makes perfect, and mastering the art of fruit slicing is incredibly satisfying. You can try it out now by clicking on Slice Master.
Tips for Achieving Fruit-Slicing Mastery
While the game seems simple on the surface, a few strategies can significantly improve your score and extend your gameplay.
• Focus on Efficiency: Instead of frantically slashing at individual fruits, try to slice multiple fruits with a single, well-aimed swipe. This not only increases your score but also conserves your limited slicing time.
• Prioritize High-Value Fruits: Keep an eye out for special fruits that offer bonus points or multipliers. Slicing these at the right moment can dramatically boost your score.
• Be Mindful of Bombs: This one is crucial! Always be aware of the position of the bombs and avoid them at all costs. A moment of carelessness can instantly end your game. Try to train yourself to recognize them early and plan your slices accordingly.
• Practice Makes Perfect: Like any skill-based game, practice is essential for improving your reflexes and accuracy. The more you play, the better you'll become at predicting fruit trajectories and executing precise slices. So, keep practicing and you'll be reaching new high scores in no time!
In Conclusion: A Slice of Addictive Fun
Slice Master offers a surprisingly addictive and engaging gaming experience, despite its simple premise. Its accessible gameplay, combined with the escalating challenge, makes it a perfect choice for a quick dose of entertainment or a more extended gaming session. Whether you're looking for a casual distraction or a skill-based challenge, Slice Master provides a satisfying and fun way to test your reflexes and accuracy. So, grab your virtual blade and prepare to unleash your inner fruit-slicing ninja!
When the MMIO size is bigger than 4G and peer2peer DMA goes
through the host bridge, we hit a code path that assigns the
total linked IOVA length (which is greater than 4G) to mapped_len.
Previously, `mapped_len` was declared as a 32-bit `unsigned int`.
Accumulating `size_t` lengths into it leads to a silent wrap-around,
so truncated lengths are passed to functions like
`fill_sg_entry()`.
Fix this by changing `mapped_len` to `size_t` (64-bit on 64-bit
kernels). While at it, fix similar potential overflow issues: use
`size_t` for `nents` in `calc_sg_nents()` and check it against
`UINT_MAX`, and use `unsigned int` for the loop iterator in
`fill_sg_entry()` to match the type of `nents` there.
Fixes: 3aa31a8bb11e ("dma-buf: provide phys_vec to scatter-gather mapping routine")
Cc: stable(a)vger.kernel.org
Cc: iommu(a)lists.linux.dev
Reviewed-by: Pranjal Shrivastava <praan(a)google.com>
Signed-off-by: David Hu <xuehaohu(a)google.com>
---
Changes in v4:
- Added WARN_ON_ONCE() to the nents overflow check to prevent silent
failures (Claude Bot).
Changes in v3:
- Removed leftover sentence fragment from the commit message.
- Kept `nents = 0` initialization (previously stated as removed in the
v2 changelog) as it is strictly required for the `+=` accumulation
loop in `calc_sg_nents()`.
Changes in v2:
- Fixed 'IVOA' -> 'IOVA' typo and expanded commit message (Claude Bot).
- Added Reverse Xmas tree formatting (Pranjal).
- Folded in extra bounds checking for calc_sg_nents() (Pranjal).
- Folded in type consistency fix for fill_sg_entry() (Pranjal).
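
For reviewers who want to see the failure mode in isolation, the
wrap-around and the new bounds check can be sketched in user space
roughly as below. This is only an illustration and not the kernel code:
all demo_* names are hypothetical, uint64_t stands in for a 64-bit
size_t, and the sketch assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: unsigned int is 32 bits wide. */
static_assert(sizeof(unsigned int) * CHAR_BIT == 32,
	      "demo assumes 32-bit unsigned int");

/* Old behaviour: 64-bit lengths accumulated into a 32-bit counter. */
static unsigned int demo_sum_narrow(const uint64_t *lens, size_t n)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total = (unsigned int)(total + lens[i]); /* wraps mod 2^32 */
	return total;
}

/* Fixed behaviour: the accumulator is as wide as the lengths. */
static uint64_t demo_sum_wide(const uint64_t *lens, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += lens[i];
	return total;
}

/*
 * Number of UINT_MAX-sized chunks needed to cover size, computed in
 * 64 bits. Returns 0 if the count does not fit in unsigned int, which
 * mirrors the idea of the added check (without the WARN_ON_ONCE()).
 */
static unsigned int demo_calc_nents(uint64_t size)
{
	uint64_t nents = size / UINT_MAX + (size % UINT_MAX != 0);

	if (nents > UINT_MAX)
		return 0;
	return (unsigned int)nents;
}
```

With two ranges of 3G and 2G, demo_sum_narrow() yields 1G while
demo_sum_wide() yields the expected 5G, and demo_calc_nents() of 5G
is 2.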
drivers/dma-buf/dma-buf-mapping.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/dma-buf-mapping.c b/drivers/dma-buf/dma-buf-mapping.c
index 794acff2546a..1aabc0ee70bb 100644
--- a/drivers/dma-buf/dma-buf-mapping.c
+++ b/drivers/dma-buf/dma-buf-mapping.c
@@ -10,7 +10,7 @@ static struct scatterlist *fill_sg_entry(struct scatterlist *sgl, size_t length,
dma_addr_t addr)
{
unsigned int len, nents;
- int i;
+ unsigned int i;
nents = DIV_ROUND_UP(length, UINT_MAX);
for (i = 0; i < nents; i++) {
@@ -36,7 +36,7 @@ static unsigned int calc_sg_nents(struct dma_iova_state *state,
struct phys_vec *phys_vec, size_t nr_ranges,
size_t size)
{
- unsigned int nents = 0;
+ size_t nents = 0;
size_t i;
if (!state || !dma_use_iova(state)) {
@@ -51,6 +51,9 @@ static unsigned int calc_sg_nents(struct dma_iova_state *state,
nents = DIV_ROUND_UP(size, UINT_MAX);
}
+ if (WARN_ON_ONCE(nents > UINT_MAX))
+ return 0;
+
return nents;
}
@@ -95,9 +98,10 @@ struct sg_table *dma_buf_phys_vec_to_sgt(struct dma_buf_attachment *attach,
size_t nr_ranges, size_t size,
enum dma_data_direction dir)
{
- unsigned int nents, mapped_len = 0;
struct dma_buf_dma *dma;
struct scatterlist *sgl;
+ size_t mapped_len = 0;
+ unsigned int nents;
dma_addr_t addr;
size_t i;
int ret;
--
2.54.0.929.g9b7fa37559-goog