On Tue, Feb 24, 2026 at 12:38:54AM +0530, Ekansh Gupta wrote:
This patch series introduces the Qualcomm DSP Accelerator (QDA) driver, a modern DRM-based accelerator implementation for Qualcomm Hexagon DSPs. The driver provides a standardized interface for offloading computational tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains (ADSP, CDSP, SDSP, GDSP).
The QDA driver is designed as an alternative for the FastRPC driver in drivers/misc/, offering improved resource management, better integration with standard kernel subsystems, and alignment with the Linux kernel's Compute Accelerators framework.
If I understand correctly, this is just the same FastRPC protocol but in the accel framework, and hence with a new userspace ABI?
I don't fancy the name "QDA" as an acronym for "FastRPC Accel".
I would much prefer to see this living in drivers/accel/fastrpc and be named some variation of "fastrpc" (e.g. fastrpc_accel). (Driver name can be "fastrpc" as the other one apparently is named "qcom,fastrpc").
User-space staging branch
https://github.com/qualcomm/fastrpc/tree/accel/staging
Key Features
- Standard DRM accelerator interface via /dev/accel/accelN
- GEM-based buffer management with DMA-BUF import/export support
- IOMMU-based memory isolation using per-process context banks
- FastRPC protocol implementation for DSP communication
- RPMsg transport layer for reliable message passing
- Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
- Comprehensive IOCTL interface for DSP operations
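
To make the first feature above concrete, here is a minimal userspace sketch of locating and identifying an accel node. It assumes the device shows up under the usual /dev/accel/accelN naming; the helper names accel_node_path() and accel_open_and_identify() are invented for this example. Only the generic DRM_IOCTL_VERSION call from the core DRM UAPI header is used, since the QDA-specific IOCTLs are not reproduced in this mail.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <drm/drm.h>	/* generic DRM UAPI: struct drm_version, DRM_IOCTL_VERSION */

/* Hypothetical helper: format "/dev/accel/accelN". Returns 0 on success, -1 if buf is too small. */
static int accel_node_path(char *buf, size_t len, unsigned int index)
{
	int n = snprintf(buf, len, "/dev/accel/accel%u", index);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: open the accel node and read the DRM driver name
 * through the generic DRM_IOCTL_VERSION call.
 * Returns an open fd (the caller closes it) or -1 on any failure.
 */
static int accel_open_and_identify(unsigned int index, char *name, size_t name_len)
{
	struct drm_version ver;
	char path[64];
	size_t end;
	int fd;

	if (name_len == 0 || accel_node_path(path, sizeof(path), index))
		return -1;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	memset(&ver, 0, sizeof(ver));
	ver.name = name;
	ver.name_len = name_len - 1;	/* leave room for the terminator */
	if (ioctl(fd, DRM_IOCTL_VERSION, &ver) < 0) {
		close(fd);
		return -1;
	}

	/* The kernel reports the full name length but copies at most what we offered. */
	end = ver.name_len < name_len ? ver.name_len : name_len - 1;
	name[end] = '\0';
	return fd;
}
```

A caller would probe indices starting at 0 and compare the returned name against whatever driver name the series ends up registering, which is not stated here.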
High-Level Architecture Differences with Existing FastRPC Driver
The QDA driver represents a significant architectural departure from the existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key limitations while maintaining protocol compatibility:
- DRM Accelerator Framework Integration
  - FastRPC: Custom character device (/dev/fastrpc-*)
  - QDA: Standard DRM accel device (/dev/accel/accelN)
  - Benefit: Leverages established DRM infrastructure for device management.
- Memory Management
  - FastRPC: Custom memory allocator with ION/DMA-BUF integration
  - QDA: Native GEM objects with full PRIME support
  - Benefit: Seamless buffer sharing using standard DRM mechanisms
- IOMMU Context Bank Management
  - FastRPC: Direct IOMMU domain manipulation, limited isolation
  - QDA: Custom compute bus (qda_cb_bus_type) with proper device model
  - Benefit: Each CB device is a proper struct device with IOMMU group support, enabling better isolation and resource tracking.
  - https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcom...
- Memory Manager Architecture
  - FastRPC: Monolithic allocator
  - QDA: Pluggable memory manager with backend abstraction
  - Benefit: Currently uses DMA-coherent backend, easily extensible for future memory types (e.g., carveout, CMA)
- Transport Layer
  - FastRPC: Direct RPMsg integration in core driver
  - QDA: Abstracted transport layer (qda_rpmsg.c)
  - Benefit: Clean separation of concerns, easier to add alternative transports if needed
- Code Organization
  - FastRPC: ~3000 lines in single file
  - QDA: Modular design across multiple files (~4600 lines total)
"Now 50% more LOC and you need 6 tabs open in your IDE!"
Might be better, but in itself it provides no immediate value.
    * qda_drv.c: Core driver and DRM integration
    * qda_gem.c: GEM object management
    * qda_memory_manager.c: Memory and IOMMU management
    * qda_fastrpc.c: FastRPC protocol implementation
    * qda_rpmsg.c: Transport layer
    * qda_cb.c: Context bank device management
  - Benefit: Better maintainability, clearer separation of concerns
- UAPI Design
  - FastRPC: Custom IOCTL interface
  - QDA: DRM-style IOCTLs with proper versioning support
  - Benefit: Follows DRM conventions, easier userspace integration
- Documentation
  - FastRPC: Minimal in-tree documentation
  - QDA: Comprehensive documentation in Documentation/accel/qda/
  - Benefit: Better developer experience, clearer API contracts
- Buffer Reference Mechanism
  - FastRPC: Uses buffer file descriptors (FDs) for all book-keeping in both kernel and DSP
  - QDA: Uses GEM handles for kernel-side management, providing better integration with DRM subsystem
  - Benefit: Leverages DRM GEM infrastructure for reference counting, lifetime management, and integration with other DRM components
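
The "pluggable memory manager with backend abstraction" item above can be pictured as an ops table that the manager calls through. The following is only a userspace sketch of that pattern: every name in it (struct mem_backend_ops, mem_manager_alloc(), heap_backend) is hypothetical and not taken from the patches, and plain heap memory stands in for the DMA-coherent backend.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical backend interface; the series' real ops structure may look different. */
struct mem_backend_ops {
	const char *name;
	void *(*alloc)(size_t size);
	void (*free)(void *ptr);
};

/* The manager only knows the ops table, never a concrete backend. */
struct mem_manager {
	const struct mem_backend_ops *ops;
	size_t live_allocs;	/* simple bookkeeping so leaks are visible */
};

/* Stand-in backend: zeroed heap memory plays the role of DMA-coherent memory. */
static void *heap_alloc(size_t size)
{
	return calloc(1, size);
}

static void heap_free(void *ptr)
{
	free(ptr);
}

static const struct mem_backend_ops heap_backend = {
	.name = "heap",
	.alloc = heap_alloc,
	.free = heap_free,
};

static void mem_manager_init(struct mem_manager *mm, const struct mem_backend_ops *ops)
{
	mm->ops = ops;
	mm->live_allocs = 0;
}

/* Allocate through whichever backend was plugged in; zero-sized requests are rejected. */
static void *mem_manager_alloc(struct mem_manager *mm, size_t size)
{
	void *ptr = size ? mm->ops->alloc(size) : NULL;

	if (ptr)
		mm->live_allocs++;
	return ptr;
}

static void mem_manager_free(struct mem_manager *mm, void *ptr)
{
	if (!ptr)
		return;
	mm->ops->free(ptr);
	mm->live_allocs--;
}
```

Under this scheme, a carveout or CMA backend would be another struct mem_backend_ops instance handed to mem_manager_init(), with no change to the callers.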
This is all good, but what is the plan regarding /dev/fastrpc-*?
The idea here clearly is to provide an alternative implementation, and they seem to bind to the same toplevel compatible - so you can only compile one into your kernel at any point in time.
So if I understand correctly, at some point in time we need to say CONFIG_DRM_ACCEL_QDA=m and CONFIG_QCOM_FASTRPC=n, which will break all existing user space applications? That's not acceptable.
Would it be possible to have a final driver that is implemented as an accel, but provides wrappers for the legacy misc and ioctl interface to the applications?
Regards, Bjorn
Key Technical Improvements
Proper device model: CB devices are real struct device instances on a custom bus, enabling proper IOMMU group management and power management integration
Reference-counted IOMMU devices: Multiple file descriptors from the same process share a single IOMMU device, reducing overhead
GEM-based buffer lifecycle: Automatic cleanup via DRM GEM reference counting, eliminating many resource leak scenarios
Modular memory backends: The memory manager supports pluggable backends, currently implementing DMA-coherent allocations with SID-prefixed addresses for DSP firmware
Context-based invocation tracking: XArray-based context management with proper synchronization and cleanup
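
For readers unfamiliar with the "SID-prefixed addresses" mentioned above, the sketch below shows one way such an address can be composed and split. It assumes the layout historically used by drivers/misc/fastrpc.c, where the stream ID is placed above bit 32 of the address handed to the DSP; whether QDA uses exactly this layout is not stated in the cover letter, and the helper names are invented for illustration.

```c
#include <stdint.h>

/* Assumption: the SID sits directly above a 32-bit IOVA, as in the existing fastrpc driver. */
#define SID_SHIFT	32
#define IOVA_MASK	0xffffffffULL

/* Combine a stream ID and an IOVA into the address format passed to DSP firmware. */
static uint64_t sid_prefixed_addr(uint32_t sid, uint64_t iova)
{
	return ((uint64_t)sid << SID_SHIFT) | (iova & IOVA_MASK);
}

/* Recover the stream ID from a prefixed address. */
static uint32_t addr_to_sid(uint64_t addr)
{
	return (uint32_t)(addr >> SID_SHIFT);
}

/* Recover the plain IOVA from a prefixed address. */
static uint64_t addr_to_iova(uint64_t addr)
{
	return addr & IOVA_MASK;
}
```

For example, SID 0x3 with IOVA 0x1000 yields 0x300001000 under this assumed layout.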
Patch Series Organization
Patches 1-2: Driver skeleton and documentation
Patches 3-6: RPMsg transport and IOMMU/CB infrastructure
Patches 7-9: DRM device registration and basic IOCTL
Patches 10-12: GEM buffer management and PRIME support
Patches 13-17: FastRPC protocol implementation (attach, invoke, create, map/unmap)
Patch 18: MAINTAINERS entry
Open Items
The following items are identified as open items:
- Privilege Level Management
  - Currently, daemon processes and user processes have the same access level as both use the same accel device node. This needs to be addressed as daemons attach to privileged DSP PDs and require higher privilege levels for system-level operations
  - Seeking guidance on the best approach: separate device nodes, capability-based checks, or DRM master/authentication mechanisms
- UAPI Compatibility Layer
  - Add UAPI compat layer to facilitate migration of client applications from existing FastRPC UAPI to the new QDA accel driver UAPI, ensuring smooth transition for existing userspace code
  - Seeking guidance on implementation approach: in-kernel translation layer, userspace wrapper library, or hybrid solution
- Documentation Improvements
  - Add detailed IOCTL usage examples
  - Document DSP firmware interface requirements
  - Create migration guide from existing FastRPC
- Per-Domain Memory Allocation
  - Develop new userspace API to support memory allocation on a per domain basis, enabling domain-specific memory management and optimization
- Audio and Sensors PD Support
  - The current patch series does not handle Audio PD and Sensors PD functionalities. These specialized protection domains require additional support for real-time constraints and power management
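
One of the options listed under "UAPI Compatibility Layer" above is a translation layer that accepts legacy requests and forwards them to the new entry point. The sketch below only illustrates the table-driven shape such a shim could take. All command identifiers and function names are hypothetical placeholders, not the real UAPI values of either driver, and a real shim would also have to convert argument structures (for instance buffer FDs to GEM handles), which is omitted here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command identifiers; NOT the real numbers of either UAPI. */
enum legacy_cmd { LEGACY_ATTACH = 1, LEGACY_INVOKE = 2, LEGACY_MAP = 3 };
enum accel_cmd { ACCEL_ATTACH = 10, ACCEL_INVOKE = 11, ACCEL_MAP = 12 };

struct compat_entry {
	enum legacy_cmd legacy;
	enum accel_cmd accel;
};

/* One row per legacy request that has an equivalent in the new interface. */
static const struct compat_entry compat_table[] = {
	{ LEGACY_ATTACH, ACCEL_ATTACH },
	{ LEGACY_INVOKE, ACCEL_INVOKE },
	{ LEGACY_MAP, ACCEL_MAP },
};

/* Records the last forwarded command so the forwarding can be observed. */
static int last_accel_cmd;

/* Stand-in for the new driver's single ioctl entry point. */
static int accel_ioctl(enum accel_cmd cmd, void *arg)
{
	(void)arg;
	switch (cmd) {
	case ACCEL_ATTACH:
	case ACCEL_INVOKE:
	case ACCEL_MAP:
		last_accel_cmd = (int)cmd;
		return 0;
	}
	return -EINVAL;
}

/* Legacy entry point: look up the request and forward it, or report it as unsupported. */
static int legacy_ioctl(int cmd, void *arg)
{
	size_t i;

	for (i = 0; i < sizeof(compat_table) / sizeof(compat_table[0]); i++) {
		if ((int)compat_table[i].legacy == cmd)
			return accel_ioctl(compat_table[i].accel, arg);
	}
	return -ENOTTY;
}
```

The same table could live in a userspace wrapper library instead; the choice between the two is exactly the open question raised above.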
Interface Compatibility
The QDA driver maintains compatibility with existing FastRPC infrastructure:
Device Tree Bindings: The driver uses the same device tree bindings as the existing FastRPC driver, ensuring no changes are required to device tree sources. The "qcom,fastrpc" compatible string and child node structure remain unchanged.
Userspace Interface: While the driver provides a new DRM-based UAPI, the underlying FastRPC protocol and DSP firmware interface remain compatible. This ensures that DSP firmware and libraries continue to work without modification.
Migration Path: The modular design allows for gradual migration, where both drivers can coexist during the transition period. Applications can be migrated incrementally to the new UAPI with the help of the planned compatibility layer.
References
Previous discussions on this migration:
Testing
The driver has been tested on Qualcomm platforms with:
- Basic FastRPC attach/release operations
- DSP process creation and initialization
- Memory mapping/unmapping operations
- Dynamic invocation with various buffer types
- GEM buffer allocation and mmap
- PRIME buffer import from other subsystems
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
Ekansh Gupta (18):
      accel/qda: Add Qualcomm QDA DSP accelerator driver docs
      accel/qda: Add Qualcomm DSP accelerator driver skeleton
      accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
      accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
      accel/qda: Create compute CB devices on QDA compute bus
      accel/qda: Add memory manager for CB devices
      accel/qda: Add DRM accel device registration for QDA driver
      accel/qda: Add per-file DRM context and open/close handling
      accel/qda: Add QUERY IOCTL and basic QDA UAPI header
      accel/qda: Add DMA-backed GEM objects and memory manager integration
      accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
      accel/qda: Add PRIME dma-buf import support
      accel/qda: Add initial FastRPC attach and release support
      accel/qda: Add FastRPC dynamic invocation support
      accel/qda: Add FastRPC DSP process creation support
      accel/qda: Add FastRPC-based DSP memory mapping support
      accel/qda: Add FastRPC-based DSP memory unmapping support
      MAINTAINERS: Add MAINTAINERS entry for QDA driver
 Documentation/accel/index.rst          |    1 +
 Documentation/accel/qda/index.rst      |   14 +
 Documentation/accel/qda/qda.rst        |  129 ++++
 MAINTAINERS                            |    9 +
 arch/arm64/configs/defconfig           |    2 +
 drivers/accel/Kconfig                  |    1 +
 drivers/accel/Makefile                 |    2 +
 drivers/accel/qda/Kconfig              |   35 ++
 drivers/accel/qda/Makefile             |   19 +
 drivers/accel/qda/qda_cb.c             |  182 ++++++
 drivers/accel/qda/qda_cb.h             |   26 +
 drivers/accel/qda/qda_compute_bus.c    |   23 +
 drivers/accel/qda/qda_drv.c            |  375 ++++++++++++
 drivers/accel/qda/qda_drv.h            |  171 ++++++
 drivers/accel/qda/qda_fastrpc.c        | 1002 ++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_fastrpc.h        |  433 ++++++++++++++
 drivers/accel/qda/qda_gem.c            |  211 +++++++
 drivers/accel/qda/qda_gem.h            |  103 ++++
 drivers/accel/qda/qda_ioctl.c          |  271 +++++++++
 drivers/accel/qda/qda_ioctl.h          |  118 ++++
 drivers/accel/qda/qda_memory_dma.c     |   91 +++
 drivers/accel/qda/qda_memory_dma.h     |   46 ++
 drivers/accel/qda/qda_memory_manager.c |  382 ++++++++++++
 drivers/accel/qda/qda_memory_manager.h |  148 +++++
 drivers/accel/qda/qda_prime.c          |  194 +++++++
 drivers/accel/qda/qda_prime.h          |   43 ++
 drivers/accel/qda/qda_rpmsg.c          |  327 +++++++++++
 drivers/accel/qda/qda_rpmsg.h          |   57 ++
 drivers/iommu/iommu.c                  |    4 +
 include/linux/qda_compute_bus.h        |   22 +
 include/uapi/drm/qda_accel.h           |  224 +++++++
 31 files changed, 4665 insertions(+)
base-commit: d4906ae14a5f136ceb671bb14cedbf13fa560da6
change-id: 20260223-qda-firstpost-4ab05249e2cc
Best regards,
Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>