Linaro-mm-sig February 2026

linaro-mm-sig@lists.linaro.org

129 participants
1792 discussions

[PATCH 00/13] Add DMA support for LINFlexD UART driver

by Larisa Grigore

This patchset enhances the LINFlexD UART driver and its device tree bindings to support DMA transfers, configurable clock inputs, dynamic baudrate changes, and termios features. It also includes a series of fixes and improvements to ensure reliable operation across various modes and configurations. The changes added can be summarized as follows: 1. Fixes with respect to FIFO handling, locking, interrupt related registers and INITM mode transition. 2. Removal of the earlycon workaround, as proper FIFO handling and INITM transitions now ensure stable behavior. 3. Support for configurable stop bits and dynamic baudrate changes based on clock inputs and termios settings. 4. Optional DMA support for RX and TX paths, preventing character loss during high-throughput operations like copy-paste. Cyclic DMA is used for RX to avoid gaps between transactions. Larisa Grigore (8): serial: linflexuart: Clean SLEEP bit in LINCR1 after suspend serial: linflexuart: Check FIFO full before writing serial: linflexuart: Correctly clear UARTSR in buffer mode serial: linflexuart: Update RXEN/TXEN outside INITM mode serial: linflexuart: Ensure FIFO is empty when entering INITM serial: linflexuart: Revert earlycon workaround serial: linflexuart: Add support for configurable stop bits serial: linflexuart: Add DMA support Radu Pirea (5): serial: linflexuart: Fix locking in set_termios dt-bindings: serial: fsl-linflexuart: add clock input properties dt-bindings: serial: fsl-linflexuart: add dma properties serial: linflexuart: Add support for changing baudrate serial: linflexuart: Avoid stopping DMA during receive operations .../bindings/serial/fsl,s32-linflexuart.yaml | 31 + drivers/tty/serial/fsl_linflexuart.c | 972 +++++++++++++++--- 2 files changed, 846 insertions(+), 157 deletions(-) -- 2.47.0

1 week

[PATCH v2 00/10] dma-bug: heaps: Add Tegra VPR support

by Thierry Reding

From: Thierry Reding <treding(a)nvidia.com> Hi, This series adds support for the video protection region (VPR) used on Tegra SoC devices. It's a special region of memory that is protected from accesses by the CPU and used to store DRM protected content (both decrypted stream data as well as decoded video frames). Patches 1 and 2 add DT binding documentation for the VPR and add the VPR to the list of memory-region items for display and host1x. Patch 3 adds bitmap_allocate(), which is like bitmap_allocate_region() but works on sizes that are not a power of two. Patch 4 introduces new APIs needed by the Tegra VPR implementation that allow CMA areas to be dynamically created at runtime rather than using the fixed, system-wide list. This is used in this driver specifically because it can use an arbitrary number of these areas (though they are currently limited to 4). Patch 5 adds some infrastructure for DMA heap implementations to provide information through debugfs. The Tegra VPR implementation is added in patch 6. See its commit message for more details about the specifics of this implementation. Finally, patches 7-10 add the VPR placeholder node on Tegra234 and hook it up to the host1x and GPU nodes so that they can make use of this region. Changes in v2: - Tegra VPR implementation is now more optimized to reduce the number of (very slow) resize operations, and allows cross-chunk allocations - dynamic CMA areas are now trackd separately from static ones, but the global number of CMA pages accounts for all areas Thierry Thierry Reding (10): dt-bindings: reserved-memory: Document Tegra VPR dt-bindings: display: tegra: Document memory regions bitmap: Add bitmap_allocate() function mm/cma: Allow dynamically creating CMA areas dma-buf: heaps: Add debugfs support dma-buf: heaps: Add support for Tegra VPR arm64: tegra: Add VPR placeholder node on Tegra234 arm64: tegra: Add GPU node on Tegra234 arm64: tegra: Hook up VPR to host1x arm64: tegra: Hook up VPR to the GPU .../display/tegra/nvidia,tegra186-dc.yaml | 10 + .../display/tegra/nvidia,tegra20-dc.yaml | 10 +- .../display/tegra/nvidia,tegra20-host1x.yaml | 7 + .../nvidia,tegra-video-protection-region.yaml | 55 + arch/arm/mm/dma-mapping.c | 2 +- arch/arm64/boot/dts/nvidia/tegra234.dtsi | 60 + arch/s390/mm/init.c | 2 +- drivers/dma-buf/dma-heap.c | 56 + drivers/dma-buf/heaps/Kconfig | 7 + drivers/dma-buf/heaps/Makefile | 1 + drivers/dma-buf/heaps/cma_heap.c | 2 +- drivers/dma-buf/heaps/tegra-vpr.c | 1265 +++++++++++++++++ include/linux/bitmap.h | 25 +- include/linux/cma.h | 7 +- include/linux/dma-heap.h | 2 + include/trace/events/tegra_vpr.h | 57 + mm/cma.c | 187 ++- mm/cma.h | 5 +- 18 files changed, 1713 insertions(+), 47 deletions(-) create mode 100644 Documentation/devicetree/bindings/reserved-memory/nvidia,tegra-video-protection-region.yaml create mode 100644 drivers/dma-buf/heaps/tegra-vpr.c create mode 100644 include/trace/events/tegra_vpr.h -- 2.52.0

2 weeks, 5 days

[PATCH v18 00/42] DEPT(DEPendency Tracker)

by Byungchul Park

I'm happy to see that DEPT reported real problems in practice: https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SA… https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.pa… https://lore.kernel.org/all/b6e00e77-4a8c-4e05-ab79-266bf05fcc2d@igalia.com/ I’ve added documentation describing DEPT — this should help you understand what DEPT is and how it works. You can use DEPT simply by enabling CONFIG_DEPT and checking dmesg at runtime. --- Hi Linus and folks, I’ve been developing a tool to detect deadlock possibilities by tracking waits/events — rather than lock acquisition order — to cover all the synchronization mechanisms. To summarize the design rationale, starting from the problem statement, through analysis, to the solution: CURRENT STATUS -------------- Lockdep tracks lock acquisition order to identify deadlock conditions. Additionally, it tracks IRQ state changes — via {en,dis}able — to detect cases where locks are acquired unintentionally during interrupt handling. PROBLEM ------- Waits and their associated events that are never reachable can eventually lead to deadlocks. However, since Lockdep focuses solely on lock acquisition order, it has inherent limitations when handling waits and events. Moreover, by tracking only lock acquisition order, Lockdep cannot properly handle read locks or cross-event scenarios — such as wait_for_completion() and complete() — making it increasingly inadequate as a general-purpose deadlock detection tool. SOLUTION -------- Once again, waits and their associated events that are never reachable can eventually lead to deadlocks. The new solution, DEPT, focuses directly on waits and events. DEPT monitors waits and events, and reports them when any become unreachable. DEPT provides: * Correct handling of read locks. * Support for general waits and events. * Continuous operation, even after multiple reports. * Simple, intuitive annotation APIs. There are still false positives, and some are already being worked on for suppression. Especially splitting the folio class into several appropriate classes e.g. block device mapping class and regular file mapping class, is currently under active development by me and Yeoreum Yun. Anyway, these efforts will need to continue for a while, as we’ve seen with lockdep over two decades. DEPT is tagged as EXPERIMENTAL in Kconfig — meaning it’s not yet suitable for use as an automation tool. However, for those who are interested in using DEPT to analyze complex synchronization patterns and extract dependency insights, DEPT would be a great tool for the purpose. Thanks for your support and contributions to: Harry Yoo <harry.yoo(a)oracle.com> Gwan-gyeong Mun <gwan-gyeong.mun(a)intel.com> Yunseong Kim <ysk(a)kzalloc.com> Yeoreum Yun <yeoreum.yun(a)arm.com> FAQ --- Q. Is this the first attempt to solve this problem? A. No. The cross-release feature (commit b09be676e0ff2) attempted to address it — as a Lockdep extension. It was merged, but quickly reverted, because: While it uncovered valuable hidden issues, it also introduced false positives. Since these false positives mask further real problems with Lockdep — and developers strongly dislike them — the feature was rolled back. Q. Why wasn’t DEPT built as a Lockdep extension? A. Lockdep is the result of years of work by kernel developers — and is now very stable. But I chose to build DEPT separately, because: While reusing BFS(Breadth First Search) and Lockdep’s hashing is beneficial, the rest of the system must be rebuilt from scratch to align with DEPT’s wait-event model — since Lockdep was originally designed for tracking lock acquisition orders, not wait-event dependencies. Q. Do you plan to replace Lockdep entirely? A. Not at all — Lockdep still plays a vital role in validating correct lock usage. While its dependency-checking logic should eventually be superseded by DEPT, the rest of its functionality should stay. Q. Should we replace the dependency check immediately? A. Absolutely not. Lockdep’s stability is the result of years of hard work by kernel developers. Lockdep and DEPT should run side by side until DEPT matures. Q. Stronger detection often leads to more false positives — which was a major pain point when cross-release was added. Is DEPT designed to handle this? A. Yes. DEPT’s simple, generalized design enables flexible reporting — so while false positives still need fixing, they’re far less disruptive than they were under the Lockdep extension, cross-release. Q. Why not fix all false positives out-of-tree before merging? A. Since the affected subsystems span the entire kernel, like Lockdep, which has relied on annotations to avoid false positives over the last two decades, DEPT too will require the annotation efforts. Performing annotation work within the mainline will help us add annotations more appropriately and will also make DEPT a useful tool for a wider range of users more quickly. CONFIG_DEPT is marked EXPERIMENTAL, so it’s opt-in. Some users are already interested in using DEPT to analyze complex synchronization patterns and extract dependency insights. Byungchul --- Changes from v17: 1. Rebase on the mainline as of 2025 Dec 5. 2. Convert the documents' format from txt to rst. (feedbacked by Jonathan Corbet and Bagas Sanjaya) 3. Move the documents from 'Documentation/dependency' to 'Documentation/dev-tools'. (feedbakced by Jonathan Corbet) 4. Improve the documentation. (feedbacked by NeilBrown) 5. Use a common function, enter_from_user_mode(), instead of arch specific code, to notice context switch from user mode. (feedbacked by Dave Hansen, Mark Rutland, and Mark Brown) 6. Resolve the header dependency issue by using dept's internal header, instead of relocating 'struct llist_{head,node}' to another header. (feedbacked by Greg KH) 7. Improve page(or folio) usage type APIs. 8. Add rust helper for wait_for_completion(). (feedbacked by Guangbo Cui, Boqun Feng, and Danilo Krummrich) 9. Refine some commit messages. Changes from v16: 1. Rebase on v6.17. 2. Fix a false positive from rcu (by Yunseong Kim) 3. Introduce APIs to set page's usage, dept_set_page_usage() and dept_reset_page_usage() to avoid false positives. 4. Consider lock_page() as a potential wait unconditionally. 5. Consider folio_lock_killable() as a potential wait unconditionally. 6. Add support for tracking PG_writeback waits and events. 7. Fix two build errors due to the additional debug information added by dept. (by Yunseong Kim) Changes from v15: 1. Fix typo and improve comments and commit messages (feedbacked by ALOK TIWARI, Waiman Long, and kernel test robot). 2. Do not stop dept on detection of cicular dependency of recover event, allowing to keep reporting. 3. Add SK hynix to copyright. 4. Consider folio_lock() as a potential wait unconditionally. 5. Fix Kconfig dependency bug (feedbacked by kernel test rebot). 6. Do not suppress reports that involve classes even that have already involved in other reports, allowing to keep reporting. Changes from v14: 1. Rebase on the current latest, v6.15-rc6. 2. Refactor dept code. 3. With multi event sites for a single wait, even if an event forms a circular dependency, the event can be recovered by other event(or wake up) paths. Even though informing the circular dependency is worthy but it should be suppressed once informing it, if it doesn't lead an actual deadlock. So introduce APIs to annotate the relationship between event site and recover site, that are, event_site() and dept_recover_event(). 4. wait_for_completion() worked with dept map embedded in struct completion. However, it generates a few false positves since all the waits using the instance of struct completion, share the map and key. To avoid the false positves, make it not to share the map and key but each wait_for_completion() caller have its own key by default. Of course, external maps also can be used if needed. 5. Fix a bug about hardirq on/off tracing. 6. Implement basic unit test for dept. 7. Add more supports for dma fence synchronization. 8. Add emergency stop of dept e.g. on panic(). 9. Fix false positives by mmu_notifier_invalidate_*(). 10. Fix recursive call bug by DEPT_WARN_*() and DEPT_STOP(). 11. Fix trivial bugs in DEPT_WARN_*() and DEPT_STOP(). 12. Fix a bug that a spin lock, dept_pool_spin, is used in both contexts of irq disabled and enabled without irq disabled. 13. Suppress reports with classes, any of that already have been reported, even though they have different chains but being barely meaningful. 14. Print stacktrace of the wait that an event is now waking up, not only stacktrace of the event. 15. Make dept aware of lockdep_cmp_fn() that is used to avoid false positives in lockdep so that dept can also avoid them. 16. Do do_event() only if there are no ecxts have been delimited. 17. Fix a bug that was not synchronized for stage_m in struct dept_task, using a spin lock, dept_task()->stage_lock. 18. Fix a bug that dept didn't handle the case that multiple ttwus for a single waiter can be called at the same time e.i. a race issue. 19. Distinguish each kernel context from others, not only by system call but also by user oriented fault so that dept can work with more accuracy information about kernel context. That helps to avoid a few false positives. 20. Limit dept's working to x86_64 and arm64. Changes from v13: 1. Rebase on the current latest version, v6.9-rc7. 2. Add 'dept' documentation describing dept APIs. Changes from v12: 1. Refine the whole document for dept. 2. Add 'Interpret dept report' section in the document, using a deadlock report obtained in practice. Hope this version of document helps guys understand dept better. https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SA… https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.pa… Changes from v11: 1. Add 'dept' documentation describing the concept of dept. 2. Rewrite the commit messages of the following commits for using weaker lockdep annotation, for better description. fs/jbd2: Use a weaker annotation in journal handling cpu/hotplug: Use a weaker annotation in AP thread (feedbacked by Thomas Gleixner) Changes from v10: 1. Fix noinstr warning when building kernel source. 2. dept has been reporting some false positives due to the folio lock's unfairness. Reflect it and make dept work based on dept annotaions instead of just wait and wake up primitives. 3. Remove the support for PG_writeback while working on 2. I will add the support later if needed. 4. dept didn't print stacktrace for [S] if the participant of a deadlock is not lock mechanism but general wait and event. However, it made hard to interpret the report in that case. So add support to print stacktrace of the requestor who asked the event context to run - usually a waiter of the event does it just before going to wait state. 5. Give up tracking raw_local_irq_{disable,enable}() since it totally messed up dept's irq tracking. So make it work in the same way as lockdep does. I will consider it once any false positives by those are observed again. 6. Change the manual rwsem_acquire_read(->j_trans_commit_map) annotation in fs/jbd2/transaction.c to the try version so that it works as much as it exactly needs. 7. Remove unnecessary 'inline' keyword in dept.c and add '__maybe_unused' to a needed place. Changes from v9: 1. Fix a bug. SDT tracking didn't work well because of my big mistake that I should've used waiter's map to indentify its class but it had been working with waker's one. FYI, PG_locked and PG_writeback weren't affected. They still worked well. (reported by YoungJun) Changes from v8: 1. Fix build error by adding EXPORT_SYMBOL(PG_locked_map) and EXPORT_SYMBOL(PG_writeback_map) for kernel module build - appologize for that. (reported by kernel test robot) 2. Fix build error by removing header file's circular dependency that was caused by "atomic.h", "kernel.h" and "irqflags.h", which I introduced - appolgize for that. (reported by kernel test robot) Changes from v7: 1. Fix a bug that cannot track rwlock dependency properly, introduced in v7. (reported by Boqun and lockdep selftest) 2. Track wait/event of PG_{locked,writeback} more aggressively assuming that when a bit of PG_{locked,writeback} is cleared there might be waits on the bit. (reported by Linus, Hillf and syzbot) 3. Fix and clean bad style code e.i. unnecessarily introduced a randome pattern and so on. (pointed out by Linux) 4. Clean code for applying dept to wait_for_completion(). Changes from v6: 1. Tie to task scheduler code to track sleep and try_to_wake_up() assuming sleeps cause waits, try_to_wake_up()s would be the events that those are waiting for, of course with proper dept annotations, sdt_might_sleep_weak(), sdt_might_sleep_strong() and so on. For these cases, class is classified at sleep entrance rather than the synchronization initialization code. Which would extremely reduce false alarms. 2. Remove the dept associated instance in each page struct for tracking dependencies by PG_locked and PG_writeback thanks to the 1. work above. 3. Introduce CONFIG_dept_AGGRESIVE_TIMEOUT_WAIT to suppress reports that waits with timeout set are involved, for those who don't like verbose reporting. 4. Add a mechanism to refill the internal memory pools on running out so that dept could keep working as long as free memory is available in the system. 5. Re-enable tracking hashed-waitqueue wait. That's going to no longer generate false positives because class is classified at sleep entrance rather than the waitqueue initailization. 6. Refactor to make it easier to port onto each new version of the kernel. 7. Apply dept to dma fence. 8. Do trivial optimizaitions. Changes from v5: 1. Use just pr_warn_once() rather than WARN_ONCE() on the lack of internal resources because WARN_*() printing stacktrace is too much for informing the lack. (feedback from Ted, Hyeonggon) 2. Fix trivial bugs like missing initializing a struct before using it. 3. Assign a different class per task when handling onstack variables for waitqueue or the like. Which makes dept distinguish between onstack variables of different tasks so as to prevent false positives. (reported by Hyeonggon) 4. Make dept aware of even raw_local_irq_*() to prevent false positives. (reported by Hyeonggon) 5. Don't consider dependencies between the events that might be triggered within __schedule() and the waits that requires __schedule(), real ones. (reported by Hyeonggon) 6. Unstage the staged wait that has prepare_to_wait_event()'ed *and* yet to get to __schedule(), if we encounter __schedule() in-between for another sleep, which is possible if e.g. a mutex_lock() exists in 'condition' of ___wait_event(). 7. Turn on CONFIG_PROVE_LOCKING when CONFIG_DEPT is on, to rely on the hardirq and softirq entrance tracing to make dept more portable for now. Changes from v4: 1. Fix some bugs that produce false alarms. 2. Distinguish each syscall context from another *for arm64*. 3. Make it not warn it but just print it in case dept ring buffer gets exhausted. (feedback from Hyeonggon) 4. Explicitely describe "EXPERIMENTAL" and "dept might produce false positive reports" in Kconfig. (feedback from Ted) Changes from v3: 1. dept shouldn't create dependencies between different depths of a class that were indicated by *_lock_nested(). dept normally doesn't but it does once another lock class comes in. So fixed it. (feedback from Hyeonggon) 2. dept considered a wait as a real wait once getting to __schedule() even if it has been set to TASK_RUNNING by wake up sources in advance. Fixed it so that dept doesn't consider the case as a real wait. (feedback from Jan Kara) 3. Stop tracking dependencies with a map once the event associated with the map has been handled. dept will start to work with the map again, on the next sleep. Changes from v2: 1. Disable dept on bit_wait_table[] in sched/wait_bit.c reporting a lot of false positives, which is my fault. Wait/event for bit_wait_table[] should've been tagged in a higher layer for better work, which is a future work. (feedback from Jan Kara) 2. Disable dept on crypto_larval's completion to prevent a false positive. Changes from v1: 1. Fix coding style and typo. (feedback from Steven) 2. Distinguish each work context from another in workqueue. 3. Skip checking lock acquisition with nest_lock, which is about correct lock usage that should be checked by lockdep. Changes from RFC(v0): 1. Prevent adding a wait tag at prepare_to_wait() but __schedule(). (feedback from Linus and Matthew) 2. Use try version at lockdep_acquire_cpus_lock() annotation. 3. Distinguish each syscall context from another. Byungchul Park (41): dept: implement DEPT(DEPendency Tracker) dept: add single event dependency tracker APIs dept: add lock dependency tracker APIs dept: tie to lockdep and IRQ tracing dept: add proc knobs to show stats and dependency graph dept: distinguish each kernel context from another dept: distinguish each work from another dept: add a mechanism to refill the internal memory pools on running out dept: record the latest one out of consecutive waits of the same class dept: apply sdt_might_sleep_{start,end}() to wait_for_completion()/complete() dept: apply sdt_might_sleep_{start,end}() to swait dept: apply sdt_might_sleep_{start,end}() to waitqueue wait dept: apply sdt_might_sleep_{start,end}() to hashed-waitqueue wait dept: apply sdt_might_sleep_{start,end}() to dma fence dept: track timeout waits separately with a new Kconfig dept: apply timeout consideration to wait_for_completion()/complete() dept: apply timeout consideration to swait dept: apply timeout consideration to waitqueue wait dept: apply timeout consideration to hashed-waitqueue wait dept: apply timeout consideration to dma fence wait dept: make dept able to work with an external wgen dept: track PG_locked with dept dept: print staged wait's stacktrace on report locking/lockdep: prevent various lockdep assertions when lockdep_off()'ed dept: add documents for dept cpu/hotplug: use a weaker annotation in AP thread dept: assign dept map to mmu notifier invalidation synchronization dept: assign unique dept_key to each distinct dma fence caller dept: make dept aware of lockdep_set_lock_cmp_fn() annotation dept: make dept stop from working on debug_locks_off() dept: assign unique dept_key to each distinct wait_for_completion() caller completion, dept: introduce init_completion_dmap() API dept: introduce a new type of dependency tracking between multi event sites dept: add module support for struct dept_event_site and dept_event_site_dep dept: introduce event_site() to disable event tracking if it's recoverable dept: implement a basic unit test for dept dept: call dept_hardirqs_off() in local_irq_*() regardless of irq state dept: introduce APIs to set page usage and use subclasses_evt for the usage dept: track PG_writeback with dept SUNRPC: relocate struct rcu_head to the first field of struct rpc_xprt mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on DEPT and large PAGE_SIZE Yunseong Kim (1): rcu/update: fix same dept key collision between various types of RCU Documentation/dev-tools/dept.rst | 778 ++++++ Documentation/dev-tools/dept_api.rst | 125 + drivers/dma-buf/dma-fence.c | 23 +- include/asm-generic/vmlinux.lds.h | 13 +- include/linux/completion.h | 124 +- include/linux/dept.h | 402 +++ include/linux/dept_ldt.h | 78 + include/linux/dept_sdt.h | 68 + include/linux/dept_unit_test.h | 67 + include/linux/dma-fence.h | 74 +- include/linux/hardirq.h | 3 + include/linux/irq-entry-common.h | 4 + include/linux/irqflags.h | 21 +- include/linux/local_lock_internal.h | 1 + include/linux/lockdep.h | 105 +- include/linux/lockdep_types.h | 3 + include/linux/mm_types.h | 4 + include/linux/mmu_notifier.h | 26 + include/linux/module.h | 5 + include/linux/mutex.h | 1 + include/linux/page-flags.h | 217 +- include/linux/pagemap.h | 37 +- include/linux/percpu-rwsem.h | 2 +- include/linux/percpu.h | 4 + include/linux/rcupdate_wait.h | 13 +- include/linux/rtmutex.h | 1 + include/linux/rwlock_types.h | 1 + include/linux/rwsem.h | 1 + include/linux/sched.h | 118 + include/linux/seqlock.h | 2 +- include/linux/spinlock_types_raw.h | 3 + include/linux/srcu.h | 2 +- include/linux/sunrpc/xprt.h | 9 +- include/linux/swait.h | 3 + include/linux/wait.h | 3 + include/linux/wait_bit.h | 3 + init/init_task.c | 2 + init/main.c | 2 + kernel/Makefile | 1 + kernel/cpu.c | 2 +- kernel/dependency/Makefile | 5 + kernel/dependency/dept.c | 3499 ++++++++++++++++++++++++++ kernel/dependency/dept_hash.h | 10 + kernel/dependency/dept_internal.h | 314 +++ kernel/dependency/dept_object.h | 13 + kernel/dependency/dept_proc.c | 94 + kernel/dependency/dept_unit_test.c | 173 ++ kernel/exit.c | 1 + kernel/fork.c | 2 + kernel/locking/lockdep.c | 33 + kernel/module/main.c | 19 + kernel/rcu/rcu.h | 1 + kernel/rcu/update.c | 5 +- kernel/sched/completion.c | 62 +- kernel/sched/core.c | 9 + kernel/workqueue.c | 3 + lib/Kconfig.debug | 48 + lib/debug_locks.c | 2 + lib/locking-selftest.c | 2 + mm/filemap.c | 38 + mm/mm_init.c | 3 + mm/mmu_notifier.c | 31 +- rust/helpers/completion.c | 5 + 63 files changed, 6602 insertions(+), 121 deletions(-) create mode 100644 Documentation/dev-tools/dept.rst create mode 100644 Documentation/dev-tools/dept_api.rst create mode 100644 include/linux/dept.h create mode 100644 include/linux/dept_ldt.h create mode 100644 include/linux/dept_sdt.h create mode 100644 include/linux/dept_unit_test.h create mode 100644 kernel/dependency/Makefile create mode 100644 kernel/dependency/dept.c create mode 100644 kernel/dependency/dept_hash.h create mode 100644 kernel/dependency/dept_internal.h create mode 100644 kernel/dependency/dept_object.h create mode 100644 kernel/dependency/dept_proc.c create mode 100644 kernel/dependency/dept_unit_test.c base-commit: 43dfc13ca972988e620a6edb72956981b75ab6b0 -- 2.17.1

2 weeks, 5 days

[PATCH RFC 00/26] Add DMA-buf mapping types and convert vfio/iommufd to use them

by Jason Gunthorpe

Since its introduction, DMA-buf has only supported using scatterlist for the exporter and importer to exchange address information. This is not sufficient for all use cases as dma_addr_t is a very specific and limited type that should not be abused for things unrelated to the DMA API. There are several motivations for addressing this now: 1) VFIO to IOMMUFD and KVM requires a physical address, not a dma_addr_t scatterlist, it cannot be represented in the scatterlist structure 2) xe vGPU requires the host driver to accept a DMABUF from VFIO of its own VF and convert it into an internal VRAM address on the PF 3) We are starting to look at replacement datastructures for scatterlist 4) Ideas around UALink/etc are suggesting not using the DMA API None of these can sanely be achieved using scatterlist. Introduce a new mechanism called "mapping types" which allows DMA-buf to work with more map/unmap options than scatterlist. Each mapping type encompasses a full set of functions and data unique to itself. The core code provides a match-making system to select the best type offered by the exporter and importer to be the active mapping type for the attachment. Everything related to scatterlist is moved into a DMA-buf SGT mapping type, and into the "dma_buf_sgt_*" namespace for clarity. Existing exporters are moved over to explicitly declare SGT mapping types and importers are adjusted to use the dma_buf_sgt_* named importer helpers. Mapping types are designed to be extendable, a driver can declare its own mapping type for its internal private interconnect and use that without having to adjust the core code. The new attachment sequence starts with the importing driver declaring what mapping types it can accept: struct dma_buf_mapping_match imp_match[] = { DMA_BUF_IMAPPING_MY_DRIVER(dev, ...), DMA_BUF_IMAPPING_SGT(dev, false), }; attach = dma_buf_mapping_attach(dmabuf, imp_match, ...) Most drivers will do this via a dma_buf_sgt_*attach() helper. The exporting driver can then declare what mapping types it can supply: int exporter_match_mapping(struct dma_buf_match_args *args) { struct dma_buf_mapping_match exp_match[] = { DMA_BUF_EMAPPING_MY_DRIVER(my_ops, dev, ...), DMA_BUF_EMAPPING_SGT(sgt_ops, dev, false), DMA_BUF_EMAPPING_PAL(PAL_ops), }; return dma_buf_match_mapping(args, exp_match, ...); } Most drivers will do this via a helper: static const struct dma_buf_ops ops = { DMA_BUF_SIMPLE_SGT_EXP_MATCH(map_func, unmap_func) }; During dma_buf_mapping_attach() the core code will select a mutual match between the importer and exporter and record it as the active match in the attach->map_type. Each mapping type has its own types/function calls for mapping/unmapping, and storage in the attach->map_type for its information. As such each mapping type can offer function signatures and data that exactly matches its needs. This series goes through a sequence of: 1) Introduce the basic mapping type framework and the main components of the SGT mapping type 2) Automatically make all existing exporters and importers use core generated SGT mapping types so every attachment has a SGT mapping type 3) Convert all exporter drivers to natively create a SGT mapping type 4) Move all dma_buf_* functions and types that are related to SGT into dma_buf_sgt_* 5) Remove all the now-unused items that have been moved into SGT specific structures. 6) Demonstrate adding a new Physical Address List alongside SGT. Due to the high number of files touched I would expect this to be broken into phases, but this shows the entire picture. This is on github: https://github.com/jgunthorpe/linux/commits/dmabuf_map_type It is a followup to the discussion here: https://lore.kernel.org/dri-devel/20251027044712.1676175-1-vivek.kasireddy@… Jason Gunthorpe (26): dma-buf: Introduce DMA-buf mapping types dma-buf: Add the SGT DMA mapping type dma-buf: Add dma_buf_mapping_attach() dma-buf: Route SGT related actions through attach->map_type dma-buf: Allow single exporter drivers to avoid the match_mapping function drm: Check the SGT ops for drm_gem_map_dma_buf() dma-buf: Convert all the simple exporters to use SGT mapping type drm/vmwgfx: Use match_mapping instead of dummy calls accel/habanalabs: Use the SGT mapping type drm/xe/dma-buf: Use the SGT mapping type drm/amdgpu: Use the SGT mapping type vfio/pci: Change the DMA-buf exporter to use mapping_type dma-buf: Update dma_buf_phys_vec_to_sgt() to use the SGT mapping type iio: buffer: convert to use the SGT mapping type functionfs: convert to use the SGT mapping type dma-buf: Remove unused SGT stuff from the common structures treewide: Rename dma_buf_map_attachment(_unlocked) to dma_buf_sgt_ treewide: Rename dma_buf_unmap_attachment(_unlocked) to dma_buf_sgt_* treewide: Rename dma_buf_attach() to dma_buf_sgt_attach() treewide: Rename dma_buf_dynamic_attach() to dma_buf_sgt_dynamic_attach() dma-buf: Add the Physical Address List DMA mapping type vfio/pci: Add physical address list support to DMABUF iommufd: Use the PAL mapping type instead of a vfio function iommufd: Support DMA-bufs with multiple physical ranges iommufd/selftest: Check multi-phys DMA-buf scenarios dma-buf: Add kunit tests for mapping type Documentation/gpu/todo.rst | 2 +- drivers/accel/amdxdna/amdxdna_gem.c | 14 +- drivers/accel/amdxdna/amdxdna_ubuf.c | 10 +- drivers/accel/habanalabs/common/memory.c | 54 ++- drivers/accel/ivpu/ivpu_gem.c | 10 +- drivers/accel/ivpu/ivpu_gem_userptr.c | 11 +- drivers/accel/qaic/qaic_data.c | 8 +- drivers/dma-buf/Makefile | 1 + drivers/dma-buf/dma-buf-mapping.c | 186 ++++++++- drivers/dma-buf/dma-buf.c | 180 ++++++--- drivers/dma-buf/heaps/cma_heap.c | 12 +- drivers/dma-buf/heaps/system_heap.c | 13 +- drivers/dma-buf/st-dma-mapping.c | 373 ++++++++++++++++++ drivers/dma-buf/udmabuf.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 98 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +- drivers/gpu/drm/armada/armada_gem.c | 33 +- drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- drivers/gpu/drm/drm_prime.c | 31 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 18 +- drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 +- .../drm/i915/gem/selftests/i915_gem_dmabuf.c | 8 +- .../gpu/drm/i915/gem/selftests/mock_dmabuf.c | 8 +- drivers/gpu/drm/msm/msm_gem_prime.c | 7 +- drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 11 +- drivers/gpu/drm/tegra/gem.c | 33 +- drivers/gpu/drm/virtio/virtgpu_prime.c | 23 +- drivers/gpu/drm/vmwgfx/vmwgfx_prime.c | 32 +- drivers/gpu/drm/xe/xe_bo.c | 18 +- drivers/gpu/drm/xe/xe_dma_buf.c | 61 +-- drivers/iio/industrialio-buffer.c | 15 +- drivers/infiniband/core/umem_dmabuf.c | 15 +- drivers/iommu/iommufd/io_pagetable.h | 4 +- drivers/iommu/iommufd/iommufd_private.h | 8 - drivers/iommu/iommufd/iommufd_test.h | 7 + drivers/iommu/iommufd/pages.c | 85 ++-- drivers/iommu/iommufd/selftest.c | 177 ++++++--- .../media/common/videobuf2/videobuf2-core.c | 2 +- .../common/videobuf2/videobuf2-dma-contig.c | 26 +- .../media/common/videobuf2/videobuf2-dma-sg.c | 21 +- .../common/videobuf2/videobuf2-vmalloc.c | 13 +- .../platform/nvidia/tegra-vde/dmabuf-cache.c | 9 +- drivers/misc/fastrpc.c | 21 +- drivers/tee/tee_heap.c | 13 +- drivers/usb/gadget/function/f_fs.c | 11 +- drivers/vfio/pci/vfio_pci_dmabuf.c | 79 ++-- drivers/xen/gntdev-dmabuf.c | 29 +- include/linux/dma-buf-mapping.h | 297 ++++++++++++++ include/linux/dma-buf.h | 168 ++++---- io_uring/zcrx.c | 9 +- net/core/devmem.c | 14 +- samples/vfio-mdev/mbochs.c | 10 +- sound/soc/fsl/fsl_asrc_m2m.c | 12 +- tools/testing/selftests/iommu/iommufd.c | 43 ++ tools/testing/selftests/iommu/iommufd_utils.h | 17 + 55 files changed, 1764 insertions(+), 614 deletions(-) create mode 100644 drivers/dma-buf/st-dma-mapping.c base-commit: c63e5a50e1dd291cd95b04291b028fdcaba4c534 -- 2.43.0

1 month

[PATCH] mm/vmalloc: map contiguous pages in batches for vmap() whenever possible

by Barry Song

From: Barry Song <v-songbaohua(a)oppo.com> In many cases, the pages passed to vmap() may include high-order pages allocated with __GFP_COMP flags. For example, the systemheap often allocates pages in descending order: order 8, then 4, then 0. Currently, vmap() iterates over every page individually—even pages inside a high-order block are handled one by one. This patch detects high-order pages and maps them as a single contiguous block whenever possible. An alternative would be to implement a new API, vmap_sg(), but that change seems to be large in scope. When vmapping a 128MB dma-buf using the systemheap, this patch makes system_heap_do_vmap() roughly 17× faster. W/ patch: [ 10.404769] system_heap_do_vmap took 2494000 ns [ 12.525921] system_heap_do_vmap took 2467008 ns [ 14.517348] system_heap_do_vmap took 2471008 ns [ 16.593406] system_heap_do_vmap took 2444000 ns [ 19.501341] system_heap_do_vmap took 2489008 ns W/o patch: [ 7.413756] system_heap_do_vmap took 42626000 ns [ 9.425610] system_heap_do_vmap took 42500992 ns [ 11.810898] system_heap_do_vmap took 42215008 ns [ 14.336790] system_heap_do_vmap took 42134992 ns [ 16.373890] system_heap_do_vmap took 42750000 ns Cc: David Hildenbrand <david(a)kernel.org> Cc: Uladzislau Rezki <urezki(a)gmail.com> Cc: Sumit Semwal <sumit.semwal(a)linaro.org> Cc: John Stultz <jstultz(a)google.com> Cc: Maxime Ripard <mripard(a)kernel.org> Tested-by: Tangquan Zheng <zhengtangquan(a)oppo.com> Signed-off-by: Barry Song <v-songbaohua(a)oppo.com> --- * diff with rfc: Many code refinements based on David's suggestions, thanks! Refine comment and changelog according to Uladzislau, thanks! rfc link: https://lore.kernel.org/linux-mm/20251122090343.81243-1-21cnbao@gmail.com/ mm/vmalloc.c | 45 +++++++++++++++++++++++++++++++++++++++------ 1 file changed, 39 insertions(+), 6 deletions(-) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 41dd01e8430c..8d577767a9e5 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -642,6 +642,29 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end, return err; } +static inline int get_vmap_batch_order(struct page **pages, + unsigned int stride, unsigned int max_steps, unsigned int idx) +{ + int nr_pages = 1; + + /* + * Currently, batching is only supported in vmap_pages_range + * when page_shift == PAGE_SHIFT. + */ + if (stride != 1) + return 0; + + nr_pages = compound_nr(pages[idx]); + if (nr_pages == 1) + return 0; + if (max_steps < nr_pages) + return 0; + + if (num_pages_contiguous(&pages[idx], nr_pages) == nr_pages) + return compound_order(pages[idx]); + return 0; +} + /* * vmap_pages_range_noflush is similar to vmap_pages_range, but does not * flush caches. @@ -655,23 +678,33 @@ int __vmap_pages_range_noflush(unsigned long addr, unsigned long end, pgprot_t prot, struct page **pages, unsigned int page_shift) { unsigned int i, nr = (end - addr) >> PAGE_SHIFT; + unsigned int stride; WARN_ON(page_shift < PAGE_SHIFT); + /* + * For vmap(), users may allocate pages from high orders down to + * order 0, while always using PAGE_SHIFT as the page_shift. + * We first check whether the initial page is a compound page. If so, + * there may be an opportunity to batch multiple pages together. + */ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC) || - page_shift == PAGE_SHIFT) + (page_shift == PAGE_SHIFT && !PageCompound(pages[0]))) return vmap_small_pages_range_noflush(addr, end, prot, pages); - for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) { - int err; + stride = 1U << (page_shift - PAGE_SHIFT); + for (i = 0; i < nr; ) { + int err, order; - err = vmap_range_noflush(addr, addr + (1UL << page_shift), + order = get_vmap_batch_order(pages, stride, nr - i, i); + err = vmap_range_noflush(addr, addr + (1UL << (page_shift + order)), page_to_phys(pages[i]), prot, - page_shift); + page_shift + order); if (err) return err; - addr += 1UL << page_shift; + addr += 1UL << (page_shift + order); + i += 1U << (order + page_shift - PAGE_SHIFT); } return 0; -- 2.39.3 (Apple Git-146)

1 month, 1 week

[PATCH] dma-buf: set SB_I_NOEXEC and SB_I_NODEV on dmabuf filesystem

by Chia-Lin Kao (AceLan)

The VFS now warns if an inode flagged with S_ANON_INODE is located on a filesystem that does not have SB_I_NOEXEC set. dmabuf inodes are created using alloc_anon_inode(), which sets S_ANON_INODE. This triggers a warning in path_noexec() when a dmabuf is mmapped, for example by GStreamer's v4l2src element. [ 60.061328] WARNING: CPU: 2 PID: 2803 at fs/exec.c:125 path_noexec+0xa0/0xd0 ... [ 60.061637] do_mmap+0x2b5/0x680 The warning was introduced by commit 1e7ab6f67824 ("anon_inode: rework assertions") which added enforcement that anonymous inodes must be on filesystems with SB_I_NOEXEC set. Fix this by setting SB_I_NOEXEC and SB_I_NODEV on the dmabuf filesystem context, following the same pattern as commit ce7419b6cf23d ("anon_inode: raise SB_I_NODEV and SB_I_NOEXEC") and commit 98f99394a104c ("secretmem: use SB_I_NOEXEC"). Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao(a)canonical.com> --- drivers/dma-buf/dma-buf.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index a4d8f2ff94e46..dea79aaab10ce 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -221,6 +221,8 @@ static int dma_buf_fs_init_context(struct fs_context *fc) if (!ctx) return -ENOMEM; ctx->dops = &dma_buf_dentry_ops; + fc->s_iflags |= SB_I_NOEXEC; + fc->s_iflags |= SB_I_NODEV; return 0; } -- 2.51.0

1 month, 1 week

[PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver

by Ekansh Gupta

This patch series introduces the Qualcomm DSP Accelerator (QDA) driver, a modern DRM-based accelerator implementation for Qualcomm Hexagon DSPs. The driver provides a standardized interface for offloading computational tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains (ADSP, CDSP, SDSP, GDSP). The QDA driver is designed as an alternative for the FastRPC driver in drivers/misc/, offering improved resource management, better integration with standard kernel subsystems, and alignment with the Linux kernel's Compute Accelerators framework. User-space staging branch ============ https://github.com/qualcomm/fastrpc/tree/accel/staging Key Features ============ * Standard DRM accelerator interface via /dev/accel/accelN * GEM-based buffer management with DMA-BUF import/export support * IOMMU-based memory isolation using per-process context banks * FastRPC protocol implementation for DSP communication * RPMsg transport layer for reliable message passing * Support for all DSP domains (ADSP, CDSP, SDSP, GDSP) * Comprehensive IOCTL interface for DSP operations High-Level Architecture Differences with Existing FastRPC Driver ================================================================= The QDA driver represents a significant architectural departure from the existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key limitations while maintaining protocol compatibility: 1. DRM Accelerator Framework Integration - FastRPC: Custom character device (/dev/fastrpc-*) - QDA: Standard DRM accel device (/dev/accel/accelN) - Benefit: Leverages established DRM infrastructure for device management. 2. Memory Management - FastRPC: Custom memory allocator with ION/DMA-BUF integration - QDA: Native GEM objects with full PRIME support - Benefit: Seamless buffer sharing using standard DRM mechanisms 3. IOMMU Context Bank Management - FastRPC: Direct IOMMU domain manipulation, limited isolation - QDA: Custom compute bus (qda_cb_bus_type) with proper device model - Benefit: Each CB device is a proper struct device with IOMMU group support, enabling better isolation and resource tracking. - https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualco… 4. Memory Manager Architecture - FastRPC: Monolithic allocator - QDA: Pluggable memory manager with backend abstraction - Benefit: Currently uses DMA-coherent backend, easily extensible for future memory types (e.g., carveout, CMA) 5. Transport Layer - FastRPC: Direct RPMsg integration in core driver - QDA: Abstracted transport layer (qda_rpmsg.c) - Benefit: Clean separation of concerns, easier to add alternative transports if needed 8. Code Organization - FastRPC: ~3000 lines in single file - QDA: Modular design across multiple files (~4600 lines total) * qda_drv.c: Core driver and DRM integration * qda_gem.c: GEM object management * qda_memory_manager.c: Memory and IOMMU management * qda_fastrpc.c: FastRPC protocol implementation * qda_rpmsg.c: Transport layer * qda_cb.c: Context bank device management - Benefit: Better maintainability, clearer separation of concerns 9. UAPI Design - FastRPC: Custom IOCTL interface - QDA: DRM-style IOCTLs with proper versioning support - Benefit: Follows DRM conventions, easier userspace integration 10. Documentation - FastRPC: Minimal in-tree documentation - QDA: Comprehensive documentation in Documentation/accel/qda/ - Benefit: Better developer experience, clearer API contracts 11. Buffer Reference Mechanism - FastRPC: Uses buffer file descriptors (FDs) for all book-keeping in both kernel and DSP - QDA: Uses GEM handles for kernel-side management, providing better integration with DRM subsystem - Benefit: Leverages DRM GEM infrastructure for reference counting, lifetime management, and integration with other DRM components Key Technical Improvements =========================== * Proper device model: CB devices are real struct device instances on a custom bus, enabling proper IOMMU group management and power management integration * Reference-counted IOMMU devices: Multiple file descriptors from the same process share a single IOMMU device, reducing overhead * GEM-based buffer lifecycle: Automatic cleanup via DRM GEM reference counting, eliminating many resource leak scenarios * Modular memory backends: The memory manager supports pluggable backends, currently implementing DMA-coherent allocations with SID-prefixed addresses for DSP firmware * Context-based invocation tracking: XArray-based context management with proper synchronization and cleanup Patch Series Organization ========================== Patches 1-2: Driver skeleton and documentation Patches 3-6: RPMsg transport and IOMMU/CB infrastructure Patches 7-9: DRM device registration and basic IOCTL Patches 10-12: GEM buffer management and PRIME support Patches 13-17: FastRPC protocol implementation (attach, invoke, create, map/unmap) Patch 18: MAINTAINERS entry Open Items =========== The following items are identified as open items: 1. Privilege Level Management - Currently, daemon processes and user processes have the same access level as both use the same accel device node. This needs to be addressed as daemons attach to privileged DSP PDs and require higher privilege levels for system-level operations - Seeking guidance on the best approach: separate device nodes, capability-based checks, or DRM master/authentication mechanisms 2. UAPI Compatibility Layer - Add UAPI compat layer to facilitate migration of client applications from existing FastRPC UAPI to the new QDA accel driver UAPI, ensuring smooth transition for existing userspace code - Seeking guidance on implementation approach: in-kernel translation layer, userspace wrapper library, or hybrid solution 3. Documentation Improvements - Add detailed IOCTL usage examples - Document DSP firmware interface requirements - Create migration guide from existing FastRPC 4. Per-Domain Memory Allocation - Develop new userspace API to support memory allocation on a per domain basis, enabling domain-specific memory management and optimization 5. Audio and Sensors PD Support - The current patch series does not handle Audio PD and Sensors PD functionalities. These specialized protection domains require additional support for real-time constraints and power management Interface Compatibility ======================== The QDA driver maintains compatibility with existing FastRPC infrastructure: * Device Tree Bindings: The driver uses the same device tree bindings as the existing FastRPC driver, ensuring no changes are required to device tree sources. The "qcom,fastrpc" compatible string and child node structure remain unchanged. * Userspace Interface: While the driver provides a new DRM-based UAPI, the underlying FastRPC protocol and DSP firmware interface remain compatible. This ensures that DSP firmware and libraries continue to work without modification. * Migration Path: The modular design allows for gradual migration, where both drivers can coexist during the transition period. Applications can be migrated incrementally to the new UAPI with the help of the planned compatibility layer. References ========== Previous discussions on this migration: - https://lkml.org/lkml/2024/6/24/479 - https://lkml.org/lkml/2024/6/21/1252 Testing ======= The driver has been tested on Qualcomm platforms with: - Basic FastRPC attach/release operations - DSP process creation and initialization - Memory mapping/unmapping operations - Dynamic invocation with various buffer types - GEM buffer allocation and mmap - PRIME buffer import from other subsystems Signed-off-by: Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com> --- Ekansh Gupta (18): accel/qda: Add Qualcomm QDA DSP accelerator driver docs accel/qda: Add Qualcomm DSP accelerator driver skeleton accel/qda: Add RPMsg transport for Qualcomm DSP accelerator accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU accel/qda: Create compute CB devices on QDA compute bus accel/qda: Add memory manager for CB devices accel/qda: Add DRM accel device registration for QDA driver accel/qda: Add per-file DRM context and open/close handling accel/qda: Add QUERY IOCTL and basic QDA UAPI header accel/qda: Add DMA-backed GEM objects and memory manager integration accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs accel/qda: Add PRIME dma-buf import support accel/qda: Add initial FastRPC attach and release support accel/qda: Add FastRPC dynamic invocation support accel/qda: Add FastRPC DSP process creation support accel/qda: Add FastRPC-based DSP memory mapping support accel/qda: Add FastRPC-based DSP memory unmapping support MAINTAINERS: Add MAINTAINERS entry for QDA driver Documentation/accel/index.rst | 1 + Documentation/accel/qda/index.rst | 14 + Documentation/accel/qda/qda.rst | 129 ++++ MAINTAINERS | 9 + arch/arm64/configs/defconfig | 2 + drivers/accel/Kconfig | 1 + drivers/accel/Makefile | 2 + drivers/accel/qda/Kconfig | 35 ++ drivers/accel/qda/Makefile | 19 + drivers/accel/qda/qda_cb.c | 182 ++++++ drivers/accel/qda/qda_cb.h | 26 + drivers/accel/qda/qda_compute_bus.c | 23 + drivers/accel/qda/qda_drv.c | 375 ++++++++++++ drivers/accel/qda/qda_drv.h | 171 ++++++ drivers/accel/qda/qda_fastrpc.c | 1002 ++++++++++++++++++++++++++++++++ drivers/accel/qda/qda_fastrpc.h | 433 ++++++++++++++ drivers/accel/qda/qda_gem.c | 211 +++++++ drivers/accel/qda/qda_gem.h | 103 ++++ drivers/accel/qda/qda_ioctl.c | 271 +++++++++ drivers/accel/qda/qda_ioctl.h | 118 ++++ drivers/accel/qda/qda_memory_dma.c | 91 +++ drivers/accel/qda/qda_memory_dma.h | 46 ++ drivers/accel/qda/qda_memory_manager.c | 382 ++++++++++++ drivers/accel/qda/qda_memory_manager.h | 148 +++++ drivers/accel/qda/qda_prime.c | 194 +++++++ drivers/accel/qda/qda_prime.h | 43 ++ drivers/accel/qda/qda_rpmsg.c | 327 +++++++++++ drivers/accel/qda/qda_rpmsg.h | 57 ++ drivers/iommu/iommu.c | 4 + include/linux/qda_compute_bus.h | 22 + include/uapi/drm/qda_accel.h | 224 +++++++ 31 files changed, 4665 insertions(+) --- base-commit: d4906ae14a5f136ceb671bb14cedbf13fa560da6 change-id: 20260223-qda-firstpost-4ab05249e2cc Best regards, -- Ekansh Gupta <ekansh.gupta(a)oss.qualcomm.com>

1 month, 3 weeks

TRUSTED EXPERTS ONLINE RECOVERY SOLUTIONS//ALMIGHTY RECOVERY COIN

by Jeremy Thorgan

Almighty Cryptocurrency Recovery is a private investigation, asset recovery, and financial regulator. We specialize in instances involving recovery scams, cryptocurrency, fake investment schemes, and ethical hacking. We examine the factors influencing your score and are also experts in credit repair. removing criminal records, school grades, and jobs involving phone and social media hacking. Every piece of software required to carry out recoveries from beginning to end is available. The writers and offenders band together to establish a syndicate, so be wary of false reviews and testimonials on the internet.To get started, send an email to our support team at the address below as soon as you can. almightyrecoverycoin(a)mail.com and whatsapp +53 51 55 6969 Visit website; almightyrecoveryco.wixsite.com/almighty-recovery-co Stay Safe!

2 months, 1 week

5
4
10 0

[PATCH v2 00/12] media: stm32: dcmi: stability & performance enhancements

by Alain Volmat

This series improve stability of the capture by fixing the handling of the overrun which was leading to captured frame corruption. Locking within the driver is also simplified and the way DMA is handled is reworked allowing to avoid having a specific handling for the JPEG data. Performances of capture can now be increased via the usage of a DMA->MDMA chaining which allows for capture of higher resolution / framerate. Signed-off-by: Alain Volmat <alain.volmat(a)foss.st.com> --- Changes in v2: - Fix pm_sleep_ptr -> pm_ptr to avoid unused function warning - Fix typo / remove useless comment in binding - Link to v1: https://lore.kernel.org/r/20251218-stm32-dcmi-dma-chaining-v1-0-39948ca6cbf… --- Alain Volmat (12): media: stm32: dcmi: Switch from __maybe_unused to pm_ptr() media: stm32: dcmi: perform dmaengine_slave_config at probe media: stm32: dcmi: only create dma descriptor once at buf_prepare media: stm32: dcmi: stop the dma transfer on overrun media: stm32: dcmi: rework spin_lock calls media: stm32: dcmi: perform all dma handling within irq_thread media: stm32: dcmi: use dmaengine_terminate_async in irq context media: stm32: dcmi: continuous mode capture in JPEG dt-bindings: media: st: dcmi: add DMA-MDMA chaining properties media: stm32: dcmi: addition of DMA-MDMA chaining support ARM: dts: stm32: add sram node within stm32mp151.dtsi ARM: dts: stm32: enable DCMI DMA-MDMA chaining on stm32mp157c-ev1.dts .../devicetree/bindings/media/st,stm32-dcmi.yaml | 11 +- arch/arm/boot/dts/st/stm32mp151.dtsi | 8 + arch/arm/boot/dts/st/stm32mp157c-ev1.dts | 15 + drivers/media/platform/st/stm32/stm32-dcmi.c | 475 +++++++++++++-------- 4 files changed, 341 insertions(+), 168 deletions(-) --- base-commit: f7231cff1f3ff8259bef02dc4999bc132abf29cf change-id: 20251213-stm32-dcmi-dma-chaining-9ea1da83007d Best regards, -- Alain Volmat <alain.volmat(a)foss.st.com>

2 months, 2 weeks

[PATCH 0/9] accel: New driver for NXP's Neutron NPU

by Ioana Ciocoi-Radulescu

Introduce a new accel driver for the Neutron Neural Processing Unit (NPU), along with associated dt-bindings and DTS node. The first patch extends the GEM DMA helper APIs to allow bidirectional mapping of non-coherent DMA buffers. While not part of the Neutron driver, it's a prerequisite allowing us to use the GEM DMA helper. Neutron is a Neural Processing Unit from NXP, providing machine learning (ML) acceleration for edge AI applications. Neutron is integrated on NXP SoCs such as the i.MX95. The NPU consists of the following: - RISC-V core running a proprietary firmware - One or more Neutron cores, representing the main computation engine performing ML operations - Dedicated fast memory (TCM) - DMA engine that handles data transfers between DDR and TCM The firmware is closed source and distributed as a binary here [1]. The Neutron software stack also contains a userspace library [1] and a LiteRT custom delegate [2] that allow integration with standard LiteRT tools. [1] https://github.com/nxp-upstream/neutron/tree/upstream [2] https://github.com/nxp-imx/tflite-neutron-delegate Signed-off-by: Ioana Ciocoi-Radulescu <ruxandra.radulescu(a)nxp.com> --- Ioana Ciocoi-Radulescu (9): drm/gem-dma: Add flag for bidirectional mapping of non-coherent GEM DMA buffers accel/neutron: Add documentation for NXP Neutron accelerator driver dt-bindings: npu: Add bindings for NXP Neutron accel/neutron: Add driver for NXP Neutron NPU accel/neutron: Add GEM buffer object support accel/neutron: Add mailbox support accel/neutron: Add job submission IOCTL accel/neutron: Add logging support arm64: dts: imx95: Add Neutron node Documentation/accel/index.rst | 1 + Documentation/accel/neutron/index.rst | 12 + Documentation/accel/neutron/neutron.rst | 131 ++++++++ .../devicetree/bindings/npu/nxp,imx95-neutron.yaml | 95 ++++++ MAINTAINERS | 10 + arch/arm64/boot/dts/freescale/imx95.dtsi | 28 ++ drivers/accel/Kconfig | 1 + drivers/accel/Makefile | 3 +- drivers/accel/neutron/Kconfig | 16 + drivers/accel/neutron/Makefile | 12 + drivers/accel/neutron/neutron_debugfs.c | 34 ++ drivers/accel/neutron/neutron_debugfs.h | 15 + drivers/accel/neutron/neutron_device.c | 239 ++++++++++++++ drivers/accel/neutron/neutron_device.h | 158 +++++++++ drivers/accel/neutron/neutron_driver.c | 262 +++++++++++++++ drivers/accel/neutron/neutron_driver.h | 16 + drivers/accel/neutron/neutron_gem.c | 115 +++++++ drivers/accel/neutron/neutron_gem.h | 14 + drivers/accel/neutron/neutron_job.c | 367 +++++++++++++++++++++ drivers/accel/neutron/neutron_job.h | 45 +++ drivers/accel/neutron/neutron_mailbox.c | 47 +++ drivers/accel/neutron/neutron_mailbox.h | 42 +++ drivers/gpu/drm/drm_gem_dma_helper.c | 6 +- include/drm/drm_gem_dma_helper.h | 3 + include/uapi/drm/neutron_accel.h | 130 ++++++++ 25 files changed, 1799 insertions(+), 3 deletions(-) --- base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f change-id: 20260226-neutron-c435e39d167f Best regards, -- Ioana Ciocoi-Radulescu <ruxandra.radulescu(a)nxp.com>

2 months, 3 weeks

Jump to page:

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig February 2026