Introduce a new accel driver for the Neutron Neural Processing Unit (NPU), along with associated dt-bindings and DTS node.
The first patch extends the GEM DMA helper APIs to allow bidirectional mapping of non-coherent DMA buffers. While not part of the Neutron driver, it's a prerequisite allowing us to use the GEM DMA helper.
Neutron is a Neural Processing Unit from NXP, providing machine learning (ML) acceleration for edge AI applications. Neutron is integrated on NXP SoCs such as the i.MX95.
The NPU consists of the following:
- RISC-V core running a proprietary firmware
- One or more Neutron cores, representing the main computation engine
  performing ML operations
- Dedicated fast memory (TCM)
- DMA engine that handles data transfers between DDR and TCM
The firmware is closed source and distributed as a binary here [1].
The Neutron software stack also contains a userspace library [1] and a LiteRT custom delegate [2] that allow integration with standard LiteRT tools.
[1] https://github.com/nxp-upstream/neutron/tree/upstream
[2] https://github.com/nxp-imx/tflite-neutron-delegate
Signed-off-by: Ioana Ciocoi-Radulescu <ruxandra.radulescu@nxp.com>
---
Changes in v2:
- rebase on newer drm-misc-next
- dt bindings: clock fixes and renames
- update DTS to match new names
- remove unnecessary fields from neutron_job structure
- fix use of uninitialized variable
- Link to v1: https://lore.kernel.org/r/20260226-neutron-v1-0-46eccb3bb50a@nxp.com
---
Ioana Ciocoi-Radulescu (9):
      drm/gem-dma: Add flag for bidirectional mapping of non-coherent GEM DMA buffers
      accel/neutron: Add documentation for NXP Neutron accelerator driver
      dt-bindings: npu: Add NXP Neutron
      accel/neutron: Add driver for NXP Neutron NPU
      accel/neutron: Add GEM buffer object support
      accel/neutron: Add mailbox support
      accel/neutron: Add job submission IOCTL
      accel/neutron: Add logging support
      arm64: dts: imx95: Add Neutron node

 Documentation/accel/index.rst                      |   1 +
 Documentation/accel/neutron/index.rst              |  12 +
 Documentation/accel/neutron/neutron.rst            | 131 ++++++++
 .../devicetree/bindings/npu/nxp,imx95-neutron.yaml |  96 ++++++
 MAINTAINERS                                        |  10 +
 arch/arm64/boot/dts/freescale/imx95.dtsi           |  28 ++
 drivers/accel/Kconfig                              |   1 +
 drivers/accel/Makefile                             |   3 +-
 drivers/accel/neutron/Kconfig                      |  16 +
 drivers/accel/neutron/Makefile                     |  12 +
 drivers/accel/neutron/neutron_debugfs.c            |  34 ++
 drivers/accel/neutron/neutron_debugfs.h            |  15 +
 drivers/accel/neutron/neutron_device.c             | 239 +++++++++++++
 drivers/accel/neutron/neutron_device.h             | 155 +++++++++
 drivers/accel/neutron/neutron_driver.c             | 262 +++++++++++++++
 drivers/accel/neutron/neutron_driver.h             |  16 +
 drivers/accel/neutron/neutron_gem.c                | 116 +++++++
 drivers/accel/neutron/neutron_gem.h                |  14 +
 drivers/accel/neutron/neutron_job.c                | 372 +++++++++++++++++++++
 drivers/accel/neutron/neutron_job.h                |  43 +++
 drivers/accel/neutron/neutron_mailbox.c            |  47 +++
 drivers/accel/neutron/neutron_mailbox.h            |  42 +++
 drivers/gpu/drm/drm_gem_dma_helper.c               |   6 +-
 include/drm/drm_gem_dma_helper.h                   |   3 +
 include/uapi/drm/neutron_accel.h                   | 130 +++++++
 25 files changed, 1801 insertions(+), 3 deletions(-)
---
base-commit: 6716101ae42949e98ad4b9e71eeba08c055be410
change-id: 20260226-neutron-c435e39d167f

Best regards,
Introduce a flag that allows a user to request non-coherent buffers allocated via the GEM DMA helper for bidirectional use.
Keep current behaviour (DMA_TO_DEVICE mapping) as default, with no change required for existing GEM DMA users.
Existing GEM DMA users have not needed this so far, but some devices, such as NXP's Neutron Neural Processing Unit (NPU), require contiguous, non-coherent DMA buffers that they can both read from and write to. Unlike traditional DRM devices, Neutron uses the same DMA buffer both for reading model data and for writing inference output.
Neutron's usage scenario is a good match for the GEM DMA helpers, except that the current implementation only considers the DMA_TO_DEVICE direction.
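
The direction selection described above can be sketched as a small userspace model. This is only an illustration, not kernel code: the enum and struct below are local stand-ins (hypothetical names) for the kernel's enum dma_data_direction and for the two flags in struct drm_gem_dma_object, and the assertion encodes the rule that map_bidirectional is valid only when map_noncoherent is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins mirroring the kernel's enum dma_data_direction values. */
enum dma_dir_model {
	MODEL_DMA_BIDIRECTIONAL = 0,
	MODEL_DMA_TO_DEVICE = 1,
};

/* Hypothetical mirror of the two flags in struct drm_gem_dma_object. */
struct gem_dma_flags_model {
	bool map_noncoherent;
	bool map_bidirectional;
};

/*
 * Pick the DMA direction for a non-coherent allocation. The default stays
 * DMA_TO_DEVICE, so callers that never set map_bidirectional see no change.
 */
static enum dma_dir_model gem_dma_direction(const struct gem_dma_flags_model *f)
{
	/* map_bidirectional is only meaningful for non-coherent buffers. */
	assert(f->map_noncoherent || !f->map_bidirectional);

	return f->map_bidirectional ? MODEL_DMA_BIDIRECTIONAL
				    : MODEL_DMA_TO_DEVICE;
}
```

The same expression has to be used on both the allocation and the free path, since the direction passed when freeing must match the one used when allocating.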
Signed-off-by: Ioana Ciocoi-Radulescu <ruxandra.radulescu@nxp.com>
---
 drivers/gpu/drm/drm_gem_dma_helper.c | 6 ++++--
 include/drm/drm_gem_dma_helper.h     | 3 +++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_dma_helper.c b/drivers/gpu/drm/drm_gem_dma_helper.c
index ecb9746f4da8..dbf5ad4426d3 100644
--- a/drivers/gpu/drm/drm_gem_dma_helper.c
+++ b/drivers/gpu/drm/drm_gem_dma_helper.c
@@ -148,7 +148,8 @@ struct drm_gem_dma_object *drm_gem_dma_create(struct drm_device *drm,
 	if (dma_obj->map_noncoherent) {
 		dma_obj->vaddr = dma_alloc_noncoherent(drm->dev, size,
 						       &dma_obj->dma_addr,
-						       DMA_TO_DEVICE,
+						       dma_obj->map_bidirectional ?
+						       DMA_BIDIRECTIONAL : DMA_TO_DEVICE,
 						       GFP_KERNEL | __GFP_NOWARN);
 	} else {
 		dma_obj->vaddr = dma_alloc_wc(drm->dev, size,
@@ -238,7 +239,8 @@ void drm_gem_dma_free(struct drm_gem_dma_object *dma_obj)
 		if (dma_obj->map_noncoherent)
 			dma_free_noncoherent(gem_obj->dev->dev, dma_obj->base.size,
 					     dma_obj->vaddr, dma_obj->dma_addr,
-					     DMA_TO_DEVICE);
+					     dma_obj->map_bidirectional ?
+					     DMA_BIDIRECTIONAL : DMA_TO_DEVICE);
 		else
 			dma_free_wc(gem_obj->dev->dev, dma_obj->base.size,
 				    dma_obj->vaddr, dma_obj->dma_addr);
diff --git a/include/drm/drm_gem_dma_helper.h b/include/drm/drm_gem_dma_helper.h
index f2678e7ecb98..e0022f2fdfef 100644
--- a/include/drm/drm_gem_dma_helper.h
+++ b/include/drm/drm_gem_dma_helper.h
@@ -17,6 +17,8 @@ struct drm_mode_create_dumb;
  *       DMA addresses.
  * @vaddr: kernel virtual address of the backing memory
  * @map_noncoherent: if true, the GEM object is backed by non-coherent memory
+ * @map_bidirectional: valid only if map_noncoherent flag is set. If true, allow
+ *                     bidirectional use of the non-coherent memory buffer
  */
 struct drm_gem_dma_object {
 	struct drm_gem_object base;
@@ -27,6 +29,7 @@ struct drm_gem_dma_object {
 	void *vaddr;
 
 	bool map_noncoherent;
+	bool map_bidirectional;
 };
 
 #define to_drm_gem_dma_obj(gem_obj) \

Neutron is NXP's Neural Processing Unit (NPU) and it's integrated on the i.MX95 SoC. It is capable of running inferences on a large range of ML models and targets edge AI applications.
Signed-off-by: Ioana Ciocoi-Radulescu <ruxandra.radulescu@nxp.com>
---
 Documentation/accel/index.rst           |   1 +
 Documentation/accel/neutron/index.rst   |  12 +++
 Documentation/accel/neutron/neutron.rst | 131 ++++++++++++++++++++++++++++++++
 3 files changed, 144 insertions(+)

diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
index cbc7d4c3876a..dbe177074739 100644
--- a/Documentation/accel/index.rst
+++ b/Documentation/accel/index.rst
@@ -9,5 +9,6 @@ Compute Accelerators
 
    introduction
    amdxdna/index
+   neutron/index
    qaic/index
    rocket/index
diff --git a/Documentation/accel/neutron/index.rst b/Documentation/accel/neutron/index.rst
new file mode 100644
index 000000000000..8f15346d16c7
--- /dev/null
+++ b/Documentation/accel/neutron/index.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==========================
+ accel/neutron NPU driver
+==========================
+
+The accel/neutron driver supports the Neutron NPU (Neural Processing Unit)
+from NXP.
+
+.. toctree::
+
+   neutron
diff --git a/Documentation/accel/neutron/neutron.rst b/Documentation/accel/neutron/neutron.rst
new file mode 100644
index 000000000000..c5066d53ce69
--- /dev/null
+++ b/Documentation/accel/neutron/neutron.rst
@@ -0,0 +1,131 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+.. include:: <isonum.txt>
+
+====================
+ Neutron NPU Driver
+====================
+
+:Copyright: |copy| 2026 NXP
+
+Overview
+========
+
+Neutron is NXP's eIQ Neutron Neural Processing Unit (NPU). It is a highly
+scalable, power-efficient machine learning accelerator targeting quantized
+ML models for edge AI applications. Neutron is integrated into i.MX95 and
+other NXP platforms.
+
+A more detailed description of Neutron NPU and usage scenarios can be
+found at [1]_.
+
+Hardware Description
+====================
+
+Neutron has the following hardware components:
+
+- RISC-V core: this is the "brain" of the Neutron NPU. It runs a proprietary
+  firmware responsible for programming registers, processing commands and
+  managing the other hardware components
+- one or more Neutron cores: the main computation engine performing Machine
+  Learning (ML) operations
+- TCM: a dedicated fast memory
+- Data Mover: a DMA engine that handles data transfers between system memory
+  and Neutron's internal memory
+
+Software Stack
+==============
+
+The following software components are required for running an inference
+on the Neutron accelerator:
+
+- Neutron converter [2]_, [3]_: this is an offline tool that converts models
+  from standard TFLite (LiteRT) format to a custom format for execution on the
+  Neutron NPU;
+- An inference engine, e.g. LiteRT's XNNPack, which in turn uses
+- A LiteRT custom delegate [4]_ to dispatch custom operators to Neutron NPU;
+- A userspace library [5]_ that the delegate links to, which wraps IOCTLs
+  to the kernel driver in a higher-level API. It handles microcode, weights
+  and kernels preparation and base address computations needed by the NPU for
+  job execution. It also triggers cache syncs when required;
+- The Neutron kernel driver, which handles device initialization and
+  communicates directly with the Neutron firmware;
+- Neutron firmware [5]_, a proprietary firmware that executes on the RISC-V
+  core and directly drives the execution of the NPU hardware.
+
+Usage Flow
+==========
+
+This section describes the steps required to run an inference job on the
+Neutron NPU.
+
+Offline Conversion
+------------------
+
+The first step is to convert a standard TFLite model using the Neutron
+converter. Supported standard operators are extracted together and mapped
+to one or multiple **NeutronGraph** custom operators in the converted model.
+Standard operators that are not supported by the NPU are left unchanged and
+will be executed on the CPU.
+
+Runtime Flow
+------------
+
+On the platform's Cortex-A cores running Linux, the LiteRT inference engine
+is responsible for loading the ML model, pre-processing the input data and
+handing over the tensor computation to the NPU via the custom delegate.
+
+The inference engine can be exercised via one of the standard TFLite tools
+(e.g. benchmark_model, label_image, etc) or via any custom application that
+uses the LiteRT runtime API.
+
+When preparing to run an inference job, userspace requests a memory buffer
+from the kernel driver. It loads both the model and the input data in the
+buffer, while also reserving a section for the inference output. It then
+issues a job submission command with the prepared buffer and waits for
+completion.
+
+The kernel driver sends the inference job details to the Neutron firmware
+via mailbox registers. The NPU executes the inference and issues an interrupt
+to the Linux core once it is finished. The driver in return marks the job
+as complete so userspace can access and post-process the output.
+
+Boot Sequence
+=============
+
+The Neutron driver is responsible for loading the firmware image and
+initiating the NPU boot sequence. The device is powered down during suspend
+and each resume operation implies running the firmware load and boot sequence
+again.
+
+Hardware Constraints
+====================
+
+Cache Coherency
+---------------
+
+Some of the NXP platforms that Neutron is integrated on, including i.MX95,
+do not ensure Neutron memory coherency at hardware level, generating the
+need for explicit DMA sync operations. Given that only parts of the memory
+buffer may require syncing at any given time (e.g. multiple inferences using
+the same model but different input data) and that the kernel driver is unaware
+of the buffer partitioning, the sync operations are driven from userspace.
+
+Buffer alignment
+----------------
+
+The Neutron DMA engine requires the inference buffers to be aligned to 1MB
+boundary. We allocate buffers for Neutron NPU from a reserved CMA pool that
+satisfies this alignment requirement.
+
+References
+==========
+
+.. [1] i.MX Machine Learning User's Guide: https://www.nxp.com/docs/en/user-guide/UG10166.pdf
+.. [2] Neutron Converter binary and User Guide available for download here:
+   https://www.nxp.com/design/design-center/software/eiq-ai-development-environ...
+.. [3] NXP's eIQ PyPi repository: https://eiq.nxp.com/repository/eiq-neutron-sdk/
+.. [4] TFLite delegate source code: https://github.com/nxp-imx/tflite-neutron-delegate
+.. [5] Neutron firmware, library and TFLite delegate available here as binaries:
+   https://github.com/nxp-upstream/neutron/tree/upstream
+

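
The 1MB buffer alignment constraint from the "Buffer alignment" section of the documentation patch above can be illustrated with a short userspace sketch. It is not taken from the driver: NEUTRON_BUF_ALIGN and both helper names are hypothetical, and in the kernel the reserved CMA pool is what actually guarantees the alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: 1MB alignment required by the Neutron DMA engine. */
#define NEUTRON_BUF_ALIGN (UINT64_C(1) << 20)

/* The mask arithmetic below only works for power-of-two alignments. */
static_assert((NEUTRON_BUF_ALIGN & (NEUTRON_BUF_ALIGN - 1)) == 0,
	      "alignment must be a power of two");

/* True if a DMA address sits on a 1MB boundary. */
static bool neutron_buf_is_aligned(uint64_t dma_addr)
{
	return (dma_addr & (NEUTRON_BUF_ALIGN - 1)) == 0;
}

/* Round a requested size up to the next multiple of 1MB. */
static uint64_t neutron_buf_align_up(uint64_t size)
{
	/* Guard against wrap-around before adding the rounding term. */
	assert(size <= UINT64_MAX - (NEUTRON_BUF_ALIGN - 1));

	return (size + NEUTRON_BUF_ALIGN - 1) & ~(NEUTRON_BUF_ALIGN - 1);
}
```

For instance, a 1-byte request rounds up to 0x100000, and an address such as 0x80001000 would be rejected as misaligned.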
Add the bindings for Neutron, a Neural Processing Unit from NXP.
Signed-off-by: Jiwei Fu <jiwei.fu@nxp.com>
Signed-off-by: Ioana Ciocoi-Radulescu <ruxandra.radulescu@nxp.com>
---
v2:
- clock fixes: add description, rename, make clocks mandatory
- rename node from neutron to npu
- fix indentation
---
 .../devicetree/bindings/npu/nxp,imx95-neutron.yaml | 96 ++++++++++++++++++++++
 1 file changed, 96 insertions(+)

diff --git a/Documentation/devicetree/bindings/npu/nxp,imx95-neutron.yaml b/Documentation/devicetree/bindings/npu/nxp,imx95-neutron.yaml
new file mode 100644
index 000000000000..a06de4bc3f0a
--- /dev/null
+++ b/Documentation/devicetree/bindings/npu/nxp,imx95-neutron.yaml
@@ -0,0 +1,96 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/npu/nxp,imx95-neutron.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: NXP Neutron NPU
+
+maintainers:
+  - Ioana Ciocoi-Radulescu <ruxandra.radulescu@nxp.com>
+  - Jiwei Fu <jiwei.fu@nxp.com>
+
+description:
+  Neutron is an NPU from NXP targeting edge AI inference applications.
+  Initially supported on i.MX95 SoCs.
+
+properties:
+  compatible:
+    enum:
+      - nxp,imx95-neutron
+
+  reg:
+    items:
+      - description: Register space
+      - description: Instruction area of the TCM space
+      - description: Data area of the TCM space
+
+  reg-names:
+    items:
+      - const: regs
+      - const: itcm
+      - const: dtcm
+
+  memory-region:
+    description:
+      Phandle referencing a "shared-dma-pool" to be used for Neutron
+      inference buffers, which need to be 1MB aligned.
+
+      The memory region must be defined with alignment of 1MB and size
+      should be large enough to accommodate the targeted ML models. It
+      should be marked as reusable.
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    items:
+      - description: Core clock
+      - description: APB bus clock
+
+  clock-names:
+    items:
+      - const: core
+      - const: apb
+
+  iommus:
+    maxItems: 1
+
+  power-domains:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - memory-region
+  - interrupts
+  - clocks
+  - clock-names
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+    #include <dt-bindings/interrupt-controller/irq.h>
+
+    bus {
+      #address-cells = <2>;
+      #size-cells = <2>;
+
+      npu@4ab00000 {
+        compatible = "nxp,imx95-neutron";
+        reg = <0x0 0x4ab00000 0x0 0x00000400>,
+              <0x0 0x4AB10000 0x0 0x00010000>,
+              <0x0 0x4AB08000 0x0 0x00008000>;
+        reg-names = "regs", "itcm", "dtcm";
+        memory-region = <&neutron_pool>;
+        interrupts = <GIC_SPI 318 IRQ_TYPE_LEVEL_HIGH>;
+        clocks = <&scmi_clk 68>, <&scmi_clk 67>;
+        clock-names = "core", "apb";
+        power-domains = <&scmi_devpd 20>;
+      };
+    };
+...

On 06/03/2026 14:27, Ioana Ciocoi-Radulescu wrote:
> +  iommus:
> +    maxItems: 1
> +
> +  power-domains:
> +    maxItems: 1
> +
> +required:
> +  - compatible
> +  - reg
> +  - reg-names
> +  - memory-region
> +  - interrupts
> +  - clocks
> +  - clock-names

Missing power-domains

> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    #include <dt-bindings/interrupt-controller/arm-gic.h>
> +    #include <dt-bindings/interrupt-controller/irq.h>
> +
> +    bus {
> +      #address-cells = <2>;
> +      #size-cells = <2>;
> +
> +      npu@4ab00000 {
> +        compatible = "nxp,imx95-neutron";
> +        reg = <0x0 0x4ab00000 0x0 0x00000400>,
> +              <0x0 0x4AB10000 0x0 0x00010000>,
> +              <0x0 0x4AB08000 0x0 0x00008000>;

Keep consistent code, so lowercase hex.
With these two fixed:
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>

Best regards,
Krzysztof
Add a driver for the Neutron Neural Processing Unit from NXP.
Neutron NPU provides machine learning (ML) acceleration for edge AI applications. Neutron is integrated on NXP SoCs such as the i.MX95. More information can be found under Documentation/accel/neutron.
For now, introduce only basic functionality: device probe and remove, firmware load, boot and shutdown procedures, interrupt support and power management.
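
The firmware load step mentioned above places ELF segments into ITCM or DTCM by translating a device address into a TCM region. A minimal userspace sketch of that lookup follows. The struct and function names are hypothetical stand-ins, not the driver's own types. The sketch also checks that the whole segment fits inside the region, which is an assumption added for illustration; the patch itself looks up the start address only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the driver's memory region descriptor. */
struct tcm_region_model {
	uint64_t da;	/* device address where the region starts */
	uint64_t size;	/* region size in bytes */
};

/*
 * Return the index of the region that fully contains [da, da + len),
 * or -1 if the segment does not fit in any region.
 */
static int tcm_find_region(const struct tcm_region_model *regions, size_t count,
			   uint64_t da, uint64_t len)
{
	assert(regions != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		const struct tcm_region_model *r = &regions[i];
		uint64_t offset;

		if (da < r->da)
			continue;

		/* Compare via subtraction so the checks cannot overflow. */
		offset = da - r->da;
		if (offset < r->size && len <= r->size - offset)
			return (int)i;
	}

	return -1;
}
```

With the device addresses from the patch (ITCM at 0x0, DTCM at 0x40000) and the region sizes from the binding example (0x10000 and 0x8000), a segment at 0x40010 resolves to the DTCM entry, while one that starts at 0x47ff0 with length 0x20 is rejected because it would run past the end of DTCM.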
Signed-off-by: Jiwei Fu <jiwei.fu@nxp.com>
Signed-off-by: Ioana Ciocoi-Radulescu <ruxandra.radulescu@nxp.com>
---
 MAINTAINERS                            |  10 ++
 drivers/accel/Kconfig                  |   1 +
 drivers/accel/Makefile                 |   3 +-
 drivers/accel/neutron/Kconfig          |  16 +++
 drivers/accel/neutron/Makefile         |   7 +
 drivers/accel/neutron/neutron_device.c | 160 +++++++++++++++++++++++
 drivers/accel/neutron/neutron_device.h | 120 +++++++++++++++++
 drivers/accel/neutron/neutron_driver.c | 228 +++++++++++++++++++++++++++++++++
 drivers/accel/neutron/neutron_driver.h |  13 ++
 9 files changed, 557 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8a5b27b061da..f7a687eb6b54 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19191,6 +19191,16 @@ S:	Orphan
 F:	Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml
 F:	drivers/nfc/nxp-nci
 
+NXP Neutron NPU DRIVER
+M:	Ioana Ciocoi Radulescu <ruxandra.radulescu@nxp.com>
+M:	Jiwei Fu <jiwei.fu@nxp.com>
+L:	dri-devel@lists.freedesktop.org
+S:	Maintained
+T:	git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F:	Documentation/accel/neutron/
+F:	drivers/accel/neutron/
+F:	include/uapi/drm/neutron_accel.h
+
 NXP/Goodix TFA989X (TFA1) DRIVER
 M:	Stephan Gerhold <stephan@gerhold.net>
 L:	linux-sound@vger.kernel.org
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
index bdf48ccafcf2..ba392371b972 100644
--- a/drivers/accel/Kconfig
+++ b/drivers/accel/Kconfig
@@ -28,6 +28,7 @@ source "drivers/accel/amdxdna/Kconfig"
 source "drivers/accel/ethosu/Kconfig"
 source "drivers/accel/habanalabs/Kconfig"
 source "drivers/accel/ivpu/Kconfig"
+source "drivers/accel/neutron/Kconfig"
 source "drivers/accel/qaic/Kconfig"
 source "drivers/accel/rocket/Kconfig"
 
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index 1d3a7251b950..698136e12cce 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -4,5 +4,6 @@ obj-$(CONFIG_DRM_ACCEL_AMDXDNA)		+= amdxdna/
 obj-$(CONFIG_DRM_ACCEL_ARM_ETHOSU)	+= ethosu/
 obj-$(CONFIG_DRM_ACCEL_HABANALABS)	+= habanalabs/
 obj-$(CONFIG_DRM_ACCEL_IVPU)		+= ivpu/
+obj-$(CONFIG_DRM_ACCEL_NXP_NEUTRON)	+= neutron/
 obj-$(CONFIG_DRM_ACCEL_QAIC)		+= qaic/
-obj-$(CONFIG_DRM_ACCEL_ROCKET)		+= rocket/
\ No newline at end of file
+obj-$(CONFIG_DRM_ACCEL_ROCKET)		+= rocket/
diff --git a/drivers/accel/neutron/Kconfig b/drivers/accel/neutron/Kconfig
new file mode 100644
index 000000000000..37b8ecb49804
--- /dev/null
+++ b/drivers/accel/neutron/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0+
+
+config DRM_ACCEL_NXP_NEUTRON
+	tristate "NXP Neutron NPU"
+	depends on HAS_IOMEM
+	depends on DRM_ACCEL
+	depends on ARCH_MXC
+	select DRM_GEM_DMA_HELPER
+	select DRM_SCHED
+	help
+	  Enables driver for NXP Neutron NPU.
+
+	  Select this if you have an NXP SoC with Neutron, like i.MX95,
+	  and want to run machine learning applications.
+
+	  If built as module, the module is named neutron.
diff --git a/drivers/accel/neutron/Makefile b/drivers/accel/neutron/Makefile new file mode 100644 index 000000000000..7592e318dd83 --- /dev/null +++ b/drivers/accel/neutron/Makefile @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0+ + +obj-$(CONFIG_DRM_ACCEL_NXP_NEUTRON) := neutron.o + +neutron-y := \ + neutron_driver.o \ + neutron_device.o diff --git a/drivers/accel/neutron/neutron_device.c b/drivers/accel/neutron/neutron_device.c new file mode 100644 index 000000000000..61b3c96b4996 --- /dev/null +++ b/drivers/accel/neutron/neutron_device.c @@ -0,0 +1,160 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* Copyright 2025-2026 NXP */ + +#include <linux/bitfield.h> +#include <linux/elf.h> +#include <linux/firmware.h> +#include <linux/iopoll.h> + +#include "neutron_device.h" + +void neutron_enable_irq(struct neutron_device *ndev) +{ + u32 val; + + val = readl_relaxed(NEUTRON_REG(ndev, INTENA)); + val |= INTENA_INFDONE; + writel_relaxed(val, NEUTRON_REG(ndev, INTENA)); +} + +void neutron_disable_irq(struct neutron_device *ndev) +{ + writel_relaxed(INTENA_INFDONE, NEUTRON_REG(ndev, INTCLR)); +} + +void neutron_handle_irq(struct neutron_device *ndev) +{ + u32 appstatus; + + appstatus = readl_relaxed(NEUTRON_REG(ndev, APPSTATUS)); + + /* Write 1 to clear */ + writel_relaxed(appstatus & APPSTATUS_CLEAR_MASK, NEUTRON_REG(ndev, APPSTATUS)); + + if (appstatus & APPSTATUS_FAULTCAUSE_MASK) + dev_err(ndev->dev, "Neutron halted due to fault: 0x%lx\n", + FIELD_GET(APPSTATUS_FAULTCAUSE_MASK, appstatus)); +} + +#define neutron_boot_done(appctrl) \ + (FIELD_GET(APPCTRL_MBWR_MASK, (appctrl)) == APPCTRL_MBWR_MAGIC) + +static int neutron_start(struct neutron_device *ndev) +{ + u32 resetctrl, appctrl; + int ret; + + resetctrl = readl_relaxed(NEUTRON_REG(ndev, RESETCTRL)); + writel_relaxed(resetctrl | RESETCTRL_ZVRUN, NEUTRON_REG(ndev, RESETCTRL)); + + ret = readl_poll_timeout(NEUTRON_REG(ndev, APPCTRL), + appctrl, neutron_boot_done(appctrl), + 100, 1000 * USEC_PER_MSEC); + if (ret) { + 
dev_err(ndev->dev, "Neutron boot timed out\n"); + return -ETIMEDOUT; + } + + return 0; +} + +static void neutron_stop(struct neutron_device *ndev) +{ + u32 resetctrl; + + resetctrl = readl_relaxed(NEUTRON_REG(ndev, RESETCTRL)); + writel_relaxed(resetctrl & ~RESETCTRL_ZVRUN, NEUTRON_REG(ndev, RESETCTRL)); + + readl_poll_timeout(NEUTRON_REG(ndev, RESETCTRL), + resetctrl, !(resetctrl & RESETCTRL_ZVRUN), + 100, 100 * USEC_PER_MSEC); +} + +static void __iomem *neutron_tcm_da_to_va(struct neutron_device *ndev, u64 da) +{ + struct neutron_mem_region *mem; + int offset, i; + + for (i = 0; i < NEUTRON_MEM_MAX; i++) { + if (i != NEUTRON_MEM_ITCM && i != NEUTRON_MEM_DTCM) + continue; + mem = &ndev->mem_regions[i]; + if (da >= mem->da && da < mem->da + mem->size) { + offset = da - mem->da; + return mem->va + offset; + } + } + + return NULL; +} + +static int neutron_load_firmware(struct neutron_device *ndev) +{ + const struct firmware *fw; + struct elf32_hdr *ehdr; + struct elf32_phdr *phdr, *seg; + void __iomem *dest; + int i, ret; + + ret = request_firmware(&fw, NEUTRON_FIRMWARE_NAME, ndev->dev); + if (ret) { + dev_err(ndev->dev, "Failed to request firmware\n"); + return ret; + } + + ehdr = (struct elf32_hdr *)fw->data; + if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0) { + dev_err(ndev->dev, "Invalid firmware image\n"); + ret = -EINVAL; + goto out_release_fw; + } + + phdr = (struct elf32_phdr *)(fw->data + ehdr->e_phoff); + for (i = 0; i < ehdr->e_phnum; i++) { + seg = &phdr[i]; + if (seg->p_type != PT_LOAD || !seg->p_memsz) + continue; + + dest = neutron_tcm_da_to_va(ndev, seg->p_paddr); + if (!dest) { + dev_err(ndev->dev, "Invalid firmware segment: 0x%x\n", seg->p_paddr); + ret = -EINVAL; + goto out_release_fw; + } + + memcpy_toio(dest, fw->data + seg->p_offset, seg->p_filesz); + if (seg->p_memsz > seg->p_filesz) + memset_io(dest + seg->p_filesz, 0, seg->p_memsz - seg->p_filesz); + } + +out_release_fw: + release_firmware(fw); + + return ret; +} + +int neutron_boot(struct 
neutron_device *ndev) +{ + int ret; + + if (ndev->flags & NEUTRON_BOOTED) + neutron_shutdown(ndev); + + ret = neutron_load_firmware(ndev); + if (ret) + return ret; + + ret = neutron_start(ndev); + if (ret) + return ret; + + ndev->flags |= NEUTRON_BOOTED; + + return 0; +} + +void neutron_shutdown(struct neutron_device *ndev) +{ + neutron_stop(ndev); + ndev->flags &= ~NEUTRON_BOOTED; +} diff --git a/drivers/accel/neutron/neutron_device.h b/drivers/accel/neutron/neutron_device.h new file mode 100644 index 000000000000..8e4df7462d82 --- /dev/null +++ b/drivers/accel/neutron/neutron_device.h @@ -0,0 +1,120 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* Copyright 2025-2026 NXP */ + +#ifndef __NEUTRON_DEVICE_H__ +#define __NEUTRON_DEVICE_H__ + +#include <linux/device.h> +#include <linux/mutex.h> +#include <linux/spinlock.h> +#include <linux/bits.h> +#include <drm/drm_device.h> + +struct clk_bulk_data; + +#define NEUTRON_FIRMWARE_NAME "NeutronFirmware.elf" + +/* Register offsets */ +#define NEUTRON_REG_RESETCTRL 0x00 +#define NEUTRON_REG_STATUSERR 0x04 +#define NEUTRON_REG_INTENA 0x08 +#define NEUTRON_REG_INTCLR 0x0C +#define NEUTRON_REG_APPCTRL 0x200 +#define NEUTRON_REG_APPSTATUS 0x204 +#define NEUTRON_REG_BASEDDRL 0x208 +#define NEUTRON_REG_BASEDDRH 0x20C +#define NEUTRON_REG_RINGCTRL 0x230 +#define NEUTRON_REG_TAIL 0x238 +#define NEUTRON_REG_HEAD 0x23C +#define NEUTRON_REG_MBOX0 0x240 +#define NEUTRON_REG_MBOX1 0x244 +#define NEUTRON_REG_MBOX2 0x248 +#define NEUTRON_REG_MBOX3 0x24C +#define NEUTRON_REG_MBOX4 0x250 +#define NEUTRON_REG_MBOX5 0x254 +#define NEUTRON_REG_MBOX6 0x258 +#define NEUTRON_REG_MBOX7 0x25C +#define NEUTRON_REG_BASEINOUTL 0x280 +#define NEUTRON_REG_BASEINOUTH 0x284 +#define NEUTRON_REG_BASESPILLL 0x288 +#define NEUTRON_REG_BASESPILLH 0x28C + +/* Register fields */ +#define RESETCTRL_ZVRUN BIT(0) + +#define INTENA_INFDONE BIT(1) + +#define APPCTRL_MBWR_MASK GENMASK(31, 16) +#define APPCTRL_MBWR_MAGIC 0xF807 + +#define APPSTATUS_INFDONE BIT(0) 
+#define APPSTATUS_INFHALTED BIT(1) +#define APPSTATUS_FAULTCAUSE_MASK GENMASK(21, 16) +#define APPSTATUS_CLEAR_MASK GENMASK(4, 0) + +#define RINGCTRL_ADDR_MASK GENMASK(16, 8) +#define RINGCTRL_SIZE_MASK GENMASK(7, 0) +#define RINGCTRL_SIZE_MULT 256 + +/* Neutron device-side memory map */ +#define NEUTRON_ITCM_DA 0x0 +#define NEUTRON_DTCM_DA 0x40000 +#define NEUTRON_DTCM_BANK1_OFFSET 0x4000 + +/* Driver flags */ +#define NEUTRON_BOOTED BIT(0) + +/** + * struct neutron_mem_region - Neutron memory region descriptor + * @va: kernel virtual address of the memory region + * @da: Device address of the memory region + * @size: size of the memory region + */ +struct neutron_mem_region { + void __iomem *va; + u64 da; + size_t size; +}; + +enum neutron_mem_id { + NEUTRON_MEM_REGS = 0, + NEUTRON_MEM_ITCM, + NEUTRON_MEM_DTCM, + NEUTRON_MEM_MAX +}; + +/** + * struct neutron_device - Neutron device structure + * @base: Base DRM device + * @dev: Pointer to underlying device + * @mem_regions: Array of memory region descriptors + * @irq: IRQ number + * @clks: Neutron clocks + * @num_clks: Number of clocks + * @flags: Software flags used by driver + */ +struct neutron_device { + struct drm_device base; + struct device *dev; + + struct neutron_mem_region mem_regions[NEUTRON_MEM_MAX]; + + int irq; + struct clk_bulk_data *clks; + int num_clks; + u32 flags; +}; + +#define to_neutron_device(drm) \ + container_of(drm, struct neutron_device, base) + +#define NEUTRON_REG(ndev, name) \ + ((ndev)->mem_regions[NEUTRON_MEM_REGS].va + NEUTRON_REG_##name) + +int neutron_boot(struct neutron_device *ndev); +void neutron_shutdown(struct neutron_device *ndev); +void neutron_enable_irq(struct neutron_device *ndev); +void neutron_disable_irq(struct neutron_device *ndev); +void neutron_handle_irq(struct neutron_device *ndev); + +#endif /* __NEUTRON_DEVICE_H__ */ diff --git a/drivers/accel/neutron/neutron_driver.c b/drivers/accel/neutron/neutron_driver.c new file mode 100644 index 
000000000000..7f34785216cf --- /dev/null +++ b/drivers/accel/neutron/neutron_driver.c @@ -0,0 +1,228 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* Copyright 2025-2026 NXP */ + +#include <linux/clk.h> +#include <linux/dma-mapping.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/of_reserved_mem.h> +#include <linux/platform_device.h> +#include <linux/pm_runtime.h> + +#include <drm/drm_accel.h> +#include <drm/drm_drv.h> +#include <drm/drm_ioctl.h> +#include <drm/drm_gem.h> + +#include "neutron_device.h" +#include "neutron_driver.h" + +#define NEUTRON_SUSPEND_DELAY_MS 1000 + +static int neutron_open(struct drm_device *drm, struct drm_file *file) +{ + struct neutron_device *ndev = to_neutron_device(drm); + struct neutron_file_priv *npriv; + + npriv = kzalloc_obj(*npriv); + if (!npriv) + return -ENOMEM; + + npriv->ndev = ndev; + file->driver_priv = npriv; + + return 0; +} + +static void neutron_postclose(struct drm_device *drm, struct drm_file *file) +{ + struct neutron_file_priv *npriv = file->driver_priv; + + kfree(npriv); +} + +DEFINE_DRM_ACCEL_FOPS(neutron_drm_driver_fops); + +static const struct drm_driver neutron_drm_driver = { + .driver_features = DRIVER_COMPUTE_ACCEL, + .name = "neutron", + .desc = "NXP Neutron driver", + .major = 1, + .minor = 0, + + .fops = &neutron_drm_driver_fops, + .open = neutron_open, + .postclose = neutron_postclose, +}; + +static irqreturn_t neutron_irq_handler_thread(int irq, void *data) +{ + struct neutron_device *ndev = data; + + neutron_handle_irq(ndev); + + return IRQ_HANDLED; +} + +static int neutron_map_region(struct platform_device *pdev, char *name, + enum neutron_mem_id id) +{ + struct neutron_device *ndev = platform_get_drvdata(pdev); + struct neutron_mem_region *mem = &ndev->mem_regions[id]; + struct resource *res; + + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, name); + if (!res) + return -EINVAL; + + mem->va = devm_ioremap_resource(&pdev->dev, res); + if 
(IS_ERR(mem->va)) + return PTR_ERR(mem->va); + + mem->size = resource_size(res); + + if (id == NEUTRON_MEM_ITCM) + mem->da = NEUTRON_ITCM_DA; + else if (id == NEUTRON_MEM_DTCM) + mem->da = NEUTRON_DTCM_DA; + + return 0; +} + +static int neutron_probe(struct platform_device *pdev) +{ + struct neutron_device *ndev; + struct device *dev; + int ret; + + ndev = devm_drm_dev_alloc(&pdev->dev, &neutron_drm_driver, + struct neutron_device, base); + if (IS_ERR(ndev)) + return PTR_ERR(ndev); + + platform_set_drvdata(pdev, ndev); + dev = &pdev->dev; + ndev->dev = dev; + + dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48)); + + /* Map registers, ITCM and DTCM regions of the Neutron device */ + ret = neutron_map_region(pdev, "regs", NEUTRON_MEM_REGS); + if (ret) + return ret; + ret = neutron_map_region(pdev, "itcm", NEUTRON_MEM_ITCM); + if (ret) + return ret; + ret = neutron_map_region(pdev, "dtcm", NEUTRON_MEM_DTCM); + if (ret) + return ret; + + ndev->num_clks = devm_clk_bulk_get_all(dev, &ndev->clks); + if (ndev->num_clks < 0) + return ndev->num_clks; + + ndev->irq = platform_get_irq(pdev, 0); + if (ndev->irq < 0) + return ndev->irq; + + ret = devm_request_threaded_irq(dev, ndev->irq, NULL, + neutron_irq_handler_thread, + IRQF_ONESHOT, KBUILD_MODNAME, ndev); + if (ret) { + dev_err(dev, "Failed to request irq %d\n", ndev->irq); + return ret; + } + + ret = of_reserved_mem_device_init(&pdev->dev); + if (ret) { + dev_err(dev, "Failed to initialize reserved memory\n"); + return ret; + } + + ret = devm_pm_runtime_enable(dev); + if (ret) + goto free_reserved; + + pm_runtime_set_autosuspend_delay(dev, NEUTRON_SUSPEND_DELAY_MS); + pm_runtime_use_autosuspend(dev); + + ret = drm_dev_register(&ndev->base, 0); + if (ret) + goto free_reserved; + + return 0; + +free_reserved: + of_reserved_mem_device_release(&pdev->dev); + + return ret; +} + +static void neutron_remove(struct platform_device *pdev) +{ + struct neutron_device *ndev = platform_get_drvdata(pdev); + + 
drm_dev_unregister(&ndev->base); + of_reserved_mem_device_release(&pdev->dev); +} + +static int neutron_runtime_suspend(struct device *dev) +{ + struct neutron_device *ndev = dev_get_drvdata(dev); + + neutron_disable_irq(ndev); + neutron_shutdown(ndev); + + clk_bulk_disable_unprepare(ndev->num_clks, ndev->clks); + + return 0; +} + +static int neutron_runtime_resume(struct device *dev) +{ + struct neutron_device *ndev = dev_get_drvdata(dev); + int ret; + + ret = clk_bulk_prepare_enable(ndev->num_clks, ndev->clks); + if (ret) + return ret; + + ret = neutron_boot(ndev); + if (ret) { + clk_bulk_disable_unprepare(ndev->num_clks, ndev->clks); + return ret; + } + + neutron_enable_irq(ndev); + + return 0; +} + +static const struct dev_pm_ops neutron_pm_ops = { + SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume) + RUNTIME_PM_OPS(neutron_runtime_suspend, neutron_runtime_resume, NULL) +}; + +static const struct of_device_id neutron_match_table[] = { + { .compatible = "nxp,imx95-neutron" }, + {} +}; + +MODULE_DEVICE_TABLE(of, neutron_match_table); + +static struct platform_driver neutron_driver = { + .probe = &neutron_probe, + .remove = &neutron_remove, + .driver = { + .name = "neutron", + .of_match_table = of_match_ptr(neutron_match_table), + .pm = pm_ptr(&neutron_pm_ops), + }, +}; +module_platform_driver(neutron_driver); + +MODULE_AUTHOR("NXP"); +MODULE_DESCRIPTION("NXP Neutron Accel Driver"); +MODULE_LICENSE("GPL"); +MODULE_FIRMWARE(NEUTRON_FIRMWARE_NAME); diff --git a/drivers/accel/neutron/neutron_driver.h b/drivers/accel/neutron/neutron_driver.h new file mode 100644 index 000000000000..cd52b5eb2d27 --- /dev/null +++ b/drivers/accel/neutron/neutron_driver.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* Copyright 2025 NXP */ + +#ifndef __NEUTRON_DRIVER_H__ +#define __NEUTRON_DRIVER_H__ + +struct neutron_device; + +struct neutron_file_priv { + struct neutron_device *ndev; +}; + +#endif /* __NEUTRON_DRIVER_H__ */
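
The boot handshake in neutron_start() above polls APPCTRL until its upper 16 bits hold the magic value 0xF807. The field extraction can be modelled in plain C as below. This is a sketch for illustration only: APPCTRL_MBWR_SHIFT and neutron_boot_done_model() are hypothetical names, and the mask is spelled out by hand instead of using the kernel's GENMASK() and FIELD_GET() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hand-written equivalent of GENMASK(31, 16); the shift name is hypothetical. */
#define APPCTRL_MBWR_SHIFT 16
#define APPCTRL_MBWR_MASK  (UINT32_C(0xFFFF) << APPCTRL_MBWR_SHIFT)
#define APPCTRL_MBWR_MAGIC UINT32_C(0xF807)

static_assert(APPCTRL_MBWR_MASK == UINT32_C(0xFFFF0000),
	      "mask must cover bits 31:16");

/* True once the firmware has written the magic value into APPCTRL[31:16]. */
static bool neutron_boot_done_model(uint32_t appctrl)
{
	uint32_t field = (appctrl & APPCTRL_MBWR_MASK) >> APPCTRL_MBWR_SHIFT;

	return field == APPCTRL_MBWR_MAGIC;
}
```

Only the upper half of the register takes part in the comparison, so the low 16 bits may hold any value without affecting the result.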
On 06/03/2026 14:27, Ioana Ciocoi-Radulescu wrote:
diff --git a/MAINTAINERS b/MAINTAINERS
index 8a5b27b061da..f7a687eb6b54 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19191,6 +19191,16 @@ S: Orphan
 F: Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml
 F: drivers/nfc/nxp-nci
+NXP Neutron NPU DRIVER

s/Neutron/NEUTRON/ as everything here is in uppercase

+M: Ioana Ciocoi Radulescu ruxandra.radulescu@nxp.com
+M: Jiwei Fu jiwei.fu@nxp.com
+L: dri-devel@lists.freedesktop.org
+S: Maintained
+T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F: Documentation/accel/neutron/
+F: drivers/accel/neutron/
+F: include/uapi/drm/neutron_accel.h

diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index 1d3a7251b950..698136e12cce 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -4,5 +4,6 @@ obj-$(CONFIG_DRM_ACCEL_AMDXDNA) += amdxdna/
 obj-$(CONFIG_DRM_ACCEL_ARM_ETHOSU) += ethosu/
 obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/
 obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/
+obj-$(CONFIG_DRM_ACCEL_NXP_NEUTRON) += neutron/
 obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
-obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
\ No newline at end of file

You still have patch warnings.

+obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
diff --git a/drivers/accel/neutron/Kconfig b/drivers/accel/neutron/Kconfig
new file mode 100644
index 000000000000..37b8ecb49804
--- /dev/null
+++ b/drivers/accel/neutron/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0+
+config DRM_ACCEL_NXP_NEUTRON
+	tristate "NXP Neutron NPU"
+	depends on HAS_IOMEM
+	depends on DRM_ACCEL
+	depends on ARCH_MXC

Missing compile test

+	select DRM_GEM_DMA_HELPER
+	select DRM_SCHED
+	help
+	  Enables driver for NXP Neutron NPU.
+	  Select this if you have an NXP SoC with Neutron, like i.MX95,
+	  and want to run machine learning applications.
+	  If built as module, the module is named neutron.

...
+	ret = devm_request_threaded_irq(dev, ndev->irq, NULL,
+					neutron_irq_handler_thread,
+					IRQF_ONESHOT, KBUILD_MODNAME, ndev);
+	if (ret) {
+		dev_err(dev, "Failed to request irq %d\n", ndev->irq);

Drop, not needed.

+		return ret;
+	}
+
+	ret = of_reserved_mem_device_init(&pdev->dev);
+	if (ret) {
+		dev_err(dev, "Failed to initialize reserved memory\n");
+		return ret;
+	}
+
+	ret = devm_pm_runtime_enable(dev);
+	if (ret)
+		goto free_reserved;
+
+	pm_runtime_set_autosuspend_delay(dev, NEUTRON_SUSPEND_DELAY_MS);
+	pm_runtime_use_autosuspend(dev);
+
+	ret = drm_dev_register(&ndev->base, 0);
+	if (ret)
+		goto free_reserved;
+
+	return 0;
+
+free_reserved:
+	of_reserved_mem_device_release(&pdev->dev);
+
+	return ret;
+}
+
+static void neutron_remove(struct platform_device *pdev)
+{
+	struct neutron_device *ndev = platform_get_drvdata(pdev);
+
+	drm_dev_unregister(&ndev->base);
+	of_reserved_mem_device_release(&pdev->dev);
+}
+
+static int neutron_runtime_suspend(struct device *dev)
+{
+	struct neutron_device *ndev = dev_get_drvdata(dev);
+
+	neutron_disable_irq(ndev);
+	neutron_shutdown(ndev);
+
+	clk_bulk_disable_unprepare(ndev->num_clks, ndev->clks);
+
+	return 0;
+}
+
+static int neutron_runtime_resume(struct device *dev)
+{
+	struct neutron_device *ndev = dev_get_drvdata(dev);
+	int ret;
+
+	ret = clk_bulk_prepare_enable(ndev->num_clks, ndev->clks);
+	if (ret)
+		return ret;
+
+	ret = neutron_boot(ndev);
+	if (ret) {
+		clk_bulk_disable_unprepare(ndev->num_clks, ndev->clks);
+		return ret;
+	}
+
+	neutron_enable_irq(ndev);
+
+	return 0;
+}
+
+static const struct dev_pm_ops neutron_pm_ops = {
+	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
+	RUNTIME_PM_OPS(neutron_runtime_suspend, neutron_runtime_resume, NULL)
+};
+
+static const struct of_device_id neutron_match_table[] = {
+	{ .compatible = "nxp,imx95-neutron" },
+	{}
+};
+
+MODULE_DEVICE_TABLE(of, neutron_match_table);
+
+static struct platform_driver neutron_driver = {
+	.probe = &neutron_probe,
+	.remove = &neutron_remove,
+	.driver = {
+		.name = "neutron",
+		.of_match_table = of_match_ptr(neutron_match_table),

Drop of_match_ptr. You will get a warning here (or you already have one, same as in v1).

+		.pm = pm_ptr(&neutron_pm_ops),
+	},
+};
Best regards, Krzysztof
On Friday, March 6, 2026 at 4:22 PM, Krzysztof Kozlowski wrote:
On 06/03/2026 14:27, Ioana Ciocoi-Radulescu wrote:
diff --git a/MAINTAINERS b/MAINTAINERS
index 8a5b27b061da..f7a687eb6b54 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19191,6 +19191,16 @@ S: Orphan
 F: Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml
 F: drivers/nfc/nxp-nci
+NXP Neutron NPU DRIVER

s/Neutron/NEUTRON/ as everything here is in uppercase

Ok.

+M: Ioana Ciocoi Radulescu ruxandra.radulescu@nxp.com
+M: Jiwei Fu jiwei.fu@nxp.com
+L: dri-devel@lists.freedesktop.org
+S: Maintained
+T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F: Documentation/accel/neutron/
+F: drivers/accel/neutron/
+F: include/uapi/drm/neutron_accel.h

diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index 1d3a7251b950..698136e12cce 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -4,5 +4,6 @@ obj-$(CONFIG_DRM_ACCEL_AMDXDNA) += amdxdna/
 obj-$(CONFIG_DRM_ACCEL_ARM_ETHOSU) += ethosu/
 obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/
 obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/
+obj-$(CONFIG_DRM_ACCEL_NXP_NEUTRON) += neutron/
 obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
-obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
\ No newline at end of file

You still have patch warnings.

Yeah, the last line of this Makefile lacked a line ending, and vim fixed that on its own when I edited the file. I can add the neutron line and leave the rest untouched; just making sure this is what you're requesting?

+obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
diff --git a/drivers/accel/neutron/Kconfig b/drivers/accel/neutron/Kconfig
new file mode 100644
index 000000000000..37b8ecb49804
--- /dev/null
+++ b/drivers/accel/neutron/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0+
+config DRM_ACCEL_NXP_NEUTRON
+	tristate "NXP Neutron NPU"
+	depends on HAS_IOMEM
+	depends on DRM_ACCEL
+	depends on ARCH_MXC

Missing compile test

Will add.

+	select DRM_GEM_DMA_HELPER
+	select DRM_SCHED
+	help
+	  Enables driver for NXP Neutron NPU.
+	  Select this if you have an NXP SoC with Neutron, like i.MX95,
+	  and want to run machine learning applications.
+	  If built as module, the module is named neutron.

...
+	ret = devm_request_threaded_irq(dev, ndev->irq, NULL,
+					neutron_irq_handler_thread,
+					IRQF_ONESHOT, KBUILD_MODNAME, ndev);
+	if (ret) {
+		dev_err(dev, "Failed to request irq %d\n", ndev->irq);

Drop, not needed.

Ok

+		return ret;
+	}
+
+	ret = of_reserved_mem_device_init(&pdev->dev);
+	if (ret) {
+		dev_err(dev, "Failed to initialize reserved memory\n");
+		return ret;
+	}
+
+	ret = devm_pm_runtime_enable(dev);
+	if (ret)
+		goto free_reserved;
+
+	pm_runtime_set_autosuspend_delay(dev, NEUTRON_SUSPEND_DELAY_MS);
+	pm_runtime_use_autosuspend(dev);
+
+	ret = drm_dev_register(&ndev->base, 0);
+	if (ret)
+		goto free_reserved;
+
+	return 0;
+
+free_reserved:
+	of_reserved_mem_device_release(&pdev->dev);
+
+	return ret;
+}
+
+static void neutron_remove(struct platform_device *pdev)
+{
+	struct neutron_device *ndev = platform_get_drvdata(pdev);
+
+	drm_dev_unregister(&ndev->base);
+	of_reserved_mem_device_release(&pdev->dev);
+}
+
+static int neutron_runtime_suspend(struct device *dev)
+{
+	struct neutron_device *ndev = dev_get_drvdata(dev);
+
+	neutron_disable_irq(ndev);
+	neutron_shutdown(ndev);
+
+	clk_bulk_disable_unprepare(ndev->num_clks, ndev->clks);
+
+	return 0;
+}
+
+static int neutron_runtime_resume(struct device *dev)
+{
+	struct neutron_device *ndev = dev_get_drvdata(dev);
+	int ret;
+
+	ret = clk_bulk_prepare_enable(ndev->num_clks, ndev->clks);
+	if (ret)
+		return ret;
+
+	ret = neutron_boot(ndev);
+	if (ret) {
+		clk_bulk_disable_unprepare(ndev->num_clks, ndev->clks);
+		return ret;
+	}
+
+	neutron_enable_irq(ndev);
+
+	return 0;
+}
+
+static const struct dev_pm_ops neutron_pm_ops = {
+	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
+	RUNTIME_PM_OPS(neutron_runtime_suspend, neutron_runtime_resume, NULL)
+};
+
+static const struct of_device_id neutron_match_table[] = {
+	{ .compatible = "nxp,imx95-neutron" },
+	{}
+};
+
+MODULE_DEVICE_TABLE(of, neutron_match_table);
+
+static struct platform_driver neutron_driver = {
+	.probe = &neutron_probe,
+	.remove = &neutron_remove,
+	.driver = {
+		.name = "neutron",
+		.of_match_table = of_match_ptr(neutron_match_table),

Drop of_match_ptr. You will get a warning here (or you already have one, same as in v1).

Will fix. But how do I get to see the warning here? I tried building with W=1 and OF support disabled, but it didn't complain.

Thanks!
Ioana

+		.pm = pm_ptr(&neutron_pm_ops),
+	},
+};
Best regards, Krzysztof
Add the following IOCTLs:
- CREATE_BO - for creating a new buffer object and passing BO info back to the user
- SYNC_BO - for explicit DMA sync operations on the BO memory, since Neutron isn't guaranteed to be cache coherent. The user controls which portions of the buffer memory to sync and the direction.
The Neutron device requires contiguous DMA buffers, so use the GEM DMA helpers for creating and managing the BOs. Depending on the platform it is integrated on, the Neutron device may or may not be cache coherent. On i.MX95, the first platform for which we add Neutron support, it is not.
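The size rounding done by CREATE_BO and the range check done by SYNC_BO can be looked at in isolation. The following is a minimal userspace sketch, not driver code: `bo_align_size()` and `sync_range_ok()` are hypothetical helper names, and `BO_ALIGN` simply reuses the 1 MiB value of NEUTRON_BO_ALIGN from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 1 MiB, the same value as SZ_1M / NEUTRON_BO_ALIGN in the patch. */
#define BO_ALIGN ((uint64_t)1 << 20)

/* Hypothetical helper: round a requested size up to a multiple of BO_ALIGN. */
static uint64_t bo_align_size(uint64_t size)
{
	/* The mask trick below is only valid for power-of-two alignments. */
	assert((BO_ALIGN & (BO_ALIGN - 1)) == 0);

	return (size + BO_ALIGN - 1) & ~(BO_ALIGN - 1);
}

/*
 * Hypothetical helper: check that [offset, offset + size) lies inside a
 * buffer of bo_size bytes. Comparing size against bo_size - offset, rather
 * than computing offset + size, keeps the check safe from integer overflow.
 */
static bool sync_range_ok(uint64_t bo_size, uint64_t offset, uint64_t size)
{
	if (size == 0 || offset >= bo_size)
		return false;

	return size <= bo_size - offset;
}
```

With these definitions a 1-byte request is rounded up to 1 MiB, and a sync request whose size is close to the 64-bit limit is rejected instead of wrapping around.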
Signed-off-by: Ioana Ciocoi-Radulescu ruxandra.radulescu@nxp.com --- v2: Fix use of uninitialized variable --- drivers/accel/neutron/Makefile | 3 +- drivers/accel/neutron/neutron_driver.c | 13 +++- drivers/accel/neutron/neutron_gem.c | 116 +++++++++++++++++++++++++++++++++ drivers/accel/neutron/neutron_gem.h | 14 ++++ include/uapi/drm/neutron_accel.h | 79 ++++++++++++++++++++++ 5 files changed, 223 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/neutron/Makefile b/drivers/accel/neutron/Makefile index 7592e318dd83..d4298c7a8535 100644 --- a/drivers/accel/neutron/Makefile +++ b/drivers/accel/neutron/Makefile @@ -4,4 +4,5 @@ obj-$(CONFIG_DRM_ACCEL_NXP_NEUTRON) := neutron.o
neutron-y := \ neutron_driver.o \ - neutron_device.o + neutron_device.o \ + neutron_gem.o diff --git a/drivers/accel/neutron/neutron_driver.c b/drivers/accel/neutron/neutron_driver.c index 7f34785216cf..c9a18bf52037 100644 --- a/drivers/accel/neutron/neutron_driver.c +++ b/drivers/accel/neutron/neutron_driver.c @@ -14,12 +14,19 @@ #include <drm/drm_drv.h> #include <drm/drm_ioctl.h> #include <drm/drm_gem.h> +#include <drm/neutron_accel.h>
#include "neutron_device.h" #include "neutron_driver.h" +#include "neutron_gem.h"
#define NEUTRON_SUSPEND_DELAY_MS 1000
+static const struct drm_ioctl_desc neutron_drm_ioctls[] = { + DRM_IOCTL_DEF_DRV(NEUTRON_CREATE_BO, neutron_ioctl_create_bo, 0), + DRM_IOCTL_DEF_DRV(NEUTRON_SYNC_BO, neutron_ioctl_sync_bo, 0), +}; + static int neutron_open(struct drm_device *drm, struct drm_file *file) { struct neutron_device *ndev = to_neutron_device(drm); @@ -45,7 +52,7 @@ static void neutron_postclose(struct drm_device *drm, struct drm_file *file) DEFINE_DRM_ACCEL_FOPS(neutron_drm_driver_fops);
static const struct drm_driver neutron_drm_driver = { - .driver_features = DRIVER_COMPUTE_ACCEL, + .driver_features = DRIVER_COMPUTE_ACCEL | DRIVER_GEM, .name = "neutron", .desc = "NXP Neutron driver", .major = 1, @@ -54,6 +61,10 @@ static const struct drm_driver neutron_drm_driver = { .fops = &neutron_drm_driver_fops, .open = neutron_open, .postclose = neutron_postclose, + .ioctls = neutron_drm_ioctls, + .num_ioctls = ARRAY_SIZE(neutron_drm_ioctls), + + .gem_create_object = neutron_gem_create_object, };
static irqreturn_t neutron_irq_handler_thread(int irq, void *data) diff --git a/drivers/accel/neutron/neutron_gem.c b/drivers/accel/neutron/neutron_gem.c new file mode 100644 index 000000000000..142237caf041 --- /dev/null +++ b/drivers/accel/neutron/neutron_gem.c @@ -0,0 +1,116 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* Copyright 2025-2026 NXP */ + +#include <linux/sizes.h> +#include <linux/align.h> +#include <linux/dma-map-ops.h> +#include <drm/drm_device.h> +#include <drm/drm_gem_dma_helper.h> +#include <drm/drm_print.h> +#include <drm/neutron_accel.h> + +#include "neutron_device.h" +#include "neutron_gem.h" + +#define NEUTRON_BO_ALIGN SZ_1M + +struct drm_gem_object *neutron_gem_create_object(struct drm_device *drm, size_t size) +{ + struct neutron_device *ndev = to_neutron_device(drm); + struct drm_gem_dma_object *dma_obj; + struct drm_gem_object *gem_obj; + + dma_obj = kzalloc_obj(*dma_obj); + if (!dma_obj) + return ERR_PTR(-ENOMEM); + + dma_obj->map_noncoherent = !dev_is_dma_coherent(ndev->dev); + dma_obj->map_bidirectional = true; + gem_obj = &dma_obj->base; + + return gem_obj; +} + +int neutron_ioctl_create_bo(struct drm_device *drm, void *data, struct drm_file *filp) +{ + struct drm_neutron_create_bo *args = data; + struct drm_gem_dma_object *dma_obj; + struct drm_gem_object *gem_obj; + size_t size; + int ret; + + if (!args->size || args->pad) + return -EINVAL; + + size = ALIGN(args->size, NEUTRON_BO_ALIGN); + + dma_obj = drm_gem_dma_create(drm, size); + if (IS_ERR(dma_obj)) + return PTR_ERR(dma_obj); + + gem_obj = &dma_obj->base; + + /* We expect correctly aligned buffers, but double-check */ + if (drm_WARN_ON(drm, !IS_ALIGNED(dma_obj->dma_addr, NEUTRON_BO_ALIGN))) { + ret = -EFAULT; + goto out_put; + } + + ret = drm_gem_handle_create(filp, gem_obj, &args->handle); + if (ret) + goto out_put; + + args->map_offset = drm_vma_node_offset_addr(&gem_obj->vma_node); + args->size = gem_obj->size; + +out_put: + /* No need to keep a reference of the GEM object. 
Freeing is handled by user */ + drm_gem_object_put(gem_obj); + + return ret; +} + +int neutron_ioctl_sync_bo(struct drm_device *drm, void *data, struct drm_file *filp) +{ + struct drm_neutron_sync_bo *args = data; + struct drm_gem_dma_object *dma_obj; + struct drm_gem_object *gem_obj; + dma_addr_t start_addr; + int ret = 0; + + gem_obj = drm_gem_object_lookup(filp, args->handle); + if (!gem_obj) { + dev_dbg(drm->dev, "Invalid BO handle %u\n", args->handle); + return -ENOENT; + } + + dma_obj = to_drm_gem_dma_obj(gem_obj); + + if (!args->size || args->offset >= gem_obj->size || + args->size > gem_obj->size - args->offset) { + dev_dbg(drm->dev, "Invalid offset/size for BO sync\n"); + ret = -EINVAL; + goto out_put; + } + + start_addr = dma_obj->dma_addr + args->offset; + + switch (args->direction) { + case DRM_NEUTRON_SYNC_TO_DEVICE: + dma_sync_single_for_device(drm->dev, start_addr, args->size, + DMA_BIDIRECTIONAL); + break; + case DRM_NEUTRON_SYNC_FROM_DEVICE: + dma_sync_single_for_cpu(drm->dev, start_addr, args->size, + DMA_BIDIRECTIONAL); + break; + default: + dev_dbg(drm->dev, "Invalid direction for BO sync\n"); + ret = -EINVAL; + } + +out_put: + drm_gem_object_put(gem_obj); + + return ret; +} diff --git a/drivers/accel/neutron/neutron_gem.h b/drivers/accel/neutron/neutron_gem.h new file mode 100644 index 000000000000..95ba2fe96617 --- /dev/null +++ b/drivers/accel/neutron/neutron_gem.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* Copyright 2025-2026 NXP */ + +#ifndef __NEUTRON_GEM_H__ +#define __NEUTRON_GEM_H__ + +#include <drm/drm_gem.h> + +struct drm_gem_object *neutron_gem_create_object(struct drm_device *drm, size_t size); + +int neutron_ioctl_create_bo(struct drm_device *drm, void *data, struct drm_file *filp); +int neutron_ioctl_sync_bo(struct drm_device *drm, void *data, struct drm_file *filp); + +#endif /* __NEUTRON_GEM_H__ */ diff --git a/include/uapi/drm/neutron_accel.h b/include/uapi/drm/neutron_accel.h new file mode 100644 index 
000000000000..2f5639f2e0e8 --- /dev/null +++ b/include/uapi/drm/neutron_accel.h @@ -0,0 +1,79 @@ +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ +/* Copyright 2025-2026 NXP */ + +#ifndef __NEUTRON_ACCEL_H__ +#define __NEUTRON_ACCEL_H__ + +#include "drm.h" + +#if defined(__cplusplus) +extern "C" { +#endif + +/** + * enum drm_neutron_ioctl - Neutron IOCTL IDs + * + * @DRM_NEUTRON_CREATE_BO: Create a buffer object + * @DRM_NEUTRON_SYNC_BO: Sync (parts of) the buffer object memory + */ +enum drm_neutron_ioctl { + DRM_NEUTRON_CREATE_BO = 0, + DRM_NEUTRON_SYNC_BO, +}; + +/** + * struct drm_neutron_create_bo - Create a buffer object and return buffer + * info to user + * + * @size: Size in bytes of requested buffer. May be updated by driver + * if allocated size different than requested + * @handle: Returned handle for the new buffer object + * @pad: MBZ + * @map_offset: Returned offset for mmap() calls + */ +struct drm_neutron_create_bo { + __u64 size; + __u32 handle; + __u32 pad; + __u64 map_offset; +}; + +/** + * enum drm_neutron_sync_dir - Direction of buffer object synchronization + * + * @DRM_NEUTRON_SYNC_TO_DEVICE: Sync from CPU to device + * @DRM_NEUTRON_SYNC_FROM_DEVICE: Sync from device to CPU + */ +enum drm_neutron_sync_dir { + DRM_NEUTRON_SYNC_TO_DEVICE = 0, + DRM_NEUTRON_SYNC_FROM_DEVICE, +}; + +/** + * struct drm_neutron_sync_bo - Sync buffer object memory + * + * @handle: Handle of buffer object to sync + * @direction: Direction of sync, can be one of enum drm_neutron_sync_dir + * @size: Size of the memory to sync, in bytes + * @offset: Offset inside the buffer, in bytes + */ +struct drm_neutron_sync_bo { + __u32 handle; + __u32 direction; + __u64 size; + __u64 offset; +}; + +#define DRM_IOCTL_NEUTRON_CREATE_BO \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_NEUTRON_CREATE_BO, \ + struct drm_neutron_create_bo) + +#define DRM_IOCTL_NEUTRON_SYNC_BO \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_NEUTRON_SYNC_BO, \ + struct drm_neutron_sync_bo) + +#if 
defined(__cplusplus) +} +#endif + +#endif /* __NEUTRON_ACCEL_H__ */
The driver communicates with the Neutron firmware via eight register-backed mailboxes. A subset of the mailbox registers is used to pass commands from the driver to Neutron, while the rest are written by the Neutron firmware with status/ack info.
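The command handshake can be modelled in userspace with a plain array standing in for the mailbox registers. This is only a sketch under assumptions: `struct fake_mbox` and `mbox_send_cmd()` are hypothetical names, the mailbox indices follow the MBOX0/MBOX1/MBOX3/MBOX4 defines in this patch, and ordinary stores replace the MMIO accessors, so no ordering guarantees are modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MBOX_COUNT      8 /* eight mailboxes in total */
#define MBOX_FW_STATUS  0 /* written by firmware */
#define MBOX_FW_ERRCODE 1 /* written by firmware */
#define MBOX_CMD_ID     3 /* written by driver */
#define MBOX_CMD_ARG0   4 /* first of the argument mailboxes */
#define MBOX_MAX_ARGS   4

#define FW_STATUS_RESET 0u /* firmware is idle and accepts commands */

/* Hypothetical stand-in for the memory-mapped mailbox registers. */
struct fake_mbox {
	uint32_t reg[MBOX_COUNT];
};

struct mbox_cmd {
	uint32_t id;
	uint32_t args[MBOX_MAX_ARGS];
};

/* Returns 0 on success, -EBUSY if the firmware is not in the reset state. */
static int mbox_send_cmd(struct fake_mbox *mb, const struct mbox_cmd *cmd)
{
	int i;

	/* The argument mailboxes must fit inside the register block. */
	assert(MBOX_CMD_ARG0 + MBOX_MAX_ARGS <= MBOX_COUNT);

	if (mb->reg[MBOX_FW_STATUS] != FW_STATUS_RESET)
		return -EBUSY;

	for (i = 0; i < MBOX_MAX_ARGS; i++)
		mb->reg[MBOX_CMD_ARG0 + i] = cmd->args[i];

	/* The command ID is written last, after all arguments are in place. */
	mb->reg[MBOX_CMD_ID] = cmd->id;

	return 0;
}
```

In the patch the argument writes use writel_relaxed() and the final command ID write uses writel(), presumably so that the firmware never observes a command before its arguments.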
Signed-off-by: Jiwei Fu jiwei.fu@nxp.com Signed-off-by: Ioana Ciocoi-Radulescu ruxandra.radulescu@nxp.com --- drivers/accel/neutron/Makefile | 3 ++- drivers/accel/neutron/neutron_device.c | 4 +++ drivers/accel/neutron/neutron_mailbox.c | 47 +++++++++++++++++++++++++++++++++ drivers/accel/neutron/neutron_mailbox.h | 42 +++++++++++++++++++++++++++++ 4 files changed, 95 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/neutron/Makefile b/drivers/accel/neutron/Makefile index d4298c7a8535..192ed896a9f9 100644 --- a/drivers/accel/neutron/Makefile +++ b/drivers/accel/neutron/Makefile @@ -5,4 +5,5 @@ obj-$(CONFIG_DRM_ACCEL_NXP_NEUTRON) := neutron.o neutron-y := \ neutron_driver.o \ neutron_device.o \ - neutron_gem.o + neutron_gem.o \ + neutron_mailbox.o diff --git a/drivers/accel/neutron/neutron_device.c b/drivers/accel/neutron/neutron_device.c index 61b3c96b4996..e5c09105be99 100644 --- a/drivers/accel/neutron/neutron_device.c +++ b/drivers/accel/neutron/neutron_device.c @@ -7,6 +7,7 @@ #include <linux/iopoll.h>
#include "neutron_device.h" +#include "neutron_mailbox.h"
void neutron_enable_irq(struct neutron_device *ndev) { @@ -148,6 +149,9 @@ int neutron_boot(struct neutron_device *ndev) if (ret) return ret;
+ /* Prepare device to receive jobs */ + neutron_mbox_reset_state(ndev); + ndev->flags |= NEUTRON_BOOTED;
return 0; diff --git a/drivers/accel/neutron/neutron_mailbox.c b/drivers/accel/neutron/neutron_mailbox.c new file mode 100644 index 000000000000..327ef2e8081d --- /dev/null +++ b/drivers/accel/neutron/neutron_mailbox.c @@ -0,0 +1,47 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* Copyright 2023, 2025-2026 NXP */ + +#include <linux/iopoll.h> + +#include "neutron_device.h" +#include "neutron_mailbox.h" + +#define NEUTRON_MBOX_FW_STATUS(dev) NEUTRON_REG(dev, MBOX0) +#define NEUTRON_MBOX_FW_ERRCODE(dev) NEUTRON_REG(dev, MBOX1) +#define NEUTRON_MBOX_CMD_ID(dev) NEUTRON_REG(dev, MBOX3) +#define NEUTRON_MBOX_CMD_ARG_BASE(dev) NEUTRON_REG(dev, MBOX4) +#define NEUTRON_MBOX_CMD_ARG(dev, i) (NEUTRON_MBOX_CMD_ARG_BASE(dev) + (i) * 4) + +int neutron_mbox_send_cmd(struct neutron_device *ndev, struct neutron_mbox_cmd *cmd) +{ + u32 status; + int i; + + /* Make sure Neutron is ready to receive commands */ + status = readl_relaxed(NEUTRON_MBOX_FW_STATUS(ndev)); + if (status != NEUTRON_FW_STATUS_RESET) + return -EBUSY; + + for (i = 0; i < NEUTRON_MBOX_MAX_CMD_ARGS; i++) + writel_relaxed(cmd->args[i], NEUTRON_MBOX_CMD_ARG(ndev, i)); + writel(cmd->id, NEUTRON_MBOX_CMD_ID(ndev)); + + return 0; +} + +int neutron_mbox_reset_state(struct neutron_device *ndev) +{ + u32 status; + + writel_relaxed(NEUTRON_CMD_RESET_STATE, NEUTRON_MBOX_CMD_ID(ndev)); + + return readl_poll_timeout(NEUTRON_MBOX_FW_STATUS(ndev), status, + status == NEUTRON_FW_STATUS_RESET, + 100, 100 * USEC_PER_MSEC); +} + +void neutron_mbox_read_state(struct neutron_device *ndev, struct neutron_mbox_state *state) +{ + state->status = readl_relaxed(NEUTRON_MBOX_FW_STATUS(ndev)); + state->err_code = readl_relaxed(NEUTRON_MBOX_FW_ERRCODE(ndev)); +} diff --git a/drivers/accel/neutron/neutron_mailbox.h b/drivers/accel/neutron/neutron_mailbox.h new file mode 100644 index 000000000000..4fe40a2f6a0c --- /dev/null +++ b/drivers/accel/neutron/neutron_mailbox.h @@ -0,0 +1,42 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* Copyright 2023, 
2025-2026 NXP */ + +#ifndef __NEUTRON_MAILBOX_H__ +#define __NEUTRON_MAILBOX_H__ + +#include <linux/types.h> + +struct neutron_device; + +/* Device (firmware) status magic values */ +enum neutron_mbox_fwstat { + NEUTRON_FW_STATUS_RESET = 0, + NEUTRON_FW_STATUS_ACK = 0xA3, + NEUTRON_FW_STATUS_DONE = 0xAD0, +}; + +/* Firmware command opcodes */ +enum neutron_mbox_cmdid { + NEUTRON_CMD_INFERENCE = 0x269, + NEUTRON_CMD_RESET_STATE = 0x23637, +}; + +#define NEUTRON_MBOX_MAX_CMD_ARGS 4 + +/* Firmware command */ +struct neutron_mbox_cmd { + enum neutron_mbox_cmdid id; + u32 args[NEUTRON_MBOX_MAX_CMD_ARGS]; +}; + +/* Device state */ +struct neutron_mbox_state { + enum neutron_mbox_fwstat status; + u32 err_code; +}; + +int neutron_mbox_send_cmd(struct neutron_device *ndev, struct neutron_mbox_cmd *cmd); +void neutron_mbox_read_state(struct neutron_device *ndev, struct neutron_mbox_state *state); +int neutron_mbox_reset_state(struct neutron_device *ndev); + +#endif /* __NEUTRON_MAILBOX_H__ */
Neutron can execute a single job at a time. For now, only inference jobs are supported. Each job has exactly one BO associated with it.
When submitting a job, the user also provides a syncobj handle on which to wait for job completion.
We use the DRM GPU scheduler for job management. A large part of the job submission code is based on the ethosu driver.
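The single-job model can be pictured as one "active job" slot that is filled when a job starts and emptied on completion or timeout. The sketch below is a simplified userspace model, not the driver code: `struct npu`, `struct job` and the `npu_*()` functions are hypothetical, a flag stands in for signalling the fence, and locking, the scheduler and the device reset are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FW_STATUS_DONE 0xAD0u /* same value as NEUTRON_FW_STATUS_DONE */

struct job {
	int signaled;   /* stands in for signalling the job fence */
	uint32_t error; /* firmware error code, 0 on success */
};

struct npu {
	struct job *active_job; /* at most one job runs at a time */
};

/* Start a job. The scheduler is expected to submit only one at a time. */
static void npu_run_job(struct npu *npu, struct job *job)
{
	assert(npu->active_job == NULL);
	npu->active_job = job;
}

/* Completion interrupt. Returns 1 if the active job was completed. */
static int npu_job_done(struct npu *npu, uint32_t status, uint32_t err_code)
{
	struct job *job = npu->active_job;

	/*
	 * An unexpected status is left for fault handling, and a completion
	 * without an active job is ignored.
	 */
	if (status != FW_STATUS_DONE || !job)
		return 0;

	job->error = err_code;
	job->signaled = 1;
	npu->active_job = NULL;

	return 1;
}

/* Timeout: drop the stuck job. The driver also power-cycles the device. */
static void npu_job_timeout(struct npu *npu)
{
	npu->active_job = NULL;
}
```

In the patch the slot is `ndev->active_job`, protected by `job_lock`, and the one-job limit comes from setting the scheduler's `credit_limit` to 1.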
Signed-off-by: Jiwei Fu jiwei.fu@nxp.com Signed-off-by: Ioana Ciocoi-Radulescu ruxandra.radulescu@nxp.com --- v2: - drop fence_lock - remove unnecessary fields from struct neutron_job --- drivers/accel/neutron/Makefile | 1 + drivers/accel/neutron/neutron_device.c | 8 +- drivers/accel/neutron/neutron_device.h | 18 ++ drivers/accel/neutron/neutron_driver.c | 28 ++- drivers/accel/neutron/neutron_driver.h | 3 + drivers/accel/neutron/neutron_job.c | 372 +++++++++++++++++++++++++++++++++ drivers/accel/neutron/neutron_job.h | 43 ++++ include/uapi/drm/neutron_accel.h | 51 +++++ 8 files changed, 519 insertions(+), 5 deletions(-)
diff --git a/drivers/accel/neutron/Makefile b/drivers/accel/neutron/Makefile index 192ed896a9f9..ac6dd576521c 100644 --- a/drivers/accel/neutron/Makefile +++ b/drivers/accel/neutron/Makefile @@ -6,4 +6,5 @@ neutron-y := \ neutron_driver.o \ neutron_device.o \ neutron_gem.o \ + neutron_job.o \ neutron_mailbox.o diff --git a/drivers/accel/neutron/neutron_device.c b/drivers/accel/neutron/neutron_device.c index e5c09105be99..571ec906ad72 100644 --- a/drivers/accel/neutron/neutron_device.c +++ b/drivers/accel/neutron/neutron_device.c @@ -7,6 +7,7 @@ #include <linux/iopoll.h>
#include "neutron_device.h" +#include "neutron_job.h" #include "neutron_mailbox.h"
void neutron_enable_irq(struct neutron_device *ndev) @@ -32,9 +33,14 @@ void neutron_handle_irq(struct neutron_device *ndev) /* Write 1 to clear */ writel_relaxed(appstatus & APPSTATUS_CLEAR_MASK, NEUTRON_REG(ndev, APPSTATUS));
- if (appstatus & APPSTATUS_FAULTCAUSE_MASK) + if (appstatus & APPSTATUS_FAULTCAUSE_MASK) { dev_err(ndev->dev, "Neutron halted due to fault: 0x%lx\n", FIELD_GET(APPSTATUS_FAULTCAUSE_MASK, appstatus)); + return neutron_job_err_handler(ndev); + } + + if (appstatus & APPSTATUS_INFDONE) + neutron_job_done_handler(ndev); }
#define neutron_boot_done(appctrl) \ diff --git a/drivers/accel/neutron/neutron_device.h b/drivers/accel/neutron/neutron_device.h index 8e4df7462d82..1953cdf19bfd 100644 --- a/drivers/accel/neutron/neutron_device.h +++ b/drivers/accel/neutron/neutron_device.h @@ -9,8 +9,10 @@ #include <linux/spinlock.h> #include <linux/bits.h> #include <drm/drm_device.h> +#include <drm/gpu_scheduler.h>
struct clk_bulk_data; +struct neutron_job;
#define NEUTRON_FIRMWARE_NAME "NeutronFirmware.elf"
@@ -92,6 +94,12 @@ enum neutron_mem_id { * @clks: Neutron clocks * @num_clks: Number of clocks * @flags: Software flags used by driver + * @sched: GPU scheduler + * @sched_lock: Scheduler lock, for neutron_push_job + * @fence_context: Fence context + * @job_seqno: Job sequence number + * @job_lock: Job lock, for active_job handling + * @active_job: Currently active job */ struct neutron_device { struct drm_device base; @@ -103,6 +111,16 @@ struct neutron_device { struct clk_bulk_data *clks; int num_clks; u32 flags; + + struct drm_gpu_scheduler sched; + /* For neutron_push_job */ + struct mutex sched_lock; + u64 fence_context; + u64 job_seqno; + + /* For active_job handling */ + struct mutex job_lock; + struct neutron_job *active_job; };
#define to_neutron_device(drm) \ diff --git a/drivers/accel/neutron/neutron_driver.c b/drivers/accel/neutron/neutron_driver.c index c9a18bf52037..ceae1f7e8359 100644 --- a/drivers/accel/neutron/neutron_driver.c +++ b/drivers/accel/neutron/neutron_driver.c @@ -19,40 +19,53 @@ #include "neutron_device.h" #include "neutron_driver.h" #include "neutron_gem.h" +#include "neutron_job.h"
#define NEUTRON_SUSPEND_DELAY_MS 1000
static const struct drm_ioctl_desc neutron_drm_ioctls[] = { DRM_IOCTL_DEF_DRV(NEUTRON_CREATE_BO, neutron_ioctl_create_bo, 0), DRM_IOCTL_DEF_DRV(NEUTRON_SYNC_BO, neutron_ioctl_sync_bo, 0), + DRM_IOCTL_DEF_DRV(NEUTRON_SUBMIT_JOB, neutron_ioctl_submit_job, 0), };
static int neutron_open(struct drm_device *drm, struct drm_file *file) { struct neutron_device *ndev = to_neutron_device(drm); struct neutron_file_priv *npriv; + int ret;
npriv = kzalloc_obj(*npriv); if (!npriv) return -ENOMEM;
npriv->ndev = ndev; - file->driver_priv = npriv;
+ ret = neutron_job_open(npriv); + if (ret) + goto err_free; + + file->driver_priv = npriv; return 0; + +err_free: + kfree(npriv); + return ret; }
static void neutron_postclose(struct drm_device *drm, struct drm_file *file) { struct neutron_file_priv *npriv = file->driver_priv;
+ neutron_job_close(npriv); kfree(npriv); }
DEFINE_DRM_ACCEL_FOPS(neutron_drm_driver_fops);
static const struct drm_driver neutron_drm_driver = { - .driver_features = DRIVER_COMPUTE_ACCEL | DRIVER_GEM, + .driver_features = DRIVER_COMPUTE_ACCEL | DRIVER_GEM | + DRIVER_SYNCOBJ, .name = "neutron", .desc = "NXP Neutron driver", .major = 1, @@ -151,19 +164,25 @@ static int neutron_probe(struct platform_device *pdev) return ret; }
- ret = devm_pm_runtime_enable(dev); + ret = neutron_job_init(ndev); if (ret) goto free_reserved;
+ ret = devm_pm_runtime_enable(dev); + if (ret) + goto free_job; + pm_runtime_set_autosuspend_delay(dev, NEUTRON_SUSPEND_DELAY_MS); pm_runtime_use_autosuspend(dev);
ret = drm_dev_register(&ndev->base, 0); if (ret) - goto free_reserved; + goto free_job;
return 0;
+free_job: + neutron_job_fini(ndev); free_reserved: of_reserved_mem_device_release(&pdev->dev);
@@ -175,6 +194,7 @@ static void neutron_remove(struct platform_device *pdev) struct neutron_device *ndev = platform_get_drvdata(pdev);
drm_dev_unregister(&ndev->base); + neutron_job_fini(ndev); of_reserved_mem_device_release(&pdev->dev); }
diff --git a/drivers/accel/neutron/neutron_driver.h b/drivers/accel/neutron/neutron_driver.h index cd52b5eb2d27..b709de74105a 100644 --- a/drivers/accel/neutron/neutron_driver.h +++ b/drivers/accel/neutron/neutron_driver.h @@ -4,10 +4,13 @@ #ifndef __NEUTRON_DRIVER_H__ #define __NEUTRON_DRIVER_H__
+#include <drm/gpu_scheduler.h> + struct neutron_device;
struct neutron_file_priv { struct neutron_device *ndev; + struct drm_sched_entity sched_entity; };
#endif /* __NEUTRON_DRIVER_H__ */ diff --git a/drivers/accel/neutron/neutron_job.c b/drivers/accel/neutron/neutron_job.c new file mode 100644 index 000000000000..e2993235fdab --- /dev/null +++ b/drivers/accel/neutron/neutron_job.c @@ -0,0 +1,372 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* Copyright 2025-2026 NXP */ + +#include <linux/delay.h> +#include <linux/pm_runtime.h> +#include <drm/drm_file.h> +#include <drm/drm_print.h> +#include <drm/drm_gem_dma_helper.h> +#include <drm/neutron_accel.h> + +#include "neutron_driver.h" +#include "neutron_device.h" +#include "neutron_gem.h" +#include "neutron_mailbox.h" +#include "neutron_job.h" + +#define NEUTRON_JOB_TIMEOUT_MS 5000 + +static const char *neutron_fence_get_driver_name(struct dma_fence *fence) +{ + return "neutron"; +} + +static const char *neutron_fence_get_timeline_name(struct dma_fence *fence) +{ + return "neutron-npu"; +} + +static const struct dma_fence_ops neutron_fence_ops = { + .get_driver_name = neutron_fence_get_driver_name, + .get_timeline_name = neutron_fence_get_timeline_name, +}; + +static void neutron_hw_submit(struct neutron_job *job) +{ + struct neutron_device *ndev = job->ndev; + struct neutron_mbox_cmd cmd = {0}; + u32 base_l, base_h; + u64 base_addr; + int ret; + + switch (job->type) { + case DRM_NEUTRON_JOB_INFERENCE: + cmd.id = NEUTRON_CMD_INFERENCE; + cmd.args[0] = job->inference.tensor_offset; + cmd.args[1] = job->inference.microcode_offset; + cmd.args[2] = job->inference.tensor_count; + break; + default: + dev_WARN(ndev->dev, "Unknown job type: %d\n", job->type); + return; + } + + base_addr = to_drm_gem_dma_obj(job->bo)->dma_addr; + base_l = lower_32_bits(base_addr); + base_h = upper_32_bits(base_addr); + + writel_relaxed(base_l, NEUTRON_REG(ndev, BASEDDRL)); + writel_relaxed(base_l, NEUTRON_REG(ndev, BASEINOUTL)); + writel_relaxed(base_l, NEUTRON_REG(ndev, BASESPILLL)); + writel_relaxed(base_h, NEUTRON_REG(ndev, BASEDDRH)); + writel_relaxed(base_h, NEUTRON_REG(ndev, BASEINOUTH)); + 
writel_relaxed(base_h, NEUTRON_REG(ndev, BASESPILLH)); + + ret = neutron_mbox_send_cmd(ndev, &cmd); + if (ret) { + /* Nothing we can do here, we'll reset the device on timeout */ + dev_err(ndev->dev, "Failed to submit job, device is busy\n"); + } +} + +void neutron_job_err_handler(struct neutron_device *ndev) +{ + guard(mutex)(&ndev->job_lock); + + if (ndev->active_job) + drm_sched_fault(&ndev->sched); +} + +void neutron_job_done_handler(struct neutron_device *ndev) +{ + struct neutron_mbox_state state; + struct dma_fence *fence; + + neutron_mbox_read_state(ndev, &state); + if (state.status != NEUTRON_FW_STATUS_DONE) { + dev_err(ndev->dev, "Inconsistent firmware state: status 0x%x, err 0x%x\n", + state.status, state.err_code); + return neutron_job_err_handler(ndev); + } + + /* Reset Neutron internal state to prepare for next inference */ + neutron_mbox_reset_state(ndev); + + scoped_guard(mutex, &ndev->job_lock) { + if (ndev->active_job) { + fence = ndev->active_job->neutron_fence; + if (state.err_code != 0) { + dev_warn(ndev->dev, "Job finished with error: 0x%x\n", + state.err_code); + dma_fence_set_error(fence, state.err_code); + } + dma_fence_signal(fence); + ndev->active_job = NULL; + } + } +} + +static void neutron_cleanup_job(struct kref *ref) +{ + struct neutron_job *job = container_of(ref, struct neutron_job, refcnt); + + pm_runtime_put_autosuspend(job->ndev->base.dev); + + dma_fence_put(job->neutron_fence); + drm_gem_object_put(job->bo); + + kfree(job); +} + +static void neutron_put_job(struct neutron_job *job) +{ + kref_put(&job->refcnt, neutron_cleanup_job); +} + +static void neutron_free_job(struct drm_sched_job *sched_job) +{ + struct neutron_job *job = to_neutron_job(sched_job); + + drm_sched_job_cleanup(sched_job); + neutron_put_job(job); +} + +static struct dma_fence *neutron_run_job(struct drm_sched_job *sched_job) +{ + struct neutron_job *job = to_neutron_job(sched_job); + struct dma_fence *fence = job->neutron_fence; + struct neutron_device *ndev 
= job->ndev; + + if (unlikely(job->base.s_fence->finished.error)) + return NULL; + + dma_fence_init(fence, &neutron_fence_ops, NULL, + ndev->fence_context, ++ndev->job_seqno); + dma_fence_get(fence); + + scoped_guard(mutex, &ndev->job_lock) { + ndev->active_job = job; + neutron_hw_submit(job); + } + + return fence; +} + +static enum drm_gpu_sched_stat neutron_timedout_job(struct drm_sched_job *sched_job) +{ + struct neutron_job *job = to_neutron_job(sched_job); + struct neutron_device *ndev = job->ndev; + struct neutron_mbox_state state; + + /* We assume Neutron is stuck, retrieve current state and reset */ + neutron_mbox_read_state(ndev, &state); + dev_err(ndev->dev, "Neutron timedout, status: 0x%x, err: 0x%x\n", + state.status, state.err_code); + + drm_sched_stop(&ndev->sched, sched_job); + + scoped_guard(mutex, &ndev->job_lock) + ndev->active_job = NULL; + + pm_runtime_force_suspend(ndev->dev); + pm_runtime_force_resume(ndev->dev); + + drm_sched_start(&ndev->sched, 0); + + return DRM_GPU_SCHED_STAT_RESET; +} + +static void neutron_cancel_job(struct drm_sched_job *sched_job) +{ + struct neutron_job *job = to_neutron_job(sched_job); + struct neutron_device *ndev = job->ndev; + + guard(mutex)(&ndev->job_lock); + + if (!dma_fence_is_signaled(job->neutron_fence)) { + dma_fence_set_error(job->neutron_fence, -ECANCELED); + dma_fence_signal(job->neutron_fence); + } +} + +static const struct drm_sched_backend_ops neutron_sched_ops = { + .run_job = neutron_run_job, + .free_job = neutron_free_job, + .timedout_job = neutron_timedout_job, + .cancel_job = neutron_cancel_job, +}; + +int neutron_job_init(struct neutron_device *ndev) +{ + const struct drm_sched_init_args args = { + .ops = &neutron_sched_ops, + .num_rqs = DRM_SCHED_PRIORITY_COUNT, + .credit_limit = 1, + .timeout = msecs_to_jiffies(NEUTRON_JOB_TIMEOUT_MS), + .name = dev_name(ndev->dev), + .dev = ndev->dev, + }; + int ret; + + ret = devm_mutex_init(ndev->dev, &ndev->sched_lock); + if (ret) + return ret; + ret = 
devm_mutex_init(ndev->dev, &ndev->job_lock); + if (ret) + return ret; + + ndev->fence_context = dma_fence_context_alloc(1); + + ret = drm_sched_init(&ndev->sched, &args); + if (ret) + dev_err(ndev->dev, "Error creating DRM scheduler\n"); + + return ret; +} + +void neutron_job_fini(struct neutron_device *ndev) +{ + drm_sched_fini(&ndev->sched); +} + +int neutron_job_open(struct neutron_file_priv *npriv) +{ + struct neutron_device *ndev = npriv->ndev; + struct drm_gpu_scheduler *sched = &ndev->sched; + int ret; + + ret = drm_sched_entity_init(&npriv->sched_entity, + DRM_SCHED_PRIORITY_NORMAL, + &sched, 1, NULL); + if (ret) + dev_err(ndev->dev, "Error creating scheduler entity\n"); + + return ret; +} + +void neutron_job_close(struct neutron_file_priv *npriv) +{ + drm_sched_entity_destroy(&npriv->sched_entity); +} + +static int neutron_push_job(struct neutron_job *job, struct drm_syncobj *sync) +{ + struct neutron_device *ndev = job->ndev; + struct ww_acquire_ctx acquire_ctx; + struct dma_fence *sched_fence; + int ret; + + ret = drm_gem_lock_reservations(&job->bo, 1, &acquire_ctx); + if (ret) + return ret; + + ret = dma_resv_reserve_fences(job->bo->resv, 1); + if (ret) + goto out_unlock_res; + + ret = drm_sched_job_add_implicit_dependencies(&job->base, job->bo, true); + if (ret) + goto out_unlock_res; + + ret = pm_runtime_resume_and_get(ndev->base.dev); + if (ret) + goto out_unlock_res; + + scoped_guard(mutex, &ndev->sched_lock) { + drm_sched_job_arm(&job->base); + + sched_fence = dma_fence_get(&job->base.s_fence->finished); + drm_syncobj_replace_fence(sync, sched_fence); + + kref_get(&job->refcnt); + drm_sched_entity_push_job(&job->base); + + dma_resv_add_fence(job->bo->resv, sched_fence, + DMA_RESV_USAGE_WRITE); + + dma_fence_put(sched_fence); + } + +out_unlock_res: + drm_gem_unlock_reservations(&job->bo, 1, &acquire_ctx); + + return ret; +} + +int neutron_ioctl_submit_job(struct drm_device *drm, void *data, struct drm_file *filp) +{ + struct neutron_device *ndev = 
to_neutron_device(drm); + struct neutron_file_priv *npriv = filp->driver_priv; + struct drm_neutron_submit_job *args = data; + struct drm_syncobj *syncobj; + struct neutron_job *job; + int ret; + + if (args->pad) + return -EINVAL; + + job = kzalloc_obj(*job); + if (!job) + return -ENOMEM; + + job->ndev = ndev; + kref_init(&job->refcnt); + + job->neutron_fence = kzalloc_obj(*job->neutron_fence); + if (!job->neutron_fence) { + ret = -ENOMEM; + goto out_free_job; + } + + switch (args->type) { + case DRM_NEUTRON_JOB_INFERENCE: + memcpy(&job->inference, &args->inference, + sizeof(args->inference)); + break; + default: + dev_dbg(ndev->dev, "Invalid job type %d\n", args->type); + ret = -EINVAL; + goto out_free_fence; + } + + job->bo = drm_gem_object_lookup(filp, args->bo_handle); + if (!job->bo) { + dev_dbg(ndev->dev, "Invalid BO handle\n"); + ret = -ENOENT; + goto out_free_fence; + } + + syncobj = drm_syncobj_find(filp, args->syncobj_handle); + if (!syncobj) { + dev_dbg(ndev->dev, "Invalid syncobj handle\n"); + ret = -ENOENT; + goto out_put_gem; + } + + ret = drm_sched_job_init(&job->base, &npriv->sched_entity, 1, NULL, + filp->client_id); + if (ret) + goto out_put_syncobj; + + ret = neutron_push_job(job, syncobj); + if (ret) + goto out_sched_cleanup; + + neutron_put_job(job); + drm_syncobj_put(syncobj); + + return 0; + +out_sched_cleanup: + drm_sched_job_cleanup(&job->base); +out_put_syncobj: + drm_syncobj_put(syncobj); +out_put_gem: + drm_gem_object_put(job->bo); +out_free_fence: + kfree(job->neutron_fence); +out_free_job: + kfree(job); + + return ret; +} diff --git a/drivers/accel/neutron/neutron_job.h b/drivers/accel/neutron/neutron_job.h new file mode 100644 index 000000000000..df97266a0fb6 --- /dev/null +++ b/drivers/accel/neutron/neutron_job.h @@ -0,0 +1,43 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* Copyright 2025-2026 NXP */ + +#ifndef __NEUTRON_JOB_H__ +#define __NEUTRON_JOB_H__ + +#include <linux/kref.h> +#include <drm/drm_gem.h> +#include 
<drm/drm_syncobj.h> +#include <drm/gpu_scheduler.h> +#include <drm/neutron_accel.h> + +#include "neutron_driver.h" + +struct neutron_device; +struct neutron_file_priv; + +struct neutron_job { + struct drm_sched_job base; + struct neutron_device *ndev; + struct dma_fence *neutron_fence; + struct drm_gem_object *bo; + enum drm_neutron_job_type type; + union { + struct drm_neutron_inference_job inference; + }; + struct kref refcnt; +}; + +#define to_neutron_job(job) \ + container_of(job, struct neutron_job, base) + +int neutron_job_init(struct neutron_device *dev); +void neutron_job_fini(struct neutron_device *dev); +int neutron_job_open(struct neutron_file_priv *npriv); +void neutron_job_close(struct neutron_file_priv *npriv); + +void neutron_job_done_handler(struct neutron_device *dev); +void neutron_job_err_handler(struct neutron_device *dev); + +int neutron_ioctl_submit_job(struct drm_device *dev, void *data, struct drm_file *filp); + +#endif /* __NEUTRON_JOB_H__ */ diff --git a/include/uapi/drm/neutron_accel.h b/include/uapi/drm/neutron_accel.h index 2f5639f2e0e8..a9e5682709d2 100644 --- a/include/uapi/drm/neutron_accel.h +++ b/include/uapi/drm/neutron_accel.h @@ -15,10 +15,12 @@ extern "C" { * * @DRM_NEUTRON_CREATE_BO: Create a buffer object * @DRM_NEUTRON_SYNC_BO: Sync (parts of) the buffer object memory + * @DRM_NEUTRON_SUBMIT_JOB: Submit a job to the device */ enum drm_neutron_ioctl { DRM_NEUTRON_CREATE_BO = 0, DRM_NEUTRON_SYNC_BO, + DRM_NEUTRON_SUBMIT_JOB, };
/** @@ -64,6 +66,51 @@ struct drm_neutron_sync_bo { __u64 offset; };
+/** + * enum drm_neutron_job_type - Type of job to submit to Neutron device + * + * @DRM_NEUTRON_JOB_INFERENCE: Inference job + */ +enum drm_neutron_job_type { + DRM_NEUTRON_JOB_INFERENCE = 0, +}; + +/** + * struct drm_neutron_inference_job - Inference job descriptor + * + * @tensor_offset: Offset of tensor array inside job BO + * @microcode_offset: Microcode offset inside BO + * @tensor_count: Number of valid tensors + * @pad: MBZ + */ +struct drm_neutron_inference_job { + __u32 tensor_offset; + __u32 microcode_offset; + __u32 tensor_count; + __u32 pad[5]; +}; + +/** + * struct drm_neutron_submit_job - Submit a job to Neutron device + * + * @type: Job type, one of enum drm_neutron_job_type + * @bo_handle: BO handle for this job + * @inference: Inference job descriptor (when type is DRM_NEUTRON_JOB_INFERENCE) + * @reserved: Reserved for future job types + * @syncobj_handle: Handle of syncobj on which user waits for job completion + * @pad: MBZ + */ +struct drm_neutron_submit_job { + __u32 type; + __u32 bo_handle; + union { + struct drm_neutron_inference_job inference; + __u32 reserved[8]; + }; + __u32 syncobj_handle; + __u32 pad; +}; + #define DRM_IOCTL_NEUTRON_CREATE_BO \ DRM_IOWR(DRM_COMMAND_BASE + DRM_NEUTRON_CREATE_BO, \ struct drm_neutron_create_bo) @@ -72,6 +119,10 @@ struct drm_neutron_sync_bo { DRM_IOWR(DRM_COMMAND_BASE + DRM_NEUTRON_SYNC_BO, \ struct drm_neutron_sync_bo)
+#define DRM_IOCTL_NEUTRON_SUBMIT_JOB \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_NEUTRON_SUBMIT_JOB, \ + struct drm_neutron_submit_job) + #if defined(__cplusplus) } #endif
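As a rough illustration of the submit descriptor added above, the sketch below mirrors the two UAPI structs in plain userspace C and checks their sizes. The mirrored definitions use uint32_t in place of __u32, and fill_inference_job() is a hypothetical helper that is not part of the patch; real userspace would include the installed neutron_accel.h header instead of redefining anything.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the UAPI definitions in the patch (illustration only). */
enum { DRM_NEUTRON_JOB_INFERENCE = 0 };

struct drm_neutron_inference_job {
	uint32_t tensor_offset;
	uint32_t microcode_offset;
	uint32_t tensor_count;
	uint32_t pad[5];
};

struct drm_neutron_submit_job {
	uint32_t type;
	uint32_t bo_handle;
	union {
		struct drm_neutron_inference_job inference;
		uint32_t reserved[8];
	};
	uint32_t syncobj_handle;
	uint32_t pad;
};

/* The inference descriptor exactly fills the 8-word reserved area. */
static_assert(sizeof(struct drm_neutron_inference_job) == 8 * sizeof(uint32_t),
	      "inference job must match reserved[8]");
static_assert(sizeof(struct drm_neutron_submit_job) == 48,
	      "submit descriptor is 12 32-bit words");

/*
 * Hypothetical helper: zero the whole request first so that every field
 * documented as MBZ stays zero, then fill in the inference parameters.
 */
void fill_inference_job(struct drm_neutron_submit_job *req,
			uint32_t bo_handle, uint32_t syncobj_handle,
			uint32_t tensor_offset, uint32_t microcode_offset,
			uint32_t tensor_count)
{
	memset(req, 0, sizeof(*req));
	req->type = DRM_NEUTRON_JOB_INFERENCE;
	req->bo_handle = bo_handle;
	req->syncobj_handle = syncobj_handle;
	req->inference.tensor_offset = tensor_offset;
	req->inference.microcode_offset = microcode_offset;
	req->inference.tensor_count = tensor_count;
}
```

A caller would presumably pass the filled request to ioctl() with DRM_IOCTL_NEUTRON_SUBMIT_JOB on the accel device node and then wait on the syncobj; that part is left out because it needs real hardware.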
From: Frank Li (AI-BOT) frank.li@nxp.com
+	if (appstatus & APPSTATUS_FAULTCAUSE_MASK) {
+		dev_err(ndev->dev, "Neutron halted due to fault: 0x%lx\n",
+			FIELD_GET(APPSTATUS_FAULTCAUSE_MASK, appstatus));
+		return neutron_job_err_handler(ndev);
AI: neutron_job_err_handler() returns void, not int. Remove 'return'.
+	ret = drm_sched_job_init(&job->base, &npriv->sched_entity, 1, NULL,
+				 filp->client_id);
+	if (ret)
+		goto out_put_syncobj;
+
+	ret = neutron_push_job(job, syncobj);
+	if (ret)
+		goto out_sched_cleanup;
+
+	neutron_put_job(job);
+	drm_syncobj_put(syncobj);
+
+	return 0;
+
+out_sched_cleanup:
+	drm_sched_job_cleanup(&job->base);
+out_put_syncobj:
+	drm_syncobj_put(syncobj);
+out_put_gem:
+	drm_gem_object_put(job->bo);
AI: In the success path, neutron_put_job(job) is called which decrements refcnt. But if neutron_push_job() fails and we hit out_sched_cleanup, the job refcnt is never decremented. This leaks the job structure. Consider: if neutron_push_job() succeeds, it calls kref_get() inside sched_lock. If it fails, no kref_get() happens, so don't call
(The owner needs to judge this. I am not sure whether the AI's analysis is correct.)
Frank
On Friday, March 6, 2026 at 7:03 PM, Frank Li wrote:
+	if (appstatus & APPSTATUS_FAULTCAUSE_MASK) {
+		dev_err(ndev->dev, "Neutron halted due to fault: 0x%lx\n",
+			FIELD_GET(APPSTATUS_FAULTCAUSE_MASK, appstatus));
+		return neutron_job_err_handler(ndev);
AI: neutron_job_err_handler() returns void, not int. Remove 'return'.
Ok, will fix.
+	ret = drm_sched_job_init(&job->base, &npriv->sched_entity, 1, NULL,
+				 filp->client_id);
+	if (ret)
+		goto out_put_syncobj;
+
+	ret = neutron_push_job(job, syncobj);
+	if (ret)
+		goto out_sched_cleanup;
+
+	neutron_put_job(job);
+	drm_syncobj_put(syncobj);
+
+	return 0;
+
+out_sched_cleanup:
+	drm_sched_job_cleanup(&job->base);
+out_put_syncobj:
+	drm_syncobj_put(syncobj);
+out_put_gem:
+	drm_gem_object_put(job->bo);
AI: In the success path, neutron_put_job(job) is called which decrements refcnt. But if neutron_push_job() fails and we hit out_sched_cleanup, the job refcnt is never decremented. This leaks the job structure. Consider: if neutron_push_job() succeeds, it calls kref_get() inside sched_lock. If it fails, no kref_get() happens, so don't call
(The owner needs to judge this. I am not sure whether the AI's analysis is correct.)
I don't see an issue here: kref_get() is called at a point where neutron_push_job() can no longer fail. If neutron_push_job() fails earlier, the error path looks clean: it frees everything in reverse order, including the job struct.
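To make the reference counting argument concrete, here is a minimal userspace model of the flow, assuming the ordering visible in the patch: the submitter holds the initial reference, the extra reference is taken only after every fallible step in the push path, and the error path frees the job directly. All names (model_job, model_push_job, model_submit, model_jobs_freed) are hypothetical stand-ins, not driver symbols, and a plain int replaces struct kref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-in for struct neutron_job; only the refcount is modelled. */
struct model_job {
	int refcnt;
};

/* Counts freed jobs so the lifetime can be observed from outside. */
int model_jobs_freed;

/* Models kref_put(): free the job when the last reference is dropped. */
void model_job_put(struct model_job *job)
{
	assert(job->refcnt > 0);
	if (--job->refcnt == 0) {
		free(job);
		model_jobs_freed++;
	}
}

/* Models neutron_push_job(): all fallible steps come before the extra reference. */
int model_push_job(struct model_job *job, bool fail_early)
{
	if (fail_early)
		return -1;	/* e.g. a reservation or runtime PM step failed */

	job->refcnt++;		/* kref_get(): the scheduler now owns a reference */
	return 0;
}

/* Models neutron_ioctl_submit_job(); *queued receives the job the scheduler holds. */
int model_submit(bool fail_early, struct model_job **queued)
{
	struct model_job *job = calloc(1, sizeof(*job));

	*queued = NULL;
	if (!job)
		return -1;
	job->refcnt = 1;	/* kref_init(): the submitter's reference */

	if (model_push_job(job, fail_early)) {
		/* No second reference exists, so a plain free is enough. */
		free(job);
		model_jobs_freed++;
		return -1;
	}

	*queued = job;
	model_job_put(job);	/* drop the submitter's reference */
	return 0;
}
```

In this model a failed push frees the job exactly once, and a successful push leaves one reference that the scheduler side drops later (the free_job callback in the driver), which is the behaviour described in the reply.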
Btw, what agent did you use for review?
Thanks, Ioana
Expose the Neutron firmware log via debugfs interface. The log resides in internal device memory.
Signed-off-by: Ioana Ciocoi-Radulescu ruxandra.radulescu@nxp.com --- drivers/accel/neutron/Makefile | 2 + drivers/accel/neutron/neutron_debugfs.c | 34 ++++++++++++++++ drivers/accel/neutron/neutron_debugfs.h | 15 +++++++ drivers/accel/neutron/neutron_device.c | 69 +++++++++++++++++++++++++++++++++ drivers/accel/neutron/neutron_device.h | 17 ++++++++ drivers/accel/neutron/neutron_driver.c | 3 ++ 6 files changed, 140 insertions(+)
diff --git a/drivers/accel/neutron/Makefile b/drivers/accel/neutron/Makefile index ac6dd576521c..6d5c204460af 100644 --- a/drivers/accel/neutron/Makefile +++ b/drivers/accel/neutron/Makefile @@ -8,3 +8,5 @@ neutron-y := \ neutron_gem.o \ neutron_job.o \ neutron_mailbox.o + +neutron-$(CONFIG_DEBUG_FS) += neutron_debugfs.o diff --git a/drivers/accel/neutron/neutron_debugfs.c b/drivers/accel/neutron/neutron_debugfs.c new file mode 100644 index 000000000000..a392286e40b7 --- /dev/null +++ b/drivers/accel/neutron/neutron_debugfs.c @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* Copyright 2025 NXP */ + +#include <linux/debugfs.h> + +#include "neutron_device.h" +#include "neutron_debugfs.h" + +static ssize_t fw_log_read(struct file *f, char __user *buf, size_t count, loff_t *pos) +{ + struct neutron_device *ndev = file_inode(f)->i_private; + + if (!ndev->log.size) + return 0; + + if (ndev->flags & NEUTRON_BOOTED) + neutron_read_log(ndev, count); + + return simple_read_from_buffer(buf, count, pos, ndev->log.buf, + ndev->log.buf_count); +} + +static const struct file_operations fw_log_fops = { + .owner = THIS_MODULE, + .read = fw_log_read, +}; + +void neutron_debugfs_init(struct neutron_device *ndev) +{ + struct dentry *debugfs_root; + + debugfs_root = ndev->base.debugfs_root; + debugfs_create_file("fw_log", 0444, debugfs_root, ndev, &fw_log_fops); +} diff --git a/drivers/accel/neutron/neutron_debugfs.h b/drivers/accel/neutron/neutron_debugfs.h new file mode 100644 index 000000000000..7cd4b5af55a6 --- /dev/null +++ b/drivers/accel/neutron/neutron_debugfs.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* Copyright 2025 NXP */ + +#ifndef __NEUTRON_DEBUGFS_H__ +#define __NEUTRON_DEBUGFS_H__ + +struct neutron_device; + +#if defined(CONFIG_DEBUG_FS) +void neutron_debugfs_init(struct neutron_device *ndev); +#else +static inline void neutron_debugfs_init(struct neutron_device *ndev) {} +#endif + +#endif /* __NEUTRON_DEBUGFS_H__ */ diff --git 
a/drivers/accel/neutron/neutron_device.c b/drivers/accel/neutron/neutron_device.c index 571ec906ad72..a5cedc91ad25 100644 --- a/drivers/accel/neutron/neutron_device.c +++ b/drivers/accel/neutron/neutron_device.c @@ -77,6 +77,70 @@ static void neutron_stop(struct neutron_device *ndev) 100, 100 * USEC_PER_MSEC); }
+static void neutron_init_logging(struct neutron_device *ndev) +{ + size_t old_size = ndev->log.size; + u32 ringctrl; + + ringctrl = readl_relaxed(NEUTRON_REG(ndev, RINGCTRL)); + + ndev->log.base = ndev->mem_regions[NEUTRON_MEM_DTCM].va + + NEUTRON_DTCM_BANK1_OFFSET + + FIELD_GET(RINGCTRL_ADDR_MASK, ringctrl); + ndev->log.size = FIELD_GET(RINGCTRL_SIZE_MASK, ringctrl) * + RINGCTRL_SIZE_MULT; + + if (ndev->log.size == 0) { + dev_info(ndev->dev, "Firmware logging is disabled\n"); + return; + } + + /* If size didn't change, keep using the old buffer */ + if (old_size == ndev->log.size) + return; + + devm_kfree(ndev->dev, ndev->log.buf); + ndev->log.buf = devm_kmalloc(ndev->dev, ndev->log.size, GFP_KERNEL); + if (!ndev->log.buf) { + ndev->log.size = 0; + dev_warn(ndev->dev, "Failed to allocate log buffer, logging is disabled\n"); + } +} + +/* Read up to count bytes from device log into local buffer */ +void neutron_read_log(struct neutron_device *ndev, size_t count) +{ + size_t bytes, remaining; + u32 head, tail; + + ndev->log.buf_count = 0; + + if (!(ndev->flags & NEUTRON_BOOTED) || !ndev->log.size) + return; + + tail = readl_relaxed(NEUTRON_REG(ndev, TAIL)); + head = readl_relaxed(NEUTRON_REG(ndev, HEAD)); + + if (tail == head) + return; + + /* Read from head to end of buffer or tail */ + bytes = (head < tail) ? 
(tail - head) : (ndev->log.size - head); + bytes = min(count, bytes); + memcpy_fromio(ndev->log.buf, ndev->log.base + head, bytes); + ndev->log.buf_count = bytes; + + /* Read from start of buffer, if it wraps around */ + if (head > tail && bytes < count) { + remaining = min(count - bytes, tail); + memcpy_fromio(ndev->log.buf + bytes, ndev->log.base, remaining); + ndev->log.buf_count += remaining; + } + + head = (head + ndev->log.buf_count) % ndev->log.size; + writel_relaxed(head, NEUTRON_REG(ndev, HEAD)); +} + static void __iomem *neutron_tcm_da_to_va(struct neutron_device *ndev, u64 da) { struct neutron_mem_region *mem; @@ -158,6 +222,8 @@ int neutron_boot(struct neutron_device *ndev) /* Prepare device to receive jobs */ neutron_mbox_reset_state(ndev);
+ neutron_init_logging(ndev); + ndev->flags |= NEUTRON_BOOTED;
return 0; @@ -165,6 +231,9 @@ int neutron_boot(struct neutron_device *ndev)
void neutron_shutdown(struct neutron_device *ndev) { + /* Device log becomes unavailable after shutdown, save it */ + neutron_read_log(ndev, ndev->log.size); + neutron_stop(ndev); ndev->flags &= ~NEUTRON_BOOTED; } diff --git a/drivers/accel/neutron/neutron_device.h b/drivers/accel/neutron/neutron_device.h index 1953cdf19bfd..bbca0da9f9bc 100644 --- a/drivers/accel/neutron/neutron_device.h +++ b/drivers/accel/neutron/neutron_device.h @@ -85,11 +85,26 @@ enum neutron_mem_id { NEUTRON_MEM_MAX };
+/** + * struct neutron_log - Neutron log buffer descriptor + * @base: base address of the log buffer in device memory + * @size: Size of the log buffer + * @buf: Kernel buffer for log data + * @buf_count: Number of bytes available in the kernel buffer + */ +struct neutron_log { + void __iomem *base; + size_t size; + void *buf; + size_t buf_count; +}; + /** * struct neutron_device - Neutron device structure * @base: Base DRM device * @dev: Pointer to underlying device * @mem_regions: Array of memory region descriptors + * @log: Log buffer descriptor * @irq: IRQ number * @clks: Neutron clocks * @num_clks: Number of clocks @@ -106,6 +121,7 @@ struct neutron_device { struct device *dev;
struct neutron_mem_region mem_regions[NEUTRON_MEM_MAX]; + struct neutron_log log;
int irq; struct clk_bulk_data *clks; @@ -134,5 +150,6 @@ void neutron_shutdown(struct neutron_device *ndev); void neutron_enable_irq(struct neutron_device *ndev); void neutron_disable_irq(struct neutron_device *ndev); void neutron_handle_irq(struct neutron_device *ndev); +void neutron_read_log(struct neutron_device *ndev, size_t bytes);
#endif /* __NEUTRON_DEVICE_H__ */ diff --git a/drivers/accel/neutron/neutron_driver.c b/drivers/accel/neutron/neutron_driver.c index ceae1f7e8359..14b4bc3a79d1 100644 --- a/drivers/accel/neutron/neutron_driver.c +++ b/drivers/accel/neutron/neutron_driver.c @@ -18,6 +18,7 @@
#include "neutron_device.h" #include "neutron_driver.h" +#include "neutron_debugfs.h" #include "neutron_gem.h" #include "neutron_job.h"
@@ -168,6 +169,8 @@ static int neutron_probe(struct platform_device *pdev) if (ret) goto free_reserved;
+ neutron_debugfs_init(ndev); + ret = devm_pm_runtime_enable(dev); if (ret) goto free_job;
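For readers following the log path, the wrap-around copy in neutron_read_log() can be modelled in userspace roughly as below. This is a sketch under assumptions: struct ring and ring_read() are hypothetical names, plain memory and memcpy() stand in for the device TCM and memcpy_fromio(), and the HEAD/TAIL registers become ordinary struct fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the firmware log ring: the producer advances tail,
 * the reader advances head. */
struct ring {
	const unsigned char *base;	/* start of the ring storage */
	size_t size;			/* ring size in bytes */
	size_t head;			/* next byte to read */
	size_t tail;			/* next byte the producer will write */
};

/* Copy up to count pending bytes into dst and advance head.
 * Returns the number of bytes copied. */
size_t ring_read(struct ring *r, unsigned char *dst, size_t count)
{
	size_t bytes, remaining, total;

	if (!r->size || r->head == r->tail)
		return 0;
	assert(r->head < r->size && r->tail < r->size);

	/* First chunk: from head up to tail, or up to the end of the ring. */
	bytes = (r->head < r->tail) ? r->tail - r->head : r->size - r->head;
	if (bytes > count)
		bytes = count;
	memcpy(dst, r->base + r->head, bytes);
	total = bytes;

	/* Second chunk: from the start of the ring, if the data wraps. */
	if (r->head > r->tail && bytes < count) {
		remaining = count - bytes;
		if (remaining > r->tail)
			remaining = r->tail;
		memcpy(dst + bytes, r->base, remaining);
		total += remaining;
	}

	r->head = (r->head + total) % r->size;
	return total;
}
```

With an 8-byte ring holding "ABCDEFGH", head at 6 and tail at 2, a read with a large enough count copies "GH" and then "AB" and leaves head equal to tail, so the next read returns nothing.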
Add the node for Neutron NPU. Also add a reserved memory region for allocating Neutron buffers, which have a 1MB alignment constraint.
Signed-off-by: Jiwei Fu jiwei.fu@nxp.com Signed-off-by: Ioana Ciocoi-Radulescu ruxandra.radulescu@nxp.com --- v2: Match changes in dt bindings --- arch/arm64/boot/dts/freescale/imx95.dtsi | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi b/arch/arm64/boot/dts/freescale/imx95.dtsi index 55e2da094c88..1c6865a8d482 100644 --- a/arch/arm64/boot/dts/freescale/imx95.dtsi +++ b/arch/arm64/boot/dts/freescale/imx95.dtsi @@ -336,6 +336,19 @@ sram1: sram@204c0000 { #size-cells = <1>; };
+ reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + ranges; + + neutron_pool: neutron-pool { + compatible = "shared-dma-pool"; + size = <0x0 0x8000000>; + alignment = <0x0 0x100000>; + reusable; + }; + }; + firmware { scmi { compatible = "arm,scmi"; @@ -2181,5 +2194,20 @@ ddr-pmu@4e090dc0 { reg = <0x0 0x4e090dc0 0x0 0x200>; interrupts = <GIC_SPI 91 IRQ_TYPE_LEVEL_HIGH>; }; + + neutron: npu@4ab00000 { + compatible = "nxp,imx95-neutron"; + reg = <0x0 0x4ab00000 0x0 0x00000400>, + <0x0 0x4ab10000 0x0 0x00010000>, + <0x0 0x4ab08000 0x0 0x00008000>; + reg-names = "regs", "itcm", "dtcm"; + memory-region = <&neutron_pool>; + interrupts = <GIC_SPI 318 IRQ_TYPE_LEVEL_HIGH>; + clocks = <&scmi_clk IMX95_CLK_NPU>, + <&scmi_clk IMX95_CLK_NPUAPB>; + clock-names = "core", "apb"; + power-domains = <&scmi_devpd IMX95_PD_NPU>; + iommus = <&smmu 0xd>; + }; }; };
Hi Ioana,
Looks like the userspace portion of the driver is closed source (libNeutronDriver.so)?
https://github.com/nxp-imx/tflite-neutron-delegate/blob/lf-6.12.49_2.2.0/CMa...
Regards,
Tomeu
On Saturday, March 21, 2026 at 7:19 PM, Tomeu Vizoso wrote:
Hi Ioana,
Looks like the userspace portion of the driver is closed source (libNeutronDriver.so)?
https://github.com/nxp-imx/tflite-neutron-delegate/blob/lf-6.12.49_2.2.0/CMakeLists.txt
Hi Tomeu,
Yes, it's closed for now. We do plan to publish the source code on github, but I believe that's still a few months away.
Thanks, Ioana
On Tue, Mar 24, 2026 at 1:12 PM Ioana Ciocoi Radulescu <ruxandra.radulescu@nxp.com> wrote:
On Saturday, March 21, 2026 at 7:19 PM, Tomeu Vizoso wrote:
Hi Ioana,
Looks like the userspace portion of the driver is closed source (libNeutronDriver.so)?
https://github.com/nxp-imx/tflite-neutron-delegate/blob/lf-6.12.49_2.2.0/CMakeLists.txt
Hi Tomeu,
Yes, it's closed for now. We do plan to publish the source code on github, but I believe that's still a few months away.
I think you may want to sync with your userspace team sooner rather than later, so you can comply with this requirement:
https://docs.kernel.org/gpu/drm-uapi.html#open-source-userspace-requirements
It could be good to also share firmware code with other firmware-mediated NPU drivers if possible, or at least the part of the rpmsg protocol that makes sense to share.
You can see my submission for the Thames driver for a link to the firmware code.
I would be happy to help consolidate code between this category of drivers if you want.
Regards,
Tomeu
On Tuesday, March 24, 2026 at 6:40 PM, Tomeu Vizoso wrote:
Hi Ioana,
Looks like the userspace portion of the driver is closed source (libNeutronDriver.so)?
https://github.com/nxp-imx/tflite-neutron-delegate/blob/lf-6.12.49_2.2.0/CMa...
Hi Tomeu,
Yes, it's closed for now. We do plan to publish the source code on github, but I believe that's still a few months away.
I think you may want to sync with your userspace team sooner rather than later, so you can comply with this requirement:
https://docs.kernel.org/gpu/drm-uapi.html#open-source-userspace-requirements
Thanks for bringing this up; it helps us raise the priority of the userspace side internally. In the meantime, I still hope to gather additional feedback on the kernel driver.
It could be good to also share firmware code with other firmware-mediated NPU drivers if possible, or at least the part of the rpmsg protocol that makes sense to share.
You can see my submission for the Thames driver for a link to the firmware code.
I would be happy to help consolidate code between this category of drivers if you want.
Thanks for the offer. We are considering our options, and I'll get back to you once we reach an internal decision.
Regards, Ioana
linaro-mm-sig@lists.linaro.org