On 19/05/2025 10:27, Tomeu Vizoso wrote:
> On Mon, May 19, 2025 at 8:08 AM Krzysztof Kozlowski <krzk(a)kernel.org> wrote:
>>
>> On 16/05/2025 18:53, Tomeu Vizoso wrote:
>>> See Chapter 36 "RKNN" from the RK3588 TRM (Part 1).
>>>
>>> This is a derivative of NVIDIA's NVDLA, but with its own front-end
>>> processor.
>>>
>>> The IP is divided in three cores, programmed independently. The first
>>> core though is special, requiring to be powered on before any of the
>>> others can be used.
>>>
>>> The IOMMU of the first core is also special in that it has two subunits
>>> (read/write?) that need to be programmed in sync.
>>>
>>> v2:
>>> - Have one device for each NPU core (Sebastian Reichel)
>>> - Have one device for each IOMMU (Sebastian Reichel)
>>> - Correctly sort nodes (Diederik de Haas)
>>> - Add rockchip,iommu compatible to IOMMU nodes (Sebastian Reichel)
>>>
>>> v3:
>>> - Adapt to a split of the register block in the DT bindings (Nicolas
>>> Frattaroli)
>>>
>>> Signed-off-by: Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
>>> ---
>>> arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 85 +++++++++++++++++++++++++++
>>> 1 file changed, 85 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
>>> index 1e18ad93ba0ebdad31642b88ff0f90ef4e8dc76f..7b961ab838212fad8e4a70390fdc917a828433a9 100644
>>> --- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
>>> +++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
>>> @@ -1136,6 +1136,91 @@ power-domain@RK3588_PD_SDMMC {
>>> };
>>> };
>>>
>>> + rknn_core_top: npu-core@fdab0000 {
>>
>> npu@
>>
>>> + compatible = "rockchip,rk3588-rknn-core-top", "rockchip,rknn-core-top";
>>
>> You never tested this. Test before sending instead of relying on us or
>> after merging.
>
> Can you please extend on this? I have tested this series before
> sending and I don't understand what you mean here.

I mean exactly that: it was not tested, because warnings are clearly
visible/expected. I have now also found Rob's report, which even shows you
the warnings, so how come you still claim this was tested?

Best regards,
Krzysztof

On 5/19/25 06:08, wangtao wrote:
>
>
>> -----Original Message-----
>> From: Christian König <christian.koenig(a)amd.com>
>> Sent: Friday, May 16, 2025 6:29 PM
>> To: wangtao <tao.wangtao(a)honor.com>; sumit.semwal(a)linaro.org;
>> benjamin.gaignard(a)collabora.com; Brian.Starkey(a)arm.com;
>> jstultz(a)google.com; tjmercier(a)google.com
>> Cc: linux-media(a)vger.kernel.org; dri-devel(a)lists.freedesktop.org; linaro-
>> mm-sig(a)lists.linaro.org; linux-kernel(a)vger.kernel.org;
>> wangbintian(BintianWang) <bintian.wang(a)honor.com>; yipengxiang
>> <yipengxiang(a)honor.com>; liulu <liulu.liu(a)honor.com>; hanfeng
>> <feng.han(a)honor.com>
>> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
>> DMA_BUF_IOCTL_RW_FILE for system_heap
>>
>> On 5/16/25 11:49, wangtao wrote:
>>>>>> Please try using udmabuf with sendfile() as confirmed to be working
>>>>>> by
>>>> T.J.
>>>>> [wangtao] Using buffer IO with dmabuf file read/write requires one
>>>> memory copy.
>>>>> Direct IO removes this copy to enable zero-copy. The sendfile system
>>>>> call reduces memory copies from two (read/write) to one. However,
>>>>> with udmabuf, sendfile still keeps at least one copy, failing zero-copy.
>>>>
>>>>
>>>> Then please work on fixing this.
>>> [wangtao] What needs fixing? Does sendfile achieve zero-copy?
>>> sendfile reduces memory copies (from 2 to 1) for network sockets, but
>>> still requires one copy and cannot achieve zero copies.
>>
>> Well why not? See sendfile() is the designated Linux uAPI for moving data
>> between two files, maybe splice() is also appropriate.
>>
>> The memory file descriptor and your destination file are both files. So those
>> uAPIs apply.
> [wangtao] I realize our disagreement lies here:
> You believe sendfile enables zero-copy for regular file → socket/file:

No, what I mean is that it should be possible to solve this using sendfile() or splice() and not come up with a hacky IOCTL to bypass well-tested and agreed-upon system calls.

> sendfile(dst_socket, src_disk)
> [disk] --DMA--> [page buffer] --DMA--> [NIC]
> sendfile(dst_disk, src_disk)
> [disk] --DMA--> [page buffer] --DMA--> [DISK]
>
> But for regular file → memory file (e.g., tmpfs/shmem), a CPU copy is unavoidable:
> sendfile(dst_memfile, src_disk)
> [disk] --DMA--> [page buffer] --CPU copy--> [memory file]
> Without memory-to-memory DMA, this wastes CPU/power — critical for embedded devices.
>
>>
>> Now what you suggest is to add a new IOCTL to do this in a very specific
>> manner just for the system DMA-buf heap. And as far as I can see that is in
>> general a complete no-go.
>>
>> I mean I understand why you do this. Instead of improving the existing
>> functionality you're just hacking something together because it is simple for
>> you.
>>
>> It might be possible to implement that generic for DMA-buf heaps if
>> udmabuf allocation overhead can't be reduced, but that is then just the
>> second step.
> [wangtao] On dmabuf:
> - DMABUF lacks Direct I/O support, hence our proposal.
> - memfd supports Direct I/O but doesn’t fit our use case.
> - udmabuf via memfd works but needs systemic changes (low ROI) and has slow allocation.
>
> Your objections:
> 1. Adding an IOCTL? This targets dmabuf specifically, and our fix is simple.
> sendfile doesn’t resolve it.
> 2. Accessing sgtable pages in the exporter? As the dmabuf creator, the exporter
> fully controls sgtable/page data. We can restrict access to cases with no
> external users.
>
> Could you clarify which point you oppose?

Both. I might be repeating myself, but I think what you do here is a no-go and reimplements core system call functionality in a way which we certainly shouldn't allow.

T.J.'s testing shows that sendfile() seems to work in at least one direction. The other use case can certainly be optimized. So if you want to improve this, work on that instead.

Regards,
Christian
>
>>
>> Regards,
>> Christian.
On 16/05/2025 18:53, Tomeu Vizoso wrote:
> See Chapter 36 "RKNN" from the RK3588 TRM (Part 1).
>
> This is a derivative of NVIDIA's NVDLA, but with its own front-end
> processor.
>
> The IP is divided in three cores, programmed independently. The first
> core though is special, requiring to be powered on before any of the
> others can be used.
>
> The IOMMU of the first core is also special in that it has two subunits
> (read/write?) that need to be programmed in sync.
>
> v2:
> - Have one device for each NPU core (Sebastian Reichel)
> - Have one device for each IOMMU (Sebastian Reichel)
> - Correctly sort nodes (Diederik de Haas)
> - Add rockchip,iommu compatible to IOMMU nodes (Sebastian Reichel)
>
> v3:
> - Adapt to a split of the register block in the DT bindings (Nicolas
> Frattaroli)
>
> Signed-off-by: Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
> ---
> arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 85 +++++++++++++++++++++++++++
> 1 file changed, 85 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
> index 1e18ad93ba0ebdad31642b88ff0f90ef4e8dc76f..7b961ab838212fad8e4a70390fdc917a828433a9 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
> @@ -1136,6 +1136,91 @@ power-domain@RK3588_PD_SDMMC {
> };
> };
>
> + rknn_core_top: npu-core@fdab0000 {

npu@

> + compatible = "rockchip,rk3588-rknn-core-top", "rockchip,rknn-core-top";

You never tested this. Test before sending instead of relying on us or
after merging.

Best regards,
Krzysztof

On Fri, 16 May 2025 18:53:14 +0200, Tomeu Vizoso wrote:
> This series adds a new driver for the NPU that Rockchip includes in its
> newer SoCs, developed by them on the NVDLA base.
>
> In its current form, it supports the specific NPU in the RK3588 SoC.
>
> The userspace driver is part of Mesa and an initial draft can be found at:
>
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698
>
> Signed-off-by: Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
> ---
> Changes in v3:
> - Reference in the device tree only the register blocks that are
> actually used.
> - Several style and robustness fixes suggested in the mailing list.
> - Added patches from Nicolas Frattaroli that add support to the NPU for
> the Rock 5B board.
> - Link to v2: https://lore.kernel.org/r/20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizos…
>
> Changes in v2:
> - Drop patch adding the rk3588 compatible to rockchip-iommu (Sebastian Reichel)
> - Drop patch adding support for multiple power domains to rockchip-iommu (Sebastian Reichel)
> - Link to v1: https://lore.kernel.org/r/20240612-6-10-rocket-v1-0-060e48eea250@tomeuvizos…
>
> ---
> Nicolas Frattaroli (2):
> arm64: dts: rockchip: add pd_npu label for RK3588 power domains
> arm64: dts: rockchip: enable NPU on ROCK 5B
>
> Tomeu Vizoso (8):
> dt-bindings: npu: rockchip,rknn: Add bindings
> arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588s
> arm64: dts: rockchip: Enable the NPU on quartzpro64
> accel/rocket: Add registers header
> accel/rocket: Add a new driver for Rockchip's NPU
> accel/rocket: Add IOCTL for BO creation
> accel/rocket: Add job submission IOCTL
> accel/rocket: Add IOCTLs for synchronizing memory accesses
>
> Documentation/accel/index.rst | 1 +
> Documentation/accel/rocket/index.rst | 25 +
> .../bindings/npu/rockchip,rknn-core.yaml | 162 +
> MAINTAINERS | 10 +
> arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 87 +-
> .../arm64/boot/dts/rockchip/rk3588-quartzpro64.dts | 30 +
> arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 56 +
> drivers/accel/Kconfig | 1 +
> drivers/accel/Makefile | 1 +
> drivers/accel/rocket/Kconfig | 25 +
> drivers/accel/rocket/Makefile | 10 +
> drivers/accel/rocket/rocket_core.c | 103 +
> drivers/accel/rocket/rocket_core.h | 59 +
> drivers/accel/rocket/rocket_device.c | 45 +
> drivers/accel/rocket/rocket_device.h | 31 +
> drivers/accel/rocket/rocket_drv.c | 337 ++
> drivers/accel/rocket/rocket_drv.h | 17 +
> drivers/accel/rocket/rocket_gem.c | 211 +
> drivers/accel/rocket/rocket_gem.h | 31 +
> drivers/accel/rocket/rocket_job.c | 723 ++++
> drivers/accel/rocket/rocket_job.h | 50 +
> drivers/accel/rocket/rocket_registers.h | 4425 ++++++++++++++++++++
> include/uapi/drm/rocket_accel.h | 145 +
> 23 files changed, 6584 insertions(+), 1 deletion(-)
> ---
> base-commit: 46bfbcd135a6df00c49cf043bf2c9c9387bc882d
> change-id: 20240612-6-10-rocket-9316defc14c7
>
> Best regards,
> --
> Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
>
>
>

My bot found new DTB warnings on the .dts files added or changed in this
series.

Some warnings may be from an existing SoC .dtsi. Or perhaps the warnings
are fixed by another series. Ultimately, it is up to the platform
maintainer whether these warnings are acceptable or not. No need to reply
unless the platform maintainer has comments.

If you already ran DT checks and didn't see these error(s), then
make sure dt-schema is up to date:

  pip3 install dtschema --upgrade

This patch series was applied (using b4) to base:
 Base: base-commit 46bfbcd135a6df00c49cf043bf2c9c9387bc882d not known, ignoring
 Base: attempting to guess base-commit...
 Base: tags/v6.15-rc6-20-g4106486839d1 (exact match)

If this is not the correct base, please add 'base-commit' tag
(or use b4 which does this automatically)

New warnings running 'make CHECK_DTBS=y for arch/arm64/boot/dts/rockchip/' for 20250516-6-10-rocket-v3-0-7051ac9225db(a)tomeuvizoso.net:

arch/arm64/boot/dts/rockchip/rk3588-coolpi-cm5-genbook.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-coolpi-cm5-genbook.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-coolpi-cm5-genbook.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-odroid-m2.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-odroid-m2.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-odroid-m2.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5b.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5b.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5b.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-h96-max-v58.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-h96-max-v58.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-h96-max-v58.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-armsom-w3.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-armsom-w3.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-armsom-w3.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-ultra.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-ultra.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-ultra.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-jaguar.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-jaguar.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-jaguar.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6b-io.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6b-io.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6b-io.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-gameforce-ace.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-gameforce-ace.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-gameforce-ace.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-tiger-haikou.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-tiger-haikou.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-tiger-haikou.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3582-radxa-e52c.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3582-radxa-e52c.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3582-radxa-e52c.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-toybrick-x0.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-toybrick-x0.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-toybrick-x0.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-coolpi-4b.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-coolpi-4b.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-coolpi-4b.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-khadas-edge2.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-khadas-edge2.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-khadas-edge2.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-friendlyelec-cm3588-nas.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-friendlyelec-cm3588-nas.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-friendlyelec-cm3588-nas.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6a-io.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6a-io.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6a-io.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-rock-5-itx.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-rock-5-itx.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-rock-5-itx.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-evb1-v10.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-evb1-v10.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-evb1-v10.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-mnt-reform2.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-mnt-reform2.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-mnt-reform2.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-armsom-sige7.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-armsom-sige7.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-armsom-sige7.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-nanopc-t6-lts.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-nanopc-t6-lts.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-nanopc-t6-lts.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6c.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6c.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6c.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-plus.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-plus.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-plus.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-nanopc-t6.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-nanopc-t6.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-nanopc-t6.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6s.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6s.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6s.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-max.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-max.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-orangepi-5-max.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-rock-5c.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-rock-5c.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-rock-5c.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-firefly-itx-3588j.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-firefly-itx-3588j.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-firefly-itx-3588j.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-coolpi-cm5-evb.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-coolpi-cm5-evb.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588-coolpi-cm5-evb.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-indiedroid-nova.dtb: npu-core@fdab0000 (rockchip,rk3588-rknn-core-top): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core-top', 'rockchip,rknn-core-top'] is too long
'rockchip,rk3588-rknn-core-top' is not one of ['rockchip,rk3588-rknn-core']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-indiedroid-nova.dtb: npu-core@fdac0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
arch/arm64/boot/dts/rockchip/rk3588s-indiedroid-nova.dtb: npu-core@fdad0000 (rockchip,rk3588-rknn-core): compatible: 'oneOf' conditional failed, one must be fixed:
['rockchip,rk3588-rknn-core', 'rockchip,rknn-core'] is too long
'rockchip,rk3588-rknn-core' is not one of ['rockchip,rk3588-rknn-core-top']
from schema $id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
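One reading of the warnings above is that each `oneOf` branch of the binding accepts only a single compatible string, while every node in the .dtsi carries two (a SoC-specific string plus a generic fallback). The sketch below is a minimal, self-contained emulation of that reading. The `BRANCHES` list is a guess inferred from the messages, not the contents of rockchip,rknn-core.yaml, and the helper names are hypothetical. It does not reproduce how dt-validate selects or orders its messages.

```python
# Assumed shape of the binding's `compatible` rule, inferred from the log only.
BRANCHES = [
    ["rockchip,rk3588-rknn-core-top"],  # assumed branch 1: one string, this enum
    ["rockchip,rk3588-rknn-core"],      # assumed branch 2: one string, this enum
]


def branch_errors(compatible, allowed):
    """Return the reasons `compatible` fails a single-entry enum branch."""
    errors = []
    if len(compatible) != 1:
        errors.append(f"{compatible} has the wrong length")
    if compatible and compatible[0] not in allowed:
        errors.append(f"{compatible[0]!r} is not one of {allowed}")
    return errors


def check_one_of(compatible):
    """oneOf semantics: valid only if exactly one branch reports no errors."""
    results = [branch_errors(compatible, allowed) for allowed in BRANCHES]
    matches = sum(1 for errs in results if not errs)
    return matches == 1, results


if __name__ == "__main__":
    node = ["rockchip,rk3588-rknn-core-top", "rockchip,rknn-core-top"]
    ok, results = check_one_of(node)
    print("valid" if ok else "invalid", results)
```

Under this assumed schema, the two-string list fails both branches, which matches the general shape of the report. Either the DTS or the binding would have to change for the check to pass.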
On 5/16/25 09:40, wangtao wrote:
>
>
>> -----Original Message-----
>> From: Christian König <christian.koenig(a)amd.com>
>> Sent: Thursday, May 15, 2025 10:26 PM
>> To: wangtao <tao.wangtao(a)honor.com>; sumit.semwal(a)linaro.org;
>> benjamin.gaignard(a)collabora.com; Brian.Starkey(a)arm.com;
>> jstultz(a)google.com; tjmercier(a)google.com
>> Cc: linux-media(a)vger.kernel.org; dri-devel(a)lists.freedesktop.org; linaro-
>> mm-sig(a)lists.linaro.org; linux-kernel(a)vger.kernel.org;
>> wangbintian(BintianWang) <bintian.wang(a)honor.com>; yipengxiang
>> <yipengxiang(a)honor.com>; liulu 00013167 <liulu.liu(a)honor.com>; hanfeng
>> 00012985 <feng.han(a)honor.com>
>> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
>> DMA_BUF_IOCTL_RW_FILE for system_heap
>>
>> On 5/15/25 16:03, wangtao wrote:
>>> [wangtao] My Test Configuration (CPU 1GHz, 5-test average):
>>> Allocation: 32x32MB buffer creation
>>> - dmabuf 53ms vs. udmabuf 694ms (10X slower)
>>> - Note: shmem shows excessive allocation time
>>
>> Yeah, that is something already noted by others as well. But that is
>> orthogonal.
>>
>>>
>>> Read 1024MB File:
>>> - dmabuf direct 326ms vs. udmabuf direct 461ms (40% slower)
>>> - Note: pin_user_pages_fast consumes majority CPU cycles
>>>
>>> Key function call timing: See details below.
>>
>> Those aren't valid, you are comparing different functionalities here.
>>
>> Please try using udmabuf with sendfile() as confirmed to be working by T.J.
> [wangtao] Using buffer IO with dmabuf file read/write requires one memory copy.
> Direct IO removes this copy to enable zero-copy. The sendfile system call
> reduces memory copies from two (read/write) to one. However, with udmabuf,
> sendfile still keeps at least one copy, failing zero-copy.
Then please work on fixing this.
Regards,
Christian.
>
> If udmabuf sendfile uses buffer IO (file page cache), read latency matches
> dmabuf buffer read, but allocation time is much longer.
> With Direct IO, the default 16-page pipe size makes it slower than buffer IO.
>
> Test data shows:
> udmabuf direct read is much faster than udmabuf sendfile.
> dmabuf direct read outperforms udmabuf direct read by a large margin.
>
> Issue: After udmabuf is mapped via map_dma_buf, apps using memfd or
> udmabuf for Direct IO might cause errors, but there are no safeguards to
> prevent this.
>
> Allocate 32x32MB buffer and read 1024 MB file Test:
> Metric | alloc (ms) | read (ms) | total (ms)
> -----------------------|------------|-----------|-----------
> udmabuf buffer read | 539 | 2017 | 2555
> udmabuf direct read | 522 | 658 | 1179
> udmabuf buffer sendfile| 505 | 1040 | 1546
> udmabuf direct sendfile| 510 | 2269 | 2780
> dmabuf buffer read | 51 | 1068 | 1118
> dmabuf direct read | 52 | 297 | 349
>
> udmabuf sendfile test steps:
> 1. Open data file(1024MB), get back_fd
> 2. Create memfd(32MB) # Loop steps 2-6
> 3. Allocate udmabuf with memfd
> 4. Call sendfile(memfd, back_fd)
> 5. Close memfd after sendfile
> 6. Close udmabuf
> 7. Close back_fd
>
>>
>> Regards,
>> Christian.
>
On Fri, May 16, 2025 at 02:19:45PM +0800, Xu Yilun wrote:
> > I don't know why you'd disable a viommu while the VM is running,
> > doesn't make sense.
>
> Here it means remove the CC setup for viommu, shared setup is still
> kept.
That might make sense for the vPCI function, but not the vIOMMU. A
secure vIOMMU needs to be running at all times while the guest is
running. Perhaps it has no devices it can be used with, but its
functionality has to be there because a driver in the VM will be
connected to it.
At most "bind" should only tell the already existing secure vIOMMU
that it is allowed to translate for a specific vPCI function.
Jason
On 5/16/25 11:49, wangtao wrote:
>>>> Please try using udmabuf with sendfile() as confirmed to be working by
>> T.J.
>>> [wangtao] Using buffer IO with dmabuf file read/write requires one
>> memory copy.
>>> Direct IO removes this copy to enable zero-copy. The sendfile system
>>> call reduces memory copies from two (read/write) to one. However, with
>>> udmabuf, sendfile still keeps at least one copy, failing zero-copy.
>>
>>
>> Then please work on fixing this.
> [wangtao] What needs fixing? Does sendfile achieve zero-copy?
> sendfile reduces memory copies (from 2 to 1) for network sockets,
> but still requires one copy and cannot achieve zero copies.
Well why not? See sendfile() is the designated Linux uAPI for moving data between two files, maybe splice() is also appropriate.
The memory file descriptor and your destination file are both files. So those uAPIs apply.
Now what you suggest is to add a new IOCTL to do this in a very specific manner just for the system DMA-buf heap. And as far as I can see that is in general a complete no-go.
I mean I understand why you do this. Instead of improving the existing functionality you're just hacking something together because it is simple for you.
It might be possible to implement that generically for DMA-buf heaps if udmabuf allocation overhead can't be reduced, but that is then just the second step.
Regards,
Christian.
Hi!
I previously discussed this with Simona on IRC but would like to get
some feedback also from a wider audience:
We're planning to share dma-bufs using a fast interconnect in a way
similar to pcie-p2p:
The rough plan is to identify dma-bufs capable of sharing this way by
looking at the address of either the dma-buf ops and / or the
importer_ops to conclude it's a device using the same driver (or
possibly child driver) and then take special action when the dma-
addresses are obtained. Nothing visible outside of the xe driver or its
child driver.
Are there any absolute "DON'T"s or recommendations to keep in mind WRT
this approach?
Thanks,
Thomas
On Fri, May 16, 2025 at 02:02:29AM +0800, Xu Yilun wrote:
> > IMHO, I think it might be helpful that you can picture out what are the
> > minimum requirements (function/life cycle) to the current IOMMUFD TSM
> > bind architecture:
> >
> > 1.host tsm_bind (preparation) is in IOMMUFD, triggered by QEMU handling
> > the TVM-HOST call.
> > 2. TDI acceptance is handled in guest_request() to accept the TDI after
> > the validation in the TVM)
>
> I'll try my best to brainstorm and make a flow in ASCII.
>
> (*) means new feature
>
>
> Guest Guest TSM QEMU VFIO IOMMUFD host TSM KVM
> ----- --------- ---- ---- ------- -------- ---
> 1. *Connect(IDE)
> 2. Init vdev
open /dev/vfio/XX as a VFIO action
Then VFIO attaches to IOMMUFD as an iommufd action creating the idev
> 3. *create dmabuf
> 4. *export dmabuf
> 5. create memslot
> 6. *import dmabuf
> 7. setup shared DMA
> 8. create hwpt
> 9. attach hwpt
> 10. kvm run
> 11.enum shared dev
> 12.*Connect(Bind)
> 13. *GHCI Bind
> 14. *Bind
> 15 CC viommu alloc
> 16. vdevice alloc
viommu and vdevice creation happen before KVM run. The vPCI function
is visible to the guest from the very start, even though it is in T=0
mode. If a platform does not require any special CC steps prior to KVM
run then it just has a NOP for these functions.
What you have here is some new BIND operation against the already
existing vdevice as we discussed earlier.
> 16. *attach vdev
> 17. *setup CC viommu
> 18 *tsm_bind
> 19. *bind
> 20.*Attest
> 21. *GHCI get CC info
> 22. *get CC info
> 23. *vdev guest req
> 24. *guest req
> 25.*Accept
> 26. *GHCI accept MMIO/DMA
> 27. *accept MMIO/DMA
> 28. *vdev guest req
> 29. *guest req
> 30. *map private MMIO
> 31. *GHCI start tdi
> 32. *start tdi
> 33. *vdev guest req
> 34. *guest req
This seems reasonable: you want to have some generic RPC scheme to
carry messages from the VM to the TSM, tunneled through the iommufd
vdevice (because the vdevice has the vPCI ID, the KVM ID, the vIOMMU
ID and so on).
> 35.Workload...
> 36.*disconnect(Unbind)
> 37. *GHCI unbind
> 38. *Unbind
> 39. *detach vdev
unbind vdev. vdev remains until kvm is stopped.
> 40. *tsm_unbind
> 41. *TDX stop tdi
> 42. *TDX disable mmio cb
> 43. *cb dmabuf revoke
> 44. *unmap private MMIO
> 45. *TDX disable dma cb
> 46. *cb disable CC viommu
I don't know why you'd disable a viommu while the VM is running,
doesn't make sense.
> 47. *TDX tdi free
> 48. *enable mmio
> 49. *cb dmabuf recover
> 50.workable shared dev
This is a nice chart, it would be good to see a comparable chart for
AMD and ARM
Jason
On Fri, May 16, 2025 at 12:04:04AM +0800, Xu Yilun wrote:
> > arches this was mostly invisible to the hypervisor?
>
> Attest & Accept can be invisible to hypervisor, or host just help pass
> data blobs between guest, firmware & device.
>
> Bind cannot be host agnostic, host should be aware not to touch device
> after Bind.
I'm not sure this is fully true, this could be an Intel thing. When the
vPCI is created the host can already know it shouldn't touch the PCI
device anymore and the secure world would enforce that when it gets a
bind command.
The fact it hasn't been locked out immediately at vPCI creation time
is sort of a detail that doesn't matter, IMHO.
> > It might be reasonable to have VFIO reach into iommufd to do that on
> > an already existing iommufd VDEVICE object. A little weird, but we
> > could probably make that work.
>
> Mm, Are you proposing an uAPI in VFIO, and a kAPI from VFIO -> IOMMUFD like:
>
> ioctl(vfio_fd, VFIO_DEVICE_ATTACH_VDEV, vdev_id)
> -> iommufd_device_attach_vdev()
> -> tsm_tdi_bind()
Not ATTACH, you wanted BIND. You could have a VFIO_DEVICE_BIND(iommufd
vdevice id)
> > sees VFIO remove the S-EPT and release the KVM, then have iommufd
> > destroy the VDEVICE object.
>
> Regarding VM destroy, TDX Connect has more enforcement, VM could only be
> destroyed after all assigned CC vPCI devices are destroyed.
And KVM destroys the VM?
> Nowadays, VFIO already holds KVM reference, so we need
>
> close(vfio_fd)
> -> iommufd_device_detach_vdev()
This doesn't happen though, it destroys the normal device (idev) which
the vdevice is stacked on top of. You'd have to make normal device
destruction trigger vdevice destruction
> -> tsm_tdi_unbind()
> -> tdi stop
> -> callback to VFIO, dmabuf_move_notify(revoke)
> -> KVM unmap MMIO
> -> tdi metadata remove
This omits the viommu. It won't get destroyed until the iommufd
closes, so iommufd will be holding the kvm and it will do the final
put.
Jason
On 5/15/25 16:03, wangtao wrote:
> [wangtao] My Test Configuration (CPU 1GHz, 5-test average):
> Allocation: 32x32MB buffer creation
> - dmabuf 53ms vs. udmabuf 694ms (10X slower)
> - Note: shmem shows excessive allocation time
Yeah, that is something already noted by others as well. But that is orthogonal.
>
> Read 1024MB File:
> - dmabuf direct 326ms vs. udmabuf direct 461ms (40% slower)
> - Note: pin_user_pages_fast consumes majority CPU cycles
>
> Key function call timing: See details below.
Those aren't valid, you are comparing different functionalities here.
Please try using udmabuf with sendfile() as confirmed to be working by T.J.
Regards,
Christian.
On Wed, May 14, 2025 at 2:00 PM Song Liu <song(a)kernel.org> wrote:
>
> On Tue, May 13, 2025 at 9:36 AM T.J. Mercier <tjmercier(a)google.com> wrote:
> >
> > Use the same test buffers as the traditional iterator and a new BPF map
> > to verify the test buffers can be found with the open coded dmabuf
> > iterator.
> >
> > Signed-off-by: T.J. Mercier <tjmercier(a)google.com>
> > Acked-by: Christian König <christian.koenig(a)amd.com>
> > Acked-by: Song Liu <song(a)kernel.org>
> > ---
> > .../testing/selftests/bpf/bpf_experimental.h | 5 +++
> > .../selftests/bpf/prog_tests/dmabuf_iter.c | 41 +++++++++++++++++++
> > .../testing/selftests/bpf/progs/dmabuf_iter.c | 38 +++++++++++++++++
> > 3 files changed, 84 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
> > index 6535c8ae3c46..5e512a1d09d1 100644
> > --- a/tools/testing/selftests/bpf/bpf_experimental.h
> > +++ b/tools/testing/selftests/bpf/bpf_experimental.h
> > @@ -591,4 +591,9 @@ extern int bpf_iter_kmem_cache_new(struct bpf_iter_kmem_cache *it) __weak __ksym
> > extern struct kmem_cache *bpf_iter_kmem_cache_next(struct bpf_iter_kmem_cache *it) __weak __ksym;
> > extern void bpf_iter_kmem_cache_destroy(struct bpf_iter_kmem_cache *it) __weak __ksym;
> >
> > +struct bpf_iter_dmabuf;
> > +extern int bpf_iter_dmabuf_new(struct bpf_iter_dmabuf *it) __weak __ksym;
> > +extern struct dma_buf *bpf_iter_dmabuf_next(struct bpf_iter_dmabuf *it) __weak __ksym;
> > +extern void bpf_iter_dmabuf_destroy(struct bpf_iter_dmabuf *it) __weak __ksym;
> > +
> > #endif
> > diff --git a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
> > index dc740bd0e2bd..6c2b0c3dbcd8 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
> > @@ -219,14 +219,52 @@ static void subtest_dmabuf_iter_check_default_iter(struct dmabuf_iter *skel)
> > close(iter_fd);
> > }
> >
> > +static void subtest_dmabuf_iter_check_open_coded(struct dmabuf_iter *skel, int map_fd)
> > +{
> > + LIBBPF_OPTS(bpf_test_run_opts, topts);
> > + char key[DMA_BUF_NAME_LEN];
> > + int err, fd;
> > + bool found;
> > +
> > + /* No need to attach it, just run it directly */
> > + fd = bpf_program__fd(skel->progs.iter_dmabuf_for_each);
> > +
> > + err = bpf_prog_test_run_opts(fd, &topts);
> > + if (!ASSERT_OK(err, "test_run_opts err"))
> > + return;
> > + if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
> > + return;
> > +
> > + if (!ASSERT_OK(bpf_map_get_next_key(map_fd, NULL, key), "get next key"))
> > + return;
> > +
> > + do {
> > + ASSERT_OK(bpf_map_lookup_elem(map_fd, key, &found), "lookup");
> > + ASSERT_TRUE(found, "found test buffer");
>
> This check failed once in the CI, on s390:
>
> Error: #89/3 dmabuf_iter/open_coded
> 9309 subtest_dmabuf_iter_check_open_coded:PASS:test_run_opts err 0 nsec
> 9310 subtest_dmabuf_iter_check_open_coded:PASS:test_run_opts retval 0 nsec
> 9311 subtest_dmabuf_iter_check_open_coded:PASS:get next key 0 nsec
> 9312 subtest_dmabuf_iter_check_open_coded:PASS:lookup 0 nsec
> 9313 subtest_dmabuf_iter_check_open_coded:FAIL:found test buffer
> unexpected found test buffer: got FALSE
>
> But it passed in the rerun. It is probably a bit flakey. Maybe we need some
> barrier somewhere.
>
> Here is the failure:
>
> https://github.com/kernel-patches/bpf/actions/runs/15002058808/job/42234864…
>
> To see the log, you need to log in GitHub.
>
> Thanks,
> Song
Thanks, yeah I have been trying to run this locally today but still
working on setting up an environment for it. Daniel Xu thoughtfully
suggested I use a github PR to trigger CI, but I tried that last week
without success: https://github.com/kernel-patches/bpf/pull/8910
I'm not sure if this is the cause (doesn't show up on the runs that
pass) but I have no idea why that would be intermittently failing:
libbpf: Error in bpf_create_map_xattr(testbuf_hash): -EINVAL. Retrying
without BTF.
> > + } while (bpf_map_get_next_key(map_fd, key, key));
> > +}
>
> [...]
On Wed, May 14, 2025 at 1:53 PM Song Liu <song(a)kernel.org> wrote:
>
> On Tue, May 13, 2025 at 9:36 AM T.J. Mercier <tjmercier(a)google.com> wrote:
> >
> > This test creates a udmabuf, and a dmabuf from the system dmabuf heap,
> > and uses a BPF program that prints dmabuf metadata with the new
> > dmabuf_iter to verify they can be found.
> >
> > Signed-off-by: T.J. Mercier <tjmercier(a)google.com>
> > Acked-by: Christian König <christian.koenig(a)amd.com>
>
> Acked-by: Song Liu <song(a)kernel.org>
Thanks.
>
> With one more comment below.
>
> [...]
>
> > diff --git a/tools/testing/selftests/bpf/progs/dmabuf_iter.c b/tools/testing/selftests/bpf/progs/dmabuf_iter.c
> > new file mode 100644
> > index 000000000000..2a1b5397196d
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/dmabuf_iter.c
> > @@ -0,0 +1,53 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2025 Google LLC */
> > +#include <vmlinux.h>
> > +#include <bpf/bpf_core_read.h>
> > +#include <bpf/bpf_helpers.h>
> > +
> > +/* From uapi/linux/dma-buf.h */
> > +#define DMA_BUF_NAME_LEN 32
> > +
> > +char _license[] SEC("license") = "GPL";
> > +
> > +/*
> > + * Fields output by this iterator are delimited by newlines. Convert any
> > + * newlines in user-provided printed strings to spaces.
> > + */
> > +static void sanitize_string(char *src, size_t size)
> > +{
> > + for (char *c = src; *c && (size_t)(c - src) < size; ++c)
>
> We should do the size check first, right? IOW:
>
> for (char *c = src; (size_t)(c - src) < size && *c; ++c)
Yeah if you call the function with size = 0, which is kinda
questionable and not possible with the non-zero array size that is
tied to immutable UAPI. Let's change it like you suggest.
>
> > + if (*c == '\n')
> > + *c = ' ';
> > +}
> > +
> [...]
From: Rob Clark <robdclark(a)chromium.org>
Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
Memory[2] in the form of:
1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/
MAP_NULL/UNMAP commands
2. A new VM_BIND ioctl to allow submitting batches of one or more
MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue
I did not implement support for synchronous VM_BIND commands. Since
userspace could just immediately wait for the `SUBMIT` to complete, I don't
think we need this extra complexity in the kernel. Synchronous/immediate
VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.
The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
Changes in v4:
- Various locking/etc fixes
- Optimize the pgtable preallocation. If userspace sorts the VM_BIND ops
then the kernel detects ops that fall into the same 2MB last level PTD
to avoid duplicate page preallocation.
- Add way to throttle pushing jobs to the scheduler, to cap the amount of
potentially temporary prealloc'd pgtable pages.
- Add vm_log to devcoredump for debugging. If the vm_log_shift module
param is set, keep a log of the last 1<<vm_log_shift VM updates for
easier debugging of faults/crashes.
- Link to v3: https://lore.kernel.org/all/20250428205619.227835-1-robdclark@gmail.com/
Changes in v3:
- Switched to separate VM_BIND ioctl. This makes the UABI a bit
cleaner, but OTOH the userspace code was cleaner when the end result
of either type of VkQueue led to the same ioctl. So I'm a bit on
the fence.
- Switched to doing the gpuvm bookkeeping synchronously, and only
deferring the pgtable updates. This avoids needing to hold any resv
locks in the fence signaling path, resolving the last shrinker related
lockdep complaints. OTOH it means userspace can trigger invalid
pgtable updates with multiple VM_BIND queues. In this case, we ensure
that unmaps happen completely (to prevent userspace from using this to
access free'd pages), mark the context as unusable, and move on with
life.
- Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/
Changes in v2:
- Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
merged.
- Pre-allocate all the things, and drop HACK patch which disabled shrinker.
This includes ensuring that vm_bo objects are allocated up front, pre-
allocating VMA objects, and pre-allocating pages used for pgtable updates.
The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that
were initially added for panthor.
- Add back support for BO dumping for devcoredump.
- Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.c…
[1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
[2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
[3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
Rob Clark (40):
drm/gpuvm: Don't require obj lock in destructor path
drm/gpuvm: Allow VAs to hold soft reference to BOs
drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan()
drm/sched: Add enqueue credit limit
iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
drm/msm: Rename msm_file_private -> msm_context
drm/msm: Improve msm_context comments
drm/msm: Rename msm_gem_address_space -> msm_gem_vm
drm/msm: Remove vram carveout support
drm/msm: Collapse vma allocation and initialization
drm/msm: Collapse vma close and delete
drm/msm: Don't close VMAs on purge
drm/msm: drm_gpuvm conversion
drm/msm: Convert vm locking
drm/msm: Use drm_gpuvm types more
drm/msm: Split out helper to get iommu prot flags
drm/msm: Add mmu support for non-zero offset
drm/msm: Add PRR support
drm/msm: Rename msm_gem_vma_purge() -> _unmap()
drm/msm: Drop queued submits on lastclose()
drm/msm: Lazily create context VM
drm/msm: Add opt-in for VM_BIND
drm/msm: Mark VM as unusable on GPU hangs
drm/msm: Add _NO_SHARE flag
drm/msm: Crashdump prep for sparse mappings
drm/msm: rd dumping prep for sparse mappings
drm/msm: Crashdec support for sparse
drm/msm: rd dumping support for sparse
drm/msm: Extract out syncobj helpers
drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL
drm/msm: Add VM_BIND submitqueue
drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON
drm/msm: Support pgtable preallocation
drm/msm: Split out map/unmap ops
drm/msm: Add VM_BIND ioctl
drm/msm: Add VM logging for VM_BIND updates
drm/msm: Add VMA unmap reason
drm/msm: Add mmu prealloc tracepoint
drm/msm: use trylock for debugfs
drm/msm: Bump UAPI version
drivers/gpu/drm/drm_gem.c | 14 +-
drivers/gpu/drm/drm_gpuvm.c | 15 +-
drivers/gpu/drm/msm/Kconfig | 1 +
drivers/gpu/drm/msm/Makefile | 1 +
drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +-
drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +-
drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 99 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +-
.../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
drivers/gpu/drm/msm/msm_drv.c | 184 +--
drivers/gpu/drm/msm/msm_drv.h | 35 +-
drivers/gpu/drm/msm/msm_fb.c | 18 +-
drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
drivers/gpu/drm/msm/msm_gem.c | 494 +++---
drivers/gpu/drm/msm/msm_gem.h | 247 ++-
drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
drivers/gpu/drm/msm/msm_gem_shrinker.c | 104 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++--
drivers/gpu/drm/msm/msm_gem_vma.c | 1471 ++++++++++++++++-
drivers/gpu/drm/msm/msm_gpu.c | 214 ++-
drivers/gpu/drm/msm/msm_gpu.h | 144 +-
drivers/gpu/drm/msm/msm_gpu_trace.h | 14 +
drivers/gpu/drm/msm/msm_iommu.c | 302 +++-
drivers/gpu/drm/msm/msm_kms.c | 18 +-
drivers/gpu/drm/msm/msm_kms.h | 2 +-
drivers/gpu/drm/msm/msm_mmu.h | 38 +-
drivers/gpu/drm/msm/msm_rd.c | 62 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 10 +-
drivers/gpu/drm/msm/msm_submitqueue.c | 96 +-
drivers/gpu/drm/msm/msm_syncobj.c | 172 ++
drivers/gpu/drm/msm/msm_syncobj.h | 37 +
drivers/gpu/drm/scheduler/sched_entity.c | 16 +-
drivers/gpu/drm/scheduler/sched_main.c | 3 +
drivers/iommu/io-pgtable-arm.c | 27 +-
include/drm/drm_gem.h | 10 +-
include/drm/drm_gpuvm.h | 12 +-
include/drm/gpu_scheduler.h | 13 +-
include/linux/io-pgtable.h | 8 +
include/uapi/drm/msm_drm.h | 149 +-
63 files changed, 3484 insertions(+), 1251 deletions(-)
create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c
create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h
--
2.49.0
On Wed, May 14, 2025 at 03:02:53PM +0800, Xu Yilun wrote:
> > We have an awkward fit for what CCA people are doing to the various
> > Linux APIs. Looking somewhat maximally across all the arches a "bind"
> > for a CC vPCI device creation operation does:
> >
> > - Setup the CPU page tables for the VM to have access to the MMIO
>
> This is a guest-side thing, is it? Does the host need to opt in to anything?
CPU hypervisor page tables.
> > - Revoke hypervisor access to the MMIO
>
> VFIO could choose never to mmap MMIO, so in this case nothing to do?
Yes, if you do it that way.
> > - Setup the vIOMMU to understand the vPCI device
> > - Take over control of some of the IOVA translation, at least for T=1,
> > and route to the vIOMMU
> > - Register the vPCI with any attestation functions the VM might use
> > - Do some DOE stuff to manage/validate TDISP/etc
>
> Intel TDX Connect has an extra requirement for "unbind":
>
> - Revoke KVM page table (S-EPT) for the MMIO only after TDISP
> CONFIG_UNLOCK
Maybe you could express this as the S-EPT always has the MMIO mapped
into it as long as the vPCI function is installed to the VM? Is KVM
responsible for the S-EPT?
> Another thing is, seems your term "bind" includes all steps for
> shared -> private conversion.
Well, I was talking about vPCI creation. I understand that during the
vPCI lifecycle the VM will do "bind" "unbind" which are more or less
switching the device into a T=1 mode. Though I understood on some
arches this was mostly invisible to the hypervisor?
> But in my mind, "bind" only includes
> putting device in TDISP LOCK state & corresponding host setups required
> by firmware. I.e. "bind" means host locks down the CC setup, waiting for
> guest attestation.
So we will need to have some other API for this that modifies the vPCI
object.
It might be reasonable to have VFIO reach into iommufd to do that on
an already existing iommufd VDEVICE object. A little weird, but we
could probably make that work.
But you have some weird ordering issues here if the S-EPT has to have
the VFIO MMIO then you have to have a close() destruction order that
sees VFIO remove the S-EPT and release the KVM, then have iommufd
destroy the VDEVICE object.
> > It doesn't mean that iommufd is suddenly doing PCI stuff, no, that
> > stays in VFIO.
>
> I'm not sure if Alexey's patch [1] illustrates your idea. It calls
> tsm_tdi_bind() which directly does device stuff, and impacts MMIO.
> VFIO doesn't know about this.
>
> I have to interpret this as VFIO first handing over device CC features
> and MMIO resources to IOMMUFD, so VFIO never cares about them.
>
> [1] https://lore.kernel.org/all/20250218111017.491719-15-aik@amd.com/
There is also the PCI layer involved here and maybe PCI should be
participating in managing some of this. Like it makes a bit of sense
that PCI would block the FLR on platforms that require this?
Jason
On Wed, May 14, 2025 at 7:58 AM Tvrtko Ursulin
<tvrtko.ursulin(a)igalia.com> wrote:
>
>
> On 14/05/2025 14:57, Rob Clark wrote:
> > On Wed, May 14, 2025 at 3:01 AM Tvrtko Ursulin
> > <tvrtko.ursulin(a)igalia.com> wrote:
> >>
> >>
> >> On 13/05/2025 15:16, Rob Clark wrote:
> >>> On Fri, May 9, 2025 at 8:34 AM Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com> wrote:
> >>>>
> >>>> Dma-fence objects currently suffer from a potential use after free problem
> >>>> where fences exported to userspace and other drivers can outlive the
> >>>> exporting driver, or the associated data structures.
> >>>>
> >>>> The discussion on how to address this concluded that adding reference
> >>>> counting to all the involved objects is not desirable, since it would need
> >>>> to be very wide reaching and could cause unloadable drivers if another
> >>>> entity would be holding onto a signaled fence reference potentially
> >>>> indefinitely.
> >>>>
> >>>> This patch enables the safe access by introducing and documenting a
> >>>> contract between fence exporters and users. It documents a set of
> >>>> constraints and adds helpers which a) drivers with potential to suffer from
> >>>> the use after free must use and b) users of the dma-fence API must use as
> >>>> well.
> >>>>
> >>>> Premise of the design has multiple sides:
> >>>>
> >>>> 1. Drivers (fence exporters) MUST ensure a RCU grace period between
> >>>> signalling a fence and freeing the driver private data associated with it.
> >>>>
> >>>> The grace period does not have to follow the signalling immediately but
> >>>> HAS to happen before data is freed.
> >>>>
> >>>> 2. Users of the dma-fence API marked with such requirement MUST contain
> >>>> the complete access to the data within a single code block guarded by the
> >>>> new dma_fence_access_begin() and dma_fence_access_end() helpers.
> >>>>
> >>>> The combination of the two ensures that whoever sees the
> >>>> DMA_FENCE_FLAG_SIGNALED_BIT not set is guaranteed to have access to a
> >>>> valid fence->lock and valid data potentially accessed by the fence->ops
> >>>> virtual functions, until the call to dma_fence_access_end().
> >>>>
> >>>> 3. Module unload (fence->ops) disappearing is for now explicitly not
> >>>> handled. That would require a more complex protection, possibly needing
> >>>> SRCU instead of RCU to handle callers such as dma_fence_wait_timeout(),
> >>>> where race between dma_fence_enable_sw_signaling, signalling, and
> >>>> dereference of fence->ops->wait() would need a sleeping SRCU context.
> >>>>
> >>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
> >>>> ---
> >>>> drivers/dma-buf/dma-fence.c | 69 +++++++++++++++++++++++++++++++++++++
> >>>> include/linux/dma-fence.h | 32 ++++++++++++-----
> >>>> 2 files changed, 93 insertions(+), 8 deletions(-)
> >>>>
> >>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> >>>> index dc2456f68685..cfe1d7b79c22 100644
> >>>> --- a/drivers/dma-buf/dma-fence.c
> >>>> +++ b/drivers/dma-buf/dma-fence.c
> >>>> @@ -533,6 +533,7 @@ void dma_fence_release(struct kref *kref)
> >>>> struct dma_fence *fence =
> >>>> container_of(kref, struct dma_fence, refcount);
> >>>>
> >>>> + dma_fence_access_begin();
> >>>> trace_dma_fence_destroy(fence);
> >>>>
> >>>> if (WARN(!list_empty(&fence->cb_list) &&
> >>>> @@ -560,6 +561,8 @@ void dma_fence_release(struct kref *kref)
> >>>> fence->ops->release(fence);
> >>>> else
> >>>> dma_fence_free(fence);
> >>>> +
> >>>> + dma_fence_access_end();
> >>>> }
> >>>> EXPORT_SYMBOL(dma_fence_release);
> >>>>
> >>>> @@ -982,11 +985,13 @@ EXPORT_SYMBOL(dma_fence_set_deadline);
> >>>> */
> >>>> void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
> >>>> {
> >>>> + dma_fence_access_begin();
> >>>> seq_printf(seq, "%s %s seq %llu %ssignalled\n",
> >>>> dma_fence_driver_name(fence),
> >>>> dma_fence_timeline_name(fence),
> >>>> fence->seqno,
> >>>> dma_fence_is_signaled(fence) ? "" : "un");
> >>>> + dma_fence_access_end();
> >>>> }
> >>>> EXPORT_SYMBOL(dma_fence_describe);
> >>>>
> >>>> @@ -1033,3 +1038,67 @@ dma_fence_init64(struct dma_fence *fence, const struct dma_fence_ops *ops,
> >>>> __set_bit(DMA_FENCE_FLAG_SEQNO64_BIT, &fence->flags);
> >>>> }
> >>>> EXPORT_SYMBOL(dma_fence_init64);
> >>>> +
> >>>> +/**
> >>>> + * dma_fence_driver_name - Access the driver name
> >>>> + * @fence: the fence to query
> >>>> + *
> >>>> + * Returns a driver name backing the dma-fence implementation.
> >>>> + *
> >>>> + * IMPORTANT CONSIDERATION:
> >>>> + * Dma-fence contract stipulates that access to driver provided data (data not
> >>>> + * directly embedded into the object itself), such as the &dma_fence.lock and
> >>>> + * memory potentially accessed by the &dma_fence.ops functions, is forbidden
> >>>> + * after the fence has been signalled. Drivers are allowed to free that data,
> >>>> + * and some do.
> >>>> + *
> >>>> + * To allow safe access drivers are mandated to guarantee a RCU grace period
> >>>> + * between signalling the fence and freeing said data.
> >>>> + *
> >>>> + * As such access to the driver name is only valid inside a RCU locked section.
> >>>> + * The pointer MUST be both queried and USED ONLY WITHIN a SINGLE block guarded
> >>>> + * by the &dma_fence_access_begin and &dma_fence_access_end pair.
> >>>> + */
> >>>> +const char *dma_fence_driver_name(struct dma_fence *fence)
> >>>> +{
> >>>> + RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
> >>>> + "rcu_read_lock() required for safe access to returned string");
> >>>> +
> >>>> + if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> >>>> + return fence->ops->get_driver_name(fence);
> >>>> + else
> >>>> + return "detached-driver";
> >>>> +}
> >>>> +EXPORT_SYMBOL(dma_fence_driver_name);
> >>>> +
> >>>> +/**
> >>>> + * dma_fence_timeline_name - Access the timeline name
> >>>> + * @fence: the fence to query
> >>>> + *
> >>>> + * Returns a timeline name provided by the dma-fence implementation.
> >>>> + *
> >>>> + * IMPORTANT CONSIDERATION:
> >>>> + * Dma-fence contract stipulates that access to driver provided data (data not
> >>>> + * directly embedded into the object itself), such as the &dma_fence.lock and
> >>>> + * memory potentially accessed by the &dma_fence.ops functions, is forbidden
> >>>> + * after the fence has been signalled. Drivers are allowed to free that data,
> >>>> + * and some do.
> >>>> + *
> >>>> + * To allow safe access drivers are mandated to guarantee a RCU grace period
> >>>> + * between signalling the fence and freeing said data.
> >>>> + *
> >>>> + * As such access to the timeline name is only valid inside a RCU locked section.
> >>>> + * The pointer MUST be both queried and USED ONLY WITHIN a SINGLE block guarded
> >>>> + * by the &dma_fence_access_begin and &dma_fence_access_end pair.
> >>>> + */
> >>>> +const char *dma_fence_timeline_name(struct dma_fence *fence)
> >>>> +{
> >>>> + RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
> >>>> + "rcu_read_lock() required for safe access to returned string");
> >>>> +
> >>>> + if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> >>>> + return fence->ops->get_timeline_name(fence);
> >>>> + else
> >>>> + return "signaled-timeline";
> >>>
> >>> This means that trace_dma_fence_signaled() will get the wrong
> >>> timeline/driver name, which probably screws up perfetto and maybe
> >>> other tools.
> >>
> >> Do you think context and seqno are not enough for those tools and they
> >> actually rely on the names? It would sound weird if they decided to
> >> index anything on the names which are non-standardised between drivers,
> >> but I guess anything is possible.
> >
> > At some point perfetto uses the timeline name to put up a named fence
> > timeline, I'm not sure if it is using the name or context # for
> > subsequent fence events (namely, signalled). I'd have to check the
> > code and get back to you.
>
> If you can it would be useful. Presumably it saves the names from the
> start edge of fence lifetime. But again, who knows.
Ok, it looks like perfetto is ok... mostly..
DrmTracker::GetFenceTimelineByContext() will try to look up the
timeline by context #, and if that fails, create a new timeline
with the name from the trace event and add it to the hashmap.
So "signaled-timeline" might show up if the first event seen
for a context is the fence-signaled event.
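
To make that failure mode concrete, here is a rough userspace sketch in C of the lookup-then-create behaviour. It is not perfetto's code (perfetto is C++, and this only paraphrases the logic described above); struct tracker, get_timeline_by_context(), the fixed-size table and the name length limit are all made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_TIMELINES 64
#define NAME_LEN 64

/* Hypothetical tracer-side timeline record, keyed by fence context. */
struct timeline {
	uint64_t context;
	char name[NAME_LEN];
	int used;
};

struct tracker {
	struct timeline slots[MAX_TIMELINES];
};

/*
 * Look up a timeline by context. On a miss, create one and name it after
 * whatever string the current trace event happens to carry.
 */
static struct timeline *get_timeline_by_context(struct tracker *t,
						uint64_t context,
						const char *event_name)
{
	struct timeline *free_slot = NULL;

	assert(event_name != NULL);

	for (int i = 0; i < MAX_TIMELINES; i++) {
		struct timeline *tl = &t->slots[i];

		if (tl->used && tl->context == context)
			return tl;	/* hit: the name in this event is ignored */
		if (!tl->used && !free_slot)
			free_slot = tl;
	}

	if (!free_slot)
		return NULL;	/* table full */

	free_slot->used = 1;
	free_slot->context = context;
	snprintf(free_slot->name, sizeof(free_slot->name), "%s", event_name);
	return free_slot;
}
```

With this shape, a context that is first seen through its signalled event keeps the placeholder name for good, while a context first seen through an earlier event keeps its real name even when later events carry the placeholder.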
> > There is also gpuvis, which I guess does something similar, but
> > haven't looked into it. Idk if there are others.
>
> I know GpuVis uses DRM sched tracepoints, since Pierre-Eric was
> explaining those to me in the context of the tracing rework he did there.
> I am not sure about dma-fence tracepoints.
>
> +Pierre-Eric on the off chance you know off the top of your head how
> much GpuVis depends on them (dma-fence tracepoints).
>
> >>> Maybe it would work well enough just to move the
> >>> trace_dma_fence_signaled() call ahead of the test_and_set_bit()? Idk
> >>> if some things will start getting confused if they see that trace
> >>> multiple times.
> >>
> >> Another alternative is to make this tracepoint access the names
> >> directly. It is under the lock so guaranteed not to get freed with
> >> drivers which will be made compliant with the documented rules.
> >
> > I guess it would have been better if, other than dma_fence_init
> > tracepoint, later tracepoints didn't include the driver/timeline
> > name.. that would have forced the use of the context. But I guess too
> > late for that. Perhaps the least bad thing to do is use the locking?
>
> You mean this last alternative I mentioned? I think that will work fine.
> I'll wait a little bit longer for more potential comments before
> re-spinning with that.
yes
> Were you able to test the series for your use case? Assuming it is not
> upstream msm since I don't immediately see a path in msm_fence which
> gets freed at runtime?
Not yet, but I think it should work, because it is the exact same
problem your igt test triggers.
This is with my VM_BIND series, which will dynamically create/teardown
sched entities.
BR,
-R
On Wed, May 14, 2025 at 3:01 AM Tvrtko Ursulin
<tvrtko.ursulin(a)igalia.com> wrote:
>
>
> On 13/05/2025 15:16, Rob Clark wrote:
> > On Fri, May 9, 2025 at 8:34 AM Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com> wrote:
> >>
> >> Dma-fence objects currently suffer from a potential use after free problem
> >> where fences exported to userspace and other drivers can outlive the
> >> exporting driver, or the associated data structures.
> >>
> >> The discussion on how to address this concluded that adding reference
> >> counting to all the involved objects is not desirable, since it would need
> >> to be very wide reaching and could cause unloadable drivers if another
> >> entity would be holding onto a signaled fence reference potentially
> >> indefinitely.
> >>
> >> This patch enables the safe access by introducing and documenting a
> >> contract between fence exporters and users. It documents a set of
> >> constraints and adds helpers which a) drivers with potential to suffer from
> >> the use after free must use and b) users of the dma-fence API must use as
> >> well.
> >>
> >> Premise of the design has multiple sides:
> >>
> >> 1. Drivers (fence exporters) MUST ensure a RCU grace period between
> >> signalling a fence and freeing the driver private data associated with it.
> >>
> >> The grace period does not have to follow the signalling immediately but
> >> HAS to happen before data is freed.
> >>
> >> 2. Users of the dma-fence API marked with such requirement MUST contain
> >> the complete access to the data within a single code block guarded by the
> >> new dma_fence_access_begin() and dma_fence_access_end() helpers.
> >>
> >> The combination of the two ensures that whoever sees the
> >> DMA_FENCE_FLAG_SIGNALED_BIT not set is guaranteed to have access to a
> >> valid fence->lock and valid data potentially accessed by the fence->ops
> >> virtual functions, until the call to dma_fence_access_end().
> >>
> >> 3. Module unload (fence->ops) disappearing is for now explicitly not
> >> handled. That would require a more complex protection, possibly needing
> >> SRCU instead of RCU to handle callers such as dma_fence_wait_timeout(),
> >> where race between dma_fence_enable_sw_signaling, signalling, and
> >> dereference of fence->ops->wait() would need a sleeping SRCU context.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
> >> ---
> >> drivers/dma-buf/dma-fence.c | 69 +++++++++++++++++++++++++++++++++++++
> >> include/linux/dma-fence.h | 32 ++++++++++++-----
> >> 2 files changed, 93 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> >> index dc2456f68685..cfe1d7b79c22 100644
> >> --- a/drivers/dma-buf/dma-fence.c
> >> +++ b/drivers/dma-buf/dma-fence.c
> >> @@ -533,6 +533,7 @@ void dma_fence_release(struct kref *kref)
> >> struct dma_fence *fence =
> >> container_of(kref, struct dma_fence, refcount);
> >>
> >> + dma_fence_access_begin();
> >> trace_dma_fence_destroy(fence);
> >>
> >> if (WARN(!list_empty(&fence->cb_list) &&
> >> @@ -560,6 +561,8 @@ void dma_fence_release(struct kref *kref)
> >> fence->ops->release(fence);
> >> else
> >> dma_fence_free(fence);
> >> +
> >> + dma_fence_access_end();
> >> }
> >> EXPORT_SYMBOL(dma_fence_release);
> >>
> >> @@ -982,11 +985,13 @@ EXPORT_SYMBOL(dma_fence_set_deadline);
> >> */
> >> void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
> >> {
> >> + dma_fence_access_begin();
> >> seq_printf(seq, "%s %s seq %llu %ssignalled\n",
> >> dma_fence_driver_name(fence),
> >> dma_fence_timeline_name(fence),
> >> fence->seqno,
> >> dma_fence_is_signaled(fence) ? "" : "un");
> >> + dma_fence_access_end();
> >> }
> >> EXPORT_SYMBOL(dma_fence_describe);
> >>
> >> @@ -1033,3 +1038,67 @@ dma_fence_init64(struct dma_fence *fence, const struct dma_fence_ops *ops,
> >> __set_bit(DMA_FENCE_FLAG_SEQNO64_BIT, &fence->flags);
> >> }
> >> EXPORT_SYMBOL(dma_fence_init64);
> >> +
> >> +/**
> >> + * dma_fence_driver_name - Access the driver name
> >> + * @fence: the fence to query
> >> + *
> >> + * Returns a driver name backing the dma-fence implementation.
> >> + *
> >> + * IMPORTANT CONSIDERATION:
> >> + * Dma-fence contract stipulates that access to driver provided data (data not
> >> + * directly embedded into the object itself), such as the &dma_fence.lock and
> >> + * memory potentially accessed by the &dma_fence.ops functions, is forbidden
> >> + * after the fence has been signalled. Drivers are allowed to free that data,
> >> + * and some do.
> >> + *
> >> + * To allow safe access drivers are mandated to guarantee a RCU grace period
> >> + * between signalling the fence and freeing said data.
> >> + *
> >> + * As such access to the driver name is only valid inside a RCU locked section.
> >> + * The pointer MUST be both queried and USED ONLY WITHIN a SINGLE block guarded
> >> + * by the &dma_fence_access_begin and &dma_fence_access_end pair.
> >> + */
> >> +const char *dma_fence_driver_name(struct dma_fence *fence)
> >> +{
> >> + RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
> >> + "rcu_read_lock() required for safe access to returned string");
> >> +
> >> + if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> >> + return fence->ops->get_driver_name(fence);
> >> + else
> >> + return "detached-driver";
> >> +}
> >> +EXPORT_SYMBOL(dma_fence_driver_name);
> >> +
> >> +/**
> >> + * dma_fence_timeline_name - Access the timeline name
> >> + * @fence: the fence to query
> >> + *
> >> + * Returns a timeline name provided by the dma-fence implementation.
> >> + *
> >> + * IMPORTANT CONSIDERATION:
> >> + * Dma-fence contract stipulates that access to driver provided data (data not
> >> + * directly embedded into the object itself), such as the &dma_fence.lock and
> >> + * memory potentially accessed by the &dma_fence.ops functions, is forbidden
> >> + * after the fence has been signalled. Drivers are allowed to free that data,
> >> + * and some do.
> >> + *
> >> + * To allow safe access drivers are mandated to guarantee a RCU grace period
> >> + * between signalling the fence and freeing said data.
> >> + *
> >> + * As such access to the timeline name is only valid inside a RCU locked section.
> >> + * The pointer MUST be both queried and USED ONLY WITHIN a SINGLE block guarded
> >> + * by the &dma_fence_access_begin and &dma_fence_access_end pair.
> >> + */
> >> +const char *dma_fence_timeline_name(struct dma_fence *fence)
> >> +{
> >> + RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
> >> + "rcu_read_lock() required for safe access to returned string");
> >> +
> >> + if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> >> + return fence->ops->get_timeline_name(fence);
> >> + else
> >> + return "signaled-timeline";
> >
> > This means that trace_dma_fence_signaled() will get the wrong
> > timeline/driver name, which probably screws up perfetto and maybe
> > other tools.
>
> Do you think context and seqno are not enough for those tools and they
> actually rely on the names? It would sound weird if they decided to
> index anything on the names which are non-standardised between drivers,
> but I guess anything is possible.
At some point perfetto uses the timeline name to put up a named fence
timeline, I'm not sure if it is using the name or context # for
subsequent fence events (namely, signalled). I'd have to check the
code and get back to you.
There is also gpuvis, which I guess does something similar, but
haven't looked into it. Idk if there are others.
> > Maybe it would work well enough just to move the
> > trace_dma_fence_signaled() call ahead of the test_and_set_bit()? Idk
> > if some things will start getting confused if they see that trace
> > multiple times.
>
> Another alternative is to make this tracepoint access the names
> directly. It is under the lock so guaranteed not to get freed with
> drivers which will be made compliant with the documented rules.
I guess it would have been better if, other than dma_fence_init
tracepoint, later tracepoints didn't include the driver/timeline
name.. that would have forced the use of the context. But I guess too
late for that. Perhaps the least bad thing to do is use the locking?
BR,
-R
I'm going to push patches #1-#6 to drm-misc-next.
They make sense as standalone cleanups anyway.
But this one needs a bit more documentation, I think.
On 5/13/25 09:45, Tvrtko Ursulin wrote:
> Dma-fence objects currently suffer from a potential use after free problem
> where fences exported to userspace and other drivers can outlive the
> exporting driver, or the associated data structures.
>
> The discussion on how to address this concluded that adding reference
> counting to all the involved objects is not desirable, since it would need
> to be very wide reaching and could cause unloadable drivers if another
> entity would be holding onto a signaled fence reference potentially
> indefinitely.
>
> This patch enables the safe access by introducing and documenting a
> contract between fence exporters and users. It documents a set of
> constraints and adds helpers which a) drivers with potential to suffer from
> the use after free must use and b) users of the dma-fence API must use as
> well.
>
> Premise of the design has multiple sides:
>
> 1. Drivers (fence exporters) MUST ensure a RCU grace period between
> signalling a fence and freeing the driver private data associated with it.
That's a must have anyway, otherwise functions like dma_fence_get_rcu() won't work.
I hope that we have documented that somewhere, but I'm not 100% sure to be honest.
> The grace period does not have to follow the signalling immediately but
> HAS to happen before data is freed.
That is the new requirement we have to document somehow.
I'm not 100% sure but I think module unloading waits for an RCU grace period anyway.
> 2. Users of the dma-fence API marked with such requirement MUST contain
> the complete access to the data within a single code block guarded by the
> new dma_fence_access_begin() and dma_fence_access_end() helpers.
>
> The combination of the two ensures that whoever sees the
> DMA_FENCE_FLAG_SIGNALED_BIT not set is guaranteed to have access to a
> valid fence->lock and valid data potentially accessed by the fence->ops
> virtual functions, until the call to dma_fence_access_end().
Mhm, how about returning copies of the string?
This is only for debugging anyway and kstrdup_const() isn't that costly.
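
A minimal sketch of that idea, written as a userspace model rather than kernel code: access_begin()/access_end() stand in for dma_fence_access_begin()/dma_fence_access_end(), malloc()/memcpy() stand in for kstrdup_const(), and struct fence_model and both helper names are hypothetical. In the kernel, the allocation inside the read-side section would presumably have to be non-sleeping.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the kernel primitives; here they only track nesting. */
static int read_side_depth;
static void access_begin(void) { read_side_depth++; }
static void access_end(void) { read_side_depth--; }

/* Hypothetical, heavily reduced model of a fence. */
struct fence_model {
	int signaled;
	const char *driver_name;	/* driver-owned, may go away after signalling */
};

/* Same shape as the proposed helper: fixed string once signalled. */
static const char *fence_driver_name(const struct fence_model *f)
{
	assert(read_side_depth > 0);	/* models the RCU_LOCKDEP_WARN() check */
	return f->signaled ? "detached-driver" : f->driver_name;
}

/*
 * Query and copy the name inside one guarded block. The caller owns the
 * returned buffer and must free() it; NULL means allocation failure.
 */
static char *fence_driver_name_copy(const struct fence_model *f)
{
	const char *name;
	char *copy;
	size_t len;

	access_begin();
	name = fence_driver_name(f);
	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, name, len);
	access_end();

	return copy;
}
```

The trade-off is that callers no longer need to hold the guarded section while using the string, at the cost of an allocation and a free per query.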
Regards,
Christian.
>
> 3. Module unload (fence->ops) disappearing is for now explicitly not
> handled. That would require a more complex protection, possibly needing
> SRCU instead of RCU to handle callers such as dma_fence_wait_timeout(),
> where race between dma_fence_enable_sw_signaling, signalling, and
> dereference of fence->ops->wait() would need a sleeping SRCU context.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
> ---
> drivers/dma-buf/dma-fence.c | 69 +++++++++++++++++++++++++++++++++++++
> include/linux/dma-fence.h | 32 ++++++++++++-----
> 2 files changed, 93 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index dc2456f68685..cfe1d7b79c22 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -533,6 +533,7 @@ void dma_fence_release(struct kref *kref)
> struct dma_fence *fence =
> container_of(kref, struct dma_fence, refcount);
>
> + dma_fence_access_begin();
> trace_dma_fence_destroy(fence);
>
> if (WARN(!list_empty(&fence->cb_list) &&
> @@ -560,6 +561,8 @@ void dma_fence_release(struct kref *kref)
> fence->ops->release(fence);
> else
> dma_fence_free(fence);
> +
> + dma_fence_access_end();
> }
> EXPORT_SYMBOL(dma_fence_release);
>
> @@ -982,11 +985,13 @@ EXPORT_SYMBOL(dma_fence_set_deadline);
> */
> void dma_fence_describe(struct dma_fence *fence, struct seq_file *seq)
> {
> + dma_fence_access_begin();
> seq_printf(seq, "%s %s seq %llu %ssignalled\n",
> dma_fence_driver_name(fence),
> dma_fence_timeline_name(fence),
> fence->seqno,
> dma_fence_is_signaled(fence) ? "" : "un");
> + dma_fence_access_end();
> }
> EXPORT_SYMBOL(dma_fence_describe);
>
> @@ -1033,3 +1038,67 @@ dma_fence_init64(struct dma_fence *fence, const struct dma_fence_ops *ops,
> __set_bit(DMA_FENCE_FLAG_SEQNO64_BIT, &fence->flags);
> }
> EXPORT_SYMBOL(dma_fence_init64);
> +
> +/**
> + * dma_fence_driver_name - Access the driver name
> + * @fence: the fence to query
> + *
> + * Returns a driver name backing the dma-fence implementation.
> + *
> + * IMPORTANT CONSIDERATION:
> + * Dma-fence contract stipulates that access to driver provided data (data not
> + * directly embedded into the object itself), such as the &dma_fence.lock and
> + * memory potentially accessed by the &dma_fence.ops functions, is forbidden
> + * after the fence has been signalled. Drivers are allowed to free that data,
> + * and some do.
> + *
> + * To allow safe access drivers are mandated to guarantee a RCU grace period
> + * between signalling the fence and freeing said data.
> + *
> + * As such access to the driver name is only valid inside a RCU locked section.
> + * The pointer MUST be both queried and USED ONLY WITHIN a SINGLE block guarded
> + * by the &dma_fence_access_begin and &dma_fence_access_end pair.
> + */
> +const char *dma_fence_driver_name(struct dma_fence *fence)
> +{
> + RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
> + "rcu_read_lock() required for safe access to returned string");
> +
> + if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> + return fence->ops->get_driver_name(fence);
> + else
> + return "detached-driver";
> +}
> +EXPORT_SYMBOL(dma_fence_driver_name);
> +
> +/**
> + * dma_fence_timeline_name - Access the timeline name
> + * @fence: the fence to query
> + *
> + * Returns a timeline name provided by the dma-fence implementation.
> + *
> + * IMPORTANT CONSIDERATION:
> + * Dma-fence contract stipulates that access to driver provided data (data not
> + * directly embedded into the object itself), such as the &dma_fence.lock and
> + * memory potentially accessed by the &dma_fence.ops functions, is forbidden
> + * after the fence has been signalled. Drivers are allowed to free that data,
> + * and some do.
> + *
> + * To allow safe access drivers are mandated to guarantee a RCU grace period
> + * between signalling the fence and freeing said data.
> + *
> + * As such access to the timeline name is only valid inside a RCU locked section.
> + * The pointer MUST be both queried and USED ONLY WITHIN a SINGLE block guarded
> + * by the &dma_fence_access_begin and &dma_fence_access_end pair.
> + */
> +const char *dma_fence_timeline_name(struct dma_fence *fence)
> +{
> + RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
> + "rcu_read_lock() required for safe access to returned string");
> +
> + if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> + return fence->ops->get_timeline_name(fence);
> + else
> + return "signaled-timeline";
> +}
> +EXPORT_SYMBOL(dma_fence_timeline_name);
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index c5ac37e10d85..b39e430142ea 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -377,15 +377,31 @@ bool dma_fence_remove_callback(struct dma_fence *fence,
> struct dma_fence_cb *cb);
> void dma_fence_enable_sw_signaling(struct dma_fence *fence);
>
> -static inline const char *dma_fence_driver_name(struct dma_fence *fence)
> -{
> - return fence->ops->get_driver_name(fence);
> -}
> +/**
> + * DOC: Safe external access to driver provided object members
> + *
> + * All data not stored directly in the dma-fence object, such as the
> + * &dma_fence.lock and memory potentially accessed by functions in the
> + * &dma_fence.ops table, MUST NOT be accessed after the fence has been signalled
> + * because after that point drivers are allowed to free it.
> + *
> + * All code accessing that data via the dma-fence API (or directly, which is
> + * discouraged), MUST make sure to contain the complete access within a
> + * &dma_fence_access_begin and &dma_fence_access_end pair.
> + *
> + * Some dma-fence API handles this automatically, while other, as for example
> + * &dma_fence_driver_name and &dma_fence_timeline_name, leave that
> + * responsibility to the caller.
> + *
> + * To enable this scheme to work drivers MUST ensure a RCU grace period elapses
> + * between signalling the fence and freeing the said data.
> + *
> + */
> +#define dma_fence_access_begin rcu_read_lock
> +#define dma_fence_access_end rcu_read_unlock
>
> -static inline const char *dma_fence_timeline_name(struct dma_fence *fence)
> -{
> - return fence->ops->get_timeline_name(fence);
> -}
> +const char *dma_fence_driver_name(struct dma_fence *fence);
> +const char *dma_fence_timeline_name(struct dma_fence *fence);
>
> /**
> * dma_fence_is_signaled_locked - Return an indication if the fence
On 5/13/25 04:06, Hyejeong Choi wrote:
> smp_store_mb() inserts a memory barrier after the store operation.
> That differs from what the comment originally intended, so a NULL
> pointer dereference can happen if the memory updates are reordered.
>
> Signed-off-by: Hyejeong Choi <hjeong.choi(a)samsung.com>
I've reviewed it, added CC stable and Fixes tags, and pushed it to drm-misc-fixes.
Thanks,
Christian.
> ---
> drivers/dma-buf/dma-resv.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 5f8d010516f0..b1ef4546346d 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -320,8 +320,9 @@ void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
> count++;
>
> dma_resv_list_set(fobj, i, fence, usage);
> - /* pointer update must be visible before we extend the num_fences */
> - smp_store_mb(fobj->num_fences, count);
> + /* fence update must be visible before we extend the num_fences */
> + smp_wmb();
> + fobj->num_fences = count;
> }
> EXPORT_SYMBOL(dma_resv_add_fence);
>
>
>
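
For readers who want the ordering spelled out, below is a small userspace sketch of the same publish pattern using C11 atomics. It is only an analogy: struct fence_list, list_add() and list_get() are hypothetical names, a single writer is assumed (in the patch the caller holds the reservation lock), and the release fence plays the role that smp_wmb() plays in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_FENCES 8

/* Hypothetical model of a fence array with a published element count. */
struct fence_list {
	void *slots[MAX_FENCES];
	atomic_uint num_fences;
};

/* Writer (single writer assumed): fill the slot, then publish the count. */
static int list_add(struct fence_list *l, void *fence)
{
	unsigned int n = atomic_load_explicit(&l->num_fences,
					      memory_order_relaxed);

	if (n >= MAX_FENCES)
		return -1;

	l->slots[n] = fence;
	/* Order the slot store before the count store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&l->num_fences, n + 1, memory_order_relaxed);
	return 0;
}

/* Reader: every slot below the observed count is fully written. */
static void *list_get(struct fence_list *l, unsigned int i)
{
	unsigned int n = atomic_load_explicit(&l->num_fences,
					      memory_order_acquire);

	assert(n <= MAX_FENCES);
	return i < n ? l->slots[i] : NULL;
}
```

Putting the barrier after the count store, which is what smp_store_mb() does, leaves the slot store and the count store unordered with respect to each other, so a reader could observe the larger count before the slot contents.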