On 26/02/24 09:50, Shawn Sung wrote:
> From: Hsiao Chien Sung <shawn.sung(a)mediatek.corp-partner.google.com>
>
> Rename functions of mtk_ddp_comp:
> - To align the naming rule
> - To reduce the code size
>
> Signed-off-by: Hsiao Chien Sung <shawn.sung(a)mediatek.corp-partner.google.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
On Thu, Feb 22, 2024 at 05:33:48PM +0100, Marco Pagani wrote:
> >
> > In this context, the TTM unit tests fail as well in qemu, with worse results:
> > It seems there is some bad cleanup after a failed test case, causing list
> > corruptions in the drm core and ultimately a crash. I don't know if this
> > is also caused by the missing dma_mask initialization.
> >
>
> That's interesting. Which --arch argument are you using to run the
> tests with QEMU?
Example (I am not sure if any of those parameters matters; it is just one
of my tests):
qemu-system-x86_64 -kernel arch/x86/boot/bzImage -M q35 -cpu IvyBridge \
-no-reboot -snapshot -smp 2 \
-device e1000,netdev=net0 -netdev user,id=net0 -m 512 \
-drive file=rootfs.ext2,format=raw,if=ide \
--append "earlycon=uart8250,io,0x3f8,9600n8 root=/dev/sda1 console=ttyS0" \
-d unimp,guest_errors -nographic -monitor none
This results in:
[ ... ]
[ 5.989496] KTAP version 1
[ 5.989639] # Subtest: ttm_device
[ 5.989711] # module: ttm_device_test
[ 5.989760] 1..5
[ 6.002044] ok 1 ttm_device_init_basic
[ 6.013557] ok 2 ttm_device_init_multiple
ILLOPC: ffffffffb8ac9350: 0f 0b
[ 6.022481] ok 3 ttm_device_fini_basic
[ 6.026172] ------------[ cut here ]------------
[ 6.026315] WARNING: CPU: 1 PID: 1575 at drivers/gpu/drm/ttm/ttm_device.c:206 ttm_device_init+0x170/0x190
...
[ 6.135016] ok 3 Above the allocation limit
[ 6.138759] ------------[ cut here ]------------
[ 6.138925] WARNING: CPU: 1 PID: 1595 at kernel/dma/mapping.c:503 dma_alloc_attrs+0xf6/0x100
...
[ 6.143850] # ttm_pool_alloc_basic: ASSERTION FAILED at drivers/gpu/drm/ttm/tests/ttm_pool_test.c:162
[ 6.143850] Expected err == 0, but
[ 6.143850] err == -12 (0xfffffffffffffff4)
[ 6.148824] not ok 4 One page, with coherent DMA mappings enabled
From there things go downhill.
[ 6.152821] list_add corruption. prev->next should be next (ffffffffbbd53950), but was 0000000000000000. (prev=ffff8af1c38f9e20).
and so on until the emulation crashes.
Guenter
Hi Marco,
On 2/22/24 07:32, Marco Pagani wrote:
>
>
> On 2024-02-18 16:49, Guenter Roeck wrote:
>> Hi,
>>
>> On Thu, Nov 30, 2023 at 06:14:16PM +0100, Marco Pagani wrote:
>>> This patch introduces an initial KUnit test suite for GEM objects
>>> backed by shmem buffers.
>>>
>>> Suggested-by: Javier Martinez Canillas <javierm(a)redhat.com>
>>> Signed-off-by: Marco Pagani <marpagan(a)redhat.com>
>>
>> When running this in qemu, I get lots of warning backtraces in the drm
>> core.
>>
>> WARNING: CPU: 0 PID: 1341 at drivers/gpu/drm/drm_gem_shmem_helper.c:327
>> WARNING: CPU: 0 PID: 1341 at drivers/gpu/drm/drm_gem_shmem_helper.c:173
>> WARNING: CPU: 0 PID: 1341 at drivers/gpu/drm/drm_gem_shmem_helper.c:385
>> WARNING: CPU: 0 PID: 1341 at drivers/gpu/drm/drm_gem_shmem_helper.c:211
>> WARNING: CPU: 0 PID: 1345 at kernel/dma/mapping.c:194
>> WARNING: CPU: 0 PID: 1347 at drivers/gpu/drm/drm_gem_shmem_helper.c:429
>> WARNING: CPU: 0 PID: 1349 at drivers/gpu/drm/drm_gem_shmem_helper.c:445
>>
>> It looks like dma_resv_assert_held() asserts each time it is executed.
>> The backtrace in kernel/dma/mapping.c is triggered by
>> if (WARN_ON_ONCE(!dev->dma_mask))
>> return 0;
>> in __dma_map_sg_attrs().
>>
>> Is this a possible problem in the test code, or can it be caused by
>> some limitations or bugs in the qemu emulation ? If so, do you have any
>> thoughts or ideas what those limitations / bugs might be ?
>
> Hi Guenter,
>
> Thanks for reporting this issue. As you correctly noted, the warnings appear to
> be caused by the dma_mask in the mock device being uninitialized. I'll send a
> patch to fix it soon.
>
Thanks a lot for the update.
In this context, the TTM unit tests fail as well in qemu, with worse results:
It seems there is some bad cleanup after a failed test case, causing list
corruptions in the drm core and ultimately a crash. I don't know if this
is also caused by the missing dma_mask initialization.
Thanks,
Guenter
Hi,
On Thu, Nov 30, 2023 at 06:14:16PM +0100, Marco Pagani wrote:
> This patch introduces an initial KUnit test suite for GEM objects
> backed by shmem buffers.
>
> Suggested-by: Javier Martinez Canillas <javierm(a)redhat.com>
> Signed-off-by: Marco Pagani <marpagan(a)redhat.com>
When running this in qemu, I get lots of warning backtraces in the drm
core.
WARNING: CPU: 0 PID: 1341 at drivers/gpu/drm/drm_gem_shmem_helper.c:327
WARNING: CPU: 0 PID: 1341 at drivers/gpu/drm/drm_gem_shmem_helper.c:173
WARNING: CPU: 0 PID: 1341 at drivers/gpu/drm/drm_gem_shmem_helper.c:385
WARNING: CPU: 0 PID: 1341 at drivers/gpu/drm/drm_gem_shmem_helper.c:211
WARNING: CPU: 0 PID: 1345 at kernel/dma/mapping.c:194
WARNING: CPU: 0 PID: 1347 at drivers/gpu/drm/drm_gem_shmem_helper.c:429
WARNING: CPU: 0 PID: 1349 at drivers/gpu/drm/drm_gem_shmem_helper.c:445
It looks like dma_resv_assert_held() asserts each time it is executed.
The backtrace in kernel/dma/mapping.c is triggered by
if (WARN_ON_ONCE(!dev->dma_mask))
return 0;
in __dma_map_sg_attrs().
Is this a possible problem in the test code, or can it be caused by
some limitations or bugs in the qemu emulation ? If so, do you have any
thoughts or ideas what those limitations / bugs might be ?
Thanks,
Guenter
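For readers following the thread, the dma_mask warning described above can
be modelled outside the kernel. The sketch below is a userspace stand-in,
not kernel code: struct mock_device, mock_dma_map_sg() and
mock_coerce_mask() are hypothetical names invented for illustration. It
only mirrors the guard quoted from __dma_map_sg_attrs() and shows that
pointing dma_mask at a valid mask lets the mapping path proceed. In the
kernel, helpers such as dma_coerce_mask_and_coherent() can set the mask for
a device that has none; the thread does not say which approach the eventual
fix takes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the two struct device fields that matter here. */
struct mock_device {
	uint64_t *dma_mask;         /* NULL until a driver or test sets it */
	uint64_t coherent_dma_mask;
};

/*
 * Models the guard quoted in the report: with no dma_mask the call warns
 * and maps nothing. Unlike WARN_ON_ONCE(), this model warns on every call.
 */
int mock_dma_map_sg(struct mock_device *dev, int nents)
{
	if (!dev->dma_mask) {
		fprintf(stderr, "WARNING: device has no dma_mask\n");
		return 0;
	}
	return nents;
}

/* Points dma_mask at the coherent mask and stores the requested value. */
void mock_coerce_mask(struct mock_device *dev, uint64_t mask)
{
	dev->coherent_dma_mask = mask;
	dev->dma_mask = &dev->coherent_dma_mask;
}
```

With a zero-initialised mock device, mock_dma_map_sg() returns 0 entries;
after mock_coerce_mask(&dev, UINT64_MAX) the same call returns the number
of entries passed in.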
They rather fundamentally break the entire concept of zero copy, so if
an exporter manages to hand these out things will break all over.
Luckily there are not too many cases that use
swiotlb_sync_single_for_device/cpu():
- The generic iommu dma-api code in drivers/iommu/dma-iommu.c. We can
catch that with sg_dma_is_swiotlb() reliably.
- The generic direct dma code in kernel/dma/direct.c. We can mostly
catch that by looking for a NULL dma_ops, except for some powerpc
special cases.
- Xen, which I don't bother to catch here.
Implement these checks in dma_buf_map_attachment when
CONFIG_DMA_API_DEBUG is enabled.
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: Paul Cercueil <paul(a)crapouillou.net>
---
Entirely untested, but since I sent the mail with the idea I figured I
might as well type it up after I realized there's a lot fewer cases to
check. That is, if I haven't completely misread the dma-api and swiotlb
code.
-Sima
---
drivers/dma-buf/dma-buf.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d1e7f823fbdb..d6f95523f995 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -28,6 +28,12 @@
#include <linux/mount.h>
#include <linux/pseudo_fs.h>
+#ifdef CONFIG_DMA_API_DEBUG
+#include <linux/dma-direct.h>
+#include <linux/dma-map-ops.h>
+#include <linux/swiotlb.h>
+#endif
+
#include <uapi/linux/dma-buf.h>
#include <uapi/linux/magic.h>
@@ -1149,10 +1155,13 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
#ifdef CONFIG_DMA_API_DEBUG
if (!IS_ERR(sg_table)) {
struct scatterlist *sg;
+ struct device *dev = attach->dev;
u64 addr;
int len;
int i;
+ bool is_direct_dma = !get_dma_ops(dev);
+
for_each_sgtable_dma_sg(sg_table, sg, i) {
addr = sg_dma_address(sg);
len = sg_dma_len(sg);
@@ -1160,7 +1169,15 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
pr_debug("%s: addr %llx or len %x is not page aligned!\n",
__func__, addr, len);
}
+
+ if (is_direct_dma) {
+ phys_addr_t paddr = dma_to_phys(dev, addr);
+
+ WARN_ON_ONCE(is_swiotlb_buffer(dev, paddr));
+ }
}
+
+ WARN_ON_ONCE(sg_dma_is_swiotlb(sg));
}
#endif /* CONFIG_DMA_API_DEBUG */
return sg_table;
--
2.43.0
Hi,
This is the v5 of my patchset that adds a new DMABUF import interface to
FunctionFS.
Daniel / Sima suggested that I should cache the dma_buf_attachment while
the DMABUF is attached to the interface, instead of mapping/unmapping
the DMABUF for every transfer (also because unmapping is not possible in
the dma_fence's critical section). This meant having to add new
dma_buf_begin_access() / dma_buf_end_access() functions that the driver
can call to ensure cache coherency. These two functions are provided by
the new patch [1/6], and an implementation for udmabuf was added in
[2/6] - see the changelog below.
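To make the caching scheme above concrete, here is a minimal userspace
model of it. This is only a sketch under stated assumptions: every name
(struct mock_attachment, mock_attach(), mock_begin_access(), and so on) is
hypothetical and is not the API added by this series. It shows the intended
shape only: map once at attach time, then bracket each transfer with a
begin/end pair that does cache maintenance but no mapping work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are the kernel API. */
struct mock_attachment {
	bool mapped;     /* mapping is cached for the life of the attachment */
	bool in_access;  /* true between begin_access and end_access */
	int map_calls;   /* number of times the buffer was mapped */
	int sync_calls;  /* number of cache-maintenance operations */
};

/* Attach: the only place where the buffer gets mapped. */
void mock_attach(struct mock_attachment *a)
{
	a->mapped = true;
	a->map_calls++;
}

/* Begin access: cache maintenance only, the cached mapping is reused. */
int mock_begin_access(struct mock_attachment *a)
{
	if (!a->mapped || a->in_access)
		return -1;
	a->in_access = true;
	a->sync_calls++;
	return 0;
}

/* End access: hand the buffer back after the hardware is done. */
int mock_end_access(struct mock_attachment *a)
{
	if (!a->in_access)
		return -1;
	a->in_access = false;
	a->sync_calls++;
	return 0;
}

/* One transfer: hardware access is bracketed by begin/end. */
int mock_transfer(struct mock_attachment *a)
{
	if (mock_begin_access(a))
		return -1;
	/* ... the hardware would read or write the buffer here ... */
	return mock_end_access(a);
}

/* Detach: the only place where the cached mapping is dropped. */
void mock_detach(struct mock_attachment *a)
{
	a->mapped = false;
}
```

In this model, three transfers after a single attach leave map_calls at 1
and sync_calls at 6, which is the point of caching the attachment.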
This patchset was successfully tested with CONFIG_LOCKDEP, no errors
were reported in dmesg while using the interface.
This interface is being used at Analog Devices, to transfer data from
high-speed transceivers to USB in a zero-copy fashion, also using the
DMABUF import interface to the IIO subsystem which is being upstreamed
in parallel [1]. The two are used by the Libiio software [2].
On a ZCU102 board with a FMComms3 daughter board, using the combination
of these two new interfaces yields a drastic improvement of the
throughput, from about 127 MiB/s using IIO's buffer read/write interface
+ read/write to the FunctionFS endpoints, to about 274 MiB/s when
passing around DMABUFs, with lower CPU usage (0.85 load avg. before,
vs. 0.65 after).
Right now, *technically* there are no users of this interface, as
Analog Devices wants to wait until both interfaces are accepted upstream
to merge the DMABUF code in Libiio into the main branch, and Jonathan
wants to wait and see if this patchset is accepted to greenlight the
DMABUF interface in IIO as well. I think this isn't really a problem;
once everybody is happy with its part of the cake, we can merge them all
at once.
This is obviously for 6.9, and based on next-20240119.
Changelog:
- [1/6]: New patch
- [2/6]: New patch
- [5/6]:
- Cache the dma_buf_attachment while the DMABUF is attached.
- Use dma_buf_begin/end_access() to ensure that the DMABUF data will be
coherent to the hardware.
- Remove comment about cache-management and dma_buf_unmap_attachment(),
since we now use dma_buf_begin/end_access().
- Select DMA_SHARED_BUFFER in Kconfig entry
- Add Christian's ACK
Cheers,
-Paul
[1] https://lore.kernel.org/linux-iio/219abc43b4fdd4a13b307ed2efaa0e6869e68e3f.…
[2] https://github.com/analogdevicesinc/libiio/tree/pcercuei/dev-new-dmabuf-api
Paul Cercueil (6):
dma-buf: Add dma_buf_{begin,end}_access()
dma-buf: udmabuf: Implement .{begin,end}_access
usb: gadget: Support already-mapped DMA SGs
usb: gadget: functionfs: Factorize wait-for-endpoint code
usb: gadget: functionfs: Add DMABUF import interface
Documentation: usb: Document FunctionFS DMABUF API
Documentation/usb/functionfs.rst | 36 ++
drivers/dma-buf/dma-buf.c | 66 ++++
drivers/dma-buf/udmabuf.c | 27 ++
drivers/usb/gadget/Kconfig | 1 +
drivers/usb/gadget/function/f_fs.c | 502 ++++++++++++++++++++++++++--
drivers/usb/gadget/udc/core.c | 7 +-
include/linux/dma-buf.h | 37 ++
include/linux/usb/gadget.h | 2 +
include/uapi/linux/usb/functionfs.h | 41 +++
9 files changed, 698 insertions(+), 21 deletions(-)
--
2.43.0
On Tue, 6 Feb 2024 09:45:18 +0530 Anshuman Khandual <anshuman.khandual(a)arm.com> wrote:
> cma_get_name() just returns cma->name without any additional transformation
> unlike other helpers such as cma_get_base() and cma_get_size(). This helper
> is not worth the additional indirection, and can be dropped after replacing
> directly with cma->name in the sole caller __add_cma_heap().
drivers/dma-buf/heaps/cma_heap.c: In function '__add_cma_heap':
drivers/dma-buf/heaps/cma_heap.c:379:28: error: invalid use of undefined type 'struct cma'
379 | exp_info.name = cma->name;
| ^~
Fixing this would require moving the `struct cma' definition into
cma.h. I don't think that's worthwhile.
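The build error above is what C reports when code dereferences a type it
only knows as a forward declaration. A minimal single-file sketch of the
same situation follows. It is a simplified model, not the kernel sources:
the field layout of struct cma and the demo_cma() helper are assumptions
made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* What a public header would expose: an opaque type plus an accessor. */
struct cma;
const char *cma_get_name(const struct cma *cma);

/*
 * Code that sees only the declarations above. It must use the accessor;
 * writing cma->name here would fail with "invalid use of undefined type".
 */
int heap_name_is(const struct cma *cma, const char *expected)
{
	return strcmp(cma_get_name(cma), expected) == 0;
}

/* Private definition, visible only to the implementation (assumed layout). */
struct cma {
	char name[64];
	unsigned long base_pfn;
	unsigned long count;
};

const char *cma_get_name(const struct cma *cma)
{
	return cma->name;
}

/* Hypothetical helper that hands out one statically defined area. */
static struct cma demo_area = {
	.name = "reserved",
	.base_pfn = 0x80000,
	.count = 4096,
};

const struct cma *demo_cma(void)
{
	return &demo_area;
}
```

Dropping the accessor therefore means exposing the full struct definition
to every caller, which is the trade-off weighed in the reply above.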
From: Jason-jh Lin <jason-jh.lin(a)mediatek.corp-partner.google.com>
Memory Definitions:
secure memory - Memory allocated in the TEE (Trusted Execution
Environment) which is inaccessible in the REE (Rich Execution
Environment, i.e. linux kernel/userspace).
secure handle - Integer value which acts as reference to 'secure
memory'. Used in communication between TEE and REE to reference
'secure memory'.
secure buffer - 'secure memory' that is used to store decrypted,
compressed video or for other general purposes in the TEE.
secure surface - 'secure memory' that is used to store graphic buffers.
Memory Usage in SVP:
The overall flow of SVP starts with encrypted video coming in from an
outside source into the REE. The REE will then allocate a 'secure
buffer' and send the corresponding 'secure handle' along with the
encrypted, compressed video data to the TEE. The TEE will then decrypt
the video and store the result in the 'secure buffer'. The REE will
then allocate a 'secure surface'. The REE will pass the 'secure
handles' for both the 'secure buffer' and 'secure surface' into the
TEE for video decoding. The video decoder HW will then decode the
contents of the 'secure buffer' and place the result in the 'secure
surface'. The REE will then attach the 'secure surface' to the overlay
plane for rendering of the video.
Everything relating to ensuring security of the actual contents of the
'secure buffer' and 'secure surface' is out of scope for the REE and
is the responsibility of the TEE.
The DRM driver handles allocation of GEM objects that are backed by a
'secure surface' and displays a 'secure surface' on the overlay plane.
This introduces a new flag for object creation called
DRM_MTK_GEM_CREATE_ENCRYPTED which indicates it should be a 'secure
surface'. All changes here are in MediaTek specific code.
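The SVP flow described above can be summarised as an ordered list of steps,
each owned by either the REE or the TEE. The sketch below is only an
illustration of that ordering: the enum names and the svp_advance() helper
are hypothetical and do not appear in this series.

```c
#include <assert.h>

enum svp_world { SVP_REE, SVP_TEE };

/* Steps in the order given in the description above. */
enum svp_step {
	SVP_RECEIVE_ENCRYPTED,    /* REE: encrypted video arrives */
	SVP_ALLOC_SECURE_BUFFER,  /* REE: allocate buffer, send handle + data */
	SVP_DECRYPT,              /* TEE: decrypt into the 'secure buffer' */
	SVP_ALLOC_SECURE_SURFACE, /* REE: allocate the 'secure surface' */
	SVP_PASS_HANDLES,         /* REE: pass both handles to the TEE */
	SVP_DECODE,               /* TEE: decoder HW fills the 'secure surface' */
	SVP_ATTACH_OVERLAY,       /* REE: attach surface to the overlay plane */
	SVP_DONE
};

/* Which world performs a step; anything not listed runs in the REE. */
enum svp_world svp_step_world(enum svp_step step)
{
	switch (step) {
	case SVP_DECRYPT:
	case SVP_DECODE:
		return SVP_TEE;
	default:
		return SVP_REE;
	}
}

/* Accept a step only if it is the next one in sequence; 0 on success. */
int svp_advance(enum svp_step *next, enum svp_step requested)
{
	if (*next == SVP_DONE || requested != *next)
		return -1;
	*next = (enum svp_step)(requested + 1);
	return 0;
}
```

Note that the REE only ever holds 'secure handles' in this flow; the two
steps that touch the actual contents are the ones assigned to the TEE.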
---
TODO:
1) Remove get sec larb port interface in ddp_comp, ovl and ovl_adaptor.
2) Verify instruction for enabling/disabling dapc and larb port in TEE
and drop the sec_engine flags in normal world.
3) Move DISP_REG_OVL_SECURE setting to secure world for mtk_disp_ovl.c.
4) Change the parameter register address in mtk_ddp_sec_write()
from "u32 addr" to "struct cmdq_client_reg *cmdq_reg".
5) Implement setting mmsys routing table in the secure world series.
---
Based on 5 series and 1 patch:
[1] v3 dma-buf: heaps: Add MediaTek secure heap
- https://patchwork.kernel.org/project/linux-mediatek/list/?series=809023
[2] v3 add driver to support secure video decoder
- https://patchwork.kernel.org/project/linux-mediatek/list/?series=807308
[3] v4 soc: mediatek: Add register definitions for GCE
- https://patchwork.kernel.org/project/linux-mediatek/patch/20231212121957.19…
[4] v2 Add CMDQ driver support for mt8188
- https://patchwork.kernel.org/project/linux-mediatek/list/?series=810302
[5] Add mediatek,gce-events definition to mediatek,gce-mailbox bindings
- https://patchwork.kernel.org/project/linux-mediatek/list/?series=810938
[6] v3 Add CMDQ secure driver for SVP
- https://patchwork.kernel.org/project/linux-mediatek/list/?series=812379
---
Change in v3:
1. fix kerneldoc problems
2. fix typo in title and commit message
3. adjust naming for secure variable
4. add the missing part for is_secure plane implementation
5. use BIT_ULL macro to replace bit shifting
6. move modification of ovl_adaptor part to the correct patch
7. add TODO list in commit message
8. add commit message for using shared memory to store execute count
Change in v2:
1. remove the DRIVER_RENDER flag for mtk_drm_ioctl
2. move cmdq_insert_backup_cookie into client driver
3. move secure gce node define from mt8195-cherry.dtsi to mt8195.dtsi
---
CK Hu (1):
drm/mediatek: Add interface to allocate MediaTek GEM buffer.
Jason-JH.Lin (10):
drm/mediatek/uapi: Add DRM_MTK_GEM_CREATE_ENCRYPTED flag
drm/mediatek: Add secure buffer control flow to mtk_drm_gem
drm/mediatek: Add secure identify flag and funcution to mtk_drm_plane
drm/mediatek: Add mtk_ddp_sec_write to config secure buffer info
drm/mediatek: Add get_sec_port interface to mtk_ddp_comp
drm/mediatek: Add secure layer config support for ovl
drm/mediatek: Add secure layer config support for ovl_adaptor
drm/mediatek: Add secure flow support to mediatek-drm
drm/mediatek: Add cmdq_insert_backup_cookie before secure pkt finalize
arm64: dts: mt8195: Add secure mbox settings for vdosys
arch/arm64/boot/dts/mediatek/mt8195.dtsi | 6 +-
drivers/gpu/drm/mediatek/mtk_disp_drv.h | 3 +
drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 31 +-
.../gpu/drm/mediatek/mtk_disp_ovl_adaptor.c | 15 +
drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 274 +++++++++++++++++-
drivers/gpu/drm/mediatek/mtk_drm_crtc.h | 1 +
drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 30 ++
drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 14 +
drivers/gpu/drm/mediatek/mtk_drm_drv.c | 13 +
drivers/gpu/drm/mediatek/mtk_drm_gem.c | 122 ++++++++
drivers/gpu/drm/mediatek/mtk_drm_gem.h | 16 +
drivers/gpu/drm/mediatek/mtk_drm_plane.c | 26 ++
drivers/gpu/drm/mediatek/mtk_drm_plane.h | 2 +
drivers/gpu/drm/mediatek/mtk_mdp_rdma.c | 11 +-
drivers/gpu/drm/mediatek/mtk_mdp_rdma.h | 2 +
include/uapi/drm/mediatek_drm.h | 59 ++++
16 files changed, 607 insertions(+), 18 deletions(-)
create mode 100644 include/uapi/drm/mediatek_drm.h
--
2.18.0
DMA buffers allocated from the CMA dma-buf heap get counted under
RssFile for processes that map them and trigger page faults. In
addition to the incorrect accounting reported to userspace, reclaim
behavior was influenced by the MM_FILEPAGES counter until Linux 6.8, but
this memory is not reclaimable. [1] Change the CMA dma-buf heap to set
VM_PFNMAP on the VMA so MM does not poke at the memory managed by this
dma-buf heap, and use vmf_insert_pfn to correct the RSS accounting.
The system dma-buf heap does not suffer from this issue since
remap_pfn_range is used during the mmap of the buffer, which also sets
VM_PFNMAP on the VMA.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/m…
Fixes: b61614ec318a ("dma-buf: heaps: Add CMA heap to dmabuf heaps")
Signed-off-by: T.J. Mercier <tjmercier(a)google.com>
---
drivers/dma-buf/heaps/cma_heap.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index ee899f8e6721..4a63567e93ba 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -168,10 +168,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf)
if (vmf->pgoff > buffer->pagecount)
return VM_FAULT_SIGBUS;
- vmf->page = buffer->pages[vmf->pgoff];
- get_page(vmf->page);
-
- return 0;
+ return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff]));
}
static const struct vm_operations_struct dma_heap_vm_ops = {
@@ -185,6 +182,8 @@ static int cma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
return -EINVAL;
+ vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
+
vma->vm_ops = &dma_heap_vm_ops;
vma->vm_private_data = buffer;
--
2.43.0.381.gb435a96ce8-goog
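The RssFile accounting discussed in the commit message above is visible to
userspace through /proc/<pid>/status. As a rough way to observe it, the
sketch below reads one "Key: value kB" field from such a file. The helper
name status_field_kb() is made up for illustration; the sketch assumes the
usual status layout and is not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the numeric value of a "Key:   <n> kB" line from an already
 * opened status-style stream, or -1 if the key is missing or malformed.
 */
long status_field_kb(FILE *f, const char *key)
{
	char line[256];
	size_t klen = strlen(key);
	long val = -1;

	while (fgets(line, sizeof(line), f)) {
		/* Require the colon so "RssFile" does not match a longer key. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			if (sscanf(line + klen + 1, "%ld", &val) != 1)
				val = -1;
			break;
		}
	}
	return val;
}
```

A caller would fopen("/proc/self/status", "r"), pass the stream with the
key "RssFile", and compare the value before and after touching a mapping
from the CMA heap.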