Hi
Am 21.02.25 um 16:51 schrieb andriy.shevchenko(a)linux.intel.com:
> On Fri, Feb 21, 2025 at 11:36:00AM +0000, Aditya Garg wrote:
>> From: Kerem Karabay <kekrby(a)gmail.com>
>>
>> Add XRGB8888 emulation helper for devices that only support BGR888.
> ...
>
>> + for (x = 0; x < pixels; x++) {
>> + pix = le32_to_cpu(sbuf32[x]);
>> + /* write red-green-blue to output in little endianness */
>> + *dbuf8++ = (pix & 0x00ff0000) >> 16;
>> + *dbuf8++ = (pix & 0x0000ff00) >> 8;
>> + *dbuf8++ = (pix & 0x000000ff) >> 0;
> put_unaligned_be24()
I'm all for sharing helper code, but maybe not here.
- DRM pixel formats are always little endian.
- CPU encoding is LE or BE.
- Pixel-component order can be anything: RGB/BGR/etc.
So the code has a comment to explain what happens here. Adding that call
with the _be24 postfix into the mix would make it more confusing.
>
>> + }
> ...
>
>> + static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = {
>> + 3,
>> + };
> One line?
>
> static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 3 };
I'd be ok, if there's a string preference in the kernel to use thins
style. Most of DRM doesn't AFAIK.
Best regards
Thomas
>
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
Hi
Am 22.02.25 um 10:07 schrieb Aditya Garg:
>> What padding, please? Why TCP UAPI headers do not have these attributes?
>> Think about it, and think about what actually __packed does and how it affects
>> (badly) the code generation. Otherwise it looks like a cargo cult.
>>
>>> I tried removing __packed btw and driver no longer works.
>> So, you need to find a justification why. But definitely not due to padding in
>> many of them. They can go without __packed as they are naturally aligned.
> Alright, I did some debugging, basically printk sizeof(struct). Did it for both packed and unpacked with the following results:
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_request_header is 16
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_request_header_unpacked is 16
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_response_header is 20
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_response_header_unpacked is 20
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_simple_request is 32
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_simple_request_unpacked is 32
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_information is 65
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_information_unpacked is 68
In the unpacked version, there is a 3-byte gap after the
'bits_per_pixel' to align the next field. Using __packed removes those
gaps at the expense of runtime overhead.
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_frame is 12
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_frame_unpacked is 12
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_footer is 80
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_footer_unpacked is 80
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request is 48
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_unpacked is 48
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_response is 40
> Feb 22 13:02:04 MacBook kernel: size of struct appletbdrm_fb_request_response_unpacked is 40
>
> So, the difference in sizeof in unpacked and packed is only in appletbdrm_msg_information. So, I kept this packed, and removed it from others. The Touch Bar still works.
>
> So maybe keep just this packed?
The fields in the TCP header are aligned by design. Unfortunately, this
hardware's protocol is not. And there's no way of fixing this now. Just
keep all of them packed if you want. At least it's clear then what
happens. And if your hardware requires this, you can't do much anyway.
Best regards
Thomas
>>
>>
>> ...
>>
>>>>> + if (response->msg == APPLETBDRM_MSG_SIGNAL_READINESS) {
>>>>> + if (!readiness_signal_received) {
>>>>> + readiness_signal_received = true;
>>>>> + goto retry;
>>>>> + }
>>>>> +
>>>>> + drm_err(drm, "Encountered unexpected readiness signal\n");
>>>>> + return -EIO;
>>>>> + }
>>>>> +
>>>>> + if (actual_size != size) {
>>>>> + drm_err(drm, "Actual size (%d) doesn't match expected size (%lu)\n",
>>>>> + actual_size, size);
>>>>> + return -EIO;
>>>>> + }
>>>>> +
>>>>> + if (response->msg != expected_response) {
>>>>> + drm_err(drm, "Unexpected response from device (expected %p4ch found %p4ch)\n",
>>>>> + &expected_response, &response->msg);
>>>>> + return -EIO;
>>>> For three different cases the same error code, can it be adjusted more to the
>>>> situation?
>>> All these are I/O errors, you got any suggestion?
>> Your email client mangled the code so badly that it's hard to read. But I would
>> suggest to use -EINTR in the first case, and -EBADMSG. But also you may consider
>> -EPROTO.
> Thanks
>>>>> + }
>> ...
>>
>>>>> + if (ret)
>>>>> + return ret;
>>>>> + else if (!new_plane_state->visible)
>>>> Why 'else'? It's redundant.
>>> I’ve just followed what other drm drivers are doing here:
>>>
>>> https://elixir.bootlin.com/linux/v6.13.3/source/drivers/gpu/drm/tiny/bochs.…
>>> https://elixir.bootlin.com/linux/v6.13.3/source/drivers/gpu/drm/tiny/cirrus…
>>>
>>> And plenty more
>> A bad example is still a bad example. 'else' is simply redundant in this
>> case and add a noisy to the code.
>>
>>> I won’t mind removing else. You want that?
>> Sure.
>>
>> ...
>>
>>>>> + request_size = ALIGN(sizeof(struct appletbdrm_fb_request) +
>>>>> + frames_size +
>>>>> + sizeof(struct appletbdrm_fb_request_footer), 16);
>>>> Missing header for ALIGN().
>>>>
>>>> But have you checked overflow.h for the possibility of using some helper macros
>>>> from there? This is what should be usually done for k*alloc() in the kernel.
>>> I don’t really think we need a macro here.
>> Hmm... is frames_size known to be in a guaranteed range to make sure no
>> potential overflow happens?
> I don’t really find any cause of potential overflow.
>
>
>>>>> + appletbdrm_state->request = kzalloc(request_size, GFP_KERNEL);
>>>>> +
>>>>> + if (!appletbdrm_state->request)
>>>>> + return -ENOMEM;
>> ...
>>
>>>>> + request->msg_id = timestamp & 0xff;
>>>> Why ' & 0xff'?
>>> https://github.com/imbushuo/DFRDisplayKm/blob/master/src/DFRDisplayKm/DfrDi…
>> This is not an answer.
>> Why do you need this here? Isn't the type of msg_id enough?
> Hmm, I double checked this. msg_id is u8 in the Linux port so would anyways never exceed 0xff. I’ll remove this.
> Its different in the Windows driver.
>> ...
>>
>>>>> + adev->mode = (struct drm_display_mode) {
>>>> Why do you need a compound literal here? Perhaps you want to have that to be
>>>> done directly in DRM_MODE_INIT()?
>>> I really don’t find this as an issue. You want me to declare another structure, basically this?:
>> Nope, I'm asking if the DRM_MODE_INIT() is done in a way that it only can be
>> used for the static data. Seems like the case. Have you tried to convert
>> DRM_MODE_INIT() to be always a compound literal? Does it break things?
> Seems to be breaking things.
>>> struct drm_display_mode mode = {
>>> DRM_MODE_INIT(60, adev->height, adev->width,
>>> DRM_MODE_RES_MM(adev->height, 218),
>>> DRM_MODE_RES_MM(adev->width, 218))
>>> };
>>> adev->mode = mode;
>>>
>>>>> + DRM_MODE_INIT(60, adev->height, adev->width,
>>>>> + DRM_MODE_RES_MM(adev->height, 218),
>>>>> + DRM_MODE_RES_MM(adev->width, 218))
>>>>> + };
>> --
>> With Best Regards,
>> Andy Shevchenko
>>
>>
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
Hi,
[Now with a Gstreamer demo, see below.]
This patch set allocates the restricted DMA-bufs via the TEE subsystem.
The TEE subsystem handles the DMA-buf allocations since it is the TEE
(OP-TEE, AMD-TEE, TS-TEE, or perhaps a future QCOMTEE) which sets up the
restrictions for the memory used for the DMA-bufs.
I've added a new IOCTL, TEE_IOC_RSTMEM_ALLOC, to allocate the restricted
DMA-bufs. This IOCTL reaches the backend TEE driver, allowing it to choose
how to allocate the restricted physical memory.
TEE_IOC_RSTMEM_ALLOC takes in addition to a size and flags parameters also
a use-case parameter. This is used by the backend TEE driver to decide on
allocation policy and which devices should be able to access the memory.
Three use-cases (Secure Video Playback, Trusted UI, and Secure Video
Recording) has been identified so far to serve as examples of what can be
expected. More use-cases can be added in userspace ABI, but it's up to the
backend TEE drivers to provide the implementation.
Each use-case has it's own restricted memory pool since different use-cases
requires isolation from different parts of the system. A restricted memory
pool can be based on a static carveout instantiated while probing the TEE
backend driver, or dynamically allocated from CMA and made restricted as
needed by the TEE.
This can be tested on a RockPi 4B+ with the following steps:
repo init -u https://github.com/jenswi-linaro/manifest.git -m rockpi4.xml \
-b prototype/sdp-v5
repo sync -j8
cd build
make toolchains -j$(nproc)
make all -j$(nproc)
# Copy ../out/rockpi4.img to an SD card and boot the RockPi from that
# Connect a monitor to the RockPi
# login and at the prompt:
gst-launch-1.0 videotestsrc ! \
aesenc key=1f9423681beb9a79215820f6bda73d0f \
iv=e9aa8e834d8d70b7e0d254ff670dd718 serialize-iv=true ! \
aesdec key=1f9423681beb9a79215820f6bda73d0f ! \
kmssink
The aesdec module has been hacked to use an OP-TEE TA to decrypt the stream
into restricted DMA-bufs which are consumed by the kmssink.
The primitive QEMU tests from previous patch set can be tested on RockPi
in the same way with:
xtest --sdp-basic
The primitive test are tested on QEMU with the following steps:
repo init -u https://github.com/jenswi-linaro/manifest.git -m qemu_v8.xml \
-b prototype/sdp-v5
repo sync -j8
cd build
make toolchains -j$(nproc)
make SPMC_AT_EL=1 all -j$(nproc)
make SPMC_AT_EL=1 run-only
# login and at the prompt:
xtest --sdp-basic
The SPMC_AT_EL=1 parameter configures the build with FF-A and an SPMC at
S-EL1 inside OP-TEE. The parameter can be changed into SPMC_AT_EL=n to test
without FF-A using the original SMC ABI instead. Please remember to do
%rm -rf ../trusted-firmware-a/build/qemu
for TF-A to be rebuilt properly using the new configuration.
https://optee.readthedocs.io/en/latest/building/prerequisites.html
list dependencies needed to build the above.
The tests are pretty basic, mostly checking that a Trusted Application in
the secure world can access and manipulate the memory. There are also some
negative tests for out of bounds buffers etc.
Thanks,
Jens
Changes since V4:
* Adding the patch "tee: add TEE_IOC_RSTMEM_FD_INFO" needed by the
GStreamer demo
* Removing the dummy CPU access and mmap functions from the dma_buf_ops
* Fixing a compile error in "optee: FF-A: dynamic restricted memory allocation"
reported by kernel test robot <lkp(a)intel.com>
Changes since V3:
* Make the use_case and flags field in struct tee_shm u32's instead of
u16's
* Add more description for TEE_IOC_RSTMEM_ALLOC in the header file
* Import namespace DMA_BUF in module tee, reported by lkp(a)intel.com
* Added a note in the commit message for "optee: account for direction
while converting parameters" why it's needed
* Factor out dynamic restricted memory allocation from
"optee: support restricted memory allocation" into two new commits
"optee: FF-A: dynamic restricted memory allocation" and
"optee: smc abi: dynamic restricted memory allocation"
* Guard CMA usage with #ifdef CONFIG_CMA, effectively disabling dynamic
restricted memory allocate if CMA isn't configured
Changes since the V2 RFC:
* Based on v6.12
* Replaced the flags for SVP and Trusted UID memory with a u32 field with
unique id for each use case
* Added dynamic allocation of restricted memory pools
* Added OP-TEE ABI both with and without FF-A for dynamic restricted memory
* Added support for FF-A with FFA_LEND
Changes since the V1 RFC:
* Based on v6.11
* Complete rewrite, replacing the restricted heap with TEE_IOC_RSTMEM_ALLOC
Changes since Olivier's post [2]:
* Based on Yong Wu's post [1] where much of dma-buf handling is done in
the generic restricted heap
* Simplifications and cleanup
* New commit message for "dma-buf: heaps: add Linaro restricted dmabuf heap
support"
* Replaced the word "secure" with "restricted" where applicable
Jens Wiklander (7):
tee: add restricted memory allocation
tee: add TEE_IOC_RSTMEM_FD_INFO
optee: account for direction while converting parameters
optee: sync secure world ABI headers
optee: support restricted memory allocation
optee: FF-A: dynamic restricted memory allocation
optee: smc abi: dynamic restricted memory allocation
drivers/tee/Makefile | 1 +
drivers/tee/optee/Makefile | 1 +
drivers/tee/optee/call.c | 10 +-
drivers/tee/optee/core.c | 1 +
drivers/tee/optee/ffa_abi.c | 179 +++++++++++++-
drivers/tee/optee/optee_ffa.h | 27 ++-
drivers/tee/optee/optee_msg.h | 65 ++++-
drivers/tee/optee/optee_private.h | 75 ++++--
drivers/tee/optee/optee_smc.h | 71 +++++-
drivers/tee/optee/rpc.c | 31 ++-
drivers/tee/optee/rstmem.c | 389 ++++++++++++++++++++++++++++++
drivers/tee/optee/smc_abi.c | 215 +++++++++++++++--
drivers/tee/tee_core.c | 69 +++++-
drivers/tee/tee_private.h | 4 +
drivers/tee/tee_rstmem.c | 188 +++++++++++++++
drivers/tee/tee_shm.c | 2 +
drivers/tee/tee_shm_pool.c | 69 +++++-
include/linux/tee_core.h | 15 ++
include/linux/tee_drv.h | 2 +
include/uapi/linux/tee.h | 71 +++++-
20 files changed, 1409 insertions(+), 76 deletions(-)
create mode 100644 drivers/tee/optee/rstmem.c
create mode 100644 drivers/tee/tee_rstmem.c
base-commit: a64dcfb451e254085a7daee5fe51bf22959d52d3
--
2.43.0
Hello everyone,
just a few DMA-buf cleanup patches. The first one is fixing an incorrect
documentation which has annoyed me for quite a while.
The rest basically just removes stuff we no longer need or which was
just added as abstraction which was never used.
Please review and/or voice objection if some of that is till needed for
something.
Regards,
Christian.
Hi,
On Thu, 13 Feb 2025 at 12:40, Boris Brezillon
<boris.brezillon(a)collabora.com> wrote:
> On Thu, 13 Feb 2025 14:46:01 +0530 Sumit Garg <sumit.garg(a)linaro.org> wrote:
> > Yeah but all the prior vendor specific secure/restricted DMA heaps
> > relied on DT information.
>
> Right, but there's nothing in the DMA heap provider API forcing that.
Yeah. DMA heaps are just a way to allocate memory from a specific
place. It allows people to settle on having a single way to do
allocations from weird platform-specific places; the only weird
platform-specific part userspace needs to deal with is figuring out
the name to use. The rest is at least a unified API: the point of
dma-heaps was exactly to have a single coherent API for userspace, not
to create one API for ZONE_CMA and DT ranges and everyone else doing
their own thing.
> > Rather than that it's better
> > for the user to directly ask the TEE device to allocate restricted
> > memory without worrying about how the memory restriction gets
> > enforced.
>
> If the consensus is that restricted/protected memory allocation should
> always be routed to the TEE, sure, but I had the feeling this wasn't as
> clear as that. OTOH, using a dma-heap to expose the TEE-SDP
> implementation provides the same benefits, without making potential
> future non-TEE based implementations a pain for users. The dma-heap
> ioctl being common to all implementations, it just becomes a
> configuration matter if we want to change the heap we rely on for
> protected/restricted buffer allocation. And because heaps have
> unique/well-known names, users can still default to (or rely solely on)
> the TEE-SPD implementation if they want.
>
> > There have been several attempts with DMA heaps in the past which all
> > resulted in a very vendor specific vertically integrated solution. But
> > the solution with TEE subsystem aims to make it generic and vendor
> > agnostic.
>
> Just because all previous protected/restricted dma-heap effort
> failed to make it upstream, doesn't mean dma-heap is the wrong way of
> exposing this feature IMHO.
To be fair, having a TEE implementation does give us a much better
chance of having a sensible cross-vendor plan. And the fact it's
already (sort of accidentally and only on one platform AFAICT) ready
for a 'test' interface, where we can still exercise protected
allocation paths but without having to go through all the
platform-specific setup that is inaccessible to most people, is also
really great! That's probably been the biggest barrier to having this
tested outside of IHVs and OEMs.
But just because TEE is one good backend implementation, doesn't mean
it should be the userspace ABI. Why should userspace care that TEE has
mediated the allocation instead of it being a predefined range within
DT? How does userspace pick which TEE device to use? What advantage
does userspace get from having to have a different codepath to get a
different handle to memory? What about x86?
I think this proposal is looking at it from the wrong direction.
Instead of working upwards from the implementation to userspace, start
with userspace and work downwards. The interesting property to focus
on is allocating memory, not that EL1 is involved behind the scenes.
Cheers,
Daniel