On Tue, Oct 29, 2024 at 04:35:16PM +0000, Pavel Begunkov wrote:
> I see, the reply is about your phrase about additional memory
> abstractions:
>
> "... don't really need to build memory buffer abstraction over
> memory buffer abstraction."
Yes, over the existing memory buffer abstraction (dma_buf).
> If you mean internals, making up a dmabuf that has never existed in the
> picture in the first place is not cleaner or easier in any way. If that
> changes, e.g. there is more code to reuse in the future, we can unify it
> then.
I'm not sure what "making up" means here; they are all made up :)
> > with pre-registering the memory with the iommu to get good performance
> > in IOMMU-enabled setups.
>
> The page pool already does that just like it handles the normal
> path without providers.
In which case it basically is a dma-buf. If you'd expose it as such,
we could actually use it to communicate between subsystems in the
kernel.
On 25.10.24 at 08:52, Friedrich Vock wrote:
> On 24.10.24 22:29, Matthew Brost wrote:
>> On Thu, Oct 24, 2024 at 02:41:57PM +0200, Christian König wrote:
>>> Reports indicate that some userspace applications try to merge more
>>> than
>>> 80k of fences into a single dma_fence_array leading to a warning from
>>
>> Really, yikes.
>
> Not really IME. Unless Christian means some reports I don't have access
> to, the cases where userspace applications tried to do that were really
> just cases where the fence count exploded exponentially because
> dma_fence_unwrap_merge failed to actually merge identical fences (see
> patch 2). At no point have I actually seen apps trying to merge 80k+
> unique fences.
While working on it I've modified our stress test tool to send the same
1GiB SDMA copy to 100k different contexts.
It turned out it's perfectly possible to create that many fences; there is
nothing blocking userspace from doing it.
While this isn't a realistic use case, the kernel should at least not
crash or emit a warning, but either handle the request or reject it gracefully.
Friedrich, can you confirm that patch two in this series fixes the
problem? I would really like to get it into drm-misc-fixes before 6.12
comes out.
Thanks,
Christian.
>
> Regards,
> Friedrich
>
>>
>>> kzalloc() that the requested size becomes too big.
>>>
>>> While that is clearly a userspace bug, we should probably handle
>>> that case
>>> gracefully in the kernel.
>>>
>>> So we can either reject requests to merge more than a reasonable
>>> amount of
>>> fences (64k maybe?) or we can start to use kvzalloc() instead of
>>> kzalloc().
>>> This patch here does the latter.
>>>
>>
>> This patch seems reasonable to me if the above use is in fact valid.
>>
>>> Signed-off-by: Christian König <christian.koenig(a)amd.com>
>>> CC: stable(a)vger.kernel.org
>>
>> Fixes tag?
>>
>> Patch itself LGTM:
>> Reviewed-by: Matthew Brost <matthew.brost(a)intel.com>
>>
>>> ---
>>> drivers/dma-buf/dma-fence-array.c | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-fence-array.c
>>> b/drivers/dma-buf/dma-fence-array.c
>>> index 8a08ffde31e7..46ac42bcfac0 100644
>>> --- a/drivers/dma-buf/dma-fence-array.c
>>> +++ b/drivers/dma-buf/dma-fence-array.c
>>> @@ -119,8 +119,8 @@ static void dma_fence_array_release(struct
>>> dma_fence *fence)
>>> for (i = 0; i < array->num_fences; ++i)
>>> dma_fence_put(array->fences[i]);
>>>
>>> - kfree(array->fences);
>>> - dma_fence_free(fence);
>>> + kvfree(array->fences);
>>> + kvfree_rcu(fence, rcu);
>>> }
>>>
>>> static void dma_fence_array_set_deadline(struct dma_fence *fence,
>>> @@ -153,7 +153,7 @@ struct dma_fence_array
>>> *dma_fence_array_alloc(int num_fences)
>>> {
>>> struct dma_fence_array *array;
>>>
>>> - return kzalloc(struct_size(array, callbacks, num_fences),
>>> GFP_KERNEL);
>>> + return kvzalloc(struct_size(array, callbacks, num_fences),
>>> GFP_KERNEL);
>>> }
>>> EXPORT_SYMBOL(dma_fence_array_alloc);
>>>
>>> --
>>> 2.34.1
>>>
>
On Thu, Oct 24, 2024 at 05:40:02PM +0100, Pavel Begunkov wrote:
> On 10/24/24 17:06, Christoph Hellwig wrote:
> > On Thu, Oct 24, 2024 at 03:23:06PM +0100, Pavel Begunkov wrote:
> > > > That's not what this series does. It adds the new memory_provider_ops
> > > > set of hooks, with one implementation for dmabufs, and one for
> > > > io_uring zero copy.
> > >
> > > First, it's not a _new_ abstraction over a buffer as you called it
> > > before, the abstraction (net_iov) is already merged.
> >
> > Umm, it is a new ops vector.
>
> I don't understand what you mean. Callback?
struct memory_provider_ops. It's a method table or ops vector, no
callbacks involved.
> Then please go ahead and take a look at the patchset in question
> and see how much of dmabuf handling is there compared to pure
> networking changes. The point that it's a new set of APIs and lots
> of changes not related directly to dmabufs stands. dmabuf is useful
> as an abstraction there, but it's a very long stretch saying
> that the series is all about it.
I did take a look, that's why I replied.
> > > on an existing network specific abstraction, which are not restricted to
> > > pages or anything specific in the long run, but the flow of which from
> > > net stack to user and back is controlled by io_uring. If you worry about
> > > abuse, io_uring can't even sanely initialise those buffers itself and
> > > is therefore asking the page pool code to do that.
> >
> > No, I worry about trying to use io_uring for no good reason. This
>
> It sounds like the argument is that you just don't want any
> io_uring APIs; I don't think I'd be able to help you with
> that.
No, that's completely misinterpreting what I'm saying. Of course an
io_uring API is fine. But tying low-level implementation details to
it is not.
> > precludes in-kernel uses which would be extremely useful for
>
> Uses of what? devmem TCP is merged, I'm not removing it,
> and the net_iov abstraction is in there, which can potentially
> be reused by other in-kernel users if that'd even make sense.
How, when you are hardcoding io_uring memory registrations instead
of making them a generic dmabuf? Which, btw, would also really help
with pre-registering the memory with the iommu to get good performance
in IOMMU-enabled setups.
On 25.10.24 at 11:05, Tvrtko Ursulin wrote:
>
> On 25/10/2024 09:59, Tvrtko Ursulin wrote:
>>
>> On 24/10/2024 13:41, Christian König wrote:
>>> Reports indicate that some userspace applications try to merge more
>>> than
>>> 80k of fences into a single dma_fence_array leading to a warning from
>>> kzalloc() that the requested size becomes too big.
>>>
>>> While that is clearly a userspace bug, we should probably handle
>>> that case
>>> gracefully in the kernel.
>>>
>>> So we can either reject requests to merge more than a reasonable
>>> amount of
>>> fences (64k maybe?) or we can start to use kvzalloc() instead of
>>> kzalloc().
>>> This patch here does the latter.
>>
>> Rejecting would potentially be safer, otherwise there is a path for
>> userspace to trigger a warn in kvmalloc_node (see 0829b5bcdd3b
>> ("drm/i915: 2 GiB of relocations ought to be enough for anybody*"))
>> and spam dmesg at will.
>
> Actually that is a WARN_ON_*ONCE* there so maybe not so critical to
> invent a limit. Up for discussion I suppose.
>
> Regards,
>
> Tvrtko
>
>>
>> Question is what limit to set...
That's one of the reasons why I opted for kvzalloc() initially.
I mean we could use some nice round number like 65536, but that would be
totally arbitrary.
Any comments on the other two patches? I need to get them upstream.
Thanks,
Christian.
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Signed-off-by: Christian König <christian.koenig(a)amd.com>
>>> CC: stable(a)vger.kernel.org
>>> ---
>>> drivers/dma-buf/dma-fence-array.c | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-fence-array.c
>>> b/drivers/dma-buf/dma-fence-array.c
>>> index 8a08ffde31e7..46ac42bcfac0 100644
>>> --- a/drivers/dma-buf/dma-fence-array.c
>>> +++ b/drivers/dma-buf/dma-fence-array.c
>>> @@ -119,8 +119,8 @@ static void dma_fence_array_release(struct
>>> dma_fence *fence)
>>> for (i = 0; i < array->num_fences; ++i)
>>> dma_fence_put(array->fences[i]);
>>> - kfree(array->fences);
>>> - dma_fence_free(fence);
>>> + kvfree(array->fences);
>>> + kvfree_rcu(fence, rcu);
>>> }
>>> static void dma_fence_array_set_deadline(struct dma_fence *fence,
>>> @@ -153,7 +153,7 @@ struct dma_fence_array
>>> *dma_fence_array_alloc(int num_fences)
>>> {
>>> struct dma_fence_array *array;
>>> - return kzalloc(struct_size(array, callbacks, num_fences),
>>> GFP_KERNEL);
>>> + return kvzalloc(struct_size(array, callbacks, num_fences),
>>> GFP_KERNEL);
>>> }
>>> EXPORT_SYMBOL(dma_fence_array_alloc);
On Thu, Oct 24, 2024 at 03:23:06PM +0100, Pavel Begunkov wrote:
> > That's not what this series does. It adds the new memory_provider_ops
> > set of hooks, with one implementation for dmabufs, and one for
> > io_uring zero copy.
>
> First, it's not a _new_ abstraction over a buffer as you called it
> before, the abstraction (net_iov) is already merged.
Umm, it is a new ops vector.
> Second, you mention devmem TCP, and it's not just a page pool with
> "dmabufs", it's a user API to use it and other memory agnostic
> allocation logic. And yes, dmabufs there is the least technically
> important part. Just having a dmabuf handle solves absolutely nothing.
It solves a lot, because it provides a proper abstraction.
> > So you are precluding zero copy RX into anything but your magic
> > io_uring buffers, and using an odd abstraction for that.
>
> Right, the io_uring zero copy RX API expects transfers to happen into
> io_uring-controlled buffers, and that's the entire idea. Buffers that are based
> on an existing network specific abstraction, which are not restricted to
> pages or anything specific in the long run, but the flow of which from
> net stack to user and back is controlled by io_uring. If you worry about
> abuse, io_uring can't even sanely initialise those buffers itself and
> is therefore asking the page pool code to do that.
No, I worry about trying to use io_uring for no good reason. This
precludes in-kernel uses which would be extremely useful for
network storage drivers, and it precludes device memory of all
kinds.
> I'm even more confused how that would help. The user API has to
> be implemented and adding a new dmabuf gives nothing, not even
> mentioning it's not clear what semantics of that beast is
> supposed to be.
>
The dma-buf maintainers already explained to you last time
that there is absolutely no need to use the dmabuf UAPI; you
can use dma-bufs through in-kernel interfaces just fine.
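For reference, that in-kernel path is a short sketch along these lines,
using the long-standing attach/map API; "dmabuf" and "dev" are assumed
context, and error handling is omitted:

    /* Sketch: importing and mapping a dma-buf entirely in-kernel,
     * no UAPI involved. Error handling omitted for brevity. */
    struct dma_buf_attachment *att = dma_buf_attach(dmabuf, dev);
    struct sg_table *sgt = dma_buf_map_attachment(att, DMA_BIDIRECTIONAL);

    /* ... program the device with the sgt entries ... */

    dma_buf_unmap_attachment(att, sgt, DMA_BIDIRECTIONAL);
    dma_buf_detach(dmabuf, att);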
Hi guys,
it turned out that userspace can also merge dma_fence_chain containers,
which can result in really huge arrays.
Fix those merges to sort the arrays and remove the duplicates.
In addition to that, start using kvzalloc() for dma_fence_array
containers so that they can handle much larger arrays if necessary.
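For reference, a rough sketch of the sort-and-dedup idea (my reading of
patch 2, not the patch itself): order the merged fences by context and
seqno with sort(), so duplicates of a context become adjacent and only
the newest fence per context needs to be kept:

    /* Sketch only; assumes <linux/sort.h> and <linux/dma-fence.h>. */
    static int fence_cmp(const void *a, const void *b)
    {
            struct dma_fence *fa = *(struct dma_fence **)a;
            struct dma_fence *fb = *(struct dma_fence **)b;

            if (fa->context < fb->context)
                    return -1;
            else if (fa->context > fb->context)
                    return 1;

            /* same context: ascending by seqno, newest last */
            if (dma_fence_is_later(fb, fa))
                    return -1;
            else if (dma_fence_is_later(fa, fb))
                    return 1;
            return 0;
    }

    /* after sort(fences, count, sizeof(*fences), fence_cmp, NULL),
     * walk the array once and drop all but the last fence of each
     * context */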
Please review and comment,
Christian.
On Wed, Oct 23, 2024 at 03:34:53PM +0100, Pavel Begunkov wrote:
> It doesn't care much what kind of memory it is, nor is it important
> for internals how it's imported; it's user addresses -> pages for
> user convenience's sake. All the net_iov setup code is in the page pool
> core code. What it does, however, is implementing the user API, so
That's not what this series does. It adds the new memory_provider_ops
set of hooks, with one implementation for dmabufs, and one for
io_uring zero copy.
So you are precluding zero copy RX into anything but your magic
io_uring buffers, and using an odd abstraction for that.
The right way would be to support zero copy RX into every
designated dmabuf, and make io_uring work with udmabuf or, if
absolutely needed, its own kind of dmabuf. Instead we create
a maze of incompatible abstractions here. The use case of e.g.
doing zero copy receive into an NVMe CMB using PCIe P2P transactions
is anything but made up, so this does create a problem.
On Wed, Oct 16, 2024 at 11:52:39AM -0700, David Wei wrote:
> From: Pavel Begunkov <asml.silence(a)gmail.com>
>
> Currently net_iov stores a pointer to struct dmabuf_genpool_chunk_owner,
> which serves as a useful abstraction to share data and provide a
> context. However, it's too devmem specific, and we want to reuse it for
> other memory providers, and for that we need to decouple net_iov from
> devmem. Make net_iov point to a new base structure called
> net_iov_area, which dmabuf_genpool_chunk_owner extends.
We've been there before. Instead of reinventing your own memory
provider, please enhance dmabufs for your use case. We don't really
need to build memory buffer abstraction over memory buffer abstraction.
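For context, a rough sketch of the layering the quoted patch describes;
the field names are assumptions based on the description, not the final
code:

    /* Assumed shape, per the patch description above. */
    struct net_iov_area {
            /* provider-agnostic context shared by a set of net_iovs */
            struct net_iov *niovs;
            size_t num_niovs;
    };

    struct dmabuf_genpool_chunk_owner {
            struct net_iov_area area;       /* base structure first */
            /* devmem-specific fields follow, e.g. the dmabuf binding */
    };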
On Tue, Oct 22, 2024 at 1:38 AM Maxime Ripard <mripard(a)redhat.com> wrote:
>
> I wanted to follow-up on the discussion we had at Plumbers with John and
> T.J. about (among other things) adding new heaps to the kernel.
>
> I'm still interested in merging a carve-out driver[1], since it seems to be
> in every vendor BSP, and I got asked about it again last week.
>
> I remember from our discussion that for new heap types to be merged, we
> needed a kernel use-case. Looking back, I'm not entirely sure how one
> can provide that given that heaps are essentially facilities for
> user-space.
>
> Am I misremembering or missing something? What are the requirements for
> you to consider adding a new heap driver?
It's basically the same as the DRM subsystem rules.
https://docs.kernel.org/gpu/drm-uapi.html#open-source-userspace-requirements
i.e. there has to be an open-source user for it, and the user has to be
more significant than a "toy" implementation (which can be a bit
subjective and contentious when trying to get out of a chicken and egg
loop).
thanks
-john
Hi,
This patch set allocates the restricted DMA-bufs via the TEE subsystem.
This is a complete rewrite compared to the previous patch set [1], and other
earlier proposals [2] and [3] with a separate restricted heap.
The TEE subsystem handles the DMA-buf allocations since it is the TEE
(OP-TEE, AMD-TEE, TS-TEE, or a future QTEE) which sets up the restrictions
for the memory used for the DMA-bufs.
I've added a new IOCTL, TEE_IOC_RSTMEM_ALLOC, to allocate the restricted
DMA-bufs. This new IOCTL reaches the backend TEE driver, allowing it to
choose how to allocate the restricted physical memory.
TEE_IOC_RSTMEM_ALLOC is quite similar to TEE_IOC_SHM_ALLOC so it's tempting
to extend TEE_IOC_SHM_ALLOC with two new flags
TEE_IOC_SHM_FLAG_SECURE_VIDEO and TEE_IOC_SHM_FLAG_SECURE_TRUSTED_UI for
the same feature. However, it might be a bit confusing since
TEE_IOC_SHM_ALLOC only returns an anonymous file descriptor, but
TEE_IOC_SHM_FLAG_SECURE_VIDEO and TEE_IOC_SHM_FLAG_SECURE_TRUSTED_UI would
return a DMA-buf file descriptor instead. What do others think?
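For illustration only, a hypothetical userspace invocation might look
like the sketch below; the struct layout and field names are
assumptions modeled on the existing TEE_IOC_SHM_ALLOC pattern, not the
proposed UAPI:

    /* Hypothetical sketch; not the actual TEE_IOC_RSTMEM_ALLOC UAPI. */
    struct tee_ioctl_rstmem_alloc_data data = {
            .size = buf_size,           /* requested size in bytes */
            .flags = use_case_flags,    /* e.g. secure video playback */
    };
    int dmabuf_fd = ioctl(tee_fd, TEE_IOC_RSTMEM_ALLOC, &data);
    /* on success, dmabuf_fd is a DMA-buf file descriptor that can be
     * passed on to other subsystems */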
This can be tested on QEMU with the following steps:
repo init -u https://github.com/jenswi-linaro/manifest.git -m qemu_v8.xml \
-b prototype/sdp-v2
repo sync -j8
cd build
make toolchains -j4
make all -j$(nproc)
make run-only
# login and at the prompt:
xtest --sdp-basic
https://optee.readthedocs.io/en/latest/building/prerequisites.html
lists the dependencies needed to build the above.
The tests are pretty basic, mostly checking that a Trusted Application in
the secure world can access and manipulate the memory. There are also some
negative tests for out of bounds buffers etc.
Thanks,
Jens
[1] https://lore.kernel.org/lkml/20240830070351.2855919-1-jens.wiklander@linaro…
[2] https://lore.kernel.org/dri-devel/20240515112308.10171-1-yong.wu@mediatek.c…
[3] https://lore.kernel.org/lkml/20220805135330.970-1-olivier.masse@nxp.com/
Changes since the V1 RFC:
* Based on v6.11
* Complete rewrite, replacing the restricted heap with TEE_IOC_RSTMEM_ALLOC
Changes since Olivier's post [2]:
* Based on Yong Wu's post [1] where much of dma-buf handling is done in
the generic restricted heap
* Simplifications and cleanup
* New commit message for "dma-buf: heaps: add Linaro restricted dmabuf heap
support"
* Replaced the word "secure" with "restricted" where applicable
Jens Wiklander (2):
tee: add restricted memory allocation
optee: support restricted memory allocation
drivers/tee/Makefile | 1 +
drivers/tee/optee/core.c | 21 ++++
drivers/tee/optee/optee_private.h | 6 +
drivers/tee/optee/optee_smc.h | 35 ++++++
drivers/tee/optee/smc_abi.c | 45 ++++++-
drivers/tee/tee_core.c | 33 ++++-
drivers/tee/tee_private.h | 2 +
drivers/tee/tee_rstmem.c | 200 ++++++++++++++++++++++++++++++
drivers/tee/tee_shm.c | 2 +
drivers/tee/tee_shm_pool.c | 69 ++++++++++-
include/linux/tee_core.h | 6 +
include/linux/tee_drv.h | 9 ++
include/uapi/linux/tee.h | 33 ++++-
13 files changed, 455 insertions(+), 7 deletions(-)
create mode 100644 drivers/tee/tee_rstmem.c
--
2.43.0
Reports indicate that some userspace applications try to merge more than
80k fences into a single dma_fence_array, leading to a warning from
kzalloc() that the requested size becomes too big.
While that is clearly a userspace bug, we should probably handle that case
gracefully in the kernel.
So we can either reject requests to merge more than a reasonable number of
fences (64k maybe?) or we can start to use kvzalloc() instead of kzalloc().
This patch here does the latter.
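For comparison, the rejection alternative could look roughly like the
sketch below; the cap name and value are illustrative, not part of this
patch:

    /* Illustrative only: reject oversized merges instead of falling
     * back to vmalloc. Not part of this patch. */
    #define DMA_FENCE_ARRAY_MAX_FENCES (64 * 1024) /* arbitrary cap */

    struct dma_fence_array *dma_fence_array_alloc(int num_fences)
    {
            struct dma_fence_array *array;

            if (num_fences > DMA_FENCE_ARRAY_MAX_FENCES)
                    return NULL;

            return kvzalloc(struct_size(array, callbacks, num_fences),
                            GFP_KERNEL);
    }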
Signed-off-by: Christian König <christian.koenig(a)amd.com>
CC: stable(a)vger.kernel.org
---
drivers/dma-buf/dma-fence-array.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index 8a08ffde31e7..46ac42bcfac0 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -119,8 +119,8 @@ static void dma_fence_array_release(struct dma_fence *fence)
for (i = 0; i < array->num_fences; ++i)
dma_fence_put(array->fences[i]);
- kfree(array->fences);
- dma_fence_free(fence);
+ kvfree(array->fences);
+ kvfree_rcu(fence, rcu);
}
static void dma_fence_array_set_deadline(struct dma_fence *fence,
@@ -153,7 +153,7 @@ struct dma_fence_array *dma_fence_array_alloc(int num_fences)
{
struct dma_fence_array *array;
- return kzalloc(struct_size(array, callbacks, num_fences), GFP_KERNEL);
+ return kvzalloc(struct_size(array, callbacks, num_fences), GFP_KERNEL);
}
EXPORT_SYMBOL(dma_fence_array_alloc);
--
2.34.1
On 16.10.24 at 18:43, Adrián Larumbe wrote:
> On 16.10.2024 15:12, Christian König wrote:
>> On 15.10.24 at 01:31, Adrián Larumbe wrote:
>>> Doesn't make any functional difference because generic dma_fence is the
>>> first panfrost_fence structure member, but I guess it doesn't hurt either.
>> As discussed with Sima we want to push in exactly the opposite direction,
>> because this approach requires that the panfrost module stays loaded as long
>> as fences are around.
> Does that mean in future commits the struct dma_fence_ops' .release pointer will be
> done away with altogether?
Yes, that's exactly the idea.
As a first step I'm preparing patches right now to enforce using kmalloc
instead of driver-brewed approaches for dma_fence handling.
Regards,
Christian.
>
>> So clearly a NAK to this one here. Rather document on the structure that the
>> dma_fence structure must be the first member.
>>
>> Regards,
>> Christian.
>>
>>> Signed-off-by: Adrián Larumbe <adrian.larumbe(a)collabora.com>
>>> ---
>>> drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++
>>> 1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index 5d83c6a148ec..fa219f719bdc 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -85,9 +85,15 @@ static const char *panfrost_fence_get_timeline_name(struct dma_fence *fence)
>>> }
>>> }
>>> +static void panfrost_fence_release(struct dma_fence *fence)
>>> +{
>>> + kfree(to_panfrost_fence(fence));
>>> +}
>>> +
>>> static const struct dma_fence_ops panfrost_fence_ops = {
>>> .get_driver_name = panfrost_fence_get_driver_name,
>>> .get_timeline_name = panfrost_fence_get_timeline_name,
>>> + .release = panfrost_fence_release,
>>> };
>>> static struct dma_fence *panfrost_fence_create(struct panfrost_device *pfdev, int js_num)
On 15/10/2024 14:07, Jyothi Kumar Seerapu wrote:
> When high performance with multiple i2c messages in a single transfer
> is required, employ Block Event Interrupt (BEI) to trigger interrupts
> after specific message transfers and the last message transfer,
> thereby reducing interrupts.
> For each i2c message transfer, a series of Transfer Request Elements(TREs)
> must be programmed, including config tre for frequency configuration,
> go tre for holding the i2c address, and dma tre for holding the dma buffer
> address and length, as per the hardware programming guide. For transfers using BEI,
> multiple I2C messages may necessitate the preparation of config, go,
> and tx DMA TREs. However, a channel TRE size of 64 is often insufficient,
> potentially leading to failures due to inadequate memory space.
>
> Adjust the channel TRE size through the device tree.
> The default size is 64, but clients can modify this value based on
> their higher channel TRE size requirements.
>
> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu(a)quicinc.com>
> ---
> arch/arm64/boot/dts/qcom/sc7280.dtsi | 132 +++++++++++++--------------
> 1 file changed, 66 insertions(+), 66 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> index 3d8410683402..c7c0e15ff9d3 100644
> --- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> @@ -1064,7 +1064,7 @@
> };
>
> gpi_dma0: dma-controller@900000 {
> - #dma-cells = <3>;
> + #dma-cells = <4>;
> compatible = "qcom,sc7280-gpi-dma", "qcom,sm6350-gpi-dma";
> reg = <0 0x00900000 0 0x60000>;
> interrupts = <GIC_SPI 244 IRQ_TYPE_LEVEL_HIGH>,
> @@ -1114,8 +1114,8 @@
> "qup-memory";
> power-domains = <&rpmhpd SC7280_CX>;
> required-opps = <&rpmhpd_opp_low_svs>;
> - dmas = <&gpi_dma0 0 0 QCOM_GPI_I2C>,
> - <&gpi_dma0 1 0 QCOM_GPI_I2C>;
> + dmas = <&gpi_dma0 0 0 QCOM_GPI_I2C 64>,
> + <&gpi_dma0 1 0 QCOM_GPI_I2C 64>;
So it is 64 everywhere, thus this is fixed. Deduce it from the compatible.
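A rough sketch of that suggestion, carrying the TRE size as
per-compatible match data instead of a DT cell; the names are
illustrative, not the actual gpi driver code:

    /* Illustrative only: per-compatible match data. */
    struct gpi_match_data {
            unsigned int tre_size;
    };

    static const struct gpi_match_data sc7280_gpi_data = {
            .tre_size = 64,
    };

    static const struct of_device_id gpi_of_match[] = {
            { .compatible = "qcom,sc7280-gpi-dma", .data = &sc7280_gpi_data },
            { /* sentinel */ },
    };

    /* in probe: */
    const struct gpi_match_data *md = of_device_get_match_data(&pdev->dev);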
Best regards,
Krzysztof
On 15.10.24 at 01:31, Adrián Larumbe wrote:
> Doesn't make any functional difference because generic dma_fence is the
> first panfrost_fence structure member, but I guess it doesn't hurt either.
As discussed with Sima we want to push in exactly the opposite
direction, because this approach requires that the panfrost module stays
loaded as long as fences are around.
So clearly a NAK to this one here. Rather document on the structure that
the dma_fence structure must be the first member.
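A minimal sketch of that documentation approach, with a compile-time
check; the struct body is illustrative, not the actual panfrost code:

    /* Sketch: document and enforce the layout requirement instead of
     * adding a driver .release callback. */
    struct panfrost_fence {
            struct dma_fence base;  /* must stay the first member */
            /* driver-private fields follow */
    };

    static_assert(offsetof(struct panfrost_fence, base) == 0,
                  "dma_fence must be the first member of panfrost_fence");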
Regards,
Christian.
> Signed-off-by: Adrián Larumbe <adrian.larumbe(a)collabora.com>
> ---
> drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 5d83c6a148ec..fa219f719bdc 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -85,9 +85,15 @@ static const char *panfrost_fence_get_timeline_name(struct dma_fence *fence)
> }
> }
>
> +static void panfrost_fence_release(struct dma_fence *fence)
> +{
> + kfree(to_panfrost_fence(fence));
> +}
> +
> static const struct dma_fence_ops panfrost_fence_ops = {
> .get_driver_name = panfrost_fence_get_driver_name,
> .get_timeline_name = panfrost_fence_get_timeline_name,
> + .release = panfrost_fence_release,
> };
>
> static struct dma_fence *panfrost_fence_create(struct panfrost_device *pfdev, int js_num)
On 15-10-24, 17:37, Jyothi Kumar Seerapu wrote:
> When high performance with multiple i2c messages in a single transfer
> is required, employ Block Event Interrupt (BEI) to trigger interrupts
> after specific message transfers and the last message transfer,
> thereby reducing interrupts.
>
> For each i2c message transfer, a series of Transfer Request Elements(TREs)
> must be programmed, including config tre for frequency configuration,
> go tre for holding i2c address and dma tre for holding dma buffer address,
> length as per the hardware programming guide. For transfer using BEI,
> multiple I2C messages may necessitate the preparation of config, go,
> and tx DMA TREs. However, a channel TRE size of 64 is often insufficient,
> potentially leading to failures due to inadequate memory space.
>
> Add additional argument to dma-cell property for channel TRE size.
> With this, adjust the channel TRE size via the device tree.
> The default size is 64, but clients can modify this value based on
> their specific requirements.
>
> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu(a)quicinc.com>
> ---
> Documentation/devicetree/bindings/dma/qcom,gpi.yaml | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/dma/qcom,gpi.yaml b/Documentation/devicetree/bindings/dma/qcom,gpi.yaml
> index 4df4e61895d2..002495921643 100644
> --- a/Documentation/devicetree/bindings/dma/qcom,gpi.yaml
> +++ b/Documentation/devicetree/bindings/dma/qcom,gpi.yaml
> @@ -54,14 +54,16 @@ properties:
> maxItems: 13
>
> "#dma-cells":
> - const: 3
> + minItems: 3
> + maxItems: 4
> description: >
> DMA clients must use the format described in dma.txt, giving a phandle
> - to the DMA controller plus the following 3 integer cells:
> + to the DMA controller plus the following 4 integer cells:
> - channel: if set to 0xffffffff, any available channel will be allocated
> for the client. Otherwise, the exact channel specified will be used.
> - seid: serial id of the client as defined in the SoC documentation.
> - client: type of the client as defined in dt-bindings/dma/qcom-gpi.h
> + - channel-tre-size: size of the channel TRE (transfer ring element)
This is a firmware/software property; why should this be in the hardware
description?
--
~Vinod
On Tue, Oct 15, 2024 at 05:37:46PM +0530, Jyothi Kumar Seerapu wrote:
> When high performance with multiple i2c messages in a single transfer
> is required, employ Block Event Interrupt (BEI) to trigger interrupts
> after specific message transfers and the last message transfer,
> thereby reducing interrupts.
>
> For each i2c message transfer, a series of Transfer Request Elements(TREs)
> must be programmed, including config tre for frequency configuration,
> go tre for holding i2c address and dma tre for holding dma buffer address,
> length as per the hardware programming guide. For transfer using BEI,
> multiple I2C messages may necessitate the preparation of config, go,
> and tx DMA TREs. However, a channel TRE size of 64 is often insufficient,
> potentially leading to failures due to inadequate memory space.
>
> Add additional argument to dma-cell property for channel TRE size.
No such property 'dma-cell'
> With this, adjust the channel TRE size via the device tree.
> The default size is 64, but clients can modify this value based on
> their specific requirements.
>
> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu(a)quicinc.com>
> ---
> Documentation/devicetree/bindings/dma/qcom,gpi.yaml | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/dma/qcom,gpi.yaml b/Documentation/devicetree/bindings/dma/qcom,gpi.yaml
> index 4df4e61895d2..002495921643 100644
> --- a/Documentation/devicetree/bindings/dma/qcom,gpi.yaml
> +++ b/Documentation/devicetree/bindings/dma/qcom,gpi.yaml
> @@ -54,14 +54,16 @@ properties:
> maxItems: 13
>
> "#dma-cells":
> - const: 3
> + minItems: 3
> + maxItems: 4
> description: >
> DMA clients must use the format described in dma.txt, giving a phandle
> - to the DMA controller plus the following 3 integer cells:
> + to the DMA controller plus the following 4 integer cells:
> - channel: if set to 0xffffffff, any available channel will be allocated
> for the client. Otherwise, the exact channel specified will be used.
> - seid: serial id of the client as defined in the SoC documentation.
> - client: type of the client as defined in dt-bindings/dma/qcom-gpi.h
> + - channel-tre-size: size of the channel TRE (transfer ring element)
>
> iommus:
> maxItems: 1
> --
> 2.17.1
>
On 15/10/2024 14:07, Jyothi Kumar Seerapu wrote:
> When high performance with multiple i2c messages in a single transfer
> is required, employ Block Event Interrupt (BEI) to trigger interrupts
> after specific messages transfer and the last message transfer,
> thereby reducing interrupts.
>
> For each i2c message transfer, a series of Transfer Request Elements(TREs)
> must be programmed, including config tre for frequency configuration,
> go tre for holding i2c address and dma tre for holding dma buffer address,
> length as per the hardware programming guide. For transfer using BEI,
> multiple I2C messages may necessitate the preparation of config, go,
> and tx DMA TREs. However, a channel TRE size of 64 is often insufficient,
> potentially leading to failures due to inadequate memory space.
Please kindly test the patches before you send them. Upstream is not a
testing service.
Best regards,
Krzysztof