Hi,
On Thu, Jul 2, 2015 at 4:01 PM, Pascal Brand pascal.brand@linaro.org wrote:
Hello,
*Buffer management* It looks like temporary buffers are now managed in the client API, not in the driver anymore. The positive point is that it makes the driver lighter. But the drawback is the number of extra ioctls needed when using a temporary buffer: one to allocate, one to map, and one to unmap. So a TEEC_InvokeCommand could lead to 13 ioctls. What about performance? Could it be optimized in some way? Could we manage temporary buffers, dma buffers and "other" buffers in different ways?
Temporary buffers should be avoided completely if the client is concerned about performance. Allocated shared memory is the most efficient form of shared memory. DMA buffers should be treated in the same way as allocated shared memory.
Well, I agree that temporary buffers should be avoided when performance is required. However, they are widely used for small buffers, for example to pass a struct. In such a case the struct may be on the stack, and it is passed in a temporary buffer. The extra copy does not really affect performance in our current optee_linuxdriver, but performance may be affected more with the Generic Driver.
Yes, I agree that the old driver handles this more efficiently. As long as temporary buffers aren't used in performance-critical loops we should be OK though. If it turns out to be a real problem I'd rather address it later, once the initial patch set has been accepted.
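For reference, here is a minimal sketch of the two styles using the GlobalPlatform client API; the command ID and session setup are assumed to exist elsewhere, and error handling is omitted:

#include <string.h>
#include <tee_client_api.h>

void invoke_tmpref(TEEC_Session *sess, void *buf, size_t len)
{
	TEEC_Operation op;

	/* Temporary memref: convenient, but each invocation pays for
	 * the extra allocate/map/unmap round trips in the driver. */
	memset(&op, 0, sizeof(op));
	op.paramTypes = TEEC_PARAM_TYPES(TEEC_MEMREF_TEMP_INPUT,
					 TEEC_NONE, TEEC_NONE, TEEC_NONE);
	op.params[0].tmpref.buffer = buf;
	op.params[0].tmpref.size = len;
	TEEC_InvokeCommand(sess, 0 /* cmd id */, &op, NULL);
}

void invoke_shm(TEEC_Context *ctx, TEEC_Session *sess, size_t len)
{
	TEEC_SharedMemory shm;
	TEEC_Operation op;

	/* Allocated shared memory: set up once, then reused across
	 * many invocations without further per-call ioctls. */
	memset(&shm, 0, sizeof(shm));
	shm.size = len;
	shm.flags = TEEC_MEM_INPUT;
	TEEC_AllocateSharedMemory(ctx, &shm);

	memset(&op, 0, sizeof(op));
	op.paramTypes = TEEC_PARAM_TYPES(TEEC_MEMREF_PARTIAL_INPUT,
					 TEEC_NONE, TEEC_NONE, TEEC_NONE);
	op.params[0].memref.parent = &shm;
	op.params[0].memref.offset = 0;
	op.params[0].memref.size = len;
	TEEC_InvokeCommand(sess, 0 /* cmd id */, &op, NULL);

	TEEC_ReleaseSharedMemory(&shm);
}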
The kernel API isn't there right now because it's not used by anything. This patch-set is kept as small as possible while still being useful, to avoid making it harder than necessary to upstream. Once this patch-set is upstream we can start to add different features, for instance a kernel API.
All right. We should just not forget that this is something quite important for us.
Would this be OK?

struct tee_shm_pool_res_mem_info {
	unsigned long priv_vaddr;
	unsigned long priv_paddr;
	size_t priv_size;
	unsigned long dma_buf_vaddr;
	unsigned long dma_buf_paddr;
	size_t dma_buf_size;
};

struct tee_shm_pool *
tee_shm_pool_alloc_res_mem(struct device *dev,
			   struct tee_shm_pool_res_mem_info *info);
Yes.
Thanks
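Just to illustrate, a hypothetical platform glue using the proposed function could look as follows; the carveout split below is made up, a real driver would take the addresses and sizes from DT or from the secure world:

static struct tee_shm_pool *setup_pool(struct device *dev, void *va,
				       unsigned long pa, size_t size)
{
	struct tee_shm_pool_res_mem_info info;

	/* Split the reserved carveout: first half for driver-private
	 * allocations, the rest exported as dma-buf backed memory. */
	info.priv_vaddr = (unsigned long)va;
	info.priv_paddr = pa;
	info.priv_size = size / 2;
	info.dma_buf_vaddr = (unsigned long)va + size / 2;
	info.dma_buf_paddr = pa + size / 2;
	info.dma_buf_size = size - size / 2;

	return tee_shm_pool_alloc_res_mem(dev, &info);
}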
The proposed API in tee.h doesn't cover memory that isn't represented by
a file descriptor. A memref is represented as:
struct tee_ioctl_param_memref {
	__u64 shm_offs;
	__u64 size;
	__s64 shm_fd;
};
I see two options here:
a) We add a way to represent registered shared memory with a file
descriptor
b) We add another parameter type, regmem
In a) we can then map this memory into another process (having
tee-supplicant in mind) if needed. In b) there's no easy way of mapping
the memory into another process, but it is perhaps easier to implement
if that isn't a problem.
Either case is out of scope for patch v4, but we shouldn't make it hard to
add it to tee.h later.
a) would mean one more TEE_IOC_XXX ioctl, while b) would need some more
bits in TEE_IOCTL_PARAMS_ATTR_TYPE_XXX. Changing
TEE_IOCTL_PARAM_ATTR_TYPE_MASK to 0xff would guarantee room for all
future parameter types.
Another of the TEE_GEN_CAP_XXX bits can be used to indicate whether
registered shared memory is supported by a specific driver.
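Just to sketch what that could look like in tee.h; the names and values below are made up and would be settled when the feature is actually implemented:

/* Widened from 0xf to leave room for future parameter types */
#define TEE_IOCTL_PARAM_ATTR_TYPE_MASK			0xff

/* Hypothetical new parameter types for registered shared memory */
#define TEE_IOCTL_PARAM_ATTR_TYPE_REGMEM_INPUT		0x9
#define TEE_IOCTL_PARAM_ATTR_TYPE_REGMEM_OUTPUT	0xa
#define TEE_IOCTL_PARAM_ATTR_TYPE_REGMEM_INOUT		0xb

/* Hypothetical capability bit so user space can probe for support */
#define TEE_GEN_CAP_REG_MEM	(1 << 2)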
In the use case I was thinking about there wasn't a need to map the memory into a third-party process, because doing so could generate some strange side effects: since mappings can only be done in 4k increments, too much memory could end up shared from one process to the other. For me at least, option b) fits much better.
+1 on b)
Since it would also add more generic arguments, I wonder if it could be used to solve the temporary-buffer performance issues.
Yes, possibly. We should keep this in mind when implementing registered shared memory.
For use cases where a client gets hold of a dma_buf without AllocateSharedMemory, we could implement it using an implementation-specific TEEC_MEM_DMABUF flag, as currently done in optee_client.
Good idea.
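A rough sketch of such a client-side wrapper, assuming TEEC_MEM_DMABUF as an implementation-specific flag bit; how the fd is carried down to the driver is left implementation-defined and is not part of the GlobalPlatform API:

#include <string.h>
#include <tee_client_api.h>

TEEC_Result register_dmabuf(TEEC_Context *ctx, TEEC_SharedMemory *shm,
			    int dmabuf_fd, size_t size)
{
	memset(shm, 0, sizeof(*shm));
	shm->size = size;
	shm->flags = TEEC_MEM_INPUT | TEEC_MEM_OUTPUT | TEEC_MEM_DMABUF;
	shm->buffer = NULL;	/* backed by the dma_buf, not user memory */
	/* an implementation-defined field in TEEC_SharedMemory would
	 * carry dmabuf_fd down to TEEC_RegisterSharedMemory() */
	(void)dmabuf_fd;
	return TEEC_RegisterSharedMemory(ctx, shm);
}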
We also have an extra review comment on optee_client. In the supplicant (optee_client), the buffer lifetime in get_param() and free_param() is not really clear to us:
get_param():
- mmap the buffer
- sets params[idx].u.memref.shm_fd to -1. Why?
write_response() closes all file descriptors that haven't been claimed through get_param(). Since get_param() has created a shm with this file descriptor, it will be closed by free_param() instead. Closing it twice would be an error, and doing a dup() on the file descriptor instead would just be inefficient.
free_param()
- unmap the buffer
- close it
The questions:
- why don't we have a "refcount++" on the buffer in get_param()?
- isn't it dangerous to close the buffer (roughly meaning "it is no longer used") before it has been sent to and processed by the driver?
The refcount was already increased by the driver when creating a new file descriptor for this buffer. So when tee-supplicant gets the buffer the refcount is already at least 2; closing it will not make it hit 0.
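A simplified sketch of the lifetime rules described above; the names are illustrative, not the actual tee-supplicant code:

#include <sys/mman.h>
#include <unistd.h>

struct param {
	int shm_fd;	/* as received from the driver */
	int fd;		/* claimed copy, closed by free_param() */
	void *va;
	size_t size;
};

static int get_param(struct param *p)
{
	p->va = mmap(NULL, p->size, PROT_READ | PROT_WRITE,
		     MAP_SHARED, p->shm_fd, 0);
	if (p->va == MAP_FAILED)
		return -1;
	/* Claim the fd: write_response() closes every shm_fd that is
	 * still >= 0, and free_param() closes it too, so leaving it
	 * set would close it twice. The driver already holds a
	 * reference on the underlying buffer, so no dup() is needed. */
	p->fd = p->shm_fd;
	p->shm_fd = -1;
	return 0;
}

static void free_param(struct param *p)
{
	munmap(p->va, p->size);
	close(p->fd);
}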
Thanks, Jens