Hello,
There was a discussion that started on GitHub (https://github.com/OP-TEE/optee_os/issues/1019) and then continued on this mailing list. We discussed how OP-TEE can interact with virtualization and came to the conclusion that one of the tasks is to rework the shared memory mechanism. So I want to share my vision on this topic.
Currently a shared buffer is carved out from RAM by some higher power and presented to OP-TEE. OP-TEE informs the Normal World about its location, and then OP-TEE uses the Normal World's allocator to allocate buffers for various needs. This approach has two drawbacks: the shared region size is limited, and it wastes memory space when it is not used. On most platforms it has a size of 1 megabyte, which is not enough to run the GP tests (according to comments in the board files).
We have agreed that we don't need this shared carveout at all. This region is non-secure by definition, so the Normal World can allocate memory anywhere it wishes and then just pass PAs to OP-TEE.
I have already done most of the work on the kernel side. You can check it here: https://github.com/lorc/linux/tree/optee-new-shm This code still needs to be tidied up, and possibly some caching should be added. I also hacked the OP-TEE core to make it work. So I want to discuss proper changes to the OP-TEE memory management subsystem.
Main requirements are:
- MM should be dynamic: we can add and remove mappings at runtime.
- There can be an arbitrary number of maps.
- Every map should be associated with a guest virtual machine, so we can clean up easily when one of the Normal Worlds dies.
- Preserve existing APIs if possible. I don't want to rework all the MM API calls, and you probably don't want to review that :)
Here is how I see this: there will be two types of mappings, static and dynamic. Most of the areas will be left static, but the area for shared memory (and probably the area for device I/O) will be made dynamic. I actually borrowed this idea from Linux's vmalloc.c. Let's take SHM as an example:
There will be a region in the virtual address space for SHM, but it will not be backed by any physical memory at the start. When OP-TEE needs a shared buffer, it issues an RPC call to the Normal World, and the Normal World provides physical page(s). OP-TEE then allocates part of the SHM virtual space and maps the pages provided by the Normal World there. When the Normal World allocates SHM on its own, it calls the Secure World with `OPTEE_SMC_ADD_SHM_BUFFER`, providing a list of pages. OP-TEE then maps the pages in the same way as in the previous case. There will be complementary functions and SMC calls to unmap a shared buffer and free the allocated VA space.
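To make the `OPTEE_SMC_ADD_SHM_BUFFER` path more concrete, here is a rough sketch of the Normal World side. This is only an illustration: the page-list layout, the register assignments and the returned cookie are placeholders I made up, nothing is fixed yet.

```c
/*
 * Sketch only: the page-list layout and register usage below are
 * made up for illustration; the real ABI is still to be defined.
 */
#include <linux/arm-smccc.h>

struct shm_page_list {
	u64 pages[511];	/* PAs of the pages backing the buffer */
	u64 next_list;	/* PA of the next list fragment, or 0 if last */
};

static int optee_add_shm_buffer(phys_addr_t page_list, size_t num_pages,
				u64 *cookie)
{
	struct arm_smccc_res res;

	/* Secure World walks the list and maps the pages into its
	 * dynamic SHM VA region. */
	arm_smccc_smc(OPTEE_SMC_ADD_SHM_BUFFER, page_list, num_pages,
		      0, 0, 0, 0, 0, &res);
	if (res.a0 != OPTEE_SMC_RETURN_OK)
		return -EINVAL;
	*cookie = res.a1;	/* identifies the buffer when unmapping */
	return 0;
}
```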
We will need some allocator to track areas in the SHM VA space. I plan to use an rb-tree allocator like the one used in vmalloc.c. We will probably also need a second rb-tree keyed by PAs to speed up `phys_to_virt` lookups. For every allocated area in the VA space we will hold a virtual machine id, so we can free the region when the virtual machine dies.
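For illustration, an allocated area could be described by something like the structure below. All names are just a sketch; a scattered multi-page buffer would likely need one PA-tree node per page rather than one per area.

```c
/* Sketch of a node in the dynamic SHM VA space tracker. */
struct shm_area {
	vaddr_t va;		/* start of the area in SHM VA space */
	size_t num_pages;	/* area length in pages */
	paddr_t *pages;		/* Normal World pages backing the area */
	uint16_t vm_id;		/* owning guest, for cleanup on VM death */
	struct rb_node va_node;	/* tree keyed by va: virt_to_phys, VA alloc */
	struct rb_node pa_node;	/* tree keyed by pa: phys_to_virt */
};
```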
The APIs in core_memprot.h will be preserved; there will just be special handling for the SHM region cases.
Corresponding changes to the MMU drivers should also be made: we need to be able to map/unmap individual pages.
The same mechanism can be used for memory-mapped peripherals, so device drivers can use plain old `ioremap` to gain access to device registers.
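For example, `ioremap` could then be a thin wrapper over the same dynamic mapper. In this sketch, `alloc_dyn_va()`, `free_dyn_va()` and `map_pages()` are placeholders for whatever primitives the reworked MMU driver ends up providing:

```c
/* Sketch: alloc_dyn_va(), free_dyn_va() and map_pages() are
 * hypothetical names for the dynamic mapping primitives. */
void *ioremap(paddr_t pa, size_t len)
{
	/* Reserve a range in the dynamic VA region... */
	vaddr_t va = alloc_dyn_va(len);

	if (!va)
		return NULL;
	/* ...and back it with the device's register pages. */
	if (map_pages(va, pa, len, MEM_AREA_IO_SEC)) {
		free_dyn_va(va, len);
		return NULL;
	}
	return (void *)va;	/* assumes pa is page aligned */
}
```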
I hope this is not overengineered :)
I would appreciate any comments on this, especially if I missed something.
Hi Volodymyr,
On Wed, Oct 26, 2016 at 8:44 PM, Volodymyr Babchuk <vlad.babchuk@gmail.com> wrote:
> Hello,
> There was a discussion that started on GitHub (https://github.com/OP-TEE/optee_os/issues/1019) and then continued on this mailing list. We discussed how OP-TEE can interact with virtualization and came to the conclusion that one of the tasks is to rework the shared memory mechanism. So I want to share my vision on this topic.
> Currently a shared buffer is carved out from RAM by some higher power and presented to OP-TEE. OP-TEE informs the Normal World about its location, and then OP-TEE uses the Normal World's allocator to allocate buffers for various needs. This approach has two drawbacks: the shared region size is limited, and it wastes memory space when it is not used. On most platforms it has a size of 1 megabyte, which is not enough to run the GP tests (according to comments in the board files).
> We have agreed that we don't need this shared carveout at all. This region is non-secure by definition, so the Normal World can allocate memory anywhere it wishes and then just pass PAs to OP-TEE.
> I have already done most of the work on the kernel side. You can check it here: https://github.com/lorc/linux/tree/optee-new-shm This code still needs to be tidied up, and possibly some caching should be added. I also hacked the OP-TEE core to make it work. So I want to discuss proper changes to the OP-TEE memory management subsystem.
I've done a similar experiment in the normal world (maintaining backwards compatibility). My WIP patches are available at: https://github.com/jenswi-linaro/linux-1/tree/reg_shm https://github.com/jenswi-linaro/optee_client/tree/shm_reg
My focus is shared memory with user space; kernel space still uses the pool.
> Main requirements are:
> - MM should be dynamic: we can add and remove mappings at runtime.
> - There can be an arbitrary number of maps.
> - Every map should be associated with a guest virtual machine, so we can clean up easily when one of the Normal Worlds dies.
> - Preserve existing APIs if possible. I don't want to rework all the MM API calls, and you probably don't want to review that :)
I'm currently working on a way to represent memory objects in a generic fashion in order to be able to implement secure paged shared memory (sometimes needed when a TA invokes another TA). These abstract memory objects (MOBJs) are somewhat like your maps above. One difference is that a MOBJ will only be mapped when needed. RPC and command buffers always need to be mapped, so for those there's no difference from your maps, but most MOBJs will only be accessed by TAs, so they will only be mapped in TA space.
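Roughly, a MOBJ boils down to a size plus a set of operations, so shared memory, paged memory and carveout memory can hide behind the same interface. The code is still moving, so take the names and signatures below as a sketch only:

```c
/* Sketch of the MOBJ interface; details are still in flux. */
struct mobj;

struct mobj_ops {
	/* Return a VA for an offset into the object, mapping on demand */
	void *(*get_va)(struct mobj *mobj, size_t offs);
	/* Return the PA backing an offset, if it has a stable PA */
	TEE_Result (*get_pa)(struct mobj *mobj, size_t offs, paddr_t *pa);
	/* Does this object live in secure/non-secure/shared memory? */
	bool (*matches)(struct mobj *mobj, enum buf_is_attr attr);
	/* Release the object, unmapping it if needed */
	void (*free)(struct mobj *mobj);
};

struct mobj {
	const struct mobj_ops *ops;
	size_t size;
};
```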
> Here is how I see this: there will be two types of mappings, static and dynamic. Most of the areas will be left static, but the area for shared memory (and probably the area for device I/O) will be made dynamic. I actually borrowed this idea from Linux's vmalloc.c. Let's take SHM as an example:
> There will be a region in the virtual address space for SHM, but it will not be backed by any physical memory at the start. When OP-TEE needs a shared buffer, it issues an RPC call to the Normal World, and the Normal World provides physical page(s). OP-TEE then allocates part of the SHM virtual space and maps the pages provided by the Normal World there. When the Normal World allocates SHM on its own, it calls the Secure World with `OPTEE_SMC_ADD_SHM_BUFFER`, providing a list of pages. OP-TEE then maps the pages in the same way as in the previous case. There will be complementary functions and SMC calls to unmap a shared buffer and free the allocated VA space.
> We will need some allocator to track areas in the SHM VA space. I plan to use an rb-tree allocator like the one used in vmalloc.c. We will probably also need a second rb-tree keyed by PAs to speed up `phys_to_virt` lookups. For every allocated area in the VA space we will hold a virtual machine id, so we can free the region when the virtual machine dies.
> The APIs in core_memprot.h will be preserved; there will just be special handling for the SHM region cases.
> Corresponding changes to the MMU drivers should also be made: we need to be able to map/unmap individual pages.
> The same mechanism can be used for memory-mapped peripherals, so device drivers can use plain old `ioremap` to gain access to device registers.
> I hope this is not overengineered :)
I think we should, for the moment, keep mapping of peripherals out of this, as it's a different problem. Today we actually have some support for dealing with mapping of peripherals via core_mmu_add_mapping().
To manage the SHM VA space you can use the API in <mm/tee_mm.h>. Provided that we only map the MOBJs that we need to access, I think they will be rather few, so scalability is probably not an issue. At least it's nothing to worry about now, as long as we know that it can be fixed if needed.
My current focus is abstracting memory in struct tee_ta_param so that paged secure shared memory can be implemented in a sane way. The purpose of my experiment with sharing user space memory with the secure world was to get a better feeling for what's needed from the memory abstraction in other use cases. Hopefully your work and mine can be done more or less independently.
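Something along these lines, for example. This is from memory, so double-check the exact signatures in <mm/tee_mm.h>; SHM_DYN_BASE and SHM_DYN_SIZE are made-up placeholders for wherever the dynamic window ends up:

```c
#include <mm/tee_mm.h>

static tee_mm_pool_t shm_va_pool;

/* One-time setup: manage the dynamic SHM window at page granularity. */
void shm_va_init(void)
{
	tee_mm_init(&shm_va_pool, SHM_DYN_BASE, SHM_DYN_BASE + SHM_DYN_SIZE,
		    SMALL_PAGE_SHIFT, TEE_MM_POOL_NO_FLAGS);
}

/* Carve a VA range out of the window; returns 0 on failure. */
vaddr_t shm_va_alloc(size_t len)
{
	tee_mm_entry_t *mm = tee_mm_alloc(&shm_va_pool, len);

	return mm ? tee_mm_get_smem(mm) : 0;
}
```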
Thanks, Jens
Hi Jens,
On 27 October 2016 at 14:19, Jens Wiklander <jens.wiklander@linaro.org> wrote:
> Hi Volodymyr,
> On Wed, Oct 26, 2016 at 8:44 PM, Volodymyr Babchuk <vlad.babchuk@gmail.com> wrote:
> > Hello,
> > There was a discussion that started on GitHub (https://github.com/OP-TEE/optee_os/issues/1019) and then continued on this mailing list. We discussed how OP-TEE can interact with virtualization and came to the conclusion that one of the tasks is to rework the shared memory mechanism. So I want to share my vision on this topic.
> > Currently a shared buffer is carved out from RAM by some higher power and presented to OP-TEE. OP-TEE informs the Normal World about its location, and then OP-TEE uses the Normal World's allocator to allocate buffers for various needs. This approach has two drawbacks: the shared region size is limited, and it wastes memory space when it is not used. On most platforms it has a size of 1 megabyte, which is not enough to run the GP tests (according to comments in the board files).
> > We have agreed that we don't need this shared carveout at all. This region is non-secure by definition, so the Normal World can allocate memory anywhere it wishes and then just pass PAs to OP-TEE.
> > I have already done most of the work on the kernel side. You can check it here: https://github.com/lorc/linux/tree/optee-new-shm This code still needs to be tidied up, and possibly some caching should be added. I also hacked the OP-TEE core to make it work. So I want to discuss proper changes to the OP-TEE memory management subsystem.
> I've done a similar experiment in the normal world (maintaining backwards compatibility). My WIP patches are available at: https://github.com/jenswi-linaro/linux-1/tree/reg_shm https://github.com/jenswi-linaro/optee_client/tree/shm_reg
> My focus is shared memory with user space; kernel space still uses the pool.
It looks very good. But I think it would be good to get rid of the pool completely. That would make the API more consistent, and it would definitely ease integration with a hypervisor.
> > Main requirements are:
> > - MM should be dynamic: we can add and remove mappings at runtime.
> > - There can be an arbitrary number of maps.
> > - Every map should be associated with a guest virtual machine, so we can clean up easily when one of the Normal Worlds dies.
> > - Preserve existing APIs if possible. I don't want to rework all the MM API calls, and you probably don't want to review that :)
> I'm currently working on a way to represent memory objects in a generic fashion in order to be able to implement secure paged shared memory (sometimes needed when a TA invokes another TA). These abstract memory objects (MOBJs) are somewhat like your maps above. One difference is that a MOBJ will only be mapped when needed. RPC and command buffers always need to be mapped, so for those there's no difference from your maps, but most MOBJs will only be accessed by TAs, so they will only be mapped in TA space.
I see. I think it is good to reuse the same mechanism for all shared buffers. Can I ask about the current state of this feature? Can I see some code to get a better grip on it?
> > Here is how I see this: there will be two types of mappings, static and dynamic. Most of the areas will be left static, but the area for shared memory (and probably the area for device I/O) will be made dynamic. I actually borrowed this idea from Linux's vmalloc.c. Let's take SHM as an example:
> > There will be a region in the virtual address space for SHM, but it will not be backed by any physical memory at the start. When OP-TEE needs a shared buffer, it issues an RPC call to the Normal World, and the Normal World provides physical page(s). OP-TEE then allocates part of the SHM virtual space and maps the pages provided by the Normal World there. When the Normal World allocates SHM on its own, it calls the Secure World with `OPTEE_SMC_ADD_SHM_BUFFER`, providing a list of pages. OP-TEE then maps the pages in the same way as in the previous case. There will be complementary functions and SMC calls to unmap a shared buffer and free the allocated VA space.
> > We will need some allocator to track areas in the SHM VA space. I plan to use an rb-tree allocator like the one used in vmalloc.c. We will probably also need a second rb-tree keyed by PAs to speed up `phys_to_virt` lookups. For every allocated area in the VA space we will hold a virtual machine id, so we can free the region when the virtual machine dies.
> > The APIs in core_memprot.h will be preserved; there will just be special handling for the SHM region cases.
> > Corresponding changes to the MMU drivers should also be made: we need to be able to map/unmap individual pages.
> > The same mechanism can be used for memory-mapped peripherals, so device drivers can use plain old `ioremap` to gain access to device registers.
> > I hope this is not overengineered :)
> I think we should, for the moment, keep mapping of peripherals out of this, as it's a different problem. Today we actually have some support for dealing with mapping of peripherals via core_mmu_add_mapping().
Yes, I thought that peripheral mapping would be a free bonus. I have seen core_mmu_add_mapping(); actually, I used it in my hack to map shared buffers from the Normal World :)
> To manage the SHM VA space you can use the API in <mm/tee_mm.h>. Provided that we only map the MOBJs that we need to access, I think they will be rather few, so scalability is probably not an issue. At least it's nothing to worry about now, as long as we know that it can be fixed if needed.
Oops, I completely missed that part of the MM subsystem. Sure, we don't need another allocator if there already is one.
> My current focus is abstracting memory in struct tee_ta_param so that paged secure shared memory can be implemented in a sane way. The purpose of my experiment with sharing user space memory with the secure world was to get a better feeling for what's needed from the memory abstraction in other use cases. Hopefully your work and mine can be done more or less independently.
You know OP-TEE better than I do, so your opinion is very valuable to me. I am reluctant to add new entities to the core, so what do you think: can MOBJs be altered to also serve as the primitives for all shared buffers? If yes, then I probably need to wait until you finish your work before starting mine.
On Thu, Oct 27, 2016 at 8:53 PM, Volodymyr Babchuk <vlad.babchuk@gmail.com> wrote:
> Hi Jens,
> On 27 October 2016 at 14:19, Jens Wiklander <jens.wiklander@linaro.org> wrote:
> > Hi Volodymyr,
> > On Wed, Oct 26, 2016 at 8:44 PM, Volodymyr Babchuk <vlad.babchuk@gmail.com> wrote:
> > > Hello,
> > > There was a discussion that started on GitHub (https://github.com/OP-TEE/optee_os/issues/1019) and then continued on this mailing list. We discussed how OP-TEE can interact with virtualization and came to the conclusion that one of the tasks is to rework the shared memory mechanism. So I want to share my vision on this topic.
> > > Currently a shared buffer is carved out from RAM by some higher power and presented to OP-TEE. OP-TEE informs the Normal World about its location, and then OP-TEE uses the Normal World's allocator to allocate buffers for various needs. This approach has two drawbacks: the shared region size is limited, and it wastes memory space when it is not used. On most platforms it has a size of 1 megabyte, which is not enough to run the GP tests (according to comments in the board files).
> > > We have agreed that we don't need this shared carveout at all. This region is non-secure by definition, so the Normal World can allocate memory anywhere it wishes and then just pass PAs to OP-TEE.
> > > I have already done most of the work on the kernel side. You can check it here: https://github.com/lorc/linux/tree/optee-new-shm This code still needs to be tidied up, and possibly some caching should be added. I also hacked the OP-TEE core to make it work. So I want to discuss proper changes to the OP-TEE memory management subsystem.
> > I've done a similar experiment in the normal world (maintaining backwards compatibility). My WIP patches are available at: https://github.com/jenswi-linaro/linux-1/tree/reg_shm https://github.com/jenswi-linaro/optee_client/tree/shm_reg
> > My focus is shared memory with user space; kernel space still uses the pool.
> It looks very good. But I think it would be good to get rid of the pool completely. That would make the API more consistent, and it would definitely ease integration with a hypervisor.
I agree. But the pool may very well still be needed for some TEE implementations, so we'll need to be able to do both.
> > > Main requirements are:
> > > - MM should be dynamic: we can add and remove mappings at runtime.
> > > - There can be an arbitrary number of maps.
> > > - Every map should be associated with a guest virtual machine, so we can clean up easily when one of the Normal Worlds dies.
> > > - Preserve existing APIs if possible. I don't want to rework all the MM API calls, and you probably don't want to review that :)
> > I'm currently working on a way to represent memory objects in a generic fashion in order to be able to implement secure paged shared memory (sometimes needed when a TA invokes another TA). These abstract memory objects (MOBJs) are somewhat like your maps above. One difference is that a MOBJ will only be mapped when needed. RPC and command buffers always need to be mapped, so for those there's no difference from your maps, but most MOBJs will only be accessed by TAs, so they will only be mapped in TA space.
> I see. I think it is good to reuse the same mechanism for all shared buffers. Can I ask about the current state of this feature? Can I see some code to get a better grip on it?
The code is currently changing a lot, so I'm afraid it would be a bit confusing at the moment. Give me two weeks and I'll share something that's a bit more stable.
> > > Here is how I see this: there will be two types of mappings, static and dynamic. Most of the areas will be left static, but the area for shared memory (and probably the area for device I/O) will be made dynamic. I actually borrowed this idea from Linux's vmalloc.c. Let's take SHM as an example:
> > > There will be a region in the virtual address space for SHM, but it will not be backed by any physical memory at the start. When OP-TEE needs a shared buffer, it issues an RPC call to the Normal World, and the Normal World provides physical page(s). OP-TEE then allocates part of the SHM virtual space and maps the pages provided by the Normal World there. When the Normal World allocates SHM on its own, it calls the Secure World with `OPTEE_SMC_ADD_SHM_BUFFER`, providing a list of pages. OP-TEE then maps the pages in the same way as in the previous case. There will be complementary functions and SMC calls to unmap a shared buffer and free the allocated VA space.
> > > We will need some allocator to track areas in the SHM VA space. I plan to use an rb-tree allocator like the one used in vmalloc.c. We will probably also need a second rb-tree keyed by PAs to speed up `phys_to_virt` lookups. For every allocated area in the VA space we will hold a virtual machine id, so we can free the region when the virtual machine dies.
> > > The APIs in core_memprot.h will be preserved; there will just be special handling for the SHM region cases.
> > > Corresponding changes to the MMU drivers should also be made: we need to be able to map/unmap individual pages.
> > > The same mechanism can be used for memory-mapped peripherals, so device drivers can use plain old `ioremap` to gain access to device registers.
> > > I hope this is not overengineered :)
> > I think we should, for the moment, keep mapping of peripherals out of this, as it's a different problem. Today we actually have some support for dealing with mapping of peripherals via core_mmu_add_mapping().
> Yes, I thought that peripheral mapping would be a free bonus. I have seen core_mmu_add_mapping(); actually, I used it in my hack to map shared buffers from the Normal World :)
:-)
> > To manage the SHM VA space you can use the API in <mm/tee_mm.h>. Provided that we only map the MOBJs that we need to access, I think they will be rather few, so scalability is probably not an issue. At least it's nothing to worry about now, as long as we know that it can be fixed if needed.
> Oops, I completely missed that part of the MM subsystem. Sure, we don't need another allocator if there already is one.
> > My current focus is abstracting memory in struct tee_ta_param so that paged secure shared memory can be implemented in a sane way. The purpose of my experiment with sharing user space memory with the secure world was to get a better feeling for what's needed from the memory abstraction in other use cases. Hopefully your work and mine can be done more or less independently.
> You know OP-TEE better than I do, so your opinion is very valuable to me. I am reluctant to add new entities to the core, so what do you think: can MOBJs be altered to also serve as the primitives for all shared buffers? If yes, then I probably need to wait until you finish your work before starting mine.
The idea is that these MOBJs should be the primitives for all shared buffers, and even manage parts of their lifecycle.
Thanks, Jens
On 29 October 2016 at 12:57, Jens Wiklander <jens.wiklander@linaro.org> wrote:
> On Thu, Oct 27, 2016 at 8:53 PM, Volodymyr Babchuk <vlad.babchuk@gmail.com> wrote:
> > Hi Jens,
> > On 27 October 2016 at 14:19, Jens Wiklander <jens.wiklander@linaro.org> wrote:
> > > Hi Volodymyr,
> > > On Wed, Oct 26, 2016 at 8:44 PM, Volodymyr Babchuk <vlad.babchuk@gmail.com> wrote:
> > > > Hello,
> > > > There was a discussion that started on GitHub (https://github.com/OP-TEE/optee_os/issues/1019) and then continued on this mailing list. We discussed how OP-TEE can interact with virtualization and came to the conclusion that one of the tasks is to rework the shared memory mechanism. So I want to share my vision on this topic.
> > > > Currently a shared buffer is carved out from RAM by some higher power and presented to OP-TEE. OP-TEE informs the Normal World about its location, and then OP-TEE uses the Normal World's allocator to allocate buffers for various needs. This approach has two drawbacks: the shared region size is limited, and it wastes memory space when it is not used. On most platforms it has a size of 1 megabyte, which is not enough to run the GP tests (according to comments in the board files).
> > > > We have agreed that we don't need this shared carveout at all. This region is non-secure by definition, so the Normal World can allocate memory anywhere it wishes and then just pass PAs to OP-TEE.
> > > > I have already done most of the work on the kernel side. You can check it here: https://github.com/lorc/linux/tree/optee-new-shm This code still needs to be tidied up, and possibly some caching should be added. I also hacked the OP-TEE core to make it work. So I want to discuss proper changes to the OP-TEE memory management subsystem.
> > > I've done a similar experiment in the normal world (maintaining backwards compatibility). My WIP patches are available at: https://github.com/jenswi-linaro/linux-1/tree/reg_shm https://github.com/jenswi-linaro/optee_client/tree/shm_reg
> > > My focus is shared memory with user space; kernel space still uses the pool.
> > It looks very good. But I think it would be good to get rid of the pool completely. That would make the API more consistent, and it would definitely ease integration with a hypervisor.
> I agree. But the pool may very well still be needed for some TEE implementations, so we'll need to be able to do both.
I don't want a remote pool interface. Actually, I don't want to touch the generic TEE interfaces at all. In my kernel patch I reused the pool interface and just created another pool implementation that is specific to OP-TEE (drivers/tee/optee/shm_pool.c). You did a good job designing the generic TEE interface; I see no point in altering it.
> > > > Main requirements are:
> > > > - MM should be dynamic: we can add and remove mappings at runtime.
> > > > - There can be an arbitrary number of maps.
> > > > - Every map should be associated with a guest virtual machine, so we can clean up easily when one of the Normal Worlds dies.
> > > > - Preserve existing APIs if possible. I don't want to rework all the MM API calls, and you probably don't want to review that :)
> > > I'm currently working on a way to represent memory objects in a generic fashion in order to be able to implement secure paged shared memory (sometimes needed when a TA invokes another TA). These abstract memory objects (MOBJs) are somewhat like your maps above. One difference is that a MOBJ will only be mapped when needed. RPC and command buffers always need to be mapped, so for those there's no difference from your maps, but most MOBJs will only be accessed by TAs, so they will only be mapped in TA space.
> > I see. I think it is good to reuse the same mechanism for all shared buffers. Can I ask about the current state of this feature? Can I see some code to get a better grip on it?
> The code is currently changing a lot, so I'm afraid it would be a bit confusing at the moment. Give me two weeks and I'll share something that's a bit more stable.
Okay. I also plan to finish my work in the coming weeks.
> > > > Here is how I see this: there will be two types of mappings, static and dynamic. Most of the areas will be left static, but the area for shared memory (and probably the area for device I/O) will be made dynamic. I actually borrowed this idea from Linux's vmalloc.c. Let's take SHM as an example:
> > > > There will be a region in the virtual address space for SHM, but it will not be backed by any physical memory at the start. When OP-TEE needs a shared buffer, it issues an RPC call to the Normal World, and the Normal World provides physical page(s). OP-TEE then allocates part of the SHM virtual space and maps the pages provided by the Normal World there. When the Normal World allocates SHM on its own, it calls the Secure World with `OPTEE_SMC_ADD_SHM_BUFFER`, providing a list of pages. OP-TEE then maps the pages in the same way as in the previous case. There will be complementary functions and SMC calls to unmap a shared buffer and free the allocated VA space.
> > > > We will need some allocator to track areas in the SHM VA space. I plan to use an rb-tree allocator like the one used in vmalloc.c. We will probably also need a second rb-tree keyed by PAs to speed up `phys_to_virt` lookups. For every allocated area in the VA space we will hold a virtual machine id, so we can free the region when the virtual machine dies.
> > > > The APIs in core_memprot.h will be preserved; there will just be special handling for the SHM region cases.
> > > > Corresponding changes to the MMU drivers should also be made: we need to be able to map/unmap individual pages.
> > > > The same mechanism can be used for memory-mapped peripherals, so device drivers can use plain old `ioremap` to gain access to device registers.
> > > > I hope this is not overengineered :)
> > > I think we should, for the moment, keep mapping of peripherals out of this, as it's a different problem. Today we actually have some support for dealing with mapping of peripherals via core_mmu_add_mapping().
> > Yes, I thought that peripheral mapping would be a free bonus. I have seen core_mmu_add_mapping(); actually, I used it in my hack to map shared buffers from the Normal World :)
> :-)
> > > To manage the SHM VA space you can use the API in <mm/tee_mm.h>. Provided that we only map the MOBJs that we need to access, I think they will be rather few, so scalability is probably not an issue. At least it's nothing to worry about now, as long as we know that it can be fixed if needed.
> > Oops, I completely missed that part of the MM subsystem. Sure, we don't need another allocator if there already is one.
> > > My current focus is abstracting memory in struct tee_ta_param so that paged secure shared memory can be implemented in a sane way. The purpose of my experiment with sharing user space memory with the secure world was to get a better feeling for what's needed from the memory abstraction in other use cases. Hopefully your work and mine can be done more or less independently.
> > You know OP-TEE better than I do, so your opinion is very valuable to me. I am reluctant to add new entities to the core, so what do you think: can MOBJs be altered to also serve as the primitives for all shared buffers? If yes, then I probably need to wait until you finish your work before starting mine.
> The idea is that these MOBJs should be the primitives for all shared buffers, and even manage parts of their lifecycle.
Okay, when you finish the MOBJs, I'll look at how to integrate them with my part.
Hi Jens,
On the topic of integrating a TEE with a hypervisor in a multi-VM environment like Volodymyr wants to do, we would not feel comfortable getting rid of the "shm pool" concept, specifically the shm allocation from the carveout.
We would disagree with a protection model where any NS buffer, from any client operating system or from EL2, can be shared with the Secure World. The reason is that we want to sandbox NS EL1 and NS EL2 from the Secure World, since in the past people have used loopholes in TAs to root the Linux kernel. Google "bitsplease" if you want a recent example.
Granted, in the ARM architecture S-EL1 can always gain access to the entire NS memory space, but S-EL0 does not have that power, nor do some of the non-ARM-TrustZone TEEs.
Last I checked, the generic TEE driver has the mandate to work with any TEE. We agree with that goal :-)
Having said that, we also agree with Volodymyr: always maxing out the carveout is not terribly space-efficient.
When we rework the MM in OP-TEE, would the SHM protocol, currently implemented in drivers/tee as a generic protocol (namely, not specific to OP-TEE), be subject to some tweaks?
Thanks
-thomas
Hi Thomas,
On Tue, Dec 6, 2016 at 6:34 PM, thomas m zeng <tzeng@codeaurora.org> wrote:
> Hi Jens,
> On the topic of integrating a TEE with a hypervisor in a multi-VM environment like Volodymyr wants to do, we would not feel comfortable getting rid of the "shm pool" concept, specifically the shm allocation from the carveout.
I agree that we should keep the pool for some configurations.
> We would disagree with a protection model where any NS buffer, from any client operating system or from EL2, can be shared with the Secure World. The reason is that we want to sandbox NS EL1 and NS EL2 from the Secure World, since in the past people have used loopholes in TAs to root the Linux kernel. Google "bitsplease" if you want a recent example.
> Granted, in the ARM architecture S-EL1 can always gain access to the entire NS memory space, but S-EL0 does not have that power, nor do some of the non-ARM-TrustZone TEEs.
> Last I checked, the generic TEE driver has the mandate to work with any TEE. We agree with that goal :-)
:-)
> Having said that, we also agree with Volodymyr: always maxing out the carveout is not terribly space-efficient.
> When we rework the MM in OP-TEE, would the SHM protocol, currently implemented in drivers/tee as a generic protocol (namely, not specific to OP-TEE), be subject to some tweaks?
Sorry, I don't understand: which protocol?
Thanks, Jens
Hi Jens,
> > When we rework the MM in OP-TEE, would the SHM protocol, currently implemented in drivers/tee as a generic protocol (namely, not specific to OP-TEE), be subject to some tweaks?
> Sorry, I don't understand: which protocol?
The NS side is expected to call MSG_GET_SHM_CONFIG at boot and use the returned parameters to initialize the shm_pool module.
To be able to work with multiple VMs, are we open to tweaking the boot protocol when we rework the MM in OP-TEE and the generic TEE driver?
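For reference, the handshake behind MSG_GET_SHM_CONFIG looks roughly like the sketch below (simplified from the generic driver patches; `optee_query_shm_config` is a made-up name, and details may differ). With multiple VMs, the single static carveout answer in a1/a2 is exactly what would have to change:

```c
/* Rough sketch of the boot-time SHM handshake on the NS side. */
static int optee_query_shm_config(optee_invoke_fn *invoke_fn,
				  phys_addr_t *pa, size_t *size)
{
	struct arm_smccc_res res;

	invoke_fn(OPTEE_SMC_GET_SHM_CONFIG, 0, 0, 0, 0, 0, 0, 0, &res);
	if (res.a0 != OPTEE_SMC_RETURN_OK)
		return -EINVAL;	/* TEE reports no static carveout */
	*pa = res.a1;	/* start of the carveout */
	*size = res.a2;	/* its size; res.a3 carries cache attributes */
	return 0;
}
```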
thanks -thomas
Hi,
On Wed, Dec 7, 2016 at 7:19 PM, thomas m zeng <tzeng@codeaurora.org> wrote:
> Hi Jens,
> > > When we rework the MM in OP-TEE, would the SHM protocol, currently implemented in drivers/tee as a generic protocol (namely, not specific to OP-TEE), be subject to some tweaks?
> > Sorry, I don't understand: which protocol?
> The NS side is expected to call MSG_GET_SHM_CONFIG at boot and use the returned parameters to initialize the shm_pool module.
> To be able to work with multiple VMs, are we open to tweaking the boot protocol when we rework the MM in OP-TEE and the generic TEE driver?
Yes, I don't see a problem with that if needed, as long as we can keep backwards compatibility.
Thanks, Jens