Hi,
On Thu, Aug 25, 2016 at 7:51 PM, Volodymyr Babchuk <notifications@github.com> wrote:
Hello.
I didn't find any mention of an OP-TEE mailing list, so perhaps the issue tracker is the best place to discuss the matter.
We have tee-dev@lists.linaro.org, I've added it in CC, we can continue the discussion there if you wish.
The Xen hypervisor has now been ported to ARM, and our company is working (https://www.xenproject.org/component/allvideoshare/video/globallogic-xen-android.html) to provide virtualization solutions to different customers. Some customers also want to use a TEE in their projects, which raises the question of how the hypervisor and the OP-TEE kernel should interact. I have made a small PoC (https://github.com/lorc/xen/commits/staging-4.7) to show that the hypervisor and OP-TEE can live together. Long story short, I am able to run xtest on different guest VMs (virtual machines), but not simultaneously. Also, you can kill a VM during its interaction with the secure world and this will be handled correctly: RPCs will end with an error and open sessions will be closed. All handling is done in the hypervisor. I didn't make any changes to any part of OP-TEE.
Wow, that's cool!
But it is merely a PoC, and there are many questions that need to be answered:
- Synchronization. This is the biggest problem at the moment. There will be different wait queues in different VMs, and they know nothing about each other. Suppose that VM1's call has to wait on a wait queue until VM2 releases a mutex. Currently, there is no way to signal to VM1 that the wait is over: VM2 will signal only its own wait queue. We probably need another mechanism for signaling to the Normal World. Could it be a software interrupt, for example?
Yes, we really need a software interrupt for this. Unfortunately, all 8 non-secure software interrupts in the Linux ARMv7 kernel are taken. The recommendation from ARM is to configure the GIC with 8 non-secure software interrupts, so we should probably stick with that. This means that we need to find a way to free one non-secure software interrupt so that it can be reassigned for secure world use. As a PoC we could of course configure more non-secure SGIs, but I'm not sure it would be useful beyond a PoC.
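To make the signaling problem concrete, here is a toy model of the routing that is needed: each waiter is recorded together with the VM it belongs to, so a wake-up triggered by one VM can be delivered to another. This is only a sketch under assumptions, not code from the PoC or from OP-TEE; `WaitQueueRouter`, `sleep`, `wakeup` and `pending_irqs` are hypothetical names, and the `pending_irqs` list merely stands in for the software interrupt the hypervisor would inject.

```python
from collections import defaultdict


class WaitQueueRouter:
    """Toy model: remembers which VM sleeps on which key (e.g. a mutex id)."""

    def __init__(self):
        # key -> FIFO list of VM ids waiting on that key
        self._waiters = defaultdict(list)
        # (vm_id, key) pairs standing in for interrupts to be injected
        self.pending_irqs = []

    def sleep(self, vm_id, key):
        """Record that a call from vm_id is blocked on key."""
        self._waiters[key].append(vm_id)

    def wakeup(self, key):
        """Wake the oldest waiter on key, whichever VM it belongs to.

        Returns the VM id that must be notified, or None if nobody waits.
        """
        queue = self._waiters[key]
        if not queue:
            return None
        vm_id = queue.pop(0)
        self.pending_irqs.append((vm_id, key))
        return vm_id


router = WaitQueueRouter()
router.sleep(vm_id=1, key=42)   # VM1 blocks on mutex 42
target = router.wakeup(key=42)  # VM2 releases mutex 42
print(target, router.pending_irqs)
```

The point of the model is that the wake-up is addressed to the VM that owns the waiter rather than to the VM that released the resource.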
- High-level isolation between VMs' data. OP-TEE needs to track which shared buffer or session belongs to which VM. The SMC Calling Convention states that the VM ID should be passed in r7. I have seen that the OP-TEE kernel stores it in mutex metadata, but currently there is no use for it. We can probably tag every session with it. It is not a big deal in my opinion.
- TA isolation. Suppose that multiple VMs want to open a session to the same TA. Should we ensure that TA code will not leak any sensitive data between VMs? For example, we can run multiple instances of the same TA, one instance for each VM.
Yes, for different VMs, sessions and contexts of TAs should be completely isolated. That means, for instance, that single instance TAs will be single instance per VM. We'll likely only have static TAs as true single instance TAs.
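As an illustration of "single instance per VM", the sketch below keys TA instances by the pair (VM ID, TA UUID) and tags every session with the VM that opened it. It is a simplified model under assumptions, not OP-TEE code: `TaRegistry`, `open_session` and `instance_for` are hypothetical names, and a plain dict stands in for a TA context.

```python
class TaRegistry:
    """Toy model: one TA instance per (vm_id, ta_uuid), sessions tagged by VM."""

    def __init__(self):
        self._instances = {}  # (vm_id, ta_uuid) -> TA context (a dict here)
        self._sessions = {}   # session_id -> (vm_id, ta_uuid)
        self._next_id = 1

    def open_session(self, vm_id, ta_uuid):
        """Open a session, creating the per-VM TA instance on first use."""
        self._instances.setdefault((vm_id, ta_uuid), {})
        session_id = self._next_id
        self._next_id += 1
        self._sessions[session_id] = (vm_id, ta_uuid)
        return session_id

    def instance_for(self, vm_id, session_id):
        """Return the TA context, refusing sessions owned by another VM."""
        owner = self._sessions.get(session_id)
        if owner is None or owner[0] != vm_id:
            raise PermissionError("session does not belong to this VM")
        return self._instances[owner]


registry = TaRegistry()
s1 = registry.open_session(vm_id=1, ta_uuid="example-ta")
s2 = registry.open_session(vm_id=2, ta_uuid="example-ta")
# The same TA opened from two VMs yields two separate contexts.
print(registry.instance_for(1, s1) is registry.instance_for(2, s2))
```

Two sessions opened by the same VM to the same TA share one context, while a VM presenting another VM's session ID is rejected.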
- VM termination. A VM can die at any moment, even during an RPC, and it can leave open sessions behind. My PoC handles this, but in a somewhat crude way. Should we teach OP-TEE to handle this in a more civilized way? For example, the hypervisor signals that the VM is closed, and the OP-TEE kernel goes through all open sessions and terminates them. It also kills the threads assigned to that VM and then unlocks any locked mutexes.
Yes, OP-TEE should assist here.
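The cleanup sequence proposed above (terminate the VM's sessions, kill its threads, then unlock its mutexes) can be sketched as follows. This is a minimal model under assumptions, not OP-TEE code: `destroy_vm` is a hypothetical name, and the three dicts are stand-ins for whatever bookkeeping the kernel would keep, each mapping an object ID to the owning VM ID.

```python
def destroy_vm(vm_id, sessions, threads, mutex_owners):
    """Release everything a dead VM left behind.

    sessions, threads: dict of object id -> owning vm_id
    mutex_owners: dict of mutex id -> owning vm_id, or None if unlocked
    Returns (closed sessions, killed threads, released mutexes), sorted.
    """
    # 1. Terminate every session opened by the VM.
    closed = sorted(s for s, owner in sessions.items() if owner == vm_id)
    for s in closed:
        del sessions[s]

    # 2. Kill the threads assigned to the VM.
    killed = sorted(t for t, owner in threads.items() if owner == vm_id)
    for t in killed:
        del threads[t]

    # 3. Only then unlock mutexes the VM still held, so no thread of the
    #    dead VM can touch the protected state afterwards.
    released = sorted(m for m, owner in mutex_owners.items() if owner == vm_id)
    for m in released:
        mutex_owners[m] = None

    return closed, killed, released


sessions = {1: 10, 2: 20, 3: 10}
threads = {0: 10, 1: 20}
mutexes = {"m0": 10, "m1": None, "m2": 20}
print(destroy_vm(10, sessions, threads, mutexes))
```

State belonging to other VMs is left untouched, which is the property the hypervisor-side handling in the PoC also has to guarantee.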
- Shared memory. Currently, allocations from SHM are done by the Normal World, but there is only one region. My PoC splits this region into 4 parts and assigns each part to a different VM. In this way every VM has an isolated SHM region. But only three VMs are supported (one region is assigned to the hypervisor itself), and every VM can use at most 1/4 of SHM. That's not very convenient. OP-TEE (and the hypervisor) should probably allocate SHM on request.
The best would be if we could avoid using this SHM pool at all and instead use normal memory, allocated with vmalloc() or from user space, as shared memory. It is a bit complicated to implement, as we need to redesign how we handle shared memory objects in OP-TEE, but it would solve this problem and other problems as well.
Thanks, Jens