From: Volodymyr Babchuk vlad.babchuk@gmail.com
Hello all,
This is v2 of patch series for OP-TEE mediator support in XEN. Changes from v1:
 - Added domctl interface, so now xl decides which domains should work with TEE
 - Removed XSM support due to the change described above
 - Patch with OP-TEE mediator was split into 7 separate patches
 - Removed patch with call_smccc() function. Now this series depends on
   Julien Grall's series "xen/arm: SMCCC fixup and improvement" [3]
===== v1:
This is a follow-up to patch series [1]. There were lots of discussions around that series and I have tried to address all of them in this new patchset.
Currently I have a working solution for OP-TEE virtualization and it is being upstreamed right now ([2]). So I think it is a good time to introduce support in XEN as well.
This series includes a generic TEE mediator framework and a full-scale OP-TEE mediator which works with the mentioned changes in OP-TEE. So multiple domains can work simultaneously with OP-TEE.
I added XSM support, so now it is possible to control which domains can work with TEEs. Also, I changed the way TEE discovery is done. Now it is very generic and should support any platform.
[1] https://lists.xenproject.org/archives/html/xen-devel/2017-10/msg01451.html
[2] https://github.com/OP-TEE/optee_os/pull/2370
[3] https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg02138.html
Volodymyr Babchuk (13):
  arm: add generic TEE mediator framework
  domctl: add tee_op domctl
  arm: tee: add OP-TEE header files
  optee: add OP-TEE mediator skeleton
  optee: add fast calls handling
  optee: add domain contexts
  optee: add std call handling
  optee: add support for RPC SHM buffers
  optee: add support for arbitrary shared memory
  optee: add support for RPC commands
  libxc: add xc_dom_tee_enable(...) function
  xl: add "tee" option for xl.cfg
  libxl: arm: create optee firmware node in DT if tee=1
 MAINTAINERS                         |    6 +
 docs/man/xl.cfg.pod.5.in            |   10 +
 tools/libxc/include/xenctrl.h       |    7 +
 tools/libxc/xc_domain.c             |   13 +
 tools/libxl/libxl_arm.c             |   59 +-
 tools/libxl/libxl_create.c          |    1 +
 tools/libxl/libxl_types.idl         |    1 +
 tools/xl/xl_parse.c                 |    1 +
 xen/arch/arm/Kconfig                |    9 +
 xen/arch/arm/Makefile               |    1 +
 xen/arch/arm/domain.c               |    4 +
 xen/arch/arm/domain_build.c         |    4 +
 xen/arch/arm/domctl.c               |   10 +
 xen/arch/arm/setup.c                |    1 +
 xen/arch/arm/shutdown.c             |    1 +
 xen/arch/arm/tee/Kconfig            |    4 +
 xen/arch/arm/tee/Makefile           |    2 +
 xen/arch/arm/tee/optee.c            | 1042 +++++++++++++++++++++++++++++++++++
 xen/arch/arm/tee/tee.c              |   69 +++
 xen/arch/arm/vsmc.c                 |    5 +
 xen/arch/arm/xen.lds.S              |    7 +
 xen/include/asm-arm/tee/optee_msg.h |  444 +++++++++++++++
 xen/include/asm-arm/tee/optee_smc.h |  507 +++++++++++++++++
 xen/include/asm-arm/tee/tee.h       |   91 +++
 xen/include/public/domctl.h         |    8 +
 25 files changed, 2294 insertions(+), 13 deletions(-)
 create mode 100644 xen/arch/arm/tee/Kconfig
 create mode 100644 xen/arch/arm/tee/Makefile
 create mode 100644 xen/arch/arm/tee/optee.c
 create mode 100644 xen/arch/arm/tee/tee.c
 create mode 100644 xen/include/asm-arm/tee/optee_msg.h
 create mode 100644 xen/include/asm-arm/tee/optee_smc.h
 create mode 100644 xen/include/asm-arm/tee/tee.h
Add very basic OP-TEE mediator. It can probe for OP-TEE presence, tell it about domain creation/destuction and forward all known calls.
This is all what is needed for Dom0 to work with OP-TEE as long as Dom0 shares 1:1 mapped pages with OP-TEE. Any attempt to call OP-TEE from DomU will fail and can lead to spectacular results.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
 xen/arch/arm/tee/Kconfig            |   4 ++
 xen/arch/arm/tee/Makefile           |   1 +
 xen/arch/arm/tee/optee.c            | 134 ++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/tee/optee_smc.h |  50 ++++++++++++++
 4 files changed, 189 insertions(+)
 create mode 100644 xen/arch/arm/tee/optee.c
diff --git a/xen/arch/arm/tee/Kconfig b/xen/arch/arm/tee/Kconfig
index e69de29..5b829db 100644
--- a/xen/arch/arm/tee/Kconfig
+++ b/xen/arch/arm/tee/Kconfig
@@ -0,0 +1,4 @@
+config OPTEE
+    bool "Enable OP-TEE mediator"
+    default n
+    depends on TEE
diff --git a/xen/arch/arm/tee/Makefile b/xen/arch/arm/tee/Makefile
index c54d479..982c879 100644
--- a/xen/arch/arm/tee/Makefile
+++ b/xen/arch/arm/tee/Makefile
@@ -1 +1,2 @@
 obj-y += tee.o
+obj-$(CONFIG_OPTEE) += optee.o
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
new file mode 100644
index 0000000..7bb84d9
--- /dev/null
+++ b/xen/arch/arm/tee/optee.c
@@ -0,0 +1,134 @@
+/*
+ * xen/arch/arm/tee/optee.c
+ *
+ * OP-TEE mediator
+ *
+ * Volodymyr Babchuk volodymyr_babchuk@epam.com
+ * Copyright (c) 2018 EPAM Systems.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <xen/device_tree.h>
+#include <xen/sched.h>
+#include <asm/smccc.h>
+#include <asm/tee/tee.h>
+
+#include <asm/tee/optee_msg.h>
+#include <asm/tee/optee_smc.h>
+
+static bool optee_probe(void)
+{
+    struct dt_device_node *node;
+    struct arm_smccc_res resp;
+
+    /* Check for entry in dtb */
+    node = dt_find_compatible_node(NULL, NULL, "linaro,optee-tz");
+    if ( !node )
+        return false;
+
+    /* Check UID */
+    arm_smccc_smc(ARM_SMCCC_CALL_UID_FID(TRUSTED_OS_END), &resp);
+
+    if ( resp.a0 != OPTEE_MSG_UID_0 ||
+         resp.a1 != OPTEE_MSG_UID_1 ||
+         resp.a2 != OPTEE_MSG_UID_2 ||
+         resp.a3 != OPTEE_MSG_UID_3 )
+        return false;
+
+    return true;
+}
+
+static int optee_enable(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    arm_smccc_smc(OPTEE_SMC_VM_CREATED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
+    if ( resp.a0 != OPTEE_SMC_RETURN_OK ) {
+        gprintk(XENLOG_WARNING, "OP-TEE don't want to support domain: %d\n",
+                (uint32_t)resp.a0);
+        return -ENODEV;
+    }
+
+    return 0;
+}
+
+static void forward_call(struct cpu_user_regs *regs)
+{
+    struct arm_smccc_res resp;
+
+    arm_smccc_smc(get_user_reg(regs, 0),
+                  get_user_reg(regs, 1),
+                  get_user_reg(regs, 2),
+                  get_user_reg(regs, 3),
+                  get_user_reg(regs, 4),
+                  get_user_reg(regs, 5),
+                  get_user_reg(regs, 6),
+                  /* client id 0 is reserved for hypervisor itself */
+                  current->domain->domain_id + 1,
+                  &resp);
+
+    set_user_reg(regs, 0, resp.a0);
+    set_user_reg(regs, 1, resp.a1);
+    set_user_reg(regs, 2, resp.a2);
+    set_user_reg(regs, 3, resp.a3);
+    set_user_reg(regs, 4, 0);
+    set_user_reg(regs, 5, 0);
+    set_user_reg(regs, 6, 0);
+    set_user_reg(regs, 7, 0);
+}
+
+static void optee_domain_destroy(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    /* At this time all domain VCPUs should be stopped */
+
+    /* Inform OP-TEE that domain is shutting down */
+    arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
+}
+
+static bool optee_handle_call(struct cpu_user_regs *regs)
+{
+    switch ( get_user_reg(regs, 0) )
+    {
+    case OPTEE_SMC_CALLS_COUNT:
+    case OPTEE_SMC_CALLS_UID:
+    case OPTEE_SMC_CALLS_REVISION:
+    case OPTEE_SMC_CALL_GET_OS_UUID:
+    case OPTEE_SMC_FUNCID_GET_OS_REVISION:
+    case OPTEE_SMC_ENABLE_SHM_CACHE:
+    case OPTEE_SMC_DISABLE_SHM_CACHE:
+    case OPTEE_SMC_GET_SHM_CONFIG:
+    case OPTEE_SMC_EXCHANGE_CAPABILITIES:
+    case OPTEE_SMC_CALL_WITH_ARG:
+    case OPTEE_SMC_CALL_RETURN_FROM_RPC:
+        forward_call(regs);
+        return true;
+    default:
+        return false;
+    }
+}
+
+static const struct tee_mediator_ops optee_ops =
+{
+    .probe = optee_probe,
+    .enable = optee_enable,
+    .domain_destroy = optee_domain_destroy,
+    .handle_call = optee_handle_call,
+};
+
+REGISTER_TEE_MEDIATOR(optee, "OP-TEE", &optee_ops);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/tee/optee_smc.h b/xen/include/asm-arm/tee/optee_smc.h
index 26d100e..1c5a247 100644
--- a/xen/include/asm-arm/tee/optee_smc.h
+++ b/xen/include/asm-arm/tee/optee_smc.h
@@ -305,6 +305,56 @@ struct optee_smc_disable_shm_cache_result {
 	OPTEE_SMC_FAST_CALL_VAL(OPTEE_SMC_FUNCID_ENABLE_SHM_CACHE)
 
 /*
+ * Inform OP-TEE about a new virtual machine
+ *
+ * Hypervisor issues this call during virtual machine (guest) creation.
+ * OP-TEE records VM_ID of new virtual machine and makes self ready
+ * to receive requests from it.
+ *
+ * Call requests usage:
+ * a0	SMC Function ID, OPTEE_SMC_VM_CREATED
+ * a1	VM_ID of newly created virtual machine
+ * a2-6	Not used
+ * a7	Hypervisor Client ID register. Must be 0, because only hypervisor
+ *      can issue this call
+ *
+ * Normal return register usage:
+ * a0	OPTEE_SMC_RETURN_OK
+ * a1-7	Preserved
+ *
+ * Error return:
+ * a0	OPTEE_SMC_RETURN_ENOTAVAIL	OP-TEE has no resources for
+ *					another VM
+ * a1-7	Preserved
+ *
+ */
+#define OPTEE_SMC_FUNCID_VM_CREATED	13
+#define OPTEE_SMC_VM_CREATED \
+	OPTEE_SMC_FAST_CALL_VAL(OPTEE_SMC_FUNCID_VM_CREATED)
+
+/*
+ * Inform OP-TEE about shutdown of a virtual machine
+ *
+ * Hypervisor issues this call during virtual machine (guest) destruction.
+ * OP-TEE will clean up all resources associated with this VM.
+ *
+ * Call requests usage:
+ * a0	SMC Function ID, OPTEE_SMC_VM_DESTROYED
+ * a1	VM_ID of virtual machine being shutted down
+ * a2-6	Not used
+ * a7	Hypervisor Client ID register. Must be 0, because only hypervisor
+ *      can issue this call
+ *
+ * Normal return register usage:
+ * a0	OPTEE_SMC_RETURN_OK
+ * a1-7	Preserved
+ *
+ */
+#define OPTEE_SMC_FUNCID_VM_DESTROYED	14
+#define OPTEE_SMC_VM_DESTROYED \
+	OPTEE_SMC_FAST_CALL_VAL(OPTEE_SMC_FUNCID_VM_DESTROYED)
+
+/*
  * Resume from RPC (for example after processing a foreign interrupt)
  *
  * Call register usage:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
Add very basic OP-TEE mediator. It can probe for OP-TEE presence, tell it about domain creation/destuction and forward all known
s/destuction/destruction/
calls.
This is all what is needed for Dom0 to work with OP-TEE as long as Dom0 shares 1:1 mapped pages with OP-TEE. Any attempt to call OP-TEE from DomU will fail and can lead to spectacular results.
Shall we expect fireworks? :).
Anyway, I think this is a call for forbidding DomU access until it is supported. This also has the benefits to allow merging Dom0 support for OP-TEE without the rest.
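Something along these lines is what I have in mind, as a rough sketch (assuming the check is done on entry to optee_handle_call(); is_hardware_domain() is the existing helper):

static bool optee_handle_call(struct cpu_user_regs *regs)
{
    /* For now, only the hardware domain may talk to OP-TEE */
    if ( !is_hardware_domain(current->domain) )
        return false;

    switch ( get_user_reg(regs, 0) )
    {
    /* ... existing cases, forwarded as before ... */
    default:
        return false;
    }
}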
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
 xen/arch/arm/tee/Kconfig            |   4 ++
 xen/arch/arm/tee/Makefile           |   1 +
 xen/arch/arm/tee/optee.c            | 134 ++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/tee/optee_smc.h |  50 ++++++++++++++
 4 files changed, 189 insertions(+)
 create mode 100644 xen/arch/arm/tee/optee.c

diff --git a/xen/arch/arm/tee/Kconfig b/xen/arch/arm/tee/Kconfig
index e69de29..5b829db 100644
--- a/xen/arch/arm/tee/Kconfig
+++ b/xen/arch/arm/tee/Kconfig
@@ -0,0 +1,4 @@
+config OPTEE
+    bool "Enable OP-TEE mediator"
+    default n
+    depends on TEE
diff --git a/xen/arch/arm/tee/Makefile b/xen/arch/arm/tee/Makefile
index c54d479..982c879 100644
--- a/xen/arch/arm/tee/Makefile
+++ b/xen/arch/arm/tee/Makefile
@@ -1 +1,2 @@
 obj-y += tee.o
+obj-$(CONFIG_OPTEE) += optee.o
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
new file mode 100644
index 0000000..7bb84d9
--- /dev/null
+++ b/xen/arch/arm/tee/optee.c
@@ -0,0 +1,134 @@
+/*
+ * xen/arch/arm/tee/optee.c
+ *
+ * OP-TEE mediator
+ *
+ * Volodymyr Babchuk volodymyr_babchuk@epam.com
+ * Copyright (c) 2018 EPAM Systems.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <xen/device_tree.h>
+#include <xen/sched.h>
+#include <asm/smccc.h>
+#include <asm/tee/tee.h>
+
+#include <asm/tee/optee_msg.h>
+#include <asm/tee/optee_smc.h>
+
+static bool optee_probe(void)
+{
+    struct dt_device_node *node;
+    struct arm_smccc_res resp;
+
+    /* Check for entry in dtb */
+    node = dt_find_compatible_node(NULL, NULL, "linaro,optee-tz");
+    if ( !node )
+        return false;
+
+    /* Check UID */
+    arm_smccc_smc(ARM_SMCCC_CALL_UID_FID(TRUSTED_OS_END), &resp);
+
+    if ( resp.a0 != OPTEE_MSG_UID_0 ||
+         resp.a1 != OPTEE_MSG_UID_1 ||
+         resp.a2 != OPTEE_MSG_UID_2 ||
+         resp.a3 != OPTEE_MSG_UID_3 )
I would be extra cautious with the sign-extension here. It would be better to follow the spec regarding UUID by casting resp.a0 & co to 32-bit.
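i.e. something like (untested sketch):

    if ( (uint32_t)resp.a0 != OPTEE_MSG_UID_0 ||
         (uint32_t)resp.a1 != OPTEE_MSG_UID_1 ||
         (uint32_t)resp.a2 != OPTEE_MSG_UID_2 ||
         (uint32_t)resp.a3 != OPTEE_MSG_UID_3 )
        return false;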
+        return false;
+
+    return true;
+}
+
+static int optee_enable(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    arm_smccc_smc(OPTEE_SMC_VM_CREATED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
+    if ( resp.a0 != OPTEE_SMC_RETURN_OK ) {
+        gprintk(XENLOG_WARNING, "OP-TEE don't want to support domain: %d\n",
This message is slightly odd. It would be better to say:
"Unable to create OP-TEE client: rc=%d\n".
+                (uint32_t)resp.a0);
+        return -ENODEV;
+    }
+
+    return 0;
+}
+
+static void forward_call(struct cpu_user_regs *regs)
+{
+    struct arm_smccc_res resp;
+
+    arm_smccc_smc(get_user_reg(regs, 0),
+                  get_user_reg(regs, 1),
+                  get_user_reg(regs, 2),
+                  get_user_reg(regs, 3),
+                  get_user_reg(regs, 4),
+                  get_user_reg(regs, 5),
+                  get_user_reg(regs, 6),
+                  /* client id 0 is reserved for hypervisor itself */
+                  current->domain->domain_id + 1,
I would prefer if the client ID is encoded in a macro. So this could be re-used.
+                  &resp);
+
+    set_user_reg(regs, 0, resp.a0);
+    set_user_reg(regs, 1, resp.a1);
+    set_user_reg(regs, 2, resp.a2);
+    set_user_reg(regs, 3, resp.a3);
+    set_user_reg(regs, 4, 0);
+    set_user_reg(regs, 5, 0);
+    set_user_reg(regs, 6, 0);
+    set_user_reg(regs, 7, 0);
+}
+
+static void optee_domain_destroy(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    /* At this time all domain VCPUs should be stopped */
+
+    /* Inform OP-TEE that domain is shutting down */
+    arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
So this SMC should never fail or even get preempted? I was kind of expecting that it may take some time to destroy a domain.
[...]
Cheers,
Hi Julien,
On 03.09.18 20:38, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
Add very basic OP-TEE mediator. It can probe for OP-TEE presence, tell it about domain creation/destuction and forward all known
s/destuction/destruction/
calls.
This is all what is needed for Dom0 to work with OP-TEE as long as Dom0 shares 1:1 mapped pages with OP-TEE. Any attempt to call OP-TEE from DomU will fail and can lead to spectacular results.
Shall we expect fireworks? :).
I tried a couple of times, but with no success :)
Anyway, I think this is a call for forbidding DomU access until it is supported. This also has the benefits to allow merging Dom0 support for OP-TEE without the rest.
Some time ago you said that I can't be sure that Dom0 is 1:1 mapped, because of grant refs. So, actually, any access should be forbidden. I can omit
REGISTER_TEE_MEDIATOR(optee, "OP-TEE", &optee_ops);
till the very last patch in the series.
But then it would not compile, because optee_ops is the static struct...
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
[...]

+                  /* client id 0 is reserved for hypervisor itself */
+                  current->domain->domain_id + 1,
I would prefer if the client ID is encoded in a macro, so this could be re-used.
Something like
#define OPTEE_CLIENT_ID(domain) (domain->domain_id + 1)
?
+                  &resp);
+
+    set_user_reg(regs, 0, resp.a0);
+    set_user_reg(regs, 1, resp.a1);
+    set_user_reg(regs, 2, resp.a2);
+    set_user_reg(regs, 3, resp.a3);
+    set_user_reg(regs, 4, 0);
+    set_user_reg(regs, 5, 0);
+    set_user_reg(regs, 6, 0);
+    set_user_reg(regs, 7, 0);
+}
+
+static void optee_domain_destroy(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    /* At this time all domain VCPUs should be stopped */
+
+    /* Inform OP-TEE that domain is shutting down */
+    arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
So this SMC should never fail or even get preempted? I was kind of expecting that it may take some time to destroy a domain.

It is the fast SMCCC call, so it can't be preempted. And it is really fast, at least in the OP-TEE case. Basically, OP-TEE just removes some entries from the list and frees some memory. And yes, it can't fail.
[...]
Cheers,
On 09/03/2018 06:55 PM, Volodymyr Babchuk wrote:
Hi Julien,
Hi Volodymyr,
On 03.09.18 20:38, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
Add very basic OP-TEE mediator. It can probe for OP-TEE presence, tell it about domain creation/destuction and forward all known
s/destuction/destruction/
calls.
This is all what is needed for Dom0 to work with OP-TEE as long as Dom0 shares 1:1 mapped pages with OP-TEE. Any attempt to call OP-TEE from DomU will fail and can lead to spectacular results.
Shall we expect fireworks? :).
I tried a couple of times, but with no success :)
Anyway, I think this is a call for forbidding DomU access until it is supported. This also has the benefits to allow merging Dom0 support for OP-TEE without the rest.
Some time ago you said that I can't be sure that Dom0 is 1:1 mapped, because of grant refs. So, actually, any access should be forbidden. I can omit
Oh right. In that case, make it clear in the commit message, because there is nothing in Dom0 preventing it from sharing pages that are not direct mapped.

This will make it easier for the committer (either Stefano or I) to know that this can't go in without the rest of the series.
[...]
+static void forward_call(struct cpu_user_regs *regs)
+{
+    struct arm_smccc_res resp;
+
+    arm_smccc_smc(get_user_reg(regs, 0),
+                  get_user_reg(regs, 1),
+                  get_user_reg(regs, 2),
+                  get_user_reg(regs, 3),
+                  get_user_reg(regs, 4),
+                  get_user_reg(regs, 5),
+                  get_user_reg(regs, 6),
+                  /* client id 0 is reserved for hypervisor itself */
+                  current->domain->domain_id + 1,
I would prefer if the client ID is encoded in a macro. So this could be re-used
Something like
#define OPTEE_CLIENT_ID(domain) (domain->domain_id + 1)
?
This sounds good.
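The call sites would then become (sketch):

#define OPTEE_CLIENT_ID(domain) ((domain)->domain_id + 1)

    arm_smccc_smc(OPTEE_SMC_VM_CREATED, OPTEE_CLIENT_ID(d), 0, 0, 0, 0, 0, 0,
                  &resp);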
[...]
+                  &resp);
+
+    set_user_reg(regs, 0, resp.a0);
+    set_user_reg(regs, 1, resp.a1);
+    set_user_reg(regs, 2, resp.a2);
+    set_user_reg(regs, 3, resp.a3);
+    set_user_reg(regs, 4, 0);
+    set_user_reg(regs, 5, 0);
+    set_user_reg(regs, 6, 0);
+    set_user_reg(regs, 7, 0);
+}
+
+static void optee_domain_destroy(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    /* At this time all domain VCPUs should be stopped */
+
+    /* Inform OP-TEE that domain is shutting down */
+    arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
So this SMC should never fail or even get preempted? I was kind of expecting that it may take some time to destroy a domain.

It is the fast SMCCC call, so it can't be preempted. And it is really fast, at least in the OP-TEE case.

Without numbers, I can't really know what fast means here. Do you have a rough idea?
Basically, OP-TEE just removes some entries from the list and frees some memory. And yes, it can't fail.
Cheers,
Hi Julien,
On 04.09.18 22:48, Julien Grall wrote:
On 09/03/2018 06:55 PM, Volodymyr Babchuk wrote:
Hi Julien,
Hi Volodymyr,
On 03.09.18 20:38, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
Add very basic OP-TEE mediator. It can probe for OP-TEE presence, tell it about domain creation/destuction and forward all known
s/destuction/destruction/
calls.
This is all what is needed for Dom0 to work with OP-TEE as long as Dom0 shares 1:1 mapped pages with OP-TEE. Any attempt to call OP-TEE from DomU will fail and can lead to spectacular results.
Shall we expect fireworks? :).
I tried a couple of times, but with no success :)
Anyway, I think this is a call for forbidding DomU access until it is supported. This also has the benefits to allow merging Dom0 support for OP-TEE without the rest.
Some time ago you said that I can't be sure that Dom0 is 1:1 mapped, because of grant refs. So, actually, any access should be forbidden. I can omit
Oh right. In that case, make it clear in the commit message, because there is nothing in Dom0 preventing it from sharing pages that are not direct mapped.

This will make it easier for the committer (either Stefano or I) to know that this can't go in without the rest of the series.
Ah, sure. Had to indicate this explicitly. Will do this in the next version of the series.
[...]
+                  &resp);
+
+    set_user_reg(regs, 0, resp.a0);
+    set_user_reg(regs, 1, resp.a1);
+    set_user_reg(regs, 2, resp.a2);
+    set_user_reg(regs, 3, resp.a3);
+    set_user_reg(regs, 4, 0);
+    set_user_reg(regs, 5, 0);
+    set_user_reg(regs, 6, 0);
+    set_user_reg(regs, 7, 0);
+}
+
+static void optee_domain_destroy(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    /* At this time all domain VCPUs should be stopped */
+
+    /* Inform OP-TEE that domain is shutting down */
+    arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
So this SMC should never fail or even get preempted? I was kind of expecting that it may take some time to destroy a domain.

It is the fast SMCCC call, so it can't be preempted. And it is really fast, at least in the OP-TEE case.

Without numbers, I can't really know what fast means here. Do you have a rough idea?

"Fast" is used there in the sense defined in SMCCC:
" Fast Calls execute atomic operations.
The call appears to be atomic from the perspective of the calling PE, and returns when the requested operation has completed."
And "Standard Call" (will be introduced in the series) is the "Yielding Call". Probably I should use term from SMCCC, but for some reason I stick to term used it OP-TEE.
I can do some measurements on how "fast" this particular call is. But problem is that it is really atomic from OP-TEE perspective.
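For reference, SMCCC encodes the difference in bit 31 of the SMC Function ID; roughly (illustrative defines, not the actual Xen macros):

/* SMCCC: bit 31 of the Function ID set means Fast Call, clear means Yielding Call */
#define SMCCC_FUNC_TYPE_FAST    (1U << 31)

static inline bool smccc_is_fast_call(uint32_t func_id)
{
    /* Fast Calls complete atomically; Yielding Calls may be preempted and resumed */
    return func_id & SMCCC_FUNC_TYPE_FAST;
}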
Hi,
On 09/05/2018 01:17 PM, Volodymyr Babchuk wrote:
On 04.09.18 22:48, Julien Grall wrote:
On 09/03/2018 06:55 PM, Volodymyr Babchuk wrote:
Hi Julien,
Hi Volodymyr,
On 03.09.18 20:38, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
Add very basic OP-TEE mediator. It can probe for OP-TEE presence, tell it about domain creation/destuction and forward all known
s/destuction/destruction/
calls.
This is all what is needed for Dom0 to work with OP-TEE as long as Dom0 shares 1:1 mapped pages with OP-TEE. Any attempt to call OP-TEE from DomU will fail and can lead to spectacular results.
Shall we expect fireworks? :).
I tried a couple of times, but with no success :)
Anyway, I think this is a call for forbidding DomU access until it is supported. This also has the benefits to allow merging Dom0 support for OP-TEE without the rest.
Some time ago you said that I can't be sure that Dom0 is 1:1 mapped, because of grant refs. So, actually, any access should be forbidden. I can omit
Oh right. In that case, make it clear in the commit message, because there is nothing in Dom0 preventing it from sharing pages that are not direct mapped.

This will make it easier for the committer (either Stefano or I) to know that this can't go in without the rest of the series.
Ah, sure. Had to indicate this explicitly. Will do this in the next version of the series.
[...]
+                  &resp);
+
+    set_user_reg(regs, 0, resp.a0);
+    set_user_reg(regs, 1, resp.a1);
+    set_user_reg(regs, 2, resp.a2);
+    set_user_reg(regs, 3, resp.a3);
+    set_user_reg(regs, 4, 0);
+    set_user_reg(regs, 5, 0);
+    set_user_reg(regs, 6, 0);
+    set_user_reg(regs, 7, 0);
+}
+
+static void optee_domain_destroy(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    /* At this time all domain VCPUs should be stopped */
+
+    /* Inform OP-TEE that domain is shutting down */
+    arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
So this SMC should never fail or even get preempted? I was kind of expecting that it may take some time to destroy a domain.

It is the fast SMCCC call, so it can't be preempted. And it is really fast, at least in the OP-TEE case.

Without numbers, I can't really know what fast means here. Do you have a rough idea?

"Fast" is used there in the sense defined in SMCCC:
" Fast Calls execute atomic operations.
The call appears to be atomic from the perspective of the calling PE, and returns when the requested operation has completed."
And "Standard Call" (will be introduced in the series) is the "Yielding Call". Probably I should use term from SMCCC, but for some reason I stick to term used it OP-TEE.
I don't mind which term you used as long as you clearly define them within your series.
I can do some measurements on how "fast" this particular call is. But problem is that it is really atomic from OP-TEE perspective.
My concern is usually such operation can take time. In the context of Xen, the domain destruction path is preemptible which allow a domain to be rescheduled.
While today, SMC_VM_DESTROYED is a fast call because there are not much. I am a bit concern that in the future this may grow and therefore turn into a long operation (i.e few ms).
So I am a bit concerned that this interface is not future-proof.
Cheers,
Hi,
On 05.09.18 16:16, Julien Grall wrote:
Hi,
On 09/05/2018 01:17 PM, Volodymyr Babchuk wrote:
On 04.09.18 22:48, Julien Grall wrote:
On 09/03/2018 06:55 PM, Volodymyr Babchuk wrote:
Hi Julien,
Hi Volodymyr,
On 03.09.18 20:38, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
Add very basic OP-TEE mediator. It can probe for OP-TEE presence, tell it about domain creation/destuction and forward all known
s/destuction/destruction/
calls.
This is all what is needed for Dom0 to work with OP-TEE as long as Dom0 shares 1:1 mapped pages with OP-TEE. Any attempt to call OP-TEE from DomU will fail and can lead to spectacular results.
Shall we expect fireworks? :).
I tried a couple of times, but with no success :)
Anyway, I think this is a call for forbidding DomU access until it is supported. This also has the benefits to allow merging Dom0 support for OP-TEE without the rest.
Some time ago you said that I can't be sure that Dom0 is 1:1 mapped, because of grant refs. So, actually, any access should be forbidden. I can omit
Oh right. In that case, make it clear in the commit message, because there is nothing in Dom0 preventing it from sharing pages that are not direct mapped.

This will make it easier for the committer (either Stefano or I) to know that this can't go in without the rest of the series.
Ah, sure. Had to indicate this explicitly. Will do this in the next version of the series.
[...]
+                  &resp);
+
+    set_user_reg(regs, 0, resp.a0);
+    set_user_reg(regs, 1, resp.a1);
+    set_user_reg(regs, 2, resp.a2);
+    set_user_reg(regs, 3, resp.a3);
+    set_user_reg(regs, 4, 0);
+    set_user_reg(regs, 5, 0);
+    set_user_reg(regs, 6, 0);
+    set_user_reg(regs, 7, 0);
+}
+
+static void optee_domain_destroy(struct domain *d)
+{
+    struct arm_smccc_res resp;
+
+    /* At this time all domain VCPUs should be stopped */
+
+    /* Inform OP-TEE that domain is shutting down */
+    arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
+                  &resp);
So this SMC should never fail or even get preempted? I was kind of expecting that it may take some time to destroy a domain.

It is the fast SMCCC call, so it can't be preempted. And it is really fast, at least in the OP-TEE case.

Without numbers, I can't really know what fast means here. Do you have a rough idea?

"Fast" is used there in the sense defined in SMCCC:
" Fast Calls execute atomic operations.
The call appears to be atomic from the perspective of the calling PE, and returns when the requested operation has completed."
And "Standard Call" (will be introduced in the series) is the "Yielding Call". Probably I should use term from SMCCC, but for some reason I stick to term used it OP-TEE.
I don't mind which term you used as long as you clearly define them within your series.
I can do some measurements on how "fast" this particular call is. But problem is that it is really atomic from OP-TEE perspective.
My concern is usually such operation can take time. In the context of Xen, the domain destruction path is preemptible which allow a domain to be rescheduled.
While today, SMC_VM_DESTROYED is a fast call because there are not much. I am a bit concern that in the future this may grow and therefore turn into a long operation (i.e few ms).
So I am a bit concerned that this interface is not future-proof.
I think this is a broader issue, i.e. the same is true for any call into firmware.

The domain destruction path in Xen is preemptible because it deals with a lot of resources, and some resources can be blocked right now because they can be shared with other domains (at least, that is my understanding).

The current model of virtualization in OP-TEE is much simpler and domain destruction is a truly atomic thing. It is really just a couple of calls to free().

I can't tell what will be in the future, but I see some options for how to overcome newly discovered issues, if they ever arise. But I can tell that right now there is not much sense in introducing yielding calls for domain destruction/creation. This would require significant changes in the OP-TEE core without any visible benefit.
Hi,
On 09/05/2018 02:38 PM, Volodymyr Babchuk wrote:
Hi,
On 05.09.18 16:16, Julien Grall wrote:
Hi,
On 09/05/2018 01:17 PM, Volodymyr Babchuk wrote:
On 04.09.18 22:48, Julien Grall wrote:
On 09/03/2018 06:55 PM, Volodymyr Babchuk wrote:
Hi Julien,
Hi Volodymyr,
On 03.09.18 20:38, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
> Add very basic OP-TEE mediator. It can probe for OP-TEE presence,
> tell it about domain creation/destuction and forward all known
s/destuction/destruction/
> calls.
>
> This is all what is needed for Dom0 to work with OP-TEE as long
> as Dom0 shares 1:1 mapped pages with OP-TEE. Any attempt to call
> OP-TEE from DomU will fail and can lead to spectacular results.
Shall we expect fireworks? :).
I tried a couple of times, but with no success :)
Anyway, I think this is a call for forbidding DomU access until it is supported. This also has the benefits to allow merging Dom0 support for OP-TEE without the rest.
Some time ago you said that I can't be sure that Dom0 is 1:1 mapped, because of grant refs. So, actually, any access should be forbidden. I can omit
Oh right. In that case, make it clear in the commit message, because there is nothing in Dom0 preventing it from sharing pages that are not direct mapped.

This will make it easier for the committer (either Stefano or I) to know that this can't go in without the rest of the series.
Ah, sure. Had to indicate this explicitly. Will do this in the next version of the series.
[...]
> +                  &resp);
> +
> +    set_user_reg(regs, 0, resp.a0);
> +    set_user_reg(regs, 1, resp.a1);
> +    set_user_reg(regs, 2, resp.a2);
> +    set_user_reg(regs, 3, resp.a3);
> +    set_user_reg(regs, 4, 0);
> +    set_user_reg(regs, 5, 0);
> +    set_user_reg(regs, 6, 0);
> +    set_user_reg(regs, 7, 0);
> +}
> +
> +static void optee_domain_destroy(struct domain *d)
> +{
> +    struct arm_smccc_res resp;
> +
> +    /* At this time all domain VCPUs should be stopped */
> +
> +    /* Inform OP-TEE that domain is shutting down */
> +    arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
> +                  &resp);
So this SMC should never fail or even get preempted? I was kind of expecting that it may take some time to destroy a domain.

It is the fast SMCCC call, so it can't be preempted. And it is really fast, at least in the OP-TEE case.

Without numbers, I can't really know what fast means here. Do you have a rough idea?

"Fast" is used there in the sense defined in SMCCC:
" Fast Calls execute atomic operations.
The call appears to be atomic from the perspective of the calling PE, and returns when the requested operation has completed."
And "Standard Call" (will be introduced in the series) is the "Yielding Call". Probably I should use term from SMCCC, but for some reason I stick to term used it OP-TEE.
I don't mind which term you used as long as you clearly define them within your series.
I can do some measurements on how "fast" this particular call is. But problem is that it is really atomic from OP-TEE perspective.
My concern is usually such operation can take time. In the context of Xen, the domain destruction path is preemptible which allow a domain to be rescheduled.
While today, SMC_VM_DESTROYED is a fast call because there are not much. I am a bit concern that in the future this may grow and therefore turn into a long operation (i.e few ms).
So I am a bit concerned that this interface is not future-proof.
I think this is a broader issue, i.e. the same is true for any call into firmware.

The domain destruction path in Xen is preemptible because it deals with a lot of resources, and some resources can be blocked right now because they can be shared with other domains (at least, that is my understanding).

Resources shared with other domains have a reference counter. The function domain_relinquish_resources will not block just because a resource has not been unshared for that domain.

Instead, the reference on the resource will be dropped. The domain structure will get released when all the resources have been released (see put_domain()).

The preemption is only there because it can take time to release everything (any Xen atomic action should be under a few ms).
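Roughly the pattern, using the existing get_page()/put_page() helpers (error handling elided; this only illustrates the lifetime rule):

    /* Take a reference on the page on behalf of domain d */
    if ( !get_page(page, d) )
        return -EINVAL;

    /* The page cannot disappear under our feet while the reference is held */

    /* Drop the reference; the page is freed once the last reference is gone */
    put_page(page);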
The current model of virtualization in OP-TEE is much simpler and domain destruction is a truly atomic thing. It is really just a couple of calls to free().

I can't tell what will be in the future, but I see some options for how to overcome newly discovered issues, if they ever arise. But I can tell that right now there is not much sense in introducing yielding calls for domain destruction/creation. This would require significant changes in the OP-TEE core without any visible benefit.

Usually public interfaces are designed to be robust and future-proof, because it is a pain to update them. They do not need to be fully implemented on the callee side. If the OP-TEE folks think a fast call is OK and it would be easy to upgrade, then I can deal with it.
But please write that in the commit message (and potentially in the code), so we keep track of the reason behind that choice.
Cheers,
OP-TEE meditor needs to store per-domain context, as will be seen in the next patches. At this moment it stores only reference to domain.
This allows us to filter out calls from domains that are not allowed to work with OP-TEE.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
 xen/arch/arm/tee/optee.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 48bff5d..c895a99 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -19,6 +19,14 @@
 #include <asm/tee/optee_msg.h>
 #include <asm/tee/optee_smc.h>
 
+struct domain_ctx {
+    struct list_head list;
+    struct domain *domain;
+};
+
+static LIST_HEAD(domain_ctx_list);
+static DEFINE_SPINLOCK(domain_ctx_list_lock);
+
 static bool optee_probe(void)
 {
     struct dt_device_node *node;
@@ -41,18 +49,49 @@ static bool optee_probe(void)
     return true;
 }
 
+static struct domain_ctx *find_domain_ctx(struct domain* d)
+{
+    struct domain_ctx *ctx;
+
+    spin_lock(&domain_ctx_list_lock);
+
+    list_for_each_entry( ctx, &domain_ctx_list, list )
+    {
+        if ( ctx->domain == d )
+        {
+            spin_unlock(&domain_ctx_list_lock);
+            return ctx;
+        }
+    }
+
+    spin_unlock(&domain_ctx_list_lock);
+    return NULL;
+}
+
 static int optee_enable(struct domain *d)
 {
     struct arm_smccc_res resp;
+    struct domain_ctx *ctx;
+
+    ctx = xzalloc(struct domain_ctx);
+    if ( !ctx )
+        return -ENOMEM;
 
     arm_smccc_smc(OPTEE_SMC_VM_CREATED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
                   &resp);
     if ( resp.a0 != OPTEE_SMC_RETURN_OK ) {
         gprintk(XENLOG_WARNING, "OP-TEE don't want to support domain: %d\n",
                 (uint32_t)resp.a0);
+        xfree(ctx);
         return -ENODEV;
     }
 
+    ctx->domain = d;
+
+    spin_lock(&domain_ctx_list_lock);
+    list_add_tail(&ctx->list, &domain_ctx_list);
+    spin_unlock(&domain_ctx_list_lock);
+
     return 0;
 }
 
@@ -95,15 +134,36 @@ static void set_return(struct cpu_user_regs *regs, uint32_t ret)
     set_user_reg(regs, 7, 0);
 }
 
+
 static void optee_domain_destroy(struct domain *d)
 {
     struct arm_smccc_res resp;
+    struct domain_ctx *ctx;
+    bool found = false;
 
     /* At this time all domain VCPUs should be stopped */
 
     /* Inform OP-TEE that domain is shutting down */
     arm_smccc_smc(OPTEE_SMC_VM_DESTROYED, d->domain_id + 1, 0, 0, 0, 0, 0, 0,
                   &resp);
+
+    /* Remove context from the list */
+    spin_lock(&domain_ctx_list_lock);
+    list_for_each_entry( ctx, &domain_ctx_list, list )
+    {
+        if ( ctx->domain == d )
+        {
+            found = true;
+            list_del(&ctx->list);
+            break;
+        }
+    }
+    spin_unlock(&domain_ctx_list_lock);
+
+    if ( !found )
+        return;
+
+    xfree(ctx);
 }
 
 static bool handle_exchange_capabilities(struct cpu_user_regs *regs)
@@ -141,6 +201,12 @@ static bool handle_exchange_capabilities(struct cpu_user_regs *regs)
 
 static bool optee_handle_call(struct cpu_user_regs *regs)
 {
+    struct domain_ctx *ctx;
+
+    ctx = find_domain_ctx(current->domain);
+    if ( !ctx )
+        return false;
+
     switch ( get_user_reg(regs, 0) )
     {
     case OPTEE_SMC_CALLS_COUNT:
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
OP-TEE meditor needs to store per-domain context, as will be seen
s/meditor/mediator/
in the next patches. At this moment it stores only reference to domain.
This allows us to filter out calls from domains that are not allowed to work with OP-TEE.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
 xen/arch/arm/tee/optee.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 48bff5d..c895a99 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -19,6 +19,14 @@
 #include <asm/tee/optee_msg.h>
 #include <asm/tee/optee_smc.h>
 
+struct domain_ctx {
I would prefer if the structure had optee in its name, such as optee_domain.

We also tend to abbreviate context in Xen as "ctxt".
+    struct list_head list;
+    struct domain *domain;
+};
+
+static LIST_HEAD(domain_ctx_list);
+static DEFINE_SPINLOCK(domain_ctx_list_lock);
+
 static bool optee_probe(void)
 {
     struct dt_device_node *node;
@@ -41,18 +49,49 @@ static bool optee_probe(void)
     return true;
 }
 
+static struct domain_ctx *find_domain_ctx(struct domain* d)
+{
+    struct domain_ctx *ctx;
+
+    spin_lock(&domain_ctx_list_lock);
+
+    list_for_each_entry( ctx, &domain_ctx_list, list )
+    {
+        if ( ctx->domain == d )
+        {
+            spin_unlock(&domain_ctx_list_lock);
+            return ctx;
+        }
+    }
+
+    spin_unlock(&domain_ctx_list_lock);
+    return NULL;
+}
Why do you need to store the context in a list? As you have a context per domain, it would be better to store that directly in struct domain by introducing a field "void *tee".

With that, most of the code in this patch will become unnecessary, so you can potentially fold it into something else.
Cheers,
On 05/09/18 15:10, Julien Grall wrote:
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
OP-TEE meditor needs to store per-domain context, as will be seen
s/meditor/mediator/
in the next patches. At this moment it stores only reference to domain.
This allows us to filter out calls from domains that are not allowed to work with OP-TEE.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
 xen/arch/arm/tee/optee.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 48bff5d..c895a99 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -19,6 +19,14 @@
 #include <asm/tee/optee_msg.h>
 #include <asm/tee/optee_smc.h>
 
+struct domain_ctx {
I would prefer if the structure had optee in its name, such as optee_domain.

We also tend to abbreviate context in Xen as "ctxt".
I'd argue that "ctx" is the better way of abbreviating context. It's far more usual, and easier to type. (That said, optee_domain would be the most consistent name with the rest of the codebase in this case)
~Andrew
Hi,
On 05.09.18 17:10, Julien Grall wrote:
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
OP-TEE meditor needs to store per-domain context, as will be seen
s/meditor/mediator/
in the next patches. At this moment it stores only reference to domain.
This allows us to filter out calls from domains that are not allowed to work with OP-TEE.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
 xen/arch/arm/tee/optee.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 48bff5d..c895a99 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -19,6 +19,14 @@
 #include <asm/tee/optee_msg.h>
 #include <asm/tee/optee_smc.h>
 
+struct domain_ctx {
I would prefer if the structure had optee in its name, such as optee_domain.
It is supposed to be internal for this file, so I chose this name. But I can rename it, no problem.
We also tend to abbreviate context in Xen as "ctxt".
[...]
Why do you need to store the context in a list? As you have a context per domain, it would be better to store that directly in struct domain by introducing a field "void *tee".

Yes, this was my first intention. But I was unsure whether it is appropriate to add driver-specific fields to struct arch_domain. If you say that it is okay (under #ifdef CONFIG_TEE), then I'll do it this way.
On 09/05/2018 03:23 PM, Volodymyr Babchuk wrote:
Hi,
Hi,
On 05.09.18 17:10, Julien Grall wrote:
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
OP-TEE meditor needs to store per-domain context, as will be seen
s/meditor/mediator/
in the next patches. At this moment it stores only reference to domain.
This allows us to filter out calls from domains that are not allowed to work with OP-TEE.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
 xen/arch/arm/tee/optee.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 48bff5d..c895a99 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -19,6 +19,14 @@
 #include <asm/tee/optee_msg.h>
 #include <asm/tee/optee_smc.h>
 
+struct domain_ctx {
I would prefer if the structure had optee in its name, such as optee_domain.
It is supposed to be internal for this file, so I chose this name. But I can rename it, no problem.
Even if it is internal, the structure will appear in debug symbols. If you use tools like pahole it will be quite difficult to know what the structure is for without a grep.
We also tend to abbreviate context in Xen as "ctxt".
[...]
Why do you need to store the context in a list? As you have a context per domain, it would be better to store that directly in struct domain by introducing a field "void *tee".

Yes, this was my first intention. But I was unsure whether it is appropriate to add driver-specific fields to struct arch_domain. If you say that it is okay (under #ifdef CONFIG_TEE), then I'll do it this way.

I would definitely prefer the "driver-specific" fields. This is likely going to be useful for other TEEs as well.
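A rough sketch of what that could look like (field name illustrative, guarded by CONFIG_TEE):

/* xen/include/asm-arm/domain.h */
struct arch_domain
{
    /* ... existing fields ... */

#ifdef CONFIG_TEE
    /* TEE mediator-specific state, opaque to common code */
    void *tee;
#endif
};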
Cheers,
The main way to communicate with OP-TEE is to issue a standard SMCCC call. "Standard" is an SMCCC term meaning that the call can be interrupted, with OP-TEE returning control to NW before completing the call.
In contrast with fast calls, where arguments and return values are passed in registers, standard calls use shared memory. Register pair r1,r2 holds the 64-bit PA of a command buffer, where all arguments are stored and which is used to return data. OP-TEE internally copies the contents of this buffer into its own secure memory before accessing and validating any data in the command buffer. This is done to make sure that NW will not change the contents of the validated parameters.
The mediator needs to do the same, for a number of reasons:
1. To make sure that the guest will not change data after validation.
2. To translate IPAs to PAs in the command buffer (this is not done in this patch).
3. To hide translated addresses from the guest, so it will not be able to do IPA->PA translation by misusing the mediator.
The mediator also pins the page with the original command buffer, because it will write to it later, when returning the response from OP-TEE.
During a standard call OP-TEE can issue multiple "RPC returns", asking NW to do some work for OP-TEE. NW then issues the special call OPTEE_SMC_CALL_RETURN_FROM_RPC to resume handling of the original call. Thus, the mediator needs to maintain a context for the original standard call across multiple SMCCC calls.
A standard call is considered complete when the returned value is not an RPC request.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
 xen/arch/arm/tee/optee.c | 319 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 316 insertions(+), 3 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index c895a99..1008eba 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -12,6 +12,7 @@
  */
 
 #include <xen/device_tree.h>
+#include <xen/domain_page.h>
 #include <xen/sched.h>
 #include <asm/smccc.h>
 #include <asm/tee/tee.h>
@@ -19,9 +20,27 @@
 #include <asm/tee/optee_msg.h>
 #include <asm/tee/optee_smc.h>
 
+#define MAX_STD_CALLS   16
+
+/*
+ * Call context. OP-TEE can issue multiple RPC returns during one call.
+ * We need to preserve context during them.
+ */
+struct std_call_ctx {
+    struct list_head list;
+    struct optee_msg_arg *guest_arg;
+    struct optee_msg_arg *xen_arg;
+    mfn_t guest_arg_mfn;
+    int optee_thread_id;
+    int rpc_op;
+};
+
 struct domain_ctx {
     struct list_head list;
+    struct list_head call_ctx_list;
     struct domain *domain;
+    atomic_t call_ctx_count;
+    spinlock_t lock;
 };
 
 static LIST_HEAD(domain_ctx_list);
@@ -49,6 +68,44 @@ static bool optee_probe(void)
     return true;
 }
 
+static mfn_t lookup_and_pin_guest_ram_addr(paddr_t gaddr,
+                                           struct page_info **pg)
+{
+    mfn_t mfn;
+    gfn_t gfn;
+    p2m_type_t t;
+    struct page_info *page;
+    struct domain *d = current->domain;
+
+    gfn = gaddr_to_gfn(gaddr);
+    mfn = p2m_lookup(d, gfn, &t);
+
+    if ( t != p2m_ram_rw || mfn_eq(mfn, INVALID_MFN) )
+        return INVALID_MFN;
+
+    page = mfn_to_page(mfn);
+    if ( !page )
+        return INVALID_MFN;
+
+    if ( !get_page(page, d) )
+        return INVALID_MFN;
+
+    if ( pg )
+        *pg = page;
+
+    return mfn;
+}
+
+static void unpin_guest_ram_addr(mfn_t mfn)
+{
+    struct page_info *page;
+    page = mfn_to_page(mfn);
+    if ( !page )
+        return;
+
+    put_page(page);
+}
+
 static struct domain_ctx *find_domain_ctx(struct domain* d)
 {
     struct domain_ctx *ctx;
@@ -87,6 +144,10 @@ static int optee_enable(struct domain *d)
     }
 
     ctx->domain = d;
+    INIT_LIST_HEAD(&ctx->call_ctx_list);
+
+    atomic_set(&ctx->call_ctx_count, 0);
+    spin_lock_init(&ctx->lock);
 
     spin_lock(&domain_ctx_list_lock);
     list_add_tail(&ctx->list, &domain_ctx_list);
@@ -134,11 +195,72 @@ static void set_return(struct cpu_user_regs *regs, uint32_t ret)
     set_user_reg(regs, 7, 0);
 }
 
+static struct std_call_ctx *allocate_std_call_ctx(struct domain_ctx *ctx)
+{
+    struct std_call_ctx *call;
+    int count;
+
+    count = atomic_add_unless(&ctx->call_ctx_count, 1, MAX_STD_CALLS);
+    if ( count == MAX_STD_CALLS )
+        return NULL;
+
+    call = xzalloc(struct std_call_ctx);
+    if ( !call ) {
+        atomic_dec(&ctx->call_ctx_count);
+        return NULL;
+    }
+
+    call->optee_thread_id = -1;
+
+    spin_lock(&ctx->lock);
+    list_add_tail(&call->list, &ctx->call_ctx_list);
+    spin_unlock(&ctx->lock);
+
+    return call;
+}
+
+static void free_std_call_ctx(struct domain_ctx *ctx, struct std_call_ctx *call)
+{
+    atomic_dec(&ctx->call_ctx_count);
+
+    spin_lock(&ctx->lock);
+    list_del(&call->list);
+    spin_unlock(&ctx->lock);
+
+    if ( call->xen_arg )
+        free_xenheap_page(call->xen_arg);
+
+    if ( call->guest_arg ) {
+        unmap_domain_page_global(call->guest_arg);
+        unpin_guest_ram_addr(call->guest_arg_mfn);
+    }
+
+    xfree(call);
+}
+
+static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id)
+{
+    struct std_call_ctx *call;
+
+    spin_lock(&ctx->lock);
+    list_for_each_entry( call, &ctx->call_ctx_list, list )
+    {
+        if ( call->optee_thread_id == thread_id )
+        {
+            spin_unlock(&ctx->lock);
+            return call;
+        }
+    }
+    spin_unlock(&ctx->lock);
+
+    return NULL;
+}
+
 static void optee_domain_destroy(struct domain *d)
 {
     struct arm_smccc_res resp;
     struct domain_ctx *ctx;
+    struct std_call_ctx *call, *call_tmp;
     bool found = false;
 
     /* At this time all domain VCPUs should be stopped */
@@ -163,9 +285,201 @@ static void optee_domain_destroy(struct domain *d)
     if ( !found )
         return;
 
+    ASSERT(!spin_is_locked(&ctx->lock));
+
+    list_for_each_entry_safe( call, call_tmp, &ctx->call_ctx_list, list )
+        free_std_call_ctx(ctx, call);
+
+    ASSERT(!atomic_read(&ctx->call_ctx_count));
+
     xfree(ctx);
 }
 
+/*
+ * Copy command buffer into xen memory to:
+ * 1) Hide translated addresses from guest
+ * 2) Make sure that guest wouldn't change data in command buffer during call
+ */
+static bool copy_std_request(struct cpu_user_regs *regs,
+                             struct std_call_ctx *call)
+{
+    paddr_t cmd_gaddr, xen_addr;
+
+    cmd_gaddr = (paddr_t)get_user_reg(regs, 1) << 32 |
+                get_user_reg(regs, 2);
+
+    /*
+     * Command buffer should start at page boundary.
+     * This is OP-TEE ABI requirement.
+     */
+    if ( cmd_gaddr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1) )
+        return false;
+
+    call->guest_arg_mfn = lookup_and_pin_guest_ram_addr(cmd_gaddr, NULL);
+    if ( mfn_eq(call->guest_arg_mfn, INVALID_MFN) )
+        return false;
+
+    call->guest_arg = map_domain_page_global(call->guest_arg_mfn);
+    if ( !call->guest_arg ) {
+        unpin_guest_ram_addr(call->guest_arg_mfn);
+        return false;
+    }
+
+    call->xen_arg = alloc_xenheap_page();
+    if ( !call->xen_arg ) {
+        unpin_guest_ram_addr(call->guest_arg_mfn);
+        return false;
+    }
+
+    memcpy(call->xen_arg, call->guest_arg, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
+
+    xen_addr = virt_to_maddr(call->xen_arg);
+
+    set_user_reg(regs, 1, xen_addr >> 32);
+    set_user_reg(regs, 2, xen_addr & 0xFFFFFFFF);
+
+    return true;
+}
+
+static bool copy_std_request_back(struct domain_ctx *ctx,
+                                  struct cpu_user_regs *regs,
+                                  struct std_call_ctx *call)
+{
+    unsigned int i;
+    uint32_t attr;
+
+    call->guest_arg->ret = call->xen_arg->ret;
+    call->guest_arg->ret_origin = call->xen_arg->ret_origin;
+    call->guest_arg->session = call->xen_arg->session;
+    for ( i = 0; i < call->xen_arg->num_params; i++ ) {
+        attr = call->xen_arg->params[i].attr;
+
+        switch ( attr & OPTEE_MSG_ATTR_TYPE_MASK ) {
+        case OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT:
+        case OPTEE_MSG_ATTR_TYPE_TMEM_INOUT:
+            call->guest_arg->params[i].u.tmem.size =
+                call->xen_arg->params[i].u.tmem.size;
+            continue;
+        case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT:
+        case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT:
+            call->guest_arg->params[i].u.value.a =
+                call->xen_arg->params[i].u.value.a;
+            call->guest_arg->params[i].u.value.b =
+                call->xen_arg->params[i].u.value.b;
+            continue;
+        case OPTEE_MSG_ATTR_TYPE_RMEM_OUTPUT:
+        case OPTEE_MSG_ATTR_TYPE_RMEM_INOUT:
+            call->guest_arg->params[i].u.rmem.size =
+                call->xen_arg->params[i].u.rmem.size;
+            continue;
+        case OPTEE_MSG_ATTR_TYPE_NONE:
+        case OPTEE_MSG_ATTR_TYPE_RMEM_INPUT:
+        case OPTEE_MSG_ATTR_TYPE_TMEM_INPUT:
+            continue;
+        }
+    }
+
+    return true;
+}
+
+static bool execute_std_call(struct domain_ctx *ctx,
+                             struct cpu_user_regs *regs,
+                             struct std_call_ctx *call)
+{
+    register_t optee_ret;
+
+    forward_call(regs);
+
+    optee_ret = get_user_reg(regs, 0);
+    if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) )
+    {
+        call->optee_thread_id = get_user_reg(regs, 3);
+        call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret);
+        return true;
+    }
+
+    copy_std_request_back(ctx, regs, call);
+
+    free_std_call_ctx(ctx, call);
+
+    return true;
+}
+
+static bool handle_std_call(struct domain_ctx *ctx, struct cpu_user_regs *regs)
+{
+    struct std_call_ctx *call;
+    bool ret;
+
+    call = allocate_std_call_ctx(ctx);
+
+    if (!call)
+        return false;
+
+    ret = copy_std_request(regs, call);
+    if ( !ret )
+        goto out;
+
+    /* Now we can safely examine contents of command buffer */
+    if ( OPTEE_MSG_GET_ARG_SIZE(call->xen_arg->num_params) >
+         OPTEE_MSG_NONCONTIG_PAGE_SIZE ) {
+        ret = false;
+        goto out;
+    }
+
+    switch ( call->xen_arg->cmd )
+    {
+    case OPTEE_MSG_CMD_OPEN_SESSION:
+    case OPTEE_MSG_CMD_CLOSE_SESSION:
+    case OPTEE_MSG_CMD_INVOKE_COMMAND:
+    case OPTEE_MSG_CMD_CANCEL:
+    case
OPTEE_MSG_CMD_REGISTER_SHM: + case OPTEE_MSG_CMD_UNREGISTER_SHM: + ret = true; + break; + default: + ret = false; + } + + if (!ret) + goto out; + + ret = execute_std_call(ctx, regs, call); + +out: + if (!ret) + free_std_call_ctx(ctx, call); + + return ret; +} + +static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs) +{ + struct std_call_ctx *call; + + int optee_thread_id = get_user_reg(regs, 3); + + call = find_call_ctx(ctx, optee_thread_id); + + if ( !call ) + return false; + + switch ( call->rpc_op ) { + case OPTEE_SMC_RPC_FUNC_ALLOC: + /* TODO: Add handling */ + break; + case OPTEE_SMC_RPC_FUNC_FREE: + /* TODO: Add handling */ + break; + case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR: + break; + case OPTEE_SMC_RPC_FUNC_CMD: + /* TODO: Add handling */ + break; + } + + return execute_std_call(ctx, regs, call); +} + static bool handle_exchange_capabilities(struct cpu_user_regs *regs) { uint32_t caps; @@ -225,10 +539,9 @@ static bool optee_handle_call(struct cpu_user_regs *regs) case OPTEE_SMC_EXCHANGE_CAPABILITIES: return handle_exchange_capabilities(regs); case OPTEE_SMC_CALL_WITH_ARG: + return handle_std_call(ctx, regs); case OPTEE_SMC_CALL_RETURN_FROM_RPC: - /* TODO: Add proper handling for this calls */ - forward_call(regs); - return true; + return handle_rpc(ctx, regs); default: return false; }
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
Main way to communicate with OP-TEE is to issue standard SMCCC
NIT: The main way
call. "Standard" is a SMCCC term and it means that call can be interrupted and OP-TEE can return control to NW before completing the call.
In contranst with fast calls, where arguments and return values
NIT: s/contranst/contrast/
are passed in registers, standard calls use shared memory. Register pair r1,r2 holds 64-bit PA of command buffer, where all arguments
Do you mean w1, w2?
are stored and which is used to return data. OP-TEE internally copies the contents of this buffer into its own secure memory before accessing and validating any data in the command buffer. This is done to make sure that NW will not change the contents of the validated parameters.
The mediator needs to do the same for a number of reasons:
- To make sure that the guest will not change data after validation.
- To translate IPAs to PAs in the command buffer (this is not done in this patch).
- To hide the translated addresses from the guest, so it will not be able to do IPA->PA translation by misusing the mediator.
Also, the mediator pins the page with the original command buffer because it will write to it later, when returning the response from OP-TEE.
I don't think it is necessary to pin the guest command buffer. You can use the guestcopy helpers to copy to/from the guest memory.
If the guest modifies the P2M at the same time, then it is not our business if something wrong happens. The only thing we need to prevent here is writing back to an MFN that does not belong to the domain.
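For illustration, this is roughly what I have in mind for copy_std_request() (an untested sketch; it assumes call->xen_arg has already been allocated, and access_guest_memory_by_ipa() returns 0 on success):

    /* Copy the whole command buffer from the guest; no pin/map needed */
    if ( access_guest_memory_by_ipa(current->domain, cmd_gaddr,
                                    call->xen_arg,
                                    OPTEE_MSG_NONCONTIG_PAGE_SIZE,
                                    false /* read from guest */) )
        return false;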
During a standard call, OP-TEE can issue multiple "RPC returns", asking NW to do some work for OP-TEE. NW then issues the special call OPTEE_SMC_CALL_RETURN_FROM_RPC to resume handling of the original call. Thus, the mediator needs to maintain context for the original standard call across multiple SMCCC calls.
A standard call is considered complete when the returned value is not an RPC request.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
xen/arch/arm/tee/optee.c | 319 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 316 insertions(+), 3 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index c895a99..1008eba 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -12,6 +12,7 @@ */ #include <xen/device_tree.h> +#include <xen/domain_page.h> #include <xen/sched.h> #include <asm/smccc.h> #include <asm/tee/tee.h> @@ -19,9 +20,27 @@ #include <asm/tee/optee_msg.h> #include <asm/tee/optee_smc.h> +#define MAX_STD_CALLS 16
I suspect this is used to restrict the number of calls in flight. If so, I would appreciate if this is documented in the code and the limitations explained in the commit message.
+/*
- Call context. OP-TEE can issue multiple RPC returns during one call.
- We need to preserve context during them.
- */
+struct std_call_ctx {
- struct list_head list;
- struct optee_msg_arg *guest_arg;
- struct optee_msg_arg *xen_arg;
- mfn_t guest_arg_mfn;
- int optee_thread_id;
- int rpc_op;
+};
- struct domain_ctx { struct list_head list;
- struct list_head call_ctx_list; struct domain *domain;
- atomic_t call_ctx_count;
- spinlock_t lock; };
static LIST_HEAD(domain_ctx_list); @@ -49,6 +68,44 @@ static bool optee_probe(void) return true; } +static mfn_t lookup_and_pin_guest_ram_addr(paddr_t gaddr,
struct page_info **pg)
I don't think there is a need to return both an MFN and a page. You can easily deduce one from the other. In this context, it would be better to return a page.
Also, I would prefer if this function took a guest frame address rather than a guest physical address.
Lastly, this function is basically a re-implementation of get_page_from_gfn() with a restriction on the p2m type. Please re-implement it using that function.
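Something along these lines, perhaps (just a sketch; the helper name is made up):

    static struct page_info *get_guest_cmd_page(struct domain *d, gfn_t gfn)
    {
        p2m_type_t t;
        struct page_info *page;

        /* get_page_from_gfn() takes a reference on success */
        page = get_page_from_gfn(d, gfn_x(gfn), &t, P2M_ALLOC);
        if ( !page )
            return NULL;

        /* Only plain read-write guest RAM is acceptable here */
        if ( t != p2m_ram_rw )
        {
            put_page(page);
            return NULL;
        }

        return page;
    }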
+{
- mfn_t mfn;
- gfn_t gfn;
- p2m_type_t t;
- struct page_info *page;
- struct domain *d = current->domain;
- gfn = gaddr_to_gfn(gaddr);
- mfn = p2m_lookup(d, gfn, &t);
- if ( t != p2m_ram_rw || mfn_eq(mfn, INVALID_MFN) )
return INVALID_MFN;
- page = mfn_to_page(mfn);
mfn_to_page() can never fail. If you want to check whether an MFN is valid, then you can use mfn_valid(...).
- if ( !page )
return INVALID_MFN;
- if ( !get_page(page, d) )
return INVALID_MFN;
- if ( pg )
*pg = page;
- return mfn;
+}
+static void unpin_guest_ram_addr(mfn_t mfn) +{
- struct page_info *page;
- page = mfn_to_page(mfn);
- if ( !page )
return;
- put_page(page);
+}
- static struct domain_ctx *find_domain_ctx(struct domain* d) { struct domain_ctx *ctx;
@@ -87,6 +144,10 @@ static int optee_enable(struct domain *d) } ctx->domain = d;
- INIT_LIST_HEAD(&ctx->call_ctx_list);
- atomic_set(&ctx->call_ctx_count, 0);
- spin_lock_init(&ctx->lock);
spin_lock(&domain_ctx_list_lock); list_add_tail(&ctx->list, &domain_ctx_list); @@ -134,11 +195,72 @@ static void set_return(struct cpu_user_regs *regs, uint32_t ret) set_user_reg(regs, 7, 0); } +static struct std_call_ctx *allocate_std_call_ctx(struct domain_ctx *ctx) +{
- struct std_call_ctx *call;
- int count;
- count = atomic_add_unless(&ctx->call_ctx_count, 1, MAX_STD_CALLS);
Please add a comment explaining the rationale for this.
- if ( count == MAX_STD_CALLS )
return NULL;
- call = xzalloc(struct std_call_ctx);
- if ( !call ) {
Coding style:
if {
atomic_dec(&ctx->call_ctx_count);
return NULL;
- }
- call->optee_thread_id = -1;
- spin_lock(&ctx->lock);
- list_add_tail(&call->list, &ctx->call_ctx_list);
- spin_unlock(&ctx->lock);
- return call;
+}
+static void free_std_call_ctx(struct domain_ctx *ctx, struct std_call_ctx *call) +{
- atomic_dec(&ctx->call_ctx_count);
- spin_lock(&ctx->lock);
- list_del(&call->list);
- spin_unlock(&ctx->lock);
- if ( call->xen_arg )
free_xenheap_page(call->xen_arg);
free_xenheap_page() is able to cope with a NULL pointer.
- if ( call->guest_arg ) {
unmap_domain_page_global(call->guest_arg);
unpin_guest_ram_addr(call->guest_arg_mfn);
- }
- xfree(call);
+}
+static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id) +{
- struct std_call_ctx *call;
- spin_lock(&ctx->lock);
- list_for_each_entry( call, &ctx->call_ctx_list, list )
- {
if ( call->optee_thread_id == thread_id )
{
spin_unlock(&ctx->lock);
return call;
}
- }
- spin_unlock(&ctx->lock);
- return NULL;
+} static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx;
- struct std_call_ctx *call, *call_tmp; bool found = false;
/* At this time all domain VCPUs should be stopped */ @@ -163,9 +285,201 @@ static void optee_domain_destroy(struct domain *d) if ( !found ) return;
- ASSERT(!spin_is_locked(&ctx->lock));
- list_for_each_entry_safe( call, call_tmp, &ctx->call_ctx_list, list )
free_std_call_ctx(ctx, call);
- ASSERT(!atomic_read(&ctx->call_ctx_count));
xfree(ctx); }
+/*
- Copy command buffer into xen memory to:
- Hide translated addresses from guest
- Make sure that guest wouldn't change data in command buffer during call
- */
+static bool copy_std_request(struct cpu_user_regs *regs,
struct std_call_ctx *call)
+{
- paddr_t cmd_gaddr, xen_addr;
- cmd_gaddr = (paddr_t)get_user_reg(regs, 1) << 32 |
get_user_reg(regs, 2);
- /*
* Command buffer should start at page boundary.
* This is OP-TEE ABI requirement.
*/
- if ( cmd_gaddr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1) )
return false;
- call->guest_arg_mfn = lookup_and_pin_guest_ram_addr(cmd_gaddr, NULL);
- if ( mfn_eq(call->guest_arg_mfn, INVALID_MFN) )
return false;
- call->guest_arg = map_domain_page_global(call->guest_arg_mfn);
- if ( !call->guest_arg ) {
unpin_guest_ram_addr(call->guest_arg_mfn);
return false;
- }
It looks to me like you want to use access_guest_memory_by_ipa() here. This would avoid keeping the page mapped globally for the duration of the call.
- call->xen_arg = alloc_xenheap_page();
I don't see anything in the code making sure that OPTEE_MSG_NONCONTIG_PAGE_SIZE fits in a Xen page size. You probably want to add a BUILD_BUG_ON(OPTEE_MSG_NONCONTIG_PAGE_SIZE > PAGE_SIZE).
Also, to be more future-proof (i.e. Xen using a higher page size), it would be better to use _xmalloc(OPTEE_MSG_NONCONTIG_PAGE_SIZE, OPTEE_MSG_NONCONTIG_PAGE_SIZE).
_xmalloc() will call alloc_xenheap_page() if necessary.
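Roughly (a sketch; note that the matching free would then be xfree() rather than free_xenheap_page()):

    /* Catch at build time if the OP-TEE page does not fit in a Xen page */
    BUILD_BUG_ON(OPTEE_MSG_NONCONTIG_PAGE_SIZE > PAGE_SIZE);

    /* Suitably sized and aligned buffer, independent of PAGE_SIZE */
    call->xen_arg = _xmalloc(OPTEE_MSG_NONCONTIG_PAGE_SIZE,
                             OPTEE_MSG_NONCONTIG_PAGE_SIZE);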
- if ( !call->xen_arg ) {
unpin_guest_ram_addr(call->guest_arg_mfn);
return false;
- }
- memcpy(call->xen_arg, call->guest_arg, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
- xen_addr = virt_to_maddr(call->xen_arg);
- set_user_reg(regs, 1, xen_addr >> 32);
- set_user_reg(regs, 2, xen_addr & 0xFFFFFFFF);
- return true;
+}
+static bool copy_std_request_back(struct domain_ctx *ctx,
struct cpu_user_regs *regs,
struct std_call_ctx *call)
+{
- unsigned int i;
- uint32_t attr;
This function might be a bit tricky to implement using access_guest_memory_by_ipa(). However, you can just map the region temporarily. This would avoid using map_domain_page_global() within the code and remove one potential source of error in the cleanup path.
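For example (an untested sketch):

    /* Map the guest page just for the duration of the copy-back */
    struct optee_msg_arg *guest_arg = map_domain_page(call->guest_arg_mfn);

    guest_arg->ret = call->xen_arg->ret;
    /* ... copy back the other whitelisted fields ... */

    unmap_domain_page(guest_arg);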
- call->guest_arg->ret = call->xen_arg->ret;
- call->guest_arg->ret_origin = call->xen_arg->ret_origin;
- call->guest_arg->session = call->xen_arg->session;
- for ( i = 0; i < call->xen_arg->num_params; i++ ) {
for ( ... ) {
attr = call->xen_arg->params[i].attr;
switch ( attr & OPTEE_MSG_ATTR_TYPE_MASK ) {
switch ( ... ) {
case OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT:
case OPTEE_MSG_ATTR_TYPE_TMEM_INOUT:
call->guest_arg->params[i].u.tmem.size =
call->xen_arg->params[i].u.tmem.size;
continue;
case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT:
case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT:
call->guest_arg->params[i].u.value.a =
call->xen_arg->params[i].u.value.a;
call->guest_arg->params[i].u.value.b =
call->xen_arg->params[i].u.value.b;
continue;
case OPTEE_MSG_ATTR_TYPE_RMEM_OUTPUT:
case OPTEE_MSG_ATTR_TYPE_RMEM_INOUT:
call->guest_arg->params[i].u.rmem.size =
call->xen_arg->params[i].u.rmem.size;
continue;
case OPTEE_MSG_ATTR_TYPE_NONE:
case OPTEE_MSG_ATTR_TYPE_RMEM_INPUT:
case OPTEE_MSG_ATTR_TYPE_TMEM_INPUT:
continue;
}
- }
- return true;
+}
+static bool execute_std_call(struct domain_ctx *ctx,
struct cpu_user_regs *regs,
struct std_call_ctx *call)
+{
- register_t optee_ret;
- forward_call(regs);
I find it a bit odd that you introduce a return value for forward_call() in the previous patch, but you barely use it.
I think I would prefer if the function does not return a value and you just rely on get_user_reg(regs, 0).
- optee_ret = get_user_reg(regs, 0);
- if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) )
- {
call->optee_thread_id = get_user_reg(regs, 3);
call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret);
return true;
- }
- copy_std_request_back(ctx, regs, call);
Here another function returns a value but it is not used. If you don't plan to use the return value, then please drop it. If you plan to use it, then do it everywhere.
- free_std_call_ctx(ctx, call);
- return true;
+}
+static bool handle_std_call(struct domain_ctx *ctx, struct cpu_user_regs *regs) +{
- struct std_call_ctx *call;
- bool ret;
- call = allocate_std_call_ctx(ctx);
- if (!call)
return false;
- ret = copy_std_request(regs, call);
- if ( !ret )
goto out;
- /* Now we can safely examine contents of command buffer */
- if ( OPTEE_MSG_GET_ARG_SIZE(call->xen_arg->num_params) >
OPTEE_MSG_NONCONTIG_PAGE_SIZE ) {
if ( ... ) {
ret = false;
goto out;
- }
- switch ( call->xen_arg->cmd )
- {
- case OPTEE_MSG_CMD_OPEN_SESSION:
- case OPTEE_MSG_CMD_CLOSE_SESSION:
- case OPTEE_MSG_CMD_INVOKE_COMMAND:
- case OPTEE_MSG_CMD_CANCEL:
- case OPTEE_MSG_CMD_REGISTER_SHM:
- case OPTEE_MSG_CMD_UNREGISTER_SHM:
ret = true;
break;
- default:
ret = false;
- }
- if (!ret)
goto out;
- ret = execute_std_call(ctx, regs, call);
+out:
- if (!ret)
free_std_call_ctx(ctx, call);
- return ret;
+}
+static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs) +{
- struct std_call_ctx *call;
No need for the newline here.
- int optee_thread_id = get_user_reg(regs, 3);
- call = find_call_ctx(ctx, optee_thread_id);
- if ( !call )
return false;
- switch ( call->rpc_op ) {
switch ( ... ) {
- case OPTEE_SMC_RPC_FUNC_ALLOC:
/* TODO: Add handling */
break;
- case OPTEE_SMC_RPC_FUNC_FREE:
/* TODO: Add handling */
break;
- case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR:
break;
- case OPTEE_SMC_RPC_FUNC_CMD:
/* TODO: Add handling */
break;
- }
- return execute_std_call(ctx, regs, call);
+}
- static bool handle_exchange_capabilities(struct cpu_user_regs *regs) { uint32_t caps;
@@ -225,10 +539,9 @@ static bool optee_handle_call(struct cpu_user_regs *regs) case OPTEE_SMC_EXCHANGE_CAPABILITIES: return handle_exchange_capabilities(regs); case OPTEE_SMC_CALL_WITH_ARG:
return handle_std_call(ctx, regs); case OPTEE_SMC_CALL_RETURN_FROM_RPC:
/* TODO: Add proper handling for this calls */
forward_call(regs);
return true;
return handle_rpc(ctx, regs); default: return false; }
Cheers,
Hi Julien,
On 05.09.18 18:17, Julien Grall wrote:
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
Main way to communicate with OP-TEE is to issue standard SMCCC
NIT: The main way
call. "Standard" is a SMCCC term and it means that call can be interrupted and OP-TEE can return control to NW before completing the call.
In contranst with fast calls, where arguments and return values
NIT: s/contranst/contrast/
are passed in registers, standard calls use shared memory. Register pair r1,r2 holds 64-bit PA of command buffer, where all arguments
Do you mean w1, w2?
Good question. What should the registers be called so that it is valid for both ARMv7 and ARMv8?
are stored and which is used to return data. OP-TEE internally copies the contents of this buffer into its own secure memory before accessing and validating any data in the command buffer. This is done to make sure that NW will not change the contents of the validated parameters.
The mediator needs to do the same for a number of reasons:
- To make sure that the guest will not change data after validation.
- To translate IPAs to PAs in the command buffer (this is not done in this patch).
- To hide the translated addresses from the guest, so it will not be able to do IPA->PA translation by misusing the mediator.
Also, the mediator pins the page with the original command buffer because it will write to it later, when returning the response from OP-TEE.
I don't think it is necessary to pin the guest command buffer. You can use the guestcopy helpers to copy to/from the guest memory.
If the guest modifies the P2M at the same time, then it is not our business if something wrong happens. The only thing we need to prevent here is writing back to an MFN that does not belong to the domain.
You are right. I did it the same way OP-TEE does, except that OP-TEE does not copy the whole command buffer. So yes, your approach is better. Thank you.
During a standard call, OP-TEE can issue multiple "RPC returns", asking NW to do some work for OP-TEE. NW then issues the special call OPTEE_SMC_CALL_RETURN_FROM_RPC to resume handling of the original call. Thus, the mediator needs to maintain context for the original standard call across multiple SMCCC calls.
A standard call is considered complete when the returned value is not an RPC request.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
xen/arch/arm/tee/optee.c | 319 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 316 insertions(+), 3 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index c895a99..1008eba 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -12,6 +12,7 @@ */ #include <xen/device_tree.h> +#include <xen/domain_page.h> #include <xen/sched.h> #include <asm/smccc.h> #include <asm/tee/tee.h> @@ -19,9 +20,27 @@ #include <asm/tee/optee_msg.h> #include <asm/tee/optee_smc.h> +#define MAX_STD_CALLS 16
I suspect this is used to restrict the number of calls in flight. If so, I would appreciate if this is documented in the code and the limitations explained in the commit message.
Yes, you are right. I'll add a description.
+/*
- Call context. OP-TEE can issue multiple RPC returns during one call.
- We need to preserve context during them.
- */
+struct std_call_ctx { + struct list_head list; + struct optee_msg_arg *guest_arg; + struct optee_msg_arg *xen_arg; + mfn_t guest_arg_mfn; + int optee_thread_id; + int rpc_op; +};
struct domain_ctx { struct list_head list; + struct list_head call_ctx_list; struct domain *domain; + atomic_t call_ctx_count; + spinlock_t lock; }; static LIST_HEAD(domain_ctx_list); @@ -49,6 +68,44 @@ static bool optee_probe(void) return true; } +static mfn_t lookup_and_pin_guest_ram_addr(paddr_t gaddr, + struct page_info **pg)
I don't think there is a need to return both an MFN and a page. You can easily deduce one from the other. In this context, it would be better to return a page.
Also, I would prefer if this function took a guest frame address rather than a guest physical address.
Lastly, this function is basically a re-implementation of get_page_from_gfn() with a restriction on the p2m type. Please re-implement it using that function.
+{ + mfn_t mfn; + gfn_t gfn; + p2m_type_t t; + struct page_info *page; + struct domain *d = current->domain;
+ gfn = gaddr_to_gfn(gaddr); + mfn = p2m_lookup(d, gfn, &t);
+ if ( t != p2m_ram_rw || mfn_eq(mfn, INVALID_MFN) ) + return INVALID_MFN;
+ page = mfn_to_page(mfn);
mfn_to_page() can never fail. If you want to check whether an MFN is valid, then you can use mfn_valid(...).
+ if ( !page ) + return INVALID_MFN;
+ if ( !get_page(page, d) ) + return INVALID_MFN;
+ if ( pg ) + *pg = page;
+ return mfn; +}
+static void unpin_guest_ram_addr(mfn_t mfn) +{ + struct page_info *page; + page = mfn_to_page(mfn); + if ( !page ) + return;
+ put_page(page); +}
static struct domain_ctx *find_domain_ctx(struct domain* d) { struct domain_ctx *ctx; @@ -87,6 +144,10 @@ static int optee_enable(struct domain *d) } ctx->domain = d; + INIT_LIST_HEAD(&ctx->call_ctx_list);
+ atomic_set(&ctx->call_ctx_count, 0); + spin_lock_init(&ctx->lock); spin_lock(&domain_ctx_list_lock); list_add_tail(&ctx->list, &domain_ctx_list); @@ -134,11 +195,72 @@ static void set_return(struct cpu_user_regs *regs, uint32_t ret) set_user_reg(regs, 7, 0); } +static struct std_call_ctx *allocate_std_call_ctx(struct domain_ctx *ctx) +{ + struct std_call_ctx *call; + int count;
+ count = atomic_add_unless(&ctx->call_ctx_count, 1, MAX_STD_CALLS);
Please add a comment explaining the rationale for this.
E.g. "/* Limit number of calls for the current domain */" ?
+ if ( count == MAX_STD_CALLS ) + return NULL;
+ call = xzalloc(struct std_call_ctx); + if ( !call ) {
Coding style:
if {
+ atomic_dec(&ctx->call_ctx_count); + return NULL; + }
+ call->optee_thread_id = -1;
+ spin_lock(&ctx->lock); + list_add_tail(&call->list, &ctx->call_ctx_list); + spin_unlock(&ctx->lock);
+ return call; +}
+static void free_std_call_ctx(struct domain_ctx *ctx, struct std_call_ctx *call) +{ + atomic_dec(&ctx->call_ctx_count);
+ spin_lock(&ctx->lock); + list_del(&call->list); + spin_unlock(&ctx->lock);
+ if ( call->xen_arg ) + free_xenheap_page(call->xen_arg);
free_xenheap_page() is able to cope with a NULL pointer.
+ if ( call->guest_arg ) { + unmap_domain_page_global(call->guest_arg); + unpin_guest_ram_addr(call->guest_arg_mfn); + }
+ xfree(call); +}
+static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id) +{ + struct std_call_ctx *call;
+ spin_lock(&ctx->lock); + list_for_each_entry( call, &ctx->call_ctx_list, list ) + { + if ( call->optee_thread_id == thread_id ) + { + spin_unlock(&ctx->lock); + return call; + } + } + spin_unlock(&ctx->lock);
+ return NULL; +} static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx; + struct std_call_ctx *call, *call_tmp; bool found = false; /* At this time all domain VCPUs should be stopped */ @@ -163,9 +285,201 @@ static void optee_domain_destroy(struct domain *d) if ( !found ) return; + ASSERT(!spin_is_locked(&ctx->lock));
+ list_for_each_entry_safe( call, call_tmp, &ctx->call_ctx_list, list ) + free_std_call_ctx(ctx, call);
+ ASSERT(!atomic_read(&ctx->call_ctx_count));
xfree(ctx); } +/*
- Copy command buffer into xen memory to:
- Hide translated addresses from guest
- Make sure that guest wouldn't change data in command buffer
during call
- */
+static bool copy_std_request(struct cpu_user_regs *regs, + struct std_call_ctx *call) +{ + paddr_t cmd_gaddr, xen_addr;
+ cmd_gaddr = (paddr_t)get_user_reg(regs, 1) << 32 | + get_user_reg(regs, 2);
+ /* + * Command buffer should start at page boundary. + * This is OP-TEE ABI requirement. + */ + if ( cmd_gaddr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1) ) + return false;
+ call->guest_arg_mfn = lookup_and_pin_guest_ram_addr(cmd_gaddr, NULL); + if ( mfn_eq(call->guest_arg_mfn, INVALID_MFN) ) + return false;
+ call->guest_arg = map_domain_page_global(call->guest_arg_mfn); + if ( !call->guest_arg ) { + unpin_guest_ram_addr(call->guest_arg_mfn); + return false; + }
It looks to me like you want to use access_guest_memory_by_ipa() here. This would avoid keeping the page mapped globally for the duration of the call.
Yes, as you said earlier, I can try to copy the whole contents of the command buffer to the shadowed region and then handle it there.
+ call->xen_arg = alloc_xenheap_page();
I don't see anything in the code making sure that OPTEE_MSG_NONCONTIG_PAGE_SIZE fits in a Xen page size. You probably want to add a BUILD_BUG_ON(OPTEE_MSG_NONCONTIG_PAGE_SIZE > PAGE_SIZE).
Also, to be more future-proof (i.e. Xen using a higher page size), it would be better to use _xmalloc(OPTEE_MSG_NONCONTIG_PAGE_SIZE, OPTEE_MSG_NONCONTIG_PAGE_SIZE).
_xmalloc() will call alloc_xenheap_page() if necessary.
Thank you for the suggestion. It is good to know.
+ if ( !call->xen_arg ) { + unpin_guest_ram_addr(call->guest_arg_mfn); + return false; + }
+ memcpy(call->xen_arg, call->guest_arg, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
+ xen_addr = virt_to_maddr(call->xen_arg);
+ set_user_reg(regs, 1, xen_addr >> 32); + set_user_reg(regs, 2, xen_addr & 0xFFFFFFFF);
+ return true; +}
+static bool copy_std_request_back(struct domain_ctx *ctx, + struct cpu_user_regs *regs, + struct std_call_ctx *call) +{ + unsigned int i; + uint32_t attr;
This function might be a bit tricky to implement using access_guest_memory_by_ipa(). However, you can just map the region temporarily. This would avoid using map_domain_page_global() within the code and remove one potential source of error in the cleanup path.
I think I can make all the needed manipulations in the shadowed buffer and then copy it as a whole to the guest's buffer.
+ call->guest_arg->ret = call->xen_arg->ret; + call->guest_arg->ret_origin = call->xen_arg->ret_origin; + call->guest_arg->session = call->xen_arg->session; + for ( i = 0; i < call->xen_arg->num_params; i++ ) {
for ( ... ) {
+ attr = call->xen_arg->params[i].attr;
+ switch ( attr & OPTEE_MSG_ATTR_TYPE_MASK ) {
switch ( ... ) {
+ case OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_TMEM_INOUT: + call->guest_arg->params[i].u.tmem.size = + call->xen_arg->params[i].u.tmem.size; + continue; + case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT: + call->guest_arg->params[i].u.value.a = + call->xen_arg->params[i].u.value.a; + call->guest_arg->params[i].u.value.b = + call->xen_arg->params[i].u.value.b; + continue; + case OPTEE_MSG_ATTR_TYPE_RMEM_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_RMEM_INOUT: + call->guest_arg->params[i].u.rmem.size = + call->xen_arg->params[i].u.rmem.size; + continue; + case OPTEE_MSG_ATTR_TYPE_NONE: + case OPTEE_MSG_ATTR_TYPE_RMEM_INPUT: + case OPTEE_MSG_ATTR_TYPE_TMEM_INPUT: + continue; + } + }
+ return true; +}
+static bool execute_std_call(struct domain_ctx *ctx, + struct cpu_user_regs *regs, + struct std_call_ctx *call) +{ + register_t optee_ret;
+ forward_call(regs);
I find it a bit odd that you introduce a return value for forward_call() in the previous patch, but you barely use it.
You see, the division of the original big patch into smaller ones is a bit arbitrary, so you can spot artifacts like this. I'll try to fix them.
I think I would prefer if the function does not return a value and you just rely on get_user_reg(regs, 0).
+ optee_ret = get_user_reg(regs, 0); + if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) ) + { + call->optee_thread_id = get_user_reg(regs, 3); + call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret); + return true; + }
+ copy_std_request_back(ctx, regs, call);
Here another function returns a value but it is not used. If you don't plan to use the return value, then please drop it. If you plan to use it, then do it everywhere.
The same thing. It is used later in the series, when there will be additional logic.
+ free_std_call_ctx(ctx, call);
+ return true; +}
+static bool handle_std_call(struct domain_ctx *ctx, struct cpu_user_regs *regs) +{ + struct std_call_ctx *call; + bool ret;
+ call = allocate_std_call_ctx(ctx);
+ if (!call) + return false;
+ ret = copy_std_request(regs, call); + if ( !ret ) + goto out;
+ /* Now we can safely examine contents of command buffer */ + if ( OPTEE_MSG_GET_ARG_SIZE(call->xen_arg->num_params) > + OPTEE_MSG_NONCONTIG_PAGE_SIZE ) {
if ( ... ) {
+ ret = false; + goto out; + }
+ switch ( call->xen_arg->cmd ) + { + case OPTEE_MSG_CMD_OPEN_SESSION: + case OPTEE_MSG_CMD_CLOSE_SESSION: + case OPTEE_MSG_CMD_INVOKE_COMMAND: + case OPTEE_MSG_CMD_CANCEL: + case OPTEE_MSG_CMD_REGISTER_SHM: + case OPTEE_MSG_CMD_UNREGISTER_SHM: + ret = true; + break; + default: + ret = false; + }
+ if (!ret) + goto out;
+ ret = execute_std_call(ctx, regs, call);
+out: + if (!ret) + free_std_call_ctx(ctx, call);
+ return ret; +}
+static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs) +{ + struct std_call_ctx *call;
No need for the newline here.
+ int optee_thread_id = get_user_reg(regs, 3);
+ call = find_call_ctx(ctx, optee_thread_id);
+ if ( !call ) + return false;
+ switch ( call->rpc_op ) {
switch ( ... ) {
+ case OPTEE_SMC_RPC_FUNC_ALLOC: + /* TODO: Add handling */ + break; + case OPTEE_SMC_RPC_FUNC_FREE: + /* TODO: Add handling */ + break; + case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR: + break; + case OPTEE_SMC_RPC_FUNC_CMD: + /* TODO: Add handling */ + break; + }
+ return execute_std_call(ctx, regs, call); +}
static bool handle_exchange_capabilities(struct cpu_user_regs *regs) { uint32_t caps; @@ -225,10 +539,9 @@ static bool optee_handle_call(struct cpu_user_regs *regs) case OPTEE_SMC_EXCHANGE_CAPABILITIES: return handle_exchange_capabilities(regs); case OPTEE_SMC_CALL_WITH_ARG: + return handle_std_call(ctx, regs); case OPTEE_SMC_CALL_RETURN_FROM_RPC: - /* TODO: Add proper handling for this calls */ - forward_call(regs); - return true; + return handle_rpc(ctx, regs); default: return false; }
Cheers,
On 10/09/18 18:37, Volodymyr Babchuk wrote:
Hi Julien,
Hi,
On 05.09.18 18:17, Julien Grall wrote:
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
Main way to communicate with OP-TEE is to issue standard SMCCC
NIT: The main way
call. "Standard" is a SMCCC term and it means that call can be interrupted and OP-TEE can return control to NW before completing the call.
In contranst with fast calls, where arguments and return values
NIT: s/contranst/contrast/
are passed in registers, standard calls use shared memory. Register pair r1,r2 holds 64-bit PA of command buffer, where all arguments
Do you mean w1, w2?
Good question. What should the registers be called so that it is valid for both ARMv7 and ARMv8?
Which naming does OP-TEE use? Is it a*?
[...]
+static struct std_call_ctx *allocate_std_call_ctx(struct domain_ctx *ctx) +{ + struct std_call_ctx *call; + int count;
+ count = atomic_add_unless(&ctx->call_ctx_count, 1, MAX_STD_CALLS);
Please a comment explaining the rationale for this.
E.g. "/* Limit number of calls for the current domain */" ?
Yes please.
[...]
+ if ( !call->xen_arg ) { + unpin_guest_ram_addr(call->guest_arg_mfn); + return false; + }
+ memcpy(call->xen_arg, call->guest_arg, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
+ xen_addr = virt_to_maddr(call->xen_arg);
+ set_user_reg(regs, 1, xen_addr >> 32); + set_user_reg(regs, 2, xen_addr & 0xFFFFFFFF);
+ return true; +}
+static bool copy_std_request_back(struct domain_ctx *ctx, + struct cpu_user_regs *regs, + struct std_call_ctx *call) +{ + unsigned int i; + uint32_t attr;
This function might be a bit tricky to implement using access_guest_memory_by_ipa(). However, you can just map the region temporarily. This would avoid using map_domain_page_global() within the code and remove one potential source of error in the cleanup path.
I think I can make all the needed manipulations in the shadowed buffer and then copy it as a whole to the guest's buffer.
That would be quite dangerous, as you may copy back data to the guest that you don't want to. So I would prefer the whitelist solution, as you do below.
+ call->guest_arg->ret = call->xen_arg->ret; + call->guest_arg->ret_origin = call->xen_arg->ret_origin; + call->guest_arg->session = call->xen_arg->session; + for ( i = 0; i < call->xen_arg->num_params; i++ ) {
for ( ... ) {
+ attr = call->xen_arg->params[i].attr;
+ switch ( attr & OPTEE_MSG_ATTR_TYPE_MASK ) {
switch ( ... ) {
+ case OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_TMEM_INOUT: + call->guest_arg->params[i].u.tmem.size = + call->xen_arg->params[i].u.tmem.size; + continue; + case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT: + call->guest_arg->params[i].u.value.a = + call->xen_arg->params[i].u.value.a; + call->guest_arg->params[i].u.value.b = + call->xen_arg->params[i].u.value.b; + continue; + case OPTEE_MSG_ATTR_TYPE_RMEM_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_RMEM_INOUT: + call->guest_arg->params[i].u.rmem.size = + call->xen_arg->params[i].u.rmem.size; + continue; + case OPTEE_MSG_ATTR_TYPE_NONE: + case OPTEE_MSG_ATTR_TYPE_RMEM_INPUT: + case OPTEE_MSG_ATTR_TYPE_TMEM_INPUT: + continue; + } + }
+ return true; +}
+static bool execute_std_call(struct domain_ctx *ctx, + struct cpu_user_regs *regs, + struct std_call_ctx *call) +{ + register_t optee_ret;
+ forward_call(regs);
I find it a bit odd that you introduce a return value for forward_call() in the previous patch, but you barely use it.
You see, the division of the original big patch into smaller ones is a bit arbitrary, so you can spot artifacts like this. I'll try to fix them.
I would recommend thinking about the split from the beginning of your work, not afterwards. If you can't split it easily, then it likely means the code review is going to be difficult to do.
But here the main problem is not really the split. The problem is that you have a function returning a value. That value should *always* be used or checked. If you don't care about the value, then you most likely want a comment on top (and possibly an ASSERT) stating this will never fail.
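As an example of the pattern I am asking for (hypothetical, assuming forward_call() returns the value OP-TEE left in r0/x0):

    uint32_t ret;

    /* Fast calls never return RPC, so this call always completes */
    ret = forward_call(regs);
    ASSERT(!OPTEE_SMC_RETURN_IS_RPC(ret));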
I think I would prefer if the function does not return a value and you just rely on get_user_reg(regs, 0).
+ optee_ret = get_user_reg(regs, 0); + if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) ) + { + call->optee_thread_id = get_user_reg(regs, 3); + call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret); + return true; + }
+ copy_std_request_back(ctx, regs, call);
Here another function returns a value but it is not used. If you don't plan to use the return value, then please drop it. If you plan to use it, then do it everywhere.
The same thing. It is used later in the series, when there will be additional logic.
See above.
Cheers,
On 11.09.18 14:19, Julien Grall wrote:
On 10/09/18 18:37, Volodymyr Babchuk wrote:
Hi Julien,
Hi,
On 05.09.18 18:17, Julien Grall wrote:
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
Main way to communicate with OP-TEE is to issue standard SMCCC
NIT: The main way
call. "Standard" is a SMCCC term and it means that call can be interrupted and OP-TEE can return control to NW before completing the call.
In contranst with fast calls, where arguments and return values
NIT: s/contranst/contrast/
are passed in registers, standard calls use shared memory. Register pair r1,r2 holds 64-bit PA of command buffer, where all arguments
Do you mean w1, w2?
Good question. What should the registers be called so that it is valid for both ARMv7 and ARMv8?
Which naming does OP-TEE use? Is it a*?
Exactly. I've seen the same convention in XEN, in the smccc code. So, a*, then?
[...]
This function might be a bit tricky to implement using access_guest_memory_by_ipa(). However, you can just map the region temporarily. This would avoid using map_domain_page_global() within the code and remove one potential source of error in the cleanup path.
I think I can make all the needed manipulations in the shadowed buffer and then copy it as a whole to the guest's buffer.
That would be quite dangerous, as you may copy back data to the guest that you don't want to. So I would prefer the whitelist solution, as you do below.
Okay, then I can map the needed guest page temporarily.
+ call->guest_arg->ret = call->xen_arg->ret; + call->guest_arg->ret_origin = call->xen_arg->ret_origin; + call->guest_arg->session = call->xen_arg->session; + for ( i = 0; i < call->xen_arg->num_params; i++ ) {
for ( ... ) {
+ attr = call->xen_arg->params[i].attr;
+ switch ( attr & OPTEE_MSG_ATTR_TYPE_MASK ) {
switch ( ... ) {
+ case OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_TMEM_INOUT: + call->guest_arg->params[i].u.tmem.size = + call->xen_arg->params[i].u.tmem.size; + continue; + case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT: + call->guest_arg->params[i].u.value.a = + call->xen_arg->params[i].u.value.a; + call->guest_arg->params[i].u.value.b = + call->xen_arg->params[i].u.value.b; + continue; + case OPTEE_MSG_ATTR_TYPE_RMEM_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_RMEM_INOUT: + call->guest_arg->params[i].u.rmem.size = + call->xen_arg->params[i].u.rmem.size; + continue; + case OPTEE_MSG_ATTR_TYPE_NONE: + case OPTEE_MSG_ATTR_TYPE_RMEM_INPUT: + case OPTEE_MSG_ATTR_TYPE_TMEM_INPUT: + continue; + } + }
+ return true; +}
+static bool execute_std_call(struct domain_ctx *ctx, + struct cpu_user_regs *regs, + struct std_call_ctx *call) +{ + register_t optee_ret;
+ forward_call(regs);
I find it a bit odd that you introduce a return value for forward_call() in the previous patch, but you barely use it.
You see, the division of the original big patch into smaller ones is a bit arbitrary, so you can spot artifacts like this. I'll try to fix them.
I would recommend thinking about the split from the beginning of your work, not afterwards. If you can't split it easily, then it likely means the code review is going to be difficult to do.
Sure, usually I do patches this way. But in this case I developed an entirely new feature, and it took a number of iterations to get it working the way I wanted. Thus, one big patch, which was then split into a series.
But here the main problem is not really the split. The problem is that you have a function returning a value. That value should *always* be used or checked. If you don't care about the value, then you most likely want a comment on top (and possibly an ASSERT) stating this will never fail.
Okay, I got the idea. Thanks.
On 11/09/18 12:31, Volodymyr Babchuk wrote:
On 11.09.18 14:19, Julien Grall wrote:
On 10/09/18 18:37, Volodymyr Babchuk wrote:
Hi Julien,
Hi,
On 05.09.18 18:17, Julien Grall wrote:
Hi,
On 09/03/2018 05:54 PM, Volodymyr Babchuk wrote:
Main way to communicate with OP-TEE is to issue standard SMCCC
NIT: The main way
call. "Standard" is a SMCCC term and it means that call can be interrupted and OP-TEE can return control to NW before completing the call.
In contranst with fast calls, where arguments and return values
NIT: s/contranst/contrast/
are passed in registers, standard calls use shared memory. Register pair r1,r2 holds 64-bit PA of command buffer, where all arguments
Do you mean w1, w2?
Good question. What should the registers be called so that it is valid for both ARMv7 and ARMv8?
Which naming does OP-TEE use? Is it a*?
Exactly. I've seen the same convention in XEN, in the smccc code. So, a*, then?
It would be nice to stay consistent with OP-TEE. So I think a* suits better.
Cheers,
OP-TEE usually uses the same idea with command buffers (see the previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it asks NW to allocate a shared buffer for the other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated until shutdown.
The mediator needs to pin such buffers to make sure that the domain can't transfer them to someone else. They also should be mapped into the XEN address space, because the mediator needs to check responses from guests.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com --- xen/arch/arm/tee/optee.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 119 insertions(+), 2 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 1008eba..6d6b51d 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -21,6 +21,7 @@ #include <asm/tee/optee_smc.h>
#define MAX_STD_CALLS 16 +#define MAX_RPC_SHMS 16
/* * Call context. OP-TEE can issue multiple RPC returns during one call. @@ -35,11 +36,22 @@ struct std_call_ctx { int rpc_op; };
+/* Pre-allocated SHM buffer for RPC commands */ +struct shm_rpc { + struct list_head list; + struct optee_msg_arg *guest_arg; + struct page *guest_page; + mfn_t guest_mfn; + uint64_t cookie; +}; + struct domain_ctx { struct list_head list; struct list_head call_ctx_list; + struct list_head shm_rpc_list; struct domain *domain; atomic_t call_ctx_count; + atomic_t shm_rpc_count; spinlock_t lock; };
@@ -145,8 +157,10 @@ static int optee_enable(struct domain *d)
ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list); + INIT_LIST_HEAD(&ctx->shm_rpc_list);
atomic_set(&ctx->call_ctx_count, 0); + atomic_set(&ctx->shm_rpc_count, 0); spin_lock_init(&ctx->lock);
spin_lock(&domain_ctx_list_lock); @@ -256,11 +270,81 @@ static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id) return NULL; }
+static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t gaddr, + uint64_t cookie) +{ + struct shm_rpc *shm_rpc; + int count; + + count = atomic_add_unless(&ctx->shm_rpc_count, 1, MAX_RPC_SHMS); + if ( count == MAX_RPC_SHMS ) + return NULL; + + shm_rpc = xzalloc(struct shm_rpc); + if ( !shm_rpc ) + goto err; + + shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL); + + if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) ) + goto err; + + shm_rpc->guest_arg = map_domain_page_global(shm_rpc->guest_mfn); + if ( !shm_rpc->guest_arg ) + { + gprintk(XENLOG_INFO, "Could not map domain page\n"); + goto err; + } + shm_rpc->cookie = cookie; + + spin_lock(&ctx->lock); + list_add_tail(&shm_rpc->list, &ctx->shm_rpc_list); + spin_unlock(&ctx->lock); + + return shm_rpc; + +err: + atomic_dec(&ctx->shm_rpc_count); + xfree(shm_rpc); + return NULL; +} + +static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_rpc *shm_rpc; + bool found = false; + + spin_lock(&ctx->lock); + + list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list ) + { + if ( shm_rpc->cookie == cookie ) + { + found = true; + list_del(&shm_rpc->list); + break; + } + } + spin_unlock(&ctx->lock); + + if ( !found ) { + return; + } + + if ( shm_rpc->guest_arg ) { + unpin_guest_ram_addr(shm_rpc->guest_mfn); + unmap_domain_page_global(shm_rpc->guest_arg); + } + + xfree(shm_rpc); +} + static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx; struct std_call_ctx *call, *call_tmp; + struct shm_rpc *shm_rpc, *shm_rpc_tmp; bool found = false;
/* At this time all domain VCPUs should be stopped */ @@ -290,7 +374,11 @@ static void optee_domain_destroy(struct domain *d) list_for_each_entry_safe( call, call_tmp, &ctx->call_ctx_list, list ) free_std_call_ctx(ctx, call);
+ list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list ) + free_shm_rpc(ctx, shm_rpc->cookie); + ASSERT(!atomic_read(&ctx->call_ctx_count)); + ASSERT(!atomic_read(&ctx->shm_rpc_count));
xfree(ctx); } @@ -452,6 +540,32 @@ out: return ret; }
+static void handle_rpc_func_alloc(struct domain_ctx *ctx, + struct cpu_user_regs *regs) +{ + paddr_t ptr = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2); + + if ( ptr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1) ) + gprintk(XENLOG_WARNING, "Domain returned invalid RPC command buffer\n"); + + if ( ptr ) { + uint64_t cookie = get_user_reg(regs, 4) << 32 | get_user_reg(regs, 5); + struct shm_rpc *shm_rpc; + + shm_rpc = allocate_and_map_shm_rpc(ctx, ptr, cookie); + if ( !shm_rpc ) + { + gprintk(XENLOG_WARNING, "Failed to allocate shm_rpc object\n"); + ptr = 0; + } + else + ptr = mfn_to_maddr(shm_rpc->guest_mfn); + + set_user_reg(regs, 1, ptr >> 32); + set_user_reg(regs, 2, ptr & 0xFFFFFFFF); + } +} + static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs) { struct std_call_ctx *call; @@ -465,11 +579,14 @@ static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs)
switch ( call->rpc_op ) { case OPTEE_SMC_RPC_FUNC_ALLOC: - /* TODO: Add handling */ + handle_rpc_func_alloc(ctx, regs); break; case OPTEE_SMC_RPC_FUNC_FREE: - /* TODO: Add handling */ + { + uint64_t cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2); + free_shm_rpc(ctx, cookie); break; + } case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR: break; case OPTEE_SMC_RPC_FUNC_CMD:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE usually uses the same idea with command buffers (see the previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it asks NW to allocate a shared buffer for the other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated until shutdown.
The mediator needs to pin such buffers to make sure that the domain can't transfer them to someone else. They also should be mapped into the XEN address space, because the mediator needs to check responses from guests.
Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa() every time you want to get information from the guest?
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
xen/arch/arm/tee/optee.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 119 insertions(+), 2 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 1008eba..6d6b51d 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -21,6 +21,7 @@ #include <asm/tee/optee_smc.h> #define MAX_STD_CALLS 16 +#define MAX_RPC_SHMS 16 /*
- Call context. OP-TEE can issue multiple RPC returns during one call.
@@ -35,11 +36,22 @@ struct std_call_ctx { int rpc_op; }; +/* Pre-allocated SHM buffer for RPC commands */ +struct shm_rpc {
- struct list_head list;
- struct optee_msg_arg *guest_arg;
- struct page *guest_page;
- mfn_t guest_mfn;
- uint64_t cookie;
+};
- struct domain_ctx { struct list_head list; struct list_head call_ctx_list;
- struct list_head shm_rpc_list; struct domain *domain; atomic_t call_ctx_count;
- atomic_t shm_rpc_count; spinlock_t lock; };
@@ -145,8 +157,10 @@ static int optee_enable(struct domain *d) ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list);
- INIT_LIST_HEAD(&ctx->shm_rpc_list);
atomic_set(&ctx->call_ctx_count, 0);
- atomic_set(&ctx->shm_rpc_count, 0); spin_lock_init(&ctx->lock);
spin_lock(&domain_ctx_list_lock); @@ -256,11 +270,81 @@ static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id) return NULL; } +static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t gaddr,
I would prefer if you pass a gfn instead of the address here.
uint64_t cookie)
NIT: Indentation
+{
- struct shm_rpc *shm_rpc;
- int count;
- count = atomic_add_unless(&ctx->shm_rpc_count, 1, MAX_RPC_SHMS);
- if ( count == MAX_RPC_SHMS )
return NULL;
- shm_rpc = xzalloc(struct shm_rpc);
- if ( !shm_rpc )
goto err;
- shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
- if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) )
goto err;
- shm_rpc->guest_arg = map_domain_page_global(shm_rpc->guest_mfn);
- if ( !shm_rpc->guest_arg )
- {
gprintk(XENLOG_INFO, "Could not map domain page\n");
You don't unpin the guest page if Xen can't map the page.
goto err;
- }
- shm_rpc->cookie = cookie;
- spin_lock(&ctx->lock);
- list_add_tail(&shm_rpc->list, &ctx->shm_rpc_list);
- spin_unlock(&ctx->lock);
- return shm_rpc;
+err:
- atomic_dec(&ctx->shm_rpc_count);
- xfree(shm_rpc);
- return NULL;
+}
+static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) +{
- struct shm_rpc *shm_rpc;
- bool found = false;
- spin_lock(&ctx->lock);
- list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list )
- {
if ( shm_rpc->cookie == cookie )
What guarantees that the cookie will be unique?
{
found = true;
list_del(&shm_rpc->list);
break;
}
- }
- spin_unlock(&ctx->lock);
At this point you have a shm_rpc in hand to free. But what guarantees that no-one else is still using it?
- if ( !found ) {
return;
- }
No need for the {} in a one-liner.
- if ( shm_rpc->guest_arg ) {
Coding style:
if ( ... ) {
unpin_guest_ram_addr(shm_rpc->guest_mfn);
unmap_domain_page_global(shm_rpc->guest_arg);
- }
- xfree(shm_rpc);
+}
- static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx; struct std_call_ctx *call, *call_tmp;
- struct shm_rpc *shm_rpc, *shm_rpc_tmp; bool found = false;
/* At this time all domain VCPUs should be stopped */ @@ -290,7 +374,11 @@ static void optee_domain_destroy(struct domain *d) list_for_each_entry_safe( call, call_tmp, &ctx->call_ctx_list, list ) free_std_call_ctx(ctx, call);
- list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list )
free_shm_rpc(ctx, shm_rpc->cookie);
ASSERT(!atomic_read(&ctx->call_ctx_count));
- ASSERT(!atomic_read(&ctx->shm_rpc_count));
xfree(ctx); } @@ -452,6 +540,32 @@ out: return ret; } +static void handle_rpc_func_alloc(struct domain_ctx *ctx,
struct cpu_user_regs *regs)
+{
- paddr_t ptr = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
- if ( ptr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1) )
gprintk(XENLOG_WARNING, "Domain returned invalid RPC command buffer\n");
- if ( ptr ) {
uint64_t cookie = get_user_reg(regs, 4) << 32 | get_user_reg(regs, 5);
struct shm_rpc *shm_rpc;
shm_rpc = allocate_and_map_shm_rpc(ctx, ptr, cookie);
if ( !shm_rpc )
{
gprintk(XENLOG_WARNING, "Failed to allocate shm_rpc object\n");
ptr = 0;
}
else
ptr = mfn_to_maddr(shm_rpc->guest_mfn);
set_user_reg(regs, 1, ptr >> 32);
set_user_reg(regs, 2, ptr & 0xFFFFFFFF);
- }
+}
- static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs) { struct std_call_ctx *call;
@@ -465,11 +579,14 @@ static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs) switch ( call->rpc_op ) { case OPTEE_SMC_RPC_FUNC_ALLOC:
/* TODO: Add handling */
handle_rpc_func_alloc(ctx, regs); break; case OPTEE_SMC_RPC_FUNC_FREE:
/* TODO: Add handling */
- {
uint64_t cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
Newline here.
free_shm_rpc(ctx, cookie);
It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.
break;
- } case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR: break; case OPTEE_SMC_RPC_FUNC_CMD:
Cheers,
Hi Julien,
On 10.09.18 16:01, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE usually uses the same idea with command buffers (see the previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it asks NW to allocate a shared buffer for the other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated until shutdown.
The mediator needs to pin such buffers to make sure that the domain can't transfer them to someone else. They also should be mapped into the XEN address space, because the mediator needs to check responses from guests.
Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa() every time you want to get information from the guest?
Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep these buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think it would be costly to map/unmap this buffer every time.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
xen/arch/arm/tee/optee.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 119 insertions(+), 2 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 1008eba..6d6b51d 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -21,6 +21,7 @@ #include <asm/tee/optee_smc.h> #define MAX_STD_CALLS 16 +#define MAX_RPC_SHMS 16 /* * Call context. OP-TEE can issue multiple RPC returns during one call. @@ -35,11 +36,22 @@ struct std_call_ctx { int rpc_op; }; +/* Pre-allocated SHM buffer for RPC commands */ +struct shm_rpc { + struct list_head list; + struct optee_msg_arg *guest_arg; + struct page *guest_page; + mfn_t guest_mfn; + uint64_t cookie; +};
struct domain_ctx { struct list_head list; struct list_head call_ctx_list; + struct list_head shm_rpc_list; struct domain *domain; atomic_t call_ctx_count; + atomic_t shm_rpc_count; spinlock_t lock; }; @@ -145,8 +157,10 @@ static int optee_enable(struct domain *d) ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list); + INIT_LIST_HEAD(&ctx->shm_rpc_list); atomic_set(&ctx->call_ctx_count, 0); + atomic_set(&ctx->shm_rpc_count, 0); spin_lock_init(&ctx->lock); spin_lock(&domain_ctx_list_lock); @@ -256,11 +270,81 @@ static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id) return NULL; } +static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t gaddr,
I would prefer if you pass a gfn instead of the address here.
+ uint64_t cookie)
NIT: Indentation
+{ + struct shm_rpc *shm_rpc; + int count;
+ count = atomic_add_unless(&ctx->shm_rpc_count, 1, MAX_RPC_SHMS); + if ( count == MAX_RPC_SHMS ) + return NULL;
+ shm_rpc = xzalloc(struct shm_rpc); + if ( !shm_rpc ) + goto err;
+ shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
+ if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) ) + goto err;
+ shm_rpc->guest_arg = map_domain_page_global(shm_rpc->guest_mfn); + if ( !shm_rpc->guest_arg ) + { + gprintk(XENLOG_INFO, "Could not map domain page\n");
You don't unpin the guest page if Xen can't map the page.
+ goto err; + } + shm_rpc->cookie = cookie;
+ spin_lock(&ctx->lock); + list_add_tail(&shm_rpc->list, &ctx->shm_rpc_list); + spin_unlock(&ctx->lock);
+ return shm_rpc;
+err: + atomic_dec(&ctx->shm_rpc_count); + xfree(shm_rpc); + return NULL; +}
+static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_rpc *shm_rpc; + bool found = false;
+ spin_lock(&ctx->lock);
+ list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list ) + { + if ( shm_rpc->cookie == cookie )
What guarantees that the cookie will be unique?
The Normal World guarantees it. This is part of the protocol.
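That said, if we decide not to trust the guest here, a defensive check in allocate_and_map_shm_rpc() would be easy to add (just a sketch):

    struct shm_rpc *iter;

    /* Reject a cookie that is already in use by this domain */
    spin_lock(&ctx->lock);
    list_for_each_entry( iter, &ctx->shm_rpc_list, list )
    {
        if ( iter->cookie == cookie )
        {
            spin_unlock(&ctx->lock);
            return NULL;
        }
    }
    spin_unlock(&ctx->lock);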
+ { + found = true; + list_del(&shm_rpc->list); + break; + } + } + spin_unlock(&ctx->lock);
At this point you have a shm_rpc in hand to free. But what guarantees that no-one else is still using it?
This is a valid point. I'll revisit this part of the code, thank you. Looks like I need a refcount there.
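Perhaps something like this (a rough sketch with hypothetical helper names, assuming a new "atomic_t refcount" field in struct shm_rpc):

    static struct shm_rpc *shm_rpc_get(struct shm_rpc *shm_rpc)
    {
        atomic_inc(&shm_rpc->refcount);
        return shm_rpc;
    }

    static void shm_rpc_put(struct shm_rpc *shm_rpc)
    {
        /* The last user to drop its reference frees the object */
        if ( !atomic_dec_and_test(&shm_rpc->refcount) )
            return;

        unpin_guest_ram_addr(shm_rpc->guest_mfn);
        unmap_domain_page_global(shm_rpc->guest_arg);
        xfree(shm_rpc);
    }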
+ if ( !found ) { + return; + }
No need for the {} in a one-liner.
+ if ( shm_rpc->guest_arg ) {
Coding style:
if ( ... ) {
+ unpin_guest_ram_addr(shm_rpc->guest_mfn); + unmap_domain_page_global(shm_rpc->guest_arg); + }
+ xfree(shm_rpc); +}
static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx; struct std_call_ctx *call, *call_tmp; + struct shm_rpc *shm_rpc, *shm_rpc_tmp; bool found = false; /* At this time all domain VCPUs should be stopped */ @@ -290,7 +374,11 @@ static void optee_domain_destroy(struct domain *d) list_for_each_entry_safe( call, call_tmp, &ctx->call_ctx_list, list ) free_std_call_ctx(ctx, call); + list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list ) + free_shm_rpc(ctx, shm_rpc->cookie);
ASSERT(!atomic_read(&ctx->call_ctx_count)); + ASSERT(!atomic_read(&ctx->shm_rpc_count)); xfree(ctx); } @@ -452,6 +540,32 @@ out: return ret; } +static void handle_rpc_func_alloc(struct domain_ctx *ctx, + struct cpu_user_regs *regs) +{ + paddr_t ptr = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+ if ( ptr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1) ) + gprintk(XENLOG_WARNING, "Domain returned invalid RPC command buffer\n");
+ if ( ptr ) { + uint64_t cookie = get_user_reg(regs, 4) << 32 | get_user_reg(regs, 5); + struct shm_rpc *shm_rpc;
+ shm_rpc = allocate_and_map_shm_rpc(ctx, ptr, cookie); + if ( !shm_rpc ) + { + gprintk(XENLOG_WARNING, "Failed to allocate shm_rpc object\n"); + ptr = 0; + } + else + ptr = mfn_to_maddr(shm_rpc->guest_mfn);
+ set_user_reg(regs, 1, ptr >> 32); + set_user_reg(regs, 2, ptr & 0xFFFFFFFF); + } +}
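(As an aside: the hi/lo register packing recurs in several of these handlers. A pair of small helpers, with hypothetical names, could avoid repeating the shifts and keep the casts correct on Arm32, where register_t is 32 bits wide:)

    static uint64_t regpair_to_uint64(struct cpu_user_regs *regs, int hi, int lo)
    {
        /* Cast before shifting so this also works when register_t is 32-bit */
        return ((uint64_t)get_user_reg(regs, hi) << 32) |
               (uint32_t)get_user_reg(regs, lo);
    }

    static void uint64_to_regpair(struct cpu_user_regs *regs, int hi, int lo,
                                  uint64_t val)
    {
        set_user_reg(regs, hi, val >> 32);
        set_user_reg(regs, lo, val & 0xFFFFFFFF);
    }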
static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs) { struct std_call_ctx *call; @@ -465,11 +579,14 @@ static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs) switch ( call->rpc_op ) { case OPTEE_SMC_RPC_FUNC_ALLOC: - /* TODO: Add handling */ + handle_rpc_func_alloc(ctx, regs); break; case OPTEE_SMC_RPC_FUNC_FREE: - /* TODO: Add handling */ + { + uint64_t cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
Newline here.
+ free_shm_rpc(ctx, cookie);
It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.
No, it is OP-TEE that asked to free the buffer. This function is called when NW returns from the RPC, so at this moment NW has freed the buffer.
break; + } case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR: break; case OPTEE_SMC_RPC_FUNC_CMD:
Cheers,
On 10/09/18 18:44, Volodymyr Babchuk wrote:
Hi Julien,
On 10.09.18 16:01, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE usually uses the same idea with command buffers (see the previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it requests NW to allocate a shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown.
The mediator needs to pin this buffer (or buffers) to make sure that the domain can't transfer it to someone else. It should also be mapped into XEN address space, because the mediator needs to check responses from guests.
Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa every time you want to get information from the guest?
Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep these buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think it will be costly to map/unmap this buffer every time.
This is a bit misleading... This copy will *only* happen for IRQ during an RPC. What are the chances for that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern.
However, I would like to see a performance comparison here to weigh against the memory impact in Xen (Arm32 has a limited amount of VA available).
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
xen/arch/arm/tee/optee.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 119 insertions(+), 2 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 1008eba..6d6b51d 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -21,6 +21,7 @@ #include <asm/tee/optee_smc.h> #define MAX_STD_CALLS 16 +#define MAX_RPC_SHMS 16 /* * Call context. OP-TEE can issue multiple RPC returns during one call. @@ -35,11 +36,22 @@ struct std_call_ctx { int rpc_op; }; +/* Pre-allocated SHM buffer for RPC commands */ +struct shm_rpc { + struct list_head list; + struct optee_msg_arg *guest_arg; + struct page *guest_page; + mfn_t guest_mfn; + uint64_t cookie; +};
struct domain_ctx { struct list_head list; struct list_head call_ctx_list; + struct list_head shm_rpc_list; struct domain *domain; atomic_t call_ctx_count; + atomic_t shm_rpc_count; spinlock_t lock; }; @@ -145,8 +157,10 @@ static int optee_enable(struct domain *d) ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list); + INIT_LIST_HEAD(&ctx->shm_rpc_list); atomic_set(&ctx->call_ctx_count, 0); + atomic_set(&ctx->shm_rpc_count, 0); spin_lock_init(&ctx->lock); spin_lock(&domain_ctx_list_lock); @@ -256,11 +270,81 @@ static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id) return NULL; } +static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t gaddr,
I would prefer if you pass a gfn instead of the address here.
+ uint64_t cookie)
NIT: Indentation
+{ + struct shm_rpc *shm_rpc; + int count;
+ count = atomic_add_unless(&ctx->shm_rpc_count, 1, MAX_RPC_SHMS); + if ( count == MAX_RPC_SHMS ) + return NULL;
+ shm_rpc = xzalloc(struct shm_rpc); + if ( !shm_rpc ) + goto err;
+ shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
+ if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) ) + goto err;
+ shm_rpc->guest_arg = map_domain_page_global(shm_rpc->guest_mfn); + if ( !shm_rpc->guest_arg ) + { + gprintk(XENLOG_INFO, "Could not map domain page\n");
You don't unpin the guest page if Xen can't map the page.
+ goto err; + } + shm_rpc->cookie = cookie;
+ spin_lock(&ctx->lock); + list_add_tail(&shm_rpc->list, &ctx->shm_rpc_list); + spin_unlock(&ctx->lock);
+ return shm_rpc;
+err: + atomic_dec(&ctx->shm_rpc_count); + xfree(shm_rpc); + return NULL; +}
+static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_rpc *shm_rpc; + bool found = false;
+ spin_lock(&ctx->lock);
+ list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list ) + { + if ( shm_rpc->cookie == cookie )
What guarantees that the cookie will be unique?
Normal World guarantees that. This is part of the protocol.
By NW, do you mean the guest? You should know by now that we should not trust what the guest is doing. If you think it is still fine, then I would like some written explanation of what the impact is of a guest using the same cookie ID twice.
[...]
It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.
No, it is OP-TEE that asked to free the buffer. This function is called when NW returns from the RPC, so at this moment NW has freed the buffer.
But you forward that call to OP-TEE afterwards. So what would OP-TEE do with that?
Looking at that code, I just noticed a potential race condition here. Nothing prevents a guest from calling twice with the same optee_thread_id.
So it would be possible for two vCPUs to call the same command concurrently and free it.
break; + } case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR: break; case OPTEE_SMC_RPC_FUNC_CMD:
Cheers,
Cheers,
Hi Julien,
On 11.09.18 14:53, Julien Grall wrote:
On 10/09/18 18:44, Volodymyr Babchuk wrote:
Hi Julien,
On 10.09.18 16:01, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE usually uses the same idea with command buffers (see the previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it requests NW to allocate a shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown.
The mediator needs to pin this buffer (or buffers) to make sure that the domain can't transfer it to someone else. It should also be mapped into XEN address space, because the mediator needs to check responses from guests.
Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa every time you want to get information from the guest?
Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep these buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think it will be costly to map/unmap this buffer every time.
This is a bit misleading... This copy will *only* happen for IRQ during an RPC. What are the chances for that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern.
Now, this copy will happen for every IRQ while the CPU is in S-EL1/S-EL0 mode. Chances are quite high, I must say. Look: OP-TEE (or a TA) is doing something, like encrypting some buffer, for example. An IRQ fires, OP-TEE immediately executes an RPC return (right from the interrupt handler), so NW can handle the interrupt. Then NW returns control back to OP-TEE, if it wants to.
This is how a long job in OP-TEE can be preempted by the Linux kernel, for example. A timer IRQ ensures that control will be returned to Linux, the scheduler schedules some other task, and OP-TEE patiently waits until its caller is scheduled back so it can resume the work.
However, I would like to see a performance comparison here to weigh against the memory impact in Xen (Arm32 has a limited amount of VA available).
With the current configuration, this is a maximum of 16 pages per guest. As for a performance comparison... this is doable, but will take some time.
[...]
+static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_rpc *shm_rpc; + bool found = false;
+ spin_lock(&ctx->lock);
+ list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list ) + { + if ( shm_rpc->cookie == cookie )
What guarantees that the cookie will be unique?
Normal World guarantees that. This is part of the protocol.
By NW, do you mean the guest? You should know by now that we should not trust what the guest is doing. If you think it is still fine, then I would like some written explanation of what the impact is of a guest using the same cookie ID twice.
Ah, I see your point. Yes, I'll add a check to ensure that a cookie is not reused. Thank you for pointing this out.
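(A minimal sketch of such a check, reusing the existing list walk; the helper name is hypothetical, and allocate_and_map_shm_rpc() would bail out if it returns true:)

    static bool shm_rpc_exists(struct domain_ctx *ctx, uint64_t cookie)
    {
        struct shm_rpc *shm_rpc;
        bool found = false;

        spin_lock(&ctx->lock);
        list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list )
            if ( shm_rpc->cookie == cookie )
            {
                found = true;
                break;
            }
        spin_unlock(&ctx->lock);

        return found;
    }

Note that to be fully race-free, this lookup and the list_add_tail() in allocate_and_map_shm_rpc() would have to happen under a single acquisition of ctx->lock.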
It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.
No, it is OP-TEE that asked to free the buffer. This function is called when NW returns from the RPC, so at this moment NW has freed the buffer.
But you forward that call to OP-TEE afterwards. So what would OP-TEE do with that?
Happily resume the interrupted work. Here is how RPC works:
1. NW client issues STD call (or yielding call in terms of SMCCC)
2. OP-TEE starts its work, but it needs to be interrupted for some reason: an IRQ arrived, it wants to block on a mutex, or it asks NW to do some work (like allocating memory or loading a TA). This is called an "RPC return".
3. OP-TEE suspends the thread and returns from the SMC call with code OPTEE_SMC_RPC_VAL(SOME_CMD) in a0, and some optional parameters in other registers
4. NW sees that this is an RPC, and not a completed STD call, so it does SOME_CMD and issues another SMC with code OPTEE_SMC_CALL_RETURN_FROM_RPC in a0
5. OP-TEE wakes up the suspended thread and continues execution
6. pts 2-5 are repeated until OP-TEE finishes the work
7. It returns from the last SMC call with code OPTEE_SMC_RETURN_SUCCESS or OPTEE_SMC_RETURN_some_error in a0.
8. The optee driver sees that the call from pt. 1 is finished at last and returns control back to the client
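(To illustrate pts 2-5 from the normal world side, a rough sketch of the driver's yielding-call loop; handle_rpc_request() is a placeholder and argument marshalling is omitted:)

    struct arm_smccc_res res;

    arm_smccc_smc(OPTEE_SMC_CALL_WITH_ARG, arg_hi, arg_lo, 0, 0, 0, 0, 0, &res);
    while ( OPTEE_SMC_RETURN_IS_RPC(res.a0) )
    {
        /* Serve the request: allocate/free memory, handle the IRQ, etc. */
        handle_rpc_request(&res);
        /* Resume the suspended OP-TEE thread */
        arm_smccc_smc(OPTEE_SMC_CALL_RETURN_FROM_RPC, res.a1, res.a2, res.a3,
                      0, 0, 0, 0, &res);
    }
    /* res.a0 now holds OPTEE_SMC_RETURN_SUCCESS or an error code */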
Looking at that code, I just noticed a potential race condition here. Nothing prevents a guest from calling twice with the same optee_thread_id.
OP-TEE has an internal check against this.
So it would be possible for two vCPUs to call the same command concurrently and free it.
Maybe you noticed that the mediator uses a shadow buffer to read the cookie id. So it will free the buffer mentioned by OP-TEE. Basically, what happens is:
1. OP-TEE asks "free buffer with cookie X" in an RPC return
2. the guest says "I freed that buffer" in an SMC call
3. the mediator frees the buffer with cookie X on its side
In this particular order.
Hi Volodymyr,
On 09/11/2018 08:30 PM, Volodymyr Babchuk wrote:
On 11.09.18 14:53, Julien Grall wrote:
On 10/09/18 18:44, Volodymyr Babchuk wrote:
On 10.09.18 16:01, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE usually uses the same idea with command buffers (see the previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it requests NW to allocate a shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown.
The mediator needs to pin this buffer (or buffers) to make sure that the domain can't transfer it to someone else. It should also be mapped into XEN address space, because the mediator needs to check responses from guests.
Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa every time you want to get information from the guest?
Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep these buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think it will be costly to map/unmap this buffer every time.
This is a bit misleading... This copy will *only* happen for IRQ during an RPC. What are the chances for that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern.
Now, this copy will happen for every IRQ while the CPU is in S-EL1/S-EL0 mode. Chances are quite high, I must say. Look: OP-TEE (or a TA) is doing something, like encrypting some buffer, for example. An IRQ fires, OP-TEE immediately executes an RPC return (right from the interrupt handler), so NW can handle the interrupt. Then NW returns control back to OP-TEE, if it wants to.
I understand this... But the map/unmap should be negligible in that context.
This is how a long job in OP-TEE can be preempted by the Linux kernel, for example. A timer IRQ ensures that control will be returned to Linux, the scheduler schedules some other task, and OP-TEE patiently waits until its caller is scheduled back so it can resume the work.
However, I would like to see a performance comparison here to weigh against the memory impact in Xen (Arm32 has a limited amount of VA available).
With the current configuration, this is a maximum of 16 pages per guest. As for a performance comparison... this is doable, but will take some time.
Let me write it differently: I will always choose the safe side until this is strictly necessary or the performance benefit has been proven. I might be convinced for just 16 pages, although it feels like a premature optimization...
It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.
No, it is OP-TEE that asked to free the buffer. This function is called when NW returns from the RPC, so at this moment NW has freed the buffer.
But you forward that call to OP-TEE afterwards. So what would OP-TEE do with that?
Happily resume the interrupted work. Here is how RPC works:
1. NW client issues STD call (or yielding call in terms of SMCCC)
2. OP-TEE starts its work, but it needs to be interrupted for some reason: an IRQ arrived, it wants to block on a mutex, or it asks NW to do some work (like allocating memory or loading a TA). This is called an "RPC return".
3. OP-TEE suspends the thread and returns from the SMC call with code OPTEE_SMC_RPC_VAL(SOME_CMD) in a0, and some optional parameters in other registers
4. NW sees that this is an RPC, and not a completed STD call, so it does SOME_CMD and issues another SMC with code OPTEE_SMC_CALL_RETURN_FROM_RPC in a0
5. OP-TEE wakes up the suspended thread and continues execution
6. pts 2-5 are repeated until OP-TEE finishes the work
7. It returns from the last SMC call with code OPTEE_SMC_RETURN_SUCCESS or OPTEE_SMC_RETURN_some_error in a0.
8. The optee driver sees that the call from pt. 1 is finished at last and returns control back to the client
Thank you for the explanation. As I mentioned in another thread, it would be good to have some kind of high-level explanation of all those interactions in the tree. If one already exists, then a pointer in the code.
Looking at that code, I just noticed a potential race condition here. Nothing prevents a guest from calling twice with the same optee_thread_id.
OP-TEE has an internal check against this.
I am not sure how OP-TEE's internal check would help here. The user may know that thread-id 1 exists and call it from 2 vCPUs concurrently.
So handle_rpc will find a context associated with it and use it for execute_std_call. If OP-TEE returns an error (or is done with it), you will end up freeing the same context twice.
Did I miss anything?
So it would be possible for two vCPUs to call the same command concurrently and free it.
Maybe you noticed that the mediator uses a shadow buffer to read the cookie id.
I am not speaking about the cookie id but about the thread_id... However, this statement is wrong in the context we are discussing. Both the thread_id and the cookie are read from the guest registers.
So it will free the buffer mentioned by OP-TEE. Basically, what happens is:
1. OP-TEE asks "free buffer with cookie X" in an RPC return
2. the guest says "I freed that buffer" in an SMC call
3. the mediator frees the buffer with cookie X on its side
In this particular order.
Cheers,
Hi,
On 12.09.18 13:59, Julien Grall wrote:
Hi Volodymyr,
On 09/11/2018 08:30 PM, Volodymyr Babchuk wrote:
On 11.09.18 14:53, Julien Grall wrote:
On 10/09/18 18:44, Volodymyr Babchuk wrote:
On 10.09.18 16:01, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE usually uses the same idea with command buffers (see the previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it requests NW to allocate a shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown.
The mediator needs to pin this buffer (or buffers) to make sure that the domain can't transfer it to someone else. It should also be mapped into XEN address space, because the mediator needs to check responses from guests.
Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa every time you want to get information from the guest?
Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep these buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think it will be costly to map/unmap this buffer every time.
This is a bit misleading... This copy will *only* happen for IRQ during an RPC. What are the chances for that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern.
Now, this copy will happen for every IRQ while the CPU is in S-EL1/S-EL0 mode. Chances are quite high, I must say. Look: OP-TEE (or a TA) is doing something, like encrypting some buffer, for example. An IRQ fires, OP-TEE immediately executes an RPC return (right from the interrupt handler), so NW can handle the interrupt. Then NW returns control back to OP-TEE, if it wants to.
I understand this... But the map/unmap should be negligible in that context.
I thought that map/unmap was quite a costly operation, but I may be wrong there.
This is how a long job in OP-TEE can be preempted by the Linux kernel, for example. A timer IRQ ensures that control will be returned to Linux, the scheduler schedules some other task, and OP-TEE patiently waits until its caller is scheduled back so it can resume the work.
However, I would like to see a performance comparison here to weigh against the memory impact in Xen (Arm32 has a limited amount of VA available).
With the current configuration, this is a maximum of 16 pages per guest. As for a performance comparison... this is doable, but will take some time.
Let me write it differently: I will always choose the safe side until this is strictly necessary or the performance benefit has been proven. I might be convinced for just 16 pages, although it feels like a premature optimization...
Okay, then I'll stick with memory copy helpers for now.
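(For reference, a minimal sketch of what the copy-based approach could look like for reading the RPC buffer header, assuming the access_guest_memory_by_ipa(d, ipa, buf, size, is_write) helper suggested above; error handling is elided:)

    struct optee_msg_arg arg;

    /* Copy only the fixed-size header of the guest's buffer at IPA gaddr */
    if ( access_guest_memory_by_ipa(ctx->domain, gaddr, &arg,
                                    sizeof(arg), false) )
        return false;

    /* arg.cmd, arg.ret and arg.num_params can now be validated safely,
     * without keeping a long-lived global mapping of the guest page. */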
It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.
No, it is OP-TEE that asked to free the buffer. This function is called when NW returns from the RPC, so at this moment NW has freed the buffer.
But you forward that call to OP-TEE afterwards. So what would OP-TEE do with that?
Happily resume the interrupted work. Here is how RPC works:
1. NW client issues STD call (or yielding call in terms of SMCCC)
2. OP-TEE starts its work, but it needs to be interrupted for some reason: an IRQ arrived, it wants to block on a mutex, or it asks NW to do some work (like allocating memory or loading a TA). This is called an "RPC return".
3. OP-TEE suspends the thread and returns from the SMC call with code OPTEE_SMC_RPC_VAL(SOME_CMD) in a0, and some optional parameters in other registers
4. NW sees that this is an RPC, and not a completed STD call, so it does SOME_CMD and issues another SMC with code OPTEE_SMC_CALL_RETURN_FROM_RPC in a0
5. OP-TEE wakes up the suspended thread and continues execution
6. pts 2-5 are repeated until OP-TEE finishes the work
7. It returns from the last SMC call with code OPTEE_SMC_RETURN_SUCCESS or OPTEE_SMC_RETURN_some_error in a0.
8. The optee driver sees that the call from pt. 1 is finished at last and returns control back to the client
Thank you for the explanation. As I mentioned in another thread, it would be good to have some kind of high-level explanation of all those interactions in the tree. If one already exists, then a pointer in the code.
The high level is covered at [1], and the low level is covered in the already mentioned header files. But I don't know of any explanation at the level of detail I gave you above.
Looking at that code, I just noticed a potential race condition here. Nothing prevents a guest from calling twice with the same optee_thread_id.
OP-TEE has an internal check against this.
I am not sure how OP-TEE's internal check would help here. The user may know that thread-id 1 exists and call it from 2 vCPUs concurrently.
So handle_rpc will find a context associated with it and use it for execute_std_call. If OP-TEE returns an error (or is done with it), you will end up freeing the same context twice.
Did I miss anything?
Ah, I see what you mean. Yes, I need to track the call state: some flag to indicate that this particular call is already being processed.
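(A minimal sketch of such a flag; the field name is hypothetical, and it assumes find_call_ctx() is reworked so that the caller holds ctx->lock, since Xen spinlocks are not recursive:)

    /* In struct std_call_ctx: */
    bool in_flight;    /* true while some vCPU is executing this call */

    /* In handle_rpc(), claim the call context under the lock: */
    spin_lock(&ctx->lock);
    call = find_call_ctx(ctx, optee_thread_id);
    if ( call && !call->in_flight )
        call->in_flight = true;    /* claimed: concurrent vCPUs back off */
    else
        call = NULL;
    spin_unlock(&ctx->lock);

    if ( !call )
        return false;

    /* ... run execute_std_call(), then clear in_flight before releasing */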
So it would be possible for two vCPUs to call the same command concurrently and free it.
Maybe you noticed that the mediator uses a shadow buffer to read the cookie id.
I am not speaking about the cookie id but about the thread_id... However, this statement is wrong in the context we are discussing. Both the thread_id and the cookie are read from the guest registers.
The thread_id is indeed read from guest registers. The cookie id should be read from OP-TEE registers; I'll re-check this.
[1] https://github.com/OP-TEE/optee_os/blob/master/documentation/optee_design.md
On 09/12/2018 02:51 PM, Volodymyr Babchuk wrote:
Hi,
Hi,
On 12.09.18 13:59, Julien Grall wrote:
Hi Volodymyr,
On 09/11/2018 08:30 PM, Volodymyr Babchuk wrote:
On 11.09.18 14:53, Julien Grall wrote:
On 10/09/18 18:44, Volodymyr Babchuk wrote:
On 10.09.18 16:01, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE usually uses the same idea with command buffers (see the previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it requests NW to allocate a shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown.
The mediator needs to pin this buffer (or buffers) to make sure that the domain can't transfer it to someone else. It should also be mapped into XEN address space, because the mediator needs to check responses from guests.
Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa every time you want to get information from the guest?
Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep these buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think it will be costly to map/unmap this buffer every time.
This is a bit misleading... This copy will *only* happen for IRQ during an RPC. What are the chances for that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern.
Now, this copy will happen for every IRQ while the CPU is in S-EL1/S-EL0 mode. Chances are quite high, I must say. Look: OP-TEE (or a TA) is doing something, like encrypting some buffer, for example. An IRQ fires, OP-TEE immediately executes an RPC return (right from the interrupt handler), so NW can handle the interrupt. Then NW returns control back to OP-TEE, if it wants to.
I understand this... But the map/unmap should be negligible in that context.
I thought that map/unmap was quite a costly operation, but I may be wrong there.
At the moment, map/unmap is nearly a nop on Arm64 because all the RAM is mapped (I would avoid assuming that though :)). The only cost is going through the p2m to translate the IPA to a PA.
For Arm32, each CPU has its own page-tables and the map/unmap (and TLB flush) will be done locally. I would still expect the impact to be minimal.
Note that today map_domain_page on Arm32 is quite simplistic. It would be possible to optimize it to lower the impact of map/unmap.
[...]
It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.
No, it is OP-TEE that asked to free the buffer. This function is called when NW returns from the RPC, so at this moment NW has freed the buffer.
But you forward that call to OP-TEE afterwards. So what would OP-TEE do with that?
Happily resume the interrupted work. Here is how RPC works:
1. NW client issues STD call (or yielding call in terms of SMCCC)
2. OP-TEE starts its work, but it needs to be interrupted for some reason: an IRQ arrived, it wants to block on a mutex, or it asks NW to do some work (like allocating memory or loading a TA). This is called an "RPC return".
3. OP-TEE suspends the thread and returns from the SMC call with code OPTEE_SMC_RPC_VAL(SOME_CMD) in a0, and some optional parameters in other registers
4. NW sees that this is an RPC, and not a completed STD call, so it does SOME_CMD and issues another SMC with code OPTEE_SMC_CALL_RETURN_FROM_RPC in a0
5. OP-TEE wakes up the suspended thread and continues execution
6. pts 2-5 are repeated until OP-TEE finishes the work
7. It returns from the last SMC call with code OPTEE_SMC_RETURN_SUCCESS or OPTEE_SMC_RETURN_some_error in a0.
8. The optee driver sees that the call from pt. 1 is finished at last and returns control back to the client
Thank you for the explanation. As I mentioned in another thread, it would be good to have some kind of high-level explanation of all those interactions in the tree. If one already exists, then a pointer in the code.
The high level is covered at [1], and the low level is covered in the already mentioned header files.
Could you add those pointers at the top of the OP-TEE file?
But I don't know of any explanation at the level of detail I gave you above.
That's fine. Can you add that in the commit message?
Cheers,
Shared memory is widely used by NW to communicate with TAs in OP-TEE. NW can share part of its own memory with a TA or the OP-TEE core by registering it in OP-TEE, or by providing a temporary reference. Either way, information about such memory buffers is sent to OP-TEE as a list of pages. This mechanism is described in optee_msg.h.
The mediator should step in when NW tries to share memory with OP-TEE, for two reasons:
1. Do address translation from IPA to PA.
2. Pin domain pages till they are mapped into OP-TEE or TA address space, so the domain can't transfer these pages to another domain or baloon them out.
Address translation is done by the translate_noncontig(...) function. It allocates a new buffer from the xenheap and then walks the guest-provided list of pages, translates the addresses and stores the PAs into the newly allocated buffer. This buffer will be provided to OP-TEE instead of the original buffer from the guest. It will be freed at the end of the standard call.
At the same time this function pins the pages and stores them in a struct shm_buf object. This object will live for as long as the given SHM buffer is known to OP-TEE. It will be freed after the guest unregisters the shared buffer; at that time the pages will be unpinned.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com --- xen/arch/arm/tee/optee.c | 245 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 244 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 6d6b51d..8bfcfdc 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -22,6 +22,8 @@
#define MAX_STD_CALLS 16 #define MAX_RPC_SHMS 16 +#define MAX_TOTAL_SMH_BUF_PG 16384 +#define MAX_NONCONTIG_ENTRIES 5
/* * Call context. OP-TEE can issue multiple RPC returns during one call. @@ -31,6 +33,9 @@ struct std_call_ctx { struct list_head list; struct optee_msg_arg *guest_arg; struct optee_msg_arg *xen_arg; + /* Buffer for translated page addresses, shared with OP-TEE */ + void *non_contig[MAX_NONCONTIG_ENTRIES]; + int non_contig_order[MAX_NONCONTIG_ENTRIES]; mfn_t guest_arg_mfn; int optee_thread_id; int rpc_op; @@ -45,13 +50,24 @@ struct shm_rpc { uint64_t cookie; };
+/* Shared memory buffer for arbitrary data */ +struct shm_buf { + struct list_head list; + uint64_t cookie; + int max_page_cnt; + int page_cnt; + struct page_info *pages[]; +}; + struct domain_ctx { struct list_head list; struct list_head call_ctx_list; struct list_head shm_rpc_list; + struct list_head shm_buf_list; struct domain *domain; atomic_t call_ctx_count; atomic_t shm_rpc_count; + atomic_t shm_buf_pages; spinlock_t lock; };
@@ -158,9 +174,12 @@ static int optee_enable(struct domain *d) ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list); INIT_LIST_HEAD(&ctx->shm_rpc_list); + INIT_LIST_HEAD(&ctx->shm_buf_list);
atomic_set(&ctx->call_ctx_count, 0); atomic_set(&ctx->shm_rpc_count, 0); + atomic_set(&ctx->shm_buf_pages, 0); + spin_lock_init(&ctx->lock);
spin_lock(&domain_ctx_list_lock); @@ -339,12 +358,76 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) xfree(shm_rpc); }
+static struct shm_buf *allocate_shm_buf(struct domain_ctx *ctx, + uint64_t cookie, + int pages_cnt) +{ + struct shm_buf *shm_buf; + + while(1) + { + int old = atomic_read(&ctx->shm_buf_pages); + int new = old + pages_cnt; + if ( new >= MAX_TOTAL_SMH_BUF_PG ) + return NULL; + if ( likely(old == atomic_cmpxchg(&ctx->shm_buf_pages, old, new)) ) + break; + } + + shm_buf = xzalloc_bytes(sizeof(struct shm_buf) + + pages_cnt * sizeof(struct page *)); + if ( !shm_buf ) { + atomic_sub(pages_cnt, &ctx->shm_buf_pages); + return NULL; + } + + shm_buf->cookie = cookie; + shm_buf->max_page_cnt = pages_cnt; + + spin_lock(&ctx->lock); + list_add_tail(&shm_buf->list, &ctx->shm_buf_list); + spin_unlock(&ctx->lock); + + return shm_buf; +} + +static void free_shm_buf(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_buf *shm_buf; + bool found = false; + + spin_lock(&ctx->lock); + list_for_each_entry( shm_buf, &ctx->shm_buf_list, list ) + { + if ( shm_buf->cookie == cookie ) + { + found = true; + list_del(&shm_buf->list); + break; + } + } + spin_unlock(&ctx->lock); + + if ( !found ) { + return; + } + + for ( int i = 0; i < shm_buf->page_cnt; i++ ) + if ( shm_buf->pages[i] ) + put_page(shm_buf->pages[i]); + + atomic_sub(shm_buf->max_page_cnt, &ctx->shm_buf_pages); + + xfree(shm_buf); +} + static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx; struct std_call_ctx *call, *call_tmp; struct shm_rpc *shm_rpc, *shm_rpc_tmp; + struct shm_buf *shm_buf, *shm_buf_tmp; bool found = false;
/* At this time all domain VCPUs should be stopped */ @@ -377,12 +460,163 @@ static void optee_domain_destroy(struct domain *d) list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list ) free_shm_rpc(ctx, shm_rpc->cookie);
+ list_for_each_entry_safe( shm_buf, shm_buf_tmp, &ctx->shm_buf_list, list ) + free_shm_buf(ctx, shm_buf->cookie); + ASSERT(!atomic_read(&ctx->call_ctx_count)); ASSERT(!atomic_read(&ctx->shm_rpc_count)); + ASSERT(!atomic_read(&ctx->shm_buf_pages));
xfree(ctx); }
+#define PAGELIST_ENTRIES_PER_PAGE \ + ((OPTEE_MSG_NONCONTIG_PAGE_SIZE / sizeof(u64)) - 1) + +static size_t get_pages_list_size(size_t num_entries) +{ + int pages = DIV_ROUND_UP(num_entries, PAGELIST_ENTRIES_PER_PAGE); + + return pages * OPTEE_MSG_NONCONTIG_PAGE_SIZE; +} + +static bool translate_noncontig(struct domain_ctx *ctx, + struct std_call_ctx *call, + struct optee_msg_param *param, + int idx) +{ + /* + * Refer to OPTEE_MSG_ATTR_NONCONTIG description in optee_msg.h for details. + */ + uint64_t size; + int page_offset; + int num_pages; + int order; + int entries_on_page = 0; + paddr_t gaddr; + mfn_t guest_mfn; + struct { + uint64_t pages_list[PAGELIST_ENTRIES_PER_PAGE]; + uint64_t next_page_data; + } *pages_data_guest, *pages_data_xen, *pages_data_xen_start; + struct shm_buf *shm_buf; + + page_offset = param->u.tmem.buf_ptr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1); + + size = ROUNDUP(param->u.tmem.size + page_offset, + OPTEE_MSG_NONCONTIG_PAGE_SIZE); + + num_pages = DIV_ROUND_UP(size, OPTEE_MSG_NONCONTIG_PAGE_SIZE); + + order = get_order_from_bytes(get_pages_list_size(num_pages)); + + pages_data_xen_start = alloc_xenheap_pages(order, 0); + if ( !pages_data_xen_start ) + return false; + + shm_buf = allocate_shm_buf(ctx, param->u.tmem.shm_ref, num_pages); + if ( !shm_buf ) + goto err_free; + + gaddr = param->u.tmem.buf_ptr & ~(OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1); + guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL); + if ( mfn_eq(guest_mfn, INVALID_MFN) ) + goto err_free; + + pages_data_guest = map_domain_page(guest_mfn); + if ( !pages_data_guest ) + goto err_free; + + pages_data_xen = pages_data_xen_start; + while ( num_pages ) { + struct page_info *page; + mfn_t entry_mfn = lookup_and_pin_guest_ram_addr( + pages_data_guest->pages_list[entries_on_page], &page); + + if ( mfn_eq(entry_mfn, INVALID_MFN) ) + goto err_unmap; + + shm_buf->pages[shm_buf->page_cnt++] = page; + pages_data_xen->pages_list[entries_on_page] = mfn_to_maddr(entry_mfn); + entries_on_page++; + + if ( entries_on_page == PAGELIST_ENTRIES_PER_PAGE ) { + pages_data_xen->next_page_data = virt_to_maddr(pages_data_xen + 1); + pages_data_xen++; + gaddr = pages_data_guest->next_page_data; + + unmap_domain_page(pages_data_guest); + unpin_guest_ram_addr(guest_mfn); + + guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL); + if ( mfn_eq(guest_mfn, INVALID_MFN) ) + goto err_free; + + pages_data_guest = map_domain_page(guest_mfn); + if ( !pages_data_guest ) + goto err_free; + /* Roll over to the next page */ + entries_on_page = 0; + } + num_pages--; + } + + param->u.tmem.buf_ptr = virt_to_maddr(pages_data_xen_start) | page_offset; + + call->non_contig[idx] = pages_data_xen_start; + call->non_contig_order[idx] = order; + + unmap_domain_page(pages_data_guest); + unpin_guest_ram_addr(guest_mfn); + return true; + +err_unmap: + unmap_domain_page(pages_data_guest); + unpin_guest_ram_addr(guest_mfn); + free_shm_buf(ctx, shm_buf->cookie); + +err_free: + free_xenheap_pages(pages_data_xen_start, order); + + return false; +} + +static bool translate_params(struct domain_ctx *ctx, + struct std_call_ctx *call) +{ + unsigned int i; + uint32_t attr; + + for ( i = 0; i < call->xen_arg->num_params; i++ ) { + attr = call->xen_arg->params[i].attr; + + switch ( attr & OPTEE_MSG_ATTR_TYPE_MASK ) { + case OPTEE_MSG_ATTR_TYPE_TMEM_INPUT: + case OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_TMEM_INOUT: + if ( attr & OPTEE_MSG_ATTR_NONCONTIG ) { + if ( !translate_noncontig(ctx, call, + call->xen_arg->params + i, i) ) + return false; + } + 
else { + gprintk(XENLOG_WARNING, "Guest tries to use old tmem arg\n"); + return false; + } + break; + case OPTEE_MSG_ATTR_TYPE_NONE: + case OPTEE_MSG_ATTR_TYPE_VALUE_INPUT: + case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT: + case OPTEE_MSG_ATTR_TYPE_RMEM_INPUT: + case OPTEE_MSG_ATTR_TYPE_RMEM_OUTPUT: + case OPTEE_MSG_ATTR_TYPE_RMEM_INOUT: + continue; + } + } + return true; +} + /* * Copy command buffer into xen memory to: * 1) Hide translated addresses from guest @@ -488,6 +722,15 @@ static bool execute_std_call(struct domain_ctx *ctx,
copy_std_request_back(ctx, regs, call);
+ /* + * If guest successfully unregistered own shared memory, + * then we can unpin it's pages + */ + if ( call->xen_arg->cmd == OPTEE_MSG_CMD_UNREGISTER_SHM && + call->xen_arg->ret == 0 ) { + free_shm_buf(ctx, call->xen_arg->params[0].u.rmem.shm_ref); + } + free_std_call_ctx(ctx, call);
return true; @@ -522,7 +765,7 @@ static bool handle_std_call(struct domain_ctx *ctx, struct cpu_user_regs *regs) case OPTEE_MSG_CMD_CANCEL: case OPTEE_MSG_CMD_REGISTER_SHM: case OPTEE_MSG_CMD_UNREGISTER_SHM: - ret = true; + ret = translate_params(ctx, call); break; default: ret = false;
Hi,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
Shared memory is widely used by NW to communicate with TAs in OP-TEE. NW can share part of its own memory with a TA or the OP-TEE core by registering it in OP-TEE, or by providing a temporary reference. Either way, information about such memory buffers is sent to OP-TEE as a list of pages. This mechanism is described in optee_msg.h.
The mediator should step in when NW tries to share memory with OP-TEE, for two reasons:
- Do address translation from IPA to PA.
- Pin domain pages till they are mapped into OP-TEE or TA address space, so the domain can't transfer these pages to another domain or baloon them out.
s/baloon/balloon/
Address translation is done by the translate_noncontig(...) function. It allocates a new buffer from the xenheap and then walks the guest-provided list of pages, translates the addresses and stores the PAs into the newly allocated buffer. This buffer will be provided to OP-TEE instead of the original buffer from the guest. It will be freed at the end of the standard call.
At the same time this function pins the pages and stores them in a struct shm_buf object. This object will live for as long as the given SHM buffer is known to OP-TEE. It will be freed after the guest unregisters the shared buffer; at that time the pages will be unpinned.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
xen/arch/arm/tee/optee.c | 245 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 244 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 6d6b51d..8bfcfdc 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -22,6 +22,8 @@ #define MAX_STD_CALLS 16 #define MAX_RPC_SHMS 16 +#define MAX_TOTAL_SMH_BUF_PG 16384
So that's 64MB worth of guest memory. Do we expect them to be mapped in Xen? Or just pinned?
+#define MAX_NONCONTIG_ENTRIES 5 /*
- Call context. OP-TEE can issue multiple RPC returns during one call.
@@ -31,6 +33,9 @@ struct std_call_ctx { struct list_head list; struct optee_msg_arg *guest_arg; struct optee_msg_arg *xen_arg;
- /* Buffer for translated page addresses, shared with OP-TEE */
- void *non_contig[MAX_NONCONTIG_ENTRIES];
- int non_contig_order[MAX_NONCONTIG_ENTRIES];
Can you please introduce a structure with the order and mapping?
mfn_t guest_arg_mfn; int optee_thread_id; int rpc_op;
@@ -45,13 +50,24 @@ struct shm_rpc { uint64_t cookie; }; +/* Shared memory buffer for arbitrary data */ +struct shm_buf {
- struct list_head list;
- uint64_t cookie;
- int max_page_cnt;
- int page_cnt;
AFAICT, max_page_cnt and page_cnt should never be negative. If so, then they should be unsigned.
- struct page_info *pages[];
+};
- struct domain_ctx { struct list_head list; struct list_head call_ctx_list; struct list_head shm_rpc_list;
- struct list_head shm_buf_list; struct domain *domain; atomic_t call_ctx_count; atomic_t shm_rpc_count;
- atomic_t shm_buf_pages; spinlock_t lock; };
@@ -158,9 +174,12 @@ static int optee_enable(struct domain *d) ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list); INIT_LIST_HEAD(&ctx->shm_rpc_list);
- INIT_LIST_HEAD(&ctx->shm_buf_list);
atomic_set(&ctx->call_ctx_count, 0); atomic_set(&ctx->shm_rpc_count, 0);
- atomic_set(&ctx->shm_buf_pages, 0);
spin_lock_init(&ctx->lock);
spin_lock(&domain_ctx_list_lock); @@ -339,12 +358,76 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) xfree(shm_rpc); } +static struct shm_buf *allocate_shm_buf(struct domain_ctx *ctx,
uint64_t cookie,
int pages_cnt)
Ditto.
+{
- struct shm_buf *shm_buf;
- while(1)
- {
int old = atomic_read(&ctx->shm_buf_pages);
int new = old + pages_cnt;
if ( new >= MAX_TOTAL_SMH_BUF_PG )
return NULL;
if ( likely(old == atomic_cmpxchg(&ctx->shm_buf_pages, old, new)) )
break;
- }
- shm_buf = xzalloc_bytes(sizeof(struct shm_buf) +
pages_cnt * sizeof(struct page *));
- if ( !shm_buf ) {
Coding style:
if ( ... ) {
atomic_sub(pages_cnt, &ctx->shm_buf_pages);
return NULL;
- }
- shm_buf->cookie = cookie;
- shm_buf->max_page_cnt = pages_cnt;
- spin_lock(&ctx->lock);
- list_add_tail(&shm_buf->list, &ctx->shm_buf_list);
- spin_unlock(&ctx->lock);
- return shm_buf;
+}
+static void free_shm_buf(struct domain_ctx *ctx, uint64_t cookie) +{
- struct shm_buf *shm_buf; > + bool found = false;
- spin_lock(&ctx->lock);
- list_for_each_entry( shm_buf, &ctx->shm_buf_list, list )
- {
if ( shm_buf->cookie == cookie )
What guarantees that the cookie will be unique?
{
found = true;
list_del(&shm_buf->list);
break;
}
- }
- spin_unlock(&ctx->lock);
At this point you have a shm_buf in hand to free. But what guarantees that no one will use it?
- if ( !found ) {
return;
- }
- for ( int i = 0; i < shm_buf->page_cnt; i++ )
Please define int i beforehand.
if ( shm_buf->pages[i] )
put_page(shm_buf->pages[i]);
- atomic_sub(shm_buf->max_page_cnt, &ctx->shm_buf_pages);
- xfree(shm_buf);
+}
- static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx; struct std_call_ctx *call, *call_tmp; struct shm_rpc *shm_rpc, *shm_rpc_tmp;
- struct shm_buf *shm_buf, *shm_buf_tmp; bool found = false;
/* At this time all domain VCPUs should be stopped */ @@ -377,12 +460,163 @@ static void optee_domain_destroy(struct domain *d) list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list ) free_shm_rpc(ctx, shm_rpc->cookie);
- list_for_each_entry_safe( shm_buf, shm_buf_tmp, &ctx->shm_buf_list, list )
free_shm_buf(ctx, shm_buf->cookie);
ASSERT(!atomic_read(&ctx->call_ctx_count)); ASSERT(!atomic_read(&ctx->shm_rpc_count));
- ASSERT(!atomic_read(&ctx->shm_buf_pages));
xfree(ctx); } +#define PAGELIST_ENTRIES_PER_PAGE \
- ((OPTEE_MSG_NONCONTIG_PAGE_SIZE / sizeof(u64)) - 1)
+static size_t get_pages_list_size(size_t num_entries) +{
- int pages = DIV_ROUND_UP(num_entries, PAGELIST_ENTRIES_PER_PAGE);
- return pages * OPTEE_MSG_NONCONTIG_PAGE_SIZE;
+}
+static bool translate_noncontig(struct domain_ctx *ctx,
struct std_call_ctx *call,
struct optee_msg_param *param,
int idx)
Most likely this should be unsigned.
+{
- /*
* Refer to OPTEE_MSG_ATTR_NONCONTIG description in optee_msg.h for details.
*/
- uint64_t size;
- int page_offset;
- int num_pages;
- int order;
- int entries_on_page = 0;
- paddr_t gaddr;
- mfn_t guest_mfn;
I don't like the terminology guest_mfn. This is misleading because of the past usage in Xen. It would be better if you call this just mfn.
- struct {
uint64_t pages_list[PAGELIST_ENTRIES_PER_PAGE];
uint64_t next_page_data;
- } *pages_data_guest, *pages_data_xen, *pages_data_xen_start;
- struct shm_buf *shm_buf;
- page_offset = param->u.tmem.buf_ptr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1);
- size = ROUNDUP(param->u.tmem.size + page_offset,
OPTEE_MSG_NONCONTIG_PAGE_SIZE);
- num_pages = DIV_ROUND_UP(size, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
- order = get_order_from_bytes(get_pages_list_size(num_pages));
- pages_data_xen_start = alloc_xenheap_pages(order, 0);
This could be replaced by _xmalloc, which would avoid allocating more memory than necessary when the order gets bigger.
- if ( !pages_data_xen_start )
return false;
- shm_buf = allocate_shm_buf(ctx, param->u.tmem.shm_ref, num_pages);
In allocate_shm_buf you are now globally limiting the number of pages to pin (16384). However, this does not limit it per call.
With the current limit, you could call lookup_and_pin_guest_ram_addr(...) up to 16384 times. On Arm, for p2m-related operations, we limit ourselves to 512 iterations in one go before checking for preemption. So I think 16384 times is far too much.
- if ( !shm_buf )
goto err_free;
- gaddr = param->u.tmem.buf_ptr & ~(OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1);
- guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
- if ( mfn_eq(guest_mfn, INVALID_MFN) )
goto err_free;
- pages_data_guest = map_domain_page(guest_mfn);
- if ( !pages_data_guest )
goto err_free;
- pages_data_xen = pages_data_xen_start;
- while ( num_pages ) {
struct page_info *page;
mfn_t entry_mfn = lookup_and_pin_guest_ram_addr(
pages_data_guest->pages_list[entries_on_page], &page);
if ( mfn_eq(entry_mfn, INVALID_MFN) )
goto err_unmap;
shm_buf->pages[shm_buf->page_cnt++] = page;
pages_data_xen->pages_list[entries_on_page] = mfn_to_maddr(entry_mfn);
entries_on_page++;
if ( entries_on_page == PAGELIST_ENTRIES_PER_PAGE ) {
pages_data_xen->next_page_data = virt_to_maddr(pages_data_xen + 1);
pages_data_xen++;
gaddr = pages_data_guest->next_page_data;
next_page_data is not a guest address but a machine address. For anything related to addresses, the variable should be named accordingly to avoid confusion.
unmap_domain_page(pages_data_guest);
unpin_guest_ram_addr(guest_mfn);
guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
if ( mfn_eq(guest_mfn, INVALID_MFN) )
goto err_free;
pages_data_guest = map_domain_page(guest_mfn);
if ( !pages_data_guest )
goto err_free;
/* Roll over to the next page */
entries_on_page = 0;
}
num_pages--;
- }
- param->u.tmem.buf_ptr = virt_to_maddr(pages_data_xen_start) | page_offset;
- call->non_contig[idx] = pages_data_xen_start;
- call->non_contig_order[idx] = order;
If you use _xmalloc, then you don't need to store the order. But, who is going to free non_contig?
- unmap_domain_page(pages_data_guest);
- unpin_guest_ram_addr(guest_mfn);
- return true;
+err_unmap:
- unmap_domain_page(pages_data_guest);
- unpin_guest_ram_addr(guest_mfn);
- free_shm_buf(ctx, shm_buf->cookie);
+err_free:
- free_xenheap_pages(pages_data_xen_start, order);
- return false;
+}
+static bool translate_params(struct domain_ctx *ctx,
struct std_call_ctx *call)
+{
- unsigned int i;
- uint32_t attr;
- for ( i = 0; i < call->xen_arg->num_params; i++ ) {
Please pay attention to Xen coding style. I haven't pointed it out everywhere, but I would like all of them to be fixed in the next version.
attr = call->xen_arg->params[i].attr;
switch ( attr & OPTEE_MSG_ATTR_TYPE_MASK ) {
case OPTEE_MSG_ATTR_TYPE_TMEM_INPUT:
case OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT:
case OPTEE_MSG_ATTR_TYPE_TMEM_INOUT:
if ( attr & OPTEE_MSG_ATTR_NONCONTIG ) {
if ( !translate_noncontig(ctx, call,
call->xen_arg->params + i, i) )
return false;
}
else {
gprintk(XENLOG_WARNING, "Guest tries to use old tmem arg\n");
return false;
}
break;
case OPTEE_MSG_ATTR_TYPE_NONE:
case OPTEE_MSG_ATTR_TYPE_VALUE_INPUT:
case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT:
case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT:
case OPTEE_MSG_ATTR_TYPE_RMEM_INPUT:
case OPTEE_MSG_ATTR_TYPE_RMEM_OUTPUT:
case OPTEE_MSG_ATTR_TYPE_RMEM_INOUT:
continue;
}
- }
- return true;
+}
- /*
- Copy command buffer into xen memory to:
- Hide translated addresses from guest
@@ -488,6 +722,15 @@ static bool execute_std_call(struct domain_ctx *ctx, copy_std_request_back(ctx, regs, call);
- /*
* If guest successfully unregistered own shared memory,
* then we can unpin it's pages
*/
- if ( call->xen_arg->cmd == OPTEE_MSG_CMD_UNREGISTER_SHM &&
call->xen_arg->ret == 0 ) {
free_shm_buf(ctx, call->xen_arg->params[0].u.rmem.shm_ref);
- }
free_std_call_ctx(ctx, call);
return true; @@ -522,7 +765,7 @@ static bool handle_std_call(struct domain_ctx *ctx, struct cpu_user_regs *regs) case OPTEE_MSG_CMD_CANCEL: case OPTEE_MSG_CMD_REGISTER_SHM: case OPTEE_MSG_CMD_UNREGISTER_SHM:
ret = true;
ret = translate_params(ctx, call); break; default: ret = false;
Cheers,
Hi,
On 10.09.18 17:02, Julien Grall wrote:
Hi,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
Shared memory is widely used by NW to communicate with TAs in OP-TEE. NW can share part of its own memory with a TA or the OP-TEE core by registering it in OP-TEE, or by providing a temporary reference. Either way, information about such memory buffers is sent to OP-TEE as a list of pages. This mechanism is described in optee_msg.h.
The mediator should step in when NW tries to share memory with OP-TEE, for two reasons:
- Do address translation from IPA to PA.
- Pin domain pages till they are mapped into OP-TEE or TA
address space, so the domain can't transfer these pages to another domain or baloon them out.
s/baloon/balloon/
Address translation is done by the translate_noncontig(...) function. It allocates a new buffer from the xenheap and then walks the guest-provided list of pages, translates the addresses and stores the PAs into the newly allocated buffer. This buffer will be provided to OP-TEE instead of the original buffer from the guest. It will be freed at the end of the standard call.
At the same time this function pins the pages and stores them in a struct shm_buf object. This object will live for as long as the given SHM buffer is known to OP-TEE. It will be freed after the guest unregisters the shared buffer; at that time the pages will be unpinned.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
xen/arch/arm/tee/optee.c | 245 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 244 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 6d6b51d..8bfcfdc 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -22,6 +22,8 @@ #define MAX_STD_CALLS 16 #define MAX_RPC_SHMS 16 +#define MAX_TOTAL_SMH_BUF_PG 16384
So that's 64MB worth of guest memory. Do we expect them to be mapped in Xen? Or just pinned?
Just pinned. We are not interested in the contents of this memory.
+#define MAX_NONCONTIG_ENTRIES 5 /* * Call context. OP-TEE can issue multiple RPC returns during one call. @@ -31,6 +33,9 @@ struct std_call_ctx { struct list_head list; struct optee_msg_arg *guest_arg; struct optee_msg_arg *xen_arg; + /* Buffer for translated page addresses, shared with OP-TEE */ + void *non_contig[MAX_NONCONTIG_ENTRIES]; + int non_contig_order[MAX_NONCONTIG_ENTRIES];
Can you please introduce a structure with the order and mapping?
mfn_t guest_arg_mfn; int optee_thread_id; int rpc_op; @@ -45,13 +50,24 @@ struct shm_rpc { uint64_t cookie; }; +/* Shared memory buffer for arbitrary data */ +struct shm_buf { + struct list_head list; + uint64_t cookie; + int max_page_cnt; + int page_cnt;
AFAICT, max_page_cnt and page_cnt should never be negative. If so, then they should be unsigned.
+ struct page_info *pages[]; +};
struct domain_ctx { struct list_head list; struct list_head call_ctx_list; struct list_head shm_rpc_list; + struct list_head shm_buf_list; struct domain *domain; atomic_t call_ctx_count; atomic_t shm_rpc_count; + atomic_t shm_buf_pages; spinlock_t lock; }; @@ -158,9 +174,12 @@ static int optee_enable(struct domain *d) ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list); INIT_LIST_HEAD(&ctx->shm_rpc_list); + INIT_LIST_HEAD(&ctx->shm_buf_list); atomic_set(&ctx->call_ctx_count, 0); atomic_set(&ctx->shm_rpc_count, 0); + atomic_set(&ctx->shm_buf_pages, 0);
spin_lock_init(&ctx->lock); spin_lock(&domain_ctx_list_lock); @@ -339,12 +358,76 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) xfree(shm_rpc); } +static struct shm_buf *allocate_shm_buf(struct domain_ctx *ctx, + uint64_t cookie, + int pages_cnt)
Ditto.
+{ + struct shm_buf *shm_buf;
+ while(1) + { + int old = atomic_read(&ctx->shm_buf_pages); + int new = old + pages_cnt; + if ( new >= MAX_TOTAL_SMH_BUF_PG ) + return NULL; + if ( likely(old == atomic_cmpxchg(&ctx->shm_buf_pages, old, new)) ) + break; + }
+ shm_buf = xzalloc_bytes(sizeof(struct shm_buf) + + pages_cnt * sizeof(struct page *)); + if ( !shm_buf ) {
Coding style:
if ( ... ) {
+ atomic_sub(pages_cnt, &ctx->shm_buf_pages); + return NULL; + }
+ shm_buf->cookie = cookie; + shm_buf->max_page_cnt = pages_cnt;
+ spin_lock(&ctx->lock); + list_add_tail(&shm_buf->list, &ctx->shm_buf_list); + spin_unlock(&ctx->lock);
+ return shm_buf; +}
+static void free_shm_buf(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_buf *shm_buf; > + bool found = false;
+ spin_lock(&ctx->lock); + list_for_each_entry( shm_buf, &ctx->shm_buf_list, list ) + { + if ( shm_buf->cookie == cookie )
What guarantees that the cookie will be unique?
+ { + found = true; + list_del(&shm_buf->list); + break; + } + } + spin_unlock(&ctx->lock);
At this point you have a shm_buf in hand to free. But what guarantees that no one will use it?
+ if ( !found ) { + return; + }
+ for ( int i = 0; i < shm_buf->page_cnt; i++ )
Please define int i beforehand.
+ if ( shm_buf->pages[i] ) + put_page(shm_buf->pages[i]);
+ atomic_sub(shm_buf->max_page_cnt, &ctx->shm_buf_pages);
+ xfree(shm_buf); +}
static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx; struct std_call_ctx *call, *call_tmp; struct shm_rpc *shm_rpc, *shm_rpc_tmp; + struct shm_buf *shm_buf, *shm_buf_tmp; bool found = false; /* At this time all domain VCPUs should be stopped */ @@ -377,12 +460,163 @@ static void optee_domain_destroy(struct domain *d) list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list ) free_shm_rpc(ctx, shm_rpc->cookie); + list_for_each_entry_safe( shm_buf, shm_buf_tmp, &ctx->shm_buf_list, list ) + free_shm_buf(ctx, shm_buf->cookie);
ASSERT(!atomic_read(&ctx->call_ctx_count)); ASSERT(!atomic_read(&ctx->shm_rpc_count)); + ASSERT(!atomic_read(&ctx->shm_buf_pages)); xfree(ctx); } +#define PAGELIST_ENTRIES_PER_PAGE \ + ((OPTEE_MSG_NONCONTIG_PAGE_SIZE / sizeof(u64)) - 1)
+static size_t get_pages_list_size(size_t num_entries) +{ + int pages = DIV_ROUND_UP(num_entries, PAGELIST_ENTRIES_PER_PAGE);
+ return pages * OPTEE_MSG_NONCONTIG_PAGE_SIZE; +}
+static bool translate_noncontig(struct domain_ctx *ctx, + struct std_call_ctx *call, + struct optee_msg_param *param, + int idx)
Most likely this should be unsigned.
+{ + /* + * Refer to OPTEE_MSG_ATTR_NONCONTIG description in optee_msg.h for details. + */ + uint64_t size; + int page_offset; + int num_pages; + int order; + int entries_on_page = 0; + paddr_t gaddr; + mfn_t guest_mfn;
I don't like the terminology guest_mfn. This is misleading because of the past usage in Xen. It would be better if you call this just mfn.
Okay, will do.
+ struct { + uint64_t pages_list[PAGELIST_ENTRIES_PER_PAGE]; + uint64_t next_page_data; + } *pages_data_guest, *pages_data_xen, *pages_data_xen_start; + struct shm_buf *shm_buf;
+ page_offset = param->u.tmem.buf_ptr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1);
+ size = ROUNDUP(param->u.tmem.size + page_offset, + OPTEE_MSG_NONCONTIG_PAGE_SIZE);
+ num_pages = DIV_ROUND_UP(size, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
+ order = get_order_from_bytes(get_pages_list_size(num_pages));
+ pages_data_xen_start = alloc_xenheap_pages(order, 0);
This could be replaced by _xmalloc, which would avoid allocating more memory than necessary when the order gets bigger.
Thanks. Would it allocate a page-aligned buffer? That is crucial in this case. I can't find any documentation on it, so I don't know what alignment it guarantees.
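(For reference: judging by xen/include/xen/xmalloc.h, _xmalloc() takes an explicit alignment argument, and xmalloc_bytes(n) is just _xmalloc(n, SMP_CACHE_BYTES), so page alignment can be requested directly. A sketch of the replacement:)

    /* Request OPTEE_MSG_NONCONTIG_PAGE_SIZE alignment explicitly */
    pages_data_xen_start = _xmalloc(get_pages_list_size(num_pages),
                                    OPTEE_MSG_NONCONTIG_PAGE_SIZE);
    if ( !pages_data_xen_start )
        return false;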
+ if ( !pages_data_xen_start ) + return false;
+ shm_buf = allocate_shm_buf(ctx, param->u.tmem.shm_ref, num_pages);
In allocate_shm_buf you are now globally limiting the number of pages to pin (16384). However, this does not limit it per call.
With the current limit, you could call lookup_and_pin_guest_ram_addr(...) up to 16384 times. On Arm, for p2m-related operations, we limit ourselves to 512 iterations in one go before checking for preemption. So I think 16384 times is far too much.
So, in other words, I can translate only a 2MB buffer (if 4KB pages are used), right? I think it will be okay to implement such a limitation for this initial version of the mediator. In the future, it would be possible to do an RPC return from XEN (as OP-TEE does) to finish this request later.
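(A sketch of such a per-call cap, checked in translate_noncontig() before any pinning happens; the constant and its value are hypothetical, sized to match the 512-iteration guideline mentioned above:)

    /* Do not pin more pages than one preemption-free p2m walk allows */
    #define MAX_SHM_BUF_PG_PER_CALL 512

    num_pages = DIV_ROUND_UP(size, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
    if ( num_pages > MAX_SHM_BUF_PG_PER_CALL )
        return false;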
+    if ( !shm_buf )
+        goto err_free;

+    gaddr = param->u.tmem.buf_ptr & ~(OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1);
+    guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
+    if ( mfn_eq(guest_mfn, INVALID_MFN) )
+        goto err_free;

+    pages_data_guest = map_domain_page(guest_mfn);
+    if ( !pages_data_guest )
+        goto err_free;

+    pages_data_xen = pages_data_xen_start;
+    while ( num_pages ) {
+        struct page_info *page;
+        mfn_t entry_mfn = lookup_and_pin_guest_ram_addr(
+            pages_data_guest->pages_list[entries_on_page], &page);

+        if ( mfn_eq(entry_mfn, INVALID_MFN) )
+            goto err_unmap;

+        shm_buf->pages[shm_buf->page_cnt++] = page;
+        pages_data_xen->pages_list[entries_on_page] = mfn_to_maddr(entry_mfn);
+        entries_on_page++;

+        if ( entries_on_page == PAGELIST_ENTRIES_PER_PAGE ) {
+            pages_data_xen->next_page_data = virt_to_maddr(pages_data_xen + 1);
+            pages_data_xen++;
+            gaddr = pages_data_guest->next_page_data;
next_page_data is not a guest address but a machine address. For anything related to address, the variable should be named accordingly to avoid confusion.
Why? In this case it is IPA that comes from the guest.
+            unmap_domain_page(pages_data_guest);
+            unpin_guest_ram_addr(guest_mfn);

+            guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
+            if ( mfn_eq(guest_mfn, INVALID_MFN) )
+                goto err_free;

+            pages_data_guest = map_domain_page(guest_mfn);
+            if ( !pages_data_guest )
+                goto err_free;
+            /* Roll over to the next page */
+            entries_on_page = 0;
+        }
+        num_pages--;
+    }

+    param->u.tmem.buf_ptr = virt_to_maddr(pages_data_xen_start) | page_offset;

+    call->non_contig[idx] = pages_data_xen_start;
+    call->non_contig_order[idx] = order;
If you use _xmalloc, then you don't need to store the order. But who is going to free non_contig?
Thank you for pointing this out. It should be done in free_std_call_ctx(), but it seems I missed this part of the code when splitting the big patch into smaller ones.
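[For illustration, a minimal sketch of what that missing cleanup could look like in free_std_call_ctx(), assuming the non_contig/non_contig_order fields and the MAX_NONCONTIG_ENTRIES bound introduced elsewhere in this series:

static void free_std_call_ctx(struct domain_ctx *ctx, struct std_call_ctx *call)
{
    int i;

    /* ... existing teardown of the call context ... */

    /* Release any page lists built by translate_noncontig() */
    for ( i = 0; i < MAX_NONCONTIG_ENTRIES; i++ )
    {
        if ( call->non_contig[i] )
        {
            free_xenheap_pages(call->non_contig[i], call->non_contig_order[i]);
            call->non_contig[i] = NULL;
        }
    }
}
]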
+    unmap_domain_page(pages_data_guest);
+    unpin_guest_ram_addr(guest_mfn);
+    return true;

+err_unmap:
+    unmap_domain_page(pages_data_guest);
+    unpin_guest_ram_addr(guest_mfn);
+    free_shm_buf(ctx, shm_buf->cookie);

+err_free:
+    free_xenheap_pages(pages_data_xen_start, order);

+    return false;
+}

+static bool translate_params(struct domain_ctx *ctx,
+                             struct std_call_ctx *call)
+{
+    unsigned int i;
+    uint32_t attr;

+    for ( i = 0; i < call->xen_arg->num_params; i++ ) {
Please pay attention to the Xen coding style. I haven't pointed it out everywhere, but I would like all of them to be fixed in the next version.
Yes, I'm sorry for that. I simultaneously work on different projects and sometimes it is hard to keep track of coding styles. I'll fix all such problems.
+        attr = call->xen_arg->params[i].attr;

+        switch ( attr & OPTEE_MSG_ATTR_TYPE_MASK ) {
+        case OPTEE_MSG_ATTR_TYPE_TMEM_INPUT:
+        case OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT:
+        case OPTEE_MSG_ATTR_TYPE_TMEM_INOUT:
+            if ( attr & OPTEE_MSG_ATTR_NONCONTIG ) {
+                if ( !translate_noncontig(ctx, call,
+                                          call->xen_arg->params + i, i) )
+                    return false;
+            }
+            else {
+                gprintk(XENLOG_WARNING, "Guest tries to use old tmem arg\n");
+                return false;
+            }
+            break;
+        case OPTEE_MSG_ATTR_TYPE_NONE:
+        case OPTEE_MSG_ATTR_TYPE_VALUE_INPUT:
+        case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT:
+        case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT:
+        case OPTEE_MSG_ATTR_TYPE_RMEM_INPUT:
+        case OPTEE_MSG_ATTR_TYPE_RMEM_OUTPUT:
+        case OPTEE_MSG_ATTR_TYPE_RMEM_INOUT:
+            continue;
+        }
+    }
+    return true;
+}

 /*
  * Copy command buffer into xen memory to:
  * 1) Hide translated addresses from guest
@@ -488,6 +722,15 @@ static bool execute_std_call(struct domain_ctx *ctx,
     copy_std_request_back(ctx, regs, call);

+    /*
+     * If the guest successfully unregistered its shared memory,
+     * then we can unpin its pages.
+     */
+    if ( call->xen_arg->cmd == OPTEE_MSG_CMD_UNREGISTER_SHM &&
+         call->xen_arg->ret == 0 ) {
+        free_shm_buf(ctx, call->xen_arg->params[0].u.rmem.shm_ref);
+    }

     free_std_call_ctx(ctx, call);
     return true;
@@ -522,7 +765,7 @@ static bool handle_std_call(struct domain_ctx *ctx, struct cpu_user_regs *regs)
     case OPTEE_MSG_CMD_CANCEL:
     case OPTEE_MSG_CMD_REGISTER_SHM:
     case OPTEE_MSG_CMD_UNREGISTER_SHM:
-        ret = true;
+        ret = translate_params(ctx, call);
         break;
     default:
         ret = false;
Cheers,
Hi Volodymyr,
On 10/09/18 19:04, Volodymyr Babchuk wrote:
On 10.09.18 17:02, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
+    struct {
+        uint64_t pages_list[PAGELIST_ENTRIES_PER_PAGE];
+        uint64_t next_page_data;
+    } *pages_data_guest, *pages_data_xen, *pages_data_xen_start;
+    struct shm_buf *shm_buf;

+    page_offset = param->u.tmem.buf_ptr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1);

+    size = ROUNDUP(param->u.tmem.size + page_offset,
+                   OPTEE_MSG_NONCONTIG_PAGE_SIZE);

+    num_pages = DIV_ROUND_UP(size, OPTEE_MSG_NONCONTIG_PAGE_SIZE);

+    order = get_order_from_bytes(get_pages_list_size(num_pages));

+    pages_data_xen_start = alloc_xenheap_pages(order, 0);
This could be replaced by a _xmalloc and would avoid allocating more memory than necessary when the order is getting bigger.
Thanks. Would it allocate a page-aligned buffer? This is crucial in this case. I can't find any documentation on it, so I don't know which alignment it guarantees.
_xmalloc takes as an argument the alignment required for the allocation.
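[As a sketch of the suggestion, assuming Xen's _xmalloc(size, align) interface, the allocation could look like this:

    /* Allocate only as much as the page list actually needs, aligned to
     * the OP-TEE non-contiguous page size as the list format requires. */
    pages_data_xen_start = _xmalloc(get_pages_list_size(num_pages),
                                    OPTEE_MSG_NONCONTIG_PAGE_SIZE);
    if ( !pages_data_xen_start )
        return false;
]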
+    if ( !pages_data_xen_start )
+        return false;

+    shm_buf = allocate_shm_buf(ctx, param->u.tmem.shm_ref, num_pages);
In allocate_shm_buf you are now globally limiting the number of pages (16384) to pin. However, this does not limit per call.
With the current limit, you could call lookup_and_pin_guest_ram_addr(...) up to 16384 times. On Arm, for p2m related operations, we limit to 512 iterations in one go before checking for preemption. So I think 16384 times is far too much.
So, in other words, I can translate only a 2MB buffer (if 4KB pages are used), is that right?
2MB for the whole command. So if you have 5 buffers in the command, then the sum of the buffers should not be bigger than 2MB.
However, 2MB might be too big considering that you also need to account for the SMC call. Can buffers be passed for fast calls?
I think it will be okay to implement such a limitation for this initial version of the mediator. In the future, it would be possible to do an RPC return from XEN (as OP-TEE does) to finish the request later.
+    if ( !shm_buf )
+        goto err_free;

+    gaddr = param->u.tmem.buf_ptr & ~(OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1);
+    guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
+    if ( mfn_eq(guest_mfn, INVALID_MFN) )
+        goto err_free;

+    pages_data_guest = map_domain_page(guest_mfn);
+    if ( !pages_data_guest )
+        goto err_free;

+    pages_data_xen = pages_data_xen_start;
+    while ( num_pages ) {
+        struct page_info *page;
+        mfn_t entry_mfn = lookup_and_pin_guest_ram_addr(
+            pages_data_guest->pages_list[entries_on_page], &page);

+        if ( mfn_eq(entry_mfn, INVALID_MFN) )
+            goto err_unmap;

+        shm_buf->pages[shm_buf->page_cnt++] = page;
+        pages_data_xen->pages_list[entries_on_page] = mfn_to_maddr(entry_mfn);
+        entries_on_page++;

+        if ( entries_on_page == PAGELIST_ENTRIES_PER_PAGE ) {
+            pages_data_xen->next_page_data = virt_to_maddr(pages_data_xen + 1);
+            pages_data_xen++;
+            gaddr = pages_data_guest->next_page_data;
next_page_data is not a guest address but a machine address. For anything related to address, the variable should be named accordingly to avoid confusion.
Why? In this case it is IPA that comes from the guest.
Because I misread the variables.
+    unmap_domain_page(pages_data_guest);
+    unpin_guest_ram_addr(guest_mfn);
+    return true;

+err_unmap:
+    unmap_domain_page(pages_data_guest);
+    unpin_guest_ram_addr(guest_mfn);
+    free_shm_buf(ctx, shm_buf->cookie);

+err_free:
+    free_xenheap_pages(pages_data_xen_start, order);

+    return false;
+}
+static bool translate_params(struct domain_ctx *ctx,
+                             struct std_call_ctx *call)
+{
+    unsigned int i;
+    uint32_t attr;

+    for ( i = 0; i < call->xen_arg->num_params; i++ ) {
Please pay attention to the Xen coding style. I haven't pointed it out everywhere, but I would like all of them to be fixed in the next version.
Yes, I'm sorry for that. I simultaneously work on different projects and sometimes it is hard to keep track of coding styles. I'll fix all such problems.
IIRC your team is working on the checkpatch. It might be worth using it to see how it performs on your series.
Cheers,
Hi Julien,
On 11.09.18 16:37, Julien Grall wrote:
Hi Volodymyr,
On 10/09/18 19:04, Volodymyr Babchuk wrote:
On 10.09.18 17:02, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
[...]
+    if ( !pages_data_xen_start )
+        return false;

+    shm_buf = allocate_shm_buf(ctx, param->u.tmem.shm_ref, num_pages);
In allocate_shm_buf you are now globally limiting the number of pages (16384) to pin. However, this does not limit per call.
With the current limit, you could call lookup_and_pin_guest_ram_addr(...) up to 16384 times. On Arm, for p2m related operations, we limit to 512 iterations in one go before checking for preemption. So I think 16384 times is far too much.
So, in other words, I can translate only a 2MB buffer (if 4KB pages are used), is that right?
2MB for the whole command. So if you have 5 buffers in the command, then the sum of the buffers should not be bigger than 2MB.
4 buffers, but yes, it can be up to 8MB. Okay, I'll add a per-call counter to limit memory usage for a whole call.
However, 2MB might be too big considering that you also need to account for the SMC call. Can buffers be passed for fast calls?
No, all such calls are yielding calls, so you can ignore the time used for the SMC call itself.
On 09/11/2018 08:33 PM, Volodymyr Babchuk wrote:
Hi Julien,
Hi,
On 11.09.18 16:37, Julien Grall wrote:
Hi Volodymyr,
On 10/09/18 19:04, Volodymyr Babchuk wrote:
On 10.09.18 17:02, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
[...]
+    if ( !pages_data_xen_start )
+        return false;

+    shm_buf = allocate_shm_buf(ctx, param->u.tmem.shm_ref, num_pages);
In allocate_shm_buf you are now globally limiting the number of pages (16384) to pin. However, this does not limit per call.
With the current limit, you could call lookup_and_pin_guest_ram_addr(...) up to 16384 times. On Arm, for p2m related operations, we limit to 512 iterations in one go before checking for preemption. So I think 16384 times is far too much.
So, in other words, I can translate only a 2MB buffer (if 4KB pages are used), is that right?
2MB for the whole command. So if you have 5 buffers in the command, then the sum of the buffers should not be bigger than 2MB.
4 buffers, but yes, it can be up to 8MB. Okay, I'll add a per-call counter to limit memory usage for a whole call.
That would need to be reduced to 2MB in total per call. You probably want to look at max_order(...).
However, 2MB might be too big considering that you also need to account for the SMC call. Can buffers be passed for fast calls?
No, all such calls are yielding calls, so you can ignore the time used for the SMC call itself.
How come you can ignore it? It has a cost to trap to EL3.
Cheers,
Hi,
On 12.09.18 14:02, Julien Grall wrote:
On 09/11/2018 08:33 PM, Volodymyr Babchuk wrote:
Hi Julien,
Hi,
On 11.09.18 16:37, Julien Grall wrote:
Hi Volodymyr,
On 10/09/18 19:04, Volodymyr Babchuk wrote:
On 10.09.18 17:02, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
[...]
+    if ( !pages_data_xen_start )
+        return false;

+    shm_buf = allocate_shm_buf(ctx, param->u.tmem.shm_ref, num_pages);
In allocate_shm_buf you are now globally limiting the number of pages (16384) to pin. However, this does not limit per call.
With the current limit, you could call lookup_and_pin_guest_ram_addr(...) up to 16384 times. On Arm, for p2m related operations, we limit to 512 iterations in one go before checking for preemption. So I think 16384 times is far too much.
So, in other words, I can translate only a 2MB buffer (if 4KB pages are used), is that right?
2MB for the whole command. So if you have 5 buffers in the command, then the sum of the buffers should not be bigger than 2MB.
4 buffers, but yes, it can be up to 8MB. Okay, I'll add a per-call counter to limit memory usage for a whole call.
That would need to be reduced to 2MB in total per call. You probably want to look at max_order(...).
Yes, this is what I was saying. 512 pages per call.
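[A sketch of what that per-call accounting might look like; MAX_PAGES_PER_CALL and the pages_cnt field are hypothetical names, not part of the posted series:

#define MAX_PAGES_PER_CALL 512 /* 2MB with 4KB pages, matching Xen's p2m preemption limit */

static bool account_pages_for_call(struct std_call_ctx *call, int num_pages)
{
    /* Refuse the translation if this call would pin too many guest pages */
    if ( call->pages_cnt + num_pages > MAX_PAGES_PER_CALL )
        return false;

    call->pages_cnt += num_pages;
    return true;
}
]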
However, 2MB might be too big considering that you also need to account for the SMC call. Can buffers be passed for fast calls?
No, all such calls are yielding calls, so you can ignore the time used for the SMC call itself.
How come you can ignore it? It has a cost to trap to EL3.
Strictly speaking, yes. All steps have a cost: the trap to EL3, dispatch in EL3, the switch to S-EL1, new thread preparation in OP-TEE, context switching in OP-TEE. I meant that, in my opinion, this is negligible in comparison with the actual call processing. But maybe I'm wrong there.
On 09/12/2018 01:45 PM, Volodymyr Babchuk wrote:
Hi,
Hi Volodymyr,
On 12.09.18 14:02, Julien Grall wrote:
On 09/11/2018 08:33 PM, Volodymyr Babchuk wrote:
However, 2MB might be too big considering that you also need to account for the SMC call. Can buffers be passed for fast calls?
No, all such calls are yielding calls, so you can ignore the time used for the SMC call itself.
How come you can ignore it? It has a cost to trap to EL3.
Strictly speaking, yes. All steps have a cost: the trap to EL3, dispatch in EL3, the switch to S-EL1, new thread preparation in OP-TEE, context switching in OP-TEE. I meant that, in my opinion, this is negligible in comparison with the actual call processing. But maybe I'm wrong there.
It would be interesting to have a breakdown on the time spent in the call (with virtualization). Did you have a chance to do that?
Cheers,
OP-TEE can issue multiple RPC requests. We are interested mostly in request that asks NW to allocate/free shared memory for OP-TEE needs, becuase mediator need to do address translation in the same way as it was done for shared buffers registered by NW.
As mediator now accesses shared command buffer, we need to shadow it in the same way, as we shadow request buffers for STD calls.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
 xen/arch/arm/tee/optee.c | 137 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 126 insertions(+), 11 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 8bfcfdc..b2d795e 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -45,6 +45,7 @@ struct std_call_ctx {
 struct shm_rpc {
     struct list_head list;
     struct optee_msg_arg *guest_arg;
+    struct optee_msg_arg *xen_arg;
     struct page *guest_page;
     mfn_t guest_mfn;
     uint64_t cookie;
@@ -303,6 +304,10 @@ static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t
     if ( !shm_rpc )
         goto err;

+    shm_rpc->xen_arg = alloc_xenheap_page();
+    if ( !shm_rpc->xen_arg )
+        goto err;
+
     shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);

     if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) )
@@ -324,6 +329,10 @@ static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t
 err:
     atomic_dec(&ctx->shm_rpc_count);
+
+    if ( shm_rpc->xen_arg )
+        free_xenheap_page(shm_rpc->xen_arg);
+
     xfree(shm_rpc);
     return NULL;
 }
@@ -346,9 +355,10 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
     }
     spin_unlock(&ctx->lock);

-    if ( !found ) {
+    if ( !found )
         return;
-    }
+
+    free_xenheap_page(shm_rpc->xen_arg);

     if ( shm_rpc->guest_arg ) {
         unpin_guest_ram_addr(shm_rpc->guest_mfn);
@@ -358,6 +368,24 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
     xfree(shm_rpc);
 }

+static struct shm_rpc *find_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
+{
+    struct shm_rpc *shm_rpc;
+
+    spin_lock(&ctx->lock);
+    list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list )
+    {
+        if ( shm_rpc->cookie == cookie )
+        {
+            spin_unlock(&ctx->lock);
+            return shm_rpc;
+        }
+    }
+    spin_unlock(&ctx->lock);
+
+    return NULL;
+}
+
 static struct shm_buf *allocate_shm_buf(struct domain_ctx *ctx,
                                         uint64_t cookie,
                                         int pages_cnt)
@@ -704,6 +732,28 @@ static bool copy_std_request_back(struct domain_ctx *ctx,
     return true;
 }

+static void handle_rpc_return(struct domain_ctx *ctx,
+                              struct cpu_user_regs *regs,
+                              struct std_call_ctx *call)
+{
+    call->optee_thread_id = get_user_reg(regs, 3);
+    call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(get_user_reg(regs, 0));
+
+    if ( call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD )
+    {
+        /* Copy RPC request from shadowed buffer to guest */
+        uint64_t cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+        struct shm_rpc *shm_rpc = find_shm_rpc(ctx, cookie);
+        if ( !shm_rpc )
+        {
+            gprintk(XENLOG_ERR, "Can't find SHM-RPC with cookie %lx\n", cookie);
+            return;
+        }
+        memcpy(shm_rpc->guest_arg, shm_rpc->xen_arg,
+               OPTEE_MSG_GET_ARG_SIZE(shm_rpc->xen_arg->num_params));
+    }
+}
+
 static bool execute_std_call(struct domain_ctx *ctx,
                              struct cpu_user_regs *regs,
                              struct std_call_ctx *call)
@@ -715,8 +765,7 @@ static bool execute_std_call(struct domain_ctx *ctx,
     optee_ret = get_user_reg(regs, 0);
     if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) )
     {
-        call->optee_thread_id = get_user_reg(regs, 3);
-        call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret);
+        handle_rpc_return(ctx, regs, call);
         return true;
     }
@@ -783,6 +832,74 @@ out:
     return ret;
 }

+static void handle_rpc_cmd_alloc(struct domain_ctx *ctx,
+                                 struct cpu_user_regs *regs,
+                                 struct std_call_ctx *call,
+                                 struct shm_rpc *shm_rpc)
+{
+    if ( shm_rpc->xen_arg->params[0].attr != (OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT |
+                                              OPTEE_MSG_ATTR_NONCONTIG) )
+    {
+        gprintk(XENLOG_WARNING, "Invalid attrs for shared mem buffer\n");
+        return;
+    }
+
+    /* Last entry in non_contig array is used to hold RPC-allocated buffer */
+    if ( call->non_contig[MAX_NONCONTIG_ENTRIES - 1] )
+    {
+        free_xenheap_pages(call->non_contig[MAX_NONCONTIG_ENTRIES - 1],
+                           call->non_contig_order[MAX_NONCONTIG_ENTRIES - 1]);
+        call->non_contig[MAX_NONCONTIG_ENTRIES - 1] = NULL;
+    }
+    translate_noncontig(ctx, call, shm_rpc->xen_arg->params + 0,
+                        MAX_NONCONTIG_ENTRIES - 1);
+}
+
+static void handle_rpc_cmd(struct domain_ctx *ctx, struct cpu_user_regs *regs,
+                           struct std_call_ctx *call)
+{
+    struct shm_rpc *shm_rpc;
+    uint64_t cookie;
+    int num_params;
+
+    cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+
+    shm_rpc = find_shm_rpc(ctx, cookie);
+
+    if ( !shm_rpc )
+    {
+        gprintk(XENLOG_ERR, "Can't find SHM-RPC with cookie %lx\n", cookie);
+        return;
+    }
+
+    num_params = shm_rpc->guest_arg->num_params;
+
+    barrier(); /* Ensure that num_params is read once */
+    if ( OPTEE_MSG_GET_ARG_SIZE(num_params) > OPTEE_MSG_NONCONTIG_PAGE_SIZE )
+        return;
+
+    memcpy(shm_rpc->xen_arg, shm_rpc->guest_arg,
+           OPTEE_MSG_GET_ARG_SIZE(num_params));
+
+    switch (shm_rpc->xen_arg->cmd) {
+    case OPTEE_MSG_RPC_CMD_GET_TIME:
+        break;
+    case OPTEE_MSG_RPC_CMD_WAIT_QUEUE:
+        break;
+    case OPTEE_MSG_RPC_CMD_SUSPEND:
+        break;
+    case OPTEE_MSG_RPC_CMD_SHM_ALLOC:
+        handle_rpc_cmd_alloc(ctx, regs, call, shm_rpc);
+        break;
+    case OPTEE_MSG_RPC_CMD_SHM_FREE:
+        free_shm_buf(ctx, shm_rpc->xen_arg->params[0].u.value.b);
+        break;
+    default:
+        break;
+    }
+}
+
 static void handle_rpc_func_alloc(struct domain_ctx *ctx,
                                   struct cpu_user_regs *regs)
 {
@@ -796,13 +913,11 @@ static void handle_rpc_func_alloc(struct domain_ctx *ctx,
     struct shm_rpc *shm_rpc;

     shm_rpc = allocate_and_map_shm_rpc(ctx, ptr, cookie);
-    if ( !shm_rpc )
-    {
+    if ( !shm_rpc ) {
         gprintk(XENLOG_WARNING, "Failed to allocate shm_rpc object\n");
-        ptr = 0;
-    }
-    else
-        ptr = mfn_to_maddr(shm_rpc->guest_mfn);
+        ptr = ~0;
+    } else
+        ptr = virt_to_maddr(shm_rpc->xen_arg);

     set_user_reg(regs, 1, ptr >> 32);
     set_user_reg(regs, 2, ptr & 0xFFFFFFFF);
@@ -833,7 +948,7 @@ static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs)
     case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR:
         break;
     case OPTEE_SMC_RPC_FUNC_CMD:
-        /* TODO: Add handling */
+        handle_rpc_cmd(ctx, regs, call);
         break;
     }
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE can issue multiple RPC requests. We are interested mostly in request that asks NW to allocate/free shared memory for OP-TEE needs, becuase mediator need to do address translation in the same
s/becuase/because/ s/need/needs/
the mediator
way as it was done for shared buffers registered by NW.
As mediator now accesses shared command buffer, we need to shadow
"As the"
it in the same way, as we shadow request buffers for STD calls.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
 xen/arch/arm/tee/optee.c | 137 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 126 insertions(+), 11 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 8bfcfdc..b2d795e 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -45,6 +45,7 @@ struct std_call_ctx {
 struct shm_rpc {
     struct list_head list;
     struct optee_msg_arg *guest_arg;
+    struct optee_msg_arg *xen_arg;
     struct page *guest_page;
     mfn_t guest_mfn;
     uint64_t cookie;
@@ -303,6 +304,10 @@ static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t
     if ( !shm_rpc )
         goto err;

+    shm_rpc->xen_arg = alloc_xenheap_page();
+    if ( !shm_rpc->xen_arg )
+        goto err;
+
     shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);

     if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) )
@@ -324,6 +329,10 @@ static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t
 err:
     atomic_dec(&ctx->shm_rpc_count);
+
+    if ( shm_rpc->xen_arg )
+        free_xenheap_page(shm_rpc->xen_arg);
+
     xfree(shm_rpc);
     return NULL;
 }
@@ -346,9 +355,10 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
     }
     spin_unlock(&ctx->lock);

-    if ( !found ) {
+    if ( !found )
         return;
-    }
That change should be folded into the patch adding the {}.
+
+    free_xenheap_page(shm_rpc->xen_arg);

     if ( shm_rpc->guest_arg ) {
         unpin_guest_ram_addr(shm_rpc->guest_mfn);
@@ -358,6 +368,24 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
     xfree(shm_rpc);
 }

+static struct shm_rpc *find_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
+{
+    struct shm_rpc *shm_rpc;
+
+    spin_lock(&ctx->lock);
+    list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list )
+    {
+        if ( shm_rpc->cookie == cookie )
+        {
+            spin_unlock(&ctx->lock);
+            return shm_rpc;
+        }
+    }
+    spin_unlock(&ctx->lock);
+
+    return NULL;
+}
+
 static struct shm_buf *allocate_shm_buf(struct domain_ctx *ctx,
                                         uint64_t cookie,
                                         int pages_cnt)
@@ -704,6 +732,28 @@ static bool copy_std_request_back(struct domain_ctx *ctx,
     return true;
 }

+static void handle_rpc_return(struct domain_ctx *ctx,
+                              struct cpu_user_regs *regs,
+                              struct std_call_ctx *call)
+{
+    call->optee_thread_id = get_user_reg(regs, 3);
+    call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(get_user_reg(regs, 0));
+
+    if ( call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD )
+    {
+        /* Copy RPC request from shadowed buffer to guest */
+        uint64_t cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+        struct shm_rpc *shm_rpc = find_shm_rpc(ctx, cookie);
Newline between declaration and code.
+        if ( !shm_rpc )
+        {
+            gprintk(XENLOG_ERR, "Can't find SHM-RPC with cookie %lx\n", cookie);
+            return;
+        }
+        memcpy(shm_rpc->guest_arg, shm_rpc->xen_arg,
+               OPTEE_MSG_GET_ARG_SIZE(shm_rpc->xen_arg->num_params));
+    }
+}
+
 static bool execute_std_call(struct domain_ctx *ctx,
                              struct cpu_user_regs *regs,
                              struct std_call_ctx *call)
@@ -715,8 +765,7 @@ static bool execute_std_call(struct domain_ctx *ctx,
     optee_ret = get_user_reg(regs, 0);
     if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) )
     {
-        call->optee_thread_id = get_user_reg(regs, 3);
-        call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret);
+        handle_rpc_return(ctx, regs, call);
It would make sense to introduce handle_rpc_return where you actually add those 2 lines.
         return true;
     }
@@ -783,6 +832,74 @@ out:
     return ret;
 }

+static void handle_rpc_cmd_alloc(struct domain_ctx *ctx,
+                                 struct cpu_user_regs *regs,
+                                 struct std_call_ctx *call,
+                                 struct shm_rpc *shm_rpc)
+{
+    if ( shm_rpc->xen_arg->params[0].attr != (OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT |
+                                              OPTEE_MSG_ATTR_NONCONTIG) )
+    {
+        gprintk(XENLOG_WARNING, "Invalid attrs for shared mem buffer\n");
+        return;
+    }
+
+    /* Last entry in non_contig array is used to hold RPC-allocated buffer */
+    if ( call->non_contig[MAX_NONCONTIG_ENTRIES - 1] )
+    {
+        free_xenheap_pages(call->non_contig[MAX_NONCONTIG_ENTRIES - 1],
+                           call->non_contig_order[MAX_NONCONTIG_ENTRIES - 1]);
+        call->non_contig[MAX_NONCONTIG_ENTRIES - 1] = NULL;
+    }
This is quite odd. Why don't you just deny allocating a new entry in the non_contig array? This would avoid silently dropping pages that may have been linked together and potentially still be in use.
I would also prefer if you introduce a define for that RPC-allocated buffer.
+    translate_noncontig(ctx, call, shm_rpc->xen_arg->params + 0,
What does the "+ 0" stand for?
+                        MAX_NONCONTIG_ENTRIES - 1);
+}
+
+static void handle_rpc_cmd(struct domain_ctx *ctx, struct cpu_user_regs *regs,
+                           struct std_call_ctx *call)
+{
+    struct shm_rpc *shm_rpc;
+    uint64_t cookie;
+    int num_params;
+
+    cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+
+    shm_rpc = find_shm_rpc(ctx, cookie);
+
+    if ( !shm_rpc )
+    {
+        gprintk(XENLOG_ERR, "Can't find SHM-RPC with cookie %lx\n", cookie);
+        return;
+    }
+
+    num_params = shm_rpc->guest_arg->num_params;
+
+    barrier(); /* Ensure that num_params is read once */
I would use:
num_params = ACCESS_ONCE(shm_rpc->guest_arg->num_params);
+    if ( OPTEE_MSG_GET_ARG_SIZE(num_params) > OPTEE_MSG_NONCONTIG_PAGE_SIZE )
+        return;
+
+    memcpy(shm_rpc->xen_arg, shm_rpc->guest_arg,
+           OPTEE_MSG_GET_ARG_SIZE(num_params));
+
+    switch (shm_rpc->xen_arg->cmd) {
+    case OPTEE_MSG_RPC_CMD_GET_TIME:
+        break;
+    case OPTEE_MSG_RPC_CMD_WAIT_QUEUE:
+        break;
+    case OPTEE_MSG_RPC_CMD_SUSPEND:
+        break;
+    case OPTEE_MSG_RPC_CMD_SHM_ALLOC:
+        handle_rpc_cmd_alloc(ctx, regs, call, shm_rpc);
+        break;
+    case OPTEE_MSG_RPC_CMD_SHM_FREE:
+        free_shm_buf(ctx, shm_rpc->xen_arg->params[0].u.value.b);
+        break;
+    default:
+        break;
+    }
+}
+
 static void handle_rpc_func_alloc(struct domain_ctx *ctx,
                                   struct cpu_user_regs *regs)
 {
@@ -796,13 +913,11 @@ static void handle_rpc_func_alloc(struct domain_ctx *ctx,
     struct shm_rpc *shm_rpc;

     shm_rpc = allocate_and_map_shm_rpc(ctx, ptr, cookie);
-    if ( !shm_rpc )
-    {
+    if ( !shm_rpc ) {
Please try to avoid changing the coding style in a different patch. But this one is wrong.
         gprintk(XENLOG_WARNING, "Failed to allocate shm_rpc object\n");
-        ptr = 0;
-    }
-    else
-        ptr = mfn_to_maddr(shm_rpc->guest_mfn);
+        ptr = ~0;
Can you explain why you change from 0 to ~0?
+    } else
+        ptr = virt_to_maddr(shm_rpc->xen_arg);

     set_user_reg(regs, 1, ptr >> 32);
     set_user_reg(regs, 2, ptr & 0xFFFFFFFF);
@@ -833,7 +948,7 @@ static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs)
     case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR:
         break;
     case OPTEE_SMC_RPC_FUNC_CMD:
-        /* TODO: Add handling */
+        handle_rpc_cmd(ctx, regs, call);
         break;
     }
Cheers,
Hi,
On 10.09.18 18:34, Julien Grall wrote:
Hi Volodymyr,
On 03/09/18 17:54, Volodymyr Babchuk wrote:
OP-TEE can issue multiple RPC requests. We are interested mostly in request that asks NW to allocate/free shared memory for OP-TEE needs, becuase mediator need to do address translation in the same
s/becuase/because/ s/need/needs/
the mediator
way as it was done for shared buffers registered by NW.
As mediator now accesses shared command buffer, we need to shadow
"As the"
it in the same way, as we shadow request buffers for STD calls.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
 xen/arch/arm/tee/optee.c | 137 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 126 insertions(+), 11 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 8bfcfdc..b2d795e 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -45,6 +45,7 @@ struct std_call_ctx {
 struct shm_rpc {
     struct list_head list;
     struct optee_msg_arg *guest_arg;
+    struct optee_msg_arg *xen_arg;
     struct page *guest_page;
     mfn_t guest_mfn;
     uint64_t cookie;
@@ -303,6 +304,10 @@ static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t
     if ( !shm_rpc )
         goto err;

+    shm_rpc->xen_arg = alloc_xenheap_page();
+    if ( !shm_rpc->xen_arg )
+        goto err;
+
     shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);

     if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) )
@@ -324,6 +329,10 @@ static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t
 err:
     atomic_dec(&ctx->shm_rpc_count);
+
+    if ( shm_rpc->xen_arg )
+        free_xenheap_page(shm_rpc->xen_arg);
+
     xfree(shm_rpc);
     return NULL;
 }
@@ -346,9 +355,10 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
     }
     spin_unlock(&ctx->lock);

-    if ( !found ) {
+    if ( !found )
         return;
-    }
That change should be folded into the patch adding the {}.
+
+    free_xenheap_page(shm_rpc->xen_arg);

     if ( shm_rpc->guest_arg ) {
         unpin_guest_ram_addr(shm_rpc->guest_mfn);
@@ -358,6 +368,24 @@ static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
     xfree(shm_rpc);
 }

+static struct shm_rpc *find_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
+{
+    struct shm_rpc *shm_rpc;
+
+    spin_lock(&ctx->lock);
+    list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list )
+    {
+        if ( shm_rpc->cookie == cookie )
+        {
+            spin_unlock(&ctx->lock);
+            return shm_rpc;
+        }
+    }
+    spin_unlock(&ctx->lock);
+
+    return NULL;
+}
+
 static struct shm_buf *allocate_shm_buf(struct domain_ctx *ctx,
                                         uint64_t cookie,
                                         int pages_cnt)
@@ -704,6 +732,28 @@ static bool copy_std_request_back(struct domain_ctx *ctx,
     return true;
 }

+static void handle_rpc_return(struct domain_ctx *ctx,
+                              struct cpu_user_regs *regs,
+                              struct std_call_ctx *call)
+{
+    call->optee_thread_id = get_user_reg(regs, 3);
+    call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(get_user_reg(regs, 0));
+
+    if ( call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD )
+    {
+        /* Copy RPC request from shadowed buffer to guest */
+        uint64_t cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+        struct shm_rpc *shm_rpc = find_shm_rpc(ctx, cookie);
Newline between declaration and code.
Sorry, another habit from kernel coding style :(
+        if ( !shm_rpc )
+        {
+            gprintk(XENLOG_ERR, "Can't find SHM-RPC with cookie %lx\n", cookie);
+            return;
+        }
+        memcpy(shm_rpc->guest_arg, shm_rpc->xen_arg,
+               OPTEE_MSG_GET_ARG_SIZE(shm_rpc->xen_arg->num_params));
+    }
+}

 static bool execute_std_call(struct domain_ctx *ctx,
                              struct cpu_user_regs *regs,
                              struct std_call_ctx *call)
@@ -715,8 +765,7 @@ static bool execute_std_call(struct domain_ctx *ctx,
     optee_ret = get_user_reg(regs, 0);
     if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) )
     {
-        call->optee_thread_id = get_user_reg(regs, 3);
-        call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret);
+        handle_rpc_return(ctx, regs, call);
It would make sense to introduce handle_rpc_return where you actually add those 2 lines.
         return true;
     }
@@ -783,6 +832,74 @@ out:
     return ret;
 }

+static void handle_rpc_cmd_alloc(struct domain_ctx *ctx,
+                                 struct cpu_user_regs *regs,
+                                 struct std_call_ctx *call,
+                                 struct shm_rpc *shm_rpc)
+{
+    if ( shm_rpc->xen_arg->params[0].attr != (OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT |
+                                              OPTEE_MSG_ATTR_NONCONTIG) )
+    {
+        gprintk(XENLOG_WARNING, "Invalid attrs for shared mem buffer\n");
+        return;
+    }

+    /* Last entry in non_contig array is used to hold RPC-allocated buffer */
+    if ( call->non_contig[MAX_NONCONTIG_ENTRIES - 1] )
+    {
+        free_xenheap_pages(call->non_contig[MAX_NONCONTIG_ENTRIES - 1],
+                           call->non_contig_order[MAX_NONCONTIG_ENTRIES - 1]);
+        call->non_contig[MAX_NONCONTIG_ENTRIES - 1] = NULL;
+    }
This is quite odd. Why don't you just deny allocating a new entry in the non_contig array? This would avoid silently dropping pages that may have been linked together and potentially still be in use.
No, this actually is part of the protocol. OP-TEE can ask to allocate more shared buffers, one per RPC return. call->non_contig[x] is needed to hold a list of pages until OP-TEE consumes it. Then it can be freed and reused to allocate the next buffer. Consider this:
1. OP-TEE issues RPC "allocate buffer"
2. NW returns list of pages
3. Mediator translates and stores address in non_contig[x]
4. OP-TEE begins to consume this list
5. IRQ arrives and OP-TEE is forced to break the work
6. Mediator receives control back, but it should not free non_contig[x], because it is not sure that OP-TEE finished reading from it
7. Xen/guest handles the IRQ and returns control back to OP-TEE
8. OP-TEE finishes processing these buffers and asks for another one
9. NW returns list of pages for the next buffer
10. At this point the mediator is sure that OP-TEE finished processing the old non_contig[x], so it can free it and allocate another.
I would also prefer if you introduce a define for that RPC-allocated buffer.
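[Something as small as the following would make the intent explicit; RPC_BUF_ENTRY is a hypothetical name, not part of the posted series:

/* Slots 0..TEE_NUM_PARAMS-1 hold translated call arguments;
 * the last slot is reserved for the RPC-allocated buffer. */
#define RPC_BUF_ENTRY (MAX_NONCONTIG_ENTRIES - 1)
]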
+ translate_noncontig(ctx, call, shm_rpc->xen_arg->params + 0,
What does the "+ 0" stand for?
+                        MAX_NONCONTIG_ENTRIES - 1);
+}

+static void handle_rpc_cmd(struct domain_ctx *ctx, struct cpu_user_regs *regs,
+                           struct std_call_ctx *call)
+{
+    struct shm_rpc *shm_rpc;
+    uint64_t cookie;
+    int num_params;

+    cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);

+    shm_rpc = find_shm_rpc(ctx, cookie);

+    if ( !shm_rpc )
+    {
+        gprintk(XENLOG_ERR, "Can't find SHM-RPC with cookie %lx\n", cookie);
+        return;
+    }
+ num_params = shm_rpc->guest_arg->num_params;
+ barrier(); /* Ensure that num_params is read once */
I would use:
num_params = ACCESS_ONCE(shm_rpc->guest_arg->num_params);
I looked for this macro, but somehow missed it. Thanks for pointing it out.
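[For completeness, the read-once-then-validate pattern being suggested would look roughly like this (a sketch, not the final code):

    /* Read the guest-controlled field exactly once ... */
    num_params = ACCESS_ONCE(shm_rpc->guest_arg->num_params);

    /* ... and validate it before using it as a copy length */
    if ( OPTEE_MSG_GET_ARG_SIZE(num_params) > OPTEE_MSG_NONCONTIG_PAGE_SIZE )
        return;

    memcpy(shm_rpc->xen_arg, shm_rpc->guest_arg,
           OPTEE_MSG_GET_ARG_SIZE(num_params));
]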
+    if ( OPTEE_MSG_GET_ARG_SIZE(num_params) > OPTEE_MSG_NONCONTIG_PAGE_SIZE )
+        return;

+    memcpy(shm_rpc->xen_arg, shm_rpc->guest_arg,
+           OPTEE_MSG_GET_ARG_SIZE(num_params));

+    switch (shm_rpc->xen_arg->cmd) {
+    case OPTEE_MSG_RPC_CMD_GET_TIME:
+        break;
+    case OPTEE_MSG_RPC_CMD_WAIT_QUEUE:
+        break;
+    case OPTEE_MSG_RPC_CMD_SUSPEND:
+        break;
+    case OPTEE_MSG_RPC_CMD_SHM_ALLOC:
+        handle_rpc_cmd_alloc(ctx, regs, call, shm_rpc);
+        break;
+    case OPTEE_MSG_RPC_CMD_SHM_FREE:
+        free_shm_buf(ctx, shm_rpc->xen_arg->params[0].u.value.b);
+        break;
+    default:
+        break;
+    }
+}

 static void handle_rpc_func_alloc(struct domain_ctx *ctx,
                                   struct cpu_user_regs *regs)
 {
@@ -796,13 +913,11 @@ static void handle_rpc_func_alloc(struct domain_ctx *ctx,
     struct shm_rpc *shm_rpc;

     shm_rpc = allocate_and_map_shm_rpc(ctx, ptr, cookie);
-    if ( !shm_rpc )
-    {
+    if ( !shm_rpc ) {
Please try to avoid changing the coding style in a different patch. But this one is wrong.
Yep :( this is the artifact from splitting the big patch.
         gprintk(XENLOG_WARNING, "Failed to allocate shm_rpc object\n");
-        ptr = 0;
-    }
-    else
-        ptr = mfn_to_maddr(shm_rpc->guest_mfn);
+        ptr = ~0;
Can you explain why you change from 0 to ~0?
I had to introduce this in the original patch, actually.
+    } else
+        ptr = virt_to_maddr(shm_rpc->xen_arg);

     set_user_reg(regs, 1, ptr >> 32);
     set_user_reg(regs, 2, ptr & 0xFFFFFFFF);
@@ -833,7 +948,7 @@ static bool handle_rpc(struct domain_ctx *ctx, struct cpu_user_regs *regs)
     case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR:
         break;
     case OPTEE_SMC_RPC_FUNC_CMD:
-        /* TODO: Add handling */
+        handle_rpc_cmd(ctx, regs, call);
         break;
     }
Cheers,
On 10/09/18 19:14, Volodymyr Babchuk wrote:
On 10.09.18 18:34, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
 static struct shm_buf *allocate_shm_buf(struct domain_ctx *ctx,
                                         uint64_t cookie,
                                         int pages_cnt)
@@ -704,6 +732,28 @@ static bool copy_std_request_back(struct domain_ctx *ctx,
     return true;
 }

+static void handle_rpc_return(struct domain_ctx *ctx,
+                              struct cpu_user_regs *regs,
+                              struct std_call_ctx *call)
+{
+    call->optee_thread_id = get_user_reg(regs, 3);
+    call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(get_user_reg(regs, 0));

+    if ( call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD )
+    {
+        /* Copy RPC request from shadowed buffer to guest */
+        uint64_t cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+        struct shm_rpc *shm_rpc = find_shm_rpc(ctx, cookie);
Newline between declaration and code.
Sorry, another habit from kernel coding style :(
I think you need to modify your habit because I am pretty sure Linux folks would not allow:
struct shm_rpc *rpc = find_shm(...);
if ( rpc )
    ...
The correct way is:
struct shm_rpc *rpc = find_shm(...);
if ( rpc )
Notice the newline between struct and if.
+        if ( !shm_rpc )
+        {
+            gprintk(XENLOG_ERR, "Can't find SHM-RPC with cookie %lx\n", cookie);
+            return;
+        }
+        memcpy(shm_rpc->guest_arg, shm_rpc->xen_arg,
+               OPTEE_MSG_GET_ARG_SIZE(shm_rpc->xen_arg->num_params));
+    }
+}

 static bool execute_std_call(struct domain_ctx *ctx,
                              struct cpu_user_regs *regs,
                              struct std_call_ctx *call)
@@ -715,8 +765,7 @@ static bool execute_std_call(struct domain_ctx *ctx,
     optee_ret = get_user_reg(regs, 0);
     if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) )
     {
-        call->optee_thread_id = get_user_reg(regs, 3);
-        call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret);
+        handle_rpc_return(ctx, regs, call);
It would make sense to introduce handle_rpc_return where you actually add those 2 lines.
         return true;
     }
@@ -783,6 +832,74 @@ out:
     return ret;
 }

+static void handle_rpc_cmd_alloc(struct domain_ctx *ctx,
+                                 struct cpu_user_regs *regs,
+                                 struct std_call_ctx *call,
+                                 struct shm_rpc *shm_rpc)
+{
+    if ( shm_rpc->xen_arg->params[0].attr != (OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT |
+                                              OPTEE_MSG_ATTR_NONCONTIG) )
+    {
+        gprintk(XENLOG_WARNING, "Invalid attrs for shared mem buffer\n");
+        return;
+    }

+    /* Last entry in non_contig array is used to hold RPC-allocated buffer */
+    if ( call->non_contig[MAX_NONCONTIG_ENTRIES - 1] )
+    {
+        free_xenheap_pages(call->non_contig[MAX_NONCONTIG_ENTRIES - 1],
+                           call->non_contig_order[MAX_NONCONTIG_ENTRIES - 1]);
+        call->non_contig[MAX_NONCONTIG_ENTRIES - 1] = NULL;
+    }
This is quite odd. Why don't you just deny allocating a new entry in the non_contig array? This would avoid silently dropping pages that may have been linked together and potentially still be in use.
No, this actually is part of the protocol. OP-TEE can ask to allocate more shared buffers, one per RPC return.
Please give a link to the spec and the paragraph.
call->non_contig[x] is needed to hold a list of pages until OP-TEE consumes it. Then it can be freed and reused to allocate the next buffer. Consider this:
What is x?
1. OP-TEE issues RPC "allocate buffer"
2. NW returns list of pages
3. Mediator translates and stores address in non_contig[x]
4. OP-TEE begins to consume this list
5. IRQ arrives and OP-TEE is forced to break the work
6. Mediator receives control back, but it should not free non_contig[x], because it is not sure that OP-TEE finished reading from it
7. Xen/guest handles the IRQ and returns control back to OP-TEE
8. OP-TEE finishes processing these buffers and asks for another one
9. NW returns list of pages for the next buffer
10. At this point the mediator is sure that OP-TEE finished processing the old non_contig[x], so it can free it and allocate another.
Thank you for the description of the protocol. However, it still does not explain why you decided to free MAX_NONCONTIG_ENTRIES - 1. Why not 0 or 1 or n?
Overall, it feels to me like you want to write more documentation about how the mediator is supposed to work.
[...]
Please try to avoid changing the coding style in a different patch. But this one is wrong.
Yep :( this is the artifact from splitting the big patch.
         gprintk(XENLOG_WARNING, "Failed to allocate shm_rpc object\n");
-        ptr = 0;
-    }
-    else
-        ptr = mfn_to_maddr(shm_rpc->guest_mfn);
+        ptr = ~0;
Can you explain why you change from 0 to ~0?
I had to introduce this in the original patch, actually.
What do you mean?
Cheers,
Hi,
On 11.09.18 16:56, Julien Grall wrote:
On 10/09/18 19:14, Volodymyr Babchuk wrote:
On 10.09.18 18:34, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
 static struct shm_buf *allocate_shm_buf(struct domain_ctx *ctx,
                                         uint64_t cookie,
                                         int pages_cnt)
@@ -704,6 +732,28 @@ static bool copy_std_request_back(struct domain_ctx *ctx,
     return true;
 }

+static void handle_rpc_return(struct domain_ctx *ctx,
+                              struct cpu_user_regs *regs,
+                              struct std_call_ctx *call)
+{
+    call->optee_thread_id = get_user_reg(regs, 3);
+    call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(get_user_reg(regs, 0));

+    if ( call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD )
+    {
+        /* Copy RPC request from shadowed buffer to guest */
+        uint64_t cookie = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+        struct shm_rpc *shm_rpc = find_shm_rpc(ctx, cookie);
Newline between declaration and code.
Sorry, another habit from kernel coding style :(
I think you need to modify your habit because I am pretty sure Linux folks would not allow:
Ah, yes, I read your previous message incorrectly.
+        if ( !shm_rpc )
+        {
+            gprintk(XENLOG_ERR, "Can't find SHM-RPC with cookie %lx\n", cookie);
+            return;
+        }
+        memcpy(shm_rpc->guest_arg, shm_rpc->xen_arg,
+               OPTEE_MSG_GET_ARG_SIZE(shm_rpc->xen_arg->num_params));
+    }
+}

 static bool execute_std_call(struct domain_ctx *ctx,
                              struct cpu_user_regs *regs,
                              struct std_call_ctx *call)
@@ -715,8 +765,7 @@ static bool execute_std_call(struct domain_ctx *ctx,
     optee_ret = get_user_reg(regs, 0);
     if ( OPTEE_SMC_RETURN_IS_RPC(optee_ret) )
     {
-        call->optee_thread_id = get_user_reg(regs, 3);
-        call->rpc_op = OPTEE_SMC_RETURN_GET_RPC_FUNC(optee_ret);
+        handle_rpc_return(ctx, regs, call);
It would make sense to introduce handle_rpc_return where you actually add those 2 lines.
return true; } @@ -783,6 +832,74 @@ out: return ret; }
+static void handle_rpc_cmd_alloc(struct domain_ctx *ctx,
+                                 struct cpu_user_regs *regs,
+                                 struct std_call_ctx *call,
+                                 struct shm_rpc *shm_rpc)
+{
+    if ( shm_rpc->xen_arg->params[0].attr != (OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT |
+                                              OPTEE_MSG_ATTR_NONCONTIG) )
+    {
+        gprintk(XENLOG_WARNING, "Invalid attrs for shared mem buffer\n");
+        return;
+    }

+    /* Last entry in non_contig array is used to hold RPC-allocated buffer */
+    if ( call->non_contig[MAX_NONCONTIG_ENTRIES - 1] )
+    {
+        free_xenheap_pages(call->non_contig[MAX_NONCONTIG_ENTRIES - 1],
+                           call->non_contig_order[MAX_NONCONTIG_ENTRIES - 1]);
+        call->non_contig[MAX_NONCONTIG_ENTRIES - 1] = NULL;
+    }
This is quite odd. Why don't you just deny allocating a new entry in the non_contig array? This would avoid silently dropping pages that may have been linked together and potentially still be in use.
No, this actually is part of the protocol. OP-TEE can ask to allocate more shared buffers, one per RPC return.
Please give a link to the spec and the paragraph.
optee_msg.h, above #define OPTEE_MSG_RPC_CMD_SHM_ALLOC 6 (line 405 in my case). It also defines fragmented buffers, which currently aren't supported in OP-TEE.
call->non_contig[x] is needed to hold a list of pages until OP-TEE consumes it. Then it can be freed and reused to allocate the next buffer. Consider this:
What is x?
Index in non_contig array. The same principle works not only for buffers allocated by RPC, but for all other buffers as well. See below.
1. OP-TEE issues RPC "allocate buffer"
2. NW returns list of pages
3. Mediator translates and stores address in non_contig[x]
4. OP-TEE begins to consume this list
5. IRQ arrives and OP-TEE is forced to break the work
6. Mediator receives control back, but it should not free non_contig[x], because it is not sure that OP-TEE finished reading from it
7. Xen/guest handles the IRQ and returns control back to OP-TEE
8. OP-TEE finishes processing these buffers and asks for another one
9. NW returns list of pages for the next buffer
10. At this point the mediator is sure that OP-TEE finished processing the old non_contig[x], so it can free it and allocate another.
Thank you for the description of the protocol. However, it still does not explain why you decided to free MAX_NONCONTIG_ENTRIES - 1. Why not 0 or 1 or n?
Okay. You can pass up to 4 arguments to a TA in the command buffer. Any of those arguments can be a shared memory reference, so we need at least 4 items in the non_contig array. Sadly, this constant (TEE_NUM_PARAMS) is defined not in optee_msg.h or optee_smc.h, but in tee_api_defines.h, which is the user-space (or TA) part of OP-TEE. But now I'm thinking that maybe it is not such a bad idea to add this file to XEN... What is your opinion?
Anyway, we need at least 4 items for arguments. But I defined MAX_NONCONTIG_ENTRIES as 5, because the last item in the array is used for the RPC-allocated buffer. I also added a comment. I'll copy it here:
/* Last entry in non_contig array is used to hold RPC-allocated buffer */
So, the first 4 items are used for arguments and the last one is used for RPC requests. The right way to define MAX_NONCONTIG_ENTRIES is (TEE_NUM_PARAMS + 1), but then I would have to add another header file, which defines the GlobalPlatform TEE core API.
Overall, it feels to me like you want to write more documentation about how the mediator is supposed to work.
Looks like I need to start documenting how the OP-TEE protocol is supposed to work... I tried to cover this in the commit messages, but it looks like that is not sufficient.
         gprintk(XENLOG_WARNING, "Failed to allocate shm_rpc object\n");
-        ptr = 0;
-    }
-    else
-        ptr = mfn_to_maddr(shm_rpc->guest_mfn);
+        ptr = ~0;
Can you explain why you change from 0 to ~0?
I had to introduce this in the original patch, actually.
What do you mean?
Please ignore the previous sentence. It should be 0. I'll remove this change.
Hi Volodymyr,
On 09/11/2018 07:58 PM, Volodymyr Babchuk wrote:
On 11.09.18 16:56, Julien Grall wrote:
On 10/09/18 19:14, Volodymyr Babchuk wrote:
On 10.09.18 18:34, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
1. OP-TEE issues RPC "allocate buffer"
2. NW returns list of pages
3. Mediator translates and stores address in non_contig[x]
4. OP-TEE begins to consume this list
5. IRQ arrives and OP-TEE is forced to break the work
6. Mediator receives control back, but it should not free non_contig[x], because it is not sure that OP-TEE finished reading from it
7. Xen/guest handles the IRQ and returns control back to OP-TEE
8. OP-TEE finishes processing these buffers and asks for another one
9. NW returns list of pages for the next buffer
10. At this point the mediator is sure that OP-TEE finished processing the old non_contig[x], so it can free it and allocate another.
Thank you for the description of the protocol. However, it still does not explain why you decided to free MAX_NONCONTIG_ENTRIES - 1. Why not 0 or 1 or n?
Okay. You can pass up to 4 arguments to a TA in the command buffer. Any of those arguments can be a shared memory reference, so we need at least 4 items in the non_contig array. Sadly, this constant (TEE_NUM_PARAMS) is defined not in optee_msg.h or optee_smc.h, but in tee_api_defines.h, which is the user-space (or TA) part of OP-TEE. But now I'm thinking that maybe it is not such a bad idea to add this file to XEN... What is your opinion?
I would just introduce TEE_NUM_PARAMS in tee.h with a comment explaining this is coming from GlobalPlatform TEE.
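[A sketch of what that suggested tee.h addition could look like:

/*
 * Maximum number of parameters that can be passed to a TA in a single
 * call. This value comes from the GlobalPlatform TEE specification
 * (TEE_NUM_PARAMS in tee_api_defines.h); it is not defined by
 * optee_msg.h or optee_smc.h.
 */
#define TEE_NUM_PARAMS 4
]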
Anyway, we need at least 4 items for arguments. But I defined MAX_NONCONTIG_ENTRIES as 5, because the last item in the array is used for the RPC-allocated buffer. I also added a comment. I'll copy it here:
/* Last entry in non_contig array is used to hold RPC-allocated buffer */
So, the first 4 items are used for arguments and the last one is used for RPC requests. The right way to define MAX_NONCONTIG_ENTRIES is (TEE_NUM_PARAMS + 1), but then I would have to add another header file, which defines the GlobalPlatform TEE core API.
If I understand correctly your e-mail, a TEE call can never have more than 4 parameters. Right? If so, should not your code mostly use TEE_NUM_PARAMS and not MAX_NONCONTIG_ENTRIES?
Overall, it feels to me like you want to write more documentation about how the mediator is supposed to work.
Looks like I need to start documenting how the OP-TEE protocol is supposed to work... I tried to cover this in the commit messages, but it looks like that is not sufficient.
I appreciate your effort to cover that in the commit message :). I tend to update the commit message over revisions when there are misunderstandings about the patch.
Cheers,
Hi Julien,
On 18.09.18 19:50, Julien Grall wrote:
Hi Volodymyr,
On 09/11/2018 07:58 PM, Volodymyr Babchuk wrote:
On 11.09.18 16:56, Julien Grall wrote:
On 10/09/18 19:14, Volodymyr Babchuk wrote:
On 10.09.18 18:34, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:
1. OP-TEE issues RPC "allocate buffer"
2. NW returns list of pages
3. Mediator translates and stores address in non_contig[x]
4. OP-TEE begins to consume this list
5. IRQ arrives and OP-TEE is forced to break the work
6. Mediator receives control back, but it should not free non_contig[x], because it is not sure that OP-TEE finished reading from it
7. Xen/guest handles the IRQ and returns control back to OP-TEE
8. OP-TEE finishes processing these buffers and asks for another one
9. NW returns list of pages for the next buffer
10. At this point the mediator is sure that OP-TEE finished processing the old non_contig[x], so it can free it and allocate another.
Thank you for the description of the protocol. However, it still does not explain why you decided to free MAX_NONCONTIG_ENTRIES - 1. Why not 0 or 1 or n?
Okay. You can pass up to 4 arguments to a TA in the command buffer. Any of those arguments can be a shared memory reference, so we need at least 4 items in the non_contig array. Sadly, this constant (TEE_NUM_PARAMS) is defined not in optee_msg.h or optee_smc.h, but in tee_api_defines.h, which is the user-space (or TA) part of OP-TEE.
[...]
Anyway, we need at least 4 items for arguments. But I defined MAX_NONCONTIG_ENTRIES as 5, because the last item in the array is used for the RPC-allocated buffer. I also added a comment. I'll copy it here:
/* Last entry in non_contig array is used to hold RPC-allocated buffer */
So, the first 4 items are used for arguments and the last one is used for RPC requests. The right way to define MAX_NONCONTIG_ENTRIES is (TEE_NUM_PARAMS + 1), but then I would have to add another header file, which defines the GlobalPlatform TEE core API.
If I understand correctly your e-mail, a TEE call can never have more than 4 parameters. Right? If so, should not your code mostly use TEE_NUM_PARAMS and not MAX_NONCONTIG_ENTRIES?
It is more complex. The GP TEE API defines that a maximum of 4 parameters can be passed to a TA in a single call. OP-TEE can add extra, so-called "meta" parameters for its own needs. Also, there are OP-TEE calls that do not map to GP API calls. So, I need to review this piece of code. Probably, I'll add some dynamic structure to handle the lists of pages in use.
Overall, it feels to me like you want to write more documentation about how the mediator is supposed to work.
Looks like I need to start documenting how the OP-TEE protocol is supposed to work... I tried to cover this in the commit messages, but it looks like that is not sufficient.
I appreciate your effort to cover that in the commit message :). I tend to update the commit message over revisions when there are misunderstandings about the patch.
Thank you for the tip, this is a good point. I'll extend the commit messages to cover the topics we discussed.