At the moment ARM eglibc doesn't support the functions declared in ucontext.h: getcontext(), setcontext(), swapcontext() and makecontext(). Instead you get implementations which always fail and set errno to ENOSYS.
QEMU uses these functions to implement coroutines. Although there is a fallback implementation in terms of threads, there are reasons why using the fallback is suboptimal:
- its performance is worse
- it will be less tested, because x86_64 and i386 both implement the ucontext functions and so QEMU on those hosts will be using different code paths
- I'm not aware of a good way at configure time to detect whether getcontext() et al will always fail without actually running a test binary, which won't work in a cross-compile setup. (If eglibc just didn't provide the functions at all this would be much simpler...)
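For illustration, the run-time probe in question might look roughly like the sketch below. The function name ucontext_is_usable() is made up for this example; a configure test would wrap it in a main() that exits non-zero on failure, and that is exactly the step that cannot be executed when cross-compiling.

```c
#include <errno.h>
#include <ucontext.h>

/* Hypothetical helper: returns 1 if getcontext() succeeds on this host,
 * 0 if it fails (for example a stub that sets errno to ENOSYS).
 * The context is only captured here, never resumed. */
int ucontext_is_usable(void)
{
    ucontext_t uc;

    errno = 0;
    if (getcontext(&uc) == 0) {
        return 1;
    }
    /* errno is left as set by the library, e.g. ENOSYS for a stub. */
    return 0;
}
```

Because the answer depends on the target's C library at run time, a link-only check cannot distinguish a stub from a working implementation.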
We're going to care about performance and reliability of QEMU on ARM hosts as we start to support KVM on Cortex-A15, so it would be good if we could add ucontext function support to eglibc as part of that effort.
Opinions? Have I missed some good reason why there isn't an ARM implementation of these functions?
(I'm aware that the ucontext functions have been removed from the latest version of the POSIX spec; however AFAIK there's no equivalent functionality that replaces them so I think they're still worth having implementations of for parity with other architectures.)
-- PMM
The calling convention for makecontext() isn't just an issue for standards lawyers, it's an actual problem on ARM, especially on hard-float systems. According to the standard, "The application shall ensure that the value of argc matches the number of arguments of type int passed to func; otherwise, the behavior is undefined." If you read this to imply that func() must be a non-variadic function that takes zero or more int arguments, and call it as if it took exactly as many int arguments as fit in registers -- stuffing extras onto the stack if necessary -- then you're OK. But supporting more general calling conventions for func() gets hairy, and correctly supporting both variadic and non-variadic functions on a hard-float system is impossible.
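As a minimal sketch of the "safe" reading, here is a non-variadic entry point that takes only int arguments, with argc matching exactly. The names entry(), run_entry_with_three_ints() and the 64 KiB stack size are arbitrary choices for this example, not anything from eglibc or QEMU.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, func_ctx;
static char func_stack[64 * 1024];
static int seen_sum;

/* Non-variadic, int-only entry point: the one shape that the
 * standard's wording clearly covers. */
static void entry(int a, int b, int c)
{
    seen_sum = a + b + c;
    /* Returning from here resumes uc_link, i.e. main_ctx. */
}

int run_entry_with_three_ints(void)
{
    int rc = getcontext(&func_ctx);
    assert(rc == 0);

    func_ctx.uc_stack.ss_sp = func_stack;
    func_ctx.uc_stack.ss_size = sizeof func_stack;
    func_ctx.uc_link = &main_ctx;

    /* argc (3) must equal the number of int arguments that follow. */
    makecontext(&func_ctx, (void (*)(void))entry, 3, 1, 2, 3);

    rc = swapcontext(&main_ctx, &func_ctx);
    assert(rc == 0);
    (void)rc;
    return seen_sum;
}
```

On a host with working ucontext support, run_entry_with_three_ints() returns 6. Anything beyond this shape (float arguments, variadic func) is where the hard-float trouble starts.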
If you ask me, the right way to handle this is to replace makecontext() with ucontext_t * makevcontext(ucontext_t *ucp, void (*func)(va_list), va_list args). That is, if you're going to buck the standards committee and fix the interface instead of calling it obsolescent and dropping the functionality.
For what it's worth, mcontext_t on ARMv7-A shouldn't have to be as big a beast as eglibc's implementation for, say, powerpc; you don't have to keep any state which is clobbered by a function call, so you just need stack/frame/return addresses plus the callee-save core and VFP registers.
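To put a rough size on that, here is a hypothetical layout holding only the callee-saved state. The struct and field names are invented for this sketch and are not eglibc's mcontext_t; it assumes the usual Linux ARM EABI convention in which r4-r11 (including r9 and the frame pointer) and d8-d15 are preserved across calls.

```c
#include <stdint.h>

/* Hypothetical minimal context for ARMv7-A with VFP, for sizing only. */
struct arm_min_context {
    uint64_t d8_d15[8];   /* callee-saved VFP registers (same as q4-q7) */
    uint32_t r4_r11[8];   /* callee-saved core registers, incl. frame pointer */
    uint32_t sp;          /* stack pointer */
    uint32_t lr;          /* return address */
};

/* 8*8 + 8*4 + 4 + 4 = 104 bytes of saved state. */
_Static_assert(sizeof(struct arm_min_context) == 104,
               "unexpected padding in arm_min_context");
```

That is about a hundred bytes per context, before any signal mask or stack description is added.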
Note that you will need to save/restore FP state in all swapcontext() calls. User thread context switch code doesn't have access to the FPEXC register and thus can't identify non-FP-using contexts and skip the q4-q7 save/restore. However, the standard is silent on whether you can have context-specific settings of floating point exception control, rounding mode, etc. If QEMU's coroutines need this, you'll need to save/restore FPSCR as well.
Now it seems to me that you can make the "CPU state" part of the context structure drastically simpler than the conventional mcontext_t -- all you really need is the stack pointer. An inline assembly implementation of the innards of swapcontext() can explicitly push the FPSCR and return address; switch stacks; and pop the same explicit state. The rest can be left up to the compiler by marking this assembly block as having clobbered all registers. (The signal mask save/restore should be done in C outside this, to prevent signal delivery during this maneuver.) You might as well make this an inline function, and then you shouldn't wind up with gratuitous register churn.
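The C wrapper around such an assembly block could be sketched as follows. Everything here is hypothetical: struct mini_ctx, mini_swap() and the raw_switch callback stand in for the real context type and the inline assembly, and stub_switch() merely records the signal mask so the sketch can run without any ARM code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical context: a saved signal mask plus the saved stack pointer. */
struct mini_ctx {
    sigset_t mask;
    void *saved_sp;
};

typedef void (*raw_switch_fn)(struct mini_ctx *from, struct mini_ctx *to);

void mini_swap(struct mini_ctx *from, struct mini_ctx *to,
               raw_switch_fn raw_switch)
{
    sigset_t all;
    int rc;

    /* Block everything and remember the outgoing context's mask. */
    sigfillset(&all);
    rc = sigprocmask(SIG_SETMASK, &all, &from->mask);
    assert(rc == 0);

    raw_switch(from, to);   /* the assembly stack switch would go here */

    /* Execution continues here once something switches back to 'from';
     * reinstate the mask that was saved for it. */
    rc = sigprocmask(SIG_SETMASK, &from->mask, NULL);
    assert(rc == 0);
    (void)rc;
}

static int blocked_during_switch;

/* Stand-in for the assembly: only checks that signals are blocked. */
static void stub_switch(struct mini_ctx *from, struct mini_ctx *to)
{
    sigset_t cur;

    (void)from;
    (void)to;
    sigprocmask(SIG_SETMASK, NULL, &cur);
    blocked_during_switch = sigismember(&cur, SIGUSR1);
}

int demo_mask_is_held(void)
{
    struct mini_ctx a, b;
    sigset_t before, after;

    sigemptyset(&a.mask);
    sigemptyset(&b.mask);
    a.saved_sp = NULL;
    b.saved_sp = NULL;

    sigprocmask(SIG_SETMASK, NULL, &before);
    mini_swap(&a, &b, stub_switch);
    sigprocmask(SIG_SETMASK, NULL, &after);

    /* 1 if SIGUSR1 was blocked during the switch and the caller's
     * mask for it is unchanged afterwards. */
    return blocked_during_switch == 1 &&
           sigismember(&before, SIGUSR1) == sigismember(&after, SIGUSR1);
}
```

A freshly created context does not resume inside mini_swap(), so its entry trampoline would have to install the initial mask itself; a threaded program would presumably use pthread_sigmask() instead of sigprocmask().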
I suppose I might as well turn this into an eglibc patch. Probably not this week, though, as I have a lot of work to do before my Linaro Connect plenary.
Cheers,
- Michael
On Oct 24, 2011 6:46 AM, "Peter Maydell" peter.maydell@linaro.org wrote:
At the moment ARM eglibc doesn't support the functions declared in ucontext.h: getcontext(), setcontext(), swapcontext() and makecontext(). Instead you get implementations which always fail and set errno to ENOSYS.
QEMU uses these functions to implement coroutines. Although there is a fallback implementation in terms of threads, there are reasons why using the fallback is suboptimal:
- its performance is worse
- it will be less tested, because x86_64 and i386 both implement the ucontext functions and so QEMU on those hosts will be using different code paths
- I'm not aware of a good way at configure time to detect whether getcontext() et al will always fail without actually running a test binary, which won't work in a cross-compile setup. (If eglibc just didn't provide the functions at all this would be much simpler...)
We're going to care about performance and reliability of QEMU on ARM hosts as we start to support KVM on Cortex-A15, so it would be good if we could add ucontext function support to eglibc as part of that effort.
Opinions? Have I missed some good reason why there isn't an ARM implementation of these functions?
(I'm aware that the ucontext functions have been removed from the latest version of the POSIX spec; however AFAIK there's no equivalent functionality that replaces them so I think they're still worth having implementations of for parity with other architectures.)
-- PMM
linaro-toolchain mailing list linaro-toolchain@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-toolchain
On 24 October 2011 20:05, Michael K. Edwards m.k.edwards@gmail.com wrote:
The calling convention for makecontext() isn't just an issue for standards lawyers, it's an actual problem on ARM, especially on hard-float systems. According to the standard, "The application shall ensure that the value of argc matches the number of arguments of type int passed to func; otherwise, the behavior is undefined." If you read this to imply that func() must be a non-variadic function that takes zero or more int arguments, and call it as if it took exactly as many int arguments as fit in registers -- stuffing extras onto the stack if necessary -- then you're OK. But supporting more general calling conventions for func() gets hairy, and correctly supporting both variadic and non-variadic functions on a hard-float system is impossible.
FWIW, the function QEMU uses is: static void coroutine_trampoline(int i0, int i1), so we are not trying to take advantage of the hairier cases.
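A two-int entry point is enough to carry one pointer even on a 64-bit host. The sketch below shows one common way to do that with a union; union ptr_arg, struct co_state and the other names are invented for this example and it is not a copy of the QEMU source.

```c
#include <assert.h>
#include <ucontext.h>

/* Carries one pointer through makecontext()'s int-only argument list. */
union ptr_arg {
    void *p;
    int i[2];
};

struct co_state { int ran; };   /* hypothetical per-coroutine data */

static ucontext_t caller_ctx, co_ctx;
static char co_stack[64 * 1024];

/* Same shape as the trampoline signature: two plain ints. */
static void trampoline(int i0, int i1)
{
    union ptr_arg arg;
    struct co_state *st;

    arg.i[0] = i0;
    arg.i[1] = i1;
    st = arg.p;
    st->ran = 1;
    /* Returning resumes uc_link, i.e. caller_ctx. */
}

int run_trampoline_with_pointer(void)
{
    static struct co_state state;
    union ptr_arg arg;
    int rc;

    state.ran = 0;
    arg.i[0] = 0;
    arg.i[1] = 0;      /* stays zero when pointers are 32 bits wide */
    arg.p = &state;

    rc = getcontext(&co_ctx);
    assert(rc == 0);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &caller_ctx;

    makecontext(&co_ctx, (void (*)(void))trampoline, 2, arg.i[0], arg.i[1]);

    rc = swapcontext(&caller_ctx, &co_ctx);
    assert(rc == 0);
    (void)rc;
    return state.ran;
}
```

Since both arguments are ints and argc is 2, this stays within the non-variadic, int-only case that does not depend on the hard-float calling convention.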
Note that you will need to save/restore FP state in all swapcontext() calls. User thread context switch code doesn't have access to the FPEXC register and thus can't identify non-FP-using contexts and skip the q4-q7 save/restore. However, the standard is silent on whether you can have context-specific settings of floating point exception control, rounding mode, etc. If QEMU's coroutines need this, you'll need to save/restore FPSCR as well.
As it happens QEMU coroutines use swapcontext() only as a way to initially switch to a newly allocated stack -- actual switching between coroutines is done via setjmp()/longjmp(). (On reflection this does seem a bit odd...)
(the actual code is here: http://git.linaro.org/gitweb?p=people/pmaydell/qemu-arm.git%3Ba=blob%3Bf=cor... )
-- PMM