On Mon, 2023-10-23 at 19:32 +0100, Mark Brown wrote:
Right. We're already adding the cost of the extra map_shadow_stack() so it doesn't seem that out of scope. We could also allow clone3() to be used for allocation, potentially removing the ability to specify the address entirely and only specifying the size. I did consider that option but it felt awkward in the API, though equally the whole shadow stack allocation thing is a bit that way. That would avoid concerns about placing and validating tokens entirely but gives less control to userspace.
There is also cost in the form of extra complexity. Not to throw FUD, but GUP has been the source of thorny problems. And here we would be doing it around security races. We're probably helped that shadow stack is only private/anonymous memory, so maybe it's enough of a normal case to not worry about it.
Still, there is some extra complexity, and I'm not sure if we really need it. The justification seems to mostly be that it's not as flexible as normal stacks with clone3.
I don't understand why doing size-only is awkward. Just because it doesn't match the regular stack clone3 semantics?
This also doesn't do anything to stop anyone trying to allocate sub page shadow stacks if they're trying to save memory with all the lack of overrun protection that implies, though that seems to me to be much more of a deliberate decision that people might make, a token would prevent that too unless write access to the shadow stack is enabled.
Sorry, I'm not following. Sub-page shadow stacks?
+ /* + * This doesn't validate that the addresses are mapped + * VM_SHADOW_STACK, just that they're mapped at all. + */
It just checks the range, right?
Yes, same check as for the normal stack.
What looked wrong is that the comment says that it checks if the addresses are mapped, but the code just does access_ok(). It's a minor thing in any case.