Re: MMU Off / Strict Alignment

19 Dec 2013


On 18/12/13 05:06, Jonathan S. Shapiro wrote:
...
At the risk of sticking my nose in, this isn't a startup code issue.
It's a contract issue.
First, I don't buy Richard's argument about memcpy() startup costs and
hard-to-predict branches. We do those tests on essentially every
*other* RISC platform without complaint, and it's very easy to order
those branches so that the currently efficient cases run well. Perhaps
more to the point, I haven't seen anybody put forward quantitative
data that using the MMU for unaligned references is any better than
executing those branches. Speaking as a recovering processor
architect, that assumption needs to be validated quantitatively. My
guess is that the branches are faster if properly arranged.
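
To make the "alignment tests" being argued about concrete, here is a minimal sketch of a copy routine that orders its branches so the common co-aligned case runs word-at-a-time and everything else falls back to bytes. It is not newlib's memcpy: the name copy_with_alignment_tests and the word_t typedef are hypothetical, and the may_alias attribute assumes GCC or Clang. It says nothing about which approach is actually faster; that still needs measuring.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: GCC/Clang extension giving a word type that
 * may alias the bytes of any object. */
typedef uintptr_t __attribute__((may_alias)) word_t;

#define WORD_MASK (sizeof(word_t) - 1)

/* Hypothetical name; this is not newlib's memcpy. */
void *copy_with_alignment_tests(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Tested first because it is the common case: both pointers have the
     * same offset within a word, so they can be brought to alignment
     * together. */
    if (n >= sizeof(word_t) &&
        (((uintptr_t)d ^ (uintptr_t)s) & WORD_MASK) == 0) {
        /* Copy leading bytes until both pointers are word aligned. */
        while (((uintptr_t)d & WORD_MASK) != 0) {
            *d++ = *s++;
            n--;
        }
        /* Aligned word copies; no unaligned access is ever issued. */
        while (n >= sizeof(word_t)) {
            *(word_t *)d = *(const word_t *)s;
            d += sizeof(word_t);
            s += sizeof(word_t);
            n -= sizeof(word_t);
        }
    }

    /* Incompatibly aligned pointers, short copies, and any tail bytes all
     * end up here: a byte access is always naturally aligned. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

The cost in dispute is the one test at the top plus the head and tail loops; a production routine would typically also handle the incompatibly aligned case with shifted word copies rather than bytes.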
Second, this is a contract issue. If newlib intends to support
embedded platforms, then it needs to implement algorithms that are
functionally correct without relying on an MMU. By all means use
simpler or smarter algorithms when an MMU can be assumed to be
available in a given configuration, but provide an algorithm that is
functionally correct when no MMU is available. "Good overall
performance in memcpy" is a fine thing, but it is subject to the
requirement of meeting functional specifications. As Jochen Liedtke
famously put it (read this in a heavy German accent): "Fast, ya. But
correct? (shrug) Eh!"
So: we need a normative statement saying what the contract is. The
rest of the answer will fall out from that.
I do agree with Richard that startup code is special. I've built
deeply embedded runtimes of one form or another for 25 years now, and
I have yet to see a system where optimizing a simplistic byte-wise
memcpy during bootstrap would have made any difference in anything
overall. That said, if the specification of memcpy requires it to
handle incompatibly aligned pointers (and it does), and the contract
for newlib requires it to operate in MMU-less scenarios in a given
configuration (which, at least in some cases, it does), it's
completely legitimate to expect that bootstrap code can call memcpy()
and expect behavior that meets specifications.
So what's the contract?

I disagree with your assertion that newlib *requires* it to operate in
an MMU-less scenario for all targets; it only does so when the target
can reasonably be expected to not have an MMU.
The only contract that exists is the one written in the C standard:
7.23.2.1#2 The memcpy function copies n characters from the object
pointed to by s2 into the object pointed to by s1. If copying takes
place between objects that overlap, the behavior is undefined.
But that is written on the assumption that we're in a normal execution
environment, not in some special case.
What you're missing is that AArch64 is (in ARM ARM terms) an A-profile
only environment where an MMU is mandated in the system.  Furthermore,
processors implementing the architecture will *expect* that the MMU be
turned on as soon as possible after boot, since without this the caches
cannot be used and without those the performance will be truly horrible.
Once the caches are enabled, it's perfectly reasonable to assume that
memcpy will only be used for copies to and from NORMAL memory, since
other types of memory have potential side effects, which means that use
of memcpy would be unsafe.
If you want to write an MMU-less memcpy, then feel free to write one;
but please install it with a different interface -- something like
__memcpy_nommu().  Don't penalise the standard case for the non-standard
exceptional one.
R.
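
As an illustration of the separate-interface suggestion above, a byte-wise copy could look roughly like the sketch below. The name memcpy_nommu_sketch is hypothetical (a library would more likely use a reserved name along the lines of __memcpy_nommu()), and nothing here is taken from newlib. Because it only issues byte accesses, it cannot generate an unaligned access whatever the pointer values are.

```c
#include <stddef.h>

/* Hypothetical helper: same contract as memcpy (non-overlapping objects),
 * but restricted to byte accesses so alignment never matters. */
void *memcpy_nommu_sketch(void *restrict dst, const void *restrict src,
                          size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Caveat: an optimising compiler may recognise this loop and replace
     * it with a call to memcpy; a real MMU-less build would need to check
     * the generated code or disable that transformation. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Bootstrap code would call this helper explicitly until the MMU and caches are on, leaving the standard memcpy() free to assume Normal memory.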
