Hi,
We've noticed an issue trying to use the Linaro AArch64 binary bare metal toolchain release with the MMU turned off for some low-level tests.
Any time puts, sprintf, etc. get called, a reent structure is created with references to the STDIN, STDOUT, and STDERR FILE types. A member of the __sFILE struct, _mbstate, is an 8-byte struct, but it is not aligned on an 8-byte boundary. This means that when memset (or a similar function) is called on this struct and doesn't operate one byte at a time, a data alignment fault is generated when operating out of device memory, such as on a system where the MMU has not yet been turned on.
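To make the layout problem concrete, here is a rough host-side sketch. The names demo_mbstate_t, demo_file_t, demo_mbstate_offset and demo_access_is_aligned are made up for illustration and are not newlib's definitions. The field layout is an assumption, chosen only to reproduce an 8-byte member that is 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an 8-byte multibyte-state struct. Its widest
   member is 4 bytes, so the compiler only has to align it to 4. */
typedef struct {
    int32_t count;
    unsigned char bytes[4];
} demo_mbstate_t;

/* Hypothetical FILE-like struct, laid out so that mbstate starts at
   offset 12: 4-byte aligned, but not 8-byte aligned. */
typedef struct {
    uint64_t position;      /* offset 0 */
    int32_t flags;          /* offset 8 */
    demo_mbstate_t mbstate; /* offset 12 */
} demo_file_t;

/* Returns nonzero if an access of `width` bytes at `offset` from a
   suitably aligned base would be naturally aligned. */
int demo_access_is_aligned(size_t offset, size_t width)
{
    assert(width != 0);
    return offset % width == 0;
}

/* Offset of the 8-byte member inside the hypothetical FILE-like struct. */
size_t demo_mbstate_offset(void)
{
    return offsetof(demo_file_t, mbstate);
}
```

With this layout a 4-byte access to the member is naturally aligned, but a single 8-byte access is not, and that is the kind of access that faults on device memory.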
I'm still examining possible fixes (I'll probably look at building with -mstrict-align first), but I wanted to check whether anyone had thoughts on the subject. Do Newlib upstream or Linaro consider using Newlib with the MMU turned off to be a valid use case, or is running the code that turns on the MMU considered a prerequisite to everything else?
Thanks, Christopher
On 20 November 2013 17:57, Christopher Covington cov@codeaurora.org wrote:
> Hi,
> We've noticed an issue trying to use the Linaro AArch64 binary bare metal toolchain release with the MMU turned off for some low-level tests.
> Any time puts, sprintf, etc. get called, a reent structure is created with references to the STDIN, STDOUT, and STDERR FILE types. A member of the __sFILE struct, _mbstate, is an 8-byte struct, but it is not aligned on an 8-byte boundary. This means that when memset (or a similar function) is called on this struct and doesn't operate one byte at a time, a data alignment fault is generated when operating out of device memory, such as on a system where the MMU has not yet been turned on.
> I'm still examining possible fixes (I'll probably look at building with -mstrict-align first), but I wanted to check whether anyone had thoughts on the subject. Do Newlib upstream or Linaro consider using Newlib with the MMU turned off to be a valid use case, or is running the code that turns on the MMU considered a prerequisite to everything else?
I've always viewed newlib as a C library - and as such it doesn't provide system 'startup' code but rather assumes that such code has already run before you enter newlib (possibly by using libgloss).
Certainly this is how the aarch64 port is structured - it provides system startup code which installs exception vectors and turns the MMU on when using the AEM models. It also provides the .specs files to use with GCC to get that installed.
So I think you want to write some CPU initialisation code for your particular core and hook it into libgloss. Basically this should involve writing a _cpu_init_hook of your own, and getting it built into its own .o file with libgloss, and providing appropriate .specs files.
However, I can see the other viewpoint as well - that newlib should work without the MMU turned on. So I leave it to wiser heads than me to make a definitive statement on what newlib presupposes about the CPU state.
Thanks,
Matt
Hi,
On 11/20/2013 03:45 PM, Matthew Gretton-Dann wrote:
> On 20 November 2013 17:57, Christopher Covington cov@codeaurora.org wrote:
>> Hi,
>> We've noticed an issue trying to use the Linaro AArch64 binary bare metal toolchain release with the MMU turned off for some low-level tests.
>> Any time puts, sprintf, etc. get called, a reent structure is created with references to the STDIN, STDOUT, and STDERR FILE types. A member of the __sFILE struct, _mbstate, is an 8-byte struct, but it is not aligned on an 8-byte boundary. This means that when memset (or a similar function) is called on this struct and doesn't operate one byte at a time, a data alignment fault is generated when operating out of device memory, such as on a system where the MMU has not yet been turned on.
We believe we have narrowed the issue down to the AArch64 optimized memcpy/memset implementations, which assume unaligned accesses will not fault. While the current AArch64 libgloss startup code turns the MMU on so such accesses will succeed, I don't think turning on the MMU should be required of all startup code. Would it be possible to modify these routines to make only size-aligned accesses without degrading performance? If a single implementation can't make everyone happy, should the ifdefs around them perhaps be expanded to include something about requiring the MMU to be on?
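As a rough sketch of what size-aligned-only accesses could look like in C (this is not the newlib implementation, and nothing here has been benchmarked), a fill routine can use byte stores for the unaligned head and tail and 8-byte stores only once the pointer is 8-byte aligned. The names demo_aligned_memset and demo_word_t are hypothetical, and the may_alias attribute is a GCC/Clang extension assumed to be available.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: lets 8-byte stores alias memory of any type. */
typedef uint64_t __attribute__((__may_alias__)) demo_word_t;

/* Hypothetical helper, not newlib's memset: fills n bytes using only
   naturally aligned stores. volatile keeps the compiler from merging the
   byte stores or turning the loops back into a memset call. */
void *demo_aligned_memset(void *dst, int value, size_t n)
{
    volatile unsigned char *p = dst;
    unsigned char byte = (unsigned char)value;

    /* Head: byte stores until p is 8-byte aligned (or n runs out). */
    while (n > 0 && ((uintptr_t)p & 7u) != 0) {
        *p++ = byte;
        n--;
    }

    /* Body: aligned 8-byte stores of the byte replicated 8 times. */
    demo_word_t pattern = 0x0101010101010101ull * byte;
    volatile demo_word_t *w = (volatile demo_word_t *)p;
    while (n >= 8) {
        *w++ = pattern;
        n -= 8;
    }

    /* Tail: remaining 0..7 bytes, one at a time. */
    p = (volatile unsigned char *)w;
    while (n > 0) {
        *p++ = byte;
        n--;
    }
    return dst;
}
```

The head and tail loops are the extra compares and branches that an unaligned-access implementation can avoid.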
Thanks, Christopher
On 16/12/13 17:54, Christopher Covington wrote:
> Hi,
> On 11/20/2013 03:45 PM, Matthew Gretton-Dann wrote:
>> On 20 November 2013 17:57, Christopher Covington cov@codeaurora.org wrote:
>>> Hi,
>>> We've noticed an issue trying to use the Linaro AArch64 binary bare metal toolchain release with the MMU turned off for some low-level tests.
>>> Any time puts, sprintf, etc. get called, a reent structure is created with references to the STDIN, STDOUT, and STDERR FILE types. A member of the __sFILE struct, _mbstate, is an 8-byte struct, but it is not aligned on an 8-byte boundary. This means that when memset (or a similar function) is called on this struct and doesn't operate one byte at a time, a data alignment fault is generated when operating out of device memory, such as on a system where the MMU has not yet been turned on.
> We believe we have narrowed the issue down to the AArch64 optimized memcpy/memset implementations, which assume unaligned accesses will not fault. While the current AArch64 libgloss startup code turns the MMU on so such accesses will succeed, I don't think turning on the MMU should be required of all startup code. Would it be possible to modify these routines to make only size-aligned accesses without degrading performance? If a single implementation can't make everyone happy, should the ifdefs around them perhaps be expanded to include something about requiring the MMU to be on?
Quite frankly, I doubt it. Good overall performance in memcpy means avoiding hard-to-predict branches (it's not unusual for the code to be called with completely random copy sizes); removing the unaligned accesses would mean many more compares and branches than are currently required, each of which would carry a significant risk of an avoidable branch mispredict.
Furthermore, completely unaligned copies would then need to be entirely rewritten to use byte-shifting techniques; that would significantly impact the overall performance.
My personal feeling is that startup code is really special. If you need to copy some memory during this time and the MMU has not been enabled, then you can't assume that it's safe to call memcpy.
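One possible shape for a copy helper that startup code could use instead of memcpy before the MMU is on, as a minimal sketch. demo_early_copy is a hypothetical name, not part of newlib or libgloss. It assumes byte accesses are acceptable for the memory in question, and it relies on volatile to keep the compiler from widening the accesses or replacing the loop with a call to memcpy.

```c
#include <stddef.h>

/* Hypothetical early-boot helper: copies n bytes strictly one byte at a
   time, so every access is naturally aligned. The regions must not
   overlap. */
void *demo_early_copy(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    for (size_t i = 0; i < n; i++) {
        d[i] = s[i];
    }
    return dst;
}
```

This trades speed for safety, which is usually acceptable for the small amount of data moved during startup.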
R.
linaro-toolchain@lists.linaro.org