linaro-toolchain January 2013

linaro-toolchain@lists.linaro.org

22 participants
45 discussions

Re: [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores

by Michael Hope

On 6 November 2012 02:48, Rob Herring <robherring2(a)gmail.com> wrote:
>
> On 11/05/2012 05:13 AM, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> >> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> >>> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> >>>> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> >>>>> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >>>>>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>>>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>>>>>
> >>>>>>>> While v6 can support unaligned accesses, it is optional and current
> >>>>>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>>>>>> v6.
> >>>>>>>
> >>>>>>> not true according to the gcc changes page
> >>>>>>
> >>>>>> What are you going to believe: documentation or what the compiler
> >>>>>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >>>>>> support backported and 4.7.2, unaligned accesses are emitted for v7
> >>>>>> only. I guess default here means it is the default unless you change the
> >>>>>> default in your build of gcc.
> >>>>>
> >>>>> Since ARMv6 can handle unaligned access in the same way as ARMv7
> >>>>> it seems a clear bug in gcc which might hopefully get fixed.
> >>>>> Thus in this case I think it is reasonable to follow the
> >>>>> gcc documentation, otherwise the code would break for ARMv6
> >>>>> when gcc gets fixed.
> >>>>
> >>>> But the compiler can't assume the state of the U bit. I think it is
> >>>> still legal on v6 to not support unaligned accesses, but on v7 it is
> >>>> required. All the standard v6 ARM cores support it, but I'm not sure
> >>>> about custom cores or if there are SOCs with buses that don't support
> >>>> unaligned accesses properly.
> >>>
> >>> Well, I read the "...since Linux version 2.6.28" comment
> >>> in the gcc changes page in the way that they assume the
> >>> U-bit is set. (Although I'm not sure it really is???)
> >>
> >> Actually, the kernel checks the arch version and the U bit on boot,
> >> and chooses the appropriate setting for the A bit depending on the
> >> result. (See arch/arm/mm/alignment.c:alignment_init().)
> >
> > That is in the kernel itself, _after_ the decompressor has run. It is
> > not relevant to any discussion about the decompressor.
> >
> >> Currently, we depend on the CPU reset behaviour or firmware/
> >> bootloader to set the U bit for v6, but the behaviour should be
> >> correct either way, though unaligned accesses will obviously
> >> perform (much) better with U=1.
> >
> > Will someone _PLEASE_ address my initial comments against this patch
> > in light of the fact that it's now been proven _NOT_ to be just a V7
> > issue, rather than everyone seemingly burying their heads in the sand
> > over this.
>
> I tried adding -munaligned-accesses on a v6 build and still get byte
> accesses rather than unaligned word accesses. So this does seem to be a
> v7 only issue based on what gcc will currently produce. Copying Michael
> Hope who can hopefully provide some insight on why v6 unaligned accesses
> are not enabled.

This looks like a bug. Unaligned access is enabled for armv6 but
seems to only take effect for cores with Thumb-2.

Here's a test case both with unaligned field access and unaligned block copy:

struct foo
{
  char a;
  int b;
  struct
  {
    int x[3];
  } c;
} __attribute__((packed));

int get_field(struct foo *p)
{
  return p->b;
}

int copy_block(struct foo *p, struct foo *q)
{
  p->c = q->c;
}

With -march=armv7-a you get the correct:

bar:
	ldr	r0, [r0, #1]	@ unaligned	@ 11	unaligned_loadsi/2	[length = 4]
	bx	lr	@ 21	*arm_return	[length = 12]

baz:
	str	r4, [sp, #-4]!	@ 25	*push_multi	[length = 4]
	mov	r2, r0	@ 2	*arm_movsi_vfp/1	[length = 4]
	ldr	r4, [r1, #5]!	@ unaligned	@ 9	unaligned_loadsi/2	[length = 4]
	ldr	ip, [r1, #4]	@ unaligned	@ 10	unaligned_loadsi/2	[length = 4]
	ldr	r1, [r1, #8]	@ unaligned	@ 11	unaligned_loadsi/2	[length = 4]
	str	r4, [r2, #5]	@ unaligned	@ 12	unaligned_storesi/2	[length = 4]
	str	ip, [r2, #9]	@ unaligned	@ 13	unaligned_storesi/2	[length = 4]
	str	r1, [r2, #13]	@ unaligned	@ 14	unaligned_storesi/2	[length = 4]
	ldmfd	sp!, {r4}
	bx	lr

With -march=armv6 you get a byte-by-byte field access and a correct
unaligned block copy:

bar:
	ldrb	r1, [r0, #2]	@ zero_extendqisi2
	ldrb	r3, [r0, #1]	@ zero_extendqisi2
	ldrb	r2, [r0, #3]	@ zero_extendqisi2
	ldrb	r0, [r0, #4]	@ zero_extendqisi2
	orr	r3, r3, r1, asl #8
	orr	r3, r3, r2, asl #16
	orr	r0, r3, r0, asl #24
	bx	lr

baz:
	str	r4, [sp, #-4]!
	mov	r2, r0
	ldr	r4, [r1, #5]!	@ unaligned
	ldr	ip, [r1, #4]	@ unaligned
	ldr	r1, [r1, #8]	@ unaligned
	str	r4, [r2, #5]	@ unaligned
	str	ip, [r2, #9]	@ unaligned
	str	r1, [r2, #13]	@ unaligned
	ldmfd	sp!, {r4}
	bx	lr

readelf -A shows that the compiler planned to use unaligned access in
both. My suspicion is that GCC is using the extv pattern to extract
the field from memory, and that pattern is only enabled for Thumb-2
capable cores.

I've logged PR55218. We'll discuss it at our next meeting.

-- Michael
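For readers following the offsets in the listings above, here is a minimal sketch, assuming a GCC- or Clang-compatible compiler. It checks where the packed fields land (which is why the loads use [r0, #1] and, with a 4-byte int, [r1, #5]) and spells out in C what the four ldrb/orr instructions of the -march=armv6 output compute. The helper name load_le32_bytewise is made up for illustration and is not part of the original test case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the test case in the mail above. */
struct foo {
    char a;
    int b;
    struct { int x[3]; } c;
} __attribute__((packed));

/* With packing, b sits right after the single char, so it is misaligned. */
static_assert(offsetof(struct foo, b) == 1, "b starts at byte 1");
/* c follows b directly: offset 5 when int is 4 bytes, as on ARM. */
static_assert(offsetof(struct foo, c) == 1 + sizeof(int), "c follows b");

/*
 * Hypothetical helper: C equivalent of the ldrb + orr sequence in the
 * -march=armv6 listing. It assembles a 32-bit little-endian value from
 * four single-byte loads, so it never performs an unaligned word access.
 */
static uint32_t load_le32_bytewise(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Calling load_le32_bytewise((const unsigned char *)p + 1) on a little-endian target reads the same bytes as p->b in get_field(); the armv7-a output does that with a single unaligned ldr instead.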

12 years, 5 months

[ACTIVITY] 28-31 Jan

by Renato Golin

== Progress ==

* Maintenance
  - Fixing ARM buildbots, poking people to fix bugs, keeping them green
  - http://llvm.org/viewvc/llvm-project?view=rev&revision=173510
* Cost Model
  - Fixing some bugs on the generic code
  - http://llvm.org/viewvc/llvm-project?view=rev&revision=173691
  - Adding some simple free cast (plus some infrastructure)
  - http://llvm.org/viewvc/llvm-project?view=rev&revision=173849
* LLVM
  - Investigating APFloat issue on Chromebook (bad libraries?)
  - Clang miscompiles and shows the same symptoms, will play with options next week
  - AArch64 back-end in, to be built by default
* LAVA
  - Got three last errors due to include path ('bits/predefs.h' file not found)
  - libc6-dev + libstdc++-dev have no effect, problem doesn't show on buildbots
  - Testing heating problem with multiple images (only 12.02 is good)
  - Testing other boards, other images (with Dave)
* Friday Holiday

== Plan ==

* Try a bit more on the APFloat issue in Chromebook, but I think that's
  just bad distro (ChrUbuntu), since no one else has this problem. Has
  anyone put any Linaro image on a Chromebook?
* Continue working on getting faster builds on LAVA (quad-core origen,
  Arndale, etc) with Dave Pigot.
* Continue micro-benchmarking the vectorization and updating the
  cost-model. Start discussing the side-effects that are not modelled at
  all.

12 years, 6 months

[ANNOUNCE] Linaro Toolchain Binaries 2013.01 released

by Bernhard Rosenkränzer

The Linaro Toolchain Working Group and Platform Team are pleased to
announce the 2013.01 release of the Linaro Toolchain Binaries, a
pre-built version of Linaro GCC and Linaro GDB that runs on generic
Linux or Windows and targets the glibc Linaro Evaluation Build.

Uses include:
 * Cross compiling ARM applications from your laptop
 * Remote debugging
 * Build the Linux kernel for your board

What's included:
 * Linaro GCC 4.7 2013.01
 * Linaro GDB 7.5 2012.12
 * A statically linked gdbserver
 * A system root
 * Manuals under share/doc/

The system root contains the basic header files and libraries to link
your programs against.

The Linux version is supported on Ubuntu 10.04.3 and 12.04, Debian
6.0.2, Fedora 16, openSUSE 12.1, Red Hat Enterprise Linux Workstation
5.7 and later, and should run on any Linux Standard Base 3.0
compatible distribution. Please see the README about running on
x86_64 hosts.

The Windows version is supported on Windows XP Pro SP3, Windows Vista
Business SP2, and Windows 7 Pro SP1.

The binaries and build scripts are available from:
 https://launchpad.net/linaro-toolchain-binaries/trunk/2013.01

Need help? Ask a question on https://ask.linaro.org/

Already on Launchpad? Submit a bug at
https://bugs.launchpad.net/linaro-toolchain-binaries

On IRC? See us on #linaro on Freenode.

Other ways that you can contact us or get involved are listed at
https://wiki.linaro.org/GettingInvolved.

12 years, 6 months

Newbie question

by Kalai Narayanan-SSI

Hi,

I have a few armv7 assembly tests. I'm trying to compile these using the
linaro aarch64 toolchain and I'm getting errors. Is there any specific
flag that I have to pass to enable backward compatibility to allow v7
assembly to be compiled for a v8 model?

reset.s: Assembler messages:
reset.s:32: Error: operand 1 should be an integer register -- `mov r0,#0'
reset.s:33: Error: unknown mnemonic `mcr' -- `mcr p15,0,R0,C13,c0,1'
reset.s:36: Error: unknown mnemonic `mrc' -- `mrc p15,0,r0,c1,c0,0'
reset.s:40: Error: operand 1 should be a SIMD vector register -- `orr r0,r0,#0x00001000'
....

Relevant assembly code:

....
_reset:
        // init Context ID Register
        MOV     r0, #0
        MCR     p15, 0, R0, C13, c0, 1

        // Enable Instruction cache
        mrc     p15, 0, r0, c1, c0, 0
        /* set bits: 12 = I i-cache */
        orr     r0, r0, #0x00001000
        mcr     p15, 0, r0, c1, c0, 0
.....

This is my assembler command:

aarch64-linux-gnu-as -march=armv8-a+fp --keep-locals -o "reset.o" "reset.s"

Thanks,
Kalai

12 years, 6 months

[ACTIVITY] 21-25 January 2013

by Christophe Lyon

== Progress ==

* 64-bits ops in Neon: waiting for upstream.
* vectorizer cost model: initial activation with unaligned load/store
  cost equal to aligned ones; benchmarking shows no significant
  difference.
* smin-umin: a few benchmarks show a few unexpected regressions (10-15%).
* setting up spec2k on local board
* tcpanda heat problems: GCC built OK. Don't know how hot it became.

== Next ==

* handle 64-bits bitops in Neon feedback from upstream if any.
* analyze regressions in smin-umin
* check if more tuning of the vectorizer cost model is desirable.
* finish local board setup
* tcpanda: run gcc testsuite to check heat

12 years, 6 months

[ACTIVITY] 21-25 January 2013

by Yvan Roux

== Progress ==

* Boehm GC AArch64 support:
  - Tested on Foundation model
  - Patches sent to mailing list
  - Boehm GC has been accepted and merged into mainline
  - Libatomic_ops under review, some improvements are needed.

== Next ==

* Boehm GC AArch64 support:
  - Fix libatomic_ops for mainline merge
* Start gc sections support for AArch64 binutils
* Review roster

12 years, 6 months

[Activity] Week 04

by Zhenqiang Chen

Summary:
* Investigate Automotive benchmark performance on different branch cost.

Details:
1. Automotive benchmark performance analysis for different branch cost
   on Pandaboard ES.
* Design small test cases to simulate bitmnp01 to compare the
  performance between ITTT and conditional branch. Test results show
  - If branch prediction does not work (put the code in a function),
    ITTT is always better than conditional branch.
  - If branch prediction works (inline the code in the loop body), for
    most cases, conditional branch is better than ITTT.
* Code alignment has a big impact on tblook01. By default the IT block
  has better performance. When adding __attribute__((aligned (16))) to
  function t_run_test, the conditional branch performs better than the
  IT block.

2. Prepare Linaro toolchain binary release.
* Update Linaro crosstool-ng local patches due to the fix of lp:1067766
  in source package.
* Spawn all builds and smoke tests.

Plan:
* Investigate SPEC2k performance for different branch costs.
* Work with Bero for 2013.01 toolchain binary release.

Planned leave:
* Feb. 9 - 15: Chinese Spring Festival.

Best Regards!
-Zhenqiang
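As an illustration of the alignment experiment mentioned above, here is a minimal sketch, assuming GCC or Clang. The body of t_run_test is not shown in the report, so t_run_test_sketch is a made-up stand-in; its conditional update is merely the kind of code a compiler may turn into either an IT block or a conditional branch. The only thing taken from the report is the __attribute__((aligned (16))) annotation.

```c
#include <assert.h>
#include <stdint.h>

/* The cast below must not truncate a function address. */
static_assert(sizeof(uintptr_t) >= sizeof(void (*)(void)),
              "uintptr_t can hold a function address");

/*
 * Hypothetical stand-in for the benchmark function. The attribute asks
 * the toolchain to place the first instruction on a 16-byte boundary.
 */
__attribute__((aligned(16), noinline))
int t_run_test_sketch(const int *values, int count, int threshold)
{
    int sum = 0;
    for (int i = 0; i < count; i++) {
        /* Candidate for if-conversion (IT block) or a conditional branch. */
        if (values[i] > threshold) {
            sum += values[i];
        }
    }
    return sum;
}

/* Returns non-zero when the function's entry address is 16-byte aligned. */
int entry_is_16_byte_aligned(void)
{
    /* On Thumb targets bit 0 of a function pointer is the mode bit: mask it. */
    uintptr_t addr = (uintptr_t)&t_run_test_sketch & ~(uintptr_t)1;
    return (addr % 16) == 0;
}
```

Whether the aligned variant ends up faster with branches or with IT blocks depends on the core and the surrounding code, as the tblook01 numbers above suggest; the sketch only shows how the alignment is requested and checked.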

12 years, 6 months

[ACTIVITY] 21-25 Jan

by Renato Golin

== Progress ==

* Buildbot
  - Taking buildbot to Linaro
  - Had wireless/GPU overheating, disabled kernel modules
  - Running smooth again (most of the time)
  - Debugging errors that only appear on ARM.
* Building and Testing LLVM
  - Compiling on Intel with only the ARM backend helps a lot
  - Sent a call for action for people to clean up cross-compilation failures
* LAVA
  - Progress on LAVA LLVM job
  - Got it checking out, configuring and building
  - Got PASS/FAIL/SKIP patterns working
  - https://validation.linaro.org/lava-server/scheduler/job/46027
  - Need to get a patch from a specific place to apply
* Cost Model
  - Re-wrote table lookup patch a few times, finally in for good
  - http://llvm.org/viewvc/llvm-project?rev=173382&view=rev
  - Studying costs of instructions, all seem good enough
  - Better approach now is to change the target description (less code,
    more gain)
* EuroLLVM
  - 136 people so far

== Plan ==

* Test distcc (or similar) on Pandas
* Get a buildbot running with cross-compilation
* Internal git repository for LAVA LLVM job
* Confirm Linaro's sponsorship for EuroLLVM
* Continue cost model changes in between

== Background ==

* Monitor list for ARM changes
* Monitor buildbot for failures

12 years, 6 months

[ACTIVITY] report week 04

by Peter Maydell

Activity:
 * calls and meetings (about 20% of my working week this week ;-))
 * finished rebasing and testing the KVM QEMU patches (thanks to
   Pawel for getting me an updated RTSM device tree), sent out
   updated version to go with -v17 kernel
 * minor qemu maintenance patches (including a minor cfi01 flash
   model bugfix)
 * trying to track down issues running a 3.8-rc4 vexpress kernel on
   QEMU. Among other things:
   * looks like we need to emulate some more of the oscillator and
     voltage config registers now (if only to make the kernel a bit
     quieter)
   * the kernel doesn't like the way qemu's boot loader puts the DTB
     blob after the initrd but beginning in the same page as the
     initrd ends [free_initrd_mem will trash memory outside the
     initrd proper but inside that last page]
   * a15 reports the wrong board model number

-- PMM
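To make the DTB/initrd page-sharing problem above concrete, here is a minimal sketch of the address arithmetic. It is not QEMU's boot loader code: the function names, the 4 KiB page size and the idea of rounding the DTB address up to the next page boundary are all assumptions for illustration, and the report does not say how the issue was eventually resolved.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

static_assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* Round addr up to the next page boundary (no-op if already aligned). */
static uint64_t page_align_up(uint64_t addr)
{
    return (addr + SKETCH_PAGE_SIZE - 1) & ~(uint64_t)(SKETCH_PAGE_SIZE - 1);
}

/*
 * True when the first byte of the DTB lies in the same page as the last
 * byte of the initrd, i.e. the layout described in the report.
 * Assumes initrd_size > 0.
 */
static int dtb_shares_initrd_page(uint64_t initrd_start, uint64_t initrd_size,
                                  uint64_t dtb_start)
{
    uint64_t initrd_last = initrd_start + initrd_size - 1;
    return (initrd_last / SKETCH_PAGE_SIZE) == (dtb_start / SKETCH_PAGE_SIZE);
}

/* Hypothetical placement: first page boundary at or after the initrd end. */
static uint64_t place_dtb_after_initrd(uint64_t initrd_start,
                                       uint64_t initrd_size)
{
    return page_align_up(initrd_start + initrd_size);
}
```

With an initrd at 0x1000 of size 0x801, a DTB placed immediately afterwards at 0x1801 shares the initrd's last page, whereas place_dtb_after_initrd() would put it at 0x2000, outside anything that freeing whole initrd pages could touch.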

12 years, 6 months

AArch64 cross compile

by naveen yadav

Dear All,

Is it possible to build for AArch64 in 32-bit mode? For example, on a
64-bit x86 machine I can install a 32-bit OS, and the machine runs
32-bit binaries. Is it likewise possible to use an AArch64 (Cortex-V8)
machine with a 32-bit kernel and a 32-bit toolchain?

If the answer is yes, can I build such a toolchain, or is there an
option for this in the Linaro cross-compiler available from
https://launchpad.net/linaro-toolchain-binaries/+milestone/2012.12

Thanks

12 years, 6 months
