On 12 June 2012 18:53, Akash D <akashd(a)renuelectronics.com> wrote:
> Hello Michael,
>
> Thanks for reply.
>
> The required information is mentioned below.
>
> Compiler Used --->
>
> http://launchpad.net/gcc-arm-embedded
>
> Version number ---->
>
> arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 4.6.2 20110921
> (release) [ARM/embedded-4_6-branch revision 182083]
Yip, this is ARM's Cortex-R & M bare metal toolchain. It doesn't come
directly from Linaro but we've got a good relationship with them.
I'll ping them and see the best place to ask this question.
> The link file is attached with this mail.
I only had a quick read, but you're missing a capture for the
'.text.startup' section in the linker script. You might need
something like:
    *(.text.startup)
before the one.o(.text) line or to change the *(.text) capture to
*(.text*). The startup section might need to be at a fixed address.
Please check your chip and toolchain documentation to confirm.
-- Michael
While benchmarking the auto-vectoriser on Libav, I noticed a performance
regression in gcc 4.7 (both FSF and Linaro) compared to gcc 4.6 in the AAC
decoder. I narrowed it down to this function:
static void ps_hybrid_analysis_ileave_c(float (*out)[32][2],
                                        float L[2][38][64],
                                        int i, int len)
{
    int j;
    for (; i < 64; i++) {
        for (j = 0; j < len; j++) {
            out[i][j][0] = L[0][j][i];
            out[i][j][1] = L[1][j][i];
        }
    }
}
While gcc 4.6 does not attempt to vectorise this at all, 4.7 goes crazy
with a massive slowdown, about 20x slower than non-vectorised with Linaro
4.7 and much worse with FSF 4.7.
Let me know if you need more information.
--
Mans Rullgard / mru
== Progress ==
* Connect last week.
* Worked through the open issues and open work items related to
performance; we now have a clear list of things that are currently in
flight. The next step is to keep track of this better than the wiki page
https://wiki.linaro.org/RamanaRadhakrishnan/Sandbox//RRQ212ConnectNotes
does, and to move it into a form that we can use to
talk through during our regular performance meetings.
* Created blueprints, closed down old issues and reprioritized
issues with Ulrich and others.
* A number of interesting conversations during Connect for a number
of compiler related issues.
* Other sessions that I attended included the Android optimizations
sessions. While there was quite a bit about toolchain performance, it
is important that we keep looking at the performance profiles to
find areas where the toolchain can be improved. However, this can't be
done without getting more testcases from other groups. There were a
couple of interesting comments that skia is CPU bound, which would
indicate that the paint function is CPU bound. But why, and how?
Someone should look at reproducing these numbers and see where we get
to in this area. Pointed out that cortex-strings might be a good
candidate to go into bionic.
* Fixed the vrev off-by-one error and committed the fix to FSF trunk.
However, it couldn't make it in time for FSF 4.7.1 as the merge window
had closed by then.
* Set up my panda board to be identical to what runs in our
validation labs.
* This week
* Worked through the merge requests and moved some patches
upstream away from the "toreview" state.
* Landed a few merge requests that had been approved but not yet
merged. Took care of merging the upstream 4.7 branch.
* Given I only had a few hours back in the office this week, I
worked on regenerating arm_neon.h to use __builtin_shuffle with
vrev64, vrev32, vtrn, vzip and vuzp. A follow-up patch needs to do
the same for vext, but that also needs generic support in
vec_perm_const_ok. Once that is done I think we can safely start
rewriting. It still needs more testing and polishing, but the
initial results on the testcase from PR48941 are kind of neat. The
results for some of the other testcases that I've looked at also look
much better than where we were a few weeks back. So all in all, nice
progress on that front. However, we also have to find a way of getting
these generated at -O0, which this approach doesn't appear to do
cleanly enough.
For one example, the output looks like this. Notice those spills
beginning to disappear... :)
New:
sqrlen4D_16u8:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vabd.u8 q1, q0, q1
        vmull.u8 q0, d2, d2
        vmull.u8 q8, d3, d3
        vuzp.32 q0, q8
        vpaddl.u16 q0, q0
        vpadal.u16 q0, q8
        bx lr
Old:
sqrlen4D_16u8:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        @ link register save eliminated.
        vabd.u8 q1, q0, q1
        stmfd sp!, {r4, fp}
        add fp, sp, #4
        sub sp, sp, #48
        add r3, sp, #15
        vmull.u8 q0, d2, d2
        bic r3, r3, #15
        vmull.u8 q8, d3, d3
        vuzp.32 q0, q8
        vstmia r3, {d0-d1}
        vstr d16, [r3, #16]
        vstr d17, [r3, #24]
        vpaddl.u16 q0, q0
        vpadal.u16 q0, q8
        sub sp, fp, #4
        ldmfd sp!, {r4, fp}
        bx lr
* Attended platform / WG sync-up.
== Plans ==
* Clean up the ml bits of rewiring the intrinsics and try some proper testcases.
* Work on the auto-inc-dec scheduler patches.
* Rework the sched-pressure patch upstream.
* Review the Android benchmarking writeups.
Summary:
* Bug fixes.
* Tune ivopt for code size.
Details:
1. Reproduced lp:1007353 "kernel build fails with 12.04 and 12.05
toolchain released" and worked out a patch to fix it; reopened the related
binutils/gas bug http://sourceware.org/bugzilla/show_bug.cgi?id=12698
and proposed the patch there; pushed the patch to Linaro crosstool-ng to
make sure lp:1007353 is fixed for the next binary toolchain release.
2. Set up the SPEC build environment and reproduced lp:886124 "using LDR from
literal pool rather than MOVW/MOVT". After cprop1 replaces
(lo_sum (high (symbol_ref (block))) (symbol_ref (block))) with a plain
(symbol_ref (block)), no later optimization can split it. The solution in Linaro
4.5 is to add a split (ported from CodeSourcery) in arm.md, so that
split1 can split the (symbol_ref (block)). The split is:
(define_split
  [(set (match_operand:SI 0 "arm_general_register_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  "TARGET_32BIT
   && TARGET_USE_MOVT && GET_CODE (operands[1]) == SYMBOL_REF
   && !flag_pic && !target_word_relocations
   && !arm_tls_referenced_p (operands[1])"
  [(clobber (const_int 0))]
{
  arm_emit_movpair (operands[0], operands[1]);
  DONE;
})
3. Tuned ivopt for code size. Tried setting avg_loop_niter to 1, since the
loop iteration count does not affect code size, but tests show no
improvement. More tuning is needed.
Plans:
* Analyze the failed cases in arm-linux-gnueabihf regression test.
* Tune code size for M0.
Best regards!
-Zhenqiang
Hello Sir/Madam,
I am using the MK60FN1M0VLQ12 (Cortex-M4) processor for my development.
I am using float and double data types in my code. When I perform any
mathematical operation on these variables, the processor goes into a Hard Fault
exception.
Earlier I used the GCC 4.5.2 compiler.
I am now using Linaro's GNU GCC toolchain 4.6.2 to compile my code
with the following command:
arm-none-eabi-gcc -Wall -mfpu=fpv4-sp-d16 -mfloat-abi=softfp -mcpu=cortex-m4
-mthumb -Qn -Os -mlong-calls -c main.c -o main.o
But I am getting the following error while linking my code:
ld: section .text.startup loaded at [00032258,000331cb] overlaps section
.InitializedVariables loaded at [00032258,00032787]
The link file is attached with this mail.
Can you please suggest a solution for this problem?
Can you also suggest compiler options to support the float and double data
types in software?
Awaiting your reply,
Thanks & Regards,
Akash
== GCC ==
* Worked on reimplementing reassociation pass based on
review comments I had received.
* Identified root cause and worked on fix for vectorizer
bug causing unaligned memory accesses (reported by Mans).
Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Dirk
Wittkopp
Registered office: Böblingen | Registration court: Amtsgericht
Stuttgart, HRB 243294
Hi,
OpenEmbedded-Core/meta-linaro:
* fixed the binary toolchain support on master (still 2012.03)
* fixed armhf support for Linaro GCC 4.6 on master
* backport of Linaro GCC 4.7 r114985
* tested the images using QEMU - no failures
* now the master branch supports building images for ARM, MIPS, PPC,
X86 and X86_64 using the latest (2012.05) releases of Linaro GCC
4.6 or 4.7
* added tags on meta-linaro to make it easy to find the revision for a
particular Linaro GCC
* changed cbuild to pull in the master branches of OE-Core and
meta-linaro
* merged the branch that allows building OpenEmbedded-Core using cbuild
http://bazaar.launchpad.net/~kwerner/cbuild/oecore/changes/
* updated docs on the wiki
Misc:
* public holiday on Thu, vacation on Fri
* I'll be back on Monday : )
Regards,
Ken
Hi,
GDB for Android:
* Submitted and committed trivial patch to gdbserver which made it
compile again on Android. A patch had been added which made gdbserver
use a MIPS-related constant which Android doesn't provide.
* Compared testsuite results of GDB on Android vs regular Linux.
Unfortunately there's a lot of noise because the GCC 4.4 used by
the Android SDK generates bad debuginfo, which confuses GDB and breaks
a lot of tests. Overall, GDB on Android seems to be in generally good
shape. I still need to run the testsuite again on an Android build based
on a newer compiler to get a better comparison.
* Mozilla has a GDB patch to call gdbarch_addr_bits_remove before
comparing PCs in breakpoint handling which seems like a sensible thing
to do. Still, running the testsuite with and without this patch didn't
make a difference on Android or regular Linux.
* Remotely attended the GDB for Android session at Connect. Prepared
the following page to go with it:
https://wiki.linaro.org/ThiagoBauermann/Sandbox/AndroidGDBConnectSession
--
[]'s
Thiago Jung Bauermann
Linaro Toolchain Working Group
Hi,
GDB for Android:
* Created patch to expand the ~ in "set solib-search-path". Despite
doing the right thing in other commands, GDB doesn't understand ~ in
solib-search-path, which made me lose some time in a debugging session
trying to figure out what was going on. Committed upstream.
* Looked into the AOSP patch which hardcodes use of fork tracing instead of
thread events. Found out that gdbserver actually already prefers fork
tracing on both Linux and Android (tested on ICS and Linaro 12.04).
This must have been a problem in some earlier version, and the patch
is unnecessary now.
* Set up a QEMU instance with Linaro Android 12.04. Got dropbear ssh
on it and ran the GDB testsuite remotely on the VM.
--
[]'s
Thiago Jung Bauermann
Linaro Toolchain Working Group