Hi,
I have been using CodeSourcery tool-chains since forever, and I
decided to try Linaro's tool-chain. So far I haven't had any issues,
except one:
libgcc_s.so.1 is located in arm-linux-gnueabi/lib, which is not
available from arm-linux-gnueabi/libc. The CodeSourcer tool-chain has
all those files inside the 'libc' directory, which is useful for
Scratchbox2, because you need to specify a "target root" which is
basically the libc directory which acts in a similar fashion to /.
I've managed to create a rule to workaround this issue, but I wonder
why are those files in that location, why not in 'libc/lib'?
Cheers.
--
Felipe Contreras
=== Progress ===
* Worked on the VFP addressing modes patch upstream. Handled most
comments. Final version has finished testing and looks almost ready to
commit.
* Investigated an issue with min type transformations for loop
terminating conditions. Wrote up a small patch which appears to do the
right thing - passed a bootstrap on x86 but that probably means it
never got triggered :( .
The particular case of interest was not vectorizing :
#define min(x,y) ((x) <= (y) ? (x) : (y))
int a[256] __attribute__((aligned (16)));
int b[256] __attribute__((aligned (16)));
int c[256] __attribute__((aligned (16)));
void foo (int x, int y)
{
int i;
for (i = 0;
i < x && i < y;
// i < min (x, y);
i++)
a[i] = b[i] * c[i];
}
but vectorizing the commented region. I've tentatively worked out a
fix in tree-loop-im.c which looks like a bit of a grotesque hack ....
* Attended LLVM devcon in Week 15. Useful and interesting and conference.
== Plans ==
* Pursue backporting gnu_unique_object upstream.
* Look at some of the existing blueprints and start discussions around
prioritizing this.
* Investigate some of the SEGVs with h-c partitioning.
* Finish off the VFP addressing modes patch.
Absences.
* 03 May 2012 - 08 May 2012.
* Linaro Connect Q2.12 - May 28 - June 1 -
== GCC ==
* Checked in backport patch to fix LP #972648
into Linaro GCC 4.6.
* Checked in fix for incorrect vld alignment hints to
FSF mainline and 4.7 branch.
* Investigated options to fix stack re-alignment.
* Ongoing investigation of LP #959242: design problem in vectorizer
pattern detection logic causes ICE in certain cases where an
original sequence is recognized as part of *two* potential patterns
simultaneously.
* Ongoing work on improving end-of-loop value computation.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Current Milestones:
|| || Planned || Estimate || Actual ||
||cp15-rework || 2012-01-06 || 2012-04-10 || ||
Historical Milestones:
||initial-a15-system-model || 2012-01-27 || 2012-01-27 || 2012-01-17 ||
||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04?|| 2012-02-01 ||
== cp15-rework ==
* sent out v1 patchseries for cp15 cleanup to the mailing list, finally
* handled review comments on drop-cpu-reset-model-id patchseries
== kvm-boot-wrapper ==
* imported libfdt device tree library and Dave Martin's code from
the big.LITTLE boot-wrapper, to add support to the KVM boot wrapper
for handling device tree blobs. Patch series sent, will probably
commit next week unless there are review issues
* that will more or less wrap this blueprint up
== other ==
* usual upstream patch review; QEMU 1.1 soft feature freeze was start
of this week, hardfreeze will be beginning of May, people (including
me :-)) are trying to squeeze things in under the wire...
-- PMM
* Linaro GCC
Investigated the latest test failure for the neon-extend patch. The test
for GCC Bugzilla 43137 is failing again. It turns out to be because my
switching sign_extend to use a DImode output, rather than two SImode
subreg outputs has exposed a bug in the lower-subreg pass. I've spent
most of the week trying to figure what can be done about this and
discussing the problem with Richard Sandiford upstream.
Richard Earnshaw approved my neon-negate patch. That patch depends on
the neon-immediates patch which is not approved yet, so I'll have to
wait to commit it.
Continued looking at NEON-v-core register allocation. Not much progress
this week though.
* Other
Created a patch to make the GCC RTL dumps easier to diff. Posted it
upstream and discussed it on the list.
Vacation on Friday.
* Next week
Vacation Monday and Tuesday
Hello,
I've been following up on the discussion we had on Monday regarding stack
alignment, and noticed that I had mis-remembered the current state of
affairs. Ramana asked me on Tuesday to provide a write-up of the actual
status, so here we go ...
To summarize the background of the problem: on ARM, the incoming stack
pointer is only guaranteed to be aligned to an 8 byte boundary. This means
that objects on the stack (local variables, spill slots, temporaries etc.)
cannot easily be aligned to more than 8 bytes. This can potentially cause
problems in two situations:
1) The object's default alignment (according to its type) is larger than 8
bytes
2) The object has a forced non-default alignment that is larger than 8
bytes
The first situation should in theory never appear, since according to the
ARM ABI all types have a default alignment of at most 8 bytes. However,
due to the current mix-up in GCC, vector types actually are considered to
have a 16-byte alignment requirement in GCC.
The second situation can only appear with local variables that are declared
using attribute ((aligned)).
We had discussed on Monday that we need to fix the second situation, since
this can always occur and is supported on other platforms. By doing so,
we would then automatically fix the first situation as well.
However, this reasoning turns out to be incorrect. There are currently in
GCC *two* completely separate mechanisms that can be used to align objects
on the stack to larger than the ABI guaranteed stack pointer alignment:
A) Re-alignment of the full stack frame. This is what is used by the Intel
back-end (and only the Intel back-end). At function entry, generated code
will align the stack pointer itself to whatever is necessary to fulfil
alignment requirements of all objects on the stack. This may necessitate
follow-on changes: the frame pointer, if there is one, will likewise need
to be aligned at runtime. Also, since incoming stack arguments are now no
longer at a fixed offset relative to the stack pointer *or* frame pointer
in some cases, we might need an extra register as argument pointer. This
method allows extra alignment for *any* object on the stack, but needs
significant back-end support in order to be enabled on any non-Intel
architecture.
B) Dynamic allocation of selected stack variables. This is implemented by
common code with no involvement of the back-end. In effect, the code in
cfgexpand.c:expand_stack_vars that decides on how to allocate local
variables on the stack will remove all variables that require extra
alignment and place them into an extra structure. Generated prologue code
will then in effect dynamically allocate and align that structure on the
stack, and just store a pointer to it as "variable" into the normal stack
frame. All other areas of the frame are unaffected. Since this method
just simulates code the programmer could have written themselves using
alloca, it does not require *any* back-end support and is enabled by
default everywhere. However, it only works for regular local variables,
and not for any other objects on the stack.
Objects on the stack *except* local variables always use default alignment.
Since on most platforms, except Intel and *currently* ARM, the ABI stack
pointer alignment is sufficient to implement default alignments, method B)
as above is able to fulfil all stack alignments. Intel uses method A), so
they're also OK. In effect, it's only ARM due to the vector type
alignment problem that runs into the situation that neither method works.
Under those circumstances, given that:
- we want to fix vector type alignment in order to become ABI compliant
- once we've fixed this, we're in the same situation as other platforms and
method B) already fixes stack alignment problems
- implementing method A) is therefore both quite involved *and* actually
superfluous
I'd now rather recommend that we *don't* try to implement method A) (full
stack-frame re-alignment) on ARM.
Comments?
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Hi,
GDB for Android:
* Continued conversation about .note.ABI-tag.
* Committed upsream the second of three patches for building
gdbserver on Android.
* Continued working on a crtbrand.c for bionic.
2012.04 release:
* Tested the gdb-linaro/7.4 branch on armv5tel, x86 and x86_64, both
natively and remotely. Compared results against the 2012.02 release.
* Ran the release process, made the release.
--
[]'s
Thiago Jung Bauermann
Linaro Toolchain Working Group
Hi,
GDB for Android:
* Investigated .note.ABI-tag generation in glibc and FreeBSD.
Wrote crtbrand.c for bionic based on FreeBSD's crtbrand.c.
Tried to integrate it with bionic's build system.
* Contacted Google engineers about adding .note.ABI-tag to Android
binaries.
2012.04 release:
* Started working on the release. Applied the latest patches from
the FSF 7.4 branch to gdb-linaro/7.4 and also my patches to compile
gdbserver with the Android toolchain.
--
[]'s
Thiago Jung Bauermann
Linaro Toolchain Working Group
* Linaro GCC
Spun release tarballs for Linaro GCC 4.6 and 4.7. Pushed them to
Michael's servers and launched the testing.
Continued trying to get the 64-bit NEON stuff to work. The negdi2 patch
needed some reworking following upstream review, and the extend patch
has mysteriously reintroduced a performance regression when the
operation is done in core-registers, which needs to be solved, but the
other patches seem ok at the moment, although the shift stuff is blocked
on benchmarking.
I've begun looking at how my changes affect spec2000, and whether the
register allocation could be done better. Unfortunately, given that the
test systems are currently giving spec results with a low level of
consistency (and therefore confidence), this has mostly been by visual
inspection, so far.
* Other
Public Holiday on Monday.
I have a cross toolchain I configured with "--with-arch=armv7-a --with-cpu=cortex-a9 --with-tune=cortex-a9" and I want the linker to emit armv4t compatible thumb interworking, but I can't seem to get it to.
I noticed that if I create a armv4t toolchain with "--with-arch=armv4t --with-cpu=arm7tdmi --with-tune=arm7tdmi" and then I pass "--use-blx" to the linker it will emit armv7 thumb interworking. There doesn't seem to be any inverse "--no-use-blx" type switch though. Is this a bug/limitation of the linker or am I misunderstanding something?
-Allen
nvpublic