linaro-toolchain January 2011

linaro-toolchain@lists.linaro.org

23 participants
53 discussions

Extension elimination pass breaks SPU (Fw: spu gcc-4.5 linaro build failure)

by Ulrich Weigand

Hello,

Matthias noticed the following ICE when attempting to build the SPU compiler from the Linaro GCC 4.5 sources:

    ../../../../src-spu/libgcc/../gcc/libgcc2.c: In function '__fixunssfdi':
    ../../../../src-spu/libgcc/../gcc/libgcc2.c:1344:1: internal compiler error: in spu_expand_mov, at config/spu/spu.c:4575

It turns out that this is due to the new "extension elimination" pass that was recently added in Linaro GCC, as a port from the CodeSourcery compiler. This patch has also been proposed upstream, but not yet included.

The problem is that this patch seems to frequently introduce instructions that *set* a sub-word lowpart subreg of a register. Now such instructions, according to the docs, are probably valid RTL, but since the effect of the instruction on the highpart of the register is deliberately left unspecified, they tend to be very infrequently used. Probably because of this, there seem to be parts of the compiler that simply don't handle such instructions correctly. This has already been noticed in the case of the RTL loop optimizers (see the discussion at http://gcc.gnu.org/ml/gcc/2010-11/msg00552.html).

The failure in the SPU back-end is another instance of the same problem. SPU needs special code to handle subregs (since a "lowpart" SImode subreg of a DImode register is not actually valid on the SPU, because SImode values live in bytes 0..3 while DImode values live in bytes 0..7 of the otherwise big-endian 16-byte SPU registers), and this code simply aborts when given an assignment to a sub-word lowpart subreg.

Now, I guess there are two ways forward: either the outcome of the ongoing discussions on gcc-patches is that it is in fact not a good idea to generate such sets, and the EE pass is subsequently rewritten to avoid them; or else, if those instructions are considered valid, I'll have to extend the SPU move expander to handle them.

Thoughts?

Matthias, if you need a quick workaround for now, I guess you could disable the new pass for SPU by adding a line "flag_ee = 0;" to spu_override_options.

Best regards
Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Dirk Wittkopp
Registered office: Böblingen | Registration court: Amtsgericht Stuttgart, HRB 243294
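
The register layout described in the message above can be pictured with a small host-side C model. This is only an illustrative sketch, not SPU back-end code: the type `spu_reg` and the functions `store_di`, `load_si_slot` and `load_di_lowpart` are hypothetical names introduced here. It shows that the low 32 bits of a DImode value end up in bytes 4..7, which is not the slot (bytes 0..3) where an SImode value is expected to live.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 16-byte SPU register, bytes in big-endian order. */
typedef struct { uint8_t b[16]; } spu_reg;

/* Place a 64-bit value where DImode values live: bytes 0..7. */
void store_di(spu_reg *r, uint64_t v)
{
    memset(r->b, 0, sizeof r->b);
    for (int i = 0; i < 8; i++)
        r->b[i] = (uint8_t)(v >> (8 * (7 - i)));
}

/* Read four consecutive bytes as a big-endian 32-bit value. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* The slot where SImode values live: bytes 0..3. */
uint32_t load_si_slot(const spu_reg *r)
{
    return read_be32(&r->b[0]);
}

/* The arithmetic low 32 bits of the DImode value: bytes 4..7. */
uint32_t load_di_lowpart(const spu_reg *r)
{
    return read_be32(&r->b[4]);
}
```

In this model, storing 0x1122334455667788 leaves 0x11223344 in the SImode slot and 0x55667788 in bytes 4..7, so the two views only agree if a move shuffles the bytes, which is presumably why the back-end needs its own subreg handling.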

15 years, 3 months

Help with define_insn

by Ira Rosen

Hi,

I am trying to implement interleave_high/low and extract_even/odd using vzip and vuzp instructions. I am attaching a patch that attempts to do that. It uses the already existing neon_vzip<mode>_internal. The problem with it is that it doesn't express the fact that the two outputs of vzip depend on both inputs, which causes wrong code generation in CSE: for

    (a,b) <- vzip (c,d)
    (e,f) <- vzip (g,d)

CSE decides that b==f, since at the RTL level b and f depend only on d.

Here is neon_vzip<mode>_internal:

    (define_insn "neon_vzip<mode>_internal"
      [(set (match_operand:VDQW 0 "s_register_operand" "=w")
            (unspec:VDQW [(match_operand:VDQW 1 "s_register_operand" "0")]
                         UNSPEC_VZIP1))
       (set (match_operand:VDQW 2 "s_register_operand" "=w")
            (unspec:VDQW [(match_operand:VDQW 3 "s_register_operand" "2")]
                         UNSPEC_VZIP2))]
      "TARGET_NEON"
      "vzip.<V_sz_elem>\t%<V_reg>0, %<V_reg>2"
      [(set (attr "neon_type")
            (if_then_else (ne (symbol_ref "<Is_d_reg>") (const_int 0))
                          (const_string "neon_bp_simple")
                          (const_string "neon_bp_3cycle")))]
    )

Is there a way to properly mark the dependence?

Thanks,
Ira
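
To make the dependence concrete, here is a small C model of what an in-place zip of two registers computes. It is a sketch for illustration only: `vzip_model` and `LANES` are hypothetical names, and four 16-bit lanes are assumed (as for a 64-bit D register with 16-bit elements). Each output is built from lanes of both inputs, so two zips that share only their second input generally produce different second outputs.

```c
#include <stdint.h>

#define LANES 4  /* assumed: four 16-bit lanes of a 64-bit D register */

/* In-place model of a zip: interleave the lanes of a and b.
   Afterwards a holds the first half of the interleaved sequence
   and b holds the second half. */
void vzip_model(uint16_t a[LANES], uint16_t b[LANES])
{
    uint16_t z[2 * LANES];

    for (int i = 0; i < LANES; i++) {
        z[2 * i] = a[i];      /* even positions come from a */
        z[2 * i + 1] = b[i];  /* odd positions come from b */
    }
    for (int i = 0; i < LANES; i++) {
        a[i] = z[i];
        b[i] = z[LANES + i];
    }
}
```

With c = {0, 1, 2, 3} and d = {10, 11, 12, 13}, the model leaves {0, 10, 1, 11} in the first register and {2, 12, 3, 13} in the second. Zipping g = {20, 21, 22, 23} with the same d instead leaves {22, 12, 23, 13} in the second register, so treating the second output as a function of d alone is not safe.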

15 years, 3 months

qemu-linaro prerelease available

by Peter Maydell

Hi; this is a note to say that we have now produced a prerelease tarball of qemu-linaro. (The first formal qemu-linaro release will happen in sync with other toolchain group releases on 8th Feb.)

This prerelease is primarily to pipeclean the release process and to allow work to start on producing Ubuntu and Linaro packages; however it does include a number of useful bugfixes which are required if you want to be able to boot a recent Linaro snapshot on the beagle model. So the enthusiastic might like to build it from source and give it a spin.

Like the Linaro kernel trees, the qemu-linaro tree aims to only include patches we are confident will go upstream; at the moment this means the OMAP3 support and ARM correctness fixes from the qemu-meego tree, based on the qemu upstream trunk.

You can download the source tarball from:
https://launchpad.net/qemu-linaro/+milestone/2011.02

-- Peter Maydell

15 years, 3 months

Generating ancillary sections with gas

by Dave Martin

Hi all,

With gas, does anyone know of a way to create a section whose name is based on that of the current section?

The specific requirement is to be able to define a generic macro like the example "fixup" below, whose purpose is to record ancillary data related to some other section. To illustrate:

    .macro fixup
    100\@ :
        .pushsection fixup<current section name>, "a"
        .long 100\@b
        .popsection
    .endm

    .text
    ...
    fixup
    .long sym1
    ...
    .section .other, "ax"
    ...
    fixup
    .long sym2

The Linux kernel uses a technique just like this for patching SMP kernels at bootup to work on uniprocessor platforms (when CONFIG_SMP_ON_UP is enabled), resulting in code looking something like this:

    void exit __attribute__ (( __section__ (".text.exit") ))
    {
        ...
        asm(
            ...
            FIXUP("something")
            ...
        );
    }

Note that the inline asm may actually come out of a generic header file rather than being explicitly written for this invocation. So it may have to be truly generic.

As far as I have been able to determine, it's not possible to generate sections named based on the current section. In practice, the kernel puts all the fixups into a single section. The downside of this is that when sections are selectively discarded at link time (which in general may happen -- for example, Linux discards the "module exit" code for drivers which are built into the kernel and therefore never exit) there is no way to selectively discard the related fixup entries.

Currently the only solution is to include all the module exit code in the image and discard it at run-time when the kernel boots. This is obviously wasteful. Attempting to discard that code at link time results in a link error, since fixups refer to the removed sections.

Of course, the "fixup" macro could be given an extra parameter to name the containing section, but the macro can then no longer be called in a generic way: all the calls to that macro must be manually (and buggily) maintained to ensure that the referenced section name is correct, some object post-processing must be done before linking, and/or a tool must be created to implement the missing assembler functionality. Unfortunately, such solutions are likely to be too fragile or complex to make it upstream.

It's interesting to note that the same problem will apply for any section containing ancillary data for another section. In particular, it looks like either the ABI or the assembler has had to grow a special-case workaround for this in order to support exception unwind information sections generated by .fnstart ... .fnend in a sane way: the unwind information sections get called .ARM.ex{idx,tab} for .text, and .ARM.ex{idx,tab}<section> for any other section. As a consequence, link-time discarding can handle this information properly, but IMHO this is a bit of a cheat and admits the general need to create sections with names based transparently on those of other sections, without satisfying that need. .popsection is also an example of such a cheat: most other aspects of assembler state still cannot be saved and restored.

In general, it would be useful if gas supported some general reflective abilities: i.e., the ability to query the current assembler state (section, subsection, active instruction set, active macro mode, etc.) and/or the ability to wrap or hook existing pseudo-ops.

For example, the above problem would almost certainly be solvable using assembler macros (albeit painfully) if wrapper macros could be defined for the section manipulation directives (.section, .text, .data, .bss, .pushsection, .popsection, .previous). However, supporting some magic macro parameters reflecting the assembler state would be a lot simpler.

As an example of the kind of behaviour I think would be useful, the macro argument qualifier could be extended to allow macros to query the assembler state in a backwards-compatible way; something like:

    .macro fixup base_section:gas_current_section_name, old_altmacro:gas_macro_mode
        .altmacro
        LOCAL fixup_location
    fixup_location:
        .pushsection \base_section\().fixup
        .long 100\@b
        .popsection
        \old_altmacro
    .endm

Existing assembler code will continue to work just fine with this approach.

Note how this also enables a local label to be generated hygienically, by making it possible to save and restore the macro mode. Otherwise, .altmacro (and hence LOCAL) is hard to use safely, since the initial macro mode is unknown and can't be restored.

Any thoughts / comments?

Cheers.
---Dave

15 years, 3 months

[ACTIVITY] Jan 24--30

by Chung-Lin Tang

== Last week ==
* PR47246, VFP index range on Thumb-2. Submitted and committed patch upstream.
* Pinged two upstream submissions on gcc-patches, one for PR44557 and the other a patch for LP:689887; still awaiting approval.

== This week ==
* Chinese New Year Holiday, I'll be off until Feb. 8th.

15 years, 3 months

GCC 4.6 Upgrade Plan

by Andrew Stubbs

Hi All, The GCC 4.6 Upgrade plan can be found here: https://wiki.linaro.org/WorkingGroups/ToolChain/GCCUpgradePlan The plan describes how we should commit "upstream" patches while the FSF trunk is closed, and what we plan to use 4.6 for. Andrew

15 years, 3 months

[ACTIVITY] 24th -28th January 2011

by Andrew Stubbs

Created a Google docs spreadsheet to help visualise the benchmark results. The graphs are not very informative yet - too many lines and too much noise. I'm going to have to revisit them.

Continued trying to build Android. The toolchains build fine, but Android itself complains about -Werror, and there are a few other real errors too. Considering I was told it built fine with GCC 4.6 and all I needed to do was tweak 4.5 to match, I'm not terribly impressed. I'm sinking too much time into fixing up Android, and I haven't even got to looking at the compiler trouble. Alexander Sack has said he will try to get me to a more appropriate starting place (I think), so I'll see what happens there.

Discussed my maddhidi4 patch with Richard E (wearing his GCC ARM maintainer hat). He's not convinced that my change won't make something else produce worse code. I can't prove that it won't either, so I'm going to have to revisit it.

Wrote up the GCC 4.6 branch policy and upgrade plan.

15 years, 3 months

New phone number for today's call

by Michael Hope

Just a reminder that the dial-in numbers for today's and all future calls have changed. See:
https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings
for the new list.

I'll hang out on IRC just before the meeting to help the lost...

-- Michael

15 years, 3 months

[ACTIVITY] report week 04

by Peter Maydell

RAG:
Red:
Amber:
Green: qemu-linaro RC0 prerelease uploaded

Current Milestones:
|                              | Planned    | Estimate   | Actual     |
| first qemu-linaro release    | 2011-01-11 | 2011-01-11 |            |

Historical Milestones:
| finish virtio-system         | 2010-08-27 | postponed  |            |
| finish testing PCI patches   | 2010-10-01 | 2010-10-22 | 2010-10-18 |
| successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
| finish qemu-cont-integration | 2010-01-25 | 2010-01-25 | handed off |

* maintain-beagle-models:
  + went through diffs between qemu-linaro and qemu-meego to confirm we hadn't dropped any patches by mistake
  + tested qemu-linaro tree on ubuntu netbook image: boots OK
  + investigated and fixed qemu bugs caused by new x-loader
    https://bugs.launchpad.net/qemu-linaro/+bug/704484
    and new u-boot:
    https://bugs.launchpad.net/qemu-linaro/+bug/703094
  + made merge requests to meego for a CRIS compile failure and the x-loader bugfix
    http://meego.gitorious.org/qemu-maemo/qemu/merge_requests/4
    http://meego.gitorious.org/qemu-maemo/qemu/merge_requests/5
  + Went through the process of doing a qemu-linaro release with a "RC0" prerelease as a pipecleaning exercise and to provide a tarball to slangasek for doing packaging.
    Download: https://launchpad.net/qemu-linaro/+milestone/2011.02
    Release process writeup: https://wiki.linaro.org/WorkingGroups/ToolChain/QemuReleaseProcess

* merge-correctness-fixes
  + the usual upstream mailing list monitoring and code review
  + tested and sent meego VQ(R)DMULH.s16 fix upstream:
    http://patchwork.ozlabs.org/patch/80725/
  + working on a patch to fix decoding of the preload and hint space

Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May

15 years, 3 months

[ACTIVITY] 2011-01-28

by David Gilbert

SPEC

Tried to track down what was going on with lbm; it doesn't seem to be repeatable on canis1. I'd previously seen it fail at -O1 and work at -O0, and tried to chop down the flags between the two; but after adding all the flags back in on top of -O0 it still worked, and then I tried -O1 again and it worked. Going to try on another machine, but it might be uninitialised data somewhere.

Panda

Our panda arrived; it's now happily nestling near our Beagles and running the 0126 headless snapshot (with 0127 hwpack). It seems fine except for rather slow USB and SD IO.

Tip: Pandas do absolutely nothing (no LEDs, no serial console activity) unless you put an SD card in with the firmware on.

Libffi

Wrote the changes for armhf. Tested on arm, armhf, i386, ppc and s390x - all happy. (Not too surprisingly, variadic calls just work on everything other than armhf without the API change.)

Mailed Python CType list asking how much of a pain the API change will be and any hints on what might be affected.

Awaiting sign off for submission of code.

Optimised library routines

Looked at benchmarking 'git'; I'd seen previous discussions where it had been pointed out that it spends a lot of time in library routines, and indeed it does spend useful amounts in memchr, memcpy and friends. A simple

    git diff v2.6.36 v2.6.37 > /dev/null

of the current kernel tree produces a useful ~25 second run.

One interesting observation concerns the variation in the times reported by 'time' (user, system and real): the variation in user+system is much less than that of either user or system individually, and is quite stable (within ~0.7% over 10 runs).

I've just tried preloading my memchr routine in and it does get a consistent 1-1.2% improvement, which does look above the noise.

Also asked on libc-help list for suggestions as to other benchmarks people actually trust to reflect useful performance increases in core routines as opposed to totally artificial ones.

Dave
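
The user+system observation in the report above can be checked with a few lines of C. This is a sketch under assumptions: the report does not say how the ~0.7% figure was computed, so the measure used here, (max - min) / mean, is only one possible choice, and `add_series` and `relative_spread` are hypothetical helper names. If time is merely attributed differently between user and system from run to run, the shifts cancel in the sum, which would be consistent with the sum being steadier than either part.

```c
#include <stddef.h>

/* Element-wise sum of two series of n timings (e.g. user and system). */
void add_series(const double *a, const double *b, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Relative spread of n samples, defined here as (max - min) / mean.
   Returns 0 for an empty series or a zero mean. */
double relative_spread(const double *x, size_t n)
{
    if (n == 0)
        return 0.0;

    double lo = x[0], hi = x[0], sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] < lo)
            lo = x[i];
        if (x[i] > hi)
            hi = x[i];
        sum += x[i];
    }

    double mean = sum / (double)n;
    return mean != 0.0 ? (hi - lo) / mean : 0.0;
}
```

Feeding the per-run user, system and summed times through `relative_spread` gives three comparable numbers; a benchmark gain would then want to be clearly larger than the spread of the series it is measured on.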

15 years, 3 months
