linaro-toolchain February 2011

linaro-toolchain@lists.linaro.org

27 participants
66 discussions

by Ken Werner

Hi,

* Continued to look into latrace and found an issue that occurs when a dynamic library gets unloaded. Otherwise latrace looks quite good on ARM.
  https://wiki.linaro.org/KenWerner/Sandbox/latrace
* Chasing bugs:
  - After a lot of testing, Andy Green has made a big step forward in finding the root cause of the shut-down issue on my PandaBoard. The PMIC sees an overcurrent and issues an interrupt that is ignored by current kernels. The PMIC then shuts the board down for safety reasons. As a workaround, Andy has made a kernel patch for the twl6030 driver that enables all interrupt sources. The kernel then acknowledges the overcurrent reported by the PMIC and the board survives. A patched kernel binary can be found at:
    https://wiki.linaro.org/KenWerner/Sandbox/708883
  - While testing Andy's patches on the Linaro natty kernels I ran into https://bugs.launchpad.net/bugs/720055
  - The flash-kernel utility doesn't work on the PandaBoard because the subarch check expects omap4 instead of omap:
    https://bugs.launchpad.net/bugs/721147
  - Looked into the apr failure (process-shared mutexes fail on armel v7). Their mutex functionality can be mapped to various methods, but only pthread is of interest here. The code relies on pthread_mutex_lock and pthread_mutex_trylock, which are implemented by (e)glibc. The C library uses GCC's __sync primitives if eglibc >= 2.12.1-0ubuntu11 and GCC >= 4.5. The testprocmutex testcase passes now.
    https://bugs.launchpad.net/bugs/604753

Regards
Ken

15 years, 2 months

RE: Problems with kernel support for hardware watchpoints

by Ulrich Weigand

"Will Deacon" <will.deacon(a)arm.com> wrote on 02/16/2011 01:07:09 PM:

> > I've now built a kernel with CONFIG_ARM_ERRATA_720789 enabled, and the
> > symptoms indeed seem to have disappeared completely ...
>
> Yup - that's because without it, invalidating a TLB entry for a particular
> process isn't broadcast correctly, so you can end up using the old (pre-COW)
> mappings if you're running on a different core.

OK. So I guess the only remaining question is: if this hardware needs the errata fix to work properly, shouldn't it be automatically selected by the kernel configure logic?

Note that this appears to happen for certain OMAP boards, see arch/arm/mach-omap2/Kconfig:

    config ARCH_OMAP4
            bool "TI OMAP4"
            default y
            depends on ARCH_OMAP2PLUS
            select CPU_V7
            select ARM_GIC
            select PL310_ERRATA_588369
            select ARM_ERRATA_720789        <<=====
            select USB_ARCH_HAS_EHCI

But this does not happen for the vexpress; arch/arm/mach-vexpress/Kconfig has only:

    config ARCH_VEXPRESS_CA9X4
            bool "Versatile Express Cortex-A9x4 tile"
            select CPU_V7
            select ARM_GIC

Best Regards

Ulrich Weigand

--
  Dr. Ulrich Weigand | Phone: +49-7031/16-3727
  STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
  IBM Deutschland Research & Development GmbH
  Chairman of the Supervisory Board: Martin Jetter | Management: Dirk Wittkopp
  Registered office: Böblingen | Court of registration: Amtsgericht Stuttgart, HRB 243294

15 years, 2 months

[ACTIVITY] February 13-17

by Revital1 Eres

Hello,

* Continued looking into the DENbench benchmarks.
* While testing SMS I realized that my current implementation of the doloop pattern for ARM does not follow SMS's requirement that the doloop instructions be decoupled from the loop's other instructions. This happens because doloop uses the CC register, which might be used elsewhere in the loop. I am looking into a solution for that.

Thanks,
Revital

15 years, 2 months

[ACTIVITY] February 13-17

by Ira Rosen

Hi,

This week I looked into DENBench:

* sad8_c (hot function from mp4encode) needs SLP reduction, but it also contains a cond_expr, which cannot be vectorized as a reduction, so I don't think there is anything I can do here.
* fdct_int32 (another hot function from mp4encode) now gets vectorized with the vzip/vuzp patch, but the vectorization causes performance degradation here because of multiple register spills. I also noticed that vectorizer costs are not set for NEON, i.e., it uses the default costs. So I am now working on costs for NEON and on adding register considerations to the vectorizer's cost model.

I also did some general vectorization research, checking opportunities for collaboration with the GRAPHITE pass and auto-parallelization.

Ira

15 years, 2 months

Summary of required work for a Versatile Express QEMU model

by Peter Maydell

I mentioned in the toolchain standup call that I'd done a quick estimate of the work required to support vexpress, so I thought I might as well clean it up a little and post it.

This is a quick summary and time estimate for adding Versatile Express support to qemu. The general idea is that most of the components on this board already have QEMU implementations (since they're standard ARM primecells used in versatile/realview), and we can live without the few major components that aren't implemented (maybe we'd need dummy implementations if the kernel prods them on startup).

Components already supported by QEMU:
-------------------------------------
A9MPx4
PL050 keyboard, mouse
SMSC LAN9118 ethernet
PL011 UARTs
SP804 timers

Components with a near match in QEMU:
-------------------------------------
PL111 CLCD -- qemu has a PL110
PL180 MMC card -- qemu has a PL181
 -- both cases should either just work or be fairly trivial tweaks

Components not supported by QEMU:
---------------------------------
PL041 audio
compact flash
two-wire serial bus (for PCI-express switch config and DVI-I displays)
ISP1761 Philips USB controller
User switches and LEDs -- vexpress specific, but trivial to do

Components where a dummy implementation should be sufficient:
-------------------------------------------------------------
PL310 L2 cache controller
PL341 dynamic memory controller
PL354 static memory bus controller
trustzone controllers

Other required work:
--------------------
The usual knitting for interrupts, clocks, reset etc etc.

Summary
-------
Assuming we're happy not to worry about support for audio, USB, two-wire serial bus or compact flash, this is about two weeks' work to put together, test and turn into a more-or-less upstreamable patchset. This would produce a platform hopefully at least as usable as versatile, but with an A9 and 1GB RAM.

-- PMM

15 years, 2 months

RE: Problems with kernel support for hardware watchpoints

by Ulrich Weigand

"Will Deacon" <will.deacon(a)arm.com> wrote on 02/14/2011 11:30:45 AM:

> > - In testing on Versatile Express, I noticed what appears to be SMP
> > related bugs in handling regular software breakpoints: occasionally,
> > software breakpoints simply are not hit and execution continues as if
> > the underlying code had not been changed at all. This symptom
> > completely goes away if GDB and the debugged process are forced to
> > the same CPU using the affinity feature (e.g. with schedtool).
>
> I've seen this issue in the past but I thought I'd fixed it. What kernel are
> you using and do you have CONFIG_ARM_ERRATA_720789 enabled?

I'm using the 2.6.37-1002-linaro-vexpress kernel from the Linaro package of the same name. This does *not* have CONFIG_ARM_ERRATA_720789 enabled (presumably because the mach-vexpress/Kconfig file does not add it?) ...

> > My guess, just from seeing those symptoms, would be that when inserting
> > a software breakpoint via ptrace, not all i-caches on all CPUs are
> > reliably flushed ... Any thoughts on this?
>
> There was an I-cache aliasing problem in the kernel coupled with a TLB
> invalidation hardware bug on the versatile express. I fixed these though
> and haven't seen any problems since.

Hmm, a TLB flush problem could also explain the symptom (because the write of the breakpoint to the text section causes a copy-on-write operation which installs a new page ...). I'll try rebuilding the kernel with the above config option enabled.

> Hmmm, I'll need to have a think about this. What does GDB do if it receives
> a SIGTRAP with si_addr set to (potentially) complete nonsense? As an aside,
> Cortex-A15 reports the faulting address for a watchpoint correctly, so we
> will be able to use multiple watchpoints there.

The GDB common core can handle either of the following two indications:

A) The (read/write/access) watchpoint at address XXX triggered.
B) A write watchpoint may have triggered at some address.

In the case of B, GDB will scan all the write watchpoints it is currently tracking and compare the current value at each address with the last value it remembers being present there. Any changes GDB sees will cause it to report the corresponding watchpoint as triggered.

As far as the kernel interface is concerned, the important issue is that the ARM native target in GDB is able to understand what the kernel reports, so it can in turn report either case A or B to the common core. This means as long as there is some way for GDB to understand that the kernel is reporting a write watchpoint hit at an unknown address, everything is fine. This could be done e.g. by reporting a "slot" zero in si_errno to indicate that the slot (and then also the address) triggering the watchpoint is unknown ...

> > - Finally, I noticed when reading kernel code that under some
> > circumstances, the kernel will automatically do a single step to
> > get off a watchpoint that was just hit. However, this does not
> > happen for user-space watchpoints installed via ptrace, right?
> > (Just wanting to confirm; since GDB currently does that single
> > step itself -- we don't want *both* kernel and GDB to issue a
> > single step each ...)
>
> If the {break,watch}point has been inserted via ptrace, the kernel will
> send a SIGTRAP instead of stepping the instruction.

OK, thanks for the confirmation!

> > I haven't gotten to looking further into other hardware (IGEP,
> > Panda) -- that's next on the list.
>
> Good stuff, keep me posted if you see any further problems!

Sure, will do!

Best Regards

Ulrich Weigand
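The value-comparison scan described for case B can be illustrated with a small sketch. This is not GDB's actual code: the `WriteWatchpoint` record, the `find_triggered` function, and the `read_memory` callback are hypothetical names introduced here, and the debuggee's memory is assumed to be readable through a caller-supplied function.

```python
from dataclasses import dataclass


@dataclass
class WriteWatchpoint:
    """Hypothetical record for one write watchpoint tracked by the debugger."""
    address: int
    length: int
    last_value: bytes  # value remembered from the previous check


def find_triggered(watchpoints, read_memory):
    """Return the watchpoints whose watched bytes differ from the remembered value.

    read_memory(address, length) -> bytes is an assumed stand-in for reading
    the debuggee's memory. Remembered values are refreshed for every hit, so
    the next scan only reports new changes.
    """
    triggered = []
    for wp in watchpoints:
        current = read_memory(wp.address, wp.length)
        if current != wp.last_value:
            triggered.append(wp)
            wp.last_value = current
    return triggered


# Toy demonstration with a bytearray standing in for debuggee memory.
memory = bytearray(16)


def read_memory(address, length):
    return bytes(memory[address:address + length])


tracked = [
    WriteWatchpoint(address=0, length=4, last_value=bytes(4)),
    WriteWatchpoint(address=8, length=4, last_value=bytes(4)),
]
memory[8] = 0x2A  # simulate the debuggee writing to the second location
hits = find_triggered(tracked, read_memory)
```

One consequence of comparing values, under these assumptions, is that a write storing the same value that was already present would go unnoticed, and the approach does not carry over to read or access watchpoints, which leave memory unchanged.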

15 years, 2 months

Announcing the Linaro porting jam

by Steve Langasek

Hello, my fellow ARM aficionados!

The Linaro Developer Platform Team is pleased to announce a new initiative to help improve the state of software on ARM: the ARM porting jam. Starting today, February 16th, we will be running a weekly IRC jam on Wednesdays from 1400-1800 UTC to bring developers together to work on all manner of userspace porting bugs, with the aim of fixing portability issues and getting the fixes delivered to our upstreams.

An initial porting queue of known issues can be found here:

  https://bugs.launchpad.net/ubuntu/+bugs?field.tag=arm-porting-queue

Interested in making the software in Ubuntu run better on ARM? Stop on by the #linaro channel on irc.linaro.org today!

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek(a)ubuntu.com                                     vorlon(a)debian.org

15 years, 2 months

RE: Problems with kernel support for hardware watchpoints

by Ulrich Weigand

"Will Deacon" <will.deacon(a)arm.com> wrote on 02/11/2011 10:13:01 AM:

> I don't have a pandaboard, so I'd be interested to see if the code
> works there. I developed it using ARM boards, so the versatile express
> is a known good target.

I've now got it working reliably on Versatile Express, after fixing a couple of bugs on the GDB side (both in the HW-watchpoint patch, and in common GDB code). The testsuite now passes with no regressions when enabling HW watchpoints, except for two tests that require more than one single watchpoint to be supported.

This raises another couple of issues/questions, however:

- In testing on Versatile Express, I noticed what appears to be SMP related bugs in handling regular software breakpoints: occasionally, software breakpoints simply are not hit and execution continues as if the underlying code had not been changed at all. This symptom completely goes away if GDB and the debugged process are forced to the same CPU using the affinity feature (e.g. with schedtool).

  My guess, just from seeing those symptoms, would be that when inserting a software breakpoint via ptrace, not all i-caches on all CPUs are reliably flushed ... Any thoughts on this?

- As mentioned above, the kernel currently only supports one single watchpoint to be active at a time, even though hardware might support multiple ones. The reason seems to be that when a watchpoint triggers, the kernel cannot figure out which one it was (if there's more than one choice).

  This is a bit unfortunate, given that GDB will attempt to insert two or more watchpoints in many interesting cases (e.g. a "watch *p" command will insert *two* low-level watchpoints, one at the address of p, and one at the address where p (currently) points to).

  In addition, for regular (write) watchpoints, GDB does not actually *require* the underlying hardware/kernel to specify which watchpoint was hit; GDB is able to find out by itself by checking whether the values at any of the currently active locations actually changed. (For read/access type watchpoints, GDB does require that underlying support -- but those are much more rarely used anyway.)

  Do you see any chance of improving upon the current behaviour?

- Finally, I noticed when reading kernel code that under some circumstances, the kernel will automatically do a single step to get off a watchpoint that was just hit. However, this does not happen for user-space watchpoints installed via ptrace, right? (Just wanting to confirm; since GDB currently does that single step itself -- we don't want *both* kernel and GDB to issue a single step each ...)

I haven't gotten to looking further into other hardware (IGEP, Panda) -- that's next on the list.

Best Regards

Ulrich Weigand

15 years, 2 months

[ACTIVITY] 7th - 12th February 2011

by Andrew Stubbs

== Linaro GCC 4.5 ==

Re-merged all the patches I've had to back out of Linaro GCC due to various test failures. I've now found all the extra fixes/patches necessary to make them go ... I think.

Tested the build and ran the tests on ARM and x86_64.

== Linaro GCC 4.6 ==

Continued getting the 4.5 patches forward-ported to 4.6. I now have about 4 patches waiting for review upstream, or ready to be posted.

Upstream review isn't happening though. This is partly due to GCC being in stage 4, but mostly due to Richard Earnshaw being on sabbatical, and the other maintainers being inactive. I can see that I'm going to have to abandon my hopes of only merging to Linaro GCC once a patch has been approved upstream, and be content with merging to Linaro once it's posted upstream.

Started another test to rebase the Linaro 4.6 branch with the latest from upstream. Once that's done, I think I'll start merging my changes in, and call that our baseline. (There'll still be merges from upstream, but the history will diverge.)

----

Upstream patches requiring review:

* Thumb2 constants:
  http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* Kazu's VFP testcases:
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00128.html
* Jie's thumb2 testcase fix:
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00670.html

15 years, 2 months

[ACTIVITY] Jan 31 -- Feb 13

by Chung-Lin Tang

== Week of Jan. 31st -- Feb. 6th ==

* Vacation, Chinese New Year Holiday.

== Last week ==

* Monday (Feb. 7th), last day of vacation.
* LP #711819, ICE in push_minipool_fix: this turned out to be a simple case where a memory load alternative was not tagged with the minipool range attributes. Patch sent upstream, awaiting approval.
* LP #709453, wrong code generated for NEON. Tracked this down and mostly know how to fix it, but discussion with Ramana brought up the issue that the entire idea of using NEON vmov.i32 for loading VFP constants may not be good for A9, and is unclear for A8. We probably should just revert the patch from the Linaro tree for now.
* PR46002, IRA internal compiler error with -fira-algorithm=priority. Been looking at this as a part of my background IRA studies. Have a possible patch for this, plus found another assertion-failure ICE under ARM. Will see if I can post it upstream this week.

== This week ==

* Continue to look at the above unfinished issues, as well as other new ones.

15 years, 2 months
