linaro-toolchain December 2010

linaro-toolchain@lists.linaro.org

37 participants
63 discussions

by Ken Werner

Hi,

* created custom kernel deb packages from the linaro-linux tree in order to test the various ftrace tracers and profilers available on ARM
* results at: https://wiki.linaro.org/KenWerner/Sandbox/ftrace
* started to look into crash (kexec, kdump) but wasn't able to generate a kernel dump yet

Regards
Ken

13 years, 10 months

Performance Test Results using gcc-linaro-4.5-2010.11-1

by Prashanth S

Dear All,

Our team at Samsung collected some performance metrics for the following three GCC cross compilers:

1. Gentoo compiler (part of the Chrome OS build environment)
2. GCC 4.4.1 (CodeSourcery)
3. Linaro (gcc-linaro-4.5-2010.11-1)

To build the Linaro toolchain we used Michael Hope's script and modified only the flags:

"GCCFLAGS = --with-mode=thumb --with-arch=armv7-a --with-float=softfp --with-fpu=neon --with-fpu=vfpv3-d16"

a. Using the above three toolchains we compiled the Chrome OS kernel and ran the CoreMark performance test (with the same optimisation flags, as mentioned in the attachment).
b. The test environment was the same for all three.

My questions:

1. Are there any build options that I am missing when building the cross compiler?
2. Otherwise, is this performance degradation a known issue, and is the toolchain group working on it? (If so, whom should I contact?)

Any pointers from you would be of great help to me. If you need any further details, please ping me.

Regards
Prashanth S

13 years, 10 months

GCC Optimization Brain Storming Session

by Andrew Stubbs

Hi All,

As we discussed on Monday, I think it might be helpful to get a number of knowledgeable people together on a call to discuss GCC optimization opportunities.

So, I'd like to get some idea of who would like to attend, and we'll try to find a slot we can all make. I'm on vacation next week, so I expect it'll be in two or three weeks' time.

Before we get there, I'd like to have a list of ideas to discuss. Partly so that we don't forget anything, and partly so that people can have a think about them before the day.

I'm really looking for bigger picture stuff, rather than individual poor code generation bugs. So here's a few to kick off:

* Costs tuning.
  - GCC 4.6 has a new costs model, but are we making full use of it?
  - What about optimizing for size?
  - Do the optimizers take any notice? [1]

* Instruction set coverage.
  - Are there any ARM/Thumb2 instructions that we are not taking advantage of? [2]
  - Do we test that we use the instructions we do have? [3]

* Constant pools
  - It might be a very handy space optimization to have small functions share one constant pool, but the way the passes work one function at a time makes this hard. (LP:625233)

* NEON
  - There's already a lot of work going on here, and I don't want it to hog all our time, but it might be worth touching on.

What else? I'm not the most experienced person with GCC internals, and I'm relatively new to the ARM specific parts of those, so somebody else must be able to come up with something far more exciting!

So, please, get brain-storming!

Andrew

[1] We discovered recently that combine is happy to take two insns and combine them into a pattern that matches a splitter that then explodes into three insns (partly due to being no longer able to generate pseudo-registers).

[2] For example, I just wrote a patch to add addw and subw support (not yet submitted).

[3] LP:643479 is an example of a case where we don't.
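Footnote [2] mentions addw and subw. As a small illustration (not the patch itself, which is not part of this thread), the C functions below add and subtract 4095 (0xFFF). That constant fits the 12-bit plain immediate of the Thumb-2 ADDW/SUBW encodings, but it is not an 8-bit value rotated or replicated, so it cannot be encoded as a Thumb-2 modified immediate. The function names are made up for this sketch, and whether a given compiler really emits addw/subw here depends on its version and flags.

```c
#include <stdint.h>

/* Hypothetical test functions, not taken from the patch mentioned above. */

/* 4095 (0xFFF) is the largest value that fits a 12-bit plain immediate,
 * so a compiler with ADDW support may add it in a single instruction. */
uint32_t add_imm12(uint32_t x)
{
    return x + 4095u;
}

/* The same constant on the subtraction side, a candidate for SUBW. */
uint32_t sub_imm12(uint32_t x)
{
    return x - 4095u;
}
```

One way to check the instruction selection is to compile to assembly with something like `arm-linux-gnueabi-gcc -O2 -mthumb -march=armv7-a -S` (the cross-compiler name is an assumption) and look for addw/subw in the output.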

13 years, 10 months

[ACTIVITY] 2010-12-09

by David Gilbert

Mostly more working with libffi; swapping some ideas back and forwards with Marcus Shawcroft and it looks like we have a good way forward.

Got an armhf chroot going, libffi built. Got a testcase failing as expected. Trying to look at other processors' ABIs to understand why varargs works for anyone else.

Cut through one layer of red tape; can now do the next level of comparison in the string routine work.

Started looking at SPEC; hit problems with network stability on VExpress (turns out to be bug 673820).

Long long weekend; short weeks=2; back in on Tuesday.

Dave

13 years, 10 months

Silverbell

by David Gilbert

Hi,

Those of you who use silverbell may be glad to know it's back up. Be a little careful: if you shovel large amounts of stuff over its network, the network tends to disappear. (Not sure if this is hardware or driver.)

Dave

13 years, 10 months

Hard float chroot

by David Gilbert

Hi,

As mentioned on the standup, I just got an armhf chroot going, thanks to markos for pointing me at using multistrap.

I put the following in a armhfmultistrap.conf and did

  multistrap -f armhfmultistrap.conf

Once that's done, chroot in and then do

  dpkg --configure -a

It's pretty sparse in there, but it's enough to get going.

Dave

==============================================
[General]
arch=armhf
directory=/discs/more/armhf
cleanup=true
noauth=true
unpack=true
explicitsuite=false
aptsources=unstable unreleased
bootstrap=unstable unreleased

[unstable]
packages=
source=http://ftp.de.debian.org/debian-ports/
keyring=debian-archive-keyring
suite=unstable
omitdebsrc=true

[unreleased]
packages=
source=http://ftp.de.debian.org/debian-ports/
keyring=debian-archive-keyring
suite=unreleased
omitdebsrc=true

13 years, 10 months

risu instruction set test harness now publicly available

by Peter Maydell

Hi. As part of my work on qemu I've written a simplistic random instruction sequence generator and test harness. To quote the README:

  risu is a tool intended to assist in testing the implementation of models of the ARM architecture such as qemu and valgrind. In particular it restricts itself to considering the parts of the architecture visible from Linux userspace, so it can be used to test programs which only implement userspace, like valgrind and qemu's linux-user mode.

I don't particularly expect this tool to be of much general interest outside people developing either qemu or valgrind or similar models, but I have in any case made it publicly available now:

http://git.linaro.org/gitweb?p=people/pmaydell/risu.git;a=tree

-- PMM

13 years, 10 months

RFC: Dynamic hwcaps

by Dave Martin

Hi all,

I'd be interested in people's views on the following idea-- feel free to ignore if it doesn't interest you.

For power-management purposes, it's useful to be able to turn off functional blocks on the SoC. For on-SoC peripherals, this can be managed through the driver framework in the kernel, but for functional blocks of the CPU itself which are used by instruction set extensions, such as NEON or other media accelerators, it would be interesting if processes could adapt to these units appearing and disappearing at runtime. This would mean that user processes would need to select dynamically between different implementations of accelerated functionality at runtime.

This allows for more active power management of such functional blocks: if the CPU is not fully loaded, you can turn them off -- the kernel can spot when there is significant idle time and do this. If the CPU becomes fully loaded, applications which have soft-realtime constraints can notice this and switch to their accelerated code (which will cause the kernel to switch the functional unit(s) on). Or, the kernel can react to increasing CPU load by speculatively turning the unit on instead. This is analogous to the behaviour of other power governors in the system.

Non-aware applications will still work seamlessly -- these may simply run accelerated code if the hardware supports it, causing the kernel to turn the affected functional block(s) on.

In order for this to work, some dynamic status information would need to be visible to each user process, and polled each time a function with a dynamically switchable choice of implementations gets called. You probably don't need to worry about race conditions either-- if the process accidentally tries to use a turned-off feature, you will take a fault which gives the kernel the chance to turn the feature back on. Generally, this should be a rare occurrence.

The dynamic feature status information should ideally be per-CPU global, though we could have a separate copy per thread, at the cost of more memory. It can't be system-global, since different CPUs may have a different set of functional blocks active at any one time -- for this reason, the information can't be stored in an existing mapping such as the vectors page. Conversely, existing mechanisms such as sysfs probably involve too much overhead to be polled every time you call copy_pixmap() or whatever.

Alternatively, each thread could register a userspace buffer (a single word is probably adequate) into which the CPU pokes the hardware status flags each time it returns to userspace, if the hardware status has changed or if the thread has been migrated.

Either of the above approaches could be prototyped as an mmap'able driver, though this may not be the best approach in the long run.

Does anyone have a view on whether this is a worthwhile idea, or what the best approach would be?

Cheers
---Dave
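To make the polling idea concrete, here is a minimal userspace sketch of a copy_pixmap() that checks a status word on every call and picks an implementation accordingly. Everything in it is an assumption made for illustration: the flag name HWSTATUS_NEON, the hw_status word (an ordinary variable here, standing in for a word the kernel would update), and the two copy routines. No such kernel interface exists in this proposal yet, and the "accelerated" routine is just memcpy standing in for NEON code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit: set while the NEON unit is switched on. */
#define HWSTATUS_NEON (1u << 0)

/* Hypothetical status word. In the scheme sketched in the mail the kernel
 * would refresh it on return to userspace; here it is a plain variable so
 * the example runs anywhere. volatile forces a fresh read on every call. */
static volatile uint32_t hw_status;

/* Call counters, only so that the dispatch can be observed. */
static unsigned generic_calls, accel_calls;

/* Plain C fallback, used while the functional block is off. */
static void copy_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    generic_calls++;
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Stand-in for an accelerated (e.g. NEON) implementation. */
static void copy_accel(uint8_t *dst, const uint8_t *src, size_t n)
{
    accel_calls++;
    memcpy(dst, src, n);
}

/* Poll the status word once per call and dispatch on it. */
void copy_pixmap(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (hw_status & HWSTATUS_NEON)
        copy_accel(dst, src, n);
    else
        copy_generic(dst, src, n);
}
```

The cost on the fast path is one load and one branch per call, which is the reason for preferring a mapped word over something like sysfs. A stale read is harmless under the scheme described: using the accelerated path while the unit is off would simply fault and let the kernel turn it back on.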

13 years, 10 months

[ACTIVITY] November 29th-December 5th

by Julian Brown

== Linaro GCC ==

* Worked on quad-word/big-endian fixes patch. Sent off a version on Tuesday which worked OK, but which made some awkward changes to the middle-end. Tried to re-think those parts, but without much luck: came to the conclusion that spending more time trying to fix element-ordering-dependent operations on quad-word vectors in big-endian mode was probably not worth the effort (since we plan to be changing things in that area anyway). Wrote a much-simplified patch which simply disables those patterns, and ported it to mainline.

* Then, spent some time trying to set up big-endian testing with a mainline build, since the lack of such an option is partly why we got into this mess to start with. My current plan (as well as testing the above patch) is to create an upstreamable patch to easily enable big-endian (Linux) multilibs, in the hope that it'll generally make big-endian testing easier. (Of course people will still need test harness configurations which will allow running big & little-endian code, which most won't have.)

* Also, pinged lp675347 (volatile bitfields vs. QT atomics), and did some extra checks suggested by DJ Delorie, which seemed to work out fine. Backported patch for lp629671 to Linaro 4.4 branch, and ran tests (also fine).

* Continued discussion of internal representations for fancy vector loads/stores in GIMPLE/RTL on linaro-toolchain.

13 years, 10 months

[ACTIVITY] Nov. 19 -- Dec. 05

by Zach Welch

== Last Week ==

* Continued implementing support for ARM unwind tables in libunwind.
* Sent patches upstream to improve binutils's readelf, adding support for all remaining unwind table instructions (i.e. VFP/NEON and WMMX). When used on ARMv7a, provides meaningful output for previously 'unsupported' opcodes that get used in some libraries (e.g. glibc).

== This Week ==

* Continue working on libunwind.

--
Zach Welch
CodeSourcery
zwelch(a)codesourcery.com
(650) 331-3385 x743

13 years, 10 months
