Hi Dave. I had a little play with cortex-strings and did some
benchmarks on my Tegra 2. Images are attached.
I've added two scripts to cortex-strings:
scripts/bench-all.sh runs all the routines on all variants and records them
scripts/plot.py plots the results from above
plot.py corrects for the benchmark overhead by doing a linear fit to
the null 'bounce' results and subtracting this fit.
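As a rough illustration of that correction (a minimal sketch, not the actual contents of scripts/plot.py): fit a straight line to the null routine's timings against block size, then subtract the fitted value from each real measurement. The function names and the timing numbers below are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def subtract_overhead(sizes, times, null_sizes, null_times):
    """Remove the fitted harness overhead from each measured time."""
    slope, intercept = fit_line(null_sizes, null_times)
    return [t - (slope * s + intercept) for s, t in zip(sizes, times)]


# Hypothetical numbers: block sizes in bytes, times in arbitrary units.
sizes = [8, 16, 32, 64]
bounce_times = [2.08, 2.16, 2.32, 2.64]   # null routine: harness overhead only
memcpy_times = [3.08, 4.16, 6.32, 10.64]  # routine under test, overhead included

corrected = subtract_overhead(sizes, memcpy_times, sizes, bounce_times)
print([round(c, 3) for c in corrected])
```

Fitting a line rather than subtracting the null results point by point smooths out noise in the null measurements themselves.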
You should be able to do an autogen; configure; make; bash
scripts/bench-all.sh | tee log.txt; python scripts/plot.py log.txt.
I'm sure you have your own favourite tools though.
The string routines look good. Lumpy in funny ways though...
-- Michael
[Sorry, forgot to CC: the list]
Hi Ira,
Thanks for the feedback.
On 6 March 2011 09:20, Ira Rosen <IRAR(a)il.ibm.com> wrote:
> > So how about the following functions? (Forgive the pascally syntax.)
> >
> > __builtin_load_lanes (REF : array N*M of X)
> > returns array N of vector M of X
> > maps to vldN
> > in practice, the result would be used in assignments of the form:
> > vectorX = ARRAY_REF <result, X>
> >
> > __builtin_store_lanes (VECTORS : array N of vector M of X)
> > returns array N*M of X
> > maps to vstN
> > in practice, the argument would be populated by assignments of the
> > form:
> > vectorX = ARRAY_REF <result, X>
> >
> > __builtin_load_lane (REF : array N of X,
> > VECTORS : array N of vector M of X,
> > LANE : integer)
> > returns array N of vector M of X
> > maps to vldN_lane
> >
> > __builtin_store_lane (VECTORS : array N of vector M of X,
> > LANE : integer)
> > returns array N of X
> > maps to vstN_lane
> >
>
> How do you distinguish between "multiple structures" and "single structure
> to all lanes"?
Sorry, I'm not sure I understand the question. Could you give a couple
of examples?
The idea is that the arrays above really are array types, regardless of the
actual type of the thing we're accessing (which might be a larger array
than the bounds above say, or which might be an array of structures
or a structure of arrays). That should be OK because arrays alias
their elements.
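To make the intended data movement concrete, here is a small Python model of the four functions. It is only a sketch of the semantics, not GCC code: plain lists stand in for "array N*M of X" and "array N of vector M of X", and the function names simply mirror the proposed builtins.

```python
def load_lanes(ref, n):
    """Model vldN: de-interleave N*M elements into N vectors of M lanes."""
    if len(ref) % n != 0:
        raise ValueError("length must be a multiple of N")
    return [ref[i::n] for i in range(n)]


def store_lanes(vectors):
    """Model vstN: interleave N vectors of M lanes into N*M elements."""
    lanes = len(vectors[0])
    return [v[lane] for lane in range(lanes) for v in vectors]


def load_lane(ref, vectors, lane):
    """Model vldN_lane: element i of REF replaces lane LANE of vector i."""
    out = [list(v) for v in vectors]
    for i, x in enumerate(ref):
        out[i][lane] = x
    return out


def store_lane(vectors, lane):
    """Model vstN_lane: collect lane LANE of each vector into N elements."""
    return [v[lane] for v in vectors]


# Two interleaved RGB structures (N = 3, M = 2).
pixels = ["r0", "g0", "b0", "r1", "g1", "b1"]
vectors = load_lanes(pixels, 3)
print(vectors)               # one vector per colour channel
print(store_lanes(vectors))  # round-trips to the original layout
```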
Richard
Hi Matthias,
in last week's meeting you raised the question what, if any, code from the
Linaro GDB repository could be useful for inclusion into the natty GDB
package. I've now reviewed the contents of the repository, and my
suggestion would be to use everything in Linaro GDB 7.2, except for this
commit (which changes the branding to "Linaro GDB"):
revno: 32969
committer: Ulrich Weigand <uweigand(a)de.ibm.com>
branch nick: 7.2
timestamp: Wed 2010-09-22 19:18:38 +0200
message:
2010-09-22 Ulrich Weigand <uweigand(a)de.ibm.com>
* src-release: Support gdb-linaro packages.
gdb/
* version.in: Set to Linaro GDB version number.
* configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
* configure: Regenerate.
gdb/gdbserver/
* configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
* configure: Regenerate.
gdb/doc/
* configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
* configure: Regenerate.
(Instead, the branding ought to be set as appropriate for the Ubuntu
package. Maybe with an additional reference to Linaro, just as with GCC?)
I've created a snapshot of the Linaro GDB 7.2 branch using the command
bzr diff --prefix a/:b/ -r32965..
and then manually removed changes to
src-release
gdb/version.in
gdb/configure.ac
gdb/configure
gdb/gdbserver/configure.ac
gdb/gdbserver/configure
gdb/doc/configure.ac
gdb/doc/configure
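The manual removal step could also be scripted. The following is a minimal sketch, assuming the diff uses bzr's usual per-file header lines of the form === modified file 'path'; the helper name drop_files and the sample diff text are hypothetical.

```python
import re

# Files whose changes carry the Linaro branding and should be dropped.
EXCLUDED = {
    "src-release",
    "gdb/version.in",
    "gdb/configure.ac",
    "gdb/configure",
    "gdb/gdbserver/configure.ac",
    "gdb/gdbserver/configure",
    "gdb/doc/configure.ac",
    "gdb/doc/configure",
}

# Assumed header format: "=== modified file 'path'" (also added/removed).
HEADER = re.compile(r"^=== \w+ file '([^']+)'")


def drop_files(diff_text, excluded):
    """Return diff_text without the per-file sections named in excluded."""
    kept = []
    keep = True
    for line in diff_text.splitlines(keepends=True):
        match = HEADER.match(line)
        if match:
            # A new per-file section starts here; decide whether to keep it.
            keep = match.group(1) not in excluded
        if keep:
            kept.append(line)
    return "".join(kept)


# Hypothetical two-file diff for illustration.
sample = (
    "=== modified file 'gdb/version.in'\n"
    "--- a/gdb/version.in\n+++ b/gdb/version.in\n@@ -1 +1 @@\n-old\n+new\n"
    "=== modified file 'gdb/arm-tdep.c'\n"
    "--- a/gdb/arm-tdep.c\n+++ b/gdb/arm-tdep.c\n@@ -1 +1 @@\n-old\n+new\n"
)
print(drop_files(sample, EXCLUDED))
```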
I've left in the new file ChangeLog.linaro for documentation purposes, but
if you prefer this could of course be removed as well.
The resulting patch is appended here. (Note that I'd recommend to continue
updating the patch from Linaro GDB as further changes make it in.)
(See attached file: linaro-gdb.patch)
I've then added the patch to the natty GDB package. Since it touches a
completely distinct set of files compared to the existing list of patches
in the package, it can be added to the series file at any arbitrary point.
I've built the resulting package on i386, arm, and ppc64, and it strictly
improved the test results on all three platforms:
i386 without patch:
# of expected passes 16161
# of unexpected failures 114
# of expected failures 72
# of untested testcases 9
# of unresolved testcases 1
# of unsupported tests 69
i386 with patch:
# of expected passes 16331
# of unexpected failures 24
# of expected failures 72
# of untested testcases 9
# of unresolved testcases 1
# of unsupported tests 69
Fixed test case failures are from:
gdb.base/break-interp.exp
gdb.base/foll-fork.exp
gdb.base/printcmds.exp
(These are just test suite cleanups, no actual code changes.)
ppc without patch:
# of expected passes 15350
# of unexpected failures 74
# of expected failures 53
# of untested testcases 15
# of unresolved testcases 1
# of unsupported tests 63
ppc with patch:
# of expected passes 15350
# of unexpected failures 55
# of expected failures 53
# of untested testcases 15
# of unresolved testcases 1
# of unsupported tests 63
Fixed test case failures are from:
gdb.base/printcmds.exp
gdb.threads/local-watch-wrong-thread.exp
gdb.threads/watchthreads.exp
(These are just test suite cleanups, no actual code changes.)
arm without patch:
# of expected passes 15343
# of unexpected failures 270
# of unexpected successes 1
# of expected failures 65
# of untested testcases 11
# of unresolved testcases 2
# of unsupported tests 70
arm with patch:
# of expected passes 15686
# of unexpected failures 46
# of unexpected successes 3
# of expected failures 63
# of untested testcases 11
# of unresolved testcases 1
# of unsupported tests 69
Fixed test case failures are from:
gdb.base/break-interp.exp
gdb.base/corefile.exp
gdb.base/foll-fork.exp
gdb.base/gcore.exp
gdb.base/gdb1555.exp
gdb.base/pr11022.exp
gdb.base/printcmds.exp
gdb.base/recurse.exp
gdb.base/relativedebug.exp
gdb.base/step-test.exp
gdb.base/watch-cond.exp
gdb.base/watch-read.exp
gdb.base/watch_thread_num.exp
gdb.base/watch-vfork.exp
gdb.gdb/selftest.exp
gdb.mi/gdb792.exp
gdb.mi/mi2-syn-frame.exp
gdb.mi/mi2-var-display.exp
gdb.mi/mi2-watch.exp
gdb.mi/mi-syn-frame.exp
gdb.mi/mi-var-display.exp
gdb.mi/mi-watch.exp
gdb.pie/corefile.exp
gdb.server/ext-attach.exp
gdb.threads/attachstop-mt.exp
gdb.threads/attach-stopped.exp
gdb.threads/linux-dp.exp
gdb.threads/local-watch-wrong-thread.exp
gdb.threads/pthread_cond_wait.exp
(This represents much of the bug fix work that went into Linaro GDB.)
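For anyone who wants to repeat the comparison, the summaries can be diffed mechanically. This is a small sketch that parses the "# of ..." summary lines and prints the change per category; the helper names are made up, and the input is the arm data quoted above.

```python
import re

SUMMARY_LINE = re.compile(r"^# of (.+?)\s+(\d+)$")


def parse_summary(text):
    """Map each '# of <category> <count>' line to {category: count}."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def delta(before, after):
    """Per-category change (after minus before)."""
    keys = sorted(set(before) | set(after))
    return {k: after.get(k, 0) - before.get(k, 0) for k in keys}


ARM_WITHOUT_PATCH = """
# of expected passes 15343
# of unexpected failures 270
# of unexpected successes 1
# of expected failures 65
# of untested testcases 11
# of unresolved testcases 2
# of unsupported tests 70
"""

ARM_WITH_PATCH = """
# of expected passes 15686
# of unexpected failures 46
# of unexpected successes 3
# of expected failures 63
# of untested testcases 11
# of unresolved testcases 1
# of unsupported tests 69
"""

changes = delta(parse_summary(ARM_WITHOUT_PATCH), parse_summary(ARM_WITH_PATCH))
for category, change in changes.items():
    print(f"{category}: {change:+d}")
```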
Let me know if there's any further information you need, or anything else I
can do to help get the Linaro changes into natty GDB.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Merged fixes for several bugs into Linaro GCC 4.5. Both from Linaro
(Richard, Matthias and Ramana), and from CS (the shrink wrap problems).
Continued working on benchmarking the patches I've merged to 4.6. Spent
quite some time trying to figure out why EEMBC and the Spec2K weren't
working properly. I've got this sorted now.
Confirmed that the patch to discourage NEON use for integer operations
is still profitable on Cortex-A8. Posted the patch upstream.
Merged upstream GCC 4.6 into Linaro GCC 4.6.
Booked travel to Budapest for Linaro @ UDS.
Followed up on Ramana's questions about the RVCT interoperability patch.
Paul Brook helped explain what it was about, and pointed me at the
proper section in the proper ARM manual.
Continued forward porting patches to 4.6. Mostly I need to convince
myself that they still do something useful. I have posted one new patch
to upstream - the "Discourage A8 NEON" patch.
* Future Absence
Away Wednesday 16th to Friday 18th.
Away Monday 28th to Friday 1st April.
----
Upstream patches requiring review:
* Thumb2 constants:
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* ARM EABI half-precision functions
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* RVCT Interoperability patch
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00059.html
* Discourage NEON on A8
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00576.html
== Last week ==
* Working on Coremark ARMv6 regressions. Identified a major cause being
RTL ifcvt failing on one of the crc routines, due to combine pass
failing to optimize a particular sequence, causing the if-conversion
estimates to give up on conditional execution (too many insns). The
combine pass failed on ARMv6 and above, due to the existence of true
zero_extend insns. On ARMv5, the use of two shifts actually allowed
combine to reduce the shifts one by one, thus producing better
code. On ARMv6, combine produced a (xor (and ...) <mask>) which did not
match any insn. Analyzed and sent a patch upstream which should work on
such XOR cases. Patch is due for upstream commit for 4.7-stage1.
(http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00609.html)
* Another situation of un-optimized uxth insns still exists; trying
to solve this by another combine patch I am currently testing, will send
upstream later.
== This week ==
* verify the improvements the above patches should have on Coremark for
ARMv6/v7.
* Work on sending them to Linaro and SG++ branches.
* Other bug issues.
== GDB ==
* Ongoing work on glibc patch to add ARM unwind tables to system
call stubs; ran into design problems that look difficult to fix.
* As an alternative, started work on a GDB patch to recognize glibc
system call assembler stubs via code-scanning; this should allow
unwinding in the absence of debug info for current libc code.
* Analyzed bug #728216 (GDB fails to get a valid backtrace while
debugging a Webkit SIGSEGV) and resolved as invalid; the fault
occurs within JIT-generated code where unwinding is impossible.
== Misc ==
* Made travel arrangements for Linaro Summit in Budapest
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
RAG:
Red:
Amber:
Green: another qemu-linaro release out the door on time
Current Milestones:
| Planned | Estimate | Actual |
qemu-linaro 2011-03 | 2011-03-08 | 2011-03-08 | 2011-03-08 |
Historical Milestones:
finish virtio-system | 2010-08-27 | postponed | |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release | 2011-02-08 | 2011-02-08 | 2011-02-08 |
== maintain-beagle-models ==
* released qemu-linaro 2011-03
* had to do a 2011-03-1 reroll of the tarball on day of release to
fix a "versatilepb model crashes on startup" bug found at last minute
* Paul Larson is working on having automated test image boots on
qemu built from git, so we can catch this much earlier in the cycle
== merge-correctness-fixes ==
* added support to risu for testing of load and store instructions
* used this to test a patch which cleans up Thumb load/store decode
and makes us UNDEF in the right places
* wrote/submitted patch to fix GE bits for signed modulo arithmetic
* wrote/submitted patch to get SMUAD/SMLAD Q bit right in an edge case
* started on a patchset which will fix various minor qemu Neon bugs
detected by test programs from the valgrind source tree
== other ==
* meetings: toolchain, standup, pdsw-tools
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
Hi,
== libunwind ==
* the patches posted last week are now upstream
* continued to study the Exception Handling ABI for the ARM Architecture
* looked into the structure of libunwind (lib interdependencies)
* documented at: https://wiki.linaro.org/KenWerner/Sandbox/libunwind
* The work on the local unwinding appears to be quite complete. If the
generic unwind model is used the code assumes the GCC personality routine. We
should either check the name of the symbol (may be difficult) or just call the
pers function. I'm in contact with Zach on this.
Regards
Ken
LP: #731665 is a silent bad code generation bug at least on functions
which are empty except for inline assembly:
https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/731665
It was introduced in the shrink-wrap patch and is due to using an
uninitialised variable. Andrew, can you please address this urgently
either in Linaro or CSL?
-- Michael
== hard-float ==
* Updated the libffi variadic patch and sent it to the ffi mailing
list.
== String routines ==
* Got a big endian build environment going
* Patched up memchr and strlen for big endian; turned out to be a
very small change in the end. Tested it on qemu-armeb: it failed on
an older version but worked on a newer one, so I'll assume the newer
one is correct.
* Fixed a couple of build issues in the cortex strings test harness
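For anyone wondering why endianness matters to these routines at all, here is a toy Python model of a word-at-a-time strlen. It is not the cortex-strings code, and the helper names are made up. Detecting that a word contains a zero byte is the same on both byte orders; only the step that turns the match into a memory offset differs, which fits the change being small.

```python
def zero_byte_mask(word):
    """Set bit 7 of each byte of a 32-bit word exactly where that byte is zero."""
    low7 = (word & 0x7F7F7F7F) + 0x7F7F7F7F
    return ~(low7 | word | 0x7F7F7F7F) & 0xFFFFFFFF


def first_zero_index(word, byteorder):
    """Memory offset (0..3) of the first zero byte, or -1 if there is none."""
    mask = zero_byte_mask(word)
    if mask == 0:
        return -1
    if byteorder == "little":
        # First byte in memory is the least significant: use the lowest mark.
        lowest = mask & -mask
        return (lowest.bit_length() - 1) // 8
    # Big endian: first byte in memory is the most significant one.
    return 3 - (mask.bit_length() - 1) // 8


def strlen_words(buf, byteorder):
    """Length of the NUL-terminated string in buf, read four bytes at a time."""
    offset = 0
    while True:
        # Pad a short final chunk with zeros so the loop always terminates.
        chunk = buf[offset:offset + 4].ljust(4, b"\0")
        index = first_zero_index(int.from_bytes(chunk, byteorder), byteorder)
        if index >= 0:
            return offset + index
        offset += 4


for order in ("little", "big"):
    print(order, strlen_words(b"hello\0junk", order))
```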
== Other ==
* Kicked off a SPEC2006 train run on canis using the 2011.03 compilers
I'm on holiday tomorrow (Friday) and Monday.
Dave