linaro-toolchain

linaro-toolchain@lists.linaro.org


by Michael Hope

Hi there. The 2011.05 release has been spun and is testing up well. The 4.5 and 4.6 branches are now open so feel free to commit any approved patches. -- Michael

14 years, 1 month

[Activity] Progress till 2011-05-13

by Ramana Radhakrishnan

Progress:
* Attended LDS from 9th - 14th May.

Plans:
* Look at Thumb2 performance blueprint and break it down.
* Investigate more headroom for SPEC2k starting this week.
* Thumb2 performance call this week.

Meetings:
* 1-1s
* T2 performance.

14 years, 1 month

[ACTIVITY] 9th - 13th May

by Revital Eres

Hello,

- Attended Linaro@UDS.
- SMS patches to support ARM do-loop pattern got approved in mainline and merged into gcc-linaro 4.6 and 4.5.
- Sent merge request for two patches in trunk (SMS_fixes_for_unfreed_memory).
- Implemented an optimization for the stage-count and am now testing it.

Thanks,
Revital

14 years, 1 month

[ACTIVITY] May.09 -- May.15

by Chung-Lin Tang

== Last week ==
* At Linaro@UDS; I am still typing this in Budapest. Sparingly did some work between sessions.
* PR42017, ARM LR register not being used. Discussed the patch with Richard Sandiford at LDS. Re-tested a bit and about to resend a revised patch according to his suggestion.
* LP:748138, redirect_jump() ICE. Committed patch to CS stable and trunk. Submitted merge request to Linaro 4.5 branch.
* LP:689887. Got some suggestions from Revital on how to debug the bootstrap failure caused by my patch; will look into applying them.

== This week ==
* Taking Monday off; I'll be flying back to Taiwan on Tuesday.
* Continue with issues after getting home.

14 years, 1 month

[ACTIVITY] 9th - 13th May

by Andrew Stubbs

Spent the whole week attending Linaro@UDS. Any other activity this week was squeezed into the space between (interesting) sessions.

Finished making the suggested changes to my Thumb2 constants patch, and posted it back upstream. This is pre-approved, but can't be committed until after the addw/subw patch.
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg05195.html

Merged all my outstanding approved merge requests to the release branches in time for next week's release.

----
Upstream patches requiring review:
* NEON scheduling patch
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* ARM Thumb2 addw/subw support.
  http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg03783.html

14 years, 2 months

[ACTIVITY] report week 19

by Peter Maydell

RAG:
Red:
Amber:
Green: 1105 work item status 99% complete with 2 weeks to go

Current Milestones:
|                              | Planned    | Estimate   | Actual     |
| qemu-linaro 2011-05          | 2011-05-19 | 2011-05-19 | n/a        |
| close out 1105 blueprints    | 2011-05-28 | 2011-05-28 |            |
| complete 1111 planning       | 2011-05-28 | 2011-05-28 |            |

Historical Milestones:
| finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
| first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
| qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 | 2011-03-08 |
| qemu-linaro 2011-04          | 2011-04-21 | 2011-04-21 | 2011-04-21 |

== merge-correctness-fixes ==
* some of my pending patches have been applied; a number of others are still under discussion or need further work/testing

== other ==
* We won't be making a qemu-linaro 2011-05 release, since there are no changes since the 2011-04 release (due to a combination of the Easter holiday and UDS week).
* Attended UDS
  * almost all 1105 work items either complete or confirmed postponed to next cycle
  * Good progress on fleshing out blueprints for next cycle: https://wiki.linaro.org/PeterMaydell/Qemu1111

Current qemu patch status is tracked here: https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver [LinuxCon proper follows on 17-19th]

14 years, 2 months

Idea for auto-increment performance improvement

by Richard Sandiford

Last week, Ramana pointed me at an upstream bug report about the inefficient code that GCC generates for vzip, vuzp and vtrn:

    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48941

It was filed not long after the Neon seminar at the summit; I'm not sure whether that was a coincidence or not.

I attached a patch to the bug last week and will test it this week. However, a cut-down version shows up another problem that isn't related specifically to intrinsics. Given:

    #include <arm_neon.h>

    void foo (float32x4x2_t *__restrict dst,
              float32x4_t *__restrict src,
              int n)
    {
      while (n--)
        {
          dst[0] = vzipq_f32 (src[0], src[1]);
          dst[1] = vzipq_f32 (src[2], src[3]);
          dst += 2;
          src += 4;
        }
    }

GCC produces:

            cmp     r2, #0
            bxeq    lr
    .L3:
            vldmia  r1, {d16-d17}
            vldr    d18, [r1, #16]
            vldr    d19, [r1, #24]
            vldr    d20, [r1, #32]
            vldr    d21, [r1, #40]
            vldr    d22, [r1, #48]
            vldr    d23, [r1, #56]
            add     r3, r0, #32
            vzip.32 q8, q9
            vzip.32 q10, q11
            subs    r2, r2, #1
            vstmia  r0, {d16-d19}
            add     r1, r1, #64
            vstmia  r3, {d20-d23}
            add     r0, r0, #64
            bne     .L3
            bx      lr

We're missing many auto-increment opportunities here. I think this is due to the limitations of GCC's auto-inc-dec pass rather than to a problem in the ARM port itself. I think there are two main areas for improvement:

- The pass only tries to use auto-incs in cases where there is a separate addition and memory access. It doesn't try to handle cases where there are two consecutive memory accesses of the form *base and *(base + size), even if the address costs make it clear that post-increments would be a win.

- The pass uses a backward scan rather than a forward scan, which makes it harder to spot chains of more than two accesses.

FWIW, I've got fairly specific ideas about how to do this. Unfortunately, the pass is in need of some TLC before it's easy to make changes. So in terms of work items, how about:

1. Clean up the auto-inc pass so that it's easier to modify
2. Investigate improvements to the pass
3. Submit the changes upstream
4. Backport the changes to the Linaro branches

I wrote some patches for (1) last week. I'd estimate it's about 2 weeks' work for (1) and (2). (3) and (4) would hopefully be background tasks.

The aim would be for something like:

    .L3:
            vldmia  r1!, {d16-d17}
            vldmia  r1!, {d18-d19}
            vldmia  r1!, {d20-d21}
            vldmia  r1!, {d22-d23}
            vzip.32 q8, q9
            vzip.32 q10, q11
            subs    r2, r2, #1
            vstmia  r0!, {d16-d19}
            vstmia  r0!, {d20-d23}
            bne     .L3
            bx      lr

This should help with auto-vectorised code, as well as normal core code.

(Combining the vldmias and vstmias is a different topic. The fact that this particular example could be implemented using one load and one store is to some extent coincidental.)

Richard
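For readers who want the transformation in source terms, here is a minimal portable C sketch (not GCC code, and not the proposed patch). It shows the same loop body written once in the "*base and *(base + size)" form, with a separate pointer addition at the end, and once in the post-increment form that the improved pass would ideally recover. The function names copy4_offset, copy4_postinc and forms_agree are made up for illustration, and plain int copies stand in for the Neon loads and stores.

```c
#include <assert.h>
#include <stddef.h>

/* Offset form: accesses are *(src + 0) .. *(src + 3), followed by one
   separate addition per pointer.  This is the shape of the code that
   the current pass leaves alone.  */
static void copy4_offset(int *dst, const int *src, size_t n)
{
  while (n--)
    {
      dst[0] = src[0];
      dst[1] = src[1];
      dst[2] = src[2];
      dst[3] = src[3];
      dst += 4;
      src += 4;
    }
}

/* Post-increment form: every access also advances its pointer, so no
   separate additions remain.  This is the shape of the desired output.  */
static void copy4_postinc(int *dst, const int *src, size_t n)
{
  while (n--)
    {
      *dst++ = *src++;
      *dst++ = *src++;
      *dst++ = *src++;
      *dst++ = *src++;
    }
}

/* Hypothetical check: returns 1 if both forms write identical data
   for n blocks of four ints (n is limited to 8 by the buffer size).  */
int forms_agree(size_t n)
{
  int src[32], a[32], b[32];

  assert(n <= 8);
  for (size_t i = 0; i < 32; i++)
    {
      src[i] = (int) (i * 3 + 1);
      a[i] = 0;
      b[i] = 0;
    }
  copy4_offset(a, src, n);
  copy4_postinc(b, src, n);
  for (size_t i = 0; i < 32; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

The two loops are equivalent by construction; the point is only that a chain of four accesses plus one add carries the same information as four post-incremented accesses, which is what a forward scan over the chain would need to recognise.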

14 years, 2 months

[ACTIVITY] 2011-05-13

by David Gilbert

== String routines ==
* Gave up on perf on silverbell and redid it on ursa2; now have a full set of perf figures and have updated the workload report to show the spec binaries that use significant time in libc and the routines they spend it in; a handful of tests spend very significant amounts of time in libm.
* Have ltrace results from about 75% of spec - some of the others are fighting a bit
* Optimised the non-neon memcpy; it's now quite respectable except in one or two cases (2 byte misaligned, and for some odd reason source offset by 8 bytes, destination by 12 is way down on any other combination)

(Current result graphs here https://wiki.linaro.org/Internal/People/DaveGilbert?action=AttachFile&do=ge… )

Dave

14 years, 2 months

[ACTIVITY] May 8-12

by Ira Rosen

Hi,

* continued looking into ffmpeg/libavcodec:
  - dcadsp.c - the inner loop contains reverse accesses which are not supported on Neon. I think we can handle them using vrev and vswp.
  - a lot of loops have unknown memory stride. I am exploring a possibility of a combination of scalar loads and vmov into a vector register, but it is probably too expensive.
* looking into telecom/conven

Ira
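As an illustration of the vrev/vswp idea, here is a small scalar model in portable C (a sketch only, not vectorizer output and not Neon intrinsics). It assumes a four-lane vector of 32-bit elements held in a Q register: vrev64.32 swaps the two lanes inside each 64-bit half, and vswp on the two D halves then exchanges the halves, which together reverse all four lanes. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of vrev64.32 on four lanes: swap the two 32-bit lanes
   inside each 64-bit half.  [a0 a1 a2 a3] -> [a1 a0 a3 a2]  */
void model_vrev64_32(float v[4])
{
  float t;
  t = v[0]; v[0] = v[1]; v[1] = t;
  t = v[2]; v[2] = v[3]; v[3] = t;
}

/* Scalar model of vswp applied to the two 64-bit halves: exchange
   lanes {0,1} with lanes {2,3}.  [a1 a0 a3 a2] -> [a3 a2 a1 a0]  */
void model_vswp_halves(float v[4])
{
  float t;
  t = v[0]; v[0] = v[2]; v[2] = t;
  t = v[1]; v[1] = v[3]; v[3] = t;
}

/* Hypothetical reversed load: read p[0..3] with an ordinary forward
   contiguous load, then reverse the lanes with the two steps above.  */
void load_reversed4(float out[4], const float *p)
{
  assert(out != NULL && p != NULL);
  for (size_t i = 0; i < 4; i++)
    out[i] = p[i];
  model_vrev64_32(out);
  model_vswp_halves(out);
}
```

A loop that walks an array backwards could then load from the lowest address of each group of four and apply this reversal, instead of needing a reverse-order load.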

14 years, 2 months

[ACTIVITY] May.02 -- May.08

by Chung-Lin Tang

== Last week ==
* Launchpad #748138: "ICE in redirect_jump, at jump.c:1443". Related to shrink-wrap, discussed a bit with Bernd off-list. Sent fix today (Mon.) to gnu-internal; will need to merge to Linaro.
* CoreMark combine canonicalize compares patch set: bootstrapped and tested with clean results on powerpc, added comments and updated upstream submission. Machine independent parts okayed by Jeff Law, now committed upstream. ARM parts still pending review.
* Compiled back-list of upstream patches, and sent to patches(a)linaro.org
* Traveled to Budapest, Hungary for Linaro Developer Summit on Saturday.

== This week ==
* Linaro Developer Summit at Budapest all week.

14 years, 2 months
