linaro-toolchain October 2011

linaro-toolchain@lists.linaro.org

19 participants
55 discussions

by Ulrich Weigand

== GDB ==

* Reimplemented the patch to disable address space randomization in gdbserver so that it respects the "set disable-randomization" command, and checked it in to mainline and Linaro GDB 7.3.
* Worked on support for cross-platform core file generation.

== GCC ==

* Checked in mainline fix for PR 50305.

Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Dirk Wittkopp
Registered office: Böblingen | Court of registration: Amtsgericht Stuttgart, HRB 243294

13 years, 9 months

[ACTIVITY] October 2-6

by Ira Rosen

Hi,

- worked on the RTL part of the widen-shift patch
- backported to linaro 2/3 of the SLP patches, and proposed the third one
- worked on additional SLP improvements:
  - swap operands to make statements isomorphic
  - support load with offset 1 (after load from 0)
- started working on presentation for NEON forum

Upcoming holidays:

Oct 12, Wed - half day
Oct 13, Thu
Oct 16-19, Sun-Wed - half day
Oct 20, Thu

Ira

13 years, 9 months

[RFC ARM] Use vcvt.f32.s32 with immediate bits to do fixed to floating point conversions.

by Ramana Radhakrishnan

Hi,

One of the things Michael pointed out in today's call was that the ARM backend doesn't generate vcvt.f32.s<type> for the idiom that converts from fixed point to floating point, as in the test cases below. I've chosen to implement this in the backend using the interfaces from real.c.

I've chosen not to allow this transformation when flag_rounding_math is true because this instruction always rounds using round-to-nearest rather than obeying what's in the FPSCR, and thus is not safe for programs that want to set their rounding modes dynamically.

The benefits are quite obvious: we eliminate a load from the constant pool and a floating point multiply, essentially shaving a floating point multiply plus load latency off these sequences. This instruction can only write its output into the same register as its input, which is why I've modelled it by tying op1 to op0.

If there's a simpler way of using the interfaces into real.c then I'm all ears. Thoughts? I believe such idioms are used in libav, from where the original report appears to have come, so it's a worthwhile gain where we can have it. Are there any other places where folks might have noticed this?

I will post upstream as well once I finish testing this patch. I'm posting it here to get some feedback, and to let anyone who is really keen to try this out have a go, given I'm out tomorrow.

(I took a quick look at the short -> f32 case as well, but loads either zero or sign extend anyway, so there's probably not much gain in modelling that right away; the win really is in getting rid of that fp mul and the constant pool load. There's probably some gain in going from i64 -> f64 as well, so those patterns need to be written up at some point for completeness.)

cheers
Ramana

2011-10-04  Ramana Radhakrishnan  <ramana.radhakrishnan(a)linaro.org>

	* config/arm/arm.c (vfp3_const_double_for_fract_bits): Define.
	* config/arm/arm-protos.h (vfp3_const_double_for_fract_bits): Declare.
	* config/arm/constraints.md ("Dt"): New constraint.
	* config/arm/predicates.md (const_double_vcvt_power_of_two_reciprocal): New.
	* config/arm/vfp.md (*arm_combine_vcvt_f32_s32): New.
	(*arm_combine_vcvt_f32_u32): New.

For the following test cases, compiled with -mfloat-abi=hard -mfpu=vfpv3 and -mcpu=cortex-a9:

    float foo (int i)
    {
      float v = (float)i / (1 << 11);
      return v;
    }

    float foa_unsigned (unsigned int i)
    {
      float v = (float)i / (1 << 5);
      return v;
    }

After the patch I see:

    foo:
            @ args = 0, pretend = 0, frame = 0
            @ frame_needed = 0, uses_anonymous_args = 0
            @ link register save eliminated.
            fmsr    s0, r0  @ int
            vcvt.f32.s32    s0, s0, #11
            bx      lr
            .size   foo, .-foo
            .align  2
            .global foa_unsigned
            .type   foa_unsigned, %function
    foa_unsigned:
            @ args = 0, pretend = 0, frame = 0
            @ frame_needed = 0, uses_anonymous_args = 0
            @ link register save eliminated.
            fmsr    s0, r0  @ int
            vcvt.f32.u32    s0, s0, #5
            bx      lr
            .size   foa_unsigned, .-foa_unsigned
            .align  2
            .global foo1
            .type   foo1, %function

rather than:

            .type   foo, %function
    foo:
            @ args = 0, pretend = 0, frame = 0
            @ frame_needed = 0, uses_anonymous_args = 0
            @ link register save eliminated.
            fmsr    s15, r0 @ int
            fsitos  s0, s15
            flds    s15, .L2
            fmuls   s0, s0, s15
            bx      lr
    .L3:
            .align  2
    .L2:
            .word   973078528
            .size   foo, .-foo
            .align  2
            .global foa_unsigned
            .type   foa_unsigned, %function
    foa_unsigned:
            @ args = 0, pretend = 0, frame = 0
            @ frame_needed = 0, uses_anonymous_args = 0
            @ link register save eliminated.
            fmsr    s15, r0 @ int
            fuitos  s0, s15
            flds    s15, .L5
            fmuls   s0, s0, s15
            bx      lr
    .L6:
            .align  2
    .L5:
            .word   1023410176

13 years, 9 months

[ACTIVITY] 26th - 30th September

by Andrew Stubbs

* Vacation Monday, Tuesday, and Wednesday.

* GCC

Continued work on my constant reuse optimizations. Disappointingly, I've found that there are very few optimization opportunities in EEMBC (ARM/Thumb V7-A), although it's not difficult to write testcases that the optimization could improve. I also discovered that the data-flow chains don't work exactly how I thought (with respect to if-then-else cases), so I need to do a little more work.

Pinged the native tuning patches; they're still waiting for upstream review.

Still can't get the generic tuning work done, as the CodeSourcery panda boards appear to be still offline.

13 years, 9 months

[ACTIVITY] Weekly status

by Revital Eres

Committed to mainline the patch to support instructions with auto-inc operations in SMS, after addressing Ayal's comments. The patch contains two parts; one of them fixes a bug revealed during bootstrapping with the patch and SMS flags.

http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01988.html
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01987.html

Looking at estimating register pressure with SMS: based on previous discussion with Richard, the current approach is to try to use the register pressure estimation from the loop invariant pass.

13 years, 9 months
