linaro-toolchain

linaro-toolchain@lists.linaro.org

5665 discussions

by Richard Sandiford

This is my current 4.7 auto-inc-dec.c patch. I submitted an RFC in July:

    http://article.gmane.org/gmane.comp.gcc.patches/241779/

and updated the patch in line with the feedback I got. Steven Bosscher sent some very useful comments in private email, so the update deals with those as well as Bernd's public ones.

If we do go ahead with this rewrite, it depends on the A9 pipeline description changes. I submitted some A8 and A9 changes here:

    http://article.gmane.org/gmane.comp.gcc.patches/244238/
    http://article.gmane.org/gmane.comp.gcc.patches/244242/

but because I later noticed that the A9 didn't behave quite as I thought, I decided not to apply them. Ramana asked around internally about what the A9 actually does (thanks) and had some ideas.

The patch also relies on the MEM rtx_costs patch that I just posted:

    http://lists.linaro.org/pipermail/linaro-toolchain/2011-December/001944.html

Richard

gcc/
    * Makefile.in (auto-inc-dec.o): Depends on $(OPTABS_H) and addresses.h.
    * auto-inc-dec.c: Rewrite.

Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in 2011-12-07 11:43:29.549238252 +0000
+++ gcc/Makefile.in 2011-12-29 09:24:51.066303201 +0000
@@ -3145,7 +3145,8 @@ alloc-pool.o : alloc-pool.c $(CONFIG_H)
 auto-inc-dec.o : auto-inc-dec.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(TREE_H) $(RTL_H) $(TM_P_H) hard-reg-set.h $(BASIC_BLOCK_H) insn-config.h \
    $(REGS_H) $(FLAGS_H) output.h $(FUNCTION_H) $(EXCEPT_H) $(DIAGNOSTIC_CORE_H) $(RECOG_H) \
-   $(EXPR_H) $(TIMEVAR_H) $(TREE_PASS_H) $(DF_H) $(DBGCNT_H) $(TARGET_H)
+   $(EXPR_H) $(TIMEVAR_H) $(TREE_PASS_H) $(DF_H) $(DBGCNT_H) $(TARGET_H) \
+   $(OPTABS_H) addresses.h
 cfg.o : cfg.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(FLAGS_H) \
    $(REGS_H) hard-reg-set.h output.h $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) $(EXCEPT_H) $(GGC_H) \
    $(TM_P_H) $(TIMEVAR_H) $(OBSTACK_H) $(TREE_H) alloc-pool.h \

14 years

Patch drop: Rework MEM rtx_costs

by Richard Sandiford

I originally wrote this patch as part of the auto-inc-dec work. I didn't submit it because I wasn't sure what value of extra_writeback_latency was appropriate for A9. (I was hoping to crib it from Ramana's pipeline description.)

The patch introduces three new fields to the costs structure: one to control the latency of core loads, one to control the latency of NEON loads, and one to control the penalty of address writeback.

The patch includes a tweak for cases where we use two VLDRs. That part should obviously be dropped if we change the move patterns to use something else.

Richard

gcc/
    * config/arm/arm-protos.h (tune_params): Add core_mem_latency,
    neon_mem_latency and extra_writeback_latency.
    * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune)
    (arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune)
    (arm_cortex_tune, arm_cortex_a5_tune, arm_cortex_a9_tune)
    (arm_fa726te_tune): Populate the new tune_params fields.
    (arm_mem_cost): New function.
    (arm_rtx_costs_1): Use it.

Index: gcc/config/arm/arm-protos.h
===================================================================
--- gcc/config/arm/arm-protos.h 2011-08-09 15:01:14.000000000 +0100
+++ gcc/config/arm/arm-protos.h 2011-08-09 15:04:58.121984034 +0100
@@ -236,6 +236,9 @@ struct tune_params
   int l1_cache_size;
   int l1_cache_line_size;
   bool prefer_constant_pool;
+  int core_mem_latency;
+  int neon_mem_latency;
+  int extra_writeback_latency;
   int (*branch_cost) (bool, bool);
 };
 
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c 2011-08-09 15:01:14.000000000 +0100
+++ gcc/config/arm/arm.c 2011-08-09 15:07:07.215103271 +0100
@@ -840,6 +840,9 @@ const struct tune_params arm_slowmul_tun
   5, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   true, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_default_branch_cost
 };
 
@@ -851,6 +854,9 @@ const struct tune_params arm_fastmul_tun
   5, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   true, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_default_branch_cost
 };
 
@@ -865,6 +871,9 @@ const struct tune_params arm_strongarm_t
   3, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   true, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_default_branch_cost
 };
 
@@ -876,6 +885,9 @@ const struct tune_params arm_xscale_tune
   3, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   true, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_default_branch_cost
 };
 
@@ -887,6 +899,9 @@ const struct tune_params arm_9e_tune =
   5, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   true, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_default_branch_cost
 };
 
@@ -898,6 +913,9 @@ const struct tune_params arm_v6t2_tune =
   5, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   false, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_default_branch_cost
 };
 
@@ -910,6 +928,9 @@ const struct tune_params arm_cortex_tune
   5, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   false, /* Prefer constant pool.  */
+  2,
+  2,
+  1,
   arm_default_branch_cost
 };
 
@@ -924,6 +945,9 @@ const struct tune_params arm_cortex_a5_t
   1, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   false, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_cortex_a5_branch_cost
 };
 
@@ -935,6 +959,9 @@ const struct tune_params arm_cortex_a9_t
   5, /* Max cond insns.  */
   ARM_PREFETCH_BENEFICIAL(4,32,32),
   false, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_default_branch_cost
 };
 
@@ -946,6 +973,9 @@ const struct tune_params arm_fa726te_tun
   5, /* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   true, /* Prefer constant pool.  */
+  2,
+  2,
+  0,
   arm_default_branch_cost
 };
 
@@ -6848,6 +6878,41 @@ thumb1_rtx_costs (rtx x, enum rtx_code c
     }
 }
 
+/* Return the cost in insns of a memory reference of mode MODE to
+   address ADDR.  */
+
+static int
+arm_mem_cost (enum machine_mode mode, rtx addr)
+{
+  int count, base;
+
+  count = ARM_NUM_REGS (mode);
+  if (TARGET_NEON
+      && (VALID_NEON_DREG_MODE (mode)
+          || VALID_NEON_QREG_MODE (mode)
+          || VALID_NEON_STRUCT_MODE (mode)))
+    {
+      base = current_tune->neon_mem_latency;
+
+      if (count == 4 && (GET_CODE (addr) == PLUS || CONSTANT_P (addr)))
+        /* In this case we use two VLDRs.  */
+        return COSTS_N_INSNS (base + 2);
+
+      /* Assume that one quad can be accessed each cycle.  */
+      return COSTS_N_INSNS (base + (count + 3) / 4);
+    }
+
+  base = current_tune->core_mem_latency;
+
+  if (count == 1 && GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC)
+    /* On some targets (like A8), core accesses chained by address
+       register writeback cannot issue in consecutive cycles.
+       Pessimize writeback to account for this.  */
+    base += current_tune->extra_writeback_latency;
+
+  return COSTS_N_INSNS (base + count);
+}
+
 static inline bool
 arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed)
 {
@@ -6860,9 +6925,7 @@ arm_rtx_costs_1 (rtx x, enum rtx_code ou
   switch (code)
     {
     case MEM:
-      /* Memory costs quite a lot for the first word, but subsequent words
-         load at the equivalent of a single insn each.  */
-      *total = COSTS_N_INSNS (2 + ARM_NUM_REGS (mode));
+      *total = arm_mem_cost (mode, XEXP (x, 0));
       return true;
 
     case DIV:
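
To make the arithmetic of the new cost function easier to follow, here is a minimal standalone sketch of it. It is not part of the patch: struct mem_tune, enum addr_kind, model_mem_cost and model_mem_cost_examples are hypothetical names that stand in for tune_params, the rtx address tests (GET_CODE, CONSTANT_P, GET_RTX_CLASS) and arm_mem_cost, and the caller passes in the register count and NEON-ness that the real code derives from the machine mode. COSTS_N_INSNS is redefined locally on the assumption that it matches GCC's definition of four cost units per instruction.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match GCC's rtl.h: one insn is four cost units.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical stand-in for the three fields the patch adds to
   struct tune_params.  */
struct mem_tune
{
  int core_mem_latency;
  int neon_mem_latency;
  int extra_writeback_latency;
};

/* Hypothetical address classification, replacing the rtx tests
   used by the real function.  */
enum addr_kind { ADDR_REG, ADDR_PLUS, ADDR_CONST, ADDR_AUTOINC };

/* Model of arm_mem_cost.  COUNT plays the role of ARM_NUM_REGS (mode)
   and NEON_MODE says whether the mode is a valid NEON mode.  */
int
model_mem_cost (const struct mem_tune *tune, int count, bool neon_mode,
                enum addr_kind addr)
{
  int base;

  if (neon_mode)
    {
      base = tune->neon_mem_latency;

      /* Quad access through reg+offset or a constant: two VLDRs.  */
      if (count == 4 && (addr == ADDR_PLUS || addr == ADDR_CONST))
        return COSTS_N_INSNS (base + 2);

      /* Otherwise assume one quad per cycle, rounding up.  */
      return COSTS_N_INSNS (base + (count + 3) / 4);
    }

  base = tune->core_mem_latency;

  /* Single-register core access with address writeback.  */
  if (count == 1 && addr == ADDR_AUTOINC)
    base += tune->extra_writeback_latency;

  return COSTS_N_INSNS (base + count);
}

/* Worked examples using the values the patch gives arm_cortex_tune
   (2, 2, 1).  */
void
model_mem_cost_examples (void)
{
  struct mem_tune cortex = { 2, 2, 1 };

  /* Plain word load: same as the old COSTS_N_INSNS (2 + 1).  */
  assert (model_mem_cost (&cortex, 1, false, ADDR_REG) == 12);
  /* Word load with writeback pays one extra insn.  */
  assert (model_mem_cost (&cortex, 1, false, ADDR_AUTOINC) == 16);
  /* Quad NEON load via reg+offset is costed as two VLDRs.  */
  assert (model_mem_cost (&cortex, 4, true, ADDR_PLUS) == 16);
}
```

With core_mem_latency set to 2 and extra_writeback_latency set to 0, as in most of the tuning tables above, the core path gives the same result as the expression it replaces, COSTS_N_INSNS (2 + ARM_NUM_REGS (mode)). Only arm_cortex_tune, with its writeback penalty of 1, changes the cost of a single-register auto-increment access.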

14 years

Goodbye

by Ira Rosen

Hi, Thank you all for an interesting and pleasant experience. I am very grateful to Linaro for the opportunity to meet and work with such an amazing group of people. I wish you all the best, and hope to meet you again (at least online). You can find me at irar(a)il.ibm.com or ira.rsn(a)gmail.com. Ira

14 years

[ACTIVITY] WW51

by Zhenqiang Chen

Summary:
* Read ARMv7-A/R reference manual; crosstool-ng patches and wrapper scripts.

Details:
1. Patches for crosstool-ng:
   * Fix symlink issue when CT_USE_SYSROOT is not enabled.
   * Update sample/linaro-arm-none-eabi (baremetal) to disable SYSROOT, so that both include and lib files are in the same dir.
2. Study ARMv7-A/R reference manual.
3. Validate embedded toolchain Dec. release.
4. Enhance the wrapper to use crosstool-ng for embedded toolchain.

Plan:
* Ramp-up on gcc.

Best regards!
-Zhenqiang

14 years

[ACTIVITY] weekly status

by Revital Eres

Submitted a patch for Bug #879725: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01459.html

Looking at the performance results from running SMS with automatic testing.

This is my last week at Linaro, so I would also like to thank you all for the interesting year -- it was a great experience for me to work on this project. I wish you all good luck and happy holidays!

Revital

14 years

[ACTIVITY] weekly status

by Ken Werner

Hi,

OpenEmbedded-Core:
* No response on the CSL patches I posted to the ml yet
  * khem says someone (other than me) needs to try them
* Linaro binary toolchain
  * Runs on Oneiric-X86_64 after installing lsb-core (interpreter: /lib/ld-lsb.so.3)
  * The do_rootfs task fails with runtime dependency issues when using the external-linaro-toolchain_arm-2011.11.bb recipe. When re-using my CSL 2011.03 recipe with the linaro toolchain the error doesn't show up - strange.
  * OE-Core build gets confused by the (arm-linux-gnueabi-)pkg-config of the external linaro toolchain. As a workaround I just renamed this script.
  * The qemuarm MACHINE configuration uses "-march=armv5te -mno-thumb". Since the linaro toolchain defaults to thumb and -mno-thumb has no effect, some inline assemblies are failing (e.g. on the umull insn). GNU #47930 suggests using -marm instead -> OE-Core patch posted.
  * Got the core-image-minimal to build, but it doesn't run yet (I suspect some basic runtime dependencies like libc again)
  * The build of the sato image fails (seems libtool and/or C++ related - need to investigate)

Regards
Ken

14 years

Fwd: [ACTIVITY] w51

by Asa Sandahl

Hi,

* Continued with comparison of EEMBC results for gcc 4.4 and gcc 4.6 (FSF and Linaro). Collecting results for 4.6 with loop unrolling turned off.
* Working on a plotbench.py script that will use matplotlib for plotting the results. Right now the script plots the geomean value, for instance for EEMBC. I am now trying to make it plot all subtests as well. It should then also show relative improvements instead of just the numbers, sorted from best to worst. This script depends on Michael's libtabulate.py script for transforming the tabulated file back to Python records.
* Will be back January 9

/Regards
Åsa

14 years

[ACTIVITY] Nov 19 - Dec 22

by Ulrich Weigand

== GDB ==

* Ongoing work on remote support for "info proc" and core file generation. Completed implementation of latest solution via accessing arbitrary files on the remote site, only to run into a fundamental design problem ... so it's probably back to the previous approach. Discussion on the list is ongoing.
* Fixed a GDB 7.4 test suite failure on ARM: PR tdep/12797
* Fixed another GDB 7.4 test suite failure on ARM, by enabling pthread_t thread debugging on core files.

== GCC ==

* Patch review week.

Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years

cdce3.C execution fault

by Michael Hope

Hi there. I've looked further into the intermittent gcc/testsuite/g++.dg/cdce3.C test failures. Taking Ira's vectoriser-only fix-pr51301-4.6 branch and comparing it with its predecessor r106845:

* cdce3.o itself is identical across compilers
* Fault occurs in a parallel test run as part of the normal auto build
* Fault occurs every time
* Fault occurs with a manual 'make check-gcc RUNTESTFLAGS="dg.exp=cdce*"'
* Fault doesn't occur when building from the command line
* Fault doesn't occur after updating binutils

I'm suspicious of the linker. The auto builders are Natty based and come with ld 2.21.0.20110327. Updating them to Oneiric's 2.21.53.20110810 clears the problem.

I've saved the build trees. I see no reason not to commit ~ramana/gcc-linaro/fix-lp-900426 and ~irar/gcc-linaro/fix-pr51301-4.6.

-- Michael

14 years

[ACTIVITY] Nov 12 - Dec 16

by Ulrich Weigand

== GDB ==

* Ongoing work on remote support for "info proc" and core file generation. Implemented initial version of latest solution via accessing arbitrary files on the remote site.

== GCC ==

* Started familiarizing myself with current status of various performance patches in programm, in preparation of my taking on GCC performance work next year.

Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years
