The Linaro Toolchain Working Group (TCWG) is pleased to announce the
2016.09 snapshot of the Linaro GCC 6 source package.
This monthly snapshot[1] is based on FSF GCC 6.2+svn239654 and
includes performance improvements and bug fixes backported from
mainline GCC. The contents of this snapshot will be part of the
2016.11 stable[2] quarterly release.
This snapshot tarball is available at:
http://snapshots.linaro.org/components/toolchain/gcc-linaro/6.2-2016.09/
Interesting changes in this GCC source package snapshot include:
* Backport of [Bugfix] [AArch32] PR target/53440 Handle generic thunks
better for TARGET_32BIT
* Backport of [Bugfix] [AArch32] PR target/59833 ARM soft-float
extendsfdf2 fails to quiet signaling NaN
* Backport of [Bugfix] [AArch32] PR target/70473 Reduce size of
Cortex-A8 automaton
* Backport of [Bugfix] [AArch32] PR target/71061 length pop* pattern
in epilogue correctly
* Backport of [Bugfix] [AArch32] PR target/77281 Fix an invalid check
for vectors of the same floating-point constants.
* Backport of [Bugfix] [AArch64] PR 64971 Convert function pointer to
Pmode when emit call.
* Backport of [Bugfix] [AArch64] PR 70904 Relax the restriction on
subreg reload for wide mode
* Backport of [Bugfix] [AArch64] PR target/63596 Honor tree-stdarg
analysis result to improve VAARG codegen
* Backport of [Bugfix] [AArch64] PR target/63874 vtable address
generation goes through memory
* Backport of [Bugfix] PR 70751 Correct the cost for spilling
non-pseudo into memory
* Backport of [Bugfix] PR 77421 Redundant second assignment of bb_copy
= NULL in free_original_copy_tables
* Backport of [Bugfix] PR middle-end/37780 Conditional expression with
__builtin_clz() should be optimized out
* Backport of [Bugfix] PR middle-end/68217 Wrong constant folding
* Backport of [Bugfix] PR middle-end/71700 zero-extend sub-word value
when widening constructor element
* Backport of [Bugfix] PR rtl-optimization/66940 Avoid signed overflow
in noce_get_alt_condition
* Backport of [Bugfix] PR rtl-optimization/71150 Guard in_class_p with
REG_P check
* Backport of [Bugfix] PR rtl-optimization/71295
* Backport of [Bugfix] PR rtl-optimization/71594 ICE in
noce_emit_cmove due to mismatched source modes
* Backport of [Bugfix] PR rtl-optimization/71878
* Backport of [Bugfix] PR tree-optimization/61839 More optimize
opportunity for VRP
* Backport of [Bugfix] PR tree-optimization/71818 ICE in as_a, at
is-a.h:192 w/ -O2 -ftree-vectorize
* Backport of [AArch32] Keep ctz expressions together until after reload
* Backport of [AArch32] Add fcsel to Cortex-A57 scheduler
* Backport of [AArch32] Add initial support for Cortex-A73
* Backport of [AArch32] Add support for overflow add, sub, and neg operations
* Backport of [AArch32] Add support for some ARMv8-A cores to driver-arm.c
* Backport of [AArch32] arm_neon.h: s/__FAST_MATH/__FAST_MATH__/g
* Backport of [AArch32] Emit vmov.i64 to load 0.0 into FP reg when neon enabled.
* Backport of [AArch32] Enable __fp16 as a function parameter and return type
* Backport of [AArch32] Fix aprofile multilib mappings
* Backport of [AArch32] Fix predicable_short_it attribute for arm_movqi_insn
* Backport of [AArch32] genmultilib: improve error reporting for MULTILIB_REUSE
* Backport of [AArch32] improve Cortex-A53 integer scheduler
* Backport of [AArch32] Model CSEL instruction in Cortex-A57 scheduling model
* Backport of [AArch32] no-data-is-text-relative & msingle-pic-base
* Backport of [AArch32] Refactor MOVW/MOVT fusion logic to allow extension
* Backport of [AArch32] Replace uses of int_log2 by exact_log2
* Backport of [AArch32] Update documentation for ARM architecture
* Backport of [AArch32] Use a MULTILIB_REQUIRED approach for aprofile multilib
* Backport of [AArch64] 1/2 Add support INS (element) instruction to
copy lanes between vectors
* Backport of [AArch64] 2/2 (Re)Implement vcopy<q>_lane<q> intrinsics
* Backport of [AArch64] 1/2 Improve zero extend
* Backport of [AArch64] 2/2 Improve zero extend
* Backport of [AArch64] 1/3 Migrate aarch64_add_constant to new
interface & kill aarch64_build_constant
* Backport of [AArch64] 2/3 Optimize aarch64_add_constant to generate
better addition sequences
* Backport of [AArch64] 3/3 Migrate aarch64_expand_prologue/epilogue
to aarch64_add_constant
* Backport of [AArch64] 1/6 Reimplement scalar fixed-point intrinsics
* Backport of [AArch64] 2/6 Reimplement vector fixed-point intrinsics
* Backport of [AArch64] 3/6 Reimplement frsqrte intrinsics
* Backport of [AArch64] 4/6 Reimplement frsqrts intrinsics
* Backport of [AArch64] 5/6 Reimplement fabd intrinsics & merge rtl patterns
* Backport of [AArch64] 6/6 Reimplement vpadd intrinsics & extend rtl
patterns to all modes
* Backport of [AArch64] Accept vulcan as a cpu name for the AArch64 port of GCC
* Backport of [AArch64] Add ANDS pattern for CMP+ZERO_EXTEND
* Backport of [AArch64] Add commit message
* Backport of [AArch64] Add initial support for Cortex-A73
* Backport of [AArch64] Add legitimize_address_displacement hook
* Backport of [AArch64] Add more choices for the reciprocal square
root approximation
* Backport of [AArch64] Add rtx_costs routine for vulcan
* Backport of [AArch64] Add some more missing intrinsics
* Backport of [AArch64] Add ThunderX vector cost model
* Backport of [AArch64] Allow multiple-of-8 immediate offsets for TImode LDP/STP
* Backport of [AArch64] Canonicalize Cortex core tunings
* Backport of [AArch64] Cleanup -mpc-relative-loads
* Backport of [AArch64] Define WORD_REGISTER_OPERATIONS to zero and comment why
* Backport of [AArch64] Emit division using the Newton series
* Backport of [AArch64] Emit square root using the Newton series
* Backport of [AArch64] Enable tree-stdarg pass for AArch64 by
defining counter fields
* Backport of [AArch64] Fix typo in aarch64_legitimize_address
* Backport of [AArch64] Fixup to fcvt patterns added in r237200
* Backport of [AArch64] Fix vld2/3/4 on big endian systems
* Backport of [AArch64] Give some new costs for Cortex-A53
floating-point operations
* Backport of [AArch64] Give some new costs for Cortex-A57
floating-point operations
* Backport of [AArch64] Handle AND+ASHIFT form of UBFIZ correctly in costs
* Backport of [AArch64] Handle iterator definitions with conditionals
in geniterator.sh
* Backport of [AArch64] Improve aarch64_modes_tieable_p
* Backport of [AArch64] Increase code alignment
* Backport of [AArch64] Keep CTZ components together until after reload
* Backport of [AArch64] Optimize prolog/epilog
* Backport of [AArch64] Remove aarch64_cannot_change_mode_class
* Backport of [AArch64] Remove spurious attribute __unused__ from NEON intrinsic
* Backport of [AArch64] Renaming ARMv8.1 to ARMv8.1-A in comments and
documentations
* Backport of [AArch64] Replace insn to zero up SIMD registers
* Backport of [AArch64] update vulcan L1 cacheline size
* Backport of [ARMv8.2] [AArch64] 10/10 ARMv8.2-A FP16 lane scalar intrinsics
* Backport of [ARMv8.2] [AArch64] 1/10 ARMv8.2-A FP16 data processing intrinsics
* Backport of [ARMv8.2] [AArch64] 2/10 ARMv8.2-A FP16 one operand
vector intrinsics
* Backport of [ARMv8.2] [AArch64] 3/10 ARMv8.2-A FP16 two operands
vector intrinsics
* Backport of [ARMv8.2] [AArch64] 4/10 ARMv8.2-A FP16 three operands
vector intrinsics
* Backport of [ARMv8.2] [AArch64] 5/10 ARMv8.2-A FP16 lane vector intrinsics
* Backport of [ARMv8.2] [AArch64] 6/10 ARMv8.2-A FP16 reduction vector
intrinsics
* Backport of [ARMv8.2] [AArch64] 7/10 ARMv8.2-A FP16 one operand
scalar intrinsics
* Backport of [ARMv8.2] [AArch64] 8/10 ARMv8.2-A FP16 two operands
scalar intrinsics
* Backport of [ARMv8.2] [AArch64] 9/10 ARMv8.2-A FP16 three operands
scalar intrinsics
* Backport of [ARMv8.2] [AArch64] ARMv8.2 command line and feature
macros support
* Backport of [Misc] 1/2 Move choose_mult_variant declaration and
dependent declarations to expmed.h
* Backport of [Misc] 2/2 Hook up mult synthesis logic into
vectorisation of mult-by-constant
* Backport of [Misc] Append "evaluates to 0" for Wundef diagnostic
* Backport of [Misc] Avoid unnecessary peeling for gaps with LD3
* Backport of [Misc] Check for POINTER_TYPE_P before accessing
SSA_NAME_PTR_INFO in tree-inline
* Backport of [Misc] Disable ifunc on *-musl by default
* Backport of [Misc] Disable setting param of __builtin_constant_p to null
* Backport of [Misc] Don't count spilling cost for it offmemok
* Backport of [Misc] Don't use section anchors for declarations that
don't fit in a single anchor range
* Backport of [Misc] Fix ChangeLog entry
* Backport of [Misc] Fix GROUP_GAP for single-element interleaving
* Backport of [Misc] Fix unused variable warning in
simplify_cond_clz_ctz on some targets
* Backport of [Misc] Increase alignment of global structs in
increase_alignment pass
* Backport of [Misc] Latent alignment bug in tree-ssa-address.c
* Backport of [Misc] Report supported function classes correctly on *-musl
* Backport of [Testsuite] 29_atomics/atomic/65913.cc: require
atomic-builtins rather than specific target
* Backport of [testsuite] [AArch32] Add missing guards to fp16 AdvSIMD tests
* Backport of [Testsuite] [AArch32] Fix, add tests for FP16 aapcs
* Backport of [Testsuite] [AArch32] Fix dg-do and dg-skip order
* Backport of [Testsuite] [AArch32] gcc.target/arm/pr37780_1.c: Use
arm_arch_v6t2 effective target and options
* Backport of [Testsuite] [AArch32] Make arm_neon_fp16 depend on arm_neon_ok
* Backport of [Testsuite] [AArch32] Selectors and options directives
for ARM VFP FP16 support
* Backport of [Testsuite] [AArch64] Ensure vrnd* tests run on ARMv8 cores
* Backport of [Testsuite] Add testcases
* Backport of [testsuite] asan/clone-test-1.c: Handle clone() failure
* Backport of [Testsuite] Fix testcases
* Backport of [Testsuite] Use setrlimit for testing libstdc++ in cross
toolchains
* Backport of [Cleanup] [AArch32] 2/4 Replace casts of 1 to
HOST_WIDE_INT by HOST_WIDE_INT_1 and HOST_WIDE_INT_1U
* Backport of [Cleanup] [AArch32] 3/4 Cleanup casts from INTVAL to
[unsigned] HOST_WIDE_INT
* Backport of [Cleanup] [AArch32] 4/4 Simplify checks for CONST_INT_P
and comparison against 1/0
* Backport of [cleanup] [AArch32] Delete thumb_reload_in_h
* Backport of [Cleanup] [AArch32] Remove non-existent extern
declarations in arm.h
* Backport of [Cleanup] [AArch64] aarch64_elf_asm_named_section:
Remove declaration.
* Backport of [Cleanup] [AArch64] Clean up parentheses and use
GET_MODE_UNIT_BITSIZE in a couple of patterns
* Backport of [Cleanup] [AArch64] Remove static variable
all_extensions from aarch64.c
* Backport of [Cleanup] Cleanup frame push/pop code
* Backport of [Cleanup] rtlanal.c: Convert conditional compilation on
WORD_REGISTER_OPERATIONS
* Backport of [Debug] ifcvt: Print name of noce transform that
succeeded in dump file
Subscribe to the important Linaro mailing lists and join our IRC
channels to stay on top of Linaro development.
* Linaro Toolchain Development mailing list:
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
* Linaro Toolchain IRC channel on irc.freenode.net at #linaro-tcwg
* Bug reports should be filed in Bugzilla against the GCC product:
http://bugs.linaro.org/enter_bug.cgi?product=GCC
* Interested in commercial support? Inquire at Linaro support:
mailto:support@linaro.org
[1]. Source package snapshots are releases for which the compiler has
only been put through unit testing; full validation is not performed.
[2]. Stable source package releases are releases for which the full
Linaro Toolchain validation plan is executed.
Hi,
For reference, the following questions refer to a Linux 3.10 aarch64 kernel (ARMv8, ARMv8-A) compiled with Linaro 5.3-2016.02 (5.3-2016.02 arm64 CROSS_COMPILE and 5.3-2016.02 armhf CROSS32CC cross compiled from x86_64). This Linux kernel compile requires the full 64-bit tool chain plus the 32-bit gcc.
There is a Linux kernel file "arch/arm64/kernel/deprecated.c". Within that file is a block of code which is apparently designed to detect some sort of older obsolete 32-bit code. Specifically, the warning message is "Using deprecated CP15 barrier instruction". I wouldn't think that this kernel, when compiled with such a recent compiler, would ever introduce CP15 barrier code (from what I see barrier code was deprecated in ARMv7 and used to deal with obsolete ARMv6 code). Is there any chance the 32-bit 5.3 gcc would still generate CP15 barrier code?
Although this kernel and "most" modules are built with Linaro 5.3 I'd like to be able to narrow the cause down to existing binary modules not compiled with the 5.3 Linaro. If I can guarantee 5.3 32-bit gcc does not produce the CP15 barrier instruction then I'll know it is in pre-built binary modules. Can anyone tell me if C code compiled from Linaro version 5.3 32-bit gcc code would ever contain CP15 instructions?
Thanks!
What I need to confirm is whether Linaro 5.3 would generate such code from a normal block of C without any kind of asm in it. Detecting the code is part of the kernel; I need to figure out where it was generated. Is Linaro 5.3 going to generate such code without it being explicitly written in?
----- Original Message -----
From: Andrew Pinski <Andrew.Pinski(a)cavium.com>
To: stimits(a)comcast.net, Linaro Toolchain Mailman List <linaro-toolchain(a)lists.linaro.org>
Sent: Mon, 05 Sep 2016 17:21:24 -0000 (UTC)
Subject: RE: Question on "Using deprecated CP15 barrier instruction".
The simple answer is yes. There is a lot of assembly out there, and someone (I have seen it in the past) could have used this barrier method in their code without expecting it to change in the future.
Thanks,
Andrew
From: linaro-toolchain [mailto:linaro-toolchain-bounces@lists.linaro.org] On Behalf Of stimits(a)comcast.net
Sent: Monday, September 5, 2016 8:10 AM
To: Linaro Toolchain Mailman List <linaro-toolchain(a)lists.linaro.org>
Subject: Question on "Using deprecated CP15 barrier instruction".
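One way to narrow down which binary carries the instructions the kernel warns about is to scan its code for the CP15 barrier encodings. The following is a minimal sketch, not the kernel's own detection code: it assumes the input is little-endian A32 (ARM-state) machine code that has already been extracted from the module, it does not handle Thumb encodings, and the names `classify_cp15_barrier` and `find_cp15_barriers` are hypothetical. The three barrier forms are `mcr p15, 0, Rt, c7, c10, 5` (DMB), `mcr p15, 0, Rt, c7, c10, 4` (DSB) and `mcr p15, 0, Rt, c7, c5, 4` (ISB).

```python
import struct

# (CRm, opc2) pairs of the CP15 barrier operations, all with CRn = c7, opc1 = 0.
CP15_BARRIERS = {
    (10, 5): "CP15DMB",  # mcr p15, 0, Rt, c7, c10, 5
    (10, 4): "CP15DSB",  # mcr p15, 0, Rt, c7, c10, 4
    (5, 4): "CP15ISB",   # mcr p15, 0, Rt, c7, c5, 4
}


def classify_cp15_barrier(insn):
    """Return the barrier name for a 32-bit A32 instruction word, or None.

    A32 MCR layout:
      cond[31:28] 1110 opc1[23:21] 0 CRn[19:16] Rt[15:12]
      coproc[11:8] opc2[7:5] 1 CRm[3:0]
    """
    # Bits 27:24 must be 1110, bit 20 must be 0 (MCR, not MRC), bit 4 must be 1.
    if (insn & 0x0F100010) != 0x0E000010:
        return None
    # cond == 1111 is the unconditional space (MCR2), not a plain MCR.
    if (insn >> 28) == 0xF:
        return None
    coproc = (insn >> 8) & 0xF
    opc1 = (insn >> 21) & 0x7
    crn = (insn >> 16) & 0xF
    crm = insn & 0xF
    opc2 = (insn >> 5) & 0x7
    if coproc != 15 or opc1 != 0 or crn != 7:
        return None
    return CP15_BARRIERS.get((crm, opc2))


def find_cp15_barriers(code):
    """Yield (byte_offset, name) for each CP15 barrier in little-endian A32 code."""
    usable = len(code) - len(code) % 4  # ignore any trailing partial word
    for index, (word,) in enumerate(struct.iter_unpack("<I", code[:usable])):
        name = classify_cp15_barrier(word)
        if name is not None:
            yield index * 4, name
```

The scan treats every aligned word as an instruction, so literal pools or data embedded in the code section can produce false positives; any hit should be confirmed against a disassembly of the module.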
== Progress ==
o Linaro GCC/Validation (7/10)
- Merged bkk16 buildfarm developments into main job.
- Backports: Reviews, dependencies tracking and backports.
- Analysed extended validation results.
o Misc (3/10)
- Various meetings and discussions.
== Plan ==
o Release 2016.09 Snapshots
o GNU Cauldron
== Progress ==
- IPA-VRP: Re-based ipa-cp/ipa-prop on top of Prathamesh's commit
(quite a few conflicts) and did full testing before posting to the
list. Patch approved for commit.
- Waiting for Early-VRP patch to commit rest of the patches in the series
- All other patches are OK now.
- Looking at tree-vrp in detail. Created
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77387
- Looking at re-assoc wrong code generation issue (PR72835). Posted
another attempted fix for review
== Next ==
- Work on tree-vrp improvements
- Follow up on pending patches
== This Week ==
* TCWG-779 (4/10)
- Upstream patch iterations
* NULL pointer propagation in ipa-bitwise-cp (4/10)
- WIP patch
* Benchmarking (1/10)
- Found how to set iterations for coremark
- Submitted job for setting up hack session on Juno
* Misc (1/10)
- Meetings
== Next Week ==
- NULL pointer propagation
- TCWG-779
- GNU Cauldron
* Holiday on Monday. [2/10]
# Progress #
* TCWG-685, GDB 7.12 release. No new issue. [1/10]
* TCWG-655, ARM linux kernel ptrace bug. [3/10]
Teach GDB testsuite to detect such kernel bug, and skip all tests
related to floating point. Patches are committed.
Separately, upgraded my ARM kernel to 4.7.2, in which the ptrace
bug is fixed.
* TCWG-518, range stepping on ARM. [2/10].
Turned it on, but it causes some thread starvation. Investigating.
* Finished the reproducer showing odd behaviour of the ARMv8 kernel in
si_code after PTRACE_SINGLESTEP; Will Deacon fixed it:
http://lists.infradead.org/pipermail/linux-arm-kernel/2016-September/453289…
[1/10]
* Thought about the GDB plan, and wrote down a list of things I need to
do. [1/10]
# Plan #
* TCWG-685, TCWG-518.
* GNU Cauldron.
--
Yao Qi
== Activity ==
[TCWG-610] Sent .ARM.exidx for upstream review.
Will need to rework and simplify it a bit to make it more specific to
ARM. In summary, abandon the pretence of general SHF_LINK_ORDER support
and concentrate on supporting its one known use case in .ARM.exidx
sections.
Made a couple of drafts of forthcoming LLVM Cauldron presentation
- Presented locally
- Modified slides after overrunning time
== Plans ==
[TCWG-610] Rework and resend for review
llvm-cauldron on Thursday
Intending to take Friday as holiday to take advantage of visiting
parents whilst up North.
== Progress ==
* Validation
- patch reviews (Jenkins jobs, abe) mainly for the release process
* Backports
- resubmitted the pending ones
- started looking at v8.2 backports
* GCC
- resumed PR 67591 (ARM v8 Thumb IT blocks deprecated)
* arm-neon-tests
- small update to support clang, as requested by Chromium
* misc (conf-calls, meetings, emails, ....)
- catching up after holidays
== Next ==
- monitor trunk regressions
- pr 67591
- release scripts
- backports