o 1 day off (2/10)
== Progress ==
o Linaro GCC (6/10)
* FSF branch merge into linaro GCC 5 branch
* Troubleshot various regression after the merge
* Delivered GCC 5.2 2015.11 snapshot
o Upstream work (1/10)
* Sanitizing gfortran testsuite
o Misc (1/10)
* Various meetings
== Plan ==
o Continue on sanitizing testsuite
o Backports, infra, ...
Implement LAVA jobs for microinstance - TCWG-432 [6/10]
* Refactoring to permit sharing of code between uinstance & main
instance, as far as possible
* Further refactoring for sane submission of bundles without inserting
LAVA assumptions in the wrong places
* Tested as far as possible in main instance, using light hacks and fakebench
Jenkins benchmarking job - TCWG-348 [1/10]
* Converted pbl hacks into a sane patch for yaml-to-json.py
Controlled image builds - TCWG-360 [1/10]
* Submitted aarch64 filesystem build for review
* Generated armhf and amd64 filesystems
* Started learning how to generate hwpack
Misc [2/10]
=Plan=
Review security with shared uinstance/main instance code
Expose more data, benchmarks to bundles
Create YAML definition for Jenkins benchmarking job
Generate (controlled) hwpack for at least one target, or know what the
problems are
Write up noise control report (if time)
Have another at crashdump (if time, if new kexec patches)
* TCWG-72 (3/10)
- divmod transform approved by Richard
- builds cleanly on arm-linux-gnueabihf, aarch64-linux-gnu
- Investigating segfault with __bdi64_div.c
happens when mode == DImode and libval_mode == TImode
- Found another segfault on x86 with TImode, on arm
TImode is not supported and compiler aborts. Perhaps we should
not do the transform when mode is TImode ?
- Had a look at expand_binop_twoval_libfunc().
Wrote a similar function to obtain both results but this resulted
in infinite loop in emit_libcall_block_1
- Strangely the bug is reproducible only during the build and doesn't
trigger when compiled with preprocessed version of bid64_div.c
(passing the same set of options).
- waiting for upstream comments
* TCWG-319 (1/10)
- Submitted jobs for fp benchmark on a53, a57
* Misc:
- PR66214 appears to have gone (fixed or became latent), that was
blocking firefox LTO build with trunk
- PR65837 still appears to be present after r230327
* Public Holidays (6/10)
- Diwali festival
== Next Week ==
- Continue with TCWG-72, TCWG-319 benchmarking, target hook conversion
- Run SPEC2k6 with LTO
== Progress ==
- Widening pass (TCWG-547) - 6/10
* Bootstrapped latest patch on ppc64-linux-gnu, aarch64-linux-gnu and
x64-64-linux-gnu.
* Regression testing on ppc64-linux-gnu,
aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu.
* Fixed all of the execution issues
* Posted updated patch to the list
- Misc (4/10)
* Linaro bug 1900
* Continued Looking at LuaJIT code-base
* gcc/bug list
== Plan ==
* bug 1900
* Look at implementing LuaJIT for aarch64
* LTO
Hi,
We have a packaging/linking/optimization problem at LNG, I hope you guys
can give us some advice on that. (Cc'ing ODP list in case someone want
to add something)
We have OpenDataPlane (ODP), an API stretching between userspace
applications and hardware SDKs. It's defined in the form of C headers,
and we already have several implementations to face SDKs (or whathever
is actually controlling the hardware), e.g. linux-generic, a DPDK one etc.
And we have applications, like Open vSwitch (OVS), which now is able to
work with any ODP platform implementation which implements this API
When it comes to packaging, the ideal scenario would be to create one
package for the application, e.g. openvswitch.deb, and one for each
platform, e.g odp-generic.deb, odp-dpdk.deb. The latter would contain
the implementations in the form of a libodp.so file, so the application
can dynamically load the actually installed platform's library runtime,
with all the benefits of dynamic linking.
The trouble is that we have several accessor functions in the API which
are very short and __very__ frequently used. The best example is
"uint32_t odp_packet_len(odp_packet_t pkt)", which returns the length of
the packet. odp_packet_t is an opaque type defined by the
implementation, often a pointer to the packet's actual metadata, so the
actual function call yields to a simple load from that metadata pointer
(+offset). Having it wrapped into a function call brings a significant
performance decrease: when forwarding 64 byte packets at 10 Gbps, I got
13.2 Mpps with function calls. When I've inlined that function it
brought 13.8 Mpps, that's ~5% difference. And there are a lot of other
frequently used short accessor functions with the same problem.
But obviously if I inline these functions I break the ABI, and I need to
compile the application for each platform (and create packages like
openvswitch-odp-dpdk.deb, containing the platform statically linked).
I've tried to look around on Google and in gcc manual, but I couldn't
find a good solution for this kind of problem.
I've checked link time optimization (-flto), but it only helps with
static linking. Is there any way to keep the ODP application and
platform implementation binaries in separate files while having the
performance benefit of inlining?
Regards,
Zoltan
The Linaro Toolchain Working Group (TCWG) is pleased to announce the
2015.11 snapshot of the Linaro GCC 5 source package.
This monthly snapshot[1] is based on FSF GCC 5.2+svn230068 and
includes performance improvements and bug fixes backported from
mainline GCC. This snapshot contents will be part of the 2015.11
stable [1] quarterly release.
This snapshot tarball is available on:
http://snapshots.linaro.org/components/toolchain/gcc-linaro/5.2-2015.11/
Interesting changes in this GCC source package snapshot include:
* Updates to GCC 5.2+svn230068
* Backport of [Bugfix] [AArch32] fp16 Fix PR 67624 - Incorrect
conversion of float Infinity to __fp16
* Backport of [Bugfix] [AArch64] PR 66776 Add cmovdi_insn_uxtw pattern
* Backport of [Bugfix] [AArch64] PR rtl-optimization/68106 LRA
* Backport of [Bugfix] PR48052 fix testcase
* Backport of [Bugfix] PR other/57195
* Backport of [Bugfix] PR rtl-optim/67421 Cost instruction sequences
when doing left wide shift
* Backport of [Bugfix] PR rtl-optimization/67103 Improve conditional
select ops on immediates
* Backport of [Bugfix] PR rtl-optimization/67756
* Backport of [Bugfix] PR target/61578
* Backport of [Bugfix] PR target/61578
* Backport of [Bugfix] PR target/61578
* Backport of [Bugfix] PR tree-optimization/48052 IVOPTS
* Backport of [Bugfix] PR tree-optimization/52563 and 62173 IVOPTS
* Backport of [Bugfix] PR tree-optimization/64454
* Backport of [Bugfix] PR tree-optimization/66449
* Backport of [AArch32] 1/2 Record FPU features as a bit-set
* Backport of [AArch32] 2/2 Use new FPU features representation
* Backport of [AArch32] 1/5 Make room for more CPU feature flags
* Backport of [AArch32] 2/5 Add feature set definitions
* Backport of [AArch32] 3/5 Use new feature set representation
* Backport of [AArch32] 4/5 Use features sets for builtins
* Backport of [AArch32] 5/5 Move initializer into arm-cores.def and
arm-arches.def
* Backport of [AArch32] Add earlyclobber modifier for neon_(vtrn,
vuzp, vzip)<mode>_insn rtx pattern
* Backport of [AArch32] Add missing is_neon_type types
* Backport of [AArch32] arm memcpy of aligned data
* Backport of [AArch32] Fix arm bootstrap failure due to
-Werror=shift-negative-value
* Backport of [AArch32] fix vget_lane on big-endian
* Backport of [AArch32] Use %wd format for lane printing in bounds_check
* Backport of [AArch32/AArch64] 1/15 [FP16] Hide existing float16
intrinsics unless we have a scalar __fp16 type
* Backport of [AArch32/AArch64] 2/15 [fp16] float16x4_t intrinsics in arm_neon.h
* Backport of [AArch32/AArch64] 3/15 Add V8HFmode and float16x8_t type
* Backport of [AArch32/AArch64] 4/15 float16x8_t intrinsics in arm_neon.h
* Backport of [AArch32/AArch64] 5/15 Remaining intrinsics
* Backport of [AArch32/AArch64] 6/15 Add basic FP16 support
* Backport of [AArch32/AArch64] 8/15 Add support for float16x{4,8}_t
vectors/builtins
* Backport of [AArch32/AArch64] 9/15 vld{2,3,4}{,_lane,_dup}, vcombine, vcreate
* Backport of [AArch32/AArch64] 10/15 Implement vcvt_{,high_}f16_f32
* Backport of [AArch32/AArch64] 11/15 vreinterpret(q?),
vget_(low|high), vld1(q?)_dup
* Backport of [AArch32/AArch64] 12/15 Add vcvt(_high)?_f32_f16
intrinsics, with BE RTL fix
* Backport of [AArch32/AArch64] 13/15 Add float16 tests to
advsimd-intrinsics testsuite
* Backport of [AArch32/AArch64] 14/15 Add test of
vcvt{,_high}_i{f32_f16,f16_f32}
* Backport of [AArch32/AArch64] 15/15 Update sourcebuild.texi with
testsuite/effective-target hooks
* Backport of [AArch64] 1/5 Reimplement aarch64_bitmask_imm
* Backport of [AArch64] 2/5 Improve aarch64_internal_mov_immediate by
using faster algorithm
* Backport of [AArch64] 3/5 Remove dead code
* Backport of [AArch64] 4/5 Remove redundant code
* Backport of [AArch64] 5/5 Cleanup immediate generation code in
aarch64_internal_mov_immediate
* Backport of [AArch64] 1/14 Add ident field to struct processor
* Backport of [AArch64] 2/14 Refactor arches handling, add arch enum identifier
* Backport of [AArch64] 3/14 Refactor option override code
* Backport of [AArch64] 4/14 Create TARGET_FIX_ERR_A53_835769 and use
that instead of aarch64_fix_a53_err835769
* Backport of [AArch64] 5/14 Make flag_omit_leaf_frame_pointer
intialize to 2. Define and use TARGET_OMIT_LEAF_FRAME
* Backport of [AArch64] 6/14 Implement TARGET_OPTION_SAVE/TARGET_OPTION_RESTORE
* Backport of [AArch64] 7/14 Implement TARGET_SET_CURRENT_FUNCTION
* Backport of [AArch64] 8/14 Implement TARGET_OPTION_VALID_ATTRIBUTE_P
* Backport of [AArch64] 9/14 Implement TARGET_CAN_INLINE_P
* Backport of [AArch64] 10/14 Implement target pragmas
* Backport of [AArch64] 11/14 Re-layout SIMD builtin types on builtin expansion
* Backport of [AArch64] 12/14 Target attributes and target pragmas tests
* Backport of [AArch64] 13/14 Document AArch64 target attributes and pragmas
* Backport of [AArch64] 14/14 Reuse target_option_current_node when
passing pragma string to target attribute
* Backport of [AArch64] vtbl[34] and vtbx4
* Backport of [AArch64] Add backend aarch64_bfi pattern
* Backport of [AArch64] Add csneg3_uxtw_insn pattern
* Backport of [AArch64] Add support for 64-bit vector-mode ldp/stp
* Backport of [AArch64] Adjust tests to take LSE extension into account
* Backport of [AArch64] [array_mode 1/8] Rename
vec_store_lanes<mode>_lane to aarch64_vec_store_lanes<mode>_lane
* Backport of [AArch64] [array_mode 2/8] Remove VSTRUCT_DREG, use
BLKmode for d-reg aarch64_st/ld expands
* Backport of [AArch64] [array_mode 3/8] Stop using EImode in
aarch64-simd.md and iterators.md
* Backport of [AArch64] [array_mode 4/8] Remove EImode
* Backport of [AArch64] [array_mode 5/8] Remove V_FOUR_ELEM, again
using BLKmode + set_mem_size.
* Backport of [AArch64] [array_mode 6/8] Remove V_TWO_ELEM, again
using BLKmode + set_mem_size.
* Backport of [AArch64] [array_mode 7/8] Combine the expanders using
VSTRUCT:nregs
* Backport of [AArch64] [array_mode 8/8] Add d-registers to
TARGET_ARRAY_MODE_SUPPORTED_P
* Backport of [AArch64] Break -mcpu tie between the compiler and assembler
* Backport of [AArch64] [expand] Check gimple statement to improve
LSHIFT_EXP expand
* Backport of [AArch64] Fix FAIL:
gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler
error)
* Backport of [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics
* Backport of [AArch64] Fix vldX/vstX AdvSIMD intrinsics
* Backport of [AArch64] Followup to [AArch64_be] Fix vtbl[34] and vtbx4
* Backport of [AArch64] Force __builtin_aarch64_fp[sc]r argument into a REG
* Backport of [AArch64] Handle const address in aarch64_print_operand
* Backport of [AArch64] Implement copysign[ds]f3
* Backport of [AArch64] Improve code generation for float16 vector code
* Backport of [AArch64] Improve SIMD concatenation with zeroes
* Backport of [AArch64] Remove index from AARCH64_FUSION_PAIR
* Backport of [AArch64] Remove obsolete comment in aarch64-option-extensions.def
* Backport of [AArch64] Remove separate movtf pattern - Use an
iterator for all FP modes
* Backport of [AArch64] Remove the hack for AARCH64_EXTRA_TUNE_ALL
* Backport of [AArch64] TLSLE 1,2 and 3/N
* Backport of [AArch64] Use default_elf_asm_named_section instead of
special cased hook
* Backport of [AArch64] Use default_elf_asm_named_section instead of
special cased hook
* Backport of [AArch64] Use logics_imm type for 2nd alternative of
*and<mode>3nr_compare0
* Backport of [AArch64] Use popcount_hwi instead of homebrew version
* Backport of [Testsuite] Fix race on temp file in gfortran streamio_*.f90 tests
* Backport of [Testsuite] Fix race on temp file in gfortran tests
* Backport of [Testsuite] Fix typo in vcvt_f16.c testcase
* Backport of [Testsuite] Adjust compiling options for
gcc.target/arm/unsigned-float.c
* Backport of [Testsuite] [AArch32] gcc.target/arm/pr67756.c: Fixed warnings
* Backport of [Testsuite] [AArch64] 7/15 Add basic fp16 tests
* Backport of [Testsuite] [AArch64] Adjust some arith+compare tests
for potentially more aggressive if-conversion
* Backport of [Testsuite] [AArch64] Make arm_align_max_stack_pwr.c and
arm_align_max_pwr.c compile testcase, instead of execution
* Backport of [Testsuite] [AArch64] Mark target_attr_1.c as compile-only
* Backport of [testsuite] [AArch64] Remove divisions-to-produce-NaN
from vdiv_f.c
* Backport of [Testsuite] Add float16 lane_f16_indices tests
* Backport of [Testsuite] auto-wipe dump files
* Backport of [Testsuite] Clean up effective_target cache
* Backport of [Testsuite] Clean up effective_target cache
* Backport of [Testsuite] Fix order of dg-do and
dg-require-effective-target directives
* Backport of [testsuite] gcc.dg/builtins-20.c: Remove undefined behavior
* Backport of [Testsuite] gcc.dg/tree-ssa/pr65447.c: Increase searching number
* Backport of [Misc] add separate insn sched class for vector LDP & STP
* Backport of [Misc] ccorrect ChangeLog dates+address
* Backport of [Misc] fix typo in 223858 1/2
* Backport of [Misc] fix typo in 223858 2/2
* Backport of [Misc] Fix bigendian HFmode in native_interpret_real
* Backport of [Misc] model load/store multiples properly in
autoprefetcher scheduling
* Backport of [Misc] Improve auto-increment addressing mode support in
IVO by refactoring add candiate logic
* Backport of [Misc] Improve bound information in loop niter analysis
* Backport of [Misc] Improve conditional select ops on immediates
* Backport of [Misc] Improve loop bound info by simplifying
conversions in iv base
* Backport of [Misc] IVOPS
* Backport of [Misc] Look into unnecessary conversion when checking
mult_op in get_shiftadd_cost
* Backport of [Misc] Allow REG_EQUAL for ZERO_EXTRACT
* Backport of [Misc] mark libstdc++ tests unsupported if they fail
with relocation truncated
* Backport of [Misc] Rerun loop-header-copying just before vectorization
* Backport of [Misc] Allow PLUS+immediate expression in
noce_try_store_flag_constants
* Backport of [Doc] Clarify feature modifiers {no,}{fp,simd,crypto}
Feedback and Support
Subscribe to the important Linaro mailing lists and join our IRC
channels to stay on top of Linaro development.
** Linaro Toolchain Development "mailing list":
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
** Linaro Toolchain IRC channel on irc.freenode.net at @#linaro-tcwg@
* Bug reports should be filed in bugzilla against GCC product:
http://bugs.linaro.org/enter_bug.cgi?product=GCC
* Interested in commercial support? inquire at "Linaro support":
mailto:support@linaro.org
[1]. Stable source package releases are defined as releases where the
full Linaro Toolchain validation plan is executed.
[2]. Source package snapshots are defined when the compiler is only
put through unit-testing and full validation is not performed.
1 day off (Wednesday) (2/10)
== Progress ==
* Validation
- Jenkins jobs maintenance & cleanup
- comparison of build times between old & new lab
- dedicated slave for results comparison works well
* GCC
- trunk monitoring, reported a few new failures.
- high rate of commits before e/o stage1 means
lots of patches to check
- infrastructure problems in the ST compute farm
mean a few false errors needed analysis
- looked at bug #1869, (problem with binary toolsets
on RHEL6). Made some progress
== Next ==
* Validation:
- continue preparation of switch, as dev-01 is now back
- improve reporting
* GCC:
- check Neon tests cleanup
- bug #1869
- look at how to send valuable reports to gcc-regression
* Off on Wed afternoon [1/10].
# Progress #
* Fails in gdb.threads/multiple-step-overs.exp, (TCWG-332) [1/10]
Patch V2 is posted, pending for review.
* TCWG-422, patch is committed. Done. [2/10].
* TCWG-423, patches are ready, being regression tested. [2/10]
* TCWG-433, build GDB with -fsanitize=address, and exposes many memory
issues. Some of them are fixed. [2/10].
* Upstream patch review, [1/10]
* Misc, meeting, [1/10]
# Plan #
* TCWG-423, Post patches upstream.
* Understand ST's jtag probe and help them to make use of multi-arch
with GDB.
* TCWG-433, Continue fixing memory issues exposed by
-fsanitize=address.
--
Yao
Hi Albert,
On Thu, Nov 12, 2015 at 08:20:18AM +0100, Albert ARIBAUD wrote:
> Can you provide the target name and commit ID that you are building,
> s well as the version of the toolchain that you are building with?
> Without being able to reproduce your issue, it's kind of hard to
> diagnose it.
With the explanation from Ard, I understand the thing now. But thanks
for the reply anyway.
Shawn