Hi Andrew,
On 27 June 2016 at 19:32, Pinski, Andrew <Andrew.Pinski(a)cavium.com> wrote:
>> No gain expected in implementing an Ifunc'ed version of the library.
>
> How did you prove that? What hardware did you run this on to prove it?
> Also have you thought at least doing an ifunc version for 128bit atomics?
up to 64bits, the calls to the libatomic routines are inlined and
armv8.1 CAS and load-operate version are used when the application is
build for armv8.1 architecture. For 128bits, a call to the lib is
made which uses the same LL/SC implementation with or without LSE
support, as CAS and load-operate instruction don't support this data
size.
I don't have armv8.1 hardware and made the analysis on the generated
assembler. Do you have use case on your side where an ifunc version
can be useful ? I'm not aware of an algorithm which can replace
effectively LL/SC implementation with shorter CAS, do you have any
pointers ? Maybe CASP can be used in some cases, I'll investigate it.
Thanks
Yvan
> Thanks,
> Andrew
>
> -----Original Message-----
> From: linaro-toolchain [mailto:linaro-toolchain-bounces@lists.linaro.org] On Behalf Of Yvan Roux
> Sent: Monday, June 27, 2016 1:40 AM
> To: Linaro Toolchain Mailman List <linaro-toolchain(a)lists.linaro.org>
> Subject: [ACTIVITY] Week 25
>
> == Progress ==
> o Extended Validation (1/10)
> - Benchmarking job babysitting.
>
> o Upstream GCC (4/10)
> - ARMv8.1 libatomic: Analysis completed.
> No gain expected in implementing an Ifunc'ed version of the library.
> - Working on __sync buitlins potential fix.
>
> o Misc (5/10)
> * Various meetings and discussions.
> * Internal appraisal
>
> == Plan ==
> o Continue on-going tasks (__sync, benchmarking) _______________________________________________
> linaro-toolchain mailing list
> linaro-toolchain(a)lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-toolchain
== Progress ==
o Extended Validation (1/10)
- Benchmarking job babysitting.
o Upstream GCC (4/10)
- ARMv8.1 libatomic: Analysis completed.
No gain expected in implementing an Ifunc'ed version of the library.
- Working on __sync buitlins potential fix.
o Misc (5/10)
* Various meetings and discussions.
* Internal appraisal
== Plan ==
o Continue on-going tasks (__sync, benchmarking)
# Progress #
* TCWG-333, ISA bit treatment in ARM thumb mode.
Got some comments from Maciej (MIPS) and need to address them.
* TCWG-518, ARM range stepping patches. [2/10]
Combine the path of "proceed" and "resume" so that we only change one
place instead of two to support range stepping. Patches are being
tested.
* TCWG-655, Workaround ARM linux kernel ptrace bug... [2/10]
by pining both program and debugger on the same core in the affected
test cases.
* Off on Wed to Fri. [6/10]
# Plan #
* Continue things above,
--
Yao
== This Week ==
* LTO (8/10)
a) TCWG-558 (2/10)
- found a way to detect if stmt is inside a loop
- addressed issue with -ffat-lto-objects
- patch posted upstream, waiting for review.
b) ipa-vrp (3/10)
- Prototype patch for propagating value ranges
inter-procedurally (accidental overlap with Kugan's patches)
- Experimented and reviewed Kugan's patch.
c) TCWG-548 (2/10)
- Wrote a pass to gather stats on external references.
- Measurements with SPEC2000 with and without prototype patch
d) Had a look at pointer alignment propagation within ipa-cp pass (1/10)
* Benchmarking (1/10)
- Obtained results for linaro-gcc-5, fsf-gcc-6 for coremark-pro for
arm-linux-gnueabihf target.
- Submitted job 112 for bkk16-buildfarm-benchmark, status shows SUCCESS, but it
appears no tcwg-benchmark job references build #112.
* Misc (1/10)
- Meetings
== Next Week ==
- Continue with TCWG-548 and ipa-vrp
- Ping patches in upstream reviews
== Progress ==
* Out of office on Monday and Tuesday [4/10]
* ARM: Do not test for CPUs, use SubtargetFeatures [TCWG-623] [4/10]
- First patch committed upstream (6 features / properties)
- Another patch in upstream review (5 features / properties)
- Working on another series of patches
* Use member initializers in ARMSubtarget [TCWG-659] [1/10]
- Trivial refactoring suggested during code review for TCWG-623
- Committed upstream
* Unbreak the selfhost bots [TCWG-661] [1/10]
- Currently bisecting the issue affecting
clang-cmake-thumbv7-full-sh (in progress)
- Tried bisecting one of the issues affecting
clang-cmake-armv7-selfhost and clang-cmake-armv7-selfhost-neon, but
the test board died in the process; Renato is trying to reproduce on
one of his boards
* Remove exit-on-error flag from CodeGen tests [TCWG-604]
- Finally committed upstream
== Plan ==
* ARM: Do not test for CPUs, use SubtargetFeatures [TCWG-623]
* Unbreak the selfhost bots [TCWG-661]
== Progress ==
* TCWG-653 Add interworking thunks to LLD
Investigated and reported upstream bug in existing implementation.
This has been fixed by reverting the change that introduced it. Got
some feedback about whether I would need to strictly follow existing
Thunk implementation.
Implemented an alternative thunk mechanism that should generalise to
supporting range extension and interworking thunks. It is passing the
existing lld regression tests.
== Plans ==
Finish off support for and add tests for ARM/Thumb interwork.
Tidy up and submit upstream.
== Progress ==
* Validation
- experimenting backport-multijob
- noticed dejagnu problems when trying to kill processes that timed out
- abe and jenkins configs patches / reviews
* Backports/snapshots
- restarted a few validations, to try to recover from disk full issues
* GCC
- neon-testgen.ml removal patch sent upstream
- PR 67591 (ARM v8 Thumb IT blocks deprecated)
- followup on vect.exp's check_vect() support for old arm cores.
- neon-fp16 tests consistency patch posted
- a couple of regressions flagged on trunk
* cortex-strings
- updated aarch64/strlen
* Support
* misc (conf-calls, meetings, emails, ....)
== Next ==
* Validation:
- patch reviews
- look at xenial, docker builders
* Backports
- restart the ones that failed due to disk space issues
* GCC
- monitor trunk regressions
- fix "check_vect" guard in gcc.dg/vect tests
- pr 67591
- advsimd tests
== Progress ==
o Extended Validation (1/10)
- Benchmarking job babysitting.
o Linaro GCC (5/10)
- Merged FSF GCtC 6 branch
- Reviewed backports (very tedious because of infra issues)
- Released June snapshots (5.4 and 6.1)
o Upstream GCC (2/10)
- ARMv8.1 libatomic: Reading and code analysis.
o Misc (2/10)
* Various meetings and discussions.
== Plan ==
o Continue on-going tasks (libatomic, benchamrking)
# Progress #
* TCWG-333, ISA bit treatment in ARM thumb mode. [2/10]
Post a patch to skip the test for thumb mode. Finish the prototype
which shows many issues that I can't overcome. Convince myself that
we should skip the test rather than support that in GDB.
* TCWG-518, ARM range stepping patches. [3/10].
11 of 12 patches are approved, and some of them are already
committed. Need to address one comment that merge to execution paths
into one, which requires some big changes in GDBserver for linux.
* TCWG-651, Support 'catch syscall' in remote for aarch64 and arm.[3/10]
My patches are posted upstream.
* TCWG-556, aarch64 gdb buildbot. [1/10]
They are online.
http://gdb-build.sergiodj.net/builders/Ubuntu-AArch64-m64/http://gdb-build.sergiodj.net/builders/Ubuntu-AArch64-native-gdbserver-m64
* TCWG-654, Build both cross/native arm/aarch64 gcc 5.4.1 to replace
the gcc in my gdb tests. Ongoing. [1/10]
# Plan #
* Follow up all of them above,
* Holiday from Wed - Fri.
--
Yao
== Progress ==
1 day public holidays
IPA VRP
- Implemented a version of early VRP.
- Verified with simple test cases.
- Some test cases are failing in regression testing, looking into it.
- Some design decisions need to be firmed up with the upstream
discussions.
== Plan ==
- Follow upon remaining upstream patches
- IPA VRP