> On Mon, Dec 5, 2011 at 1:40 AM, Tom Gall <tom.gall(a)linaro.org> wrote:
> > I probably know the answer to this already but ...
> >
> > For shared libs one can define and use something like:
> >
> > void __attribute__ ((constructor)) my_init(void);
> > void __attribute__ ((destructor)) my_fini(void);
> >
> > Which of course allows your lib to run code just after the library is
> > loaded and just before the library is going to be unloaded. This helps
> > keep out cruft such as the following out of your design:
> >
> > PleaseCallThisLibraryFunctionFirstOrThereWillBeAnErrorWhichYouWillHitCausingYouToPostToTheMailingListAskingTheSameQuestionThatHasBeenAsked1000sOfTimes();
> >
> > Yeah .. you know the function. I don't like it either.
> >
> > Unfortunately this doesn't work when people link in the .a from your
> > lib. Libs like libjpeg-turbo in theory should never ever need to be
> > linked in that fashion but consider the browsers who link to the
> > universe instead of using system shared libs.
On Mon, Dec 05, 2011 at 04:19:11PM +0800, Kito Cheng wrote:
> Here is some triky way for this problem, you can put the constructor
> and destructor to the source file which contain necessary function
> call in your libraries to enforce the linker to archive your
> constructor and destructor.
>
> However if this solution is not work for your situation, you can apply
> the patch in attach for build script to enable the
> LOCAL_WHOLE_STATIC_LIBRARIES for executable,
>
> After patch you can just add a line in your Android.mk :
>
> LOCAL_WHOLE_STATIC_LIBRARIES += libfoo
>
> The most disadvantage of this way is you should always link libfoo by
> LOCAL_WHOLE_STATIC_LIBRARIES...and this patch don't send to linaro and
> aosp yet.
[...]
Part of the problem here is that .a libraries lack the dependency and
linkage metadata that shared libraries have.
-2)
Put up with the need to call an explicit initialisation function
for the library. A lot of commonly-used libraries require an
initialisation call, and I'm not sure it causes that much of a
problem in practice...
-1)
Put a C++ wrapper around just enough of your library such that your
constructor/destructor code is recognised as a needed static
constructor/descructor by the toolchain.
I can't think of a very nice way of doing this, so I won't elaborate
on it...
It's also not really a solution, since you still need to pull in a
dummy static object from somewhere in order to cause the construcor
and descructor to get called.
0)
libtool or similar may help solve this problem, but I don't know much
about this -- also, for solving the problem, that approach only works
if uses of your library link via libtool.
1)
One hacky approach is to rename your library to libmylib-real.a, and
then make replace libmylib.a with a linker script which pulls in the
needed constructor as well as the real library:
libmylib.a:
EXTERN(__mylib_constructor)
INPUT(/path/to/libmylib-real.a)
This works, providing that __mylib_constructor is external (normally,
you would be able have the constructor function static, but it needs
to be externally visible in order to be pulled in in this way.
2)
Another way of doing a similar thing is to mark __mylib_constructor
as undefined in all the objects that make up the library.
Unfortunately, there seems to be no obvious way of doing that: the
assembler generates undefined symbol references automatically for
unresolved references at assembly time. There's no way for force
the existence of an undefined symbol without an actual reference to
it. objcopy/elfedit don't seem to support adding such a symbol
either. It would be simple to write a tool to add the undefined
symbol reference (such tools may exist already), but binutils doesn't
seem to provide this for you. The plausible-looking -u option to
gcc doesn't do anything unless doing a link.
One other way of doing it without a special tool is to insert a bogus
relocation into the text section of each object with an assembler
.reloc directive specifying relocation type R_<arch>_NONE.
There isn't really a portable way to do that, though. The name of
the relocation changes per-arch, and some arches have other quirks
(on ARM for example, .reloc cannot refer to the current location,
but seems instead to need to refer to a defined symbol which is non-zero
distance away from the location counter).
One advantage to this approach is that your .a file looks just
like any other .a file. Also, you can include that dependency
in only those objects which really require the library to be
initialised (normally, this is not a huge benefit though, since
probably most of your objects _do_ require the library to be
initialised).
A disadvantage (other than portability problems) is that, like (1),
the constructor symbol must be external (not static)... so it
pollutes the symbol table and isn't protected against people calling
it directly.
You can create a dummy symbol instead of referring to the constructor
symbol directly though -- this solves the second problem.
3)
Finally, you can split your contructor/destructor code out into a
separate .o file (say mylib-ctors.o), and use the linker script
trick for (1) to forcibly include this object when linking:
libmylib.a:
INPUT(/path/to/mylib-ctors.o /path/to/mylib-real.a)
This avoids some of the disadvantages of the other approaches,
but you still end up with a strange-looking library which is really
a linker script.
This is closer to how the C library traditionally solves the problem
(i.e., the crt*.o stuff). libc.so also tends to be a linker script,
which deals with the fact that some parts of libc must be statically
linked from a separate library when linking to -lc.
Obviously, approaches (1)..(3) all suffer from arch or toolchain
portability problems (or both). (The GNU/GCC __constructor__ thing
is obviously a portability problem in itself, it you're minded to
care about it.)
Cheers
---Dave
* Linaro GCC
Continued work on 64-bit shift / extend / etc. in NEON. I have posted an
RFC to the gcc-patches list in the hope of getting some feedback on how
best to fix this. No response yet. Hopefully some of the Linaro guys are
at least looking at it ...
Merged FSF GCC 4.5 and 4.6 into the Linaro GCC release branches prior to
the release next week.
Set more benchmarking work running in my ongoing investigation into
generic tuning.
Did a dry run of the extra release testing Michael normally does. It
failed. Michael says he's fixed it now, but I know how to do my bit, so
fingers crossed.
* Other
Experienced some IT/connectivity outages within Mentor. Resolved now.
==Progress===
* Off sick on Monday
* Systematic testing duty - few Aarch64 issues.
* Linaro patch review duty.
* Tested my vcvt fixed point patch and close to committing.
* Worked on sometime on movw / movt for symbol references rather than
constant pools . While this gives nice benefits it's a code size hog
and needs further investigation.
* PGO patch being tested finally and should go back up for review.
=== Plans ===
* Release week next week.
* Start looking at partial_partial PRE.
* Finish committing by backlog of patches.
Absences.
* Dec 19 - 31st Dec - Tentatively booked
* Feb 6-10 : Linaro Connect Q1.12/
Summary:
* Patch linaro crosstool-ng.
* Windows install package
Details:
* Patch linaro crosstool-ng:
* Back port upstream patches.
* Check-in the zlib/libiconv/expat/ncurses related patches to linaro branch.
* Create reference windows install package for linaro toolchain from
installjammer. The install process works well on Win7.
Plans:
* Investigate test on Windows.
Best regards!
-Zhenqiang
Hi,
OpenEmbedded:
* started on creating a receipts to compile the "core-image-minimal"
using an external prebuilt toolchain (csl arm-2011.03)
* there are still a lot of warnings at the do_package/do_package_qa task
* the good news is that the build process finishes and kernel plus root
file system image gets created
* the bad news is that the rootfs lacks some important libs like libc
and therefore won't run under qemu-system-arm
(since init, busybox, etc. are dynamically linked)
* currently a 3-lines hack on oe-core is required to be able to
overwrite a task of the generic glibc receipt; all other files could go
into a separate layer
Linaro Android:
* had a quick look into the EABI attribute tag issue
Regards
Ken
== String routines ==
* Sent updated memchr to the eglibc list
== 64 bit atomics ==
* Ran a set of timing consistency tests that a colleague had sent me
while I was off; Panda passed those, so time
doesn't appear to be going backwards or anything, so that's not the
problem with membase.
* Pushed the code into linaro-gcc.
== QEmu ==
* Tested Peter's prerelease - all good.
* Started looking at the issues for running in TCG mode on ARM
== Other ==
* Read through the ARMv8 instructions docs that landed on arm.com;
quite interesting. Note that multiple instruction
IT blocks are listed as being deprecated for 32bit mode on v8
(although this will work but it can be put in a mode to fault
you to make it easy to find the uses).
* Some debugging of Panda odd timing issue with Paul Mckenney.
Dave
RAG:
Red:
Amber:
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||upstream-omap3-cleanup || 2011-11-10 || 2011-12-15 || ||
||cp15-rework || 2012-01-06 || 2012-01-06 || ||
||initial-a15-system-model || 2012-01-27 || 2012-01-27 || ||
||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04?|| ||
(for blueprint definitions: https://wiki.linaro.org/PeterMaydell/QemuKVM)
Historical Milestones:
||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 ||
||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 ||
||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 ||
== qemu-kvm-getting-started ==
* now reasonably set up to run KVM under Fast Model; howto is here:
https://wiki.linaro.org/PeterMaydell/A15OnFastModels
* rebased kvm patches into qemu-linaro
* fixed bug where we weren't passing cpu number to kvm properly
when delivering an interrupt
* sent some minor patches to upstream qemu that will be needed for
kvm (eg configure script tweaks)
== initial-a15-system-model ==
* started on cleaning up a9/11mpcore private peripheral implementation;
now mostly done and looking much better as a base for a15
== other ==
* preparation for qemu-linaro release (rolled tarball, tested)
* submitted patch to fix buffer overrun in GIC model
* discussion: linux-user mode race conditions, and in particular
how we should handle signals that arrive during syscall emulation
* upstream patch review: imx31 round 3
-- PMM
== GDB ==
* Completed new set of patches to support both "info proc" and
core file generation across the remote protocol, and posted
them to the mailing list for review.
* Tested GDB trunk in preparation for 7.4 release branch point
on multiple platforms; analyzed and fixed a couple of problems,
some also present on ARM in remote testing. Patches checked
in to mainline.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
== This week ==
* More on -fsched-pressure. Testing on POWER7 showed a degenerate case
that I'd failed to handle well. Fixed that. Saw that part of the
problem on POWER7 was that IRA was using a combination of GENERAL_REGS
and CR_REGS as a single pressure class, so there appeared to be 39
registers available for storing integers. Fixed (or worked around) that.
Tweaked a few other things too. The only denbench result that I
wasn't happy with was RSA, where both forms of -fsched-pressure are
significantly worse than -fno-sched-pressure. Tracked down the cause
of that. We had a block BB1:
A: (set (reg:DI X) Y)
B: (clobber (reg:DI Z))
C: (set (subreg:SI (reg:DI Z) 0) (... X ...))
D: (set (subreg:SI (reg:DI Z) 4) (...))
where B makes sure that Z is treated as dead before C. Interblock
motion causes B to be scheduled in an earlier block, but none of
the other instructions can be. This means that, when we schedule BB1,
it still contains A, C and D, and Z now appears to be live on entry to
the block. C therefore appears to reduce register pressure, because
it contains the last use of X, and appears to leave Z's liveness
unaffected. In reality it should be treated as increasing register
pressure by 1 (-1 for the death of X, +2 for the birth of Z).
I "fixed" this by moving C's dependencies to B, a bit like we do
for scheduling groups (although none of the other handling of
scheduling groups should apply). This made a big difference,
so that the new code is a win on RSA.
There's still one SPEC2006 degradation on POWER7 that I want
to look at.
* Caught up on a lot of mail. gcc-patches backlog has gone down
from ~4900 when I got back to ~500.
* Briefly looked at x86's drap support, to see what would be needed
for ARM. Didn't look for long though: the overhead seems excessive
for optional alignment, and the agreement seemed to be that 128-bit
alignment wouldn't really make much of a difference anyway.
Richard
Hi!
* Continued with running eembc, coremark, denbench and spec2k on the ursas
with the latest of the Linaro and FSF series. The variants used were
o3-neon and o3-neon-novect. Something went wrong with the variants the
first time, so I had to rerun the tests once.
Discussed draft report with Michael, next week I will share with the rest
of the team.
* Did a rerun SPEC2K runs with "train" and "ref" data sets. I did -o2 and
-o3 runs on a panda with the two data sets. Asked for a sanity check of the
numbers.
* Prepared and held a presentation about the tcwg internally.
* Will be tied up with internal work for the most of w49.
Best regards
Åsa