Hi Wookey,
I've finally completed a first draft of the write-up of the toolchain
implications of multiarch paths that we discussed in Prague. Sorry it took
a while, but it got a lot longer than I expected :-/
I'd appreciate any feedback and comments!
Multiarch paths and toolchain implications
== Overview and goals ==
Binary files in packages are usually platform-specific, that is, they work
only on the architecture they were built for. Therefore, the packaging
system provides platform-specific versions of them. Currently, these
versions install their platform-specific files to the same file system
locations, which implies that only one of them can be installed on a
system at any given time.
The goal of the "multiarch" effort is to lift this limitation, and allow
multiple platform-specific versions of the same package to be installed
into the same file system at the same time. In addition, each package
should install to the same file system locations no matter on which host
architecture the installation is performed (that is, no rewriting of
path names during installation).
This approach could solve a number of existing situations that are not
handled well by today's packaging mechanisms:
- Systems able to run binaries of more than one ISA natively.
- Support for multiple incompatible ABI variants on the same ISA.
- Support for processor-optimized ABI-compatible library variants.
- NFS file system images exported to hosts of different architectures.
- Target file systems for ISA emulators etc.
- Development packages for cross-compilation.
In order to support this, platform-specific versions of a multiarch
package must have the property that for each file, it is either 100%
identical across platforms, or else it must be installed to separate
locations in the file system.
The latter is the case at least for executable files, shared libraries,
static libraries and object files, and to some extent maybe header files.
This means that in a multiarch world, such files must move to different
locations in the file system than they occupy now. This raises a variety
of issues that need to be solved; in particular, most of the existing
locations are defined by the FHS and/or are assumed to have well-known
values by various system tools.
In this document, I want to focus on the impact of file system hierarchy
changes to two tasks in particular:
- loading/running an executable
- building an executable from source
In the following two sections, I'll provide details on how file system paths
are currently handled in these two areas. In the final section, I'll
discuss suggestions on how to extend the current behavior to support
multiarch paths.
== Loading/running an executable ==
Running a new executable starts with the execve () system call. The
Linux kernel supports execution of a variety of executable types; most
commonly used are
- native ELF executable
- ELF executable for a secondary native ISA (32-bit on 64-bit)
- #! scripts
- user-defined execution handlers (via binfmt_misc)
The binary itself is passed via a full (or relative) pathname to the
execve call; the kernel does not make file system hierarchy assumptions.
By convention, callers of execve usually search well-known path locations
(via the PATH environment variable) when locating executables. How to
adapt these conventions for multiarch is beyond the scope of this document.
With #! scripts and binfmt_misc handlers, the kernel will involve a
user-space helper to start execution. The locations of these handlers
themselves, and of any secondary files they in turn may require, are
provided by user space (e.g. in the #! line, or in the parameters installed into
the binfmt_misc file system). Again, adapting these path names is
beyond the scope of this document.
For native ELF executables, there are two additional classes of files
involved in the initial load process: the ELF interpreter (dynamic
loader), and shared libraries required by the executable.
The ELF interpreter name is provided in the PT_INTERP program header
of the ELF executable to be loaded; the kernel makes no file name
assumptions here. This program header is generated by the linker
when performing final link of a dynamically linked executable;
it uses the file name passed via the -dynamic-linker argument.
(Note that while the linker will fall back to some hard-coded path
if that argument is missing, on many Linux platforms this default
is in fact incorrect and does not correspond to an ELF interpreter
actually installed in the file system in current distributions.
Passing a correct -dynamic-linker argument is therefore mandatory.)
In normal builds, the -dynamic-linker switch is passed to the linker
by the GCC compiler driver. This in turn gets the proper argument
to be used on the target platform from the specs file; the (correct)
default value is hard-coded into the GCC platform back-end sources.
On bi-arch platforms, GCC will automatically choose the correct
variant depending on compile options like -m32 or -m64. Again,
the logic to do so is hard-coded into the back-end. Unfortunately,
various bi-arch platforms use different schemes today:
amd64: /lib/ld-linux.so.2 vs. /lib64/ld-linux-x86-64.so.2
ia64: /lib/ld-linux.so.2 vs. /lib/ld-linux-ia64.so.2
mips: /lib/ld.so.1 vs. /lib64/ld.so.1
ppc: /lib/ld.so.1 vs. /lib64/ld64.so.1
s390: /lib/ld.so.1 vs. /lib/ld64.so.1
sparc: /lib/ld-linux.so.2 vs. /lib64/ld-linux.so.2
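As an aside, both the interpreter recorded in an existing binary and the
default the compiler driver passes on can be inspected directly; a quick
sketch (the paths printed of course depend on the platform):
  # Show the PT_INTERP entry recorded in an existing executable
  readelf -l /bin/true | grep 'program interpreter'
  # Show the link spec (containing the -dynamic-linker default) of the installed GCC
  gcc -dumpspecs | grep -A 1 '^\*link:'
  # Show the -dynamic-linker argument actually passed for a trivial link
  echo 'int main (void) { return 0; }' > hello.c
  gcc -v -o hello hello.c 2>&1 | grep -- -dynamic-linker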
Once the dynamic interpreter is loaded, it will go on and load
dynamic libraries required by the executable. For this discussion,
we will consider only the case where the interpreter is ld.so as
provided by glibc.
As opposed to the kernel, glibc does in fact *search* for libraries,
and makes a variety of path name assumptions while doing so. It
will consider paths encoded via -rpath, the LD_LIBRARY_PATH environment
variable, and knows of certain hard-coded system library directories.
It also provides a mechanism to automatically choose the best out of
a number of libraries available on the system, depending on which
capabilities the hardware / OS provides.
Specifically, glibc determines a list of search directory prefixes,
and a list of capability suffixes. The directory prefixes are:
- any directory named in the (deprecated) DT_RPATH dynamic tag of
the requesting object, or, recursively, any parent object
(note that DT_RPATH is ignored if DT_RUNPATH is also present)
- any directory listed in the LD_LIBRARY_PATH environment variable
- any directory named in the DT_RUNPATH dynamic tag of the requesting
object (only)
- the system directories, which are on Linux hard-coded to:
* /lib$(biarch_suffix)
* /usr/lib$(biarch_suffix)
where $(biarch_suffix) may be "64" on 64-bit bi-arch platforms.
The capability suffixes are determined from the following list
of capabilities:
- For each hardware capability that is present on the hardware
as indicated by a bit set in the AT_HWCAP auxiliary vector entry,
and is considered "important" according to glibc's hard-coded
list of important hwcaps (platform-dependent), a well-known
string provided by glibc's platform back-end.
- For each "extra" hardware capability present on the hardware
as indicated by a GNU NOTE section in the vDSO provided by
the kernel, a string provided in that same note.
- A string identifying the platform as a whole, as provided by
the kernel via the AT_PLATFORM auxiliary vector entry.
- For each software capability supported by glibc and the kernel,
a well-known string. The only such capability supported today
is "tls", indicating support for thread-local storage.
The full list of capability suffixes is created from the list of supported
capabilities by forming every sub-sequence. For example, if the platform is
"i686", supports the important hwcap "sse2" and TLS, the list of suffixes is:
sse2/i686/tls
sse2/i686
sse2
i686/tls
i686
tls
<empty>
The total list of directories to be searched is then formed by concatenating
every directory prefix with every capability suffix. Various caching
mechanisms are employed to reduce the run-time overhead of this large
search space.
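This search can also be observed directly at run time via glibc's LD_DEBUG
facility; a quick sketch (the directories and hwcaps printed depend entirely
on the system):
  # Print each directory (including capability suffixes) ld.so tries per library
  LD_DEBUG=libs /bin/true 2>&1 | head -40
  # List the ld.so cache that short-circuits most of this search in practice
  /sbin/ldconfig -p | head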
Note: This method of searching capability suffixes is employed only by glibc
at run time; it is unknown to the toolchain at compile time. This implies
that an executable will have been linked against the "base" version of a
library, and the "capability-specific" version of the library is only
substituted at run time. Therefore, all capability-specific versions must
be ABI-compatible with the base version; in particular, they must provide the
same soname and symbol versions, and they must use compatible function
calling conventions.
== Building an executable from source ==
For this discussion, we only consider GCC and the GNU toolchain, installed
into the usual locations as system toolchain, and in the absence of any
special-purpose options (-B) or environment variables (GCC_EXEC_PREFIX,
COMPILER_PATH, LIBRARY_PATH ...).
However, we do consider the three typical modes of operation:
- native compilation
- cross-compilation using a "traditional" toolchain install
- cross-compilation using a sysroot
During the build process, the toolchain performs a number of searches for
files. In particular, it looks for (executable) components of the toolchain
itself; for include files; for startup object files; and for static and
dynamic libraries.
In doing so, the GNU toolchain considers locations derived from any of the
following "roots":
- Private GCC installation directories
* /usr/lib/gcc
* /usr/libexec/gcc
These hold both GCC components and target-specific headers and libraries
provided by GCC itself.
- GNU toolchain installation directories
* /usr/$(target)
These directories hold files used across multiple components of the GNU
toolchain, including the compiler and binutils. In addition, they may also
hold target libraries and headers; in particular for libraries traditionally
considered part of the toolchain, like newlib for embedded systems. In fact,
in the "traditional" installation of a GNU cross-toolchain, *all* default
target libraries and headers are found here. However, the toolchain
directories are always consulted for native compilation as well (if present)!
- The install "prefix"
This is what is specified via the --prefix configure option. The toolchain
will look for headers and libraries under that root, allowing for building
and installing of multiple software packages that depend on each other into
a shared prefix. (This is really only relevant when the toolchain is *not*
installed as system toolchain, e.g. in a setup where you provide a GNU
toolchain + separately built GNU packages on a non-GNU system in some
distinct non-system directory.)
- System directories
* /lib$(biarch_suffix)
* /usr/lib$(biarch_suffix)
* /usr/local/lib$(biarch_suffix)
* /usr/include
* /usr/local/include
These are default locations defined by the OS and hard-coded into the
toolchain sources. The examples above are those used for Linux. On bi-arch
platforms, $(biarch_suffix) may be the suffix "64". These directories are
used only for native compilation; however in a "sysroot" cross-compiler, they
are used for cross-compilation as well, prefixed by the sysroot directory.
In addition to the base directory paths referred to above, the GNU toolchain
supports the so-called "multilib" mechanism. This is intended to provide
support for multiple incompatible ABIs on a single platform. This is
implemented by the GCC back-end having hard-coded information about which
compiler option causes an incompatible ABI change, and a hard-coded
"multilib" directory suffix name corresponding to that option. For example,
on PowerPC the -msoft-float option is associated with the multilib suffix
"nof", which means libraries using the soft-float ABI (passing floating point
values in integer registers) can be provided in directories like:
/usr/lib/gcc/powerpc-linux/4.4.4/nof
/usr/lib/nof
The multilib suffix is appended to all directories searched for libraries
by GCC and passed via -L options to the linker. The linker itself does not
have any particular knowledge of multilibs, and will continue to consult its
default search directories if a library is not found in the -L paths. If
multiple orthogonal ABI-changing options are used in a single compilation,
multiple multilib suffixes can be used in series.
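The multilib configuration of an installed compiler can be queried from the
driver itself; a quick sketch (the suffixes printed depend on how GCC was
configured for the platform):
  # List all multilib combinations this compiler was built with
  gcc -print-multi-lib
  # Show the multilib suffix selected by the current options (empty for the default)
  gcc -print-multi-directory
  gcc -msoft-float -print-multi-directory   # e.g. "nof" on a PowerPC toolchain built with that multilib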
As a special consideration, some compiler options may correspond to multiple
incompatible ABIs that are already supported by the OS, but using directory
names different from what GCC would use internally. As the typical example,
on bi-arch systems the OS will normally provide the default 64-bit libraries
in /usr/lib64, while also providing 32-bit libraries in /usr/lib. For GCC on
the other side, 64-bit is the default (meaning no multilib suffix), while the
-m32 option is associated with the multilib suffix "32".
To solve this problem, the GCC back-end may provide a secondary OS multilib
suffix which is used in place of the primary multilib suffix for all library
directories derived from *system* paths as opposed to GCC paths. For example,
in the typical bi-arch setup, the -m32 option is associated with the OS
multilib suffix "../lib". Given that the primary system library directory
is /usr/lib64 on such systems, this has the effect of causing the toolchain
to search
/usr/lib64/gcc/powerpc64-linux/4.4.4
/usr/lib64
for default compilations, and
/usr/lib64/gcc/powerpc64-linux/4.4.4/32
/usr/lib64/../lib (i.e. /usr/lib)
for -m32 compilations.
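The OS multilib suffix in effect can likewise be queried from the driver; a
quick sketch (the values in the comments are typical for a bi-arch x86_64
setup and differ between distributions):
  # OS multilib suffix used for system library directories
  gcc -print-multi-os-directory        # e.g. "../lib64" for the default
  gcc -m32 -print-multi-os-directory   # e.g. "../lib" for the 32-bit multilib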
The following rules specify in detail which directories are searched
at which phase of compilation. The following parameters are used:
$(target)
GNU target triple (as specified at configure time)
$(version)
GCC version number
$(prefix)
Determined at configure time, usually /usr or /usr/local
$(libdir)
Determined at configure time, usually $(prefix)/lib
$(libexecdir)
Determined at configure time, usually $(prefix)/libexec
$(tooldir)
GNU toolchain directory, usually $(prefix)/$(target)
$(gcc_gxx_include_dir)
Location of C++ header files. Determined at configure time:
* If --with-gxx-include-dir is given, the specified directory
* Otherwise, if --enable-version-specific-runtime-libs is given:
$(libdir)/gcc/$(target)/$(version)/include/c++
* Otherwise for all cross-compilers (including with sysroot!):
$(tooldir)/include/c++/$(version)
* Otherwise for native compilers:
$(prefix)/include/c++/$(version)
$(multi)
Multilib suffix appended for GCC include and library directories
$(multi_os)
OS multilib suffix appended for system library directories
$(sysroot)
Sysroot directory (empty for native compilers)
Directories searched by the compiler driver for executables (cc1, as, ...):
1. GCC directories:
$(libexecdir)/gcc/$(target)/$(version)
$(libexecdir)/gcc/$(target)
$(libdir)/gcc/$(target)/$(version)
$(libdir)/gcc/$(target)
2. Toolchain directories:
$(tooldir)/bin/$(target)/$(version)
$(tooldir)/bin
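Which executable the driver actually ends up selecting can be checked with
-print-prog-name; a quick sketch (if a program is not found in these
directories, the driver just prints the bare name):
  # Resolve toolchain components through the driver's executable search path
  gcc -print-prog-name=cc1
  gcc -print-prog-name=as
  gcc -print-prog-name=ld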
Directories searched by the compiler for include files:
1. G++ directories (when compiling C++ code only):
$(gcc_gxx_include_dir)
$(gcc_gxx_include_dir)/$(target)[/$(multi)]
$(gcc_gxx_include_dir)/backward
2. Prefix directories (if distinct from system directories):
[native only] $(prefix)/include
3. GCC directories:
$(libdir)/gcc/$(target)/$(version)/include
$(libdir)/gcc/$(target)/$(version)/include-fixed
4. Toolchain directories:
[cross only] $(tooldir)/sys-include
$(tooldir)/include
5. System directories:
[native/sysroot] $(sysroot)/usr/local/include
[native/sysroot] $(sysroot)/usr/include
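The effective include search order for a given set of options can be dumped
from the preprocessor; a quick sketch (the C++ variant assumes a C++ compiler
with cc1plus is installed):
  # Print the include directories actually searched (in order) for a C compilation
  gcc -E -Wp,-v -xc /dev/null 2>&1 | sed -n '/#include <...>/,/End of search list/p'
  # The same for C++, where the G++ directories appear first
  gcc -E -Wp,-v -xc++ /dev/null 2>&1 | sed -n '/#include <...>/,/End of search list/p'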
Directories searched by the compiler driver for startup files (crt*.o):
1. GCC directories:
$(libdir)/gcc/$(target)/$(version)[/$(multi)]
2. Toolchain directories:
$(tooldir)/lib/$(target)/$(version)[/$(multi_os)]
$(tooldir)/lib[/$(multi_os)]
3. Prefix directories:
[native only] $(libdir)/$(target)/$(version)[/$(multi_os)]
[native only] $(libdir)[/$(multi_os)]
4. System directories:
[native/sysroot] $(sysroot)/lib/$(target)/$(version)[/$(multi_os)]
[native/sysroot] $(sysroot)/lib[/$(multi_os)]
[native/sysroot] $(sysroot)/usr/lib/$(target)/$(version)[/$(multi_os)]
[native/sysroot] $(sysroot)/usr/lib[/$(multi_os)]
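Where the driver actually finds a given startup file or library can be checked
with -print-file-name, which walks exactly these directories; a quick sketch:
  # Resolve a startup object and a library through the driver's search directories
  gcc -print-file-name=crt1.o
  gcc -print-file-name=libgcc.a
  gcc -m32 -print-file-name=crt1.o    # same lookup for the 32-bit multilib, where supported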
Directories searched by the linker for libraries:
In addition to the directories built into the linker (listed below), if the
linker is invoked via the compiler driver, it will also search the same list
of directories specified above for startup files, because those are
implicitly passed in via -L options by the driver.
Also, when searching for dependencies of shared libraries, the linker
will attempt to mimic the search order used by the dynamic linker,
including DT_RPATH/DT_RUNPATH and LD_LIBRARY_PATH lookups.
1. Prefix directories (if distinct from system directories):
[native only] $(libdir)$(biarch_suffix)
2. Toolchain directories:
[native/cross] $(tooldir)/lib$(biarch_suffix)
3. System directories:
[native/sysroot] $(sysroot)/usr/local/lib$(biarch_suffix)
[native/sysroot] $(sysroot)/usr/lib$(biarch_suffix)
[native/sysroot] $(sysroot)/lib$(biarch_suffix)
4. Non-biarch directories (if distinct from the above)
[native only] $(libdir)
[native/cross] $(tooldir)/lib
[native/sysroot] $(sysroot)/usr/local/lib
[native/sysroot] $(sysroot)/usr/lib
[native/sysroot] $(sysroot)/lib
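The combined picture can be dumped with -print-search-dirs on the driver side
and --verbose for the linker's own built-in directories; a quick sketch:
  # Program, library and prefix search paths as seen by the compiler driver
  gcc -print-search-dirs
  # Built-in SEARCH_DIR entries from the linker's default script
  ld --verbose | grep SEARCH_DIR
  # The -L options the driver implicitly passes to the linker for a trivial link
  echo 'int main (void) { return 0; }' > hello.c
  gcc -v -o hello hello.c 2>&1 | tr ' ' '\n' | grep '^-L'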
== Multiarch impact on the toolchain ==
The current multiarch proposal is to move the system library directories
to a new path including the GNU target triplet, that is, instead of using
/lib
/usr/lib
the system library directories are now called
/lib/$(multiarch)
/usr/lib/$(multiarch)
At this point, there is no provision for multiarch executable or header file
installation.
What are the effects of this renaming on the toolchain, following the
discussion above?
* ELF interpreter
The ELF interpreter would now reside in a new location, e.g.
/lib/$(multiarch)/ld-linux.so.2
This allows interpreters for different architectures to be installed
simultaneously, and removes the need for the various bi-arch hacks.
This change would imply modification of the GCC back-end, and possibly
the binutils ld default as well (even though that's currently not
normally used), to build new executables using the new ELF
interpreter install path.
Caveats:
Any executable built with the new ELF interpreter will absolutely
not run on a system that does not provide the multiarch install
location of the interpreter. (This is probably OK.)
Executables built with the old ELF interpreter will not run on a
system that *only* provides the multiarch install location. This
is clearly *not* OK. To provide backwards compatibility, even a
multiarch-capable system will need to install ELF interpreters
at the old locations as well, possibly via symlinks. (Note that
any given system can only be compatible in this way with *one*
architecture, except for lucky circumstances.)
As the multiarch string $(multiarch) is now embedded into each and
every executable file, it becomes an invariant part of the platform
ABI, and needs to be strictly standardized. GNU target triplets
as used today in general seem to provide too much flexibility and
underspecified components to serve in such a role, at least without
some additional requirements.
* Shared library search paths
According to the primary multiarch assumption, the system library
search paths are modified to include the multiarch target string:
* /lib/$(multiarch)
* /usr/lib/$(multiarch)
This requires modifications to glibc's ld.so loader (can possibly
be provided via platform back-end changes).
Backwards compatibility most likely requires that both the new
multiarch location and the old location are searched.
Open questions:
+ How are -rpath encoded paths to be handled?
Option A: They are used by ld.so as-is. This implies that in a
fully multiarch system, every *user* of -rpath needs to update
their paths to include the multiarch target. Also, multiarch
targets are once again directly embedded into executables.
Option B: ld.so automatically appends the multiarch target string
to the path as encoded in DT_RPATH/DT_RUNPATH. This may break
backwards compatibility.
Option C: ld.so searches both the -rpath as is, and also with
multiarch target string appended.
+ How is LD_LIBRARY_PATH to be handled?
The options here are basically analogous to the -rpath case.
+ What is the interaction between the multiarch string and capability
suffixes (hwcaps etc.) supported by ld.so?
The most straightforward option seems to be to just leave the capability
mechanism unchanged, that is, ld.so would continue to append the
capability suffixes to all directory search paths (which potentially
already include a multiarch suffix). This implies that different
ABI-compatible but capability-optimized versions of a library share
the same multiarch prefix, but use different capability suffixes.
* GCC and toolchain directory paths
The core multiarch spec only refers to system directories. What
about directories provided by GCC and the toolchain? Note that for
a cross-development setup, we may have various combinations of host
and target architectures. In this case $(host-multiarch) refers to the
multiarch identifier for the host architecture, while $(target-multiarch)
refers to the one for the target architecture. In addition $(target)
refers to the GNU target triplet as currently used by GCC paths
(which may or may not be equal to $(target-multiarch), depending on
how the latter will end up being standardized).
+ GCC private directories
The GCC default installation already distinguishes files that are
independent of the host architecture (in /usr/lib/gcc) from those that
are dependent on the host architecture (in /usr/libexec/gcc). In both
cases, the target architecture is already explicitly encoded in the
path names. Thus it would appear that we'd simply have to move the
libexec paths to include the multiarch string in a straightforward
manner:
/usr/lib/gcc/$(target)/$(version)/...
/usr/libexec/$(host-multiarch)/gcc/$(target)/$(version)/...
This assumes that two cross-compilers to the same target running on
different hosts can share /usr/lib/gcc, which may not be fully possible
(because two cross-compilers may build slightly different versions of
target libraries due to optimization differences). In this case, the
whole of /usr/lib/gcc could be moved to multiarch as well:
/usr/lib/$(host-multiarch)/gcc/$(target)/$(version)/...
/usr/libexec/$(host-multiarch)/gcc/$(target)/$(version)/...
The alternative would be to package them separately, with the host-independent
files always coming from the same package (presumably itself produced
on a native system, or else one "master" host system).
Note that if, in the first stage of multiarch, we do not support
parallel installation of *binaries*, we may not need to do anything
for the GCC directories.
+ Toolchain directories
The /usr/$(target) directory used by various toolchain components is
somewhat of a mixture of target- and host-dependent files:
/usr/$(target)/bin
executable files in host architecture, targeting target architecture
(e.g. cross-assembler, cross-linker binaries)
/usr/$(target)/sys-include
target headers (only for cross-builds)
/usr/$(target)/include
native toolchain header files for target (like bfd.h)
/usr/$(target)/lib
target libraries + native toolchain libraries for target (like libbfd.a)
/usr/$(target)/lib/ldscripts
linker scripts used by the cross-linker
In a full multiarch setup, the only directory that would require the
multiarch suffix is probably bin:
/usr/$(target)/bin/$(host-multiarch)
As discussed above, if we want to support different versions of target
libraries for the same target, as compiled with differently hosted
cross-compilers, we might also have to multiarch the lib directory:
/usr/$(target)/lib/$(host-multiarch)
Yet another option might be to require multiarch systems to always
use the sysroot cross-compile option, and not support the toolchain
directory for target libraries in the first place. [Or at least,
have no distribution package ever install any target library into
the toolchain lib directory ...]
+ System directories
For these, the primary multiarch rules would apply. In a sysroot
configuration, the sysroot prefix is applied as usual (after the
multiarch paths have been determined as for a native system). Note
that we need to apply the *target* multiarch string here; in case
this is different from the target triplet, the toolchain would have
to have explicit knowledge of those names:
/lib/$(target-multiarch)
/usr/lib/$(target-multiarch)
+ Prefix directories
For completeness' sake, we need to define how prefix directories are
handled in a GCC build that is both multiarch enabled *and* not installed
into the system prefix. The most straightforward solution would be to
apply multiarch suffixes to the prefix directories as well:
$(prefix)/lib/$(target-multiarch)
$(prefix)/include/$(target-multiarch) [ if include is multiarch'ed ... ]
* Multiarch and cross-compilation
Using the paths as discussed in the previous section, we have some
options for how to perform cross-compilation on a multiarch system.
The most obvious option is to build a cross-compiler with sysroot
equal to "/". This means that the compiler will use target libraries
and header files as installed by unmodified distribution multiarch
packages for the target architecture. This should ideally be the
default cross-compilation setup on a multi-arch system.
In addition, it is still possible to build cross-compilers with a
different sysroot, which may be useful if you want to install
target libraries you build yourself into a non-system directory
(and do not want to require root access for doing so).
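For illustration, a minimal sketch of configuring such a compiler with the
sysroot set to "/" (the target triplet, install prefix and language list here
are placeholder assumptions; a real distribution package build would add many
more options):
  # Hypothetical configure invocation for a cross-compiler using the running system as sysroot
  ../gcc-4.5/configure \
    --target=arm-linux-gnueabi \
    --prefix=/usr \
    --with-sysroot=/ \
    --enable-languages=c,c++
  make && make install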
Questions:
+ Should we still support "traditional" cross-compilation using
the toolchain directory to hold target libraries/headers?
This is probably no longer really useful. On the other hand, it
probably doesn't really hurt either ...
+ What about header files in a multiarch system?
The current multiarch spec does not provide for multiple locations
for header files. This works only if headers are identical across
all targets. This is usually true, and where it isn't, it can be
enforced by use of conditional compilation. In fact, the latter
is how things currently work on traditional bi-arch distributions.
In the alternative, the toolchain could also provide for multiarch'ed
header directories along the lines of
/usr/include/local/$(target-multiarch)
/usr/include/$(target-multiarch)
which are included in addition to (and before) the usual directories.
It will most likely not be necessary to do this for the GCC and
toolchain directory include paths, as they are already target-specific.
* Multiarch and multilib
The multilib mechanism provides a way to support multiple incompatible
ABI versions on the same ISA. In a multiarch world, this is supposed
to be handled by different multiarch prefixes, to enable use of the
package management system to handle libraries for all those variants.
How can we reconcile the two systems?
It would appear that the way forward is based on the existing "OS
multilib suffix" mechanism. GCC already expects to need to handle
naming conventions provided by the OS for where incompatible versions
are to be found.
In a multiarch system, the straightforward solution would appear to
be to use the multiarch names as-is as OS multilib suffixes. In fact,
this could even handle the *default* multiarch name without requiring
any further changes to GCC.
For example, in a bi-arch amd64 setup, the GCC back-end might register
"x86_64-linux" as default OS multilib suffix, and "i386-linux" as OS
multilib suffix triggered by the (ABI changing) -m32 option. Without
any further change, GCC would now search
/usr/lib/x86_64-linux
or
/usr/lib/i386-linux
as appropriate, depending on the command line options used. (This
assumes that the main libdir is /usr/lib, i.e. no bi-arch configure
options were used.)
Note that GCC would still use its own internal multilib suffixes
for its private libraries, but that seems to be OK.
Caveat: This would imply that those multilib names, and their association
with compiler options, become hard-coded in the GCC back-end.
However, this seems to have already been necessary (see above).
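If implemented this way, the effect should be visible through the existing
multilib query options; hypothetical output, assuming the back-end change
sketched above:
  # Hypothetical results on a bi-arch amd64 system with multiarch OS multilib suffixes
  gcc -print-multi-os-directory        # would print: x86_64-linux
  gcc -m32 -print-multi-os-directory   # would print: i386-linux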
== Summary ==
From the preceding discussion, it would appear that a multiarch setup allowing
parallel installation of run-time libraries and development packages, and
thus providing support for running binaries of different native ISAs and
ABI variants as well as cross-compilation, might be feasible by implementing
the following set of changes. Note that parallel installation of main
*executable* files by multiarch packages is not (yet) supported.
* Define well-known multiarch suffix names for each supported ISA/ABI
combination. Well-known in particular means it is allowed for them
to be hard-coded in toolchain source files, as well as in executable
files and libraries built by such a toolchain.
* Change GCC back-ends to use the well-known multiarch suffix as OS
multilib suffix, depending on target and ABI-changing options. Also
include the multiarch suffix in the ELF interpreter name passed to ld.
(Probably need to change ld default directory search paths as well.)
* Change the dynamic loader ld.so to optionally append the multiarch suffix
(as a constant string pre-determined at ld.so build time) after each
directory search path, before further appending capability suffixes.
(See above as to open questions about -rpath and LD_LIBRARY_PATH.)
* Change package build/install rules to install libraries and ld.so
into multiarch library directories (not covered in this document).
Change system installation to provide for backward-compatibility
fallbacks (e.g. symbolic links to the ELF interpreter).
* If capability-optimized ISA/ABI-compatible library variants are desired,
they can be built just as today, only under the (same) multiarch
suffix. They could be packaged either within a single package,
or else using multiple packages (of the same multiarch type).
* Enforce platform-independent header files across all packages. (In
the alternative, provide for multiarch include paths in GCC.)
* Build cross-compiler packages with --with-sysroot=/
I'd appreciate any feedback or comments! Have I missed something?
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
As you probably know, I am working on bootstrapping a cross compiler. The
process is described on the https://wiki.linaro.org/CrossCompilers page.
Current status: nearly done.
Done:
- Linux/stage1 - patch (LP:603087) awaits approval from the kernel team.
- GCC 4.5/stage[12] - patch (LP:603497) created
In progress:
- eglibc/stage1 builds but needs packaging work
Problems:
- gcc/stage3 (normal full build) fails on shlibs because I do not have
libc6-armel packages installed in the system, but dpkg-shlibdeps needs them
How to bootstrap a cross compiler?
- apt-get source gcc-4.5 eglibc binutils linux-image-2.6.35-8-generic
- copy gcc-4.5 dir to gcc-stage1 gcc-stage2 gcc-stage3
- copy eglibc dir to eglibc-stage1 eglibc-stage2
- fetch and apply my patches from LP:603087 LP:603497 LP:603498
- create directory "sysroot/" and point the WITH_SYSROOT and WITH_BUILD_SYSROOT
environment variables to it
- cd binutils* && TARGET=armel dpkg-buildpackage -b -uc -us
- cd sysroot && dpkg-deb -x ../binutils-arm*deb .
- export PATH=$PWD/sysroot/usr/bin:$PATH
- export LD_LIBRARY_PATH=$PWD/sysroot/usr/x86_64-linux*/usr/arm*/lib
- cd linux* && DEB_STAGE=stage1 dpkg-buildpackage -b -uc -us -aarmel
- cd sysroot && dpkg-deb -x ../linux-libc-dev*deb .
- cd gcc-stage1 && DEB_STAGE=stage1 dpkg-buildpackage -b -uc -us
- cd sysroot && dpkg-deb -x ../cpp*deb ../gcc*deb .
- cd eglibc-stage1 && DEB_STAGE=stage1 dpkg-buildpackage -b -uc -us -aarmel
As packaging is not yet done for that step, manual copying of the
debian/tmp-libc/ contents to the sysroot dir is needed.
- cd gcc-stage2 && DEB_STAGE=stage2 dpkg-buildpackage -b -uc -us
- cd sysroot && dpkg-deb -x ../cpp*deb ../gcc*deb .
- cd eglibc-stage2 && dpkg-buildpackage -b -uc -us -aarmel
- cd sysroot && dpkg-deb -x ../libc*deb .
- cd gcc-stage3 && dpkg-buildpackage -b -uc -us
Here dpkg-shlibdeps will complain, so for that step manual copying of the
debian/tmp/ contents to the sysroot dir is needed.
The sysroot dir will then contain a working cross compiler - so far I have
used it to compile hello.c and a kernel for the BeagleBoard.
I have not yet tested building the stage3 compiler with sysroot=/, as it
should be for installing it on other systems. But that's for the future.
Regards,
--
JID: hrw(a)jabber.org
Website: http://marcin.juszkiewicz.com.pl/
LinkedIn: http://www.linkedin.com/in/marcinjuszkiewicz
As already discussed with Loic, CodeSourcery have a GCC patch that
implements a new optimization: -fremove-local-statics.
Essentially, it transforms code like this:
int foo (void) { static int a = 1; return a; }
into this:
int foo (void) { int a = 1; return a; }
Admittedly, if the code is written like that you might argue that the
author gets what they deserve (the transformation is only safe when the
static's value is not actually carried over between calls, as in the example
above), but apparently at least one of the EEMBC benchmarks does have these,
so now we all care about it.
This patch was originally submitted by RedHat to gcc-patches here:
http://gcc.gnu.org/ml/gcc-patches/2008-07/subjects.html#00982
Some discussion later, they decided it would be better to implement the
optimization using inter-procedural dead store analysis:
http://gcc.gnu.org/ml/gcc-patches/2008-07/msg01602.html
This doesn't seem to have actually been done. Not yet, anyway.
So basically we're left with this patch that does something we want, but
not in a way that can go upstream. :(
The question is, should I merge this to Linaro, or not? Loic and I
agreed to hold off until I'd done a bit more research and/or tried to
upstream it again, but now I think we need to think again.
Andrew
Hi all,
These are approximate instructions for installing Lucid on an IGEPv2. It
uses the kernel recommended on the IGEP site because this supports the
SD card. I'm sure an Ubuntu kernel will be fine later. At the end, you
will have an SD card that will boot the IGEPv2 board with no external
intervention or devices.
This recipe is derived from here:
http://labs.igep.es/index.php/How_to_get_the_Ubuntu_distribution
Andrew
----------------------
sudo apt-get install rootstock uboot-mkimage qemu
[The next step gives you the kernel, initrd, and rootfs all in one.
Ideally I would have it install lxde now, to match the "demo" OS that
came with the board, but I found that it hung. Create a minimal install
for now, and install lxde later, once the board is running.]
sudo rootstock --fqdn ubuntu --login jdoe --password letmein \
  --imagesize 2G \
  --seed wget,nano,linux-firmware,wireless-tools,usbutils,openssh-server,openssh-client \
  --dist lucid \
  --serial ttyS2 --components "main universe multiverse" \
  --kernel-image http://www.rcn-ee.net/deb/lucid/v2.6.33.5-l3/linux-image-2.6.33.5-l3_1.0luc…
mkimage -A arm -O linux -T kernel -C none -a 0x80008000 -e 0x80008000 \
  -n "Linux" -d vmlinuz-2.6.33.5-l3 uImage
mkimage -A arm -O linux -T ramdisk -C none -a 0 -e 0 \
  -n initramfs -d initrd.img-2.6.33.5-l3 uInitrd
cat > boot.source << EOF
fatload mmc 0:1 0x80000000 uImage
fatload mmc 0:1 0x82000000 uInitrd
setenv bootargs vram=12M omapfb.mode=dvi:1280x720MR-16@60 root=/dev/mmcblk0p2 console=ttyS2,115200n8 fixrtc
bootm 0x80000000 0x82000000
EOF
mkimage -A arm -O linux -T script -C none -a 0 -e 0 -n "Boot Script" \
  -d boot.source boot.ini
[Format the SD card with two partitions, mmcblk0p1 small fat-16 (label
"boot"), and mmcblk0p2 large ext3 (label "rootfs").]
cp uImage uInitrd boot.ini /media/boot
sudo tar xzpf armel-rootfs-<date>.tgz -C /media/rootfs/
ln -s ../init.d/ssh /media/rootfs/etc/rc2.d/S01ssh
[Set up /media/rootfs/etc/network/interfaces - you'll need an "auto
eth0" line and something to go with it]
[Boot the target, log in (jdoe/letmein)]
sudo apt-get install lxde gdm
[Actually, I only installed the lxde desktop so I could run it remotely
using Xnest. If you want a graphical login on the video output, only
then do you need gdm also. Not installing gdm means that Xorg doesn't
start at boot time and eat memory and cycles.]
Minutes from the toolchain working group stand up call are at:
https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings/2010-07-28
-- Michael
== Attendees ==
||<rowbgcolor="#333333" rowstyle="color: white; font-weight: bold;" style="text-align: center;">Name ||<style="text-align: center;">Email ||<style="text-align: center;">IRC Nick ||
|| Andrew Stubbs || andrew.stubbs(a)linaro.org || ams ||
|| Chung-Lin Tang || cltang(a)codesourcery.com || cltang ||
|| Julian Brown || julian(a)codesourcery.com || jbrown ||
|| Loïc Minier || loic.minier(a)linaro.org || lool ||
|| Michael Hope || michael.hope(a)linaro.org || michaelh ||
|| Richard Earnshaw || richard.earnshaw(a)arm.com || rearnshaw ||
|| Scott Bambrough || scott.bambrough(a)linaro.org || scottb ||
|| Ulrich Weigand || ulrich.weigand(a)linaro.org || uweigand ||
|| Yao Qi || yao.qi(a)linaro.org || yao ||
== Agenda ==
* Stand-up call - progress, what's next, and problems
== Action Items from this Meeting ==
* ACTION: Richard: will send Michael an email on the sync primitives
* ACTION: Richard: will see where the str* assembler routines landed
== Action Items from Previous Meeting ==
== Minutes ==
* Andrew:
* Has received an IGEPv2 board and is trying to get it working
* Having trouble with the maverick chroot
* Modifying the CSL build system to use bzr
* Modifying the CSL build system to try the Ubuntu glibc
* 4.5 merges will start going in soon
* Chung-Lin:
* Looking at hard float
* Working on libffi. Has looked at the internals and has started the port
* Julian:
* Working on porting the 4.4 changes into 4.5
* Also has an IGEPv2 board and is working with Andrew to get it going
* Ulrich:
* Proposed the powerpc revert to doko, who is happy with that approach
* Working with upstream on [[LP:500524]]. Patch should be present in 4.4.5
* To work on the Firefox issue [[LP:604874]]
* Michael:
* Working on a continuous build
* Starting a write-up on patch tracking
* Yao
* Has looked into and reproduced [[LP:604874]]
* Will look into the assembly to see what is going on
* Richard:
* Working on the sync primitives in gcc and their use in eglibc
* ACTION: Will send Michael an email on their use
* Michael: any hand coded mem* or str* that he knows of?
* ACTION: Richard: probably present in CSL eglibc, will investigate
* Problems with the license of these routines - want BSD as they're universal,
but glibc may require LGPL
Next call is on Monday
Hi
As some of you know I am working on cross compiler packages for Ubuntu. Those
of you who know what Emdebian is probably use their repositories for such
stuff. That's OK - I just want to share with you what my job will bring in the
near future and what I have done in the last 3 months.
Since 26th April I have been working for Canonical as part of the Linaro
project. Due to my six years of OpenEmbedded experience I became part of the
Toolchain Working Group and started work on packaging. The specification etc.
is listed on the blueprint page:
https://blueprints.launchpad.net/ubuntu/+spec/arm-m-cross-compilers
I started with reviewing the gcc-4.4/4.5 and binutils packaging rules and
merged them as much as possible to get rid of the *-cross.mk files, which had
bitrotted a bit. As a result we got packages with debug versions of libraries,
proper dependencies, and as a bonus we got libmudflap cross-compiled in case
someone needs it.
Currently I am working on bootstrapping a cross compiler without using
dpkg-cross converted packages (aka the Emdebian way). I got it working with the
Ubuntu Maverick versions and published all required patches in bugs linked to
my blueprint. Maybe it is not easy to recreate, but it should work when you try.
To make this possible I also have to alter the contents of the *-source binary
packages from binutils/eglibc/gcc/linux, so that their packaging rules can be
reused in the new $ARCH-cross-compiler package on which I will work in the
next weeks.
And here I have a problem: how much of the debian/ directory should be provided
in the *-source binary packages? A minimal set, just enough to be able to call
"dpkg-buildpackage -b" and get the wanted output, or rather everything, just in
case?
Why a new $ARCH-cross-compiler package instead of the Emdebian way? Think about
buildds and how they work - nothing can be done manually there, so we need to
automate the whole procedure.
Regards,
--
JID: hrw(a)jabber.org
Website: http://marcin.juszkiewicz.com.pl/
LinkedIn: http://www.linkedin.com/in/marcinjuszkiewicz
The Toolchain Working Group meeting notes from 2010-07-26 are now available at:
https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings/2010-07-26
A copy follows.
-- Michael
= Monday 26th July 2010 =
== This month's meetings ==
<<MonthCalendar(WorkingGroups/ToolChain/Meetings,2010,07,,,,WorkingGroups/ToolChain/MeetingTemplate)>>
== Attendees ==
||<rowbgcolor="#333333" rowstyle="color: white; font-weight: bold;" style="text-align: center;">Name ||<style="text-align: center;">Email ||<style="text-align: center;">IRC Nick ||
|| Andrew Stubbs || andrew.stubbs(a)linaro.org || ams ||
|| Chung-Lin Tang || cltang(a)codesourcery.com || cltang ||
|| Julian Brown || julian(a)codesourcery.com || jbrown ||
|| Loïc Minier || loic.minier(a)linaro.org || lool ||
|| Marcin Juszkiewicz || marcin.juszkiewicz(a)linaro.org || hrw ||
|| Michael Hope || michael.hope(a)linaro.org || michaelh ||
|| Scott Bambrough || scott.bambrough(a)linaro.org || scottb ||
|| Ulrich Weigand || ulrich.weigand(a)linaro.org || uweigand ||
|| Yao Qi || yao.qi(a)linaro.org || yao ||
== Agenda ==
* Review action items from last meeting
* Feedback on the sprint
* Ideas at [[Internal/MichaelHope/ToolChainMockup]]
* What's next
* Our focus
* Release dates
* Hardware availability and needs
* Benchmark ideas
* Open source ones
* 'Typical' open source projects such as ffmpeg, libtheora
* Closed tests
* Next release
* Blueprint status
* Moving to public phone call
* New starter, Chung-Lin Tang
.
||<rowbgcolor="#333333" rowstyle="color: white; font-weight: bold;" style="text-align: center;">Blueprint ||<style="text-align: center;">Assignee ||
|| [[https://blueprints.launchpad.net/gcc-linaro/+spec/initial-4.4|Initial delivery of Linaro GCC 4.4]] || ams ||
|| [[https://blueprints.launchpad.net/ubuntu/+spec/arm-m-cross-compilers|Cross Compiler Packages]] || hrw ||
== Action Items from this Meeting ==
* ACTION: Michael to organise and spread next Monday's call-in number
* ACTION: Michael to re-check release dates and suggest something
* ACTION: Michael to rename the intermediate milestone
* ACTION: Loic to find the Chrome OS contact name for the records
* ACTION: Ulrich to confirm PowerPC approach with Matthias
== Action Items from Previous Meeting ==
* DONE: Go to the sprint
* DONE: Loic: announcement for linaro-announce, Andrew and MLH to
review before sending
* DONE: Andrew will create release via the Launchpad milestone page
* DONE: Ulrich to summarise the GDB Thumb 2 issues and post so that
Andrew can see if CSL have already done this.
* CSL agrees that it is a problem and isn't working on it
== Minutes ==
* Reviewed the agenda
* Michael Hope is now the toolchain lead
* Calls will be shifted to a public number starting next Monday
* ACTION: Michael to organise and spread the call-in number.
* Went over release dates
* Loic suggested the second Tuesday of every month instead as it
gives the rest of working week to handle problems
* Michael would like to keep close to the FeatureFreeze/FinalRelease
dates so that announcements coincide
* ACTION: Michael to re-check dates and suggest something
* The next release, due in a week, will be skipped due to no
significant changes since last release
* ACTION: Michael to rename milestone
* Hardware availability
* CSL
* i.mx51 in their data centre
* Julian, Andrew: have IGEPv2
* Yao, Chung-Lin: will get hardware
* Loic pointed everyone to the internal hardware availability page
* Ulrich: is organising hardware
* Loic wants to investigate using qemu as a buildd host
* Michael will write-up and share his hardware. Everything should
move into the data centre later
* Benchmarks
* Loic noted that we want everything in the open and reproducible,
so prefer freely usable benchmarks
* Don't want trivial benchmarks
* Most commercial benchmarks include the source but have
restrictions on sharing the results
* Michael asked for suggestions on benchmarks
* Michael to look at freely usable benchmarks such as lmbench
* Current tickets
* Yao and Michael will look at Firefox
* Ulrich is investigating the next in the list
* Michael would like to send a weekly 'State of the toolchain' update
* Consumed by linaro-dev, Ubuntu toolchain maintainers, and perhaps
others like Ubuntu kernel maintainers
* Loic suggested via Launchpad news
* Discussed communication with Ubuntu
* All notification to go via Ubuntu toolchain maintainers
* U-T-M has responsibility to pass this downstream
* Loic talked with a developer on Chrome OS
* Interested in trying the toolchain
* Currently plain GCC 4.4.1
* Require something very stable
* ACTION: Loic to find a name for the records
* General want for us to try building Chrome OS using the Linaro toolchain
* PowerPC
* Ulrich talked with Matthias at the end of the sprint
* From an Ubuntu perspective, don't want 4.5 to regress in features vs 4.4
* Revert the 4.4 PowerPC changes, keep SH and MIPS. Ubuntu doesn't
support SH or MIPS so no change
* ACTION: Ulrich to confirm with Matthias
Next call is on Wednesday
The armel build that doko kicked off some time ago has completed. I've
gone through and categorised the different failures and created
tickets as appropriate. Latest results are here:
https://wiki.linaro.org/MichaelHope/Sandbox/BuildFailures
-- Michael