Hello Matt,
There were quite a few responses already, so I'll try to focus on the questions to which I think I may contribute something useful.
On Tue, 16 Apr 2013 10:49:23 +0100 Matthew Gretton-Dann matthew.gretton-dann@linaro.org wrote:
Paul,
I've been having some thoughts about CBuild and Lava and the TCWG integration of them both. I wish to share them and open them up for general discussion.
The background to this has been the flakiness of the Pandas (due to heat) and the Arndales (due to board set-up issues), and getting a batch of Calxeda nodes working.
The following discussion refers to building and testing only, *not* benchmarking.
If you look at http://cbuild.validation.linaro.org/helpers/scheduler you will see a bunch of calxeda01_* nodes have been added to CBuild. After a week of sorting them out they provide builds twice as fast as the Panda boards. However, during the setup of the boards I came to the conclusion that we set build slaves up incorrectly, and that there is a better way.
The issues I encountered were:
- The Calxedas run Quantal, yet we want to build on Precise.
- It's hard to get a machine running hard-float to bootstrap a soft-float compiler, and vice versa.
- My understanding of how the Lava integration works is that it runs the cbuild install scripts each time, so we can't necessarily reproduce a build if the upstream packages have changed.
Having thought about this a bit I came to the conclusion that the simple solution is to use chroots (managed by schroot), and to change the architecture a bit. In the old architecture everything is put into the main file-system as one layer. The new architecture would split this into two:
1. Rootfs - contains just enough to boot the system, and knows how to download an appropriate chroot and start it.
2. Chroots - these contain a set-up build system that can be used for particular builds.
The rootfs can be machine-type specific (as necessary), and for builds can be a stock Linaro root filesystem. It will contain scripts to set up the needed users, and then to download an appropriate chroot and run it.
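As a rough sketch of the rootfs-side helper described above (URL layout, paths, and the client command are hypothetical, not the actual cbuild ones):

```shell
set -e

# Fetch a prebuilt build chroot tarball and unpack it where schroot
# will look for it. The URL and destination layout are assumptions.
fetch_chroot() {
    url="$1"      # e.g. http://builds.example.org/chroots/precise-armhf.tar.gz
    dest="$2"     # e.g. /srv/chroot/precise-armhf
    mkdir -p "$dest"
    # -L so a redirecting mirror still works; tar reads the stream from stdin
    curl -sL "$url" | tar -xzf - -C "$dest"
}

# Typical use at boot (names hypothetical):
#   fetch_chroot http://builds.example.org/chroots/precise-armhf.tar.gz /srv/chroot/precise-armhf
#   schroot -c precise-armhf -- cbuild-client
```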
The chroot will be set up for a particular type of build (soft-float vs hard-float) and will be the same for all platforms. The advantage of this is that I can then download a chroot to my ChromeBook and reproduce a build locally in the same environment to diagnose issues.
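One chroot per build flavour maps naturally onto schroot configuration. A minimal sketch of what the /etc/schroot/schroot.conf entries might look like (names, paths, and the build user are hypothetical):

```ini
[precise-armel]
description=Precise soft-float build chroot
type=directory
directory=/srv/chroot/precise-armel
users=buildslave
root-users=buildslave

[precise-armhf]
description=Precise hard-float build chroot
type=directory
directory=/srv/chroot/precise-armhf
users=buildslave
root-users=buildslave
```

A build would then be entered with `schroot -c precise-armhf -- <build-command>`, identically on a Calxeda node or a Chromebook.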
The Calxeda nodes in cbuild use this type of infrastructure - the rootfs is running Quantal (and I have no idea how it is configured - it is what Steve supplied me with). Each node then runs two chroots (precise armel and precise armhf) which take turns asking the cbuild scheduler whether there is a job available.
So my first question is does any of the above make sense?
If you propose that LAVA builds use such a chroot setup, then it should be technically possible, but practically it will be quite a chore to set up and maintain. If we want to use LAVA, why don't we follow its way directly? It already allows using (and easily switching) any rootfs directly. There should be distro methods to pin packages to specific versions. If you want to run LAVA's rootfs in a chroot on a Chromebook, you can do just that - take one, transform it into a chroot and use it (the "transform" stage may take a bit of effort initially, but as the LAVA rootfs is wholly based on Linaro's standard linaro-media-create technology, once done it's reusable for all Linaro builds).
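The "transform" step mentioned above can be mostly scripted. A minimal sketch, assuming the image's root partition has already been loop-mounted (the mount offset, image name, and chroot name below are hypothetical):

```shell
set -e

# Pack an already-mounted rootfs into a tarball that schroot's "file"
# chroot type (or plain untarring) can consume.
pack_rootfs() {
    mnt="$1"     # mount point of the image's root partition
    name="$2"    # e.g. lava-precise-armhf
    tar -czf "$name.tar.gz" -C "$mnt" .
}

# Getting to the mount point needs root, roughly:
#   sudo mount -o loop,offset=<root-partition-offset> lava-image.img /mnt
#   pack_rootfs /mnt lava-precise-armhf
# then register the tarball in /etc/schroot/schroot.conf on the target machine.
```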
Next steps as I see it are:
- Paul/Dave - what stage is the cooling of the Pandaboards in the Lava farm at? One advantage of the above architecture is that we could use a stock Pandaboard kernel & rootfs with thermal limiting turned on for builds, so that things don't fall over all the time.
I'm currently focusing on critical android-build issues, so anything else is in the backlog. And next up in my queue is supporting IT with the global Linaro services EC2 migration ;-I.
But the problem we have is not that we can't get reliable *builds* in LAVA - it's that the *complete* CBuild picture doesn't work in LAVA; benchmarking specifically is the culprit. If you want reliable builds, just use the "lava-panda-usbdrive" queue - that will use those 15 standard Panda boards mentioned by Renato, with a known-good rootfs/kernel. The problem is that the gcc, etc. binaries produced by those builds won't run on the benchmarking image, because the OS versions of the "known good Panda rootfs" and the "validated CBuild PandaES rootfs" are different.
- Paul - how hard would it be to try and fire up a Calxeda node into Lava?
As other folks answered, that completely depends on work which the (old-time) LAVA people do - not something I (a former Infra engineer) can influence so far.
We can use one of the ones assigned to me. I don't need any of the fancy multinode stuff that Michael Hudson-Doyle is working on - each node can be considered a separate board. I feel guilty that I put the nodes into CBuild without looking at Lava - but it was easier to do and got me going. I think correcting that is important.
- Generally - What's the state of the Arndale boards in Lava?
Fathi has got GCC building reliably, although I believe he is now facing networking issues.
- Paul - If Arndale boards are available in Lava - how much effort would it be to make them available to CBuild?
The next step is getting a good rootfs for it. Note that if a board is *supported* by LAVA, there's always at least one good rootfs - the one LAVA itself uses as the "meta rootfs" (the one into which a board gets booted to configure the target rootfs). Then it's just a matter of switching the board type in the job template and fixing the bugs we see in the first builds. Summing up: it should be easy, modulo any unexpected things that may happen along the way.
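In LAVA's JSON job format the board is selected by the `device_type` field, so moving an existing job from Panda to Arndale is, in principle, a one-field change in the template. A sketch (job name, timeout, and URLs are illustrative, not real CBuild values):

```json
{
    "job_name": "cbuild-gcc-build",
    "device_type": "arndale",
    "timeout": 18000,
    "actions": [
        {
            "command": "deploy_linaro_image",
            "parameters": {
                "hwpack": "http://releases.example.org/hwpack_arndale.tar.gz",
                "rootfs": "http://releases.example.org/precise-armhf-rootfs.tar.gz"
            }
        }
    ]
}
```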
One issue the above doesn't solve, as far as I can see, is being able to tell Lava that we can do a build on any ARMv7-A CBuild-compatible board. I don't generally care whether the build happens on an Arndale, Panda, or Calxeda board - I want the result in the shortest possible time.
LAVA has flexible tag functionality which allows doing almost all the things mentioned by Renato. But that's a theoretical point; in practice there will always be differences and issues, so having an affinity to a particular board at a given time makes sense (while migrating in an agile manner as "better" boards become available).
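Tags would let a job say "any ARMv7-A CBuild-compatible board" instead of naming a concrete device type. A sketch on the JSON job format (the field name and tag values are assumptions to be checked against the deployed LAVA version, which would also need the tags assigned to devices on the server side):

```json
{
    "job_name": "cbuild-gcc-build",
    "tags": ["armv7a", "cbuild-compatible"],
    "actions": []
}
```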
A final note on benchmarking. I think the above scheme could work for benchmarking targets; all we need to do is build a kernel/rootfs that is set up to produce repeatable benchmarking results.
Yes, we only need to make a kernel/rootfs which is suitable for the boards (and their particular setup, like the cooling issues) we have at hand. It seems that this is exactly what can't be achieved easily, because no single team has enough data/experience. For example, TCWG knows how to set up a rootfs for toolchain builds, but doesn't know what a good rootfs/kernel combo is for a particular board. QA services would know (at least I'd hope so!) which basic rootfs/kernel is good for a board, but can't provide a CBuild-ready one, and we don't even involve them in the discussion. Infra/LAVA kind of runs this stuff, and would be happy to prepare/deploy the needed image given specific data and instructions; but without them, it would be painfully slow, because we'd need to re-investigate all this stuff, hitting lots of dead-ends and regularly being sidetracked by other work.
But besides that, one important point is that building a new kernel/rootfs pair for benchmarking means invalidating all the previous results (or introducing a discontinuity). Are you ready to do that? I always treated the answer as an invariant "no" for my work on CBuild/LAVA, but sooner or later it will need to be done anyway. If it can be done now (for LAVA builds), then let's just switch to using plain Pandas right away, be ready to switch to Arndale/Calxeda as they become stable - and let the preparation of a new benchmark image run in the background.
Comments welcome from all.
Thanks,
Matt