Hello,
Over the lifetime of the Linaro CI services, we've experienced a number
of episodes where service stability was affected by Ubuntu package
mirror issues, since the mirror is used to configure each and every
build slave, usually a dozen times a day.
The Linaro Infrastructure team and CI service stakeholders tried
different approaches to solve this issue, and over time we came to the
conclusion that the sustainable way to do it is to maintain custom
Amazon Machine Images (AMIs) with preinstalled packages, which simply
cuts the dependency on the Ubuntu mirror being available 100% of the
time. In addition, custom AMIs speed up build startup noticeably.
A special tool, linaro-ami, was developed to help maintain custom AMIs.
It had two main requirements: a centralized configuration location under
the control of EC2 admins, and ease of use for individual custom AMI
maintainers. We may still need to improve UI functionality for the
latter, and would need stakeholder feedback for that, but otherwise the
tool seems to work pretty well. It is available as part of the
linaro-aws-tools project, https://launchpad.net/linaro-aws-tools . The
README is available at
http://bazaar.launchpad.net/~linaro-aws-devs/linaro-aws-tools/trunk/view/he…
ci.linaro.org and android-build.linaro.org have already been switched to
the new custom AMIs. Using custom AMIs for the CBuild service is under
consideration by the Toolchain group.
So, from this point on, it's important that all stakeholders follow the
build slave configuration change process centered around the linaro-ami
tool: any package set change should be applied to the slave init script
(as referenced in the linaro-ami config), a new version of the
corresponding AMI generated, and the Jenkins configuration updated with
the new AMI ID. (No package installation should happen directly in
Jenkins.)
The Infrastructure team can help with maintaining AMIs, and is open to
suggestions on improving the new setup.
Thanks,
Paul
Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro - http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog
+ android team
This is becoming a big problem. I just checked and the load on the
system was growing out of control again.
Upon inspection, I found several Java (CTS) processes that were
consuming lots of CPU cycles, but on a closer look there was no ADB
connection to the board they were supposed to be testing.
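One way to spot such processes is to compare the --serial argument of
each running CTS command line against the devices adb currently reports.
The following is only a rough sketch: the function names are made up for
illustration, it assumes `ps aux`-style lines (PID in the second column)
and the usual `adb devices` output (serial, tab, state), and it is not
code from LAVA.

```python
import re

# Matches the "--serial <device>" argument of a cts-tradefed command line.
SERIAL_RE = re.compile(r"--serial\s+(\S+)")


def parse_adb_devices(output):
    """Return the serials that `adb devices` output lists in state 'device'."""
    connected = set()
    for line in output.splitlines():
        parts = line.split()
        # Device lines look like "<serial>\tdevice"; other lines are ignored.
        if len(parts) == 2 and parts[1] == "device":
            connected.add(parts[0])
    return connected


def find_orphaned_cts(ps_lines, connected):
    """Return (pid, serial) pairs for CTS processes whose device is gone."""
    orphans = []
    for line in ps_lines:
        if "cts-tradefed" not in line:
            continue
        match = SERIAL_RE.search(line)
        fields = line.split()
        if match and len(fields) > 1 and match.group(1) not in connected:
            # In `ps aux` output the second column is the PID.
            orphans.append((int(fields[1]), match.group(1)))
    return orphans
```

Feeding it the output of `ps aux` and `adb devices` would give a list of
candidate PIDs to review before killing anything by hand.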
I also found a couple of places where MonkeyRunner seemed to have
crashed but kept running (and kept consuming CPU cycles).
In addition, the mmtest output is ridiculous. It just dumps download
progress (%) info throughout our log file, making it really hard to read
through the logs.
I think we need to do a few things quickly here:
1) We need to limit the number of builds sending CTS jobs until we
understand what's going on. I'd suggest limiting it to something like
Origen or Panda, since we have the most of those boards. Right now, the
jobs are queuing up on snowball faster than it can unsuccessfully
execute them.
2) We need to understand what's going on with CTS. Is this due to the
patched version we deployed, etc?
3) For sanity's sake, update the logic for the wgets in mmtest.py to not
dump so much junk.
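For item 3, one option is wget's --no-verbose flag, which drops the
progress meter while still printing errors and a one-line summary per
download. How mmtest.py actually invokes wget is not shown here, so the
helper names below are hypothetical and this is only a sketch of the
idea.

```python
import subprocess


def wget_command(url, dest, quiet=True):
    """Build a wget argv; --no-verbose suppresses the progress meter."""
    cmd = ["wget"]
    if quiet:
        cmd.append("--no-verbose")
    cmd += ["-O", dest, url]
    return cmd


def download(url, dest):
    """Run wget and raise CalledProcessError if it exits non-zero."""
    subprocess.check_call(wget_command(url, dest))
```

Using --quiet instead would silence wget entirely, at the cost of losing
its error messages in the log.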
We'll need the android team's help on item 1. I also think we may need
their help on item 2.
On 07/16/2012 10:06 PM, Michael Hudson-Doyle wrote:
> Hi gang,
>
> We had a fright today with LAVA being unreachable. Luckily, we could
> log in again after a time and notice the cause of the load: 10 or so
> Java processes like this:
>
> root 31180 53.2 0.5 8175148 159856 ? Sl Jul16 603:22 java -cp :./android-cts/tools/../../android-cts/tools/ddmlib-prebuilt.jar:./android-cts/tools/../../android-cts/tools/tradefed-prebuilt.jar:./android-cts/tools/../../android-cts/tools/hosttestlib.jar:./android-cts/tools/../../android-cts/tools/cts-tradefed.jar -DCTS_ROOT=./android-cts/tools/../.. com.android.cts.tradefed.command.CtsConsole run cts --serial 192.168.1.199:5555 --plan CTS
>
> Clearly, this is related to the CTS upgrades we've done recently. There
> was no device connected to 192.168.1.199:5555 so somehow we're leaking
> these processes. I guess we should stop that :-)
>
> While looking into this, I noticed that monkeyrunner tests are quite CPU
> heavy. Is this expected? Do we need to limit how many of these we run
> at once?
>
> Cheers,
> mwh
>
All week I've been getting "gitweb is currently unavailable until we
finish importing jellybean". So I guess my question should really be
"when will jellybean be finished importing?" ;-)
It's a bit of a pain not being able to easily verify branches, tags, etc.
--
Tixy
---------- Forwarded message ----------
From: ffxx68 <ffumi68(a)googlemail.com>
Date: 15 June 2012 11:00
Subject: [android-porting] Re: Issue with Zygote initialization after
migrating host to Ubuntu 12
To: android-porting(a)googlegroups.com
After several different trials, it looks like I managed to solve this issue.
It all depends on the Java Development Kit used to make the AOSP
build. I initially installed the Oracle JDK (as suggested by the
android tutorial pages:
http://source.android.com/source/initializing.html) but then I tried
the Open-JDK, which is installed with:
sudo apt-get install openjdk-6-jdk
If you have both Open-JDK and Oracle-JDK installed, you can choose
which one to use by executing:
sudo update-alternatives --config java
sudo update-alternatives --config javac
sudo update-alternatives --config jar
sudo update-alternatives --config javah
sudo update-alternatives --config javadoc
and selecting the Open-JDK version for each of these.
The AOSP build using Oracle JDK failed to boot, while the one made
with Open-JDK completed the boot just fine.
Hopefully it helps others to solve similar issues.
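To confirm which JDK is actually active before starting a build, the
output of `java -version` can be inspected. Below is a small sketch,
assuming OpenJDK builds mention "OpenJDK" in that output and Oracle/Sun
builds mention "Java(TM)" or "HotSpot"; the function names are
illustrative only.

```python
import subprocess


def jdk_flavour(version_output):
    """Classify `java -version` output as 'openjdk', 'oracle' or 'unknown'."""
    text = version_output.lower()
    # Check OpenJDK first, so its banner is never mistaken for an Oracle one.
    if "openjdk" in text:
        return "openjdk"
    if "java(tm)" in text or "hotspot" in text:
        return "oracle"
    return "unknown"


def active_jdk(java="java"):
    """Run `java -version` (it prints to stderr) and classify the result."""
    proc = subprocess.Popen([java, "-version"],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return jdk_flavour(out.decode("utf-8", "replace"))
```

Calling active_jdk() after running update-alternatives should then
report "openjdk" if the switch took effect for the `java` command.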
On Wednesday, 13 June 2012 09:55:59 UTC-7, ffxx68 wrote:
>
> After upgrading my host to Ubuntu 12 and successfully rebuilding android, after installation on the device, the system boot loops into this error:
>
> ----------------------------
> I/Zygote ( 147): Preloading classes...
> D/dalvikvm( 147): GC_EXPLICIT freed 47K, 78% free 232K/1024K, external 0K/0K, paused 3ms
> D/dalvikvm( 147): GC_EXPLICIT freed 1K, 73% free 282K/1024K, external 0K/0K, paused 3ms
> D/dalvikvm( 147): GC_EXPLICIT freed 20K, 70% free 315K/1024K, external 0K/0K, paused 3ms
> D/dalvikvm( 147): GC_EXPLICIT freed 17K, 66% free 353K/1024K, external 0K/0K, paused 3ms
> D/dalvikvm( 147): GC_EXPLICIT freed 26K, 63% free 381K/1024K, external 0K/0K, paused 4ms
> D/dalvikvm( 147): GC_EXPLICIT freed 22K, 58% free 440K/1024K, external 0K/0K, paused 4ms
> W/MediaProfiles( 147): could not find media config xml file
> D/dalvikvm( 147): GC_EXPLICIT freed 99K, 47% free 545K/1024K, external 0K/0K, paused 5ms
> W/dalvikvm( 147): Exception Ljava/lang/NullPointerException; thrown while initializing Landroid/net/http/HttpsConnection;
> E/Zygote ( 147): Error preloading android.net.http.HttpsConnection.
> E/Zygote ( 147): java.lang.ExceptionInInitializerError
> E/Zygote ( 147): at java.lang.Class.classForName(Native Method)
> E/Zygote ( 147): at java.lang.Class.forName(Class.java:234)
> E/Zygote ( 147): at java.lang.Class.forName(Class.java:181)
> E/Zygote ( 147): at com.android.internal.os.ZygoteInit.preloadClasses(ZygoteInit.java:297)
> E/Zygote ( 147): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:564)
> E/Zygote ( 147): at dalvik.system.NativeStart.main(Native Method)
> E/Zygote ( 147): Caused by: java.lang.NullPointerException: algorithm is null
> E/Zygote ( 147): at javax.net.ssl.KeyManagerFactory.getInstance(KeyManagerFactory.java:77)
> E/Zygote ( 147): at org.apache.harmony.xnet.provider.jsse.SSLParametersImpl.createDefaultKeyManager(SSLParametersImpl.java:387)
> E/Zygote ( 147): at org.apache.harmony.xnet.provider.jsse.SSLParametersImpl.getDefaultKeyManager(SSLParametersImpl.java:380)
> E/Zygote ( 147): at org.apache.harmony.xnet.provider.jsse.SSLParametersImpl.<init>(SSLParametersImpl.java:120)
> E/Zygote ( 147): at org.apache.harmony.xnet.provider.jsse.SSLContextImpl.engineInit(SSLContextImpl.java:97)
> E/Zygote ( 147): at android.net.http.HttpsConnection.initializeEngine(HttpsConnection.java:101)
> E/Zygote ( 147): at android.net.http.HttpsConnection.<clinit>(HttpsConnection.java:65)
> E/Zygote ( 147): ... 6 more
> E/Zygote ( 147): setreuid() failed. errno: 17
> D/AndroidRuntime( 147): Shutting down VM
> ----------------------------
>
> The build of the same sources worked fine when built in my Ubuntu 11.10 (64-bit) host.
> I did the migration like this:
>
> 1) full backup of the aosp source dir, from the 11.10 host
> 2) setup of a new Ubuntu 12.04 host
> 3) installation of AOSP pre-requisite packages
> 4) installation of Oracle jdk 6
> 5) restore of the aosp source dir (from 1)
>
> The failure in "javax.net.ssl.KeyManagerFactory" suggests some SSL-related issue, but I wonder what I could be missing,...
>
> Thanks in advance for any help.
>
> Fabio
--
Zach Pfeffer
Android Platform Team Lead, Linaro Platform Teams
Hey everyone.
I've started a build for Panda (without the blobs) that uses a new
SOURCE_OVERLAY which will install BUILD-INFO.txt alongside system.tar.bz2.
The build is now in progress; I will update everyone once this works:
https://android-build.linaro.org/builds/~linaro-android/panda-ics-gcc47-til…
Zach, question: should we rename '-blob-' to something else while we are
waiting for the new panda blobs to arrive?
Thanks
ZK
--
Zygmunt Krynicki
Linaro Validation Team
s/Validation/Android/
Hey everyone.
In light of the discussion started by bug
https://bugs.launchpad.net/linaro-android/+bug/1022537 and its
duplicates, I think we need to reconsider non-tests build variants.
I'd like to understand how that would work:
0) Which builds need to be split to 'eng' and 'tests' builds?
1) Which images are sent to LAVA for automatic testing?
2) Which images are tested manually? (I don't suppose we'll test both)
3) How do we re-visit the split (do we? how often? on what event?)
My current feeling is that:
0: we need to split all builds and slightly redesign the a-b.l.o front
page to offer twin links
1: we use the 'eng' variant for LAVA, for now
2: we use the 'eng' variant for manual tests unless everyone agrees that
they can reliably test 'tests' builds
3: we look at this policy every time we {lose,gain} {h/w
acceleration,board type} and go through a checklist each month during
monthly planning.
Feel free to post your ideas and start the discussion. Thanks!
ZK
--
Zygmunt Krynicki
Linaro Validation Team
s/Validation/Android/