Hi All
After updating my Juno tree to the 16.07 release of LSK 3.18, I am not able to boot Android. I narrowed the problem down to the new iptables validity checks that came in with Linux 3.18.37. If I revert a bunch of these then things work again, i.e.
git revert 674c7e173d01~..6a9f9d4e6c5d
If I don't revert 674c7e173d01 then things remain broken, but it's not as simple as just that one commit being wrong, because if I only revert that one by itself it doesn't fix things.
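For readers unfamiliar with the range syntax: `674c7e173d01~..6a9f9d4e6c5d` covers every commit after the parent of 674c7e173d01 up to and including 6a9f9d4e6c5d, so the first commit itself is reverted too. Below is a minimal sketch in a scratch repository. The file name, commit messages, and committer identity are made up for illustration and have nothing to do with the kernel tree.

```shell
#!/bin/sh
# Illustration only: a throwaway repo with made-up commits, not the kernel tree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Four linear commits: "base", then c1, c2, c3, each appending one line.
for n in base c1 c2 c3; do
    echo "$n" >> file.txt
    git add file.txt
    git commit -q -m "$n"
done

first=$(git rev-parse HEAD~2)   # c1, standing in for 674c7e173d01
last=$(git rev-parse HEAD)      # c3, standing in for 6a9f9d4e6c5d

# "first~..last" selects c1, c2 and c3: the first commit is included.
git rev-list --count "$first~..$last"   # prints 3

# Revert the whole range; git creates one revert commit per original commit,
# newest first.
git revert --no-edit "$first~..$last" >/dev/null
cat file.txt                            # prints "base"
```

Running `git log --oneline` on the same range before reverting is a cheap way to confirm which commits will be touched.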
I don't have time to investigate further, but I thought I should at least make people aware of this problem, in the hope that a clue might be found at some point.
I didn't raise a bug as it's not obvious where the cause lies: Android userspace could genuinely be passing invalid data to the kernel, there could be bad interactions between the AOSP kernel changes and Linux 3.18 LTS, or LTS itself could be broken. (Searching the web, I didn't spot anything indicating that other people had a problem with those patches, though.)
The bad symptoms I'm getting with Android are that boot stalls and logcat shows the following every 5 seconds...
E NetdConnector: Communications error: java.io.IOException: Connection refused
E NetdConnector: Error in NativeDaemonConnector: java.io.IOException: Connection refused
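One quick way to check a saved logcat capture for this symptom is sketched below. The helper name `count_netd_refused` and the sample file are hypothetical; only standard `grep` is assumed.

```shell
#!/bin/sh
# Hypothetical helper: count NetdConnector "Connection refused" lines in a
# saved logcat capture passed as the first argument.
count_netd_refused() {
    grep -c 'NetdConnector: .*Connection refused' "$1"
}

# Sample capture built from the two lines quoted above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
E NetdConnector: Communications error: java.io.IOException: Connection refused
E NetdConnector: Error in NativeDaemonConnector: java.io.IOException: Connection refused
EOF

count_netd_refused "$sample"   # prints 2
```

On a real device the capture could come from something like `adb logcat -d > logcat.txt`. A count that keeps growing across captures would match the stalled boot described here.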
In case anyone wants more specifics for reproducing this, I pushed my kernel to the iptables-problem branch at https://git.linaro.org/people/tixy/kernel.git
The kernel config was the result of
ARCH=arm64 scripts/kconfig/merge_config.sh \
    linaro/configs/linaro-base.conf \
    linaro/configs/android.conf \
    linaro/configs/vexpress64.conf \
    linaro/configs/EAS.conf
And the Android images were from http://releases.linaro.org/members/arm/android/juno/16.07/
On 2 August 2016 at 21:49, Jon Medhurst (Tixy) tixy@linaro.org wrote:
[...]
Hi, I can reproduce the error locally on qemu-system-arm running Android M. I'll dig more into that, but I'm working from my parents' house for a couple of weeks so I might be slower than usual.
Regards, Amit Pundir
On Wed, 2016-08-03 at 15:51 +0530, Amit Pundir wrote:
Hi, I can reproduce the error locally on qemu-system-arm running Android M. I'll dig more into that, but I'm working from my parents' house for a couple of weeks so I might be slower than usual.
Many thanks for looking into this.
Just an FYI: I stumbled upon upstream commit f4dc77713f80 ("netfilter: x_tables: speed up jump target validation") which, from its description, seems to address the same regression that we are facing on lsk-android-4.4. I have not verified it yet, but will keep you posted.
Regards, Amit Pundir
On Mon, 2016-08-08 at 19:22 +0530, Amit Pundir wrote:
Just an FYI: I stumbled upon upstream commit f4dc77713f80 ("netfilter: x_tables: speed up jump target validation") which, from its description, seems to address the same regression that we are facing on lsk-android-4.4. I have not verified it yet, but will keep you posted.
Thanks. From my experiments it seemed that find_jump_target() was returning false, rather than just taking a long time before returning true. So that upstream performance fix may not be the whole picture.
Hi Tixy,
I can confirm that this upstream fix works for me. Verified on qemu-system-arm with the lsk-v3.18-android snapshot for lts-v3.18.37 (sorry for reporting it as v4.4 in my previous email), as well as with your "iptables-problem" tree.
Regards, Amit Pundir
On 8 Aug 2016 7:33 p.m., "Amit Pundir" amit.pundir@linaro.org wrote:
I can confirm that this upstream fix works for me. Verified on qemu-system-arm with the lsk-v3.18-android snapshot for lts-v3.18.37 (sorry for reporting it as v4.4 in my previous email), as well as with your "iptables-problem" tree.
Good work diagnosing! Does this apply to the base LTS or is this an interaction with Android patches? Either way it seems we should push it to an upstream, Greg or Google.
On 9 August 2016 at 00:18, Mark Brown broonie@linaro.org wrote:
Good work diagnosing! Does this apply to the base LTS or is this an interaction with Android patches? Either way it seems we should push it to an upstream, Greg or Google.
I see this patch has already been pushed to the LTS v3.18.39 tag.
Regards, Amit Pundir
On Tue, 2016-08-09 at 10:55 +0530, Amit Pundir wrote:
I see this patch has already been pushed to the LTS v3.18.39 tag.
There isn't a v3.18.39 tag in what I thought was the stable git [1], but after some confusion I found it in Sasha Levin's fork [2].
I cherry-picked commit f5bba514aff9 from that and it fixed things for me too, on my Juno kernel.
[1] http://www.spinics.net/lists/stable/msg140231.html
[2] https://git.kernel.org/cgit/linux/kernel/git/sashal/linux-stable.git/log/?h=...
Thanks again.
On Mon, 2016-08-08 at 19:48 +0100, Mark Brown wrote:
Good work diagnosing!
I'll second that, thanks!
linaro-kernel@lists.linaro.org