Hi there ...
summary: boot loop after upgrading from 5.17.5 to 5.17.6
I have a Gentoo desktop where I was successfully running a self-compiled vanilla 5.17.5 kernel. After upgrading to 5.17.12 the machine was unable to boot: lilo loads the kernel and the Intel microcode 20220510_p20220508 (/boot/intel-uc.img) as an initrd. Then the monitor turns black and a few moments later the BIOS logo is shown, followed by lilo again.
As there was a recent microcode upgrade I first tried to boot without it. But that didn't help.
Then I tried to boot other already compiled but not yet booted kernels 5.17.9 and 5.17.6 without any success. Booting the old 5.17.5 still worked. So I compiled 5.18.1 but that also was unable to boot.
Then I tried to do a "manual bisecting" with patch-5.17.5-6 and found that the machine is still booting with 197 of 209 diffs applied.
After applying "patch-5.17.5-6.part198.patch" compilation is broken, and it is still broken after applying "patch-5.17.5-6.part199.patch". After applying "patch-5.17.5-6.part200.patch", compilation works again but the resulting kernel now fails to boot.
It is an (up-to-date) Gentoo system, installed ~15 years ago, running a 64-bit kernel with a 32-bit userland. The hardware is newer than the installation (i5-7400).
I have to mention that the machine is in a remote location. With lilo allowing a "default kernel" as well as a kernel to be booted "just once", I was able to try all of the above remotely.
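The "just once" boot mentioned above can be sketched roughly as follows. This is only an illustration, assuming the lilo `-R` option (which stores a command line for the next boot only); the `run` and `boot_once` helpers and the label name are hypothetical and not taken from the original report.

```shell
#!/bin/sh
# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Ask lilo to boot the given label on the next boot only.
# Assumption: the label matches a "label=" entry in /etc/lilo.conf.
# If that kernel hangs or resets, the following boot falls back to
# the configured default kernel.
boot_once() {
    run lilo -R "$1"
}
```

With a hypothetical label, `DRY_RUN=1 boot_once linux-5.17.12` only prints the command; without DRY_RUN it would call lilo, after which a reboot tries the new kernel a single time.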
Thomas
some more information:
$ cat /proc/version
Linux version 5.17.5 (root@dragon) (x86_64-pc-linux-gnu-gcc (Gentoo 11.2.1_p20220115 p4) 11.2.1 20220115, GNU ld (Gentoo 2.37_p1 p2) 2.37) #130 SMP PREEMPT Thu Apr 28 10:50:24 CEST 2022
$ uname -mi
x86_64 GenuineIntel
I tried to compile 5.17.6 without the three mentioned diffs which modify the following files:
tools/objtool/check.c and tools/objtool/elf.c and tools/objtool/include/objtool/elf.h
and was then able to successfully boot 5.17.6.
With these three diffs reverted, I was able to boot all affected 5.17.x kernels (x={6,7,8,9,10,11,12})
On Thu, Jun 02, 2022 at 09:24:18PM +0200, Thomas Sattler wrote:
with these three diffs reverted, I was able to boot all affected 5.17.x kernels (x={6,7,8,9,10,11,12})
5.17.6 has commit 60d2b0b1018a ("objtool: Fix code relocs vs weak symbols"), which has a known issue that is fixed with commit ead165fa1042 ("objtool: Fix symbol creation"). If you apply ead165fa1042 on 5.17.6 or newer, does that resolve your issue?
ead165fa1042 is tagged for stable but I don't think Greg picks up patches from mainline until they are in a tagged -rc release.
Cheers, Nathan
On 02.06.22 at 23:32, Nathan Chancellor wrote:
5.17.6 has commit 60d2b0b1018a ("objtool: Fix code relocs vs weak symbols"), which has a known issue that is fixed with commit ead165fa1042 ("objtool: Fix symbol creation"). If you apply ead165fa1042 on 5.17.6 or newer, does that resolve your issue?
I applied ead165fa1042 on top of 5.17.12, but that did not make my system boot that kernel.
Thomas
Hi, this is your Linux kernel regression tracker. Sorry, I'm behind on mail.
On 03.06.22 00:44, Thomas Sattler wrote:
On 02.06.22 at 23:32, Nathan Chancellor wrote:
5.17.6 has commit 60d2b0b1018a ("objtool: Fix code relocs vs weak symbols"),
FWIW, that is 4abff6d48dbc in mainline
which has a known issue that is fixed with commit ead165fa1042 ("objtool: Fix symbol creation"). If you apply ead165fa1042 on 5.17.6 or newer, does that resolve your issue?
I applied ead165fa1042 on top of 5.17.12, but that did not make my system boot that kernel.
Was there any progress in getting to the bottom of this? Peter, who authored 4abff6d48dbc, is not even CCed on this thread yet, afaics.
BTW, 5.17 will likely be EOL in a week or two. Thomas, it might be best to give 5.19-rc1 a shot and, in case the regression is still there, start a new thread that focuses on the regression in mainline. That makes things less confusing, and the regression needs to be fixed in mainline first anyway before it can be fixed in the stable trees.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.
Hi Greg,
I did not yet start a bisect but looked at commits that modify the same file again. And there was a promising one:
22682a07acc3 objtool: Fix objtool regression on x32 systems
That one applies cleanly to both 5.17.12 and 5.18.2, and it fixes my boot problem on both.
Thomas
On 07.06.22 at 13:53, Greg KH wrote:
On Tue, Jun 07, 2022 at 01:40:03PM +0200, Thomas Sattler wrote:
Hi Thorsten,
I just compiled 5.19-rc1 and my issue is solved there.
Any chance you can do 'git bisect' between 5.18 and 5.19-rc1 to find the commit that fixed the issue?
thanks,
greg k-h
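A bisect that looks for the fixing commit rather than the breaking one can be sketched as below. This is a rough illustration, assuming git's custom bisect terms (`--term-old` / `--term-new`); the `run` and `bisect_for_fix` helpers are hypothetical names, and the block only sets up the bisection.

```shell
#!/bin/sh
# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start a bisect between a ref that does not boot and a later ref that does.
# The usual good/bad terms are renamed so that the older state is "broken"
# and the newer state is "fixed".
bisect_for_fix() {
    broken_ref="$1"
    fixed_ref="$2"
    run git bisect start --term-old=broken --term-new=fixed &&
    run git bisect broken "$broken_ref" &&
    run git bisect fixed "$fixed_ref"
}
```

After `bisect_for_fix v5.18 v5.19-rc1`, git checks out a commit in between; each build is then booted and marked with `git bisect broken` or `git bisect fixed` until git reports the first "fixed" commit.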
On Tue, Jun 07, 2022 at 04:17:26PM +0200, Thomas Sattler wrote:
Hi Greg,
I did not yet start a bisect but looked at commits that modify the same file again. And there was a promising one:
22682a07acc3 objtool: Fix objtool regression on x32 systems
That one applies cleanly to both 5.17.12 and 5.18.2, and it fixes my boot problem on both.
Oh good, that's already queued up for the next round of stable kernels.
thanks,
greg k-h
On 07.06.22 13:40, Thomas Sattler wrote:
I just compiled 5.19-rc1 and my issue is solved there.
Then, in the interest of the greater good, it would be good if you could check 5.18.y as well: if it doesn't work there, it might be a good idea to identify what fixed the problem in mainline and backport the fix to 5.18.y. The same is kinda true for 5.17 as well, but obviously it's your choice if you want to spend time on this.
Ciao, Thorsten
On Thu, Jun 02, 2022 at 06:14:43PM +0200, Thomas Sattler wrote:
After applying "patch-5.17.5-6.part198.patch" compilation is broken. Still after applying "patch-5.17.5-6.part199.patch". After applying "patch-5.17.5-6.part200.patch", compilation works again but the resulting kernel now fails to boot.
I have no idea what those random patches are, please can you say what the upstream commit is?
thanks,
greg k-h
On 02.06.22 at 22:08, Greg KH wrote:
On Thu, Jun 02, 2022 at 06:14:43PM +0200, Thomas Sattler wrote:
After applying "patch-5.17.5-6.part198.patch" compilation is broken. Still after applying "patch-5.17.5-6.part199.patch". After applying "patch-5.17.5-6.part200.patch", compilation works again but the resulting kernel now fails to boot.
I have no idea what those random patches are, please can you say what the upstream commit is?
I took what I reverted from patch-5.17.5-6.xz. In your tree it matches what Nathan mentioned (60d2b0b1018a) plus d17f64c29512.
Now, knowing that they were two patchsets, I compiled 5.17.12 twice, once without 60d2b0b1018a and once without d17f64c29512.
And it turns out it is 60d2b0b1018a which breaks my system.
Thomas
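Testing one suspect commit at a time, as described above, could look roughly like this in a git checkout of the stable tree. This is a sketch under assumptions: the report worked from patch-5.17.5-6.xz rather than git, the `run` and `revert_on_tag` helpers are hypothetical names, and a revert on a later tag may not apply cleanly.

```shell
#!/bin/sh
# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Check out a release tag on a detached HEAD and revert a single commit,
# so that a kernel without exactly that change can be built and booted.
revert_on_tag() {
    tag="$1"
    commit="$2"
    run git checkout --detach "$tag" &&
    run git revert --no-edit "$commit"
}
```

Running `revert_on_tag v5.17.12 60d2b0b1018a` and, separately, `revert_on_tag v5.17.12 d17f64c29512` gives the two trees to build and compare.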
On Fri, Jun 03, 2022 at 01:29:26AM +0200, Thomas Sattler wrote:
I took what I reverted from patch-5.17.5-6.xz. In your tree it matches what Nathan mentioned (60d2b0b1018a) plus d17f64c29512.
Now, knowing that they were two patchsets, I compiled 5.17.12 twice, once without 60d2b0b1018a and once without d17f64c29512.
And it turns out it is 60d2b0b1018a which breaks my system.
Does 5.18.1 also break for you?
thanks,
greg k-h
On 03.06.22 at 14:57, Greg KH wrote:
Does 5.18.1 also break for you?
There is no difference between 5.17.6, 5.17.12 and 5.18.1:
- they do not boot vanilla
- applying ead165fa1042 is no help
- reverting 60d2b0b1018a allows booting
Thomas
On Fri, Jun 03, 2022 at 03:46:34PM +0200, Thomas Sattler wrote:
There is no difference between 5.17.6, 5.17.12 and 5.18.1:
- they do not boot vanilla
- applying ead165fa1042 is no help
- reverting 60d2b0b1018a allows booting
Great, please work with the developers of that change to track down the problem and get it fixed.
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org