hi,
i seem to be the only one in the world to have this problem. :-(
on one of my machines, updating to 6.6.18 and later (including mainline branch) leads to unbootable system. all other computers are unaffected.
bisecting the history leads to:
commit 8117961d98fb2d335ab6de2cad7afb8b6171f5fe
Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Tue Sep 12 09:00:53 2023 +0000

    x86/efi: Disregard setup header of loaded image

    commit 7e50262229faad0c7b8c54477cd1c883f31cc4a7 upstream.

    The native EFI entrypoint does not take a struct boot_params from the loader, but instead, it constructs one from scratch, using the setup header data placed at the start of the image.

    This setup header is placed in a way that permits legacy loaders to manipulate the contents (i.e., to pass the kernel command line or the address and size of an initial ramdisk), but EFI boot does not use it in that way - it only copies the contents that were placed there at build time, but EFI loaders will not (and should not) manipulate the setup header to configure the boot. (Commit 63bf28ceb3ebbe76 "efi: x86: Wipe setup_data on pure EFI boot" deals with some of the fallout of using setup_data in a way that breaks EFI boot.)

    Given that none of the non-zero values that are copied from the setup header into the EFI stub's struct boot_params are relevant to the boot now that the EFI stub no longer enters via the legacy decompressor, the copy can be omitted altogether.

    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20230912090051.4014114-19-ardb@google.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
this seems to be the commit to introduce the regression.
i have no idea where to look and what information to provide to explain what the problem might be or why this single machine is different. :-(
please don't hesitate to ask me further questions - i can do any debugging you may need on your behalf.
sincerely, R.
#regzbot introduced: 8117961d98fb2d335ab6de2cad7afb8b6171f5fe
On Thu, 14 Mar 2024 at 19:32, Radek Podgorny radek@podgorny.cz wrote:
> hi,
> i seem to be the only one in the world to have this problem. :-(
> on one of my machines, updating to 6.6.18 and later (including mainline branch) leads to unbootable system. all other computers are unaffected.
> bisecting the history leads to:
> commit 8117961d98fb2d335ab6de2cad7afb8b6171f5fe
> Author: Ard Biesheuvel <ardb@kernel.org>
Thanks for the report.
I'd like to get to the bottom of this if we can.
Please share as much information as you can about the system
- boot logs
- DMI data to identify the system and firmware etc
- distro version
- versions of boot components (shim, grub, systemd-boot, etc including
  config files)
- other information that might help narrow this down.
On Thu, 14 Mar 2024 at 20:35, Ard Biesheuvel ardb@kernel.org wrote:
> Thanks for the report.
> I'd like to get to the bottom of this if we can.
> Please share as much information as you can about the system
> - boot logs
> - DMI data to identify the system and firmware etc
> - distro version
> - versions of boot components (shim, grub, systemd-boot, etc including
>   config files)
> - other information that might help narrow this down.
Also, please check whether the below change makes a difference or not
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -477,6 +477,8 @@
 		efi_exit(handle, status);
 	}

+	memset(&boot_params, 0, sizeof(boot_params));
+
 	/* Assign the setup_header fields that the kernel actually cares about */
 	hdr->root_flags = 1;
 	hdr->vid_mode = 0xffff;
On Thu, 14 Mar 2024 at 22:53, Ard Biesheuvel ardb@kernel.org wrote:
> Also, please check whether the below change makes a difference or not
>
> --- a/drivers/firmware/efi/libstub/x86-stub.c
> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> @@ -477,6 +477,8 @@
>  		efi_exit(handle, status);
>  	}
>
> +	memset(&boot_params, 0, sizeof(boot_params));
> +
>  	/* Assign the setup_header fields that the kernel actually cares about */
>  	hdr->root_flags = 1;
>  	hdr->vid_mode = 0xffff;
Another thing you might try is reverting commit
commit 5f51c5d0e905608ba7be126737f7c84a793ae1aa
Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Tue Sep 12 09:00:52 2023 +0000

    x86/efi: Drop EFI stub .bss from .data section

(in v6.6 the commit id is fa244085025f4a8fb38ec67af635aed04297758d)
hi ard, thanks for the effort!
so, your first recommended patch (the memset thing), applied to current mainline (6.8) DOES NOT resolve the issue.
the second recommendation, a revert patch, applied to the same mainline tree, indeed DOES resolve the problem.
just to be sure, i'm attaching the revert patch.
as for the information to find the cause:
* boot logs - can't really give you as the boot process just stops - i can send you a picture of the screen taken with my phone.
* dmi data - attaching dmidecode output
* distro - current arch linux (rolling distro) - attaching list of installed packages with versions
* versions of boot components - should be included in the listing above
feel free to send me any patches, i'll be happy to try them out!
cheers, R.
On 3/15/24 08:42, Ard Biesheuvel wrote:
> Another thing you might try is reverting commit
>
> commit 5f51c5d0e905608ba7be126737f7c84a793ae1aa
> Author: Ard Biesheuvel <ardb@kernel.org>
> Date:   Tue Sep 12 09:00:52 2023 +0000
>
>     x86/efi: Drop EFI stub .bss from .data section
>
> (in v6.6 the commit id is fa244085025f4a8fb38ec67af635aed04297758d)
On Fri, 15 Mar 2024 at 15:12, Radek Podgorny radek@podgorny.cz wrote:
> hi ard, thanks for the effort!
> so, your first recommended patch (the memset thing), applied to current mainline (6.8) DOES NOT resolve the issue.
> the second recommendation, a revert patch, applied to the same mainline tree, indeed DOES resolve the problem.
> just to be sure, i'm attaching the revert patch.
Thanks.
If the revert works for you, I think we can stop looking.
This points to an issue in the firmware's image loader, which does not clear all the memory it should be clearing.
I will queue up the revert with your tested-by.
Thanks, Ard.
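The hypothesis above (a loader that does not clear memory it should be clearing) can be illustrated with a small userspace sketch. It assumes the PE/COFF rule that when a section's in-memory size exceeds the data stored in the file, the loader zero-fills the remainder. The struct and both function names are hypothetical and only model the idea; they are not firmware or kernel code, and the thread does not confirm that this is the actual defect on the affected machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified model of one image section as a loader sees it.
 * The field names echo the PE section header, but this is not its real layout.
 */
struct section {
	const unsigned char *raw_data; /* bytes stored in the image file */
	size_t size_of_raw_data;       /* number of bytes the file provides */
	size_t virtual_size;           /* number of bytes occupied in memory */
};

/* Number of bytes that are actually backed by file data. */
static size_t file_backed_bytes(const struct section *s)
{
	return s->size_of_raw_data < s->virtual_size
		? s->size_of_raw_data : s->virtual_size;
}

/*
 * Well-behaved loader: copy the file-backed part, then zero-fill the rest.
 * dst must have room for virtual_size bytes.
 */
static void load_section(unsigned char *dst, const struct section *s)
{
	size_t copy = file_backed_bytes(s);

	assert(dst != NULL);
	memcpy(dst, s->raw_data, copy);
	memset(dst + copy, 0, s->virtual_size - copy);
}

/* Loader with the suspected defect: stale memory past the file data survives. */
static void load_section_no_zero_fill(unsigned char *dst, const struct section *s)
{
	assert(dst != NULL);
	memcpy(dst, s->raw_data, file_backed_bytes(s));
}
```

Code that keeps its zero-initialised variables in the part of the section that is not backed by file data sees zeros after `load_section()`, but whatever was in memory beforehand after `load_section_no_zero_fill()`.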
ok, no problem, thanks!
i was just wondering what was the purpose of the original change - i don't want others to miss some improvement just because of my weird machine. ;-)
r.
On 3/15/24 15:25, Ard Biesheuvel wrote:
> Thanks.
> If the revert works for you, I think we can stop looking.
> This points to an issue in the firmware's image loader, which does not clear all the memory it should be clearing.
> I will queue up the revert with your tested-by.
> Thanks, Ard.
On Fri, 15 Mar 2024 at 15:12, Radek Podgorny radek@podgorny.cz wrote:
> hi ard, thanks for the effort!
> so, your first recommended patch (the memset thing), applied to current mainline (6.8) DOES NOT resolve the issue.
> the second recommendation, a revert patch, applied to the same mainline tree, indeed DOES resolve the problem.
> just to be sure, i'm attaching the revert patch.
Actually, that is not the patch I had in mind.
Please revert
x86/efi: Drop EFI stub .bss from .data section
commit fa244085025f4a8fb38ec67af635aed04297758d in v6.6
(or apply the changes below by hand if that is easier for you)
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -47,6 +47,7 @@ SECTIONS
 		_data = . ;
 		*(.data)
 		*(.data.*)
+		*(.bss.efistub)

 		/* Add 4 bytes of extra space for a CRC-32 checksum */
 		. = ALIGN(. + 4, 0x200);
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -108,6 +108,13 @@
 # https://bugs.llvm.org/show_bug.cgi?id=46480
 STUBCOPY_FLAGS-y += --remove-section=.note.gnu.property

+STUBCOPY_FLAGS-$(CONFIG_X86) += --rename-section .bss=.bss.efistub,load,alloc
 STUBCOPY_RELOC-$(CONFIG_X86_32) := R_386_32
 STUBCOPY_RELOC-$(CONFIG_X86_64) := R_X86_64_64
On Fri, 15 Mar 2024 at 16:33, Ard Biesheuvel ardb@kernel.org wrote:
> Actually, that is not the patch I had in mind.
> Please revert
> x86/efi: Drop EFI stub .bss from .data section
BTW which bootloader are you using?
it's systemd-boot. attaching bootctl output. now looking at it, it seems that while systemd (and systemd-boot) gets timely updates on my system (currently at 255.4), the stub (is this how it's called?) does not get updated automatically in the efi partition (still at version 244?).
i can try to update it. but i'll wait for your instructions since this may be some rare situation and we may use it for testing.
anyway, i'm compiling new kernel with your suggested changes right now so i'll let you know how it turned out, soon.
r.
p.s.: ha! nevermind, i just checked the other systems which boot fine and they also are on stub (?) 244 so it's probably not the cause.
On 3/15/24 17:08, Ard Biesheuvel wrote:
> BTW which bootloader are you using?
On Fri, 15 Mar 2024 at 17:24, Radek Podgorny radek@podgorny.cz wrote:
> it's systemd-boot. attaching bootctl output. now looking at it, it seems that while systemd (and systemd-boot) gets timely updates on my system (currently at 255.4), the stub (is this how it's called?) does not get updated automatically in the efi partition (still at version 244?).
> i can try to update it. but i'll wait for your instructions since this may be some rare situation and we may use it for testing.
> anyway, i'm compiling new kernel with your suggested changes right now so i'll let you know how it turned out, soon.
> r.
> p.s.: ha! nevermind, i just checked the other systems which boot fine and they also are on stub (?) 244 so it's probably not the cause.
OK that makes sense.
I installed Arch linux in a VM (what a pain!) but I don't think the distro has anything to do with it.
I did realize that reverting that patch is not going to be a full solution in any case.
Could you please try whether the following fix works for you?
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -473,6 +473,9 @@
 	int options_size = 0;
 	efi_status_t status;
 	char *cmdline_ptr;
+	extern char _bss[], _ebss[];
+
+	memset(_bss, 0, _ebss - _bss);

 	efi_system_table = sys_table_arg;
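The idea behind the proposed fix, as far as it can be read from the diff, is that the stub clears its own .bss range on entry instead of trusting the loader to have done so. A minimal userspace sketch of that pattern follows. The `image_bss` struct and both helpers are hypothetical stand-ins for the linker-provided `_bss` and `_ebss` symbols, which an ordinary program cannot portably reference.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the linker symbols used in the patch: in this sketch the
 * two pointers simply delimit an ordinary buffer.
 */
struct image_bss {
	unsigned char *start; /* plays the role of _bss */
	unsigned char *end;   /* plays the role of _ebss */
};

/* Clear the whole range [start, end) before anything reads from it. */
static void clear_bss(const struct image_bss *bss)
{
	assert(bss->start <= bss->end);
	memset(bss->start, 0, (size_t)(bss->end - bss->start));
}

/* Return nonzero if every byte in the range is zero. */
static int bss_is_clear(const struct image_bss *bss)
{
	for (const unsigned char *p = bss->start; p < bss->end; p++)
		if (*p != 0)
			return 0;
	return 1;
}
```

The clearing has to run before any zero-initialised variable in that range is read or written, which is presumably why the patch places the `memset()` at the very top of the entry function.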
ok, will. the kernel with previous patch is still compiling so i'll queue it. ;-)
anyway, should i apply this as a separate patch or as an addition to the previous one (the one with bss.efistub addition)?
r.
On 3/15/24 18:48, Ard Biesheuvel wrote:
> OK that makes sense.
> I installed Arch linux in a VM (what a pain!) but I don't think the distro has anything to do with it.
> I did realize that reverting that patch is not going to be a full solution in any case.
> Could you please try whether the following fix works for you?
>
> --- a/drivers/firmware/efi/libstub/x86-stub.c
> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> @@ -473,6 +473,9 @@
>  	int options_size = 0;
>  	efi_status_t status;
>  	char *cmdline_ptr;
> +	extern char _bss[], _ebss[];
> +
> +	memset(_bss, 0, _ebss - _bss);
>
>  	efi_system_table = sys_table_arg;
On Fri, 15 Mar 2024 at 19:11, Radek Podgorny radek@podgorny.cz wrote:
> ok, will. the kernel with previous patch is still compiling so i'll queue it. ;-)
> anyway, should i apply this as a separate patch or as an addition to the previous one (the one with bss.efistub addition)?
Only this one change please.
hello ard,
so, i have some bad news. unfortunately, none of the suggested changes resolves the issue. :-(
the last two patches still lead to boot-stuck system.
overall, the patches only make the difference in error being printed: sometimes it's something like "wrong padding", sometimes the "invalid magic" message. i only noticed later so i can't really tell which patch leads to which message but i can retry them all and tell you exactly if it helps.
anyway, if you have any other patches, i'll be more than happy to test them out (and be more careful about the message this time)!
thanks, r.
On 3/15/24 19:25, Ard Biesheuvel wrote:
> Only this one change please.
linux-stable-mirror@lists.linaro.org