[REGRESSION] kexec does firmware reboot in kernel v6.7.6


Pavin Joseph

1 Mar 2024

2:10 p.m.

Hello everyone,

#regzbot introduced v6.7.5..v6.7.6

I'm experiencing an issue where kexec does a full firmware reboot instead of kexec reboot.

Issue first submitted at OpenSuse bugzilla [0].

OS details as follows:
Distributor ID: openSUSE
Description: openSUSE Tumbleweed-Slowroll
Release: 20240213

Issue has been reproduced by building kernel from source.

kexec works as expected in kernel v6.7.5. kexec does full firmware reboot in kernel v6.7.6.

I followed the docs here [1] to perform git bisect and find the culprit, hope it's alright as I'm quite out of my depth here.

Git bisect logs:
git bisect start
# status: waiting for both good and bad commits
# bad: [b631f5b445dc3379f67ff63a2e4c58f22d4975dc] Linux 6.7.6
git bisect bad b631f5b445dc3379f67ff63a2e4c58f22d4975dc
# status: waiting for good commit(s), bad commit known
# good: [004dcea13dc10acaf1486d9939be4c793834c13c] Linux 6.7.5
git bisect good 004dcea13dc10acaf1486d9939be4c793834c13c
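For a sense of scale: git bisect roughly halves the candidate range at every step, so the number of kernels to build and boot grows with the base-2 logarithm of the commit count. A minimal sketch, assuming the count comes from `git rev-list --count v6.7.5..v6.7.6`; the helper name `bisect_steps` is made up for illustration and the result is only an estimate.

```shell
# Rough upper bound on bisection rounds: smallest k with 2^k >= n.
# bisect_steps is a hypothetical helper, not part of git.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Example use inside a kernel tree (needs the v6.7.5 and v6.7.6 tags):
#   bisect_steps "$(git rev-list --count v6.7.5..v6.7.6)"
```

For example, `bisect_steps 100` prints 7, so a range of about a hundred commits needs around seven build-and-boot rounds.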

Let me know if there's anything else I can do to help troubleshoot the issue.

[0]: https://bugzilla.suse.com/show_bug.cgi?id=1220541
[1]: https://docs.kernel.org/admin-guide/bug-bisect.html

Kind regards, Pavin Joseph.


Linux regression tracking (Thorsten Leemhuis)

1 Mar

2:45 p.m.

Hi! Thx for the report.

On 01.03.24 15:10, Pavin Joseph wrote:

> #regzbot introduced v6.7.5..v6.7.6
>
> I'm experiencing an issue where kexec does a full firmware reboot instead of kexec reboot.

Does mainline show the same problem? The answer determines who later will have to look into this.

> Issue first submitted at OpenSuse bugzilla [0].
>
> OS details as follows:
> Distributor ID: openSUSE
> Description: openSUSE Tumbleweed-Slowroll
> Release: 20240213
>
> Issue has been reproduced by building kernel from source.
>
> kexec works as expected in kernel v6.7.5. kexec does full firmware reboot in kernel v6.7.6.
>
> I followed the docs here [1] to perform git bisect and find the culprit, hope it's alright as I'm quite out of my depth here.

With a bit of luck somebody might have heard about problems like yours. But if nobody comes up with an idea within a few days, we almost certainly need a bisection to get down to the root of the problem.

I'm working on a more detailed guide describing the process, maybe that works better for you:

https://www.leemhuis.info/files/misc/How%20to%20bisect%20a%20Linux%20kernel%...

Among other things, it will make you check mainline and also 6.7.7, which was just released.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

Pavin Joseph

2 Mar

8:24 a.m.

Hi Thorsten,

On 3/1/24 20:15, Linux regression tracking (Thorsten Leemhuis) wrote:

> Does mainline show the same problem? The answer determines who later will have to look into this.

Yes, I reproduced the issue on mainline and the latest stable version 6.7.7 using your excellent guide.

> With a bit of luck somebody might have heard about problems like yours. But if nobody comes up with an idea within a few days, we almost certainly need a bisection to get down to the root of the problem.

Full bisection done, culprit identified, and validated by reverting commit on mainline.

Attached bisection log and config used.

Bisection final results:
7143c5f4cf2073193eb27c9cdb84fd4655d1802d is the first bad commit
commit 7143c5f4cf2073193eb27c9cdb84fd4655d1802d
Author: Steve Wahl <steve.wahl@hpe.com>
Date: Fri Jan 26 10:48:41 2024 -0600

x86/mm/ident_map: Use gbpages only where full GB page should be mapped.

commit d794734c9bbfe22f86686dc2909c25f5ffe1a572 upstream.

When ident_pud_init() uses only gbpages to create identity maps, large ranges of addresses not actually requested can be included in the resulting table; a 4K request will map a full GB. On UV systems, this ends up including regions that will cause hardware to halt the system if accessed (these are marked "reserved" by BIOS). Even processor speculation into these regions is enough to trigger the system halt.

Only use gbpages when map creation requests include the full GB page of space. Fall back to using smaller 2M pages when only portions of a GB page are included in the request.

No attempt is made to coalesce mapping requests. If a request requires a map entry at the 2M (pmd) level, subsequent mapping requests within the same 1G region will also be at the pmd level, even if adjacent or overlapping such requests could have been combined to map a full gbpage. Existing usage starts with larger regions and then adds smaller regions, so this should not have any great consequence.

[ dhansen: fix up comment formatting, simplifty changelog ]

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240126164841.170866-1-steve.wahl%40hpe.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arch/x86/mm/ident_map.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)

----------
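The rule in the changelog above can be illustrated with a little address arithmetic. This is only a sketch of the decision as the commit message describes it, not the kernel's ident_pud_init() code; `map_level` is a made-up helper name, and the sketch ignores the no-coalescing behaviour the changelog mentions.

```shell
# Sketch of the changelog's rule, not the kernel implementation.
GB=$((1 << 30))   # size of one gbpage (1 GiB)

# map_level START END ADDR  (hypothetical helper)
# Prints "1G" if the request [START, END) covers the entire GiB-aligned
# slot that contains ADDR, else "2M" (fall back to pmd-level 2M pages).
map_level() {
    start=$(($1)); end=$(($2)); addr=$(($3))
    slot=$((addr / GB * GB))
    if [ "$start" -le "$slot" ] && [ "$((slot + GB))" -le "$end" ]; then
        echo 1G
    else
        echo 2M
    fi
}
```

`map_level 0x40001000 0x40002000 0x40001000` prints 2M: a 4K request no longer pulls in the whole surrounding GiB. `map_level 0 0x80000000 0x40001000` prints 1G, because the request spans the full slot.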

Btw, the issue appears on LTS kernel 6.6.18 as well. I didn't build this one from the source and test, but installed it a while back from OpenSuse Tumbleweed repos as "kernel-longterm" is a new addition and is being actively tested over there.

Kind regards, Pavin Joseph.

Linux regression tracking (Thorsten Leemhuis)

3:17 p.m.

[adding the people involved in developing and applying the culprit to the list of recipients]

FWIW, thread starts here: https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjoseph...

On 02.03.24 09:24, Pavin Joseph wrote:

> On 3/1/24 20:15, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Does mainline show the same problem? The answer determines who later will have to look into this.
>
> Yes, I reproduced the issue on mainline and the latest stable version 6.7.7 using your excellent guide.

Thx for testing and glad to hear. Still: if you have any feedback how to make that guide even better, please let me know!

>> With a bit of luck somebody might have heard about problems like yours. But if nobody comes up with an idea within a few days, we almost certainly need a bisection to get down to the root of the problem.
>
> Full bisection done, culprit identified, and validated by reverting commit on mainline.

I assume the latter meant "reverting the culprit on mainline fixed the problem"; if you meant something else, please let us know.


P.S.:

#regzbot introduced d794734c9bbfe22f86686dc2909c25f5ffe1a572
#regzbot title x86/mm/ident_map: kexec now leads to reboot

Pavin Joseph

4:10 p.m.

Hello everyone,

On 3/2/24 20:47, Linux regression tracking (Thorsten Leemhuis) wrote:

> Thx for testing and glad to hear. Still: if you have any feedback how to make that guide even better, please let me know!

Yes, I have some improvements in mind. Don't know if there is a Github repo where I can make a PR, but if not here's the gist:

1. The git clone/fetch instructions in the TLDR are easy to follow, but there is conflicting information later on in the main section and the reference section that, taken together, does not work. I think it would be better not to perform shallow clones, or such advanced topics could be relegated to their own reference section.

Here's what I ended up using:
git clone -o mainline --no-checkout \
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/
cd ~/linux/
git remote add -t master stable \
  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
git checkout --detach v6.0
git checkout --force --detach mainline/master
git remote set-branches --add stable linux-6.7.y
git fetch --verbose stable
git checkout --force --detach v6.7.7
git checkout --force --detach v6.7.5

2. The "installkernel" command is called "kernel-install" in OpenSuse, and it doesn't really perform all the steps to install kernel. It calls dracut to create initramfs though, but that's hardly much help.

I ended up doing:
sudo make modules_install
sudo install -m 0600 $(make -s image_name) /usr/lib/modules/$(make -s kernelrelease)/vmlinuz
sudo install -m 0600 System.map /usr/lib/modules/$(make -s kernelrelease)/System.map
sudo kernel-install add $(make -s kernelrelease) /usr/lib/modules/$(make -s kernelrelease)/vmlinuz
sudo ln -sf /boot/initrd-$(make -s kernelrelease) /boot/initrd
sudo ln -sf /usr/lib/modules/$(make -s kernelrelease)/vmlinuz /boot/vmlinuz-$(make -s kernelrelease)
sudo ln -sf /boot/vmlinuz-$(make -s kernelrelease) /boot/vmlinuz
sudo ln -sf /usr/lib/modules/$(make -s kernelrelease)/System.map /boot/System.map-$(make -s kernelrelease)
sudo update-bootloader

3. The dependencies for kernel building in OpenSuse and other major distros are incomplete; most of them have some form of package collection that can be provided as an alternative. For example in OpenSuse, I installed the following patterns (collections of packages):
sudo zypper in -t pattern devel_basis devel_kernel devel_osc_build devel_rpm_build

4. The command to build RPM package (make binrpm-pkg) fails as the modules are installed into "/home/<user>/linux/.../lib" while depmod checks for modules in "/home/<user>/linux/.../usr/lib".
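The install sequence from point 2 repeats `$(make -s kernelrelease)` many times; capturing it once makes the steps easier to audit. The following is a dry-run sketch that only prints the commands, using the same paths as the message above. `plan_install` is a hypothetical helper, and the release string in the example is invented.

```shell
# Dry-run sketch: prints the commands instead of running them.
# plan_install is a hypothetical helper; paths follow the message above.
plan_install() {
    krel=$1    # e.g. the output of: make -s kernelrelease
    image=$2   # e.g. the output of: make -s image_name
    moddir=/usr/lib/modules/$krel
    cat <<EOF
sudo make modules_install
sudo install -m 0600 $image $moddir/vmlinuz
sudo install -m 0600 System.map $moddir/System.map
sudo kernel-install add $krel $moddir/vmlinuz
sudo ln -sf /boot/initrd-$krel /boot/initrd
sudo ln -sf $moddir/vmlinuz /boot/vmlinuz-$krel
sudo ln -sf /boot/vmlinuz-$krel /boot/vmlinuz
sudo ln -sf $moddir/System.map /boot/System.map-$krel
sudo update-bootloader
EOF
}

# Example: plan_install "$(make -s kernelrelease)" "$(make -s image_name)"
```

Printing first and running later leaves room to review the symlink targets before anything under /boot is touched.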

I think that's it, turned out not to be a gist after all. 🙂 Thank you very much for writing the updated guide. It was very helpful; without it I don't think it would have been possible for someone like me to find and report this bug.

>> Full bisection done, culprit identified, and validated by reverting commit on mainline.
>
> I assume the latter meant "reverting the culprit on mainline fixed the problem"; if you meant something else, please let us know.

Clarification: reverting culprit commit on mainline fixed the problem.

Kind regards, Pavin Joseph.

Steve Wahl

3 Mar

midnight


Pavin,

I have just now built and installed 6.7.7 and successfully kexec'd in this fashion:

------------------------------
sph-185:~ # kexec -l /boot/vmlinuz-6.7.7-wahl --initrd=/boot/initrd-6.7.7-wahl --reuse-cmdline
sph-185:~ # systemctl kexec
...
[ OK ] Reached target Late Shutdown Services.
Starting Reboot via kexec...
[ 493.056708][ T1] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[ 493.089271][ T1] systemd-shutdown[1]: Unmounting file systems.
[ 493.096285][T14707] (sd-remount)[14707]: Remounting '/' read-only in with options 'attr2,inode64,logbufs=8,logbsize=32k,noquota'.
[ 493.113587][ T1] systemd-shutdown[1]: All filesystems unmounted.
[ 493.119913][ T1] systemd-shutdown[1]: Deactivating swaps.
[ 493.126612][ T1] systemd-shutdown[1]: All swaps deactivated.
[ 493.132584][ T1] systemd-shutdown[1]: Detaching loop devices.
[ 493.138718][ T1] systemd-shutdown[1]: All loop devices detached.
[ 493.145036][ T1] systemd-shutdown[1]: Stopping MD devices.
[ 493.150838][ T1] systemd-shutdown[1]: All MD devices stopped.
[ 493.156894][ T1] systemd-shutdown[1]: Detaching DM devices.
[ 493.162785][ T1] systemd-shutdown[1]: All DM devices detached.
[ 493.168930][ T1] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
[ 493.221975][ T1] systemd-shutdown[1]: Syncing filesystems and block devices.
[ 493.229354][ T1] systemd-shutdown[1]: Rebooting with kexec.
[ 493.303438][T14716] qla2xxx [0003:61:00.0]-fffa:14: Adapter shutdown
[ 493.309930][T14716] qla2xxx [0003:61:00.0]-00af:14: Performing ISP error recovery - ha=00000000ae891d2f.
[ 493.330535][T14716] qla2xxx [0003:61:00.0]-fffe:14: Adapter shutdown successfully.
[ 494.114055][T14716] mana 0000:61:00.1: Shutdown was called
[ 494.643693][T14716] kvm: exiting hardware virtualization
[ 494.649419][T14716] kexec_core: Starting new kernel

Invalid physical address chosen!

Physical KASLR disabled: no suitable memory region!

[ 0.000000][ T0] Linux version 6.7.7-wahl (root@sph-185) (gcc (SUSE Linux) 7.5.0, GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.41.0.20230908-150100.7.46) #1 SMP PREEMPT_DYNAMIC Sat Mar 2 17:22:28 CST 2024
------------------------------

This was on SLES, not open Suse.

The machines I work on are large, though. Can you give specifics on exactly how you are performing your kexec, and what hardware you are using when you hit this (especially memory size)? Have you made any special arrangements for the size of memory reserved for kexec on your system?

The patch can use slightly more memory to create the identity maps, and the only thing I can think of right now is that little bit causing you to run out of memory.

Thank you!

--> Steve Wahl

-- Steve Wahl, Hewlett Packard Enterprise

Pavin Joseph

6:32 a.m.

On 3/3/24 05:30, Steve Wahl wrote:

> The machines I work on are large, though. Can you give specifics on exactly how you are performing your kexec, and what hardware you are using when you hit this (especially memory size)? Have you made any special arrangements for the size of memory reserved for kexec on your system?

Hi Steve, I'm using a mainstream Lenovo laptop with an AMD APU (Ryzen 3 5300U). This is my secondary/testing machine, on which I've built the kernels and performed the git bisection. I've attached the relevant journal logs and inxi output.

My primary laptop, which I'm typing this from, is of the same build but with a slightly better APU (Ryzen 5 5500U). On it I have replicated the problem using kernels from OpenSuse repos, both patched and vanilla, but not ones I've built myself.

The only peculiarity with these machines I can think of is that its onboard graphics reserves/uses a portion of the normal RAM as its VRAM.

I have reproduced the issue by calling kexec directly using "kexec -l" & "kexec -e", systemctl kexec, and also using the default systemd service/script provided by OpenSuse. The exact command it uses is as follows.

Kexec call:
kexec --kexec-syscall-auto --load '/usr/lib/modules/6.7.4-1-default/vmlinuz' --initrd='/boot/initrd-6.7.4-1-default' --append='root=/dev/mapper/suse-system splash=silent mitigations=auto quiet crashkernel=421M,high crashkernel=72M,low security=apparmor'

Note: when I used "kexec -l", I only included the root fs path in append and none of the other options, to rule out any side effects.

The problem can be reliably reproduced when kexec'ing from the faulty kernel into the same kernel. This is why there are two boot entries for each kernel (6.7.7 and 6.7.5) in the attached journal logs.
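When reproducing by hand with "kexec -l" and "kexec -e", it can help to confirm that an image is actually staged before triggering the switch, which rules out a failed load as the cause. A minimal sketch: /sys/kernel/kexec_loaded is the kernel's own flag file, while the helper name `kexec_is_loaded` and the /boot paths in the example are assumptions for illustration.

```shell
# kexec_is_loaded [FILE]  (hypothetical helper)
# Succeeds when FILE (default: /sys/kernel/kexec_loaded) contains "1",
# i.e. an image has been staged with "kexec -l".
kexec_is_loaded() {
    flag_file=${1:-/sys/kernel/kexec_loaded}
    [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}

# Example (not run here; needs root and really switches kernels):
#   kexec -l /boot/vmlinuz --initrd=/boot/initrd --reuse-cmdline
#   kexec_is_loaded && kexec -e
```

The optional file argument exists only so the check can be exercised against a scratch file instead of the live sysfs entry.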

Let me know if you need any further clarifications.

Kind regards, Pavin Joseph.

Steve Wahl

4 Mar

4:15 p.m.

On Sun, Mar 03, 2024 at 12:02:31PM +0530, Pavin Joseph wrote:

> On 3/3/24 05:30, Steve Wahl wrote:
>> The machines I work on are large, though. Can you give specifics on exactly how you are performing your kexec, and what hardware you are using when you hit this (especially memory size)? Have you made any special arrangements for the size of memory reserved for kexec on your system?
>
> Hi Steve, I'm using a mainstream Lenovo laptop with an AMD APU (Ryzen 3 5300U). This is my secondary/testing machine, on which I've built the kernels and performed the git bisection. I've attached the relevant journal logs and inxi output.

Hi, Pavin,

Thanks for the extra information. I have skimmed it, and will continue to read more thoroughly.

There's a chance you may be running out of the memory reserved for the kexec kernel. If you have the time to try adding the command line option "nogbpages" to a kernel that's working for you to see if that breaks it in a similar way or not, that would be valuable information.

Explanation: My patch can require additional memory for the identity map; in the worst case this should be an extra 4K per GiB mapped. The nogbpages option always does what my patch only does sometimes, including requiring this extra memory.
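The "4K per GiB" figure follows from the x86-64 page-table layout: a pmd table is one 4 KiB page of 512 entries, each mapping 2 MiB, so one table covers exactly 1 GiB. A small sketch of the worst-case arithmetic; `extra_ident_map_kib` is a made-up helper name and the memory size in the example is not taken from the thread.

```shell
# Worst-case extra identity-map memory when 1 GiB entries become 2M ones.
# One pmd table (a 4 KiB page, 512 entries x 2 MiB) is needed per GiB.
PMD_TABLE_KIB=4

# extra_ident_map_kib GIB  (hypothetical helper)
extra_ident_map_kib() {
    echo $(( $1 * PMD_TABLE_KIB ))
}
```

`extra_ident_map_kib 16` prints 64, so mapping 16 GiB entirely with 2M pages would cost at most 64 KiB more than mapping it with gbpages.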

My next steps are to read through your logs more closely, and load OpenSUSE somewhere to see if I can replicate your problem.

Thanks again,

--> Steve Wahl

-- Steve Wahl, Hewlett Packard Enterprise

Pavin Joseph

5:48 p.m.

On 3/4/24 21:45, Steve Wahl wrote:

> There's a chance you may be running out of the memory reserved for the kexec kernel. If you have the time to try adding the command line option "nogbpages" to a kernel that's working for you to see if that breaks it in a similar way or not, that would be valuable information.

I tried it and it breaks working kernels (6.7.4).

> My next steps are to read through your logs more closely, and load OpenSUSE somewhere to see if I can replicate your problem.

I wasn't able to reproduce the issue inside a VM (virt-manager, QEMU/KVM).

Kind regards, Pavin Joseph.

Steve Wahl

5 Mar

3:25 p.m.

[Oops; previously sent this to Pavin only when I meant to copy everyone.]

On Mon, Mar 04, 2024 at 11:18:49PM +0530, Pavin Joseph wrote:

> On 3/4/24 21:45, Steve Wahl wrote:
>> There's a chance you may be running out of the memory reserved for the kexec kernel. If you have the time to try adding the command line option "nogbpages" to a kernel that's working for you to see if that breaks it in a similar way or not, that would be valuable information.
>
> I tried it and it breaks working kernels (6.7.4).

Thank you. That's good news, it means I'm thinking on the right track.

I'm still on the way to getting a system installed with OpenSUSE to try and replicate your problem. In the meantime, if you want to try figuring out how to increase the memory allocated for kexec kernel purposes, that might correct the problem.
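One starting point is to see which reservation is currently configured on the kernel command line. A minimal sketch that extracts the crashkernel= values, as in the kexec call quoted earlier in the thread; `crashkernel_args` is a hypothetical helper. Note the assumption: crashkernel= is the reservation used for a crash (kdump) kernel, and whether it has any bearing on a regular kexec load is not established in this thread.

```shell
# crashkernel_args CMDLINE  (hypothetical helper)
# Prints the value of every crashkernel= option, one per line.
crashkernel_args() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p'
}

# On a running system: crashkernel_args "$(cat /proc/cmdline)"
```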

>> My next steps are to read through your logs more closely, and load OpenSUSE somewhere to see if I can replicate your problem.
>
> I wasn't able to reproduce the issue inside a VM (virt-manager, QEMU/KVM).

Also good to know, as that was a possibility I was considering trying.

The number of regions created in the identity map as you're kexecing is fairly system dependent. It's been a couple of months since I looked through the callers, but as I recall it might even include regions that are in tables passed in by the BIOS. So it varies from system to system, and a VM is probably going to be much simpler than real hardware.
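A rough way to compare how much simpler a VM's memory layout is than real hardware is to count the labelled regions in /proc/iomem on both. This is only a sketch: `count_iomem_regions` is a made-up helper, region labels differ between kernel versions, and which regions actually end up in the kexec identity map is not spelled out in this thread.

```shell
# count_iomem_regions NAME [FILE]  (hypothetical helper)
# Counts entries in FILE (default: /proc/iomem) whose label is exactly NAME.
count_iomem_regions() {
    grep -c " : $1\$" "${2:-/proc/iomem}"
}

# Example: count_iomem_regions Reserved
#          count_iomem_regions 'System RAM'
```

The labels are readable without root; only the address columns are zeroed for unprivileged users.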

Thanks.

--> Steve Wahl

-- Steve Wahl, Hewlett Packard Enterprise

Pavin Joseph

7:58 p.m.

On 3/5/24 20:55, Steve Wahl wrote:

> In the meantime, if you want to try figuring out how to increase the memory allocated for kexec kernel purposes, that might correct the problem.

I tried all the options and variations possible in kexec. Don't know how useful this is but it seems there's a hard limit imposed by kexec on the size of the kernel image, irrespective of the format.

pavin@suse-laptop:~> sudo /usr/sbin/kexec --debug --kexec-syscall-auto --load '/usr/lib/modules/6.7.6-1-default/vmlinux' --initrd='/boot/initrd-6.7.6-1-default' --append='root=/dev/mapper/suse-system crashkernel=341M,high crashkernel=72M,low security=apparmor mitigations=auto'
Try gzip decompression.
Invalid memory segment 0x1000000 - 0x2c60fff
pavin@suse-laptop:~> file /usr/lib/modules/6.7.6-1-default/vmlinux
/usr/lib/modules/6.7.6-1-default/vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=cd9816be5099dbe04750b2583fe34462de6dcdca, not stripped
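The "Invalid memory segment" line reports a physical address range, so its size is easy to work out. A small sketch of that arithmetic; `segment_kib` is a hypothetical helper, and what the range implies about any size limit is not established in this thread.

```shell
# segment_kib START END  (hypothetical helper)
# Size in KiB of the inclusive physical range START..END.
segment_kib() {
    echo $(( ($2 - $1 + 1) / 1024 ))
}
```

`segment_kib 0x1000000 0x2c60fff` prints 29060, i.e. roughly 28.4 MiB starting at the 16 MiB mark.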

Kind regards, Pavin Joseph.

Pavin Joseph

6 Mar

3:09 a.m.

Hello everyone,

I tried optimizing the new stable kernel 6.7.8 for space but that did not resolve the issue.

pavin@suse-laptop:~> du -s /usr/lib/modules/6.7.8-local/vmlinuz
10496 /usr/lib/modules/6.7.8-local/vmlinuz
pavin@suse-laptop:~> du -s /usr/lib/modules/6.7.6-1-default/vmlinuz
14012 /usr/lib/modules/6.7.6-1-default/vmlinuz

Kind regards, Pavin Joseph.


Steve Wahl

3:50 p.m.

Pavin, thanks.

For my part, I've loaded OpenSUSE on two different systems but have not succeeded in replicating your problem. I am still working on that.

The systems I have in hand to test this with are Intel, not AMD. Eric Hagberg's report (thanks, Eric!) of seeing it on a PowerEdge R6615 (which also appears to be AMD) suggests to me that AMD systems might have something different, like a different set of commonly included devices on the motherboard, that affects what regions are included in the identity map and makes us trip up here. I will look harder at the logs Pavin supplied to see if I can glean any differences, and maybe see if I can locate an AMD system.

--> Steve Wahl


-- Steve Wahl, Hewlett Packard Enterprise


linux-stable-mirror@lists.linaro.org
