[Oops; previously sent this to Pavin only when I meant to copy everyone.]
On Mon, Mar 04, 2024 at 11:18:49PM +0530, Pavin Joseph wrote:
> On 3/4/24 21:45, Steve Wahl wrote:
> > There's a chance you may be running out of the memory reserved for the kexec kernel. If you have the time to try adding the command line option "nogbpages" to a kernel that's working for you to see if that breaks it in a similar way or not, that would be valuable information.
> I tried it and it breaks working kernels (6.7.4).
Thank you. That's good news, it means I'm thinking on the right track.
I'm still on the way to getting a system installed with OpenSUSE to try and replicate your problem. In the meantime, if you want to try figuring out how to increase the memory allocated for kexec kernel purposes, that might correct the problem.
> > My next steps are to read through your logs more closely, and load OpenSUSE somewhere to see if I can replicate your problem.
> I wasn't able to reproduce the issue inside a VM (virt-manager, QEMU/KVM).
Also good to know, as that was a possibility I was considering trying.
The number of regions created in the identity map as you're kexecing is fairly system dependent. It's been a couple of months since I looked through the callers, but as I recall it might even include regions that are in tables passed in by the BIOS. So it varies from system to system, and a VM is probably going to be much simpler than real hardware.
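To give a rough feel for why both "nogbpages" and the number of regions matter, here is a back-of-the-envelope sketch in Python. It only counts 4 KiB page-table pages for x86-64 4-level paging and is not the kernel's actual algorithm. The region list is made up for illustration, and the function names are hypothetical. It assumes that with 1 GiB pages only the top-level table and PUD tables are needed, while without them every 1 GiB window touched by a region also needs its own PMD table of 2 MiB entries.

```python
# Rough estimate only, NOT the kernel's identity-map code.
# Counts 4 KiB page-table pages for x86-64 4-level paging.

GIB1 = 1 << 30        # span of one PUD entry, and of one full PMD table
GIB512 = 512 << 30    # span of one full PUD table


def windows_touched(regions, span):
    """Count distinct span-aligned windows overlapped by any region."""
    windows = set()
    for start, end in regions:          # end is exclusive
        windows.update(range(start // span, (end - 1) // span + 1))
    return len(windows)


def page_table_pages(regions, use_gbpages):
    """Estimate page-table pages needed to identity-map the regions."""
    pgd = 1                                     # one top-level table
    pud_tables = windows_touched(regions, GIB512)
    if use_gbpages:
        # 1 GiB pages are mapped directly from PUD entries.
        return pgd + pud_tables
    # 2 MiB pages: one PMD table per 1 GiB window that is touched.
    pmd_tables = windows_touched(regions, GIB1)
    return pgd + pud_tables + pmd_tables


# Hypothetical physical regions (start, end), invented for this example.
regions = [
    (0x0, 0x8000_0000),                 # 0 to 2 GiB
    (0x1_0000_0000, 0x8_8000_0000),     # 4 GiB to 34 GiB
    (0xFE_0000_0000, 0xFE_0001_0000),   # small region near 1016 GiB
]

print("with 1 GiB pages:   ", page_table_pages(regions, True))
print("without (nogbpages):", page_table_pages(regions, False))
```

For this invented layout the estimate goes from 3 pages to 36 pages once 1 GiB pages are disabled, and every extra scattered region can add more tables. A VM with one or two simple regions would stay near the low end, which fits with the problem not showing up there.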
Thanks.
--> Steve Wahl