On Sun, Mar 24, 2024 at 11:31:39AM +0100, Ingo Molnar wrote:
* Steve Wahl <steve.wahl@hpe.com> wrote:
Some systems have ACPI tables that don't include everything that needs to be mapped for a successful kexec. These systems rely on identity maps that include the full gigabyte surrounding any smaller region requested for kexec success. Without this, they fail to kexec and end up doing a full firmware reboot.
So, reduce the use of GB pages only on systems where this is known to be necessary (specifically, UV systems).
Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Fixes: d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
Reported-by: Pavin Joseph <me@pavinjoseph.com>
Sigh, why was d794734c9bbf marked for a -stable backport? The commit never explains ...
I will try to explain, since Steve is offline. That commit fixes a legitimate bug where a larger address range (1G) is mapped than was requested. The fix prevents the CPU from speculatively loading beyond the requested range, which includes speculative loads from reserved memory. That is why it was marked for -stable.
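To make the over-mapping concrete, here is a minimal sketch of the check involved. It is not the code from d794734c9bbf; the helper names (gb_page_fits, rounded_gb_start, rounded_gb_end) and the half-open [start, end) convention are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define GB_SIZE (1ULL << 30)
#define GB_MASK (~(GB_SIZE - 1))

/*
 * Hypothetical helper: true only if the 1 GiB-aligned block containing
 * addr lies entirely inside the requested range [start, end). Only then
 * does a GB page map nothing beyond what was asked for.
 * Overflow at the very top of the address space is ignored here.
 */
static bool gb_page_fits(uint64_t addr, uint64_t start, uint64_t end)
{
	uint64_t gb_start = addr & GB_MASK;
	uint64_t gb_end = gb_start + GB_SIZE;

	return gb_start >= start && gb_end <= end;
}

/* Start of what gets mapped if a GB page is used unconditionally. */
static uint64_t rounded_gb_start(uint64_t start)
{
	return start & GB_MASK;
}

/* End (exclusive) of what gets mapped if a GB page is used unconditionally. */
static uint64_t rounded_gb_end(uint64_t end)
{
	return (end + GB_SIZE - 1) & GB_MASK;
}
```

With a 2 MiB request at 0x40200000-0x40400000, gb_page_fits() returns false, so smaller pages would be used, while unconditional GB pages would map the whole of 0x40000000-0x80000000, including whatever reserved memory sits in that gigabyte.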
If it's broken, it should be reverted - instead of trying to partially revert and then maybe break some other systems.
Three people reported that mapping only the correct address range caused problems on their platforms. https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjoseph... Steve and several other people helped debug the issue. The commit itself looks correct, but the correct behavior has a side effect on a few platforms. Some memory ends up not being mapped, and it is not clear whether that is due to some other bug, such as the BIOS not providing an accurate memory map, or some other kernel code path not mapping what it should. The 1G mapping covers up that type of issue.
Steve's second patch was meant to avoid breaking those platforms while leaving the fix in place on the platform where the original mapping problem was detected (the UV platform).
When there's boot breakage with new patches, we back out the bad patch and re-try in 99.9% of the cases.
Steve can certainly merge his two patches and resubmit, to replace the reverted original patch. He should be online in the morning to speak for himself.
Thanks