* Russ Anderson <rja@hpe.com> wrote:
On Sun, Mar 24, 2024 at 11:31:39AM +0100, Ingo Molnar wrote:
* Steve Wahl <steve.wahl@hpe.com> wrote:
Some systems have ACPI tables that don't include everything that needs to be mapped for a successful kexec. These systems rely on identity maps that include the full gigabyte surrounding any smaller region requested for kexec success. Without this, they fail to kexec and end up doing a full firmware reboot.
So, reduce the use of GB pages only on systems where this is known to be necessary (specifically, UV systems).
Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Fixes: d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
Reported-by: Pavin Joseph <me@pavinjoseph.com>
Sigh, why was d794734c9bbf marked for a -stable backport? The commit never explains ...
I will try to explain, since Steve is offline. That commit fixes a legitimate bug where more address range is mapped (1G) than the requested address range.
If a change regresses on certain machines then it's not a bug fix anymore, it's a regression. End of story.
The fix avoids the issue of the CPU speculatively loading beyond the requested range, which includes speculative loads from reserved memory. That is why it was marked for -stable.
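To make the over-mapping concrete, here is a minimal sketch (not the kernel's ident_map code) of how much address space ends up mapped beyond a requested range when it is covered with whole 1 GiB pages, and of the "only use a GB page where the full GB is requested" check. The names GB_SIZE, GB_MASK, gb_overmap_bytes() and gb_fully_requested() are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define GB_SIZE (UINT64_C(1) << 30)   /* 1 GiB */
#define GB_MASK (~(GB_SIZE - 1))      /* clears the offset within a 1 GiB page */

/*
 * Hypothetical helper: bytes mapped outside [start, end) when the range
 * is covered with whole 1 GiB pages. Assumes start < end and that
 * rounding end up does not overflow 64 bits.
 */
static uint64_t gb_overmap_bytes(uint64_t start, uint64_t end)
{
    uint64_t map_start = start & GB_MASK;               /* round down */
    uint64_t map_end = (end + GB_SIZE - 1) & GB_MASK;   /* round up */

    return (map_end - map_start) - (end - start);
}

/*
 * Hypothetical helper: true only if the 1 GiB-aligned region containing
 * addr lies entirely inside the requested range [start, end).
 */
static bool gb_fully_requested(uint64_t addr, uint64_t start, uint64_t end)
{
    uint64_t gb_start = addr & GB_MASK;

    return gb_start >= start && gb_start + GB_SIZE <= end;
}
```

Under these assumptions, a 1 MiB request in the middle of a gigabyte drags in almost a full extra GiB of mappings, which is the region speculative loads could then reach; with the stricter check, that request would fall back to smaller pages.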
And this regression is why more complicated fixes in this area should not be forwarded to -stable before they've been merged upstream and exposed a bit more. Please keep that in mind for future iterations.
If it's broken, it should be reverted - instead of trying to partially revert and then maybe break some other systems.
Three people reported that mapping only the correct address range caused problems on their platforms. https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjoseph... Steve and several other people helped debug the issue. The commit itself looks correct, but the correct behavior causes side effects on a few platforms.
That's all fine and the effort is much appreciated - but we should not try to whitewash a regression: if there's a couple of reports in such a short time already, then the regression is significant.
Anyway, I've reverted this in tip:x86/urgent:
c567f2948f57 Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."
we can iterate from there again. Please post future patches against that tree.
Note that this is just the regular development process: regressions happen, and this is how we handle them a lot of the time in this area - we back out the breakage, then try again.
Some memory ends up not being mapped, but it is not clear whether that is due to some other bug, such as the BIOS not providing an accurate memory map, or some other kernel code path not mapping what it should. The 1G mapping covers up that type of issue.
Steve's second patch was meant to avoid breaking those platforms while leaving the fix in place on the platform where the original mapping problem was detected (the UV platform).
When there's boot breakage with new patches, we back out the bad patch and re-try in 99.9% of the cases.
Steve can certainly merge his two patches and resubmit, to replace the reverted original patch. He should be on in the morning to speak for himself.
Thank you!
Ingo