The current implementation of e820__register_nosave_regions suffers from multiple serious issues: - The end of last region is tracked by PFN, causing it to find holes that aren't there if two consecutive subpage regions are present - The nosave PFN ranges derived from holes are rounded out (instead of rounded in) which makes it inconsistent with how explicitly reserved regions are handled
Fix this by: - Treating reserved regions as if they were holes, to ensure consistent handling (rounding out nosave PFN ranges is more correct as the kernel does not use partial pages) - Tracking the end of the last RAM region by address instead of pages to detect holes more precisely
Cc: stable@vger.kernel.org Fixes: e5540f875404 ("x86/boot/e820: Consolidate 'struct e820_entry *entry' local variable names") Reported-by: Roberto Ricci io@r-ricci.it Closes: https://lore.kernel.org/all/Z4WFjBVHpndct7br@desktop0a/ Signed-off-by: Myrrh Periwinkle myrrhperiwinkle@qtmlabs.xyz --- The issue of the kernel failing to resume from hibernation after kexec_load() is used is likely due to kexec-tools passing in a different e820 memory map from the one provided by system firmware, causing the e820 consistency check to fail. That issue is not addressed in this patch and will need to be fixed in kexec-tools instead.
Changes in v3: - Round out nosave PFN ranges since the kernel does not use partial pages - Link to v2: https://lore.kernel.org/r/20250405-fix-e820-nosave-v2-1-d40dbe457c95@qtmlabs...
Changes in v2: - Updated author details - Rewrote commit message - Link to v1: https://lore.kernel.org/r/20250405-fix-e820-nosave-v1-1-162633199548@qtmlabs... --- arch/x86/kernel/e820.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 57120f0749cc3c23844eeb36820705687e08bbf7..9d8dd8deb2a702bd34b961ca4f5eba8a8d9643d0 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -753,22 +753,21 @@ void __init e820__memory_setup_extended(u64 phys_addr, u32 data_len) void __init e820__register_nosave_regions(unsigned long limit_pfn) { int i; - unsigned long pfn = 0; + u64 last_addr = 0;
for (i = 0; i < e820_table->nr_entries; i++) { struct e820_entry *entry = &e820_table->entries[i];
- if (pfn < PFN_UP(entry->addr)) - register_nosave_region(pfn, PFN_UP(entry->addr)); - - pfn = PFN_DOWN(entry->addr + entry->size); - if (entry->type != E820_TYPE_RAM) - register_nosave_region(PFN_UP(entry->addr), pfn); + continue;
- if (pfn >= limit_pfn) - break; + if (last_addr < entry->addr) + register_nosave_region(PFN_DOWN(last_addr), PFN_UP(entry->addr)); + + last_addr = entry->addr + entry->size; } + + register_nosave_region(PFN_DOWN(last_addr), limit_pfn); }
#ifdef CONFIG_ACPI
--- base-commit: a8662bcd2ff152bfbc751cab20f33053d74d0963 change-id: 20250405-fix-e820-nosave-f5cb985a9a61
Best regards,