max_mapnr variable is utilized in the pfn_valid() method in order to determine the upper PFN space boundary. Having it uninitialized effectively makes any PFN passed to that method invalid. That in its turn causes the kernel mm-subsystem occasion malfunctions even after the max_mapnr variable is actually properly updated. For instance, pfn_valid() is called in the init_unavailable_range() method in the framework of the calls-chain on MIPS: setup_arch() +-> paging_init() +-> free_area_init() +-> memmap_init() +-> memmap_init_zone_range() +-> init_unavailable_range()
Since pfn_valid() always returns "false" value before max_mapnr is initialized in the mem_init() method, any flatmem page-holes will be left in the poisoned/uninitialized state including the IO-memory pages. Thus any further attempts to map/remap the IO-memory by using MMU may fail. In particular it happened in my case on attempt to map the SRAM region. The kernel bootup procedure just crashed on the unhandled unaligned access bug raised in the __update_cache() method:
Unhandled kernel unaligned access[#1]: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc1-00307-g0dff108838c9-dirty #2056 ... Call Trace: [<8011ef9c>] __update_cache+0x88/0x1bc [<80385944>] ioremap_page_range+0x110/0x2a4 [<80126948>] ioremap_prot+0x17c/0x1f4 [<80711b80>] __devm_ioremap+0x8c/0x120 [<80711e0c>] __devm_ioremap_resource+0xf4/0x218 [<808bf244>] sram_probe+0x4f4/0x930 [<80889d20>] platform_probe+0x68/0xec ...
Let's fix the problem by initializing the max_mapnr variable as soon as the required data is available. In particular it can be done right in the paging_init() method before free_area_init() is called since all the PFN zone boundaries have already been calculated by that time.
Cc: stable@vger.kernel.org Signed-off-by: Serge Semin fancer.lancer@gmail.com
---
Note I don't really know since what point that problem actually exists. Based on the commits log it might had been persistent even before the boot_mem_map allocator was dropped. On the other hand I hadn't seen it actually come out before moving my working tree from kernel 6.5-rc4 to 6.7-rc1. So after updating the kernel I got the unhandled unaligned access BUG() due to the access to compound head pointer the __update_cache() method (see the commit log). After enabling the DEBUG_VM config I managed to find out that the IO-memory pages were just left uninitialized and poisoned:
page:81367080 is uninitialized and poisoned (pfn 8192) page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) Kernel bug detected[#1]: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc1-00298-g88721b1a9ad5-dirty $ 0 : 00000000 812d0000 00000034 dced7cdf $ 4 : dced7cdf 00594000 10000000 ffff00fe $ 8 : 8196bfe0 00000000 00000001 818458c0 $12 : 00000000 00000000 00000000 00000216 $16 : 00002800 81227b80 00000000 00000000 $20 : 00000000 00000000 00000000 00000000 $24 : 0000022b 818458c0 $28 : 81968000 8196be68 00000000 803a0920 Hi : 00000000 Lo : 00000000 epc : 8039d2a4 BUG+0x0/0x4 ra : 803a0920 post_alloc_hook+0x0/0x128 Status: 10000003 KERNEL EXL IE Cause : 00800424 (ExcCode 09) PrId : 0001a830 (MIPS P5600) Modules linked in: Process swapper/0 (pid: 1, threadinfo=81968000, task=819a0000, tls=00000000) Stack : 00000000 8101ccb0 00000000 8196bd00 00000000 80359768 818a8300 00000001 81139088 8114438c 8042e4f8 81297a2c 81297a2c 81255e90 819a1b50 dced7cdf 81297a2c 81297a2c 00000000 81227b80 00000000 81241168 811394b0 00000000 81140000 80e2cee0 00000000 00000000 00000000 00000000 00000000 819b0000 81140000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ... Call Trace: [<8039d2a4>] BUG+0x0/0x4 [<803a0920>] post_alloc_hook+0x0/0x128
Code: 01001025 03e00008 24020001 <000c000d> 2403003c 27bdffd0 afb2001c 3c12812f 8e4269e4
Which in its turn made me digging deeper into the way the MMIO-space pages are initialized. That's how I got into the pfn_valid() and init_unavailable_range() working improperly on my setup.
Anyway none of the problems above I spotted on kernel 6.5-rc4. So what actually triggered having them finally popped up isn't that easy to be foundn seeing the involved code hasn't changed much. --- arch/mips/mm/init.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c index 5dcb525a8995..6e368a4658b5 100644 --- a/arch/mips/mm/init.c +++ b/arch/mips/mm/init.c @@ -422,7 +422,12 @@ void __init paging_init(void) (highend_pfn - max_low_pfn) << (PAGE_SHIFT - 10)); max_zone_pfns[ZONE_HIGHMEM] = max_low_pfn; } + + max_mapnr = highend_pfn ? highend_pfn : max_low_pfn; +#else + max_mapnr = max_low_pfn; #endif + high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT);
free_area_init(max_zone_pfns); } @@ -458,13 +463,6 @@ void __init mem_init(void) */ BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT) && (PFN_PTE_SHIFT > PAGE_SHIFT));
-#ifdef CONFIG_HIGHMEM - max_mapnr = highend_pfn ? highend_pfn : max_low_pfn; -#else - max_mapnr = max_low_pfn; -#endif - high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT); - maar_init(); memblock_free_all(); setup_zero_pages(); /* Setup zeroed pages. */