Hello Thomas,
On 10/11/2020 10:55, Thomas Bogendoerfer wrote:
Linux doesn't own the memory immediately after the kernel image. On Octeon, the bootloader places a shared structure right after the kernel's _end; see "struct cvmx_bootinfo *octeon_bootinfo" in cavium-octeon/setup.c.
If check_kernel_sections_mem() rounds the PFNs up, the first memblock_alloc() inside early_init_dt_alloc_memory_arch() (reached from device_tree_init()) returns a memory block that overlaps the octeon_bootinfo structure above, which is then overwritten.
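To make the overlap concrete, here is a minimal userspace sketch of the rounding arithmetic, not the kernel code itself. It assumes 4 KiB pages; the helper names pfn_up(), pfn_down() and overlaps_kernel_tail() are made up for this illustration, and any addresses passed in are hypothetical. It checks whether a structure placed after _end falls into the part of the last kernel page that rounding up would hand to the allocator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Page frame number of the first page boundary at or above addr. */
static uint64_t pfn_up(uint64_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Page frame number of the page containing addr. */
static uint64_t pfn_down(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/*
 * True if [start, start + size) intersects the tail of the last kernel
 * page, i.e. [kernel_end, next page boundary). That tail becomes
 * allocatable when the kernel range is rounded up to whole pages but
 * only the bytes up to kernel_end are reserved.
 */
static bool overlaps_kernel_tail(uint64_t kernel_end, uint64_t start,
				 uint64_t size)
{
	uint64_t tail_end = pfn_up(kernel_end) << SKETCH_PAGE_SHIFT;

	return start < tail_end && start + size > kernel_end;
}
```

With a hypothetical _end at 0x801f2340, a 0x100-byte structure at 0x801f2400 lies inside the tail, while one starting at the next page boundary, 0x801f3000, does not.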
As this is special to Octeon, how about adding the memblock_reserve() in Octeon-specific code?
While the shared structure being corrupted is indeed Octeon-specific, the wrong assumption, namely that the memory right after the kernel can be handed out by the memblock allocator and re-used elsewhere in Linux, lives in the MIPS-generic check_kernel_sections_mem().
OK, I see your point. IMHO this whole check_kernel_sections_mem() should be removed. IMHO memory adding should only be done by memory detection code.
Could you send a patch which removes check_kernel_sections_mem() completely?
This will expose one issue: platforms usually do it in a sane way, as has been done for the last 15 years, namely they add the kernel image without the partial pages at its boundaries. As a result, request_resource() will fail at least for the kernel's .bss section, which will then not be displayed properly under /proc/iomem (and the problem that originally motivated check_kernel_sections_mem() will probably reappear).
As I understand it, the issue is that the memblock API internally operates at page granularity (there are many ROUND_DOWN() calls on the size or upper boundary), so for request_resource() to succeed one has to claim the rest of the last .bss page. And with the current memblock API, a memblock_reserve() must appear somewhere, be it in arch or platform code.
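As a rough illustration of "claiming the rest of the last .bss page", the sketch below computes the range such a reservation would have to cover. It is plain userspace C and not the actual patch; the 4 KiB page size, the struct and the function name are assumptions made up for this example.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE ((uint64_t)4096)

/* Hypothetical description of the range that would be reserved. */
struct tail_range {
	uint64_t start;	/* first byte after the kernel image */
	uint64_t size;	/* bytes up to the next page boundary */
};

/*
 * Compute the remainder of the page containing kernel_end.
 * The size is 0 when kernel_end is already page aligned, in which
 * case there is nothing left over to reserve.
 */
static struct tail_range kernel_tail_range(uint64_t kernel_end)
{
	uint64_t aligned = (kernel_end + SKETCH_PAGE_SIZE - 1) &
			   ~(SKETCH_PAGE_SIZE - 1);
	struct tail_range r = { kernel_end, aligned - kernel_end };

	return r;
}
```

In kernel terms, the resulting start and size are what a memblock_reserve() call would be given, wherever that call ends up living.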