On 2024-04-18 Björn Töpel wrote:
Nam Cao namcao@linutronix.de writes:
On 2024-04-18 Nam Cao wrote:
There is nothing preventing kernel memory allocators from allocating a page that overlaps with PTR_ERR(), except for architecture-specific code that setup memblock.
It was discovered that RISCV architecture doesn't setup memblock corectly, leading to a page overlapping with PTR_ERR() being allocated, and subsequently crashing the kernel (link in Close: )
The reported crash has nothing to do with PTR_ERR(): the last page (at address 0xfffff000) being allocated leads to an unexpected arithmetic overflow in ext4; but still, this page shouldn't be allocated in the first place.
Because PTR_ERR() is an architecture-independent thing, we shouldn't ask every single architecture to set this up. There may be other architectures beside RISCV that have the same problem.
Fix this one and for all by reserving the physical memory page that may be mapped to the last virtual memory page as part of low memory.
Unfortunately, this means if there is actual memory at this reserved location, that memory will become inaccessible. However, if this page is not reserved, it can only be accessed as high memory, so this doesn't matter if high memory is not supported. Even if high memory is supported, it is still only one page.
Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong.... Signed-off-by: Nam Cao namcao@linutronix.de Cc: stable@vger.kernel.org # all versions
Sorry, forgot to add: Reported-by: Björn Töpel bjorn@kernel.org
Hmm, can't we get rid of the whole check in arch/riscv/mm/init.c for 32b?
We can, but that depends on this patch. So my intention is to wait for this patch to be applied first, because I don't want to bother the maintainers with dependencies.
--8<-- diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index fe8e159394d8..1e91d5728887 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -196,7 +196,6 @@ early_param("mem", early_mem); static void __init setup_bootmem(void) { phys_addr_t vmlinux_end = __pa_symbol(&_end);
- phys_addr_t max_mapped_addr; phys_addr_t phys_ram_end, vmlinux_start;
if (IS_ENABLED(CONFIG_XIP_KERNEL)) @@ -234,21 +233,6 @@ static void __init setup_bootmem(void) if (IS_ENABLED(CONFIG_64BIT)) kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;
- /*
* memblock allocator is not aware of the fact that last 4K bytes of
* the addressable memory can not be mapped because of IS_ERR_VALUE
* macro. Make sure that last 4k bytes are not usable by memblock
* if end of dram is equal to maximum addressable memory. For 64-bit
* kernel, this problem can't happen here as the end of the virtual
* address space is occupied by the kernel mapping then this check must
* be done as soon as the kernel mapping base address is determined.
*/
- if (!IS_ENABLED(CONFIG_64BIT)) {
max_mapped_addr = __pa(~(ulong)0);
if (max_mapped_addr == (phys_ram_end - 1))
memblock_set_current_limit(max_mapped_addr - 4096);
- }
If you are going to send this, you can add: Reviewed-by: Nam Cao namcao@linutronix.de
min_low_pfn = PFN_UP(phys_ram_base); max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end); high_memory = (void *)(__va(PFN_PHYS(max_low_pfn))); --8<--
Mike hints that's *not* the case (https://lore.kernel.org/linux-riscv/ZiAkRMUfiPDUGPdL@kernel.org/). memblock_reserve() should disallow allocation as well, no?
He said it can't be removed if we set max_low_pfn instead of using memblock_reserve()
If max_low_pfn() is used, then it can be removed: https://lore.kernel.org/linux-riscv/Zh6n-nvnQbL-0xss@kernel.org
Best regards, Nam