On 28/07/2025 12:31, Dev Jain wrote:
On 28/07/25 4:43 pm, Ryan Roberts wrote:
On 28/07/2025 11:31, Dev Jain wrote:
Memory hotunplug is done under the hotplug lock and ptdump walk is done under the init_mm.mmap_lock. Therefore, ptdump and hotunplug can run simultaneously without any synchronization. During hotunplug, free_empty_tables() is ultimately called to free up the pagetables. The following race can happen, where x denotes the level of the pagetable:
CPU1                                    CPU2
free_empty_pxd_table
                                        ptdump_walk_pgd()
                                        Get p(x+1)d table from pxd entry
pxd_clear
free_hotplug_pgtable_page(p(x+1)dp)
                                        Still using the p(x+1)d table
which leads to a use-after-free.
I'm not sure I understand this. ptdump_show() protects against this with get_online_mems()/put_online_mems(), doesn't it? There are 2 paths that call ptdump_walk_pgd(). This protects one of them. The other is ptdump_check_wx(); I thought you (or Anshuman?) had a patch in flight to fix that with [get|put]_online_mems() too?
Sorry if my memory is failing me here...
Nope, I think I just had a use-after-free in my memory so I came up with this patch :) Because of the recent work with ptdump, I was so concentrated on ptdump_walk_pgd() that I didn't even bother looking up the call chain. And I even forgot we had these [get|put]_online_mems() patches recently.
I just checked; Anshuman's fix is in mm-stable, so I guess it'll be in v6.17-rc1.
That's the patch: https://lore.kernel.org/linux-arm-kernel/20250620052427.2092093-1-anshuman.k...
Sorry for the noise, it must have been incredibly confusing to see this patch :(