When a PAGEMAP_SCAN ioctl invoked with vec_len = 0 reaches pagemap_scan_backout_range(), the kernel panics with a null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(), that page_region is always allocated for p->vec_buf_index.
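The allocation behaviour described above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: the `*_model` structures and functions are hypothetical stand-ins, and calloc() replaces the kernel allocator. It shows that a request with vec_len = 0 succeeds while leaving vec_buf NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * discussed in the commit message are modelled. */
struct page_region_model {
	unsigned long start;
	unsigned long end;
};

struct scan_private_model {
	size_t vec_len;                    /* output length requested by the caller */
	size_t vec_buf_len;
	size_t vec_buf_index;
	struct page_region_model *vec_buf; /* stays NULL when vec_len == 0 */
};

/* Models the early return: with vec_len == 0 nothing is allocated,
 * yet the function still reports success. */
static int init_bounce_buffer_model(struct scan_private_model *p)
{
	assert(p != NULL);

	p->vec_buf = NULL;
	p->vec_buf_len = 0;
	p->vec_buf_index = 0;

	if (p->vec_len == 0)
		return 0;

	p->vec_buf = calloc(p->vec_len, sizeof(*p->vec_buf));
	if (!p->vec_buf)
		return -1;

	p->vec_buf_len = p->vec_len;
	return 0;
}

/* Releases the buffer, if any; free(NULL) is a no-op. */
static void free_bounce_buffer_model(struct scan_private_model *p)
{
	free(p->vec_buf);
	p->vec_buf = NULL;
}
```

Any later code that indexes vec_buf without first checking it is therefore only safe for callers that asked for output.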
Fix it by explicitly checking cur_buf for NULL before dereferencing.
Other sites that might run into the same dereference issue are already (directly or transitively) protected by checking p->vec_buf.
Note: According to the PAGEMAP_SCAN man page, vec_len = 0 appears to be valid when no output is requested and the caller is only interested in the side effects, hence it passes the check in pagemap_scan_get_args().
This issue was found by syzkaller.
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Jinjiang Tu <tujinjiang@huawei.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Penglei Jiang <superman.xpt@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
linux-kernel@vger.kernel.org
linux-fsdevel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
---
 fs/proc/task_mmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 29cca0e6d0ff..8c10a8135e74 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
 {
 	struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
 
+	if (!cur_buf)
+		return;
+
 	if (cur_buf->start != addr)
 		cur_buf->end = addr;
 	else
On Fri, Sep 19, 2025 at 7:21 AM Jakub Acs acsjakub@amazon.de wrote:
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 29cca0e6d0ff..8c10a8135e74 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
>  {
>  	struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
>
> +	if (!cur_buf)
I think it is better to check !p->vec_buf. I know that vec_buf_index is always 0 in this case, so there is no functional difference, but checking !p->vec_buf is more readable and obvious.
Thanks,
Andrei
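To make the review comment concrete, here is a minimal userspace sketch of the two checks. It is not the posted patch: the `*_model` names are hypothetical, and the address is computed on integers (assuming the usual platforms where a null pointer converts to 0) so the model never does arithmetic on a null pointer. It shows that the address of the indexed element is zero only while the index is 0, whereas testing the base pointer does not depend on the index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; only the fields visible in the diff are modelled. */
struct page_region_model {
	unsigned long start;
	unsigned long end;
};

struct scan_private_model {
	struct page_region_model *vec_buf;
	size_t vec_buf_index;
};

/* Address that &vec_buf[vec_buf_index] would evaluate to. With a NULL
 * base it is 0 only when the index is 0, which is the invariant the
 * cur_buf check relies on. */
static uintptr_t cur_buf_address_model(const struct scan_private_model *p)
{
	return (uintptr_t)p->vec_buf +
	       p->vec_buf_index * sizeof(struct page_region_model);
}

/* Backout modelled with the check on the base pointer, as suggested
 * in the reply. Only the branch shown in the diff context is modelled. */
static void backout_range_model(struct scan_private_model *p,
				unsigned long addr)
{
	struct page_region_model *cur_buf;

	if (!p->vec_buf)
		return;		/* no output buffer: nothing to back out */

	cur_buf = &p->vec_buf[p->vec_buf_index];

	if (cur_buf->start != addr)
		cur_buf->end = addr;
	else
		cur_buf->start = cur_buf->end = 0;
}
```

In this model both checks behave the same while vec_buf_index is 0, which matches the statement in the reply that there is no functional difference; the base-pointer check simply states the condition directly.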