The patch titled Subject: zram: fix panic when using ext4 over zram has been added to the -mm mm-hotfixes-unstable branch. Its filename is zram-panic-when-use-ext4-over-zram.patch
This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches...
This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days
------------------------------------------------------ From: caiqingfu caiqingfu@ruijie.com.cn Subject: zram: fix panic when using ext4 over zram Date: Fri, 29 Nov 2024 19:57:35 +0800
[ 52.073080 ] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 52.073511 ] Modules linked in: [ 52.074094 ] CPU: 0 UID: 0 PID: 3825 Comm: a.out Not tainted 6.12.0-07749-g28eb75e178d3-dirty #3 [ 52.074672 ] Hardware name: linux,dummy-virt (DT) [ 52.075128 ] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 52.075619 ] pc : obj_malloc+0x5c/0x160 [ 52.076402 ] lr : zs_malloc+0x200/0x570 [ 52.076630 ] sp : ffff80008dd335f0 [ 52.076797 ] x29: ffff80008dd335f0 x28: ffff000004104a00 x27: ffff000004dfc400 [ 52.077319 ] x26: 000000000000ca18 x25: ffff00003fcaf0e0 x24: ffff000006925cf0 [ 52.077785 ] x23: 0000000000000c0a x22: ffff0000032ee780 x21: ffff000006925cf0 [ 52.078257 ] x20: 0000000000088000 x19: 0000000000000000 x18: 0000000000fffc18 [ 52.078701 ] x17: 00000000fffffffd x16: 0000000000000803 x15: 00000000fffffffe [ 52.079203 ] x14: 000000001824429d x13: ffff000006e84000 x12: ffff000006e83fec [ 52.079711 ] x11: ffff000006e83000 x10: 00000000000002a5 x9 : ffff000006e83ff3 [ 52.080269 ] x8 : 0000000000000001 x7 : 0000000017e80000 x6 : 0000000000017e80 [ 52.080724 ] x5 : 0000000000000003 x4 : ffff00000402a5e8 x3 : 0000000000000066 [ 52.081081 ] x2 : ffff000006925cf0 x1 : ffff00000402a5e8 x0 : ffff000004104a00 [ 52.081595 ] Call trace: [ 52.081925 ] obj_malloc+0x5c/0x160 (P) [ 52.082220 ] zs_malloc+0x200/0x570 (L) [ 52.082504 ] zs_malloc+0x200/0x570 [ 52.082716 ] zram_submit_bio+0x788/0x9e8 [ 52.083017 ] __submit_bio+0x1c4/0x338 [ 52.083343 ] submit_bio_noacct_nocheck+0x128/0x2c0 [ 52.083518 ] submit_bio_noacct+0x1c8/0x308 [ 52.083722 ] submit_bio+0xa8/0x14c [ 52.083942 ] submit_bh_wbc+0x140/0x1bc [ 52.084088 ] __block_write_full_folio+0x23c/0x5f0 [ 52.084232 ] block_write_full_folio+0x134/0x21c [ 52.084524 ] write_cache_pages+0x64/0xd4 [ 52.084778 ] blkdev_writepages+0x50/0x8c [ 52.085040 ] do_writepages+0x80/0x2b0 [ 52.085292 ] filemap_fdatawrite_wbc+0x6c/0x90 [ 52.085597 ] __filemap_fdatawrite_range+0x64/0x94 [ 52.085900 ] filemap_fdatawrite+0x1c/0x28 [ 52.086158 ] sync_bdevs+0x170/0x17c [ 52.086374 ] ksys_sync+0x6c/0xb8 [ 52.086597 ] __arm64_sys_sync+0x10/0x20 [ 52.086847 ] invoke_syscall+0x44/0x100 [ 52.087230 ] el0_svc_common.constprop.0+0x40/0xe0 [ 52.087550 ] do_el0_svc+0x1c/0x28 [ 52.087690 ] el0_svc+0x30/0xd0 [ 52.087818 ] el0t_64_sync_handler+0xc8/0xcc [ 52.088046 ] el0t_64_sync+0x198/0x19c [ 52.088500 ] Code: 110004a5 6b0500df f9401273 54000160 (f9401664) [ 52.089097 ] ---[ end trace 0000000000000000 ]---
When using ext4 on zram, the following panic occasionally occurs under high memory usage
The reason is that when the handle is obtained using the slow path, it will be re-compressed. If the data in the page changes, the compressed length may exceed the previous one. Overflow occurred when writing to zs_object, which then caused the panic.
Comment the fast path and force the slow path. Adding a large number of read and write file systems can quickly reproduce it.
The solution is to re-obtain the handle after re-compression if the length is different from the previous one.
Link: https://lkml.kernel.org/r/20241129115735.136033-1-baicaiaichibaicai@gmail.co... Signed-off-by: caiqingfu caiqingfu@ruijie.com.cn Cc: Minchan Kim minchan@kernel.org Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org ---
drivers/block/zram/zram_drv.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/block/zram/zram_drv.c~zram-panic-when-use-ext4-over-zram +++ a/drivers/block/zram/zram_drv.c @@ -1633,6 +1633,7 @@ static int zram_write_page(struct zram * unsigned long alloced_pages; unsigned long handle = -ENOMEM; unsigned int comp_len = 0; + unsigned int last_comp_len = 0; void *src, *dst, *mem; struct zcomp_strm *zstrm; unsigned long element = 0; @@ -1664,6 +1665,11 @@ compress_again:
if (comp_len >= huge_class_size) comp_len = PAGE_SIZE; + + if (last_comp_len && (last_comp_len != comp_len)) { + zs_free(zram->mem_pool, handle); + handle = (unsigned long)ERR_PTR(-ENOMEM); + } /* * handle allocation has 2 paths: * a) fast path is executed with preemption disabled (for @@ -1692,8 +1698,10 @@ compress_again: if (IS_ERR_VALUE(handle)) return PTR_ERR((void *)handle);
- if (comp_len != PAGE_SIZE) + if (comp_len != PAGE_SIZE) { + last_comp_len = comp_len; goto compress_again; + } /* * If the page is not compressible, you need to acquire the * lock and execute the code below. The zcomp_stream_get() _
Patches currently in -mm which might be from caiqingfu@ruijie.com.cn are
zram-panic-when-use-ext4-over-zram.patch
On (24/11/29 19:04), Andrew Morton wrote: [..]
When using ext4 on zram, the following panic occasionally occurs under high memory usage
The reason is that when the handle is obtained using the slow path, it will be re-compressed. If the data in the page changes, the compressed length may exceed the previous one. Overflow occurred when writing to zs_object, which then caused the panic.
Comment the fast path and force the slow path. Adding a large number of read and write file systems can quickly reproduce it.
The solution is to re-obtain the handle after re-compression if the length is different from the previous one.
Link: https://lkml.kernel.org/r/20241129115735.136033-1-baicaiaichibaicai@gmail.co... Signed-off-by: caiqingfu caiqingfu@ruijie.com.cn Cc: Minchan Kim minchan@kernel.org Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org
Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org
A side note: Well, we should not crash the kernel or corrupt memory, in this regard the patch makes sense, however, if data is being modified concurrently with zram write() then zram simply should not be used, we can't decompress data that is unstable at the time of compression.
On (24/11/29 19:04), Andrew Morton wrote:
--- a/drivers/block/zram/zram_drv.c~zram-panic-when-use-ext4-over-zram +++ a/drivers/block/zram/zram_drv.c @@ -1633,6 +1633,7 @@ static int zram_write_page(struct zram * unsigned long alloced_pages; unsigned long handle = -ENOMEM; unsigned int comp_len = 0;
- unsigned int last_comp_len = 0; void *src, *dst, *mem; struct zcomp_strm *zstrm; unsigned long element = 0;
@@ -1664,6 +1665,11 @@ compress_again: if (comp_len >= huge_class_size) comp_len = PAGE_SIZE;
- if (last_comp_len && (last_comp_len != comp_len)) {
zs_free(zram->mem_pool, handle);
handle = (unsigned long)ERR_PTR(-ENOMEM);
- }
I don't think this needs to be this complex. A simple -ENOMEM should work, that's the default handle value in this function.
Andrew, can you please fold this in? (Or we can just ask for v2)
---
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 52a005face62..5de7f30d0aa6 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -1674,7 +1674,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
if (last_comp_len && (last_comp_len != comp_len)) { zs_free(zram->mem_pool, handle); - handle = (unsigned long)ERR_PTR(-ENOMEM); + handle = -ENOMEM; } /* * handle allocation has 2 paths:
On (24/11/29 19:04), Andrew Morton wrote:
[ 52.073080 ] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 52.073511 ] Modules linked in: [ 52.074094 ] CPU: 0 UID: 0 PID: 3825 Comm: a.out Not tainted 6.12.0-07749-g28eb75e178d3-dirty #3 [ 52.074672 ] Hardware name: linux,dummy-virt (DT) [ 52.075128 ] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 52.075619 ] pc : obj_malloc+0x5c/0x160 [ 52.076402 ] lr : zs_malloc+0x200/0x570 [ 52.076630 ] sp : ffff80008dd335f0 [ 52.076797 ] x29: ffff80008dd335f0 x28: ffff000004104a00 x27: ffff000004dfc400 [ 52.077319 ] x26: 000000000000ca18 x25: ffff00003fcaf0e0 x24: ffff000006925cf0 [ 52.077785 ] x23: 0000000000000c0a x22: ffff0000032ee780 x21: ffff000006925cf0 [ 52.078257 ] x20: 0000000000088000 x19: 0000000000000000 x18: 0000000000fffc18 [ 52.078701 ] x17: 00000000fffffffd x16: 0000000000000803 x15: 00000000fffffffe [ 52.079203 ] x14: 000000001824429d x13: ffff000006e84000 x12: ffff000006e83fec [ 52.079711 ] x11: ffff000006e83000 x10: 00000000000002a5 x9 : ffff000006e83ff3 [ 52.080269 ] x8 : 0000000000000001 x7 : 0000000017e80000 x6 : 0000000000017e80 [ 52.080724 ] x5 : 0000000000000003 x4 : ffff00000402a5e8 x3 : 0000000000000066 [ 52.081081 ] x2 : ffff000006925cf0 x1 : ffff00000402a5e8 x0 : ffff000004104a00 [ 52.081595 ] Call trace: [ 52.081925 ] obj_malloc+0x5c/0x160 (P) [ 52.082220 ] zs_malloc+0x200/0x570 (L) [ 52.082504 ] zs_malloc+0x200/0x570 [ 52.082716 ] zram_submit_bio+0x788/0x9e8 [ 52.083017 ] __submit_bio+0x1c4/0x338 [ 52.083343 ] submit_bio_noacct_nocheck+0x128/0x2c0 [ 52.083518 ] submit_bio_noacct+0x1c8/0x308 [ 52.083722 ] submit_bio+0xa8/0x14c [ 52.083942 ] submit_bh_wbc+0x140/0x1bc [ 52.084088 ] __block_write_full_folio+0x23c/0x5f0 [ 52.084232 ] block_write_full_folio+0x134/0x21c [ 52.084524 ] write_cache_pages+0x64/0xd4 [ 52.084778 ] blkdev_writepages+0x50/0x8c [ 52.085040 ] do_writepages+0x80/0x2b0 [ 52.085292 ] filemap_fdatawrite_wbc+0x6c/0x90 [ 52.085597 ] __filemap_fdatawrite_range+0x64/0x94 [ 52.085900 ] filemap_fdatawrite+0x1c/0x28 [ 52.086158 ] sync_bdevs+0x170/0x17c [ 52.086374 ] ksys_sync+0x6c/0xb8 [ 52.086597 ] __arm64_sys_sync+0x10/0x20 [ 52.086847 ] invoke_syscall+0x44/0x100 [ 52.087230 ] el0_svc_common.constprop.0+0x40/0xe0 [ 52.087550 ] do_el0_svc+0x1c/0x28 [ 52.087690 ] el0_svc+0x30/0xd0 [ 52.087818 ] el0t_64_sync_handler+0xc8/0xcc [ 52.088046 ] el0t_64_sync+0x198/0x19c [ 52.088500 ] Code: 110004a5 6b0500df f9401273 54000160 (f9401664) [ 52.089097 ] ---[ end trace 0000000000000000 ]---
When using ext4 on zram, the following panic occasionally occurs under high memory usage
The reason is that when the handle is obtained using the slow path, it will be re-compressed. If the data in the page changes, the compressed length may exceed the previous one. Overflow occurred when writing to zs_object, which then caused the panic.
Comment the fast path and force the slow path. Adding a large number of read and write file systems can quickly reproduce it.
The solution is to re-obtain the handle after re-compression if the length is different from the previous one.
Andrew, I'm leaning toward asking you to drop this patch. zram cannot (nor should probably) do anything about upper layer modifying the page data during write(). It's a bug in the upper layer which zram should not hide.
On Thu, 12 Dec 2024 17:40:24 +0900 Sergey Senozhatsky senozhatsky@chromium.org wrote:
The reason is that when the handle is obtained using the slow path, it will be re-compressed. If the data in the page changes, the compressed length may exceed the previous one. Overflow occurred when writing to zs_object, which then caused the panic.
Comment the fast path and force the slow path. Adding a large number of read and write file systems can quickly reproduce it.
The solution is to re-obtain the handle after re-compression if the length is different from the previous one.
Andrew, I'm leaning toward asking you to drop this patch. zram cannot (nor should probably) do anything about upper layer modifying the page data during write(). It's a bug in the upper layer which zram should not hide.
No probs, I already have a "don't do anything with this" note so I'm awaiting resolution.
I often hold onto "wrong" fixes as a reminder that a "right" fix is needed. Rather a weird bug tracking system, but it works for me ;)
linux-stable-mirror@lists.linaro.org