From: Genjian Zhang zhanggenjian@kylinos.cn
Hello!
We found that 2035c770bfdb ("loop: Check for overflow while configuring loop") lost a unlock loop_ctl_mutex in loop_get_status(...). which caused syzbot to report a UAF issue. However, the upstream patch does not have this issue. So, we revert this patch and directly apply the unmodified upstream patch.
Risk use-after-free as reported by syzbot:
[ 174.437352] BUG: KASAN: use-after-free in __mutex_lock.isra.10+0xbc4/0xc30 [ 174.437772] Read of size 4 at addr ffff8880bac49ab8 by task syz-executor.0/13897 [ 174.438205] [ 174.438306] CPU: 1 PID: 13897 Comm: syz-executor.0 Not tainted 4.19.306 #1 [ 174.438712] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1kylin1 04/01/2014 [ 174.439236] Call Trace: [ 174.439392] dump_stack+0x94/0xc7 [ 174.439596] ? __mutex_lock.isra.10+0xbc4/0xc30 [ 174.439881] print_address_description+0x60/0x229 [ 174.440165] ? __mutex_lock.isra.10+0xbc4/0xc30 [ 174.440436] kasan_report.cold.6+0x241/0x2fd [ 174.440696] __mutex_lock.isra.10+0xbc4/0xc30 [ 174.440959] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1 [ 174.441272] ? mutex_trylock+0xa0/0xa0 [ 174.441500] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1 [ 174.441816] ? kobject_get_unless_zero+0x129/0x1c0 [ 174.442106] ? kset_unregister+0x30/0x30 [ 174.442351] ? find_symbol_in_section+0x310/0x310 [ 174.442634] ? __mutex_lock_slowpath+0x10/0x10 [ 174.442901] mutex_lock_killable+0xb0/0xf0 [ 174.443149] ? __mutex_lock_killable_slowpath+0x10/0x10 [ 174.443465] ? __mutex_lock_slowpath+0x10/0x10 [ 174.443732] ? _cond_resched+0x10/0x20 [ 174.443966] ? kobject_get+0x54/0xa0 [ 174.444190] lo_open+0x16/0xc0 [ 174.444382] __blkdev_get+0x273/0x10f0 [ 174.444612] ? lo_fallocate.isra.20+0x150/0x150 [ 174.444886] ? bdev_disk_changed+0x190/0x190 [ 174.445146] ? path_init+0x1030/0x1030 [ 174.445371] ? do_syscall_64+0x9a/0x2d0 [ 174.445608] ? deref_stack_reg+0xab/0xe0 [ 174.445852] blkdev_get+0x97/0x880 [ 174.446061] ? walk_component+0x297/0xdc0 [ 174.446303] ? __blkdev_get+0x10f0/0x10f0 [ 174.446547] ? __fsnotify_inode_delete+0x20/0x20 [ 174.446822] blkdev_open+0x1bd/0x240 [ 174.447040] do_dentry_open+0x448/0xf80 [ 174.447274] ? blkdev_get_by_dev+0x60/0x60 [ 174.447522] ? __x64_sys_fchdir+0x1a0/0x1a0 [ 174.447775] ? inode_permission+0x86/0x320 [ 174.448022] path_openat+0xa83/0x3ed0 [ 174.448248] ? path_mountpoint+0xb50/0xb50 [ 174.448495] ? kasan_kmalloc+0xbf/0xe0 [ 174.448723] ? kmem_cache_alloc+0xbc/0x1b0 [ 174.448971] ? getname_flags+0xc4/0x560 [ 174.449203] ? do_sys_open+0x1ce/0x3f0 [ 174.449432] ? do_syscall_64+0x9a/0x2d0 [ 174.449706] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1 [ 174.450022] ? __d_alloc+0x2a/0xa50 [ 174.450232] ? kasan_unpoison_shadow+0x30/0x40 [ 174.450510] ? should_fail+0x117/0x6c0 [ 174.450737] ? timespec64_trunc+0xc1/0x150 [ 174.450986] ? inode_init_owner+0x2e0/0x2e0 [ 174.451237] ? timespec64_trunc+0xc1/0x150 [ 174.451484] ? inode_init_owner+0x2e0/0x2e0 [ 174.451736] do_filp_open+0x197/0x270 [ 174.451959] ? may_open_dev+0xd0/0xd0 [ 174.452182] ? kasan_unpoison_shadow+0x30/0x40 [ 174.452448] ? kasan_kmalloc+0xbf/0xe0 [ 174.452672] ? __alloc_fd+0x1a3/0x4b0 [ 174.452895] do_sys_open+0x2c7/0x3f0 [ 174.453114] ? filp_open+0x60/0x60 [ 174.453320] do_syscall_64+0x9a/0x2d0 [ 174.453541] ? prepare_exit_to_usermode+0xf3/0x170 [ 174.453832] entry_SYSCALL_64_after_hwframe+0x5c/0xc1 [ 174.454136] RIP: 0033:0x41edee [ 174.454321] Code: 25 00 00 41 00 3d 00 00 41 00 74 48 48 c7 c0 a4 af 0b 01 8b 00 85 c0 75 69 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a6 00 00 00 48 8b 4c 24 28 64 48 33 0c5 [ 174.455404] RSP: 002b:00007ffd2501fbd0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 [ 174.455854] RAX: ffffffffffffffda RBX: 00007ffd2501fc90 RCX: 000000000041edee [ 174.456273] RDX: 0000000000000002 RSI: 00007ffd2501fcd0 RDI: 00000000ffffff9c [ 174.456698] RBP: 0000000000000003 R08: 0000000000000001 R09: 00007ffd2501f9a7 [ 174.457116] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003 [ 174.457535] R13: 0000000000565e48 R14: 00007ffd2501fcd0 R15: 0000000000400510 [ 174.457955] [ 174.458052] Allocated by task 945: [ 174.458261] kasan_kmalloc+0xbf/0xe0 [ 174.458478] kmem_cache_alloc_node+0xb4/0x1d0 [ 174.458743] copy_process.part.57+0x14b0/0x7010 [ 174.459017] _do_fork+0x197/0x980 [ 174.459218] kernel_thread+0x2f/0x40 [ 174.459438] call_usermodehelper_exec_work+0xa8/0x240 [ 174.459742] process_one_work+0x933/0x13b0 [ 174.459986] worker_thread+0x8c/0x1000 [ 174.460212] kthread+0x343/0x410 [ 174.460408] ret_from_fork+0x35/0x40 [ 174.460621] [ 174.460716] Freed by task 22902: [ 174.460913] __kasan_slab_free+0x125/0x170 [ 174.461159] kmem_cache_free+0x6e/0x1b0 [ 174.461391] __put_task_struct+0x1c4/0x440 [ 174.461636] delayed_put_task_struct+0x135/0x170 [ 174.461915] rcu_process_callbacks+0x578/0x15c0 [ 174.462184] __do_softirq+0x175/0x60e [ 174.462403] [ 174.462501] The buggy address belongs to the object at ffff8880bac49a80 [ 174.462501] which belongs to the cache task_struct of size 3264 [ 174.463235] The buggy address is located 56 bytes inside of [ 174.463235] 3264-byte region [ffff8880bac49a80, ffff8880bac4a740) [ 174.463923] The buggy address belongs to the page: [ 174.464210] page:ffffea0002eb1200 count:1 mapcount:0 mapping:ffff888188ca0a00 index:0x0 compound_mapcount: 0 [ 174.464784] flags: 0x100000000008100(slab|head) [ 174.465079] raw: 0100000000008100 ffffea0002eaa400 0000000400000004 ffff888188ca0a00 [ 174.465533] raw: 0000000000000000 0000000000090009 00000001ffffffff 0000000000000000 [ 174.465988] page dumped because: kasan: bad access detected [ 174.466321] [ 174.466322] Memory state around the buggy address: [ 174.466325] ffff8880bac49980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.466327] ffff8880bac49a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 174.466329] >ffff8880bac49a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.466329] ^ [ 174.466331] ffff8880bac49b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.466333] ffff8880bac49b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.466333] ================================================================== [ 174.466338] Disabling lock debugging due to kernel taint
Best regards Genjian
Genjian Zhang (1): Revert "loop: Check for overflow while configuring loop"
Holger Hoffstätte (1): loop: properly observe rotational flag of underlying device
Martijn Coenen (5): loop: Call loop_config_discard() only after new config is applied loop: Remove sector_t truncation checks loop: Factor out setting loop device size loop: Refactor loop_set_status() size calculation loop: Factor out configuring loop from status
Siddh Raman Pant (1): loop: Check for overflow while configuring loop
Zhong Jinghua (1): loop: loop_set_status_from_info() check before assignment
drivers/block/loop.c | 204 ++++++++++++++++++++++++++----------------- 1 file changed, 123 insertions(+), 81 deletions(-)
From: Genjian Zhang zhanggenjian@kylinos.cn
This reverts commit 2035c770bfdbcc82bd52e05871a7c82db9529e0f.
This patch lost a unlock loop_ctl_mutex in loop_get_status(...), which caused syzbot to report a UAF issue.The upstream patch does not have this issue. Therefore, we revert this patch and directly apply the upstream patch later on.
Risk use-after-free as reported by syzbot:
[ 174.437352] BUG: KASAN: use-after-free in __mutex_lock.isra.10+0xbc4/0xc30 [ 174.437772] Read of size 4 at addr ffff8880bac49ab8 by task syz-executor.0/13897 [ 174.438205] [ 174.438306] CPU: 1 PID: 13897 Comm: syz-executor.0 Not tainted 4.19.306 #1 [ 174.438712] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1kylin1 04/01/2014 [ 174.439236] Call Trace: [ 174.439392] dump_stack+0x94/0xc7 [ 174.439596] ? __mutex_lock.isra.10+0xbc4/0xc30 [ 174.439881] print_address_description+0x60/0x229 [ 174.440165] ? __mutex_lock.isra.10+0xbc4/0xc30 [ 174.440436] kasan_report.cold.6+0x241/0x2fd [ 174.440696] __mutex_lock.isra.10+0xbc4/0xc30 [ 174.440959] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1 [ 174.441272] ? mutex_trylock+0xa0/0xa0 [ 174.441500] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1 [ 174.441816] ? kobject_get_unless_zero+0x129/0x1c0 [ 174.442106] ? kset_unregister+0x30/0x30 [ 174.442351] ? find_symbol_in_section+0x310/0x310 [ 174.442634] ? __mutex_lock_slowpath+0x10/0x10 [ 174.442901] mutex_lock_killable+0xb0/0xf0 [ 174.443149] ? __mutex_lock_killable_slowpath+0x10/0x10 [ 174.443465] ? __mutex_lock_slowpath+0x10/0x10 [ 174.443732] ? _cond_resched+0x10/0x20 [ 174.443966] ? kobject_get+0x54/0xa0 [ 174.444190] lo_open+0x16/0xc0 [ 174.444382] __blkdev_get+0x273/0x10f0 [ 174.444612] ? lo_fallocate.isra.20+0x150/0x150 [ 174.444886] ? bdev_disk_changed+0x190/0x190 [ 174.445146] ? path_init+0x1030/0x1030 [ 174.445371] ? do_syscall_64+0x9a/0x2d0 [ 174.445608] ? deref_stack_reg+0xab/0xe0 [ 174.445852] blkdev_get+0x97/0x880 [ 174.446061] ? walk_component+0x297/0xdc0 [ 174.446303] ? __blkdev_get+0x10f0/0x10f0 [ 174.446547] ? __fsnotify_inode_delete+0x20/0x20 [ 174.446822] blkdev_open+0x1bd/0x240 [ 174.447040] do_dentry_open+0x448/0xf80 [ 174.447274] ? blkdev_get_by_dev+0x60/0x60 [ 174.447522] ? __x64_sys_fchdir+0x1a0/0x1a0 [ 174.447775] ? inode_permission+0x86/0x320 [ 174.448022] path_openat+0xa83/0x3ed0 [ 174.448248] ? path_mountpoint+0xb50/0xb50 [ 174.448495] ? kasan_kmalloc+0xbf/0xe0 [ 174.448723] ? kmem_cache_alloc+0xbc/0x1b0 [ 174.448971] ? getname_flags+0xc4/0x560 [ 174.449203] ? do_sys_open+0x1ce/0x3f0 [ 174.449432] ? do_syscall_64+0x9a/0x2d0 [ 174.449706] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1 [ 174.450022] ? __d_alloc+0x2a/0xa50 [ 174.450232] ? kasan_unpoison_shadow+0x30/0x40 [ 174.450510] ? should_fail+0x117/0x6c0 [ 174.450737] ? timespec64_trunc+0xc1/0x150 [ 174.450986] ? inode_init_owner+0x2e0/0x2e0 [ 174.451237] ? timespec64_trunc+0xc1/0x150 [ 174.451484] ? inode_init_owner+0x2e0/0x2e0 [ 174.451736] do_filp_open+0x197/0x270 [ 174.451959] ? may_open_dev+0xd0/0xd0 [ 174.452182] ? kasan_unpoison_shadow+0x30/0x40 [ 174.452448] ? kasan_kmalloc+0xbf/0xe0 [ 174.452672] ? __alloc_fd+0x1a3/0x4b0 [ 174.452895] do_sys_open+0x2c7/0x3f0 [ 174.453114] ? filp_open+0x60/0x60 [ 174.453320] do_syscall_64+0x9a/0x2d0 [ 174.453541] ? prepare_exit_to_usermode+0xf3/0x170 [ 174.453832] entry_SYSCALL_64_after_hwframe+0x5c/0xc1 [ 174.454136] RIP: 0033:0x41edee [ 174.454321] Code: 25 00 00 41 00 3d 00 00 41 00 74 48 48 c7 c0 a4 af 0b 01 8b 00 85 c0 75 69 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a6 00 00 00 48 8b 4c 24 28 64 48 33 0c5 [ 174.455404] RSP: 002b:00007ffd2501fbd0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 [ 174.455854] RAX: ffffffffffffffda RBX: 00007ffd2501fc90 RCX: 000000000041edee [ 174.456273] RDX: 0000000000000002 RSI: 00007ffd2501fcd0 RDI: 00000000ffffff9c [ 174.456698] RBP: 0000000000000003 R08: 0000000000000001 R09: 00007ffd2501f9a7 [ 174.457116] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003 [ 174.457535] R13: 0000000000565e48 R14: 00007ffd2501fcd0 R15: 0000000000400510 [ 174.457955] [ 174.458052] Allocated by task 945: [ 174.458261] kasan_kmalloc+0xbf/0xe0 [ 174.458478] kmem_cache_alloc_node+0xb4/0x1d0 [ 174.458743] copy_process.part.57+0x14b0/0x7010 [ 174.459017] _do_fork+0x197/0x980 [ 174.459218] kernel_thread+0x2f/0x40 [ 174.459438] call_usermodehelper_exec_work+0xa8/0x240 [ 174.459742] process_one_work+0x933/0x13b0 [ 174.459986] worker_thread+0x8c/0x1000 [ 174.460212] kthread+0x343/0x410 [ 174.460408] ret_from_fork+0x35/0x40 [ 174.460621] [ 174.460716] Freed by task 22902: [ 174.460913] __kasan_slab_free+0x125/0x170 [ 174.461159] kmem_cache_free+0x6e/0x1b0 [ 174.461391] __put_task_struct+0x1c4/0x440 [ 174.461636] delayed_put_task_struct+0x135/0x170 [ 174.461915] rcu_process_callbacks+0x578/0x15c0 [ 174.462184] __do_softirq+0x175/0x60e [ 174.462403] [ 174.462501] The buggy address belongs to the object at ffff8880bac49a80 [ 174.462501] which belongs to the cache task_struct of size 3264 [ 174.463235] The buggy address is located 56 bytes inside of [ 174.463235] 3264-byte region [ffff8880bac49a80, ffff8880bac4a740) [ 174.463923] The buggy address belongs to the page: [ 174.464210] page:ffffea0002eb1200 count:1 mapcount:0 mapping:ffff888188ca0a00 index:0x0 compound_mapcount: 0 [ 174.464784] flags: 0x100000000008100(slab|head) [ 174.465079] raw: 0100000000008100 ffffea0002eaa400 0000000400000004 ffff888188ca0a00 [ 174.465533] raw: 0000000000000000 0000000000090009 00000001ffffffff 0000000000000000 [ 174.465988] page dumped because: kasan: bad access detected [ 174.466321] [ 174.466322] Memory state around the buggy address: [ 174.466325] ffff8880bac49980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.466327] ffff8880bac49a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 174.466329] >ffff8880bac49a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.466329] ^ [ 174.466331] ffff8880bac49b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.466333] ffff8880bac49b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 174.466333] ================================================================== [ 174.466338] Disabling lock debugging due to kernel taint
Reported-by: k2ci kernel-bot@kylinos.cn Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 2e6c3f658894..52481f1f8d01 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1351,11 +1351,6 @@ loop_get_status(struct loop_device *lo, struct loop_info64 *info) info->lo_number = lo->lo_number; info->lo_offset = lo->lo_offset; info->lo_sizelimit = lo->lo_sizelimit; - - /* loff_t vars have been assigned __u64 */ - if (lo->lo_offset < 0 || lo->lo_sizelimit < 0) - return -EOVERFLOW; - info->lo_flags = lo->lo_flags; memcpy(info->lo_file_name, lo->lo_file_name, LO_NAME_SIZE); memcpy(info->lo_crypt_name, lo->lo_crypt_name, LO_NAME_SIZE);
From: Martijn Coenen maco@android.com
[ Upstream commit 7c5014b0987a30e4989c90633c198aced454c0ec ]
loop_set_status() calls loop_config_discard() to configure discard for the loop device; however, the discard configuration depends on whether the loop device uses encryption, and when we call it the encryption configuration has not been updated yet. Move the call down so we apply the correct discard configuration based on the new configuration.
Signed-off-by: Martijn Coenen maco@android.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Bob Liu bob.liu@oracle.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 52481f1f8d01..bd94406b90c9 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1286,8 +1286,6 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) } }
- loop_config_discard(lo); - memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE); memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE); lo->lo_file_name[LO_NAME_SIZE-1] = 0; @@ -1311,6 +1309,8 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) lo->lo_key_owner = uid; }
+ loop_config_discard(lo); + /* update dio if lo_offset or transfer is changed */ __loop_update_dio(lo, lo->use_dio);
From: Martijn Coenen maco@android.com
[ Upstream commit 083a6a50783ef54256eec3499e6575237e0e3d53 ]
sector_t is now always u64, so we don't need to check for truncation.
Signed-off-by: Martijn Coenen maco@android.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 21 +++++++-------------- 1 file changed, 7 insertions(+), 14 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index bd94406b90c9..281aefba2a6f 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -225,24 +225,20 @@ static void __loop_update_dio(struct loop_device *lo, bool dio) blk_mq_unfreeze_queue(lo->lo_queue); }
-static int +static void figure_loop_size(struct loop_device *lo, loff_t offset, loff_t sizelimit) { loff_t size = get_size(offset, sizelimit, lo->lo_backing_file); - sector_t x = (sector_t)size; struct block_device *bdev = lo->lo_device;
- if (unlikely((loff_t)x != size)) - return -EFBIG; if (lo->lo_offset != offset) lo->lo_offset = offset; if (lo->lo_sizelimit != sizelimit) lo->lo_sizelimit = sizelimit; - set_capacity(lo->lo_disk, x); + set_capacity(lo->lo_disk, size); bd_set_size(bdev, (loff_t)get_capacity(bdev->bd_disk) << 9); /* let user-space know about the new size */ kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE); - return 0; }
static inline int @@ -972,10 +968,8 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode, !file->f_op->write_iter) lo_flags |= LO_FLAGS_READ_ONLY;
- error = -EFBIG; size = get_loop_size(lo, file); - if ((loff_t)(sector_t)size != size) - goto out_unlock; + error = loop_prepare_queue(lo); if (error) goto out_unlock; @@ -1280,10 +1274,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) lo->lo_device->bd_inode->i_mapping->nrpages); goto out_unfreeze; } - if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) { - err = -EFBIG; - goto out_unfreeze; - } + figure_loop_size(lo, info->lo_offset, info->lo_sizelimit); }
memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE); @@ -1486,7 +1477,9 @@ static int loop_set_capacity(struct loop_device *lo) if (unlikely(lo->lo_state != Lo_bound)) return -ENXIO;
- return figure_loop_size(lo, lo->lo_offset, lo->lo_sizelimit); + figure_loop_size(lo, lo->lo_offset, lo->lo_sizelimit); + + return 0; }
static int loop_set_dio(struct loop_device *lo, unsigned long arg)
From: Martijn Coenen maco@android.com
[ Upstream commit 5795b6f5607f7e4db62ddea144727780cb351a9b ]
This code is used repeatedly.
Signed-off-by: Martijn Coenen maco@android.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 281aefba2a6f..6bd07fa3a1fc 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -225,20 +225,35 @@ static void __loop_update_dio(struct loop_device *lo, bool dio) blk_mq_unfreeze_queue(lo->lo_queue); }
+/** + * loop_set_size() - sets device size and notifies userspace + * @lo: struct loop_device to set the size for + * @size: new size of the loop device + * + * Callers must validate that the size passed into this function fits into + * a sector_t, eg using loop_validate_size() + */ +static void loop_set_size(struct loop_device *lo, loff_t size) +{ + struct block_device *bdev = lo->lo_device; + + set_capacity(lo->lo_disk, size); + bd_set_size(bdev, size << SECTOR_SHIFT); + /* let user-space know about the new size */ + kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE); +} + static void figure_loop_size(struct loop_device *lo, loff_t offset, loff_t sizelimit) { loff_t size = get_size(offset, sizelimit, lo->lo_backing_file); - struct block_device *bdev = lo->lo_device;
if (lo->lo_offset != offset) lo->lo_offset = offset; if (lo->lo_sizelimit != sizelimit) lo->lo_sizelimit = sizelimit; - set_capacity(lo->lo_disk, size); - bd_set_size(bdev, (loff_t)get_capacity(bdev->bd_disk) << 9); - /* let user-space know about the new size */ - kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE); + + loop_set_size(lo, size); }
static inline int @@ -992,11 +1007,8 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode, blk_queue_write_cache(lo->lo_queue, true, false);
loop_update_dio(lo); - set_capacity(lo->lo_disk, size); - bd_set_size(bdev, size << 9); loop_sysfs_init(lo); - /* let user-space know about the new size */ - kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE); + loop_set_size(lo, size);
set_blocksize(bdev, S_ISBLK(inode->i_mode) ? block_size(inode->i_bdev) : PAGE_SIZE);
From: Martijn Coenen maco@android.com
[ Upstream commit b0bd158dd630bd47640e0e418c062cda1e0da5ad ]
figure_loop_size() calculates the loop size based on the passed in parameters, but at the same time it updates the offset and sizelimit parameters in the loop device configuration. That is a somewhat unexpected side effect of a function with this name, and it is only only needed by one of the two callers of this function - loop_set_status().
Move the lo_offset and lo_sizelimit assignment back into loop_set_status(), and use the newly factored out functions to validate and apply the newly calculated size. This allows us to get rid of figure_loop_size() in a follow-up commit.
Signed-off-by: Martijn Coenen maco@android.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 37 +++++++++++++++++++------------------ 1 file changed, 19 insertions(+), 18 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 6bd07fa3a1fc..1a6805642ed2 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -248,11 +248,6 @@ figure_loop_size(struct loop_device *lo, loff_t offset, loff_t sizelimit) { loff_t size = get_size(offset, sizelimit, lo->lo_backing_file);
- if (lo->lo_offset != offset) - lo->lo_offset = offset; - if (lo->lo_sizelimit != sizelimit) - lo->lo_sizelimit = sizelimit; - loop_set_size(lo, size); }
@@ -1225,6 +1220,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) kuid_t uid = current_uid(); struct block_device *bdev; bool partscan = false; + bool size_changed = false;
err = mutex_lock_killable(&loop_ctl_mutex); if (err) @@ -1246,6 +1242,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
if (lo->lo_offset != info->lo_offset || lo->lo_sizelimit != info->lo_sizelimit) { + size_changed = true; sync_blockdev(lo->lo_device); invalidate_bdev(lo->lo_device); } @@ -1253,6 +1250,15 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) /* I/O need to be drained during transfer transition */ blk_mq_freeze_queue(lo->lo_queue);
+ if (size_changed && lo->lo_device->bd_inode->i_mapping->nrpages) { + /* If any pages were dirtied after invalidate_bdev(), try again */ + err = -EAGAIN; + pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n", + __func__, lo->lo_number, lo->lo_file_name, + lo->lo_device->bd_inode->i_mapping->nrpages); + goto out_unfreeze; + } + err = loop_release_xfer(lo); if (err) goto out_unfreeze; @@ -1276,19 +1282,8 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) if (err) goto out_unfreeze;
- if (lo->lo_offset != info->lo_offset || - lo->lo_sizelimit != info->lo_sizelimit) { - /* kill_bdev should have truncated all the pages */ - if (lo->lo_device->bd_inode->i_mapping->nrpages) { - err = -EAGAIN; - pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n", - __func__, lo->lo_number, lo->lo_file_name, - lo->lo_device->bd_inode->i_mapping->nrpages); - goto out_unfreeze; - } - figure_loop_size(lo, info->lo_offset, info->lo_sizelimit); - } - + lo->lo_offset = info->lo_offset; + lo->lo_sizelimit = info->lo_sizelimit; memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE); memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE); lo->lo_file_name[LO_NAME_SIZE-1] = 0; @@ -1312,6 +1307,12 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) lo->lo_key_owner = uid; }
+ if (size_changed) { + loff_t new_size = get_size(lo->lo_offset, lo->lo_sizelimit, + lo->lo_backing_file); + loop_set_size(lo, new_size); + } + loop_config_discard(lo);
/* update dio if lo_offset or transfer is changed */
From: Holger Hoffstätte holger.hoffstaette@googlemail.com
[ Upstream commit 56a85fd8376ef32458efb6ea97a820754e12f6bb ]
The loop driver always declares the rotational flag of its device as rotational, even when the device of the mapped file is nonrotational, as is the case with SSDs or on tmpfs. This can confuse filesystem tools which are SSD-aware; in my case I frequently forget to tell mkfs.btrfs that my loop device on tmpfs is nonrotational, and that I really don't need any automatic metadata redundancy.
The attached patch fixes this by introspecting the rotational flag of the mapped file's underlying block device, if it exists. If the mapped file's filesystem has no associated block device - as is the case on e.g. tmpfs - we assume nonrotational storage. If there is a better way to identify such non-devices I'd love to hear them.
Cc: Jens Axboe axboe@kernel.dk Cc: linux-block@vger.kernel.org Cc: holger@applied-asynchrony.com Signed-off-by: Holger Hoffstätte holger.hoffstaette@googlemail.com Signed-off-by: Gwendal Grignou gwendal@chromium.org Signed-off-by: Benjamin Gordon bmgordon@chromium.org Reviewed-by: Guenter Roeck groeck@chromium.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 1a6805642ed2..7a0461a6160b 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -940,6 +940,24 @@ static int loop_prepare_queue(struct loop_device *lo) return 0; }
+static void loop_update_rotational(struct loop_device *lo) +{ + struct file *file = lo->lo_backing_file; + struct inode *file_inode = file->f_mapping->host; + struct block_device *file_bdev = file_inode->i_sb->s_bdev; + struct request_queue *q = lo->lo_queue; + bool nonrot = true; + + /* not all filesystems (e.g. tmpfs) have a sb->s_bdev */ + if (file_bdev) + nonrot = blk_queue_nonrot(bdev_get_queue(file_bdev)); + + if (nonrot) + blk_queue_flag_set(QUEUE_FLAG_NONROT, q); + else + blk_queue_flag_clear(QUEUE_FLAG_NONROT, q); +} + static int loop_set_fd(struct loop_device *lo, fmode_t mode, struct block_device *bdev, unsigned int arg) { @@ -1001,6 +1019,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode, if (!(lo_flags & LO_FLAGS_READ_ONLY) && file->f_op->fsync) blk_queue_write_cache(lo->lo_queue, true, false);
+ loop_update_rotational(lo); loop_update_dio(lo); loop_sysfs_init(lo); loop_set_size(lo, size);
From: Martijn Coenen maco@android.com
[ Upstream commit 0c3796c244598122a5d59d56f30d19390096817f ]
Factor out this code into a separate function, so it can be reused by other code more easily.
Signed-off-by: Martijn Coenen maco@android.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 117 +++++++++++++++++++++++++------------------ 1 file changed, 67 insertions(+), 50 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 7a0461a6160b..0fefd21f0c71 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1231,75 +1231,43 @@ static int loop_clr_fd(struct loop_device *lo) return __loop_clr_fd(lo, false); }
+/** + * loop_set_status_from_info - configure device from loop_info + * @lo: struct loop_device to configure + * @info: struct loop_info64 to configure the device with + * + * Configures the loop device parameters according to the passed + * in loop_info64 configuration. + */ static int -loop_set_status(struct loop_device *lo, const struct loop_info64 *info) +loop_set_status_from_info(struct loop_device *lo, + const struct loop_info64 *info) { int err; struct loop_func_table *xfer; kuid_t uid = current_uid(); - struct block_device *bdev; - bool partscan = false; - bool size_changed = false; - - err = mutex_lock_killable(&loop_ctl_mutex); - if (err) - return err; - if (lo->lo_encrypt_key_size && - !uid_eq(lo->lo_key_owner, uid) && - !capable(CAP_SYS_ADMIN)) { - err = -EPERM; - goto out_unlock; - } - if (lo->lo_state != Lo_bound) { - err = -ENXIO; - goto out_unlock; - } - if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE) { - err = -EINVAL; - goto out_unlock; - } - - if (lo->lo_offset != info->lo_offset || - lo->lo_sizelimit != info->lo_sizelimit) { - size_changed = true; - sync_blockdev(lo->lo_device); - invalidate_bdev(lo->lo_device); - }
- /* I/O need to be drained during transfer transition */ - blk_mq_freeze_queue(lo->lo_queue); - - if (size_changed && lo->lo_device->bd_inode->i_mapping->nrpages) { - /* If any pages were dirtied after invalidate_bdev(), try again */ - err = -EAGAIN; - pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n", - __func__, lo->lo_number, lo->lo_file_name, - lo->lo_device->bd_inode->i_mapping->nrpages); - goto out_unfreeze; - } + if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE) + return -EINVAL;
err = loop_release_xfer(lo); if (err) - goto out_unfreeze; + return err;
if (info->lo_encrypt_type) { unsigned int type = info->lo_encrypt_type;
- if (type >= MAX_LO_CRYPT) { - err = -EINVAL; - goto out_unfreeze; - } + if (type >= MAX_LO_CRYPT) + return -EINVAL; xfer = xfer_funcs[type]; - if (xfer == NULL) { - err = -EINVAL; - goto out_unfreeze; - } + if (xfer == NULL) + return -EINVAL; } else xfer = NULL;
err = loop_init_xfer(lo, xfer, info); if (err) - goto out_unfreeze; + return err;
lo->lo_offset = info->lo_offset; lo->lo_sizelimit = info->lo_sizelimit; @@ -1326,6 +1294,55 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) lo->lo_key_owner = uid; }
+ return 0; +} + +static int +loop_set_status(struct loop_device *lo, const struct loop_info64 *info) +{ + int err; + struct block_device *bdev; + kuid_t uid = current_uid(); + bool partscan = false; + bool size_changed = false; + + err = mutex_lock_killable(&loop_ctl_mutex); + if (err) + return err; + if (lo->lo_encrypt_key_size && + !uid_eq(lo->lo_key_owner, uid) && + !capable(CAP_SYS_ADMIN)) { + err = -EPERM; + goto out_unlock; + } + if (lo->lo_state != Lo_bound) { + err = -ENXIO; + goto out_unlock; + } + + if (lo->lo_offset != info->lo_offset || + lo->lo_sizelimit != info->lo_sizelimit) { + size_changed = true; + sync_blockdev(lo->lo_device); + invalidate_bdev(lo->lo_device); + } + + /* I/O need to be drained during transfer transition */ + blk_mq_freeze_queue(lo->lo_queue); + + if (size_changed && lo->lo_device->bd_inode->i_mapping->nrpages) { + /* If any pages were dirtied after invalidate_bdev(), try again */ + err = -EAGAIN; + pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n", + __func__, lo->lo_number, lo->lo_file_name, + lo->lo_device->bd_inode->i_mapping->nrpages); + goto out_unfreeze; + } + + err = loop_set_status_from_info(lo, info); + if (err) + goto out_unfreeze; + if (size_changed) { loff_t new_size = get_size(lo->lo_offset, lo->lo_sizelimit, lo->lo_backing_file);
From: Siddh Raman Pant code@siddh.me
[ Upstream commit c490a0b5a4f36da3918181a8acdc6991d967c5f3 ]
The userspace can configure a loop using an ioctl call, wherein a configuration of type loop_config is passed (see lo_ioctl()'s case on line 1550 of drivers/block/loop.c). This proceeds to call loop_configure() which in turn calls loop_set_status_from_info() (see line 1050 of loop.c), passing &config->info which is of type loop_info64*. This function then sets the appropriate values, like the offset.
loop_device has lo_offset of type loff_t (see line 52 of loop.c), which is typdef-chained to long long, whereas loop_info64 has lo_offset of type __u64 (see line 56 of include/uapi/linux/loop.h).
The function directly copies offset from info to the device as follows (See line 980 of loop.c): lo->lo_offset = info->lo_offset;
This results in an overflow, which triggers a warning in iomap_iter() due to a call to iomap_iter_done() which has: WARN_ON_ONCE(iter->iomap.offset > iter->pos);
Thus, check for negative value during loop_set_status_from_info().
Bug report: https://syzkaller.appspot.com/bug?id=c620fe14aac810396d3c3edc9ad73848bf69a29...
Reported-and-tested-by: syzbot+a8e049cd3abd342936b6@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Siddh Raman Pant code@siddh.me Reviewed-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/r/20220823160810.181275-1-code@siddh.me Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 0fefd21f0c71..c1caa3e2355f 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1271,6 +1271,11 @@ loop_set_status_from_info(struct loop_device *lo,
lo->lo_offset = info->lo_offset; lo->lo_sizelimit = info->lo_sizelimit; + + /* loff_t vars have been assigned __u64 */ + if (lo->lo_offset < 0 || lo->lo_sizelimit < 0) + return -EOVERFLOW; + memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE); memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE); lo->lo_file_name[LO_NAME_SIZE-1] = 0;
From: Zhong Jinghua zhongjinghua@huawei.com
[ Upstream commit 9f6ad5d533d1c71e51bdd06a5712c4fbc8768dfa ]
In loop_set_status_from_info(), lo->lo_offset and lo->lo_sizelimit should be checked before reassignment, because if an overflow error occurs, the original correct value will be changed to the wrong value, and it will not be changed back.
More, the original patch did not solve the problem, the value was set and ioctl returned an error, but the subsequent io used the value in the loop driver, which still caused an alarm:
loop_handle_cmd do_req_filebacked loff_t pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset; lo_rw_aio cmd->iocb.ki_pos = pos
Fixes: c490a0b5a4f3 ("loop: Check for overflow while configuring loop") Signed-off-by: Zhong Jinghua zhongjinghua@huawei.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Link: https://lore.kernel.org/r/20230221095027.3656193-1-zhongjinghua@huaweicloud.... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn --- drivers/block/loop.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index c1caa3e2355f..6050b039e4d2 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1269,13 +1269,13 @@ loop_set_status_from_info(struct loop_device *lo, if (err) return err;
+ /* Avoid assigning overflow values */ + if (info->lo_offset > LLONG_MAX || info->lo_sizelimit > LLONG_MAX) + return -EOVERFLOW; + lo->lo_offset = info->lo_offset; lo->lo_sizelimit = info->lo_sizelimit;
- /* loff_t vars have been assigned __u64 */ - if (lo->lo_offset < 0 || lo->lo_sizelimit < 0) - return -EOVERFLOW; - memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE); memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE); lo->lo_file_name[LO_NAME_SIZE-1] = 0;
On Fri, Mar 01, 2024 at 09:30:19AM +0800, Genjian wrote:
From: Genjian Zhang zhanggenjian@kylinos.cn
Hello!
We found that 2035c770bfdb ("loop: Check for overflow while configuring loop") lost a unlock loop_ctl_mutex in loop_get_status(...). which caused syzbot to report a UAF issue. However, the upstream patch does not have this issue. So, we revert this patch and directly apply the unmodified upstream patch.
Risk use-after-free as reported by syzbot:
This looks good, but you are backporting commits that are NOT in newer stable releases (i.e. from 5.8 but the commit is not in 5.4.y), is that intentional?
Does 5.4.y also have this problem? If so, can you send a series that fixes that up so I can take both of them at the same time?
thanks,
greg k-h
At 2024-03-04 21:31:20, "Greg KH" gregkh@linuxfoundation.org wrote:
On Fri, Mar 01, 2024 at 09:30:19AM +0800, Genjian wrote:
From: Genjian Zhang zhanggenjian@kylinos.cn
Hello!
We found that 2035c770bfdb ("loop: Check for overflow while configuring loop") lost a unlock loop_ctl_mutex in loop_get_status(...). which caused syzbot to report a UAF issue. However, the upstream patch does not have this issue. So, we revert this patch and directly apply the unmodified upstream patch.
Risk use-after-free as reported by syzbot:
This looks good, but you are backporting commits that are NOT in newer stable releases (i.e. from 5.8 but the commit is not in 5.4.y), is that intentional?
Does 5.4.y also have this problem? If so, can you send a series that fixes that up so I can take both of them at the same time?
thanks,
greg k-h
Thank you for your advice. This problem also exists in 5.4.y. I will send a series of patches for 5.4.y.
thanks,
Genjian
On Thu, Mar 07, 2024 at 10:34:12AM +0800, genjian zhang wrote:
At 2024-03-04 21:31:20, "Greg KH" gregkh@linuxfoundation.org wrote:
On Fri, Mar 01, 2024 at 09:30:19AM +0800, Genjian wrote:
From: Genjian Zhang zhanggenjian@kylinos.cn
Hello!
We found that 2035c770bfdb ("loop: Check for overflow while configuring loop") lost a unlock loop_ctl_mutex in loop_get_status(...). which caused syzbot to report a UAF issue. However, the upstream patch does not have this issue. So, we revert this patch and directly apply the unmodified upstream patch.
Risk use-after-free as reported by syzbot:
This looks good, but you are backporting commits that are NOT in newer stable releases (i.e. from 5.8 but the commit is not in 5.4.y), is that intentional?
Does 5.4.y also have this problem? If so, can you send a series that fixes that up so I can take both of them at the same time?
thanks,
greg k-h
Thank you for your advice. This problem also exists in 5.4.y. I will send a series of patches for 5.4.y.
All now queued up, thanks!
greg k-h
linux-stable-mirror@lists.linaro.org