On 3/28/23 11:23, Mirsad Todorovac wrote:
Hi all,
Platform is AlmaLinux 8.7 (CentOS fork), Lenovo desktop LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB with the BIOS M22KT49A dated 11/10/2022.
Running Torvalds vanilla kernel 6.3-rc3 commit 6981739a967c with CONFIG_DEBUG_KMEMLEAK and CONFIG_DEBUG_{KOBJECT,KOBJECT_RELEASE} enabled.
The leak is cumulative; it can be reproduced with the tools/testing/selftests/firmware/*.sh scripts.
The leaks are in chunks of 1024 bytes (+ overhead), but so far I could not reproduce them without root privileges, as the tests refuse to run as an unprivileged user. (This does not prove that no unprivileged automated exploit exists that could exhaust kernel memory at a rate of approx. 4 MB/hour on our setup.
That would mean about 96 MB/day or 3 GB/month of kernel memory.)
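For reference, the arithmetic behind these figures can be sketched in a few lines of C. The constant leak rate and the 30-day month are simplifying assumptions, and the names LEAK_MB_PER_HOUR, leaked_mb() and print_projection() are made up for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Observed leak rate on the test machine: roughly 4 MB per hour. */
#define LEAK_MB_PER_HOUR 4UL

/* Projected leak in MB after `hours` hours, assuming a constant rate. */
unsigned long leaked_mb(unsigned long hours)
{
	/* Guard against overflow of the multiplication below. */
	assert(hours <= ULONG_MAX / LEAK_MB_PER_HOUR);
	return LEAK_MB_PER_HOUR * hours;
}

/* Print the daily and monthly projections (a month taken as 30 days). */
void print_projection(void)
{
	printf("per day:   %lu MB\n", leaked_mb(24));
	printf("per month: %lu MB\n", leaked_mb(24 * 30));
}
```

At 4 MB/hour this gives 96 MB for 24 hours and 2880 MB for 30 days, which is the "about 3 GB" quoted above.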
TEST RESULTS (showing the number of kmemleaks per test):
[root@pc-mtodorov marvin]# grep -c 'comm "test_' linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw*.log
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_fallback.sh.log:0
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_filesystem.sh.log:60
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_lib.sh.log:9
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_run_tests.sh.log:196
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_upload.sh.log:0
[root@pc-mtodorov marvin]#
Leaks look like this:
unreferenced object 0xffff943c390f8400 (size 1024):
  comm "test_firmware-0", pid 449178, jiffies 4381453603 (age 824.844s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
    [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
    [<ffffffff907d6dcf>] kthread+0x10f/0x140
    [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff943a902f6400 (size 1024):
  comm "test_firmware-1", pid 449179, jiffies 4381453603 (age 824.844s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
    [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
    [<ffffffff907d6dcf>] kthread+0x10f/0x140
    [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff943a902f0400 (size 1024):
  comm "test_firmware-2", pid 449180, jiffies 4381453603 (age 824.844s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
    [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
    [<ffffffff907d6dcf>] kthread+0x10f/0x140
    [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff943a902f4000 (size 1024):
  comm "test_firmware-3", pid 449181, jiffies 4381453603 (age 824.844s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
    [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
    [<ffffffff907d6dcf>] kthread+0x10f/0x140
    [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
Please find the build config, lshw output and the output of /sys/kernel/debug/kmemleak in the following directory:
https://domac.alu.hr/~mtodorov/linux/bugreports/kmemleak-firmware/
NOTE: sent to the maintainers listed for selftests/firmware and those listed for lib/test_firmware.c.
Hi, again!
The problem seems to be here:
lib/test_firmware.c:
-----------------------------------------------------------------------------------
826 static int test_fw_run_batch_request(void *data)
827 {
828 	struct test_batched_req *req = data;
829 
830 	if (!req) {
831 		test_fw_config->test_result = -EINVAL;
832 		return -EINVAL;
833 	}
834 
835 	if (test_fw_config->into_buf) {
836 		void *test_buf;
837 
838 		test_buf = kzalloc(TEST_FIRMWARE_BUF_SIZE, GFP_KERNEL);
839 		if (!test_buf)
840 			return -ENOSPC;
841 
842 		if (test_fw_config->partial)
843 			req->rc = request_partial_firmware_into_buf
844 						(&req->fw,
845 						 req->name,
846 						 req->dev,
847 						 test_buf,
848 						 test_fw_config->buf_size,
849 						 test_fw_config->file_offset);
850 		else
851 			req->rc = request_firmware_into_buf
852 						(&req->fw,
853 						 req->name,
854 						 req->dev,
855 						 test_buf,
856 						 test_fw_config->buf_size);
857 		if (!req->fw)
858 			kfree(test_buf);
859 	} else {
860 		req->rc = test_fw_config->req_firmware(&req->fw,
861 						       req->name,
862 						       req->dev);
863 	}
864 
865 	if (req->rc) {
866 		pr_info("#%u: batched sync load failed: %d\n",
867 			req->idx, req->rc);
868 		if (!test_fw_config->test_result)
869 			test_fw_config->test_result = req->rc;
870 	} else if (req->fw) {
871 		req->sent = true;
872 		pr_info("#%u: batched sync loaded %zu\n",
873 			req->idx, req->fw->size);
874 	}
875 	complete(&req->completion);
876 
877 	req->task = NULL;
878 
879 	return 0;
880 }
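The scope of test_buf runs from its definition in line 836 to the end of the block in line 859. The kfree() in line 858 is only reached when req->fw is NULL, so in case req->fw != NULL the function drops its only local pointer to the memory kzalloc()'d in line 838 without freeing it.
Unless the buffer is referenced somewhere else in a way that is not apparent from this function, it appears that the kernel loses track of this allocated block.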
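To illustrate the pattern outside the kernel, here is a minimal userspace sketch of one possible remedy: remember the buffer in the request on the success path so that it can be freed later. Everything in it is hypothetical (struct fake_fw, struct fake_req, the fw_buf field, fake_request_into_buf(), run_request(), release_request()), and it assumes that the firmware object only borrows the caller's buffer and never frees it. It is an untested illustration, not a patch against lib/test_firmware.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 1024	/* same size as the leaked objects in the report */

/* Hypothetical stand-in for struct firmware: borrows the caller's buffer. */
struct fake_fw {
	const unsigned char *data;
	size_t size;
};

/* Hypothetical stand-in for struct test_batched_req, plus a fw_buf field. */
struct fake_req {
	struct fake_fw *fw;
	void *fw_buf;	/* owning pointer that outlives the local test_buf */
	int rc;
};

/* Stand-in for request_firmware_into_buf(): fills buf, never owns it. */
int fake_request_into_buf(struct fake_fw **fw, void *buf, size_t size, int fail)
{
	static const char payload[] = "EFGH4567\n";	/* bytes from the hex dump */
	size_t len = sizeof(payload) - 1;

	if (fail || size < len)
		return -1;
	*fw = malloc(sizeof(**fw));
	if (!*fw)
		return -1;
	memcpy(buf, payload, len);
	(*fw)->data = buf;
	(*fw)->size = len;
	return 0;
}

/* Mirrors the into_buf branch, but keeps the buffer on success. */
int run_request(struct fake_req *req, int fail)
{
	void *test_buf;

	assert(req != NULL);
	test_buf = calloc(1, BUF_SIZE);
	if (!test_buf)
		return -1;

	req->rc = fake_request_into_buf(&req->fw, test_buf, BUF_SIZE, fail);
	if (!req->fw)
		free(test_buf);		/* failure path: free immediately */
	else
		req->fw_buf = test_buf;	/* success path: remember the buffer */
	return 0;
}

/* Release both the firmware object and the buffer it borrowed. */
void release_request(struct fake_req *req)
{
	free(req->fw);
	free(req->fw_buf);
	req->fw = NULL;
	req->fw_buf = NULL;
}
```

A caller would zero-initialise a struct fake_req, call run_request() on it and later release_request(). Without the fw_buf assignment, the success path would leave the 1024-byte buffer reachable only through fw->data, which nothing in this sketch frees.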
Hope this helps.
Best regards, Mirsad