On Tue, Apr 09, 2024 at 10:38:34AM +0800, guomengqi (A) wrote:
在 2024/4/5 17:30, Greg KH 写道:
On Wed, Apr 03, 2024 at 05:47:16PM +0800, Guo Mengqi wrote:
commit 73a82b22963d ("drm/atomic: Fix potential use-after-free in nonblocking commits") introduced drm_dev_get/put() to drm_atomic_helper_shutdown(). And this cause problem in vkms driver exit process.
vkms_exit() drm_dev_put() vkms_release() drm_atomic_helper_shutdown() drm_dev_get() drm_dev_put() vkms_release() ------ null pointer access
Using 4.19 stable x86 image on qemu, below stacktrace can be triggered by load and unload vkms.ko.
root:~ # insmod vkms.ko [ 142.135449] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 142.138713] [drm] Driver supports precise vblank timestamp query. [ 142.142390] [drm] Initialized vkms 1.0.0 20180514 for virtual device on minor 0 root:~ # rmmod vkms.ko [ 144.093710] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0 [ 144.097491] PGD 800000023624e067 P4D 800000023624e067 PUD 22ab59067 PMD 0 [ 144.100802] Oops: 0000 [#1] SMP PTI [ 144.102502] CPU: 0 PID: 3615 Comm: rmmod Not tainted 4.19.310 #1 [ 144.104452] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 144.107238] RIP: 0010:device_del+0x34/0x3a0 ... [ 144.131323] Call Trace: [ 144.131962] ? __die+0x7d/0xc0 [ 144.132711] ? no_context+0x152/0x3b0 [ 144.133605] ? wake_up_q+0x70/0x70 [ 144.134436] ? __do_page_fault+0x342/0x4b0 [ 144.135445] ? __switch_to_asm+0x41/0x70 [ 144.136416] ? __switch_to_asm+0x35/0x70 [ 144.137366] ? page_fault+0x1e/0x30 [ 144.138214] ? __drm_atomic_state_free+0x51/0x60 [ 144.139331] ? device_del+0x34/0x3a0 [ 144.140197] platform_device_del.part.14+0x19/0x70 [ 144.141348] platform_device_unregister+0xe/0x20 [ 144.142458] vkms_release+0x10/0x30 [vkms] [ 144.143449] __drm_atomic_helper_disable_all.constprop.31+0x13b/0x150 [ 144.144980] drm_atomic_helper_shutdown+0x4b/0x90 [ 144.146102] vkms_release+0x18/0x30 [vkms] [ 144.147107] vkms_exit+0x29/0x8ec [vkms] [ 144.148053] __x64_sys_delete_module+0x155/0x220 [ 144.149168] do_syscall_64+0x43/0x100 [ 144.150056] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
It seems that the proper unload sequence is: drm_atomic_helper_shutdown(); drm_dev_put();
Just put drm_atomic_helper_shutdown() before drm_dev_put() should solve the problem.
Note that vkms exit code is refactored by 53d77aaa3f76 ("drm/vkms: Use devm_drm_dev_alloc") in tags/v5.10-rc1.
So this bug only exists on 4.19 and 5.4.
Do we also need this for 5.4? If so, can you send a version for that tree with the correct Fixes: information, and I will be glad to queue both of these up.
I sent a patch to 5.4.y too. Please check it at https://lore.kernel.org/all/20240409022647.1821-1-guomengqi3@huawei.com/T/#u
Both now queued up, thanks.
greg k-h