On Tue, 2025-01-14 at 16:16 +0000, André Draszik wrote:
devm_blk_crypto_profile_init() registers a cleanup handler to run when the associated (platform-) device is being released. For UFS, the crypto private data and pointers are stored as part of the ufs_hba's data structure 'struct ufs_hba::crypto_profile'. This structure is allocated as part of the underlying ufshd allocation.
During driver release or during error handling in ufshcd_pltfrm_init(), this structure is released as part of ufshcd_dealloc_host() before the (platform-) device associated with the crypto call above is released. Once this device is released, the crypto cleanup code will run, using the just-released 'struct ufs_hba::crypto_profile'. This causes a use-after-free situation:
exynos-ufshc 14700000.ufs: ufshcd_pltfrm_init() failed -11 exynos-ufshc 14700000.ufs: probe with driver exynos-ufshc failed with error -11 Unable to handle kernel paging request at virtual address 01adafad6dadad88 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [01adafad6dadad88] address between user and kernel address ranges Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.13.0-rc5-next-20250106+ #70 Tainted: [W]=WARN Hardware name: Oriole (DT) pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kfree+0x60/0x2d8 lr : kvfree+0x44/0x60 sp : ffff80008009ba80 x29: ffff80008009ba90 x28: 0000000000000000 x27: ffffbcc6591e0130 x26: ffffbcc659309960 x25: ffffbcc658f89c50 x24: ffffbcc659539d80 x23: ffff22e000940040 x22: ffff22e001539010 x21: ffffbcc65714b22c x20: 6b6b6b6b6b6b6b6b x19: 01adafad6dadad80 x18: 0000000000000000 x17: ffffbcc6579fbac8 x16: ffffbcc657a04300 x15: ffffbcc657a027f4 x14: ffffbcc656f969cc x13: ffffbcc6579fdc80 x12: ffffbcc6579fb194 x11: ffffbcc6579fbc34 x10: 0000000000000000 x9 : ffffbcc65714b22c x8 : ffff80008009b880 x7 : 0000000000000000 x6 : ffff80008009b940 x5 : ffff80008009b8c0 x4 : ffff22e000940518 x3 : ffff22e006f54f40 x2 : ffffbcc657a02268 x1 : ffff80007fffffff x0 : ffffc1ffc0000000 Call trace: kfree+0x60/0x2d8 (P) kvfree+0x44/0x60 blk_crypto_profile_destroy_callback+0x28/0x70 devm_action_release+0x1c/0x30 release_nodes+0x6c/0x108 devres_release_all+0x98/0x100 device_unbind_cleanup+0x20/0x70 really_probe+0x218/0x2d0
In other words, the initialisation code flow is:
platform-device probe ufshcd_pltfrm_init() ufshcd_alloc_host() scsi_host_alloc() allocation of struct ufs_hba creation of scsi-host devices devm_blk_crypto_profile_init() devm registration of cleanup handler using platform-device
and during error handling of ufshcd_pltfrm_init() or during driver removal:
ufshcd_dealloc_host() scsi_host_put() put_device(scsi-host) release of struct ufs_hba put_device(platform-device) crypto cleanup handler
To fix this use-after free, change ufshcd_alloc_host() to register a devres action to automatically cleanup the underlying SCSI device on ufshcd destruction, without requiring explicit calls to ufshcd_dealloc_host(). This way:
* the crypto profile and all other ufs_hba-owned resources are destroyed before SCSI (as they've been registered after) * a memleak is plugged in tc-dwc-g210-pci.c as a side-effect * EXPORT_SYMBOL_GPL(ufshcd_dealloc_host) can be removed fully as it's not needed anymore * no future drivers using ufshcd_alloc_host() could ever forget adding the cleanup
Fixes: cb77cb5abe1f ("blk-crypto: rename blk_keyslot_manager to blk_crypto_profile") Fixes: d76d9d7d1009 ("scsi: ufs: use devm_blk_ksm_init()") Cc: stable@vger.kernel.org Signed-off-by: André Draszik andre.draszik@linaro.org
Changes in v2:
- completely new approach using devres action for Scsi_host cleanup, to
ensure ordering
As mentioned, I am not sure if this approach has wider implications (in particular if there is any underlying assumption or requirement for the Scsi_host device to clean up before the ufshcd device).
Simple testing using a few iteration of manual module bind/unbind worked, as did the error handling / cleanup during init. But I'm not sure if that is sufficient testing for the changed release ordering.
Cheers, Andre'