5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
[ Upstream commit 88d3917f82ed4215a2154432c26de1480a61b209 ]
Ira reports that removing cxl_mock_mem causes a crash with the following trace:
BUG: kernel NULL pointer dereference, address: 0000000000000044 [..] RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core] [..] Call Trace: <TASK> cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x29/0x40 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 bus_remove_device+0xd7/0x150 device_del+0x155/0x3e0 device_unregister+0x13/0x60 devm_release_action+0x4d/0x90 ? __pfx_unregister_port+0x10/0x10 [cxl_core] delete_endpoint+0x121/0x130 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 bus_remove_device+0xd7/0x150 device_del+0x155/0x3e0 ? lock_release+0x142/0x290 cdev_device_del+0x15/0x50 cxl_memdev_unregister+0x54/0x70 [cxl_core]
This crash is due to the clearing out the cxl_memdev's driver context (@cxlds) before the subsystem is done with it. This is ultimately due to the region(s), that this memdev is a member, being torn down and expecting to be able to de-reference @cxlds, like here:
static int cxl_region_decode_reset(struct cxl_region *cxlr, int count) ... if (cxlds->rcd) goto endpoint_reset; ...
Fix it by keeping the driver context valid until memdev-device unregistration, and subsequently the entire stack of related dependencies, unwinds.
Fixes: 9cc238c7a526 ("cxl/pci: Introduce cdevm_file_operations") Reported-by: Ira Weiny ira.weiny@intel.com Reviewed-by: Davidlohr Bueso dave@stgolabs.net Reviewed-by: Dave Jiang dave.jiang@intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Ira Weiny ira.weiny@intel.com Tested-by: Ira Weiny ira.weiny@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cxl/core/memdev.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/cxl/core/memdev.c +++ b/drivers/cxl/core/memdev.c @@ -139,10 +139,9 @@ static void cxl_memdev_unregister(void * struct cdev *cdev = &cxlmd->cdev; const struct cdevm_file_operations *cdevm_fops;
+ cdev_device_del(&cxlmd->cdev, dev); cdevm_fops = container_of(cdev->ops, typeof(*cdevm_fops), fops); cdevm_fops->shutdown(dev); - - cdev_device_del(&cxlmd->cdev, dev); put_device(dev); }