The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 32b2397c1e56f33b0b1881def965bb89bd12f448 Mon Sep 17 00:00:00 2001
From: sumiyawang <sumiyawang@tencent.com>
Date: Sun, 22 Aug 2021 19:49:09 +0800
Subject: [PATCH] libnvdimm/pmem: Fix crash triggered when I/O in-flight during unbind
There is a use after free crash when the pmem driver tears down its mapping while I/O is still inbound.
This is triggered by driver unbind, "ndctl destroy-namespace", while I/O is in flight.
Fix the sequence of blk_cleanup_queue() vs memunmap().
The crash signature is of the form:
  BUG: unable to handle page fault for address: ffffc90080200000
  CPU: 36 PID: 9606 Comm: systemd-udevd
  Call Trace:
   ? pmem_do_bvec+0xf9/0x3a0
   ? xas_alloc+0x55/0xd0
   pmem_rw_page+0x4b/0x80
   bdev_read_page+0x86/0xb0
   do_mpage_readpage+0x5d4/0x7a0
   ? lru_cache_add+0xe/0x10
   mpage_readpages+0xf9/0x1c0
   ? bd_link_disk_holder+0x1a0/0x1a0
   blkdev_readpages+0x1d/0x20
   read_pages+0x67/0x1a0

  ndctl Call Trace in vmcore:
  PID: 23473 TASK: ffff88c4fbbe8000 CPU: 1 COMMAND: "ndctl"
   __schedule
   schedule
   blk_mq_freeze_queue_wait
   blk_freeze_queue
   blk_cleanup_queue
   pmem_release_queue
   devm_action_release
   release_nodes
   devres_release_all
   device_release_driver_internal
   device_driver_detach
   unbind_store

Cc: <stable@vger.kernel.org>
Signed-off-by: sumiyawang <sumiyawang@tencent.com>
Reviewed-by: yongduan <yongduan@tencent.com>
Link: https://lore.kernel.org/r/1629632949-14749-1-git-send-email-sumiyawang@tence...
Fixes: 50f44ee7248a ("mm/devm_memremap_pages: fix final page put race")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 1e0615b8565e..72de88ff0d30 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -450,11 +450,11 @@ static int pmem_attach_disk(struct device *dev,
 		pmem->pfn_flags |= PFN_MAP;
 		bb_range = pmem->pgmap.range;
 	} else {
+		addr = devm_memremap(dev, pmem->phys_addr,
+				pmem->size, ARCH_MEMREMAP_PMEM);
 		if (devm_add_action_or_reset(dev, pmem_release_queue,
 					&pmem->pgmap))
 			return -ENOMEM;
-		addr = devm_memremap(dev, pmem->phys_addr,
-				pmem->size, ARCH_MEMREMAP_PMEM);
 		bb_range.start = res->start;
 		bb_range.end = res->end;
 	}
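Why moving two lines changes the teardown order: device-managed (devm) resources are released in the reverse order of their registration when the driver is unbound. The sketch below is a toy userspace model of that last-in, first-out behaviour, not kernel code. Every name in it (toy_dev, toy_add_action, toy_release_all and so on) is hypothetical and only stands in for the devres machinery, memunmap() and blk_cleanup_queue().

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, not the kernel's devres API. */
#define MAX_ACTIONS 8
#define LOG_SIZE 128

typedef void (*release_fn)(char *log);

struct toy_dev {
	release_fn actions[MAX_ACTIONS];
	size_t count;
};

/* Register a release action; later registrations are released earlier. */
static int toy_add_action(struct toy_dev *dev, release_fn fn)
{
	if (dev->count >= MAX_ACTIONS)
		return -1;
	dev->actions[dev->count++] = fn;
	return 0;
}

/* Run the release actions last-registered-first, as on driver unbind. */
static void toy_release_all(struct toy_dev *dev, char *log)
{
	while (dev->count > 0)
		dev->actions[--dev->count](log);
}

/* Append a step name to a comma-separated log of teardown steps. */
static void toy_append(char *log, const char *step)
{
	if (log[0] != '\0')
		strncat(log, ",", LOG_SIZE - strlen(log) - 1);
	strncat(log, step, LOG_SIZE - strlen(log) - 1);
}

/* Stand-ins for the two teardown steps discussed in the patch. */
static void toy_unmap(char *log) { toy_append(log, "memunmap"); }
static void toy_cleanup_queue(char *log) { toy_append(log, "cleanup_queue"); }

/* Order before the fix: queue action registered first, mapping second. */
static void toy_teardown_old(char *log)
{
	struct toy_dev dev = { .count = 0 };

	log[0] = '\0';
	toy_add_action(&dev, toy_cleanup_queue);
	toy_add_action(&dev, toy_unmap);
	toy_release_all(&dev, log);
}

/* Order after the fix: mapping registered first, queue action second. */
static void toy_teardown_new(char *log)
{
	struct toy_dev dev = { .count = 0 };

	log[0] = '\0';
	toy_add_action(&dev, toy_unmap);
	toy_add_action(&dev, toy_cleanup_queue);
	toy_release_all(&dev, log);
}
```

With a char log[LOG_SIZE] buffer, toy_teardown_old() records the unmap step before the queue cleanup, which is the window in which in-flight I/O can touch an unmapped address. toy_teardown_new() records the queue cleanup first, so the queue is drained while the mapping is still valid.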
From: sumiyawang <sumiyawang@tencent.com>
commit 32b2397c1e56f33b0b1881def965bb89bd12f448 upstream.
There is a use after free crash when the pmem driver tears down its mapping while I/O is still inbound.
This is triggered by driver unbind, "ndctl destroy-namespace", while I/O is in flight.
Fix the sequence of blk_cleanup_queue() vs memunmap().
The crash signature is of the form:
  BUG: unable to handle page fault for address: ffffc90080200000
  CPU: 36 PID: 9606 Comm: systemd-udevd
  Call Trace:
   ? pmem_do_bvec+0xf9/0x3a0
   ? xas_alloc+0x55/0xd0
   pmem_rw_page+0x4b/0x80
   bdev_read_page+0x86/0xb0
   do_mpage_readpage+0x5d4/0x7a0
   ? lru_cache_add+0xe/0x10
   mpage_readpages+0xf9/0x1c0
   ? bd_link_disk_holder+0x1a0/0x1a0
   blkdev_readpages+0x1d/0x20
   read_pages+0x67/0x1a0

  ndctl Call Trace in vmcore:
  PID: 23473 TASK: ffff88c4fbbe8000 CPU: 1 COMMAND: "ndctl"
   __schedule
   schedule
   blk_mq_freeze_queue_wait
   blk_freeze_queue
   blk_cleanup_queue
   pmem_release_queue
   devm_action_release
   release_nodes
   devres_release_all
   device_release_driver_internal
   device_driver_detach
   unbind_store

Cc: <stable@vger.kernel.org>
Signed-off-by: sumiyawang <sumiyawang@tencent.com>
Reviewed-by: yongduan <yongduan@tencent.com>
Link: https://lore.kernel.org/r/1629632949-14749-1-git-send-email-sumiyawang@tence...
Fixes: 50f44ee7248a ("mm/devm_memremap_pages: fix final page put race")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
[tyhicks: Minor contextual change in pmem_attach_disk() due to the
 transition to 'struct range' not yet taking place. Preserve the
 memcpy() call rather than initializing the range struct. That change
 was introduced in v5.10 with commit a4574f63edc6 ("mm/memremap_pages:
 convert to 'struct range'")]
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
---
We're seeing memory corruption issues in production and, AFAICT, we exercise this bit of code around the time that the corruption takes place. Therefore, I'm submitting this manually tested backport for inclusion in linux-5.4.y since it wasn't automatically applied due to the need for a manual backport.
 drivers/nvdimm/pmem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f9f76f6ba07b..7e65306b2bf2 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -423,11 +423,11 @@ static int pmem_attach_disk(struct device *dev,
 		pmem->pfn_flags |= PFN_MAP;
 		memcpy(&bb_res, &pmem->pgmap.res, sizeof(bb_res));
 	} else {
+		addr = devm_memremap(dev, pmem->phys_addr,
+				pmem->size, ARCH_MEMREMAP_PMEM);
 		if (devm_add_action_or_reset(dev, pmem_release_queue,
 					&pmem->pgmap))
 			return -ENOMEM;
-		addr = devm_memremap(dev, pmem->phys_addr,
-				pmem->size, ARCH_MEMREMAP_PMEM);
 		memcpy(&bb_res, &nsio->res, sizeof(bb_res));
 	}
On Mon, Oct 04, 2021 at 12:51:34AM -0500, Tyler Hicks wrote:
> From: sumiyawang <sumiyawang@tencent.com>
>
> commit 32b2397c1e56f33b0b1881def965bb89bd12f448 upstream.
>
> There is a use after free crash when the pmem driver tears down its mapping while I/O is still inbound.
>
> This is triggered by driver unbind, "ndctl destroy-namespace", while I/O is in flight.
>
> Fix the sequence of blk_cleanup_queue() vs memunmap().
>
> The crash signature is of the form:
>
>   BUG: unable to handle page fault for address: ffffc90080200000
>   CPU: 36 PID: 9606 Comm: systemd-udevd
>   Call Trace:
>    ? pmem_do_bvec+0xf9/0x3a0
>    ? xas_alloc+0x55/0xd0
>    pmem_rw_page+0x4b/0x80
>    bdev_read_page+0x86/0xb0
>    do_mpage_readpage+0x5d4/0x7a0
>    ? lru_cache_add+0xe/0x10
>    mpage_readpages+0xf9/0x1c0
>    ? bd_link_disk_holder+0x1a0/0x1a0
>    blkdev_readpages+0x1d/0x20
>    read_pages+0x67/0x1a0
>
>   ndctl Call Trace in vmcore:
>   PID: 23473 TASK: ffff88c4fbbe8000 CPU: 1 COMMAND: "ndctl"
>    __schedule
>    schedule
>    blk_mq_freeze_queue_wait
>    blk_freeze_queue
>    blk_cleanup_queue
>    pmem_release_queue
>    devm_action_release
>    release_nodes
>    devres_release_all
>    device_release_driver_internal
>    device_driver_detach
>    unbind_store
>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: sumiyawang <sumiyawang@tencent.com>
> Reviewed-by: yongduan <yongduan@tencent.com>
> Link: https://lore.kernel.org/r/1629632949-14749-1-git-send-email-sumiyawang@tence...
> Fixes: 50f44ee7248a ("mm/devm_memremap_pages: fix final page put race")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> [tyhicks: Minor contextual change in pmem_attach_disk() due to the
>  transition to 'struct range' not yet taking place. Preserve the
>  memcpy() call rather than initializing the range struct. That change
>  was introduced in v5.10 with commit a4574f63edc6 ("mm/memremap_pages:
>  convert to 'struct range'")]
> Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
>
> We're seeing memory corruption issues in production and, AFAICT, we exercise this bit of code around the time that the corruption takes place. Therefore, I'm submitting this manually tested backport for inclusion in linux-5.4.y since it wasn't automatically applied due to the need for a manual backport.
Now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org