Dan Williams dan.j.williams@intel.com writes:
[...]
First up for the fix is release_mem_region_adjustable() needs to reliably delete the resource inserted by add_memory_driver_managed(). That is thwarted by a check for IORESOURCE_SYSRAM that predates the dax/kmem driver, from commit:
65c78784135f ("kernel, resource: check for IORESOURCE_SYSRAM in release_mem_region_adjustable")
That appears to be working around the behavior of HMM's "MEMORY_DEVICE_PUBLIC" facility that has since been deleted. With that check removed the "System RAM (kmem)" resource gets removed, but corruption still occurs occasionally because the "dax" resource is not reliably removed.
And re-added in the form of MEMORY_DEVICE_COHERENT, which is pretty similar. However I can't understand why that commit was needed in the first place. As mentioned in that commit message "we only call release_mem_region_adjustable() in __remove_pages if the zone is not ZONE_DEVICE" anyway, so I don't really understand how this code path could ever have been exercised. I can't see why it's needed now either, so for the change to resource.c feel free to add:
Reviewed-by: Alistair Popple apopple@nvidia.com
The dax range information is freed before the device is unregistered, so the driver can not reliably recall (another use after free) what it is meant to release. Lastly if that use after free got lucky, the driver was covering up the leak of "System RAM (kmem)" due to its use of release_resource() which detaches, but does not free, child resources. The switch to remove_resource() forces remove_memory() to be responsible for the deletion of the resource added by add_memory_driver_managed().
Fixes: c2f3011ee697 ("device-dax: add an allocation interface for device-dax instances") Cc: stable@vger.kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: David Hildenbrand david@redhat.com Cc: Pavel Tatashin pasha.tatashin@soleen.com Signed-off-by: Dan Williams dan.j.williams@intel.com
drivers/dax/bus.c | 2 +- drivers/dax/kmem.c | 4 ++-- kernel/resource.c | 14 -------------- 3 files changed, 3 insertions(+), 17 deletions(-)
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 012d576004e9..67a64f4c472d 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -441,8 +441,8 @@ static void unregister_dev_dax(void *dev) dev_dbg(dev, "%s\n", __func__); kill_dev_dax(dev_dax);
- free_dev_dax_ranges(dev_dax); device_del(dev);
- free_dev_dax_ranges(dev_dax); put_device(dev);
} diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index 918d01d3fbaa..7b36db6f1cbd 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -146,7 +146,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax) if (rc) { dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n", i, range.start, range.end);
release_resource(res);
remove_resource(res); kfree(res); data->res[i] = NULL; if (mapped)
@@ -195,7 +195,7 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax) rc = remove_memory(range.start, range_len(&range)); if (rc == 0) {
release_resource(data->res[i]);
remove_resource(data->res[i]); kfree(data->res[i]); data->res[i] = NULL; success++;
diff --git a/kernel/resource.c b/kernel/resource.c index ddbbacb9fb50..b1763b2fd7ef 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1343,20 +1343,6 @@ void release_mem_region_adjustable(resource_size_t start, resource_size_t size) continue; }
/*
* All memory regions added from memory-hotplug path have the
* flag IORESOURCE_SYSTEM_RAM. If the resource does not have
* this flag, we know that we are dealing with a resource coming
* from HMM/devm. HMM/devm use another mechanism to add/release
* a resource. This goes via devm_request_mem_region and
* devm_release_mem_region.
* HMM/devm take care to release their resources when they want,
* so if we are dealing with them, let us just back off here.
*/
if (!(res->flags & IORESOURCE_SYSRAM)) {
break;
}
- if (!(res->flags & IORESOURCE_MEM)) break;