On Tue, May 22, 2018 at 9:42 AM, Logan Gunthorpe <logang@deltatee.com> wrote:
Hey Dan,
On 21/05/18 06:07 PM, Dan Williams wrote:
Without this change we could fail to register the teardown of devm_memremap_pages(). The likelihood of hitting this failure is tiny, as small memory allocations almost always succeed. However, the impact of the failure is large, because any future reconfiguration, or disable/enable, of an nvdimm namespace will fail forever: subsequent calls to devm_memremap_pages() will fail to set up the pgmap_radix since there will be stale entries for the physical address range.
Sorry, I don't follow this. The change only seems to prevent a warning from occurring in this situation. Won't pgmap_radix_release() still be called regardless of whether this patch is applied?
devm_add_action() does not call the release function when registration fails; devm_add_action_or_reset() does.
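To make the difference concrete, here is a rough userspace model of the two helpers. It is only a sketch, not the kernel's devres code: the model_* names, the fail_alloc knob, and count_release() are made up for illustration. The only behavior it tries to capture is that the _or_reset variant runs the action immediately when registration fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace model only: the model_* names are hypothetical stand-ins,
 * not the kernel's devres implementation. */
typedef void (*action_fn)(void *data);

struct model_dev {
	int fail_alloc;   /* simulate the small allocation failing */
	action_fn action; /* the single registered teardown action, if any */
	void *data;
};

/* Models devm_add_action(): on failure nothing is registered and
 * the action is NOT called. */
static int model_add_action(struct model_dev *dev, action_fn action, void *data)
{
	assert(action != NULL);
	if (dev->fail_alloc)
		return -ENOMEM;
	dev->action = action;
	dev->data = data;
	return 0;
}

/* Models devm_add_action_or_reset(): on failure the action is run
 * immediately, so the caller's state is still torn down. */
static int model_add_action_or_reset(struct model_dev *dev, action_fn action,
				     void *data)
{
	int ret = model_add_action(dev, action, data);

	if (ret)
		action(data);
	return ret;
}

/* Stand-in for a release callback such as pgmap_radix_release();
 * it just counts how many times it was invoked. */
static void count_release(void *data)
{
	(*(int *)data)++;
}
```

With fail_alloc set, model_add_action() returns -ENOMEM and leaves the counter untouched, while model_add_action_or_reset() returns the same error but has already run the release callback once.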
But it looks to me like this patch doesn't quite solve the issue -- at least when looking at dax/pmem.c: If devm_add_action_or_reset() fails, then dax_pmem_percpu_kill() won't be registered as an action and the percpu_ref will never get killed. Thus, dax_pmem_percpu_release() would not get called and dax_pmem_percpu_exit() will hang waiting for a completion that will never occur. So we probably need to add a kill call somewhere on the failing path...
Ah, true, good catch!
We should manually kill in the !registered case. I think this means we need to pass in the custom kill routine, because for the pmem driver it's blk_freeze_queue_start().
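Something along these lines, as a userspace sketch of the idea rather than a real patch. Everything here is hypothetical (the model_* names, the kill field, the simulate_failure flag); the point is only that the caller hands in its own kill routine and the remap path invokes it by hand when the teardown could not be registered.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a percpu_ref; only tracks whether it was killed. */
struct model_ref {
	bool killed;
};

typedef void (*kill_fn)(struct model_ref *ref);

struct model_pgmap {
	struct model_ref *ref;
	kill_fn kill; /* caller-supplied kill routine (assumed field) */
};

/* Pretend to register the teardown action; fail on request. */
static int model_register_teardown(bool simulate_failure)
{
	return simulate_failure ? -ENOMEM : 0;
}

/* Models the remap path: if the teardown was not registered, nothing
 * will kill the ref later, so kill it here before returning the error. */
static int model_memremap_pages(struct model_pgmap *pgmap, bool simulate_failure)
{
	int ret = model_register_teardown(simulate_failure);

	if (ret) {
		pgmap->kill(pgmap->ref); /* the !registered case */
		return ret;
	}
	return 0;
}

/* A generic kill routine; a driver like pmem would pass its own instead. */
static void model_generic_kill(struct model_ref *ref)
{
	ref->killed = true;
}
```

Passing the routine in, rather than hard-coding a kill of the ref, is what lets a driver substitute its own shutdown step on the failing path.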