From: Dexuan Cui <decui@microsoft.com>
Right after a device is created in new_pcichild_device(), hpdev->refs is set to 2 (i.e. the initial value of 1 plus the get_pcichild()).
When we hot remove the device from the host, the Linux VM first calls hv_pci_eject_device(), which increases hpdev->refs via get_pcichild() and then schedules the work item hv_eject_device_work(), so hpdev->refs becomes 3 (let's ignore the paired get/put_pcichild() calls in other places). But hv_eject_device_work() currently calls put_pcichild() only twice, so hpdev->refs never drops to 0 and the 'hpdev' struct can't be freed in put_pcichild(). This patch adds one more put_pcichild() to fix the memory leak.
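The counting above can be checked with a small userspace model. This is only a sketch of the arithmetic, not the driver code: struct model_dev and every model_* function are hypothetical names made up for illustration, and the "free" is just a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hpdev struct; only the counter is modeled. */
struct model_dev {
	int refs;
	bool freed;
};

static void model_get(struct model_dev *d)
{
	d->refs++;
}

static void model_put(struct model_dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		d->freed = true;	/* the driver would free the struct here */
}

/* Mirrors new_pcichild_device(): initial value 1 plus one get. */
static void model_create(struct model_dev *d)
{
	d->refs = 1;
	d->freed = false;
	model_get(d);
}

/* Mirrors hv_pci_eject_device(): one extra get before the work is queued. */
static void model_eject_request(struct model_dev *d)
{
	model_get(d);
}

/* Mirrors hv_eject_device_work() with a configurable number of puts. */
static void model_eject_work(struct model_dev *d, int nputs)
{
	while (nputs-- > 0)
		model_put(d);
}

/* Returns true if a hot remove with nputs puts ends up freeing the object. */
static bool model_hot_remove_frees(int nputs)
{
	struct model_dev d;

	model_create(&d);		/* refs == 2 */
	model_eject_request(&d);	/* refs == 3 */
	model_eject_work(&d, nputs);
	return d.freed;
}
```

In this model, model_hot_remove_frees(2) corresponds to the current code and leaves the count at 1, while model_hot_remove_frees(3) corresponds to the patched code and reaches 0.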
BTW, the device can also be removed when we run "rmmod pci-hyperv". On this path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()), hpdev->refs is 2, and we do correctly call put_pcichild() twice in pci_devices_present_work().
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org
Exiting new_pcichild_device() with hpdev->refs set to 2 seems OK to me. There is the reference in the hbus->children list, and there is the reference that is returned to the caller. But what is strange is that pci_devices_present_work() overwrites the reference returned in the local variable hpdev without doing a put_pcichild(). It seems like the "normal" reference count should be 1 when the child device is not being manipulated, not 2. The fix would be to add a call to put_pcichild() when the return value from new_pcichild_device() is overwritten. Then remove the call to put_pcichild() in pci_devices_present_work() when missing children are moved to the local list. The children have been moved from one list to another, so there's no need to decrement the reference count. Then when everything in the local list is deleted, the reference is correctly decremented, presumably freeing the memory.
With this approach, the code in hv_eject_device_work() is correct. There's one call to put_pcichild() to reflect removing the child device from the hbus->children list, and one call to put_pcichild() to pair with the get_pcichild() in hv_pci_eject_device().
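To make the counting in this alternative concrete, here is a rough userspace model. It is only a sketch under my reading of the code paths: struct alt_dev and all alt_* functions are hypothetical names, and nothing here is taken from the driver itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hpdev struct; only the counter is modeled. */
struct alt_dev {
	int refs;
	bool freed;
};

static void alt_get(struct alt_dev *d)
{
	d->refs++;
}

static void alt_put(struct alt_dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		d->freed = true;	/* the driver would free the struct here */
}

/* new_pcichild_device(): one ref for the list, one returned to the caller. */
static void alt_create(struct alt_dev *d)
{
	d->refs = 1;
	d->freed = false;
	alt_get(d);
}

/* Proposed change: the caller drops its ref before overwriting the pointer. */
static void alt_caller_done(struct alt_dev *d)
{
	alt_put(d);
}

/* Count while the child is idle, i.e. only the list holds a reference. */
static int alt_idle_refs(void)
{
	struct alt_dev d;

	alt_create(&d);
	alt_caller_done(&d);
	return d.refs;
}

/* Hot-remove path: one get in the eject request, two puts in the work. */
static bool alt_hot_remove_frees(void)
{
	struct alt_dev d;

	alt_create(&d);
	alt_caller_done(&d);	/* refs == 1 */
	alt_get(&d);		/* eject request: refs == 2 */
	alt_put(&d);		/* removed from the children list: refs == 1 */
	alt_put(&d);		/* pairs with the eject get: refs == 0 */
	return d.freed;
}

/* Module-removal path: moving between lists keeps the count unchanged. */
static bool alt_rmmod_frees(void)
{
	struct alt_dev d;

	alt_create(&d);
	alt_caller_done(&d);	/* refs == 1 */
	/* moved to the local list: no put */
	alt_put(&d);		/* deleted from the local list: refs == 0 */
	return d.freed;
}
```

In this model the idle count is 1, and both removal paths reach 0 with the existing two puts in the eject work.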
Your patch works, but to me it leaves the ref count in an unnatural state most of the time.
Michael