Hi Greg, Hi Sasha,
Please take this backport of the upstream commit a46044a92add ("s390/pci: fix zpci_zdev_put() on reserve") for the v5.14 stable series. After adding the prerequisite commit 023cc3cb1e4b ("s390/pci: cleanup resources only if necessary") both it and the original upstream patch apply cleanly. I have also tested them with the original problem situation on top of v5.14.14 and confirmed the issue to be fixed.
Thanks, Niklas
Niklas Schnelle (2): s390/pci: cleanup resources only if necessary s390/pci: fix zpci_zdev_put() on reserve
arch/s390/include/asm/pci.h | 2 ++ arch/s390/pci/pci.c | 48 ++++++++++++++++++++++++++---- arch/s390/pci/pci_event.c | 4 +-- drivers/pci/hotplug/s390_pci_hpc.c | 9 +----- 4 files changed, 47 insertions(+), 16 deletions(-)
commit 023cc3cb1e4baa8d1900a4f2e999701c82ce2b6c upstream.
It's currently safe to call zpci_cleanup_bus_resources() even if the resources were never created but it makes no sense so check zdev->has_resources before we call zpci_cleanup_bus_resources() in zpci_release_device().
Reviewed-by: Matthew Rosato mjrosato@linux.ibm.com Acked-by: Pierre Morel pmorel@linux.ibm.com Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com --- arch/s390/pci/pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index 34839bad33e4..5e234cb2ad7b 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -829,7 +829,8 @@ void zpci_release_device(struct kref *kref) case ZPCI_FN_STATE_STANDBY: if (zdev->has_hp_slot) zpci_exit_slot(zdev); - zpci_cleanup_bus_resources(zdev); + if (zdev->has_resources) + zpci_cleanup_bus_resources(zdev); zpci_bus_device_unregister(zdev); zpci_destroy_iommu(zdev); fallthrough;
On Sun, Oct 24, 2021 at 02:04:00PM +0200, Greg KH wrote:
On Thu, Oct 21, 2021 at 04:13:40PM +0200, Niklas Schnelle wrote:
commit 023cc3cb1e4baa8d1900a4f2e999701c82ce2b6c upstream.
This is not a commit in Linus's tree :(
It is 02368b7cf6c7 ("s390/pci: cleanup resources only if necessary"), don't know where the sha1 you have here is from :(
On Mon, 2021-10-25 at 09:48 +0200, Greg KH wrote:
On Sun, Oct 24, 2021 at 02:04:00PM +0200, Greg KH wrote:
On Thu, Oct 21, 2021 at 04:13:40PM +0200, Niklas Schnelle wrote:
commit 023cc3cb1e4baa8d1900a4f2e999701c82ce2b6c upstream.
This is not a commit in Linus's tree :(
It is 02368b7cf6c7 ("s390/pci: cleanup resources only if necessary"), don't know where the sha1 you have here is from :(
Oh sorry yes, this the correct commit. Mixed up and got the hash from a local branch, even managed to match the first 3 digits. I resent with the correct hash.
On Mon, Oct 25, 2021 at 10:43:46AM +0200, Niklas Schnelle wrote:
On Mon, 2021-10-25 at 09:48 +0200, Greg KH wrote:
On Sun, Oct 24, 2021 at 02:04:00PM +0200, Greg KH wrote:
On Thu, Oct 21, 2021 at 04:13:40PM +0200, Niklas Schnelle wrote:
commit 023cc3cb1e4baa8d1900a4f2e999701c82ce2b6c upstream.
This is not a commit in Linus's tree :(
It is 02368b7cf6c7 ("s390/pci: cleanup resources only if necessary"), don't know where the sha1 you have here is from :(
Oh sorry yes, this the correct commit. Mixed up and got the hash from a local branch, even managed to match the first 3 digits. I resent with the correct hash.
Already queued up, just need a 5.10.y version now, thanks.
greg k-h
commit a46044a92add6a400f4dada7b943b30221f7cc80 upstream.
Since commit 2a671f77ee49 ("s390/pci: fix use after free of zpci_dev") the reference count of a zpci_dev is incremented between pcibios_add_device() and pcibios_release_device() which was supposed to prevent the zpci_dev from being freed while the common PCI code has access to it. It was missed however that the handling of zPCI availability events assumed that once zpci_zdev_put() was called no later availability event would still see the device. With the previously mentioned commit however this assumption no longer holds and we must make sure that we only drop the initial long-lived reference the zPCI subsystem holds exactly once.
Do so by introducing a zpci_device_reserved() function that handles when a device is reserved. Here we make sure the zpci_dev will not be considered for further events by removing it from the zpci_list.
This also means that the device actually stays in the ZPCI_FN_STATE_RESERVED state between the time we know it has been reserved and the final reference going away. We thus need to consider it a real state instead of just a conceptual state after the removal. The final cleanup of PCI resources, removal from zbus, and destruction of the IOMMU stays in zpci_release_device() to make sure holders of the reference do see valid data until the release.
Fixes: 2a671f77ee49 ("s390/pci: fix use after free of zpci_dev") Cc: stable@vger.kernel.org Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com --- arch/s390/include/asm/pci.h | 2 ++ arch/s390/pci/pci.c | 45 ++++++++++++++++++++++++++---- arch/s390/pci/pci_event.c | 4 +-- drivers/pci/hotplug/s390_pci_hpc.c | 9 +----- 4 files changed, 45 insertions(+), 15 deletions(-)
diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index 5509b224c2ec..abe4d45c2f47 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -207,6 +207,8 @@ int zpci_enable_device(struct zpci_dev *); int zpci_disable_device(struct zpci_dev *); int zpci_scan_configured_device(struct zpci_dev *zdev, u32 fh); int zpci_deconfigure_device(struct zpci_dev *zdev); +void zpci_device_reserved(struct zpci_dev *zdev); +bool zpci_is_device_configured(struct zpci_dev *zdev);
int zpci_register_ioat(struct zpci_dev *, u8, u64, u64, u64); int zpci_unregister_ioat(struct zpci_dev *, u8); diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index 5e234cb2ad7b..d7ef98218c80 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -92,7 +92,7 @@ void zpci_remove_reserved_devices(void) spin_unlock(&zpci_list_lock);
list_for_each_entry_safe(zdev, tmp, &remove, entry) - zpci_zdev_put(zdev); + zpci_device_reserved(zdev); }
int pci_domain_nr(struct pci_bus *bus) @@ -744,6 +744,14 @@ struct zpci_dev *zpci_create_device(u32 fid, u32 fh, enum zpci_state state) return ERR_PTR(rc); }
+bool zpci_is_device_configured(struct zpci_dev *zdev) +{ + enum zpci_state state = zdev->state; + + return state != ZPCI_FN_STATE_RESERVED && + state != ZPCI_FN_STATE_STANDBY; +} + /** * zpci_scan_configured_device() - Scan a freshly configured zpci_dev * @zdev: The zpci_dev to be configured @@ -810,6 +818,31 @@ int zpci_deconfigure_device(struct zpci_dev *zdev) return 0; }
+/** + * zpci_device_reserved() - Mark device as resverved + * @zdev: the zpci_dev that was reserved + * + * Handle the case that a given zPCI function was reserved by another system. + * After a call to this function the zpci_dev can not be found via + * get_zdev_by_fid() anymore but may still be accessible via existing + * references though it will not be functional anymore. + */ +void zpci_device_reserved(struct zpci_dev *zdev) +{ + if (zdev->has_hp_slot) + zpci_exit_slot(zdev); + /* + * Remove device from zpci_list as it is going away. This also + * makes sure we ignore subsequent zPCI events for this device. + */ + spin_lock(&zpci_list_lock); + list_del(&zdev->entry); + spin_unlock(&zpci_list_lock); + zdev->state = ZPCI_FN_STATE_RESERVED; + zpci_dbg(3, "rsv fid:%x\n", zdev->fid); + zpci_zdev_put(zdev); +} + void zpci_release_device(struct kref *kref) { struct zpci_dev *zdev = container_of(kref, struct zpci_dev, kref); @@ -829,6 +862,12 @@ void zpci_release_device(struct kref *kref) case ZPCI_FN_STATE_STANDBY: if (zdev->has_hp_slot) zpci_exit_slot(zdev); + spin_lock(&zpci_list_lock); + list_del(&zdev->entry); + spin_unlock(&zpci_list_lock); + zpci_dbg(3, "rsv fid:%x\n", zdev->fid); + fallthrough; + case ZPCI_FN_STATE_RESERVED: if (zdev->has_resources) zpci_cleanup_bus_resources(zdev); zpci_bus_device_unregister(zdev); @@ -837,10 +876,6 @@ void zpci_release_device(struct kref *kref) default: break; } - - spin_lock(&zpci_list_lock); - list_del(&zdev->entry); - spin_unlock(&zpci_list_lock); zpci_dbg(3, "rem fid:%x\n", zdev->fid); kfree(zdev); } diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c index cd447b96b4b1..9b26617ca1c5 100644 --- a/arch/s390/pci/pci_event.c +++ b/arch/s390/pci/pci_event.c @@ -137,7 +137,7 @@ static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf) /* The 0x0304 event may immediately reserve the device */ if (!clp_get_state(zdev->fid, &state) && state == ZPCI_FN_STATE_RESERVED) { - zpci_zdev_put(zdev); + zpci_device_reserved(zdev); } } break; @@ -148,7 +148,7 @@ static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf) case 0x0308: /* Standby -> Reserved */ if (!zdev) break; - zpci_zdev_put(zdev); + zpci_device_reserved(zdev); break; default: break; diff --git a/drivers/pci/hotplug/s390_pci_hpc.c b/drivers/pci/hotplug/s390_pci_hpc.c index 014868752cd4..dcefdb42ac46 100644 --- a/drivers/pci/hotplug/s390_pci_hpc.c +++ b/drivers/pci/hotplug/s390_pci_hpc.c @@ -62,14 +62,7 @@ static int get_power_status(struct hotplug_slot *hotplug_slot, u8 *value) struct zpci_dev *zdev = container_of(hotplug_slot, struct zpci_dev, hotplug_slot);
- switch (zdev->state) { - case ZPCI_FN_STATE_STANDBY: - *value = 0; - break; - default: - *value = 1; - break; - } + *value = zpci_is_device_configured(zdev) ? 1 : 0; return 0; }
On Thu, Oct 21, 2021 at 04:13:41PM +0200, Niklas Schnelle wrote:
commit a46044a92add6a400f4dada7b943b30221f7cc80 upstream.
Since commit 2a671f77ee49 ("s390/pci: fix use after free of zpci_dev") the reference count of a zpci_dev is incremented between pcibios_add_device() and pcibios_release_device() which was supposed to prevent the zpci_dev from being freed while the common PCI code has access to it. It was missed however that the handling of zPCI availability events assumed that once zpci_zdev_put() was called no later availability event would still see the device. With the previously mentioned commit however this assumption no longer holds and we must make sure that we only drop the initial long-lived reference the zPCI subsystem holds exactly once.
Do so by introducing a zpci_device_reserved() function that handles when a device is reserved. Here we make sure the zpci_dev will not be considered for further events by removing it from the zpci_list.
This also means that the device actually stays in the ZPCI_FN_STATE_RESERVED state between the time we know it has been reserved and the final reference going away. We thus need to consider it a real state instead of just a conceptual state after the removal. The final cleanup of PCI resources, removal from zbus, and destruction of the IOMMU stays in zpci_release_device() to make sure holders of the reference do see valid data until the release.
Fixes: 2a671f77ee49 ("s390/pci: fix use after free of zpci_dev") Cc: stable@vger.kernel.org Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com
This is also needed for 5.10.y, can you please provide a working backport for that tree too?
thanks,
greg k-h
On Mon, 2021-10-25 at 09:49 +0200, Greg KH wrote:
On Thu, Oct 21, 2021 at 04:13:41PM +0200, Niklas Schnelle wrote:
commit a46044a92add6a400f4dada7b943b30221f7cc80 upstream.
Since commit 2a671f77ee49 ("s390/pci: fix use after free of zpci_dev") the reference count of a zpci_dev is incremented between pcibios_add_device() and pcibios_release_device() which was supposed to prevent the zpci_dev from being freed while the common PCI code has access to it. It was missed however that the handling of zPCI availability events assumed that once zpci_zdev_put() was called no later availability event would still see the device. With the previously mentioned commit however this assumption no longer holds and we must make sure that we only drop the initial long-lived reference the zPCI subsystem holds exactly once.
Do so by introducing a zpci_device_reserved() function that handles when a device is reserved. Here we make sure the zpci_dev will not be considered for further events by removing it from the zpci_list.
This also means that the device actually stays in the ZPCI_FN_STATE_RESERVED state between the time we know it has been reserved and the final reference going away. We thus need to consider it a real state instead of just a conceptual state after the removal. The final cleanup of PCI resources, removal from zbus, and destruction of the IOMMU stays in zpci_release_device() to make sure holders of the reference do see valid data until the release.
Fixes: 2a671f77ee49 ("s390/pci: fix use after free of zpci_dev") Cc: stable@vger.kernel.org Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com
This is also needed for 5.10.y, can you please provide a working backport for that tree too?
thanks,
greg k-h
Hi Greg,
Sorry again for the confusion. For v5.10.y the backport I sent originally is the correct one. I had replied there explaining that backporting the prerequisite commit to v5.10.y makes little sense because the flag it checks isn't there yet. The backport for v5.10 also doesn't subsume the prerequisite commits change. I guess this reply was missed. I'll resent anyway then we have both v5.14.y and v5.10.y sent today and easier to find.
Thanks, Niklas
linux-stable-mirror@lists.linaro.org