Commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated") introduced a change that results in a circular lockdep when a Secure Execution guest that is configured with crypto devices is started. The problem resulted due to the fact that the patch moved the setting of the guest's AP masks within the protection of the matrix_dev->lock when the vfio_ap driver is notified that the KVM pointer has been set. Since it is not critical that setting/clearing of the guest's AP masks be done under the matrix_dev->lock when the driver is notified, the masks will not be updated under the matrix_dev->lock. The lock is necessary for the setting/unsetting of the KVM pointer, however, so that will remain in place.
The dependency chain for the circular lockdep resolved by this patch is (in reverse order):
2: vfio_ap_mdev_group_notifier: kvm->lock matrix_dev->lock
1: handle_pqap: matrix_dev->lock kvm_vcpu_ioctl: vcpu->mutex
0: kvm_s390_cpus_to_pv: vcpu->mutex kvm_vm_ioctl: kvm->lock
Please note that if checkpatch is run against this patch series, you may get a "WARNING: Unknown commit id 'f21916ec4826', maybe rebased or not pulled?" message. The commit 'f21916ec4826', however, is definitely in the master branch on top of which this patch series was built, so I'm not sure why this message is being output by checkpatch.
Change log v1=> v2: ------------------ * No longer holding the matrix_dev->lock prior to setting/clearing the masks supplying the AP configuration to a KVM guest. * Make all updates to the data in the matrix mdev that is used to manage AP resources used by the KVM guest in the vfio_ap_mdev_set_kvm() function instead of the group notifier callback. * Check for the matrix mdev's KVM pointer in the vfio_ap_mdev_unset_kvm() function instead of the vfio_ap_mdev_release() function.
Tony Krowiak (1): s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++--------- 1 file changed, 84 insertions(+), 35 deletions(-)
This patch fixes a circular locking dependency in the CI introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid CI errors, this fix is being provided.
The circular lockdep was introduced when the masks in the guest's APCB were taken under the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the KVM pointer, it is not necessarily critical for setting the masks, so this will not be done under protection of the matrix_dev->lock.
Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated") Cc: stable@vger.kernel.org Signed-off-by: Tony Krowiak akrowiak@linux.ibm.com --- drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++--------- 1 file changed, 84 insertions(+), 35 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 41fc2e4135fe..8574b6ecc9c5 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1027,8 +1027,21 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = { * @matrix_mdev: a mediated matrix device * @kvm: reference to KVM instance * - * Verifies no other mediated matrix device has @kvm and sets a reference to - * it in @matrix_mdev->kvm. + * Sets all data for @matrix_mdev that are needed to manage AP resources + * for the guest whose state is represented by @kvm: + * 1. Verifies no other mediated device has a reference to @kvm. + * 2. Increments the ref count for @kvm so it doesn't disappear until the + * vfio_ap driver is notified the pointer is being nullified. + * 3. Sets a reference to the PQAP hook (i.e., handle_pqap() function) into + * @kvm to handle interception of the PQAP(AQIC) instruction. + * 4. Sets the masks supplying the AP configuration to the KVM guest. + * 5. Sets the KVM pointer into @kvm so the vfio_ap driver can access it. + * + * Note: The matrix_dev->lock must be taken prior to calling + * this function; however, the lock will be temporarily released to avoid a + * potential circular lock dependency with other asynchronous processes that + * lock the kvm->lock mutex which is also needed to supply the guest's AP + * configuration. * * Return 0 if no other mediated matrix device has a reference to @kvm; * otherwise, returns an -EPERM. @@ -1043,9 +1056,17 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, return -EPERM; }
- matrix_mdev->kvm = kvm; - kvm_get_kvm(kvm); - kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook; + if (kvm->arch.crypto.crycbd) { + kvm_get_kvm(kvm); + kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook; + mutex_unlock(&matrix_dev->lock); + kvm_arch_crypto_set_masks(kvm, + matrix_mdev->matrix.apm, + matrix_mdev->matrix.aqm, + matrix_mdev->matrix.adm); + mutex_lock(&matrix_dev->lock); + matrix_mdev->kvm = kvm; + }
return 0; } @@ -1079,51 +1100,80 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, return NOTIFY_DONE; }
+/** + * vfio_ap_mdev_unset_kvm + * + * @matrix_mdev: a matrix mediated device + * + * Performs clean-up of resources no longer needed by @matrix_mdev. + * + * Note: The matrix_dev->lock must be taken prior to calling this + * function; however, the lock will be temporarily released to avoid a + * potential circular lock dependency with other asynchronous processes that + * lock the kvm->lock mutex which is also needed to update the guest's AP + * configuration as follows: + * 1. Grab a reference to the KVM pointer stored in @matrix_mdev. + * 2. Set the KVM pointer in @matrix_mdev to NULL so no other asynchronous + * process uses it (e.g., assign_adapter store function) after + * unlocking the matrix_dev->lock mutex. + * 3. Set the PQAP hook to NULL so it will not be invoked after unlocking + * the matrix_dev->lock mutex. + * 4. Unlock the matrix_dev->lock mutex to avoid circular lock + * dependencies. + * 5. Clear the masks in the guest's APCB to remove guest access to AP + * resources assigned to @matrix_mdev. + * 6. Lock the matrix_dev->lock mutex to prevent access to resources + * assigned to @matrix_mdev while the remainder of the cleanup + * operations take place. + * 7. Decrement the reference counter incremented in #1. + * 8. Set the reference to the KVM pointer grabbed in #1 into @matrix_mdev + * (set to NULL in #2) because it will be needed when the queues are + * reset to clean up any IRQ resources being held. + * 9. Decrement the reference count that was incremented when the KVM + * pointer was originally set by the group notifier. + * 10. Set the KVM pointer @matrix_mdev to NULL to prevent its usage from + * here on out. + * + */ static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev) { - kvm_arch_crypto_clear_masks(matrix_mdev->kvm); - matrix_mdev->kvm->arch.crypto.pqap_hook = NULL; - vfio_ap_mdev_reset_queues(matrix_mdev->mdev); - kvm_put_kvm(matrix_mdev->kvm); - matrix_mdev->kvm = NULL; + struct kvm *kvm; + + if (matrix_mdev->kvm) { + kvm = matrix_mdev->kvm; + kvm_get_kvm(kvm); + matrix_mdev->kvm = NULL; + kvm->arch.crypto.pqap_hook = NULL; + mutex_unlock(&matrix_dev->lock); + kvm_arch_crypto_clear_masks(kvm); + mutex_lock(&matrix_dev->lock); + kvm_put_kvm(kvm); + matrix_mdev->kvm = kvm; + vfio_ap_mdev_reset_queues(matrix_mdev->mdev); + kvm_put_kvm(matrix_mdev->kvm); + matrix_mdev->kvm = NULL; + } }
static int vfio_ap_mdev_group_notifier(struct notifier_block *nb, unsigned long action, void *data) { - int ret, notify_rc = NOTIFY_OK; + int notify_rc = NOTIFY_OK; struct ap_matrix_mdev *matrix_mdev;
if (action != VFIO_GROUP_NOTIFY_SET_KVM) return NOTIFY_OK;
- matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier); mutex_lock(&matrix_dev->lock); + matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
- if (!data) { - if (matrix_mdev->kvm) - vfio_ap_mdev_unset_kvm(matrix_mdev); - goto notify_done; - } - - ret = vfio_ap_mdev_set_kvm(matrix_mdev, data); - if (ret) { - notify_rc = NOTIFY_DONE; - goto notify_done; - } - - /* If there is no CRYCB pointer, then we can't copy the masks */ - if (!matrix_mdev->kvm->arch.crypto.crycbd) { + if (!data) + vfio_ap_mdev_unset_kvm(matrix_mdev); + else if (vfio_ap_mdev_set_kvm(matrix_mdev, data)) notify_rc = NOTIFY_DONE; - goto notify_done; - } - - kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm, - matrix_mdev->matrix.aqm, - matrix_mdev->matrix.adm);
-notify_done: mutex_unlock(&matrix_dev->lock); + return notify_rc; }
@@ -1258,8 +1308,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev) struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
mutex_lock(&matrix_dev->lock); - if (matrix_mdev->kvm) - vfio_ap_mdev_unset_kvm(matrix_mdev); + vfio_ap_mdev_unset_kvm(matrix_mdev); mutex_unlock(&matrix_dev->lock);
vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
On Mon, 15 Feb 2021 20:15:47 -0500 Tony Krowiak akrowiak@linux.ibm.com wrote:
This patch fixes a circular locking dependency in the CI introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid CI errors, this fix is being provided.
The circular lockdep was introduced when the masks in the guest's APCB were taken under the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the KVM pointer, it is not necessarily critical for setting the masks, so this will not be done under protection of the matrix_dev->lock.
Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated") Cc: stable@vger.kernel.org Signed-off-by: Tony Krowiak akrowiak@linux.ibm.com
drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++--------- 1 file changed, 84 insertions(+), 35 deletions(-)
I've been looking at the patch for a bit now and tried to follow down the various paths; and while I think it's ok, I do not really have enough confidence about that for a R-b. But have an
Acked-by: Cornelia Huck cohuck@redhat.com
On 2/19/21 8:45 AM, Cornelia Huck wrote:
On Mon, 15 Feb 2021 20:15:47 -0500 Tony Krowiak akrowiak@linux.ibm.com wrote:
This patch fixes a circular locking dependency in the CI introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid CI errors, this fix is being provided.
The circular lockdep was introduced when the masks in the guest's APCB were taken under the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the KVM pointer, it is not necessarily critical for setting the masks, so this will not be done under protection of the matrix_dev->lock.
Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated") Cc: stable@vger.kernel.org Signed-off-by: Tony Krowiak akrowiak@linux.ibm.com
drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++--------- 1 file changed, 84 insertions(+), 35 deletions(-)
I've been looking at the patch for a bit now and tried to follow down the various paths; and while I think it's ok, I do not really have enough confidence about that for a R-b. But have an
Acked-by: Cornelia Huck cohuck@redhat.com
Thanks for the review.
On Mon, 15 Feb 2021 20:15:47 -0500 Tony Krowiak akrowiak@linux.ibm.com wrote:
This patch fixes a circular locking dependency in the CI introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid CI errors, this fix is being provided.
The circular lockdep was introduced when the masks in the guest's APCB were taken under the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the KVM pointer, it is not necessarily critical for setting the masks, so this will not be done under protection of the matrix_dev->lock.
With the one little thing I commented on below addressed: Acked-by: Halil Pasic pasic@linux.ibm.com
This solution probably ain't a perfect one, but can't say I see a simple way to get around this problem. For instance I played with the thought of taking locks in a different order and keeping the critical sections intact, but that has problems of its own. Tony should have the best understanding of vfio_ap anyway.
In theory the execution of vfio_ap_mdev_group_notifier() and vfio_ap_mdev_release() could interleave, and we could loose a clear because in theory some permutations of the critical sections need to be considered. In practice I hope that won't happen with QEMU.
Tony, you gave this a decent amount of testing or?
I think we should move forward with this. Any objections?
Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated") Cc: stable@vger.kernel.org Signed-off-by: Tony Krowiak akrowiak@linux.ibm.com
drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++--------- 1 file changed, 84 insertions(+), 35 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 41fc2e4135fe..8574b6ecc9c5 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1027,8 +1027,21 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = {
- @matrix_mdev: a mediated matrix device
- @kvm: reference to KVM instance
- Verifies no other mediated matrix device has @kvm and sets a reference to
- it in @matrix_mdev->kvm.
- Sets all data for @matrix_mdev that are needed to manage AP resources
- for the guest whose state is represented by @kvm:
- Verifies no other mediated device has a reference to @kvm.
- Increments the ref count for @kvm so it doesn't disappear until the
- vfio_ap driver is notified the pointer is being nullified.
- Sets a reference to the PQAP hook (i.e., handle_pqap() function) into
- @kvm to handle interception of the PQAP(AQIC) instruction.
- Sets the masks supplying the AP configuration to the KVM guest.
- Sets the KVM pointer into @kvm so the vfio_ap driver can access it.
Could for example a PQAP AQIC run across an unset matrix_mdev->kvm like this, in theory? I don't think it's likely to happen in the wild though. Why not set it up before setting the mask?
- Note: The matrix_dev->lock must be taken prior to calling
- this function; however, the lock will be temporarily released to avoid a
- potential circular lock dependency with other asynchronous processes that
- lock the kvm->lock mutex which is also needed to supply the guest's AP
- configuration.
- Return 0 if no other mediated matrix device has a reference to @kvm;
- otherwise, returns an -EPERM.
@@ -1043,9 +1056,17 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, return -EPERM; }
- matrix_mdev->kvm = kvm;
- kvm_get_kvm(kvm);
- kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
- if (kvm->arch.crypto.crycbd) {
kvm_get_kvm(kvm);
kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
mutex_unlock(&matrix_dev->lock);
kvm_arch_crypto_set_masks(kvm,
matrix_mdev->matrix.apm,
matrix_mdev->matrix.aqm,
matrix_mdev->matrix.adm);
mutex_lock(&matrix_dev->lock);
matrix_mdev->kvm = kvm;
- }
return 0; } @@ -1079,51 +1100,80 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, return NOTIFY_DONE; } +/**
- vfio_ap_mdev_unset_kvm
- @matrix_mdev: a matrix mediated device
- Performs clean-up of resources no longer needed by @matrix_mdev.
- Note: The matrix_dev->lock must be taken prior to calling this
- function; however, the lock will be temporarily released to avoid a
- potential circular lock dependency with other asynchronous processes that
- lock the kvm->lock mutex which is also needed to update the guest's AP
- configuration as follows:
- Grab a reference to the KVM pointer stored in @matrix_mdev.
- Set the KVM pointer in @matrix_mdev to NULL so no other asynchronous
process uses it (e.g., assign_adapter store function) after
unlocking the matrix_dev->lock mutex.
- Set the PQAP hook to NULL so it will not be invoked after unlocking
the matrix_dev->lock mutex.
- Unlock the matrix_dev->lock mutex to avoid circular lock
dependencies.
- Clear the masks in the guest's APCB to remove guest access to AP
resources assigned to @matrix_mdev.
- Lock the matrix_dev->lock mutex to prevent access to resources
assigned to @matrix_mdev while the remainder of the cleanup
operations take place.
- Decrement the reference counter incremented in #1.
- Set the reference to the KVM pointer grabbed in #1 into @matrix_mdev
(set to NULL in #2) because it will be needed when the queues are
reset to clean up any IRQ resources being held.
- Decrement the reference count that was incremented when the KVM
pointer was originally set by the group notifier.
- Set the KVM pointer @matrix_mdev to NULL to prevent its usage from
here on out.
- */
static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev) {
- kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
- matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
- vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
- kvm_put_kvm(matrix_mdev->kvm);
- matrix_mdev->kvm = NULL;
- struct kvm *kvm;
- if (matrix_mdev->kvm) {
kvm = matrix_mdev->kvm;
kvm_get_kvm(kvm);
matrix_mdev->kvm = NULL;
I think if there were two threads dong the unset in parallel, one of them could bail out and carry on before the cleanup is done. But since nothing much happens in release after that, I don't see an immediate problem.
Another thing to consider is, that setting ->kvm to NULL arms vfio_ap_mdev_remove()...
kvm->arch.crypto.pqap_hook = NULL;
mutex_unlock(&matrix_dev->lock);
kvm_arch_crypto_clear_masks(kvm);
mutex_lock(&matrix_dev->lock);
kvm_put_kvm(kvm);
matrix_mdev->kvm = kvm;
vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
kvm_put_kvm(matrix_mdev->kvm);
matrix_mdev->kvm = NULL;
- }
} static int vfio_ap_mdev_group_notifier(struct notifier_block *nb, unsigned long action, void *data) {
- int ret, notify_rc = NOTIFY_OK;
- int notify_rc = NOTIFY_OK; struct ap_matrix_mdev *matrix_mdev;
if (action != VFIO_GROUP_NOTIFY_SET_KVM) return NOTIFY_OK;
- matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier); mutex_lock(&matrix_dev->lock);
- matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
- if (!data) {
if (matrix_mdev->kvm)
vfio_ap_mdev_unset_kvm(matrix_mdev);
goto notify_done;
- }
- ret = vfio_ap_mdev_set_kvm(matrix_mdev, data);
- if (ret) {
notify_rc = NOTIFY_DONE;
goto notify_done;
- }
- /* If there is no CRYCB pointer, then we can't copy the masks */
- if (!matrix_mdev->kvm->arch.crypto.crycbd) {
- if (!data)
vfio_ap_mdev_unset_kvm(matrix_mdev);
- else if (vfio_ap_mdev_set_kvm(matrix_mdev, data)) notify_rc = NOTIFY_DONE;
goto notify_done;
- }
- kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm,
matrix_mdev->matrix.aqm,
matrix_mdev->matrix.adm);
-notify_done: mutex_unlock(&matrix_dev->lock);
- return notify_rc;
} @@ -1258,8 +1308,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev) struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev); mutex_lock(&matrix_dev->lock);
- if (matrix_mdev->kvm)
vfio_ap_mdev_unset_kvm(matrix_mdev);
- vfio_ap_mdev_unset_kvm(matrix_mdev); mutex_unlock(&matrix_dev->lock);
vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
On 23.02.21 10:48, Halil Pasic wrote:
On Mon, 15 Feb 2021 20:15:47 -0500 Tony Krowiak akrowiak@linux.ibm.com wrote:
This patch fixes a circular locking dependency in the CI introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid CI errors, this fix is being provided.
The circular lockdep was introduced when the masks in the guest's APCB were taken under the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the KVM pointer, it is not necessarily critical for setting the masks, so this will not be done under protection of the matrix_dev->lock.
With the one little thing I commented on below addressed: Acked-by: Halil Pasic pasic@linux.ibm.com
Tony, can you comment on Halils comment or send a v3 right away?
On 2/24/21 11:10 AM, Christian Borntraeger wrote:
On 23.02.21 10:48, Halil Pasic wrote:
On Mon, 15 Feb 2021 20:15:47 -0500 Tony Krowiak akrowiak@linux.ibm.com wrote:
This patch fixes a circular locking dependency in the CI introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid CI errors, this fix is being provided.
The circular lockdep was introduced when the masks in the guest's APCB were taken under the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the KVM pointer, it is not necessarily critical for setting the masks, so this will not be done under protection of the matrix_dev->lock.
With the one little thing I commented on below addressed: Acked-by: Halil Pasic pasic@linux.ibm.com
Tony, can you comment on Halils comment or send a v3 right away?
I was locked out of email due to expiration of my w3 password. I am working on the response now.
linux-stable-mirror@lists.linaro.org