On 2/25/25 16:00, Cosmin Ratiu wrote:
On Tue, 2025-02-25 at 09:40 +0000, Hangbin Liu wrote:
The fixed commit placed mutex_lock() inside spin_lock_bh(), which triggers a warning like:
BUG: sleeping function called from invalid context at...
Fix this by moving the mutex_lock() operation to a work queue.
Fixes: 2aeeef906d5a ("bonding: change ipsec_lock from spin lock to mutex") Reported-by: Jakub Kicinski kuba@kernel.org Closes: https://lore.kernel.org/netdev/20241212062734.182a0164@kernel.org Signed-off-by: Hangbin Liu liuhangbin@gmail.com
drivers/net/bonding/bond_main.c | 41 +++++++++++++++++++++++++------
include/net/bonding.h | 6 +++++ 2 files changed, 37 insertions(+), 10 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index e45bba240cbc..cc7064aa4b35 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -551,6 +551,25 @@ static void bond_ipsec_add_sa_all(struct bonding *bond) mutex_unlock(&bond->ipsec_lock); } +static void bond_xfrm_state_gc_work(struct work_struct *work) +{
- struct bond_xfrm_work *xfrm_work = container_of(work, struct
bond_xfrm_work, work);
- struct bonding *bond = xfrm_work->bond;
- struct xfrm_state *xs = xfrm_work->xs;
- struct bond_ipsec *ipsec;
- mutex_lock(&bond->ipsec_lock);
- list_for_each_entry(ipsec, &bond->ipsec_list, list) {
if (ipsec->xs == xs) {
list_del(&ipsec->list);
kfree(ipsec);
xfrm_state_put(xs);
I would expect xfrm_state_put to be called from outside the loop, regardless of whether an entry is found in the list or not, because it was unconditionally referenced when the work was created.
break;
}
- }
- mutex_unlock(&bond->ipsec_lock);
+}
/** * bond_ipsec_del_sa - clear out this specific SA * @xs: pointer to transformer state struct @@ -558,9 +577,9 @@ static void bond_ipsec_add_sa_all(struct bonding *bond) static void bond_ipsec_del_sa(struct xfrm_state *xs) { struct net_device *bond_dev = xs->xso.dev;
- struct bond_xfrm_work *xfrm_work;
struct net_device *real_dev; netdevice_tracker tracker;
- struct bond_ipsec *ipsec;
struct bonding *bond; struct slave *slave; @@ -592,15 +611,17 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs) real_dev->xfrmdev_ops->xdo_dev_state_delete(xs); out: netdev_put(real_dev, &tracker);
- mutex_lock(&bond->ipsec_lock);
- list_for_each_entry(ipsec, &bond->ipsec_list, list) {
if (ipsec->xs == xs) {
list_del(&ipsec->list);
kfree(ipsec);
break;
}
- }
- mutex_unlock(&bond->ipsec_lock);
- xfrm_work = kmalloc(sizeof(*xfrm_work), GFP_ATOMIC);
- if (!xfrm_work)
return;
- INIT_WORK(&xfrm_work->work, bond_xfrm_state_gc_work);
- xfrm_work->bond = bond;
- xfrm_work->xs = xs;
- xfrm_state_hold(xs);
- queue_work(bond->wq, &xfrm_work->work);
} static void bond_ipsec_del_sa_all(struct bonding *bond) diff --git a/include/net/bonding.h b/include/net/bonding.h index 8bb5f016969f..d54ba5e3affb 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -209,6 +209,12 @@ struct bond_ipsec { struct xfrm_state *xs; }; +struct bond_xfrm_work {
- struct work_struct work;
- struct bonding *bond;
- struct xfrm_state *xs;
+};
Also, like Nikolai said, something needs to wait on all in-flight work items.
This got me to stare at the code again. What if we move the removal of the xs from bond->ipsec from bond_ipsec_del_sa to bond_ipsec_free_sa? bond_ipsec_free_sa, unlike bond_ipsec_del_sa, is not called with x-
lock held. It is called from the xfrm gc task or directly via
xfrm_state_put_sync and therefore wouldn't suffer from the locking issue.
The tricky part is to make sure that inactive bond->ipsec entries (after bond_ipsec_del_sa calls) do not cause issues if there's a migration (bond_ipsec_del_sa_all is called) happening before bond_ipsec_free_sa. Perhaps filtering by x->km.state != XFRM_STATE_DEAD in bond_ipsec_del_sa_all.
What do you think about this idea?
Cosmin.
I know the question was for Hangbin, but I do like this solution. I missed the xdo_dev_state_free callback, it could lead to a much simpler solution with some care.
Cheers, Nik