```
commit 8ae9fd98fe53c45a1a89adf6d254c2db90d26544
Author: Menglong Dong <menglong8.dong@gmail.com>

bpf: use rqspinlock for lru map
This patch converts raw_spinlock_t to rqspinlock_t in BPF LRU maps to prevent NMI context deadlocks. The new lock API can fail gracefully instead of deadlocking, returning an error code.
Link: https://lore.kernel.org/bpf/CAEf4BzbTJCUx0D=zjx6+5m5iiGhwLzaP94hnw36ZMDHAf4-...
diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c
index e7a2fc605..38fddcb1e 100644
--- a/kernel/bpf/bpf_lru_list.c
+++ b/kernel/bpf/bpf_lru_list.c
[ ... ]
@@ -466,23 +470,26 @@ static struct bpf_lru_node *bpf_common_lru_pop_free(struct bpf_lru *lru,
 	do {
 		steal_loc_l = per_cpu_ptr(clru->local_list, steal);

-		raw_spin_lock_irqsave(&steal_loc_l->lock, flags);
+		if (raw_res_spin_lock_irqsave(&steal_loc_l->lock, flags))
+			goto out_next;
 		node = __local_list_pop_free(steal_loc_l);
 		if (!node)
 			node = __local_list_pop_pending(lru, steal_loc_l);
-		raw_spin_unlock_irqrestore(&steal_loc_l->lock, flags);
+		raw_res_spin_unlock_irqrestore(&steal_loc_l->lock, flags);

+out_next:
 		steal = cpumask_next_wrap(steal, cpu_possible_mask);
 	} while (!node && steal != first_steal);

 	loc_l->next_steal = steal;

 	if (node) {
-		raw_spin_lock_irqsave(&loc_l->lock, flags);
+		if (raw_res_spin_lock_irqsave(&loc_l->lock, flags))
+			return NULL;
 			^^^^^^^^^^^

Can this leak the node? At this point in bpf_common_lru_pop_free(), a node was successfully stolen from another CPU's list. The __local_list_pop_free() or __local_list_pop_pending() call removed the node from steal_loc_l, and steal_loc_l was unlocked. The node now exists but is not in any list.
When raw_res_spin_lock_irqsave() fails here, the function returns NULL, but the stolen node is neither added to the local pending list nor put back on the list it was taken from. The node becomes orphaned.
In NMI context (which this patch is designed to handle), this lock acquisition can fail, for example when the NMI interrupts a holder of the same lock. Each such failure would leak one pre-allocated node from the BPF LRU pool, eventually exhausting the available nodes.

 		__local_list_add_pending(lru, loc_l, cpu, node, hash);
-		raw_spin_unlock_irqrestore(&loc_l->lock, flags);
+		raw_res_spin_unlock_irqrestore(&loc_l->lock, flags);
 	}

 	return node;
[ ... ]
```
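
The leak described in the review comment can be reproduced in a small userspace model. The sketch below is illustrative only: `model_node`, `model_list`, `model_lock()`, `steal_leaky()` and `steal_with_putback()` are hypothetical names that do not exist in the kernel, and `model_lock()` merely stands in for a lock call that may return nonzero. `steal_leaky()` mirrors the control flow of the patched hunk; `steal_with_putback()` shows one possible way to keep the node accounted for, assuming the victim list can be re-locked. It is not a proposed fix, since in the kernel that second lock attempt could fail as well.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: none of these names exist in the kernel. */
struct model_node {
	struct model_node *next;
};

struct model_list {
	struct model_node *head;
	int fail_next_lock;	/* how many upcoming lock attempts fail */
};

/* Stand-in for a resilient lock: returns nonzero when acquisition fails. */
static int model_lock(struct model_list *l)
{
	if (l->fail_next_lock > 0) {
		l->fail_next_lock--;
		return -1;
	}
	return 0;
}

static void model_push(struct model_list *l, struct model_node *n)
{
	assert(n != NULL);
	n->next = l->head;
	l->head = n;
}

static struct model_node *model_pop(struct model_list *l)
{
	struct model_node *n = l->head;

	if (n)
		l->head = n->next;
	return n;
}

static int model_count(const struct model_list *l)
{
	const struct model_node *n;
	int c = 0;

	for (n = l->head; n; n = n->next)
		c++;
	return c;
}

/* Mirrors the patched flow: steal a node, then bail out if the local lock fails. */
static struct model_node *steal_leaky(struct model_list *local,
				      struct model_list *victim)
{
	struct model_node *n;

	if (model_lock(victim))
		return NULL;
	n = model_pop(victim);
	if (n && model_lock(local))
		return NULL;	/* n is on no list any more: leaked */
	if (n)
		model_push(local, n);
	return n;
}

/* One possible shape: hand the node back to the victim list on failure. */
static struct model_node *steal_with_putback(struct model_list *local,
					     struct model_list *victim)
{
	struct model_node *n;

	if (model_lock(victim))
		return NULL;
	n = model_pop(victim);
	if (n && model_lock(local)) {
		/* If this second attempt also fails, the node still leaks. */
		if (!model_lock(victim))
			model_push(victim, n);
		return NULL;
	}
	if (n)
		model_push(local, n);
	return n;
}
```

Seeding `victim` with two nodes and forcing one local lock failure per call, `steal_leaky()` leaves the two lists holding one node in total, while a subsequent `steal_with_putback()` call that also fails keeps that total unchanged.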
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: `bpf: use rqspinlock for lru map`
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18928591681