2024-11-14, 11:32:36 +0100, Antonio Quartulli wrote:
On 13/11/2024 12:05, Sabrina Dubroca wrote:
2024-11-12, 15:26:59 +0100, Antonio Quartulli wrote:
On 11/11/2024 16:41, Sabrina Dubroca wrote:
2024-10-29, 11:47:31 +0100, Antonio Quartulli wrote:
+void ovpn_peer_hash_vpn_ip(struct ovpn_peer *peer)
+	__must_hold(&peer->ovpn->peers->lock)
Changes to peer->vpn_addrs are not protected by peers->lock, so those could be getting updated while we're rehashing (and taking peer->lock in ovpn_nl_peer_modify as I'm suggesting above also wouldn't prevent that).
/me screams :-D
Sorry :)
Indeed peers->lock is only about protecting the lists, not the content of the listed objects.
How about acquiring the peers->lock before calling ovpn_nl_peer_modify()?
It seems like it would work. Maybe a bit weird to have conditional locking (MP mode only), but ok. You already have this lock ordering (hold peers->lock before taking peer->lock) in ovpn_peer_keepalive_work_mp, so there should be no deadlock from doing the same thing in the netlink code.
Yeah.
Then I would also do that in ovpn_peer_float to protect that rehash.
I am not extremely comfortable with this, because it means acquiring peers->lock on every packet (right now we only take peer->lock there) and it may defeat the advantage of the RCU locking on the hashtables. Wouldn't you agree?
Hmpf, yeah. Then I think you could keep most of the current code, except doing the rehash under both locks (peers + peer), and get ss+sa_len for the rehash directly from peer->bind (instead of using the ones we just defined locally in ovpn_peer_float, since they may have changed while we released peer->lock to grab peers->lock). We may end up "rehashing" twice into the same bucket if we have 2 concurrent peer_float calls (call 1 sets remote r1, call 2 sets a new one r2, call 1 hashes according to r2, call 2 also rehashes based on r2). That should be ok (it can happen anyway that a "real" rehash lands in the same bucket).
peer_float
{
    spin_lock(peer)
    match/update bind
    spin_unlock(peer)

    if (MP) {
        spin_lock(peers)
        spin_lock(peer)
        rehash using peer->bind->remote rather than ss
        spin_unlock(peer)
        spin_unlock(peers)
    }
}
Does that sound reasonable?
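
For illustration only, here is a minimal userspace sketch of that two-phase pattern, with pthread mutexes standing in for the kernel spinlocks. The struct layouts, hash_remote(), the per-bucket counters and the early return when nothing changed are all hypothetical simplifications, not the actual ovpn code. The point it tries to show is that the rehash reads the remote from the peer itself, under both locks, instead of reusing the value computed before peer->lock was dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define NBUCKETS 16

/* Hypothetical stand-ins for the ovpn structures, not the real kernel types. */
struct peer {
	pthread_mutex_t lock;		/* stands in for peer->lock */
	uint32_t remote;		/* stands in for peer->bind->remote */
	unsigned int bucket;		/* bucket the peer is currently hashed in */
};

struct peer_table {
	pthread_mutex_t lock;		/* stands in for peers->lock */
	unsigned int count[NBUCKETS];	/* number of peers hashed per bucket */
};

/* Toy hash, assumed for the example only. */
static unsigned int hash_remote(uint32_t remote)
{
	return remote % NBUCKETS;
}

static void peer_float(struct peer_table *peers, struct peer *peer,
		       uint32_t new_remote, bool mp_mode)
{
	unsigned int bucket;
	bool changed;

	/* Phase 1: match/update the binding under peer->lock only. */
	pthread_mutex_lock(&peer->lock);
	changed = peer->remote != new_remote;
	if (changed)
		peer->remote = new_remote;
	pthread_mutex_unlock(&peer->lock);

	/* Assumption: no rehash is needed if nothing changed or not in MP mode. */
	if (!changed || !mp_mode)
		return;

	/* Phase 2: rehash under both locks, taking peers->lock first. */
	pthread_mutex_lock(&peers->lock);
	pthread_mutex_lock(&peer->lock);

	/*
	 * Re-read the remote from the peer: it may have been changed by a
	 * concurrent caller while no lock was held between the two phases.
	 */
	bucket = hash_remote(peer->remote);
	assert(bucket < NBUCKETS);

	peers->count[peer->bucket]--;
	peers->count[bucket]++;
	peer->bucket = bucket;

	pthread_mutex_unlock(&peer->lock);
	pthread_mutex_unlock(&peers->lock);
}
```

With two concurrent callers, both may end up moving the peer to the bucket of the most recent remote; the second move is then a no-op on the bucket counters, which matches the "rehashing twice into the same bucket" case described above.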