2024-11-12, 14:20:45 +0100, Antonio Quartulli wrote:
On 05/11/2024 19:10, Sabrina Dubroca wrote:
2024-10-29, 11:47:28 +0100, Antonio Quartulli wrote:
@@ -105,6 +132,9 @@ void ovpn_decrypt_post(void *data, int ret) goto drop; }
- /* keep track of last received authenticated packet for keepalive */
- peer->last_recv = ktime_get_real_seconds();
It doesn't look like we're locking the peer here so that should be a WRITE_ONCE() (and READ_ONCE(peer->last_recv) for all reads).
Is that because last_recv is 64 bit long (and might be more than one word on certain architectures)?
I don't remember having to do so for reading/writing 32 bit long integers.
AFAIK it's not just that. The compiler is free to do the read/write in any way it wants when you don't specify _ONCE. On the read side, it could read from memory a single time or multiple times (getting possibly different values each time), or maybe split the load (possibly reading chunks from different values being written in parallel).
I presume we need a WRITE_ONCE also upon initialization in ovpn_peer_keepalive_set() right? We still want to coordinate that with other reads/writes.
I think it makes sense, yes.
- /* point to encapsulated IP packet */ __skb_pull(skb, payload_offset);
@@ -121,6 +151,12 @@ void ovpn_decrypt_post(void *data, int ret) goto drop; }
if (ovpn_is_keepalive(skb)) {
net_dbg_ratelimited("%s: ping received from peer %u\n",
peer->ovpn->dev->name, peer->id);
goto drop;
To help with debugging connectivity issues, maybe keepalives shouldn't be counted as drops? (consume_skb instead of kfree_skb, and not incrementing rx_dropped) The packet was successfully received and did all it had to do.
you're absolutely right. Will change that.
Thanks.
- /* check for peer timeout */
- expired = false;
- timeout = peer->keepalive_timeout;
- delta = now - peer->last_recv;
I'm not sure that's always > 0 if we finish decrypting a packet just as the workqueue starts:
ovpn_peer_keepalive_work now = ...
ovpn_decrypt_post peer->last_recv = ...
ovpn_peer_keepalive_work_single delta: now < peer->last_recv
Yeah, there is nothing preventing this from happening...but is this truly a problem? The math should still work, no?
We'll fail "delta < timeout" (which we shouldn't), so we'll end up either in the "expired = true" case, or not updating keepalive_recv_exp. Both of these seem not ideal.
However:
- if (delta < timeout) {
peer->keepalive_recv_exp = now + timeout - delta;
I'd shorten that to
peer->keepalive_recv_exp = peer->last_recv + timeout;
it's a bit more readable to my eyes and avoids risks of wrapping values.
So I'd probably get rid of delta and go with:
last_recv = READ_ONCE(peer->last_recv) if (now < last_recv + timeout) { peer->keepalive_recv_exp = last_recv + timeout; next_run1 = peer->keepalive_recv_exp; } else if ...
next_run1 = peer->keepalive_recv_exp;
- } else if (peer->keepalive_recv_exp > now) {
next_run1 = peer->keepalive_recv_exp;
- } else {
expired = true;
- }
I agree this is simpler to read and gets rid of some extra operations.
[note: I took inspiration from nat_keepalive_work_single() - it could be simplified as well I guess]
Ah, ok. I wanted to review this code when it was posted but didn't have time :(
[...]
- /* check for peer keepalive */
- expired = false;
- interval = peer->keepalive_interval;
- delta = now - peer->last_sent;
- if (delta < interval) {
peer->keepalive_xmit_exp = now + interval - delta;
next_run2 = peer->keepalive_xmit_exp;
and same here
Yeah, will change both. Thanks!
Thanks.