2024-11-26, 02:32:38 +0200, Sergey Ryazanov wrote:
On 15.11.2024 17:02, Antonio Quartulli wrote:
On 11/11/2024 02:54, Sergey Ryazanov wrote: [...]
+        skb_reset_transport_header(skb);
+        skb_probe_transport_header(skb);
+        skb_reset_inner_headers(skb);
+ memset(skb->cb, 0, sizeof(skb->cb));
Why do we need to zero the control buffer here?
To prevent the next layer from assuming the cb is clean while it is not. Other drivers do the same as well.
AFAIR, there is no convention to clean the control buffer before handing it over. The common practice is rather the opposite: the programmer shall not assume that the control buffer has been zeroed.
It is not a big deal to clean it here, we could just save some CPU cycles by not doing it.
I think this was recommended by Sabrina as well.
Curious. Either macsec does not zero it, or I have not understood how it is done there.
I only remember discussing a case [1] where one function within ovpn was expecting a cleared skb->cb to behave correctly but the caller did not clear it. In general, as you said, clearing cb "to be nice to other layers" is not expected. Sorry if some comments I made were confusing.
[1] https://lore.kernel.org/netdev/ZtXOw-NcL9lvwWa8@hog
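For illustration, the more common pattern is for each layer to define its own view of skb->cb and initialize exactly the fields it relies on, instead of expecting the previous layer to have zeroed the area. A minimal sketch of that pattern (struct name and fields here are made up for the example, not taken from the ovpn patches):

#include <linux/skbuff.h>
#include <linux/build_bug.h>

/* Hypothetical per-driver control block overlaid on skb->cb. */
struct ovpn_skb_cb {
        u32 seqno;
        unsigned int payload_offset;
};

static inline struct ovpn_skb_cb *ovpn_skb_cb(struct sk_buff *skb)
{
        /* Make sure the private struct fits into the 48-byte cb area. */
        BUILD_BUG_ON(sizeof(struct ovpn_skb_cb) > sizeof(skb->cb));
        return (struct ovpn_skb_cb *)skb->cb;
}

/* Each user sets the fields it needs before reading them back, so a
 * memset() by the previous layer is not required.
 */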
+struct ovpn_struct *ovpn_from_udp_sock(struct sock *sk)
+{
+        struct ovpn_socket *ovpn_sock;
+        if (unlikely(READ_ONCE(udp_sk(sk)->encap_type) != UDP_ENCAP_OVPNINUDP))
+                return NULL;
+        ovpn_sock = rcu_dereference_sk_user_data(sk);
+        if (unlikely(!ovpn_sock))
+                return NULL;
+        /* make sure that sk matches our stored transport socket */
+        if (unlikely(!ovpn_sock->sock || sk != ovpn_sock->sock->sk))
+                return NULL;
+        return ovpn_sock->ovpn;
Now, returning this pointer is safe. But the following TCP transport support calls the socket release via a scheduled work, which extends the socket lifetime and makes it possible to receive a UDP packet well after the interface private data has been released. Is that assumption correct?
Sorry, you lost me when saying "the following *TCP* transport support calls". This function is invoked only in the UDP context. Was that a typo?
Yeah, you are right. The question sounds like a riddle. I really should stop composing emails at midnight. Let me paraphrase it.
The potential issue is tricky, since it is introduced patch-by-patch.
Up to this patch the socket releasing procedure looks solid and reliable, e.g. destroying the P2P netdev:
ovpn_netdev_notifier_call(NETDEV_UNREGISTER)
  ovpn_peer_release_p2p
    ovpn_peer_del_p2p
      ovpn_peer_put
        ovpn_peer_release_kref
          ovpn_peer_release
            ovpn_socket_put
              ovpn_socket_release_kref
                ovpn_socket_detach
                  ovpn_udp_socket_detach
                    setup_udp_tunnel_sock
netdev_run_todo
  rcu_barrier  <- no running ovpn_udp_encap_recv after this point
Isn't it rather the synchronize_net() in unregister_netdevice_many_notify()? rcu_barrier() waits for pending kfree_rcu()/call_rcu() callbacks, while synchronize_rcu() waits for rcu_read_lock() sections (see the comments for rcu_barrier() and synchronize_rcu() in kernel/rcu/tree.c).
  free_netdev
After the setup_udp_tunnel_sock() call, no new ovpn_udp_encap_recv() will be spawned. And after the rcu_barrier(), all running ovpn_udp_encap_recv() calls will have completed. All good.
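For reference, the detach side presumably boils down to resetting the tunnel config on the socket, roughly like the following sketch (the function name matches the call chain above, the body is an assumption):

#include <net/udp_tunnel.h>
#include <net/sock.h>

static void ovpn_udp_socket_detach(struct socket *sock)
{
        struct udp_tunnel_sock_cfg cfg = { };

        /* A zeroed config clears sk_user_data, encap_type and the
         * encap_rcv callback, so the UDP layer stops handing packets
         * to ovpn_udp_encap_recv() for this socket.
         */
        setup_udp_tunnel_sock(sock_net(sock->sk), sock, &cfg);
}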
Then the following patch, 'ovpn: implement TCP transport', decouples ovpn_socket_release_kref() from ovpn_socket_detach() by scheduling the socket detach call:
ovpn_socket_release_kref
  ovpn_socket_schedule_release
    schedule_work(&sock->work)
And a long time later the socket will actually be detached:
ovpn_socket_release_work
  ovpn_socket_detach
    ovpn_udp_socket_detach
      setup_udp_tunnel_sock
And until that detach takes place, the UDP handler can call ovpn_udp_encap_recv() any number of times.
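In code, the deferred shape is roughly this (a sketch reconstructed from the call chains above; the field names ->refcount and ->work are assumptions):

#include <linux/kref.h>
#include <linux/workqueue.h>
#include <linux/slab.h>
#include <linux/container_of.h>
#include <linux/net.h>

struct ovpn_socket {
        struct socket *sock;
        struct kref refcount;           /* field names are assumptions */
        struct work_struct work;
        /* ... */
};

void ovpn_socket_detach(struct socket *sock);   /* existing helper */

static void ovpn_socket_release_work(struct work_struct *work)
{
        struct ovpn_socket *sock = container_of(work, struct ovpn_socket, work);

        /* The real detach happens only here, in process context. */
        ovpn_socket_detach(sock->sock);
        kfree(sock);
}

static void ovpn_socket_release_kref(struct kref *kref)
{
        struct ovpn_socket *sock = container_of(kref, struct ovpn_socket,
                                                refcount);

        /* No detach here anymore: until the work runs, the UDP encap
         * callback stays installed and may still be invoked.
         */
        schedule_work(&sock->work);
}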
So, we can end up with this scenario:
ovpn_netdev_notifier_call(NETDEV_UNREGISTER)
  ovpn_peer_release_p2p
    ovpn_peer_del_p2p
      ovpn_peer_put
        ovpn_peer_release_kref
          ovpn_peer_release
            ovpn_socket_put
              ovpn_socket_release_kref
                ovpn_socket_schedule_release
                  schedule_work(&sock->work)
netdev_run_todo
  rcu_barrier
  free_netdev

ovpn_udp_encap_recv   <- called for an incoming UDP packet
  ovpn_from_udp_sock  <- returns pointer to freed memory
  // Any access to the ovpn pointer is now a use-after-free

ovpn_socket_release_work  <- kernel finally invokes the work
  ovpn_socket_detach
    ovpn_udp_socket_detach
      setup_udp_tunnel_sock
To address the issue, I see two possible solutions:
- flush the workqueue somewhere before the netdev release
- set ovpn_sock->ovpn = NULL before scheduling the socket detach
Going with #2, we could fully split the detach into a synchronous part and an asynchronous part (with the async part not needed for UDP). detach_sync would clear the pointers (CBs, strp_stop(), ovpn_sock->ovpn, setup_udp_tunnel_sock) so that no more packets are passed through the ovpn driver.
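Roughly along these lines (only a sketch of the idea; the detach_sync name, the ->ovpn field handling and the split itself are assumptions, and the TCP-specific teardown is omitted):

#include <net/udp_tunnel.h>
#include <net/sock.h>

/* Synchronous part: runs before the work is scheduled, while the
 * netdev is still alive, and stops any further traffic through ovpn.
 */
static void ovpn_socket_detach_sync(struct ovpn_socket *ovpn_sock)
{
        struct udp_tunnel_sock_cfg cfg = { };

        /* UDP: drop the encap callbacks so ovpn_udp_encap_recv() can
         * no longer be entered for this socket. For TCP this is where
         * the saved socket callbacks would be restored and strp_stop()
         * called.
         */
        setup_udp_tunnel_sock(sock_net(ovpn_sock->sock->sk),
                              ovpn_sock->sock, &cfg);

        /* Make sure nothing dereferences the netdev private data after
         * this point, even if a packet still races in.
         */
        WRITE_ONCE(ovpn_sock->ovpn, NULL);
}

/* Asynchronous part: whatever is left (e.g. TCP strparser and queue
 * teardown) that cannot run in the release context is done later from
 * the scheduled work.
 */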
Related to that topic, I'm not sure what's keeping a reference on the peer to guarantee it doesn't get freed before we're done with peer->tcp.tx_work at the end of ovpn_tcp_socket_detach. Maybe all this tcp stuff should move from the peer to ovpn_socket?
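For comparison, the usual way a queued work keeps its containing object alive is to take a reference when queueing and drop it at the end of the handler; whether ovpn actually does this for peer->tcp.tx_work is exactly the open question. A sketch of that pattern, reusing ovpn_peer_put() from the call chains above (the ovpn_peer_hold() helper, the ->refcount field and ovpn_tcp_queue_tx() are assumptions):

static bool ovpn_peer_hold(struct ovpn_peer *peer)
{
        /* Assumed refcount field; fails if the peer is already dying. */
        return kref_get_unless_zero(&peer->refcount);
}

static void ovpn_tcp_tx_work(struct work_struct *work)
{
        struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
                                              tcp.tx_work);

        /* ... flush the pending TX queue ... */

        ovpn_peer_put(peer);    /* drop the ref taken when the work was queued */
}

static void ovpn_tcp_queue_tx(struct ovpn_peer *peer)
{
        if (!ovpn_peer_hold(peer))
                return;

        if (!schedule_work(&peer->tcp.tx_work))
                ovpn_peer_put(peer);    /* work was already queued */
}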