On 26/11/2024 09:49, Antonio Quartulli wrote: [...]
The potential issue is tricky since we create it patch-by-patch.
Up to this patch the socket releasing procedure looks solid and reliable. E.g. the P2P netdev destroying:
ovpn_netdev_notifier_call(NETDEV_UNREGISTER) ovpn_peer_release_p2p ovpn_peer_del_p2p ovpn_peer_put ovpn_peer_release_kref ovpn_peer_release ovpn_socket_put ovpn_socket_release_kref ovpn_socket_detach ovpn_udp_socket_detach setup_udp_tunnel_sock netdev_run_todo rcu_barrier <- no running ovpn_udp_encap_recv after this point free_netdev
After the setup_udp_tunnel_sock() call no new ovpn_udp_encap_recv() will be spawned. And after the rcu_barrier() all running ovpn_udp_encap_recv() will be done. All good.
ok
Then, the following patch 'ovpn: implement TCP transport' disjoin ovpn_socket_release_kref() and ovpn_socket_detach() by scheduling the socket detach function call:
ovpn_socket_release_kref ovpn_socket_schedule_release schedule_work(&sock->work)
And long time after the socket will be actually detached:
ovpn_socket_release_work ovpn_socket_detach ovpn_udp_socket_detach setup_udp_tunnel_sock
And until this detaching will take a place, UDP handler can call ovpn_udp_encap_recv() whatever number of times.
So, we can end up with this scenario:
ovpn_netdev_notifier_call(NETDEV_UNREGISTER) ovpn_peer_release_p2p ovpn_peer_del_p2p ovpn_peer_put ovpn_peer_release_kref ovpn_peer_release ovpn_socket_put ovpn_socket_release_kref ovpn_socket_schedule_release schedule_work(&sock->work) netdev_run_todo rcu_barrier free_netdev
ovpn_udp_encap_recv <- called for an incoming UDP packet ovpn_from_udp_sock <- returns pointer to freed memory // Any access to ovpn pointer is the use-after-free
ovpn_socket_release_work <- kernel finally ivoke the work ovpn_socket_detach ovpn_udp_socket_detach setup_udp_tunnel_sock
To address the issue, I see two possible solutions:
- flush the workqueue somewhere before the netdev release
yes! This is what I was missing. This will also solve the "how can the module wait for all workers to be done before unloading?"
Actually there might be even a simpler solution: each ovpn_socket will hold a reference to an ovpn_peer (TCP) or to an ovpn_priv (UDP). I can simply increase the refcounter those objects while they are referenced by the socket and decrease it when the socket is fully released (in the detach() function called by the worker).
This way the netdev cannot be released until all socket (and all peers) are gone.
This approach doesn't require any local workqueue or any other special coordination as we'll just force the whole cleanup to happen in a specific order.
Does it make sense?
Regards,