On Mon, Nov 24, 2025 at 11:10:19AM +0100, Stefano Garzarella wrote:
On Fri, Nov 21, 2025 at 11:01:53AM -0800, Bobby Eshleman wrote:
On Fri, Nov 21, 2025 at 03:24:25PM +0100, Stefano Garzarella wrote:
On Thu, Nov 20, 2025 at 09:44:35PM -0800, Bobby Eshleman wrote:
From: Bobby Eshleman bobbyeshleman@meta.com
Reject setting VSOCK_NET_MODE_LOCAL with -EOPNOTSUPP if a G2H transport is operational. Additionally, reject G2H transport registration if there already exists a namespace in local mode.
G2H sockets break in local mode because the G2H transports don't support namespacing yet. The current approach is to coerce packets coming out of G2H transports into VSOCK_NET_MODE_GLOBAL mode, but it is not possible to coerce sockets in the same way because it cannot be deduced which transport will be used by the socket. Specifically, when bound to VMADDR_CID_ANY in a nested VM (both G2H and H2G available), it is not until a packet is received and matched to the bound socket that we assign the transport. This presents a chicken-and-egg problem, because we need the namespace to lookup the socket and resolve the transport, but we need the transport to know how to use the namespace during lookup.
For that reason, this patch prevents VSOCK_NET_MODE_LOCAL from being used on systems that support G2H, even nested systems that also have H2G transports.
Local mode is blocked based on detecting the presence of G2H devices (when possible, as hyperv is special). This means that a host kernel with G2H support compiled in (or has the module loaded), will still support local mode if there is no G2H (e.g., virtio-vsock) device detected. This enables using the same kernel in the host and in the guest, as we do in kselftest.
Systems with only namespace-aware transports (vhost-vsock, loopback) can still use both VSOCK_NET_MODE_GLOBAL and VSOCK_NET_MODE_LOCAL modes as intended.
Add supports_local_mode() transport callback to indicate transport-specific local mode support.
These restrictions can be lifted in a future patch series when G2H transports gain namespace support.
Signed-off-by: Bobby Eshleman bobbyeshleman@meta.com
Changes in v11:
- vhost_transport_supports_local_mode() returns false to keep things
disabled until support comes online (Stefano)
- add comment above supports_local_mode() cb to clarify (Stefano)
- Remove redundant `ret = 0` initialization in vsock_net_mode_string()
(Stefano)
- Refactor vsock_net_mode_string() to separate parsing from validation
(Stefano)
- vmci returns false for supports_local_mode(), with comment
Changes in v10:
- move this patch before any transports bring online namespacing (Stefano)
- move vsock_net_mode_string into critical section (Stefano)
- add ->supports_local_mode() callback to transports (Stefano)
drivers/vhost/vsock.c | 6 ++++++ include/net/af_vsock.h | 11 +++++++++++ net/vmw_vsock/af_vsock.c | 32 ++++++++++++++++++++++++++++++++ net/vmw_vsock/hyperv_transport.c | 6 ++++++ net/vmw_vsock/virtio_transport.c | 13 +++++++++++++ net/vmw_vsock/vmci_transport.c | 12 ++++++++++++ net/vmw_vsock/vsock_loopback.c | 6 ++++++ 7 files changed, 86 insertions(+)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 69074656263d..4e3856aa2479 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -64,6 +64,11 @@ static u32 vhost_transport_get_local_cid(void) return VHOST_VSOCK_DEFAULT_HOST_CID; }
+static bool vhost_transport_supports_local_mode(void) +{
- return false;
+}
/* Callers that dereference the return value must hold vhost_vsock_mutex or the
- RCU read lock.
*/ @@ -412,6 +417,7 @@ static struct virtio_transport vhost_transport = { .module = THIS_MODULE,
.get_local_cid = vhost_transport_get_local_cid,
.supports_local_mode = vhost_transport_supports_local_mode,.init = virtio_transport_do_socket_init, .destruct = virtio_transport_destruct,
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index 59d97a143204..e24ef1d9fe02 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h @@ -180,6 +180,17 @@ struct vsock_transport { /* Addressing. */ u32 (*get_local_cid)(void);
- /* Return true if the transport is compatible with
* VSOCK_NET_MODE_LOCAL. Otherwise, return false.** Transports should return false if they lack local-mode namespace* support (e.g., G2H transports like hyperv-vsock and vmci-vsock).* virtio-vsock returns true only if no device is present in order to* enable local mode in nested scenarios in which virtio-vsock is* loaded or built-in, but nonetheless unusable by sockets.*/- bool (*supports_local_mode)(void);
- /* Read a single skb */ int (*read_skb)(struct vsock_sock *, skb_read_actor_t);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 243c0d588682..120adb9dad9f 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -91,6 +91,12 @@
- and locked down by a namespace manager. The default is "global". The mode
- is set per-namespace.
- Note: LOCAL mode is only supported when using namespace-aware transports
- (vhost-vsock, loopback). If a guest-to-host transport (virtio-vsock,
- hyperv-vsock, vmci-vsock) is operational, attempts to set LOCAL mode will
- fail with EOPNOTSUPP, as these transports do not support per-namespace
- isolation.
- The modes affect the allocation and accessibility of CIDs as follows:
- global - access and allocation are all system-wide
@@ -2794,6 +2800,15 @@ static int vsock_net_mode_string(const struct ctl_table *table, int write, else return -EINVAL;
- mutex_lock(&vsock_register_mutex);
- if (mode == VSOCK_NET_MODE_LOCAL &&
transport_g2h && transport_g2h->supports_local_mode &&!transport_g2h->supports_local_mode()) {mutex_unlock(&vsock_register_mutex);return -EOPNOTSUPP;- }
- mutex_unlock(&vsock_register_mutex);
Wait, I think we already discussed about this, vsock_net_write_mode() must be called with the lock held.
See https://lore.kernel.org/netdev/aRTTwuuXSz5CvNjt@devvm11784.nha0.facebook.com...
Ah right, oversight on my part.
Since I guess we need another version of this patch, can you check the commit description to see if it reflects what we are doing now (e.g vhost is not enabled)?
Also I don't understand why for vhost we will enable it later, but for virtio_transport and vsock_loopback we are enabling it now, also if this patch is before the support on that transports. I'm a bit confused.
If something is unclear, let's discuss it before sending a new version.
What I had in mind was, add this patch and explain why we need this new callback (like you did), but enable the support in the patches that really enable it for any transport. But maybe what is not clear to me is that we need this only for G2H. But now I'm confused about the discussion around vmci H2G. We decided to discard also that one, but here we are not checking that? I mean here we are calling supports_local_mode() only on G2H IIUC.
Ah right, VMCI broke my original mental model of only needing this check for G2H (originally I didn't realize VMCI was H2G too).
I think now, we actually need to do this check for all of the transports no? Including h2g, g2h, local, and dgram?
Additionally, the commit description needs to be updated to reflect that.
Let's take a step back, though, because I tried to understand the problem better and I'm confused.
For example, in vmci (G2H side), when a packet arrives, we always use vsock_find_connected_socket(), which only searches in GLOBAL. So connections originating from the host can only reach global sockets in the guest. In this direction (host -> guest), we should be fine, right?
Now let's consider the other direction, from guest to host, so the connection should be generated via vsock_connect(). Here I see that we are not doing anything with regard to the source namespace. At this point, my question is whether we should modify vsock_assign_transport() or transport->stream_allow() to do this for each stream, and not prevent loading a G2H module a priori.
For example, stream_allow() could check that the socket namespace is supported by the assigned transport. E.g., vmci can check that if the namespace mode is not GLOBAL, then it returns false. (Same thing in virtio-vsock, I mean the G2H driver).
This should solve the guest -> host direction, but at this point I wonder if I'm missing something.
For the G2H connect case that is true, but the situation gets a little fuzzier on the G2H RX side w/ VMADDR_CID_ANY listeners.
Let's say we have a nested system w/ both virtio-vsock and vhost-vsock. We have a listener in namespace local on VMADDR_CID_ANY. So far, no transport is assigned, so we can't call t->stream_allow() yet. virtio-vsock only knows of global mode, so its lookup will fail (unless we hack in some special case to virtio_transport_recv_pkt() to scan local namespaces). vhost-vsock will work as expected. Letting local mode sockets be silently unreachable by virtio-vsock seems potentially confusing for users. If the system were not nested, we can pre-resolve VMADDR_CID_ANY in bind() and handle things upfront as well. Rejecting local mode outright is just a broad guardrail.
If we're trying to find a less heavy-handed option, we might be able to do the following:
- change bind(cid) w/ cid != VMADDR_CID_ANY to directly assign the transport for all socket types (not just SOCK_DGRAM)
- vsock_assign_transport() can outright fail if !t->supports_local_mode() and sock_net(sk) has mode local
- bind(VMADDR_CID_ANY) can maybe print (once) to dmesg a warning that only the H2G transport will land on VMADDR_CID_ANY sockets.
I'm certainly open to other suggestions.
Best, Bobby