Restrict two generic netlink multicast groups - in the "psample" and "NET_DM" families - to be root-only with the appropriate capabilities.
Patch #1 is a dependency of patch #2 which is needed by the actual fixes in patches #3 and #4.
Florian Westphal (1): netlink: don't call ->netlink_bind with table lock held
Ido Schimmel (3): genetlink: add CAP_NET_ADMIN test for multicast bind psample: Require 'CAP_NET_ADMIN' when joining "packets" group drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
include/net/genetlink.h | 3 +++ net/core/drop_monitor.c | 4 +++- net/netlink/af_netlink.c | 4 ++-- net/netlink/genetlink.c | 35 +++++++++++++++++++++++++++++++++++ net/psample/psample.c | 3 ++- 5 files changed, 45 insertions(+), 4 deletions(-)
From: Florian Westphal fw@strlen.de
commit f2764bd4f6a8dffaec3e220728385d9756b3c2cb upstream.
When I added support to allow generic netlink multicast groups to be restricted to subscribers with CAP_NET_ADMIN I was unaware that a genl_bind implementation already existed in the past.
It was reverted due to ABBA deadlock:
1. ->netlink_bind gets called with the table lock held. 2. genetlink bind callback is invoked, it grabs the genl lock.
But when a new genl subsystem is (un)registered, these two locks are taken in reverse order.
One solution would be to revert again and add a comment in genl referring 1e82a62fec613, "genetlink: remove genl_bind").
This would need a second change in mptcp to not expose the raw token value anymore, e.g. by hashing the token with a secret key so userspace can still associate subflow events with the correct mptcp connection.
However, Paolo Abeni reminded me to double-check why the netlink table is locked in the first place.
I can't find one. netlink_bind() is already called without this lock when userspace joins a group via NETLINK_ADD_MEMBERSHIP setsockopt. Same holds for the netlink_unbind operation.
Digging through the history, commit f773608026ee1 ("netlink: access nlk groups safely in netlink bind and getname") expanded the lock scope.
commit 3a20773beeeeade ("net: netlink: cap max groups which will be considered in netlink_bind()") ... removed the nlk->ngroups access that the lock scope extension was all about.
Reduce the lock scope again and always call ->netlink_bind without the table lock.
The Fixes tag should be vs. the patch mentioned in the link below, but that one got squash-merged into the patch that came earlier in the series.
Fixes: 4d54cc32112d8d ("mptcp: avoid lock_fast usage in accept path") Link: https://lore.kernel.org/mptcp/20210213000001.379332-8-mathew.j.martineau@lin... Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Xin Long lucien.xin@gmail.com Cc: Johannes Berg johannes.berg@intel.com Cc: Sean Tranchetti stranche@codeaurora.org Cc: Paolo Abeni pabeni@redhat.com Cc: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ido Schimmel idosch@nvidia.com --- net/netlink/af_netlink.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 6aa984971577..89ece1f093e2 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -1001,7 +1001,6 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, return -EINVAL; }
- netlink_lock_table(); if (nlk->netlink_bind && groups) { int group;
@@ -1013,13 +1012,14 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, if (!err) continue; netlink_undo_bind(group, groups, sk); - goto unlock; + return err; } }
/* No need for barriers here as we return to user-space without * using any of the bound attributes. */ + netlink_lock_table(); if (!bound) { err = nladdr->nl_pid ? netlink_insert(sk, nladdr->nl_pid) :
commit 44ec98ea5ea9cfecd31a5c4cc124703cb5442832 upstream.
The "psample" generic netlink family notifies sampled packets over the "packets" multicast group. This is problematic since by default generic netlink allows non-root users to listen to these notifications.
Fix by marking the group with the 'GENL_UNS_ADMIN_PERM' flag. This will prevent non-root users or root without the 'CAP_NET_ADMIN' capability (in the user namespace owning the network namespace) from joining the group.
Tested using [1].
Before:
# capsh -- -c ./psample_repo # capsh --drop=cap_net_admin -- -c ./psample_repo
After:
# capsh -- -c ./psample_repo # capsh --drop=cap_net_admin -- -c ./psample_repo Failed to join "packets" multicast group
[1] $ cat psample.c #include <stdio.h> #include <netlink/genl/ctrl.h> #include <netlink/genl/genl.h> #include <netlink/socket.h>
int join_grp(struct nl_sock *sk, const char *grp_name) { int grp, err;
grp = genl_ctrl_resolve_grp(sk, "psample", grp_name); if (grp < 0) { fprintf(stderr, "Failed to resolve "%s" multicast group\n", grp_name); return grp; }
err = nl_socket_add_memberships(sk, grp, NFNLGRP_NONE); if (err) { fprintf(stderr, "Failed to join "%s" multicast group\n", grp_name); return err; }
return 0; }
int main(int argc, char **argv) { struct nl_sock *sk; int err;
sk = nl_socket_alloc(); if (!sk) { fprintf(stderr, "Failed to allocate socket\n"); return -1; }
err = genl_connect(sk); if (err) { fprintf(stderr, "Failed to connect socket\n"); return err; }
err = join_grp(sk, "config"); if (err) return err;
err = join_grp(sk, "packets"); if (err) return err;
return 0; } $ gcc -I/usr/include/libnl3 -lnl-3 -lnl-genl-3 -o psample_repo psample.c
Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling") Reported-by: "The UK's National Cyber Security Centre (NCSC)" security@ncsc.gov.uk Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Jiri Pirko jiri@nvidia.com Link: https://lore.kernel.org/r/20231206213102.1824398-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org --- net/psample/psample.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/psample/psample.c b/net/psample/psample.c index 30e8239bd774..196fbf674dc1 100644 --- a/net/psample/psample.c +++ b/net/psample/psample.c @@ -31,7 +31,8 @@ enum psample_nl_multicast_groups {
static const struct genl_multicast_group psample_nl_mcgrps[] = { [PSAMPLE_NL_MCGRP_CONFIG] = { .name = PSAMPLE_NL_MCGRP_CONFIG_NAME }, - [PSAMPLE_NL_MCGRP_SAMPLE] = { .name = PSAMPLE_NL_MCGRP_SAMPLE_NAME }, + [PSAMPLE_NL_MCGRP_SAMPLE] = { .name = PSAMPLE_NL_MCGRP_SAMPLE_NAME, + .flags = GENL_UNS_ADMIN_PERM }, };
static struct genl_family psample_nl_family __ro_after_init;
commit e03781879a0d524ce3126678d50a80484a513c4b upstream.
The "NET_DM" generic netlink family notifies drop locations over the "events" multicast group. This is problematic since by default generic netlink allows non-root users to listen to these notifications.
Fix by adding a new field to the generic netlink multicast group structure that when set prevents non-root users or root without the 'CAP_SYS_ADMIN' capability (in the user namespace owning the network namespace) from joining the group. Set this field for the "events" group. Use 'CAP_SYS_ADMIN' rather than 'CAP_NET_ADMIN' because of the nature of the information that is shared over this group.
Note that the capability check in this case will always be performed against the initial user namespace since the family is not netns aware and only operates in the initial network namespace.
A new field is added to the structure rather than using the "flags" field because the existing field uses uAPI flags and it is inappropriate to add a new uAPI flag for an internal kernel check. In net-next we can rework the "flags" field to use internal flags and fold the new field into it. But for now, in order to reduce the amount of changes, add a new field.
Since the information can only be consumed by root, mark the control plane operations that start and stop the tracing as root-only using the 'GENL_ADMIN_PERM' flag.
Tested using [1].
Before:
# capsh -- -c ./dm_repo # capsh --drop=cap_sys_admin -- -c ./dm_repo
After:
# capsh -- -c ./dm_repo # capsh --drop=cap_sys_admin -- -c ./dm_repo Failed to join "events" multicast group
[1] $ cat dm.c #include <stdio.h> #include <netlink/genl/ctrl.h> #include <netlink/genl/genl.h> #include <netlink/socket.h>
int main(int argc, char **argv) { struct nl_sock *sk; int grp, err;
sk = nl_socket_alloc(); if (!sk) { fprintf(stderr, "Failed to allocate socket\n"); return -1; }
err = genl_connect(sk); if (err) { fprintf(stderr, "Failed to connect socket\n"); return err; }
grp = genl_ctrl_resolve_grp(sk, "NET_DM", "events"); if (grp < 0) { fprintf(stderr, "Failed to resolve "events" multicast group\n"); return grp; }
err = nl_socket_add_memberships(sk, grp, NFNLGRP_NONE); if (err) { fprintf(stderr, "Failed to join "events" multicast group\n"); return err; }
return 0; } $ gcc -I/usr/include/libnl3 -lnl-3 -lnl-genl-3 -o dm_repo dm.c
Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") Reported-by: "The UK's National Cyber Security Centre (NCSC)" security@ncsc.gov.uk Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Jiri Pirko jiri@nvidia.com Link: https://lore.kernel.org/r/20231206213102.1824398-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org --- include/net/genetlink.h | 2 ++ net/core/drop_monitor.c | 4 +++- net/netlink/genetlink.c | 3 +++ 3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/include/net/genetlink.h b/include/net/genetlink.h index d9c4936c9941..08302fa8085b 100644 --- a/include/net/genetlink.h +++ b/include/net/genetlink.h @@ -11,10 +11,12 @@ /** * struct genl_multicast_group - generic netlink multicast group * @name: name of the multicast group, names are per-family + * @cap_sys_admin: whether %CAP_SYS_ADMIN is required for binding */ struct genl_multicast_group { char name[GENL_NAMSIZ]; u8 flags; + u8 cap_sys_admin:1; };
struct genl_ops; diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c index 006b7d72aea7..9ca28e66fabf 100644 --- a/net/core/drop_monitor.c +++ b/net/core/drop_monitor.c @@ -122,7 +122,7 @@ static struct sk_buff *reset_per_cpu_data(struct per_cpu_dm_data *data) }
static const struct genl_multicast_group dropmon_mcgrps[] = { - { .name = "events", }, + { .name = "events", .cap_sys_admin = 1 }, };
static void send_dm_alert(struct work_struct *work) @@ -370,10 +370,12 @@ static const struct genl_ops dropmon_ops[] = { { .cmd = NET_DM_CMD_START, .doit = net_dm_cmd_trace, + .flags = GENL_ADMIN_PERM, }, { .cmd = NET_DM_CMD_STOP, .doit = net_dm_cmd_trace, + .flags = GENL_ADMIN_PERM, }, };
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index 366206078b11..acb04573ba1d 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -982,6 +982,9 @@ static int genl_bind(struct net *net, int group) if ((grp->flags & GENL_UNS_ADMIN_PERM) && !ns_capable(net->user_ns, CAP_NET_ADMIN)) ret = -EPERM; + if (grp->cap_sys_admin && + !ns_capable(net->user_ns, CAP_SYS_ADMIN)) + ret = -EPERM;
break; }
On Mon, Dec 11, 2023 at 02:42:57PM +0200, Ido Schimmel wrote:
Restrict two generic netlink multicast groups - in the "psample" and "NET_DM" families - to be root-only with the appropriate capabilities.
Patch #1 is a dependency of patch #2 which is needed by the actual fixes in patches #3 and #4.
Florian Westphal (1): netlink: don't call ->netlink_bind with table lock held
Ido Schimmel (3): genetlink: add CAP_NET_ADMIN test for multicast bind psample: Require 'CAP_NET_ADMIN' when joining "packets" group drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
include/net/genetlink.h | 3 +++ net/core/drop_monitor.c | 4 +++- net/netlink/af_netlink.c | 4 ++-- net/netlink/genetlink.c | 35 +++++++++++++++++++++++++++++++++++ net/psample/psample.c | 3 ++- 5 files changed, 45 insertions(+), 4 deletions(-)
-- 2.40.1
All backports now queued up, thanks!
greg k-h
linux-stable-mirror@lists.linaro.org