This patch set extends the locked port feature for devices that are behind a locked port, but do not have the ability to authorize themselves as a supplicant using IEEE 802.1X. Such devices can be printers, meters or anything related to fixed installations. Instead of 802.1X authorization, devices can get access based on their MAC addresses being whitelisted.
For an authorization daemon to detect that a device is trying to get access through a locked port, the bridge adds the MAC address of the device to the FDB with the locked flag set. The authorization daemon can then catch the FDB add event, check whether the MAC address is in the whitelist and, if so, replace the FDB entry with one that does not have the locked flag set, thereby opening the port for the device.
This feature is known as MAC-Auth or MAC Authentication Bypass (MAB) in Cisco terminology, where the full MAB concept involves additional Cisco infrastructure for authorization. There is no real authentication process, as the MAC address of the device is, in the general case, the only input on which the authorization daemon can base its decision whether or not to unlock the port.
With this patch set, an implementation of the offloaded case is supplied for the mv88e6xxx driver. When a packet ingresses on a locked port, an ATU miss violation event will occur. When handling such ATU miss violation interrupts, the MAC address of the device is added to the FDB with a zero destination port vector (DPV) and the MAC address is communicated through the switchdev layer to the bridge, so that an FDB entry with the locked flag enabled can be added.
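The daemon-side decision described above can be sketched roughly as follows. This is only an illustration in plain userspace C: the allow-list layout and the names `mac_entry`, `mac_is_whitelisted` and `should_unlock` are hypothetical and not part of this patch set, and the netlink handling needed to receive the FDB add event and to replace the entry is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical allow-list entry; how a real daemon stores this is not specified. */
struct mac_entry {
	uint8_t addr[MAC_LEN];
};

/* Return true if 'mac' appears in the allow-list. */
static bool mac_is_whitelisted(const struct mac_entry *list, size_t n,
			       const uint8_t mac[MAC_LEN])
{
	for (size_t i = 0; i < n; i++)
		if (memcmp(list[i].addr, mac, MAC_LEN) == 0)
			return true;
	return false;
}

/*
 * Decide how to react to an FDB add event. Only entries carrying the
 * locked flag are of interest; a true result means the daemon should
 * replace the entry with one that does not have the locked flag set.
 */
static bool should_unlock(const struct mac_entry *list, size_t n,
			  const uint8_t mac[MAC_LEN], bool entry_locked)
{
	return entry_locked && mac_is_whitelisted(list, n, mac);
}
```

Since the MAC address is the only input, the whole decision reduces to a lookup, which is why this is authorization rather than authentication.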
Log: v3: Added timers and lists in the driver (mv88e6xxx) to keep track of and remove locked entries.
v4: Leave out enforcing a limit to the number of locked entries in the bridge. Removed the timers in the driver and use the worker only. Add locked FDB flag to all drivers using port_fdb_add() from the dsa api and let all drivers ignore entries with this flag set. Change how to get the ageing timeout of locked entries. See global1_atu.c and switchdev.c. Use struct mv88e6xxx_port for locked entries variables instead of struct dsa_port.
v5: Added 'mab' flag to enable MAB/MacAuth feature, in a similar way to the locked feature flag.
In these implementations for the mv88e6xxx, the switchport must be configured with learning on.
To tell userspace about the behavior of the locked entries in the driver, a 'blackhole' FDB flag has been added, which locked FDB entries coming from the driver get. The 'sticky' flag also comes with those locked entries, as the driver's locked entries cannot roam.
Fixed issues with taking mutex locks, and added a function to read the fid, that supports all versions of the chipset family.
v6: Added blackhole FDB flag instead of using sticky flag, as the blackhole flag corresponds to the behaviour of the zero-DPV locked entries in the driver.
Userspace can add blackhole FDB entries with: # bridge fdb add MAC dev br0 blackhole
Added FDB flags towards driver in DSA layer as u16.
v7: Remove locked port and mab flags from the DSA flags inherit list, as it messes with the learning setting and those flags are not naturally meant to be inherited, but should be set explicitly.
Fix blackhole implementation and selftests, among other small fixes.
v8: Improved the error messages for blackhole entries added from user space and improved the selftests.
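As a rough illustration of how the 'mab' port flag from v5 relates to the locked flag, the check added to br_setport() in this series can be modelled in userspace C as below. The bit positions mirror BR_PORT_LOCKED and BR_PORT_MAB from the patch; the helper name `set_port_flags` is hypothetical and only stands in for the kernel logic.

```c
#include <errno.h>

#define BIT(n)		(1UL << (n))
#define BR_PORT_LOCKED	BIT(21)	/* value as in include/linux/if_bridge.h */
#define BR_PORT_MAB	BIT(22)	/* value as added by this series */

/*
 * Apply a new set of port flags. MAB only makes sense on a locked port,
 * so MAB without LOCKED is rejected and the old flags are restored.
 */
static int set_port_flags(unsigned long *flags, unsigned long new_flags)
{
	unsigned long old_flags = *flags;

	*flags = new_flags;
	if (!(*flags & BR_PORT_LOCKED) && (*flags & BR_PORT_MAB)) {
		*flags = old_flags;
		return -EINVAL;
	}
	return 0;
}
```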
Hans J. Schultz (12):
  net: bridge: add locked entry fdb flag to extend locked port feature
  net: bridge: add blackhole fdb entry flag
  net: bridge: enable bridge to install locked fdb entries from drivers
  net: bridge: add MAB flag to hardware offloadable flags
  net: dsa: propagate the locked flag down through the DSA layer
  net: bridge: enable bridge to send and receive blackhole FDB entries
  net: dsa: send the blackhole flag down through the DSA layer
  drivers: net: dsa: add fdb entry flags incoming to switchcore drivers
  net: dsa: mv88e6xxx: allow reading FID when handling ATU violations
  net: dsa: mv88e6xxx: mac-auth/MAB implementation
  net: dsa: mv88e6xxx: add blackhole ATU entries
  selftests: forwarding: add MAB tests to locked port tests
 drivers/net/dsa/b53/b53_common.c | 12 +-
 drivers/net/dsa/b53/b53_priv.h | 4 +-
 drivers/net/dsa/hirschmann/hellcreek.c | 12 +-
 drivers/net/dsa/lan9303-core.c | 12 +-
 drivers/net/dsa/lantiq_gswip.c | 12 +-
 drivers/net/dsa/microchip/ksz9477.c | 8 +-
 drivers/net/dsa/microchip/ksz9477.h | 8 +-
 drivers/net/dsa/microchip/ksz_common.c | 14 +-
 drivers/net/dsa/mt7530.c | 12 +-
 drivers/net/dsa/mv88e6xxx/Makefile | 1 +
 drivers/net/dsa/mv88e6xxx/chip.c | 142 ++++++++-
 drivers/net/dsa/mv88e6xxx/chip.h | 19 ++
 drivers/net/dsa/mv88e6xxx/global1.h | 1 +
 drivers/net/dsa/mv88e6xxx/global1_atu.c | 72 ++++-
 drivers/net/dsa/mv88e6xxx/port.c | 15 +-
 drivers/net/dsa/mv88e6xxx/port.h | 6 +
 drivers/net/dsa/mv88e6xxx/switchdev.c | 284 ++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/switchdev.h | 37 +++
 drivers/net/dsa/ocelot/felix.c | 12 +-
 drivers/net/dsa/qca/qca8k-common.c | 12 +-
 drivers/net/dsa/qca/qca8k.h | 4 +-
 drivers/net/dsa/rzn1_a5psw.c | 12 +-
 drivers/net/dsa/sja1105/sja1105_main.c | 18 +-
 include/linux/if_bridge.h | 1 +
 include/net/dsa.h | 7 +-
 include/net/switchdev.h | 2 +
 include/uapi/linux/if_link.h | 1 +
 include/uapi/linux/neighbour.h | 11 +-
 net/bridge/br.c | 5 +-
 net/bridge/br_fdb.c | 88 +++++-
 net/bridge/br_input.c | 20 +-
 net/bridge/br_netlink.c | 12 +-
 net/bridge/br_private.h | 5 +-
 net/bridge/br_switchdev.c | 4 +-
 net/core/rtnetlink.c | 5 +
 net/dsa/dsa_priv.h | 10 +-
 net/dsa/port.c | 32 +-
 net/dsa/slave.c | 16 +-
 net/dsa/switch.c | 24 +-
 .../selftests/drivers/net/dsa/Makefile | 1 +
 .../testing/selftests/net/forwarding/Makefile | 1 +
 .../net/forwarding/bridge_blackhole_fdb.sh | 131 ++++++++
 .../net/forwarding/bridge_locked_port.sh | 99 +++++-
 tools/testing/selftests/net/forwarding/lib.sh | 17 ++
 44 files changed, 1100 insertions(+), 121 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.c
 create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.h
 create mode 100755 tools/testing/selftests/net/forwarding/bridge_blackhole_fdb.sh
Add an intermediate state for clients behind a locked port to allow for possible opening of the port for said clients. The client's MAC address will be added with the locked flag set, denying access through the port for the MAC address, but also creating a new FDB add event giving userspace daemons the ability to unlock the MAC address. This feature corresponds to the Mac-Auth and MAC Authentication Bypass (MAB) named features, the latter being defined by Cisco.
Only the kernel can set this FDB entry flag, while userspace can read the flag and remove it by replacing or deleting the FDB entry.
Locked entries will age out with the set bridge ageing time.
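The ingress behaviour this patch adds to br_handle_frame_finish() can be pictured with the small userspace model below. It is a sketch only: `port_model`, `fdb_model`, `verdict` and `locked_port_ingress` are hypothetical names, the FDB is reduced to a single source entry, and "pass" merely means the frame survives the locked-port source check.

```c
#include <stdbool.h>

enum verdict { VERDICT_PASS, VERDICT_DROP };

/* Simplified stand-ins for the bridge port and the FDB entry of the source MAC. */
struct port_model {
	bool locked;	/* BR_PORT_LOCKED */
	bool mab;	/* BR_PORT_MAB */
};

struct fdb_model {
	bool present;	/* an entry exists for (src MAC, VID) */
	bool same_port;	/* entry points at the ingress port */
	bool local;	/* BR_FDB_LOCAL */
	bool locked;	/* BR_FDB_LOCKED */
};

static enum verdict locked_port_ingress(const struct port_model *p,
					struct fdb_model *src)
{
	if (!p->locked)
		return VERDICT_PASS;

	if (!src->present) {
		/* With MAB enabled, install a locked placeholder entry,
		 * which is what generates the FDB add event. */
		if (p->mab) {
			src->present = true;
			src->same_port = true;
			src->locked = true;
		}
		return VERDICT_DROP;
	}

	/* An entry on another port, a local entry or a locked entry
	 * does not authorize the source. */
	if (!src->same_port || src->local || src->locked)
		return VERDICT_DROP;

	return VERDICT_PASS;
}
```

In this model the first frame from an unknown client is dropped but leaves a locked entry behind; frames keep being dropped until the entry is replaced by one without the locked flag.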
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
---
 include/linux/if_bridge.h | 1 +
 include/uapi/linux/if_link.h | 1 +
 include/uapi/linux/neighbour.h | 7 ++++++-
 net/bridge/br_fdb.c | 22 ++++++++++++++++++++++
 net/bridge/br_input.c | 15 +++++++++++++--
 net/bridge/br_netlink.c | 12 +++++++++++-
 net/bridge/br_private.h | 3 ++-
 net/core/rtnetlink.c | 5 +++++
 8 files changed, 61 insertions(+), 5 deletions(-)
diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index d62ef428e3aa..1668ac4d7adc 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -59,6 +59,7 @@ struct br_ip_list {
 #define BR_MRP_LOST_IN_CONT	BIT(19)
 #define BR_TX_FWD_OFFLOAD	BIT(20)
 #define BR_PORT_LOCKED		BIT(21)
+#define BR_PORT_MAB		BIT(22)
 
 #define BR_DEFAULT_AGEING_TIME	(300 * HZ)
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 5e7a1041df3a..d92b3f79eba3 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -561,6 +561,7 @@ enum { IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT, IFLA_BRPORT_MCAST_EHT_HOSTS_CNT, IFLA_BRPORT_LOCKED, + IFLA_BRPORT_MAB, __IFLA_BRPORT_MAX }; #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1) diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h index a998bf761635..4dda051b0ba8 100644 --- a/include/uapi/linux/neighbour.h +++ b/include/uapi/linux/neighbour.h @@ -52,7 +52,8 @@ enum { #define NTF_STICKY (1 << 6) #define NTF_ROUTER (1 << 7) /* Extended flags under NDA_FLAGS_EXT: */ -#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_LOCKED (1 << 1)
/* * Neighbor Cache Entry States. @@ -86,6 +87,10 @@ enum { * NTF_EXT_MANAGED flagged neigbor entries are managed by the kernel on behalf * of a user space control plane, and automatically refreshed so that (if * possible) they remain in NUD_REACHABLE state. + * + * NTF_EXT_LOCKED flagged FDB entries are placeholder entries used with the + * locked port feature, that ensures that an entry exists while at the same + * time dropping packets on ingress with src MAC and VID matching the entry. */
struct nda_cacheinfo { diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..2cf695ee61c5 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, struct nda_cacheinfo ci; struct nlmsghdr *nlh; struct ndmsg *ndm; + u32 ext_flags = 0;
nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags); if (nlh == NULL) @@ -125,11 +126,16 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, ndm->ndm_flags |= NTF_EXT_LEARNED; if (test_bit(BR_FDB_STICKY, &fdb->flags)) ndm->ndm_flags |= NTF_STICKY; + if (test_bit(BR_FDB_LOCKED, &fdb->flags)) + ext_flags |= NTF_EXT_LOCKED;
if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr)) goto nla_put_failure; if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex)) goto nla_put_failure; + if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags)) + goto nla_put_failure; + ci.ndm_used = jiffies_to_clock_t(now - fdb->used); ci.ndm_confirmed = 0; ci.ndm_updated = jiffies_to_clock_t(now - fdb->updated); @@ -171,6 +177,7 @@ static inline size_t fdb_nlmsg_size(void) return NLMSG_ALIGN(sizeof(struct ndmsg)) + nla_total_size(ETH_ALEN) /* NDA_LLADDR */ + nla_total_size(sizeof(u32)) /* NDA_MASTER */ + + nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */ + nla_total_size(sizeof(u16)) /* NDA_VLAN */ + nla_total_size(sizeof(struct nda_cacheinfo)) + nla_total_size(0) /* NDA_FDB_EXT_ATTRS */ @@ -879,6 +886,9 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, &fdb->flags))) clear_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags); + /* Allow roaming from unauthorized port to authorized port */ + if (unlikely(test_bit(BR_FDB_LOCKED, &fdb->flags))) + clear_bit(BR_FDB_LOCKED, &fdb->flags); }
if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags))) @@ -1082,6 +1092,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
+ if (test_and_clear_bit(BR_FDB_LOCKED, &fdb->flags)) + modified = true; + if (fdb_handle_notify(fdb, notify)) modified = true;
@@ -1150,6 +1163,7 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], struct net_bridge_port *p = NULL; struct net_bridge_vlan *v; struct net_bridge *br = NULL; + u32 ext_flags = 0; int err = 0;
trace_br_fdb_add(ndm, dev, addr, vid, nlh_flags); @@ -1178,6 +1192,14 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], vg = nbp_vlan_group(p); }
+ if (tb[NDA_FLAGS_EXT]) + ext_flags = nla_get_u32(tb[NDA_FLAGS_EXT]); + + if (ext_flags & NTF_EXT_LOCKED) { + pr_info("bridge: RTM_NEWNEIGH has invalid extended flags\n"); + return -EINVAL; + } + if (tb[NDA_FDB_EXT_ATTRS]) { attr = tb[NDA_FDB_EXT_ATTRS]; err = nla_parse_nested(nfea_tb, NFEA_MAX, attr, diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..068fced7693c 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -109,9 +109,20 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb struct net_bridge_fdb_entry *fdb_src = br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
- if (!fdb_src || READ_ONCE(fdb_src->dst) != p || - test_bit(BR_FDB_LOCAL, &fdb_src->flags)) + if (!fdb_src) { + unsigned long flags = 0; + + if (p->flags & BR_PORT_MAB) { + __set_bit(BR_FDB_LOCKED, &flags); + br_fdb_update(br, p, eth_hdr(skb)->h_source, + vid, flags); + } goto drop; + } else if (READ_ONCE(fdb_src->dst) != p || + test_bit(BR_FDB_LOCAL, &fdb_src->flags) || + test_bit(BR_FDB_LOCKED, &fdb_src->flags)) { + goto drop; + } }
nbp_switchdev_frame_mark(p, skb); diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 5aeb3646e74c..4b190abc11bb 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -188,6 +188,7 @@ static inline size_t br_port_info_size(void) + nla_total_size(1) /* IFLA_BRPORT_NEIGH_SUPPRESS */ + nla_total_size(1) /* IFLA_BRPORT_ISOLATED */ + nla_total_size(1) /* IFLA_BRPORT_LOCKED */ + + nla_total_size(1) /* IFLA_BRPORT_MAB */ + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_ROOT_ID */ + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_BRIDGE_ID */ + nla_total_size(sizeof(u16)) /* IFLA_BRPORT_DESIGNATED_PORT */ @@ -274,7 +275,8 @@ static int br_port_fill_attrs(struct sk_buff *skb, nla_put_u8(skb, IFLA_BRPORT_MRP_IN_OPEN, !!(p->flags & BR_MRP_LOST_IN_CONT)) || nla_put_u8(skb, IFLA_BRPORT_ISOLATED, !!(p->flags & BR_ISOLATED)) || - nla_put_u8(skb, IFLA_BRPORT_LOCKED, !!(p->flags & BR_PORT_LOCKED))) + nla_put_u8(skb, IFLA_BRPORT_LOCKED, !!(p->flags & BR_PORT_LOCKED)) || + nla_put_u8(skb, IFLA_BRPORT_MAB, !!(p->flags & BR_PORT_MAB))) return -EMSGSIZE;
timerval = br_timer_value(&p->message_age_timer); @@ -876,6 +878,7 @@ static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = { [IFLA_BRPORT_NEIGH_SUPPRESS] = { .type = NLA_U8 }, [IFLA_BRPORT_ISOLATED] = { .type = NLA_U8 }, [IFLA_BRPORT_LOCKED] = { .type = NLA_U8 }, + [IFLA_BRPORT_MAB] = { .type = NLA_U8 }, [IFLA_BRPORT_BACKUP_PORT] = { .type = NLA_U32 }, [IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT] = { .type = NLA_U32 }, }; @@ -943,6 +946,13 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED); + br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB); + + if (!(p->flags & BR_PORT_LOCKED) && (p->flags & BR_PORT_MAB)) { + NL_SET_ERR_MSG(extack, "MAB cannot be enabled when port is unlocked"); + p->flags = old_flags; + return -EINVAL; + }
changed_mask = old_flags ^ p->flags;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 06e5f6faa431..4ce8b8e5ae0b 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -251,7 +251,8 @@ enum {
 	BR_FDB_ADDED_BY_EXT_LEARN,
 	BR_FDB_OFFLOADED,
 	BR_FDB_NOTIFY,
-	BR_FDB_NOTIFY_INACTIVE
+	BR_FDB_NOTIFY_INACTIVE,
+	BR_FDB_LOCKED,
 };
 
 struct net_bridge_fdb_key {
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 74864dc46a7e..d6e4d2854edb 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4045,6 +4045,11 @@ int ndo_dflt_fdb_add(struct ndmsg *ndm,
 		return err;
 	}
 
+	if (tb[NDA_FLAGS_EXT]) {
+		netdev_info(dev, "invalid flags given to default FDB implementation\n");
+		return err;
+	}
+
 	if (vid) {
 		netdev_info(dev, "vlans aren't supported yet for dev_uc|mc_add()\n");
 		return err;
On Tue, Oct 18, 2022 at 06:56:08PM +0200, Hans J. Schultz wrote:
Add an intermediate state for clients behind a locked port to allow for possible opening of the port for said clients. The client's MAC address will be added with the locked flag set, denying access through the port
The entry itself is not denying the access through the port, but rather the fact that the port is locked and there is no matching FDB entry.
for the MAC address, but also creating a new FDB add event giving userspace daemons the ability to unlock the MAC address. This feature corresponds to the Mac-Auth and MAC Authentication Bypass (MAB) named features, the latter being defined by Cisco.
Worth mentioning that the feature is enabled via the 'mab' bridge port option (BR_PORT_MAB).
Only the kernel can set this FDB entry flag, while userspace can read the flag and remove it by replacing or deleting the FDB entry.
Locked entries will age out with the set bridge ageing time.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
Overall looks OK to me. See one comment below.
Reviewed-by: Ido Schimmel idosch@nvidia.com
[...]
@@ -1178,6 +1192,14 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 		vg = nbp_vlan_group(p);
 	}
 
+	if (tb[NDA_FLAGS_EXT])
+		ext_flags = nla_get_u32(tb[NDA_FLAGS_EXT]);
+
+	if (ext_flags & NTF_EXT_LOCKED) {
+		pr_info("bridge: RTM_NEWNEIGH has invalid extended flags\n");
I understand this function makes use of pr_info(), but it already gets extack and it's a matter of time until the pr_info() instances will be converted to extack. I would just use extack here like you are doing in the next patch.
Also, I find this message more helpful:
"Cannot add FDB entry with "locked" flag set"
+		return -EINVAL;
+	}
+
 	if (tb[NDA_FDB_EXT_ATTRS]) {
 		attr = tb[NDA_FDB_EXT_ATTRS];
 		err = nla_parse_nested(nfea_tb, NFEA_MAX, attr,
On 2022-10-20 14:54, Ido Schimmel wrote:
On Tue, Oct 18, 2022 at 06:56:08PM +0200, Hans J. Schultz wrote:
Add an intermediate state for clients behind a locked port to allow for possible opening of the port for said clients. The client's MAC address will be added with the locked flag set, denying access through the port
The entry itself is not denying the access through the port, but rather the fact that the port is locked and there is no matching FDB entry.
for the MAC address, but also creating a new FDB add event giving userspace daemons the ability to unlock the MAC address. This feature corresponds to the Mac-Auth and MAC Authentication Bypass (MAB) named features, the latter being defined by Cisco.
Worth mentioning that the feature is enabled via the 'mab' bridge port option (BR_PORT_MAB).
Only the kernel can set this FDB entry flag, while userspace can read the flag and remove it by replacing or deleting the FDB entry.
Locked entries will age out with the set bridge ageing time.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
Overall looks OK to me. See one comment below.
Reviewed-by: Ido Schimmel idosch@nvidia.com
[...]
@@ -1178,6 +1192,14 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 		vg = nbp_vlan_group(p);
 	}
 
+	if (tb[NDA_FLAGS_EXT])
+		ext_flags = nla_get_u32(tb[NDA_FLAGS_EXT]);
+
+	if (ext_flags & NTF_EXT_LOCKED) {
+		pr_info("bridge: RTM_NEWNEIGH has invalid extended flags\n");
I understand this function makes use of pr_info(), but it already gets extack and it's a matter of time until the pr_info() instances will be converted to extack. I would just use extack here like you are doing in the next patch.
Also, I find this message more helpful:
"Cannot add FDB entry with "locked" flag set"
Okay, since Jakub says that this patch set must be resent, the question remains to me if I shall make these changes and resend the patch set as v8?
On Thu, 20 Oct 2022 21:37:17 +0200 netdev@kapio-technology.com wrote:
Okay, since Jakub says that this patch set must be resent, the question remains to me if I shall make these changes and resend the patch set as v8?
If I understand the question right - since you'd be making changes the new posting should be a v9. If you got only acks and no change requests for this posting you could repost as "v8 RESEND", or also as v9, when in doubt err on the side of bumping the version...
Add a 'blackhole' fdb flag, ensuring that no forwarding from any port to a destination MAC that has a FDB entry with this flag on will occur. The packets will thus be dropped.
When the blackhole fdb flag is set, the 'local' flag will also be enabled as blackhole entries are not associated with any port.
Thus the command will be similar to: bridge fdb add MAC dev br0 local blackhole
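The constraints that br_fdb_add() places on a blackhole entry in this patch can be summarized with the following userspace sketch. The helper name `check_blackhole` and the `has_port` parameter are hypothetical; the flag values are local copies of what is believed to be in include/uapi/linux/neighbour.h (NTF_EXT_BLACKHOLE as added here), reproduced only so that the example is self-contained.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the uapi values, assumed to match neighbour.h. */
#define NUD_PERMANENT		0x80
#define NTF_EXT_LEARNED		(1 << 4)
#define NTF_EXT_BLACKHOLE	(1 << 2)

/*
 * Return 0 if the request is acceptable, -EINVAL otherwise.
 * 'has_port' is true when the request targets a bridge port
 * rather than the bridge device itself.
 */
static int check_blackhole(uint32_t ext_flags, uint16_t ndm_state,
			   uint8_t ndm_flags, bool has_port)
{
	if (!(ext_flags & NTF_EXT_BLACKHOLE))
		return 0;	/* not a blackhole request, nothing to check */

	if (!(ndm_state & NUD_PERMANENT))
		return -EINVAL;	/* blackhole entries must be permanent */

	if (has_port)
		return -EINVAL;	/* cannot be applied on a port */

	if (ndm_flags & NTF_EXT_LEARNED)
		return -EINVAL;	/* cannot be added as externally learned */

	return 0;
}
```

Note that nothing is set automatically in this model: a request without the permanent state is rejected rather than fixed up.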
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
---
 include/uapi/linux/neighbour.h | 4 ++++
 net/bridge/br_fdb.c | 37 ++++++++++++++++++++++++++++------
 net/bridge/br_input.c | 5 ++++-
 net/bridge/br_private.h | 1 +
 4 files changed, 40 insertions(+), 7 deletions(-)
diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 4dda051b0ba8..cc7d540eb734 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -54,6 +54,7 @@ enum {
 /* Extended flags under NDA_FLAGS_EXT: */
 #define NTF_EXT_MANAGED		(1 << 0)
 #define NTF_EXT_LOCKED		(1 << 1)
+#define NTF_EXT_BLACKHOLE	(1 << 2)
 
 /*
  *	Neighbor Cache Entry States.
@@ -91,6 +92,9 @@ enum {
  * NTF_EXT_LOCKED flagged FDB entries are placeholder entries used with the
  * locked port feature, that ensures that an entry exists while at the same
  * time dropping packets on ingress with src MAC and VID matching the entry.
+ *
+ * NTF_EXT_BLACKHOLE flagged FDB entries ensure that no forwarding is allowed
+ * from any port to the destination MAC, VID pair associated with it.
  */
struct nda_cacheinfo { diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index 2cf695ee61c5..15ead4dc6190 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -128,6 +128,8 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, ndm->ndm_flags |= NTF_STICKY; if (test_bit(BR_FDB_LOCKED, &fdb->flags)) ext_flags |= NTF_EXT_LOCKED; + if (test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) + ext_flags |= NTF_EXT_BLACKHOLE;
if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr)) goto nla_put_failure; @@ -1018,8 +1020,9 @@ static bool fdb_handle_notify(struct net_bridge_fdb_entry *fdb, u8 notify) /* Update (create or replace) forwarding database entry */ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, const u8 *addr, struct ndmsg *ndm, u16 flags, u16 vid, - struct nlattr *nfea_tb[]) + u32 ext_flags, struct nlattr *nfea_tb[]) { + bool blackhole = !!(ext_flags & NTF_EXT_BLACKHOLE); bool is_sticky = !!(ndm->ndm_flags & NTF_STICKY); bool refresh = !nfea_tb[NFEA_DONT_REFRESH]; struct net_bridge_fdb_entry *fdb; @@ -1092,6 +1095,11 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
+ if (blackhole != test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) { + change_bit(BR_FDB_BLACKHOLE, &fdb->flags); + modified = true; + } + if (test_and_clear_bit(BR_FDB_LOCKED, &fdb->flags)) modified = true;
@@ -1113,7 +1121,7 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 nlh_flags, u16 vid, struct nlattr *nfea_tb[], - struct netlink_ext_ack *extack) + u32 ext_flags, struct netlink_ext_ack *extack) { int err = 0;
@@ -1140,7 +1148,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, err = br_fdb_external_learn_add(br, p, addr, vid, true); } else { spin_lock_bh(&br->hash_lock); - err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb); + err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, ext_flags, nfea_tb); spin_unlock_bh(&br->hash_lock); }
@@ -1200,6 +1208,23 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], return -EINVAL; }
+ if (ext_flags & NTF_EXT_BLACKHOLE) { + if (!(ndm->ndm_state & NUD_PERMANENT)) { + NL_SET_ERR_MSG_MOD(extack, "Blackhole FDB entry must be permanent"); + return -EINVAL; + } + if (p) { + NL_SET_ERR_MSG_MOD(extack, + "Blackhole FDB entry cannot be applied on a port"); + return -EINVAL; + } + if (ndm->ndm_flags & NTF_EXT_LEARNED) { + NL_SET_ERR_MSG_MOD(extack, + "Blackhole FDB entry cannot be added as ext. learned"); + return -EINVAL; + } + } + if (tb[NDA_FDB_EXT_ATTRS]) { attr = tb[NDA_FDB_EXT_ATTRS]; err = nla_parse_nested(nfea_tb, NFEA_MAX, attr, @@ -1219,10 +1244,10 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
/* VID was specified, so use it. */ err = __br_fdb_add(ndm, br, p, addr, nlh_flags, vid, nfea_tb, - extack); + ext_flags, extack); } else { err = __br_fdb_add(ndm, br, p, addr, nlh_flags, 0, nfea_tb, - extack); + ext_flags, extack); if (err || !vg || !vg->num_vlans) goto out;
@@ -1234,7 +1259,7 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], if (!br_vlan_should_use(v)) continue; err = __br_fdb_add(ndm, br, p, addr, nlh_flags, v->vid, - nfea_tb, extack); + nfea_tb, ext_flags, extack); if (err) goto out; } diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 068fced7693c..665d1d6bdc75 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -193,8 +193,11 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (dst) { unsigned long now = jiffies;
- if (test_bit(BR_FDB_LOCAL, &dst->flags)) + if (test_bit(BR_FDB_LOCAL, &dst->flags)) { + if (unlikely(test_bit(BR_FDB_BLACKHOLE, &dst->flags))) + goto drop; return br_pass_frame_up(skb); + }
if (now != dst->used) dst->used = now; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 4ce8b8e5ae0b..e7a08657c7ed 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -253,6 +253,7 @@ enum { BR_FDB_NOTIFY, BR_FDB_NOTIFY_INACTIVE, BR_FDB_LOCKED, + BR_FDB_BLACKHOLE, };
struct net_bridge_fdb_key {
On Tue, Oct 18, 2022 at 06:56:09PM +0200, Hans J. Schultz wrote:
Add a 'blackhole' fdb flag, ensuring that no forwarding from any port to a destination MAC that has a FDB entry with this flag on will occur. The packets will thus be dropped.
When the blackhole fdb flag is set, the 'local' flag will also be enabled as blackhole entries are not associated with any port.
It reads as if the kernel will enable the 'local' flag automatically, which is not true anymore. The bridge driver enforces that 'NUD_PERMANENT' is set if 'NTF_EXT_BLACKHOLE' is specified.
Thus the command will be similar to: bridge fdb add MAC dev br0 local blackhole
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
Looks OK to me. See one comment below.
Reviewed-by: Ido Schimmel idosch@nvidia.com
[...]
@@ -1140,7 +1148,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, err = br_fdb_external_learn_add(br, p, addr, vid, true); } else { spin_lock_bh(&br->hash_lock);
-		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
+		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, ext_flags, nfea_tb);
I believe the preference is to wrap to 80 columns when possible.
spin_unlock_bh(&br->hash_lock);
}
On 2022-10-20 15:06, Ido Schimmel wrote:
On Tue, Oct 18, 2022 at 06:56:09PM +0200, Hans J. Schultz wrote:
Add a 'blackhole' fdb flag, ensuring that no forwarding from any port to a destination MAC that has a FDB entry with this flag on will occur. The packets will thus be dropped.
When the blackhole fdb flag is set, the 'local' flag will also be enabled as blackhole entries are not associated with any port.
It reads as if the kernel will enable the 'local' flag automatically, which is not true anymore. The bridge driver enforces that 'NUD_PERMANENT' is set if 'NTF_EXT_BLACKHOLE' is specified.
Thus the command will be similar to: bridge fdb add MAC dev br0 local blackhole
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
Looks OK to me. See one comment below.
Reviewed-by: Ido Schimmel idosch@nvidia.com
[...]
@@ -1140,7 +1148,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, err = br_fdb_external_learn_add(br, p, addr, vid, true); } else { spin_lock_bh(&br->hash_lock);
err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, ext_flags,
nfea_tb);
I believe the preference is to wrap to 80 columns when possible.
Ok, I only have knowledge of 100 columns as a limit.
On 2022-10-20 15:06, Ido Schimmel wrote:
[...]
@@ -1140,7 +1148,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, err = br_fdb_external_learn_add(br, p, addr, vid, true); } else { spin_lock_bh(&br->hash_lock);
err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, ext_flags,
nfea_tb);
I believe the preference is to wrap to 80 columns when possible.
Very strange... since I ran checkpatch.pl from the net-next kernel itself and it did not give me any warnings about 80 columns, but rather said 'patch is ready for submission'.
As this is silent, could it be some missing python plugins or something to do with perl?
On Sun, 23 Oct 2022 07:32:02 +0200 netdev@kapio-technology.com wrote:
@@ -1140,7 +1148,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, err = br_fdb_external_learn_add(br, p, addr, vid, true); } else { spin_lock_bh(&br->hash_lock);
err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, ext_flags,
nfea_tb);
I believe the preference is to wrap to 80 columns when possible.
Very strange... since I ran checkpatch.pl from the net-next kernel itself and it did not give me any warnings about 80 columns, but rather said 'patch is ready for submission'.
As this is silent, could it be some missing python plugins or something to do with perl?
I run:
./scripts/checkpatch.pl --strict --max-line-length=80
The bridge will be able to install locked entries when receiving SWITCHDEV_FDB_ADD_TO_BRIDGE notifications from drivers.
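The way br_fdb_external_learn_add() brings an existing entry in line with the 'locked' value reported by the driver can be modelled as below. This is a userspace sketch under simplifying assumptions: `FDB_LOCKED_BIT` is a stand-in mask (the real BR_FDB_LOCKED is a bit index in an enum) and `sync_locked` is a hypothetical helper name.

```c
#include <stdbool.h>

/* Stand-in mask for the locked flag; not the kernel's actual bit position. */
#define FDB_LOCKED_BIT	(1UL << 0)

/*
 * Make the entry's locked flag match what the driver reported.
 * Returns true if the entry was modified, in which case the bridge
 * would send a notification about the changed entry.
 */
static bool sync_locked(unsigned long *fdb_flags, bool locked)
{
	bool cur = (*fdb_flags & FDB_LOCKED_BIT) != 0;

	if (locked == cur)
		return false;

	*fdb_flags ^= FDB_LOCKED_BIT;
	return true;
}
```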
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
---
 include/net/switchdev.h | 1 +
 net/bridge/br.c | 4 ++--
 net/bridge/br_fdb.c | 12 ++++++++++--
 net/bridge/br_private.h | 2 +-
 net/bridge/br_switchdev.c | 1 +
 5 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 7dcdc97c0bc3..ca0312b78294 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -248,6 +248,7 @@ struct switchdev_notifier_fdb_info {
 	u16 vid;
 	u8 added_by_user:1,
 	   is_local:1,
+	   locked:1,
 	   offloaded:1;
 };
diff --git a/net/bridge/br.c b/net/bridge/br.c index 96e91d69a9a8..e0e2df2fa278 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -165,8 +165,8 @@ static int br_switchdev_event(struct notifier_block *unused, switch (event) { case SWITCHDEV_FDB_ADD_TO_BRIDGE: fdb_info = ptr; - err = br_fdb_external_learn_add(br, p, fdb_info->addr, - fdb_info->vid, false); + err = br_fdb_external_learn_add(br, p, fdb_info->addr, fdb_info->vid, + fdb_info->locked, false); if (err) { err = notifier_from_errno(err); break; diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index 15ead4dc6190..8d207b1416f7 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1145,7 +1145,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, "FDB entry towards bridge must be permanent"); return -EINVAL; } - err = br_fdb_external_learn_add(br, p, addr, vid, true); + err = br_fdb_external_learn_add(br, p, addr, vid, false, true); } else { spin_lock_bh(&br->hash_lock); err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, ext_flags, nfea_tb); @@ -1400,7 +1400,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p) }
int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, - const unsigned char *addr, u16 vid, + const unsigned char *addr, u16 vid, bool locked, bool swdev_notify) { struct net_bridge_fdb_entry *fdb; @@ -1421,6 +1421,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (!p) flags |= BIT(BR_FDB_LOCAL);
+ if (locked) + flags |= BIT(BR_FDB_LOCKED); + fdb = fdb_create(br, p, addr, vid, flags); if (!fdb) { err = -ENOMEM; @@ -1444,6 +1447,11 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, modified = true; }
+ if (locked != test_bit(BR_FDB_LOCKED, &fdb->flags)) { + change_bit(BR_FDB_LOCKED, &fdb->flags); + modified = true; + } + if (swdev_notify) set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index e7a08657c7ed..3e9f4d1fbd60 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -812,7 +812,7 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, - bool swdev_notify); + bool locked, bool swdev_notify); int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, bool swdev_notify); diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 8f3d76c751dd..c6b938c01a74 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -136,6 +136,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br, item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags); item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags); + item->locked = test_bit(BR_FDB_LOCKED, &fdb->flags); item->info.dev = (!p || item->is_local) ? br->dev : p->dev; item->info.ctx = ctx; }
On Tue, Oct 18, 2022 at 06:56:10PM +0200, Hans J. Schultz wrote:
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 8f3d76c751dd..c6b938c01a74 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -136,6 +136,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br, item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags); item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+ item->locked = test_bit(BR_FDB_LOCKED, &fdb->flags);
Shouldn't this be set to 0 here, since it is the bridge->driver direction?
item->info.dev = (!p || item->is_local) ? br->dev : p->dev; item->info.ctx = ctx; } -- 2.34.1
On 2022-10-20 14:55, Vladimir Oltean wrote:
On Tue, Oct 18, 2022 at 06:56:10PM +0200, Hans J. Schultz wrote:
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 8f3d76c751dd..c6b938c01a74 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -136,6 +136,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br, item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags); item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+ item->locked = test_bit(BR_FDB_LOCKED, &fdb->flags);
Shouldn't this be set to 0 here, since it is the bridge->driver direction?
Wouldn't it be a good idea to allow drivers to add what corresponds to a blackhole entry when using the bridge input chain to activate the MAB feature, or in general to leave the decision of what to do to the driver implementation?
On Thu, Oct 20, 2022 at 09:29:06PM +0200, netdev@kapio-technology.com wrote:
On 2022-10-20 14:55, Vladimir Oltean wrote:
On Tue, Oct 18, 2022 at 06:56:10PM +0200, Hans J. Schultz wrote:
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 8f3d76c751dd..c6b938c01a74 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -136,6 +136,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br, item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags); item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+ item->locked = test_bit(BR_FDB_LOCKED, &fdb->flags);
Shouldn't this be set to 0 here, since it is the bridge->driver direction?
Wouldn't it be a good idea to allow drivers to add what corresponds to a blackhole entry when using the bridge input chain to activate the MAB feature, or in general to leave the decision of what to do to the driver implementation?
The patch doesn't propose that. It proposes:
| net: bridge: enable bridge to install locked fdb entries from drivers | | The bridge will be able to install locked entries when receiving | SWITCHDEV_FDB_ADD_TO_BRIDGE notifications from drivers.
Please write patches which make just one logical change, and explain the justification for that change and precisely that change in the commit message.
Add BR_PORT_MAB to BR_PORT_FLAGS_HW_OFFLOAD.
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
---
 net/bridge/br_switchdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index c6b938c01a74..ccf1b4cffdd0 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -71,7 +71,7 @@ bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p,
 }
 
 /* Flags that can be offloaded to hardware */
-#define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | \
+#define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | BR_PORT_MAB | \
 				  BR_MCAST_FLOOD | BR_BCAST_FLOOD | BR_PORT_LOCKED | \
 				  BR_HAIRPIN_MODE | BR_ISOLATED | BR_MULTICAST_TO_UNICAST)
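As a rough illustration of the role such an offload mask plays, the sketch below filters a set of changed port flags down to the ones a driver would be told about. All X_* names and bit values are hypothetical stand-ins, not the kernel's definitions, and offloadable_changes() is an invented helper.

```c
/* Hypothetical bit values standing in for the BR_* port flags. */
#define X_LEARNING	(1UL << 0)
#define X_FLOOD		(1UL << 1)
#define X_PORT_LOCKED	(1UL << 2)
#define X_PORT_MAB	(1UL << 3)
#define X_OTHER		(1UL << 4)	/* a flag that is not offloadable */

/* Analogue of BR_PORT_FLAGS_HW_OFFLOAD with the MAB flag included. */
#define X_FLAGS_HW_OFFLOAD \
	(X_LEARNING | X_FLOOD | X_PORT_LOCKED | X_PORT_MAB)

/* Return the changed flags that would be forwarded to the driver. */
static unsigned long offloadable_changes(unsigned long cur, unsigned long want)
{
	return (cur ^ want) & X_FLAGS_HW_OFFLOAD;
}
```

Under this model, a flag left out of the mask never reaches the driver, which is why the patch has to add BR_PORT_MAB to it.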
Add a new u16 for fdb flags to propagate through the DSA layer towards the fdb add and del functions of the drivers.
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
---
 include/net/dsa.h  |  2 ++
 net/dsa/dsa_priv.h |  6 ++++--
 net/dsa/port.c     | 10 ++++++----
 net/dsa/slave.c    | 10 ++++++++--
 net/dsa/switch.c   | 16 ++++++++--------
 5 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index ee369670e20e..e4b641b20713 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -821,6 +821,8 @@ static inline bool dsa_port_tree_same(const struct dsa_port *a,
 	return a->ds->dst == b->ds->dst;
 }
 
+#define DSA_FDB_FLAG_LOCKED		(1 << 0)
+
 typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, u16 vid,
 			      bool is_static, void *data);
 struct dsa_switch_ops {
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 6e65c7ffd6f3..c943e8934063 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -65,6 +65,7 @@ struct dsa_notifier_fdb_info {
 	const struct dsa_port *dp;
 	const unsigned char *addr;
 	u16 vid;
+	u16 fdb_flags;
 	struct dsa_db db;
 };
 
@@ -131,6 +132,7 @@ struct dsa_switchdev_event_work {
 	 */
 	unsigned char addr[ETH_ALEN];
 	u16 vid;
+	u16 fdb_flags;
 	bool host_addr;
 };
 
@@ -241,9 +243,9 @@ int dsa_port_vlan_msti(struct dsa_port *dp,
 		       const struct switchdev_vlan_msti *msti);
 int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu);
 int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-		     u16 vid);
+		     u16 vid, u16 fdb_flags);
 int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-		     u16 vid);
+		     u16 vid, u16 fdb_flags);
 int dsa_port_standalone_host_fdb_add(struct dsa_port *dp,
 				     const unsigned char *addr, u16 vid);
 int dsa_port_standalone_host_fdb_del(struct dsa_port *dp,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 208168276995..ff4f66f14d39 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -304,7 +304,7 @@ static int dsa_port_inherit_brport_flags(struct dsa_port *dp,
 					 struct netlink_ext_ack *extack)
 {
 	const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
-				   BR_BCAST_FLOOD | BR_PORT_LOCKED;
+				   BR_BCAST_FLOOD;
 	struct net_device *brport_dev = dsa_port_to_bridge_port(dp);
 	int flag, err;
 
@@ -328,7 +328,7 @@ static void dsa_port_clear_brport_flags(struct dsa_port *dp)
 {
 	const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
 	const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
-				   BR_BCAST_FLOOD | BR_PORT_LOCKED;
+				   BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB;
 	int flag, err;
 
 	for_each_set_bit(flag, &mask, 32) {
@@ -956,12 +956,13 @@ int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu)
 }
 
 int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-		     u16 vid)
+		     u16 vid, u16 fdb_flags)
 {
 	struct dsa_notifier_fdb_info info = {
 		.dp = dp,
 		.addr = addr,
 		.vid = vid,
+		.fdb_flags = fdb_flags,
 		.db = {
 			.type = DSA_DB_BRIDGE,
 			.bridge = *dp->bridge,
@@ -979,12 +980,13 @@ int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 }
 
 int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-		     u16 vid)
+		     u16 vid, u16 fdb_flags)
 {
 	struct dsa_notifier_fdb_info info = {
 		.dp = dp,
 		.addr = addr,
 		.vid = vid,
+		.fdb_flags = fdb_flags,
 		.db = {
 			.type = DSA_DB_BRIDGE,
 			.bridge = *dp->bridge,
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 1a59918d3b30..65f0c578ef44 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -3246,6 +3246,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)
 		container_of(work, struct dsa_switchdev_event_work, work);
 	const unsigned char *addr = switchdev_work->addr;
 	struct net_device *dev = switchdev_work->dev;
+	u16 fdb_flags = switchdev_work->fdb_flags;
 	u16 vid = switchdev_work->vid;
 	struct dsa_switch *ds;
 	struct dsa_port *dp;
@@ -3261,7 +3262,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)
 		else if (dp->lag)
 			err = dsa_port_lag_fdb_add(dp, addr, vid);
 		else
-			err = dsa_port_fdb_add(dp, addr, vid);
+			err = dsa_port_fdb_add(dp, addr, vid, fdb_flags);
 		if (err) {
 			dev_err(ds->dev,
 				"port %d failed to add %pM vid %d to fdb: %d\n",
@@ -3277,7 +3278,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)
 		else if (dp->lag)
 			err = dsa_port_lag_fdb_del(dp, addr, vid);
 		else
-			err = dsa_port_fdb_del(dp, addr, vid);
+			err = dsa_port_fdb_del(dp, addr, vid, fdb_flags);
 		if (err) {
 			dev_err(ds->dev,
 				"port %d failed to delete %pM vid %d from fdb: %d\n",
@@ -3315,6 +3316,7 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 	struct dsa_port *dp = dsa_slave_to_port(dev);
 	bool host_addr = fdb_info->is_local;
 	struct dsa_switch *ds = dp->ds;
+	u16 fdb_flags = 0;
 
 	if (ctx && ctx != dp)
 		return 0;
@@ -3361,6 +3363,9 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 		   orig_dev->name, fdb_info->addr, fdb_info->vid,
 		   host_addr ? " as host address" : "");
 
+	if (fdb_info->locked)
+		fdb_flags |= DSA_FDB_FLAG_LOCKED;
+
 	INIT_WORK(&switchdev_work->work, dsa_slave_switchdev_event_work);
 	switchdev_work->event = event;
 	switchdev_work->dev = dev;
@@ -3369,6 +3374,7 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 	ether_addr_copy(switchdev_work->addr, fdb_info->addr);
 	switchdev_work->vid = fdb_info->vid;
 	switchdev_work->host_addr = host_addr;
+	switchdev_work->fdb_flags = fdb_flags;
 
 	dsa_schedule_work(&switchdev_work->work);
 
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index ce56acdba203..dd355556892e 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -234,7 +234,7 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp,
 }
 
 static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-			       u16 vid, struct dsa_db db)
+			       u16 vid, u16 fdb_flags, struct dsa_db db)
 {
 	struct dsa_switch *ds = dp->ds;
 	struct dsa_mac_addr *a;
@@ -278,7 +278,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 }
 
 static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-			       u16 vid, struct dsa_db db)
+			       u16 vid, u16 fdb_flags, struct dsa_db db)
 {
 	struct dsa_switch *ds = dp->ds;
 	struct dsa_mac_addr *a;
@@ -404,8 +404,8 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds,
 							info->vid, info->db);
 			} else {
-				err = dsa_port_do_fdb_add(dp, info->addr,
-							  info->vid, info->db);
+				err = dsa_port_do_fdb_add(dp, info->addr, info->vid,
+							  info->fdb_flags, info->db);
 			}
 			if (err)
 				break;
@@ -432,8 +432,8 @@ static int dsa_switch_host_fdb_del(struct dsa_switch *ds,
 							info->vid, info->db);
 			} else {
-				err = dsa_port_do_fdb_del(dp, info->addr,
-							  info->vid, info->db);
+				err = dsa_port_do_fdb_del(dp, info->addr, info->vid,
+							  info->fdb_flags, info->db);
 			}
 			if (err)
 				break;
@@ -452,7 +452,7 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds,
 	if (!ds->ops->port_fdb_add)
 		return -EOPNOTSUPP;
 
-	return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->db);
+	return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->fdb_flags, info->db);
 }
 
 static int dsa_switch_fdb_del(struct dsa_switch *ds,
@@ -464,7 +464,7 @@ static int dsa_switch_fdb_del(struct dsa_switch *ds,
 	if (!ds->ops->port_fdb_del)
 		return -EOPNOTSUPP;
 
-	return dsa_port_do_fdb_del(dp, info->addr, info->vid, info->db);
+	return dsa_port_do_fdb_del(dp, info->addr, info->vid, info->fdb_flags, info->db);
 }
 
 static int dsa_switch_lag_fdb_add(struct dsa_switch *ds,
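The propagation step this patch adds in dsa_slave_fdb_event() can be sketched in isolation. This is a userspace model under stated assumptions: struct fdb_event and struct fdb_work are hypothetical, trimmed-down stand-ins for the switchdev notifier info and the deferred work item, MAC_LEN stands in for ETH_ALEN, and only DSA_FDB_FLAG_LOCKED is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define DSA_FDB_FLAG_LOCKED (1 << 0)	/* as defined by the patch */

/* Hypothetical stand-in for the notification coming from the bridge. */
struct fdb_event {
	unsigned char addr[MAC_LEN];
	uint16_t vid;
	bool locked;
};

/* Hypothetical stand-in for the deferred work item. */
struct fdb_work {
	unsigned char addr[MAC_LEN];
	uint16_t vid;
	uint16_t fdb_flags;
};

/* Fold the boolean from the event into a flags word kept with the work. */
static void fdb_work_fill(struct fdb_work *w, const struct fdb_event *ev)
{
	uint16_t fdb_flags = 0;

	if (ev->locked)
		fdb_flags |= DSA_FDB_FLAG_LOCKED;

	memcpy(w->addr, ev->addr, MAC_LEN);
	w->vid = ev->vid;
	w->fdb_flags = fdb_flags;
}
```

Using a flags word rather than another bool leaves room for further per-entry attributes without touching every function signature again.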
On Tue, Oct 18, 2022 at 06:56:12PM +0200, Hans J. Schultz wrote:
Add a new u16 for fdb flags to propagate through the DSA layer towards the fdb add and del functions of the drivers.
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
include/net/dsa.h | 2 ++ net/dsa/dsa_priv.h | 6 ++++-- net/dsa/port.c | 10 ++++++---- net/dsa/slave.c | 10 ++++++++-- net/dsa/switch.c | 16 ++++++++-------- 5 files changed, 28 insertions(+), 16 deletions(-)
diff --git a/include/net/dsa.h b/include/net/dsa.h index ee369670e20e..e4b641b20713 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -821,6 +821,8 @@ static inline bool dsa_port_tree_same(const struct dsa_port *a, return a->ds->dst == b->ds->dst; } +#define DSA_FDB_FLAG_LOCKED (1 << 0)
typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, u16 vid, bool is_static, void *data); struct dsa_switch_ops { diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index 6e65c7ffd6f3..c943e8934063 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -65,6 +65,7 @@ struct dsa_notifier_fdb_info { const struct dsa_port *dp; const unsigned char *addr; u16 vid;
+ u16 fdb_flags;
struct dsa_db db;
}; @@ -131,6 +132,7 @@ struct dsa_switchdev_event_work { */ unsigned char addr[ETH_ALEN]; u16 vid;
+ u16 fdb_flags;
bool host_addr;
}; @@ -241,9 +243,9 @@ int dsa_port_vlan_msti(struct dsa_port *dp, const struct switchdev_vlan_msti *msti); int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu); int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
- u16 vid);
+ u16 vid, u16 fdb_flags);
int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
- u16 vid);
+ u16 vid, u16 fdb_flags);
int dsa_port_standalone_host_fdb_add(struct dsa_port *dp, const unsigned char *addr, u16 vid); int dsa_port_standalone_host_fdb_del(struct dsa_port *dp, diff --git a/net/dsa/port.c b/net/dsa/port.c index 208168276995..ff4f66f14d39 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -304,7 +304,7 @@ static int dsa_port_inherit_brport_flags(struct dsa_port *dp, struct netlink_ext_ack *extack) { const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
- BR_BCAST_FLOOD | BR_PORT_LOCKED;
+ BR_BCAST_FLOOD;
struct net_device *brport_dev = dsa_port_to_bridge_port(dp); int flag, err;
@@ -328,7 +328,7 @@ static void dsa_port_clear_brport_flags(struct dsa_port *dp) { const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
- BR_BCAST_FLOOD | BR_PORT_LOCKED;
+ BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB;
Why does the mask of cleared brport flags differ from the one of set brport flags, and what/where is the explanation for this change?
int flag, err; for_each_set_bit(flag, &mask, 32) { @@ -956,12 +956,13 @@ int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu) } int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
- u16 vid)
+ u16 vid, u16 fdb_flags)
{ struct dsa_notifier_fdb_info info = { .dp = dp, .addr = addr, .vid = vid,
+ .fdb_flags = fdb_flags,
.db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge,
@@ -979,12 +980,13 @@ int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr, } int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
- u16 vid)
+ u16 vid, u16 fdb_flags)
{ struct dsa_notifier_fdb_info info = { .dp = dp, .addr = addr, .vid = vid,
+ .fdb_flags = fdb_flags,
.db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge,
diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 1a59918d3b30..65f0c578ef44 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -3246,6 +3246,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) container_of(work, struct dsa_switchdev_event_work, work); const unsigned char *addr = switchdev_work->addr; struct net_device *dev = switchdev_work->dev;
+ u16 fdb_flags = switchdev_work->fdb_flags;
u16 vid = switchdev_work->vid; struct dsa_switch *ds; struct dsa_port *dp;
@@ -3261,7 +3262,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) else if (dp->lag) err = dsa_port_lag_fdb_add(dp, addr, vid); else
- err = dsa_port_fdb_add(dp, addr, vid);
+ err = dsa_port_fdb_add(dp, addr, vid, fdb_flags);
if (err) { dev_err(ds->dev, "port %d failed to add %pM vid %d to fdb: %d\n",
@@ -3277,7 +3278,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) else if (dp->lag) err = dsa_port_lag_fdb_del(dp, addr, vid); else
- err = dsa_port_fdb_del(dp, addr, vid);
+ err = dsa_port_fdb_del(dp, addr, vid, fdb_flags);
if (err) { dev_err(ds->dev, "port %d failed to delete %pM vid %d from fdb: %d\n",
@@ -3315,6 +3316,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, struct dsa_port *dp = dsa_slave_to_port(dev); bool host_addr = fdb_info->is_local; struct dsa_switch *ds = dp->ds;
+ u16 fdb_flags = 0;
if (ctx && ctx != dp) return 0; @@ -3361,6 +3363,9 @@ static int dsa_slave_fdb_event(struct net_device *dev, orig_dev->name, fdb_info->addr, fdb_info->vid, host_addr ? " as host address" : "");
+ if (fdb_info->locked)
+ fdb_flags |= DSA_FDB_FLAG_LOCKED;
This is the bridge->driver direction. In which of the changes up until now/through which mechanism will the bridge emit a SWITCHDEV_FDB_ADD_TO_DEVICE with fdb_info->locked = true?
Don't the other switchdev drivers except DSA (search for SWITCHDEV_FDB_EVENT_TO_DEVICE in the drivers/ folder) need to handle this new flag too, even if to reject it?
When other drivers will want to look at fdb_info->locked, they'll have the surprise that it's impossible to maintain backwards compatibility, because they didn't use to treat the flag at all in the past (so either locked or unlocked, they did the same thing).
INIT_WORK(&switchdev_work->work, dsa_slave_switchdev_event_work); switchdev_work->event = event; switchdev_work->dev = dev;
@@ -3369,6 +3374,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, ether_addr_copy(switchdev_work->addr, fdb_info->addr); switchdev_work->vid = fdb_info->vid; switchdev_work->host_addr = host_addr;
+ switchdev_work->fdb_flags = fdb_flags;
On Thu, Oct 20, 2022 at 04:02:24PM +0300, Vladimir Oltean wrote:
On Tue, Oct 18, 2022 at 06:56:12PM +0200, Hans J. Schultz wrote:
@@ -3315,6 +3316,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, struct dsa_port *dp = dsa_slave_to_port(dev); bool host_addr = fdb_info->is_local; struct dsa_switch *ds = dp->ds;
+ u16 fdb_flags = 0;
if (ctx && ctx != dp) return 0; @@ -3361,6 +3363,9 @@ static int dsa_slave_fdb_event(struct net_device *dev, orig_dev->name, fdb_info->addr, fdb_info->vid, host_addr ? " as host address" : "");
+ if (fdb_info->locked)
+ fdb_flags |= DSA_FDB_FLAG_LOCKED;
This is the bridge->driver direction. In which of the changes up until now/through which mechanism will the bridge emit a SWITCHDEV_FDB_ADD_TO_DEVICE with fdb_info->locked = true?
I believe it can happen in the following call chain:
br_handle_frame_finish br_fdb_update // p->flags & BR_PORT_MAB fdb_notify br_switchdev_fdb_notify
This can happen with Spectrum when a packet ingresses via a locked port and incurs an FDB miss in hardware. The packet will be trapped and injected to the Rx path where it should invoke the above call chain.
Don't the other switchdev drivers except DSA (search for SWITCHDEV_FDB_EVENT_TO_DEVICE in the drivers/ folder) need to handle this new flag too, even if to reject it?
Yes, agree. At least with mlxsw it is not a big deal right now because it ignores entries with !BR_FDB_ADDED_BY_USER and locked entries are always like that, but it would be good to make it more explicit.
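To make the "more explicit" handling concrete, here is one possible shape for the decision a switchdev driver could take on an FDB notification. It is a sketch, not mlxsw code: struct fdb_info, enum fdb_verdict and fdb_classify() are hypothetical, and the ordering of the checks is an assumption.

```c
#include <stdbool.h>

/* Hypothetical subset of the fields a driver sees in an FDB notification. */
struct fdb_info {
	bool added_by_user;
	bool locked;
};

enum fdb_verdict {
	FDB_PROGRAM,	/* install the entry in hardware */
	FDB_SKIP,	/* silently ignore, as done today for !added_by_user */
	FDB_REJECT,	/* explicitly refuse: no support for locked entries */
};

static enum fdb_verdict fdb_classify(const struct fdb_info *info)
{
	/* Check the new flag first so that it is never handled by accident. */
	if (info->locked)
		return FDB_REJECT;

	if (!info->added_by_user)
		return FDB_SKIP;

	return FDB_PROGRAM;
}
```

With the locked check placed first, a locked entry is refused even if a later change makes it arrive with added_by_user set, so the driver no longer depends on the implicit filter.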
When other drivers will want to look at fdb_info->locked, they'll have the surprise that it's impossible to maintain backwards compatibility, because they didn't use to treat the flag at all in the past (so either locked or unlocked, they did the same thing).
INIT_WORK(&switchdev_work->work, dsa_slave_switchdev_event_work); switchdev_work->event = event; switchdev_work->dev = dev;
@@ -3369,6 +3374,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, ether_addr_copy(switchdev_work->addr, fdb_info->addr); switchdev_work->vid = fdb_info->vid; switchdev_work->host_addr = host_addr;
+ switchdev_work->fdb_flags = fdb_flags;
On Thu, Oct 20, 2022 at 04:24:16PM +0300, Ido Schimmel wrote:
On Thu, Oct 20, 2022 at 04:02:24PM +0300, Vladimir Oltean wrote:
On Tue, Oct 18, 2022 at 06:56:12PM +0200, Hans J. Schultz wrote:
@@ -3315,6 +3316,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, struct dsa_port *dp = dsa_slave_to_port(dev); bool host_addr = fdb_info->is_local; struct dsa_switch *ds = dp->ds;
+ u16 fdb_flags = 0;
if (ctx && ctx != dp) return 0; @@ -3361,6 +3363,9 @@ static int dsa_slave_fdb_event(struct net_device *dev, orig_dev->name, fdb_info->addr, fdb_info->vid, host_addr ? " as host address" : "");
+ if (fdb_info->locked)
+ fdb_flags |= DSA_FDB_FLAG_LOCKED;
This is the bridge->driver direction. In which of the changes up until now/through which mechanism will the bridge emit a SWITCHDEV_FDB_ADD_TO_DEVICE with fdb_info->locked = true?
I believe it can happen in the following call chain:
br_handle_frame_finish br_fdb_update // p->flags & BR_PORT_MAB fdb_notify br_switchdev_fdb_notify
This can happen with Spectrum when a packet ingresses via a locked port and incurs an FDB miss in hardware. The packet will be trapped and injected to the Rx path where it should invoke the above call chain.
Ah, so this is the case which in mv88e6xxx would generate an ATU violation interrupt; in the Spectrum case it generates a special packet. Right now this packet isn't generated, right?
I think we have the same thing in ocelot, a port can be configured to send "learn frames" to the CPU.
Should these packets be injected into the bridge RX path in the first place? They reach the CPU because of an FDB miss, not because the CPU was the intended destination.
On Thu, Oct 20, 2022 at 04:35:06PM +0300, Vladimir Oltean wrote:
Ah, so this is the case which in mv88e6xxx would generate an ATU violation interrupt; in the Spectrum case it generates a special packet.
Not sure what you mean by "special" :) It's simply the packet that incurred the FDB miss on the SMAC.
Right now this packet isn't generated, right?
Right. We don't support BR_PORT_LOCKED so these checks are not currently enabled in hardware. To be clear, only packets received via locked ports are able to trigger the check.
I think we have the same thing in ocelot, a port can be configured to send "learn frames" to the CPU.
Should these packets be injected into the bridge RX path in the first place? They reach the CPU because of an FDB miss, not because the CPU was the intended destination.
The reason to inject them to the Rx path is so that they will trigger the creation of the "locked" entry in the bridge driver (when MAB is on), thereby notifying user space about the presence of a new MAC behind the locked port. We can try to parse them in the driver and notify the bridge driver via SWITCHDEV_FDB_ADD_TO_BRIDGE, but it's quite ugly...
On Thu, Oct 20, 2022 at 04:57:35PM +0300, Ido Schimmel wrote:
The reason to inject them to the Rx path is so that they will trigger the creation of the "locked" entry in the bridge driver (when MAB is on), thereby notifying user space about the presence of a new MAC behind the locked port. We can try to parse them in the driver and notify the bridge driver via SWITCHDEV_FDB_ADD_TO_BRIDGE, but it's quite ugly...
"ugly" => your words, not mine... But abstracting things a bit, doing what you just said (SWITCHDEV_FDB_ADD_TO_BRIDGE) for learn frames would be exactly the same thing as what mv88e6xxx is doing (so your "ugly" comment equally applies to Marvell). The learn frames are "special" in the sense that they don't belong to the data path of the software bridge*, they are just hardware specific information which the driver must deal with, using a channel that happens to be Ethernet and not an IRQ/MDIO.
*in other words, a bridge with proper RX filtering should not even receive these frames, or would need special casing for BR_PORT_MAB to not drop them in the first place.
I would incline towards a unified approach for CPU assisted learning, regardless of this (minor, IMO) difference between Marvell and other vendors.
On Thu, Oct 20, 2022 at 05:04:00PM +0300, Vladimir Oltean wrote:
"ugly" => your words, not mine... But abstracting things a bit, doing what you just said (SWITCHDEV_FDB_ADD_TO_BRIDGE) for learn frames would be exactly the same thing as what mv88e6xxx is doing (so your "ugly" comment equally applies to Marvell).
My understanding is that mv88e6xxx only reads the SMAC and FID/VID from hardware and notifies them to the bridge driver. It does not need to parse them out of the Ethernet frame that triggered the "violation". This is the "ugly" part (in my opinion).
The learn frames are "special" in the sense that they don't belong to the data path of the software bridge*, they are just hardware specific information which the driver must deal with, using a channel that happens to be Ethernet and not an IRQ/MDIO.
I think we misunderstand each other because I don't understand why you call them "special" nor "hardware specific information" :/ We don't inject to the software data path some hardware specific frames, but rather the original Ethernet frames that triggered the violation. The same thing happens with packets that encountered a neighbour miss during routing or whose TTL was decremented to zero. The hardware can't generate ARP or ICMP packets, so the original packet is injected to the Rx path so that the kernel will generate the necessary control packets in response.
*in other words, a bridge with proper RX filtering should not even receive these frames, or would need special casing for BR_PORT_MAB to not drop them in the first place.
I would incline towards a unified approach for CPU assisted learning, regardless of this (minor, IMO) difference between Marvell and other vendors.
OK, understood. Assuming you don't like the above, I need to check if we can do something similar to what mv88e6xxx is doing (because I don't think mv88e6xxx can do anything else).
On Thu, Oct 20, 2022 at 05:58:42PM +0300, Ido Schimmel wrote:
My understanding is that mv88e6xxx only reads the SMAC and FID/VID from hardware and notifies them to the bridge driver. It does not need to parse them out of the Ethernet frame that triggered the "violation". This is the "ugly" part (in my opinion).
I think that the Marvell approach is uglier, but maybe that's just me. Given the choice between parsing a MAC SA/VLAN ID from an Ethernet frame and having to concern myself with rate limiting IRQs which need MDIO access, I'd rather parse Ethernet frames all day long.
With Ethernet we have all sorts of coping mechanisms, NAPI, IRQ coalescing. The Ethernet interrupts are designed to be very high bandwidth. You can even put a storm policer on Ethernet traffic and rate limit the learn frames. I don't like where the Marvell specific impl is going, I don't think it is a good first implementation of a new feature, since it will inevitably shape the way in which other hardware with CPU assisted learning will do things. For example, not sure if blackhole FDB entries are going to be needed by other implementations as well.
I kind of thought that the Linux bridge would be more resilient to DoS than it actually is. Now I'm not sure if me and Andrew gave bad advice with the whole protection mechanisms put in place as UAPI for mv88e6xxx's quirks.
The learn frames are "special" in the sense that they don't belong to the data path of the software bridge*, they are just hardware specific information which the driver must deal with, using a channel that happens to be Ethernet and not an IRQ/MDIO.
I think we misunderstand each other because I don't understand why you call them "special" nor "hardware specific information" :/
I call them special because there is no need to present these packets to application software. Understood and agreed that they are identical to the original packet which triggered the trap (plus some metadata which denotes the trap reason, presumably), although I don't think this really matters too much.
We don't inject to the software data path some hardware specific frames, but rather the original Ethernet frames that triggered the violation. The same thing happens with packets that encountered a neighbour miss during routing or whose TTL was decremented to zero. The hardware can't generate ARP or ICMP packets, so the original packet is injected to the Rx path so that the kernel will generate the necessary control packets in response.
Can't speak for IP forwarding offload unfortunately, but it seems like you presented a different/unrelated situation here. CPU assisted learning is not slow path processing, because nothing needs to be done further with that packet except for extracting its MAC SA/VID, and learning it. The rest of the original packet is really not necessary.
*in other words, a bridge with proper RX filtering should not even receive these frames, or would need special casing for BR_PORT_MAB to not drop them in the first place.
I would incline towards a unified approach for CPU assisted learning, regardless of this (minor, IMO) difference between Marvell and other vendors.
OK, understood. Assuming you don't like the above, I need to check if we can do something similar to what mv88e6xxx is doing (because I don't think mv88e6xxx can do anything else).
No no, I like having an Ethernet channel (see the first reply to this email), I think it has benefits and I don't want to make Spectrum follow an inferior route just because that's the model.
But on the other hand, nobody right now needs the mechanism that Hans put in place for setting BR_FDB_LOCKED in software, and notifying it back to the driver. Moreover, when Ethernet-based CPU assisted learning will come, this mechanism will not be the only possibility, and that should be a separate discussion. I still think that generic helpers to emit SWITCHDEV_FDB_ADD_TO_BRIDGE based on an skb are an equally valid approach, and would diverge significantly less from Marvell without imposing any real limitation. In the implementation proposed here, we have variation for the sake of variation, and we come up with hypothetical examples of how they might be useful. At least half this patch set is full of maybes, I can't really say I like that.
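To make the "extract the MAC SA/VID and learn it" step concrete, here is a minimal user-space sketch of what an skb-based helper would have to pull out of a trapped frame. It is not kernel code and not part of this series: `struct learn_key`, `learn_key_from_frame()` and the `SK_*` constants are hypothetical names, and only a single 802.1Q tag is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_ETH_ALEN	6	/* MAC address length in bytes */
#define SK_ETH_HLEN	14	/* untagged Ethernet header length */
#define SK_VLAN_HLEN	4	/* length of one 802.1Q tag */
#define SK_ETH_P_8021Q	0x8100	/* 802.1Q TPID */
#define SK_VID_MASK	0x0fff	/* VID is the low 12 bits of the TCI */

/* Hypothetical key that a learn notification would carry. */
struct learn_key {
	uint8_t  smac[SK_ETH_ALEN];
	uint16_t vid;		/* 0 when the frame is untagged */
};

/* Returns 0 on success, -1 if the frame is too short to parse. */
static int learn_key_from_frame(const uint8_t *frame, size_t len,
				struct learn_key *key)
{
	uint16_t ethertype;

	assert(frame && key);
	if (len < SK_ETH_HLEN)
		return -1;

	/* Bytes 0..5 are the MAC DA, bytes 6..11 the MAC SA. */
	memcpy(key->smac, frame + SK_ETH_ALEN, SK_ETH_ALEN);
	key->vid = 0;

	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ethertype == SK_ETH_P_8021Q) {
		if (len < SK_ETH_HLEN + SK_VLAN_HLEN)
			return -1;
		key->vid = (uint16_t)(((frame[14] << 8) | frame[15]) &
				      SK_VID_MASK);
	}
	return 0;
}
```

Everything past the first 14 (or 18) bytes is ignored, which is the point made above: the rest of the original packet is not needed for learning.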
On Thu, Oct 20, 2022 at 04:57:35PM +0300, Ido Schimmel wrote:
Right now this packet isn't generated, right?
Right. We don't support BR_PORT_LOCKED so these checks are not currently enabled in hardware. To be clear, only packets received via locked ports are able to trigger the check.
You mean BR_PORT_MAB, not BR_PORT_LOCKED, right? AFAIU, "locked" means drop unknown MAC SA, "mab" means "install BR_FDB_LOCKED entry on port" (and also maybe still drop, if "locked" is also set on port).
Sad there isn't any good documentation about these flags in the patches that Hans is proposing.
On Thu, Oct 20, 2022 at 05:11:04PM +0300, Vladimir Oltean wrote:
On Thu, Oct 20, 2022 at 04:57:35PM +0300, Ido Schimmel wrote:
Right now this packet isn't generated, right?
Right. We don't support BR_PORT_LOCKED so these checks are not currently enabled in hardware. To be clear, only packets received via locked ports are able to trigger the check.
You mean BR_PORT_MAB, not BR_PORT_LOCKED, right?
I actually meant BR_PORT_LOCKED... The hardware has a single bit per port that can be used to enable security checks on the port. If security checks are enabled, then before L2 forwarding the hardware will perform an FDB lookup with the SMAC and FID (VID), which can have one of three results:
1. Match. FDB entry found and it points to the ingress port. In this case the packet continues to the regular L2 pipeline.
2. Mismatch. FDB entry found, but it points to a different port than ingress port. In this case we want to drop the packet like the software bridge.
3. Miss. FDB entry not found. Here I was thinking to always tell the packet to go to the software data path so that it will trigger the creation of the "locked" entry if MAB is enabled. If MAB is not enabled, it will simply be dropped by the bridge. We can't control it per port in hardware, which is why the BR_PORT_MAB flag is not consulted.
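The three outcomes above can be sketched as a small classification function. This is only an illustration of the described behaviour, not driver code: the enum names, `struct fdb_hit` and both functions are hypothetical.

```c
#include <stdbool.h>

/* Result of the SMAC + FID lookup done on a port with security checks on. */
enum sec_result { SEC_MATCH, SEC_MISMATCH, SEC_MISS };

/* What happens to the packet afterwards. */
enum sec_action { ACT_FORWARD, ACT_DROP, ACT_TRAP_TO_CPU };

/* Hypothetical lookup result: was an entry found, and for which port. */
struct fdb_hit {
	bool found;
	int  port;
};

static enum sec_result sec_classify(struct fdb_hit hit, int ingress_port)
{
	if (!hit.found)
		return SEC_MISS;
	return hit.port == ingress_port ? SEC_MATCH : SEC_MISMATCH;
}

static enum sec_action sec_action_for(enum sec_result res)
{
	switch (res) {
	case SEC_MATCH:
		return ACT_FORWARD;	/* regular L2 pipeline */
	case SEC_MISMATCH:
		return ACT_DROP;	/* same as the software bridge */
	case SEC_MISS:
		/* Always trapped; the bridge decides based on MAB. */
		return ACT_TRAP_TO_CPU;
	}
	return ACT_DROP;
}
```

Note that the miss case does not look at a per-port MAB setting, matching the statement that this cannot be controlled per port in hardware.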
AFAIU, "locked" means drop unknown MAC SA, "mab" means "install BR_FDB_LOCKED entry on port" (and also maybe still drop, if "locked" is also set on port).
Right, but you can't have "mab" without "locked" (from patch #1):
```
@@ -943,6 +946,13 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
 	br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS);
 	br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED);
 	br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
+	br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB);
+
+	if (!(p->flags & BR_PORT_LOCKED) && (p->flags & BR_PORT_MAB)) {
+		NL_SET_ERR_MSG(extack, "MAB cannot be enabled when port is unlocked");
+		p->flags = old_flags;
+		return -EINVAL;
+	}

 	changed_mask = old_flags ^ p->flags;
```
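The dependency between the two flags can be exercised outside the kernel with a small stand-alone sketch. The flag bit positions and `brport_set_flags()` are hypothetical and chosen only for illustration; the rule itself (MAB requires LOCKED, otherwise the old flags are kept and -EINVAL is returned) is the one from the hunk quoted above.

```c
#include <errno.h>

/* Illustrative bit positions, not the kernel's actual values. */
#define SK_BR_PORT_LOCKED	(1UL << 0)
#define SK_BR_PORT_MAB		(1UL << 1)

/*
 * Apply new_flags to *flags unless they ask for MAB on an unlocked
 * port. On rejection *flags is left untouched, which has the same
 * effect as restoring old_flags in the kernel hunk.
 */
static int brport_set_flags(unsigned long *flags, unsigned long new_flags)
{
	if (!(new_flags & SK_BR_PORT_LOCKED) && (new_flags & SK_BR_PORT_MAB))
		return -EINVAL;

	*flags = new_flags;
	return 0;
}
```

With this rule, "locked without mab" and "locked with mab" are valid combinations, while "mab without locked" is rejected.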
Sad there isn't any good documentation about these flags in the patches that Hans is proposing.
Will try to comment with better commit messages for patches #1 and #2. Not sure I will have time today.
On Thu, Oct 20, 2022 at 06:23:37PM +0300, Ido Schimmel wrote:
3. Miss. FDB entry not found. Here I was thinking to always tell the packet to go to the software data path so that it will trigger the creation of the "locked" entry if MAB is enabled. If MAB is not enabled, it will simply be dropped by the bridge. We can't control it per port in hardware, which is why the BR_PORT_MAB flag is not consulted.
Ah, ok, this is the part I was missing, so you can't control an FDB miss to generate a learn frame only on some ports. But in principle, it still is the BR_PORT_MAB flag the one which requires these frames to be generated, not BR_PORT_LOCKED. You can have all ports LOCKED but not MAB, and no learn frames will be necessary to be sent to the CPU. Only EAPOL, which is link-local multicast, will reach software for further processing and unlock the port for a certain MAC DA.
On 2022-10-20 15:35, Vladimir Oltean wrote:
On Thu, Oct 20, 2022 at 04:24:16PM +0300, Ido Schimmel wrote:
On Thu, Oct 20, 2022 at 04:02:24PM +0300, Vladimir Oltean wrote:
On Tue, Oct 18, 2022 at 06:56:12PM +0200, Hans J. Schultz wrote:
@@ -3315,6 +3316,7 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 	struct dsa_port *dp = dsa_slave_to_port(dev);
 	bool host_addr = fdb_info->is_local;
 	struct dsa_switch *ds = dp->ds;
+	u16 fdb_flags = 0;

 	if (ctx && ctx != dp)
 		return 0;
@@ -3361,6 +3363,9 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 		   orig_dev->name, fdb_info->addr, fdb_info->vid,
 		   host_addr ? " as host address" : "");

+	if (fdb_info->locked)
+		fdb_flags |= DSA_FDB_FLAG_LOCKED;
This is the bridge->driver direction. In which of the changes up until now/through which mechanism will the bridge emit a SWITCHDEV_FDB_ADD_TO_DEVICE with fdb_info->locked = true?
I believe it can happen in the following call chain:
br_handle_frame_finish br_fdb_update // p->flags & BR_PORT_MAB fdb_notify br_switchdev_fdb_notify
This can happen with Spectrum when a packet ingresses via a locked port and incurs an FDB miss in hardware. The packet will be trapped and injected to the Rx path where it should invoke the above call chain.
Ah, so this is the case which in mv88e6xxx would generate an ATU violation interrupt; in the Spectrum case it generates a special packet. Right now this packet isn't generated, right?
I think we have the same thing in ocelot, a port can be configured to send "learn frames" to the CPU.
Should these packets be injected into the bridge RX path in the first place? They reach the CPU because of an FDB miss, not because the CPU was the intended destination.
Just to add to it, now that there is a u16 for flags in the bridge->driver direction, making it easier to add such flags, I expect that for the mv88e6xxx driver there shall be a 'IS_DYNAMIC' flag also, as authorized hosts will have their authorized FDB entries added with dynamic entries...
Now as the bridge will not be able to refresh such authorized FDB entries based on unicast incoming traffic on the locked port in the offloaded case, besides we don't want the CPU to do such in this case anyway, to keep the authorized line alive without having to reauthorize in like every 5 minutes, the driver needs to do the ageing (and refreshing) of the dynamic entry added from userspace. When the entry "ages" out, there is the HoldAt1 feature and Age Out Violations that should be used to tell userspace (plus bridge) that this authorization has been removed by the driver as the host has gone quiet.
So all in all, there is the need of another flag from userspace->bridge->driver, telling that we want a dynamic ATU entry (with mv88e6xxx it will start at age 7).
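As a rough model of the ageing behaviour described above, the following sketch counts an entry down from age 7 and reports an event when it reaches 1 with hold-at-1 enabled. This is a simplified assumption about the hardware semantics, not taken from the mv88e6xxx driver or a datasheet; all names (`struct atu_entry_sketch`, `atu_age_tick()`, `atu_refresh()`, the `AGE_*` events) are hypothetical.

```c
#include <stdbool.h>

#define SK_ATU_AGE_START	7	/* starting age mentioned above */

enum age_event {
	AGE_NONE,		/* nothing to report */
	AGE_OUT_VIOLATION,	/* reached age 1 while held: tell software */
	AGE_REMOVED,		/* entry aged out and was dropped */
};

struct atu_entry_sketch {
	unsigned int age;	/* 0 means "no entry" */
	bool hold_at_1;		/* assumed HoldAt1-like behaviour */
};

/* Traffic from the station refreshes the entry back to the start age. */
static void atu_refresh(struct atu_entry_sketch *e)
{
	e->age = SK_ATU_AGE_START;
}

/* One ageing step; the caller decides how often this runs. */
static enum age_event atu_age_tick(struct atu_entry_sketch *e)
{
	if (e->age == 0)
		return AGE_NONE;

	if (e->age > 1) {
		e->age--;
		if (e->age == 1 && e->hold_at_1)
			return AGE_OUT_VIOLATION;
		return AGE_NONE;
	}

	/* age == 1: either stay held or finally drop the entry. */
	if (e->hold_at_1)
		return AGE_NONE;
	e->age = 0;
	return AGE_REMOVED;
}
```

In this model the violation event is where the driver would notify the bridge and userspace that the authorization has lapsed because the host has gone quiet.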
On Thu, Oct 20, 2022 at 08:47:39PM +0200, netdev@kapio-technology.com wrote:
Just to add to it, now that there is a u16 for flags in the bridge->driver direction, making it easier to add such flags, I expect that for the mv88e6xxx driver there shall be a 'IS_DYNAMIC' flag also, as authorized hosts will have their authorized FDB entries added with dynamic entries...
With what is implemented in this patchset, the MAB daemon uses static FDB entries for authorizations, just like the selftests, right? That's the only thing that works.
Now as the bridge will not be able to refresh such authorized FDB entries based on unicast incoming traffic on the locked port in the offloaded case, besides we don't want the CPU to do such in this case anyway,
..because the software bridge refreshes the FDB entry based on the traffic it sees, and the hardware port refreshes the corresponding ATU entry based on the traffic *it* sees, and the 2 are not in sync because most of the traffic is autonomously forwarded, causing the FDB to be refreshed more often in hardware than in software..
to keep the authorized line alive without having to reauthorize in like every 5 minutes, the driver needs to do the ageing (and refreshing) of the dynamic entry added from userspace.
You're saying "now [...] to keep the authorized line alive [...], the driver needs to do the [...] refreshing of the dynamic [FDB] entry".
Can you point me to the code where that is done now?
Or perhaps I'm misunderstanding and it is a "future now"...
When the entry "ages" out, there is the HoldAt1 feature and Age Out Violations that should be used to tell userspace (plus bridge) that this authorization has been removed by the driver as the host has gone quiet.
So this is your proposal for how a dynamic FDB entry could be offloaded.
Have you given any thought to how can we prevent the software FDB entry from ageing out first, and causing the hardware FDB entry to be removed too, through the ensuing switchdev notification?
So all in all, there is the need of another flag from userspace->bridge->driver, telling that we want a dynamic ATU entry (with mv88e6xxx it will start at age 7).
Sorry for the elementary question, but what is gained from making the authorized FDB entries dynamic in the bridge? You don't have to reauthorize every 5 minutes even with the current implementation; you could make the FDB entries static. The ability for authorized stations to roam? This is why the authorizations are removed every 5 minutes, to see if anybody is still there? Who removes the authorizations in the implementation with the currently proposed patch set? The MAB daemon, right?
Could you please present a high level overview of how you want things to look in the end and how far you are along that line? Maybe a set of user space + kernel repos where everything is implemented and works?
On 2022-10-20 15:02, Vladimir Oltean wrote:
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -304,7 +304,7 @@ static int dsa_port_inherit_brport_flags(struct dsa_port *dp,
 					 struct netlink_ext_ack *extack)
 {
 	const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
-				   BR_BCAST_FLOOD | BR_PORT_LOCKED;
+				   BR_BCAST_FLOOD;
 	struct net_device *brport_dev = dsa_port_to_bridge_port(dp);
 	int flag, err;

@@ -328,7 +328,7 @@ static void dsa_port_clear_brport_flags(struct dsa_port *dp)
 {
 	const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
 	const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
-				   BR_BCAST_FLOOD | BR_PORT_LOCKED;
+				   BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB;
Why does the mask of cleared brport flags differ from the one of set brport flags, and what/where is the explanation for this change?
I guess you mean, why it differs from the inherit flag mask list?
If so it is explained in the update to v7 in 00/12.
On Thu, Oct 20, 2022 at 09:43:40PM +0200, netdev@kapio-technology.com wrote:
I guess you mean, why it differs from the inherit flag mask list?
If so it is explained in the update to v7 in 00/12.
The following is written there:
| v7: Remove locked port and mab flags from DSA flags
|     inherit list as it messes with the learning
|     setting and those flags are not naturally meant
|     for inheriting, but should be set explicitly.
Can you go one level deeper with the explanation? What messes with the learning setting? Why are those brport flags not naturally meant for inheriting?
It's pretty hard to take your patch set seriously if you don't provide proper explanations.
Enable the bridge to receive blackhole FDB entries from drivers with SWITCHDEV_FDB_ADD_TO_BRIDGE notifications and send them to drivers with RTM_NEWNEIGH notifications.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
---
 include/net/switchdev.h   |  1 +
 net/bridge/br.c           |  3 ++-
 net/bridge/br_fdb.c       | 19 ++++++++++++++++---
 net/bridge/br_private.h   |  3 ++-
 net/bridge/br_switchdev.c |  1 +
 5 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index ca0312b78294..39727902354e 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -249,6 +249,7 @@ struct switchdev_notifier_fdb_info {
 	u8 added_by_user:1,
 	   is_local:1,
 	   locked:1,
+	   blackhole:1,
 	   offloaded:1;
 };

diff --git a/net/bridge/br.c b/net/bridge/br.c
index e0e2df2fa278..85fc529b6a9f 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 		fdb_info = ptr;
 		err = br_fdb_external_learn_add(br, p, fdb_info->addr, fdb_info->vid,
-						fdb_info->locked, false);
+						fdb_info->locked, fdb_info->is_local,
+						fdb_info->blackhole, false);
 		if (err) {
 			err = notifier_from_errno(err);
 			break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 8d207b1416f7..ef973d21d7bd 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1145,7 +1145,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 					   "FDB entry towards bridge must be permanent");
 			return -EINVAL;
 		}
-		err = br_fdb_external_learn_add(br, p, addr, vid, false, true);
+		err = br_fdb_external_learn_add(br, p, addr, vid, false, false, false, true);
 	} else {
 		spin_lock_bh(&br->hash_lock);
 		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, ext_flags, nfea_tb);
@@ -1401,7 +1401,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)

 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid, bool locked,
-			      bool swdev_notify)
+			      bool local, bool blackhole, bool swdev_notify)
 {
 	struct net_bridge_fdb_entry *fdb;
 	bool modified = false;
@@ -1418,12 +1418,15 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 	if (swdev_notify)
 		flags |= BIT(BR_FDB_ADDED_BY_USER);

-	if (!p)
+	if (!p || local)
 		flags |= BIT(BR_FDB_LOCAL);

 	if (locked)
 		flags |= BIT(BR_FDB_LOCKED);

+	if (blackhole)
+		flags |= BIT(BR_FDB_BLACKHOLE);
+
 	fdb = fdb_create(br, p, addr, vid, flags);
 	if (!fdb) {
 		err = -ENOMEM;
@@ -1447,11 +1450,21 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			modified = true;
 		}

+		if (local != test_bit(BR_FDB_LOCAL, &fdb->flags)) {
+			change_bit(BR_FDB_LOCAL, &fdb->flags);
+			modified = true;
+		}
+
 		if (locked != test_bit(BR_FDB_LOCKED, &fdb->flags)) {
 			change_bit(BR_FDB_LOCKED, &fdb->flags);
 			modified = true;
 		}

+		if (blackhole != test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) {
+			change_bit(BR_FDB_BLACKHOLE, &fdb->flags);
+			modified = true;
+		}
+
 		if (swdev_notify)
 			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);

diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 3e9f4d1fbd60..4202c80e465e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -812,7 +812,8 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
-			      bool locked, bool swdev_notify);
+			      bool locked, bool local, bool blackhole,
+			      bool swdev_notify);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
 			      bool swdev_notify);
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index ccf1b4cffdd0..ce7b80c782ec 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -137,6 +137,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br,
 	item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
 	item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
 	item->locked = test_bit(BR_FDB_LOCKED, &fdb->flags);
+	item->blackhole = test_bit(BR_FDB_BLACKHOLE, &fdb->flags);
 	item->info.dev = (!p || item->is_local) ? br->dev : p->dev;
 	item->info.ctx = ctx;
 }
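The update path in br_fdb_external_learn_add() repeats one pattern three times: compare the requested boolean with the stored bit, flip the bit on mismatch, and remember that the entry was modified. A user-space sketch of that pattern, using plain bit operations instead of the kernel's test_bit()/change_bit(), could look as follows. The enum values and both helpers are hypothetical and only mirror the logic of the hunk.

```c
#include <stdbool.h>

/* Illustrative bit numbers, not the kernel's BR_FDB_* values. */
enum { SK_FDB_LOCAL, SK_FDB_LOCKED, SK_FDB_BLACKHOLE };

/* Flip bit `nr` in *flags if it differs from `want`; report a change. */
static bool fdb_sync_flag(unsigned long *flags, unsigned int nr, bool want)
{
	bool have = (*flags >> nr) & 1UL;

	if (have == want)
		return false;

	*flags ^= 1UL << nr;
	return true;
}

/* Returns true if any of the three flags had to be changed. */
static bool fdb_sync_flags(unsigned long *flags, bool local, bool locked,
			   bool blackhole)
{
	bool modified = false;

	modified |= fdb_sync_flag(flags, SK_FDB_LOCAL, local);
	modified |= fdb_sync_flag(flags, SK_FDB_LOCKED, locked);
	modified |= fdb_sync_flag(flags, SK_FDB_BLACKHOLE, blackhole);
	return modified;
}
```

One consequence worth noting is that a notification which leaves `locked` false clears an existing locked bit, so a driver replacing a locked entry with a blackhole entry only needs a single call.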
Propagate blackhole FDB entries through the DSA layer towards the drivers, remembering that these entries are local entries.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
---
 include/net/dsa.h  |  1 +
 net/dsa/dsa_priv.h |  4 ++--
 net/dsa/port.c     | 22 ++++++++++++----------
 net/dsa/slave.c    |  6 ++++--
 4 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index e4b641b20713..d5b2aef52d93 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -822,6 +822,7 @@ static inline bool dsa_port_tree_same(const struct dsa_port *a,
 }

 #define DSA_FDB_FLAG_LOCKED		(1 << 0)
+#define DSA_FDB_FLAG_BLACKHOLE		(1 << 1)

 typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, u16 vid,
 			      bool is_static, void *data);
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index c943e8934063..611833f162d1 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -251,9 +251,9 @@ int dsa_port_standalone_host_fdb_add(struct dsa_port *dp,
 int dsa_port_standalone_host_fdb_del(struct dsa_port *dp,
 				     const unsigned char *addr, u16 vid);
 int dsa_port_bridge_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-				 u16 vid);
+				 u16 vid, u16 fdb_flags);
 int dsa_port_bridge_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-				 u16 vid);
+				 u16 vid, u16 fdb_flags);
 int dsa_port_lag_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 			 u16 vid);
 int dsa_port_lag_fdb_del(struct dsa_port *dp, const unsigned char *addr,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index ff4f66f14d39..7e77c5d6090a 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -1001,12 +1001,13 @@ int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,

 static int dsa_port_host_fdb_add(struct dsa_port *dp,
 				 const unsigned char *addr, u16 vid,
-				 struct dsa_db db)
+				 u16 fdb_flags, struct dsa_db db)
 {
 	struct dsa_notifier_fdb_info info = {
 		.dp = dp,
 		.addr = addr,
 		.vid = vid,
+		.fdb_flags = fdb_flags,
 		.db = db,
 	};

@@ -1024,11 +1025,11 @@ int dsa_port_standalone_host_fdb_add(struct dsa_port *dp,
 		.dp = dp,
 	};

-	return dsa_port_host_fdb_add(dp, addr, vid, db);
+	return dsa_port_host_fdb_add(dp, addr, vid, 0, db);
 }

-int dsa_port_bridge_host_fdb_add(struct dsa_port *dp,
-				 const unsigned char *addr, u16 vid)
+int dsa_port_bridge_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
+				 u16 vid, u16 fdb_flags)
 {
 	struct net_device *master = dsa_port_to_master(dp);
 	struct dsa_db db = {
@@ -1047,17 +1048,18 @@ int dsa_port_bridge_host_fdb_add(struct dsa_port *dp,
 			return err;
 	}

-	return dsa_port_host_fdb_add(dp, addr, vid, db);
+	return dsa_port_host_fdb_add(dp, addr, vid, fdb_flags, db);
 }

 static int dsa_port_host_fdb_del(struct dsa_port *dp,
 				 const unsigned char *addr, u16 vid,
-				 struct dsa_db db)
+				 u16 fdb_flags, struct dsa_db db)
 {
 	struct dsa_notifier_fdb_info info = {
 		.dp = dp,
 		.addr = addr,
 		.vid = vid,
+		.fdb_flags = fdb_flags,
 		.db = db,
 	};

@@ -1075,11 +1077,11 @@ int dsa_port_standalone_host_fdb_del(struct dsa_port *dp,
 		.dp = dp,
 	};

-	return dsa_port_host_fdb_del(dp, addr, vid, db);
+	return dsa_port_host_fdb_del(dp, addr, vid, 0, db);
 }

-int dsa_port_bridge_host_fdb_del(struct dsa_port *dp,
-				 const unsigned char *addr, u16 vid)
+int dsa_port_bridge_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
+				 u16 vid, u16 fdb_flags)
 {
 	struct net_device *master = dsa_port_to_master(dp);
 	struct dsa_db db = {
@@ -1094,7 +1096,7 @@ int dsa_port_bridge_host_fdb_del(struct dsa_port *dp,
 			return err;
 	}

-	return dsa_port_host_fdb_del(dp, addr, vid, db);
+	return dsa_port_host_fdb_del(dp, addr, vid, fdb_flags, db);
 }

 int dsa_port_lag_fdb_add(struct dsa_port *dp, const unsigned char *addr,
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 65f0c578ef44..4e22014ec469 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -3258,7 +3258,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 		if (switchdev_work->host_addr)
-			err = dsa_port_bridge_host_fdb_add(dp, addr, vid);
+			err = dsa_port_bridge_host_fdb_add(dp, addr, vid, fdb_flags);
 		else if (dp->lag)
 			err = dsa_port_lag_fdb_add(dp, addr, vid);
 		else
@@ -3274,7 +3274,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)

 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
 		if (switchdev_work->host_addr)
-			err = dsa_port_bridge_host_fdb_del(dp, addr, vid);
+			err = dsa_port_bridge_host_fdb_del(dp, addr, vid, fdb_flags);
 		else if (dp->lag)
 			err = dsa_port_lag_fdb_del(dp, addr, vid);
 		else
@@ -3365,6 +3365,8 @@ static int dsa_slave_fdb_event(struct net_device *dev,

 	if (fdb_info->locked)
 		fdb_flags |= DSA_FDB_FLAG_LOCKED;
+	if (fdb_info->blackhole)
+		fdb_flags |= DSA_FDB_FLAG_BLACKHOLE;

 	INIT_WORK(&switchdev_work->work, dsa_slave_switchdev_event_work);
 	switchdev_work->event = event;
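As a stand-alone illustration of the flow in this patch and the per-driver changes that follow, the sketch below builds the u16 flag word from the two notifier bits and shows a driver callback that skips any entry with a flag set. Only the two DSA_FDB_FLAG_* values are taken from the patch; `struct fdb_info_sketch`, `fdb_flags_from_info()` and `sketch_port_fdb_add()` are hypothetical stand-ins, and the hardware write is reduced to a counter.

```c
#include <stdbool.h>
#include <stdint.h>

#define DSA_FDB_FLAG_LOCKED	(1 << 0)
#define DSA_FDB_FLAG_BLACKHOLE	(1 << 1)

/* Stand-in for the relevant bits of switchdev_notifier_fdb_info. */
struct fdb_info_sketch {
	bool locked;
	bool blackhole;
};

/* Same mapping as the hunk in dsa_slave_fdb_event(). */
static uint16_t fdb_flags_from_info(const struct fdb_info_sketch *info)
{
	uint16_t fdb_flags = 0;

	if (info->locked)
		fdb_flags |= DSA_FDB_FLAG_LOCKED;
	if (info->blackhole)
		fdb_flags |= DSA_FDB_FLAG_BLACKHOLE;
	return fdb_flags;
}

/*
 * Hypothetical driver callback for hardware without support for these
 * entry types: report success but do not touch the hardware table.
 */
static int sketch_port_fdb_add(uint16_t fdb_flags, int *hw_writes)
{
	/* Ignore entries with flags set */
	if (fdb_flags)
		return 0;

	(*hw_writes)++;		/* placeholder for the real table write */
	return 0;
}
```

Returning 0 rather than an error keeps the bridge from treating the skipped entry as a failed offload, at the cost of the bridge not knowing the entry was never installed.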
Make all switchcore drivers ignore incoming FDB entries that have any flag set.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
---
 drivers/net/dsa/b53/b53_common.c       | 12 ++++++++++--
 drivers/net/dsa/b53/b53_priv.h         |  4 ++--
 drivers/net/dsa/hirschmann/hellcreek.c | 12 ++++++++++--
 drivers/net/dsa/lan9303-core.c         | 12 ++++++++++--
 drivers/net/dsa/lantiq_gswip.c         | 12 ++++++++++--
 drivers/net/dsa/microchip/ksz9477.c    |  8 ++++----
 drivers/net/dsa/microchip/ksz9477.h    |  8 ++++----
 drivers/net/dsa/microchip/ksz_common.c | 14 +++++++++++---
 drivers/net/dsa/mt7530.c               | 12 ++++++++++--
 drivers/net/dsa/mv88e6xxx/chip.c       | 12 ++++++++++--
 drivers/net/dsa/ocelot/felix.c         | 12 ++++++++++--
 drivers/net/dsa/qca/qca8k-common.c     | 12 ++++++++++--
 drivers/net/dsa/qca/qca8k.h            |  4 ++--
 drivers/net/dsa/rzn1_a5psw.c           | 12 ++++++++++--
 drivers/net/dsa/sja1105/sja1105_main.c | 18 +++++++++++++-----
 include/net/dsa.h                      |  4 ++--
 net/dsa/switch.c                       |  8 ++++----
 17 files changed, 132 insertions(+), 44 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 59cdfc51ce06..cec60af6dfdc 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1684,11 +1684,15 @@ static int b53_arl_op(struct b53_device *dev, int op, int port,

 int b53_fdb_add(struct dsa_switch *ds, int port,
 		const unsigned char *addr, u16 vid,
-		struct dsa_db db)
+		u16 fdb_flags, struct dsa_db db)
 {
 	struct b53_device *priv = ds->priv;
 	int ret;

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	/* 5325 and 5365 require some more massaging, but could
 	 * be supported eventually
 	 */
@@ -1705,11 +1709,15 @@ EXPORT_SYMBOL(b53_fdb_add);

 int b53_fdb_del(struct dsa_switch *ds, int port,
 		const unsigned char *addr, u16 vid,
-		struct dsa_db db)
+		u16 fdb_flags, struct dsa_db db)
 {
 	struct b53_device *priv = ds->priv;
 	int ret;

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	mutex_lock(&priv->arl_mutex);
 	ret = b53_arl_op(priv, 0, port, addr, vid, false);
 	mutex_unlock(&priv->arl_mutex);
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 795cbffd5c2b..7673c4e712bb 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -362,10 +362,10 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
 		 const struct switchdev_obj_port_vlan *vlan);
 int b53_fdb_add(struct dsa_switch *ds, int port,
 		const unsigned char *addr, u16 vid,
-		struct dsa_db db);
+		u16 fdb_flags, struct dsa_db db);
 int b53_fdb_del(struct dsa_switch *ds, int port,
 		const unsigned char *addr, u16 vid,
-		struct dsa_db db);
+		u16 fdb_flags, struct dsa_db db);
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 		 dsa_fdb_dump_cb_t *cb, void *data);
 int b53_mdb_add(struct dsa_switch *ds, int port,
diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
index 951f7935c872..374b90e79b9a 100644
--- a/drivers/net/dsa/hirschmann/hellcreek.c
+++ b/drivers/net/dsa/hirschmann/hellcreek.c
@@ -839,12 +839,16 @@ static int hellcreek_fdb_get(struct hellcreek *hellcreek,

 static int hellcreek_fdb_add(struct dsa_switch *ds, int port,
 			     const unsigned char *addr, u16 vid,
-			     struct dsa_db db)
+			     u16 fdb_flags, struct dsa_db db)
 {
 	struct hellcreek_fdb_entry entry = { 0 };
 	struct hellcreek *hellcreek = ds->priv;
 	int ret;

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	dev_dbg(hellcreek->dev, "Add FDB entry for MAC=%pM\n", addr);

 	mutex_lock(&hellcreek->reg_lock);
@@ -885,12 +889,16 @@ static int hellcreek_fdb_add(struct dsa_switch *ds, int port,

 static int hellcreek_fdb_del(struct dsa_switch *ds, int port,
 			     const unsigned char *addr, u16 vid,
-			     struct dsa_db db)
+			     u16 fdb_flags, struct dsa_db db)
 {
 	struct hellcreek_fdb_entry entry = { 0 };
 	struct hellcreek *hellcreek = ds->priv;
 	int ret;

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	dev_dbg(hellcreek->dev, "Delete FDB entry for MAC=%pM\n", addr);

 	mutex_lock(&hellcreek->reg_lock);
diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 438e46af03e9..36187705833f 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -1192,10 +1192,14 @@ static void lan9303_port_fast_age(struct dsa_switch *ds, int port)

 static int lan9303_port_fdb_add(struct dsa_switch *ds, int port,
 				const unsigned char *addr, u16 vid,
-				struct dsa_db db)
+				u16 fdb_flags, struct dsa_db db)
 {
 	struct lan9303 *chip = ds->priv;

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	dev_dbg(chip->dev, "%s(%d, %pM, %d)\n", __func__, port, addr, vid);
 	if (vid)
 		return -EOPNOTSUPP;
@@ -1205,10 +1209,14 @@ static int lan9303_port_fdb_add(struct dsa_switch *ds, int port,

 static int lan9303_port_fdb_del(struct dsa_switch *ds, int port,
 				const unsigned char *addr, u16 vid,
-				struct dsa_db db)
+				u16 fdb_flags, struct dsa_db db)
 {
 	struct lan9303 *chip = ds->priv;

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	dev_dbg(chip->dev, "%s(%d, %pM, %d)\n", __func__, port, addr, vid);
 	if (vid)
 		return -EOPNOTSUPP;
diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c
index 05ecaa007ab1..a945e8e62232 100644
--- a/drivers/net/dsa/lantiq_gswip.c
+++ b/drivers/net/dsa/lantiq_gswip.c
@@ -1399,15 +1399,23 @@ static int gswip_port_fdb(struct dsa_switch *ds, int port,

 static int gswip_port_fdb_add(struct dsa_switch *ds, int port,
 			      const unsigned char *addr, u16 vid,
-			      struct dsa_db db)
+			      u16 fdb_flags, struct dsa_db db)
 {
+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	return gswip_port_fdb(ds, port, addr, vid, true);
 }

 static int gswip_port_fdb_del(struct dsa_switch *ds, int port,
 			      const unsigned char *addr, u16 vid,
-			      struct dsa_db db)
+			      u16 fdb_flags, struct dsa_db db)
 {
+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	return gswip_port_fdb(ds, port, addr, vid, false);
 }

diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index a6a0321a8931..e65daabf9865 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -457,8 +457,8 @@ int ksz9477_port_vlan_del(struct ksz_device *dev, int port,
 	return 0;
 }

-int ksz9477_fdb_add(struct ksz_device *dev, int port,
-		    const unsigned char *addr, u16 vid, struct dsa_db db)
+int ksz9477_fdb_add(struct ksz_device *dev, int port, const unsigned char *addr,
+		    u16 vid, struct dsa_db db)
 {
 	u32 alu_table[4];
 	u32 data;
@@ -513,8 +513,8 @@ int ksz9477_fdb_add(struct ksz_device *dev, int port,
 	return ret;
 }

-int ksz9477_fdb_del(struct ksz_device *dev, int port,
-		    const unsigned char *addr, u16 vid, struct dsa_db db)
+int ksz9477_fdb_del(struct ksz_device *dev, int port, const unsigned char *addr,
+		    u16 vid, struct dsa_db db)
 {
 	u32 alu_table[4];
 	u32 data;
diff --git a/drivers/net/dsa/microchip/ksz9477.h b/drivers/net/dsa/microchip/ksz9477.h
index 00862c4cfb7f..a9c64e166cca 100644
--- a/drivers/net/dsa/microchip/ksz9477.h
+++ b/drivers/net/dsa/microchip/ksz9477.h
@@ -41,10 +41,10 @@ void ksz9477_get_caps(struct ksz_device *dev, int port,
 		      struct phylink_config *config);
 int ksz9477_fdb_dump(struct ksz_device *dev, int port,
 		     dsa_fdb_dump_cb_t *cb, void *data);
-int ksz9477_fdb_add(struct ksz_device *dev, int port,
-		    const unsigned char *addr, u16 vid, struct dsa_db db);
-int ksz9477_fdb_del(struct ksz_device *dev, int port,
-		    const unsigned char *addr, u16 vid, struct dsa_db db);
+int ksz9477_fdb_add(struct ksz_device *dev, int port, const unsigned char *addr,
+		    u16 vid, struct dsa_db db);
+int ksz9477_fdb_del(struct ksz_device *dev, int port, const unsigned char *addr,
+		    u16 vid, struct dsa_db db);
 int ksz9477_mdb_add(struct ksz_device *dev, int port,
 		    const struct switchdev_obj_port_mdb *mdb, struct dsa_db db);
 int ksz9477_mdb_del(struct ksz_device *dev, int port,
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index d612181b3226..cfcfc725fed9 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -2227,10 +2227,14 @@ static int ksz_set_ageing_time(struct dsa_switch *ds, unsigned int msecs)

 static int ksz_port_fdb_add(struct dsa_switch *ds, int port,
 			    const unsigned char *addr, u16 vid,
-			    struct dsa_db db)
+			    u16 fdb_flags, struct dsa_db db)
 {
 	struct ksz_device *dev = ds->priv;

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	if (!dev->dev_ops->fdb_add)
 		return -EOPNOTSUPP;

@@ -2238,11 +2242,15 @@ static int ksz_port_fdb_add(struct dsa_switch *ds, int port,
 }

 static int ksz_port_fdb_del(struct dsa_switch *ds, int port,
-			    const unsigned char *addr,
-			    u16 vid, struct dsa_db db)
+			    const unsigned char *addr, u16 vid,
+			    u16 fdb_flags, struct dsa_db db)
 {
 	struct ksz_device *dev = ds->priv;

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	if (!dev->dev_ops->fdb_del)
 		return -EOPNOTSUPP;

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index e74c6b406172..fd75565b1782 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -1369,12 +1369,16 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port,
 static int
 mt7530_port_fdb_add(struct dsa_switch *ds, int port,
 		    const unsigned char *addr, u16 vid,
-		    struct dsa_db db)
+		    u16 fdb_flags, struct dsa_db db)
 {
 	struct mt7530_priv *priv = ds->priv;
 	int ret;
 	u8 port_mask = BIT(port);

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	mutex_lock(&priv->reg_mutex);
 	mt7530_fdb_write(priv, vid, port_mask, addr, -1, STATIC_ENT);
 	ret = mt7530_fdb_cmd(priv, MT7530_FDB_WRITE, NULL);
@@ -1386,12 +1390,16 @@ mt7530_port_fdb_add(struct dsa_switch *ds, int port,
 static int
 mt7530_port_fdb_del(struct dsa_switch *ds, int port,
 		    const unsigned char *addr, u16 vid,
-		    struct dsa_db db)
+		    u16 fdb_flags, struct dsa_db db)
 {
 	struct mt7530_priv *priv = ds->priv;
 	int ret;
 	u8 port_mask = BIT(port);

+	/* Ignore entries with set flags */
+	if (fdb_flags)
+		return 0;
+
 	mutex_lock(&priv->reg_mutex);
 	mt7530_fdb_write(priv, vid, port_mask, addr, -1, STATIC_EMP);
 	ret = mt7530_fdb_cmd(priv, MT7530_FDB_WRITE, NULL);
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 2479be3a1e35..352121cce77e 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2722,11 +2722,15 @@ static int mv88e6xxx_vlan_msti_set(struct dsa_switch *ds,

 static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 				  const unsigned char *addr, u16 vid,
-				  struct dsa_db db)
+				  u16 fdb_flags, struct dsa_db db)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;

+	/* Ignore entries with flags set */
+	if (fdb_flags)
+		return 0;
+
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
 					   MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
@@ -2737,11 +2741,15 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,

 static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
 				  const unsigned char *addr, u16 vid,
-				  struct dsa_db db)
+				  u16 fdb_flags, struct dsa_db db)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;

+	/* Ignore entries with flags set */
+	if (fdb_flags)
+		return 0;
+
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
 	mv88e6xxx_reg_unlock(chip);
diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
index dd3a18cc89dd..b9acec77c820 100644
--- a/drivers/net/dsa/ocelot/felix.c
+++ b/drivers/net/dsa/ocelot/felix.c
@@ -782,12 +782,16 @@ static int felix_fdb_dump(struct dsa_switch *ds, int port,
static int felix_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db) + u16 fdb_flags, struct dsa_db db) { struct net_device *bridge_dev = felix_classify_db(db); struct dsa_port *dp = dsa_to_port(ds, port); struct ocelot *ocelot = ds->priv;
+ /* Ignore entries with set flags */ + if (fdb_flags) + return 0; + if (IS_ERR(bridge_dev)) return PTR_ERR(bridge_dev);
@@ -803,12 +807,16 @@ static int felix_fdb_add(struct dsa_switch *ds, int port,
static int felix_fdb_del(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db) + u16 fdb_flags, struct dsa_db db) { struct net_device *bridge_dev = felix_classify_db(db); struct dsa_port *dp = dsa_to_port(ds, port); struct ocelot *ocelot = ds->priv;
+ /* Ignore entries with set flags */ + if (fdb_flags) + return 0; + if (IS_ERR(bridge_dev)) return PTR_ERR(bridge_dev);
diff --git a/drivers/net/dsa/qca/qca8k-common.c b/drivers/net/dsa/qca/qca8k-common.c index fb45b598847b..e26a9a483955 100644 --- a/drivers/net/dsa/qca/qca8k-common.c +++ b/drivers/net/dsa/qca/qca8k-common.c @@ -795,21 +795,29 @@ int qca8k_port_fdb_insert(struct qca8k_priv *priv, const u8 *addr,
int qca8k_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db) + u16 fdb_flags, struct dsa_db db) { struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv; u16 port_mask = BIT(port);
+ /* Ignore entries with set flags */ + if (fdb_flags) + return 0; + return qca8k_port_fdb_insert(priv, addr, port_mask, vid); }
int qca8k_port_fdb_del(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db) + u16 fdb_flags, struct dsa_db db) { struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv; u16 port_mask = BIT(port);
+ /* Ignore entries with set flags */ + if (fdb_flags) + return 0; + if (!vid) vid = QCA8K_PORT_VID_DEF;
diff --git a/drivers/net/dsa/qca/qca8k.h b/drivers/net/dsa/qca/qca8k.h index 0b7a5cb12321..5d47e840ae1e 100644 --- a/drivers/net/dsa/qca/qca8k.h +++ b/drivers/net/dsa/qca/qca8k.h @@ -479,10 +479,10 @@ int qca8k_port_fdb_insert(struct qca8k_priv *priv, const u8 *addr, u16 port_mask, u16 vid); int qca8k_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db); + u16 fdb_flags, struct dsa_db db); int qca8k_port_fdb_del(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db); + u16 fdb_flags, struct dsa_db db); int qca8k_port_fdb_dump(struct dsa_switch *ds, int port, dsa_fdb_dump_cb_t *cb, void *data);
diff --git a/drivers/net/dsa/rzn1_a5psw.c b/drivers/net/dsa/rzn1_a5psw.c index ed413d555bec..d70c21f22d4c 100644 --- a/drivers/net/dsa/rzn1_a5psw.c +++ b/drivers/net/dsa/rzn1_a5psw.c @@ -396,7 +396,7 @@ static int a5psw_lk_execute_lookup(struct a5psw *a5psw, union lk_data *lk_data,
static int a5psw_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db) + u16 fdb_flags, struct dsa_db db) { struct a5psw *a5psw = ds->priv; union lk_data lk_data = {0}; @@ -405,6 +405,10 @@ static int a5psw_port_fdb_add(struct dsa_switch *ds, int port, u16 entry; u32 reg;
+ /* Ignore entries with set flags */ + if (fdb_flags) + return 0; + ether_addr_copy(lk_data.entry.mac, addr); lk_data.entry.port_mask = BIT(port);
@@ -447,7 +451,7 @@ static int a5psw_port_fdb_add(struct dsa_switch *ds, int port,
static int a5psw_port_fdb_del(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db) + u16 fdb_flags, struct dsa_db db) { struct a5psw *a5psw = ds->priv; union lk_data lk_data = {0}; @@ -456,6 +460,10 @@ static int a5psw_port_fdb_del(struct dsa_switch *ds, int port, u32 reg; int ret;
+ /* Ignore entries with set flags */ + if (fdb_flags) + return 0; + ether_addr_copy(lk_data.entry.mac, addr);
mutex_lock(&a5psw->lk_lock); diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index 412666111b0c..526177813d53 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -1802,10 +1802,14 @@ int sja1105pqrs_fdb_del(struct dsa_switch *ds, int port,
static int sja1105_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db) + u16 fdb_flags, struct dsa_db db) { struct sja1105_private *priv = ds->priv;
+ /* Ignore entries with set flags */ + if (fdb_flags) + return 0; + if (!vid) { switch (db.type) { case DSA_DB_PORT: @@ -1824,10 +1828,14 @@ static int sja1105_fdb_add(struct dsa_switch *ds, int port,
static int sja1105_fdb_del(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db) + u16 fdb_flags, struct dsa_db db) { struct sja1105_private *priv = ds->priv;
+ /* Ignore entries with set flags */ + if (fdb_flags) + return 0; + if (!vid) { switch (db.type) { case DSA_DB_PORT: @@ -1930,7 +1938,7 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
u64_to_ether_addr(l2_lookup.macaddr, macaddr);
- rc = sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid, db); + rc = sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid, 0, db); if (rc) { dev_err(ds->dev, "Failed to delete FDB entry %pM vid %lld: %pe\n", @@ -1944,14 +1952,14 @@ static int sja1105_mdb_add(struct dsa_switch *ds, int port, const struct switchdev_obj_port_mdb *mdb, struct dsa_db db) { - return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, db); + return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, 0, db); }
static int sja1105_mdb_del(struct dsa_switch *ds, int port, const struct switchdev_obj_port_mdb *mdb, struct dsa_db db) { - return sja1105_fdb_del(ds, port, mdb->addr, mdb->vid, db); + return sja1105_fdb_del(ds, port, mdb->addr, mdb->vid, 0, db); }
/* Common function for unicast and broadcast flood configuration. diff --git a/include/net/dsa.h b/include/net/dsa.h index d5b2aef52d93..50ed82f16cda 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -1043,10 +1043,10 @@ struct dsa_switch_ops { */ int (*port_fdb_add)(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db); + u16 fdb_flags, struct dsa_db db); int (*port_fdb_del)(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, - struct dsa_db db); + u16 fdb_flags, struct dsa_db db); int (*port_fdb_dump)(struct dsa_switch *ds, int port, dsa_fdb_dump_cb_t *cb, void *data); int (*lag_fdb_add)(struct dsa_switch *ds, struct dsa_lag lag, diff --git a/net/dsa/switch.c b/net/dsa/switch.c index dd355556892e..6dacab9c1428 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -243,7 +243,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
/* No need to bother with refcounting for user ports */ if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) - return ds->ops->port_fdb_add(ds, port, addr, vid, db); + return ds->ops->port_fdb_add(ds, port, addr, vid, fdb_flags, db);
mutex_lock(&dp->addr_lists_lock);
@@ -259,7 +259,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr, goto out; }
- err = ds->ops->port_fdb_add(ds, port, addr, vid, db); + err = ds->ops->port_fdb_add(ds, port, addr, vid, fdb_flags, db); if (err) { kfree(a); goto out; @@ -287,7 +287,7 @@ static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
/* No need to bother with refcounting for user ports */ if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) - return ds->ops->port_fdb_del(ds, port, addr, vid, db); + return ds->ops->port_fdb_del(ds, port, addr, vid, fdb_flags, db);
mutex_lock(&dp->addr_lists_lock);
@@ -300,7 +300,7 @@ static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr, if (!refcount_dec_and_test(&a->refcount)) goto out;
- err = ds->ops->port_fdb_del(ds, port, addr, vid, db); + err = ds->ops->port_fdb_del(ds, port, addr, vid, fdb_flags, db); if (err) { refcount_set(&a->refcount, 1); goto out;
On Tue, Oct 18, 2022 at 06:56:15PM +0200, Hans J. Schultz wrote:
Ignore FDB entries that arrive with flags set in all switchcore drivers.
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
Some very thorough documentation in Documentation/networking/dsa/dsa.rst is necessary. I'm interested in seeing what those flags are, and what drivers are supposed to do when they see them, other than ignoring them.
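For reference, the guard that the patch repeats in every driver can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: `struct toy_switch` and `toy_port_fdb_add()` are hypothetical stand-ins, not kernel types or DSA API, and the sketch does not say what any individual flag means.

```c
#include <stdint.h>

/* Hypothetical stand-in for a driver's private state; not a kernel type. */
struct toy_switch {
	int hw_writes;	/* counts how often the "hardware" table was touched */
};

/*
 * Mirrors the shape of the extended callback in the patch: fdb_flags is
 * passed next to vid. The struct dsa_db argument is left out for brevity.
 */
static int toy_port_fdb_add(struct toy_switch *sw, int port,
			    const unsigned char *addr, uint16_t vid,
			    uint16_t fdb_flags)
{
	(void)port;
	(void)addr;
	(void)vid;

	/* Ignore entries with flags set: report success, program nothing. */
	if (fdb_flags)
		return 0;

	/* A real driver would write its FDB/ATU here. */
	sw->hw_writes++;
	return 0;
}
```

Note that returning 0 for a flagged entry makes the caller see success even though nothing was programmed, which is part of why the expected driver behaviour needs to be documented.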
The FID is needed to determine which VID was involved in a violation, hence the need to be able to read it.
For convenience, mv88e6xxx_g1_atu_op() has so far been used to read ATU violations, but that function also writes the FID it is passed (zero in the interrupt handler), so the FID of the violation can no longer be read back. Functions have therefore been added to read ATU violations without zeroing the FID, and to read the FID itself.
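The FID decoding described above can be illustrated with a pure function that works on register values that have already been read. This is a minimal sketch following the bit layout given in the comments of the patch below; `toy_atu_fid_decode()` and its parameters are hypothetical names, and no register access is modelled.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns the FID (DBNum) from raw register values.
 *   fid_reg: dedicated ATU FID register (chips with more than 256 databases)
 *   op_reg:  ATU Operation register
 *   ctl_reg: ATU Control register
 */
static uint16_t toy_atu_fid_decode(unsigned int num_databases,
				   uint16_t fid_reg, uint16_t op_reg,
				   uint16_t ctl_reg)
{
	uint16_t upper = 0;

	/* Large chips keep the whole FID in a dedicated 12-bit field. */
	if (num_databases > 256)
		return fid_reg & 0x0fff;

	if (num_databases > 64)
		/* DBNum[7:4] are located in ATU Control 15:12 */
		upper = (ctl_reg >> 8) & 0x00f0;
	else if (num_databases > 16)
		/* DBNum[5:4] are located in ATU Operation 9:8 */
		upper = (op_reg >> 4) & 0x0030;

	/* DBNum[3:0] are located in ATU Operation 3:0 */
	return (op_reg & 0x000f) | upper;
}
```

Keeping the decoding separate from the register reads makes the bit manipulation easy to check in isolation; the patch itself interleaves the two.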
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
---
 drivers/net/dsa/mv88e6xxx/global1_atu.c | 60 ++++++++++++++++++++++---
 1 file changed, 55 insertions(+), 5 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c index 40bd67a5c8e9..d9dfa1159cde 100644 --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c @@ -114,6 +114,19 @@ static int mv88e6xxx_g1_atu_op_wait(struct mv88e6xxx_chip *chip) return mv88e6xxx_g1_wait_bit(chip, MV88E6XXX_G1_ATU_OP, bit, 0); }
+static int mv88e6xxx_g1_read_atu_violation(struct mv88e6xxx_chip *chip) +{ + int err; + + err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP, + MV88E6XXX_G1_ATU_OP_BUSY | + MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION); + if (err) + return err; + + return mv88e6xxx_g1_atu_op_wait(chip); +} + static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op) { u16 val; @@ -159,6 +172,41 @@ int mv88e6xxx_g1_atu_get_next(struct mv88e6xxx_chip *chip, u16 fid) return mv88e6xxx_g1_atu_op(chip, fid, MV88E6XXX_G1_ATU_OP_GET_NEXT_DB); }
+static int mv88e6xxx_g1_atu_fid_read(struct mv88e6xxx_chip *chip, u16 *fid) +{ + u16 val = 0, upper = 0, op = 0; + int err = -EOPNOTSUPP; + + if (mv88e6xxx_num_databases(chip) > 256) { + err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &val); + val &= 0xfff; + if (err) + return err; + } else { + err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_OP, &op); + if (err) + return err; + if (mv88e6xxx_num_databases(chip) > 64) { + /* ATU DBNum[7:4] are located in ATU Control 15:12 */ + err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_CTL, &upper); + if (err) + return err; + + upper = (upper >> 8) & 0x00f0; + } else if (mv88e6xxx_num_databases(chip) > 16) { + /* ATU DBNum[5:4] are located in ATU Operation 9:8 */ + + upper = (op >> 4) & 0x30; + } + /* ATU DBNum[3:0] are located in ATU Operation 3:0 */ + + val = (op & 0xf) | upper; + } + *fid = val; + + return err; +} + /* Offset 0x0C: ATU Data Register */
static int mv88e6xxx_g1_atu_data_read(struct mv88e6xxx_chip *chip, @@ -353,14 +401,12 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) { struct mv88e6xxx_chip *chip = dev_id; struct mv88e6xxx_atu_entry entry; - int spid; - int err; - u16 val; + int err, spid; + u16 val, fid;
mv88e6xxx_reg_lock(chip);
- err = mv88e6xxx_g1_atu_op(chip, 0, - MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION); + err = mv88e6xxx_g1_read_atu_violation(chip); if (err) goto out;
@@ -368,6 +414,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out;
+ err = mv88e6xxx_g1_atu_fid_read(chip, &fid); + if (err) + goto out; + err = mv88e6xxx_g1_atu_data_read(chip, &entry); if (err) goto out;
This implementation for the Marvell mv88e6xxx chip series is based on handling ATU miss violations that occur when packets ingress on a locked port. The MAC address triggering the ATU miss violation is added to the ATU with a zero DPV and is then communicated through switchdev to the bridge module, which adds an FDB entry with the locked flag set. The entry is kept according to the bridge's ageing time, thus simulating a dynamic entry.
Additionally, the driver sets the sticky and masked flags, as it does not support roaming or forwarding from any port to a locked entry.
As this is essentially a form of CPU-based learning, the number of locked entries is limited by a hardcoded value for now, to prevent DoS attacks.
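The bookkeeping described above (a capped set of locked entries, each removed once its ageing deadline has passed) can be sketched in plain C as follows. This is a simplified illustration, not the driver code: all `toy_` names and the cap of 4 are hypothetical, a fixed array replaces the kernel list, and locking, hardware access and switchdev notification are omitted.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_LOCKED_ENTRIES_MAX 4	/* hypothetical cap for the sketch */

struct toy_locked_entry {
	bool used;
	uint8_t mac[6];
	uint16_t vid;
	unsigned long expires;	/* absolute deadline in the caller's time unit */
};

struct toy_port {
	struct toy_locked_entry tbl[TOY_LOCKED_ENTRIES_MAX];
	int cnt;
};

/* On a miss violation: record the MAC unless the cap has been reached. */
static bool toy_locked_entry_add(struct toy_port *p, const uint8_t *mac,
				 uint16_t vid, unsigned long now,
				 unsigned long age_time)
{
	int i;

	if (p->cnt >= TOY_LOCKED_ENTRIES_MAX)
		return false;	/* cap reached: the event is dropped */

	for (i = 0; i < TOY_LOCKED_ENTRIES_MAX; i++) {
		if (p->tbl[i].used)
			continue;
		p->tbl[i].used = true;
		memcpy(p->tbl[i].mac, mac, sizeof(p->tbl[i].mac));
		p->tbl[i].vid = vid;
		p->tbl[i].expires = now + age_time;
		p->cnt++;
		return true;
	}
	return false;
}

/*
 * Periodic cleanup: purge entries whose deadline has passed and return
 * how many were removed. Counter wraparound is ignored in this sketch.
 */
static int toy_locked_entry_age(struct toy_port *p, unsigned long now)
{
	int i, purged = 0;

	for (i = 0; i < TOY_LOCKED_ENTRIES_MAX; i++) {
		if (p->tbl[i].used && now > p->tbl[i].expires) {
			p->tbl[i].used = false;
			p->cnt--;
			purged++;
		}
	}
	return purged;
}
```

The sketch does not check for duplicate addresses; the assumption is that once an address is present in the ATU it no longer triggers miss violations, so the same address is not reported twice.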
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
---
 drivers/net/dsa/mv88e6xxx/Makefile      |   1 +
 drivers/net/dsa/mv88e6xxx/chip.c        |  76 +++++--
 drivers/net/dsa/mv88e6xxx/chip.h        |  19 ++
 drivers/net/dsa/mv88e6xxx/global1.h     |   1 +
 drivers/net/dsa/mv88e6xxx/global1_atu.c |  12 +-
 drivers/net/dsa/mv88e6xxx/port.c        |  15 +-
 drivers/net/dsa/mv88e6xxx/port.h        |   6 +
 drivers/net/dsa/mv88e6xxx/switchdev.c   | 284 ++++++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/switchdev.h   |  37 +++
 9 files changed, 429 insertions(+), 22 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.c
 create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.h
diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile index c8eca2b6f959..be903a983780 100644 --- a/drivers/net/dsa/mv88e6xxx/Makefile +++ b/drivers/net/dsa/mv88e6xxx/Makefile @@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o mv88e6xxx-objs += serdes.o mv88e6xxx-objs += smi.o +mv88e6xxx-objs += switchdev.o diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 352121cce77e..71843fe87f77 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -42,6 +42,7 @@ #include "ptp.h" #include "serdes.h" #include "smi.h" +#include "switchdev.h"
static void assert_reg_lock(struct mv88e6xxx_chip *chip) { @@ -924,6 +925,13 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port, if (err) dev_err(chip->dev, "p%d: failed to force MAC link down\n", port); + else + if (mv88e6xxx_port_is_locked(chip, port)) { + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + if (err) + dev_err(chip->dev, + "p%d: failed to clear locked entries\n", port); + } }
static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port, @@ -1690,6 +1698,13 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port) struct mv88e6xxx_chip *chip = ds->priv; int err;
+ if (mv88e6xxx_port_is_locked(chip, port)) { + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + if (err) + dev_err(chip->ds->dev, "p%d: failed to clear locked entries: %d\n", + port, err); + } + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_fast_age_fid(chip, port, 0); mv88e6xxx_reg_unlock(chip); @@ -1726,11 +1741,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, return err; }
-static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, - int (*cb)(struct mv88e6xxx_chip *chip, - const struct mv88e6xxx_vtu_entry *entry, - void *priv), - void *priv) +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, + int (*cb)(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv), + void *priv) { struct mv88e6xxx_vtu_entry entry = { .vid = mv88e6xxx_max_vid(chip), @@ -2731,6 +2746,9 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, if (fdb_flags) return 0;
+ if (mv88e6xxx_port_is_locked(chip, port)) + mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid); + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC); @@ -2744,16 +2762,21 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port, u16 fdb_flags, struct dsa_db db) { struct mv88e6xxx_chip *chip = ds->priv; - int err; + bool locked_found = false; + int err = 0;
/* Ignore entries with flags set */ if (fdb_flags) return 0;
- mv88e6xxx_reg_lock(chip); - err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0); - mv88e6xxx_reg_unlock(chip); + if (mv88e6xxx_port_is_locked(chip, port)) + locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
+ if (!locked_found) { + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0); + mv88e6xxx_reg_unlock(chip); + } return err; }
@@ -3849,11 +3872,18 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port) { - return mv88e6xxx_setup_devlink_regions_port(ds, port); + int err; + + err = mv88e6xxx_setup_devlink_regions_port(ds, port); + if (!err) + return mv88e6xxx_init_violation_handler(ds, port); + + return err; }
static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port) { + mv88e6xxx_teardown_violation_handler(ds, port); mv88e6xxx_teardown_devlink_regions_port(ds, port); }
@@ -6528,7 +6558,7 @@ static int mv88e6xxx_port_pre_bridge_flags(struct dsa_switch *ds, int port, const struct mv88e6xxx_ops *ops;
if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | - BR_BCAST_FLOOD | BR_PORT_LOCKED)) + BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB)) return -EINVAL;
ops = chip->info->ops; @@ -6549,13 +6579,13 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, struct mv88e6xxx_chip *chip = ds->priv; int err = -EOPNOTSUPP;
- mv88e6xxx_reg_lock(chip); - if (flags.mask & BR_LEARNING) { bool learning = !!(flags.val & BR_LEARNING); u16 pav = learning ? (1 << port) : 0;
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_set_assoc_vector(chip, port, pav); + mv88e6xxx_reg_unlock(chip); if (err) goto out; } @@ -6563,8 +6593,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_FLOOD) { bool unicast = !!(flags.val & BR_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = chip->info->ops->port_set_ucast_flood(chip, port, unicast); + mv88e6xxx_reg_unlock(chip); if (err) goto out; } @@ -6572,8 +6604,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_MCAST_FLOOD) { bool multicast = !!(flags.val & BR_MCAST_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = chip->info->ops->port_set_mcast_flood(chip, port, multicast); + mv88e6xxx_reg_unlock(chip); if (err) goto out; } @@ -6581,20 +6615,34 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_BCAST_FLOOD) { bool broadcast = !!(flags.val & BR_BCAST_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast); + mv88e6xxx_reg_unlock(chip); if (err) goto out; }
+ if (flags.mask & BR_PORT_MAB) { + chip->ports[port].mab = !!(flags.val & BR_PORT_MAB); + + if (!chip->ports[port].mab) + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + else + err = 0; + } + if (flags.mask & BR_PORT_LOCKED) { bool locked = !!(flags.val & BR_PORT_LOCKED);
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_set_lock(chip, port, locked); + mv88e6xxx_reg_unlock(chip); if (err) goto out; + + chip->ports[port].locked = locked; } out: - mv88e6xxx_reg_unlock(chip);
return err; } diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index e693154cf803..180fbcf596fa 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -280,6 +280,16 @@ struct mv88e6xxx_port { unsigned int serdes_irq; char serdes_irq_name[64]; struct devlink_region *region; + + /* Locked port and MacAuth control flags */ + bool locked; + bool mab; + + /* List and maintenance of ATU locked entries */ + struct mutex ale_list_lock; + struct list_head ale_list; + struct delayed_work ale_work; + int ale_cnt; };
enum mv88e6xxx_region_id { @@ -399,6 +409,9 @@ struct mv88e6xxx_chip { int egress_dest_port; int ingress_dest_port;
+ /* Keep the register written age time for easy access */ + u8 age_time; + /* Per-port timestamping resources. */ struct mv88e6xxx_port_hwtstamp port_hwtstamp[DSA_MAX_PORTS];
@@ -802,6 +815,12 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip) mutex_unlock(&chip->reg_lock); }
+int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, + int (*cb)(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv), + void *priv); + int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap);
#endif /* _MV88E6XXX_CHIP_H */ diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h index 65958b2a0d3a..503fbf216670 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.h +++ b/drivers/net/dsa/mv88e6xxx/global1.h @@ -136,6 +136,7 @@ #define MV88E6XXX_G1_ATU_DATA_TRUNK 0x8000 #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK 0x00f0 #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK 0x3ff0 +#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_MASK 0x000f #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST 0x0001 diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c index d9dfa1159cde..67907cd00b87 100644 --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c @@ -12,6 +12,8 @@
#include "chip.h" #include "global1.h" +#include "port.h" +#include "switchdev.h"
/* Offset 0x01: ATU FID Register */
@@ -54,6 +56,7 @@ int mv88e6xxx_g1_atu_set_age_time(struct mv88e6xxx_chip *chip,
/* Round to nearest multiple of coeff */ age_time = (msecs + coeff / 2) / coeff; + chip->age_time = age_time;
err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_CTL, &val); if (err) @@ -426,6 +429,8 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out;
+ mv88e6xxx_reg_unlock(chip); + spid = entry.state;
if (val & MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION) { @@ -446,6 +451,12 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) "ATU miss violation for %pM portvec %x spid %d\n", entry.mac, entry.portvec, spid); chip->ports[spid].atu_miss_violation++; + + if (fid && chip->ports[spid].mab) + err = mv88e6xxx_handle_violation(chip, spid, &entry, fid, + MV88E6XXX_G1_ATU_OP_MISS_VIOLATION); + if (err) + goto out; }
if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) { @@ -454,7 +465,6 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) entry.mac, entry.portvec, spid); chip->ports[spid].atu_full_violation++; } - mv88e6xxx_reg_unlock(chip);
return IRQ_HANDLED;
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 5c4195c635b0..67e457ce67ae 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -14,9 +14,11 @@ #include <linux/phylink.h>
#include "chip.h" +#include "global1.h" #include "global2.h" #include "port.h" #include "serdes.h" +#include "switchdev.h"
int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg, u16 *val) @@ -1240,13 +1242,12 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, if (err) return err;
- err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, &reg); - if (err) - return err; - - reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; - if (locked) - reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; + reg = 0; + if (locked) { + reg = (1 << port); + reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG | + MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; + }
return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); } diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h index cb04243f37c1..9475bc6e95a2 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h +++ b/drivers/net/dsa/mv88e6xxx/port.h @@ -231,6 +231,7 @@ #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff
/* Offset 0x0C: Port ATU Control */ #define MV88E6XXX_PORT_ATU_CTL 0x0c @@ -375,6 +376,11 @@ int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid); int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked);
+static inline bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port) +{ + return chip->ports[port].locked; +} + int mv88e6xxx_port_set_8021q_mode(struct mv88e6xxx_chip *chip, int port, u16 mode); int mv88e6095_port_tag_remap(struct mv88e6xxx_chip *chip, int port); diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.c b/drivers/net/dsa/mv88e6xxx/switchdev.c new file mode 100644 index 000000000000..cd332a10fad5 --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.c @@ -0,0 +1,284 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * switchdev.c + * + * Authors: + * Hans J. Schultz hans.schultz@westermo.com + * + */ + +#include <net/switchdev.h> +#include <linux/list.h> +#include "chip.h" +#include "global1.h" +#include "switchdev.h" + +static void mv88e6xxx_atu_locked_entry_purge(struct mv88e6xxx_atu_locked_entry *ale, + bool notify, bool take_nl_lock) +{ + struct switchdev_notifier_fdb_info info = { + .addr = ale->mac, + .vid = ale->vid, + .locked = true, + .offloaded = true, + }; + struct mv88e6xxx_atu_entry entry; + struct net_device *brport; + struct dsa_port *dp; + + entry.portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS; + entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED; + entry.trunk = false; + ether_addr_copy(entry.mac, ale->mac); + + mv88e6xxx_reg_lock(ale->chip); + mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry); + mv88e6xxx_reg_unlock(ale->chip); + + dp = dsa_to_port(ale->chip->ds, ale->port); + + if (notify) { + if (take_nl_lock) + rtnl_lock(); + brport = dsa_port_to_bridge_port(dp); + + if (brport) { + call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, + brport, &info.info, NULL); + } else { + dev_err(ale->chip->dev, "No bridge port for dsa port belonging to port %d\n", + ale->port); + } + if (take_nl_lock) + rtnl_unlock(); + } + + list_del(&ale->list); + kfree(ale); +} + +static void mv88e6xxx_atu_locked_entry_cleanup(struct work_struct *work) +{ + struct mv88e6xxx_port *p = container_of(work, struct mv88e6xxx_port, 
ale_work.work); + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + if (time_after(jiffies, ale->expires)) { + mv88e6xxx_atu_locked_entry_purge(ale, true, true); + p->ale_cnt--; + } + } + mutex_unlock(&p->ale_list_lock); + + mod_delayed_work(system_long_wq, &p->ale_work, msecs_to_jiffies(100)); +} + +static int mv88e6xxx_new_atu_locked_entry(struct mv88e6xxx_chip *chip, const unsigned char *addr, + int port, u16 fid, u16 vid, + struct mv88e6xxx_atu_locked_entry **alep) +{ + struct mv88e6xxx_atu_locked_entry *ale; + unsigned long now, age_time; + + ale = kmalloc(sizeof(*ale), GFP_ATOMIC); + if (!ale) + return -ENOMEM; + + ether_addr_copy(ale->mac, addr); + ale->chip = chip; + ale->port = port; + ale->fid = fid; + ale->vid = vid; + now = jiffies; + age_time = chip->age_time * chip->info->age_time_coeff; + ale->expires = now + age_time; + + *alep = ale; + return 0; +} + +struct mv88e6xxx_fid_search_ctx { + u16 fid_search; + u16 vid_found; +}; + +static int mv88e6xxx_find_vid_on_matching_fid(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv) +{ + struct mv88e6xxx_fid_search_ctx *ctx = priv; + + if (ctx->fid_search == entry->fid) { + ctx->vid_found = entry->vid; + return 1; + } + + return 0; +} + +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip, int port, + struct mv88e6xxx_atu_entry *entry, + u16 fid, u16 type) +{ + struct switchdev_notifier_fdb_info info = { + .addr = entry->mac, + .vid = 0, + .is_local = true, + .locked = true, + .blackhole = true, + .offloaded = true, + }; + struct mv88e6xxx_atu_locked_entry *ale; + struct mv88e6xxx_fid_search_ctx ctx; + struct net_device *brport; + struct mv88e6xxx_port *p; + struct dsa_port *dp; + int err; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + ctx.fid_search = fid; + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_vtu_walk(chip, 
mv88e6xxx_find_vid_on_matching_fid, &ctx); + mv88e6xxx_reg_unlock(chip); + if (err < 0) + return err; + if (err == 1) + info.vid = ctx.vid_found; + else + return -ENODATA; + + switch (type) { + case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION: + mutex_lock(&p->ale_list_lock); + if (p->ale_cnt >= ATU_LOCKED_ENTRIES_MAX) + goto exit; + mutex_unlock(&p->ale_list_lock); + entry->portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS; + entry->state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC; + entry->trunk = false; + + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_g1_atu_loadpurge(chip, fid, entry); + if (err) + goto fail; + mv88e6xxx_reg_unlock(chip); + + dp = dsa_to_port(chip->ds, port); + err = mv88e6xxx_new_atu_locked_entry(chip, entry->mac, port, fid, + info.vid, &ale); + if (err) + return err; + + mutex_lock(&p->ale_list_lock); + list_add(&ale->list, &p->ale_list); + p->ale_cnt++; + mutex_unlock(&p->ale_list_lock); + + rtnl_lock(); + brport = dsa_port_to_bridge_port(dp); + if (!brport) { + rtnl_unlock(); + return -ENODEV; + } + err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE, + brport, &info.info, NULL); + rtnl_unlock(); + break; + } + + return err; + +fail: + mv88e6xxx_reg_unlock(chip); + return err; + +exit: + mutex_unlock(&p->ale_list_lock); + return 0; +} + +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid) +{ + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + bool found = false; + + p = &chip->ports[port]; + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + if (ether_addr_equal(ale->mac, addr)) { + if (ale->vid == vid) { + mv88e6xxx_atu_locked_entry_purge(ale, false, false); + p->ale_cnt--; + found = true; + break; + } + } + } + mutex_unlock(&p->ale_list_lock); + return found; +} + +int mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_atu_locked_entry 
*ale, *tmp; + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + mv88e6xxx_atu_locked_entry_purge(ale, true, false); + p->ale_cnt--; + } + mutex_unlock(&p->ale_list_lock); + + return 0; +} + +int mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + INIT_LIST_HEAD(&p->ale_list); + mutex_init(&p->ale_list_lock); + p->ale_cnt = 0; + INIT_DELAYED_WORK(&p->ale_work, mv88e6xxx_atu_locked_entry_cleanup); + mod_delayed_work(system_long_wq, &p->ale_work, msecs_to_jiffies(500)); + + return 0; +} + +int mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + cancel_delayed_work(&p->ale_work); + mv88e6xxx_atu_locked_entry_flush(ds, port); + mutex_destroy(&p->ale_list_lock); + + return 0; +} diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.h b/drivers/net/dsa/mv88e6xxx/switchdev.h new file mode 100644 index 000000000000..df2005c36f47 --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * + * switchdev.h + * + * Authors: + * Hans J. 
Schultz hans.schultz@westermo.com + * + */ + +#ifndef DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ +#define DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ + +#include <net/switchdev.h> +#include "chip.h" + +#define ATU_LOCKED_ENTRIES_MAX 64 + +struct mv88e6xxx_atu_locked_entry { + struct list_head list; + struct mv88e6xxx_chip *chip; + int port; + u8 mac[ETH_ALEN]; + u16 fid; + u16 vid; + unsigned long expires; +}; + +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip, int port, + struct mv88e6xxx_atu_entry *entry, + u16 fid, u16 type); +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid); +int mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port); +int mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port); +int mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port); + +#endif /* DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ */
On Tue, Oct 18, 2022 at 06:56:17PM +0200, Hans J. Schultz wrote:
This implementation for the Marvell mv88e6xxx chip series is based on handling ATU miss violations occurring when packets ingress on a port that is locked. The MAC address triggering the ATU miss violation will be added to the ATU with a zero-DPV, and is then communicated through switchdev to the bridge module, which adds an FDB entry with the FDB locked flag set. The entry is kept according to the bridge's ageing time, thus simulating a dynamic entry.
Additionally the driver will set the sticky and masked flags, as the driver does not support roaming and forwarding from any port to a locked entry.
As this is essentially a form of CPU-based learning, the number of locked entries will be limited by a hardcoded value for now, so as to prevent DoS attacks.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
drivers/net/dsa/mv88e6xxx/Makefile | 1 + drivers/net/dsa/mv88e6xxx/chip.c | 76 +++++-- drivers/net/dsa/mv88e6xxx/chip.h | 19 ++ drivers/net/dsa/mv88e6xxx/global1.h | 1 + drivers/net/dsa/mv88e6xxx/global1_atu.c | 12 +- drivers/net/dsa/mv88e6xxx/port.c | 15 +- drivers/net/dsa/mv88e6xxx/port.h | 6 + drivers/net/dsa/mv88e6xxx/switchdev.c | 284 ++++++++++++++++++++++++ drivers/net/dsa/mv88e6xxx/switchdev.h | 37 +++ 9 files changed, 429 insertions(+), 22 deletions(-) create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.c create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.h
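The capped, ageing list of locked entries described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names `locked_table`, `locked_table_add` and `locked_table_age` are hypothetical and do not appear in the patch, time is an abstract tick counter without wraparound handling, and duplicate addresses are not detected. The limit of 64 mirrors `ATU_LOCKED_ENTRIES_MAX` from switchdev.h.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LOCKED_ENTRIES_MAX 64	/* mirrors ATU_LOCKED_ENTRIES_MAX in the patch */

/* Hypothetical stand-in for struct mv88e6xxx_atu_locked_entry. */
struct locked_entry {
	bool used;
	uint8_t mac[6];
	uint16_t vid;
	unsigned long expires;	/* absolute deadline in abstract ticks */
};

struct locked_table {
	struct locked_entry slot[LOCKED_ENTRIES_MAX];
	int cnt;
};

/* Returns 0 on success, -1 when the cap is reached and the miss is ignored. */
static int locked_table_add(struct locked_table *t, const uint8_t *mac,
			    uint16_t vid, unsigned long now,
			    unsigned long ageing)
{
	int i;

	if (t->cnt >= LOCKED_ENTRIES_MAX)
		return -1;

	for (i = 0; i < LOCKED_ENTRIES_MAX; i++) {
		if (!t->slot[i].used) {
			t->slot[i].used = true;
			memcpy(t->slot[i].mac, mac, 6);
			t->slot[i].vid = vid;
			t->slot[i].expires = now + ageing;
			t->cnt++;
			return 0;
		}
	}
	return -1;
}

/* Removes every entry whose deadline has passed; returns how many were removed. */
static int locked_table_age(struct locked_table *t, unsigned long now)
{
	int i, removed = 0;

	for (i = 0; i < LOCKED_ENTRIES_MAX; i++) {
		if (t->slot[i].used && now >= t->slot[i].expires) {
			t->slot[i].used = false;
			t->cnt--;
			removed++;
		}
	}
	return removed;
}
```

The cap is checked before anything is stored, so a flood of unknown source addresses cannot grow the table beyond the fixed limit; new addresses are simply ignored until older entries age out.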
diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile index c8eca2b6f959..be903a983780 100644 --- a/drivers/net/dsa/mv88e6xxx/Makefile +++ b/drivers/net/dsa/mv88e6xxx/Makefile @@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o mv88e6xxx-objs += serdes.o mv88e6xxx-objs += smi.o +mv88e6xxx-objs += switchdev.o diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 352121cce77e..71843fe87f77 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -42,6 +42,7 @@ #include "ptp.h" #include "serdes.h" #include "smi.h" +#include "switchdev.h" static void assert_reg_lock(struct mv88e6xxx_chip *chip) { @@ -924,6 +925,13 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port, if (err) dev_err(chip->dev, "p%d: failed to force MAC link down\n", port);
- else
if (mv88e6xxx_port_is_locked(chip, port)) {
err = mv88e6xxx_atu_locked_entry_flush(ds, port);
if (err)
dev_err(chip->dev,
"p%d: failed to clear locked entries\n", port);
}
This would not have been needed if dsa_port_set_state() would have called dsa_port_fast_age().
Currently it only does that if dp->learning is true. From previous conversations I get the idea that with MAB, port learning will be false. But I don't understand why; isn't MAB CPU-assisted learning? I'm looking at the ocelot hardware support for this and I think it could be implemented using a similar mechanism, but I certainly don't want to add more workarounds such as this in other drivers.
Are there any other ways to implement MAB other than through CPU assisted learning?
We could add one more dp->mab flag which tracks the "mab" brport flag, and extend dsa_port_set_state() to also call dsa_port_fast_age() in that case, but I want to make sure there isn't something extremely obvious I'm missing about the "learning" flag.
} static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port, @@ -1690,6 +1698,13 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port) struct mv88e6xxx_chip *chip = ds->priv; int err;
- if (mv88e6xxx_port_is_locked(chip, port)) {
err = mv88e6xxx_atu_locked_entry_flush(ds, port);
if (err)
dev_err(chip->ds->dev, "p%d: failed to clear locked entries: %d\n",
port, err);
- }
- mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_fast_age_fid(chip, port, 0); mv88e6xxx_reg_unlock(chip);
@@ -1726,11 +1741,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, return err; } -static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
int (*cb)(struct mv88e6xxx_chip *chip,
const struct mv88e6xxx_vtu_entry *entry,
void *priv),
void *priv)
+int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
int (*cb)(struct mv88e6xxx_chip *chip,
const struct mv88e6xxx_vtu_entry *entry,
void *priv),
void *priv)
{ struct mv88e6xxx_vtu_entry entry = { .vid = mv88e6xxx_max_vid(chip), @@ -2731,6 +2746,9 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, if (fdb_flags) return 0;
- if (mv88e6xxx_port_is_locked(chip, port))
mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
- mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
@@ -2744,16 +2762,21 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port, u16 fdb_flags, struct dsa_db db) { struct mv88e6xxx_chip *chip = ds->priv;
- int err;
- bool locked_found = false;
- int err = 0;
/* Ignore entries with flags set */ if (fdb_flags) return 0;
- mv88e6xxx_reg_lock(chip);
- err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
- mv88e6xxx_reg_unlock(chip);
- if (mv88e6xxx_port_is_locked(chip, port))
locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
- if (!locked_found) {
mv88e6xxx_reg_lock(chip);
err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
mv88e6xxx_reg_unlock(chip);
- } return err;
} @@ -3849,11 +3872,18 @@ static int mv88e6xxx_setup(struct dsa_switch *ds) static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port) {
- return mv88e6xxx_setup_devlink_regions_port(ds, port);
- int err;
- err = mv88e6xxx_setup_devlink_regions_port(ds, port);
- if (!err)
return mv88e6xxx_init_violation_handler(ds, port);
- return err;
} static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port) {
- mv88e6xxx_teardown_violation_handler(ds, port); mv88e6xxx_teardown_devlink_regions_port(ds, port);
} @@ -6528,7 +6558,7 @@ static int mv88e6xxx_port_pre_bridge_flags(struct dsa_switch *ds, int port, const struct mv88e6xxx_ops *ops; if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
- BR_BCAST_FLOOD | BR_PORT_LOCKED))
+ BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB))
return -EINVAL;
ops = chip->info->ops; @@ -6549,13 +6579,13 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, struct mv88e6xxx_chip *chip = ds->priv; int err = -EOPNOTSUPP;
- mv88e6xxx_reg_lock(chip);
Separate commit which changes the locking?
if (flags.mask & BR_LEARNING) { bool learning = !!(flags.val & BR_LEARNING); u16 pav = learning ? (1 << port) : 0;
+ mv88e6xxx_reg_lock(chip);
err = mv88e6xxx_port_set_assoc_vector(chip, port, pav);
+ mv88e6xxx_reg_unlock(chip);
if (err) goto out; }
@@ -6563,8 +6593,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_FLOOD) { bool unicast = !!(flags.val & BR_FLOOD);
+ mv88e6xxx_reg_lock(chip);
err = chip->info->ops->port_set_ucast_flood(chip, port, unicast);
+ mv88e6xxx_reg_unlock(chip);
if (err) goto out; }
@@ -6572,8 +6604,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_MCAST_FLOOD) { bool multicast = !!(flags.val & BR_MCAST_FLOOD);
+ mv88e6xxx_reg_lock(chip);
err = chip->info->ops->port_set_mcast_flood(chip, port, multicast);
+ mv88e6xxx_reg_unlock(chip);
if (err) goto out; }
@@ -6581,20 +6615,34 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_BCAST_FLOOD) { bool broadcast = !!(flags.val & BR_BCAST_FLOOD);
+ mv88e6xxx_reg_lock(chip);
err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast);
+ mv88e6xxx_reg_unlock(chip);
if (err) goto out; }
- if (flags.mask & BR_PORT_MAB) {
chip->ports[port].mab = !!(flags.val & BR_PORT_MAB);
if (!chip->ports[port].mab)
err = mv88e6xxx_atu_locked_entry_flush(ds, port);
else
err = 0;
Again, dsa_port_fast_age() is also called when dp->learning is turned off in dsa_port_bridge_flags(). I don't want to see the mv88e6xxx driver doing this manually.
- }
- if (flags.mask & BR_PORT_LOCKED) { bool locked = !!(flags.val & BR_PORT_LOCKED);
+ mv88e6xxx_reg_lock(chip);
err = mv88e6xxx_port_set_lock(chip, port, locked);
+ mv88e6xxx_reg_unlock(chip);
if (err) goto out;
+ chip->ports[port].locked = locked;
}
out:
- mv88e6xxx_reg_unlock(chip);
return err; } diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index e693154cf803..180fbcf596fa 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -280,6 +280,16 @@ struct mv88e6xxx_port { unsigned int serdes_irq; char serdes_irq_name[64]; struct devlink_region *region;
- /* Locked port and MacAuth control flags */
- bool locked;
- bool mab;
- /* List and maintenance of ATU locked entries */
- struct mutex ale_list_lock;
- struct list_head ale_list;
- struct delayed_work ale_work;
- int ale_cnt;
}; enum mv88e6xxx_region_id { @@ -399,6 +409,9 @@ struct mv88e6xxx_chip { int egress_dest_port; int ingress_dest_port;
- /* Keep the register written age time for easy access */
- u8 age_time;
- /* Per-port timestamping resources. */ struct mv88e6xxx_port_hwtstamp port_hwtstamp[DSA_MAX_PORTS];
@@ -802,6 +815,12 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip) mutex_unlock(&chip->reg_lock); } +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
int (*cb)(struct mv88e6xxx_chip *chip,
const struct mv88e6xxx_vtu_entry *entry,
void *priv),
void *priv);
int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap); #endif /* _MV88E6XXX_CHIP_H */ diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h index 65958b2a0d3a..503fbf216670 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.h +++ b/drivers/net/dsa/mv88e6xxx/global1.h @@ -136,6 +136,7 @@ #define MV88E6XXX_G1_ATU_DATA_TRUNK 0x8000 #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK 0x00f0 #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK 0x3ff0 +#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_MASK 0x000f #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST 0x0001 diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c index d9dfa1159cde..67907cd00b87 100644 --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c @@ -12,6 +12,8 @@ #include "chip.h" #include "global1.h" +#include "port.h" +#include "switchdev.h" /* Offset 0x01: ATU FID Register */ @@ -54,6 +56,7 @@ int mv88e6xxx_g1_atu_set_age_time(struct mv88e6xxx_chip *chip, /* Round to nearest multiple of coeff */ age_time = (msecs + coeff / 2) / coeff;
- chip->age_time = age_time;
err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_CTL, &val); if (err) @@ -426,6 +429,8 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out;
- mv88e6xxx_reg_unlock(chip);
- spid = entry.state;
if (val & MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION) { @@ -446,6 +451,12 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) "ATU miss violation for %pM portvec %x spid %d\n", entry.mac, entry.portvec, spid); chip->ports[spid].atu_miss_violation++;
+ if (fid && chip->ports[spid].mab)
+ err = mv88e6xxx_handle_violation(chip, spid, &entry, fid,
+ MV88E6XXX_G1_ATU_OP_MISS_VIOLATION);
+ if (err)
+ goto out;
}
if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) { @@ -454,7 +465,6 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) entry.mac, entry.portvec, spid); chip->ports[spid].atu_full_violation++; }
- mv88e6xxx_reg_unlock(chip);
return IRQ_HANDLED; diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 5c4195c635b0..67e457ce67ae 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -14,9 +14,11 @@ #include <linux/phylink.h> #include "chip.h" +#include "global1.h" #include "global2.h" #include "port.h" #include "serdes.h" +#include "switchdev.h" int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg, u16 *val) @@ -1240,13 +1242,12 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, if (err) return err;
- err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, &reg);
- if (err)
return err;
- reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
- if (locked)
reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
- reg = 0;
- if (locked) {
reg = (1 << port);
reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
- }
return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); } diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h index cb04243f37c1..9475bc6e95a2 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h +++ b/drivers/net/dsa/mv88e6xxx/port.h @@ -231,6 +231,7 @@ #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff /* Offset 0x0C: Port ATU Control */ #define MV88E6XXX_PORT_ATU_CTL 0x0c @@ -375,6 +376,11 @@ int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid); int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked); +static inline bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port) +{
- return chip->ports[port].locked;
+}
int mv88e6xxx_port_set_8021q_mode(struct mv88e6xxx_chip *chip, int port, u16 mode); int mv88e6095_port_tag_remap(struct mv88e6xxx_chip *chip, int port); diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.c b/drivers/net/dsa/mv88e6xxx/switchdev.c new file mode 100644 index 000000000000..cd332a10fad5 --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.c @@ -0,0 +1,284 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/*
- switchdev.c
- Authors:
- Hans J. Schultz hans.schultz@westermo.com
- */
+#include <net/switchdev.h> +#include <linux/list.h> +#include "chip.h" +#include "global1.h" +#include "switchdev.h"
+static void mv88e6xxx_atu_locked_entry_purge(struct mv88e6xxx_atu_locked_entry *ale,
bool notify, bool take_nl_lock)
+{
- struct switchdev_notifier_fdb_info info = {
.addr = ale->mac,
.vid = ale->vid,
.locked = true,
.offloaded = true,
- };
- struct mv88e6xxx_atu_entry entry;
- struct net_device *brport;
- struct dsa_port *dp;
- entry.portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS;
- entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED;
- entry.trunk = false;
- ether_addr_copy(entry.mac, ale->mac);
- mv88e6xxx_reg_lock(ale->chip);
- mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry);
- mv88e6xxx_reg_unlock(ale->chip);
- dp = dsa_to_port(ale->chip->ds, ale->port);
- if (notify) {
if (take_nl_lock)
rtnl_lock();
Is this tested with lockdep? I see the function is called with other locks held (p->ale_list_lock). Isn't there a lock inversion anywhere? Locks always need to be taken in the same order, and rtnl_lock is a pretty high level lock, not exactly the kind you could take just like that.
brport = dsa_port_to_bridge_port(dp);
if (brport) {
call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
brport, &info.info, NULL);
} else {
dev_err(ale->chip->dev, "No bridge port for dsa port belonging to port %d\n",
ale->port);
}
if (take_nl_lock)
rtnl_unlock();
- }
- list_del(&ale->list);
- kfree(ale);
+}
On 2022-10-20 15:25, Vladimir Oltean wrote:
Is this tested with lockdep? I see the function is called with other locks held (p->ale_list_lock). Isn't there a lock inversion anywhere? Locks always need to be taken in the same order, and rtnl_lock is a pretty high level lock, not exactly the kind you could take just like that.
I am very sure that there are no lock inversions or double locks taken. It is only in the clean-up from time-out of driver locked entries that the nl lock needs to be taken (as the code reveals). In all other instances, the nl lock is already taken as far as this implementation goes.
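The kind of ordering problem the lockdep question is about can be illustrated with a toy tracker that remembers which lock class was taken while another was held and flags the reverse order. This is a sketch only: it is not how lockdep is implemented, the names `order_tracker`, `tracker_acquire` and `tracker_release` are made up for illustration, and the two paths in the usage note are hypothetical, not a claim about what the patch actually does.

```c
#include <stdbool.h>

/* Illustrative lock classes; the names only echo the locks in the thread. */
enum lock_class { LOCK_RTNL, LOCK_ALE_LIST, LOCK_REG, LOCK_CLASS_MAX };

struct order_tracker {
	bool held[LOCK_CLASS_MAX];
	/* before[a][b] becomes true once b is acquired while a is held */
	bool before[LOCK_CLASS_MAX][LOCK_CLASS_MAX];
};

/* Returns false if taking 'c' now is a double lock or inverts a recorded order. */
static bool tracker_acquire(struct order_tracker *t, enum lock_class c)
{
	bool ok = true;
	int h;

	for (h = 0; h < LOCK_CLASS_MAX; h++) {
		if (!t->held[h])
			continue;
		if (h == (int)c || t->before[c][h])
			ok = false;	/* double lock, or c was earlier taken before h */
		t->before[h][c] = true;
	}
	t->held[c] = true;
	return ok;
}

static void tracker_release(struct order_tracker *t, enum lock_class c)
{
	t->held[c] = false;
}
```

If one path takes `LOCK_RTNL` and then `LOCK_ALE_LIST`, while another (say, a timeout worker) takes `LOCK_ALE_LIST` and then `LOCK_RTNL`, the second `tracker_acquire()` returns false even though no deadlock occurred in that run. Running the real code with lockdep enabled gives the same kind of early warning.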
On 2022-10-20 15:25, Vladimir Oltean wrote:
This would not have been needed if dsa_port_set_state() would have called dsa_port_fast_age().
Currently it only does that if dp->learning is true. From previous conversations I get the idea that with MAB, port learning will be false. But I don't understand why; isn't MAB CPU-assisted learning? I'm looking at the ocelot hardware support for this and I think it could be implemented using a similar mechanism, but I certainly don't want to add more workarounds such as this in other drivers.
Are there any other ways to implement MAB other than through CPU assisted learning?
We could add one more dp->mab flag which tracks the "mab" brport flag, and extend dsa_port_set_state() to also call dsa_port_fast_age() in that case, but I want to make sure there isn't something extremely obvious I'm missing about the "learning" flag.
In general locked ports block traffic from a host based on if there is a FDB entry or not. In the non-offloaded case, there is only CPU assisted learning, so the normal learning mechanism has to be disabled as any learned entry will open the port for the learned MAC,vlan. Thus learning is off for locked ports, which of course includes MAB.
So the 'learning' is based on authorizing MAC,vlan addresses, which is done by userspace daemons, e.g. hostapd or what could be called mabd.
On Thu, Oct 20, 2022 at 10:20:50PM +0200, netdev@kapio-technology.com wrote:
In general locked ports block traffic from a host based on if there is a FDB entry or not. In the non-offloaded case, there is only CPU assisted learning, so the normal learning mechanism has to be disabled as any learned entry will open the port for the learned MAC,vlan.
Does it have to be that way? Why can't BR_LEARNING on a BR_PORT_LOCKED cause the learned FDB entries to have BR_FDB_LOCKED, and everything would be ok in that case (the port will not be opened for the learned MAC/VLAN)?
On 2022-10-21 00:57, Vladimir Oltean wrote:
Does it have to be that way? Why can't BR_LEARNING on a BR_PORT_LOCKED cause the learned FDB entries to have BR_FDB_LOCKED, and everything would be ok in that case (the port will not be opened for the learned MAC/VLAN)?
I suppose you are right that basing it solely on BR_FDB_LOCKED is possible.
The question is then maybe if the common case where you don't need learned entries for the scheme to work, e.g. with EAPOL link local packets, requires less CPU load to work and is cleaner than if using BR_FDB_LOCKED entries?
On Fri, Oct 21, 2022 at 08:47:42AM +0200, netdev@kapio-technology.com wrote:
I suppose you are right that basing it solely on BR_FDB_LOCKED is possible.
The question is then maybe if the common case where you don't need learned entries for the scheme to work, e.g. with EAPOL link local packets, requires less CPU load to work and is cleaner than if using BR_FDB_LOCKED entries?
I suppose the real question is what does the bridge currently do with BR_LEARNING + BR_PORT_LOCKED, and if that is sane and useful in any case? It isn't a configuration that's rejected, for sure. The configuration could be rejected via a bug fix patch, then in net-next it could be made to learn these addresses with the BR_FDB_LOCKED flag.
To your question regarding the common case (no MAB): that can be supported just fine when BR_LEARNING is off and BR_PORT_LOCKED is on, no? No BR_FDB_LOCKED entries will be learned.
On 2022-10-21 13:22, Vladimir Oltean wrote:
I suppose the real question is what does the bridge currently do with BR_LEARNING + BR_PORT_LOCKED, and if that is sane and useful in any case? It isn't a configuration that's rejected, for sure. The configuration could be rejected via a bug fix patch, then in net-next it could be made to learn these addresses with the BR_FDB_LOCKED flag.
To your question regarding the common case (no MAB): that can be supported just fine when BR_LEARNING is off and BR_PORT_LOCKED is on, no? No BR_FDB_LOCKED entries will be learned.
As it is now in the bridge, the locked port part is handled before learning in the ingress data path, so with BR_LEARNING and BR_PORT_LOCKED, I think it will work as it does now, except for link-local packets.
If your suggestion of BR_LEARNING causing BR_FDB_LOCKED on a locked port were adopted, I guess it would be implemented under br_fdb_update(), and BR_LEARNING + BR_PORT_LOCKED would go together, forcing BR_LEARNING in this case, and thus also for all drivers?
On Fri, Oct 21, 2022 at 03:16:21PM +0200, netdev@kapio-technology.com wrote:
As it is now in the bridge, the locked port part is handled before learning in the ingress data path, so with BR_LEARNING and BR_PORT_LOCKED, I think it will work as it does now, except for link-local packets.
If link-local learning is enabled on a locked port, I think those addresses should also be learned with the BR_FDB_LOCKED flag. The creation of those locked FDB entries can be further suppressed by the BROPT_NO_LL_LEARN flag.
If your suggestion of BR_LEARNING causing BR_FDB_LOCKED on a locked port were adopted, I guess it would be implemented under br_fdb_update(), and BR_LEARNING + BR_PORT_LOCKED would go together, forcing BR_LEARNING in this case, and thus also for all drivers?
Yes, basically where this is placed right now (in br_handle_frame_finish):
if (p->flags & BR_PORT_LOCKED) { struct net_bridge_fdb_entry *fdb_src = br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
if (!fdb_src) { unsigned long flags = 0;
if (p->flags & BR_PORT_MAB) { ~~~~~~~~~~~~~~~~~~~~~~~~ except check for BR_LEARNING
__set_bit(BR_FDB_LOCKED, &flags); br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags); } goto drop; } else if (READ_ONCE(fdb_src->dst) != p || test_bit(BR_FDB_LOCAL, &fdb_src->flags) || test_bit(BR_FDB_LOCKED, &fdb_src->flags)) { goto drop; } }
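The ingress decision in the snippet above, with the proposed BR_LEARNING check in place of BR_PORT_MAB, can be modelled as a small pure function. This is a sketch under assumptions, not bridge code: `PORT_LEARNING`, `PORT_LOCKED`, `enum src_state`, `struct verdict` and `locked_port_ingress()` are hypothetical names, and the FDB lookup is reduced to a four-way state.

```c
#include <stdbool.h>

#define PORT_LEARNING (1u << 0)	/* stands in for BR_LEARNING */
#define PORT_LOCKED   (1u << 1)	/* stands in for BR_PORT_LOCKED */

/* Result of looking up the frame's source MAC/VLAN in the FDB. */
enum src_state {
	SRC_MISSING,	/* no FDB entry */
	SRC_MISMATCH,	/* entry points to another port or is local */
	SRC_LOCKED,	/* entry exists but still carries the locked flag */
	SRC_AUTHORIZED,	/* entry on this port without the locked flag */
};

struct verdict {
	bool forward;		/* frame continues through the bridge */
	bool learn_locked;	/* create an FDB entry flagged as locked */
};

static struct verdict locked_port_ingress(unsigned int port_flags,
					  enum src_state src)
{
	struct verdict v = { .forward = true, .learn_locked = false };

	/* Unlocked ports are not subject to the source check at all. */
	if (!(port_flags & PORT_LOCKED))
		return v;

	switch (src) {
	case SRC_MISSING:
		v.forward = false;
		/* Proposed rule: learning on a locked port yields a locked entry. */
		v.learn_locked = !!(port_flags & PORT_LEARNING);
		break;
	case SRC_MISMATCH:
	case SRC_LOCKED:
		v.forward = false;
		break;
	case SRC_AUTHORIZED:
		break;
	}
	return v;
}
```

With learning off, a locked port drops unknown sources without creating any entry, which matches the plain 802.1X case. With learning on, the first frame is still dropped but leaves behind a locked entry for userspace to act on.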
On 2022-10-21 18:30, Vladimir Oltean wrote:
On Fri, Oct 21, 2022 at 03:16:21PM +0200, netdev@kapio-technology.com wrote:
As it is now in the bridge, the locked port part is handled before learning in the ingress data path, so with BR_LEARNING and BR_PORT_LOCKED, I think it will work as it does now, except for link-local packets.
If link-local learning is enabled on a locked port, I think those addresses should also be learned with the BR_FDB_LOCKED flag. The creation of those locked FDB entries can be further suppressed by the BROPT_NO_LL_LEARN flag.
If your suggestion of BR_LEARNING causing BR_FDB_LOCKED on a locked port were adopted, I guess it would be implemented under br_fdb_update(), and BR_LEARNING + BR_PORT_LOCKED would go together, forcing BR_LEARNING in this case, and thus also for all drivers?
Yes, basically where this is placed right now (in br_handle_frame_finish):
if (p->flags & BR_PORT_LOCKED) { struct net_bridge_fdb_entry *fdb_src = br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
if (!fdb_src) { unsigned long flags = 0; if (p->flags & BR_PORT_MAB) { ~~~~~~~~~~~~~~~~~~~~~~~~ except check for BR_LEARNING __set_bit(BR_FDB_LOCKED, &flags); br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags); } goto drop; } else if (READ_ONCE(fdb_src->dst) != p || test_bit(BR_FDB_LOCAL, &fdb_src->flags) || test_bit(BR_FDB_LOCKED, &fdb_src->flags)) { goto drop; }
}
As I don't know what implications it would have for other drivers to have learning forced enabled on locked ports, I cannot say if it is a good idea or not. Right now learning is not forced either way as is, but the consensus is that learning should be off with locked ports, which it would be either way in the common case I think.
On Fri, Oct 21, 2022 at 07:18:59PM +0200, netdev@kapio-technology.com wrote:
As I don't know what implications it would have for other drivers to have learning forced enabled on locked ports, I cannot say if it is a good idea or not. Right now learning is not forced either way as is, but the consensus is that learning should be off with locked ports, which it would be either way in the common case I think.
I don't think I fully understand what you mean by forcing BR_LEARNING. A bridge port gets created with a default set of flags as can be seen in new_nbp(). Those flags include BR_LEARNING but don't include BR_PORT_LOCKED.
The user can decide he wants to make the port use 802.1X without MAB, so he enables BR_PORT_LOCKED and disables BR_LEARNING, all with the same netlink command (ip link set swp0 type bridge_slave learning off locked on).
How was the driver forced into anything?
On 2022-10-21 19:30, Vladimir Oltean wrote:
I don't think I fully understand what you mean by forcing BR_LEARNING. A bridge port gets created with a default set of flags as can be seen in new_nbp(). Those flags include BR_LEARNING but don't include BR_PORT_LOCKED.
The user can decide he wants to make the port use 802.1X without MAB, so he enables BR_PORT_LOCKED and disables BR_LEARNING, all with the same netlink command (ip link set swp0 type bridge_slave learning off locked on).
How was the driver forced into anything?
Well, with this change, to have MAB working, the bridge would need learning on of course, but how things work with the bridge according to the flags, they should also work in the offloaded case if you ask me. There should be no difference between the two, thus MAB in drivers would have to be with learning on.
On Fri, Oct 21, 2022 at 07:39:34PM +0200, netdev@kapio-technology.com wrote:
Well, with this change, to have MAB working, the bridge would need learning on of course, but how things work with the bridge according to the flags, they should also work in the offloaded case if you ask me. There should be no difference between the two, thus MAB in drivers would have to be with learning on.
Am I proposing for things to work differently in the offload and software case, and not realizing it? :-/
The essence of my proposal was to send a bug fix now which denies BR_LEARNING to be set together with BR_PORT_LOCKED. The fact that link-local traffic is learned by the software bridge is something unintended as far as I understand.
You tried to fix it here, and as far as I could search in my inbox, that didn't go anywhere: https://lore.kernel.org/netdev/47d8d747-54ef-df52-3b9c-acb9a77fa14a@blackwal...
I thought only mv88e6xxx offloads BR_PORT_LOCKED, but now, after searching, I also see prestera has support for it, so let me add Oleksandr Mazur to the discussion as well. I wonder how they deal with this? Has somebody come to rely on learning being enabled on a locked port?
MAB in offloading drivers will have to be with learning on (same as in software). When BR_PORT_LOCKED | BR_LEARNING will be allowed together back in net-next (to denote the MAB configuration), offloading drivers (mv88e6xxx and prestera) will be patched to reject them. They will only accept the two together when they implement MAB support.
Future drivers after this mess has been cleaned up will have to look at the BR_PORT_LOCKED and BR_LEARNING flag in combination, to see which kind of learning is desired on a port (secure, CPU based learning or autonomous learning).
Am I not making sense?
On 2022-10-21 20:14, Vladimir Oltean wrote:
On Fri, Oct 21, 2022 at 07:39:34PM +0200, netdev@kapio-technology.com wrote:
Well, with this change, to have MAB working, the bridge would need learning on of course, but how things work with the bridge according to the flags, they should also work in the offloaded case if you ask me. There should be no difference between the two, thus MAB in drivers would have to be with learning on.
Am I proposing for things to work differently in the offload and software case, and not realizing it? :-/
The essence of my proposal was to send a bug fix now which denies BR_LEARNING to be set together with BR_PORT_LOCKED. The fact that link-local traffic is learned by the software bridge is something unintended as far as I understand.
You tried to fix it here, and as far as I could search in my inbox, that didn't go anywhere: https://lore.kernel.org/netdev/47d8d747-54ef-df52-3b9c-acb9a77fa14a@blackwal...
I thought only mv88e6xxx offloads BR_PORT_LOCKED, but now, after searching, I also see prestera has support for it, so let me add Oleksandr Mazur to the discussion as well. I wonder how they deal with this? Has somebody come to rely on learning being enabled on a locked port?
MAB in offloading drivers will have to be with learning on (same as in software). When BR_PORT_LOCKED | BR_LEARNING will be allowed together back in net-next (to denote the MAB configuration), offloading drivers (mv88e6xxx and prestera) will be patched to reject them. They will only accept the two together when they implement MAB support.
Future drivers after this mess has been cleaned up will have to look at the BR_PORT_LOCKED and BR_LEARNING flag in combination, to see which kind of learning is desired on a port (secure, CPU based learning or autonomous learning).
Am I not making sense?
I will not say that you are not making sense as for the mv88e6xxx, as it needs port association in all cases with BR_PORT_LOCKED, MAB or not, and port association is turned on in the driver with learning turned on.
That said, there must be some resolution and agreement overall with this issue to move on. Right now port association is turned on in the mv88e6xxx driver when locking the port, thus setting learning off after locking will break things.
On Sat, Oct 22, 2022 at 09:24:56AM +0200, netdev@kapio-technology.com wrote:
I will not say that you are not making sense as for the mv88e6xxx, as it needs port association in all cases with BR_PORT_LOCKED, MAB or not, and port association is turned on in the driver with learning turned on.
That said, there must be some resolution and agreement overall with this issue to move on. Right now port association is turned on in the mv88e6xxx driver when locking the port, thus setting learning off after locking will break things.
This already needs to be treated as a bug and fixed on its own. Forget about MAB.
You're saying that when BR_LEARNING=on and BR_PORT_LOCKED=on, the mv88e6xxx driver works properly, but the software bridge is broken (learns from link-local multicast).
When BR_LEARNING=off and BR_PORT_LOCKED=on, the software bridge is not broken, but the mv88e6xxx driver is, because it requires the PAV configured properly.
And you're saying that I'm the one who suggests things should work differently in software mode vs offloaded mode?!
Why don't you (a) deny BR_LEARNING + BR_PORT_LOCKED in the bridge layer (b) fix the mv88e6xxx driver to always keep the assoc_vector set properly for the port, if BR_LEARNING *or* BR_PORT_LOCKED is set?
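The two-part proposal above could be sketched roughly as follows. This is a simplified user-space model, not kernel code: the SKETCH_* flag values and both function names are assumptions made for illustration. Only the rule itself (reject learning together with locked in the bridge layer, and keep the port association vector set in the driver when either flag is on) comes from the discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the bridge port flags; the values are arbitrary. */
#define SKETCH_BR_LEARNING     (1UL << 0)
#define SKETCH_BR_PORT_LOCKED  (1UL << 1)

/* (a) Bridge layer: refuse the combination learning + locked. */
int sketch_validate_port_flags(unsigned long flags)
{
	if ((flags & SKETCH_BR_LEARNING) && (flags & SKETCH_BR_PORT_LOCKED))
		return -EINVAL;
	return 0;
}

/* (b) Driver: the port association vector (PAV) must include the port
 * whenever learning *or* locking is requested. */
bool sketch_port_needs_pav(unsigned long flags)
{
	return (flags & (SKETCH_BR_LEARNING | SKETCH_BR_PORT_LOCKED)) != 0;
}
```

With this split, turning learning off on an already locked port would no longer clear the PAV, which is the breakage described for mv88e6xxx.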
On 2022-10-22 14:02, Vladimir Oltean wrote:
On Sat, Oct 22, 2022 at 09:24:56AM +0200, netdev@kapio-technology.com wrote:
I will not say that you are not making sense as for the mv88e6xxx, as it needs port association in all cases with BR_PORT_LOCKED, MAB or not, and port association is turned on in the driver with learning turned on.
That said, there must be some resolution and agreement overall with this issue to move on. Right now port association is turned on in the mv88e6xxx driver when locking the port, thus setting learning off after locking will break things.
This already needs to be treated as a bug and fixed on its own. Forget about MAB.
You're saying that when BR_LEARNING=on and BR_PORT_LOCKED=on, the mv88e6xxx driver works properly, but the software bridge is broken (learns from link-local multicast).
When BR_LEARNING=off and BR_PORT_LOCKED=on, the software bridge is not broken, but the mv88e6xxx driver is, because it requires the PAV configured properly.
And you're saying that I'm the one who suggests things should work differently in software mode vs offloaded mode?!
Well :-) To be specific, I am talking about how things work from a user perspective, where I have kept to BR_LEARNING off before turning BR_PORT_LOCKED on.
I admit to a weakness in that BR_LEARNING off after BR_PORT_LOCKED on is a problem that from my perspective at this point would be a user error.
Why don't you (a) deny BR_LEARNING + BR_PORT_LOCKED in the bridge layer (b) fix the mv88e6xxx driver to always keep the assoc_vector set properly for the port, if BR_LEARNING *or* BR_PORT_LOCKED is set?
(a) yes, I have thought that documentation could handle this, but maybe you are right, maybe it should be enforced... (b) BR_PORT_LOCKED ensures now that the PAV is correctly set, so I have basically distinguished between learning and port association (which I know mechanically is the same in mv88e6xxx), but still I have adhered to learning off while port association is on for the port.
On Fri, Oct 21, 2022 at 07:39:34PM +0200, netdev@kapio-technology.com wrote:
Well, with this change, to have MAB working, the bridge would need learning on of course, but how things work with the bridge according to the flags, they should also work in the offloaded case if you ask me. There should be no difference between the two, thus MAB in drivers would have to be with learning on.
Am I proposing for things to work differently in the offload and software case, and not realizing it? :-/
The essence of my proposal was to send a bug fix now which denies BR_LEARNING to be set together with BR_PORT_LOCKED. The fact that link-local traffic is learned by the software bridge is something unintended as far as I understand.
You tried to fix it here, and as far as I could search in my inbox, that didn't go anywhere: https://lore.kernel.org/netdev/47d8d747-54ef-df52-3b9c-acb9a77fa14a@blackwal...
I thought only mv88e6xxx offloads BR_PORT_LOCKED, but now, after searching, I also see prestera has support for it, so let me add Oleksandr Mazur to the discussion as well. I wonder how they deal with this? Has somebody come to rely on learning being enabled on a locked port?
Hello,
The fact that link-local traffic is learned by the software bridge is something unintended as far as I understand.
In prestera driver, if port is in blocked state only the PAE frames can be trapped, so i'm not sure where the other traffic that you are talking about might come from. Or maybe i didn't get the issue here right, sorry?
Also, basically, prestera driver does not rely on the learning flag if the port's flag BR_PORT_LOCKED is set. What this means, is that we discard any learning changes on the port if LOCKED is still set (done inside firmware, if i recall correctly). E.g. learning is always off, if port is in BR_PORT_LOCKED state, or in a block state but also has a static fdb entry (aka mac-auth entry).
The concept we follow is basically:
- some userspace daemon blocks the port;
- speaks with the <auth-center> (PAE traffic);
- the daemon itself populates the FDB with authenticated MACs (adding static FDB MACs);
- forces learning flag disable, disables the PORT_LOCKED flag.
At this point switch can basically receive only the traffic from authorized addresses (fdb still has static entries; learning disabled).
Hope that helps. Cheers.
Hi Oleksandr,
On Sat, Oct 22, 2022 at 08:50:20AM +0000, Oleksandr Mazur wrote:
The essence of my proposal was to send a bug fix now which denies BR_LEARNING to be set together with BR_PORT_LOCKED. The fact that link-local traffic is learned by the software bridge is something unintended as far as I understand.
You tried to fix it here, and as far as I could search in my inbox, that didn't go anywhere: https://lore.kernel.org/netdev/47d8d747-54ef-df52-3b9c-acb9a77fa14a@blackwal...
I thought only mv88e6xxx offloads BR_PORT_LOCKED, but now, after searching, I also see prestera has support for it, so let me add Oleksandr Mazur to the discussion as well. I wonder how they deal with this? Has somebody come to rely on learning being enabled on a locked port?
Hello,
The fact that link-local traffic is learned by the software bridge is something unintended as far as I understand.
In prestera driver, if port is in blocked state only the PAE frames can be trapped, so i'm not sure where the other traffic that you are talking about might come from. Or maybe i didn't get the issue here right, sorry?
I hope the following script will exemplify what I mean.
#!/bin/bash

ip netns add ns0
ip -n ns0 link add br0 type bridge
ip -n ns0 link add veth0 type veth peer name veth1
ip -n ns0 link set veth1 master br0
ip -n ns0 link set veth1 type bridge_slave locked on learning on
ip -n ns0 link set veth0 up
ip -n ns0 link set veth1 up
ip -n ns0 link set br0 up
addr=$(ip -j -n ns0 link show dev veth0 | jq -r '.[]["address"]')
ip netns exec ns0 mausezahn veth0 -q -c 1 -p 64 -b 01:80:c2:00:00:0e -t ip
sleep 1
ip netns exec ns0 bridge fdb show dev veth1 master | grep ${addr}
ip netns del ns0
It will print:
6e:71:0a:8d:85:9e master br0
or in other words, the brport veth1 has learned the MAC SA of veth0 as a dynamic FDB entry even with no user space daemon to handle the authentication protocol.
In turn, having this MAC SA present in the bridge FDB means that communication with this station is now allowed. As far as I can tell, this is *not* intended. Only the authentication protocol should create the FDB entry.
Compare this with the same script, but with "locked on learning off". No FDB entry will be printed.
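A much simplified user-space model of the two ingress paths may help explain the difference between the two runs. The structure and function names below are assumptions for illustration and only loosely mirror the bridge code. The point it tries to capture is that link-local frames are learned on a path where the locked-port check is not applied, unless link-local learning is disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a bridge port and its bridge. */
struct sketch_port {
	bool learning;     /* models BR_LEARNING */
	bool locked;       /* models BR_PORT_LOCKED */
	bool no_ll_learn;  /* models the BROPT_NO_LL_LEARN bridge option */
};

/* Returns true if the frame's MAC SA is learned (or refreshed) as a
 * dynamic FDB entry.
 * src_known: an FDB entry for the MAC SA already points at this port. */
bool sketch_ingress_learns(const struct sketch_port *p, bool link_local,
			   bool src_known)
{
	if (link_local) {
		/* Link-local path: no locked-port check is applied here. */
		return p->learning && !p->no_ll_learn;
	}

	/* Regular path: the locked check runs before learning. */
	if (p->locked && !src_known)
		return false; /* frame dropped, nothing learned */

	return p->learning;
}
```

In this model, "locked on learning on" learns the sender of a frame addressed to 01:80:c2:00:00:0e, while "locked on learning off" does not, matching the two script outcomes described above.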
Also, basically, prestera driver does not rely on the learning flag if the port's flag BR_PORT_LOCKED is set. What this means, is that we discard any learning changes on the port if LOCKED is still set (done inside firmware, if i recall correctly). E.g. learning is always off, if port is in BR_PORT_LOCKED state, or in a block state but also has a static fdb entry (aka mac-auth entry).
So I take this as meaning that we could deny BR_LEARNING on ports with BR_PORT_LOCKED set, and prestera wouldn't be adversely affected. Ok.
The concept we follow is basically:
- some userspace daemon blocks the port;
- speaks with the <auth-center> (PAE traffic);
- the daemon itself populates the FDB with authenticated MACs (adding static FDB MACs);
- forces learning flag disable, disables the PORT_LOCKED flag. At this point switch can basically receive only the traffic from authorized addresses (fdb still has static entries; learning disabled).
I don't understand the last step. Why is the BR_PORT_LOCKED flag disabled? If disabled, the port will receive frames with any unknown MAC SA, not just the authorized ones.
I hope the following script will exemplify what I mean.
.. Oh, i get it now.
Frankly speaking we haven't stumbled across such scenario / issue before. But i can tell it does indeed seem a bit broken;
I think there are 2 options here:
1. The setup itself seems insecure, and user should be aware of such behavior / issue;
2. Bridge indeed should not learn MACs if BR_PORT_LOCKED is set. E.g. learning condition should be something like: not BR_PORT_LOCKED and learning is on;
I don't understand the last step. Why is the BR_PORT_LOCKED flag disabled? If disabled, the port will receive frames with any unknown MAC SA, not just the authorized ones.
Sorry for the confusion. Basically, what i described is what i would expect from a daemon (e.g. daemon would disable LOCKED); So just ignore that part.
On Sat, Oct 22, 2022 at 12:55:14PM +0000, Oleksandr Mazur wrote:
I hope the following script will exemplify what I mean.
.. Oh, i get it now.
Frankly speaking we haven't stumbled across such scenario / issue before. But i can tell it does indeed seem a bit broken;
I think there are 2 options here:
- The setup itself seems insecure, and user should be aware of such behavior / issue;
Be aware, and do what? Port locking is unfit for use if learning is left enabled (in the way learning is currently done).
- Bridge indeed should not learn MACs if BR_PORT_LOCKED is set.
E.g. learning condition should be something like: not BR_PORT_locked and learning is on;
Rather than violate the BR_LEARNING flag (have it set but do nothing, which would require even more checks in the fast path), I was proposing to not allow the BR_PORT_LOCKED | BR_LEARNING configuration at all. My question to you was if you're aware of any regression in prestera with such a change.
I don't understand the last step. Why is the BR_PORT_LOCKED flag disabled? If disabled, the port will receive frames with any unknown MAC SA, not just the authorized ones.
Sorry for the confusion. Basically, what i described is what i would expect from a daemon (e.g. daemon would disable LOCKED); So just ignore that part.
But still, why would the daemon disable BR_PORT_LOCKED once a station is authorized? You're describing a sample/test application, not a port security solution...
On Fri, Oct 21, 2022 at 09:14:11PM +0300, Vladimir Oltean wrote:
On Fri, Oct 21, 2022 at 07:39:34PM +0200, netdev@kapio-technology.com wrote:
Well, with this change, to have MAB working, the bridge would need learning on of course, but how things work with the bridge according to the flags, they should also work in the offloaded case if you ask me. There should be no difference between the two, thus MAB in drivers would have to be with learning on.
Am I proposing for things to work differently in the offload and software case, and not realizing it? :-/
The essence of my proposal was to send a bug fix now which denies BR_LEARNING to be set together with BR_PORT_LOCKED. The fact that link-local traffic is learned by the software bridge is something unintended as far as I understand.
You tried to fix it here, and as far as I could search in my inbox, that didn't go anywhere: https://lore.kernel.org/netdev/47d8d747-54ef-df52-3b9c-acb9a77fa14a@blackwal...
I thought only mv88e6xxx offloads BR_PORT_LOCKED, but now, after searching, I also see prestera has support for it, so let me add Oleksandr Mazur to the discussion as well. I wonder how they deal with this? Has somebody come to rely on learning being enabled on a locked port?
MAB in offloading drivers will have to be with learning on (same as in software). When BR_PORT_LOCKED | BR_LEARNING will be allowed together back in net-next (to denote the MAB configuration), offloading drivers (mv88e6xxx and prestera) will be patched to reject them. They will only accept the two together when they implement MAB support.
Future drivers after this mess has been cleaned up will have to look at the BR_PORT_LOCKED and BR_LEARNING flag in combination, to see which kind of learning is desired on a port (secure, CPU based learning or autonomous learning).
Am I not making sense?
I will try to summarize what I learned from past discussions because I think it is not properly explained in the commit messages.
If you look at the hostapd fork by Westermo [1], you will see that they are authorizing hosts by adding dynamic FDB entries from user space, not static ones. Someone from Westermo will need to confirm this, but I guess the reasons are that a) They want hosts that became silent to lose their authentication after the aging time b) They want hosts to lose their authentication when the carrier of the bridge port goes down. This will cause the bridge driver to flush dynamic FDB entries, but not static ones. Otherwise, an attacker with physical access to the switch and knowledge of the MAC address of the authenticated host can connect a different (malicious) host that will be able to communicate through the bridge.
In the above scenario, learning does not need to be on for the bridge to populate its FDB, but rather for the bridge to refresh the dynamic FDB entries installed by hostapd. This seems like a valid use case and one needs a good reason to break it in future kernels.
Regarding learning from link-local frames, this can be mitigated by [2] without adding additional checks in the bridge. I don't know why this bridge option was originally added, but if it wasn't for this use case, then now it has another use case.
Regarding MAB, from the above you can see that a pure 802.1X implementation that does not involve MAB can benefit from locked bridge ports with learning enabled. It is therefore not accurate to say that one wants MAB merely by enabling learning on a locked port. Given that MAB is a proprietary extension and much less secure than 802.1X, we can assume that there will be deployments out there that do not use MAB and do not care about notifications regarding locked FDB entries. I therefore think that MAB needs to be enabled by a separate bridge port flag that is rejected unless the bridge port is locked and has learning enabled.
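As a rough illustration of the constraint suggested in the paragraph above, the check could look something like the following. This is a user-space sketch under stated assumptions: the SKETCH_* names and values, including the separate MAB flag, are hypothetical and are not taken from the patch set.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for bridge port flags; the values are arbitrary. */
#define SKETCH_BR_LEARNING     (1UL << 0)
#define SKETCH_BR_PORT_LOCKED  (1UL << 1)
#define SKETCH_BR_PORT_MAB     (1UL << 2) /* assumed separate MAB flag */

/* Reject MAB unless the port is both locked and has learning enabled.
 * Locked + learning without MAB stays valid (the pure 802.1X case). */
int sketch_validate_mab(unsigned long flags)
{
	unsigned long required = SKETCH_BR_LEARNING | SKETCH_BR_PORT_LOCKED;

	if ((flags & SKETCH_BR_PORT_MAB) && (flags & required) != required)
		return -EINVAL;
	return 0;
}
```

Keeping MAB as its own bit means that locked plus learning on its own does not imply MAB, so a deployment that only wants refreshed dynamic entries would not get locked FDB notifications it does not care about.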
Regarding hardware offload, I have an idea (needs testing) on how to make mlxsw work in a similar way to mv88e6xxx. That is, does not involve injecting frames that incurred a miss to the Rx path. If you guys want, I'm willing to take a subset of the patches here, improve the commit message, do some small changes and submit them along with an mlxsw implementation. My intention is not to discredit anyone (I will keep the original authorship), but to help push this forward and give another example of hardware offload.
[1] https://github.com/westermo/hostapd/commit/10c584b875a63a9e58b0ad39835282545...
[2] https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id...
On 2022-10-22 15:49, Ido Schimmel wrote:
On Fri, Oct 21, 2022 at 09:14:11PM +0300, Vladimir Oltean wrote:
On Fri, Oct 21, 2022 at 07:39:34PM +0200, netdev@kapio-technology.com wrote:
Well, with this change, to have MAB working, the bridge would need learning on of course, but how things work with the bridge according to the flags, they should also work in the offloaded case if you ask me. There should be no difference between the two, thus MAB in drivers would have to be with learning on.
Am I proposing for things to work differently in the offload and software case, and not realizing it? :-/
The essence of my proposal was to send a bug fix now which denies BR_LEARNING to be set together with BR_PORT_LOCKED. The fact that link-local traffic is learned by the software bridge is something unintended as far as I understand.
You tried to fix it here, and as far as I could search in my inbox, that didn't go anywhere: https://lore.kernel.org/netdev/47d8d747-54ef-df52-3b9c-acb9a77fa14a@blackwal...
I thought only mv88e6xxx offloads BR_PORT_LOCKED, but now, after searching, I also see prestera has support for it, so let me add Oleksandr Mazur to the discussion as well. I wonder how they deal with this? Has somebody come to rely on learning being enabled on a locked port?
MAB in offloading drivers will have to be with learning on (same as in software). When BR_PORT_LOCKED | BR_LEARNING will be allowed together back in net-next (to denote the MAB configuration), offloading drivers (mv88e6xxx and prestera) will be patched to reject them. They will only accept the two together when they implement MAB support.
Future drivers after this mess has been cleaned up will have to look at the BR_PORT_LOCKED and BR_LEARNING flag in combination, to see which kind of learning is desired on a port (secure, CPU based learning or autonomous learning).
Am I not making sense?
I will try to summarize what I learned from past discussions because I think it is not properly explained in the commit messages.
If you look at the hostapd fork by Westermo [1], you will see that they are authorizing hosts by adding dynamic FDB entries from user space, not
Those dynamic FDB entries are to be dynamic ATU entries by a patch set that I have ready, but which I have not submitted as I was expecting to submit it after this patch set was accepted.
The important aspect of Dynamic ATU entries is that the HW refreshes the ATU entries with an active host.
static ones. Someone from Westermo will need to confirm this, but I
I represent WesterMo in the upstreaming of these patches, and can confirm that both for hostapd and the MAB solution, WesterMo authorizes by using dynamic entries.
guess the reasons are that a) They want hosts that became silent to lose their authentication after the aging time b) They want hosts to lose their authentication when the carrier of the bridge port goes down. This will cause the bridge driver to flush dynamic FDB entries, but not static ones. Otherwise, an attacker with physical access to the switch and knowledge of the MAC address of the authenticated host can connect a different (malicious) host that will be able to communicate through the bridge.
Seems correct, only that it must be specified that it must be the switchcore and not the bridge that ages the entries, thus ATU entries.
In the above scenario, learning does not need to be on for the bridge to populate its FDB, but rather for the bridge to refresh the dynamic FDB entries installed by hostapd. This seems like a valid use case and one needs a good reason to break it in future kernels.
Regarding learning from link-local frames, this can be mitigated by [2] without adding additional checks in the bridge. I don't know why this bridge option was originally added, but if it wasn't for this use case, then now it has another use case.
Regarding MAB, from the above you can see that a pure 802.1X implementation that does not involve MAB can benefit from locked bridge ports with learning enabled. It is therefore not accurate to say that one wants MAB merely by enabling learning on a locked port. Given that MAB is a proprietary extension and much less secure than 802.1X, we can assume that there will be deployments out there that do not use MAB and do not care about notifications regarding locked FDB entries. I therefore think that MAB needs to be enabled by a separate bridge port flag that is rejected unless the bridge port is locked and has learning enabled.
Regarding hardware offload, I have an idea (needs testing) on how to make mlxsw work in a similar way to mv88e6xxx. That is, does not involve injecting frames that incurred a miss to the Rx path. If you guys want, I'm willing to take a subset of the patches here, improve the commit message, do some small changes and submit them along with an mlxsw implementation. My intention is not to discredit anyone (I will keep the original authorship), but to help push this forward and give another example of hardware offload.
You are very welcome to help pushing this forward for my sake, I just need to know how it will affect this patch set. :-)
[1] https://github.com/westermo/hostapd/commit/10c584b875a63a9e58b0ad39835282545...
[2] https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id...
On Sat, Oct 22, 2022 at 04:49:50PM +0300, Ido Schimmel wrote:
I will try to summarize what I learned from past discussions because I think it is not properly explained in the commit messages.
If you look at the hostapd fork by Westermo [1], you will see that they are authorizing hosts by adding dynamic FDB entries from user space, not static ones. Someone from Westermo will need to confirm this, but I guess the reasons are that a) They want hosts that became silent to lose their authentication after the aging time b) They want hosts to lose their authentication when the carrier of the bridge port goes down. This will cause the bridge driver to flush dynamic FDB entries, but not static ones. Otherwise, an attacker with physical access to the switch and knowledge of the MAC address of the authenticated host can connect a different (malicious) host that will be able to communicate through the bridge.
Not only is it not well explained, but Hans said back in February that "in the common case you will want to use static entries": https://lore.kernel.org/lkml/867da5viak.fsf@gmail.com/
In the above scenario, learning does not need to be on for the bridge to populate its FDB, but rather for the bridge to refresh the dynamic FDB entries installed by hostapd. This seems like a valid use case and one needs a good reason to break it in future kernels.
Before suggesting any alternatives, I'd like to know more details about how this will work in practice, because I'm aware of the limitations that come with DSA not syncing its hardware FDB with the software bridge.
So you add a dynamic FDB entry from user space, it gets propagated to hardware via SWITCHDEV_FDB_ADD_TO_DEVICE, and from there on, they have completely independent ageing timers.
You'll still suffer interruptions in authorization, if the software FDB entry expires because it was never refreshed (which will happen if traffic is forwarded autonomously and not seen by software). And at this stage, you could just add static FDB entries which you periodically delete from user space, since the effect would be equivalent.
If the mitigation to that is going to involve the extern_learn flag, the whole point becomes moot (for mv88e6xxx), since FDB refreshing does not happen in the bridge driver in that case (so the learning flag can be whatever).
Regarding learning from link-local frames, this can be mitigated by [2] without adding additional checks in the bridge. I don't know why this bridge option was originally added, but if it wasn't for this use case, then now it has another use case.
There is still the problem that link-local learning is on by default (follows the BR_LEARNING setting of the port). I don't feel exactly comfortable with the fact that it's easy for a user to miss this and leave the port completely insecure.
Regarding MAB, from the above you can see that a pure 802.1X implementation that does not involve MAB can benefit from locked bridge ports with learning enabled. It is therefore not accurate to say that one wants MAB merely by enabling learning on a locked port. Given that MAB is a proprietary extension and much less secure than 802.1X, we can assume that there will be deployments out there that do not use MAB and do not care about notifications regarding locked FDB entries. I therefore think that MAB needs to be enabled by a separate bridge port flag that is rejected unless the bridge port is locked and has learning enabled.
I had missed the detail that dynamic FDB entries will be refreshed only with "learning" on. It makes the picture more complete. Only this is said in "man bridge":
learning on or learning off
Controls whether a given port will learn MAC addresses from received traffic or not. If learning is off, the bridge will end up flooding any traffic for which it has no FDB entry. By default this flag is on.
Can live with MAB being a separate flag if it comes to that, as long as 'learning' will continue to have its own specific meaning, independent of it (right now that meaning is subtle and undocumented, but makes sense).
Regarding hardware offload, I have an idea (needs testing) on how to make mlxsw work in a similar way to mv88e6xxx. That is, does not involve injecting frames that incurred a miss to the Rx path. If you guys want, I'm willing to take a subset of the patches here, improve the commit message, do some small changes and submit them along with an mlxsw implementation. My intention is not to discredit anyone (I will keep the original authorship), but to help push this forward and give another example of hardware offload.
[1] https://github.com/westermo/hostapd/commit/10c584b875a63a9e58b0ad39835282545...
[2] https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id...
I think it would be very nice if you could do that. As a middle ground between mv88e6xxx and mlxsw, I can also try to build a setup on ocelot (which should trap frames with MAC SA misses in a similar way to mlxsw, but does also not sync its FDB with the bridge, similar to the mv88e6xxx. Not sure what to do with dynamic FDB entries).
If only I would figure out how to configure that hostapd fork (something which I never did before).
Hans, would it be possible to lay out some usage instructions for this fork?
On Sat, Oct 22, 2022 at 05:49:51PM +0300, Vladimir Oltean wrote:
On Sat, Oct 22, 2022 at 04:49:50PM +0300, Ido Schimmel wrote:
In the above scenario, learning does not need to be on for the bridge to populate its FDB, but rather for the bridge to refresh the dynamic FDB entries installed by hostapd. This seems like a valid use case and one needs a good reason to break it in future kernels.
Before suggesting any alternatives, I'd like to know more details about how this will work in practice, because I'm aware of the limitations that come with DSA not syncing its hardware FDB with the software bridge.
So you add a dynamic FDB entry from user space, it gets propagated to hardware via SWITCHDEV_FDB_ADD_TO_DEVICE, and from there on, they have completely independent ageing timers.
You'll still suffer interruptions in authorization, if the software FDB entry expires because it was never refreshed (which will happen if traffic is forwarded autonomously and not seen by software). And at this stage, you could just add static FDB entries which you periodically delete from user space, since the effect would be equivalent.
If the mitigation to that is going to involve the extern_learn flag, the whole point becomes moot (for mv88e6xxx), since FDB refreshing does not happen in the bridge driver in that case (so the learning flag can be whatever).
Once a dynamic FDB entry is installed in hardware the software bridge no longer sees the majority of the traffic that refreshes this entry, which means we need to prevent the bridge from mindlessly ageing and removing the entry. I see two options, depending on the capabilities of the underlying hardware implementation:
1. If the hardware is capable of generating an event that an entry was aged out, then once the dynamic entry was installed in hardware the device driver needs to let the bridge driver know that it is no longer responsible for ageing the entry. This can be done by either marking the entry as extern_learn or offloaded. The latter is more accurate, but we need to patch br_fdb_cleanup(). Upon an ageing event, the device driver will tell the bridge to remove the entry via SWITCHDEV_FDB_DEL_TO_BRIDGE.
2. If the hardware is unable to generate ageing events, but allows querying the activity of the entry, then the device driver will need to emulate the behavior of the first option. This allows us to use the same interface between the bridge and device driver regardless of the underlying hardware implementation. My feeling is that most devices fall in the first category.
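To make the second option a bit more concrete, here is a minimal user-space sketch (not kernel code) of a driver emulating ageing events by polling an activity bit. Everything in it is an assumption made for illustration: `struct hw_fdb_entry`, the `hit` bit, `poll_fdb_activity()` and the `notify_del` callback are hypothetical names, and the callback merely stands in for sending SWITCHDEV_FDB_DEL_TO_BRIDGE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one offloaded dynamic FDB entry. */
struct hw_fdb_entry {
	bool in_use;		/* entry is installed in hardware */
	bool hit;		/* assumed "activity" bit, set by hardware on traffic */
	unsigned int idle;	/* consecutive polls without activity */
};

/*
 * Poll every entry once. An entry that shows activity is refreshed (the
 * activity bit is cleared and the idle counter reset). An entry that stays
 * idle for max_idle consecutive polls is removed and reported through
 * notify_del(), which stands in for SWITCHDEV_FDB_DEL_TO_BRIDGE.
 * Returns the number of entries aged out in this pass.
 */
size_t poll_fdb_activity(struct hw_fdb_entry *tbl, size_t n,
			 unsigned int max_idle,
			 void (*notify_del)(size_t idx))
{
	size_t aged = 0;

	for (size_t i = 0; i < n; i++) {
		if (!tbl[i].in_use)
			continue;

		if (tbl[i].hit) {
			tbl[i].hit = false;	/* read-and-clear */
			tbl[i].idle = 0;
			continue;
		}

		if (++tbl[i].idle >= max_idle) {
			tbl[i].in_use = false;
			if (notify_del)
				notify_del(i);
			aged++;
		}
	}

	return aged;
}
```

With a helper like this running from a periodic worker, the bridge would see the same "entry aged out" notification as in the first option, which is the point of keeping a single interface between the bridge and the driver.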
Regarding learning from link-local frames, this can be mitigated by [2] without adding additional checks in the bridge. I don't know why this bridge option was originally added, but if it wasn't for this use case, then now it has another use case.
There is still the problem that link-local learning is on by default (follows the BR_LEARNING setting of the port). I don't feel exactly comfortable with the fact that it's easy for a user to miss this and leave the port completely insecure.
I'm willing to patch the man page and add a note near the 'locked' bridge port option.
Regarding MAB, from the above you can see that a pure 802.1X implementation that does not involve MAB can benefit from locked bridge ports with learning enabled. It is therefore not accurate to say that one wants MAB merely by enabling learning on a locked port. Given that MAB is a proprietary extension and much less secure than 802.1X, we can assume that there will be deployments out there that do not use MAB and do not care about notifications regarding locked FDB entries. I therefore think that MAB needs to be enabled by a separate bridge port flag that is rejected unless the bridge port is locked and has learning enabled.
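The constraint on the proposed flag can be sketched as a small validation helper. This is only an illustration under stated assumptions: the `PORT_*` bits and `validate_port_flags()` are hypothetical and are not the kernel's BR_* definitions or the bridge's actual netlink validation.

```c
#include <errno.h>

/* Hypothetical flag bits for this sketch; not the kernel's BR_* values. */
#define PORT_LEARNING	(1u << 0)
#define PORT_LOCKED	(1u << 1)
#define PORT_MAB	(1u << 2)

/*
 * Accept any flag combination that does not request MAB. When MAB is
 * requested, require the port to be both locked and learning, and reject
 * the configuration with -EINVAL otherwise.
 */
int validate_port_flags(unsigned int flags)
{
	if (!(flags & PORT_MAB))
		return 0;

	if (!(flags & PORT_LOCKED) || !(flags & PORT_LEARNING))
		return -EINVAL;

	return 0;
}
```

A pure 802.1X deployment (locked plus learning, no MAB) passes unchanged, so it never opts in to locked FDB entry notifications by accident.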
I had missed the detail that dynamic FDB entries will be refreshed only with "learning" on. It makes the picture more complete. Only this is said in "man bridge":
learning on or learning off: Controls whether a given port will learn MAC addresses from received traffic or not. If learning is off, the bridge will end up flooding any traffic for which it has no FDB entry. By default this flag is on.
Can live with MAB being a separate flag if it comes to that, as long as 'learning' will continue to have its own specific meaning, independent of it (right now that meaning is subtle and undocumented, but makes sense).
Yes, I agree it is subtle.
Regarding hardware offload, I have an idea (needs testing) on how to make mlxsw work in a similar way to mv88e6xxx, that is, without injecting frames that incurred a miss into the Rx path. If you guys want, I'm willing to take a subset of the patches here, improve the commit messages, make some small changes and submit them along with an mlxsw implementation. My intention is not to discredit anyone (I will keep the original authorship), but to help push this forward and give another example of hardware offload.
[1] https://github.com/westermo/hostapd/commit/10c584b875a63a9e58b0ad39835282545...
[2] https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id...
I think it would be very nice if you could do that. As a middle ground between mv88e6xxx and mlxsw, I can also try to build a setup on ocelot, which should trap frames with MAC SA misses in a similar way to mlxsw, but also does not sync its FDB with the bridge, similar to mv88e6xxx (not sure what to do with dynamic FDB entries).
Will try to post my patches this week.
If only I could figure out how to configure that hostapd fork (something I have never done before).
Hans, would it be possible to lay out some usage instructions for this fork?
That would be good.
On 2022-10-20 15:25, Vladimir Oltean wrote:
This would not have been needed if dsa_port_set_state() had called dsa_port_fast_age().
Currently it only does that if dp->learning is true. From previous conversations I get the idea that with MAB, port learning will be false. But I don't understand why; isn't MAB CPU-assisted learning? I'm looking at the ocelot hardware support for this and I think it could be implemented using a similar mechanism, but I certainly don't want to add more workarounds such as this in other drivers.
Are there any ways to implement MAB other than through CPU-assisted learning?
We could add one more dp->mab flag which tracks the "mab" brport flag, and extend dsa_port_set_state() to also call dsa_port_fast_age() in that case, but I want to make sure there isn't something extremely obvious I'm missing about the "learning" flag.
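As a rough illustration of that idea, here is a simplified user-space model of the decision. It is a sketch under stated assumptions, not the DSA code: `struct port_model`, `enum stp_state` and `port_needs_fast_age()` are hypothetical stand-ins for struct dsa_port, the BR_STATE_* values and the check inside dsa_port_set_state(), and the set of state transitions that trigger a flush is my assumption about the intended behavior.

```c
#include <stdbool.h>

/* Hypothetical STP states for this sketch; the kernel uses BR_STATE_*. */
enum stp_state {
	STP_DISABLED,
	STP_LISTENING,
	STP_LEARNING,
	STP_FORWARDING,
	STP_BLOCKING,
};

/* Hypothetical, reduced stand-in for struct dsa_port. */
struct port_model {
	enum stp_state stp_state;
	bool learning;	/* tracks the "learning" brport flag */
	bool mab;	/* proposed: tracks the "mab" brport flag */
};

/*
 * Decide whether moving the port to state "next" should fast-age its
 * dynamically learned entries. The flush is wanted when the port was able
 * to populate the FDB (autonomous learning or MAB) and moves from a
 * learning-capable state to one where it no longer learns.
 */
bool port_needs_fast_age(const struct port_model *p, enum stp_state next)
{
	bool was_learning = p->stp_state == STP_LEARNING ||
			    p->stp_state == STP_FORWARDING;
	bool stops_learning = next == STP_DISABLED ||
			      next == STP_BLOCKING ||
			      next == STP_LISTENING;

	return (p->learning || p->mab) && was_learning && stops_learning;
}
```

The only change relative to a learning-only check is the `|| p->mab` term, which is why the question of whether "learning" is really off with MAB matters.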
As learning is off on locked ports (see my other response), your dp->mab flag idea might be the way to go; I just need confirmation that it is needed.
@@ -6572,8 +6604,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port,
 	if (flags.mask & BR_MCAST_FLOOD) {
 		bool multicast = !!(flags.val & BR_MCAST_FLOOD);

+		mv88e6xxx_reg_lock(chip);
 		err = chip->info->ops->port_set_mcast_flood(chip, port, multicast);
+		mv88e6xxx_reg_unlock(chip);
 		if (err)
 			goto out;
 	}
@@ -6581,20 +6615,34 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port,
 	if (flags.mask & BR_BCAST_FLOOD) {
 		bool broadcast = !!(flags.val & BR_BCAST_FLOOD);

+		mv88e6xxx_reg_lock(chip);
 		err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast);
+		mv88e6xxx_reg_unlock(chip);
 		if (err)
 			goto out;
 	}

+	if (flags.mask & BR_PORT_MAB) {
+		chip->ports[port].mab = !!(flags.val & BR_PORT_MAB);
+
+		if (!chip->ports[port].mab)
+			err = mv88e6xxx_atu_locked_entry_flush(ds, port);
+		else
+			err = 0;
Again, dsa_port_fast_age() is also called when dp->learning is turned off in dsa_port_bridge_flags(). I don't want to see the mv88e6xxx driver doing this manually.
Maybe I am wrong, but I have only been able to trigger fast ageing by setting the STP state of the port to blocked...
On Thu, Oct 20, 2022 at 11:09:40PM +0200, netdev@kapio-technology.com wrote:
Again, dsa_port_fast_age() is also called when dp->learning is turned off in dsa_port_bridge_flags(). I don't want to see the mv88e6xxx driver doing this manually.
Maybe I am wrong, but I have only been able to trigger fast ageing by setting the STP state of the port to blocked...
Maybe you didn't try hard enough? On a DSA bridge port that is up and in the FORWARDING state and with 'learning' on, running "ip link set dev swp0 type bridge_slave learning off" triggers dsa_port_fast_age().
On 2022-10-20 15:25, Vladimir Oltean wrote:
 	if (flags.mask & BR_LEARNING) {
 		bool learning = !!(flags.val & BR_LEARNING);
 		u16 pav = learning ? (1 << port) : 0;

+		mv88e6xxx_reg_lock(chip);
 		err = mv88e6xxx_port_set_assoc_vector(chip, port, pav);
+		mv88e6xxx_reg_unlock(chip);
 		if (err)
 			goto out;
 	}
@@ -6563,8 +6593,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port,
 	if (flags.mask & BR_FLOOD) {
 		bool unicast = !!(flags.val & BR_FLOOD);

+		mv88e6xxx_reg_lock(chip);
 		err = chip->info->ops->port_set_ucast_flood(chip, port, unicast);
+		mv88e6xxx_reg_unlock(chip);
 		if (err)
 			goto out;
 	}
@@ -6572,8 +6604,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port,
 	if (flags.mask & BR_MCAST_FLOOD) {
 		bool multicast = !!(flags.val & BR_MCAST_FLOOD);

+		mv88e6xxx_reg_lock(chip);
 		err = chip->info->ops->port_set_mcast_flood(chip, port, multicast);
+		mv88e6xxx_reg_unlock(chip);
 		if (err)
 			goto out;
 	}
@@ -6581,20 +6615,34 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port,
 	if (flags.mask & BR_BCAST_FLOOD) {
 		bool broadcast = !!(flags.val & BR_BCAST_FLOOD);

+		mv88e6xxx_reg_lock(chip);
 		err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast);
+		mv88e6xxx_reg_unlock(chip);
 		if (err)
 			goto out;
 	}

+	if (flags.mask & BR_PORT_MAB) {
+		chip->ports[port].mab = !!(flags.val & BR_PORT_MAB);
+
+		if (!chip->ports[port].mab)
+			err = mv88e6xxx_atu_locked_entry_flush(ds, port);
+		else
+			err = 0;
Again, dsa_port_fast_age() is also called when dp->learning is turned off in dsa_port_bridge_flags(). I don't want to see the mv88e6xxx driver doing this manually.
But I think turning MAB off should clear the ALE entries regardless, as the port can remain locked and still need port association. Or do you want them to just age out normally in that case, lingering for up to the bridge ageing time?
On Sat, Oct 22, 2022 at 09:31:06AM +0200, netdev@kapio-technology.com wrote:
But I think turning MAB off should clear the ALE entries regardless, as the port can remain locked and still need port association. Or do you want them to just age out normally in that case, lingering for up to the bridge ageing time?
Even without BR_PORT_LOCKED, I find it normal that dynamically learned FDB entries are forcefully aged out when BR_LEARNING is turned off, instead of lingering on until they expire.
This does not happen in the software bridge, and I did not understand why (I suspected some backwards compatibility reasons), and for this reason, it is only from within DSA that we are forcing this behavior to take place. In dsa_port_bridge_flags(), when BR_LEARNING is turned off, we call dsa_port_fast_age() which also calls SWITCHDEV_FDB_FLUSH_TO_BRIDGE (and this clears the bridge software FDB of dynamically learned entries).
I very much expect the same thing with MAB and BR_FDB_LOCKED entries, that they go away when the BR_PORT_MAB/BR_LEARNING flag (whichever way we call it) is unset.
Now, if the bridge should initiate the flushing, or still DSA, is perhaps a topic for further discussion. Given that BR_FDB_LOCKED entries are new, maybe the bridge could do it in this case (no backwards compatibility to handle).
Currently the DSA logic mentioned above is bypassed, because we treat MAB and autonomous learning differently. If we accepted that MAB is still a form of learning (managed through BR_LEARNING+BR_PORT_LOCKED), then the DSA logic would kick in, and both the software bridge and the hardware driver would have a hook to clean up the BR_FDB_LOCKED entries, plus anything else that is dynamic. The DSA logic would also kick in if we treated BR_PORT_MAB within DSA like BR_LEARNING, which basically amounts to the same thing, except for the confusing (IMO) UAPI of having a flag (BR_PORT_MAB) which is basically a form of learning that isn't controlled by the BR_LEARNING flag (which is undefined and unclear if it should be set or not, in BR_PORT_LOCKED mode).
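The expectation described above (clearing a learning-like flag flushes everything that was learned dynamically, locked entries included) can be modeled in a few lines. This is a user-space sketch under stated assumptions: `PORT_LEARNING`, `PORT_MAB`, `struct fdb_model`, `flags_need_flush()` and `flush_port()` are hypothetical names, not the bridge or DSA implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's BR_* values. */
#define PORT_LEARNING	(1u << 0)
#define PORT_MAB	(1u << 1)

/* Hypothetical, reduced FDB entry. */
struct fdb_model {
	int port;
	bool valid;
	bool is_static;	/* static entries survive a flush */
	bool locked;	/* stands in for BR_FDB_LOCKED */
};

/* True when a learning-like flag goes from set to cleared. */
bool flags_need_flush(unsigned int old_flags, unsigned int new_flags)
{
	unsigned int cleared = old_flags & ~new_flags;

	return (cleared & (PORT_LEARNING | PORT_MAB)) != 0;
}

/*
 * Drop every non-static entry on the given port, whether or not it is
 * locked. Returns the number of entries removed.
 */
size_t flush_port(struct fdb_model *tbl, size_t n, int port)
{
	size_t flushed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!tbl[i].valid || tbl[i].port != port || tbl[i].is_static)
			continue;
		tbl[i].valid = false;
		flushed++;
	}

	return flushed;
}
```

Whether the bridge or DSA ends up calling the equivalent of `flush_port()` is exactly the open question; the sketch only shows that treating MAB as a learning-like flag gives both cases the same hook.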
Blackhole FDB entries can now be added, deleted or replaced in the driver ATU.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
---
 drivers/net/dsa/mv88e6xxx/chip.c | 62 +++++++++++++++++++++++++++++---
 1 file changed, 58 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 71843fe87f77..7a7cd1f0e735 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2735,6 +2735,58 @@ static int mv88e6xxx_vlan_msti_set(struct dsa_switch *ds,
 	return err;
 }

+static int mv88e6xxx_blackhole_fdb_loadpurge(struct dsa_switch *ds,
+					     const unsigned char *addr, u16 vid, u8 state)
+{
+	struct mv88e6xxx_chip *chip = ds->priv;
+	struct mv88e6xxx_atu_entry entry;
+	struct mv88e6xxx_vtu_entry vlan;
+
+	u16 fid = 0;
+	int err;
+
+	if (vid == 0) {
+		fid = MV88E6XXX_FID_BRIDGED;
+	} else {
+		mv88e6xxx_reg_lock(chip);
+		err = mv88e6xxx_vtu_get(chip, vid, &vlan);
+		mv88e6xxx_reg_unlock(chip);
+		if (err)
+			return err;
+
+		/* switchdev expects -EOPNOTSUPP to honor software VLANs */
+		if (!vlan.valid)
+			return -EOPNOTSUPP;
+
+		fid = vlan.fid;
+	}
+
+	ether_addr_copy(entry.mac, addr);
+	entry.portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS;
+	entry.state = state;
+	entry.trunk = false;
+
+	mv88e6xxx_reg_lock(chip);
+	err = mv88e6xxx_g1_atu_loadpurge(chip, fid, &entry);
+	mv88e6xxx_reg_unlock(chip);
+
+	return err;
+}
+
+static int mv88e6xxx_blackhole_fdb_add(struct dsa_switch *ds,
+				       const unsigned char *addr, u16 vid)
+{
+	return mv88e6xxx_blackhole_fdb_loadpurge(ds, addr, vid,
+						 MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
+}
+
+static int mv88e6xxx_blackhole_fdb_del(struct dsa_switch *ds,
+				       const unsigned char *addr, u16 vid)
+{
+	return mv88e6xxx_blackhole_fdb_loadpurge(ds, addr, vid,
+						 MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED);
+}
+
 static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 				  const unsigned char *addr, u16 vid,
 				  u16 fdb_flags, struct dsa_db db)
@@ -2742,9 +2794,10 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;

-	/* Ignore entries with flags set */
-	if (fdb_flags)
+	if (fdb_flags & DSA_FDB_FLAG_LOCKED)
 		return 0;
+	if (fdb_flags & DSA_FDB_FLAG_BLACKHOLE)
+		return mv88e6xxx_blackhole_fdb_add(ds, addr, vid);

 	if (mv88e6xxx_port_is_locked(chip, port))
 		mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
@@ -2765,9 +2818,10 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
 	bool locked_found = false;
 	int err = 0;

-	/* Ignore entries with flags set */
-	if (fdb_flags)
+	if (fdb_flags & DSA_FDB_FLAG_LOCKED)
 		return 0;
+	if (fdb_flags & DSA_FDB_FLAG_BLACKHOLE)
+		return mv88e6xxx_blackhole_fdb_del(ds, addr, vid);

 	if (mv88e6xxx_port_is_locked(chip, port))
 		locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
On Tue, Oct 18, 2022 at 06:56:18PM +0200, Hans J. Schultz wrote:
Blackhole FDB entries can now be added, deleted or replaced in the driver ATU.
Why is this necessary, why is it useful?
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
 static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 				  const unsigned char *addr, u16 vid,
 				  u16 fdb_flags, struct dsa_db db)
@@ -2742,9 +2794,10 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;

-	/* Ignore entries with flags set */
-	if (fdb_flags)
+	if (fdb_flags & DSA_FDB_FLAG_LOCKED)
 		return 0;
I don't understand this. If no driver looks at DSA_FDB_FLAG_LOCKED (not even mv88e6xxx, up until the end of the series), then why was it propagated all the way in the first place?
+	if (fdb_flags & DSA_FDB_FLAG_BLACKHOLE)
+		return mv88e6xxx_blackhole_fdb_add(ds, addr, vid);

 	if (mv88e6xxx_port_is_locked(chip, port))
 		mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
@@ -2765,9 +2818,10 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
 	bool locked_found = false;
 	int err = 0;

-	/* Ignore entries with flags set */
-	if (fdb_flags)
+	if (fdb_flags & DSA_FDB_FLAG_LOCKED)
 		return 0;
+	if (fdb_flags & DSA_FDB_FLAG_BLACKHOLE)
+		return mv88e6xxx_blackhole_fdb_del(ds, addr, vid);

 	if (mv88e6xxx_port_is_locked(chip, port))
 		locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
--
2.34.1
Verify that the MacAuth/MAB mechanism works by adding a FDB entry with the locked flag set, denying access until the FDB entry is replaced with a FDB entry without the locked flag set. Also verify that FDB entries cannot roam from an unlocked port to a locked port.
Add test of blackhole fdb entries, verifying that there is no forwarding to a blackhole entry from any port, and that the blackhole entry can be replaced.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
---
 .../selftests/drivers/net/dsa/Makefile        |   1 +
 .../testing/selftests/net/forwarding/Makefile |   1 +
 .../net/forwarding/bridge_blackhole_fdb.sh    | 131 ++++++++++++++++++
 .../net/forwarding/bridge_locked_port.sh      |  99 ++++++++++++-
 tools/testing/selftests/net/forwarding/lib.sh |  17 +++
 5 files changed, 248 insertions(+), 1 deletion(-)
 create mode 100755 tools/testing/selftests/net/forwarding/bridge_blackhole_fdb.sh

diff --git a/tools/testing/selftests/drivers/net/dsa/Makefile b/tools/testing/selftests/drivers/net/dsa/Makefile
index c393e7b73805..c0a75d869763 100644
--- a/tools/testing/selftests/drivers/net/dsa/Makefile
+++ b/tools/testing/selftests/drivers/net/dsa/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0+ OR MIT

 TEST_PROGS = bridge_locked_port.sh \
+	bridge_blackhole_fdb.sh \
 	bridge_mdb.sh \
 	bridge_mld.sh \
 	bridge_vlan_aware.sh \
diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile
index a9c5c1be5088..7d832020937f 100644
--- a/tools/testing/selftests/net/forwarding/Makefile
+++ b/tools/testing/selftests/net/forwarding/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0+ OR MIT

 TEST_PROGS = bridge_igmp.sh \
+	bridge_blackhole_fdb.sh \
 	bridge_locked_port.sh \
 	bridge_mdb.sh \
 	bridge_mdb_port_down.sh \
diff --git a/tools/testing/selftests/net/forwarding/bridge_blackhole_fdb.sh b/tools/testing/selftests/net/forwarding/bridge_blackhole_fdb.sh
new file mode 100755
index 000000000000..42d9367f7339
--- /dev/null
+++ b/tools/testing/selftests/net/forwarding/bridge_blackhole_fdb.sh
@@ -0,0 +1,131 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+ALL_TESTS="blackhole_fdb"
+NUM_NETIFS=4
+source tc_common.sh
+source lib.sh
+
+h1_create()
+{
+	simple_if_init $h1 192.0.2.1/24
+}
+
+h1_destroy()
+{
+	simple_if_fini $h1 192.0.2.1/24
+}
+
+h2_create()
+{
+	simple_if_init $h2 192.0.2.2/24
+}
+
+h2_destroy()
+{
+	simple_if_fini $h2 192.0.2.2/24
+}
+
+switch_create()
+{
+	ip link add dev br0 type bridge vlan_filtering 1
+
+	ip link set dev $swp1 master br0
+	ip link set dev $swp2 master br0
+
+	ip link set dev br0 up
+	ip link set dev $swp1 up
+	ip link set dev $swp2 up
+
+	tc qdisc add dev $swp2 clsact
+}
+
+switch_destroy()
+{
+	tc qdisc del dev $swp2 clsact
+
+	ip link set dev $swp2 down
+	ip link set dev $swp1 down
+
+	ip link del dev br0
+}
+
+setup_prepare()
+{
+	h1=${NETIFS[p1]}
+	swp1=${NETIFS[p2]}
+	h2=${NETIFS[p3]}
+	swp2=${NETIFS[p4]}
+
+	vrf_prepare
+
+	h1_create
+	h2_create
+
+	switch_create
+}
+
+cleanup()
+{
+	pre_cleanup
+
+	switch_destroy
+
+	h2_destroy
+	h1_destroy
+
+	vrf_cleanup
+}
+
+# Check that there is no egress with blackhole entry and that blackhole entries
+# can be replaced
+blackhole_fdb()
+{
+	RET=0
+
+	check_blackhole_fdb_support || return 0
+
+	tc filter add dev $swp2 egress protocol ip pref 1 handle 1 flower \
+		dst_ip 192.0.2.2 ip_proto udp dst_port 12345 action pass
+
+	$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \
+		-a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
+
+	tc_check_packets "dev $swp2 egress" 1 1
+	check_err $? "Packet not seen on egress before adding blackhole entry"
+
+	bridge fdb replace `mac_get $h2` dev br0 vlan 1 blackhole
+	bridge fdb get `mac_get $h2` br br0 vlan 1 | grep -q blackhole
+	check_err $? "Blackhole entry not found"
+
+	$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \
+		-a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
+
+	tc_check_packets "dev $swp2 egress" 1 1
+	check_err $? "Packet seen on egress after adding blackhole entry"
+
+	# Check blackhole entries can be replaced.
+	bridge fdb replace `mac_get $h2` dev $swp2 vlan 1 master static
+	bridge fdb get `mac_get $h2` br br0 vlan 1 | grep -q blackhole
+	check_fail $? "Blackhole entry found after replacement"
+
+	$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \
+		-a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
+
+	tc_check_packets "dev $swp2 egress" 1 2
+	check_err $? "Packet not seen on egress after replacing blackhole entry"
+
+	bridge fdb del `mac_get $h2` dev $swp2 vlan 1 master static
+	tc filter del dev $swp2 egress protocol ip pref 1 handle 1 flower
+
+	log_test "Blackhole FDB entry"
+}
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+
+tests_run
+
+exit $EXIT_STATUS
diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
index 5b02b6b60ce7..f0bc0bcbc246 100755
--- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
+++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
@@ -1,7 +1,15 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0

-ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan"
+ALL_TESTS="
+	locked_port_ipv4
+	locked_port_ipv6
+	locked_port_vlan
+	locked_port_mab
+	locked_port_station_move
+	locked_port_mab_station_move
+"
+
 NUM_NETIFS=4
 CHECK_TC="no"
 source lib.sh
@@ -166,6 +174,95 @@ locked_port_ipv6()
 	log_test "Locked port ipv6"
 }

+locked_port_mab()
+{
+	RET=0
+	check_port_mab_support || return 0
+
+	ping_do $h1 192.0.2.2
+	check_err $? "MAB: Ping did not work before locking port"
+
+	bridge link set dev $swp1 locked on mab on
+
+	ping_do $h1 192.0.2.2
+	check_fail $? "MAB: Ping worked on mab enabled port without FDB entry"
+
+	bridge fdb get `mac_get $h1` br br0 vlan 1 | grep "dev $swp1" | grep -q "locked"
+	check_err $? "MAB: No locked FDB entry after ping on mab enabled port"
+
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
+
+	ping_do $h1 192.0.2.2
+	check_err $? "MAB: Ping did not work with FDB entry without locked flag"
+
+	bridge fdb del `mac_get $h1` dev $swp1 master
+	bridge link set dev $swp1 locked off mab off
+
+	log_test "Locked port MAB"
+}
+
+# Check that entries cannot roam from an unlocked port to a locked port.
+locked_port_station_move()
+{
+	local mac=a0:b0:c0:c0:b0:a0
+
+	RET=0
+	check_locked_port_support || return 0
+
+	bridge link set dev $swp1 locked on learning on
+
+	$MZ $h1 -q -c 5 -d 100msec -t udp -a $mac -b rand
+	bridge fdb get $mac br br0 vlan 1 &> /dev/null
+	check_fail $? "Locked port station move: FDB entry on first injection"
+
+	$MZ $h2 -q -c 5 -d 100msec -t udp -a $mac -b rand
+	bridge fdb get $mac br br0 vlan 1 | grep -q "dev $swp2"
+	check_err $? "Locked port station move: Entry not found on unlocked port"
+
+	$MZ $h1 -q -c 5 -d 100msec -t udp -a $mac -b rand
+	bridge fdb get $mac br br0 vlan 1 | grep -q "dev $swp1"
+	check_fail $? "Locked port station move: entry roamed to locked port"
+
+	bridge fdb del $mac vlan 1 dev $swp2 master
+	bridge link set dev $swp1 locked off learning off
+
+	log_test "Locked port station move"
+}
+
+# Check that entries can roam from a locked port if blackhole FDB flag is not
+# set.
+locked_port_mab_station_move()
+{
+	local mac=10:20:30:30:20:10
+
+	RET=0
+	check_port_mab_support || return 0
+
+	bridge link set dev $swp1 locked on mab on
+
+	$MZ $h1 -q -c 5 -d 100msec -t udp -a $mac -b rand
+	if bridge fdb show dev $swp1 | grep "$mac vlan 1" | grep -q "blackhole"; then
+		echo "SKIP: Roaming not possible with blackhole flag, skipping test..."
+		bridge link set dev $swp1 locked off mab off
+		return $ksft_skip
+	fi
+
+	bridge fdb get $mac br br0 vlan 1 | grep "dev $swp1" | grep -q "locked"
+	check_err $? "MAB station move: no locked entry on first injection"
+
+	$MZ $h2 -q -c 5 -d 100msec -t udp -a $mac -b rand
+	bridge fdb get $mac br br0 vlan 1 | grep -q "dev $swp2"
+	check_err $? "MAB station move: roamed entry not found"
+
+	bridge fdb get $mac br br0 vlan 1 | grep -q "locked"
+	check_fail $? "MAB station move: roamed entry to unlocked port had locked flag on"
+
+	bridge fdb del $mac vlan 1 dev $swp2 master
+	bridge link set dev $swp1 locked off mab off
+
+	log_test "Locked port MAB station move"
+}
+
 trap cleanup EXIT

 setup_prepare
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 3ffb9d6c0950..d6abe873665c 100755
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -137,6 +137,23 @@ check_locked_port_support()
 	fi
 }

+check_port_mab_support()
+{
+	if ! bridge -d link show | grep -q "mab"; then
+		echo "SKIP: iproute2 too old; MacAuth feature not supported."
+		return $ksft_skip
+	fi
+}
+
+check_blackhole_fdb_support()
+{
+	bridge fdb help 2>&1|grep blackhole &> /dev/null
+	if [[ $? -ne 0 ]]; then
+		echo "SKIP: Blackhole fdb feature not supported."
+		return $ksft_skip
+	fi
+}
+
 if [[ "$(id -u)" -ne 0 ]]; then
 	echo "SKIP: need root privileges"
 	exit $ksft_skip
On Tue, Oct 18, 2022 at 06:56:19PM +0200, Hans J. Schultz wrote:
Verify that the MacAuth/MAB mechanism works by adding a FDB entry with the locked flag set, denying access until the FDB entry is replaced with a FDB entry without the locked flag set. Also verify that FDB entries cannot roam from an unlocked port to a locked port.
Add test of blackhole fdb entries, verifying that there is no forwarding to a blackhole entry from any port, and that the blackhole entry can be replaced.
Signed-off-by: Hans J. Schultz netdev@kapio-technology.com
Reviewed-by: Ido Schimmel idosch@nvidia.com Tested-by: Ido Schimmel idosch@nvidia.com
On Tue, 18 Oct 2022 18:56:07 +0200 Hans J. Schultz wrote:
This patch set extends the locked port feature for devices that are behind a locked port, but do not have the ability to authorize themselves as a supplicant using IEEE 802.1X. Such devices can be printers, meters or anything related to fixed installations. Instead of 802.1X authorization, devices can get access based on their MAC addresses being whitelisted.
FWIW half of this posting got stuck on the "email pipes" for a day.. somehow. Let's give Ido and others a chance to have a look but you'll need to repost even if it's flawless because the build bots can't deal with a delay that long :(
On Wed, Oct 19, 2022 at 11:58:09AM -0700, Jakub Kicinski wrote:
FWIW half of this posting got stuck on the "email pipes" for a day.. somehow. Let's give Ido and others a chance to have a look but you'll need to repost even if it's flawless because the build bots can't deal with a delay that long :(
Will review today