This is a series of patches used in OpenWrt's v4.9 backports that seem to be of generic interest to v4.9.y
As a bunch of commits hit networking I CC DaveM and here so the can nod/object to these, I think the want a bit of control over what goes into networking stable.
I avoided two patch sets dealing with IPv6 tunneling and netfilter that seem to still be fixed and I feel generally incompetent about.
For the remaining patches I cherry-picked the upstream commits except for (8/10) "netfilter: nf_tables: fix mismatch in big-endian system" where I used OpenWrt's backport.
The list of upstream commits in patch order: 6db6f0eae6052b70885562e1733896647ec1d807 "bridge: multicast to unicast" e9156cd26a495a18706e796f02a81fee41ec14f4 "smsc95xx: Use skb_cow_head to deal with cloned skbs" 6bc6895bdd6744e0136eaa4a11fbdb20a7db4e40 "ch9200: use skb_cow_head() to deal with cloned skbs" 39fba7835aacda65284a86e611774cbba71dac20 "kaweth: use skb_cow_head() to deal with cloned skbs" 854826c9d526fd81077742c3b000e3f7fcaef3ce "ubifs: Drop softlimit and delta fields from struct ubifs_wbuf" 1b7fc2c0069f3864a3dda15430b7aded31c0bfcc "ubifs: Use dirty_writeback_interval value for wbuf timer" cd4b1e34655d46950c065d9284b596cd8d7b28cd "usb: dwc2: Remove unnecessary kfree" 10596608c4d62cb8c1c2b806debcbd32fe657e7 "netfilter: nf_tables: fix mismatch in big-endian system" 65cab850f0eeaa9180bd2e10a231964f33743edf "net: Allow class-e address assignment via ifconfig ioctl" 6926e041a8920c8ec27e4e155efa760aa01551fd "uapi/if_ether.h: prevent redefinition of struct ethhdr"
Dave Taht (1): net: Allow class-e address assignment via ifconfig ioctl
Eric Dumazet (2): ch9200: use skb_cow_head() to deal with cloned skbs kaweth: use skb_cow_head() to deal with cloned skbs
Felix Fietkau (1): bridge: multicast to unicast
Hauke Mehrtens (1): uapi/if_ether.h: prevent redefinition of struct ethhdr
James Hughes (1): smsc95xx: Use skb_cow_head to deal with cloned skbs
John Youn (1): usb: dwc2: Remove unnecessary kfree
Liping Zhang (1): netfilter: nf_tables: fix mismatch in big-endian system
Rafał Miłecki (2): ubifs: Drop softlimit and delta fields from struct ubifs_wbuf ubifs: Use dirty_writeback_interval value for wbuf timer
drivers/net/usb/ch9200.c | 9 +-- drivers/net/usb/kaweth.c | 18 ++---- drivers/net/usb/smsc95xx.c | 12 ++-- drivers/usb/dwc2/hcd.c | 1 - fs/ubifs/io.c | 18 +++--- fs/ubifs/ubifs.h | 9 --- include/linux/if_bridge.h | 1 + include/net/netfilter/nf_tables.h | 29 ++++++++++ include/uapi/linux/if_ether.h | 3 + include/uapi/linux/if_link.h | 1 + include/uapi/linux/in.h | 10 +++- include/uapi/linux/libc-compat.h | 6 ++ net/bridge/br_forward.c | 39 ++++++++++++- net/bridge/br_mdb.c | 2 +- net/bridge/br_multicast.c | 90 +++++++++++++++++++++-------- net/bridge/br_netlink.c | 5 ++ net/bridge/br_private.h | 3 +- net/bridge/br_sysfs_if.c | 2 + net/ipv4/devinet.c | 5 +- net/ipv4/ipconfig.c | 2 + net/ipv4/netfilter/nft_masq_ipv4.c | 8 +-- net/ipv4/netfilter/nft_redir_ipv4.c | 8 +-- net/ipv6/netfilter/nft_masq_ipv6.c | 8 +-- net/ipv6/netfilter/nft_redir_ipv6.c | 8 +-- net/netfilter/nft_ct.c | 10 ++-- net/netfilter/nft_meta.c | 42 +++++++------- net/netfilter/nft_nat.c | 8 +-- 27 files changed, 235 insertions(+), 122 deletions(-)
From: Felix Fietkau nbd@nbd.name
Implements an optional, per bridge port flag and feature to deliver multicast packets to any host on the according port via unicast individually. This is done by copying the packet per host and changing the multicast destination MAC to a unicast one accordingly.
multicast-to-unicast works on top of the multicast snooping feature of the bridge. Which means unicast copies are only delivered to hosts which are interested in it and signalized this via IGMP/MLD reports previously.
This feature is intended for interface types which have a more reliable and/or efficient way to deliver unicast packets than broadcast ones (e.g. wifi).
However, it should only be enabled on interfaces where no IGMPv2/MLDv1 report suppression takes place. This feature is disabled by default.
The initial patch and idea is from Felix Fietkau.
Signed-off-by: Felix Fietkau nbd@nbd.name [linus.luessing@c0d3.blue: various bug + style fixes, commit message] Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue Reviewed-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net --- include/linux/if_bridge.h | 1 + include/uapi/linux/if_link.h | 1 + net/bridge/br_forward.c | 39 +++++++++++++++- net/bridge/br_mdb.c | 2 +- net/bridge/br_multicast.c | 90 ++++++++++++++++++++++++++---------- net/bridge/br_netlink.c | 5 ++ net/bridge/br_private.h | 3 +- net/bridge/br_sysfs_if.c | 2 + 8 files changed, 114 insertions(+), 29 deletions(-)
diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h index c6587c01d951..debc9d5904e5 100644 --- a/include/linux/if_bridge.h +++ b/include/linux/if_bridge.h @@ -46,6 +46,7 @@ struct br_ip_list { #define BR_LEARNING_SYNC BIT(9) #define BR_PROXYARP_WIFI BIT(10) #define BR_MCAST_FLOOD BIT(11) +#define BR_MULTICAST_TO_UNICAST BIT(12)
#define BR_DEFAULT_AGEING_TIME (300 * HZ)
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index b4fba662cd32..ee4d632d089d 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -319,6 +319,7 @@ enum { IFLA_BRPORT_MULTICAST_ROUTER, IFLA_BRPORT_PAD, IFLA_BRPORT_MCAST_FLOOD, + IFLA_BRPORT_MCAST_TO_UCAST, __IFLA_BRPORT_MAX }; #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1) diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index 5b675695c661..30afa130287e 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -173,6 +173,31 @@ static struct net_bridge_port *maybe_deliver( return p; }
+static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb, + const unsigned char *addr, bool local_orig) +{ + struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev; + const unsigned char *src = eth_hdr(skb)->h_source; + + if (!should_deliver(p, skb)) + return; + + /* Even with hairpin, no soliloquies - prevent breaking IPv6 DAD */ + if (skb->dev == p->dev && ether_addr_equal(src, addr)) + return; + + skb = skb_copy(skb, GFP_ATOMIC); + if (!skb) { + dev->stats.tx_dropped++; + return; + } + + if (!is_broadcast_ether_addr(addr)) + memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN); + + __br_forward(p, skb, local_orig); +} + /* called under rcu_read_lock */ void br_flood(struct net_bridge *br, struct sk_buff *skb, enum br_pkt_type pkt_type, bool local_rcv, bool local_orig) @@ -241,10 +266,20 @@ void br_multicast_flood(struct net_bridge_mdb_entry *mdst, rport = rp ? hlist_entry(rp, struct net_bridge_port, rlist) : NULL;
- port = (unsigned long)lport > (unsigned long)rport ? - lport : rport; + if ((unsigned long)lport > (unsigned long)rport) { + port = lport; + + if (port->flags & BR_MULTICAST_TO_UNICAST) { + maybe_deliver_addr(lport, skb, p->eth_addr, + local_orig); + goto delivered; + } + } else { + port = rport; + }
prev = maybe_deliver(prev, port, skb, local_orig); +delivered: if (IS_ERR(prev)) goto out; if (prev == port) diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c index 6406010e155b..57e94a1b57e1 100644 --- a/net/bridge/br_mdb.c +++ b/net/bridge/br_mdb.c @@ -532,7 +532,7 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port, break; }
- p = br_multicast_new_port_group(port, group, *pp, state); + p = br_multicast_new_port_group(port, group, *pp, state, NULL); if (unlikely(!p)) return -ENOMEM; rcu_assign_pointer(*pp, p); diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 4bd57507b9a4..1183c5fcd9d2 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -42,12 +42,14 @@ static void br_multicast_add_router(struct net_bridge *br, static void br_ip4_multicast_leave_group(struct net_bridge *br, struct net_bridge_port *port, __be32 group, - __u16 vid); + __u16 vid, + const unsigned char *src); + #if IS_ENABLED(CONFIG_IPV6) static void br_ip6_multicast_leave_group(struct net_bridge *br, struct net_bridge_port *port, const struct in6_addr *group, - __u16 vid); + __u16 vid, const unsigned char *src); #endif unsigned int br_mdb_rehash_seq;
@@ -658,7 +660,8 @@ struct net_bridge_port_group *br_multicast_new_port_group( struct net_bridge_port *port, struct br_ip *group, struct net_bridge_port_group __rcu *next, - unsigned char flags) + unsigned char flags, + const unsigned char *src) { struct net_bridge_port_group *p;
@@ -673,12 +676,32 @@ struct net_bridge_port_group *br_multicast_new_port_group( hlist_add_head(&p->mglist, &port->mglist); setup_timer(&p->timer, br_multicast_port_group_expired, (unsigned long)p); + + if (src) + memcpy(p->eth_addr, src, ETH_ALEN); + else + memset(p->eth_addr, 0xff, ETH_ALEN); + return p; }
+static bool br_port_group_equal(struct net_bridge_port_group *p, + struct net_bridge_port *port, + const unsigned char *src) +{ + if (p->port != port) + return false; + + if (!(port->flags & BR_MULTICAST_TO_UNICAST)) + return true; + + return ether_addr_equal(src, p->eth_addr); +} + static int br_multicast_add_group(struct net_bridge *br, struct net_bridge_port *port, - struct br_ip *group) + struct br_ip *group, + const unsigned char *src) { struct net_bridge_mdb_entry *mp; struct net_bridge_port_group *p; @@ -705,13 +728,13 @@ static int br_multicast_add_group(struct net_bridge *br, for (pp = &mp->ports; (p = mlock_dereference(*pp, br)) != NULL; pp = &p->next) { - if (p->port == port) + if (br_port_group_equal(p, port, src)) goto found; if ((unsigned long)p->port < (unsigned long)port) break; }
- p = br_multicast_new_port_group(port, group, *pp, 0); + p = br_multicast_new_port_group(port, group, *pp, 0, src); if (unlikely(!p)) goto err; rcu_assign_pointer(*pp, p); @@ -730,7 +753,8 @@ static int br_multicast_add_group(struct net_bridge *br, static int br_ip4_multicast_add_group(struct net_bridge *br, struct net_bridge_port *port, __be32 group, - __u16 vid) + __u16 vid, + const unsigned char *src) { struct br_ip br_group;
@@ -741,14 +765,15 @@ static int br_ip4_multicast_add_group(struct net_bridge *br, br_group.proto = htons(ETH_P_IP); br_group.vid = vid;
- return br_multicast_add_group(br, port, &br_group); + return br_multicast_add_group(br, port, &br_group, src); }
#if IS_ENABLED(CONFIG_IPV6) static int br_ip6_multicast_add_group(struct net_bridge *br, struct net_bridge_port *port, const struct in6_addr *group, - __u16 vid) + __u16 vid, + const unsigned char *src) { struct br_ip br_group;
@@ -759,7 +784,7 @@ static int br_ip6_multicast_add_group(struct net_bridge *br, br_group.proto = htons(ETH_P_IPV6); br_group.vid = vid;
- return br_multicast_add_group(br, port, &br_group); + return br_multicast_add_group(br, port, &br_group, src); } #endif
@@ -1028,6 +1053,7 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br, struct sk_buff *skb, u16 vid) { + const unsigned char *src; struct igmpv3_report *ih; struct igmpv3_grec *grec; int i; @@ -1068,12 +1094,14 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br, continue; }
+ src = eth_hdr(skb)->h_source; if ((type == IGMPV3_CHANGE_TO_INCLUDE || type == IGMPV3_MODE_IS_INCLUDE) && ntohs(grec->grec_nsrcs) == 0) { - br_ip4_multicast_leave_group(br, port, group, vid); + br_ip4_multicast_leave_group(br, port, group, vid, src); } else { - err = br_ip4_multicast_add_group(br, port, group, vid); + err = br_ip4_multicast_add_group(br, port, group, vid, + src); if (err) break; } @@ -1088,6 +1116,7 @@ static int br_ip6_multicast_mld2_report(struct net_bridge *br, struct sk_buff *skb, u16 vid) { + const unsigned char *src; struct icmp6hdr *icmp6h; struct mld2_grec *grec; int i; @@ -1135,14 +1164,16 @@ static int br_ip6_multicast_mld2_report(struct net_bridge *br, continue; }
+ src = eth_hdr(skb)->h_source; if ((grec->grec_type == MLD2_CHANGE_TO_INCLUDE || grec->grec_type == MLD2_MODE_IS_INCLUDE) && ntohs(*nsrcs) == 0) { br_ip6_multicast_leave_group(br, port, &grec->grec_mca, - vid); + vid, src); } else { err = br_ip6_multicast_add_group(br, port, - &grec->grec_mca, vid); + &grec->grec_mca, vid, + src); if (err) break; } @@ -1465,7 +1496,8 @@ br_multicast_leave_group(struct net_bridge *br, struct net_bridge_port *port, struct br_ip *group, struct bridge_mcast_other_query *other_query, - struct bridge_mcast_own_query *own_query) + struct bridge_mcast_own_query *own_query, + const unsigned char *src) { struct net_bridge_mdb_htable *mdb; struct net_bridge_mdb_entry *mp; @@ -1489,7 +1521,7 @@ br_multicast_leave_group(struct net_bridge *br, for (pp = &mp->ports; (p = mlock_dereference(*pp, br)) != NULL; pp = &p->next) { - if (p->port != port) + if (!br_port_group_equal(p, port, src)) continue;
rcu_assign_pointer(*pp, p->next); @@ -1520,7 +1552,7 @@ br_multicast_leave_group(struct net_bridge *br, for (p = mlock_dereference(mp->ports, br); p != NULL; p = mlock_dereference(p->next, br)) { - if (p->port != port) + if (!br_port_group_equal(p, port, src)) continue;
if (!hlist_unhashed(&p->mglist) && @@ -1571,7 +1603,8 @@ br_multicast_leave_group(struct net_bridge *br, static void br_ip4_multicast_leave_group(struct net_bridge *br, struct net_bridge_port *port, __be32 group, - __u16 vid) + __u16 vid, + const unsigned char *src) { struct br_ip br_group; struct bridge_mcast_own_query *own_query; @@ -1586,14 +1619,15 @@ static void br_ip4_multicast_leave_group(struct net_bridge *br, br_group.vid = vid;
br_multicast_leave_group(br, port, &br_group, &br->ip4_other_query, - own_query); + own_query, src); }
#if IS_ENABLED(CONFIG_IPV6) static void br_ip6_multicast_leave_group(struct net_bridge *br, struct net_bridge_port *port, const struct in6_addr *group, - __u16 vid) + __u16 vid, + const unsigned char *src) { struct br_ip br_group; struct bridge_mcast_own_query *own_query; @@ -1608,7 +1642,7 @@ static void br_ip6_multicast_leave_group(struct net_bridge *br, br_group.vid = vid;
br_multicast_leave_group(br, port, &br_group, &br->ip6_other_query, - own_query); + own_query, src); } #endif
@@ -1651,6 +1685,7 @@ static int br_multicast_ipv4_rcv(struct net_bridge *br, u16 vid) { struct sk_buff *skb_trimmed = NULL; + const unsigned char *src; struct igmphdr *ih; int err;
@@ -1666,13 +1701,14 @@ static int br_multicast_ipv4_rcv(struct net_bridge *br, }
ih = igmp_hdr(skb); + src = eth_hdr(skb)->h_source; BR_INPUT_SKB_CB(skb)->igmp = ih->type;
switch (ih->type) { case IGMP_HOST_MEMBERSHIP_REPORT: case IGMPV2_HOST_MEMBERSHIP_REPORT: BR_INPUT_SKB_CB(skb)->mrouters_only = 1; - err = br_ip4_multicast_add_group(br, port, ih->group, vid); + err = br_ip4_multicast_add_group(br, port, ih->group, vid, src); break; case IGMPV3_HOST_MEMBERSHIP_REPORT: err = br_ip4_multicast_igmp3_report(br, port, skb_trimmed, vid); @@ -1681,7 +1717,7 @@ static int br_multicast_ipv4_rcv(struct net_bridge *br, err = br_ip4_multicast_query(br, port, skb_trimmed, vid); break; case IGMP_HOST_LEAVE_MESSAGE: - br_ip4_multicast_leave_group(br, port, ih->group, vid); + br_ip4_multicast_leave_group(br, port, ih->group, vid, src); break; }
@@ -1701,6 +1737,7 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br, u16 vid) { struct sk_buff *skb_trimmed = NULL; + const unsigned char *src; struct mld_msg *mld; int err;
@@ -1720,8 +1757,10 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
switch (mld->mld_type) { case ICMPV6_MGM_REPORT: + src = eth_hdr(skb)->h_source; BR_INPUT_SKB_CB(skb)->mrouters_only = 1; - err = br_ip6_multicast_add_group(br, port, &mld->mld_mca, vid); + err = br_ip6_multicast_add_group(br, port, &mld->mld_mca, vid, + src); break; case ICMPV6_MLD2_REPORT: err = br_ip6_multicast_mld2_report(br, port, skb_trimmed, vid); @@ -1730,7 +1769,8 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br, err = br_ip6_multicast_query(br, port, skb_trimmed, vid); break; case ICMPV6_MGM_REDUCTION: - br_ip6_multicast_leave_group(br, port, &mld->mld_mca, vid); + src = eth_hdr(skb)->h_source; + br_ip6_multicast_leave_group(br, port, &mld->mld_mca, vid, src); break; }
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 4f831225d34f..a62deecd471b 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -123,6 +123,7 @@ static inline size_t br_port_info_size(void) + nla_total_size(1) /* IFLA_BRPORT_GUARD */ + nla_total_size(1) /* IFLA_BRPORT_PROTECT */ + nla_total_size(1) /* IFLA_BRPORT_FAST_LEAVE */ + + nla_total_size(1) /* IFLA_BRPORT_MCAST_TO_UCAST */ + nla_total_size(1) /* IFLA_BRPORT_LEARNING */ + nla_total_size(1) /* IFLA_BRPORT_UNICAST_FLOOD */ + nla_total_size(1) /* IFLA_BRPORT_PROXYARP */ @@ -173,6 +174,8 @@ static int br_port_fill_attrs(struct sk_buff *skb, !!(p->flags & BR_ROOT_BLOCK)) || nla_put_u8(skb, IFLA_BRPORT_FAST_LEAVE, !!(p->flags & BR_MULTICAST_FAST_LEAVE)) || + nla_put_u8(skb, IFLA_BRPORT_MCAST_TO_UCAST, + !!(p->flags & BR_MULTICAST_TO_UNICAST)) || nla_put_u8(skb, IFLA_BRPORT_LEARNING, !!(p->flags & BR_LEARNING)) || nla_put_u8(skb, IFLA_BRPORT_UNICAST_FLOOD, !!(p->flags & BR_FLOOD)) || @@ -586,6 +589,7 @@ static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = { [IFLA_BRPORT_PROXYARP] = { .type = NLA_U8 }, [IFLA_BRPORT_PROXYARP_WIFI] = { .type = NLA_U8 }, [IFLA_BRPORT_MULTICAST_ROUTER] = { .type = NLA_U8 }, + [IFLA_BRPORT_MCAST_TO_UCAST] = { .type = NLA_U8 }, };
/* Change the state of the port and notify spanning tree */ @@ -636,6 +640,7 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[]) br_set_port_flag(p, tb, IFLA_BRPORT_LEARNING, BR_LEARNING); br_set_port_flag(p, tb, IFLA_BRPORT_UNICAST_FLOOD, BR_FLOOD); br_set_port_flag(p, tb, IFLA_BRPORT_MCAST_FLOOD, BR_MCAST_FLOOD); + br_set_port_flag(p, tb, IFLA_BRPORT_MCAST_TO_UCAST, BR_MULTICAST_TO_UNICAST); br_set_port_flag(p, tb, IFLA_BRPORT_PROXYARP, BR_PROXYARP); br_set_port_flag(p, tb, IFLA_BRPORT_PROXYARP_WIFI, BR_PROXYARP_WIFI);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 1b63177e0ccd..f038cfdc8d98 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -177,6 +177,7 @@ struct net_bridge_port_group { struct timer_list timer; struct br_ip addr; unsigned char flags; + unsigned char eth_addr[ETH_ALEN]; };
struct net_bridge_mdb_entry @@ -591,7 +592,7 @@ void br_multicast_free_pg(struct rcu_head *head); struct net_bridge_port_group * br_multicast_new_port_group(struct net_bridge_port *port, struct br_ip *group, struct net_bridge_port_group __rcu *next, - unsigned char flags); + unsigned char flags, const unsigned char *src); void br_mdb_init(void); void br_mdb_uninit(void); void br_mdb_notify(struct net_device *dev, struct net_bridge_port *port, diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c index abf711112418..228ebcd4fa51 100644 --- a/net/bridge/br_sysfs_if.c +++ b/net/bridge/br_sysfs_if.c @@ -188,6 +188,7 @@ static BRPORT_ATTR(multicast_router, S_IRUGO | S_IWUSR, show_multicast_router, store_multicast_router);
BRPORT_ATTR_FLAG(multicast_fast_leave, BR_MULTICAST_FAST_LEAVE); +BRPORT_ATTR_FLAG(multicast_to_unicast, BR_MULTICAST_TO_UNICAST); #endif
static const struct brport_attribute *brport_attrs[] = { @@ -214,6 +215,7 @@ static const struct brport_attribute *brport_attrs[] = { #ifdef CONFIG_BRIDGE_IGMP_SNOOPING &brport_attr_multicast_router, &brport_attr_multicast_fast_leave, + &brport_attr_multicast_to_unicast, #endif &brport_attr_proxyarp, &brport_attr_proxyarp_wifi,
On Thu, Feb 14, 2019 at 11:24:27AM +0100, Linus Walleij wrote:
From: Felix Fietkau nbd@nbd.name
Implements an optional, per bridge port flag and feature to deliver multicast packets to any host on the according port via unicast individually. This is done by copying the packet per host and changing the multicast destination MAC to a unicast one accordingly.
multicast-to-unicast works on top of the multicast snooping feature of the bridge. Which means unicast copies are only delivered to hosts which are interested in it and signalized this via IGMP/MLD reports previously.
This feature is intended for interface types which have a more reliable and/or efficient way to deliver unicast packets than broadcast ones (e.g. wifi).
However, it should only be enabled on interfaces where no IGMPv2/MLDv1 report suppression takes place. This feature is disabled by default.
The initial patch and idea is from Felix Fietkau.
Signed-off-by: Felix Fietkau nbd@nbd.name [linus.luessing@c0d3.blue: various bug + style fixes, commit message] Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue Reviewed-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net
include/linux/if_bridge.h | 1 + include/uapi/linux/if_link.h | 1 + net/bridge/br_forward.c | 39 +++++++++++++++- net/bridge/br_mdb.c | 2 +- net/bridge/br_multicast.c | 90 ++++++++++++++++++++++++++---------- net/bridge/br_netlink.c | 5 ++ net/bridge/br_private.h | 3 +- net/bridge/br_sysfs_if.c | 2 + 8 files changed, 114 insertions(+), 29 deletions(-)
What is the git commit id of this patch upstream? I need that on the patches themselves.
thanks,
greg k-h
From: James Hughes james.hughes@raspberrypi.org
The driver was failing to check that the SKB wasn't cloned before adding checksum data. Replace existing handling to extend/copy the header buffer with skb_cow_head.
Signed-off-by: James Hughes james.hughes@raspberrypi.org Acked-by: Eric Dumazet edumazet@google.com Acked-by: Woojung Huh Woojung.Huh@microchip.com Signed-off-by: David S. Miller davem@davemloft.net --- drivers/net/usb/smsc95xx.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c index e29f4c0767eb..e719ecd69d01 100644 --- a/drivers/net/usb/smsc95xx.c +++ b/drivers/net/usb/smsc95xx.c @@ -2011,13 +2011,13 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev, /* We do not advertise SG, so skbs should be already linearized */ BUG_ON(skb_shinfo(skb)->nr_frags);
- if (skb_headroom(skb) < overhead) { - struct sk_buff *skb2 = skb_copy_expand(skb, - overhead, 0, flags); + /* Make writable and expand header space by overhead if required */ + if (skb_cow_head(skb, overhead)) { + /* Must deallocate here as returning NULL to indicate error + * means the skb won't be deallocated in the caller. + */ dev_kfree_skb_any(skb); - skb = skb2; - if (!skb) - return NULL; + return NULL; }
if (csum) {
From: Eric Dumazet edumazet@google.com
We need to ensure there is enough headroom to push extra header, but we also need to check if we are allowed to change headers.
skb_cow_head() is the proper helper to deal with this.
Fixes: 4a476bd6d1d9 ("usbnet: New driver for QinHeng CH9200 devices") Signed-off-by: Eric Dumazet edumazet@google.com Cc: James Hughes james.hughes@raspberrypi.org Cc: Matthew Garrett mjg59@srcf.ucam.org Signed-off-by: David S. Miller davem@davemloft.net --- drivers/net/usb/ch9200.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/net/usb/ch9200.c b/drivers/net/usb/ch9200.c index 8a40202c0a17..c4f1c363e24b 100644 --- a/drivers/net/usb/ch9200.c +++ b/drivers/net/usb/ch9200.c @@ -254,14 +254,9 @@ static struct sk_buff *ch9200_tx_fixup(struct usbnet *dev, struct sk_buff *skb, tx_overhead = 0x40;
len = skb->len; - if (skb_headroom(skb) < tx_overhead) { - struct sk_buff *skb2; - - skb2 = skb_copy_expand(skb, tx_overhead, 0, flags); + if (skb_cow_head(skb, tx_overhead)) { dev_kfree_skb_any(skb); - skb = skb2; - if (!skb) - return NULL; + return NULL; }
__skb_push(skb, tx_overhead);
From: Eric Dumazet edumazet@google.com
We can use skb_cow_head() to properly deal with clones, especially the ones coming from TCP stack that allow their head being modified. This avoids a copy.
Signed-off-by: Eric Dumazet edumazet@google.com Cc: James Hughes james.hughes@raspberrypi.org Signed-off-by: David S. Miller davem@davemloft.net --- drivers/net/usb/kaweth.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/drivers/net/usb/kaweth.c b/drivers/net/usb/kaweth.c index 66b34ddbe216..72d9e7954b0a 100644 --- a/drivers/net/usb/kaweth.c +++ b/drivers/net/usb/kaweth.c @@ -803,18 +803,12 @@ static netdev_tx_t kaweth_start_xmit(struct sk_buff *skb, }
/* We now decide whether we can put our special header into the sk_buff */ - if (skb_cloned(skb) || skb_headroom(skb) < 2) { - /* no such luck - we make our own */ - struct sk_buff *copied_skb; - copied_skb = skb_copy_expand(skb, 2, 0, GFP_ATOMIC); - dev_kfree_skb_irq(skb); - skb = copied_skb; - if (!copied_skb) { - kaweth->stats.tx_errors++; - netif_start_queue(net); - spin_unlock_irq(&kaweth->device_lock); - return NETDEV_TX_OK; - } + if (skb_cow_head(skb, 2)) { + kaweth->stats.tx_errors++; + netif_start_queue(net); + spin_unlock_irq(&kaweth->device_lock); + dev_kfree_skb_any(skb); + return NETDEV_TX_OK; }
private_header = (__le16 *)__skb_push(skb, 2);
From: Rafał Miłecki rafal@milecki.pl
Values of these fields are set during init and never modified. They are used (read) in a single function only. There isn't really any reason to keep them in a struct. It only makes struct just a bit bigger without any visible gain.
Signed-off-by: Rafał Miłecki rafal@milecki.pl Reviewed-by: Boris Brezillon boris.brezillon@free-electrons.com Signed-off-by: Richard Weinberger richard@nod.at --- fs/ubifs/io.c | 18 ++++++++++-------- fs/ubifs/ubifs.h | 5 ----- 2 files changed, 10 insertions(+), 13 deletions(-)
diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c index 97be41215332..4d6ce4a2a4b6 100644 --- a/fs/ubifs/io.c +++ b/fs/ubifs/io.c @@ -452,16 +452,22 @@ static enum hrtimer_restart wbuf_timer_callback_nolock(struct hrtimer *timer) */ static void new_wbuf_timer_nolock(struct ubifs_wbuf *wbuf) { + ktime_t softlimit = ktime_set(WBUF_TIMEOUT_SOFTLIMIT, 0); + unsigned long long delta; + + delta = WBUF_TIMEOUT_HARDLIMIT - WBUF_TIMEOUT_SOFTLIMIT; + delta *= 1000000000ULL; + ubifs_assert(!hrtimer_active(&wbuf->timer)); + ubifs_assert(delta <= ULONG_MAX);
if (wbuf->no_timer) return; dbg_io("set timer for jhead %s, %llu-%llu millisecs", dbg_jhead(wbuf->jhead), - div_u64(ktime_to_ns(wbuf->softlimit), USEC_PER_SEC), - div_u64(ktime_to_ns(wbuf->softlimit) + wbuf->delta, - USEC_PER_SEC)); - hrtimer_start_range_ns(&wbuf->timer, wbuf->softlimit, wbuf->delta, + div_u64(ktime_to_ns(softlimit), USEC_PER_SEC), + div_u64(ktime_to_ns(softlimit) + delta, USEC_PER_SEC)); + hrtimer_start_range_ns(&wbuf->timer, softlimit, delta, HRTIMER_MODE_REL); }
@@ -1059,10 +1065,6 @@ int ubifs_wbuf_init(struct ubifs_info *c, struct ubifs_wbuf *wbuf)
hrtimer_init(&wbuf->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); wbuf->timer.function = wbuf_timer_callback_nolock; - wbuf->softlimit = ktime_set(WBUF_TIMEOUT_SOFTLIMIT, 0); - wbuf->delta = WBUF_TIMEOUT_HARDLIMIT - WBUF_TIMEOUT_SOFTLIMIT; - wbuf->delta *= 1000000000ULL; - ubifs_assert(wbuf->delta <= ULONG_MAX); return 0; }
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h index 096035eb29d0..ade4b3137a1d 100644 --- a/fs/ubifs/ubifs.h +++ b/fs/ubifs/ubifs.h @@ -645,9 +645,6 @@ typedef int (*ubifs_lpt_scan_callback)(struct ubifs_info *c, * @io_mutex: serializes write-buffer I/O * @lock: serializes @buf, @lnum, @offs, @avail, @used, @next_ino and @inodes * fields - * @softlimit: soft write-buffer timeout interval - * @delta: hard and soft timeouts delta (the timer expire interval is @softlimit - * and @softlimit + @delta) * @timer: write-buffer timer * @no_timer: non-zero if this write-buffer does not have a timer * @need_sync: non-zero if the timer expired and the wbuf needs sync'ing @@ -676,8 +673,6 @@ struct ubifs_wbuf { int (*sync_callback)(struct ubifs_info *c, int lnum, int free, int pad); struct mutex io_mutex; spinlock_t lock; - ktime_t softlimit; - unsigned long long delta; struct hrtimer timer; unsigned int no_timer:1; unsigned int need_sync:1;
From: Rafał Miłecki rafal@milecki.pl
Right now wbuf timer has hardcoded timeouts and there is no place for manual adjustments. Some projects / cases many need that though. Few file systems allow doing that by respecting dirty_writeback_interval that can be set using sysctl (dirty_writeback_centisecs).
Lowering dirty_writeback_interval could be some way of dealing with user space apps lacking proper fsyncs. This is definitely *not* a perfect solution but we don't have ideal (user space) world. There were already advanced discussions on this matter, mostly when ext4 was introduced and it wasn't behaving as ext3. Anyway, the final decision was to add some hacks to the ext4, as trying to fix whole user space or adding new API was pointless.
We can't (and shouldn't?) just follow ext4. We can't e.g. sync on close as this would cause too many commits and flash wearing. On the other hand we still should allow some trade-off between -o sync and default wbuf timeout. Respecting dirty_writeback_interval should allow some sane cutomizations if used warily.
Signed-off-by: Rafał Miłecki rafal@milecki.pl Reviewed-by: Boris Brezillon boris.brezillon@free-electrons.com Signed-off-by: Richard Weinberger richard@nod.at --- fs/ubifs/io.c | 8 ++++---- fs/ubifs/ubifs.h | 4 ---- 2 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c index 4d6ce4a2a4b6..3be28900bf37 100644 --- a/fs/ubifs/io.c +++ b/fs/ubifs/io.c @@ -452,11 +452,11 @@ static enum hrtimer_restart wbuf_timer_callback_nolock(struct hrtimer *timer) */ static void new_wbuf_timer_nolock(struct ubifs_wbuf *wbuf) { - ktime_t softlimit = ktime_set(WBUF_TIMEOUT_SOFTLIMIT, 0); - unsigned long long delta; + ktime_t softlimit = ms_to_ktime(dirty_writeback_interval * 10); + unsigned long long delta = dirty_writeback_interval;
- delta = WBUF_TIMEOUT_HARDLIMIT - WBUF_TIMEOUT_SOFTLIMIT; - delta *= 1000000000ULL; + /* centi to milli, milli to nano, then 10% */ + delta *= 10ULL * NSEC_PER_MSEC / 10ULL;
ubifs_assert(!hrtimer_active(&wbuf->timer)); ubifs_assert(delta <= ULONG_MAX); diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h index ade4b3137a1d..b8b18d446a49 100644 --- a/fs/ubifs/ubifs.h +++ b/fs/ubifs/ubifs.h @@ -83,10 +83,6 @@ */ #define BGT_NAME_PATTERN "ubifs_bgt%d_%d"
-/* Write-buffer synchronization timeout interval in seconds */ -#define WBUF_TIMEOUT_SOFTLIMIT 3 -#define WBUF_TIMEOUT_HARDLIMIT 5 - /* Maximum possible inode number (only 32-bit inodes are supported now) */ #define MAX_INUM 0xFFFFFFFF
From: John Youn johnyoun@synopsys.com
This shouldn't be freed by the HCD as it is owned by the core and allocated with devm_kzalloc.
Signed-off-by: John Youn johnyoun@synopsys.com Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com --- drivers/usb/dwc2/hcd.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c index 984d6aae7529..0e5435330c07 100644 --- a/drivers/usb/dwc2/hcd.c +++ b/drivers/usb/dwc2/hcd.c @@ -5202,7 +5202,6 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg, int irq) error2: usb_put_hcd(hcd); error1: - kfree(hsotg->core_params);
#ifdef CONFIG_USB_DWC2_TRACK_MISSED_SOFS kfree(hsotg->last_frame_num_array);
From: Liping Zhang zlpnobody@gmail.com
Currently, there are two different methods to store an u16 integer to the u32 data register. For example: u32 *dest = ®s->data[priv->dreg]; 1. *dest = 0; *(u16 *) dest = val_u16; 2. *dest = val_u16;
For method 1, the u16 value will be stored like this, either in big-endian or little-endian system: 0 15 31 +-+-+-+-+-+-+-+-+-+-+-+-+ | Value | 0 | +-+-+-+-+-+-+-+-+-+-+-+-+
For method 2, in little-endian system, the u16 value will be the same as listed above. But in big-endian system, the u16 value will be stored like this: 0 15 31 +-+-+-+-+-+-+-+-+-+-+-+-+ | 0 | Value | +-+-+-+-+-+-+-+-+-+-+-+-+
So later we use "memcmp(®s->data[priv->sreg], data, 2);" to do compare in nft_cmp, nft_lookup expr ..., method 2 will get the wrong result in big-endian system, as 0~15 bits will always be zero.
For the similar reason, when loading an u16 value from the u32 data register, we should use "*(u16 *) sreg;" instead of "(u16)*sreg;", the 2nd method will get the wrong value in the big-endian system.
So introduce some wrapper functions to store/load an u8 or u16 integer to/from the u32 data register, and use them in the right place.
This is OpenWrts backport of upstream commit 10596608c4d62cb8c1c2b806debcbd32fe657e71 "netfilter: nf_tables: fix mismatch in big-endian system", the upstream commit does not work right off.
Signed-off-by: Liping Zhang zlpnobody@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org --- include/net/netfilter/nf_tables.h | 29 ++++++++++++++++++++ net/ipv4/netfilter/nft_masq_ipv4.c | 8 +++--- net/ipv4/netfilter/nft_redir_ipv4.c | 8 +++--- net/ipv6/netfilter/nft_masq_ipv6.c | 8 +++--- net/ipv6/netfilter/nft_redir_ipv6.c | 8 +++--- net/netfilter/nft_ct.c | 10 +++---- net/netfilter/nft_meta.c | 42 +++++++++++++++-------------- net/netfilter/nft_nat.c | 8 +++--- 8 files changed, 76 insertions(+), 45 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index b02af0bf5777..66f6b84df287 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -87,6 +87,35 @@ struct nft_regs { }; };
+/* Store/load an u16 or u8 integer to/from the u32 data register. + * + * Note, when using concatenations, register allocation happens at 32-bit + * level. So for store instruction, pad the rest part with zero to avoid + * garbage values. + */ + +static inline void nft_reg_store16(u32 *dreg, u16 val) +{ + *dreg = 0; + *(u16 *)dreg = val; +} + +static inline void nft_reg_store8(u32 *dreg, u8 val) +{ + *dreg = 0; + *(u8 *)dreg = val; +} + +static inline u16 nft_reg_load16(u32 *sreg) +{ + return *(u16 *)sreg; +} + +static inline u8 nft_reg_load8(u32 *sreg) +{ + return *(u8 *)sreg; +} + static inline void nft_data_copy(u32 *dst, const struct nft_data *src, unsigned int len) { diff --git a/net/ipv4/netfilter/nft_masq_ipv4.c b/net/ipv4/netfilter/nft_masq_ipv4.c index 51ced81b616c..dc3628a396ec 100644 --- a/net/ipv4/netfilter/nft_masq_ipv4.c +++ b/net/ipv4/netfilter/nft_masq_ipv4.c @@ -26,10 +26,10 @@ static void nft_masq_ipv4_eval(const struct nft_expr *expr, memset(&range, 0, sizeof(range)); range.flags = priv->flags; if (priv->sreg_proto_min) { - range.min_proto.all = - *(__be16 *)®s->data[priv->sreg_proto_min]; - range.max_proto.all = - *(__be16 *)®s->data[priv->sreg_proto_max]; + range.min_proto.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_min]); + range.max_proto.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_max]); } regs->verdict.code = nf_nat_masquerade_ipv4(pkt->skb, pkt->hook, &range, pkt->out); diff --git a/net/ipv4/netfilter/nft_redir_ipv4.c b/net/ipv4/netfilter/nft_redir_ipv4.c index c09d4381427e..f760524e1353 100644 --- a/net/ipv4/netfilter/nft_redir_ipv4.c +++ b/net/ipv4/netfilter/nft_redir_ipv4.c @@ -26,10 +26,10 @@ static void nft_redir_ipv4_eval(const struct nft_expr *expr,
memset(&mr, 0, sizeof(mr)); if (priv->sreg_proto_min) { - mr.range[0].min.all = - *(__be16 *)®s->data[priv->sreg_proto_min]; - mr.range[0].max.all = - *(__be16 *)®s->data[priv->sreg_proto_max]; + mr.range[0].min.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_min]); + mr.range[0].max.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_max]); mr.range[0].flags |= NF_NAT_RANGE_PROTO_SPECIFIED; }
diff --git a/net/ipv6/netfilter/nft_masq_ipv6.c b/net/ipv6/netfilter/nft_masq_ipv6.c index 9597ffb74077..b74a420050c4 100644 --- a/net/ipv6/netfilter/nft_masq_ipv6.c +++ b/net/ipv6/netfilter/nft_masq_ipv6.c @@ -27,10 +27,10 @@ static void nft_masq_ipv6_eval(const struct nft_expr *expr, memset(&range, 0, sizeof(range)); range.flags = priv->flags; if (priv->sreg_proto_min) { - range.min_proto.all = - *(__be16 *)®s->data[priv->sreg_proto_min]; - range.max_proto.all = - *(__be16 *)®s->data[priv->sreg_proto_max]; + range.min_proto.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_min]); + range.max_proto.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_max]); } regs->verdict.code = nf_nat_masquerade_ipv6(pkt->skb, &range, pkt->out); } diff --git a/net/ipv6/netfilter/nft_redir_ipv6.c b/net/ipv6/netfilter/nft_redir_ipv6.c index aca44e89a881..7ef58e493fca 100644 --- a/net/ipv6/netfilter/nft_redir_ipv6.c +++ b/net/ipv6/netfilter/nft_redir_ipv6.c @@ -26,10 +26,10 @@ static void nft_redir_ipv6_eval(const struct nft_expr *expr,
memset(&range, 0, sizeof(range)); if (priv->sreg_proto_min) { - range.min_proto.all = - *(__be16 *)®s->data[priv->sreg_proto_min], - range.max_proto.all = - *(__be16 *)®s->data[priv->sreg_proto_max], + range.min_proto.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_min]); + range.max_proto.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_max]); range.flags |= NF_NAT_RANGE_PROTO_SPECIFIED; }
diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c index d7b0d171172a..2b9fda71fa8b 100644 --- a/net/netfilter/nft_ct.c +++ b/net/netfilter/nft_ct.c @@ -77,7 +77,7 @@ static void nft_ct_get_eval(const struct nft_expr *expr,
switch (priv->key) { case NFT_CT_DIRECTION: - *dest = CTINFO2DIR(ctinfo); + nft_reg_store8(dest, CTINFO2DIR(ctinfo)); return; case NFT_CT_STATUS: *dest = ct->status; @@ -129,10 +129,10 @@ static void nft_ct_get_eval(const struct nft_expr *expr, return; } case NFT_CT_L3PROTOCOL: - *dest = nf_ct_l3num(ct); + nft_reg_store8(dest, nf_ct_l3num(ct)); return; case NFT_CT_PROTOCOL: - *dest = nf_ct_protonum(ct); + nft_reg_store8(dest, nf_ct_protonum(ct)); return; default: break; @@ -149,10 +149,10 @@ static void nft_ct_get_eval(const struct nft_expr *expr, nf_ct_l3num(ct) == NFPROTO_IPV4 ? 4 : 16); return; case NFT_CT_PROTO_SRC: - *dest = (__force __u16)tuple->src.u.all; + nft_reg_store16(dest, (__force u16)tuple->src.u.all); return; case NFT_CT_PROTO_DST: - *dest = (__force __u16)tuple->dst.u.all; + nft_reg_store16(dest, (__force u16)tuple->dst.u.all); return; default: break; diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c index 7c3395513ff0..cec8dc0e5e6f 100644 --- a/net/netfilter/nft_meta.c +++ b/net/netfilter/nft_meta.c @@ -45,16 +45,15 @@ void nft_meta_get_eval(const struct nft_expr *expr, *dest = skb->len; break; case NFT_META_PROTOCOL: - *dest = 0; - *(__be16 *)dest = skb->protocol; + nft_reg_store16(dest, (__force u16)skb->protocol); break; case NFT_META_NFPROTO: - *dest = pkt->pf; + nft_reg_store8(dest, pkt->pf); break; case NFT_META_L4PROTO: if (!pkt->tprot_set) goto err; - *dest = pkt->tprot; + nft_reg_store8(dest, pkt->tprot); break; case NFT_META_PRIORITY: *dest = skb->priority; @@ -85,14 +84,12 @@ void nft_meta_get_eval(const struct nft_expr *expr, case NFT_META_IIFTYPE: if (in == NULL) goto err; - *dest = 0; - *(u16 *)dest = in->type; + nft_reg_store16(dest, in->type); break; case NFT_META_OIFTYPE: if (out == NULL) goto err; - *dest = 0; - *(u16 *)dest = out->type; + nft_reg_store16(dest, out->type); break; case NFT_META_SKUID: sk = skb_to_full_sk(skb); @@ -142,22 +139,22 @@ void nft_meta_get_eval(const struct nft_expr *expr, #endif case NFT_META_PKTTYPE: if (skb->pkt_type != PACKET_LOOPBACK) { - *dest = skb->pkt_type; + nft_reg_store8(dest, skb->pkt_type); break; }
switch (pkt->pf) { case NFPROTO_IPV4: if (ipv4_is_multicast(ip_hdr(skb)->daddr)) - *dest = PACKET_MULTICAST; + nft_reg_store8(dest, PACKET_MULTICAST); else - *dest = PACKET_BROADCAST; + nft_reg_store8(dest, PACKET_BROADCAST); break; case NFPROTO_IPV6: if (ipv6_hdr(skb)->daddr.s6_addr[0] == 0xFF) - *dest = PACKET_MULTICAST; + nft_reg_store8(dest, PACKET_MULTICAST); else - *dest = PACKET_BROADCAST; + nft_reg_store8(dest, PACKET_BROADCAST); break; case NFPROTO_NETDEV: switch (skb->protocol) { @@ -171,14 +168,14 @@ void nft_meta_get_eval(const struct nft_expr *expr, goto err;
if (ipv4_is_multicast(iph->daddr)) - *dest = PACKET_MULTICAST; + nft_reg_store8(dest, PACKET_MULTICAST); else - *dest = PACKET_BROADCAST; + nft_reg_store8(dest, PACKET_BROADCAST);
break; } case htons(ETH_P_IPV6): - *dest = PACKET_MULTICAST; + nft_reg_store8(dest, PACKET_MULTICAST); break; default: WARN_ON_ONCE(1); @@ -233,7 +230,9 @@ void nft_meta_set_eval(const struct nft_expr *expr, { const struct nft_meta *meta = nft_expr_priv(expr); struct sk_buff *skb = pkt->skb; - u32 value = regs->data[meta->sreg]; + u32 *sreg = ®s->data[meta->sreg]; + u32 value = *sreg; + u8 pkt_type;
switch (meta->key) { case NFT_META_MARK: @@ -243,9 +242,12 @@ void nft_meta_set_eval(const struct nft_expr *expr, skb->priority = value; break; case NFT_META_PKTTYPE: - if (skb->pkt_type != value && - skb_pkt_type_ok(value) && skb_pkt_type_ok(skb->pkt_type)) - skb->pkt_type = value; + pkt_type = nft_reg_load8(sreg); + + if (skb->pkt_type != pkt_type && + skb_pkt_type_ok(pkt_type) && + skb_pkt_type_ok(skb->pkt_type)) + skb->pkt_type = pkt_type; break; case NFT_META_NFTRACE: skb->nf_trace = !!value; diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c index ee2d71753746..4c48e9bb21e2 100644 --- a/net/netfilter/nft_nat.c +++ b/net/netfilter/nft_nat.c @@ -65,10 +65,10 @@ static void nft_nat_eval(const struct nft_expr *expr, }
if (priv->sreg_proto_min) { - range.min_proto.all = - *(__be16 *)®s->data[priv->sreg_proto_min]; - range.max_proto.all = - *(__be16 *)®s->data[priv->sreg_proto_max]; + range.min_proto.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_min]); + range.max_proto.all = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg_proto_max]); range.flags |= NF_NAT_RANGE_PROTO_SPECIFIED; }
From: Dave Taht dave.taht@gmail.com
While most distributions long ago switched to the iproute2 suite of utilities, which allow class-e (240.0.0.0/4) address assignment, distributions relying on busybox, toybox and other forms of ifconfig cannot assign class-e addresses without this kernel patch.
While CIDR has been obsolete for 2 decades, and a survey of all the open source code in the world shows the IN_whatever macros are also obsolete... rather than obsolete CIDR from this ioctl entirely, this patch merely enables class-e assignment, sanely.
Signed-off-by: Dave Taht dave.taht@gmail.com Signed-off-by: David S. Miller davem@davemloft.net --- include/uapi/linux/in.h | 10 +++++++--- net/ipv4/devinet.c | 5 +++-- net/ipv4/ipconfig.c | 2 ++ 3 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h index eaf94919291a..550b234be4a1 100644 --- a/include/uapi/linux/in.h +++ b/include/uapi/linux/in.h @@ -264,10 +264,14 @@ struct sockaddr_in {
#define IN_CLASSD(a) ((((long int) (a)) & 0xf0000000) == 0xe0000000) #define IN_MULTICAST(a) IN_CLASSD(a) -#define IN_MULTICAST_NET 0xF0000000 +#define IN_MULTICAST_NET 0xe0000000
-#define IN_EXPERIMENTAL(a) ((((long int) (a)) & 0xf0000000) == 0xf0000000) -#define IN_BADCLASS(a) IN_EXPERIMENTAL((a)) +#define IN_BADCLASS(a) ((((long int) (a) ) == 0xffffffff) +#define IN_EXPERIMENTAL(a) IN_BADCLASS((a)) + +#define IN_CLASSE(a) ((((long int) (a)) & 0xf0000000) == 0xf0000000) +#define IN_CLASSE_NET 0xffffffff +#define IN_CLASSE_NSHIFT 0
/* Address to accept any incoming messages. */ #define INADDR_ANY ((unsigned long int) 0x00000000) diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index f08f984ebc56..32335ed0e9fe 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -898,17 +898,18 @@ static int inet_abc_len(__be32 addr) { int rc = -1; /* Something else, probably a multicast. */
- if (ipv4_is_zeronet(addr)) + if (ipv4_is_zeronet(addr) || ipv4_is_lbcast(addr)) rc = 0; else { __u32 haddr = ntohl(addr); - if (IN_CLASSA(haddr)) rc = 8; else if (IN_CLASSB(haddr)) rc = 16; else if (IN_CLASSC(haddr)) rc = 24; + else if (IN_CLASSE(haddr)) + rc = 32; }
return rc; diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c index d278b06459ac..ecf19f7ab5fd 100644 --- a/net/ipv4/ipconfig.c +++ b/net/ipv4/ipconfig.c @@ -455,6 +455,8 @@ static int __init ic_defaults(void) ic_netmask = htonl(IN_CLASSB_NET); else if (IN_CLASSC(ntohl(ic_myaddr))) ic_netmask = htonl(IN_CLASSC_NET); + else if (IN_CLASSE(ntohl(ic_myaddr))) + ic_netmask = htonl(IN_CLASSE_NET); else { pr_err("IP-Config: Unable to guess netmask for address %pI4\n", &ic_myaddr);
From: Hauke Mehrtens hauke@hauke-m.de
Musl provides its own ethhdr struct definition. Add a guard to prevent its definition of the appropriate musl header has already been included.
glibc does not implement this header, but when glibc will implement this they can just define __UAPI_DEF_ETHHDR 0 to make it work with the kernel.
Signed-off-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: David S. Miller davem@davemloft.net --- include/uapi/linux/if_ether.h | 3 +++ include/uapi/linux/libc-compat.h | 6 ++++++ 2 files changed, 9 insertions(+)
diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h index 659b1634de61..b31e2f836317 100644 --- a/include/uapi/linux/if_ether.h +++ b/include/uapi/linux/if_ether.h @@ -22,6 +22,7 @@ #define _UAPI_LINUX_IF_ETHER_H
#include <linux/types.h> +#include <linux/libc-compat.h>
/* * IEEE 802.3 Ethernet magic constants. The frame sizes omit the preamble @@ -139,11 +140,13 @@ * This is an Ethernet frame header. */
+#if __UAPI_DEF_ETHHDR struct ethhdr { unsigned char h_dest[ETH_ALEN]; /* destination eth addr */ unsigned char h_source[ETH_ALEN]; /* source ether addr */ __be16 h_proto; /* packet type ID field */ } __attribute__((packed)); +#endif
#endif /* _UAPI_LINUX_IF_ETHER_H */ diff --git a/include/uapi/linux/libc-compat.h b/include/uapi/linux/libc-compat.h index 774cb2db1b89..7c098d68d5e5 100644 --- a/include/uapi/linux/libc-compat.h +++ b/include/uapi/linux/libc-compat.h @@ -263,4 +263,10 @@
#endif /* __GLIBC__ */
+/* Definitions for if_ether.h */ +/* allow libcs like musl to deactivate this, glibc does not implement this. */ +#ifndef __UAPI_DEF_ETHHDR +#define __UAPI_DEF_ETHHDR 1 +#endif + #endif /* _UAPI_LIBC_COMPAT_H */
On Thu, Feb 14, 2019 at 11:24:26AM +0100, Linus Walleij wrote:
This is a series of patches used in OpenWrt's v4.9 backports that seem to be of generic interest to v4.9.y
As a bunch of commits hit networking I CC DaveM and here so the can nod/object to these, I think the want a bit of control over what goes into networking stable.
I avoided two patch sets dealing with IPv6 tunneling and netfilter that seem to still be fixed and I feel generally incompetent about.
For the remaining patches I cherry-picked the upstream commits except for (8/10) "netfilter: nf_tables: fix mismatch in big-endian system" where I used OpenWrt's backport.
The list of upstream commits in patch order: 6db6f0eae6052b70885562e1733896647ec1d807 "bridge: multicast to unicast" e9156cd26a495a18706e796f02a81fee41ec14f4 "smsc95xx: Use skb_cow_head to deal with cloned skbs" 6bc6895bdd6744e0136eaa4a11fbdb20a7db4e40 "ch9200: use skb_cow_head() to deal with cloned skbs" 39fba7835aacda65284a86e611774cbba71dac20 "kaweth: use skb_cow_head() to deal with cloned skbs" 854826c9d526fd81077742c3b000e3f7fcaef3ce "ubifs: Drop softlimit and delta fields from struct ubifs_wbuf" 1b7fc2c0069f3864a3dda15430b7aded31c0bfcc "ubifs: Use dirty_writeback_interval value for wbuf timer" cd4b1e34655d46950c065d9284b596cd8d7b28cd "usb: dwc2: Remove unnecessary kfree" 10596608c4d62cb8c1c2b806debcbd32fe657e7 "netfilter: nf_tables: fix mismatch in big-endian system" 65cab850f0eeaa9180bd2e10a231964f33743edf "net: Allow class-e address assignment via ifconfig ioctl" 6926e041a8920c8ec27e4e155efa760aa01551fd "uapi/if_ether.h: prevent redefinition of struct ethhdr"
Ick that's hard to read. Going with a bit longer line is better:
6db6f0eae6052b70885562e1733896647ec1d807 "bridge: multicast to unicast" e9156cd26a495a18706e796f02a81fee41ec14f4 "smsc95xx: Use skb_cow_head to deal with cloned skbs" 6bc6895bdd6744e0136eaa4a11fbdb20a7db4e40 "ch9200: use skb_cow_head() to deal with cloned skbs" 39fba7835aacda65284a86e611774cbba71dac20 "kaweth: use skb_cow_head() to deal with cloned skbs" 854826c9d526fd81077742c3b000e3f7fcaef3ce "ubifs: Drop softlimit and delta fields from struct ubifs_wbuf" 1b7fc2c0069f3864a3dda15430b7aded31c0bfcc "ubifs: Use dirty_writeback_interval value for wbuf timer" cd4b1e34655d46950c065d9284b596cd8d7b28cd "usb: dwc2: Remove unnecessary kfree" 10596608c4d62cb8c1c2b806debcbd32fe657e7 "netfilter: nf_tables: fix mismatch in big-endian system" 65cab850f0eeaa9180bd2e10a231964f33743edf "net: Allow class-e address assignment via ifconfig ioctl" 6926e041a8920c8ec27e4e155efa760aa01551fd "uapi/if_ether.h: prevent redefinition of struct ethhdr"
Or better yet, use the "standard" way we reference ids in the kernel: 6db6f0eae605 ("bridge: multicast to unicast") e9156cd26a49 ("smsc95xx: Use skb_cow_head to deal with cloned skbs") ...
Anyway, I need those ids in the patches themselves, as we do for all stable patches (look at how they look in the stable trees for examples).
Also, some of these are in kernels newer than 4.14, so if I were to apply them to 4.9 only, someone moving to a newer kernel would have a regression, which isn't ok.
65cab850f0ee ("net: Allow class-e address assignment via ifconfig ioctl") is one such example of that type of patch.
So I can't take these as-is, sorry.
greg k-h
On Thu, Feb 14, 2019 at 11:42 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
Also, some of these are in kernels newer than 4.14, so if I were to apply them to 4.9 only, someone moving to a newer kernel would have a regression, which isn't ok.
65cab850f0ee ("net: Allow class-e address assignment via ifconfig ioctl") is one such example of that type of patch.
So I can't take these as-is, sorry.
OK I will look closer for each of them and try to mention which stable each of them should go to, I thought they would all be for v4.9 but I was wrong. I'll update and respin, NP.
Yours, Linus Walleij
linux-stable-mirror@lists.linaro.org