This is a note to let you know that I've just added the patch titled
s390/qeth: fix thinko in IPv4 multicast address tracking
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-thinko-in-ipv4-multicast-address-tracking.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Fri, 1 Dec 2017 10:14:49 +0100
Subject: s390/qeth: fix thinko in IPv4 multicast address tracking
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upsteam commit bc3ab70584696cb798b9e1e0ac8e6ced5fd4c3b8 ]
Commit 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
reworked how secondary addresses are managed for qeth devices.
Instead of dropping & subsequently re-adding all addresses on every
ndo_set_rx_mode() call, qeth now keeps track of the addresses that are
currently registered with the HW.
On a ndo_set_rx_mode(), we thus only need to do (de-)registration
requests for the addresses that have actually changed.
On L3 devices, the lookup for IPv4 Multicast addresses checks the wrong
hashtable - and thus never finds a match. As a result, we first delete
*all* such addresses, and then re-add them again. So each set_rx_mode()
causes a short period where the IPv4 Multicast addresses are not
registered, and the card stops forwarding inbound traffic for them.
Fix this by setting the ->is_multicast flag on the lookup object, thus
enabling qeth_l3_ip_from_hash() to search the correct hashtable and
find a match there.
Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l3_main.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -1376,6 +1376,7 @@ qeth_l3_add_mc_to_hash(struct qeth_card
tmp->u.a4.addr = be32_to_cpu(im4->multiaddr);
memcpy(tmp->mac, buf, sizeof(tmp->mac));
+ tmp->is_multicast = 1;
ipm = qeth_l3_ip_from_hash(card, tmp);
if (ipm) {
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.14/s390-qeth-fix-thinko-in-ipv4-multicast-address-tracking.patch
queue-4.14/s390-qeth-fix-gso-throughput-regression.patch
queue-4.14/s390-qeth-fix-early-exit-from-error-path.patch
queue-4.14/s390-qeth-build-max-size-gso-skbs-on-l2-devices.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix GSO throughput regression
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-gso-throughput-regression.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Fri, 1 Dec 2017 10:14:50 +0100
Subject: s390/qeth: fix GSO throughput regression
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 6d69b1f1eb7a2edf8a3547f361c61f2538e054bb ]
Using GSO with small MTUs currently results in a substantial throughput
regression - which is caused by how qeth needs to map non-linear skbs
into its IO buffer elements:
compared to a linear skb, each GSO-segmented skb effectively consumes
twice as many buffer elements (ie two instead of one) due to the
additional header-only part. This causes the Output Queue to be
congested with low-utilized IO buffers.
Fix this as follows:
If the MSS is low enough so that a non-SG GSO segmentation produces
order-0 skbs (currently ~3500 byte), opt out from NETIF_F_SG. This is
where we anticipate the biggest savings, since an SG-enabled
GSO segmentation produces skbs that always consume at least two
buffer elements.
Larger MSS values continue to get a SG-enabled GSO segmentation, since
1) the relative overhead of the additional header-only buffer element
becomes less noticeable, and
2) the linearization overhead increases.
With the throughput regression fixed, re-enable NETIF_F_SG by default to
reap the significant CPU savings of GSO.
Fixes: 5722963a8e83 ("qeth: do not turn on SG per default")
Reported-by: Nils Hoppmann <niho(a)de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_core.h | 3 +++
drivers/s390/net/qeth_core_main.c | 31 +++++++++++++++++++++++++++++++
drivers/s390/net/qeth_l2_main.c | 2 ++
drivers/s390/net/qeth_l3_main.c | 2 ++
4 files changed, 38 insertions(+)
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -985,6 +985,9 @@ struct qeth_cmd_buffer *qeth_get_setassp
int qeth_set_features(struct net_device *, netdev_features_t);
int qeth_recover_features(struct net_device *);
netdev_features_t qeth_fix_features(struct net_device *, netdev_features_t);
+netdev_features_t qeth_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features);
int qeth_vm_request_mac(struct qeth_card *card);
int qeth_push_hdr(struct sk_buff *skb, struct qeth_hdr **hdr, unsigned int len);
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -19,6 +19,11 @@
#include <linux/mii.h>
#include <linux/kthread.h>
#include <linux/slab.h>
+#include <linux/if_vlan.h>
+#include <linux/netdevice.h>
+#include <linux/netdev_features.h>
+#include <linux/skbuff.h>
+
#include <net/iucv/af_iucv.h>
#include <net/dsfield.h>
@@ -6505,6 +6510,32 @@ netdev_features_t qeth_fix_features(stru
}
EXPORT_SYMBOL_GPL(qeth_fix_features);
+netdev_features_t qeth_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ /* GSO segmentation builds skbs with
+ * a (small) linear part for the headers, and
+ * page frags for the data.
+ * Compared to a linear skb, the header-only part consumes an
+ * additional buffer element. This reduces buffer utilization, and
+ * hurts throughput. So compress small segments into one element.
+ */
+ if (netif_needs_gso(skb, features)) {
+ /* match skb_segment(): */
+ unsigned int doffset = skb->data - skb_mac_header(skb);
+ unsigned int hsize = skb_shinfo(skb)->gso_size;
+ unsigned int hroom = skb_headroom(skb);
+
+ /* linearize only if resulting skb allocations are order-0: */
+ if (SKB_DATA_ALIGN(hroom + doffset + hsize) <= SKB_MAX_HEAD(0))
+ features &= ~NETIF_F_SG;
+ }
+
+ return vlan_features_check(skb, features);
+}
+EXPORT_SYMBOL_GPL(qeth_features_check);
+
static int __init qeth_core_init(void)
{
int rc;
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -963,6 +963,7 @@ static const struct net_device_ops qeth_
.ndo_stop = qeth_l2_stop,
.ndo_get_stats = qeth_get_stats,
.ndo_start_xmit = qeth_l2_hard_start_xmit,
+ .ndo_features_check = qeth_features_check,
.ndo_validate_addr = eth_validate_addr,
.ndo_set_rx_mode = qeth_l2_set_rx_mode,
.ndo_do_ioctl = qeth_do_ioctl,
@@ -1009,6 +1010,7 @@ static int qeth_l2_setup_netdev(struct q
if (card->info.type == QETH_CARD_TYPE_OSD && !card->info.guestlan) {
card->dev->hw_features = NETIF_F_SG;
card->dev->vlan_features = NETIF_F_SG;
+ card->dev->features |= NETIF_F_SG;
/* OSA 3S and earlier has no RX/TX support */
if (qeth_is_supported(card, IPA_OUTBOUND_CHECKSUM)) {
card->dev->hw_features |= NETIF_F_IP_CSUM;
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -2923,6 +2923,7 @@ static const struct net_device_ops qeth_
.ndo_stop = qeth_l3_stop,
.ndo_get_stats = qeth_get_stats,
.ndo_start_xmit = qeth_l3_hard_start_xmit,
+ .ndo_features_check = qeth_features_check,
.ndo_validate_addr = eth_validate_addr,
.ndo_set_rx_mode = qeth_l3_set_multicast_list,
.ndo_do_ioctl = qeth_do_ioctl,
@@ -2963,6 +2964,7 @@ static int qeth_l3_setup_netdev(struct q
card->dev->vlan_features = NETIF_F_SG |
NETIF_F_RXCSUM | NETIF_F_IP_CSUM |
NETIF_F_TSO;
+ card->dev->features |= NETIF_F_SG;
}
}
} else if (card->info.type == QETH_CARD_TYPE_IQD) {
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.14/s390-qeth-fix-thinko-in-ipv4-multicast-address-tracking.patch
queue-4.14/s390-qeth-fix-gso-throughput-regression.patch
queue-4.14/s390-qeth-fix-early-exit-from-error-path.patch
queue-4.14/s390-qeth-build-max-size-gso-skbs-on-l2-devices.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: fix early exit from error path
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-fix-early-exit-from-error-path.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Wed, 18 Oct 2017 17:40:17 +0200
Subject: s390/qeth: fix early exit from error path
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 83cf79a2fec3cf499eb6cb9eb608656fc2a82776 ]
When the allocation of the addr buffer fails, we need to free
our refcount on the inetdevice before returning.
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l3_main.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -1553,7 +1553,7 @@ static void qeth_l3_free_vlan_addresses4
addr = qeth_l3_get_addr_buffer(QETH_PROT_IPV4);
if (!addr)
- return;
+ goto out;
spin_lock_bh(&card->ip_lock);
@@ -1567,6 +1567,7 @@ static void qeth_l3_free_vlan_addresses4
spin_unlock_bh(&card->ip_lock);
kfree(addr);
+out:
in_dev_put(in_dev);
}
@@ -1591,7 +1592,7 @@ static void qeth_l3_free_vlan_addresses6
addr = qeth_l3_get_addr_buffer(QETH_PROT_IPV6);
if (!addr)
- return;
+ goto out;
spin_lock_bh(&card->ip_lock);
@@ -1606,6 +1607,7 @@ static void qeth_l3_free_vlan_addresses6
spin_unlock_bh(&card->ip_lock);
kfree(addr);
+out:
in6_dev_put(in6_dev);
#endif /* CONFIG_QETH_IPV6 */
}
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.14/s390-qeth-fix-thinko-in-ipv4-multicast-address-tracking.patch
queue-4.14/s390-qeth-fix-gso-throughput-regression.patch
queue-4.14/s390-qeth-fix-early-exit-from-error-path.patch
queue-4.14/s390-qeth-build-max-size-gso-skbs-on-l2-devices.patch
This is a note to let you know that I've just added the patch titled
s390/qeth: build max size GSO skbs on L2 devices
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
s390-qeth-build-max-size-gso-skbs-on-l2-devices.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Date: Fri, 1 Dec 2017 10:14:51 +0100
Subject: s390/qeth: build max size GSO skbs on L2 devices
From: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
[ Upstream commit 0cbff6d4546613330a1c5f139f5c368e4ce33ca1 ]
The current GSO skb size limit was copy&pasted over from the L3 path,
where it is needed due to a TSO limitation.
As L2 devices don't offer TSO support (and thus all GSO skbs are
segmented before they reach the driver), there's no reason to restrict
the stack in how large it may build the GSO skbs.
Fixes: d52aec97e5bc ("qeth: enable scatter/gather in layer 2 mode")
Signed-off-by: Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/s390/net/qeth_l2_main.c | 2 --
drivers/s390/net/qeth_l3_main.c | 4 ++--
2 files changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -1027,8 +1027,6 @@ static int qeth_l2_setup_netdev(struct q
card->info.broadcast_capable = 1;
qeth_l2_request_initial_mac(card);
- card->dev->gso_max_size = (QETH_MAX_BUFFER_ELEMENTS(card) - 1) *
- PAGE_SIZE;
SET_NETDEV_DEV(card->dev, &card->gdev->dev);
netif_napi_add(card->dev, &card->napi, qeth_poll, QETH_NAPI_WEIGHT);
netif_carrier_off(card->dev);
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -2989,8 +2989,8 @@ static int qeth_l3_setup_netdev(struct q
NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_FILTER;
netif_keep_dst(card->dev);
- card->dev->gso_max_size = (QETH_MAX_BUFFER_ELEMENTS(card) - 1) *
- PAGE_SIZE;
+ netif_set_gso_max_size(card->dev, (QETH_MAX_BUFFER_ELEMENTS(card) - 1) *
+ PAGE_SIZE);
SET_NETDEV_DEV(card->dev, &card->gdev->dev);
netif_napi_add(card->dev, &card->napi, qeth_poll, QETH_NAPI_WEIGHT);
Patches currently in stable-queue which might be from jwi(a)linux.vnet.ibm.com are
queue-4.14/s390-qeth-fix-thinko-in-ipv4-multicast-address-tracking.patch
queue-4.14/s390-qeth-fix-gso-throughput-regression.patch
queue-4.14/s390-qeth-fix-early-exit-from-error-path.patch
queue-4.14/s390-qeth-build-max-size-gso-skbs-on-l2-devices.patch
This is a note to let you know that I've just added the patch titled
packet: fix crash in fanout_demux_rollover()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
packet-fix-crash-in-fanout_demux_rollover.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Mike Maloney <maloney(a)google.com>
Date: Tue, 28 Nov 2017 10:44:29 -0500
Subject: packet: fix crash in fanout_demux_rollover()
From: Mike Maloney <maloney(a)google.com>
syzkaller found a race condition fanout_demux_rollover() while removing
a packet socket from a fanout group.
po->rollover is read and operated on during packet_rcv_fanout(), via
fanout_demux_rollover(), but the pointer is currently cleared before the
synchronization in packet_release(). It is safer to delay the cleanup
until after synchronize_net() has been called, ensuring all calls to
packet_rcv_fanout() for this socket have finished.
To further simplify synchronization around the rollover structure, set
po->rollover in fanout_add() only if there are no errors. This removes
the need for rcu in the struct and in the call to
packet_getsockopt(..., PACKET_ROLLOVER_STATS, ...).
Crashing stack trace:
fanout_demux_rollover+0xb6/0x4d0 net/packet/af_packet.c:1392
packet_rcv_fanout+0x649/0x7c8 net/packet/af_packet.c:1487
dev_queue_xmit_nit+0x835/0xc10 net/core/dev.c:1953
xmit_one net/core/dev.c:2975 [inline]
dev_hard_start_xmit+0x16b/0xac0 net/core/dev.c:2995
__dev_queue_xmit+0x17a4/0x2050 net/core/dev.c:3476
dev_queue_xmit+0x17/0x20 net/core/dev.c:3509
neigh_connected_output+0x489/0x720 net/core/neighbour.c:1379
neigh_output include/net/neighbour.h:482 [inline]
ip6_finish_output2+0xad1/0x22a0 net/ipv6/ip6_output.c:120
ip6_finish_output+0x2f9/0x920 net/ipv6/ip6_output.c:146
NF_HOOK_COND include/linux/netfilter.h:239 [inline]
ip6_output+0x1f4/0x850 net/ipv6/ip6_output.c:163
dst_output include/net/dst.h:459 [inline]
NF_HOOK.constprop.35+0xff/0x630 include/linux/netfilter.h:250
mld_sendpack+0x6a8/0xcc0 net/ipv6/mcast.c:1660
mld_send_initial_cr.part.24+0x103/0x150 net/ipv6/mcast.c:2072
mld_send_initial_cr net/ipv6/mcast.c:2056 [inline]
ipv6_mc_dad_complete+0x99/0x130 net/ipv6/mcast.c:2079
addrconf_dad_completed+0x595/0x970 net/ipv6/addrconf.c:4039
addrconf_dad_work+0xac9/0x1160 net/ipv6/addrconf.c:3971
process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x35e/0x430 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:432
Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state")
Fixes: 509c7a1ecc860 ("packet: avoid panic in packet_getsockopt()")
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: Mike Maloney <maloney(a)google.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/packet/af_packet.c | 32 ++++++++++----------------------
net/packet/internal.h | 1 -
2 files changed, 10 insertions(+), 23 deletions(-)
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1697,7 +1697,6 @@ static int fanout_add(struct sock *sk, u
atomic_long_set(&rollover->num, 0);
atomic_long_set(&rollover->num_huge, 0);
atomic_long_set(&rollover->num_failed, 0);
- po->rollover = rollover;
}
if (type_flags & PACKET_FANOUT_FLAG_UNIQUEID) {
@@ -1755,6 +1754,8 @@ static int fanout_add(struct sock *sk, u
if (refcount_read(&match->sk_ref) < PACKET_FANOUT_MAX) {
__dev_remove_pack(&po->prot_hook);
po->fanout = match;
+ po->rollover = rollover;
+ rollover = NULL;
refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1);
__fanout_link(sk, po);
err = 0;
@@ -1768,10 +1769,7 @@ static int fanout_add(struct sock *sk, u
}
out:
- if (err && rollover) {
- kfree_rcu(rollover, rcu);
- po->rollover = NULL;
- }
+ kfree(rollover);
mutex_unlock(&fanout_mutex);
return err;
}
@@ -1795,11 +1793,6 @@ static struct packet_fanout *fanout_rele
list_del(&f->list);
else
f = NULL;
-
- if (po->rollover) {
- kfree_rcu(po->rollover, rcu);
- po->rollover = NULL;
- }
}
mutex_unlock(&fanout_mutex);
@@ -3039,6 +3032,7 @@ static int packet_release(struct socket
synchronize_net();
if (f) {
+ kfree(po->rollover);
fanout_release_data(f);
kfree(f);
}
@@ -3853,7 +3847,6 @@ static int packet_getsockopt(struct sock
void *data = &val;
union tpacket_stats_u st;
struct tpacket_rollover_stats rstats;
- struct packet_rollover *rollover;
if (level != SOL_PACKET)
return -ENOPROTOOPT;
@@ -3932,18 +3925,13 @@ static int packet_getsockopt(struct sock
0);
break;
case PACKET_ROLLOVER_STATS:
- rcu_read_lock();
- rollover = rcu_dereference(po->rollover);
- if (rollover) {
- rstats.tp_all = atomic_long_read(&rollover->num);
- rstats.tp_huge = atomic_long_read(&rollover->num_huge);
- rstats.tp_failed = atomic_long_read(&rollover->num_failed);
- data = &rstats;
- lv = sizeof(rstats);
- }
- rcu_read_unlock();
- if (!rollover)
+ if (!po->rollover)
return -EINVAL;
+ rstats.tp_all = atomic_long_read(&po->rollover->num);
+ rstats.tp_huge = atomic_long_read(&po->rollover->num_huge);
+ rstats.tp_failed = atomic_long_read(&po->rollover->num_failed);
+ data = &rstats;
+ lv = sizeof(rstats);
break;
case PACKET_TX_HAS_OFF:
val = po->tp_tx_has_off;
--- a/net/packet/internal.h
+++ b/net/packet/internal.h
@@ -95,7 +95,6 @@ struct packet_fanout {
struct packet_rollover {
int sock;
- struct rcu_head rcu;
atomic_long_t num;
atomic_long_t num_huge;
atomic_long_t num_failed;
Patches currently in stable-queue which might be from maloney(a)google.com are
queue-4.14/packet-fix-crash-in-fanout_demux_rollover.patch
This is a note to let you know that I've just added the patch titled
net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-thunderx-fix-tcp-udp-checksum-offload-for-ipv6-pkts.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Sunil Goutham <sgoutham(a)cavium.com>
Date: Thu, 23 Nov 2017 22:34:31 +0300
Subject: net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts
From: Sunil Goutham <sgoutham(a)cavium.com>
[ Upstream commit fa6d7cb5d76cf0467c61420fc9238045aedfd379 ]
Don't offload IP header checksum to NIC.
This fixes a previous patch which enabled checksum offloading
for both IPv4 and IPv6 packets. So L3 checksum offload was
getting enabled for IPv6 pkts. And HW is dropping these pkts
as it assumes the pkt is IPv4 when IP csum offload is set
in the SQ descriptor.
Fixes: 3a9024f52c2e ("net: thunderx: Enable TSO and checksum offloads for ipv6")
Signed-off-by: Sunil Goutham <sgoutham(a)cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov(a)auriga.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 1 -
1 file changed, 1 deletion(-)
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1355,7 +1355,6 @@ nicvf_sq_add_hdr_subdesc(struct nicvf *n
/* Offload checksum calculation to HW */
if (skb->ip_summed == CHECKSUM_PARTIAL) {
- hdr->csum_l3 = 1; /* Enable IP csum calculation */
hdr->l3_offset = skb_network_offset(skb);
hdr->l4_offset = skb_transport_offset(skb);
Patches currently in stable-queue which might be from sgoutham(a)cavium.com are
queue-4.14/net-thunderx-fix-tcp-udp-checksum-offload-for-ipv6-pkts.patch
queue-4.14/net-thunderx-fix-tcp-udp-checksum-offload-for-ipv4-pkts.patch
This is a note to let you know that I've just added the patch titled
net: thunderx: Fix TCP/UDP checksum offload for IPv4 pkts
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-thunderx-fix-tcp-udp-checksum-offload-for-ipv4-pkts.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Florian Westphal <fw(a)strlen.de>
Date: Wed, 6 Dec 2017 01:04:50 +0100
Subject: net: thunderx: Fix TCP/UDP checksum offload for IPv4 pkts
From: Florian Westphal <fw(a)strlen.de>
[ Upstream commit 134059fd2775be79e26c2dff87d25cc2f6ea5626 ]
Offload IP header checksum to NIC.
This fixes a previous patch which disabled checksum offloading
for both IPv4 and IPv6 packets. So L3 checksum offload was
getting disabled for IPv4 pkts. And HW is dropping these pkts
for some reason.
Without this patch, IPv4 TSO appears to be broken:
WIthout this patch I get ~16kbyte/s, with patch close to 2mbyte/s
when copying files via scp from test box to my home workstation.
Looking at tcpdump on sender it looks like hardware drops IPv4 TSO skbs.
This patch restores performance for me, ipv6 looks good too.
Fixes: fa6d7cb5d76c ("net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts")
Cc: Sunil Goutham <sgoutham(a)cavium.com>
Cc: Aleksey Makarov <aleksey.makarov(a)auriga.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: Florian Westphal <fw(a)strlen.de>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1355,6 +1355,8 @@ nicvf_sq_add_hdr_subdesc(struct nicvf *n
/* Offload checksum calculation to HW */
if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ if (ip.v4->version == 4)
+ hdr->csum_l3 = 1; /* Enable IP csum calculation */
hdr->l3_offset = skb_network_offset(skb);
hdr->l4_offset = skb_transport_offset(skb);
Patches currently in stable-queue which might be from fw(a)strlen.de are
queue-4.14/tcp-remove-buggy-call-to-tcp_v6_restore_cb.patch
queue-4.14/net-thunderx-fix-tcp-udp-checksum-offload-for-ipv4-pkts.patch
This is a note to let you know that I've just added the patch titled
net: sched: cbq: create block for q->link.block
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-sched-cbq-create-block-for-q-link.block.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Jiri Pirko <jiri(a)mellanox.com>
Date: Mon, 27 Nov 2017 18:37:21 +0100
Subject: net: sched: cbq: create block for q->link.block
From: Jiri Pirko <jiri(a)mellanox.com>
[ Upstream commit d51aae68b142f48232257e96ce317db25445418d ]
q->link.block is not initialized, that leads to EINVAL when one tries to
add filter there. So initialize it properly.
This can be reproduced by:
$ tc qdisc add dev eth0 root handle 1: cbq avpkt 1000 rate 1000Mbit bandwidth 1000Mbit
$ tc filter add dev eth0 parent 1: protocol ip prio 100 u32 match ip protocol 0 0x00 flowid 1:1
Reported-by: Jaroslav Aster <jaster(a)redhat.com>
Reported-by: Ivan Vecera <ivecera(a)redhat.com>
Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
Signed-off-by: Jiri Pirko <jiri(a)mellanox.com>
Acked-by: Eelco Chaudron <echaudro(a)redhat.com>
Reviewed-by: Ivan Vecera <ivecera(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sched/sch_cbq.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1157,9 +1157,13 @@ static int cbq_init(struct Qdisc *sch, s
if ((q->link.R_tab = qdisc_get_rtab(r, tb[TCA_CBQ_RTAB])) == NULL)
return -EINVAL;
+ err = tcf_block_get(&q->link.block, &q->link.filter_list);
+ if (err)
+ goto put_rtab;
+
err = qdisc_class_hash_init(&q->clhash);
if (err < 0)
- goto put_rtab;
+ goto put_block;
q->link.sibling = &q->link;
q->link.common.classid = sch->handle;
@@ -1193,6 +1197,9 @@ static int cbq_init(struct Qdisc *sch, s
cbq_addprio(q, &q->link);
return 0;
+put_block:
+ tcf_block_put(q->link.block);
+
put_rtab:
qdisc_put_rtab(q->link.R_tab);
return err;
Patches currently in stable-queue which might be from jiri(a)mellanox.com are
queue-4.14/net-sched-cbq-create-block-for-q-link.block.patch
This is a note to let you know that I've just added the patch titled
net: remove hlist_nulls_add_tail_rcu()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-remove-hlist_nulls_add_tail_rcu.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 14 11:45:40 CET 2017
From: Eric Dumazet <edumazet(a)google.com>
Date: Tue, 5 Dec 2017 12:45:56 -0800
Subject: net: remove hlist_nulls_add_tail_rcu()
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit d7efc6c11b277d9d80b99b1334a78bfe7d7edf10 ]
Alexander Potapenko reported use of uninitialized memory [1]
This happens when inserting a request socket into TCP ehash,
in __sk_nulls_add_node_rcu(), since sk_reuseport is not initialized.
Bug was added by commit d894ba18d4e4 ("soreuseport: fix ordering for
mixed v4/v6 sockets")
Note that d296ba60d8e2 ("soreuseport: Resolve merge conflict for v4/v6
ordering fix") missed the opportunity to get rid of
hlist_nulls_add_tail_rcu() :
Both UDP sockets and TCP/DCCP listeners no longer use
__sk_nulls_add_node_rcu() for their hash insertion.
Since all other sockets have unique 4-tuple, the reuseport status
has no special meaning, so we can always use hlist_nulls_add_head_rcu()
for them and save few cycles/instructions.
[1]
==================================================================
BUG: KMSAN: use of uninitialized memory in inet_ehash_insert+0xd40/0x1050
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0+ #3288
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:16
dump_stack+0x185/0x1d0 lib/dump_stack.c:52
kmsan_report+0x13f/0x1c0 mm/kmsan/kmsan.c:1016
__msan_warning_32+0x69/0xb0 mm/kmsan/kmsan_instr.c:766
__sk_nulls_add_node_rcu ./include/net/sock.h:684
inet_ehash_insert+0xd40/0x1050 net/ipv4/inet_hashtables.c:413
reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:754
inet_csk_reqsk_queue_hash_add+0x1cc/0x300 net/ipv4/inet_connection_sock.c:765
tcp_conn_request+0x31e7/0x36f0 net/ipv4/tcp_input.c:6414
tcp_v4_conn_request+0x16d/0x220 net/ipv4/tcp_ipv4.c:1314
tcp_rcv_state_process+0x42a/0x7210 net/ipv4/tcp_input.c:5917
tcp_v4_do_rcv+0xa6a/0xcd0 net/ipv4/tcp_ipv4.c:1483
tcp_v4_rcv+0x3de0/0x4ab0 net/ipv4/tcp_ipv4.c:1763
ip_local_deliver_finish+0x6bb/0xcb0 net/ipv4/ip_input.c:216
NF_HOOK ./include/linux/netfilter.h:248
ip_local_deliver+0x3fa/0x480 net/ipv4/ip_input.c:257
dst_input ./include/net/dst.h:477
ip_rcv_finish+0x6fb/0x1540 net/ipv4/ip_input.c:397
NF_HOOK ./include/linux/netfilter.h:248
ip_rcv+0x10f6/0x15c0 net/ipv4/ip_input.c:488
__netif_receive_skb_core+0x36f6/0x3f60 net/core/dev.c:4298
__netif_receive_skb net/core/dev.c:4336
netif_receive_skb_internal+0x63c/0x19c0 net/core/dev.c:4497
napi_skb_finish net/core/dev.c:4858
napi_gro_receive+0x629/0xa50 net/core/dev.c:4889
e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4018
e1000_clean_rx_irq+0x1492/0x1d30
drivers/net/ethernet/intel/e1000/e1000_main.c:4474
e1000_clean+0x43aa/0x5970 drivers/net/ethernet/intel/e1000/e1000_main.c:3819
napi_poll net/core/dev.c:5500
net_rx_action+0x73c/0x1820 net/core/dev.c:5566
__do_softirq+0x4b4/0x8dd kernel/softirq.c:284
invoke_softirq kernel/softirq.c:364
irq_exit+0x203/0x240 kernel/softirq.c:405
exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:638
do_IRQ+0x15e/0x1a0 arch/x86/kernel/irq.c:263
common_interrupt+0x86/0x86
Fixes: d894ba18d4e4 ("soreuseport: fix ordering for mixed v4/v6 sockets")
Fixes: d296ba60d8e2 ("soreuseport: Resolve merge conflict for v4/v6 ordering fix")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: Alexander Potapenko <glider(a)google.com>
Acked-by: Craig Gallek <kraig(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/rculist_nulls.h | 38 --------------------------------------
include/net/sock.h | 6 +-----
2 files changed, 1 insertion(+), 43 deletions(-)
--- a/include/linux/rculist_nulls.h
+++ b/include/linux/rculist_nulls.h
@@ -101,44 +101,6 @@ static inline void hlist_nulls_add_head_
}
/**
- * hlist_nulls_add_tail_rcu
- * @n: the element to add to the hash list.
- * @h: the list to add to.
- *
- * Description:
- * Adds the specified element to the end of the specified hlist_nulls,
- * while permitting racing traversals. NOTE: tail insertion requires
- * list traversal.
- *
- * The caller must take whatever precautions are necessary
- * (such as holding appropriate locks) to avoid racing
- * with another list-mutation primitive, such as hlist_nulls_add_head_rcu()
- * or hlist_nulls_del_rcu(), running on this same list.
- * However, it is perfectly legal to run concurrently with
- * the _rcu list-traversal primitives, such as
- * hlist_nulls_for_each_entry_rcu(), used to prevent memory-consistency
- * problems on Alpha CPUs. Regardless of the type of CPU, the
- * list-traversal primitive must be guarded by rcu_read_lock().
- */
-static inline void hlist_nulls_add_tail_rcu(struct hlist_nulls_node *n,
- struct hlist_nulls_head *h)
-{
- struct hlist_nulls_node *i, *last = NULL;
-
- for (i = hlist_nulls_first_rcu(h); !is_a_nulls(i);
- i = hlist_nulls_next_rcu(i))
- last = i;
-
- if (last) {
- n->next = last->next;
- n->pprev = &last->next;
- rcu_assign_pointer(hlist_nulls_next_rcu(last), n);
- } else {
- hlist_nulls_add_head_rcu(n, h);
- }
-}
-
-/**
* hlist_nulls_for_each_entry_rcu - iterate over rcu list of given type
* @tpos: the type * to use as a loop cursor.
* @pos: the &struct hlist_nulls_node to use as a loop cursor.
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -683,11 +683,7 @@ static inline void sk_add_node_rcu(struc
static inline void __sk_nulls_add_node_rcu(struct sock *sk, struct hlist_nulls_head *list)
{
- if (IS_ENABLED(CONFIG_IPV6) && sk->sk_reuseport &&
- sk->sk_family == AF_INET6)
- hlist_nulls_add_tail_rcu(&sk->sk_nulls_node, list);
- else
- hlist_nulls_add_head_rcu(&sk->sk_nulls_node, list);
+ hlist_nulls_add_head_rcu(&sk->sk_nulls_node, list);
}
static inline void sk_nulls_add_node_rcu(struct sock *sk, struct hlist_nulls_head *list)
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.14/tcp-add-tcp_v4_fill_cb-tcp_v4_restore_cb.patch
queue-4.14/net-thunderx-fix-tcp-udp-checksum-offload-for-ipv6-pkts.patch
queue-4.14/tcp-remove-buggy-call-to-tcp_v6_restore_cb.patch
queue-4.14/net-packet-fix-a-race-in-packet_bind-and-packet_notifier.patch
queue-4.14/tcp-use-ipcb-instead-of-tcp_skb_cb-in-inet_exact_dif_match.patch
queue-4.14/net-thunderx-fix-tcp-udp-checksum-offload-for-ipv4-pkts.patch
queue-4.14/packet-fix-crash-in-fanout_demux_rollover.patch
queue-4.14/net-remove-hlist_nulls_add_tail_rcu.patch
queue-4.14/tcp-when-scheduling-tlp-time-of-rto-should-account-for-current-ack.patch
queue-4.14/tcp-dccp-block-bh-before-arming-time_wait-timer.patch
queue-4.14/tcp-use-current-time-in-tcp_rcv_space_adjust.patch