This is a note to let you know that I've just added the patch titled
net/mlx5e: Specify numa node when allocating drop rq
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5e-specify-numa-node-when-allocating-drop-rq.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Gal Pressman <galp(a)mellanox.com>
Date: Thu, 25 Jan 2018 18:00:41 +0200
Subject: net/mlx5e: Specify numa node when allocating drop rq
From: Gal Pressman <galp(a)mellanox.com>
[ Upstream commit 2f0db87901698cd73d828cc6fb1957b8916fc911 ]
When allocating a drop rq, no numa node is explicitly set which means
allocations are done on node zero. This is not necessarily the nearest
numa node to the HCA, and even worse, might even be a memoryless numa
node.
Choose the numa_node given to us by the pci device in order to properly
allocate the coherent dma memory instead of assuming zero is valid.
Fixes: 556dd1b9c313 ("net/mlx5e: Set drop RQ's necessary parameters only")
Signed-off-by: Gal Pressman <galp(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1911,13 +1911,16 @@ static void mlx5e_build_rq_param(struct
param->wq.linear = 1;
}
-static void mlx5e_build_drop_rq_param(struct mlx5e_rq_param *param)
+static void mlx5e_build_drop_rq_param(struct mlx5_core_dev *mdev,
+ struct mlx5e_rq_param *param)
{
void *rqc = param->rqc;
void *wq = MLX5_ADDR_OF(rqc, rqc, wq);
MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_LINKED_LIST);
MLX5_SET(wq, wq, log_wq_stride, ilog2(sizeof(struct mlx5e_rx_wqe)));
+
+ param->wq.buf_numa_node = dev_to_node(&mdev->pdev->dev);
}
static void mlx5e_build_sq_param_common(struct mlx5e_priv *priv,
@@ -2774,6 +2777,9 @@ static int mlx5e_alloc_drop_cq(struct ml
struct mlx5e_cq *cq,
struct mlx5e_cq_param *param)
{
+ param->wq.buf_numa_node = dev_to_node(&mdev->pdev->dev);
+ param->wq.db_numa_node = dev_to_node(&mdev->pdev->dev);
+
return mlx5e_alloc_cq_common(mdev, param, cq);
}
@@ -2785,7 +2791,7 @@ static int mlx5e_open_drop_rq(struct mlx
struct mlx5e_cq *cq = &drop_rq->cq;
int err;
- mlx5e_build_drop_rq_param(&rq_param);
+ mlx5e_build_drop_rq_param(mdev, &rq_param);
err = mlx5e_alloc_drop_cq(mdev, cq, &cq_param);
if (err)
Patches currently in stable-queue which might be from galp(a)mellanox.com are
queue-4.15/net-mlx5e-specify-numa-node-when-allocating-drop-rq.patch
queue-4.15/net-mlx5e-fix-tcp-checksum-in-lro-buffers.patch
This is a note to let you know that I've just added the patch titled
net/mlx5e: Verify inline header size do not exceed SKB linear size
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5e-verify-inline-header-size-do-not-exceed-skb-linear-size.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Eran Ben Elisha <eranbe(a)mellanox.com>
Date: Thu, 25 Jan 2018 11:18:09 +0200
Subject: net/mlx5e: Verify inline header size do not exceed SKB linear size
From: Eran Ben Elisha <eranbe(a)mellanox.com>
[ Upstream commit f600c6088018d1dbc5777d18daa83660f7ea4a64 ]
Driver tries to copy at least MLX5E_MIN_INLINE bytes into the control
segment of the WQE. It assumes that the linear part contains at least
MLX5E_MIN_INLINE bytes, which can be wrong.
Cited commit verified that driver will not copy more bytes into the
inline header part that the actual size of the packet. Re-factor this
check to make sure we do not exceed the linear part as well.
This fix is aligned with the current driver's assumption that the entire
L2 will be present in the linear part of the SKB.
Fixes: 6aace17e64f4 ("net/mlx5e: Fix inline header size for small packets")
Signed-off-by: Eran Ben Elisha <eranbe(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -176,7 +176,7 @@ static inline u16 mlx5e_calc_min_inline(
default:
hlen = mlx5e_skb_l2_header_offset(skb);
}
- return min_t(u16, hlen, skb->len);
+ return min_t(u16, hlen, skb_headlen(skb));
}
static inline void mlx5e_tx_skb_pull_inline(unsigned char **skb_data,
Patches currently in stable-queue which might be from eranbe(a)mellanox.com are
queue-4.15/net-mlx5e-verify-inline-header-size-do-not-exceed-skb-linear-size.patch
This is a note to let you know that I've just added the patch titled
net/mlx5e: Fix TCP checksum in LRO buffers
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5e-fix-tcp-checksum-in-lro-buffers.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Gal Pressman <galp(a)mellanox.com>
Date: Wed, 20 Dec 2017 08:48:24 +0200
Subject: net/mlx5e: Fix TCP checksum in LRO buffers
From: Gal Pressman <galp(a)mellanox.com>
[ Upstream commit 8babd44d2079079f9d5a4aca7005aed80236efe0 ]
When receiving an LRO packet, the checksum field is set by the hardware
to the checksum of the first coalesced packet. Obviously, this checksum
is not valid for the merged LRO packet and should be fixed. We can use
the CQE checksum which covers the checksum of the entire merged packet
TCP payload to help us calculate the checksum incrementally.
Tested by sending IPv4/6 traffic with LRO enabled, RX checksum disabled
and watching nstat checksum error counters (in addition to the obvious
bandwidth drop caused by checksum errors).
This bug is usually "hidden" since LRO packets would go through the
CHECKSUM_UNNECESSARY flow which does not validate the packet checksum.
It's important to note that previous to this patch, LRO packets provided
with CHECKSUM_UNNECESSARY are indeed packets with a correct validated
checksum (even though the checksum inside the TCP header is incorrect),
since the hardware LRO aggregation is terminated upon receiving a packet
with bad checksum.
Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 47 +++++++++++++++++-------
1 file changed, 34 insertions(+), 13 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -36,6 +36,7 @@
#include <linux/tcp.h>
#include <linux/bpf_trace.h>
#include <net/busy_poll.h>
+#include <net/ip6_checksum.h>
#include "en.h"
#include "en_tc.h"
#include "eswitch.h"
@@ -547,20 +548,33 @@ bool mlx5e_post_rx_mpwqes(struct mlx5e_r
return true;
}
+static void mlx5e_lro_update_tcp_hdr(struct mlx5_cqe64 *cqe, struct tcphdr *tcp)
+{
+ u8 l4_hdr_type = get_cqe_l4_hdr_type(cqe);
+ u8 tcp_ack = (l4_hdr_type == CQE_L4_HDR_TYPE_TCP_ACK_NO_DATA) ||
+ (l4_hdr_type == CQE_L4_HDR_TYPE_TCP_ACK_AND_DATA);
+
+ tcp->check = 0;
+ tcp->psh = get_cqe_lro_tcppsh(cqe);
+
+ if (tcp_ack) {
+ tcp->ack = 1;
+ tcp->ack_seq = cqe->lro_ack_seq_num;
+ tcp->window = cqe->lro_tcp_win;
+ }
+}
+
static void mlx5e_lro_update_hdr(struct sk_buff *skb, struct mlx5_cqe64 *cqe,
u32 cqe_bcnt)
{
struct ethhdr *eth = (struct ethhdr *)(skb->data);
struct tcphdr *tcp;
int network_depth = 0;
+ __wsum check;
__be16 proto;
u16 tot_len;
void *ip_p;
- u8 l4_hdr_type = get_cqe_l4_hdr_type(cqe);
- u8 tcp_ack = (l4_hdr_type == CQE_L4_HDR_TYPE_TCP_ACK_NO_DATA) ||
- (l4_hdr_type == CQE_L4_HDR_TYPE_TCP_ACK_AND_DATA);
-
proto = __vlan_get_protocol(skb, eth->h_proto, &network_depth);
tot_len = cqe_bcnt - network_depth;
@@ -577,23 +591,30 @@ static void mlx5e_lro_update_hdr(struct
ipv4->check = 0;
ipv4->check = ip_fast_csum((unsigned char *)ipv4,
ipv4->ihl);
+
+ mlx5e_lro_update_tcp_hdr(cqe, tcp);
+ check = csum_partial(tcp, tcp->doff * 4,
+ csum_unfold((__force __sum16)cqe->check_sum));
+ /* Almost done, don't forget the pseudo header */
+ tcp->check = csum_tcpudp_magic(ipv4->saddr, ipv4->daddr,
+ tot_len - sizeof(struct iphdr),
+ IPPROTO_TCP, check);
} else {
+ u16 payload_len = tot_len - sizeof(struct ipv6hdr);
struct ipv6hdr *ipv6 = ip_p;
tcp = ip_p + sizeof(struct ipv6hdr);
skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
ipv6->hop_limit = cqe->lro_min_ttl;
- ipv6->payload_len = cpu_to_be16(tot_len -
- sizeof(struct ipv6hdr));
- }
+ ipv6->payload_len = cpu_to_be16(payload_len);
- tcp->psh = get_cqe_lro_tcppsh(cqe);
-
- if (tcp_ack) {
- tcp->ack = 1;
- tcp->ack_seq = cqe->lro_ack_seq_num;
- tcp->window = cqe->lro_tcp_win;
+ mlx5e_lro_update_tcp_hdr(cqe, tcp);
+ check = csum_partial(tcp, tcp->doff * 4,
+ csum_unfold((__force __sum16)cqe->check_sum));
+ /* Almost done, don't forget the pseudo header */
+ tcp->check = csum_ipv6_magic(&ipv6->saddr, &ipv6->daddr, payload_len,
+ IPPROTO_TCP, check);
}
}
Patches currently in stable-queue which might be from galp(a)mellanox.com are
queue-4.15/net-mlx5e-specify-numa-node-when-allocating-drop-rq.patch
queue-4.15/net-mlx5e-fix-tcp-checksum-in-lro-buffers.patch
This is a note to let you know that I've just added the patch titled
net/mlx5: Fix error handling when adding flow rules
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5-fix-error-handling-when-adding-flow-rules.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Vlad Buslov <vladbu(a)mellanox.com>
Date: Tue, 6 Feb 2018 10:52:19 +0200
Subject: net/mlx5: Fix error handling when adding flow rules
From: Vlad Buslov <vladbu(a)mellanox.com>
[ Upstream commit 9238e380e823a39983ee8d6b6ee8d1a9c4ba8a65 ]
If building match list or adding existing fg fails when
node is locked, function returned without unlocking it.
This happened if node version changed or adding existing fg
returned with EAGAIN after jumping to search_again_locked label.
Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Vlad Buslov <vladbu(a)mellanox.com>
Reviewed-by: Maor Gottlieb <maorg(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -1755,8 +1755,11 @@ search_again_locked:
/* Collect all fgs which has a matching match_criteria */
err = build_match_list(&match_head, ft, spec);
- if (err)
+ if (err) {
+ if (take_write)
+ up_write_ref_node(&ft->node);
return ERR_PTR(err);
+ }
if (!take_write)
up_read_ref_node(&ft->node);
@@ -1765,8 +1768,11 @@ search_again_locked:
dest_num, version);
free_match_list(&match_head);
if (!IS_ERR(rule) ||
- (PTR_ERR(rule) != -ENOENT && PTR_ERR(rule) != -EAGAIN))
+ (PTR_ERR(rule) != -ENOENT && PTR_ERR(rule) != -EAGAIN)) {
+ if (take_write)
+ up_write_ref_node(&ft->node);
return rule;
+ }
if (!take_write) {
nested_down_write_ref_node(&ft->node, FS_LOCK_GRANDPARENT);
Patches currently in stable-queue which might be from vladbu(a)mellanox.com are
queue-4.15/net-mlx5-fix-error-handling-when-adding-flow-rules.patch
This is a note to let you know that I've just added the patch titled
net/mlx5e: Fix loopback self test when GRO is off
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5e-fix-loopback-self-test-when-gro-is-off.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Inbar Karmy <inbark(a)mellanox.com>
Date: Thu, 7 Dec 2017 17:26:33 +0200
Subject: net/mlx5e: Fix loopback self test when GRO is off
From: Inbar Karmy <inbark(a)mellanox.com>
[ Upstream commit ef7a3518f7dd4f4cf5e5b5358c93d1eb78df28fb ]
When GRO is off, the transport header pointer in sk_buff is
initialized to network's header.
To find the udp header, instead of using udp_hdr() which assumes
skb_network_header was set, manually calculate the udp header offset.
Fixes: 0952da791c97 ("net/mlx5e: Add support for loopback selftest")
Signed-off-by: Inbar Karmy <inbark(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -216,7 +216,8 @@ mlx5e_test_loopback_validate(struct sk_b
if (iph->protocol != IPPROTO_UDP)
goto out;
- udph = udp_hdr(skb);
+ /* Don't assume skb_transport_header() was set */
+ udph = (struct udphdr *)((u8 *)iph + 4 * iph->ihl);
if (udph->dest != htons(9))
goto out;
Patches currently in stable-queue which might be from inbark(a)mellanox.com are
queue-4.15/net-mlx5e-fix-loopback-self-test-when-gro-is-off.patch
This is a note to let you know that I've just added the patch titled
net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Sabrina Dubroca <sd(a)queasysnail.net>
Date: Mon, 26 Feb 2018 16:13:43 +0100
Subject: net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68
From: Sabrina Dubroca <sd(a)queasysnail.net>
[ Upstream commit c7272c2f1229125f74f22dcdd59de9bbd804f1c8 ]
According to RFC 1191 sections 3 and 4, ICMP frag-needed messages
indicating an MTU below 68 should be rejected:
A host MUST never reduce its estimate of the Path MTU below 68
octets.
and (talking about ICMP frag-needed's Next-Hop MTU field):
This field will never contain a value less than 68, since every
router "must be able to forward a datagram of 68 octets without
fragmentation".
Furthermore, by letting net.ipv4.route.min_pmtu be set to negative
values, we can end up with a very large PMTU when (-1) is cast into u32.
Let's also make ip_rt_min_pmtu a u32, since it's only ever compared to
unsigned ints.
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Sabrina Dubroca <sd(a)queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/route.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -128,10 +128,13 @@ static int ip_rt_redirect_silence __read
static int ip_rt_error_cost __read_mostly = HZ;
static int ip_rt_error_burst __read_mostly = 5 * HZ;
static int ip_rt_mtu_expires __read_mostly = 10 * 60 * HZ;
-static int ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
+static u32 ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
static int ip_rt_min_advmss __read_mostly = 256;
static int ip_rt_gc_timeout __read_mostly = RT_GC_TIMEOUT;
+
+static int ip_min_valid_pmtu __read_mostly = IPV4_MIN_MTU;
+
/*
* Interface to generic destination cache.
*/
@@ -2934,7 +2937,8 @@ static struct ctl_table ipv4_route_table
.data = &ip_rt_min_pmtu,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &ip_min_valid_pmtu,
},
{
.procname = "min_adv_mss",
Patches currently in stable-queue which might be from sd(a)queasysnail.net are
queue-4.15/net-ipv4-don-t-allow-setting-net.ipv4.route.min_pmtu-below-68.patch
This is a note to let you know that I've just added the patch titled
net: ipv4: Set addr_type in hash_keys for forwarded case
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ipv4-set-addr_type-in-hash_keys-for-forwarded-case.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: David Ahern <dsahern(a)gmail.com>
Date: Wed, 21 Feb 2018 11:00:54 -0800
Subject: net: ipv4: Set addr_type in hash_keys for forwarded case
From: David Ahern <dsahern(a)gmail.com>
[ Upstream commit 1fe4b1184c2ae2bfbf9e8b14c9c0c1945c98f205 ]
The result of the skb flow dissect is copied from keys to hash_keys to
ensure only the intended data is hashed. The original L4 hash patch
overlooked setting the addr_type for this case; add it.
Fixes: bf4e0a3db97eb ("net: ipv4: add support for ECMP hash policy choice")
Reported-by: Ido Schimmel <idosch(a)idosch.org>
Signed-off-by: David Ahern <dsahern(a)gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/route.c | 2 ++
1 file changed, 2 insertions(+)
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1832,6 +1832,8 @@ int fib_multipath_hash(const struct fib_
return skb_get_hash_raw(skb) >> 1;
memset(&hash_keys, 0, sizeof(hash_keys));
skb_flow_dissect_flow_keys(skb, &keys, flag);
+
+ hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
hash_keys.ports.src = keys.ports.src;
Patches currently in stable-queue which might be from dsahern(a)gmail.com are
queue-4.15/fib_semantics-don-t-match-route-with-mismatching-tclassid.patch
queue-4.15/net-ipv4-set-addr_type-in-hash_keys-for-forwarded-case.patch
This is a note to let you know that I've just added the patch titled
net: ethernet: ti: cpsw: fix net watchdog timeout
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ethernet-ti-cpsw-fix-net-watchdog-timeout.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Grygorii Strashko <grygorii.strashko(a)ti.com>
Date: Tue, 6 Feb 2018 19:17:06 -0600
Subject: net: ethernet: ti: cpsw: fix net watchdog timeout
From: Grygorii Strashko <grygorii.strashko(a)ti.com>
[ Upstream commit 62f94c2101f35cd45775df00ba09bde77580e26a ]
It was discovered that simple program which indefinitely sends 200b UDP
packets and runs on TI AM574x SoC (SMP) under RT Kernel triggers network
watchdog timeout in TI CPSW driver (<6 hours run). The network watchdog
timeout is triggered due to race between cpsw_ndo_start_xmit() and
cpsw_tx_handler() [NAPI]
cpsw_ndo_start_xmit()
if (unlikely(!cpdma_check_free_tx_desc(txch))) {
txq = netdev_get_tx_queue(ndev, q_idx);
netif_tx_stop_queue(txq);
^^ as per [1] barier has to be used after set_bit() otherwise new value
might not be visible to other cpus
}
cpsw_tx_handler()
if (unlikely(netif_tx_queue_stopped(txq)))
netif_tx_wake_queue(txq);
and when it happens ndev TX queue became disabled forever while driver's HW
TX queue is empty.
Fix this, by adding smp_mb__after_atomic() after netif_tx_stop_queue()
calls and double check for free TX descriptors after stopping ndev TX queue
- if there are free TX descriptors wake up ndev TX queue.
[1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html
Signed-off-by: Grygorii Strashko <grygorii.strashko(a)ti.com>
Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk(a)linaro.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1618,6 +1618,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(s
q_idx = q_idx % cpsw->tx_ch_num;
txch = cpsw->txv[q_idx].ch;
+ txq = netdev_get_tx_queue(ndev, q_idx);
ret = cpsw_tx_packet_submit(priv, skb, txch);
if (unlikely(ret != 0)) {
cpsw_err(priv, tx_err, "desc submit failed\n");
@@ -1628,15 +1629,26 @@ static netdev_tx_t cpsw_ndo_start_xmit(s
* tell the kernel to stop sending us tx frames.
*/
if (unlikely(!cpdma_check_free_tx_desc(txch))) {
- txq = netdev_get_tx_queue(ndev, q_idx);
netif_tx_stop_queue(txq);
+
+ /* Barrier, so that stop_queue visible to other cpus */
+ smp_mb__after_atomic();
+
+ if (cpdma_check_free_tx_desc(txch))
+ netif_tx_wake_queue(txq);
}
return NETDEV_TX_OK;
fail:
ndev->stats.tx_dropped++;
- txq = netdev_get_tx_queue(ndev, skb_get_queue_mapping(skb));
netif_tx_stop_queue(txq);
+
+ /* Barrier, so that stop_queue visible to other cpus */
+ smp_mb__after_atomic();
+
+ if (cpdma_check_free_tx_desc(txch))
+ netif_tx_wake_queue(txq);
+
return NETDEV_TX_BUSY;
}
Patches currently in stable-queue which might be from grygorii.strashko(a)ti.com are
queue-4.15/net-ethernet-ti-cpsw-fix-net-watchdog-timeout.patch
This is a note to let you know that I've just added the patch titled
net: fix race on decreasing number of TX queues
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-fix-race-on-decreasing-number-of-tx-queues.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Date: Mon, 12 Feb 2018 21:35:31 -0800
Subject: net: fix race on decreasing number of TX queues
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
[ Upstream commit ac5b70198adc25c73fba28de4f78adcee8f6be0b ]
netif_set_real_num_tx_queues() can be called when netdev is up.
That usually happens when user requests change of number of
channels/rings with ethtool -L. The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->num_tx_queues to the new value. When the new value is
lower than the old one, extra care has to be taken to ensure
ordering of accesses to the number of queues vs qdisc reset.
Currently the queues are reset before new dev->num_tx_queues
is assigned, leaving a window of time where packets can be
enqueued onto the queues going down, leading to a likely
crash in the drivers, since most drivers don't check if TX
skbs are assigned to an active queue.
Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice")
Signed-off-by: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/dev.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2366,8 +2366,11 @@ EXPORT_SYMBOL(netdev_set_num_tc);
*/
int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
{
+ bool disabling;
int rc;
+ disabling = txq < dev->real_num_tx_queues;
+
if (txq < 1 || txq > dev->num_tx_queues)
return -EINVAL;
@@ -2383,15 +2386,19 @@ int netif_set_real_num_tx_queues(struct
if (dev->num_tc)
netif_setup_tc(dev, txq);
- if (txq < dev->real_num_tx_queues) {
+ dev->real_num_tx_queues = txq;
+
+ if (disabling) {
+ synchronize_net();
qdisc_reset_all_tx_gt(dev, txq);
#ifdef CONFIG_XPS
netif_reset_xps_queues_gt(dev, txq);
#endif
}
+ } else {
+ dev->real_num_tx_queues = txq;
}
- dev->real_num_tx_queues = txq;
return 0;
}
EXPORT_SYMBOL(netif_set_real_num_tx_queues);
Patches currently in stable-queue which might be from jakub.kicinski(a)netronome.com are
queue-4.15/net-fix-race-on-decreasing-number-of-tx-queues.patch
This is a note to let you know that I've just added the patch titled
net: amd-xgbe: fix comparison to bitshift when dealing with a mask
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-amd-xgbe-fix-comparison-to-bitshift-when-dealing-with-a-mask.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Mar 6 19:02:56 PST 2018
From: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Date: Mon, 5 Feb 2018 21:10:01 +0100
Subject: net: amd-xgbe: fix comparison to bitshift when dealing with a mask
From: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
[ Upstream commit a3276892db7a588bedc33168e502572008f714a9 ]
Due to a typo, the mask was destroyed by a comparison instead of a bit
shift.
Signed-off-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Acked-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -595,7 +595,7 @@ isr_done:
reissue_mask = 1 << 0;
if (!pdata->per_channel_irq)
- reissue_mask |= 0xffff < 4;
+ reissue_mask |= 0xffff << 4;
XP_IOWRITE(pdata, XP_INT_REISSUE_EN, reissue_mask);
}
Patches currently in stable-queue which might be from wsa+renesas(a)sang-engineering.com are
queue-4.15/net-amd-xgbe-fix-comparison-to-bitshift-when-dealing-with-a-mask.patch