This is a note to let you know that I've just added the patch titled
net/mlx4: Check if Granular QoS per VF has been enabled before updating QP qos_vport
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx4-check-if-granular-qos-per-vf-has-been-enabled-before-updating-qp-qos_vport.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: Ido Shamay <idos(a)mellanox.com>
Date: Mon, 5 Jun 2017 10:44:56 +0300
Subject: net/mlx4: Check if Granular QoS per VF has been enabled before updating QP qos_vport
From: Ido Shamay <idos(a)mellanox.com>
[ Upstream commit 269f9883fe254d109afdfc657875c456d6fabb08 ]
The Granular QoS per VF feature must be enabled in FW before it can be
used.
Thus, the driver cannot modify a QP's qos_vport value (via the UPDATE_QP FW
command) if the feature has not been enabled -- the FW returns an error if
this is attempted.
Fixes: 08068cd5683f ("net/mlx4: Added qos_vport QP configuration in VST mode")
Signed-off-by: Ido Shamay <idos(a)mellanox.com>
Signed-off-by: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx4/qp.c | 6 ++++++
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 16 +++++++++++-----
2 files changed, 17 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -481,6 +481,12 @@ int mlx4_update_qp(struct mlx4_dev *dev,
}
if (attr & MLX4_UPDATE_QP_QOS_VPORT) {
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QOS_VPP)) {
+ mlx4_warn(dev, "Granular QoS per VF is not enabled\n");
+ err = -EOPNOTSUPP;
+ goto out;
+ }
+
qp_mask |= 1ULL << MLX4_UPD_QP_MASK_QOS_VPP;
cmd->qp_context.qos_vport = params->qos_vport;
}
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -5040,6 +5040,13 @@ void mlx4_delete_all_resources_for_slave
mutex_unlock(&priv->mfunc.master.res_tracker.slave_list[slave].mutex);
}
+static void update_qos_vpp(struct mlx4_update_qp_context *ctx,
+ struct mlx4_vf_immed_vlan_work *work)
+{
+ ctx->qp_mask |= cpu_to_be64(1ULL << MLX4_UPD_QP_MASK_QOS_VPP);
+ ctx->qp_context.qos_vport = work->qos_vport;
+}
+
void mlx4_vf_immed_vlan_work_handler(struct work_struct *_work)
{
struct mlx4_vf_immed_vlan_work *work =
@@ -5144,11 +5151,10 @@ void mlx4_vf_immed_vlan_work_handler(str
qp->sched_queue & 0xC7;
upd_context->qp_context.pri_path.sched_queue |=
((work->qos & 0x7) << 3);
- upd_context->qp_mask |=
- cpu_to_be64(1ULL <<
- MLX4_UPD_QP_MASK_QOS_VPP);
- upd_context->qp_context.qos_vport =
- work->qos_vport;
+
+ if (dev->caps.flags2 &
+ MLX4_DEV_CAP_FLAG2_QOS_VPP)
+ update_qos_vpp(upd_context, work);
}
err = mlx4_cmd(dev, mailbox->dma,
Patches currently in stable-queue which might be from idos(a)mellanox.com are
queue-4.4/net-mlx4-check-if-granular-qos-per-vf-has-been-enabled-before-updating-qp-qos_vport.patch
This is a note to let you know that I've just added the patch titled
net: llc: add lock_sock in llc_ui_bind to avoid a race condition
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-llc-add-lock_sock-in-llc_ui_bind-to-avoid-a-race-condition.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: linzhang <xiaolou4617(a)gmail.com>
Date: Thu, 25 May 2017 14:07:18 +0800
Subject: net: llc: add lock_sock in llc_ui_bind to avoid a race condition
From: linzhang <xiaolou4617(a)gmail.com>
[ Upstream commit 0908cf4dfef35fc6ac12329007052ebe93ff1081 ]
There is a race condition in llc_ui_bind if two or more processes/threads
try to bind a same socket.
If more processes/threads bind a same socket success that will lead to
two problems, one is this action is not what we expected, another is
will lead to kernel in unstable status or oops(in my simple test case,
cause llc2.ko can't unload).
The current code is test SOCK_ZAPPED bit to avoid a process to
bind a same socket twice but that is can't avoid more processes/threads
try to bind a same socket at the same time.
So, add lock_sock in llc_ui_bind like others, such as llc_ui_connect.
Signed-off-by: Lin Zhang <xiaolou4617(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/llc/af_llc.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -309,6 +309,8 @@ static int llc_ui_bind(struct socket *so
int rc = -EINVAL;
dprintk("%s: binding %02X\n", __func__, addr->sllc_sap);
+
+ lock_sock(sk);
if (unlikely(!sock_flag(sk, SOCK_ZAPPED) || addrlen != sizeof(*addr)))
goto out;
rc = -EAFNOSUPPORT;
@@ -380,6 +382,7 @@ static int llc_ui_bind(struct socket *so
out_put:
llc_sap_put(sap);
out:
+ release_sock(sk);
return rc;
}
Patches currently in stable-queue which might be from xiaolou4617(a)gmail.com are
queue-4.4/net-x25-fix-one-potential-use-after-free-issue.patch
queue-4.4/net-llc-add-lock_sock-in-llc_ui_bind-to-avoid-a-race-condition.patch
queue-4.4/net-ieee802154-fix-net_device-reference-release-too-early.patch
This is a note to let you know that I've just added the patch titled
net: ieee802154: fix net_device reference release too early
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ieee802154-fix-net_device-reference-release-too-early.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: Lin Zhang <xiaolou4617(a)gmail.com>
Date: Tue, 23 May 2017 13:29:39 +0800
Subject: net: ieee802154: fix net_device reference release too early
From: Lin Zhang <xiaolou4617(a)gmail.com>
[ Upstream commit a611c58b3d42a92e6b23423e166dd17c0c7fffce ]
This patch fixes the kernel oops when release net_device reference in
advance. In function raw_sendmsg(i think the dgram_sendmsg has the same
problem), there is a race condition between dev_put and dev_queue_xmit
when the device is gong that maybe lead to dev_queue_ximt to see
an illegal net_device pointer.
My test kernel is 3.13.0-32 and because i am not have a real 802154
device, so i change lowpan_newlink function to this:
/* find and hold real wpan device */
real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
if (!real_dev)
return -ENODEV;
// if (real_dev->type != ARPHRD_IEEE802154) {
// dev_put(real_dev);
// return -EINVAL;
// }
lowpan_dev_info(dev)->real_dev = real_dev;
lowpan_dev_info(dev)->fragment_tag = 0;
mutex_init(&lowpan_dev_info(dev)->dev_list_mtx);
Also, in order to simulate preempt, i change the raw_sendmsg function
to this:
skb->dev = dev;
skb->sk = sk;
skb->protocol = htons(ETH_P_IEEE802154);
dev_put(dev);
//simulate preempt
schedule_timeout_uninterruptible(30 * HZ);
err = dev_queue_xmit(skb);
if (err > 0)
err = net_xmit_errno(err);
and this is my userspace test code named test_send_data:
int main(int argc, char **argv)
{
char buf[127];
int sockfd;
sockfd = socket(AF_IEEE802154, SOCK_RAW, 0);
if (sockfd < 0) {
printf("create sockfd error: %s\n", strerror(errno));
return -1;
}
send(sockfd, buf, sizeof(buf), 0);
return 0;
}
This is my test case:
root@zhanglin-x-computer:~/develop/802154# uname -a
Linux zhanglin-x-computer 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15
03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
root@zhanglin-x-computer:~/develop/802154# ip link add link eth0 name
lowpan0 type lowpan
root@zhanglin-x-computer:~/develop/802154#
//keep the lowpan0 device down
root@zhanglin-x-computer:~/develop/802154# ./test_send_data &
//wait a while
root@zhanglin-x-computer:~/develop/802154# ip link del link dev lowpan0
//the device is gone
//oops
[381.303307] general protection fault: 0000 [#1]SMP
[381.303407] Modules linked in: af_802154 6lowpan bnep rfcomm
bluetooth nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek
rts5139(C) snd_hda_intel
snd_had_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi
snd_seq_midi_event snd_rawmidi snd_req intel_rapl snd_seq_device
coretemp i915 kvm_intel
kvm snd_timer snd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
cypted drm_kms_helper drm i2c_algo_bit soundcore video mac_hid
parport_pc ppdev ip parport hid_generic
usbhid hid ahci r8169 mii libahdi
[381.304286] CPU:1 PID: 2524 Commm: 1 Tainted: G C 0 3.13.0-32-generic
[381.304409] Hardware name: Haier Haier DT Computer/Haier DT Codputer,
BIOS FIBT19H02_X64 06/09/2014
[381.304546] tasks: ffff000096965fc0 ti: ffffB0013779c000 task.ti:
ffffB8013779c000
[381.304659] RIP: 0010:[<ffffffff01621fe1>] [<ffffffff81621fe1>]
__dev_queue_ximt+0x61/0x500
[381.304798] RSP: 0018:ffffB8013779dca0 EFLAGS: 00010202
[381.304880] RAX: 272b031d57565351 RBX: 0000000000000000 RCX: ffff8800968f1a00
[381.304987] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800968f1a00
[381.305095] RBP: ffff8e013773dce0 R08: 0000000000000266 R09: 0000000000000004
[381.305202] R10: 0000000000000004 R11: 0000000000000005 R12: ffff88013902e000
[381.305310] R13: 000000000000007f R14: 000000000000007f R15: ffff8800968f1a00
[381.305418] FS: 00007fc57f50f740(0000) GS: ffff88013fc80000(0000)
knlGS: 0000000000000000
[381.305540] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[381.305627] CR2: 00007fad0841c000 CR3: 00000001368dd000 CR4: 00000000001007e0
[361.905734] Stack:
[381.305768] 00000000002052d0 000000003facb30a ffff88013779dcc0
ffff880137764000
[381.305898] ffff88013779de70 000000000000007f 000000000000007f
ffff88013902e000
[381.306026] ffff88013779dcf0 ffffffff81622490 ffff88013779dd39
ffffffffa03af9f1
[381.306155] Call Trace:
[381.306202] [<ffffffff81622490>] dev_queue_xmit+0x10/0x20
[381.306294] [<ffffffffa03af9f1>] raw_sendmsg+0x1b1/0x270 [af_802154]
[381.306396] [<ffffffffa03af054>] ieee802154_sock_sendmsg+0x14/0x20 [af_802154]
[381.306512] [<ffffffff816079eb>] sock_sendmsg+0x8b/0xc0
[381.306600] [<ffffffff811d52a5>] ? __d_alloc+0x25/0x180
[381.306687] [<ffffffff811a1f56>] ? kmem_cache_alloc_trace+0x1c6/0x1f0
[381.306791] [<ffffffff81607b91>] SYSC_sendto+0x121/0x1c0
[381.306878] [<ffffffff8109ddf4>] ? vtime_account_user+x54/0x60
[381.306975] [<ffffffff81020d45>] ? syscall_trace_enter+0x145/0x250
[381.307073] [<ffffffff816086ae>] SyS_sendto+0xe/0x10
[381.307156] [<ffffffff8172c87f>] tracesys+0xe1/0xe6
[381.307233] Code: c6 a1 a4 ff 41 8b 57 78 49 8b 47 20 85 d2 48 8b 80
78 07 00 00 75 21 49 8b 57 18 48 85 d2 74 18 48 85 c0 74 13 8b 92 ac
01 00 00 <3b> 50 10 73 08 8b 44 90 14 41 89 47 78 41 f6 84 24 d5 00 00
00
[381.307801] RIP [<ffffffff81621fe1>] _dev_queue_xmit+0x61/0x500
[381.307901] RSP <ffff88013779dca0>
[381.347512] Kernel panic - not syncing: Fatal exception in interrupt
[381.347747] drm_kms_helper: panic occurred, switching back to text console
In my opinion, there is always exist a chance that the device is gong
before call dev_queue_xmit.
I think the latest kernel is have the same problem and that
dev_put should be behind of the dev_queue_xmit.
Signed-off-by: Lin Zhang <xiaolou4617(a)gmail.com>
Acked-by: Stefan Schmidt <stefan(a)osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel(a)holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ieee802154/socket.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -302,12 +302,12 @@ static int raw_sendmsg(struct sock *sk,
skb->sk = sk;
skb->protocol = htons(ETH_P_IEEE802154);
- dev_put(dev);
-
err = dev_queue_xmit(skb);
if (err > 0)
err = net_xmit_errno(err);
+ dev_put(dev);
+
return err ?: size;
out_skb:
@@ -689,12 +689,12 @@ static int dgram_sendmsg(struct sock *sk
skb->sk = sk;
skb->protocol = htons(ETH_P_IEEE802154);
- dev_put(dev);
-
err = dev_queue_xmit(skb);
if (err > 0)
err = net_xmit_errno(err);
+ dev_put(dev);
+
return err ?: size;
out_skb:
Patches currently in stable-queue which might be from xiaolou4617(a)gmail.com are
queue-4.4/net-x25-fix-one-potential-use-after-free-issue.patch
queue-4.4/net-llc-add-lock_sock-in-llc_ui_bind-to-avoid-a-race-condition.patch
queue-4.4/net-ieee802154-fix-net_device-reference-release-too-early.patch
This is a note to let you know that I've just added the patch titled
net: freescale: fix potential null pointer dereference
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-freescale-fix-potential-null-pointer-dereference.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: "Gustavo A. R. Silva" <garsilva(a)embeddedor.com>
Date: Tue, 30 May 2017 17:38:43 -0500
Subject: net: freescale: fix potential null pointer dereference
From: "Gustavo A. R. Silva" <garsilva(a)embeddedor.com>
[ Upstream commit 06d2d6431bc8d41ef5ffd8bd4b52cea9f72aed22 ]
Add NULL check before dereferencing pointer _id_ in order to avoid
a potential NULL pointer dereference.
Addresses-Coverity-ID: 1397995
Signed-off-by: Gustavo A. R. Silva <garsilva(a)embeddedor.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/freescale/fsl_pq_mdio.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/freescale/fsl_pq_mdio.c
+++ b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
@@ -382,7 +382,7 @@ static int fsl_pq_mdio_probe(struct plat
{
const struct of_device_id *id =
of_match_device(fsl_pq_mdio_match, &pdev->dev);
- const struct fsl_pq_mdio_data *data = id->data;
+ const struct fsl_pq_mdio_data *data;
struct device_node *np = pdev->dev.of_node;
struct resource res;
struct device_node *tbi;
@@ -390,6 +390,13 @@ static int fsl_pq_mdio_probe(struct plat
struct mii_bus *new_bus;
int err;
+ if (!id) {
+ dev_err(&pdev->dev, "Failed to match device\n");
+ return -ENODEV;
+ }
+
+ data = id->data;
+
dev_dbg(&pdev->dev, "found %s compatible node\n", id->compatible);
new_bus = mdiobus_alloc_size(sizeof(*priv));
Patches currently in stable-queue which might be from garsilva(a)embeddedor.com are
queue-4.4/net-freescale-fix-potential-null-pointer-dereference.patch
This is a note to let you know that I've just added the patch titled
net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ethernet-ti-cpsw-adjust-cpsw-fifos-depth-for-fullduplex-flow-control.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: Grygorii Strashko <grygorii.strashko(a)ti.com>
Date: Mon, 8 May 2017 14:21:21 -0500
Subject: net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control
From: Grygorii Strashko <grygorii.strashko(a)ti.com>
[ Upstream commit 48f5bccc60675f8426a6159935e8636a1fd89f56 ]
When users set flow control using ethtool the bits are set properly in the
CPGMAC_SL MACCONTROL register, but the FIFO depth in the respective Port n
Maximum FIFO Blocks (Pn_MAX_BLKS) registers remains set to the minimum size
reset value. When receive flow control is enabled on a port, the port's
associated FIFO block allocation must be adjusted. The port RX allocation
must increase to accommodate the flow control runout. The TRM recommends
numbers of 5 or 6.
Hence, apply required Port FIFO configuration to
Pn_MAX_BLKS.Pn_TX_MAX_BLKS=0xF and Pn_MAX_BLKS.Pn_RX_MAX_BLKS=0x5 during
interface initialization.
Cc: Schuyler Patton <spatton(a)ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko(a)ti.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -280,6 +280,10 @@ struct cpsw_ss_regs {
/* Bit definitions for the CPSW1_TS_SEQ_LTYPE register */
#define CPSW_V1_SEQ_ID_OFS_SHIFT 16
+#define CPSW_MAX_BLKS_TX 15
+#define CPSW_MAX_BLKS_TX_SHIFT 4
+#define CPSW_MAX_BLKS_RX 5
+
struct cpsw_host_regs {
u32 max_blks;
u32 blk_cnt;
@@ -1127,11 +1131,23 @@ static void cpsw_slave_open(struct cpsw_
switch (priv->version) {
case CPSW_VERSION_1:
slave_write(slave, TX_PRIORITY_MAPPING, CPSW1_TX_PRI_MAP);
+ /* Increase RX FIFO size to 5 for supporting fullduplex
+ * flow control mode
+ */
+ slave_write(slave,
+ (CPSW_MAX_BLKS_TX << CPSW_MAX_BLKS_TX_SHIFT) |
+ CPSW_MAX_BLKS_RX, CPSW1_MAX_BLKS);
break;
case CPSW_VERSION_2:
case CPSW_VERSION_3:
case CPSW_VERSION_4:
slave_write(slave, TX_PRIORITY_MAPPING, CPSW2_TX_PRI_MAP);
+ /* Increase RX FIFO size to 5 for supporting fullduplex
+ * flow control mode
+ */
+ slave_write(slave,
+ (CPSW_MAX_BLKS_TX << CPSW_MAX_BLKS_TX_SHIFT) |
+ CPSW_MAX_BLKS_RX, CPSW2_MAX_BLKS);
break;
}
Patches currently in stable-queue which might be from grygorii.strashko(a)ti.com are
queue-4.4/net-ethernet-ti-cpsw-adjust-cpsw-fifos-depth-for-fullduplex-flow-control.patch
This is a note to let you know that I've just added the patch titled
net: emac: fix reset timeout with AR8035 phy
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-emac-fix-reset-timeout-with-ar8035-phy.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: Christian Lamparter <chunkeey(a)googlemail.com>
Date: Wed, 7 Jun 2017 15:51:15 +0200
Subject: net: emac: fix reset timeout with AR8035 phy
From: Christian Lamparter <chunkeey(a)googlemail.com>
[ Upstream commit 19d90ece81da802207a9b91ce95a29fbdc40626e ]
This patch fixes a problem where the AR8035 PHY can't be
detected on an Cisco Meraki MR24, if the ethernet cable is
not connected on boot.
Russell Senior provided steps to reproduce the issue:
|Disconnect ethernet cable, apply power, wait until device has booted,
|plug in ethernet, check for interfaces, no eth0 is listed.
|
|This appears to be a problem during probing of the AR8035 Phy chip.
|When ethernet has no link, the phy detection fails, and eth0 is not
|created. Plugging ethernet later has no effect, because there is no
|interface as far as the kernel is concerned. The relevant part of
|the boot log looks like this:
|this is the failing case:
|
|[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout
|[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY!
|and the succeeding case:
|
|[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:..
|[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01)
Based on the comment and the commit message of
commit 23fbb5a87c56 ("emac: Fix EMAC soft reset on 460EX/GT").
This is because the AR8035 PHY doesn't provide the TX Clock,
if the ethernet cable is not attached. This causes the reset
to timeout and the PHY detection code in emac_init_phy() is
unable to detect the AR8035 PHY. As a result, the emac driver
bails out early and the user left with no ethernet.
In order to stay compatible with existing configurations, the driver
tries the current reset approach at first. Only if the first attempt
timed out, it does perform one more retry with the clock temporarily
switched to the internal source for just the duration of the reset.
LEDE-Bug: #687 <https://bugs.lede-project.org/index.php?do=details&task_id=687>
Cc: Chris Blake <chrisrblake93(a)gmail.com>
Reported-by: Russell Senior <russell(a)personaltelco.net>
Fixes: 23fbb5a87c56e98 ("emac: Fix EMAC soft reset on 460EX/GT")
Signed-off-by: Christian Lamparter <chunkeey(a)googlemail.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/ibm/emac/core.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -342,6 +342,7 @@ static int emac_reset(struct emac_instan
{
struct emac_regs __iomem *p = dev->emacp;
int n = 20;
+ bool __maybe_unused try_internal_clock = false;
DBG(dev, "reset" NL);
@@ -354,6 +355,7 @@ static int emac_reset(struct emac_instan
}
#ifdef CONFIG_PPC_DCR_NATIVE
+do_retry:
/*
* PPC460EX/GT Embedded Processor Advanced User's Manual
* section 28.10.1 Mode Register 0 (EMACx_MR0) states:
@@ -361,10 +363,19 @@ static int emac_reset(struct emac_instan
* of the EMAC. If none is present, select the internal clock
* (SDR0_ETH_CFG[EMACx_PHY_CLK] = 1).
* After a soft reset, select the external clock.
+ *
+ * The AR8035-A PHY Meraki MR24 does not provide a TX Clk if the
+ * ethernet cable is not attached. This causes the reset to timeout
+ * and the PHY detection code in emac_init_phy() is unable to
+ * communicate and detect the AR8035-A PHY. As a result, the emac
+ * driver bails out early and the user has no ethernet.
+ * In order to stay compatible with existing configurations, the
+ * driver will temporarily switch to the internal clock, after
+ * the first reset fails.
*/
if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) {
- if (dev->phy_address == 0xffffffff &&
- dev->phy_map == 0xffffffff) {
+ if (try_internal_clock || (dev->phy_address == 0xffffffff &&
+ dev->phy_map == 0xffffffff)) {
/* No PHY: select internal loop clock before reset */
dcri_clrset(SDR0, SDR0_ETH_CFG,
0, SDR0_ETH_CFG_ECS << dev->cell_index);
@@ -382,8 +393,15 @@ static int emac_reset(struct emac_instan
#ifdef CONFIG_PPC_DCR_NATIVE
if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) {
- if (dev->phy_address == 0xffffffff &&
- dev->phy_map == 0xffffffff) {
+ if (!n && !try_internal_clock) {
+ /* first attempt has timed out. */
+ n = 20;
+ try_internal_clock = true;
+ goto do_retry;
+ }
+
+ if (try_internal_clock || (dev->phy_address == 0xffffffff &&
+ dev->phy_map == 0xffffffff)) {
/* No PHY: restore external clock source after reset */
dcri_clrset(SDR0, SDR0_ETH_CFG,
SDR0_ETH_CFG_ECS << dev->cell_index, 0);
Patches currently in stable-queue which might be from chunkeey(a)googlemail.com are
queue-4.4/net-emac-fix-reset-timeout-with-ar8035-phy.patch
This is a note to let you know that I've just added the patch titled
net: cdc_ncm: Fix TX zero padding
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-cdc_ncm-fix-tx-zero-padding.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: Jim Baxter <jim_baxter(a)mentor.com>
Date: Mon, 8 May 2017 13:49:57 +0100
Subject: net: cdc_ncm: Fix TX zero padding
From: Jim Baxter <jim_baxter(a)mentor.com>
[ Upstream commit aeca3a77b1e0ed06a095933b89c86aed007383eb ]
The zero padding that is added to NTB's does
not zero the memory correctly.
This is because the skb_put modifies the value
of skb_out->len which results in the memset
command not setting any memory to zero as
(ctx->tx_max - skb_out->len) == 0.
I have resolved this by storing the size of
the memory to be zeroed before the skb_put
and using this in the memset call.
Signed-off-by: Jim Baxter <jim_baxter(a)mentor.com>
Reviewed-by: Bjørn Mork <bjorn(a)mork.no>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/cdc_ncm.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -1069,6 +1069,7 @@ cdc_ncm_fill_tx_frame(struct usbnet *dev
u16 n = 0, index, ndplen;
u8 ready2send = 0;
u32 delayed_ndp_size;
+ size_t padding_count;
/* When our NDP gets written in cdc_ncm_ndp(), then skb_out->len gets updated
* accordingly. Otherwise, we should check here.
@@ -1225,11 +1226,13 @@ cdc_ncm_fill_tx_frame(struct usbnet *dev
* a ZLP after full sized NTBs.
*/
if (!(dev->driver_info->flags & FLAG_SEND_ZLP) &&
- skb_out->len > ctx->min_tx_pkt)
- memset(skb_put(skb_out, ctx->tx_max - skb_out->len), 0,
- ctx->tx_max - skb_out->len);
- else if (skb_out->len < ctx->tx_max && (skb_out->len % dev->maxpacket) == 0)
+ skb_out->len > ctx->min_tx_pkt) {
+ padding_count = ctx->tx_max - skb_out->len;
+ memset(skb_put(skb_out, padding_count), 0, padding_count);
+ } else if (skb_out->len < ctx->tx_max &&
+ (skb_out->len % dev->maxpacket) == 0) {
*skb_put(skb_out, 1) = 0; /* force short packet */
+ }
/* set final frame length */
nth16 = (struct usb_cdc_ncm_nth16 *)skb_out->data;
Patches currently in stable-queue which might be from jim_baxter(a)mentor.com are
queue-4.4/net-cdc_ncm-fix-tx-zero-padding.patch
This is a note to let you know that I've just added the patch titled
neighbour: update neigh timestamps iff update is effective
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
neighbour-update-neigh-timestamps-iff-update-is-effective.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: Ihar Hrachyshka <ihrachys(a)redhat.com>
Date: Tue, 16 May 2017 08:44:24 -0700
Subject: neighbour: update neigh timestamps iff update is effective
From: Ihar Hrachyshka <ihrachys(a)redhat.com>
[ Upstream commit 77d7123342dcf6442341b67816321d71da8b2b16 ]
It's a common practice to send gratuitous ARPs after moving an
IP address to another device to speed up healing of a service. To
fulfill service availability constraints, the timing of network peers
updating their caches to point to a new location of an IP address can be
particularly important.
Sometimes neigh_update calls won't touch neither lladdr nor state, for
example if an update arrives in locktime interval. The neigh->updated
value is tested by the protocol specific neigh code, which in turn
will influence whether NEIGH_UPDATE_F_OVERRIDE gets set in the
call to neigh_update() or not. As a result, we may effectively ignore
the update request, bailing out of touching the neigh entry, except that
we still bump its timestamps inside neigh_update.
This may be a problem for updates arriving in quick succession. For
example, consider the following scenario:
A service is moved to another device with its IP address. The new device
sends three gratuitous ARP requests into the network with ~1 seconds
interval between them. Just before the first request arrives to one of
network peer nodes, its neigh entry for the IP address transitions from
STALE to DELAY. This transition, among other things, updates
neigh->updated. Once the kernel receives the first gratuitous ARP, it
ignores it because its arrival time is inside the locktime interval. The
kernel still bumps neigh->updated. Then the second gratuitous ARP
request arrives, and it's also ignored because it's still in the (new)
locktime interval. Same happens for the third request. The node
eventually heals itself (after delay_first_probe_time seconds since the
initial transition to DELAY state), but it just wasted some time and
require a new ARP request/reply round trip. This unfortunate behaviour
both puts more load on the network, as well as reduces service
availability.
This patch changes neigh_update so that it bumps neigh->updated (as well
as neigh->confirmed) only once we are sure that either lladdr or entry
state will change). In the scenario described above, it means that the
second gratuitous ARP request will actually update the entry lladdr.
Ideally, we would update the neigh entry on the very first gratuitous
ARP request. The locktime mechanism is designed to ignore ARP updates in
a short timeframe after a previous ARP update was honoured by the kernel
layer. This would require tracking timestamps for state transitions
separately from timestamps when actual updates are received. This would
probably involve changes in neighbour struct. Therefore, the patch
doesn't tackle the issue of the first gratuitous APR ignored, leaving
it for a follow-up.
Signed-off-by: Ihar Hrachyshka <ihrachys(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/neighbour.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1132,10 +1132,6 @@ int neigh_update(struct neighbour *neigh
lladdr = neigh->ha;
}
- if (new & NUD_CONNECTED)
- neigh->confirmed = jiffies;
- neigh->updated = jiffies;
-
/* If entry was valid and address is not changed,
do not change entry state, if new one is STALE.
*/
@@ -1159,6 +1155,16 @@ int neigh_update(struct neighbour *neigh
}
}
+ /* Update timestamps only once we know we will make a change to the
+ * neighbour entry. Otherwise we risk to move the locktime window with
+ * noop updates and ignore relevant ARP updates.
+ */
+ if (new != old || lladdr != neigh->ha) {
+ if (new & NUD_CONNECTED)
+ neigh->confirmed = jiffies;
+ neigh->updated = jiffies;
+ }
+
if (new != old) {
neigh_del_timer(neigh);
if (new & NUD_PROBE)
Patches currently in stable-queue which might be from ihrachys(a)redhat.com are
queue-4.4/arp-honour-gratuitous-arp-_replies_.patch
queue-4.4/neighbour-update-neigh-timestamps-iff-update-is-effective.patch
This is a note to let you know that I've just added the patch titled
mlx5: fix bug reading rss_hash_type from CQE
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mlx5-fix-bug-reading-rss_hash_type-from-cqe.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: Jesper Dangaard Brouer <brouer(a)redhat.com>
Date: Mon, 22 May 2017 20:13:07 +0200
Subject: mlx5: fix bug reading rss_hash_type from CQE
From: Jesper Dangaard Brouer <brouer(a)redhat.com>
[ Upstream commit 12e8b570e732eaa5eae3a2895ba3fbcf91bde2b4 ]
Masks for extracting part of the Completion Queue Entry (CQE)
field rss_hash_type was swapped, namely CQE_RSS_HTYPE_IP and
CQE_RSS_HTYPE_L4.
The bug resulted in setting skb->l4_hash, even-though the
rss_hash_type indicated that hash was NOT computed over the
L4 (UDP or TCP) part of the packet.
Added comments from the datasheet, to make it more clear what
these masks are selecting.
Signed-off-by: Jesper Dangaard Brouer <brouer(a)redhat.com>
Acked-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/mlx5/device.h | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -635,8 +635,14 @@ enum {
};
enum {
- CQE_RSS_HTYPE_IP = 0x3 << 6,
- CQE_RSS_HTYPE_L4 = 0x3 << 2,
+ CQE_RSS_HTYPE_IP = 0x3 << 2,
+ /* cqe->rss_hash_type[3:2] - IP destination selected for hash
+ * (00 = none, 01 = IPv4, 10 = IPv6, 11 = Reserved)
+ */
+ CQE_RSS_HTYPE_L4 = 0x3 << 6,
+ /* cqe->rss_hash_type[7:6] - L4 destination selected for hash
+ * (00 = none, 01 = TCP. 10 = UDP, 11 = IPSEC.SPI
+ */
};
enum {
Patches currently in stable-queue which might be from brouer(a)redhat.com are
queue-4.4/mlx5-fix-bug-reading-rss_hash_type-from-cqe.patch
This is a note to let you know that I've just added the patch titled
mISDN: Fix a sleep-in-atomic bug
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
misdn-fix-a-sleep-in-atomic-bug.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 10:31:53 CEST 2018
From: Jia-Ju Bai <baijiaju1990(a)163.com>
Date: Wed, 31 May 2017 15:08:25 +0800
Subject: mISDN: Fix a sleep-in-atomic bug
From: Jia-Ju Bai <baijiaju1990(a)163.com>
[ Upstream commit 93818da5eed63fbc17b64080406ea53b86b23309 ]
The driver may sleep under a read spin lock, and the function call path is:
send_socklist (acquire the lock by read_lock)
skb_copy(GFP_KERNEL) --> may sleep
To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".
Signed-off-by: Jia-Ju Bai <baijiaju1990(a)163.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/isdn/mISDN/stack.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/isdn/mISDN/stack.c
+++ b/drivers/isdn/mISDN/stack.c
@@ -72,7 +72,7 @@ send_socklist(struct mISDN_sock_list *sl
if (sk->sk_state != MISDN_BOUND)
continue;
if (!cskb)
- cskb = skb_copy(skb, GFP_KERNEL);
+ cskb = skb_copy(skb, GFP_ATOMIC);
if (!cskb) {
printk(KERN_WARNING "%s no skb\n", __func__);
break;
Patches currently in stable-queue which might be from baijiaju1990(a)163.com are
queue-4.4/misdn-fix-a-sleep-in-atomic-bug.patch
queue-4.4/qlcnic-fix-a-sleep-in-atomic-bug-in-qlcnic_82xx_hw_write_wx_2m-and-qlcnic_82xx_hw_read_wx_2m.patch