This is a note to let you know that I've just added the patch titled
net: llc: add lock_sock in llc_ui_bind to avoid a race condition
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-llc-add-lock_sock-in-llc_ui_bind-to-avoid-a-race-condition.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: linzhang <xiaolou4617(a)gmail.com>
Date: Thu, 25 May 2017 14:07:18 +0800
Subject: net: llc: add lock_sock in llc_ui_bind to avoid a race condition
From: linzhang <xiaolou4617(a)gmail.com>
[ Upstream commit 0908cf4dfef35fc6ac12329007052ebe93ff1081 ]
There is a race condition in llc_ui_bind if two or more processes/threads
try to bind a same socket.
If more processes/threads bind a same socket success that will lead to
two problems, one is this action is not what we expected, another is
will lead to kernel in unstable status or oops(in my simple test case,
cause llc2.ko can't unload).
The current code is test SOCK_ZAPPED bit to avoid a process to
bind a same socket twice but that is can't avoid more processes/threads
try to bind a same socket at the same time.
So, add lock_sock in llc_ui_bind like others, such as llc_ui_connect.
Signed-off-by: Lin Zhang <xiaolou4617(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/llc/af_llc.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -309,6 +309,8 @@ static int llc_ui_bind(struct socket *so
int rc = -EINVAL;
dprintk("%s: binding %02X\n", __func__, addr->sllc_sap);
+
+ lock_sock(sk);
if (unlikely(!sock_flag(sk, SOCK_ZAPPED) || addrlen != sizeof(*addr)))
goto out;
rc = -EAFNOSUPPORT;
@@ -380,6 +382,7 @@ static int llc_ui_bind(struct socket *so
out_put:
llc_sap_put(sap);
out:
+ release_sock(sk);
return rc;
}
Patches currently in stable-queue which might be from xiaolou4617(a)gmail.com are
queue-4.9/net-x25-fix-one-potential-use-after-free-issue.patch
queue-4.9/net-llc-add-lock_sock-in-llc_ui_bind-to-avoid-a-race-condition.patch
queue-4.9/net-ieee802154-fix-net_device-reference-release-too-early.patch
This is a note to let you know that I've just added the patch titled
net: ieee802154: fix net_device reference release too early
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ieee802154-fix-net_device-reference-release-too-early.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Lin Zhang <xiaolou4617(a)gmail.com>
Date: Tue, 23 May 2017 13:29:39 +0800
Subject: net: ieee802154: fix net_device reference release too early
From: Lin Zhang <xiaolou4617(a)gmail.com>
[ Upstream commit a611c58b3d42a92e6b23423e166dd17c0c7fffce ]
This patch fixes the kernel oops when release net_device reference in
advance. In function raw_sendmsg(i think the dgram_sendmsg has the same
problem), there is a race condition between dev_put and dev_queue_xmit
when the device is gong that maybe lead to dev_queue_ximt to see
an illegal net_device pointer.
My test kernel is 3.13.0-32 and because i am not have a real 802154
device, so i change lowpan_newlink function to this:
/* find and hold real wpan device */
real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
if (!real_dev)
return -ENODEV;
// if (real_dev->type != ARPHRD_IEEE802154) {
// dev_put(real_dev);
// return -EINVAL;
// }
lowpan_dev_info(dev)->real_dev = real_dev;
lowpan_dev_info(dev)->fragment_tag = 0;
mutex_init(&lowpan_dev_info(dev)->dev_list_mtx);
Also, in order to simulate preempt, i change the raw_sendmsg function
to this:
skb->dev = dev;
skb->sk = sk;
skb->protocol = htons(ETH_P_IEEE802154);
dev_put(dev);
//simulate preempt
schedule_timeout_uninterruptible(30 * HZ);
err = dev_queue_xmit(skb);
if (err > 0)
err = net_xmit_errno(err);
and this is my userspace test code named test_send_data:
int main(int argc, char **argv)
{
char buf[127];
int sockfd;
sockfd = socket(AF_IEEE802154, SOCK_RAW, 0);
if (sockfd < 0) {
printf("create sockfd error: %s\n", strerror(errno));
return -1;
}
send(sockfd, buf, sizeof(buf), 0);
return 0;
}
This is my test case:
root@zhanglin-x-computer:~/develop/802154# uname -a
Linux zhanglin-x-computer 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15
03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
root@zhanglin-x-computer:~/develop/802154# ip link add link eth0 name
lowpan0 type lowpan
root@zhanglin-x-computer:~/develop/802154#
//keep the lowpan0 device down
root@zhanglin-x-computer:~/develop/802154# ./test_send_data &
//wait a while
root@zhanglin-x-computer:~/develop/802154# ip link del link dev lowpan0
//the device is gone
//oops
[381.303307] general protection fault: 0000 [#1]SMP
[381.303407] Modules linked in: af_802154 6lowpan bnep rfcomm
bluetooth nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek
rts5139(C) snd_hda_intel
snd_had_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi
snd_seq_midi_event snd_rawmidi snd_req intel_rapl snd_seq_device
coretemp i915 kvm_intel
kvm snd_timer snd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
cypted drm_kms_helper drm i2c_algo_bit soundcore video mac_hid
parport_pc ppdev ip parport hid_generic
usbhid hid ahci r8169 mii libahdi
[381.304286] CPU:1 PID: 2524 Commm: 1 Tainted: G C 0 3.13.0-32-generic
[381.304409] Hardware name: Haier Haier DT Computer/Haier DT Codputer,
BIOS FIBT19H02_X64 06/09/2014
[381.304546] tasks: ffff000096965fc0 ti: ffffB0013779c000 task.ti:
ffffB8013779c000
[381.304659] RIP: 0010:[<ffffffff01621fe1>] [<ffffffff81621fe1>]
__dev_queue_ximt+0x61/0x500
[381.304798] RSP: 0018:ffffB8013779dca0 EFLAGS: 00010202
[381.304880] RAX: 272b031d57565351 RBX: 0000000000000000 RCX: ffff8800968f1a00
[381.304987] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800968f1a00
[381.305095] RBP: ffff8e013773dce0 R08: 0000000000000266 R09: 0000000000000004
[381.305202] R10: 0000000000000004 R11: 0000000000000005 R12: ffff88013902e000
[381.305310] R13: 000000000000007f R14: 000000000000007f R15: ffff8800968f1a00
[381.305418] FS: 00007fc57f50f740(0000) GS: ffff88013fc80000(0000)
knlGS: 0000000000000000
[381.305540] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[381.305627] CR2: 00007fad0841c000 CR3: 00000001368dd000 CR4: 00000000001007e0
[361.905734] Stack:
[381.305768] 00000000002052d0 000000003facb30a ffff88013779dcc0
ffff880137764000
[381.305898] ffff88013779de70 000000000000007f 000000000000007f
ffff88013902e000
[381.306026] ffff88013779dcf0 ffffffff81622490 ffff88013779dd39
ffffffffa03af9f1
[381.306155] Call Trace:
[381.306202] [<ffffffff81622490>] dev_queue_xmit+0x10/0x20
[381.306294] [<ffffffffa03af9f1>] raw_sendmsg+0x1b1/0x270 [af_802154]
[381.306396] [<ffffffffa03af054>] ieee802154_sock_sendmsg+0x14/0x20 [af_802154]
[381.306512] [<ffffffff816079eb>] sock_sendmsg+0x8b/0xc0
[381.306600] [<ffffffff811d52a5>] ? __d_alloc+0x25/0x180
[381.306687] [<ffffffff811a1f56>] ? kmem_cache_alloc_trace+0x1c6/0x1f0
[381.306791] [<ffffffff81607b91>] SYSC_sendto+0x121/0x1c0
[381.306878] [<ffffffff8109ddf4>] ? vtime_account_user+x54/0x60
[381.306975] [<ffffffff81020d45>] ? syscall_trace_enter+0x145/0x250
[381.307073] [<ffffffff816086ae>] SyS_sendto+0xe/0x10
[381.307156] [<ffffffff8172c87f>] tracesys+0xe1/0xe6
[381.307233] Code: c6 a1 a4 ff 41 8b 57 78 49 8b 47 20 85 d2 48 8b 80
78 07 00 00 75 21 49 8b 57 18 48 85 d2 74 18 48 85 c0 74 13 8b 92 ac
01 00 00 <3b> 50 10 73 08 8b 44 90 14 41 89 47 78 41 f6 84 24 d5 00 00
00
[381.307801] RIP [<ffffffff81621fe1>] _dev_queue_xmit+0x61/0x500
[381.307901] RSP <ffff88013779dca0>
[381.347512] Kernel panic - not syncing: Fatal exception in interrupt
[381.347747] drm_kms_helper: panic occurred, switching back to text console
In my opinion, there is always exist a chance that the device is gong
before call dev_queue_xmit.
I think the latest kernel is have the same problem and that
dev_put should be behind of the dev_queue_xmit.
Signed-off-by: Lin Zhang <xiaolou4617(a)gmail.com>
Acked-by: Stefan Schmidt <stefan(a)osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel(a)holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ieee802154/socket.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -304,12 +304,12 @@ static int raw_sendmsg(struct sock *sk,
skb->sk = sk;
skb->protocol = htons(ETH_P_IEEE802154);
- dev_put(dev);
-
err = dev_queue_xmit(skb);
if (err > 0)
err = net_xmit_errno(err);
+ dev_put(dev);
+
return err ?: size;
out_skb:
@@ -693,12 +693,12 @@ static int dgram_sendmsg(struct sock *sk
skb->sk = sk;
skb->protocol = htons(ETH_P_IEEE802154);
- dev_put(dev);
-
err = dev_queue_xmit(skb);
if (err > 0)
err = net_xmit_errno(err);
+ dev_put(dev);
+
return err ?: size;
out_skb:
Patches currently in stable-queue which might be from xiaolou4617(a)gmail.com are
queue-4.9/net-x25-fix-one-potential-use-after-free-issue.patch
queue-4.9/net-llc-add-lock_sock-in-llc_ui_bind-to-avoid-a-race-condition.patch
queue-4.9/net-ieee802154-fix-net_device-reference-release-too-early.patch
This is a note to let you know that I've just added the patch titled
net: freescale: fix potential null pointer dereference
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-freescale-fix-potential-null-pointer-dereference.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: "Gustavo A. R. Silva" <garsilva(a)embeddedor.com>
Date: Tue, 30 May 2017 17:38:43 -0500
Subject: net: freescale: fix potential null pointer dereference
From: "Gustavo A. R. Silva" <garsilva(a)embeddedor.com>
[ Upstream commit 06d2d6431bc8d41ef5ffd8bd4b52cea9f72aed22 ]
Add NULL check before dereferencing pointer _id_ in order to avoid
a potential NULL pointer dereference.
Addresses-Coverity-ID: 1397995
Signed-off-by: Gustavo A. R. Silva <garsilva(a)embeddedor.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/freescale/fsl_pq_mdio.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/freescale/fsl_pq_mdio.c
+++ b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
@@ -381,7 +381,7 @@ static int fsl_pq_mdio_probe(struct plat
{
const struct of_device_id *id =
of_match_device(fsl_pq_mdio_match, &pdev->dev);
- const struct fsl_pq_mdio_data *data = id->data;
+ const struct fsl_pq_mdio_data *data;
struct device_node *np = pdev->dev.of_node;
struct resource res;
struct device_node *tbi;
@@ -389,6 +389,13 @@ static int fsl_pq_mdio_probe(struct plat
struct mii_bus *new_bus;
int err;
+ if (!id) {
+ dev_err(&pdev->dev, "Failed to match device\n");
+ return -ENODEV;
+ }
+
+ data = id->data;
+
dev_dbg(&pdev->dev, "found %s compatible node\n", id->compatible);
new_bus = mdiobus_alloc_size(sizeof(*priv));
Patches currently in stable-queue which might be from garsilva(a)embeddedor.com are
queue-4.9/pm-devfreq-fix-potential-null-pointer-dereference-in-governor_store.patch
queue-4.9/net-freescale-fix-potential-null-pointer-dereference.patch
This is a note to let you know that I've just added the patch titled
net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ethernet-ti-cpsw-adjust-cpsw-fifos-depth-for-fullduplex-flow-control.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Grygorii Strashko <grygorii.strashko(a)ti.com>
Date: Mon, 8 May 2017 14:21:21 -0500
Subject: net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control
From: Grygorii Strashko <grygorii.strashko(a)ti.com>
[ Upstream commit 48f5bccc60675f8426a6159935e8636a1fd89f56 ]
When users set flow control using ethtool the bits are set properly in the
CPGMAC_SL MACCONTROL register, but the FIFO depth in the respective Port n
Maximum FIFO Blocks (Pn_MAX_BLKS) registers remains set to the minimum size
reset value. When receive flow control is enabled on a port, the port's
associated FIFO block allocation must be adjusted. The port RX allocation
must increase to accommodate the flow control runout. The TRM recommends
numbers of 5 or 6.
Hence, apply required Port FIFO configuration to
Pn_MAX_BLKS.Pn_TX_MAX_BLKS=0xF and Pn_MAX_BLKS.Pn_RX_MAX_BLKS=0x5 during
interface initialization.
Cc: Schuyler Patton <spatton(a)ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko(a)ti.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -282,6 +282,10 @@ struct cpsw_ss_regs {
/* Bit definitions for the CPSW1_TS_SEQ_LTYPE register */
#define CPSW_V1_SEQ_ID_OFS_SHIFT 16
+#define CPSW_MAX_BLKS_TX 15
+#define CPSW_MAX_BLKS_TX_SHIFT 4
+#define CPSW_MAX_BLKS_RX 5
+
struct cpsw_host_regs {
u32 max_blks;
u32 blk_cnt;
@@ -1160,11 +1164,23 @@ static void cpsw_slave_open(struct cpsw_
switch (cpsw->version) {
case CPSW_VERSION_1:
slave_write(slave, TX_PRIORITY_MAPPING, CPSW1_TX_PRI_MAP);
+ /* Increase RX FIFO size to 5 for supporting fullduplex
+ * flow control mode
+ */
+ slave_write(slave,
+ (CPSW_MAX_BLKS_TX << CPSW_MAX_BLKS_TX_SHIFT) |
+ CPSW_MAX_BLKS_RX, CPSW1_MAX_BLKS);
break;
case CPSW_VERSION_2:
case CPSW_VERSION_3:
case CPSW_VERSION_4:
slave_write(slave, TX_PRIORITY_MAPPING, CPSW2_TX_PRI_MAP);
+ /* Increase RX FIFO size to 5 for supporting fullduplex
+ * flow control mode
+ */
+ slave_write(slave,
+ (CPSW_MAX_BLKS_TX << CPSW_MAX_BLKS_TX_SHIFT) |
+ CPSW_MAX_BLKS_RX, CPSW2_MAX_BLKS);
break;
}
Patches currently in stable-queue which might be from grygorii.strashko(a)ti.com are
queue-4.9/net-ethernet-ti-cpsw-adjust-cpsw-fifos-depth-for-fullduplex-flow-control.patch
This is a note to let you know that I've just added the patch titled
net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-fec-add-a-fec_enet_clear_ethtool_stats-stub-for-config_m5272.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Fabio Estevam <fabio.estevam(a)nxp.com>
Date: Fri, 9 Jun 2017 22:37:22 -0300
Subject: net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272
From: Fabio Estevam <fabio.estevam(a)nxp.com>
[ Upstream commit bf292f1b2c813f1d6ac49b04bd1a9863d8314266 ]
Commit 2b30842b23b9 ("net: fec: Clear and enable MIB counters on imx51")
introduced fec_enet_clear_ethtool_stats(), but missed to add a stub
for the CONFIG_M5272=y case, causing build failure for the
m5272c3_defconfig.
Add the missing empty stub to fix the build failure.
Reported-by: Paul Gortmaker <paul.gortmaker(a)windriver.com>
Signed-off-by: Fabio Estevam <fabio.estevam(a)nxp.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/freescale/fec_main.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2371,6 +2371,10 @@ static int fec_enet_get_sset_count(struc
static inline void fec_enet_update_ethtool_stats(struct net_device *dev)
{
}
+
+static inline void fec_enet_clear_ethtool_stats(struct net_device *dev)
+{
+}
#endif /* !defined(CONFIG_M5272) */
static int fec_enet_nway_reset(struct net_device *dev)
Patches currently in stable-queue which might be from fabio.estevam(a)nxp.com are
queue-4.9/arm-dts-imx6qdl-wandboard-fix-audio-channel-swap.patch
queue-4.9/net-fec-add-a-fec_enet_clear_ethtool_stats-stub-for-config_m5272.patch
queue-4.9/arm-dts-imx53-qsrb-pulldown-pmic-irq-pin.patch
queue-4.9/arm-imx-add-mxc_cpu_imx6ull-and-cpu_is_imx6ull.patch
This is a note to let you know that I've just added the patch titled
net: ena: fix race condition between submit and completion admin command
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ena-fix-race-condition-between-submit-and-completion-admin-command.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Netanel Belgazal <netanel(a)amazon.com>
Date: Sun, 11 Jun 2017 15:42:46 +0300
Subject: net: ena: fix race condition between submit and completion admin command
From: Netanel Belgazal <netanel(a)amazon.com>
[ Upstream commit 661d2b0ccef6a63f48b61105cf7be17403d1db01 ]
Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).
Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function have a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context
allocated before call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.
This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.
Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()
Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel(a)amazon.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -232,11 +232,9 @@ static struct ena_comp_ctx *__ena_com_su
tail_masked = admin_queue->sq.tail & queue_size_mask;
/* In case of queue FULL */
- cnt = admin_queue->sq.tail - admin_queue->sq.head;
+ cnt = atomic_read(&admin_queue->outstanding_cmds);
if (cnt >= admin_queue->q_depth) {
- pr_debug("admin queue is FULL (tail %d head %d depth: %d)\n",
- admin_queue->sq.tail, admin_queue->sq.head,
- admin_queue->q_depth);
+ pr_debug("admin queue is full.\n");
admin_queue->stats.out_of_space++;
return ERR_PTR(-ENOSPC);
}
Patches currently in stable-queue which might be from netanel(a)amazon.com are
queue-4.9/net-ena-disable-admin-msix-while-working-in-polling-mode.patch
queue-4.9/net-ena-fix-race-condition-between-submit-and-completion-admin-command.patch
queue-4.9/net-ena-fix-rare-uncompleted-admin-command-false-alarm.patch
queue-4.9/net-ena-add-missing-unmap-bars-on-device-removal.patch
queue-4.9/net-ena-add-missing-return-when-ena_com_get_io_handlers-fails.patch
This is a note to let you know that I've just added the patch titled
net: ena: fix rare uncompleted admin command false alarm
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ena-fix-rare-uncompleted-admin-command-false-alarm.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Netanel Belgazal <netanel(a)amazon.com>
Date: Sun, 11 Jun 2017 15:42:43 +0300
Subject: net: ena: fix rare uncompleted admin command false alarm
From: Netanel Belgazal <netanel(a)amazon.com>
[ Upstream commit a77c1aafcc906f657d1a0890c1d898be9ee1d5c9 ]
The current flow to detect admin completion is:
while (command_not_completed) {
if (timeout)
error
check_for_completion()
sleep()
}
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.
The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.
Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel(a)amazon.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -508,15 +508,20 @@ static int ena_com_comp_status_to_errno(
static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_ctx,
struct ena_com_admin_queue *admin_queue)
{
- unsigned long flags;
- u32 start_time;
+ unsigned long flags, timeout;
int ret;
- start_time = ((u32)jiffies_to_usecs(jiffies));
+ timeout = jiffies + ADMIN_CMD_TIMEOUT_US;
- while (comp_ctx->status == ENA_CMD_SUBMITTED) {
- if ((((u32)jiffies_to_usecs(jiffies)) - start_time) >
- ADMIN_CMD_TIMEOUT_US) {
+ while (1) {
+ spin_lock_irqsave(&admin_queue->q_lock, flags);
+ ena_com_handle_admin_completion(admin_queue);
+ spin_unlock_irqrestore(&admin_queue->q_lock, flags);
+
+ if (comp_ctx->status != ENA_CMD_SUBMITTED)
+ break;
+
+ if (time_is_before_jiffies(timeout)) {
pr_err("Wait for completion (polling) timeout\n");
/* ENA didn't have any completion */
spin_lock_irqsave(&admin_queue->q_lock, flags);
@@ -528,10 +533,6 @@ static int ena_com_wait_and_process_admi
goto err;
}
- spin_lock_irqsave(&admin_queue->q_lock, flags);
- ena_com_handle_admin_completion(admin_queue);
- spin_unlock_irqrestore(&admin_queue->q_lock, flags);
-
msleep(100);
}
Patches currently in stable-queue which might be from netanel(a)amazon.com are
queue-4.9/net-ena-disable-admin-msix-while-working-in-polling-mode.patch
queue-4.9/net-ena-fix-race-condition-between-submit-and-completion-admin-command.patch
queue-4.9/net-ena-fix-rare-uncompleted-admin-command-false-alarm.patch
queue-4.9/net-ena-add-missing-unmap-bars-on-device-removal.patch
queue-4.9/net-ena-add-missing-return-when-ena_com_get_io_handlers-fails.patch
This is a note to let you know that I've just added the patch titled
net: ena: disable admin msix while working in polling mode
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ena-disable-admin-msix-while-working-in-polling-mode.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Netanel Belgazal <netanel(a)amazon.com>
Date: Sun, 11 Jun 2017 15:42:49 +0300
Subject: net: ena: disable admin msix while working in polling mode
From: Netanel Belgazal <netanel(a)amazon.com>
[ Upstream commit a2cc5198dac102775b21787752a2e0afe44ad311 ]
Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel(a)amazon.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -61,6 +61,8 @@
#define ENA_MMIO_READ_TIMEOUT 0xFFFFFFFF
+#define ENA_REGS_ADMIN_INTR_MASK 1
+
/*****************************************************************************/
/*****************************************************************************/
/*****************************************************************************/
@@ -1448,6 +1450,12 @@ void ena_com_admin_destroy(struct ena_co
void ena_com_set_admin_polling_mode(struct ena_com_dev *ena_dev, bool polling)
{
+ u32 mask_value = 0;
+
+ if (polling)
+ mask_value = ENA_REGS_ADMIN_INTR_MASK;
+
+ writel(mask_value, ena_dev->reg_bar + ENA_REGS_INTR_MASK_OFF);
ena_dev->admin_queue.polling = polling;
}
Patches currently in stable-queue which might be from netanel(a)amazon.com are
queue-4.9/net-ena-disable-admin-msix-while-working-in-polling-mode.patch
queue-4.9/net-ena-fix-race-condition-between-submit-and-completion-admin-command.patch
queue-4.9/net-ena-fix-rare-uncompleted-admin-command-false-alarm.patch
queue-4.9/net-ena-add-missing-unmap-bars-on-device-removal.patch
queue-4.9/net-ena-add-missing-return-when-ena_com_get_io_handlers-fails.patch
This is a note to let you know that I've just added the patch titled
net: ena: add missing unmap bars on device removal
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ena-add-missing-unmap-bars-on-device-removal.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Netanel Belgazal <netanel(a)amazon.com>
Date: Sun, 11 Jun 2017 15:42:47 +0300
Subject: net: ena: add missing unmap bars on device removal
From: Netanel Belgazal <netanel(a)amazon.com>
[ Upstream commit 0857d92f71b6cb75281fde913554b2d5436c394b ]
This patch also change the mapping functions to devm_ functions
Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel(a)amazon.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/amazon/ena/ena_netdev.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2808,6 +2808,11 @@ static void ena_release_bars(struct ena_
{
int release_bars;
+ if (ena_dev->mem_bar)
+ devm_iounmap(&pdev->dev, ena_dev->mem_bar);
+
+ devm_iounmap(&pdev->dev, ena_dev->reg_bar);
+
release_bars = pci_select_bars(pdev, IORESOURCE_MEM) & ENA_BAR_MASK;
pci_release_selected_regions(pdev, release_bars);
}
@@ -2895,8 +2900,9 @@ static int ena_probe(struct pci_dev *pde
goto err_free_ena_dev;
}
- ena_dev->reg_bar = ioremap(pci_resource_start(pdev, ENA_REG_BAR),
- pci_resource_len(pdev, ENA_REG_BAR));
+ ena_dev->reg_bar = devm_ioremap(&pdev->dev,
+ pci_resource_start(pdev, ENA_REG_BAR),
+ pci_resource_len(pdev, ENA_REG_BAR));
if (!ena_dev->reg_bar) {
dev_err(&pdev->dev, "failed to remap regs bar\n");
rc = -EFAULT;
@@ -2916,8 +2922,9 @@ static int ena_probe(struct pci_dev *pde
ena_set_push_mode(pdev, ena_dev, &get_feat_ctx);
if (ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) {
- ena_dev->mem_bar = ioremap_wc(pci_resource_start(pdev, ENA_MEM_BAR),
- pci_resource_len(pdev, ENA_MEM_BAR));
+ ena_dev->mem_bar = devm_ioremap_wc(&pdev->dev,
+ pci_resource_start(pdev, ENA_MEM_BAR),
+ pci_resource_len(pdev, ENA_MEM_BAR));
if (!ena_dev->mem_bar) {
rc = -EFAULT;
goto err_device_destroy;
Patches currently in stable-queue which might be from netanel(a)amazon.com are
queue-4.9/net-ena-disable-admin-msix-while-working-in-polling-mode.patch
queue-4.9/net-ena-fix-race-condition-between-submit-and-completion-admin-command.patch
queue-4.9/net-ena-fix-rare-uncompleted-admin-command-false-alarm.patch
queue-4.9/net-ena-add-missing-unmap-bars-on-device-removal.patch
queue-4.9/net-ena-add-missing-return-when-ena_com_get_io_handlers-fails.patch
This is a note to let you know that I've just added the patch titled
net: emac: fix reset timeout with AR8035 phy
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-emac-fix-reset-timeout-with-ar8035-phy.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Christian Lamparter <chunkeey(a)googlemail.com>
Date: Wed, 7 Jun 2017 15:51:15 +0200
Subject: net: emac: fix reset timeout with AR8035 phy
From: Christian Lamparter <chunkeey(a)googlemail.com>
[ Upstream commit 19d90ece81da802207a9b91ce95a29fbdc40626e ]
This patch fixes a problem where the AR8035 PHY can't be
detected on an Cisco Meraki MR24, if the ethernet cable is
not connected on boot.
Russell Senior provided steps to reproduce the issue:
|Disconnect ethernet cable, apply power, wait until device has booted,
|plug in ethernet, check for interfaces, no eth0 is listed.
|
|This appears to be a problem during probing of the AR8035 Phy chip.
|When ethernet has no link, the phy detection fails, and eth0 is not
|created. Plugging ethernet later has no effect, because there is no
|interface as far as the kernel is concerned. The relevant part of
|the boot log looks like this:
|this is the failing case:
|
|[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout
|[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY!
|and the succeeding case:
|
|[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:..
|[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01)
Based on the comment and the commit message of
commit 23fbb5a87c56 ("emac: Fix EMAC soft reset on 460EX/GT").
This is because the AR8035 PHY doesn't provide the TX Clock,
if the ethernet cable is not attached. This causes the reset
to timeout and the PHY detection code in emac_init_phy() is
unable to detect the AR8035 PHY. As a result, the emac driver
bails out early and the user left with no ethernet.
In order to stay compatible with existing configurations, the driver
tries the current reset approach at first. Only if the first attempt
timed out, it does perform one more retry with the clock temporarily
switched to the internal source for just the duration of the reset.
LEDE-Bug: #687 <https://bugs.lede-project.org/index.php?do=details&task_id=687>
Cc: Chris Blake <chrisrblake93(a)gmail.com>
Reported-by: Russell Senior <russell(a)personaltelco.net>
Fixes: 23fbb5a87c56e98 ("emac: Fix EMAC soft reset on 460EX/GT")
Signed-off-by: Christian Lamparter <chunkeey(a)googlemail.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/ibm/emac/core.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -342,6 +342,7 @@ static int emac_reset(struct emac_instan
{
struct emac_regs __iomem *p = dev->emacp;
int n = 20;
+ bool __maybe_unused try_internal_clock = false;
DBG(dev, "reset" NL);
@@ -354,6 +355,7 @@ static int emac_reset(struct emac_instan
}
#ifdef CONFIG_PPC_DCR_NATIVE
+do_retry:
/*
* PPC460EX/GT Embedded Processor Advanced User's Manual
* section 28.10.1 Mode Register 0 (EMACx_MR0) states:
@@ -361,10 +363,19 @@ static int emac_reset(struct emac_instan
* of the EMAC. If none is present, select the internal clock
* (SDR0_ETH_CFG[EMACx_PHY_CLK] = 1).
* After a soft reset, select the external clock.
+ *
+ * The AR8035-A PHY Meraki MR24 does not provide a TX Clk if the
+ * ethernet cable is not attached. This causes the reset to timeout
+ * and the PHY detection code in emac_init_phy() is unable to
+ * communicate and detect the AR8035-A PHY. As a result, the emac
+ * driver bails out early and the user has no ethernet.
+ * In order to stay compatible with existing configurations, the
+ * driver will temporarily switch to the internal clock, after
+ * the first reset fails.
*/
if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) {
- if (dev->phy_address == 0xffffffff &&
- dev->phy_map == 0xffffffff) {
+ if (try_internal_clock || (dev->phy_address == 0xffffffff &&
+ dev->phy_map == 0xffffffff)) {
/* No PHY: select internal loop clock before reset */
dcri_clrset(SDR0, SDR0_ETH_CFG,
0, SDR0_ETH_CFG_ECS << dev->cell_index);
@@ -382,8 +393,15 @@ static int emac_reset(struct emac_instan
#ifdef CONFIG_PPC_DCR_NATIVE
if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) {
- if (dev->phy_address == 0xffffffff &&
- dev->phy_map == 0xffffffff) {
+ if (!n && !try_internal_clock) {
+ /* first attempt has timed out. */
+ n = 20;
+ try_internal_clock = true;
+ goto do_retry;
+ }
+
+ if (try_internal_clock || (dev->phy_address == 0xffffffff &&
+ dev->phy_map == 0xffffffff)) {
/* No PHY: restore external clock source after reset */
dcri_clrset(SDR0, SDR0_ETH_CFG,
SDR0_ETH_CFG_ECS << dev->cell_index, 0);
Patches currently in stable-queue which might be from chunkeey(a)googlemail.com are
queue-4.9/net-emac-fix-reset-timeout-with-ar8035-phy.patch
queue-4.9/arm-dts-qcom-ipq4019-fix-i2c_0-node.patch