This is the start of the stable review cycle for the 5.4.65 release. There are 8 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 13 Sep 2020 12:54:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.65-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.4.65-rc1
Jakub Kicinski kuba@kernel.org net: disable netpoll on fresh napis
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp tipc: fix shutdown() of connectionless socket
Vinicius Costa Gomes vinicius.gomes@intel.com taprio: Fix using wrong queues in gate mask
Xin Long lucien.xin@gmail.com sctp: not disable bh in the whole sctp_get_port_local()
Kamil Lorenc kamil@re-ws.pl net: usb: dm9601: Add USB ID of Keenetic Plus DSL
Paul Moore paul@paul-moore.com netlabel: fix problems with mapping removal
Ido Schimmel idosch@nvidia.com ipv6: Fix sysctl max for fib_multipath_hash_policy
Ido Schimmel idosch@nvidia.com ipv4: Silence suspicious RCU usage warning
-------------
Diffstat:
Makefile | 4 +-- drivers/net/usb/dm9601.c | 4 +++ net/core/dev.c | 3 +- net/core/netpoll.c | 2 +- net/ipv4/fib_trie.c | 3 +- net/ipv6/sysctl_net_ipv6.c | 3 +- net/netlabel/netlabel_domainhash.c | 59 +++++++++++++++++++------------------- net/sched/sch_taprio.c | 30 +++++++++++++++---- net/sctp/socket.c | 16 ++++------- net/tipc/socket.c | 9 ++++-- 10 files changed, 79 insertions(+), 54 deletions(-)
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 7f6f32bb7d3355cd78ebf1dece9a6ea7a0ca8158 ]
fib_info_notify_update() is always called with RTNL held, but not from an RCU read-side critical section. This leads to the following warning [1] when the FIB table list is traversed with hlist_for_each_entry_rcu(), but without a proper lockdep expression.
Since modification of the list is protected by RTNL, silence the warning by adding a lockdep expression which verifies RTNL is held.
[1] ============================= WARNING: suspicious RCU usage 5.9.0-rc1-custom-14233-g2f26e122d62f #129 Not tainted ----------------------------- net/ipv4/fib_trie.c:2124 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 1 lock held by ip/834: #0: ffffffff85a3b6b0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x49a/0xbd0
stack backtrace: CPU: 0 PID: 834 Comm: ip Not tainted 5.9.0-rc1-custom-14233-g2f26e122d62f #129 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack+0x100/0x184 lockdep_rcu_suspicious+0x143/0x14d fib_info_notify_update+0x8d1/0xa60 __nexthop_replace_notify+0xd2/0x290 rtm_new_nexthop+0x35e2/0x5946 rtnetlink_rcv_msg+0x4f7/0xbd0 netlink_rcv_skb+0x17a/0x480 rtnetlink_rcv+0x22/0x30 netlink_unicast+0x5ae/0x890 netlink_sendmsg+0x98a/0xf40 ____sys_sendmsg+0x879/0xa00 ___sys_sendmsg+0x122/0x190 __sys_sendmsg+0x103/0x1d0 __x64_sys_sendmsg+0x7d/0xb0 do_syscall_64+0x32/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fde28c3be57 Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 RSP: 002b:00007ffc09330028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fde28c3be57 RDX: 0000000000000000 RSI: 00007ffc09330090 RDI: 0000000000000003 RBP: 000000005f45f911 R08: 0000000000000001 R09: 00007ffc0933012c R10: 0000000000000076 R11: 0000000000000246 R12: 0000000000000001 R13: 00007ffc09330290 R14: 00007ffc09330eee R15: 00005610e48ed020
Fixes: 1bff1a0c9bbd ("ipv4: Add function to send route updates") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/fib_trie.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -2010,7 +2010,8 @@ void fib_info_notify_update(struct net * struct hlist_head *head = &net->ipv4.fib_table_hash[h]; struct fib_table *tb;
- hlist_for_each_entry_rcu(tb, head, tb_hlist) + hlist_for_each_entry_rcu(tb, head, tb_hlist, + lockdep_rtnl_is_held()) __fib_info_notify_update(net, tb, info); } }
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 05d4487197b2b71d5363623c28924fd58c71c0b6 ]
Cited commit added the possible value of '2', but it cannot be set. Fix it by adjusting the maximum value to '2'. This is consistent with the corresponding IPv4 sysctl.
Before:
# sysctl -w net.ipv6.fib_multipath_hash_policy=2 sysctl: setting key "net.ipv6.fib_multipath_hash_policy": Invalid argument net.ipv6.fib_multipath_hash_policy = 2 # sysctl net.ipv6.fib_multipath_hash_policy net.ipv6.fib_multipath_hash_policy = 0
After:
# sysctl -w net.ipv6.fib_multipath_hash_policy=2 net.ipv6.fib_multipath_hash_policy = 2 # sysctl net.ipv6.fib_multipath_hash_policy net.ipv6.fib_multipath_hash_policy = 2
Fixes: d8f74f0975d8 ("ipv6: Support multipath hashing on inner IP pkts") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Stephen Suryaputra ssuryaextr@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/sysctl_net_ipv6.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ipv6/sysctl_net_ipv6.c +++ b/net/ipv6/sysctl_net_ipv6.c @@ -21,6 +21,7 @@ #include <net/calipso.h> #endif
+static int two = 2; static int flowlabel_reflect_max = 0x7; static int auto_flowlabels_min; static int auto_flowlabels_max = IP6_AUTO_FLOW_LABEL_MAX; @@ -151,7 +152,7 @@ static struct ctl_table ipv6_table_templ .mode = 0644, .proc_handler = proc_rt6_multipath_hash_policy, .extra1 = SYSCTL_ZERO, - .extra2 = SYSCTL_ONE, + .extra2 = &two, }, { .procname = "seg6_flowlabel",
From: Paul Moore paul@paul-moore.com
[ Upstream commit d3b990b7f327e2afa98006e7666fb8ada8ed8683 ]
This patch fixes two main problems seen when removing NetLabel mappings: memory leaks and potentially extra audit noise.
The memory leaks are caused by not properly free'ing the mapping's address selector struct when free'ing the entire entry as well as not properly cleaning up a temporary mapping entry when adding new address selectors to an existing entry. This patch fixes both these problems such that kmemleak reports no NetLabel associated leaks after running the SELinux test suite.
The potentially extra audit noise was caused by the auditing code in netlbl_domhsh_remove_entry() being called regardless of the entry's validity. If another thread had already marked the entry as invalid, but not removed/free'd it from the list of mappings, then it was possible that an additional mapping removal audit record would be generated. This patch fixes this by returning early from the removal function when the entry was previously marked invalid. This change also had the side benefit of improving the code by decreasing the indentation level of large chunk of code by one (accounting for most of the diffstat).
Fixes: 63c416887437 ("netlabel: Add network address selectors to the NetLabel/LSM domain mapping") Reported-by: Stephen Smalley stephen.smalley.work@gmail.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netlabel/netlabel_domainhash.c | 59 ++++++++++++++++++------------------- 1 file changed, 30 insertions(+), 29 deletions(-)
--- a/net/netlabel/netlabel_domainhash.c +++ b/net/netlabel/netlabel_domainhash.c @@ -85,6 +85,7 @@ static void netlbl_domhsh_free_entry(str kfree(netlbl_domhsh_addr6_entry(iter6)); } #endif /* IPv6 */ + kfree(ptr->def.addrsel); } kfree(ptr->domain); kfree(ptr); @@ -536,6 +537,8 @@ int netlbl_domhsh_add(struct netlbl_dom_ goto add_return; } #endif /* IPv6 */ + /* cleanup the new entry since we've moved everything over */ + netlbl_domhsh_free_entry(&entry->rcu); } else ret_val = -EINVAL;
@@ -579,6 +582,12 @@ int netlbl_domhsh_remove_entry(struct ne { int ret_val = 0; struct audit_buffer *audit_buf; + struct netlbl_af4list *iter4; + struct netlbl_domaddr4_map *map4; +#if IS_ENABLED(CONFIG_IPV6) + struct netlbl_af6list *iter6; + struct netlbl_domaddr6_map *map6; +#endif /* IPv6 */
if (entry == NULL) return -ENOENT; @@ -596,6 +605,9 @@ int netlbl_domhsh_remove_entry(struct ne ret_val = -ENOENT; spin_unlock(&netlbl_domhsh_lock);
+ if (ret_val) + return ret_val; + audit_buf = netlbl_audit_start_common(AUDIT_MAC_MAP_DEL, audit_info); if (audit_buf != NULL) { audit_log_format(audit_buf, @@ -605,40 +617,29 @@ int netlbl_domhsh_remove_entry(struct ne audit_log_end(audit_buf); }
- if (ret_val == 0) { - struct netlbl_af4list *iter4; - struct netlbl_domaddr4_map *map4; -#if IS_ENABLED(CONFIG_IPV6) - struct netlbl_af6list *iter6; - struct netlbl_domaddr6_map *map6; -#endif /* IPv6 */ - - switch (entry->def.type) { - case NETLBL_NLTYPE_ADDRSELECT: - netlbl_af4list_foreach_rcu(iter4, - &entry->def.addrsel->list4) { - map4 = netlbl_domhsh_addr4_entry(iter4); - cipso_v4_doi_putdef(map4->def.cipso); - } + switch (entry->def.type) { + case NETLBL_NLTYPE_ADDRSELECT: + netlbl_af4list_foreach_rcu(iter4, &entry->def.addrsel->list4) { + map4 = netlbl_domhsh_addr4_entry(iter4); + cipso_v4_doi_putdef(map4->def.cipso); + } #if IS_ENABLED(CONFIG_IPV6) - netlbl_af6list_foreach_rcu(iter6, - &entry->def.addrsel->list6) { - map6 = netlbl_domhsh_addr6_entry(iter6); - calipso_doi_putdef(map6->def.calipso); - } + netlbl_af6list_foreach_rcu(iter6, &entry->def.addrsel->list6) { + map6 = netlbl_domhsh_addr6_entry(iter6); + calipso_doi_putdef(map6->def.calipso); + } #endif /* IPv6 */ - break; - case NETLBL_NLTYPE_CIPSOV4: - cipso_v4_doi_putdef(entry->def.cipso); - break; + break; + case NETLBL_NLTYPE_CIPSOV4: + cipso_v4_doi_putdef(entry->def.cipso); + break; #if IS_ENABLED(CONFIG_IPV6) - case NETLBL_NLTYPE_CALIPSO: - calipso_doi_putdef(entry->def.calipso); - break; + case NETLBL_NLTYPE_CALIPSO: + calipso_doi_putdef(entry->def.calipso); + break; #endif /* IPv6 */ - } - call_rcu(&entry->rcu, netlbl_domhsh_free_entry); } + call_rcu(&entry->rcu, netlbl_domhsh_free_entry);
return ret_val; }
From: Kamil Lorenc kamil@re-ws.pl
[ Upstream commit a609d0259183a841621f252e067f40f8cc25d6f6 ]
Keenetic Plus DSL is a xDSL modem that uses dm9620 as its USB interface.
Signed-off-by: Kamil Lorenc kamil@re-ws.pl Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/dm9601.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/net/usb/dm9601.c +++ b/drivers/net/usb/dm9601.c @@ -625,6 +625,10 @@ static const struct usb_device_id produc USB_DEVICE(0x0a46, 0x1269), /* DM9621A USB to Fast Ethernet Adapter */ .driver_info = (unsigned long)&dm9601_info, }, + { + USB_DEVICE(0x0586, 0x3427), /* ZyXEL Keenetic Plus DSL xDSL modem */ + .driver_info = (unsigned long)&dm9601_info, + }, {}, // END };
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 3106ecb43a05dc3e009779764b9da245a5d082de ]
With disabling bh in the whole sctp_get_port_local(), when snum == 0 and too many ports have been used, the do-while loop will take the cpu for a long time and cause cpu stuck:
[ ] watchdog: BUG: soft lockup - CPU#11 stuck for 22s! [ ] RIP: 0010:native_queued_spin_lock_slowpath+0x4de/0x940 [ ] Call Trace: [ ] _raw_spin_lock+0xc1/0xd0 [ ] sctp_get_port_local+0x527/0x650 [sctp] [ ] sctp_do_bind+0x208/0x5e0 [sctp] [ ] sctp_autobind+0x165/0x1e0 [sctp] [ ] sctp_connect_new_asoc+0x355/0x480 [sctp] [ ] __sctp_connect+0x360/0xb10 [sctp]
There's no need to disable bh in the whole function of sctp_get_port_local. So fix this cpu stuck by removing local_bh_disable() called at the beginning, and using spin_lock_bh() instead.
The same thing was actually done for inet_csk_get_port() in Commit ea8add2b1903 ("tcp/dccp: better use of ephemeral ports in bind()").
Thanks to Marcelo for pointing the buggy code out.
v1->v2: - use cond_resched() to yield cpu to other tasks if needed, as Eric noticed.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Ying Xu yinxu@redhat.com Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/socket.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-)
--- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -8176,8 +8176,6 @@ static int sctp_get_port_local(struct so
pr_debug("%s: begins, snum:%d\n", __func__, snum);
- local_bh_disable(); - if (snum == 0) { /* Search for an available port. */ int low, high, remaining, index; @@ -8196,20 +8194,21 @@ static int sctp_get_port_local(struct so continue; index = sctp_phashfn(sock_net(sk), rover); head = &sctp_port_hashtable[index]; - spin_lock(&head->lock); + spin_lock_bh(&head->lock); sctp_for_each_hentry(pp, &head->chain) if ((pp->port == rover) && net_eq(sock_net(sk), pp->net)) goto next; break; next: - spin_unlock(&head->lock); + spin_unlock_bh(&head->lock); + cond_resched(); } while (--remaining > 0);
/* Exhausted local port range during search? */ ret = 1; if (remaining <= 0) - goto fail; + return ret;
/* OK, here is the one we will use. HEAD (the port * hash table list entry) is non-NULL and we hold it's @@ -8224,7 +8223,7 @@ static int sctp_get_port_local(struct so * port iterator, pp being NULL. */ head = &sctp_port_hashtable[sctp_phashfn(sock_net(sk), snum)]; - spin_lock(&head->lock); + spin_lock_bh(&head->lock); sctp_for_each_hentry(pp, &head->chain) { if ((pp->port == snum) && net_eq(pp->net, sock_net(sk))) goto pp_found; @@ -8324,10 +8323,7 @@ success: ret = 0;
fail_unlock: - spin_unlock(&head->lock); - -fail: - local_bh_enable(); + spin_unlock_bh(&head->lock); return ret; }
From: Vinicius Costa Gomes vinicius.gomes@intel.com
[ Upstream commit 09e31cf0c528dac3358a081dc4e773d1b3de1bc9 ]
Since commit 9c66d1564676 ("taprio: Add support for hardware offloading") there's a bit of inconsistency when offloading schedules to the hardware:
In software mode, the gate masks are specified in terms of traffic classes, so if say "sched-entry S 03 20000", it means that the traffic classes 0 and 1 are open for 20us; when taprio is offloaded to hardware, the gate masks are specified in terms of hardware queues.
The idea here is to fix hardware offloading, so schedules in hardware and software mode have the same behavior. What's needed to do is to map traffic classes to queues when applying the offload to the driver.
Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading") Signed-off-by: Vinicius Costa Gomes vinicius.gomes@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_taprio.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-)
--- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -1177,9 +1177,27 @@ static void taprio_offload_config_change spin_unlock(&q->current_entry_lock); }
-static void taprio_sched_to_offload(struct taprio_sched *q, +static u32 tc_map_to_queue_mask(struct net_device *dev, u32 tc_mask) +{ + u32 i, queue_mask = 0; + + for (i = 0; i < dev->num_tc; i++) { + u32 offset, count; + + if (!(tc_mask & BIT(i))) + continue; + + offset = dev->tc_to_txq[i].offset; + count = dev->tc_to_txq[i].count; + + queue_mask |= GENMASK(offset + count - 1, offset); + } + + return queue_mask; +} + +static void taprio_sched_to_offload(struct net_device *dev, struct sched_gate_list *sched, - const struct tc_mqprio_qopt *mqprio, struct tc_taprio_qopt_offload *offload) { struct sched_entry *entry; @@ -1194,7 +1212,8 @@ static void taprio_sched_to_offload(stru
e->command = entry->command; e->interval = entry->interval; - e->gate_mask = entry->gate_mask; + e->gate_mask = tc_map_to_queue_mask(dev, entry->gate_mask); + i++; }
@@ -1202,7 +1221,6 @@ static void taprio_sched_to_offload(stru }
static int taprio_enable_offload(struct net_device *dev, - struct tc_mqprio_qopt *mqprio, struct taprio_sched *q, struct sched_gate_list *sched, struct netlink_ext_ack *extack) @@ -1224,7 +1242,7 @@ static int taprio_enable_offload(struct return -ENOMEM; } offload->enable = 1; - taprio_sched_to_offload(q, sched, mqprio, offload); + taprio_sched_to_offload(dev, sched, offload);
err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload); if (err < 0) { @@ -1486,7 +1504,7 @@ static int taprio_change(struct Qdisc *s }
if (FULL_OFFLOAD_IS_ENABLED(q->flags)) - err = taprio_enable_offload(dev, mqprio, q, new_admin, extack); + err = taprio_enable_offload(dev, q, new_admin, extack); else err = taprio_disable_offload(dev, q, extack); if (err)
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit 2a63866c8b51a3f72cea388dfac259d0e14c4ba6 ]
syzbot is reporting hung task at nbd_ioctl() [1], for there are two problems regarding TIPC's connectionless socket's shutdown() operation.
---------- #include <fcntl.h> #include <sys/socket.h> #include <sys/ioctl.h> #include <linux/nbd.h> #include <unistd.h>
int main(int argc, char *argv[]) { const int fd = open("/dev/nbd0", 3); alarm(5); ioctl(fd, NBD_SET_SOCK, socket(PF_TIPC, SOCK_DGRAM, 0)); ioctl(fd, NBD_DO_IT, 0); /* To be interrupted by SIGALRM. */ return 0; } ----------
One problem is that wait_for_completion() from flush_workqueue() from nbd_start_device_ioctl() from nbd_ioctl() cannot be completed when nbd_start_device_ioctl() received a signal at wait_event_interruptible(), for tipc_shutdown() from kernel_sock_shutdown(SHUT_RDWR) from nbd_mark_nsock_dead() from sock_shutdown() from nbd_start_device_ioctl() is failing to wake up a WQ thread sleeping at wait_woken() from tipc_wait_for_rcvmsg() from sock_recvmsg() from sock_xmit() from nbd_read_stat() from recv_work() scheduled by nbd_start_device() from nbd_start_device_ioctl(). Fix this problem by always invoking sk->sk_state_change() (like inet_shutdown() does) when tipc_shutdown() is called.
The other problem is that tipc_wait_for_rcvmsg() cannot return when tipc_shutdown() is called, for tipc_shutdown() sets sk->sk_shutdown to SEND_SHUTDOWN (despite "how" is SHUT_RDWR) while tipc_wait_for_rcvmsg() needs sk->sk_shutdown set to RCV_SHUTDOWN or SHUTDOWN_MASK. Fix this problem by setting sk->sk_shutdown to SHUTDOWN_MASK (like inet_shutdown() does) when the socket is connectionless.
[1] https://syzkaller.appspot.com/bug?id=3fe51d307c1f0a845485cf1798aa059d12bf18b...
Reported-by: syzbot syzbot+e36f41d207137b5d12f7@syzkaller.appspotmail.com Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/socket.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2616,18 +2616,21 @@ static int tipc_shutdown(struct socket *
trace_tipc_sk_shutdown(sk, NULL, TIPC_DUMP_ALL, " "); __tipc_shutdown(sock, TIPC_CONN_SHUTDOWN); - sk->sk_shutdown = SEND_SHUTDOWN; + if (tipc_sk_type_connectionless(sk)) + sk->sk_shutdown = SHUTDOWN_MASK; + else + sk->sk_shutdown = SEND_SHUTDOWN;
if (sk->sk_state == TIPC_DISCONNECTING) { /* Discard any unreceived messages */ __skb_queue_purge(&sk->sk_receive_queue);
- /* Wake up anyone sleeping in poll */ - sk->sk_state_change(sk); res = 0; } else { res = -ENOTCONN; } + /* Wake up anyone sleeping in poll. */ + sk->sk_state_change(sk);
release_sock(sk); return res;
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 96e97bc07e90f175a8980a22827faf702ca4cb30 ]
napi_disable() makes sure to set the NAPI_STATE_NPSVC bit to prevent netpoll from accessing rings before init is complete. However, the same is not done for fresh napi instances in netif_napi_add(), even though we expect NAPI instances to be added as disabled.
This causes crashes during driver reconfiguration (enabling XDP, changing the channel count) - if there is any printk() after netif_napi_add() but before napi_enable().
To ensure memory ordering is correct we need to use RCU accessors.
Reported-by: Rob Sherwood rsher@fb.com Fixes: 2d8bff12699a ("netpoll: Close race condition between poll_one_napi and napi_disable") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 3 ++- net/core/netpoll.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -6231,12 +6231,13 @@ void netif_napi_add(struct net_device *d netdev_err_once(dev, "%s() called with weight %d\n", __func__, weight); napi->weight = weight; - list_add(&napi->dev_list, &dev->napi_list); napi->dev = dev; #ifdef CONFIG_NETPOLL napi->poll_owner = -1; #endif set_bit(NAPI_STATE_SCHED, &napi->state); + set_bit(NAPI_STATE_NPSVC, &napi->state); + list_add_rcu(&napi->dev_list, &dev->napi_list); napi_hash_add(napi); } EXPORT_SYMBOL(netif_napi_add); --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -161,7 +161,7 @@ static void poll_napi(struct net_device struct napi_struct *napi; int cpu = smp_processor_id();
- list_for_each_entry(napi, &dev->napi_list, dev_list) { + list_for_each_entry_rcu(napi, &dev->napi_list, dev_list) { if (cmpxchg(&napi->poll_owner, -1, cpu) == -1) { poll_one_napi(napi); smp_store_release(&napi->poll_owner, -1);
On Fri, 11 Sep 2020 14:54:38 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.65 release. There are 8 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 13 Sep 2020 12:54:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.65-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.4: 14 builds: 14 pass, 0 fail 26 boots: 26 pass, 0 fail 56 tests: 56 pass, 0 fail
Linux version: 5.4.65-rc1-gcdcf7cd54ebd Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On 9/11/20 6:54 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.65 release. There are 8 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 13 Sep 2020 12:54:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.65-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Fri, 11 Sep 2020 at 18:30, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.4.65 release. There are 8 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 13 Sep 2020 12:54:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.65-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
Summary ------------------------------------------------------------------------
kernel: 5.4.65-rc1 git repo: ['https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git', 'https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc'] git branch: linux-5.4.y git commit: cdcf7cd54ebd34795a39be0bf517dd1017873640 git describe: v5.4.64-9-gcdcf7cd54ebd Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.4.y/build/v5.4.64...
No regressions (compared to build v5.4.64)
No fixes (compared to build v5.4.64)
Ran 19759 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - x86
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * kselftest/drivers * kselftest/filesystems * kselftest/net * linux-log-parser * perf * network-basic-tests * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * v4l2-compliance * ltp-open-posix-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-native/drivers * kselftest-vsyscall-mode-native/filesystems * kselftest-vsyscall-mode-native/net * kselftest-vsyscall-mode-none * kselftest-vsyscall-mode-none/drivers * kselftest-vsyscall-mode-none/filesystems * kselftest-vsyscall-mode-none/net * ssuite
linux-stable-mirror@lists.linaro.org