This is a note to let you know that I've just added the patch titled
tcp: refresh tp timestamp before tcp_mtu_probe()
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-refresh-tp-timestamp-before-tcp_mtu_probe.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 26 Oct 2017 21:21:40 -0700
Subject: tcp: refresh tp timestamp before tcp_mtu_probe()
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit ee1836aec4f5a977c1699a311db4d9027ef21ac8 ]
In the unlikely event that tcp_mtu_probe() sends a packet, we
want tp->tcp_mstamp to be as accurate as possible.
This means we need to call tcp_mstamp_refresh() a bit earlier in
tcp_write_xmit().
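The effect of the reordering can be sketched with a small userspace model.
This is only an illustration: the toy_* names, the fake clock and the idea
that a probe is stamped straight from the cached value are assumptions made
for the sketch, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for kernel state; these are NOT the real kernel types. */
struct toy_tp {
	uint64_t tcp_mstamp;   /* cached timestamp, plays the role of tp->tcp_mstamp */
	uint64_t probe_stamp;  /* timestamp the MTU probe was stamped with */
};

static uint64_t toy_clock;  /* hypothetical monotonic clock, in usec */

/* Plays the role of tcp_mstamp_refresh(): copy the clock into the cache. */
static void toy_mstamp_refresh(struct toy_tp *tp)
{
	tp->tcp_mstamp = toy_clock;
}

/* Assumption: a probe, when sent, is stamped with the cached value. */
static void toy_mtu_probe(struct toy_tp *tp)
{
	tp->probe_stamp = tp->tcp_mstamp;
}

/* Old ordering: probe first, refresh afterwards. Returns probe staleness. */
static uint64_t toy_xmit_old(struct toy_tp *tp)
{
	toy_mtu_probe(tp);
	toy_mstamp_refresh(tp);
	return toy_clock - tp->probe_stamp;
}

/* Patched ordering: refresh first, then probe. Returns probe staleness. */
static uint64_t toy_xmit_new(struct toy_tp *tp)
{
	toy_mstamp_refresh(tp);
	toy_mtu_probe(tp);
	return toy_clock - tp->probe_stamp;
}
```

With a cache last refreshed at 100 and a clock now at 350, the old ordering
stamps the probe 250 usec in the past, while the patched ordering stamps it
with the current time.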
Fixes: 385e20706fac ("tcp: use tp->tcp_mstamp in output path")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_output.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2271,6 +2271,7 @@ static bool tcp_write_xmit(struct sock *
sent_pkts = 0;
+ tcp_mstamp_refresh(tp);
if (!push_one) {
/* Do MTU probing. */
result = tcp_mtu_probe(sk);
@@ -2282,7 +2283,6 @@ static bool tcp_write_xmit(struct sock *
}
max_segs = tcp_tso_segs(sk, mss_now);
- tcp_mstamp_refresh(tp);
while ((skb = tcp_send_head(sk))) {
unsigned int limit;
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.13/tcp-refresh-tp-timestamp-before-tcp_mtu_probe.patch
queue-4.13/net-call-cgroup_sk_alloc-earlier-in-sk_clone_lock.patch
queue-4.13/tcp-dccp-fix-ireq-opt-races.patch
queue-4.13/tcp-fix-tcp_mtu_probe-vs-highest_sack.patch
queue-4.13/ipv6-addrconf-increment-ifp-refcount-before-ipv6_del_addr.patch
queue-4.13/ipv6-flowlabel-do-not-leave-opt-tot_len-with-garbage.patch
queue-4.13/packet-avoid-panic-in-packet_getsockopt.patch
queue-4.13/sctp-add-the-missing-sock_owned_by_user-check-in-sctp_icmp_redirect.patch
queue-4.13/net_sched-avoid-matching-qdisc-with-zero-handle.patch
queue-4.13/tun-tap-sanitize-tunsetsndbuf-input.patch
queue-4.13/tcp-dccp-fix-lockdep-splat-in-inet_csk_route_req.patch
queue-4.13/tcp-dccp-fix-other-lockdep-splats-accessing-ireq_opt.patch
This is a note to let you know that I've just added the patch titled
tcp: fix tcp_mtu_probe() vs highest_sack
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-fix-tcp_mtu_probe-vs-highest_sack.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Eric Dumazet <edumazet(a)google.com>
Date: Mon, 30 Oct 2017 23:08:20 -0700
Subject: tcp: fix tcp_mtu_probe() vs highest_sack
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 2b7cda9c35d3b940eb9ce74b30bbd5eb30db493d ]
Based on SNMP values provided by Roman, Yuchung made the observation
that some crashes in tcp_sacktag_walk() might be caused by MTU probing.
Looking at tcp_mtu_probe(), I found that when a new skb was placed
in front of the write queue, we were not updating tcp highest sack.
If one skb is freed because all its content was copied to the new skb
(for MTU probing), then tp->highest_sack could point to a now freed skb.
Bad things would then happen, including infinite loops.
This patch renames tcp_highest_sack_combine() to
tcp_highest_sack_replace() and uses it from tcp_mtu_probe() to fix
the bug.
Note that I also removed one test against tp->sacked_out: we want to
replace tp->highest_sack unconditionally, because keeping a stale
pointer to a freed skb is a recipe for disaster.
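The difference between the old and the new helper can be shown with a
minimal userspace sketch. The toy_* types below are hypothetical stand-ins
for struct sock and struct sk_buff; only the two conditions mirror the diff.

```c
#include <assert.h>
#include <stddef.h>

struct toy_skb {
	int id;
};

struct toy_sock {
	int sacked_out;                /* number of SACKed segments */
	struct toy_skb *highest_sack;  /* may alias an skb on the write queue */
};

/* Pre-patch condition: repoint only when sacked_out is non-zero. */
static void toy_highest_sack_combine(struct toy_sock *sk,
				     struct toy_skb *old_skb,
				     struct toy_skb *new_skb)
{
	if (sk->sacked_out && old_skb == sk->highest_sack)
		sk->highest_sack = new_skb;
}

/* Patched condition: always repoint when old_skb is the tracked skb. */
static void toy_highest_sack_replace(struct toy_sock *sk,
				     struct toy_skb *old_skb,
				     struct toy_skb *new_skb)
{
	if (old_skb == sk->highest_sack)
		sk->highest_sack = new_skb;
}
```

When sacked_out is zero, the old helper leaves highest_sack pointing at the
skb that is about to go away; the new helper always moves it to the
replacement.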
Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct access")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: Alexei Starovoitov <alexei.starovoitov(a)gmail.com>
Reported-by: Roman Gushchin <guro(a)fb.com>
Reported-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Neal Cardwell <ncardwell(a)google.com>
Acked-by: Yuchung Cheng <ycheng(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/tcp.h | 6 +++---
net/ipv4/tcp_output.c | 3 ++-
2 files changed, 5 insertions(+), 4 deletions(-)
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1750,12 +1750,12 @@ static inline void tcp_highest_sack_rese
tcp_sk(sk)->highest_sack = tcp_write_queue_head(sk);
}
-/* Called when old skb is about to be deleted (to be combined with new skb) */
-static inline void tcp_highest_sack_combine(struct sock *sk,
+/* Called when old skb is about to be deleted and replaced by new skb */
+static inline void tcp_highest_sack_replace(struct sock *sk,
struct sk_buff *old,
struct sk_buff *new)
{
- if (tcp_sk(sk)->sacked_out && (old == tcp_sk(sk)->highest_sack))
+ if (old == tcp_highest_sack(sk))
tcp_sk(sk)->highest_sack = new;
}
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2094,6 +2094,7 @@ static int tcp_mtu_probe(struct sock *sk
nskb->ip_summed = skb->ip_summed;
tcp_insert_write_queue_before(nskb, skb, sk);
+ tcp_highest_sack_replace(sk, skb, nskb);
len = 0;
tcp_for_write_queue_from_safe(skb, next, sk) {
@@ -2694,7 +2695,7 @@ static bool tcp_collapse_retrans(struct
else if (!skb_shift(skb, next_skb, next_skb_size))
return false;
}
- tcp_highest_sack_combine(sk, next_skb, skb);
+ tcp_highest_sack_replace(sk, next_skb, skb);
tcp_unlink_write_queue(next_skb, sk);
This is a note to let you know that I've just added the patch titled
tcp/dccp: fix other lockdep splats accessing ireq_opt
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-dccp-fix-other-lockdep-splats-accessing-ireq_opt.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Eric Dumazet <edumazet(a)google.com>
Date: Tue, 24 Oct 2017 08:20:31 -0700
Subject: tcp/dccp: fix other lockdep splats accessing ireq_opt
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 06f877d613be3621604c2520ec0351d9fbdca15f ]
In my first attempt to fix the lockdep splat, I forgot we could
enter inet_csk_route_req() with a freshly allocated request socket,
for which refcount has not yet been elevated, due to complex
SLAB_TYPESAFE_BY_RCU rules.
We either are in rcu_read_lock() section _or_ we own a refcount on the
request.
The correct RCU verb to use here is rcu_dereference_check(), although it
is not possible to prove we actually own a reference on a shared
refcount :/
In v2, I added the ireq_opt_deref() helper and used it in three places,
to fix other possible splats.
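Why rcu_dereference_check() fits both call paths can be sketched by
modelling only the lockdep conditions in userspace. The toy_* functions are
hypothetical and return whether lockdep would stay quiet; they do not
dereference anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Models rcu_read_lock_held(): are we inside an RCU read-side section? */
static bool toy_in_rcu_read_section;

/* Models rcu_dereference(p): quiet only inside an RCU read-side section. */
static bool toy_plain_ok(void)
{
	return toy_in_rcu_read_section;
}

/* Models rcu_dereference_protected(p, c): only the caller's condition counts. */
static bool toy_protected_ok(bool cond)
{
	return cond;
}

/* Models rcu_dereference_check(p, c): condition OR an RCU read-side section. */
static bool toy_check_ok(bool cond)
{
	return cond || toy_in_rcu_read_section;
}
```

A timer path that owns a reference but is not under rcu_read_lock() trips
the plain variant, and a fresh request with a zero refcount inside an RCU
section trips the protected variant. The check variant accepts both.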
[ 49.844590] lockdep_rcu_suspicious+0xea/0xf3
[ 49.846487] inet_csk_route_req+0x53/0x14d
[ 49.848334] tcp_v4_route_req+0xe/0x10
[ 49.850174] tcp_conn_request+0x31c/0x6a0
[ 49.851992] ? __lock_acquire+0x614/0x822
[ 49.854015] tcp_v4_conn_request+0x5a/0x79
[ 49.855957] ? tcp_v4_conn_request+0x5a/0x79
[ 49.858052] tcp_rcv_state_process+0x98/0xdcc
[ 49.859990] ? sk_filter_trim_cap+0x2f6/0x307
[ 49.862085] tcp_v4_do_rcv+0xfc/0x145
[ 49.864055] ? tcp_v4_do_rcv+0xfc/0x145
[ 49.866173] tcp_v4_rcv+0x5ab/0xaf9
[ 49.868029] ip_local_deliver_finish+0x1af/0x2e7
[ 49.870064] ip_local_deliver+0x1b2/0x1c5
[ 49.871775] ? inet_del_offload+0x45/0x45
[ 49.873916] ip_rcv_finish+0x3f7/0x471
[ 49.875476] ip_rcv+0x3f1/0x42f
[ 49.876991] ? ip_local_deliver_finish+0x2e7/0x2e7
[ 49.878791] __netif_receive_skb_core+0x6d3/0x950
[ 49.880701] ? process_backlog+0x7e/0x216
[ 49.882589] __netif_receive_skb+0x1d/0x5e
[ 49.884122] process_backlog+0x10c/0x216
[ 49.885812] net_rx_action+0x147/0x3df
Fixes: a6ca7abe53633 ("tcp/dccp: fix lockdep splat in inet_csk_route_req()")
Fixes: c92e8c02fe66 ("tcp/dccp: fix ireq->opt races")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: kernel test robot <fengguang.wu(a)intel.com>
Reported-by: Maciej Żenczykowski <maze(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/inet_sock.h | 6 ++++++
net/dccp/ipv4.c | 2 +-
net/ipv4/inet_connection_sock.c | 4 ++--
net/ipv4/tcp_ipv4.c | 2 +-
4 files changed, 10 insertions(+), 4 deletions(-)
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -132,6 +132,12 @@ static inline int inet_request_bound_dev
return sk->sk_bound_dev_if;
}
+static inline struct ip_options_rcu *ireq_opt_deref(const struct inet_request_sock *ireq)
+{
+ return rcu_dereference_check(ireq->ireq_opt,
+ refcount_read(&ireq->req.rsk_refcnt) > 0);
+}
+
struct inet_cork {
unsigned int flags;
__be32 addr;
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -495,7 +495,7 @@ static int dccp_v4_send_response(const s
ireq->ir_rmt_addr);
err = ip_build_and_send_pkt(skb, sk, ireq->ir_loc_addr,
ireq->ir_rmt_addr,
- rcu_dereference(ireq->ireq_opt));
+ ireq_opt_deref(ireq));
err = net_xmit_eval(err);
}
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -540,8 +540,8 @@ struct dst_entry *inet_csk_route_req(con
struct ip_options_rcu *opt;
struct rtable *rt;
- opt = rcu_dereference_protected(ireq->ireq_opt,
- refcount_read(&req->rsk_refcnt) > 0);
+ opt = ireq_opt_deref(ireq);
+
flowi4_init_output(fl4, ireq->ir_iif, ireq->ir_mark,
RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE,
sk->sk_protocol, inet_sk_flowi_flags(sk),
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -878,7 +878,7 @@ static int tcp_v4_send_synack(const stru
err = ip_build_and_send_pkt(skb, sk, ireq->ir_loc_addr,
ireq->ir_rmt_addr,
- rcu_dereference(ireq->ireq_opt));
+ ireq_opt_deref(ireq));
err = net_xmit_eval(err);
}
This is a note to let you know that I've just added the patch titled
tcp/dccp: fix lockdep splat in inet_csk_route_req()
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-dccp-fix-lockdep-splat-in-inet_csk_route_req.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Eric Dumazet <edumazet(a)google.com>
Date: Sun, 22 Oct 2017 12:33:57 -0700
Subject: tcp/dccp: fix lockdep splat in inet_csk_route_req()
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit a6ca7abe53633d08eea1c6756cb49c9b2d4c90bf ]
This patch fixes the following lockdep splat in inet_csk_route_req():
lockdep_rcu_suspicious
inet_csk_route_req
tcp_v4_send_synack
tcp_rtx_synack
inet_rtx_syn_ack
tcp_fastopen_synack_time
tcp_retransmit_timer
tcp_write_timer_handler
tcp_write_timer
call_timer_fn
Thread running inet_csk_route_req() owns a reference on the request
socket, so we have the guarantee ireq->ireq_opt won't be changed or
freed.
lockdep can enforce this invariant for us.
Fixes: c92e8c02fe66 ("tcp/dccp: fix ireq->opt races")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/inet_connection_sock.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -540,7 +540,8 @@ struct dst_entry *inet_csk_route_req(con
struct ip_options_rcu *opt;
struct rtable *rt;
- opt = rcu_dereference(ireq->ireq_opt);
+ opt = rcu_dereference_protected(ireq->ireq_opt,
+ refcount_read(&req->rsk_refcnt) > 0);
flowi4_init_output(fl4, ireq->ir_iif, ireq->ir_mark,
RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE,
sk->sk_protocol, inet_sk_flowi_flags(sk),
This is a note to let you know that I've just added the patch titled
tap: reference to KVA of an unloaded module causes kernel panic
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tap-reference-to-kva-of-an-unloaded-module-causes-kernel-panic.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Girish Moodalbail <girish.moodalbail(a)oracle.com>
Date: Fri, 27 Oct 2017 00:00:16 -0700
Subject: tap: reference to KVA of an unloaded module causes kernel panic
From: Girish Moodalbail <girish.moodalbail(a)oracle.com>
[ Upstream commit dea6e19f4ef746aa18b4c33d1a7fed54356796ed ]
The commit 9a393b5d5988 ("tap: tap as an independent module") created a
separate tap module that implements tap functionality and exports
interfaces that will be used by macvtap and ipvtap modules to create
their respective tap devices.
However, that patch introduced a regression wherein the modules macvtap
and ipvtap can be removed (through modprobe -r) while there are
applications using the respective /dev/tapX devices. These applications
cause the kernel to hold a reference to /dev/tapX through 'struct cdev
macvtap_cdev' and 'struct cdev ipvtap_cdev', defined in the macvtap and
ipvtap modules respectively. So, when the application is later closed,
the kernel panics because it references a kernel virtual address (KVA)
that belonged to the unloaded module.
----------8<------- Example ----------8<----------
$ sudo ip li add name mv0 link enp7s0 type macvtap
$ sudo ip li show mv0 |grep mv0| awk -e '{print $1 $2}'
14:mv0@enp7s0:
$ cat /dev/tap14 &
$ lsmod |egrep -i 'tap|vlan'
macvtap 16384 0
macvlan 24576 1 macvtap
tap 24576 3 macvtap
$ sudo modprobe -r macvtap
$ fg
cat /dev/tap14
^C
<...system panics...>
BUG: unable to handle kernel paging request at ffffffffa038c500
IP: cdev_put+0xf/0x30
----------8<-----------------8<----------
The fix is to set cdev.owner to the module that creates the tap device
(either macvtap or ipvtap). With this set, the char device operations
(in fs/char_dev.c) hold and release the module through cdev_get() and
cdev_put(), and will not allow the module to unload prematurely.
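The role of cdev.owner can be illustrated with a toy reference-count model.
All toy_* names are hypothetical; the sketch only assumes that opening the
device pins the owner module and closing it releases the module, as the
paragraph above describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_module {
	int refcnt;     /* users currently pinning the module */
	bool unloaded;  /* set once removal succeeded */
};

struct toy_cdev {
	struct toy_module *owner;  /* NULL models the pre-patch state */
};

/* Plays the role of cdev_get(): pin the owner module, if there is one. */
static bool toy_cdev_get(struct toy_cdev *c)
{
	if (c->owner) {
		if (c->owner->unloaded)
			return false;
		c->owner->refcnt++;
	}
	return true;
}

/* Plays the role of cdev_put(): drop the pin taken by toy_cdev_get(). */
static void toy_cdev_put(struct toy_cdev *c)
{
	if (c->owner)
		c->owner->refcnt--;
}

/* Plays the role of "modprobe -r": refuse while the module is pinned. */
static bool toy_module_remove(struct toy_module *m)
{
	if (m->refcnt > 0)
		return false;
	m->unloaded = true;
	return true;
}
```

With owner left at NULL the module can be removed while the device is still
open, which is the situation that led to the panic. With owner set, removal
is refused until the last user closes the device.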
Fixes: 9a393b5d5988ea4e (tap: tap as an independent module)
Signed-off-by: Girish Moodalbail <girish.moodalbail(a)oracle.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ipvlan/ipvtap.c | 4 ++--
drivers/net/macvtap.c | 4 ++--
drivers/net/tap.c | 5 +++--
include/linux/if_tap.h | 4 ++--
4 files changed, 9 insertions(+), 8 deletions(-)
--- a/drivers/net/ipvlan/ipvtap.c
+++ b/drivers/net/ipvlan/ipvtap.c
@@ -197,8 +197,8 @@ static int ipvtap_init(void)
{
int err;
- err = tap_create_cdev(&ipvtap_cdev, &ipvtap_major, "ipvtap");
-
+ err = tap_create_cdev(&ipvtap_cdev, &ipvtap_major, "ipvtap",
+ THIS_MODULE);
if (err)
goto out1;
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -204,8 +204,8 @@ static int macvtap_init(void)
{
int err;
- err = tap_create_cdev(&macvtap_cdev, &macvtap_major, "macvtap");
-
+ err = tap_create_cdev(&macvtap_cdev, &macvtap_major, "macvtap",
+ THIS_MODULE);
if (err)
goto out1;
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1252,8 +1252,8 @@ static int tap_list_add(dev_t major, con
return 0;
}
-int tap_create_cdev(struct cdev *tap_cdev,
- dev_t *tap_major, const char *device_name)
+int tap_create_cdev(struct cdev *tap_cdev, dev_t *tap_major,
+ const char *device_name, struct module *module)
{
int err;
@@ -1262,6 +1262,7 @@ int tap_create_cdev(struct cdev *tap_cde
goto out1;
cdev_init(tap_cdev, &tap_fops);
+ tap_cdev->owner = module;
err = cdev_add(tap_cdev, *tap_major, TAP_NUM_DEVS);
if (err)
goto out2;
--- a/include/linux/if_tap.h
+++ b/include/linux/if_tap.h
@@ -73,8 +73,8 @@ void tap_del_queues(struct tap_dev *tap)
int tap_get_minor(dev_t major, struct tap_dev *tap);
void tap_free_minor(dev_t major, struct tap_dev *tap);
int tap_queue_resize(struct tap_dev *tap);
-int tap_create_cdev(struct cdev *tap_cdev,
- dev_t *tap_major, const char *device_name);
+int tap_create_cdev(struct cdev *tap_cdev, dev_t *tap_major,
+ const char *device_name, struct module *module);
void tap_destroy_cdev(dev_t major, struct cdev *tap_cdev);
#endif /*_LINUX_IF_TAP_H_*/
Patches currently in stable-queue which might be from girish.moodalbail(a)oracle.com are
queue-4.13/tap-reference-to-kva-of-an-unloaded-module-causes-kernel-panic.patch
queue-4.13/tap-double-free-in-error-path-in-tap_open.patch
This is a note to let you know that I've just added the patch titled
soreuseport: fix initialization race
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
soreuseport-fix-initialization-race.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Craig Gallek <kraig(a)google.com>
Date: Thu, 19 Oct 2017 15:00:29 -0400
Subject: soreuseport: fix initialization race
From: Craig Gallek <kraig(a)google.com>
[ Upstream commit 1b5f962e71bfad6284574655c406597535c3ea7a ]
Syzkaller stumbled upon a way to trigger
WARNING: CPU: 1 PID: 13881 at net/core/sock_reuseport.c:41
reuseport_alloc+0x306/0x3b0 net/core/sock_reuseport.c:39
There are two initialization paths for the sock_reuseport structure in a
socket: Through the udp/tcp bind paths of SO_REUSEPORT sockets or through
SO_ATTACH_REUSEPORT_[CE]BPF before bind. The existing implementation
assumed that the socket lock protected both of these paths when it actually
only protects the SO_ATTACH_REUSEPORT path. Syzkaller triggered this
double allocation by running these paths concurrently.
This patch moves the check for double allocation into the reuseport_alloc
function which is protected by a global spin lock.
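The shape of the fix, checking for an existing allocation inside the locked
region rather than in the callers, can be sketched as follows. This is a
single-threaded userspace model with hypothetical toy_* names; the counter
toy_lock_depth merely marks where the global spin lock would be held, and
-1 stands in for the kernel's error code.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_reuse {
	int num_socks;
};

struct toy_sock {
	struct toy_reuse *reuseport_cb;
};

static int toy_lock_depth;   /* stands in for holding reuseport_lock */
static int toy_allocations;  /* how many structures were really allocated */

static int toy_reuseport_alloc(struct toy_sock *sk)
{
	int err = 0;

	toy_lock_depth++;  /* where spin_lock_bh(&reuseport_lock) would be */

	/* Another path already allocated the structure: nothing to do. */
	if (sk->reuseport_cb)
		goto out;

	sk->reuseport_cb = calloc(1, sizeof(*sk->reuseport_cb));
	if (!sk->reuseport_cb) {
		err = -1;
		goto out;
	}
	sk->reuseport_cb->num_socks = 1;
	toy_allocations++;
out:
	toy_lock_depth--;  /* where spin_unlock_bh(&reuseport_lock) would be */
	return err;
}
```

Because the check sits inside the locked region, callers from either
initialization path can call the function unconditionally, and the second
call simply returns success without allocating again.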
Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
Fixes: c125e80b8868 ("soreuseport: fast reuseport TCP socket selection")
Signed-off-by: Craig Gallek <kraig(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/sock_reuseport.c | 12 +++++++++---
net/ipv4/inet_hashtables.c | 5 +----
net/ipv4/udp.c | 5 +----
3 files changed, 11 insertions(+), 11 deletions(-)
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -36,9 +36,14 @@ int reuseport_alloc(struct sock *sk)
* soft irq of receive path or setsockopt from process context
*/
spin_lock_bh(&reuseport_lock);
- WARN_ONCE(rcu_dereference_protected(sk->sk_reuseport_cb,
- lockdep_is_held(&reuseport_lock)),
- "multiple allocations for the same socket");
+
+ /* Allocation attempts can occur concurrently via the setsockopt path
+ * and the bind/hash path. Nothing to do when we lose the race.
+ */
+ if (rcu_dereference_protected(sk->sk_reuseport_cb,
+ lockdep_is_held(&reuseport_lock)))
+ goto out;
+
reuse = __reuseport_alloc(INIT_SOCKS);
if (!reuse) {
spin_unlock_bh(&reuseport_lock);
@@ -49,6 +54,7 @@ int reuseport_alloc(struct sock *sk)
reuse->num_socks = 1;
rcu_assign_pointer(sk->sk_reuseport_cb, reuse);
+out:
spin_unlock_bh(&reuseport_lock);
return 0;
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -449,10 +449,7 @@ static int inet_reuseport_add_sock(struc
return reuseport_add_sock(sk, sk2);
}
- /* Initial allocation may have already happened via setsockopt */
- if (!rcu_access_pointer(sk->sk_reuseport_cb))
- return reuseport_alloc(sk);
- return 0;
+ return reuseport_alloc(sk);
}
int __inet_hash(struct sock *sk, struct sock *osk)
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -231,10 +231,7 @@ static int udp_reuseport_add_sock(struct
}
}
- /* Initial allocation may have already happened via setsockopt */
- if (!rcu_access_pointer(sk->sk_reuseport_cb))
- return reuseport_alloc(sk);
- return 0;
+ return reuseport_alloc(sk);
}
/**
Patches currently in stable-queue which might be from kraig(a)google.com are
queue-4.13/soreuseport-fix-initialization-race.patch
queue-4.13/tun-tap-sanitize-tunsetsndbuf-input.patch
This is a note to let you know that I've just added the patch titled
tap: double-free in error path in tap_open()
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tap-double-free-in-error-path-in-tap_open.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Girish Moodalbail <girish.moodalbail(a)oracle.com>
Date: Wed, 25 Oct 2017 00:23:04 -0700
Subject: tap: double-free in error path in tap_open()
From: Girish Moodalbail <girish.moodalbail(a)oracle.com>
[ Upstream commit 78e0ea6791d7baafb8a0ca82b1bd0c7b3453c919 ]
A double free of skb_array in the tap module causes a kernel panic. When
tap_set_queue() fails we free skb_array right away by calling
skb_array_cleanup(). However, later on skb_array_cleanup() is called
again by tap_sock_destruct() through sock_put(). This patch fixes that
by leaving the cleanup on that error path to tap_sock_destruct() alone.
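The ownership problem can be modelled in userspace by counting how often the
cleanup runs on the failure path. The toy_* names are hypothetical, and the
toy cleanup resets the pointer to NULL so that the sketch itself is safe to
run; that guard is an addition for the model only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_queue {
	int *array;         /* plays the role of q->skb_array */
	int cleanup_calls;  /* how many times cleanup ran */
};

static void toy_array_cleanup(struct toy_queue *q)
{
	free(q->array);
	q->array = NULL;    /* guard for this model only */
	q->cleanup_calls++;
}

/* Plays the role of sock_put(): the destructor cleans up the array. */
static void toy_sock_put(struct toy_queue *q)
{
	toy_array_cleanup(q);
}

/* Old error path: explicit cleanup, then the destructor cleans up again. */
static int toy_open_old(struct toy_queue *q, bool set_queue_fails)
{
	q->array = malloc(16 * sizeof(int));
	if (!q->array)
		return -1;
	if (set_queue_fails) {
		toy_array_cleanup(q);
		toy_sock_put(q);
		return -1;
	}
	return 0;
}

/* Patched error path: only the destructor cleans up. */
static int toy_open_new(struct toy_queue *q, bool set_queue_fails)
{
	q->array = malloc(16 * sizeof(int));
	if (!q->array)
		return -1;
	if (set_queue_fails) {
		toy_sock_put(q);
		return -1;
	}
	return 0;
}
```

On failure, the old path runs the cleanup twice and the patched path runs it
exactly once.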
Fixes: 362899b8725b35e3 (macvtap: switch to use skb array)
Signed-off-by: Girish Moodalbail <girish.moodalbail(a)oracle.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/tap.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -517,6 +517,10 @@ static int tap_open(struct inode *inode,
&tap_proto, 0);
if (!q)
goto err;
+ if (skb_array_init(&q->skb_array, tap->dev->tx_queue_len, GFP_KERNEL)) {
+ sk_free(&q->sk);
+ goto err;
+ }
RCU_INIT_POINTER(q->sock.wq, &q->wq);
init_waitqueue_head(&q->wq.wait);
@@ -540,22 +544,18 @@ static int tap_open(struct inode *inode,
if ((tap->dev->features & NETIF_F_HIGHDMA) && (tap->dev->features & NETIF_F_SG))
sock_set_flag(&q->sk, SOCK_ZEROCOPY);
- err = -ENOMEM;
- if (skb_array_init(&q->skb_array, tap->dev->tx_queue_len, GFP_KERNEL))
- goto err_array;
-
err = tap_set_queue(tap, file, q);
- if (err)
- goto err_queue;
+ if (err) {
+ /* tap_sock_destruct() will take care of freeing skb_array */
+ goto err_put;
+ }
dev_put(tap->dev);
rtnl_unlock();
return err;
-err_queue:
- skb_array_cleanup(&q->skb_array);
-err_array:
+err_put:
sock_put(&q->sk);
err:
if (tap)
This is a note to let you know that I've just added the patch titled
sctp: reset owner sk for data chunks on out queues when migrating a sock
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-reset-owner-sk-for-data-chunks-on-out-queues-when-migrating-a-sock.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Xin Long <lucien.xin(a)gmail.com>
Date: Sat, 28 Oct 2017 02:13:29 +0800
Subject: sctp: reset owner sk for data chunks on out queues when migrating a sock
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit d04adf1b355181e737b6b1e23d801b07f0b7c4c0 ]
Currently, when migrating a sock to another one in sctp_sock_migrate(),
only the owner sk of the data in the receive queues is reset, not that
of the chunks on the out queues.
As a result, the length of the data chunks on the sock is no longer
consistent with the sk's sk_wmem_alloc. When closing the sock or freeing
these chunks, the old sk would never be freed, and the new sock may
crash due to sk_wmem_alloc overflowing.
syzbot found this issue with this series:
r0 = socket$inet_sctp()
sendto$inet(r0)
listen(r0)
accept4(r0)
close(r0)
Although listen() should have returned an error when a TCP-style socket
is connecting (I may fix this one in another patch), the issue can also
be reproduced by peeling off an assoc.
This issue has been there since the very beginning.
This patch resets the owner sk for the chunks on the out queues so that
the sk's sk_wmem_alloc has the correct value after accepting a sock or
peeling off an assoc to a sock.
Note that when resetting the owner sk for chunks on the outqueue, the
chunks must first go through sctp_clear_owner_w/skb_orphan before
assoc->base.sk is changed, and then through sctp_set_owner_w after it
is changed, because sctp_wfree and its callees use assoc->base.sk.
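The clear, migrate, set sequence can be sketched with a toy accounting
model. Everything named toy_* is hypothetical; the model keeps one flat
array of chunks instead of the several out-queue lists and only tracks a
byte counter per socket.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sk {
	long wmem_alloc;  /* bytes charged to this socket */
};

struct toy_assoc;

struct toy_chunk {
	long truesize;
	struct toy_sk *owner;     /* socket currently charged, or NULL */
	struct toy_assoc *asoc;
};

struct toy_assoc {
	struct toy_sk *sk;        /* plays the role of assoc->base.sk */
	struct toy_chunk *chunks;
	int nchunks;
};

/* Charge the chunk to whatever socket the association points at now. */
static void toy_set_owner_w(struct toy_chunk *c)
{
	c->owner = c->asoc->sk;
	c->owner->wmem_alloc += c->truesize;
}

/* Uncharge the chunk from its current owner. */
static void toy_clear_owner_w(struct toy_chunk *c)
{
	if (c->owner) {
		c->owner->wmem_alloc -= c->truesize;
		c->owner = NULL;
	}
}

static void toy_for_each_tx_chunk(struct toy_assoc *a,
				  void (*cb)(struct toy_chunk *))
{
	for (int i = 0; i < a->nchunks; i++)
		cb(&a->chunks[i]);
}

/* Clear owners against the old socket, switch, then charge the new one. */
static void toy_migrate(struct toy_assoc *a, struct toy_sk *newsk)
{
	toy_for_each_tx_chunk(a, toy_clear_owner_w);
	a->sk = newsk;
	toy_for_each_tx_chunk(a, toy_set_owner_w);
}
```

After migration the old socket's counter drops back to zero and the new
socket carries the full charge, which is the consistency the patch restores.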
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/socket.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -169,6 +169,36 @@ static inline void sctp_set_owner_w(stru
sk_mem_charge(sk, chunk->skb->truesize);
}
+static void sctp_clear_owner_w(struct sctp_chunk *chunk)
+{
+ skb_orphan(chunk->skb);
+}
+
+static void sctp_for_each_tx_datachunk(struct sctp_association *asoc,
+ void (*cb)(struct sctp_chunk *))
+
+{
+ struct sctp_outq *q = &asoc->outqueue;
+ struct sctp_transport *t;
+ struct sctp_chunk *chunk;
+
+ list_for_each_entry(t, &asoc->peer.transport_addr_list, transports)
+ list_for_each_entry(chunk, &t->transmitted, transmitted_list)
+ cb(chunk);
+
+ list_for_each_entry(chunk, &q->retransmit, list)
+ cb(chunk);
+
+ list_for_each_entry(chunk, &q->sacked, list)
+ cb(chunk);
+
+ list_for_each_entry(chunk, &q->abandoned, list)
+ cb(chunk);
+
+ list_for_each_entry(chunk, &q->out_chunk_list, list)
+ cb(chunk);
+}
+
/* Verify that this is a valid address. */
static inline int sctp_verify_addr(struct sock *sk, union sctp_addr *addr,
int len)
@@ -8196,7 +8226,9 @@ static void sctp_sock_migrate(struct soc
* paths won't try to lock it and then oldsk.
*/
lock_sock_nested(newsk, SINGLE_DEPTH_NESTING);
+ sctp_for_each_tx_datachunk(assoc, sctp_clear_owner_w);
sctp_assoc_migrate(assoc, newsk);
+ sctp_for_each_tx_datachunk(assoc, sctp_set_owner_w);
/* If the association on the newsk is already closed before accept()
* is called, set RCV_SHUTDOWN flag.
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.13/sctp-reset-owner-sk-for-data-chunks-on-out-queues-when-migrating-a-sock.patch
queue-4.13/ipip-only-increase-err_count-for-some-certain-type-icmp-in-ipip_err.patch
queue-4.13/sctp-full-support-for-ipv6-ip_nonlocal_bind-ip_freebind.patch
queue-4.13/sctp-add-the-missing-sock_owned_by_user-check-in-sctp_icmp_redirect.patch
queue-4.13/ip6_gre-only-increase-err_count-for-some-certain-type-icmpv6-in-ip6gre_err.patch
queue-4.13/ip6_gre-update-dst-pmtu-if-dev-mtu-has-been-updated-by-toobig-in-__gre6_xmit.patch
This is a note to let you know that I've just added the patch titled
sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-full-support-for-ipv6-ip_nonlocal_bind-ip_freebind.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Laszlo Toth <laszlth(a)gmail.com>
Date: Mon, 23 Oct 2017 19:19:33 +0200
Subject: sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND
From: Laszlo Toth <laszlth(a)gmail.com>
[ Upstream commit b71d21c274eff20a9db8158882b545b141b73ab8 ]
Commit 9b9742022888 ("sctp: support ipv6 nonlocal bind")
introduced support for the above options as v4 sctp did,
and patched sctp_v6_available() accordingly.
In the v4 implementation that is enough, because
sctp_inet_bind_verify() just returns the result of sctp_v4_available().
However, sctp_inet6_bind_verify() has an extra check before that
for link-local scope_id, which does not respect the above options.
This patch adds the checks before calling ipv6_chk_addr(), but
not before the validation of scope_id.
before (w/ both options):
./v6test fe80::10 sctp
bind failed, errno: 99 (Cannot assign requested address)
./v6test fe80::10 tcp
bind success, errno: 0 (Success)
after (w/ both options):
./v6test fe80::10 sctp
bind success, errno: 0 (Success)
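The change to the link-local check reduces to a small boolean expression,
sketched here in userspace. The toy_* functions and their boolean parameters
are hypothetical stand-ins for the device lookup, the two options and the
result of ipv6_chk_addr().

```c
#include <assert.h>
#include <stdbool.h>

/* Pre-patch: the address must be configured on the scoped device. */
static bool toy_bind_verify_old(bool dev_found, bool addr_on_dev)
{
	return dev_found && addr_on_dev;
}

/* Patched: freebind or ip_nonlocal_bind waive the address check,
 * but the scope_id must still resolve to a device.
 */
static bool toy_bind_verify_new(bool dev_found, bool freebind,
				bool nonlocal_bind, bool addr_on_dev)
{
	return dev_found && (freebind || nonlocal_bind || addr_on_dev);
}
```

This matches the before/after output above: with either option set, binding
to a link-local address that is not configured on the device now succeeds,
while an invalid scope_id is still rejected.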
Signed-off-by: Laszlo Toth <laszlth(a)gmail.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/ipv6.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -882,8 +882,10 @@ static int sctp_inet6_bind_verify(struct
net = sock_net(&opt->inet.sk);
rcu_read_lock();
dev = dev_get_by_index_rcu(net, addr->v6.sin6_scope_id);
- if (!dev ||
- !ipv6_chk_addr(net, &addr->v6.sin6_addr, dev, 0)) {
+ if (!dev || !(opt->inet.freebind ||
+ net->ipv6.sysctl.ip_nonlocal_bind ||
+ ipv6_chk_addr(net, &addr->v6.sin6_addr,
+ dev, 0))) {
rcu_read_unlock();
return 0;
}
Patches currently in stable-queue which might be from laszlth(a)gmail.com are
queue-4.13/sctp-full-support-for-ipv6-ip_nonlocal_bind-ip_freebind.patch