This is the start of the stable review cycle for the 4.15.15 release. There are 47 patches in this series, all of which will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Mar 31 17:57:05 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.15.15-rc1...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.15.15-rc1
Arkadi Sharshevsky arkadis@mellanox.com team: Fix double free in error path
Vinicius Costa Gomes vinicius.gomes@intel.com skbuff: Fix not waking applications when errors are enqueued
Michal Kalderon Michal.Kalderon@cavium.com qede: Fix qedr link update
Florian Fainelli f.fainelli@gmail.com net: systemport: Rewrite __bcm_sysport_tx_reclaim()
David Ahern dsahern@gmail.com net: Only honor ifindex in IP_PKTINFO if non-0
Nicolas Dichtel nicolas.dichtel@6wind.com netlink: avoid a double skb free in genlmsg_mcast()
Arvind Yadav arvind.yadav.cs@gmail.com net/iucv: Free memory obtained by kzalloc
Florian Fainelli f.fainelli@gmail.com net: fec: Fix unbalanced PM runtime calls
SZ Lin (林上智) sz.lin@moxa.com net: ethernet: ti: cpsw: add check for in-band mode setting with RGMII PHY interface
Christophe JAILLET christophe.jaillet@wanadoo.fr net: ethernet: arc: Fix a potential memory leak if an optional regulator is deferred
Eric Dumazet edumazet@google.com l2tp: do not accept arbitrary sockets
Lorenzo Bianconi lorenzo.bianconi@redhat.com ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option()
Alexey Kodanev alexey.kodanev@oracle.com dccp: check sk for closed state in dccp_sendmsg()
Camelia Groza camelia.groza@nxp.com dpaa_eth: remove duplicate increment of the tx_errors counter
Camelia Groza camelia.groza@nxp.com dpaa_eth: increment the RX dropped counter when needed
Camelia Groza camelia.groza@nxp.com dpaa_eth: remove duplicate initialization
Madalin Bucur madalin.bucur@nxp.com dpaa_eth: fix error in dpaa_remove()
Madalin Bucur madalin.bucur@nxp.com soc/fsl/qbman: fix issue in qman_delete_cgr_safe()
Julian Wiedmann jwi@linux.vnet.ibm.com s390/qeth: on channel error, reject further cmd requests
Julian Wiedmann jwi@linux.vnet.ibm.com s390/qeth: lock read device while queueing next buffer
Julian Wiedmann jwi@linux.vnet.ibm.com s390/qeth: when thread completes, wake up all waiters
Julian Wiedmann jwi@linux.vnet.ibm.com s390/qeth: free netdevice when removing a card
Kirill Tkhai ktkhai@virtuozzo.com net: Fix hlist corruptions in inet_evict_bucket()
Eric Dumazet edumazet@google.com net: use skb_to_full_sk() in skb_update_prio()
Eric Dumazet edumazet@google.com ieee802154: 6lowpan: fix possible NULL deref in lowpan_device_event()
Alexey Kodanev alexey.kodanev@oracle.com sch_netem: fix skb leak in netem_enqueue()
Tom Herbert tom@quantonium.net kcm: lock lower socket in kcm_attach
Paul Blakey paulb@mellanox.com test_rhashtable: add test case for rhltable with duplicate objects
Paul Blakey paulb@mellanox.com rhashtable: Fix rhlist duplicates insertion
Guillaume Nault g.nault@alphalink.fr ppp: avoid loop in xmit recursion detection code
Roman Mashak mrv@mojatatu.com net sched actions: return explicit error when tunnel_key mode is not specified
Stefano Brivio sbrivio@redhat.com ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routes
Brad Mouring brad.mouring@ni.com net: phy: Tell caller result of phy_change()
Ido Schimmel idosch@mellanox.com mlxsw: spectrum_buffers: Set a minimum quota for CPU port traffic
David Lebrun dlebrun@google.com ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state
David Lebrun dlebrun@google.com ipv6: sr: fix NULL pointer dereference when setting encap source address
Stefano Brivio sbrivio@redhat.com ipv6: old_dport should be a __be16 in __ip6_datagram_connect()
Paolo Abeni pabeni@redhat.com net: ipv6: keep sk status consistent after datagram connect failure
Shannon Nelson shannon.nelson@oracle.com macvlan: filter out unsupported feature flags
Arkadi Sharshevsky arkadis@mellanox.com devlink: Remove redundant free on error path
Grygorii Strashko grygorii.strashko@ti.com net: phy: relax error checking when creating sysfs link netdev->phydev
Grygorii Strashko grygorii.strashko@ti.com sysfs: symlink: export sysfs_create_link_nowarn()
Michal Kalderon Michal.Kalderon@cavium.com qed: Fix non TCP packets should be dropped on iWARP ll2 connection
Soheil Hassas Yeganeh soheil@google.com tcp: purge write queue upon aborting the connection
Michal Kalderon Michal.Kalderon@cavium.com qed: Fix MPA unalign flow in case header is split across two packets.
zhangliping zhangliping02@baidu.com openvswitch: meter: fix the incorrect calculation of max delta_t
Florian Fainelli f.fainelli@gmail.com net: dsa: Fix dsa_is_user_port() test inversion
-------------
Diffstat:
 Makefile                                           |   4 +-
 drivers/net/ethernet/arc/emac_rockchip.c           |   6 +-
 drivers/net/ethernet/broadcom/bcmsysport.c         |  33 ++--
 drivers/net/ethernet/broadcom/bcmsysport.h         |   2 +-
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c     |   8 +-
 drivers/net/ethernet/freescale/fec_main.c          |   2 +
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c |  12 +-
 drivers/net/ethernet/qlogic/qed/qed_iwarp.c        |  17 +-
 drivers/net/ethernet/qlogic/qede/qede_main.c       |   4 +-
 drivers/net/ethernet/ti/cpsw.c                     |   3 +-
 drivers/net/macvlan.c                              |   2 +-
 drivers/net/phy/phy.c                              | 173 ++++++++++-----------
 drivers/net/phy/phy_device.c                       |  15 +-
 drivers/net/ppp/ppp_generic.c                      |  26 ++--
 drivers/net/team/team.c                            |   4 +-
 drivers/s390/net/qeth_core_main.c                  |  21 ++-
 drivers/s390/net/qeth_l2_main.c                    |   2 +-
 drivers/s390/net/qeth_l3_main.c                    |   2 +-
 drivers/soc/fsl/qbman/qman.c                       |  28 +---
 fs/sysfs/symlink.c                                 |   1 +
 include/linux/cgroup-defs.h                        |   4 +-
 include/linux/phy.h                                |   1 -
 include/linux/rhashtable.h                         |   4 +-
 include/net/sch_generic.h                          |  19 +++
 lib/rhashtable.c                                   |   4 +-
 lib/test_rhashtable.c                              | 134 ++++++++++++++++
 net/core/dev.c                                     |  22 ++-
 net/core/devlink.c                                 |  16 +-
 net/core/skbuff.c                                  |   2 +-
 net/dccp/proto.c                                   |   5 +
 net/dsa/legacy.c                                   |   2 +-
 net/ieee802154/6lowpan/core.c                      |  12 +-
 net/ipv4/inet_fragment.c                           |   3 +
 net/ipv4/ip_sockglue.c                             |   6 +-
 net/ipv4/tcp.c                                     |   1 +
 net/ipv4/tcp_timer.c                               |   1 +
 net/ipv6/datagram.c                                |  21 ++-
 net/ipv6/ndisc.c                                   |   3 +-
 net/ipv6/route.c                                   |  71 +++++----
 net/ipv6/seg6_iptunnel.c                           |   7 +-
 net/iucv/af_iucv.c                                 |   4 +-
 net/kcm/kcmsock.c                                  |  33 ++--
 net/l2tp/l2tp_core.c                               |   8 +-
 net/netlink/genetlink.c                            |   2 +-
 net/openvswitch/meter.c                            |  12 +-
 net/sched/act_tunnel_key.c                         |   1 +
 net/sched/sch_netem.c                              |   2 +-
 47 files changed, 501 insertions(+), 264 deletions(-)
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 5a9f8df68ee6927f21dd3f2c75c16feb8b53a9e8 ]
During the conversion to dsa_is_user_port(), a condition ended up being reversed, which would prevent the creation of any user port when using the legacy binding and/or platform data. Fix that.
Fixes: 4a5b85ffe2a0 ("net: dsa: use dsa_is_user_port everywhere") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dsa/legacy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/dsa/legacy.c +++ b/net/dsa/legacy.c @@ -194,7 +194,7 @@ static int dsa_switch_setup_one(struct d ds->ports[i].dn = cd->port_dn[i]; ds->ports[i].cpu_dp = dst->cpu_dp;
- if (dsa_is_user_port(ds, i)) + if (!dsa_is_user_port(ds, i)) continue;
ret = dsa_slave_create(&ds->ports[i]);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhangliping zhangliping02@baidu.com
[ Upstream commit ddc502dfed600bff0b61d899f70d95b76223fdfc ]
Max delta_t should be full_bucket/rate instead of full_bucket. Also, report EINVAL if the rate is zero.
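As a rough worked example (the numbers below are made up; only the formula comes from the patch): the bucket refills at roughly 'rate' units per millisecond, so the time needed to fill it from empty is bucket/rate, not the bucket size itself.

    #include <stdio.h>
    #include <stdint.h>

    /* Worked example with made-up values for OVS_BAND_ATTR_RATE/BURST. */
    int main(void)
    {
            uint32_t rate = 1000;   /* kbit/s */
            uint32_t burst = 500;   /* kbit */

            /* Full bucket, in 1/1000 units: (500 + 1000) * 1000 */
            uint32_t bucket = (burst + rate) * 1000;

            printf("old max_delta_t = %u\n", bucket);        /* 1500000, far too large */
            printf("new max_delta_t = %u\n", bucket / rate); /* 1500 (ms) */
            return 0;
    }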
Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure") Cc: Andy Zhou azhou@ovn.org Signed-off-by: zhangliping zhangliping02@baidu.com Acked-by: Pravin B Shelar pshelar@ovn.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/openvswitch/meter.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/net/openvswitch/meter.c +++ b/net/openvswitch/meter.c @@ -242,14 +242,20 @@ static struct dp_meter *dp_meter_create(
band->type = nla_get_u32(attr[OVS_BAND_ATTR_TYPE]); band->rate = nla_get_u32(attr[OVS_BAND_ATTR_RATE]); + if (band->rate == 0) { + err = -EINVAL; + goto exit_free_meter; + } + band->burst_size = nla_get_u32(attr[OVS_BAND_ATTR_BURST]); /* Figure out max delta_t that is enough to fill any bucket. * Keep max_delta_t size to the bucket units: * pkts => 1/1000 packets, kilobits => bits. + * + * Start with a full bucket. */ - band_max_delta_t = (band->burst_size + band->rate) * 1000; - /* Start with a full bucket. */ - band->bucket = band_max_delta_t; + band->bucket = (band->burst_size + band->rate) * 1000; + band_max_delta_t = band->bucket / band->rate; if (band_max_delta_t > meter->max_delta_t) meter->max_delta_t = band_max_delta_t; band++;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Kalderon Michal.Kalderon@cavium.com
[ Upstream commit 933e8c91b9f5a2f504f6da1f069c410449b9f4b9 ]
There is a corner case in the MPA unalign flow where an FPDU header is split over two TCP segments. The length of the first fragment in this case was not initialized properly; it should be '1'.
Fixes: c7d1d839 ("qed: Add support for MPA header being split over two tcp packets")
Signed-off-by: Michal Kalderon Michal.Kalderon@cavium.com Signed-off-by: Ariel Elior Ariel.Elior@cavium.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c +++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c @@ -1906,8 +1906,8 @@ qed_iwarp_update_fpdu_length(struct qed_ /* Missing lower byte is now available */ mpa_len = fpdu->fpdu_length | *mpa_data; fpdu->fpdu_length = QED_IWARP_FPDU_LEN_WITH_PAD(mpa_len); - fpdu->mpa_frag_len = fpdu->fpdu_length; /* one byte of hdr */ + fpdu->mpa_frag_len = 1; fpdu->incomplete_bytes = fpdu->fpdu_length - 1; DP_VERBOSE(p_hwfn, QED_MSG_RDMA,
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Soheil Hassas Yeganeh soheil@google.com
[ Upstream commit e05836ac07c77dd90377f8c8140bce2a44af5fe7 ]
When the connection is aborted, there is no point in keeping the packets on the write queue until the connection is closed.
Similar to a27fd7a8ed38 ('tcp: purge write queue upon RST'), this is essential for a correct MSG_ZEROCOPY implementation, because userspace cannot call close(fd) before receiving zerocopy signals even when the connection is aborted.
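For background (not part of this patch), here is a rough userspace sketch of the MSG_ZEROCOPY pattern the commit message refers to: the sender must keep draining the socket error queue for completion notifications before it can safely reuse the buffer or tear the socket down. Names and error handling are simplified, and SO_ZEROCOPY requires Linux 4.14 or later.

    #include <errno.h>
    #include <sys/socket.h>

    /* Sketch: send one buffer with MSG_ZEROCOPY on a connected TCP socket
     * 'fd' and busy-wait for its completion on the error queue (real code
     * would poll() for POLLERR instead).
     */
    static int send_zerocopy(int fd, const void *buf, size_t len)
    {
            int one = 1;
            char control[128];
            struct msghdr msg = {
                    .msg_control = control,
                    .msg_controllen = sizeof(control),
            };

            if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)) < 0)
                    return -errno;
            if (send(fd, buf, len, MSG_ZEROCOPY) < 0)
                    return -errno;

            /* The completion arrives as SO_EE_ORIGIN_ZEROCOPY on the error
             * queue; until it has been read, the kernel may still reference
             * 'buf', which is why userspace cannot simply close() first.
             */
            while (recvmsg(fd, &msg, MSG_ERRQUEUE) < 0 && errno == EAGAIN)
                    ;

            return 0;
    }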
Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY") Signed-off-by: Soheil Hassas Yeganeh soheil@google.com Signed-off-by: Neal Cardwell ncardwell@google.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Yuchung Cheng ycheng@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp.c | 1 + net/ipv4/tcp_timer.c | 1 + 2 files changed, 2 insertions(+)
--- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3542,6 +3542,7 @@ int tcp_abort(struct sock *sk, int err)
bh_unlock_sock(sk); local_bh_enable(); + tcp_write_queue_purge(sk); release_sock(sk); return 0; } --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -34,6 +34,7 @@ static void tcp_write_err(struct sock *s sk->sk_err = sk->sk_err_soft ? : ETIMEDOUT; sk->sk_error_report(sk);
+ tcp_write_queue_purge(sk); tcp_done(sk); __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONTIMEOUT); }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Kalderon Michal.Kalderon@cavium.com
[ Upstream commit 16da09047d3fb991dc48af41f6d255fd578e8ca2 ]
FW workaround. The iWARP LL2 connection expected only TCP packets to arrive on its connection. The fix drops any non-TCP packets.
Fixes: b5c29ca ("qed: iWARP CM - setup a ll2 connection for handling SYN packets")
Signed-off-by: Michal Kalderon Michal.Kalderon@cavium.com Signed-off-by: Ariel Elior Ariel.Elior@cavium.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
--- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c +++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c @@ -1681,6 +1681,13 @@ qed_iwarp_parse_rx_pkt(struct qed_hwfn * iph = (struct iphdr *)((u8 *)(ethh) + eth_hlen);
if (eth_type == ETH_P_IP) { + if (iph->protocol != IPPROTO_TCP) { + DP_NOTICE(p_hwfn, + "Unexpected ip protocol on ll2 %x\n", + iph->protocol); + return -EINVAL; + } + cm_info->local_ip[0] = ntohl(iph->daddr); cm_info->remote_ip[0] = ntohl(iph->saddr); cm_info->ip_version = TCP_IPV4; @@ -1689,6 +1696,14 @@ qed_iwarp_parse_rx_pkt(struct qed_hwfn * *payload_len = ntohs(iph->tot_len) - ip_hlen; } else if (eth_type == ETH_P_IPV6) { ip6h = (struct ipv6hdr *)iph; + + if (ip6h->nexthdr != IPPROTO_TCP) { + DP_NOTICE(p_hwfn, + "Unexpected ip protocol on ll2 %x\n", + iph->protocol); + return -EINVAL; + } + for (i = 0; i < 4; i++) { cm_info->local_ip[i] = ntohl(ip6h->daddr.in6_u.u6_addr32[i]);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Grygorii Strashko grygorii.strashko@ti.com
[ Upstream commit 2399ac42e762ab25c58420e25359b2921afdc55f ]
sysfs_create_link_nowarn() is going to be used by the phylib framework in a subsequent patch, and phylib can be built as a module. Hence, export sysfs_create_link_nowarn() to avoid build errors.
Cc: Florian Fainelli f.fainelli@gmail.com Cc: Andrew Lunn andrew@lunn.ch Fixes: a3995460491d ("net: phy: Relax error checking on sysfs_create_link()") Signed-off-by: Grygorii Strashko grygorii.strashko@ti.com Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/sysfs/symlink.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/sysfs/symlink.c +++ b/fs/sysfs/symlink.c @@ -107,6 +107,7 @@ int sysfs_create_link_nowarn(struct kobj { return sysfs_do_create_link(kobj, target, name, 0); } +EXPORT_SYMBOL_GPL(sysfs_create_link_nowarn);
/** * sysfs_delete_link - remove symlink in object's directory.
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Grygorii Strashko grygorii.strashko@ti.com
[ Upstream commit 4414b3ed74be0e205e04e12cd83542a727d88255 ]
Some Ethernet drivers (like TI CPSW) may connect and manage more than one PHY per netdevice. As a result, such drivers produce a warning during system boot and fail to connect the second PHY to the netdevice when the PHYLIB framework tries to create the sysfs link netdev->phydev for the second PHY in phy_attach_direct(), because a sysfs link with the same name has already been created for the first PHY. As a result, the second CPSW external port becomes unusable.
Fix this by relaxing the error checking when the PHYLIB framework creates the sysfs link netdev->phydev in phy_attach_direct(): suppress the warning by using sysfs_create_link_nowarn() and print an error message instead. After this change, failure to create the links (phy->netdev and netdev->phy) is no longer fatal and the system can continue working, which fixes the TI CPSW issue.
Cc: Florian Fainelli f.fainelli@gmail.com Cc: Andrew Lunn andrew@lunn.ch Fixes: a3995460491d ("net: phy: Relax error checking on sysfs_create_link()") Signed-off-by: Grygorii Strashko grygorii.strashko@ti.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy_device.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
--- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -999,10 +999,17 @@ int phy_attach_direct(struct net_device err = sysfs_create_link(&phydev->mdio.dev.kobj, &dev->dev.kobj, "attached_dev"); if (!err) { - err = sysfs_create_link(&dev->dev.kobj, &phydev->mdio.dev.kobj, - "phydev"); - if (err) - goto error; + err = sysfs_create_link_nowarn(&dev->dev.kobj, + &phydev->mdio.dev.kobj, + "phydev"); + if (err) { + dev_err(&dev->dev, "could not add device link to %s err %d\n", + kobject_name(&phydev->mdio.dev.kobj), + err); + /* non-fatal - some net drivers can use one netdevice + * with more then one phy + */ + }
phydev->sysfs_links = true; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arkadi Sharshevsky arkadis@mellanox.com
[ Upstream commit 7fe4d6dcbcb43fe0282d4213fc52be178bb30e91 ]
The current code performs an unneeded free. Remove the redundant skb freeing in the error path.
Fixes: 1555d204e743 ("devlink: Support for pipeline debug (dpipe)") Signed-off-by: Arkadi Sharshevsky arkadis@mellanox.com Acked-by: Jiri Pirko jiri@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/devlink.c | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-)
--- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -1776,7 +1776,7 @@ send_done: if (!nlh) { err = devlink_dpipe_send_and_alloc_skb(&skb, info); if (err) - goto err_skb_send_alloc; + return err; goto send_done; }
@@ -1785,7 +1785,6 @@ send_done: nla_put_failure: err = -EMSGSIZE; err_table_put: -err_skb_send_alloc: genlmsg_cancel(skb, hdr); nlmsg_free(skb); return err; @@ -2051,7 +2050,7 @@ static int devlink_dpipe_entries_fill(st table->counters_enabled, &dump_ctx); if (err) - goto err_entries_dump; + return err;
send_done: nlh = nlmsg_put(dump_ctx.skb, info->snd_portid, info->snd_seq, @@ -2059,16 +2058,10 @@ send_done: if (!nlh) { err = devlink_dpipe_send_and_alloc_skb(&dump_ctx.skb, info); if (err) - goto err_skb_send_alloc; + return err; goto send_done; } return genlmsg_reply(dump_ctx.skb, info); - -err_entries_dump: -err_skb_send_alloc: - genlmsg_cancel(dump_ctx.skb, dump_ctx.hdr); - nlmsg_free(dump_ctx.skb); - return err; }
static int devlink_nl_cmd_dpipe_entries_get(struct sk_buff *skb, @@ -2207,7 +2200,7 @@ send_done: if (!nlh) { err = devlink_dpipe_send_and_alloc_skb(&skb, info); if (err) - goto err_skb_send_alloc; + return err; goto send_done; } return genlmsg_reply(skb, info); @@ -2215,7 +2208,6 @@ send_done: nla_put_failure: err = -EMSGSIZE; err_table_put: -err_skb_send_alloc: genlmsg_cancel(skb, hdr); nlmsg_free(skb); return err;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shannon Nelson shannon.nelson@oracle.com
[ Upstream commit 13fbcc8dc573482dd3f27568257fd7087f8935f4 ]
Adding a macvlan device on top of a lowerdev that supports the xfrm offloads fails with a new regression:

  # ip link add link ens1f0 mv0 type macvlan
  RTNETLINK answers: Operation not permitted
Tracing down the failure shows that the macvlan device inherits the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags from the lowerdev, but with no dev->xfrmdev_ops API filled in, it doesn't actually support xfrm. When the request is made to add the new macvlan device, the XFRM listener for NETDEV_REGISTER calls xfrm_api_check() which fails the new registration because dev->xfrmdev_ops is NULL.
The macvlan creation succeeds when we filter out the ESP feature flags in macvlan_fix_features(), so let's filter them out like we're already filtering out ~NETIF_F_NETNS_LOCAL. When XFRM support is added in the future, we can add the flags into MACVLAN_FEATURES.
This same problem could crop up in the future with any other new feature flags, so let's filter out any flags that aren't defined as supported in macvlan.
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Reported-by: Alexey Kodanev alexey.kodanev@oracle.com Signed-off-by: Shannon Nelson shannon.nelson@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/macvlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1036,7 +1036,7 @@ static netdev_features_t macvlan_fix_fea lowerdev_features &= (features | ~NETIF_F_LRO); features = netdev_increment_features(lowerdev_features, features, mask); features |= ALWAYS_ON_FEATURES; - features &= ~NETIF_F_NETNS_LOCAL; + features &= (ALWAYS_ON_FEATURES | MACVLAN_FEATURES);
return features; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 2f987a76a97773beafbc615b9c4d8fe79129a7f4 ]
On unsuccessful ip6_datagram_connect(), if the failure is caused by ip6_datagram_dst_update(), the sk peer information is cleared, but sk->sk_state is preserved.
If the socket was already in an established status, the overall sk status is inconsistent and fouls later checks in datagram code.
Fix this by saving the old peer information and restoring it in case of failure. This also aligns the ipv6 datagram connect() behavior with ipv4.
v1 -> v2: - added missing Fixes tag
Fixes: 85cb73ff9b74 ("net: ipv6: reset daddr and dport in sk if connect() fails") Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/datagram.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
--- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -146,10 +146,12 @@ int __ip6_datagram_connect(struct sock * struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr; struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); - struct in6_addr *daddr; + struct in6_addr *daddr, old_daddr; + __be32 fl6_flowlabel = 0; + __be32 old_fl6_flowlabel; + __be32 old_dport; int addr_type; int err; - __be32 fl6_flowlabel = 0;
if (usin->sin6_family == AF_INET) { if (__ipv6_only_sock(sk)) @@ -239,9 +241,13 @@ ipv4_connected: } }
+ /* save the current peer information before updating it */ + old_daddr = sk->sk_v6_daddr; + old_fl6_flowlabel = np->flow_label; + old_dport = inet->inet_dport; + sk->sk_v6_daddr = *daddr; np->flow_label = fl6_flowlabel; - inet->inet_dport = usin->sin6_port;
/* @@ -251,11 +257,12 @@ ipv4_connected:
err = ip6_datagram_dst_update(sk, true); if (err) { - /* Reset daddr and dport so that udp_v6_early_demux() - * fails to find this socket + /* Restore the socket peer info, to keep it consistent with + * the old socket state */ - memset(&sk->sk_v6_daddr, 0, sizeof(sk->sk_v6_daddr)); - inet->inet_dport = 0; + sk->sk_v6_daddr = old_daddr; + np->flow_label = old_fl6_flowlabel; + inet->inet_dport = old_dport; goto out; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit 5f2fb802eee1df0810b47ea251942fe3fd36589a ]
Fixes: 2f987a76a977 ("net: ipv6: keep sk status consistent after datagram connect failure") Signed-off-by: Stefano Brivio sbrivio@redhat.com Acked-by: Paolo Abeni pabeni@redhat.com Acked-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/datagram.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -149,7 +149,7 @@ int __ip6_datagram_connect(struct sock * struct in6_addr *daddr, old_daddr; __be32 fl6_flowlabel = 0; __be32 old_fl6_flowlabel; - __be32 old_dport; + __be16 old_dport; int addr_type; int err;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lebrun dlebrun@google.com
[ Upstream commit 8936ef7604c11b5d701580d779e0f5684abc7b68 ]
When using seg6 in encap mode, we call ipv6_dev_get_saddr() to set the source address of the outer IPv6 header, in case none was specified. Using skb->dev can lead to BUG() when it is in an inconsistent state. This patch uses the net_device attached to the skb's dst instead.
[940807.667429] BUG: unable to handle kernel NULL pointer dereference at 000000000000047c [940807.762427] IP: ipv6_dev_get_saddr+0x8b/0x1d0 [940807.815725] PGD 0 P4D 0 [940807.847173] Oops: 0000 [#1] SMP PTI [940807.890073] Modules linked in: [940807.927765] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.16.0-rc1-seg6bpf+ #2 [940808.028988] Hardware name: HP ProLiant DL120 G6/ProLiant DL120 G6, BIOS O26 09/06/2010 [940808.128128] RIP: 0010:ipv6_dev_get_saddr+0x8b/0x1d0 [940808.187667] RSP: 0018:ffff88043fd836b0 EFLAGS: 00010206 [940808.251366] RAX: 0000000000000005 RBX: ffff88042cb1c860 RCX: 00000000000000fe [940808.338025] RDX: 00000000000002c0 RSI: ffff88042cb1c860 RDI: 0000000000004500 [940808.424683] RBP: ffff88043fd83740 R08: 0000000000000000 R09: ffffffffffffffff [940808.511342] R10: 0000000000000040 R11: 0000000000000000 R12: ffff88042cb1c850 [940808.598012] R13: ffffffff8208e380 R14: ffff88042ac8da00 R15: 0000000000000002 [940808.684675] FS: 0000000000000000(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000 [940808.783036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [940808.852975] CR2: 000000000000047c CR3: 00000004255fe000 CR4: 00000000000006e0 [940808.939634] Call Trace: [940808.970041] <IRQ> [940808.995250] ? ip6t_do_table+0x265/0x640 [940809.043341] seg6_do_srh_encap+0x28f/0x300 [940809.093516] ? seg6_do_srh+0x1a0/0x210 [940809.139528] seg6_do_srh+0x1a0/0x210 [940809.183462] seg6_output+0x28/0x1e0 [940809.226358] lwtunnel_output+0x3f/0x70 [940809.272370] ip6_xmit+0x2b8/0x530 [940809.313185] ? ac6_proc_exit+0x20/0x20 [940809.359197] inet6_csk_xmit+0x7d/0xc0 [940809.404173] tcp_transmit_skb+0x548/0x9a0 [940809.453304] __tcp_retransmit_skb+0x1a8/0x7a0 [940809.506603] ? ip6_default_advmss+0x40/0x40 [940809.557824] ? tcp_current_mss+0x24/0x90 [940809.605925] tcp_retransmit_skb+0xd/0x80 [940809.654016] tcp_xmit_retransmit_queue.part.17+0xf9/0x210 [940809.719797] tcp_ack+0xa47/0x1110 [940809.760612] tcp_rcv_established+0x13c/0x570 [940809.812865] tcp_v6_do_rcv+0x151/0x3d0 [940809.858879] tcp_v6_rcv+0xa5c/0xb10 [940809.901770] ? seg6_output+0xdd/0x1e0 [940809.946745] ip6_input_finish+0xbb/0x460 [940809.994837] ip6_input+0x74/0x80 [940810.034612] ? ip6_rcv_finish+0xb0/0xb0 [940810.081663] ipv6_rcv+0x31c/0x4c0 ...
Fixes: 6c8702c60b886 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Reported-by: Tom Herbert tom@quantonium.net Signed-off-by: David Lebrun dlebrun@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/seg6_iptunnel.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/ipv6/seg6_iptunnel.c +++ b/net/ipv6/seg6_iptunnel.c @@ -93,7 +93,8 @@ static void set_tun_src(struct net *net, /* encapsulate an IPv6 packet within an outer IPv6 header with a given SRH */ int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto) { - struct net *net = dev_net(skb_dst(skb)->dev); + struct dst_entry *dst = skb_dst(skb); + struct net *net = dev_net(dst->dev); struct ipv6hdr *hdr, *inner_hdr; struct ipv6_sr_hdr *isrh; int hdrlen, tot_len, err; @@ -134,7 +135,7 @@ int seg6_do_srh_encap(struct sk_buff *sk isrh->nexthdr = proto;
hdr->daddr = isrh->segments[isrh->first_segment]; - set_tun_src(net, skb->dev, &hdr->daddr, &hdr->saddr); + set_tun_src(net, ip6_dst_idev(dst)->dev, &hdr->daddr, &hdr->saddr);
#ifdef CONFIG_IPV6_SEG6_HMAC if (sr_has_hmac(isrh)) {
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lebrun dlebrun@google.com
[ Upstream commit 191f86ca8ef27f7a492fd1c03620498c6e94f0ac ]
The seg6_build_state() function is called with RCU read lock held, so we cannot use GFP_KERNEL. This patch uses GFP_ATOMIC instead.
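As a general reminder (an illustrative fragment, not taken from the patch): sleeping is not allowed inside an RCU read-side critical section, and GFP_KERNEL allocations may sleep, so any allocation done there has to use GFP_ATOMIC (or be moved outside the critical section).

    /* Hypothetical fragment; 'struct foo' and foo_init() are made up. */
    struct foo *f;

    rcu_read_lock();
    f = kzalloc(sizeof(*f), GFP_ATOMIC);    /* GFP_KERNEL could sleep here */
    if (f)
            foo_init(f);
    rcu_read_unlock();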
[ 92.770271] ============================= [ 92.770628] WARNING: suspicious RCU usage [ 92.770921] 4.16.0-rc4+ #12 Not tainted [ 92.771277] ----------------------------- [ 92.771585] ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section! [ 92.772279] [ 92.772279] other info that might help us debug this: [ 92.772279] [ 92.773067] [ 92.773067] rcu_scheduler_active = 2, debug_locks = 1 [ 92.773514] 2 locks held by ip/2413: [ 92.773765] #0: (rtnl_mutex){+.+.}, at: [<00000000e5461720>] rtnetlink_rcv_msg+0x441/0x4d0 [ 92.774377] #1: (rcu_read_lock){....}, at: [<00000000df4f161e>] lwtunnel_build_state+0x59/0x210 [ 92.775065] [ 92.775065] stack backtrace: [ 92.775371] CPU: 0 PID: 2413 Comm: ip Not tainted 4.16.0-rc4+ #12 [ 92.775791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014 [ 92.776608] Call Trace: [ 92.776852] dump_stack+0x7d/0xbc [ 92.777130] __schedule+0x133/0xf00 [ 92.777393] ? unwind_get_return_address_ptr+0x50/0x50 [ 92.777783] ? __sched_text_start+0x8/0x8 [ 92.778073] ? rcu_is_watching+0x19/0x30 [ 92.778383] ? kernel_text_address+0x49/0x60 [ 92.778800] ? __kernel_text_address+0x9/0x30 [ 92.779241] ? unwind_get_return_address+0x29/0x40 [ 92.779727] ? pcpu_alloc+0x102/0x8f0 [ 92.780101] _cond_resched+0x23/0x50 [ 92.780459] __mutex_lock+0xbd/0xad0 [ 92.780818] ? pcpu_alloc+0x102/0x8f0 [ 92.781194] ? seg6_build_state+0x11d/0x240 [ 92.781611] ? save_stack+0x9b/0xb0 [ 92.781965] ? __ww_mutex_wakeup_for_backoff+0xf0/0xf0 [ 92.782480] ? seg6_build_state+0x11d/0x240 [ 92.782925] ? lwtunnel_build_state+0x1bd/0x210 [ 92.783393] ? ip6_route_info_create+0x687/0x1640 [ 92.783846] ? ip6_route_add+0x74/0x110 [ 92.784236] ? inet6_rtm_newroute+0x8a/0xd0
Fixes: 6c8702c60b886 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: David Lebrun dlebrun@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/seg6_iptunnel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/seg6_iptunnel.c +++ b/net/ipv6/seg6_iptunnel.c @@ -419,7 +419,7 @@ static int seg6_build_state(struct nlatt
slwt = seg6_lwt_lwtunnel(newts);
- err = dst_cache_init(&slwt->cache, GFP_KERNEL); + err = dst_cache_init(&slwt->cache, GFP_ATOMIC); if (err) { kfree(newts); return err;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@mellanox.com
[ Upstream commit bcdd5de80a2275f7879dc278bfc747f1caf94442 ]
In commit 9ffcc3725f09 ("mlxsw: spectrum: Allow packets to be trapped from any PG") I fixed a problem where packets could not be trapped to the CPU due to exceeded shared buffer quotas. The mentioned commit explains the problem in detail.
The problem was fixed by assigning a minimum quota for the CPU port and the traffic class used for scheduling traffic to the CPU.
However, commit 117b0dad2d54 ("mlxsw: Create a different trap group list for each device") assigned different traffic classes to different packet types and rendered the fix useless.
Fix the problem by assigning a minimum quota for the CPU port and all the traffic classes that are currently in use.
Fixes: 117b0dad2d54 ("mlxsw: Create a different trap group list for each device") Signed-off-by: Ido Schimmel idosch@mellanox.com Reported-by: Eddie Shklaer eddies@mellanox.com Tested-by: Eddie Shklaer eddies@mellanox.com Acked-by: Jiri Pirko jiri@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c @@ -385,13 +385,13 @@ static const struct mlxsw_sp_sb_cm mlxsw
static const struct mlxsw_sp_sb_cm mlxsw_sp_cpu_port_sb_cms[] = { MLXSW_SP_CPU_PORT_SB_CM, + MLXSW_SP_SB_CM(MLXSW_PORT_MAX_MTU, 0, 0), + MLXSW_SP_SB_CM(MLXSW_PORT_MAX_MTU, 0, 0), + MLXSW_SP_SB_CM(MLXSW_PORT_MAX_MTU, 0, 0), + MLXSW_SP_SB_CM(MLXSW_PORT_MAX_MTU, 0, 0), + MLXSW_SP_SB_CM(MLXSW_PORT_MAX_MTU, 0, 0), MLXSW_SP_CPU_PORT_SB_CM, - MLXSW_SP_CPU_PORT_SB_CM, - MLXSW_SP_CPU_PORT_SB_CM, - MLXSW_SP_CPU_PORT_SB_CM, - MLXSW_SP_CPU_PORT_SB_CM, - MLXSW_SP_CPU_PORT_SB_CM, - MLXSW_SP_SB_CM(10000, 0, 0), + MLXSW_SP_SB_CM(MLXSW_PORT_MAX_MTU, 0, 0), MLXSW_SP_CPU_PORT_SB_CM, MLXSW_SP_CPU_PORT_SB_CM, MLXSW_SP_CPU_PORT_SB_CM,
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brad Mouring brad.mouring@ni.com
[ Upstream commit a2c054a896b8ac794ddcfc7c92e2dc7ec4ed4ed5 ]
In commit 664fcf123a30e ("net: phy: Threaded interrupts allow some simplification"), the phy_interrupt system was changed to use a traditional threaded interrupt scheme instead of a workqueue approach.
With this change, the phy status check moved into phy_change, which did not report back to the caller whether or not the interrupt was handled. This means that, in the case of a shared phy interrupt, only the first phydev's interrupt registers are checked (since phy_interrupt() would always return IRQ_HANDLED). This leads to interrupt storms when it is a secondary device that's actually the interrupt source.
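For background (not taken from the patch itself), the usual contract for a shared interrupt line is that each handler checks whether its own device raised the interrupt and returns IRQ_NONE otherwise, so the core can offer the interrupt to the other handlers sharing the line. A hypothetical sketch:

    /* 'struct my_dev', my_irq_pending() and my_handle_event() are made-up
     * names used only to illustrate the IRQ_HANDLED/IRQ_NONE contract.
     */
    static irqreturn_t my_thread_fn(int irq, void *dev_id)
    {
            struct my_dev *dev = dev_id;

            if (!my_irq_pending(dev))       /* some other device on the line */
                    return IRQ_NONE;

            my_handle_event(dev);
            return IRQ_HANDLED;
    }

    /* registration, with a NULL hard handler: */
    err = request_threaded_irq(irq, NULL, my_thread_fn,
                               IRQF_ONESHOT | IRQF_SHARED,
                               "my_dev", dev);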
Signed-off-by: Brad Mouring brad.mouring@ni.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy.c | 173 ++++++++++++++++++++++++-------------------------- include/linux/phy.h | 1 2 files changed, 86 insertions(+), 88 deletions(-)
--- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -615,6 +615,91 @@ static void phy_error(struct phy_device }
/** + * phy_disable_interrupts - Disable the PHY interrupts from the PHY side + * @phydev: target phy_device struct + */ +static int phy_disable_interrupts(struct phy_device *phydev) +{ + int err; + + /* Disable PHY interrupts */ + err = phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED); + if (err) + goto phy_err; + + /* Clear the interrupt */ + err = phy_clear_interrupt(phydev); + if (err) + goto phy_err; + + return 0; + +phy_err: + phy_error(phydev); + + return err; +} + +/** + * phy_change - Called by the phy_interrupt to handle PHY changes + * @phydev: phy_device struct that interrupted + */ +static irqreturn_t phy_change(struct phy_device *phydev) +{ + if (phy_interrupt_is_valid(phydev)) { + if (phydev->drv->did_interrupt && + !phydev->drv->did_interrupt(phydev)) + goto ignore; + + if (phy_disable_interrupts(phydev)) + goto phy_err; + } + + mutex_lock(&phydev->lock); + if ((PHY_RUNNING == phydev->state) || (PHY_NOLINK == phydev->state)) + phydev->state = PHY_CHANGELINK; + mutex_unlock(&phydev->lock); + + if (phy_interrupt_is_valid(phydev)) { + atomic_dec(&phydev->irq_disable); + enable_irq(phydev->irq); + + /* Reenable interrupts */ + if (PHY_HALTED != phydev->state && + phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED)) + goto irq_enable_err; + } + + /* reschedule state queue work to run as soon as possible */ + phy_trigger_machine(phydev, true); + return IRQ_HANDLED; + +ignore: + atomic_dec(&phydev->irq_disable); + enable_irq(phydev->irq); + return IRQ_NONE; + +irq_enable_err: + disable_irq(phydev->irq); + atomic_inc(&phydev->irq_disable); +phy_err: + phy_error(phydev); + return IRQ_NONE; +} + +/** + * phy_change_work - Scheduled by the phy_mac_interrupt to handle PHY changes + * @work: work_struct that describes the work to be done + */ +void phy_change_work(struct work_struct *work) +{ + struct phy_device *phydev = + container_of(work, struct phy_device, phy_queue); + + phy_change(phydev); +} + +/** * phy_interrupt - PHY interrupt handler * @irq: interrupt line * @phy_dat: phy_device pointer @@ -632,9 +717,7 @@ static irqreturn_t phy_interrupt(int irq disable_irq_nosync(irq); atomic_inc(&phydev->irq_disable);
- phy_change(phydev); - - return IRQ_HANDLED; + return phy_change(phydev); }
/** @@ -652,32 +735,6 @@ static int phy_enable_interrupts(struct }
/** - * phy_disable_interrupts - Disable the PHY interrupts from the PHY side - * @phydev: target phy_device struct - */ -static int phy_disable_interrupts(struct phy_device *phydev) -{ - int err; - - /* Disable PHY interrupts */ - err = phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED); - if (err) - goto phy_err; - - /* Clear the interrupt */ - err = phy_clear_interrupt(phydev); - if (err) - goto phy_err; - - return 0; - -phy_err: - phy_error(phydev); - - return err; -} - -/** * phy_start_interrupts - request and enable interrupts for a PHY device * @phydev: target phy_device struct * @@ -728,64 +785,6 @@ int phy_stop_interrupts(struct phy_devic EXPORT_SYMBOL(phy_stop_interrupts);
/** - * phy_change - Called by the phy_interrupt to handle PHY changes - * @phydev: phy_device struct that interrupted - */ -void phy_change(struct phy_device *phydev) -{ - if (phy_interrupt_is_valid(phydev)) { - if (phydev->drv->did_interrupt && - !phydev->drv->did_interrupt(phydev)) - goto ignore; - - if (phy_disable_interrupts(phydev)) - goto phy_err; - } - - mutex_lock(&phydev->lock); - if ((PHY_RUNNING == phydev->state) || (PHY_NOLINK == phydev->state)) - phydev->state = PHY_CHANGELINK; - mutex_unlock(&phydev->lock); - - if (phy_interrupt_is_valid(phydev)) { - atomic_dec(&phydev->irq_disable); - enable_irq(phydev->irq); - - /* Reenable interrupts */ - if (PHY_HALTED != phydev->state && - phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED)) - goto irq_enable_err; - } - - /* reschedule state queue work to run as soon as possible */ - phy_trigger_machine(phydev, true); - return; - -ignore: - atomic_dec(&phydev->irq_disable); - enable_irq(phydev->irq); - return; - -irq_enable_err: - disable_irq(phydev->irq); - atomic_inc(&phydev->irq_disable); -phy_err: - phy_error(phydev); -} - -/** - * phy_change_work - Scheduled by the phy_mac_interrupt to handle PHY changes - * @work: work_struct that describes the work to be done - */ -void phy_change_work(struct work_struct *work) -{ - struct phy_device *phydev = - container_of(work, struct phy_device, phy_queue); - - phy_change(phydev); -} - -/** * phy_stop - Bring down the PHY link, and stop checking the status * @phydev: target phy_device struct */ --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -897,7 +897,6 @@ int phy_driver_register(struct phy_drive int phy_drivers_register(struct phy_driver *new_driver, int n, struct module *owner); void phy_state_machine(struct work_struct *work); -void phy_change(struct phy_device *phydev); void phy_change_work(struct work_struct *work); void phy_mac_interrupt(struct phy_device *phydev, int new_link); void phy_start_machine(struct phy_device *phydev);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit e9fa1495d738e34fcec88a3d2ec9101a9ee5b310 ]
Currently, administrative MTU changes on a given netdevice are not reflected on route exceptions for MTU-less routes, with a set PMTU value, for that device:
# ip -6 route get 2001:db8::b
2001:db8::b from :: dev vti_a proto kernel src 2001:db8::a metric 256 pref medium
# ping6 -c 1 -q -s10000 2001:db8::b > /dev/null
# ip netns exec a ip -6 route get 2001:db8::b
2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
    cache expires 571sec mtu 4926 pref medium
# ip link set dev vti_a mtu 3000
# ip -6 route get 2001:db8::b
2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
    cache expires 571sec mtu 4926 pref medium
# ip link set dev vti_a mtu 9000
# ip -6 route get 2001:db8::b
2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
    cache expires 571sec mtu 4926 pref medium
The first issue is that since commit fb56be83e43d ("net-ipv6: on device mtu change do not add mtu to mtu-less routes") we don't call rt6_exceptions_update_pmtu() from rt6_mtu_change_route(), which handles administrative MTU changes, if the regular route is MTU-less.
However, PMTU exceptions should be always updated, as long as RTAX_MTU is not locked. Keep the check for MTU-less main route, as introduced by that commit, but, for exceptions, call rt6_exceptions_update_pmtu() regardless of that check.
Once that is fixed, one problem remains: MTU changes are not reflected if the new MTU is higher than the previous one, because rt6_exceptions_update_pmtu() doesn't allow that. We should instead allow PMTU increase if the old PMTU matches the local MTU, as that implies that the old MTU was the lowest in the path, and PMTU discovery might lead to different results.
The existing check in rt6_mtu_change_route() correctly took that case into account (for regular routes only), so factor it out and re-use it also in rt6_exceptions_update_pmtu().
While at it, fix comments style and grammar, and try to be a bit more descriptive.
Reported-by: Xiumei Mu xmu@redhat.com Fixes: fb56be83e43d ("net-ipv6: on device mtu change do not add mtu to mtu-less routes") Fixes: f5bbe7ee79c2 ("ipv6: prepare rt6_mtu_change() for exception table") Signed-off-by: Stefano Brivio sbrivio@redhat.com Acked-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/route.c | 71 ++++++++++++++++++++++++++++++++----------------------- 1 file changed, 42 insertions(+), 29 deletions(-)
--- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1510,7 +1510,30 @@ static void rt6_exceptions_remove_prefsr } }
-static void rt6_exceptions_update_pmtu(struct rt6_info *rt, int mtu) +static bool rt6_mtu_change_route_allowed(struct inet6_dev *idev, + struct rt6_info *rt, int mtu) +{ + /* If the new MTU is lower than the route PMTU, this new MTU will be the + * lowest MTU in the path: always allow updating the route PMTU to + * reflect PMTU decreases. + * + * If the new MTU is higher, and the route PMTU is equal to the local + * MTU, this means the old MTU is the lowest in the path, so allow + * updating it: if other nodes now have lower MTUs, PMTU discovery will + * handle this. + */ + + if (dst_mtu(&rt->dst) >= mtu) + return true; + + if (dst_mtu(&rt->dst) == idev->cnf.mtu6) + return true; + + return false; +} + +static void rt6_exceptions_update_pmtu(struct inet6_dev *idev, + struct rt6_info *rt, int mtu) { struct rt6_exception_bucket *bucket; struct rt6_exception *rt6_ex; @@ -1519,20 +1542,22 @@ static void rt6_exceptions_update_pmtu(s bucket = rcu_dereference_protected(rt->rt6i_exception_bucket, lockdep_is_held(&rt6_exception_lock));
- if (bucket) { - for (i = 0; i < FIB6_EXCEPTION_BUCKET_SIZE; i++) { - hlist_for_each_entry(rt6_ex, &bucket->chain, hlist) { - struct rt6_info *entry = rt6_ex->rt6i; - /* For RTF_CACHE with rt6i_pmtu == 0 - * (i.e. a redirected route), - * the metrics of its rt->dst.from has already - * been updated. - */ - if (entry->rt6i_pmtu && entry->rt6i_pmtu > mtu) - entry->rt6i_pmtu = mtu; - } - bucket++; + if (!bucket) + return; + + for (i = 0; i < FIB6_EXCEPTION_BUCKET_SIZE; i++) { + hlist_for_each_entry(rt6_ex, &bucket->chain, hlist) { + struct rt6_info *entry = rt6_ex->rt6i; + + /* For RTF_CACHE with rt6i_pmtu == 0 (i.e. a redirected + * route), the metrics of its rt->dst.from have already + * been updated. + */ + if (entry->rt6i_pmtu && + rt6_mtu_change_route_allowed(idev, entry, mtu)) + entry->rt6i_pmtu = mtu; } + bucket++; } }
@@ -3521,25 +3546,13 @@ static int rt6_mtu_change_route(struct r Since RFC 1981 doesn't include administrative MTU increase update PMTU increase is a MUST. (i.e. jumbo frame) */ - /* - If new MTU is less than route PMTU, this new MTU will be the - lowest MTU in the path, update the route PMTU to reflect PMTU - decreases; if new MTU is greater than route PMTU, and the - old MTU is the lowest MTU in the path, update the route PMTU - to reflect the increase. In this case if the other nodes' MTU - also have the lowest MTU, TOO BIG MESSAGE will be lead to - PMTU discovery. - */ if (rt->dst.dev == arg->dev && - dst_metric_raw(&rt->dst, RTAX_MTU) && !dst_metric_locked(&rt->dst, RTAX_MTU)) { spin_lock_bh(&rt6_exception_lock); - if (dst_mtu(&rt->dst) >= arg->mtu || - (dst_mtu(&rt->dst) < arg->mtu && - dst_mtu(&rt->dst) == idev->cnf.mtu6)) { + if (dst_metric_raw(&rt->dst, RTAX_MTU) && + rt6_mtu_change_route_allowed(idev, rt, arg->mtu)) dst_metric_set(&rt->dst, RTAX_MTU, arg->mtu); - } - rt6_exceptions_update_pmtu(rt, arg->mtu); + rt6_exceptions_update_pmtu(idev, rt, arg->mtu); spin_unlock_bh(&rt6_exception_lock); } return 0;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Mashak mrv@mojatatu.com
[ Upstream commit 51d4740f88affd85d49c04e3c9cd129c0e33bcb9 ]
If the set/unset mode of the tunnel_key action is not provided, ->init() still returns 0, and the caller proceeds with a bogus 'struct tc_action *' object, which results in a crash:
% tc actions add action tunnel_key src_ip 1.1.1.1 dst_ip 2.2.2.1 id 7 index 1
[ 35.805515] general protection fault: 0000 [#1] SMP PTI [ 35.806161] Modules linked in: act_tunnel_key kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd serio_raw [ 35.808233] CPU: 1 PID: 428 Comm: tc Not tainted 4.16.0-rc4+ #286 [ 35.808929] RIP: 0010:tcf_action_init+0x90/0x190 [ 35.809457] RSP: 0018:ffffb8edc068b9a0 EFLAGS: 00010206 [ 35.810053] RAX: 1320c000000a0003 RBX: 0000000000000001 RCX: 0000000000000000 [ 35.810866] RDX: 0000000000000070 RSI: 0000000000007965 RDI: ffffb8edc068b910 [ 35.811660] RBP: ffffb8edc068b9d0 R08: 0000000000000000 R09: ffffb8edc068b808 [ 35.812463] R10: ffffffffc02bf040 R11: 0000000000000040 R12: ffffb8edc068bb38 [ 35.813235] R13: 0000000000000000 R14: 0000000000000000 R15: ffffb8edc068b910 [ 35.814006] FS: 00007f3d0d8556c0(0000) GS:ffff91d1dbc40000(0000) knlGS:0000000000000000 [ 35.814881] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.815540] CR2: 000000000043f720 CR3: 0000000019248001 CR4: 00000000001606a0 [ 35.816457] Call Trace: [ 35.817158] tc_ctl_action+0x11a/0x220 [ 35.817795] rtnetlink_rcv_msg+0x23d/0x2e0 [ 35.818457] ? __slab_alloc+0x1c/0x30 [ 35.819079] ? __kmalloc_node_track_caller+0xb1/0x2b0 [ 35.819544] ? rtnl_calcit.isra.30+0xe0/0xe0 [ 35.820231] netlink_rcv_skb+0xce/0x100 [ 35.820744] netlink_unicast+0x164/0x220 [ 35.821500] netlink_sendmsg+0x293/0x370 [ 35.822040] sock_sendmsg+0x30/0x40 [ 35.822508] ___sys_sendmsg+0x2c5/0x2e0 [ 35.823149] ? pagecache_get_page+0x27/0x220 [ 35.823714] ? filemap_fault+0xa2/0x640 [ 35.824423] ? page_add_file_rmap+0x108/0x200 [ 35.825065] ? alloc_set_pte+0x2aa/0x530 [ 35.825585] ? finish_fault+0x4e/0x70 [ 35.826140] ? __handle_mm_fault+0xbc1/0x10d0 [ 35.826723] ? __sys_sendmsg+0x41/0x70 [ 35.827230] __sys_sendmsg+0x41/0x70 [ 35.827710] do_syscall_64+0x68/0x120 [ 35.828195] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 35.828859] RIP: 0033:0x7f3d0ca4da67 [ 35.829331] RSP: 002b:00007ffc9f284338 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 35.830304] RAX: ffffffffffffffda RBX: 00007ffc9f284460 RCX: 00007f3d0ca4da67 [ 35.831247] RDX: 0000000000000000 RSI: 00007ffc9f2843b0 RDI: 0000000000000003 [ 35.832167] RBP: 000000005aa6a7a9 R08: 0000000000000001 R09: 0000000000000000 [ 35.833075] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000 [ 35.833997] R13: 00007ffc9f2884c0 R14: 0000000000000001 R15: 0000000000674640 [ 35.834923] Code: 24 30 bb 01 00 00 00 45 31 f6 eb 5e 8b 50 08 83 c2 07 83 e2 fc 83 c2 70 49 8b 07 48 8b 40 70 48 85 c0 74 10 48 89 14 24 4c 89 ff <ff> d0 48 8b 14 24 48 01 c2 49 01 d6 45 85 ed 74 05 41 83 47 2c [ 35.837442] RIP: tcf_action_init+0x90/0x190 RSP: ffffb8edc068b9a0 [ 35.838291] ---[ end trace a095c06ee4b97a26 ]---
Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key") Signed-off-by: Roman Mashak mrv@mojatatu.com Acked-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/act_tunnel_key.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/sched/act_tunnel_key.c +++ b/net/sched/act_tunnel_key.c @@ -153,6 +153,7 @@ static int tunnel_key_init(struct net *n metadata->u.tun_info.mode |= IP_TUNNEL_INFO_TX; break; default: + ret = -EINVAL; goto err_out; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guillaume Nault g.nault@alphalink.fr
[ Upstream commit 6d066734e9f09cdea4a3b9cb76136db3f29cfb02 ]
We already detect situations where a PPP channel sends packets back to its upper PPP device. While this is enough to avoid deadlocking on xmit locks, this doesn't prevent packets from looping between the channel and the unit.
The problem is that ppp_start_xmit() enqueues packets in ppp->file.xq before checking for xmit recursion. Therefore, __ppp_xmit_process() might dequeue a packet from ppp->file.xq and send it on the channel which, in turn, loops it back on the unit. Then ppp_start_xmit() queues the packet back to ppp->file.xq and __ppp_xmit_process() picks it up and sends it again through the channel. Therefore, the packet will loop between __ppp_xmit_process() and ppp_start_xmit() until some other part of the xmit path drops it.
For L2TP, we rapidly fill the skb's headroom and pppol2tp_xmit() drops the packet after a few iterations. But PPTP reallocates the headroom if necessary, letting the loop run and exhaust the machine resources (as reported in https://bugzilla.kernel.org/show_bug.cgi?id=199109).
Fix this by letting __ppp_xmit_process() enqueue the skb to ppp->file.xq, so that we can check for recursion before adding it to the queue. Now ppp_xmit_process() can drop the packet when recursion is detected.
__ppp_channel_push() is a bit special. It calls __ppp_xmit_process() without having any actual packet to send. This is used by ppp_output_wakeup() to re-enable transmission on the parent unit (for implementations like ppp_async.c, where the .start_xmit() function might not consume the skb, leaving it in ppp->xmit_pending and disabling transmission). Therefore, __ppp_xmit_process() needs to handle the case where skb is NULL, dequeuing as many packets as possible from ppp->file.xq.
Reported-by: xu heng xuheng333@zoho.com Fixes: 55454a565836 ("ppp: avoid dealock on recursive xmit") Signed-off-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ppp/ppp_generic.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
--- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -257,7 +257,7 @@ struct ppp_net { /* Prototypes. */ static int ppp_unattached_ioctl(struct net *net, struct ppp_file *pf, struct file *file, unsigned int cmd, unsigned long arg); -static void ppp_xmit_process(struct ppp *ppp); +static void ppp_xmit_process(struct ppp *ppp, struct sk_buff *skb); static void ppp_send_frame(struct ppp *ppp, struct sk_buff *skb); static void ppp_push(struct ppp *ppp); static void ppp_channel_push(struct channel *pch); @@ -513,13 +513,12 @@ static ssize_t ppp_write(struct file *fi goto out; }
- skb_queue_tail(&pf->xq, skb); - switch (pf->kind) { case INTERFACE: - ppp_xmit_process(PF_TO_PPP(pf)); + ppp_xmit_process(PF_TO_PPP(pf), skb); break; case CHANNEL: + skb_queue_tail(&pf->xq, skb); ppp_channel_push(PF_TO_CHANNEL(pf)); break; } @@ -1267,8 +1266,8 @@ ppp_start_xmit(struct sk_buff *skb, stru put_unaligned_be16(proto, pp);
skb_scrub_packet(skb, !net_eq(ppp->ppp_net, dev_net(dev))); - skb_queue_tail(&ppp->file.xq, skb); - ppp_xmit_process(ppp); + ppp_xmit_process(ppp, skb); + return NETDEV_TX_OK;
outf: @@ -1420,13 +1419,14 @@ static void ppp_setup(struct net_device */
/* Called to do any work queued up on the transmit side that can now be done */ -static void __ppp_xmit_process(struct ppp *ppp) +static void __ppp_xmit_process(struct ppp *ppp, struct sk_buff *skb) { - struct sk_buff *skb; - ppp_xmit_lock(ppp); if (!ppp->closing) { ppp_push(ppp); + + if (skb) + skb_queue_tail(&ppp->file.xq, skb); while (!ppp->xmit_pending && (skb = skb_dequeue(&ppp->file.xq))) ppp_send_frame(ppp, skb); @@ -1440,7 +1440,7 @@ static void __ppp_xmit_process(struct pp ppp_xmit_unlock(ppp); }
-static void ppp_xmit_process(struct ppp *ppp) +static void ppp_xmit_process(struct ppp *ppp, struct sk_buff *skb) { local_bh_disable();
@@ -1448,7 +1448,7 @@ static void ppp_xmit_process(struct ppp goto err;
(*this_cpu_ptr(ppp->xmit_recursion))++; - __ppp_xmit_process(ppp); + __ppp_xmit_process(ppp, skb); (*this_cpu_ptr(ppp->xmit_recursion))--;
local_bh_enable(); @@ -1458,6 +1458,8 @@ static void ppp_xmit_process(struct ppp err: local_bh_enable();
+ kfree_skb(skb); + if (net_ratelimit()) netdev_err(ppp->dev, "recursion detected\n"); } @@ -1942,7 +1944,7 @@ static void __ppp_channel_push(struct ch if (skb_queue_empty(&pch->file.xq)) { ppp = pch->ppp; if (ppp) - __ppp_xmit_process(ppp); + __ppp_xmit_process(ppp, NULL); } }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Blakey paulb@mellanox.com
[ Upstream commit d3dcf8eb615537526bd42ff27a081d46d337816e ]
When inserting duplicate objects (those with the same key), the current rhlist implementation messes up the chain pointers by updating the bucket pointer instead of the previous node's next pointer to the newly inserted node. This causes missing elements on removal and traversal.
Fix that by properly updating the pprev pointer to point to the correct rhash_head next pointer.
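To illustrate the pprev mistake in isolation (a generic singly-linked-list sketch, not the rhashtable code): if pprev is not advanced past the nodes that are skipped, the final assignment rewrites the bucket head and every skipped node is lost.

    /* Generic illustration with a made-up 'struct node': replace the first
     * node whose key matches with 'new' (loosely mirroring how rhlist puts
     * a new duplicate at the head of its per-key list).
     */
    struct node {
            struct node *next;
            int key;
    };

    static void replace_match(struct node **bucket, struct node *new, int key)
    {
            struct node **pprev = bucket;
            struct node *head;

            for (head = *bucket; head; head = head->next) {
                    if (head->key != key) {
                            pprev = &head->next;    /* the step the fix adds */
                            continue;
                    }

                    new->next = head->next;
                    /* Without the pprev update above, *pprev would still be
                     * the bucket pointer, so this assignment would make the
                     * bucket head point at 'new' and drop the skipped nodes.
                     */
                    *pprev = new;
                    return;
            }
    }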
Issue: 1241076 Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7 Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface') Signed-off-by: Paul Blakey paulb@mellanox.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/rhashtable.h | 4 +++- lib/rhashtable.c | 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-)
--- a/include/linux/rhashtable.h +++ b/include/linux/rhashtable.h @@ -750,8 +750,10 @@ slow_path: if (!key || (params.obj_cmpfn ? params.obj_cmpfn(&arg, rht_obj(ht, head)) : - rhashtable_compare(&arg, rht_obj(ht, head)))) + rhashtable_compare(&arg, rht_obj(ht, head)))) { + pprev = &head->next; continue; + }
data = rht_obj(ht, head);
--- a/lib/rhashtable.c +++ b/lib/rhashtable.c @@ -537,8 +537,10 @@ static void *rhashtable_lookup_one(struc if (!key || (ht->p.obj_cmpfn ? ht->p.obj_cmpfn(&arg, rht_obj(ht, head)) : - rhashtable_compare(&arg, rht_obj(ht, head)))) + rhashtable_compare(&arg, rht_obj(ht, head)))) { + pprev = &head->next; continue; + }
if (!ht->rhlist) return rht_obj(ht, head);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Blakey paulb@mellanox.com
[ Upstream commit 499ac3b60f657dae82055fc81c7b01e6242ac9bc ]
The test tries to insert duplicates in the middle of a bucket's chain:

bucket 1: [[val 21 (tid=1)]] -> [[ val 1 (tid=2), val 1 (tid=0) ]]

It reuses tid to distinguish the elements' insertion order.
Signed-off-by: Paul Blakey paulb@mellanox.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/test_rhashtable.c | 134 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 134 insertions(+)
--- a/lib/test_rhashtable.c +++ b/lib/test_rhashtable.c @@ -79,6 +79,21 @@ struct thread_data { struct test_obj *objs; };
+static u32 my_hashfn(const void *data, u32 len, u32 seed) +{ + const struct test_obj_rhl *obj = data; + + return (obj->value.id % 10) << RHT_HASH_RESERVED_SPACE; +} + +static int my_cmpfn(struct rhashtable_compare_arg *arg, const void *obj) +{ + const struct test_obj_rhl *test_obj = obj; + const struct test_obj_val *val = arg->key; + + return test_obj->value.id - val->id; +} + static struct rhashtable_params test_rht_params = { .head_offset = offsetof(struct test_obj, node), .key_offset = offsetof(struct test_obj, value), @@ -87,6 +102,17 @@ static struct rhashtable_params test_rht .nulls_base = (3U << RHT_BASE_SHIFT), };
+static struct rhashtable_params test_rht_params_dup = { + .head_offset = offsetof(struct test_obj_rhl, list_node), + .key_offset = offsetof(struct test_obj_rhl, value), + .key_len = sizeof(struct test_obj_val), + .hashfn = jhash, + .obj_hashfn = my_hashfn, + .obj_cmpfn = my_cmpfn, + .nelem_hint = 128, + .automatic_shrinking = false, +}; + static struct semaphore prestart_sem; static struct semaphore startup_sem = __SEMAPHORE_INITIALIZER(startup_sem, 0);
@@ -469,6 +495,112 @@ static int __init test_rhashtable_max(st return err; }
+static unsigned int __init print_ht(struct rhltable *rhlt) +{ + struct rhashtable *ht; + const struct bucket_table *tbl; + char buff[512] = ""; + unsigned int i, cnt = 0; + + ht = &rhlt->ht; + tbl = rht_dereference(ht->tbl, ht); + for (i = 0; i < tbl->size; i++) { + struct rhash_head *pos, *next; + struct test_obj_rhl *p; + + pos = rht_dereference(tbl->buckets[i], ht); + next = !rht_is_a_nulls(pos) ? rht_dereference(pos->next, ht) : NULL; + + if (!rht_is_a_nulls(pos)) { + sprintf(buff, "%s\nbucket[%d] -> ", buff, i); + } + + while (!rht_is_a_nulls(pos)) { + struct rhlist_head *list = container_of(pos, struct rhlist_head, rhead); + sprintf(buff, "%s[[", buff); + do { + pos = &list->rhead; + list = rht_dereference(list->next, ht); + p = rht_obj(ht, pos); + + sprintf(buff, "%s val %d (tid=%d)%s", buff, p->value.id, p->value.tid, + list? ", " : " "); + cnt++; + } while (list); + + pos = next, + next = !rht_is_a_nulls(pos) ? + rht_dereference(pos->next, ht) : NULL; + + sprintf(buff, "%s]]%s", buff, !rht_is_a_nulls(pos) ? " -> " : ""); + } + } + printk(KERN_ERR "\n---- ht: ----%s\n-------------\n", buff); + + return cnt; +} + +static int __init test_insert_dup(struct test_obj_rhl *rhl_test_objects, + int cnt, bool slow) +{ + struct rhltable rhlt; + unsigned int i, ret; + const char *key; + int err = 0; + + err = rhltable_init(&rhlt, &test_rht_params_dup); + if (WARN_ON(err)) + return err; + + for (i = 0; i < cnt; i++) { + rhl_test_objects[i].value.tid = i; + key = rht_obj(&rhlt.ht, &rhl_test_objects[i].list_node.rhead); + key += test_rht_params_dup.key_offset; + + if (slow) { + err = PTR_ERR(rhashtable_insert_slow(&rhlt.ht, key, + &rhl_test_objects[i].list_node.rhead)); + if (err == -EAGAIN) + err = 0; + } else + err = rhltable_insert(&rhlt, + &rhl_test_objects[i].list_node, + test_rht_params_dup); + if (WARN(err, "error %d on element %d/%d (%s)\n", err, i, cnt, slow? "slow" : "fast")) + goto skip_print; + } + + ret = print_ht(&rhlt); + WARN(ret != cnt, "missing rhltable elements (%d != %d, %s)\n", ret, cnt, slow? "slow" : "fast"); + +skip_print: + rhltable_destroy(&rhlt); + + return 0; +} + +static int __init test_insert_duplicates_run(void) +{ + struct test_obj_rhl rhl_test_objects[3] = {}; + + pr_info("test inserting duplicates\n"); + + /* two different values that map to same bucket */ + rhl_test_objects[0].value.id = 1; + rhl_test_objects[1].value.id = 21; + + /* and another duplicate with same as [0] value + * which will be second on the bucket list */ + rhl_test_objects[2].value.id = rhl_test_objects[0].value.id; + + test_insert_dup(rhl_test_objects, 2, false); + test_insert_dup(rhl_test_objects, 3, false); + test_insert_dup(rhl_test_objects, 2, true); + test_insert_dup(rhl_test_objects, 3, true); + + return 0; +} + static int thread_lookup_test(struct thread_data *tdata) { unsigned int entries = tdata->entries; @@ -617,6 +749,8 @@ static int __init test_rht_init(void) do_div(total_time, runs); pr_info("Average test time: %llu\n", total_time);
+ test_insert_duplicates_run(); + if (!tcount) return 0;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Herbert tom@quantonium.net
[ Upstream commit 2cc683e88c0c993ac3721d9b702cb0630abe2879 ]
We need to lock the lower socket in order to provide mutual exclusion with kcm_unattach.
v2: Add Reported-by for syzbot
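The shape of the fix is the classic single-exit locking pattern. A minimal userspace sketch, with a pthread mutex standing in for lock_sock()/release_sock() and the checks reduced to flags (names invented for the example):

#include <errno.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t sock_lock = PTHREAD_MUTEX_INITIALIZER;

static int attach(int family_ok, int state_ok)
{
    int err = 0;

    pthread_mutex_lock(&sock_lock);     /* kernel: lock_sock(csk) */

    if (!family_ok) {
        err = -EOPNOTSUPP;
        goto out;
    }
    if (!state_ok) {
        err = -EOPNOTSUPP;
        goto out;
    }
    /* ... set up the psock here, under the lock, so a concurrent
     * detach cannot race with the attach ... */

out:
    pthread_mutex_unlock(&sock_lock);   /* kernel: release_sock(csk) */
    return err;
}

int main(void)
{
    printf("%d %d\n", attach(1, 1), attach(0, 1));
    return 0;
}

Every early return becomes "err = ...; goto out;", so the lock taken at the top is always dropped on exactly one exit path.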
Fixes: ab7ac4eb9832e32a09f4e804 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot+ea75c0ffcd353d32515f064aaebefc5279e6161e@syzkaller.appspotmail.com Signed-off-by: Tom Herbert tom@quantonium.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/kcm/kcmsock.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-)
--- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -1381,24 +1381,32 @@ static int kcm_attach(struct socket *soc .parse_msg = kcm_parse_func_strparser, .read_sock_done = kcm_read_sock_done, }; - int err; + int err = 0;
csk = csock->sk; if (!csk) return -EINVAL;
+ lock_sock(csk); + /* Only allow TCP sockets to be attached for now */ if ((csk->sk_family != AF_INET && csk->sk_family != AF_INET6) || - csk->sk_protocol != IPPROTO_TCP) - return -EOPNOTSUPP; + csk->sk_protocol != IPPROTO_TCP) { + err = -EOPNOTSUPP; + goto out; + }
/* Don't allow listeners or closed sockets */ - if (csk->sk_state == TCP_LISTEN || csk->sk_state == TCP_CLOSE) - return -EOPNOTSUPP; + if (csk->sk_state == TCP_LISTEN || csk->sk_state == TCP_CLOSE) { + err = -EOPNOTSUPP; + goto out; + }
psock = kmem_cache_zalloc(kcm_psockp, GFP_KERNEL); - if (!psock) - return -ENOMEM; + if (!psock) { + err = -ENOMEM; + goto out; + }
psock->mux = mux; psock->sk = csk; @@ -1407,7 +1415,7 @@ static int kcm_attach(struct socket *soc err = strp_init(&psock->strp, csk, &cb); if (err) { kmem_cache_free(kcm_psockp, psock); - return err; + goto out; }
write_lock_bh(&csk->sk_callback_lock); @@ -1419,7 +1427,8 @@ static int kcm_attach(struct socket *soc write_unlock_bh(&csk->sk_callback_lock); strp_done(&psock->strp); kmem_cache_free(kcm_psockp, psock); - return -EALREADY; + err = -EALREADY; + goto out; }
psock->save_data_ready = csk->sk_data_ready; @@ -1455,7 +1464,10 @@ static int kcm_attach(struct socket *soc /* Schedule RX work in case there are already bytes queued */ strp_check_rcv(&psock->strp);
- return 0; +out: + release_sock(csk); + + return err; }
static int kcm_attach_ioctl(struct socket *sock, struct kcm_attach *info) @@ -1507,6 +1519,7 @@ static void kcm_unattach(struct kcm_psoc
if (WARN_ON(psock->rx_kcm)) { write_unlock_bh(&csk->sk_callback_lock); + release_sock(csk); return; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Kodanev alexey.kodanev@oracle.com
[ Upstream commit 35d889d10b649fda66121891ec05eca88150059d ]
When we exceed the current packet limit and have more than one segment in the list returned by skb_gso_segment(), netem drops only the first one, skipping the rest, hence kmemleak reports:
unreferenced object 0xffff880b5d23b600 (size 1024): comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s) hex dump (first 32 bytes): 00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00 ..#]............ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000d8a19b9d>] __alloc_skb+0xc9/0x520 [<000000001709b32f>] skb_segment+0x8c8/0x3710 [<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830 [<00000000c921cba1>] inet_gso_segment+0x476/0x1370 [<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510 [<000000002182660a>] __skb_gso_segment+0x1dd/0x620 [<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem] [<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120 [<00000000fc5f7327>] ip_finish_output2+0x998/0xf00 [<00000000d309e9d3>] ip_output+0x1aa/0x2c0 [<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670 [<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0 [<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540 [<0000000013d06d02>] tasklet_action+0x1ca/0x250 [<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3 [<00000000e7ed027c>] irq_exit+0x1e2/0x210
Fix it by adding the rest of the segments, if any, to the skb 'to_free' list. Add new __qdisc_drop_all() and qdisc_drop_all() functions because they can be useful in the future if we need to drop segmented GSO packets in other places.
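The idea behind __qdisc_drop_all() can be shown with a small userspace sketch (a made-up pkt type, not the kernel skb code): instead of pushing only the head onto the to_free list, the whole segment chain is spliced on in one go:

#include <stdio.h>

struct pkt {
    int id;
    struct pkt *next;
};

/* Drop the whole chain headed by p: splice it onto *to_free in one step,
 * so none of the trailing segments is leaked. */
static void drop_all(struct pkt *p, struct pkt **to_free)
{
    struct pkt *tail = p;

    while (tail->next)
        tail = tail->next;
    tail->next = *to_free;
    *to_free = p;
}

int main(void)
{
    struct pkt segs[3] = { {0, &segs[1]}, {1, &segs[2]}, {2, NULL} };
    struct pkt *to_free = NULL;

    drop_all(&segs[0], &to_free);
    for (struct pkt *p = to_free; p; p = p->next)
        printf("freeing seg %d\n", p->id);
    return 0;
}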
Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue") Signed-off-by: Alexey Kodanev alexey.kodanev@oracle.com Acked-by: Neil Horman nhorman@tuxdriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sch_generic.h | 19 +++++++++++++++++++ net/sched/sch_netem.c | 2 +- 2 files changed, 20 insertions(+), 1 deletion(-)
--- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -735,6 +735,16 @@ static inline void __qdisc_drop(struct s *to_free = skb; }
+static inline void __qdisc_drop_all(struct sk_buff *skb, + struct sk_buff **to_free) +{ + if (skb->prev) + skb->prev->next = *to_free; + else + skb->next = *to_free; + *to_free = skb; +} + static inline unsigned int __qdisc_queue_drop_head(struct Qdisc *sch, struct qdisc_skb_head *qh, struct sk_buff **to_free) @@ -853,6 +863,15 @@ static inline int qdisc_drop(struct sk_b qdisc_qstats_drop(sch);
return NET_XMIT_DROP; +} + +static inline int qdisc_drop_all(struct sk_buff *skb, struct Qdisc *sch, + struct sk_buff **to_free) +{ + __qdisc_drop_all(skb, to_free); + qdisc_qstats_drop(sch); + + return NET_XMIT_DROP; }
/* Length to Time (L2T) lookup in a qdisc_rate_table, to determine how --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -509,7 +509,7 @@ static int netem_enqueue(struct sk_buff }
if (unlikely(sch->q.qlen >= sch->limit)) - return qdisc_drop(skb, sch, to_free); + return qdisc_drop_all(skb, sch, to_free);
qdisc_qstats_backlog_inc(sch, skb);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit ca0edb131bdf1e6beaeb2b8289fd6b374b74147d ]
A tun device's type can trivially be set to an arbitrary value using the TUNSETLINK ioctl().
Therefore, lowpan_device_event() must really check that ieee802154_ptr is not NULL.
Fixes: 2c88b5283f60d ("ieee802154: 6lowpan: remove check on null") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Alexander Aring alex.aring@gmail.com Cc: Stefan Schmidt stefan@osg.samsung.com Reported-by: syzbot syzkaller@googlegroups.com Acked-by: Stefan Schmidt stefan@osg.samsung.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/6lowpan/core.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/net/ieee802154/6lowpan/core.c +++ b/net/ieee802154/6lowpan/core.c @@ -206,9 +206,13 @@ static inline void lowpan_netlink_fini(v static int lowpan_device_event(struct notifier_block *unused, unsigned long event, void *ptr) { - struct net_device *wdev = netdev_notifier_info_to_dev(ptr); + struct net_device *ndev = netdev_notifier_info_to_dev(ptr); + struct wpan_dev *wpan_dev;
- if (wdev->type != ARPHRD_IEEE802154) + if (ndev->type != ARPHRD_IEEE802154) + return NOTIFY_DONE; + wpan_dev = ndev->ieee802154_ptr; + if (!wpan_dev) return NOTIFY_DONE;
switch (event) { @@ -217,8 +221,8 @@ static int lowpan_device_event(struct no * also delete possible lowpan interfaces which belongs * to the wpan interface. */ - if (wdev->ieee802154_ptr->lowpan_dev) - lowpan_dellink(wdev->ieee802154_ptr->lowpan_dev, NULL); + if (wpan_dev->lowpan_dev) + lowpan_dellink(wpan_dev->lowpan_dev, NULL); break; default: return NOTIFY_DONE;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 4dcb31d4649df36297296b819437709f5407059c ]
Andrei Vagin reported a KASAN: slab-out-of-bounds error in skb_update_prio()
Since a SYNACK might be attached to a request socket, we need to get back to the listener socket. Since this listener is manipulated without locks, add a const qualifier to sock_cgroup_prioidx() so that it can also be called with the const socket in skb_update_prio().
Also add the const qualifier to sock_cgroup_classid() for consistency.
Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Andrei Vagin avagin@virtuozzo.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/cgroup-defs.h | 4 ++-- net/core/dev.c | 22 +++++++++++++++------- 2 files changed, 17 insertions(+), 9 deletions(-)
--- a/include/linux/cgroup-defs.h +++ b/include/linux/cgroup-defs.h @@ -755,13 +755,13 @@ struct sock_cgroup_data { * updaters and return part of the previous pointer as the prioidx or * classid. Such races are short-lived and the result isn't critical. */ -static inline u16 sock_cgroup_prioidx(struct sock_cgroup_data *skcd) +static inline u16 sock_cgroup_prioidx(const struct sock_cgroup_data *skcd) { /* fallback to 1 which is always the ID of the root cgroup */ return (skcd->is_data & 1) ? skcd->prioidx : 1; }
-static inline u32 sock_cgroup_classid(struct sock_cgroup_data *skcd) +static inline u32 sock_cgroup_classid(const struct sock_cgroup_data *skcd) { /* fallback to 0 which is the unconfigured default classid */ return (skcd->is_data & 1) ? skcd->classid : 0; --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3247,15 +3247,23 @@ static inline int __dev_xmit_skb(struct #if IS_ENABLED(CONFIG_CGROUP_NET_PRIO) static void skb_update_prio(struct sk_buff *skb) { - struct netprio_map *map = rcu_dereference_bh(skb->dev->priomap); + const struct netprio_map *map; + const struct sock *sk; + unsigned int prioidx;
- if (!skb->priority && skb->sk && map) { - unsigned int prioidx = - sock_cgroup_prioidx(&skb->sk->sk_cgrp_data); + if (skb->priority) + return; + map = rcu_dereference_bh(skb->dev->priomap); + if (!map) + return; + sk = skb_to_full_sk(skb); + if (!sk) + return;
- if (prioidx < map->priomap_len) - skb->priority = map->priomap[prioidx]; - } + prioidx = sock_cgroup_prioidx(&sk->sk_cgrp_data); + + if (prioidx < map->priomap_len) + skb->priority = map->priomap[prioidx]; } #else #define skb_update_prio(skb)
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kirill Tkhai ktkhai@virtuozzo.com
[ Upstream commit a560002437d3646dafccecb1bf32d1685112ddda ]
inet_evict_bucket() iterates a global list, and several tasks may call it in parallel. All of them can hash the same fq->list_evictor to different lists, which leads to list corruption.
This patch makes fq be hashed to the expired list only if this has not already been done by another task. Since inet_frag_alloc() allocates fq using kmem_cache_zalloc(), we may rely on list_evictor being initially unhashed.
The problem seems to predate async pernet_operations, as it was already possible for the exit method to be executed in parallel with inet_frags::frags_work, so I add two Fixes tags. This may also go to stable.
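The guard can be pictured with a tiny userspace sketch (a made-up hnode type standing in for struct hlist_node): a zero-initialized node has pprev == NULL, i.e. it has never been hashed anywhere, and only that case may be moved onto the eviction list:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Minimal stand-in for struct hlist_node: pprev stays NULL until hashed. */
struct hnode {
    struct hnode *next;
    struct hnode **pprev;
};

static bool hnode_unhashed(const struct hnode *n)
{
    return n->pprev == NULL;
}

/* Only the first caller to see the entry unhashed may move it to the
 * eviction list; everyone else backs off, which is the idea of the fix. */
static bool should_evict(const struct hnode *n, bool over_limit)
{
    if (!hnode_unhashed(n))
        return false;
    return over_limit;
}

int main(void)
{
    struct hnode fresh  = { NULL, NULL };
    struct hnode hashed = { NULL, &fresh.next };  /* any non-NULL pprev means "already hashed" */

    printf("%d %d\n", should_evict(&fresh, true), should_evict(&hashed, true));
    return 0;
}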
Fixes: d1fe19444d82 "inet: frag: don't re-use chainlist for evictor" Fixes: f84c6821aa54 "net: Convert pernet_subsys, registered from inet_init()" Signed-off-by: Kirill Tkhai ktkhai@virtuozzo.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/inet_fragment.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -119,6 +119,9 @@ out:
static bool inet_fragq_should_evict(const struct inet_frag_queue *q) { + if (!hlist_unhashed(&q->list_evictor)) + return false; + return q->net->low_thresh == 0 || frag_mem_limit(q->net) >= q->net->low_thresh; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julian Wiedmann jwi@linux.vnet.ibm.com
[ Upstream commit 6be687395b3124f002a653c1a50b3260222b3cd7 ]
On removal, a qeth card's netdevice is currently not properly freed because the call chain looks as follows:
qeth_core_remove_device(card)
  lx_remove_device(card)
    unregister_netdev(card->dev)
    card->dev = NULL            !!!
  qeth_core_free_card(card)
    if (card->dev)              !!!
      free_netdev(card->dev)
Fix it by freeing the netdev straight after unregistering it. This also fixes the sysfs-driven layer switch case (qeth_dev_layer2_store()), where the need to free the current netdevice was not considered at all.
Note that free_netdev() takes care of the netif_napi_del() for us too.
Fixes: 4a71df50047f ("qeth: new qeth device driver") Signed-off-by: Julian Wiedmann jwi@linux.vnet.ibm.com Reviewed-by: Ursula Braun ubraun@linux.vnet.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/net/qeth_core_main.c | 2 -- drivers/s390/net/qeth_l2_main.c | 2 +- drivers/s390/net/qeth_l3_main.c | 2 +- 3 files changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -5022,8 +5022,6 @@ static void qeth_core_free_card(struct q QETH_DBF_HEX(SETUP, 2, &card, sizeof(void *)); qeth_clean_channel(&card->read); qeth_clean_channel(&card->write); - if (card->dev) - free_netdev(card->dev); qeth_free_qdio_buffers(card); unregister_service_level(&card->qeth_service_level); kfree(card); --- a/drivers/s390/net/qeth_l2_main.c +++ b/drivers/s390/net/qeth_l2_main.c @@ -933,8 +933,8 @@ static void qeth_l2_remove_device(struct qeth_l2_set_offline(cgdev);
if (card->dev) { - netif_napi_del(&card->napi); unregister_netdev(card->dev); + free_netdev(card->dev); card->dev = NULL; } return; --- a/drivers/s390/net/qeth_l3_main.c +++ b/drivers/s390/net/qeth_l3_main.c @@ -3042,8 +3042,8 @@ static void qeth_l3_remove_device(struct qeth_l3_set_offline(cgdev);
if (card->dev) { - netif_napi_del(&card->napi); unregister_netdev(card->dev); + free_netdev(card->dev); card->dev = NULL; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julian Wiedmann jwi@linux.vnet.ibm.com
[ Upstream commit 1063e432bb45be209427ed3f1ca3908e4aa3c7d7 ]
qeth_wait_for_threads() is potentially called by multiple users; make sure to notify all of them after qeth_clear_thread_running_bit() has adjusted the thread_running_mask. With no timeout, callers would otherwise stall.
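A userspace analogy of the change, with a pthread condition variable standing in for the card's wait_q: when several threads block on the same condition, only a broadcast guarantees that none of them is left sleeping forever:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int thread_running = 1;

static void *waiter(void *arg)
{
    pthread_mutex_lock(&lock);
    while (thread_running)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);
    printf("waiter %ld woke up\n", (long)arg);
    return NULL;
}

int main(void)
{
    pthread_t t[2];

    for (long i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, waiter, (void *)i);

    pthread_mutex_lock(&lock);
    thread_running = 0;
    pthread_cond_broadcast(&cond);  /* like wake_up_all(); signalling one could strand a waiter */
    pthread_mutex_unlock(&lock);

    for (long i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return 0;
}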
Signed-off-by: Julian Wiedmann jwi@linux.vnet.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/net/qeth_core_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -961,7 +961,7 @@ void qeth_clear_thread_running_bit(struc spin_lock_irqsave(&card->thread_mask_lock, flags); card->thread_running_mask &= ~thread; spin_unlock_irqrestore(&card->thread_mask_lock, flags); - wake_up(&card->wait_q); + wake_up_all(&card->wait_q); } EXPORT_SYMBOL_GPL(qeth_clear_thread_running_bit);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julian Wiedmann jwi@linux.vnet.ibm.com
[ Upstream commit 17bf8c9b3d499d5168537c98b61eb7a1fcbca6c2 ]
For calling ccw_device_start(), issue_next_read() needs to hold the device's ccwlock. This is satisfied for the IRQ handler path (where qeth_irq() gets called under the ccwlock), but we need explicit locking for the initial call by the MPC initialization.
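The patch follows the usual __helper/helper split. A minimal userspace sketch of that convention, with a pthread mutex standing in for the ccwlock and the names invented for the example:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Double-underscore variant: the caller (e.g. the IRQ-handler path) must
 * already hold dev_lock, just as ccw_device_start() requires the ccwlock. */
static int __issue_next_read(void)
{
    printf("issuing next read\n");
    return 0;
}

/* Plain variant: takes the lock on behalf of callers that don't hold it. */
static int issue_next_read(void)
{
    int ret;

    pthread_mutex_lock(&dev_lock);
    ret = __issue_next_read();
    pthread_mutex_unlock(&dev_lock);
    return ret;
}

int main(void)
{
    return issue_next_read();
}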
Signed-off-by: Julian Wiedmann jwi@linux.vnet.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/net/qeth_core_main.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -526,8 +526,7 @@ static inline int qeth_is_cq(struct qeth queue == card->qdio.no_in_queues - 1; }
- -static int qeth_issue_next_read(struct qeth_card *card) +static int __qeth_issue_next_read(struct qeth_card *card) { int rc; struct qeth_cmd_buffer *iob; @@ -558,6 +557,17 @@ static int qeth_issue_next_read(struct q return rc; }
+static int qeth_issue_next_read(struct qeth_card *card) +{ + int ret; + + spin_lock_irq(get_ccwdev_lock(CARD_RDEV(card))); + ret = __qeth_issue_next_read(card); + spin_unlock_irq(get_ccwdev_lock(CARD_RDEV(card))); + + return ret; +} + static struct qeth_reply *qeth_alloc_reply(struct qeth_card *card) { struct qeth_reply *reply; @@ -1183,7 +1193,7 @@ static void qeth_irq(struct ccw_device * return; if (channel == &card->read && channel->state == CH_STATE_UP) - qeth_issue_next_read(card); + __qeth_issue_next_read(card);
iob = channel->iob; index = channel->buf_no;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julian Wiedmann jwi@linux.vnet.ibm.com
[ Upstream commit a6c3d93963e4b333c764fde69802c3ea9eaa9d5c ]
When the IRQ handler determines that one of the cmd IO channels has failed and schedules recovery, block any further cmd requests from being submitted. The request would inevitably stall, and prevent the recovery from making progress until the request times out.
This sort of error was observed after Live Guest Relocation, where the pending IO on the READ channel intentionally gets terminated to kick-start recovery. Simultaneously the guest executed SIOCETHTOOL, triggering qeth to issue a QUERY CARD INFO command. The command then stalled in the inoperable WRITE channel.
Signed-off-by: Julian Wiedmann jwi@linux.vnet.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/net/qeth_core_main.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -1175,6 +1175,7 @@ static void qeth_irq(struct ccw_device * } rc = qeth_get_problem(cdev, irb); if (rc) { + card->read_or_write_problem = 1; qeth_clear_ipacmd_list(card); qeth_schedule_recovery(card); goto out;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Madalin Bucur madalin.bucur@nxp.com
[ Upstream commit 96f413f47677366e0ae03797409bfcc4151dbf9e ]
The wait_for_completion() call in qman_delete_cgr_safe() was triggering a scheduling-while-atomic bug; replace the kthread with an smp_call_function_single() call to fix it.
Signed-off-by: Madalin Bucur madalin.bucur@nxp.com Signed-off-by: Roy Pledge roy.pledge@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soc/fsl/qbman/qman.c | 28 +++++----------------------- 1 file changed, 5 insertions(+), 23 deletions(-)
--- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -2443,39 +2443,21 @@ struct cgr_comp { struct completion completion; };
-static int qman_delete_cgr_thread(void *p) +static void qman_delete_cgr_smp_call(void *p) { - struct cgr_comp *cgr_comp = (struct cgr_comp *)p; - int ret; - - ret = qman_delete_cgr(cgr_comp->cgr); - complete(&cgr_comp->completion); - - return ret; + qman_delete_cgr((struct qman_cgr *)p); }
void qman_delete_cgr_safe(struct qman_cgr *cgr) { - struct task_struct *thread; - struct cgr_comp cgr_comp; - preempt_disable(); if (qman_cgr_cpus[cgr->cgrid] != smp_processor_id()) { - init_completion(&cgr_comp.completion); - cgr_comp.cgr = cgr; - thread = kthread_create(qman_delete_cgr_thread, &cgr_comp, - "cgr_del"); - - if (IS_ERR(thread)) - goto out; - - kthread_bind(thread, qman_cgr_cpus[cgr->cgrid]); - wake_up_process(thread); - wait_for_completion(&cgr_comp.completion); + smp_call_function_single(qman_cgr_cpus[cgr->cgrid], + qman_delete_cgr_smp_call, cgr, true); preempt_enable(); return; } -out: + qman_delete_cgr(cgr); preempt_enable(); }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Madalin Bucur madalin.bucur@nxp.com
[ Upstream commit 88075256ee817041d68c2387f29065b5cb2b342a ]
The recent changes that make the driver probing compatible with DSA were not propagated to the dpaa_remove() function, breaking module unload. Use the proper device to address the issue.
Signed-off-by: Madalin Bucur madalin.bucur@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -2860,7 +2860,7 @@ static int dpaa_remove(struct platform_d struct device *dev; int err;
- dev = &pdev->dev; + dev = pdev->dev.parent; net_dev = dev_get_drvdata(dev);
priv = netdev_priv(net_dev);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Camelia Groza camelia.groza@nxp.com
[ Upstream commit 565186362b73226a288830abe595f05f0cec0bbc ]
The fd_format has already been initialized at this point.
Signed-off-by: Camelia Groza camelia.groza@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -2278,7 +2278,6 @@ static enum qman_cb_dqrr_result rx_defau vaddr = phys_to_virt(addr); prefetch(vaddr + qm_fd_get_offset(fd));
- fd_format = qm_fd_get_format(fd); /* The only FD types that we may receive are contig and S/G */ WARN_ON((fd_format != qm_fd_contig) && (fd_format != qm_fd_sg));
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Camelia Groza camelia.groza@nxp.com
[ Upstream commit e4d1b37c17d000a3da9368a3e260fb9ea4927c25 ]
Signed-off-by: Camelia Groza camelia.groza@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -2310,8 +2310,10 @@ static enum qman_cb_dqrr_result rx_defau
skb_len = skb->len;
- if (unlikely(netif_receive_skb(skb) == NET_RX_DROP)) + if (unlikely(netif_receive_skb(skb) == NET_RX_DROP)) { + percpu_stats->rx_dropped++; return qman_cb_dqrr_consume; + }
percpu_stats->rx_packets++; percpu_stats->rx_bytes += skb_len;
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Camelia Groza camelia.groza@nxp.com
[ Upstream commit 82d141cd19d088ee41feafde4a6f86eeb40d93c5 ]
The tx_errors counter is incremented by the dpaa_xmit caller.
Signed-off-by: Camelia Groza camelia.groza@nxp.com Signed-off-by: Madalin Bucur madalin.bucur@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -2008,7 +2008,6 @@ static inline int dpaa_xmit(struct dpaa_ }
if (unlikely(err < 0)) { - percpu_stats->tx_errors++; percpu_stats->tx_fifo_errors++; return err; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Kodanev alexey.kodanev@oracle.com
[ Upstream commit 67f93df79aeefc3add4e4b31a752600f834236e2 ]
dccp_disconnect() sets the 'dp->dccps_hc_tx_ccid' tx handler to NULL, therefore if a DCCP socket is disconnected and dccp_sendmsg() is called after that, it will cause a NULL pointer dereference in dccp_write_xmit().
This crash and the reproducer were reported by syzbot. It looks like it is reproduced if commit 69c64866ce07 ("dccp: CVE-2017-8824: use-after-free in DCCP code") is applied.
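A stripped-down userspace sketch of the race being closed (made-up sock type, -ENOTCONN as in the patch): disconnect clears the tx handler, so the send path must re-check the state before dereferencing it:

#include <errno.h>
#include <stddef.h>
#include <stdio.h>

enum state { OPEN, CLOSED };

struct sock {
    enum state state;
    void (*tx)(const char *);   /* cleared on disconnect, like dccps_hc_tx_ccid */
};

static void tx_fn(const char *msg) { printf("tx: %s\n", msg); }

static void disconnect(struct sock *sk)
{
    sk->state = CLOSED;
    sk->tx = NULL;
}

static int do_sendmsg(struct sock *sk, const char *msg)
{
    /* ... in the real code the skb is allocated here, and the socket may
     * have been disconnected in the meantime ... */
    if (sk->state == CLOSED)
        return -ENOTCONN;
    sk->tx(msg);                /* safe only because of the check above */
    return 0;
}

int main(void)
{
    struct sock sk = { OPEN, tx_fn };

    do_sendmsg(&sk, "hello");
    disconnect(&sk);
    printf("%d\n", do_sendmsg(&sk, "again"));
    return 0;
}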
Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com Signed-off-by: Alexey Kodanev alexey.kodanev@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dccp/proto.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -789,6 +789,11 @@ int dccp_sendmsg(struct sock *sk, struct if (skb == NULL) goto out_release;
+ if (sk->sk_state == DCCP_CLOSED) { + rc = -ENOTCONN; + goto out_discard; + } + skb_reserve(skb, sk->sk_prot->max_header); rc = memcpy_from_msg(skb_put(skb, len), msg, len); if (rc != 0)
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lorenzo Bianconi lorenzo.bianconi@redhat.com
[ Upstream commit 9f62c15f28b0d1d746734666d88a79f08ba1e43e ]
Fix the following slab-out-of-bounds KASAN report in ndisc_fill_redirect_hdr_option() when the incoming ipv6 packet is not linear and the accessed data are not in the linear data region of orig_skb.
[ 1503.122508] ================================================================== [ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990 [ 1503.123036] Read of size 1184 at addr ffff8800298ab6b0 by task netperf/1932
[ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124 [ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014 [ 1503.123527] Call Trace: [ 1503.123579] <IRQ> [ 1503.123638] print_address_description+0x6e/0x280 [ 1503.123849] kasan_report+0x233/0x350 [ 1503.123946] memcpy+0x1f/0x50 [ 1503.124037] ndisc_send_redirect+0x94e/0x990 [ 1503.125150] ip6_forward+0x1242/0x13b0 [...] [ 1503.153890] Allocated by task 1932: [ 1503.153982] kasan_kmalloc+0x9f/0xd0 [ 1503.154074] __kmalloc_track_caller+0xb5/0x160 [ 1503.154198] __kmalloc_reserve.isra.41+0x24/0x70 [ 1503.154324] __alloc_skb+0x130/0x3e0 [ 1503.154415] sctp_packet_transmit+0x21a/0x1810 [ 1503.154533] sctp_outq_flush+0xc14/0x1db0 [ 1503.154624] sctp_do_sm+0x34e/0x2740 [ 1503.154715] sctp_primitive_SEND+0x57/0x70 [ 1503.154807] sctp_sendmsg+0xaa6/0x1b10 [ 1503.154897] sock_sendmsg+0x68/0x80 [ 1503.154987] ___sys_sendmsg+0x431/0x4b0 [ 1503.155078] __sys_sendmsg+0xa4/0x130 [ 1503.155168] do_syscall_64+0x171/0x3f0 [ 1503.155259] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 1503.155436] Freed by task 1932: [ 1503.155527] __kasan_slab_free+0x134/0x180 [ 1503.155618] kfree+0xbc/0x180 [ 1503.155709] skb_release_data+0x27f/0x2c0 [ 1503.155800] consume_skb+0x94/0xe0 [ 1503.155889] sctp_chunk_put+0x1aa/0x1f0 [ 1503.155979] sctp_inq_pop+0x2f8/0x6e0 [ 1503.156070] sctp_assoc_bh_rcv+0x6a/0x230 [ 1503.156164] sctp_inq_push+0x117/0x150 [ 1503.156255] sctp_backlog_rcv+0xdf/0x4a0 [ 1503.156346] __release_sock+0x142/0x250 [ 1503.156436] release_sock+0x80/0x180 [ 1503.156526] sctp_sendmsg+0xbb0/0x1b10 [ 1503.156617] sock_sendmsg+0x68/0x80 [ 1503.156708] ___sys_sendmsg+0x431/0x4b0 [ 1503.156799] __sys_sendmsg+0xa4/0x130 [ 1503.156889] do_syscall_64+0x171/0x3f0 [ 1503.156980] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 1503.157158] The buggy address belongs to the object at ffff8800298ab600 which belongs to the cache kmalloc-1024 of size 1024 [ 1503.157444] The buggy address is located 176 bytes inside of 1024-byte region [ffff8800298ab600, ffff8800298aba00) [ 1503.157702] The buggy address belongs to the page: [ 1503.157820] page:ffffea0000a62a00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0 [ 1503.158053] flags: 0x4000000000008100(slab|head) [ 1503.158171] raw: 4000000000008100 0000000000000000 0000000000000000 00000001800e000e [ 1503.158350] raw: dead000000000100 dead000000000200 ffff880036002600 0000000000000000 [ 1503.158523] page dumped because: kasan: bad access detected
[ 1503.158698] Memory state around the buggy address: [ 1503.158816] ffff8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 1503.158988] ffff8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 1503.159165] >ffff8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1503.159338] ^ [ 1503.159436] ffff8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 1503.159610] ffff8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 1503.159785] ================================================================== [ 1503.159964] Disabling lock debugging due to kernel taint
The test scenario to trigger the issue consists of 4 devices:
- H0: data sender, connected to LAN0
- H1: data receiver, connected to LAN1
- GW0 and GW1: routers between LAN0 and LAN1; both of them have an ethernet connection on LAN0 and LAN1
On H{0,1} set GW0 as the default gateway, while on GW0 set GW1 as the next hop for data from LAN0 to LAN1. Moreover, create an ip6ip6 tunnel between H0 and H1 and send 3 concurrent data streams (TCP/UDP/SCTP) from H0 to H1 through the ip6ip6 tunnel (the send buffer size is set to 16K). While the data streams are active, flush the route cache on HA multiple times. I have not been able to identify a given commit that introduced the issue since, using the reproducer described above, the KASAN report has been triggered from 4.14 and I have not gone back further.
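The difference between a memcpy() from the header pointer and skb_copy_bits() can be sketched in userspace (a made-up frag_buf type with a single fragment, not the real skb layout): the copy has to walk the fragments once the requested length runs past the linear head:

#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct frag_buf {
    const char *head;   /* linear part */
    size_t head_len;
    const char *frag;   /* one non-linear fragment, for illustration */
    size_t frag_len;
};

/* Copy len bytes starting at offset, spanning head and fragment; a plain
 * memcpy(to, head + offset, len) would read past the head buffer. */
static int copy_bits(const struct frag_buf *b, size_t offset, char *to, size_t len)
{
    if (offset + len > b->head_len + b->frag_len)
        return -1;
    while (len) {
        size_t n;

        if (offset < b->head_len) {
            n = b->head_len - offset;
            if (n > len)
                n = len;
            memcpy(to, b->head + offset, n);
        } else {
            size_t foff = offset - b->head_len;

            n = b->frag_len - foff;
            if (n > len)
                n = len;
            memcpy(to, b->frag + foff, n);
        }
        to += n;
        offset += n;
        len -= n;
    }
    return 0;
}

int main(void)
{
    struct frag_buf b = { "header-", 7, "payload", 7 };
    char out[15] = "";

    copy_bits(&b, 0, out, 14);
    out[14] = '\0';
    printf("%s\n", out);    /* header-payload */
    return 0;
}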
Reported-by: Jianlin Shi jishi@redhat.com Reviewed-by: Stefano Brivio sbrivio@redhat.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Lorenzo Bianconi lorenzo.bianconi@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ndisc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -1554,7 +1554,8 @@ static void ndisc_fill_redirect_hdr_opti *(opt++) = (rd_len >> 3); opt += 6;
- memcpy(opt, ipv6_hdr(orig_skb), rd_len - 8); + skb_copy_bits(orig_skb, skb_network_offset(orig_skb), opt, + rd_len - 8); }
void ndisc_send_redirect(struct sk_buff *skb, const struct in6_addr *target)
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 17cfe79a65f98abe535261856c5aef14f306dff7 ]
syzkaller found an issue caused by lack of sufficient checks in l2tp_tunnel_create()
RAW sockets cannot be considered UDP ones, for instance.
In another patch, we shall replace all pr_err() calls by less intrusive pr_debug() so that syzkaller can find other bugs faster.
Acked-by: Guillaume Nault g.nault@alphalink.fr
Acked-by: James Chapman jchapman@katalix.com
================================================================== BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69 dst_release: dst:00000000d53d0d0f refcnt:-1 Write of size 1 at addr ffff8801d013b798 by task syz-executor3/6242
CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report+0x23b/0x360 mm/kasan/report.c:412 __asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435 setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69 l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596 pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707 SYSC_connect+0x213/0x4a0 net/socket.c:1640 SyS_connect+0x24/0x30 net/socket.c:1621 do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7
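The essence of the added check can be reproduced from userspace with getsockopt(SO_TYPE) (a sketch, not the kernel code path): a tunnel fd must be SOCK_DGRAM before any per-encapsulation protocol check makes sense:

#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

static int check_tunnel_fd(int fd)
{
    int type;
    socklen_t len = sizeof(type);

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -errno;
    if (type != SOCK_DGRAM)
        return -EPROTONOSUPPORT;    /* a raw socket must not be accepted */
    return 0;
}

int main(void)
{
    int udp = socket(AF_INET, SOCK_DGRAM, 0);
    int raw = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);   /* needs CAP_NET_RAW, may fail */

    printf("udp: %d\n", check_tunnel_fd(udp));
    if (raw >= 0)
        printf("raw: %d\n", check_tunnel_fd(raw));
    return 0;
}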
Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/l2tp/l2tp_core.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -1466,9 +1466,14 @@ int l2tp_tunnel_create(struct net *net, encap = cfg->encap;
/* Quick sanity checks */ + err = -EPROTONOSUPPORT; + if (sk->sk_type != SOCK_DGRAM) { + pr_debug("tunl %hu: fd %d wrong socket type\n", + tunnel_id, fd); + goto err; + } switch (encap) { case L2TP_ENCAPTYPE_UDP: - err = -EPROTONOSUPPORT; if (sk->sk_protocol != IPPROTO_UDP) { pr_err("tunl %hu: fd %d wrong protocol, got %d, expected %d\n", tunnel_id, fd, sk->sk_protocol, IPPROTO_UDP); @@ -1476,7 +1481,6 @@ int l2tp_tunnel_create(struct net *net, } break; case L2TP_ENCAPTYPE_IP: - err = -EPROTONOSUPPORT; if (sk->sk_protocol != IPPROTO_L2TP) { pr_err("tunl %hu: fd %d wrong protocol, got %d, expected %d\n", tunnel_id, fd, sk->sk_protocol, IPPROTO_L2TP);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 00777fac28ba3e126b9e63e789a613e8bd2cab25 ]
If the optional regulator is deferred, we must release some resources. They will be re-allocated when the probe function is called again.
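A compact userspace sketch of the unwind pattern the fix applies (EPROBE_DEFER is defined locally since it is a kernel-internal errno; the clk_* helpers are stand-ins): once an earlier step succeeded, the deferral path must go through the same cleanup label as any other error:

#include <stdbool.h>
#include <stdio.h>

#define EPROBE_DEFER 517    /* kernel-internal errno value, defined here only for the sketch */

static int clk_enable(void)   { printf("clk enabled\n");  return 0; }
static void clk_disable(void) { printf("clk disabled\n"); }

static int probe(bool regulator_deferred)
{
    int err;

    err = clk_enable();
    if (err)
        return err;

    if (regulator_deferred) {
        err = -EPROBE_DEFER;
        goto out_clk_disable;   /* the fix: don't return with the clock left on */
    }

    printf("probed\n");
    return 0;

out_clk_disable:
    clk_disable();
    return err;
}

int main(void)
{
    return probe(true) ? 1 : 0;
}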
Fixes: 6eacf31139bf ("ethernet: arc: Add support for Rockchip SoC layer device tree bindings") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/arc/emac_rockchip.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/arc/emac_rockchip.c +++ b/drivers/net/ethernet/arc/emac_rockchip.c @@ -169,8 +169,10 @@ static int emac_rockchip_probe(struct pl /* Optional regulator for PHY */ priv->regulator = devm_regulator_get_optional(dev, "phy"); if (IS_ERR(priv->regulator)) { - if (PTR_ERR(priv->regulator) == -EPROBE_DEFER) - return -EPROBE_DEFER; + if (PTR_ERR(priv->regulator) == -EPROBE_DEFER) { + err = -EPROBE_DEFER; + goto out_clk_disable; + } dev_err(dev, "no regulator found\n"); priv->regulator = NULL; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit a069215cf5985f3aa1bba550264907d6bd05c5f7 ]
When unbinding/removing the driver, we will run into the following warnings:
[ 259.655198] fec 400d1000.ethernet: 400d1000.ethernet supply phy not found, using dummy regulator [ 259.665065] fec 400d1000.ethernet: Unbalanced pm_runtime_enable! [ 259.672770] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00 [ 259.683062] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: f2:3e:93:b7:29:c1 [ 259.696239] libphy: fec_enet_mii_bus: probed
Avoid these warnings by balancing the runtime PM calls during fec_drv_remove().
Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/fec_main.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -3578,6 +3578,8 @@ fec_drv_remove(struct platform_device *p fec_enet_mii_remove(fep); if (fep->reg_phy) regulator_disable(fep->reg_phy); + pm_runtime_put(&pdev->dev); + pm_runtime_disable(&pdev->dev); if (of_phy_is_fixed_link(np)) of_phy_deregister_fixed_link(np); of_node_put(fep->phy_node);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arvind Yadav arvind.yadav.cs@gmail.com
[ Upstream commit fa6a91e9b907231d2e38ea5ed89c537b3525df3d ]
Free memory by calling put_device() if afiucv_iucv_init() is not successful.
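The refcounting shape behind the fix can be sketched with a made-up dev type (not the driver-core API): after a failed registration the caller still has to drop its reference with a put so the object's release actually runs, rather than skipping cleanup:

#include <stdio.h>
#include <stdlib.h>

struct dev {
    int refcount;
};

static struct dev *dev_get(struct dev *d) { d->refcount++; return d; }

static void dev_put(struct dev *d)
{
    if (--d->refcount == 0) {
        printf("releasing dev\n");
        free(d);
    }
}

/* Registration holds its own reference; on failure it drops only that one. */
static int dev_register(struct dev *d, int fail)
{
    dev_get(d);
    if (fail) {
        dev_put(d);
        return -1;
    }
    return 0;
}

int main(void)
{
    struct dev *d = calloc(1, sizeof(*d));

    d->refcount = 1;            /* the caller's initial reference */
    if (dev_register(d, 1)) {
        dev_put(d);             /* the fix: drop our reference on the error path */
        return 1;
    }
    dev_put(d);                 /* normal teardown path */
    return 0;
}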
Signed-off-by: Arvind Yadav arvind.yadav.cs@gmail.com Reviewed-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Ursula Braun ursula.braun@de.ibm.com Signed-off-by: Julian Wiedmann jwi@linux.vnet.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/iucv/af_iucv.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/iucv/af_iucv.c +++ b/net/iucv/af_iucv.c @@ -2433,9 +2433,11 @@ static int afiucv_iucv_init(void) af_iucv_dev->driver = &af_iucv_driver; err = device_register(af_iucv_dev); if (err) - goto out_driver; + goto out_iucv_dev; return 0;
+out_iucv_dev: + put_device(af_iucv_dev); out_driver: driver_unregister(&af_iucv_driver); out_iucv:
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Dichtel nicolas.dichtel@6wind.com
[ Upstream commit 02a2385f37a7c6594c9d89b64c4a1451276f08eb ]
nlmsg_multicast() always consumes the skb, thus the original skb must be freed only when this function is called with a clone.
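The ownership rule can be illustrated with a small userspace sketch (made-up names; strdup/free standing in for skb allocation and kfree_skb): once a helper consumes its buffer on both success and failure, the caller only propagates the error and must not free again:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Takes ownership of buf unconditionally, like nlmsg_multicast() does. */
static int send_consuming(char *buf)
{
    int err = buf[0] ? 0 : -1;  /* pretend empty messages fail */

    free(buf);
    return err;
}

static int mcast(const char *msg)
{
    char *buf = strdup(msg);

    if (!buf)
        return -1;
    /* Correct: just propagate the error, no free(buf) here. */
    return send_consuming(buf);
}

int main(void)
{
    printf("%d\n", mcast("hello"));
    return 0;
}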
Fixes: cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()") Reported-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netlink/genetlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -1106,7 +1106,7 @@ static int genlmsg_mcast(struct sk_buff if (!err) delivered = true; else if (err != -ESRCH) - goto error; + return err; return delivered ? 0 : -ESRCH; error: kfree_skb(skb);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Ahern dsahern@gmail.com
[ Upstream commit 2cbb4ea7de167b02ffa63e9cdfdb07a7e7094615 ]
Only allow ifindex from IP_PKTINFO to override SO_BINDTODEVICE settings if the index is actually set in the message.
Signed-off-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_sockglue.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -258,7 +258,8 @@ int ip_cmsg_send(struct sock *sk, struct src_info = (struct in6_pktinfo *)CMSG_DATA(cmsg); if (!ipv6_addr_v4mapped(&src_info->ipi6_addr)) return -EINVAL; - ipc->oif = src_info->ipi6_ifindex; + if (src_info->ipi6_ifindex) + ipc->oif = src_info->ipi6_ifindex; ipc->addr = src_info->ipi6_addr.s6_addr32[3]; continue; } @@ -288,7 +289,8 @@ int ip_cmsg_send(struct sock *sk, struct if (cmsg->cmsg_len != CMSG_LEN(sizeof(struct in_pktinfo))) return -EINVAL; info = (struct in_pktinfo *)CMSG_DATA(cmsg); - ipc->oif = info->ipi_ifindex; + if (info->ipi_ifindex) + ipc->oif = info->ipi_ifindex; ipc->addr = info->ipi_spec_dst.s_addr; break; }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 484d802d0f2f29c335563fcac2a8facf174a1bbc ]
There is no need for complex checking between the last consumed index and the current consumed index; a simple subtraction will do.
This also eliminates the possibility of a permanent transmit queue stall under the following conditions:
- one CPU bursts ring->size worth of traffic (up to 256 buffers), to the point where we run out of free descriptors, so we stop the transmit queue at the end of bcm_sysport_xmit()
- because of our locking, we have the transmit process disable interrupts which means we can be blocking the TX reclamation process
- when TX reclamation finally runs, we will be computing the difference between ring->c_index (last consumed index by SW) and what the HW reports through its register
- this register is masked with (ring->size - 1) = 0xff, which will lead to stripping the upper bits of the index (the register is 16 bits wide)
- we will be computing last_tx_cn as 0, which means there is no work to be done, and we never wake up the transmit queue, leaving it permanently disabled
A practical example: ring->c_index (aka last_c_index) = 12; we pushed 256 entries; the HW consumer index = 268; we mask it with 0xff = 12, so last_tx_cn == 0 and nothing happens.
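The arithmetic of the rewrite can be checked with a few lines of userspace C (the 0xffff mask is assumed from the 16-bit register mentioned above): a masked subtraction of free-running indices gives the number of completed descriptors even across wraparound, with no case split:

#include <stdio.h>

#define CONS_INDEX_MASK 0xffff  /* 16-bit consumer index, per the commit text */

static unsigned int txbds_ready(unsigned int hw_c_index, unsigned int sw_c_index)
{
    return (hw_c_index - sw_c_index) & CONS_INDEX_MASK;
}

int main(void)
{
    /* the example above: SW at 12, HW reports 268 -> 256 descriptors ready */
    printf("%u\n", txbds_ready(268, 12));
    /* wraparound: the HW index wrapped past 0xffff */
    printf("%u\n", txbds_ready(5, 0xfffb));
    return 0;
}

Run as-is it prints 256 and 10, i.e. both the commit-message example and a wraparound case come out right.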
Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bcmsysport.c | 33 +++++++++++++---------------- drivers/net/ethernet/broadcom/bcmsysport.h | 2 - 2 files changed, 16 insertions(+), 19 deletions(-)
--- a/drivers/net/ethernet/broadcom/bcmsysport.c +++ b/drivers/net/ethernet/broadcom/bcmsysport.c @@ -855,10 +855,12 @@ static void bcm_sysport_tx_reclaim_one(s static unsigned int __bcm_sysport_tx_reclaim(struct bcm_sysport_priv *priv, struct bcm_sysport_tx_ring *ring) { - unsigned int c_index, last_c_index, last_tx_cn, num_tx_cbs; unsigned int pkts_compl = 0, bytes_compl = 0; struct net_device *ndev = priv->netdev; + unsigned int txbds_processed = 0; struct bcm_sysport_cb *cb; + unsigned int txbds_ready; + unsigned int c_index; u32 hw_ind;
/* Clear status before servicing to reduce spurious interrupts */ @@ -871,29 +873,23 @@ static unsigned int __bcm_sysport_tx_rec /* Compute how many descriptors have been processed since last call */ hw_ind = tdma_readl(priv, TDMA_DESC_RING_PROD_CONS_INDEX(ring->index)); c_index = (hw_ind >> RING_CONS_INDEX_SHIFT) & RING_CONS_INDEX_MASK; - ring->p_index = (hw_ind & RING_PROD_INDEX_MASK); - - last_c_index = ring->c_index; - num_tx_cbs = ring->size; - - c_index &= (num_tx_cbs - 1); - - if (c_index >= last_c_index) - last_tx_cn = c_index - last_c_index; - else - last_tx_cn = num_tx_cbs - last_c_index + c_index; + txbds_ready = (c_index - ring->c_index) & RING_CONS_INDEX_MASK;
netif_dbg(priv, tx_done, ndev, - "ring=%d c_index=%d last_tx_cn=%d last_c_index=%d\n", - ring->index, c_index, last_tx_cn, last_c_index); + "ring=%d old_c_index=%u c_index=%u txbds_ready=%u\n", + ring->index, ring->c_index, c_index, txbds_ready);
- while (last_tx_cn-- > 0) { - cb = ring->cbs + last_c_index; + while (txbds_processed < txbds_ready) { + cb = &ring->cbs[ring->clean_index]; bcm_sysport_tx_reclaim_one(ring, cb, &bytes_compl, &pkts_compl);
ring->desc_count++; - last_c_index++; - last_c_index &= (num_tx_cbs - 1); + txbds_processed++; + + if (likely(ring->clean_index < ring->size - 1)) + ring->clean_index++; + else + ring->clean_index = 0; }
u64_stats_update_begin(&priv->syncp); @@ -1406,6 +1402,7 @@ static int bcm_sysport_init_tx_ring(stru netif_tx_napi_add(priv->netdev, &ring->napi, bcm_sysport_tx_poll, 64); ring->index = index; ring->size = size; + ring->clean_index = 0; ring->alloc_size = ring->size; ring->desc_cpu = p; ring->desc_count = ring->size; --- a/drivers/net/ethernet/broadcom/bcmsysport.h +++ b/drivers/net/ethernet/broadcom/bcmsysport.h @@ -706,7 +706,7 @@ struct bcm_sysport_tx_ring { unsigned int desc_count; /* Number of descriptors */ unsigned int curr_desc; /* Current descriptor */ unsigned int c_index; /* Last consumer index */ - unsigned int p_index; /* Current producer index */ + unsigned int clean_index; /* Current clean index */ struct bcm_sysport_cb *cbs; /* Transmit control blocks */ struct dma_desc *desc_cpu; /* CPU view of the descriptor */ struct bcm_sysport_priv *priv; /* private context backpointer */
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Kalderon Michal.Kalderon@cavium.com
[ Upstream commit 4609adc27175839408359822523de7247d56c87f ]
Link updates were not reported to qedr correctly, leading to cases where a link could be down but qedr would see it as up. In addition, once qede was loaded, the link state would be up regardless of the actual link state.
Signed-off-by: Michal Kalderon michal.kalderon@cavium.com Signed-off-by: Ariel Elior ariel.elior@cavium.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/qlogic/qede/qede_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c +++ b/drivers/net/ethernet/qlogic/qede/qede_main.c @@ -2066,8 +2066,6 @@ static int qede_load(struct qede_dev *ed link_params.link_up = true; edev->ops->common->set_link(edev->cdev, &link_params);
- qede_rdma_dev_event_open(edev); - edev->state = QEDE_STATE_OPEN;
DP_INFO(edev, "Ending successfully qede load\n"); @@ -2168,12 +2166,14 @@ static void qede_link_update(void *dev, DP_NOTICE(edev, "Link is up\n"); netif_tx_start_all_queues(edev->ndev); netif_carrier_on(edev->ndev); + qede_rdma_dev_event_open(edev); } } else { if (netif_carrier_ok(edev->ndev)) { DP_NOTICE(edev, "Link is down\n"); netif_tx_disable(edev->ndev); netif_carrier_off(edev->ndev); + qede_rdma_dev_event_close(edev); } } }
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vinicius Costa Gomes vinicius.gomes@intel.com
[ Upstream commit 6e5d58fdc9bedd0255a8781b258f10bbdc63e975 ]
When errors are enqueued to the error queue via the sock_queue_err_skb() function, it is possible that the waiting application is not notified.
Calling 'sk->sk_data_ready()' would not notify applications that selected only POLLERR events in poll() (for example).
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Randy E. Witt randy.e.witt@intel.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Vinicius Costa Gomes vinicius.gomes@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/skbuff.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -4173,7 +4173,7 @@ int sock_queue_err_skb(struct sock *sk,
skb_queue_tail(&sk->sk_error_queue, skb); if (!sock_flag(sk, SOCK_DEAD)) - sk->sk_data_ready(sk); + sk->sk_error_report(sk); return 0; } EXPORT_SYMBOL(sock_queue_err_skb);
4.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arkadi Sharshevsky arkadis@mellanox.com
[ Upstream commit cbcc607e18422555db569b593608aec26111cb0b ]
__send_and_alloc_skb() receives an skb ptr as a parameter, but in case it fails the skb is not valid:
- the send failed and released the skb internally, or
- the allocation failed.
The current code tries to release the skb in case of failure, which causes a redundant free.
Fixes: 9b00cf2d1024 ("team: implement multipart netlink messages for options transfers") Signed-off-by: Arkadi Sharshevsky arkadis@mellanox.com Acked-by: Jiri Pirko jiri@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/team/team.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2395,7 +2395,7 @@ send_done: if (!nlh) { err = __send_and_alloc_skb(&skb, team, portid, send_func); if (err) - goto errout; + return err; goto send_done; }
@@ -2681,7 +2681,7 @@ send_done: if (!nlh) { err = __send_and_alloc_skb(&skb, team, portid, send_func); if (err) - goto errout; + return err; goto send_done; }
On 03/29/2018 11:59 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.15.15 release. There are 47 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Mar 31 17:57:05 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.15.15-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.15.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Thu, Mar 29, 2018 at 05:09:57PM -0600, Shuah Khan wrote:
On 03/29/2018 11:59 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.15.15 release. There are 47 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Mar 31 17:57:05 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.15.15-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.15.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Thanks for testing all of these and letting me know.
greg k-h
On 29 March 2018 at 23:29, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.15.15 release. There are 47 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Mar 31 17:57:05 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.15.15-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.15.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm and x86_64.
NOTE: LTP test case clock_nanosleep02 is an intermittent failure on qemu_x86_64. libhugetlbfs truncate_above_4GB-2M-32 still fails on the arm32 x15 device.
Summary ------------------------------------------------------------------------
kernel: 4.15.15-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.15.y git commit: cfaa8b04a52648b3f18fb3d4f8a622c53ab32646 git describe: v4.15.14-48-gcfaa8b04a526 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.15-oe/build/v4.15.14-48...
No regressions (compared to build v4.15.14-48-g98a6610fb36e) ------------------------------------------------------------------------
Boards, architectures and test suites: -------------------------------------
dragonboard-410c - arm64 * boot - fail: 3, pass: 20 * kselftest - skip: 20, pass: 45 * libhugetlbfs - skip: 1, pass: 90 * ltp-cap_bounds-tests - pass: 2 * ltp-containers-tests - skip: 17, pass: 64 * ltp-fcntl-locktests-tests - pass: 2 * ltp-filecaps-tests - pass: 2 * ltp-fs-tests - skip: 2, pass: 61 * ltp-fs_bind-tests - pass: 2 * ltp-fs_perms_simple-tests - pass: 19 * ltp-fsx-tests - pass: 2 * ltp-hugetlb-tests - skip: 1, pass: 21 * ltp-io-tests - pass: 3 * ltp-ipc-tests - pass: 9 * ltp-math-tests - pass: 11 * ltp-nptl-tests - pass: 2 * ltp-pty-tests - pass: 4 * ltp-sched-tests - pass: 14 * ltp-securebits-tests - pass: 4 * ltp-syscalls-tests - skip: 148, pass: 1002 * ltp-timers-tests - skip: 1, pass: 12
hi6220-hikey - arm64 * boot - pass: 20 * kselftest - skip: 17, pass: 48 * libhugetlbfs - skip: 1, pass: 90 * ltp-cap_bounds-tests - pass: 2 * ltp-containers-tests - skip: 17, pass: 64 * ltp-fcntl-locktests-tests - pass: 2 * ltp-filecaps-tests - pass: 2 * ltp-fs-tests - skip: 2, pass: 61 * ltp-fs_bind-tests - pass: 2 * ltp-fs_perms_simple-tests - pass: 19 * ltp-fsx-tests - pass: 2 * ltp-hugetlb-tests - skip: 1, pass: 21 * ltp-io-tests - pass: 3 * ltp-ipc-tests - pass: 9 * ltp-math-tests - pass: 11 * ltp-nptl-tests - pass: 2 * ltp-pty-tests - pass: 4 * ltp-sched-tests - skip: 4, pass: 10 * ltp-securebits-tests - pass: 4 * ltp-syscalls-tests - skip: 151, pass: 999 * ltp-timers-tests - skip: 1, pass: 12
juno-r2 - arm64 * boot - pass: 20 * kselftest - skip: 17, pass: 48 * libhugetlbfs - skip: 1, pass: 90 * ltp-cap_bounds-tests - pass: 2 * ltp-containers-tests - skip: 17, pass: 64 * ltp-fcntl-locktests-tests - pass: 2 * ltp-filecaps-tests - pass: 2 * ltp-fs-tests - skip: 2, pass: 61 * ltp-fs_bind-tests - pass: 2 * ltp-fs_perms_simple-tests - pass: 19 * ltp-fsx-tests - pass: 2 * ltp-hugetlb-tests - pass: 22 * ltp-io-tests - pass: 3 * ltp-ipc-tests - pass: 9 * ltp-math-tests - pass: 11 * ltp-nptl-tests - pass: 2 * ltp-pty-tests - pass: 4 * ltp-sched-tests - skip: 4, pass: 10 * ltp-securebits-tests - pass: 4 * ltp-syscalls-tests - skip: 149, pass: 1001 * ltp-timers-tests - skip: 1, pass: 12
qemu_x86_64 * boot - pass: 22 * kselftest - skip: 23, pass: 57 * kselftest-vsyscall-mode-native - skip: 23, pass: 57 * kselftest-vsyscall-mode-none - skip: 23, pass: 57 * libhugetlbfs - skip: 1, pass: 90 * ltp-cap_bounds-tests - pass: 2 * ltp-containers-tests - skip: 17, pass: 64 * ltp-fcntl-locktests-tests - pass: 2 * ltp-filecaps-tests - pass: 2 * ltp-fs-tests - skip: 6, pass: 57 * ltp-fs_bind-tests - pass: 2 * ltp-fs_perms_simple-tests - pass: 19 * ltp-fsx-tests - pass: 2 * ltp-hugetlb-tests - pass: 22 * ltp-io-tests - pass: 3 * ltp-ipc-tests - pass: 9 * ltp-math-tests - pass: 11 * ltp-nptl-tests - pass: 2 * ltp-pty-tests - pass: 4 * ltp-sched-tests - skip: 1, pass: 13 * ltp-securebits-tests - pass: 4 * ltp-syscalls-tests - skip: 148, fail: 2, pass: 1000 * ltp-timers-tests - skip: 1, pass: 12
x15 - arm * boot - pass: 20 * kselftest - skip: 21, pass: 41 * libhugetlbfs - skip: 1, fail: 1, pass: 86 * ltp-cap_bounds-tests - pass: 2 * ltp-containers-tests - skip: 18, pass: 63 * ltp-fcntl-locktests-tests - pass: 2 * ltp-filecaps-tests - pass: 2 * ltp-fs-tests - skip: 2, pass: 61 * ltp-fs_bind-tests - pass: 2 * ltp-fs_perms_simple-tests - pass: 19 * ltp-fsx-tests - pass: 2 * ltp-hugetlb-tests - skip: 2, pass: 20 * ltp-io-tests - pass: 3 * ltp-ipc-tests - pass: 9 * ltp-math-tests - pass: 11 * ltp-nptl-tests - pass: 2 * ltp-pty-tests - pass: 4 * ltp-sched-tests - skip: 1, pass: 13 * ltp-securebits-tests - pass: 4 * ltp-syscalls-tests - skip: 97, pass: 1053 * ltp-timers-tests - skip: 1, pass: 12
x86_64 * boot - pass: 22 * kselftest - skip: 19, pass: 61 * kselftest-vsyscall-mode-native - skip: 19, pass: 61 * kselftest-vsyscall-mode-none - skip: 19, fail: 1, pass: 60 * libhugetlbfs - skip: 1, pass: 90 * ltp-cap_bounds-tests - pass: 2 * ltp-containers-tests - skip: 17, pass: 63 * ltp-fcntl-locktests-tests - pass: 2 * ltp-filecaps-tests - pass: 2 * ltp-fs-tests - skip: 1, pass: 62 * ltp-fs_bind-tests - pass: 2 * ltp-fs_perms_simple-tests - pass: 19 * ltp-fsx-tests - pass: 2 * ltp-hugetlb-tests - pass: 22 * ltp-io-tests - pass: 3 * ltp-ipc-tests - pass: 9 * ltp-math-tests - pass: 11 * ltp-nptl-tests - pass: 2 * ltp-pty-tests - pass: 4 * ltp-sched-tests - skip: 5, pass: 9 * ltp-securebits-tests - pass: 4 * ltp-syscalls-tests - skip: 119, pass: 1031 * ltp-timers-tests - skip: 1, pass: 12
-- Linaro QA (beta) https://qa-reports.linaro.org
On Fri, Mar 30, 2018 at 01:41:30PM +0530, Naresh Kamboju wrote:
On 29 March 2018 at 23:29, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.15.15 release. There are 47 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Mar 31 17:57:05 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.15.15-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.15.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm and x86_64.
NOTE: LTP test case clock_nanosleep02 is an intermittent failure on qemu_x86_64. libhugetlbfs truncate_above_4GB-2M-32 still fails on the arm32 x15 device.
Thanks for testing these and letting me know.
greg k-h
On 03/29/2018 10:59 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.15.15 release. There are 47 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Mar 31 17:57:05 UTC 2018. Anything received after that time might be too late.
For v4.15.14-48-gcfaa8b0:
Build results: total: 147 pass: 147 fail: 0 Qemu test results: total: 141 pass: 141 fail: 0
Guenter
On Fri, Mar 30, 2018 at 08:20:59AM -0700, Guenter Roeck wrote:
On 03/29/2018 10:59 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.15.15 release. There are 47 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Mar 31 17:57:05 UTC 2018. Anything received after that time might be too late.
For v4.15.14-48-gcfaa8b0:
Build results: total: 147 pass: 147 fail: 0 Qemu test results: total: 141 pass: 141 fail: 0
Yeah! Thanks for testing all of these and letting me know.
greg k-h
linux-stable-mirror@lists.linaro.org