Hi,
Can you apply these three patches, one for each of the 5.10, 5.15, and
5.18 stable tree? Doesn't fix any issues of concern, just ensures that
we -EINVAL when invalid fields are set in the sqe for these opcodes.
This brings it up to par with 5.19 and newer.
Thanks!
--
Jens Axboe
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 665030fd0c1ed9f505932e6e73e7a2c788787a0a Mon Sep 17 00:00:00 2001
From: Petr Machata <petrm(a)nvidia.com>
Date: Wed, 29 Jun 2022 10:02:05 +0300
Subject: [PATCH] mlxsw: spectrum_router: Fix rollback in tunnel next hop init
In mlxsw_sp_nexthop6_init(), a next hop is always added to the router
linked list, and mlxsw_sp_nexthop_type_init() is invoked afterwards. When
that function results in an error, the next hop will not have been removed
from the linked list. As the error is propagated upwards and the caller
frees the next hop object, the linked list ends up holding an invalid
object.
A similar issue comes up with mlxsw_sp_nexthop4_init(), where rollback
block does exist, however does not include the linked list removal.
Both IPv6 and IPv4 next hops have a similar issue with next-hop counter
rollbacks. As these were introduced in the same patchset as the next hop
linked list, include the cleanup in this patch.
Fixes: dbe4598c1e92 ("mlxsw: spectrum_router: Keep nexthops in a linked list")
Fixes: a5390278a5eb ("mlxsw: spectrum: Add support for setting counters on nexthops")
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Amit Cohen <amcohen(a)nvidia.com>
Signed-off-by: Ido Schimmel <idosch(a)nvidia.com>
Link: https://lore.kernel.org/r/20220629070205.803952-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 9dbb573d53ea..0d8a0068e4ca 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -4415,6 +4415,8 @@ static int mlxsw_sp_nexthop4_init(struct mlxsw_sp *mlxsw_sp,
return 0;
err_nexthop_neigh_init:
+ list_del(&nh->router_list_node);
+ mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
mlxsw_sp_nexthop_remove(mlxsw_sp, nh);
return err;
}
@@ -6740,6 +6742,7 @@ static int mlxsw_sp_nexthop6_init(struct mlxsw_sp *mlxsw_sp,
const struct fib6_info *rt)
{
struct net_device *dev = rt->fib6_nh->fib_nh_dev;
+ int err;
nh->nhgi = nh_grp->nhgi;
nh->nh_weight = rt->fib6_nh->fib_nh_weight;
@@ -6755,7 +6758,16 @@ static int mlxsw_sp_nexthop6_init(struct mlxsw_sp *mlxsw_sp,
return 0;
nh->ifindex = dev->ifindex;
- return mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
+ err = mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
+ if (err)
+ goto err_nexthop_type_init;
+
+ return 0;
+
+err_nexthop_type_init:
+ list_del(&nh->router_list_node);
+ mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
+ return err;
}
static void mlxsw_sp_nexthop6_fini(struct mlxsw_sp *mlxsw_sp,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 665030fd0c1ed9f505932e6e73e7a2c788787a0a Mon Sep 17 00:00:00 2001
From: Petr Machata <petrm(a)nvidia.com>
Date: Wed, 29 Jun 2022 10:02:05 +0300
Subject: [PATCH] mlxsw: spectrum_router: Fix rollback in tunnel next hop init
In mlxsw_sp_nexthop6_init(), a next hop is always added to the router
linked list, and mlxsw_sp_nexthop_type_init() is invoked afterwards. When
that function results in an error, the next hop will not have been removed
from the linked list. As the error is propagated upwards and the caller
frees the next hop object, the linked list ends up holding an invalid
object.
A similar issue comes up with mlxsw_sp_nexthop4_init(), where rollback
block does exist, however does not include the linked list removal.
Both IPv6 and IPv4 next hops have a similar issue with next-hop counter
rollbacks. As these were introduced in the same patchset as the next hop
linked list, include the cleanup in this patch.
Fixes: dbe4598c1e92 ("mlxsw: spectrum_router: Keep nexthops in a linked list")
Fixes: a5390278a5eb ("mlxsw: spectrum: Add support for setting counters on nexthops")
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Amit Cohen <amcohen(a)nvidia.com>
Signed-off-by: Ido Schimmel <idosch(a)nvidia.com>
Link: https://lore.kernel.org/r/20220629070205.803952-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 9dbb573d53ea..0d8a0068e4ca 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -4415,6 +4415,8 @@ static int mlxsw_sp_nexthop4_init(struct mlxsw_sp *mlxsw_sp,
return 0;
err_nexthop_neigh_init:
+ list_del(&nh->router_list_node);
+ mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
mlxsw_sp_nexthop_remove(mlxsw_sp, nh);
return err;
}
@@ -6740,6 +6742,7 @@ static int mlxsw_sp_nexthop6_init(struct mlxsw_sp *mlxsw_sp,
const struct fib6_info *rt)
{
struct net_device *dev = rt->fib6_nh->fib_nh_dev;
+ int err;
nh->nhgi = nh_grp->nhgi;
nh->nh_weight = rt->fib6_nh->fib_nh_weight;
@@ -6755,7 +6758,16 @@ static int mlxsw_sp_nexthop6_init(struct mlxsw_sp *mlxsw_sp,
return 0;
nh->ifindex = dev->ifindex;
- return mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
+ err = mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
+ if (err)
+ goto err_nexthop_type_init;
+
+ return 0;
+
+err_nexthop_type_init:
+ list_del(&nh->router_list_node);
+ mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
+ return err;
}
static void mlxsw_sp_nexthop6_fini(struct mlxsw_sp *mlxsw_sp,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 665030fd0c1ed9f505932e6e73e7a2c788787a0a Mon Sep 17 00:00:00 2001
From: Petr Machata <petrm(a)nvidia.com>
Date: Wed, 29 Jun 2022 10:02:05 +0300
Subject: [PATCH] mlxsw: spectrum_router: Fix rollback in tunnel next hop init
In mlxsw_sp_nexthop6_init(), a next hop is always added to the router
linked list, and mlxsw_sp_nexthop_type_init() is invoked afterwards. When
that function results in an error, the next hop will not have been removed
from the linked list. As the error is propagated upwards and the caller
frees the next hop object, the linked list ends up holding an invalid
object.
A similar issue comes up with mlxsw_sp_nexthop4_init(), where rollback
block does exist, however does not include the linked list removal.
Both IPv6 and IPv4 next hops have a similar issue with next-hop counter
rollbacks. As these were introduced in the same patchset as the next hop
linked list, include the cleanup in this patch.
Fixes: dbe4598c1e92 ("mlxsw: spectrum_router: Keep nexthops in a linked list")
Fixes: a5390278a5eb ("mlxsw: spectrum: Add support for setting counters on nexthops")
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Amit Cohen <amcohen(a)nvidia.com>
Signed-off-by: Ido Schimmel <idosch(a)nvidia.com>
Link: https://lore.kernel.org/r/20220629070205.803952-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 9dbb573d53ea..0d8a0068e4ca 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -4415,6 +4415,8 @@ static int mlxsw_sp_nexthop4_init(struct mlxsw_sp *mlxsw_sp,
return 0;
err_nexthop_neigh_init:
+ list_del(&nh->router_list_node);
+ mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
mlxsw_sp_nexthop_remove(mlxsw_sp, nh);
return err;
}
@@ -6740,6 +6742,7 @@ static int mlxsw_sp_nexthop6_init(struct mlxsw_sp *mlxsw_sp,
const struct fib6_info *rt)
{
struct net_device *dev = rt->fib6_nh->fib_nh_dev;
+ int err;
nh->nhgi = nh_grp->nhgi;
nh->nh_weight = rt->fib6_nh->fib_nh_weight;
@@ -6755,7 +6758,16 @@ static int mlxsw_sp_nexthop6_init(struct mlxsw_sp *mlxsw_sp,
return 0;
nh->ifindex = dev->ifindex;
- return mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
+ err = mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
+ if (err)
+ goto err_nexthop_type_init;
+
+ return 0;
+
+err_nexthop_type_init:
+ list_del(&nh->router_list_node);
+ mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
+ return err;
}
static void mlxsw_sp_nexthop6_fini(struct mlxsw_sp *mlxsw_sp,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From adabdd8f6acabc0c3fdbba2e7f5a2edd9c5ef22d Mon Sep 17 00:00:00 2001
From: katrinzhou <katrinzhou(a)tencent.com>
Date: Tue, 28 Jun 2022 11:50:30 +0800
Subject: [PATCH] ipv6/sit: fix ipip6_tunnel_get_prl return value
When kcalloc fails, ipip6_tunnel_get_prl() should return -ENOMEM.
Move the position of label "out" to return correctly.
Addresses-Coverity: ("Unused value")
Fixes: 300aaeeaab5f ("[IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.")
Signed-off-by: katrinzhou <katrinzhou(a)tencent.com>
Reviewed-by: Eric Dumazet<edumazet(a)google.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20220628035030.1039171-1-zys.zljxml@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index c0b138c20992..6bcd5e419a08 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -323,8 +323,6 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
kcalloc(cmax, sizeof(*kp), GFP_KERNEL_ACCOUNT | __GFP_NOWARN) :
NULL;
- rcu_read_lock();
-
ca = min(t->prl_count, cmax);
if (!kp) {
@@ -341,7 +339,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
}
}
- c = 0;
+ rcu_read_lock();
for_each_prl_rcu(t->prl) {
if (c >= cmax)
break;
@@ -353,7 +351,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
if (kprl.addr != htonl(INADDR_ANY))
break;
}
-out:
+
rcu_read_unlock();
len = sizeof(*kp) * c;
@@ -362,7 +360,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
ret = -EFAULT;
kfree(kp);
-
+out:
return ret;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From adabdd8f6acabc0c3fdbba2e7f5a2edd9c5ef22d Mon Sep 17 00:00:00 2001
From: katrinzhou <katrinzhou(a)tencent.com>
Date: Tue, 28 Jun 2022 11:50:30 +0800
Subject: [PATCH] ipv6/sit: fix ipip6_tunnel_get_prl return value
When kcalloc fails, ipip6_tunnel_get_prl() should return -ENOMEM.
Move the position of label "out" to return correctly.
Addresses-Coverity: ("Unused value")
Fixes: 300aaeeaab5f ("[IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.")
Signed-off-by: katrinzhou <katrinzhou(a)tencent.com>
Reviewed-by: Eric Dumazet<edumazet(a)google.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20220628035030.1039171-1-zys.zljxml@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index c0b138c20992..6bcd5e419a08 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -323,8 +323,6 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
kcalloc(cmax, sizeof(*kp), GFP_KERNEL_ACCOUNT | __GFP_NOWARN) :
NULL;
- rcu_read_lock();
-
ca = min(t->prl_count, cmax);
if (!kp) {
@@ -341,7 +339,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
}
}
- c = 0;
+ rcu_read_lock();
for_each_prl_rcu(t->prl) {
if (c >= cmax)
break;
@@ -353,7 +351,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
if (kprl.addr != htonl(INADDR_ANY))
break;
}
-out:
+
rcu_read_unlock();
len = sizeof(*kp) * c;
@@ -362,7 +360,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
ret = -EFAULT;
kfree(kp);
-
+out:
return ret;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From adabdd8f6acabc0c3fdbba2e7f5a2edd9c5ef22d Mon Sep 17 00:00:00 2001
From: katrinzhou <katrinzhou(a)tencent.com>
Date: Tue, 28 Jun 2022 11:50:30 +0800
Subject: [PATCH] ipv6/sit: fix ipip6_tunnel_get_prl return value
When kcalloc fails, ipip6_tunnel_get_prl() should return -ENOMEM.
Move the position of label "out" to return correctly.
Addresses-Coverity: ("Unused value")
Fixes: 300aaeeaab5f ("[IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.")
Signed-off-by: katrinzhou <katrinzhou(a)tencent.com>
Reviewed-by: Eric Dumazet<edumazet(a)google.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20220628035030.1039171-1-zys.zljxml@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index c0b138c20992..6bcd5e419a08 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -323,8 +323,6 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
kcalloc(cmax, sizeof(*kp), GFP_KERNEL_ACCOUNT | __GFP_NOWARN) :
NULL;
- rcu_read_lock();
-
ca = min(t->prl_count, cmax);
if (!kp) {
@@ -341,7 +339,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
}
}
- c = 0;
+ rcu_read_lock();
for_each_prl_rcu(t->prl) {
if (c >= cmax)
break;
@@ -353,7 +351,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
if (kprl.addr != htonl(INADDR_ANY))
break;
}
-out:
+
rcu_read_unlock();
len = sizeof(*kp) * c;
@@ -362,7 +360,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
ret = -EFAULT;
kfree(kp);
-
+out:
return ret;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From adabdd8f6acabc0c3fdbba2e7f5a2edd9c5ef22d Mon Sep 17 00:00:00 2001
From: katrinzhou <katrinzhou(a)tencent.com>
Date: Tue, 28 Jun 2022 11:50:30 +0800
Subject: [PATCH] ipv6/sit: fix ipip6_tunnel_get_prl return value
When kcalloc fails, ipip6_tunnel_get_prl() should return -ENOMEM.
Move the position of label "out" to return correctly.
Addresses-Coverity: ("Unused value")
Fixes: 300aaeeaab5f ("[IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.")
Signed-off-by: katrinzhou <katrinzhou(a)tencent.com>
Reviewed-by: Eric Dumazet<edumazet(a)google.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20220628035030.1039171-1-zys.zljxml@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index c0b138c20992..6bcd5e419a08 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -323,8 +323,6 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
kcalloc(cmax, sizeof(*kp), GFP_KERNEL_ACCOUNT | __GFP_NOWARN) :
NULL;
- rcu_read_lock();
-
ca = min(t->prl_count, cmax);
if (!kp) {
@@ -341,7 +339,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
}
}
- c = 0;
+ rcu_read_lock();
for_each_prl_rcu(t->prl) {
if (c >= cmax)
break;
@@ -353,7 +351,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
if (kprl.addr != htonl(INADDR_ANY))
break;
}
-out:
+
rcu_read_unlock();
len = sizeof(*kp) * c;
@@ -362,7 +360,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
ret = -EFAULT;
kfree(kp);
-
+out:
return ret;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6f0012e35160cd08a53e46e3b3bbf724b92dfe68 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 23 Jun 2022 05:04:36 +0000
Subject: [PATCH] tcp: add a missing nf_reset_ct() in 3WHS handling
When the third packet of 3WHS connection establishment
contains payload, it is added into socket receive queue
without the XFRM check and the drop of connection tracking
context.
This means that if the data is left unread in the socket
receive queue, conntrack module can not be unloaded.
As most applications usually reads the incoming data
immediately after accept(), bug has been hiding for
quite a long time.
Commit 68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") exposed this bug because
even if the application reads this data, the skb
with nfct state could stay in a per-cpu cache for
an arbitrary time, if said cpu no longer process RX softirqs.
Many thanks to Ilya Maximets for reporting this issue,
and for testing various patches:
https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/
Note that I also added a missing xfrm4_policy_check() call,
although this is probably not a big issue, as the SYN
packet should have been dropped earlier.
Fixes: b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
Reported-by: Ilya Maximets <i.maximets(a)ovn.org>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Florian Westphal <fw(a)strlen.de>
Cc: Pablo Neira Ayuso <pablo(a)netfilter.org>
Cc: Steffen Klassert <steffen.klassert(a)secunet.com>
Tested-by: Ilya Maximets <i.maximets(a)ovn.org>
Reviewed-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220623050436.1290307-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fe8f23b95d32..da5a3c44c4fb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1964,7 +1964,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
struct sock *nsk;
sk = req->rsk_listener;
- drop_reason = tcp_inbound_md5_hash(sk, skb,
+ if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
+ drop_reason = SKB_DROP_REASON_XFRM_POLICY;
+ else
+ drop_reason = tcp_inbound_md5_hash(sk, skb,
&iph->saddr, &iph->daddr,
AF_INET, dif, sdif);
if (unlikely(drop_reason)) {
@@ -2016,6 +2019,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
}
goto discard_and_relse;
}
+ nf_reset_ct(skb);
if (nsk == sk) {
reqsk_put(req);
tcp_v4_restore_cb(skb);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6f0012e35160cd08a53e46e3b3bbf724b92dfe68 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 23 Jun 2022 05:04:36 +0000
Subject: [PATCH] tcp: add a missing nf_reset_ct() in 3WHS handling
When the third packet of 3WHS connection establishment
contains payload, it is added into socket receive queue
without the XFRM check and the drop of connection tracking
context.
This means that if the data is left unread in the socket
receive queue, conntrack module can not be unloaded.
As most applications usually reads the incoming data
immediately after accept(), bug has been hiding for
quite a long time.
Commit 68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") exposed this bug because
even if the application reads this data, the skb
with nfct state could stay in a per-cpu cache for
an arbitrary time, if said cpu no longer process RX softirqs.
Many thanks to Ilya Maximets for reporting this issue,
and for testing various patches:
https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/
Note that I also added a missing xfrm4_policy_check() call,
although this is probably not a big issue, as the SYN
packet should have been dropped earlier.
Fixes: b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
Reported-by: Ilya Maximets <i.maximets(a)ovn.org>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Florian Westphal <fw(a)strlen.de>
Cc: Pablo Neira Ayuso <pablo(a)netfilter.org>
Cc: Steffen Klassert <steffen.klassert(a)secunet.com>
Tested-by: Ilya Maximets <i.maximets(a)ovn.org>
Reviewed-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220623050436.1290307-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fe8f23b95d32..da5a3c44c4fb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1964,7 +1964,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
struct sock *nsk;
sk = req->rsk_listener;
- drop_reason = tcp_inbound_md5_hash(sk, skb,
+ if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
+ drop_reason = SKB_DROP_REASON_XFRM_POLICY;
+ else
+ drop_reason = tcp_inbound_md5_hash(sk, skb,
&iph->saddr, &iph->daddr,
AF_INET, dif, sdif);
if (unlikely(drop_reason)) {
@@ -2016,6 +2019,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
}
goto discard_and_relse;
}
+ nf_reset_ct(skb);
if (nsk == sk) {
reqsk_put(req);
tcp_v4_restore_cb(skb);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6f0012e35160cd08a53e46e3b3bbf724b92dfe68 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 23 Jun 2022 05:04:36 +0000
Subject: [PATCH] tcp: add a missing nf_reset_ct() in 3WHS handling
When the third packet of 3WHS connection establishment
contains payload, it is added into socket receive queue
without the XFRM check and the drop of connection tracking
context.
This means that if the data is left unread in the socket
receive queue, conntrack module can not be unloaded.
As most applications usually reads the incoming data
immediately after accept(), bug has been hiding for
quite a long time.
Commit 68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") exposed this bug because
even if the application reads this data, the skb
with nfct state could stay in a per-cpu cache for
an arbitrary time, if said cpu no longer process RX softirqs.
Many thanks to Ilya Maximets for reporting this issue,
and for testing various patches:
https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/
Note that I also added a missing xfrm4_policy_check() call,
although this is probably not a big issue, as the SYN
packet should have been dropped earlier.
Fixes: b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
Reported-by: Ilya Maximets <i.maximets(a)ovn.org>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Florian Westphal <fw(a)strlen.de>
Cc: Pablo Neira Ayuso <pablo(a)netfilter.org>
Cc: Steffen Klassert <steffen.klassert(a)secunet.com>
Tested-by: Ilya Maximets <i.maximets(a)ovn.org>
Reviewed-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220623050436.1290307-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fe8f23b95d32..da5a3c44c4fb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1964,7 +1964,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
struct sock *nsk;
sk = req->rsk_listener;
- drop_reason = tcp_inbound_md5_hash(sk, skb,
+ if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
+ drop_reason = SKB_DROP_REASON_XFRM_POLICY;
+ else
+ drop_reason = tcp_inbound_md5_hash(sk, skb,
&iph->saddr, &iph->daddr,
AF_INET, dif, sdif);
if (unlikely(drop_reason)) {
@@ -2016,6 +2019,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
}
goto discard_and_relse;
}
+ nf_reset_ct(skb);
if (nsk == sk) {
reqsk_put(req);
tcp_v4_restore_cb(skb);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6f0012e35160cd08a53e46e3b3bbf724b92dfe68 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 23 Jun 2022 05:04:36 +0000
Subject: [PATCH] tcp: add a missing nf_reset_ct() in 3WHS handling
When the third packet of 3WHS connection establishment
contains payload, it is added into socket receive queue
without the XFRM check and the drop of connection tracking
context.
This means that if the data is left unread in the socket
receive queue, conntrack module can not be unloaded.
As most applications usually reads the incoming data
immediately after accept(), bug has been hiding for
quite a long time.
Commit 68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") exposed this bug because
even if the application reads this data, the skb
with nfct state could stay in a per-cpu cache for
an arbitrary time, if said cpu no longer process RX softirqs.
Many thanks to Ilya Maximets for reporting this issue,
and for testing various patches:
https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/
Note that I also added a missing xfrm4_policy_check() call,
although this is probably not a big issue, as the SYN
packet should have been dropped earlier.
Fixes: b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
Reported-by: Ilya Maximets <i.maximets(a)ovn.org>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Florian Westphal <fw(a)strlen.de>
Cc: Pablo Neira Ayuso <pablo(a)netfilter.org>
Cc: Steffen Klassert <steffen.klassert(a)secunet.com>
Tested-by: Ilya Maximets <i.maximets(a)ovn.org>
Reviewed-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220623050436.1290307-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fe8f23b95d32..da5a3c44c4fb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1964,7 +1964,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
struct sock *nsk;
sk = req->rsk_listener;
- drop_reason = tcp_inbound_md5_hash(sk, skb,
+ if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
+ drop_reason = SKB_DROP_REASON_XFRM_POLICY;
+ else
+ drop_reason = tcp_inbound_md5_hash(sk, skb,
&iph->saddr, &iph->daddr,
AF_INET, dif, sdif);
if (unlikely(drop_reason)) {
@@ -2016,6 +2019,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
}
goto discard_and_relse;
}
+ nf_reset_ct(skb);
if (nsk == sk) {
reqsk_put(req);
tcp_v4_restore_cb(skb);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6f0012e35160cd08a53e46e3b3bbf724b92dfe68 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 23 Jun 2022 05:04:36 +0000
Subject: [PATCH] tcp: add a missing nf_reset_ct() in 3WHS handling
When the third packet of 3WHS connection establishment
contains payload, it is added into socket receive queue
without the XFRM check and the drop of connection tracking
context.
This means that if the data is left unread in the socket
receive queue, conntrack module can not be unloaded.
As most applications usually reads the incoming data
immediately after accept(), bug has been hiding for
quite a long time.
Commit 68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") exposed this bug because
even if the application reads this data, the skb
with nfct state could stay in a per-cpu cache for
an arbitrary time, if said cpu no longer process RX softirqs.
Many thanks to Ilya Maximets for reporting this issue,
and for testing various patches:
https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/
Note that I also added a missing xfrm4_policy_check() call,
although this is probably not a big issue, as the SYN
packet should have been dropped earlier.
Fixes: b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
Reported-by: Ilya Maximets <i.maximets(a)ovn.org>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Florian Westphal <fw(a)strlen.de>
Cc: Pablo Neira Ayuso <pablo(a)netfilter.org>
Cc: Steffen Klassert <steffen.klassert(a)secunet.com>
Tested-by: Ilya Maximets <i.maximets(a)ovn.org>
Reviewed-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220623050436.1290307-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fe8f23b95d32..da5a3c44c4fb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1964,7 +1964,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
struct sock *nsk;
sk = req->rsk_listener;
- drop_reason = tcp_inbound_md5_hash(sk, skb,
+ if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
+ drop_reason = SKB_DROP_REASON_XFRM_POLICY;
+ else
+ drop_reason = tcp_inbound_md5_hash(sk, skb,
&iph->saddr, &iph->daddr,
AF_INET, dif, sdif);
if (unlikely(drop_reason)) {
@@ -2016,6 +2019,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
}
goto discard_and_relse;
}
+ nf_reset_ct(skb);
if (nsk == sk) {
reqsk_put(req);
tcp_v4_restore_cb(skb);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6f0012e35160cd08a53e46e3b3bbf724b92dfe68 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 23 Jun 2022 05:04:36 +0000
Subject: [PATCH] tcp: add a missing nf_reset_ct() in 3WHS handling
When the third packet of 3WHS connection establishment
contains payload, it is added into socket receive queue
without the XFRM check and the drop of connection tracking
context.
This means that if the data is left unread in the socket
receive queue, conntrack module can not be unloaded.
As most applications usually reads the incoming data
immediately after accept(), bug has been hiding for
quite a long time.
Commit 68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") exposed this bug because
even if the application reads this data, the skb
with nfct state could stay in a per-cpu cache for
an arbitrary time, if said cpu no longer process RX softirqs.
Many thanks to Ilya Maximets for reporting this issue,
and for testing various patches:
https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/
Note that I also added a missing xfrm4_policy_check() call,
although this is probably not a big issue, as the SYN
packet should have been dropped earlier.
Fixes: b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
Reported-by: Ilya Maximets <i.maximets(a)ovn.org>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Florian Westphal <fw(a)strlen.de>
Cc: Pablo Neira Ayuso <pablo(a)netfilter.org>
Cc: Steffen Klassert <steffen.klassert(a)secunet.com>
Tested-by: Ilya Maximets <i.maximets(a)ovn.org>
Reviewed-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220623050436.1290307-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fe8f23b95d32..da5a3c44c4fb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1964,7 +1964,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
struct sock *nsk;
sk = req->rsk_listener;
- drop_reason = tcp_inbound_md5_hash(sk, skb,
+ if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
+ drop_reason = SKB_DROP_REASON_XFRM_POLICY;
+ else
+ drop_reason = tcp_inbound_md5_hash(sk, skb,
&iph->saddr, &iph->daddr,
AF_INET, dif, sdif);
if (unlikely(drop_reason)) {
@@ -2016,6 +2019,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
}
goto discard_and_relse;
}
+ nf_reset_ct(skb);
if (nsk == sk) {
reqsk_put(req);
tcp_v4_restore_cb(skb);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8ee9d82cd0a45e7d050ade598c9f33032a0f2891 Mon Sep 17 00:00:00 2001
From: Tong Zhang <ztong0001(a)gmail.com>
Date: Sun, 26 Jun 2022 21:33:48 -0700
Subject: [PATCH] epic100: fix use after free on rmmod
epic_close() calls epic_rx() and uses dma buffer, but in epic_remove_one()
we already freed the dma buffer. To fix this issue, reorder function calls
like in the .probe function.
BUG: KASAN: use-after-free in epic_rx+0xa6/0x7e0 [epic100]
Call Trace:
epic_rx+0xa6/0x7e0 [epic100]
epic_close+0xec/0x2f0 [epic100]
unregister_netdev+0x18/0x20
epic_remove_one+0xaa/0xf0 [epic100]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Yilun Wu <yiluwu(a)cs.stonybrook.edu>
Signed-off-by: Tong Zhang <ztong0001(a)gmail.com>
Reviewed-by: Francois Romieu <romieu(a)fr.zoreil.com>
Link: https://lore.kernel.org/r/20220627043351.25615-1-ztong0001@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/smsc/epic100.c b/drivers/net/ethernet/smsc/epic100.c
index a0654e88444c..0329caf63279 100644
--- a/drivers/net/ethernet/smsc/epic100.c
+++ b/drivers/net/ethernet/smsc/epic100.c
@@ -1515,14 +1515,14 @@ static void epic_remove_one(struct pci_dev *pdev)
struct net_device *dev = pci_get_drvdata(pdev);
struct epic_private *ep = netdev_priv(dev);
+ unregister_netdev(dev);
dma_free_coherent(&pdev->dev, TX_TOTAL_SIZE, ep->tx_ring,
ep->tx_ring_dma);
dma_free_coherent(&pdev->dev, RX_TOTAL_SIZE, ep->rx_ring,
ep->rx_ring_dma);
- unregister_netdev(dev);
pci_iounmap(pdev, ep->ioaddr);
- pci_release_regions(pdev);
free_netdev(dev);
+ pci_release_regions(pdev);
pci_disable_device(pdev);
/* pci_power_off(pdev, -1); */
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8ee9d82cd0a45e7d050ade598c9f33032a0f2891 Mon Sep 17 00:00:00 2001
From: Tong Zhang <ztong0001(a)gmail.com>
Date: Sun, 26 Jun 2022 21:33:48 -0700
Subject: [PATCH] epic100: fix use after free on rmmod
epic_close() calls epic_rx() and uses dma buffer, but in epic_remove_one()
we already freed the dma buffer. To fix this issue, reorder function calls
like in the .probe function.
BUG: KASAN: use-after-free in epic_rx+0xa6/0x7e0 [epic100]
Call Trace:
epic_rx+0xa6/0x7e0 [epic100]
epic_close+0xec/0x2f0 [epic100]
unregister_netdev+0x18/0x20
epic_remove_one+0xaa/0xf0 [epic100]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Yilun Wu <yiluwu(a)cs.stonybrook.edu>
Signed-off-by: Tong Zhang <ztong0001(a)gmail.com>
Reviewed-by: Francois Romieu <romieu(a)fr.zoreil.com>
Link: https://lore.kernel.org/r/20220627043351.25615-1-ztong0001@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/smsc/epic100.c b/drivers/net/ethernet/smsc/epic100.c
index a0654e88444c..0329caf63279 100644
--- a/drivers/net/ethernet/smsc/epic100.c
+++ b/drivers/net/ethernet/smsc/epic100.c
@@ -1515,14 +1515,14 @@ static void epic_remove_one(struct pci_dev *pdev)
struct net_device *dev = pci_get_drvdata(pdev);
struct epic_private *ep = netdev_priv(dev);
+ unregister_netdev(dev);
dma_free_coherent(&pdev->dev, TX_TOTAL_SIZE, ep->tx_ring,
ep->tx_ring_dma);
dma_free_coherent(&pdev->dev, RX_TOTAL_SIZE, ep->rx_ring,
ep->rx_ring_dma);
- unregister_netdev(dev);
pci_iounmap(pdev, ep->ioaddr);
- pci_release_regions(pdev);
free_netdev(dev);
+ pci_release_regions(pdev);
pci_disable_device(pdev);
/* pci_power_off(pdev, -1); */
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8ee9d82cd0a45e7d050ade598c9f33032a0f2891 Mon Sep 17 00:00:00 2001
From: Tong Zhang <ztong0001(a)gmail.com>
Date: Sun, 26 Jun 2022 21:33:48 -0700
Subject: [PATCH] epic100: fix use after free on rmmod
epic_close() calls epic_rx() and uses dma buffer, but in epic_remove_one()
we already freed the dma buffer. To fix this issue, reorder function calls
like in the .probe function.
BUG: KASAN: use-after-free in epic_rx+0xa6/0x7e0 [epic100]
Call Trace:
epic_rx+0xa6/0x7e0 [epic100]
epic_close+0xec/0x2f0 [epic100]
unregister_netdev+0x18/0x20
epic_remove_one+0xaa/0xf0 [epic100]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Yilun Wu <yiluwu(a)cs.stonybrook.edu>
Signed-off-by: Tong Zhang <ztong0001(a)gmail.com>
Reviewed-by: Francois Romieu <romieu(a)fr.zoreil.com>
Link: https://lore.kernel.org/r/20220627043351.25615-1-ztong0001@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/smsc/epic100.c b/drivers/net/ethernet/smsc/epic100.c
index a0654e88444c..0329caf63279 100644
--- a/drivers/net/ethernet/smsc/epic100.c
+++ b/drivers/net/ethernet/smsc/epic100.c
@@ -1515,14 +1515,14 @@ static void epic_remove_one(struct pci_dev *pdev)
struct net_device *dev = pci_get_drvdata(pdev);
struct epic_private *ep = netdev_priv(dev);
+ unregister_netdev(dev);
dma_free_coherent(&pdev->dev, TX_TOTAL_SIZE, ep->tx_ring,
ep->tx_ring_dma);
dma_free_coherent(&pdev->dev, RX_TOTAL_SIZE, ep->rx_ring,
ep->rx_ring_dma);
- unregister_netdev(dev);
pci_iounmap(pdev, ep->ioaddr);
- pci_release_regions(pdev);
free_netdev(dev);
+ pci_release_regions(pdev);
pci_disable_device(pdev);
/* pci_power_off(pdev, -1); */
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8ee9d82cd0a45e7d050ade598c9f33032a0f2891 Mon Sep 17 00:00:00 2001
From: Tong Zhang <ztong0001(a)gmail.com>
Date: Sun, 26 Jun 2022 21:33:48 -0700
Subject: [PATCH] epic100: fix use after free on rmmod
epic_close() calls epic_rx() and uses dma buffer, but in epic_remove_one()
we already freed the dma buffer. To fix this issue, reorder function calls
like in the .probe function.
BUG: KASAN: use-after-free in epic_rx+0xa6/0x7e0 [epic100]
Call Trace:
epic_rx+0xa6/0x7e0 [epic100]
epic_close+0xec/0x2f0 [epic100]
unregister_netdev+0x18/0x20
epic_remove_one+0xaa/0xf0 [epic100]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Yilun Wu <yiluwu(a)cs.stonybrook.edu>
Signed-off-by: Tong Zhang <ztong0001(a)gmail.com>
Reviewed-by: Francois Romieu <romieu(a)fr.zoreil.com>
Link: https://lore.kernel.org/r/20220627043351.25615-1-ztong0001@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/smsc/epic100.c b/drivers/net/ethernet/smsc/epic100.c
index a0654e88444c..0329caf63279 100644
--- a/drivers/net/ethernet/smsc/epic100.c
+++ b/drivers/net/ethernet/smsc/epic100.c
@@ -1515,14 +1515,14 @@ static void epic_remove_one(struct pci_dev *pdev)
struct net_device *dev = pci_get_drvdata(pdev);
struct epic_private *ep = netdev_priv(dev);
+ unregister_netdev(dev);
dma_free_coherent(&pdev->dev, TX_TOTAL_SIZE, ep->tx_ring,
ep->tx_ring_dma);
dma_free_coherent(&pdev->dev, RX_TOTAL_SIZE, ep->rx_ring,
ep->rx_ring_dma);
- unregister_netdev(dev);
pci_iounmap(pdev, ep->ioaddr);
- pci_release_regions(pdev);
free_netdev(dev);
+ pci_release_regions(pdev);
pci_disable_device(pdev);
/* pci_power_off(pdev, -1); */
}
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9c5de246c1dbe785268fc2e83c88624b92e4ec93 Mon Sep 17 00:00:00 2001
From: Casper Andersson <casper.casan(a)gmail.com>
Date: Thu, 30 Jun 2022 14:22:26 +0200
Subject: [PATCH] net: sparx5: mdb add/del handle non-sparx5 devices
When adding/deleting mdb entries on other net_devices, eg., tap
interfaces, it should not crash.
Fixes: 3bacfccdcb2d ("net: sparx5: Add mdb handlers")
Signed-off-by: Casper Andersson <casper.casan(a)gmail.com>
Reviewed-by: Steen Hegelund <Steen.Hegelund(a)microchip.com>
Link: https://lore.kernel.org/r/20220630122226.316812-1-casper.casan@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
index 3429660cd2e5..5edc8b7176c8 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
@@ -396,6 +396,9 @@ static int sparx5_handle_port_mdb_add(struct net_device *dev,
u32 mact_entry;
int res, err;
+ if (!sparx5_netdevice_check(dev))
+ return -EOPNOTSUPP;
+
if (netif_is_bridge_master(v->obj.orig_dev)) {
sparx5_mact_learn(spx5, PGID_CPU, v->addr, v->vid);
return 0;
@@ -466,6 +469,9 @@ static int sparx5_handle_port_mdb_del(struct net_device *dev,
u32 mact_entry, res, pgid_entry[3];
int err;
+ if (!sparx5_netdevice_check(dev))
+ return -EOPNOTSUPP;
+
if (netif_is_bridge_master(v->obj.orig_dev)) {
sparx5_mact_forget(spx5, v->addr, v->vid);
return 0;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76b39b94382f9e0a639e1c70c3253de248cc4c83 Mon Sep 17 00:00:00 2001
From: Victor Nogueira <victor(a)mojatatu.com>
Date: Thu, 23 Jun 2022 11:07:41 -0300
Subject: [PATCH] net/sched: act_api: Notify user space if any actions were
flushed before error
If during an action flush operation one of the actions is still being
referenced, the flush operation is aborted and the kernel returns to
user space with an error. However, if the kernel was able to flush, for
example, 3 actions and failed on the fourth, the kernel will not notify
user space that it deleted 3 actions before failing.
This patch fixes that behaviour by notifying user space of how many
actions were deleted before flush failed and by setting extack with a
message describing what happened.
Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside")
Signed-off-by: Victor Nogueira <victor(a)mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs(a)mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index da9733da9868..817065aa2833 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -588,7 +588,8 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
}
static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
- const struct tc_action_ops *ops)
+ const struct tc_action_ops *ops,
+ struct netlink_ext_ack *extack)
{
struct nlattr *nest;
int n_i = 0;
@@ -604,20 +605,25 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
if (nla_put_string(skb, TCA_KIND, ops->kind))
goto nla_put_failure;
+ ret = 0;
mutex_lock(&idrinfo->lock);
idr_for_each_entry_ul(idr, p, tmp, id) {
if (IS_ERR(p))
continue;
ret = tcf_idr_release_unsafe(p);
- if (ret == ACT_P_DELETED) {
+ if (ret == ACT_P_DELETED)
module_put(ops->owner);
- n_i++;
- } else if (ret < 0) {
- mutex_unlock(&idrinfo->lock);
- goto nla_put_failure;
- }
+ else if (ret < 0)
+ break;
+ n_i++;
}
mutex_unlock(&idrinfo->lock);
+ if (ret < 0) {
+ if (n_i)
+ NL_SET_ERR_MSG(extack, "Unable to flush all TC actions");
+ else
+ goto nla_put_failure;
+ }
ret = nla_put_u32(skb, TCA_FCNT, n_i);
if (ret)
@@ -638,7 +644,7 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb,
struct tcf_idrinfo *idrinfo = tn->idrinfo;
if (type == RTM_DELACTION) {
- return tcf_del_walker(idrinfo, skb, ops);
+ return tcf_del_walker(idrinfo, skb, ops, extack);
} else if (type == RTM_GETACTION) {
return tcf_dump_walker(idrinfo, skb, cb);
} else {
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76b39b94382f9e0a639e1c70c3253de248cc4c83 Mon Sep 17 00:00:00 2001
From: Victor Nogueira <victor(a)mojatatu.com>
Date: Thu, 23 Jun 2022 11:07:41 -0300
Subject: [PATCH] net/sched: act_api: Notify user space if any actions were
flushed before error
If during an action flush operation one of the actions is still being
referenced, the flush operation is aborted and the kernel returns to
user space with an error. However, if the kernel was able to flush, for
example, 3 actions and failed on the fourth, the kernel will not notify
user space that it deleted 3 actions before failing.
This patch fixes that behaviour by notifying user space of how many
actions were deleted before flush failed and by setting extack with a
message describing what happened.
Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside")
Signed-off-by: Victor Nogueira <victor(a)mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs(a)mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index da9733da9868..817065aa2833 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -588,7 +588,8 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
}
static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
- const struct tc_action_ops *ops)
+ const struct tc_action_ops *ops,
+ struct netlink_ext_ack *extack)
{
struct nlattr *nest;
int n_i = 0;
@@ -604,20 +605,25 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
if (nla_put_string(skb, TCA_KIND, ops->kind))
goto nla_put_failure;
+ ret = 0;
mutex_lock(&idrinfo->lock);
idr_for_each_entry_ul(idr, p, tmp, id) {
if (IS_ERR(p))
continue;
ret = tcf_idr_release_unsafe(p);
- if (ret == ACT_P_DELETED) {
+ if (ret == ACT_P_DELETED)
module_put(ops->owner);
- n_i++;
- } else if (ret < 0) {
- mutex_unlock(&idrinfo->lock);
- goto nla_put_failure;
- }
+ else if (ret < 0)
+ break;
+ n_i++;
}
mutex_unlock(&idrinfo->lock);
+ if (ret < 0) {
+ if (n_i)
+ NL_SET_ERR_MSG(extack, "Unable to flush all TC actions");
+ else
+ goto nla_put_failure;
+ }
ret = nla_put_u32(skb, TCA_FCNT, n_i);
if (ret)
@@ -638,7 +644,7 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb,
struct tcf_idrinfo *idrinfo = tn->idrinfo;
if (type == RTM_DELACTION) {
- return tcf_del_walker(idrinfo, skb, ops);
+ return tcf_del_walker(idrinfo, skb, ops, extack);
} else if (type == RTM_GETACTION) {
return tcf_dump_walker(idrinfo, skb, cb);
} else {
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76b39b94382f9e0a639e1c70c3253de248cc4c83 Mon Sep 17 00:00:00 2001
From: Victor Nogueira <victor(a)mojatatu.com>
Date: Thu, 23 Jun 2022 11:07:41 -0300
Subject: [PATCH] net/sched: act_api: Notify user space if any actions were
flushed before error
If during an action flush operation one of the actions is still being
referenced, the flush operation is aborted and the kernel returns to
user space with an error. However, if the kernel was able to flush, for
example, 3 actions and failed on the fourth, the kernel will not notify
user space that it deleted 3 actions before failing.
This patch fixes that behaviour by notifying user space of how many
actions were deleted before flush failed and by setting extack with a
message describing what happened.
Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside")
Signed-off-by: Victor Nogueira <victor(a)mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs(a)mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index da9733da9868..817065aa2833 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -588,7 +588,8 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
}
static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
- const struct tc_action_ops *ops)
+ const struct tc_action_ops *ops,
+ struct netlink_ext_ack *extack)
{
struct nlattr *nest;
int n_i = 0;
@@ -604,20 +605,25 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
if (nla_put_string(skb, TCA_KIND, ops->kind))
goto nla_put_failure;
+ ret = 0;
mutex_lock(&idrinfo->lock);
idr_for_each_entry_ul(idr, p, tmp, id) {
if (IS_ERR(p))
continue;
ret = tcf_idr_release_unsafe(p);
- if (ret == ACT_P_DELETED) {
+ if (ret == ACT_P_DELETED)
module_put(ops->owner);
- n_i++;
- } else if (ret < 0) {
- mutex_unlock(&idrinfo->lock);
- goto nla_put_failure;
- }
+ else if (ret < 0)
+ break;
+ n_i++;
}
mutex_unlock(&idrinfo->lock);
+ if (ret < 0) {
+ if (n_i)
+ NL_SET_ERR_MSG(extack, "Unable to flush all TC actions");
+ else
+ goto nla_put_failure;
+ }
ret = nla_put_u32(skb, TCA_FCNT, n_i);
if (ret)
@@ -638,7 +644,7 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb,
struct tcf_idrinfo *idrinfo = tn->idrinfo;
if (type == RTM_DELACTION) {
- return tcf_del_walker(idrinfo, skb, ops);
+ return tcf_del_walker(idrinfo, skb, ops, extack);
} else if (type == RTM_GETACTION) {
return tcf_dump_walker(idrinfo, skb, cb);
} else {
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f44b799603a9b5d2e375b0b2d54dd0b791eddfc2 Mon Sep 17 00:00:00 2001
From: Miaoqian Lin <linmq006(a)gmail.com>
Date: Thu, 26 May 2022 12:28:56 +0400
Subject: [PATCH] PM / devfreq: exynos-ppmu: Fix refcount leak in
of_get_devfreq_events
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
This function only calls of_node_put() in normal path,
missing it in error paths.
Add missing of_node_put() to avoid refcount leak.
Fixes: f262f28c1470 ("PM / devfreq: event: Add devfreq_event class")
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c
index 9b849d781116..a443e7c42daf 100644
--- a/drivers/devfreq/event/exynos-ppmu.c
+++ b/drivers/devfreq/event/exynos-ppmu.c
@@ -519,15 +519,19 @@ static int of_get_devfreq_events(struct device_node *np,
count = of_get_child_count(events_np);
desc = devm_kcalloc(dev, count, sizeof(*desc), GFP_KERNEL);
- if (!desc)
+ if (!desc) {
+ of_node_put(events_np);
return -ENOMEM;
+ }
info->num_events = count;
of_id = of_match_device(exynos_ppmu_id_match, dev);
if (of_id)
info->ppmu_type = (enum exynos_ppmu_type)of_id->data;
- else
+ else {
+ of_node_put(events_np);
return -EINVAL;
+ }
j = 0;
for_each_child_of_node(events_np, node) {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f44b799603a9b5d2e375b0b2d54dd0b791eddfc2 Mon Sep 17 00:00:00 2001
From: Miaoqian Lin <linmq006(a)gmail.com>
Date: Thu, 26 May 2022 12:28:56 +0400
Subject: [PATCH] PM / devfreq: exynos-ppmu: Fix refcount leak in
of_get_devfreq_events
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
This function only calls of_node_put() in normal path,
missing it in error paths.
Add missing of_node_put() to avoid refcount leak.
Fixes: f262f28c1470 ("PM / devfreq: event: Add devfreq_event class")
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c
index 9b849d781116..a443e7c42daf 100644
--- a/drivers/devfreq/event/exynos-ppmu.c
+++ b/drivers/devfreq/event/exynos-ppmu.c
@@ -519,15 +519,19 @@ static int of_get_devfreq_events(struct device_node *np,
count = of_get_child_count(events_np);
desc = devm_kcalloc(dev, count, sizeof(*desc), GFP_KERNEL);
- if (!desc)
+ if (!desc) {
+ of_node_put(events_np);
return -ENOMEM;
+ }
info->num_events = count;
of_id = of_match_device(exynos_ppmu_id_match, dev);
if (of_id)
info->ppmu_type = (enum exynos_ppmu_type)of_id->data;
- else
+ else {
+ of_node_put(events_np);
return -EINVAL;
+ }
j = 0;
for_each_child_of_node(events_np, node) {
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f44b799603a9b5d2e375b0b2d54dd0b791eddfc2 Mon Sep 17 00:00:00 2001
From: Miaoqian Lin <linmq006(a)gmail.com>
Date: Thu, 26 May 2022 12:28:56 +0400
Subject: [PATCH] PM / devfreq: exynos-ppmu: Fix refcount leak in
of_get_devfreq_events
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
This function only calls of_node_put() in normal path,
missing it in error paths.
Add missing of_node_put() to avoid refcount leak.
Fixes: f262f28c1470 ("PM / devfreq: event: Add devfreq_event class")
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c
index 9b849d781116..a443e7c42daf 100644
--- a/drivers/devfreq/event/exynos-ppmu.c
+++ b/drivers/devfreq/event/exynos-ppmu.c
@@ -519,15 +519,19 @@ static int of_get_devfreq_events(struct device_node *np,
count = of_get_child_count(events_np);
desc = devm_kcalloc(dev, count, sizeof(*desc), GFP_KERNEL);
- if (!desc)
+ if (!desc) {
+ of_node_put(events_np);
return -ENOMEM;
+ }
info->num_events = count;
of_id = of_match_device(exynos_ppmu_id_match, dev);
if (of_id)
info->ppmu_type = (enum exynos_ppmu_type)of_id->data;
- else
+ else {
+ of_node_put(events_np);
return -EINVAL;
+ }
j = 0;
for_each_child_of_node(events_np, node) {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 868f9f2f8e004bfe0d3935b1976f625b2924893b Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il(a)gmail.com>
Date: Thu, 30 Jun 2022 22:58:49 +0300
Subject: [PATCH] vfs: fix copy_file_range() regression in cross-fs copies
A regression has been reported by Nicolas Boichat, found while using the
copy_file_range syscall to copy a tracefs file.
Before commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across
devices") the kernel would return -EXDEV to userspace when trying to
copy a file across different filesystems. After this commit, the
syscall doesn't fail anymore and instead returns zero (zero bytes
copied), as this file's content is generated on-the-fly and thus reports
a size of zero.
Another regression has been reported by He Zhe - the assertion of
WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when
copying from a sysfs file whose read operation may return -EOPNOTSUPP.
Since we do not have test coverage for copy_file_range() between any two
types of filesystems, the best way to avoid these sort of issues in the
future is for the kernel to be more picky about filesystems that are
allowed to do copy_file_range().
This patch restores some cross-filesystem copy restrictions that existed
prior to commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across
devices"), namely, cross-sb copy is not allowed for filesystems that do
not implement ->copy_file_range().
Filesystems that do implement ->copy_file_range() have full control of
the result - if this method returns an error, the error is returned to
the user. Before this change this was only true for fs that did not
implement the ->remap_file_range() operation (i.e. nfsv3).
Filesystems that do not implement ->copy_file_range() still fall-back to
the generic_copy_file_range() implementation when the copy is within the
same sb. This helps the kernel can maintain a more consistent story
about which filesystems support copy_file_range().
nfsd and ksmbd servers are modified to fall-back to the
generic_copy_file_range() implementation in case vfs_copy_file_range()
fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of
server-side-copy.
fall-back to generic_copy_file_range() is not implemented for the smb
operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct
change of behavior.
Fixes: 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices")
Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chr…
Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47…
Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1ef…
Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@sus…
Reported-by: Nicolas Boichat <drinkcat(a)chromium.org>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Signed-off-by: Luis Henriques <lhenriques(a)suse.de>
Fixes: 64bf5ff58dff ("vfs: no fallback for ->copy_file_range")
Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@…
Reported-by: He Zhe <zhe.he(a)windriver.com>
Tested-by: Namjae Jeon <linkinjeon(a)kernel.org>
Tested-by: Luis Henriques <lhenriques(a)suse.de>
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/fs/ksmbd/smb2pdu.c b/fs/ksmbd/smb2pdu.c
index 94ab1dcd80e7..353f047e783c 100644
--- a/fs/ksmbd/smb2pdu.c
+++ b/fs/ksmbd/smb2pdu.c
@@ -7810,14 +7810,24 @@ int smb2_ioctl(struct ksmbd_work *work)
src_off = le64_to_cpu(dup_ext->SourceFileOffset);
dst_off = le64_to_cpu(dup_ext->TargetFileOffset);
length = le64_to_cpu(dup_ext->ByteCount);
- cloned = vfs_clone_file_range(fp_in->filp, src_off, fp_out->filp,
- dst_off, length, 0);
+ /*
+ * XXX: It is not clear if FSCTL_DUPLICATE_EXTENTS_TO_FILE
+ * should fall back to vfs_copy_file_range(). This could be
+ * beneficial when re-exporting nfs/smb mount, but note that
+ * this can result in partial copy that returns an error status.
+ * If/when FSCTL_DUPLICATE_EXTENTS_TO_FILE_EX is implemented,
+ * fall back to vfs_copy_file_range(), should be avoided when
+ * the flag DUPLICATE_EXTENTS_DATA_EX_SOURCE_ATOMIC is set.
+ */
+ cloned = vfs_clone_file_range(fp_in->filp, src_off,
+ fp_out->filp, dst_off, length, 0);
if (cloned == -EXDEV || cloned == -EOPNOTSUPP) {
ret = -EOPNOTSUPP;
goto dup_ext_out;
} else if (cloned != length) {
cloned = vfs_copy_file_range(fp_in->filp, src_off,
- fp_out->filp, dst_off, length, 0);
+ fp_out->filp, dst_off,
+ length, 0);
if (cloned != length) {
if (cloned < 0)
ret = cloned;
diff --git a/fs/ksmbd/vfs.c b/fs/ksmbd/vfs.c
index 5d185564aef6..05efcdf7a4a7 100644
--- a/fs/ksmbd/vfs.c
+++ b/fs/ksmbd/vfs.c
@@ -1779,6 +1779,10 @@ int ksmbd_vfs_copy_file_ranges(struct ksmbd_work *work,
ret = vfs_copy_file_range(src_fp->filp, src_off,
dst_fp->filp, dst_off, len, 0);
+ if (ret == -EOPNOTSUPP || ret == -EXDEV)
+ ret = generic_copy_file_range(src_fp->filp, src_off,
+ dst_fp->filp, dst_off,
+ len, 0);
if (ret < 0)
return ret;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 840e3af63a6f..b764213bcc55 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -577,6 +577,7 @@ __be32 nfsd4_clone_file_range(struct svc_rqst *rqstp,
ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst,
u64 dst_pos, u64 count)
{
+ ssize_t ret;
/*
* Limit copy to 4MB to prevent indefinitely blocking an nfsd
@@ -587,7 +588,12 @@ ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst,
* limit like this and pipeline multiple COPY requests.
*/
count = min_t(u64, count, 1 << 22);
- return vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
+ ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
+
+ if (ret == -EOPNOTSUPP || ret == -EXDEV)
+ ret = generic_copy_file_range(src, src_pos, dst, dst_pos,
+ count, 0);
+ return ret;
}
__be32 nfsd4_vfs_fallocate(struct svc_rqst *rqstp, struct svc_fh *fhp,
diff --git a/fs/read_write.c b/fs/read_write.c
index b1b1cdfee9d3..e0777eefd846 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1397,28 +1397,6 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
}
EXPORT_SYMBOL(generic_copy_file_range);
-static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out,
- size_t len, unsigned int flags)
-{
- /*
- * Although we now allow filesystems to handle cross sb copy, passing
- * a file of the wrong filesystem type to filesystem driver can result
- * in an attempt to dereference the wrong type of ->private_data, so
- * avoid doing that until we really have a good reason. NFS defines
- * several different file_system_type structures, but they all end up
- * using the same ->copy_file_range() function pointer.
- */
- if (file_out->f_op->copy_file_range &&
- file_out->f_op->copy_file_range == file_in->f_op->copy_file_range)
- return file_out->f_op->copy_file_range(file_in, pos_in,
- file_out, pos_out,
- len, flags);
-
- return generic_copy_file_range(file_in, pos_in, file_out, pos_out, len,
- flags);
-}
-
/*
* Performs necessary checks before doing a file copy
*
@@ -1440,6 +1418,24 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in,
if (ret)
return ret;
+ /*
+ * We allow some filesystems to handle cross sb copy, but passing
+ * a file of the wrong filesystem type to filesystem driver can result
+ * in an attempt to dereference the wrong type of ->private_data, so
+ * avoid doing that until we really have a good reason.
+ *
+ * nfs and cifs define several different file_system_type structures
+ * and several different sets of file_operations, but they all end up
+ * using the same ->copy_file_range() function pointer.
+ */
+ if (file_out->f_op->copy_file_range) {
+ if (file_in->f_op->copy_file_range !=
+ file_out->f_op->copy_file_range)
+ return -EXDEV;
+ } else if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb) {
+ return -EXDEV;
+ }
+
/* Don't touch certain kinds of inodes */
if (IS_IMMUTABLE(inode_out))
return -EPERM;
@@ -1505,26 +1501,41 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
file_start_write(file_out);
/*
- * Try cloning first, this is supported by more file systems, and
- * more efficient if both clone and copy are supported (e.g. NFS).
+ * Cloning is supported by more file systems, so we implement copy on
+ * same sb using clone, but for filesystems where both clone and copy
+ * are supported (e.g. nfs,cifs), we only call the copy method.
*/
+ if (file_out->f_op->copy_file_range) {
+ ret = file_out->f_op->copy_file_range(file_in, pos_in,
+ file_out, pos_out,
+ len, flags);
+ goto done;
+ }
+
if (file_in->f_op->remap_file_range &&
file_inode(file_in)->i_sb == file_inode(file_out)->i_sb) {
- loff_t cloned;
-
- cloned = file_in->f_op->remap_file_range(file_in, pos_in,
+ ret = file_in->f_op->remap_file_range(file_in, pos_in,
file_out, pos_out,
min_t(loff_t, MAX_RW_COUNT, len),
REMAP_FILE_CAN_SHORTEN);
- if (cloned > 0) {
- ret = cloned;
+ if (ret > 0)
goto done;
- }
}
- ret = do_copy_file_range(file_in, pos_in, file_out, pos_out, len,
- flags);
- WARN_ON_ONCE(ret == -EOPNOTSUPP);
+ /*
+ * We can get here for same sb copy of filesystems that do not implement
+ * ->copy_file_range() in case filesystem does not support clone or in
+ * case filesystem supports clone but rejected the clone request (e.g.
+ * because it was not block aligned).
+ *
+ * In both cases, fall back to kernel copy so we are able to maintain a
+ * consistent story about which filesystems support copy_file_range()
+ * and which filesystems do not, that will allow userspace tools to
+ * make consistent desicions w.r.t using copy_file_range().
+ */
+ ret = generic_copy_file_range(file_in, pos_in, file_out, pos_out, len,
+ flags);
+
done:
if (ret > 0) {
fsnotify_access(file_in);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4f74400308cb8abde5fdc9cad609c2aba32110c Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Sat, 11 Jun 2022 00:20:23 +0200
Subject: [PATCH] s390/archrandom: simplify back to earlier design and
initialize earlier
s390x appears to present two RNG interfaces:
- a "TRNG" that gathers entropy using some hardware function; and
- a "DRBG" that takes in a seed and expands it.
Previously, the TRNG was wired up to arch_get_random_{long,int}(), but
it was observed that this was being called really frequently, resulting
in high overhead. So it was changed to be wired up to arch_get_random_
seed_{long,int}(), which was a reasonable decision. Later on, the DRBG
was then wired up to arch_get_random_{long,int}(), with a complicated
buffer filling thread, to control overhead and rate.
Fortunately, none of the performance issues matter much now. The RNG
always attempts to use arch_get_random_seed_{long,int}() first, which
means a complicated implementation of arch_get_random_{long,int}() isn't
really valuable or useful to have around. And it's only used when
reseeding, which means it won't hit the high throughput complications
that were faced before.
So this commit returns to an earlier design of just calling the TRNG in
arch_get_random_seed_{long,int}(), and returning false in arch_get_
random_{long,int}().
Part of what makes the simplification possible is that the RNG now seeds
itself using the TRNG at bootup. But this only works if the TRNG is
detected early in boot, before random_init() is called. So this commit
also causes that check to happen in setup_arch().
Cc: stable(a)vger.kernel.org
Cc: Harald Freudenberger <freude(a)linux.ibm.com>
Cc: Ingo Franzki <ifranzki(a)linux.ibm.com>
Cc: Juergen Christ <jchrist(a)linux.ibm.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Link: https://lore.kernel.org/r/20220610222023.378448-1-Jason@zx2c4.com
Reviewed-by: Harald Freudenberger <freude(a)linux.ibm.com>
Acked-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
diff --git a/arch/s390/crypto/arch_random.c b/arch/s390/crypto/arch_random.c
index 56007c763902..1f2d40993c4d 100644
--- a/arch/s390/crypto/arch_random.c
+++ b/arch/s390/crypto/arch_random.c
@@ -4,232 +4,15 @@
*
* Copyright IBM Corp. 2017, 2020
* Author(s): Harald Freudenberger
- *
- * The s390_arch_random_generate() function may be called from random.c
- * in interrupt context. So this implementation does the best to be very
- * fast. There is a buffer of random data which is asynchronously checked
- * and filled by a workqueue thread.
- * If there are enough bytes in the buffer the s390_arch_random_generate()
- * just delivers these bytes. Otherwise false is returned until the
- * worker thread refills the buffer.
- * The worker fills the rng buffer by pulling fresh entropy from the
- * high quality (but slow) true hardware random generator. This entropy
- * is then spread over the buffer with an pseudo random generator PRNG.
- * As the arch_get_random_seed_long() fetches 8 bytes and the calling
- * function add_interrupt_randomness() counts this as 1 bit entropy the
- * distribution needs to make sure there is in fact 1 bit entropy contained
- * in 8 bytes of the buffer. The current values pull 32 byte entropy
- * and scatter this into a 2048 byte buffer. So 8 byte in the buffer
- * will contain 1 bit of entropy.
- * The worker thread is rescheduled based on the charge level of the
- * buffer but at least with 500 ms delay to avoid too much CPU consumption.
- * So the max. amount of rng data delivered via arch_get_random_seed is
- * limited to 4k bytes per second.
*/
#include <linux/kernel.h>
#include <linux/atomic.h>
#include <linux/random.h>
-#include <linux/slab.h>
#include <linux/static_key.h>
-#include <linux/workqueue.h>
-#include <linux/moduleparam.h>
#include <asm/cpacf.h>
DEFINE_STATIC_KEY_FALSE(s390_arch_random_available);
atomic64_t s390_arch_random_counter = ATOMIC64_INIT(0);
EXPORT_SYMBOL(s390_arch_random_counter);
-
-#define ARCH_REFILL_TICKS (HZ/2)
-#define ARCH_PRNG_SEED_SIZE 32
-#define ARCH_RNG_BUF_SIZE 2048
-
-static DEFINE_SPINLOCK(arch_rng_lock);
-static u8 *arch_rng_buf;
-static unsigned int arch_rng_buf_idx;
-
-static void arch_rng_refill_buffer(struct work_struct *);
-static DECLARE_DELAYED_WORK(arch_rng_work, arch_rng_refill_buffer);
-
-bool s390_arch_random_generate(u8 *buf, unsigned int nbytes)
-{
- /* max hunk is ARCH_RNG_BUF_SIZE */
- if (nbytes > ARCH_RNG_BUF_SIZE)
- return false;
-
- /* lock rng buffer */
- if (!spin_trylock(&arch_rng_lock))
- return false;
-
- /* try to resolve the requested amount of bytes from the buffer */
- arch_rng_buf_idx -= nbytes;
- if (arch_rng_buf_idx < ARCH_RNG_BUF_SIZE) {
- memcpy(buf, arch_rng_buf + arch_rng_buf_idx, nbytes);
- atomic64_add(nbytes, &s390_arch_random_counter);
- spin_unlock(&arch_rng_lock);
- return true;
- }
-
- /* not enough bytes in rng buffer, refill is done asynchronously */
- spin_unlock(&arch_rng_lock);
-
- return false;
-}
-EXPORT_SYMBOL(s390_arch_random_generate);
-
-static void arch_rng_refill_buffer(struct work_struct *unused)
-{
- unsigned int delay = ARCH_REFILL_TICKS;
-
- spin_lock(&arch_rng_lock);
- if (arch_rng_buf_idx > ARCH_RNG_BUF_SIZE) {
- /* buffer is exhausted and needs refill */
- u8 seed[ARCH_PRNG_SEED_SIZE];
- u8 prng_wa[240];
- /* fetch ARCH_PRNG_SEED_SIZE bytes of entropy */
- cpacf_trng(NULL, 0, seed, sizeof(seed));
- /* blow this entropy up to ARCH_RNG_BUF_SIZE with PRNG */
- memset(prng_wa, 0, sizeof(prng_wa));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED,
- &prng_wa, NULL, 0, seed, sizeof(seed));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_GEN,
- &prng_wa, arch_rng_buf, ARCH_RNG_BUF_SIZE, NULL, 0);
- arch_rng_buf_idx = ARCH_RNG_BUF_SIZE;
- }
- delay += (ARCH_REFILL_TICKS * arch_rng_buf_idx) / ARCH_RNG_BUF_SIZE;
- spin_unlock(&arch_rng_lock);
-
- /* kick next check */
- queue_delayed_work(system_long_wq, &arch_rng_work, delay);
-}
-
-/*
- * Here follows the implementation of s390_arch_get_random_long().
- *
- * The random longs to be pulled by arch_get_random_long() are
- * prepared in an 4K buffer which is filled from the NIST 800-90
- * compliant s390 drbg. By default the random long buffer is refilled
- * 256 times before the drbg itself needs a reseed. The reseed of the
- * drbg is done with 32 bytes fetched from the high quality (but slow)
- * trng which is assumed to deliver 100% entropy. So the 32 * 8 = 256
- * bits of entropy are spread over 256 * 4KB = 1MB serving 131072
- * arch_get_random_long() invocations before reseeded.
- *
- * How often the 4K random long buffer is refilled with the drbg
- * before the drbg is reseeded can be adjusted. There is a module
- * parameter 's390_arch_rnd_long_drbg_reseed' accessible via
- * /sys/module/arch_random/parameters/rndlong_drbg_reseed
- * or as kernel command line parameter
- * arch_random.rndlong_drbg_reseed=<value>
- * This parameter tells how often the drbg fills the 4K buffer before
- * it is re-seeded by fresh entropy from the trng.
- * A value of 16 results in reseeding the drbg at every 16 * 4 KB = 64
- * KB with 32 bytes of fresh entropy pulled from the trng. So a value
- * of 16 would result in 256 bits entropy per 64 KB.
- * A value of 256 results in 1MB of drbg output before a reseed of the
- * drbg is done. So this would spread the 256 bits of entropy among 1MB.
- * Setting this parameter to 0 forces the reseed to take place every
- * time the 4K buffer is depleted, so the entropy rises to 256 bits
- * entropy per 4K or 0.5 bit entropy per arch_get_random_long(). With
- * setting this parameter to negative values all this effort is
- * disabled, arch_get_random long() returns false and thus indicating
- * that the arch_get_random_long() feature is disabled at all.
- */
-
-static unsigned long rndlong_buf[512];
-static DEFINE_SPINLOCK(rndlong_lock);
-static int rndlong_buf_index;
-
-static int rndlong_drbg_reseed = 256;
-module_param_named(rndlong_drbg_reseed, rndlong_drbg_reseed, int, 0600);
-MODULE_PARM_DESC(rndlong_drbg_reseed, "s390 arch_get_random_long() drbg reseed");
-
-static inline void refill_rndlong_buf(void)
-{
- static u8 prng_ws[240];
- static int drbg_counter;
-
- if (--drbg_counter < 0) {
- /* need to re-seed the drbg */
- u8 seed[32];
-
- /* fetch seed from trng */
- cpacf_trng(NULL, 0, seed, sizeof(seed));
- /* seed drbg */
- memset(prng_ws, 0, sizeof(prng_ws));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED,
- &prng_ws, NULL, 0, seed, sizeof(seed));
- /* re-init counter for drbg */
- drbg_counter = rndlong_drbg_reseed;
- }
-
- /* fill the arch_get_random_long buffer from drbg */
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_GEN, &prng_ws,
- (u8 *) rndlong_buf, sizeof(rndlong_buf),
- NULL, 0);
-}
-
-bool s390_arch_get_random_long(unsigned long *v)
-{
- bool rc = false;
- unsigned long flags;
-
- /* arch_get_random_long() disabled ? */
- if (rndlong_drbg_reseed < 0)
- return false;
-
- /* try to lock the random long lock */
- if (!spin_trylock_irqsave(&rndlong_lock, flags))
- return false;
-
- if (--rndlong_buf_index >= 0) {
- /* deliver next long value from the buffer */
- *v = rndlong_buf[rndlong_buf_index];
- rc = true;
- goto out;
- }
-
- /* buffer is depleted and needs refill */
- if (in_interrupt()) {
- /* delay refill in interrupt context to next caller */
- rndlong_buf_index = 0;
- goto out;
- }
-
- /* refill random long buffer */
- refill_rndlong_buf();
- rndlong_buf_index = ARRAY_SIZE(rndlong_buf);
-
- /* and provide one random long */
- *v = rndlong_buf[--rndlong_buf_index];
- rc = true;
-
-out:
- spin_unlock_irqrestore(&rndlong_lock, flags);
- return rc;
-}
-EXPORT_SYMBOL(s390_arch_get_random_long);
-
-static int __init s390_arch_random_init(void)
-{
- /* all the needed PRNO subfunctions available ? */
- if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG) &&
- cpacf_query_func(CPACF_PRNO, CPACF_PRNO_SHA512_DRNG_GEN)) {
-
- /* alloc arch random working buffer */
- arch_rng_buf = kmalloc(ARCH_RNG_BUF_SIZE, GFP_KERNEL);
- if (!arch_rng_buf)
- return -ENOMEM;
-
- /* kick worker queue job to fill the random buffer */
- queue_delayed_work(system_long_wq,
- &arch_rng_work, ARCH_REFILL_TICKS);
-
- /* enable arch random to the outside world */
- static_branch_enable(&s390_arch_random_available);
- }
-
- return 0;
-}
-arch_initcall(s390_arch_random_init);
diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h
index 5dc712fde3c7..2c6e1c6ecbe7 100644
--- a/arch/s390/include/asm/archrandom.h
+++ b/arch/s390/include/asm/archrandom.h
@@ -15,17 +15,13 @@
#include <linux/static_key.h>
#include <linux/atomic.h>
+#include <asm/cpacf.h>
DECLARE_STATIC_KEY_FALSE(s390_arch_random_available);
extern atomic64_t s390_arch_random_counter;
-bool s390_arch_get_random_long(unsigned long *v);
-bool s390_arch_random_generate(u8 *buf, unsigned int nbytes);
-
static inline bool __must_check arch_get_random_long(unsigned long *v)
{
- if (static_branch_likely(&s390_arch_random_available))
- return s390_arch_get_random_long(v);
return false;
}
@@ -37,7 +33,9 @@ static inline bool __must_check arch_get_random_int(unsigned int *v)
static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
- return s390_arch_random_generate((u8 *)v, sizeof(*v));
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
}
return false;
}
@@ -45,7 +43,9 @@ static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
- return s390_arch_random_generate((u8 *)v, sizeof(*v));
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
}
return false;
}
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 8d91eccc0963..0a37f5de2863 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -875,6 +875,11 @@ static void __init setup_randomness(void)
if (stsi(vmms, 3, 2, 2) == 0 && vmms->count)
add_device_randomness(&vmms->vm, sizeof(vmms->vm[0]) * vmms->count);
memblock_free(vmms, PAGE_SIZE);
+
+#ifdef CONFIG_ARCH_RANDOM
+ if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG))
+ static_branch_enable(&s390_arch_random_available);
+#endif
}
/*
In commit 169d00a25658 ("can: mcp251xfd: add TX IRQ coalescing
support") software based TX coalescing was added to the driver. The
key idea is to keep the TX complete IRQ disabled for some time after
processing it and re-enable later by a hrtimer. When bringing the
interface down, this timer has to be stopped.
Add the missing hrtimer_cancel() of the tx_irq_time hrtimer to
mcp251xfd_stop().
Link: https://lore.kernel.org/all/20220620143942.891811-1-mkl@pengutronix.de
Fixes: 169d00a25658 ("can: mcp251xfd: add TX IRQ coalescing support")
Cc: stable(a)vger.kernel.org # v5.18
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
index b21252390216..34b160024ce3 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
@@ -1650,6 +1650,7 @@ static int mcp251xfd_stop(struct net_device *ndev)
netif_stop_queue(ndev);
set_bit(MCP251XFD_FLAGS_DOWN, priv->flags);
hrtimer_cancel(&priv->rx_irq_timer);
+ hrtimer_cancel(&priv->tx_irq_timer);
mcp251xfd_chip_interrupts_disable(priv);
free_irq(ndev->irq, priv);
can_rx_offload_disable(&priv->offload);
--
2.35.1
From: Thomas Kopp <thomas.kopp(a)microchip.com>
The mcp251xfd compatible chips have an erratum ([1], [2]), where the
received CRC doesn't match the calculated CRC. In commit
c7eb923c3caf ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work
around broken CRC on TBC register") the following workaround was
implementierend.
- If a CRC read error on the TBC register is detected and the first
byte is 0x00 or 0x80, the most significant bit of the first byte is
flipped and the CRC is calculated again.
- If the CRC now matches, the _original_ data is passed to the reader.
For now we assume transferred data was OK.
New investigations and simulations indicate that the CRC send by the
device is calculated on correct data, and the data is incorrectly
received by the SPI host controller.
Use flipped instead of original data and update workaround description
in mcp251xfd_regmap_crc_read().
[1] mcp2517fd: DS80000792C: "Incorrect CRC for certain READ_CRC commands"
[2] mcp2518fd: DS80000789C: "Incorrect CRC for certain READ_CRC commands"
Link: https://lore.kernel.org/all/DM4PR11MB53901D49578FE265B239E55AFB7C9@DM4PR11M…
Fixes: c7eb923c3caf ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Kopp <thomas.kopp(a)microchip.com>
[mkl: split into 2 patches, update patch description and documentation]
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
index 211edd9ef0c9..92b7bc7f14b9 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
@@ -343,9 +343,8 @@ mcp251xfd_regmap_crc_read(void *context,
*
* If the highest bit in the lowest byte is flipped
* the transferred CRC matches the calculated one. We
- * assume for now the CRC calculation in the chip
- * works on wrong data and the transferred data is
- * correct.
+ * assume for now the CRC operates on the correct
+ * data.
*/
if (reg == MCP251XFD_REG_TBC &&
((buf_rx->data[0] & 0xf8) == 0x0 ||
@@ -359,10 +358,8 @@ mcp251xfd_regmap_crc_read(void *context,
val_len);
if (!err) {
/* If CRC is now correct, assume
- * transferred data was OK, flip bit
- * back to original value.
+ * flipped data is OK.
*/
- buf_rx->data[0] ^= 0x80;
goto out;
}
}
--
2.35.1
From: Thomas Kopp <thomas.kopp(a)microchip.com>
The mcp251xfd compatible chips have an erratum ([1], [2]), where the
received CRC doesn't match the calculated CRC. In commit
c7eb923c3caf ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work
around broken CRC on TBC register") the following workaround was
implementierend.
- If a CRC read error on the TBC register is detected and the first
byte is 0x00 or 0x80, the most significant bit of the first byte is
flipped and the CRC is calculated again.
- If the CRC now matches, the _original_ data is passed to the reader.
For now we assume transferred data was OK.
Measurements on the mcp2517fd show that the workaround is applicable
not only of the lowest byte is 0x00 or 0x80, but also if 3 least
significant bits are set.
Update check on 1st data byte and workaround description accordingly.
[1] mcp2517fd: DS80000792C: "Incorrect CRC for certain READ_CRC commands"
[2] mcp2518fd: DS80000789C: "Incorrect CRC for certain READ_CRC commands"
Link: https://lore.kernel.org/all/DM4PR11MB53901D49578FE265B239E55AFB7C9@DM4PR11M…
Fixes: c7eb923c3caf ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register")
Cc: stable(a)vger.kernel.org
Reported-by: Pavel Modilaynen <pavel.modilaynen(a)volvocars.com>
Signed-off-by: Thomas Kopp <thomas.kopp(a)microchip.com>
[mkl: split into 2 patches, update patch description and documentation]
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
index 217510c12af5..211edd9ef0c9 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
@@ -334,10 +334,12 @@ mcp251xfd_regmap_crc_read(void *context,
* register. It increments once per SYS clock tick,
* which is 20 or 40 MHz.
*
- * Observation shows that if the lowest byte (which is
- * transferred first on the SPI bus) of that register
- * is 0x00 or 0x80 the calculated CRC doesn't always
- * match the transferred one.
+ * Observation on the mcp2518fd shows that if the
+ * lowest byte (which is transferred first on the SPI
+ * bus) of that register is 0x00 or 0x80 the
+ * calculated CRC doesn't always match the transferred
+ * one. On the mcp2517fd this problem is not limited
+ * to the first byte being 0x00 or 0x80.
*
* If the highest bit in the lowest byte is flipped
* the transferred CRC matches the calculated one. We
@@ -346,7 +348,8 @@ mcp251xfd_regmap_crc_read(void *context,
* correct.
*/
if (reg == MCP251XFD_REG_TBC &&
- (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
+ ((buf_rx->data[0] & 0xf8) == 0x0 ||
+ (buf_rx->data[0] & 0xf8) == 0x80)) {
/* Flip highest bit in lowest byte of le32 */
buf_rx->data[0] ^= 0x80;
--
2.35.1
In commit 1be37d3b0414 ("can: m_can: fix periph RX path: use
rx-offload to ensure skbs are sent from softirq context") the RX path
for peripheral devices was switched to RX-offload.
Received CAN frames are pushed to RX-offload together with a
timestamp. RX-offload is designed to handle overflows of the timestamp
correctly, if 32 bit timestamps are provided.
The timestamps of m_can core are only 16 bits wide. So this patch
shifts them to full 32 bit before passing them to RX-offload.
Link: https://lore.kernel.org/all/20220612211410.4081390-1-mkl@pengutronix.de
Fixes: 1be37d3b0414 ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context")
Cc: <stable(a)vger.kernel.org> # 5.13
Cc: Torin Cooper-Bennun <torin(a)maxiluxsystems.com>
Reviewed-by: Chandrasekar Ramakrishnan <rcsekar(a)samsung.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/m_can/m_can.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 03a22d493cf6..7931f9c71ef3 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -529,7 +529,7 @@ static int m_can_read_fifo(struct net_device *dev, u32 rxfs)
/* acknowledge rx fifo 0 */
m_can_write(cdev, M_CAN_RXF0A, fgi);
- timestamp = FIELD_GET(RX_BUF_RXTS_MASK, fifo_header.dlc);
+ timestamp = FIELD_GET(RX_BUF_RXTS_MASK, fifo_header.dlc) << 16;
m_can_receive_skb(cdev, skb, timestamp);
@@ -1030,7 +1030,7 @@ static int m_can_echo_tx_event(struct net_device *dev)
}
msg_mark = FIELD_GET(TX_EVENT_MM_MASK, txe);
- timestamp = FIELD_GET(TX_EVENT_TXTS_MASK, txe);
+ timestamp = FIELD_GET(TX_EVENT_TXTS_MASK, txe) << 16;
/* ack txe element */
m_can_write(cdev, M_CAN_TXEFA, FIELD_PREP(TXEFA_EFAI_MASK,
--
2.35.1
In commit df06fd678260 ("can: m_can: m_can_chip_config(): enable and
configure internal timestamps") the timestamping in the m_can core
should be enabled. In peripheral mode, the RX'ed CAN frames, TX
compete frames and error events are sorted by the timestamp.
The above mentioned commit however forgot to enable the timestamping.
Add the missing bits to enable the timestamp counter to the write of
the Timestamp Counter Configuration register.
Link: https://lore.kernel.org/all/20220612212708.4081756-1-mkl@pengutronix.de
Fixes: df06fd678260 ("can: m_can: m_can_chip_config(): enable and configure internal timestamps")
Cc: <stable(a)vger.kernel.org> # 5.13
Cc: Torin Cooper-Bennun <torin(a)maxiluxsystems.com>
Reviewed-by: Chandrasekar Ramakrishnan <rcsekar(a)samsung.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/m_can/m_can.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 5d0c82d8b9a9..03a22d493cf6 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1351,7 +1351,9 @@ static void m_can_chip_config(struct net_device *dev)
/* enable internal timestamp generation, with a prescalar of 16. The
* prescalar is applied to the nominal bit timing
*/
- m_can_write(cdev, M_CAN_TSCC, FIELD_PREP(TSCC_TCP_MASK, 0xf));
+ m_can_write(cdev, M_CAN_TSCC,
+ FIELD_PREP(TSCC_TCP_MASK, 0xf) |
+ FIELD_PREP(TSCC_TSS_MASK, TSCC_TSS_INTERNAL));
m_can_config_endisable(cdev, false);
--
2.35.1
From: Rhett Aultman <rhett.aultman(a)samsara.com>
The gs_usb driver appears to suffer from a malady common to many USB
CAN adapter drivers in that it performs usb_alloc_coherent() to
allocate a number of USB request blocks (URBs) for RX, and then later
relies on usb_kill_anchored_urbs() to free them, but this doesn't
actually free them. As a result, this may be leaking DMA memory that's
been used by the driver.
This commit is an adaptation of the techniques found in the esd_usb2
driver where a similar design pattern led to a memory leak. It
explicitly frees the RX URBs and their DMA memory via a call to
usb_free_coherent(). Since the RX URBs were allocated in the
gs_can_open(), we remove them in gs_can_close() rather than in the
disconnect function as was done in esd_usb2.
For more information, see the 928150fad41b ("can: esd_usb2: fix memory
leak").
Link: https://lore.kernel.org/all/alpine.DEB.2.22.394.2206031547001.1630869@thela…
Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Cc: stable(a)vger.kernel.org
Signed-off-by: Rhett Aultman <rhett.aultman(a)samsara.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/gs_usb.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index b29ba9138866..d3a658b444b5 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -268,6 +268,8 @@ struct gs_can {
struct usb_anchor tx_submitted;
atomic_t active_tx_urbs;
+ void *rxbuf[GS_MAX_RX_URBS];
+ dma_addr_t rxbuf_dma[GS_MAX_RX_URBS];
};
/* usb interface struct */
@@ -742,6 +744,7 @@ static int gs_can_open(struct net_device *netdev)
for (i = 0; i < GS_MAX_RX_URBS; i++) {
struct urb *urb;
u8 *buf;
+ dma_addr_t buf_dma;
/* alloc rx urb */
urb = usb_alloc_urb(0, GFP_KERNEL);
@@ -752,7 +755,7 @@ static int gs_can_open(struct net_device *netdev)
buf = usb_alloc_coherent(dev->udev,
dev->parent->hf_size_rx,
GFP_KERNEL,
- &urb->transfer_dma);
+ &buf_dma);
if (!buf) {
netdev_err(netdev,
"No memory left for USB buffer\n");
@@ -760,6 +763,8 @@ static int gs_can_open(struct net_device *netdev)
return -ENOMEM;
}
+ urb->transfer_dma = buf_dma;
+
/* fill, anchor, and submit rx urb */
usb_fill_bulk_urb(urb,
dev->udev,
@@ -781,10 +786,17 @@ static int gs_can_open(struct net_device *netdev)
"usb_submit failed (err=%d)\n", rc);
usb_unanchor_urb(urb);
+ usb_free_coherent(dev->udev,
+ sizeof(struct gs_host_frame),
+ buf,
+ buf_dma);
usb_free_urb(urb);
break;
}
+ dev->rxbuf[i] = buf;
+ dev->rxbuf_dma[i] = buf_dma;
+
/* Drop reference,
* USB core will take care of freeing it
*/
@@ -842,13 +854,20 @@ static int gs_can_close(struct net_device *netdev)
int rc;
struct gs_can *dev = netdev_priv(netdev);
struct gs_usb *parent = dev->parent;
+ unsigned int i;
netif_stop_queue(netdev);
/* Stop polling */
parent->active_channels--;
- if (!parent->active_channels)
+ if (!parent->active_channels) {
usb_kill_anchored_urbs(&parent->rx_submitted);
+ for (i = 0; i < GS_MAX_RX_URBS; i++)
+ usb_free_coherent(dev->udev,
+ sizeof(struct gs_host_frame),
+ dev->rxbuf[i],
+ dev->rxbuf_dma[i]);
+ }
/* Stop sending URBs */
usb_kill_anchored_urbs(&dev->tx_submitted);
--
2.35.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4f74400308cb8abde5fdc9cad609c2aba32110c Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Sat, 11 Jun 2022 00:20:23 +0200
Subject: [PATCH] s390/archrandom: simplify back to earlier design and
initialize earlier
s390x appears to present two RNG interfaces:
- a "TRNG" that gathers entropy using some hardware function; and
- a "DRBG" that takes in a seed and expands it.
Previously, the TRNG was wired up to arch_get_random_{long,int}(), but
it was observed that this was being called really frequently, resulting
in high overhead. So it was changed to be wired up to arch_get_random_
seed_{long,int}(), which was a reasonable decision. Later on, the DRBG
was then wired up to arch_get_random_{long,int}(), with a complicated
buffer filling thread, to control overhead and rate.
Fortunately, none of the performance issues matter much now. The RNG
always attempts to use arch_get_random_seed_{long,int}() first, which
means a complicated implementation of arch_get_random_{long,int}() isn't
really valuable or useful to have around. And it's only used when
reseeding, which means it won't hit the high throughput complications
that were faced before.
So this commit returns to an earlier design of just calling the TRNG in
arch_get_random_seed_{long,int}(), and returning false in arch_get_
random_{long,int}().
Part of what makes the simplification possible is that the RNG now seeds
itself using the TRNG at bootup. But this only works if the TRNG is
detected early in boot, before random_init() is called. So this commit
also causes that check to happen in setup_arch().
Cc: stable(a)vger.kernel.org
Cc: Harald Freudenberger <freude(a)linux.ibm.com>
Cc: Ingo Franzki <ifranzki(a)linux.ibm.com>
Cc: Juergen Christ <jchrist(a)linux.ibm.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Link: https://lore.kernel.org/r/20220610222023.378448-1-Jason@zx2c4.com
Reviewed-by: Harald Freudenberger <freude(a)linux.ibm.com>
Acked-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
diff --git a/arch/s390/crypto/arch_random.c b/arch/s390/crypto/arch_random.c
index 56007c763902..1f2d40993c4d 100644
--- a/arch/s390/crypto/arch_random.c
+++ b/arch/s390/crypto/arch_random.c
@@ -4,232 +4,15 @@
*
* Copyright IBM Corp. 2017, 2020
* Author(s): Harald Freudenberger
- *
- * The s390_arch_random_generate() function may be called from random.c
- * in interrupt context. So this implementation does the best to be very
- * fast. There is a buffer of random data which is asynchronously checked
- * and filled by a workqueue thread.
- * If there are enough bytes in the buffer the s390_arch_random_generate()
- * just delivers these bytes. Otherwise false is returned until the
- * worker thread refills the buffer.
- * The worker fills the rng buffer by pulling fresh entropy from the
- * high quality (but slow) true hardware random generator. This entropy
- * is then spread over the buffer with an pseudo random generator PRNG.
- * As the arch_get_random_seed_long() fetches 8 bytes and the calling
- * function add_interrupt_randomness() counts this as 1 bit entropy the
- * distribution needs to make sure there is in fact 1 bit entropy contained
- * in 8 bytes of the buffer. The current values pull 32 byte entropy
- * and scatter this into a 2048 byte buffer. So 8 byte in the buffer
- * will contain 1 bit of entropy.
- * The worker thread is rescheduled based on the charge level of the
- * buffer but at least with 500 ms delay to avoid too much CPU consumption.
- * So the max. amount of rng data delivered via arch_get_random_seed is
- * limited to 4k bytes per second.
*/
#include <linux/kernel.h>
#include <linux/atomic.h>
#include <linux/random.h>
-#include <linux/slab.h>
#include <linux/static_key.h>
-#include <linux/workqueue.h>
-#include <linux/moduleparam.h>
#include <asm/cpacf.h>
DEFINE_STATIC_KEY_FALSE(s390_arch_random_available);
atomic64_t s390_arch_random_counter = ATOMIC64_INIT(0);
EXPORT_SYMBOL(s390_arch_random_counter);
-
-#define ARCH_REFILL_TICKS (HZ/2)
-#define ARCH_PRNG_SEED_SIZE 32
-#define ARCH_RNG_BUF_SIZE 2048
-
-static DEFINE_SPINLOCK(arch_rng_lock);
-static u8 *arch_rng_buf;
-static unsigned int arch_rng_buf_idx;
-
-static void arch_rng_refill_buffer(struct work_struct *);
-static DECLARE_DELAYED_WORK(arch_rng_work, arch_rng_refill_buffer);
-
-bool s390_arch_random_generate(u8 *buf, unsigned int nbytes)
-{
- /* max hunk is ARCH_RNG_BUF_SIZE */
- if (nbytes > ARCH_RNG_BUF_SIZE)
- return false;
-
- /* lock rng buffer */
- if (!spin_trylock(&arch_rng_lock))
- return false;
-
- /* try to resolve the requested amount of bytes from the buffer */
- arch_rng_buf_idx -= nbytes;
- if (arch_rng_buf_idx < ARCH_RNG_BUF_SIZE) {
- memcpy(buf, arch_rng_buf + arch_rng_buf_idx, nbytes);
- atomic64_add(nbytes, &s390_arch_random_counter);
- spin_unlock(&arch_rng_lock);
- return true;
- }
-
- /* not enough bytes in rng buffer, refill is done asynchronously */
- spin_unlock(&arch_rng_lock);
-
- return false;
-}
-EXPORT_SYMBOL(s390_arch_random_generate);
-
-static void arch_rng_refill_buffer(struct work_struct *unused)
-{
- unsigned int delay = ARCH_REFILL_TICKS;
-
- spin_lock(&arch_rng_lock);
- if (arch_rng_buf_idx > ARCH_RNG_BUF_SIZE) {
- /* buffer is exhausted and needs refill */
- u8 seed[ARCH_PRNG_SEED_SIZE];
- u8 prng_wa[240];
- /* fetch ARCH_PRNG_SEED_SIZE bytes of entropy */
- cpacf_trng(NULL, 0, seed, sizeof(seed));
- /* blow this entropy up to ARCH_RNG_BUF_SIZE with PRNG */
- memset(prng_wa, 0, sizeof(prng_wa));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED,
- &prng_wa, NULL, 0, seed, sizeof(seed));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_GEN,
- &prng_wa, arch_rng_buf, ARCH_RNG_BUF_SIZE, NULL, 0);
- arch_rng_buf_idx = ARCH_RNG_BUF_SIZE;
- }
- delay += (ARCH_REFILL_TICKS * arch_rng_buf_idx) / ARCH_RNG_BUF_SIZE;
- spin_unlock(&arch_rng_lock);
-
- /* kick next check */
- queue_delayed_work(system_long_wq, &arch_rng_work, delay);
-}
-
-/*
- * Here follows the implementation of s390_arch_get_random_long().
- *
- * The random longs to be pulled by arch_get_random_long() are
- * prepared in an 4K buffer which is filled from the NIST 800-90
- * compliant s390 drbg. By default the random long buffer is refilled
- * 256 times before the drbg itself needs a reseed. The reseed of the
- * drbg is done with 32 bytes fetched from the high quality (but slow)
- * trng which is assumed to deliver 100% entropy. So the 32 * 8 = 256
- * bits of entropy are spread over 256 * 4KB = 1MB serving 131072
- * arch_get_random_long() invocations before reseeded.
- *
- * How often the 4K random long buffer is refilled with the drbg
- * before the drbg is reseeded can be adjusted. There is a module
- * parameter 's390_arch_rnd_long_drbg_reseed' accessible via
- * /sys/module/arch_random/parameters/rndlong_drbg_reseed
- * or as kernel command line parameter
- * arch_random.rndlong_drbg_reseed=<value>
- * This parameter tells how often the drbg fills the 4K buffer before
- * it is re-seeded by fresh entropy from the trng.
- * A value of 16 results in reseeding the drbg at every 16 * 4 KB = 64
- * KB with 32 bytes of fresh entropy pulled from the trng. So a value
- * of 16 would result in 256 bits entropy per 64 KB.
- * A value of 256 results in 1MB of drbg output before a reseed of the
- * drbg is done. So this would spread the 256 bits of entropy among 1MB.
- * Setting this parameter to 0 forces the reseed to take place every
- * time the 4K buffer is depleted, so the entropy rises to 256 bits
- * entropy per 4K or 0.5 bit entropy per arch_get_random_long(). With
- * setting this parameter to negative values all this effort is
- * disabled, arch_get_random long() returns false and thus indicating
- * that the arch_get_random_long() feature is disabled at all.
- */
-
-static unsigned long rndlong_buf[512];
-static DEFINE_SPINLOCK(rndlong_lock);
-static int rndlong_buf_index;
-
-static int rndlong_drbg_reseed = 256;
-module_param_named(rndlong_drbg_reseed, rndlong_drbg_reseed, int, 0600);
-MODULE_PARM_DESC(rndlong_drbg_reseed, "s390 arch_get_random_long() drbg reseed");
-
-static inline void refill_rndlong_buf(void)
-{
- static u8 prng_ws[240];
- static int drbg_counter;
-
- if (--drbg_counter < 0) {
- /* need to re-seed the drbg */
- u8 seed[32];
-
- /* fetch seed from trng */
- cpacf_trng(NULL, 0, seed, sizeof(seed));
- /* seed drbg */
- memset(prng_ws, 0, sizeof(prng_ws));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED,
- &prng_ws, NULL, 0, seed, sizeof(seed));
- /* re-init counter for drbg */
- drbg_counter = rndlong_drbg_reseed;
- }
-
- /* fill the arch_get_random_long buffer from drbg */
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_GEN, &prng_ws,
- (u8 *) rndlong_buf, sizeof(rndlong_buf),
- NULL, 0);
-}
-
-bool s390_arch_get_random_long(unsigned long *v)
-{
- bool rc = false;
- unsigned long flags;
-
- /* arch_get_random_long() disabled ? */
- if (rndlong_drbg_reseed < 0)
- return false;
-
- /* try to lock the random long lock */
- if (!spin_trylock_irqsave(&rndlong_lock, flags))
- return false;
-
- if (--rndlong_buf_index >= 0) {
- /* deliver next long value from the buffer */
- *v = rndlong_buf[rndlong_buf_index];
- rc = true;
- goto out;
- }
-
- /* buffer is depleted and needs refill */
- if (in_interrupt()) {
- /* delay refill in interrupt context to next caller */
- rndlong_buf_index = 0;
- goto out;
- }
-
- /* refill random long buffer */
- refill_rndlong_buf();
- rndlong_buf_index = ARRAY_SIZE(rndlong_buf);
-
- /* and provide one random long */
- *v = rndlong_buf[--rndlong_buf_index];
- rc = true;
-
-out:
- spin_unlock_irqrestore(&rndlong_lock, flags);
- return rc;
-}
-EXPORT_SYMBOL(s390_arch_get_random_long);
-
-static int __init s390_arch_random_init(void)
-{
- /* all the needed PRNO subfunctions available ? */
- if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG) &&
- cpacf_query_func(CPACF_PRNO, CPACF_PRNO_SHA512_DRNG_GEN)) {
-
- /* alloc arch random working buffer */
- arch_rng_buf = kmalloc(ARCH_RNG_BUF_SIZE, GFP_KERNEL);
- if (!arch_rng_buf)
- return -ENOMEM;
-
- /* kick worker queue job to fill the random buffer */
- queue_delayed_work(system_long_wq,
- &arch_rng_work, ARCH_REFILL_TICKS);
-
- /* enable arch random to the outside world */
- static_branch_enable(&s390_arch_random_available);
- }
-
- return 0;
-}
-arch_initcall(s390_arch_random_init);
diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h
index 5dc712fde3c7..2c6e1c6ecbe7 100644
--- a/arch/s390/include/asm/archrandom.h
+++ b/arch/s390/include/asm/archrandom.h
@@ -15,17 +15,13 @@
#include <linux/static_key.h>
#include <linux/atomic.h>
+#include <asm/cpacf.h>
DECLARE_STATIC_KEY_FALSE(s390_arch_random_available);
extern atomic64_t s390_arch_random_counter;
-bool s390_arch_get_random_long(unsigned long *v);
-bool s390_arch_random_generate(u8 *buf, unsigned int nbytes);
-
static inline bool __must_check arch_get_random_long(unsigned long *v)
{
- if (static_branch_likely(&s390_arch_random_available))
- return s390_arch_get_random_long(v);
return false;
}
@@ -37,7 +33,9 @@ static inline bool __must_check arch_get_random_int(unsigned int *v)
static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
- return s390_arch_random_generate((u8 *)v, sizeof(*v));
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
}
return false;
}
@@ -45,7 +43,9 @@ static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
- return s390_arch_random_generate((u8 *)v, sizeof(*v));
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
}
return false;
}
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 8d91eccc0963..0a37f5de2863 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -875,6 +875,11 @@ static void __init setup_randomness(void)
if (stsi(vmms, 3, 2, 2) == 0 && vmms->count)
add_device_randomness(&vmms->vm, sizeof(vmms->vm[0]) * vmms->count);
memblock_free(vmms, PAGE_SIZE);
+
+#ifdef CONFIG_ARCH_RANDOM
+ if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG))
+ static_branch_enable(&s390_arch_random_available);
+#endif
}
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4f74400308cb8abde5fdc9cad609c2aba32110c Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Sat, 11 Jun 2022 00:20:23 +0200
Subject: [PATCH] s390/archrandom: simplify back to earlier design and
initialize earlier
s390x appears to present two RNG interfaces:
- a "TRNG" that gathers entropy using some hardware function; and
- a "DRBG" that takes in a seed and expands it.
Previously, the TRNG was wired up to arch_get_random_{long,int}(), but
it was observed that this was being called really frequently, resulting
in high overhead. So it was changed to be wired up to arch_get_random_
seed_{long,int}(), which was a reasonable decision. Later on, the DRBG
was then wired up to arch_get_random_{long,int}(), with a complicated
buffer filling thread, to control overhead and rate.
Fortunately, none of the performance issues matter much now. The RNG
always attempts to use arch_get_random_seed_{long,int}() first, which
means a complicated implementation of arch_get_random_{long,int}() isn't
really valuable or useful to have around. And it's only used when
reseeding, which means it won't hit the high throughput complications
that were faced before.
So this commit returns to an earlier design of just calling the TRNG in
arch_get_random_seed_{long,int}(), and returning false in arch_get_
random_{long,int}().
Part of what makes the simplification possible is that the RNG now seeds
itself using the TRNG at bootup. But this only works if the TRNG is
detected early in boot, before random_init() is called. So this commit
also causes that check to happen in setup_arch().
Cc: stable(a)vger.kernel.org
Cc: Harald Freudenberger <freude(a)linux.ibm.com>
Cc: Ingo Franzki <ifranzki(a)linux.ibm.com>
Cc: Juergen Christ <jchrist(a)linux.ibm.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Link: https://lore.kernel.org/r/20220610222023.378448-1-Jason@zx2c4.com
Reviewed-by: Harald Freudenberger <freude(a)linux.ibm.com>
Acked-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
diff --git a/arch/s390/crypto/arch_random.c b/arch/s390/crypto/arch_random.c
index 56007c763902..1f2d40993c4d 100644
--- a/arch/s390/crypto/arch_random.c
+++ b/arch/s390/crypto/arch_random.c
@@ -4,232 +4,15 @@
*
* Copyright IBM Corp. 2017, 2020
* Author(s): Harald Freudenberger
- *
- * The s390_arch_random_generate() function may be called from random.c
- * in interrupt context. So this implementation does the best to be very
- * fast. There is a buffer of random data which is asynchronously checked
- * and filled by a workqueue thread.
- * If there are enough bytes in the buffer the s390_arch_random_generate()
- * just delivers these bytes. Otherwise false is returned until the
- * worker thread refills the buffer.
- * The worker fills the rng buffer by pulling fresh entropy from the
- * high quality (but slow) true hardware random generator. This entropy
- * is then spread over the buffer with an pseudo random generator PRNG.
- * As the arch_get_random_seed_long() fetches 8 bytes and the calling
- * function add_interrupt_randomness() counts this as 1 bit entropy the
- * distribution needs to make sure there is in fact 1 bit entropy contained
- * in 8 bytes of the buffer. The current values pull 32 byte entropy
- * and scatter this into a 2048 byte buffer. So 8 byte in the buffer
- * will contain 1 bit of entropy.
- * The worker thread is rescheduled based on the charge level of the
- * buffer but at least with 500 ms delay to avoid too much CPU consumption.
- * So the max. amount of rng data delivered via arch_get_random_seed is
- * limited to 4k bytes per second.
*/
#include <linux/kernel.h>
#include <linux/atomic.h>
#include <linux/random.h>
-#include <linux/slab.h>
#include <linux/static_key.h>
-#include <linux/workqueue.h>
-#include <linux/moduleparam.h>
#include <asm/cpacf.h>
DEFINE_STATIC_KEY_FALSE(s390_arch_random_available);
atomic64_t s390_arch_random_counter = ATOMIC64_INIT(0);
EXPORT_SYMBOL(s390_arch_random_counter);
-
-#define ARCH_REFILL_TICKS (HZ/2)
-#define ARCH_PRNG_SEED_SIZE 32
-#define ARCH_RNG_BUF_SIZE 2048
-
-static DEFINE_SPINLOCK(arch_rng_lock);
-static u8 *arch_rng_buf;
-static unsigned int arch_rng_buf_idx;
-
-static void arch_rng_refill_buffer(struct work_struct *);
-static DECLARE_DELAYED_WORK(arch_rng_work, arch_rng_refill_buffer);
-
-bool s390_arch_random_generate(u8 *buf, unsigned int nbytes)
-{
- /* max hunk is ARCH_RNG_BUF_SIZE */
- if (nbytes > ARCH_RNG_BUF_SIZE)
- return false;
-
- /* lock rng buffer */
- if (!spin_trylock(&arch_rng_lock))
- return false;
-
- /* try to resolve the requested amount of bytes from the buffer */
- arch_rng_buf_idx -= nbytes;
- if (arch_rng_buf_idx < ARCH_RNG_BUF_SIZE) {
- memcpy(buf, arch_rng_buf + arch_rng_buf_idx, nbytes);
- atomic64_add(nbytes, &s390_arch_random_counter);
- spin_unlock(&arch_rng_lock);
- return true;
- }
-
- /* not enough bytes in rng buffer, refill is done asynchronously */
- spin_unlock(&arch_rng_lock);
-
- return false;
-}
-EXPORT_SYMBOL(s390_arch_random_generate);
-
-static void arch_rng_refill_buffer(struct work_struct *unused)
-{
- unsigned int delay = ARCH_REFILL_TICKS;
-
- spin_lock(&arch_rng_lock);
- if (arch_rng_buf_idx > ARCH_RNG_BUF_SIZE) {
- /* buffer is exhausted and needs refill */
- u8 seed[ARCH_PRNG_SEED_SIZE];
- u8 prng_wa[240];
- /* fetch ARCH_PRNG_SEED_SIZE bytes of entropy */
- cpacf_trng(NULL, 0, seed, sizeof(seed));
- /* blow this entropy up to ARCH_RNG_BUF_SIZE with PRNG */
- memset(prng_wa, 0, sizeof(prng_wa));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED,
- &prng_wa, NULL, 0, seed, sizeof(seed));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_GEN,
- &prng_wa, arch_rng_buf, ARCH_RNG_BUF_SIZE, NULL, 0);
- arch_rng_buf_idx = ARCH_RNG_BUF_SIZE;
- }
- delay += (ARCH_REFILL_TICKS * arch_rng_buf_idx) / ARCH_RNG_BUF_SIZE;
- spin_unlock(&arch_rng_lock);
-
- /* kick next check */
- queue_delayed_work(system_long_wq, &arch_rng_work, delay);
-}
-
-/*
- * Here follows the implementation of s390_arch_get_random_long().
- *
- * The random longs to be pulled by arch_get_random_long() are
- * prepared in an 4K buffer which is filled from the NIST 800-90
- * compliant s390 drbg. By default the random long buffer is refilled
- * 256 times before the drbg itself needs a reseed. The reseed of the
- * drbg is done with 32 bytes fetched from the high quality (but slow)
- * trng which is assumed to deliver 100% entropy. So the 32 * 8 = 256
- * bits of entropy are spread over 256 * 4KB = 1MB serving 131072
- * arch_get_random_long() invocations before reseeded.
- *
- * How often the 4K random long buffer is refilled with the drbg
- * before the drbg is reseeded can be adjusted. There is a module
- * parameter 's390_arch_rnd_long_drbg_reseed' accessible via
- * /sys/module/arch_random/parameters/rndlong_drbg_reseed
- * or as kernel command line parameter
- * arch_random.rndlong_drbg_reseed=<value>
- * This parameter tells how often the drbg fills the 4K buffer before
- * it is re-seeded by fresh entropy from the trng.
- * A value of 16 results in reseeding the drbg at every 16 * 4 KB = 64
- * KB with 32 bytes of fresh entropy pulled from the trng. So a value
- * of 16 would result in 256 bits entropy per 64 KB.
- * A value of 256 results in 1MB of drbg output before a reseed of the
- * drbg is done. So this would spread the 256 bits of entropy among 1MB.
- * Setting this parameter to 0 forces the reseed to take place every
- * time the 4K buffer is depleted, so the entropy rises to 256 bits
- * entropy per 4K or 0.5 bit entropy per arch_get_random_long(). With
- * setting this parameter to negative values all this effort is
- * disabled, arch_get_random long() returns false and thus indicating
- * that the arch_get_random_long() feature is disabled at all.
- */
-
-static unsigned long rndlong_buf[512];
-static DEFINE_SPINLOCK(rndlong_lock);
-static int rndlong_buf_index;
-
-static int rndlong_drbg_reseed = 256;
-module_param_named(rndlong_drbg_reseed, rndlong_drbg_reseed, int, 0600);
-MODULE_PARM_DESC(rndlong_drbg_reseed, "s390 arch_get_random_long() drbg reseed");
-
-static inline void refill_rndlong_buf(void)
-{
- static u8 prng_ws[240];
- static int drbg_counter;
-
- if (--drbg_counter < 0) {
- /* need to re-seed the drbg */
- u8 seed[32];
-
- /* fetch seed from trng */
- cpacf_trng(NULL, 0, seed, sizeof(seed));
- /* seed drbg */
- memset(prng_ws, 0, sizeof(prng_ws));
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED,
- &prng_ws, NULL, 0, seed, sizeof(seed));
- /* re-init counter for drbg */
- drbg_counter = rndlong_drbg_reseed;
- }
-
- /* fill the arch_get_random_long buffer from drbg */
- cpacf_prno(CPACF_PRNO_SHA512_DRNG_GEN, &prng_ws,
- (u8 *) rndlong_buf, sizeof(rndlong_buf),
- NULL, 0);
-}
-
-bool s390_arch_get_random_long(unsigned long *v)
-{
- bool rc = false;
- unsigned long flags;
-
- /* arch_get_random_long() disabled ? */
- if (rndlong_drbg_reseed < 0)
- return false;
-
- /* try to lock the random long lock */
- if (!spin_trylock_irqsave(&rndlong_lock, flags))
- return false;
-
- if (--rndlong_buf_index >= 0) {
- /* deliver next long value from the buffer */
- *v = rndlong_buf[rndlong_buf_index];
- rc = true;
- goto out;
- }
-
- /* buffer is depleted and needs refill */
- if (in_interrupt()) {
- /* delay refill in interrupt context to next caller */
- rndlong_buf_index = 0;
- goto out;
- }
-
- /* refill random long buffer */
- refill_rndlong_buf();
- rndlong_buf_index = ARRAY_SIZE(rndlong_buf);
-
- /* and provide one random long */
- *v = rndlong_buf[--rndlong_buf_index];
- rc = true;
-
-out:
- spin_unlock_irqrestore(&rndlong_lock, flags);
- return rc;
-}
-EXPORT_SYMBOL(s390_arch_get_random_long);
-
-static int __init s390_arch_random_init(void)
-{
- /* all the needed PRNO subfunctions available ? */
- if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG) &&
- cpacf_query_func(CPACF_PRNO, CPACF_PRNO_SHA512_DRNG_GEN)) {
-
- /* alloc arch random working buffer */
- arch_rng_buf = kmalloc(ARCH_RNG_BUF_SIZE, GFP_KERNEL);
- if (!arch_rng_buf)
- return -ENOMEM;
-
- /* kick worker queue job to fill the random buffer */
- queue_delayed_work(system_long_wq,
- &arch_rng_work, ARCH_REFILL_TICKS);
-
- /* enable arch random to the outside world */
- static_branch_enable(&s390_arch_random_available);
- }
-
- return 0;
-}
-arch_initcall(s390_arch_random_init);
diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h
index 5dc712fde3c7..2c6e1c6ecbe7 100644
--- a/arch/s390/include/asm/archrandom.h
+++ b/arch/s390/include/asm/archrandom.h
@@ -15,17 +15,13 @@
#include <linux/static_key.h>
#include <linux/atomic.h>
+#include <asm/cpacf.h>
DECLARE_STATIC_KEY_FALSE(s390_arch_random_available);
extern atomic64_t s390_arch_random_counter;
-bool s390_arch_get_random_long(unsigned long *v);
-bool s390_arch_random_generate(u8 *buf, unsigned int nbytes);
-
static inline bool __must_check arch_get_random_long(unsigned long *v)
{
- if (static_branch_likely(&s390_arch_random_available))
- return s390_arch_get_random_long(v);
return false;
}
@@ -37,7 +33,9 @@ static inline bool __must_check arch_get_random_int(unsigned int *v)
static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
- return s390_arch_random_generate((u8 *)v, sizeof(*v));
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
}
return false;
}
@@ -45,7 +43,9 @@ static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
- return s390_arch_random_generate((u8 *)v, sizeof(*v));
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
}
return false;
}
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 8d91eccc0963..0a37f5de2863 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -875,6 +875,11 @@ static void __init setup_randomness(void)
if (stsi(vmms, 3, 2, 2) == 0 && vmms->count)
add_device_randomness(&vmms->vm, sizeof(vmms->vm[0]) * vmms->count);
memblock_free(vmms, PAGE_SIZE);
+
+#ifdef CONFIG_ARCH_RANDOM
+ if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG))
+ static_branch_enable(&s390_arch_random_available);
+#endif
}
/*
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f3eac426657d985b97c92fa5f7ae1d43f04721f3 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 21 Jun 2022 16:08:49 +0200
Subject: [PATCH] powerpc/powernv: wire up rng during setup_arch
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call.
Complicating things, however, is that POWER8 systems need some per-cpu
state and kmalloc, which isn't available at this stage. So we split
things up into an early phase and a later opportunistic phase. This
commit also removes some noisy log messages that don't add much.
Fixes: a4da0d50b2a0 ("powerpc: Implement arch_get_random_long/int() for powernv")
Cc: stable(a)vger.kernel.org # v3.13+
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Reviewed-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
[mpe: Add of_node_put(), use pnv naming, minor change log editing]
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20220621140849.127227-1-Jason@zx2c4.com
diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
index e297bf4abfcb..866efdc103fd 100644
--- a/arch/powerpc/platforms/powernv/powernv.h
+++ b/arch/powerpc/platforms/powernv/powernv.h
@@ -42,4 +42,6 @@ ssize_t memcons_copy(struct memcons *mc, char *to, loff_t pos, size_t count);
u32 __init memcons_get_size(struct memcons *mc);
struct memcons *__init memcons_init(struct device_node *node, const char *mc_prop_name);
+void pnv_rng_init(void);
+
#endif /* _POWERNV_H */
diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
index e3d44b36ae98..463c78c52cc5 100644
--- a/arch/powerpc/platforms/powernv/rng.c
+++ b/arch/powerpc/platforms/powernv/rng.c
@@ -17,6 +17,7 @@
#include <asm/prom.h>
#include <asm/machdep.h>
#include <asm/smp.h>
+#include "powernv.h"
#define DARN_ERR 0xFFFFFFFFFFFFFFFFul
@@ -28,7 +29,6 @@ struct powernv_rng {
static DEFINE_PER_CPU(struct powernv_rng *, powernv_rng);
-
int powernv_hwrng_present(void)
{
struct powernv_rng *rng;
@@ -98,9 +98,6 @@ static int __init initialise_darn(void)
return 0;
}
}
-
- pr_warn("Unable to use DARN for get_random_seed()\n");
-
return -EIO;
}
@@ -163,32 +160,55 @@ static __init int rng_create(struct device_node *dn)
rng_init_per_cpu(rng, dn);
- pr_info_once("Registering arch random hook.\n");
-
ppc_md.get_random_seed = powernv_get_random_long;
return 0;
}
-static __init int rng_init(void)
+static int __init pnv_get_random_long_early(unsigned long *v)
{
struct device_node *dn;
- int rc;
+
+ if (!slab_is_available())
+ return 0;
+
+ if (cmpxchg(&ppc_md.get_random_seed, pnv_get_random_long_early,
+ NULL) != pnv_get_random_long_early)
+ return 0;
for_each_compatible_node(dn, NULL, "ibm,power-rng") {
- rc = rng_create(dn);
- if (rc) {
- pr_err("Failed creating rng for %pOF (%d).\n",
- dn, rc);
+ if (rng_create(dn))
continue;
- }
-
/* Create devices for hwrng driver */
of_platform_device_create(dn, NULL, NULL);
}
- initialise_darn();
+ if (!ppc_md.get_random_seed)
+ return 0;
+ return ppc_md.get_random_seed(v);
+}
+
+void __init pnv_rng_init(void)
+{
+ struct device_node *dn;
+ /* Prefer darn over the rest. */
+ if (!initialise_darn())
+ return;
+
+ dn = of_find_compatible_node(NULL, NULL, "ibm,power-rng");
+ if (dn)
+ ppc_md.get_random_seed = pnv_get_random_long_early;
+
+ of_node_put(dn);
+}
+
+static int __init pnv_rng_late_init(void)
+{
+ unsigned long v;
+ /* In case it wasn't called during init for some other reason. */
+ if (ppc_md.get_random_seed == pnv_get_random_long_early)
+ pnv_get_random_long_early(&v);
return 0;
}
-machine_subsys_initcall(powernv, rng_init);
+machine_subsys_initcall(powernv, pnv_rng_late_init);
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 824c3ad7a0fa..dac545aa0308 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -203,6 +203,8 @@ static void __init pnv_setup_arch(void)
pnv_check_guarded_cores();
/* XXX PMCS */
+
+ pnv_rng_init();
}
static void __init pnv_init(void)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fd37c2ecb21f7aee04ccca5f561469f07d00063c Mon Sep 17 00:00:00 2001
From: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Date: Mon, 27 Jun 2022 18:02:43 -0700
Subject: [PATCH] selftests: mptcp: Initialize variables to quiet gcc 12
warnings
In a few MPTCP selftest tools, gcc 12 complains that the 'sock' variable
might be used uninitialized. This is a false positive because the only
code path that could lead to uninitialized access is where getaddrinfo()
fails, but the local xgetaddrinfo() wrapper exits if such a failure
occurs.
Initialize the 'sock' variable anyway to allow the tools to build with
gcc 12.
Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Acked-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index 8628aa61b763..e2ea6c126c99 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -265,7 +265,7 @@ static void sock_test_tcpulp(int sock, int proto, unsigned int line)
static int sock_listen_mptcp(const char * const listenaddr,
const char * const port)
{
- int sock;
+ int sock = -1;
struct addrinfo hints = {
.ai_protocol = IPPROTO_TCP,
.ai_socktype = SOCK_STREAM,
diff --git a/tools/testing/selftests/net/mptcp/mptcp_inq.c b/tools/testing/selftests/net/mptcp/mptcp_inq.c
index 29f75e2a1116..8672d898f8cd 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_inq.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_inq.c
@@ -88,7 +88,7 @@ static void xgetaddrinfo(const char *node, const char *service,
static int sock_listen_mptcp(const char * const listenaddr,
const char * const port)
{
- int sock;
+ int sock = -1;
struct addrinfo hints = {
.ai_protocol = IPPROTO_TCP,
.ai_socktype = SOCK_STREAM,
diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
index ac9a4d9c1764..ae61f39556ca 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
@@ -136,7 +136,7 @@ static void xgetaddrinfo(const char *node, const char *service,
static int sock_listen_mptcp(const char * const listenaddr,
const char * const port)
{
- int sock;
+ int sock = -1;
struct addrinfo hints = {
.ai_protocol = IPPROTO_TCP,
.ai_socktype = SOCK_STREAM,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fd37c2ecb21f7aee04ccca5f561469f07d00063c Mon Sep 17 00:00:00 2001
From: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Date: Mon, 27 Jun 2022 18:02:43 -0700
Subject: [PATCH] selftests: mptcp: Initialize variables to quiet gcc 12
warnings
In a few MPTCP selftest tools, gcc 12 complains that the 'sock' variable
might be used uninitialized. This is a false positive because the only
code path that could lead to uninitialized access is where getaddrinfo()
fails, but the local xgetaddrinfo() wrapper exits if such a failure
occurs.
Initialize the 'sock' variable anyway to allow the tools to build with
gcc 12.
Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Acked-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index 8628aa61b763..e2ea6c126c99 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -265,7 +265,7 @@ static void sock_test_tcpulp(int sock, int proto, unsigned int line)
static int sock_listen_mptcp(const char * const listenaddr,
const char * const port)
{
- int sock;
+ int sock = -1;
struct addrinfo hints = {
.ai_protocol = IPPROTO_TCP,
.ai_socktype = SOCK_STREAM,
diff --git a/tools/testing/selftests/net/mptcp/mptcp_inq.c b/tools/testing/selftests/net/mptcp/mptcp_inq.c
index 29f75e2a1116..8672d898f8cd 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_inq.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_inq.c
@@ -88,7 +88,7 @@ static void xgetaddrinfo(const char *node, const char *service,
static int sock_listen_mptcp(const char * const listenaddr,
const char * const port)
{
- int sock;
+ int sock = -1;
struct addrinfo hints = {
.ai_protocol = IPPROTO_TCP,
.ai_socktype = SOCK_STREAM,
diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
index ac9a4d9c1764..ae61f39556ca 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
@@ -136,7 +136,7 @@ static void xgetaddrinfo(const char *node, const char *service,
static int sock_listen_mptcp(const char * const listenaddr,
const char * const port)
{
- int sock;
+ int sock = -1;
struct addrinfo hints = {
.ai_protocol = IPPROTO_TCP,
.ai_socktype = SOCK_STREAM,
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 31bf11de146c3f8892093ff39f8f9b3069d6a852 Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Mon, 27 Jun 2022 18:02:36 -0700
Subject: [PATCH] mptcp: introduce MAPPING_BAD_CSUM
This allow moving a couple of conditional out of the fast path,
making the code more easy to follow and will simplify the next
patch.
Fixes: ae66fb2ba6c3 ("mptcp: Do TCP fallback on early DSS checksum failure")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 50ad19adc003..42a7c18a1c24 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -843,7 +843,8 @@ enum mapping_status {
MAPPING_INVALID,
MAPPING_EMPTY,
MAPPING_DATA_FIN,
- MAPPING_DUMMY
+ MAPPING_DUMMY,
+ MAPPING_BAD_CSUM
};
static void dbg_bad_map(struct mptcp_subflow_context *subflow, u32 ssn)
@@ -958,9 +959,7 @@ static enum mapping_status validate_data_csum(struct sock *ssk, struct sk_buff *
subflow->map_data_csum);
if (unlikely(csum)) {
MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DATACSUMERR);
- if (subflow->mp_join || subflow->valid_csum_seen)
- subflow->send_mp_fail = 1;
- return subflow->mp_join ? MAPPING_INVALID : MAPPING_DUMMY;
+ return MAPPING_BAD_CSUM;
}
subflow->valid_csum_seen = 1;
@@ -1182,10 +1181,8 @@ static bool subflow_check_data_avail(struct sock *ssk)
status = get_mapping_status(ssk, msk);
trace_subflow_check_data_avail(status, skb_peek(&ssk->sk_receive_queue));
- if (unlikely(status == MAPPING_INVALID))
- goto fallback;
-
- if (unlikely(status == MAPPING_DUMMY))
+ if (unlikely(status == MAPPING_INVALID || status == MAPPING_DUMMY ||
+ status == MAPPING_BAD_CSUM))
goto fallback;
if (status != MAPPING_OK)
@@ -1227,7 +1224,10 @@ static bool subflow_check_data_avail(struct sock *ssk)
fallback:
if (!__mptcp_check_fallback(msk)) {
/* RFC 8684 section 3.7. */
- if (subflow->send_mp_fail) {
+ if (status == MAPPING_BAD_CSUM &&
+ (subflow->mp_join || subflow->valid_csum_seen)) {
+ subflow->send_mp_fail = 1;
+
if (!READ_ONCE(msk->allow_infinite_fallback)) {
ssk->sk_err = EBADMSG;
tcp_set_state(ssk, TCP_CLOSE);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 31bf11de146c3f8892093ff39f8f9b3069d6a852 Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Mon, 27 Jun 2022 18:02:36 -0700
Subject: [PATCH] mptcp: introduce MAPPING_BAD_CSUM
This allow moving a couple of conditional out of the fast path,
making the code more easy to follow and will simplify the next
patch.
Fixes: ae66fb2ba6c3 ("mptcp: Do TCP fallback on early DSS checksum failure")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 50ad19adc003..42a7c18a1c24 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -843,7 +843,8 @@ enum mapping_status {
MAPPING_INVALID,
MAPPING_EMPTY,
MAPPING_DATA_FIN,
- MAPPING_DUMMY
+ MAPPING_DUMMY,
+ MAPPING_BAD_CSUM
};
static void dbg_bad_map(struct mptcp_subflow_context *subflow, u32 ssn)
@@ -958,9 +959,7 @@ static enum mapping_status validate_data_csum(struct sock *ssk, struct sk_buff *
subflow->map_data_csum);
if (unlikely(csum)) {
MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DATACSUMERR);
- if (subflow->mp_join || subflow->valid_csum_seen)
- subflow->send_mp_fail = 1;
- return subflow->mp_join ? MAPPING_INVALID : MAPPING_DUMMY;
+ return MAPPING_BAD_CSUM;
}
subflow->valid_csum_seen = 1;
@@ -1182,10 +1181,8 @@ static bool subflow_check_data_avail(struct sock *ssk)
status = get_mapping_status(ssk, msk);
trace_subflow_check_data_avail(status, skb_peek(&ssk->sk_receive_queue));
- if (unlikely(status == MAPPING_INVALID))
- goto fallback;
-
- if (unlikely(status == MAPPING_DUMMY))
+ if (unlikely(status == MAPPING_INVALID || status == MAPPING_DUMMY ||
+ status == MAPPING_BAD_CSUM))
goto fallback;
if (status != MAPPING_OK)
@@ -1227,7 +1224,10 @@ static bool subflow_check_data_avail(struct sock *ssk)
fallback:
if (!__mptcp_check_fallback(msk)) {
/* RFC 8684 section 3.7. */
- if (subflow->send_mp_fail) {
+ if (status == MAPPING_BAD_CSUM &&
+ (subflow->mp_join || subflow->valid_csum_seen)) {
+ subflow->send_mp_fail = 1;
+
if (!READ_ONCE(msk->allow_infinite_fallback)) {
ssk->sk_err = EBADMSG;
tcp_set_state(ssk, TCP_CLOSE);
The of node for the rng must be created much later in boot. Otherwise it
tries to connect to a parent that doesn't yet exist, resulting on this
splat:
[ 0.000478] kobject: '(null)' ((____ptrval____)): is not initialized, yet kobject_get() is being called.
[ 0.002925] [c000000002a0fb30] [c00000000073b0bc] kobject_get+0x8c/0x100 (unreliable)
[ 0.003071] [c000000002a0fba0] [c00000000087e464] device_add+0xf4/0xb00
[ 0.003194] [c000000002a0fc80] [c000000000a7f6e4] of_device_add+0x64/0x80
[ 0.003321] [c000000002a0fcb0] [c000000000a800d0] of_platform_device_create_pdata+0xd0/0x1b0
[ 0.003476] [c000000002a0fd00] [c00000000201fa44] pnv_get_random_long_early+0x240/0x2e4
[ 0.003623] [c000000002a0fe20] [c000000002060c38] random_init+0xc0/0x214
This patch fixes the issue by doing the of node creation inside of
machine_subsys_initcall.
Fixes: f3eac426657d ("powerpc/powernv: wire up rng during setup_arch")
Cc: stable(a)vger.kernel.org
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Reported-by: Sachin Sant <sachinp(a)linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
arch/powerpc/platforms/powernv/rng.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
index 463c78c52cc5..bd5ad5f351c2 100644
--- a/arch/powerpc/platforms/powernv/rng.c
+++ b/arch/powerpc/platforms/powernv/rng.c
@@ -176,12 +176,8 @@ static int __init pnv_get_random_long_early(unsigned long *v)
NULL) != pnv_get_random_long_early)
return 0;
- for_each_compatible_node(dn, NULL, "ibm,power-rng") {
- if (rng_create(dn))
- continue;
- /* Create devices for hwrng driver */
- of_platform_device_create(dn, NULL, NULL);
- }
+ for_each_compatible_node(dn, NULL, "ibm,power-rng")
+ rng_create(dn);
if (!ppc_md.get_random_seed)
return 0;
@@ -205,10 +201,16 @@ void __init pnv_rng_init(void)
static int __init pnv_rng_late_init(void)
{
+ struct device_node *dn;
unsigned long v;
+
/* In case it wasn't called during init for some other reason. */
if (ppc_md.get_random_seed == pnv_get_random_long_early)
pnv_get_random_long_early(&v);
+ if (ppc_md.get_random_seed == powernv_get_random_long) {
+ for_each_compatible_node(dn, NULL, "ibm,power-rng")
+ of_platform_device_create(dn, NULL, NULL);
+ }
return 0;
}
machine_subsys_initcall(powernv, pnv_rng_late_init);
--
2.35.1
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 118f767413ada4eef7825fbd4af7c0866f883441 Mon Sep 17 00:00:00 2001
From: Kamal Heib <kamalheib1(a)gmail.com>
Date: Wed, 25 May 2022 16:20:29 +0300
Subject: [PATCH] RDMA/qedr: Fix reporting QP timeout attribute
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Make sure to save the passed QP timeout attribute when the QP gets modified,
so when calling query QP the right value is reported and not the
converted value that is required by the firmware. This issue was found
while running the pyverbs tests.
Fixes: cecbcddf6461 ("qedr: Add support for QP verbs")
Link: https://lore.kernel.org/r/20220525132029.84813-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1(a)gmail.com>
Acked-by: Michal Kalderon <michal.kalderon(a)marvell.com>
Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
diff --git a/drivers/infiniband/hw/qedr/qedr.h b/drivers/infiniband/hw/qedr/qedr.h
index 8def88cfa300..db9ef3e1eb97 100644
--- a/drivers/infiniband/hw/qedr/qedr.h
+++ b/drivers/infiniband/hw/qedr/qedr.h
@@ -418,6 +418,7 @@ struct qedr_qp {
u32 sq_psn;
u32 qkey;
u32 dest_qp_num;
+ u8 timeout;
/* Relevant to qps created from kernel space only (ULPs) */
u8 prev_wqe_size;
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index f0f43b6db89e..03ed7c0fae50 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -2613,6 +2613,8 @@ int qedr_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
1 << max_t(int, attr->timeout - 8, 0);
else
qp_params.ack_timeout = 0;
+
+ qp->timeout = attr->timeout;
}
if (attr_mask & IB_QP_RETRY_CNT) {
@@ -2772,7 +2774,7 @@ int qedr_query_qp(struct ib_qp *ibqp,
rdma_ah_set_dgid_raw(&qp_attr->ah_attr, ¶ms.dgid.bytes[0]);
rdma_ah_set_port_num(&qp_attr->ah_attr, 1);
rdma_ah_set_sl(&qp_attr->ah_attr, 0);
- qp_attr->timeout = params.timeout;
+ qp_attr->timeout = qp->timeout;
qp_attr->rnr_retry = params.rnr_retry;
qp_attr->retry_cnt = params.retry_cnt;
qp_attr->min_rnr_timer = params.min_rnr_nak_timer;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a8fc8cb5692aebb9c6f7afd4265366d25dcd1d01 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Wed, 22 Jun 2022 21:21:05 -0700
Subject: [PATCH] net: tun: stop NAPI when detaching queues
While looking at a syzbot report I noticed the NAPI only gets
disabled before it's deleted. I think that user can detach
the queue before destroying the device and the NAPI will never
be stopped.
Fixes: 943170998b20 ("tun: enable NAPI for TUN/TAP driver")
Acked-by: Petar Penkov <ppenkov(a)aviatrix.com>
Link: https://lore.kernel.org/r/20220623042105.2274812-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7fd0288c3789..e2eb35887394 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -273,6 +273,12 @@ static void tun_napi_init(struct tun_struct *tun, struct tun_file *tfile,
}
}
+static void tun_napi_enable(struct tun_file *tfile)
+{
+ if (tfile->napi_enabled)
+ napi_enable(&tfile->napi);
+}
+
static void tun_napi_disable(struct tun_file *tfile)
{
if (tfile->napi_enabled)
@@ -653,8 +659,10 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
if (clean) {
RCU_INIT_POINTER(tfile->tun, NULL);
sock_put(&tfile->sk);
- } else
+ } else {
tun_disable_queue(tun, tfile);
+ tun_napi_disable(tfile);
+ }
synchronize_net();
tun_flow_delete_by_queue(tun, tun->numqueues + 1);
@@ -808,6 +816,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
if (tfile->detached) {
tun_enable_queue(tfile);
+ tun_napi_enable(tfile);
} else {
sock_hold(&tfile->sk);
tun_napi_init(tun, tfile, napi, napi_frags);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0e597e2affb90d6ea48df6890d882924acf71e19 Mon Sep 17 00:00:00 2001
From: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt(a)savoirfairelinux.com>
Date: Thu, 23 Jun 2022 15:46:45 +0200
Subject: [PATCH] net: dp83822: disable rx error interrupt
Some RX errors, notably when disconnecting the cable, increase the RCSR
register. Once half full (0x7fff), an interrupt flood is generated. I
measured ~3k/s interrupts even after the RX errors transfer was
stopped.
Since we don't read and clear the RCSR register, we should disable this
interrupt.
Fixes: 87461f7a58ab ("net: phy: DP83822 initial driver submission")
Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt(a)savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c
index 95ef507053a6..8549e0e356c9 100644
--- a/drivers/net/phy/dp83822.c
+++ b/drivers/net/phy/dp83822.c
@@ -229,8 +229,7 @@ static int dp83822_config_intr(struct phy_device *phydev)
if (misr_status < 0)
return misr_status;
- misr_status |= (DP83822_RX_ERR_HF_INT_EN |
- DP83822_LINK_STAT_INT_EN |
+ misr_status |= (DP83822_LINK_STAT_INT_EN |
DP83822_ENERGY_DET_INT_EN |
DP83822_LINK_QUAL_INT_EN);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0e597e2affb90d6ea48df6890d882924acf71e19 Mon Sep 17 00:00:00 2001
From: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt(a)savoirfairelinux.com>
Date: Thu, 23 Jun 2022 15:46:45 +0200
Subject: [PATCH] net: dp83822: disable rx error interrupt
Some RX errors, notably when disconnecting the cable, increase the RCSR
register. Once half full (0x7fff), an interrupt flood is generated. I
measured ~3k/s interrupts even after the RX errors transfer was
stopped.
Since we don't read and clear the RCSR register, we should disable this
interrupt.
Fixes: 87461f7a58ab ("net: phy: DP83822 initial driver submission")
Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt(a)savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c
index 95ef507053a6..8549e0e356c9 100644
--- a/drivers/net/phy/dp83822.c
+++ b/drivers/net/phy/dp83822.c
@@ -229,8 +229,7 @@ static int dp83822_config_intr(struct phy_device *phydev)
if (misr_status < 0)
return misr_status;
- misr_status |= (DP83822_RX_ERR_HF_INT_EN |
- DP83822_LINK_STAT_INT_EN |
+ misr_status |= (DP83822_LINK_STAT_INT_EN |
DP83822_ENERGY_DET_INT_EN |
DP83822_LINK_QUAL_INT_EN);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c96614eeab663646f57f67aa591e015abd8bd0ba Mon Sep 17 00:00:00 2001
From: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt(a)savoirfairelinux.com>
Date: Thu, 23 Jun 2022 15:46:44 +0200
Subject: [PATCH] net: dp83822: disable false carrier interrupt
When unplugging an Ethernet cable, false carrier events were produced by
the PHY at a very high rate. Once the false carrier counter full, an
interrupt was triggered every few clock cycles until the cable was
replugged. This resulted in approximately 10k/s interrupts.
Since the false carrier counter (FCSCR) is never used, we can safely
disable this interrupt.
In addition to improving performance, this also solved MDIO read
timeouts I was randomly encountering with an i.MX8 fec MAC because of
the interrupt flood. The interrupt count and MDIO timeout fix were
tested on a v5.4.110 kernel.
Fixes: 87461f7a58ab ("net: phy: DP83822 initial driver submission")
Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt(a)savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c
index e6ad3a494d32..95ef507053a6 100644
--- a/drivers/net/phy/dp83822.c
+++ b/drivers/net/phy/dp83822.c
@@ -230,7 +230,6 @@ static int dp83822_config_intr(struct phy_device *phydev)
return misr_status;
misr_status |= (DP83822_RX_ERR_HF_INT_EN |
- DP83822_FALSE_CARRIER_HF_INT_EN |
DP83822_LINK_STAT_INT_EN |
DP83822_ENERGY_DET_INT_EN |
DP83822_LINK_QUAL_INT_EN);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c96614eeab663646f57f67aa591e015abd8bd0ba Mon Sep 17 00:00:00 2001
From: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt(a)savoirfairelinux.com>
Date: Thu, 23 Jun 2022 15:46:44 +0200
Subject: [PATCH] net: dp83822: disable false carrier interrupt
When unplugging an Ethernet cable, false carrier events were produced by
the PHY at a very high rate. Once the false carrier counter full, an
interrupt was triggered every few clock cycles until the cable was
replugged. This resulted in approximately 10k/s interrupts.
Since the false carrier counter (FCSCR) is never used, we can safely
disable this interrupt.
In addition to improving performance, this also solved MDIO read
timeouts I was randomly encountering with an i.MX8 fec MAC because of
the interrupt flood. The interrupt count and MDIO timeout fix were
tested on a v5.4.110 kernel.
Fixes: 87461f7a58ab ("net: phy: DP83822 initial driver submission")
Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt(a)savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c
index e6ad3a494d32..95ef507053a6 100644
--- a/drivers/net/phy/dp83822.c
+++ b/drivers/net/phy/dp83822.c
@@ -230,7 +230,6 @@ static int dp83822_config_intr(struct phy_device *phydev)
return misr_status;
misr_status |= (DP83822_RX_ERR_HF_INT_EN |
- DP83822_FALSE_CARRIER_HF_INT_EN |
DP83822_LINK_STAT_INT_EN |
DP83822_ENERGY_DET_INT_EN |
DP83822_LINK_QUAL_INT_EN);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3b89b511ea0c705cc418440e2abf9d692a556d84 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Thu, 23 Jun 2022 16:32:32 +0300
Subject: [PATCH] net: fix IFF_TX_SKB_NO_LINEAR definition
The "1<<31" shift has a sign extension bug so IFF_TX_SKB_NO_LINEAR is
0xffffffff80000000 instead of 0x0000000080000000.
Fixes: c2ff53d8049f ("net: Add priv_flags for allow tx skb without linear")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Xuan Zhuo <xuanzhuo(a)linux.alibaba.com>
Link: https://lore.kernel.org/r/YrRrcGttfEVnf85Q@kili
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f615a66c89e9..2563d30736e9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1671,7 +1671,7 @@ enum netdev_priv_flags {
IFF_FAILOVER_SLAVE = 1<<28,
IFF_L3MDEV_RX_HANDLER = 1<<29,
IFF_LIVE_RENAME_OK = 1<<30,
- IFF_TX_SKB_NO_LINEAR = 1<<31,
+ IFF_TX_SKB_NO_LINEAR = BIT_ULL(31),
IFF_CHANGE_PROTO_DOWN = BIT_ULL(32),
};
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7c97bc0128b2eecc703106112679a69d446d1a12 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Wed, 22 Jun 2022 20:02:04 -0700
Subject: [PATCH] net: dsa: bcm_sf2: force pause link settings
The pause settings reported by the PHY should also be applied to the GMII port
status override otherwise the switch will not generate pause frames towards the
link partner despite the advertisement saying otherwise.
Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
Link: https://lore.kernel.org/r/20220623030204.1966851-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 87e81c636339..be0edfa093d0 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -878,6 +878,11 @@ static void bcm_sf2_sw_mac_link_up(struct dsa_switch *ds, int port,
if (duplex == DUPLEX_FULL)
reg |= DUPLX_MODE;
+ if (tx_pause)
+ reg |= TXFLOW_CNTL;
+ if (rx_pause)
+ reg |= RXFLOW_CNTL;
+
core_writel(priv, reg, offset);
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7c97bc0128b2eecc703106112679a69d446d1a12 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Wed, 22 Jun 2022 20:02:04 -0700
Subject: [PATCH] net: dsa: bcm_sf2: force pause link settings
The pause settings reported by the PHY should also be applied to the GMII port
status override otherwise the switch will not generate pause frames towards the
link partner despite the advertisement saying otherwise.
Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
Link: https://lore.kernel.org/r/20220623030204.1966851-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 87e81c636339..be0edfa093d0 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -878,6 +878,11 @@ static void bcm_sf2_sw_mac_link_up(struct dsa_switch *ds, int port,
if (duplex == DUPLEX_FULL)
reg |= DUPLX_MODE;
+ if (tx_pause)
+ reg |= TXFLOW_CNTL;
+ if (rx_pause)
+ reg |= RXFLOW_CNTL;
+
core_writel(priv, reg, offset);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7c97bc0128b2eecc703106112679a69d446d1a12 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Wed, 22 Jun 2022 20:02:04 -0700
Subject: [PATCH] net: dsa: bcm_sf2: force pause link settings
The pause settings reported by the PHY should also be applied to the GMII port
status override otherwise the switch will not generate pause frames towards the
link partner despite the advertisement saying otherwise.
Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
Link: https://lore.kernel.org/r/20220623030204.1966851-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 87e81c636339..be0edfa093d0 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -878,6 +878,11 @@ static void bcm_sf2_sw_mac_link_up(struct dsa_switch *ds, int port,
if (duplex == DUPLEX_FULL)
reg |= DUPLX_MODE;
+ if (tx_pause)
+ reg |= TXFLOW_CNTL;
+ if (rx_pause)
+ reg |= RXFLOW_CNTL;
+
core_writel(priv, reg, offset);
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7c97bc0128b2eecc703106112679a69d446d1a12 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Wed, 22 Jun 2022 20:02:04 -0700
Subject: [PATCH] net: dsa: bcm_sf2: force pause link settings
The pause settings reported by the PHY should also be applied to the GMII port
status override otherwise the switch will not generate pause frames towards the
link partner despite the advertisement saying otherwise.
Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
Link: https://lore.kernel.org/r/20220623030204.1966851-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 87e81c636339..be0edfa093d0 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -878,6 +878,11 @@ static void bcm_sf2_sw_mac_link_up(struct dsa_switch *ds, int port,
if (duplex == DUPLEX_FULL)
reg |= DUPLX_MODE;
+ if (tx_pause)
+ reg |= TXFLOW_CNTL;
+ if (rx_pause)
+ reg |= RXFLOW_CNTL;
+
core_writel(priv, reg, offset);
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 50c0ada627f56c92f5953a8bf9158b045ad026a1 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang(a)redhat.com>
Date: Fri, 17 Jun 2022 15:29:49 +0800
Subject: [PATCH] virtio-net: fix race between ndo_open() and
virtio_device_ready()
We currently call virtio_device_ready() after netdev
registration. Since ndo_open() can be called immediately
after register_netdev, this means there exists a race between
ndo_open() and virtio_device_ready(): the driver may start to use the
device before DRIVER_OK which violates the spec.
Fix this by switching to use register_netdevice() and protect the
virtio_device_ready() with rtnl_lock() to make sure ndo_open() can
only be called after virtio_device_ready().
Fixes: 4baf1e33d0842 ("virtio_net: enable VQs early")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Message-Id: <20220617072949.30734-1-jasowang(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index db05b5e930be..8a5810bcb839 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3655,14 +3655,20 @@ static int virtnet_probe(struct virtio_device *vdev)
if (vi->has_rss || vi->has_rss_hash_report)
virtnet_init_default_rss(vi);
- err = register_netdev(dev);
+ /* serialize netdev register + virtio_device_ready() with ndo_open() */
+ rtnl_lock();
+
+ err = register_netdevice(dev);
if (err) {
pr_debug("virtio_net: registering device failed\n");
+ rtnl_unlock();
goto free_failover;
}
virtio_device_ready(vdev);
+ rtnl_unlock();
+
err = virtnet_cpu_notif_add(vi);
if (err) {
pr_debug("virtio_net: registering cpu notifier failed\n");
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 50c0ada627f56c92f5953a8bf9158b045ad026a1 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang(a)redhat.com>
Date: Fri, 17 Jun 2022 15:29:49 +0800
Subject: [PATCH] virtio-net: fix race between ndo_open() and
virtio_device_ready()
We currently call virtio_device_ready() after netdev
registration. Since ndo_open() can be called immediately
after register_netdev, this means there exists a race between
ndo_open() and virtio_device_ready(): the driver may start to use the
device before DRIVER_OK which violates the spec.
Fix this by switching to use register_netdevice() and protect the
virtio_device_ready() with rtnl_lock() to make sure ndo_open() can
only be called after virtio_device_ready().
Fixes: 4baf1e33d0842 ("virtio_net: enable VQs early")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Message-Id: <20220617072949.30734-1-jasowang(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index db05b5e930be..8a5810bcb839 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3655,14 +3655,20 @@ static int virtnet_probe(struct virtio_device *vdev)
if (vi->has_rss || vi->has_rss_hash_report)
virtnet_init_default_rss(vi);
- err = register_netdev(dev);
+ /* serialize netdev register + virtio_device_ready() with ndo_open() */
+ rtnl_lock();
+
+ err = register_netdevice(dev);
if (err) {
pr_debug("virtio_net: registering device failed\n");
+ rtnl_unlock();
goto free_failover;
}
virtio_device_ready(vdev);
+ rtnl_unlock();
+
err = virtnet_cpu_notif_add(vi);
if (err) {
pr_debug("virtio_net: registering cpu notifier failed\n");
With KUAP, the TLB miss handler bails out when an access to user
memory is performed with a nul TID.
But the normal TLB miss routine which is only used early during boot
does the check regardless for all memory areas, not only user memory.
By chance there is no early IO or vmalloc access, but when KASAN
come we will start having early TLB misses.
Fix it by creating a special branch for user accesses similar to the
one in the 'bolted' TLB miss handlers. Unfortunately SPRN_MAS1 is
now read too early and there are no registers available to preserve
it so it will be read a second time.
Fixes: 57bc963837f5 ("powerpc/kuap: Wire-up KUAP on book3e/64")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
---
arch/powerpc/mm/nohash/tlb_low_64e.S | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/mm/nohash/tlb_low_64e.S b/arch/powerpc/mm/nohash/tlb_low_64e.S
index 8b97c4acfebf..9e9ab3803fb2 100644
--- a/arch/powerpc/mm/nohash/tlb_low_64e.S
+++ b/arch/powerpc/mm/nohash/tlb_low_64e.S
@@ -583,7 +583,7 @@ itlb_miss_fault_e6500:
*/
rlwimi r11,r14,32-19,27,27
rlwimi r11,r14,32-16,19,19
- beq normal_tlb_miss
+ beq normal_tlb_miss_user
/* XXX replace the RMW cycles with immediate loads + writes */
1: mfspr r10,SPRN_MAS1
cmpldi cr0,r15,8 /* Check for vmalloc region */
@@ -626,7 +626,7 @@ itlb_miss_fault_e6500:
cmpldi cr0,r15,0 /* Check for user region */
std r14,EX_TLB_ESR(r12) /* write crazy -1 to frame */
- beq normal_tlb_miss
+ beq normal_tlb_miss_user
li r11,_PAGE_PRESENT|_PAGE_BAP_SX /* Base perm */
oris r11,r11,_PAGE_ACCESSED@h
@@ -653,6 +653,12 @@ itlb_miss_fault_e6500:
* r11 = PTE permission mask
* r10 = crap (free to use)
*/
+normal_tlb_miss_user:
+#ifdef CONFIG_PPC_KUAP
+ mfspr r14,SPRN_MAS1
+ rlwinm. r14,r14,0,0x3fff0000
+ beq- normal_tlb_miss_access_fault /* KUAP fault */
+#endif
normal_tlb_miss:
/* So we first construct the page table address. We do that by
* shifting the bottom of the address (not the region ID) by
@@ -683,11 +689,6 @@ finish_normal_tlb_miss:
/* Check if required permissions are met */
andc. r15,r11,r14
bne- normal_tlb_miss_access_fault
-#ifdef CONFIG_PPC_KUAP
- mfspr r11,SPRN_MAS1
- rlwinm. r10,r11,0,0x3fff0000
- beq- normal_tlb_miss_access_fault /* KUAP fault */
-#endif
/* Now we build the MAS:
*
@@ -709,9 +710,7 @@ finish_normal_tlb_miss:
rldicl r10,r14,64-8,64-8
cmpldi cr0,r10,BOOK3E_PAGESZ_4K
beq- 1f
-#ifndef CONFIG_PPC_KUAP
mfspr r11,SPRN_MAS1
-#endif
rlwimi r11,r14,31,21,24
rlwinm r11,r11,0,21,19
mtspr SPRN_MAS1,r11
--
2.36.1
On FSL_BOOK3E, _PAGE_RW is defined with two bits, one for user and one
for supervisor. As soon as one of the two bits is set, the page has
to be display as RW. But the way it is implemented today requires both
bits to be set in order to display it as RW.
Instead of display RW when _PAGE_RW bits are set and R otherwise,
reverse the logic and display R when _PAGE_RW bits are all 0 and
RW otherwise.
This change has no impact on other platforms as _PAGE_RW is a single
bit on all of them.
Fixes: 8eb07b187000 ("powerpc/mm: Dump linux pagetables")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
---
arch/powerpc/mm/ptdump/shared.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index 03607ab90c66..f884760ca5cf 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -17,9 +17,9 @@ static const struct flag_info flag_array[] = {
.clear = " ",
}, {
.mask = _PAGE_RW,
- .val = _PAGE_RW,
- .set = "rw",
- .clear = "r ",
+ .val = 0,
+ .set = "r ",
+ .clear = "rw",
}, {
.mask = _PAGE_EXEC,
.val = _PAGE_EXEC,
--
2.36.1
commit e4f74400308cb8abde5fdc9cad609c2aba32110c upstream.
s390x appears to present two RNG interfaces:
- a "TRNG" that gathers entropy using some hardware function; and
- a "DRBG" that takes in a seed and expands it.
Previously, the TRNG was wired up to arch_get_random_{long,int}(), but
it was observed that this was being called really frequently, resulting
in high overhead. So it was changed to be wired up to arch_get_random_
seed_{long,int}(), which was a reasonable decision. Later on, the DRBG
was then wired up to arch_get_random_{long,int}(), with a complicated
buffer filling thread, to control overhead and rate.
Fortunately, none of the performance issues matter much now. The RNG
always attempts to use arch_get_random_seed_{long,int}() first, which
means a complicated implementation of arch_get_random_{long,int}() isn't
really valuable or useful to have around. And it's only used when
reseeding, which means it won't hit the high throughput complications
that were faced before.
So this commit returns to an earlier design of just calling the TRNG in
arch_get_random_seed_{long,int}(), and returning false in arch_get_
random_{long,int}().
Part of what makes the simplification possible is that the RNG now seeds
itself using the TRNG at bootup. But this only works if the TRNG is
detected early in boot, before random_init() is called. So this commit
also causes that check to happen in setup_arch().
Cc: stable(a)vger.kernel.org
Cc: Harald Freudenberger <freude(a)linux.ibm.com>
Cc: Ingo Franzki <ifranzki(a)linux.ibm.com>
Cc: Juergen Christ <jchrist(a)linux.ibm.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Link: https://lore.kernel.org/r/20220610222023.378448-1-Jason@zx2c4.com
Reviewed-by: Harald Freudenberger <freude(a)linux.ibm.com>
Acked-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
arch/s390/crypto/arch_random.c | 20 +++----------------
arch/s390/include/asm/archrandom.h | 32 +++++++++++++-----------------
arch/s390/kernel/setup.c | 5 +++++
3 files changed, 22 insertions(+), 35 deletions(-)
diff --git a/arch/s390/crypto/arch_random.c b/arch/s390/crypto/arch_random.c
index 36aefc07d10c..1f2d40993c4d 100644
--- a/arch/s390/crypto/arch_random.c
+++ b/arch/s390/crypto/arch_random.c
@@ -1,13 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* s390 arch random implementation.
*
- * Copyright IBM Corp. 2017
- * Author(s): Harald Freudenberger <freude(a)de.ibm.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License (version 2 only)
- * as published by the Free Software Foundation.
- *
+ * Copyright IBM Corp. 2017, 2020
+ * Author(s): Harald Freudenberger
*/
#include <linux/kernel.h>
@@ -20,13 +16,3 @@ DEFINE_STATIC_KEY_FALSE(s390_arch_random_available);
atomic64_t s390_arch_random_counter = ATOMIC64_INIT(0);
EXPORT_SYMBOL(s390_arch_random_counter);
-
-static int __init s390_arch_random_init(void)
-{
- /* check if subfunction CPACF_PRNO_TRNG is available */
- if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG))
- static_branch_enable(&s390_arch_random_available);
-
- return 0;
-}
-arch_initcall(s390_arch_random_init);
diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h
index ddf97715ee53..2c6e1c6ecbe7 100644
--- a/arch/s390/include/asm/archrandom.h
+++ b/arch/s390/include/asm/archrandom.h
@@ -2,7 +2,7 @@
/*
* Kernel interface for the s390 arch_random_* functions
*
- * Copyright IBM Corp. 2017
+ * Copyright IBM Corp. 2017, 2020
*
* Author: Harald Freudenberger <freude(a)de.ibm.com>
*
@@ -20,38 +20,34 @@
DECLARE_STATIC_KEY_FALSE(s390_arch_random_available);
extern atomic64_t s390_arch_random_counter;
-static void s390_arch_random_generate(u8 *buf, unsigned int nbytes)
+static inline bool __must_check arch_get_random_long(unsigned long *v)
{
- cpacf_trng(NULL, 0, buf, nbytes);
- atomic64_add(nbytes, &s390_arch_random_counter);
+ return false;
}
-static inline bool arch_get_random_long(unsigned long *v)
+static inline bool __must_check arch_get_random_int(unsigned int *v)
{
- if (static_branch_likely(&s390_arch_random_available)) {
- s390_arch_random_generate((u8 *)v, sizeof(*v));
- return true;
- }
return false;
}
-static inline bool arch_get_random_int(unsigned int *v)
+static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
- s390_arch_random_generate((u8 *)v, sizeof(*v));
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
return true;
}
return false;
}
-static inline bool arch_get_random_seed_long(unsigned long *v)
+static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
{
- return arch_get_random_long(v);
-}
-
-static inline bool arch_get_random_seed_int(unsigned int *v)
-{
- return arch_get_random_int(v);
+ if (static_branch_likely(&s390_arch_random_available)) {
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
+ }
+ return false;
}
#endif /* CONFIG_ARCH_RANDOM */
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index e9ef093eb676..b3343f093f67 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -853,6 +853,11 @@ static void __init setup_randomness(void)
if (stsi(vmms, 3, 2, 2) == 0 && vmms->count)
add_device_randomness(&vmms->vm, sizeof(vmms->vm[0]) * vmms->count);
memblock_free((unsigned long) vmms, PAGE_SIZE);
+
+#ifdef CONFIG_ARCH_RANDOM
+ if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG))
+ static_branch_enable(&s390_arch_random_available);
+#endif
}
/*
--
2.35.1
Until now, when running at the maximum resolution of 1280x720 at 32bpp
on the JZ4770 SoC the output was garbled, the X/Y position of the
top-left corner of the framebuffer warping to a random position with
the whole image being offset accordingly, every time a new frame was
being submitted.
This problem can be eliminated by using a bigger burst size for the DMA.
Set in each soc_info structure the maximum burst size supported by the
corresponding SoC, and use it in the driver.
Set the new value using regmap_update_bits() instead of
regmap_set_bits(), since we do want to override the old value of the
burst size. (Note that regmap_set_bits() wasn't really valid before for
the same reason, but it never seemed to be a problem).
Cc: <stable(a)vger.kernel.org>
Fixes: 90b86fcc47b4 ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs")
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
---
drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 10 ++++++++--
drivers/gpu/drm/ingenic/ingenic-drm.h | 3 +++
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index 2c559885347a..8ad6080b32b2 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -70,6 +70,7 @@ struct jz_soc_info {
bool map_noncoherent;
bool use_extended_hwdesc;
bool plane_f0_not_working;
+ u32 max_burst;
unsigned int max_width, max_height;
const u32 *formats_f0, *formats_f1;
unsigned int num_formats_f0, num_formats_f1;
@@ -319,8 +320,9 @@ static void ingenic_drm_crtc_update_timings(struct ingenic_drm *priv,
regmap_write(priv->map, JZ_REG_LCD_REV, mode->htotal << 16);
}
- regmap_set_bits(priv->map, JZ_REG_LCD_CTRL,
- JZ_LCD_CTRL_OFUP | JZ_LCD_CTRL_BURST_16);
+ regmap_update_bits(priv->map, JZ_REG_LCD_CTRL,
+ JZ_LCD_CTRL_OFUP | JZ_LCD_CTRL_BURST_MASK,
+ JZ_LCD_CTRL_OFUP | priv->soc_info->max_burst);
/*
* IPU restart - specify how much time the LCDC will wait before
@@ -1519,6 +1521,7 @@ static const struct jz_soc_info jz4740_soc_info = {
.map_noncoherent = false,
.max_width = 800,
.max_height = 600,
+ .max_burst = JZ_LCD_CTRL_BURST_16,
.formats_f1 = jz4740_formats,
.num_formats_f1 = ARRAY_SIZE(jz4740_formats),
/* JZ4740 has only one plane */
@@ -1530,6 +1533,7 @@ static const struct jz_soc_info jz4725b_soc_info = {
.map_noncoherent = false,
.max_width = 800,
.max_height = 600,
+ .max_burst = JZ_LCD_CTRL_BURST_16,
.formats_f1 = jz4725b_formats_f1,
.num_formats_f1 = ARRAY_SIZE(jz4725b_formats_f1),
.formats_f0 = jz4725b_formats_f0,
@@ -1542,6 +1546,7 @@ static const struct jz_soc_info jz4770_soc_info = {
.map_noncoherent = true,
.max_width = 1280,
.max_height = 720,
+ .max_burst = JZ_LCD_CTRL_BURST_64,
.formats_f1 = jz4770_formats_f1,
.num_formats_f1 = ARRAY_SIZE(jz4770_formats_f1),
.formats_f0 = jz4770_formats_f0,
@@ -1556,6 +1561,7 @@ static const struct jz_soc_info jz4780_soc_info = {
.plane_f0_not_working = true, /* REVISIT */
.max_width = 4096,
.max_height = 2048,
+ .max_burst = JZ_LCD_CTRL_BURST_64,
.formats_f1 = jz4770_formats_f1,
.num_formats_f1 = ARRAY_SIZE(jz4770_formats_f1),
.formats_f0 = jz4770_formats_f0,
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm.h b/drivers/gpu/drm/ingenic/ingenic-drm.h
index cb1d09b62588..e5bd007ea93d 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm.h
+++ b/drivers/gpu/drm/ingenic/ingenic-drm.h
@@ -106,6 +106,9 @@
#define JZ_LCD_CTRL_BURST_4 (0x0 << 28)
#define JZ_LCD_CTRL_BURST_8 (0x1 << 28)
#define JZ_LCD_CTRL_BURST_16 (0x2 << 28)
+#define JZ_LCD_CTRL_BURST_32 (0x3 << 28)
+#define JZ_LCD_CTRL_BURST_64 (0x4 << 28)
+#define JZ_LCD_CTRL_BURST_MASK (0x7 << 28)
#define JZ_LCD_CTRL_RGB555 BIT(27)
#define JZ_LCD_CTRL_OFUP BIT(26)
#define JZ_LCD_CTRL_FRC_GRAYSCALE_16 (0x0 << 24)
--
2.35.1
This bug is marked as fixed by commit:
net: core: netlink: add helper refcount dec and lock function
net: sched: add helper function to take reference to Qdisc
net: sched: extend Qdisc with rcu
net: sched: rename qdisc_destroy() to qdisc_put()
net: sched: use Qdisc rcu API instead of relying on rtnl lock
But I can't find it in any tested tree for more than 90 days.
Is it a correct commit? Please update it by replying:
#syz fix: exact-commit-title
Until then the bug is still considered open and
new crashes with the same signature are ignored.
From: Chris Wilson <chris.p.wilson(a)intel.com>
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Reported-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Andi Shyti <andi.shyti(a)intel.com>
Cc: stable(a)vger.kernel.org
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
---
See [PATCH 0/6] at: https://lore.kernel.org/all/cover.1655306128.git.mchehab@kernel.org/
drivers/gpu/drm/i915/gt/intel_reset.c | 37 ++++++++++++++++++++-------
1 file changed, 28 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5338c3fde7a..c68d36fb5bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
-static int gen6_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
+static int gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(>->uncore->lock, flags);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
+ return ret;
+}
+
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
-static int gen11_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen11_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
+ unsigned long flags;
int ret;
+ spin_lock_irqsave(>->uncore->lock, flags);
+
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
- gen11_reset_engines(gt, gt->info.engine_mask, 0);
+ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
- ret = gen11_reset_engines(gt, engine_mask, retry);
+ ret = __gen11_reset_engines(gt, engine_mask, retry);
else
- ret = gen6_reset_engines(gt, engine_mask, retry);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
return ret;
}
--
2.36.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1758bde2e4aa5ff188d53e7d9d388bbb7e12eebb Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 28 Jun 2022 12:15:08 +0200
Subject: [PATCH] net: phy: Don't trigger state machine while in suspend
Upon system sleep, mdio_bus_phy_suspend() stops the phy_state_machine(),
but subsequent interrupts may retrigger it:
They may have been left enabled to facilitate wakeup and are not
quiesced until the ->suspend_noirq() phase. Unwanted interrupts may
hence occur between mdio_bus_phy_suspend() and dpm_suspend_noirq(),
as well as between dpm_resume_noirq() and mdio_bus_phy_resume().
Retriggering the phy_state_machine() through an interrupt is not only
undesirable for the reason given in mdio_bus_phy_suspend() (freezing it
midway with phydev->lock held), but also because the PHY may be
inaccessible after it's suspended: Accesses to USB-attached PHYs are
blocked once usb_suspend_both() clears the can_submit flag and PHYs on
PCI network cards may become inaccessible upon suspend as well.
Amend phy_interrupt() to avoid triggering the state machine if the PHY
is suspended. Signal wakeup instead if the attached net_device or its
parent has been configured as a wakeup source. (Those conditions are
identical to mdio_bus_phy_may_suspend().) Postpone handling of the
interrupt until the PHY has resumed.
Before stopping the phy_state_machine() in mdio_bus_phy_suspend(),
wait for a concurrent phy_interrupt() to run to completion. That is
necessary because phy_interrupt() may have checked the PHY's suspend
status before the system sleep transition commenced and it may thus
retrigger the state machine after it was stopped.
Likewise, after re-enabling interrupt handling in mdio_bus_phy_resume(),
wait for a concurrent phy_interrupt() to complete to ensure that
interrupts which it postponed are properly rerun.
The issue was exposed by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward
PHY interrupts to PHY driver to avoid polling"), but has existed since
forever.
Fixes: 541cd3ee00a4 ("phylib: Fix deadlock on resume")
Link: https://lore.kernel.org/netdev/a5315a8a-32c2-962f-f696-de9a26d30091@samsung…
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable(a)vger.kernel.org # v2.6.33+
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/b7f386d04e9b5b0e2738f0125743e30676f309ef.16564108…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index ef62f357b76d..8d3ee3a6495b 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -31,6 +31,7 @@
#include <linux/io.h>
#include <linux/uaccess.h>
#include <linux/atomic.h>
+#include <linux/suspend.h>
#include <net/netlink.h>
#include <net/genetlink.h>
#include <net/sock.h>
@@ -976,6 +977,28 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
struct phy_driver *drv = phydev->drv;
irqreturn_t ret;
+ /* Wakeup interrupts may occur during a system sleep transition.
+ * Postpone handling until the PHY has resumed.
+ */
+ if (IS_ENABLED(CONFIG_PM_SLEEP) && phydev->irq_suspended) {
+ struct net_device *netdev = phydev->attached_dev;
+
+ if (netdev) {
+ struct device *parent = netdev->dev.parent;
+
+ if (netdev->wol_enabled)
+ pm_system_wakeup();
+ else if (device_may_wakeup(&netdev->dev))
+ pm_wakeup_dev_event(&netdev->dev, 0, true);
+ else if (parent && device_may_wakeup(parent))
+ pm_wakeup_dev_event(parent, 0, true);
+ }
+
+ phydev->irq_rerun = 1;
+ disable_irq_nosync(irq);
+ return IRQ_HANDLED;
+ }
+
mutex_lock(&phydev->lock);
ret = drv->handle_interrupt(phydev);
mutex_unlock(&phydev->lock);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 431a8719c635..46acddd865a7 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -278,6 +278,15 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
if (phydev->mac_managed_pm)
return 0;
+ /* Wakeup interrupts may occur during the system sleep transition when
+ * the PHY is inaccessible. Set flag to postpone handling until the PHY
+ * has resumed. Wait for concurrent interrupt handler to complete.
+ */
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 1;
+ synchronize_irq(phydev->irq);
+ }
+
/* We must stop the state machine manually, otherwise it stops out of
* control, possibly with the phydev->lock held. Upon resume, netdev
* may call phy routines that try to grab the same lock, and that may
@@ -315,6 +324,20 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
if (ret < 0)
return ret;
no_resume:
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 0;
+ synchronize_irq(phydev->irq);
+
+ /* Rerun interrupts which were postponed by phy_interrupt()
+ * because they occurred during the system sleep transition.
+ */
+ if (phydev->irq_rerun) {
+ phydev->irq_rerun = 0;
+ enable_irq(phydev->irq);
+ irq_wake_thread(phydev->irq, phydev);
+ }
+ }
+
if (phydev->attached_dev && phydev->adjust_link)
phy_start_machine(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 508f1149665b..b09f7d36cff2 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -572,6 +572,10 @@ struct macsec_ops;
* @mdix_ctrl: User setting of crossover
* @pma_extable: Cached value of PMA/PMD Extended Abilities Register
* @interrupts: Flag interrupts have been enabled
+ * @irq_suspended: Flag indicating PHY is suspended and therefore interrupt
+ * handling shall be postponed until PHY has resumed
+ * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
+ * requiring a rerun of the interrupt handler after resume
* @interface: enum phy_interface_t value
* @skb: Netlink message for cable diagnostics
* @nest: Netlink nest used for cable diagnostics
@@ -626,6 +630,8 @@ struct phy_device {
/* Interrupts are enabled */
unsigned interrupts:1;
+ unsigned irq_suspended:1;
+ unsigned irq_rerun:1;
enum phy_state state;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1758bde2e4aa5ff188d53e7d9d388bbb7e12eebb Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 28 Jun 2022 12:15:08 +0200
Subject: [PATCH] net: phy: Don't trigger state machine while in suspend
Upon system sleep, mdio_bus_phy_suspend() stops the phy_state_machine(),
but subsequent interrupts may retrigger it:
They may have been left enabled to facilitate wakeup and are not
quiesced until the ->suspend_noirq() phase. Unwanted interrupts may
hence occur between mdio_bus_phy_suspend() and dpm_suspend_noirq(),
as well as between dpm_resume_noirq() and mdio_bus_phy_resume().
Retriggering the phy_state_machine() through an interrupt is not only
undesirable for the reason given in mdio_bus_phy_suspend() (freezing it
midway with phydev->lock held), but also because the PHY may be
inaccessible after it's suspended: Accesses to USB-attached PHYs are
blocked once usb_suspend_both() clears the can_submit flag and PHYs on
PCI network cards may become inaccessible upon suspend as well.
Amend phy_interrupt() to avoid triggering the state machine if the PHY
is suspended. Signal wakeup instead if the attached net_device or its
parent has been configured as a wakeup source. (Those conditions are
identical to mdio_bus_phy_may_suspend().) Postpone handling of the
interrupt until the PHY has resumed.
Before stopping the phy_state_machine() in mdio_bus_phy_suspend(),
wait for a concurrent phy_interrupt() to run to completion. That is
necessary because phy_interrupt() may have checked the PHY's suspend
status before the system sleep transition commenced and it may thus
retrigger the state machine after it was stopped.
Likewise, after re-enabling interrupt handling in mdio_bus_phy_resume(),
wait for a concurrent phy_interrupt() to complete to ensure that
interrupts which it postponed are properly rerun.
The issue was exposed by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward
PHY interrupts to PHY driver to avoid polling"), but has existed since
forever.
Fixes: 541cd3ee00a4 ("phylib: Fix deadlock on resume")
Link: https://lore.kernel.org/netdev/a5315a8a-32c2-962f-f696-de9a26d30091@samsung…
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable(a)vger.kernel.org # v2.6.33+
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/b7f386d04e9b5b0e2738f0125743e30676f309ef.16564108…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index ef62f357b76d..8d3ee3a6495b 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -31,6 +31,7 @@
#include <linux/io.h>
#include <linux/uaccess.h>
#include <linux/atomic.h>
+#include <linux/suspend.h>
#include <net/netlink.h>
#include <net/genetlink.h>
#include <net/sock.h>
@@ -976,6 +977,28 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
struct phy_driver *drv = phydev->drv;
irqreturn_t ret;
+ /* Wakeup interrupts may occur during a system sleep transition.
+ * Postpone handling until the PHY has resumed.
+ */
+ if (IS_ENABLED(CONFIG_PM_SLEEP) && phydev->irq_suspended) {
+ struct net_device *netdev = phydev->attached_dev;
+
+ if (netdev) {
+ struct device *parent = netdev->dev.parent;
+
+ if (netdev->wol_enabled)
+ pm_system_wakeup();
+ else if (device_may_wakeup(&netdev->dev))
+ pm_wakeup_dev_event(&netdev->dev, 0, true);
+ else if (parent && device_may_wakeup(parent))
+ pm_wakeup_dev_event(parent, 0, true);
+ }
+
+ phydev->irq_rerun = 1;
+ disable_irq_nosync(irq);
+ return IRQ_HANDLED;
+ }
+
mutex_lock(&phydev->lock);
ret = drv->handle_interrupt(phydev);
mutex_unlock(&phydev->lock);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 431a8719c635..46acddd865a7 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -278,6 +278,15 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
if (phydev->mac_managed_pm)
return 0;
+ /* Wakeup interrupts may occur during the system sleep transition when
+ * the PHY is inaccessible. Set flag to postpone handling until the PHY
+ * has resumed. Wait for concurrent interrupt handler to complete.
+ */
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 1;
+ synchronize_irq(phydev->irq);
+ }
+
/* We must stop the state machine manually, otherwise it stops out of
* control, possibly with the phydev->lock held. Upon resume, netdev
* may call phy routines that try to grab the same lock, and that may
@@ -315,6 +324,20 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
if (ret < 0)
return ret;
no_resume:
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 0;
+ synchronize_irq(phydev->irq);
+
+ /* Rerun interrupts which were postponed by phy_interrupt()
+ * because they occurred during the system sleep transition.
+ */
+ if (phydev->irq_rerun) {
+ phydev->irq_rerun = 0;
+ enable_irq(phydev->irq);
+ irq_wake_thread(phydev->irq, phydev);
+ }
+ }
+
if (phydev->attached_dev && phydev->adjust_link)
phy_start_machine(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 508f1149665b..b09f7d36cff2 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -572,6 +572,10 @@ struct macsec_ops;
* @mdix_ctrl: User setting of crossover
* @pma_extable: Cached value of PMA/PMD Extended Abilities Register
* @interrupts: Flag interrupts have been enabled
+ * @irq_suspended: Flag indicating PHY is suspended and therefore interrupt
+ * handling shall be postponed until PHY has resumed
+ * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
+ * requiring a rerun of the interrupt handler after resume
* @interface: enum phy_interface_t value
* @skb: Netlink message for cable diagnostics
* @nest: Netlink nest used for cable diagnostics
@@ -626,6 +630,8 @@ struct phy_device {
/* Interrupts are enabled */
unsigned interrupts:1;
+ unsigned irq_suspended:1;
+ unsigned irq_rerun:1;
enum phy_state state;
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ebc2cec0b7dd8dad0812449110803bd875ac816 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 29 Jun 2022 13:40:01 -0400
Subject: [PATCH] dm raid: fix KASAN warning in raid5_remove_disk
There's a KASAN warning in raid5_remove_disk when running the LVM
testsuite. We fix this warning by verifying that the "number" variable is
within limits.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..ba289411f26f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7933,7 +7933,7 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
int err = 0;
int number = rdev->raid_disk;
struct md_rdev __rcu **rdevp;
- struct disk_info *p = conf->disks + number;
+ struct disk_info *p;
struct md_rdev *tmp;
print_raid5_conf(conf);
@@ -7952,6 +7952,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
log_exit(conf);
return 0;
}
+ if (unlikely(number >= conf->pool_size))
+ return 0;
+ p = conf->disks + number;
if (rdev == rcu_access_pointer(p->rdev))
rdevp = &p->rdev;
else if (rdev == rcu_access_pointer(p->replacement))
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ebc2cec0b7dd8dad0812449110803bd875ac816 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 29 Jun 2022 13:40:01 -0400
Subject: [PATCH] dm raid: fix KASAN warning in raid5_remove_disk
There's a KASAN warning in raid5_remove_disk when running the LVM
testsuite. We fix this warning by verifying that the "number" variable is
within limits.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..ba289411f26f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7933,7 +7933,7 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
int err = 0;
int number = rdev->raid_disk;
struct md_rdev __rcu **rdevp;
- struct disk_info *p = conf->disks + number;
+ struct disk_info *p;
struct md_rdev *tmp;
print_raid5_conf(conf);
@@ -7952,6 +7952,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
log_exit(conf);
return 0;
}
+ if (unlikely(number >= conf->pool_size))
+ return 0;
+ p = conf->disks + number;
if (rdev == rcu_access_pointer(p->rdev))
rdevp = &p->rdev;
else if (rdev == rcu_access_pointer(p->replacement))
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ebc2cec0b7dd8dad0812449110803bd875ac816 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 29 Jun 2022 13:40:01 -0400
Subject: [PATCH] dm raid: fix KASAN warning in raid5_remove_disk
There's a KASAN warning in raid5_remove_disk when running the LVM
testsuite. We fix this warning by verifying that the "number" variable is
within limits.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..ba289411f26f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7933,7 +7933,7 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
int err = 0;
int number = rdev->raid_disk;
struct md_rdev __rcu **rdevp;
- struct disk_info *p = conf->disks + number;
+ struct disk_info *p;
struct md_rdev *tmp;
print_raid5_conf(conf);
@@ -7952,6 +7952,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
log_exit(conf);
return 0;
}
+ if (unlikely(number >= conf->pool_size))
+ return 0;
+ p = conf->disks + number;
if (rdev == rcu_access_pointer(p->rdev))
rdevp = &p->rdev;
else if (rdev == rcu_access_pointer(p->replacement))
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ebc2cec0b7dd8dad0812449110803bd875ac816 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 29 Jun 2022 13:40:01 -0400
Subject: [PATCH] dm raid: fix KASAN warning in raid5_remove_disk
There's a KASAN warning in raid5_remove_disk when running the LVM
testsuite. We fix this warning by verifying that the "number" variable is
within limits.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..ba289411f26f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7933,7 +7933,7 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
int err = 0;
int number = rdev->raid_disk;
struct md_rdev __rcu **rdevp;
- struct disk_info *p = conf->disks + number;
+ struct disk_info *p;
struct md_rdev *tmp;
print_raid5_conf(conf);
@@ -7952,6 +7952,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
log_exit(conf);
return 0;
}
+ if (unlikely(number >= conf->pool_size))
+ return 0;
+ p = conf->disks + number;
if (rdev == rcu_access_pointer(p->rdev))
rdevp = &p->rdev;
else if (rdev == rcu_access_pointer(p->replacement))
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ebc2cec0b7dd8dad0812449110803bd875ac816 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 29 Jun 2022 13:40:01 -0400
Subject: [PATCH] dm raid: fix KASAN warning in raid5_remove_disk
There's a KASAN warning in raid5_remove_disk when running the LVM
testsuite. We fix this warning by verifying that the "number" variable is
within limits.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..ba289411f26f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7933,7 +7933,7 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
int err = 0;
int number = rdev->raid_disk;
struct md_rdev __rcu **rdevp;
- struct disk_info *p = conf->disks + number;
+ struct disk_info *p;
struct md_rdev *tmp;
print_raid5_conf(conf);
@@ -7952,6 +7952,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
log_exit(conf);
return 0;
}
+ if (unlikely(number >= conf->pool_size))
+ return 0;
+ p = conf->disks + number;
if (rdev == rcu_access_pointer(p->rdev))
rdevp = &p->rdev;
else if (rdev == rcu_access_pointer(p->replacement))
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ebc2cec0b7dd8dad0812449110803bd875ac816 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 29 Jun 2022 13:40:01 -0400
Subject: [PATCH] dm raid: fix KASAN warning in raid5_remove_disk
There's a KASAN warning in raid5_remove_disk when running the LVM
testsuite. We fix this warning by verifying that the "number" variable is
within limits.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..ba289411f26f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7933,7 +7933,7 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
int err = 0;
int number = rdev->raid_disk;
struct md_rdev __rcu **rdevp;
- struct disk_info *p = conf->disks + number;
+ struct disk_info *p;
struct md_rdev *tmp;
print_raid5_conf(conf);
@@ -7952,6 +7952,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
log_exit(conf);
return 0;
}
+ if (unlikely(number >= conf->pool_size))
+ return 0;
+ p = conf->disks + number;
if (rdev == rcu_access_pointer(p->rdev))
rdevp = &p->rdev;
else if (rdev == rcu_access_pointer(p->replacement))
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ebc2cec0b7dd8dad0812449110803bd875ac816 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 29 Jun 2022 13:40:01 -0400
Subject: [PATCH] dm raid: fix KASAN warning in raid5_remove_disk
There's a KASAN warning in raid5_remove_disk when running the LVM
testsuite. We fix this warning by verifying that the "number" variable is
within limits.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..ba289411f26f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7933,7 +7933,7 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
int err = 0;
int number = rdev->raid_disk;
struct md_rdev __rcu **rdevp;
- struct disk_info *p = conf->disks + number;
+ struct disk_info *p;
struct md_rdev *tmp;
print_raid5_conf(conf);
@@ -7952,6 +7952,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
log_exit(conf);
return 0;
}
+ if (unlikely(number >= conf->pool_size))
+ return 0;
+ p = conf->disks + number;
if (rdev == rcu_access_pointer(p->rdev))
rdevp = &p->rdev;
else if (rdev == rcu_access_pointer(p->replacement))
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b5d281f6c16dd432b618bdfd36ddba1a58d5b603 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth(a)gmail.com>
Date: Mon, 20 Jun 2022 00:03:51 +0200
Subject: [PATCH] PM / devfreq: Rework freq_table to be local to devfreq struct
On a devfreq PROBE_DEFER, the freq_table in the driver profile struct,
is never reset and may be leaved in an undefined state.
This comes from the fact that we store the freq_table in the driver
profile struct that is commonly defined as static and not reset on
PROBE_DEFER.
We currently skip the reinit of the freq_table if we found
it's already defined since a driver may declare his own freq_table.
This logic is flawed in the case devfreq core generate a freq_table, set
it in the profile struct and then PROBE_DEFER, freeing the freq_table.
In this case devfreq will found a NOT NULL freq_table that has been
freed, skip the freq_table generation and probe the driver based on the
wrong table.
To fix this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct and never modify the freq_table present
in the profile struct if it does provide it.
Fixes: 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 80a1235ef8fb..9602141bb8ec 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -123,7 +123,7 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
unsigned long *min_freq,
unsigned long *max_freq)
{
- unsigned long *freq_table = devfreq->profile->freq_table;
+ unsigned long *freq_table = devfreq->freq_table;
s32 qos_min_freq, qos_max_freq;
lockdep_assert_held(&devfreq->lock);
@@ -133,11 +133,11 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
* The devfreq drivers can initialize this in either ascending or
* descending order and devfreq core supports both.
*/
- if (freq_table[0] < freq_table[devfreq->profile->max_state - 1]) {
+ if (freq_table[0] < freq_table[devfreq->max_state - 1]) {
*min_freq = freq_table[0];
- *max_freq = freq_table[devfreq->profile->max_state - 1];
+ *max_freq = freq_table[devfreq->max_state - 1];
} else {
- *min_freq = freq_table[devfreq->profile->max_state - 1];
+ *min_freq = freq_table[devfreq->max_state - 1];
*max_freq = freq_table[0];
}
@@ -169,8 +169,8 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
{
int lev;
- for (lev = 0; lev < devfreq->profile->max_state; lev++)
- if (freq == devfreq->profile->freq_table[lev])
+ for (lev = 0; lev < devfreq->max_state; lev++)
+ if (freq == devfreq->freq_table[lev])
return lev;
return -EINVAL;
@@ -178,7 +178,6 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
static int set_freq_table(struct devfreq *devfreq)
{
- struct devfreq_dev_profile *profile = devfreq->profile;
struct dev_pm_opp *opp;
unsigned long freq;
int i, count;
@@ -188,25 +187,22 @@ static int set_freq_table(struct devfreq *devfreq)
if (count <= 0)
return -EINVAL;
- profile->max_state = count;
- profile->freq_table = devm_kcalloc(devfreq->dev.parent,
- profile->max_state,
- sizeof(*profile->freq_table),
- GFP_KERNEL);
- if (!profile->freq_table) {
- profile->max_state = 0;
+ devfreq->max_state = count;
+ devfreq->freq_table = devm_kcalloc(devfreq->dev.parent,
+ devfreq->max_state,
+ sizeof(*devfreq->freq_table),
+ GFP_KERNEL);
+ if (!devfreq->freq_table)
return -ENOMEM;
- }
- for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
+ for (i = 0, freq = 0; i < devfreq->max_state; i++, freq++) {
opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
if (IS_ERR(opp)) {
- devm_kfree(devfreq->dev.parent, profile->freq_table);
- profile->max_state = 0;
+ devm_kfree(devfreq->dev.parent, devfreq->freq_table);
return PTR_ERR(opp);
}
dev_pm_opp_put(opp);
- profile->freq_table[i] = freq;
+ devfreq->freq_table[i] = freq;
}
return 0;
@@ -246,7 +242,7 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq)
if (lev != prev_lev) {
devfreq->stats.trans_table[
- (prev_lev * devfreq->profile->max_state) + lev]++;
+ (prev_lev * devfreq->max_state) + lev]++;
devfreq->stats.total_trans++;
}
@@ -835,6 +831,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
if (err < 0)
goto err_dev;
mutex_lock(&devfreq->lock);
+ } else {
+ devfreq->freq_table = devfreq->profile->freq_table;
+ devfreq->max_state = devfreq->profile->max_state;
}
devfreq->scaling_min_freq = find_available_min_freq(devfreq);
@@ -870,8 +869,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->stats.trans_table = devm_kzalloc(&devfreq->dev,
array3_size(sizeof(unsigned int),
- devfreq->profile->max_state,
- devfreq->profile->max_state),
+ devfreq->max_state,
+ devfreq->max_state),
GFP_KERNEL);
if (!devfreq->stats.trans_table) {
mutex_unlock(&devfreq->lock);
@@ -880,7 +879,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
devfreq->stats.time_in_state = devm_kcalloc(&devfreq->dev,
- devfreq->profile->max_state,
+ devfreq->max_state,
sizeof(*devfreq->stats.time_in_state),
GFP_KERNEL);
if (!devfreq->stats.time_in_state) {
@@ -1666,9 +1665,9 @@ static ssize_t available_frequencies_show(struct device *d,
mutex_lock(&df->lock);
- for (i = 0; i < df->profile->max_state; i++)
+ for (i = 0; i < df->max_state; i++)
count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
- "%lu ", df->profile->freq_table[i]);
+ "%lu ", df->freq_table[i]);
mutex_unlock(&df->lock);
/* Truncate the trailing space */
@@ -1691,7 +1690,7 @@ static ssize_t trans_stat_show(struct device *dev,
if (!df->profile)
return -EINVAL;
- max_state = df->profile->max_state;
+ max_state = df->max_state;
if (max_state == 0)
return sprintf(buf, "Not Supported.\n");
@@ -1708,19 +1707,17 @@ static ssize_t trans_stat_show(struct device *dev,
len += sprintf(buf + len, " :");
for (i = 0; i < max_state; i++)
len += sprintf(buf + len, "%10lu",
- df->profile->freq_table[i]);
+ df->freq_table[i]);
len += sprintf(buf + len, " time(ms)\n");
for (i = 0; i < max_state; i++) {
- if (df->profile->freq_table[i]
- == df->previous_freq) {
+ if (df->freq_table[i] == df->previous_freq)
len += sprintf(buf + len, "*");
- } else {
+ else
len += sprintf(buf + len, " ");
- }
- len += sprintf(buf + len, "%10lu:",
- df->profile->freq_table[i]);
+
+ len += sprintf(buf + len, "%10lu:", df->freq_table[i]);
for (j = 0; j < max_state; j++)
len += sprintf(buf + len, "%10u",
df->stats.trans_table[(i * max_state) + j]);
@@ -1744,7 +1741,7 @@ static ssize_t trans_stat_store(struct device *dev,
if (!df->profile)
return -EINVAL;
- if (df->profile->max_state == 0)
+ if (df->max_state == 0)
return count;
err = kstrtoint(buf, 10, &value);
@@ -1752,11 +1749,11 @@ static ssize_t trans_stat_store(struct device *dev,
return -EINVAL;
mutex_lock(&df->lock);
- memset(df->stats.time_in_state, 0, (df->profile->max_state *
+ memset(df->stats.time_in_state, 0, (df->max_state *
sizeof(*df->stats.time_in_state)));
memset(df->stats.trans_table, 0, array3_size(sizeof(unsigned int),
- df->profile->max_state,
- df->profile->max_state));
+ df->max_state,
+ df->max_state));
df->stats.total_trans = 0;
df->stats.last_update = get_jiffies_64();
mutex_unlock(&df->lock);
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 69e06725d92b..406ef79c0c46 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -144,18 +144,18 @@ static int get_target_freq_with_devfreq(struct devfreq *devfreq,
goto out;
/* Use interpolation if required opps is not available */
- for (i = 0; i < parent_devfreq->profile->max_state; i++)
- if (parent_devfreq->profile->freq_table[i] == *freq)
+ for (i = 0; i < parent_devfreq->max_state; i++)
+ if (parent_devfreq->freq_table[i] == *freq)
break;
- if (i == parent_devfreq->profile->max_state)
+ if (i == parent_devfreq->max_state)
return -EINVAL;
- if (i < devfreq->profile->max_state) {
- child_freq = devfreq->profile->freq_table[i];
+ if (i < devfreq->max_state) {
+ child_freq = devfreq->freq_table[i];
} else {
- count = devfreq->profile->max_state;
- child_freq = devfreq->profile->freq_table[count - 1];
+ count = devfreq->max_state;
+ child_freq = devfreq->freq_table[count - 1];
}
out:
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index dc10bee75a72..34aab4dd336c 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -148,6 +148,8 @@ struct devfreq_stats {
* reevaluate operable frequencies. Devfreq users may use
* devfreq.nb to the corresponding register notifier call chain.
* @work: delayed work for load monitoring.
+ * @freq_table: current frequency table used by the devfreq driver.
+ * @max_state: count of entry present in the frequency table.
* @previous_freq: previously configured frequency value.
* @last_status: devfreq user device info, performance statistics
* @data: Private data of the governor. The devfreq framework does not
@@ -185,6 +187,9 @@ struct devfreq {
struct notifier_block nb;
struct delayed_work work;
+ unsigned long *freq_table;
+ unsigned int max_state;
+
unsigned long previous_freq;
struct devfreq_dev_status last_status;
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b5d281f6c16dd432b618bdfd36ddba1a58d5b603 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth(a)gmail.com>
Date: Mon, 20 Jun 2022 00:03:51 +0200
Subject: [PATCH] PM / devfreq: Rework freq_table to be local to devfreq struct
On a devfreq PROBE_DEFER, the freq_table in the driver profile struct,
is never reset and may be leaved in an undefined state.
This comes from the fact that we store the freq_table in the driver
profile struct that is commonly defined as static and not reset on
PROBE_DEFER.
We currently skip the reinit of the freq_table if we found
it's already defined since a driver may declare his own freq_table.
This logic is flawed in the case devfreq core generate a freq_table, set
it in the profile struct and then PROBE_DEFER, freeing the freq_table.
In this case devfreq will found a NOT NULL freq_table that has been
freed, skip the freq_table generation and probe the driver based on the
wrong table.
To fix this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct and never modify the freq_table present
in the profile struct if it does provide it.
Fixes: 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 80a1235ef8fb..9602141bb8ec 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -123,7 +123,7 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
unsigned long *min_freq,
unsigned long *max_freq)
{
- unsigned long *freq_table = devfreq->profile->freq_table;
+ unsigned long *freq_table = devfreq->freq_table;
s32 qos_min_freq, qos_max_freq;
lockdep_assert_held(&devfreq->lock);
@@ -133,11 +133,11 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
* The devfreq drivers can initialize this in either ascending or
* descending order and devfreq core supports both.
*/
- if (freq_table[0] < freq_table[devfreq->profile->max_state - 1]) {
+ if (freq_table[0] < freq_table[devfreq->max_state - 1]) {
*min_freq = freq_table[0];
- *max_freq = freq_table[devfreq->profile->max_state - 1];
+ *max_freq = freq_table[devfreq->max_state - 1];
} else {
- *min_freq = freq_table[devfreq->profile->max_state - 1];
+ *min_freq = freq_table[devfreq->max_state - 1];
*max_freq = freq_table[0];
}
@@ -169,8 +169,8 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
{
int lev;
- for (lev = 0; lev < devfreq->profile->max_state; lev++)
- if (freq == devfreq->profile->freq_table[lev])
+ for (lev = 0; lev < devfreq->max_state; lev++)
+ if (freq == devfreq->freq_table[lev])
return lev;
return -EINVAL;
@@ -178,7 +178,6 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
static int set_freq_table(struct devfreq *devfreq)
{
- struct devfreq_dev_profile *profile = devfreq->profile;
struct dev_pm_opp *opp;
unsigned long freq;
int i, count;
@@ -188,25 +187,22 @@ static int set_freq_table(struct devfreq *devfreq)
if (count <= 0)
return -EINVAL;
- profile->max_state = count;
- profile->freq_table = devm_kcalloc(devfreq->dev.parent,
- profile->max_state,
- sizeof(*profile->freq_table),
- GFP_KERNEL);
- if (!profile->freq_table) {
- profile->max_state = 0;
+ devfreq->max_state = count;
+ devfreq->freq_table = devm_kcalloc(devfreq->dev.parent,
+ devfreq->max_state,
+ sizeof(*devfreq->freq_table),
+ GFP_KERNEL);
+ if (!devfreq->freq_table)
return -ENOMEM;
- }
- for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
+ for (i = 0, freq = 0; i < devfreq->max_state; i++, freq++) {
opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
if (IS_ERR(opp)) {
- devm_kfree(devfreq->dev.parent, profile->freq_table);
- profile->max_state = 0;
+ devm_kfree(devfreq->dev.parent, devfreq->freq_table);
return PTR_ERR(opp);
}
dev_pm_opp_put(opp);
- profile->freq_table[i] = freq;
+ devfreq->freq_table[i] = freq;
}
return 0;
@@ -246,7 +242,7 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq)
if (lev != prev_lev) {
devfreq->stats.trans_table[
- (prev_lev * devfreq->profile->max_state) + lev]++;
+ (prev_lev * devfreq->max_state) + lev]++;
devfreq->stats.total_trans++;
}
@@ -835,6 +831,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
if (err < 0)
goto err_dev;
mutex_lock(&devfreq->lock);
+ } else {
+ devfreq->freq_table = devfreq->profile->freq_table;
+ devfreq->max_state = devfreq->profile->max_state;
}
devfreq->scaling_min_freq = find_available_min_freq(devfreq);
@@ -870,8 +869,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->stats.trans_table = devm_kzalloc(&devfreq->dev,
array3_size(sizeof(unsigned int),
- devfreq->profile->max_state,
- devfreq->profile->max_state),
+ devfreq->max_state,
+ devfreq->max_state),
GFP_KERNEL);
if (!devfreq->stats.trans_table) {
mutex_unlock(&devfreq->lock);
@@ -880,7 +879,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
devfreq->stats.time_in_state = devm_kcalloc(&devfreq->dev,
- devfreq->profile->max_state,
+ devfreq->max_state,
sizeof(*devfreq->stats.time_in_state),
GFP_KERNEL);
if (!devfreq->stats.time_in_state) {
@@ -1666,9 +1665,9 @@ static ssize_t available_frequencies_show(struct device *d,
mutex_lock(&df->lock);
- for (i = 0; i < df->profile->max_state; i++)
+ for (i = 0; i < df->max_state; i++)
count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
- "%lu ", df->profile->freq_table[i]);
+ "%lu ", df->freq_table[i]);
mutex_unlock(&df->lock);
/* Truncate the trailing space */
@@ -1691,7 +1690,7 @@ static ssize_t trans_stat_show(struct device *dev,
if (!df->profile)
return -EINVAL;
- max_state = df->profile->max_state;
+ max_state = df->max_state;
if (max_state == 0)
return sprintf(buf, "Not Supported.\n");
@@ -1708,19 +1707,17 @@ static ssize_t trans_stat_show(struct device *dev,
len += sprintf(buf + len, " :");
for (i = 0; i < max_state; i++)
len += sprintf(buf + len, "%10lu",
- df->profile->freq_table[i]);
+ df->freq_table[i]);
len += sprintf(buf + len, " time(ms)\n");
for (i = 0; i < max_state; i++) {
- if (df->profile->freq_table[i]
- == df->previous_freq) {
+ if (df->freq_table[i] == df->previous_freq)
len += sprintf(buf + len, "*");
- } else {
+ else
len += sprintf(buf + len, " ");
- }
- len += sprintf(buf + len, "%10lu:",
- df->profile->freq_table[i]);
+
+ len += sprintf(buf + len, "%10lu:", df->freq_table[i]);
for (j = 0; j < max_state; j++)
len += sprintf(buf + len, "%10u",
df->stats.trans_table[(i * max_state) + j]);
@@ -1744,7 +1741,7 @@ static ssize_t trans_stat_store(struct device *dev,
if (!df->profile)
return -EINVAL;
- if (df->profile->max_state == 0)
+ if (df->max_state == 0)
return count;
err = kstrtoint(buf, 10, &value);
@@ -1752,11 +1749,11 @@ static ssize_t trans_stat_store(struct device *dev,
return -EINVAL;
mutex_lock(&df->lock);
- memset(df->stats.time_in_state, 0, (df->profile->max_state *
+ memset(df->stats.time_in_state, 0, (df->max_state *
sizeof(*df->stats.time_in_state)));
memset(df->stats.trans_table, 0, array3_size(sizeof(unsigned int),
- df->profile->max_state,
- df->profile->max_state));
+ df->max_state,
+ df->max_state));
df->stats.total_trans = 0;
df->stats.last_update = get_jiffies_64();
mutex_unlock(&df->lock);
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 69e06725d92b..406ef79c0c46 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -144,18 +144,18 @@ static int get_target_freq_with_devfreq(struct devfreq *devfreq,
goto out;
/* Use interpolation if required opps is not available */
- for (i = 0; i < parent_devfreq->profile->max_state; i++)
- if (parent_devfreq->profile->freq_table[i] == *freq)
+ for (i = 0; i < parent_devfreq->max_state; i++)
+ if (parent_devfreq->freq_table[i] == *freq)
break;
- if (i == parent_devfreq->profile->max_state)
+ if (i == parent_devfreq->max_state)
return -EINVAL;
- if (i < devfreq->profile->max_state) {
- child_freq = devfreq->profile->freq_table[i];
+ if (i < devfreq->max_state) {
+ child_freq = devfreq->freq_table[i];
} else {
- count = devfreq->profile->max_state;
- child_freq = devfreq->profile->freq_table[count - 1];
+ count = devfreq->max_state;
+ child_freq = devfreq->freq_table[count - 1];
}
out:
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index dc10bee75a72..34aab4dd336c 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -148,6 +148,8 @@ struct devfreq_stats {
* reevaluate operable frequencies. Devfreq users may use
* devfreq.nb to the corresponding register notifier call chain.
* @work: delayed work for load monitoring.
+ * @freq_table: current frequency table used by the devfreq driver.
+ * @max_state: count of entry present in the frequency table.
* @previous_freq: previously configured frequency value.
* @last_status: devfreq user device info, performance statistics
* @data: Private data of the governor. The devfreq framework does not
@@ -185,6 +187,9 @@ struct devfreq {
struct notifier_block nb;
struct delayed_work work;
+ unsigned long *freq_table;
+ unsigned int max_state;
+
unsigned long previous_freq;
struct devfreq_dev_status last_status;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b5d281f6c16dd432b618bdfd36ddba1a58d5b603 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth(a)gmail.com>
Date: Mon, 20 Jun 2022 00:03:51 +0200
Subject: [PATCH] PM / devfreq: Rework freq_table to be local to devfreq struct
On a devfreq PROBE_DEFER, the freq_table in the driver profile struct,
is never reset and may be leaved in an undefined state.
This comes from the fact that we store the freq_table in the driver
profile struct that is commonly defined as static and not reset on
PROBE_DEFER.
We currently skip the reinit of the freq_table if we found
it's already defined since a driver may declare his own freq_table.
This logic is flawed in the case devfreq core generate a freq_table, set
it in the profile struct and then PROBE_DEFER, freeing the freq_table.
In this case devfreq will found a NOT NULL freq_table that has been
freed, skip the freq_table generation and probe the driver based on the
wrong table.
To fix this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct and never modify the freq_table present
in the profile struct if it does provide it.
Fixes: 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 80a1235ef8fb..9602141bb8ec 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -123,7 +123,7 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
unsigned long *min_freq,
unsigned long *max_freq)
{
- unsigned long *freq_table = devfreq->profile->freq_table;
+ unsigned long *freq_table = devfreq->freq_table;
s32 qos_min_freq, qos_max_freq;
lockdep_assert_held(&devfreq->lock);
@@ -133,11 +133,11 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
* The devfreq drivers can initialize this in either ascending or
* descending order and devfreq core supports both.
*/
- if (freq_table[0] < freq_table[devfreq->profile->max_state - 1]) {
+ if (freq_table[0] < freq_table[devfreq->max_state - 1]) {
*min_freq = freq_table[0];
- *max_freq = freq_table[devfreq->profile->max_state - 1];
+ *max_freq = freq_table[devfreq->max_state - 1];
} else {
- *min_freq = freq_table[devfreq->profile->max_state - 1];
+ *min_freq = freq_table[devfreq->max_state - 1];
*max_freq = freq_table[0];
}
@@ -169,8 +169,8 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
{
int lev;
- for (lev = 0; lev < devfreq->profile->max_state; lev++)
- if (freq == devfreq->profile->freq_table[lev])
+ for (lev = 0; lev < devfreq->max_state; lev++)
+ if (freq == devfreq->freq_table[lev])
return lev;
return -EINVAL;
@@ -178,7 +178,6 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
static int set_freq_table(struct devfreq *devfreq)
{
- struct devfreq_dev_profile *profile = devfreq->profile;
struct dev_pm_opp *opp;
unsigned long freq;
int i, count;
@@ -188,25 +187,22 @@ static int set_freq_table(struct devfreq *devfreq)
if (count <= 0)
return -EINVAL;
- profile->max_state = count;
- profile->freq_table = devm_kcalloc(devfreq->dev.parent,
- profile->max_state,
- sizeof(*profile->freq_table),
- GFP_KERNEL);
- if (!profile->freq_table) {
- profile->max_state = 0;
+ devfreq->max_state = count;
+ devfreq->freq_table = devm_kcalloc(devfreq->dev.parent,
+ devfreq->max_state,
+ sizeof(*devfreq->freq_table),
+ GFP_KERNEL);
+ if (!devfreq->freq_table)
return -ENOMEM;
- }
- for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
+ for (i = 0, freq = 0; i < devfreq->max_state; i++, freq++) {
opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
if (IS_ERR(opp)) {
- devm_kfree(devfreq->dev.parent, profile->freq_table);
- profile->max_state = 0;
+ devm_kfree(devfreq->dev.parent, devfreq->freq_table);
return PTR_ERR(opp);
}
dev_pm_opp_put(opp);
- profile->freq_table[i] = freq;
+ devfreq->freq_table[i] = freq;
}
return 0;
@@ -246,7 +242,7 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq)
if (lev != prev_lev) {
devfreq->stats.trans_table[
- (prev_lev * devfreq->profile->max_state) + lev]++;
+ (prev_lev * devfreq->max_state) + lev]++;
devfreq->stats.total_trans++;
}
@@ -835,6 +831,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
if (err < 0)
goto err_dev;
mutex_lock(&devfreq->lock);
+ } else {
+ devfreq->freq_table = devfreq->profile->freq_table;
+ devfreq->max_state = devfreq->profile->max_state;
}
devfreq->scaling_min_freq = find_available_min_freq(devfreq);
@@ -870,8 +869,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->stats.trans_table = devm_kzalloc(&devfreq->dev,
array3_size(sizeof(unsigned int),
- devfreq->profile->max_state,
- devfreq->profile->max_state),
+ devfreq->max_state,
+ devfreq->max_state),
GFP_KERNEL);
if (!devfreq->stats.trans_table) {
mutex_unlock(&devfreq->lock);
@@ -880,7 +879,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
devfreq->stats.time_in_state = devm_kcalloc(&devfreq->dev,
- devfreq->profile->max_state,
+ devfreq->max_state,
sizeof(*devfreq->stats.time_in_state),
GFP_KERNEL);
if (!devfreq->stats.time_in_state) {
@@ -1666,9 +1665,9 @@ static ssize_t available_frequencies_show(struct device *d,
mutex_lock(&df->lock);
- for (i = 0; i < df->profile->max_state; i++)
+ for (i = 0; i < df->max_state; i++)
count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
- "%lu ", df->profile->freq_table[i]);
+ "%lu ", df->freq_table[i]);
mutex_unlock(&df->lock);
/* Truncate the trailing space */
@@ -1691,7 +1690,7 @@ static ssize_t trans_stat_show(struct device *dev,
if (!df->profile)
return -EINVAL;
- max_state = df->profile->max_state;
+ max_state = df->max_state;
if (max_state == 0)
return sprintf(buf, "Not Supported.\n");
@@ -1708,19 +1707,17 @@ static ssize_t trans_stat_show(struct device *dev,
len += sprintf(buf + len, " :");
for (i = 0; i < max_state; i++)
len += sprintf(buf + len, "%10lu",
- df->profile->freq_table[i]);
+ df->freq_table[i]);
len += sprintf(buf + len, " time(ms)\n");
for (i = 0; i < max_state; i++) {
- if (df->profile->freq_table[i]
- == df->previous_freq) {
+ if (df->freq_table[i] == df->previous_freq)
len += sprintf(buf + len, "*");
- } else {
+ else
len += sprintf(buf + len, " ");
- }
- len += sprintf(buf + len, "%10lu:",
- df->profile->freq_table[i]);
+
+ len += sprintf(buf + len, "%10lu:", df->freq_table[i]);
for (j = 0; j < max_state; j++)
len += sprintf(buf + len, "%10u",
df->stats.trans_table[(i * max_state) + j]);
@@ -1744,7 +1741,7 @@ static ssize_t trans_stat_store(struct device *dev,
if (!df->profile)
return -EINVAL;
- if (df->profile->max_state == 0)
+ if (df->max_state == 0)
return count;
err = kstrtoint(buf, 10, &value);
@@ -1752,11 +1749,11 @@ static ssize_t trans_stat_store(struct device *dev,
return -EINVAL;
mutex_lock(&df->lock);
- memset(df->stats.time_in_state, 0, (df->profile->max_state *
+ memset(df->stats.time_in_state, 0, (df->max_state *
sizeof(*df->stats.time_in_state)));
memset(df->stats.trans_table, 0, array3_size(sizeof(unsigned int),
- df->profile->max_state,
- df->profile->max_state));
+ df->max_state,
+ df->max_state));
df->stats.total_trans = 0;
df->stats.last_update = get_jiffies_64();
mutex_unlock(&df->lock);
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 69e06725d92b..406ef79c0c46 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -144,18 +144,18 @@ static int get_target_freq_with_devfreq(struct devfreq *devfreq,
goto out;
/* Use interpolation if required opps is not available */
- for (i = 0; i < parent_devfreq->profile->max_state; i++)
- if (parent_devfreq->profile->freq_table[i] == *freq)
+ for (i = 0; i < parent_devfreq->max_state; i++)
+ if (parent_devfreq->freq_table[i] == *freq)
break;
- if (i == parent_devfreq->profile->max_state)
+ if (i == parent_devfreq->max_state)
return -EINVAL;
- if (i < devfreq->profile->max_state) {
- child_freq = devfreq->profile->freq_table[i];
+ if (i < devfreq->max_state) {
+ child_freq = devfreq->freq_table[i];
} else {
- count = devfreq->profile->max_state;
- child_freq = devfreq->profile->freq_table[count - 1];
+ count = devfreq->max_state;
+ child_freq = devfreq->freq_table[count - 1];
}
out:
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index dc10bee75a72..34aab4dd336c 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -148,6 +148,8 @@ struct devfreq_stats {
* reevaluate operable frequencies. Devfreq users may use
* devfreq.nb to the corresponding register notifier call chain.
* @work: delayed work for load monitoring.
+ * @freq_table: current frequency table used by the devfreq driver.
+ * @max_state: count of entry present in the frequency table.
* @previous_freq: previously configured frequency value.
* @last_status: devfreq user device info, performance statistics
* @data: Private data of the governor. The devfreq framework does not
@@ -185,6 +187,9 @@ struct devfreq {
struct notifier_block nb;
struct delayed_work work;
+ unsigned long *freq_table;
+ unsigned int max_state;
+
unsigned long previous_freq;
struct devfreq_dev_status last_status;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b5d281f6c16dd432b618bdfd36ddba1a58d5b603 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth(a)gmail.com>
Date: Mon, 20 Jun 2022 00:03:51 +0200
Subject: [PATCH] PM / devfreq: Rework freq_table to be local to devfreq struct
On a devfreq PROBE_DEFER, the freq_table in the driver profile struct,
is never reset and may be leaved in an undefined state.
This comes from the fact that we store the freq_table in the driver
profile struct that is commonly defined as static and not reset on
PROBE_DEFER.
We currently skip the reinit of the freq_table if we found
it's already defined since a driver may declare his own freq_table.
This logic is flawed in the case devfreq core generate a freq_table, set
it in the profile struct and then PROBE_DEFER, freeing the freq_table.
In this case devfreq will found a NOT NULL freq_table that has been
freed, skip the freq_table generation and probe the driver based on the
wrong table.
To fix this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct and never modify the freq_table present
in the profile struct if it does provide it.
Fixes: 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 80a1235ef8fb..9602141bb8ec 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -123,7 +123,7 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
unsigned long *min_freq,
unsigned long *max_freq)
{
- unsigned long *freq_table = devfreq->profile->freq_table;
+ unsigned long *freq_table = devfreq->freq_table;
s32 qos_min_freq, qos_max_freq;
lockdep_assert_held(&devfreq->lock);
@@ -133,11 +133,11 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
* The devfreq drivers can initialize this in either ascending or
* descending order and devfreq core supports both.
*/
- if (freq_table[0] < freq_table[devfreq->profile->max_state - 1]) {
+ if (freq_table[0] < freq_table[devfreq->max_state - 1]) {
*min_freq = freq_table[0];
- *max_freq = freq_table[devfreq->profile->max_state - 1];
+ *max_freq = freq_table[devfreq->max_state - 1];
} else {
- *min_freq = freq_table[devfreq->profile->max_state - 1];
+ *min_freq = freq_table[devfreq->max_state - 1];
*max_freq = freq_table[0];
}
@@ -169,8 +169,8 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
{
int lev;
- for (lev = 0; lev < devfreq->profile->max_state; lev++)
- if (freq == devfreq->profile->freq_table[lev])
+ for (lev = 0; lev < devfreq->max_state; lev++)
+ if (freq == devfreq->freq_table[lev])
return lev;
return -EINVAL;
@@ -178,7 +178,6 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
static int set_freq_table(struct devfreq *devfreq)
{
- struct devfreq_dev_profile *profile = devfreq->profile;
struct dev_pm_opp *opp;
unsigned long freq;
int i, count;
@@ -188,25 +187,22 @@ static int set_freq_table(struct devfreq *devfreq)
if (count <= 0)
return -EINVAL;
- profile->max_state = count;
- profile->freq_table = devm_kcalloc(devfreq->dev.parent,
- profile->max_state,
- sizeof(*profile->freq_table),
- GFP_KERNEL);
- if (!profile->freq_table) {
- profile->max_state = 0;
+ devfreq->max_state = count;
+ devfreq->freq_table = devm_kcalloc(devfreq->dev.parent,
+ devfreq->max_state,
+ sizeof(*devfreq->freq_table),
+ GFP_KERNEL);
+ if (!devfreq->freq_table)
return -ENOMEM;
- }
- for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
+ for (i = 0, freq = 0; i < devfreq->max_state; i++, freq++) {
opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
if (IS_ERR(opp)) {
- devm_kfree(devfreq->dev.parent, profile->freq_table);
- profile->max_state = 0;
+ devm_kfree(devfreq->dev.parent, devfreq->freq_table);
return PTR_ERR(opp);
}
dev_pm_opp_put(opp);
- profile->freq_table[i] = freq;
+ devfreq->freq_table[i] = freq;
}
return 0;
@@ -246,7 +242,7 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq)
if (lev != prev_lev) {
devfreq->stats.trans_table[
- (prev_lev * devfreq->profile->max_state) + lev]++;
+ (prev_lev * devfreq->max_state) + lev]++;
devfreq->stats.total_trans++;
}
@@ -835,6 +831,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
if (err < 0)
goto err_dev;
mutex_lock(&devfreq->lock);
+ } else {
+ devfreq->freq_table = devfreq->profile->freq_table;
+ devfreq->max_state = devfreq->profile->max_state;
}
devfreq->scaling_min_freq = find_available_min_freq(devfreq);
@@ -870,8 +869,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->stats.trans_table = devm_kzalloc(&devfreq->dev,
array3_size(sizeof(unsigned int),
- devfreq->profile->max_state,
- devfreq->profile->max_state),
+ devfreq->max_state,
+ devfreq->max_state),
GFP_KERNEL);
if (!devfreq->stats.trans_table) {
mutex_unlock(&devfreq->lock);
@@ -880,7 +879,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
devfreq->stats.time_in_state = devm_kcalloc(&devfreq->dev,
- devfreq->profile->max_state,
+ devfreq->max_state,
sizeof(*devfreq->stats.time_in_state),
GFP_KERNEL);
if (!devfreq->stats.time_in_state) {
@@ -1666,9 +1665,9 @@ static ssize_t available_frequencies_show(struct device *d,
mutex_lock(&df->lock);
- for (i = 0; i < df->profile->max_state; i++)
+ for (i = 0; i < df->max_state; i++)
count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
- "%lu ", df->profile->freq_table[i]);
+ "%lu ", df->freq_table[i]);
mutex_unlock(&df->lock);
/* Truncate the trailing space */
@@ -1691,7 +1690,7 @@ static ssize_t trans_stat_show(struct device *dev,
if (!df->profile)
return -EINVAL;
- max_state = df->profile->max_state;
+ max_state = df->max_state;
if (max_state == 0)
return sprintf(buf, "Not Supported.\n");
@@ -1708,19 +1707,17 @@ static ssize_t trans_stat_show(struct device *dev,
len += sprintf(buf + len, " :");
for (i = 0; i < max_state; i++)
len += sprintf(buf + len, "%10lu",
- df->profile->freq_table[i]);
+ df->freq_table[i]);
len += sprintf(buf + len, " time(ms)\n");
for (i = 0; i < max_state; i++) {
- if (df->profile->freq_table[i]
- == df->previous_freq) {
+ if (df->freq_table[i] == df->previous_freq)
len += sprintf(buf + len, "*");
- } else {
+ else
len += sprintf(buf + len, " ");
- }
- len += sprintf(buf + len, "%10lu:",
- df->profile->freq_table[i]);
+
+ len += sprintf(buf + len, "%10lu:", df->freq_table[i]);
for (j = 0; j < max_state; j++)
len += sprintf(buf + len, "%10u",
df->stats.trans_table[(i * max_state) + j]);
@@ -1744,7 +1741,7 @@ static ssize_t trans_stat_store(struct device *dev,
if (!df->profile)
return -EINVAL;
- if (df->profile->max_state == 0)
+ if (df->max_state == 0)
return count;
err = kstrtoint(buf, 10, &value);
@@ -1752,11 +1749,11 @@ static ssize_t trans_stat_store(struct device *dev,
return -EINVAL;
mutex_lock(&df->lock);
- memset(df->stats.time_in_state, 0, (df->profile->max_state *
+ memset(df->stats.time_in_state, 0, (df->max_state *
sizeof(*df->stats.time_in_state)));
memset(df->stats.trans_table, 0, array3_size(sizeof(unsigned int),
- df->profile->max_state,
- df->profile->max_state));
+ df->max_state,
+ df->max_state));
df->stats.total_trans = 0;
df->stats.last_update = get_jiffies_64();
mutex_unlock(&df->lock);
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 69e06725d92b..406ef79c0c46 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -144,18 +144,18 @@ static int get_target_freq_with_devfreq(struct devfreq *devfreq,
goto out;
/* Use interpolation if required opps is not available */
- for (i = 0; i < parent_devfreq->profile->max_state; i++)
- if (parent_devfreq->profile->freq_table[i] == *freq)
+ for (i = 0; i < parent_devfreq->max_state; i++)
+ if (parent_devfreq->freq_table[i] == *freq)
break;
- if (i == parent_devfreq->profile->max_state)
+ if (i == parent_devfreq->max_state)
return -EINVAL;
- if (i < devfreq->profile->max_state) {
- child_freq = devfreq->profile->freq_table[i];
+ if (i < devfreq->max_state) {
+ child_freq = devfreq->freq_table[i];
} else {
- count = devfreq->profile->max_state;
- child_freq = devfreq->profile->freq_table[count - 1];
+ count = devfreq->max_state;
+ child_freq = devfreq->freq_table[count - 1];
}
out:
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index dc10bee75a72..34aab4dd336c 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -148,6 +148,8 @@ struct devfreq_stats {
* reevaluate operable frequencies. Devfreq users may use
* devfreq.nb to the corresponding register notifier call chain.
* @work: delayed work for load monitoring.
+ * @freq_table: current frequency table used by the devfreq driver.
+ * @max_state: count of entry present in the frequency table.
* @previous_freq: previously configured frequency value.
* @last_status: devfreq user device info, performance statistics
* @data: Private data of the governor. The devfreq framework does not
@@ -185,6 +187,9 @@ struct devfreq {
struct notifier_block nb;
struct delayed_work work;
+ unsigned long *freq_table;
+ unsigned int max_state;
+
unsigned long previous_freq;
struct devfreq_dev_status last_status;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b5d281f6c16dd432b618bdfd36ddba1a58d5b603 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth(a)gmail.com>
Date: Mon, 20 Jun 2022 00:03:51 +0200
Subject: [PATCH] PM / devfreq: Rework freq_table to be local to devfreq struct
On a devfreq PROBE_DEFER, the freq_table in the driver profile struct,
is never reset and may be leaved in an undefined state.
This comes from the fact that we store the freq_table in the driver
profile struct that is commonly defined as static and not reset on
PROBE_DEFER.
We currently skip the reinit of the freq_table if we found
it's already defined since a driver may declare his own freq_table.
This logic is flawed in the case devfreq core generate a freq_table, set
it in the profile struct and then PROBE_DEFER, freeing the freq_table.
In this case devfreq will found a NOT NULL freq_table that has been
freed, skip the freq_table generation and probe the driver based on the
wrong table.
To fix this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct and never modify the freq_table present
in the profile struct if it does provide it.
Fixes: 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 80a1235ef8fb..9602141bb8ec 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -123,7 +123,7 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
unsigned long *min_freq,
unsigned long *max_freq)
{
- unsigned long *freq_table = devfreq->profile->freq_table;
+ unsigned long *freq_table = devfreq->freq_table;
s32 qos_min_freq, qos_max_freq;
lockdep_assert_held(&devfreq->lock);
@@ -133,11 +133,11 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
* The devfreq drivers can initialize this in either ascending or
* descending order and devfreq core supports both.
*/
- if (freq_table[0] < freq_table[devfreq->profile->max_state - 1]) {
+ if (freq_table[0] < freq_table[devfreq->max_state - 1]) {
*min_freq = freq_table[0];
- *max_freq = freq_table[devfreq->profile->max_state - 1];
+ *max_freq = freq_table[devfreq->max_state - 1];
} else {
- *min_freq = freq_table[devfreq->profile->max_state - 1];
+ *min_freq = freq_table[devfreq->max_state - 1];
*max_freq = freq_table[0];
}
@@ -169,8 +169,8 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
{
int lev;
- for (lev = 0; lev < devfreq->profile->max_state; lev++)
- if (freq == devfreq->profile->freq_table[lev])
+ for (lev = 0; lev < devfreq->max_state; lev++)
+ if (freq == devfreq->freq_table[lev])
return lev;
return -EINVAL;
@@ -178,7 +178,6 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
static int set_freq_table(struct devfreq *devfreq)
{
- struct devfreq_dev_profile *profile = devfreq->profile;
struct dev_pm_opp *opp;
unsigned long freq;
int i, count;
@@ -188,25 +187,22 @@ static int set_freq_table(struct devfreq *devfreq)
if (count <= 0)
return -EINVAL;
- profile->max_state = count;
- profile->freq_table = devm_kcalloc(devfreq->dev.parent,
- profile->max_state,
- sizeof(*profile->freq_table),
- GFP_KERNEL);
- if (!profile->freq_table) {
- profile->max_state = 0;
+ devfreq->max_state = count;
+ devfreq->freq_table = devm_kcalloc(devfreq->dev.parent,
+ devfreq->max_state,
+ sizeof(*devfreq->freq_table),
+ GFP_KERNEL);
+ if (!devfreq->freq_table)
return -ENOMEM;
- }
- for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
+ for (i = 0, freq = 0; i < devfreq->max_state; i++, freq++) {
opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
if (IS_ERR(opp)) {
- devm_kfree(devfreq->dev.parent, profile->freq_table);
- profile->max_state = 0;
+ devm_kfree(devfreq->dev.parent, devfreq->freq_table);
return PTR_ERR(opp);
}
dev_pm_opp_put(opp);
- profile->freq_table[i] = freq;
+ devfreq->freq_table[i] = freq;
}
return 0;
@@ -246,7 +242,7 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq)
if (lev != prev_lev) {
devfreq->stats.trans_table[
- (prev_lev * devfreq->profile->max_state) + lev]++;
+ (prev_lev * devfreq->max_state) + lev]++;
devfreq->stats.total_trans++;
}
@@ -835,6 +831,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
if (err < 0)
goto err_dev;
mutex_lock(&devfreq->lock);
+ } else {
+ devfreq->freq_table = devfreq->profile->freq_table;
+ devfreq->max_state = devfreq->profile->max_state;
}
devfreq->scaling_min_freq = find_available_min_freq(devfreq);
@@ -870,8 +869,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->stats.trans_table = devm_kzalloc(&devfreq->dev,
array3_size(sizeof(unsigned int),
- devfreq->profile->max_state,
- devfreq->profile->max_state),
+ devfreq->max_state,
+ devfreq->max_state),
GFP_KERNEL);
if (!devfreq->stats.trans_table) {
mutex_unlock(&devfreq->lock);
@@ -880,7 +879,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
devfreq->stats.time_in_state = devm_kcalloc(&devfreq->dev,
- devfreq->profile->max_state,
+ devfreq->max_state,
sizeof(*devfreq->stats.time_in_state),
GFP_KERNEL);
if (!devfreq->stats.time_in_state) {
@@ -1666,9 +1665,9 @@ static ssize_t available_frequencies_show(struct device *d,
mutex_lock(&df->lock);
- for (i = 0; i < df->profile->max_state; i++)
+ for (i = 0; i < df->max_state; i++)
count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
- "%lu ", df->profile->freq_table[i]);
+ "%lu ", df->freq_table[i]);
mutex_unlock(&df->lock);
/* Truncate the trailing space */
@@ -1691,7 +1690,7 @@ static ssize_t trans_stat_show(struct device *dev,
if (!df->profile)
return -EINVAL;
- max_state = df->profile->max_state;
+ max_state = df->max_state;
if (max_state == 0)
return sprintf(buf, "Not Supported.\n");
@@ -1708,19 +1707,17 @@ static ssize_t trans_stat_show(struct device *dev,
len += sprintf(buf + len, " :");
for (i = 0; i < max_state; i++)
len += sprintf(buf + len, "%10lu",
- df->profile->freq_table[i]);
+ df->freq_table[i]);
len += sprintf(buf + len, " time(ms)\n");
for (i = 0; i < max_state; i++) {
- if (df->profile->freq_table[i]
- == df->previous_freq) {
+ if (df->freq_table[i] == df->previous_freq)
len += sprintf(buf + len, "*");
- } else {
+ else
len += sprintf(buf + len, " ");
- }
- len += sprintf(buf + len, "%10lu:",
- df->profile->freq_table[i]);
+
+ len += sprintf(buf + len, "%10lu:", df->freq_table[i]);
for (j = 0; j < max_state; j++)
len += sprintf(buf + len, "%10u",
df->stats.trans_table[(i * max_state) + j]);
@@ -1744,7 +1741,7 @@ static ssize_t trans_stat_store(struct device *dev,
if (!df->profile)
return -EINVAL;
- if (df->profile->max_state == 0)
+ if (df->max_state == 0)
return count;
err = kstrtoint(buf, 10, &value);
@@ -1752,11 +1749,11 @@ static ssize_t trans_stat_store(struct device *dev,
return -EINVAL;
mutex_lock(&df->lock);
- memset(df->stats.time_in_state, 0, (df->profile->max_state *
+ memset(df->stats.time_in_state, 0, (df->max_state *
sizeof(*df->stats.time_in_state)));
memset(df->stats.trans_table, 0, array3_size(sizeof(unsigned int),
- df->profile->max_state,
- df->profile->max_state));
+ df->max_state,
+ df->max_state));
df->stats.total_trans = 0;
df->stats.last_update = get_jiffies_64();
mutex_unlock(&df->lock);
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 69e06725d92b..406ef79c0c46 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -144,18 +144,18 @@ static int get_target_freq_with_devfreq(struct devfreq *devfreq,
goto out;
/* Use interpolation if required opps is not available */
- for (i = 0; i < parent_devfreq->profile->max_state; i++)
- if (parent_devfreq->profile->freq_table[i] == *freq)
+ for (i = 0; i < parent_devfreq->max_state; i++)
+ if (parent_devfreq->freq_table[i] == *freq)
break;
- if (i == parent_devfreq->profile->max_state)
+ if (i == parent_devfreq->max_state)
return -EINVAL;
- if (i < devfreq->profile->max_state) {
- child_freq = devfreq->profile->freq_table[i];
+ if (i < devfreq->max_state) {
+ child_freq = devfreq->freq_table[i];
} else {
- count = devfreq->profile->max_state;
- child_freq = devfreq->profile->freq_table[count - 1];
+ count = devfreq->max_state;
+ child_freq = devfreq->freq_table[count - 1];
}
out:
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index dc10bee75a72..34aab4dd336c 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -148,6 +148,8 @@ struct devfreq_stats {
* reevaluate operable frequencies. Devfreq users may use
* devfreq.nb to the corresponding register notifier call chain.
* @work: delayed work for load monitoring.
+ * @freq_table: current frequency table used by the devfreq driver.
+ * @max_state: count of entry present in the frequency table.
* @previous_freq: previously configured frequency value.
* @last_status: devfreq user device info, performance statistics
* @data: Private data of the governor. The devfreq framework does not
@@ -185,6 +187,9 @@ struct devfreq {
struct notifier_block nb;
struct delayed_work work;
+ unsigned long *freq_table;
+ unsigned int max_state;
+
unsigned long previous_freq;
struct devfreq_dev_status last_status;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b5d281f6c16dd432b618bdfd36ddba1a58d5b603 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth(a)gmail.com>
Date: Mon, 20 Jun 2022 00:03:51 +0200
Subject: [PATCH] PM / devfreq: Rework freq_table to be local to devfreq struct
On a devfreq PROBE_DEFER, the freq_table in the driver profile struct,
is never reset and may be leaved in an undefined state.
This comes from the fact that we store the freq_table in the driver
profile struct that is commonly defined as static and not reset on
PROBE_DEFER.
We currently skip the reinit of the freq_table if we found
it's already defined since a driver may declare his own freq_table.
This logic is flawed in the case devfreq core generate a freq_table, set
it in the profile struct and then PROBE_DEFER, freeing the freq_table.
In this case devfreq will found a NOT NULL freq_table that has been
freed, skip the freq_table generation and probe the driver based on the
wrong table.
To fix this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct and never modify the freq_table present
in the profile struct if it does provide it.
Fixes: 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 80a1235ef8fb..9602141bb8ec 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -123,7 +123,7 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
unsigned long *min_freq,
unsigned long *max_freq)
{
- unsigned long *freq_table = devfreq->profile->freq_table;
+ unsigned long *freq_table = devfreq->freq_table;
s32 qos_min_freq, qos_max_freq;
lockdep_assert_held(&devfreq->lock);
@@ -133,11 +133,11 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
* The devfreq drivers can initialize this in either ascending or
* descending order and devfreq core supports both.
*/
- if (freq_table[0] < freq_table[devfreq->profile->max_state - 1]) {
+ if (freq_table[0] < freq_table[devfreq->max_state - 1]) {
*min_freq = freq_table[0];
- *max_freq = freq_table[devfreq->profile->max_state - 1];
+ *max_freq = freq_table[devfreq->max_state - 1];
} else {
- *min_freq = freq_table[devfreq->profile->max_state - 1];
+ *min_freq = freq_table[devfreq->max_state - 1];
*max_freq = freq_table[0];
}
@@ -169,8 +169,8 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
{
int lev;
- for (lev = 0; lev < devfreq->profile->max_state; lev++)
- if (freq == devfreq->profile->freq_table[lev])
+ for (lev = 0; lev < devfreq->max_state; lev++)
+ if (freq == devfreq->freq_table[lev])
return lev;
return -EINVAL;
@@ -178,7 +178,6 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
static int set_freq_table(struct devfreq *devfreq)
{
- struct devfreq_dev_profile *profile = devfreq->profile;
struct dev_pm_opp *opp;
unsigned long freq;
int i, count;
@@ -188,25 +187,22 @@ static int set_freq_table(struct devfreq *devfreq)
if (count <= 0)
return -EINVAL;
- profile->max_state = count;
- profile->freq_table = devm_kcalloc(devfreq->dev.parent,
- profile->max_state,
- sizeof(*profile->freq_table),
- GFP_KERNEL);
- if (!profile->freq_table) {
- profile->max_state = 0;
+ devfreq->max_state = count;
+ devfreq->freq_table = devm_kcalloc(devfreq->dev.parent,
+ devfreq->max_state,
+ sizeof(*devfreq->freq_table),
+ GFP_KERNEL);
+ if (!devfreq->freq_table)
return -ENOMEM;
- }
- for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
+ for (i = 0, freq = 0; i < devfreq->max_state; i++, freq++) {
opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
if (IS_ERR(opp)) {
- devm_kfree(devfreq->dev.parent, profile->freq_table);
- profile->max_state = 0;
+ devm_kfree(devfreq->dev.parent, devfreq->freq_table);
return PTR_ERR(opp);
}
dev_pm_opp_put(opp);
- profile->freq_table[i] = freq;
+ devfreq->freq_table[i] = freq;
}
return 0;
@@ -246,7 +242,7 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq)
if (lev != prev_lev) {
devfreq->stats.trans_table[
- (prev_lev * devfreq->profile->max_state) + lev]++;
+ (prev_lev * devfreq->max_state) + lev]++;
devfreq->stats.total_trans++;
}
@@ -835,6 +831,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
if (err < 0)
goto err_dev;
mutex_lock(&devfreq->lock);
+ } else {
+ devfreq->freq_table = devfreq->profile->freq_table;
+ devfreq->max_state = devfreq->profile->max_state;
}
devfreq->scaling_min_freq = find_available_min_freq(devfreq);
@@ -870,8 +869,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->stats.trans_table = devm_kzalloc(&devfreq->dev,
array3_size(sizeof(unsigned int),
- devfreq->profile->max_state,
- devfreq->profile->max_state),
+ devfreq->max_state,
+ devfreq->max_state),
GFP_KERNEL);
if (!devfreq->stats.trans_table) {
mutex_unlock(&devfreq->lock);
@@ -880,7 +879,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
devfreq->stats.time_in_state = devm_kcalloc(&devfreq->dev,
- devfreq->profile->max_state,
+ devfreq->max_state,
sizeof(*devfreq->stats.time_in_state),
GFP_KERNEL);
if (!devfreq->stats.time_in_state) {
@@ -1666,9 +1665,9 @@ static ssize_t available_frequencies_show(struct device *d,
mutex_lock(&df->lock);
- for (i = 0; i < df->profile->max_state; i++)
+ for (i = 0; i < df->max_state; i++)
count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
- "%lu ", df->profile->freq_table[i]);
+ "%lu ", df->freq_table[i]);
mutex_unlock(&df->lock);
/* Truncate the trailing space */
@@ -1691,7 +1690,7 @@ static ssize_t trans_stat_show(struct device *dev,
if (!df->profile)
return -EINVAL;
- max_state = df->profile->max_state;
+ max_state = df->max_state;
if (max_state == 0)
return sprintf(buf, "Not Supported.\n");
@@ -1708,19 +1707,17 @@ static ssize_t trans_stat_show(struct device *dev,
len += sprintf(buf + len, " :");
for (i = 0; i < max_state; i++)
len += sprintf(buf + len, "%10lu",
- df->profile->freq_table[i]);
+ df->freq_table[i]);
len += sprintf(buf + len, " time(ms)\n");
for (i = 0; i < max_state; i++) {
- if (df->profile->freq_table[i]
- == df->previous_freq) {
+ if (df->freq_table[i] == df->previous_freq)
len += sprintf(buf + len, "*");
- } else {
+ else
len += sprintf(buf + len, " ");
- }
- len += sprintf(buf + len, "%10lu:",
- df->profile->freq_table[i]);
+
+ len += sprintf(buf + len, "%10lu:", df->freq_table[i]);
for (j = 0; j < max_state; j++)
len += sprintf(buf + len, "%10u",
df->stats.trans_table[(i * max_state) + j]);
@@ -1744,7 +1741,7 @@ static ssize_t trans_stat_store(struct device *dev,
if (!df->profile)
return -EINVAL;
- if (df->profile->max_state == 0)
+ if (df->max_state == 0)
return count;
err = kstrtoint(buf, 10, &value);
@@ -1752,11 +1749,11 @@ static ssize_t trans_stat_store(struct device *dev,
return -EINVAL;
mutex_lock(&df->lock);
- memset(df->stats.time_in_state, 0, (df->profile->max_state *
+ memset(df->stats.time_in_state, 0, (df->max_state *
sizeof(*df->stats.time_in_state)));
memset(df->stats.trans_table, 0, array3_size(sizeof(unsigned int),
- df->profile->max_state,
- df->profile->max_state));
+ df->max_state,
+ df->max_state));
df->stats.total_trans = 0;
df->stats.last_update = get_jiffies_64();
mutex_unlock(&df->lock);
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 69e06725d92b..406ef79c0c46 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -144,18 +144,18 @@ static int get_target_freq_with_devfreq(struct devfreq *devfreq,
goto out;
/* Use interpolation if required opps is not available */
- for (i = 0; i < parent_devfreq->profile->max_state; i++)
- if (parent_devfreq->profile->freq_table[i] == *freq)
+ for (i = 0; i < parent_devfreq->max_state; i++)
+ if (parent_devfreq->freq_table[i] == *freq)
break;
- if (i == parent_devfreq->profile->max_state)
+ if (i == parent_devfreq->max_state)
return -EINVAL;
- if (i < devfreq->profile->max_state) {
- child_freq = devfreq->profile->freq_table[i];
+ if (i < devfreq->max_state) {
+ child_freq = devfreq->freq_table[i];
} else {
- count = devfreq->profile->max_state;
- child_freq = devfreq->profile->freq_table[count - 1];
+ count = devfreq->max_state;
+ child_freq = devfreq->freq_table[count - 1];
}
out:
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index dc10bee75a72..34aab4dd336c 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -148,6 +148,8 @@ struct devfreq_stats {
* reevaluate operable frequencies. Devfreq users may use
* devfreq.nb to the corresponding register notifier call chain.
* @work: delayed work for load monitoring.
+ * @freq_table: current frequency table used by the devfreq driver.
+ * @max_state: count of entry present in the frequency table.
* @previous_freq: previously configured frequency value.
* @last_status: devfreq user device info, performance statistics
* @data: Private data of the governor. The devfreq framework does not
@@ -185,6 +187,9 @@ struct devfreq {
struct notifier_block nb;
struct delayed_work work;
+ unsigned long *freq_table;
+ unsigned int max_state;
+
unsigned long previous_freq;
struct devfreq_dev_status last_status;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b5d281f6c16dd432b618bdfd36ddba1a58d5b603 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth(a)gmail.com>
Date: Mon, 20 Jun 2022 00:03:51 +0200
Subject: [PATCH] PM / devfreq: Rework freq_table to be local to devfreq struct
On a devfreq PROBE_DEFER, the freq_table in the driver profile struct,
is never reset and may be leaved in an undefined state.
This comes from the fact that we store the freq_table in the driver
profile struct that is commonly defined as static and not reset on
PROBE_DEFER.
We currently skip the reinit of the freq_table if we found
it's already defined since a driver may declare his own freq_table.
This logic is flawed in the case devfreq core generate a freq_table, set
it in the profile struct and then PROBE_DEFER, freeing the freq_table.
In this case devfreq will found a NOT NULL freq_table that has been
freed, skip the freq_table generation and probe the driver based on the
wrong table.
To fix this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct and never modify the freq_table present
in the profile struct if it does provide it.
Fixes: 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com>
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 80a1235ef8fb..9602141bb8ec 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -123,7 +123,7 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
unsigned long *min_freq,
unsigned long *max_freq)
{
- unsigned long *freq_table = devfreq->profile->freq_table;
+ unsigned long *freq_table = devfreq->freq_table;
s32 qos_min_freq, qos_max_freq;
lockdep_assert_held(&devfreq->lock);
@@ -133,11 +133,11 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
* The devfreq drivers can initialize this in either ascending or
* descending order and devfreq core supports both.
*/
- if (freq_table[0] < freq_table[devfreq->profile->max_state - 1]) {
+ if (freq_table[0] < freq_table[devfreq->max_state - 1]) {
*min_freq = freq_table[0];
- *max_freq = freq_table[devfreq->profile->max_state - 1];
+ *max_freq = freq_table[devfreq->max_state - 1];
} else {
- *min_freq = freq_table[devfreq->profile->max_state - 1];
+ *min_freq = freq_table[devfreq->max_state - 1];
*max_freq = freq_table[0];
}
@@ -169,8 +169,8 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
{
int lev;
- for (lev = 0; lev < devfreq->profile->max_state; lev++)
- if (freq == devfreq->profile->freq_table[lev])
+ for (lev = 0; lev < devfreq->max_state; lev++)
+ if (freq == devfreq->freq_table[lev])
return lev;
return -EINVAL;
@@ -178,7 +178,6 @@ static int devfreq_get_freq_level(struct devfreq *devfreq, unsigned long freq)
static int set_freq_table(struct devfreq *devfreq)
{
- struct devfreq_dev_profile *profile = devfreq->profile;
struct dev_pm_opp *opp;
unsigned long freq;
int i, count;
@@ -188,25 +187,22 @@ static int set_freq_table(struct devfreq *devfreq)
if (count <= 0)
return -EINVAL;
- profile->max_state = count;
- profile->freq_table = devm_kcalloc(devfreq->dev.parent,
- profile->max_state,
- sizeof(*profile->freq_table),
- GFP_KERNEL);
- if (!profile->freq_table) {
- profile->max_state = 0;
+ devfreq->max_state = count;
+ devfreq->freq_table = devm_kcalloc(devfreq->dev.parent,
+ devfreq->max_state,
+ sizeof(*devfreq->freq_table),
+ GFP_KERNEL);
+ if (!devfreq->freq_table)
return -ENOMEM;
- }
- for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
+ for (i = 0, freq = 0; i < devfreq->max_state; i++, freq++) {
opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
if (IS_ERR(opp)) {
- devm_kfree(devfreq->dev.parent, profile->freq_table);
- profile->max_state = 0;
+ devm_kfree(devfreq->dev.parent, devfreq->freq_table);
return PTR_ERR(opp);
}
dev_pm_opp_put(opp);
- profile->freq_table[i] = freq;
+ devfreq->freq_table[i] = freq;
}
return 0;
@@ -246,7 +242,7 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq)
if (lev != prev_lev) {
devfreq->stats.trans_table[
- (prev_lev * devfreq->profile->max_state) + lev]++;
+ (prev_lev * devfreq->max_state) + lev]++;
devfreq->stats.total_trans++;
}
@@ -835,6 +831,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
if (err < 0)
goto err_dev;
mutex_lock(&devfreq->lock);
+ } else {
+ devfreq->freq_table = devfreq->profile->freq_table;
+ devfreq->max_state = devfreq->profile->max_state;
}
devfreq->scaling_min_freq = find_available_min_freq(devfreq);
@@ -870,8 +869,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->stats.trans_table = devm_kzalloc(&devfreq->dev,
array3_size(sizeof(unsigned int),
- devfreq->profile->max_state,
- devfreq->profile->max_state),
+ devfreq->max_state,
+ devfreq->max_state),
GFP_KERNEL);
if (!devfreq->stats.trans_table) {
mutex_unlock(&devfreq->lock);
@@ -880,7 +879,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
devfreq->stats.time_in_state = devm_kcalloc(&devfreq->dev,
- devfreq->profile->max_state,
+ devfreq->max_state,
sizeof(*devfreq->stats.time_in_state),
GFP_KERNEL);
if (!devfreq->stats.time_in_state) {
@@ -1666,9 +1665,9 @@ static ssize_t available_frequencies_show(struct device *d,
mutex_lock(&df->lock);
- for (i = 0; i < df->profile->max_state; i++)
+ for (i = 0; i < df->max_state; i++)
count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
- "%lu ", df->profile->freq_table[i]);
+ "%lu ", df->freq_table[i]);
mutex_unlock(&df->lock);
/* Truncate the trailing space */
@@ -1691,7 +1690,7 @@ static ssize_t trans_stat_show(struct device *dev,
if (!df->profile)
return -EINVAL;
- max_state = df->profile->max_state;
+ max_state = df->max_state;
if (max_state == 0)
return sprintf(buf, "Not Supported.\n");
@@ -1708,19 +1707,17 @@ static ssize_t trans_stat_show(struct device *dev,
len += sprintf(buf + len, " :");
for (i = 0; i < max_state; i++)
len += sprintf(buf + len, "%10lu",
- df->profile->freq_table[i]);
+ df->freq_table[i]);
len += sprintf(buf + len, " time(ms)\n");
for (i = 0; i < max_state; i++) {
- if (df->profile->freq_table[i]
- == df->previous_freq) {
+ if (df->freq_table[i] == df->previous_freq)
len += sprintf(buf + len, "*");
- } else {
+ else
len += sprintf(buf + len, " ");
- }
- len += sprintf(buf + len, "%10lu:",
- df->profile->freq_table[i]);
+
+ len += sprintf(buf + len, "%10lu:", df->freq_table[i]);
for (j = 0; j < max_state; j++)
len += sprintf(buf + len, "%10u",
df->stats.trans_table[(i * max_state) + j]);
@@ -1744,7 +1741,7 @@ static ssize_t trans_stat_store(struct device *dev,
if (!df->profile)
return -EINVAL;
- if (df->profile->max_state == 0)
+ if (df->max_state == 0)
return count;
err = kstrtoint(buf, 10, &value);
@@ -1752,11 +1749,11 @@ static ssize_t trans_stat_store(struct device *dev,
return -EINVAL;
mutex_lock(&df->lock);
- memset(df->stats.time_in_state, 0, (df->profile->max_state *
+ memset(df->stats.time_in_state, 0, (df->max_state *
sizeof(*df->stats.time_in_state)));
memset(df->stats.trans_table, 0, array3_size(sizeof(unsigned int),
- df->profile->max_state,
- df->profile->max_state));
+ df->max_state,
+ df->max_state));
df->stats.total_trans = 0;
df->stats.last_update = get_jiffies_64();
mutex_unlock(&df->lock);
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 69e06725d92b..406ef79c0c46 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -144,18 +144,18 @@ static int get_target_freq_with_devfreq(struct devfreq *devfreq,
goto out;
/* Use interpolation if required opps is not available */
- for (i = 0; i < parent_devfreq->profile->max_state; i++)
- if (parent_devfreq->profile->freq_table[i] == *freq)
+ for (i = 0; i < parent_devfreq->max_state; i++)
+ if (parent_devfreq->freq_table[i] == *freq)
break;
- if (i == parent_devfreq->profile->max_state)
+ if (i == parent_devfreq->max_state)
return -EINVAL;
- if (i < devfreq->profile->max_state) {
- child_freq = devfreq->profile->freq_table[i];
+ if (i < devfreq->max_state) {
+ child_freq = devfreq->freq_table[i];
} else {
- count = devfreq->profile->max_state;
- child_freq = devfreq->profile->freq_table[count - 1];
+ count = devfreq->max_state;
+ child_freq = devfreq->freq_table[count - 1];
}
out:
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index dc10bee75a72..34aab4dd336c 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -148,6 +148,8 @@ struct devfreq_stats {
* reevaluate operable frequencies. Devfreq users may use
* devfreq.nb to the corresponding register notifier call chain.
* @work: delayed work for load monitoring.
+ * @freq_table: current frequency table used by the devfreq driver.
+ * @max_state: count of entry present in the frequency table.
* @previous_freq: previously configured frequency value.
* @last_status: devfreq user device info, performance statistics
* @data: Private data of the governor. The devfreq framework does not
@@ -185,6 +187,9 @@ struct devfreq {
struct notifier_block nb;
struct delayed_work work;
+ unsigned long *freq_table;
+ unsigned int max_state;
+
unsigned long previous_freq;
struct devfreq_dev_status last_status;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1758bde2e4aa5ff188d53e7d9d388bbb7e12eebb Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 28 Jun 2022 12:15:08 +0200
Subject: [PATCH] net: phy: Don't trigger state machine while in suspend
Upon system sleep, mdio_bus_phy_suspend() stops the phy_state_machine(),
but subsequent interrupts may retrigger it:
They may have been left enabled to facilitate wakeup and are not
quiesced until the ->suspend_noirq() phase. Unwanted interrupts may
hence occur between mdio_bus_phy_suspend() and dpm_suspend_noirq(),
as well as between dpm_resume_noirq() and mdio_bus_phy_resume().
Retriggering the phy_state_machine() through an interrupt is not only
undesirable for the reason given in mdio_bus_phy_suspend() (freezing it
midway with phydev->lock held), but also because the PHY may be
inaccessible after it's suspended: Accesses to USB-attached PHYs are
blocked once usb_suspend_both() clears the can_submit flag and PHYs on
PCI network cards may become inaccessible upon suspend as well.
Amend phy_interrupt() to avoid triggering the state machine if the PHY
is suspended. Signal wakeup instead if the attached net_device or its
parent has been configured as a wakeup source. (Those conditions are
identical to mdio_bus_phy_may_suspend().) Postpone handling of the
interrupt until the PHY has resumed.
Before stopping the phy_state_machine() in mdio_bus_phy_suspend(),
wait for a concurrent phy_interrupt() to run to completion. That is
necessary because phy_interrupt() may have checked the PHY's suspend
status before the system sleep transition commenced and it may thus
retrigger the state machine after it was stopped.
Likewise, after re-enabling interrupt handling in mdio_bus_phy_resume(),
wait for a concurrent phy_interrupt() to complete to ensure that
interrupts which it postponed are properly rerun.
The issue was exposed by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward
PHY interrupts to PHY driver to avoid polling"), but has existed since
forever.
Fixes: 541cd3ee00a4 ("phylib: Fix deadlock on resume")
Link: https://lore.kernel.org/netdev/a5315a8a-32c2-962f-f696-de9a26d30091@samsung…
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable(a)vger.kernel.org # v2.6.33+
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/b7f386d04e9b5b0e2738f0125743e30676f309ef.16564108…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index ef62f357b76d..8d3ee3a6495b 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -31,6 +31,7 @@
#include <linux/io.h>
#include <linux/uaccess.h>
#include <linux/atomic.h>
+#include <linux/suspend.h>
#include <net/netlink.h>
#include <net/genetlink.h>
#include <net/sock.h>
@@ -976,6 +977,28 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
struct phy_driver *drv = phydev->drv;
irqreturn_t ret;
+ /* Wakeup interrupts may occur during a system sleep transition.
+ * Postpone handling until the PHY has resumed.
+ */
+ if (IS_ENABLED(CONFIG_PM_SLEEP) && phydev->irq_suspended) {
+ struct net_device *netdev = phydev->attached_dev;
+
+ if (netdev) {
+ struct device *parent = netdev->dev.parent;
+
+ if (netdev->wol_enabled)
+ pm_system_wakeup();
+ else if (device_may_wakeup(&netdev->dev))
+ pm_wakeup_dev_event(&netdev->dev, 0, true);
+ else if (parent && device_may_wakeup(parent))
+ pm_wakeup_dev_event(parent, 0, true);
+ }
+
+ phydev->irq_rerun = 1;
+ disable_irq_nosync(irq);
+ return IRQ_HANDLED;
+ }
+
mutex_lock(&phydev->lock);
ret = drv->handle_interrupt(phydev);
mutex_unlock(&phydev->lock);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 431a8719c635..46acddd865a7 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -278,6 +278,15 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
if (phydev->mac_managed_pm)
return 0;
+ /* Wakeup interrupts may occur during the system sleep transition when
+ * the PHY is inaccessible. Set flag to postpone handling until the PHY
+ * has resumed. Wait for concurrent interrupt handler to complete.
+ */
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 1;
+ synchronize_irq(phydev->irq);
+ }
+
/* We must stop the state machine manually, otherwise it stops out of
* control, possibly with the phydev->lock held. Upon resume, netdev
* may call phy routines that try to grab the same lock, and that may
@@ -315,6 +324,20 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
if (ret < 0)
return ret;
no_resume:
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 0;
+ synchronize_irq(phydev->irq);
+
+ /* Rerun interrupts which were postponed by phy_interrupt()
+ * because they occurred during the system sleep transition.
+ */
+ if (phydev->irq_rerun) {
+ phydev->irq_rerun = 0;
+ enable_irq(phydev->irq);
+ irq_wake_thread(phydev->irq, phydev);
+ }
+ }
+
if (phydev->attached_dev && phydev->adjust_link)
phy_start_machine(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 508f1149665b..b09f7d36cff2 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -572,6 +572,10 @@ struct macsec_ops;
* @mdix_ctrl: User setting of crossover
* @pma_extable: Cached value of PMA/PMD Extended Abilities Register
* @interrupts: Flag interrupts have been enabled
+ * @irq_suspended: Flag indicating PHY is suspended and therefore interrupt
+ * handling shall be postponed until PHY has resumed
+ * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
+ * requiring a rerun of the interrupt handler after resume
* @interface: enum phy_interface_t value
* @skb: Netlink message for cable diagnostics
* @nest: Netlink nest used for cable diagnostics
@@ -626,6 +630,8 @@ struct phy_device {
/* Interrupts are enabled */
unsigned interrupts:1;
+ unsigned irq_suspended:1;
+ unsigned irq_rerun:1;
enum phy_state state;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1758bde2e4aa5ff188d53e7d9d388bbb7e12eebb Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 28 Jun 2022 12:15:08 +0200
Subject: [PATCH] net: phy: Don't trigger state machine while in suspend
Upon system sleep, mdio_bus_phy_suspend() stops the phy_state_machine(),
but subsequent interrupts may retrigger it:
They may have been left enabled to facilitate wakeup and are not
quiesced until the ->suspend_noirq() phase. Unwanted interrupts may
hence occur between mdio_bus_phy_suspend() and dpm_suspend_noirq(),
as well as between dpm_resume_noirq() and mdio_bus_phy_resume().
Retriggering the phy_state_machine() through an interrupt is not only
undesirable for the reason given in mdio_bus_phy_suspend() (freezing it
midway with phydev->lock held), but also because the PHY may be
inaccessible after it's suspended: Accesses to USB-attached PHYs are
blocked once usb_suspend_both() clears the can_submit flag and PHYs on
PCI network cards may become inaccessible upon suspend as well.
Amend phy_interrupt() to avoid triggering the state machine if the PHY
is suspended. Signal wakeup instead if the attached net_device or its
parent has been configured as a wakeup source. (Those conditions are
identical to mdio_bus_phy_may_suspend().) Postpone handling of the
interrupt until the PHY has resumed.
Before stopping the phy_state_machine() in mdio_bus_phy_suspend(),
wait for a concurrent phy_interrupt() to run to completion. That is
necessary because phy_interrupt() may have checked the PHY's suspend
status before the system sleep transition commenced and it may thus
retrigger the state machine after it was stopped.
Likewise, after re-enabling interrupt handling in mdio_bus_phy_resume(),
wait for a concurrent phy_interrupt() to complete to ensure that
interrupts which it postponed are properly rerun.
The issue was exposed by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward
PHY interrupts to PHY driver to avoid polling"), but has existed since
forever.
Fixes: 541cd3ee00a4 ("phylib: Fix deadlock on resume")
Link: https://lore.kernel.org/netdev/a5315a8a-32c2-962f-f696-de9a26d30091@samsung…
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable(a)vger.kernel.org # v2.6.33+
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/b7f386d04e9b5b0e2738f0125743e30676f309ef.16564108…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index ef62f357b76d..8d3ee3a6495b 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -31,6 +31,7 @@
#include <linux/io.h>
#include <linux/uaccess.h>
#include <linux/atomic.h>
+#include <linux/suspend.h>
#include <net/netlink.h>
#include <net/genetlink.h>
#include <net/sock.h>
@@ -976,6 +977,28 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
struct phy_driver *drv = phydev->drv;
irqreturn_t ret;
+ /* Wakeup interrupts may occur during a system sleep transition.
+ * Postpone handling until the PHY has resumed.
+ */
+ if (IS_ENABLED(CONFIG_PM_SLEEP) && phydev->irq_suspended) {
+ struct net_device *netdev = phydev->attached_dev;
+
+ if (netdev) {
+ struct device *parent = netdev->dev.parent;
+
+ if (netdev->wol_enabled)
+ pm_system_wakeup();
+ else if (device_may_wakeup(&netdev->dev))
+ pm_wakeup_dev_event(&netdev->dev, 0, true);
+ else if (parent && device_may_wakeup(parent))
+ pm_wakeup_dev_event(parent, 0, true);
+ }
+
+ phydev->irq_rerun = 1;
+ disable_irq_nosync(irq);
+ return IRQ_HANDLED;
+ }
+
mutex_lock(&phydev->lock);
ret = drv->handle_interrupt(phydev);
mutex_unlock(&phydev->lock);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 431a8719c635..46acddd865a7 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -278,6 +278,15 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
if (phydev->mac_managed_pm)
return 0;
+ /* Wakeup interrupts may occur during the system sleep transition when
+ * the PHY is inaccessible. Set flag to postpone handling until the PHY
+ * has resumed. Wait for concurrent interrupt handler to complete.
+ */
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 1;
+ synchronize_irq(phydev->irq);
+ }
+
/* We must stop the state machine manually, otherwise it stops out of
* control, possibly with the phydev->lock held. Upon resume, netdev
* may call phy routines that try to grab the same lock, and that may
@@ -315,6 +324,20 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
if (ret < 0)
return ret;
no_resume:
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 0;
+ synchronize_irq(phydev->irq);
+
+ /* Rerun interrupts which were postponed by phy_interrupt()
+ * because they occurred during the system sleep transition.
+ */
+ if (phydev->irq_rerun) {
+ phydev->irq_rerun = 0;
+ enable_irq(phydev->irq);
+ irq_wake_thread(phydev->irq, phydev);
+ }
+ }
+
if (phydev->attached_dev && phydev->adjust_link)
phy_start_machine(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 508f1149665b..b09f7d36cff2 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -572,6 +572,10 @@ struct macsec_ops;
* @mdix_ctrl: User setting of crossover
* @pma_extable: Cached value of PMA/PMD Extended Abilities Register
* @interrupts: Flag interrupts have been enabled
+ * @irq_suspended: Flag indicating PHY is suspended and therefore interrupt
+ * handling shall be postponed until PHY has resumed
+ * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
+ * requiring a rerun of the interrupt handler after resume
* @interface: enum phy_interface_t value
* @skb: Netlink message for cable diagnostics
* @nest: Netlink nest used for cable diagnostics
@@ -626,6 +630,8 @@ struct phy_device {
/* Interrupts are enabled */
unsigned interrupts:1;
+ unsigned irq_suspended:1;
+ unsigned irq_rerun:1;
enum phy_state state;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1758bde2e4aa5ff188d53e7d9d388bbb7e12eebb Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 28 Jun 2022 12:15:08 +0200
Subject: [PATCH] net: phy: Don't trigger state machine while in suspend
Upon system sleep, mdio_bus_phy_suspend() stops the phy_state_machine(),
but subsequent interrupts may retrigger it:
They may have been left enabled to facilitate wakeup and are not
quiesced until the ->suspend_noirq() phase. Unwanted interrupts may
hence occur between mdio_bus_phy_suspend() and dpm_suspend_noirq(),
as well as between dpm_resume_noirq() and mdio_bus_phy_resume().
Retriggering the phy_state_machine() through an interrupt is not only
undesirable for the reason given in mdio_bus_phy_suspend() (freezing it
midway with phydev->lock held), but also because the PHY may be
inaccessible after it's suspended: Accesses to USB-attached PHYs are
blocked once usb_suspend_both() clears the can_submit flag and PHYs on
PCI network cards may become inaccessible upon suspend as well.
Amend phy_interrupt() to avoid triggering the state machine if the PHY
is suspended. Signal wakeup instead if the attached net_device or its
parent has been configured as a wakeup source. (Those conditions are
identical to mdio_bus_phy_may_suspend().) Postpone handling of the
interrupt until the PHY has resumed.
Before stopping the phy_state_machine() in mdio_bus_phy_suspend(),
wait for a concurrent phy_interrupt() to run to completion. That is
necessary because phy_interrupt() may have checked the PHY's suspend
status before the system sleep transition commenced and it may thus
retrigger the state machine after it was stopped.
Likewise, after re-enabling interrupt handling in mdio_bus_phy_resume(),
wait for a concurrent phy_interrupt() to complete to ensure that
interrupts which it postponed are properly rerun.
The issue was exposed by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward
PHY interrupts to PHY driver to avoid polling"), but has existed since
forever.
Fixes: 541cd3ee00a4 ("phylib: Fix deadlock on resume")
Link: https://lore.kernel.org/netdev/a5315a8a-32c2-962f-f696-de9a26d30091@samsung…
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable(a)vger.kernel.org # v2.6.33+
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/b7f386d04e9b5b0e2738f0125743e30676f309ef.16564108…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index ef62f357b76d..8d3ee3a6495b 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -31,6 +31,7 @@
#include <linux/io.h>
#include <linux/uaccess.h>
#include <linux/atomic.h>
+#include <linux/suspend.h>
#include <net/netlink.h>
#include <net/genetlink.h>
#include <net/sock.h>
@@ -976,6 +977,28 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
struct phy_driver *drv = phydev->drv;
irqreturn_t ret;
+ /* Wakeup interrupts may occur during a system sleep transition.
+ * Postpone handling until the PHY has resumed.
+ */
+ if (IS_ENABLED(CONFIG_PM_SLEEP) && phydev->irq_suspended) {
+ struct net_device *netdev = phydev->attached_dev;
+
+ if (netdev) {
+ struct device *parent = netdev->dev.parent;
+
+ if (netdev->wol_enabled)
+ pm_system_wakeup();
+ else if (device_may_wakeup(&netdev->dev))
+ pm_wakeup_dev_event(&netdev->dev, 0, true);
+ else if (parent && device_may_wakeup(parent))
+ pm_wakeup_dev_event(parent, 0, true);
+ }
+
+ phydev->irq_rerun = 1;
+ disable_irq_nosync(irq);
+ return IRQ_HANDLED;
+ }
+
mutex_lock(&phydev->lock);
ret = drv->handle_interrupt(phydev);
mutex_unlock(&phydev->lock);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 431a8719c635..46acddd865a7 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -278,6 +278,15 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
if (phydev->mac_managed_pm)
return 0;
+ /* Wakeup interrupts may occur during the system sleep transition when
+ * the PHY is inaccessible. Set flag to postpone handling until the PHY
+ * has resumed. Wait for concurrent interrupt handler to complete.
+ */
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 1;
+ synchronize_irq(phydev->irq);
+ }
+
/* We must stop the state machine manually, otherwise it stops out of
* control, possibly with the phydev->lock held. Upon resume, netdev
* may call phy routines that try to grab the same lock, and that may
@@ -315,6 +324,20 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
if (ret < 0)
return ret;
no_resume:
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 0;
+ synchronize_irq(phydev->irq);
+
+ /* Rerun interrupts which were postponed by phy_interrupt()
+ * because they occurred during the system sleep transition.
+ */
+ if (phydev->irq_rerun) {
+ phydev->irq_rerun = 0;
+ enable_irq(phydev->irq);
+ irq_wake_thread(phydev->irq, phydev);
+ }
+ }
+
if (phydev->attached_dev && phydev->adjust_link)
phy_start_machine(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 508f1149665b..b09f7d36cff2 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -572,6 +572,10 @@ struct macsec_ops;
* @mdix_ctrl: User setting of crossover
* @pma_extable: Cached value of PMA/PMD Extended Abilities Register
* @interrupts: Flag interrupts have been enabled
+ * @irq_suspended: Flag indicating PHY is suspended and therefore interrupt
+ * handling shall be postponed until PHY has resumed
+ * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
+ * requiring a rerun of the interrupt handler after resume
* @interface: enum phy_interface_t value
* @skb: Netlink message for cable diagnostics
* @nest: Netlink nest used for cable diagnostics
@@ -626,6 +630,8 @@ struct phy_device {
/* Interrupts are enabled */
unsigned interrupts:1;
+ unsigned irq_suspended:1;
+ unsigned irq_rerun:1;
enum phy_state state;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1758bde2e4aa5ff188d53e7d9d388bbb7e12eebb Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 28 Jun 2022 12:15:08 +0200
Subject: [PATCH] net: phy: Don't trigger state machine while in suspend
Upon system sleep, mdio_bus_phy_suspend() stops the phy_state_machine(),
but subsequent interrupts may retrigger it:
They may have been left enabled to facilitate wakeup and are not
quiesced until the ->suspend_noirq() phase. Unwanted interrupts may
hence occur between mdio_bus_phy_suspend() and dpm_suspend_noirq(),
as well as between dpm_resume_noirq() and mdio_bus_phy_resume().
Retriggering the phy_state_machine() through an interrupt is not only
undesirable for the reason given in mdio_bus_phy_suspend() (freezing it
midway with phydev->lock held), but also because the PHY may be
inaccessible after it's suspended: Accesses to USB-attached PHYs are
blocked once usb_suspend_both() clears the can_submit flag and PHYs on
PCI network cards may become inaccessible upon suspend as well.
Amend phy_interrupt() to avoid triggering the state machine if the PHY
is suspended. Signal wakeup instead if the attached net_device or its
parent has been configured as a wakeup source. (Those conditions are
identical to mdio_bus_phy_may_suspend().) Postpone handling of the
interrupt until the PHY has resumed.
Before stopping the phy_state_machine() in mdio_bus_phy_suspend(),
wait for a concurrent phy_interrupt() to run to completion. That is
necessary because phy_interrupt() may have checked the PHY's suspend
status before the system sleep transition commenced and it may thus
retrigger the state machine after it was stopped.
Likewise, after re-enabling interrupt handling in mdio_bus_phy_resume(),
wait for a concurrent phy_interrupt() to complete to ensure that
interrupts which it postponed are properly rerun.
The issue was exposed by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward
PHY interrupts to PHY driver to avoid polling"), but has existed since
forever.
Fixes: 541cd3ee00a4 ("phylib: Fix deadlock on resume")
Link: https://lore.kernel.org/netdev/a5315a8a-32c2-962f-f696-de9a26d30091@samsung…
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable(a)vger.kernel.org # v2.6.33+
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/b7f386d04e9b5b0e2738f0125743e30676f309ef.16564108…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index ef62f357b76d..8d3ee3a6495b 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -31,6 +31,7 @@
#include <linux/io.h>
#include <linux/uaccess.h>
#include <linux/atomic.h>
+#include <linux/suspend.h>
#include <net/netlink.h>
#include <net/genetlink.h>
#include <net/sock.h>
@@ -976,6 +977,28 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
struct phy_driver *drv = phydev->drv;
irqreturn_t ret;
+ /* Wakeup interrupts may occur during a system sleep transition.
+ * Postpone handling until the PHY has resumed.
+ */
+ if (IS_ENABLED(CONFIG_PM_SLEEP) && phydev->irq_suspended) {
+ struct net_device *netdev = phydev->attached_dev;
+
+ if (netdev) {
+ struct device *parent = netdev->dev.parent;
+
+ if (netdev->wol_enabled)
+ pm_system_wakeup();
+ else if (device_may_wakeup(&netdev->dev))
+ pm_wakeup_dev_event(&netdev->dev, 0, true);
+ else if (parent && device_may_wakeup(parent))
+ pm_wakeup_dev_event(parent, 0, true);
+ }
+
+ phydev->irq_rerun = 1;
+ disable_irq_nosync(irq);
+ return IRQ_HANDLED;
+ }
+
mutex_lock(&phydev->lock);
ret = drv->handle_interrupt(phydev);
mutex_unlock(&phydev->lock);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 431a8719c635..46acddd865a7 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -278,6 +278,15 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
if (phydev->mac_managed_pm)
return 0;
+ /* Wakeup interrupts may occur during the system sleep transition when
+ * the PHY is inaccessible. Set flag to postpone handling until the PHY
+ * has resumed. Wait for concurrent interrupt handler to complete.
+ */
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 1;
+ synchronize_irq(phydev->irq);
+ }
+
/* We must stop the state machine manually, otherwise it stops out of
* control, possibly with the phydev->lock held. Upon resume, netdev
* may call phy routines that try to grab the same lock, and that may
@@ -315,6 +324,20 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
if (ret < 0)
return ret;
no_resume:
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->irq_suspended = 0;
+ synchronize_irq(phydev->irq);
+
+ /* Rerun interrupts which were postponed by phy_interrupt()
+ * because they occurred during the system sleep transition.
+ */
+ if (phydev->irq_rerun) {
+ phydev->irq_rerun = 0;
+ enable_irq(phydev->irq);
+ irq_wake_thread(phydev->irq, phydev);
+ }
+ }
+
if (phydev->attached_dev && phydev->adjust_link)
phy_start_machine(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 508f1149665b..b09f7d36cff2 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -572,6 +572,10 @@ struct macsec_ops;
* @mdix_ctrl: User setting of crossover
* @pma_extable: Cached value of PMA/PMD Extended Abilities Register
* @interrupts: Flag interrupts have been enabled
+ * @irq_suspended: Flag indicating PHY is suspended and therefore interrupt
+ * handling shall be postponed until PHY has resumed
+ * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
+ * requiring a rerun of the interrupt handler after resume
* @interface: enum phy_interface_t value
* @skb: Netlink message for cable diagnostics
* @nest: Netlink nest used for cable diagnostics
@@ -626,6 +630,8 @@ struct phy_device {
/* Interrupts are enabled */
unsigned interrupts:1;
+ unsigned irq_suspended:1;
+ unsigned irq_rerun:1;
enum phy_state state;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3b0dc529f56b5f2328244130683210be98f16f7f Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Thu, 23 Jun 2022 14:00:15 +0200
Subject: [PATCH] ipv6: take care of disable_policy when restoring routes
When routes corresponding to addresses are restored by
fixup_permanent_addr(), the dst_nopolicy parameter was not set.
The typical use case is a user that configures an address on a down
interface and then put this interface up.
Let's take care of this flag in addrconf_f6i_alloc(), so that every callers
benefit ont it.
CC: stable(a)kernel.org
CC: David Forster <dforster(a)brocade.com>
Fixes: df789fe75206 ("ipv6: Provide ipv6 version of "disable_policy" sysctl")
Reported-by: Siwar Zitouni <siwar.zitouni(a)6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20220623120015.32640-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 1b1932502e9e..5864cbc30db6 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1109,10 +1109,6 @@ ipv6_add_addr(struct inet6_dev *idev, struct ifa6_config *cfg,
goto out;
}
- if (net->ipv6.devconf_all->disable_policy ||
- idev->cnf.disable_policy)
- f6i->dst_nopolicy = true;
-
neigh_parms_data_state_setall(idev->nd_parms);
ifa->addr = *cfg->pfx;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index d25dc83bac62..828355710c57 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -4569,8 +4569,15 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
}
f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);
- if (!IS_ERR(f6i))
+ if (!IS_ERR(f6i)) {
f6i->dst_nocount = true;
+
+ if (!anycast &&
+ (net->ipv6.devconf_all->disable_policy ||
+ idev->cnf.disable_policy))
+ f6i->dst_nopolicy = true;
+ }
+
return f6i;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3b0dc529f56b5f2328244130683210be98f16f7f Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Thu, 23 Jun 2022 14:00:15 +0200
Subject: [PATCH] ipv6: take care of disable_policy when restoring routes
When routes corresponding to addresses are restored by
fixup_permanent_addr(), the dst_nopolicy parameter was not set.
The typical use case is a user that configures an address on a down
interface and then put this interface up.
Let's take care of this flag in addrconf_f6i_alloc(), so that every callers
benefit ont it.
CC: stable(a)kernel.org
CC: David Forster <dforster(a)brocade.com>
Fixes: df789fe75206 ("ipv6: Provide ipv6 version of "disable_policy" sysctl")
Reported-by: Siwar Zitouni <siwar.zitouni(a)6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20220623120015.32640-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 1b1932502e9e..5864cbc30db6 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1109,10 +1109,6 @@ ipv6_add_addr(struct inet6_dev *idev, struct ifa6_config *cfg,
goto out;
}
- if (net->ipv6.devconf_all->disable_policy ||
- idev->cnf.disable_policy)
- f6i->dst_nopolicy = true;
-
neigh_parms_data_state_setall(idev->nd_parms);
ifa->addr = *cfg->pfx;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index d25dc83bac62..828355710c57 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -4569,8 +4569,15 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
}
f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);
- if (!IS_ERR(f6i))
+ if (!IS_ERR(f6i)) {
f6i->dst_nocount = true;
+
+ if (!anycast &&
+ (net->ipv6.devconf_all->disable_policy ||
+ idev->cnf.disable_policy))
+ f6i->dst_nopolicy = true;
+ }
+
return f6i;
}
Currently, when loading a kernel image via the kexec_file_load() system
call, arm64 can only use the .builtin_trusted_keys keyring to verify
a signature whereas x86 can use three more keyrings i.e.
.secondary_trusted_keys, .machine and .platform keyrings. For example,
one resulting problem is kexec'ing a kernel image would be rejected
with the error "Lockdown: kexec: kexec of unsigned images is restricted;
see man kernel_lockdown.7".
This patch set enables arm64 to make use of the same keyrings as x86 to
verify the signature kexec'ed kernel image.
Fixes: 732b7b93d849 ("arm64: kexec_file: add kernel signature verification support")
Cc: stable(a)vger.kernel.org # 34d5960af253: kexec: clean up arch_kexec_kernel_verify_sig
Cc: stable(a)vger.kernel.org # 83b7bb2d49ae: kexec, KEYS: make the code in bzImage64_verify_sig generic
Acked-by: Baoquan He <bhe(a)redhat.com>
Cc: kexec(a)lists.infradead.org
Cc: keyrings(a)vger.kernel.org
Cc: linux-security-module(a)vger.kernel.org
Co-developed-by: Michal Suchanek <msuchanek(a)suse.de>
Signed-off-by: Michal Suchanek <msuchanek(a)suse.de>
Acked-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Coiby Xu <coxu(a)redhat.com>
---
arch/arm64/kernel/kexec_image.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index 9ec34690e255..5ed6a585f21f 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -14,7 +14,6 @@
#include <linux/kexec.h>
#include <linux/pe.h>
#include <linux/string.h>
-#include <linux/verification.h>
#include <asm/byteorder.h>
#include <asm/cpufeature.h>
#include <asm/image.h>
@@ -130,18 +129,10 @@ static void *image_load(struct kimage *image,
return NULL;
}
-#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
-static int image_verify_sig(const char *kernel, unsigned long kernel_len)
-{
- return verify_pefile_signature(kernel, kernel_len, NULL,
- VERIFYING_KEXEC_PE_SIGNATURE);
-}
-#endif
-
const struct kexec_file_ops kexec_image_ops = {
.probe = image_probe,
.load = image_load,
#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
- .verify_sig = image_verify_sig,
+ .verify_sig = kexec_kernel_verify_pe_sig,
#endif
};
--
2.35.3
Currently there is no arch-specific implementation of
arch_kexec_kernel_verify_sig. Even if we want to add an implementation
for an architecture in the future, we can simply use "(struct
kexec_file_ops*)->verify_sig". So clean it up.
Note this patch is dependent by later patches so it should backported to
the stable tree as well.
Cc: stable(a)vger.kernel.org
Suggested-by: Eric W. Biederman <ebiederm(a)xmission.com>
Reviewed-by: Michal Suchanek <msuchanek(a)suse.de>
Acked-by: Baoquan He <bhe(a)redhat.com>
Signed-off-by: Coiby Xu <coxu(a)redhat.com>
---
include/linux/kexec.h | 4 ----
kernel/kexec_file.c | 34 +++++++++++++---------------------
2 files changed, 13 insertions(+), 25 deletions(-)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index ce6536f1d269..e3125fae1599 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -194,10 +194,6 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
unsigned long buf_len);
void *arch_kexec_kernel_image_load(struct kimage *image);
int arch_kimage_file_post_load_cleanup(struct kimage *image);
-#ifdef CONFIG_KEXEC_SIG
-int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
- unsigned long buf_len);
-#endif
int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
extern int kexec_add_buffer(struct kexec_buf *kbuf);
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 145321a5e798..c7cbadc754a1 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -89,25 +89,6 @@ int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
return kexec_image_post_load_cleanup_default(image);
}
-#ifdef CONFIG_KEXEC_SIG
-static int kexec_image_verify_sig_default(struct kimage *image, void *buf,
- unsigned long buf_len)
-{
- if (!image->fops || !image->fops->verify_sig) {
- pr_debug("kernel loader does not support signature verification.\n");
- return -EKEYREJECTED;
- }
-
- return image->fops->verify_sig(buf, buf_len);
-}
-
-int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
- unsigned long buf_len)
-{
- return kexec_image_verify_sig_default(image, buf, buf_len);
-}
-#endif
-
/*
* Free up memory used by kernel, initrd, and command line. This is temporary
* memory allocation which is not needed any more after these buffers have
@@ -150,13 +131,24 @@ void kimage_file_post_load_cleanup(struct kimage *image)
}
#ifdef CONFIG_KEXEC_SIG
+static int kexec_image_verify_sig(struct kimage *image, void *buf,
+ unsigned long buf_len)
+{
+ if (!image->fops || !image->fops->verify_sig) {
+ pr_debug("kernel loader does not support signature verification.\n");
+ return -EKEYREJECTED;
+ }
+
+ return image->fops->verify_sig(buf, buf_len);
+}
+
static int
kimage_validate_signature(struct kimage *image)
{
int ret;
- ret = arch_kexec_kernel_verify_sig(image, image->kernel_buf,
- image->kernel_buf_len);
+ ret = kexec_image_verify_sig(image, image->kernel_buf,
+ image->kernel_buf_len);
if (ret) {
if (IS_ENABLED(CONFIG_KEXEC_SIG_FORCE)) {
--
2.35.3
The patch titled
Subject: mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
has been added to the -mm mm-unstable branch. Its filename is
mm-hugetlb-separate-path-for-hwpoison-entry-in-copy_hugetlb_page_range.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Subject: mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
Date: Mon, 4 Jul 2022 10:33:05 +0900
Originally copy_hugetlb_page_range() handles migration entries and
hwpoisoned entries in similar manner. But recently the related code path
has more code for migration entries, and when
is_writable_migration_entry() was converted to
!is_readable_migration_entry(), hwpoison entries on source processes got
to be unexpectedly updated (which is legitimate for migration entries, but
not for hwpoison entries). This results in unexpected serious issues like
kernel panic when forking processes with hwpoison entries in pmd.
Separate the if branch into one for hwpoison entries and one for migration
entries.
Link: https://lkml.kernel.org/r/20220704013312.2415700-3-naoya.horiguchi@linux.dev
Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org> [5.18]
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Liu Shixin <liushixin2(a)huawei.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Yang Shi <shy828301(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-separate-path-for-hwpoison-entry-in-copy_hugetlb_page_range
+++ a/mm/hugetlb.c
@@ -4802,8 +4802,13 @@ again:
* sharing with another vma.
*/
;
- } else if (unlikely(is_hugetlb_entry_migration(entry) ||
- is_hugetlb_entry_hwpoisoned(entry))) {
+ } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
+ bool uffd_wp = huge_pte_uffd_wp(entry);
+
+ if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ entry = huge_pte_clear_uffd_wp(entry);
+ set_huge_pte_at(dst, addr, dst_pte, entry);
+ } else if (unlikely(is_hugetlb_entry_migration(entry))) {
swp_entry_t swp_entry = pte_to_swp_entry(entry);
bool uffd_wp = huge_pte_uffd_wp(entry);
_
Patches currently in -mm which might be from naoya.horiguchi(a)nec.com are
mm-hugetlb-check-gigantic_page_runtime_supported-in-return_unused_surplus_pages.patch
mm-hugetlb-separate-path-for-hwpoison-entry-in-copy_hugetlb_page_range.patch
mm-hugetlb-make-pud_huge-and-follow_huge_pud-aware-of-non-present-pud-entry.patch
mm-hwpoison-hugetlb-support-saving-mechanism-of-raw-error-pages.patch
mm-hwpoison-make-unpoison-aware-of-raw-error-info-in-hwpoisoned-hugepage.patch
mm-hwpoison-set-pg_hwpoison-for-busy-hugetlb-pages.patch
mm-hwpoison-make-__page_handle_poison-returns-int.patch
mm-hwpoison-skip-raw-hwpoison-page-in-freeing-1gb-hugepage.patch
mm-hwpoison-enable-memory-error-handling-on-1gb-hugepage.patch
The quilt patch titled
Subject: mm: sparsemem: fix missing higher order allocation splitting
has been removed from the -mm tree. Its filename was
mm-sparsemem-fix-missing-higher-order-allocation-splitting.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: sparsemem: fix missing higher order allocation splitting
Date: Mon, 20 Jun 2022 10:30:19 +0800
Higher order allocations for vmemmap pages from buddy allocator must be
able to be treated as indepdenent small pages as they can be freed
individually by the caller. There is no problem for higher order vmemmap
pages allocated at boot time since each individual small page will be
initialized at boot time. However, it will be an issue for memory hotplug
case since those higher order vmemmap pages are allocated from buddy
allocator without initializing each individual small page's refcount. The
system will panic in put_page_testzero() when CONFIG_DEBUG_VM is enabled
if the vmemmap page is freed.
Link: https://lkml.kernel.org/r/20220620023019.94257-1-songmuchun@bytedance.com
Fixes: d8d55f5616cf ("mm: sparsemem: use page table lock to protect kernel pmd operations")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Xiongchun Duan <duanxiongchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/sparse-vmemmap.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/mm/sparse-vmemmap.c~mm-sparsemem-fix-missing-higher-order-allocation-splitting
+++ a/mm/sparse-vmemmap.c
@@ -78,6 +78,14 @@ static int __split_vmemmap_huge_pmd(pmd_
spin_lock(&init_mm.page_table_lock);
if (likely(pmd_leaf(*pmd))) {
+ /*
+ * Higher order allocations from buddy allocator must be able to
+ * be treated as indepdenent small pages (as they can be freed
+ * individually).
+ */
+ if (!PageReserved(page))
+ split_page(page, get_order(PMD_SIZE));
+
/* Make pte visible before pmd. See comment in pmd_install(). */
smp_wmb();
pmd_populate_kernel(&init_mm, pmd, pgtable);
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-memory_hotplug-enumerate-all-supported-section-flags.patch
mm-memory_hotplug-enumerate-all-supported-section-flags-v5.patch
mm-memory_hotplug-make-hugetlb_optimize_vmemmap-compatible-with-memmap_on_memory.patch
mm-memory_hotplug-make-hugetlb_optimize_vmemmap-compatible-with-memmap_on_memory-v5.patch
mm-hugetlb-remove-minimum_order-variable.patch
mm-memcontrol-remove-dead-code-and-comments.patch
mm-rename-unlock_page_lruvec_irq-_irqrestore-to-lruvec_unlock_irq-_irqrestore.patch
mm-memcontrol-prepare-objcg-api-for-non-kmem-usage.patch
mm-memcontrol-make-lruvec-lock-safe-when-lru-pages-are-reparented.patch
mm-vmscan-rework-move_pages_to_lru.patch
mm-thp-make-split-queue-lock-safe-when-lru-pages-are-reparented.patch
mm-memcontrol-make-all-the-callers-of-foliopage_memcg-safe.patch
mm-memcontrol-introduce-memcg_reparent_ops.patch
mm-memcontrol-use-obj_cgroup-apis-to-charge-the-lru-pages.patch
mm-lru-add-vm_warn_on_once_folio-to-lru-maintenance-function.patch
mm-hugetlb_vmemmap-delete-hugetlb_optimize_vmemmap_enabled.patch
mm-hugetlb_vmemmap-optimize-vmemmap_optimize_mode-handling.patch
mm-hugetlb_vmemmap-introduce-the-name-hvo.patch
mm-hugetlb_vmemmap-move-vmemmap-code-related-to-hugetlb-to-hugetlb_vmemmapc.patch
mm-hugetlb_vmemmap-replace-early_param-with-core_param.patch
mm-hugetlb_vmemmap-improve-hugetlb_vmemmap-code-readability.patch
mm-hugetlb_vmemmap-move-code-comments-to-vmemmap_deduprst.patch
mm-hugetlb_vmemmap-use-ptrs_per_pte-instead-of-pmd_size-page_size.patch
The quilt patch titled
Subject: mm/damon: use set_huge_pte_at() to make huge pte old
has been removed from the -mm tree. Its filename was
mm-damon-use-set_huge_pte_at-to-make-huge-pte-old.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Subject: mm/damon: use set_huge_pte_at() to make huge pte old
Date: Mon, 20 Jun 2022 10:34:42 +0800
The huge_ptep_set_access_flags() can not make the huge pte old according
to the discussion [1], that means we will always mornitor the young state
of the hugetlb though we stopped accessing the hugetlb, as a result DAMON
will get inaccurate accessing statistics.
So changing to use set_huge_pte_at() to make the huge pte old to fix this
issue.
[1] https://lore.kernel.org/all/Yqy97gXI4Nqb7dYo@arm.com/
Link: https://lkml.kernel.org/r/1655692482-28797-1-git-send-email-baolin.wang@lin…
Fixes: 49f4203aae06 ("mm/damon: add access checking for hugetlb pages")
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Acked-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/vaddr.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/mm/damon/vaddr.c~mm-damon-use-set_huge_pte_at-to-make-huge-pte-old
+++ a/mm/damon/vaddr.c
@@ -336,8 +336,7 @@ static void damon_hugetlb_mkold(pte_t *p
if (pte_young(entry)) {
referenced = true;
entry = pte_mkold(entry);
- huge_ptep_set_access_flags(vma, addr, pte, entry,
- vma->vm_flags & VM_WRITE);
+ set_huge_pte_at(mm, addr, pte, entry);
}
#ifdef CONFIG_MMU_NOTIFIER
_
Patches currently in -mm which might be from baolin.wang(a)linux.alibaba.com are
mm-hugetlb-remove-unnecessary-huge_ptep_set_access_flags-in-hugetlb_mcopy_atomic_pte.patch
mm-rmap-simplify-the-hugetlb-handling-when-unmapping-or-migration.patch
arm64-hugetlb-implement-arm64-specific-hugetlb_mask_last_page.patch
arm64-hugetlb-implement-arm64-specific-hugetlb_mask_last_page-fix.patch
The quilt patch titled
Subject: mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages
has been removed from the -mm tree. Its filename was
mm-userfaultfd-fix-uffdio_continue-on-fallocated-shmem-pages.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Axel Rasmussen <axelrasmussen(a)google.com>
Subject: mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages
Date: Fri, 10 Jun 2022 10:38:12 -0700
When fallocate() is used on a shmem file, the pages we allocate can end up
with !PageUptodate.
Since UFFDIO_CONTINUE tries to find the existing page the user wants to
map with SGP_READ, we would fail to find such a page, since
shmem_getpage_gfp returns with a "NULL" pagep for SGP_READ if it discovers
!PageUptodate. As a result, UFFDIO_CONTINUE returns -EFAULT, as it would
do if the page wasn't found in the page cache at all.
This isn't the intended behavior. UFFDIO_CONTINUE is just trying to find
if a page exists, and doesn't care whether it still needs to be cleared or
not. So, instead of SGP_READ, pass in SGP_NOALLOC. This is the same,
except for one critical difference: in the !PageUptodate case, SGP_NOALLOC
will clear the page and then return it. With this change, UFFDIO_CONTINUE
works properly (succeeds) on a shmem file which has been fallocated, but
otherwise not modified.
Link: https://lkml.kernel.org/r/20220610173812.1768919-1-axelrasmussen@google.com
Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem")
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/userfaultfd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/userfaultfd.c~mm-userfaultfd-fix-uffdio_continue-on-fallocated-shmem-pages
+++ a/mm/userfaultfd.c
@@ -246,7 +246,10 @@ static int mcontinue_atomic_pte(struct m
struct page *page;
int ret;
- ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
+ ret = shmem_getpage(inode, pgoff, &page, SGP_NOALLOC);
+ /* Our caller expects us to return -EFAULT if we failed to find page. */
+ if (ret == -ENOENT)
+ ret = -EFAULT;
if (ret)
goto out;
if (!page) {
_
Patches currently in -mm which might be from axelrasmussen(a)google.com are
selftests-vm-add-hugetlb_shared-userfaultfd-test-to-run_vmtestssh.patch
userfaultfd-add-dev-userfaultfd-for-fine-grained-access-control.patch
userfaultfd-selftests-modify-selftest-to-use-dev-userfaultfd.patch
userfaultfd-update-documentation-to-describe-dev-userfaultfd.patch
userfaultfd-selftests-make-dev-userfaultfd-testing-configurable.patch
selftests-vm-add-dev-userfaultfd-test-cases-to-run_vmtestssh.patch
--
A mail was sent to you sometime last week with the expectation of
having a return mail from you but to my surprise you never bothered to replied.
Kindly reply for further explanations.
Respectfully yours,
Barrister Douglas Felix
Greetings,
I sent this mail praying it will find you in a good condition, since I
myself am in a very critical health condition in which I sleep every
night without knowing if I may be alive to see the next day. I am Mrs
Doris David, a widow suffering from a long time illness. I have
some funds I inherited from my late husband, the sum of
($11,000,000.00) my Doctor told me recently that I have serious
sickness which is a cancer problem. What disturbs me most is my stroke
sickness. Having known my condition, I decided to donate this fund to
a good person that will utilize it the way I am going to instruct
herein. I need a very honest God.
fearing a person who can claim this money and use it for Charity
works, for orphanages, widows and also build schools for less
privileges that will be named after my late husband if possible and to
promote the word of God and the effort that the house of God is
maintained. I do not want a situation where this money will be used in
an ungodly manner. That's why I' making this decision. I'm not afraid
of death so I know where I'm going. I accept this decision because I
do not have any child who will inherit this money after I die. Please
I want your sincere and urgent answer to know if you will be able to
execute this project, and I will give you more information on how
thunder will be transferred to your bank account. I am waiting for
your reply.
May God Bless you,
Mrs.Doris David,
Greetings,
I sent this mail praying it will find you in a good condition, since I
myself am in a very critical health condition in which I sleep every
night without knowing if I may be alive to see the next day. I am Mrs
Doris David, a widow suffering from a long time illness. I have
some funds I inherited from my late husband, the sum of
($11,000,000.00) my Doctor told me recently that I have serious
sickness which is a cancer problem. What disturbs me most is my stroke
sickness. Having known my condition, I decided to donate this fund to
a good person that will utilize it the way I am going to instruct
herein. I need a very honest God.
fearing a person who can claim this money and use it for Charity
works, for orphanages, widows and also build schools for less
privileges that will be named after my late husband if possible and to
promote the word of God and the effort that the house of God is
maintained. I do not want a situation where this money will be used in
an ungodly manner. That's why I' making this decision. I'm not afraid
of death so I know where I'm going. I accept this decision because I
do not have any child who will inherit this money after I die. Please
I want your sincere and urgent answer to know if you will be able to
execute this project, and I will give you more information on how
thunder will be transferred to your bank account. I am waiting for
your reply.
May God Bless you,
Mrs Doris David,
I'm announcing the release of the 5.18.9 kernel.
All users of the 5.18 kernel series must upgrade.
The updated 5.18.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.18.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
arch/powerpc/include/asm/ftrace.h | 4 +-
arch/powerpc/kernel/trace/ftrace.c | 15 +++++++--
arch/powerpc/mm/mem.c | 2 +
drivers/clocksource/Kconfig | 2 -
drivers/clocksource/timer-ixp4xx.c | 25 ----------------
drivers/md/bcache/btree.c | 1
drivers/md/bcache/writeback.c | 1
drivers/net/ethernet/huawei/hinic/hinic_devlink.c | 4 --
fs/io_uring.c | 34 +++++++++++-----------
include/linux/platform_data/timer-ixp4xx.h | 11 -------
kernel/time/tick-sched.c | 1
12 files changed, 40 insertions(+), 62 deletions(-)
Coly Li (1):
bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()
Greg Kroah-Hartman (1):
Linux 5.18.9
Kees Cook (1):
hinic: Replace memcpy() with direct assignment
Linus Walleij (1):
clocksource/drivers/ixp4xx: Drop boardfile probe path
Masahiro Yamada (1):
tick/nohz: unexport __init-annotated tick_nohz_full_setup()
Naveen N. Rao (1):
powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
Pavel Begunkov (1):
io_uring: fix not locked access to fixed buf table
I'm announcing the release of the 5.10.128 kernel.
All users of the 5.10 kernel series must upgrade.
The updated 5.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.10.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
MAINTAINERS | 3 ++-
Makefile | 2 +-
arch/powerpc/include/asm/ftrace.h | 4 +++-
arch/powerpc/kernel/trace/ftrace.c | 15 ++++++++++++---
arch/powerpc/mm/mem.c | 2 ++
drivers/gpu/drm/drm_crtc_helper_internal.h | 10 ----------
drivers/gpu/drm/drm_fb_helper.c | 21 ---------------------
drivers/gpu/drm/drm_kms_helper_common.c | 25 ++++++++++++-------------
drivers/md/bcache/btree.c | 1 +
drivers/md/bcache/writeback.c | 1 +
drivers/net/ethernet/mscc/ocelot.c | 8 ++++++--
fs/xfs/libxfs/xfs_attr.c | 13 +++++--------
fs/xfs/xfs_aops.c | 15 ++++++++++++---
fs/xfs/xfs_buf_item_recover.c | 2 +-
fs/xfs/xfs_extfree_item.c | 6 +++---
fs/xfs/xfs_super.c | 14 +++++++++++---
kernel/time/tick-sched.c | 1 -
17 files changed, 72 insertions(+), 71 deletions(-)
Amir Goldstein (1):
MAINTAINERS: add Amir as xfs maintainer for 5.10.y
Brian Foster (1):
xfs: punch out data fork delalloc blocks on COW writeback failure
Christoph Hellwig (1):
drm: remove drm_fb_helper_modinit
Coly Li (1):
bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()
Darrick J. Wong (1):
xfs: remove all COW fork extents when remounting readonly
Dave Chinner (1):
xfs: check sb_meta_uuid for dabuf buffer recovery
Greg Kroah-Hartman (1):
Linux 5.10.128
Masahiro Yamada (1):
tick/nohz: unexport __init-annotated tick_nohz_full_setup()
Naveen N. Rao (1):
powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
Rustam Kovhaev (1):
xfs: use kmem_cache_free() for kmem_cache objects
Vladimir Oltean (1):
net: mscc: ocelot: allow unregistered IP multicast flooding
Yang Xu (1):
xfs: Fix the free logic of state in xfs_attr_node_hasname
Two small documentation fixes for writecache functionality of device
mapper. The shortlog below should be self-explanatory.
Cc: Alasdair Kergon <agk(a)redhat.com>
Cc: Mike Snitzer <snitzer(a)kernel.org>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: Ross Zwisler <zwisler(a)kernel.org>
Cc: Colin Ian King <colin.king(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: dm-devel(a)redhat.com
Cc: stable(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Bagas Sanjaya (2):
Documentation: dm writecache: Add missing blank line before optional
parameters
Documentation: dm writecache: Render status list as list
Documentation/admin-guide/device-mapper/writecache.rst | 2 ++
1 file changed, 2 insertions(+)
base-commit: 089866061428ec9bf67221247c936792078c41a4
--
An old man doll... just what I always wanted! - Clara
The patch titled
Subject: mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
has been added to the -mm mm-unstable branch. Its filename is
mm-hugetlb-separate-path-for-hwpoison-entry-in-copy_hugetlb_page_range.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Subject: mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
Date: Thu, 30 Jun 2022 11:27:48 +0900
Originally copy_hugetlb_page_range() handles migration entries and
hwpoisoned entries in similar manner. But recently the related code path
has more code for migration entries, and when
is_writable_migration_entry() was converted to
!is_readable_migration_entry(), hwpoison entries on source processes got
to be unexpectedly updated (which is legitimate for migration entries, but
not for hwpoison entries). This results in unexpected serious issues like
kernel panic when forking processes with hwpoison entries in pmd.
Separate the if branch into one for hwpoison entries and one for migration
entries.
Link: https://lkml.kernel.org/r/20220630022755.3362349-3-naoya.horiguchi@linux.dev
Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org> # 5.18
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Liu Shixin <liushixin2(a)huawei.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Yang Shi <shy828301(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-separate-path-for-hwpoison-entry-in-copy_hugetlb_page_range
+++ a/mm/hugetlb.c
@@ -4803,8 +4803,13 @@ again:
* sharing with another vma.
*/
;
- } else if (unlikely(is_hugetlb_entry_migration(entry) ||
- is_hugetlb_entry_hwpoisoned(entry))) {
+ } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
+ bool uffd_wp = huge_pte_uffd_wp(entry);
+
+ if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ entry = huge_pte_clear_uffd_wp(entry);
+ set_huge_swap_pte_at(dst, addr, dst_pte, entry, sz);
+ } else if (unlikely(is_hugetlb_entry_migration(entry))) {
swp_entry_t swp_entry = pte_to_swp_entry(entry);
bool uffd_wp = huge_pte_uffd_wp(entry);
_
Patches currently in -mm which might be from naoya.horiguchi(a)nec.com are
mm-hugetlb-check-gigantic_page_runtime_supported-in-return_unused_surplus_pages.patch
mm-hugetlb-separate-path-for-hwpoison-entry-in-copy_hugetlb_page_range.patch
mm-hugetlb-make-pud_huge-and-follow_huge_pud-aware-of-non-present-pud-entry.patch
mm-hwpoison-hugetlb-support-saving-mechanism-of-raw-error-pages.patch
mm-hwpoison-make-unpoison-aware-of-raw-error-info-in-hwpoisoned-hugepage.patch
mm-hwpoison-set-pg_hwpoison-for-busy-hugetlb-pages.patch
mm-hwpoison-make-__page_handle_poison-returns-int.patch
mm-hwpoison-skip-raw-hwpoison-page-in-freeing-1gb-hugepage.patch
mm-hwpoison-enable-memory-error-handling-on-1gb-hugepage.patch
This is the start of the stable review cycle for the 5.10.128 release.
There are 12 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.128-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.128-rc1
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: mscc: ocelot: allow unregistered IP multicast flooding
Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
Dave Chinner <dchinner(a)redhat.com>
xfs: check sb_meta_uuid for dabuf buffer recovery
Darrick J. Wong <djwong(a)kernel.org>
xfs: remove all COW fork extents when remounting readonly
Yang Xu <xuyang2018.jy(a)fujitsu.com>
xfs: Fix the free logic of state in xfs_attr_node_hasname
Brian Foster <bfoster(a)redhat.com>
xfs: punch out data fork delalloc blocks on COW writeback failure
Rustam Kovhaev <rkovhaev(a)gmail.com>
xfs: use kmem_cache_free() for kmem_cache objects
Coly Li <colyli(a)suse.de>
bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup()
Masahiro Yamada <masahiroy(a)kernel.org>
tick/nohz: unexport __init-annotated tick_nohz_full_setup()
Christoph Hellwig <hch(a)lst.de>
drm: remove drm_fb_helper_modinit
Amir Goldstein <amir73il(a)gmail.com>
MAINTAINERS: add Amir as xfs maintainer for 5.10.y
-------------
Diffstat:
MAINTAINERS | 3 ++-
Makefile | 4 ++--
arch/powerpc/include/asm/ftrace.h | 4 +++-
arch/powerpc/kernel/trace/ftrace.c | 15 ++++++++++++---
arch/powerpc/mm/mem.c | 2 ++
drivers/clocksource/mmio.c | 2 +-
drivers/clocksource/timer-ixp4xx.c | 10 ++++------
drivers/gpu/drm/drm_crtc_helper_internal.h | 10 ----------
drivers/gpu/drm/drm_fb_helper.c | 21 ---------------------
drivers/gpu/drm/drm_kms_helper_common.c | 25 ++++++++++++-------------
drivers/md/bcache/btree.c | 1 +
drivers/md/bcache/writeback.c | 1 +
drivers/net/ethernet/mscc/ocelot.c | 8 ++++++--
fs/xfs/libxfs/xfs_attr.c | 13 +++++--------
fs/xfs/xfs_aops.c | 15 ++++++++++++---
fs/xfs/xfs_buf_item_recover.c | 2 +-
fs/xfs/xfs_extfree_item.c | 6 +++---
fs/xfs/xfs_super.c | 14 +++++++++++---
include/linux/platform_data/timer-ixp4xx.h | 5 ++---
kernel/time/tick-sched.c | 1 -
20 files changed, 80 insertions(+), 82 deletions(-)
This is the start of the stable review cycle for the 4.9.321 release.
There are 29 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.321-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.321-rc1
Liu Shixin <liushixin2(a)huawei.com>
swiotlb: skip swiotlb_bounce when orig_addr is zero
Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
kexec_file: drop weak attribute from arch_kexec_apply_relocations[_add]
Hsin-Yi Wang <hsinyi(a)chromium.org>
fdt: Update CRC check for rng-seed
Masahiro Yamada <masahiroy(a)kernel.org>
xen: unexport __init-annotated xen_xlate_map_ballooned_pages()
Christoph Hellwig <hch(a)lst.de>
drm: remove drm_fb_helper_modinit
Jason A. Donenfeld <Jason(a)zx2c4.com>
powerpc/pseries: wire up rng during setup_arch()
Masahiro Yamada <masahiroy(a)kernel.org>
modpost: fix section mismatch check for exported init/exit sections
Miaoqian Lin <linmq006(a)gmail.com>
ARM: cns3xxx: Fix refcount leak in cns3xxx_init
Miaoqian Lin <linmq006(a)gmail.com>
ARM: Fix refcount leak in axxia_boot_secondary
Miaoqian Lin <linmq006(a)gmail.com>
ARM: exynos: Fix refcount leak in exynos_map_pmu
Lucas Stach <l.stach(a)pengutronix.de>
ARM: dts: imx6qdl: correct PU regulator ramp delay
Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
powerpc: Enable execve syscall exit tracepoint
Liang He <windhl(a)126.com>
xtensa: Fix refcount leak bug in time.c
Liang He <windhl(a)126.com>
xtensa: xtfpga: Fix refcount leak bug in setup
Vincent Whitchurch <vincent.whitchurch(a)axis.com>
iio: trigger: sysfs: fix use-after-free on remove
Haibo Chen <haibo.chen(a)nxp.com>
iio: accel: mma8452: ignore the return value of reset operation
Dmitry Rokosov <DDRokosov(a)sberdevices.ru>
iio:accel:bma180: rearrange iio trigger get and register
Xu Yang <xu.yang_2(a)nxp.com>
usb: chipidea: udc: check request status before setting device address
Baruch Siach <baruch(a)tkos.co.il>
iio: adc: vf610: fix conversion mode sysfs node name
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
igb: Make DMA faster when CPU is active on the PCIe link
huhai <huhai(a)kylinos.cn>
MIPS: Remove repetitive increase irq_err_count
Julien Grall <jgrall(a)amazon.com>
x86/xen: Remove undefined behavior in setup_features()
Jay Vosburgh <jay.vosburgh(a)canonical.com>
bonding: ARP monitor spams NETDEV_NOTIFY_PEERS notifiers
Carlo Lobrano <c.lobrano(a)gmail.com>
USB: serial: option: add Telit LE910Cx 0x1250 composition
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: quiet urandom warning ratelimit suppression message
Nikos Tsironis <ntsironis(a)arrikto.com>
dm era: commit metadata in postsuspend after worker stops
Edward Wu <edwardwu(a)realtek.com>
ata: libata: add qc->flags in ata_qc_complete_template tracepoint
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: schedule mix_interrupt_randomness() less often
Jiri Slaby <jslaby(a)suse.cz>
vt: drop old FONT ioctls
-------------
Diffstat:
Documentation/ABI/testing/sysfs-bus-iio-vf610 | 2 +-
Makefile | 4 +-
arch/arm/boot/dts/imx6qdl.dtsi | 2 +-
arch/arm/mach-axxia/platsmp.c | 1 +
arch/arm/mach-cns3xxx/core.c | 2 +
arch/arm/mach-exynos/exynos.c | 1 +
arch/mips/vr41xx/common/icu.c | 2 -
arch/powerpc/kernel/process.c | 2 +-
arch/powerpc/platforms/pseries/pseries.h | 2 +
arch/powerpc/platforms/pseries/rng.c | 11 +-
arch/powerpc/platforms/pseries/setup.c | 1 +
arch/x86/include/asm/kexec.h | 7 ++
arch/xtensa/kernel/time.c | 1 +
arch/xtensa/platforms/xtfpga/setup.c | 1 +
drivers/char/random.c | 4 +-
drivers/gpio/gpio-vr41xx.c | 2 -
drivers/gpu/drm/drm_crtc_helper_internal.h | 10 --
drivers/gpu/drm/drm_fb_helper.c | 21 ----
drivers/gpu/drm/drm_kms_helper_common.c | 25 +++--
drivers/iio/accel/bma180.c | 3 +-
drivers/iio/accel/mma8452.c | 10 +-
drivers/iio/trigger/iio-trig-sysfs.c | 1 +
drivers/md/dm-era-target.c | 8 +-
drivers/net/bonding/bond_main.c | 4 +-
drivers/net/ethernet/intel/igb/igb_main.c | 12 +--
drivers/of/fdt.c | 8 +-
drivers/tty/vt/vt.c | 34 +-----
drivers/tty/vt/vt_ioctl.c | 149 --------------------------
drivers/usb/chipidea/udc.c | 3 +
drivers/usb/serial/option.c | 1 +
drivers/xen/features.c | 2 +-
drivers/xen/xlate_mmu.c | 1 -
include/linux/kd.h | 7 --
include/linux/kexec.h | 26 ++++-
include/linux/ratelimit.h | 12 ++-
include/trace/events/libata.h | 1 +
kernel/kexec_file.c | 18 ----
lib/swiotlb.c | 3 +-
scripts/mod/modpost.c | 2 +-
39 files changed, 111 insertions(+), 295 deletions(-)
An interrupt for a channel might be pending even after struct
dma_device::device_terminate_all has been called. In that case the
recently introduced warning message "restart cyclic channel..." triggers
and the channel will be restarted. This is not desired as the channel
has just been stopped. Only restart the channel when we still have a
descriptor set for it (which will be set to NULL in
sdma_terminate_all()).
Fixes: 5b215c28b9235 ("dmaengine: imx-sdma: restart cyclic channel if needed")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sascha Hauer <s.hauer(a)pengutronix.de>
---
drivers/dma/imx-sdma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index 8535018ee7a2e..5356cce41bffc 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -891,7 +891,7 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
* SDMA stops cyclic channel when DMA request triggers a channel and no SDMA
* owned buffer is available (i.e. BD_DONE was set too late).
*/
- if (!is_sdma_channel_enabled(sdmac->sdma, sdmac->channel)) {
+ if (sdmac->desc && !is_sdma_channel_enabled(sdmac->sdma, sdmac->channel)) {
dev_warn(sdmac->sdma->dev, "restart cyclic channel %d\n", sdmac->channel);
sdma_enable_channel(sdmac->sdma, sdmac->channel);
}
--
2.30.2
This is the start of the stable review cycle for the 5.15.52 release.
There are 28 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.52-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.52-rc1
Pavel Begunkov <asml.silence(a)gmail.com>
io_uring: fix not locked access to fixed buf table
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: mscc: ocelot: allow unregistered IP multicast flooding to CPU
Ping-Ke Shih <pkshih(a)realtek.com>
rtw88: rtw8821c: enable rfe 6 devices
Guo-Feng Fan <vincent_fann(a)realtek.com>
rtw88: 8821c: support RFE type4 wifi NIC
Christian Brauner <brauner(a)kernel.org>
fs: account for group membership
Christian Brauner <brauner(a)kernel.org>
fs: fix acl translation
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: support mapped mounts of mapped filesystems
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: add i_user_ns() helper
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: port higher-level mapping helpers
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: remove unused low-level mapping helpers
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: use low-level mapping helpers
Christian Brauner <christian.brauner(a)ubuntu.com>
docs: update mapping documentation
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: account for filesystem mappings
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: tweak fsuidgid_has_mapping()
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: move mapping helpers
Christian Brauner <christian.brauner(a)ubuntu.com>
fs: add is_idmapped_mnt() helper
Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
Darrick J. Wong <djwong(a)kernel.org>
xfs: only bother with sync_filesystem during readonly remount
Darrick J. Wong <djwong(a)kernel.org>
xfs: prevent UAF in xfs_log_item_in_current_chkpt
Dave Chinner <dchinner(a)redhat.com>
xfs: check sb_meta_uuid for dabuf buffer recovery
Darrick J. Wong <djwong(a)kernel.org>
xfs: remove all COW fork extents when remounting readonly
Yang Xu <xuyang2018.jy(a)fujitsu.com>
xfs: Fix the free logic of state in xfs_attr_node_hasname
Brian Foster <bfoster(a)redhat.com>
xfs: punch out data fork delalloc blocks on COW writeback failure
Rustam Kovhaev <rkovhaev(a)gmail.com>
xfs: use kmem_cache_free() for kmem_cache objects
Coly Li <colyli(a)suse.de>
bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
x86, kvm: use proper ASM macros for kvm_vcpu_is_preempted
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
clocksource/drivers/ixp4xx: remove __init from ixp4xx_timer_setup()
Masahiro Yamada <masahiroy(a)kernel.org>
tick/nohz: unexport __init-annotated tick_nohz_full_setup()
-------------
Diffstat:
Documentation/filesystems/idmappings.rst | 72 --------
Makefile | 4 +-
arch/powerpc/include/asm/ftrace.h | 4 +-
arch/powerpc/kernel/trace/ftrace.c | 15 +-
arch/powerpc/mm/mem.c | 2 +
arch/x86/kernel/kvm.c | 2 +-
drivers/clocksource/mmio.c | 2 +-
drivers/clocksource/timer-ixp4xx.c | 10 +-
drivers/md/bcache/btree.c | 1 +
drivers/md/bcache/writeback.c | 1 +
drivers/net/ethernet/mscc/ocelot.c | 8 +-
drivers/net/wireless/realtek/rtw88/rtw8821c.c | 14 +-
fs/attr.c | 26 ++-
fs/cachefiles/bind.c | 2 +-
fs/ecryptfs/main.c | 2 +-
fs/io_uring.c | 28 +--
fs/ksmbd/smbacl.c | 19 +--
fs/ksmbd/smbacl.h | 5 +-
fs/namespace.c | 53 ++++--
fs/nfsd/export.c | 2 +-
fs/open.c | 8 +-
fs/overlayfs/super.c | 2 +-
fs/posix_acl.c | 27 ++-
fs/proc_namespace.c | 2 +-
fs/xattr.c | 6 +-
fs/xfs/libxfs/xfs_attr.c | 17 +-
fs/xfs/xfs_aops.c | 15 +-
fs/xfs/xfs_buf_item_recover.c | 2 +-
fs/xfs/xfs_extfree_item.c | 6 +-
fs/xfs/xfs_inode.c | 8 +-
fs/xfs/xfs_linux.h | 1 +
fs/xfs/xfs_log_cil.c | 6 +-
fs/xfs/xfs_super.c | 21 ++-
fs/xfs/xfs_symlink.c | 4 +-
include/linux/fs.h | 141 +++++-----------
include/linux/mnt_idmapping.h | 234 ++++++++++++++++++++++++++
include/linux/platform_data/timer-ixp4xx.h | 5 +-
include/linux/posix_acl_xattr.h | 4 +
kernel/time/tick-sched.c | 1 -
security/commoncap.c | 15 +-
40 files changed, 498 insertions(+), 299 deletions(-)
--
Hello, my name is Christopher Wright, Audit Accounting Officer of
Groningen Bank, Groningen, The Netherlands. I got your information
when I was searching for an oversea partner among other names, I ask
for your pardon if my approach is offensive as I never mean't to
invade your privacy through this means, and also i believe this is the
best and secured means I can pass my message across to you in a clear
terms. I have sent you this proposal before now; I do hope this will
get to you in good health.
As the Audit Accounting Officer of the bank, I have access to lots of
documents because I handle some of the bank's sensitive files. In the
course of the last year 2021 business report, I discovered that my
branch in which I am the Audit Accounting Officer made €5.500.000.
(Five Million Five Hundred Thousand Euro) from some past government
contractors in which my head office is not aware of and will never be
aware of. I have placed this funds on what we call an escrow call
account with no beneficiary.
As an officer of this bank I cannot be directly connected to this
money, since i am still working with the bank. so my aim of contacting
you is to assist me receive this money in your bank account and get
50% of the total funds as commission. There are practically no risks
involved, it will be a bank-to-bank transfer, and all I need from you
is to stand claim as the Original depositor of these funds who made
the deposit with my branch so that my head office can order the
transfer to your designated bank account.
When the fund has been transferred into your bank account, I will come
to your country to share the fund. The fund will be shared 50% for me
and 50% for you.
I await your response.
Yours Faithfully,
Christopher Wright
Audit Accounting Officer
Groningen Bank
christopher(a)groningenbank.com
This is the start of the stable review cycle for the 5.15.37 release.
There are 33 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 01 May 2022 10:40:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.37-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.37-rc1
Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
selftests/bpf: Add test for reg2btf_ids out of bounds access
Linus Torvalds <torvalds(a)linux-foundation.org>
mm: gup: make fault_in_safe_writeable() use fixup_user_fault()
Filipe Manana <fdmanana(a)suse.com>
btrfs: fallback to blocking mode when doing async dio over multiple extents
Filipe Manana <fdmanana(a)suse.com>
btrfs: fix deadlock due to page faults during direct IO reads and writes
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Fix mmap + page fault deadlocks for direct I/O
Andreas Gruenbacher <agruenba(a)redhat.com>
iov_iter: Introduce nofault flag to disable page faults
Andreas Gruenbacher <agruenba(a)redhat.com>
gup: Introduce FOLL_NOFAULT flag to disable page faults
Andreas Gruenbacher <agruenba(a)redhat.com>
iomap: Add done_before argument to iomap_dio_rw
Andreas Gruenbacher <agruenba(a)redhat.com>
iomap: Support partial direct I/O on user copy failures
Andreas Gruenbacher <agruenba(a)redhat.com>
iomap: Fix iomap_dio_rw return value for user copies
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Fix mmap + page fault deadlocks for buffered I/O
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Eliminate ip->i_gh
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Move the inode glock locking to gfs2_file_buffered_write
Bob Peterson <rpeterso(a)redhat.com>
gfs2: Introduce flag for glock holder auto-demotion
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Clean up function may_grant
Andreas Gruenbacher <agruenba(a)redhat.com>
gfs2: Add wrapper for iomap_file_buffered_write
Andreas Gruenbacher <agruenba(a)redhat.com>
iov_iter: Introduce fault_in_iov_iter_writeable
Andreas Gruenbacher <agruenba(a)redhat.com>
iov_iter: Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable
Andreas Gruenbacher <agruenba(a)redhat.com>
gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable}
Muchun Song <songmuchun(a)bytedance.com>
mm: kfence: fix objcgs vector allocation
Dinh Nguyen <dinguyen(a)kernel.org>
ARM: dts: socfpga: change qspi to "intel,socfpga-qspi"
Dinh Nguyen <dinguyen(a)kernel.org>
spi: cadence-quadspi: fix write completion support
Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
bpf: Fix crash due to out of bounds access into reg2btf_ids.
Hao Luo <haoluo(a)google.com>
bpf/selftests: Test PTR_TO_RDONLY_MEM
Hao Luo <haoluo(a)google.com>
bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.
Hao Luo <haoluo(a)google.com>
bpf: Make per_cpu_ptr return rdonly PTR_TO_MEM.
Hao Luo <haoluo(a)google.com>
bpf: Convert PTR_TO_MEM_OR_NULL to composable types.
Hao Luo <haoluo(a)google.com>
bpf: Introduce MEM_RDONLY flag
Hao Luo <haoluo(a)google.com>
bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL
Hao Luo <haoluo(a)google.com>
bpf: Replace RET_XXX_OR_NULL with RET_XXX | PTR_MAYBE_NULL
Hao Luo <haoluo(a)google.com>
bpf: Replace ARG_XXX_OR_NULL with ARG_XXX | PTR_MAYBE_NULL
Hao Luo <haoluo(a)google.com>
bpf: Introduce composable reg, ret and arg types.
Willy Tarreau <w(a)1wt.eu>
floppy: disable FDRAWCMD by default
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/socfpga.dtsi | 2 +-
arch/arm/boot/dts/socfpga_arria10.dtsi | 2 +-
arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi | 2 +-
arch/arm64/boot/dts/intel/socfpga_agilex.dtsi | 2 +-
arch/powerpc/kernel/kvm.c | 3 +-
arch/powerpc/kernel/signal_32.c | 4 +-
arch/powerpc/kernel/signal_64.c | 2 +-
arch/x86/kernel/fpu/signal.c | 7 +-
drivers/block/Kconfig | 16 +
drivers/block/floppy.c | 43 +-
drivers/gpu/drm/armada/armada_gem.c | 7 +-
drivers/spi/spi-cadence-quadspi.c | 24 +-
fs/btrfs/file.c | 142 +++++-
fs/btrfs/inode.c | 28 ++
fs/btrfs/ioctl.c | 5 +-
fs/erofs/data.c | 2 +-
fs/ext4/file.c | 5 +-
fs/f2fs/file.c | 2 +-
fs/fuse/file.c | 2 +-
fs/gfs2/bmap.c | 60 +--
fs/gfs2/file.c | 252 ++++++++++-
fs/gfs2/glock.c | 330 ++++++++++----
fs/gfs2/glock.h | 20 +
fs/gfs2/incore.h | 4 +-
fs/iomap/buffered-io.c | 2 +-
fs/iomap/direct-io.c | 29 +-
fs/ntfs/file.c | 2 +-
fs/ntfs3/file.c | 2 +-
fs/xfs/xfs_file.c | 6 +-
fs/zonefs/super.c | 4 +-
include/linux/bpf.h | 101 ++++-
include/linux/bpf_verifier.h | 18 +
include/linux/iomap.h | 11 +-
include/linux/mm.h | 3 +-
include/linux/pagemap.h | 58 +--
include/linux/uio.h | 4 +-
kernel/bpf/btf.c | 16 +-
kernel/bpf/cgroup.c | 2 +-
kernel/bpf/helpers.c | 12 +-
kernel/bpf/map_iter.c | 4 +-
kernel/bpf/ringbuf.c | 2 +-
kernel/bpf/syscall.c | 2 +-
kernel/bpf/verifier.c | 488 ++++++++++-----------
kernel/trace/bpf_trace.c | 22 +-
lib/iov_iter.c | 98 ++++-
mm/filemap.c | 4 +-
mm/gup.c | 120 ++++-
mm/kfence/core.c | 11 +-
mm/kfence/kfence.h | 3 +
net/core/bpf_sk_storage.c | 2 +-
net/core/filter.c | 64 +--
net/core/sock_map.c | 2 +-
tools/testing/selftests/bpf/prog_tests/ksyms_btf.c | 14 +
.../bpf/progs/test_ksyms_btf_write_check.c | 29 ++
tools/testing/selftests/bpf/verifier/calls.c | 19 +
56 files changed, 1472 insertions(+), 652 deletions(-)
Greetings dear
This letter might be a surprise to you, But I believe that you will
be honest to fulfill my final wish. I bring peace and love to you. It
is by the grace of god, I had no choice than to do what is lawful and
right in the sight of God for eternal life and in the sight of man for
witness of god's mercy and glory upon my life. My dear, I sent this
mail praying it will find you in a good condition, since I myself am
in a very critical health condition in which I sleep every night
without knowing if I may be alive to see the next day. I am Mrs.Hannah
Vandrad, a widow suffering from a long time illness. I have some funds
I inherited from my late husband, the sum of ($11,000,000.00,)
my Doctor told me recently that I have serious
sickness which is a cancer problem. What disturbs me most is my stroke
sickness. Having known my condition, I decided to donate this fund to
a good person that will utilize it the way I am going to instruct
herein. I need a very honest and God fearing person who can claim this
money and use it for Charity works, for orphanages and gives justice
and help to the poor, needy and widows says The Lord." Jeremiah
22:15-16.“ and also build schools for less privilege that will be
named after my late husband if possible and to promote the word of god
and the effort that the house of god is maintained.
I do not want a situation where this money will be used in an ungodly
manner. That's why I'm taking this decision. I'm not afraid of death,
so I know where I'm going. I accept this decision because I do not
have any child who will inherit this money after I die. Please I want
your sincere and urgent answer to know if you will be able to execute
this project, and I will give you more information on how the fund
will be transferred to your bank account. May the grace, peace, love
and the truth in the Word of god be with you and all those that you
love and care for.
I am waiting for your reply.
May God Bless you,
Mrs.Hannah Vandrad.
This is the start of the stable review cycle for the 4.14.286 release.
There are 35 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 02 Jul 2022 13:32:22 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.286-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.286-rc1
Liu Shixin <liushixin2(a)huawei.com>
swiotlb: skip swiotlb_bounce when orig_addr is zero
Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
kexec_file: drop weak attribute from arch_kexec_apply_relocations[_add]
Hsin-Yi Wang <hsinyi(a)chromium.org>
fdt: Update CRC check for rng-seed
Masahiro Yamada <masahiroy(a)kernel.org>
xen: unexport __init-annotated xen_xlate_map_ballooned_pages()
Christoph Hellwig <hch(a)lst.de>
drm: remove drm_fb_helper_modinit
Jason A. Donenfeld <Jason(a)zx2c4.com>
powerpc/pseries: wire up rng during setup_arch()
Masahiro Yamada <masahiroy(a)kernel.org>
modpost: fix section mismatch check for exported init/exit sections
Miaoqian Lin <linmq006(a)gmail.com>
ARM: cns3xxx: Fix refcount leak in cns3xxx_init
Miaoqian Lin <linmq006(a)gmail.com>
ARM: Fix refcount leak in axxia_boot_secondary
Miaoqian Lin <linmq006(a)gmail.com>
ARM: exynos: Fix refcount leak in exynos_map_pmu
Lucas Stach <l.stach(a)pengutronix.de>
ARM: dts: imx6qdl: correct PU regulator ramp delay
Jason A. Donenfeld <Jason(a)zx2c4.com>
powerpc/powernv: wire up rng during setup_arch
Andrew Donnellan <ajd(a)linux.ibm.com>
powerpc/rtas: Allow ibm,platform-dump RTAS call with null buffer address
Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
powerpc: Enable execve syscall exit tracepoint
Liang He <windhl(a)126.com>
xtensa: Fix refcount leak bug in time.c
Liang He <windhl(a)126.com>
xtensa: xtfpga: Fix refcount leak bug in setup
Hans de Goede <hdegoede(a)redhat.com>
iio: adc: axp288: Override TS pin bias current for some models
Vincent Whitchurch <vincent.whitchurch(a)axis.com>
iio: trigger: sysfs: fix use-after-free on remove
Zheyu Ma <zheyuma97(a)gmail.com>
iio: gyro: mpu3050: Fix the error handling in mpu3050_power_up()
Haibo Chen <haibo.chen(a)nxp.com>
iio: accel: mma8452: ignore the return value of reset operation
Dmitry Rokosov <DDRokosov(a)sberdevices.ru>
iio:accel:bma180: rearrange iio trigger get and register
Xu Yang <xu.yang_2(a)nxp.com>
usb: chipidea: udc: check request status before setting device address
Baruch Siach <baruch(a)tkos.co.il>
iio: adc: vf610: fix conversion mode sysfs node name
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
igb: Make DMA faster when CPU is active on the PCIe link
huhai <huhai(a)kylinos.cn>
MIPS: Remove repetitive increase irq_err_count
Julien Grall <jgrall(a)amazon.com>
x86/xen: Remove undefined behavior in setup_features()
Jay Vosburgh <jay.vosburgh(a)canonical.com>
bonding: ARP monitor spams NETDEV_NOTIFY_PEERS notifiers
Macpaul Lin <macpaul.lin(a)mediatek.com>
USB: serial: option: add Quectel RM500K module support
Yonglin Tan <yonglin.tan(a)outlook.com>
USB: serial: option: add Quectel EM05-G modem
Carlo Lobrano <c.lobrano(a)gmail.com>
USB: serial: option: add Telit LE910Cx 0x1250 composition
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: quiet urandom warning ratelimit suppression message
Nikos Tsironis <ntsironis(a)arrikto.com>
dm era: commit metadata in postsuspend after worker stops
Edward Wu <edwardwu(a)realtek.com>
ata: libata: add qc->flags in ata_qc_complete_template tracepoint
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: schedule mix_interrupt_randomness() less often
Jiri Slaby <jslaby(a)suse.cz>
vt: drop old FONT ioctls
-------------
Diffstat:
Documentation/ABI/testing/sysfs-bus-iio-vf610 | 2 +-
Makefile | 4 +-
arch/arm/boot/dts/imx6qdl.dtsi | 2 +-
arch/arm/mach-axxia/platsmp.c | 1 +
arch/arm/mach-cns3xxx/core.c | 2 +
arch/arm/mach-exynos/exynos.c | 1 +
arch/mips/vr41xx/common/icu.c | 2 -
arch/powerpc/kernel/process.c | 2 +-
arch/powerpc/kernel/rtas.c | 11 +-
arch/powerpc/platforms/powernv/powernv.h | 2 +
arch/powerpc/platforms/powernv/rng.c | 52 ++++++---
arch/powerpc/platforms/powernv/setup.c | 2 +
arch/powerpc/platforms/pseries/pseries.h | 2 +
arch/powerpc/platforms/pseries/rng.c | 11 +-
arch/powerpc/platforms/pseries/setup.c | 1 +
arch/x86/include/asm/kexec.h | 6 ++
arch/xtensa/kernel/time.c | 1 +
arch/xtensa/platforms/xtfpga/setup.c | 1 +
drivers/char/random.c | 4 +-
drivers/gpio/gpio-vr41xx.c | 2 -
drivers/gpu/drm/drm_crtc_helper_internal.h | 10 --
drivers/gpu/drm/drm_fb_helper.c | 21 ----
drivers/gpu/drm/drm_kms_helper_common.c | 25 +++--
drivers/iio/accel/bma180.c | 3 +-
drivers/iio/accel/mma8452.c | 10 +-
drivers/iio/adc/axp288_adc.c | 8 ++
drivers/iio/gyro/mpu3050-core.c | 1 +
drivers/iio/trigger/iio-trig-sysfs.c | 1 +
drivers/md/dm-era-target.c | 8 +-
drivers/net/bonding/bond_main.c | 4 +-
drivers/net/ethernet/intel/igb/igb_main.c | 12 +--
drivers/of/fdt.c | 8 +-
drivers/tty/vt/vt.c | 34 +-----
drivers/tty/vt/vt_ioctl.c | 149 --------------------------
drivers/usb/chipidea/udc.c | 3 +
drivers/usb/serial/option.c | 6 ++
drivers/xen/features.c | 2 +-
drivers/xen/xlate_mmu.c | 1 -
include/linux/kd.h | 8 --
include/linux/kexec.h | 26 ++++-
include/linux/ratelimit.h | 12 ++-
include/trace/events/libata.h | 1 +
kernel/kexec_file.c | 18 ----
lib/swiotlb.c | 3 +-
scripts/mod/modpost.c | 2 +-
45 files changed, 174 insertions(+), 313 deletions(-)
commit dbe97cff7dd9f0f75c524afdd55ad46be3d15295 upstream
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.
Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected. Additionally, a failed call will not prevent further calls
from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.
It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
drivers/xen/gntdev-common.h | 7 ++
drivers/xen/gntdev.c | 142 +++++++++++++++++++++++++-----------
2 files changed, 106 insertions(+), 43 deletions(-)
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 20d7d059dadb..40ef379c28ab 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -16,6 +16,7 @@
#include <linux/mmu_notifier.h>
#include <linux/types.h>
#include <xen/interface/event_channel.h>
+#include <xen/grant_table.h>
struct gntdev_dmabuf_priv;
@@ -56,6 +57,7 @@ struct gntdev_grant_map {
struct gnttab_unmap_grant_ref *unmap_ops;
struct gnttab_map_grant_ref *kmap_ops;
struct gnttab_unmap_grant_ref *kunmap_ops;
+ bool *being_removed;
struct page **pages;
unsigned long pages_vm_start;
@@ -73,6 +75,11 @@ struct gntdev_grant_map {
/* Needed to avoid allocation in gnttab_dma_free_pages(). */
xen_pfn_t *frames;
#endif
+
+ /* Number of live grants */
+ atomic_t live_grants;
+ /* Needed to avoid allocation in __unmap_grant_pages */
+ struct gntab_unmap_queue_data unmap_data;
};
struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 54778aadf618..a631a453eb57 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -35,6 +35,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/refcount.h>
+#include <linux/workqueue.h>
#include <xen/xen.h>
#include <xen/grant_table.h>
@@ -60,10 +61,11 @@ module_param(limit, uint, 0644);
MODULE_PARM_DESC(limit,
"Maximum number of grants that may be mapped by one mapping request");
+/* True in PV mode, false otherwise */
static int use_ptemod;
-static int unmap_grant_pages(struct gntdev_grant_map *map,
- int offset, int pages);
+static void unmap_grant_pages(struct gntdev_grant_map *map,
+ int offset, int pages);
static struct miscdevice gntdev_miscdev;
@@ -120,6 +122,7 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
kvfree(map->unmap_ops);
kvfree(map->kmap_ops);
kvfree(map->kunmap_ops);
+ kvfree(map->being_removed);
kfree(map);
}
@@ -140,12 +143,13 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
add->kunmap_ops = kvcalloc(count,
sizeof(add->kunmap_ops[0]), GFP_KERNEL);
add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+ add->being_removed =
+ kvcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
if (NULL == add->grants ||
- NULL == add->map_ops ||
- NULL == add->unmap_ops ||
NULL == add->kmap_ops ||
NULL == add->kunmap_ops ||
- NULL == add->pages)
+ NULL == add->pages ||
+ NULL == add->being_removed)
goto err;
#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
@@ -240,9 +244,36 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
if (!refcount_dec_and_test(&map->users))
return;
- if (map->pages && !use_ptemod)
+ if (map->pages && !use_ptemod) {
+ /*
+ * Increment the reference count. This ensures that the
+ * subsequent call to unmap_grant_pages() will not wind up
+ * re-entering itself. It *can* wind up calling
+ * gntdev_put_map() recursively, but such calls will be with a
+ * reference count greater than 1, so they will return before
+ * this code is reached. The recursion depth is thus limited to
+ * 1. Do NOT use refcount_inc() here, as it will detect that
+ * the reference count is zero and WARN().
+ */
+ refcount_set(&map->users, 1);
+
+ /*
+ * Unmap the grants. This may or may not be asynchronous, so it
+ * is possible that the reference count is 1 on return, but it
+ * could also be greater than 1.
+ */
unmap_grant_pages(map, 0, map->count);
+ /* Check if the memory now needs to be freed */
+ if (!refcount_dec_and_test(&map->users))
+ return;
+
+ /*
+ * All pages have been returned to the hypervisor, so free the
+ * map.
+ */
+ }
+
if (map->notify.flags & UNMAP_NOTIFY_SEND_EVENT) {
notify_remote_via_evtchn(map->notify.event);
evtchn_put(map->notify.event);
@@ -288,6 +319,7 @@ static int set_grant_ptes_as_special(pte_t *pte, unsigned long addr, void *data)
int gntdev_map_grant_pages(struct gntdev_grant_map *map)
{
+ size_t alloced = 0;
int i, err = 0;
if (!use_ptemod) {
@@ -336,87 +368,109 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
map->pages, map->count);
for (i = 0; i < map->count; i++) {
- if (map->map_ops[i].status == GNTST_okay)
+ if (map->map_ops[i].status == GNTST_okay) {
map->unmap_ops[i].handle = map->map_ops[i].handle;
- else if (!err)
+ if (!use_ptemod)
+ alloced++;
+ } else if (!err)
err = -EINVAL;
if (map->flags & GNTMAP_device_map)
map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
if (use_ptemod) {
- if (map->kmap_ops[i].status == GNTST_okay)
+ if (map->kmap_ops[i].status == GNTST_okay) {
+ if (map->map_ops[i].status == GNTST_okay)
+ alloced++;
map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
- else if (!err)
+ } else if (!err)
err = -EINVAL;
}
}
+ atomic_add(alloced, &map->live_grants);
return err;
}
-static int __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void __unmap_grant_pages_done(int result,
+ struct gntab_unmap_queue_data *data)
{
- int i, err = 0;
- struct gntab_unmap_queue_data unmap_data;
+ unsigned int i;
+ struct gntdev_grant_map *map = data->data;
+ unsigned int offset = data->unmap_ops - map->unmap_ops;
+ for (i = 0; i < data->count; i++) {
+ WARN_ON(map->unmap_ops[offset+i].status);
+ pr_debug("unmap handle=%d st=%d\n",
+ map->unmap_ops[offset+i].handle,
+ map->unmap_ops[offset+i].status);
+ map->unmap_ops[offset+i].handle = -1;
+ }
+ /*
+ * Decrease the live-grant counter. This must happen after the loop to
+ * prevent premature reuse of the grants by gnttab_mmap().
+ */
+ atomic_sub(data->count, &map->live_grants);
+
+ /* Release reference taken by __unmap_grant_pages */
+ gntdev_put_map(NULL, map);
+}
+
+static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
+{
if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
int pgno = (map->notify.addr >> PAGE_SHIFT);
+
if (pgno >= offset && pgno < offset + pages) {
/* No need for kmap, pages are in lowmem */
uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+
tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
}
}
- unmap_data.unmap_ops = map->unmap_ops + offset;
- unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
- unmap_data.pages = map->pages + offset;
- unmap_data.count = pages;
+ map->unmap_data.unmap_ops = map->unmap_ops + offset;
+ map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+ map->unmap_data.pages = map->pages + offset;
+ map->unmap_data.count = pages;
+ map->unmap_data.done = __unmap_grant_pages_done;
+ map->unmap_data.data = map;
+ refcount_inc(&map->users); /* to keep map alive during async call below */
- err = gnttab_unmap_refs_sync(&unmap_data);
- if (err)
- return err;
-
- for (i = 0; i < pages; i++) {
- if (map->unmap_ops[offset+i].status)
- err = -EINVAL;
- pr_debug("unmap handle=%d st=%d\n",
- map->unmap_ops[offset+i].handle,
- map->unmap_ops[offset+i].status);
- map->unmap_ops[offset+i].handle = -1;
- }
- return err;
+ gnttab_unmap_refs_async(&map->unmap_data);
}
-static int unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
{
- int range, err = 0;
+ int range;
+
+ if (atomic_read(&map->live_grants) == 0)
+ return; /* Nothing to do */
pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
/* It is possible the requested range will have a "hole" where we
* already unmapped some of the grants. Only unmap valid ranges.
*/
- while (pages && !err) {
- while (pages && map->unmap_ops[offset].handle == -1) {
+ while (pages) {
+ while (pages && map->being_removed[offset]) {
offset++;
pages--;
}
range = 0;
while (range < pages) {
- if (map->unmap_ops[offset+range].handle == -1)
+ if (map->being_removed[offset + range])
break;
+ map->being_removed[offset + range] = true;
range++;
}
- err = __unmap_grant_pages(map, offset, range);
+ if (range)
+ __unmap_grant_pages(map, offset, range);
offset += range;
pages -= range;
}
-
- return err;
}
/* ------------------------------------------------------------------ */
@@ -468,7 +522,6 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
struct gntdev_grant_map *map =
container_of(mn, struct gntdev_grant_map, notifier);
unsigned long mstart, mend;
- int err;
if (!mmu_notifier_range_blockable(range))
return false;
@@ -489,10 +542,9 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
map->index, map->count,
map->vma->vm_start, map->vma->vm_end,
range->start, range->end, mstart, mend);
- err = unmap_grant_pages(map,
+ unmap_grant_pages(map,
(mstart - map->vma->vm_start) >> PAGE_SHIFT,
(mend - mstart) >> PAGE_SHIFT);
- WARN_ON(err);
return true;
}
@@ -980,6 +1032,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
goto unlock_out;
if (use_ptemod && map->vma)
goto unlock_out;
+ if (atomic_read(&map->live_grants)) {
+ err = -EAGAIN;
+ goto unlock_out;
+ }
refcount_inc(&map->users);
vma->vm_ops = &gntdev_vmops;
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
Update MAINTAINERS for xfs in an effort to help direct bots/questions
about xfs in 5.15.y.
Note: 5.10.y and 5.4.y will have different updates to their
respective MAINTAINERS files for this effort.
Suggested-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik(a)gmail.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 393706e85ba2..a60d7e0466af 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -20579,6 +20579,7 @@ F: drivers/xen/*swiotlb*
XFS FILESYSTEM
C: irc://irc.oftc.net/xfs
+M: Leah Rumancik <leah.rumancik(a)gmail.com>
M: Darrick J. Wong <djwong(a)kernel.org>
M: linux-xfs(a)vger.kernel.org
L: linux-xfs(a)vger.kernel.org
--
2.37.0.rc0.161.g10f37bed90-goog
Hello friend.
How are you, I'm Sgt.Kayla Manthey a US army serving in Iraq/Syria, i
want to make friends with you, please write back direct to my email
address (sgt.kayla03(a)gmail.com) to enable me to tell you more about my
self. I expect to receive your response direct to my email address
because i have something very important and private to share
with you.
Thanks
Sgt Kayla Manthey
From: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri(a)xilinx.com>
[ Upstream commit 21b511ddee09a78909035ec47a6a594349fe3296 ]
As part of unprepare_transfer_hardware, SPI controller will be disabled
which will indirectly deassert the CS line. This will create a problem
in some of the devices where message will be transferred with
cs_change flag set(CS should not be deasserted).
As per SPI controller implementation, if SPI controller is disabled then
all output enables are inactive and all pins are set to input mode which
means CS will go to default state high(deassert). This leads to an issue
when core explicitly ask not to deassert the CS (cs_change = 1). This
patch fix the above issue by checking the Slave select status bits from
configuration register before disabling the SPI.
Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri(a)xilinx.com>
Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra(a)xilinx.com>
Link: https://lore.kernel.org/r/20220606062525.18447-1-amit.kumar-mahapatra@xilin…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/spi/spi-cadence.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-cadence.c b/drivers/spi/spi-cadence.c
index e383c6368915..6d294a1fa5e5 100644
--- a/drivers/spi/spi-cadence.c
+++ b/drivers/spi/spi-cadence.c
@@ -72,6 +72,7 @@
#define CDNS_SPI_BAUD_DIV_SHIFT 3 /* Baud rate divisor shift in CR */
#define CDNS_SPI_SS_SHIFT 10 /* Slave Select field shift in CR */
#define CDNS_SPI_SS0 0x1 /* Slave Select zero */
+#define CDNS_SPI_NOSS 0x3C /* No Slave select */
/*
* SPI Interrupt Registers bit Masks
@@ -444,15 +445,20 @@ static int cdns_prepare_transfer_hardware(struct spi_master *master)
* @master: Pointer to the spi_master structure which provides
* information about the controller.
*
- * This function disables the SPI master controller.
+ * This function disables the SPI master controller when no slave selected.
*
* Return: 0 always
*/
static int cdns_unprepare_transfer_hardware(struct spi_master *master)
{
struct cdns_spi *xspi = spi_master_get_devdata(master);
+ u32 ctrl_reg;
- cdns_spi_write(xspi, CDNS_SPI_ER, CDNS_SPI_ER_DISABLE);
+ /* Disable the SPI if slave is deselected */
+ ctrl_reg = cdns_spi_read(xspi, CDNS_SPI_CR);
+ ctrl_reg = (ctrl_reg & CDNS_SPI_CR_SSCTRL) >> CDNS_SPI_SS_SHIFT;
+ if (ctrl_reg == CDNS_SPI_NOSS)
+ cdns_spi_write(xspi, CDNS_SPI_ER, CDNS_SPI_ER_DISABLE);
return 0;
}
--
2.35.1