The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bb193e3db8b86a63f26889c99e14fd30c9ebd72a Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:31 -0600
Subject: [PATCH] net: axienet: fix for TX busy handling
Network driver documentation indicates we should be avoiding returning
NETDEV_TX_BUSY from ndo_start_xmit in normal cases, since it requires
the packets to be requeued. Instead the queue should be stopped after
a packet is added to the TX ring when there may not be enough room for an
additional one. Also, when TX ring entries are completed, we should only
wake the queue if we know there is room for another full maximally
fragmented packet.
Print a warning if there is insufficient space at the start of start_xmit,
since this should no longer happen.
Combined with increasing the default TX ring size (in a subsequent
patch), this appears to recover the TX performance lost by previous changes
to actually manage the TX ring state properly.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 8dc9e92e05d2..b4f42ee9b75d 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -660,6 +660,32 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
return i;
}
+/**
+ * axienet_check_tx_bd_space - Checks if a BD/group of BDs are currently busy
+ * @lp: Pointer to the axienet_local structure
+ * @num_frag: The number of BDs to check for
+ *
+ * Return: 0, on success
+ * NETDEV_TX_BUSY, if any of the descriptors are not free
+ *
+ * This function is invoked before BDs are allocated and transmission starts.
+ * This function returns 0 if a BD or group of BDs can be allocated for
+ * transmission. If the BD or any of the BDs are not free the function
+ * returns a busy status. This is invoked from axienet_start_xmit.
+ */
+static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
+ int num_frag)
+{
+ struct axidma_bd *cur_p;
+
+ /* Ensure we see all descriptor updates from device or TX IRQ path */
+ rmb();
+ cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
+ if (cur_p->cntrl)
+ return NETDEV_TX_BUSY;
+ return 0;
+}
+
/**
* axienet_start_xmit_done - Invoked once a transmit is completed by the
* Axi DMA Tx channel.
@@ -689,33 +715,8 @@ static void axienet_start_xmit_done(struct net_device *ndev)
/* Matches barrier in axienet_start_xmit */
smp_mb();
- netif_wake_queue(ndev);
-}
-
-/**
- * axienet_check_tx_bd_space - Checks if a BD/group of BDs are currently busy
- * @lp: Pointer to the axienet_local structure
- * @num_frag: The number of BDs to check for
- *
- * Return: 0, on success
- * NETDEV_TX_BUSY, if any of the descriptors are not free
- *
- * This function is invoked before BDs are allocated and transmission starts.
- * This function returns 0 if a BD or group of BDs can be allocated for
- * transmission. If the BD or any of the BDs are not free the function
- * returns a busy status. This is invoked from axienet_start_xmit.
- */
-static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
- int num_frag)
-{
- struct axidma_bd *cur_p;
-
- /* Ensure we see all descriptor updates from device or TX IRQ path */
- rmb();
- cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
- if (cur_p->cntrl)
- return NETDEV_TX_BUSY;
- return 0;
+ if (!axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1))
+ netif_wake_queue(ndev);
}
/**
@@ -748,19 +749,14 @@ axienet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
if (axienet_check_tx_bd_space(lp, num_frag + 1)) {
- if (netif_queue_stopped(ndev))
- return NETDEV_TX_BUSY;
-
+ /* Should not happen as last start_xmit call should have
+ * checked for sufficient space and queue should only be
+ * woken when sufficient space is available.
+ */
netif_stop_queue(ndev);
-
- /* Matches barrier in axienet_start_xmit_done */
- smp_mb();
-
- /* Space might have just been freed - check again */
- if (axienet_check_tx_bd_space(lp, num_frag + 1))
- return NETDEV_TX_BUSY;
-
- netif_wake_queue(ndev);
+ if (net_ratelimit())
+ netdev_warn(ndev, "TX ring unexpectedly full\n");
+ return NETDEV_TX_BUSY;
}
if (skb->ip_summed == CHECKSUM_PARTIAL) {
@@ -821,6 +817,18 @@ axienet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
if (++lp->tx_bd_tail >= lp->tx_bd_num)
lp->tx_bd_tail = 0;
+ /* Stop queue if next transmit may not have space */
+ if (axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1)) {
+ netif_stop_queue(ndev);
+
+ /* Matches barrier in axienet_start_xmit_done */
+ smp_mb();
+
+ /* Space might have just been freed - check again */
+ if (!axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1))
+ netif_wake_queue(ndev);
+ }
+
return NETDEV_TX_OK;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bb193e3db8b86a63f26889c99e14fd30c9ebd72a Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:31 -0600
Subject: [PATCH] net: axienet: fix for TX busy handling
Network driver documentation indicates we should be avoiding returning
NETDEV_TX_BUSY from ndo_start_xmit in normal cases, since it requires
the packets to be requeued. Instead the queue should be stopped after
a packet is added to the TX ring when there may not be enough room for an
additional one. Also, when TX ring entries are completed, we should only
wake the queue if we know there is room for another full maximally
fragmented packet.
Print a warning if there is insufficient space at the start of start_xmit,
since this should no longer happen.
Combined with increasing the default TX ring size (in a subsequent
patch), this appears to recover the TX performance lost by previous changes
to actually manage the TX ring state properly.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 8dc9e92e05d2..b4f42ee9b75d 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -660,6 +660,32 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
return i;
}
+/**
+ * axienet_check_tx_bd_space - Checks if a BD/group of BDs are currently busy
+ * @lp: Pointer to the axienet_local structure
+ * @num_frag: The number of BDs to check for
+ *
+ * Return: 0, on success
+ * NETDEV_TX_BUSY, if any of the descriptors are not free
+ *
+ * This function is invoked before BDs are allocated and transmission starts.
+ * This function returns 0 if a BD or group of BDs can be allocated for
+ * transmission. If the BD or any of the BDs are not free the function
+ * returns a busy status. This is invoked from axienet_start_xmit.
+ */
+static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
+ int num_frag)
+{
+ struct axidma_bd *cur_p;
+
+ /* Ensure we see all descriptor updates from device or TX IRQ path */
+ rmb();
+ cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
+ if (cur_p->cntrl)
+ return NETDEV_TX_BUSY;
+ return 0;
+}
+
/**
* axienet_start_xmit_done - Invoked once a transmit is completed by the
* Axi DMA Tx channel.
@@ -689,33 +715,8 @@ static void axienet_start_xmit_done(struct net_device *ndev)
/* Matches barrier in axienet_start_xmit */
smp_mb();
- netif_wake_queue(ndev);
-}
-
-/**
- * axienet_check_tx_bd_space - Checks if a BD/group of BDs are currently busy
- * @lp: Pointer to the axienet_local structure
- * @num_frag: The number of BDs to check for
- *
- * Return: 0, on success
- * NETDEV_TX_BUSY, if any of the descriptors are not free
- *
- * This function is invoked before BDs are allocated and transmission starts.
- * This function returns 0 if a BD or group of BDs can be allocated for
- * transmission. If the BD or any of the BDs are not free the function
- * returns a busy status. This is invoked from axienet_start_xmit.
- */
-static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
- int num_frag)
-{
- struct axidma_bd *cur_p;
-
- /* Ensure we see all descriptor updates from device or TX IRQ path */
- rmb();
- cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
- if (cur_p->cntrl)
- return NETDEV_TX_BUSY;
- return 0;
+ if (!axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1))
+ netif_wake_queue(ndev);
}
/**
@@ -748,19 +749,14 @@ axienet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
if (axienet_check_tx_bd_space(lp, num_frag + 1)) {
- if (netif_queue_stopped(ndev))
- return NETDEV_TX_BUSY;
-
+ /* Should not happen as last start_xmit call should have
+ * checked for sufficient space and queue should only be
+ * woken when sufficient space is available.
+ */
netif_stop_queue(ndev);
-
- /* Matches barrier in axienet_start_xmit_done */
- smp_mb();
-
- /* Space might have just been freed - check again */
- if (axienet_check_tx_bd_space(lp, num_frag + 1))
- return NETDEV_TX_BUSY;
-
- netif_wake_queue(ndev);
+ if (net_ratelimit())
+ netdev_warn(ndev, "TX ring unexpectedly full\n");
+ return NETDEV_TX_BUSY;
}
if (skb->ip_summed == CHECKSUM_PARTIAL) {
@@ -821,6 +817,18 @@ axienet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
if (++lp->tx_bd_tail >= lp->tx_bd_num)
lp->tx_bd_tail = 0;
+ /* Stop queue if next transmit may not have space */
+ if (axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1)) {
+ netif_stop_queue(ndev);
+
+ /* Matches barrier in axienet_start_xmit_done */
+ smp_mb();
+
+ /* Space might have just been freed - check again */
+ if (!axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1))
+ netif_wake_queue(ndev);
+ }
+
return NETDEV_TX_OK;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 996defd7f8b5dafc1d480b7585c7c62437f80c3c Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:29 -0600
Subject: [PATCH] net: axienet: Fix TX ring slot available check
The check for whether a TX ring slot was available was incorrect,
since a slot which had been loaded with transmit data but the device had
not started transmitting would be treated as available, potentially
causing non-transmitted slots to be overwritten. The control field in
the descriptor should be checked, rather than the status field (which may
only be updated when the device completes the entry).
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 3f92001bacaf..85fe2b3bd37a 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -643,7 +643,6 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (cur_p->skb && (status & XAXIDMA_BD_STS_COMPLETE_MASK))
dev_consume_skb_irq(cur_p->skb);
- cur_p->cntrl = 0;
cur_p->app0 = 0;
cur_p->app1 = 0;
cur_p->app2 = 0;
@@ -651,6 +650,7 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->skb = NULL;
/* ensure our transmit path and device don't prematurely see status cleared */
wmb();
+ cur_p->cntrl = 0;
cur_p->status = 0;
if (sizep)
@@ -713,7 +713,7 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
/* Ensure we see all descriptor updates from device or TX IRQ path */
rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
- if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
+ if (cur_p->cntrl)
return NETDEV_TX_BUSY;
return 0;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 996defd7f8b5dafc1d480b7585c7c62437f80c3c Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:29 -0600
Subject: [PATCH] net: axienet: Fix TX ring slot available check
The check for whether a TX ring slot was available was incorrect,
since a slot which had been loaded with transmit data but the device had
not started transmitting would be treated as available, potentially
causing non-transmitted slots to be overwritten. The control field in
the descriptor should be checked, rather than the status field (which may
only be updated when the device completes the entry).
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 3f92001bacaf..85fe2b3bd37a 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -643,7 +643,6 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (cur_p->skb && (status & XAXIDMA_BD_STS_COMPLETE_MASK))
dev_consume_skb_irq(cur_p->skb);
- cur_p->cntrl = 0;
cur_p->app0 = 0;
cur_p->app1 = 0;
cur_p->app2 = 0;
@@ -651,6 +650,7 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->skb = NULL;
/* ensure our transmit path and device don't prematurely see status cleared */
wmb();
+ cur_p->cntrl = 0;
cur_p->status = 0;
if (sizep)
@@ -713,7 +713,7 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
/* Ensure we see all descriptor updates from device or TX IRQ path */
rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
- if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
+ if (cur_p->cntrl)
return NETDEV_TX_BUSY;
return 0;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 996defd7f8b5dafc1d480b7585c7c62437f80c3c Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:29 -0600
Subject: [PATCH] net: axienet: Fix TX ring slot available check
The check for whether a TX ring slot was available was incorrect,
since a slot which had been loaded with transmit data but the device had
not started transmitting would be treated as available, potentially
causing non-transmitted slots to be overwritten. The control field in
the descriptor should be checked, rather than the status field (which may
only be updated when the device completes the entry).
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 3f92001bacaf..85fe2b3bd37a 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -643,7 +643,6 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (cur_p->skb && (status & XAXIDMA_BD_STS_COMPLETE_MASK))
dev_consume_skb_irq(cur_p->skb);
- cur_p->cntrl = 0;
cur_p->app0 = 0;
cur_p->app1 = 0;
cur_p->app2 = 0;
@@ -651,6 +650,7 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->skb = NULL;
/* ensure our transmit path and device don't prematurely see status cleared */
wmb();
+ cur_p->cntrl = 0;
cur_p->status = 0;
if (sizep)
@@ -713,7 +713,7 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
/* Ensure we see all descriptor updates from device or TX IRQ path */
rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
- if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
+ if (cur_p->cntrl)
return NETDEV_TX_BUSY;
return 0;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 996defd7f8b5dafc1d480b7585c7c62437f80c3c Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:29 -0600
Subject: [PATCH] net: axienet: Fix TX ring slot available check
The check for whether a TX ring slot was available was incorrect,
since a slot which had been loaded with transmit data but the device had
not started transmitting would be treated as available, potentially
causing non-transmitted slots to be overwritten. The control field in
the descriptor should be checked, rather than the status field (which may
only be updated when the device completes the entry).
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 3f92001bacaf..85fe2b3bd37a 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -643,7 +643,6 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (cur_p->skb && (status & XAXIDMA_BD_STS_COMPLETE_MASK))
dev_consume_skb_irq(cur_p->skb);
- cur_p->cntrl = 0;
cur_p->app0 = 0;
cur_p->app1 = 0;
cur_p->app2 = 0;
@@ -651,6 +650,7 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->skb = NULL;
/* ensure our transmit path and device don't prematurely see status cleared */
wmb();
+ cur_p->cntrl = 0;
cur_p->status = 0;
if (sizep)
@@ -713,7 +713,7 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
/* Ensure we see all descriptor updates from device or TX IRQ path */
rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
- if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
+ if (cur_p->cntrl)
return NETDEV_TX_BUSY;
return 0;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 996defd7f8b5dafc1d480b7585c7c62437f80c3c Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:29 -0600
Subject: [PATCH] net: axienet: Fix TX ring slot available check
The check for whether a TX ring slot was available was incorrect,
since a slot which had been loaded with transmit data but the device had
not started transmitting would be treated as available, potentially
causing non-transmitted slots to be overwritten. The control field in
the descriptor should be checked, rather than the status field (which may
only be updated when the device completes the entry).
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 3f92001bacaf..85fe2b3bd37a 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -643,7 +643,6 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (cur_p->skb && (status & XAXIDMA_BD_STS_COMPLETE_MASK))
dev_consume_skb_irq(cur_p->skb);
- cur_p->cntrl = 0;
cur_p->app0 = 0;
cur_p->app1 = 0;
cur_p->app2 = 0;
@@ -651,6 +650,7 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->skb = NULL;
/* ensure our transmit path and device don't prematurely see status cleared */
wmb();
+ cur_p->cntrl = 0;
cur_p->status = 0;
if (sizep)
@@ -713,7 +713,7 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
/* Ensure we see all descriptor updates from device or TX IRQ path */
rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
- if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
+ if (cur_p->cntrl)
return NETDEV_TX_BUSY;
return 0;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 95978df6fa328df619c15312e65ece469c2be2d2 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:27 -0600
Subject: [PATCH] net: axienet: add missing memory barriers
This driver was missing some required memory barriers:
Use dma_rmb to ensure we see all updates to the descriptor after we see
that an entry has been completed.
Use wmb and rmb to avoid stale descriptor status between the TX path and
TX complete IRQ path.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 53ff38cbc37b..fb486a457c76 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -632,6 +632,8 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (nr_bds == -1 && !(status & XAXIDMA_BD_STS_COMPLETE_MASK))
break;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys,
(cur_p->cntrl & XAXIDMA_BD_CTRL_LENGTH_MASK),
@@ -645,8 +647,10 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->app1 = 0;
cur_p->app2 = 0;
cur_p->app4 = 0;
- cur_p->status = 0;
cur_p->skb = NULL;
+ /* ensure our transmit path and device don't prematurely see status cleared */
+ wmb();
+ cur_p->status = 0;
if (sizep)
*sizep += status & XAXIDMA_BD_STS_ACTUAL_LEN_MASK;
@@ -704,6 +708,9 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
int num_frag)
{
struct axidma_bd *cur_p;
+
+ /* Ensure we see all descriptor updates from device or TX IRQ path */
+ rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
return NETDEV_TX_BUSY;
@@ -843,6 +850,8 @@ static void axienet_recv(struct net_device *ndev)
tail_p = lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_ci;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys, lp->max_frm_size,
DMA_FROM_DEVICE);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 95978df6fa328df619c15312e65ece469c2be2d2 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:27 -0600
Subject: [PATCH] net: axienet: add missing memory barriers
This driver was missing some required memory barriers:
Use dma_rmb to ensure we see all updates to the descriptor after we see
that an entry has been completed.
Use wmb and rmb to avoid stale descriptor status between the TX path and
TX complete IRQ path.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 53ff38cbc37b..fb486a457c76 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -632,6 +632,8 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (nr_bds == -1 && !(status & XAXIDMA_BD_STS_COMPLETE_MASK))
break;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys,
(cur_p->cntrl & XAXIDMA_BD_CTRL_LENGTH_MASK),
@@ -645,8 +647,10 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->app1 = 0;
cur_p->app2 = 0;
cur_p->app4 = 0;
- cur_p->status = 0;
cur_p->skb = NULL;
+ /* ensure our transmit path and device don't prematurely see status cleared */
+ wmb();
+ cur_p->status = 0;
if (sizep)
*sizep += status & XAXIDMA_BD_STS_ACTUAL_LEN_MASK;
@@ -704,6 +708,9 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
int num_frag)
{
struct axidma_bd *cur_p;
+
+ /* Ensure we see all descriptor updates from device or TX IRQ path */
+ rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
return NETDEV_TX_BUSY;
@@ -843,6 +850,8 @@ static void axienet_recv(struct net_device *ndev)
tail_p = lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_ci;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys, lp->max_frm_size,
DMA_FROM_DEVICE);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 95978df6fa328df619c15312e65ece469c2be2d2 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:27 -0600
Subject: [PATCH] net: axienet: add missing memory barriers
This driver was missing some required memory barriers:
Use dma_rmb to ensure we see all updates to the descriptor after we see
that an entry has been completed.
Use wmb and rmb to avoid stale descriptor status between the TX path and
TX complete IRQ path.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 53ff38cbc37b..fb486a457c76 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -632,6 +632,8 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (nr_bds == -1 && !(status & XAXIDMA_BD_STS_COMPLETE_MASK))
break;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys,
(cur_p->cntrl & XAXIDMA_BD_CTRL_LENGTH_MASK),
@@ -645,8 +647,10 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->app1 = 0;
cur_p->app2 = 0;
cur_p->app4 = 0;
- cur_p->status = 0;
cur_p->skb = NULL;
+ /* ensure our transmit path and device don't prematurely see status cleared */
+ wmb();
+ cur_p->status = 0;
if (sizep)
*sizep += status & XAXIDMA_BD_STS_ACTUAL_LEN_MASK;
@@ -704,6 +708,9 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
int num_frag)
{
struct axidma_bd *cur_p;
+
+ /* Ensure we see all descriptor updates from device or TX IRQ path */
+ rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
return NETDEV_TX_BUSY;
@@ -843,6 +850,8 @@ static void axienet_recv(struct net_device *ndev)
tail_p = lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_ci;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys, lp->max_frm_size,
DMA_FROM_DEVICE);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 95978df6fa328df619c15312e65ece469c2be2d2 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:27 -0600
Subject: [PATCH] net: axienet: add missing memory barriers
This driver was missing some required memory barriers:
Use dma_rmb to ensure we see all updates to the descriptor after we see
that an entry has been completed.
Use wmb and rmb to avoid stale descriptor status between the TX path and
TX complete IRQ path.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 53ff38cbc37b..fb486a457c76 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -632,6 +632,8 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (nr_bds == -1 && !(status & XAXIDMA_BD_STS_COMPLETE_MASK))
break;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys,
(cur_p->cntrl & XAXIDMA_BD_CTRL_LENGTH_MASK),
@@ -645,8 +647,10 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->app1 = 0;
cur_p->app2 = 0;
cur_p->app4 = 0;
- cur_p->status = 0;
cur_p->skb = NULL;
+ /* ensure our transmit path and device don't prematurely see status cleared */
+ wmb();
+ cur_p->status = 0;
if (sizep)
*sizep += status & XAXIDMA_BD_STS_ACTUAL_LEN_MASK;
@@ -704,6 +708,9 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
int num_frag)
{
struct axidma_bd *cur_p;
+
+ /* Ensure we see all descriptor updates from device or TX IRQ path */
+ rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
return NETDEV_TX_BUSY;
@@ -843,6 +850,8 @@ static void axienet_recv(struct net_device *ndev)
tail_p = lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_ci;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys, lp->max_frm_size,
DMA_FROM_DEVICE);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 95978df6fa328df619c15312e65ece469c2be2d2 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:27 -0600
Subject: [PATCH] net: axienet: add missing memory barriers
This driver was missing some required memory barriers:
Use dma_rmb to ensure we see all updates to the descriptor after we see
that an entry has been completed.
Use wmb and rmb to avoid stale descriptor status between the TX path and
TX complete IRQ path.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 53ff38cbc37b..fb486a457c76 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -632,6 +632,8 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
if (nr_bds == -1 && !(status & XAXIDMA_BD_STS_COMPLETE_MASK))
break;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys,
(cur_p->cntrl & XAXIDMA_BD_CTRL_LENGTH_MASK),
@@ -645,8 +647,10 @@ static int axienet_free_tx_chain(struct net_device *ndev, u32 first_bd,
cur_p->app1 = 0;
cur_p->app2 = 0;
cur_p->app4 = 0;
- cur_p->status = 0;
cur_p->skb = NULL;
+ /* ensure our transmit path and device don't prematurely see status cleared */
+ wmb();
+ cur_p->status = 0;
if (sizep)
*sizep += status & XAXIDMA_BD_STS_ACTUAL_LEN_MASK;
@@ -704,6 +708,9 @@ static inline int axienet_check_tx_bd_space(struct axienet_local *lp,
int num_frag)
{
struct axidma_bd *cur_p;
+
+ /* Ensure we see all descriptor updates from device or TX IRQ path */
+ rmb();
cur_p = &lp->tx_bd_v[(lp->tx_bd_tail + num_frag) % lp->tx_bd_num];
if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
return NETDEV_TX_BUSY;
@@ -843,6 +850,8 @@ static void axienet_recv(struct net_device *ndev)
tail_p = lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_ci;
+ /* Ensure we see complete descriptor update */
+ dma_rmb();
phys = desc_get_phys_addr(lp, cur_p);
dma_unmap_single(ndev->dev.parent, phys, lp->max_frm_size,
DMA_FROM_DEVICE);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b400c2f4f4c53c86594dd57098970d97d488bfde Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:25 -0600
Subject: [PATCH] net: axienet: Wait for PhyRstCmplt after core reset
When resetting the device, wait for the PhyRstCmplt bit to be set
in the interrupt status register before continuing initialization, to
ensure that the core is actually ready. When using an external PHY, this
also ensures we do not start trying to access the PHY while it is still
in reset. The PHY reset is initiated by the core reset which is
triggered just above, but remains asserted for 5ms after the core is
reset according to the documentation.
The MgtRdy bit could also be waited for, but unfortunately when using
7-series devices, the bit does not appear to work as documented (it
seems to behave as some sort of link state indication and not just an
indication the transceiver is ready) so it can't really be relied on for
this purpose.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 9c5b24af61fa..3a2d7e8c3f66 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -516,6 +516,16 @@ static int __axienet_device_reset(struct axienet_local *lp)
return ret;
}
+ /* Wait for PhyRstCmplt bit to be set, indicating the PHY reset has finished */
+ ret = read_poll_timeout(axienet_ior, value,
+ value & XAE_INT_PHYRSTCMPLT_MASK,
+ DELAY_OF_ONE_MILLISEC, 50000, false, lp,
+ XAE_IS_OFFSET);
+ if (ret) {
+ dev_err(lp->dev, "%s: timeout waiting for PhyRstCmplt\n", __func__);
+ return ret;
+ }
+
return 0;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e5644b1bab2ccea9cfc7a9520af95b94eb0dbf1 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:24 -0600
Subject: [PATCH] net: axienet: increase reset timeout
The previous timeout of 1ms was too short to handle some cases where the
core is reset just after the input clocks were started, which will
be introduced in an upcoming patch. Increase the timeout to 50ms. Also
simplify the reset timeout checking to use read_poll_timeout.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 23ac353b35fe..9c5b24af61fa 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -496,7 +496,8 @@ static void axienet_setoptions(struct net_device *ndev, u32 options)
static int __axienet_device_reset(struct axienet_local *lp)
{
- u32 timeout;
+ u32 value;
+ int ret;
/* Reset Axi DMA. This would reset Axi Ethernet core as well. The reset
* process of Axi DMA takes a while to complete as all pending
@@ -506,15 +507,13 @@ static int __axienet_device_reset(struct axienet_local *lp)
* they both reset the entire DMA core, so only one needs to be used.
*/
axienet_dma_out32(lp, XAXIDMA_TX_CR_OFFSET, XAXIDMA_CR_RESET_MASK);
- timeout = DELAY_OF_ONE_MILLISEC;
- while (axienet_dma_in32(lp, XAXIDMA_TX_CR_OFFSET) &
- XAXIDMA_CR_RESET_MASK) {
- udelay(1);
- if (--timeout == 0) {
- netdev_err(lp->ndev, "%s: DMA reset timeout!\n",
- __func__);
- return -ETIMEDOUT;
- }
+ ret = read_poll_timeout(axienet_dma_in32, value,
+ !(value & XAXIDMA_CR_RESET_MASK),
+ DELAY_OF_ONE_MILLISEC, 50000, false, lp,
+ XAXIDMA_TX_CR_OFFSET);
+ if (ret) {
+ dev_err(lp->dev, "%s: DMA reset timeout!\n", __func__);
+ return ret;
}
return 0;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e5644b1bab2ccea9cfc7a9520af95b94eb0dbf1 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:24 -0600
Subject: [PATCH] net: axienet: increase reset timeout
The previous timeout of 1ms was too short to handle some cases where the
core is reset just after the input clocks were started, which will
be introduced in an upcoming patch. Increase the timeout to 50ms. Also
simplify the reset timeout checking to use read_poll_timeout.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 23ac353b35fe..9c5b24af61fa 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -496,7 +496,8 @@ static void axienet_setoptions(struct net_device *ndev, u32 options)
static int __axienet_device_reset(struct axienet_local *lp)
{
- u32 timeout;
+ u32 value;
+ int ret;
/* Reset Axi DMA. This would reset Axi Ethernet core as well. The reset
* process of Axi DMA takes a while to complete as all pending
@@ -506,15 +507,13 @@ static int __axienet_device_reset(struct axienet_local *lp)
* they both reset the entire DMA core, so only one needs to be used.
*/
axienet_dma_out32(lp, XAXIDMA_TX_CR_OFFSET, XAXIDMA_CR_RESET_MASK);
- timeout = DELAY_OF_ONE_MILLISEC;
- while (axienet_dma_in32(lp, XAXIDMA_TX_CR_OFFSET) &
- XAXIDMA_CR_RESET_MASK) {
- udelay(1);
- if (--timeout == 0) {
- netdev_err(lp->ndev, "%s: DMA reset timeout!\n",
- __func__);
- return -ETIMEDOUT;
- }
+ ret = read_poll_timeout(axienet_dma_in32, value,
+ !(value & XAXIDMA_CR_RESET_MASK),
+ DELAY_OF_ONE_MILLISEC, 50000, false, lp,
+ XAXIDMA_TX_CR_OFFSET);
+ if (ret) {
+ dev_err(lp->dev, "%s: DMA reset timeout!\n", __func__);
+ return ret;
}
return 0;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e5644b1bab2ccea9cfc7a9520af95b94eb0dbf1 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:24 -0600
Subject: [PATCH] net: axienet: increase reset timeout
The previous timeout of 1ms was too short to handle some cases where the
core is reset just after the input clocks were started, which will
be introduced in an upcoming patch. Increase the timeout to 50ms. Also
simplify the reset timeout checking to use read_poll_timeout.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 23ac353b35fe..9c5b24af61fa 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -496,7 +496,8 @@ static void axienet_setoptions(struct net_device *ndev, u32 options)
static int __axienet_device_reset(struct axienet_local *lp)
{
- u32 timeout;
+ u32 value;
+ int ret;
/* Reset Axi DMA. This would reset Axi Ethernet core as well. The reset
* process of Axi DMA takes a while to complete as all pending
@@ -506,15 +507,13 @@ static int __axienet_device_reset(struct axienet_local *lp)
* they both reset the entire DMA core, so only one needs to be used.
*/
axienet_dma_out32(lp, XAXIDMA_TX_CR_OFFSET, XAXIDMA_CR_RESET_MASK);
- timeout = DELAY_OF_ONE_MILLISEC;
- while (axienet_dma_in32(lp, XAXIDMA_TX_CR_OFFSET) &
- XAXIDMA_CR_RESET_MASK) {
- udelay(1);
- if (--timeout == 0) {
- netdev_err(lp->ndev, "%s: DMA reset timeout!\n",
- __func__);
- return -ETIMEDOUT;
- }
+ ret = read_poll_timeout(axienet_dma_in32, value,
+ !(value & XAXIDMA_CR_RESET_MASK),
+ DELAY_OF_ONE_MILLISEC, 50000, false, lp,
+ XAXIDMA_TX_CR_OFFSET);
+ if (ret) {
+ dev_err(lp->dev, "%s: DMA reset timeout!\n", __func__);
+ return ret;
}
return 0;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e5644b1bab2ccea9cfc7a9520af95b94eb0dbf1 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:24 -0600
Subject: [PATCH] net: axienet: increase reset timeout
The previous timeout of 1ms was too short to handle some cases where the
core is reset just after the input clocks were started, which will
be introduced in an upcoming patch. Increase the timeout to 50ms. Also
simplify the reset timeout checking to use read_poll_timeout.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 23ac353b35fe..9c5b24af61fa 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -496,7 +496,8 @@ static void axienet_setoptions(struct net_device *ndev, u32 options)
static int __axienet_device_reset(struct axienet_local *lp)
{
- u32 timeout;
+ u32 value;
+ int ret;
/* Reset Axi DMA. This would reset Axi Ethernet core as well. The reset
* process of Axi DMA takes a while to complete as all pending
@@ -506,15 +507,13 @@ static int __axienet_device_reset(struct axienet_local *lp)
* they both reset the entire DMA core, so only one needs to be used.
*/
axienet_dma_out32(lp, XAXIDMA_TX_CR_OFFSET, XAXIDMA_CR_RESET_MASK);
- timeout = DELAY_OF_ONE_MILLISEC;
- while (axienet_dma_in32(lp, XAXIDMA_TX_CR_OFFSET) &
- XAXIDMA_CR_RESET_MASK) {
- udelay(1);
- if (--timeout == 0) {
- netdev_err(lp->ndev, "%s: DMA reset timeout!\n",
- __func__);
- return -ETIMEDOUT;
- }
+ ret = read_poll_timeout(axienet_dma_in32, value,
+ !(value & XAXIDMA_CR_RESET_MASK),
+ DELAY_OF_ONE_MILLISEC, 50000, false, lp,
+ XAXIDMA_TX_CR_OFFSET);
+ if (ret) {
+ dev_err(lp->dev, "%s: DMA reset timeout!\n", __func__);
+ return ret;
}
return 0;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e5644b1bab2ccea9cfc7a9520af95b94eb0dbf1 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Tue, 18 Jan 2022 15:41:24 -0600
Subject: [PATCH] net: axienet: increase reset timeout
The previous timeout of 1ms was too short to handle some cases where the
core is reset just after the input clocks were started, which will
be introduced in an upcoming patch. Increase the timeout to 50ms. Also
simplify the reset timeout checking to use read_poll_timeout.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 23ac353b35fe..9c5b24af61fa 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -496,7 +496,8 @@ static void axienet_setoptions(struct net_device *ndev, u32 options)
static int __axienet_device_reset(struct axienet_local *lp)
{
- u32 timeout;
+ u32 value;
+ int ret;
/* Reset Axi DMA. This would reset Axi Ethernet core as well. The reset
* process of Axi DMA takes a while to complete as all pending
@@ -506,15 +507,13 @@ static int __axienet_device_reset(struct axienet_local *lp)
* they both reset the entire DMA core, so only one needs to be used.
*/
axienet_dma_out32(lp, XAXIDMA_TX_CR_OFFSET, XAXIDMA_CR_RESET_MASK);
- timeout = DELAY_OF_ONE_MILLISEC;
- while (axienet_dma_in32(lp, XAXIDMA_TX_CR_OFFSET) &
- XAXIDMA_CR_RESET_MASK) {
- udelay(1);
- if (--timeout == 0) {
- netdev_err(lp->ndev, "%s: DMA reset timeout!\n",
- __func__);
- return -ETIMEDOUT;
- }
+ ret = read_poll_timeout(axienet_dma_in32, value,
+ !(value & XAXIDMA_CR_RESET_MASK),
+ DELAY_OF_ONE_MILLISEC, 50000, false, lp,
+ XAXIDMA_TX_CR_OFFSET);
+ if (ret) {
+ dev_err(lp->dev, "%s: DMA reset timeout!\n", __func__);
+ return ret;
}
return 0;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c5d692a2335d64ac390aeb8ab6c4ac9f662e1be4 Mon Sep 17 00:00:00 2001
From: Tom Rix <trix(a)redhat.com>
Date: Wed, 22 Dec 2021 09:29:23 -0800
Subject: [PATCH] crypto: hisilicon - cleanup warning in qm_get_qos_value()
Building with clang static analysis returns this warning:
qm.c:4382:11: warning: The left operand of '==' is a garbage value
if (*val == 0 || *val > QM_QOS_MAX_VAL || ret) {
~~~~ ^
The call to qm_qos_value_init() can return an error without setting
*val. So check ret before checking *val.
Fixes: 72b010dc33b9 ("crypto: hisilicon/qm - supports writing QoS int the host")
Signed-off-by: Tom Rix <trix(a)redhat.com>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index b731cb4ec294..c5b84a5ea350 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -4394,7 +4394,7 @@ static ssize_t qm_get_qos_value(struct hisi_qm *qm, const char *buf,
return -EINVAL;
ret = qm_qos_value_init(val_buf, val);
- if (*val == 0 || *val > QM_QOS_MAX_VAL || ret) {
+ if (ret || *val == 0 || *val > QM_QOS_MAX_VAL) {
pci_err(qm->pdev, "input qos value is error, please set 1~1000!\n");
return -EINVAL;
}
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c5d692a2335d64ac390aeb8ab6c4ac9f662e1be4 Mon Sep 17 00:00:00 2001
From: Tom Rix <trix(a)redhat.com>
Date: Wed, 22 Dec 2021 09:29:23 -0800
Subject: [PATCH] crypto: hisilicon - cleanup warning in qm_get_qos_value()
Building with clang static analysis returns this warning:
qm.c:4382:11: warning: The left operand of '==' is a garbage value
if (*val == 0 || *val > QM_QOS_MAX_VAL || ret) {
~~~~ ^
The call to qm_qos_value_init() can return an error without setting
*val. So check ret before checking *val.
Fixes: 72b010dc33b9 ("crypto: hisilicon/qm - supports writing QoS int the host")
Signed-off-by: Tom Rix <trix(a)redhat.com>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index b731cb4ec294..c5b84a5ea350 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -4394,7 +4394,7 @@ static ssize_t qm_get_qos_value(struct hisi_qm *qm, const char *buf,
return -EINVAL;
ret = qm_qos_value_init(val_buf, val);
- if (*val == 0 || *val > QM_QOS_MAX_VAL || ret) {
+ if (ret || *val == 0 || *val > QM_QOS_MAX_VAL) {
pci_err(qm->pdev, "input qos value is error, please set 1~1000!\n");
return -EINVAL;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f6db43076d190d9bf75559dec28e18b9d12e4ce5 Mon Sep 17 00:00:00 2001
From: Chao Yu <chao(a)kernel.org>
Date: Mon, 6 Dec 2021 22:44:20 +0800
Subject: [PATCH] f2fs: fix to avoid panic in is_alive() if metadata is
inconsistent
As report by Wenqing Liu in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215231
If we enable CONFIG_F2FS_CHECK_FS config, and with fuzzed image attached
in above link, we will encounter panic when executing below script:
1. mkdir mnt
2. mount -t f2fs tmp1.img mnt
3. touch tmp
F2FS-fs (loop11): mismatched blkaddr 5765 (source_blkaddr 1) in seg 3
kernel BUG at fs/f2fs/gc.c:1042!
do_garbage_collect+0x90f/0xa80 [f2fs]
f2fs_gc+0x294/0x12a0 [f2fs]
f2fs_balance_fs+0x2c5/0x7d0 [f2fs]
f2fs_create+0x239/0xd90 [f2fs]
lookup_open+0x45e/0xa90
open_last_lookups+0x203/0x670
path_openat+0xae/0x490
do_filp_open+0xbc/0x160
do_sys_openat2+0x2f1/0x500
do_sys_open+0x5e/0xa0
__x64_sys_openat+0x28/0x40
Previously, f2fs tries to catch data inconcistency exception in between
SSA and SIT table during GC, however once the exception is caught, it will
call f2fs_bug_on to hang kernel, it's not needed, instead, let's set
SBI_NEED_FSCK flag and skip migrating current block.
Fixes: bbf9f7d90f21 ("f2fs: Fix indefinite loop in f2fs_gc()")
Signed-off-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e0bdc4361a9b..3e64b234df21 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1039,7 +1039,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
if (!test_and_set_bit(segno, SIT_I(sbi)->invalid_segmap)) {
f2fs_err(sbi, "mismatched blkaddr %u (source_blkaddr %u) in seg %u",
blkaddr, source_blkaddr, segno);
- f2fs_bug_on(sbi, 1);
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
}
}
#endif
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f6db43076d190d9bf75559dec28e18b9d12e4ce5 Mon Sep 17 00:00:00 2001
From: Chao Yu <chao(a)kernel.org>
Date: Mon, 6 Dec 2021 22:44:20 +0800
Subject: [PATCH] f2fs: fix to avoid panic in is_alive() if metadata is
inconsistent
As report by Wenqing Liu in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215231
If we enable CONFIG_F2FS_CHECK_FS config, and with fuzzed image attached
in above link, we will encounter panic when executing below script:
1. mkdir mnt
2. mount -t f2fs tmp1.img mnt
3. touch tmp
F2FS-fs (loop11): mismatched blkaddr 5765 (source_blkaddr 1) in seg 3
kernel BUG at fs/f2fs/gc.c:1042!
do_garbage_collect+0x90f/0xa80 [f2fs]
f2fs_gc+0x294/0x12a0 [f2fs]
f2fs_balance_fs+0x2c5/0x7d0 [f2fs]
f2fs_create+0x239/0xd90 [f2fs]
lookup_open+0x45e/0xa90
open_last_lookups+0x203/0x670
path_openat+0xae/0x490
do_filp_open+0xbc/0x160
do_sys_openat2+0x2f1/0x500
do_sys_open+0x5e/0xa0
__x64_sys_openat+0x28/0x40
Previously, f2fs tries to catch data inconcistency exception in between
SSA and SIT table during GC, however once the exception is caught, it will
call f2fs_bug_on to hang kernel, it's not needed, instead, let's set
SBI_NEED_FSCK flag and skip migrating current block.
Fixes: bbf9f7d90f21 ("f2fs: Fix indefinite loop in f2fs_gc()")
Signed-off-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e0bdc4361a9b..3e64b234df21 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1039,7 +1039,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
if (!test_and_set_bit(segno, SIT_I(sbi)->invalid_segmap)) {
f2fs_err(sbi, "mismatched blkaddr %u (source_blkaddr %u) in seg %u",
blkaddr, source_blkaddr, segno);
- f2fs_bug_on(sbi, 1);
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
}
}
#endif
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f36b96bc70f9707e618d22cd6a6baf86706ade2 Mon Sep 17 00:00:00 2001
From: Palmer Dabbelt <palmer(a)rivosinc.com>
Date: Fri, 19 Nov 2021 08:44:03 -0800
Subject: [PATCH] RISC-V: MAXPHYSMEM_2GB doesn't depend on CMODEL_MEDLOW
For non-relocatable kernels we need to be able to link the kernel at
approximately PAGE_OFFSET, thus requiring medany (as medlow requires the
code to be linked within 2GiB of 0). The inverse doesn't apply, though:
since medany code can be linked anywhere it's fine to link it close to
0, so we can support the smaller memory config.
Fixes: de5f4b8f634b ("RISC-V: Define MAXPHYSMEM_1GB only for RV32")
Reviewed-by: Anup Patel <anup(a)brainfault.org>
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 821252b65f89..61f64512dcde 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -280,7 +280,7 @@ choice
depends on 32BIT
bool "1GiB"
config MAXPHYSMEM_2GB
- depends on 64BIT && CMODEL_MEDLOW
+ depends on 64BIT
bool "2GiB"
config MAXPHYSMEM_128GB
depends on 64BIT && CMODEL_MEDANY
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f36b96bc70f9707e618d22cd6a6baf86706ade2 Mon Sep 17 00:00:00 2001
From: Palmer Dabbelt <palmer(a)rivosinc.com>
Date: Fri, 19 Nov 2021 08:44:03 -0800
Subject: [PATCH] RISC-V: MAXPHYSMEM_2GB doesn't depend on CMODEL_MEDLOW
For non-relocatable kernels we need to be able to link the kernel at
approximately PAGE_OFFSET, thus requiring medany (as medlow requires the
code to be linked within 2GiB of 0). The inverse doesn't apply, though:
since medany code can be linked anywhere it's fine to link it close to
0, so we can support the smaller memory config.
Fixes: de5f4b8f634b ("RISC-V: Define MAXPHYSMEM_1GB only for RV32")
Reviewed-by: Anup Patel <anup(a)brainfault.org>
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 821252b65f89..61f64512dcde 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -280,7 +280,7 @@ choice
depends on 32BIT
bool "1GiB"
config MAXPHYSMEM_2GB
- depends on 64BIT && CMODEL_MEDLOW
+ depends on 64BIT
bool "2GiB"
config MAXPHYSMEM_128GB
depends on 64BIT && CMODEL_MEDANY
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f36b96bc70f9707e618d22cd6a6baf86706ade2 Mon Sep 17 00:00:00 2001
From: Palmer Dabbelt <palmer(a)rivosinc.com>
Date: Fri, 19 Nov 2021 08:44:03 -0800
Subject: [PATCH] RISC-V: MAXPHYSMEM_2GB doesn't depend on CMODEL_MEDLOW
For non-relocatable kernels we need to be able to link the kernel at
approximately PAGE_OFFSET, thus requiring medany (as medlow requires the
code to be linked within 2GiB of 0). The inverse doesn't apply, though:
since medany code can be linked anywhere it's fine to link it close to
0, so we can support the smaller memory config.
Fixes: de5f4b8f634b ("RISC-V: Define MAXPHYSMEM_1GB only for RV32")
Reviewed-by: Anup Patel <anup(a)brainfault.org>
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 821252b65f89..61f64512dcde 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -280,7 +280,7 @@ choice
depends on 32BIT
bool "1GiB"
config MAXPHYSMEM_2GB
- depends on 64BIT && CMODEL_MEDLOW
+ depends on 64BIT
bool "2GiB"
config MAXPHYSMEM_128GB
depends on 64BIT && CMODEL_MEDANY
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6198c722019774d38018457a8bfb9ba3ed8c931e Mon Sep 17 00:00:00 2001
From: Tobias Waldekranz <tobias(a)waldekranz.com>
Date: Tue, 18 Jan 2022 22:50:50 +0100
Subject: [PATCH] net/fsl: xgmac_mdio: Add workaround for erratum A-009885
Once an MDIO read transaction is initiated, we must read back the data
register within 16 MDC cycles after the transaction completes. Outside
of this window, reads may return corrupt data.
Therefore, disable local interrupts in the critical section, to
maximize the probability that we can satisfy this requirement.
Fixes: d55ad2967d89 ("powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan")
Signed-off-by: Tobias Waldekranz <tobias(a)waldekranz.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
index 5b8b9bcf41a2..bf566ac3195b 100644
--- a/drivers/net/ethernet/freescale/xgmac_mdio.c
+++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
@@ -51,6 +51,7 @@ struct tgec_mdio_controller {
struct mdio_fsl_priv {
struct tgec_mdio_controller __iomem *mdio_base;
bool is_little_endian;
+ bool has_a009885;
bool has_a011043;
};
@@ -186,10 +187,10 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
{
struct mdio_fsl_priv *priv = (struct mdio_fsl_priv *)bus->priv;
struct tgec_mdio_controller __iomem *regs = priv->mdio_base;
+ unsigned long flags;
uint16_t dev_addr;
uint32_t mdio_stat;
uint32_t mdio_ctl;
- uint16_t value;
int ret;
bool endian = priv->is_little_endian;
@@ -221,12 +222,18 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
return ret;
}
+ if (priv->has_a009885)
+ /* Once the operation completes, i.e. MDIO_STAT_BSY clears, we
+ * must read back the data register within 16 MDC cycles.
+ */
+ local_irq_save(flags);
+
/* Initiate the read */
xgmac_write32(mdio_ctl | MDIO_CTL_READ, ®s->mdio_ctl, endian);
ret = xgmac_wait_until_done(&bus->dev, regs, endian);
if (ret)
- return ret;
+ goto irq_restore;
/* Return all Fs if nothing was there */
if ((xgmac_read32(®s->mdio_stat, endian) & MDIO_STAT_RD_ER) &&
@@ -234,13 +241,17 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
dev_dbg(&bus->dev,
"Error while reading PHY%d reg at %d.%hhu\n",
phy_id, dev_addr, regnum);
- return 0xffff;
+ ret = 0xffff;
+ } else {
+ ret = xgmac_read32(®s->mdio_data, endian) & 0xffff;
+ dev_dbg(&bus->dev, "read %04x\n", ret);
}
- value = xgmac_read32(®s->mdio_data, endian) & 0xffff;
- dev_dbg(&bus->dev, "read %04x\n", value);
+irq_restore:
+ if (priv->has_a009885)
+ local_irq_restore(flags);
- return value;
+ return ret;
}
static int xgmac_mdio_probe(struct platform_device *pdev)
@@ -287,6 +298,8 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
priv->is_little_endian = device_property_read_bool(&pdev->dev,
"little-endian");
+ priv->has_a009885 = device_property_read_bool(&pdev->dev,
+ "fsl,erratum-a009885");
priv->has_a011043 = device_property_read_bool(&pdev->dev,
"fsl,erratum-a011043");
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6198c722019774d38018457a8bfb9ba3ed8c931e Mon Sep 17 00:00:00 2001
From: Tobias Waldekranz <tobias(a)waldekranz.com>
Date: Tue, 18 Jan 2022 22:50:50 +0100
Subject: [PATCH] net/fsl: xgmac_mdio: Add workaround for erratum A-009885
Once an MDIO read transaction is initiated, we must read back the data
register within 16 MDC cycles after the transaction completes. Outside
of this window, reads may return corrupt data.
Therefore, disable local interrupts in the critical section, to
maximize the probability that we can satisfy this requirement.
Fixes: d55ad2967d89 ("powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan")
Signed-off-by: Tobias Waldekranz <tobias(a)waldekranz.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
index 5b8b9bcf41a2..bf566ac3195b 100644
--- a/drivers/net/ethernet/freescale/xgmac_mdio.c
+++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
@@ -51,6 +51,7 @@ struct tgec_mdio_controller {
struct mdio_fsl_priv {
struct tgec_mdio_controller __iomem *mdio_base;
bool is_little_endian;
+ bool has_a009885;
bool has_a011043;
};
@@ -186,10 +187,10 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
{
struct mdio_fsl_priv *priv = (struct mdio_fsl_priv *)bus->priv;
struct tgec_mdio_controller __iomem *regs = priv->mdio_base;
+ unsigned long flags;
uint16_t dev_addr;
uint32_t mdio_stat;
uint32_t mdio_ctl;
- uint16_t value;
int ret;
bool endian = priv->is_little_endian;
@@ -221,12 +222,18 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
return ret;
}
+ if (priv->has_a009885)
+ /* Once the operation completes, i.e. MDIO_STAT_BSY clears, we
+ * must read back the data register within 16 MDC cycles.
+ */
+ local_irq_save(flags);
+
/* Initiate the read */
xgmac_write32(mdio_ctl | MDIO_CTL_READ, ®s->mdio_ctl, endian);
ret = xgmac_wait_until_done(&bus->dev, regs, endian);
if (ret)
- return ret;
+ goto irq_restore;
/* Return all Fs if nothing was there */
if ((xgmac_read32(®s->mdio_stat, endian) & MDIO_STAT_RD_ER) &&
@@ -234,13 +241,17 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
dev_dbg(&bus->dev,
"Error while reading PHY%d reg at %d.%hhu\n",
phy_id, dev_addr, regnum);
- return 0xffff;
+ ret = 0xffff;
+ } else {
+ ret = xgmac_read32(®s->mdio_data, endian) & 0xffff;
+ dev_dbg(&bus->dev, "read %04x\n", ret);
}
- value = xgmac_read32(®s->mdio_data, endian) & 0xffff;
- dev_dbg(&bus->dev, "read %04x\n", value);
+irq_restore:
+ if (priv->has_a009885)
+ local_irq_restore(flags);
- return value;
+ return ret;
}
static int xgmac_mdio_probe(struct platform_device *pdev)
@@ -287,6 +298,8 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
priv->is_little_endian = device_property_read_bool(&pdev->dev,
"little-endian");
+ priv->has_a009885 = device_property_read_bool(&pdev->dev,
+ "fsl,erratum-a009885");
priv->has_a011043 = device_property_read_bool(&pdev->dev,
"fsl,erratum-a011043");
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6198c722019774d38018457a8bfb9ba3ed8c931e Mon Sep 17 00:00:00 2001
From: Tobias Waldekranz <tobias(a)waldekranz.com>
Date: Tue, 18 Jan 2022 22:50:50 +0100
Subject: [PATCH] net/fsl: xgmac_mdio: Add workaround for erratum A-009885
Once an MDIO read transaction is initiated, we must read back the data
register within 16 MDC cycles after the transaction completes. Outside
of this window, reads may return corrupt data.
Therefore, disable local interrupts in the critical section, to
maximize the probability that we can satisfy this requirement.
Fixes: d55ad2967d89 ("powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan")
Signed-off-by: Tobias Waldekranz <tobias(a)waldekranz.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
index 5b8b9bcf41a2..bf566ac3195b 100644
--- a/drivers/net/ethernet/freescale/xgmac_mdio.c
+++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
@@ -51,6 +51,7 @@ struct tgec_mdio_controller {
struct mdio_fsl_priv {
struct tgec_mdio_controller __iomem *mdio_base;
bool is_little_endian;
+ bool has_a009885;
bool has_a011043;
};
@@ -186,10 +187,10 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
{
struct mdio_fsl_priv *priv = (struct mdio_fsl_priv *)bus->priv;
struct tgec_mdio_controller __iomem *regs = priv->mdio_base;
+ unsigned long flags;
uint16_t dev_addr;
uint32_t mdio_stat;
uint32_t mdio_ctl;
- uint16_t value;
int ret;
bool endian = priv->is_little_endian;
@@ -221,12 +222,18 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
return ret;
}
+ if (priv->has_a009885)
+ /* Once the operation completes, i.e. MDIO_STAT_BSY clears, we
+ * must read back the data register within 16 MDC cycles.
+ */
+ local_irq_save(flags);
+
/* Initiate the read */
xgmac_write32(mdio_ctl | MDIO_CTL_READ, ®s->mdio_ctl, endian);
ret = xgmac_wait_until_done(&bus->dev, regs, endian);
if (ret)
- return ret;
+ goto irq_restore;
/* Return all Fs if nothing was there */
if ((xgmac_read32(®s->mdio_stat, endian) & MDIO_STAT_RD_ER) &&
@@ -234,13 +241,17 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
dev_dbg(&bus->dev,
"Error while reading PHY%d reg at %d.%hhu\n",
phy_id, dev_addr, regnum);
- return 0xffff;
+ ret = 0xffff;
+ } else {
+ ret = xgmac_read32(®s->mdio_data, endian) & 0xffff;
+ dev_dbg(&bus->dev, "read %04x\n", ret);
}
- value = xgmac_read32(®s->mdio_data, endian) & 0xffff;
- dev_dbg(&bus->dev, "read %04x\n", value);
+irq_restore:
+ if (priv->has_a009885)
+ local_irq_restore(flags);
- return value;
+ return ret;
}
static int xgmac_mdio_probe(struct platform_device *pdev)
@@ -287,6 +298,8 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
priv->is_little_endian = device_property_read_bool(&pdev->dev,
"little-endian");
+ priv->has_a009885 = device_property_read_bool(&pdev->dev,
+ "fsl,erratum-a009885");
priv->has_a011043 = device_property_read_bool(&pdev->dev,
"fsl,erratum-a011043");
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6198c722019774d38018457a8bfb9ba3ed8c931e Mon Sep 17 00:00:00 2001
From: Tobias Waldekranz <tobias(a)waldekranz.com>
Date: Tue, 18 Jan 2022 22:50:50 +0100
Subject: [PATCH] net/fsl: xgmac_mdio: Add workaround for erratum A-009885
Once an MDIO read transaction is initiated, we must read back the data
register within 16 MDC cycles after the transaction completes. Outside
of this window, reads may return corrupt data.
Therefore, disable local interrupts in the critical section, to
maximize the probability that we can satisfy this requirement.
Fixes: d55ad2967d89 ("powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan")
Signed-off-by: Tobias Waldekranz <tobias(a)waldekranz.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
index 5b8b9bcf41a2..bf566ac3195b 100644
--- a/drivers/net/ethernet/freescale/xgmac_mdio.c
+++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
@@ -51,6 +51,7 @@ struct tgec_mdio_controller {
struct mdio_fsl_priv {
struct tgec_mdio_controller __iomem *mdio_base;
bool is_little_endian;
+ bool has_a009885;
bool has_a011043;
};
@@ -186,10 +187,10 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
{
struct mdio_fsl_priv *priv = (struct mdio_fsl_priv *)bus->priv;
struct tgec_mdio_controller __iomem *regs = priv->mdio_base;
+ unsigned long flags;
uint16_t dev_addr;
uint32_t mdio_stat;
uint32_t mdio_ctl;
- uint16_t value;
int ret;
bool endian = priv->is_little_endian;
@@ -221,12 +222,18 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
return ret;
}
+ if (priv->has_a009885)
+ /* Once the operation completes, i.e. MDIO_STAT_BSY clears, we
+ * must read back the data register within 16 MDC cycles.
+ */
+ local_irq_save(flags);
+
/* Initiate the read */
xgmac_write32(mdio_ctl | MDIO_CTL_READ, ®s->mdio_ctl, endian);
ret = xgmac_wait_until_done(&bus->dev, regs, endian);
if (ret)
- return ret;
+ goto irq_restore;
/* Return all Fs if nothing was there */
if ((xgmac_read32(®s->mdio_stat, endian) & MDIO_STAT_RD_ER) &&
@@ -234,13 +241,17 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
dev_dbg(&bus->dev,
"Error while reading PHY%d reg at %d.%hhu\n",
phy_id, dev_addr, regnum);
- return 0xffff;
+ ret = 0xffff;
+ } else {
+ ret = xgmac_read32(®s->mdio_data, endian) & 0xffff;
+ dev_dbg(&bus->dev, "read %04x\n", ret);
}
- value = xgmac_read32(®s->mdio_data, endian) & 0xffff;
- dev_dbg(&bus->dev, "read %04x\n", value);
+irq_restore:
+ if (priv->has_a009885)
+ local_irq_restore(flags);
- return value;
+ return ret;
}
static int xgmac_mdio_probe(struct platform_device *pdev)
@@ -287,6 +298,8 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
priv->is_little_endian = device_property_read_bool(&pdev->dev,
"little-endian");
+ priv->has_a009885 = device_property_read_bool(&pdev->dev,
+ "fsl,erratum-a009885");
priv->has_a011043 = device_property_read_bool(&pdev->dev,
"fsl,erratum-a011043");
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6198c722019774d38018457a8bfb9ba3ed8c931e Mon Sep 17 00:00:00 2001
From: Tobias Waldekranz <tobias(a)waldekranz.com>
Date: Tue, 18 Jan 2022 22:50:50 +0100
Subject: [PATCH] net/fsl: xgmac_mdio: Add workaround for erratum A-009885
Once an MDIO read transaction is initiated, we must read back the data
register within 16 MDC cycles after the transaction completes. Outside
of this window, reads may return corrupt data.
Therefore, disable local interrupts in the critical section, to
maximize the probability that we can satisfy this requirement.
Fixes: d55ad2967d89 ("powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan")
Signed-off-by: Tobias Waldekranz <tobias(a)waldekranz.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
index 5b8b9bcf41a2..bf566ac3195b 100644
--- a/drivers/net/ethernet/freescale/xgmac_mdio.c
+++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
@@ -51,6 +51,7 @@ struct tgec_mdio_controller {
struct mdio_fsl_priv {
struct tgec_mdio_controller __iomem *mdio_base;
bool is_little_endian;
+ bool has_a009885;
bool has_a011043;
};
@@ -186,10 +187,10 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
{
struct mdio_fsl_priv *priv = (struct mdio_fsl_priv *)bus->priv;
struct tgec_mdio_controller __iomem *regs = priv->mdio_base;
+ unsigned long flags;
uint16_t dev_addr;
uint32_t mdio_stat;
uint32_t mdio_ctl;
- uint16_t value;
int ret;
bool endian = priv->is_little_endian;
@@ -221,12 +222,18 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
return ret;
}
+ if (priv->has_a009885)
+ /* Once the operation completes, i.e. MDIO_STAT_BSY clears, we
+ * must read back the data register within 16 MDC cycles.
+ */
+ local_irq_save(flags);
+
/* Initiate the read */
xgmac_write32(mdio_ctl | MDIO_CTL_READ, ®s->mdio_ctl, endian);
ret = xgmac_wait_until_done(&bus->dev, regs, endian);
if (ret)
- return ret;
+ goto irq_restore;
/* Return all Fs if nothing was there */
if ((xgmac_read32(®s->mdio_stat, endian) & MDIO_STAT_RD_ER) &&
@@ -234,13 +241,17 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
dev_dbg(&bus->dev,
"Error while reading PHY%d reg at %d.%hhu\n",
phy_id, dev_addr, regnum);
- return 0xffff;
+ ret = 0xffff;
+ } else {
+ ret = xgmac_read32(®s->mdio_data, endian) & 0xffff;
+ dev_dbg(&bus->dev, "read %04x\n", ret);
}
- value = xgmac_read32(®s->mdio_data, endian) & 0xffff;
- dev_dbg(&bus->dev, "read %04x\n", value);
+irq_restore:
+ if (priv->has_a009885)
+ local_irq_restore(flags);
- return value;
+ return ret;
}
static int xgmac_mdio_probe(struct platform_device *pdev)
@@ -287,6 +298,8 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
priv->is_little_endian = device_property_read_bool(&pdev->dev,
"little-endian");
+ priv->has_a009885 = device_property_read_bool(&pdev->dev,
+ "fsl,erratum-a009885");
priv->has_a011043 = device_property_read_bool(&pdev->dev,
"fsl,erratum-a011043");
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1e9d74660d4df625b0889e77018f9e94727ceacd Mon Sep 17 00:00:00 2001
From: Yafang Shao <laoar.shao(a)gmail.com>
Date: Sat, 8 Jan 2022 13:46:23 +0000
Subject: [PATCH] bpf: Fix mount source show for bpffs
We noticed our tc ebpf tools can't start after we upgrade our in-house kernel
version from 4.19 to 5.10. That is because of the behaviour change in bpffs
caused by commit d2935de7e4fd ("vfs: Convert bpf to use the new mount API").
In our tc ebpf tools, we do strict environment check. If the environment is
not matched, we won't allow to start the ebpf progs. One of the check is whether
bpffs is properly mounted. The mount information of bpffs in kernel-4.19 and
kernel-5.10 are as follows:
- kernel 4.19
$ mount -t bpf bpffs /sys/fs/bpf
$ mount -t bpf
bpffs on /sys/fs/bpf type bpf (rw,relatime)
- kernel 5.10
$ mount -t bpf bpffs /sys/fs/bpf
$ mount -t bpf
none on /sys/fs/bpf type bpf (rw,relatime)
The device name in kernel-5.10 is displayed as none instead of bpffs, then our
environment check fails. Currently we modify the tools to adopt to the kernel
behaviour change, but I think we'd better change the kernel code to keep the
behavior consistent.
After this change, the mount information will be displayed the same with the
behavior in kernel-4.19, for example:
$ mount -t bpf bpffs /sys/fs/bpf
$ mount -t bpf
bpffs on /sys/fs/bpf type bpf (rw,relatime)
Fixes: d2935de7e4fd ("vfs: Convert bpf to use the new mount API")
Suggested-by: Daniel Borkmann <daniel(a)iogearbox.net>
Signed-off-by: Yafang Shao <laoar.shao(a)gmail.com>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: Christian Brauner <christian.brauner(a)ubuntu.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Link: https://lore.kernel.org/bpf/20220108134623.32467-1-laoar.shao@gmail.com
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 80da1db47c68..5a8d9f7467bf 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -648,12 +648,22 @@ static int bpf_parse_param(struct fs_context *fc, struct fs_parameter *param)
int opt;
opt = fs_parse(fc, bpf_fs_parameters, param, &result);
- if (opt < 0)
+ if (opt < 0) {
/* We might like to report bad mount options here, but
* traditionally we've ignored all mount options, so we'd
* better continue to ignore non-existing options for bpf.
*/
- return opt == -ENOPARAM ? 0 : opt;
+ if (opt == -ENOPARAM) {
+ opt = vfs_parse_fs_param_source(fc, param);
+ if (opt != -ENOPARAM)
+ return opt;
+
+ return 0;
+ }
+
+ if (opt < 0)
+ return opt;
+ }
switch (opt) {
case OPT_MODE:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1e9d74660d4df625b0889e77018f9e94727ceacd Mon Sep 17 00:00:00 2001
From: Yafang Shao <laoar.shao(a)gmail.com>
Date: Sat, 8 Jan 2022 13:46:23 +0000
Subject: [PATCH] bpf: Fix mount source show for bpffs
We noticed our tc ebpf tools can't start after we upgrade our in-house kernel
version from 4.19 to 5.10. That is because of the behaviour change in bpffs
caused by commit d2935de7e4fd ("vfs: Convert bpf to use the new mount API").
In our tc ebpf tools, we do strict environment check. If the environment is
not matched, we won't allow to start the ebpf progs. One of the check is whether
bpffs is properly mounted. The mount information of bpffs in kernel-4.19 and
kernel-5.10 are as follows:
- kernel 4.19
$ mount -t bpf bpffs /sys/fs/bpf
$ mount -t bpf
bpffs on /sys/fs/bpf type bpf (rw,relatime)
- kernel 5.10
$ mount -t bpf bpffs /sys/fs/bpf
$ mount -t bpf
none on /sys/fs/bpf type bpf (rw,relatime)
The device name in kernel-5.10 is displayed as none instead of bpffs, then our
environment check fails. Currently we modify the tools to adopt to the kernel
behaviour change, but I think we'd better change the kernel code to keep the
behavior consistent.
After this change, the mount information will be displayed the same with the
behavior in kernel-4.19, for example:
$ mount -t bpf bpffs /sys/fs/bpf
$ mount -t bpf
bpffs on /sys/fs/bpf type bpf (rw,relatime)
Fixes: d2935de7e4fd ("vfs: Convert bpf to use the new mount API")
Suggested-by: Daniel Borkmann <daniel(a)iogearbox.net>
Signed-off-by: Yafang Shao <laoar.shao(a)gmail.com>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: Christian Brauner <christian.brauner(a)ubuntu.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Link: https://lore.kernel.org/bpf/20220108134623.32467-1-laoar.shao@gmail.com
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 80da1db47c68..5a8d9f7467bf 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -648,12 +648,22 @@ static int bpf_parse_param(struct fs_context *fc, struct fs_parameter *param)
int opt;
opt = fs_parse(fc, bpf_fs_parameters, param, &result);
- if (opt < 0)
+ if (opt < 0) {
/* We might like to report bad mount options here, but
* traditionally we've ignored all mount options, so we'd
* better continue to ignore non-existing options for bpf.
*/
- return opt == -ENOPARAM ? 0 : opt;
+ if (opt == -ENOPARAM) {
+ opt = vfs_parse_fs_param_source(fc, param);
+ if (opt != -ENOPARAM)
+ return opt;
+
+ return 0;
+ }
+
+ if (opt < 0)
+ return opt;
+ }
switch (opt) {
case OPT_MODE:
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e0068620e5e1779b3d5bb5649cd196e1cbf277a9 Mon Sep 17 00:00:00 2001
From: Tobias Waldekranz <tobias(a)waldekranz.com>
Date: Thu, 9 Dec 2021 23:24:24 +0100
Subject: [PATCH] net: dsa: mv88e6xxx: Add tx fwd offload PVT on intermediate
devices
In a typical mv88e6xxx switch tree like this:
CPU
| .----.
.--0--. | .--0--.
| sw0 | | | sw1 |
'-1-2-' | '-1-2-'
'---'
If sw1p{1,2} are added to a bridge that sw0p1 is not a part of, sw0
still needs to add a crosschip PVT entry for the virtual DSA device
assigned to represent the bridge.
Fixes: ce5df6894a57 ("net: dsa: mv88e6xxx: map virtual bridges with forwarding offload in the PVT")
Signed-off-by: Tobias Waldekranz <tobias(a)waldekranz.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean(a)nxp.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 45aaa7333fb8..dced1505e6bf 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2531,6 +2531,7 @@ static int mv88e6xxx_crosschip_bridge_join(struct dsa_switch *ds,
mv88e6xxx_reg_lock(chip);
err = mv88e6xxx_pvt_map(chip, sw_index, port);
+ err = err ? : mv88e6xxx_map_virtual_bridge_to_pvt(ds, bridge.num);
mv88e6xxx_reg_unlock(chip);
return err;
@@ -2546,7 +2547,8 @@ static void mv88e6xxx_crosschip_bridge_leave(struct dsa_switch *ds,
return;
mv88e6xxx_reg_lock(chip);
- if (mv88e6xxx_pvt_map(chip, sw_index, port))
+ if (mv88e6xxx_pvt_map(chip, sw_index, port) ||
+ mv88e6xxx_map_virtual_bridge_to_pvt(ds, bridge.num))
dev_err(ds->dev, "failed to remap cross-chip Port VLAN\n");
mv88e6xxx_reg_unlock(chip);
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e0068620e5e1779b3d5bb5649cd196e1cbf277a9 Mon Sep 17 00:00:00 2001
From: Tobias Waldekranz <tobias(a)waldekranz.com>
Date: Thu, 9 Dec 2021 23:24:24 +0100
Subject: [PATCH] net: dsa: mv88e6xxx: Add tx fwd offload PVT on intermediate
devices
In a typical mv88e6xxx switch tree like this:
CPU
| .----.
.--0--. | .--0--.
| sw0 | | | sw1 |
'-1-2-' | '-1-2-'
'---'
If sw1p{1,2} are added to a bridge that sw0p1 is not a part of, sw0
still needs to add a crosschip PVT entry for the virtual DSA device
assigned to represent the bridge.
Fixes: ce5df6894a57 ("net: dsa: mv88e6xxx: map virtual bridges with forwarding offload in the PVT")
Signed-off-by: Tobias Waldekranz <tobias(a)waldekranz.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean(a)nxp.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 45aaa7333fb8..dced1505e6bf 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2531,6 +2531,7 @@ static int mv88e6xxx_crosschip_bridge_join(struct dsa_switch *ds,
mv88e6xxx_reg_lock(chip);
err = mv88e6xxx_pvt_map(chip, sw_index, port);
+ err = err ? : mv88e6xxx_map_virtual_bridge_to_pvt(ds, bridge.num);
mv88e6xxx_reg_unlock(chip);
return err;
@@ -2546,7 +2547,8 @@ static void mv88e6xxx_crosschip_bridge_leave(struct dsa_switch *ds,
return;
mv88e6xxx_reg_lock(chip);
- if (mv88e6xxx_pvt_map(chip, sw_index, port))
+ if (mv88e6xxx_pvt_map(chip, sw_index, port) ||
+ mv88e6xxx_map_virtual_bridge_to_pvt(ds, bridge.num))
dev_err(ds->dev, "failed to remap cross-chip Port VLAN\n");
mv88e6xxx_reg_unlock(chip);
}
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a672b2e36a648afb04ad3bda93b6bda947a479a5 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Thu, 13 Jan 2022 11:11:30 +0000
Subject: [PATCH] bpf: Fix ringbuf memory type confusion when passing to
helpers
The bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument, and thus both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
While the non-NULL memory from bpf_ringbuf_reserve() can be passed to other
helpers, the two sinks (bpf_ringbuf_submit(), bpf_ringbuf_discard()) right now
only enforce a register type of PTR_TO_MEM.
This can lead to potential type confusion since it would allow other PTR_TO_MEM
memory to be passed into the two sinks which did not come from bpf_ringbuf_reserve().
Add a new MEM_ALLOC composable type attribute for PTR_TO_MEM, and enforce that:
- bpf_ringbuf_reserve() returns NULL or PTR_TO_MEM | MEM_ALLOC
- bpf_ringbuf_submit() and bpf_ringbuf_discard() only take PTR_TO_MEM | MEM_ALLOC
but not plain PTR_TO_MEM arguments via ARG_PTR_TO_ALLOC_MEM
- however, other helpers might treat PTR_TO_MEM | MEM_ALLOC as plain PTR_TO_MEM
to populate the memory area when they use ARG_PTR_TO_{UNINIT_,}MEM in their
func proto description
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6e947cd91152..fa517ae604ad 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -316,7 +316,12 @@ enum bpf_type_flag {
*/
MEM_RDONLY = BIT(1 + BPF_BASE_TYPE_BITS),
- __BPF_TYPE_LAST_FLAG = MEM_RDONLY,
+ /* MEM was "allocated" from a different helper, and cannot be mixed
+ * with regular non-MEM_ALLOC'ed MEM types.
+ */
+ MEM_ALLOC = BIT(2 + BPF_BASE_TYPE_BITS),
+
+ __BPF_TYPE_LAST_FLAG = MEM_ALLOC,
};
/* Max number of base types. */
@@ -400,7 +405,7 @@ enum bpf_return_type {
RET_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCKET,
RET_PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
RET_PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
- RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
+ RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
RET_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,
/* This must be the last entry. Its purpose is to ensure the enum is
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c72c57a6684f..a39eedecc93a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -570,6 +570,8 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
if (type & MEM_RDONLY)
strncpy(prefix, "rdonly_", 16);
+ if (type & MEM_ALLOC)
+ strncpy(prefix, "alloc_", 16);
snprintf(env->type_str_buf, TYPE_STR_BUF_LEN, "%s%s%s",
prefix, str[base_type(type)], postfix);
@@ -5135,6 +5137,7 @@ static const struct bpf_reg_types mem_types = {
PTR_TO_MAP_KEY,
PTR_TO_MAP_VALUE,
PTR_TO_MEM,
+ PTR_TO_MEM | MEM_ALLOC,
PTR_TO_BUF,
},
};
@@ -5152,7 +5155,7 @@ static const struct bpf_reg_types int_ptr_types = {
static const struct bpf_reg_types fullsock_types = { .types = { PTR_TO_SOCKET } };
static const struct bpf_reg_types scalar_types = { .types = { SCALAR_VALUE } };
static const struct bpf_reg_types context_types = { .types = { PTR_TO_CTX } };
-static const struct bpf_reg_types alloc_mem_types = { .types = { PTR_TO_MEM } };
+static const struct bpf_reg_types alloc_mem_types = { .types = { PTR_TO_MEM | MEM_ALLOC } };
static const struct bpf_reg_types const_map_ptr_types = { .types = { CONST_PTR_TO_MAP } };
static const struct bpf_reg_types btf_ptr_types = { .types = { PTR_TO_BTF_ID } };
static const struct bpf_reg_types spin_lock_types = { .types = { PTR_TO_MAP_VALUE } };
@@ -5315,6 +5318,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
case PTR_TO_MAP_VALUE:
case PTR_TO_MEM:
case PTR_TO_MEM | MEM_RDONLY:
+ case PTR_TO_MEM | MEM_ALLOC:
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
case PTR_TO_STACK:
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a672b2e36a648afb04ad3bda93b6bda947a479a5 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Thu, 13 Jan 2022 11:11:30 +0000
Subject: [PATCH] bpf: Fix ringbuf memory type confusion when passing to
helpers
The bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument, and thus both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
While the non-NULL memory from bpf_ringbuf_reserve() can be passed to other
helpers, the two sinks (bpf_ringbuf_submit(), bpf_ringbuf_discard()) right now
only enforce a register type of PTR_TO_MEM.
This can lead to potential type confusion since it would allow other PTR_TO_MEM
memory to be passed into the two sinks which did not come from bpf_ringbuf_reserve().
Add a new MEM_ALLOC composable type attribute for PTR_TO_MEM, and enforce that:
- bpf_ringbuf_reserve() returns NULL or PTR_TO_MEM | MEM_ALLOC
- bpf_ringbuf_submit() and bpf_ringbuf_discard() only take PTR_TO_MEM | MEM_ALLOC
but not plain PTR_TO_MEM arguments via ARG_PTR_TO_ALLOC_MEM
- however, other helpers might treat PTR_TO_MEM | MEM_ALLOC as plain PTR_TO_MEM
to populate the memory area when they use ARG_PTR_TO_{UNINIT_,}MEM in their
func proto description
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6e947cd91152..fa517ae604ad 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -316,7 +316,12 @@ enum bpf_type_flag {
*/
MEM_RDONLY = BIT(1 + BPF_BASE_TYPE_BITS),
- __BPF_TYPE_LAST_FLAG = MEM_RDONLY,
+ /* MEM was "allocated" from a different helper, and cannot be mixed
+ * with regular non-MEM_ALLOC'ed MEM types.
+ */
+ MEM_ALLOC = BIT(2 + BPF_BASE_TYPE_BITS),
+
+ __BPF_TYPE_LAST_FLAG = MEM_ALLOC,
};
/* Max number of base types. */
@@ -400,7 +405,7 @@ enum bpf_return_type {
RET_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCKET,
RET_PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
RET_PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
- RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
+ RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
RET_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,
/* This must be the last entry. Its purpose is to ensure the enum is
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c72c57a6684f..a39eedecc93a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -570,6 +570,8 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
if (type & MEM_RDONLY)
strncpy(prefix, "rdonly_", 16);
+ if (type & MEM_ALLOC)
+ strncpy(prefix, "alloc_", 16);
snprintf(env->type_str_buf, TYPE_STR_BUF_LEN, "%s%s%s",
prefix, str[base_type(type)], postfix);
@@ -5135,6 +5137,7 @@ static const struct bpf_reg_types mem_types = {
PTR_TO_MAP_KEY,
PTR_TO_MAP_VALUE,
PTR_TO_MEM,
+ PTR_TO_MEM | MEM_ALLOC,
PTR_TO_BUF,
},
};
@@ -5152,7 +5155,7 @@ static const struct bpf_reg_types int_ptr_types = {
static const struct bpf_reg_types fullsock_types = { .types = { PTR_TO_SOCKET } };
static const struct bpf_reg_types scalar_types = { .types = { SCALAR_VALUE } };
static const struct bpf_reg_types context_types = { .types = { PTR_TO_CTX } };
-static const struct bpf_reg_types alloc_mem_types = { .types = { PTR_TO_MEM } };
+static const struct bpf_reg_types alloc_mem_types = { .types = { PTR_TO_MEM | MEM_ALLOC } };
static const struct bpf_reg_types const_map_ptr_types = { .types = { CONST_PTR_TO_MAP } };
static const struct bpf_reg_types btf_ptr_types = { .types = { PTR_TO_BTF_ID } };
static const struct bpf_reg_types spin_lock_types = { .types = { PTR_TO_MAP_VALUE } };
@@ -5315,6 +5318,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
case PTR_TO_MAP_VALUE:
case PTR_TO_MEM:
case PTR_TO_MEM | MEM_RDONLY:
+ case PTR_TO_MEM | MEM_ALLOC:
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
case PTR_TO_STACK:
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a672b2e36a648afb04ad3bda93b6bda947a479a5 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Thu, 13 Jan 2022 11:11:30 +0000
Subject: [PATCH] bpf: Fix ringbuf memory type confusion when passing to
helpers
The bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument, and thus both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
While the non-NULL memory from bpf_ringbuf_reserve() can be passed to other
helpers, the two sinks (bpf_ringbuf_submit(), bpf_ringbuf_discard()) right now
only enforce a register type of PTR_TO_MEM.
This can lead to potential type confusion since it would allow other PTR_TO_MEM
memory to be passed into the two sinks which did not come from bpf_ringbuf_reserve().
Add a new MEM_ALLOC composable type attribute for PTR_TO_MEM, and enforce that:
- bpf_ringbuf_reserve() returns NULL or PTR_TO_MEM | MEM_ALLOC
- bpf_ringbuf_submit() and bpf_ringbuf_discard() only take PTR_TO_MEM | MEM_ALLOC
but not plain PTR_TO_MEM arguments via ARG_PTR_TO_ALLOC_MEM
- however, other helpers might treat PTR_TO_MEM | MEM_ALLOC as plain PTR_TO_MEM
to populate the memory area when they use ARG_PTR_TO_{UNINIT_,}MEM in their
func proto description
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6e947cd91152..fa517ae604ad 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -316,7 +316,12 @@ enum bpf_type_flag {
*/
MEM_RDONLY = BIT(1 + BPF_BASE_TYPE_BITS),
- __BPF_TYPE_LAST_FLAG = MEM_RDONLY,
+ /* MEM was "allocated" from a different helper, and cannot be mixed
+ * with regular non-MEM_ALLOC'ed MEM types.
+ */
+ MEM_ALLOC = BIT(2 + BPF_BASE_TYPE_BITS),
+
+ __BPF_TYPE_LAST_FLAG = MEM_ALLOC,
};
/* Max number of base types. */
@@ -400,7 +405,7 @@ enum bpf_return_type {
RET_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCKET,
RET_PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
RET_PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
- RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
+ RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
RET_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,
/* This must be the last entry. Its purpose is to ensure the enum is
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c72c57a6684f..a39eedecc93a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -570,6 +570,8 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
if (type & MEM_RDONLY)
strncpy(prefix, "rdonly_", 16);
+ if (type & MEM_ALLOC)
+ strncpy(prefix, "alloc_", 16);
snprintf(env->type_str_buf, TYPE_STR_BUF_LEN, "%s%s%s",
prefix, str[base_type(type)], postfix);
@@ -5135,6 +5137,7 @@ static const struct bpf_reg_types mem_types = {
PTR_TO_MAP_KEY,
PTR_TO_MAP_VALUE,
PTR_TO_MEM,
+ PTR_TO_MEM | MEM_ALLOC,
PTR_TO_BUF,
},
};
@@ -5152,7 +5155,7 @@ static const struct bpf_reg_types int_ptr_types = {
static const struct bpf_reg_types fullsock_types = { .types = { PTR_TO_SOCKET } };
static const struct bpf_reg_types scalar_types = { .types = { SCALAR_VALUE } };
static const struct bpf_reg_types context_types = { .types = { PTR_TO_CTX } };
-static const struct bpf_reg_types alloc_mem_types = { .types = { PTR_TO_MEM } };
+static const struct bpf_reg_types alloc_mem_types = { .types = { PTR_TO_MEM | MEM_ALLOC } };
static const struct bpf_reg_types const_map_ptr_types = { .types = { CONST_PTR_TO_MAP } };
static const struct bpf_reg_types btf_ptr_types = { .types = { PTR_TO_BTF_ID } };
static const struct bpf_reg_types spin_lock_types = { .types = { PTR_TO_MAP_VALUE } };
@@ -5315,6 +5318,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
case PTR_TO_MAP_VALUE:
case PTR_TO_MEM:
case PTR_TO_MEM | MEM_RDONLY:
+ case PTR_TO_MEM | MEM_ALLOC:
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
case PTR_TO_STACK:
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64620e0a1e712a778095bd35cbb277dc2259281f Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Tue, 11 Jan 2022 14:43:41 +0000
Subject: [PATCH] bpf: Fix out of bounds access for ringbuf helpers
Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument. They both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
Meaning, after a NULL check in the code, the verifier will promote the register
type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known
zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the
latter could have an offset.
The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non-
zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit()
or bpf_ringbuf_discard() but with the original offset given it will then read
out the struct bpf_ringbuf_hdr mapping.
The verifier missed to enforce a zero offset, so that out of bounds access
can be triggered which could be used to escalate privileges if unprivileged
BPF was enabled (disabled by default in kernel).
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: <tr3e.wang(a)gmail.com> (SecCoder Security Lab)
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e0b3f4d683eb..c72c57a6684f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5318,9 +5318,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
case PTR_TO_STACK:
+ /* Some of the argument types nevertheless require a
+ * zero register offset.
+ */
+ if (arg_type == ARG_PTR_TO_ALLOC_MEM)
+ goto force_off_check;
break;
/* All the rest must be rejected: */
default:
+force_off_check:
err = __check_ptr_off_reg(env, reg, regno,
type == PTR_TO_BTF_ID);
if (err < 0)
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64620e0a1e712a778095bd35cbb277dc2259281f Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Tue, 11 Jan 2022 14:43:41 +0000
Subject: [PATCH] bpf: Fix out of bounds access for ringbuf helpers
Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument. They both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
Meaning, after a NULL check in the code, the verifier will promote the register
type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known
zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the
latter could have an offset.
The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non-
zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit()
or bpf_ringbuf_discard() but with the original offset given it will then read
out the struct bpf_ringbuf_hdr mapping.
The verifier missed to enforce a zero offset, so that out of bounds access
can be triggered which could be used to escalate privileges if unprivileged
BPF was enabled (disabled by default in kernel).
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: <tr3e.wang(a)gmail.com> (SecCoder Security Lab)
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e0b3f4d683eb..c72c57a6684f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5318,9 +5318,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
case PTR_TO_STACK:
+ /* Some of the argument types nevertheless require a
+ * zero register offset.
+ */
+ if (arg_type == ARG_PTR_TO_ALLOC_MEM)
+ goto force_off_check;
break;
/* All the rest must be rejected: */
default:
+force_off_check:
err = __check_ptr_off_reg(env, reg, regno,
type == PTR_TO_BTF_ID);
if (err < 0)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6788ab23508bddb0a9d88e104284922cb2c22b77 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Mon, 10 Jan 2022 14:40:40 +0000
Subject: [PATCH] bpf: Generally fix helper register offset check
Right now the assertion on check_ptr_off_reg() is only enforced for register
types PTR_TO_CTX (and open coded also for PTR_TO_BTF_ID), however, this is
insufficient since many other PTR_TO_* register types such as PTR_TO_FUNC do
not handle/expect register offsets when passed to helper functions.
Given this can slip-through easily when adding new types, make this an explicit
allow-list and reject all other current and future types by default if this is
encountered.
Also, extend check_ptr_off_reg() to handle PTR_TO_BTF_ID as well instead of
duplicating it. For PTR_TO_BTF_ID, reg->off is used for BTF to match expected
BTF ids if struct offset is used. This part still needs to be allowed, but the
dynamic off from the tnum must be rejected.
Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ffec0baaf2b6..e0b3f4d683eb 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3969,14 +3969,15 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
}
#endif
-int check_ptr_off_reg(struct bpf_verifier_env *env,
- const struct bpf_reg_state *reg, int regno)
+static int __check_ptr_off_reg(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno,
+ bool fixed_off_ok)
{
/* Access to this pointer-typed register or passing it to a helper
* is only allowed in its original, unmodified form.
*/
- if (reg->off) {
+ if (!fixed_off_ok && reg->off) {
verbose(env, "dereference of modified %s ptr R%d off=%d disallowed\n",
reg_type_str(env, reg->type), regno, reg->off);
return -EACCES;
@@ -3994,6 +3995,12 @@ int check_ptr_off_reg(struct bpf_verifier_env *env,
return 0;
}
+int check_ptr_off_reg(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno)
+{
+ return __check_ptr_off_reg(env, reg, regno, false);
+}
+
static int __check_buffer_access(struct bpf_verifier_env *env,
const char *buf_info,
const struct bpf_reg_state *reg,
@@ -5245,12 +5252,6 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
kernel_type_name(btf_vmlinux, *arg_btf_id));
return -EACCES;
}
-
- if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
- verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
- regno);
- return -EACCES;
- }
}
return 0;
@@ -5305,10 +5306,26 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
if (err)
return err;
- if (type == PTR_TO_CTX) {
- err = check_ptr_off_reg(env, reg, regno);
+ switch ((u32)type) {
+ case SCALAR_VALUE:
+ /* Pointer types where reg offset is explicitly allowed: */
+ case PTR_TO_PACKET:
+ case PTR_TO_PACKET_META:
+ case PTR_TO_MAP_KEY:
+ case PTR_TO_MAP_VALUE:
+ case PTR_TO_MEM:
+ case PTR_TO_MEM | MEM_RDONLY:
+ case PTR_TO_BUF:
+ case PTR_TO_BUF | MEM_RDONLY:
+ case PTR_TO_STACK:
+ break;
+ /* All the rest must be rejected: */
+ default:
+ err = __check_ptr_off_reg(env, reg, regno,
+ type == PTR_TO_BTF_ID);
if (err < 0)
return err;
+ break;
}
skip_type_check:
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6788ab23508bddb0a9d88e104284922cb2c22b77 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Mon, 10 Jan 2022 14:40:40 +0000
Subject: [PATCH] bpf: Generally fix helper register offset check
Right now the assertion on check_ptr_off_reg() is only enforced for register
types PTR_TO_CTX (and open coded also for PTR_TO_BTF_ID), however, this is
insufficient since many other PTR_TO_* register types such as PTR_TO_FUNC do
not handle/expect register offsets when passed to helper functions.
Given this can slip-through easily when adding new types, make this an explicit
allow-list and reject all other current and future types by default if this is
encountered.
Also, extend check_ptr_off_reg() to handle PTR_TO_BTF_ID as well instead of
duplicating it. For PTR_TO_BTF_ID, reg->off is used for BTF to match expected
BTF ids if struct offset is used. This part still needs to be allowed, but the
dynamic off from the tnum must be rejected.
Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ffec0baaf2b6..e0b3f4d683eb 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3969,14 +3969,15 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
}
#endif
-int check_ptr_off_reg(struct bpf_verifier_env *env,
- const struct bpf_reg_state *reg, int regno)
+static int __check_ptr_off_reg(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno,
+ bool fixed_off_ok)
{
/* Access to this pointer-typed register or passing it to a helper
* is only allowed in its original, unmodified form.
*/
- if (reg->off) {
+ if (!fixed_off_ok && reg->off) {
verbose(env, "dereference of modified %s ptr R%d off=%d disallowed\n",
reg_type_str(env, reg->type), regno, reg->off);
return -EACCES;
@@ -3994,6 +3995,12 @@ int check_ptr_off_reg(struct bpf_verifier_env *env,
return 0;
}
+int check_ptr_off_reg(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno)
+{
+ return __check_ptr_off_reg(env, reg, regno, false);
+}
+
static int __check_buffer_access(struct bpf_verifier_env *env,
const char *buf_info,
const struct bpf_reg_state *reg,
@@ -5245,12 +5252,6 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
kernel_type_name(btf_vmlinux, *arg_btf_id));
return -EACCES;
}
-
- if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
- verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
- regno);
- return -EACCES;
- }
}
return 0;
@@ -5305,10 +5306,26 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
if (err)
return err;
- if (type == PTR_TO_CTX) {
- err = check_ptr_off_reg(env, reg, regno);
+ switch ((u32)type) {
+ case SCALAR_VALUE:
+ /* Pointer types where reg offset is explicitly allowed: */
+ case PTR_TO_PACKET:
+ case PTR_TO_PACKET_META:
+ case PTR_TO_MAP_KEY:
+ case PTR_TO_MAP_VALUE:
+ case PTR_TO_MEM:
+ case PTR_TO_MEM | MEM_RDONLY:
+ case PTR_TO_BUF:
+ case PTR_TO_BUF | MEM_RDONLY:
+ case PTR_TO_STACK:
+ break;
+ /* All the rest must be rejected: */
+ default:
+ err = __check_ptr_off_reg(env, reg, regno,
+ type == PTR_TO_BTF_ID);
if (err < 0)
return err;
+ break;
}
skip_type_check:
Commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices
and sfp cages") added code which finds SFP bus DT node even if the node
is disabled with status = "disabled". Because of this, when phylink is
created, it ends with non-null .sfp_bus member, even though the SFP
module is not probed (because the node is disabled).
We need to ignore disabled SFP bus node.
Fixes: ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages")
Signed-off-by: Marek Behún <kabel(a)kernel.org>
Cc: stable(a)vger.kernel.org # 2203cbf2c8b5 ("net: sfp: move fwnode parsing into sfp-bus layer")
---
drivers/net/phy/sfp-bus.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index 0c6c0d1843bc..c1512c9925a6 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -651,6 +651,11 @@ struct sfp_bus *sfp_bus_find_fwnode(struct fwnode_handle *fwnode)
else if (ret < 0)
return ERR_PTR(ret);
+ if (!fwnode_device_is_available(ref.fwnode)) {
+ fwnode_handle_put(ref.fwnode);
+ return NULL;
+ }
+
bus = sfp_bus_get(ref.fwnode);
fwnode_handle_put(ref.fwnode);
if (!bus)
--
2.34.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b89ddf4cca43f1269093942cf5c4e457fd45c335 Mon Sep 17 00:00:00 2001
From: Russell King <russell.king(a)oracle.com>
Date: Fri, 5 Nov 2021 16:50:45 +0000
Subject: [PATCH] arm64/bpf: Remove 128MB limit for BPF JIT programs
Commit 91fc957c9b1d ("arm64/bpf: don't allocate BPF JIT programs in module
memory") restricts BPF JIT program allocation to a 128MB region to ensure
BPF programs are still in branching range of each other. However this
restriction should not apply to the aarch64 JIT, since BPF_JMP | BPF_CALL
are implemented as a 64-bit move into a register and then a BLR instruction -
which has the effect of being able to call anything without proximity
limitation.
The practical reason to relax this restriction on JIT memory is that 128MB of
JIT memory can be quickly exhausted, especially where PAGE_SIZE is 64KB - one
page is needed per program. In cases where seccomp filters are applied to
multiple VMs on VM launch - such filters are classic BPF but converted to
BPF - this can severely limit the number of VMs that can be launched. In a
world where we support BPF JIT always on, turning off the JIT isn't always an
option either.
Fixes: 91fc957c9b1d ("arm64/bpf: don't allocate BPF JIT programs in module memory")
Suggested-by: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Signed-off-by: Russell King <russell.king(a)oracle.com>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Tested-by: Alan Maguire <alan.maguire(a)oracle.com>
Link: https://lore.kernel.org/bpf/1636131046-5982-2-git-send-email-alan.maguire@o…
diff --git a/arch/arm64/include/asm/extable.h b/arch/arm64/include/asm/extable.h
index 8b300dd28def..72b0e71cc3de 100644
--- a/arch/arm64/include/asm/extable.h
+++ b/arch/arm64/include/asm/extable.h
@@ -33,15 +33,6 @@ do { \
(b)->data = (tmp).data; \
} while (0)
-static inline bool in_bpf_jit(struct pt_regs *regs)
-{
- if (!IS_ENABLED(CONFIG_BPF_JIT))
- return false;
-
- return regs->pc >= BPF_JIT_REGION_START &&
- regs->pc < BPF_JIT_REGION_END;
-}
-
#ifdef CONFIG_BPF_JIT
bool ex_handler_bpf(const struct exception_table_entry *ex,
struct pt_regs *regs);
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 1b9a1e242612..0af70d9abede 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -44,11 +44,8 @@
#define _PAGE_OFFSET(va) (-(UL(1) << (va)))
#define PAGE_OFFSET (_PAGE_OFFSET(VA_BITS))
#define KIMAGE_VADDR (MODULES_END)
-#define BPF_JIT_REGION_START (_PAGE_END(VA_BITS_MIN))
-#define BPF_JIT_REGION_SIZE (SZ_128M)
-#define BPF_JIT_REGION_END (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
-#define MODULES_VADDR (BPF_JIT_REGION_END)
+#define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
#define MODULES_VSIZE (SZ_128M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 7b21213a570f..e8986e6067a9 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -994,7 +994,7 @@ static struct break_hook bug_break_hook = {
static int reserved_fault_handler(struct pt_regs *regs, unsigned int esr)
{
pr_err("%s generated an invalid instruction at %pS!\n",
- in_bpf_jit(regs) ? "BPF JIT" : "Kernel text patching",
+ "Kernel text patching",
(void *)instruction_pointer(regs));
/* We cannot handle this */
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 1c403536c9bb..9bc4066c5bf3 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -41,8 +41,6 @@ static struct addr_marker address_markers[] = {
{ 0 /* KASAN_SHADOW_START */, "Kasan shadow start" },
{ KASAN_SHADOW_END, "Kasan shadow end" },
#endif
- { BPF_JIT_REGION_START, "BPF start" },
- { BPF_JIT_REGION_END, "BPF end" },
{ MODULES_VADDR, "Modules start" },
{ MODULES_END, "Modules end" },
{ VMALLOC_START, "vmalloc() area" },
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 3a8a7140a9bf..86c9dc0681cc 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1141,15 +1141,12 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
u64 bpf_jit_alloc_exec_limit(void)
{
- return BPF_JIT_REGION_SIZE;
+ return VMALLOC_END - VMALLOC_START;
}
void *bpf_jit_alloc_exec(unsigned long size)
{
- return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
- BPF_JIT_REGION_END, GFP_KERNEL,
- PAGE_KERNEL, 0, NUMA_NO_NODE,
- __builtin_return_address(0));
+ return vmalloc(size);
}
void bpf_jit_free_exec(void *addr)
The rseq rseq_cs.ptr.{ptr32,padding} uapi endianness handling is
entirely wrong on 32-bit little endian: a preprocessor logic mistake
wrongly uses the big endian field layout on 32-bit little endian
architectures.
Fortunately, those ptr32 accessors were never used within the kernel,
and only meant as a convenience for user-space.
While working on fixing the ppc32 support in librseq [1], I made sure
all 32-bit little endian architectures stopped depending on little
endian byte ordering by using the ptr32 field. It led me to discover
this wrong ptr32 field ordering on little endian.
Because it is already exposed as a UAPI, all we can do for the existing
fields is document the wrong behavior and encourage users to use
alternative mechanisms.
Introduce a new rseq_cs.arch field with correct field ordering. Use this
opportunity to improve the layout so accesses to architecture fields on
both 32-bit and 64-bit architectures are done through the same field
hierarchy, which is much nicer than the previous scheme.
The intended use is now:
* rseq_thread_area->rseq_cs.ptr64: Access the 64-bit value of the rseq_cs
pointer. Available on all
architectures (unchanged).
* rseq_thread_area->rseq_cs.arch.ptr: Access the architecture specific
layout of the rseq_cs pointer. This
is a 32-bit field on 32-bit
architectures, and a 64-bit field on
64-bit architectures.
Link: https://git.kernel.org/pub/scm/libs/librseq/librseq.git/ [1]
Fixes: ec9c82e03a74 ("rseq: uapi: Declare rseq_cs field as union, update includes")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Florian Weimer <fw(a)deneb.enyo.de>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-api(a)vger.kernel.org
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Boqun Feng <boqun.feng(a)gmail.com>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Watson <davejwatson(a)fb.com>
Cc: Paul Turner <pjt(a)google.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Russell King <linux(a)arm.linux.org.uk>
Cc: "H . Peter Anvin" <hpa(a)zytor.com>
Cc: Andi Kleen <andi(a)firstfloor.org>
Cc: Christian Brauner <christian.brauner(a)ubuntu.com>
Cc: Ben Maurer <bmaurer(a)fb.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: Paul E. McKenney <paulmck(a)kernel.org>
---
include/uapi/linux/rseq.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 9a402fdb60e9..68f61cdb45db 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -108,6 +108,12 @@ struct rseq {
*/
union {
__u64 ptr64;
+
+ /*
+ * The "ptr" field layout is broken on little-endian
+ * 32-bit architectures due to wrong preprocessor logic.
+ * DO NOT USE.
+ */
#ifdef __LP64__
__u64 ptr;
#else
@@ -121,6 +127,23 @@ struct rseq {
#endif /* ENDIAN */
} ptr;
#endif
+
+ /*
+ * The "arch" field provides architecture accessor for
+ * the ptr field based on architecture pointer size and
+ * endianness.
+ */
+ struct {
+#ifdef __LP64__
+ __u64 ptr;
+#elif defined(__BYTE_ORDER) ? (__BYTE_ORDER == __BIG_ENDIAN) : defined(__BIG_ENDIAN)
+ __u32 padding; /* Initialized to zero. */
+ __u32 ptr;
+#else
+ __u32 ptr;
+ __u32 padding; /* Initialized to zero. */
+#endif
+ } arch;
} rseq_cs;
/*
--
2.17.1
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 244a36e50da05c33b860d20638ee4628017a5334 Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Wed, 17 Nov 2021 10:45:22 +0100
Subject: [PATCH] drm/vc4: kms: Wait for the commit before increasing our clock
rate
Several DRM/KMS atomic commits can run in parallel if they affect
different CRTC. These commits share the global HVS state, so we have
some code to make sure we run commits in sequence. This synchronization
code is one of the first thing that runs in vc4_atomic_commit_tail().
Another constraints we have is that we need to make sure the HVS clock
gets a boost during the commit. That code relies on clk_set_min_rate and
will remove the old minimum and set a new one. We also need another,
temporary, minimum for the duration of the commit.
The algorithm is thus to set a temporary minimum, drop the previous
one, do the commit, and finally set the minimum for the current mode.
However, the part that sets the temporary minimum and drops the older
one runs before the commit synchronization code.
Thus, under the proper conditions, we can end up mixing up the minimums
and ending up with the wrong one for our current step.
To avoid it, let's move the clock setup in the protected section.
Fixes: d7d96c00e585 ("drm/vc4: hvs: Boost the core clock during modeset")
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
Tested-by: Jian-Hong Pan <jhp(a)endlessos.org>
[danvet: re-apply this from 0c980a006d3f ("drm/vc4: kms: Wait for the
commit before increasing our clock rate") because I lost that part in
my merge resolution in 99b03ca651f1 ("Merge v5.16-rc5 into drm-next")]
Fixes: 99b03ca651f1 ("Merge v5.16-rc5 into drm-next")
Acked-by: Maxime Ripard <maxime(a)cerno.tech>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Link: https://lore.kernel.org/r/20211117094527.146275-2-maxime@cerno.tech
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index bf3706f97ec7..24de29bc1cda 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -365,14 +365,6 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
vc4_hvs_mask_underrun(dev, vc4_crtc_state->assigned_channel);
}
- if (vc4->hvs->hvs5) {
- unsigned long core_rate = max_t(unsigned long,
- 500000000,
- new_hvs_state->core_clock_rate);
-
- clk_set_min_rate(hvs->core_clk, core_rate);
- }
-
for (channel = 0; channel < HVS_NUM_CHANNELS; channel++) {
struct drm_crtc_commit *commit;
int ret;
@@ -392,6 +384,13 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
old_hvs_state->fifo_state[channel].pending_commit = NULL;
}
+ if (vc4->hvs->hvs5) {
+ unsigned long core_rate = max_t(unsigned long,
+ 500000000,
+ new_hvs_state->core_clock_rate);
+
+ clk_set_min_rate(hvs->core_clk, core_rate);
+ }
drm_atomic_helper_commit_modeset_disables(dev, state);
vc4_ctm_commit(vc4, state);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 244a36e50da05c33b860d20638ee4628017a5334 Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Wed, 17 Nov 2021 10:45:22 +0100
Subject: [PATCH] drm/vc4: kms: Wait for the commit before increasing our clock
rate
Several DRM/KMS atomic commits can run in parallel if they affect
different CRTC. These commits share the global HVS state, so we have
some code to make sure we run commits in sequence. This synchronization
code is one of the first thing that runs in vc4_atomic_commit_tail().
Another constraints we have is that we need to make sure the HVS clock
gets a boost during the commit. That code relies on clk_set_min_rate and
will remove the old minimum and set a new one. We also need another,
temporary, minimum for the duration of the commit.
The algorithm is thus to set a temporary minimum, drop the previous
one, do the commit, and finally set the minimum for the current mode.
However, the part that sets the temporary minimum and drops the older
one runs before the commit synchronization code.
Thus, under the proper conditions, we can end up mixing up the minimums
and ending up with the wrong one for our current step.
To avoid it, let's move the clock setup in the protected section.
Fixes: d7d96c00e585 ("drm/vc4: hvs: Boost the core clock during modeset")
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
Tested-by: Jian-Hong Pan <jhp(a)endlessos.org>
[danvet: re-apply this from 0c980a006d3f ("drm/vc4: kms: Wait for the
commit before increasing our clock rate") because I lost that part in
my merge resolution in 99b03ca651f1 ("Merge v5.16-rc5 into drm-next")]
Fixes: 99b03ca651f1 ("Merge v5.16-rc5 into drm-next")
Acked-by: Maxime Ripard <maxime(a)cerno.tech>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Link: https://lore.kernel.org/r/20211117094527.146275-2-maxime@cerno.tech
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index bf3706f97ec7..24de29bc1cda 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -365,14 +365,6 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
vc4_hvs_mask_underrun(dev, vc4_crtc_state->assigned_channel);
}
- if (vc4->hvs->hvs5) {
- unsigned long core_rate = max_t(unsigned long,
- 500000000,
- new_hvs_state->core_clock_rate);
-
- clk_set_min_rate(hvs->core_clk, core_rate);
- }
-
for (channel = 0; channel < HVS_NUM_CHANNELS; channel++) {
struct drm_crtc_commit *commit;
int ret;
@@ -392,6 +384,13 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
old_hvs_state->fifo_state[channel].pending_commit = NULL;
}
+ if (vc4->hvs->hvs5) {
+ unsigned long core_rate = max_t(unsigned long,
+ 500000000,
+ new_hvs_state->core_clock_rate);
+
+ clk_set_min_rate(hvs->core_clk, core_rate);
+ }
drm_atomic_helper_commit_modeset_disables(dev, state);
vc4_ctm_commit(vc4, state);
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e32e5723256a99c5324824503572f743377dd0fe Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Mon, 25 Oct 2021 17:28:55 +0200
Subject: [PATCH] drm/vc4: hdmi: Fix HPD GPIO detection
Prior to commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod"), in the
detect hook, if we had an HPD GPIO we would only rely on it and return
whatever state it was in.
However, that commit changed that by mistake to only consider the case
where we have a GPIO and it returns a logical high, and would fall back
to the other methods otherwise.
Since we can read the EDIDs when the HPD signal is low on some displays,
we changed the detection status from disconnected to connected, and we
would ignore an HPD pulse.
Fixes: 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod")
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
Link: https://lore.kernel.org/r/20211025152903.1088803-3-maxime@cerno.tech
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 6469645a7ad5..07a67bf6d3a8 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -170,9 +170,9 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, bool force)
WARN_ON(pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev));
- if (vc4_hdmi->hpd_gpio &&
- gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) {
- connected = true;
+ if (vc4_hdmi->hpd_gpio) {
+ if (gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio))
+ connected = true;
} else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) {
connected = true;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e32e5723256a99c5324824503572f743377dd0fe Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Mon, 25 Oct 2021 17:28:55 +0200
Subject: [PATCH] drm/vc4: hdmi: Fix HPD GPIO detection
Prior to commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod"), in the
detect hook, if we had an HPD GPIO we would only rely on it and return
whatever state it was in.
However, that commit changed that by mistake to only consider the case
where we have a GPIO and it returns a logical high, and would fall back
to the other methods otherwise.
Since we can read the EDIDs when the HPD signal is low on some displays,
we changed the detection status from disconnected to connected, and we
would ignore an HPD pulse.
Fixes: 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod")
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
Link: https://lore.kernel.org/r/20211025152903.1088803-3-maxime@cerno.tech
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 6469645a7ad5..07a67bf6d3a8 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -170,9 +170,9 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, bool force)
WARN_ON(pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev));
- if (vc4_hdmi->hpd_gpio &&
- gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) {
- connected = true;
+ if (vc4_hdmi->hpd_gpio) {
+ if (gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio))
+ connected = true;
} else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) {
connected = true;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a64ff88cb5eb0b9a6855a24ff326e948931e3a8e Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Mon, 25 Oct 2021 16:11:11 +0200
Subject: [PATCH] drm/vc4: hdmi: Check the device state in prepare()
Even though we already check that the encoder->crtc pointer is there
during in startup(), which is part of the open() path in ASoC, nothing
guarantees that our encoder state won't change between the time when we
open the device and the time we prepare it.
Move the sanity checks we do in startup() to a helper and call it from
prepare().
Link: https://lore.kernel.org/r/20211025141113.702757-8-maxime@cerno.tech
Fixes: 91e99e113929 ("drm/vc4: hdmi: Register HDMI codec")
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 71e3d1044d84..0adaa6f7eaef 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1395,20 +1395,36 @@ static inline struct vc4_hdmi *dai_to_hdmi(struct snd_soc_dai *dai)
return snd_soc_card_get_drvdata(card);
}
+static bool vc4_hdmi_audio_can_stream(struct vc4_hdmi *vc4_hdmi)
+{
+ struct drm_encoder *encoder = &vc4_hdmi->encoder.base.base;
+
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
+ /*
+ * The encoder doesn't have a CRTC until the first modeset.
+ */
+ if (!encoder->crtc)
+ return false;
+
+ /*
+ * If the encoder is currently in DVI mode, treat the codec DAI
+ * as missing.
+ */
+ if (!(HDMI_READ(HDMI_RAM_PACKET_CONFIG) & VC4_HDMI_RAM_PACKET_ENABLE))
+ return false;
+
+ return true;
+}
+
static int vc4_hdmi_audio_startup(struct device *dev, void *data)
{
struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
- struct drm_encoder *encoder = &vc4_hdmi->encoder.base.base;
unsigned long flags;
mutex_lock(&vc4_hdmi->mutex);
- /*
- * If the HDMI encoder hasn't probed, or the encoder is
- * currently in DVI mode, treat the codec dai as missing.
- */
- if (!encoder->crtc || !(HDMI_READ(HDMI_RAM_PACKET_CONFIG) &
- VC4_HDMI_RAM_PACKET_ENABLE)) {
+ if (!vc4_hdmi_audio_can_stream(vc4_hdmi)) {
mutex_unlock(&vc4_hdmi->mutex);
return -ENODEV;
}
@@ -1538,6 +1554,11 @@ static int vc4_hdmi_audio_prepare(struct device *dev, void *data,
mutex_lock(&vc4_hdmi->mutex);
+ if (!vc4_hdmi_audio_can_stream(vc4_hdmi)) {
+ mutex_unlock(&vc4_hdmi->mutex);
+ return -EINVAL;
+ }
+
vc4_hdmi_audio_set_mai_clock(vc4_hdmi, sample_rate);
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a64ff88cb5eb0b9a6855a24ff326e948931e3a8e Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Mon, 25 Oct 2021 16:11:11 +0200
Subject: [PATCH] drm/vc4: hdmi: Check the device state in prepare()
Even though we already check that the encoder->crtc pointer is there
during in startup(), which is part of the open() path in ASoC, nothing
guarantees that our encoder state won't change between the time when we
open the device and the time we prepare it.
Move the sanity checks we do in startup() to a helper and call it from
prepare().
Link: https://lore.kernel.org/r/20211025141113.702757-8-maxime@cerno.tech
Fixes: 91e99e113929 ("drm/vc4: hdmi: Register HDMI codec")
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 71e3d1044d84..0adaa6f7eaef 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1395,20 +1395,36 @@ static inline struct vc4_hdmi *dai_to_hdmi(struct snd_soc_dai *dai)
return snd_soc_card_get_drvdata(card);
}
+static bool vc4_hdmi_audio_can_stream(struct vc4_hdmi *vc4_hdmi)
+{
+ struct drm_encoder *encoder = &vc4_hdmi->encoder.base.base;
+
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
+ /*
+ * The encoder doesn't have a CRTC until the first modeset.
+ */
+ if (!encoder->crtc)
+ return false;
+
+ /*
+ * If the encoder is currently in DVI mode, treat the codec DAI
+ * as missing.
+ */
+ if (!(HDMI_READ(HDMI_RAM_PACKET_CONFIG) & VC4_HDMI_RAM_PACKET_ENABLE))
+ return false;
+
+ return true;
+}
+
static int vc4_hdmi_audio_startup(struct device *dev, void *data)
{
struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
- struct drm_encoder *encoder = &vc4_hdmi->encoder.base.base;
unsigned long flags;
mutex_lock(&vc4_hdmi->mutex);
- /*
- * If the HDMI encoder hasn't probed, or the encoder is
- * currently in DVI mode, treat the codec dai as missing.
- */
- if (!encoder->crtc || !(HDMI_READ(HDMI_RAM_PACKET_CONFIG) &
- VC4_HDMI_RAM_PACKET_ENABLE)) {
+ if (!vc4_hdmi_audio_can_stream(vc4_hdmi)) {
mutex_unlock(&vc4_hdmi->mutex);
return -ENODEV;
}
@@ -1538,6 +1554,11 @@ static int vc4_hdmi_audio_prepare(struct device *dev, void *data,
mutex_lock(&vc4_hdmi->mutex);
+ if (!vc4_hdmi_audio_can_stream(vc4_hdmi)) {
+ mutex_unlock(&vc4_hdmi->mutex);
+ return -EINVAL;
+ }
+
vc4_hdmi_audio_set_mai_clock(vc4_hdmi, sample_rate);
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 82cb88af12d29eaa5350d9ba83f9c376f65b7fec Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Mon, 25 Oct 2021 16:11:09 +0200
Subject: [PATCH] drm/vc4: hdmi: Use a mutex to prevent concurrent framework
access
The vc4 HDMI controller registers into the KMS, CEC and ALSA
frameworks.
However, no particular care is done to prevent the concurrent execution
of different framework hooks from happening at the same time.
In order to protect against that scenario, let's introduce a mutex that
relevant ALSA and KMS hooks will need to take to prevent concurrent
execution.
CEC is left out at the moment though, since the .get_modes and .detect
KMS hooks, when running cec_s_phys_addr_from_edid, can end up calling
CEC's .adap_enable hook. This introduces some reentrancy that isn't easy
to deal with properly.
The CEC hooks also don't share much state with the rest of the driver:
the registers are entirely separate, we don't share any variable, the
only thing that can conflict is the CEC clock divider setup that can be
affected by a mode set.
However, after discussing it, it looks like CEC should be able to
recover from this if it was to happen.
Link: https://lore.kernel.org/r/20211025141113.702757-6-maxime@cerno.tech
Fixes: bb7d78568814 ("drm/vc4: Add HDMI audio support")
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 42b9f4cfdc38..cea3665abcd6 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -188,6 +188,8 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, bool force)
struct vc4_hdmi *vc4_hdmi = connector_to_vc4_hdmi(connector);
bool connected = false;
+ mutex_lock(&vc4_hdmi->mutex);
+
WARN_ON(pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev));
if (vc4_hdmi->hpd_gpio) {
@@ -218,11 +220,13 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, bool force)
vc4_hdmi_enable_scrambling(&vc4_hdmi->encoder.base.base);
pm_runtime_put(&vc4_hdmi->pdev->dev);
+ mutex_unlock(&vc4_hdmi->mutex);
return connector_status_connected;
}
cec_phys_addr_invalidate(vc4_hdmi->cec_adap);
pm_runtime_put(&vc4_hdmi->pdev->dev);
+ mutex_unlock(&vc4_hdmi->mutex);
return connector_status_disconnected;
}
@@ -239,10 +243,14 @@ static int vc4_hdmi_connector_get_modes(struct drm_connector *connector)
int ret = 0;
struct edid *edid;
+ mutex_lock(&vc4_hdmi->mutex);
+
edid = drm_get_edid(connector, vc4_hdmi->ddc);
cec_s_phys_addr_from_edid(vc4_hdmi->cec_adap, edid);
- if (!edid)
- return -ENODEV;
+ if (!edid) {
+ ret = -ENODEV;
+ goto out;
+ }
vc4_encoder->hdmi_monitor = drm_detect_hdmi_monitor(edid);
@@ -262,6 +270,9 @@ static int vc4_hdmi_connector_get_modes(struct drm_connector *connector)
}
}
+out:
+ mutex_unlock(&vc4_hdmi->mutex);
+
return ret;
}
@@ -478,6 +489,8 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
union hdmi_infoframe frame;
int ret;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
connector, mode);
if (ret < 0) {
@@ -529,6 +542,8 @@ static void vc4_hdmi_set_hdr_infoframe(struct drm_encoder *encoder)
struct drm_connector_state *conn_state = connector->state;
union hdmi_infoframe frame;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
if (!vc4_hdmi->variant->supports_hdr)
return;
@@ -545,6 +560,8 @@ static void vc4_hdmi_set_infoframes(struct drm_encoder *encoder)
{
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
vc4_hdmi_set_avi_infoframe(encoder);
vc4_hdmi_set_spd_infoframe(encoder);
/*
@@ -564,6 +581,8 @@ static bool vc4_hdmi_supports_scrambling(struct drm_encoder *encoder,
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
struct drm_display_info *display = &vc4_hdmi->connector.display_info;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
if (!vc4_encoder->hdmi_monitor)
return false;
@@ -582,6 +601,8 @@ static void vc4_hdmi_enable_scrambling(struct drm_encoder *encoder)
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
unsigned long flags;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
if (!vc4_hdmi_supports_scrambling(encoder, mode))
return;
@@ -651,6 +672,8 @@ static void vc4_hdmi_encoder_post_crtc_disable(struct drm_encoder *encoder,
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
unsigned long flags;
+ mutex_lock(&vc4_hdmi->mutex);
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_RAM_PACKET_CONFIG, 0);
@@ -667,6 +690,8 @@ static void vc4_hdmi_encoder_post_crtc_disable(struct drm_encoder *encoder,
spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
vc4_hdmi_disable_scrambling(encoder);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder,
@@ -676,6 +701,8 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder,
unsigned long flags;
int ret;
+ mutex_lock(&vc4_hdmi->mutex);
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_VID_CTL,
HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_BLANKPIX);
@@ -690,6 +717,8 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder,
ret = pm_runtime_put(&vc4_hdmi->pdev->dev);
if (ret < 0)
DRM_ERROR("Failed to release power domain: %d\n", ret);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static void vc4_hdmi_encoder_disable(struct drm_encoder *encoder)
@@ -986,6 +1015,8 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
unsigned long flags;
int ret;
+ mutex_lock(&vc4_hdmi->mutex);
+
/*
* As stated in RPi's vc4 firmware "HDMI state machine (HSM) clock must
* be faster than pixel clock, infinitesimally faster, tested in
@@ -1006,13 +1037,13 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
ret = clk_set_min_rate(vc4_hdmi->hsm_clock, hsm_rate);
if (ret) {
DRM_ERROR("Failed to set HSM clock rate: %d\n", ret);
- return;
+ goto out;
}
ret = pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev);
if (ret < 0) {
DRM_ERROR("Failed to retain power domain: %d\n", ret);
- return;
+ goto out;
}
ret = clk_set_rate(vc4_hdmi->pixel_clock, pixel_rate);
@@ -1064,13 +1095,16 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
if (vc4_hdmi->variant->set_timings)
vc4_hdmi->variant->set_timings(vc4_hdmi, conn_state, mode);
+ mutex_unlock(&vc4_hdmi->mutex);
+
return;
err_disable_pixel_clock:
clk_disable_unprepare(vc4_hdmi->pixel_clock);
err_put_runtime_pm:
pm_runtime_put(&vc4_hdmi->pdev->dev);
-
+out:
+ mutex_unlock(&vc4_hdmi->mutex);
return;
}
@@ -1082,6 +1116,8 @@ static void vc4_hdmi_encoder_pre_crtc_enable(struct drm_encoder *encoder,
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
unsigned long flags;
+ mutex_lock(&vc4_hdmi->mutex);
+
if (vc4_encoder->hdmi_monitor &&
drm_default_rgb_quant_range(mode) == HDMI_QUANTIZATION_RANGE_LIMITED) {
if (vc4_hdmi->variant->csc_setup)
@@ -1098,6 +1134,8 @@ static void vc4_hdmi_encoder_pre_crtc_enable(struct drm_encoder *encoder,
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_FIFO_CTL, VC4_HDMI_FIFO_CTL_MASTER_SLAVE_N);
spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static void vc4_hdmi_encoder_post_crtc_enable(struct drm_encoder *encoder,
@@ -1111,6 +1149,8 @@ static void vc4_hdmi_encoder_post_crtc_enable(struct drm_encoder *encoder,
unsigned long flags;
int ret;
+ mutex_lock(&vc4_hdmi->mutex);
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_VID_CTL,
@@ -1170,6 +1210,8 @@ static void vc4_hdmi_encoder_post_crtc_enable(struct drm_encoder *encoder,
vc4_hdmi_recenter_fifo(vc4_hdmi);
vc4_hdmi_enable_scrambling(encoder);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static void vc4_hdmi_encoder_enable(struct drm_encoder *encoder)
@@ -1311,6 +1353,7 @@ static void vc4_hdmi_set_n_cts(struct vc4_hdmi *vc4_hdmi, unsigned int samplerat
u32 n, cts;
u64 tmp;
+ lockdep_assert_held(&vc4_hdmi->mutex);
lockdep_assert_held(&vc4_hdmi->hw_lock);
n = 128 * samplerate / 1000;
@@ -1344,13 +1387,17 @@ static int vc4_hdmi_audio_startup(struct device *dev, void *data)
struct drm_encoder *encoder = &vc4_hdmi->encoder.base.base;
unsigned long flags;
+ mutex_lock(&vc4_hdmi->mutex);
+
/*
* If the HDMI encoder hasn't probed, or the encoder is
* currently in DVI mode, treat the codec dai as missing.
*/
if (!encoder->crtc || !(HDMI_READ(HDMI_RAM_PACKET_CONFIG) &
- VC4_HDMI_RAM_PACKET_ENABLE))
+ VC4_HDMI_RAM_PACKET_ENABLE)) {
+ mutex_unlock(&vc4_hdmi->mutex);
return -ENODEV;
+ }
vc4_hdmi->audio.streaming = true;
@@ -1366,6 +1413,8 @@ static int vc4_hdmi_audio_startup(struct device *dev, void *data)
if (vc4_hdmi->variant->phy_rng_enable)
vc4_hdmi->variant->phy_rng_enable(vc4_hdmi);
+ mutex_unlock(&vc4_hdmi->mutex);
+
return 0;
}
@@ -1376,6 +1425,8 @@ static void vc4_hdmi_audio_reset(struct vc4_hdmi *vc4_hdmi)
unsigned long flags;
int ret;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
vc4_hdmi->audio.streaming = false;
ret = vc4_hdmi_stop_packet(encoder, HDMI_INFOFRAME_TYPE_AUDIO, false);
if (ret)
@@ -1395,6 +1446,8 @@ static void vc4_hdmi_audio_shutdown(struct device *dev, void *data)
struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
unsigned long flags;
+ mutex_lock(&vc4_hdmi->mutex);
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_MAI_CTL,
@@ -1409,6 +1462,8 @@ static void vc4_hdmi_audio_shutdown(struct device *dev, void *data)
vc4_hdmi->audio.streaming = false;
vc4_hdmi_audio_reset(vc4_hdmi);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static int sample_rate_to_mai_fmt(int samplerate)
@@ -1467,6 +1522,8 @@ static int vc4_hdmi_audio_prepare(struct device *dev, void *data,
dev_dbg(dev, "%s: %u Hz, %d bit, %d channels\n", __func__,
sample_rate, params->sample_width, channels);
+ mutex_lock(&vc4_hdmi->mutex);
+
vc4_hdmi_audio_set_mai_clock(vc4_hdmi, sample_rate);
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
@@ -1521,6 +1578,8 @@ static int vc4_hdmi_audio_prepare(struct device *dev, void *data,
memcpy(&vc4_hdmi->audio.infoframe, ¶ms->cea, sizeof(params->cea));
vc4_hdmi_set_audio_infoframe(encoder);
+ mutex_unlock(&vc4_hdmi->mutex);
+
return 0;
}
@@ -1571,7 +1630,9 @@ static int vc4_hdmi_audio_get_eld(struct device *dev, void *data,
struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
struct drm_connector *connector = &vc4_hdmi->connector;
+ mutex_lock(&vc4_hdmi->mutex);
memcpy(buf, connector->eld, min(sizeof(connector->eld), len));
+ mutex_unlock(&vc4_hdmi->mutex);
return 0;
}
@@ -1902,6 +1963,17 @@ static int vc4_hdmi_cec_enable(struct cec_adapter *adap)
u32 val;
int ret;
+ /*
+ * NOTE: This function should really take vc4_hdmi->mutex, but doing so
+ * results in a reentrancy since cec_s_phys_addr_from_edid() called in
+ * .detect or .get_modes might call .adap_enable, which leads to this
+ * function being called with that mutex held.
+ *
+ * Concurrency is not an issue for the moment since we don't share any
+ * state with KMS, so we can ignore the lock for now, but we need to
+ * keep it in mind if we were to change that assumption.
+ */
+
ret = pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev);
if (ret)
return ret;
@@ -1948,6 +2020,17 @@ static int vc4_hdmi_cec_disable(struct cec_adapter *adap)
struct vc4_hdmi *vc4_hdmi = cec_get_drvdata(adap);
unsigned long flags;
+ /*
+ * NOTE: This function should really take vc4_hdmi->mutex, but doing so
+ * results in a reentrancy since cec_s_phys_addr_from_edid() called in
+ * .detect or .get_modes might call .adap_enable, which leads to this
+ * function being called with that mutex held.
+ *
+ * Concurrency is not an issue for the moment since we don't share any
+ * state with KMS, so we can ignore the lock for now, but we need to
+ * keep it in mind if we were to change that assumption.
+ */
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
if (!vc4_hdmi->variant->external_irq_controller)
@@ -1976,6 +2059,17 @@ static int vc4_hdmi_cec_adap_log_addr(struct cec_adapter *adap, u8 log_addr)
struct vc4_hdmi *vc4_hdmi = cec_get_drvdata(adap);
unsigned long flags;
+ /*
+ * NOTE: This function should really take vc4_hdmi->mutex, but doing so
+ * results in a reentrancy since cec_s_phys_addr_from_edid() called in
+ * .detect or .get_modes might call .adap_enable, which leads to this
+ * function being called with that mutex held.
+ *
+ * Concurrency is not an issue for the moment since we don't share any
+ * state with KMS, so we can ignore the lock for now, but we need to
+ * keep it in mind if we were to change that assumption.
+ */
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_CEC_CNTRL_1,
(HDMI_READ(HDMI_CEC_CNTRL_1) & ~VC4_HDMI_CEC_ADDR_MASK) |
@@ -1994,6 +2088,17 @@ static int vc4_hdmi_cec_adap_transmit(struct cec_adapter *adap, u8 attempts,
u32 val;
unsigned int i;
+ /*
+ * NOTE: This function should really take vc4_hdmi->mutex, but doing so
+ * results in a reentrancy since cec_s_phys_addr_from_edid() called in
+ * .detect or .get_modes might call .adap_enable, which leads to this
+ * function being called with that mutex held.
+ *
+ * Concurrency is not an issue for the moment since we don't share any
+ * state with KMS, so we can ignore the lock for now, but we need to
+ * keep it in mind if we were to change that assumption.
+ */
+
if (msg->len > 16) {
drm_err(dev, "Attempting to transmit too much data (%d)\n", msg->len);
return -ENOMEM;
@@ -2349,6 +2454,7 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL);
if (!vc4_hdmi)
return -ENOMEM;
+ mutex_init(&vc4_hdmi->mutex);
spin_lock_init(&vc4_hdmi->hw_lock);
INIT_DELAYED_WORK(&vc4_hdmi->scrambling_work, vc4_hdmi_scrambling_wq);
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
index 006142fe8d4e..cf9bb21a8ef7 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.h
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
@@ -183,6 +183,20 @@ struct vc4_hdmi {
* @hw_lock: Spinlock protecting device register access.
*/
spinlock_t hw_lock;
+
+ /**
+ * @mutex: Mutex protecting the driver access across multiple
+ * frameworks (KMS, ALSA).
+ *
+ * NOTE: While supported, CEC has been left out since
+ * cec_s_phys_addr_from_edid() might call .adap_enable and lead to a
+ * reentrancy issue between .get_modes (or .detect) and .adap_enable.
+ * Since we don't share any state between the CEC hooks and KMS', it's
+ * not a big deal. The only trouble might come from updating the CEC
+ * clock divider which might be affected by a modeset, but CEC should
+ * be resilient to that.
+ */
+ struct mutex mutex;
};
static inline struct vc4_hdmi *
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 82cb88af12d29eaa5350d9ba83f9c376f65b7fec Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Mon, 25 Oct 2021 16:11:09 +0200
Subject: [PATCH] drm/vc4: hdmi: Use a mutex to prevent concurrent framework
access
The vc4 HDMI controller registers into the KMS, CEC and ALSA
frameworks.
However, no particular care is done to prevent the concurrent execution
of different framework hooks from happening at the same time.
In order to protect against that scenario, let's introduce a mutex that
relevant ALSA and KMS hooks will need to take to prevent concurrent
execution.
CEC is left out at the moment though, since the .get_modes and .detect
KMS hooks, when running cec_s_phys_addr_from_edid, can end up calling
CEC's .adap_enable hook. This introduces some reentrancy that isn't easy
to deal with properly.
The CEC hooks also don't share much state with the rest of the driver:
the registers are entirely separate, we don't share any variable, the
only thing that can conflict is the CEC clock divider setup that can be
affected by a mode set.
However, after discussing it, it looks like CEC should be able to
recover from this if it was to happen.
Link: https://lore.kernel.org/r/20211025141113.702757-6-maxime@cerno.tech
Fixes: bb7d78568814 ("drm/vc4: Add HDMI audio support")
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 42b9f4cfdc38..cea3665abcd6 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -188,6 +188,8 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, bool force)
struct vc4_hdmi *vc4_hdmi = connector_to_vc4_hdmi(connector);
bool connected = false;
+ mutex_lock(&vc4_hdmi->mutex);
+
WARN_ON(pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev));
if (vc4_hdmi->hpd_gpio) {
@@ -218,11 +220,13 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, bool force)
vc4_hdmi_enable_scrambling(&vc4_hdmi->encoder.base.base);
pm_runtime_put(&vc4_hdmi->pdev->dev);
+ mutex_unlock(&vc4_hdmi->mutex);
return connector_status_connected;
}
cec_phys_addr_invalidate(vc4_hdmi->cec_adap);
pm_runtime_put(&vc4_hdmi->pdev->dev);
+ mutex_unlock(&vc4_hdmi->mutex);
return connector_status_disconnected;
}
@@ -239,10 +243,14 @@ static int vc4_hdmi_connector_get_modes(struct drm_connector *connector)
int ret = 0;
struct edid *edid;
+ mutex_lock(&vc4_hdmi->mutex);
+
edid = drm_get_edid(connector, vc4_hdmi->ddc);
cec_s_phys_addr_from_edid(vc4_hdmi->cec_adap, edid);
- if (!edid)
- return -ENODEV;
+ if (!edid) {
+ ret = -ENODEV;
+ goto out;
+ }
vc4_encoder->hdmi_monitor = drm_detect_hdmi_monitor(edid);
@@ -262,6 +270,9 @@ static int vc4_hdmi_connector_get_modes(struct drm_connector *connector)
}
}
+out:
+ mutex_unlock(&vc4_hdmi->mutex);
+
return ret;
}
@@ -478,6 +489,8 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
union hdmi_infoframe frame;
int ret;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
connector, mode);
if (ret < 0) {
@@ -529,6 +542,8 @@ static void vc4_hdmi_set_hdr_infoframe(struct drm_encoder *encoder)
struct drm_connector_state *conn_state = connector->state;
union hdmi_infoframe frame;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
if (!vc4_hdmi->variant->supports_hdr)
return;
@@ -545,6 +560,8 @@ static void vc4_hdmi_set_infoframes(struct drm_encoder *encoder)
{
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
vc4_hdmi_set_avi_infoframe(encoder);
vc4_hdmi_set_spd_infoframe(encoder);
/*
@@ -564,6 +581,8 @@ static bool vc4_hdmi_supports_scrambling(struct drm_encoder *encoder,
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
struct drm_display_info *display = &vc4_hdmi->connector.display_info;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
if (!vc4_encoder->hdmi_monitor)
return false;
@@ -582,6 +601,8 @@ static void vc4_hdmi_enable_scrambling(struct drm_encoder *encoder)
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
unsigned long flags;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
if (!vc4_hdmi_supports_scrambling(encoder, mode))
return;
@@ -651,6 +672,8 @@ static void vc4_hdmi_encoder_post_crtc_disable(struct drm_encoder *encoder,
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
unsigned long flags;
+ mutex_lock(&vc4_hdmi->mutex);
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_RAM_PACKET_CONFIG, 0);
@@ -667,6 +690,8 @@ static void vc4_hdmi_encoder_post_crtc_disable(struct drm_encoder *encoder,
spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
vc4_hdmi_disable_scrambling(encoder);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder,
@@ -676,6 +701,8 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder,
unsigned long flags;
int ret;
+ mutex_lock(&vc4_hdmi->mutex);
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_VID_CTL,
HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_BLANKPIX);
@@ -690,6 +717,8 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder,
ret = pm_runtime_put(&vc4_hdmi->pdev->dev);
if (ret < 0)
DRM_ERROR("Failed to release power domain: %d\n", ret);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static void vc4_hdmi_encoder_disable(struct drm_encoder *encoder)
@@ -986,6 +1015,8 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
unsigned long flags;
int ret;
+ mutex_lock(&vc4_hdmi->mutex);
+
/*
* As stated in RPi's vc4 firmware "HDMI state machine (HSM) clock must
* be faster than pixel clock, infinitesimally faster, tested in
@@ -1006,13 +1037,13 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
ret = clk_set_min_rate(vc4_hdmi->hsm_clock, hsm_rate);
if (ret) {
DRM_ERROR("Failed to set HSM clock rate: %d\n", ret);
- return;
+ goto out;
}
ret = pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev);
if (ret < 0) {
DRM_ERROR("Failed to retain power domain: %d\n", ret);
- return;
+ goto out;
}
ret = clk_set_rate(vc4_hdmi->pixel_clock, pixel_rate);
@@ -1064,13 +1095,16 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
if (vc4_hdmi->variant->set_timings)
vc4_hdmi->variant->set_timings(vc4_hdmi, conn_state, mode);
+ mutex_unlock(&vc4_hdmi->mutex);
+
return;
err_disable_pixel_clock:
clk_disable_unprepare(vc4_hdmi->pixel_clock);
err_put_runtime_pm:
pm_runtime_put(&vc4_hdmi->pdev->dev);
-
+out:
+ mutex_unlock(&vc4_hdmi->mutex);
return;
}
@@ -1082,6 +1116,8 @@ static void vc4_hdmi_encoder_pre_crtc_enable(struct drm_encoder *encoder,
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
unsigned long flags;
+ mutex_lock(&vc4_hdmi->mutex);
+
if (vc4_encoder->hdmi_monitor &&
drm_default_rgb_quant_range(mode) == HDMI_QUANTIZATION_RANGE_LIMITED) {
if (vc4_hdmi->variant->csc_setup)
@@ -1098,6 +1134,8 @@ static void vc4_hdmi_encoder_pre_crtc_enable(struct drm_encoder *encoder,
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_FIFO_CTL, VC4_HDMI_FIFO_CTL_MASTER_SLAVE_N);
spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static void vc4_hdmi_encoder_post_crtc_enable(struct drm_encoder *encoder,
@@ -1111,6 +1149,8 @@ static void vc4_hdmi_encoder_post_crtc_enable(struct drm_encoder *encoder,
unsigned long flags;
int ret;
+ mutex_lock(&vc4_hdmi->mutex);
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_VID_CTL,
@@ -1170,6 +1210,8 @@ static void vc4_hdmi_encoder_post_crtc_enable(struct drm_encoder *encoder,
vc4_hdmi_recenter_fifo(vc4_hdmi);
vc4_hdmi_enable_scrambling(encoder);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static void vc4_hdmi_encoder_enable(struct drm_encoder *encoder)
@@ -1311,6 +1353,7 @@ static void vc4_hdmi_set_n_cts(struct vc4_hdmi *vc4_hdmi, unsigned int samplerat
u32 n, cts;
u64 tmp;
+ lockdep_assert_held(&vc4_hdmi->mutex);
lockdep_assert_held(&vc4_hdmi->hw_lock);
n = 128 * samplerate / 1000;
@@ -1344,13 +1387,17 @@ static int vc4_hdmi_audio_startup(struct device *dev, void *data)
struct drm_encoder *encoder = &vc4_hdmi->encoder.base.base;
unsigned long flags;
+ mutex_lock(&vc4_hdmi->mutex);
+
/*
* If the HDMI encoder hasn't probed, or the encoder is
* currently in DVI mode, treat the codec dai as missing.
*/
if (!encoder->crtc || !(HDMI_READ(HDMI_RAM_PACKET_CONFIG) &
- VC4_HDMI_RAM_PACKET_ENABLE))
+ VC4_HDMI_RAM_PACKET_ENABLE)) {
+ mutex_unlock(&vc4_hdmi->mutex);
return -ENODEV;
+ }
vc4_hdmi->audio.streaming = true;
@@ -1366,6 +1413,8 @@ static int vc4_hdmi_audio_startup(struct device *dev, void *data)
if (vc4_hdmi->variant->phy_rng_enable)
vc4_hdmi->variant->phy_rng_enable(vc4_hdmi);
+ mutex_unlock(&vc4_hdmi->mutex);
+
return 0;
}
@@ -1376,6 +1425,8 @@ static void vc4_hdmi_audio_reset(struct vc4_hdmi *vc4_hdmi)
unsigned long flags;
int ret;
+ lockdep_assert_held(&vc4_hdmi->mutex);
+
vc4_hdmi->audio.streaming = false;
ret = vc4_hdmi_stop_packet(encoder, HDMI_INFOFRAME_TYPE_AUDIO, false);
if (ret)
@@ -1395,6 +1446,8 @@ static void vc4_hdmi_audio_shutdown(struct device *dev, void *data)
struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
unsigned long flags;
+ mutex_lock(&vc4_hdmi->mutex);
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_MAI_CTL,
@@ -1409,6 +1462,8 @@ static void vc4_hdmi_audio_shutdown(struct device *dev, void *data)
vc4_hdmi->audio.streaming = false;
vc4_hdmi_audio_reset(vc4_hdmi);
+
+ mutex_unlock(&vc4_hdmi->mutex);
}
static int sample_rate_to_mai_fmt(int samplerate)
@@ -1467,6 +1522,8 @@ static int vc4_hdmi_audio_prepare(struct device *dev, void *data,
dev_dbg(dev, "%s: %u Hz, %d bit, %d channels\n", __func__,
sample_rate, params->sample_width, channels);
+ mutex_lock(&vc4_hdmi->mutex);
+
vc4_hdmi_audio_set_mai_clock(vc4_hdmi, sample_rate);
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
@@ -1521,6 +1578,8 @@ static int vc4_hdmi_audio_prepare(struct device *dev, void *data,
memcpy(&vc4_hdmi->audio.infoframe, ¶ms->cea, sizeof(params->cea));
vc4_hdmi_set_audio_infoframe(encoder);
+ mutex_unlock(&vc4_hdmi->mutex);
+
return 0;
}
@@ -1571,7 +1630,9 @@ static int vc4_hdmi_audio_get_eld(struct device *dev, void *data,
struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
struct drm_connector *connector = &vc4_hdmi->connector;
+ mutex_lock(&vc4_hdmi->mutex);
memcpy(buf, connector->eld, min(sizeof(connector->eld), len));
+ mutex_unlock(&vc4_hdmi->mutex);
return 0;
}
@@ -1902,6 +1963,17 @@ static int vc4_hdmi_cec_enable(struct cec_adapter *adap)
u32 val;
int ret;
+ /*
+ * NOTE: This function should really take vc4_hdmi->mutex, but doing so
+ * results in a reentrancy since cec_s_phys_addr_from_edid() called in
+ * .detect or .get_modes might call .adap_enable, which leads to this
+ * function being called with that mutex held.
+ *
+ * Concurrency is not an issue for the moment since we don't share any
+ * state with KMS, so we can ignore the lock for now, but we need to
+ * keep it in mind if we were to change that assumption.
+ */
+
ret = pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev);
if (ret)
return ret;
@@ -1948,6 +2020,17 @@ static int vc4_hdmi_cec_disable(struct cec_adapter *adap)
struct vc4_hdmi *vc4_hdmi = cec_get_drvdata(adap);
unsigned long flags;
+ /*
+ * NOTE: This function should really take vc4_hdmi->mutex, but doing so
+ * results in a reentrancy since cec_s_phys_addr_from_edid() called in
+ * .detect or .get_modes might call .adap_enable, which leads to this
+ * function being called with that mutex held.
+ *
+ * Concurrency is not an issue for the moment since we don't share any
+ * state with KMS, so we can ignore the lock for now, but we need to
+ * keep it in mind if we were to change that assumption.
+ */
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
if (!vc4_hdmi->variant->external_irq_controller)
@@ -1976,6 +2059,17 @@ static int vc4_hdmi_cec_adap_log_addr(struct cec_adapter *adap, u8 log_addr)
struct vc4_hdmi *vc4_hdmi = cec_get_drvdata(adap);
unsigned long flags;
+ /*
+ * NOTE: This function should really take vc4_hdmi->mutex, but doing so
+ * results in a reentrancy since cec_s_phys_addr_from_edid() called in
+ * .detect or .get_modes might call .adap_enable, which leads to this
+ * function being called with that mutex held.
+ *
+ * Concurrency is not an issue for the moment since we don't share any
+ * state with KMS, so we can ignore the lock for now, but we need to
+ * keep it in mind if we were to change that assumption.
+ */
+
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_CEC_CNTRL_1,
(HDMI_READ(HDMI_CEC_CNTRL_1) & ~VC4_HDMI_CEC_ADDR_MASK) |
@@ -1994,6 +2088,17 @@ static int vc4_hdmi_cec_adap_transmit(struct cec_adapter *adap, u8 attempts,
u32 val;
unsigned int i;
+ /*
+ * NOTE: This function should really take vc4_hdmi->mutex, but doing so
+ * results in a reentrancy since cec_s_phys_addr_from_edid() called in
+ * .detect or .get_modes might call .adap_enable, which leads to this
+ * function being called with that mutex held.
+ *
+ * Concurrency is not an issue for the moment since we don't share any
+ * state with KMS, so we can ignore the lock for now, but we need to
+ * keep it in mind if we were to change that assumption.
+ */
+
if (msg->len > 16) {
drm_err(dev, "Attempting to transmit too much data (%d)\n", msg->len);
return -ENOMEM;
@@ -2349,6 +2454,7 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL);
if (!vc4_hdmi)
return -ENOMEM;
+ mutex_init(&vc4_hdmi->mutex);
spin_lock_init(&vc4_hdmi->hw_lock);
INIT_DELAYED_WORK(&vc4_hdmi->scrambling_work, vc4_hdmi_scrambling_wq);
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
index 006142fe8d4e..cf9bb21a8ef7 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.h
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
@@ -183,6 +183,20 @@ struct vc4_hdmi {
* @hw_lock: Spinlock protecting device register access.
*/
spinlock_t hw_lock;
+
+ /**
+ * @mutex: Mutex protecting the driver access across multiple
+ * frameworks (KMS, ALSA).
+ *
+ * NOTE: While supported, CEC has been left out since
+ * cec_s_phys_addr_from_edid() might call .adap_enable and lead to a
+ * reentrancy issue between .get_modes (or .detect) and .adap_enable.
+ * Since we don't share any state between the CEC hooks and KMS', it's
+ * not a big deal. The only trouble might come from updating the CEC
+ * clock divider which might be affected by a modeset, but CEC should
+ * be resilient to that.
+ */
+ struct mutex mutex;
};
static inline struct vc4_hdmi *
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1fbaa02dfd05229312404aaef8bc9317b4ff8750 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Fri, 3 Dec 2021 15:19:46 -0800
Subject: [PATCH] scsi: ufs: Improve SCSI abort handling further
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba6f ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo(a)micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Reviewed-by: Bean Huo <beanhuo(a)micron.com>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5a641610dd74..06954a6e9d5d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6984,6 +6984,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
unsigned long flags;
int err = FAILED;
+ bool outstanding;
u32 reg;
WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
@@ -7061,6 +7062,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto release;
}
+ /*
+ * Clear the corresponding bit from outstanding_reqs since the command
+ * has been aborted successfully.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ outstanding = __test_and_clear_bit(tag, &hba->outstanding_reqs);
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (outstanding)
+ ufshcd_release_scsi_cmd(hba, lrbp);
+
err = SUCCESS;
release:
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1fbaa02dfd05229312404aaef8bc9317b4ff8750 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Fri, 3 Dec 2021 15:19:46 -0800
Subject: [PATCH] scsi: ufs: Improve SCSI abort handling further
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba6f ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo(a)micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Reviewed-by: Bean Huo <beanhuo(a)micron.com>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5a641610dd74..06954a6e9d5d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6984,6 +6984,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
unsigned long flags;
int err = FAILED;
+ bool outstanding;
u32 reg;
WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
@@ -7061,6 +7062,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto release;
}
+ /*
+ * Clear the corresponding bit from outstanding_reqs since the command
+ * has been aborted successfully.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ outstanding = __test_and_clear_bit(tag, &hba->outstanding_reqs);
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (outstanding)
+ ufshcd_release_scsi_cmd(hba, lrbp);
+
err = SUCCESS;
release:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1fbaa02dfd05229312404aaef8bc9317b4ff8750 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Fri, 3 Dec 2021 15:19:46 -0800
Subject: [PATCH] scsi: ufs: Improve SCSI abort handling further
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba6f ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo(a)micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Reviewed-by: Bean Huo <beanhuo(a)micron.com>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5a641610dd74..06954a6e9d5d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6984,6 +6984,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
unsigned long flags;
int err = FAILED;
+ bool outstanding;
u32 reg;
WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
@@ -7061,6 +7062,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto release;
}
+ /*
+ * Clear the corresponding bit from outstanding_reqs since the command
+ * has been aborted successfully.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ outstanding = __test_and_clear_bit(tag, &hba->outstanding_reqs);
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (outstanding)
+ ufshcd_release_scsi_cmd(hba, lrbp);
+
err = SUCCESS;
release:
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1fbaa02dfd05229312404aaef8bc9317b4ff8750 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Fri, 3 Dec 2021 15:19:46 -0800
Subject: [PATCH] scsi: ufs: Improve SCSI abort handling further
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba6f ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo(a)micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Reviewed-by: Bean Huo <beanhuo(a)micron.com>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5a641610dd74..06954a6e9d5d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6984,6 +6984,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
unsigned long flags;
int err = FAILED;
+ bool outstanding;
u32 reg;
WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
@@ -7061,6 +7062,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto release;
}
+ /*
+ * Clear the corresponding bit from outstanding_reqs since the command
+ * has been aborted successfully.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ outstanding = __test_and_clear_bit(tag, &hba->outstanding_reqs);
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (outstanding)
+ ufshcd_release_scsi_cmd(hba, lrbp);
+
err = SUCCESS;
release:
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1fbaa02dfd05229312404aaef8bc9317b4ff8750 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Fri, 3 Dec 2021 15:19:46 -0800
Subject: [PATCH] scsi: ufs: Improve SCSI abort handling further
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba6f ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo(a)micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Reviewed-by: Bean Huo <beanhuo(a)micron.com>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5a641610dd74..06954a6e9d5d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6984,6 +6984,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
unsigned long flags;
int err = FAILED;
+ bool outstanding;
u32 reg;
WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
@@ -7061,6 +7062,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto release;
}
+ /*
+ * Clear the corresponding bit from outstanding_reqs since the command
+ * has been aborted successfully.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ outstanding = __test_and_clear_bit(tag, &hba->outstanding_reqs);
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (outstanding)
+ ufshcd_release_scsi_cmd(hba, lrbp);
+
err = SUCCESS;
release:
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1fbaa02dfd05229312404aaef8bc9317b4ff8750 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Fri, 3 Dec 2021 15:19:46 -0800
Subject: [PATCH] scsi: ufs: Improve SCSI abort handling further
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba6f ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo(a)micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Reviewed-by: Bean Huo <beanhuo(a)micron.com>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5a641610dd74..06954a6e9d5d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6984,6 +6984,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
unsigned long flags;
int err = FAILED;
+ bool outstanding;
u32 reg;
WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
@@ -7061,6 +7062,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto release;
}
+ /*
+ * Clear the corresponding bit from outstanding_reqs since the command
+ * has been aborted successfully.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ outstanding = __test_and_clear_bit(tag, &hba->outstanding_reqs);
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (outstanding)
+ ufshcd_release_scsi_cmd(hba, lrbp);
+
err = SUCCESS;
release:
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1fbaa02dfd05229312404aaef8bc9317b4ff8750 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Fri, 3 Dec 2021 15:19:46 -0800
Subject: [PATCH] scsi: ufs: Improve SCSI abort handling further
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba6f ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo(a)micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Reviewed-by: Bean Huo <beanhuo(a)micron.com>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5a641610dd74..06954a6e9d5d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6984,6 +6984,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
unsigned long flags;
int err = FAILED;
+ bool outstanding;
u32 reg;
WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
@@ -7061,6 +7062,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto release;
}
+ /*
+ * Clear the corresponding bit from outstanding_reqs since the command
+ * has been aborted successfully.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ outstanding = __test_and_clear_bit(tag, &hba->outstanding_reqs);
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (outstanding)
+ ufshcd_release_scsi_cmd(hba, lrbp);
+
err = SUCCESS;
release:
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1fbaa02dfd05229312404aaef8bc9317b4ff8750 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche(a)acm.org>
Date: Fri, 3 Dec 2021 15:19:46 -0800
Subject: [PATCH] scsi: ufs: Improve SCSI abort handling further
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba6f ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo(a)micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Reviewed-by: Bean Huo <beanhuo(a)micron.com>
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5a641610dd74..06954a6e9d5d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6984,6 +6984,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
unsigned long flags;
int err = FAILED;
+ bool outstanding;
u32 reg;
WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
@@ -7061,6 +7062,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto release;
}
+ /*
+ * Clear the corresponding bit from outstanding_reqs since the command
+ * has been aborted successfully.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ outstanding = __test_and_clear_bit(tag, &hba->outstanding_reqs);
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (outstanding)
+ ufshcd_release_scsi_cmd(hba, lrbp);
+
err = SUCCESS;
release:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 20b0dfa86bef0e80b41b0e5ac38b92f23b6f27f9 Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Thu, 19 Aug 2021 15:59:30 +0200
Subject: [PATCH] drm/vc4: hdmi: Make sure the device is powered with CEC
Similarly to what we encountered with the detect hook with DRM, nothing
actually prevents any of the CEC callback from being run while the HDMI
output is disabled.
However, this is an issue since any register access to the controller
when it's powered down will result in a silent hang.
Let's make sure we run the runtime_pm hooks when the CEC adapter is
opened and closed by the userspace to avoid that issue.
Fixes: 15b4511a4af6 ("drm/vc4: add HDMI CEC support")
Reviewed-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210819135931.895976-6-maxim…
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 1f13e0f00089..8cc84395f4cb 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1741,8 +1741,14 @@ static int vc4_hdmi_cec_enable(struct cec_adapter *adap)
struct vc4_hdmi *vc4_hdmi = cec_get_drvdata(adap);
/* clock period in microseconds */
const u32 usecs = 1000000 / CEC_CLOCK_FREQ;
- u32 val = HDMI_READ(HDMI_CEC_CNTRL_5);
+ u32 val;
+ int ret;
+
+ ret = pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev);
+ if (ret)
+ return ret;
+ val = HDMI_READ(HDMI_CEC_CNTRL_5);
val &= ~(VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET |
VC4_HDMI_CEC_CNT_TO_4700_US_MASK |
VC4_HDMI_CEC_CNT_TO_4500_US_MASK);
@@ -1785,6 +1791,8 @@ static int vc4_hdmi_cec_disable(struct cec_adapter *adap)
HDMI_WRITE(HDMI_CEC_CNTRL_5, HDMI_READ(HDMI_CEC_CNTRL_5) |
VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET);
+ pm_runtime_put(&vc4_hdmi->pdev->dev);
+
return 0;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 20b0dfa86bef0e80b41b0e5ac38b92f23b6f27f9 Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Thu, 19 Aug 2021 15:59:30 +0200
Subject: [PATCH] drm/vc4: hdmi: Make sure the device is powered with CEC
Similarly to what we encountered with the detect hook with DRM, nothing
actually prevents any of the CEC callback from being run while the HDMI
output is disabled.
However, this is an issue since any register access to the controller
when it's powered down will result in a silent hang.
Let's make sure we run the runtime_pm hooks when the CEC adapter is
opened and closed by the userspace to avoid that issue.
Fixes: 15b4511a4af6 ("drm/vc4: add HDMI CEC support")
Reviewed-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210819135931.895976-6-maxim…
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 1f13e0f00089..8cc84395f4cb 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1741,8 +1741,14 @@ static int vc4_hdmi_cec_enable(struct cec_adapter *adap)
struct vc4_hdmi *vc4_hdmi = cec_get_drvdata(adap);
/* clock period in microseconds */
const u32 usecs = 1000000 / CEC_CLOCK_FREQ;
- u32 val = HDMI_READ(HDMI_CEC_CNTRL_5);
+ u32 val;
+ int ret;
+
+ ret = pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev);
+ if (ret)
+ return ret;
+ val = HDMI_READ(HDMI_CEC_CNTRL_5);
val &= ~(VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET |
VC4_HDMI_CEC_CNT_TO_4700_US_MASK |
VC4_HDMI_CEC_CNT_TO_4500_US_MASK);
@@ -1785,6 +1791,8 @@ static int vc4_hdmi_cec_disable(struct cec_adapter *adap)
HDMI_WRITE(HDMI_CEC_CNTRL_5, HDMI_READ(HDMI_CEC_CNTRL_5) |
VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET);
+ pm_runtime_put(&vc4_hdmi->pdev->dev);
+
return 0;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 20b0dfa86bef0e80b41b0e5ac38b92f23b6f27f9 Mon Sep 17 00:00:00 2001
From: Maxime Ripard <maxime(a)cerno.tech>
Date: Thu, 19 Aug 2021 15:59:30 +0200
Subject: [PATCH] drm/vc4: hdmi: Make sure the device is powered with CEC
Similarly to what we encountered with the detect hook with DRM, nothing
actually prevents any of the CEC callback from being run while the HDMI
output is disabled.
However, this is an issue since any register access to the controller
when it's powered down will result in a silent hang.
Let's make sure we run the runtime_pm hooks when the CEC adapter is
opened and closed by the userspace to avoid that issue.
Fixes: 15b4511a4af6 ("drm/vc4: add HDMI CEC support")
Reviewed-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210819135931.895976-6-maxim…
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 1f13e0f00089..8cc84395f4cb 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1741,8 +1741,14 @@ static int vc4_hdmi_cec_enable(struct cec_adapter *adap)
struct vc4_hdmi *vc4_hdmi = cec_get_drvdata(adap);
/* clock period in microseconds */
const u32 usecs = 1000000 / CEC_CLOCK_FREQ;
- u32 val = HDMI_READ(HDMI_CEC_CNTRL_5);
+ u32 val;
+ int ret;
+
+ ret = pm_runtime_resume_and_get(&vc4_hdmi->pdev->dev);
+ if (ret)
+ return ret;
+ val = HDMI_READ(HDMI_CEC_CNTRL_5);
val &= ~(VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET |
VC4_HDMI_CEC_CNT_TO_4700_US_MASK |
VC4_HDMI_CEC_CNT_TO_4500_US_MASK);
@@ -1785,6 +1791,8 @@ static int vc4_hdmi_cec_disable(struct cec_adapter *adap)
HDMI_WRITE(HDMI_CEC_CNTRL_5, HDMI_READ(HDMI_CEC_CNTRL_5) |
VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET);
+ pm_runtime_put(&vc4_hdmi->pdev->dev);
+
return 0;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 70897848730470cc477d5d89e6222c0f6a9ac173 Mon Sep 17 00:00:00 2001
From: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
Date: Tue, 30 Nov 2021 09:32:33 -0500
Subject: [PATCH] drm/amdgpu/display: Only set vblank_disable_immediate when
PSR is not enabled
[Why]
PSR currently relies on the kernel's delayed vblank on/off mechanism
as an implicit bufferring mechanism to prevent excessive entry/exit.
Without this delay the user experience is impacted since it can take
a few frames to enter/exit.
[How]
Only allow vblank disable immediate for DC when psr is not supported.
Leave a TODO indicating that this support should be extended in the
future to delay independent of the vblank interrupt.
Fixes: 92020e81ddbeac ("drm/amdgpu/display: set vblank_disable_immediate for DC")
Acked-by: Alex Deucher <alexander.deucher(a)amd.com>
Reviewed-by: Harry Wentland <harry.wentland(a)amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 05e4a0952a2b..4b76eb14a111 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1599,9 +1599,6 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
adev_to_drm(adev)->mode_config.cursor_width = adev->dm.dc->caps.max_cursor_size;
adev_to_drm(adev)->mode_config.cursor_height = adev->dm.dc->caps.max_cursor_size;
- /* Disable vblank IRQs aggressively for power-saving */
- adev_to_drm(adev)->vblank_disable_immediate = true;
-
if (drm_vblank_init(adev_to_drm(adev), adev->dm.display_indexes_num)) {
DRM_ERROR(
"amdgpu: failed to initialize sw for display support.\n");
@@ -4264,6 +4261,14 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
}
+ /*
+ * Disable vblank IRQs aggressively for power-saving.
+ *
+ * TODO: Fix vblank control helpers to delay PSR entry to allow this when PSR
+ * is also supported.
+ */
+ adev_to_drm(adev)->vblank_disable_immediate = !psr_feature_enabled;
+
/* Software is initialized. Now we can register interrupt handlers. */
switch (adev->asic_type) {
#if defined(CONFIG_DRM_AMD_DC_SI)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 70897848730470cc477d5d89e6222c0f6a9ac173 Mon Sep 17 00:00:00 2001
From: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
Date: Tue, 30 Nov 2021 09:32:33 -0500
Subject: [PATCH] drm/amdgpu/display: Only set vblank_disable_immediate when
PSR is not enabled
[Why]
PSR currently relies on the kernel's delayed vblank on/off mechanism
as an implicit bufferring mechanism to prevent excessive entry/exit.
Without this delay the user experience is impacted since it can take
a few frames to enter/exit.
[How]
Only allow vblank disable immediate for DC when psr is not supported.
Leave a TODO indicating that this support should be extended in the
future to delay independent of the vblank interrupt.
Fixes: 92020e81ddbeac ("drm/amdgpu/display: set vblank_disable_immediate for DC")
Acked-by: Alex Deucher <alexander.deucher(a)amd.com>
Reviewed-by: Harry Wentland <harry.wentland(a)amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 05e4a0952a2b..4b76eb14a111 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1599,9 +1599,6 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
adev_to_drm(adev)->mode_config.cursor_width = adev->dm.dc->caps.max_cursor_size;
adev_to_drm(adev)->mode_config.cursor_height = adev->dm.dc->caps.max_cursor_size;
- /* Disable vblank IRQs aggressively for power-saving */
- adev_to_drm(adev)->vblank_disable_immediate = true;
-
if (drm_vblank_init(adev_to_drm(adev), adev->dm.display_indexes_num)) {
DRM_ERROR(
"amdgpu: failed to initialize sw for display support.\n");
@@ -4264,6 +4261,14 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
}
+ /*
+ * Disable vblank IRQs aggressively for power-saving.
+ *
+ * TODO: Fix vblank control helpers to delay PSR entry to allow this when PSR
+ * is also supported.
+ */
+ adev_to_drm(adev)->vblank_disable_immediate = !psr_feature_enabled;
+
/* Software is initialized. Now we can register interrupt handlers. */
switch (adev->asic_type) {
#if defined(CONFIG_DRM_AMD_DC_SI)
[ Upstream commit 089b3f61ecfc43ca4ea26d595e1d31ead6de3f7b ]
Boot-on regulators are always kept on because their use_count value
is now incremented at boot time and never cleaned.
Only increment count value for alway-on regulators.
regulator_late_cleanup() is now able to power off boot-on regulators
when unused.
Fixes: 45f9c1b2e57c ("regulator: core: Clean enabling always-on regulators + their supplies")
Signed-off-by: Pascal Paillet <p.paillet(a)st.com>
Link: https://lore.kernel.org/r/20191113102737.27831-1-p.paillet@st.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Acked-by: Andre Kalb <andre.kalb(a)sma.de>
# 4.19.x
---
drivers/regulator/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 088ed4ee6d83..045075cd256c 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -1211,7 +1211,9 @@ static int set_machine_constraints(struct regulator_dev *rdev)
rdev_err(rdev, "failed to enable\n");
return ret;
}
- rdev->use_count++;
+
+ if (rdev->constraints->always_on)
+ rdev->use_count++;
}
print_constraints(rdev);
--
2.31.1
Hi,
I've been noticing this on a number of Ryzen machines that a traceback
comes up when allocating memory for ath11k. It's not a fatal failure,
as it retries with smaller slices but makes a lot of spew in the logs.
commit b9b5948cdd7bc8d9fa31c78cbbb04382c815587f ("ath11k: qmi: avoid
error messages when dma allocation fails") fixes it, can you please
bring it to 5.15.y?
Thanks,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 29009604ad4e3ef784fd9b9fef6f23610ddf633d Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Mon, 20 Dec 2021 20:50:22 +0100
Subject: [PATCH] crypto: stm32/crc32 - Fix kernel BUG triggered in probe()
The include/linux/crypto.h struct crypto_alg field cra_driver_name description
states "Unique name of the transformation provider. " ... " this contains the
name of the chip or provider and the name of the transformation algorithm."
In case of the stm32-crc driver, field cra_driver_name is identical for all
registered transformation providers and set to the name of the driver itself,
which is incorrect. This patch fixes it by assigning a unique cra_driver_name
to each registered transformation provider.
The kernel crash is triggered when the driver calls crypto_register_shashes()
which calls crypto_register_shash(), which calls crypto_register_alg(), which
calls __crypto_register_alg(), which returns -EEXIST, which is propagated
back through this call chain. Upon -EEXIST from crypto_register_shash(), the
crypto_register_shashes() starts unregistering the providers back, and calls
crypto_unregister_shash(), which calls crypto_unregister_alg(), and this is
where the BUG() triggers due to incorrect cra_refcnt.
Fixes: b51dbe90912a ("crypto: stm32 - Support for STM32 CRC32 crypto module")
Signed-off-by: Marek Vasut <marex(a)denx.de>
Cc: <stable(a)vger.kernel.org> # 4.12+
Cc: Alexandre Torgue <alexandre.torgue(a)foss.st.com>
Cc: Fabien Dessenne <fabien.dessenne(a)st.com>
Cc: Herbert Xu <herbert(a)gondor.apana.org.au>
Cc: Lionel Debieve <lionel.debieve(a)st.com>
Cc: Nicolas Toromanoff <nicolas.toromanoff(a)st.com>
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-stm32(a)st-md-mailman.stormreply.com
To: linux-crypto(a)vger.kernel.org
Acked-by: Nicolas Toromanoff <nicolas.toromanoff(a)foss.st.com>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/drivers/crypto/stm32/stm32-crc32.c b/drivers/crypto/stm32/stm32-crc32.c
index 75867c0b0017..be1bf39a317d 100644
--- a/drivers/crypto/stm32/stm32-crc32.c
+++ b/drivers/crypto/stm32/stm32-crc32.c
@@ -279,7 +279,7 @@ static struct shash_alg algs[] = {
.digestsize = CHKSUM_DIGEST_SIZE,
.base = {
.cra_name = "crc32",
- .cra_driver_name = DRIVER_NAME,
+ .cra_driver_name = "stm32-crc32-crc32",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.cra_blocksize = CHKSUM_BLOCK_SIZE,
@@ -301,7 +301,7 @@ static struct shash_alg algs[] = {
.digestsize = CHKSUM_DIGEST_SIZE,
.base = {
.cra_name = "crc32c",
- .cra_driver_name = DRIVER_NAME,
+ .cra_driver_name = "stm32-crc32-crc32c",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.cra_blocksize = CHKSUM_BLOCK_SIZE,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0878355b51f5f26632e652c848a8e174bb02d22d Mon Sep 17 00:00:00 2001
From: Nikita Yushchenko <nikita.yushchenko(a)virtuozzo.com>
Date: Sun, 9 Jan 2022 18:34:59 +0300
Subject: [PATCH] tracing/osnoise: Properly unhook events if
start_per_cpu_kthreads() fails
If start_per_cpu_kthreads() called from osnoise_workload_start() returns
error, event hooks are left in broken state: unhook_irq_events() called
but unhook_thread_events() and unhook_softirq_events() not called, and
trace_osnoise_callback_enabled flag not cleared.
On the next tracer enable, hooks get not installed due to
trace_osnoise_callback_enabled flag.
And on the further tracer disable an attempt to remove non-installed
hooks happened, hitting a WARN_ON_ONCE() in tracepoint_remove_func().
Fix the error path by adding the missing part of cleanup.
While at this, introduce osnoise_unhook_events() to avoid code
duplication between this error path and normal tracer disable.
Link: https://lkml.kernel.org/r/20220109153459.3701773-1-nikita.yushchenko@virtuo…
Cc: stable(a)vger.kernel.org
Fixes: bce29ac9ce0b ("trace: Add osnoise tracer")
Acked-by: Daniel Bristot de Oliveira <bristot(a)kernel.org>
Signed-off-by: Nikita Yushchenko <nikita.yushchenko(a)virtuozzo.com>
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 4719a848bf17..36d9d5be08b4 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -2122,6 +2122,13 @@ static int osnoise_hook_events(void)
return -EINVAL;
}
+static void osnoise_unhook_events(void)
+{
+ unhook_thread_events();
+ unhook_softirq_events();
+ unhook_irq_events();
+}
+
/*
* osnoise_workload_start - start the workload and hook to events
*/
@@ -2154,7 +2161,14 @@ static int osnoise_workload_start(void)
retval = start_per_cpu_kthreads();
if (retval) {
- unhook_irq_events();
+ trace_osnoise_callback_enabled = false;
+ /*
+ * Make sure that ftrace_nmi_enter/exit() see
+ * trace_osnoise_callback_enabled as false before continuing.
+ */
+ barrier();
+
+ osnoise_unhook_events();
return retval;
}
@@ -2185,9 +2199,7 @@ static void osnoise_workload_stop(void)
stop_per_cpu_kthreads();
- unhook_irq_events();
- unhook_softirq_events();
- unhook_thread_events();
+ osnoise_unhook_events();
}
static void osnoise_tracer_start(struct trace_array *tr)
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c6617c61e8fe44b9e9fdfede921f61cac6b5149d Mon Sep 17 00:00:00 2001
From: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Date: Mon, 17 Jan 2022 16:05:40 +0100
Subject: [PATCH] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN
Commit feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN")
forbade changing CPUID altogether but unfortunately this is not fully
compatible with existing VMMs. In particular, QEMU reuses vCPU fds for
CPU hotplug after unplug and it calls KVM_SET_CPUID2. Instead of full ban,
check whether the supplied CPUID data is equal to what was previously set.
Reported-by: Igor Mammedov <imammedo(a)redhat.com>
Fixes: feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN")
Signed-off-by: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Message-Id: <20220117150542.2176196-3-vkuznets(a)redhat.com>
Cc: stable(a)vger.kernel.org
[Do not call kvm_find_cpuid_entry repeatedly. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 812190a707f6..7eb046d907c6 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -119,6 +119,28 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
return fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu, xfeatures);
}
+/* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
+static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
+ int nent)
+{
+ struct kvm_cpuid_entry2 *orig;
+ int i;
+
+ if (nent != vcpu->arch.cpuid_nent)
+ return -EINVAL;
+
+ for (i = 0; i < nent; i++) {
+ orig = &vcpu->arch.cpuid_entries[i];
+ if (e2[i].function != orig->function ||
+ e2[i].index != orig->index ||
+ e2[i].eax != orig->eax || e2[i].ebx != orig->ebx ||
+ e2[i].ecx != orig->ecx || e2[i].edx != orig->edx)
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static void kvm_update_kvm_cpuid_base(struct kvm_vcpu *vcpu)
{
u32 function;
@@ -313,6 +335,20 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
__kvm_update_cpuid_runtime(vcpu, e2, nent);
+ /*
+ * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
+ * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
+ * tracked in kvm_mmu_page_role. As a result, KVM may miss guest page
+ * faults due to reusing SPs/SPTEs. In practice no sane VMM mucks with
+ * the core vCPU model on the fly. It would've been better to forbid any
+ * KVM_SET_CPUID{,2} calls after KVM_RUN altogether but unfortunately
+ * some VMMs (e.g. QEMU) reuse vCPU fds for CPU hotplug/unplug and do
+ * KVM_SET_CPUID{,2} again. To support this legacy behavior, check
+ * whether the supplied CPUID data is equal to what's already set.
+ */
+ if (vcpu->arch.last_vmentry_cpu != -1)
+ return kvm_cpuid_check_equal(vcpu, e2, nent);
+
r = kvm_check_cpuid(vcpu, e2, nent);
if (r)
return r;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 60da2331ec32..a0bc637348b2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5230,17 +5230,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
struct kvm_cpuid __user *cpuid_arg = argp;
struct kvm_cpuid cpuid;
- /*
- * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
- * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
- * tracked in kvm_mmu_page_role. As a result, KVM may miss guest page
- * faults due to reusing SPs/SPTEs. In practice no sane VMM mucks with
- * the core vCPU model on the fly, so fail.
- */
- r = -EINVAL;
- if (vcpu->arch.last_vmentry_cpu != -1)
- goto out;
-
r = -EFAULT;
if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
@@ -5251,14 +5240,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
struct kvm_cpuid2 __user *cpuid_arg = argp;
struct kvm_cpuid2 cpuid;
- /*
- * KVM_SET_CPUID{,2} after KVM_RUN is forbidded, see the comment in
- * KVM_SET_CPUID case above.
- */
- r = -EINVAL;
- if (vcpu->arch.last_vmentry_cpu != -1)
- goto out;
-
r = -EFAULT;
if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
Dzień dobry,
Czy rozważali Państwo montaż systemu fotowoltaicznego?
Instalacja fotowoltaiczna jest najlepszym sposobem na obniżenie wysokości rachunków za prąd (pozostają tylko opłaty stałe) i zabezpieczenie się przed rosnącymi cenami energii elektrycznej. Jest to w pełni odnawialne i bezemisyjne źródło energii, dzięki czemu przyczyniamy się do ochrony środowiska naturalnego.
Działamy od wielu lat na rynku energetycznym. Przygotujemy projekt, wycenę oraz kompleksowo wykonamy i zgłosimy realizację do zakładu energetycznego.
Czy chcą Państwo poznać naszą propozycję?
Pozdrawiam,
Przemysław Wróblewski
Injecting an exception into a guest with non-VHE is risky business.
Instead of writing in the shadow register for the switch code to
restore it, we override the CPU register instead. Which gets
overriden a few instructions later by said restore code.
The result is that although the guest correctly gets the exception,
it will return to the original context in some random state,
depending on what was there the first place... Boo.
Fix the issue by writing to the shadow register. The original code
is absolutely fine on VHE, as the state is already loaded, and writing
to the shadow register in that case would actually be a bug.
Fixes: bb666c472ca2 ("KVM: arm64: Inject AArch64 exceptions from HYP")
Cc: stable(a)vger.kernel.org
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
---
arch/arm64/kvm/hyp/exception.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index 0418399e0a20..c5d009715402 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -38,7 +38,10 @@ static inline void __vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
static void __vcpu_write_spsr(struct kvm_vcpu *vcpu, u64 val)
{
- write_sysreg_el1(val, SYS_SPSR);
+ if (has_vhe())
+ write_sysreg_el1(val, SYS_SPSR);
+ else
+ __vcpu_sys_reg(vcpu, SPSR_EL1) = val;
}
static void __vcpu_write_spsr_abt(struct kvm_vcpu *vcpu, u64 val)
--
2.34.1
ATTN: BENEFICIARY
OVER-DUE PAYMENT RELEASED. FINALLY, YOUR FUND IS NOW HERE IN THE
Nigeria AWAITING TO BE SHIPPED TO YOUR HOME ADDRESS. THIS IS THREE
TIMES I HAVE SENT YOU THIS THE SAME MESSAGE AND TODAY IF YOU, FAIL TO
COMPLY WITHIN THREE DAYS YOU WILL LOSE YOUR FUND FOREVER.YOUR FUND
APPROVED PAYMENT OF $27,000,000.00 (TWENTY SEVEN MILLION
DOLLARS THROUGH ATM CARD)
This is to inform you that three gentle men came to my office with
all the claim documents this morning at about (10:15amStandard Pacific
Time), with a letter claiming to be from you.You are advise to
contact this office immediately with your full contact details. EMAIL
: Mr.RobertFerguson(a)aeiou.pt
Regards
Mr. Robert Ferguson
Admin Chairman
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5c48a7df91499e371ef725895b2e2d21a126e227 Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Sat, 25 Dec 2021 17:09:37 +0800
Subject: [PATCH] ext4: fix an use-after-free issue about data=journal
writeback mode
Our syzkaller report an use-after-free issue that accessing the freed
buffer_head on the writeback page in __ext4_journalled_writepage(). The
problem is that if there was a truncate racing with the data=journalled
writeback procedure, the writeback length could become zero and
bget_one() refuse to get buffer_head's refcount, then the truncate
procedure release buffer once we drop page lock, finally, the last
ext4_walk_page_buffers() trigger the use-after-free problem.
sync truncate
ext4_sync_file()
file_write_and_wait_range()
ext4_setattr(0)
inode->i_size = 0
ext4_writepage()
len = 0
__ext4_journalled_writepage()
page_bufs = page_buffers(page)
ext4_walk_page_buffers(bget_one) <- does not get refcount
do_invalidatepage()
free_buffer_head()
ext4_walk_page_buffers(page_bufs) <- trigger use-after-free
After commit bdf96838aea6 ("ext4: fix race between truncate and
__ext4_journalled_writepage()"), we have already handled the racing
case, so the bget_one() and bput_one() are not needed. So this patch
simply remove these hunk, and recheck the i_size to make it safe.
Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bca9951634d9..68070f34f0cf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1845,30 +1845,16 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
return 0;
}
-static int bget_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- get_bh(bh);
- return 0;
-}
-
-static int bput_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- put_bh(bh);
- return 0;
-}
-
static int __ext4_journalled_writepage(struct page *page,
unsigned int len)
{
struct address_space *mapping = page->mapping;
struct inode *inode = mapping->host;
- struct buffer_head *page_bufs = NULL;
handle_t *handle = NULL;
int ret = 0, err = 0;
int inline_data = ext4_has_inline_data(inode);
struct buffer_head *inode_bh = NULL;
+ loff_t size;
ClearPageChecked(page);
@@ -1878,14 +1864,6 @@ static int __ext4_journalled_writepage(struct page *page,
inode_bh = ext4_journalled_write_inline_data(inode, len, page);
if (inode_bh == NULL)
goto out;
- } else {
- page_bufs = page_buffers(page);
- if (!page_bufs) {
- BUG();
- goto out;
- }
- ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
- NULL, bget_one);
}
/*
* We need to release the page lock before we start the
@@ -1906,7 +1884,8 @@ static int __ext4_journalled_writepage(struct page *page,
lock_page(page);
put_page(page);
- if (page->mapping != mapping) {
+ size = i_size_read(inode);
+ if (page->mapping != mapping || page_offset(page) > size) {
/* The page got truncated from under us */
ext4_journal_stop(handle);
ret = 0;
@@ -1916,6 +1895,13 @@ static int __ext4_journalled_writepage(struct page *page,
if (inline_data) {
ret = ext4_mark_inode_dirty(handle, inode);
} else {
+ struct buffer_head *page_bufs = page_buffers(page);
+
+ if (page->index == size >> PAGE_SHIFT)
+ len = size & ~PAGE_MASK;
+ else
+ len = PAGE_SIZE;
+
ret = ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
NULL, do_journal_get_write_access);
@@ -1936,9 +1922,6 @@ static int __ext4_journalled_writepage(struct page *page,
out:
unlock_page(page);
out_no_pagelock:
- if (!inline_data && page_bufs)
- ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len,
- NULL, bput_one);
brelse(inode_bh);
return ret;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5c48a7df91499e371ef725895b2e2d21a126e227 Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Sat, 25 Dec 2021 17:09:37 +0800
Subject: [PATCH] ext4: fix an use-after-free issue about data=journal
writeback mode
Our syzkaller report an use-after-free issue that accessing the freed
buffer_head on the writeback page in __ext4_journalled_writepage(). The
problem is that if there was a truncate racing with the data=journalled
writeback procedure, the writeback length could become zero and
bget_one() refuse to get buffer_head's refcount, then the truncate
procedure release buffer once we drop page lock, finally, the last
ext4_walk_page_buffers() trigger the use-after-free problem.
sync truncate
ext4_sync_file()
file_write_and_wait_range()
ext4_setattr(0)
inode->i_size = 0
ext4_writepage()
len = 0
__ext4_journalled_writepage()
page_bufs = page_buffers(page)
ext4_walk_page_buffers(bget_one) <- does not get refcount
do_invalidatepage()
free_buffer_head()
ext4_walk_page_buffers(page_bufs) <- trigger use-after-free
After commit bdf96838aea6 ("ext4: fix race between truncate and
__ext4_journalled_writepage()"), we have already handled the racing
case, so the bget_one() and bput_one() are not needed. So this patch
simply remove these hunk, and recheck the i_size to make it safe.
Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bca9951634d9..68070f34f0cf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1845,30 +1845,16 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
return 0;
}
-static int bget_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- get_bh(bh);
- return 0;
-}
-
-static int bput_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- put_bh(bh);
- return 0;
-}
-
static int __ext4_journalled_writepage(struct page *page,
unsigned int len)
{
struct address_space *mapping = page->mapping;
struct inode *inode = mapping->host;
- struct buffer_head *page_bufs = NULL;
handle_t *handle = NULL;
int ret = 0, err = 0;
int inline_data = ext4_has_inline_data(inode);
struct buffer_head *inode_bh = NULL;
+ loff_t size;
ClearPageChecked(page);
@@ -1878,14 +1864,6 @@ static int __ext4_journalled_writepage(struct page *page,
inode_bh = ext4_journalled_write_inline_data(inode, len, page);
if (inode_bh == NULL)
goto out;
- } else {
- page_bufs = page_buffers(page);
- if (!page_bufs) {
- BUG();
- goto out;
- }
- ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
- NULL, bget_one);
}
/*
* We need to release the page lock before we start the
@@ -1906,7 +1884,8 @@ static int __ext4_journalled_writepage(struct page *page,
lock_page(page);
put_page(page);
- if (page->mapping != mapping) {
+ size = i_size_read(inode);
+ if (page->mapping != mapping || page_offset(page) > size) {
/* The page got truncated from under us */
ext4_journal_stop(handle);
ret = 0;
@@ -1916,6 +1895,13 @@ static int __ext4_journalled_writepage(struct page *page,
if (inline_data) {
ret = ext4_mark_inode_dirty(handle, inode);
} else {
+ struct buffer_head *page_bufs = page_buffers(page);
+
+ if (page->index == size >> PAGE_SHIFT)
+ len = size & ~PAGE_MASK;
+ else
+ len = PAGE_SIZE;
+
ret = ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
NULL, do_journal_get_write_access);
@@ -1936,9 +1922,6 @@ static int __ext4_journalled_writepage(struct page *page,
out:
unlock_page(page);
out_no_pagelock:
- if (!inline_data && page_bufs)
- ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len,
- NULL, bput_one);
brelse(inode_bh);
return ret;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5c48a7df91499e371ef725895b2e2d21a126e227 Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Sat, 25 Dec 2021 17:09:37 +0800
Subject: [PATCH] ext4: fix an use-after-free issue about data=journal
writeback mode
Our syzkaller report an use-after-free issue that accessing the freed
buffer_head on the writeback page in __ext4_journalled_writepage(). The
problem is that if there was a truncate racing with the data=journalled
writeback procedure, the writeback length could become zero and
bget_one() refuse to get buffer_head's refcount, then the truncate
procedure release buffer once we drop page lock, finally, the last
ext4_walk_page_buffers() trigger the use-after-free problem.
sync truncate
ext4_sync_file()
file_write_and_wait_range()
ext4_setattr(0)
inode->i_size = 0
ext4_writepage()
len = 0
__ext4_journalled_writepage()
page_bufs = page_buffers(page)
ext4_walk_page_buffers(bget_one) <- does not get refcount
do_invalidatepage()
free_buffer_head()
ext4_walk_page_buffers(page_bufs) <- trigger use-after-free
After commit bdf96838aea6 ("ext4: fix race between truncate and
__ext4_journalled_writepage()"), we have already handled the racing
case, so the bget_one() and bput_one() are not needed. So this patch
simply remove these hunk, and recheck the i_size to make it safe.
Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bca9951634d9..68070f34f0cf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1845,30 +1845,16 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
return 0;
}
-static int bget_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- get_bh(bh);
- return 0;
-}
-
-static int bput_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- put_bh(bh);
- return 0;
-}
-
static int __ext4_journalled_writepage(struct page *page,
unsigned int len)
{
struct address_space *mapping = page->mapping;
struct inode *inode = mapping->host;
- struct buffer_head *page_bufs = NULL;
handle_t *handle = NULL;
int ret = 0, err = 0;
int inline_data = ext4_has_inline_data(inode);
struct buffer_head *inode_bh = NULL;
+ loff_t size;
ClearPageChecked(page);
@@ -1878,14 +1864,6 @@ static int __ext4_journalled_writepage(struct page *page,
inode_bh = ext4_journalled_write_inline_data(inode, len, page);
if (inode_bh == NULL)
goto out;
- } else {
- page_bufs = page_buffers(page);
- if (!page_bufs) {
- BUG();
- goto out;
- }
- ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
- NULL, bget_one);
}
/*
* We need to release the page lock before we start the
@@ -1906,7 +1884,8 @@ static int __ext4_journalled_writepage(struct page *page,
lock_page(page);
put_page(page);
- if (page->mapping != mapping) {
+ size = i_size_read(inode);
+ if (page->mapping != mapping || page_offset(page) > size) {
/* The page got truncated from under us */
ext4_journal_stop(handle);
ret = 0;
@@ -1916,6 +1895,13 @@ static int __ext4_journalled_writepage(struct page *page,
if (inline_data) {
ret = ext4_mark_inode_dirty(handle, inode);
} else {
+ struct buffer_head *page_bufs = page_buffers(page);
+
+ if (page->index == size >> PAGE_SHIFT)
+ len = size & ~PAGE_MASK;
+ else
+ len = PAGE_SIZE;
+
ret = ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
NULL, do_journal_get_write_access);
@@ -1936,9 +1922,6 @@ static int __ext4_journalled_writepage(struct page *page,
out:
unlock_page(page);
out_no_pagelock:
- if (!inline_data && page_bufs)
- ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len,
- NULL, bput_one);
brelse(inode_bh);
return ret;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5c48a7df91499e371ef725895b2e2d21a126e227 Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Sat, 25 Dec 2021 17:09:37 +0800
Subject: [PATCH] ext4: fix an use-after-free issue about data=journal
writeback mode
Our syzkaller report an use-after-free issue that accessing the freed
buffer_head on the writeback page in __ext4_journalled_writepage(). The
problem is that if there was a truncate racing with the data=journalled
writeback procedure, the writeback length could become zero and
bget_one() refuse to get buffer_head's refcount, then the truncate
procedure release buffer once we drop page lock, finally, the last
ext4_walk_page_buffers() trigger the use-after-free problem.
sync truncate
ext4_sync_file()
file_write_and_wait_range()
ext4_setattr(0)
inode->i_size = 0
ext4_writepage()
len = 0
__ext4_journalled_writepage()
page_bufs = page_buffers(page)
ext4_walk_page_buffers(bget_one) <- does not get refcount
do_invalidatepage()
free_buffer_head()
ext4_walk_page_buffers(page_bufs) <- trigger use-after-free
After commit bdf96838aea6 ("ext4: fix race between truncate and
__ext4_journalled_writepage()"), we have already handled the racing
case, so the bget_one() and bput_one() are not needed. So this patch
simply remove these hunk, and recheck the i_size to make it safe.
Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bca9951634d9..68070f34f0cf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1845,30 +1845,16 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
return 0;
}
-static int bget_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- get_bh(bh);
- return 0;
-}
-
-static int bput_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- put_bh(bh);
- return 0;
-}
-
static int __ext4_journalled_writepage(struct page *page,
unsigned int len)
{
struct address_space *mapping = page->mapping;
struct inode *inode = mapping->host;
- struct buffer_head *page_bufs = NULL;
handle_t *handle = NULL;
int ret = 0, err = 0;
int inline_data = ext4_has_inline_data(inode);
struct buffer_head *inode_bh = NULL;
+ loff_t size;
ClearPageChecked(page);
@@ -1878,14 +1864,6 @@ static int __ext4_journalled_writepage(struct page *page,
inode_bh = ext4_journalled_write_inline_data(inode, len, page);
if (inode_bh == NULL)
goto out;
- } else {
- page_bufs = page_buffers(page);
- if (!page_bufs) {
- BUG();
- goto out;
- }
- ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
- NULL, bget_one);
}
/*
* We need to release the page lock before we start the
@@ -1906,7 +1884,8 @@ static int __ext4_journalled_writepage(struct page *page,
lock_page(page);
put_page(page);
- if (page->mapping != mapping) {
+ size = i_size_read(inode);
+ if (page->mapping != mapping || page_offset(page) > size) {
/* The page got truncated from under us */
ext4_journal_stop(handle);
ret = 0;
@@ -1916,6 +1895,13 @@ static int __ext4_journalled_writepage(struct page *page,
if (inline_data) {
ret = ext4_mark_inode_dirty(handle, inode);
} else {
+ struct buffer_head *page_bufs = page_buffers(page);
+
+ if (page->index == size >> PAGE_SHIFT)
+ len = size & ~PAGE_MASK;
+ else
+ len = PAGE_SIZE;
+
ret = ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
NULL, do_journal_get_write_access);
@@ -1936,9 +1922,6 @@ static int __ext4_journalled_writepage(struct page *page,
out:
unlock_page(page);
out_no_pagelock:
- if (!inline_data && page_bufs)
- ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len,
- NULL, bput_one);
brelse(inode_bh);
return ret;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5c48a7df91499e371ef725895b2e2d21a126e227 Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Sat, 25 Dec 2021 17:09:37 +0800
Subject: [PATCH] ext4: fix an use-after-free issue about data=journal
writeback mode
Our syzkaller report an use-after-free issue that accessing the freed
buffer_head on the writeback page in __ext4_journalled_writepage(). The
problem is that if there was a truncate racing with the data=journalled
writeback procedure, the writeback length could become zero and
bget_one() refuse to get buffer_head's refcount, then the truncate
procedure release buffer once we drop page lock, finally, the last
ext4_walk_page_buffers() trigger the use-after-free problem.
sync truncate
ext4_sync_file()
file_write_and_wait_range()
ext4_setattr(0)
inode->i_size = 0
ext4_writepage()
len = 0
__ext4_journalled_writepage()
page_bufs = page_buffers(page)
ext4_walk_page_buffers(bget_one) <- does not get refcount
do_invalidatepage()
free_buffer_head()
ext4_walk_page_buffers(page_bufs) <- trigger use-after-free
After commit bdf96838aea6 ("ext4: fix race between truncate and
__ext4_journalled_writepage()"), we have already handled the racing
case, so the bget_one() and bput_one() are not needed. So this patch
simply remove these hunk, and recheck the i_size to make it safe.
Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bca9951634d9..68070f34f0cf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1845,30 +1845,16 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
return 0;
}
-static int bget_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- get_bh(bh);
- return 0;
-}
-
-static int bput_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- put_bh(bh);
- return 0;
-}
-
static int __ext4_journalled_writepage(struct page *page,
unsigned int len)
{
struct address_space *mapping = page->mapping;
struct inode *inode = mapping->host;
- struct buffer_head *page_bufs = NULL;
handle_t *handle = NULL;
int ret = 0, err = 0;
int inline_data = ext4_has_inline_data(inode);
struct buffer_head *inode_bh = NULL;
+ loff_t size;
ClearPageChecked(page);
@@ -1878,14 +1864,6 @@ static int __ext4_journalled_writepage(struct page *page,
inode_bh = ext4_journalled_write_inline_data(inode, len, page);
if (inode_bh == NULL)
goto out;
- } else {
- page_bufs = page_buffers(page);
- if (!page_bufs) {
- BUG();
- goto out;
- }
- ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
- NULL, bget_one);
}
/*
* We need to release the page lock before we start the
@@ -1906,7 +1884,8 @@ static int __ext4_journalled_writepage(struct page *page,
lock_page(page);
put_page(page);
- if (page->mapping != mapping) {
+ size = i_size_read(inode);
+ if (page->mapping != mapping || page_offset(page) > size) {
/* The page got truncated from under us */
ext4_journal_stop(handle);
ret = 0;
@@ -1916,6 +1895,13 @@ static int __ext4_journalled_writepage(struct page *page,
if (inline_data) {
ret = ext4_mark_inode_dirty(handle, inode);
} else {
+ struct buffer_head *page_bufs = page_buffers(page);
+
+ if (page->index == size >> PAGE_SHIFT)
+ len = size & ~PAGE_MASK;
+ else
+ len = PAGE_SIZE;
+
ret = ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
NULL, do_journal_get_write_access);
@@ -1936,9 +1922,6 @@ static int __ext4_journalled_writepage(struct page *page,
out:
unlock_page(page);
out_no_pagelock:
- if (!inline_data && page_bufs)
- ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len,
- NULL, bput_one);
brelse(inode_bh);
return ret;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5c48a7df91499e371ef725895b2e2d21a126e227 Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Sat, 25 Dec 2021 17:09:37 +0800
Subject: [PATCH] ext4: fix an use-after-free issue about data=journal
writeback mode
Our syzkaller report an use-after-free issue that accessing the freed
buffer_head on the writeback page in __ext4_journalled_writepage(). The
problem is that if there was a truncate racing with the data=journalled
writeback procedure, the writeback length could become zero and
bget_one() refuse to get buffer_head's refcount, then the truncate
procedure release buffer once we drop page lock, finally, the last
ext4_walk_page_buffers() trigger the use-after-free problem.
sync truncate
ext4_sync_file()
file_write_and_wait_range()
ext4_setattr(0)
inode->i_size = 0
ext4_writepage()
len = 0
__ext4_journalled_writepage()
page_bufs = page_buffers(page)
ext4_walk_page_buffers(bget_one) <- does not get refcount
do_invalidatepage()
free_buffer_head()
ext4_walk_page_buffers(page_bufs) <- trigger use-after-free
After commit bdf96838aea6 ("ext4: fix race between truncate and
__ext4_journalled_writepage()"), we have already handled the racing
case, so the bget_one() and bput_one() are not needed. So this patch
simply remove these hunk, and recheck the i_size to make it safe.
Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bca9951634d9..68070f34f0cf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1845,30 +1845,16 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
return 0;
}
-static int bget_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- get_bh(bh);
- return 0;
-}
-
-static int bput_one(handle_t *handle, struct inode *inode,
- struct buffer_head *bh)
-{
- put_bh(bh);
- return 0;
-}
-
static int __ext4_journalled_writepage(struct page *page,
unsigned int len)
{
struct address_space *mapping = page->mapping;
struct inode *inode = mapping->host;
- struct buffer_head *page_bufs = NULL;
handle_t *handle = NULL;
int ret = 0, err = 0;
int inline_data = ext4_has_inline_data(inode);
struct buffer_head *inode_bh = NULL;
+ loff_t size;
ClearPageChecked(page);
@@ -1878,14 +1864,6 @@ static int __ext4_journalled_writepage(struct page *page,
inode_bh = ext4_journalled_write_inline_data(inode, len, page);
if (inode_bh == NULL)
goto out;
- } else {
- page_bufs = page_buffers(page);
- if (!page_bufs) {
- BUG();
- goto out;
- }
- ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
- NULL, bget_one);
}
/*
* We need to release the page lock before we start the
@@ -1906,7 +1884,8 @@ static int __ext4_journalled_writepage(struct page *page,
lock_page(page);
put_page(page);
- if (page->mapping != mapping) {
+ size = i_size_read(inode);
+ if (page->mapping != mapping || page_offset(page) > size) {
/* The page got truncated from under us */
ext4_journal_stop(handle);
ret = 0;
@@ -1916,6 +1895,13 @@ static int __ext4_journalled_writepage(struct page *page,
if (inline_data) {
ret = ext4_mark_inode_dirty(handle, inode);
} else {
+ struct buffer_head *page_bufs = page_buffers(page);
+
+ if (page->index == size >> PAGE_SHIFT)
+ len = size & ~PAGE_MASK;
+ else
+ len = PAGE_SIZE;
+
ret = ext4_walk_page_buffers(handle, inode, page_bufs, 0, len,
NULL, do_journal_get_write_access);
@@ -1936,9 +1922,6 @@ static int __ext4_journalled_writepage(struct page *page,
out:
unlock_page(page);
out_no_pagelock:
- if (!inline_data && page_bufs)
- ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len,
- NULL, bput_one);
brelse(inode_bh);
return ret;
}
Hello my dear,
I sent this mail praying it will get to you in a good condition of
health, since I myself are in a very critical health condition in
which I sleep every night without knowing if I may be alive to see the
next day. I bring peace and love to you. It is by the grace of God, I
had no choice than to do what is lawful and right in the sight of God
for eternal life and in the sight of man, for witness of God’s mercy
and glory upon my life. I am Mrs. Dina. Howley Mckenna, a widow. I am
suffering from a long time brain tumor, It has defiled all forms of
medical treatment, and right now I have about a few months to leave,
according to medical experts. The situation has gotten complicated
recently with my inability to hear proper, am communicating with you
with the help of the chief nurse herein the hospital, from all
indication my conditions is really deteriorating and it is quite
obvious that, according to my doctors they have advised me that I may
not live too long, Because this illness has gotten to a very bad
stage. I plead that you will not expose or betray this trust and
confidence that I am about to repose on you for the mutual benefit of
the orphans and the less privilege. I have some funds I inherited from
my late husband, the sum of ($ 11,000,000.00, Eleven Million Dollars).
Having known my condition, I decided to donate this fund to you
believing that you will utilize it the way i am going to instruct
herein. I need you to assist me and reclaim this money and use it for
Charity works therein your country for orphanages and gives justice
and help to the poor, needy and widows says The Lord." Jeremiah
22:15-16.“ and also build schools for less privilege that will be
named after my late husband if possible and to promote the word of God
and the effort that the house of God is maintained. I do not want a
situation where this money will be used in an ungodly manner. That's
why I'm taking this decision. I'm not afraid of death, so I know where
I'm going. I accept this decision because I do not have any child who
will inherit this money after I die.. Please I want your sincerely and
urgent answer to know if you will be able to execute this project for
the glory of God, and I will give you more information on how the fund
will be transferred to your bank account. May the grace, peace, love
and the truth in the Word of God be with you and all those that you
love and care for.
I'm waiting for your immediate reply..
May God Bless you,
Mrs. Dina. Howley Mckenna.
Hello,
Am Fredrik Elvebakk an Investment Manager from Norway. I wish to solicit
your interest in an investment project that is currently ongoing in my company (DNB);
It is a short term investment with good returns.
Simply reply for me to confirm the validity of your email so i shall give you comprehensive details about the project.
Best Regards,
Fredrik Elvebakk
Business Consultant
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a602f5111fdd3d8a8ea2ac9e61f1c047d9794062 Mon Sep 17 00:00:00 2001
From: Fabrizio Bertocci <fabriziobertocci(a)gmail.com>
Date: Mon, 29 Nov 2021 23:15:40 -0500
Subject: [PATCH] platform/x86: amd-pmc: Fix s2idle failures on certain AMD
laptops
On some AMD hardware laptops, the system fails communicating with the
PMC when entering s2idle and the machine is battery powered.
Hardware description: HP Pavilion Aero Laptop 13-be0097nr
CPU: AMD Ryzen 7 5800U with Radeon Graphics
GPU: 03:00.0 VGA compatible controller [0300]: Advanced Micro Devices,
Inc. [AMD/ATI] Device [1002:1638] (rev c1)
Detailed description of the problem (and investigation) here:
https://gitlab.freedesktop.org/drm/amd/-/issues/1799
Patch is a single line: reduce the polling delay in half, from 100uSec
to 50uSec when waiting for a change in state from the PMC after a
write command operation.
After changing the delay, I did not see a single failure on this
machine (I have this fix for now more than one week and s2idle worked
every single time on battery power).
Cc: stable(a)vger.kernel.org
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k(a)amd.com>
Signed-off-by: Fabrizio Bertocci <fabriziobertocci(a)gmail.com>
Link: https://lore.kernel.org/r/CADtzkx7TdfbwtaVEXUdD6YXPey52E-nZVQNs+Z41DTx7gqMq…
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
diff --git a/drivers/platform/x86/amd-pmc.c b/drivers/platform/x86/amd-pmc.c
index b7e50ed050a8..841c44cd64c2 100644
--- a/drivers/platform/x86/amd-pmc.c
+++ b/drivers/platform/x86/amd-pmc.c
@@ -76,7 +76,7 @@
#define AMD_CPU_ID_CZN AMD_CPU_ID_RN
#define AMD_CPU_ID_YC 0x14B5
-#define PMC_MSG_DELAY_MIN_US 100
+#define PMC_MSG_DELAY_MIN_US 50
#define RESPONSE_REGISTER_LOOP_MAX 20000
#define SOC_SUBSYSTEM_IP_MAX 12
Changes to the AMD Thresholding sysfs code prevents sysfs writes from
updating the underlying registers once CPU init is completed, i.e.
"threshold_banks" is set.
Allow the registers to be updated if the thresholding interface is
already initialized or if in the init path. Use the "set_lvt_off" value
to indicate if running in the init path, since this value is only set
during init.
Fixes: a037f3ca0ea0 ("x86/mce/amd: Make threshold bank setting hotplug robust")
Signed-off-by: Yazen Ghannam <yazen.ghannam(a)amd.com>
Cc: <stable(a)vger.kernel.org>
---
Link:
https://lkml.kernel.org/r/20211207193028.9389-1-yazen.ghannam@amd.com
v1->v2:
* Add Cc: stable
* Switch logic for check and drop extra comment.
arch/x86/kernel/cpu/mce/amd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index a1e2f41796dc..9f4b508886dd 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -423,7 +423,7 @@ static void threshold_restart_bank(void *_tr)
u32 hi, lo;
/* sysfs write might race against an offline operation */
- if (this_cpu_read(threshold_banks))
+ if (!this_cpu_read(threshold_banks) && !tr->set_lvt_off)
return;
rdmsr(tr->b->address, lo, hi);
--
2.25.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 232796df8c1437c41d308d161007f0715bac0a54 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Wed, 27 Oct 2021 18:30:25 +0100
Subject: [PATCH] btrfs: fix deadlock between quota enable and other quota
operations
When enabling quotas, we attempt to commit a transaction while holding the
mutex fs_info->qgroup_ioctl_lock. This can result on a deadlock with other
quota operations such as:
- qgroup creation and deletion, ioctl BTRFS_IOC_QGROUP_CREATE;
- adding and removing qgroup relations, ioctl BTRFS_IOC_QGROUP_ASSIGN.
This is because these operations join a transaction and after that they
attempt to lock the mutex fs_info->qgroup_ioctl_lock. Acquiring that mutex
after joining or starting a transaction is a pattern followed everywhere
in qgroups, so the quota enablement operation is the one at fault here,
and should not commit a transaction while holding that mutex.
Fix this by making the transaction commit while not holding the mutex.
We are safe from two concurrent tasks trying to enable quotas because
we are serialized by the rw semaphore fs_info->subvol_sem at
btrfs_ioctl_quota_ctl(), which is the only call site for enabling
quotas.
When this deadlock happens, it produces a trace like the following:
INFO: task syz-executor:25604 blocked for more than 143 seconds.
Not tainted 5.15.0-rc6 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor state:D stack:24800 pid:25604 ppid: 24873 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:4940 [inline]
__schedule+0xcd9/0x2530 kernel/sched/core.c:6287
schedule+0xd3/0x270 kernel/sched/core.c:6366
btrfs_commit_transaction+0x994/0x2e90 fs/btrfs/transaction.c:2201
btrfs_quota_enable+0x95c/0x1790 fs/btrfs/qgroup.c:1120
btrfs_ioctl_quota_ctl fs/btrfs/ioctl.c:4229 [inline]
btrfs_ioctl+0x637e/0x7b70 fs/btrfs/ioctl.c:5010
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f86920b2c4d
RSP: 002b:00007f868f61ac58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f86921d90a0 RCX: 00007f86920b2c4d
RDX: 0000000020005e40 RSI: 00000000c0109428 RDI: 0000000000000008
RBP: 00007f869212bd80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f86921d90a0
R13: 00007fff6d233e4f R14: 00007fff6d233ff0 R15: 00007f868f61adc0
INFO: task syz-executor:25628 blocked for more than 143 seconds.
Not tainted 5.15.0-rc6 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor state:D stack:29080 pid:25628 ppid: 24873 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:4940 [inline]
__schedule+0xcd9/0x2530 kernel/sched/core.c:6287
schedule+0xd3/0x270 kernel/sched/core.c:6366
schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6425
__mutex_lock_common kernel/locking/mutex.c:669 [inline]
__mutex_lock+0xc96/0x1680 kernel/locking/mutex.c:729
btrfs_remove_qgroup+0xb7/0x7d0 fs/btrfs/qgroup.c:1548
btrfs_ioctl_qgroup_create fs/btrfs/ioctl.c:4333 [inline]
btrfs_ioctl+0x683c/0x7b70 fs/btrfs/ioctl.c:5014
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Reported-by: Hao Sun <sunhao.th(a)gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CACkBjsZQF19bQ1C6=yetF3BvL10OSORpFUcWXT…
Fixes: 340f1aa27f36 ("btrfs: qgroups: Move transaction management inside btrfs_quota_enable/disable")
CC: stable(a)vger.kernel.org # 4.19
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 6c037f1252b7..071f7334f818 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -940,6 +940,14 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
int ret = 0;
int slot;
+ /*
+ * We need to have subvol_sem write locked, to prevent races between
+ * concurrent tasks trying to enable quotas, because we will unlock
+ * and relock qgroup_ioctl_lock before setting fs_info->quota_root
+ * and before setting BTRFS_FS_QUOTA_ENABLED.
+ */
+ lockdep_assert_held_write(&fs_info->subvol_sem);
+
mutex_lock(&fs_info->qgroup_ioctl_lock);
if (fs_info->quota_root)
goto out;
@@ -1117,8 +1125,19 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
goto out_free_path;
}
+ mutex_unlock(&fs_info->qgroup_ioctl_lock);
+ /*
+ * Commit the transaction while not holding qgroup_ioctl_lock, to avoid
+ * a deadlock with tasks concurrently doing other qgroup operations, such
+ * adding/removing qgroups or adding/deleting qgroup relations for example,
+ * because all qgroup operations first start or join a transaction and then
+ * lock the qgroup_ioctl_lock mutex.
+ * We are safe from a concurrent task trying to enable quotas, by calling
+ * this function, since we are serialized by fs_info->subvol_sem.
+ */
ret = btrfs_commit_transaction(trans);
trans = NULL;
+ mutex_lock(&fs_info->qgroup_ioctl_lock);
if (ret)
goto out_free_path;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 16beac87e95e2fb278b552397c8260637f8a63f7 Mon Sep 17 00:00:00 2001
From: Naohiro Aota <naohiro.aota(a)wdc.com>
Date: Thu, 11 Nov 2021 14:14:38 +0900
Subject: [PATCH] btrfs: zoned: cache reported zone during mount
When mounting a device, we are reporting the zones twice: once for
checking the zone attributes in btrfs_get_dev_zone_info and once for
loading block groups' zone info in
btrfs_load_block_group_zone_info(). With a lot of block groups, that
leads to a lot of REPORT ZONE commands and slows down the mount
process.
This patch introduces a zone info cache in struct
btrfs_zoned_device_info. The cache is populated while in
btrfs_get_dev_zone_info() and used for
btrfs_load_block_group_zone_info() to reduce the number of REPORT ZONE
commands. The zone cache is then released after loading the block
groups, as it will not be much effective during the run time.
Benchmark: Mount an HDD with 57,007 block groups
Before patch: 171.368 seconds
After patch: 64.064 seconds
While it still takes a minute due to the slowness of loading all the
block groups, the patch reduces the mount time by 1/3.
Link: https://lore.kernel.org/linux-btrfs/CAHQ7scUiLtcTqZOMMY5kbWUBOhGRwKo6J6wYPT…
Fixes: 5b316468983d ("btrfs: get zone information of zoned block devices")
CC: stable(a)vger.kernel.org
Signed-off-by: Naohiro Aota <naohiro.aota(a)wdc.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 82769f1c17ee..66fa61cb3f23 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -322,7 +322,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
device->fs_devices = fs_info->fs_devices;
- ret = btrfs_get_dev_zone_info(device);
+ ret = btrfs_get_dev_zone_info(device, false);
if (ret)
goto error;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6408948b3e2c..67533b13e1eb 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3570,6 +3570,8 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
goto fail_sysfs;
}
+ btrfs_free_zone_cache(fs_info);
+
if (!sb_rdonly(sb) && fs_info->fs_devices->missing_devices &&
!btrfs_check_rw_degradable(fs_info, NULL)) {
btrfs_warn(fs_info,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 53753e04af14..cafd490da072 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2669,7 +2669,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
device->fs_info = fs_info;
device->bdev = bdev;
- ret = btrfs_get_dev_zone_info(device);
+ ret = btrfs_get_dev_zone_info(device, false);
if (ret)
goto error_free_device;
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 678a29469511..b06059a5db2a 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -5,6 +5,7 @@
#include <linux/blkdev.h>
#include <linux/sched/mm.h>
#include <linux/atomic.h>
+#include <linux/vmalloc.h>
#include "ctree.h"
#include "volumes.h"
#include "zoned.h"
@@ -213,6 +214,8 @@ static int emulate_report_zones(struct btrfs_device *device, u64 pos,
static int btrfs_get_dev_zones(struct btrfs_device *device, u64 pos,
struct blk_zone *zones, unsigned int *nr_zones)
{
+ struct btrfs_zoned_device_info *zinfo = device->zone_info;
+ u32 zno;
int ret;
if (!*nr_zones)
@@ -224,6 +227,34 @@ static int btrfs_get_dev_zones(struct btrfs_device *device, u64 pos,
return 0;
}
+ /* Check cache */
+ if (zinfo->zone_cache) {
+ unsigned int i;
+
+ ASSERT(IS_ALIGNED(pos, zinfo->zone_size));
+ zno = pos >> zinfo->zone_size_shift;
+ /*
+ * We cannot report zones beyond the zone end. So, it is OK to
+ * cap *nr_zones to at the end.
+ */
+ *nr_zones = min_t(u32, *nr_zones, zinfo->nr_zones - zno);
+
+ for (i = 0; i < *nr_zones; i++) {
+ struct blk_zone *zone_info;
+
+ zone_info = &zinfo->zone_cache[zno + i];
+ if (!zone_info->len)
+ break;
+ }
+
+ if (i == *nr_zones) {
+ /* Cache hit on all the zones */
+ memcpy(zones, zinfo->zone_cache + zno,
+ sizeof(*zinfo->zone_cache) * *nr_zones);
+ return 0;
+ }
+ }
+
ret = blkdev_report_zones(device->bdev, pos >> SECTOR_SHIFT, *nr_zones,
copy_zone_info_cb, zones);
if (ret < 0) {
@@ -237,6 +268,11 @@ static int btrfs_get_dev_zones(struct btrfs_device *device, u64 pos,
if (!ret)
return -EIO;
+ /* Populate cache */
+ if (zinfo->zone_cache)
+ memcpy(zinfo->zone_cache + zno, zones,
+ sizeof(*zinfo->zone_cache) * *nr_zones);
+
return 0;
}
@@ -300,7 +336,7 @@ int btrfs_get_dev_zone_info_all_devices(struct btrfs_fs_info *fs_info)
if (!device->bdev)
continue;
- ret = btrfs_get_dev_zone_info(device);
+ ret = btrfs_get_dev_zone_info(device, true);
if (ret)
break;
}
@@ -309,7 +345,7 @@ int btrfs_get_dev_zone_info_all_devices(struct btrfs_fs_info *fs_info)
return ret;
}
-int btrfs_get_dev_zone_info(struct btrfs_device *device)
+int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
{
struct btrfs_fs_info *fs_info = device->fs_info;
struct btrfs_zoned_device_info *zone_info = NULL;
@@ -339,6 +375,8 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device)
if (!zone_info)
return -ENOMEM;
+ device->zone_info = zone_info;
+
if (!bdev_is_zoned(bdev)) {
if (!fs_info->zone_size) {
ret = calculate_emulated_zone_size(fs_info);
@@ -407,6 +445,23 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device)
goto out;
}
+ /*
+ * Enable zone cache only for a zoned device. On a non-zoned device, we
+ * fill the zone info with emulated CONVENTIONAL zones, so no need to
+ * use the cache.
+ */
+ if (populate_cache && bdev_is_zoned(device->bdev)) {
+ zone_info->zone_cache = vzalloc(sizeof(struct blk_zone) *
+ zone_info->nr_zones);
+ if (!zone_info->zone_cache) {
+ btrfs_err_in_rcu(device->fs_info,
+ "zoned: failed to allocate zone cache for %s",
+ rcu_str_deref(device->name));
+ ret = -ENOMEM;
+ goto out;
+ }
+ }
+
/* Get zones type */
nactive = 0;
while (sector < nr_sectors) {
@@ -505,8 +560,6 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device)
kfree(zones);
- device->zone_info = zone_info;
-
switch (bdev_zoned_model(bdev)) {
case BLK_ZONED_HM:
model = "host-managed zoned";
@@ -539,11 +592,7 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device)
out:
kfree(zones);
out_free_zone_info:
- bitmap_free(zone_info->active_zones);
- bitmap_free(zone_info->empty_zones);
- bitmap_free(zone_info->seq_zones);
- kfree(zone_info);
- device->zone_info = NULL;
+ btrfs_destroy_dev_zone_info(device);
return ret;
}
@@ -558,6 +607,7 @@ void btrfs_destroy_dev_zone_info(struct btrfs_device *device)
bitmap_free(zone_info->active_zones);
bitmap_free(zone_info->seq_zones);
bitmap_free(zone_info->empty_zones);
+ vfree(zone_info->zone_cache);
kfree(zone_info);
device->zone_info = NULL;
}
@@ -1975,3 +2025,21 @@ void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg)
fs_info->data_reloc_bg = 0;
spin_unlock(&fs_info->relocation_bg_lock);
}
+
+void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info)
+{
+ struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+ struct btrfs_device *device;
+
+ if (!btrfs_is_zoned(fs_info))
+ return;
+
+ mutex_lock(&fs_devices->device_list_mutex);
+ list_for_each_entry(device, &fs_devices->devices, dev_list) {
+ if (device->zone_info) {
+ vfree(device->zone_info->zone_cache);
+ device->zone_info->zone_cache = NULL;
+ }
+ }
+ mutex_unlock(&fs_devices->device_list_mutex);
+}
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index e53ab7b96437..4344f4818389 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -28,6 +28,7 @@ struct btrfs_zoned_device_info {
unsigned long *seq_zones;
unsigned long *empty_zones;
unsigned long *active_zones;
+ struct blk_zone *zone_cache;
struct blk_zone sb_zones[2 * BTRFS_SUPER_MIRROR_MAX];
};
@@ -35,7 +36,7 @@ struct btrfs_zoned_device_info {
int btrfs_get_dev_zone(struct btrfs_device *device, u64 pos,
struct blk_zone *zone);
int btrfs_get_dev_zone_info_all_devices(struct btrfs_fs_info *fs_info);
-int btrfs_get_dev_zone_info(struct btrfs_device *device);
+int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache);
void btrfs_destroy_dev_zone_info(struct btrfs_device *device);
int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info);
int btrfs_check_mountopts_zoned(struct btrfs_fs_info *info);
@@ -76,6 +77,7 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices,
void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical,
u64 length);
void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
+void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
#else /* CONFIG_BLK_DEV_ZONED */
static inline int btrfs_get_dev_zone(struct btrfs_device *device, u64 pos,
struct blk_zone *zone)
@@ -88,7 +90,8 @@ static inline int btrfs_get_dev_zone_info_all_devices(struct btrfs_fs_info *fs_i
return 0;
}
-static inline int btrfs_get_dev_zone_info(struct btrfs_device *device)
+static inline int btrfs_get_dev_zone_info(struct btrfs_device *device,
+ bool populate_cache)
{
return 0;
}
@@ -232,6 +235,7 @@ static inline void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info,
static inline void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg) { }
+static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }
#endif
static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos)
Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to the
RPC read layers") on the client, a read of 0xfff is aligned up to server
rsize of 0x1000.
As a result, in a test where the server has a file of size
0x7fffffffffffffff, and the client tries to read from the offset
0x7ffffffffffff000, the read causes loff_t overflow in the server and it
returns an NFS code of EINVAL to the client. The client as a result
indefinitely retries the request.
This fixes the issue at server side by trimming reads past
NFS_OFFSET_MAX. It also adds a missing check for out of bound offset in
NFSv3, copying a similar check from NFSv4.x.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Dan Aloni <dan.aloni(a)vastdata.com>
---
fs/nfsd/nfs4proc.c | 3 +++
fs/nfsd/vfs.c | 6 ++++++
2 files changed, 9 insertions(+)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 486c5dba4b65..816bdf212559 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -785,6 +785,9 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if (read->rd_offset >= OFFSET_MAX)
return nfserr_inval;
+ if (unlikely(read->rd_offset + read->rd_length > OFFSET_MAX))
+ read->rd_length = OFFSET_MAX - read->rd_offset;
+
trace_nfsd_read_start(rqstp, &cstate->current_fh,
read->rd_offset, read->rd_length);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 738d564ca4ce..ad4df374433e 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1045,6 +1045,12 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
struct file *file;
__be32 err;
+ if (unlikely(offset >= NFS_OFFSET_MAX))
+ return nfserr_inval;
+
+ if (unlikely(offset + *count > NFS_OFFSET_MAX))
+ *count = NFS_OFFSET_MAX - offset;
+
trace_nfsd_read_start(rqstp, fhp, offset, *count);
err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_READ, &nf);
if (err)
--
2.23.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 12998087d9f48b66965b97412069c7826502cd7e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Pali=20Roh=C3=A1r?= <pali(a)kernel.org>
Date: Wed, 24 Nov 2021 16:59:42 +0100
Subject: [PATCH] PCI: pci-bridge-emul: Fix definitions of reserved bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Some bits in PCI_EXP registers are reserved for non-root ports. Driver
pci-bridge-emul.c implements PCIe Root Port device therefore it should not
allow setting reserved bits of registers.
Properly define non-reserved bits for all PCI_EXP registers.
Link: https://lore.kernel.org/r/20211124155944.1290-5-pali@kernel.org
Fixes: 23a5fba4d941 ("PCI: Introduce PCI bridge emulated config space common logic")
Signed-off-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/pci/pci-bridge-emul.c b/drivers/pci/pci-bridge-emul.c
index 0cbb4e3ca827..2c7e04fb2685 100644
--- a/drivers/pci/pci-bridge-emul.c
+++ b/drivers/pci/pci-bridge-emul.c
@@ -176,41 +176,55 @@ struct pci_bridge_reg_behavior pcie_cap_regs_behavior[PCI_CAP_PCIE_SIZEOF / 4] =
[PCI_CAP_LIST_ID / 4] = {
/*
* Capability ID, Next Capability Pointer and
- * Capabilities register are all read-only.
+ * bits [14:0] of Capabilities register are all read-only.
+ * Bit 15 of Capabilities register is reserved.
*/
- .ro = ~0,
+ .ro = GENMASK(30, 0),
},
[PCI_EXP_DEVCAP / 4] = {
- .ro = ~0,
+ /*
+ * Bits [31:29] and [17:16] are reserved.
+ * Bits [27:18] are reserved for non-upstream ports.
+ * Bits 28 and [14:6] are reserved for non-endpoint devices.
+ * Other bits are read-only.
+ */
+ .ro = BIT(15) | GENMASK(5, 0),
},
[PCI_EXP_DEVCTL / 4] = {
- /* Device control register is RW */
- .rw = GENMASK(15, 0),
+ /*
+ * Device control register is RW, except bit 15 which is
+ * reserved for non-endpoints or non-PCIe-to-PCI/X bridges.
+ */
+ .rw = GENMASK(14, 0),
/*
* Device status register has bits 6 and [3:0] W1C, [5:4] RO,
- * the rest is reserved
+ * the rest is reserved. Also bit 6 is reserved for non-upstream
+ * ports.
*/
- .w1c = (BIT(6) | GENMASK(3, 0)) << 16,
+ .w1c = GENMASK(3, 0) << 16,
.ro = GENMASK(5, 4) << 16,
},
[PCI_EXP_LNKCAP / 4] = {
- /* All bits are RO, except bit 23 which is reserved */
- .ro = lower_32_bits(~BIT(23)),
+ /*
+ * All bits are RO, except bit 23 which is reserved and
+ * bit 18 which is reserved for non-upstream ports.
+ */
+ .ro = lower_32_bits(~(BIT(23) | PCI_EXP_LNKCAP_CLKPM)),
},
[PCI_EXP_LNKCTL / 4] = {
/*
* Link control has bits [15:14], [11:3] and [1:0] RW, the
- * rest is reserved.
+ * rest is reserved. Bit 8 is reserved for non-upstream ports.
*
* Link status has bits [13:0] RO, and bits [15:14]
* W1C.
*/
- .rw = GENMASK(15, 14) | GENMASK(11, 3) | GENMASK(1, 0),
+ .rw = GENMASK(15, 14) | GENMASK(11, 9) | GENMASK(7, 3) | GENMASK(1, 0),
.ro = GENMASK(13, 0) << 16,
.w1c = GENMASK(15, 14) << 16,
},
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1c1a3b4d3e86b997a313ffb297c1129540882859 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Pali=20Roh=C3=A1r?= <pali(a)kernel.org>
Date: Wed, 24 Nov 2021 16:59:39 +0100
Subject: [PATCH] PCI: pci-bridge-emul: Make expansion ROM Base Address
register read-only
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If expansion ROM is unsupported (which is the case of pci-bridge-emul.c
driver) then ROM Base Address register must be implemented as read-only
register that return 0 when read, same as for unused Base Address
registers.
Link: https://lore.kernel.org/r/20211124155944.1290-2-pali@kernel.org
Fixes: 23a5fba4d941 ("PCI: Introduce PCI bridge emulated config space common logic")
Signed-off-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/pci/pci-bridge-emul.c b/drivers/pci/pci-bridge-emul.c
index db97cddfc85e..5de8b8dde209 100644
--- a/drivers/pci/pci-bridge-emul.c
+++ b/drivers/pci/pci-bridge-emul.c
@@ -139,8 +139,13 @@ struct pci_bridge_reg_behavior pci_regs_behavior[PCI_STD_HEADER_SIZEOF / 4] = {
.ro = GENMASK(7, 0),
},
+ /*
+ * If expansion ROM is unsupported then ROM Base Address register must
+ * be implemented as read-only register that return 0 when read, same
+ * as for unused Base Address registers.
+ */
[PCI_ROM_ADDRESS1 / 4] = {
- .rw = GENMASK(31, 11) | BIT(0),
+ .ro = ~0,
},
/*
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bba496656a73fc1d1330b49c7f82843836e9feb1 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Wed, 22 Dec 2021 13:07:31 +0000
Subject: [PATCH] powerpc/32: Fix boot failure with GCC latent entropy plugin
Boot fails with GCC latent entropy plugin enabled.
This is due to early boot functions trying to access 'latent_entropy'
global data while the kernel is not relocated at its final
destination yet.
As there is no way to tell GCC to use PTRRELOC() to access it,
disable latent entropy plugin in early_32.o and feature-fixups.o and
code-patching.o
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable(a)vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f(a)mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.16401784…
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5fa68c2ef1f8..36f3f5a8868d 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -11,6 +11,7 @@ CFLAGS_prom_init.o += -fPIC
CFLAGS_btext.o += -fPIC
endif
+CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 3e183f4b4bda..5d1881d2e39a 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -19,6 +19,9 @@ CFLAGS_code-patching.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_feature-fixups.o += -DDISABLE_BRANCH_PROFILING
endif
+CFLAGS_code-patching.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+
obj-y += alloc.o code-patching.o feature-fixups.o pmem.o
obj-$(CONFIG_CODE_PATCHING_SELFTEST) += test-code-patching.o
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bba496656a73fc1d1330b49c7f82843836e9feb1 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Wed, 22 Dec 2021 13:07:31 +0000
Subject: [PATCH] powerpc/32: Fix boot failure with GCC latent entropy plugin
Boot fails with GCC latent entropy plugin enabled.
This is due to early boot functions trying to access 'latent_entropy'
global data while the kernel is not relocated at its final
destination yet.
As there is no way to tell GCC to use PTRRELOC() to access it,
disable latent entropy plugin in early_32.o and feature-fixups.o and
code-patching.o
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable(a)vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f(a)mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.16401784…
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5fa68c2ef1f8..36f3f5a8868d 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -11,6 +11,7 @@ CFLAGS_prom_init.o += -fPIC
CFLAGS_btext.o += -fPIC
endif
+CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 3e183f4b4bda..5d1881d2e39a 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -19,6 +19,9 @@ CFLAGS_code-patching.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_feature-fixups.o += -DDISABLE_BRANCH_PROFILING
endif
+CFLAGS_code-patching.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+
obj-y += alloc.o code-patching.o feature-fixups.o pmem.o
obj-$(CONFIG_CODE_PATCHING_SELFTEST) += test-code-patching.o
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bba496656a73fc1d1330b49c7f82843836e9feb1 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Wed, 22 Dec 2021 13:07:31 +0000
Subject: [PATCH] powerpc/32: Fix boot failure with GCC latent entropy plugin
Boot fails with GCC latent entropy plugin enabled.
This is due to early boot functions trying to access 'latent_entropy'
global data while the kernel is not relocated at its final
destination yet.
As there is no way to tell GCC to use PTRRELOC() to access it,
disable latent entropy plugin in early_32.o and feature-fixups.o and
code-patching.o
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable(a)vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f(a)mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.16401784…
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5fa68c2ef1f8..36f3f5a8868d 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -11,6 +11,7 @@ CFLAGS_prom_init.o += -fPIC
CFLAGS_btext.o += -fPIC
endif
+CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 3e183f4b4bda..5d1881d2e39a 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -19,6 +19,9 @@ CFLAGS_code-patching.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_feature-fixups.o += -DDISABLE_BRANCH_PROFILING
endif
+CFLAGS_code-patching.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+
obj-y += alloc.o code-patching.o feature-fixups.o pmem.o
obj-$(CONFIG_CODE_PATCHING_SELFTEST) += test-code-patching.o
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bba496656a73fc1d1330b49c7f82843836e9feb1 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Wed, 22 Dec 2021 13:07:31 +0000
Subject: [PATCH] powerpc/32: Fix boot failure with GCC latent entropy plugin
Boot fails with GCC latent entropy plugin enabled.
This is due to early boot functions trying to access 'latent_entropy'
global data while the kernel is not relocated at its final
destination yet.
As there is no way to tell GCC to use PTRRELOC() to access it,
disable latent entropy plugin in early_32.o and feature-fixups.o and
code-patching.o
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable(a)vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f(a)mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.16401784…
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5fa68c2ef1f8..36f3f5a8868d 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -11,6 +11,7 @@ CFLAGS_prom_init.o += -fPIC
CFLAGS_btext.o += -fPIC
endif
+CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 3e183f4b4bda..5d1881d2e39a 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -19,6 +19,9 @@ CFLAGS_code-patching.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_feature-fixups.o += -DDISABLE_BRANCH_PROFILING
endif
+CFLAGS_code-patching.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+
obj-y += alloc.o code-patching.o feature-fixups.o pmem.o
obj-$(CONFIG_CODE_PATCHING_SELFTEST) += test-code-patching.o
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bba496656a73fc1d1330b49c7f82843836e9feb1 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Wed, 22 Dec 2021 13:07:31 +0000
Subject: [PATCH] powerpc/32: Fix boot failure with GCC latent entropy plugin
Boot fails with GCC latent entropy plugin enabled.
This is due to early boot functions trying to access 'latent_entropy'
global data while the kernel is not relocated at its final
destination yet.
As there is no way to tell GCC to use PTRRELOC() to access it,
disable latent entropy plugin in early_32.o and feature-fixups.o and
code-patching.o
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable(a)vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f(a)mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.16401784…
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5fa68c2ef1f8..36f3f5a8868d 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -11,6 +11,7 @@ CFLAGS_prom_init.o += -fPIC
CFLAGS_btext.o += -fPIC
endif
+CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 3e183f4b4bda..5d1881d2e39a 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -19,6 +19,9 @@ CFLAGS_code-patching.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_feature-fixups.o += -DDISABLE_BRANCH_PROFILING
endif
+CFLAGS_code-patching.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+
obj-y += alloc.o code-patching.o feature-fixups.o pmem.o
obj-$(CONFIG_CODE_PATCHING_SELFTEST) += test-code-patching.o
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bba496656a73fc1d1330b49c7f82843836e9feb1 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Wed, 22 Dec 2021 13:07:31 +0000
Subject: [PATCH] powerpc/32: Fix boot failure with GCC latent entropy plugin
Boot fails with GCC latent entropy plugin enabled.
This is due to early boot functions trying to access 'latent_entropy'
global data while the kernel is not relocated at its final
destination yet.
As there is no way to tell GCC to use PTRRELOC() to access it,
disable latent entropy plugin in early_32.o and feature-fixups.o and
code-patching.o
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable(a)vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f(a)mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.16401784…
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5fa68c2ef1f8..36f3f5a8868d 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -11,6 +11,7 @@ CFLAGS_prom_init.o += -fPIC
CFLAGS_btext.o += -fPIC
endif
+CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 3e183f4b4bda..5d1881d2e39a 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -19,6 +19,9 @@ CFLAGS_code-patching.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_feature-fixups.o += -DDISABLE_BRANCH_PROFILING
endif
+CFLAGS_code-patching.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+
obj-y += alloc.o code-patching.o feature-fixups.o pmem.o
obj-$(CONFIG_CODE_PATCHING_SELFTEST) += test-code-patching.o
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bba496656a73fc1d1330b49c7f82843836e9feb1 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Wed, 22 Dec 2021 13:07:31 +0000
Subject: [PATCH] powerpc/32: Fix boot failure with GCC latent entropy plugin
Boot fails with GCC latent entropy plugin enabled.
This is due to early boot functions trying to access 'latent_entropy'
global data while the kernel is not relocated at its final
destination yet.
As there is no way to tell GCC to use PTRRELOC() to access it,
disable latent entropy plugin in early_32.o and feature-fixups.o and
code-patching.o
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable(a)vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f(a)mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.16401784…
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5fa68c2ef1f8..36f3f5a8868d 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -11,6 +11,7 @@ CFLAGS_prom_init.o += -fPIC
CFLAGS_btext.o += -fPIC
endif
+CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 3e183f4b4bda..5d1881d2e39a 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -19,6 +19,9 @@ CFLAGS_code-patching.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_feature-fixups.o += -DDISABLE_BRANCH_PROFILING
endif
+CFLAGS_code-patching.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+
obj-y += alloc.o code-patching.o feature-fixups.o pmem.o
obj-$(CONFIG_CODE_PATCHING_SELFTEST) += test-code-patching.o
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 467ba14e1660b52a2f9338b484704c461bd23019 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin(a)gmail.com>
Date: Thu, 16 Dec 2021 20:33:42 +1000
Subject: [PATCH] powerpc/64s/radix: Fix huge vmap false positive
pmd_huge() is defined to false when HUGETLB_PAGE is not configured, but
the vmap code still installs huge PMDs. This leads to false bad PMD
errors when vunmapping because it is not seen as a huge PTE, and the bad
PMD check catches it. The end result may not be much more serious than
some bad pmd warning messages, because the pmd_none_or_clear_bad() does
what we wanted and clears the huge PTE anyway.
Fix this by checking pmd_is_leaf(), which checks for a PTE regardless of
config options. The whole huge/large/leaf stuff is a tangled mess but
that's kernel-wide and not something we can improve much in arch/powerpc
code.
pmd_page(), pud_page(), etc., called by vmalloc_to_page() on huge vmaps
can similarly trigger a false VM_BUG_ON when CONFIG_HUGETLB_PAGE=n, so
those checks are adjusted. The checks were added by commit d6eacedd1f0e
("powerpc/book3s: Use config independent helpers for page table walk"),
while implementing a similar fix for other page table walking functions.
Fixes: d909f9109c30 ("powerpc/64s/radix: Enable HAVE_ARCH_HUGE_VMAP")
Cc: stable(a)vger.kernel.org # v5.3+
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20211216103342.609192-1-npiggin@gmail.com
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 3c4f0ebe5df8..ca23f5d1883a 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -1076,7 +1076,7 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
int pud_clear_huge(pud_t *pud)
{
- if (pud_huge(*pud)) {
+ if (pud_is_leaf(*pud)) {
pud_clear(pud);
return 1;
}
@@ -1123,7 +1123,7 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
int pmd_clear_huge(pmd_t *pmd)
{
- if (pmd_huge(*pmd)) {
+ if (pmd_is_leaf(*pmd)) {
pmd_clear(pmd);
return 1;
}
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 78c8cf01db5f..175aabf101e8 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -102,7 +102,8 @@ EXPORT_SYMBOL(__pte_frag_size_shift);
struct page *p4d_page(p4d_t p4d)
{
if (p4d_is_leaf(p4d)) {
- VM_WARN_ON(!p4d_huge(p4d));
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+ VM_WARN_ON(!p4d_huge(p4d));
return pte_page(p4d_pte(p4d));
}
return virt_to_page(p4d_pgtable(p4d));
@@ -112,7 +113,8 @@ struct page *p4d_page(p4d_t p4d)
struct page *pud_page(pud_t pud)
{
if (pud_is_leaf(pud)) {
- VM_WARN_ON(!pud_huge(pud));
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+ VM_WARN_ON(!pud_huge(pud));
return pte_page(pud_pte(pud));
}
return virt_to_page(pud_pgtable(pud));
@@ -125,7 +127,13 @@ struct page *pud_page(pud_t pud)
struct page *pmd_page(pmd_t pmd)
{
if (pmd_is_leaf(pmd)) {
- VM_WARN_ON(!(pmd_large(pmd) || pmd_huge(pmd)));
+ /*
+ * vmalloc_to_page may be called on any vmap address (not only
+ * vmalloc), and it uses pmd_page() etc., when huge vmap is
+ * enabled so these checks can't be used.
+ */
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+ VM_WARN_ON(!(pmd_large(pmd) || pmd_huge(pmd)));
return pte_page(pmd_pte(pmd));
}
return virt_to_page(pmd_page_vaddr(pmd));
The patch below does not apply to the 04f0d6cc62cc1eaf9242c081520c024a17ba86a3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 04f0d6cc62cc1eaf9242c081520c024a17ba86a3 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Fri, 5 Nov 2021 14:33:38 -0400
Subject: [PATCH] drm/i915: Add support for panels with VESA backlights with
PWM enable/disable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This simply adds proper support for panel backlights that can be controlled
via VESA's backlight control protocol, but which also require that we
enable and disable the backlight via PWM instead of via the DPCD interface.
We also enable this by default, in order to fix some people's backlights
that were broken by not having this enabled.
For reference, backlights that require this and use VESA's backlight
interface tend to be laptops with hybrid GPUs, but this very well may
change in the future.
v4:
* Make sure that we call intel_backlight_level_to_pwm() in
intel_dp_aux_vesa_enable_backlight() - vsyrjala
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/3680
Fixes: fe7d52bccab6 ("drm/i915/dp: Don't use DPCD backlights that need PWM enable/disable")
Reviewed-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v5.12+
Link: https://patchwork.freedesktop.org/patch/msgid/20211105183342.130810-2-lyude…
diff --git a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
index 569d17b4d00f..f05b71c01b8e 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
@@ -293,6 +293,13 @@ intel_dp_aux_vesa_enable_backlight(const struct intel_crtc_state *crtc_state,
struct intel_panel *panel = &connector->panel;
struct intel_dp *intel_dp = enc_to_intel_dp(connector->encoder);
+ if (!panel->backlight.edp.vesa.info.aux_enable) {
+ u32 pwm_level = intel_backlight_invert_pwm_level(connector,
+ panel->backlight.pwm_level_max);
+
+ panel->backlight.pwm_funcs->enable(crtc_state, conn_state, pwm_level);
+ }
+
drm_edp_backlight_enable(&intel_dp->aux, &panel->backlight.edp.vesa.info, level);
}
@@ -304,6 +311,10 @@ static void intel_dp_aux_vesa_disable_backlight(const struct drm_connector_state
struct intel_dp *intel_dp = enc_to_intel_dp(connector->encoder);
drm_edp_backlight_disable(&intel_dp->aux, &panel->backlight.edp.vesa.info);
+
+ if (!panel->backlight.edp.vesa.info.aux_enable)
+ panel->backlight.pwm_funcs->disable(old_conn_state,
+ intel_backlight_invert_pwm_level(connector, 0));
}
static int intel_dp_aux_vesa_setup_backlight(struct intel_connector *connector, enum pipe pipe)
@@ -321,6 +332,15 @@ static int intel_dp_aux_vesa_setup_backlight(struct intel_connector *connector,
if (ret < 0)
return ret;
+ if (!panel->backlight.edp.vesa.info.aux_enable) {
+ ret = panel->backlight.pwm_funcs->setup(connector, pipe);
+ if (ret < 0) {
+ drm_err(&i915->drm,
+ "Failed to setup PWM backlight controls for eDP backlight: %d\n",
+ ret);
+ return ret;
+ }
+ }
panel->backlight.max = panel->backlight.edp.vesa.info.max;
panel->backlight.min = 0;
if (current_mode == DP_EDP_BACKLIGHT_CONTROL_MODE_DPCD) {
@@ -340,12 +360,7 @@ intel_dp_aux_supports_vesa_backlight(struct intel_connector *connector)
struct intel_dp *intel_dp = intel_attached_dp(connector);
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
- /* TODO: We currently only support AUX only backlight configurations, not backlights which
- * require a mix of PWM and AUX controls to work. In the mean time, these machines typically
- * work just fine using normal PWM controls anyway.
- */
- if ((intel_dp->edp_dpcd[1] & DP_EDP_BACKLIGHT_AUX_ENABLE_CAP) &&
- drm_edp_backlight_supported(intel_dp->edp_dpcd)) {
+ if (drm_edp_backlight_supported(intel_dp->edp_dpcd)) {
drm_dbg_kms(&i915->drm, "AUX Backlight Control Supported!\n");
return true;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 04f0d6cc62cc1eaf9242c081520c024a17ba86a3 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Fri, 5 Nov 2021 14:33:38 -0400
Subject: [PATCH] drm/i915: Add support for panels with VESA backlights with
PWM enable/disable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This simply adds proper support for panel backlights that can be controlled
via VESA's backlight control protocol, but which also require that we
enable and disable the backlight via PWM instead of via the DPCD interface.
We also enable this by default, in order to fix some people's backlights
that were broken by not having this enabled.
For reference, backlights that require this and use VESA's backlight
interface tend to be laptops with hybrid GPUs, but this very well may
change in the future.
v4:
* Make sure that we call intel_backlight_level_to_pwm() in
intel_dp_aux_vesa_enable_backlight() - vsyrjala
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/3680
Fixes: fe7d52bccab6 ("drm/i915/dp: Don't use DPCD backlights that need PWM enable/disable")
Reviewed-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v5.12+
Link: https://patchwork.freedesktop.org/patch/msgid/20211105183342.130810-2-lyude…
diff --git a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
index 569d17b4d00f..f05b71c01b8e 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
@@ -293,6 +293,13 @@ intel_dp_aux_vesa_enable_backlight(const struct intel_crtc_state *crtc_state,
struct intel_panel *panel = &connector->panel;
struct intel_dp *intel_dp = enc_to_intel_dp(connector->encoder);
+ if (!panel->backlight.edp.vesa.info.aux_enable) {
+ u32 pwm_level = intel_backlight_invert_pwm_level(connector,
+ panel->backlight.pwm_level_max);
+
+ panel->backlight.pwm_funcs->enable(crtc_state, conn_state, pwm_level);
+ }
+
drm_edp_backlight_enable(&intel_dp->aux, &panel->backlight.edp.vesa.info, level);
}
@@ -304,6 +311,10 @@ static void intel_dp_aux_vesa_disable_backlight(const struct drm_connector_state
struct intel_dp *intel_dp = enc_to_intel_dp(connector->encoder);
drm_edp_backlight_disable(&intel_dp->aux, &panel->backlight.edp.vesa.info);
+
+ if (!panel->backlight.edp.vesa.info.aux_enable)
+ panel->backlight.pwm_funcs->disable(old_conn_state,
+ intel_backlight_invert_pwm_level(connector, 0));
}
static int intel_dp_aux_vesa_setup_backlight(struct intel_connector *connector, enum pipe pipe)
@@ -321,6 +332,15 @@ static int intel_dp_aux_vesa_setup_backlight(struct intel_connector *connector,
if (ret < 0)
return ret;
+ if (!panel->backlight.edp.vesa.info.aux_enable) {
+ ret = panel->backlight.pwm_funcs->setup(connector, pipe);
+ if (ret < 0) {
+ drm_err(&i915->drm,
+ "Failed to setup PWM backlight controls for eDP backlight: %d\n",
+ ret);
+ return ret;
+ }
+ }
panel->backlight.max = panel->backlight.edp.vesa.info.max;
panel->backlight.min = 0;
if (current_mode == DP_EDP_BACKLIGHT_CONTROL_MODE_DPCD) {
@@ -340,12 +360,7 @@ intel_dp_aux_supports_vesa_backlight(struct intel_connector *connector)
struct intel_dp *intel_dp = intel_attached_dp(connector);
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
- /* TODO: We currently only support AUX only backlight configurations, not backlights which
- * require a mix of PWM and AUX controls to work. In the mean time, these machines typically
- * work just fine using normal PWM controls anyway.
- */
- if ((intel_dp->edp_dpcd[1] & DP_EDP_BACKLIGHT_AUX_ENABLE_CAP) &&
- drm_edp_backlight_supported(intel_dp->edp_dpcd)) {
+ if (drm_edp_backlight_supported(intel_dp->edp_dpcd)) {
drm_dbg_kms(&i915->drm, "AUX Backlight Control Supported!\n");
return true;
}
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 04f0d6cc62cc1eaf9242c081520c024a17ba86a3 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Fri, 5 Nov 2021 14:33:38 -0400
Subject: [PATCH] drm/i915: Add support for panels with VESA backlights with
PWM enable/disable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This simply adds proper support for panel backlights that can be controlled
via VESA's backlight control protocol, but which also require that we
enable and disable the backlight via PWM instead of via the DPCD interface.
We also enable this by default, in order to fix some people's backlights
that were broken by not having this enabled.
For reference, backlights that require this and use VESA's backlight
interface tend to be laptops with hybrid GPUs, but this very well may
change in the future.
v4:
* Make sure that we call intel_backlight_level_to_pwm() in
intel_dp_aux_vesa_enable_backlight() - vsyrjala
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/3680
Fixes: fe7d52bccab6 ("drm/i915/dp: Don't use DPCD backlights that need PWM enable/disable")
Reviewed-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v5.12+
Link: https://patchwork.freedesktop.org/patch/msgid/20211105183342.130810-2-lyude…
diff --git a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
index 569d17b4d00f..f05b71c01b8e 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
@@ -293,6 +293,13 @@ intel_dp_aux_vesa_enable_backlight(const struct intel_crtc_state *crtc_state,
struct intel_panel *panel = &connector->panel;
struct intel_dp *intel_dp = enc_to_intel_dp(connector->encoder);
+ if (!panel->backlight.edp.vesa.info.aux_enable) {
+ u32 pwm_level = intel_backlight_invert_pwm_level(connector,
+ panel->backlight.pwm_level_max);
+
+ panel->backlight.pwm_funcs->enable(crtc_state, conn_state, pwm_level);
+ }
+
drm_edp_backlight_enable(&intel_dp->aux, &panel->backlight.edp.vesa.info, level);
}
@@ -304,6 +311,10 @@ static void intel_dp_aux_vesa_disable_backlight(const struct drm_connector_state
struct intel_dp *intel_dp = enc_to_intel_dp(connector->encoder);
drm_edp_backlight_disable(&intel_dp->aux, &panel->backlight.edp.vesa.info);
+
+ if (!panel->backlight.edp.vesa.info.aux_enable)
+ panel->backlight.pwm_funcs->disable(old_conn_state,
+ intel_backlight_invert_pwm_level(connector, 0));
}
static int intel_dp_aux_vesa_setup_backlight(struct intel_connector *connector, enum pipe pipe)
@@ -321,6 +332,15 @@ static int intel_dp_aux_vesa_setup_backlight(struct intel_connector *connector,
if (ret < 0)
return ret;
+ if (!panel->backlight.edp.vesa.info.aux_enable) {
+ ret = panel->backlight.pwm_funcs->setup(connector, pipe);
+ if (ret < 0) {
+ drm_err(&i915->drm,
+ "Failed to setup PWM backlight controls for eDP backlight: %d\n",
+ ret);
+ return ret;
+ }
+ }
panel->backlight.max = panel->backlight.edp.vesa.info.max;
panel->backlight.min = 0;
if (current_mode == DP_EDP_BACKLIGHT_CONTROL_MODE_DPCD) {
@@ -340,12 +360,7 @@ intel_dp_aux_supports_vesa_backlight(struct intel_connector *connector)
struct intel_dp *intel_dp = intel_attached_dp(connector);
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
- /* TODO: We currently only support AUX only backlight configurations, not backlights which
- * require a mix of PWM and AUX controls to work. In the mean time, these machines typically
- * work just fine using normal PWM controls anyway.
- */
- if ((intel_dp->edp_dpcd[1] & DP_EDP_BACKLIGHT_AUX_ENABLE_CAP) &&
- drm_edp_backlight_supported(intel_dp->edp_dpcd)) {
+ if (drm_edp_backlight_supported(intel_dp->edp_dpcd)) {
drm_dbg_kms(&i915->drm, "AUX Backlight Control Supported!\n");
return true;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4a7f4110f79163fd53ea65438041994ed615e3af Mon Sep 17 00:00:00 2001
From: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Wed, 1 Dec 2021 14:59:29 +0200
Subject: [PATCH] device property: Fix fwnode_graph_devcon_match() fwnode leak
For each endpoint it encounters, fwnode_graph_devcon_match() checks
whether the endpoint's remote port parent device is available. If it is
not, it ignores the endpoint but does not put the reference to the remote
endpoint port parent fwnode. For available devices the fwnode handle
reference is put as expected.
Put the reference for unavailable devices now.
Fixes: 637e9e52b185 ("device connection: Find device connections also from device graphs")
Cc: 5.1+ <stable(a)vger.kernel.org> # 5.1+
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/base/property.c b/drivers/base/property.c
index d0960a9e8974..b7b3a7b86006 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -1158,8 +1158,10 @@ fwnode_graph_devcon_match(struct fwnode_handle *fwnode, const char *con_id,
fwnode_graph_for_each_endpoint(fwnode, ep) {
node = fwnode_graph_get_remote_port_parent(ep);
- if (!fwnode_device_is_available(node))
+ if (!fwnode_device_is_available(node)) {
+ fwnode_handle_put(node);
continue;
+ }
ret = match(node, con_id, data);
fwnode_handle_put(node);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Mon, 10 Jan 2022 11:55:32 -0500
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory could not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accessing as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicing the machine, and that can be solved later.
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reported-by: Pingfan Liu <kernelfans(a)gmail.com>
Tested-by: Pingfan Liu <kernelfans(a)gmail.com>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..45e66a60a816 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -230,6 +230,16 @@ Currently the caret ('^') for an error always appears at the beginning of
the filter string; the error message should still be useful though
even without more accurate position info.
+5.2.1 Filter limitations
+------------------------
+
+If a filter is placed on a string pointer ``(char *)`` that does not point
+to a string on the ring buffer, but instead points to kernel or user space
+memory, then, for safety reasons, at most 1024 bytes of the content is
+copied onto a temporary buffer to do the compare. If the copy of the memory
+faults (the pointer points to memory that should not be accessed), then the
+string compare will be treated as not matching.
+
5.3 Clearing filters
--------------------
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..2e9ef64e9ee9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <tzanussi(a)gmail.com>
*/
+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,6 +655,47 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);
+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 1024
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ /*
+ * We use TASK_SIZE to denote user or kernel space, but this will
+ * not work for all architectures. If it picks the wrong one, it may
+ * just fail the filter (but will not bug).
+ *
+ * TODO: Have a way to properly denote which one this is for.
+ */
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = (char __user *)str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
@@ -671,10 +713,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
- cmp = pred->regex.match(*addr, &pred->regex, len);
+ str = test_string(*addr);
+ if (!str)
+ return 0;
+
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not;
@@ -1348,8 +1396,17 @@ static int parse_pred(const char *str, void *data,
pred->fn = filter_pred_strloc;
} else if (field->filter_type == FILTER_RDYN_STRING)
pred->fn = filter_pred_strrelloc;
- else
+ else {
+
+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
pred->fn = filter_pred_pchar;
+ }
/* go past the last quote */
i++;
@@ -1415,6 +1472,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}
enum {
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Mon, 10 Jan 2022 11:55:32 -0500
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory could not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accessing as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicing the machine, and that can be solved later.
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reported-by: Pingfan Liu <kernelfans(a)gmail.com>
Tested-by: Pingfan Liu <kernelfans(a)gmail.com>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..45e66a60a816 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -230,6 +230,16 @@ Currently the caret ('^') for an error always appears at the beginning of
the filter string; the error message should still be useful though
even without more accurate position info.
+5.2.1 Filter limitations
+------------------------
+
+If a filter is placed on a string pointer ``(char *)`` that does not point
+to a string on the ring buffer, but instead points to kernel or user space
+memory, then, for safety reasons, at most 1024 bytes of the content is
+copied onto a temporary buffer to do the compare. If the copy of the memory
+faults (the pointer points to memory that should not be accessed), then the
+string compare will be treated as not matching.
+
5.3 Clearing filters
--------------------
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..2e9ef64e9ee9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <tzanussi(a)gmail.com>
*/
+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,6 +655,47 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);
+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 1024
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ /*
+ * We use TASK_SIZE to denote user or kernel space, but this will
+ * not work for all architectures. If it picks the wrong one, it may
+ * just fail the filter (but will not bug).
+ *
+ * TODO: Have a way to properly denote which one this is for.
+ */
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = (char __user *)str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
@@ -671,10 +713,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
- cmp = pred->regex.match(*addr, &pred->regex, len);
+ str = test_string(*addr);
+ if (!str)
+ return 0;
+
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not;
@@ -1348,8 +1396,17 @@ static int parse_pred(const char *str, void *data,
pred->fn = filter_pred_strloc;
} else if (field->filter_type == FILTER_RDYN_STRING)
pred->fn = filter_pred_strrelloc;
- else
+ else {
+
+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
pred->fn = filter_pred_pchar;
+ }
/* go past the last quote */
i++;
@@ -1415,6 +1472,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}
enum {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Mon, 10 Jan 2022 11:55:32 -0500
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory could not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accessing as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicing the machine, and that can be solved later.
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reported-by: Pingfan Liu <kernelfans(a)gmail.com>
Tested-by: Pingfan Liu <kernelfans(a)gmail.com>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..45e66a60a816 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -230,6 +230,16 @@ Currently the caret ('^') for an error always appears at the beginning of
the filter string; the error message should still be useful though
even without more accurate position info.
+5.2.1 Filter limitations
+------------------------
+
+If a filter is placed on a string pointer ``(char *)`` that does not point
+to a string on the ring buffer, but instead points to kernel or user space
+memory, then, for safety reasons, at most 1024 bytes of the content is
+copied onto a temporary buffer to do the compare. If the copy of the memory
+faults (the pointer points to memory that should not be accessed), then the
+string compare will be treated as not matching.
+
5.3 Clearing filters
--------------------
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..2e9ef64e9ee9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <tzanussi(a)gmail.com>
*/
+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,6 +655,47 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);
+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 1024
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ /*
+ * We use TASK_SIZE to denote user or kernel space, but this will
+ * not work for all architectures. If it picks the wrong one, it may
+ * just fail the filter (but will not bug).
+ *
+ * TODO: Have a way to properly denote which one this is for.
+ */
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = (char __user *)str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
@@ -671,10 +713,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
- cmp = pred->regex.match(*addr, &pred->regex, len);
+ str = test_string(*addr);
+ if (!str)
+ return 0;
+
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not;
@@ -1348,8 +1396,17 @@ static int parse_pred(const char *str, void *data,
pred->fn = filter_pred_strloc;
} else if (field->filter_type == FILTER_RDYN_STRING)
pred->fn = filter_pred_strrelloc;
- else
+ else {
+
+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
pred->fn = filter_pred_pchar;
+ }
/* go past the last quote */
i++;
@@ -1415,6 +1472,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}
enum {
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Mon, 10 Jan 2022 11:55:32 -0500
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory could not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accessing as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicing the machine, and that can be solved later.
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reported-by: Pingfan Liu <kernelfans(a)gmail.com>
Tested-by: Pingfan Liu <kernelfans(a)gmail.com>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..45e66a60a816 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -230,6 +230,16 @@ Currently the caret ('^') for an error always appears at the beginning of
the filter string; the error message should still be useful though
even without more accurate position info.
+5.2.1 Filter limitations
+------------------------
+
+If a filter is placed on a string pointer ``(char *)`` that does not point
+to a string on the ring buffer, but instead points to kernel or user space
+memory, then, for safety reasons, at most 1024 bytes of the content is
+copied onto a temporary buffer to do the compare. If the copy of the memory
+faults (the pointer points to memory that should not be accessed), then the
+string compare will be treated as not matching.
+
5.3 Clearing filters
--------------------
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..2e9ef64e9ee9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <tzanussi(a)gmail.com>
*/
+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,6 +655,47 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);
+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 1024
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ /*
+ * We use TASK_SIZE to denote user or kernel space, but this will
+ * not work for all architectures. If it picks the wrong one, it may
+ * just fail the filter (but will not bug).
+ *
+ * TODO: Have a way to properly denote which one this is for.
+ */
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = (char __user *)str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
@@ -671,10 +713,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
- cmp = pred->regex.match(*addr, &pred->regex, len);
+ str = test_string(*addr);
+ if (!str)
+ return 0;
+
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not;
@@ -1348,8 +1396,17 @@ static int parse_pred(const char *str, void *data,
pred->fn = filter_pred_strloc;
} else if (field->filter_type == FILTER_RDYN_STRING)
pred->fn = filter_pred_strrelloc;
- else
+ else {
+
+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
pred->fn = filter_pred_pchar;
+ }
/* go past the last quote */
i++;
@@ -1415,6 +1472,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}
enum {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Mon, 10 Jan 2022 11:55:32 -0500
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory could not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accessing as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicing the machine, and that can be solved later.
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reported-by: Pingfan Liu <kernelfans(a)gmail.com>
Tested-by: Pingfan Liu <kernelfans(a)gmail.com>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..45e66a60a816 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -230,6 +230,16 @@ Currently the caret ('^') for an error always appears at the beginning of
the filter string; the error message should still be useful though
even without more accurate position info.
+5.2.1 Filter limitations
+------------------------
+
+If a filter is placed on a string pointer ``(char *)`` that does not point
+to a string on the ring buffer, but instead points to kernel or user space
+memory, then, for safety reasons, at most 1024 bytes of the content is
+copied onto a temporary buffer to do the compare. If the copy of the memory
+faults (the pointer points to memory that should not be accessed), then the
+string compare will be treated as not matching.
+
5.3 Clearing filters
--------------------
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..2e9ef64e9ee9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <tzanussi(a)gmail.com>
*/
+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,6 +655,47 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);
+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 1024
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ /*
+ * We use TASK_SIZE to denote user or kernel space, but this will
+ * not work for all architectures. If it picks the wrong one, it may
+ * just fail the filter (but will not bug).
+ *
+ * TODO: Have a way to properly denote which one this is for.
+ */
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = (char __user *)str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
@@ -671,10 +713,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
- cmp = pred->regex.match(*addr, &pred->regex, len);
+ str = test_string(*addr);
+ if (!str)
+ return 0;
+
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not;
@@ -1348,8 +1396,17 @@ static int parse_pred(const char *str, void *data,
pred->fn = filter_pred_strloc;
} else if (field->filter_type == FILTER_RDYN_STRING)
pred->fn = filter_pred_strrelloc;
- else
+ else {
+
+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
pred->fn = filter_pred_pchar;
+ }
/* go past the last quote */
i++;
@@ -1415,6 +1472,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}
enum {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Mon, 10 Jan 2022 11:55:32 -0500
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory could not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accessing as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicing the machine, and that can be solved later.
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reported-by: Pingfan Liu <kernelfans(a)gmail.com>
Tested-by: Pingfan Liu <kernelfans(a)gmail.com>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..45e66a60a816 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -230,6 +230,16 @@ Currently the caret ('^') for an error always appears at the beginning of
the filter string; the error message should still be useful though
even without more accurate position info.
+5.2.1 Filter limitations
+------------------------
+
+If a filter is placed on a string pointer ``(char *)`` that does not point
+to a string on the ring buffer, but instead points to kernel or user space
+memory, then, for safety reasons, at most 1024 bytes of the content is
+copied onto a temporary buffer to do the compare. If the copy of the memory
+faults (the pointer points to memory that should not be accessed), then the
+string compare will be treated as not matching.
+
5.3 Clearing filters
--------------------
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..2e9ef64e9ee9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <tzanussi(a)gmail.com>
*/
+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,6 +655,47 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);
+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 1024
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ /*
+ * We use TASK_SIZE to denote user or kernel space, but this will
+ * not work for all architectures. If it picks the wrong one, it may
+ * just fail the filter (but will not bug).
+ *
+ * TODO: Have a way to properly denote which one this is for.
+ */
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = (char __user *)str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
@@ -671,10 +713,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
- cmp = pred->regex.match(*addr, &pred->regex, len);
+ str = test_string(*addr);
+ if (!str)
+ return 0;
+
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not;
@@ -1348,8 +1396,17 @@ static int parse_pred(const char *str, void *data,
pred->fn = filter_pred_strloc;
} else if (field->filter_type == FILTER_RDYN_STRING)
pred->fn = filter_pred_strrelloc;
- else
+ else {
+
+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
pred->fn = filter_pred_pchar;
+ }
/* go past the last quote */
i++;
@@ -1415,6 +1472,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}
enum {
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Mon, 10 Jan 2022 11:55:32 -0500
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory could not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accessing as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicing the machine, and that can be solved later.
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reported-by: Pingfan Liu <kernelfans(a)gmail.com>
Tested-by: Pingfan Liu <kernelfans(a)gmail.com>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..45e66a60a816 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -230,6 +230,16 @@ Currently the caret ('^') for an error always appears at the beginning of
the filter string; the error message should still be useful though
even without more accurate position info.
+5.2.1 Filter limitations
+------------------------
+
+If a filter is placed on a string pointer ``(char *)`` that does not point
+to a string on the ring buffer, but instead points to kernel or user space
+memory, then, for safety reasons, at most 1024 bytes of the content is
+copied onto a temporary buffer to do the compare. If the copy of the memory
+faults (the pointer points to memory that should not be accessed), then the
+string compare will be treated as not matching.
+
5.3 Clearing filters
--------------------
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..2e9ef64e9ee9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <tzanussi(a)gmail.com>
*/
+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,6 +655,47 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);
+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 1024
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ /*
+ * We use TASK_SIZE to denote user or kernel space, but this will
+ * not work for all architectures. If it picks the wrong one, it may
+ * just fail the filter (but will not bug).
+ *
+ * TODO: Have a way to properly denote which one this is for.
+ */
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = (char __user *)str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
@@ -671,10 +713,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
- cmp = pred->regex.match(*addr, &pred->regex, len);
+ str = test_string(*addr);
+ if (!str)
+ return 0;
+
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not;
@@ -1348,8 +1396,17 @@ static int parse_pred(const char *str, void *data,
pred->fn = filter_pred_strloc;
} else if (field->filter_type == FILTER_RDYN_STRING)
pred->fn = filter_pred_strrelloc;
- else
+ else {
+
+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
pred->fn = filter_pred_pchar;
+ }
/* go past the last quote */
i++;
@@ -1415,6 +1472,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}
enum {
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77360f9bbc7e5e2ab7a2c8b4c0244fbbfcfc6f62 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Mon, 10 Jan 2022 11:55:32 -0500
Subject: [PATCH] tracing: Add test for user space strings when filtering on
string pointers
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_at/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory could not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accessing as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicing the machine, and that can be solved later.
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reported-by: Pingfan Liu <kernelfans(a)gmail.com>
Tested-by: Pingfan Liu <kernelfans(a)gmail.com>
Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..45e66a60a816 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -230,6 +230,16 @@ Currently the caret ('^') for an error always appears at the beginning of
the filter string; the error message should still be useful though
even without more accurate position info.
+5.2.1 Filter limitations
+------------------------
+
+If a filter is placed on a string pointer ``(char *)`` that does not point
+to a string on the ring buffer, but instead points to kernel or user space
+memory, then, for safety reasons, at most 1024 bytes of the content is
+copied onto a temporary buffer to do the compare. If the copy of the memory
+faults (the pointer points to memory that should not be accessed), then the
+string compare will be treated as not matching.
+
5.3 Clearing filters
--------------------
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 996920ed1812..2e9ef64e9ee9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -5,6 +5,7 @@
* Copyright (C) 2009 Tom Zanussi <tzanussi(a)gmail.com>
*/
+#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/mutex.h>
@@ -654,6 +655,47 @@ DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
DEFINE_EQUALITY_PRED(8);
+/* user space strings temp buffer */
+#define USTRING_BUF_SIZE 1024
+
+struct ustring_buffer {
+ char buffer[USTRING_BUF_SIZE];
+};
+
+static __percpu struct ustring_buffer *ustring_per_cpu;
+
+static __always_inline char *test_string(char *str)
+{
+ struct ustring_buffer *ubuf;
+ char __user *ustr;
+ char *kstr;
+
+ if (!ustring_per_cpu)
+ return NULL;
+
+ ubuf = this_cpu_ptr(ustring_per_cpu);
+ kstr = ubuf->buffer;
+
+ /*
+ * We use TASK_SIZE to denote user or kernel space, but this will
+ * not work for all architectures. If it picks the wrong one, it may
+ * just fail the filter (but will not bug).
+ *
+ * TODO: Have a way to properly denote which one this is for.
+ */
+ if (likely((unsigned long)str >= TASK_SIZE)) {
+ /* For safety, do not trust the string pointer */
+ if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
+ return NULL;
+ } else {
+ /* user space address? */
+ ustr = (char __user *)str;
+ if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
+ return NULL;
+ }
+ return kstr;
+}
+
/* Filter predicate for fixed sized arrays of characters */
static int filter_pred_string(struct filter_pred *pred, void *event)
{
@@ -671,10 +713,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
static int filter_pred_pchar(struct filter_pred *pred, void *event)
{
char **addr = (char **)(event + pred->offset);
+ char *str;
int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
+ int len;
- cmp = pred->regex.match(*addr, &pred->regex, len);
+ str = test_string(*addr);
+ if (!str)
+ return 0;
+
+ len = strlen(str) + 1; /* including tailing '\0' */
+ cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not;
@@ -1348,8 +1396,17 @@ static int parse_pred(const char *str, void *data,
pred->fn = filter_pred_strloc;
} else if (field->filter_type == FILTER_RDYN_STRING)
pred->fn = filter_pred_strrelloc;
- else
+ else {
+
+ if (!ustring_per_cpu) {
+ /* Once allocated, keep it around for good */
+ ustring_per_cpu = alloc_percpu(struct ustring_buffer);
+ if (!ustring_per_cpu)
+ goto err_mem;
+ }
+
pred->fn = filter_pred_pchar;
+ }
/* go past the last quote */
i++;
@@ -1415,6 +1472,9 @@ static int parse_pred(const char *str, void *data,
err_free:
kfree(pred);
return -EINVAL;
+err_mem:
+ kfree(pred);
+ return -ENOMEM;
}
enum {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3e2a56e6f639492311e0a8533f0a7aed60816308 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 7 Jan 2022 17:56:56 -0500
Subject: [PATCH] tracing: Have syscall trace events use
trace_event_buffer_lock_reserve()
Currently, the syscall trace events call trace_buffer_lock_reserve()
directly, which means that it misses out on some of the filtering
optimizations provided by the helper function
trace_event_buffer_lock_reserve(). Have the syscall trace events call that
instead, as it was missed when adding the update to use the temp buffer
when filtering.
Link: https://lkml.kernel.org/r/20220107225839.823118570@goodmis.org
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 8bfcd3b09422..f755bde42fd0 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -323,8 +323,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->enter_event->event.type, size, trace_ctx);
if (!event)
return;
@@ -367,8 +366,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->exit_event->event.type, sizeof(*entry),
trace_ctx);
if (!event)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3e2a56e6f639492311e0a8533f0a7aed60816308 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 7 Jan 2022 17:56:56 -0500
Subject: [PATCH] tracing: Have syscall trace events use
trace_event_buffer_lock_reserve()
Currently, the syscall trace events call trace_buffer_lock_reserve()
directly, which means that it misses out on some of the filtering
optimizations provided by the helper function
trace_event_buffer_lock_reserve(). Have the syscall trace events call that
instead, as it was missed when adding the update to use the temp buffer
when filtering.
Link: https://lkml.kernel.org/r/20220107225839.823118570@goodmis.org
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 8bfcd3b09422..f755bde42fd0 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -323,8 +323,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->enter_event->event.type, size, trace_ctx);
if (!event)
return;
@@ -367,8 +366,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->exit_event->event.type, sizeof(*entry),
trace_ctx);
if (!event)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3e2a56e6f639492311e0a8533f0a7aed60816308 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 7 Jan 2022 17:56:56 -0500
Subject: [PATCH] tracing: Have syscall trace events use
trace_event_buffer_lock_reserve()
Currently, the syscall trace events call trace_buffer_lock_reserve()
directly, which means that it misses out on some of the filtering
optimizations provided by the helper function
trace_event_buffer_lock_reserve(). Have the syscall trace events call that
instead, as it was missed when adding the update to use the temp buffer
when filtering.
Link: https://lkml.kernel.org/r/20220107225839.823118570@goodmis.org
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 8bfcd3b09422..f755bde42fd0 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -323,8 +323,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->enter_event->event.type, size, trace_ctx);
if (!event)
return;
@@ -367,8 +366,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->exit_event->event.type, sizeof(*entry),
trace_ctx);
if (!event)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3e2a56e6f639492311e0a8533f0a7aed60816308 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 7 Jan 2022 17:56:56 -0500
Subject: [PATCH] tracing: Have syscall trace events use
trace_event_buffer_lock_reserve()
Currently, the syscall trace events call trace_buffer_lock_reserve()
directly, which means that it misses out on some of the filtering
optimizations provided by the helper function
trace_event_buffer_lock_reserve(). Have the syscall trace events call that
instead, as it was missed when adding the update to use the temp buffer
when filtering.
Link: https://lkml.kernel.org/r/20220107225839.823118570@goodmis.org
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 8bfcd3b09422..f755bde42fd0 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -323,8 +323,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->enter_event->event.type, size, trace_ctx);
if (!event)
return;
@@ -367,8 +366,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->exit_event->event.type, sizeof(*entry),
trace_ctx);
if (!event)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3e2a56e6f639492311e0a8533f0a7aed60816308 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 7 Jan 2022 17:56:56 -0500
Subject: [PATCH] tracing: Have syscall trace events use
trace_event_buffer_lock_reserve()
Currently, the syscall trace events call trace_buffer_lock_reserve()
directly, which means that it misses out on some of the filtering
optimizations provided by the helper function
trace_event_buffer_lock_reserve(). Have the syscall trace events call that
instead, as it was missed when adding the update to use the temp buffer
when filtering.
Link: https://lkml.kernel.org/r/20220107225839.823118570@goodmis.org
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 8bfcd3b09422..f755bde42fd0 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -323,8 +323,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->enter_event->event.type, size, trace_ctx);
if (!event)
return;
@@ -367,8 +366,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
trace_ctx = tracing_gen_ctx();
- buffer = tr->array_buffer.buffer;
- event = trace_buffer_lock_reserve(buffer,
+ event = trace_event_buffer_lock_reserve(&buffer, trace_file,
sys_data->exit_event->event.type, sizeof(*entry),
trace_ctx);
if (!event)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dfea08a2116fe327f79d8f4d4b2cf6e0c88be11f Mon Sep 17 00:00:00 2001
From: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Date: Fri, 7 Jan 2022 23:02:42 +0800
Subject: [PATCH] tracing/kprobes: 'nmissed' not showed correctly for kretprobe
The 'nmissed' column of the 'kprobe_profile' file for kretprobe is
not showed correctly, kretprobe can be skipped by two reasons,
shortage of kretprobe_instance which is counted by tk->rp.nmissed,
and kprobe itself is missed by some reason, so to show the sum.
Link: https://lkml.kernel.org/r/20220107150242.5019-1-xyz.sun.ok@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 4a846b443b4e ("tracing/kprobes: Cleanup kprobe tracer code")
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index f8c26ee72de3..3d85323278ed 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1170,15 +1170,18 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
{
struct dyn_event *ev = v;
struct trace_kprobe *tk;
+ unsigned long nmissed;
if (!is_trace_kprobe(ev))
return 0;
tk = to_trace_kprobe(ev);
+ nmissed = trace_kprobe_is_return(tk) ?
+ tk->rp.kp.nmissed + tk->rp.nmissed : tk->rp.kp.nmissed;
seq_printf(m, " %-44s %15lu %15lu\n",
trace_probe_name(&tk->tp),
trace_kprobe_nhit(tk),
- tk->rp.kp.nmissed);
+ nmissed);
return 0;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dfea08a2116fe327f79d8f4d4b2cf6e0c88be11f Mon Sep 17 00:00:00 2001
From: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Date: Fri, 7 Jan 2022 23:02:42 +0800
Subject: [PATCH] tracing/kprobes: 'nmissed' not showed correctly for kretprobe
The 'nmissed' column of the 'kprobe_profile' file for kretprobe is
not showed correctly, kretprobe can be skipped by two reasons,
shortage of kretprobe_instance which is counted by tk->rp.nmissed,
and kprobe itself is missed by some reason, so to show the sum.
Link: https://lkml.kernel.org/r/20220107150242.5019-1-xyz.sun.ok@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 4a846b443b4e ("tracing/kprobes: Cleanup kprobe tracer code")
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index f8c26ee72de3..3d85323278ed 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1170,15 +1170,18 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
{
struct dyn_event *ev = v;
struct trace_kprobe *tk;
+ unsigned long nmissed;
if (!is_trace_kprobe(ev))
return 0;
tk = to_trace_kprobe(ev);
+ nmissed = trace_kprobe_is_return(tk) ?
+ tk->rp.kp.nmissed + tk->rp.nmissed : tk->rp.kp.nmissed;
seq_printf(m, " %-44s %15lu %15lu\n",
trace_probe_name(&tk->tp),
trace_kprobe_nhit(tk),
- tk->rp.kp.nmissed);
+ nmissed);
return 0;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dfea08a2116fe327f79d8f4d4b2cf6e0c88be11f Mon Sep 17 00:00:00 2001
From: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Date: Fri, 7 Jan 2022 23:02:42 +0800
Subject: [PATCH] tracing/kprobes: 'nmissed' not showed correctly for kretprobe
The 'nmissed' column of the 'kprobe_profile' file for kretprobe is
not showed correctly, kretprobe can be skipped by two reasons,
shortage of kretprobe_instance which is counted by tk->rp.nmissed,
and kprobe itself is missed by some reason, so to show the sum.
Link: https://lkml.kernel.org/r/20220107150242.5019-1-xyz.sun.ok@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 4a846b443b4e ("tracing/kprobes: Cleanup kprobe tracer code")
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index f8c26ee72de3..3d85323278ed 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1170,15 +1170,18 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
{
struct dyn_event *ev = v;
struct trace_kprobe *tk;
+ unsigned long nmissed;
if (!is_trace_kprobe(ev))
return 0;
tk = to_trace_kprobe(ev);
+ nmissed = trace_kprobe_is_return(tk) ?
+ tk->rp.kp.nmissed + tk->rp.nmissed : tk->rp.kp.nmissed;
seq_printf(m, " %-44s %15lu %15lu\n",
trace_probe_name(&tk->tp),
trace_kprobe_nhit(tk),
- tk->rp.kp.nmissed);
+ nmissed);
return 0;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dfea08a2116fe327f79d8f4d4b2cf6e0c88be11f Mon Sep 17 00:00:00 2001
From: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Date: Fri, 7 Jan 2022 23:02:42 +0800
Subject: [PATCH] tracing/kprobes: 'nmissed' not showed correctly for kretprobe
The 'nmissed' column of the 'kprobe_profile' file for kretprobe is
not showed correctly, kretprobe can be skipped by two reasons,
shortage of kretprobe_instance which is counted by tk->rp.nmissed,
and kprobe itself is missed by some reason, so to show the sum.
Link: https://lkml.kernel.org/r/20220107150242.5019-1-xyz.sun.ok@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 4a846b443b4e ("tracing/kprobes: Cleanup kprobe tracer code")
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index f8c26ee72de3..3d85323278ed 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1170,15 +1170,18 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
{
struct dyn_event *ev = v;
struct trace_kprobe *tk;
+ unsigned long nmissed;
if (!is_trace_kprobe(ev))
return 0;
tk = to_trace_kprobe(ev);
+ nmissed = trace_kprobe_is_return(tk) ?
+ tk->rp.kp.nmissed + tk->rp.nmissed : tk->rp.kp.nmissed;
seq_printf(m, " %-44s %15lu %15lu\n",
trace_probe_name(&tk->tp),
trace_kprobe_nhit(tk),
- tk->rp.kp.nmissed);
+ nmissed);
return 0;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9731698ecb9c851f353ce2496292ff9fcea39dff Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <arbn(a)yandex-team.com>
Date: Mon, 15 Nov 2021 19:46:04 +0300
Subject: [PATCH] cputime, cpuacct: Include guest time in user time in
cpuacct.stat
cpuacct.stat in no-root cgroups shows user time without guest time
included int it. This doesn't match with user time shown in root
cpuacct.stat and /proc/<pid>/stat. This also affects cgroup2's cpu.stat
in the same way.
Make account_guest_time() to add user time to cgroup's cpustat to
fix this.
Fixes: ef12fefabf94 ("cpuacct: add per-cgroup utime/stime statistics")
Signed-off-by: Andrey Ryabinin <arbn(a)yandex-team.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20211115164607.23784-1-arbn@yandex-team.com
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 872e481d5098..042a6dbce8f3 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -148,10 +148,10 @@ void account_guest_time(struct task_struct *p, u64 cputime)
/* Add guest time to cpustat. */
if (task_nice(p) > 0) {
- cpustat[CPUTIME_NICE] += cputime;
+ task_group_account_field(p, CPUTIME_NICE, cputime);
cpustat[CPUTIME_GUEST_NICE] += cputime;
} else {
- cpustat[CPUTIME_USER] += cputime;
+ task_group_account_field(p, CPUTIME_USER, cputime);
cpustat[CPUTIME_GUEST] += cputime;
}
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9731698ecb9c851f353ce2496292ff9fcea39dff Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <arbn(a)yandex-team.com>
Date: Mon, 15 Nov 2021 19:46:04 +0300
Subject: [PATCH] cputime, cpuacct: Include guest time in user time in
cpuacct.stat
cpuacct.stat in no-root cgroups shows user time without guest time
included int it. This doesn't match with user time shown in root
cpuacct.stat and /proc/<pid>/stat. This also affects cgroup2's cpu.stat
in the same way.
Make account_guest_time() to add user time to cgroup's cpustat to
fix this.
Fixes: ef12fefabf94 ("cpuacct: add per-cgroup utime/stime statistics")
Signed-off-by: Andrey Ryabinin <arbn(a)yandex-team.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20211115164607.23784-1-arbn@yandex-team.com
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 872e481d5098..042a6dbce8f3 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -148,10 +148,10 @@ void account_guest_time(struct task_struct *p, u64 cputime)
/* Add guest time to cpustat. */
if (task_nice(p) > 0) {
- cpustat[CPUTIME_NICE] += cputime;
+ task_group_account_field(p, CPUTIME_NICE, cputime);
cpustat[CPUTIME_GUEST_NICE] += cputime;
} else {
- cpustat[CPUTIME_USER] += cputime;
+ task_group_account_field(p, CPUTIME_USER, cputime);
cpustat[CPUTIME_GUEST] += cputime;
}
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 80bb73a9fbcde4ecc55e12f10c73fabbe68a24d1 Mon Sep 17 00:00:00 2001
From: Kunihiko Hayashi <hayashi.kunihiko(a)socionext.com>
Date: Wed, 22 Dec 2021 13:48:12 +0900
Subject: [PATCH] spi: uniphier: Fix a bug that doesn't point to private data
correctly
In uniphier_spi_remove(), there is a wrong code to get private data from
the platform device, so the driver can't be removed properly.
The driver should get spi_master from the platform device and retrieve
the private data from it.
Cc: <stable(a)vger.kernel.org>
Fixes: 5ba155a4d4cc ("spi: add SPI controller driver for UniPhier SoC")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko(a)socionext.com>
Link: https://lore.kernel.org/r/1640148492-32178-1-git-send-email-hayashi.kunihik…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/drivers/spi/spi-uniphier.c b/drivers/spi/spi-uniphier.c
index 8900e51e1a1c..342ee8d2c476 100644
--- a/drivers/spi/spi-uniphier.c
+++ b/drivers/spi/spi-uniphier.c
@@ -767,12 +767,13 @@ static int uniphier_spi_probe(struct platform_device *pdev)
static int uniphier_spi_remove(struct platform_device *pdev)
{
- struct uniphier_spi_priv *priv = platform_get_drvdata(pdev);
+ struct spi_master *master = platform_get_drvdata(pdev);
+ struct uniphier_spi_priv *priv = spi_master_get_devdata(master);
- if (priv->master->dma_tx)
- dma_release_channel(priv->master->dma_tx);
- if (priv->master->dma_rx)
- dma_release_channel(priv->master->dma_rx);
+ if (master->dma_tx)
+ dma_release_channel(master->dma_tx);
+ if (master->dma_rx)
+ dma_release_channel(master->dma_rx);
clk_disable_unprepare(priv->clk);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 80bb73a9fbcde4ecc55e12f10c73fabbe68a24d1 Mon Sep 17 00:00:00 2001
From: Kunihiko Hayashi <hayashi.kunihiko(a)socionext.com>
Date: Wed, 22 Dec 2021 13:48:12 +0900
Subject: [PATCH] spi: uniphier: Fix a bug that doesn't point to private data
correctly
In uniphier_spi_remove(), there is a wrong code to get private data from
the platform device, so the driver can't be removed properly.
The driver should get spi_master from the platform device and retrieve
the private data from it.
Cc: <stable(a)vger.kernel.org>
Fixes: 5ba155a4d4cc ("spi: add SPI controller driver for UniPhier SoC")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko(a)socionext.com>
Link: https://lore.kernel.org/r/1640148492-32178-1-git-send-email-hayashi.kunihik…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/drivers/spi/spi-uniphier.c b/drivers/spi/spi-uniphier.c
index 8900e51e1a1c..342ee8d2c476 100644
--- a/drivers/spi/spi-uniphier.c
+++ b/drivers/spi/spi-uniphier.c
@@ -767,12 +767,13 @@ static int uniphier_spi_probe(struct platform_device *pdev)
static int uniphier_spi_remove(struct platform_device *pdev)
{
- struct uniphier_spi_priv *priv = platform_get_drvdata(pdev);
+ struct spi_master *master = platform_get_drvdata(pdev);
+ struct uniphier_spi_priv *priv = spi_master_get_devdata(master);
- if (priv->master->dma_tx)
- dma_release_channel(priv->master->dma_tx);
- if (priv->master->dma_rx)
- dma_release_channel(priv->master->dma_rx);
+ if (master->dma_tx)
+ dma_release_channel(master->dma_tx);
+ if (master->dma_rx)
+ dma_release_channel(master->dma_rx);
clk_disable_unprepare(priv->clk);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 29009604ad4e3ef784fd9b9fef6f23610ddf633d Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Mon, 20 Dec 2021 20:50:22 +0100
Subject: [PATCH] crypto: stm32/crc32 - Fix kernel BUG triggered in probe()
The include/linux/crypto.h struct crypto_alg field cra_driver_name description
states "Unique name of the transformation provider. " ... " this contains the
name of the chip or provider and the name of the transformation algorithm."
In case of the stm32-crc driver, field cra_driver_name is identical for all
registered transformation providers and set to the name of the driver itself,
which is incorrect. This patch fixes it by assigning a unique cra_driver_name
to each registered transformation provider.
The kernel crash is triggered when the driver calls crypto_register_shashes()
which calls crypto_register_shash(), which calls crypto_register_alg(), which
calls __crypto_register_alg(), which returns -EEXIST, which is propagated
back through this call chain. Upon -EEXIST from crypto_register_shash(), the
crypto_register_shashes() starts unregistering the providers back, and calls
crypto_unregister_shash(), which calls crypto_unregister_alg(), and this is
where the BUG() triggers due to incorrect cra_refcnt.
Fixes: b51dbe90912a ("crypto: stm32 - Support for STM32 CRC32 crypto module")
Signed-off-by: Marek Vasut <marex(a)denx.de>
Cc: <stable(a)vger.kernel.org> # 4.12+
Cc: Alexandre Torgue <alexandre.torgue(a)foss.st.com>
Cc: Fabien Dessenne <fabien.dessenne(a)st.com>
Cc: Herbert Xu <herbert(a)gondor.apana.org.au>
Cc: Lionel Debieve <lionel.debieve(a)st.com>
Cc: Nicolas Toromanoff <nicolas.toromanoff(a)st.com>
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-stm32(a)st-md-mailman.stormreply.com
To: linux-crypto(a)vger.kernel.org
Acked-by: Nicolas Toromanoff <nicolas.toromanoff(a)foss.st.com>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/drivers/crypto/stm32/stm32-crc32.c b/drivers/crypto/stm32/stm32-crc32.c
index 75867c0b0017..be1bf39a317d 100644
--- a/drivers/crypto/stm32/stm32-crc32.c
+++ b/drivers/crypto/stm32/stm32-crc32.c
@@ -279,7 +279,7 @@ static struct shash_alg algs[] = {
.digestsize = CHKSUM_DIGEST_SIZE,
.base = {
.cra_name = "crc32",
- .cra_driver_name = DRIVER_NAME,
+ .cra_driver_name = "stm32-crc32-crc32",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.cra_blocksize = CHKSUM_BLOCK_SIZE,
@@ -301,7 +301,7 @@ static struct shash_alg algs[] = {
.digestsize = CHKSUM_DIGEST_SIZE,
.base = {
.cra_name = "crc32c",
- .cra_driver_name = DRIVER_NAME,
+ .cra_driver_name = "stm32-crc32-crc32c",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.cra_blocksize = CHKSUM_BLOCK_SIZE,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8066c615cb69b7da8a94f59379847b037b3a5e46 Mon Sep 17 00:00:00 2001
From: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
Date: Mon, 6 Dec 2021 20:07:58 +0100
Subject: [PATCH] rpmsg: core: Clean up resources on announce_create failure.
During the rpmsg_dev_probe, if rpdev->ops->announce_create returns an
error, the rpmsg device and default endpoint should be freed before
exiting the function.
Fixes: 5e619b48677c ("rpmsg: Split rpmsg core and virtio backend")
Suggested-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20211206190758.10004-1-arnaud.pouliquen@foss.st.c…
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
diff --git a/drivers/rpmsg/rpmsg_core.c b/drivers/rpmsg/rpmsg_core.c
index f031b2b1b21c..d9e612f4f0f2 100644
--- a/drivers/rpmsg/rpmsg_core.c
+++ b/drivers/rpmsg/rpmsg_core.c
@@ -540,13 +540,25 @@ static int rpmsg_dev_probe(struct device *dev)
err = rpdrv->probe(rpdev);
if (err) {
dev_err(dev, "%s: failed: %d\n", __func__, err);
- if (ept)
- rpmsg_destroy_ept(ept);
- goto out;
+ goto destroy_ept;
}
- if (ept && rpdev->ops->announce_create)
+ if (ept && rpdev->ops->announce_create) {
err = rpdev->ops->announce_create(rpdev);
+ if (err) {
+ dev_err(dev, "failed to announce creation\n");
+ goto remove_rpdev;
+ }
+ }
+
+ return 0;
+
+remove_rpdev:
+ if (rpdrv->remove)
+ rpdrv->remove(rpdev);
+destroy_ept:
+ if (ept)
+ rpmsg_destroy_ept(ept);
out:
return err;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8066c615cb69b7da8a94f59379847b037b3a5e46 Mon Sep 17 00:00:00 2001
From: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
Date: Mon, 6 Dec 2021 20:07:58 +0100
Subject: [PATCH] rpmsg: core: Clean up resources on announce_create failure.
During the rpmsg_dev_probe, if rpdev->ops->announce_create returns an
error, the rpmsg device and default endpoint should be freed before
exiting the function.
Fixes: 5e619b48677c ("rpmsg: Split rpmsg core and virtio backend")
Suggested-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20211206190758.10004-1-arnaud.pouliquen@foss.st.c…
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
diff --git a/drivers/rpmsg/rpmsg_core.c b/drivers/rpmsg/rpmsg_core.c
index f031b2b1b21c..d9e612f4f0f2 100644
--- a/drivers/rpmsg/rpmsg_core.c
+++ b/drivers/rpmsg/rpmsg_core.c
@@ -540,13 +540,25 @@ static int rpmsg_dev_probe(struct device *dev)
err = rpdrv->probe(rpdev);
if (err) {
dev_err(dev, "%s: failed: %d\n", __func__, err);
- if (ept)
- rpmsg_destroy_ept(ept);
- goto out;
+ goto destroy_ept;
}
- if (ept && rpdev->ops->announce_create)
+ if (ept && rpdev->ops->announce_create) {
err = rpdev->ops->announce_create(rpdev);
+ if (err) {
+ dev_err(dev, "failed to announce creation\n");
+ goto remove_rpdev;
+ }
+ }
+
+ return 0;
+
+remove_rpdev:
+ if (rpdrv->remove)
+ rpdrv->remove(rpdev);
+destroy_ept:
+ if (ept)
+ rpmsg_destroy_ept(ept);
out:
return err;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8066c615cb69b7da8a94f59379847b037b3a5e46 Mon Sep 17 00:00:00 2001
From: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
Date: Mon, 6 Dec 2021 20:07:58 +0100
Subject: [PATCH] rpmsg: core: Clean up resources on announce_create failure.
During the rpmsg_dev_probe, if rpdev->ops->announce_create returns an
error, the rpmsg device and default endpoint should be freed before
exiting the function.
Fixes: 5e619b48677c ("rpmsg: Split rpmsg core and virtio backend")
Suggested-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20211206190758.10004-1-arnaud.pouliquen@foss.st.c…
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
diff --git a/drivers/rpmsg/rpmsg_core.c b/drivers/rpmsg/rpmsg_core.c
index f031b2b1b21c..d9e612f4f0f2 100644
--- a/drivers/rpmsg/rpmsg_core.c
+++ b/drivers/rpmsg/rpmsg_core.c
@@ -540,13 +540,25 @@ static int rpmsg_dev_probe(struct device *dev)
err = rpdrv->probe(rpdev);
if (err) {
dev_err(dev, "%s: failed: %d\n", __func__, err);
- if (ept)
- rpmsg_destroy_ept(ept);
- goto out;
+ goto destroy_ept;
}
- if (ept && rpdev->ops->announce_create)
+ if (ept && rpdev->ops->announce_create) {
err = rpdev->ops->announce_create(rpdev);
+ if (err) {
+ dev_err(dev, "failed to announce creation\n");
+ goto remove_rpdev;
+ }
+ }
+
+ return 0;
+
+remove_rpdev:
+ if (rpdrv->remove)
+ rpdrv->remove(rpdev);
+destroy_ept:
+ if (ept)
+ rpmsg_destroy_ept(ept);
out:
return err;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5b78223f55a0f516a1639dbe11cd4324d4aaee20 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Mon, 6 Dec 2021 18:48:06 +0100
Subject: [PATCH] mfd: intel_soc_pmic: Use CPU-id check instead of _HRV check
to differentiate variants
The Intel Crystal Cove PMIC has 2 different variants, one for use with
Bay Trail (BYT) SoCs and one for use with Cherry Trail (CHT) SoCs.
So far we have been using an ACPI _HRV check to differentiate between
the 2, but at least on the Microsoft Surface 3, which is a CHT device,
the wrong _HRV value is reported by ACPI.
So instead switch to a CPU-ID check which prevents us from relying on
the possibly wrong ACPI _HRV value.
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Reported-by: Tsuchiya Yuto <kitakar(a)gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Signed-off-by: Lee Jones <lee.jones(a)linaro.org>
Link: https://lore.kernel.org/r/20211206174806.197772-2-hdegoede@redhat.com
diff --git a/drivers/mfd/intel_soc_pmic_core.c b/drivers/mfd/intel_soc_pmic_core.c
index ddd64f9e3341..47cb7f00dfcf 100644
--- a/drivers/mfd/intel_soc_pmic_core.c
+++ b/drivers/mfd/intel_soc_pmic_core.c
@@ -14,15 +14,12 @@
#include <linux/module.h>
#include <linux/mfd/core.h>
#include <linux/mfd/intel_soc_pmic.h>
+#include <linux/platform_data/x86/soc.h>
#include <linux/pwm.h>
#include <linux/regmap.h>
#include "intel_soc_pmic_core.h"
-/* Crystal Cove PMIC shares same ACPI ID between different platforms */
-#define BYT_CRC_HRV 2
-#define CHT_CRC_HRV 3
-
/* PWM consumed by the Intel GFX */
static struct pwm_lookup crc_pwm_lookup[] = {
PWM_LOOKUP("crystal_cove_pwm", 0, "0000:00:02.0", "pwm_pmic_backlight", 0, PWM_POLARITY_NORMAL),
@@ -34,31 +31,12 @@ static int intel_soc_pmic_i2c_probe(struct i2c_client *i2c,
struct device *dev = &i2c->dev;
struct intel_soc_pmic_config *config;
struct intel_soc_pmic *pmic;
- unsigned long long hrv;
- acpi_status status;
int ret;
- /*
- * There are 2 different Crystal Cove PMICs a Bay Trail and Cherry
- * Trail version, use _HRV to differentiate between the 2.
- */
- status = acpi_evaluate_integer(ACPI_HANDLE(dev), "_HRV", NULL, &hrv);
- if (ACPI_FAILURE(status)) {
- dev_err(dev, "Failed to get PMIC hardware revision\n");
- return -ENODEV;
- }
-
- switch (hrv) {
- case BYT_CRC_HRV:
+ if (soc_intel_is_byt())
config = &intel_soc_pmic_config_byt_crc;
- break;
- case CHT_CRC_HRV:
+ else
config = &intel_soc_pmic_config_cht_crc;
- break;
- default:
- dev_warn(dev, "Unknown hardware rev %llu, assuming BYT\n", hrv);
- config = &intel_soc_pmic_config_byt_crc;
- }
pmic = devm_kzalloc(dev, sizeof(*pmic), GFP_KERNEL);
if (!pmic)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5b78223f55a0f516a1639dbe11cd4324d4aaee20 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Mon, 6 Dec 2021 18:48:06 +0100
Subject: [PATCH] mfd: intel_soc_pmic: Use CPU-id check instead of _HRV check
to differentiate variants
The Intel Crystal Cove PMIC has 2 different variants, one for use with
Bay Trail (BYT) SoCs and one for use with Cherry Trail (CHT) SoCs.
So far we have been using an ACPI _HRV check to differentiate between
the 2, but at least on the Microsoft Surface 3, which is a CHT device,
the wrong _HRV value is reported by ACPI.
So instead switch to a CPU-ID check which prevents us from relying on
the possibly wrong ACPI _HRV value.
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Reported-by: Tsuchiya Yuto <kitakar(a)gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Signed-off-by: Lee Jones <lee.jones(a)linaro.org>
Link: https://lore.kernel.org/r/20211206174806.197772-2-hdegoede@redhat.com
diff --git a/drivers/mfd/intel_soc_pmic_core.c b/drivers/mfd/intel_soc_pmic_core.c
index ddd64f9e3341..47cb7f00dfcf 100644
--- a/drivers/mfd/intel_soc_pmic_core.c
+++ b/drivers/mfd/intel_soc_pmic_core.c
@@ -14,15 +14,12 @@
#include <linux/module.h>
#include <linux/mfd/core.h>
#include <linux/mfd/intel_soc_pmic.h>
+#include <linux/platform_data/x86/soc.h>
#include <linux/pwm.h>
#include <linux/regmap.h>
#include "intel_soc_pmic_core.h"
-/* Crystal Cove PMIC shares same ACPI ID between different platforms */
-#define BYT_CRC_HRV 2
-#define CHT_CRC_HRV 3
-
/* PWM consumed by the Intel GFX */
static struct pwm_lookup crc_pwm_lookup[] = {
PWM_LOOKUP("crystal_cove_pwm", 0, "0000:00:02.0", "pwm_pmic_backlight", 0, PWM_POLARITY_NORMAL),
@@ -34,31 +31,12 @@ static int intel_soc_pmic_i2c_probe(struct i2c_client *i2c,
struct device *dev = &i2c->dev;
struct intel_soc_pmic_config *config;
struct intel_soc_pmic *pmic;
- unsigned long long hrv;
- acpi_status status;
int ret;
- /*
- * There are 2 different Crystal Cove PMICs a Bay Trail and Cherry
- * Trail version, use _HRV to differentiate between the 2.
- */
- status = acpi_evaluate_integer(ACPI_HANDLE(dev), "_HRV", NULL, &hrv);
- if (ACPI_FAILURE(status)) {
- dev_err(dev, "Failed to get PMIC hardware revision\n");
- return -ENODEV;
- }
-
- switch (hrv) {
- case BYT_CRC_HRV:
+ if (soc_intel_is_byt())
config = &intel_soc_pmic_config_byt_crc;
- break;
- case CHT_CRC_HRV:
+ else
config = &intel_soc_pmic_config_cht_crc;
- break;
- default:
- dev_warn(dev, "Unknown hardware rev %llu, assuming BYT\n", hrv);
- config = &intel_soc_pmic_config_byt_crc;
- }
pmic = devm_kzalloc(dev, sizeof(*pmic), GFP_KERNEL);
if (!pmic)
[Public]
Hi,
I found recently that WCN6855 even with the latest firmware doesn't work on 5.16. As a result of not working it causes failures during suspend: https://bugzilla.kernel.org/show_bug.cgi?id=215504
However the reason that wireless isn't working in 5.16.0 is because the way to find the board IDs in the firmware binary needs modification. I can confirm that it's able to parse once
commit fc95d10ac41d75c14a81afcc8722333d8b2cf80f ("ath11k: add string type to search board data in board-2.bin for WCN6855") is backported and that avoids the suspend bug (that otherwise still occurs when no firmware found and should separately be fixed) so I think this is a good candidate to come back to 5.16.y.
Thanks,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cf4a4493ff70874f8af26d75d4346c591c298e89 Mon Sep 17 00:00:00 2001
From: Peng Hao <flyingpenghao(a)gmail.com>
Date: Wed, 22 Dec 2021 09:12:25 +0800
Subject: [PATCH] virtio/virtio_mem: handle a possible NULL as a memcpy
parameter
There is a check for vm->sbm.sb_states before, and it should check
it here as well.
Signed-off-by: Peng Hao <flyingpeng(a)tencent.com>
Link: https://lore.kernel.org/r/20211222011225.40573-1-flyingpeng@tencent.com
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Fixes: 5f1f79bbc9e2 ("virtio-mem: Paravirtualized memory hotplug")
Cc: stable(a)vger.kernel.org # v5.8+
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index a6a78685cfbe..38becd8d578c 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -593,7 +593,7 @@ static int virtio_mem_sbm_sb_states_prepare_next_mb(struct virtio_mem *vm)
return -ENOMEM;
mutex_lock(&vm->hotplug_mutex);
- if (new_bitmap)
+ if (vm->sbm.sb_states)
memcpy(new_bitmap, vm->sbm.sb_states, old_pages * PAGE_SIZE);
old_bitmap = vm->sbm.sb_states;