The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending an x_char in stm32_usart_transmit_chars(), the driver can
overwrite the value of the TDR register with the value of x_char. If
this happens, the previous value that was present in the TDR register
is never sent over the UART.
This change checks that the previous value in the TDR register has been
sent before writing the x_char value into the register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
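For whoever prepares the requested backport: the fix relies on the
readl_relaxed_poll_timeout_atomic() helper, which simply polls the ISR
register until USART_SR_TXE is set or the timeout expires. If the older
stable tree lacks that helper in this form, an open-coded equivalent
along the lines below should behave the same (an untested, illustrative
sketch that assumes the same 'port' and 'ofs' context as the hunk above):

	u32 isr;
	unsigned int us;
	int ret = -ETIMEDOUT;

	/* poll ISR every 10 us for up to ~1 ms, as the upstream patch does */
	for (us = 0; us <= 1000; us += 10) {
		isr = readl_relaxed(port->membase + ofs->isr);
		if (isr & USART_SR_TXE) {
			ret = 0;
			break;
		}
		udelay(10);
	}
	if (ret)
		dev_warn(port->dev, "1 character may be erased\n");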
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending an x_char in stm32_usart_transmit_chars(), the driver can
overwrite the value of the TDR register with the value of x_char. If
this happens, the previous value that was present in the TDR register
is never sent over the UART.
This change checks that the previous value in the TDR register has been
sent before writing the x_char value into the register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending an x_char in stm32_usart_transmit_chars(), the driver can
overwrite the value of the TDR register with the value of x_char. If
this happens, the previous value that was present in the TDR register
is never sent over the UART.
This change checks that the previous value in the TDR register has been
sent before writing the x_char value into the register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending an x_char in stm32_usart_transmit_chars(), the driver can
overwrite the value of the TDR register with the value of x_char. If
this happens, the previous value that was present in the TDR register
is never sent over the UART.
This change checks that the previous value in the TDR register has been
sent before writing the x_char value into the register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0f344c8129a5337dae50e31b817dd50a60ff238c Mon Sep 17 00:00:00 2001
From: Karen Sornek <karen.sornek(a)intel.com>
Date: Thu, 2 Dec 2021 12:52:01 +0100
Subject: [PATCH] i40e: Fix for failed to init adminq while VF reset
Fix "failed to init adminq: -53" errors seen while the VF is being
reset via the MAC address change procedure.
Add synchronization to avoid reading a deadbeef value from the admin
queue while it is being reinitialized during a software reset.
Without this patch the VF reset procedure could be triggered while the
admin queue was being reinitialized. This resulted in incorrect reads
of the AQP registers and generated the -53 error.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek(a)intel.com>
Signed-off-by: Karen Sornek <karen.sornek(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_register.h b/drivers/net/ethernet/intel/i40e/i40e_register.h
index 8d0588a27a05..1908eed4fa5e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_register.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_register.h
@@ -413,6 +413,9 @@
#define I40E_VFINT_DYN_CTLN(_INTVF) (0x00024800 + ((_INTVF) * 4)) /* _i=0...511 */ /* Reset: VFR */
#define I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT 1
#define I40E_VFINT_DYN_CTLN_CLEARPBA_MASK I40E_MASK(0x1, I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT)
+#define I40E_VFINT_ICR0_ADMINQ_SHIFT 30
+#define I40E_VFINT_ICR0_ADMINQ_MASK I40E_MASK(0x1, I40E_VFINT_ICR0_ADMINQ_SHIFT)
+#define I40E_VFINT_ICR0_ENA(_VF) (0x0002C000 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL(_VF) (0x0002B800 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL_MSIX_INDX_SHIFT 0
#define I40E_VPINT_AEQCTL_ITR_INDX_SHIFT 11
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 7dd2a2bab339..dfdb6e786461 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1376,6 +1376,32 @@ static i40e_status i40e_config_vf_promiscuous_mode(struct i40e_vf *vf,
return aq_ret;
}
+/**
+ * i40e_sync_vfr_reset
+ * @hw: pointer to hw struct
+ * @vf_id: VF identifier
+ *
+ * Before trigger hardware reset, we need to know if no other process has
+ * reserved the hardware for any reset operations. This check is done by
+ * examining the status of the RSTAT1 register used to signal the reset.
+ **/
+static int i40e_sync_vfr_reset(struct i40e_hw *hw, int vf_id)
+{
+ u32 reg;
+ int i;
+
+ for (i = 0; i < I40E_VFR_WAIT_COUNT; i++) {
+ reg = rd32(hw, I40E_VFINT_ICR0_ENA(vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (reg)
+ return 0;
+
+ usleep_range(100, 200);
+ }
+
+ return -EAGAIN;
+}
+
/**
* i40e_trigger_vf_reset
* @vf: pointer to the VF structure
@@ -1390,9 +1416,11 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
struct i40e_pf *pf = vf->pf;
struct i40e_hw *hw = &pf->hw;
u32 reg, reg_idx, bit_idx;
+ bool vf_active;
+ u32 radq;
/* warn the VF */
- clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ vf_active = test_and_clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
/* Disable VF's configuration API during reset. The flag is re-enabled
* in i40e_alloc_vf_res(), when it's safe again to access VF's VSI.
@@ -1406,7 +1434,19 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* just need to clean up, so don't hit the VFRTRIG register.
*/
if (!flr) {
- /* reset VF using VPGEN_VFRTRIG reg */
+ /* Sync VFR reset before trigger next one */
+ radq = rd32(hw, I40E_VFINT_ICR0_ENA(vf->vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (vf_active && !radq)
+ /* waiting for finish reset by virtual driver */
+ if (i40e_sync_vfr_reset(hw, vf->vf_id))
+ dev_info(&pf->pdev->dev,
+ "Reset VF %d never finished\n",
+ vf->vf_id);
+
+ /* Reset VF using VPGEN_VFRTRIG reg. It is also setting
+ * in progress state in rstat1 register.
+ */
reg = rd32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id));
reg |= I40E_VPGEN_VFRTRIG_VFSWR_MASK;
wr32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id), reg);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 49575a640a84..03c42fd0fea1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -19,6 +19,7 @@
#define I40E_MAX_VF_PROMISC_FLAGS 3
#define I40E_VF_STATE_WAIT_COUNT 20
+#define I40E_VFR_WAIT_COUNT 100
/* Various queue ctrls */
enum i40e_queue_ctrl {
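For anyone attempting the backport by hand, the ordering the patch
enforces in i40e_trigger_vf_reset() condenses to the sketch below
(names as in the hunks above; the worst-case wait is I40E_VFR_WAIT_COUNT
polls of 100-200 us each, i.e. roughly 10-20 ms):

	/* If the VF was active but its admin queue interrupt is not yet
	 * re-enabled, a previous VFR is still being handled by the VF
	 * driver - wait for it before asserting VPGEN_VFRTRIG again.
	 */
	radq = rd32(hw, I40E_VFINT_ICR0_ENA(vf->vf_id)) &
	       I40E_VFINT_ICR0_ADMINQ_MASK;
	if (vf_active && !radq)
		if (i40e_sync_vfr_reset(hw, vf->vf_id))
			dev_info(&pf->pdev->dev,
				 "Reset VF %d never finished\n", vf->vf_id);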
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0f344c8129a5337dae50e31b817dd50a60ff238c Mon Sep 17 00:00:00 2001
From: Karen Sornek <karen.sornek(a)intel.com>
Date: Thu, 2 Dec 2021 12:52:01 +0100
Subject: [PATCH] i40e: Fix for failed to init adminq while VF reset
Fix "failed to init adminq: -53" errors seen while the VF is being
reset via the MAC address change procedure.
Add synchronization to avoid reading a deadbeef value from the admin
queue while it is being reinitialized during a software reset.
Without this patch the VF reset procedure could be triggered while the
admin queue was being reinitialized. This resulted in incorrect reads
of the AQP registers and generated the -53 error.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek(a)intel.com>
Signed-off-by: Karen Sornek <karen.sornek(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_register.h b/drivers/net/ethernet/intel/i40e/i40e_register.h
index 8d0588a27a05..1908eed4fa5e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_register.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_register.h
@@ -413,6 +413,9 @@
#define I40E_VFINT_DYN_CTLN(_INTVF) (0x00024800 + ((_INTVF) * 4)) /* _i=0...511 */ /* Reset: VFR */
#define I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT 1
#define I40E_VFINT_DYN_CTLN_CLEARPBA_MASK I40E_MASK(0x1, I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT)
+#define I40E_VFINT_ICR0_ADMINQ_SHIFT 30
+#define I40E_VFINT_ICR0_ADMINQ_MASK I40E_MASK(0x1, I40E_VFINT_ICR0_ADMINQ_SHIFT)
+#define I40E_VFINT_ICR0_ENA(_VF) (0x0002C000 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL(_VF) (0x0002B800 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL_MSIX_INDX_SHIFT 0
#define I40E_VPINT_AEQCTL_ITR_INDX_SHIFT 11
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 7dd2a2bab339..dfdb6e786461 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1376,6 +1376,32 @@ static i40e_status i40e_config_vf_promiscuous_mode(struct i40e_vf *vf,
return aq_ret;
}
+/**
+ * i40e_sync_vfr_reset
+ * @hw: pointer to hw struct
+ * @vf_id: VF identifier
+ *
+ * Before trigger hardware reset, we need to know if no other process has
+ * reserved the hardware for any reset operations. This check is done by
+ * examining the status of the RSTAT1 register used to signal the reset.
+ **/
+static int i40e_sync_vfr_reset(struct i40e_hw *hw, int vf_id)
+{
+ u32 reg;
+ int i;
+
+ for (i = 0; i < I40E_VFR_WAIT_COUNT; i++) {
+ reg = rd32(hw, I40E_VFINT_ICR0_ENA(vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (reg)
+ return 0;
+
+ usleep_range(100, 200);
+ }
+
+ return -EAGAIN;
+}
+
/**
* i40e_trigger_vf_reset
* @vf: pointer to the VF structure
@@ -1390,9 +1416,11 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
struct i40e_pf *pf = vf->pf;
struct i40e_hw *hw = &pf->hw;
u32 reg, reg_idx, bit_idx;
+ bool vf_active;
+ u32 radq;
/* warn the VF */
- clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ vf_active = test_and_clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
/* Disable VF's configuration API during reset. The flag is re-enabled
* in i40e_alloc_vf_res(), when it's safe again to access VF's VSI.
@@ -1406,7 +1434,19 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* just need to clean up, so don't hit the VFRTRIG register.
*/
if (!flr) {
- /* reset VF using VPGEN_VFRTRIG reg */
+ /* Sync VFR reset before trigger next one */
+ radq = rd32(hw, I40E_VFINT_ICR0_ENA(vf->vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (vf_active && !radq)
+ /* waiting for finish reset by virtual driver */
+ if (i40e_sync_vfr_reset(hw, vf->vf_id))
+ dev_info(&pf->pdev->dev,
+ "Reset VF %d never finished\n",
+ vf->vf_id);
+
+ /* Reset VF using VPGEN_VFRTRIG reg. It is also setting
+ * in progress state in rstat1 register.
+ */
reg = rd32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id));
reg |= I40E_VPGEN_VFRTRIG_VFSWR_MASK;
wr32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id), reg);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 49575a640a84..03c42fd0fea1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -19,6 +19,7 @@
#define I40E_MAX_VF_PROMISC_FLAGS 3
#define I40E_VF_STATE_WAIT_COUNT 20
+#define I40E_VFR_WAIT_COUNT 100
/* Various queue ctrls */
enum i40e_queue_ctrl {
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0f344c8129a5337dae50e31b817dd50a60ff238c Mon Sep 17 00:00:00 2001
From: Karen Sornek <karen.sornek(a)intel.com>
Date: Thu, 2 Dec 2021 12:52:01 +0100
Subject: [PATCH] i40e: Fix for failed to init adminq while VF reset
Fix "failed to init adminq: -53" errors seen while the VF is being
reset via the MAC address change procedure.
Add synchronization to avoid reading a deadbeef value from the admin
queue while it is being reinitialized during a software reset.
Without this patch the VF reset procedure could be triggered while the
admin queue was being reinitialized. This resulted in incorrect reads
of the AQP registers and generated the -53 error.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek(a)intel.com>
Signed-off-by: Karen Sornek <karen.sornek(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_register.h b/drivers/net/ethernet/intel/i40e/i40e_register.h
index 8d0588a27a05..1908eed4fa5e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_register.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_register.h
@@ -413,6 +413,9 @@
#define I40E_VFINT_DYN_CTLN(_INTVF) (0x00024800 + ((_INTVF) * 4)) /* _i=0...511 */ /* Reset: VFR */
#define I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT 1
#define I40E_VFINT_DYN_CTLN_CLEARPBA_MASK I40E_MASK(0x1, I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT)
+#define I40E_VFINT_ICR0_ADMINQ_SHIFT 30
+#define I40E_VFINT_ICR0_ADMINQ_MASK I40E_MASK(0x1, I40E_VFINT_ICR0_ADMINQ_SHIFT)
+#define I40E_VFINT_ICR0_ENA(_VF) (0x0002C000 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL(_VF) (0x0002B800 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL_MSIX_INDX_SHIFT 0
#define I40E_VPINT_AEQCTL_ITR_INDX_SHIFT 11
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 7dd2a2bab339..dfdb6e786461 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1376,6 +1376,32 @@ static i40e_status i40e_config_vf_promiscuous_mode(struct i40e_vf *vf,
return aq_ret;
}
+/**
+ * i40e_sync_vfr_reset
+ * @hw: pointer to hw struct
+ * @vf_id: VF identifier
+ *
+ * Before trigger hardware reset, we need to know if no other process has
+ * reserved the hardware for any reset operations. This check is done by
+ * examining the status of the RSTAT1 register used to signal the reset.
+ **/
+static int i40e_sync_vfr_reset(struct i40e_hw *hw, int vf_id)
+{
+ u32 reg;
+ int i;
+
+ for (i = 0; i < I40E_VFR_WAIT_COUNT; i++) {
+ reg = rd32(hw, I40E_VFINT_ICR0_ENA(vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (reg)
+ return 0;
+
+ usleep_range(100, 200);
+ }
+
+ return -EAGAIN;
+}
+
/**
* i40e_trigger_vf_reset
* @vf: pointer to the VF structure
@@ -1390,9 +1416,11 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
struct i40e_pf *pf = vf->pf;
struct i40e_hw *hw = &pf->hw;
u32 reg, reg_idx, bit_idx;
+ bool vf_active;
+ u32 radq;
/* warn the VF */
- clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ vf_active = test_and_clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
/* Disable VF's configuration API during reset. The flag is re-enabled
* in i40e_alloc_vf_res(), when it's safe again to access VF's VSI.
@@ -1406,7 +1434,19 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* just need to clean up, so don't hit the VFRTRIG register.
*/
if (!flr) {
- /* reset VF using VPGEN_VFRTRIG reg */
+ /* Sync VFR reset before trigger next one */
+ radq = rd32(hw, I40E_VFINT_ICR0_ENA(vf->vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (vf_active && !radq)
+ /* waiting for finish reset by virtual driver */
+ if (i40e_sync_vfr_reset(hw, vf->vf_id))
+ dev_info(&pf->pdev->dev,
+ "Reset VF %d never finished\n",
+ vf->vf_id);
+
+ /* Reset VF using VPGEN_VFRTRIG reg. It is also setting
+ * in progress state in rstat1 register.
+ */
reg = rd32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id));
reg |= I40E_VPGEN_VFRTRIG_VFSWR_MASK;
wr32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id), reg);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 49575a640a84..03c42fd0fea1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -19,6 +19,7 @@
#define I40E_MAX_VF_PROMISC_FLAGS 3
#define I40E_VF_STATE_WAIT_COUNT 20
+#define I40E_VFR_WAIT_COUNT 100
/* Various queue ctrls */
enum i40e_queue_ctrl {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0f344c8129a5337dae50e31b817dd50a60ff238c Mon Sep 17 00:00:00 2001
From: Karen Sornek <karen.sornek(a)intel.com>
Date: Thu, 2 Dec 2021 12:52:01 +0100
Subject: [PATCH] i40e: Fix for failed to init adminq while VF reset
Fix "failed to init adminq: -53" errors seen while the VF is being
reset via the MAC address change procedure.
Add synchronization to avoid reading a deadbeef value from the admin
queue while it is being reinitialized during a software reset.
Without this patch the VF reset procedure could be triggered while the
admin queue was being reinitialized. This resulted in incorrect reads
of the AQP registers and generated the -53 error.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek(a)intel.com>
Signed-off-by: Karen Sornek <karen.sornek(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_register.h b/drivers/net/ethernet/intel/i40e/i40e_register.h
index 8d0588a27a05..1908eed4fa5e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_register.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_register.h
@@ -413,6 +413,9 @@
#define I40E_VFINT_DYN_CTLN(_INTVF) (0x00024800 + ((_INTVF) * 4)) /* _i=0...511 */ /* Reset: VFR */
#define I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT 1
#define I40E_VFINT_DYN_CTLN_CLEARPBA_MASK I40E_MASK(0x1, I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT)
+#define I40E_VFINT_ICR0_ADMINQ_SHIFT 30
+#define I40E_VFINT_ICR0_ADMINQ_MASK I40E_MASK(0x1, I40E_VFINT_ICR0_ADMINQ_SHIFT)
+#define I40E_VFINT_ICR0_ENA(_VF) (0x0002C000 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL(_VF) (0x0002B800 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL_MSIX_INDX_SHIFT 0
#define I40E_VPINT_AEQCTL_ITR_INDX_SHIFT 11
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 7dd2a2bab339..dfdb6e786461 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1376,6 +1376,32 @@ static i40e_status i40e_config_vf_promiscuous_mode(struct i40e_vf *vf,
return aq_ret;
}
+/**
+ * i40e_sync_vfr_reset
+ * @hw: pointer to hw struct
+ * @vf_id: VF identifier
+ *
+ * Before trigger hardware reset, we need to know if no other process has
+ * reserved the hardware for any reset operations. This check is done by
+ * examining the status of the RSTAT1 register used to signal the reset.
+ **/
+static int i40e_sync_vfr_reset(struct i40e_hw *hw, int vf_id)
+{
+ u32 reg;
+ int i;
+
+ for (i = 0; i < I40E_VFR_WAIT_COUNT; i++) {
+ reg = rd32(hw, I40E_VFINT_ICR0_ENA(vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (reg)
+ return 0;
+
+ usleep_range(100, 200);
+ }
+
+ return -EAGAIN;
+}
+
/**
* i40e_trigger_vf_reset
* @vf: pointer to the VF structure
@@ -1390,9 +1416,11 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
struct i40e_pf *pf = vf->pf;
struct i40e_hw *hw = &pf->hw;
u32 reg, reg_idx, bit_idx;
+ bool vf_active;
+ u32 radq;
/* warn the VF */
- clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ vf_active = test_and_clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
/* Disable VF's configuration API during reset. The flag is re-enabled
* in i40e_alloc_vf_res(), when it's safe again to access VF's VSI.
@@ -1406,7 +1434,19 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* just need to clean up, so don't hit the VFRTRIG register.
*/
if (!flr) {
- /* reset VF using VPGEN_VFRTRIG reg */
+ /* Sync VFR reset before trigger next one */
+ radq = rd32(hw, I40E_VFINT_ICR0_ENA(vf->vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (vf_active && !radq)
+ /* waiting for finish reset by virtual driver */
+ if (i40e_sync_vfr_reset(hw, vf->vf_id))
+ dev_info(&pf->pdev->dev,
+ "Reset VF %d never finished\n",
+ vf->vf_id);
+
+ /* Reset VF using VPGEN_VFRTRIG reg. It is also setting
+ * in progress state in rstat1 register.
+ */
reg = rd32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id));
reg |= I40E_VPGEN_VFRTRIG_VFSWR_MASK;
wr32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id), reg);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 49575a640a84..03c42fd0fea1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -19,6 +19,7 @@
#define I40E_MAX_VF_PROMISC_FLAGS 3
#define I40E_VF_STATE_WAIT_COUNT 20
+#define I40E_VFR_WAIT_COUNT 100
/* Various queue ctrls */
enum i40e_queue_ctrl {
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0f344c8129a5337dae50e31b817dd50a60ff238c Mon Sep 17 00:00:00 2001
From: Karen Sornek <karen.sornek(a)intel.com>
Date: Thu, 2 Dec 2021 12:52:01 +0100
Subject: [PATCH] i40e: Fix for failed to init adminq while VF reset
Fix "failed to init adminq: -53" errors seen while the VF is being
reset via the MAC address change procedure.
Add synchronization to avoid reading a deadbeef value from the admin
queue while it is being reinitialized during a software reset.
Without this patch the VF reset procedure could be triggered while the
admin queue was being reinitialized. This resulted in incorrect reads
of the AQP registers and generated the -53 error.
Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek(a)intel.com>
Signed-off-by: Karen Sornek <karen.sornek(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_register.h b/drivers/net/ethernet/intel/i40e/i40e_register.h
index 8d0588a27a05..1908eed4fa5e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_register.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_register.h
@@ -413,6 +413,9 @@
#define I40E_VFINT_DYN_CTLN(_INTVF) (0x00024800 + ((_INTVF) * 4)) /* _i=0...511 */ /* Reset: VFR */
#define I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT 1
#define I40E_VFINT_DYN_CTLN_CLEARPBA_MASK I40E_MASK(0x1, I40E_VFINT_DYN_CTLN_CLEARPBA_SHIFT)
+#define I40E_VFINT_ICR0_ADMINQ_SHIFT 30
+#define I40E_VFINT_ICR0_ADMINQ_MASK I40E_MASK(0x1, I40E_VFINT_ICR0_ADMINQ_SHIFT)
+#define I40E_VFINT_ICR0_ENA(_VF) (0x0002C000 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL(_VF) (0x0002B800 + ((_VF) * 4)) /* _i=0...127 */ /* Reset: CORER */
#define I40E_VPINT_AEQCTL_MSIX_INDX_SHIFT 0
#define I40E_VPINT_AEQCTL_ITR_INDX_SHIFT 11
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 7dd2a2bab339..dfdb6e786461 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1376,6 +1376,32 @@ static i40e_status i40e_config_vf_promiscuous_mode(struct i40e_vf *vf,
return aq_ret;
}
+/**
+ * i40e_sync_vfr_reset
+ * @hw: pointer to hw struct
+ * @vf_id: VF identifier
+ *
+ * Before trigger hardware reset, we need to know if no other process has
+ * reserved the hardware for any reset operations. This check is done by
+ * examining the status of the RSTAT1 register used to signal the reset.
+ **/
+static int i40e_sync_vfr_reset(struct i40e_hw *hw, int vf_id)
+{
+ u32 reg;
+ int i;
+
+ for (i = 0; i < I40E_VFR_WAIT_COUNT; i++) {
+ reg = rd32(hw, I40E_VFINT_ICR0_ENA(vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (reg)
+ return 0;
+
+ usleep_range(100, 200);
+ }
+
+ return -EAGAIN;
+}
+
/**
* i40e_trigger_vf_reset
* @vf: pointer to the VF structure
@@ -1390,9 +1416,11 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
struct i40e_pf *pf = vf->pf;
struct i40e_hw *hw = &pf->hw;
u32 reg, reg_idx, bit_idx;
+ bool vf_active;
+ u32 radq;
/* warn the VF */
- clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+ vf_active = test_and_clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
/* Disable VF's configuration API during reset. The flag is re-enabled
* in i40e_alloc_vf_res(), when it's safe again to access VF's VSI.
@@ -1406,7 +1434,19 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr)
* just need to clean up, so don't hit the VFRTRIG register.
*/
if (!flr) {
- /* reset VF using VPGEN_VFRTRIG reg */
+ /* Sync VFR reset before trigger next one */
+ radq = rd32(hw, I40E_VFINT_ICR0_ENA(vf->vf_id)) &
+ I40E_VFINT_ICR0_ADMINQ_MASK;
+ if (vf_active && !radq)
+ /* waiting for finish reset by virtual driver */
+ if (i40e_sync_vfr_reset(hw, vf->vf_id))
+ dev_info(&pf->pdev->dev,
+ "Reset VF %d never finished\n",
+ vf->vf_id);
+
+ /* Reset VF using VPGEN_VFRTRIG reg. It is also setting
+ * in progress state in rstat1 register.
+ */
reg = rd32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id));
reg |= I40E_VPGEN_VFRTRIG_VFSWR_MASK;
wr32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id), reg);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 49575a640a84..03c42fd0fea1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -19,6 +19,7 @@
#define I40E_MAX_VF_PROMISC_FLAGS 3
#define I40E_VF_STATE_WAIT_COUNT 20
+#define I40E_VFR_WAIT_COUNT 100
/* Various queue ctrls */
enum i40e_queue_ctrl {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 92947844b8beee988c0ce17082b705c2f75f0742 Mon Sep 17 00:00:00 2001
From: Sylwester Dziedziuch <sylwesterx.dziedziuch(a)intel.com>
Date: Fri, 26 Nov 2021 11:11:22 +0100
Subject: [PATCH] i40e: Fix queues reservation for XDP
When XDP was configured on a system with large number of CPUs
and X722 NIC there was a call trace with NULL pointer dereference.
i40e 0000:87:00.0: failed to get tracking for 256 queues for VSI 0 err -12
i40e 0000:87:00.0: setup of MAIN VSI failed
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: 0010:i40e_xdp+0xea/0x1b0 [i40e]
Call Trace:
? i40e_reconfig_rss_queues+0x130/0x130 [i40e]
dev_xdp_install+0x61/0xe0
dev_xdp_attach+0x18a/0x4c0
dev_change_xdp_fd+0x1e6/0x220
do_setlink+0x616/0x1030
? ahci_port_stop+0x80/0x80
? ata_qc_issue+0x107/0x1e0
? lock_timer_base+0x61/0x80
? __mod_timer+0x202/0x380
rtnl_setlink+0xe5/0x170
? bpf_lsm_binder_transaction+0x10/0x10
? security_capable+0x36/0x50
rtnetlink_rcv_msg+0x121/0x350
? rtnl_calcit.isra.0+0x100/0x100
netlink_rcv_skb+0x50/0xf0
netlink_unicast+0x1d3/0x2a0
netlink_sendmsg+0x22a/0x440
sock_sendmsg+0x5e/0x60
__sys_sendto+0xf0/0x160
? __sys_getsockname+0x7e/0xc0
? _copy_from_user+0x3c/0x80
? __sys_setsockopt+0xc8/0x1a0
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f83fa7a39e0
This was caused by fragmentation of the PF queue pile: the flow
director VSI queue was placed right after the main VSI. Because of
this the main VSI was not able to resize its queue allocation for XDP,
resulting in no queues being allocated for the main VSI when XDP was
turned on.
Fix this by always allocating the last queue in the PF queue pile for
a flow director VSI.
Fixes: 41c445ff0f48 ("i40e: main driver core")
Fixes: 74608d17fe29 ("i40e: add support for XDP_TX action")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch(a)intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski(a)intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski(a)intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9456641cd24a..9b65cb50282c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -210,6 +210,20 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
return -EINVAL;
}
+ /* Allocate last queue in the pile for FDIR VSI queue
+ * so it doesn't fragment the qp_pile
+ */
+ if (pile == pf->qp_pile && pf->vsi[id]->type == I40E_VSI_FDIR) {
+ if (pile->list[pile->num_entries - 1] & I40E_PILE_VALID_BIT) {
+ dev_err(&pf->pdev->dev,
+ "Cannot allocate queue %d for I40E_VSI_FDIR\n",
+ pile->num_entries - 1);
+ return -ENOMEM;
+ }
+ pile->list[pile->num_entries - 1] = id | I40E_PILE_VALID_BIT;
+ return pile->num_entries - 1;
+ }
+
i = 0;
while (i < pile->num_entries) {
/* skip already allocated entries */
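For readers judging how the requested 4.14 backport should look, the
effect of the hunk above on the qp_pile layout is sketched below (a toy
illustration; reserve_last_entry() is a made-up name, the real logic
lives inside i40e_get_lump() as shown above):

	/*
	 * Before the fix, FDIR lands right after MAIN and fragments the pile:
	 *   [ MAIN .. MAIN | FDIR | free ......................... ]
	 *   MAIN cannot grow in place when XDP doubles its queue count.
	 *
	 * After the fix, FDIR is pinned to the last pile entry:
	 *   [ MAIN .. MAIN | free ......................... | FDIR ]
	 */
	static int reserve_last_entry(u16 *list, u16 num_entries, u16 id)
	{
		u16 last = num_entries - 1;

		if (list[last] & I40E_PILE_VALID_BIT)	/* already taken */
			return -ENOMEM;

		list[last] = id | I40E_PILE_VALID_BIT;
		return last;			/* base index of the lump */
	}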
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d701658a50a471591094b3eb3961b4926cc8f104 Mon Sep 17 00:00:00 2001
From: Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
Date: Fri, 5 Nov 2021 11:17:00 +0000
Subject: [PATCH] i40e: Fix issue when maximum queues is exceeded
Before this patch the VF interface vanished when the maximum queue
number was exceeded: the driver tried to add further queues even
though there was not enough space, and the PF reported an incorrect
number of queues to the VF when not enough of them were available.
Add an additional condition that checks the available space in
'qp_pile' before proceeding. This makes it impossible to add more
queues than the available space allows.
Also add a search for free space in the PF queue pair pile.
Without this patch VF interfaces are not seen when the available queue
space has been exceeded, and the following logs appear permanently in
dmesg:
"Unable to get VF config (-32)".
"VF 62 failed opcode 3, retval: -5"
"Unable to get VF config due to PF error condition, not retrying"
Fixes: 7daa6bf3294e ("i40e: driver core headers")
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Jaroslaw Gawin <jaroslawx.gawin(a)intel.com>
Signed-off-by: Slawomir Laba <slawomirx.laba(a)intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 4d939af0a626..c8cfe62d5e05 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -174,7 +174,6 @@ enum i40e_interrupt_policy {
struct i40e_lump_tracking {
u16 num_entries;
- u16 search_hint;
u16 list[0];
#define I40E_PILE_VALID_BIT 0x8000
#define I40E_IWARP_IRQ_PILE_ID (I40E_PILE_VALID_BIT - 2)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 1b0fd3eacd64..9456641cd24a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -196,10 +196,6 @@ int i40e_free_virt_mem_d(struct i40e_hw *hw, struct i40e_virt_mem *mem)
* @id: an owner id to stick on the items assigned
*
* Returns the base item index of the lump, or negative for error
- *
- * The search_hint trick and lack of advanced fit-finding only work
- * because we're highly likely to have all the same size lump requests.
- * Linear search time and any fragmentation should be minimal.
**/
static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
u16 needed, u16 id)
@@ -214,8 +210,7 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
return -EINVAL;
}
- /* start the linear search with an imperfect hint */
- i = pile->search_hint;
+ i = 0;
while (i < pile->num_entries) {
/* skip already allocated entries */
if (pile->list[i] & I40E_PILE_VALID_BIT) {
@@ -234,7 +229,6 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
for (j = 0; j < needed; j++)
pile->list[i+j] = id | I40E_PILE_VALID_BIT;
ret = i;
- pile->search_hint = i + j;
break;
}
@@ -257,7 +251,7 @@ static int i40e_put_lump(struct i40e_lump_tracking *pile, u16 index, u16 id)
{
int valid_id = (id | I40E_PILE_VALID_BIT);
int count = 0;
- int i;
+ u16 i;
if (!pile || index >= pile->num_entries)
return -EINVAL;
@@ -269,8 +263,6 @@ static int i40e_put_lump(struct i40e_lump_tracking *pile, u16 index, u16 id)
count++;
}
- if (count && index < pile->search_hint)
- pile->search_hint = index;
return count;
}
@@ -11786,7 +11778,6 @@ static int i40e_init_interrupt_scheme(struct i40e_pf *pf)
return -ENOMEM;
pf->irq_pile->num_entries = vectors;
- pf->irq_pile->search_hint = 0;
/* track first vector for misc interrupts, ignore return */
(void)i40e_get_lump(pf, pf->irq_pile, 1, I40E_PILE_VALID_BIT - 1);
@@ -12589,7 +12580,6 @@ static int i40e_sw_init(struct i40e_pf *pf)
goto sw_init_done;
}
pf->qp_pile->num_entries = pf->hw.func_caps.num_tx_qp;
- pf->qp_pile->search_hint = 0;
pf->tx_timeout_recovery_level = 1;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index b785d09c19f8..7dd2a2bab339 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2617,6 +2617,59 @@ static int i40e_vc_disable_queues_msg(struct i40e_vf *vf, u8 *msg)
aq_ret);
}
+/**
+ * i40e_check_enough_queue - find big enough queue number
+ * @vf: pointer to the VF info
+ * @needed: the number of items needed
+ *
+ * Returns the base item index of the queue, or negative for error
+ **/
+static int i40e_check_enough_queue(struct i40e_vf *vf, u16 needed)
+{
+ unsigned int i, cur_queues, more, pool_size;
+ struct i40e_lump_tracking *pile;
+ struct i40e_pf *pf = vf->pf;
+ struct i40e_vsi *vsi;
+
+ vsi = pf->vsi[vf->lan_vsi_idx];
+ cur_queues = vsi->alloc_queue_pairs;
+
+ /* if current allocated queues are enough for need */
+ if (cur_queues >= needed)
+ return vsi->base_queue;
+
+ pile = pf->qp_pile;
+ if (cur_queues > 0) {
+ /* if the allocated queues are not zero
+ * just check if there are enough queues for more
+ * behind the allocated queues.
+ */
+ more = needed - cur_queues;
+ for (i = vsi->base_queue + cur_queues;
+ i < pile->num_entries; i++) {
+ if (pile->list[i] & I40E_PILE_VALID_BIT)
+ break;
+
+ if (more-- == 1)
+ /* there is enough */
+ return vsi->base_queue;
+ }
+ }
+
+ pool_size = 0;
+ for (i = 0; i < pile->num_entries; i++) {
+ if (pile->list[i] & I40E_PILE_VALID_BIT) {
+ pool_size = 0;
+ continue;
+ }
+ if (needed <= ++pool_size)
+ /* there is enough */
+ return i;
+ }
+
+ return -ENOMEM;
+}
+
/**
* i40e_vc_request_queues_msg
* @vf: pointer to the VF info
@@ -2651,6 +2704,12 @@ static int i40e_vc_request_queues_msg(struct i40e_vf *vf, u8 *msg)
req_pairs - cur_pairs,
pf->queues_left);
vfres->num_queue_pairs = pf->queues_left + cur_pairs;
+ } else if (i40e_check_enough_queue(vf, req_pairs) < 0) {
+ dev_warn(&pf->pdev->dev,
+ "VF %d requested %d more queues, but there is not enough for it.\n",
+ vf->vf_id,
+ req_pairs - cur_pairs);
+ vfres->num_queue_pairs = cur_pairs;
} else {
/* successful request */
vf->num_req_queues = req_pairs;
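A worked example of the i40e_check_enough_queue() logic added above,
which may help when adapting it to the older virtchnl code (the numbers
are made up for illustration):

	/*
	 * Suppose pf->qp_pile has 16 entries, the VF's VSI currently owns
	 * entries 4..7 (base_queue = 4, alloc_queue_pairs = 4), and the VF
	 * requests 8 queue pairs:
	 *
	 *   1. cur_queues (4) < needed (8), so the pile has to be examined.
	 *   2. Entries 8..11 are scanned first; if all four are free, the
	 *      VSI can simply grow in place and base_queue (4) is returned.
	 *   3. Otherwise the whole pile is scanned for any run of 8 free
	 *      entries; the index where such a run is found is returned
	 *      (the caller only tests for a negative result), or -ENOMEM if
	 *      none exists - in which case the request is capped at
	 *      cur_pairs instead of resetting the VF for a request that
	 *      cannot be satisfied.
	 */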
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d701658a50a471591094b3eb3961b4926cc8f104 Mon Sep 17 00:00:00 2001
From: Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
Date: Fri, 5 Nov 2021 11:17:00 +0000
Subject: [PATCH] i40e: Fix issue when maximum queues is exceeded
Before this patch the VF interface vanished when the maximum queue
number was exceeded: the driver tried to add further queues even
though there was not enough space, and the PF reported an incorrect
number of queues to the VF when not enough of them were available.
Add an additional condition that checks the available space in
'qp_pile' before proceeding. This makes it impossible to add more
queues than the available space allows.
Also add a search for free space in the PF queue pair pile.
Without this patch VF interfaces are not seen when the available queue
space has been exceeded, and the following logs appear permanently in
dmesg:
"Unable to get VF config (-32)".
"VF 62 failed opcode 3, retval: -5"
"Unable to get VF config due to PF error condition, not retrying"
Fixes: 7daa6bf3294e ("i40e: driver core headers")
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Jaroslaw Gawin <jaroslawx.gawin(a)intel.com>
Signed-off-by: Slawomir Laba <slawomirx.laba(a)intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 4d939af0a626..c8cfe62d5e05 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -174,7 +174,6 @@ enum i40e_interrupt_policy {
struct i40e_lump_tracking {
u16 num_entries;
- u16 search_hint;
u16 list[0];
#define I40E_PILE_VALID_BIT 0x8000
#define I40E_IWARP_IRQ_PILE_ID (I40E_PILE_VALID_BIT - 2)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 1b0fd3eacd64..9456641cd24a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -196,10 +196,6 @@ int i40e_free_virt_mem_d(struct i40e_hw *hw, struct i40e_virt_mem *mem)
* @id: an owner id to stick on the items assigned
*
* Returns the base item index of the lump, or negative for error
- *
- * The search_hint trick and lack of advanced fit-finding only work
- * because we're highly likely to have all the same size lump requests.
- * Linear search time and any fragmentation should be minimal.
**/
static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
u16 needed, u16 id)
@@ -214,8 +210,7 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
return -EINVAL;
}
- /* start the linear search with an imperfect hint */
- i = pile->search_hint;
+ i = 0;
while (i < pile->num_entries) {
/* skip already allocated entries */
if (pile->list[i] & I40E_PILE_VALID_BIT) {
@@ -234,7 +229,6 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
for (j = 0; j < needed; j++)
pile->list[i+j] = id | I40E_PILE_VALID_BIT;
ret = i;
- pile->search_hint = i + j;
break;
}
@@ -257,7 +251,7 @@ static int i40e_put_lump(struct i40e_lump_tracking *pile, u16 index, u16 id)
{
int valid_id = (id | I40E_PILE_VALID_BIT);
int count = 0;
- int i;
+ u16 i;
if (!pile || index >= pile->num_entries)
return -EINVAL;
@@ -269,8 +263,6 @@ static int i40e_put_lump(struct i40e_lump_tracking *pile, u16 index, u16 id)
count++;
}
- if (count && index < pile->search_hint)
- pile->search_hint = index;
return count;
}
@@ -11786,7 +11778,6 @@ static int i40e_init_interrupt_scheme(struct i40e_pf *pf)
return -ENOMEM;
pf->irq_pile->num_entries = vectors;
- pf->irq_pile->search_hint = 0;
/* track first vector for misc interrupts, ignore return */
(void)i40e_get_lump(pf, pf->irq_pile, 1, I40E_PILE_VALID_BIT - 1);
@@ -12589,7 +12580,6 @@ static int i40e_sw_init(struct i40e_pf *pf)
goto sw_init_done;
}
pf->qp_pile->num_entries = pf->hw.func_caps.num_tx_qp;
- pf->qp_pile->search_hint = 0;
pf->tx_timeout_recovery_level = 1;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index b785d09c19f8..7dd2a2bab339 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2617,6 +2617,59 @@ static int i40e_vc_disable_queues_msg(struct i40e_vf *vf, u8 *msg)
aq_ret);
}
+/**
+ * i40e_check_enough_queue - find big enough queue number
+ * @vf: pointer to the VF info
+ * @needed: the number of items needed
+ *
+ * Returns the base item index of the queue, or negative for error
+ **/
+static int i40e_check_enough_queue(struct i40e_vf *vf, u16 needed)
+{
+ unsigned int i, cur_queues, more, pool_size;
+ struct i40e_lump_tracking *pile;
+ struct i40e_pf *pf = vf->pf;
+ struct i40e_vsi *vsi;
+
+ vsi = pf->vsi[vf->lan_vsi_idx];
+ cur_queues = vsi->alloc_queue_pairs;
+
+ /* if current allocated queues are enough for need */
+ if (cur_queues >= needed)
+ return vsi->base_queue;
+
+ pile = pf->qp_pile;
+ if (cur_queues > 0) {
+ /* if the allocated queues are not zero
+ * just check if there are enough queues for more
+ * behind the allocated queues.
+ */
+ more = needed - cur_queues;
+ for (i = vsi->base_queue + cur_queues;
+ i < pile->num_entries; i++) {
+ if (pile->list[i] & I40E_PILE_VALID_BIT)
+ break;
+
+ if (more-- == 1)
+ /* there is enough */
+ return vsi->base_queue;
+ }
+ }
+
+ pool_size = 0;
+ for (i = 0; i < pile->num_entries; i++) {
+ if (pile->list[i] & I40E_PILE_VALID_BIT) {
+ pool_size = 0;
+ continue;
+ }
+ if (needed <= ++pool_size)
+ /* there is enough */
+ return i;
+ }
+
+ return -ENOMEM;
+}
+
/**
* i40e_vc_request_queues_msg
* @vf: pointer to the VF info
@@ -2651,6 +2704,12 @@ static int i40e_vc_request_queues_msg(struct i40e_vf *vf, u8 *msg)
req_pairs - cur_pairs,
pf->queues_left);
vfres->num_queue_pairs = pf->queues_left + cur_pairs;
+ } else if (i40e_check_enough_queue(vf, req_pairs) < 0) {
+ dev_warn(&pf->pdev->dev,
+ "VF %d requested %d more queues, but there is not enough for it.\n",
+ vf->vf_id,
+ req_pairs - cur_pairs);
+ vfres->num_queue_pairs = cur_pairs;
} else {
/* successful request */
vf->num_req_queues = req_pairs;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d701658a50a471591094b3eb3961b4926cc8f104 Mon Sep 17 00:00:00 2001
From: Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
Date: Fri, 5 Nov 2021 11:17:00 +0000
Subject: [PATCH] i40e: Fix issue when maximum queues is exceeded
Before this patch the VF interface vanished when the maximum queue
number was exceeded: the driver tried to add further queues even
though there was not enough space, and the PF reported an incorrect
number of queues to the VF when not enough of them were available.
Add an additional condition that checks the available space in
'qp_pile' before proceeding. This makes it impossible to add more
queues than the available space allows.
Also add a search for free space in the PF queue pair pile.
Without this patch VF interfaces are not seen when the available queue
space has been exceeded, and the following logs appear permanently in
dmesg:
"Unable to get VF config (-32)".
"VF 62 failed opcode 3, retval: -5"
"Unable to get VF config due to PF error condition, not retrying"
Fixes: 7daa6bf3294e ("i40e: driver core headers")
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Jaroslaw Gawin <jaroslawx.gawin(a)intel.com>
Signed-off-by: Slawomir Laba <slawomirx.laba(a)intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 4d939af0a626..c8cfe62d5e05 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -174,7 +174,6 @@ enum i40e_interrupt_policy {
struct i40e_lump_tracking {
u16 num_entries;
- u16 search_hint;
u16 list[0];
#define I40E_PILE_VALID_BIT 0x8000
#define I40E_IWARP_IRQ_PILE_ID (I40E_PILE_VALID_BIT - 2)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 1b0fd3eacd64..9456641cd24a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -196,10 +196,6 @@ int i40e_free_virt_mem_d(struct i40e_hw *hw, struct i40e_virt_mem *mem)
* @id: an owner id to stick on the items assigned
*
* Returns the base item index of the lump, or negative for error
- *
- * The search_hint trick and lack of advanced fit-finding only work
- * because we're highly likely to have all the same size lump requests.
- * Linear search time and any fragmentation should be minimal.
**/
static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
u16 needed, u16 id)
@@ -214,8 +210,7 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
return -EINVAL;
}
- /* start the linear search with an imperfect hint */
- i = pile->search_hint;
+ i = 0;
while (i < pile->num_entries) {
/* skip already allocated entries */
if (pile->list[i] & I40E_PILE_VALID_BIT) {
@@ -234,7 +229,6 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
for (j = 0; j < needed; j++)
pile->list[i+j] = id | I40E_PILE_VALID_BIT;
ret = i;
- pile->search_hint = i + j;
break;
}
@@ -257,7 +251,7 @@ static int i40e_put_lump(struct i40e_lump_tracking *pile, u16 index, u16 id)
{
int valid_id = (id | I40E_PILE_VALID_BIT);
int count = 0;
- int i;
+ u16 i;
if (!pile || index >= pile->num_entries)
return -EINVAL;
@@ -269,8 +263,6 @@ static int i40e_put_lump(struct i40e_lump_tracking *pile, u16 index, u16 id)
count++;
}
- if (count && index < pile->search_hint)
- pile->search_hint = index;
return count;
}
@@ -11786,7 +11778,6 @@ static int i40e_init_interrupt_scheme(struct i40e_pf *pf)
return -ENOMEM;
pf->irq_pile->num_entries = vectors;
- pf->irq_pile->search_hint = 0;
/* track first vector for misc interrupts, ignore return */
(void)i40e_get_lump(pf, pf->irq_pile, 1, I40E_PILE_VALID_BIT - 1);
@@ -12589,7 +12580,6 @@ static int i40e_sw_init(struct i40e_pf *pf)
goto sw_init_done;
}
pf->qp_pile->num_entries = pf->hw.func_caps.num_tx_qp;
- pf->qp_pile->search_hint = 0;
pf->tx_timeout_recovery_level = 1;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index b785d09c19f8..7dd2a2bab339 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2617,6 +2617,59 @@ static int i40e_vc_disable_queues_msg(struct i40e_vf *vf, u8 *msg)
aq_ret);
}
+/**
+ * i40e_check_enough_queue - find big enough queue number
+ * @vf: pointer to the VF info
+ * @needed: the number of items needed
+ *
+ * Returns the base item index of the queue, or negative for error
+ **/
+static int i40e_check_enough_queue(struct i40e_vf *vf, u16 needed)
+{
+ unsigned int i, cur_queues, more, pool_size;
+ struct i40e_lump_tracking *pile;
+ struct i40e_pf *pf = vf->pf;
+ struct i40e_vsi *vsi;
+
+ vsi = pf->vsi[vf->lan_vsi_idx];
+ cur_queues = vsi->alloc_queue_pairs;
+
+ /* if current allocated queues are enough for need */
+ if (cur_queues >= needed)
+ return vsi->base_queue;
+
+ pile = pf->qp_pile;
+ if (cur_queues > 0) {
+ /* if the allocated queues are not zero
+ * just check if there are enough queues for more
+ * behind the allocated queues.
+ */
+ more = needed - cur_queues;
+ for (i = vsi->base_queue + cur_queues;
+ i < pile->num_entries; i++) {
+ if (pile->list[i] & I40E_PILE_VALID_BIT)
+ break;
+
+ if (more-- == 1)
+ /* there is enough */
+ return vsi->base_queue;
+ }
+ }
+
+ pool_size = 0;
+ for (i = 0; i < pile->num_entries; i++) {
+ if (pile->list[i] & I40E_PILE_VALID_BIT) {
+ pool_size = 0;
+ continue;
+ }
+ if (needed <= ++pool_size)
+ /* there is enough */
+ return i;
+ }
+
+ return -ENOMEM;
+}
+
/**
* i40e_vc_request_queues_msg
* @vf: pointer to the VF info
@@ -2651,6 +2704,12 @@ static int i40e_vc_request_queues_msg(struct i40e_vf *vf, u8 *msg)
req_pairs - cur_pairs,
pf->queues_left);
vfres->num_queue_pairs = pf->queues_left + cur_pairs;
+ } else if (i40e_check_enough_queue(vf, req_pairs) < 0) {
+ dev_warn(&pf->pdev->dev,
+ "VF %d requested %d more queues, but there is not enough for it.\n",
+ vf->vf_id,
+ req_pairs - cur_pairs);
+ vfres->num_queue_pairs = cur_pairs;
} else {
/* successful request */
vf->num_req_queues = req_pairs;
This is a backport for 4.14
(cherry picked from commit bc93a22a19eb2b68a16ecf04cdf4b2ed65aaf398)
On a kernel without CONFIG_STRICT_KERNEL_RWX, running the EXEC_RODATA
test leads to an "Illegal instruction" failure.
Looking at the content of rodata_objcopy.o, we see that the function
body contains only zeroes:
Disassembly of section .rodata:
0000000000000000 <.lkdtm_rodata_do_nothing>:
0: 00 00 00 00 .long 0x0
Add the contents flag in order to keep the content of the section
while renaming it.
Disassembly of section .rodata:
0000000000000000 <.lkdtm_rodata_do_nothing>:
0: 4e 80 00 20 blr
Fixes: e9e08a07385e ("lkdtm: support llvm-objcopy")
Cc: stable(a)vger.kernel.org
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Link: https://lore.kernel.org/r/8900731fbc05fb8b0de18af7133a8fc07c3c53a1.16337121…
---
drivers/misc/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 76f6a4f628b3..cc0df7280fe5 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -69,7 +69,7 @@ KCOV_INSTRUMENT_lkdtm_rodata.o := n
OBJCOPYFLAGS :=
OBJCOPYFLAGS_lkdtm_rodata_objcopy.o := \
- --rename-section .text=.rodata,alloc,readonly,load
+ --rename-section .text=.rodata,alloc,readonly,load,contents
targets += lkdtm_rodata.o lkdtm_rodata_objcopy.o
$(obj)/lkdtm_rodata_objcopy.o: $(obj)/lkdtm_rodata.o FORCE
$(call if_changed,objcopy)
--
2.33.1
This is a backport for 5.10
To apply, it also requires commit 37eb7ca91b69 ("powerpc/32s: Allocate
one 256k IBAT instead of two consecutives 128k IBATs")
(cherry picked from commit d37823c3528e5e0705fc7746bcbc2afffb619259)
Some configurations have been reported where the kernel doesn't
boot with KASAN enabled.
This is due to a wrong BAT allocation for the KASAN area:
---[ Data Block Address Translation ]---
0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw m
1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw m
2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw m
3: 0xf8000000-0xf9ffffff 0x2a000000 32M Kernel rw m
4: 0xfa000000-0xfdffffff 0x2c000000 64M Kernel rw m
A BAT must have both virtual and physical addresses alignment matching
the size of the BAT. This is not the case for BAT 4 above.
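As a rough illustration of that alignment rule, here is a small userspace sketch (not the kernel helper itself; bat_block_size() in book3s32/mmu.c is the authoritative version) that computes the largest block a BAT can map from a given base address:

#include <stdio.h>

#define SZ_256M (256UL << 20)

/* A BAT block may not exceed 256M, the natural alignment of 'base', or
 * the largest power of two that fits in (top - base). */
static unsigned long largest_bat_block(unsigned long base, unsigned long top)
{
	unsigned long align = base ? (base & -base) : SZ_256M;
	unsigned long len = top - base;
	unsigned long size = SZ_256M;

	if (align < size)
		size = align;
	while (size > len)	/* round down to a power of two <= len */
		size >>= 1;
	return size;
}

int main(void)
{
	/* 0xfa000000 is only 32M-aligned, so the 64M BAT 4 above is invalid... */
	printf("%luM\n", largest_bat_block(0xfa000000, 0xfe000000) >> 20); /* 32 */
	/* ...whereas 0xf8000000 can take a 64M BAT, as in the fixed layout. */
	printf("%luM\n", largest_bat_block(0xf8000000, 0xfc000000) >> 20); /* 64 */
	return 0;
}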
Fix kasan_init_region() by using the block_size() function that lives in
book3s32/mmu.c. To be able to reuse it here, make it non-static and
rename it to bat_block_size() in order to avoid a name conflict with
block_size() defined in <linux/blkdev.h>.
Also reuse find_free_bat() to avoid an error message from setbat()
when no BAT is available, and allocate the memory outside of the linear
memory mapping to avoid wasting that precious space.
With this change we get correct alignment for the BATs, and the KASAN
shadow memory is allocated outside the linear memory space.
---[ Data Block Address Translation ]---
0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw
1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw
2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw
3: 0xf8000000-0xfbffffff 0x7c000000 64M Kernel rw
4: 0xfc000000-0xfdffffff 0x7a000000 32M Kernel rw
Fixes: 7974c4732642 ("powerpc/32s: Implement dedicated kasan_init_region()")
Cc: stable(a)vger.kernel.org
Reported-by: Maxime Bizon <mbizon(a)freebox.fr>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Tested-by: Maxime Bizon <mbizon(a)freebox.fr>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/7a50ef902494d1325227d47d33dada01e52e5518.16418187…
---
arch/powerpc/include/asm/book3s/32/mmu-hash.h | 2 +
arch/powerpc/mm/book3s32/mmu.c | 10 ++--
arch/powerpc/mm/kasan/book3s_32.c | 59 ++++++++++---------
3 files changed, 38 insertions(+), 33 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
index a8982d52f6b1..cbde06d0fb38 100644
--- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
@@ -102,6 +102,8 @@ extern s32 patch__hash_page_B, patch__hash_page_C;
extern s32 patch__flush_hash_A0, patch__flush_hash_A1, patch__flush_hash_A2;
extern s32 patch__flush_hash_B;
+int __init find_free_bat(void);
+unsigned int bat_block_size(unsigned long base, unsigned long top);
#endif /* !__ASSEMBLY__ */
/* We happily ignore the smaller BATs on 601, we don't actually use
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index addecf77dae3..602ab13127b4 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -72,7 +72,7 @@ unsigned long p_block_mapped(phys_addr_t pa)
return 0;
}
-static int find_free_bat(void)
+int __init find_free_bat(void)
{
int b;
int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
@@ -96,7 +96,7 @@ static int find_free_bat(void)
* - block size has to be a power of two. This is calculated by finding the
* highest bit set to 1.
*/
-static unsigned int block_size(unsigned long base, unsigned long top)
+unsigned int bat_block_size(unsigned long base, unsigned long top)
{
unsigned int max_size = SZ_256M;
unsigned int base_shift = (ffs(base) - 1) & 31;
@@ -141,7 +141,7 @@ static unsigned long __init __mmu_mapin_ram(unsigned long base, unsigned long to
int idx;
while ((idx = find_free_bat()) != -1 && base != top) {
- unsigned int size = block_size(base, top);
+ unsigned int size = bat_block_size(base, top);
if (size < 128 << 10)
break;
@@ -206,12 +206,12 @@ void mmu_mark_initmem_nx(void)
unsigned long size;
for (i = 0; i < nb - 1 && base < top;) {
- size = block_size(base, top);
+ size = bat_block_size(base, top);
setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
base += size;
}
if (base < top) {
- size = block_size(base, top);
+ size = bat_block_size(base, top);
if ((top - base) > size) {
size <<= 1;
if (strict_kernel_rwx_enabled() && base + size > border)
diff --git a/arch/powerpc/mm/kasan/book3s_32.c b/arch/powerpc/mm/kasan/book3s_32.c
index 35b287b0a8da..450a67ef0bbe 100644
--- a/arch/powerpc/mm/kasan/book3s_32.c
+++ b/arch/powerpc/mm/kasan/book3s_32.c
@@ -10,48 +10,51 @@ int __init kasan_init_region(void *start, size_t size)
{
unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);
- unsigned long k_cur = k_start;
- int k_size = k_end - k_start;
- int k_size_base = 1 << (ffs(k_size) - 1);
+ unsigned long k_nobat = k_start;
+ unsigned long k_cur;
+ phys_addr_t phys;
int ret;
- void *block;
- block = memblock_alloc(k_size, k_size_base);
-
- if (block && k_size_base >= SZ_128K && k_start == ALIGN(k_start, k_size_base)) {
- int shift = ffs(k_size - k_size_base);
- int k_size_more = shift ? 1 << (shift - 1) : 0;
-
- setbat(-1, k_start, __pa(block), k_size_base, PAGE_KERNEL);
- if (k_size_more >= SZ_128K)
- setbat(-1, k_start + k_size_base, __pa(block) + k_size_base,
- k_size_more, PAGE_KERNEL);
- if (v_block_mapped(k_start))
- k_cur = k_start + k_size_base;
- if (v_block_mapped(k_start + k_size_base))
- k_cur = k_start + k_size_base + k_size_more;
-
- update_bats();
+ while (k_nobat < k_end) {
+ unsigned int k_size = bat_block_size(k_nobat, k_end);
+ int idx = find_free_bat();
+
+ if (idx == -1)
+ break;
+ if (k_size < SZ_128K)
+ break;
+ phys = memblock_phys_alloc_range(k_size, k_size, 0,
+ MEMBLOCK_ALLOC_ANYWHERE);
+ if (!phys)
+ break;
+
+ setbat(idx, k_nobat, phys, k_size, PAGE_KERNEL);
+ k_nobat += k_size;
}
+ if (k_nobat != k_start)
+ update_bats();
- if (!block)
- block = memblock_alloc(k_size, PAGE_SIZE);
- if (!block)
- return -ENOMEM;
+ if (k_nobat < k_end) {
+ phys = memblock_phys_alloc_range(k_end - k_nobat, PAGE_SIZE, 0,
+ MEMBLOCK_ALLOC_ANYWHERE);
+ if (!phys)
+ return -ENOMEM;
+ }
ret = kasan_init_shadow_page_tables(k_start, k_end);
if (ret)
return ret;
- kasan_update_early_region(k_start, k_cur, __pte(0));
+ kasan_update_early_region(k_start, k_nobat, __pte(0));
- for (; k_cur < k_end; k_cur += PAGE_SIZE) {
+ for (k_cur = k_nobat; k_cur < k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_off_k(k_cur);
- void *va = block + k_cur - k_start;
- pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
+ pte_t pte = pfn_pte(PHYS_PFN(phys + k_cur - k_nobat), PAGE_KERNEL);
__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
}
flush_tlb_kernel_range(k_start, k_end);
+ memset(kasan_mem_to_shadow(start), 0, k_end - k_start);
+
return 0;
}
--
2.33.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 27fe73394a1c6d0b07fa4d95f1bca116d1cc66e9 Mon Sep 17 00:00:00 2001
From: Peter Collingbourne <pcc(a)google.com>
Date: Sat, 29 Jan 2022 13:41:14 -0800
Subject: [PATCH] mm, kasan: use compare-exchange operation to set KASAN page
tag
It has been reported that the tag setting operation on newly-allocated
pages can cause the page flags to be corrupted when performed
concurrently with other flag updates as a result of the use of
non-atomic operations.
Fix the problem by using a compare-exchange loop to update the tag.
Link: https://lkml.kernel.org/r/20220120020148.1632253-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0c…
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e1a84b1e6787..213cc569b192 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1506,11 +1506,18 @@ static inline u8 page_kasan_tag(const struct page *page)
static inline void page_kasan_tag_set(struct page *page, u8 tag)
{
- if (kasan_enabled()) {
- tag ^= 0xff;
- page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
- page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
- }
+ unsigned long old_flags, flags;
+
+ if (!kasan_enabled())
+ return;
+
+ tag ^= 0xff;
+ old_flags = READ_ONCE(page->flags);
+ do {
+ flags = old_flags;
+ flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
+ flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
+ } while (unlikely(!try_cmpxchg(&page->flags, &old_flags, flags)));
}
static inline void page_kasan_tag_reset(struct page *page)
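For readers less familiar with the pattern, here is a minimal userspace sketch of the same read-modify-compare-exchange loop using C11 atomics. It only illustrates the idea behind the patch, not the kernel's try_cmpxchg(), and the bit position of the tag field is a made-up value:

#include <stdatomic.h>
#include <stdio.h>

#define TAG_MASK	0xffUL
#define TAG_SHIFT	48	/* hypothetical position of the tag field */

static _Atomic unsigned long flags;

/* Update only the tag bits; concurrent updates to other bits are not lost
 * because the compare-exchange retries whenever 'flags' changed under us. */
static void set_tag(unsigned char tag)
{
	unsigned long old = atomic_load_explicit(&flags, memory_order_relaxed);
	unsigned long new;

	do {
		new = old;
		new &= ~(TAG_MASK << TAG_SHIFT);
		new |= ((unsigned long)tag & TAG_MASK) << TAG_SHIFT;
		/* on failure, 'old' is refreshed with the current value */
	} while (!atomic_compare_exchange_weak(&flags, &old, new));
}

int main(void)
{
	set_tag(0x2a);
	printf("flags = %#lx\n", (unsigned long)flags);
	return 0;
}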
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 27fe73394a1c6d0b07fa4d95f1bca116d1cc66e9 Mon Sep 17 00:00:00 2001
From: Peter Collingbourne <pcc(a)google.com>
Date: Sat, 29 Jan 2022 13:41:14 -0800
Subject: [PATCH] mm, kasan: use compare-exchange operation to set KASAN page
tag
It has been reported that the tag setting operation on newly-allocated
pages can cause the page flags to be corrupted when performed
concurrently with other flag updates as a result of the use of
non-atomic operations.
Fix the problem by using a compare-exchange loop to update the tag.
Link: https://lkml.kernel.org/r/20220120020148.1632253-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0c…
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e1a84b1e6787..213cc569b192 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1506,11 +1506,18 @@ static inline u8 page_kasan_tag(const struct page *page)
static inline void page_kasan_tag_set(struct page *page, u8 tag)
{
- if (kasan_enabled()) {
- tag ^= 0xff;
- page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
- page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
- }
+ unsigned long old_flags, flags;
+
+ if (!kasan_enabled())
+ return;
+
+ tag ^= 0xff;
+ old_flags = READ_ONCE(page->flags);
+ do {
+ flags = old_flags;
+ flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
+ flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
+ } while (unlikely(!try_cmpxchg(&page->flags, &old_flags, flags)));
}
static inline void page_kasan_tag_reset(struct page *page)
Hello,
I've always had trouble with this driver in my Asus Zephyrus laptop,
but I was able to use it eventually, that's until 5.16.3 landed.
This version completely broke it. I'm unable to bring the interface
up, no matter what I try.
Before, sometimes I was able to make the chip work by suspending the
laptop, but in 5.16.3 the machine doesn't wake up (which is probably
another issue).
Reverting back to 5.16.2 makes it work.
Let me know if you need more information, or if you would like me to
bisect the issue.
Cheers.
--
Felipe Contreras
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a37d9a17f099072fe4d3a9048b0321978707a918 Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il(a)gmail.com>
Date: Thu, 20 Jan 2022 23:53:04 +0200
Subject: [PATCH] fsnotify: invalidate dcache before IN_DELETE event
Apparently, there are some applications that use IN_DELETE event as an
invalidation mechanism and expect that if they try to open a file with
the name reported with the delete event, that it should not contain the
content of the deleted file.
Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of
d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify
will have access to a positive dentry.
This allowed a race where opening the deleted file via cached dentry
is now possible after receiving the IN_DELETE event.
To fix the regression, create a new hook fsnotify_delete() that takes
the unlinked inode as an argument and use a helper d_delete_notify() to
pin the inode, so we can pass it to fsnotify_delete() after d_delete().
Backporting hint: this regression is from v5.3. Although the patch will
apply with only trivial conflicts to v5.4 and v5.10, it won't build,
because the fsnotify_delete() implementation needs to differ in each of
those versions (see fsnotify_link()).
A follow up patch will fix the fsnotify_unlink/rmdir() calls in pseudo
filesystem that do not need to call d_delete().
Link: https://lore.kernel.org/r/20220120215305.282577-1-amir73il@gmail.com
Reported-by: Ivan Delalande <colona(a)arista.com>
Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/
Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()")
Cc: stable(a)vger.kernel.org # v5.3+
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a5bd6926f7ff..7807b28b7892 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3086,10 +3086,8 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
btrfs_inode_lock(inode, 0);
err = btrfs_delete_subvolume(dir, dentry);
btrfs_inode_unlock(inode, 0);
- if (!err) {
- fsnotify_rmdir(dir, dentry);
- d_delete(dentry);
- }
+ if (!err)
+ d_delete_notify(dir, dentry);
out_dput:
dput(dentry);
diff --git a/fs/namei.c b/fs/namei.c
index d81f04f8d818..4ed0e41feab7 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3974,13 +3974,12 @@ int vfs_rmdir(struct user_namespace *mnt_userns, struct inode *dir,
dentry->d_inode->i_flags |= S_DEAD;
dont_mount(dentry);
detach_mounts(dentry);
- fsnotify_rmdir(dir, dentry);
out:
inode_unlock(dentry->d_inode);
dput(dentry);
if (!error)
- d_delete(dentry);
+ d_delete_notify(dir, dentry);
return error;
}
EXPORT_SYMBOL(vfs_rmdir);
@@ -4102,7 +4101,6 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
if (!error) {
dont_mount(dentry);
detach_mounts(dentry);
- fsnotify_unlink(dir, dentry);
}
}
}
@@ -4110,9 +4108,11 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
inode_unlock(target);
/* We don't d_delete() NFS sillyrenamed files--they still exist. */
- if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
+ if (!error && dentry->d_flags & DCACHE_NFSFS_RENAMED) {
+ fsnotify_unlink(dir, dentry);
+ } else if (!error) {
fsnotify_link_count(target);
- d_delete(dentry);
+ d_delete_notify(dir, dentry);
}
return error;
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 3a2d7dc3c607..bb8467cd11ae 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -224,6 +224,43 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode,
dir, &new_dentry->d_name, 0);
}
+/*
+ * fsnotify_delete - @dentry was unlinked and unhashed
+ *
+ * Caller must make sure that dentry->d_name is stable.
+ *
+ * Note: unlike fsnotify_unlink(), we have to pass also the unlinked inode
+ * as this may be called after d_delete() and old_dentry may be negative.
+ */
+static inline void fsnotify_delete(struct inode *dir, struct inode *inode,
+ struct dentry *dentry)
+{
+ __u32 mask = FS_DELETE;
+
+ if (S_ISDIR(inode->i_mode))
+ mask |= FS_ISDIR;
+
+ fsnotify_name(mask, inode, FSNOTIFY_EVENT_INODE, dir, &dentry->d_name,
+ 0);
+}
+
+/**
+ * d_delete_notify - delete a dentry and call fsnotify_delete()
+ * @dentry: The dentry to delete
+ *
+ * This helper is used to guaranty that the unlinked inode cannot be found
+ * by lookup of this name after fsnotify_delete() event has been delivered.
+ */
+static inline void d_delete_notify(struct inode *dir, struct dentry *dentry)
+{
+ struct inode *inode = d_inode(dentry);
+
+ ihold(inode);
+ d_delete(dentry);
+ fsnotify_delete(dir, inode, dentry);
+ iput(inode);
+}
+
/*
* fsnotify_unlink - 'name' was unlinked
*
@@ -231,10 +268,10 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode,
*/
static inline void fsnotify_unlink(struct inode *dir, struct dentry *dentry)
{
- /* Expected to be called before d_delete() */
- WARN_ON_ONCE(d_is_negative(dentry));
+ if (WARN_ON_ONCE(d_is_negative(dentry)))
+ return;
- fsnotify_dirent(dir, dentry, FS_DELETE);
+ fsnotify_delete(dir, d_inode(dentry), dentry);
}
/*
@@ -258,10 +295,10 @@ static inline void fsnotify_mkdir(struct inode *dir, struct dentry *dentry)
*/
static inline void fsnotify_rmdir(struct inode *dir, struct dentry *dentry)
{
- /* Expected to be called before d_delete() */
- WARN_ON_ONCE(d_is_negative(dentry));
+ if (WARN_ON_ONCE(d_is_negative(dentry)))
+ return;
- fsnotify_dirent(dir, dentry, FS_DELETE | FS_ISDIR);
+ fsnotify_delete(dir, d_inode(dentry), dentry);
}
/*
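To make the reported expectation concrete, here is a minimal userspace sketch of the invalidation pattern described above (the directory and file names are made up): once IN_DELETE for a name has been consumed, an open() of that name is expected to fail with ENOENT. Before this fix, a still-cached dentry could let the open succeed.

#include <sys/inotify.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
	int fd = inotify_init1(0);
	ssize_t len;

	inotify_add_watch(fd, "dir", IN_DELETE);

	len = read(fd, buf, sizeof(buf));	/* blocks until an event arrives */
	for (char *p = buf; p < buf + len; ) {
		struct inotify_event *ev = (struct inotify_event *)p;

		if ((ev->mask & IN_DELETE) && ev->len &&
		    !strcmp(ev->name, "victim")) {
			int f = open("dir/victim", O_RDONLY);

			/* expected: f < 0 && errno == ENOENT */
			printf("open after IN_DELETE: %d (errno %d)\n", f, errno);
			if (f >= 0)
				close(f);
		}
		p += sizeof(*ev) + ev->len;
	}
	close(fd);
	return 0;
}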
From: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Subject: jbd2: export jbd2_journal_[grab|put]_journal_head
Patch series "ocfs2: fix a deadlock case".
This fixes a deadlock case in ocfs2. We first export the jbd2 symbols
jbd2_journal_[grab|put]_journal_head as preparation and later use them in
ocfs2 instead of jbd_[lock|unlock]_bh_journal_head to fix the deadlock.
This patch (of 2):
This exports the symbols jbd2_journal_[grab|put]_journal_head, which will be
used by other modules, e.g. ocfs2.
Link: https://lkml.kernel.org/r/20220121071205.100648-2-joseph.qi@linux.alibaba.c…
Signed-off-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: Andreas Dilger <adilger.kernel(a)dilger.ca>
Cc: Gautham Ananthakrishna <gautham.ananthakrishna(a)oracle.com>
Cc: Saeed Mirzamohammadi <saeed.mirzamohammadi(a)oracle.com>
Cc: "Theodore Ts'o" <tytso(a)mit.edu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/jbd2/journal.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/jbd2/journal.c~jbd2-export-jbd2_journal__journal_head
+++ a/fs/jbd2/journal.c
@@ -2972,6 +2972,7 @@ struct journal_head *jbd2_journal_grab_j
jbd_unlock_bh_journal_head(bh);
return jh;
}
+EXPORT_SYMBOL(jbd2_journal_grab_journal_head);
static void __journal_remove_journal_head(struct buffer_head *bh)
{
@@ -3024,6 +3025,7 @@ void jbd2_journal_put_journal_head(struc
jbd_unlock_bh_journal_head(bh);
}
}
+EXPORT_SYMBOL(jbd2_journal_put_journal_head);
/*
* Initialize jbd inode head
_
From: Peter Collingbourne <pcc(a)google.com>
Subject: mm, kasan: use compare-exchange operation to set KASAN page tag
It has been reported that the tag setting operation on newly-allocated
pages can cause the page flags to be corrupted when performed concurrently
with other flag updates as a result of the use of non-atomic operations.
Fix the problem by using a compare-exchange loop to update the tag.
Link: https://lkml.kernel.org/r/20220120020148.1632253-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0c…
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
--- a/include/linux/mm.h~mm-use-compare-exchange-operation-to-set-kasan-page-tag
+++ a/include/linux/mm.h
@@ -1506,11 +1506,18 @@ static inline u8 page_kasan_tag(const st
static inline void page_kasan_tag_set(struct page *page, u8 tag)
{
- if (kasan_enabled()) {
- tag ^= 0xff;
- page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
- page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
- }
+ unsigned long old_flags, flags;
+
+ if (!kasan_enabled())
+ return;
+
+ tag ^= 0xff;
+ old_flags = READ_ONCE(page->flags);
+ do {
+ flags = old_flags;
+ flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
+ flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
+ } while (unlikely(!try_cmpxchg(&page->flags, &old_flags, flags)));
}
static inline void page_kasan_tag_reset(struct page *page)
_
A previous patch to skip part of the initialization when a USB3 PHY was
not present could result in the return value being uninitialized in that
case, causing spurious probe failures. Initialize ret to 0 to avoid this.
Fixes: 9678f3361afc ("usb: dwc3: xilinx: Skip resets and USB3 register settings for USB2.0 mode")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
---
drivers/usb/dwc3/dwc3-xilinx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/dwc3-xilinx.c b/drivers/usb/dwc3/dwc3-xilinx.c
index e14ac15e24c3..a6f3a9b38789 100644
--- a/drivers/usb/dwc3/dwc3-xilinx.c
+++ b/drivers/usb/dwc3/dwc3-xilinx.c
@@ -99,7 +99,7 @@ static int dwc3_xlnx_init_zynqmp(struct dwc3_xlnx *priv_data)
struct device *dev = priv_data->dev;
struct reset_control *crst, *hibrst, *apbrst;
struct phy *usb3_phy;
- int ret;
+ int ret = 0;
u32 reg;
usb3_phy = devm_phy_optional_get(dev, "usb3-phy");
--
2.31.1
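A minimal sketch of the hazard being closed here, with made-up names (the real code path is dwc3_xlnx_init_zynqmp() in the diff above): when the optional PHY is absent, the branch that sets the return value is skipped, so without the initializer the function returns whatever happens to be on the stack.

#include <stdio.h>

static int init_controller(int have_usb3_phy)
{
	int ret = 0;	/* the one-line fix: a defined value on the skip path */

	if (have_usb3_phy) {
		/* ... resets and USB3 register settings ... */
		ret = 0;	/* or an error code from those steps */
	}
	return ret;	/* without '= 0', garbage when have_usb3_phy == 0 */
}

int main(void)
{
	printf("%d\n", init_controller(0));
	return 0;
}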
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a37d9a17f099072fe4d3a9048b0321978707a918 Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il(a)gmail.com>
Date: Thu, 20 Jan 2022 23:53:04 +0200
Subject: [PATCH] fsnotify: invalidate dcache before IN_DELETE event
Apparently, there are some applications that use IN_DELETE event as an
invalidation mechanism and expect that if they try to open a file with
the name reported with the delete event, that it should not contain the
content of the deleted file.
Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of
d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify
will have access to a positive dentry.
This allowed a race where opening the deleted file via cached dentry
is now possible after receiving the IN_DELETE event.
To fix the regression, create a new hook fsnotify_delete() that takes
the unlinked inode as an argument and use a helper d_delete_notify() to
pin the inode, so we can pass it to fsnotify_delete() after d_delete().
Backporting hint: this regression is from v5.3. Although the patch will
apply with only trivial conflicts to v5.4 and v5.10, it won't build,
because the fsnotify_delete() implementation needs to differ in each of
those versions (see fsnotify_link()).
A follow up patch will fix the fsnotify_unlink/rmdir() calls in pseudo
filesystem that do not need to call d_delete().
Link: https://lore.kernel.org/r/20220120215305.282577-1-amir73il@gmail.com
Reported-by: Ivan Delalande <colona(a)arista.com>
Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/
Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()")
Cc: stable(a)vger.kernel.org # v5.3+
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a5bd6926f7ff..7807b28b7892 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3086,10 +3086,8 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
btrfs_inode_lock(inode, 0);
err = btrfs_delete_subvolume(dir, dentry);
btrfs_inode_unlock(inode, 0);
- if (!err) {
- fsnotify_rmdir(dir, dentry);
- d_delete(dentry);
- }
+ if (!err)
+ d_delete_notify(dir, dentry);
out_dput:
dput(dentry);
diff --git a/fs/namei.c b/fs/namei.c
index d81f04f8d818..4ed0e41feab7 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3974,13 +3974,12 @@ int vfs_rmdir(struct user_namespace *mnt_userns, struct inode *dir,
dentry->d_inode->i_flags |= S_DEAD;
dont_mount(dentry);
detach_mounts(dentry);
- fsnotify_rmdir(dir, dentry);
out:
inode_unlock(dentry->d_inode);
dput(dentry);
if (!error)
- d_delete(dentry);
+ d_delete_notify(dir, dentry);
return error;
}
EXPORT_SYMBOL(vfs_rmdir);
@@ -4102,7 +4101,6 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
if (!error) {
dont_mount(dentry);
detach_mounts(dentry);
- fsnotify_unlink(dir, dentry);
}
}
}
@@ -4110,9 +4108,11 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
inode_unlock(target);
/* We don't d_delete() NFS sillyrenamed files--they still exist. */
- if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
+ if (!error && dentry->d_flags & DCACHE_NFSFS_RENAMED) {
+ fsnotify_unlink(dir, dentry);
+ } else if (!error) {
fsnotify_link_count(target);
- d_delete(dentry);
+ d_delete_notify(dir, dentry);
}
return error;
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 3a2d7dc3c607..bb8467cd11ae 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -224,6 +224,43 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode,
dir, &new_dentry->d_name, 0);
}
+/*
+ * fsnotify_delete - @dentry was unlinked and unhashed
+ *
+ * Caller must make sure that dentry->d_name is stable.
+ *
+ * Note: unlike fsnotify_unlink(), we have to pass also the unlinked inode
+ * as this may be called after d_delete() and old_dentry may be negative.
+ */
+static inline void fsnotify_delete(struct inode *dir, struct inode *inode,
+ struct dentry *dentry)
+{
+ __u32 mask = FS_DELETE;
+
+ if (S_ISDIR(inode->i_mode))
+ mask |= FS_ISDIR;
+
+ fsnotify_name(mask, inode, FSNOTIFY_EVENT_INODE, dir, &dentry->d_name,
+ 0);
+}
+
+/**
+ * d_delete_notify - delete a dentry and call fsnotify_delete()
+ * @dentry: The dentry to delete
+ *
+ * This helper is used to guaranty that the unlinked inode cannot be found
+ * by lookup of this name after fsnotify_delete() event has been delivered.
+ */
+static inline void d_delete_notify(struct inode *dir, struct dentry *dentry)
+{
+ struct inode *inode = d_inode(dentry);
+
+ ihold(inode);
+ d_delete(dentry);
+ fsnotify_delete(dir, inode, dentry);
+ iput(inode);
+}
+
/*
* fsnotify_unlink - 'name' was unlinked
*
@@ -231,10 +268,10 @@ static inline void fsnotify_link(struct inode *dir, struct inode *inode,
*/
static inline void fsnotify_unlink(struct inode *dir, struct dentry *dentry)
{
- /* Expected to be called before d_delete() */
- WARN_ON_ONCE(d_is_negative(dentry));
+ if (WARN_ON_ONCE(d_is_negative(dentry)))
+ return;
- fsnotify_dirent(dir, dentry, FS_DELETE);
+ fsnotify_delete(dir, d_inode(dentry), dentry);
}
/*
@@ -258,10 +295,10 @@ static inline void fsnotify_mkdir(struct inode *dir, struct dentry *dentry)
*/
static inline void fsnotify_rmdir(struct inode *dir, struct dentry *dentry)
{
- /* Expected to be called before d_delete() */
- WARN_ON_ONCE(d_is_negative(dentry));
+ if (WARN_ON_ONCE(d_is_negative(dentry)))
+ return;
- fsnotify_dirent(dir, dentry, FS_DELETE | FS_ISDIR);
+ fsnotify_delete(dir, d_inode(dentry), dentry);
}
/*
From: Andy Lutomirski <luto(a)kernel.org>
We don't need spinlocks to protect batched entropy -- all we need
is a little bit of care. This should fix up the following splat that
Jonathan received with a PROVE_LOCKING=y/PROVE_RAW_LOCK_NESTING=y
kernel:
[ 2.500000] [ BUG: Invalid wait context ]
[ 2.500000] 5.17.0-rc1 #563 Not tainted
[ 2.500000] -----------------------------
[ 2.500000] swapper/1 is trying to lock:
[ 2.500000] c0b0e9cc (batched_entropy_u32.lock){....}-{3:3}, at: invalidate_batched_entropy+0x18/0x4c
[ 2.500000] other info that might help us debug this:
[ 2.500000] context-{2:2}
[ 2.500000] 3 locks held by swapper/1:
[ 2.500000] #0: c0ae86ac (event_mutex){+.+.}-{4:4}, at: event_trace_init+0x4c/0xd8
[ 2.500000] #1: c0ae81b8 (trace_event_sem){+.+.}-{4:4}, at: event_trace_init+0x68/0xd8
[ 2.500000] #2: c19b05cc (&sb->s_type->i_mutex_key#2){+.+.}-{4:4}, at: start_creating+0x40/0xc4
[ 2.500000] stack backtrace:
[ 2.500000] CPU: 0 PID: 1 Comm: swapper Not tainted 5.17.0-rc1 #563
[ 2.500000] Hardware name: WPCM450 chip
[ 2.500000] [<c00100a8>] (unwind_backtrace) from [<c000db2c>] (show_stack+0x10/0x14)
[ 2.500000] [<c000db2c>] (show_stack) from [<c0054eec>] (__lock_acquire+0x3f0/0x189c)
[ 2.500000] [<c0054eec>] (__lock_acquire) from [<c0054478>] (lock_acquire+0x2b8/0x354)
[ 2.500000] [<c0054478>] (lock_acquire) from [<c0568028>] (_raw_spin_lock_irqsave+0x60/0x74)
[ 2.500000] [<c0568028>] (_raw_spin_lock_irqsave) from [<c030b6f4>] (invalidate_batched_entropy+0x18/0x4c)
[ 2.500000] [<c030b6f4>] (invalidate_batched_entropy) from [<c030e7fc>] (crng_fast_load+0xf0/0x110)
[ 2.500000] [<c030e7fc>] (crng_fast_load) from [<c030e954>] (add_interrupt_randomness+0x138/0x200)
[ 2.500000] [<c030e954>] (add_interrupt_randomness) from [<c0061b34>] (handle_irq_event_percpu+0x18/0x38)
[ 2.500000] [<c0061b34>] (handle_irq_event_percpu) from [<c0061b8c>] (handle_irq_event+0x38/0x5c)
[ 2.500000] [<c0061b8c>] (handle_irq_event) from [<c0065b28>] (handle_fasteoi_irq+0x9c/0x114)
[ 2.500000] [<c0065b28>] (handle_fasteoi_irq) from [<c0061178>] (handle_irq_desc+0x24/0x34)
[ 2.500000] [<c0061178>] (handle_irq_desc) from [<c056214c>] (generic_handle_arch_irq+0x28/0x3c)
[ 2.500000] [<c056214c>] (generic_handle_arch_irq) from [<c0008eb4>] (__irq_svc+0x54/0x80)
[ 2.500000] Exception stack(0xc1485d48 to 0xc1485d90)
[ 2.500000] 5d40: 9780e804 00000001 c09413d4 200000d3 60000053 c016af54
[ 2.500000] 5d60: 00000000 c0afa5b8 c14194e0 c19a1d48 c0789ce0 00000000 c1490480 c1485d98
[ 2.500000] 5d80: c0168970 c0168984 20000053 ffffffff
[ 2.500000] [<c0008eb4>] (__irq_svc) from [<c0168984>] (read_seqbegin.constprop.0+0x6c/0x90)
[ 2.500000] [<c0168984>] (read_seqbegin.constprop.0) from [<c016af54>] (d_lookup+0x14/0x40)
[ 2.500000] [<c016af54>] (d_lookup) from [<c015cecc>] (lookup_dcache+0x18/0x50)
[ 2.500000] [<c015cecc>] (lookup_dcache) from [<c015d868>] (lookup_one_len+0x90/0xe0)
[ 2.500000] [<c015d868>] (lookup_one_len) from [<c01e33e4>] (start_creating+0x68/0xc4)
[ 2.500000] [<c01e33e4>] (start_creating) from [<c01e398c>] (tracefs_create_file+0x30/0x11c)
[ 2.500000] [<c01e398c>] (tracefs_create_file) from [<c00c42f8>] (trace_create_file+0x14/0x38)
[ 2.500000] [<c00c42f8>] (trace_create_file) from [<c00cc854>] (event_create_dir+0x310/0x420)
[ 2.500000] [<c00cc854>] (event_create_dir) from [<c00cc9d8>] (__trace_early_add_event_dirs+0x28/0x50)
[ 2.500000] [<c00cc9d8>] (__trace_early_add_event_dirs) from [<c07c8d64>] (event_trace_init+0x70/0xd8)
[ 2.500000] [<c07c8d64>] (event_trace_init) from [<c07c8560>] (tracer_init_tracefs+0x14/0x284)
[ 2.500000] [<c07c8560>] (tracer_init_tracefs) from [<c000a330>] (do_one_initcall+0xdc/0x288)
[ 2.500000] [<c000a330>] (do_one_initcall) from [<c07bd1e8>] (kernel_init_freeable+0x1c4/0x20c)
[ 2.500000] [<c07bd1e8>] (kernel_init_freeable) from [<c05629c0>] (kernel_init+0x10/0x110)
[ 2.500000] [<c05629c0>] (kernel_init) from [<c00084f8>] (ret_from_fork+0x14/0x3c)
[ 2.500000] Exception stack(0xc1485fb0 to 0xc1485ff8)
[ 2.500000] 5fa0: 00000000 00000000 00000000 00000000
[ 2.500000] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 2.500000] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
[Jason: I extracted this from a larger in-progress series of Andy's that
also unifies the two batches into one and does other performance
things. Since that's still under development, but because we need this
part to fix the CONFIG_PROVE_RAW_LOCK_NESTING issue, I've extracted it
out and applied it to the current setup. This will also make it easier
to backport to old kernels that also need the fix. I've also amended
Andy's original commit message.]
Reported-by: Jonathan Neuschäfer <j.neuschaefer(a)gmx.net>
Link: https://lore.kernel.org/lkml/YfMa0QgsjCVdRAvJ@latitude/
Fixes: b7d5dc21072c ("random: add a spinlock_t to struct batched_entropy")
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
Andy - could you take a look at this and let me know if it's still
correct after I've ported it out of your series and into a standalone
thing here? I'd prefer to hold off on moving forward on this until I
receive your green light. I'm also still a bit uncertain about your NB:
comment regarding the acceptable race. If you could elaborate on that
argument, it might save me a few cycles with my thinking cap on.
drivers/char/random.c | 57 ++++++++++++++++++++++++-------------------
1 file changed, 32 insertions(+), 25 deletions(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c
index b411182df6f6..22c190aecbe8 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -2063,7 +2063,6 @@ struct batched_entropy {
u32 entropy_u32[CHACHA_BLOCK_SIZE / sizeof(u32)];
};
unsigned int position;
- spinlock_t batch_lock;
};
/*
@@ -2075,7 +2074,7 @@ struct batched_entropy {
* point prior.
*/
static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u64) = {
- .batch_lock = __SPIN_LOCK_UNLOCKED(batched_entropy_u64.lock),
+ .position = ARRAY_SIZE(((struct batched_entropy *)0)->entropy_u64)
};
u64 get_random_u64(void)
@@ -2083,42 +2082,55 @@ u64 get_random_u64(void)
u64 ret;
unsigned long flags;
struct batched_entropy *batch;
+ size_t position;
static void *previous;
warn_unseeded_randomness(&previous);
- batch = raw_cpu_ptr(&batched_entropy_u64);
- spin_lock_irqsave(&batch->batch_lock, flags);
- if (batch->position % ARRAY_SIZE(batch->entropy_u64) == 0) {
+ local_irq_save(flags);
+ batch = this_cpu_ptr(&batched_entropy_u64);
+ position = READ_ONCE(batch->position);
+ /* NB: position can change to ARRAY_SIZE(batch->entropy_u64) out
+ * from under us -- see invalidate_batched_entropy(). If this
+ * happens, it's okay if we still return the data in the batch. */
+ if (unlikely(position + 1 > ARRAY_SIZE(batch->entropy_u64))) {
extract_crng((u8 *)batch->entropy_u64);
- batch->position = 0;
+ position = 0;
}
- ret = batch->entropy_u64[batch->position++];
- spin_unlock_irqrestore(&batch->batch_lock, flags);
+ ret = batch->entropy_u64[position++];
+ WRITE_ONCE(batch->position, position);
+ local_irq_restore(flags);
return ret;
}
EXPORT_SYMBOL(get_random_u64);
static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u32) = {
- .batch_lock = __SPIN_LOCK_UNLOCKED(batched_entropy_u32.lock),
+ .position = ARRAY_SIZE(((struct batched_entropy *)0)->entropy_u32)
};
+
u32 get_random_u32(void)
{
u32 ret;
unsigned long flags;
struct batched_entropy *batch;
+ size_t position;
static void *previous;
warn_unseeded_randomness(&previous);
- batch = raw_cpu_ptr(&batched_entropy_u32);
- spin_lock_irqsave(&batch->batch_lock, flags);
- if (batch->position % ARRAY_SIZE(batch->entropy_u32) == 0) {
+ local_irq_save(flags);
+ batch = this_cpu_ptr(&batched_entropy_u32);
+ position = READ_ONCE(batch->position);
+ /* NB: position can change to ARRAY_SIZE(batch->entropy_u32) out
+ * from under us -- see invalidate_batched_entropy(). If this
+ * happens, it's okay if we still return the data in the batch. */
+ if (unlikely(position + 1 > ARRAY_SIZE(batch->entropy_u32))) {
extract_crng((u8 *)batch->entropy_u32);
- batch->position = 0;
+ position = 0;
}
- ret = batch->entropy_u32[batch->position++];
- spin_unlock_irqrestore(&batch->batch_lock, flags);
+ ret = batch->entropy_u32[position++];
+ WRITE_ONCE(batch->position, position);
+ local_irq_restore(flags);
return ret;
}
EXPORT_SYMBOL(get_random_u32);
@@ -2130,20 +2142,15 @@ EXPORT_SYMBOL(get_random_u32);
static void invalidate_batched_entropy(void)
{
int cpu;
- unsigned long flags;
for_each_possible_cpu(cpu) {
- struct batched_entropy *batched_entropy;
+ struct batched_entropy *batch;
- batched_entropy = per_cpu_ptr(&batched_entropy_u32, cpu);
- spin_lock_irqsave(&batched_entropy->batch_lock, flags);
- batched_entropy->position = 0;
- spin_unlock(&batched_entropy->batch_lock);
+ batch = per_cpu_ptr(&batched_entropy_u32, cpu);
+ WRITE_ONCE(batch->position, ARRAY_SIZE(batch->entropy_u32));
- batched_entropy = per_cpu_ptr(&batched_entropy_u64, cpu);
- spin_lock(&batched_entropy->batch_lock);
- batched_entropy->position = 0;
- spin_unlock_irqrestore(&batched_entropy->batch_lock, flags);
+ batch = per_cpu_ptr(&batched_entropy_u64, cpu);
+ WRITE_ONCE(batch->position, ARRAY_SIZE(batch->entropy_u64));
}
}
--
2.35.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2148927e6ed43a1667baf7c2ae3e0e05a44b51a0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Beh=C3=BAn?= <kabel(a)kernel.org>
Date: Wed, 19 Jan 2022 17:44:55 +0100
Subject: [PATCH] net: sfp: ignore disabled SFP node
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices
and sfp cages") added code which finds SFP bus DT node even if the node
is disabled with status = "disabled". Because of this, when phylink is
created, it ends with non-null .sfp_bus member, even though the SFP
module is not probed (because the node is disabled).
We need to ignore disabled SFP bus node.
Fixes: ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages")
Signed-off-by: Marek Behún <kabel(a)kernel.org>
Cc: stable(a)vger.kernel.org # 2203cbf2c8b5 ("net: sfp: move fwnode parsing into sfp-bus layer")
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index 0c6c0d1843bc..c1512c9925a6 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -651,6 +651,11 @@ struct sfp_bus *sfp_bus_find_fwnode(struct fwnode_handle *fwnode)
else if (ret < 0)
return ERR_PTR(ret);
+ if (!fwnode_device_is_available(ref.fwnode)) {
+ fwnode_handle_put(ref.fwnode);
+ return NULL;
+ }
+
bus = sfp_bus_get(ref.fwnode);
fwnode_handle_put(ref.fwnode);
if (!bus)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2148927e6ed43a1667baf7c2ae3e0e05a44b51a0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Beh=C3=BAn?= <kabel(a)kernel.org>
Date: Wed, 19 Jan 2022 17:44:55 +0100
Subject: [PATCH] net: sfp: ignore disabled SFP node
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices
and sfp cages") added code which finds SFP bus DT node even if the node
is disabled with status = "disabled". Because of this, when phylink is
created, it ends with non-null .sfp_bus member, even though the SFP
module is not probed (because the node is disabled).
We need to ignore disabled SFP bus node.
Fixes: ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages")
Signed-off-by: Marek Behún <kabel(a)kernel.org>
Cc: stable(a)vger.kernel.org # 2203cbf2c8b5 ("net: sfp: move fwnode parsing into sfp-bus layer")
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index 0c6c0d1843bc..c1512c9925a6 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -651,6 +651,11 @@ struct sfp_bus *sfp_bus_find_fwnode(struct fwnode_handle *fwnode)
else if (ret < 0)
return ERR_PTR(ret);
+ if (!fwnode_device_is_available(ref.fwnode)) {
+ fwnode_handle_put(ref.fwnode);
+ return NULL;
+ }
+
bus = sfp_bus_get(ref.fwnode);
fwnode_handle_put(ref.fwnode);
if (!bus)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2148927e6ed43a1667baf7c2ae3e0e05a44b51a0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Beh=C3=BAn?= <kabel(a)kernel.org>
Date: Wed, 19 Jan 2022 17:44:55 +0100
Subject: [PATCH] net: sfp: ignore disabled SFP node
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices
and sfp cages") added code which finds SFP bus DT node even if the node
is disabled with status = "disabled". Because of this, when phylink is
created, it ends with non-null .sfp_bus member, even though the SFP
module is not probed (because the node is disabled).
We need to ignore disabled SFP bus node.
Fixes: ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages")
Signed-off-by: Marek Behún <kabel(a)kernel.org>
Cc: stable(a)vger.kernel.org # 2203cbf2c8b5 ("net: sfp: move fwnode parsing into sfp-bus layer")
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index 0c6c0d1843bc..c1512c9925a6 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -651,6 +651,11 @@ struct sfp_bus *sfp_bus_find_fwnode(struct fwnode_handle *fwnode)
else if (ret < 0)
return ERR_PTR(ret);
+ if (!fwnode_device_is_available(ref.fwnode)) {
+ fwnode_handle_put(ref.fwnode);
+ return NULL;
+ }
+
bus = sfp_bus_get(ref.fwnode);
fwnode_handle_put(ref.fwnode);
if (!bus)
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending x_char in stm32_usart_transmit_chars(), driver can overwrite
the value of TDR register by the value of x_char. If this happens, the
previous value that was present in TDR register will not be sent through
uart.
This code checks if the previous value in TDR register is sent before
writing the x_char value into register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
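As an aside, the core idea of the fix is just "poll the ready bit, bounded, before overwriting the data register". Below is a minimal userspace sketch with made-up register accessors and a hypothetical status bit; the driver itself uses readl_relaxed_poll_timeout_atomic() as shown above.

#include <stdint.h>
#include <stdio.h>

#define SR_TXE	(1u << 7)	/* hypothetical "TDR empty" status bit */

static uint32_t fake_isr = SR_TXE;	/* stand-in for the ISR register */

static uint32_t read_isr(void) { return fake_isr; }
static void write_tdr(uint32_t v) { printf("TDR <- %#x\n", v); }

/* Wait (bounded) for TXE before writing x_char, mirroring the patch: if
 * the bit never comes up, warn and write anyway, accepting that the
 * character still sitting in TDR may be lost. */
static void send_x_char(uint8_t x_char)
{
	int spins = 100;	/* crude stand-in for the 10us/1000us timeout */

	while (!(read_isr() & SR_TXE) && --spins)
		;
	if (!spins)
		fprintf(stderr, "1 character may be erased\n");
	write_tdr(x_char);
}

int main(void)
{
	send_x_char(0x11);	/* XON */
	return 0;
}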
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending x_char in stm32_usart_transmit_chars(), driver can overwrite
the value of TDR register by the value of x_char. If this happens, the
previous value that was present in TDR register will not be sent through
uart.
This code checks if the previous value in TDR register is sent before
writing the x_char value into register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending x_char in stm32_usart_transmit_chars(), driver can overwrite
the value of TDR register by the value of x_char. If this happens, the
previous value that was present in TDR register will not be sent through
uart.
This code checks if the previous value in TDR register is sent before
writing the x_char value into register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending x_char in stm32_usart_transmit_chars(), driver can overwrite
the value of TDR register by the value of x_char. If this happens, the
previous value that was present in TDR register will not be sent through
uart.
This code checks if the previous value in TDR register is sent before
writing the x_char value into register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending x_char in stm32_usart_transmit_chars(), driver can overwrite
the value of TDR register by the value of x_char. If this happens, the
previous value that was present in TDR register will not be sent through
uart.
This code checks if the previous value in TDR register is sent before
writing the x_char value into register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending x_char in stm32_usart_transmit_chars(), the driver can
overwrite the value of the TDR register with the value of x_char. If this
happens, the previous value that was present in the TDR register will not
be sent through the UART.
This change checks that the previous value in the TDR register has been
sent before writing the x_char value into the register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending x_char in stm32_usart_transmit_chars(), the driver can
overwrite the value of the TDR register with the value of x_char. If this
happens, the previous value that was present in the TDR register will not
be sent through the UART.
This change checks that the previous value in the TDR register has been
sent before writing the x_char value into the register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d3d079bde07e1b7deaeb57506dc0b86010121d17 Mon Sep 17 00:00:00 2001
From: Valentin Caron <valentin.caron(a)foss.st.com>
Date: Tue, 11 Jan 2022 17:44:40 +0100
Subject: [PATCH] serial: stm32: prevent TDR register overwrite when sending
x_char
When sending x_char in stm32_usart_transmit_chars(), the driver can
overwrite the value of the TDR register with the value of x_char. If this
happens, the previous value that was present in the TDR register will not
be sent through the UART.
This change checks that the previous value in the TDR register has been
sent before writing the x_char value into the register.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Valentin Caron <valentin.caron(a)foss.st.com>
Link: https://lore.kernel.org/r/20220111164441.6178-2-valentin.caron@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 1f89ab0e49ac..c1b8828451c8 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -550,11 +550,23 @@ static void stm32_usart_transmit_chars(struct uart_port *port)
struct stm32_port *stm32_port = to_stm32_port(port);
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
struct circ_buf *xmit = &port->state->xmit;
+ u32 isr;
+ int ret;
if (port->x_char) {
if (stm32_usart_tx_dma_started(stm32_port) &&
stm32_usart_tx_dma_enabled(stm32_port))
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+ /* Check that TDR is empty before filling FIFO */
+ ret =
+ readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+ isr,
+ (isr & USART_SR_TXE),
+ 10, 1000);
+ if (ret)
+ dev_warn(port->dev, "1 character may be erased\n");
+
writel_relaxed(port->x_char, port->membase + ofs->tdr);
port->x_char = 0;
port->icount.tx++;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d06b1cf28297e27127d3da54753a3a01a2fa2f28 Mon Sep 17 00:00:00 2001
From: Robert Hancock <robert.hancock(a)calian.com>
Date: Wed, 12 Jan 2022 13:42:14 -0600
Subject: [PATCH] serial: 8250: of: Fix mapped region size when using
reg-offset property
8250_of supports a reg-offset property which is intended to handle
cases where the device registers start at an offset inside the region
of memory allocated to the device. The Xilinx 16550 UART, for which this
support was initially added, requires this. However, the code did not
adjust the overall size of the mapped region accordingly, causing the
driver to request an area of memory past the end of the device's
allocation. For example, if the UART was allocated an address of
0xb0130000, size of 0x10000 and reg-offset of 0x1000 in the device
tree, the region of memory reserved was b0131000-b0140fff, which caused
the driver for the region starting at b0140000 to fail to probe.
Fix this by subtracting reg-offset from the mapped region size.
Fixes: b912b5e2cfb3 ([POWERPC] Xilinx: of_serial support for Xilinx uart 16550.)
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Robert Hancock <robert.hancock(a)calian.com>
Link: https://lore.kernel.org/r/20220112194214.881844-1-robert.hancock@calian.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/8250/8250_of.c b/drivers/tty/serial/8250/8250_of.c
index bce28729dd7b..be8626234627 100644
--- a/drivers/tty/serial/8250/8250_of.c
+++ b/drivers/tty/serial/8250/8250_of.c
@@ -83,8 +83,17 @@ static int of_platform_serial_setup(struct platform_device *ofdev,
port->mapsize = resource_size(&resource);
/* Check for shifted address mapping */
- if (of_property_read_u32(np, "reg-offset", &prop) == 0)
+ if (of_property_read_u32(np, "reg-offset", &prop) == 0) {
+ if (prop >= port->mapsize) {
+ dev_warn(&ofdev->dev, "reg-offset %u exceeds region size %pa\n",
+ prop, &port->mapsize);
+ ret = -EINVAL;
+ goto err_unprepare;
+ }
+
port->mapbase += prop;
+ port->mapsize -= prop;
+ }
port->iotype = UPIO_MEM;
if (of_property_read_u32(np, "reg-io-width", &prop) == 0) {
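The fix above is essentially a bounds check plus an adjustment of both ends of the mapping. The short standalone C sketch below reproduces that arithmetic with the example numbers from the commit message; the struct and function names are invented for illustration and are not the 8250_of code.
#include <stdint.h>
#include <stdio.h>

/* Illustrative model of the mapped region; names are assumptions. */
struct mapped_region {
        uint64_t mapbase;
        uint64_t mapsize;
};

/* Apply a reg-offset, mirroring the check added by the fix. */
static int apply_reg_offset(struct mapped_region *r, uint64_t offset)
{
        if (offset >= r->mapsize) {
                fprintf(stderr, "reg-offset 0x%llx exceeds region size 0x%llx\n",
                        (unsigned long long)offset,
                        (unsigned long long)r->mapsize);
                return -1;
        }
        r->mapbase += offset;   /* registers start inside the region ...  */
        r->mapsize -= offset;   /* ... so the reservation must shrink too */
        return 0;
}

int main(void)
{
        /* Example values from the commit message. */
        struct mapped_region r = { .mapbase = 0xb0130000, .mapsize = 0x10000 };

        if (apply_reg_offset(&r, 0x1000) == 0)
                printf("region now %#llx-%#llx\n",
                       (unsigned long long)r.mapbase,
                       (unsigned long long)(r.mapbase + r.mapsize - 1));
        return 0;
}
With the offset subtracted, the reservation ends at 0xb013ffff and no longer overlaps the neighbouring device at 0xb0140000.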
From: Pablo Neira Ayuso <pablo(a)netfilter.org>
commit 4e1860a3863707e8177329c006d10f9e37e097a8 upstream.
IP fragments do not come with the transport header, hence skip bogus
layer 4 checksum updates.
Fixes: 1814096980bb ("netfilter: nft_payload: layer 4 checksum adjustment for pseudoheader fields")
Reported-and-tested-by: Steffen Weinreich <steve(a)weinreich.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Florian Westphal <fw(a)strlen.de>
---
This is already in the 5.y branches but 4.19 needs a minor
tweak as ->fragoff member resides in xt sub-struct.
net/netfilter/nft_payload.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index b1a9f330a51f..fd87216bc0a9 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -194,6 +194,9 @@ static int nft_payload_l4csum_offset(const struct nft_pktinfo *pkt,
struct sk_buff *skb,
unsigned int *l4csum_offset)
{
+ if (pkt->xt.fragoff)
+ return -1;
+
switch (pkt->tprot) {
case IPPROTO_TCP:
*l4csum_offset = offsetof(struct tcphdr, check);
--
2.34.1
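The tweak is a three-line early return: a non-first IP fragment carries no transport header, so there is no layer 4 checksum field to adjust. The sketch below is a self-contained C model of that guard; struct pkt_info and the header layout are invented stand-ins, not the nftables structures.
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <stddef.h>
#include <stdio.h>

/* Minimal stand-in for the packet metadata; field names are assumptions. */
struct pkt_info {
        unsigned int fragoff;   /* non-zero for non-first IP fragments */
        int tprot;              /* transport protocol                  */
};

struct tcp_hdr_model { unsigned short sport, dport; unsigned short check; };

/* Return the checksum field offset, or -1 when it cannot be adjusted. */
static int l4csum_offset(const struct pkt_info *pkt, size_t *offset)
{
        if (pkt->fragoff)       /* fragment: no transport header present */
                return -1;

        switch (pkt->tprot) {
        case IPPROTO_TCP:
                *offset = offsetof(struct tcp_hdr_model, check);
                return 0;
        default:
                return -1;
        }
}

int main(void)
{
        struct pkt_info frag = { .fragoff = 0x100, .tprot = IPPROTO_TCP };
        size_t off;

        if (l4csum_offset(&frag, &off) < 0)
                printf("fragment: skipping bogus L4 checksum update\n");
        return 0;
}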
From: D Scott Phillips <scott(a)os.amperecomputing.com>
commit 38e0257e0e6f4fef2aa2966b089b56a8b1cfb75c upstream.
The erratum 1418040 workaround enables CNTVCT_EL1 access trapping in EL0
when executing compat threads. The workaround is applied when switching
between tasks, but the need for the workaround could also change at an
exec(), when a non-compat task execs a compat binary or vice versa. Apply
the workaround in arch_setup_new_exec().
This leaves a small window of time between SET_PERSONALITY and
arch_setup_new_exec where preemption could occur and confuse the old
workaround logic that compares TIF_32BIT between prev and next. Instead, we
can just read cntkctl to make sure it's in the state that the next task
needs. I measured cntkctl read time to be about the same as a mov from a
general-purpose register on N1. Update the workaround logic to examine the
current value of cntkctl instead of the previous task's compat state.
Fixes: d49f7d7376d0 ("arm64: Move handling of erratum 1418040 into C code")
Cc: <stable(a)vger.kernel.org> # 5.9.x
Signed-off-by: D Scott Phillips <scott(a)os.amperecomputing.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
---
arch/arm64/kernel/process.c | 39 +++++++++++++++----------------------
1 file changed, 16 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 4999caff3281..22275d8518eb 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -511,34 +511,26 @@ static void entry_task_switch(struct task_struct *next)
/*
* ARM erratum 1418040 handling, affecting the 32bit view of CNTVCT.
- * Assuming the virtual counter is enabled at the beginning of times:
- *
- * - disable access when switching from a 64bit task to a 32bit task
- * - enable access when switching from a 32bit task to a 64bit task
+ * Ensure access is disabled when switching to a 32bit task, ensure
+ * access is enabled when switching to a 64bit task.
*/
-static void erratum_1418040_thread_switch(struct task_struct *prev,
- struct task_struct *next)
+static void erratum_1418040_thread_switch(struct task_struct *next)
{
- bool prev32, next32;
- u64 val;
-
- if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
- return;
-
- prev32 = is_compat_thread(task_thread_info(prev));
- next32 = is_compat_thread(task_thread_info(next));
-
- if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
+ if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) ||
+ !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
return;
- val = read_sysreg(cntkctl_el1);
-
- if (!next32)
- val |= ARCH_TIMER_USR_VCT_ACCESS_EN;
+ if (is_compat_thread(task_thread_info(next)))
+ sysreg_clear_set(cntkctl_el1, ARCH_TIMER_USR_VCT_ACCESS_EN, 0);
else
- val &= ~ARCH_TIMER_USR_VCT_ACCESS_EN;
+ sysreg_clear_set(cntkctl_el1, 0, ARCH_TIMER_USR_VCT_ACCESS_EN);
+}
- write_sysreg(val, cntkctl_el1);
+static void erratum_1418040_new_exec(void)
+{
+ preempt_disable();
+ erratum_1418040_thread_switch(current);
+ preempt_enable();
}
/*
@@ -556,7 +548,7 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
entry_task_switch(next);
uao_thread_switch(next);
ssbs_thread_switch(next);
- erratum_1418040_thread_switch(prev, next);
+ erratum_1418040_thread_switch(next);
/*
* Complete any pending TLB or cache maintenance on this CPU in case
@@ -622,6 +614,7 @@ void arch_setup_new_exec(void)
current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0;
ptrauth_thread_init_user(current);
+ erratum_1418040_new_exec();
if (task_spec_ssb_noexec(current)) {
arch_prctl_spec_ctrl_set(current, PR_SPEC_STORE_BYPASS,
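The reworked logic no longer compares the previous and next task; it simply forces the trap bit into the state the incoming task needs, which is also what makes it safe to call from arch_setup_new_exec(). Below is a toy userspace C model of that idea, with a plain variable standing in for CNTKCTL_EL1 and an assumed bit position.
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define VCT_ACCESS_EN (1u << 1)     /* illustrative bit position */

/* Stand-in for CNTKCTL_EL1; in the kernel this is a system register. */
static uint32_t cntkctl_model = VCT_ACCESS_EN;

/*
 * Make counter access match what the incoming task needs, regardless of
 * what the previous task was.  This mirrors the reworked switch logic.
 */
static void thread_switch(bool next_is_compat)
{
        if (next_is_compat)
                cntkctl_model &= ~VCT_ACCESS_EN;   /* trap CNTVCT in EL0 */
        else
                cntkctl_model |= VCT_ACCESS_EN;    /* allow direct reads */
}

int main(void)
{
        thread_switch(true);    /* exec()ing a 32-bit binary, for example */
        printf("compat next: access %s\n",
               (cntkctl_model & VCT_ACCESS_EN) ? "enabled" : "trapped");

        thread_switch(false);   /* back to a 64-bit task */
        printf("native next: access %s\n",
               (cntkctl_model & VCT_ACCESS_EN) ? "enabled" : "trapped");
        return 0;
}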
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e45c47d1f94e0cc7b6b079fdb4bcce2995e2adc4 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Fri, 28 Jan 2022 10:58:39 -0500
Subject: [PATCH] block: add bio_start_io_acct_time() to control start_time
The bio_start_io_acct_time() interface is like bio_start_io_acct(), but
allows a start_time to be passed in. This gives drivers the ability to
defer starting accounting until after IO is issued (but possibly not
entirely, due to bio splitting).
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Link: https://lore.kernel.org/r/20220128155841.39644-2-snitzer@redhat.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/block/blk-core.c b/block/blk-core.c
index 97f8bc8d3a79..d93e3bb9a769 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1061,20 +1061,32 @@ again:
}
static unsigned long __part_start_io_acct(struct block_device *part,
- unsigned int sectors, unsigned int op)
+ unsigned int sectors, unsigned int op,
+ unsigned long start_time)
{
const int sgrp = op_stat_group(op);
- unsigned long now = READ_ONCE(jiffies);
part_stat_lock();
- update_io_ticks(part, now, false);
+ update_io_ticks(part, start_time, false);
part_stat_inc(part, ios[sgrp]);
part_stat_add(part, sectors[sgrp], sectors);
part_stat_local_inc(part, in_flight[op_is_write(op)]);
part_stat_unlock();
- return now;
+ return start_time;
+}
+
+/**
+ * bio_start_io_acct_time - start I/O accounting for bio based drivers
+ * @bio: bio to start account for
+ * @start_time: start time that should be passed back to bio_end_io_acct().
+ */
+void bio_start_io_acct_time(struct bio *bio, unsigned long start_time)
+{
+ __part_start_io_acct(bio->bi_bdev, bio_sectors(bio),
+ bio_op(bio), start_time);
}
+EXPORT_SYMBOL_GPL(bio_start_io_acct_time);
/**
* bio_start_io_acct - start I/O accounting for bio based drivers
@@ -1084,14 +1096,15 @@ static unsigned long __part_start_io_acct(struct block_device *part,
*/
unsigned long bio_start_io_acct(struct bio *bio)
{
- return __part_start_io_acct(bio->bi_bdev, bio_sectors(bio), bio_op(bio));
+ return __part_start_io_acct(bio->bi_bdev, bio_sectors(bio),
+ bio_op(bio), jiffies);
}
EXPORT_SYMBOL_GPL(bio_start_io_acct);
unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
unsigned int op)
{
- return __part_start_io_acct(disk->part0, sectors, op);
+ return __part_start_io_acct(disk->part0, sectors, op, jiffies);
}
EXPORT_SYMBOL(disk_start_io_acct);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 9c95df26fc26..f35aea98bc35 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1258,6 +1258,7 @@ unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,
void disk_end_io_acct(struct gendisk *disk, unsigned int op,
unsigned long start_time);
+void bio_start_io_acct_time(struct bio *bio, unsigned long start_time);
unsigned long bio_start_io_acct(struct bio *bio);
void bio_end_io_acct_remapped(struct bio *bio, unsigned long start_time,
struct block_device *orig_bdev);
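The new export only changes who samples the clock: the caller records a timestamp up front and hands it to the accounting helper later, instead of the helper reading jiffies itself. The standalone C sketch below models that calling convention; fake_jiffies() and the function names are illustrative, not the block-layer API.
#include <stdio.h>
#include <time.h>

static unsigned long fake_jiffies(void)
{
        /* Coarse stand-in for the kernel's jiffies counter. */
        return (unsigned long)clock();
}

/* Model of the accounting core taking an explicit start_time. */
static unsigned long start_io_acct_time(unsigned long start_time)
{
        printf("accounting started at tick %lu\n", start_time);
        return start_time;
}

/* Old-style entry point: samples the clock itself. */
static unsigned long start_io_acct(void)
{
        return start_io_acct_time(fake_jiffies());
}

int main(void)
{
        /* Caller records the time early ... */
        unsigned long recorded = fake_jiffies();

        /* ... submits the IO, possibly splitting it ... */

        /* ... and only then starts accounting, using the recorded time. */
        start_io_acct_time(recorded);

        /* Drivers that do not care keep the old behaviour. */
        start_io_acct();
        return 0;
}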
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b879f915bc48a18d4f4462729192435bb0f17052 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Fri, 28 Jan 2022 10:58:41 -0500
Subject: [PATCH] dm: properly fix redundant bio-based IO accounting
Record the start_time for a bio, but defer starting block core's IO
accounting until after the IO is submitted, using bio_start_io_acct_time().
This approach avoids the need to mess around with any of the
individual IO stats in response to a bio_split() that follows bio
submission.
Reported-by: Bud Brown <bubrown(a)redhat.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Cc: stable(a)vger.kernel.org
Depends-on: e45c47d1f94e ("block: add bio_start_io_acct_time() to control start_time")
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Link: https://lore.kernel.org/r/20220128155841.39644-4-snitzer@redhat.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 9849114b3c08..dcbd6d201619 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -489,7 +489,7 @@ static void start_io_acct(struct dm_io *io)
struct mapped_device *md = io->md;
struct bio *bio = io->orig_bio;
- io->start_time = bio_start_io_acct(bio);
+ bio_start_io_acct_time(bio, io->start_time);
if (unlikely(dm_stats_used(&md->stats)))
dm_stats_account_io(&md->stats, bio_data_dir(bio),
bio->bi_iter.bi_sector, bio_sectors(bio),
@@ -535,7 +535,7 @@ static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio)
io->md = md;
spin_lock_init(&io->endio_lock);
- start_io_acct(io);
+ io->start_time = jiffies;
return io;
}
@@ -1482,6 +1482,7 @@ static void __split_and_process_bio(struct mapped_device *md,
submit_bio_noacct(bio);
}
}
+ start_io_acct(ci.io);
/* drop the extra reference count */
dm_io_dec_pending(ci.io, errno_to_blk_status(error));
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f524d9c95fab54783d0038f7a3e8c014d5b56857 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Fri, 28 Jan 2022 10:58:40 -0500
Subject: [PATCH] dm: revert partial fix for redundant bio-based IO accounting
Reverts a1e1cb72d9649 ("dm: fix redundant IO accounting for bios that
need splitting") because it was too narrow in scope (only addressed
redundant 'sectors[]' accounting and not ios, nsecs[], etc).
Cc: stable(a)vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Link: https://lore.kernel.org/r/20220128155841.39644-3-snitzer@redhat.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index c0ae8087c602..9849114b3c08 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1442,9 +1442,6 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
ci->sector = bio->bi_iter.bi_sector;
}
-#define __dm_part_stat_sub(part, field, subnd) \
- (part_stat_get(part, field) -= (subnd))
-
/*
* Entry point to split a bio into clones and submit them to the targets.
*/
@@ -1480,18 +1477,6 @@ static void __split_and_process_bio(struct mapped_device *md,
GFP_NOIO, &md->queue->bio_split);
ci.io->orig_bio = b;
- /*
- * Adjust IO stats for each split, otherwise upon queue
- * reentry there will be redundant IO accounting.
- * NOTE: this is a stop-gap fix, a proper fix involves
- * significant refactoring of DM core's bio splitting
- * (by eliminating DM's splitting and just using bio_split)
- */
- part_stat_lock();
- __dm_part_stat_sub(dm_disk(md)->part0,
- sectors[op_stat_group(bio_op(bio))], ci.sector_count);
- part_stat_unlock();
-
bio_chain(b, bio);
trace_block_split(b, bio->bi_iter.bi_sector);
submit_bio_noacct(bio);
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1e0924bd09916fab795fc2a21ec1d148f24299fd Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Mon, 24 Jan 2022 17:17:54 +0900
Subject: [PATCH] arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOL
Mark the start_backtrace() as notrace and NOKPROBE_SYMBOL
because this function is called from ftrace and lockdep to
get the caller address via return_address(). Since lockdep
is used in kprobes, this function should also be NOKPROBE_SYMBOL.
Fixes: b07f3499661c ("arm64: stacktrace: Move start_backtrace() out of the header")
Cc: <stable(a)vger.kernel.org> # 5.13.x
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Reviewed-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/164301227374.1433152.12808232644267107415.stgit@d…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 0fb58fed54cb..e4103e085681 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -33,8 +33,8 @@
*/
-static void start_backtrace(struct stackframe *frame, unsigned long fp,
- unsigned long pc)
+static notrace void start_backtrace(struct stackframe *frame, unsigned long fp,
+ unsigned long pc)
{
frame->fp = fp;
frame->pc = pc;
@@ -55,6 +55,7 @@ static void start_backtrace(struct stackframe *frame, unsigned long fp,
frame->prev_fp = 0;
frame->prev_type = STACK_TYPE_UNKNOWN;
}
+NOKPROBE_SYMBOL(start_backtrace);
/*
* Unwind from one frame record (A) to the next frame record (B).
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1e0924bd09916fab795fc2a21ec1d148f24299fd Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Mon, 24 Jan 2022 17:17:54 +0900
Subject: [PATCH] arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOL
Mark the start_backtrace() as notrace and NOKPROBE_SYMBOL
because this function is called from ftrace and lockdep to
get the caller address via return_address(). Since lockdep
is used in kprobes, this function should also be NOKPROBE_SYMBOL.
Fixes: b07f3499661c ("arm64: stacktrace: Move start_backtrace() out of the header")
Cc: <stable(a)vger.kernel.org> # 5.13.x
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Reviewed-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/164301227374.1433152.12808232644267107415.stgit@d…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 0fb58fed54cb..e4103e085681 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -33,8 +33,8 @@
*/
-static void start_backtrace(struct stackframe *frame, unsigned long fp,
- unsigned long pc)
+static notrace void start_backtrace(struct stackframe *frame, unsigned long fp,
+ unsigned long pc)
{
frame->fp = fp;
frame->pc = pc;
@@ -55,6 +55,7 @@ static void start_backtrace(struct stackframe *frame, unsigned long fp,
frame->prev_fp = 0;
frame->prev_type = STACK_TYPE_UNKNOWN;
}
+NOKPROBE_SYMBOL(start_backtrace);
/*
* Unwind from one frame record (A) to the next frame record (B).
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 05a9e065059e566f218f8778c4d17ee75db56c55 Mon Sep 17 00:00:00 2001
From: Like Xu <likexu(a)tencent.com>
Date: Wed, 26 Jan 2022 17:22:26 +0000
Subject: [PATCH] KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any
time
XCR0 is reset to 1 by RESET but not INIT, and IA32_XSS is zeroed by
both RESET and INIT. The kvm_set_msr_common()'s handling of MSR_IA32_XSS
also needs to call kvm_update_cpuid_runtime(). In the above cases, the
size in bytes of the XSAVE area containing all states enabled by XCR0 or
(XCR0 | IA32_XSS) needs to be updated.
For simplicity and consistency, existing helpers are used to write values
and call kvm_update_cpuid_runtime(), and it's not exactly a fast path.
Fixes: a554d207dc46 ("KVM: X86: Processor States following Reset or INIT")
Cc: stable(a)vger.kernel.org
Signed-off-by: Like Xu <likexu(a)tencent.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-4-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9c984eeed30c..5481d2259f02 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11266,8 +11266,8 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vcpu->arch.msr_misc_features_enables = 0;
- vcpu->arch.xcr0 = XFEATURE_MASK_FP;
- vcpu->arch.ia32_xss = 0;
+ __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP);
+ __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true);
}
/* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
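The point of switching from direct field assignments to __kvm_set_xcr()/__kvm_set_msr() is that those helpers also refresh the CPUID-derived XSAVE-area size. The toy C model below shows the general "write through a setter that updates dependent state" shape; every name and the size formula are invented for illustration.
#include <stdint.h>
#include <stdio.h>

/* Toy vCPU model; field names are invented for illustration. */
struct vcpu_model {
        uint64_t xcr0;
        uint64_t xss;
        uint64_t xsave_size;    /* derived from xcr0 | xss */
};

static void update_cpuid_runtime(struct vcpu_model *v)
{
        /* Pretend each enabled state component costs 64 bytes. */
        v->xsave_size = 512 + 64 * (uint64_t)__builtin_popcountll(v->xcr0 | v->xss);
}

/* Setters keep the derived size consistent, mirroring the fix's intent. */
static void set_xcr0(struct vcpu_model *v, uint64_t val)
{
        v->xcr0 = val;
        update_cpuid_runtime(v);
}

static void set_xss(struct vcpu_model *v, uint64_t val)
{
        v->xss = val;
        update_cpuid_runtime(v);
}

int main(void)
{
        struct vcpu_model v = { 0 };

        set_xcr0(&v, 0x1);      /* x87 state only, as on RESET */
        set_xss(&v, 0x0);
        printf("reported XSAVE size: %llu bytes\n",
               (unsigned long long)v.xsave_size);
        return 0;
}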
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 05a9e065059e566f218f8778c4d17ee75db56c55 Mon Sep 17 00:00:00 2001
From: Like Xu <likexu(a)tencent.com>
Date: Wed, 26 Jan 2022 17:22:26 +0000
Subject: [PATCH] KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any
time
XCR0 is reset to 1 by RESET but not INIT, and IA32_XSS is zeroed by
both RESET and INIT. The kvm_set_msr_common()'s handling of MSR_IA32_XSS
also needs to call kvm_update_cpuid_runtime(). In the above cases, the
size in bytes of the XSAVE area containing all states enabled by XCR0 or
(XCR0 | IA32_XSS) needs to be updated.
For simplicity and consistency, existing helpers are used to write values
and call kvm_update_cpuid_runtime(), and it's not exactly a fast path.
Fixes: a554d207dc46 ("KVM: X86: Processor States following Reset or INIT")
Cc: stable(a)vger.kernel.org
Signed-off-by: Like Xu <likexu(a)tencent.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-4-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9c984eeed30c..5481d2259f02 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11266,8 +11266,8 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vcpu->arch.msr_misc_features_enables = 0;
- vcpu->arch.xcr0 = XFEATURE_MASK_FP;
- vcpu->arch.ia32_xss = 0;
+ __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP);
+ __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true);
}
/* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 05a9e065059e566f218f8778c4d17ee75db56c55 Mon Sep 17 00:00:00 2001
From: Like Xu <likexu(a)tencent.com>
Date: Wed, 26 Jan 2022 17:22:26 +0000
Subject: [PATCH] KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any
time
XCR0 is reset to 1 by RESET but not INIT, and IA32_XSS is zeroed by
both RESET and INIT. The kvm_set_msr_common()'s handling of MSR_IA32_XSS
also needs to call kvm_update_cpuid_runtime(). In the above cases, the
size in bytes of the XSAVE area containing all states enabled by XCR0 or
(XCR0 | IA32_XSS) needs to be updated.
For simplicity and consistency, existing helpers are used to write values
and call kvm_update_cpuid_runtime(), and it's not exactly a fast path.
Fixes: a554d207dc46 ("KVM: X86: Processor States following Reset or INIT")
Cc: stable(a)vger.kernel.org
Signed-off-by: Like Xu <likexu(a)tencent.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-4-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9c984eeed30c..5481d2259f02 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11266,8 +11266,8 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vcpu->arch.msr_misc_features_enables = 0;
- vcpu->arch.xcr0 = XFEATURE_MASK_FP;
- vcpu->arch.ia32_xss = 0;
+ __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP);
+ __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true);
}
/* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4c282e51e4450b94680d6ca3b10f830483b1f243 Mon Sep 17 00:00:00 2001
From: Like Xu <likexu(a)tencent.com>
Date: Wed, 26 Jan 2022 17:22:25 +0000
Subject: [PATCH] KVM: x86: Update vCPU's runtime CPUID on write to
MSR_IA32_XSS
Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the
size in bytes of the XSAVE area is affected by the states enabled in XSS.
Fixes: 203000993de5 ("kvm: vmx: add MSR logic for XSAVES")
Cc: stable(a)vger.kernel.org
Signed-off-by: Like Xu <likexu(a)tencent.com>
[sean: split out as a separate patch, adjust Fixes tag]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b6d60c296eda..9c984eeed30c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3535,6 +3535,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (data & ~supported_xss)
return 1;
vcpu->arch.ia32_xss = data;
+ kvm_update_cpuid_runtime(vcpu);
break;
case MSR_SMI_COUNT:
if (!msr_info->host_initiated)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4c282e51e4450b94680d6ca3b10f830483b1f243 Mon Sep 17 00:00:00 2001
From: Like Xu <likexu(a)tencent.com>
Date: Wed, 26 Jan 2022 17:22:25 +0000
Subject: [PATCH] KVM: x86: Update vCPU's runtime CPUID on write to
MSR_IA32_XSS
Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the
size in bytes of the XSAVE area is affected by the states enabled in XSS.
Fixes: 203000993de5 ("kvm: vmx: add MSR logic for XSAVES")
Cc: stable(a)vger.kernel.org
Signed-off-by: Like Xu <likexu(a)tencent.com>
[sean: split out as a separate patch, adjust Fixes tag]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b6d60c296eda..9c984eeed30c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3535,6 +3535,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (data & ~supported_xss)
return 1;
vcpu->arch.ia32_xss = data;
+ kvm_update_cpuid_runtime(vcpu);
break;
case MSR_SMI_COUNT:
if (!msr_info->host_initiated)
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4c282e51e4450b94680d6ca3b10f830483b1f243 Mon Sep 17 00:00:00 2001
From: Like Xu <likexu(a)tencent.com>
Date: Wed, 26 Jan 2022 17:22:25 +0000
Subject: [PATCH] KVM: x86: Update vCPU's runtime CPUID on write to
MSR_IA32_XSS
Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the
size in bytes of the XSAVE area is affected by the states enabled in XSS.
Fixes: 203000993de5 ("kvm: vmx: add MSR logic for XSAVES")
Cc: stable(a)vger.kernel.org
Signed-off-by: Like Xu <likexu(a)tencent.com>
[sean: split out as a separate patch, adjust Fixes tag]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b6d60c296eda..9c984eeed30c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3535,6 +3535,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (data & ~supported_xss)
return 1;
vcpu->arch.ia32_xss = data;
+ kvm_update_cpuid_runtime(vcpu);
break;
case MSR_SMI_COUNT:
if (!msr_info->host_initiated)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4c282e51e4450b94680d6ca3b10f830483b1f243 Mon Sep 17 00:00:00 2001
From: Like Xu <likexu(a)tencent.com>
Date: Wed, 26 Jan 2022 17:22:25 +0000
Subject: [PATCH] KVM: x86: Update vCPU's runtime CPUID on write to
MSR_IA32_XSS
Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the
size in bytes of the XSAVE area is affected by the states enabled in XSS.
Fixes: 203000993de5 ("kvm: vmx: add MSR logic for XSAVES")
Cc: stable(a)vger.kernel.org
Signed-off-by: Like Xu <likexu(a)tencent.com>
[sean: split out as a separate patch, adjust Fixes tag]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b6d60c296eda..9c984eeed30c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3535,6 +3535,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (data & ~supported_xss)
return 1;
vcpu->arch.ia32_xss = data;
+ kvm_update_cpuid_runtime(vcpu);
break;
case MSR_SMI_COUNT:
if (!msr_info->host_initiated)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4c282e51e4450b94680d6ca3b10f830483b1f243 Mon Sep 17 00:00:00 2001
From: Like Xu <likexu(a)tencent.com>
Date: Wed, 26 Jan 2022 17:22:25 +0000
Subject: [PATCH] KVM: x86: Update vCPU's runtime CPUID on write to
MSR_IA32_XSS
Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the
size in bytes of the XSAVE area is affected by the states enabled in XSS.
Fixes: 203000993de5 ("kvm: vmx: add MSR logic for XSAVES")
Cc: stable(a)vger.kernel.org
Signed-off-by: Like Xu <likexu(a)tencent.com>
[sean: split out as a separate patch, adjust Fixes tag]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b6d60c296eda..9c984eeed30c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3535,6 +3535,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (data & ~supported_xss)
return 1;
vcpu->arch.ia32_xss = data;
+ kvm_update_cpuid_runtime(vcpu);
break;
case MSR_SMI_COUNT:
if (!msr_info->host_initiated)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be4f3b3f82271c3193ce200a996dc70682c8e622 Mon Sep 17 00:00:00 2001
From: Xiaoyao Li <xiaoyao.li(a)intel.com>
Date: Wed, 26 Jan 2022 17:22:24 +0000
Subject: [PATCH] KVM: x86: Keep MSR_IA32_XSS unchanged for INIT
It has been corrected in SDM version 075 that MSR_IA32_XSS is reset to
zero on Power up and Reset but is left unchanged on INIT.
Fixes: a554d207dc46 ("KVM: X86: Processor States following Reset or INIT")
Cc: stable(a)vger.kernel.org
Signed-off-by: Xiaoyao Li <xiaoyao.li(a)intel.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-2-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a50baf4c5bff..b6d60c296eda 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11266,6 +11266,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vcpu->arch.msr_misc_features_enables = 0;
vcpu->arch.xcr0 = XFEATURE_MASK_FP;
+ vcpu->arch.ia32_xss = 0;
}
/* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
@@ -11282,8 +11283,6 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
cpuid_0x1 = kvm_find_cpuid_entry(vcpu, 1, 0);
kvm_rdx_write(vcpu, cpuid_0x1 ? cpuid_0x1->eax : 0x600);
- vcpu->arch.ia32_xss = 0;
-
static_call(kvm_x86_vcpu_reset)(vcpu, init_event);
kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
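The mechanical change is just moving the ia32_xss zeroing inside the !init_event branch, so a full RESET clears it while INIT leaves it alone. A toy standalone C model of that split follows, with invented state; only the XFEATURE_MASK_FP value is taken from the quoted diff.
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct vcpu_state {
        uint64_t xcr0;
        uint64_t ia32_xss;
};

#define XFEATURE_MASK_FP 0x1ULL

/*
 * Toy model of the reset path: the !init_event branch runs only for a
 * full RESET, so state that INIT must preserve belongs inside it.
 */
static void vcpu_reset(struct vcpu_state *v, bool init_event)
{
        if (!init_event) {
                v->xcr0 = XFEATURE_MASK_FP;
                v->ia32_xss = 0;        /* zeroed on RESET, untouched on INIT */
        }
}

int main(void)
{
        struct vcpu_state v = { .xcr0 = 0x7, .ia32_xss = 0x100 };

        vcpu_reset(&v, true);           /* INIT */
        printf("after INIT:  xss=%#llx\n", (unsigned long long)v.ia32_xss);

        vcpu_reset(&v, false);          /* RESET */
        printf("after RESET: xss=%#llx\n", (unsigned long long)v.ia32_xss);
        return 0;
}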
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be4f3b3f82271c3193ce200a996dc70682c8e622 Mon Sep 17 00:00:00 2001
From: Xiaoyao Li <xiaoyao.li(a)intel.com>
Date: Wed, 26 Jan 2022 17:22:24 +0000
Subject: [PATCH] KVM: x86: Keep MSR_IA32_XSS unchanged for INIT
It has been corrected in SDM version 075 that MSR_IA32_XSS is reset to
zero on Power up and Reset but is left unchanged on INIT.
Fixes: a554d207dc46 ("KVM: X86: Processor States following Reset or INIT")
Cc: stable(a)vger.kernel.org
Signed-off-by: Xiaoyao Li <xiaoyao.li(a)intel.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-2-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a50baf4c5bff..b6d60c296eda 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11266,6 +11266,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vcpu->arch.msr_misc_features_enables = 0;
vcpu->arch.xcr0 = XFEATURE_MASK_FP;
+ vcpu->arch.ia32_xss = 0;
}
/* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
@@ -11282,8 +11283,6 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
cpuid_0x1 = kvm_find_cpuid_entry(vcpu, 1, 0);
kvm_rdx_write(vcpu, cpuid_0x1 ? cpuid_0x1->eax : 0x600);
- vcpu->arch.ia32_xss = 0;
-
static_call(kvm_x86_vcpu_reset)(vcpu, init_event);
kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be4f3b3f82271c3193ce200a996dc70682c8e622 Mon Sep 17 00:00:00 2001
From: Xiaoyao Li <xiaoyao.li(a)intel.com>
Date: Wed, 26 Jan 2022 17:22:24 +0000
Subject: [PATCH] KVM: x86: Keep MSR_IA32_XSS unchanged for INIT
It has been corrected in SDM version 075 that MSR_IA32_XSS is reset to
zero on Power up and Reset but is left unchanged on INIT.
Fixes: a554d207dc46 ("KVM: X86: Processor States following Reset or INIT")
Cc: stable(a)vger.kernel.org
Signed-off-by: Xiaoyao Li <xiaoyao.li(a)intel.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20220126172226.2298529-2-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a50baf4c5bff..b6d60c296eda 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11266,6 +11266,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vcpu->arch.msr_misc_features_enables = 0;
vcpu->arch.xcr0 = XFEATURE_MASK_FP;
+ vcpu->arch.ia32_xss = 0;
}
/* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
@@ -11282,8 +11283,6 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
cpuid_0x1 = kvm_find_cpuid_entry(vcpu, 1, 0);
kvm_rdx_write(vcpu, cpuid_0x1 ? cpuid_0x1->eax : 0x600);
- vcpu->arch.ia32_xss = 0;
-
static_call(kvm_x86_vcpu_reset)(vcpu, init_event);
kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0b0be065b7563ac708aaa9f69dd4941c80b3446d Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Thu, 20 Jan 2022 01:07:13 +0000
Subject: [PATCH] KVM: SVM: Don't intercept #GP for SEV guests
Never intercept #GP for SEV guests as reading SEV guest private memory
will return cyphertext, i.e. emulating on #GP can't work as intended.
Cc: stable(a)vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Brijesh Singh <brijesh.singh(a)amd.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Reviewed-by: Liam Merwick <liam.merwick(a)oracle.com>
Message-Id: <20220120010719.711476-4-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 37eb3168e0ea..defc91a8c04c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -312,7 +312,11 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
return ret;
}
- if (svm_gp_erratum_intercept)
+ /*
+ * Never intercept #GP for SEV guests, KVM can't
+ * decrypt guest memory to workaround the erratum.
+ */
+ if (svm_gp_erratum_intercept && !sev_guest(vcpu->kvm))
set_exception_intercept(svm, GP_VECTOR);
}
}
@@ -1010,9 +1014,10 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
* Guest access to VMware backdoor ports could legitimately
* trigger #GP because of TSS I/O permission bitmap.
* We intercept those #GP and allow access to them anyway
- * as VMware does.
+ * as VMware does. Don't intercept #GP for SEV guests as KVM can't
+ * decrypt guest memory to decode the faulting instruction.
*/
- if (enable_vmware_backdoor)
+ if (enable_vmware_backdoor && !sev_guest(vcpu->kvm))
set_exception_intercept(svm, GP_VECTOR);
svm_set_intercept(svm, INTERCEPT_INTR);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0b0be065b7563ac708aaa9f69dd4941c80b3446d Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Thu, 20 Jan 2022 01:07:13 +0000
Subject: [PATCH] KVM: SVM: Don't intercept #GP for SEV guests
Never intercept #GP for SEV guests as reading SEV guest private memory
will return cyphertext, i.e. emulating on #GP can't work as intended.
Cc: stable(a)vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Brijesh Singh <brijesh.singh(a)amd.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Reviewed-by: Liam Merwick <liam.merwick(a)oracle.com>
Message-Id: <20220120010719.711476-4-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 37eb3168e0ea..defc91a8c04c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -312,7 +312,11 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
return ret;
}
- if (svm_gp_erratum_intercept)
+ /*
+ * Never intercept #GP for SEV guests, KVM can't
+ * decrypt guest memory to workaround the erratum.
+ */
+ if (svm_gp_erratum_intercept && !sev_guest(vcpu->kvm))
set_exception_intercept(svm, GP_VECTOR);
}
}
@@ -1010,9 +1014,10 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
* Guest access to VMware backdoor ports could legitimately
* trigger #GP because of TSS I/O permission bitmap.
* We intercept those #GP and allow access to them anyway
- * as VMware does.
+ * as VMware does. Don't intercept #GP for SEV guests as KVM can't
+ * decrypt guest memory to decode the faulting instruction.
*/
- if (enable_vmware_backdoor)
+ if (enable_vmware_backdoor && !sev_guest(vcpu->kvm))
set_exception_intercept(svm, GP_VECTOR);
svm_set_intercept(svm, INTERCEPT_INTR);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 05d5a48635259e621ea26d01e8316c6feeb34190 Mon Sep 17 00:00:00 2001
From: "Singh, Brijesh" <brijesh.singh(a)amd.com>
Date: Fri, 15 Feb 2019 17:24:12 +0000
Subject: [PATCH] KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP
violation)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Errata#1096:
On a nested data page fault when CR4.SMAP=1 and the guest data read
generates a SMAP violation, the GuestInstrBytes field of the VMCB on a
VMEXIT will incorrectly return 0h instead of the correct guest
instruction bytes.
Recommended workaround:
To determine what instruction the guest was executing, the hypervisor
will have to decode the instruction at the instruction pointer.
The recommended workaround cannot be implemented for an SEV
guest because guest memory is encrypted with the guest-specific key,
and the instruction decoder will not be able to decode the instruction
bytes. If we hit this erratum in an SEV guest, log the message
and request a guest shutdown.
Reported-by: Venkatesh Srinivas <venkateshs(a)google.com>
Cc: Jim Mattson <jmattson(a)google.com>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Joerg Roedel <joro(a)8bytes.org>
Cc: "Radim Krčmář" <rkrcmar(a)redhat.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh(a)amd.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88f5192ce05e..5b03006c00be 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1192,6 +1192,8 @@ struct kvm_x86_ops {
int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
uint16_t *vmcs_version);
uint16_t (*nested_get_evmcs_version)(struct kvm_vcpu *vcpu);
+
+ bool (*need_emulation_on_page_fault)(struct kvm_vcpu *vcpu);
};
struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 776a58b00682..f6d760dcdb75 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5408,10 +5408,12 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u64 error_code,
* This can happen if a guest gets a page-fault on data access but the HW
* table walker is not able to read the instruction page (e.g instruction
* page is not present in memory). In those cases we simply restart the
- * guest.
+ * guest, with the exception of AMD Erratum 1096 which is unrecoverable.
*/
- if (unlikely(insn && !insn_len))
- return 1;
+ if (unlikely(insn && !insn_len)) {
+ if (!kvm_x86_ops->need_emulation_on_page_fault(vcpu))
+ return 1;
+ }
er = x86_emulate_instruction(vcpu, cr2, emulation_type, insn, insn_len);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b5b128a0a051..426039285fd1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7098,6 +7098,36 @@ static int nested_enable_evmcs(struct kvm_vcpu *vcpu,
return -ENODEV;
}
+static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
+{
+ bool is_user, smap;
+
+ is_user = svm_get_cpl(vcpu) == 3;
+ smap = !kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
+
+ /*
+ * Detect and workaround Errata 1096 Fam_17h_00_0Fh
+ *
+ * In non SEV guest, hypervisor will be able to read the guest
+ * memory to decode the instruction pointer when insn_len is zero
+ * so we return true to indicate that decoding is possible.
+ *
+ * But in the SEV guest, the guest memory is encrypted with the
+ * guest specific key and hypervisor will not be able to decode the
+ * instruction pointer so we will not able to workaround it. Lets
+ * print the error and request to kill the guest.
+ */
+ if (is_user && smap) {
+ if (!sev_guest(vcpu->kvm))
+ return true;
+
+ pr_err_ratelimited("KVM: Guest triggered AMD Erratum 1096\n");
+ kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+ }
+
+ return false;
+}
+
static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.cpu_has_kvm_support = has_svm,
.disabled_by_bios = is_disabled,
@@ -7231,6 +7261,8 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.nested_enable_evmcs = nested_enable_evmcs,
.nested_get_evmcs_version = nested_get_evmcs_version,
+
+ .need_emulation_on_page_fault = svm_need_emulation_on_page_fault,
};
static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c73375e01ab8..6aa84e09217b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7409,6 +7409,11 @@ static int enable_smi_window(struct kvm_vcpu *vcpu)
return 0;
}
+static bool vmx_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
+
static __init int hardware_setup(void)
{
unsigned long host_bndcfgs;
@@ -7711,6 +7716,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
.set_nested_state = NULL,
.get_vmcs12_pages = NULL,
.nested_enable_evmcs = NULL,
+ .need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
};
static void vmx_cleanup_l1d_flush(void)
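Boiled down, the workaround is a three-way decision taken only when the VMCB reports zero instruction bytes: retry the guest unless the fault looks like the erratum (a user access with SMAP active), emulate by decoding guest memory when that is possible, and shut the guest down when it is not (SEV). The standalone C sketch below models that decision table as described in the commit message; it is not the KVM code and all names are invented.
#include <stdbool.h>
#include <stdio.h>

enum action { RETRY_GUEST, EMULATE, SHUTDOWN_GUEST };

/*
 * Toy model of the erratum 1096 decision: called only when the VMCB
 * reported zero instruction bytes for a page fault.
 */
static enum action handle_zero_insn_len(bool is_user, bool smap_enabled,
                                        bool sev_guest)
{
        if (!(is_user && smap_enabled))
                return RETRY_GUEST;     /* not the erratum, just retry */

        if (!sev_guest)
                return EMULATE;         /* decode the insn from guest memory */

        /* SEV memory is encrypted, decoding is impossible: give up. */
        return SHUTDOWN_GUEST;
}

int main(void)
{
        printf("plain guest, CPL3 + SMAP -> %d (emulate)\n",
               handle_zero_insn_len(true, true, false));
        printf("SEV guest,  CPL3 + SMAP -> %d (shutdown)\n",
               handle_zero_insn_len(true, true, true));
        printf("kernel access            -> %d (retry)\n",
               handle_zero_insn_len(false, true, false));
        return 0;
}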
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 05d5a48635259e621ea26d01e8316c6feeb34190 Mon Sep 17 00:00:00 2001
From: "Singh, Brijesh" <brijesh.singh(a)amd.com>
Date: Fri, 15 Feb 2019 17:24:12 +0000
Subject: [PATCH] KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP
violation)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Errata#1096:
On a nested data page fault when CR4.SMAP=1 and the guest data read
generates a SMAP violation, the GuestInstrBytes field of the VMCB on a
VMEXIT will incorrectly return 0h instead of the correct guest
instruction bytes.
Recommended workaround:
To determine what instruction the guest was executing, the hypervisor
will have to decode the instruction at the instruction pointer.
The recommended workaround cannot be implemented for an SEV
guest because guest memory is encrypted with the guest-specific key,
and the instruction decoder will not be able to decode the instruction
bytes. If we hit this erratum in an SEV guest, log the message
and request a guest shutdown.
Reported-by: Venkatesh Srinivas <venkateshs(a)google.com>
Cc: Jim Mattson <jmattson(a)google.com>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Joerg Roedel <joro(a)8bytes.org>
Cc: "Radim Krčmář" <rkrcmar(a)redhat.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh(a)amd.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88f5192ce05e..5b03006c00be 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1192,6 +1192,8 @@ struct kvm_x86_ops {
int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
uint16_t *vmcs_version);
uint16_t (*nested_get_evmcs_version)(struct kvm_vcpu *vcpu);
+
+ bool (*need_emulation_on_page_fault)(struct kvm_vcpu *vcpu);
};
struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 776a58b00682..f6d760dcdb75 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5408,10 +5408,12 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u64 error_code,
* This can happen if a guest gets a page-fault on data access but the HW
* table walker is not able to read the instruction page (e.g instruction
* page is not present in memory). In those cases we simply restart the
- * guest.
+ * guest, with the exception of AMD Erratum 1096 which is unrecoverable.
*/
- if (unlikely(insn && !insn_len))
- return 1;
+ if (unlikely(insn && !insn_len)) {
+ if (!kvm_x86_ops->need_emulation_on_page_fault(vcpu))
+ return 1;
+ }
er = x86_emulate_instruction(vcpu, cr2, emulation_type, insn, insn_len);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b5b128a0a051..426039285fd1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7098,6 +7098,36 @@ static int nested_enable_evmcs(struct kvm_vcpu *vcpu,
return -ENODEV;
}
+static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
+{
+ bool is_user, smap;
+
+ is_user = svm_get_cpl(vcpu) == 3;
+ smap = !kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
+
+ /*
+ * Detect and workaround Errata 1096 Fam_17h_00_0Fh
+ *
+ * In non SEV guest, hypervisor will be able to read the guest
+ * memory to decode the instruction pointer when insn_len is zero
+ * so we return true to indicate that decoding is possible.
+ *
+ * But in the SEV guest, the guest memory is encrypted with the
+ * guest specific key and hypervisor will not be able to decode the
+ * instruction pointer so we will not able to workaround it. Lets
+ * print the error and request to kill the guest.
+ */
+ if (is_user && smap) {
+ if (!sev_guest(vcpu->kvm))
+ return true;
+
+ pr_err_ratelimited("KVM: Guest triggered AMD Erratum 1096\n");
+ kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+ }
+
+ return false;
+}
+
static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.cpu_has_kvm_support = has_svm,
.disabled_by_bios = is_disabled,
@@ -7231,6 +7261,8 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.nested_enable_evmcs = nested_enable_evmcs,
.nested_get_evmcs_version = nested_get_evmcs_version,
+
+ .need_emulation_on_page_fault = svm_need_emulation_on_page_fault,
};
static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c73375e01ab8..6aa84e09217b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7409,6 +7409,11 @@ static int enable_smi_window(struct kvm_vcpu *vcpu)
return 0;
}
+static bool vmx_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
+
static __init int hardware_setup(void)
{
unsigned long host_bndcfgs;
@@ -7711,6 +7716,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
.set_nested_state = NULL,
.get_vmcs12_pages = NULL,
.nested_enable_evmcs = NULL,
+ .need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
};
static void vmx_cleanup_l1d_flush(void)
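For readers following the workaround logic above, the sketch below is a stand-alone user-space model of the decision that svm_need_emulation_on_page_fault() makes; it is not kernel code, and the struct and helper names are invented for illustration only.

/* Minimal user-space model of the errata 1096 decision described above.
 * All names here are hypothetical; only the decision mirrors the patch:
 * decode (emulate) only for non-SEV guests, request shutdown for SEV guests.
 */
#include <stdbool.h>
#include <stdio.h>

struct fault_ctx {
	bool user;      /* guest was at CPL 3                         */
	bool smap;      /* fault matches the SMAP condition in the patch */
	bool sev_guest; /* guest memory is encrypted                  */
};

/* Returns true when the hypervisor should decode/emulate the instruction. */
static bool need_emulation_on_page_fault(const struct fault_ctx *ctx,
					 bool *shutdown_guest)
{
	*shutdown_guest = false;

	if (ctx->user && ctx->smap) {
		if (!ctx->sev_guest)
			return true;        /* decode from guest memory */
		*shutdown_guest = true;         /* cannot decode: SEV guest */
	}
	return false;                           /* just re-enter the guest  */
}

int main(void)
{
	struct fault_ctx cases[] = {
		{ true,  true,  false },  /* plain guest: emulate        */
		{ true,  true,  true  },  /* SEV guest: request shutdown */
		{ false, true,  false },  /* kernel-mode fault: retry    */
	};

	for (unsigned int i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		bool shutdown;
		bool emulate = need_emulation_on_page_fault(&cases[i], &shutdown);

		printf("case %u: emulate=%d shutdown=%d\n", i, emulate, shutdown);
	}
	return 0;
}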
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 35fe7cfbab2e81f1afb23fc4212210b1de6d9633 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpengli(a)tencent.com>
Date: Tue, 25 Jan 2022 01:17:00 -0800
Subject: [PATCH] KVM: LAPIC: Also cancel preemption timer during SET_LAPIC
The below warning is splatting during guest reboot.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1931 at arch/x86/kvm/x86.c:10322 kvm_arch_vcpu_ioctl_run+0x874/0x880 [kvm]
CPU: 0 PID: 1931 Comm: qemu-system-x86 Tainted: G I 5.17.0-rc1+ #5
RIP: 0010:kvm_arch_vcpu_ioctl_run+0x874/0x880 [kvm]
Call Trace:
<TASK>
kvm_vcpu_ioctl+0x279/0x710 [kvm]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fd39797350b
This can be triggered by not exposing tsc-deadline mode and doing a reboot in
the guest. The lapic_shutdown() function, which is called in the sys_reboot path,
does not disarm the in-flight timer, it just masks LVTT. lapic_shutdown() clears
the APIC state with LVT_MASKED and the timer-mode bit set to 0; this can trigger a
timer-mode switch between tsc-deadline and oneshot/periodic, which can result in the
preemption timer being cancelled in apic_update_lvtt(). However, we can't depend on
this when tsc-deadline mode is not exposed and oneshot/periodic modes are emulated
by the preemption timer. Qemu synchronises states around reset, so cancel the
preemption timer under KVM_SET_LAPIC.
Signed-off-by: Wanpeng Li <wanpengli(a)tencent.com>
Message-Id: <1643102220-35667-1-git-send-email-wanpengli(a)tencent.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index baca9fa37a91..4662469240bc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2629,7 +2629,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
kvm_apic_set_version(vcpu);
apic_update_ppr(apic);
- hrtimer_cancel(&apic->lapic_timer.timer);
+ cancel_apic_timer(apic);
apic->lapic_timer.expired_tscdeadline = 0;
apic_update_lvtt(apic);
apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
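For readers less familiar with the two timer backends involved, here is a tiny stand-alone model of why cancelling only the hrtimer can leave the hardware preemption timer armed across a userspace-driven APIC state load. It is not KVM code; the type and function names are invented.

/* Toy model: an emulated LAPIC timer may be backed either by a software
 * hrtimer or by the hardware preemption timer. Cancelling just one of them
 * can leave the other armed, which is the situation the patch above avoids
 * by cancelling both in the SET_LAPIC path. Purely illustrative.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_lapic_timer {
	bool hrtimer_armed;
	bool hv_timer_armed;   /* hardware (preemption) timer in use */
};

static void toy_hrtimer_cancel(struct toy_lapic_timer *t)
{
	t->hrtimer_armed = false;
}

static void toy_cancel_apic_timer(struct toy_lapic_timer *t)
{
	/* Mirrors the idea of cancel_apic_timer(): stop both backends. */
	t->hv_timer_armed = false;
	toy_hrtimer_cancel(t);
}

int main(void)
{
	struct toy_lapic_timer t = { .hrtimer_armed = false, .hv_timer_armed = true };

	toy_hrtimer_cancel(&t);
	printf("after hrtimer_cancel only: hv timer still armed = %d\n",
	       t.hv_timer_armed);

	toy_cancel_apic_timer(&t);
	printf("after cancel_apic_timer:  hv timer still armed = %d\n",
	       t.hv_timer_armed);
	return 0;
}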
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d37823c3528e5e0705fc7746bcbc2afffb619259 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Mon, 10 Jan 2022 15:29:25 +0000
Subject: [PATCH] powerpc/32s: Fix kasan_init_region() for KASAN
It has been reported that in some configurations the kernel doesn't
boot with KASAN enabled.
This is due to wrong BAT allocation for the KASAN area:
---[ Data Block Address Translation ]---
0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw m
1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw m
2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw m
3: 0xf8000000-0xf9ffffff 0x2a000000 32M Kernel rw m
4: 0xfa000000-0xfdffffff 0x2c000000 64M Kernel rw m
A BAT must have both virtual and physical addresses alignment matching
the size of the BAT. This is not the case for BAT 4 above.
Fix kasan_init_region() by using the block_size() function that is in
book3s32/mmu.c. To be able to reuse it here, make it non-static and
rename it to bat_block_size() in order to avoid a name conflict
with block_size() defined in <linux/blkdev.h>.
Also reuse find_free_bat() to avoid an error message from setbat()
when no BAT is available.
And allocate memory outside of linear memory mapping to avoid
wasting that precious space.
With this change we get correct alignment for BATs and KASAN shadow
memory is allocated outside the linear memory space.
---[ Data Block Address Translation ]---
0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw
1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw
2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw
3: 0xf8000000-0xfbffffff 0x7c000000 64M Kernel rw
4: 0xfc000000-0xfdffffff 0x7a000000 32M Kernel rw
Fixes: 7974c4732642 ("powerpc/32s: Implement dedicated kasan_init_region()")
Cc: stable(a)vger.kernel.org
Reported-by: Maxime Bizon <mbizon(a)freebox.fr>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Tested-by: Maxime Bizon <mbizon(a)freebox.fr>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/7a50ef902494d1325227d47d33dada01e52e5518.16418187…
diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
index 7be27862329f..78c6a5fde1d6 100644
--- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
@@ -223,6 +223,8 @@ static __always_inline void update_user_segments(u32 val)
update_user_segment(15, val);
}
+int __init find_free_bat(void);
+unsigned int bat_block_size(unsigned long base, unsigned long top);
#endif /* !__ASSEMBLY__ */
/* We happily ignore the smaller BATs on 601, we don't actually use
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 94045b265b6b..203735caf691 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -76,7 +76,7 @@ unsigned long p_block_mapped(phys_addr_t pa)
return 0;
}
-static int __init find_free_bat(void)
+int __init find_free_bat(void)
{
int b;
int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
@@ -100,7 +100,7 @@ static int __init find_free_bat(void)
* - block size has to be a power of two. This is calculated by finding the
* highest bit set to 1.
*/
-static unsigned int block_size(unsigned long base, unsigned long top)
+unsigned int bat_block_size(unsigned long base, unsigned long top)
{
unsigned int max_size = SZ_256M;
unsigned int base_shift = (ffs(base) - 1) & 31;
@@ -145,7 +145,7 @@ static unsigned long __init __mmu_mapin_ram(unsigned long base, unsigned long to
int idx;
while ((idx = find_free_bat()) != -1 && base != top) {
- unsigned int size = block_size(base, top);
+ unsigned int size = bat_block_size(base, top);
if (size < 128 << 10)
break;
@@ -201,12 +201,12 @@ void mmu_mark_initmem_nx(void)
unsigned long size;
for (i = 0; i < nb - 1 && base < top;) {
- size = block_size(base, top);
+ size = bat_block_size(base, top);
setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
base += size;
}
if (base < top) {
- size = block_size(base, top);
+ size = bat_block_size(base, top);
if ((top - base) > size) {
size <<= 1;
if (strict_kernel_rwx_enabled() && base + size > border)
diff --git a/arch/powerpc/mm/kasan/book3s_32.c b/arch/powerpc/mm/kasan/book3s_32.c
index 35b287b0a8da..450a67ef0bbe 100644
--- a/arch/powerpc/mm/kasan/book3s_32.c
+++ b/arch/powerpc/mm/kasan/book3s_32.c
@@ -10,48 +10,51 @@ int __init kasan_init_region(void *start, size_t size)
{
unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);
- unsigned long k_cur = k_start;
- int k_size = k_end - k_start;
- int k_size_base = 1 << (ffs(k_size) - 1);
+ unsigned long k_nobat = k_start;
+ unsigned long k_cur;
+ phys_addr_t phys;
int ret;
- void *block;
- block = memblock_alloc(k_size, k_size_base);
-
- if (block && k_size_base >= SZ_128K && k_start == ALIGN(k_start, k_size_base)) {
- int shift = ffs(k_size - k_size_base);
- int k_size_more = shift ? 1 << (shift - 1) : 0;
-
- setbat(-1, k_start, __pa(block), k_size_base, PAGE_KERNEL);
- if (k_size_more >= SZ_128K)
- setbat(-1, k_start + k_size_base, __pa(block) + k_size_base,
- k_size_more, PAGE_KERNEL);
- if (v_block_mapped(k_start))
- k_cur = k_start + k_size_base;
- if (v_block_mapped(k_start + k_size_base))
- k_cur = k_start + k_size_base + k_size_more;
-
- update_bats();
+ while (k_nobat < k_end) {
+ unsigned int k_size = bat_block_size(k_nobat, k_end);
+ int idx = find_free_bat();
+
+ if (idx == -1)
+ break;
+ if (k_size < SZ_128K)
+ break;
+ phys = memblock_phys_alloc_range(k_size, k_size, 0,
+ MEMBLOCK_ALLOC_ANYWHERE);
+ if (!phys)
+ break;
+
+ setbat(idx, k_nobat, phys, k_size, PAGE_KERNEL);
+ k_nobat += k_size;
}
+ if (k_nobat != k_start)
+ update_bats();
- if (!block)
- block = memblock_alloc(k_size, PAGE_SIZE);
- if (!block)
- return -ENOMEM;
+ if (k_nobat < k_end) {
+ phys = memblock_phys_alloc_range(k_end - k_nobat, PAGE_SIZE, 0,
+ MEMBLOCK_ALLOC_ANYWHERE);
+ if (!phys)
+ return -ENOMEM;
+ }
ret = kasan_init_shadow_page_tables(k_start, k_end);
if (ret)
return ret;
- kasan_update_early_region(k_start, k_cur, __pte(0));
+ kasan_update_early_region(k_start, k_nobat, __pte(0));
- for (; k_cur < k_end; k_cur += PAGE_SIZE) {
+ for (k_cur = k_nobat; k_cur < k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_off_k(k_cur);
- void *va = block + k_cur - k_start;
- pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
+ pte_t pte = pfn_pte(PHYS_PFN(phys + k_cur - k_nobat), PAGE_KERNEL);
__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
}
flush_tlb_kernel_range(k_start, k_end);
+ memset(kasan_mem_to_shadow(start), 0, k_end - k_start);
+
return 0;
}
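To make the alignment rule quoted above concrete, here is a small user-space approximation of the BAT sizing calculation applied to a base/top pair similar to the KASAN shadow range in the commit message. It is a simplified sketch, not the kernel implementation, and ignores the 601 and HIGH_BATS details; all helper names are invented.

/* User-space approximation of the BAT sizing rule discussed above:
 * a BAT block is a power of two, no larger than 256M, and limited by the
 * alignment of its base address.
 */
#include <stdio.h>

#define SZ_256M (256u * 1024 * 1024)

/* Lowest set bit position (like ffs()-1); returns 31 for 0 so an all-zero
 * base does not limit the block size. */
static unsigned int low_bit(unsigned int x)
{
	unsigned int n = 0;

	if (!x)
		return 31;
	while (!(x & 1u)) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Highest set bit position (like fls()-1); caller guarantees x != 0. */
static unsigned int high_bit(unsigned int x)
{
	unsigned int n = 0;

	while (x >>= 1)
		n++;
	return n;
}

static unsigned int min3u(unsigned int a, unsigned int b, unsigned int c)
{
	unsigned int m = a < b ? a : b;

	return m < c ? m : c;
}

static unsigned int approx_bat_block_size(unsigned long base, unsigned long top)
{
	unsigned int base_shift = low_bit((unsigned int)base);
	unsigned int size_shift = high_bit((unsigned int)(top - base));

	return min3u(SZ_256M, 1u << base_shift, 1u << size_shift);
}

int main(void)
{
	/* Roughly the KASAN shadow range from the example in the commit message. */
	unsigned long base = 0xf8000000ul;
	unsigned long top  = 0xfe000000ul;

	while (base < top) {
		unsigned int size = approx_bat_block_size(base, top);

		printf("BAT at 0x%08lx, size %uM (base aligned to the block size)\n",
		       base, size >> 20);
		base += size;
	}
	return 0;
}

Run as is, this prints a 64M block at 0xf8000000 followed by a 32M block at 0xfc000000, matching the corrected BAT layout shown above.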
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d37823c3528e5e0705fc7746bcbc2afffb619259 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Mon, 10 Jan 2022 15:29:25 +0000
Subject: [PATCH] powerpc/32s: Fix kasan_init_region() for KASAN
It has been reported that in some configurations the kernel doesn't
boot with KASAN enabled.
This is due to wrong BAT allocation for the KASAN area:
---[ Data Block Address Translation ]---
0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw m
1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw m
2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw m
3: 0xf8000000-0xf9ffffff 0x2a000000 32M Kernel rw m
4: 0xfa000000-0xfdffffff 0x2c000000 64M Kernel rw m
A BAT must have both virtual and physical addresses alignment matching
the size of the BAT. This is not the case for BAT 4 above.
Fix kasan_init_region() by using the block_size() function that is in
book3s32/mmu.c. To be able to reuse it here, make it non-static and
rename it to bat_block_size() in order to avoid a name conflict
with block_size() defined in <linux/blkdev.h>.
Also reuse find_free_bat() to avoid an error message from setbat()
when no BAT is available.
And allocate memory outside of linear memory mapping to avoid
wasting that precious space.
With this change we get correct alignment for BATs and KASAN shadow
memory is allocated outside the linear memory space.
---[ Data Block Address Translation ]---
0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw
1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw
2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw
3: 0xf8000000-0xfbffffff 0x7c000000 64M Kernel rw
4: 0xfc000000-0xfdffffff 0x7a000000 32M Kernel rw
Fixes: 7974c4732642 ("powerpc/32s: Implement dedicated kasan_init_region()")
Cc: stable(a)vger.kernel.org
Reported-by: Maxime Bizon <mbizon(a)freebox.fr>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Tested-by: Maxime Bizon <mbizon(a)freebox.fr>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/7a50ef902494d1325227d47d33dada01e52e5518.16418187…
diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
index 7be27862329f..78c6a5fde1d6 100644
--- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
@@ -223,6 +223,8 @@ static __always_inline void update_user_segments(u32 val)
update_user_segment(15, val);
}
+int __init find_free_bat(void);
+unsigned int bat_block_size(unsigned long base, unsigned long top);
#endif /* !__ASSEMBLY__ */
/* We happily ignore the smaller BATs on 601, we don't actually use
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 94045b265b6b..203735caf691 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -76,7 +76,7 @@ unsigned long p_block_mapped(phys_addr_t pa)
return 0;
}
-static int __init find_free_bat(void)
+int __init find_free_bat(void)
{
int b;
int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
@@ -100,7 +100,7 @@ static int __init find_free_bat(void)
* - block size has to be a power of two. This is calculated by finding the
* highest bit set to 1.
*/
-static unsigned int block_size(unsigned long base, unsigned long top)
+unsigned int bat_block_size(unsigned long base, unsigned long top)
{
unsigned int max_size = SZ_256M;
unsigned int base_shift = (ffs(base) - 1) & 31;
@@ -145,7 +145,7 @@ static unsigned long __init __mmu_mapin_ram(unsigned long base, unsigned long to
int idx;
while ((idx = find_free_bat()) != -1 && base != top) {
- unsigned int size = block_size(base, top);
+ unsigned int size = bat_block_size(base, top);
if (size < 128 << 10)
break;
@@ -201,12 +201,12 @@ void mmu_mark_initmem_nx(void)
unsigned long size;
for (i = 0; i < nb - 1 && base < top;) {
- size = block_size(base, top);
+ size = bat_block_size(base, top);
setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
base += size;
}
if (base < top) {
- size = block_size(base, top);
+ size = bat_block_size(base, top);
if ((top - base) > size) {
size <<= 1;
if (strict_kernel_rwx_enabled() && base + size > border)
diff --git a/arch/powerpc/mm/kasan/book3s_32.c b/arch/powerpc/mm/kasan/book3s_32.c
index 35b287b0a8da..450a67ef0bbe 100644
--- a/arch/powerpc/mm/kasan/book3s_32.c
+++ b/arch/powerpc/mm/kasan/book3s_32.c
@@ -10,48 +10,51 @@ int __init kasan_init_region(void *start, size_t size)
{
unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);
- unsigned long k_cur = k_start;
- int k_size = k_end - k_start;
- int k_size_base = 1 << (ffs(k_size) - 1);
+ unsigned long k_nobat = k_start;
+ unsigned long k_cur;
+ phys_addr_t phys;
int ret;
- void *block;
- block = memblock_alloc(k_size, k_size_base);
-
- if (block && k_size_base >= SZ_128K && k_start == ALIGN(k_start, k_size_base)) {
- int shift = ffs(k_size - k_size_base);
- int k_size_more = shift ? 1 << (shift - 1) : 0;
-
- setbat(-1, k_start, __pa(block), k_size_base, PAGE_KERNEL);
- if (k_size_more >= SZ_128K)
- setbat(-1, k_start + k_size_base, __pa(block) + k_size_base,
- k_size_more, PAGE_KERNEL);
- if (v_block_mapped(k_start))
- k_cur = k_start + k_size_base;
- if (v_block_mapped(k_start + k_size_base))
- k_cur = k_start + k_size_base + k_size_more;
-
- update_bats();
+ while (k_nobat < k_end) {
+ unsigned int k_size = bat_block_size(k_nobat, k_end);
+ int idx = find_free_bat();
+
+ if (idx == -1)
+ break;
+ if (k_size < SZ_128K)
+ break;
+ phys = memblock_phys_alloc_range(k_size, k_size, 0,
+ MEMBLOCK_ALLOC_ANYWHERE);
+ if (!phys)
+ break;
+
+ setbat(idx, k_nobat, phys, k_size, PAGE_KERNEL);
+ k_nobat += k_size;
}
+ if (k_nobat != k_start)
+ update_bats();
- if (!block)
- block = memblock_alloc(k_size, PAGE_SIZE);
- if (!block)
- return -ENOMEM;
+ if (k_nobat < k_end) {
+ phys = memblock_phys_alloc_range(k_end - k_nobat, PAGE_SIZE, 0,
+ MEMBLOCK_ALLOC_ANYWHERE);
+ if (!phys)
+ return -ENOMEM;
+ }
ret = kasan_init_shadow_page_tables(k_start, k_end);
if (ret)
return ret;
- kasan_update_early_region(k_start, k_cur, __pte(0));
+ kasan_update_early_region(k_start, k_nobat, __pte(0));
- for (; k_cur < k_end; k_cur += PAGE_SIZE) {
+ for (k_cur = k_nobat; k_cur < k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_off_k(k_cur);
- void *va = block + k_cur - k_start;
- pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
+ pte_t pte = pfn_pte(PHYS_PFN(phys + k_cur - k_nobat), PAGE_KERNEL);
__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
}
flush_tlb_kernel_range(k_start, k_end);
+ memset(kasan_mem_to_shadow(start), 0, k_end - k_start);
+
return 0;
}
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d37823c3528e5e0705fc7746bcbc2afffb619259 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Date: Mon, 10 Jan 2022 15:29:25 +0000
Subject: [PATCH] powerpc/32s: Fix kasan_init_region() for KASAN
It has been reported that in some configurations the kernel doesn't
boot with KASAN enabled.
This is due to wrong BAT allocation for the KASAN area:
---[ Data Block Address Translation ]---
0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw m
1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw m
2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw m
3: 0xf8000000-0xf9ffffff 0x2a000000 32M Kernel rw m
4: 0xfa000000-0xfdffffff 0x2c000000 64M Kernel rw m
A BAT must have both virtual and physical addresses alignment matching
the size of the BAT. This is not the case for BAT 4 above.
Fix kasan_init_region() by using the block_size() function that is in
book3s32/mmu.c. To be able to reuse it here, make it non-static and
rename it to bat_block_size() in order to avoid a name conflict
with block_size() defined in <linux/blkdev.h>.
Also reuse find_free_bat() to avoid an error message from setbat()
when no BAT is available.
And allocate memory outside of linear memory mapping to avoid
wasting that precious space.
With this change we get correct alignment for BATs and KASAN shadow
memory is allocated outside the linear memory space.
---[ Data Block Address Translation ]---
0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw
1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw
2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw
3: 0xf8000000-0xfbffffff 0x7c000000 64M Kernel rw
4: 0xfc000000-0xfdffffff 0x7a000000 32M Kernel rw
Fixes: 7974c4732642 ("powerpc/32s: Implement dedicated kasan_init_region()")
Cc: stable(a)vger.kernel.org
Reported-by: Maxime Bizon <mbizon(a)freebox.fr>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Tested-by: Maxime Bizon <mbizon(a)freebox.fr>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/7a50ef902494d1325227d47d33dada01e52e5518.16418187…
diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
index 7be27862329f..78c6a5fde1d6 100644
--- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
@@ -223,6 +223,8 @@ static __always_inline void update_user_segments(u32 val)
update_user_segment(15, val);
}
+int __init find_free_bat(void);
+unsigned int bat_block_size(unsigned long base, unsigned long top);
#endif /* !__ASSEMBLY__ */
/* We happily ignore the smaller BATs on 601, we don't actually use
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 94045b265b6b..203735caf691 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -76,7 +76,7 @@ unsigned long p_block_mapped(phys_addr_t pa)
return 0;
}
-static int __init find_free_bat(void)
+int __init find_free_bat(void)
{
int b;
int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
@@ -100,7 +100,7 @@ static int __init find_free_bat(void)
* - block size has to be a power of two. This is calculated by finding the
* highest bit set to 1.
*/
-static unsigned int block_size(unsigned long base, unsigned long top)
+unsigned int bat_block_size(unsigned long base, unsigned long top)
{
unsigned int max_size = SZ_256M;
unsigned int base_shift = (ffs(base) - 1) & 31;
@@ -145,7 +145,7 @@ static unsigned long __init __mmu_mapin_ram(unsigned long base, unsigned long to
int idx;
while ((idx = find_free_bat()) != -1 && base != top) {
- unsigned int size = block_size(base, top);
+ unsigned int size = bat_block_size(base, top);
if (size < 128 << 10)
break;
@@ -201,12 +201,12 @@ void mmu_mark_initmem_nx(void)
unsigned long size;
for (i = 0; i < nb - 1 && base < top;) {
- size = block_size(base, top);
+ size = bat_block_size(base, top);
setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
base += size;
}
if (base < top) {
- size = block_size(base, top);
+ size = bat_block_size(base, top);
if ((top - base) > size) {
size <<= 1;
if (strict_kernel_rwx_enabled() && base + size > border)
diff --git a/arch/powerpc/mm/kasan/book3s_32.c b/arch/powerpc/mm/kasan/book3s_32.c
index 35b287b0a8da..450a67ef0bbe 100644
--- a/arch/powerpc/mm/kasan/book3s_32.c
+++ b/arch/powerpc/mm/kasan/book3s_32.c
@@ -10,48 +10,51 @@ int __init kasan_init_region(void *start, size_t size)
{
unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);
- unsigned long k_cur = k_start;
- int k_size = k_end - k_start;
- int k_size_base = 1 << (ffs(k_size) - 1);
+ unsigned long k_nobat = k_start;
+ unsigned long k_cur;
+ phys_addr_t phys;
int ret;
- void *block;
- block = memblock_alloc(k_size, k_size_base);
-
- if (block && k_size_base >= SZ_128K && k_start == ALIGN(k_start, k_size_base)) {
- int shift = ffs(k_size - k_size_base);
- int k_size_more = shift ? 1 << (shift - 1) : 0;
-
- setbat(-1, k_start, __pa(block), k_size_base, PAGE_KERNEL);
- if (k_size_more >= SZ_128K)
- setbat(-1, k_start + k_size_base, __pa(block) + k_size_base,
- k_size_more, PAGE_KERNEL);
- if (v_block_mapped(k_start))
- k_cur = k_start + k_size_base;
- if (v_block_mapped(k_start + k_size_base))
- k_cur = k_start + k_size_base + k_size_more;
-
- update_bats();
+ while (k_nobat < k_end) {
+ unsigned int k_size = bat_block_size(k_nobat, k_end);
+ int idx = find_free_bat();
+
+ if (idx == -1)
+ break;
+ if (k_size < SZ_128K)
+ break;
+ phys = memblock_phys_alloc_range(k_size, k_size, 0,
+ MEMBLOCK_ALLOC_ANYWHERE);
+ if (!phys)
+ break;
+
+ setbat(idx, k_nobat, phys, k_size, PAGE_KERNEL);
+ k_nobat += k_size;
}
+ if (k_nobat != k_start)
+ update_bats();
- if (!block)
- block = memblock_alloc(k_size, PAGE_SIZE);
- if (!block)
- return -ENOMEM;
+ if (k_nobat < k_end) {
+ phys = memblock_phys_alloc_range(k_end - k_nobat, PAGE_SIZE, 0,
+ MEMBLOCK_ALLOC_ANYWHERE);
+ if (!phys)
+ return -ENOMEM;
+ }
ret = kasan_init_shadow_page_tables(k_start, k_end);
if (ret)
return ret;
- kasan_update_early_region(k_start, k_cur, __pte(0));
+ kasan_update_early_region(k_start, k_nobat, __pte(0));
- for (; k_cur < k_end; k_cur += PAGE_SIZE) {
+ for (k_cur = k_nobat; k_cur < k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_off_k(k_cur);
- void *va = block + k_cur - k_start;
- pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
+ pte_t pte = pfn_pte(PHYS_PFN(phys + k_cur - k_nobat), PAGE_KERNEL);
__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
}
flush_tlb_kernel_range(k_start, k_end);
+ memset(kasan_mem_to_shadow(start), 0, k_end - k_start);
+
return 0;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 097f1eefedeab528cecbd35586dfe293853ffb17 Mon Sep 17 00:00:00 2001
From: Tom Zanussi <zanussi(a)kernel.org>
Date: Thu, 27 Jan 2022 15:44:17 -0600
Subject: [PATCH] tracing: Propagate is_signed to expression
During expression parsing, a new expression field is created which
should inherit the properties of the operands, such as size and
is_signed.
is_signed propagation was missing, causing spurious errors with signed
operands. Add it in parse_expr() and parse_unary() to fix the problem.
Link: https://lkml.kernel.org/r/f4dac08742fd7a0920bf80a73c6c44042f5eaa40.16433197…
Cc: stable(a)vger.kernel.org
Fixes: 100719dcef447 ("tracing: Add simple expression support to hist triggers")
Reported-by: Yordan Karadzhov <ykaradzhov(a)vmware.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215513
Signed-off-by: Tom Zanussi <zanussi(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index b894d68082ea..ada87bfb5bb8 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -2503,6 +2503,8 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data,
(HIST_FIELD_FL_TIMESTAMP | HIST_FIELD_FL_TIMESTAMP_USECS);
expr->fn = hist_field_unary_minus;
expr->operands[0] = operand1;
+ expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = FIELD_OP_UNARY_MINUS;
expr->name = expr_str(expr, 0);
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
@@ -2719,6 +2721,7 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
/* The operand sizes should be the same, so just pick one */
expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = field_op;
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
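Independent of the hist-trigger internals, the bug fixed above boils down to a composite node not copying attributes from its operand. The stand-alone sketch below shows that pattern; the structures and field names are invented and are not the trace_events_hist ones.

/* Stand-alone illustration of the attribute-propagation idea above: when a
 * new expression node wraps an operand, it should inherit size and
 * signedness, otherwise later checks see a mismatch.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_field {
	const char *name;
	unsigned int size;   /* in bytes */
	bool is_signed;
};

/* Build a unary-minus expression over one operand, inheriting its traits. */
static struct toy_field make_unary_minus(const struct toy_field *operand)
{
	struct toy_field expr = {
		.name      = "unary_minus_expr",
		.size      = operand->size,       /* inherited                */
		.is_signed = operand->is_signed,  /* the part the fix adds    */
	};

	return expr;
}

int main(void)
{
	struct toy_field pid = { .name = "common_pid", .size = 4, .is_signed = true };
	struct toy_field expr = make_unary_minus(&pid);

	printf("%s: size=%u is_signed=%d\n", expr.name, expr.size, expr.is_signed);
	return 0;
}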
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 097f1eefedeab528cecbd35586dfe293853ffb17 Mon Sep 17 00:00:00 2001
From: Tom Zanussi <zanussi(a)kernel.org>
Date: Thu, 27 Jan 2022 15:44:17 -0600
Subject: [PATCH] tracing: Propagate is_signed to expression
During expression parsing, a new expression field is created which
should inherit the properties of the operands, such as size and
is_signed.
is_signed propagation was missing, causing spurious errors with signed
operands. Add it in parse_expr() and parse_unary() to fix the problem.
Link: https://lkml.kernel.org/r/f4dac08742fd7a0920bf80a73c6c44042f5eaa40.16433197…
Cc: stable(a)vger.kernel.org
Fixes: 100719dcef447 ("tracing: Add simple expression support to hist triggers")
Reported-by: Yordan Karadzhov <ykaradzhov(a)vmware.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215513
Signed-off-by: Tom Zanussi <zanussi(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index b894d68082ea..ada87bfb5bb8 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -2503,6 +2503,8 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data,
(HIST_FIELD_FL_TIMESTAMP | HIST_FIELD_FL_TIMESTAMP_USECS);
expr->fn = hist_field_unary_minus;
expr->operands[0] = operand1;
+ expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = FIELD_OP_UNARY_MINUS;
expr->name = expr_str(expr, 0);
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
@@ -2719,6 +2721,7 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
/* The operand sizes should be the same, so just pick one */
expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = field_op;
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f094a39c6ba168f2df1edfd1731cca377af5f442 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Date: Mon, 17 Jan 2022 18:40:32 +0100
Subject: [PATCH] s390/nmi: handle vector validity failures for KVM guests
The machine check validity bit tells about the context. If a KVM guest
was running, the bit tells about the guest validity and the host state is
not affected. As a guest can disable the guest validity, this might
result in unwanted host errors on machine checks.
Cc: stable(a)vger.kernel.org
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Signed-off-by: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c
index 147c0d5fd9b4..651a51914e34 100644
--- a/arch/s390/kernel/nmi.c
+++ b/arch/s390/kernel/nmi.c
@@ -264,7 +264,14 @@ static int notrace s390_validate_registers(union mci mci, int umode)
/* Validate vector registers */
union ctlreg0 cr0;
- if (!mci.vr) {
+ /*
+ * The vector validity must only be checked if not running a
+ * KVM guest. For KVM guests the machine check is forwarded by
+ * KVM and it is the responsibility of the guest to take
+ * appropriate actions. The host vector or FPU values have been
+ * saved by KVM and will be restored by KVM.
+ */
+ if (!mci.vr && !test_cpu_flag(CIF_MCCK_GUEST)) {
/*
* Vector registers can't be restored. If the kernel
* currently uses vector registers the system is
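As a side note on the control flow being patched here, the following is a condensed stand-alone model of "only treat an invalid vector-register state as a host problem when the machine check did not hit a KVM guest". The names are invented; this is not s390 kernel code.

/* Condensed model of the validity decision described above: an invalid
 * vector-register save area only matters to the host when the machine
 * check did not occur while a KVM guest was running.
 */
#include <stdbool.h>
#include <stdio.h>

static bool host_must_handle_invalid_vr(bool vr_valid, bool mcck_hit_guest)
{
	/* Mirrors: if (!mci.vr && !test_cpu_flag(CIF_MCCK_GUEST)) ... */
	return !vr_valid && !mcck_hit_guest;
}

int main(void)
{
	printf("invalid VR, host context  -> host acts: %d\n",
	       host_must_handle_invalid_vr(false, false));
	printf("invalid VR, guest context -> host acts: %d (guest handles it)\n",
	       host_must_handle_invalid_vr(false, true));
	printf("valid VR                  -> host acts: %d\n",
	       host_must_handle_invalid_vr(true, false));
	return 0;
}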
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f094a39c6ba168f2df1edfd1731cca377af5f442 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Date: Mon, 17 Jan 2022 18:40:32 +0100
Subject: [PATCH] s390/nmi: handle vector validity failures for KVM guests
The machine check validity bit tells about the context. If a KVM guest
was running, the bit tells about the guest validity and the host state is
not affected. As a guest can disable the guest validity, this might
result in unwanted host errors on machine checks.
Cc: stable(a)vger.kernel.org
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Signed-off-by: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c
index 147c0d5fd9b4..651a51914e34 100644
--- a/arch/s390/kernel/nmi.c
+++ b/arch/s390/kernel/nmi.c
@@ -264,7 +264,14 @@ static int notrace s390_validate_registers(union mci mci, int umode)
/* Validate vector registers */
union ctlreg0 cr0;
- if (!mci.vr) {
+ /*
+ * The vector validity must only be checked if not running a
+ * KVM guest. For KVM guests the machine check is forwarded by
+ * KVM and it is the responsibility of the guest to take
+ * appropriate actions. The host vector or FPU values have been
+ * saved by KVM and will be restored by KVM.
+ */
+ if (!mci.vr && !test_cpu_flag(CIF_MCCK_GUEST)) {
/*
* Vector registers can't be restored. If the kernel
* currently uses vector registers the system is
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f094a39c6ba168f2df1edfd1731cca377af5f442 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Date: Mon, 17 Jan 2022 18:40:32 +0100
Subject: [PATCH] s390/nmi: handle vector validity failures for KVM guests
The machine check validity bit tells about the context. If a KVM guest
was running, the bit tells about the guest validity and the host state is
not affected. As a guest can disable the guest validity, this might
result in unwanted host errors on machine checks.
Cc: stable(a)vger.kernel.org
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Signed-off-by: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c
index 147c0d5fd9b4..651a51914e34 100644
--- a/arch/s390/kernel/nmi.c
+++ b/arch/s390/kernel/nmi.c
@@ -264,7 +264,14 @@ static int notrace s390_validate_registers(union mci mci, int umode)
/* Validate vector registers */
union ctlreg0 cr0;
- if (!mci.vr) {
+ /*
+ * The vector validity must only be checked if not running a
+ * KVM guest. For KVM guests the machine check is forwarded by
+ * KVM and it is the responsibility of the guest to take
+ * appropriate actions. The host vector or FPU values have been
+ * saved by KVM and will be restored by KVM.
+ */
+ if (!mci.vr && !test_cpu_flag(CIF_MCCK_GUEST)) {
/*
* Vector registers can't be restored. If the kernel
* currently uses vector registers the system is
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f094a39c6ba168f2df1edfd1731cca377af5f442 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Date: Mon, 17 Jan 2022 18:40:32 +0100
Subject: [PATCH] s390/nmi: handle vector validity failures for KVM guests
The machine check validity bit tells about the context. If a KVM guest
was running, the bit tells about the guest validity and the host state is
not affected. As a guest can disable the guest validity, this might
result in unwanted host errors on machine checks.
Cc: stable(a)vger.kernel.org
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Signed-off-by: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c
index 147c0d5fd9b4..651a51914e34 100644
--- a/arch/s390/kernel/nmi.c
+++ b/arch/s390/kernel/nmi.c
@@ -264,7 +264,14 @@ static int notrace s390_validate_registers(union mci mci, int umode)
/* Validate vector registers */
union ctlreg0 cr0;
- if (!mci.vr) {
+ /*
+ * The vector validity must only be checked if not running a
+ * KVM guest. For KVM guests the machine check is forwarded by
+ * KVM and it is the responsibility of the guest to take
+ * appropriate actions. The host vector or FPU values have been
+ * saved by KVM and will be restored by KVM.
+ */
+ if (!mci.vr && !test_cpu_flag(CIF_MCCK_GUEST)) {
/*
* Vector registers can't be restored. If the kernel
* currently uses vector registers the system is
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ea1d6a847d2b1d17fefd9196664b95f052a0775 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Date: Thu, 13 Jan 2022 11:44:19 +0100
Subject: [PATCH] s390/nmi: handle guarded storage validity failures for KVM
guests
Machine check validity bits reflect the state of the machine check. If a
guest does not make use of guarded storage, the validity bit might be
off. We cannot use the host CR bit to decide if the validity bit must
be on. So ignore "invalid" guarded storage controls for KVM guests in
the host and rely on the machine check being forwarded to the guest. If
no other errors happen from a host perspective, everything is fine, no
process needs to be killed, and the host can continue to run.
Cc: stable(a)vger.kernel.org
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Reported-by: Carsten Otte <cotte(a)de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Tested-by: Carsten Otte <cotte(a)de.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c
index 0c9e894913dc..147c0d5fd9b4 100644
--- a/arch/s390/kernel/nmi.c
+++ b/arch/s390/kernel/nmi.c
@@ -307,11 +307,21 @@ static int notrace s390_validate_registers(union mci mci, int umode)
if (cr2.gse) {
if (!mci.gs) {
/*
- * Guarded storage register can't be restored and
- * the current processes uses guarded storage.
- * It has to be terminated.
+ * 2 cases:
+ * - machine check in kernel or userspace
+ * - machine check while running SIE (KVM guest)
+ * For kernel or userspace the userspace values of
+ * guarded storage control can not be recreated, the
+ * process must be terminated.
+ * For SIE the guest values of guarded storage can not
+ * be recreated. This is either due to a bug or due to
+ * GS being disabled in the guest. The guest will be
+ * notified by KVM code and the guests machine check
+ * handling must take care of this. The host values
+ * are saved by KVM and are not affected.
*/
- kill_task = 1;
+ if (!test_cpu_flag(CIF_MCCK_GUEST))
+ kill_task = 1;
} else {
load_gs_cb((struct gs_cb *)mcesa->guarded_storage_save_area);
}
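The guarded-storage case follows the same shape as the vector-register one, with the extra twist that a valid state is simply restored. A minimal stand-alone sketch of the kill-task decision (hypothetical names, not the kernel code):

/* Minimal model of the guarded-storage decision above: kill the current
 * task only when guarded storage is enabled, its state is invalid, and
 * the machine check did not hit a KVM guest.
 */
#include <stdbool.h>
#include <stdio.h>

static bool must_kill_task(bool gse_enabled, bool gs_valid, bool mcck_hit_guest)
{
	if (!gse_enabled)
		return false;          /* guarded storage not in use at all */
	if (gs_valid)
		return false;          /* state can simply be restored      */
	return !mcck_hit_guest;        /* guest case is forwarded by KVM    */
}

int main(void)
{
	printf("host, invalid GS  -> kill task: %d\n", must_kill_task(true, false, false));
	printf("guest, invalid GS -> kill task: %d\n", must_kill_task(true, false, true));
	printf("valid GS          -> kill task: %d\n", must_kill_task(true, true, false));
	return 0;
}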
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ea1d6a847d2b1d17fefd9196664b95f052a0775 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Date: Thu, 13 Jan 2022 11:44:19 +0100
Subject: [PATCH] s390/nmi: handle guarded storage validity failures for KVM
guests
Machine check validity bits reflect the state of the machine check. If a
guest does not make use of guarded storage, the validity bit might be
off. We cannot use the host CR bit to decide if the validity bit must
be on. So ignore "invalid" guarded storage controls for KVM guests in
the host and rely on the machine check being forwarded to the guest. If
no other errors happen from a host perspective, everything is fine, no
process needs to be killed, and the host can continue to run.
Cc: stable(a)vger.kernel.org
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Reported-by: Carsten Otte <cotte(a)de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Tested-by: Carsten Otte <cotte(a)de.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c
index 0c9e894913dc..147c0d5fd9b4 100644
--- a/arch/s390/kernel/nmi.c
+++ b/arch/s390/kernel/nmi.c
@@ -307,11 +307,21 @@ static int notrace s390_validate_registers(union mci mci, int umode)
if (cr2.gse) {
if (!mci.gs) {
/*
- * Guarded storage register can't be restored and
- * the current processes uses guarded storage.
- * It has to be terminated.
+ * 2 cases:
+ * - machine check in kernel or userspace
+ * - machine check while running SIE (KVM guest)
+ * For kernel or userspace the userspace values of
+ * guarded storage control can not be recreated, the
+ * process must be terminated.
+ * For SIE the guest values of guarded storage can not
+ * be recreated. This is either due to a bug or due to
+ * GS being disabled in the guest. The guest will be
+ * notified by KVM code and the guests machine check
+ * handling must take care of this. The host values
+ * are saved by KVM and are not affected.
*/
- kill_task = 1;
+ if (!test_cpu_flag(CIF_MCCK_GUEST))
+ kill_task = 1;
} else {
load_gs_cb((struct gs_cb *)mcesa->guarded_storage_save_area);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ea1d6a847d2b1d17fefd9196664b95f052a0775 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Date: Thu, 13 Jan 2022 11:44:19 +0100
Subject: [PATCH] s390/nmi: handle guarded storage validity failures for KVM
guests
Machine check validity bits reflect the state of the machine check. If a
guest does not make use of guarded storage, the validity bit might be
off. We cannot use the host CR bit to decide if the validity bit must
be on. So ignore "invalid" guarded storage controls for KVM guests in
the host and rely on the machine check being forwarded to the guest. If
no other errors happen from a host perspective, everything is fine, no
process needs to be killed, and the host can continue to run.
Cc: stable(a)vger.kernel.org
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Reported-by: Carsten Otte <cotte(a)de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Tested-by: Carsten Otte <cotte(a)de.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c
index 0c9e894913dc..147c0d5fd9b4 100644
--- a/arch/s390/kernel/nmi.c
+++ b/arch/s390/kernel/nmi.c
@@ -307,11 +307,21 @@ static int notrace s390_validate_registers(union mci mci, int umode)
if (cr2.gse) {
if (!mci.gs) {
/*
- * Guarded storage register can't be restored and
- * the current processes uses guarded storage.
- * It has to be terminated.
+ * 2 cases:
+ * - machine check in kernel or userspace
+ * - machine check while running SIE (KVM guest)
+ * For kernel or userspace the userspace values of
+ * guarded storage control can not be recreated, the
+ * process must be terminated.
+ * For SIE the guest values of guarded storage can not
+ * be recreated. This is either due to a bug or due to
+ * GS being disabled in the guest. The guest will be
+ * notified by KVM code and the guests machine check
+ * handling must take care of this. The host values
+ * are saved by KVM and are not affected.
*/
- kill_task = 1;
+ if (!test_cpu_flag(CIF_MCCK_GUEST))
+ kill_task = 1;
} else {
load_gs_cb((struct gs_cb *)mcesa->guarded_storage_save_area);
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ea1d6a847d2b1d17fefd9196664b95f052a0775 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Date: Thu, 13 Jan 2022 11:44:19 +0100
Subject: [PATCH] s390/nmi: handle guarded storage validity failures for KVM
guests
Machine check validity bits reflect the state of the machine check. If a
guest does not make use of guarded storage, the validity bit might be
off. We cannot use the host CR bit to decide if the validity bit must
be on. So ignore "invalid" guarded storage controls for KVM guests in
the host and rely on the machine check being forwarded to the guest. If
no other errors happen from a host perspective, everything is fine, no
process needs to be killed, and the host can continue to run.
Cc: stable(a)vger.kernel.org
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Reported-by: Carsten Otte <cotte(a)de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Tested-by: Carsten Otte <cotte(a)de.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c
index 0c9e894913dc..147c0d5fd9b4 100644
--- a/arch/s390/kernel/nmi.c
+++ b/arch/s390/kernel/nmi.c
@@ -307,11 +307,21 @@ static int notrace s390_validate_registers(union mci mci, int umode)
if (cr2.gse) {
if (!mci.gs) {
/*
- * Guarded storage register can't be restored and
- * the current processes uses guarded storage.
- * It has to be terminated.
+ * 2 cases:
+ * - machine check in kernel or userspace
+ * - machine check while running SIE (KVM guest)
+ * For kernel or userspace the userspace values of
+ * guarded storage control can not be recreated, the
+ * process must be terminated.
+ * For SIE the guest values of guarded storage can not
+ * be recreated. This is either due to a bug or due to
+ * GS being disabled in the guest. The guest will be
+ * notified by KVM code and the guests machine check
+ * handling must take care of this. The host values
+ * are saved by KVM and are not affected.
*/
- kill_task = 1;
+ if (!test_cpu_flag(CIF_MCCK_GUEST))
+ kill_task = 1;
} else {
load_gs_cb((struct gs_cb *)mcesa->guarded_storage_save_area);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0735e639f129dff455aeb91da291f5c578cc33db Mon Sep 17 00:00:00 2001
From: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Date: Wed, 26 Jan 2022 17:47:23 +0800
Subject: [PATCH] net: stmmac: skip only stmmac_ptp_register when resume from
suspend
When resuming from suspend, besides skipping PTP registration, the driver
also skips PTP HW initialization. This could leave the PTP clock unable to
operate properly after resuming from suspend.
To fix this, only stmmac_ptp_register() is skipped when resuming from
suspend.
Fixes: fe1319291150 ("stmmac: Don't init ptp again when resume from suspend/hibernation")
Cc: <stable(a)vger.kernel.org> # 5.15.x
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 25c802344356..639a753266e6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -915,8 +915,6 @@ static int stmmac_init_ptp(struct stmmac_priv *priv)
priv->hwts_tx_en = 0;
priv->hwts_rx_en = 0;
- stmmac_ptp_register(priv);
-
return 0;
}
@@ -3242,7 +3240,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
/**
* stmmac_hw_setup - setup mac in a usable state.
* @dev : pointer to the device structure.
- * @init_ptp: initialize PTP if set
+ * @ptp_register: register PTP if set
* Description:
* this is the main function to setup the HW in a usable state because the
* dma engine is reset, the core registers are configured (e.g. AXI,
@@ -3252,7 +3250,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
* 0 on success and an appropriate (-)ve integer as defined in errno.h
* file on failure.
*/
-static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
+static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
{
struct stmmac_priv *priv = netdev_priv(dev);
u32 rx_cnt = priv->plat->rx_queues_to_use;
@@ -3309,13 +3307,13 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
stmmac_mmc_setup(priv);
- if (init_ptp) {
- ret = stmmac_init_ptp(priv);
- if (ret == -EOPNOTSUPP)
- netdev_warn(priv->dev, "PTP not supported by HW\n");
- else if (ret)
- netdev_warn(priv->dev, "PTP init failed\n");
- }
+ ret = stmmac_init_ptp(priv);
+ if (ret == -EOPNOTSUPP)
+ netdev_warn(priv->dev, "PTP not supported by HW\n");
+ else if (ret)
+ netdev_warn(priv->dev, "PTP init failed\n");
+ else if (ptp_register)
+ stmmac_ptp_register(priv);
priv->eee_tw_timer = STMMAC_DEFAULT_TWT_LS;
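To restate the resume-path change above in isolation, here is a compact stand-alone model of "always initialize the PTP hardware, but only register the PTP clock when not resuming". The function names are invented; this is not the stmmac driver.

/* Toy model of the setup flow after the patch above: PTP hardware init runs
 * on every hw_setup(), while clock registration only happens when requested
 * (i.e. not on resume).
 */
#include <stdbool.h>
#include <stdio.h>

static int toy_init_ptp_hw(void)
{
	printf("  ptp: hw timestamping state initialized\n");
	return 0;
}

static void toy_register_ptp_clock(void)
{
	printf("  ptp: clock registered with the ptp class\n");
}

static int toy_hw_setup(bool ptp_register)
{
	int ret = toy_init_ptp_hw();   /* always done, even on resume */

	if (ret)
		printf("  ptp: init failed\n");
	else if (ptp_register)
		toy_register_ptp_clock();
	return 0;
}

int main(void)
{
	printf("cold start:\n");
	toy_hw_setup(true);
	printf("resume from suspend:\n");
	toy_hw_setup(false);
	return 0;
}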
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0735e639f129dff455aeb91da291f5c578cc33db Mon Sep 17 00:00:00 2001
From: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Date: Wed, 26 Jan 2022 17:47:23 +0800
Subject: [PATCH] net: stmmac: skip only stmmac_ptp_register when resume from
suspend
When resuming from suspend, besides skipping PTP registration, the driver
also skips PTP HW initialization. This could leave the PTP clock unable to
operate properly after resuming from suspend.
To fix this, only stmmac_ptp_register() is skipped when resuming from
suspend.
Fixes: fe1319291150 ("stmmac: Don't init ptp again when resume from suspend/hibernation")
Cc: <stable(a)vger.kernel.org> # 5.15.x
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 25c802344356..639a753266e6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -915,8 +915,6 @@ static int stmmac_init_ptp(struct stmmac_priv *priv)
priv->hwts_tx_en = 0;
priv->hwts_rx_en = 0;
- stmmac_ptp_register(priv);
-
return 0;
}
@@ -3242,7 +3240,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
/**
* stmmac_hw_setup - setup mac in a usable state.
* @dev : pointer to the device structure.
- * @init_ptp: initialize PTP if set
+ * @ptp_register: register PTP if set
* Description:
* this is the main function to setup the HW in a usable state because the
* dma engine is reset, the core registers are configured (e.g. AXI,
@@ -3252,7 +3250,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
* 0 on success and an appropriate (-)ve integer as defined in errno.h
* file on failure.
*/
-static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
+static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
{
struct stmmac_priv *priv = netdev_priv(dev);
u32 rx_cnt = priv->plat->rx_queues_to_use;
@@ -3309,13 +3307,13 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
stmmac_mmc_setup(priv);
- if (init_ptp) {
- ret = stmmac_init_ptp(priv);
- if (ret == -EOPNOTSUPP)
- netdev_warn(priv->dev, "PTP not supported by HW\n");
- else if (ret)
- netdev_warn(priv->dev, "PTP init failed\n");
- }
+ ret = stmmac_init_ptp(priv);
+ if (ret == -EOPNOTSUPP)
+ netdev_warn(priv->dev, "PTP not supported by HW\n");
+ else if (ret)
+ netdev_warn(priv->dev, "PTP init failed\n");
+ else if (ptp_register)
+ stmmac_ptp_register(priv);
priv->eee_tw_timer = STMMAC_DEFAULT_TWT_LS;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0735e639f129dff455aeb91da291f5c578cc33db Mon Sep 17 00:00:00 2001
From: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Date: Wed, 26 Jan 2022 17:47:23 +0800
Subject: [PATCH] net: stmmac: skip only stmmac_ptp_register when resume from
suspend
When resuming from suspend, besides skipping PTP registration, the driver
also skips PTP HW initialization. This could leave the PTP clock unable to
operate properly after resuming from suspend.
To fix this, only stmmac_ptp_register() is skipped when resuming from
suspend.
Fixes: fe1319291150 ("stmmac: Don't init ptp again when resume from suspend/hibernation")
Cc: <stable(a)vger.kernel.org> # 5.15.x
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 25c802344356..639a753266e6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -915,8 +915,6 @@ static int stmmac_init_ptp(struct stmmac_priv *priv)
priv->hwts_tx_en = 0;
priv->hwts_rx_en = 0;
- stmmac_ptp_register(priv);
-
return 0;
}
@@ -3242,7 +3240,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
/**
* stmmac_hw_setup - setup mac in a usable state.
* @dev : pointer to the device structure.
- * @init_ptp: initialize PTP if set
+ * @ptp_register: register PTP if set
* Description:
* this is the main function to setup the HW in a usable state because the
* dma engine is reset, the core registers are configured (e.g. AXI,
@@ -3252,7 +3250,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
* 0 on success and an appropriate (-)ve integer as defined in errno.h
* file on failure.
*/
-static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
+static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
{
struct stmmac_priv *priv = netdev_priv(dev);
u32 rx_cnt = priv->plat->rx_queues_to_use;
@@ -3309,13 +3307,13 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
stmmac_mmc_setup(priv);
- if (init_ptp) {
- ret = stmmac_init_ptp(priv);
- if (ret == -EOPNOTSUPP)
- netdev_warn(priv->dev, "PTP not supported by HW\n");
- else if (ret)
- netdev_warn(priv->dev, "PTP init failed\n");
- }
+ ret = stmmac_init_ptp(priv);
+ if (ret == -EOPNOTSUPP)
+ netdev_warn(priv->dev, "PTP not supported by HW\n");
+ else if (ret)
+ netdev_warn(priv->dev, "PTP init failed\n");
+ else if (ptp_register)
+ stmmac_ptp_register(priv);
priv->eee_tw_timer = STMMAC_DEFAULT_TWT_LS;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0735e639f129dff455aeb91da291f5c578cc33db Mon Sep 17 00:00:00 2001
From: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Date: Wed, 26 Jan 2022 17:47:23 +0800
Subject: [PATCH] net: stmmac: skip only stmmac_ptp_register when resume from
suspend
When resuming from suspend, besides skipping PTP registration, the driver
also skips PTP HW initialization. This can leave the PTP clock unable to
operate properly after resume.
To fix this, skip only stmmac_ptp_register() when resuming from suspend.
Fixes: fe1319291150 ("stmmac: Don't init ptp again when resume from suspend/hibernation")
Cc: <stable(a)vger.kernel.org> # 5.15.x
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 25c802344356..639a753266e6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -915,8 +915,6 @@ static int stmmac_init_ptp(struct stmmac_priv *priv)
priv->hwts_tx_en = 0;
priv->hwts_rx_en = 0;
- stmmac_ptp_register(priv);
-
return 0;
}
@@ -3242,7 +3240,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
/**
* stmmac_hw_setup - setup mac in a usable state.
* @dev : pointer to the device structure.
- * @init_ptp: initialize PTP if set
+ * @ptp_register: register PTP if set
* Description:
* this is the main function to setup the HW in a usable state because the
* dma engine is reset, the core registers are configured (e.g. AXI,
@@ -3252,7 +3250,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
* 0 on success and an appropriate (-)ve integer as defined in errno.h
* file on failure.
*/
-static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
+static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
{
struct stmmac_priv *priv = netdev_priv(dev);
u32 rx_cnt = priv->plat->rx_queues_to_use;
@@ -3309,13 +3307,13 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
stmmac_mmc_setup(priv);
- if (init_ptp) {
- ret = stmmac_init_ptp(priv);
- if (ret == -EOPNOTSUPP)
- netdev_warn(priv->dev, "PTP not supported by HW\n");
- else if (ret)
- netdev_warn(priv->dev, "PTP init failed\n");
- }
+ ret = stmmac_init_ptp(priv);
+ if (ret == -EOPNOTSUPP)
+ netdev_warn(priv->dev, "PTP not supported by HW\n");
+ else if (ret)
+ netdev_warn(priv->dev, "PTP init failed\n");
+ else if (ptp_register)
+ stmmac_ptp_register(priv);
priv->eee_tw_timer = STMMAC_DEFAULT_TWT_LS;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0735e639f129dff455aeb91da291f5c578cc33db Mon Sep 17 00:00:00 2001
From: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Date: Wed, 26 Jan 2022 17:47:23 +0800
Subject: [PATCH] net: stmmac: skip only stmmac_ptp_register when resume from
suspend
When resuming from suspend, besides skipping PTP registration, the driver
also skips PTP HW initialization. This can leave the PTP clock unable to
operate properly after resume.
To fix this, skip only stmmac_ptp_register() when resuming from suspend.
Fixes: fe1319291150 ("stmmac: Don't init ptp again when resume from suspend/hibernation")
Cc: <stable(a)vger.kernel.org> # 5.15.x
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail(a)intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 25c802344356..639a753266e6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -915,8 +915,6 @@ static int stmmac_init_ptp(struct stmmac_priv *priv)
priv->hwts_tx_en = 0;
priv->hwts_rx_en = 0;
- stmmac_ptp_register(priv);
-
return 0;
}
@@ -3242,7 +3240,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
/**
* stmmac_hw_setup - setup mac in a usable state.
* @dev : pointer to the device structure.
- * @init_ptp: initialize PTP if set
+ * @ptp_register: register PTP if set
* Description:
* this is the main function to setup the HW in a usable state because the
* dma engine is reset, the core registers are configured (e.g. AXI,
@@ -3252,7 +3250,7 @@ static int stmmac_fpe_start_wq(struct stmmac_priv *priv)
* 0 on success and an appropriate (-)ve integer as defined in errno.h
* file on failure.
*/
-static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
+static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
{
struct stmmac_priv *priv = netdev_priv(dev);
u32 rx_cnt = priv->plat->rx_queues_to_use;
@@ -3309,13 +3307,13 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
stmmac_mmc_setup(priv);
- if (init_ptp) {
- ret = stmmac_init_ptp(priv);
- if (ret == -EOPNOTSUPP)
- netdev_warn(priv->dev, "PTP not supported by HW\n");
- else if (ret)
- netdev_warn(priv->dev, "PTP init failed\n");
- }
+ ret = stmmac_init_ptp(priv);
+ if (ret == -EOPNOTSUPP)
+ netdev_warn(priv->dev, "PTP not supported by HW\n");
+ else if (ret)
+ netdev_warn(priv->dev, "PTP init failed\n");
+ else if (ptp_register)
+ stmmac_ptp_register(priv);
priv->eee_tw_timer = STMMAC_DEFAULT_TWT_LS;
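For reference, a minimal sketch of how the two callers are expected to use
the renamed flag after this change (simplified and hedged; the call sites
shown are assumptions based on the commit message, not the literal backport
diff):

        /* Normal bring-up path: initialize the PTP hardware and register
         * the PTP clock device. */
        ret = stmmac_hw_setup(dev, true);      /* e.g. from stmmac_open() */

        /* Resume path: the PTP hardware still needs re-initialization,
         * but the PTP clock was already registered at open time, so pass
         * false to skip only stmmac_ptp_register(). */
        stmmac_hw_setup(ndev, false);          /* e.g. from stmmac_resume() */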
Hi,
I'm not sure when patches get picked up from Linus' branch, so just in case,
please add the following fixes to 5.16.4. They fix a few user-visible
bugs in defrag, and all apply cleanly. Thanks.
6b34cd8e175b btrfs: fix too long loop when defragging a 1 byte file
b767c2fc787e btrfs: allow defrag to be interruptible
484167da7773 btrfs: defrag: fix wrong number of defragged sectors
c080b4144b9d btrfs: defrag: properly update range->start for autodefrag
0cb5950f3f3b btrfs: fix deadlock when reserving space during defrag
3c9d31c71594 btrfs: add back missing dirty page rate limiting to defrag
From: D Scott Phillips <scott(a)os.amperecomputing.com>
commit 38e0257e0e6f4fef2aa2966b089b56a8b1cfb75c upstream.
The erratum 1418040 workaround enables CNTVCT_EL1 access trapping in EL0
when executing compat threads. The workaround is applied when switching
between tasks, but the need for the workaround could also change at an
exec(), when a non-compat task execs a compat binary or vice versa. Apply
the workaround in arch_setup_new_exec().
This leaves a small window of time between SET_PERSONALITY and
arch_setup_new_exec where preemption could occur and confuse the old
workaround logic that compares TIF_32BIT between prev and next. Instead, we
can just read cntkctl to make sure it's in the state that the next task
needs. I measured cntkctl read time to be about the same as a mov from a
general-purpose register on N1. Update the workaround logic to examine the
current value of cntkctl instead of the previous task's compat state.
Fixes: d49f7d7376d0 ("arm64: Move handling of erratum 1418040 into C code")
Cc: <stable(a)vger.kernel.org> # 5.9.x
Signed-off-by: D Scott Phillips <scott(a)os.amperecomputing.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
---
arch/arm64/kernel/process.c | 39 +++++++++++++++----------------------
1 file changed, 16 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index f7c42a7d09b6..c108d96a22db 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -515,34 +515,26 @@ static void entry_task_switch(struct task_struct *next)
/*
* ARM erratum 1418040 handling, affecting the 32bit view of CNTVCT.
- * Assuming the virtual counter is enabled at the beginning of times:
- *
- * - disable access when switching from a 64bit task to a 32bit task
- * - enable access when switching from a 32bit task to a 64bit task
+ * Ensure access is disabled when switching to a 32bit task, ensure
+ * access is enabled when switching to a 64bit task.
*/
-static void erratum_1418040_thread_switch(struct task_struct *prev,
- struct task_struct *next)
+static void erratum_1418040_thread_switch(struct task_struct *next)
{
- bool prev32, next32;
- u64 val;
-
- if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
- return;
-
- prev32 = is_compat_thread(task_thread_info(prev));
- next32 = is_compat_thread(task_thread_info(next));
-
- if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
+ if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) ||
+ !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
return;
- val = read_sysreg(cntkctl_el1);
-
- if (!next32)
- val |= ARCH_TIMER_USR_VCT_ACCESS_EN;
+ if (is_compat_thread(task_thread_info(next)))
+ sysreg_clear_set(cntkctl_el1, ARCH_TIMER_USR_VCT_ACCESS_EN, 0);
else
- val &= ~ARCH_TIMER_USR_VCT_ACCESS_EN;
+ sysreg_clear_set(cntkctl_el1, 0, ARCH_TIMER_USR_VCT_ACCESS_EN);
+}
- write_sysreg(val, cntkctl_el1);
+static void erratum_1418040_new_exec(void)
+{
+ preempt_disable();
+ erratum_1418040_thread_switch(current);
+ preempt_enable();
}
/*
@@ -560,7 +552,7 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
entry_task_switch(next);
uao_thread_switch(next);
ssbs_thread_switch(next);
- erratum_1418040_thread_switch(prev, next);
+ erratum_1418040_thread_switch(next);
/*
* Complete any pending TLB or cache maintenance on this CPU in case
@@ -619,6 +611,7 @@ void arch_setup_new_exec(void)
current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0;
ptrauth_thread_init_user(current);
+ erratum_1418040_new_exec();
}
#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
On Thu, Jan 27, 2022 at 1:44 PM Martin Faltesek <mfaltesek(a)google.com> wrote:
+stable@
>
> +gregkh(a)linuxfoundation.org
>
> ddbcd0c58a6a is the upstream fix for this and needs to be applied to the affected stable releases.
>
My little tool tells me that this is only needed in v5.10.y.
Guenter
> On Thu, Jan 27, 2022 at 3:21 PM Martin Faltesek <mfaltesek(a)google.com> wrote:
>>
>> Hi Bryan,
>>
>> I was merging this patch into our code base that was in 5.10.94:
>>
>> commit eeefa2eae8fc82ad757a2241b9f82ac33e99e6b4
>> Author: Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
>> AuthorDate: Fri Feb 5 19:11:49 2021 +0100
>> Commit: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
>> CommitDate: Thu Jan 27 10:53:51 2022 +0100
>>
>> media: venus: core, venc, vdec: Fix probe dependency error
>>
>>
>>
>> @@ -364,11 +365,14 @@ static int venus_remove(struct platform_device *pdev)
>> pm_runtime_disable(dev);
>>
>> if (pm_ops->core_put)
>> - pm_ops->core_put(dev);
>> + pm_ops->core_put(core);
>> +
>> + v4l2_device_unregister(&core->v4l2_dev);
>>
>> hfi_destroy(core);
>>
>> v4l2_device_unregister(&core->v4l2_dev);
>> +
>> mutex_destroy(&core->pm_lock);
>> mutex_destroy(&core->lock);
>> venus_dbgfs_deinit(core);
>>
>> This hunk introduces calling v4l2_device_unregister() twice, before and
>> after hfi_destroy().
>> I think there must be a patch that fixed this, since this error does
>> not exist in Linus's tree.
>> I'm guessing there might be a missing patch you'd want to grab.
>>
>> Can you comment on this?
>>
>> Thanks,
>>
>> Martin
In bcm_remove_op(), stopping the tasklet and the hrtimer relies on
checking their active states one after the other; the op object is freed
once both appear inactive. If the hrtimer timeout is short, the hrtimer
callback can run after the tasklet state checks (which are false
following the last round of tasklet_kill()) but before the
hrtimer_active() condition, so hrtimer_active() also evaluates to false.
The loop then ends and the op object is freed, even though the callback
has rescheduled the tasklet, and the tasklet later accesses the freed op:
a use-after-free of the op's resources.
Move hrtimer_cancel() behind tasklet_kill() and switch the
'while () {...}' loop to 'do {...} while ()' to fix the op UAF problem.
Fixes: a06393ed0316 ("can: bcm: fix hrtimer/tasklet termination in bcm op removal")
Reported-by: syzbot+5ca851459ed04c778d1d(a)syzkaller.appspotmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Ziyang Xuan <william.xuanziyang(a)huawei.com>
---
net/can/bcm.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 369326715b9c..bfb507223468 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -761,21 +761,21 @@ static struct bcm_op *bcm_find_op(struct list_head *ops,
static void bcm_remove_op(struct bcm_op *op)
{
if (op->tsklet.func) {
- while (test_bit(TASKLET_STATE_SCHED, &op->tsklet.state) ||
- test_bit(TASKLET_STATE_RUN, &op->tsklet.state) ||
- hrtimer_active(&op->timer)) {
- hrtimer_cancel(&op->timer);
+ do {
tasklet_kill(&op->tsklet);
- }
+ hrtimer_cancel(&op->timer);
+ } while (test_bit(TASKLET_STATE_SCHED, &op->tsklet.state) ||
+ test_bit(TASKLET_STATE_RUN, &op->tsklet.state) ||
+ hrtimer_active(&op->timer));
}
if (op->thrtsklet.func) {
- while (test_bit(TASKLET_STATE_SCHED, &op->thrtsklet.state) ||
- test_bit(TASKLET_STATE_RUN, &op->thrtsklet.state) ||
- hrtimer_active(&op->thrtimer)) {
- hrtimer_cancel(&op->thrtimer);
+ do {
tasklet_kill(&op->thrtsklet);
- }
+ hrtimer_cancel(&op->thrtimer);
+ } while (test_bit(TASKLET_STATE_SCHED, &op->thrtsklet.state) ||
+ test_bit(TASKLET_STATE_RUN, &op->thrtsklet.state) ||
+ hrtimer_active(&op->thrtimer));
}
if ((op->frames) && (op->frames != &op->sframe))
--
2.25.1
I'm announcing the release of the 5.4.175 kernel.
All users of the 5.4 kernel series must upgrade.
The updated 5.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
arch/arm/boot/dts/bcm283x.dtsi | 1
drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 3
drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 +
drivers/gpu/drm/i915/gt/intel_gt.c | 99 ++++++++++
drivers/gpu/drm/i915/gt/intel_gt.h | 2
drivers/gpu/drm/i915/gt/intel_gt_types.h | 2
drivers/gpu/drm/i915/i915_reg.h | 11 +
drivers/gpu/drm/i915/i915_vma.c | 4
drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 5
drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 33 +--
drivers/gpu/drm/vmwgfx/vmwgfx_fence.c | 2
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 2
drivers/mmc/host/sdhci-esdhc-imx.c | 3
drivers/pinctrl/bcm/pinctrl-bcm2835.c | 209 +++++++++++++++++++----
fs/select.c | 63 +++---
kernel/rcu/tree.c | 7
17 files changed, 366 insertions(+), 92 deletions(-)
Florian Fainelli (2):
pinctrl: bcm2835: Match BCM7211 compatible string
pinctrl: bcm2835: Add support for wake-up interrupts
Greg Kroah-Hartman (1):
Linux 5.4.175
Jan Kara (1):
select: Fix indefinitely sleeping task in poll_schedule_timeout()
Mathias Krause (1):
drm/vmwgfx: Fix stale file descriptors on failed usercopy
Paul E. McKenney (1):
rcu: Tighten rcu_advance_cbs_nowake() checks
Phil Elwell (2):
pinctrl: bcm2835: Change init order for gpio hogs
ARM: dts: gpio-ranges property is now required
Stefan Wahren (3):
pinctrl: bcm2835: Drop unused define
pinctrl: bcm2835: Refactor platform data
pinctrl: bcm2835: Add support for all GPIOs on BCM2711
Tim Harvey (1):
mmc: sdhci-esdhc-imx: disable CMDQ support
Tvrtko Ursulin (1):
drm/i915: Flush TLBs before releasing backing store
I'm announcing the release of the 4.19.227 kernel.
All users of the 4.19 kernel series must upgrade.
The updated 4.19.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.19.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
drivers/gpu/drm/i915/i915_drv.h | 2
drivers/gpu/drm/i915/i915_gem.c | 83 ++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_object.h | 1
drivers/gpu/drm/i915/i915_reg.h | 6 ++
drivers/gpu/drm/i915/i915_vma.c | 4 +
drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 5 -
drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 34 ++++++-------
drivers/gpu/drm/vmwgfx/vmwgfx_fence.c | 2
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 2
fs/select.c | 63 ++++++++++++------------
net/bridge/br_device.c | 2
12 files changed, 153 insertions(+), 53 deletions(-)
Greg Kroah-Hartman (1):
Linux 4.19.227
Jan Kara (1):
select: Fix indefinitely sleeping task in poll_schedule_timeout()
Mathias Krause (1):
drm/vmwgfx: Fix stale file descriptors on failed usercopy
Nikolay Aleksandrov (1):
net: bridge: clear bridge's private skb space on xmit
Tvrtko Ursulin (1):
drm/i915: Flush TLBs before releasing backing store
I'm announcing the release of the 4.14.264 kernel.
All users of the 4.14 kernel series must upgrade.
The updated 4.14.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.14.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
drivers/gpu/drm/i915/i915_drv.h | 2
drivers/gpu/drm/i915/i915_gem.c | 84 +++++++++++++++++++++++++++++++-
drivers/gpu/drm/i915/i915_gem_object.h | 1
drivers/gpu/drm/i915/i915_reg.h | 6 ++
drivers/gpu/drm/i915/i915_vma.c | 4 +
drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 5 -
drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 34 ++++++------
drivers/gpu/drm/vmwgfx/vmwgfx_fence.c | 2
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 2
net/can/bcm.c | 20 +++----
11 files changed, 128 insertions(+), 34 deletions(-)
Greg Kroah-Hartman (1):
Linux 4.14.264
Mathias Krause (1):
drm/vmwgfx: Fix stale file descriptors on failed usercopy
Tvrtko Ursulin (1):
drm/i915: Flush TLBs before releasing backing store
Ziyang Xuan (1):
can: bcm: fix UAF of bcm op
I'm announcing the release of the 4.9.299 kernel.
All users of the 4.9 kernel series must upgrade.
The updated 4.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/virtual/kvm/mmu.txt | 4 -
Makefile | 2
arch/arm/Kconfig.debug | 44 +++++++++-----
arch/x86/kvm/paging_tmpl.h | 44 +++++++++-----
drivers/gpu/drm/i915/i915_drv.h | 5 +
drivers/gpu/drm/i915/i915_gem.c | 72 ++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_gtt.c | 4 +
drivers/gpu/drm/i915/i915_reg.h | 6 ++
drivers/media/firewire/firedtv-avc.c | 14 +++-
drivers/media/firewire/firedtv-ci.c | 2
drivers/staging/android/ion/ion-ioctl.c | 96 ++++++++++++++++++++++++++++----
drivers/staging/android/ion/ion.c | 19 ++++--
drivers/staging/android/ion/ion.h | 4 +
drivers/staging/android/ion/ion_priv.h | 4 +
fs/nfs/nfs4client.c | 82 ++++++++++++++-------------
lib/Kconfig.debug | 6 +-
16 files changed, 310 insertions(+), 98 deletions(-)
Dan Carpenter (1):
media: firewire: firedtv-avc: fix a buffer overflow in avc_ca_pmt()
Daniel Rosenberg (2):
ion: Fix use after free during ION_IOC_ALLOC
ion: Protect kref from userspace manipulation
Greg Kroah-Hartman (1):
Linux 4.9.299
Lai Jiangshan (1):
KVM: X86: MMU: Use the correct inherited permissions to get shadow page
Lee Jones (1):
ion: Do not 'put' ION handle until after its final use
Paolo Bonzini (1):
KVM: nVMX: fix EPT permissions as reported in exit qualification
Stefan Agner (1):
ARM: 8800/1: use choice for kernel unwinders
Trond Myklebust (1):
NFSv4: Initialise connection to the server in nfs4_alloc_client()
Tvrtko Ursulin (1):
drm/i915: Flush TLBs before releasing backing store
I'm announcing the release of the 4.4.301 kernel.
All users of the 4.4 kernel series must upgrade.
The updated 4.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
drivers/gpu/drm/i915/i915_drv.h | 5 ++
drivers/gpu/drm/i915/i915_gem.c | 89 ++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +
drivers/gpu/drm/i915/i915_reg.h | 6 ++
5 files changed, 104 insertions(+), 1 deletion(-)
Greg Kroah-Hartman (1):
Linux 4.4.301
Tvrtko Ursulin (1):
drm/i915: Flush TLBs before releasing backing store
On Fri, Jan 28, 2022 at 10:17:55PM +0300, Sergey Shtylyov wrote:
> My experience tells they do.
Let's ask them:
@stable folks, do you guys take patches based only on Fixes: tags
nowadays, or do you still require CC: stable to be present in the commit
message?
> See my other mail...
Which other mail?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
This is the start of the stable review cycle for the 4.4.301 release.
There are 1 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 29 Jan 2022 18:02:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.301-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.301-rc1
Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
drm/i915: Flush TLBs before releasing backing store
-------------
Diffstat:
Makefile | 4 +-
drivers/gpu/drm/i915/i915_drv.h | 5 +++
drivers/gpu/drm/i915/i915_gem.c | 89 +++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_gtt.c | 3 ++
drivers/gpu/drm/i915/i915_reg.h | 6 +++
5 files changed, 105 insertions(+), 2 deletions(-)
This is the start of the stable review cycle for the 4.9.299 release.
There are 9 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 29 Jan 2022 18:02:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.299-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.299-rc1
Lee Jones <lee.jones(a)linaro.org>
ion: Do not 'put' ION handle until after its final use
Daniel Rosenberg <drosen(a)google.com>
ion: Protect kref from userspace manipulation
Daniel Rosenberg <drosen(a)google.com>
ion: Fix use after free during ION_IOC_ALLOC
Stefan Agner <stefan(a)agner.ch>
ARM: 8800/1: use choice for kernel unwinders
Lai Jiangshan <laijs(a)linux.alibaba.com>
KVM: X86: MMU: Use the correct inherited permissions to get shadow page
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: nVMX: fix EPT permissions as reported in exit qualification
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFSv4: Initialise connection to the server in nfs4_alloc_client()
Dan Carpenter <dan.carpenter(a)oracle.com>
media: firewire: firedtv-avc: fix a buffer overflow in avc_ca_pmt()
Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
drm/i915: Flush TLBs before releasing backing store
-------------
Diffstat:
Documentation/virtual/kvm/mmu.txt | 4 +-
Makefile | 4 +-
arch/arm/Kconfig.debug | 44 +++++++++------
arch/x86/kvm/paging_tmpl.h | 44 +++++++++------
drivers/gpu/drm/i915/i915_drv.h | 5 +-
drivers/gpu/drm/i915/i915_gem.c | 72 +++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_gtt.c | 4 ++
drivers/gpu/drm/i915/i915_reg.h | 6 +++
drivers/media/firewire/firedtv-avc.c | 14 +++--
drivers/media/firewire/firedtv-ci.c | 2 +
drivers/staging/android/ion/ion-ioctl.c | 96 +++++++++++++++++++++++++++++----
drivers/staging/android/ion/ion.c | 19 +++++--
drivers/staging/android/ion/ion.h | 4 ++
drivers/staging/android/ion/ion_priv.h | 4 ++
fs/nfs/nfs4client.c | 82 ++++++++++++++--------------
lib/Kconfig.debug | 6 +--
16 files changed, 311 insertions(+), 99 deletions(-)
When HCE (Host Controller Error) is set, it means an internal
error condition has been detected. Software needs to re-initialize
the HC, so add this check in xhci resume.
Cc: stable(a)vger.kernel.org
Signed-off-by: Puma Hsu <pumahsu(a)google.com>
---
v2: Follow Sergey Shtylyov <s.shtylyov(a)omp.ru>'s comment.
v3: Add stable(a)vger.kernel.org for stable release.
v4: Refine the commit message.
v5: Add a debug log. Follow Mathias Nyman <mathias.nyman(a)linux.intel.com>'s comment.
drivers/usb/host/xhci.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index dc357cabb265..41f594f0f73f 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1146,8 +1146,10 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
temp = readl(&xhci->op_regs->status);
}
- /* If restore operation fails, re-initialize the HC during resume */
- if ((temp & STS_SRE) || hibernated) {
+ /* If restore operation fails or HC error is detected, re-initialize the HC during resume */
+ if ((temp & (STS_SRE | STS_HCE)) || hibernated) {
+ xhci_warn(xhci, "re-initialize HC during resume, USBSTS:%s\n",
+ xhci_decode_usbsts(str, temp));
if ((xhci->quirks & XHCI_COMP_MODE_QUIRK) &&
!(xhci_all_ports_seen_u0(xhci))) {
--
2.34.1.703.g22d0c6ccf7-goog
This is the start of the stable review cycle for the 4.19.227 release.
There are 3 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 29 Jan 2022 18:02:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.227-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.227-rc1
Jan Kara <jack(a)suse.cz>
select: Fix indefinitely sleeping task in poll_schedule_timeout()
Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
net: bridge: clear bridge's private skb space on xmit
Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
drm/i915: Flush TLBs before releasing backing store
-------------
Diffstat:
Makefile | 4 +-
drivers/gpu/drm/i915/i915_drv.h | 2 +
drivers/gpu/drm/i915/i915_gem.c | 83 ++++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_object.h | 1 +
drivers/gpu/drm/i915/i915_reg.h | 6 +++
drivers/gpu/drm/i915/i915_vma.c | 4 ++
fs/select.c | 63 ++++++++++++++------------
net/bridge/br_device.c | 2 +
8 files changed, 133 insertions(+), 32 deletions(-)
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the
first argument to execve(2) be the name of a program, thus prohibiting
a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is
associated with the process being started by one of the exec
functions.
...
Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
but there was no consensus to support fixing this issue then.
Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows
mostly test code, or example rootkit code. While rejecting a NULL argv
would be preferred, it looks like the main cause of userspace confusion
is an assumption that argc >= 1, and buggy programs may skip argv[0]
when iterating. To protect against userspace bugs of this nature, insert
an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace
programs expect to find envp at exactly argv[argc]. The overlap of these
two misguided assumptions is believed to be zero.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
[2] https://bugzilla.kernel.org/show_bug.cgi?id=8408
[3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
[4] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+…
Reported-by: Ariadne Conill <ariadne(a)dereferenced.org>
Reported-by: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Rich Felker <dalias(a)libc.org>
Cc: Eric Biederman <ebiederm(a)xmission.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
fs/binfmt_elf.c | 10 +++++++++-
fs/exec.c | 7 ++++++-
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 605017eb9349..e456c48658ad 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -297,7 +297,8 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
ei_index = elf_info - (elf_addr_t *)mm->saved_auxv;
sp = STACK_ADD(p, ei_index);
- items = (argc + 1) + (envc + 1) + 1;
+ /* Make room for extra pointer when argc == 0. See below. */
+ items = (min(argc, 1) + 1) + (envc + 1) + 1;
bprm->p = STACK_ROUND(sp, items);
/* Point sp at the lowest address on the stack */
@@ -326,6 +327,13 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
/* Populate list of argv pointers back to argv strings. */
p = mm->arg_end = mm->arg_start;
+ /*
+ * Include an extra NULL pointer in argv when argc == 0 so
+ * that argv[1] != envp[0] to help userspace programs from
+ * mishandling argc == 0. See fs/exec.c bprm_stack_limits().
+ */
+ if (argc == 0 && put_user(0, sp++))
+ return -EFAULT;
while (argc-- > 0) {
size_t len;
if (put_user((elf_addr_t)p, sp++))
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..0b36384e55b1 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -495,8 +495,13 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
* the stack. They aren't stored until much later when we can't
* signal to the parent that the child has run out of stack space.
* Instead, calculate it here so it's possible to fail gracefully.
+ *
+ * In the case of argc < 1, make sure there is a NULL pointer gap
+ * between argv and envp to ensure confused userspace programs don't
+ * start processing from argv[1], thinking argc can never be 0,
+ * to block them from walking envp by accident. See fs/binfmt_elf.c.
*/
- ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
+ ptr_size = (min(bprm->argc, 1) + bprm->envc) * sizeof(void *);
if (limit <= ptr_size)
return -E2BIG;
limit -= ptr_size;
--
2.30.2
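To illustrate the userspace condition the patch hardens against, here is a
small self-contained test program (the target path is just a placeholder,
not taken from the patch) that performs an execve() with argc == 0:

  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
          char *argv[] = { NULL };             /* argc == 0: no argv[0] */
          char *envp[] = { "X=1", NULL };

          /* CVE-2021-4034 abused a setuid helper invoked like this.  With
           * the patch applied the callee still sees argc == 0, but an extra
           * NULL is placed so that argv[1] != envp[0]; buggy programs that
           * start iterating at argv[1] no longer walk into the environment. */
          execve("/usr/bin/true", argv, envp);
          perror("execve");
          return 1;
  }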
From: Peter Collingbourne <pcc(a)google.com>
Subject: mm, kasan: use compare-exchange operation to set KASAN page tag
It has been reported that the tag setting operation on newly-allocated
pages can cause the page flags to be corrupted when performed concurrently
with other flag updates as a result of the use of non-atomic operations.
Fix the problem by using a compare-exchange loop to update the tag.
Link: https://lkml.kernel.org/r/20220120020148.1632253-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0c…
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
--- a/include/linux/mm.h~mm-use-compare-exchange-operation-to-set-kasan-page-tag
+++ a/include/linux/mm.h
@@ -1506,11 +1506,18 @@ static inline u8 page_kasan_tag(const st
static inline void page_kasan_tag_set(struct page *page, u8 tag)
{
- if (kasan_enabled()) {
- tag ^= 0xff;
- page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
- page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
- }
+ unsigned long old_flags, flags;
+
+ if (!kasan_enabled())
+ return;
+
+ tag ^= 0xff;
+ old_flags = READ_ONCE(page->flags);
+ do {
+ flags = old_flags;
+ flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
+ flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
+ } while (unlikely(!try_cmpxchg(&page->flags, &old_flags, flags)));
}
static inline void page_kasan_tag_reset(struct page *page)
_
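The same pattern can be shown in isolation with C11 atomics (the tag layout
below is illustrative only; the kernel variant uses KASAN_TAG_MASK /
KASAN_TAG_PGSHIFT and try_cmpxchg() on page->flags):

  #include <stdatomic.h>

  #define TAG_SHIFT 56
  #define TAG_MASK  0xffUL

  /* Update only the tag bits of *flags without losing concurrent updates
   * to the other bits: retry the compare-exchange until it succeeds. */
  void set_tag(_Atomic unsigned long *flags, unsigned long tag)
  {
          unsigned long old = atomic_load_explicit(flags, memory_order_relaxed);
          unsigned long new;

          do {
                  new = old & ~(TAG_MASK << TAG_SHIFT);
                  new |= (tag & TAG_MASK) << TAG_SHIFT;
          } while (!atomic_compare_exchange_weak(flags, &old, new));
  }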
This is the start of the stable review cycle for the 5.16.4 release.
There are 9 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 29 Jan 2022 18:02:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.16.4-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.16.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.16.4-rc1
Russell King <russell.king(a)oracle.com>
arm64/bpf: Remove 128MB limit for BPF JIT programs
Jan Kara <jack(a)suse.cz>
select: Fix indefinitely sleeping task in poll_schedule_timeout()
Paul E. McKenney <paulmck(a)kernel.org>
rcu: Tighten rcu_advance_cbs_nowake() checks
Shakeel Butt <shakeelb(a)google.com>
memcg: better bounds on the memcg stats updates
Manish Chopra <manishc(a)marvell.com>
bnx2x: Invalidate fastpath HSI version for VFs
Manish Chopra <manishc(a)marvell.com>
bnx2x: Utilize firmware 7.13.21.0
Pavel Begunkov <asml.silence(a)gmail.com>
io_uring: fix not released cached task refs
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amd/display: reset dcn31 SMU mailbox on failures
Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
drm/i915: Flush TLBs before releasing backing store
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/include/asm/extable.h | 9 --
arch/arm64/include/asm/memory.h | 5 +-
arch/arm64/kernel/traps.c | 2 +-
arch/arm64/mm/ptdump.c | 2 -
arch/arm64/net/bpf_jit_comp.c | 7 +-
.../drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c | 6 ++
drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 1 +
drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 ++
drivers/gpu/drm/i915/gt/intel_gt.c | 102 +++++++++++++++++++++
drivers/gpu/drm/i915/gt/intel_gt.h | 2 +
drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 +
drivers/gpu/drm/i915/i915_reg.h | 11 +++
drivers/gpu/drm/i915/i915_vma.c | 3 +
drivers/gpu/drm/i915/intel_uncore.c | 26 +++++-
drivers/gpu/drm/i915/intel_uncore.h | 2 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x.h | 11 ++-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 6 +-
.../net/ethernet/broadcom/bnx2x/bnx2x_fw_defs.h | 2 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h | 3 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 75 ++++++++++-----
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 13 ++-
fs/io_uring.c | 34 ++++---
fs/select.c | 63 +++++++------
kernel/rcu/tree.c | 7 +-
mm/memcontrol.c | 20 ++--
26 files changed, 318 insertions(+), 110 deletions(-)
This is the start of the stable review cycle for the 5.15.18 release.
There are 12 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 29 Jan 2022 18:02:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.18-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.18-rc1
Russell King <russell.king(a)oracle.com>
arm64/bpf: Remove 128MB limit for BPF JIT programs
Harry Wentland <harry.wentland(a)amd.com>
drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2
Jan Kara <jack(a)suse.cz>
select: Fix indefinitely sleeping task in poll_schedule_timeout()
Paul E. McKenney <paulmck(a)kernel.org>
rcu: Tighten rcu_advance_cbs_nowake() checks
Shakeel Butt <shakeelb(a)google.com>
memcg: better bounds on the memcg stats updates
Shakeel Butt <shakeelb(a)google.com>
memcg: unify memcg stat flushing
Shakeel Butt <shakeelb(a)google.com>
memcg: flush stats only if updated
Manish Chopra <manishc(a)marvell.com>
bnx2x: Invalidate fastpath HSI version for VFs
Manish Chopra <manishc(a)marvell.com>
bnx2x: Utilize firmware 7.13.21.0
Pavel Begunkov <asml.silence(a)gmail.com>
io_uring: fix not released cached task refs
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amd/display: reset dcn31 SMU mailbox on failures
Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
drm/i915: Flush TLBs before releasing backing store
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/include/asm/extable.h | 9 --
arch/arm64/include/asm/memory.h | 5 +-
arch/arm64/kernel/traps.c | 2 +-
arch/arm64/mm/extable.c | 13 ++-
arch/arm64/mm/ptdump.c | 2 -
arch/arm64/net/bpf_jit_comp.c | 7 +-
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 14 ++-
.../drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c | 6 ++
drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 1 +
drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 ++
drivers/gpu/drm/i915/gt/intel_gt.c | 102 +++++++++++++++++++++
drivers/gpu/drm/i915/gt/intel_gt.h | 2 +
drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 +
drivers/gpu/drm/i915/i915_reg.h | 11 +++
drivers/gpu/drm/i915/i915_vma.c | 3 +
drivers/gpu/drm/i915/intel_uncore.c | 26 +++++-
drivers/gpu/drm/i915/intel_uncore.h | 2 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x.h | 11 ++-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 6 +-
.../net/ethernet/broadcom/bnx2x/bnx2x_fw_defs.h | 2 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h | 3 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 75 ++++++++++-----
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 13 ++-
fs/io_uring.c | 34 ++++---
fs/select.c | 63 +++++++------
kernel/rcu/tree.c | 7 +-
mm/memcontrol.c | 99 ++++++++++++++------
28 files changed, 396 insertions(+), 138 deletions(-)
This is the start of the stable review cycle for the 5.10.95 release.
There are 6 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 29 Jan 2022 18:02:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.95-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.95-rc1
Jan Kara <jack(a)suse.cz>
select: Fix indefinitely sleeping task in poll_schedule_timeout()
David Matlack <dmatlack(a)google.com>
KVM: x86/mmu: Fix write-protection of PTs mapped by the TDP MMU
Paul E. McKenney <paulmck(a)kernel.org>
rcu: Tighten rcu_advance_cbs_nowake() checks
Manish Chopra <manishc(a)marvell.com>
bnx2x: Invalidate fastpath HSI version for VFs
Manish Chopra <manishc(a)marvell.com>
bnx2x: Utilize firmware 7.13.21.0
Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
drm/i915: Flush TLBs before releasing backing store
-------------
Diffstat:
Makefile | 4 +-
arch/x86/kvm/mmu/tdp_mmu.c | 6 +-
drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 1 +
drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 ++
drivers/gpu/drm/i915/gt/intel_gt.c | 102 +++++++++++++++++++++
drivers/gpu/drm/i915/gt/intel_gt.h | 2 +
drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 +
drivers/gpu/drm/i915/i915_reg.h | 11 +++
drivers/gpu/drm/i915/i915_vma.c | 3 +
drivers/gpu/drm/i915/intel_uncore.c | 26 +++++-
drivers/gpu/drm/i915/intel_uncore.h | 2 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x.h | 11 ++-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 6 +-
.../net/ethernet/broadcom/bnx2x/bnx2x_fw_defs.h | 2 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h | 3 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 75 ++++++++++-----
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 13 ++-
fs/select.c | 63 +++++++------
kernel/rcu/tree.c | 7 +-
19 files changed, 277 insertions(+), 72 deletions(-)
This is the start of the stable review cycle for the 5.4.175 release.
There are 11 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 29 Jan 2022 18:02:51 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.175-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.175-rc1
Jan Kara <jack(a)suse.cz>
select: Fix indefinitely sleeping task in poll_schedule_timeout()
Tim Harvey <tharvey(a)gateworks.com>
mmc: sdhci-esdhc-imx: disable CMDQ support
Phil Elwell <phil(a)raspberrypi.com>
ARM: dts: gpio-ranges property is now required
Phil Elwell <phil(a)raspberrypi.com>
pinctrl: bcm2835: Change init order for gpio hogs
Florian Fainelli <f.fainelli(a)gmail.com>
pinctrl: bcm2835: Add support for wake-up interrupts
Florian Fainelli <f.fainelli(a)gmail.com>
pinctrl: bcm2835: Match BCM7211 compatible string
Stefan Wahren <stefan.wahren(a)i2se.com>
pinctrl: bcm2835: Add support for all GPIOs on BCM2711
Stefan Wahren <stefan.wahren(a)i2se.com>
pinctrl: bcm2835: Refactor platform data
Stefan Wahren <stefan.wahren(a)i2se.com>
pinctrl: bcm2835: Drop unused define
Paul E. McKenney <paulmck(a)kernel.org>
rcu: Tighten rcu_advance_cbs_nowake() checks
Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
drm/i915: Flush TLBs before releasing backing store
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/bcm283x.dtsi | 1 +
drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 3 +
drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 ++
drivers/gpu/drm/i915/gt/intel_gt.c | 99 +++++++++++
drivers/gpu/drm/i915/gt/intel_gt.h | 2 +
drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 +
drivers/gpu/drm/i915/i915_reg.h | 11 ++
drivers/gpu/drm/i915/i915_vma.c | 4 +
drivers/mmc/host/sdhci-esdhc-imx.c | 3 +-
drivers/pinctrl/bcm/pinctrl-bcm2835.c | 209 +++++++++++++++++++----
fs/select.c | 63 +++----
kernel/rcu/tree.c | 7 +-
13 files changed, 346 insertions(+), 72 deletions(-)
From: D Scott Phillips <scott(a)os.amperecomputing.com>
commit 38e0257e0e6f4fef2aa2966b089b56a8b1cfb75c upstream.
The erratum 1418040 workaround enables CNTVCT_EL1 access trapping in EL0
when executing compat threads. The workaround is applied when switching
between tasks, but the need for the workaround could also change at an
exec(), when a non-compat task execs a compat binary or vice versa. Apply
the workaround in arch_setup_new_exec().
This leaves a small window of time between SET_PERSONALITY and
arch_setup_new_exec where preemption could occur and confuse the old
workaround logic that compares TIF_32BIT between prev and next. Instead, we
can just read cntkctl to make sure it's in the state that the next task
needs. I measured cntkctl read time to be about the same as a mov from a
general-purpose register on N1. Update the workaround logic to examine the
current value of cntkctl instead of the previous task's compat state.
Fixes: d49f7d7376d0 ("arm64: Move handling of erratum 1418040 into C code")
Cc: <stable(a)vger.kernel.org> # 5.9.x
Signed-off-by: D Scott Phillips <scott(a)os.amperecomputing.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
---
arch/arm64/kernel/process.c | 39 +++++++++++++++----------------------
1 file changed, 16 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index f61ef46ebff7..d07fbc21f14c 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -500,34 +500,26 @@ static void entry_task_switch(struct task_struct *next)
/*
* ARM erratum 1418040 handling, affecting the 32bit view of CNTVCT.
- * Assuming the virtual counter is enabled at the beginning of times:
- *
- * - disable access when switching from a 64bit task to a 32bit task
- * - enable access when switching from a 32bit task to a 64bit task
+ * Ensure access is disabled when switching to a 32bit task, ensure
+ * access is enabled when switching to a 64bit task.
*/
-static void erratum_1418040_thread_switch(struct task_struct *prev,
- struct task_struct *next)
+static void erratum_1418040_thread_switch(struct task_struct *next)
{
- bool prev32, next32;
- u64 val;
-
- if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
- return;
-
- prev32 = is_compat_thread(task_thread_info(prev));
- next32 = is_compat_thread(task_thread_info(next));
-
- if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
+ if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) ||
+ !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
return;
- val = read_sysreg(cntkctl_el1);
-
- if (!next32)
- val |= ARCH_TIMER_USR_VCT_ACCESS_EN;
+ if (is_compat_thread(task_thread_info(next)))
+ sysreg_clear_set(cntkctl_el1, ARCH_TIMER_USR_VCT_ACCESS_EN, 0);
else
- val &= ~ARCH_TIMER_USR_VCT_ACCESS_EN;
+ sysreg_clear_set(cntkctl_el1, 0, ARCH_TIMER_USR_VCT_ACCESS_EN);
+}
- write_sysreg(val, cntkctl_el1);
+static void erratum_1418040_new_exec(void)
+{
+ preempt_disable();
+ erratum_1418040_thread_switch(current);
+ preempt_enable();
}
/*
@@ -546,7 +538,7 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
uao_thread_switch(next);
ptrauth_thread_switch(next);
ssbs_thread_switch(next);
- erratum_1418040_thread_switch(prev, next);
+ erratum_1418040_thread_switch(next);
/*
* Complete any pending TLB or cache maintenance on this CPU in case
@@ -605,6 +597,7 @@ void arch_setup_new_exec(void)
current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0;
ptrauth_thread_init_user(current);
+ erratum_1418040_new_exec();
}
#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
From: Tom Zanussi <zanussi(a)kernel.org>
During expression parsing, a new expression field is created which
should inherit the properties of the operands, such as size and
is_signed.
is_signed propagation was missing, causing spurious errors with signed
operands. Add it in parse_expr() and parse_unary() to fix the problem.
Link: https://lkml.kernel.org/r/f4dac08742fd7a0920bf80a73c6c44042f5eaa40.16433197…
Cc: stable(a)vger.kernel.org
Fixes: 100719dcef447 ("tracing: Add simple expression support to hist triggers")
Reported-by: Yordan Karadzhov <ykaradzhov(a)vmware.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215513
Signed-off-by: Tom Zanussi <zanussi(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_events_hist.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index b894d68082ea..ada87bfb5bb8 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -2503,6 +2503,8 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data,
(HIST_FIELD_FL_TIMESTAMP | HIST_FIELD_FL_TIMESTAMP_USECS);
expr->fn = hist_field_unary_minus;
expr->operands[0] = operand1;
+ expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = FIELD_OP_UNARY_MINUS;
expr->name = expr_str(expr, 0);
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
@@ -2719,6 +2721,7 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
/* The operand sizes should be the same, so just pick one */
expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = field_op;
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
--
2.33.0
From: Xiaoke Wang <xkernel.wang(a)foxmail.com>
kfree() is missing on an error path to free the memory allocated by
kstrdup():
p = param = kstrdup(data->params[i], GFP_KERNEL);
So it is better to free it via kfree(p).
Link: https://lkml.kernel.org/r/tencent_C52895FD37802832A3E5B272D05008866F0A@qq.c…
Cc: stable(a)vger.kernel.org
Fixes: d380dcde9a07c ("tracing: Fix now invalid var_ref_vals assumption in trace action")
Signed-off-by: Xiaoke Wang <xkernel.wang(a)foxmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_events_hist.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 5e6a988a8a51..cd9610688ddc 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -3935,6 +3935,7 @@ static int trace_action_create(struct hist_trigger_data *hist_data,
var_ref_idx = find_var_ref_idx(hist_data, var_ref);
if (WARN_ON(var_ref_idx < 0)) {
+ kfree(p);
ret = var_ref_idx;
goto err;
}
--
2.33.0
Hello Greg,
5.16.4-rc1
compiles, boots and runs on my x86_64
(Intel i5-11400, Fedora 35)
Thanks
Tested-by: Ronald Warsow <rwarsow(a)gmx.de>
regards
Ronald
We observed an issue with the NXP 5.15 LTS kernel where dma_alloc_coherent()
may sometimes fail when there are multiple processes trying to allocate
CMA memory.
This issue can be reproduced very easily on an MX6Q SDB board with the
latest linux-next kernel by writing a test module that creates 16 or 32
threads allocating random sizes of CMA memory in parallel in the
background.
Alternatively, simply enabling CONFIG_CMA_DEBUG lets you see an endless
stream of CMA alloc retries during boot:
[ 1.452124] cma: cma_alloc(): memory range at (ptrval) is busy,retrying
....
(thousands of reties)
NOTE: MX6 has CONFIG_FORCE_MAX_ZONEORDER=14 which means MAX_ORDER is
13 (32M).
The root cause of this issue is that since commit a4efc174b382
("mm/cma.c: remove redundant cma_mutex lock"), CMA supports concurrent
memory allocation.
It's possible that the pageblock process A is trying to allocate from
has already been isolated by an allocation from process B during memory
migration. When there are multiple processes allocating CMA memory in
parallel, it's likely that the remaining pageblocks have also been
isolated, so the CMA allocation finally fails during the first round of
scanning of the whole available CMA bitmap.
This patchset introduces a retry mechanism that rescans the CMA bitmap
on -EBUSY errors, in case the target pageblock has only been temporarily
isolated by others and is released later.
It also improves CMA allocation performance by trying the next pageblock
during retries rather than looping on the same pageblock while it is in
the -EBUSY state.
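In pseudo-C, the allocation flow being proposed looks roughly like the
sketch below (helper names such as find_free_range()/try_alloc_range() are
hypothetical stand-ins, not the actual mm/cma.c interfaces):

  struct page *cma_alloc_sketch(struct cma_ctx *cma, unsigned long count)
  {
          unsigned long start = 0, bit;
          bool saw_busy = false, rescanned = false;
          int ret;

          for (;;) {
                  bit = find_free_range(cma, start, count);     /* hypothetical */
                  if (bit >= cma->bitmap_bits) {
                          if (saw_busy && !rescanned) {
                                  /* Every candidate hit -EBUSY: the ranges may
                                   * only be temporarily isolated by concurrent
                                   * allocations, so rescan the bitmap once more
                                   * instead of failing outright. */
                                  start = 0;
                                  saw_busy = false;
                                  rescanned = true;
                                  continue;
                          }
                          return NULL;                          /* truly exhausted */
                  }

                  ret = try_alloc_range(cma, bit, count);       /* hypothetical */
                  if (!ret)
                          return range_to_page(cma, bit);       /* hypothetical */

                  if (ret == -EBUSY) {
                          /* Temporarily isolated by someone else: remember it
                           * and jump to the next pageblock rather than spinning
                           * on this one. */
                          saw_busy = true;
                          start = bit + cma->pages_per_block;
                          continue;
                  }

                  start = bit + count;                          /* other failure */
          }
  }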
Theoretically, this issue can be easily reproduced on ARMv7 platforms
with a big MAX_ORDER/pageblock, e.g. an ARM platform with 1G RAM (320M
reserved for CMA) and 32M pageblocks:
Page block order: 13
Pages per block: 8192
The following test is based on linux-next: next-20211213.
Without the fix, it easily fails.
# insmod cma_alloc.ko pnum=16
[ 274.322369] CMA alloc test enter: thread number: 16
[ 274.329948] cpu: 0, pid: 692, index 4 pages 144
[ 274.330143] cpu: 1, pid: 694, index 2 pages 44
[ 274.330359] cpu: 2, pid: 695, index 7 pages 757
[ 274.330760] cpu: 2, pid: 696, index 4 pages 144
[ 274.330974] cpu: 2, pid: 697, index 6 pages 512
[ 274.331223] cpu: 2, pid: 698, index 6 pages 512
[ 274.331499] cpu: 2, pid: 699, index 2 pages 44
[ 274.332228] cpu: 2, pid: 700, index 0 pages 7
[ 274.337421] cpu: 0, pid: 701, index 1 pages 38
[ 274.337618] cpu: 2, pid: 702, index 0 pages 7
[ 274.344669] cpu: 1, pid: 703, index 0 pages 7
[ 274.344807] cpu: 3, pid: 704, index 6 pages 512
[ 274.348269] cpu: 2, pid: 705, index 5 pages 148
[ 274.349490] cma: cma_alloc: reserved: alloc failed, req-size: 38 pages, ret: -16
[ 274.366292] cpu: 1, pid: 706, index 4 pages 144
[ 274.366562] cpu: 0, pid: 707, index 3 pages 128
[ 274.367356] cma: cma_alloc: reserved: alloc failed, req-size: 128 pages, ret: -16
[ 274.367370] cpu: 0, pid: 707, index 3 pages 128 failed
[ 274.371148] cma: cma_alloc: reserved: alloc failed, req-size: 148 pages, ret: -16
[ 274.375348] cma: cma_alloc: reserved: alloc failed, req-size: 144 pages, ret: -16
[ 274.384256] cpu: 2, pid: 708, index 0 pages 7
....
With the fix, 32 threads allocating in parallel can pass an overnight
stress test.
root@imx6qpdlsolox:~# insmod cma_alloc.ko pnum=32
[ 112.976809] cma_alloc: loading out-of-tree module taints kernel.
[ 112.984128] CMA alloc test enter: thread number: 32
[ 112.989748] cpu: 2, pid: 707, index 6 pages 512
[ 112.994342] cpu: 1, pid: 708, index 6 pages 512
[ 112.995162] cpu: 0, pid: 709, index 3 pages 128
[ 112.995867] cpu: 2, pid: 710, index 0 pages 7
[ 112.995910] cpu: 3, pid: 711, index 2 pages 44
[ 112.996005] cpu: 3, pid: 712, index 7 pages 757
[ 112.996098] cpu: 3, pid: 713, index 7 pages 757
...
[41877.368163] cpu: 1, pid: 737, index 2 pages 44
[41877.369388] cpu: 1, pid: 736, index 3 pages 128
[41878.486516] cpu: 0, pid: 737, index 2 pages 44
[41878.486515] cpu: 2, pid: 739, index 4 pages 144
[41878.486622] cpu: 1, pid: 736, index 3 pages 128
[41878.486948] cpu: 2, pid: 735, index 7 pages 757
[41878.487279] cpu: 2, pid: 738, index 4 pages 144
[41879.526603] cpu: 1, pid: 739, index 3 pages 128
[41879.606491] cpu: 2, pid: 737, index 3 pages 128
[41879.606550] cpu: 0, pid: 736, index 0 pages 7
[41879.612271] cpu: 2, pid: 738, index 4 pages 144
...
v1:
https://patchwork.kernel.org/project/linux-mm/cover/20211215080242.3034856-…
Dong Aisheng (2):
mm: cma: fix allocation may fail sometimes
mm: cma: try next MAX_ORDER_NR_PAGES during retry
mm/cma.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
--
2.25.1