From: Kan Liang <kan.liang(a)linux.intel.com>
This reverts commit 01330d7288e0 ("perf/x86: Allow zero PEBS status with
only single active event")
A repeatable crash can be triggered by the perf_fuzzer on some Haswell
system.
https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/
For some old CPUs (HSW and earlier), the PEBS status in a PEBS record
may be mistakenly set to 0. To minimize the impact of the defect, the
commit was introduced to try to avoid dropping the PEBS record for some
cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates
the local pebs_status accordingly. However, it doesn't correct the PEBS
status in the PEBS record, which may trigger the crash, especially for
the large PEBS.
It's possible that all the PEBS records in a large PEBS have the PEBS
status 0. If so, the first get_next_pebs_record_by_bit() in the
__intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large
PEBS, the 'count' parameter must > 1. The second
get_next_pebs_record_by_bit() will crash.
Two solutions were considered to fix the crash.
- Keep the SW workaround and add extra checks in the
get_next_pebs_record_by_bit() to workaround the issue. The
get_next_pebs_record_by_bit() is a critical path. The extra checks
will bring extra overhead for the latest CPUs which don't have the
defect. Also, the defect can only be observed on some old CPUs
(For example, the issue can be reproduced on an HSW client, but I
didn't observe the issue on my Haswell server machine.). The impact
of the defect should be limit.
This solution is dropped.
- Drop the SW workaround and revert the commit.
It seems that the commit never works, because the PEBS status in the
PEBS record never be changed. The get_next_pebs_record_by_bit() only
checks the PEBS status in the PEBS record. The record is dropped
eventually. Reverting the commit should not change the current
behavior.
Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event")
Reported-by: Vince Weaver <vincent.weaver(a)maine.edu>
Signed-off-by: Kan Liang <kan.liang(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/events/intel/ds.c | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7ebae18..9c90d1e 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
continue;
}
- /*
- * On some CPUs the PEBS status can be zero when PEBS is
- * racing with clearing of GLOBAL_STATUS.
- *
- * Normally we would drop that record, but in the
- * case when there is only a single active PEBS event
- * we can assume it's for that event.
- */
- if (!pebs_status && cpuc->pebs_enabled &&
- !(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
- pebs_status = cpuc->pebs_enabled;
-
bit = find_first_bit((unsigned long *)&pebs_status,
x86_pmu.max_pebs_events);
if (bit >= x86_pmu.max_pebs_events)
--
2.7.4
From: Heiko Thiery <heiko.thiery(a)gmail.com>
[ Upstream commit 6a4d7234ae9a3bb31181f348ade9bbdb55aeb5c5 ]
When accessing the timecounter register on an i.MX8MQ the kernel hangs.
This is only the case when the interface is down. This can be reproduced
by reading with 'phc_ctrl eth0 get'.
Like described in the change in 91c0d987a9788dcc5fe26baafd73bf9242b68900
the igp clock is disabled when the interface is down and leads to a
system hang.
So we check if the ptp clock status before reading the timecounter
register.
Signed-off-by: Heiko Thiery <heiko.thiery(a)gmail.com>
Acked-by: Richard Cochran <richardcochran(a)gmail.com>
Link: https://lore.kernel.org/r/20210225211514.9115-1-heiko.thiery@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ethernet/freescale/fec_ptp.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index f9e74461bdc0..123181612595 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -396,9 +396,16 @@ static int fec_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
u64 ns;
unsigned long flags;
+ mutex_lock(&adapter->ptp_clk_mutex);
+ /* Check the ptp clock */
+ if (!adapter->ptp_clk_on) {
+ mutex_unlock(&adapter->ptp_clk_mutex);
+ return -EINVAL;
+ }
spin_lock_irqsave(&adapter->tmreg_lock, flags);
ns = timecounter_read(&adapter->tc);
spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+ mutex_unlock(&adapter->ptp_clk_mutex);
*ts = ns_to_timespec64(ns);
--
2.30.1
From: Heiko Thiery <heiko.thiery(a)gmail.com>
[ Upstream commit 6a4d7234ae9a3bb31181f348ade9bbdb55aeb5c5 ]
When accessing the timecounter register on an i.MX8MQ the kernel hangs.
This is only the case when the interface is down. This can be reproduced
by reading with 'phc_ctrl eth0 get'.
Like described in the change in 91c0d987a9788dcc5fe26baafd73bf9242b68900
the igp clock is disabled when the interface is down and leads to a
system hang.
So we check if the ptp clock status before reading the timecounter
register.
Signed-off-by: Heiko Thiery <heiko.thiery(a)gmail.com>
Acked-by: Richard Cochran <richardcochran(a)gmail.com>
Link: https://lore.kernel.org/r/20210225211514.9115-1-heiko.thiery@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ethernet/freescale/fec_ptp.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index f9e74461bdc0..123181612595 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -396,9 +396,16 @@ static int fec_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
u64 ns;
unsigned long flags;
+ mutex_lock(&adapter->ptp_clk_mutex);
+ /* Check the ptp clock */
+ if (!adapter->ptp_clk_on) {
+ mutex_unlock(&adapter->ptp_clk_mutex);
+ return -EINVAL;
+ }
spin_lock_irqsave(&adapter->tmreg_lock, flags);
ns = timecounter_read(&adapter->tc);
spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+ mutex_unlock(&adapter->ptp_clk_mutex);
*ts = ns_to_timespec64(ns);
--
2.30.1
From: Heiko Thiery <heiko.thiery(a)gmail.com>
[ Upstream commit 6a4d7234ae9a3bb31181f348ade9bbdb55aeb5c5 ]
When accessing the timecounter register on an i.MX8MQ the kernel hangs.
This is only the case when the interface is down. This can be reproduced
by reading with 'phc_ctrl eth0 get'.
Like described in the change in 91c0d987a9788dcc5fe26baafd73bf9242b68900
the igp clock is disabled when the interface is down and leads to a
system hang.
So we check if the ptp clock status before reading the timecounter
register.
Signed-off-by: Heiko Thiery <heiko.thiery(a)gmail.com>
Acked-by: Richard Cochran <richardcochran(a)gmail.com>
Link: https://lore.kernel.org/r/20210225211514.9115-1-heiko.thiery@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ethernet/freescale/fec_ptp.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index 6ebad3fac81d..e63df6455fba 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -396,9 +396,16 @@ static int fec_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
u64 ns;
unsigned long flags;
+ mutex_lock(&adapter->ptp_clk_mutex);
+ /* Check the ptp clock */
+ if (!adapter->ptp_clk_on) {
+ mutex_unlock(&adapter->ptp_clk_mutex);
+ return -EINVAL;
+ }
spin_lock_irqsave(&adapter->tmreg_lock, flags);
ns = timecounter_read(&adapter->tc);
spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+ mutex_unlock(&adapter->ptp_clk_mutex);
*ts = ns_to_timespec64(ns);
--
2.30.1
From: Heiko Thiery <heiko.thiery(a)gmail.com>
[ Upstream commit 6a4d7234ae9a3bb31181f348ade9bbdb55aeb5c5 ]
When accessing the timecounter register on an i.MX8MQ the kernel hangs.
This is only the case when the interface is down. This can be reproduced
by reading with 'phc_ctrl eth0 get'.
Like described in the change in 91c0d987a9788dcc5fe26baafd73bf9242b68900
the igp clock is disabled when the interface is down and leads to a
system hang.
So we check if the ptp clock status before reading the timecounter
register.
Signed-off-by: Heiko Thiery <heiko.thiery(a)gmail.com>
Acked-by: Richard Cochran <richardcochran(a)gmail.com>
Link: https://lore.kernel.org/r/20210225211514.9115-1-heiko.thiery@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ethernet/freescale/fec_ptp.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index 7e892b1cbd3d..09a762eb4f09 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -382,9 +382,16 @@ static int fec_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
u64 ns;
unsigned long flags;
+ mutex_lock(&adapter->ptp_clk_mutex);
+ /* Check the ptp clock */
+ if (!adapter->ptp_clk_on) {
+ mutex_unlock(&adapter->ptp_clk_mutex);
+ return -EINVAL;
+ }
spin_lock_irqsave(&adapter->tmreg_lock, flags);
ns = timecounter_read(&adapter->tc);
spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+ mutex_unlock(&adapter->ptp_clk_mutex);
*ts = ns_to_timespec64(ns);
--
2.30.1
From: Heiko Thiery <heiko.thiery(a)gmail.com>
[ Upstream commit 6a4d7234ae9a3bb31181f348ade9bbdb55aeb5c5 ]
When accessing the timecounter register on an i.MX8MQ the kernel hangs.
This is only the case when the interface is down. This can be reproduced
by reading with 'phc_ctrl eth0 get'.
Like described in the change in 91c0d987a9788dcc5fe26baafd73bf9242b68900
the igp clock is disabled when the interface is down and leads to a
system hang.
So we check if the ptp clock status before reading the timecounter
register.
Signed-off-by: Heiko Thiery <heiko.thiery(a)gmail.com>
Acked-by: Richard Cochran <richardcochran(a)gmail.com>
Link: https://lore.kernel.org/r/20210225211514.9115-1-heiko.thiery@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ethernet/freescale/fec_ptp.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index 945643c02615..49fad118988b 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -382,9 +382,16 @@ static int fec_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
u64 ns;
unsigned long flags;
+ mutex_lock(&adapter->ptp_clk_mutex);
+ /* Check the ptp clock */
+ if (!adapter->ptp_clk_on) {
+ mutex_unlock(&adapter->ptp_clk_mutex);
+ return -EINVAL;
+ }
spin_lock_irqsave(&adapter->tmreg_lock, flags);
ns = timecounter_read(&adapter->tc);
spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
+ mutex_unlock(&adapter->ptp_clk_mutex);
*ts = ns_to_timespec64(ns);
--
2.30.1
From: Felix Fietkau <nbd(a)nbd.name>
[ Upstream commit ae064fc0e32a4d28389086d9f4b260a0c157cfee ]
When running out of room in the tx queue after calling drv->tx_prepare_skb,
the buffer list will already have been modified on MT7615 and newer drivers.
This can leak a DMA mapping and will show up as swiotlb allocation failures
on x86.
Fix this by moving the queue length check further up. This is less accurate,
since it can overestimate the needed room in the queue on MT7615 and newer,
but the difference is small enough to not matter in practice.
Signed-off-by: Felix Fietkau <nbd(a)nbd.name>
Signed-off-by: Kalle Valo <kvalo(a)codeaurora.org>
Link: https://lore.kernel.org/r/20210216135119.23809-1-nbd@nbd.name
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/wireless/mediatek/mt76/dma.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c
index 917617aad8d3..3857761c9619 100644
--- a/drivers/net/wireless/mediatek/mt76/dma.c
+++ b/drivers/net/wireless/mediatek/mt76/dma.c
@@ -355,7 +355,6 @@ mt76_dma_tx_queue_skb(struct mt76_dev *dev, enum mt76_txq_id qid,
};
struct ieee80211_hw *hw;
int len, n = 0, ret = -ENOMEM;
- struct mt76_queue_entry e;
struct mt76_txwi_cache *t;
struct sk_buff *iter;
dma_addr_t addr;
@@ -397,6 +396,11 @@ mt76_dma_tx_queue_skb(struct mt76_dev *dev, enum mt76_txq_id qid,
}
tx_info.nbuf = n;
+ if (q->queued + (tx_info.nbuf + 1) / 2 >= q->ndesc - 1) {
+ ret = -ENOMEM;
+ goto unmap;
+ }
+
dma_sync_single_for_cpu(dev->dev, t->dma_addr, dev->drv->txwi_size,
DMA_TO_DEVICE);
ret = dev->drv->tx_prepare_skb(dev, txwi, qid, wcid, sta, &tx_info);
@@ -405,11 +409,6 @@ mt76_dma_tx_queue_skb(struct mt76_dev *dev, enum mt76_txq_id qid,
if (ret < 0)
goto unmap;
- if (q->queued + (tx_info.nbuf + 1) / 2 >= q->ndesc - 1) {
- ret = -ENOMEM;
- goto unmap;
- }
-
return mt76_dma_add_buf(dev, q, tx_info.buf, tx_info.nbuf,
tx_info.info, tx_info.skb, t);
@@ -425,9 +424,7 @@ mt76_dma_tx_queue_skb(struct mt76_dev *dev, enum mt76_txq_id qid,
dev->test.tx_done--;
#endif
- e.skb = tx_info.skb;
- e.txwi = t;
- dev->drv->tx_complete_skb(dev, &e);
+ dev_kfree_skb(tx_info.skb);
mt76_put_txwi(dev, t);
return ret;
}
--
2.30.1