From: Sergio Perez Gonzalez <sperezglz(a)gmail.com>
[ Upstream commit 3920a758800762917177a6b5ab39707d8e376fe6 ]
Issue flagged by coverity. Add a safety check for the return value
of dma_set_mask_and_coherent, go to a safe exit if it returns error.
Link: https://scan7.scan.coverity.com/#/project-view/53936/11354?selectedIssue=16…
Signed-off-by: Sergio Perez Gonzalez <sperezglz(a)gmail.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea(a)tuxon.dev>
Link: https://patch.msgid.link/20250526032034.84900-1-sperezglz@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
Based on my analysis of the commit and the codebase context, here's my
assessment: **YES** This commit should be backported to stable kernel
trees. Here's my extensive analysis: ## Code Analysis The commit adds
proper error checking for `dma_set_mask_and_coherent()` in the macb
network driver. Looking at the specific changes: **Before the fix:**
```c if (GEM_BFEXT(DAW64, gem_readl(bp, DCFG6))) {
dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(44)); bp->hw_dma_cap
|= HW_DMA_CAP_64B; } ``` **After the fix:** ```c if (GEM_BFEXT(DAW64,
gem_readl(bp, DCFG6))) { err = dma_set_mask_and_coherent(&pdev->dev,
DMA_BIT_MASK(44)); if (err) { dev_err(&pdev->dev, "failed to set DMA
mask\n"); goto err_out_free_netdev; } bp->hw_dma_cap |= HW_DMA_CAP_64B;
} ``` ## Why this should be backported: 1. **Fixes a Real Bug**: The
original code ignores the return value of `dma_set_mask_and_coherent()`.
If this function fails (which can happen on systems with restricted DMA
capabilities), the driver would continue execution assuming 44-bit DMA
addressing is available when it's not, leading to potential memory
corruption or system crashes. 2. **Small, Contained Fix**: This is a
minimal change that only adds proper error checking without changing any
core logic or introducing new features. It follows the stable tree
principle of being a small, targeted bugfix. 3. **Follows Established
Patterns**: Looking at the similar commits provided, this matches
exactly the pattern of commits marked "YES" for backporting (commits #1,
#2, and #5) which all add proper error checking for DMA mask functions.
4. **Critical Network Driver**: The macb driver is used in production
systems, and DMA-related bugs can cause data corruption or system
instability, making this a high-priority fix for stable trees. 5.
**Proper Error Handling Path**: The fix correctly uses the existing
`err_out_free_netdev` label, which properly cleans up allocated
resources (line 5373-5383 in macb_main.c), preventing resource leaks
when DMA setup fails. 6. **Static Analysis Tool Flagged**: The commit
message mentions this was flagged by Coverity, indicating it's a
legitimate code quality issue that could manifest as a runtime bug. 7.
**Matches Stable Tree Criteria**: - Important bugfix ✓ - Minimal risk of
regression ✓ - No architectural changes ✓ - Confined to one
driver/subsystem ✓ - Clear side effects (proper cleanup on failure) ✓
This commit addresses the same class of vulnerability as the other "YES"
examples: ignoring return values of critical DMA functions that can fail
and lead to undefined behavior. The fix is surgical, safe, and addresses
a genuine runtime issue in a widely-used network driver.
drivers/net/ethernet/cadence/macb_main.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index e1e8bd2ec155b..d1f1ae5ea161c 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -5283,7 +5283,11 @@ static int macb_probe(struct platform_device *pdev)
#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
if (GEM_BFEXT(DAW64, gem_readl(bp, DCFG6))) {
- dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(44));
+ err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(44));
+ if (err) {
+ dev_err(&pdev->dev, "failed to set DMA mask\n");
+ goto err_out_free_netdev;
+ }
bp->hw_dma_cap |= HW_DMA_CAP_64B;
}
#endif
--
2.39.5
From: Francesco Dolcini <francesco.dolcini(a)toradex.com>
This reverts commit 4fcfcbe457349267fe048524078e8970807c1a5b.
That commit introduces a regression, when HT40 mode is enabled,
received packets are lost, this was experience with W8997 with both
SDIO-UART and SDIO-SDIO variants. From an initial investigation the
issue solves on its own after some time, but it's not clear what is
the reason. Given that this was just a performance optimization, let's
revert it till we have a better understanding of the issue and a proper
fix.
Cc: Jeff Chen <jeff.chen_1(a)nxp.com>
Cc: stable(a)vger.kernel.org
Fixes: 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth issue.")
Closes: https://lore.kernel.org/all/20250603203337.GA109929@francesco-nb/
Signed-off-by: Francesco Dolcini <francesco.dolcini(a)toradex.com>
---
v2: fix reverted commit sha
v1: https://lore.kernel.org/all/20250605100313.34014-1-francesco@dolcini.it/
---
drivers/net/wireless/marvell/mwifiex/11n.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/11n.c b/drivers/net/wireless/marvell/mwifiex/11n.c
index 738bafc3749b..66f0f5377ac1 100644
--- a/drivers/net/wireless/marvell/mwifiex/11n.c
+++ b/drivers/net/wireless/marvell/mwifiex/11n.c
@@ -403,14 +403,12 @@ mwifiex_cmd_append_11n_tlv(struct mwifiex_private *priv,
if (sband->ht_cap.cap & IEEE80211_HT_CAP_SUP_WIDTH_20_40 &&
bss_desc->bcn_ht_oper->ht_param &
- IEEE80211_HT_PARAM_CHAN_WIDTH_ANY) {
- chan_list->chan_scan_param[0].radio_type |=
- CHAN_BW_40MHZ << 2;
+ IEEE80211_HT_PARAM_CHAN_WIDTH_ANY)
SET_SECONDARYCHAN(chan_list->chan_scan_param[0].
radio_type,
(bss_desc->bcn_ht_oper->ht_param &
IEEE80211_HT_PARAM_CHA_SEC_OFFSET));
- }
+
*buffer += struct_size(chan_list, chan_scan_param, 1);
ret_len += struct_size(chan_list, chan_scan_param, 1);
}
--
2.39.5
This reverts commit d95fdee2253e612216e72f29c65b92ec42d254eb which is
upstream commit be4ae8c19492cd6d5de61ccb34ffb3f5ede5eec8.
This commit is causing a suspend regression on Tegra186 Jetson TX2 with
Linux v6.12.y kernels. This is not seen with Linux v6.15 that includes
this change but indicates that there are there changes missing.
Therefore, revert this change.
Fixes: d95fdee2253e ("cpufreq: tegra186: Share policy per cluster")
Link: https://lore.kernel.org/linux-tegra/bf1dabf7-0337-40e9-8b8e-4e93a0ffd4cc@nv…
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
---
drivers/cpufreq/tegra186-cpufreq.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
index 4e5b6f9a56d1..7b8fcfa55038 100644
--- a/drivers/cpufreq/tegra186-cpufreq.c
+++ b/drivers/cpufreq/tegra186-cpufreq.c
@@ -73,18 +73,11 @@ static int tegra186_cpufreq_init(struct cpufreq_policy *policy)
{
struct tegra186_cpufreq_data *data = cpufreq_get_driver_data();
unsigned int cluster = data->cpus[policy->cpu].bpmp_cluster_id;
- u32 cpu;
policy->freq_table = data->clusters[cluster].table;
policy->cpuinfo.transition_latency = 300 * 1000;
policy->driver_data = NULL;
- /* set same policy for all cpus in a cluster */
- for (cpu = 0; cpu < ARRAY_SIZE(tegra186_cpus); cpu++) {
- if (data->cpus[cpu].bpmp_cluster_id == cluster)
- cpumask_set_cpu(cpu, policy->cpus);
- }
-
return 0;
}
--
2.43.0
This reverts commit ac64f0e893ff370c4d3426c83c1bd0acae75bcf4 which is
upstream commit be4ae8c19492cd6d5de61ccb34ffb3f5ede5eec8.
This commit is causing a suspend regression on Tegra186 Jetson TX2 with
Linux v6.12.y kernels. This is not seen with Linux v6.15 that includes
this change but indicates that there are there changes missing.
Therefore, revert this change.
Fixes: ac64f0e893ff ("cpufreq: tegra186: Share policy per cluster")
Link: https://lore.kernel.org/linux-tegra/bf1dabf7-0337-40e9-8b8e-4e93a0ffd4cc@nv…
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
---
drivers/cpufreq/tegra186-cpufreq.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
index 4e5b6f9a56d1..7b8fcfa55038 100644
--- a/drivers/cpufreq/tegra186-cpufreq.c
+++ b/drivers/cpufreq/tegra186-cpufreq.c
@@ -73,18 +73,11 @@ static int tegra186_cpufreq_init(struct cpufreq_policy *policy)
{
struct tegra186_cpufreq_data *data = cpufreq_get_driver_data();
unsigned int cluster = data->cpus[policy->cpu].bpmp_cluster_id;
- u32 cpu;
policy->freq_table = data->clusters[cluster].table;
policy->cpuinfo.transition_latency = 300 * 1000;
policy->driver_data = NULL;
- /* set same policy for all cpus in a cluster */
- for (cpu = 0; cpu < ARRAY_SIZE(tegra186_cpus); cpu++) {
- if (data->cpus[cpu].bpmp_cluster_id == cluster)
- cpumask_set_cpu(cpu, policy->cpus);
- }
-
return 0;
}
--
2.43.0
This reverts commit 89172666228de1cefcacf5bc6f61c6281751d2ed which is
upstream commit be4ae8c19492cd6d5de61ccb34ffb3f5ede5eec8.
This commit is causing a suspend regression on Tegra186 Jetson TX2 with
Linux v6.12.y kernels. This is not seen with Linux v6.15 that includes
this change but indicates that there are there changes missing.
Therefore, revert this change.
Link: https://lore.kernel.org/linux-tegra/bf1dabf7-0337-40e9-8b8e-4e93a0ffd4cc@nv…
Fixes: 89172666228d ("cpufreq: tegra186: Share policy per cluster")
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
---
drivers/cpufreq/tegra186-cpufreq.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
index 1d6b54303723..6c88827f4e62 100644
--- a/drivers/cpufreq/tegra186-cpufreq.c
+++ b/drivers/cpufreq/tegra186-cpufreq.c
@@ -73,18 +73,11 @@ static int tegra186_cpufreq_init(struct cpufreq_policy *policy)
{
struct tegra186_cpufreq_data *data = cpufreq_get_driver_data();
unsigned int cluster = data->cpus[policy->cpu].bpmp_cluster_id;
- u32 cpu;
policy->freq_table = data->clusters[cluster].table;
policy->cpuinfo.transition_latency = 300 * 1000;
policy->driver_data = NULL;
- /* set same policy for all cpus in a cluster */
- for (cpu = 0; cpu < ARRAY_SIZE(tegra186_cpus); cpu++) {
- if (data->cpus[cpu].bpmp_cluster_id == cluster)
- cpumask_set_cpu(cpu, policy->cpus);
- }
-
return 0;
}
--
2.43.0
This reverts commit 2592aeda794c9ea73193effdab69f1cf90d0851a which is
upstream commit be4ae8c19492cd6d5de61ccb34ffb3f5ede5eec8.
This commit is causing a suspend regression on Tegra186 Jetson TX2 with
Linux v6.12.y kernels. This is not seen with Linux v6.15 that includes
this change but indicates that there are there changes missing.
Therefore, revert this change.
Fixes: 2592aeda794c ("cpufreq: tegra186: Share policy per cluster")
Link: https://lore.kernel.org/linux-tegra/bf1dabf7-0337-40e9-8b8e-4e93a0ffd4cc@nv…
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
---
drivers/cpufreq/tegra186-cpufreq.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
index 19597246f9cc..5d1943e787b0 100644
--- a/drivers/cpufreq/tegra186-cpufreq.c
+++ b/drivers/cpufreq/tegra186-cpufreq.c
@@ -73,18 +73,11 @@ static int tegra186_cpufreq_init(struct cpufreq_policy *policy)
{
struct tegra186_cpufreq_data *data = cpufreq_get_driver_data();
unsigned int cluster = data->cpus[policy->cpu].bpmp_cluster_id;
- u32 cpu;
policy->freq_table = data->clusters[cluster].table;
policy->cpuinfo.transition_latency = 300 * 1000;
policy->driver_data = NULL;
- /* set same policy for all cpus in a cluster */
- for (cpu = 0; cpu < ARRAY_SIZE(tegra186_cpus); cpu++) {
- if (data->cpus[cpu].bpmp_cluster_id == cluster)
- cpumask_set_cpu(cpu, policy->cpus);
- }
-
return 0;
}
--
2.43.0
Don't WARN if imported buffers are in use in ivpu_gem_bo_free() as they
can be indeed used in the original context/driver.
Fixes: 647371a6609d ("accel/ivpu: Add GEM buffer object management")
Cc: stable(a)vger.kernel.org # v6.3
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz(a)linux.intel.com>
---
v2: Use drm_gem_is_imported() to check if the buffer is imported.
---
drivers/accel/ivpu/ivpu_gem.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c
index c193a80241f5f..5ff0bac739fc9 100644
--- a/drivers/accel/ivpu/ivpu_gem.c
+++ b/drivers/accel/ivpu/ivpu_gem.c
@@ -278,7 +278,8 @@ static void ivpu_gem_bo_free(struct drm_gem_object *obj)
list_del(&bo->bo_list_node);
mutex_unlock(&vdev->bo_list_lock);
- drm_WARN_ON(&vdev->drm, !dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_READ));
+ drm_WARN_ON(&vdev->drm, !drm_gem_is_imported(&bo->base.base) &&
+ !dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_READ));
drm_WARN_ON(&vdev->drm, ivpu_bo_size(bo) == 0);
drm_WARN_ON(&vdev->drm, bo->base.vaddr);
--
2.45.1
From: Karol Wachowski <karol.wachowski(a)intel.com>
Trigger full device recovery when the driver fails to restore device state
via engine reset and resume operations. This is necessary because, even if
submissions from a faulty context are blocked, the NPU may still process
previously submitted faulty jobs if the engine reset fails to abort them.
Such jobs can continue to generate faults and occupy device resources.
When engine reset is ineffective, the only way to recover is to perform
a full device recovery.
Fixes: dad945c27a42 ("accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW")
Cc: <stable(a)vger.kernel.org> # v6.15+
Signed-off-by: Karol Wachowski <karol.wachowski(a)intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz(a)linux.intel.com>
---
drivers/accel/ivpu/ivpu_job.c | 6 ++++--
drivers/accel/ivpu/ivpu_jsm_msg.c | 9 +++++++--
2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 1c8e283ad9854..fae8351aa3309 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -986,7 +986,8 @@ void ivpu_context_abort_work_fn(struct work_struct *work)
return;
if (vdev->fw->sched_mode == VPU_SCHEDULING_MODE_HW)
- ivpu_jsm_reset_engine(vdev, 0);
+ if (ivpu_jsm_reset_engine(vdev, 0))
+ return;
mutex_lock(&vdev->context_list_lock);
xa_for_each(&vdev->context_xa, ctx_id, file_priv) {
@@ -1009,7 +1010,8 @@ void ivpu_context_abort_work_fn(struct work_struct *work)
if (vdev->fw->sched_mode != VPU_SCHEDULING_MODE_HW)
goto runtime_put;
- ivpu_jsm_hws_resume_engine(vdev, 0);
+ if (ivpu_jsm_hws_resume_engine(vdev, 0))
+ return;
/*
* In hardware scheduling mode NPU already has stopped processing jobs
* and won't send us any further notifications, thus we have to free job related resources
diff --git a/drivers/accel/ivpu/ivpu_jsm_msg.c b/drivers/accel/ivpu/ivpu_jsm_msg.c
index 219ab8afefabd..0256b2dfefc10 100644
--- a/drivers/accel/ivpu/ivpu_jsm_msg.c
+++ b/drivers/accel/ivpu/ivpu_jsm_msg.c
@@ -7,6 +7,7 @@
#include "ivpu_hw.h"
#include "ivpu_ipc.h"
#include "ivpu_jsm_msg.h"
+#include "ivpu_pm.h"
#include "vpu_jsm_api.h"
const char *ivpu_jsm_msg_type_to_str(enum vpu_ipc_msg_type type)
@@ -163,8 +164,10 @@ int ivpu_jsm_reset_engine(struct ivpu_device *vdev, u32 engine)
ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_ENGINE_RESET_DONE, &resp,
VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
- if (ret)
+ if (ret) {
ivpu_err_ratelimited(vdev, "Failed to reset engine %d: %d\n", engine, ret);
+ ivpu_pm_trigger_recovery(vdev, "Engine reset failed");
+ }
return ret;
}
@@ -354,8 +357,10 @@ int ivpu_jsm_hws_resume_engine(struct ivpu_device *vdev, u32 engine)
ret = ivpu_ipc_send_receive(vdev, &req, VPU_JSM_MSG_HWS_RESUME_ENGINE_DONE, &resp,
VPU_IPC_CHAN_ASYNC_CMD, vdev->timeout.jsm);
- if (ret)
+ if (ret) {
ivpu_err_ratelimited(vdev, "Failed to resume engine %d: %d\n", engine, ret);
+ ivpu_pm_trigger_recovery(vdev, "Engine resume failed");
+ }
return ret;
}
--
2.45.1