From: Karol Wachowski <karol.wachowski@intel.com>
commit dad945c27a42dfadddff1049cf5ae417209a8996 upstream.
Trigger recovery of the NPU upon receiving HW context violation from the firmware. The context violation error is a fatal error that prevents any subsequent jobs from being executed. Without this fix it is necessary to reload the driver to restore the NPU operational state.
This is a simplified version of the upstream commit, as the full implementation would require all engine reset/resume logic to be backported.
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-13-macie...
Fixes: 0adff3b0ef12 ("accel/ivpu: Share NPU busy time in sysfs")
Cc: stable@vger.kernel.org # v6.11+
---
 drivers/accel/ivpu/ivpu_job.c | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index be2e2bf0f43f0..70b3676974407 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -482,6 +482,8 @@ static struct ivpu_job *ivpu_job_remove_from_submitted_jobs(struct ivpu_device *
 	return job;
 }
 
+#define VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW 0xEU
+
 static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32 job_status)
 {
 	struct ivpu_job *job;
@@ -490,6 +492,9 @@ static int ivpu_job_signal_and_destroy(struct ivpu_device *vdev, u32 job_id, u32
 	if (!job)
 		return -ENOENT;
 
+	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW)
+		ivpu_pm_trigger_recovery(vdev, "HW context violation");
+
 	if (job->file_priv->has_mmu_faults)
 		job_status = DRM_IVPU_JOB_STATUS_ABORTED;
 
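The check added to ivpu_job_signal_and_destroy() can be sketched outside the kernel as the minimal user-space model below. It is an illustration only, not driver code. The status value 0xEU and the reason string come from the patch. Everything prefixed with mock_ (the device struct, the recovery stub, and the handler) is a hypothetical stand-in, and the stub only records that recovery was requested; it does not model what ivpu_pm_trigger_recovery() actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value taken from the patch; everything else is a user-space stand-in. */
#define VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW 0xEU

/* Hypothetical stand-in for struct ivpu_device. */
struct mock_device {
	int recovery_requests;   /* number of recovery requests seen so far */
	const char *last_reason; /* reason string of the most recent request */
};

/* Hypothetical stub for ivpu_pm_trigger_recovery(): only records the request. */
static void mock_trigger_recovery(struct mock_device *dev, const char *reason)
{
	dev->recovery_requests++;
	dev->last_reason = reason;
}

/*
 * Models only the status check added by the patch.
 * Returns true when the firmware status caused a recovery request.
 */
static bool mock_handle_job_status(struct mock_device *dev, uint32_t job_status)
{
	assert(dev != NULL);

	if (job_status == VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW) {
		mock_trigger_recovery(dev, "HW context violation");
		return true;
	}

	return false;
}
```

Calling mock_handle_job_status() with any status other than 0xEU leaves the mock device untouched. Calling it with VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW increments recovery_requests once and stores the reason string.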