The first two commits fix rare bugs and should be backported to stable branches.
The rest is an attempt to clean up and document the code to make it a bit easier to understand.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
Alessio Belle (8):
  drm/imagination: Count paired job fence as dependency in prepare_job()
  drm/imagination: Fit paired fragment job in the correct CCCB
  drm/imagination: Skip check on paired job fence during job submission
  drm/imagination: Rename pvr_queue_fence_is_ufo_backed() to reflect usage
  drm/imagination: Rename fence returned by pvr_queue_job_arm()
  drm/imagination: Move repeated job fence check to its own function
  drm/imagination: Update check to skip prepare_job() for fragment jobs
  drm/imagination: Minor improvements to job submission code documentation
 drivers/gpu/drm/imagination/pvr_job.c              |   8 +-
 drivers/gpu/drm/imagination/pvr_queue.c            | 154 +++++++++++++--------
 drivers/gpu/drm/imagination/pvr_queue.h            |   2 +-
 .../gpu/drm/imagination/pvr_rogue_fwif_shared.h    |  10 +-
 drivers/gpu/drm/imagination/pvr_sync.c             |   8 +-
 drivers/gpu/drm/imagination/pvr_sync.h             |   2 +-
 6 files changed, 110 insertions(+), 74 deletions(-)
---
base-commit: 3bce3fdd1ff2ba242f76ab66659fff27207299f1
change-id: 20260330-job-submission-fixes-cleanup-83e01196c3e9
Best regards,
The DRM scheduler's prepare_job() callback counts the remaining non-signaled native dependencies for a job, preventing job submission until those (plus job data and fence update) can fit in the job queue's CCCB.
This means checking which dependencies can be waited upon in the firmware, i.e. which are backed by a UFO object. A dependency is UFO-backed once its drm_sched_fence::parent has been assigned to a pvr_queue_fence::base fence, which happens when the job owning that fence is submitted to the firmware.
Paired geometry and fragment jobs are submitted at the same time, which means the dependency between them can't be checked this way before submission.
Update job_count_remaining_native_deps() to take into account the dependency between paired jobs.
This fixes cases where prepare_job() underestimated the space left in an almost full fragment CCCB, wrongly unblocking run_job(), which then returned early without writing the full sequence of commands to the CCCB.
The above led to kernel warnings such as the following, and potentially to job timeouts (depending on what was waiting on the missing commands):
[ 375.702979] WARNING: drivers/gpu/drm/imagination/pvr_cccb.c:178 at pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr], CPU#1: kworker/u16:3/47
[ 375.703160] Modules linked in:
[ 375.703571] CPU: 1 UID: 0 PID: 47 Comm: kworker/u16:3 Tainted: G W 7.0.0-rc2-g817eb6b11ad5 #40 PREEMPT
[ 375.703613] Tainted: [W]=WARN
[ 375.703627] Hardware name: Texas Instruments AM625 SK (DT)
[ 375.703645] Workqueue: powervr-sched drm_sched_run_job_work [gpu_sched]
[ 375.703741] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 375.703764] pc : pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr]
[ 375.703847] lr : pvr_queue_submit_job_to_cccb+0x578/0xa70 [powervr]
[ 375.703921] sp : ffff800084a97650
[ 375.703934] x29: ffff800084a97740 x28: 0000000000000958 x27: ffff80008565d000
[ 375.703979] x26: 0000000000000030 x25: ffff800084a97680 x24: 0000000000001000
[ 375.704017] x23: ffff800084a97820 x22: 1ffff00010952ecc x21: 0000000000000008
[ 375.704056] x20: 00000000000006a8 x19: ffff00002ff7da88 x18: 0000000000000000
[ 375.704093] x17: 0000000020020000 x16: 0000000000020000 x15: 0000000000000000
[ 375.704132] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 375.704168] x11: 000000000000f2f2 x10: 00000000f3000000 x9 : 00000000f3f3f3f3
[ 375.704206] x8 : 00000000f2f2f200 x7 : ffff700010952ecc x6 : 0000000000000008
[ 375.704243] x5 : 0000000000000000 x4 : 1ffff00010acba00 x3 : 0000000000000000
[ 375.704279] x2 : 0000000000000007 x1 : 0000000000000fff x0 : 000000000000002f
[ 375.704317] Call trace:
[ 375.704331]  pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr] (P)
[ 375.704411]  pvr_queue_submit_job_to_cccb+0x578/0xa70 [powervr]
[ 375.704487]  pvr_queue_run_job+0x3a4/0x990 [powervr]
[ 375.704562]  drm_sched_run_job_work+0x580/0xd48 [gpu_sched]
[ 375.704623]  process_one_work+0x520/0x1288
[ 375.704658]  worker_thread+0x3f0/0xb3c
[ 375.704680]  kthread+0x334/0x3d8
[ 375.704706]  ret_from_fork+0x10/0x20
[ 375.704736] ---[ end trace 0000000000000000 ]---
Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling")
Cc: stable@vger.kernel.org
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
 drivers/gpu/drm/imagination/pvr_queue.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index dd88949f6194..836feaa0b295 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -179,7 +179,7 @@ static const struct dma_fence_ops pvr_queue_job_fence_ops = {
 /**
  * to_pvr_queue_job_fence() - Return a pvr_queue_fence object if the fence is
- * backed by a UFO.
+ * already backed by a UFO.
  * @f: The dma_fence to turn into a pvr_queue_fence.
  *
  * Return:
@@ -356,6 +356,15 @@ static u32 job_cmds_size(struct pvr_job *job, u32 ufo_wait_count)
 	       pvr_cccb_get_size_of_cmd_with_hdr(job->cmd_len);
 }
 
+static bool
+is_paired_job_fence(struct dma_fence *fence, struct pvr_job *job)
+{
+	/* This assumes "fence" is one of "job"'s drm_sched_job::dependencies */
+	return job->type == DRM_PVR_JOB_TYPE_FRAGMENT &&
+	       job->paired_job &&
+	       &job->paired_job->base.s_fence->scheduled == fence;
+}
+
 /**
  * job_count_remaining_native_deps() - Count the number of non-signaled native dependencies.
  * @job: Job to operate on.
@@ -371,6 +380,17 @@ static unsigned long job_count_remaining_native_deps(struct pvr_job *job)
 	xa_for_each(&job->base.dependencies, index, fence) {
 		struct pvr_queue_fence *jfence;
 
+		if (is_paired_job_fence(fence, job)) {
+			/*
+			 * A fence between paired jobs won't resolve to a pvr_queue_fence (i.e.
+			 * be backed by a UFO) until the jobs have been submitted, together.
+			 * The submitting code will insert a partial render fence command for this.
+			 */
+			WARN_ON(dma_fence_is_signaled(fence));
+			remaining_count++;
+			continue;
+		}
+
 		jfence = to_pvr_queue_job_fence(fence);
 		if (!jfence)
 			continue;
@@ -630,9 +650,8 @@ static void pvr_queue_submit_job_to_cccb(struct pvr_job *job)
 		if (!jfence)
 			continue;
 
-		/* Skip the partial render fence, we will place it at the end. */
-		if (job->type == DRM_PVR_JOB_TYPE_FRAGMENT && job->paired_job &&
-		    &job->paired_job->base.s_fence->scheduled == fence)
+		/* This fence will be placed last, as partial render fence. */
+		if (is_paired_job_fence(fence, job))
 			continue;
 
 		if (dma_fence_is_signaled(&jfence->base))
For geometry jobs with a paired fragment job, at the moment, the DRM scheduler's prepare_job() callback:
- checks for internal (driver) dependencies for the geometry job;
- calls into pvr_queue_get_paired_frag_job_dep() to check for external dependencies for the fragment job (the two jobs are submitted together but the common scheduler code doesn't know about it, so this needs to be done at this point in time);
- calls into the prepare_job() callback again, but for the fragment job, to check its internal dependencies as well, passing the fragment job's drm_sched_job and the geometry job's drm_sched_entity / pvr_queue.
The problem with the last step is that pvr_queue_prepare_job() doesn't always take the mismatched fragment job and geometry queue into account, in particular when checking whether there is space for the fragment command to be submitted, so the code ends up checking for space in the geometry (i.e. wrong) CCCB. The rest of the nested prepare_job() callback happens to work fine at the moment as the other internal dependencies are not relevant for a paired fragment job.
Move the initialisation of a paired fragment job's done fence and CCCB fence to pvr_queue_get_paired_frag_job_dep(), inferring the correct queue from the fragment job itself.
This fixes cases where prepare_job() wrongly assumed that there was enough space for a paired fragment job in its own CCCB, unblocking run_job(), which then returned early without writing the full sequence of commands to the CCCB.
The above led to kernel warnings such as the following, and potentially to job timeouts (depending on what was waiting on the missing commands):
[ 552.421075] WARNING: drivers/gpu/drm/imagination/pvr_cccb.c:178 at pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr], CPU#2: kworker/u16:5/63
[ 552.421230] Modules linked in:
[ 552.421592] CPU: 2 UID: 0 PID: 63 Comm: kworker/u16:5 Tainted: G W 7.0.0-rc2-gc5d053e4dccb #39 PREEMPT
[ 552.421625] Tainted: [W]=WARN
[ 552.421637] Hardware name: Texas Instruments AM625 SK (DT)
[ 552.421655] Workqueue: powervr-sched drm_sched_run_job_work [gpu_sched]
[ 552.421744] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 552.421766] pc : pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr]
[ 552.421850] lr : pvr_queue_submit_job_to_cccb+0x57c/0xa74 [powervr]
[ 552.421923] sp : ffff800084c47650
[ 552.421936] x29: ffff800084c47740 x28: 0000000000000df8 x27: ffff800088a77000
[ 552.421979] x26: 0000000000000030 x25: ffff800084c47680 x24: 0000000000001000
[ 552.422017] x23: ffff800084c47820 x22: 1ffff00010988ecc x21: 0000000000000008
[ 552.422055] x20: 0000000000000208 x19: ffff000006ad5a88 x18: 0000000000000000
[ 552.422093] x17: 0000000020020000 x16: 0000000000020000 x15: 0000000000000000
[ 552.422130] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 552.422167] x11: 000000000000f2f2 x10: 00000000f3000000 x9 : 00000000f3f3f3f3
[ 552.422204] x8 : 00000000f2f2f200 x7 : ffff700010988ecc x6 : 0000000000000008
[ 552.422241] x5 : 0000000000000000 x4 : 1ffff0001114ee00 x3 : 0000000000000000
[ 552.422278] x2 : 0000000000000007 x1 : 0000000000000fff x0 : 000000000000002f
[ 552.422316] Call trace:
[ 552.422330]  pvr_cccb_write_command_with_header+0x2c4/0x330 [powervr] (P)
[ 552.422411]  pvr_queue_submit_job_to_cccb+0x57c/0xa74 [powervr]
[ 552.422486]  pvr_queue_run_job+0x3a4/0x990 [powervr]
[ 552.422562]  drm_sched_run_job_work+0x580/0xd48 [gpu_sched]
[ 552.422623]  process_one_work+0x520/0x1288
[ 552.422657]  worker_thread+0x3f0/0xb3c
[ 552.422679]  kthread+0x334/0x3d8
[ 552.422706]  ret_from_fork+0x10/0x20
Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling")
Cc: stable@vger.kernel.org
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
 drivers/gpu/drm/imagination/pvr_queue.c | 32 ++++++++++++++------------------
 1 file changed, 14 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index 836feaa0b295..f1e54e6d940d 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -488,10 +488,11 @@ pvr_queue_get_job_kccb_fence(struct pvr_queue *queue, struct pvr_job *job)
 }
 
 static struct dma_fence *
-pvr_queue_get_paired_frag_job_dep(struct pvr_queue *queue, struct pvr_job *job)
+pvr_queue_get_paired_frag_job_dep(struct pvr_job *job)
 {
 	struct pvr_job *frag_job = job->type == DRM_PVR_JOB_TYPE_GEOMETRY ?
 				   job->paired_job : NULL;
+	struct pvr_queue *frag_queue = frag_job ? frag_job->ctx->queues.fragment : NULL;
 	struct dma_fence *f;
 	unsigned long index;
 
@@ -510,7 +511,10 @@ pvr_queue_get_paired_frag_job_dep(struct pvr_queue *queue, struct pvr_job *job)
 		return dma_fence_get(f);
 	}
 
-	return frag_job->base.sched->ops->prepare_job(&frag_job->base, &queue->entity);
+	/* Initialize the paired fragment job's done_fence, so we can signal it. */
+	pvr_queue_job_fence_init(frag_job->done_fence, frag_queue);
+
+	return pvr_queue_get_job_cccb_fence(frag_queue, frag_job);
 }
 
 /**
@@ -529,11 +533,6 @@ pvr_queue_prepare_job(struct drm_sched_job *sched_job,
 	struct pvr_queue *queue = container_of(s_entity, struct pvr_queue, entity);
 	struct dma_fence *internal_dep = NULL;
 
-	/*
-	 * Initialize the done_fence, so we can signal it. This must be done
-	 * here because otherwise by the time of run_job() the job will end up
-	 * in the pending list without a valid fence.
-	 */
 	if (job->type == DRM_PVR_JOB_TYPE_FRAGMENT && job->paired_job) {
 		/*
 		 * This will be called on a paired fragment job after being
@@ -543,18 +542,15 @@ pvr_queue_prepare_job(struct drm_sched_job *sched_job,
 		 */
 		if (job->paired_job->has_pm_ref)
 			return NULL;
-
-		/*
-		 * In this case we need to use the job's own ctx to initialise
-		 * the done_fence. The other steps are done in the ctx of the
-		 * paired geometry job.
-		 */
-		pvr_queue_job_fence_init(job->done_fence,
-					 job->ctx->queues.fragment);
-	} else {
-		pvr_queue_job_fence_init(job->done_fence, queue);
 	}
 
+	/*
+	 * Initialize the done_fence, so we can signal it. This must be done
+	 * here because otherwise by the time of run_job() the job will end up
+	 * in the pending list without a valid fence.
+	 */
+	pvr_queue_job_fence_init(job->done_fence, queue);
+
 	/* CCCB fence is used to make sure we have enough space in the CCCB to
 	 * submit our commands.
 	 */
@@ -575,7 +571,7 @@ pvr_queue_prepare_job(struct drm_sched_job *sched_job,
 
 	/* The paired job fence should come last, when everything else is ready. */
 	if (!internal_dep)
-		internal_dep = pvr_queue_get_paired_frag_job_dep(queue, job);
+		internal_dep = pvr_queue_get_paired_frag_job_dep(job);
 
 	return internal_dep;
 }
By the time prepare_job() is called on a paired fragment job, the paired geometry job might already be finished and its PM reference dropped.
Check the fragment job's PM reference instead, which is a bit more likely to still be set. This is a very minor optimization.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
 drivers/gpu/drm/imagination/pvr_queue.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index 4a49d954562e..303f4d6cc09e 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -553,12 +553,13 @@ pvr_queue_prepare_job(struct drm_sched_job *sched_job,
 	if (job->type == DRM_PVR_JOB_TYPE_FRAGMENT && job->paired_job) {
 		/*
-		 * This will be called on a paired fragment job after being
-		 * submitted to firmware. We can tell if this is the case and
-		 * bail early from whether run_job() has been called on the
-		 * geometry job, which would issue a pm ref.
+		 * This will be called on a paired fragment job after being submitted
+		 * to the firmware as part of the paired geometry job's submission.
+		 * We can tell if this is the case and bail early from whether run_job()
+		 * has been called on the geometry job, which would issue a pm ref on
+		 * this job as well.
 		 */
-		if (job->paired_job->has_pm_ref)
+		if (job->has_pm_ref)
 			return NULL;
 	}
While submitting a paired fragment job, there is no need to manually look for, and skip, the paired job fence: the existing logic that resolves dependencies to pvr_queue_fence objects will already have failed to resolve it and moved on to the next one.
Point this out where the fence is actually accessed and drop the related check.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
 drivers/gpu/drm/imagination/pvr_queue.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index f1e54e6d940d..527eae1309d8 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -646,10 +646,6 @@ static void pvr_queue_submit_job_to_cccb(struct pvr_job *job)
 		if (!jfence)
 			continue;
 
-		/* This fence will be placed last, as partial render fence. */
-		if (is_paired_job_fence(fence, job))
-			continue;
-
 		if (dma_fence_is_signaled(&jfence->base))
 			continue;
 
@@ -664,8 +660,13 @@ static void pvr_queue_submit_job_to_cccb(struct pvr_job *job)
 		}
 	}
 
-	/* Partial render fence goes last. */
 	if (job->type == DRM_PVR_JOB_TYPE_FRAGMENT && job->paired_job) {
+		/*
+		 * The loop above will only process dependencies backed by a UFO i.e. with
+		 * a valid parent fence assigned, but the paired job dependency won't have
+		 * one until both jobs have been submitted. Access the parent fence directly
+		 * here instead, submitting it last as partial render fence.
+		 */
 		jfence = to_pvr_queue_job_fence(job->paired_job->done_fence);
 		if (!WARN_ON(!jfence)) {
 			pvr_fw_object_get_fw_addr(jfence->queue->timeline_ufo.fw_obj,
This function is only used by the synchronization code to figure out if a fence belongs to this driver. Rename it to pvr_queue_fence_is_native() and update its documentation to reflect its current purpose.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
 drivers/gpu/drm/imagination/pvr_queue.c | 14 +++++++-------
 drivers/gpu/drm/imagination/pvr_queue.h |  2 +-
 drivers/gpu/drm/imagination/pvr_sync.c  |  4 ++--
 3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index 527eae1309d8..df0a110ed96f 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -898,16 +898,16 @@ static const struct drm_sched_backend_ops pvr_queue_sched_ops = {
 };
 
 /**
- * pvr_queue_fence_is_ufo_backed() - Check if a dma_fence is backed by a UFO object
+ * pvr_queue_fence_is_native() - Check if a dma_fence is native to this driver.
  * @f: Fence to test.
  *
- * A UFO-backed fence is a fence that can be signaled or waited upon FW-side.
- * pvr_job::done_fence objects are backed by the timeline UFO attached to the queue
- * they are pushed to, but those fences are not directly exposed to the outside
- * world, so we also need to check if the fence we're being passed is a
- * drm_sched_fence that was coming from our driver.
+ * Check if the fence we're being passed is a drm_sched_fence that is coming from this driver.
+ *
+ * It may be a UFO-backed fence i.e. a fence that can be signaled or waited upon FW-side,
+ * such as pvr_job::done_fence objects that are backed by the timeline UFO attached to the queue
+ * they are pushed to.
  */
-bool pvr_queue_fence_is_ufo_backed(struct dma_fence *f)
+bool pvr_queue_fence_is_native(struct dma_fence *f)
 {
 	struct drm_sched_fence *sched_fence = f ? to_drm_sched_fence(f) : NULL;
 
diff --git a/drivers/gpu/drm/imagination/pvr_queue.h b/drivers/gpu/drm/imagination/pvr_queue.h
index fc1986d73fc8..4aa72665ce25 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.h
+++ b/drivers/gpu/drm/imagination/pvr_queue.h
@@ -141,7 +141,7 @@ struct pvr_queue {
 	u64 callstack_addr;
 };
 
-bool pvr_queue_fence_is_ufo_backed(struct dma_fence *f);
+bool pvr_queue_fence_is_native(struct dma_fence *f);
 
 int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id);
 
diff --git a/drivers/gpu/drm/imagination/pvr_sync.c b/drivers/gpu/drm/imagination/pvr_sync.c
index 3582616ff722..757a18b1ab8f 100644
--- a/drivers/gpu/drm/imagination/pvr_sync.c
+++ b/drivers/gpu/drm/imagination/pvr_sync.c
@@ -211,7 +211,7 @@ pvr_sync_add_dep_to_job(struct drm_sched_job *job, struct dma_fence *f)
 	int err = 0;
 
 	dma_fence_unwrap_for_each(uf, &iter, f) {
-		if (pvr_queue_fence_is_ufo_backed(uf))
+		if (pvr_queue_fence_is_native(uf))
 			native_fence_count++;
 	}
 
@@ -227,7 +227,7 @@ pvr_sync_add_dep_to_job(struct drm_sched_job *job, struct dma_fence *f)
 		if (err)
 			continue;
 
-		if (pvr_queue_fence_is_ufo_backed(uf)) {
+		if (pvr_queue_fence_is_native(uf)) {
 			struct drm_sched_fence *s_fence = to_drm_sched_fence(uf);
 
 			/* If this is a native dependency, we wait for the scheduled fence,
This should make the code slightly clearer.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
 drivers/gpu/drm/imagination/pvr_queue.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index df0a110ed96f..4a49d954562e 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -177,6 +177,24 @@ static const struct dma_fence_ops pvr_queue_job_fence_ops = {
 	.release = pvr_queue_fence_release,
 };
 
+/**
+ * pvr_queue_fence_is_ufo_backed() - Check if a dma_fence is backed by a UFO.
+ * @f: The dma_fence to check.
+ *
+ * Return:
+ *  * true if the dma_fence is backed by a UFO, or
+ *  * false otherwise.
+ */
+static inline bool
+pvr_queue_fence_is_ufo_backed(struct dma_fence *f)
+{
+	/*
+	 * Currently the only dma_fence backed by a UFO object is the job fence,
+	 * e.g. pvr_job::done_fence, wrapped by a pvr_queue_fence object.
+	 */
+	return f && f->ops == &pvr_queue_job_fence_ops;
+}
+
 /**
  * to_pvr_queue_job_fence() - Return a pvr_queue_fence object if the fence is
  * already backed by a UFO.
@@ -194,7 +212,7 @@ to_pvr_queue_job_fence(struct dma_fence *f)
 	if (sched_fence)
 		f = sched_fence->parent;
 
-	if (f && f->ops == &pvr_queue_job_fence_ops)
+	if (pvr_queue_fence_is_ufo_backed(f))
 		return container_of(f, struct pvr_queue_fence, base);
 
 	return NULL;
@@ -915,10 +933,7 @@ bool pvr_queue_fence_is_native(struct dma_fence *f)
 	    sched_fence->sched->ops == &pvr_queue_sched_ops)
 		return true;
 
-	if (f && f->ops == &pvr_queue_job_fence_ops)
-		return true;
-
-	return false;
+	return pvr_queue_fence_is_ufo_backed(f);
 }
 
 /**
Rename from done_fence to finished_fence, both because the function returns a drm_sched_fence's finished fence, and to avoid confusion with the job fence, which has the same name but a different purpose.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
 drivers/gpu/drm/imagination/pvr_job.c  | 8 ++++----
 drivers/gpu/drm/imagination/pvr_sync.c | 4 ++--
 drivers/gpu/drm/imagination/pvr_sync.h | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/imagination/pvr_job.c b/drivers/gpu/drm/imagination/pvr_job.c
index 0c2f511a6178..dd9f5df01e08 100644
--- a/drivers/gpu/drm/imagination/pvr_job.c
+++ b/drivers/gpu/drm/imagination/pvr_job.c
@@ -326,7 +326,7 @@ prepare_job_syncs(struct pvr_file *pvr_file,
 		  struct pvr_job_data *job_data,
 		  struct xarray *signal_array)
 {
-	struct dma_fence *done_fence;
+	struct dma_fence *finished_fence;
 	int err = pvr_sync_signal_array_collect_ops(signal_array,
 						    from_pvr_file(pvr_file),
 						    job_data->sync_op_count,
@@ -359,13 +359,13 @@ prepare_job_syncs(struct pvr_file *pvr_file,
 		return err;
 	}
 
-	/* We need to arm the job to get the job done fence. */
-	done_fence = pvr_queue_job_arm(job_data->job);
+	/* We need to arm the job to get the job finished fence. */
+	finished_fence = pvr_queue_job_arm(job_data->job);
 
 	err = pvr_sync_signal_array_update_fences(signal_array,
 						  job_data->sync_op_count,
 						  job_data->sync_ops,
-						  done_fence);
+						  finished_fence);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/imagination/pvr_sync.c b/drivers/gpu/drm/imagination/pvr_sync.c
index 757a18b1ab8f..936f840a5221 100644
--- a/drivers/gpu/drm/imagination/pvr_sync.c
+++ b/drivers/gpu/drm/imagination/pvr_sync.c
@@ -160,7 +160,7 @@ int
 pvr_sync_signal_array_update_fences(struct xarray *array,
 				    u32 sync_op_count,
 				    const struct drm_pvr_sync_op *sync_ops,
-				    struct dma_fence *done_fence)
+				    struct dma_fence *finished_fence)
 {
 	for (u32 i = 0; i < sync_op_count; i++) {
 		struct dma_fence *old_fence;
@@ -175,7 +175,7 @@ pvr_sync_signal_array_update_fences(struct xarray *array,
 			return -EINVAL;
 
 		old_fence = sig_sync->fence;
-		sig_sync->fence = dma_fence_get(done_fence);
+		sig_sync->fence = dma_fence_get(finished_fence);
 		dma_fence_put(old_fence);
 
 		if (WARN_ON(!sig_sync->fence))
diff --git a/drivers/gpu/drm/imagination/pvr_sync.h b/drivers/gpu/drm/imagination/pvr_sync.h
index db6ccfda104a..48501ad27794 100644
--- a/drivers/gpu/drm/imagination/pvr_sync.h
+++ b/drivers/gpu/drm/imagination/pvr_sync.h
@@ -70,7 +70,7 @@ int
 pvr_sync_signal_array_update_fences(struct xarray *array,
 				    u32 sync_op_count,
 				    const struct drm_pvr_sync_op *sync_ops,
-				    struct dma_fence *done_fence);
+				    struct dma_fence *finished_fence);
 
 void pvr_sync_signal_array_push_fences(struct xarray *array);
Mixed list of clarifications and typo fixes.
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
---
 drivers/gpu/drm/imagination/pvr_queue.c            | 38 ++++++++++++++--------
 .../gpu/drm/imagination/pvr_rogue_fwif_shared.h    | 10 +-----
 2 files changed, 26 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index 303f4d6cc09e..b5bec656d13c 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -200,6 +200,13 @@ pvr_queue_fence_is_ufo_backed(struct dma_fence *f)
  * already backed by a UFO.
  * @f: The dma_fence to turn into a pvr_queue_fence.
  *
+ * This could be called on:
+ * - a job fence directly, in which case it simply returns the containing pvr_queue_fence;
+ * - a drm_sched_fence's scheduled or finished fence, in which case it will first try to follow
+ *   the parent pointer to find the job fence (note that the parent pointer is initialized
+ *   only after the run_job() callback is called on the drm_sched_fence's owning job);
+ * - any other dma_fence, in which case it will return NULL.
+ *
  * Return:
  *  * A non-NULL pvr_queue_fence object if the dma_fence is backed by a UFO, or
  *  * NULL otherwise.
@@ -367,11 +374,14 @@ static u32 ufo_cmds_size(u32 elem_count)
 
 static u32 job_cmds_size(struct pvr_job *job, u32 ufo_wait_count)
 {
-	/* One UFO cmd for the fence signaling, one UFO cmd per native fence native,
-	 * and a command for the job itself.
+	/*
+	 * One UFO command per native fence this job will be waiting on (unless any are
+	 * signaled by the time the job is submitted), plus a command for the job itself,
+	 * plus one UFO command for the fence signaling.
 	 */
-	return ufo_cmds_size(1) + ufo_cmds_size(ufo_wait_count) +
-	       pvr_cccb_get_size_of_cmd_with_hdr(job->cmd_len);
+	return ufo_cmds_size(ufo_wait_count) +
+	       pvr_cccb_get_size_of_cmd_with_hdr(job->cmd_len) +
+	       ufo_cmds_size(1);
 }
 
 static bool
@@ -517,12 +527,16 @@ pvr_queue_get_paired_frag_job_dep(struct pvr_job *job)
 	if (!frag_job)
 		return NULL;
 
+	/* Have the geometry job wait on the paired fragment job's dependencies as well. */
 	xa_for_each(&frag_job->base.dependencies, index, f) {
 		/* Skip already signaled fences. */
 		if (dma_fence_is_signaled(f))
 			continue;
 
-		/* Skip our own fence. */
+		/*
+		 * The paired job fence won't be signaled until both jobs have
+		 * been submitted, so we can't wait on it to schedule them.
+		 */
 		if (f == &job->base.s_fence->scheduled)
 			continue;
 
@@ -665,6 +679,7 @@ static void pvr_queue_submit_job_to_cccb(struct pvr_job *job)
 		if (!jfence)
 			continue;
 
+		/* Some dependencies might have been signaled since prepare_job() */
 		if (dma_fence_is_signaled(&jfence->base))
 			continue;
 
@@ -714,7 +729,7 @@ static void pvr_queue_submit_job_to_cccb(struct pvr_job *job)
 	pvr_cccb_write_command_with_header(cccb, job->fw_ccb_cmd_type,
 					   job->cmd_len, job->cmd, job->id,
 					   job->id);
 
-	/* Signal the job fence. */
+	/* Update command to signal the job fence. */
 	pvr_fw_object_get_fw_addr(queue->timeline_ufo.fw_obj, &ufos[0].addr);
 	ufos[0].value = job->done_fence->seqno;
 	pvr_cccb_write_command_with_header(cccb, ROGUE_FWIF_CCB_CMD_TYPE_UPDATE,
@@ -744,10 +759,8 @@ static struct dma_fence *pvr_queue_run_job(struct drm_sched_job *sched_job)
 	}
 
 	/* The only kind of jobs that can be paired are geometry and fragment, and
-	 * we bail out early if we see a fragment job that's paired with a geomtry
-	 * job.
-	 * Paired jobs must also target the same context and point to the same
-	 * HWRT.
+	 * we bail out early if we see a fragment job that's paired with a geometry job.
+	 * Paired jobs must also target the same context and point to the same HWRT.
 	 */
 	if (WARN_ON(job->paired_job &&
 		    (job->type != DRM_PVR_JOB_TYPE_GEOMETRY ||
@@ -966,9 +979,8 @@ pvr_queue_signal_done_fences(struct pvr_queue *queue)
 	}
 
 /**
- * pvr_queue_check_job_waiting_for_cccb_space() - Check if the job waiting for CCCB space
- * can be unblocked
- * pushed to the CCCB
+ * pvr_queue_check_job_waiting_for_cccb_space() - Check if a job waiting for CCCB space
+ * can be unblocked and pushed to the CCCB.
  * @queue: Queue to check
  *
  * If we have a job waiting for CCCB, and this job now fits in the CCCB, we signal
diff --git a/drivers/gpu/drm/imagination/pvr_rogue_fwif_shared.h b/drivers/gpu/drm/imagination/pvr_rogue_fwif_shared.h
index 869d904e3649..fe54c1cad7a9 100644
--- a/drivers/gpu/drm/imagination/pvr_rogue_fwif_shared.h
+++ b/drivers/gpu/drm/imagination/pvr_rogue_fwif_shared.h
@@ -14,15 +14,7 @@
 
 #define ROGUE_NUM_GEOM_CORES_SIZE 2U
 
-/*
- * Maximum number of UFOs in a CCB command.
- * The number is based on having 32 sync prims (as originally), plus 32 sync
- * checkpoints.
- * Once the use of sync prims is no longer supported, we will retain
- * the same total (64) as the number of sync checkpoints which may be
- * supporting a fence is not visible to the client driver and has to
- * allow for the number of different timelines involved in fence merges.
- */
+/* Maximum number of UFOs in a CCB command. */
 #define ROGUE_FWIF_CCB_CMD_MAX_UFOS (32U + 32U)
 
 /*
On Mon, 2026-03-30 at 08:56 +0100, Alessio Belle wrote:
> The first two commits fix rare bugs and should be backported to stable branches.
>
> The rest is an attempt to cleanup and document the code to make it a bit easier to understand.
>
> Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
>
> Alessio Belle (8):
>   drm/imagination: Count paired job fence as dependency in prepare_job()
>   drm/imagination: Fit paired fragment job in the correct CCCB
>   drm/imagination: Skip check on paired job fence during job submission
>   drm/imagination: Rename pvr_queue_fence_is_ufo_backed() to reflect usage
>   drm/imagination: Rename fence returned by pvr_queue_job_arm()
>   drm/imagination: Move repeated job fence check to its own function
>   drm/imagination: Update check to skip prepare_job() for fragment jobs
>   drm/imagination: Minor improvements to job submission code documentation
>
>  drivers/gpu/drm/imagination/pvr_job.c              |   8 +-
>  drivers/gpu/drm/imagination/pvr_queue.c            | 154 +++++++++++++--------
>  drivers/gpu/drm/imagination/pvr_queue.h            |   2 +-
>  .../gpu/drm/imagination/pvr_rogue_fwif_shared.h    |  10 +-
>  drivers/gpu/drm/imagination/pvr_sync.c             |   8 +-
>  drivers/gpu/drm/imagination/pvr_sync.h             |   2 +-
>  6 files changed, 110 insertions(+), 74 deletions(-)
>
> base-commit: 3bce3fdd1ff2ba242f76ab66659fff27207299f1
> change-id: 20260330-job-submission-fixes-cleanup-83e01196c3e9
>
> Best regards,
> Alessio Belle <alessio.belle@imgtec.com>
For the whole series, Reviewed-by: Brajesh Gupta brajesh.gupta@imgtec.com
Thanks, Brajesh
On Mon, 30 Mar 2026 08:56:35 +0100, Alessio Belle wrote:
> The first two commits fix rare bugs and should be backported to stable branches.
>
> The rest is an attempt to cleanup and document the code to make it a bit easier to understand.
>
> [...]
Applied, thanks!
[1/8] drm/imagination: Count paired job fence as dependency in prepare_job()
      commit: 9cd74f935306cd857f46686975c43383e1d95f94
[2/8] drm/imagination: Fit paired fragment job in the correct CCCB
      commit: 4baf9e70cb756d78dd56419f8baee2978a72d0c3
[3/8] drm/imagination: Skip check on paired job fence during job submission
      commit: 18998b3cb7595850b8b2da55adb3fdc7aef8bc22
[4/8] drm/imagination: Rename pvr_queue_fence_is_ufo_backed() to reflect usage
      commit: c162e655092de8de2e0f7776d72919dd5e3b84f2
[5/8] drm/imagination: Rename fence returned by pvr_queue_job_arm()
      commit: 5dae1a21f1e7128a19c68212422383f700699d01
[6/8] drm/imagination: Move repeated job fence check to its own function
      commit: 402562e60c6c1b15ee359ba7ffed907baa886a99
[7/8] drm/imagination: Update check to skip prepare_job() for fragment jobs
      commit: 5c81eb2970133ad073214eb1e5b0c34a1ae793eb
[8/8] drm/imagination: Minor improvements to job submission code documentation
      commit: 62a36c2da774800bef893bc4bf8922fb9c07c1d0
Best regards,
On Mon, Mar 30, 2026 at 3:21 AM Alessio Belle alessio.belle@imgtec.com wrote:
> The first two commits fix rare bugs and should be backported to stable branches.
Yeap, that triggered on BeaglePlay v7.0.0, Mesa 26.1.0-rc1 and xserver 21.1.20-1 overnight.. testing now (and updating xserver to 21.1.22 as more glamor changes)..
https://gist.github.com/RobertCNelson/5e1dcf4c648a5bffaaf970c5a50e5c96
Regards,
> The rest is an attempt to cleanup and document the code to make it a bit easier to understand.
>
> Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
>
> Alessio Belle (8):
>   drm/imagination: Count paired job fence as dependency in prepare_job()
>   drm/imagination: Fit paired fragment job in the correct CCCB
>   drm/imagination: Skip check on paired job fence during job submission
>   drm/imagination: Rename pvr_queue_fence_is_ufo_backed() to reflect usage
>   drm/imagination: Rename fence returned by pvr_queue_job_arm()
>   drm/imagination: Move repeated job fence check to its own function
>   drm/imagination: Update check to skip prepare_job() for fragment jobs
>   drm/imagination: Minor improvements to job submission code documentation
>
>  drivers/gpu/drm/imagination/pvr_job.c              |   8 +-
>  drivers/gpu/drm/imagination/pvr_queue.c            | 154 +++++++++++++--------
>  drivers/gpu/drm/imagination/pvr_queue.h            |   2 +-
>  .../gpu/drm/imagination/pvr_rogue_fwif_shared.h    |  10 +-
>  drivers/gpu/drm/imagination/pvr_sync.c             |   8 +-
>  drivers/gpu/drm/imagination/pvr_sync.h             |   2 +-
>  6 files changed, 110 insertions(+), 74 deletions(-)
>
> base-commit: 3bce3fdd1ff2ba242f76ab66659fff27207299f1
> change-id: 20260330-job-submission-fixes-cleanup-83e01196c3e9
>
> Best regards,
> Alessio Belle <alessio.belle@imgtec.com>