Quoting Sultan Alsawaf (2020-04-20 08:24:19)
Chris,
Could you please look at this in earnest? This is a real bug that crashes my laptop without any kind of provocation. It is undeniably a bug in i915, and I've clearly described it in my patch. If you don't like the patch, I'm open to any suggestions you have for an alternative solution. My goal here is to make i915 better, but it's difficult when communication only goes one way.
Hi Sultan,
The patch Chris pointed out was not part of the 5.4 release. Its commit message describes that it makes the functions tolerant of running simultaneously. In doing so, the zeroing of ring->vaddr is removed, so the test to do mdelay(1) and "ring->vaddr = NULL;" is not correct.
I think you might have used the wrong git command for checking the patch history:
$ git describe a266bf420060
v5.4-rc7-1996-ga266bf420060 # after -rc7 tag
$ git describe --contains a266bf420060
v5.6-rc1~34^2~21^2~326 # included in v5.6-rc1
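To illustrate the difference between the two forms, here is a minimal sketch on a throwaway repository. The tags v1.0 and v2.0, the placeholder identity and the temporary directory are made up for the example and have nothing to do with the kernel tree:

```shell
#!/bin/sh
# Illustration only: a throwaway repository with made-up tags v1.0 and v2.0.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

git commit -q --allow-empty -m "base"
git tag -a v1.0 -m "v1.0"                   # tag that precedes the fix
git commit -q --allow-empty -m "fix"
fix=$(git rev-parse HEAD)
git commit -q --allow-empty -m "later work"
git tag -a v2.0 -m "v2.0"                   # a tag that contains the fix

# Nearest tag reachable *from* the commit: what it was developed on top of.
before=$(git describe "$fix")
# Tag *from which* the commit is reachable: a release that includes it.
after=$(git describe --contains "$fix")

echo "git describe:            $before"
echo "git describe --contains: $after"
```

The first form should name the commit relative to v1.0 (the tag it sits on top of), the second relative to v2.0 (a tag that actually includes it), which mirrors the v5.4-rc7 versus v5.6-rc1 results above.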
And git log to double check:
$ git log --format=oneline kernel.org/stable/linux-5.4.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint"
$ git log --format=oneline kernel.org/stable/linux-5.5.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint"
0725d9a31869e6c80630e99da366ede2848295cc drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
$ git log --format=oneline kernel.org/stable/linux-5.6.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint"
a754012b9f2323a5d640da7eb7b095ac3b8cd012 drm/i915/execlists: Leave resetting ring to intel_ring
0725d9a31869e6c80630e99da366ede2848295cc drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
a266bf42006004306dd48a9082c35dfbff153307 drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
So it seems that the patch got pulled into v5.6 and has been backported to v5.5 but not v5.4.
Could you try applying the patch to 5.4 and seeing if the problem persists?
Regards,
Joonas