From: Tvrtko Ursulin
Sent: 19 July 2022 08:25
...
It's not only the TLB flushes that cause grief.
There is a loop that forces a write-back of all the frame buffer pages. With a large display and some CPUs (like my Ivy Bridge one), that takes long enough with pre-emption disabled that wakeup of RT processes (and any pinned to the CPU) takes far longer than one might have wished for.
Since some X servers request a flush every few seconds, this makes the system unusable for some workloads.
OK, TLB invalidations as discussed in this patch do not apply to Ivybridge. But what is the write-back loop you mention that is causing you grief? What size frame buffers are we talking about here? If they don't fit in the mappable area, we recently merged a patch* which improves things in that situation, but I'm not sure you are hitting exactly that.
I found the old email:
What I've found is that the Intel i915 graphics driver uses the 'events_unbound' kernel worker thread to periodically execute drm_clflush_sg(). (see https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/drm_cache.c)
I'm guessing this is to ensure that any writes to graphics memory become visible in a semi-timely manner.
This loop takes about 1us per iteration, split fairly evenly between whatever is in for_each_sg_page() and drm_clflush_page(). With a 2560x1440 display the loop count is 3600 (one iteration per 4k page at 4 bytes/pixel) and the whole function takes around 3.3ms.
IIRC the first few page flushes are quick (I bet they go into a FIFO) and then they all get slow. The flushes are actually requested from userspace.
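
For reference, the loop count quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes 4 KiB pages and a tightly packed linear frame buffer, and the helper names are made up for illustration; nothing here is taken from the driver.

```python
PAGE_SIZE = 4096  # bytes; assumption: 4 KiB pages, as on x86


def framebuffer_pages(width, height, bytes_per_pixel=4, page_size=PAGE_SIZE):
    """Pages needed to cover a packed linear frame buffer, rounded up."""
    size_bytes = width * height * bytes_per_pixel
    return -(-size_bytes // page_size)  # ceiling division


def flush_time_ms(pages, us_per_page=1.0):
    """Rough flush time if every page costs us_per_page microseconds."""
    return pages * us_per_page / 1000.0


pages = framebuffer_pages(2560, 1440)
print(pages)                 # 3600
print(flush_time_ms(pages))  # 3.6
```

At exactly 1us per page the estimate is 3.6ms; the measured figure of around 3.3ms corresponds to roughly 0.92us per page, which fits "about 1us per iteration".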
David