On Fri, 2024-12-13 at 09:14 -0500, Mathieu Desnoyers wrote:
On 2024-12-13 04:54, Gabriele Monaco wrote:
Currently, the task_mm_cid_work function is called in a task work triggered by a scheduler tick. This can delay the execution of the task for the entire duration of the function, negatively affecting the response of real time tasks.
This patch runs the task_mm_cid_work in a new delayed work connected to the mm_struct rather than in the task context before returning to userspace.
This delayed work is initialised while allocating the mm and disabled before freeing it, its execution is no longer triggered by scheduler ticks but run periodically based on the defined MM_CID_SCAN_DELAY.
The main advantage of this change is that the function can be offloaded to a different CPU and even preempted by RT tasks.
Moreover, this new behaviour could be more predictable in some situations since the delayed work is always scheduled with the same periodicity for each mm.
This last paragraph could be clarified. AFAIR, the problem with the preexisting approach based on the scheduler tick is with a mm consisting of a set of periodic threads, where none happen to run while the scheduler tick is running.
This would skip mm_cid compaction. So it's not a bug per se, because the mm_cid allocation will just be slightly less compact than it should be in that case.
The underlying question here is whether eventual convergence of mm_cid towards 0 when the number of threads or the allowed CPU mask are reduced in a mm should be guaranteed or only best effort.
If best effort, then this corner-case is not worthy of a "Fix" tag. Otherwise, we should identify which commit it fixes and introduce a "Fix" tag.
I will definitely make it clearer, but I'm also not sure if the patch is actually a fix for that. I wanted to mention it rather as a nice consequence of the change. The main purpose for us is that it solves latency issues in isolated environments.
From that point of view, it's still /fixing/ the latency spikes introduced by that commit, so perhaps it deserves the Fix tag anyway.
Let me know what you think about that.
I'm going to merge this patch with 2/4 and pull yours first in V3.
Thanks for the review Gabriele