Hi Juri,
On Wed, Dec 17, 2025 at 04:49:02PM +0100, Juri Lelli wrote:
Hi!
On 17/12/25 10:35, Andrea Righi wrote:
sched_ext currently suffers from starvation by RT tasks. The same workload, when converted to EXT, can get zero runtime if RT tasks are running 100% of the time, causing EXT processes to stall. Fix it by adding a DL server for EXT.
...
v4: - initialize EXT server bandwidth reservation at init time and
      always keep it active (Andrea Righi)
    - check for rq->nr_running == 1 to determine when to account idle
      time (Juri Lelli)
v3: - clarify that fair is not the only dl_server (Juri Lelli)
    - remove explicit stop to reduce timer reprogramming overhead
      (Juri Lelli)
    - do not restart pick_task() when it's invoked by the dl_server
      (Tejun Heo)
    - depend on CONFIG_SCHED_CLASS_EXT (Andrea Righi)
v2: - drop ->balance() now that pick_task() has an rf argument
      (Andrea Righi)
Tested-by: Christian Loehle <christian.loehle@arm.com>
Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
...
@@ -3090,6 +3123,15 @@ static void switching_to_scx(struct rq *rq, struct task_struct *p)
 static void switched_from_scx(struct rq *rq, struct task_struct *p)
 {
 	scx_disable_task(p);
+
+	/*
+	 * After class switch, if the DL server is still active, restart it so
+	 * that DL timers will be queued, in case SCX switched to higher class.
+	 */
+	if (dl_server_active(&rq->ext_server)) {
+		dl_server_stop(&rq->ext_server);
+		dl_server_start(&rq->ext_server);
+	}
 }
We might have discussed this already; in that case I forgot, sorry. But why do we need to start the server again when switching from scx? I couldn't make sense of the comment that is already present.
The intention was to restart the DL timers, but thinking more about it, this appears more harmful than helpful, as it may actually disrupt accounting.
I did a quick test without the restart and everything seems to work. I'll run more tests and I'll send an updated patch if everything works well without the restart.
Thanks!
-Andrea