Hello,
On 08/04/25 04:50, Harshit Agarwal wrote:
When a CPU chooses to call push_dl_task and picks a task to push to another CPU's runqueue then it will call find_lock_later_rq method which would take a double lock on both CPUs' runqueues. If one of the locks aren't readily available, it may lead to dropping the current runqueue lock and reacquiring both the locks at once. During this window it is possible that the task is already migrated and is running on some other CPU. These cases are already handled. However, if the task is migrated and has already been executed and another CPU is now trying to wake it up (ttwu) such that it is queued again on the runqeue (on_rq is 1) and also if the task was run by the same CPU, then the current checks will pass even though the task was migrated out and is no longer in the pushable tasks list. Please go through the original rt change for more details on the issue.
To fix this, after the lock is obtained inside the find_lock_later_rq, it ensures that the task is still at the head of pushable tasks list. Also removed some checks that are no longer needed with the addition of this new check. However, the new check of pushable tasks list only applies when find_lock_later_rq is called by push_dl_task. For the other caller i.e. dl_task_offline_migration, existing checks are used.
Signed-off-by: Harshit Agarwal harshit@nutanix.com Cc: stable@vger.kernel.org
The new version looks good to me.
Some final minor touches to changelog/comment might still be required, but Peter maybe you can do it if/when picking up the change?
Anyway,
Acked-by: Juri Lelli juri.lelli@redhat.com
Thanks! Juri