On Fri, Mar 17, 2023 at 10:59:26AM -0400, Waiman Long wrote:
On 3/17/23 08:27, Michal Koutný wrote:
On Tue, Mar 14, 2023 at 04:22:06PM -0400, Waiman Long <longman@redhat.com> wrote:
Some arm64 systems can have asymmetric CPUs where certain tasks are only runnable on a selected subset of CPUs.
Ah, I'm catching up.
This information is not captured in the cpuset. As a result, task_cpu_possible_mask() may return a mask that has no overlap with effective_cpus, causing new_cpus to become empty.
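To make the failure mode concrete, here is a minimal userspace sketch, assuming a toy model with one bit per CPU. The names toy_cpumask_t, toy_new_cpus() and toy_mask_empty() are hypothetical and are not the kernel's cpumask API; the sketch only illustrates how intersecting the two masks can leave nothing behind.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per CPU. This is NOT the kernel's struct cpumask. */
typedef uint64_t toy_cpumask_t;

/*
 * Hypothetical helper: the CPUs a task could actually use inside a cpuset,
 * i.e. the intersection of the cpuset's effective CPUs with the CPUs the
 * task is able to run on (what task_cpu_possible_mask() models).
 */
static toy_cpumask_t toy_new_cpus(toy_cpumask_t effective_cpus,
                                  toy_cpumask_t task_possible)
{
    return effective_cpus & task_possible;
}

/* True when no CPU is left in the mask. */
static bool toy_mask_empty(toy_cpumask_t mask)
{
    return mask == 0;
}
```

For example, with a cpuset restricted to CPUs 4-7 (0xF0) and a 32-bit task that can only run on CPUs 0-3 (0x0F), toy_new_cpus() returns 0, which is the empty new_cpus case described above.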
I can see that historically, there was an approach of terminating unaccommodatable tasks: 94f9c00f6460 ("arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores"). The removal of killing had been made possible with df950811f4a8 ("arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system").
That gives two other alternatives to affinity modification: 2) kill such tasks (not unlike OOM upon memory.max reduction), 3) reject cpuset reduction (violates cgroup v2 delegation).
What do you think about 2)?
Yes, killing it is one possible solution.
(3) doesn't work if the affinity change is due to hot CPU removal. So that leaves this patch or (2) as the only alternatives. I would like to hear what Will and Tejun think about it.
The main constraint from the Android side (the lucky ecosystem where these SoCs tend to show up) is that existing userspace (including 32-bit binaries) continues to function without modification. So approaches such as killing tasks or rejecting system calls tend not to work as well, since you inevitably get divergent behaviour leading to functional breakage rather than e.g. performance anomalies.
Having said that, the behaviour we currently have in mainline seems to be alright, so please don't go out of your way to accommodate these SoCs. I'm mainly just concerned about introducing any regressions, which is why I ran my tests on this series.
Cheers,
Will