On 2022-04-11 09:28, Greg KH wrote:
On Mon, Apr 11, 2022 at 09:18:19AM +0200, Holger Hoffstätte wrote:
On 2022-04-11 01:22, Holger Hoffstätte wrote:
On 2022-04-11 00:06, Qais Yousef wrote:
On 04/10/22 00:38, Qais Yousef wrote:
On 03/08/22 18:51, Qais Yousef wrote:
On 03/08/22 19:10, Greg KH wrote:
> On Tue, Mar 08, 2022 at 06:02:40PM +0000, Qais Yousef wrote:
>> +CC stable
>>
>> On 03/01/22 15:24, tip-bot2 for Valentin Schneider wrote:
>>> The following commit has been merged into the sched/core branch of tip:
>>>
>>> Commit-ID: fa2c3254d7cfff5f7a916ab928a562d1165f17bb
>>> Gitweb: https://git.kernel.org/tip/fa2c3254d7cfff5f7a916ab928a562d1165f17bb
>>> Author: Valentin Schneider <valentin.schneider@arm.com>
>>> AuthorDate: Thu, 20 Jan 2022 16:25:19
>>> Committer: Peter Zijlstra <peterz@infradead.org>
>>> CommitterDate: Tue, 01 Mar 2022 16:18:39 +01:00
>>>
>>> sched/tracing: Don't re-read p->state when emitting sched_switch event
>>>
>>> As of commit
>>>
>>> c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
>>>
>>> the following sequence becomes possible:
>>>
>>> p->__state = TASK_INTERRUPTIBLE;
>>> __schedule()
>>> deactivate_task(p);
>>> ttwu()
>>> READ !p->on_rq
>>> p->__state=TASK_WAKING
>>> trace_sched_switch()
>>> __trace_sched_switch_state()
>>> task_state_index()
>>> return 0;
>>>
>>> TASK_WAKING isn't in TASK_REPORT, so the task appears as TASK_RUNNING in
>>> the trace event.
>>>
>>> Prevent this by pushing the value read from __schedule() down the trace
>>> event.
>>>
>>> Reported-by: Abhijeet Dharmapurikar <adharmap@quicinc.com>
>>> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
>>> Link: https://lore.kernel.org/r/20220120162520.570782-2-valentin.schneider@arm.com
>>
>> Any objection to picking this for stable? I'm interested in this one for some
>> Android users but prefer if it can be taken by stable rather than backport it
>> individually.
>>
>> I think it makes sense to pick the next one in the series too.
>
> What commit does this fix in Linus's tree?
It should be this one: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
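For readers who want to see the misreporting from the commit message in isolation, here is a rough userspace sketch. It is not the kernel code: the constants are assumed to mirror include/linux/sched.h around v5.10, and fls_sketch(), state_index_sketch(), reported_before() and reported_after() are hypothetical names made up for illustration.

```c
#include <stdio.h>

/* Assumed values, modelled on include/linux/sched.h around v5.10. */
#define TASK_RUNNING        0x0000u
#define TASK_INTERRUPTIBLE  0x0001u
#define TASK_WAKING         0x0200u
#define TASK_REPORT         0x007fu	/* assumed mask of reportable state bits */

/* Hypothetical stand-in for fls(): 1-based index of the highest set bit, 0 if none. */
static unsigned int fls_sketch(unsigned int x)
{
	unsigned int idx = 0;

	while (x) {
		idx++;
		x >>= 1;
	}
	return idx;
}

/* Simplified model of task_state_index(): only TASK_REPORT bits are visible. */
static unsigned int state_index_sketch(unsigned int state)
{
	return fls_sketch(state & TASK_REPORT);
}

/*
 * Before the patch: the trace path re-reads the task state, which a
 * concurrent ttwu() may already have changed to TASK_WAKING.
 */
static unsigned int reported_before(unsigned int state_at_schedule,
				    unsigned int state_at_trace)
{
	(void)state_at_schedule;	/* the snapshot is not used */
	return state_index_sketch(state_at_trace);
}

/* After the patch: the value read in __schedule() is passed down instead. */
static unsigned int reported_after(unsigned int state_at_schedule,
				   unsigned int state_at_trace)
{
	(void)state_at_trace;		/* the racy re-read is gone */
	return state_index_sketch(state_at_schedule);
}
```

With these assumed values, reported_before(TASK_INTERRUPTIBLE, TASK_WAKING) yields index 0, the same as TASK_RUNNING, while reported_after() with the same arguments yields index 1.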
Should this be okay to be picked up by stable now? I can see AUTOSEL has picked it up for v5.15+, but it impacts v5.10 too.
commit: fa2c3254d7cfff5f7a916ab928a562d1165f17bb
subject: sched/tracing: Don't re-read p->state when emitting sched_switch event
This patch matters to Android 5.10 users, who are seeing tooling breakage without it. Is it possible to include it in 5.10 LTS please?
It was already picked up for 5.15+ by AUTOSEL and only 5.10 is missing.
https://lore.kernel.org/stable/Yk2PQzynOVOzJdPo@kroah.com/
However, since then further investigation (still in progress) has shown that this may have been the fault of the tool in question, so if you can verify that tracing sched still works for you with this patch in 5.15.x then by all means let's merge it.
So it turns out the lockup is indeed the fault of the tool, which contains multiple kernel-version dependent tracepoint definitions and now fails with this patch.
What tool is this?
sysdig - which uses a helper kernel module which accesses tracepoints, but of course (as I just found) with copypasta'd TP definitions, which broke with this patch due to the additional parameter in the function signature. It's been prone to breakage forever because of a lack of a stable kernel ABI.
Took me a while to find/figure out, but IMHO better safe than sorry. We've had autoselected scheduler patches before that looked fine but really were not.
Greg, please re-enqueue this patch where necessary (5.10, 5.15+)
If I queue it up again, will the tools keep breaking?
Yes, but that's their problem with an out-of-tree module; a few more #ifdefs are not going to make a big difference.
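To illustrate the kind of #ifdef meant here, a minimal userspace sketch follows. It is not sysdig's actual code: HAVE_SCHED_SWITCH_PREV_STATE is a hypothetical switch that the module's build system would have to set for kernels carrying the patch (a plain version check cannot see stable backports on its own), struct task_sketch stands in for struct task_struct, and the position of prev_state in the argument list is an assumption based on the patch discussed in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct task_struct. */
struct task_sketch {
	int pid;
};

/* Shared handler body; returns the pid being switched to. */
static int handle_switch(const struct task_sketch *prev,
			 const struct task_sketch *next)
{
	(void)prev;
	return next->pid;
}

#ifdef HAVE_SCHED_SWITCH_PREV_STATE
/* Kernels with the patch: the probe receives the extra prev_state argument. */
static int sched_switch_probe(bool preempt, unsigned int prev_state,
			      const struct task_sketch *prev,
			      const struct task_sketch *next)
{
	(void)preempt;
	(void)prev_state;
	return handle_switch(prev, next);
}
#else
/* Kernels without the patch: the original three-argument signature. */
static int sched_switch_probe(bool preempt,
			      const struct task_sketch *prev,
			      const struct task_sketch *next)
{
	(void)preempt;
	return handle_switch(prev, next);
}
#endif
```

Only the signature differs between the two branches, so the handler logic stays in one place. A copied tracepoint definition that misses the new branch ends up with a probe whose arguments no longer line up with what the kernel passes.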
thanks
Holger