On 7/31/24 11:25, Laura Nao wrote:
> It looks like sleepgraph.py is more focused on analyzing suspend/resume timings, while bootgraph.py measures boot time using the kernel log and ftrace. The latter might indeed come in handy. As far as I can see, the script doesn't support automatic detection of boot slowdowns, and the output is in HTML format, which is meant for human analysis. However, I can look into adding support for a more machine-readable output format too. The test proposed in this patch could then use bootgraph.py to generate the reference file and measure current boot timings.
> I'll look into this and report back.
After examining the bootgraph.py script, it seems feasible to add support for generating the output in a machine-readable format (e.g., JSON) for automated analysis. Todd, I've CC'd you on this discussion in case you have feedback on possibly using bootgraph.py in an automated test to detect slowdowns.
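To make the idea of a machine-readable output more concrete, here is a rough sketch of what extracting initcall timings from the kernel log into JSON could look like. This is not bootgraph.py code: the function names and the JSON layout are made up for illustration, and it assumes the kernel was booted with initcall_debug so that the "initcall ... returned ... after ... usecs" lines are present in the log.

```python
import json
import re

# Matches the lines printed by the kernel when booted with initcall_debug, e.g.
#   initcall foo_init+0x0/0x40 returned 0 after 530 usecs
# An optional " [module]" suffix after the symbol is tolerated.
INITCALL_RE = re.compile(
    r"initcall (?P<sym>\S+)(?: \[[^\]]+\])? "
    r"returned (?P<ret>-?\d+) after (?P<usecs>\d+) usecs"
)


def parse_initcalls(log_text):
    """Return {initcall name: duration in usecs} parsed from a kernel log."""
    timings = {}
    for line in log_text.splitlines():
        match = INITCALL_RE.search(line)
        if match is None:
            continue
        # Drop the "+0xoffset/0xsize" part so only the symbol name remains.
        name = match.group("sym").split("+")[0]
        timings[name] = int(match.group("usecs"))
    return timings


def initcalls_to_json(log_text):
    """Serialize the parsed timings; the layout here is hypothetical."""
    return json.dumps({"initcalls": parse_initcalls(log_text)},
                      indent=2, sort_keys=True)
```

A test could feed the output of dmesg to initcalls_to_json() once to produce the reference file, and again on later boots to get the current timings.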
Some points to consider:
- The bootgraph.py script supports ftrace through the -fstat and -ftrace options, and it parses the kernel log to get initcall timings. To use this in an automated test, we need a way to provide the kernel command line options the script depends on. One approach is to include these options in a bootconfig file embedded in the kernel image (as proposed in this RFC). Shuah, do you think this is acceptable? I haven't seen other tests doing this, so I'm unsure whether this is a proper way to handle required command line options in a selftest.
- The bootgraph.py script tracks timings for all init calls, which might be excessive and generate too much output when integrated in an automated test. We might need to limit the test output to report only significant slowdowns to make it manageable.
- I'd like to get some feedback on which key boot process events are most relevant to track; depending on this, we could use the bootgraph.py script to monitor initcalls and possibly other events tracked via ftrace. The script currently uses the function_graph tracer, and its parser is designed for this tracer's output. If we need to track other events (e.g., kprobe events), the parser might need some adjustments.
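As a sketch of how the output could be limited to significant slowdowns, the comparison against the reference might look roughly like the following. The function name, the input format (a plain mapping from initcall name to duration in usecs) and both thresholds are placeholders picked for illustration, not something taken from bootgraph.py or from the patch.

```python
def find_slowdowns(reference, current, min_delta_usecs=1000, min_ratio=1.5):
    """Return (name, reference usecs, current usecs) for significant slowdowns.

    An initcall is reported only if it got slower by at least
    min_delta_usecs AND by at least a factor of min_ratio, so that small
    absolute or relative variations do not flood the test output.
    Both thresholds are arbitrary defaults chosen for this example.
    """
    slowdowns = []
    for name, cur in current.items():
        ref = reference.get(name)
        if ref is None:
            # Not in the reference (e.g. a newly added initcall): skip it.
            continue
        if cur - ref >= min_delta_usecs and cur >= ref * min_ratio:
            slowdowns.append((name, ref, cur))
    # Report the largest absolute regressions first.
    slowdowns.sort(key=lambda entry: entry[2] - entry[1], reverse=True)
    return slowdowns
```

Requiring both an absolute and a relative threshold is one way to keep very short initcalls (where a few usecs is a large ratio) and very long ones (where a small ratio is many usecs) from producing noise; the right values would need to be tuned per platform.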
I'll be discussing this at LPC in September (https://lpc.events/event/18/contributions/1700/) and look forward to exploring more details and alternative approaches for an automated boot time test.
Best,
Laura