On 6/10/23 14:14, Guenter Roeck wrote:
Hi,
On 6/10/23 12:23, Pavel Machek wrote:
Hi!
Build results:
	total: 155 pass: 155 fail: 0
Qemu test results:
	total: 499 pass: 498 fail: 1
Failed tests:
	arm:kudo-bmc:multi_v7_defconfig:npcm:usb0.1:nuvoton-npcm730-kudo:rootfs
The test failure is spurious and not new. I observe it randomly on multi_v7_defconfig builds, primarily on npcm platforms. There is no error message, just a stalled boot. I have been trying to bisect for a while, but I have not been successful so far. No immediate concern; I just wanted to mention it in case someone else hits the same or a similar problem.
I managed to revise my bisect script well enough to get reliable results. It looks like the culprit is commit 503e554782c9 ("debugobject: Ensure pool refill (again)"); see bisect log below. Bisecting on four different systems gave the same result each time. After reverting this patch, I no longer see the problem (again, confirmed on four different systems). If anyone has an idea how to debug this, please let me know. I'll be happy to give it a try.
You may want to comment out debug_objects_fill_pool() in debug_object_activate or debug_object_assert_init to see which one is causing the failure...
CONFIG_PREEMPT_RT is disabled for you, right? (Should 5.15 even have that option?)
CONFIG_PREEMPT_RT is disabled (it depends on ARCH_SUPPORTS_RT which is not enabled by any architecture in v5.15.y).
The added call in debug_object_activate() triggers the problem. Any idea what to do about it or how to debug it further?
I did some more debugging. The call to debug_object_activate() from debug_hrtimer_activate() causes the immediate problem, and the call from debug_timer_activate() causes a second (less likely) problem, where the stall is seen during reboot.
In other words, the problem is (only) seen if DEBUG_OBJECTS_TIMERS is enabled.
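For a comparison build, the observation above could be checked with a configuration fragment along these lines. This is only a sketch: it assumes the usual Kconfig symbol names from lib/Kconfig.debug, and the rest of the test configuration is not shown in this thread, so it is left out here.

```
# Assumption: keep the debugobjects core enabled so that only the
# timer/hrtimer tracking is removed from the picture.
CONFIG_DEBUG_OBJECTS=y
# Assumption: with timer tracking disabled, debug_hrtimer_activate() and
# debug_timer_activate() should no longer reach debug_object_activate().
# CONFIG_DEBUG_OBJECTS_TIMERS is not set
```

If the stall disappears with this fragment and returns when CONFIG_DEBUG_OBJECTS_TIMERS=y is set again, that would be consistent with the result described above.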
Guenter