On Tue, Jan 14, 2025 at 07:30:31PM +0100, Daniel Lezcano wrote:
On 13/01/2025 14:27, Nícolas F. R. A. Prado wrote:
In order to get working interrupts, a low offset value needs to be configured. The minimum value for it is 20 Celsius, which is what is configured when there's no lower thermal trip (i.e. the thermal core passes -INT_MAX as the low trip temperature). However, when the temperature gets that low and fluctuates around that value, it causes an interrupt storm.
Is it really about an irq storm, or about having a temperature threshold set close to the ambient temperature, thus leading to unnecessary wakeups as there is no need for mitigation?
Yes, that's what I mean. The IRQ threshold gets configured to 20C, so whenever the temperature drops below that value, the IRQ gets triggered. But this usually does not happen just once. The 20C threshold is set by the driver, so from the thermal framework's perspective there is no thermal threshold configured at 20C; the framework thinks the low trip is -INT_MAX. As a result, the threshold doesn't get moved after the trigger, and the IRQ ends up triggering hundreds or thousands of times in a short span of time, which is why I call it an interrupt storm.
Prevent that interrupt storm by not enabling the low offset interrupt if the low threshold is the minimum one.
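For illustration only, the idea can be sketched in standalone C roughly as follows. This is not the actual driver code: `sketch_config_low`, `struct sketch_low_cfg` and `SKETCH_LOW_OFFSET_MIN_MC` are hypothetical names, and the use of millidegrees Celsius and the clamping of low trips below 20C are assumptions made for the example.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical constant: the 20 C minimum low offset, in millidegrees Celsius. */
#define SKETCH_LOW_OFFSET_MIN_MC 20000

/* Hypothetical result type: what would be programmed for the low side. */
struct sketch_low_cfg {
	int offset_mc;    /* value that would be written as the low offset */
	bool irq_enabled; /* whether the low offset interrupt would be enabled */
};

/*
 * low_trip_mc is the low trip passed by the thermal core; -INT_MAX means
 * "no lower trip". Assumption: anything below the minimum is clamped to it.
 */
static struct sketch_low_cfg sketch_config_low(int low_trip_mc)
{
	struct sketch_low_cfg cfg;

	/* A low offset must always be configured, so clamp to the minimum. */
	cfg.offset_mc = low_trip_mc < SKETCH_LOW_OFFSET_MIN_MC ?
			SKETCH_LOW_OFFSET_MIN_MC : low_trip_mc;

	/*
	 * Only enable the low offset interrupt when the programmed value is
	 * above the minimum, i.e. when it matches a trip the framework knows
	 * about and will move after it fires.
	 */
	cfg.irq_enabled = cfg.offset_mc > SKETCH_LOW_OFFSET_MIN_MC;

	return cfg;
}
```

With this sketch, a call with -INT_MAX still yields a 20C offset but leaves the interrupt disabled, while a real low trip such as 45C keeps the interrupt enabled.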
The case where the high threshold is INT_MAX should be handled too. The system may have configured a thermal zone without critical trip points, so setting the next upper threshold will program the register with INT_MAX. I guess it is undefined behavior in this case, right?
Ah, yes, I don't think I've tested that before... I'll test it and send a fix if needed.
Thanks, Nícolas