On Sun, Aug 04 2024 at 20:28, Guenter Roeck wrote:
On 8/4/24 11:36, Guenter Roeck wrote:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    genirq: Set IRQF_COND_ONESHOT in request_irq()
With this patch in v6.10.3, all my parisc64 qemu tests get stuck with repeated error messages
[ 0.000000] =============================================================================
[ 0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16
[ 0.000000] -----------------------------------------------------------------------------
Do you have a full boot log? It's unclear to me at which point of the boot process this happens. Is this before or after the secondary CPUs have been brought up?
This never stops until the emulation aborts.
Do you have a recipe for reproducing this?
Reverting this patch fixes the problem for me.
I noticed a similar problem in the mainline kernel but it is either spurious there or the problem has been fixed.
As a follow-up, the patch below (on top of v6.10.3) "fixes" the problem for me. I guess that suggests some kind of race condition.
@@ -2156,6 +2157,8 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
 	struct irq_desc *desc;
 	int retval;
 
+	udelay(1);
+
 	if (irq == IRQ_NOTCONNECTED)
 		return -ENOTCONN;
That all makes absolutely no sense to me.
IRQF_COND_ONESHOT only has an effect on shared interrupts, and only when the interrupt was already requested with IRQF_ONESHOT.
If this is really a race then the following must be true:
1) no delay
   CPU0                           CPU1
   request_irq(IRQF_ONESHOT)
                                  request_irq(IRQF_COND_ONESHOT)
2) delay
   CPU0                           CPU1
                                  request_irq(IRQF_COND_ONESHOT)
   request_irq(IRQF_ONESHOT)
In this case the request on CPU 0 fails with -EBUSY ...
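The two orderings above can be checked against a small userspace model. This is only a sketch, loosely following the shared-interrupt flag comparison as described in this mail. It is not kernel code. The names model_irq_line, model_request_irq() and model_two_requests() are hypothetical, and the flag bit values are illustrative rather than the real ones from <linux/interrupt.h>. The model also ignores trigger types, locking and everything else the real request path does.

```c
/* Userspace model of the shared-IRQ ONESHOT flag check. NOT kernel code. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative bit values; the real ones live in <linux/interrupt.h>. */
#define IRQF_SHARED        0x01u
#define IRQF_ONESHOT       0x02u
#define IRQF_COND_ONESHOT  0x04u

/* Hypothetical stand-in for an interrupt line and its first handler. */
struct model_irq_line {
	int requested;        /* nonzero once a handler is installed */
	unsigned int flags;   /* flags of the first installed handler */
};

/*
 * Model of a request on a possibly shared line.
 * Returns 0 on success or -EBUSY on a flag mismatch.
 */
int model_request_irq(struct model_irq_line *line, unsigned int flags)
{
	assert(line != NULL);

	if (!line->requested) {
		/* First requester: nothing to compare against. */
		line->requested = 1;
		line->flags = flags;
		return 0;
	}

	/* Both parties must agree on sharing the line. */
	if (!((line->flags & flags) & IRQF_SHARED))
		return -EBUSY;

	if ((line->flags & IRQF_ONESHOT) && (flags & IRQF_COND_ONESHOT))
		flags |= IRQF_ONESHOT;  /* conditional flag upgrades the request */
	else if ((line->flags ^ flags) & IRQF_ONESHOT)
		return -EBUSY;          /* ONESHOT mismatch between the two */

	return 0;
}

/* Issue two requests in order on a fresh line and report both results. */
void model_two_requests(unsigned int first, unsigned int second,
			int *ret_first, int *ret_second)
{
	struct model_irq_line line = { 0, 0 };

	*ret_first = model_request_irq(&line, first);
	*ret_second = model_request_irq(&line, second);
}
```

In this model, calling model_two_requests() with IRQF_SHARED | IRQF_ONESHOT first and IRQF_SHARED | IRQF_COND_ONESHOT second (case 1) lets both requests succeed. Swapping the order (case 2) makes the later IRQF_ONESHOT request return -EBUSY. Neither ordering in the model produces anything other than a clean success or a clean failure.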
Confused
tglx