On Thu, Nov 29, 2018 at 2:31 PM Linus Walleij linus.walleij@linaro.org wrote:
This addresses a bug in the irqchip core that was triggered by new code assuming a strict semantic order of the irqchip calls:
.irq_request_resources() .irq_enable() .irq_disable() .irq_release_resources()
Mostly this is working fine when handled inside manage.c, the calls are strictly happening in the above assumed order.
However on a Qualcomm platform I get the following in dmesg:
WARNING: CPU: 0 PID: 1 at ../drivers/gpio/gpiolib.c:3513 gpiochip_irq_enable+0x18/0x44 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4-00002-g1b8a75b25c6e-dirty #9 Hardware name: Generic DT based system [<c03124ac>] (unwind_backtrace) from [<c030d994>] (show_stack+0x10/0x14) [<c030d994>] (show_stack) from [<c0b48aec>] (dump_stack+0x78/0x8c) [<c0b48aec>] (dump_stack) from [<c03213e4>] (__warn+0xe0/0xf8) [<c03213e4>] (__warn) from [<c0321514>] (warn_slowpath_null+0x40/0x48) [<c0321514>] (warn_slowpath_null) from [<c06ad5d0>] (gpiochip_irq_enable+0x18/0x44) [<c06ad5d0>] (gpiochip_irq_enable) from [<c0371198>] (irq_enable+0x44/0x64) [<c0371198>] (irq_enable) from [<c0371258>] (__irq_startup+0xa0/0xa8) [<c0371258>] (__irq_startup) from [<c03712ac>] (irq_startup+0x4c/0x130) [<c03712ac>] (irq_startup) from [<c03716d0>] (irq_set_chained_handler_and_data+0x4c/0x78) [<c03716d0>] (irq_set_chained_handler_and_data) from [<c081c17c>] (pm8xxx_probe+0x160/0x22c) [<c081c17c>] (pm8xxx_probe) from [<c07f439c>] (platform_drv_probe+0x48/0x98)
What happens is that when the pm8xxx driver tries to register a chained IRQ irq_set_chained_handler_and_data() will eventually set the type and call irq_startup() and thus the .irq_enable() callback on the irqchip.
This irqchip is in drivers/pinctrl/qcom/pinctrl-msm.c and known as TLMM.
However, the irqchip core helpers in GPIO assumes that the .irq_request_resources() callback is always called before .irq_enable(), and the latter is where module refcount and also gpiochip_lock_as_irq() is normally called.
When .irq_enable() is called without .irq_request_resources() having been called first, it seems like we are enabling an IRQ on a GPIO line that has not first been locked to be used as IRQ and we get the above warning. This happens since as of commit 461c1a7d4733d ("gpiolib: override irq_enable/disable") this is strictly assumed for all GPIO-based IRQs.
As it seems reasonable to assume that .irq_request_resources() is always strictly called before .irq_enable(), we add the irq_[request|release]_resources() functions to the internal API and call them also when adding a chained irqchip to an IRQ.
I am a bit uncertain about the call site for irq_released_resources() inside chip.c, but it appears this path is for uninstalling a chained handler.
Cc: stable@vger.kernel.org Cc: Bjorn Andersson bjorn.andersson@linaro.org Cc: Hans Verkuil hverkuil@xs4all.nl Fixes: 461c1a7d4733d ("gpiolib: override irq_enable/disable") Signed-off-by: Linus Walleij linus.walleij@linaro.org
IRQ folks: did you have time to look at this?
It's a very real regression for me.
Yours, Linus Walleij