From: Alexander Sverdlin alexander.sverdlin@siemens.com
The problem apparetly has been known since the conversion to raw_spin_lock() (commit 4dbada2be460 ("gpio: omap: use raw locks for locking")).
Symptom:
[ BUG: Invalid wait context ] 5.10.214 ----------------------------- swapper/1 is trying to lock: (enable_lock){....}-{3:3}, at: clk_enable_lock other info that might help us debug this: context-{5:5} 2 locks held by swapper/1: #0: (&dev->mutex){....}-{4:4}, at: device_driver_attach #1: (&bank->lock){....}-{2:2}, at: omap_gpio_set_config stack backtrace: CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.214 Hardware name: Generic AM33XX (Flattened Device Tree) unwind_backtrace show_stack __lock_acquire lock_acquire.part.0 _raw_spin_lock_irqsave clk_enable_lock clk_enable omap_gpio_set_config gpio_keys_setup_key gpio_keys_probe platform_drv_probe really_probe driver_probe_device device_driver_attach __driver_attach bus_for_each_dev bus_add_driver driver_register do_one_initcall do_initcalls kernel_init_freeable kernel_init
Problematic spin_lock_irqsave(&enable_lock, ...) is being called by clk_enable()/clk_disable() in omap2_set_gpio_debounce() and omap_clear_gpio_debounce().
For omap2_set_gpio_debounce() it's possible to move raw_spin_lock_irqsave(&bank->lock, ...) inside omap2_set_gpio_debounce() so that the locks nest as follows:
clk_enable(bank->dbck) raw_spin_lock_irqsave(&bank->lock, ...) raw_spin_unlock_irqrestore() clk_disable()
Two call-sites of omap_clear_gpio_debounce() are more convoluted, but one can take the advantage of the nesting nature of clk_enable()/clk_disable(), so that the inner clk_disable() becomes lockless:
clk_enable(bank->dbck) <-- only to clk_enable_lock() raw_spin_lock_irqsave(&bank->lock, ...) omap_clear_gpio_debounce() clk_disable() <-- becomes lockless raw_spin_unlock_irqrestore() clk_disable()
Cc: stable@vger.kernel.org Fixes: 4dbada2be460 ("gpio: omap: use raw locks for locking") Signed-off-by: Alexander Sverdlin alexander.sverdlin@siemens.com --- drivers/gpio/gpio-omap.c | 35 ++++++++++++++++++++++++++++++----- 1 file changed, 30 insertions(+), 5 deletions(-)
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c index 7ad4534054962..f9e502aa57753 100644 --- a/drivers/gpio/gpio-omap.c +++ b/drivers/gpio/gpio-omap.c @@ -181,6 +181,7 @@ static inline void omap_gpio_dbck_disable(struct gpio_bank *bank) static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset, unsigned debounce) { + unsigned long flags; u32 val; u32 l; bool enable = !!debounce; @@ -196,13 +197,18 @@ static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset,
l = BIT(offset);
+ /* + * Ordering is important here: clk_enable() calls spin_lock_irqsave(), + * therefore it must be outside of the following raw_spin_lock_irqsave() + */ clk_enable(bank->dbck); + raw_spin_lock_irqsave(&bank->lock, flags); + writel_relaxed(debounce, bank->base + bank->regs->debounce);
val = omap_gpio_rmw(bank->base + bank->regs->debounce_en, l, enable); bank->dbck_enable_mask = val;
- clk_disable(bank->dbck); /* * Enable debounce clock per module. * This call is mandatory because in omap_gpio_request() when @@ -217,6 +223,9 @@ static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset, bank->context.debounce_en = val; }
+ raw_spin_unlock_irqrestore(&bank->lock, flags); + clk_disable(bank->dbck); + return 0; }
@@ -647,6 +656,13 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) unsigned long flags; unsigned offset = d->hwirq;
+ /* + * Enable the clock here so that the nested clk_disable() in the + * following omap_clear_gpio_debounce() is lockless + */ + if (bank->dbck_flag) + clk_enable(bank->dbck); + raw_spin_lock_irqsave(&bank->lock, flags); bank->irq_usage &= ~(BIT(offset)); omap_set_gpio_triggering(bank, offset, IRQ_TYPE_NONE); @@ -656,6 +672,9 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) omap_clear_gpio_debounce(bank, offset); omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags); + + if (bank->dbck_flag) + clk_disable(bank->dbck); }
static void omap_gpio_irq_bus_lock(struct irq_data *data) @@ -827,6 +846,13 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) struct gpio_bank *bank = gpiochip_get_data(chip); unsigned long flags;
+ /* + * Enable the clock here so that the nested clk_disable() in the + * following omap_clear_gpio_debounce() is lockless + */ + if (bank->dbck_flag) + clk_enable(bank->dbck); + raw_spin_lock_irqsave(&bank->lock, flags); bank->mod_usage &= ~(BIT(offset)); if (!LINE_USED(bank->irq_usage, offset)) { @@ -836,6 +862,9 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags);
+ if (bank->dbck_flag) + clk_disable(bank->dbck); + pm_runtime_put(chip->parent); }
@@ -913,15 +942,11 @@ static int omap_gpio_debounce(struct gpio_chip *chip, unsigned offset, unsigned debounce) { struct gpio_bank *bank; - unsigned long flags; int ret;
bank = gpiochip_get_data(chip);
- raw_spin_lock_irqsave(&bank->lock, flags); ret = omap2_set_gpio_debounce(bank, offset, debounce); - raw_spin_unlock_irqrestore(&bank->lock, flags); - if (ret) dev_info(chip->parent, "Could not set line %u debounce to %u microseconds (%d)",
On Mon, Nov 25, 2024 at 9:05 AM A. Sverdlin alexander.sverdlin@siemens.com wrote:
From: Alexander Sverdlin alexander.sverdlin@siemens.com
The problem apparetly has been known since the conversion to raw_spin_lock() (commit 4dbada2be460 ("gpio: omap: use raw locks for locking")).
Symptom:
[ BUG: Invalid wait context ] 5.10.214
swapper/1 is trying to lock: (enable_lock){....}-{3:3}, at: clk_enable_lock other info that might help us debug this: context-{5:5} 2 locks held by swapper/1: #0: (&dev->mutex){....}-{4:4}, at: device_driver_attach #1: (&bank->lock){....}-{2:2}, at: omap_gpio_set_config stack backtrace: CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.214 Hardware name: Generic AM33XX (Flattened Device Tree) unwind_backtrace show_stack __lock_acquire lock_acquire.part.0 _raw_spin_lock_irqsave clk_enable_lock clk_enable omap_gpio_set_config gpio_keys_setup_key gpio_keys_probe platform_drv_probe really_probe driver_probe_device device_driver_attach __driver_attach bus_for_each_dev bus_add_driver driver_register do_one_initcall do_initcalls kernel_init_freeable kernel_init
Problematic spin_lock_irqsave(&enable_lock, ...) is being called by clk_enable()/clk_disable() in omap2_set_gpio_debounce() and omap_clear_gpio_debounce().
For omap2_set_gpio_debounce() it's possible to move raw_spin_lock_irqsave(&bank->lock, ...) inside omap2_set_gpio_debounce() so that the locks nest as follows:
clk_enable(bank->dbck) raw_spin_lock_irqsave(&bank->lock, ...) raw_spin_unlock_irqrestore() clk_disable()
Two call-sites of omap_clear_gpio_debounce() are more convoluted, but one can take the advantage of the nesting nature of clk_enable()/clk_disable(), so that the inner clk_disable() becomes lockless:
clk_enable(bank->dbck) <-- only to clk_enable_lock() raw_spin_lock_irqsave(&bank->lock, ...) omap_clear_gpio_debounce() clk_disable() <-- becomes lockless raw_spin_unlock_irqrestore() clk_disable()
Cc: stable@vger.kernel.org Fixes: 4dbada2be460 ("gpio: omap: use raw locks for locking") Signed-off-by: Alexander Sverdlin alexander.sverdlin@siemens.com
drivers/gpio/gpio-omap.c | 35 ++++++++++++++++++++++++++++++----- 1 file changed, 30 insertions(+), 5 deletions(-)
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c index 7ad4534054962..f9e502aa57753 100644 --- a/drivers/gpio/gpio-omap.c +++ b/drivers/gpio/gpio-omap.c @@ -181,6 +181,7 @@ static inline void omap_gpio_dbck_disable(struct gpio_bank *bank) static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset, unsigned debounce) {
unsigned long flags; u32 val; u32 l; bool enable = !!debounce;
@@ -196,13 +197,18 @@ static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset,
l = BIT(offset);
/*
* Ordering is important here: clk_enable() calls spin_lock_irqsave(),
* therefore it must be outside of the following raw_spin_lock_irqsave()
*/ clk_enable(bank->dbck);
raw_spin_lock_irqsave(&bank->lock, flags);
writel_relaxed(debounce, bank->base + bank->regs->debounce); val = omap_gpio_rmw(bank->base + bank->regs->debounce_en, l, enable); bank->dbck_enable_mask = val;
clk_disable(bank->dbck); /* * Enable debounce clock per module. * This call is mandatory because in omap_gpio_request() when
@@ -217,6 +223,9 @@ static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset, bank->context.debounce_en = val; }
raw_spin_unlock_irqrestore(&bank->lock, flags);
clk_disable(bank->dbck);
This part looks pretty clear to me.
return 0;
}
@@ -647,6 +656,13 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) unsigned long flags; unsigned offset = d->hwirq;
/*
* Enable the clock here so that the nested clk_disable() in the
* following omap_clear_gpio_debounce() is lockless
*/
if (bank->dbck_flag)
clk_enable(bank->dbck);
But this looks like a functional change. You effectively bump the clock enable count but don't add a corresponding clk_disable() in the affected path. Is the clock ever actually disabled then?
Am I not getting something?
Bart
raw_spin_lock_irqsave(&bank->lock, flags); bank->irq_usage &= ~(BIT(offset)); omap_set_gpio_triggering(bank, offset, IRQ_TYPE_NONE);
@@ -656,6 +672,9 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) omap_clear_gpio_debounce(bank, offset); omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags);
if (bank->dbck_flag)
clk_disable(bank->dbck);
}
static void omap_gpio_irq_bus_lock(struct irq_data *data) @@ -827,6 +846,13 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) struct gpio_bank *bank = gpiochip_get_data(chip); unsigned long flags;
/*
* Enable the clock here so that the nested clk_disable() in the
* following omap_clear_gpio_debounce() is lockless
*/
if (bank->dbck_flag)
clk_enable(bank->dbck);
raw_spin_lock_irqsave(&bank->lock, flags); bank->mod_usage &= ~(BIT(offset)); if (!LINE_USED(bank->irq_usage, offset)) {
@@ -836,6 +862,9 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags);
if (bank->dbck_flag)
clk_disable(bank->dbck);
pm_runtime_put(chip->parent);
}
@@ -913,15 +942,11 @@ static int omap_gpio_debounce(struct gpio_chip *chip, unsigned offset, unsigned debounce) { struct gpio_bank *bank;
unsigned long flags; int ret; bank = gpiochip_get_data(chip);
raw_spin_lock_irqsave(&bank->lock, flags); ret = omap2_set_gpio_debounce(bank, offset, debounce);
raw_spin_unlock_irqrestore(&bank->lock, flags);
if (ret) dev_info(chip->parent, "Could not set line %u debounce to %u microseconds (%d)",
-- 2.47.0
Hi Bartosz!
On Tue, 2024-11-26 at 21:37 +0100, Bartosz Golaszewski wrote:
@@ -647,6 +656,13 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) unsigned long flags; unsigned offset = d->hwirq;
+ /* + * Enable the clock here so that the nested clk_disable() in the + * following omap_clear_gpio_debounce() is lockless + */ + if (bank->dbck_flag) + clk_enable(bank->dbck);
But this looks like a functional change. You effectively bump the clock enable count but don't add a corresponding clk_disable() in the affected path. Is the clock ever actually disabled then?
Am I not getting something?
Actually I though I enable and disable them in the very same function, so for the first enable above...
raw_spin_lock_irqsave(&bank->lock, flags); bank->irq_usage &= ~(BIT(offset)); omap_set_gpio_triggering(bank, offset, IRQ_TYPE_NONE); @@ -656,6 +672,9 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) omap_clear_gpio_debounce(bank, offset); omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags);
+ if (bank->dbck_flag) + clk_disable(bank->dbck);
^^^^^^^^^^^^^^^^^^^^^^^^
this would be the corresponding disable.
}
static void omap_gpio_irq_bus_lock(struct irq_data *data) @@ -827,6 +846,13 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) struct gpio_bank *bank = gpiochip_get_data(chip); unsigned long flags;
+ /* + * Enable the clock here so that the nested clk_disable() in the + * following omap_clear_gpio_debounce() is lockless + */ + if (bank->dbck_flag) + clk_enable(bank->dbck);
^^^^^^^^^^^^^^^^^^^^^^ And for this second enable....
raw_spin_lock_irqsave(&bank->lock, flags); bank->mod_usage &= ~(BIT(offset)); if (!LINE_USED(bank->irq_usage, offset)) { @@ -836,6 +862,9 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags);
+ if (bank->dbck_flag) + clk_disable(bank->dbck);
^^^^^^^^^^^^^^^^^^^^^^^ ... this would be the corresponding 2nd disable.
pm_runtime_put(chip->parent); }
Aren't they paired from your PoV?
On Tue, Nov 26, 2024 at 9:59 PM Sverdlin, Alexander alexander.sverdlin@siemens.com wrote:
Hi Bartosz!
On Tue, 2024-11-26 at 21:37 +0100, Bartosz Golaszewski wrote:
@@ -647,6 +656,13 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) unsigned long flags; unsigned offset = d->hwirq;
/*
* Enable the clock here so that the nested clk_disable() in the
* following omap_clear_gpio_debounce() is lockless
*/
if (bank->dbck_flag)
clk_enable(bank->dbck);
But this looks like a functional change. You effectively bump the clock enable count but don't add a corresponding clk_disable() in the affected path. Is the clock ever actually disabled then?
Am I not getting something?
Actually I though I enable and disable them in the very same function, so for the first enable above...
raw_spin_lock_irqsave(&bank->lock, flags); bank->irq_usage &= ~(BIT(offset)); omap_set_gpio_triggering(bank, offset, IRQ_TYPE_NONE);
@@ -656,6 +672,9 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) omap_clear_gpio_debounce(bank, offset); omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags);
if (bank->dbck_flag)
clk_disable(bank->dbck);
^^^^^^^^^^^^^^^^^^^^^^^^
this would be the corresponding disable.
}
static void omap_gpio_irq_bus_lock(struct irq_data *data) @@ -827,6 +846,13 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) struct gpio_bank *bank = gpiochip_get_data(chip); unsigned long flags;
/*
* Enable the clock here so that the nested clk_disable() in the
* following omap_clear_gpio_debounce() is lockless
*/
if (bank->dbck_flag)
clk_enable(bank->dbck);
^^^^^^^^^^^^^^^^^^^^^^
And for this second enable....
raw_spin_lock_irqsave(&bank->lock, flags); bank->mod_usage &= ~(BIT(offset)); if (!LINE_USED(bank->irq_usage, offset)) {
@@ -836,6 +862,9 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags);
if (bank->dbck_flag)
clk_disable(bank->dbck);
^^^^^^^^^^^^^^^^^^^^^^^
... this would be the corresponding 2nd disable.
}pm_runtime_put(chip->parent);
Aren't they paired from your PoV?
Ok, I needed to take a long look at this driver.
IIUC what happens now:
In omap_gpio_irq_shutdown() and omap_gpio_free() you now enable the clock (really just bumping its enable count) before entering omap_clear_gpio_debounce() so that its internal call do clk_disable() only decreases the reference count without actually disabling it and the clock is really disabled one level up the stack.
If that's correct, isn't it unnecessarily convoluted? omap_clear_gpio_debounce() is only called from these two functions so why not just move the clk_disable() out of it and into them?
Bartosz
Thanks Bartosz for looking into it!
On Wed, 2024-11-27 at 11:04 +0100, Bartosz Golaszewski wrote:
@@ -827,6 +846,13 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) struct gpio_bank *bank = gpiochip_get_data(chip); unsigned long flags;
+ /* + * Enable the clock here so that the nested clk_disable() in the + * following omap_clear_gpio_debounce() is lockless + */ + if (bank->dbck_flag) + clk_enable(bank->dbck);
^^^^^^^^^^^^^^^^^^^^^^ And for this second enable....
raw_spin_lock_irqsave(&bank->lock, flags); bank->mod_usage &= ~(BIT(offset)); if (!LINE_USED(bank->irq_usage, offset)) { @@ -836,6 +862,9 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) omap_disable_gpio_module(bank, offset); raw_spin_unlock_irqrestore(&bank->lock, flags);
+ if (bank->dbck_flag) + clk_disable(bank->dbck);
^^^^^^^^^^^^^^^^^^^^^^^ ... this would be the corresponding 2nd disable.
pm_runtime_put(chip->parent); }
Aren't they paired from your PoV?
Ok, I needed to take a long look at this driver.
IIUC what happens now:
In omap_gpio_irq_shutdown() and omap_gpio_free() you now enable the clock (really just bumping its enable count) before entering omap_clear_gpio_debounce() so that its internal call do clk_disable() only decreases the reference count without actually disabling it and the clock is really disabled one level up the stack.
If that's correct, isn't it unnecessarily convoluted? omap_clear_gpio_debounce() is only called from these two functions so why not just move the clk_disable() out of it and into them?
Seems that I didn't notice the elephant in the room, so let me rework it ;-)
linux-stable-mirror@lists.linaro.org