A refcount issue can appear in __fwnode_link_del() due to the pr_debug() call:
  WARNING: CPU: 0 PID: 901 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110
  Call Trace:
  <TASK>
  ? refcount_warn_saturate+0xe5/0x110
  ? __warn+0x81/0x130
  ? refcount_warn_saturate+0xe5/0x110
  ? report_bug+0x191/0x1c0
  ? srso_alias_return_thunk+0x5/0x7f
  ? prb_read_valid+0x1b/0x30
  ? handle_bug+0x3c/0x80
  ? exc_invalid_op+0x17/0x70
  ? asm_exc_invalid_op+0x1a/0x20
  ? refcount_warn_saturate+0xe5/0x110
  kobject_get+0x68/0x70
  of_node_get+0x1e/0x30
  of_fwnode_get+0x28/0x40
  fwnode_full_name_string+0x34/0x90
  fwnode_string+0xdb/0x140
  vsnprintf+0x17b/0x630
  va_format.isra.0+0x71/0x130
  vsnprintf+0x17b/0x630
  vprintk_store+0x162/0x4d0
  ? srso_alias_return_thunk+0x5/0x7f
  ? srso_alias_return_thunk+0x5/0x7f
  ? srso_alias_return_thunk+0x5/0x7f
  ? try_to_wake_up+0x9c/0x620
  ? rwsem_mark_wake+0x1b2/0x310
  vprintk_emit+0xe4/0x2b0
  _printk+0x5c/0x80
  __dynamic_pr_debug+0x131/0x160
  ? srso_alias_return_thunk+0x5/0x7f
  __fwnode_link_del+0x25/0xa0
  fwnode_links_purge+0x39/0xb0
  of_node_release+0xd9/0x180
  kobject_put+0x7b/0x190
  ...
Indeed, when an of_node is destroyed, of_node_release() is called because the of_node refcount has reached 0. of_node_release() calls fwnode_links_purge() to purge the links, which ends up calling __fwnode_link_del(). __fwnode_link_del() calls pr_debug() to print the fwnodes (of_nodes) involved in the link, so this call is made while one of them is no longer available (i.e. the one related to the of_node_release() call).
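The sequence described above can be illustrated with a minimal userspace sketch. This is not kernel code: struct node, node_get(), node_put(), print_full_name() and node_release() are hypothetical stand-ins for the kobject/of_node machinery, and the bad_gets counter stands in for the point where the kernel would emit the refcount warning.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct node {
	const char *name;
	int refcount;
	int bad_gets;	/* number of gets attempted while refcount == 0 */
	void (*release)(struct node *n);
};

/* Take a reference. Returns 1 on success, 0 if the node is already dying. */
static int node_get(struct node *n)
{
	if (n->refcount == 0) {
		n->bad_gets++;	/* this is where the kernel would WARN */
		return 0;
	}
	n->refcount++;
	return 1;
}

/* Drop a reference and run the release callback when it reaches 0. */
static void node_put(struct node *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0)
		n->release(n);
}

/* Models a printer that takes a reference on the node it prints. */
static void print_full_name(struct node *n)
{
	int held = node_get(n);

	printf("%s\n", n->name);
	if (held)
		node_put(n);
}

/* Models a release path that logs the node it is tearing down. */
static void node_release(struct node *n)
{
	/* refcount is already 0 here, so the get inside the printer is invalid */
	print_full_name(n);
}
```

In this model, printing a live node leaves bad_gets at 0, whereas dropping the last reference runs node_release() and records exactly one invalid get, which mirrors the kobject_get() seen under __fwnode_link_del() in the trace.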
Remove the pr_debug() call to avoid using the link's fwnodes while one of those fwnodes is itself being destroyed.
Fixes: ebd6823af378 ("driver core: Add debug logs when fwnode links are added/deleted")
Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
---
 drivers/base/core.c | 2 --
 1 file changed, 2 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index f4b09691998e..62088c663014 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -109,8 +109,6 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
  */
 static void __fwnode_link_del(struct fwnode_link *link)
 {
-	pr_debug("%pfwf Dropping the fwnode link to %pfwf\n",
-		 link->consumer, link->supplier);
 	list_del(&link->s_hook);
 	list_del(&link->c_hook);
 	kfree(link);
On Fri, Nov 10, 2023 at 9:01 AM Herve Codina <herve.codina@bootlin.com> wrote:
> -	pr_debug("%pfwf Dropping the fwnode link to %pfwf\n",
> -		 link->consumer, link->supplier);
Valid issue, but a NACK for the patch.
The pr_debug has been very handy, so I don't want to delete it. Also, the fwnode link can't get deleted before the supplier/consumer. If it is, I need to take a closer look as I'd expect the list_del() to cause corruption. My guess is that the %pfwf is traversing stuff that's causing an issue. But let me take a closer look next week when I'll be at LPC.
-Saravana
>  	list_del(&link->s_hook);
>  	list_del(&link->c_hook);
>  	kfree(link);
> --
> 2.41.0
Hi Saravana,
On Fri, 10 Nov 2023 12:09:02 -0800 Saravana Kannan <saravanak@google.com> wrote:
> Valid issue, but a NACK for the patch.
>
> The pr_debug has been very handy, so I don't want to delete it. Also, the fwnode link can't get deleted before the supplier/consumer. If it is, I need to take a closer look as I'd expect the list_del() to cause corruption. My guess is that the %pfwf is traversing stuff that's causing an issue. But let me take a closer look next week when I'll be at LPC.
The issue is really related to printing the full name (%pfwf) of the node being destroyed by of_node_release() once its refcount has reached 0. The issue does not appear with %pfwP.
I looked at printk(). With %pfwf, fwnode_handle_{get,put}() is called on the current node and its parents, whereas %pfwP does not call fwnode_handle_{get,put}() on the current node.
A fix can probably be made at the printk() level to avoid the fwnode_handle_{get,put}() calls on the current node in the %pfwf case.
I will send a patch along those lines instead of removing the pr_debug() call in __fwnode_link_del().
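As a rough illustration of the idea only (a userspace sketch, not the actual printk() code and not the patch itself), the full name can be built from the root down while taking references on the parents only. Everything here is hypothetical: struct node, node_get(), node_put(), count_parents(), nth_parent() and full_name() are stand-ins made up for this example, and the refcounting is deliberately simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct node {
	const char *name;
	struct node *parent;
	int refcount;
	int bad_gets;	/* number of gets attempted while refcount == 0 */
};

/* Take a reference. Returns 1 on success, 0 if the node is already dying. */
static int node_get(struct node *n)
{
	if (n->refcount == 0) {
		n->bad_gets++;
		return 0;
	}
	n->refcount++;
	return 1;
}

static void node_put(struct node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

static int count_parents(const struct node *n)
{
	int count = 0;

	for (n = n->parent; n; n = n->parent)
		count++;
	return count;
}

/* Return the ancestor 'depth' levels up; depth 0 is the node itself. */
static struct node *nth_parent(struct node *n, int depth)
{
	while (depth-- > 0)
		n = n->parent;
	return n;
}

/* Build "/parent/child" into buf, referencing ancestors but not n itself. */
static void full_name(struct node *n, char *buf, size_t len)
{
	size_t used = 0;
	int depth;

	assert(len > 0);
	buf[0] = '\0';
	for (depth = count_parents(n); depth >= 0; depth--) {
		struct node *cur = nth_parent(n, depth);
		/* Only parents are referenced; n may have refcount == 0. */
		int held = depth ? node_get(cur) : 0;
		int w = snprintf(buf + used, len - used, "/%s", cur->name);

		if (held)
			node_put(cur);
		if (w < 0 || (size_t)w >= len - used)
			break;	/* output truncated or failed */
		used += (size_t)w;
	}
}
```

With a live parent and a child whose refcount is already 0, full_name() on the child still produces the complete path and never attempts a get on the dying node, which is the property the %pfwf path would need.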
Best regards,
Hervé