Our syztester reports a lockdep WARNING [1], which was identified in stable kernel version 5.10. However, this deadlock path no longer exists since the console_lock refactoring in v6.2-rc1 [2]. We have found two types of deadlock here: one is the ABBA deadlock mentioned above [1], and the other is the AA deadlock reported by Breno [3]. The latter deadlock still persists.
To solve this problem, switch to printk_safe mode before printing the warning message. This redirects all printk()-s to a special per-CPU buffer, which is flushed later from a safe context (irq work), so the deadlock is avoided. The proper API to use is printk_deferred_enter()/printk_deferred_exit() [4].
[1] https://lore.kernel.org/all/20250730094914.566582-1-gubowen5@huawei.com/
[2] https://lore.kernel.org/all/20221116162152.193147-1-john.ogness@linutronix.d...
[3] https://lore.kernel.org/all/20250731-kmemleak_lock-v1-1-728fd470198f@debian....
[4] https://lore.kernel.org/all/5ca375cd-4a20-4807-b897-68b289626550@redhat.com/
====================
Signed-off-by: Gu Bowen <gubowen5@huawei.com>
---
 mm/kmemleak.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 84265983f239..26113b89d09b 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -437,9 +437,15 @@ static struct kmemleak_object *__lookup_object(unsigned long ptr, int alias,
 		else if (untagged_objp == untagged_ptr || alias)
 			return object;
 		else {
+			/*
+			 * Printk deferring due to the kmemleak_lock held.
+			 * This is done to avoid deadlock.
+			 */
+			printk_deferred_enter();
 			kmemleak_warn("Found object by alias at 0x%08lx\n",
 				      ptr);
 			dump_object_info(object);
+			printk_deferred_exit();
 			break;
 		}
 	}
@@ -736,6 +742,11 @@ static int __link_object(struct kmemleak_object *object, unsigned long ptr,
 		else if (untagged_objp + parent->size <= untagged_ptr)
 			link = &parent->rb_node.rb_right;
 		else {
+			/*
+			 * Printk deferring due to the kmemleak_lock held.
+			 * This is done to avoid deadlock.
+			 */
+			printk_deferred_enter();
 			kmemleak_stop("Cannot insert 0x%lx into the object search tree (overlaps existing)\n",
 				      ptr);
 			/*
@@ -743,6 +754,7 @@ static int __link_object(struct kmemleak_object *object, unsigned long ptr,
 			 * be freed while the kmemleak_lock is held.
 			 */
 			dump_object_info(parent);
+			printk_deferred_exit();
 			return -EEXIST;
 		}
 	}
@@ -858,8 +870,14 @@ static void delete_object_part(unsigned long ptr, size_t size,
 	object = __find_and_remove_object(ptr, 1, objflags);
 	if (!object) {
 #ifdef DEBUG
+		/*
+		 * Printk deferring due to the kmemleak_lock held.
+		 * This is done to avoid deadlock.
+		 */
+		printk_deferred_enter();
 		kmemleak_warn("Partially freeing unknown object at 0x%08lx (size %zu)\n",
 			      ptr, size);
+		printk_deferred_exit();
 #endif
 		goto unlock;
 	}
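For readers unfamiliar with the idea, the deferral described in the commit message above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: every name in it (model_lock, model_print(), model_deferred_enter() and so on) is hypothetical, a pthread mutex stands in for kmemleak_lock, and an explicit flush after unlock stands in for the irq work. It is not how printk implements deferral in the kernel.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_BUF_SIZE 1024

static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;
static char model_buf[MODEL_BUF_SIZE];
static size_t model_buf_len;
static int model_defer_depth;

static void model_deferred_enter(void) { model_defer_depth++; }
static void model_deferred_exit(void) { model_defer_depth--; }

/* While deferral is active, buffer the message instead of writing it. */
static void model_print(const char *msg)
{
	if (model_defer_depth > 0) {
		size_t room = MODEL_BUF_SIZE - 1 - model_buf_len;
		size_t n = strlen(msg);

		if (n > room)
			n = room;	/* truncate rather than overflow */
		memcpy(model_buf + model_buf_len, msg, n);
		model_buf_len += n;
		model_buf[model_buf_len] = '\0';
		return;
	}
	fputs(msg, stdout);
}

/* Stand-in for the later "safe context": write out what was buffered. */
static size_t model_flush(void)
{
	size_t n = model_buf_len;

	if (n)
		fputs(model_buf, stdout);
	model_buf_len = 0;
	model_buf[0] = '\0';
	return n;
}

/* Warn while holding the lock, but only emit output after dropping it. */
static size_t model_lookup_with_warning(void)
{
	pthread_mutex_lock(&model_lock);
	model_deferred_enter();
	model_print("Found object by alias\n");
	model_deferred_exit();
	pthread_mutex_unlock(&model_lock);

	return model_flush();	/* number of bytes written after unlock */
}
```

The point of the model is that the output path, which may itself take locks, never runs while model_lock is held; the message is only recorded there.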
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#opti...
Rule: add the tag "Cc: stable@vger.kernel.org" in the sign-off area to have the patch automatically included in the stable tree.
Subject: [PATCH v4] mm: Fix possible deadlock in kmemleak
Link: https://lore.kernel.org/stable/20250818090945.1003644-1-gubowen5%40huawei.co...
On Mon, Aug 18, 2025 at 05:09:44PM +0800, Gu Bowen wrote:
> Our syztester reports a lockdep WARNING [1], which was identified in stable kernel version 5.10. However, this deadlock path no longer exists since the console_lock refactoring in v6.2-rc1 [2]. We have found two types of deadlock here: one is the ABBA deadlock mentioned above [1], and the other is the AA deadlock reported by Breno [3]. The latter deadlock still persists.
It's better to include the lockdep warning here rather than linking to other threads. Also since we are targeting upstream with this patch, I don't think we should mention lockdep warnings for 5.10.
> To solve this problem, switch to printk_safe mode before printing the warning message. This redirects all printk()-s to a special per-CPU buffer, which is flushed later from a safe context (irq work), so the deadlock is avoided. The proper API to use is printk_deferred_enter()/printk_deferred_exit() [4].
> [1] https://lore.kernel.org/all/20250730094914.566582-1-gubowen5@huawei.com/
> [2] https://lore.kernel.org/all/20221116162152.193147-1-john.ogness@linutronix.d...
> [3] https://lore.kernel.org/all/20250731-kmemleak_lock-v1-1-728fd470198f@debian....
> [4] https://lore.kernel.org/all/5ca375cd-4a20-4807-b897-68b289626550@redhat.com/
> ====================
> Signed-off-by: Gu Bowen <gubowen5@huawei.com>
I suggest you add the 5.10 mention here if you want, text after "---" is normally stripped (well, not sure with Andrew's scripts).
Otherwise the patch looks fine.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
On Tue, 19 Aug 2025 16:30:51 +0100 Catalin Marinas <catalin.marinas@arm.com> wrote:
> > Signed-off-by: Gu Bowen <gubowen5@huawei.com>
> I suggest you add the 5.10 mention here if you want, text after "---" is normally stripped (well, not sure with Andrew's scripts).
Yes, I strip it. Although there's often useful stuff down there so I'll paste that into the changelog.
> Otherwise the patch looks fine.
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Thanks, I'll queue it for testing and add a note that a v5 is expected.
On 8/19/2025 11:30 PM, Catalin Marinas wrote:
> On Mon, Aug 18, 2025 at 05:09:44PM +0800, Gu Bowen wrote:
> > Our syztester reports a lockdep WARNING [1], which was identified in stable kernel version 5.10. However, this deadlock path no longer exists since the console_lock refactoring in v6.2-rc1 [2]. We have found two types of deadlock here: one is the ABBA deadlock mentioned above [1], and the other is the AA deadlock reported by Breno [3]. The latter deadlock still persists.
> It's better to include the lockdep warning here rather than linking to other threads. Also since we are targeting upstream with this patch, I don't think we should mention lockdep warnings for 5.10.
> > To solve this problem, switch to printk_safe mode before printing the warning message. This redirects all printk()-s to a special per-CPU buffer, which is flushed later from a safe context (irq work), so the deadlock is avoided. The proper API to use is printk_deferred_enter()/printk_deferred_exit() [4].
> > [1] https://lore.kernel.org/all/20250730094914.566582-1-gubowen5@huawei.com/
> > [2] https://lore.kernel.org/all/20221116162152.193147-1-john.ogness@linutronix.d...
> > [3] https://lore.kernel.org/all/20250731-kmemleak_lock-v1-1-728fd470198f@debian....
> > [4] https://lore.kernel.org/all/5ca375cd-4a20-4807-b897-68b289626550@redhat.com/
> > ====================
> I suggest you add the 5.10 mention here if you want, text after "---" is normally stripped (well, not sure with Andrew's scripts).
> Otherwise the patch looks fine.
Thank you for your advice, I will pay attention to these points in the future.
Best Regards,
Guber
On 8/18/25 5:09 AM, Gu Bowen wrote:
> Our syztester reports a lockdep WARNING [1], which was identified in stable kernel version 5.10. However, this deadlock path no longer exists since the console_lock refactoring in v6.2-rc1 [2]. We have found two types of deadlock here: one is the ABBA deadlock mentioned above [1], and the other is the AA deadlock reported by Breno [3]. The latter deadlock still persists.
> To solve this problem, switch to printk_safe mode before printing the warning message. This redirects all printk()-s to a special per-CPU buffer, which is flushed later from a safe context (irq work), so the deadlock is avoided. The proper API to use is printk_deferred_enter()/printk_deferred_exit() [4].
> [1] https://lore.kernel.org/all/20250730094914.566582-1-gubowen5@huawei.com/
> [2] https://lore.kernel.org/all/20221116162152.193147-1-john.ogness@linutronix.d...
> [3] https://lore.kernel.org/all/20250731-kmemleak_lock-v1-1-728fd470198f@debian....
> [4] https://lore.kernel.org/all/5ca375cd-4a20-4807-b897-68b289626550@redhat.com/
> ====================
> Signed-off-by: Gu Bowen <gubowen5@huawei.com>
> ---
>  mm/kmemleak.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 84265983f239..26113b89d09b 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -437,9 +437,15 @@ static struct kmemleak_object *__lookup_object(unsigned long ptr, int alias,
>  		else if (untagged_objp == untagged_ptr || alias)
>  			return object;
>  		else {
> +			/*
> +			 * Printk deferring due to the kmemleak_lock held.
> +			 * This is done to avoid deadlock.
> +			 */
> +			printk_deferred_enter();
>  			kmemleak_warn("Found object by alias at 0x%08lx\n",
>  				      ptr);
>  			dump_object_info(object);
> +			printk_deferred_exit();
>  			break;
>  		}
>  	}
> @@ -736,6 +742,11 @@ static int __link_object(struct kmemleak_object *object, unsigned long ptr,
>  		else if (untagged_objp + parent->size <= untagged_ptr)
>  			link = &parent->rb_node.rb_right;
>  		else {
> +			/*
> +			 * Printk deferring due to the kmemleak_lock held.
> +			 * This is done to avoid deadlock.
> +			 */
> +			printk_deferred_enter();
>  			kmemleak_stop("Cannot insert 0x%lx into the object search tree (overlaps existing)\n",
>  				      ptr);
>  			/*
> @@ -743,6 +754,7 @@ static int __link_object(struct kmemleak_object *object, unsigned long ptr,
>  			 * be freed while the kmemleak_lock is held.
>  			 */
>  			dump_object_info(parent);
> +			printk_deferred_exit();
>  			return -EEXIST;
>  		}
>  	}
> @@ -858,8 +870,14 @@ static void delete_object_part(unsigned long ptr, size_t size,
>  	object = __find_and_remove_object(ptr, 1, objflags);
>  	if (!object) {
>  #ifdef DEBUG
> +		/*
> +		 * Printk deferring due to the kmemleak_lock held.
> +		 * This is done to avoid deadlock.
> +		 */
> +		printk_deferred_enter();
>  		kmemleak_warn("Partially freeing unknown object at 0x%08lx (size %zu)\n",
>  			      ptr, size);
> +		printk_deferred_exit();
>  #endif
This particular warning message can be moved after unlock by adding a warning flag. Locking is done outside of the other two helper functions above, so it is easier to use printk_deferred_enter/exit() for those.
Anyway, it is just a nit.
Acked-by: Waiman Long <longman@redhat.com>
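The alternative raised in this reply for the delete_object_part() warning, recording a flag under the lock and printing only after unlock, might look roughly like the userspace sketch below. It is not a proposed patch: sketch_lock, sketch_find_and_remove() and sketch_delete_object_part() are hypothetical names, a pthread mutex stands in for kmemleak_lock, and fprintf() stands in for kmemleak_warn().

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for kmemleak_lock. */
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical stand-in for __find_and_remove_object(): for the purpose
 * of this sketch only a single address is "known".
 */
static bool sketch_find_and_remove(unsigned long ptr)
{
	return ptr == 0x1000UL;
}

/* Returns true if the warning was emitted. */
static bool sketch_delete_object_part(unsigned long ptr, size_t size)
{
	bool warn = false;

	pthread_mutex_lock(&sketch_lock);
	if (!sketch_find_and_remove(ptr))
		warn = true;	/* only record the condition under the lock */
	pthread_mutex_unlock(&sketch_lock);

	/* The message is printed with no lock held. */
	if (warn)
		fprintf(stderr,
			"Partially freeing unknown object at 0x%08lx (size %zu)\n",
			ptr, size);
	return warn;
}
```

This only works where the function itself takes and releases the lock, which is the distinction drawn above: the other two helpers are called with the lock already held, so deferring the printk is the simpler option there.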