On Wed, Sep 06, 2023 at 08:23:18PM +0100, Lorenzo Stoakes wrote:
On Tue, 5 Sept 2023 at 12:47, Joel Fernandes joel@joelfernandes.org wrote:
On Tue, Sep 05, 2023 at 08:09:16AM +0100, Lorenzo Stoakes wrote:
On Mon, Sep 04, 2023 at 06:08:04PM +0000, Joel Fernandes (Google) wrote:
It is unsafe to dump vmalloc area information when trying to do so from some contexts. Add a safer trylock version of the same function to do a best-effort VMA finding and use it from vmalloc_dump_obj().
It'd be nice to have more details as to precisely which contexts and what this resolves.
True. I was hoping the 'trylock' mention would be sufficient (example hardirq context interrupting a lock-held region) but you're right.
[applied test robot feedback on unused function fix.] [applied Uladzislau feedback on locking.]
Reported-by: Zhen Lei thunder.leizhen@huaweicloud.com Cc: Paul E. McKenney paulmck@kernel.org Cc: rcu@vger.kernel.org Cc: Zqiang qiang.zhang1211@gmail.com Reviewed-by: Uladzislau Rezki (Sony) urezki@gmail.com Fixes: 98f180837a89 ("mm: Make mem_dump_obj() handle vmalloc() memory") Cc: stable@vger.kernel.org Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
mm/vmalloc.c | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 93cf99aba335..2c6a0e2ff404 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -4274,14 +4274,32 @@ void pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms) #ifdef CONFIG_PRINTK bool vmalloc_dump_obj(void *object) {
- struct vm_struct *vm; void *objp = (void *)PAGE_ALIGN((unsigned long)object);
- const void *caller;
- struct vm_struct *vm;
- struct vmap_area *va;
- unsigned long addr;
- unsigned int nr_pages;
- vm = find_vm_area(objp);
- if (!vm)
- if (!spin_trylock(&vmap_area_lock))
return false;
It'd be good to have a comment here explaining why we must trylock here. I am also concerned that in the past this function would return false only if the address was not a vmalloc one, but now it might just return false due to lock contention and the user has no idea which it is?
I'd want to at least output "vmalloc region cannot lookup lock contention" vs. the below cannot find case.
In the patch 2/2 we do print if the address looks like a vmalloc address even if the vmalloc look up fails.
No, you output exactly what was output before, only changing what it means and in no way differentiating between couldn't find vmalloc area/couldn't get lock.
Also the reporter's usecase is not a common one. We only attempt to dump information if there was a debug objects failure (example if somebody did a double call_rcu). In such a situation, the patch will prevent a deadlock and still print something about the address.
Right, but the function still purports to do X but does Y.
Under heavy lock contention aren't you potentially breaking the ability to introspect vmalloc addresses? Wouldn't it be better to explicitly detect the contexts under which acquiring this spinlock is not appropriate?
Yes this is a good point, but there's another case as well: PREEMPT_RT can sleep on lock contention (as spinlocks are sleeping) and we can't sleep from call_rcu() as it may be called in contexts that cannot sleep. So we handle that also using trylock.
Right so somebody now has to find this email to realise that. I hate implicit knowledge like this, it needs a comment. It also furthers the point that it'd be useful to differentiate between the two.
Thanks for the review!
This got merged despite my outstanding comments so I guess I'll have to follow up with a patch.
- Joel
- va = __find_vmap_area((unsigned long)objp, &vmap_area_root);
- if (!va) {
spin_unlock(&vmap_area_lock); return false;
- }
- vm = va->vm;
- if (!vm) {
spin_unlock(&vmap_area_lock);
return false;
- }
- addr = (unsigned long)vm->addr;
- caller = vm->caller;
- nr_pages = vm->nr_pages;
- spin_unlock(&vmap_area_lock); pr_cont(" %u-page vmalloc region starting at %#lx allocated at %pS\n",
vm->nr_pages, (unsigned long)vm->addr, vm->caller);
return true;nr_pages, addr, caller);
}
#endif
2.42.0.283.g2d96d420d3-goog
This reads like another 'nice review and I agree but I won't change anything!'...
Sorry I actually wrote this unkind comment in a moment of annoyance then meant to delete it but of course forgot to :>) Disregard this bit.
Happy for pushback/disagreement, just feel like a few little touchups would have helped improve documentation/clarity of what this series does.
Obviously stability matters more so perhaps touch-ups best as a follow up series... though would be nice to have a comment to that effect.
Thanks!