On 11.01.20 15:25, David Hildenbrand wrote:
Am 11.01.2020 um 14:56 schrieb Qian Cai cai@lca.pw:
On Jan 11, 2020, at 6:03 AM, David Hildenbrand david@redhat.com wrote:
So I just remember why I think this (and the previously reported done for ACPI DIMMs) are false positives. The actual locking order is
onlining/offlining from user space:
kn->count -> device_hotplug_lock -> cpu_hotplug_lock -> mem_hotplug_lock
memory removal:
device_hotplug_lock -> cpu_hotplug_lock -> mem_hotplug_lock -> kn->count
This looks like a locking inversion - but it's not. Whenever we come via user space we do a mutex_trylock(), which resolves this issue by backing up. The device_hotplug_lock will prevent
I have no clue why the device_hotplug_lock does not pop up in the lockdep report here. Sounds wrong to me.
I think this is a false positive and not stable material.
The point is that there are other paths does kn->count —> cpu_hotplug_lock without needing device_hotplug_lock to race with memory removal.
kmem_cache_shrink_all+0x50/0x100 (cpu_hotplug_lock.rw_sem/mem_hotplug_lock.rw_sem) shrink_store+0x34/0x60 slab_attr_store+0x6c/0x170 sysfs_kf_write+0x70/0xb0 kernfs_fop_write+0x11c/0x270 ((kn->count) __vfs_write+0x3c/0x70 vfs_write+0xcc/0x200 ksys_write+0x7c/0x140 system_call+0x5c/0x6
But not the lock of the memory devices, or am I missing something?
To clarify:
memory unplug will remove e.g., /sys/devices/system/memory/memoryX/, which has a dedicated kn->count AFAIK
If you do a "echo 1 > /sys/kernel/slab/X/shrink", you would not lock the kn->count of /sys/devices/system/memory/memoryX/, but the one of some slab thingy.
The only scenario I could see is if remove_memory_block_devices() will not only remove /sys/devices/system/memory/memoryX/, but also implicitly e.g., /sys/kernel/slab/X/. If that is the case, then this is indeed not a false positive, but something rather hard to trigger (which would still classify as stable material).