Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal - Linux-kselftest-mirror

21 Sep 2021

      On Tue, Sep 21, 2021 at 1:24 AM David Laight David.Laight@aculab.com wrote:
...
From: Luis Chamberlain
...
Sent: 17 September 2021 20:47
When sysfs attributes use a lock also used on module removal we can
race to deadlock. This happens when for instance a sysfs file on
a driver is used, then at the same time we have module removal call
trigger. The module removal call code holds a lock, and then the sysfs
file entry waits for the same lock. While holding the lock the module
removal tries to remove the sysfs entries, but these cannot be removed
yet as one is waiting for a lock. This won't complete as the lock is
already held. Likewise module removal cannot complete, and so we deadlock.
Isn't the real problem the race between a sysfs file action and the
removal of the sysfs node?
Nope, that is taken care of by kernfs.
...
This isn't really related to module unload - except that may
well remove some sysfs nodes.
Nope, the issue is a deadlock that can happen due to a shared lock on
module removal and a driver sysfs operation.
...
This is the same problem as removing any other kind of driver callback.
There are three basic solutions:

Use a global lock - not usually useful.
Have the remove call sleep until any callbacks are complete.
Have the remove just request removal and have a final
callback (from a different context).

Kernfs already does a sort of combination of 1) and 2) but 1) is using
atomic reference counts.
...
If the remove can sleep (as in 2) then there is a requirement
on the driver code to not hold any locks across the 'remove'
that can be acquired during the callbacks.
And this is the part that kernfs has no control over since the removal
and sysfs operation are implementation specific.
...
Now, for sysfs, you probably only want to sleep the remove code
while a read/write is in progress - not just because the node
is open.
That probably requires marking an open node 'invalid' and
deferring delete to close.
This is already done by kernfs.
...
None of this requires a reference count on the module.
You are missing the point to the other aspect of the try_module_get(),
it lets you also check if module exit has been entered. By using
try_module_get() you let the module exit trump proceeding with an
operation, therefore also preventing any potential use of a shared
lock on module exit and the driver specific sysfs operation.
Luis