On Tue, Nov 02, 2021 at 03:51:33PM +0100, Petr Mladek wrote:
On Tue 2021-11-02 15:15:19, Petr Mladek wrote:
On Tue 2021-10-26 23:37:30, Ming Lei wrote:
On Tue, Oct 26, 2021 at 10:48:18AM +0200, Petr Mladek wrote:
Below are more details about the livepatch code. I hope that it will help you to see if zram has similar problems or not.
We have kobject in three structures: klp_func, klp_object, and klp_patch, see include/linux/livepatch.h.
These structures have to be statically defined in the module sources because they define what is livepatched, see samples/livepatch/livepatch-sample.c
The kobject is used there to show information about the patch, patched objects, and patched functions, in sysfs. And most importantly, the sysfs interface can be used to disable the livepatch.
The problem with static structures is that the module must stay in memory as long as the sysfs interface exists. This could be solved in the module_exit() callback, which could wait until the sysfs interface is destroyed.
kobject API does not support this scenario. The release() callbacks
kobject_delete() is for supporting this scenario, that is why we don't need to grab module refcnt before calling show()/store() of the kobject's attributes.
kobject_delete() can be called in module_exit(), then any show()/store() will be done after kobject_delete() returns.
I am a bit confused. I do not see kobject_delete() anywhere in kernel sources.
I see only kobject_del() and kobject_put(). AFAIK, they do _not_ guarantee that either the sysfs interface was destroyed or the release callbacks were called. For example, see schedule_delayed_work(&kobj->release, delay) in kobject_release().
Grr, I always get confused by the code. kobject_del() actually waits until the sysfs interface gets destroyed. This is why there is the deadlock.
Right.
But kobject_put() is _not_ synchronous. And the comment above kobject_add() repeats three times that kobject_put() must be called on success:
- Return: If this function returns an error, kobject_put() must be
called to properly clean up the memory associated with the
object. Under no instance should the kobject that is passed
to this function be directly freed with a call to kfree(),
that can leak memory.
If this function returns success, kobject_put() must also be called
in order to properly clean up the memory associated with the object.
In short, once this function is called, kobject_put() MUST be called
when the use of the object is finished in order to properly free
everything.
and similar text in Documentation/core-api/kobject.rst
After a kobject has been registered with the kobject core successfully, it must be cleaned up when the code is finished with it. To do that, call kobject_put().
If I read the code correctly then kobject_put() calls kref_put(), which might call kobject_delayed_cleanup(). This function does a lot of things and needs to access struct kobject.
Yes, then what is the problem here wrt. kobject_put() which may not be synchronous?
IMHO, kobject API does not support static structures and module removal.
If kobject_put() also has to be called for static structures then module_exit() must explicitly wait until the cleanup is finished.
Right, that is exactly how klp_patch kobject is implemented. klp_patch kobject has to be disabled first, then module refcnt can be dropped after the klp_patch kobject is released. Then module_exit() is possible.
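The ordering described above can be modelled in plain userspace C. This is only a rough sketch and not the livepatch code: struct fake_patch, fake_put(), fake_release() and fake_module_exit() are made-up names, and the small completion helpers are pthread stand-ins that merely borrow the names of the kernel primitives. It assumes the release callback may run on another thread at an unpredictable time, the way a delayed kobject release might, so the exit path has to wait for it before the static object can go away.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Made-up model of a statically defined object with an embedded refcount. */
struct fake_patch {
	int refcount;              /* touched by one thread only in this sketch */
	bool released;             /* set by the release callback */
	struct completion finish;  /* signalled once release has run */
	pthread_t worker;
	bool worker_started;
};

static void fake_patch_init(struct fake_patch *p)
{
	p->refcount = 1;
	p->released = false;
	p->worker_started = false;
	init_completion(&p->finish);
}

/* Release callback: the last thing that touches the object. */
static void *fake_release(void *arg)
{
	struct fake_patch *p = arg;

	p->released = true;
	complete(&p->finish);
	return NULL;
}

/* Drop a reference; the final put hands release off to another thread. */
static void fake_put(struct fake_patch *p)
{
	assert(p->refcount > 0);
	if (--p->refcount > 0)
		return;

	p->worker_started =
		pthread_create(&p->worker, NULL, fake_release, p) == 0;
	if (!p->worker_started)
		fake_release(p);   /* fall back to a synchronous release */
}

/* Exit path: put alone is not enough, so wait until release has run. */
static bool fake_module_exit(struct fake_patch *p)
{
	fake_put(p);
	wait_for_completion(&p->finish);
	if (p->worker_started)
		pthread_join(p->worker, NULL);
	return p->released;
}
```

A caller would define a static struct fake_patch, call fake_patch_init() on it once, and later call fake_module_exit(), which returns only after the release callback has finished with the object.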
Thanks, Ming