Hi Greg,
On Sunday 10/13 at 17:04 +0200, Greg Kroah-Hartman wrote:
On Fri, Sep 13, 2024 at 05:24:29PM -0700, Calvin Owens wrote:
On a board running ntpd and gpsd, I'm seeing a consistent use-after-free in sys_exit() from gpsd when rebooting:
pps pps1: removed ------------[ cut here ]------------ kobject: '(null)' (00000000db4bec24): is not initialized, yet kobject_put() is being called.
Something is wrong with the reference counting here...
WARNING: CPU: 2 PID: 440 at lib/kobject.c:734 kobject_put+0x120/0x150 CPU: 2 UID: 299 PID: 440 Comm: gpsd Not tainted 6.11.0-rc6-00308-gb31c44928842 #1 Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kobject_put+0x120/0x150 lr : kobject_put+0x120/0x150 sp : ffffffc0803d3ae0 x29: ffffffc0803d3ae0 x28: ffffff8042dc9738 x27: 0000000000000001 x26: 0000000000000000 x25: ffffff8042dc9040 x24: ffffff8042dc9440 x23: ffffff80402a4620 x22: ffffff8042ef4bd0 x21: ffffff80405cb600 x20: 000000000008001b x19: ffffff8040b3b6e0 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 696e6920746f6e20 x14: 7369203a29343263 x13: 205d303434542020 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: kobject_put+0x120/0x150 cdev_put+0x20/0x3c __fput+0x2c4/0x2d8 ____fput+0x1c/0x38 task_work_run+0x70/0xfc do_exit+0x2a0/0x924 do_group_exit+0x34/0x90 get_signal+0x7fc/0x8c0 do_signal+0x128/0x13b4 do_notify_resume+0xdc/0x160 el0_svc+0xd4/0xf8 el0t_64_sync_handler+0x140/0x14c el0t_64_sync+0x190/0x194 ---[ end trace 0000000000000000 ]---
...followed by more symptoms of corruption, with similar stacks:
refcount_t: underflow; use-after-free. kernel BUG at lib/list_debug.c:62! Kernel panic - not syncing: Oops - BUG: Fatal exception
This happens because pps_device_destruct() frees the pps_device with the embedded cdev immediately after calling cdev_del(), but, as the comment above cdev_del() notes, fops for previously opened cdevs are still callable even after cdev_del() returns. I think this bug has always been there: I can't explain why it suddenly started happening every time I reboot this particular board.
In commit d953e0e837e6 ("pps: Fix a use-after free bug when unregistering a source."), George Spelvin suggested removing the embedded cdev. That seems like the simplest way to fix this, so I've implemented his suggestion, with pps_idr becoming the source of truth for which minor corresponds to which device.
You remove it, but now the structure has no reference counting at all, so you should make it a real "struct device" not just containing a pointer to one.
It still uses the device refcount. I didn't change that, see the kobject_get() (which is confusing and should be get_device() as you note below) in pps_cdev_open() before my patch.
But now that pps_idr defines userspace visibility instead of cdev_add(), we need to be sure the pps->dev kobject refcount can't reach zero while userspace can still find it again. So, the idr_remove() call moves to pps_unregister_cdev(), and pps_idr now holds a reference to the pps->dev kobject.
An idr shouldn't be doing the reference counting here, the struct device should be doing it, right?
Right, I was trying to explain that it is now necessary to call get_device() to acquire a reference while the device minor is indexed by the idr.
Before, the minor was deindexed in the idr in ->release(). But now, since it is visible the moment it is indexed by the idr, deindexing it in ->release() means this could happen during removal:
CPU1 CPU2 --- --- pps_unregister_cdev() <...> sys_open() <...> idr_lookup() idr_remove() put_device() <--- ref goes to zero get_device() <--- BOOM pps_device_destruct() pps_idr_remove()
By calling get_device() in pps_idr_get(), and by moving the idr deindexing from ->release() to pps_unregister_cdev(), we ensure that can't happen.
pps_core: source serial1 got cdev (251:1) <...> pps pps1: removed pps_core: unregistering pps1 pps_core: deallocating pps1
Fixes: d953e0e837e6 ("pps: Fix a use-after free bug when unregistering a source.") Cc: stable@vger.kernel.org Signed-off-by: Calvin Owens calvin@wbinvd.org
Changes in v2:
- Don't move pr_debug() from pps_device_destruct() to pps_unregister_cdev()
- Actually add stable@vger.kernel.org to CC
drivers/pps/pps.c | 83 ++++++++++++++++++++------------------ include/linux/pps_kernel.h | 1 - 2 files changed, 44 insertions(+), 40 deletions(-)
diff --git a/drivers/pps/pps.c b/drivers/pps/pps.c index 5d19baae6a38..6980ab17f314 100644 --- a/drivers/pps/pps.c +++ b/drivers/pps/pps.c @@ -25,7 +25,7 @@
- Local variables
*/ -static dev_t pps_devt; +static int pps_major; static struct class *pps_class; static DEFINE_MUTEX(pps_idr_lock); @@ -296,19 +296,35 @@ static long pps_cdev_compat_ioctl(struct file *file, #define pps_cdev_compat_ioctl NULL #endif +static struct pps_device *pps_idr_get(unsigned long id) +{
- struct pps_device *pps;
- mutex_lock(&pps_idr_lock);
- pps = idr_find(&pps_idr, id);
- if (pps)
kobject_get(&pps->dev->kobj);
A driver should never call "raw" kobject calls, this alone makes this not ok :(
Oh this is silly, sorry: it's simply open coding put_device() and get_device(), I'll fix that.
Please move the structure to be embedded in and then it should be simpler.
I actually tried to do that, but got hung up on the API: what's the right way to do what device_create() does for an embedded device struct?
Today, the pps core does:
pps = kzalloc(sizeof(*pps));
...then later in pps_register_cdev():
idr_alloc(&newmin, ...); pps->dev = device_create(..., MKDEV(maj, newmin), ...);
To embed the device struct, I guess this would become something like:
pps = kzalloc(sizeof(*pps)); device_initialize(&pps->dev); idr_alloc(&newmin, ...); /* ??? */ device_register(&pps->dev);
...but the question marks are what I'm not seeing: what's the proper way to accomplish what device_create() does, but in an embedded dev struct? Obviously I could just copy device_create(), but I'm sure that's not what you want.
Thanks, Calvin
thanks,
greg k-h