The use-after-free bug appears when:
- A platform device is created from OF, by of_device_add();
- The same device's name is changed afterwards using dev_set_name(), by its probe for example.
Out of the 37 drivers that deal with platform devices and do a dev_set_name() call, only one might be affected. That driver is loongson-i2s-plat [0]. All other dev_set_name() calls are on children devices created on the spot. The issue was found on downstream kernels and we don't have what it takes to test loongson-i2s-plat.
Note: loongson-i2s-plat maintainers are CCed.
⟩ # Finding potential trouble-makers:
⟩ git grep -l 'struct platform_device' | xargs grep -l dev_set_name
The solution proposed is to add a flag to platform_device that tells if it is responsible for freeing its name. We can then duplicate the device name inside of_device_add() instead of copying the pointer.
What is done elsewhere?
- Platform bus code does a copy of the argument name that is stored alongside the struct platform_device; see platform_device_alloc()[1].
- Other busses duplicate the device name; either through a dynamic allocation [2] or through an array embedded inside devices [3].
- Some busses don't have a separate name; when they want a name they take it from the device [4].
[0]: https://elixir.bootlin.com/linux/v6.13.2/source/sound/soc/loongson/loongson_...
[1]: https://elixir.bootlin.com/linux/v6.13.2/source/drivers/base/platform.c#L581
[2]: https://elixir.bootlin.com/linux/v6.13.2/source/drivers/gpu/drm/drm_drv.c#L6...
[3]: https://elixir.bootlin.com/linux/v6.13.2/source/include/linux/i2c.h#L343
[4]: https://elixir.bootlin.com/linux/v6.13.2/source/include/linux/pci.h#L2150
This can be reproduced using Buildroot's qemu_aarch64_virt_defconfig with CONFIG_KASAN=y and a dev_set_name() inside the probe of: drivers/pci/controller/pci-host-common.c
The splat below appears at boot. It happens whenever something tries to access pdev->name; one big consumer of this field is platform_match(), which falls back to name matching.
==================================================================
BUG: KASAN: slab-use-after-free in strcmp+0x2c/0x78
Read of size 1 at addr ffffff80c0300160 by task swapper/0/1

CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.32 #1
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x90/0xe8
 show_stack+0x18/0x24
 dump_stack_lvl+0x48/0x60
 print_report+0xf8/0x5d8
 kasan_report+0x90/0xcc
 __asan_load1+0x60/0x6c
 strcmp+0x2c/0x78
 platform_match+0xd0/0x140
 __driver_attach+0x44/0x240
 bus_for_each_dev+0xe4/0x160
 driver_attach+0x34/0x44
 bus_add_driver+0x134/0x270
 driver_register+0xa4/0x1e4
 __platform_driver_register+0x44/0x54
 ged_driver_init+0x1c/0x28
 do_one_initcall+0xdc/0x260
 kernel_init_freeable+0x314/0x448
 kernel_init+0x2c/0x1e0
 ret_from_fork+0x10/0x20

Allocated by task 1:
 kasan_save_stack+0x3c/0x64
 kasan_set_track+0x2c/0x40
 kasan_save_alloc_info+0x24/0x34
 __kasan_kmalloc+0xb8/0xbc
 __kmalloc_node_track_caller+0x64/0xa4
 kvasprintf+0xcc/0x16c
 kvasprintf_const+0xe8/0x180
 kobject_set_name_vargs+0x54/0xd4
 dev_set_name+0xa8/0xe4
 of_device_make_bus_id+0x298/0x2b0
 of_device_alloc+0x1ec/0x204
 of_platform_device_create_pdata+0x60/0x168
 of_platform_bus_create+0x20c/0x4a0
 of_platform_populate+0x50/0x10c
 of_platform_default_populate_init+0xe0/0x100
 do_one_initcall+0xdc/0x260
 kernel_init_freeable+0x314/0x448
 kernel_init+0x2c/0x1e0
 ret_from_fork+0x10/0x20

Freed by task 1:
 kasan_save_stack+0x3c/0x64
 kasan_set_track+0x2c/0x40
 kasan_save_free_info+0x38/0x60
 __kasan_slab_free+0xe4/0x150
 __kmem_cache_free+0x134/0x26c
 kfree+0x54/0x6c
 kfree_const+0x34/0x40
 kobject_set_name_vargs+0xa8/0xd4
 dev_set_name+0xa8/0xe4
 pci_host_common_probe+0x9c/0x294
 platform_probe+0x90/0x100
 really_probe+0x100/0x3cc
 __driver_probe_device+0xb8/0x18c
 driver_probe_device+0x108/0x1d8
 __driver_attach+0xc8/0x240
 bus_for_each_dev+0xe4/0x160
 driver_attach+0x34/0x44
 bus_add_driver+0x134/0x270
 driver_register+0xa4/0x1e4
 __platform_driver_register+0x44/0x54
 gen_pci_driver_init+0x1c/0x28
 do_one_initcall+0xdc/0x260
 kernel_init_freeable+0x314/0x448
 kernel_init+0x2c/0x1e0
 ret_from_fork+0x10/0x20

The buggy address belongs to the object at ffffff80c0300160
 which belongs to the cache kmalloc-16 of size 16
The buggy address is located 0 bytes inside of
 freed 16-byte region [ffffff80c0300160, ffffff80c0300170)

The buggy address belongs to the physical page:
page:0000000099fe29a0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100300
flags: 0x8000000000000800(slab|zone=2)
page_type: 0xffffffff()
raw: 8000000000000800 ffffff80c00013c0 dead000000000122 0000000000000000
raw: 0000000000000000 0000000080800080 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffffff80c0300000: fa fb fc fc fa fb fc fc 00 07 fc fc 00 07 fc fc
 ffffff80c0300080: 00 07 fc fc 00 02 fc fc 00 02 fc fc 00 02 fc fc
>ffffff80c0300100: 00 06 fc fc 00 06 fc fc 00 06 fc fc fa fb fc fc
                                                       ^
 ffffff80c0300180: 00 00 fc fc 00 00 fc fc 00 06 fc fc 00 06 fc fc
 ffffff80c0300200: 00 06 fc fc 00 06 fc fc 00 06 fc fc 00 06 fc fc
==================================================================
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
Théo Lebrun (2):
      driver core: platform: turn pdev->id_auto into pdev->flags
      driver core: platform: avoid use-after-free on pdev->name

 drivers/base/platform.c         |  8 +++++---
 drivers/of/platform.c           | 12 +++++++++++-
 include/linux/platform_device.h |  4 +++-
 3 files changed, 19 insertions(+), 5 deletions(-)
---
base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319
change-id: 20250217-pdev-uaf-1a779a98d81b
Best regards,
struct platform_device->id_auto is the only boolean stored inside the structure. Remove it and add a u8 flags field. The goal is to allow more flags (without using more memory).
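For illustration only, and not part of the patch: a minimal userspace sketch of packing several booleans into one u8 flags field. The `demo_` names and DEMO_BIT() are hypothetical stand-ins for the kernel's struct platform_device, BIT() and the PLATFORM_DEVICE_FLAG_* defines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's BIT() macro. */
#define DEMO_BIT(n)		(1u << (n))
#define DEMO_FLAG_ID_AUTO	DEMO_BIT(0)	/* mirrors the former id_auto bool */
#define DEMO_FLAG_FREE_NAME	DEMO_BIT(1)	/* room for a second flag */

struct demo_pdev {
	const char *name;
	int id;
	uint8_t flags;	/* one byte holds up to eight independent flags */
};

/* Set a flag without disturbing the other bits. */
static inline void demo_set_flag(struct demo_pdev *p, uint8_t flag)
{
	p->flags |= flag;
}

/* Clear a flag without disturbing the other bits. */
static inline void demo_clear_flag(struct demo_pdev *p, uint8_t flag)
{
	p->flags &= (uint8_t)~flag;
}

/* Report whether a flag is currently set. */
static inline bool demo_test_flag(const struct demo_pdev *p, uint8_t flag)
{
	return (p->flags & flag) != 0;
}
```

The patch itself open-codes the `|=` and `&` operations rather than using helpers; the helpers here only make the intent explicit.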
Cc: stable@vger.kernel.org
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/base/platform.c         | 6 +++---
 include/linux/platform_device.h | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 6f2a33722c5203ac196a6e36e153648d0fe6c6d4..e2284482c7ba7c12fe2ab3c715e7d1daa3f65021 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -682,7 +682,7 @@ int platform_device_add(struct platform_device *pdev)
 		if (ret < 0)
 			return ret;
 		pdev->id = ret;
-		pdev->id_auto = true;
+		pdev->flags |= PLATFORM_DEVICE_FLAG_ID_AUTO;
 		dev_set_name(dev, "%s.%d.auto", pdev->name, pdev->id);
 		break;
 	}
@@ -720,7 +720,7 @@ int platform_device_add(struct platform_device *pdev)
 	return 0;
 
 failed:
-	if (pdev->id_auto) {
+	if (pdev->flags & PLATFORM_DEVICE_FLAG_ID_AUTO) {
 		ida_free(&platform_devid_ida, pdev->id);
 		pdev->id = PLATFORM_DEVID_AUTO;
 	}
@@ -750,7 +750,7 @@ void platform_device_del(struct platform_device *pdev)
 	if (!IS_ERR_OR_NULL(pdev)) {
 		device_del(&pdev->dev);
 
-		if (pdev->id_auto) {
+		if (pdev->flags & PLATFORM_DEVICE_FLAG_ID_AUTO) {
 			ida_free(&platform_devid_ida, pdev->id);
 			pdev->id = PLATFORM_DEVID_AUTO;
 		}
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 074754c23d330c9a099e20eecfeb6cbd5025e04f..d842b21ba3791f974fa62f52bd160ef5820261c1 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -23,7 +23,8 @@ struct platform_device_id;
 struct platform_device {
 	const char *name;
 	int id;
-	bool id_auto;
+	u8 flags;
+#define PLATFORM_DEVICE_FLAG_ID_AUTO BIT(0)
 	struct device dev;
 	u64 platform_dma_mask;
 	struct device_dma_parameters dma_parms;
The issue is with this:
	int of_device_add(struct platform_device *ofdev)
	{
		// ...
		ofdev->name = dev_name(&ofdev->dev);
		// ...
	}
We store the current device name pointer. If the device name changes through a `dev_set_name(dev, "foo")` call:
- the old device name is freed: kfree(dev->name);
- a new device name is allocated: kmalloc(...);
- pdev->name still holds the old device name, i.e. a freed pointer.
OF is at fault here, taking the pointer to the device name in of_device_add().
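For illustration only: a minimal userspace sketch of the aliasing problem and of the duplication approach, assuming a simplified device model. All `demo_` names are hypothetical; malloc()/free() stand in for kmalloc()/kfree(), and `owns_name` stands in for the flag introduced by this patch.

```c
#include <stdlib.h>
#include <string.h>

struct demo_device {
	char *name;		/* heap-allocated, replaced on rename */
};

struct demo_pdev {
	const char *name;	/* what name-based matching reads */
	int owns_name;		/* hypothetical stand-in for the FREE_NAME flag */
	struct demo_device dev;
};

/* Portable replacement for strdup(); returns NULL on allocation failure. */
static char *demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Mimics dev_set_name(): allocate the new name, then free the old one. */
static int demo_dev_set_name(struct demo_device *dev, const char *name)
{
	char *copy = demo_strdup(name);

	if (!copy)
		return -1;
	free(dev->name);
	dev->name = copy;
	return 0;
}

/* Buggy variant: pdev->name aliases dev.name and dangles after a rename. */
static void demo_add_aliasing(struct demo_pdev *pdev)
{
	pdev->name = pdev->dev.name;
	pdev->owns_name = 0;
}

/* Fixed variant: duplicate first, and only then drop any previous copy. */
static int demo_add_duplicating(struct demo_pdev *pdev)
{
	char *copy = demo_strdup(pdev->dev.name);

	if (!copy)
		return -1;	/* device left untouched on failure */
	if (pdev->owns_name)
		free((void *)pdev->name);
	pdev->name = copy;
	pdev->owns_name = 1;
	return 0;
}

/* Release both names; the private copy only if the pdev owns it. */
static void demo_release(struct demo_pdev *pdev)
{
	if (pdev->owns_name)
		free((void *)pdev->name);
	free(pdev->dev.name);
	pdev->name = NULL;
	pdev->dev.name = NULL;
	pdev->owns_name = 0;
}
```

With demo_add_aliasing(), any read of pdev->name after demo_dev_set_name() touches freed memory, which is the access KASAN reports. With demo_add_duplicating(), pdev->name keeps its original contents across renames.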
The new PLATFORM_DEVICE_FLAG_FREE_NAME flag tells platform devices if they own their pdev->name pointer and if it requires a kfree() call.
Considerations:
- The generic case in platform_device_register_full() is not faulty because it allocates memory for storing the name adjacent to the `struct platform_device` alloc; see platform_device_alloc():
	struct platform_object *pa;
	pa = kzalloc(sizeof(*pa) + strlen(name) + 1, GFP_KERNEL);
We cannot rely on this codepath in all cases because OF wants to change the name after the platform device creation.
- kfree_const() cannot solve the issue: either we allocated pdev->name separately or it is part of the platform_object allocation. pdev->name is never coming from read-only data.
- It is important to duplicate! pdev->name must not change to make sure the platform_match() return value is stable over time. If we updated pdev->name alongside dev->name, once a device probes and changes its name then the platform_match() return value would change.
- In of_device_add(), we make sure to kstrdup() the new name before freeing the old one; if alloc fails, we leave the device as-is.
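For illustration only: a userspace sketch of the co-allocation scheme mentioned in the first consideration above, where the name bytes trail the structure in a single allocation. The `demo_` names are hypothetical and calloc() stands in for kzalloc().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct demo_pdev {
	const char *name;
	int id;
};

/* The name is stored right after the device, in the same allocation. */
struct demo_object {
	struct demo_pdev pdev;
	char name[];		/* flexible array member */
};

/* Allocate a device together with a private copy of its name. */
static struct demo_pdev *demo_pdev_alloc(const char *name, int id)
{
	size_t len = strlen(name) + 1;
	struct demo_object *obj = calloc(1, sizeof(*obj) + len);

	if (!obj)
		return NULL;
	memcpy(obj->name, name, len);
	obj->pdev.name = obj->name;	/* points into our own allocation */
	obj->pdev.id = id;
	return &obj->pdev;
}

/* Free the whole object; the name goes away with it, no separate free. */
static void demo_pdev_free(struct demo_pdev *pdev)
{
	if (!pdev)
		return;
	free((char *)pdev - offsetof(struct demo_object, pdev));
}
```

Because the name lives inside the object, it cannot be replaced later without reallocating the whole object, which is why this scheme does not help once OF wants to set the name after creation.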
Fixes: eca3930163ba ("of: Merge of_platform_bus_type with platform_bus_type")
Cc: stable@vger.kernel.org
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/base/platform.c         |  2 ++
 drivers/of/platform.c           | 12 +++++++++++-
 include/linux/platform_device.h |  1 +
 3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index e2284482c7ba7c12fe2ab3c715e7d1daa3f65021..3548714d6ba408abc6c7ab0f3e7496c6e27ba060 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -563,6 +563,8 @@ static void platform_device_release(struct device *dev)
 	kfree(pa->pdev.mfd_cell);
 	kfree(pa->pdev.resource);
 	kfree(pa->pdev.driver_override);
+	if (pa->pdev.flags & PLATFORM_DEVICE_FLAG_FREE_NAME)
+		kfree(pa->pdev.name);
 	kfree(pa);
 }
 
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index c6d8afb284e88061eb6fb0ba02e429cec702664c..ef6f341fd9b77a9e0ed6969c3f322b9bc91d0e8d 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -44,11 +44,21 @@ EXPORT_SYMBOL(of_find_device_by_node);
 
 int of_device_add(struct platform_device *ofdev)
 {
+	char *new_name;
+
 	BUG_ON(ofdev->dev.of_node == NULL);
 
+	new_name = kstrdup(dev_name(&ofdev->dev), GFP_KERNEL);
+	if (!new_name)
+		return -ENOMEM;
+
+	if (ofdev->flags & PLATFORM_DEVICE_FLAG_FREE_NAME)
+		kfree(ofdev->name);
+
 	/* name and id have to be set so that the platform bus doesn't get
 	 * confused on matching */
-	ofdev->name = dev_name(&ofdev->dev);
+	ofdev->name = new_name;
+	ofdev->flags |= PLATFORM_DEVICE_FLAG_FREE_NAME;
 	ofdev->id = PLATFORM_DEVID_NONE;
 
 	/*
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index d842b21ba3791f974fa62f52bd160ef5820261c1..203016afc3899ffa05f38b9d4ce3bfc02d5b75ef 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -25,6 +25,7 @@ struct platform_device {
 	int id;
 	u8 flags;
 #define PLATFORM_DEVICE_FLAG_ID_AUTO BIT(0)
+#define PLATFORM_DEVICE_FLAG_FREE_NAME BIT(1)
 	struct device dev;
 	u64 platform_dma_mask;
 	struct device_dma_parameters dma_parms;
On Tue, Feb 18, 2025 at 12:00:11PM +0100, Théo Lebrun wrote:
The use-after-free bug appears when:
- A platform device is created from OF, by of_device_add();
- The same device's name is changed afterwards using dev_set_name(), by its probe for example.
Out of the 37 drivers that deal with platform devices and do a dev_set_name() call, only one might be affected. That driver is loongson-i2s-plat [0]. All other dev_set_name() calls are on children devices created on the spot. The issue was found on downstream kernels and we don't have what it takes to test loongson-i2s-plat.
Note: loongson-i2s-plat maintainers are CCed.
⟩ # Finding potential trouble-makers:
⟩ git grep -l 'struct platform_device' | xargs grep -l dev_set_name
The solution proposed is to add a flag to platform_device that tells if it is responsible for freeing its name. We can then duplicate the device name inside of_device_add() instead of copying the pointer.
Ick.
What is done elsewhere?
- Platform bus code does a copy of the argument name that is stored alongside the struct platform_device; see platform_device_alloc()[1].
- Other busses duplicate the device name; either through a dynamic allocation [2] or through an array embedded inside devices [3].
- Some busses don't have a separate name; when they want a name they take it from the device [4].
Really ick.
Let's do the right thing here and just get rid of the name pointer entirely in struct platform_device please. Isn't that the correct thing that way the driver core logic will work properly for all of this.
thanks,
greg k-h
Hello Greg,
On Thu Feb 20, 2025 at 1:41 PM CET, Greg Kroah-Hartman wrote:
On Tue, Feb 18, 2025 at 12:00:11PM +0100, Théo Lebrun wrote:
The use-after-free bug appears when:
- A platform device is created from OF, by of_device_add();
- The same device's name is changed afterwards using dev_set_name(), by its probe for example.
Out of the 37 drivers that deal with platform devices and do a dev_set_name() call, only one might be affected. That driver is loongson-i2s-plat [0]. All other dev_set_name() calls are on children devices created on the spot. The issue was found on downstream kernels and we don't have what it takes to test loongson-i2s-plat.
Note: loongson-i2s-plat maintainers are CCed.
⟩ # Finding potential trouble-makers:
⟩ git grep -l 'struct platform_device' | xargs grep -l dev_set_name
The solution proposed is to add a flag to platform_device that tells if it is responsible for freeing its name. We can then duplicate the device name inside of_device_add() instead of copying the pointer.
Ick.
What is done elsewhere?
- Platform bus code does a copy of the argument name that is stored alongside the struct platform_device; see platform_device_alloc()[1].
- Other busses duplicate the device name; either through a dynamic allocation [2] or through an array embedded inside devices [3].
- Some busses don't have a separate name; when they want a name they take it from the device [4].
Really ick.
Let's do the right thing here and just get rid of the name pointer entirely in struct platform_device please. Isn't that the correct thing that way the driver core logic will work properly for all of this.
I would agree, if it wasn't for this consideration that is found in the commit message [0]:
It is important to duplicate! pdev->name must not change to make sure the platform_match() return value is stable over time. If we updated pdev->name alongside dev->name, once a device probes and changes its name then the platform_match() return value would change.
I'd be fine sending a V2 that removes the field *and the fallback* [1], but I don't have the full scope in mind to know what would become broken.
[0]: https://lore.kernel.org/lkml/20250218-pdev-uaf-v1-2-5ea1a0d3aba0@bootlin.com...
[1]: https://elixir.bootlin.com/linux/v6.13.3/source/drivers/base/platform.c#L135...
Regards,
--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
On Thu, Feb 20, 2025 at 02:31:29PM +0100, Théo Lebrun wrote:
Hello Greg,
On Thu Feb 20, 2025 at 1:41 PM CET, Greg Kroah-Hartman wrote:
On Tue, Feb 18, 2025 at 12:00:11PM +0100, Théo Lebrun wrote:
The use-after-free bug appears when:
- A platform device is created from OF, by of_device_add();
- The same device's name is changed afterwards using dev_set_name(), by its probe for example.
Out of the 37 drivers that deal with platform devices and do a dev_set_name() call, only one might be affected. That driver is loongson-i2s-plat [0]. All other dev_set_name() calls are on children devices created on the spot. The issue was found on downstream kernels and we don't have what it takes to test loongson-i2s-plat.
Note: loongson-i2s-plat maintainers are CCed.
⟩ # Finding potential trouble-makers:
⟩ git grep -l 'struct platform_device' | xargs grep -l dev_set_name
The solution proposed is to add a flag to platform_device that tells if it is responsible for freeing its name. We can then duplicate the device name inside of_device_add() instead of copying the pointer.
Ick.
What is done elsewhere?
- Platform bus code does a copy of the argument name that is stored alongside the struct platform_device; see platform_device_alloc()[1].
- Other busses duplicate the device name; either through a dynamic allocation [2] or through an array embedded inside devices [3].
- Some busses don't have a separate name; when they want a name they take it from the device [4].
Really ick.
Let's do the right thing here and just get rid of the name pointer entirely in struct platform_device please. Isn't that the correct thing that way the driver core logic will work properly for all of this.
I would agree, if it wasn't for this consideration that is found in the commit message [0]:
What, that the of code is broken? Then it should be fixed, why does it need a pointer to a name at all anyway? It shouldn't be needed there either.
It is important to duplicate! pdev->name must not change to make sure the platform_match() return value is stable over time. If we updated pdev->name alongside dev->name, once a device probes and changes its name then the platform_match() return value would change.
I'd be fine sending a V2 that removes the field *and the fallback* [1], but I don't have the full scope in mind to know what would become broken.
The fallback will not need to be removed, properly point to the name of the device and it should work correctly.
thanks,
greg k-h
On Thu Feb 20, 2025 at 3:06 PM CET, Greg Kroah-Hartman wrote:
On Thu, Feb 20, 2025 at 02:31:29PM +0100, Théo Lebrun wrote:
On Thu Feb 20, 2025 at 1:41 PM CET, Greg Kroah-Hartman wrote:
On Tue, Feb 18, 2025 at 12:00:11PM +0100, Théo Lebrun wrote:
The solution proposed is to add a flag to platform_device that tells if it is responsible for freeing its name. We can then duplicate the device name inside of_device_add() instead of copying the pointer.
Ick.
What is done elsewhere?
- Platform bus code does a copy of the argument name that is stored alongside the struct platform_device; see platform_device_alloc()[1].
- Other busses duplicate the device name; either through a dynamic allocation [2] or through an array embedded inside devices [3].
- Some busses don't have a separate name; when they want a name they take it from the device [4].
Really ick.
Let's do the right thing here and just get rid of the name pointer entirely in struct platform_device please. Isn't that the correct thing that way the driver core logic will work properly for all of this.
I would agree, if it wasn't for this consideration that is found in the commit message [0]:
What, that the of code is broken? Then it should be fixed, why does it need a pointer to a name at all anyway? It shouldn't be needed there either.
I cannot guess why it originally has a separate pdev->name field. All I can tell you is a good reason to have one, as quoted below.
It is important to duplicate! pdev->name must not change to make sure the platform_match() return value is stable over time. If we updated pdev->name alongside dev->name, once a device probes and changes its name then the platform_match() return value would change.
I'd be fine sending a V2 that removes the field *and the fallback* [1], but I don't have the full scope in mind to know what would become broken.
The fallback will not need to be removed, properly point to the name of the device and it should work correctly.
No, it will not work correctly, as the above quote indicates.
Let's assume we remove the field, this situation would be broken:
- OF allocates platform devices and gives them names.
- A device matches with a driver, which gets probed.
- During the probe, driver does a dev_set_name().
- Afterwards, the upcoming platform_match() against other drivers are called with another device name.
We should be safe as there are guardrails against probing a device twice; see __driver_probe_device(), which checks that dev->driver is NULL. But it isn't a situation we should be in.
Another broken situation:
- OF allocates platform devices and gives them names.
- A device matches with a driver, which gets probed based on its name.
- During the probe, driver does a dev_set_name().
- Module is removed.
- Module is re-added, the (driver, device) pair don't end up matching again because the device name changed.
I might be missing other edge-cases.
Conclusion: we need a constant name for platform devices as we want the return value of platform_match() to stay stable across time.
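For illustration only: a userspace sketch of why a name-based match stops being stable once a rename is allowed. The `demo_` names are hypothetical; `live_name` plays the role of dev_name(), `stable_name` the role of a pdev->name fixed at creation, and the strcmp() comparison mimics the name fallback in platform_match().

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_LEN 32

struct demo_driver {
	const char *name;
};

struct demo_dev {
	char live_name[DEMO_NAME_LEN];		/* can change on rename */
	char stable_name[DEMO_NAME_LEN];	/* fixed at creation */
};

/* The bus names the device once; both names start out identical. */
static void demo_create(struct demo_dev *d, const char *name)
{
	snprintf(d->live_name, sizeof(d->live_name), "%s", name);
	snprintf(d->stable_name, sizeof(d->stable_name), "%s", name);
}

/* A driver renaming the device only affects the live name. */
static void demo_rename(struct demo_dev *d, const char *name)
{
	snprintf(d->live_name, sizeof(d->live_name), "%s", name);
}

/* Matching against the live name: the result can flip after a rename. */
static bool demo_match_live(const struct demo_dev *d,
			    const struct demo_driver *drv)
{
	return strcmp(d->live_name, drv->name) == 0;
}

/* Matching against the stable name: the result never changes. */
static bool demo_match_stable(const struct demo_dev *d,
			      const struct demo_driver *drv)
{
	return strcmp(d->stable_name, drv->name) == 0;
}
```

After demo_rename(), demo_match_live() no longer reports a match for the original driver, which corresponds to the module reload case described above.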
Regards,
--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
On Thu, Feb 20, 2025 at 04:46:59PM +0100, Théo Lebrun wrote:
On Thu Feb 20, 2025 at 3:06 PM CET, Greg Kroah-Hartman wrote:
On Thu, Feb 20, 2025 at 02:31:29PM +0100, Théo Lebrun wrote:
On Thu Feb 20, 2025 at 1:41 PM CET, Greg Kroah-Hartman wrote:
On Tue, Feb 18, 2025 at 12:00:11PM +0100, Théo Lebrun wrote:
The solution proposed is to add a flag to platform_device that tells if it is responsible for freeing its name. We can then duplicate the device name inside of_device_add() instead of copying the pointer.
Ick.
What is done elsewhere?
- Platform bus code does a copy of the argument name that is stored alongside the struct platform_device; see platform_device_alloc()[1].
- Other busses duplicate the device name; either through a dynamic allocation [2] or through an array embedded inside devices [3].
- Some busses don't have a separate name; when they want a name they take it from the device [4].
Really ick.
Let's do the right thing here and just get rid of the name pointer entirely in struct platform_device please. Isn't that the correct thing that way the driver core logic will work properly for all of this.
I would agree, if it wasn't for this consideration that is found in the commit message [0]:
What, that the of code is broken? Then it should be fixed, why does it need a pointer to a name at all anyway? It shouldn't be needed there either.
I cannot guess why it originally has a separate pdev->name field.
Many people got this wrong when we designed busses, it's not unique. But we should learn from our mistakes where we can :)
It is important to duplicate! pdev->name must not change to make sure the platform_match() return value is stable over time. If we updated pdev->name alongside dev->name, once a device probes and changes its name then the platform_match() return value would change.
I'd be fine sending a V2 that removes the field *and the fallback* [1], but I don't have the full scope in mind to know what would become broken.
The fallback will not need to be removed, properly point to the name of the device and it should work correctly.
No, it will not work correctly, as the above quote indicates.
I don't know which quote, sorry.
Let's assume we remove the field, this situation would be broken:
- OF allocates platform devices and gives them names.
- A device matches with a driver, which gets probed.
- During the probe, driver does a dev_set_name().
- Afterwards, the upcoming platform_match() against other drivers are called with another device name.
We should be safe as there are guardrails against probing a device twice; see __driver_probe_device(), which checks that dev->driver is NULL. But it isn't a situation we should be in.
The fragility of attempting to match a driver to a device purely by a name is a very weak part of using platform devices.
Why would a driver change the device name? It's been given to the driver to "bind to" not to change its name. That shouldn't be ok, fix those drivers.
Another broken situation:
- OF allocates platform devices and gives them names.
- A device matches with a driver, which gets probed based on its name.
- During the probe, driver does a dev_set_name().
Again, don't do that. That's the breaking part.
- Module is removed.
- Module is re-added, the (driver, device) pair don't end up matching again because the device name changed.
Sure, that was a bug in the driver. It shouldn't be changing the name, the name is set/owned by the bus, not the driver.
Do we have examples today of platform drivers that like to rename devices? I did a quick search and couldn't find any in-tree, but I might have missed some.
Again, the bus controls the name when the device is created, changing it after the fact is generally not a good idea.
I might be missing other edge-cases.
Conclusion: we need a constant name for platform devices as we want the return value of platform_match() to stay stable across time.
No, let's just not rename devices in platform drivers.
Or if this really is an issue, let's fix OF to not use the platform bus and have its own bus for stuff like this.
thanks,
greg k-h
On Thu Feb 20, 2025 at 5:19 PM CET, Greg Kroah-Hartman wrote:
On Thu, Feb 20, 2025 at 04:46:59PM +0100, Théo Lebrun wrote:
On Thu Feb 20, 2025 at 3:06 PM CET, Greg Kroah-Hartman wrote:
On Thu, Feb 20, 2025 at 02:31:29PM +0100, Théo Lebrun wrote:
On Thu Feb 20, 2025 at 1:41 PM CET, Greg Kroah-Hartman wrote:
On Tue, Feb 18, 2025 at 12:00:11PM +0100, Théo Lebrun wrote:
The solution proposed is to add a flag to platform_device that tells if it is responsible for freeing its name. We can then duplicate the device name inside of_device_add() instead of copying the pointer.
Ick.
What is done elsewhere?
- Platform bus code does a copy of the argument name that is stored alongside the struct platform_device; see platform_device_alloc()[1].
- Other busses duplicate the device name; either through a dynamic allocation [2] or through an array embedded inside devices [3].
- Some busses don't have a separate name; when they want a name they take it from the device [4].
Really ick.
Let's do the right thing here and just get rid of the name pointer entirely in struct platform_device please. Isn't that the correct thing that way the driver core logic will work properly for all of this.
I would agree, if it wasn't for this consideration that is found in the commit message [0]:
What, that the of code is broken? Then it should be fixed, why does it need a pointer to a name at all anyway? It shouldn't be needed there either.
I cannot guess why it originally has a separate pdev->name field.
Many people got this wrong when we designed busses, it's not unique. But we should learn from our mistakes where we can :)
It is important to duplicate! pdev->name must not change to make sure the platform_match() return value is stable over time. If we updated pdev->name alongside dev->name, once a device probes and changes its name then the platform_match() return value would change.
I'd be fine sending a V2 that removes the field *and the fallback* [1], but I don't have the full scope in mind to know what would become broken.
The fallback will not need to be removed, properly point to the name of the device and it should work correctly.
No, it will not work correctly, as the above quote indicates.
I don't know which quote, sorry.
Let's assume we remove the field, this situation would be broken:
- OF allocates platform devices and gives them names.
- A device matches with a driver, which gets probed.
- During the probe, driver does a dev_set_name().
- Afterwards, the upcoming platform_match() against other drivers are called with another device name.
We should be safe as there are guardrails against probing a device twice; see __driver_probe_device(), which checks that dev->driver is NULL. But it isn't a situation we should be in.
The fragility of attempting to match a driver to a device purely by a name is a very weak part of using platform devices.
I never said the opposite, and I agree. However the mechanism exists and I was focused on not breaking it.
Why would a driver change the device name? It's been given to the driver to "bind to" not to change its name. That shouldn't be ok, fix those drivers.
I do get the argument that devices shouldn't change device names. I'll take the devil's advocate and give at least one argument FOR allowing changing names: prettier names, especially as device names leak into userspace through pseudo filesystems.
If we agree that device names shouldn't be changed once a device is matched with a driver, then (1) we can remove the pdev->name field and (2) `dev_set_name()` should warn when used too late.
Turn the implicit explicit.
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 5a1f05198114..3532b068e32d 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3462,10 +3462,13 @@ static void device_remove_class_symlinks(struct device *dev)
 int dev_set_name(struct device *dev, const char *fmt, ...)
 {
 	va_list vargs;
 	int err;
 
+	if (dev_WARN_ONCE(dev, dev->driver, "device name is static once matched"))
+		return -EPERM;
+
 	va_start(vargs, fmt);
 	err = kobject_set_name_vargs(&dev->kobj, fmt, vargs);
 	va_end(vargs);
 	return err;
 }
(Unsure about the exact error code to return.)
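For illustration only: a userspace sketch of the proposed behaviour, where a rename is refused once a driver is bound. The `demo_` names are hypothetical, a non-NULL `driver` pointer stands in for dev->driver, fprintf() stands in for dev_WARN_ONCE(), and -EPERM is simply the error code used in the diff above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

struct demo_device {
	char name[32];
	const void *driver;	/* non-NULL once a driver is bound */
};

/* Rename the device, unless it is already bound to a driver. */
static int demo_dev_set_name(struct demo_device *dev, const char *name)
{
	if (dev->driver) {
		fprintf(stderr, "%s: device name is static once matched\n",
			dev->name);
		return -EPERM;	/* name left unchanged */
	}

	snprintf(dev->name, sizeof(dev->name), "%s", name);
	return 0;
}
```

In this sketch the bus can still name the device freely before binding, while a rename attempted from a bound driver fails loudly instead of silently changing what later matches see.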
[...]
Do we have examples today of platform drivers that like to rename devices? I did a quick search and couldn't find any in-tree, but I might have missed some.
The cover letter expands on the quest for those drivers:
On Tue Feb 18, 2025 at 12:00 PM CET, Théo Lebrun wrote:
Out of the 37 drivers that deal with platform devices and do a dev_set_name() call, only one might be affected. That driver is loongson-i2s-plat [0]. All other dev_set_name() calls are on children devices created on the spot. The issue was found on downstream kernels and we don't have what it takes to test loongson-i2s-plat.
[...]
⟩ # Finding potential trouble-makers:
⟩ git grep -l 'struct platform_device' | xargs grep -l dev_set_name
[...]
[...]
Or if this really is an issue, let's fix OF to not use the platform bus and have its own bus for stuff like this.
That used to exist! I cannot see how it could be a good idea to reintroduce the distinction though.
commit eca3930163ba8884060ce9d9ff5ef0d9b7c7b00f
Author: Grant Likely <grant.likely@secretlab.ca>
Date:   Tue Jun 8 07:48:21 2010 -0600

    of: Merge of_platform_bus_type with platform_bus_type
Thanks,
--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
On Thu, Feb 20, 2025 at 07:26:41PM +0100, Théo Lebrun wrote:
On Thu Feb 20, 2025 at 5:19 PM CET, Greg Kroah-Hartman wrote:
On Thu, Feb 20, 2025 at 04:46:59PM +0100, Théo Lebrun wrote:
On Thu Feb 20, 2025 at 3:06 PM CET, Greg Kroah-Hartman wrote:
On Thu, Feb 20, 2025 at 02:31:29PM +0100, Théo Lebrun wrote:
On Thu Feb 20, 2025 at 1:41 PM CET, Greg Kroah-Hartman wrote:
On Tue, Feb 18, 2025 at 12:00:11PM +0100, Théo Lebrun wrote:
The solution proposed is to add a flag to platform_device that tells if it is responsible for freeing its name. We can then duplicate the device name inside of_device_add() instead of copying the pointer.
Ick.
What is done elsewhere?
- Platform bus code does a copy of the argument name that is stored alongside the struct platform_device; see platform_device_alloc()[1].
- Other busses duplicate the device name; either through a dynamic allocation [2] or through an array embedded inside devices [3].
- Some busses don't have a separate name; when they want a name they take it from the device [4].
Really ick.
Let's do the right thing here and just get rid of the name pointer entirely in struct platform_device please. Isn't that the correct thing that way the driver core logic will work properly for all of this.
I would agree, if it wasn't for this consideration that is found in the commit message [0]:
What, that the of code is broken? Then it should be fixed, why does it need a pointer to a name at all anyway? It shouldn't be needed there either.
I cannot guess why it originally has a separate pdev->name field.
Many people got this wrong when we designed busses, it's not unique. But we should learn from our mistakes where we can :)
It is important to duplicate! pdev->name must not change to make sure the platform_match() return value is stable over time. If we updated pdev->name alongside dev->name, once a device probes and changes its name then the platform_match() return value would change.
I'd be fine sending a V2 that removes the field *and the fallback* [1], but I don't have the full scope in mind to know what would become broken.
The fallback will not need to be removed; properly point it to the name of the device and it should work correctly.
No, it will not work correctly, as the above quote indicates.
I don't know which quote, sorry.
Let's assume we remove the field, this situation would be broken:
- OF allocates platform devices and gives them names.
- A device matches with a driver, which gets probed.
- During the probe, driver does a dev_set_name().
- Afterwards, subsequent platform_match() calls against other drivers are made with a different device name.
We should be safe as there are guardrails against probing a device twice; see __driver_probe_device(), which checks that dev->driver is NULL. But it isn't a situation we should be in.
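The scenario above can be sketched in userspace C. This is only a model under stated assumptions: `struct mock_device`, `mock_match()`, `mock_set_name()` and `mock_probe()` are hypothetical stand-ins for the device, the name fallback of platform_match(), dev_set_name() and the dev->driver check respectively. It shows the match result flipping after a rename while the "already bound" guard still prevents a second probe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a device whose only name is the device name. */
struct mock_device {
	char name[32];            /* the single, renamable name */
	const char *bound_driver; /* NULL while unbound, like dev->driver */
};

/* Name fallback only: a driver matches when the names are equal. */
static bool mock_match(const struct mock_device *dev, const char *drv_name)
{
	return strcmp(dev->name, drv_name) == 0;
}

/* Overwrite the device name (truncating if it does not fit). */
static void mock_set_name(struct mock_device *dev, const char *name)
{
	snprintf(dev->name, sizeof(dev->name), "%s", name);
}

/* Bind only if the device is unbound and the names match. */
static bool mock_probe(struct mock_device *dev, const char *drv_name)
{
	if (dev->bound_driver != NULL)
		return false;
	if (!mock_match(dev, drv_name))
		return false;
	dev->bound_driver = drv_name;
	return true;
}

/* The four steps listed above, in order. */
static void mock_scenario(void)
{
	struct mock_device dev = { .name = "foo", .bound_driver = NULL };

	assert(mock_match(&dev, "foo"));         /* matches before the rename */
	assert(mock_probe(&dev, "foo"));         /* driver "foo" binds */
	mock_set_name(&dev, "pretty-foo");       /* rename during probe */
	assert(!mock_match(&dev, "foo"));        /* match result has changed */
	assert(mock_match(&dev, "pretty-foo"));  /* another name now matches */
	assert(!mock_probe(&dev, "pretty-foo")); /* guard: already bound */
}
```

The last assertion is the guardrail: the match result is unstable, but no second bind happens.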
The fragility of attempting to match a driver to a device purely by a name is a very weak part of using platform devices.
I never said the opposite, and I agree. However the mechanism exists and I was focused on not breaking it.
Why would a driver change the device name? It's been given to the driver to "bind to" not to change its name. That shouldn't be ok, fix those drivers.
I do get the argument that devices shouldn't change device names. I'll play devil's advocate and give at least one argument FOR allowing changing names: prettier names, especially as device names leak into userspace through pseudo filesystems.
Then that same driver should have created a prettier name when it created the device and sent it to the driver core :)
If we agree that device names shouldn't be changed once a device is matched with a driver, then (1) we can remove the pdev->name field and (2) `dev_set_name()` should warn when used too late.
Turn the implicit explicit.
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 5a1f05198114..3532b068e32d 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3462,10 +3462,13 @@ static void device_remove_class_symlinks(struct device *dev)
 int dev_set_name(struct device *dev, const char *fmt, ...)
 {
 	va_list vargs;
 	int err;

+	if (dev_WARN_ONCE(dev, dev->driver, "device name is static once matched"))
+		return -EPERM;
What? No, this is a platform driver thing, not a driver core thing. Let's just remove the name pointer in the platform driver structure and then we can handle the rest from there.
 	va_start(vargs, fmt);
 	err = kobject_set_name_vargs(&dev->kobj, fmt, vargs);
 	va_end(vargs);
 	return err;
 }
(Unsure about the exact error code to return.)
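For readers who want to try the idea outside the kernel, the check in the diff above can be modelled in userspace as follows. This is a sketch under assumptions: `struct renamable_dev` and `guarded_set_name()` are hypothetical names, a fixed-size buffer replaces the kobject name allocation, and the -EINVAL truncation case does not exist in the real dev_set_name().

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct device. */
struct renamable_dev {
	char name[32];
	const void *driver; /* non-NULL once matched, like dev->driver */
};

/* Refuse to rename once a driver is bound; otherwise format the new name. */
static int guarded_set_name(struct renamable_dev *dev, const char *fmt, ...)
{
	va_list vargs;
	int len;

	if (dev->driver)
		return -EPERM; /* device name is static once matched */

	va_start(vargs, fmt);
	len = vsnprintf(dev->name, sizeof(dev->name), fmt, vargs);
	va_end(vargs);

	/* Sketch-only error: the name did not fit and was truncated. */
	if (len < 0 || (size_t)len >= sizeof(dev->name))
		return -EINVAL;
	return 0;
}

/* Rename works while unbound and is rejected once bound. */
static void guarded_set_name_demo(void)
{
	static const int fake_driver = 1;
	struct renamable_dev dev = { .name = "", .driver = NULL };

	assert(guarded_set_name(&dev, "i2s.%d", 0) == 0);
	assert(strcmp(dev.name, "i2s.0") == 0);

	dev.driver = &fake_driver; /* pretend a driver got bound */
	assert(guarded_set_name(&dev, "late-name") == -EPERM);
	assert(strcmp(dev.name, "i2s.0") == 0); /* name unchanged */
}
```

The bound check comes first, so a rejected call leaves the existing name untouched.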
[...]
Do we have examples today of platform drivers that like to rename devices? I did a quick search and couldn't find any in-tree, but I might have missed some.
The cover letter expands on the quest for those drivers:
On Tue Feb 18, 2025 at 12:00 PM CET, Théo Lebrun wrote:
Out of the 37 drivers that deal with platform devices and do a dev_set_name() call, only one might be affected. That driver is loongson-i2s-plat [0]. All other dev_set_name() calls are on children devices created on the spot. The issue was found on downstream kernels and we don't have what it takes to test loongson-i2s-plat.
out-of-tree drivers don't matter to us :)
[...]
⟩ # Finding potential trouble-makers:
⟩ git grep -l 'struct platform_device' | xargs grep -l dev_set_name
[...]
[...]
Or if this really is an issue, let's fix OF to not use the platform bus and have its own bus for stuff like this.
That used to exist! I cannot see how it could be a good idea to reintroduce the distinction though.
commit eca3930163ba8884060ce9d9ff5ef0d9b7c7b00f
Author: Grant Likely <grant.likely@secretlab.ca>
Date: Tue Jun 8 07:48:21 2010 -0600

    of: Merge of_platform_bus_type with platform_bus_type
True, that was nice, but we shouldn't let one force bugs in the other :)
Anyway try removing the name pointer and let's see what falls out.
thanks,
greg k-h
On Thu, 20 Feb 2025 19:26:41 +0100 Théo Lebrun theo.lebrun@bootlin.com wrote:
That used to exist! I cannot see how it could be a good idea to reintroduce the distinction though.
commit eca3930163ba8884060ce9d9ff5ef0d9b7c7b00f
Author: Grant Likely <grant.likely@secretlab.ca>
Date: Tue Jun 8 07:48:21 2010 -0600

    of: Merge of_platform_bus_type with platform_bus_type
I don't really see how an of_platform bus would make sense. OF is not a bus at all, it's a way of providing HW description to an operating system.
What would IMO make a lot more sense is mmio_bus, for Memory-Mapped I/O peripherals. mmio_device can be described through OF, through old-style board.c, possibly through ACPI, or other means.
But in my eyes, the current platform bus is exactly this: the bus for MMIO devices. It would have been clearer to name it mmio_bus, and that would have probably prevented abuses of the platform bus for things that aren't memory-mapped peripherals.
But clearly any bus that has "OF" in its name is wrong, as OF cannot be a bus. Keep in mind that OF allows describing not only MMIO devices, but also I2C devices, SPI devices, MMC/SDIO devices, PCI devices, USB devices, etc. OF is a description of the HW, not a bus.
Best regards,
Thomas