Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
Note that jc42 devices are registered correctly and work with and without the change.
# grep . /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-*/name /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0018/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0019/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001a/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001b/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0050/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0051/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0052/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0053/name:spd
# sensors|grep -A4 jc42-i2c jc42-i2c-12-1b Adapter: SMBus I801 adapter at 3000 temp1: +33.2°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-19 Adapter: SMBus I801 adapter at 3000 temp1: +33.5°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-1a Adapter: SMBus I801 adapter at 3000 temp1: +33.5°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-18 Adapter: SMBus I801 adapter at 3000 temp1: +33.2°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C)
dmesg: [ 0.000000] DMI: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 (...) [ 7.681132] i2c_dev: i2c /dev entries driver [ 7.687116] i2c i2c-12: 4/4 memory slots populated (from DMI) [ 7.690623] at24 12-0050: 256 byte spd EEPROM, read-only [ 7.691812] i2c i2c-12: Successfully instantiated SPD at 0x50 [ 7.698246] at24 12-0051: 256 byte spd EEPROM, read-only [ 7.699465] i2c i2c-12: Successfully instantiated SPD at 0x51 [ 7.700043] i2c i2c-12: Failed to register i2c client jc42 at 0x19 (-16) [ 7.700047] i2c i2c-12: Failed creating jc42 at 0x19 [ 7.705248] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001a' [ 7.711617] <TASK> [ 7.712612] dump_stack_lvl+0x37/0x4a [ 7.712612] sysfs_warn_dup+0x55/0x61 [ 7.715616] sysfs_create_dir_ns+0xa6/0xd2 [ 7.716620] kobject_add_internal+0xc3/0x1c0 [ 7.716620] kobject_add+0xba/0xe4 [ 7.719615] ? device_add+0x53/0x726 [ 7.720611] device_add+0x132/0x726 [ 7.720611] i2c_new_client_device+0x1ee/0x246 [ 7.723616] at24_probe+0x5f8/0x666 [ 7.724642] ? __pfx_at24_read+0x10/0x10 [ 7.724642] ? __pfx_at24_write+0x10/0x10 [ 7.724642] ? __pfx___device_attach_driver+0x10/0x10 [ 7.727619] i2c_device_probe+0x1b7/0x240 [ 7.728612] really_probe+0x101/0x248 [ 7.728612] __driver_probe_device+0xbb/0xed [ 7.731620] driver_probe_device+0x1a/0x72 [ 7.732621] __device_attach_driver+0x82/0x96 [ 7.732621] bus_for_each_drv+0xa6/0xd4 [ 7.732621] __device_attach+0xa8/0x12a [ 7.735619] bus_probe_device+0x31/0x95 [ 7.736614] device_add+0x265/0x726 [ 7.736614] i2c_new_client_device+0x1ee/0x246 [ 7.739618] i2c_register_spd+0x1a1/0x1ed [ 7.740613] i801_probe+0x589/0x603 [ 7.740613] ? up_write+0x37/0x4d [ 7.740613] ? kernfs_add_one+0x104/0x126 [ 7.743618] ? __raw_spin_unlock_irqrestore+0x14/0x29 [ 7.744612] pci_device_probe+0xbe/0x12f [ 7.744612] really_probe+0x101/0x248 [ 7.744612] __driver_probe_device+0xbb/0xed [ 7.747618] driver_probe_device+0x1a/0x72 [ 7.748612] __driver_attach_async_helper+0x2d/0x42 [ 7.748612] async_run_entry_fn+0x25/0xa0 [ 7.748612] process_scheduled_works+0x193/0x291 [ 7.748612] worker_thread+0x1c5/0x21f [ 7.751619] ? __pfx_worker_thread+0x10/0x10 [ 7.752611] kthread+0xf6/0xfe [ 7.752611] ? __pfx_kthread+0x10/0x10 [ 7.752611] ret_from_fork+0x23/0x35 [ 7.755621] ? __pfx_kthread+0x10/0x10 [ 7.756613] ret_from_fork_asm+0x1b/0x30 [ 7.756613] </TASK> [ 7.759637] i2c i2c-12: Failed to register i2c client jc42 at 0x1a (-17) [ 7.760815] at24 12-0052: 256 byte spd EEPROM, read-only [ 7.762047] i2c i2c-12: Successfully instantiated SPD at 0x52 [ 7.765252] i2c i2c-12: Failed to register i2c client jc42 at 0x1b (-16) [ 7.766126] at24 12-0053: 256 byte spd EEPROM, read-only [ 7.767584] i2c i2c-12: Successfully instantiated SPD at 0x53
Thanks, Krzysztof
On 6/23/24 11:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
Note that jc42 devices are registered correctly and work with and without the change.
My guess is that the devices are fist instantiated through the jc42 driver's _detect function and then again from the at24 driver. The at24 driver should possibly call i2c_new_scanned_device() instead of i2c_new_client_device() to only instantiate the device if it wasn't already instantiated.
Guenter
# grep . /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-*/name /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0018/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0019/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001a/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001b/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0050/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0051/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0052/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0053/name:spd
# sensors|grep -A4 jc42-i2c jc42-i2c-12-1b Adapter: SMBus I801 adapter at 3000 temp1: +33.2°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-19 Adapter: SMBus I801 adapter at 3000 temp1: +33.5°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-1a Adapter: SMBus I801 adapter at 3000 temp1: +33.5°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-18 Adapter: SMBus I801 adapter at 3000 temp1: +33.2°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C)
dmesg: [ 0.000000] DMI: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 (...) [ 7.681132] i2c_dev: i2c /dev entries driver [ 7.687116] i2c i2c-12: 4/4 memory slots populated (from DMI) [ 7.690623] at24 12-0050: 256 byte spd EEPROM, read-only [ 7.691812] i2c i2c-12: Successfully instantiated SPD at 0x50 [ 7.698246] at24 12-0051: 256 byte spd EEPROM, read-only [ 7.699465] i2c i2c-12: Successfully instantiated SPD at 0x51 [ 7.700043] i2c i2c-12: Failed to register i2c client jc42 at 0x19 (-16) [ 7.700047] i2c i2c-12: Failed creating jc42 at 0x19 [ 7.705248] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001a' [ 7.711617] <TASK> [ 7.712612] dump_stack_lvl+0x37/0x4a [ 7.712612] sysfs_warn_dup+0x55/0x61 [ 7.715616] sysfs_create_dir_ns+0xa6/0xd2 [ 7.716620] kobject_add_internal+0xc3/0x1c0 [ 7.716620] kobject_add+0xba/0xe4 [ 7.719615] ? device_add+0x53/0x726 [ 7.720611] device_add+0x132/0x726 [ 7.720611] i2c_new_client_device+0x1ee/0x246 [ 7.723616] at24_probe+0x5f8/0x666 [ 7.724642] ? __pfx_at24_read+0x10/0x10 [ 7.724642] ? __pfx_at24_write+0x10/0x10 [ 7.724642] ? __pfx___device_attach_driver+0x10/0x10 [ 7.727619] i2c_device_probe+0x1b7/0x240 [ 7.728612] really_probe+0x101/0x248 [ 7.728612] __driver_probe_device+0xbb/0xed [ 7.731620] driver_probe_device+0x1a/0x72 [ 7.732621] __device_attach_driver+0x82/0x96 [ 7.732621] bus_for_each_drv+0xa6/0xd4 [ 7.732621] __device_attach+0xa8/0x12a [ 7.735619] bus_probe_device+0x31/0x95 [ 7.736614] device_add+0x265/0x726 [ 7.736614] i2c_new_client_device+0x1ee/0x246 [ 7.739618] i2c_register_spd+0x1a1/0x1ed [ 7.740613] i801_probe+0x589/0x603 [ 7.740613] ? up_write+0x37/0x4d [ 7.740613] ? kernfs_add_one+0x104/0x126 [ 7.743618] ? __raw_spin_unlock_irqrestore+0x14/0x29 [ 7.744612] pci_device_probe+0xbe/0x12f [ 7.744612] really_probe+0x101/0x248 [ 7.744612] __driver_probe_device+0xbb/0xed [ 7.747618] driver_probe_device+0x1a/0x72 [ 7.748612] __driver_attach_async_helper+0x2d/0x42 [ 7.748612] async_run_entry_fn+0x25/0xa0 [ 7.748612] process_scheduled_works+0x193/0x291 [ 7.748612] worker_thread+0x1c5/0x21f [ 7.751619] ? __pfx_worker_thread+0x10/0x10 [ 7.752611] kthread+0xf6/0xfe [ 7.752611] ? __pfx_kthread+0x10/0x10 [ 7.752611] ret_from_fork+0x23/0x35 [ 7.755621] ? __pfx_kthread+0x10/0x10 [ 7.756613] ret_from_fork_asm+0x1b/0x30 [ 7.756613] </TASK> [ 7.759637] i2c i2c-12: Failed to register i2c client jc42 at 0x1a (-17) [ 7.760815] at24 12-0052: 256 byte spd EEPROM, read-only [ 7.762047] i2c i2c-12: Successfully instantiated SPD at 0x52 [ 7.765252] i2c i2c-12: Failed to register i2c client jc42 at 0x1b (-16) [ 7.766126] at24 12-0053: 256 byte spd EEPROM, read-only [ 7.767584] i2c i2c-12: Successfully instantiated SPD at 0x53
Thanks, Krzysztof
On 23.06.2024 at 22:33, Guenter Roeck wrote:
On 6/23/24 11:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
Note that jc42 devices are registered correctly and work with and without the change.
My guess is that the devices are fist instantiated through the jc42 driver's _detect function and then again from the at24 driver. The at24 driver should possibly call i2c_new_scanned_device() instead of i2c_new_client_device() to only instantiate the device if it wasn't already instantiated.
i2c_new_scanned_device() also calls i2c_default_probe() at the end (unless different probe is provided) which seems risky given the comment that explains that it would use quick write for that address. However, maybe it is safe in this case? I wish we had a way to just tell "no probing is needed".
We also know the exact address so no scanning is needed.
Perhaps it would be better to just call i2c_check_addr_busy() in at24_probe_temp_sensor()?
Something like this: --- a/drivers/misc/eeprom/at24.c 2024-06-24 09:16:11.251855130 +0200 +++ b/drivers/misc/eeprom/at24.c 2024-06-24 09:27:01.158170725 +0200 @@ -603,6 +603,10 @@
info.addr = 0x18 | (client->addr & 7);
+ /* The device may be already instantiated through the jc42 driver */ + if (i2c_check_addr_busy(client->adapter, info.addr)) + return; + i2c_new_client_device(client->adapter, &info); }
Unfortunately, i2c_check_addr_busy is not exported and declared as static, I assume intentionally? Unless this can be changed, we are back to the original recommendation:
--- a/drivers/misc/eeprom/at24.c 2024-06-24 09:16:11.251855130 +0200 +++ b/drivers/misc/eeprom/at24.c 2024-06-24 10:25:39.142567472 +0200 @@ -585,6 +585,7 @@ { struct at24_data *at24 = i2c_get_clientdata(client); struct i2c_board_info info = { .type = "jc42" }; + unsigned short addr_list[] = { 0, I2C_CLIENT_END }; int ret; u8 val;
@@ -601,9 +602,10 @@ if (ret || !(val & BIT(7))) return;
- info.addr = 0x18 | (client->addr & 7); + addr_list[0] = 0x18 | (client->addr & 7);
- i2c_new_client_device(client->adapter, &info); + /* The device may be already instantiated through the jc42 driver */ + i2c_new_scanned_device(client->adapter, &info, addr_list, NULL); }
static int at24_probe(struct i2c_client *client)
For now compile-tested only given the write-test concern above.
That said, I have some follow-up questions:
1. if the jc42 driver handles this already, I wonder what's the point of adding at24_probe_temp_sensor()? Is there a situation where it would not do it properly? Or do we expect to remove the probing functionally from jc42.c?
2. I don't understand why we are also getting the "Failed creating jc42" and "sysfs: cannot create duplicate filename" errors since i2c_new_client_device() calls i2c_check_addr_busy() on its own and should abort after the first error message?
3. (unrelated but found while looking at the code) The comment for delete_device_store() seems to be outdated as it mentions i2c_sysfs_new_device which does not exist any longer, as it was renamed in "i2c: core: Use DEVICE_ATTR_*() helper macros" back in 2019: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/dr... -
For the Greg's question if it is also in 6.9: I have not tested that kernel yet, but unless there have been some recent changes in the i2c code I would expect it should behave the same way. If required, I should be able to do this next week.
Krzysztof
On 6/24/24 01:38, Krzysztof Olędzki wrote:
On 23.06.2024 at 22:33, Guenter Roeck wrote:
On 6/23/24 11:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
Note that jc42 devices are registered correctly and work with and without the change.
My guess is that the devices are fist instantiated through the jc42 driver's _detect function and then again from the at24 driver. The at24 driver should possibly call i2c_new_scanned_device() instead of i2c_new_client_device() to only instantiate the device if it wasn't already instantiated.
i2c_new_scanned_device() also calls i2c_default_probe() at the end (unless different probe is provided) which seems risky given the comment that explains that it would use quick write for that address. However, maybe it is safe in this case? I wish we had a way to just tell "no probing is needed".
Sorry, I don't understand why it would be less risky to just probe the device without such a test.
We also know the exact address so no scanning is needed.
Perhaps it would be better to just call i2c_check_addr_busy() in at24_probe_temp_sensor()?
Something like this: --- a/drivers/misc/eeprom/at24.c 2024-06-24 09:16:11.251855130 +0200 +++ b/drivers/misc/eeprom/at24.c 2024-06-24 09:27:01.158170725 +0200 @@ -603,6 +603,10 @@ info.addr = 0x18 | (client->addr & 7);
- /* The device may be already instantiated through the jc42 driver */
- if (i2c_check_addr_busy(client->adapter, info.addr))
return;
- i2c_new_client_device(client->adapter, &info); }
Unfortunately, i2c_check_addr_busy is not exported and declared as static,
That is why I did not suggest that.
I assume intentionally? Unless this can be changed, we are back to the original recommendation:
--- a/drivers/misc/eeprom/at24.c 2024-06-24 09:16:11.251855130 +0200 +++ b/drivers/misc/eeprom/at24.c 2024-06-24 10:25:39.142567472 +0200 @@ -585,6 +585,7 @@ { struct at24_data *at24 = i2c_get_clientdata(client); struct i2c_board_info info = { .type = "jc42" };
- unsigned short addr_list[] = { 0, I2C_CLIENT_END }; int ret; u8 val;
@@ -601,9 +602,10 @@ if (ret || !(val & BIT(7))) return;
- info.addr = 0x18 | (client->addr & 7);
- addr_list[0] = 0x18 | (client->addr & 7);
- i2c_new_client_device(client->adapter, &info);
- /* The device may be already instantiated through the jc42 driver */
- i2c_new_scanned_device(client->adapter, &info, addr_list, NULL); }
static int at24_probe(struct i2c_client *client)
For now compile-tested only given the write-test concern above.
The device detect code in the i2c core does that same write-test that you are concerned about.
That said, I have some follow-up questions:
- if the jc42 driver handles this already, I wonder what's the point of adding
at24_probe_temp_sensor()? Is there a situation where it would not do it properly? Or do we expect to remove the probing functionally from jc42.c?
The jc42 driver is not auto-loaded. When suggesting to remove the "probing functionally", I assume you mean to remove its detect function. That would only work if SPD EEPROMs were only connected to I2C adapters calling i2c_register_spd(), and if the systems with those adapters would support DMI.
In v6.9, i2c_register_spd() is only called from the i801 driver (Intel systems). In v6.11, piix4 (AMD) will be added. Even after that, all non-Intel / non-AMD systems would no longer be able to support jc42 compatible chips by just loading the jc42 driver. That would not be acceptable.
- I don't understand why we are also getting the "Failed creating jc42" and
"sysfs: cannot create duplicate filename" errors since i2c_new_client_device() calls i2c_check_addr_busy() on its own and should abort after the first error message?
The "Failed creating" message is from the i2c core's detect function which is only called if a new i2c adapter is added. This is actually the case here, since the call sequence of the backtrace includes i801_probe(). It looks like i2c_detect() runs asynchronously and doesn't protect itself against having devices added to a bus while it is running on that same bus. That is just a guess, though - I have not tried to verify it.
That does suggest, though, that even your suggested code above might not completely fix the problem. It may be necessary to call i2c_lock_bus() or similar from i2c_new_scanned_device() and i2c_detect(), but I don't know if that is save, sufficient, or even possible.
- (unrelated but found while looking at the code) The comment for
delete_device_store() seems to be outdated as it mentions i2c_sysfs_new_device which does not exist any longer, as it was renamed in "i2c: core: Use DEVICE_ATTR_*() helper macros" back in 2019: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/dr... -
For the Greg's question if it is also in 6.9: I have not tested that kernel yet, but unless there have been some recent changes in the i2c code I would expect it should behave the same way. If required, I should be able to do this next week.
Agreed.
Guenter
On 6/24/24 07:54, Guenter Roeck wrote: [ ... ]
That said, I have some follow-up questions:
- if the jc42 driver handles this already, I wonder what's the point of adding
at24_probe_temp_sensor()? Is there a situation where it would not do it properly? Or do we expect to remove the probing functionally from jc42.c?
The jc42 driver is not auto-loaded. When suggesting to remove the "probing functionally", I assume you mean to remove its detect function. That would only work if SPD EEPROMs were only connected to I2C adapters calling i2c_register_spd(), and if the systems with those adapters would support DMI.
In v6.9, i2c_register_spd() is only called from the i801 driver (Intel systems). In v6.11, piix4 (AMD) will be added. Even after that, all non-Intel / non-AMD systems would no longer be able to support jc42 compatible chips by just loading the jc42 driver. That would not be acceptable.
There is another reason to not remove the detect function, one that I just found in my system when I tried to reproduce the problem: While SPD data is supposed to identify if a DIMM supports a temperature sensor, this is not always the case. The DIMMs in one of my systems (F4-3200C14-16GTZSW) do support temperature sensors, but the respective bit in the SPD data is not set. From raw SPD data:
000000 23 10 0c 02 85 21 00 08 00 40 00 03 09 03 00 00 ^^ Bit 7 is supposed to be set but isn't.
This means that the thermal sensors on the DIMMs in my system would not be instantiated without detect function and require manual instantiation.
Guenter
On 24.06.2024 16:54, Guenter Roeck wrote:
On 6/24/24 01:38, Krzysztof Olędzki wrote:
On 23.06.2024 at 22:33, Guenter Roeck wrote:
On 6/23/24 11:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
Note that jc42 devices are registered correctly and work with and without the change.
My guess is that the devices are fist instantiated through the jc42 driver's _detect function and then again from the at24 driver. The at24 driver should possibly call i2c_new_scanned_device() instead of i2c_new_client_device() to only instantiate the device if it wasn't already instantiated.
i2c_new_scanned_device() also calls i2c_default_probe() at the end (unless different probe is provided) which seems risky given the comment that explains that it would use quick write for that address. However, maybe it is safe in this case? I wish we had a way to just tell "no probing is needed".
Sorry, I don't understand why it would be less risky to just probe the device without such a test.
We also know the exact address so no scanning is needed.
Perhaps it would be better to just call i2c_check_addr_busy() in at24_probe_temp_sensor()?
Something like this: --- a/drivers/misc/eeprom/at24.c 2024-06-24 09:16:11.251855130 +0200 +++ b/drivers/misc/eeprom/at24.c 2024-06-24 09:27:01.158170725 +0200 @@ -603,6 +603,10 @@ info.addr = 0x18 | (client->addr & 7); + /* The device may be already instantiated through the jc42 driver */ + if (i2c_check_addr_busy(client->adapter, info.addr)) + return;
i2c_new_client_device(client->adapter, &info); }
Unfortunately, i2c_check_addr_busy is not exported and declared as static,
That is why I did not suggest that.
I assume intentionally? Unless this can be changed, we are back to the original recommendation:
--- a/drivers/misc/eeprom/at24.c 2024-06-24 09:16:11.251855130 +0200 +++ b/drivers/misc/eeprom/at24.c 2024-06-24 10:25:39.142567472 +0200 @@ -585,6 +585,7 @@ { struct at24_data *at24 = i2c_get_clientdata(client); struct i2c_board_info info = { .type = "jc42" }; + unsigned short addr_list[] = { 0, I2C_CLIENT_END }; int ret; u8 val; @@ -601,9 +602,10 @@ if (ret || !(val & BIT(7))) return; - info.addr = 0x18 | (client->addr & 7); + addr_list[0] = 0x18 | (client->addr & 7); - i2c_new_client_device(client->adapter, &info); + /* The device may be already instantiated through the jc42 driver */ + i2c_new_scanned_device(client->adapter, &info, addr_list, NULL); } static int at24_probe(struct i2c_client *client)
For now compile-tested only given the write-test concern above.
The device detect code in the i2c core does that same write-test that you are concerned about.
That said, I have some follow-up questions:
- if the jc42 driver handles this already, I wonder what's the point of adding
at24_probe_temp_sensor()? Is there a situation where it would not do it properly? Or do we expect to remove the probing functionally from jc42.c?
The jc42 driver is not auto-loaded. When suggesting to remove the "probing functionally", I assume you mean to remove its detect function. That would only work if SPD EEPROMs were only connected to I2C adapters calling i2c_register_spd(), and if the systems with those adapters would support DMI.
In v6.9, i2c_register_spd() is only called from the i801 driver (Intel systems). In v6.11, piix4 (AMD) will be added. Even after that, all non-Intel / non-AMD systems would no longer be able to support jc42 compatible chips by just loading the jc42 driver. That would not be acceptable.
- I don't understand why we are also getting the "Failed creating jc42" and
"sysfs: cannot create duplicate filename" errors since i2c_new_client_device() calls i2c_check_addr_busy() on its own and should abort after the first error message?
The "Failed creating" message is from the i2c core's detect function which is only called if a new i2c adapter is added. This is actually the case here, since the call sequence of the backtrace includes i801_probe(). It looks like i2c_detect() runs asynchronously and doesn't protect itself against having devices added to a bus while it is running on that same bus. That is just a guess, though - I have not tried to verify it.
Too me the issue also looks like a race. According to the OP's logs: - jc42 at 0x18 is instantiated successfully - jc42 at 0x19 returns -EBUSY. This is what is expected if the device has been instantiated otherwise already. - jc42 at 0x1a returns -EEXIST. Here two instantiations of the the same device seem to collide. - jc42 at 0x1b returns -EBUSY, like at 0x19.
So it looks like referenced change isn't wrong, but reveals an underlying issue with device instantiation races. I'll have a look how this could be fixed.
That does suggest, though, that even your suggested code above might not completely fix the problem. It may be necessary to call i2c_lock_bus() or similar from i2c_new_scanned_device() and i2c_detect(), but I don't know if that is save, sufficient, or even possible.
- (unrelated but found while looking at the code) The comment for
delete_device_store() seems to be outdated as it mentions i2c_sysfs_new_device which does not exist any longer, as it was renamed in "i2c: core: Use DEVICE_ATTR_*() helper macros" back in 2019: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/dr... -
For the Greg's question if it is also in 6.9: I have not tested that kernel yet, but unless there have been some recent changes in the i2c code I would expect it should behave the same way. If required, I should be able to do this next week.
Agreed.
Guenter
On 6/24/24 13:58, Heiner Kallweit wrote: [ ... ]
Too me the issue also looks like a race. According to the OP's logs:
- jc42 at 0x18 is instantiated successfully
- jc42 at 0x19 returns -EBUSY. This is what is expected if the device has been instantiated otherwise already.
- jc42 at 0x1a returns -EEXIST. Here two instantiations of the the same device seem to collide.
- jc42 at 0x1b returns -EBUSY, like at 0x19.
So it looks like referenced change isn't wrong, but reveals an underlying issue with device instantiation races.
It isn't just a race, though. Try to unload the at24 (or ee1004 driver for DDR4) and load it again, and you'll see the -EBUSY errors. Problem is that instantiating those drivers _always_ triggers the call to i2c_new_client_device() even if the jc42 device is already instantiated. Unloading the spd/eeprom driver doesn't unload the jc42 driver, so -EBUSY will be seen if the spd/eeprom driver is loaded again.
I have not been able to reproduce the backtrace with my systems, but those are all with AMD CPUs using the piix4 driver, so timing is likely different. Another difference is that my systems (with DDR4) use the ee1004 driver. That driver instantiates the jc42 devices under a driver lock, so it is guaranteed that a single instantiation doesn't interfere with other instantiations running in parallel.
Guenter
On 24.06.2024 at 20:45, Guenter Roeck wrote:
On 6/24/24 13:58, Heiner Kallweit wrote: [ ... ]
Too me the issue also looks like a race. According to the OP's logs:
- jc42 at 0x18 is instantiated successfully
- jc42 at 0x19 returns -EBUSY. This is what is expected if the device has been instantiated otherwise already.
- jc42 at 0x1a returns -EEXIST. Here two instantiations of the the same device seem to collide.
- jc42 at 0x1b returns -EBUSY, like at 0x19.
So it looks like referenced change isn't wrong, but reveals an underlying issue with device instantiation races.
It isn't just a race, though. Try to unload the at24 (or ee1004 driver for DDR4) and load it again, and you'll see the -EBUSY errors. Problem is that instantiating those drivers _always_ triggers the call to i2c_new_client_device() even if the jc42 device is already instantiated. Unloading the spd/eeprom driver doesn't unload the jc42 driver, so -EBUSY will be seen if the spd/eeprom driver is loaded again.
I have not been able to reproduce the backtrace with my systems, but those are all with AMD CPUs using the piix4 driver, so timing is likely different. Another difference is that my systems (with DDR4) use the ee1004 driver. That driver instantiates the jc42 devices under a driver lock, so it is guaranteed that a single instantiation doesn't interfere with other instantiations running in parallel.
Right, sorry for not mentioning this in the original report:
[ 0.269013] pci 0000:00:1f.3: [8086:1c22] type 00 class 0x0c0500 [ 0.269098] pci 0000:00:1f.3: reg 0x10: [mem 0xc3a02000-0xc3a020ff 64bit] [ 0.269186] pci 0000:00:1f.3: reg 0x20: [io 0x3000-0x301f] [ 0.334962] pci 0000:00:1f.3: Adding to iommu group 7 [ 7.874736] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
$ lspci -s 0000:00:1f.3 -vvnn 00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22] (rev 04) Subsystem: Dell Device [1028:04de] Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Interrupt: pin C routed to IRQ 19 IOMMU group: 7 Region 0: Memory at c3a02000 (64-bit, non-prefetchable) [size=256] Region 4: I/O ports at 3000 [size=32] Kernel driver in use: i801_smbus
Krzysztof
On 27.06.2024 at 04:29, Krzysztof Olędzki wrote:
On 24.06.2024 at 20:45, Guenter Roeck wrote:
On 6/24/24 13:58, Heiner Kallweit wrote: [ ... ]
Too me the issue also looks like a race. According to the OP's logs:
- jc42 at 0x18 is instantiated successfully
- jc42 at 0x19 returns -EBUSY. This is what is expected if the device has been instantiated otherwise already.
- jc42 at 0x1a returns -EEXIST. Here two instantiations of the the same device seem to collide.
- jc42 at 0x1b returns -EBUSY, like at 0x19.
So it looks like referenced change isn't wrong, but reveals an underlying issue with device instantiation races.
It isn't just a race, though. Try to unload the at24 (or ee1004 driver for DDR4) and load it again, and you'll see the -EBUSY errors. Problem is that instantiating those drivers _always_ triggers the call to i2c_new_client_device() even if the jc42 device is already instantiated. Unloading the spd/eeprom driver doesn't unload the jc42 driver, so -EBUSY will be seen if the spd/eeprom driver is loaded again.
I have not been able to reproduce the backtrace with my systems, but those are all with AMD CPUs using the piix4 driver, so timing is likely different. Another difference is that my systems (with DDR4) use the ee1004 driver. That driver instantiates the jc42 devices under a driver lock, so it is guaranteed that a single instantiation doesn't interfere with other instantiations running in parallel.
Right, sorry for not mentioning this in the original report:
[ 0.269013] pci 0000:00:1f.3: [8086:1c22] type 00 class 0x0c0500 [ 0.269098] pci 0000:00:1f.3: reg 0x10: [mem 0xc3a02000-0xc3a020ff 64bit] [ 0.269186] pci 0000:00:1f.3: reg 0x20: [io 0x3000-0x301f] [ 0.334962] pci 0000:00:1f.3: Adding to iommu group 7 [ 7.874736] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
$ lspci -s 0000:00:1f.3 -vvnn 00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22] (rev 04) Subsystem: Dell Device [1028:04de] Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Interrupt: pin C routed to IRQ 19 IOMMU group: 7 Region 0: Memory at c3a02000 (64-bit, non-prefetchable) [size=256] Region 4: I/O ports at 3000 [size=32] Kernel driver in use: i801_smbus
Also, here is a different trace showing a different code path, which even more suggest a race:
[ 7.871973] i2c_dev: i2c /dev entries driver [ 7.878215] i2c i2c-12: 4/4 memory slots populated (from DMI) [ 7.881116] at24 12-0050: 256 byte spd EEPROM, read-only [ 7.881887] i2c i2c-12: Successfully instantiated SPD at 0x50 [ 7.894183] at24 12-0051: 256 byte spd EEPROM, read-only [ 7.895910] i2c i2c-12: Failed to register i2c client jc42 at 0x19 (-16) [ 7.896039] i2c i2c-12: Successfully instantiated SPD at 0x51 [ 7.896850] i2c i2c-12: Failed creating jc42 at 0x19 [ 7.903444] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001a' [ 7.904085] at24 12-0052: 256 byte spd EEPROM, read-only [ 7.905284] Hardware name: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 [ 7.905284] Call Trace: [ 7.905284] <TASK> [ 7.909238] dump_stack_lvl+0x37/0x4a [ 7.910488] at24 12-0053: 256 byte spd EEPROM, read-only [ 7.909855] sysfs_warn_dup+0x55/0x61 [ 7.911456] i2c i2c-12: Successfully instantiated SPD at 0x53 [ 7.911597] sysfs_create_dir_ns+0xa6/0xd2 [ 7.911597] kobject_add_internal+0xc3/0x1c0 [ 7.914606] kobject_add+0xba/0xe4 [ 7.915595] ? device_add+0x53/0x726 [ 7.915595] device_add+0x132/0x726 [ 7.916622] i2c_new_client_device+0x1ee/0x246 [ 7.916622] i2c_detect.isra.0+0x17c/0x223 [ 7.918603] ? __pfx___process_new_driver+0x10/0x10 [ 7.919603] __process_new_driver+0x17/0x1e [ 7.919603] bus_for_each_dev+0x8b/0xcf [ 7.920595] ? __pfx___process_new_driver+0x10/0x10 [ 7.920595] i2c_for_each_dev+0x2d/0x49 [ 7.922608] i2c_register_driver+0x51/0x63 [ 7.922608] ? __pfx_jc42_driver_init+0x10/0x10 [ 7.923595] do_one_initcall+0x93/0x182 [ 7.924601] kernel_init_freeable+0x1be/0x204 [ 7.924601] ? __pfx_kernel_init+0x10/0x10 [ 7.924601] kernel_init+0x15/0x110 [ 7.926609] ret_from_fork+0x23/0x35 [ 7.927602] ? __pfx_kernel_init+0x10/0x10 [ 7.927602] ret_from_fork_asm+0x1b/0x30 [ 7.927602] </TASK> [ 7.929937] kobject: kobject_add_internal failed for 12-001a with -EEXIST, don't try to register things with the same name in the same directory. [ 7.932129] i2c i2c-12: Failed to register i2c client jc42 at 0x1a (-17) [ 7.933257] i2c i2c-12: Failed creating jc42 at 0x1a
Note there is no warning for 0x18 and 0x1b.
# sensors|grep jc42-i2c|sort jc42-i2c-12-18 jc42-i2c-12-19 jc42-i2c-12-1a jc42-i2c-12-1b
Krzysztof
On 24.06.2024 at 07:54, Guenter Roeck wrote:
On 6/24/24 01:38, Krzysztof Olędzki wrote:
On 23.06.2024 at 22:33, Guenter Roeck wrote:
On 6/23/24 11:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
Note that jc42 devices are registered correctly and work with and without the change.
My guess is that the devices are fist instantiated through the jc42 driver's _detect function and then again from the at24 driver. The at24 driver should possibly call i2c_new_scanned_device() instead of i2c_new_client_device() to only instantiate the device if it wasn't already instantiated.
i2c_new_scanned_device() also calls i2c_default_probe() at the end (unless different probe is provided) which seems risky given the comment that explains that it would use quick write for that address. However, maybe it is safe in this case? I wish we had a way to just tell "no probing is needed".
Sorry, I don't understand why it would be less risky to just probe the device without such a test.
I'm referring to this comment on i2c_default_probe():
/* * Legacy default probe function, mostly relevant for SMBus. The default * probe method is a quick write, but it is known to corrupt the 24RF08 * EEPROMs due to a state machine bug, and could also irreversibly * write-protect some EEPROMs, so for address ranges 0x30-0x37 and 0x50-0x5f, * we use a short byte read instead. Also, some bus drivers don't implement * quick write, so we fallback to a byte read in that case too. * On x86, there is another special case for FSC hardware monitoring chips, * which want regular byte reads (address 0x73.) Fortunately, these are the * only known chips using this I2C address on PC hardware. * Returns 1 if probe succeeded, 0 if not. */
<CUT>
For now compile-tested only given the write-test concern above.
The device detect code in the i2c core does that same write-test that you are concerned about.
There is no write-test testing in i2c_new_client_device() and I cannot find how i2c_detect() would do that. However, after looking at this more, it seems that we actually have jc42_detect() precisely for this, as this is what jc42.c provides as the .detect callback.
That said, I have some follow-up questions:
- if the jc42 driver handles this already, I wonder what's the point of adding
at24_probe_temp_sensor()? Is there a situation where it would not do it properly? Or do we expect to remove the probing functionally from jc42.c?
The jc42 driver is not auto-loaded. When suggesting to remove the "probing functionally", I assume you mean to remove its detect function. That would only work if SPD EEPROMs were only connected to I2C adapters calling i2c_register_spd(), and if the systems with those adapters would support DMI.
I'm not suggesting to remove it. I'm just asking why we have two two different mechanisms for doing the same thing and what is the plan longer term? A new code was added, where is seems the old one has worked just fine. As you mentioned in the other response, it even handles well a buggy DIMMS with their eeprom data incorrectly stating no temp sensor.
In v6.9, i2c_register_spd() is only called from the i801 driver (Intel systems). In v6.11, piix4 (AMD) will be added. Even after that, all non-Intel / non-AMD systems would no longer be able to support jc42 compatible chips by just loading the jc42 driver. That would not be acceptable.
Indeed.
- I don't understand why we are also getting the "Failed creating jc42" and
"sysfs: cannot create duplicate filename" errors since i2c_new_client_device() calls i2c_check_addr_busy() on its own and should abort after the first error message?
The "Failed creating" message is from the i2c core's detect function which is only called if a new i2c adapter is added. This is actually the case here, since the call sequence of the backtrace includes i801_probe(). It looks like i2c_detect() runs asynchronously and doesn't protect itself against having devices added to a bus while it is running on that same bus. That is just a guess, though - I have not tried to verify it.
That does suggest, though, that even your suggested code above might not completely fix the problem. It may be necessary to call i2c_lock_bus() or similar from i2c_new_scanned_device() and i2c_detect(), but I don't know if that is save, sufficient, or even possible.
Right... Which again brings the original question: is there a situation where jc42 is not able to detect the temp sensors on its own?
Krzysztof
On 27.06.2024 13:24, Krzysztof Olędzki wrote:
On 24.06.2024 at 07:54, Guenter Roeck wrote:
On 6/24/24 01:38, Krzysztof Olędzki wrote:
On 23.06.2024 at 22:33, Guenter Roeck wrote:
On 6/23/24 11:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
Note that jc42 devices are registered correctly and work with and without the change.
My guess is that the devices are fist instantiated through the jc42 driver's _detect function and then again from the at24 driver. The at24 driver should possibly call i2c_new_scanned_device() instead of i2c_new_client_device() to only instantiate the device if it wasn't already instantiated.
i2c_new_scanned_device() also calls i2c_default_probe() at the end (unless different probe is provided) which seems risky given the comment that explains that it would use quick write for that address. However, maybe it is safe in this case? I wish we had a way to just tell "no probing is needed".
Sorry, I don't understand why it would be less risky to just probe the device without such a test.
I'm referring to this comment on i2c_default_probe():
/*
- Legacy default probe function, mostly relevant for SMBus. The default
- probe method is a quick write, but it is known to corrupt the 24RF08
- EEPROMs due to a state machine bug, and could also irreversibly
- write-protect some EEPROMs, so for address ranges 0x30-0x37 and 0x50-0x5f,
- we use a short byte read instead. Also, some bus drivers don't implement
- quick write, so we fallback to a byte read in that case too.
- On x86, there is another special case for FSC hardware monitoring chips,
- which want regular byte reads (address 0x73.) Fortunately, these are the
- only known chips using this I2C address on PC hardware.
- Returns 1 if probe succeeded, 0 if not.
*/
<CUT>
For now compile-tested only given the write-test concern above.
The device detect code in the i2c core does that same write-test that you are concerned about.
There is no write-test testing in i2c_new_client_device() and I cannot find how i2c_detect() would do that. However, after looking at this more, it seems that we actually have jc42_detect() precisely for this, as this is what jc42.c provides as the .detect callback.
That said, I have some follow-up questions:
- if the jc42 driver handles this already, I wonder what's the point of adding
at24_probe_temp_sensor()? Is there a situation where it would not do it properly? Or do we expect to remove the probing functionally from jc42.c?
The jc42 driver is not auto-loaded. When suggesting to remove the "probing functionally", I assume you mean to remove its detect function. That would only work if SPD EEPROMs were only connected to I2C adapters calling i2c_register_spd(), and if the systems with those adapters would support DMI.
I'm not suggesting to remove it. I'm just asking why we have two two different mechanisms for doing the same thing and what is the plan longer term? A new code was added, where is seems the old one has worked just fine. As you mentioned in the other response, it even handles well a buggy DIMMS with their eeprom data incorrectly stating no temp sensor.
Auto-probing (scanning) is a legacy mechanism, and all use cases but hwmon have been removed meanwhile. One of the reasons is that the detect i2c access could do harm to a device of another type that sits on one of the scanned addresses. Explicit instantiation is preferred.
Another reason why explicit instantiation is preferred: It allows auto-loading of i2c client driver modules. W/o explicit instantiation jc42 needs to be loaded manually (or by some user space tool).
It seems we have two issues: 1. Calls to i2c_new_client_device() aren't serialized. Parallel instantiation of the same device can cause errors like reported here. 2. i2c_new_client_device() complains also if the same device has been instantiated already.
Issue 2 we may be able to tackle w/o big effort: So far we just check for registered clients with the same bus address, and complain if found. It should be possible (not tested yet) to include the client device name in the check. If the names are identical, then we can assume it's the same device that has been registered by other means already, and silence the error message.
In v6.9, i2c_register_spd() is only called from the i801 driver (Intel systems). In v6.11, piix4 (AMD) will be added. Even after that, all non-Intel / non-AMD systems would no longer be able to support jc42 compatible chips by just loading the jc42 driver. That would not be acceptable.
Indeed.
- I don't understand why we are also getting the "Failed creating jc42" and
"sysfs: cannot create duplicate filename" errors since i2c_new_client_device() calls i2c_check_addr_busy() on its own and should abort after the first error message?
The "Failed creating" message is from the i2c core's detect function which is only called if a new i2c adapter is added. This is actually the case here, since the call sequence of the backtrace includes i801_probe(). It looks like i2c_detect() runs asynchronously and doesn't protect itself against having devices added to a bus while it is running on that same bus. That is just a guess, though - I have not tried to verify it.
That does suggest, though, that even your suggested code above might not completely fix the problem. It may be necessary to call i2c_lock_bus() or similar from i2c_new_scanned_device() and i2c_detect(), but I don't know if that is save, sufficient, or even possible.
Right... Which again brings the original question: is there a situation where jc42 is not able to detect the temp sensors on its own?
Krzysztof
On Sun, Jun 23, 2024 at 11:47:39AM -0700, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
So is this also an issue in 6.9?
thanks,
greg k-h
On 6/23/24 22:43, Greg Kroah-Hartman wrote:
On Sun, Jun 23, 2024 at 11:47:39AM -0700, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
So is this also an issue in 6.9?
I didn't verify, but I am quite sure that it is seen there as well.
Guenter
On 23.06.2024 20:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
Note that jc42 devices are registered correctly and work with and without the change.
# grep . /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-*/name /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0018/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0019/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001a/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001b/name:jc42 /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0050/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0051/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0052/name:spd /sys/devices/pci0000:00/0000:00:1f.3/i2c-12/12-0053/name:spd
# sensors|grep -A4 jc42-i2c jc42-i2c-12-1b Adapter: SMBus I801 adapter at 3000 temp1: +33.2°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-19 Adapter: SMBus I801 adapter at 3000 temp1: +33.5°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-1a Adapter: SMBus I801 adapter at 3000 temp1: +33.5°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C) -- jc42-i2c-12-18 Adapter: SMBus I801 adapter at 3000 temp1: +33.2°C (low = +0.0°C) (high = +91.0°C, hyst = +91.0°C) (crit = +95.0°C, hyst = +95.0°C)
dmesg: [ 0.000000] DMI: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 (...) [ 7.681132] i2c_dev: i2c /dev entries driver [ 7.687116] i2c i2c-12: 4/4 memory slots populated (from DMI) [ 7.690623] at24 12-0050: 256 byte spd EEPROM, read-only [ 7.691812] i2c i2c-12: Successfully instantiated SPD at 0x50 [ 7.698246] at24 12-0051: 256 byte spd EEPROM, read-only [ 7.699465] i2c i2c-12: Successfully instantiated SPD at 0x51 [ 7.700043] i2c i2c-12: Failed to register i2c client jc42 at 0x19 (-16) [ 7.700047] i2c i2c-12: Failed creating jc42 at 0x19 [ 7.705248] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1f.3/i2c-12/12-001a' [ 7.711617] <TASK> [ 7.712612] dump_stack_lvl+0x37/0x4a [ 7.712612] sysfs_warn_dup+0x55/0x61 [ 7.715616] sysfs_create_dir_ns+0xa6/0xd2 [ 7.716620] kobject_add_internal+0xc3/0x1c0 [ 7.716620] kobject_add+0xba/0xe4 [ 7.719615] ? device_add+0x53/0x726 [ 7.720611] device_add+0x132/0x726 [ 7.720611] i2c_new_client_device+0x1ee/0x246 [ 7.723616] at24_probe+0x5f8/0x666 [ 7.724642] ? __pfx_at24_read+0x10/0x10 [ 7.724642] ? __pfx_at24_write+0x10/0x10 [ 7.724642] ? __pfx___device_attach_driver+0x10/0x10 [ 7.727619] i2c_device_probe+0x1b7/0x240 [ 7.728612] really_probe+0x101/0x248 [ 7.728612] __driver_probe_device+0xbb/0xed [ 7.731620] driver_probe_device+0x1a/0x72 [ 7.732621] __device_attach_driver+0x82/0x96 [ 7.732621] bus_for_each_drv+0xa6/0xd4 [ 7.732621] __device_attach+0xa8/0x12a [ 7.735619] bus_probe_device+0x31/0x95 [ 7.736614] device_add+0x265/0x726 [ 7.736614] i2c_new_client_device+0x1ee/0x246 [ 7.739618] i2c_register_spd+0x1a1/0x1ed [ 7.740613] i801_probe+0x589/0x603 [ 7.740613] ? up_write+0x37/0x4d [ 7.740613] ? kernfs_add_one+0x104/0x126 [ 7.743618] ? __raw_spin_unlock_irqrestore+0x14/0x29 [ 7.744612] pci_device_probe+0xbe/0x12f [ 7.744612] really_probe+0x101/0x248 [ 7.744612] __driver_probe_device+0xbb/0xed [ 7.747618] driver_probe_device+0x1a/0x72 [ 7.748612] __driver_attach_async_helper+0x2d/0x42 [ 7.748612] async_run_entry_fn+0x25/0xa0 [ 7.748612] process_scheduled_works+0x193/0x291 [ 7.748612] worker_thread+0x1c5/0x21f [ 7.751619] ? __pfx_worker_thread+0x10/0x10 [ 7.752611] kthread+0xf6/0xfe [ 7.752611] ? __pfx_kthread+0x10/0x10 [ 7.752611] ret_from_fork+0x23/0x35 [ 7.755621] ? __pfx_kthread+0x10/0x10 [ 7.756613] ret_from_fork_asm+0x1b/0x30 [ 7.756613] </TASK> [ 7.759637] i2c i2c-12: Failed to register i2c client jc42 at 0x1a (-17) [ 7.760815] at24 12-0052: 256 byte spd EEPROM, read-only [ 7.762047] i2c i2c-12: Successfully instantiated SPD at 0x52 [ 7.765252] i2c i2c-12: Failed to register i2c client jc42 at 0x1b (-16) [ 7.766126] at24 12-0053: 256 byte spd EEPROM, read-only [ 7.767584] i2c i2c-12: Successfully instantiated SPD at 0x53
Thanks, Krzysztof
Could you please test whether the attached two experimental patches fix the issue for you? They serialize client device instantiation per I2C adapter, and include the client device name in the check whether a bus address is busy.
On 02.07.2024 at 13:25, Heiner Kallweit wrote:
On 23.06.2024 20:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
<CUT>
Could you please test whether the attached two experimental patches fix the issue for you? They serialize client device instantiation per I2C adapter, and include the client device name in the check whether a bus address is busy.
Sadly, they crash the kernel.
I will get serial console attached there next week, so will be able to capture the full crash. For now, I was able to obtain a photo. I'm very sorry for the quality, just wanted to provide something for now.
Krzysztof
On 06.07.2024 at 18:42, Krzysztof Olędzki wrote:
On 02.07.2024 at 13:25, Heiner Kallweit wrote:
On 23.06.2024 20:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
<CUT>
Could you please test whether the attached two experimental patches fix the issue for you? They serialize client device instantiation per I2C adapter, and include the client device name in the check whether a bus address is busy.
Sadly, they crash the kernel.
I will get serial console attached there next week, so will be able to capture the full crash. For now, I was able to obtain a photo. I'm very sorry for the quality, just wanted to provide something for now.
Sorry it took me so long - my attempts to coordinate setting up serial console were not successful, so it had to wait for me to go there in person...
I have attached complete dmesg, summary:
[ 10.905953] rtmutex deadlock detected [ 10.909959] WARNING: CPU: 5 PID: 83 at kernel/locking/rtmutex.c:1642 __rt_mutex_slowlock_locked.constprop.0+0x10f/0x1a5 [ 10.920961] CPU: 5 PID: 83 Comm: kworker/u16:3 Not tainted 6.6.34-o5 #1 [ 10.929970] Hardware name: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 [ 10.938954] Workqueue: events_unbound async_run_entry_fn
[ 11.336954] BUG: scheduling while atomic: kworker/u16:3/83/0x00000002 [ 11.342953] Preemption disabled at: [ 11.342953] [<0000000000000000>] 0x0 [ 11.350953] CPU: 5 PID: 83 Comm: kworker/u16:3 Tainted: G W 6.6.34-o5 #1 [ 11.361954] Hardware name: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 [ 11.369953] Workqueue: events_unbound async_run_entry_fn
Krzysztof
On 23.07.2024 16:12, Krzysztof Olędzki wrote:
On 06.07.2024 at 18:42, Krzysztof Olędzki wrote:
On 02.07.2024 at 13:25, Heiner Kallweit wrote:
On 23.06.2024 20:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
<CUT>
Could you please test whether the attached two experimental patches fix the issue for you? They serialize client device instantiation per I2C adapter, and include the client device name in the check whether a bus address is busy.
Sadly, they crash the kernel.
I will get serial console attached there next week, so will be able to capture the full crash. For now, I was able to obtain a photo. I'm very sorry for the quality, just wanted to provide something for now.
Sorry it took me so long - my attempts to coordinate setting up serial console were not successful, so it had to wait for me to go there in person...
I have attached complete dmesg, summary:
[ 10.905953] rtmutex deadlock detected [ 10.909959] WARNING: CPU: 5 PID: 83 at kernel/locking/rtmutex.c:1642 __rt_mutex_slowlock_locked.constprop.0+0x10f/0x1a5 [ 10.920961] CPU: 5 PID: 83 Comm: kworker/u16:3 Not tainted 6.6.34-o5 #1 [ 10.929970] Hardware name: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 [ 10.938954] Workqueue: events_unbound async_run_entry_fn
[ 11.336954] BUG: scheduling while atomic: kworker/u16:3/83/0x00000002 [ 11.342953] Preemption disabled at: [ 11.342953] [<0000000000000000>] 0x0 [ 11.350953] CPU: 5 PID: 83 Comm: kworker/u16:3 Tainted: G W 6.6.34-o5 #1 [ 11.361954] Hardware name: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 [ 11.369953] Workqueue: events_unbound async_run_entry_fn
Thanks a lot for the comprehensive info. Reason for the deadlock is that calls to i2c_new_client_device() can be nested. So another solution approach is needed. I'd appreciate if you could test also the new version below.
Krzysztof
Heiner
diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c index b63f75e44..c1074d409 100644 --- a/drivers/i2c/i2c-core-base.c +++ b/drivers/i2c/i2c-core-base.c @@ -915,6 +915,29 @@ int i2c_dev_irq_from_resources(const struct resource *resources, return 0; }
+/** + * Serialize device instantiation in case it can be instantiated explicitly + * and by auto-detection + */ +static int i2c_test_and_set_addr_in_instantiation(struct i2c_adapter *adap, + const struct i2c_client *client) +{ + if (!(client->flags & I2C_CLIENT_TEN) && + !(client->flags & I2C_CLIENT_SLAVE) && + test_and_set_bit(client->addr, adap->addrs_in_instantiation)) + return -EBUSY; + + return 0; +} + +static void i2c_clear_addr_in_instantiation(struct i2c_adapter *adap, + const struct i2c_client *client) +{ + if (!(client->flags & I2C_CLIENT_TEN) && + !(client->flags & I2C_CLIENT_SLAVE)) + clear_bit(client->addr, adap->addrs_in_instantiation); +} + /** * i2c_new_client_device - instantiate an i2c device * @adap: the adapter managing the device @@ -962,6 +985,10 @@ i2c_new_client_device(struct i2c_adapter *adap, struct i2c_board_info const *inf goto out_err_silent; }
+ status = i2c_test_and_set_addr_in_instantiation(adap, client); + if (status) + goto out_err_silent; + /* Check for address business */ status = i2c_check_addr_busy(adap, i2c_encode_flags_to_addr(client)); if (status) @@ -993,6 +1020,8 @@ i2c_new_client_device(struct i2c_adapter *adap, struct i2c_board_info const *inf dev_dbg(&adap->dev, "client [%s] registered with bus id %s\n", client->name, dev_name(&client->dev));
+ i2c_clear_addr_in_instantiation(adap, client); + return client;
out_remove_swnode: @@ -1004,6 +1033,7 @@ i2c_new_client_device(struct i2c_adapter *adap, struct i2c_board_info const *inf dev_err(&adap->dev, "Failed to register i2c client %s at 0x%02x (%d)\n", client->name, client->addr, status); + i2c_clear_addr_in_instantiation(adap, client); out_err_silent: if (need_put) put_device(&client->dev); diff --git a/include/linux/i2c.h b/include/linux/i2c.h index 07e33bbc9..31486455f 100644 --- a/include/linux/i2c.h +++ b/include/linux/i2c.h @@ -761,6 +761,9 @@ struct i2c_adapter { struct regulator *bus_regulator;
struct dentry *debugfs; + + /* covers 7bit addresses only */ + DECLARE_BITMAP(addrs_in_instantiation, 1 << 7); }; #define to_i2c_adapter(d) container_of(d, struct i2c_adapter, dev)
On 03.08.2024 at 10:19, Heiner Kallweit wrote:
On 23.07.2024 16:12, Krzysztof Olędzki wrote:
On 06.07.2024 at 18:42, Krzysztof Olędzki wrote:
On 02.07.2024 at 13:25, Heiner Kallweit wrote:
On 23.06.2024 20:47, Krzysztof Olędzki wrote:
Hi,
After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l... - reverting the change fixes the problem.
<CUT>
Could you please test whether the attached two experimental patches fix the issue for you? They serialize client device instantiation per I2C adapter, and include the client device name in the check whether a bus address is busy.
Sadly, they crash the kernel.
I will get serial console attached there next week, so will be able to capture the full crash. For now, I was able to obtain a photo. I'm very sorry for the quality, just wanted to provide something for now.
Sorry it took me so long - my attempts to coordinate setting up serial console were not successful, so it had to wait for me to go there in person...
I have attached complete dmesg, summary:
[ 10.905953] rtmutex deadlock detected [ 10.909959] WARNING: CPU: 5 PID: 83 at kernel/locking/rtmutex.c:1642 __rt_mutex_slowlock_locked.constprop.0+0x10f/0x1a5 [ 10.920961] CPU: 5 PID: 83 Comm: kworker/u16:3 Not tainted 6.6.34-o5 #1 [ 10.929970] Hardware name: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 [ 10.938954] Workqueue: events_unbound async_run_entry_fn
[ 11.336954] BUG: scheduling while atomic: kworker/u16:3/83/0x00000002 [ 11.342953] Preemption disabled at: [ 11.342953] [<0000000000000000>] 0x0 [ 11.350953] CPU: 5 PID: 83 Comm: kworker/u16:3 Tainted: G W 6.6.34-o5 #1 [ 11.361954] Hardware name: Dell Inc. PowerEdge T110 II/0PM2CW, BIOS 2.10.0 05/24/2018 [ 11.369953] Workqueue: events_unbound async_run_entry_fn
Thanks a lot for the comprehensive info. Reason for the deadlock is that calls to i2c_new_client_device() can be nested. So another solution approach is needed. I'd appreciate if you could test also the new version below.
The patch did not apply cleanly for Linux-6.6, so I had to tweak it a little bit for the include/linux/i2c.h part, but it does seem to work. Everything gets detected and there are no warning / errors:
[ 8.311414] i2c i2c-12: 4/4 memory slots populated (from DMI) [ 8.314112] at24 12-0050: 256 byte spd EEPROM, read-only [ 8.314856] i2c i2c-12: Successfully instantiated SPD at 0x50 [ 8.317513] at24 12-0051: 256 byte spd EEPROM, read-only [ 8.318252] i2c i2c-12: Successfully instantiated SPD at 0x51 [ 8.320909] at24 12-0052: 256 byte spd EEPROM, read-only [ 8.322126] i2c i2c-12: Successfully instantiated SPD at 0x52 [ 8.325538] at24 12-0053: 256 byte spd EEPROM, read-only [ 8.326789] i2c i2c-12: Successfully instantiated SPD at 0x53
# sensors|grep -A2 jc42 jc42-i2c-12-19 Adapter: SMBus I801 adapter at 3000 temp1: +36.5°C (low = +0.0°C) -- jc42-i2c-12-1b Adapter: SMBus I801 adapter at 3000 temp1: +35.0°C (low = +0.0°C) -- jc42-i2c-12-1a Adapter: SMBus I801 adapter at 3000 temp1: +36.0°C (low = +0.0°C) -- jc42-i2c-12-18 Adapter: SMBus I801 adapter at 3000 temp1: +36.5°C (low = +0.0°C)
Feel free to add: Tested-by: Krzysztof Piotr Oledzki ole@ans.pl
Thanks, Krzysztof
linux-stable-mirror@lists.linaro.org