I'm currently developing on the OrangePi Neo-01, which ships with two similar touchpads using the Synaptics driver. On 6.7.10 those two devices get detected normally. On 6.8.1 it seems to error out.
I either get none, or at best only one of those two devices.
i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110 i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
what would be the best way to debug this?
On 18/03/2024 17:58, Philip Müller wrote:
I'm currently developing on the OrangePi Neo-01, which ships with two similar touchpads using the Synaptics driver. On 6.7.10 those two devices get detected normally. On 6.8.1 it seems to error out.
I either get none, or at best only one of those two devices.
i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110 i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
what would be the best way to debug this?
I found the regression in commit aa69d6974185e9f7a552ba982540a38e34f69690 HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling
When I use the commit before I can rmmod and modprobe in a batch script using a loop without erroring out to -110. Attached the testing script and dmesg log snippets
#!/bin/bash for ((n=0;n<5;n++)) do sudo rmmod i2c_hid_acpi sleep 1 sudo modprobe i2c_hid_acpi --force-vermagic sleep 2 done
I will try to use the code of 6.8.1 and revert that commit.
Hi Philip,
On 3/19/24 5:52 AM, Philip Müller wrote:
On 18/03/2024 17:58, Philip Müller wrote:
I'm currently developing on the OrangePi Neo-01, which ships with two similar touchpads using the Synaptics driver. On 6.7.10 those two devices get detected normally. On 6.8.1 it seems to error out.
I either get none, or at best only one of those two devices.
i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110 i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
what would be the best way to debug this?
I found the regression in commit aa69d6974185e9f7a552ba982540a38e34f69690 HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling
I just checked that patch and I don't see anyway how that can create this regression. I assume you did a git bisect ?
Did you try the last commit in the tree before that commit got added and verified that that one works where as building a kernel from commit aa69d6974185e9f itself does not work ?
When I use the commit before I can rmmod and modprobe in a batch script using a loop without erroring out to -110. Attached the testing script and dmesg log snippets
#!/bin/bash for ((n=0;n<5;n++)) do sudo rmmod i2c_hid_acpi sleep 1 sudo modprobe i2c_hid_acpi --force-vermagic sleep 2 done
Ok, so you did try the commit before and that did work. Are you sure that aa69d6974185e9f was not actually the last working commit ?
AFAICT aa69d6974185e9f makes no functional changes, except for actually propagating the error from i2c_hid_read_register() rather then hardcoding -EIO. But that should not matter...
Note that commit aa69d6974185e9f is part of a series and I would not be surprised if some other commit in that series is causing your problem, but aa69d6974185e9f itself seems rather harmless.
Regards,
Hans
On 27/03/2024 17:31, Hans de Goede wrote:
Hi Philip,
On 3/19/24 5:52 AM, Philip Müller wrote:
On 18/03/2024 17:58, Philip Müller wrote:
I'm currently developing on the OrangePi Neo-01, which ships with two similar touchpads using the Synaptics driver. On 6.7.10 those two devices get detected normally. On 6.8.1 it seems to error out.
I either get none, or at best only one of those two devices.
i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110 i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
what would be the best way to debug this?
I found the regression in commit aa69d6974185e9f7a552ba982540a38e34f69690 HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling
I just checked that patch and I don't see anyway how that can create this regression. I assume you did a git bisect ?
Did you try the last commit in the tree before that commit got added and verified that that one works where as building a kernel from commit aa69d6974185e9f itself does not work ?
No I didn't do a git bisect actually. I did it more manually like git bisect might do it. What I did was to compile just that module and load it against the 6.7.10 kernel. a9f68ffe1170ca4bc17ab29067d806a354a026e0 is the last commit you find in 6.7 series. All the other commits are part of the 6.8 series or later. So I looked at the commit titles and picked a9f68ffe1170ca4bc17ab29067d806a354a026e0 which I compiled the "out of tree" module. I unloaded the 6.7.10 module, copied the new compiled one over and did the rmod, modprobe batch script to see how the touchpads react.
Then I went to compile and tested af93a167eda90192563bce64c4bb989836afac1f which created the issue with -110 errors. So I went to aa69d6974185e9f7a552ba982540a38e34f69690 which also results in the error. So I went back to 96d3098db835d58649b73a5788898bd7672a319b which still works. Since now 6.9 development started, I may try 00aab7dcb2267f2aef59447602f34501efe1a07f to see if that fixes anything.
I can get the touchpads both working in 6.8 code when I do enough tries by a rmod/modprobe loop. Based on the logs you see, sometimes both are up, mostly one or none is up. having only two commits of 6.7 code in the module it works as 6.7 code. Only when "HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling" gets added I see the error and can reproduce it on my device as shown in the log.
Before we had
hid_err(hid, "reading report descriptor failed\n"); kfree(rdesc);
followed by the hardcoded return -EIO;
Now we have an redirect to the out function but don't do kfree(rdesc) if you hit the error anymore.
Also the last return 0; got removed. So maybe what ever error the device might have hit before might be reverted due to return 0 or not freeing rdesc when hitting the error might prevent it to load.
Anyhow, that patch is the first patch which changes the function in that series. Sure it might not be able to compile the whole kernel with it. I only tested the module based on the given code in an out of tree module style.
So I might test 6.9rc1 on the device and let you know if that kernel has the same issue still.
When I use the commit before I can rmmod and modprobe in a batch script using a loop without erroring out to -110. Attached the testing script and dmesg log snippets
#!/bin/bash for ((n=0;n<5;n++)) do sudo rmmod i2c_hid_acpi sleep 1 sudo modprobe i2c_hid_acpi --force-vermagic sleep 2 done
Ok, so you did try the commit before and that did work. Are you sure that aa69d6974185e9f was not actually the last working commit ?
For me 96d3098db835d58649b73a5788898bd7672a319b was the last working code, as it doesn't change functionality.
AFAICT aa69d6974185e9f makes no functional changes, except for actually propagating the error from i2c_hid_read_register() rather then hardcoding -EIO. But that should not matter...
See return 0 and kfree(rdesc) which is now missing.
Note that commit aa69d6974185e9f is part of a series and I would not be surprised if some other commit in that series is causing your problem, but aa69d6974185e9f itself seems rather harmless.
Well, it might also revealed an issue, which was ignored all the time and by erroring out it might stop the process to continue something. Cos maybe return 0 had skipped any recorded error before.
If you want me to test or try something else, let me know. I'm open for ideas.
Regards,
Hans
Hi Philip,
On 3/27/24 4:47 PM, Philip Müller wrote:
On 27/03/2024 17:31, Hans de Goede wrote:
Hi Philip,
On 3/19/24 5:52 AM, Philip Müller wrote:
On 18/03/2024 17:58, Philip Müller wrote:
I'm currently developing on the OrangePi Neo-01, which ships with two similar touchpads using the Synaptics driver. On 6.7.10 those two devices get detected normally. On 6.8.1 it seems to error out.
I either get none, or at best only one of those two devices.
i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110 i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
what would be the best way to debug this?
I found the regression in commit aa69d6974185e9f7a552ba982540a38e34f69690 HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling
I just checked that patch and I don't see anyway how that can create this regression. I assume you did a git bisect ?
Did you try the last commit in the tree before that commit got added and verified that that one works where as building a kernel from commit aa69d6974185e9f itself does not work ?
No I didn't do a git bisect actually. I did it more manually like git bisect might do it. What I did was to compile just that module and load it against the 6.7.10 kernel. a9f68ffe1170ca4bc17ab29067d806a354a026e0 is the last commit you find in 6.7 series. All the other commits are part of the 6.8 series or later. So I looked at the commit titles and picked a9f68ffe1170ca4bc17ab29067d806a354a026e0 which I compiled the "out of tree" module. I unloaded the 6.7.10 module, copied the new compiled one over and did the rmod, modprobe batch script to see how the touchpads react.
Then I went to compile and tested af93a167eda90192563bce64c4bb989836afac1f which created the issue with -110 errors. So I went to aa69d6974185e9f7a552ba982540a38e34f69690 which also results in the error. So I went back to 96d3098db835d58649b73a5788898bd7672a319b which still works. Since now 6.9 development started, I may try 00aab7dcb2267f2aef59447602f34501efe1a07f to see if that fixes anything.
I can get the touchpads both working in 6.8 code when I do enough tries by a rmod/modprobe loop. Based on the logs you see, sometimes both are up, mostly one or none is up. having only two commits of 6.7 code in the module it works as 6.7 code. Only when "HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling" gets added I see the error and can reproduce it on my device as shown in the log.
Before we had
hid_err(hid, "reading report descriptor failed\n"); kfree(rdesc);
followed by the hardcoded return -EIO;
Now we have an redirect to the out function but don't do kfree(rdesc) if you hit the error anymore.
Also the last return 0; got removed. So maybe what ever error the device might have hit before might be reverted due to return 0 or not freeing rdesc when hitting the error might prevent it to load.
Anyhow, that patch is the first patch which changes the function in that series. Sure it might not be able to compile the whole kernel with it. I only tested the module based on the given code in an out of tree module style.
So I might test 6.9rc1 on the device and let you know if that kernel has the same issue still.
When I use the commit before I can rmmod and modprobe in a batch script using a loop without erroring out to -110. Attached the testing script and dmesg log snippets
#!/bin/bash for ((n=0;n<5;n++)) do sudo rmmod i2c_hid_acpi sleep 1 sudo modprobe i2c_hid_acpi --force-vermagic sleep 2 done
Ok, so you did try the commit before and that did work. Are you sure that aa69d6974185e9f was not actually the last working commit ?
For me 96d3098db835d58649b73a5788898bd7672a319b was the last working code, as it doesn't change functionality.
AFAICT aa69d6974185e9f makes no functional changes, except for actually propagating the error from i2c_hid_read_register() rather then hardcoding -EIO. But that should not matter...
See return 0 and kfree(rdesc) which is now missing.
The new code is (on error) "goto out;" and then out: looks like this:
out: if (!use_override) kfree(rdesc);
return ret;
Since the "goto out;" is an else branch of "if (use_override) { } else {}" we know use_override is false here, so !use_override is true and the kfree() still happens. Also even if that were to no longer happen that would just be a small memleak and would not explain the probe failures you are seeing.
As for "return 0" vs "return ret", the code at the end of i2c_hid_parse() before commit aa69d6974185e9f looked like this:
if (ret) { dbg_hid("parsing report descriptor failed\n"); return ret; }
return 0;
So this would also always return ret because if ret is non 0 we always hit if (ret) return ret; and if ret is 0 then return 0 and return ret are functionally the same.
So since the functionality at the end of the function did not change, it still always ends up returning ret, but now does it in a less circumspect way.
Which leaves just the change in the:
hid_err(hid, "reading report descriptor failed\n");
code path which used to return -EIO and now does goto out which ends up returning ret which is the actual return value of i2c_hid_read_register() instead of hardcoding -EIO.
I checked and nothing in the HID core treats -EIO in a special way.
So I still do not see how commit aa69d6974185e9f can break anything and I really wonder if you did not make a mistake during testing.
Can you try building a clean broken kernel (the entire kernel) and then in reverse order (of them being committed) revert my recent i2c-hid changes 1 by one and after each revert check if things start working again ?
And then see which is the first revert after which things start working again, that one is the culprit and I expect you to find a different commit then aa69d6974185e9f.
Unless I'm missing / overlooking something in commit aa69d6974185e9f...
Note I'm going offline for a long weekend and I won't be replying to emails until next week Tuesday.
Regards,
Hans
On 28/03/2024 00:45, Hans de Goede wrote:
Hi Philip,
On 3/27/24 4:47 PM, Philip Müller wrote:
On 27/03/2024 17:31, Hans de Goede wrote:
Hi Philip,
On 3/19/24 5:52 AM, Philip Müller wrote:
On 18/03/2024 17:58, Philip Müller wrote:
I'm currently developing on the OrangePi Neo-01, which ships with two similar touchpads using the Synaptics driver. On 6.7.10 those two devices get detected normally. On 6.8.1 it seems to error out.
I either get none, or at best only one of those two devices.
i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110 i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
what would be the best way to debug this?
I found the regression in commit aa69d6974185e9f7a552ba982540a38e34f69690 HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling
I just checked that patch and I don't see anyway how that can create this regression. I assume you did a git bisect ?
Did you try the last commit in the tree before that commit got added and verified that that one works where as building a kernel from commit aa69d6974185e9f itself does not work ?
No I didn't do a git bisect actually. I did it more manually like git bisect might do it. What I did was to compile just that module and load it against the 6.7.10 kernel. a9f68ffe1170ca4bc17ab29067d806a354a026e0 is the last commit you find in 6.7 series. All the other commits are part of the 6.8 series or later. So I looked at the commit titles and picked a9f68ffe1170ca4bc17ab29067d806a354a026e0 which I compiled the "out of tree" module. I unloaded the 6.7.10 module, copied the new compiled one over and did the rmod, modprobe batch script to see how the touchpads react.
Then I went to compile and tested af93a167eda90192563bce64c4bb989836afac1f which created the issue with -110 errors. So I went to aa69d6974185e9f7a552ba982540a38e34f69690 which also results in the error. So I went back to 96d3098db835d58649b73a5788898bd7672a319b which still works. Since now 6.9 development started, I may try 00aab7dcb2267f2aef59447602f34501efe1a07f to see if that fixes anything.
I can get the touchpads both working in 6.8 code when I do enough tries by a rmod/modprobe loop. Based on the logs you see, sometimes both are up, mostly one or none is up. having only two commits of 6.7 code in the module it works as 6.7 code. Only when "HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling" gets added I see the error and can reproduce it on my device as shown in the log.
Before we had
hid_err(hid, "reading report descriptor failed\n"); kfree(rdesc);
followed by the hardcoded return -EIO;
Now we have an redirect to the out function but don't do kfree(rdesc) if you hit the error anymore.
Also the last return 0; got removed. So maybe what ever error the device might have hit before might be reverted due to return 0 or not freeing rdesc when hitting the error might prevent it to load.
Anyhow, that patch is the first patch which changes the function in that series. Sure it might not be able to compile the whole kernel with it. I only tested the module based on the given code in an out of tree module style.
So I might test 6.9rc1 on the device and let you know if that kernel has the same issue still.
When I use the commit before I can rmmod and modprobe in a batch script using a loop without erroring out to -110. Attached the testing script and dmesg log snippets
#!/bin/bash for ((n=0;n<5;n++)) do sudo rmmod i2c_hid_acpi sleep 1 sudo modprobe i2c_hid_acpi --force-vermagic sleep 2 done
Ok, so you did try the commit before and that did work. Are you sure that aa69d6974185e9f was not actually the last working commit ?
For me 96d3098db835d58649b73a5788898bd7672a319b was the last working code, as it doesn't change functionality.
AFAICT aa69d6974185e9f makes no functional changes, except for actually propagating the error from i2c_hid_read_register() rather then hardcoding -EIO. But that should not matter...
See return 0 and kfree(rdesc) which is now missing.
The new code is (on error) "goto out;" and then out: looks like this:
out: if (!use_override) kfree(rdesc);
return ret;
Since the "goto out;" is an else branch of "if (use_override) { } else {}" we know use_override is false here, so !use_override is true and the kfree() still happens. Also even if that were to no longer happen that would just be a small memleak and would not explain the probe failures you are seeing.
As for "return 0" vs "return ret", the code at the end of i2c_hid_parse() before commit aa69d6974185e9f looked like this:
if (ret) { dbg_hid("parsing report descriptor failed\n"); return ret; } return 0;
So this would also always return ret because if ret is non 0 we always hit if (ret) return ret; and if ret is 0 then return 0 and return ret are functionally the same.
So since the functionality at the end of the function did not change, it still always ends up returning ret, but now does it in a less circumspect way.
Which leaves just the change in the:
hid_err(hid, "reading report descriptor failed\n");
code path which used to return -EIO and now does goto out which ends up returning ret which is the actual return value of i2c_hid_read_register() instead of hardcoding -EIO.
I checked and nothing in the HID core treats -EIO in a special way.
So I still do not see how commit aa69d6974185e9f can break anything and I really wonder if you did not make a mistake during testing.
Can you try building a clean broken kernel (the entire kernel) and then in reverse order (of them being committed) revert my recent i2c-hid changes 1 by one and after each revert check if things start working again ?
And then see which is the first revert after which things start working again, that one is the culprit and I expect you to find a different commit then aa69d6974185e9f.
Unless I'm missing / overlooking something in commit aa69d6974185e9f...
Note I'm going offline for a long weekend and I won't be replying to emails until next week Tuesday.
Regards,
Hans
I managed to do a brief test with Kernel 6.9-rc1. There the issue is still given. So I might add the changes of i2c-hid to the working 6.7.x kernels and post which commits will break that module.
On 29.03.24 04:40, Philip Müller wrote:
On 28/03/2024 00:45, Hans de Goede wrote:
On 3/27/24 4:47 PM, Philip Müller wrote:
On 27/03/2024 17:31, Hans de Goede wrote:
On 3/19/24 5:52 AM, Philip Müller wrote:
On 18/03/2024 17:58, Philip Müller wrote:
I'm currently developing on the OrangePi Neo-01, which ships with two similar touchpads using the Synaptics driver. On 6.7.10 those two devices get detected normally. On 6.8.1 it seems to error out.
I either get none, or at best only one of those two devices.
i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110 i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
what would be the best way to debug this?
I found the regression in commit aa69d6974185e9f7a552ba982540a38e34f69690 HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling
TWIMC: Kenny Levinsen (now CCed) posted a patch to fix problems introduced by af93a167eda9: https://lore.kernel.org/all/20240331182440.14477-1-kl@kl.wtf/
Looks a bit (but I might be wrong there!) like he ran into similar problems as Philip, but was not aware of this thread.
Ciao, Thorsten
I just checked that patch and I don't see anyway how that can create this regression. I assume you did a git bisect ?
Did you try the last commit in the tree before that commit got added and verified that that one works where as building a kernel from commit aa69d6974185e9f itself does not work ?
No I didn't do a git bisect actually. I did it more manually like git bisect might do it. What I did was to compile just that module and load it against the 6.7.10 kernel. a9f68ffe1170ca4bc17ab29067d806a354a026e0 is the last commit you find in 6.7 series. All the other commits are part of the 6.8 series or later. So I looked at the commit titles and picked a9f68ffe1170ca4bc17ab29067d806a354a026e0 which I compiled the "out of tree" module. I unloaded the 6.7.10 module, copied the new compiled one over and did the rmod, modprobe batch script to see how the touchpads react.
Then I went to compile and tested af93a167eda90192563bce64c4bb989836afac1f which created the issue with -110 errors. So I went to aa69d6974185e9f7a552ba982540a38e34f69690 which also results in the error. So I went back to 96d3098db835d58649b73a5788898bd7672a319b which still works. Since now 6.9 development started, I may try 00aab7dcb2267f2aef59447602f34501efe1a07f to see if that fixes anything.
I can get the touchpads both working in 6.8 code when I do enough tries by a rmod/modprobe loop. Based on the logs you see, sometimes both are up, mostly one or none is up. having only two commits of 6.7 code in the module it works as 6.7 code. Only when "HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling" gets added I see the error and can reproduce it on my device as shown in the log.
Before we had
hid_err(hid, "reading report descriptor failed\n"); kfree(rdesc);
followed by the hardcoded return -EIO;
Now we have an redirect to the out function but don't do kfree(rdesc) if you hit the error anymore.
Also the last return 0; got removed. So maybe what ever error the device might have hit before might be reverted due to return 0 or not freeing rdesc when hitting the error might prevent it to load.
Anyhow, that patch is the first patch which changes the function in that series. Sure it might not be able to compile the whole kernel with it. I only tested the module based on the given code in an out of tree module style.
So I might test 6.9rc1 on the device and let you know if that kernel has the same issue still.
When I use the commit before I can rmmod and modprobe in a batch script using a loop without erroring out to -110. Attached the testing script and dmesg log snippets
#!/bin/bash for ((n=0;n<5;n++)) do sudo rmmod i2c_hid_acpi sleep 1 sudo modprobe i2c_hid_acpi --force-vermagic sleep 2 done
Ok, so you did try the commit before and that did work. Are you sure that aa69d6974185e9f was not actually the last working commit ?
For me 96d3098db835d58649b73a5788898bd7672a319b was the last working code, as it doesn't change functionality.
AFAICT aa69d6974185e9f makes no functional changes, except for actually propagating the error from i2c_hid_read_register() rather then hardcoding -EIO. But that should not matter...
See return 0 and kfree(rdesc) which is now missing.
The new code is (on error) "goto out;" and then out: looks like this:
out: if (!use_override) kfree(rdesc);
return ret;
Since the "goto out;" is an else branch of "if (use_override) { } else {}" we know use_override is false here, so !use_override is true and the kfree() still happens. Also even if that were to no longer happen that would just be a small memleak and would not explain the probe failures you are seeing.
As for "return 0" vs "return ret", the code at the end of i2c_hid_parse() before commit aa69d6974185e9f looked like this:
if (ret) { dbg_hid("parsing report descriptor failed\n"); return ret; }
return 0;
So this would also always return ret because if ret is non 0 we always hit if (ret) return ret; and if ret is 0 then return 0 and return ret are functionally the same.
So since the functionality at the end of the function did not change, it still always ends up returning ret, but now does it in a less circumspect way.
Which leaves just the change in the:
hid_err(hid, "reading report descriptor failed\n");
code path which used to return -EIO and now does goto out which ends up returning ret which is the actual return value of i2c_hid_read_register() instead of hardcoding -EIO.
I checked and nothing in the HID core treats -EIO in a special way.
So I still do not see how commit aa69d6974185e9f can break anything and I really wonder if you did not make a mistake during testing.
Can you try building a clean broken kernel (the entire kernel) and then in reverse order (of them being committed) revert my recent i2c-hid changes 1 by one and after each revert check if things start working again ?
And then see which is the first revert after which things start working again, that one is the culprit and I expect you to find a different commit then aa69d6974185e9f.
Unless I'm missing / overlooking something in commit aa69d6974185e9f...
Note I'm going offline for a long weekend and I won't be replying to emails until next week Tuesday.
Regards,
Hans
I managed to do a brief test with Kernel 6.9-rc1. There the issue is still given. So I might add the changes of i2c-hid to the working 6.7.x kernels and post which commits will break that module.
On 01/04/2024 13:20, Linux regression tracking (Thorsten Leemhuis) wrote:
TWIMC: Kenny Levinsen (now CCed) posted a patch to fix problems introduced by af93a167eda9: https://lore.kernel.org/all/20240331182440.14477-1-kl@kl.wtf/
Looks a bit (but I might be wrong there!) like he ran into similar problems as Philip, but was not aware of this thread.
Hi Thorsten,
thx for the pointer. I'm applying it right now and let you know if that changes the situation on my device. Might be the right direction.
On 01/04/2024 13:43, Philip Müller wrote:
On 01/04/2024 13:20, Linux regression tracking (Thorsten Leemhuis) wrote:
TWIMC: Kenny Levinsen (now CCed) posted a patch to fix problems introduced by af93a167eda9: https://lore.kernel.org/all/20240331182440.14477-1-kl@kl.wtf/
Looks a bit (but I might be wrong there!) like he ran into similar problems as Philip, but was not aware of this thread.
Hi Thorsten,
thx for the pointer. I'm applying it right now and let you know if that changes the situation on my device. Might be the right direction.
Seems I can confirm that the patch fixes the regression for me. I did batch test with reloading the module and no -110 errors anymore. We should add this patch also to 6.8+ kernels and backport if patches from that series should go to lower LTS kernels.
On 01.04.24 12:09, Philip Müller wrote:
On 01/04/2024 13:43, Philip Müller wrote:
On 01/04/2024 13:20, Linux regression tracking (Thorsten Leemhuis) wrote:
TWIMC: Kenny Levinsen (now CCed) posted a patch to fix problems introduced by af93a167eda9: https://lore.kernel.org/all/20240331182440.14477-1-kl@kl.wtf/
Looks a bit (but I might be wrong there!) like he ran into similar problems as Philip, but was not aware of this thread.
thx for the pointer. I'm applying it right now and let you know if that changes the situation on my device. Might be the right direction.
Seems I can confirm that the patch fixes the regression for me.
Thx for testing!
I did batch test with reloading the module and no -110 errors anymore. We should add this patch also to 6.8+ kernels and backport if patches from that series should go to lower LTS kernels.
Well, ideally Kenny would add a stable tag to ensure backporting to affected series, but that's up to Kenny.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) -- Everything you wanna know about Linux kernel regression tracking: https://linux-regtracking.leemhuis.info/about/#tldr If I did something stupid, please tell me, as explained on that page.
#regzbot fix: HID: i2c-hid: Revert to await reset ACK before reading report descriptor #regzbot monitor: https://lore.kernel.org/all/20240331182440.14477-1-kl@kl.wtf/
Hi Philiph,
Someone else has hit the same problem and has correctly pinpointed it to another commit (in the same series) and submitted a fix upstream:
https://lore.kernel.org/linux-input/20240331182440.14477-1-kl@kl.wtf/T/#u
I'll be reviewing this patch shortly.
Regards,
Hans
On 3/29/24 4:40 AM, Philip Müller wrote:
On 28/03/2024 00:45, Hans de Goede wrote:
Hi Philip,
On 3/27/24 4:47 PM, Philip Müller wrote:
On 27/03/2024 17:31, Hans de Goede wrote:
Hi Philip,
On 3/19/24 5:52 AM, Philip Müller wrote:
On 18/03/2024 17:58, Philip Müller wrote:
I'm currently developing on the OrangePi Neo-01, which ships with two similar touchpads using the Synaptics driver. On 6.7.10 those two devices get detected normally. On 6.8.1 it seems to error out.
I either get none, or at best only one of those two devices.
i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110 i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
what would be the best way to debug this?
I found the regression in commit aa69d6974185e9f7a552ba982540a38e34f69690 HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling
I just checked that patch and I don't see anyway how that can create this regression. I assume you did a git bisect ?
Did you try the last commit in the tree before that commit got added and verified that that one works where as building a kernel from commit aa69d6974185e9f itself does not work ?
No I didn't do a git bisect actually. I did it more manually like git bisect might do it. What I did was to compile just that module and load it against the 6.7.10 kernel. a9f68ffe1170ca4bc17ab29067d806a354a026e0 is the last commit you find in 6.7 series. All the other commits are part of the 6.8 series or later. So I looked at the commit titles and picked a9f68ffe1170ca4bc17ab29067d806a354a026e0 which I compiled the "out of tree" module. I unloaded the 6.7.10 module, copied the new compiled one over and did the rmod, modprobe batch script to see how the touchpads react.
Then I went to compile and tested af93a167eda90192563bce64c4bb989836afac1f which created the issue with -110 errors. So I went to aa69d6974185e9f7a552ba982540a38e34f69690 which also results in the error. So I went back to 96d3098db835d58649b73a5788898bd7672a319b which still works. Since now 6.9 development started, I may try 00aab7dcb2267f2aef59447602f34501efe1a07f to see if that fixes anything.
I can get the touchpads both working in 6.8 code when I do enough tries by a rmod/modprobe loop. Based on the logs you see, sometimes both are up, mostly one or none is up. having only two commits of 6.7 code in the module it works as 6.7 code. Only when "HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling" gets added I see the error and can reproduce it on my device as shown in the log.
Before we had
hid_err(hid, "reading report descriptor failed\n"); kfree(rdesc);
followed by the hardcoded return -EIO;
Now we have an redirect to the out function but don't do kfree(rdesc) if you hit the error anymore.
Also the last return 0; got removed. So maybe what ever error the device might have hit before might be reverted due to return 0 or not freeing rdesc when hitting the error might prevent it to load.
Anyhow, that patch is the first patch which changes the function in that series. Sure it might not be able to compile the whole kernel with it. I only tested the module based on the given code in an out of tree module style.
So I might test 6.9rc1 on the device and let you know if that kernel has the same issue still.
When I use the commit before I can rmmod and modprobe in a batch script using a loop without erroring out to -110. Attached the testing script and dmesg log snippets
#!/bin/bash for ((n=0;n<5;n++)) do sudo rmmod i2c_hid_acpi sleep 1 sudo modprobe i2c_hid_acpi --force-vermagic sleep 2 done
Ok, so you did try the commit before and that did work. Are you sure that aa69d6974185e9f was not actually the last working commit ?
For me 96d3098db835d58649b73a5788898bd7672a319b was the last working code, as it doesn't change functionality.
AFAICT aa69d6974185e9f makes no functional changes, except for actually propagating the error from i2c_hid_read_register() rather then hardcoding -EIO. But that should not matter...
See return 0 and kfree(rdesc) which is now missing.
The new code is (on error) "goto out;" and then out: looks like this:
out: if (!use_override) kfree(rdesc);
return ret;
Since the "goto out;" is an else branch of "if (use_override) { } else {}" we know use_override is false here, so !use_override is true and the kfree() still happens. Also even if that were to no longer happen that would just be a small memleak and would not explain the probe failures you are seeing.
As for "return 0" vs "return ret", the code at the end of i2c_hid_parse() before commit aa69d6974185e9f looked like this:
if (ret) { dbg_hid("parsing report descriptor failed\n"); return ret; }
return 0;
So this would also always return ret because if ret is non 0 we always hit if (ret) return ret; and if ret is 0 then return 0 and return ret are functionally the same.
So since the functionality at the end of the function did not change, it still always ends up returning ret, but now does it in a less circumspect way.
Which leaves just the change in the:
hid_err(hid, "reading report descriptor failed\n");
code path which used to return -EIO and now does goto out which ends up returning ret which is the actual return value of i2c_hid_read_register() instead of hardcoding -EIO.
I checked and nothing in the HID core treats -EIO in a special way.
So I still do not see how commit aa69d6974185e9f can break anything and I really wonder if you did not make a mistake during testing.
Can you try building a clean broken kernel (the entire kernel) and then in reverse order (of them being committed) revert my recent i2c-hid changes 1 by one and after each revert check if things start working again ?
And then see which is the first revert after which things start working again, that one is the culprit and I expect you to find a different commit then aa69d6974185e9f.
Unless I'm missing / overlooking something in commit aa69d6974185e9f...
Note I'm going offline for a long weekend and I won't be replying to emails until next week Tuesday.
Regards,
Hans
I managed to do a brief test with Kernel 6.9-rc1. There the issue is still given. So I might add the changes of i2c-hid to the working 6.7.x kernels and post which commits will break that module.
On 02/04/2024 15:52, Hans de Goede wrote:
Hi Philip,
Someone else has hit the same problem and has correctly pinpointed it to another commit (in the same series) and submitted a fix upstream:
https://lore.kernel.org/linux-input/20240331182440.14477-1-kl@kl.wtf/T/#u
I'll be reviewing this patch shortly.
Regards,
Hans
Hi Hans,
I already tested that patch and it works.
linux-stable-mirror@lists.linaro.org