On 20/02/2024 at 12:35, Thorsten Leemhuis wrote:
[CCing the regression list, as it should be in the loop for regressions: https://docs.kernel.org/admin-guide/reporting-regressions.html]
Hi, Thorsten here, the Linux kernel's regression tracker.
Hi, thanks for replying (even if I find your tone a bit harsh, but I don't blame you - and since English is not my native language, maybe I'm mistaken).
[1] https://github.com/torvalds/linux/commit/46a0a2c96f0f47628190f122c2e3d879e59...
[2] https://github.com/torvalds/linux/commit/2f2bd7cbd1d1548137b351040dc4e037d18...
[3] https://github.com/torvalds/linux/commit/43527a0094c10dfbf0d5a2e7979395a38de...
The regression is that a middle click is performed when releasing the middle button after wheel emulation.
How did you identify these three commits? Or do you just suspect that it's one of them?
No, I didn't "just suspect" that it was one of them. I may not be a kernel developer but I'm an experienced sysadmin (25+ years). So please stop taking users for idiots.
First, I compared the three machines I used which have a keyboard with a TrackPoint: my desktop at home (external "Lenovo ThinkPad Compact Keyboard with TrackPoint" (not II, not Bluetooth), Debian unstable (I'm a DM)), my desktop at work (same keyboard, Debian Stable) and my personal laptop (ThinkPad X270, internal keyboard, Debian Stable but with backports).
The machine at work had a 5.10 kernel at the time, and the other ones had a 6.6, but only the machines with an external keyboard exhibited the spurious middle-clicks. So I compared the loaded HID drivers, and noticed that both of them had hid_lenovo loaded, whereas the laptop did not.
Confident that I had probably pinpointed the faulty driver, I simply looked at the file history on GitHub, and saw that those three commits were dated from after the time when the bug appeared; moreover, the comments did mention stuff related to wheel emulation and spurious middle-clicks.
So, no, I didn't "just suspect" that they were responsible, and I hope you'll admit that my method was sound and that my conclusion is a pretty strong (not to say "almost certain") probability.
And did you try to check which of the three is the actual culprit? Either by reverting them on top of master or by checking the parent for each of the commits (git show '2f2bd7cbd1d^' shows the parent for 2f2bd7cbd1d).
I admit I didn't. I haven't compiled my own kernels for ages. I used to do it in the past, but I came to trust Debian's kernels and rely on the maintainers' work. But read below.
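For reference, the parent check suggested above could be sketched as follows. This is only an outline, assuming it is run inside a clone of the mainline tree; the hash prefixes are taken from links [1] to [3], and show_parents is a name made up for this sketch. Outside such a clone it merely prints the command it would have run.

```shell
#!/bin/sh
# Sketch only. Assumption: run inside a clone of the mainline kernel tree.
# The hash prefixes are those of the commits linked as [1], [2] and [3].
SUSPECTS="46a0a2c96f0f 2f2bd7cbd1d1 43527a0094c1"

# show_parents is a hypothetical helper name, not a git command.
show_parents() {
    for c in $SUSPECTS; do
        if git rev-parse --verify --quiet "${c}^{commit}" >/dev/null 2>&1; then
            # "<commit>^" names the first parent: the tree just before the change
            git show --no-patch --oneline "${c}^"
        else
            echo "commit $c not found here; would run: git show '${c}^'"
        fi
    done
}

show_parents
```

Each parent printed this way is a tree that would still have to be built and tested to tell which of the three changes matters.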
On Debian Stable, the last working kernel was 5.10.127, the regression appeared in 5.10.136 (I read all changelogs on kernel.org between those two releases but couldn't find anything about hid-lenovo, so I can't tell exactly in which release the regression appeared, Debian upgraded directly from .127 to .136).
Why not bisect between .127 and .136 then?
I have heard of that term before (and I understand the mathematical meaning of it), but I have never done it with a Git tree. I read the guide you mentioned below, but it seems much too complicated and too long to me just to verify whether those three commits are indeed the cause of the regression (which I'm almost sure of, as stated above).
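For reference, a bisection over that range might be sketched as follows. This is an outline under assumptions, not something I ran: it presumes a clone of the linux-stable tree carrying the upstream tags v5.10.127 and v5.10.136, and the names run, mark, bisect_steps and DRY_RUN are made up for this sketch. By default it only prints the git commands. The small helper estimates how many build-and-test rounds a range of N commits needs, since each round halves the range.

```shell
#!/bin/sh
# Sketch only. Assumptions: a clone of the linux-stable tree with the tags
# v5.10.127 (last known good) and v5.10.136 (first known bad).
# DRY_RUN, run, mark and bisect_steps are names invented for this sketch.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, execute it otherwise
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

start_bisect() {
    # Syntax: git bisect start <bad> <good>
    run git bisect start v5.10.136 v5.10.127
}

mark() {
    # After building and testing the tree git checked out: mark good or bad
    run git bisect "$1"
}

bisect_steps() {
    # Rough number of rounds for a range of $1 commits: ceil(log2(N))
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

start_bisect
echo "rounds for a hypothetical range of 1000 commits: $(bisect_steps 1000)"
```

The 1000 is purely illustrative; I did not count the commits between the two releases.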
So in the meantime, I decided to follow my hunch and recompile only the hid_lenovo module (following the guide at [6], updating it slightly by manually removing kernel signing options in .config, since I obviously don't have Debian's signing keys, and replacing "make SUBDIRS=drivers/..." with "make M=..." as suggested by make), after un-applying those three patches in reverse order.
[6] https://askubuntu.com/a/338403/387067
The HID modules built successfully, and after copying my modified hid-lenovo.ko to /usr/lib/modules/6.6.15-amd64/updates/ and running 'depmod -a', the module loaded fine with Debian's kernel (I don't use Secure Boot on this machine).
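The steps above can be sketched roughly as follows. This is an outline under assumptions rather than the exact commands I typed: the patch file names (3.patch, 2.patch, 1.patch, newest first) are hypothetical, the source tree is assumed to match the running kernel and to be already configured for module builds, and run and DRY_RUN are names made up for this sketch. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only, to be run from the top of a kernel source tree that matches
# the running kernel. Assumptions: the three patches were saved under the
# hypothetical names 3.patch, 2.patch, 1.patch (newest first).
# DRY_RUN and run are names invented for this sketch.
DRY_RUN="${DRY_RUN:-1}"
KVER="6.6.15-amd64"
DEST="/usr/lib/modules/$KVER/updates"

run() {
    # Print the command in dry-run mode, execute it otherwise
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

rebuild_hid_lenovo() {
    # Un-apply the patches in reverse order (-R reverses a patch)
    for p in 3.patch 2.patch 1.patch; do
        run patch -R -p1 -i "$p"
    done
    # Build only the modules under drivers/hid
    run make M=drivers/hid modules
    # Install the rebuilt module where it overrides the packaged one
    run sudo mkdir -p "$DEST"
    run sudo cp drivers/hid/hid-lenovo.ko "$DEST/"
    run sudo depmod -a
}

rebuild_hid_lenovo
```

Running it with DRY_RUN=0 would execute the commands instead of printing them.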
I'll let a few days pass (remember, the bug doesn't happen immediately but only after a varying amount of time) and I'll report here if the spurious middle-clicks happened again or not.
Notes:
1/ Thank you for (indirectly) giving me this idea. Maybe this relatively simple procedure should be made available somewhere on Debian's wiki (instead of an outdated, but still useful, answer on AskUbuntu).
2/ Please note that I did it only for the unstable kernel; unfortunately, I can't do the same for the stable kernel, since I don't have access to my machine at work anymore (my freelance contract ended one week ago) and I don't have any other machine at home exhibiting this bug. So I won't be able to test it on a stable kernel.
I reported it in Debian [4], and apparently I'm not the only person suffering from it [5].
[4] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1058758#32
[5] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1058758#42
I would understand such bugs ending up in a development kernel like the ones provided by Debian Unstable, but not in stable kernels like the ones provided by Debian Stable.
A bug report like yours can do the trick sometimes, as it might be enough to ring a bell for one of the developers. But given that nobody replied yet it looks like that is not the case. Then you most likely will need to perform a bisection to identify the exact commit that broke things.
Nobody amongst the developers, yes, I'll give you that. But the comment I linked from the Debian BTS, plus another bug report I found on the Input mailing list [7], show that I'm not the only user complaining about the recent regressions.
[7] https://lore.kernel.org/linux-input/CACSVgagaEHO2zoYQ8zDBrMT9OvT8R5B_h3dxfZu...
Regards,