On Thu, Aug 05, 2021 at 06:42:54PM +0200, Willy Tarreau wrote:
Hi Guenter,
On Thu, Aug 05, 2021 at 09:11:02AM -0700, Guenter Roeck wrote:
Hi folks,
we have (at least) two severe regressions in stable releases right now.
[SHAs are from linux-5.10.y]
2435dcfd16ac spi: mediatek: fix fifo rx mode
	Breaks SPI access on all Mediatek devices for small transactions
	(including all Mediatek based Chromebooks since they use small SPI
	transactions for EC communication)

60789afc02f5 Bluetooth: Shutdown controller after workqueues are flushed or cancelled
	Breaks Bluetooth on various devices (Mediatek and possibly others)
	Discussion: https://lkml.org/lkml/2021/7/28/569
Unfortunately, it appears that all our testing doesn't cover SPI and Bluetooth.
I understand that upstream is just as broken until fixes are applied there. Still, it shows that our test coverage is far from where it needs to be, and/or that we may be too aggressive with backporting patches to stable releases.
If you have an idea how to improve the situation, please let me know.
The first one is really interesting. The author did everything right by documenting which commit this patch fixed, and that commit was indeed present in the stable branches. Given that the change is probably only understood by the driver's maintainer, it's very likely that he did that in good faith after some testing on real hardware. So there's little chance that any extra form of automated testing would catch this if it worked in at least one place.
It looks like a typical "works for me" regression. The best thing that could possibly be done to limit such occurrences would be to wait "long enough" before backporting them, in the hope of catching breakage reports before the backport, but here there were already 3 weeks between the patch being submitted and it being backported.
No. The patch is wrong. It just _looks_ correct at first glance. It claims to fix something that wasn't broken. FIFO rx mode was working just fine, handled in the receive interrupt as one would expect. The patch copies data from the rx fifo before the transfer is even started. I do not think it was tested on real hardware, or at least fifo receive transfer was not tested.
The patch _does_ fix a problem on the transmit side, but the patch subject doesn't mention that.
Guenter
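
For illustration, the ordering problem described above (copying from the rx fifo before the transfer has started) can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: it is not the spi-mt65xx driver code, and every name in it (mock_spi, mock_start_transfer, transfer_read_before_start, and so on) is hypothetical. The "hardware" is a plain array that is only filled once the transfer is started.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIFO_SIZE 32

/* Hypothetical stand-in for a controller with a small RX FIFO. */
struct mock_spi {
	uint8_t rx_fifo[FIFO_SIZE];	/* what the CPU can read back */
	uint8_t wire[FIFO_SIZE];	/* bytes the peripheral will send */
	size_t len;			/* transfer length, <= FIFO_SIZE */
};

static void mock_init(struct mock_spi *spi, const uint8_t *wire, size_t len)
{
	assert(len <= FIFO_SIZE);
	memset(spi, 0, sizeof(*spi));	/* FIFO starts out empty (zeros) */
	memcpy(spi->wire, wire, len);
	spi->len = len;
}

/* Starting the transfer is what clocks data into the RX FIFO. */
static void mock_start_transfer(struct mock_spi *spi)
{
	memcpy(spi->rx_fifo, spi->wire, spi->len);
}

static void mock_read_fifo(const struct mock_spi *spi, uint8_t *buf)
{
	memcpy(buf, spi->rx_fifo, spi->len);
}

/* Ordering reported as broken: the FIFO is drained before the start. */
static void transfer_read_before_start(struct mock_spi *spi, uint8_t *buf)
{
	mock_read_fifo(spi, buf);	/* nothing has been received yet */
	mock_start_transfer(spi);
}

/* Ordering reported as working: start first, read on completion. */
static void transfer_read_in_interrupt(struct mock_spi *spi, uint8_t *buf)
{
	mock_start_transfer(spi);
	mock_read_fifo(spi, buf);	/* stands in for the receive interrupt */
}
```

In this model, transfer_read_before_start() hands the caller whatever was in the FIFO before the transfer (zeros here), while transfer_read_in_interrupt() returns the bytes that were actually sent. A test that only exercises the transmit side would not notice the difference, which is consistent with the "works for me" outcome discussed above.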