On Thu, 21 Dec 2023 10:56:39 -0500 Hugo Villeneuve <hugo@hugovil.com> wrote:
On Wed, 20 Dec 2023 17:40:42 +0200 Andy Shevchenko <andriy.shevchenko@intel.com> wrote:
On Tue, Dec 19, 2023 at 12:18:46PM -0500, Hugo Villeneuve wrote:
From: Hugo Villeneuve <hvilleneuve@dimonoff.com>
If an error occurs during probing, the sc16is7xx_lines bitfield may be left in a state that doesn't reflect the actual line allocation.
For example, in a system with two SC16 devices, if an error occurs only during probing of channel (port) B of the second device, sc16is7xx_lines final state will be 00001011b instead of the expected 00000011b.
This is caused in part by the "i--" in the for loop located in the out_ports: error path.
Fix this by checking the return value of uart_add_one_port() and setting the line allocation bit only if the call was successful. This allows refactoring the obfuscated for(i--...) loop in the error path, so that uart_remove_one_port() is called only when needed and the line allocation bits are properly cleared.
Also use the same mechanism in remove() when calling uart_remove_one_port().
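To make the bitfield example above concrete, here is a small userspace sketch of the two bookkeeping schemes. It is not the driver code: lines stands in for sc16is7xx_lines, and MAX_LINES, NR_UART, first_free_line(), probe_old(), probe_new(), two_devices() and the added[] array are hypothetical names used only for illustration. The failure is simulated by the fail_port argument rather than by a real uart_add_one_port() error.

```c
#include <stdbool.h>

#define MAX_LINES 8	/* hypothetical bitmap width for this sketch */
#define NR_UART   2	/* ports (channels A and B) per simulated device */

/* Stand-in for the driver's global sc16is7xx_lines bitfield. */
static unsigned long lines;

/* Return the first free line number, or MAX_LINES if none is free. */
static int first_free_line(void)
{
	for (int n = 0; n < MAX_LINES; n++)
		if (!(lines & (1UL << n)))
			return n;
	return MAX_LINES;
}

/* Old scheme: claim the bit up front, unwind with "i--" on failure. */
static int probe_old(int fail_port)
{
	int line[NR_UART];
	int i;

	for (i = 0; i < NR_UART; i++) {
		line[i] = first_free_line();
		if (line[i] >= MAX_LINES)
			goto out_ports;
		lines |= 1UL << line[i];	/* bit set before the port is known good */
		if (i == fail_port)		/* simulated error on this port */
			goto out_ports;
	}
	return 0;

out_ports:
	for (i--; i >= 0; i--)			/* never visits the port that failed */
		lines &= ~(1UL << line[i]);
	return -1;
}

/* New scheme: set the bit only on success, clear only what was set. */
static int probe_new(int fail_port)
{
	int line[NR_UART];
	bool added[NR_UART] = { false };
	int i;

	for (i = 0; i < NR_UART; i++) {
		line[i] = first_free_line();
		if (line[i] >= MAX_LINES || i == fail_port)
			goto out_ports;
		lines |= 1UL << line[i];	/* port added successfully */
		added[i] = true;
	}
	return 0;

out_ports:
	for (i = 0; i < NR_UART; i++)
		if (added[i])
			lines &= ~(1UL << line[i]);
	return -1;
}

/* Two devices: the first probes cleanly, the second fails on port B. */
static unsigned long two_devices(int (*probe)(int))
{
	lines = 0;
	probe(-1);	/* no failure: lines 0 and 1 are allocated */
	probe(1);	/* failure on port index 1 (channel B) */
	return lines;
}
```

With these assumptions, two_devices(probe_old) leaves bit 3 set (00001011b), while two_devices(probe_new) returns to 00000011b.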
Yes, this seems to be the correct one to fix the problem described in the patch 1. I dunno why the patch 1 even exists.
Hi, this will indeed fix the problem described in patch 1.
However, if I remove patch 1, and I simulate the same probe error as described in patch 1, now we get stuck forever when trying to remove the driver. This is something that I observed before and that patch 1 also corrected.
The hang occurs in sc16is7xx_remove(), in this call:
kthread_flush_worker(&s->kworker);
I am not sure how best to handle that without patch 1.
Also, if we manage to get past kthread_flush_worker() and kthread_stop() (commented out for testing purposes), we get another bug:
# rmmod sc16is7xx
...
crystal-duart-24m already disabled
WARNING: CPU: 2 PID: 340 at drivers/clk/clk.c:1090 clk_core_disable+0x1b0/0x1e0
...
Call trace:
 clk_core_disable+0x1b0/0x1e0
 clk_disable+0x38/0x60
 sc16is7xx_remove+0x1e4/0x240 [sc16is7xx]
This one is caused by calling clk_disable_unprepare() in sc16is7xx_remove() when it has already been called in the probe error handling code. Patch 1 also fixed this...
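The unbalanced disable can be illustrated with a toy reference-count model. This is only a sketch of the idea, not the clk framework's implementation: struct toy_clk, toy_clk_enable(), toy_clk_disable() and simulate_probe_error_then_remove() are hypothetical names, and the sequence of calls is my reading of the trace above.

```c
#include <stdio.h>

/* Toy stand-in for a clock with an enable reference count. */
struct toy_clk {
	const char *name;
	int enable_count;
	int warnings;		/* number of unbalanced disables seen */
};

static void toy_clk_enable(struct toy_clk *clk)
{
	clk->enable_count++;
}

static void toy_clk_disable(struct toy_clk *clk)
{
	if (clk->enable_count == 0) {
		/* Disable without a matching enable: complain and bail out. */
		fprintf(stderr, "%s already disabled\n", clk->name);
		clk->warnings++;
		return;
	}
	clk->enable_count--;
}

/* Returns how many warnings the probe-error-then-remove sequence triggers. */
static int simulate_probe_error_then_remove(void)
{
	struct toy_clk clk = { "crystal-duart-24m", 0, 0 };

	toy_clk_enable(&clk);	/* probe(): clock enabled */
	toy_clk_disable(&clk);	/* probe() error path: clock disabled */
	toy_clk_disable(&clk);	/* remove(): second, unbalanced disable */

	return clk.warnings;
}
```

In this model the second disable finds the count already at zero, which corresponds to the "already disabled" warning in the trace.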
Hugo Villeneuve