On Sun, Aug 04, 2019 at 02:10:06PM -0500, Mike Christie wrote:
This fixes a bug added in 4.10 with commit:
commit 9561a7ade0c205bc2ee035a2ac880478dcc1a024
Author: Josef Bacik <jbacik@fb.com>
Date:   Tue Nov 22 14:04:40 2016 -0500

    nbd: add multi-connection support

that limited the number of devices to 256. Before the patch we could create 1000s of devices, but the patch switched us from using our own thread to using a work queue which has a default limit of 256 active works.
The problem is that our recv_work function sits in a loop until disconnection but only handles IO for one connection. The work is started when the connection is started/restarted, but if we end up creating 257 or more connections, the queue_work call just queues the recv_work for connection 257 and later, which then waits until the recv_work for one of connections 1 - 256 is disconnected and that work instance completes.
Instead of reverting back to kthreads, this has us allocate a workqueue_struct per device, so we can block in the work.
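
For illustration only, here is a small userspace sketch of the starvation described above. It is a toy model in Python, not kernel code: ToyWorkqueue, shared_queue and per_device_queues are hypothetical names made up for this example, and the only number taken from the description is the default limit of 256 active works. It assumes each work item, like recv_work, holds its slot until its connection is disconnected.

```python
from collections import deque

DEFAULT_MAX_ACTIVE = 256  # default limit of active works quoted above


class ToyWorkqueue:
    """Toy model (hypothetical) of a queue running at most max_active works."""

    def __init__(self, max_active=DEFAULT_MAX_ACTIVE):
        self.max_active = max_active
        self.running = set()    # works currently occupying a slot
        self.pending = deque()  # works queued but not yet started

    def queue_work(self, conn):
        # A recv_work-style item loops until its connection drops, so once
        # started it keeps its slot for the lifetime of the connection.
        if len(self.running) < self.max_active:
            self.running.add(conn)
        else:
            self.pending.append(conn)

    def disconnect(self, conn):
        # The work for conn completes; the oldest pending work can start.
        self.running.discard(conn)
        if self.pending and len(self.running) < self.max_active:
            self.running.add(self.pending.popleft())


def shared_queue(n_conns):
    """All connections queue their receive work on one shared queue."""
    wq = ToyWorkqueue()
    for conn in range(1, n_conns + 1):
        wq.queue_work(conn)
    return wq


def per_device_queues(n_devices):
    """Each device gets its own queue (one connection per device here)."""
    queues = {dev: ToyWorkqueue() for dev in range(1, n_devices + 1)}
    for dev, wq in queues.items():
        wq.queue_work(dev)
    return queues
```

With one shared queue, shared_queue(300) leaves connections 257 through 300 pending until an earlier connection disconnects. With per_device_queues(300), every receive work starts right away, because no queue ever holds more than its own device's work.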
Woops, thanks for fixing this. Sorry I was out of the office when this went through and forgot to come back to it.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Thanks,
Josef