Hello
I reproduced the issue on 5.11.7-rc1. Could we port this patch to the stable branch?
Thanks,
Yi
On 2/17/21 9:37 PM, Jason Gunthorpe wrote:
On Fri, Feb 05, 2021 at 09:14:28AM +0100, Nicolas Morey-Chaisemartin wrote:
The current code computes a number of channels per SRP target and spreads them equally across all online NUMA nodes. Each channel is then assigned a CPU within this node.
In the case of unbalanced or even unpopulated nodes, some channels are not assigned a CPU and therefore never get connected. This causes the SRP connection to fail.
This patch solves the issue by rewriting channel computation and allocation:
- Drop channel to node/CPU association as it had no real effect on locality but added unnecessary complexity.
- Tweak the number of channels allocated to reduce CPU contention when possible:
  - Up to one channel per CPU (instead of up to 4 by node)
  - At least 4 channels per node, unless ch_count module parameter is used.
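
As an illustration only, the channel-count rule described in the list above could be read roughly as follows. This is a user-space sketch, not the code from the patch: the function name srp_sketch_ch_count and its parameters are hypothetical, and the real driver may take other limits into account that the commit message does not mention.

```c
/*
 * Hypothetical helper, NOT taken from ib_srp.c.
 *
 * ch_count_param: value of the ch_count module parameter (0 = not set)
 * online_nodes:   number of online NUMA nodes
 * online_cpus:    number of online CPUs
 */
static unsigned int srp_sketch_ch_count(unsigned int ch_count_param,
                                        unsigned int online_nodes,
                                        unsigned int online_cpus)
{
    /* Default to 4 channels per node unless the user asked for a count. */
    unsigned int wanted = ch_count_param ? ch_count_param
                                         : 4 * online_nodes;

    /* Never use more than one channel per online CPU. */
    return wanted < online_cpus ? wanted : online_cpus;
}
```

Under these assumptions, a machine with 2 nodes and 64 CPUs would get 8 channels, one with 2 nodes and 6 CPUs would get 6, and ch_count=2 would yield 2 channels. Because the result no longer depends on how CPUs are distributed across nodes, an unpopulated node cannot leave a channel without a CPU.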
Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
drivers/infiniband/ulp/srp/ib_srp.c | 110 ++++++++++++----------------
1 file changed, 45 insertions(+), 65 deletions(-)
Applied to for-next, thanks
Jason