From: Mike Marciniszyn mike.marciniszyn@cornelisnetworks.com
When the ipoib send_queue_size is increased from the default the following panic happens:
[ 219.242960] RIP: 0010:hfi1_ipoib_drain_tx_ring+0x45/0xf0 [hfi1] [ 219.250708] Code: 31 e4 eb 0f 8b 85 c8 02 00 00 41 83 c4 01 44 39 e0 76 60 8b 8d cc 02 00 00 44 89 e3 be 01 00 00 00 d3 e3 48 03 9d c0 02 00 00 <c7> 83 18 01 00 00 00 00 00 00 48 8b bb 30 01 00 00 e8 25 af a7 e0 [ 219.273764] RSP: 0018:ffffc9000798f4a0 EFLAGS: 00010286 [ 219.280740] RAX: 0000000000008000 RBX: ffffc9000aa0f000 RCX: 000000000000000f [ 219.289842] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 [ 219.298864] RBP: ffff88810ff08000 R08: ffff88889476d900 R09: 0000000000000101 [ 219.307907] R10: 0000000000000000 R11: ffffc90006590ff8 R12: 0000000000000200 [ 219.317016] R13: ffffc9000798fba8 R14: 0000000000000000 R15: 0000000000000001 [ 219.326100] FS: 00007fd0f79cc3c0(0000) GS:ffff88885fb00000(0000) knlGS:0000000000000000 [ 219.336171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 219.343639] CR2: ffffc9000aa0f118 CR3: 0000000889c84001 CR4: 00000000001706e0 [ 219.352589] Call Trace: [ 219.356340] <TASK> [ 219.359804] hfi1_ipoib_napi_tx_disable+0x45/0x60 [hfi1] [ 219.366887] hfi1_ipoib_dev_stop+0x18/0x80 [hfi1] [ 219.373313] ipoib_ib_dev_stop+0x1d/0x40 [ib_ipoib] [ 219.379814] ipoib_stop+0x48/0xc0 [ib_ipoib] [ 219.385604] __dev_close_many+0x9e/0x110 [ 219.391001] __dev_change_flags+0xd9/0x210 [ 219.396618] dev_change_flags+0x21/0x60 [ 219.401878] do_setlink+0x31c/0x10f0 [ 219.406841] ? __nla_validate_parse+0x12d/0x1a0 [ 219.412902] ? __nla_parse+0x21/0x30 [ 219.417844] ? inet6_validate_link_af+0x5e/0xf0 [ 219.423913] ? cpumask_next+0x1f/0x20 [ 219.428914] ? __snmp6_fill_stats64.isra.53+0xbb/0x140 [ 219.435648] ? __nla_validate_parse+0x47/0x1a0 [ 219.441564] __rtnl_newlink+0x530/0x910 [ 219.446818] ? pskb_expand_head+0x73/0x300 [ 219.452198] ? __kmalloc_node_track_caller+0x109/0x280 [ 219.458999] ? __nla_put+0xc/0x20 [ 219.463733] ? cpumask_next_and+0x20/0x30 [ 219.469166] ? update_sd_lb_stats.constprop.144+0xd3/0x820 [ 219.476325] ? _raw_spin_unlock_irqrestore+0x25/0x37 [ 219.482815] ? __wake_up_common_lock+0x87/0xc0 [ 219.488761] ? kmem_cache_alloc_trace+0x3d/0x3d0 [ 219.494917] rtnl_newlink+0x43/0x60
The issue happens when the shift that should have been a function of the txq item size mistakenly used the ring size.
Fix by using the item size.
Fixes: d47dfc2b00e6 ("IB/hfi1: Remove cache and embed txreq in ring") Cc: stable@vger.kernel.org Reviewed-by: Dennis Dalessandro dennis.dalessandro@cornelisnetworks.com Signed-off-by: Mike Marciniszyn mike.marciniszyn@cornelisnetworks.com --- drivers/infiniband/hw/hfi1/ipoib_tx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c index f401089..bf62956 100644 --- a/drivers/infiniband/hw/hfi1/ipoib_tx.c +++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c @@ -731,7 +731,7 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv) goto free_txqs;
txq->tx_ring.max_items = tx_ring_size; - txq->tx_ring.shift = ilog2(tx_ring_size); + txq->tx_ring.shift = ilog2(tx_item_size); txq->tx_ring.avail = hfi1_ipoib_ring_hwat(txq);
netif_tx_napi_add(dev, &txq->napi,
linux-stable-mirror@lists.linaro.org