This is a note to let you know that I've just added the patch titled
IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-mlx4-fix-incorrectly-releasing-steerable-ud-qps-when-have-only-eth-ports.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 852f6927594d0d3e8632c889b2ab38cbc46476ad Mon Sep 17 00:00:00 2001
From: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Date: Fri, 12 Jan 2018 07:58:40 +0200
Subject: IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports
From: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
commit 852f6927594d0d3e8632c889b2ab38cbc46476ad upstream.
Allocating steerable UD QPs depends on having at least one IB port,
while releasing those QPs does not.
As a result, when there are only ETH ports, the IB (RoCE) driver
requests releasing a qp range whose base qp is zero, with
qp count zero.
When SR-IOV is enabled, and the VF driver is running on a VM over
a hypervisor which treats such qp release calls as errors
(rather than NOPs), we see lines in the VM message log like:
mlx4_core 0002:00:02.0: Failed to release qp range base:0 cnt:0
Fix this by adding a check for a zero count in mlx4_release_qp_range()
(which thus treats releasing 0 qps as a nop), and eliminating the
check for device managed flow steering when releasing steerable UD QPs.
(Freeing ib_uc_qpns_bitmap unconditionally is also OK, since it
remains NULL when steerable UD QPs are not allocated).
Fixes: 4196670be786 ("IB/mlx4: Don't allocate range of steerable UD QPs for Ethernet-only device")
Signed-off-by: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/mlx4/main.c | 13 +++++--------
drivers/net/ethernet/mellanox/mlx4/qp.c | 3 +++
2 files changed, 8 insertions(+), 8 deletions(-)
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2928,9 +2928,8 @@ err_steer_free_bitmap:
kfree(ibdev->ib_uc_qpns_bitmap);
err_steer_qp_release:
- if (ibdev->steering_support == MLX4_STEERING_MODE_DEVICE_MANAGED)
- mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
- ibdev->steer_qpn_count);
+ mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
+ ibdev->steer_qpn_count);
err_counter:
for (i = 0; i < ibdev->num_ports; ++i)
mlx4_ib_delete_counters_table(ibdev, &ibdev->counters_table[i]);
@@ -3035,11 +3034,9 @@ static void mlx4_ib_remove(struct mlx4_d
ibdev->iboe.nb.notifier_call = NULL;
}
- if (ibdev->steering_support == MLX4_STEERING_MODE_DEVICE_MANAGED) {
- mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
- ibdev->steer_qpn_count);
- kfree(ibdev->ib_uc_qpns_bitmap);
- }
+ mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
+ ibdev->steer_qpn_count);
+ kfree(ibdev->ib_uc_qpns_bitmap);
iounmap(ibdev->uar_map);
for (p = 0; p < ibdev->num_ports; ++p)
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -286,6 +286,9 @@ void mlx4_qp_release_range(struct mlx4_d
u64 in_param = 0;
int err;
+ if (!cnt)
+ return;
+
if (mlx4_is_mfunc(dev)) {
set_param_l(&in_param, base_qpn);
set_param_h(&in_param, cnt);
Patches currently in stable-queue which might be from jackm(a)dev.mellanox.co.il are
queue-4.9/ib-mlx4-fix-incorrectly-releasing-steerable-ud-qps-when-have-only-eth-ports.patch
This is a note to let you know that I've just added the patch titled
IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-mlx4-fix-incorrectly-releasing-steerable-ud-qps-when-have-only-eth-ports.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 852f6927594d0d3e8632c889b2ab38cbc46476ad Mon Sep 17 00:00:00 2001
From: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Date: Fri, 12 Jan 2018 07:58:40 +0200
Subject: IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports
From: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
commit 852f6927594d0d3e8632c889b2ab38cbc46476ad upstream.
Allocating steerable UD QPs depends on having at least one IB port,
while releasing those QPs does not.
As a result, when there are only ETH ports, the IB (RoCE) driver
requests releasing a qp range whose base qp is zero, with
qp count zero.
When SR-IOV is enabled, and the VF driver is running on a VM over
a hypervisor which treats such qp release calls as errors
(rather than NOPs), we see lines in the VM message log like:
mlx4_core 0002:00:02.0: Failed to release qp range base:0 cnt:0
Fix this by adding a check for a zero count in mlx4_release_qp_range()
(which thus treats releasing 0 qps as a nop), and eliminating the
check for device managed flow steering when releasing steerable UD QPs.
(Freeing ib_uc_qpns_bitmap unconditionally is also OK, since it
remains NULL when steerable UD QPs are not allocated).
Fixes: 4196670be786 ("IB/mlx4: Don't allocate range of steerable UD QPs for Ethernet-only device")
Signed-off-by: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/mlx4/main.c | 13 +++++--------
drivers/net/ethernet/mellanox/mlx4/qp.c | 3 +++
2 files changed, 8 insertions(+), 8 deletions(-)
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2483,9 +2483,8 @@ err_steer_free_bitmap:
kfree(ibdev->ib_uc_qpns_bitmap);
err_steer_qp_release:
- if (ibdev->steering_support == MLX4_STEERING_MODE_DEVICE_MANAGED)
- mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
- ibdev->steer_qpn_count);
+ mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
+ ibdev->steer_qpn_count);
err_counter:
for (i = 0; i < ibdev->num_ports; ++i)
mlx4_ib_delete_counters_table(ibdev, &ibdev->counters_table[i]);
@@ -2586,11 +2585,9 @@ static void mlx4_ib_remove(struct mlx4_d
ibdev->iboe.nb.notifier_call = NULL;
}
- if (ibdev->steering_support == MLX4_STEERING_MODE_DEVICE_MANAGED) {
- mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
- ibdev->steer_qpn_count);
- kfree(ibdev->ib_uc_qpns_bitmap);
- }
+ mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
+ ibdev->steer_qpn_count);
+ kfree(ibdev->ib_uc_qpns_bitmap);
iounmap(ibdev->uar_map);
for (p = 0; p < ibdev->num_ports; ++p)
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -280,6 +280,9 @@ void mlx4_qp_release_range(struct mlx4_d
u64 in_param = 0;
int err;
+ if (!cnt)
+ return;
+
if (mlx4_is_mfunc(dev)) {
set_param_l(&in_param, base_qpn);
set_param_h(&in_param, cnt);
Patches currently in stable-queue which might be from jackm(a)dev.mellanox.co.il are
queue-4.4/ib-mlx4-fix-incorrectly-releasing-steerable-ud-qps-when-have-only-eth-ports.patch
This is a note to let you know that I've just added the patch titled
selftests: seccomp: fix compile error seccomp_bpf
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
selftests-seccomp-fix-compile-error-seccomp_bpf.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 912ec316686df352028afb6efec59e47a958a24d Mon Sep 17 00:00:00 2001
From: Anders Roxell <anders.roxell(a)linaro.org>
Date: Fri, 5 Jan 2018 17:31:18 +0100
Subject: selftests: seccomp: fix compile error seccomp_bpf
From: Anders Roxell <anders.roxell(a)linaro.org>
commit 912ec316686df352028afb6efec59e47a958a24d upstream.
aarch64-linux-gnu-gcc -Wl,-no-as-needed -Wall
-lpthread seccomp_bpf.c -o seccomp_bpf
seccomp_bpf.c: In function 'tracer_ptrace':
seccomp_bpf.c:1720:12: error: '__NR_open' undeclared
(first use in this function)
if (nr == __NR_open)
^~~~~~~~~
seccomp_bpf.c:1720:12: note: each undeclared identifier is reported
only once for each function it appears in
In file included from seccomp_bpf.c:48:0:
seccomp_bpf.c: In function 'TRACE_syscall_ptrace_syscall_dropped':
seccomp_bpf.c:1795:39: error: '__NR_open' undeclared
(first use in this function)
EXPECT_SYSCALL_RETURN(EPERM, syscall(__NR_open));
^
open(2) is a legacy syscall, replaced with openat(2) since 2.6.16.
Thus new architectures in the kernel, such as arm64, don't implement
these legacy syscalls.
Fixes: a33b2d0359a0 ("selftests/seccomp: Add tests for basic ptrace actions")
Signed-off-by: Anders Roxell <anders.roxell(a)linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Cc: stable(a)vger.kernel.org
Acked-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/testing/selftests/seccomp/seccomp_bpf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1717,7 +1717,7 @@ void tracer_ptrace(struct __test_metadat
if (nr == __NR_getpid)
change_syscall(_metadata, tracee, __NR_getppid);
- if (nr == __NR_open)
+ if (nr == __NR_openat)
change_syscall(_metadata, tracee, -1);
}
@@ -1792,7 +1792,7 @@ TEST_F(TRACE_syscall, ptrace_syscall_dro
true);
/* Tracer should skip the open syscall, resulting in EPERM. */
- EXPECT_SYSCALL_RETURN(EPERM, syscall(__NR_open));
+ EXPECT_SYSCALL_RETURN(EPERM, syscall(__NR_openat));
}
TEST_F(TRACE_syscall, syscall_allowed)
Patches currently in stable-queue which might be from anders.roxell(a)linaro.org are
queue-4.15/selftests-seccomp-fix-compile-error-seccomp_bpf.patch
This is a note to let you know that I've just added the patch titled
RDMA/rxe: Fix rxe_qp_cleanup()
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rdma-rxe-fix-rxe_qp_cleanup.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From bb3ffb7ad48a21e98a5c64eb21103a74fd9f03f6 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche(a)wdc.com>
Date: Fri, 12 Jan 2018 15:11:59 -0800
Subject: RDMA/rxe: Fix rxe_qp_cleanup()
From: Bart Van Assche <bart.vanassche(a)wdc.com>
commit bb3ffb7ad48a21e98a5c64eb21103a74fd9f03f6 upstream.
rxe_qp_cleanup() can sleep so it must be run in thread context and
not in atomic context. This patch avoids that the following bug is
triggered:
Kernel BUG at 00000000560033f3 [verbose debug info unavailable]
BUG: sleeping function called from invalid context at net/core/sock.c:2761
in_atomic(): 1, irqs_disabled(): 0, pid: 7, name: ksoftirqd/0
INFO: lockdep is turned off.
Preemption disabled at:
[<00000000b6e69628>] __do_softirq+0x4e/0x540
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.15.0-rc7-dbg+ #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0x85/0xbf
___might_sleep+0x177/0x260
lock_sock_nested+0x1d/0x90
inet_shutdown+0x2e/0xd0
rxe_qp_cleanup+0x107/0x140 [rdma_rxe]
rxe_elem_release+0x18/0x80 [rdma_rxe]
rxe_requester+0x1cf/0x11b0 [rdma_rxe]
rxe_do_task+0x78/0xf0 [rdma_rxe]
tasklet_action+0x99/0x270
__do_softirq+0xc0/0x540
run_ksoftirqd+0x1c/0x70
smpboot_thread_fn+0x1be/0x270
kthread+0x117/0x130
ret_from_fork+0x24/0x30
Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Cc: Moni Shoua <monis(a)mellanox.com>
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/sw/rxe/rxe_qp.c | 12 ++++++++++--
drivers/infiniband/sw/rxe/rxe_verbs.h | 3 +++
2 files changed, 13 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -824,9 +824,9 @@ void rxe_qp_destroy(struct rxe_qp *qp)
}
/* called when the last reference to the qp is dropped */
-void rxe_qp_cleanup(struct rxe_pool_entry *arg)
+static void rxe_qp_do_cleanup(struct work_struct *work)
{
- struct rxe_qp *qp = container_of(arg, typeof(*qp), pelem);
+ struct rxe_qp *qp = container_of(work, typeof(*qp), cleanup_work.work);
rxe_drop_all_mcast_groups(qp);
@@ -859,3 +859,11 @@ void rxe_qp_cleanup(struct rxe_pool_entr
kernel_sock_shutdown(qp->sk, SHUT_RDWR);
sock_release(qp->sk);
}
+
+/* called when the last reference to the qp is dropped */
+void rxe_qp_cleanup(struct rxe_pool_entry *arg)
+{
+ struct rxe_qp *qp = container_of(arg, typeof(*qp), pelem);
+
+ execute_in_process_context(rxe_qp_do_cleanup, &qp->cleanup_work);
+}
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -35,6 +35,7 @@
#define RXE_VERBS_H
#include <linux/interrupt.h>
+#include <linux/workqueue.h>
#include <rdma/rdma_user_rxe.h>
#include "rxe_pool.h"
#include "rxe_task.h"
@@ -281,6 +282,8 @@ struct rxe_qp {
struct timer_list rnr_nak_timer;
spinlock_t state_lock; /* guard requester and completer */
+
+ struct execute_work cleanup_work;
};
enum rxe_mem_state {
Patches currently in stable-queue which might be from bart.vanassche(a)wdc.com are
queue-4.15/ib-core-fix-two-kernel-warnings-triggered-by-rxe-registration.patch
queue-4.15/rdma-rxe-fix-a-race-condition-in-rxe_requester.patch
queue-4.15/rdma-rxe-fix-rxe_qp_cleanup.patch
queue-4.15/rdma-rxe-fix-a-race-condition-related-to-the-qp-error-state.patch
This is a note to let you know that I've just added the patch titled
RDMA/rxe: Fix a race condition related to the QP error state
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rdma-rxe-fix-a-race-condition-related-to-the-qp-error-state.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6f301e06de4cf9ab7303f5acd43e64fcd4aa04be Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche(a)wdc.com>
Date: Tue, 9 Jan 2018 11:23:40 -0800
Subject: RDMA/rxe: Fix a race condition related to the QP error state
From: Bart Van Assche <bart.vanassche(a)wdc.com>
commit 6f301e06de4cf9ab7303f5acd43e64fcd4aa04be upstream.
The following sequence:
* Change queue pair state into IB_QPS_ERR.
* Post a work request on the queue pair.
Triggers the following race condition in the rdma_rxe driver:
* rxe_qp_error() triggers an asynchronous call of rxe_completer(), the function
that examines the QP send queue.
* rxe_post_send() posts a work request on the QP send queue.
If rxe_completer() runs prior to rxe_post_send(), it will drain the send
queue and the driver will assume no further action is necessary.
However, once we post the send to the send queue, because the queue is
in error, no send completion will ever happen and the send will get
stuck. In order to process the send, we need to make sure that
rxe_completer() gets run after a send is posted to a queue pair in an
error state. This patch ensures that happens.
Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Cc: Moni Shoua <monis(a)mellanox.com>
Cc: <stable(a)vger.kernel.org> # v4.8
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -814,6 +814,8 @@ static int rxe_post_send_kernel(struct r
(queue_count(qp->sq.queue) > 1);
rxe_run_task(&qp->req.task, must_sched);
+ if (unlikely(qp->req.state == QP_STATE_ERROR))
+ rxe_run_task(&qp->comp.task, 1);
return err;
}
Patches currently in stable-queue which might be from bart.vanassche(a)wdc.com are
queue-4.15/ib-core-fix-two-kernel-warnings-triggered-by-rxe-registration.patch
queue-4.15/rdma-rxe-fix-a-race-condition-in-rxe_requester.patch
queue-4.15/rdma-rxe-fix-rxe_qp_cleanup.patch
queue-4.15/rdma-rxe-fix-a-race-condition-related-to-the-qp-error-state.patch
This is a note to let you know that I've just added the patch titled
RDMA/rxe: Fix a race condition in rxe_requester()
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rdma-rxe-fix-a-race-condition-in-rxe_requester.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 65567e41219888feec72fee1de98ccf1efbbc16d Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche(a)wdc.com>
Date: Fri, 12 Jan 2018 15:11:58 -0800
Subject: RDMA/rxe: Fix a race condition in rxe_requester()
From: Bart Van Assche <bart.vanassche(a)wdc.com>
commit 65567e41219888feec72fee1de98ccf1efbbc16d upstream.
The rxe driver works as follows:
* The send queue, receive queue and completion queues are implemented as
circular buffers.
* ib_post_send() and ib_post_recv() calls are serialized through a spinlock.
* Removing elements from various queues happens from tasklet
context. Tasklets are guaranteed to run on at most one CPU. This serializes
access to these queues. See also rxe_completer(), rxe_requester() and
rxe_responder().
* rxe_completer() processes the skbs queued onto qp->resp_pkts.
* rxe_requester() handles the send queue (qp->sq.queue).
* rxe_responder() processes the skbs queued onto qp->req_pkts.
Since rxe_drain_req_pkts() processes qp->req_pkts, calling
rxe_drain_req_pkts() from rxe_requester() is racy. Hence this patch.
Reported-by: Moni Shoua <monis(a)mellanox.com>
Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/sw/rxe/rxe_loc.h | 1 -
drivers/infiniband/sw/rxe/rxe_req.c | 9 +--------
drivers/infiniband/sw/rxe/rxe_resp.c | 2 +-
3 files changed, 2 insertions(+), 10 deletions(-)
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -237,7 +237,6 @@ int rxe_srq_from_attr(struct rxe_dev *rx
void rxe_release(struct kref *kref);
-void rxe_drain_req_pkts(struct rxe_qp *qp, bool notify);
int rxe_completer(void *arg);
int rxe_requester(void *arg);
int rxe_responder(void *arg);
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -594,15 +594,8 @@ int rxe_requester(void *arg)
rxe_add_ref(qp);
next_wqe:
- if (unlikely(!qp->valid)) {
- rxe_drain_req_pkts(qp, true);
+ if (unlikely(!qp->valid || qp->req.state == QP_STATE_ERROR))
goto exit;
- }
-
- if (unlikely(qp->req.state == QP_STATE_ERROR)) {
- rxe_drain_req_pkts(qp, true);
- goto exit;
- }
if (unlikely(qp->req.state == QP_STATE_RESET)) {
qp->req.wqe_index = consumer_index(qp->sq.queue);
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -1210,7 +1210,7 @@ static enum resp_states do_class_d1e_err
}
}
-void rxe_drain_req_pkts(struct rxe_qp *qp, bool notify)
+static void rxe_drain_req_pkts(struct rxe_qp *qp, bool notify)
{
struct sk_buff *skb;
Patches currently in stable-queue which might be from bart.vanassche(a)wdc.com are
queue-4.15/ib-core-fix-two-kernel-warnings-triggered-by-rxe-registration.patch
queue-4.15/rdma-rxe-fix-a-race-condition-in-rxe_requester.patch
queue-4.15/rdma-rxe-fix-rxe_qp_cleanup.patch
queue-4.15/rdma-rxe-fix-a-race-condition-related-to-the-qp-error-state.patch
This is a note to let you know that I've just added the patch titled
IB/umad: Fix use of unprotected device pointer
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-umad-fix-use-of-unprotected-device-pointer.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f23a5350e43c810ca36b26d4ed4ecd9a08686f47 Mon Sep 17 00:00:00 2001
From: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Date: Sun, 28 Jan 2018 11:25:29 +0200
Subject: IB/umad: Fix use of unprotected device pointer
From: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
commit f23a5350e43c810ca36b26d4ed4ecd9a08686f47 upstream.
The ib_write_umad() is protected by taking the umad file mutex.
However, it accesses file->port->ib_dev -- which is protected only by the
port's mutex (field file_mutex).
The ib_umad_remove_one() calls ib_umad_kill_port() which sets
port->ib_dev to NULL under the port mutex (NOT the file mutex).
It then sets the mad agent to "dead" under the umad file mutex.
This is a race condition -- because there is a window where
port->ib_dev is NULL, while the agent is not "dead".
As a result, we saw stack traces like:
[16490.678059] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
[16490.678246] IP: ib_umad_write+0x29c/0xa3a [ib_umad]
[16490.678333] PGD 0 P4D 0
[16490.678404] Oops: 0000 [#1] SMP PTI
[16490.678466] Modules linked in: rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx4_en(OE) ptp pps_core mlx4_ib(OE-) ib_core(OE) mlx4_core(OE) mlx_compat
(OE) memtrack(OE) devlink mst_pciconf(OE) mst_pci(OE) netconsole nfsv3 nfs_acl nfs lockd grace fscache cfg80211 rfkill esp6_offload esp6 esp4_offload esp4 sunrpc kvm_intel kvm ppdev parport_pc irqbypass
parport joydev i2c_piix4 virtio_balloon cirrus drm_kms_helper ttm drm e1000 serio_raw virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg [last unloaded: mlxfw]
[16490.679202] CPU: 4 PID: 3115 Comm: sminfo Tainted: G OE 4.14.13-300.fc27.x86_64 #1
[16490.679339] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[16490.679477] task: ffff9cf753890000 task.stack: ffffaf70c26b0000
[16490.679571] RIP: 0010:ib_umad_write+0x29c/0xa3a [ib_umad]
[16490.679664] RSP: 0018:ffffaf70c26b3d90 EFLAGS: 00010202
[16490.679747] RAX: 0000000000000010 RBX: ffff9cf75610fd80 RCX: 0000000000000000
[16490.679856] RDX: 0000000000000001 RSI: 00007ffdf2bfd714 RDI: ffff9cf6bb2a9c00
In the above trace, ib_umad_write is trying to dereference the NULL
file->port->ib_dev pointer.
Fix this by using the agent's device pointer (the device field
in struct ib_mad_agent) -- which IS protected by the umad file mutex.
Fixes: 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
Signed-off-by: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/core/user_mad.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -500,7 +500,7 @@ static ssize_t ib_umad_write(struct file
}
memset(&ah_attr, 0, sizeof ah_attr);
- ah_attr.type = rdma_ah_find_type(file->port->ib_dev,
+ ah_attr.type = rdma_ah_find_type(agent->device,
file->port->port_num);
rdma_ah_set_dlid(&ah_attr, be16_to_cpu(packet->mad.hdr.lid));
rdma_ah_set_sl(&ah_attr, packet->mad.hdr.sl);
Patches currently in stable-queue which might be from jackm(a)dev.mellanox.co.il are
queue-4.15/ib-mlx4-fix-incorrectly-releasing-steerable-ud-qps-when-have-only-eth-ports.patch
queue-4.15/ib-umad-fix-use-of-unprotected-device-pointer.patch
This is a note to let you know that I've just added the patch titled
kselftest: fix OOM in memory compaction test
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kselftest-fix-oom-in-memory-compaction-test.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4c1baad223906943b595a887305f2e8124821dad Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Tue, 9 Jan 2018 17:26:24 +0100
Subject: kselftest: fix OOM in memory compaction test
From: Arnd Bergmann <arnd(a)arndb.de>
commit 4c1baad223906943b595a887305f2e8124821dad upstream.
Running the compaction_test sometimes results in out-of-memory
failures. When I debugged this, it turned out that the code to
reset the number of hugepages to the initial value is simply
broken since we write into an open sysctl file descriptor
multiple times without seeking back to the start.
Adding the lseek here fixes the problem.
Cc: stable(a)vger.kernel.org
Reported-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Link: https://bugs.linaro.org/show_bug.cgi?id=3145
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/testing/selftests/vm/compaction_test.c | 2 ++
1 file changed, 2 insertions(+)
--- a/tools/testing/selftests/vm/compaction_test.c
+++ b/tools/testing/selftests/vm/compaction_test.c
@@ -137,6 +137,8 @@ int check_compaction(unsigned long mem_f
printf("No of huge pages allocated = %d\n",
(atoi(nr_hugepages)));
+ lseek(fd, 0, SEEK_SET);
+
if (write(fd, initial_nr_hugepages, strlen(initial_nr_hugepages))
!= strlen(initial_nr_hugepages)) {
perror("Failed to write value to /proc/sys/vm/nr_hugepages\n");
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.15/kselftest-fix-oom-in-memory-compaction-test.patch
This is a note to let you know that I've just added the patch titled
IB/qib: Fix comparison error with qperf compare/swap test
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-qib-fix-comparison-error-with-qperf-compare-swap-test.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 87b3524cb5058fdc7c2afdb92bdb2e079661ddc4 Mon Sep 17 00:00:00 2001
From: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Date: Tue, 14 Nov 2017 04:34:52 -0800
Subject: IB/qib: Fix comparison error with qperf compare/swap test
From: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
commit 87b3524cb5058fdc7c2afdb92bdb2e079661ddc4 upstream.
This failure exists with qib:
ver_rc_compare_swap:
mismatch, sequence 2, expected 123456789abcdef, got 0
The request builder was using the incorrect inlines to
build the request header resulting in incorrect data
in the atomic header.
Fix by using the appropriate inlines to create the request.
Fixes: 261a4351844b ("IB/qib,IB/hfi: Use core common header file")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro(a)intel.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/qib/qib_rc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/infiniband/hw/qib/qib_rc.c
+++ b/drivers/infiniband/hw/qib/qib_rc.c
@@ -434,13 +434,13 @@ no_flow_control:
qp->s_state = OP(COMPARE_SWAP);
put_ib_ateth_swap(wqe->atomic_wr.swap,
&ohdr->u.atomic_eth);
- put_ib_ateth_swap(wqe->atomic_wr.compare_add,
- &ohdr->u.atomic_eth);
+ put_ib_ateth_compare(wqe->atomic_wr.compare_add,
+ &ohdr->u.atomic_eth);
} else {
qp->s_state = OP(FETCH_ADD);
put_ib_ateth_swap(wqe->atomic_wr.compare_add,
&ohdr->u.atomic_eth);
- put_ib_ateth_swap(0, &ohdr->u.atomic_eth);
+ put_ib_ateth_compare(0, &ohdr->u.atomic_eth);
}
put_ib_ateth_vaddr(wqe->atomic_wr.remote_addr,
&ohdr->u.atomic_eth);
Patches currently in stable-queue which might be from mike.marciniszyn(a)intel.com are
queue-4.15/ib-qib-fix-comparison-error-with-qperf-compare-swap-test.patch
This is a note to let you know that I've just added the patch titled
IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-mlx4-fix-incorrectly-releasing-steerable-ud-qps-when-have-only-eth-ports.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 852f6927594d0d3e8632c889b2ab38cbc46476ad Mon Sep 17 00:00:00 2001
From: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Date: Fri, 12 Jan 2018 07:58:40 +0200
Subject: IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports
From: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
commit 852f6927594d0d3e8632c889b2ab38cbc46476ad upstream.
Allocating steerable UD QPs depends on having at least one IB port,
while releasing those QPs does not.
As a result, when there are only ETH ports, the IB (RoCE) driver
requests releasing a qp range whose base qp is zero, with
qp count zero.
When SR-IOV is enabled, and the VF driver is running on a VM over
a hypervisor which treats such qp release calls as errors
(rather than NOPs), we see lines in the VM message log like:
mlx4_core 0002:00:02.0: Failed to release qp range base:0 cnt:0
Fix this by adding a check for a zero count in mlx4_release_qp_range()
(which thus treats releasing 0 qps as a nop), and eliminating the
check for device managed flow steering when releasing steerable UD QPs.
(Freeing ib_uc_qpns_bitmap unconditionally is also OK, since it
remains NULL when steerable UD QPs are not allocated).
Fixes: 4196670be786 ("IB/mlx4: Don't allocate range of steerable UD QPs for Ethernet-only device")
Signed-off-by: Jack Morgenstein <jackm(a)dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/mlx4/main.c | 13 +++++--------
drivers/net/ethernet/mellanox/mlx4/qp.c | 3 +++
2 files changed, 8 insertions(+), 8 deletions(-)
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2995,9 +2995,8 @@ err_steer_free_bitmap:
kfree(ibdev->ib_uc_qpns_bitmap);
err_steer_qp_release:
- if (ibdev->steering_support == MLX4_STEERING_MODE_DEVICE_MANAGED)
- mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
- ibdev->steer_qpn_count);
+ mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
+ ibdev->steer_qpn_count);
err_counter:
for (i = 0; i < ibdev->num_ports; ++i)
mlx4_ib_delete_counters_table(ibdev, &ibdev->counters_table[i]);
@@ -3102,11 +3101,9 @@ static void mlx4_ib_remove(struct mlx4_d
ibdev->iboe.nb.notifier_call = NULL;
}
- if (ibdev->steering_support == MLX4_STEERING_MODE_DEVICE_MANAGED) {
- mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
- ibdev->steer_qpn_count);
- kfree(ibdev->ib_uc_qpns_bitmap);
- }
+ mlx4_qp_release_range(dev, ibdev->steer_qpn_base,
+ ibdev->steer_qpn_count);
+ kfree(ibdev->ib_uc_qpns_bitmap);
iounmap(ibdev->uar_map);
for (p = 0; p < ibdev->num_ports; ++p)
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -287,6 +287,9 @@ void mlx4_qp_release_range(struct mlx4_d
u64 in_param = 0;
int err;
+ if (!cnt)
+ return;
+
if (mlx4_is_mfunc(dev)) {
set_param_l(&in_param, base_qpn);
set_param_h(&in_param, cnt);
Patches currently in stable-queue which might be from jackm(a)dev.mellanox.co.il are
queue-4.15/ib-mlx4-fix-incorrectly-releasing-steerable-ud-qps-when-have-only-eth-ports.patch
queue-4.15/ib-umad-fix-use-of-unprotected-device-pointer.patch