This is a note to let you know that I've just added the patch titled
libnvdimm, namespace: make 'resource' attribute only readable by root
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
libnvdimm-namespace-make-resource-attribute-only-readable-by-root.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c1fb3542074fd0c4d901d778bd52455111e4eb6f Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Tue, 26 Sep 2017 11:21:24 -0700
Subject: libnvdimm, namespace: make 'resource' attribute only readable by root
From: Dan Williams <dan.j.williams(a)intel.com>
commit c1fb3542074fd0c4d901d778bd52455111e4eb6f upstream.
For the same reason that /proc/iomem returns 0's for non-root readers
and acpi tables are root-only, make the 'resource' attribute for
namespace devices only readable by root. Otherwise we disclose physical
address information.
Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
Reported-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvdimm/namespace_devs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -1620,7 +1620,7 @@ static umode_t namespace_visible(struct
if (a == &dev_attr_resource.attr) {
if (is_namespace_blk(dev))
return 0;
- return a->mode;
+ return 0400;
}
if (is_namespace_pmem(dev) || is_namespace_blk(dev)) {
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.14/libnvdimm-dimm-clear-locked-status-on-successful-dimm-enable.patch
queue-4.14/libnvdimm-pfn-make-resource-attribute-only-readable-by-root.patch
queue-4.14/libnvdimm-region-make-resource-attribute-only-readable-by-root.patch
queue-4.14/dax-fix-pmd-faults-on-zero-length-files.patch
queue-4.14/libnvdimm-namespace-make-resource-attribute-only-readable-by-root.patch
queue-4.14/dax-fix-general-protection-fault-in-dax_alloc_inode.patch
queue-4.14/libnvdimm-namespace-fix-label-initialization-to-use-valid-seq-numbers.patch
This is a note to let you know that I've just added the patch titled
libnvdimm, namespace: fix label initialization to use valid seq numbers
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
libnvdimm-namespace-fix-label-initialization-to-use-valid-seq-numbers.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b18d4b8a25af6fe83d7692191d6ff962ea611c4f Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Tue, 26 Sep 2017 11:41:28 -0700
Subject: libnvdimm, namespace: fix label initialization to use valid seq numbers
From: Dan Williams <dan.j.williams(a)intel.com>
commit b18d4b8a25af6fe83d7692191d6ff962ea611c4f upstream.
The set of valid sequence numbers is {1,2,3}. The specification
indicates that an implementation should consider 0 a sign of a critical
error:
UEFI 2.7: 13.19 NVDIMM Label Protocol
Software never writes the sequence number 00, so a correctly
check-summed Index Block with this sequence number probably indicates a
critical error. When software discovers this case it treats it as an
invalid Index Block indication.
While the expectation is that the invalid block is just thrown away, the
Robustness Principle says we should fix this to make both sequence
numbers valid.
Fixes: f524bf271a5c ("libnvdimm: write pmem label set")
Reported-by: Juston Li <juston.li(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvdimm/label.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -1050,7 +1050,7 @@ static int init_labels(struct nd_mapping
nsindex = to_namespace_index(ndd, 0);
memset(nsindex, 0, ndd->nsarea.config_size);
for (i = 0; i < 2; i++) {
- int rc = nd_label_write_index(ndd, i, i*2, ND_NSINDEX_INIT);
+ int rc = nd_label_write_index(ndd, i, 3 - i, ND_NSINDEX_INIT);
if (rc)
return rc;
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.14/libnvdimm-dimm-clear-locked-status-on-successful-dimm-enable.patch
queue-4.14/libnvdimm-pfn-make-resource-attribute-only-readable-by-root.patch
queue-4.14/libnvdimm-region-make-resource-attribute-only-readable-by-root.patch
queue-4.14/dax-fix-pmd-faults-on-zero-length-files.patch
queue-4.14/libnvdimm-namespace-make-resource-attribute-only-readable-by-root.patch
queue-4.14/dax-fix-general-protection-fault-in-dax_alloc_inode.patch
queue-4.14/libnvdimm-namespace-fix-label-initialization-to-use-valid-seq-numbers.patch
This is a note to let you know that I've just added the patch titled
libnvdimm, dimm: clear 'locked' status on successful DIMM enable
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
libnvdimm-dimm-clear-locked-status-on-successful-dimm-enable.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d34cb808402898e53b9a9bcbbedd01667a78723b Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 25 Sep 2017 11:01:31 -0700
Subject: libnvdimm, dimm: clear 'locked' status on successful DIMM enable
From: Dan Williams <dan.j.williams(a)intel.com>
commit d34cb808402898e53b9a9bcbbedd01667a78723b upstream.
If we successfully enable a DIMM then it must not be locked and we can
clear the label-read failure condition. Otherwise, we need to reload the
entire bus provider driver to achieve the same effect, and that can
disrupt unrelated DIMMs and namespaces.
Fixes: 9d62ed965118 ("libnvdimm: handle locked label storage areas")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvdimm/dimm.c | 1 +
drivers/nvdimm/dimm_devs.c | 7 +++++++
drivers/nvdimm/nd.h | 1 +
3 files changed, 9 insertions(+)
--- a/drivers/nvdimm/dimm.c
+++ b/drivers/nvdimm/dimm.c
@@ -68,6 +68,7 @@ static int nvdimm_probe(struct device *d
rc = nd_label_reserve_dpa(ndd);
if (ndd->ns_current >= 0)
nvdimm_set_aliasing(dev);
+ nvdimm_clear_locked(dev);
nvdimm_bus_unlock(dev);
if (rc)
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -200,6 +200,13 @@ void nvdimm_set_locked(struct device *de
set_bit(NDD_LOCKED, &nvdimm->flags);
}
+void nvdimm_clear_locked(struct device *dev)
+{
+ struct nvdimm *nvdimm = to_nvdimm(dev);
+
+ clear_bit(NDD_LOCKED, &nvdimm->flags);
+}
+
static void nvdimm_release(struct device *dev)
{
struct nvdimm *nvdimm = to_nvdimm(dev);
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -254,6 +254,7 @@ long nvdimm_clear_poison(struct device *
unsigned int len);
void nvdimm_set_aliasing(struct device *dev);
void nvdimm_set_locked(struct device *dev);
+void nvdimm_clear_locked(struct device *dev);
struct nd_btt *to_nd_btt(struct device *dev);
struct nd_gen_sb {
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.14/libnvdimm-dimm-clear-locked-status-on-successful-dimm-enable.patch
queue-4.14/libnvdimm-pfn-make-resource-attribute-only-readable-by-root.patch
queue-4.14/libnvdimm-region-make-resource-attribute-only-readable-by-root.patch
queue-4.14/dax-fix-pmd-faults-on-zero-length-files.patch
queue-4.14/libnvdimm-namespace-make-resource-attribute-only-readable-by-root.patch
queue-4.14/dax-fix-general-protection-fault-in-dax_alloc_inode.patch
queue-4.14/libnvdimm-namespace-fix-label-initialization-to-use-valid-seq-numbers.patch
This is a note to let you know that I've just added the patch titled
kvm: vmx: Reinstate support for CPUs without virtual NMI
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-reinstate-support-for-cpus-without-virtual-nmi.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8a1b43922d0d1279e7936ba85c4c2a870403c95f Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Mon, 6 Nov 2017 13:31:12 +0100
Subject: kvm: vmx: Reinstate support for CPUs without virtual NMI
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit 8a1b43922d0d1279e7936ba85c4c2a870403c95f upstream.
This is more or less a revert of commit 2c82878b0cb3 ("KVM: VMX: require
virtual NMI support", 2017-03-27); it turns out that Core 2 Duo machines
only had virtual NMIs in some SKUs.
The revert is not trivial because in the meanwhile there have been several
fixes to nested NMI injection. Therefore, the entire vNMI state is moved
to struct loaded_vmcs.
Another change compared to before the patch is a simplification here:
if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked &&
!(is_guest_mode(vcpu) && nested_cpu_has_virtual_nmis(
get_vmcs12(vcpu))))) {
The final condition here is always true (because nested_cpu_has_virtual_nmis
is always false) and is removed.
Fixes: 2c82878b0cb38fd516fd612c67852a6bbf282003
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1490803
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 150 +++++++++++++++++++++++++++++++++++++----------------
1 file changed, 106 insertions(+), 44 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -202,6 +202,10 @@ struct loaded_vmcs {
bool nmi_known_unmasked;
unsigned long vmcs_host_cr3; /* May not match real cr3 */
unsigned long vmcs_host_cr4; /* May not match real cr4 */
+ /* Support for vnmi-less CPUs */
+ int soft_vnmi_blocked;
+ ktime_t entry_time;
+ s64 vnmi_blocked_time;
struct list_head loaded_vmcss_on_cpu_link;
};
@@ -1286,6 +1290,11 @@ static inline bool cpu_has_vmx_invpcid(v
SECONDARY_EXEC_ENABLE_INVPCID;
}
+static inline bool cpu_has_virtual_nmis(void)
+{
+ return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS;
+}
+
static inline bool cpu_has_vmx_wbinvd_exit(void)
{
return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -1343,11 +1352,6 @@ static inline bool nested_cpu_has2(struc
(vmcs12->secondary_vm_exec_control & bit);
}
-static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12)
-{
- return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
-}
-
static inline bool nested_cpu_has_preemption_timer(struct vmcs12 *vmcs12)
{
return vmcs12->pin_based_vm_exec_control &
@@ -3699,9 +3703,9 @@ static __init int setup_vmcs_config(stru
&_vmexit_control) < 0)
return -EIO;
- min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING |
- PIN_BASED_VIRTUAL_NMIS;
- opt = PIN_BASED_POSTED_INTR | PIN_BASED_VMX_PREEMPTION_TIMER;
+ min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
+ opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
+ PIN_BASED_VMX_PREEMPTION_TIMER;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
&_pin_based_exec_control) < 0)
return -EIO;
@@ -5667,7 +5671,8 @@ static void enable_irq_window(struct kvm
static void enable_nmi_window(struct kvm_vcpu *vcpu)
{
- if (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & GUEST_INTR_STATE_STI) {
+ if (!cpu_has_virtual_nmis() ||
+ vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & GUEST_INTR_STATE_STI) {
enable_irq_window(vcpu);
return;
}
@@ -5707,6 +5712,19 @@ static void vmx_inject_nmi(struct kvm_vc
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
+ if (!cpu_has_virtual_nmis()) {
+ /*
+ * Tracking the NMI-blocked state in software is built upon
+ * finding the next open IRQ window. This, in turn, depends on
+ * well-behaving guests: They have to keep IRQs disabled at
+ * least as long as the NMI handler runs. Otherwise we may
+ * cause NMI nesting, maybe breaking the guest. But as this is
+ * highly unlikely, we can live with the residual risk.
+ */
+ vmx->loaded_vmcs->soft_vnmi_blocked = 1;
+ vmx->loaded_vmcs->vnmi_blocked_time = 0;
+ }
+
++vcpu->stat.nmi_injections;
vmx->loaded_vmcs->nmi_known_unmasked = false;
@@ -5725,6 +5743,8 @@ static bool vmx_get_nmi_mask(struct kvm_
struct vcpu_vmx *vmx = to_vmx(vcpu);
bool masked;
+ if (!cpu_has_virtual_nmis())
+ return vmx->loaded_vmcs->soft_vnmi_blocked;
if (vmx->loaded_vmcs->nmi_known_unmasked)
return false;
masked = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & GUEST_INTR_STATE_NMI;
@@ -5736,13 +5756,20 @@ static void vmx_set_nmi_mask(struct kvm_
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
- vmx->loaded_vmcs->nmi_known_unmasked = !masked;
- if (masked)
- vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
- GUEST_INTR_STATE_NMI);
- else
- vmcs_clear_bits(GUEST_INTERRUPTIBILITY_INFO,
- GUEST_INTR_STATE_NMI);
+ if (!cpu_has_virtual_nmis()) {
+ if (vmx->loaded_vmcs->soft_vnmi_blocked != masked) {
+ vmx->loaded_vmcs->soft_vnmi_blocked = masked;
+ vmx->loaded_vmcs->vnmi_blocked_time = 0;
+ }
+ } else {
+ vmx->loaded_vmcs->nmi_known_unmasked = !masked;
+ if (masked)
+ vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
+ GUEST_INTR_STATE_NMI);
+ else
+ vmcs_clear_bits(GUEST_INTERRUPTIBILITY_INFO,
+ GUEST_INTR_STATE_NMI);
+ }
}
static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
@@ -5750,6 +5777,10 @@ static int vmx_nmi_allowed(struct kvm_vc
if (to_vmx(vcpu)->nested.nested_run_pending)
return 0;
+ if (!cpu_has_virtual_nmis() &&
+ to_vmx(vcpu)->loaded_vmcs->soft_vnmi_blocked)
+ return 0;
+
return !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
(GUEST_INTR_STATE_MOV_SS | GUEST_INTR_STATE_STI
| GUEST_INTR_STATE_NMI));
@@ -6478,6 +6509,7 @@ static int handle_ept_violation(struct k
* AAK134, BY25.
*/
if (!(to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK) &&
+ cpu_has_virtual_nmis() &&
(exit_qualification & INTR_INFO_UNBLOCK_NMI))
vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO, GUEST_INTR_STATE_NMI);
@@ -6961,7 +6993,7 @@ static struct loaded_vmcs *nested_get_cu
}
/* Create a new VMCS */
- item = kmalloc(sizeof(struct vmcs02_list), GFP_KERNEL);
+ item = kzalloc(sizeof(struct vmcs02_list), GFP_KERNEL);
if (!item)
return NULL;
item->vmcs02.vmcs = alloc_vmcs();
@@ -7978,6 +8010,7 @@ static int handle_pml_full(struct kvm_vc
* "blocked by NMI" bit has to be set before next VM entry.
*/
if (!(to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK) &&
+ cpu_has_virtual_nmis() &&
(exit_qualification & INTR_INFO_UNBLOCK_NMI))
vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
GUEST_INTR_STATE_NMI);
@@ -8822,6 +8855,25 @@ static int vmx_handle_exit(struct kvm_vc
return 0;
}
+ if (unlikely(!cpu_has_virtual_nmis() &&
+ vmx->loaded_vmcs->soft_vnmi_blocked)) {
+ if (vmx_interrupt_allowed(vcpu)) {
+ vmx->loaded_vmcs->soft_vnmi_blocked = 0;
+ } else if (vmx->loaded_vmcs->vnmi_blocked_time > 1000000000LL &&
+ vcpu->arch.nmi_pending) {
+ /*
+ * This CPU don't support us in finding the end of an
+ * NMI-blocked window if the guest runs with IRQs
+ * disabled. So we pull the trigger after 1 s of
+ * futile waiting, but inform the user about this.
+ */
+ printk(KERN_WARNING "%s: Breaking out of NMI-blocked "
+ "state on VCPU %d after 1 s timeout\n",
+ __func__, vcpu->vcpu_id);
+ vmx->loaded_vmcs->soft_vnmi_blocked = 0;
+ }
+ }
+
if (exit_reason < kvm_vmx_max_exit_handlers
&& kvm_vmx_exit_handlers[exit_reason])
return kvm_vmx_exit_handlers[exit_reason](vcpu);
@@ -9104,33 +9156,38 @@ static void vmx_recover_nmi_blocking(str
idtv_info_valid = vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK;
- if (vmx->loaded_vmcs->nmi_known_unmasked)
- return;
- /*
- * Can't use vmx->exit_intr_info since we're not sure what
- * the exit reason is.
- */
- exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
- unblock_nmi = (exit_intr_info & INTR_INFO_UNBLOCK_NMI) != 0;
- vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
- /*
- * SDM 3: 27.7.1.2 (September 2008)
- * Re-set bit "block by NMI" before VM entry if vmexit caused by
- * a guest IRET fault.
- * SDM 3: 23.2.2 (September 2008)
- * Bit 12 is undefined in any of the following cases:
- * If the VM exit sets the valid bit in the IDT-vectoring
- * information field.
- * If the VM exit is due to a double fault.
- */
- if ((exit_intr_info & INTR_INFO_VALID_MASK) && unblock_nmi &&
- vector != DF_VECTOR && !idtv_info_valid)
- vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
- GUEST_INTR_STATE_NMI);
- else
- vmx->loaded_vmcs->nmi_known_unmasked =
- !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO)
- & GUEST_INTR_STATE_NMI);
+ if (cpu_has_virtual_nmis()) {
+ if (vmx->loaded_vmcs->nmi_known_unmasked)
+ return;
+ /*
+ * Can't use vmx->exit_intr_info since we're not sure what
+ * the exit reason is.
+ */
+ exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
+ unblock_nmi = (exit_intr_info & INTR_INFO_UNBLOCK_NMI) != 0;
+ vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
+ /*
+ * SDM 3: 27.7.1.2 (September 2008)
+ * Re-set bit "block by NMI" before VM entry if vmexit caused by
+ * a guest IRET fault.
+ * SDM 3: 23.2.2 (September 2008)
+ * Bit 12 is undefined in any of the following cases:
+ * If the VM exit sets the valid bit in the IDT-vectoring
+ * information field.
+ * If the VM exit is due to a double fault.
+ */
+ if ((exit_intr_info & INTR_INFO_VALID_MASK) && unblock_nmi &&
+ vector != DF_VECTOR && !idtv_info_valid)
+ vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
+ GUEST_INTR_STATE_NMI);
+ else
+ vmx->loaded_vmcs->nmi_known_unmasked =
+ !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO)
+ & GUEST_INTR_STATE_NMI);
+ } else if (unlikely(vmx->loaded_vmcs->soft_vnmi_blocked))
+ vmx->loaded_vmcs->vnmi_blocked_time +=
+ ktime_to_ns(ktime_sub(ktime_get(),
+ vmx->loaded_vmcs->entry_time));
}
static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
@@ -9247,6 +9304,11 @@ static void __noclone vmx_vcpu_run(struc
struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned long debugctlmsr, cr3, cr4;
+ /* Record the guest's net vcpu time for enforced NMI injections. */
+ if (unlikely(!cpu_has_virtual_nmis() &&
+ vmx->loaded_vmcs->soft_vnmi_blocked))
+ vmx->loaded_vmcs->entry_time = ktime_get();
+
/* Don't enter VMX if guest state is invalid, let the exit handler
start emulation until we arrive back to a valid state */
if (vmx->emulation_required)
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.14/kvm-nvmx-set-idtr-and-gdtr-limits-when-loading-l1-host-state.patch
queue-4.14/kvm-svm-obey-guest-pat.patch
queue-4.14/kvm-vmx-reinstate-support-for-cpus-without-virtual-nmi.patch
This is a note to let you know that I've just added the patch titled
KVM: SVM: obey guest PAT
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-svm-obey-guest-pat.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 15038e14724799b8c205beb5f20f9e54896013c3 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 26 Oct 2017 09:13:27 +0200
Subject: KVM: SVM: obey guest PAT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit 15038e14724799b8c205beb5f20f9e54896013c3 upstream.
For many years some users of assigned devices have reported worse
performance on AMD processors with NPT than on AMD without NPT,
Intel or bare metal.
The reason turned out to be that SVM is discarding the guest PAT
setting and uses the default (PA0=PA4=WB, PA1=PA5=WT, PA2=PA6=UC-,
PA3=UC). The guest might be using a different setting, and
especially might want write combining but isn't getting it
(instead getting slow UC or UC- accesses).
Thanks a lot to geoff(a)hostfission.com for noticing the relation
to the g_pat setting. The patch has been tested also by a bunch
of people on VFIO users forums.
Fixes: 709ddebf81cb40e3c36c6109a7892e8b93a09464
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196409
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Nick Sarnie <commendsarnex(a)gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3657,6 +3657,13 @@ static int svm_set_msr(struct kvm_vcpu *
u32 ecx = msr->index;
u64 data = msr->data;
switch (ecx) {
+ case MSR_IA32_CR_PAT:
+ if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
+ return 1;
+ vcpu->arch.pat = data;
+ svm->vmcb->save.g_pat = data;
+ mark_dirty(svm->vmcb, VMCB_NPT);
+ break;
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr);
break;
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.14/kvm-nvmx-set-idtr-and-gdtr-limits-when-loading-l1-host-state.patch
queue-4.14/kvm-svm-obey-guest-pat.patch
queue-4.14/kvm-vmx-reinstate-support-for-cpus-without-virtual-nmi.patch
This is a note to let you know that I've just added the patch titled
KVM: PPC: Book3S HV: Don't call real-mode XICS hypercall handlers if not enabled
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-ppc-book3s-hv-don-t-call-real-mode-xics-hypercall-handlers-if-not-enabled.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 00bb6ae5006205e041ce9784c819460562351d47 Mon Sep 17 00:00:00 2001
From: Paul Mackerras <paulus(a)ozlabs.org>
Date: Thu, 26 Oct 2017 17:00:22 +1100
Subject: KVM: PPC: Book3S HV: Don't call real-mode XICS hypercall handlers if not enabled
From: Paul Mackerras <paulus(a)ozlabs.org>
commit 00bb6ae5006205e041ce9784c819460562351d47 upstream.
When running a guest on a POWER9 system with the in-kernel XICS
emulation disabled (for example by running QEMU with the parameter
"-machine pseries,kernel_irqchip=off"), the kernel does not pass
the XICS-related hypercalls such as H_CPPR up to userspace for
emulation there as it should.
The reason for this is that the real-mode handlers for these
hypercalls don't check whether a XICS device has been instantiated
before calling the xics-on-xive code. That code doesn't check
either, leading to potential NULL pointer dereferences because
vcpu->arch.xive_vcpu is NULL. Those dereferences won't cause an
exception in real mode but will lead to kernel memory corruption.
This fixes it by adding kvmppc_xics_enabled() checks before calling
the XICS functions.
Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Signed-off-by: Paul Mackerras <paulus(a)ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/kvm/book3s_hv_builtin.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -529,6 +529,8 @@ static inline bool is_rm(void)
unsigned long kvmppc_rm_h_xirr(struct kvm_vcpu *vcpu)
{
+ if (!kvmppc_xics_enabled(vcpu))
+ return H_TOO_HARD;
if (xive_enabled()) {
if (is_rm())
return xive_rm_h_xirr(vcpu);
@@ -541,6 +543,8 @@ unsigned long kvmppc_rm_h_xirr(struct kv
unsigned long kvmppc_rm_h_xirr_x(struct kvm_vcpu *vcpu)
{
+ if (!kvmppc_xics_enabled(vcpu))
+ return H_TOO_HARD;
vcpu->arch.gpr[5] = get_tb();
if (xive_enabled()) {
if (is_rm())
@@ -554,6 +558,8 @@ unsigned long kvmppc_rm_h_xirr_x(struct
unsigned long kvmppc_rm_h_ipoll(struct kvm_vcpu *vcpu, unsigned long server)
{
+ if (!kvmppc_xics_enabled(vcpu))
+ return H_TOO_HARD;
if (xive_enabled()) {
if (is_rm())
return xive_rm_h_ipoll(vcpu, server);
@@ -567,6 +573,8 @@ unsigned long kvmppc_rm_h_ipoll(struct k
int kvmppc_rm_h_ipi(struct kvm_vcpu *vcpu, unsigned long server,
unsigned long mfrr)
{
+ if (!kvmppc_xics_enabled(vcpu))
+ return H_TOO_HARD;
if (xive_enabled()) {
if (is_rm())
return xive_rm_h_ipi(vcpu, server, mfrr);
@@ -579,6 +587,8 @@ int kvmppc_rm_h_ipi(struct kvm_vcpu *vcp
int kvmppc_rm_h_cppr(struct kvm_vcpu *vcpu, unsigned long cppr)
{
+ if (!kvmppc_xics_enabled(vcpu))
+ return H_TOO_HARD;
if (xive_enabled()) {
if (is_rm())
return xive_rm_h_cppr(vcpu, cppr);
@@ -591,6 +601,8 @@ int kvmppc_rm_h_cppr(struct kvm_vcpu *vc
int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr)
{
+ if (!kvmppc_xics_enabled(vcpu))
+ return H_TOO_HARD;
if (xive_enabled()) {
if (is_rm())
return xive_rm_h_eoi(vcpu, xirr);
Patches currently in stable-queue which might be from paulus(a)ozlabs.org are
queue-4.14/kvm-ppc-book3s-hv-don-t-call-real-mode-xics-hypercall-handlers-if-not-enabled.patch
This is a note to let you know that I've just added the patch titled
KVM: nVMX: set IDTR and GDTR limits when loading L1 host state
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-nvmx-set-idtr-and-gdtr-limits-when-loading-l1-host-state.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 21f2d551183847bc7fbe8d866151d00cdad18752 Mon Sep 17 00:00:00 2001
From: Ladi Prosek <lprosek(a)redhat.com>
Date: Wed, 11 Oct 2017 16:54:42 +0200
Subject: KVM: nVMX: set IDTR and GDTR limits when loading L1 host state
From: Ladi Prosek <lprosek(a)redhat.com>
commit 21f2d551183847bc7fbe8d866151d00cdad18752 upstream.
Intel SDM 27.5.2 Loading Host Segment and Descriptor-Table Registers:
"The GDTR and IDTR limits are each set to FFFFH."
Signed-off-by: Ladi Prosek <lprosek(a)redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11325,6 +11325,8 @@ static void load_vmcs12_host_state(struc
vmcs_writel(GUEST_SYSENTER_EIP, vmcs12->host_ia32_sysenter_eip);
vmcs_writel(GUEST_IDTR_BASE, vmcs12->host_idtr_base);
vmcs_writel(GUEST_GDTR_BASE, vmcs12->host_gdtr_base);
+ vmcs_write32(GUEST_IDTR_LIMIT, 0xFFFF);
+ vmcs_write32(GUEST_GDTR_LIMIT, 0xFFFF);
/* If not VM_EXIT_CLEAR_BNDCFGS, the L2 value propagates to L1. */
if (vmcs12->vm_exit_controls & VM_EXIT_CLEAR_BNDCFGS)
Patches currently in stable-queue which might be from lprosek(a)redhat.com are
queue-4.14/kvm-nvmx-set-idtr-and-gdtr-limits-when-loading-l1-host-state.patch
This is a note to let you know that I've just added the patch titled
IB/srpt: Do not accept invalid initiator port names
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-srpt-do-not-accept-invalid-initiator-port-names.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c70ca38960399a63d5c048b7b700612ea321d17e Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche(a)wdc.com>
Date: Wed, 11 Oct 2017 10:27:22 -0700
Subject: IB/srpt: Do not accept invalid initiator port names
From: Bart Van Assche <bart.vanassche(a)wdc.com>
commit c70ca38960399a63d5c048b7b700612ea321d17e upstream.
Make srpt_parse_i_port_id() return a negative value if hex2bin()
fails.
Fixes: commit a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/ulp/srpt/ib_srpt.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2777,7 +2777,7 @@ static int srpt_parse_i_port_id(u8 i_por
{
const char *p;
unsigned len, count, leading_zero_bytes;
- int ret, rc;
+ int ret;
p = name;
if (strncasecmp(p, "0x", 2) == 0)
@@ -2789,10 +2789,9 @@ static int srpt_parse_i_port_id(u8 i_por
count = min(len / 2, 16U);
leading_zero_bytes = 16 - count;
memset(i_port_id, 0, leading_zero_bytes);
- rc = hex2bin(i_port_id + leading_zero_bytes, p, count);
- if (rc < 0)
- pr_debug("hex2bin failed for srpt_parse_i_port_id: %d\n", rc);
- ret = 0;
+ ret = hex2bin(i_port_id + leading_zero_bytes, p, count);
+ if (ret < 0)
+ pr_debug("hex2bin failed for srpt_parse_i_port_id: %d\n", ret);
out:
return ret;
}
Patches currently in stable-queue which might be from bart.vanassche(a)wdc.com are
queue-4.14/block-fix-a-race-between-blk_cleanup_queue-and-timeout-handling.patch
queue-4.14/scsi-qla2xxx-suppress-a-kernel-complaint-in-qla_init_base_qpair.patch
queue-4.14/ib-srp-avoid-that-a-cable-pull-can-trigger-a-kernel-crash.patch
queue-4.14/ib-srpt-do-not-accept-invalid-initiator-port-names.patch
This is a note to let you know that I've just added the patch titled
IB/srp: Avoid that a cable pull can trigger a kernel crash
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-srp-avoid-that-a-cable-pull-can-trigger-a-kernel-crash.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8a0d18c62121d3c554a83eb96e2752861d84d937 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche(a)wdc.com>
Date: Wed, 11 Oct 2017 10:27:26 -0700
Subject: IB/srp: Avoid that a cable pull can trigger a kernel crash
From: Bart Van Assche <bart.vanassche(a)wdc.com>
commit 8a0d18c62121d3c554a83eb96e2752861d84d937 upstream.
This patch fixes the following kernel crash:
general protection fault: 0000 [#1] PREEMPT SMP
Workqueue: ib_mad2 timeout_sends [ib_core]
Call Trace:
ib_sa_path_rec_callback+0x1c4/0x1d0 [ib_core]
send_handler+0xb2/0xd0 [ib_core]
timeout_sends+0x14d/0x220 [ib_core]
process_one_work+0x200/0x630
worker_thread+0x4e/0x3b0
kthread+0x113/0x150
Fixes: commit aef9ec39c47f ("IB: Add SCSI RDMA Protocol (SRP) initiator")
Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/ulp/srp/ib_srp.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -665,12 +665,19 @@ static void srp_path_rec_completion(int
static int srp_lookup_path(struct srp_rdma_ch *ch)
{
struct srp_target_port *target = ch->target;
- int ret;
+ int ret = -ENODEV;
ch->path.numb_path = 1;
init_completion(&ch->done);
+ /*
+ * Avoid that the SCSI host can be removed by srp_remove_target()
+ * before srp_path_rec_completion() is called.
+ */
+ if (!scsi_host_get(target->scsi_host))
+ goto out;
+
ch->path_query_id = ib_sa_path_rec_get(&srp_sa_client,
target->srp_host->srp_dev->dev,
target->srp_host->port,
@@ -684,18 +691,24 @@ static int srp_lookup_path(struct srp_rd
GFP_KERNEL,
srp_path_rec_completion,
ch, &ch->path_query);
- if (ch->path_query_id < 0)
- return ch->path_query_id;
+ ret = ch->path_query_id;
+ if (ret < 0)
+ goto put;
ret = wait_for_completion_interruptible(&ch->done);
if (ret < 0)
- return ret;
+ goto put;
- if (ch->status < 0)
+ ret = ch->status;
+ if (ret < 0)
shost_printk(KERN_WARNING, target->scsi_host,
PFX "Path record query failed\n");
- return ch->status;
+put:
+ scsi_host_put(target->scsi_host);
+
+out:
+ return ret;
}
static int srp_send_req(struct srp_rdma_ch *ch, bool multich)
Patches currently in stable-queue which might be from bart.vanassche(a)wdc.com are
queue-4.14/block-fix-a-race-between-blk_cleanup_queue-and-timeout-handling.patch
queue-4.14/scsi-qla2xxx-suppress-a-kernel-complaint-in-qla_init_base_qpair.patch
queue-4.14/ib-srp-avoid-that-a-cable-pull-can-trigger-a-kernel-crash.patch
queue-4.14/ib-srpt-do-not-accept-invalid-initiator-port-names.patch
This is a note to let you know that I've just added the patch titled
IB/hfi1: Fix incorrect available receive user context count
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-hfi1-fix-incorrect-available-receive-user-context-count.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d7d626179fb283aba73699071af0df6d00e32138 Mon Sep 17 00:00:00 2001
From: "Michael J. Ruhl" <michael.j.ruhl(a)intel.com>
Date: Mon, 2 Oct 2017 11:04:19 -0700
Subject: IB/hfi1: Fix incorrect available receive user context count
From: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
commit d7d626179fb283aba73699071af0df6d00e32138 upstream.
The addition of the VNIC contexts to num_rcv_contexts changes the
meaning of the sysfs value nctxts from available user contexts, to
user contexts + reserved VNIC contexts.
User applications that use nctxts are now broken.
Update the calculation so that VNIC contexts are used only if there are
hardware contexts available, and do not silently affect nctxts.
Update code to use the calculated VNIC context number.
Update the sysfs value nctxts to be available user contexts only.
Fixes: 2280740f01ae ("IB/hfi1: Virtual Network Interface Controller (VNIC) HW support")
Reviewed-by: Ira Weiny <ira.weiny(a)intel.com>
Reviewed-by: Niranjana Vishwanathapura <Niranjana.Vishwanathapura(a)intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro(a)intel.com>
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/hfi1/chip.c | 35 +++++++++++++++++++--------------
drivers/infiniband/hw/hfi1/hfi.h | 2 +
drivers/infiniband/hw/hfi1/sysfs.c | 2 -
drivers/infiniband/hw/hfi1/vnic_main.c | 7 ++++--
4 files changed, 29 insertions(+), 17 deletions(-)
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -13074,7 +13074,7 @@ static int request_msix_irqs(struct hfi1
first_sdma = last_general;
last_sdma = first_sdma + dd->num_sdma;
first_rx = last_sdma;
- last_rx = first_rx + dd->n_krcv_queues + HFI1_NUM_VNIC_CTXT;
+ last_rx = first_rx + dd->n_krcv_queues + dd->num_vnic_contexts;
/* VNIC MSIx interrupts get mapped when VNIC contexts are created */
dd->first_dyn_msix_idx = first_rx + dd->n_krcv_queues;
@@ -13294,8 +13294,9 @@ static int set_up_interrupts(struct hfi1
* slow source, SDMACleanupDone)
* N interrupts - one per used SDMA engine
* M interrupt - one per kernel receive context
+ * V interrupt - one for each VNIC context
*/
- total = 1 + dd->num_sdma + dd->n_krcv_queues + HFI1_NUM_VNIC_CTXT;
+ total = 1 + dd->num_sdma + dd->n_krcv_queues + dd->num_vnic_contexts;
/* ask for MSI-X interrupts */
request = request_msix(dd, total);
@@ -13356,10 +13357,12 @@ fail:
* in array of contexts
* freectxts - number of free user contexts
* num_send_contexts - number of PIO send contexts being used
+ * num_vnic_contexts - number of contexts reserved for VNIC
*/
static int set_up_context_variables(struct hfi1_devdata *dd)
{
unsigned long num_kernel_contexts;
+ u16 num_vnic_contexts = HFI1_NUM_VNIC_CTXT;
int total_contexts;
int ret;
unsigned ngroups;
@@ -13393,6 +13396,14 @@ static int set_up_context_variables(stru
num_kernel_contexts);
num_kernel_contexts = dd->chip_send_contexts - num_vls - 1;
}
+
+ /* Accommodate VNIC contexts if possible */
+ if ((num_kernel_contexts + num_vnic_contexts) > dd->chip_rcv_contexts) {
+ dd_dev_err(dd, "No receive contexts available for VNIC\n");
+ num_vnic_contexts = 0;
+ }
+ total_contexts = num_kernel_contexts + num_vnic_contexts;
+
/*
* User contexts:
* - default to 1 user context per real (non-HT) CPU core if
@@ -13402,19 +13413,16 @@ static int set_up_context_variables(stru
num_user_contexts =
cpumask_weight(&node_affinity.real_cpu_mask);
- total_contexts = num_kernel_contexts + num_user_contexts;
-
/*
* Adjust the counts given a global max.
*/
- if (total_contexts > dd->chip_rcv_contexts) {
+ if (total_contexts + num_user_contexts > dd->chip_rcv_contexts) {
dd_dev_err(dd,
"Reducing # user receive contexts to: %d, from %d\n",
- (int)(dd->chip_rcv_contexts - num_kernel_contexts),
+ (int)(dd->chip_rcv_contexts - total_contexts),
(int)num_user_contexts);
- num_user_contexts = dd->chip_rcv_contexts - num_kernel_contexts;
/* recalculate */
- total_contexts = num_kernel_contexts + num_user_contexts;
+ num_user_contexts = dd->chip_rcv_contexts - total_contexts;
}
/* each user context requires an entry in the RMT */
@@ -13427,25 +13435,24 @@ static int set_up_context_variables(stru
user_rmt_reduced);
/* recalculate */
num_user_contexts = user_rmt_reduced;
- total_contexts = num_kernel_contexts + num_user_contexts;
}
- /* Accommodate VNIC contexts */
- if ((total_contexts + HFI1_NUM_VNIC_CTXT) <= dd->chip_rcv_contexts)
- total_contexts += HFI1_NUM_VNIC_CTXT;
+ total_contexts += num_user_contexts;
/* the first N are kernel contexts, the rest are user/vnic contexts */
dd->num_rcv_contexts = total_contexts;
dd->n_krcv_queues = num_kernel_contexts;
dd->first_dyn_alloc_ctxt = num_kernel_contexts;
+ dd->num_vnic_contexts = num_vnic_contexts;
dd->num_user_contexts = num_user_contexts;
dd->freectxts = num_user_contexts;
dd_dev_info(dd,
- "rcv contexts: chip %d, used %d (kernel %d, user %d)\n",
+ "rcv contexts: chip %d, used %d (kernel %d, vnic %u, user %u)\n",
(int)dd->chip_rcv_contexts,
(int)dd->num_rcv_contexts,
(int)dd->n_krcv_queues,
- (int)dd->num_rcv_contexts - dd->n_krcv_queues);
+ dd->num_vnic_contexts,
+ dd->num_user_contexts);
/*
* Receive array allocation:
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1047,6 +1047,8 @@ struct hfi1_devdata {
u64 z_send_schedule;
u64 __percpu *send_schedule;
+ /* number of reserved contexts for VNIC usage */
+ u16 num_vnic_contexts;
/* number of receive contexts in use by the driver */
u32 num_rcv_contexts;
/* number of pio send contexts in use by the driver */
--- a/drivers/infiniband/hw/hfi1/sysfs.c
+++ b/drivers/infiniband/hw/hfi1/sysfs.c
@@ -543,7 +543,7 @@ static ssize_t show_nctxts(struct device
* give a more accurate picture of total contexts available.
*/
return scnprintf(buf, PAGE_SIZE, "%u\n",
- min(dd->num_rcv_contexts - dd->first_dyn_alloc_ctxt,
+ min(dd->num_user_contexts,
(u32)dd->sc_sizes[SC_USER].count));
}
--- a/drivers/infiniband/hw/hfi1/vnic_main.c
+++ b/drivers/infiniband/hw/hfi1/vnic_main.c
@@ -840,6 +840,9 @@ struct net_device *hfi1_vnic_alloc_rn(st
struct rdma_netdev *rn;
int i, size, rc;
+ if (!dd->num_vnic_contexts)
+ return ERR_PTR(-ENOMEM);
+
if (!port_num || (port_num > dd->num_pports))
return ERR_PTR(-EINVAL);
@@ -848,7 +851,7 @@ struct net_device *hfi1_vnic_alloc_rn(st
size = sizeof(struct opa_vnic_rdma_netdev) + sizeof(*vinfo);
netdev = alloc_netdev_mqs(size, name, name_assign_type, setup,
- dd->chip_sdma_engines, HFI1_NUM_VNIC_CTXT);
+ dd->chip_sdma_engines, dd->num_vnic_contexts);
if (!netdev)
return ERR_PTR(-ENOMEM);
@@ -856,7 +859,7 @@ struct net_device *hfi1_vnic_alloc_rn(st
vinfo = opa_vnic_dev_priv(netdev);
vinfo->dd = dd;
vinfo->num_tx_q = dd->chip_sdma_engines;
- vinfo->num_rx_q = HFI1_NUM_VNIC_CTXT;
+ vinfo->num_rx_q = dd->num_vnic_contexts;
vinfo->netdev = netdev;
rn->free_rdma_netdev = hfi1_vnic_free_rn;
rn->set_id = hfi1_vnic_set_vesw_id;
Patches currently in stable-queue which might be from michael.j.ruhl(a)intel.com are
queue-4.14/ib-hfi1-fix-incorrect-available-receive-user-context-count.patch