When the second-stage kernel is booted via kexec with a limiting command
line such as "mem=<size>", the physical range that contains the
carried-over IMA measurement list may fall outside the truncated RAM,
leading to a kernel panic:
BUG: unable to handle page fault for address: ffff97793ff47000
RIP: ima_restore_measurement_list+0xdc/0x45a
#PF: error_code(0x0000) - not-present page
Other architectures already validate the range with page_is_ram(), as
done in commit cbf9c4b9617b ("of: check previous kernel's
ima-kexec-buffer against memory bounds"). Do a similar check on x86.
Cc: stable(a)vger.kernel.org
Fixes: b69a2afd5afc ("x86/kexec: Carry forward IMA measurement log on kexec")
Reported-by: Paul Webb <paul.x.webb(a)oracle.com>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
---
Tested kexec on an x86 kernel with IMA_KEXEC enabled, and this patch
works well. Paul initially reported this on a 6.12 kernel, but I was
able to reproduce it on 6.18, so I replicated how this was fixed in
drivers/of/kexec.c.
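
For reviewers who want to see the range arithmetic in isolation, here is a
minimal userspace sketch. It is not the kernel code: the sketch_* names, the
4 KiB page size, and the single ram_limit bound are assumptions made for
illustration, and the helper treats end_pfn as inclusive, which may differ
from what the kernel helper expects.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Round a physical address down to its page frame number. */
static uint64_t sketch_pfn_down(uint64_t phys)
{
	return phys >> SKETCH_PAGE_SHIFT;
}

/*
 * Hypothetical stand-in for a "range is usable RAM" query: true when
 * every PFN in [start_pfn, end_pfn] (inclusive) lies below max_pfn.
 */
static bool sketch_range_in_ram(uint64_t start_pfn, uint64_t end_pfn,
				uint64_t max_pfn)
{
	return start_pfn <= end_pfn && end_pfn < max_pfn;
}

/*
 * True when the buffer [phys, phys + size) fits entirely below
 * ram_limit bytes. A zero size or an address wrap-around is rejected.
 */
static bool sketch_buffer_ok(uint64_t phys, size_t size, uint64_t ram_limit)
{
	uint64_t start_pfn, end_pfn;

	if (!size)
		return false;

	start_pfn = sketch_pfn_down(phys);
	end_pfn = sketch_pfn_down(phys + size - 1);

	return sketch_range_in_ram(start_pfn, end_pfn,
				   sketch_pfn_down(ram_limit));
}
```

With a 1 GiB limit, a one-page buffer at 0x3ff47000 passes, while the same
buffer at 0x7ff47000, or a two-page buffer starting at 0x3ffff000 that
straddles the limit, is rejected.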
---
arch/x86/kernel/setup.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 1b2edd07a3e1..fcef197d180e 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -439,9 +439,23 @@ int __init ima_free_kexec_buffer(void)
int __init ima_get_kexec_buffer(void **addr, size_t *size)
{
+ unsigned long start_pfn, end_pfn;
+
if (!ima_kexec_buffer_size)
return -ENOENT;
+ /*
+ * Calculate the PFNs for the buffer and ensure
+ * they are within addressable memory.
+ */
+ start_pfn = PFN_DOWN(ima_kexec_buffer_phys);
+ end_pfn = PFN_DOWN(ima_kexec_buffer_phys + ima_kexec_buffer_size - 1);
+ if (!pfn_range_is_mapped(start_pfn, end_pfn)) {
+ pr_warn("IMA buffer at 0x%llx, size = 0x%zx beyond memory\n",
+ ima_kexec_buffer_phys, ima_kexec_buffer_size);
+ return -EINVAL;
+ }
+
*addr = __va(ima_kexec_buffer_phys);
*size = ima_kexec_buffer_size;
--
2.50.1
From: NeilBrown <neil(a)brown.name>
A recent change to clamp_t() in 6.1.y caused fs/nfsd/nfs4state.c to fail
to compile with gcc-9. The code in nfsd4_get_drc_mem() was written with
the assumption that when "max < min",
clamp(val, min, max)
would return max. This assumption is not documented as an API promise
and the change caused a compile failure if it could be statically
determined that "max < min".
The relevant code was no longer present upstream when commit 1519fbc8832b
("minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()")
landed there, so there is no upstream change to nfsd4_get_drc_mem() to
backport.
There is no clear case that the existing code in nfsd4_get_drc_mem()
is functioning incorrectly. The goal of this patch is to permit the clean
application of commit 1519fbc8832b ("minmax.h: use BUILD_BUG_ON_MSG() for
the lo < hi test in clamp()"), and any commits that depend on it, to LTS
kernels without affecting the ability to compile those kernels. This is
done by open-coding the __clamp() macro sans the built-in type checking.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220745#c0
Signed-off-by: NeilBrown <neil(a)brown.name>
Stable-dep-of: 1519fbc8832b ("minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()")
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
---
fs/nfsd/nfs4state.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
Changes since Neil's post:
* Editorial changes to the commit message
* Attempt to address David's review comments
* Applied to linux-6.12.y, passed NFSD upstream CI suite
This patch is intended to be applied to linux-6.12.y, and should
apply cleanly to other LTS kernels since nfsd4_get_drc_mem() hasn't
changed since v5.4.
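
To make the behavior of the open-coded form easy to check outside the kernel,
here is a small userspace sketch. The function name is hypothetical; it
mirrors the order of the comparisons in the hunk below (upper bound first,
then lower bound) and has no compile-time check of the bounds.

```c
/*
 * Open-coded clamp that tests the upper bound first. There is no
 * build-time assertion that lo <= hi, so inverted bounds compile.
 * With inverted bounds (hi < lo): a value above hi yields hi, and
 * any other value is below lo and therefore yields lo.
 */
static unsigned long sketch_clamp_hi_first(unsigned long val,
					   unsigned long lo,
					   unsigned long hi)
{
	if (val > hi)
		val = hi;
	else if (val < lo)
		val = lo;
	return val;
}
```

For ordinary bounds this behaves like any clamp: 100 clamped to [10, 50]
gives 50, and 5 gives 10. With inverted bounds such as lo = 50, hi = 10,
a value of 100 gives 10, while a value of 5 gives 50.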
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7b0fabf8c657..41545933dd18 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1983,8 +1983,10 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca, struct nfsd_net *nn
*/
scale_factor = max_t(unsigned int, 8, nn->nfsd_serv->sv_nrthreads);
- avail = clamp_t(unsigned long, avail, slotsize,
- total_avail/scale_factor);
+ if (avail > total_avail / scale_factor)
+ avail = total_avail / scale_factor;
+ else if (avail < slotsize)
+ avail = slotsize;
num = min_t(int, num, avail / slotsize);
num = max_t(int, num, 1);
nfsd_drc_mem_used += num * slotsize;
--
2.51.0
The vNTB endpoint function (pci-epf-vntb) can be configured and reconfigured
through configfs (link/unlink functions, start/stop the controller, update
parameters). In practice, several pitfalls arise: double unmapping when two
windows share a BAR, wrong parameter order in .drop_link leading to wrong
object lookups, duplicate EPC teardown that leads to oopses, a work item
running after resources were torn down, and an inability to re-link/restart
because ntb_dev was embedded and the vPCI bus teardown was incomplete.
This series addresses those issues and hardens resource management across NTB
EPF and PCI EP core:
- Avoid double iounmap when PEER_SPAD and CONFIG share the same BAR.
- Fix configfs .drop_link parameter order so the correct groups are used during
unlink.
- Remove duplicate EPC resource teardown in both pci-epf-vntb and pci-epf-ntb,
avoiding crashes on .allow_link failures and during .drop_link.
- Stop the delayed cmd_handler work before clearing BARs/doorbells.
- Manage ntb_dev as a devm-managed allocation and implement .remove() in the
vNTB PCI driver. Switch to pci_scan_root_bus().
With these changes, the controller can now be stopped, a function unlinked,
configfs settings updated, and the controller re-linked and restarted
without rebooting the endpoint, as long as the underlying pci_epc_ops
.stop() is non-destructive and .start() restores normal operation.
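
As a generic illustration of the shared-BAR pitfall (not the driver code),
the userspace sketch below models two register windows that may be backed by
one mapping. All names are hypothetical, and malloc()/free() stand in for
map/unmap. The point is only that a pointer derived from another mapping's
base must not be released on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical device state: two windows that may share one mapping. */
struct sketch_dev {
	unsigned char *ctrl_base;	/* base of the CONFIG window mapping */
	unsigned char *peer_spad;	/* PEER_SPAD window, may point into ctrl_base */
	bool spad_shares_bar;		/* true when peer_spad is ctrl_base + offset */
	int unmap_calls;		/* counts releases, for demonstration only */
};

/* "Map" the windows, either as one shared region or as two regions. */
static int sketch_map(struct sketch_dev *d, bool share,
		      size_t ctrl_size, size_t spad_size)
{
	d->unmap_calls = 0;
	d->spad_shares_bar = share;

	if (share) {
		d->ctrl_base = malloc(ctrl_size + spad_size);
		if (!d->ctrl_base)
			return -1;
		d->peer_spad = d->ctrl_base + ctrl_size;
		return 0;
	}

	d->ctrl_base = malloc(ctrl_size);
	d->peer_spad = malloc(spad_size);
	if (!d->ctrl_base || !d->peer_spad) {
		free(d->ctrl_base);
		free(d->peer_spad);
		d->ctrl_base = NULL;
		d->peer_spad = NULL;
		return -1;
	}
	return 0;
}

/* Release each underlying mapping exactly once; safe to call twice. */
static void sketch_unmap(struct sketch_dev *d)
{
	/* Only release peer_spad when it owns its own mapping. */
	if (!d->spad_shares_bar && d->peer_spad) {
		free(d->peer_spad);
		d->unmap_calls++;
	}
	d->peer_spad = NULL;

	if (d->ctrl_base) {
		free(d->ctrl_base);
		d->unmap_calls++;
	}
	d->ctrl_base = NULL;
}
```

In the shared case a single release happens. In the split case two releases
happen, and a repeated sketch_unmap() call is a no-op because both pointers
were cleared.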
Patches 1-5 carry Fixes tags and are candidates for stable.
Patch 6 is preparatory for Patch 7.
Patch 7 is a behavioral improvement that completes lifetime management for
relink/restart scenarios.
Apologies for the delay between v2 and v3, and thank you for the review.
v2->v3 changes:
- Added Reviewed-by tag for [PATCH v2 4/6]
- Split [PATCH v2 6/6] into two, based on the feedback from Frank.
(No code changes overall.)
v1->v2 changes:
- Incorporated feedback from Frank.
- Added Reviewed-by tags (except for patches #4 and #6).
- Fixed a typo in patch #5 title.
(No code changes overall.)
v2: https://lore.kernel.org/all/20251029080321.807943-1-den@valinux.co.jp/
v1: https://lore.kernel.org/all/20251023071757.901181-1-den@valinux.co.jp/
Koichiro Den (7):
NTB: epf: Avoid pci_iounmap() with offset when PEER_SPAD and CONFIG
share BAR
PCI: endpoint: Fix parameter order for .drop_link
PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
NTB: epf: vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
PCI: endpoint: pci-epf-vntb: Switch vpci_scan_bus() to use
pci_scan_root_bus()
PCI: endpoint: pci-epf-vntb: Manage ntb_dev lifetime and fix vpci bus
teardown
drivers/ntb/hw/epf/ntb_hw_epf.c | 3 +-
drivers/pci/endpoint/functions/pci-epf-ntb.c | 56 +-----------
drivers/pci/endpoint/functions/pci-epf-vntb.c | 86 ++++++++++++-------
drivers/pci/endpoint/pci-ep-cfs.c | 8 +-
4 files changed, 62 insertions(+), 91 deletions(-)
--
2.48.1