Changes since v2 [1]:
- Drop the "mem-quiet" pm-debug interface in favor of an explicit
hibernate_quiet_exec() helper that executes firmware activation, or
any other subsystem-provided routine, in a system-quiet context.
(Rafael)
- Rework the sysfs interface to add an explicit trigger to run
activation under hibernate_quiet_exec(). Rename
ndbusX/firmware_activate to ndbusX/firmware/activate, and add an
ndbusX/firmware/capability attribute. Some ndctl reworks are needed to
catch up with this change.
- The new ndbusX/firmware/capability attribute indicates the default
activation method / execution context between "live" and "suspend".
[1]: http://lore.kernel.org/r/159408711335.2385045.2567600405906448375.stgit@dwi…
---
Quoting the documentation:
Some persistent memory devices run a firmware locally on the device /
"DIMM" to perform tasks like media management, capacity provisioning,
and health monitoring. The process of updating that firmware typically
involves a reboot because it has implications for in-flight memory
transactions. However, reboots are disruptive and at least the Intel
persistent memory platform implementation, described by the Intel ACPI
DSM specification [1], has added support for activating firmware at
runtime.
[1]: https://docs.pmem.io/persistent-memory/
The approach taken is to abstract the Intel platform specific mechanism
behind a libnvdimm-generic sysfs interface. The interface could support
runtime-firmware-activation on another architecture without need to
change userspace tooling.
The ACPI NFIT implementation involves a set of device-specific-methods
(DSMs) to 'arm' individual devices for activation and bus-level
'trigger' method to execute the activation. Informational / enumeration
methods are also provided at the bus and device level.
One complicating aspect of memory device firmware activation is that
the memory controller may need to be quiesced (no memory cycles) during
the activation. While the platform has mechanisms to support holding off
in-flight DMA during the activation, the device response to that delay
is potentially undefined. The platform may reject a runtime firmware
update if, for example, a PCI-E device does not support its completion
timeout value being increased to meet the activation time. Outside of
device timeouts, the quiesce period may also violate application
timeouts.
Given the above device and application timeout considerations, the
implementation uses a new hibernate_quiet_exec() facility to carry out
firmware activation. This imposes the same conditions that allow for a
stable memory image snapshot to be taken for a hibernate-to-disk
sequence. However, if desired, runtime activation without the hibernate
freeze can be forced as an override.
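
To illustrate the idea of running a subsystem-provided routine in a
system-quiet context, here is a minimal userspace sketch. It is not the
kernel implementation: quiet_exec(), system_quiet, struct fw_state and
activate() are hypothetical stand-ins invented for this example, and the
"freeze" is just a flag. It only shows the calling pattern, where a
callback plus its data are handed to a helper that establishes the quiet
context, runs the callback, and restores normal operation.

```c
#include <stddef.h>

/* Hypothetical callback type: a routine to run while the system is quiet. */
typedef int (*quiet_fn)(void *data);

/* Hypothetical flag standing in for "tasks frozen, devices quiesced". */
static int system_quiet;

/* Sketch of a helper that runs fn(data) inside the quiet context. */
static int quiet_exec(quiet_fn fn, void *data)
{
    int rc;

    system_quiet = 1;   /* stand-in for the hibernate-style freeze */
    rc = fn(data);      /* subsystem-provided routine runs here */
    system_quiet = 0;   /* stand-in for thaw / resume */
    return rc;
}

/* Hypothetical per-bus firmware state. */
struct fw_state {
    int armed;      /* at least one device has staged firmware armed */
    int activated;  /* activation has been executed */
};

/* Example routine: refuses to run outside the quiet context. */
static int activate(void *data)
{
    struct fw_state *fw = data;

    if (!system_quiet)
        return -1;  /* not quiesced: in-flight memory cycles possible */
    if (!fw->armed)
        return -2;  /* nothing armed, nothing to activate */
    fw->activated = 1;
    fw->armed = 0;
    return 0;
}
```

Calling activate() directly on an armed fw_state fails in this sketch,
while quiet_exec(activate, &fw) succeeds and leaves system_quiet cleared
afterwards.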
The ndctl utility grows the following extensions / commands to drive
this mechanism:
1/ The existing update-firmware command will 'arm' devices where the
firmware image is staged by default.
ndctl update-firmware all -f firmware_image.bin
2/ The existing ability to enumerate firmware-update capabilities now
includes firmware activate capabilities at the 'bus' and 'dimm/device'
level:
ndctl list -BDF -b nfit_test.0
[
{
"provider":"nfit_test.0",
"dev":"ndbus2",
"scrub_state":"idle",
"firmware":{
"activate_method":"suspend",
"activate_state":"idle"
},
"dimms":[
{
"dev":"nmem1",
"id":"cdab-0a-07e0-ffffffff",
"handle":0,
"phys_id":0,
"security":"disabled",
"firmware":{
"current_version":0,
"can_update":true
}
},
...
3/ The new activate-firmware command triggers firmware activation per
the platform-enumerated context, "suspend" vs "live", or can be forced
to "live" if there is explicit knowledge that allowing applications
and devices to race the quiesce timeout will have no adverse effects.
ndctl activate-firmware nfit_test.0 [--force]
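
The selection between the platform default and the forced override can
be pictured with the following sketch. It is not ndctl's code: the enum,
parse_capability() and choose_ctx() are hypothetical names for this
example. The only details taken from the description above are the two
context names, "suspend" and "live", and that forcing selects "live".
Treating an unrecognized string as "suspend" is a conservative choice
made by this sketch, not documented behavior.

```c
#include <stdbool.h>
#include <string.h>

/* The two execution contexts named in the cover letter. */
enum activate_ctx { CTX_SUSPEND, CTX_LIVE };

/* Map the platform-reported default to a context (hypothetical helper). */
static enum activate_ctx parse_capability(const char *s)
{
    /* Assumption of this sketch: anything but "live" is treated as "suspend". */
    return strcmp(s, "live") == 0 ? CTX_LIVE : CTX_SUSPEND;
}

/* Pick the context: the override wins, otherwise follow the platform. */
static enum activate_ctx choose_ctx(const char *capability, bool force)
{
    if (force)
        return CTX_LIVE;
    return parse_capability(capability);
}
```

With a platform default of "suspend", choose_ctx("suspend", false)
yields CTX_SUSPEND and choose_ctx("suspend", true) yields CTX_LIVE.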
These patches are passing an updated version of the ndctl
"firmware-update.sh" unit test (to be posted).
---
Dan Williams (11):
libnvdimm: Validate command family indices
ACPI: NFIT: Move bus_dsm_mask out of generic nvdimm_bus_descriptor
ACPI: NFIT: Define runtime firmware activation commands
tools/testing/nvdimm: Cleanup dimm index passing
tools/testing/nvdimm: Add command debug messages
tools/testing/nvdimm: Prepare nfit_ctl_test() for ND_CMD_CALL emulation
tools/testing/nvdimm: Emulate firmware activation commands
driver-core: Introduce DEVICE_ATTR_ADMIN_{RO,RW}
libnvdimm: Convert to DEVICE_ATTR_ADMIN_RO()
PM, libnvdimm: Add runtime firmware activation support
ACPI: NFIT: Add runtime firmware activate support
Documentation/ABI/testing/sysfs-bus-nfit | 19 +
Documentation/ABI/testing/sysfs-bus-nvdimm | 2
.../driver-api/nvdimm/firmware-activate.rst | 86 ++++
drivers/acpi/nfit/core.c | 142 +++++--
drivers/acpi/nfit/intel.c | 386 ++++++++++++++++++++
drivers/acpi/nfit/intel.h | 61 +++
drivers/acpi/nfit/nfit.h | 38 ++
drivers/nvdimm/bus.c | 16 +
drivers/nvdimm/core.c | 149 ++++++++
drivers/nvdimm/dimm_devs.c | 119 ++++++
drivers/nvdimm/namespace_devs.c | 2
drivers/nvdimm/nd-core.h | 1
drivers/nvdimm/pfn_devs.c | 2
drivers/nvdimm/region_devs.c | 2
include/linux/device.h | 4
include/linux/libnvdimm.h | 52 +++
include/linux/suspend.h | 6
include/linux/sysfs.h | 7
include/uapi/linux/ndctl.h | 5
kernel/power/hibernate.c | 97 +++++
tools/testing/nvdimm/test/nfit.c | 367 +++++++++++++++----
21 files changed, 1449 insertions(+), 114 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-nvdimm
create mode 100644 Documentation/driver-api/nvdimm/firmware-activate.rst
base-commit: 48778464bb7d346b47157d21ffde2af6b2d39110
The !ATOMIC_IOMAP version of io_mapping_init_wc() will always return
success, even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error, this is unexpected.
During a device probe, where the ioremap failed, a crash can look
like this:
BUG: unable to handle page fault for address: 0000000000210000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 177 Comm:
RIP: 0010:fill_page_dma [i915]
gen8_ppgtt_create [i915]
i915_ppgtt_create [i915]
intel_gt_init [i915]
i915_gem_init [i915]
i915_driver_probe [i915]
pci_device_probe
really_probe
driver_probe_device
The remap failure occurred much earlier in the probe. If it had
been propagated, the driver would have exited with an error.
Return NULL on ioremap failure.
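
The error-propagation pattern can be shown with a small userspace
sketch. None of the names below are kernel APIs: struct mapping,
fake_remap(), fail_remap and mapping_init() are hypothetical stand-ins,
with fake_remap() playing the role of the remap call that may fail. The
point is only that the init helper must report the inner failure so a
caller's NULL check fires at probe time instead of a later write
faulting.

```c
#include <stddef.h>

/* Hypothetical stand-in for the metadata struct filled by the init helper. */
struct mapping {
    void *iomem;
    size_t size;
};

/* Test knob: set to 1 to simulate a failed remap. */
static int fail_remap;
static char fake_mmio[64];

/* Hypothetical stand-in for the remap call; returns NULL on failure. */
static void *fake_remap(size_t size)
{
    if (fail_remap || size > sizeof(fake_mmio))
        return NULL;
    return fake_mmio;
}

/* Fill in the struct, but only report success if the remap worked. */
static struct mapping *mapping_init(struct mapping *m, size_t size)
{
    m->iomem = fake_remap(size);
    m->size = size;

    /* Returning m unconditionally would hide the failure from callers. */
    return m->iomem ? m : NULL;
}
```

A caller that does "if (!mapping_init(&m, size)) return error;" then
exits early when the remap fails.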
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
v2: reflect review comments
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 0beaa3eba155..5641e06cbcf7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
--
2.21.0
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: de2b41be8fcccb2f5b6c480d35df590476344201
Gitweb: https://git.kernel.org/tip/de2b41be8fcccb2f5b6c480d35df590476344201
Author: Joerg Roedel <jroedel(a)suse.de>
AuthorDate: Tue, 21 Jul 2020 11:34:48 +02:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 22 Jul 2020 09:38:37 +02:00
x86, vmlinux.lds: Page-align end of ..page_aligned sections
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.
As a result, objects from other .bss sections may end up on the same 4k
page as the idt_table, and will accidentally get mapped read-only during
boot, causing unexpected page-faults when the kernel writes to them.
This could be worked around by making the objects in the page aligned
sections page sized, but that's wrong.
Explicit sections which store only page aligned objects have an implicit
guarantee that the object is alone in the page in which it is placed. That
works for all objects except the last one. That's inconsistent.
Enforcing page sized objects for these sections would wreck memory
sanitizers, because the object becomes artificially larger than it should
be and out-of-bounds access becomes legit.
Align the end of the .bss..page_aligned and .data..page_aligned sections on
page-size so all objects placed in these sections are guaranteed to have
their own page.
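
The arithmetic behind the problem can be sketched in plain C. This is an
illustration only, assuming 4k pages and the 2048-byte idt_table
mentioned above; align_up(), next_start() and same_page() are
hypothetical helpers, and offsets are relative to the start of the
page-aligned section rather than real link addresses.

```c
#define SKETCH_PAGE_SIZE 4096UL

/* Round addr up to the next multiple of align (align is a power of two). */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Offset at which the following section begins, given a page-aligned
 * object of 'size' bytes at 'start'. With pad_end set, the end of the
 * page-aligned section is itself aligned, as the patch does.
 */
static unsigned long next_start(unsigned long start, unsigned long size,
                                int pad_end)
{
    unsigned long end = start + size;

    return pad_end ? align_up(end, SKETCH_PAGE_SIZE) : end;
}

/* True if both offsets fall within the same page. */
static int same_page(unsigned long a, unsigned long b)
{
    return a / SKETCH_PAGE_SIZE == b / SKETCH_PAGE_SIZE;
}
```

Without the trailing alignment, next_start(0, 2048, 0) is 2048, so the
next object shares the idt_table's page and inherits its read-only
mapping. With it, next_start(0, 2048, 1) is 4096 and the object starts
on a page of its own.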
[ tglx: Amended changelog ]
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
---
arch/x86/kernel/vmlinux.lds.S | 1 +
include/asm-generic/vmlinux.lds.h | 5 ++++-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 3bfc8dd..9a03e5b 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -358,6 +358,7 @@ SECTIONS
.bss : AT(ADDR(.bss) - LOAD_OFFSET) {
__bss_start = .;
*(.bss..page_aligned)
+ . = ALIGN(PAGE_SIZE);
*(BSS_MAIN)
BSS_DECRYPTED
. = ALIGN(PAGE_SIZE);
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index db600ef..052e0f0 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -341,7 +341,8 @@
#define PAGE_ALIGNED_DATA(page_align) \
. = ALIGN(page_align); \
- *(.data..page_aligned)
+ *(.data..page_aligned) \
+ . = ALIGN(page_align);
#define READ_MOSTLY_DATA(align) \
. = ALIGN(align); \
@@ -737,7 +738,9 @@
. = ALIGN(bss_align); \
.bss : AT(ADDR(.bss) - LOAD_OFFSET) { \
BSS_FIRST_SECTIONS \
+ . = ALIGN(PAGE_SIZE); \
*(.bss..page_aligned) \
+ . = ALIGN(PAGE_SIZE); \
*(.dynbss) \
*(BSS_MAIN) \
*(COMMON) \
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if
$(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit,
GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to
/usr/bin/ and Clang as of 11 will search for both
$(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.
GCC searches for $(prefix)aarch64-linux-gnu/$version/$needle,
$(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice,
$(prefix)aarch64-linux-gnu/$needle rarely contains executables.
To better model how GCC's -B/--prefix takes effect in practice, newer
Clang (since
https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6…)
only searches for $(prefix)$needle. Currently it will find /usr/bin/as
instead of /usr/bin/aarch64-linux-gnu-as.
Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
(/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the
appropriate cross compiling GNU as (when -no-integrated-as is in
effect).
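
The effect of the change can be sketched as plain string concatenation.
This is not Clang's driver code: candidate_path() and resolves_to() are
hypothetical helpers that only model the "$(prefix)$needle" lookup
described above, using the example paths from this message.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the single candidate newer Clang is described as trying:
 * prefix immediately followed by the tool name. Returns 0 on success,
 * -1 if the result does not fit in the buffer.
 */
static int candidate_path(char *out, size_t n, const char *prefix,
                          const char *needle)
{
    int len = snprintf(out, n, "%s%s", prefix, needle);

    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* True if prefix + needle yields exactly the expected path. */
static int resolves_to(const char *prefix, const char *needle,
                       const char *expected)
{
    char path[256];

    if (candidate_path(path, sizeof(path), prefix, needle) != 0)
        return 0;
    return strcmp(path, expected) == 0;
}
```

With the old prefix "/usr/bin/" the candidate for "as" is /usr/bin/as,
the host assembler. With "/usr/bin/aarch64-linux-gnu-" it becomes
/usr/bin/aarch64-linux-gnu-as, the cross assembler.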
Cc: stable(a)vger.kernel.org
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Fangrui Song <maskray(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
---
Changes in v2:
* Updated description to add tags and the llvm-project commit link.
* Fixed a typo.
Changes in v3:
* Add Cc: stable(a)vger.kernel.org
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 0b5f8538bde5..3ac83e375b61 100644
--- a/Makefile
+++ b/Makefile
@@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
ifneq ($(CROSS_COMPILE),)
CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
-CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
+CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
endif
ifneq ($(GCC_TOOLCHAIN),)
--
2.28.0.rc0.105.gf9edc3c819-goog