The upcoming Idle HLT Intercept feature allows the hypervisor to
intercept HLT instruction execution by a vCPU only if there are no
pending V_INTR or V_NMI events for that vCPU. When the vCPU is expected
to service pending V_INTR and V_NMI events, the Idle HLT intercept won't
trigger. The feature allows the hypervisor to determine whether the vCPU
is actually idle and reduces wasteful VMEXITs.
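As a rough illustration of the rule above, the intercept decision can be
modeled as a pure function. This is a minimal sketch and not KVM or
hardware code; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of the per-vCPU virtual event state. */
struct demo_vcpu_events {
	bool v_intr_pending;	/* a virtual interrupt is pending */
	bool v_nmi_pending;	/* a virtual NMI is pending */
};

/*
 * Returns true when a HLT executed by the vCPU would be intercepted
 * under the Idle HLT rule: only if neither V_INTR nor V_NMI is pending.
 */
static bool demo_idle_hlt_intercepts(const struct demo_vcpu_events *ev)
{
	return !ev->v_intr_pending && !ev->v_nmi_pending;
}
```

With either event pending the function returns false, which models the
vCPU continuing to run so it can service the event instead of exiting.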
The Idle HLT intercept feature is used by enlightened guests that wish
to securely handle the events. When an enlightened guest executes HLT
while an interrupt is pending, the hypervisor has no way to figure out
whether the guest needs to be re-entered. With the Idle HLT intercept
feature, HLT execution is intercepted only if there are no pending
V_INTR or V_NMI events.
Presence of the Idle HLT Intercept feature is indicated via CPUID
function Fn8000_000A_EDX[30].
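For illustration, the feature bit can be decoded from user space roughly
as follows. This is a sketch that assumes GCC or Clang on x86 (it relies
on __get_cpuid() from <cpuid.h>); the demo_* names and macros are
hypothetical and are not the kernel's cpufeatures definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define DEMO_SVM_FEATURE_LEAF	0x8000000Au	/* CPUID Fn8000_000A */
#define DEMO_IDLE_HLT_BIT	30		/* EDX[30] */

/* Decodes the Idle HLT bit from a raw Fn8000_000A EDX value. */
static bool demo_edx_has_idle_hlt(uint32_t edx)
{
	return (edx >> DEMO_IDLE_HLT_BIT) & 1u;
}

/* Queries the running CPU; returns false when the leaf is unavailable. */
static bool demo_cpu_has_idle_hlt(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(DEMO_SVM_FEATURE_LEAF, &eax, &ebx, &ecx, &edx))
		return false;
	return demo_edx_has_idle_hlt(edx);
#else
	return false;
#endif
}
```

Inside a guest this reflects whatever CPUID the hypervisor exposes,
which is not necessarily what the host hardware supports.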
Documentation for the Idle HLT intercept feature is available at [1].
This series is based on kvm-x86/next (13e98294d7ce) + [2] + [3].
Testing Done:
- Tested the functionality for the Idle HLT intercept feature
using the ipi_hlt_test selftest.
- Tested the Idle HLT intercept functionality on normal, SEV, SEV-ES,
and SEV-SNP guests.
- Tested the Idle HLT intercept functionality on a nested guest.
v4 -> v5
- Incorporated Sean's review comments on nested Idle HLT intercept support.
- Made svm_idle_hlt_test independent of Idle HLT so that it runs on all hardware.
v3 -> v4
- Drop the patches to add vcpu_get_stat() into a new series [2].
- Added nested Idle HLT intercept support.
v2 -> v3
- Incorporated Andrew's suggestion to structure vcpu_stat_types in
a way that each architecture can share the generic types and also
provide its own.
v1 -> v2
- Made changes to svm_idle_hlt_test based on Sean's review comments.
- Added an enum-based approach to get binary stats in vcpu_get_stat(), which
avoids using strings to look up stat data, based on Sean's comments.
- Added safe_halt() and cli() helpers based on Sean's comments.
[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
https://bugzilla.kernel.org/attachment.cgi?id=306250
[2]: https://lore.kernel.org/kvm/20241220013906.3518334-1-seanjc@google.com/T/#u
[3]: https://lore.kernel.org/kvm/20241220012617.3513898-1-seanjc@google.com/T/#u
---
V4: https://lore.kernel.org/kvm/20241022054810.23369-1-manali.shukla@amd.com/
V3: https://lore.kernel.org/kvm/20240528041926.3989-4-manali.shukla@amd.com/T/
V2: https://lore.kernel.org/kvm/20240501145433.4070-1-manali.shukla@amd.com/
V1: https://lore.kernel.org/kvm/20240307054623.13632-1-manali.shukla@amd.com/
Manali Shukla (3):
x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
KVM: SVM: Add Idle HLT intercept support
KVM: selftests: Add self IPI HLT test
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 1 +
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kvm/svm/svm.c | 13 ++-
tools/testing/selftests/kvm/Makefile.kvm | 1 +
.../selftests/kvm/include/x86/processor.h | 1 +
tools/testing/selftests/kvm/ipi_hlt_test.c | 85 +++++++++++++++++++
7 files changed, 101 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/kvm/ipi_hlt_test.c
base-commit: 13e98294d7cec978e31138d16824f50556a62d17
prerequisite-patch-id: cb345fc0d814a351df2b5788b76eee0eef9de549
prerequisite-patch-id: 71806f400cffe09f47d6231cb072cbdbd540de1b
prerequisite-patch-id: 9ea0412aab7ecd8555fcee3e9609dbfe8456d47b
prerequisite-patch-id: 3504df50cdd33958456f2e56139d76867273525c
prerequisite-patch-id: 674e56729a56cc487cb85be1a64ef561eb7bac8a
prerequisite-patch-id: 48e87354f9d6e6bd121ca32ab73cd0d7f1dce74f
prerequisite-patch-id: 74daffd7677992995f37e5a5cb784b8d4357e342
prerequisite-patch-id: 509018dc2fc1657debc641544e86f5a92d04bc1a
prerequisite-patch-id: 4a50c6a4dc3b3c8c8c640a86072faafb7bae4384
--
2.34.1
When working on OpenRISC support for restartable sequences I noticed
and fixed these two issues with the riscv support bits.
1. The 'inc' argument to RSEQ_ASM_OP_R_DEREF_ADDV was being implicitly
passed to the macro. Fix this by adding 'inc' to the list of macro
arguments.
2. The inline asm input constraints for 'inc' and 'off' use "er". The
riscv gcc port does not have an "e" constraint; this looks to have
been copied from the x86 port. Fix this by just using an "r"
constraint.
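The first issue can be reproduced with plain string-building macros. The
sketch below is a simplified illustration, not the rseq headers: the
DEMO_* names are hypothetical, the memory dereference step is omitted,
and the asm text is only built as a string, never executed.

```c
#include <string.h>

#define DEMO_STR_1(x) #x
#define DEMO_STR(x) DEMO_STR_1(x)

/* Emits an add of the named asm operand into a scratch register. */
#define DEMO_OP_ADD(operand) "add t0, t0, %[" DEMO_STR(operand) "]\n"

/* Before: 'inc' is not a parameter, so the operand name is fixed. */
#define DEMO_DEREF_ADDV_IMPLICIT(ptr, off)	\
	"mv t0, %[" DEMO_STR(ptr) "]\n"		\
	DEMO_OP_ADD(off)			\
	DEMO_OP_ADD(inc)

/* After: the caller names the operand explicitly. */
#define DEMO_DEREF_ADDV_EXPLICIT(ptr, off, inc)	\
	"mv t0, %[" DEMO_STR(ptr) "]\n"		\
	DEMO_OP_ADD(off)			\
	DEMO_OP_ADD(inc)

/* The implicit form always refers to an operand literally named "inc". */
static const char *demo_implicit(void)
{
	return DEMO_DEREF_ADDV_IMPLICIT(p, o);
}

/* The explicit form follows whatever operand name the caller passes. */
static const char *demo_explicit(void)
{
	return DEMO_DEREF_ADDV_EXPLICIT(p, o, delta);
}
```

The implicit form only works while the call site happens to declare an
asm operand named 'inc'; the explicit form makes that dependency visible
in the macro signature.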
I have only compile tested this for riscv. However, I use the same fixes
in the OpenRISC rseq selftests, and everything passes with no issues.
Signed-off-by: Stafford Horne <shorne(a)gmail.com>
---
tools/testing/selftests/rseq/rseq-riscv-bits.h | 6 +++---
tools/testing/selftests/rseq/rseq-riscv.h | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/rseq/rseq-riscv-bits.h b/tools/testing/selftests/rseq/rseq-riscv-bits.h
index de31a0143139..f02f411d550d 100644
--- a/tools/testing/selftests/rseq/rseq-riscv-bits.h
+++ b/tools/testing/selftests/rseq/rseq-riscv-bits.h
@@ -243,7 +243,7 @@ int RSEQ_TEMPLATE_IDENTIFIER(rseq_offset_deref_addv)(intptr_t *ptr, off_t off, i
#ifdef RSEQ_COMPARE_TWICE
RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, "%l[error1]")
#endif
- RSEQ_ASM_OP_R_DEREF_ADDV(ptr, off, 3)
+ RSEQ_ASM_OP_R_DEREF_ADDV(ptr, off, inc, 3)
RSEQ_INJECT_ASM(4)
RSEQ_ASM_DEFINE_ABORT(4, abort)
: /* gcc asm goto does not allow outputs */
@@ -251,8 +251,8 @@ int RSEQ_TEMPLATE_IDENTIFIER(rseq_offset_deref_addv)(intptr_t *ptr, off_t off, i
[current_cpu_id] "m" (rseq_get_abi()->RSEQ_TEMPLATE_CPU_ID_FIELD),
[rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
[ptr] "r" (ptr),
- [off] "er" (off),
- [inc] "er" (inc)
+ [off] "r" (off),
+ [inc] "r" (inc)
RSEQ_INJECT_INPUT
: "memory", RSEQ_ASM_TMP_REG_1
RSEQ_INJECT_CLOBBER
diff --git a/tools/testing/selftests/rseq/rseq-riscv.h b/tools/testing/selftests/rseq/rseq-riscv.h
index 37e598d0a365..67d544aaa9a3 100644
--- a/tools/testing/selftests/rseq/rseq-riscv.h
+++ b/tools/testing/selftests/rseq/rseq-riscv.h
@@ -158,7 +158,7 @@ do { \
"bnez " RSEQ_ASM_TMP_REG_1 ", 222b\n" \
"333:\n"
-#define RSEQ_ASM_OP_R_DEREF_ADDV(ptr, off, post_commit_label) \
+#define RSEQ_ASM_OP_R_DEREF_ADDV(ptr, off, inc, post_commit_label) \
"mv " RSEQ_ASM_TMP_REG_1 ", %[" __rseq_str(ptr) "]\n" \
RSEQ_ASM_OP_R_ADD(off) \
REG_L RSEQ_ASM_TMP_REG_1 ", 0(" RSEQ_ASM_TMP_REG_1 ")\n" \
--
2.47.0
As part 3 of the vIOMMU infrastructure, this series introduces a vIRQ
object. The existing FAULT object provides a nice notification pathway to
the user space already, so let vIRQ reuse the infrastructure.
Mimicking the HWPT structure, add a common EVENTQ structure to support its
derivatives: IOMMUFD_OBJ_FAULT (existing) and IOMMUFD_OBJ_VIRQ (new).
IOMMUFD_CMD_VIRQ_ALLOC is introduced to allocate vIRQ objects for vIOMMUs.
One vIOMMU can have multiple vIRQs of different types but cannot support
multiple vIRQs of the same type.
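The one-vIRQ-per-type rule can be pictured with a small user-space
model. This is only a sketch: the demo_* structures, the fixed-size
table, and the -EEXIST/-ENOSPC return values are assumptions made for
illustration and are not taken from the iommufd code.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_VIRQS 8

/* Hypothetical stand-ins for a vIOMMU and its vIRQ objects. */
struct demo_virq {
	unsigned int type;
};

struct demo_viommu {
	struct demo_virq virqs[DEMO_MAX_VIRQS];
	size_t nr_virqs;
};

/*
 * Registers a vIRQ of the given type. Returns 0 on success, -EEXIST if
 * the vIOMMU already has a vIRQ of that type, or -ENOSPC if the demo
 * table is full.
 */
static int demo_virq_alloc(struct demo_viommu *viommu, unsigned int type)
{
	size_t i;

	for (i = 0; i < viommu->nr_virqs; i++) {
		if (viommu->virqs[i].type == type)
			return -EEXIST;
	}
	if (viommu->nr_virqs == DEMO_MAX_VIRQS)
		return -ENOSPC;

	viommu->virqs[viommu->nr_virqs++].type = type;
	return 0;
}
```

Allocating types 1 and 2 on the same vIOMMU succeeds, while a second
request for type 1 is rejected.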
The forwarding part is fairly simple but might need to replace a physical
device ID with a virtual device ID in a driver-level IRQ data structure.
So, this comes with some helpers for drivers to use.
As usual, this series comes with the selftest coverage for this new vIRQ,
and with a real world use case in the ARM SMMUv3 driver.
This is on GitHub:
https://github.com/nicolinc/iommufd/commits/iommufd_virq-v3
Testing with RMR patches for MSI:
https://github.com/nicolinc/iommufd/commits/iommufd_virq-v3-with-rmr
Pairing QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_virq-v3
Changelog
v3
* Rebase on Will's for-joerg/arm-smmu/updates for arm_smmu_event series
* Add "Reviewed-by" lines from Kevin
* Fix typos in comments, kdocs, and jump tags
* Add a patch to sort struct iommufd_ioctl_op
* Update iommufd's userspace-api documentation
* Update uAPI kdoc to quote SMMUv3 official spec
* Drop the unused workqueue in struct iommufd_virq
* Drop might_sleep() in iommufd_viommu_report_irq() helper
* Add missing "break" in iommufd_viommu_get_vdev_id() helper
* Shrink the scope of the vmaster's read lock in SMMUv3 driver
* Pass in two arguments to iommufd_eventq_virq_handler() helper
* Move "!ops || !ops->read" validation into iommufd_eventq_init()
* Move "fault->ictx = ictx" closer to iommufd_ctx_get(fault->ictx)
* Update commit message for arm_smmu_attach_prepare/commit_vmaster()
* Keep "iommufd_fault" as-is and rename "iommufd_eventq_virq" to just
"iommufd_virq"
v2
https://lore.kernel.org/all/cover.1733263737.git.nicolinc@nvidia.com/
* Rebase on v6.13-rc1
* Add IOPF and vIRQ in iommufd.rst (userspace-api)
* Add a proper locking in iommufd_event_virq_destroy
* Add iommufd_event_virq_abort with a lockdep_assert_held
* Rename "EVENT_*" to "EVENTQ_*" to describe the objects better
* Reorganize flows in iommufd_eventq_virq_alloc for abort() to work
* Add struct arm_smmu_vmaster to store vSID upon attaching to a nested
domain, calling a newly added iommufd_viommu_get_vdev_id helper
* Add an arm_vmaster_report_event helper in arm-smmu-v3-iommufd file
to simplify the routine in arm_smmu_handle_evt() of the main driver
v1
https://lore.kernel.org/all/cover.1724777091.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (14):
iommufd: Keep IOCTL list in an alphabetical order
iommufd/fault: Add an iommufd_fault_init() helper
iommufd/fault: Move iommufd_fault_iopf_handler() to header
iommufd: Abstract an iommufd_eventq from iommufd_fault
iommufd: Rename fault.c to eventq.c
iommufd: Add IOMMUFD_OBJ_VIRQ and IOMMUFD_CMD_VIRQ_ALLOC
iommufd/viommu: Add iommufd_viommu_get_vdev_id helper
iommufd/viommu: Add iommufd_viommu_report_irq helper
iommufd/selftest: Require vdev_id when attaching to a nested domain
iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VIRQ for vIRQ coverage
iommufd/selftest: Add IOMMU_VIRQ_ALLOC test coverage
Documentation: userspace-api: iommufd: Update FAULT and VIRQ
iommu/arm-smmu-v3: Introduce struct arm_smmu_vmaster
iommu/arm-smmu-v3: Report IRQs that belong to devices attached to
vIOMMU
drivers/iommu/iommufd/Makefile | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 30 ++
drivers/iommu/iommufd/iommufd_private.h | 115 ++++++-
drivers/iommu/iommufd/iommufd_test.h | 10 +
include/linux/iommufd.h | 20 ++
include/uapi/linux/iommufd.h | 46 +++
tools/testing/selftests/iommu/iommufd_utils.h | 63 ++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 65 ++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 90 ++++--
drivers/iommu/iommufd/driver.c | 57 ++++
drivers/iommu/iommufd/{fault.c => eventq.c} | 298 ++++++++++++++----
drivers/iommu/iommufd/hw_pagetable.c | 6 +-
drivers/iommu/iommufd/main.c | 20 +-
drivers/iommu/iommufd/selftest.c | 53 ++++
drivers/iommu/iommufd/viommu.c | 2 +
tools/testing/selftests/iommu/iommufd.c | 27 ++
.../selftests/iommu/iommufd_fail_nth.c | 6 +
Documentation/userspace-api/iommufd.rst | 16 +
18 files changed, 809 insertions(+), 117 deletions(-)
rename drivers/iommu/iommufd/{fault.c => eventq.c} (55%)
base-commit: 376ce8b35ed15d5deee57bdecd8449f6a4df4c42
--
2.43.0
This patch allows progs to elide a null check on statically known map
lookup keys. In other words, if the verifier can statically prove that
the lookup will be in-bounds, allow the prog to drop the null check.
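The in-bounds condition can be sketched as a simple predicate. This is a
hypothetical model rather than verifier code: it assumes an array-style
map whose valid keys are 0 through max_entries - 1, and it uses a
negative const_key to mean "the key could not be proven constant".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when a lookup with a statically known key is guaranteed
 * to hit a valid slot, so the result cannot be NULL.
 *
 * const_key:   the constant key value, or a negative number when the
 *              key is not statically known (assumption of this model)
 * max_entries: number of slots in the array-style map
 */
static bool demo_can_elide_null_check(int64_t const_key, uint32_t max_entries)
{
	return const_key >= 0 && const_key < (int64_t)max_entries;
}
```

For a map with four entries, keys 0 through 3 qualify, while key 4 or an
unknown key still requires the null check.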
This is useful for two reasons:
1. Large numbers of nullness checks (especially when they cannot fail)
unnecessarily push the prog towards BPF_COMPLEXITY_LIMIT_JMP_SEQ.
2. It forms a tighter contract between programmer and verifier.
For (1), bpftrace is starting to make heavier use of percpu scratch
maps. As a result, for user scripts with a large number of unrolled loops,
we are starting to hit jump complexity verification errors. These
percpu lookups cannot fail anyway, as we only use static key values.
Eliding nullness probably results in less work for the verifier as well.
For (2), percpu scratch maps are often used as a larger stack, as the
current stack is limited to 512 bytes. In these situations, it is
desirable for the programmer to express: "this lookup should never fail,
and if it does, it means I messed up the code". By omitting the null
check, the programmer can "ask" the verifier to double check the logic.
=== Changelog ===
Changes in v6:
* Use is_spilled_scalar_reg() helper and remove unnecessary comment
* Add back deleted selftest with different helper to dirty dst buffer
* Check size of spill is exactly key_size and update selftests
* Read slot_type from correct offset into the spi
* Rewrite selftests in C where possible
* Mark constant map keys as precise
Changes in v5:
* Dropped all acks
* Use s64 instead of long for const_map_key
* Ensure stack slot contains spilled reg before accessing spilled_ptr
* Ensure spilled reg is a scalar before accessing tnum const value
* Fix verifier selftest for 32-bit write to write at 8 byte alignment
to ensure spill is tracked
* Introduce more precise tracking of helper stack accesses
* Do constant map key extraction as part of helper argument processing
and then remove duplicated stack checks
* Use ret_flag instead of regs[BPF_REG_0].type
* Handle STACK_ZERO
* Fix bug in bpf_load_hdr_opt() arg annotation
Changes in v4:
* Only allow for CAP_BPF
* Add test for stack growing upwards
* Improve comment about stack growing upwards
Changes in v3:
* Check if stack is (erroneously) growing upwards
* Mention in commit message why existing tests needed change
Changes in v2:
* Added a check for when R2 is not a ptr to stack
* Added a check for when stack is uninitialized (no stack slot yet)
* Updated existing tests to account for null elision
* Added test case for when R2 can be both const and non-const
Daniel Xu (5):
bpf: verifier: Add missing newline on verbose() call
bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write
bpf: verifier: Refactor helper access type tracking
bpf: verifier: Support eliding map lookup nullness
bpf: selftests: verifier: Add nullness elision tests
kernel/bpf/verifier.c | 139 +++++++++++----
net/core/filter.c | 2 +-
.../testing/selftests/bpf/progs/dynptr_fail.c | 6 +-
tools/testing/selftests/bpf/progs/iters.c | 14 +-
.../selftests/bpf/progs/map_kptr_fail.c | 2 +-
.../selftests/bpf/progs/test_global_func10.c | 2 +-
.../selftests/bpf/progs/uninit_stack.c | 5 +-
.../bpf/progs/verifier_array_access.c | 168 ++++++++++++++++++
.../bpf/progs/verifier_basic_stack.c | 2 +-
.../selftests/bpf/progs/verifier_const_or.c | 4 +-
.../progs/verifier_helper_access_var_len.c | 12 +-
.../selftests/bpf/progs/verifier_int_ptr.c | 2 +-
.../selftests/bpf/progs/verifier_map_in_map.c | 2 +-
.../selftests/bpf/progs/verifier_mtu.c | 2 +-
.../selftests/bpf/progs/verifier_raw_stack.c | 4 +-
.../selftests/bpf/progs/verifier_unpriv.c | 2 +-
.../selftests/bpf/progs/verifier_var_off.c | 8 +-
tools/testing/selftests/bpf/verifier/calls.c | 2 +-
.../testing/selftests/bpf/verifier/map_kptr.c | 2 +-
19 files changed, 311 insertions(+), 69 deletions(-)
--
2.47.1
From: Jeff Xu <jeffxu(a)chromium.org>
This change creates the initial version of memorysealing.c.
The introduction of memorysealing.c, which replaces mseal_test.c and
uses the kselftest_harness, aims to initiate a discussion on using the
selftest harness for memory sealing tests. Upon approval of this
approach, the migration of tests from mseal_test.c to memorysealing.c
can be implemented in a step-by-step manner.
This test addresses the following feedback from previous reviews:
1> Use kselftest_harness instead of custom macros such as EXPECT_XX,
ASSERT_XX, etc. (Lorenzo Stoakes, Mark Brown, etc) [1]
2> Use MAP_FAILED to check the return of mmap (Lorenzo Stoakes).
3> Add a check for vma size and prot bits. The discussion for
this can be found in [2] and [3]; here is a brief summary:
This is to follow up on Pedro's in-loop change (from
can_modify_mm to can_modify_vma). When mseal_test was initially
created, its tests had a common pattern: set up the memory layout,
seal the memory, perform a few mm-api steps, and verify the return
code (not zero). Because of the out-of-loop nature of the check, it
was sufficient to just verify the error code in a few cases.
With Pedro's in-loop change, the sealing check happens later in the
stack, thus there are more things and scenarios to verify. I also
received feedback that mseal_test should be extensive enough to
discover all regressions. Hence I'm adding checks for vma size and prot
bits.
In this change, we created two fixtures:
Fixture basic: This creates a single VMA; the VMA has a
PROT_NONE page at each end to prevent auto-merging.
Fixture wo_vma: Two VMAs back to back, with a PROT_NONE page at each
end to prevent auto-merging.
In addition, I add one test (mprotect) in each fixture for discussion.
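One way such a layout could be set up is sketched below. This is not the
code from memorysealing.c; the demo_* helpers are hypothetical and only
illustrate the MAP_FAILED check and the PROT_NONE page on each side of
the area under test.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Maps nr_pages read/write pages with a PROT_NONE page on each side.
 * Returns the start of the read/write area, or NULL on failure.
 */
static void *demo_map_with_guards(size_t nr_pages)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t total = (nr_pages + 2) * page;
	char *base;

	base = mmap(NULL, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)	/* mmap reports failure as MAP_FAILED, not NULL */
		return NULL;

	/* Open up the middle; the first and last pages stay PROT_NONE. */
	if (mprotect(base + page, nr_pages * page, PROT_READ | PROT_WRITE)) {
		munmap(base, total);
		return NULL;
	}
	return base + page;
}

/* Releases a mapping created by demo_map_with_guards(). */
static int demo_unmap_with_guards(void *area, size_t nr_pages)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	return munmap((char *)area - page, (nr_pages + 2) * page);
}
```

Because the neighbouring pages have different protection bits, the
read/write area keeps its own VMA boundaries, which is what lets a test
assert on its exact size and prot bits afterwards.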
[1] https://lore.kernel.org/all/20240830180237.1220027-5-jeffxu@chromium.org/
[2] https://lore.kernel.org/all/CABi2SkUgDZtJtRJe+J9UNdtZn=EQzZcbMB685P=1rR7DUh…
[3] https://lore.kernel.org/all/2qywbjb5ebtgwkh354w3lj3vhaothvubjokxq5fhyri5jee…
Jeff Xu (1):
selftest/mm: refactor mseal_test
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/memorysealing.c | 182 +++++++++++++++++++++
tools/testing/selftests/mm/memorysealing.h | 116 +++++++++++++
tools/testing/selftests/mm/mseal_test.c | 67 +-------
5 files changed, 301 insertions(+), 66 deletions(-)
create mode 100644 tools/testing/selftests/mm/memorysealing.c
create mode 100644 tools/testing/selftests/mm/memorysealing.h
--
2.47.1.613.gc27f4b7a9f-goog
This RFC patch series proposes a new ioctl PTP_SYS_OFFSET_STAT and adds
support for it in the proposed virtio_rtc driver [1]. The new
PTP_SYS_OFFSET_STAT ioctl provides a cross-timestamp like
PTP_SYS_OFFSET_PRECISE2, plus the following status information (for
now):
- for UTC timescale clocks: leap second related status,
- clock accuracy.
The second commit adds support for the ioctl in the proposed virtio_rtc
driver, and hence depends on the patch series "Add virtio_rtc module" [1].
[1] https://lore.kernel.org/lkml/20241219201118.2233-1-quic_philber@quicinc.com…
Signed-off-by: Peter Hilber <quic_philber(a)quicinc.com>
Peter Hilber (2):
ptp: add PTP_SYS_OFFSET_STAT for xtstamping with status
virtio_rtc: Support PTP_SYS_OFFSET_STAT ioctl
drivers/ptp/ptp_chardev.c | 39 ++++++++
drivers/ptp/ptp_clock.c | 9 ++
drivers/virtio/Kconfig | 4 +-
drivers/virtio/virtio_rtc_driver.c | 122 +++++++++++++++++++++++-
drivers/virtio/virtio_rtc_internal.h | 3 +-
drivers/virtio/virtio_rtc_ptp.c | 25 +++--
include/linux/ptp_clock_kernel.h | 31 ++++++
include/uapi/linux/ptp_clock.h | 130 +++++++++++++++++++++++++-
tools/testing/selftests/ptp/Makefile | 2 +-
tools/testing/selftests/ptp/testptp.c | 126 ++++++++++++++++++++++++-
10 files changed, 471 insertions(+), 20 deletions(-)
base-commit: 8a8009abbfa04e58f1b01b20534cac9e8fe61a46
--
2.43.0