The guard regions feature was initially implemented to support anonymous
mappings only, excluding shmem.
This was done so as to introduce the feature carefully and incrementally,
and to be conservative about the various caveats and corner cases that
apply to file-backed mappings but not to anonymous ones.
Now that this feature has landed in 6.13, it is time to revisit it and to
extend this functionality to file-backed and shmem mappings.
In order to make this maximally useful, and since one may map file-backed
mappings read-only (for instance ELF images), we also remove the
restriction on read-only mappings and permit the establishment of guard
regions in any non-hugetlb, non-mlock()'d mapping.
It is safe to permit the establishment of guard regions in read-only
mappings because the guard regions only reduce access to the mapping, and
removing them simply reinstates the existing attributes of the underlying
VMA, meaning no access violations can occur.
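A minimal userspace sketch of what this permits, assuming a pre-existing
file 'testfile' (hypothetical name) at least two pages long;
MADV_GUARD_INSTALL/MADV_GUARD_REMOVE are the madvise(2) operations merged
in 6.13, and older libc headers may not define them yet:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_GUARD_INSTALL
#define MADV_GUARD_INSTALL 102	/* values from the 6.13 uapi */
#define MADV_GUARD_REMOVE  103
#endif

int main(void)
{
	long psize = sysconf(_SC_PAGESIZE);
	int fd = open("testfile", O_RDONLY);	/* hypothetical test file */
	char *p = mmap(NULL, 2 * psize, PROT_READ, MAP_PRIVATE, fd, 0);

	/* Install a guard region over the second page of this read-only,
	 * file-backed mapping - permitted as of this series. */
	madvise(p + psize, psize, MADV_GUARD_INSTALL);

	volatile char c = p[0];	/* fine: the first page is unguarded */
	/* c = p[psize];	   would now deliver SIGSEGV */

	/* Removal reinstates the VMA's attributes, so access resumes. */
	madvise(p + psize, psize, MADV_GUARD_REMOVE);
	c = p[psize];
	(void)c;
	return 0;
}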
While the change in kernel code introduced in this series is small, the
majority of the effort here is spent in extending the testing to assert
that the feature works correctly across numerous file-backed mapping
scenarios.
Every guard region self-test previously performed against anonymous memory
(that is, every test which is not inherently anonymous-only) has now been
updated to also be performed against shmem and against a mapping of a file
in the working directory.
This confirms that all cases also function correctly for file-backed guard
regions.
In addition, a number of further tests are added for specific file-backed
mapping scenarios.
There are a number of other concerns that one might have with regard to
guard regions, addressed below:
Readahead
~~~~~~~~~
Readahead is a process through which the page cache is populated on the
assumption that sequential reads will occur, thus amortising I/O. Through
clever use of the PG_readahead folio flag, established during major fault
and checked upon minor fault, it provides for asynchronous I/O to occur as
data is processed, reducing I/O stalls as data is faulted in.
Guard regions do not alter this mechanism, which operates at the folio and
fault level, but they do of course prevent the faulting of folios that
would otherwise be mapped.
In the instance of a major fault prior to a guard region, synchronous
readahead will occur, including the population of folios in the page cache
to which the guard regions will, in the mapping in question, prevent
access.
In addition, if PG_readahead is placed in a folio that is now inaccessible,
this will prevent asynchronous readahead from occurring as it would
otherwise do.
However, there are mechanisms for heuristically resetting this within
readahead regardless, which will 'recover' correct readahead behaviour.
Readahead presumes sequential data access; the presence of a guard region
clearly indicates that, at least within the guard region, no such
sequential access will occur, as it cannot occur there.
So this should have very little impact on any real workload. The far more
important question is whether readahead causes incorrect or inappropriate
mapping of ranges disallowed by the presence of guard regions - it does
not, as readahead does not 'pre-fault' memory in this fashion.
At any rate, any mechanism which would attempt to do so would hit the usual
page fault paths, which correctly handle PTE markers as with anonymous
mappings.
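The page cache / mapping distinction above can also be observed from
userspace; continuing the hypothetical sketch from earlier (fd, p and
psize as set up there, with the second page of p guarded):

char buf[4096];
pread(fd, buf, sizeof(buf), psize);	/* succeeds: the page cache, and
					 * any readahead into it, is
					 * unaffected by the guard */
/* p[psize] would still SIGSEGV - only access via the mapping is guarded */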
Fault-Around
~~~~~~~~~~~~
The fault-around logic, in a similar vein to readahead, attempts to improve
efficiency with regard to file-backed memory mappings; however, it differs
in that it does not try to fetch folios into the page cache that are about
to be accessed, but rather pre-maps a range of folios around the faulting
address.
The fact that guard regions make use of PTE markers makes this relatively
trivial, as this case is already handled - see filemap_map_folio_range()
and filemap_map_order0_folio() - in both instances, the solution is simply
to keep the established page table mappings and let the fault handler take
care of PTE markers, as per the comment:
/*
* NOTE: If there're PTE markers, we'll leave them to be
* handled in the specific fault path, and it'll prohibit
* the fault-around logic.
*/
This works, as establishing guard regions results in page table mappings
with PTE markers, and clearing guard regions removes those markers.
Truncation
~~~~~~~~~~
File truncation will not eliminate existing guard regions, as the
truncation operation will ultimately zap the range via
unmap_mapping_range(), which specifically excludes PTE markers.
Zapping
~~~~~~~
Zapping is, as with anonymous mappings, handled by zap_nonpresent_ptes(),
which specifically deals with guard entries, leaving them intact except in
instances such as process teardown or munmap() where they need to be
removed.
Reclaim
~~~~~~~
When reclaim is performed on file-backed folios, it ultimately invokes
try_to_unmap_one() via the rmap. If the folio is non-large, then map_pte()
will ultimately abort the operation for the guard region mapping. If large,
then check_pte() will determine that this is a non-device private
entry/device-exclusive entry 'swap' PTE and thus abort the operation in
that instance.
Therefore, nothing unexpected happens when reclaim is attempted upon a
file-backed guard region.
Hole Punching
~~~~~~~~~~~~~
This updates the page cache and ultimately invokes unmap_mapping_range(),
which explicitly leaves PTE markers in place.
Because the establishment of guard regions zapped any existing mappings to
file-backed folios, once the guard regions are removed, the hole-punched
region will be faulted in as usual and everything will behave as expected.
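A sketch of that sequence, again with hypothetical setup (fd open O_RDWR on
a hole-punchable file, p a MAP_SHARED PROT_READ|PROT_WRITE mapping of it,
psize the page size; FALLOC_FL_PUNCH_HOLE requires FALLOC_FL_KEEP_SIZE):

madvise(p + psize, psize, MADV_GUARD_INSTALL);

/* Punch a hole over the guarded range: unmap_mapping_range() leaves
 * the guard's PTE marker in place, so the guard survives. */
fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, psize, psize);

/* Once the guard is removed, the hole-punched range faults in as
 * usual, reading back zeroes since the underlying folios were freed. */
madvise(p + psize, psize, MADV_GUARD_REMOVE);
char zero = p[psize];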
Lorenzo Stoakes (4):
mm: allow guard regions in file-backed and read-only mappings
selftests/mm: rename guard-pages to guard-regions
tools/selftests: expand all guard region tests to file-backed
tools/selftests: add file/shmem-backed mapping guard region tests
mm/madvise.c | 8 +-
tools/testing/selftests/mm/.gitignore | 2 +-
tools/testing/selftests/mm/Makefile | 2 +-
.../mm/{guard-pages.c => guard-regions.c} | 921 ++++++++++++++++--
4 files changed, 821 insertions(+), 112 deletions(-)
rename tools/testing/selftests/mm/{guard-pages.c => guard-regions.c} (58%)
--
2.48.1
This series fixes misaligned access handling in non-interruptible
context by re-enabling interrupts when possible. A previous commit
replaced raw_copy_from_user() with copy_from_user(), which enables page
faulting and can thus sleep. While correct, a warning is now triggered
because the call is made in an invalid context (sleeping while
non-interruptible). This series fixes that problem by factorizing the
misaligned load/store entry into a single function that re-enables
interrupts if the interrupted context had interrupts enabled.
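The resulting pattern is roughly the following sketch (the function name
here is illustrative, not the literal patch; SR_PIE is the status-CSR bit
recording whether the interrupted context had interrupts enabled, and
handle_misaligned_load()/handle_misaligned_store() are the existing
handlers):

static void do_trap_misaligned(struct pt_regs *regs, bool is_load)
{
	/* Re-enable IRQs iff the interrupted context had them enabled,
	 * so that copy_from_user() may legitimately fault and sleep. */
	if (regs->status & SR_PIE)
		local_irq_enable();

	if (is_load)
		handle_misaligned_load(regs);
	else
		handle_misaligned_store(regs);

	local_irq_disable();
}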
In order for misaligned handling problems to be caught sooner, add a
kselftest for all the currently supported instructions.
Note: these commits were originally part of another, larger series for
misaligned request delegation, but were split out since they aren't
directly required by it.
Clément Léger (5):
riscv: misaligned: factorize trap handling
riscv: misaligned: enable IRQs while handling misaligned accesses
riscv: misaligned: use get_user() instead of __get_user()
Documentation/sysctl: add riscv to unaligned-trap supported archs
selftests: riscv: add misaligned access testing
Documentation/admin-guide/sysctl/kernel.rst | 4 +-
arch/riscv/kernel/traps.c | 57 ++--
arch/riscv/kernel/traps_misaligned.c | 2 +-
.../selftests/riscv/misaligned/.gitignore | 1 +
.../selftests/riscv/misaligned/Makefile | 12 +
.../selftests/riscv/misaligned/common.S | 33 +++
.../testing/selftests/riscv/misaligned/fpu.S | 180 +++++++++++++
tools/testing/selftests/riscv/misaligned/gp.S | 103 +++++++
.../selftests/riscv/misaligned/misaligned.c | 254 ++++++++++++++++++
9 files changed, 614 insertions(+), 32 deletions(-)
create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore
create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile
create mode 100644 tools/testing/selftests/riscv/misaligned/common.S
create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S
create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S
create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c
--
2.49.0
kernel test robot <lkp(a)intel.com> writes:
> All warnings (new ones prefixed by >>):
>
>>> kernel/bpf/core.c:3037:13: warning: no previous prototype for 'bpf_jit_bypass_spec_v1' [-Wmissing-prototypes]
> 3037 | bool __weak bpf_jit_bypass_spec_v1(void)
> | ^~~~~~~~~~~~~~~~~~~~~~
>>> kernel/bpf/core.c:3042:13: warning: no previous prototype for 'bpf_jit_bypass_spec_v4' [-Wmissing-prototypes]
> 3042 | bool __weak bpf_jit_bypass_spec_v4(void)
> | ^~~~~~~~~~~~~~~~~~~~~~
That's because the prototypes in include/linux/bpf.h were inside the
#ifdef CONFIG_BPF_SYSCALL block. I fixed this for v3 by moving the
prototypes out of the ifdef.
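For reference, the shape of the functions and the fix is roughly as
follows (a sketch reconstructed from the warning and the reply above, not
the exact patch):

/* kernel/bpf/core.c - weak defaults; architectures override these to
 * opt out of the Spectre v1/v4 mitigations (see the series below). */
bool __weak bpf_jit_bypass_spec_v1(void)
{
	return false;
}

bool __weak bpf_jit_bypass_spec_v4(void)
{
	return false;
}

/* include/linux/bpf.h - the v3 fix: declare the prototypes outside
 * the #ifdef CONFIG_BPF_SYSCALL block so they are always visible. */
bool bpf_jit_bypass_spec_v1(void);
bool bpf_jit_bypass_spec_v4(void);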
This improves the expressiveness of unprivileged BPF by inserting
speculation barriers instead of rejecting the programs.
The approach was previously presented at LPC'24 [1] and RAID'24 [2].
To mitigate the Spectre v1 (PHT) vulnerability, the kernel rejects
potentially-dangerous unprivileged BPF programs as of
commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted
branches"). In [2], we have analyzed 364 object files from open source
projects (Linux Samples and Selftests, BCC, Loxilb, Cilium, libbpf
Examples, Parca, and Prevail) and found that this affects 31% to 54% of
programs.
To resolve this in the majority of cases, this patchset adds a fall-back
for mitigating Spectre v1 using speculation barriers. The kernel still
optimistically attempts to verify all speculative paths but uses
speculation barriers against v1 when unsafe behavior is detected. This
allows for more programs to be accepted without disabling the BPF
Spectre mitigations (e.g., by setting cpu_mitigations_off()).
For this, it relies on the fact that speculation barriers prevent all
later instructions from executing if the speculation was not correct:
* On x86_64, lfence acts as a full speculation barrier, not only as a
  load fence [3]:
An LFENCE instruction or a serializing instruction will ensure that
no later instructions execute, even speculatively, until all prior
instructions complete locally. [...] Inserting an LFENCE instruction
after a bounds check prevents later operations from executing before
the bound check completes.
This was experimentally confirmed in [4].
* ARM's SB speculation barrier instruction also affects "any instruction
that appears later in the program order than the barrier" [5].
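Concretely, the pattern these quotes describe is the classic bounds check
hardened with a barrier; a generic x86-64 C illustration (not code from
this series):

unsigned char load_guarded(const unsigned char *arr,
			   unsigned long len, unsigned long idx)
{
	if (idx < len) {
		/* No later instruction executes, even speculatively,
		 * until the bounds check above has resolved. */
		asm volatile("lfence" ::: "memory");
		return arr[idx];	/* cannot be a speculative OOB load */
	}
	return 0;
}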
In [1] we have measured the overhead of this approach relative to having
mitigations off and including the upstream Spectre v4 mitigations. For
event tracing and stack-sampling profilers, we found that mitigations
increase BPF program execution time by 0% to 62%. For the Loxilb network
load balancer, we have measured a 14% slowdown in SCTP performance but
no significant slowdown for TCP. This overhead only applies to programs
that were previously rejected.
I reran the expressiveness-evaluation with v6.14 and made sure the main
results still match those from [1] and [2] (which used v6.5).
Main design decisions are:
* Do not use separate bytecode insns for v1 and v4 barriers. This
simplifies the verifier significantly and has the only downside that
performance on PowerPC is not as high as it could be.
* Allow archs to still disable v1/v4 mitigations separately by setting
bpf_jit_bypass_spec_v1/v4(). This has the benefit that archs can
benefit from improved BPF expressiveness / performance if they are not
vulnerable (e.g., ARM64 for v4 in the kernel).
* Do not remove the empty BPF_NOSPEC implementation for backends for
which it is unknown whether they are vulnerable to Spectre v1.
[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating
Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and
Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
[3] https://www.intel.com/content/www/us/en/developer/articles/technical/softwa…
("Managed Runtime Speculative Execution Side Channel Mitigations")
[4] https://dl.acm.org/doi/pdf/10.1145/3359789.3359837 ("Speculator: a
tool to analyze speculative execution attacks and mitigations" -
Section 4.6 "Stopping Speculative Execution")
[5] https://developer.arm.com/documentation/ddi0597/2020-12/Base-Instructions/S…
("SB - Speculation Barrier - Arm Armv8-A A32/T32 Instruction Set Architecture (2020-12)")
Changes:
* v1 -> v2:
- Drop former commits 9 ("bpf: Return PTR_ERR from push_stack()") and 11
("bpf: Fall back to nospec for spec path verification") as suggested
by Alexei. This series therefore no longer changes push_stack() to
return PTR_ERR.
- Add detailed explanation of how lfence works internally and how it
affects the algorithm.
- Add tests checking that nospec instructions are inserted in expected
locations using __xlated_unpriv as suggested by Eduard (also,
include a fix for __xlated_unpriv)
- Add a test for the mitigations from the description of
commit 9183671af6db ("bpf: Fix leakage under speculation on
mispredicted branches")
- Remove unused variables from do_check[_insn]() as suggested by
Eduard.
- Remove INSN_IDX_MODIFIED to improve readability as suggested by
Eduard. This also causes the nospec_result-check to run (and fail)
for jumping-ops. Add a warning to assert that this check must never
succeed in that case.
- Add details on the safety of patch 10 ("bpf: Allow nospec-protected
var-offset stack access") based on the feedback on v1.
- Rebase to bpf-next-250420
- Link to v1: https://lore.kernel.org/all/20250313172127.1098195-1-luis.gerhorst@fau.de/
* RFC -> v1:
- rebase to bpf-next-250313
- tests: mark expected successes/new errors
- add bpf_jit_bypass_spec_v1/v4() to avoid #ifdef in
bpf_bypass_spec_v1/v4()
- ensure that nospec with v1-support is implemented for archs for
which GCC supports speculation barriers, except for MIPS
- arm64: emit speculation barrier
- powerpc: change nospec to include v1 barrier
- discuss potential security (archs that do not impl. BPF nospec) and
performance (only PowerPC) regressions
- Link to RFC: https://lore.kernel.org/bpf/20250224203619.594724-1-luis.gerhorst@fau.de/
Luis Gerhorst (11):
selftests/bpf: Fix caps for __xlated/jited_unpriv
bpf: Move insn if/else into do_check_insn()
bpf: Return -EFAULT on misconfigurations
bpf: Return -EFAULT on internal errors
bpf, arm64, powerpc: Add bpf_jit_bypass_spec_v1/v4()
bpf, arm64, powerpc: Change nospec to include v1 barrier
bpf: Rename sanitize_stack_spill to nospec_result
bpf: Fall back to nospec for Spectre v1
selftests/bpf: Add test for Spectre v1 mitigation
bpf: Allow nospec-protected var-offset stack access
bpf: Fall back to nospec for sanitization-failures
arch/arm64/net/bpf_jit.h | 5 +
arch/arm64/net/bpf_jit_comp.c | 28 +-
arch/powerpc/net/bpf_jit_comp64.c | 79 ++-
include/linux/bpf.h | 11 +-
include/linux/bpf_verifier.h | 3 +-
include/linux/filter.h | 2 +-
kernel/bpf/core.c | 32 +-
kernel/bpf/verifier.c | 648 ++++++++++--------
tools/testing/selftests/bpf/progs/bpf_misc.h | 4 +
.../selftests/bpf/progs/verifier_and.c | 8 +-
.../selftests/bpf/progs/verifier_bounds.c | 66 +-
.../bpf/progs/verifier_bounds_deduction.c | 45 +-
.../selftests/bpf/progs/verifier_map_ptr.c | 20 +-
.../selftests/bpf/progs/verifier_movsx.c | 16 +-
.../selftests/bpf/progs/verifier_unpriv.c | 65 +-
.../bpf/progs/verifier_value_ptr_arith.c | 101 ++-
tools/testing/selftests/bpf/test_loader.c | 14 +-
.../selftests/bpf/verifier/dead_code.c | 3 +-
tools/testing/selftests/bpf/verifier/jmp32.c | 33 +-
tools/testing/selftests/bpf/verifier/jset.c | 10 +-
20 files changed, 765 insertions(+), 428 deletions(-)
base-commit: 8582d9ab3efdebb88e0cd8beed8e0b9de76443e7
--
2.49.0
The idea behind this series is to comprehensively test the BPF redirection:
BPF_MAP_TYPE_SOCKMAP,
BPF_MAP_TYPE_SOCKHASH
x
sk_msg-to-egress,
sk_msg-to-ingress,
sk_skb-to-egress,
sk_skb-to-ingress
x
AF_INET, SOCK_STREAM,
AF_INET6, SOCK_STREAM,
AF_INET, SOCK_DGRAM,
AF_INET6, SOCK_DGRAM,
AF_UNIX, SOCK_STREAM,
AF_UNIX, SOCK_DGRAM,
AF_VSOCK, SOCK_STREAM,
AF_VSOCK, SOCK_SEQPACKET
A new module is introduced, sockmap_redir: all supported and unsupported
redirect combinations are tested for success and failure respectively. Code
is pretty much stolen/adapted from Jakub Sitnicki's sockmap_redir_matrix.c
[1].
Usage:
$ cd tools/testing/selftests/bpf
$ make
$ sudo ./test_progs -t sockmap_redir
...
Summary: 1/576 PASSED, 0 SKIPPED, 0 FAILED
[1]: https://github.com/jsitnicki/sockmap-redir-matrix/blob/main/sockmap_redir_m…
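For context, the redirection under test is driven by verdict programs of
roughly this shape (a sketch of an sk_msg verdict program using
BPF_F_INGRESS; the map name sock_map is illustrative - see the
prog_msg_verdict() patch below for the real one):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_SOCKMAP);
	__uint(max_entries, 2);
	__type(key, __u32);
	__type(value, __u64);
} sock_map SEC(".maps");

SEC("sk_msg")
int prog_msg_verdict(struct sk_msg_md *msg)
{
	__u32 key = 0;

	/* Redirect the message to the ingress queue of the socket at
	 * sock_map[0]; without BPF_F_INGRESS it would go to egress. */
	return bpf_msg_redirect_map(msg, &sock_map, key, BPF_F_INGRESS);
}

char _license[] SEC("license") = "GPL";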
Changes in v2:
- Verify that the unsupported redirect combos do fail [Jakub]
- Dedup tests in sockmap_listen
- Cosmetic changes and code reordering
- Link to v1: https://lore.kernel.org/bpf/42939687-20f9-4a45-b7c2-342a0e11a014@rbox.co/
Suggested-by: Jakub Sitnicki <jakub(a)cloudflare.com>
Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Michal Luczaj (9):
selftests/bpf: Support af_unix SOCK_DGRAM socket pair creation
selftests/bpf: Add socket_kind_to_str() to socket_helpers
selftests/bpf: Add u32()/u64() to sockmap_helpers
selftests/bpf: Allow setting BPF_F_INGRESS in prog_msg_verdict()
selftests/bpf: Add selftest for sockmap/hashmap redirection
selftests/bpf: sockmap_listen cleanup: Drop af_vsock redir tests
selftests/bpf: sockmap_listen cleanup: Drop af_unix redir tests
selftests/bpf: sockmap_listen cleanup: Drop af_inet SOCK_DGRAM redir tests
docs/bpf: sockmap: Add a missing comma
Documentation/bpf/map_sockmap.rst | 2 +-
.../selftests/bpf/prog_tests/socket_helpers.h | 84 +++-
.../selftests/bpf/prog_tests/sockmap_helpers.h | 25 +-
.../selftests/bpf/prog_tests/sockmap_listen.c | 459 +-------------------
.../selftests/bpf/prog_tests/sockmap_redir.c | 461 +++++++++++++++++++++
.../selftests/bpf/progs/test_sockmap_listen.c | 6 +-
6 files changed, 558 insertions(+), 479 deletions(-)
---
base-commit: a27a97f713947b20ba91b23a3ef77fa92d74171b
change-id: 20240922-selftests-sockmap-redir-5d839396c75e
Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
---
Changes in v9:
1. Add the VM mode VM_MODE_P47V47_16K; LoongArch VMs use this mode by
default, rather than VM_MODE_P36V47_16K.
2. Fix some spelling issues in the changelog.
Changes in v8:
1. Port the patches onto the latest version.
2. For the macro PC_OFFSET_EXREGS, the offsetof() method is used in the C
header file, while a hardcoded definition is kept for assembly.
still hardcoded definition for assemble language.
Changes in v7:
1. Refine code to add LoongArch support in test case
set_memory_region_test.
Changes in v6:
1. Refresh the patches against the latest kernel 6.8-rc1, and add
LoongArch support to the set_memory_region_test test case.
2. Add the hardware_disable_test test case.
3. Drop the modification of the macro DEFAULT_GUEST_TEST_MEM; it is a
problem with LoongArch binutils, and the issue has been raised with the
LoongArch binutils owners.
Changes in v5:
1. In the LoongArch KVM selftests, DEFAULT_GUEST_TEST_MEM can be
0x130000000, which differs from the default value in memstress.h.
So we move the definition of DEFAULT_GUEST_TEST_MEM into the LoongArch
ucall.h, and add an 'ifndef' guard for DEFAULT_GUEST_TEST_MEM
in memstress.h.
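A sketch of that arrangement (paths as described above; the pre-existing
memstress.h value is shown for illustration and may differ):

/* tools/testing/selftests/kvm/include/loongarch/ucall.h */
#define DEFAULT_GUEST_TEST_MEM	0x130000000

/* tools/testing/selftests/kvm/include/memstress.h */
#ifndef DEFAULT_GUEST_TEST_MEM
#define DEFAULT_GUEST_TEST_MEM	0xc0000000
#endif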
Changes in v4:
1. Remove the based-on tag; as the LoongArch KVM patch series has been
accepted into the Linux kernel, this can be applied directly.
Changes in v3:
1. Improve implementation of LoongArch VM page walk.
2. Add exception handler for LoongArch.
3. Add dirty_log_test, dirty_log_perf_test, guest_print_test
test cases for LoongArch.
4. Add the __ASSEMBLER__ macro to distinguish asm files from C files.
5. Move ucall_arch_do_ucall to the header file and make it
static inline to avoid function calls.
6. Change the DEFAULT_GUEST_TEST_MEM base addr for LoongArch.
Changes in v2:
1. Use ".balign 4096" to align the assembly code to 4K in
exception.S instead of ".align 12".
2. LoongArch only supports 3- or 4-level page tables, so we remove the
handlers for 2-level page tables.
3. Remove the DEFAULT_LOONGARCH_GUEST_STACK_VADDR_MIN and use the common
DEFAULT_GUEST_STACK_VADDR_MIN to allocate stack memory in guest.
4. Reorganize the test cases supported by LoongArch.
5. Fix some code comments.
6. Add kvm_binary_stats_test test case into LoongArch KVM selftests.
---
Bibo Mao (5):
KVM: selftests: Add VM_MODE_P47V47_16K vm mode
KVM: selftests: Add KVM selftests header files for LoongArch
KVM: selftests: Add core KVM selftests support for LoongArch
KVM: selftests: Add ucall test support for LoongArch
KVM: selftests: Add test cases for LoongArch
tools/testing/selftests/kvm/Makefile | 2 +-
tools/testing/selftests/kvm/Makefile.kvm | 18 +
.../testing/selftests/kvm/include/kvm_util.h | 6 +
.../kvm/include/loongarch/kvm_util_arch.h | 7 +
.../kvm/include/loongarch/processor.h | 138 +++++++
.../selftests/kvm/include/loongarch/ucall.h | 20 +
tools/testing/selftests/kvm/lib/kvm_util.c | 3 +
.../selftests/kvm/lib/loongarch/exception.S | 59 +++
.../selftests/kvm/lib/loongarch/processor.c | 347 ++++++++++++++++++
.../selftests/kvm/lib/loongarch/ucall.c | 38 ++
.../selftests/kvm/set_memory_region_test.c | 2 +-
11 files changed, 638 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/loongarch/kvm_util_arch.h
create mode 100644 tools/testing/selftests/kvm/include/loongarch/processor.h
create mode 100644 tools/testing/selftests/kvm/include/loongarch/ucall.h
create mode 100644 tools/testing/selftests/kvm/lib/loongarch/exception.S
create mode 100644 tools/testing/selftests/kvm/lib/loongarch/processor.c
create mode 100644 tools/testing/selftests/kvm/lib/loongarch/ucall.c
base-commit: 8ffd015db85fea3e15a77027fda6c02ced4d2444
--
2.39.3
Lei Chen raised an issue with CLOCK_MONOTONIC_COARSE seeing
time inconsistencies.
Lei tracked down that this was being caused by the adjustment
tk->tkr_mono.xtime_nsec -= offset;
which is made to compensate for the unaccumulated cycles in
offset when the mult value is adjusted forward, so that
the non-_COARSE clockids don't see inconsistencies.
However, the _COARSE clockids don't use the mult*offset value
in their calculations, so this subtraction can cause the
_COARSE clock ids to jump back a bit.
Now, by design, this negative adjustment should be fine, because
the logic run from timekeeping_adjust() is done after we
accumulate approx mult*interval_cycles into xtime_nsec.
The accumulated (mult*interval_cycles) will be larger than the
(mult_adj*offset) value subtracted from xtime_nsec, and both
operations are done together under the tk_core.lock, so the net
change to xtime_nsec should always be positive.
However, do_adjtimex() calls into timekeeping_advance() as well,
since we want to apply the ntp freq adjustment immediately.
In this case, we don't return early when the offset is smaller
than interval_cycles, so we don't end up accumulating any time
into xtime_nsec. But we do go on to call timekeeping_adjust(),
which modifies the mult value, and subtracts from xtime_nsec
to correct for the new mult value.
Here because we did not accumulate anything, we have a window
where the _COARSE clockids that don't utilize the mult*offset
value, can see an inconsistency.
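To see why only the _COARSE ids are affected, note that the coarse getters
return the accumulated value directly, with no mult*offset term to mask
the subtraction; a simplified sketch of the coarse read path (not verbatim
kernel code):

/* e.g. ktime_get_coarse_real_ts64(), simplified */
ts->tv_sec  = tk->xtime_sec;
ts->tv_nsec = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift;
/* a bare "xtime_nsec -= offset" is directly visible here, while the
 * non-coarse getters add the mult*offset delta back in */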
So to fix this, rework the timekeeping_advance() logic a bit
so that when we are called from do_adjtimex(), we call
timekeeping_forward(), to first accumulate the sub-interval
time into xtime_nsec. Then with no unaccumulated cycles in
offset, we can do the mult adjustment without worry of the
subtraction having an impact.
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Stephen Boyd <sboyd(a)kernel.org>
Cc: Anna-Maria Behnsen <anna-maria(a)linutronix.de>
Cc: Frederic Weisbecker <frederic(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Miroslav Lichvar <mlichvar(a)redhat.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: kernel-team(a)android.com
Cc: Lei Chen <lei.chen(a)smartx.com>
Fixes: da15cfdae033 ("time: Introduce CLOCK_REALTIME_COARSE")
Reported-by: Lei Chen <lei.chen(a)smartx.com>
Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/
Diagnosed-by: Thomas Gleixner <tglx(a)linutronix.de>
Additional-fixes-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: John Stultz <jstultz(a)google.com>
---
v2: Include fixes from Thomas, dropping the unnecessary clock_set
setting, and instead clearing ntp_error, along with some other
minor tweaks.
---
kernel/time/timekeeping.c | 94 ++++++++++++++++++++++++++++-----------
1 file changed, 69 insertions(+), 25 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 1e67d076f1955..929846b8b45ab 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -682,20 +682,19 @@ static void timekeeping_update_from_shadow(struct tk_data *tkd, unsigned int act
}
/**
- * timekeeping_forward_now - update clock to the current time
+ * timekeeping_forward - update clock to given cycle now value
* @tk: Pointer to the timekeeper to update
+ * @cycle_now: Current clocksource read value
*
* Forward the current clock to update its state since the last call to
* update_wall_time(). This is useful before significant clock changes,
* as it avoids having to deal with this time offset explicitly.
*/
-static void timekeeping_forward_now(struct timekeeper *tk)
+static void timekeeping_forward(struct timekeeper *tk, u64 cycle_now)
{
- u64 cycle_now, delta;
+ u64 delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last, tk->tkr_mono.mask,
+ tk->tkr_mono.clock->max_raw_delta);
- cycle_now = tk_clock_read(&tk->tkr_mono);
- delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last, tk->tkr_mono.mask,
- tk->tkr_mono.clock->max_raw_delta);
tk->tkr_mono.cycle_last = cycle_now;
tk->tkr_raw.cycle_last = cycle_now;
@@ -710,6 +709,21 @@ static void timekeeping_forward_now(struct timekeeper *tk)
}
}
+/**
+ * timekeeping_forward_now - update clock to the current time
+ * @tk: Pointer to the timekeeper to update
+ *
+ * Forward the current clock to update its state since the last call to
+ * update_wall_time(). This is useful before significant clock changes,
+ * as it avoids having to deal with this time offset explicitly.
+ */
+static void timekeeping_forward_now(struct timekeeper *tk)
+{
+ u64 cycle_now = tk_clock_read(&tk->tkr_mono);
+
+ timekeeping_forward(tk, cycle_now);
+}
+
/**
* ktime_get_real_ts64 - Returns the time of day in a timespec64.
* @ts: pointer to the timespec to be set
@@ -2151,6 +2165,54 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
return offset;
}
+static u64 timekeeping_accumulate(struct timekeeper *tk, u64 offset,
+ enum timekeeping_adv_mode mode,
+ unsigned int *clock_set)
+{
+ int shift = 0, maxshift;
+
+ /*
+ * TK_ADV_FREQ indicates that adjtimex(2) directly set the
+ * frequency or the tick length.
+ *
+ * Accumulate the offset, so that the new multiplier starts from
+ * now. This is required as otherwise for offsets, which are
+ * smaller than tk::cycle_interval, timekeeping_adjust() could set
+ * xtime_nsec backwards, which subsequently causes time going
+ * backwards in the coarse time getters. But even for the case
+ * where offset is greater than tk::cycle_interval the periodic
+ * accumulation does not have much value.
+ *
+ * Also reset tk::ntp_error as it does not make sense to keep the
+ * old accumulated error around in this case.
+ */
+ if (mode == TK_ADV_FREQ) {
+ timekeeping_forward(tk, tk->tkr_mono.cycle_last + offset);
+ tk->ntp_error = 0;
+ return 0;
+ }
+
+ /*
+ * With NO_HZ we may have to accumulate many cycle_intervals
+ * (think "ticks") worth of time at once. To do this efficiently,
+ * we calculate the largest doubling multiple of cycle_intervals
+ * that is smaller than the offset. We then accumulate that
+ * chunk in one go, and then try to consume the next smaller
+ * doubled multiple.
+ */
+ shift = ilog2(offset) - ilog2(tk->cycle_interval);
+ shift = max(0, shift);
+ /* Bound shift to one less than what overflows tick_length */
+ maxshift = (64 - (ilog2(ntp_tick_length()) + 1)) - 1;
+ shift = min(shift, maxshift);
+ while (offset >= tk->cycle_interval) {
+ offset = logarithmic_accumulation(tk, offset, shift, clock_set);
+ if (offset < tk->cycle_interval << shift)
+ shift--;
+ }
+ return offset;
+}
+
/*
* timekeeping_advance - Updates the timekeeper to the current time and
* current NTP tick length
@@ -2160,7 +2222,6 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct timekeeper *real_tk = &tk_core.timekeeper;
unsigned int clock_set = 0;
- int shift = 0, maxshift;
u64 offset;
guard(raw_spinlock_irqsave)(&tk_core.lock);
@@ -2177,24 +2238,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
if (offset < real_tk->cycle_interval && mode == TK_ADV_TICK)
return false;
- /*
- * With NO_HZ we may have to accumulate many cycle_intervals
- * (think "ticks") worth of time at once. To do this efficiently,
- * we calculate the largest doubling multiple of cycle_intervals
- * that is smaller than the offset. We then accumulate that
- * chunk in one go, and then try to consume the next smaller
- * doubled multiple.
- */
- shift = ilog2(offset) - ilog2(tk->cycle_interval);
- shift = max(0, shift);
- /* Bound shift to one less than what overflows tick_length */
- maxshift = (64 - (ilog2(ntp_tick_length())+1)) - 1;
- shift = min(shift, maxshift);
- while (offset >= tk->cycle_interval) {
- offset = logarithmic_accumulation(tk, offset, shift, &clock_set);
- if (offset < tk->cycle_interval<<shift)
- shift--;
- }
+ offset = timekeeping_accumulate(tk, offset, mode, &clock_set);
/* Adjust the multiplier to correct NTP error */
timekeeping_adjust(tk, offset);
--
2.49.0.395.g12beb8f557-goog