March 2025 - Linux-kselftest-mirror

[PATCH v2] selftests/mm/cow: Fix the incorrect error handling

by Cyan Yang

There is an error handling did not check the correct return value. This patch will fix it. Fixes: f4b5fd6946e244cdedc3bbb9a1f24c8133b2077a ("selftests/vm: anon_cow: THP tests") Signed-off-by: Cyan Yang <cyan.yang(a)sifive.com> --- tools/testing/selftests/mm/cow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c index 9446673645eb..f0cb14ea8608 100644 --- a/tools/testing/selftests/mm/cow.c +++ b/tools/testing/selftests/mm/cow.c @@ -876,7 +876,7 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run, size_t thpsize) mremap_size = thpsize / 2; mremap_mem = mmap(NULL, mremap_size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); - if (mem == MAP_FAILED) { + if (mremap_mem == MAP_FAILED) { ksft_test_result_fail("mmap() failed\n"); goto munmap; } -- 2.39.5 (Apple Git-154)

4 months, 2 weeks

4
3
0 0

[PATCH v3 0/6] rust: reduce pointer casts, enable related lints

by Tamir Duberstein

This started with a patch that enabled `clippy::ptr_as_ptr`. Benno Lossin suggested I also look into `clippy::ptr_cast_constness` and I discovered `clippy::as_ptr_cast_mut`. This series now enables all 3 lints. It also enables `clippy::as_underscore` which ensures other pointer casts weren't missed. The first commit reduces the need for pointer casts and is shared with another series[1]. The final patch also enables pointer provenance lints and fixes violations. See that commit message for details. The build system portion of that commit is pretty messy but I couldn't find a better way to convincingly ensure that these lints were applied globally. Suggestions would be very welcome. Link: https://lore.kernel.org/all/20250307-no-offset-v1-0-0c728f63b69c@gmail.com/ [1] Signed-off-by: Tamir Duberstein <tamird(a)gmail.com> --- Changes in v3: - Fixed clippy warning in rust/kernel/firmware.rs. (kernel test robot) Link: https://lore.kernel.org/all/202503120332.YTCpFEvv-lkp@intel.com/ - s/as u64/as bindings::phys_addr_t/g. (Benno Lossin) - Use strict provenance APIs and enable lints. (Benno Lossin) - Link to v2: https://lore.kernel.org/r/20250309-ptr-as-ptr-v2-0-25d60ad922b7@gmail.com Changes in v2: - Fixed typo in first commit message. - Added additional patches, converted to series. - Link to v1: https://lore.kernel.org/r/20250307-ptr-as-ptr-v1-1-582d06514c98@gmail.com --- Tamir Duberstein (6): rust: retain pointer mut-ness in `container_of!` rust: enable `clippy::ptr_as_ptr` lint rust: enable `clippy::ptr_cast_constness` lint rust: enable `clippy::as_ptr_cast_mut` lint rust: enable `clippy::as_underscore` lint rust: use strict provenance APIs Makefile | 13 ++++++++++++- init/Kconfig | 3 +++ rust/Makefile | 26 ++++++++++++++++++++------ rust/bindings/lib.rs | 1 + rust/kernel/alloc.rs | 2 +- rust/kernel/alloc/allocator_test.rs | 2 +- rust/kernel/alloc/kvec.rs | 4 ++-- rust/kernel/block/mq/operations.rs | 2 +- rust/kernel/block/mq/request.rs | 7 ++++--- rust/kernel/device.rs | 5 +++-- rust/kernel/device_id.rs | 2 +- rust/kernel/devres.rs | 19 ++++++++++--------- rust/kernel/error.rs | 2 +- rust/kernel/firmware.rs | 3 ++- rust/kernel/fs/file.rs | 2 +- rust/kernel/io.rs | 16 ++++++++-------- rust/kernel/kunit.rs | 15 +++++++-------- rust/kernel/lib.rs | 25 ++++++++++++++++++++++--- rust/kernel/list/impl_list_item_mod.rs | 2 +- rust/kernel/miscdevice.rs | 2 +- rust/kernel/of.rs | 6 +++--- rust/kernel/pci.rs | 15 +++++++++------ rust/kernel/platform.rs | 6 ++++-- rust/kernel/print.rs | 11 +++++------ rust/kernel/rbtree.rs | 23 ++++++++++------------- rust/kernel/seq_file.rs | 3 ++- rust/kernel/str.rs | 18 +++++++----------- rust/kernel/sync/poll.rs | 2 +- rust/kernel/uaccess.rs | 12 ++++++++---- rust/kernel/workqueue.rs | 12 ++++++------ rust/uapi/lib.rs | 1 + scripts/Makefile.build | 2 +- scripts/Makefile.host | 4 ++++ 33 files changed, 163 insertions(+), 105 deletions(-) --- base-commit: a1eb95d6b5f4cf5cc7b081e85e374d1dd98a213b change-id: 20250307-ptr-as-ptr-21b1867fc4d4 Best regards, -- Tamir Duberstein <tamird(a)gmail.com>

4 months, 3 weeks

5
19
0 0

[PATCH RFC 8/8] selftests/sched_ext: Add test for sched_ext dl_server

by Joel Fernandes

From: Andrea Righi <arighi(a)nvidia.com> Add a selftest to validate the correct behavior of the deadline server for the ext_sched_class. [ Joel: Replaced occurences of CFS in the test with EXT. ] Signed-off-by: Joel Fernandes <joelagnelf(a)nvidia.com> Signed-off-by: Andrea Righi <arighi(a)nvidia.com> --- tools/testing/selftests/sched_ext/Makefile | 1 + .../selftests/sched_ext/rt_stall.bpf.c | 23 ++ tools/testing/selftests/sched_ext/rt_stall.c | 213 ++++++++++++++++++ 3 files changed, 237 insertions(+) create mode 100644 tools/testing/selftests/sched_ext/rt_stall.bpf.c create mode 100644 tools/testing/selftests/sched_ext/rt_stall.c diff --git a/tools/testing/selftests/sched_ext/Makefile b/tools/testing/selftests/sched_ext/Makefile index 011762224600..802e3d8d038f 100644 --- a/tools/testing/selftests/sched_ext/Makefile +++ b/tools/testing/selftests/sched_ext/Makefile @@ -180,6 +180,7 @@ auto-test-targets := \ select_cpu_dispatch_bad_dsq \ select_cpu_dispatch_dbl_dsp \ select_cpu_vtime \ + rt_stall \ test_example \ testcase-targets := $(addsuffix .o,$(addprefix $(SCXOBJ_DIR)/,$(auto-test-targets))) diff --git a/tools/testing/selftests/sched_ext/rt_stall.bpf.c b/tools/testing/selftests/sched_ext/rt_stall.bpf.c new file mode 100644 index 000000000000..80086779dd1e --- /dev/null +++ b/tools/testing/selftests/sched_ext/rt_stall.bpf.c @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * A scheduler that verified if RT tasks can stall SCHED_EXT tasks. + * + * Copyright (c) 2025 NVIDIA Corporation. + */ + +#include <scx/common.bpf.h> + +char _license[] SEC("license") = "GPL"; + +UEI_DEFINE(uei); + +void BPF_STRUCT_OPS(rt_stall_exit, struct scx_exit_info *ei) +{ + UEI_RECORD(uei, ei); +} + +SEC(".struct_ops.link") +struct sched_ext_ops rt_stall_ops = { + .exit = (void *)rt_stall_exit, + .name = "rt_stall", +}; diff --git a/tools/testing/selftests/sched_ext/rt_stall.c b/tools/testing/selftests/sched_ext/rt_stall.c new file mode 100644 index 000000000000..d4cb545ebfd8 --- /dev/null +++ b/tools/testing/selftests/sched_ext/rt_stall.c @@ -0,0 +1,213 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2025 NVIDIA Corporation. + */ +#define _GNU_SOURCE +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <sched.h> +#include <sys/prctl.h> +#include <sys/types.h> +#include <sys/wait.h> +#include <time.h> +#include <linux/sched.h> +#include <signal.h> +#include <bpf/bpf.h> +#include <scx/common.h> +#include <sys/wait.h> +#include <unistd.h> +#include "rt_stall.bpf.skel.h" +#include "scx_test.h" +#include "../kselftest.h" + +#define CORE_ID 0 /* CPU to pin tasks to */ +#define RUN_TIME 5 /* How long to run the test in seconds */ + +/* Simple busy-wait function for test tasks */ +static void process_func(void) +{ + while (1) { + /* Busy wait */ + for (volatile unsigned long i = 0; i < 10000000UL; i++); + } +} + +/* Set CPU affinity to a specific core */ +static void set_affinity(int cpu) +{ + cpu_set_t mask; + + CPU_ZERO(&mask); + CPU_SET(cpu, &mask); + if (sched_setaffinity(0, sizeof(mask), &mask) != 0) { + perror("sched_setaffinity"); + exit(EXIT_FAILURE); + } +} + +/* Set task scheduling policy and priority */ +static void set_sched(int policy, int priority) +{ + struct sched_param param; + + param.sched_priority = priority; + if (sched_setscheduler(0, policy, &param) != 0) { + perror("sched_setscheduler"); + exit(EXIT_FAILURE); + } +} + +/* Get process runtime from /proc/<pid>/stat */ +static float get_process_runtime(int pid) +{ + char path[256]; + FILE *file; + long utime, stime; + int fields; + + snprintf(path, sizeof(path), "/proc/%d/stat", pid); + file = fopen(path, "r"); + if (file == NULL) { + perror("Failed to open stat file"); + return -1; + } + + /* Skip the first 13 fields and read the 14th and 15th */ + fields = fscanf(file, + "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu", + &utime, &stime); + fclose(file); + + if (fields != 2) { + fprintf(stderr, "Failed to read stat file\n"); + return -1; + } + + /* Calculate the total time spent in the process */ + long total_time = utime + stime; + long ticks_per_second = sysconf(_SC_CLK_TCK); + float runtime_seconds = total_time * 1.0 / ticks_per_second; + + return runtime_seconds; +} + +static enum scx_test_status setup(void **ctx) +{ + struct rt_stall *skel; + + skel = rt_stall__open(); + SCX_FAIL_IF(!skel, "Failed to open"); + SCX_ENUM_INIT(skel); + SCX_FAIL_IF(rt_stall__load(skel), "Failed to load skel"); + + *ctx = skel; + + return SCX_TEST_PASS; +} + +static bool sched_stress_test(void) +{ + float cfs_runtime, rt_runtime; + int cfs_pid, rt_pid; + float expected_min_ratio = 0.04; /* 4% */ + + ksft_print_header(); + ksft_set_plan(1); + + /* Create and set up a EXT task */ + cfs_pid = fork(); + if (cfs_pid == 0) { + set_affinity(CORE_ID); + process_func(); + exit(0); + } else if (cfs_pid < 0) { + perror("fork for EXT task"); + ksft_exit_fail(); + } + + /* Create an RT task */ + rt_pid = fork(); + if (rt_pid == 0) { + set_affinity(CORE_ID); + set_sched(SCHED_FIFO, 50); + process_func(); + exit(0); + } else if (rt_pid < 0) { + perror("fork for RT task"); + ksft_exit_fail(); + } + + /* Let the processes run for the specified time */ + sleep(RUN_TIME); + + /* Get runtime for the EXT task */ + cfs_runtime = get_process_runtime(cfs_pid); + if (cfs_runtime != -1) + ksft_print_msg("Runtime of EXT task (PID %d) is %f seconds\n", cfs_pid, cfs_runtime); + else + ksft_exit_fail_msg("Error getting runtime for EXT task (PID %d)\n", cfs_pid); + + /* Get runtime for the RT task */ + rt_runtime = get_process_runtime(rt_pid); + if (rt_runtime != -1) + ksft_print_msg("Runtime of RT task (PID %d) is %f seconds\n", rt_pid, rt_runtime); + else + ksft_exit_fail_msg("Error getting runtime for RT task (PID %d)\n", rt_pid); + + /* Kill the processes */ + kill(cfs_pid, SIGKILL); + kill(rt_pid, SIGKILL); + waitpid(cfs_pid, NULL, 0); + waitpid(rt_pid, NULL, 0); + + /* Verify that the scx task got enough runtime */ + float actual_ratio = cfs_runtime / (cfs_runtime + rt_runtime); + ksft_print_msg("EXT task got %.2f%% of total runtime\n", actual_ratio * 100); + + if (actual_ratio >= expected_min_ratio) { + ksft_test_result_pass("PASS: EXT task got more than %.2f%% of runtime\n", + expected_min_ratio * 100); + return true; + } else { + ksft_test_result_fail("FAIL: EXT task got less than %.2f%% of runtime\n", + expected_min_ratio * 100); + return false; + } +} + +static enum scx_test_status run(void *ctx) +{ + struct rt_stall *skel = ctx; + struct bpf_link *link; + bool res; + + link = bpf_map__attach_struct_ops(skel->maps.rt_stall_ops); + SCX_FAIL_IF(!link, "Failed to attach scheduler"); + + res = sched_stress_test(); + + SCX_EQ(skel->data->uei.kind, EXIT_KIND(SCX_EXIT_NONE)); + bpf_link__destroy(link); + + if (!res) + ksft_exit_fail(); + + return SCX_TEST_PASS; +} + +static void cleanup(void *ctx) +{ + struct rt_stall *skel = ctx; + + rt_stall__destroy(skel); +} + +struct scx_test rt_stall = { + .name = "rt_stall", + .description = "Verify that RT tasks cannot stall SCHED_EXT tasks", + .setup = setup, + .run = run, + .cleanup = cleanup, +}; +REGISTER_SCX_TEST(&rt_stall) -- 2.43.0

4 months, 3 weeks

1
0
0 0

[PATCH 0/4] ntsync: some small fixes for doc and selftests

by Su Hui

There are four small fixes for ntsync test and doc. I divided these into four different patches due to different types of errors. If one patch is better, I can do it too. Su Hui (4): selftests: ntsync: fix the wrong condition in wake_all selftests: ntsync: avoid possible overflow in 32-bit machine selftests: ntsync: update config docs: ntsync: update NTSYNC_IOC_* Documentation/userspace-api/ntsync.rst | 18 +++++++++--------- tools/testing/selftests/drivers/ntsync/config | 2 +- .../testing/selftests/drivers/ntsync/ntsync.c | 6 +++--- 3 files changed, 13 insertions(+), 13 deletions(-) -- 2.30.2

4 months, 3 weeks

3
7
0 0

[PATCH v3 00/10] selftests/mm: Some cleanups from trying to run them

by Brendan Jackman

I never had much luck running mm selftests so I spent a few hours digging into why. Looks like most of the reason is missing SKIP checks, so this series is just adding a bunch of those that I found. I did not do anything like all of them, just the ones I spotted in gup_longterm, gup_test, mmap, userfaultfd and memfd_secret. It's a bit unfortunate to have to skip those tests when ftruncate() fails, but I don't have time to dig deep enough into it to actually make them pass. I have observed the issue on 9pfs and heard rumours that NFS has a similar problem. I'm now able to run these test groups successfully: - mmap - gup_test - compaction - migration - page_frag - userfaultfd Signed-off-by: Brendan Jackman <jackmanb(a)google.com> --- Changes in v3: - Added fix for userfaultfd tests. - Dropped attempts to use sudo. - Fixed garbage printf in uffd-stress. (Added EXTRA_CFLAGS=-Werror FORCE_TARGETS=1 to my scripts to prevent such errors happening again). - Fixed missing newlines in ksft_test_result_skip() calls. - Link to v2: https://lore.kernel.org/r/20250221-mm-selftests-v2-0-28c4d66383c5@google.com Changes in v2 (Thanks to Dev for the reviews): - Improve and cleanup some error messages - Add some extra SKIPs - Fix misnaming of nr_cpus variable in uffd tests - Link to v1: https://lore.kernel.org/r/20250220-mm-selftests-v1-0-9bbf57d64463@google.com --- Brendan Jackman (10): selftests/mm: Report errno when things fail in gup_longterm selftests/mm: Skip uffd-stress if userfaultfd not available selftests/mm: Skip uffd-wp-mremap if userfaultfd not available selftests/mm/uffd: Rename nr_cpus -> nr_threads selftests/mm: Print some details when uffd-stress gets bad params selftests/mm: Don't fail uffd-stress if too many CPUs selftests/mm: Skip map_populate on weird filesystems selftests/mm: Skip gup_longerm tests on weird filesystems selftests/mm: Drop unnecessary sudo usage selftests/mm: Ensure uffd-wp-mremap gets pages of each size tools/testing/selftests/mm/gup_longterm.c | 45 ++++++++++++++++++---------- tools/testing/selftests/mm/map_populate.c | 7 +++++ tools/testing/selftests/mm/run_vmtests.sh | 25 ++++++++++++++-- tools/testing/selftests/mm/uffd-common.c | 8 ++--- tools/testing/selftests/mm/uffd-common.h | 2 +- tools/testing/selftests/mm/uffd-stress.c | 42 ++++++++++++++++---------- tools/testing/selftests/mm/uffd-unit-tests.c | 2 +- tools/testing/selftests/mm/uffd-wp-mremap.c | 5 +++- 8 files changed, 95 insertions(+), 41 deletions(-) --- base-commit: 76544811c850a1f4c055aa182b513b7a843868ea change-id: 20250220-mm-selftests-2d7d0542face Best regards, -- Brendan Jackman <jackmanb(a)google.com>

4 months, 3 weeks

4
36
0 0

[RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

by Nikita Kalyazin

This series is built on top of the v3 write syscall support [1]. With James's KVM userfault [2], it is possible to handle stage-2 faults in guest_memfd in userspace. However, KVM itself also triggers faults in guest_memfd in some cases, for example: PV interfaces like kvmclock, PV EOI and page table walking code when fetching the MMIO instruction on x86. It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3] that KVM would be accessing those pages via userspace page tables. In order for such faults to be handled in userspace, guest_memfd needs to support userfaultfd. This series proposes a limited support for userfaultfd in guest_memfd: - userfaultfd support is conditional to `CONFIG_KVM_GMEM_SHARED_MEM` (as is fault support in general) - Only `page missing` event is currently supported - Userspace is supposed to respond to the event with the `write` syscall followed by `UFFDIO_CONTINUE` ioctl to unblock the faulting process. Note that we can't use `UFFDIO_COPY` here because userfaulfd code does not know how to prepare guest_memfd pages, eg remove them from direct map [4]. Not included in this series: - Proper interface for userfaultfd to recognise guest_memfd mappings - Proper handling of truncation cases after locking the page Request for comments: - Is it a sensible workflow for guest_memfd to resolve a userfault `page missing` event with `write` syscall + `UFFDIO_CONTINUE`? One of the alternatives is teaching `UFFDIO_COPY` how to deal with guest_memfd pages. - What is a way forward to make userfaultfd code aware of guest_memfd? I saw that Patrick hit a somewhat similar problem in [5] when trying to use direct map manipulation functions in KVM and was pointed by David at Elliot's guestmem library [6] that might include a shim for that. Would the library be the right place to expose required interfaces like `vma_is_gmem`? Nikita [1] https://lore.kernel.org/kvm/20250303130838.28812-1-kalyazin@amazon.com/T/ [2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/… [3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAo… [4] https://lore.kernel.org/kvm/20250221160728.1584559-1-roypat@amazon.co.uk/T/ [4] https://lore.kernel.org/kvm/20250221160728.1584559-1-roypat@amazon.co.uk/T/… [5] https://lore.kernel.org/kvm/20241122-guestmem-library-v5-2-450e92951a15@qui… Nikita Kalyazin (5): KVM: guest_memfd: add kvm_gmem_vma_is_gmem KVM: guest_memfd: add support for uffd missing mm: userfaultfd: allow to register userfaultfd for guest_memfd mm: userfaultfd: support continue for guest_memfd KVM: selftests: add uffd missing test for guest_memfd include/linux/userfaultfd_k.h | 9 ++ mm/userfaultfd.c | 23 ++++- .../testing/selftests/kvm/guest_memfd_test.c | 88 +++++++++++++++++++ virt/kvm/guest_memfd.c | 17 +++- virt/kvm/kvm_mm.h | 1 + 5 files changed, 136 insertions(+), 2 deletions(-) base-commit: 592e7531753dc4b711f96cd1daf808fd493d3223 -- 2.47.1

4 months, 3 weeks

3
21
0 0

[PATCH v11 00/27] riscv control-flow integrity for usermode

by Deepak Gupta

Basics and overview =================== Software with larger attack surfaces (e.g. network facing apps like databases, browsers or apps relying on browser runtimes) suffer from memory corruption issues which can be utilized by attackers to bend control flow of the program to eventually gain control (by making their payload executable). Attackers are able to perform such attacks by leveraging call-sites which rely on indirect calls or return sites which rely on obtaining return address from stack memory. To mitigate such attacks, risc-v extension zicfilp enforces that all indirect calls must land on a landing pad instruction `lpad` else cpu will raise software check exception (a new cpu exception cause code on riscv). Similarly for return flow, risc-v extension zicfiss extends architecture with - `sspush` instruction to push return address on a shadow stack - `sspopchk` instruction to pop return address from shadow stack and compare with input operand (i.e. return address on stack) - `sspopchk` to raise software check exception if comparision above was a mismatch - Protection mechanism using which shadow stack is not writeable via regular store instructions More information an details can be found at extensions github repo [1]. Equivalent to landing pad (zicfilp) on x86 is `ENDBRANCH` instruction in Intel CET [3] and branch target identification (BTI) [4] on arm. Similarly x86's Intel CET has shadow stack [5] and arm64 has guarded control stack (GCS) [6] which are very similar to risc-v's zicfiss shadow stack. x86 and arm64 support for user mode shadow stack is already in mainline. Kernel awareness for user control flow integrity ================================================ This series picks up Samuel Holland's envcfg changes [2] as well. So if those are being applied independently, they should be removed from this series. Enabling: In order to maintain compatibility and not break anything in user mode, kernel doesn't enable control flow integrity cpu extensions on binary by default. Instead exposes a prctl interface to enable, disable and lock the shadow stack or landing pad feature for a task. This allows userspace (loader) to enumerate if all objects in its address space are compiled with shadow stack and landing pad support and accordingly enable the feature. Additionally if a subsequent `dlopen` happens on a library, user mode can take a decision again to disable the feature (if incoming library is not compiled with support) OR terminate the task (if user mode policy is strict to have all objects in address space to be compiled with control flow integirty cpu feature). prctl to enable shadow stack results in allocating shadow stack from virtual memory and activating for user address space. x86 and arm64 are also following same direction due to similar reason(s). clone/fork: On clone and fork, cfi state for task is inherited by child. Shadow stack is part of virtual memory and is a writeable memory from kernel perspective (writeable via a restricted set of instructions aka shadow stack instructions) Thus kernel changes ensure that this memory is converted into read-only when fork/clone happens and COWed when fault is taken due to sspush, sspopchk or ssamoswap. In case `CLONE_VM` is specified and shadow stack is to be enabled, kernel will automatically allocate a shadow stack for that clone call. map_shadow_stack: x86 introduced `map_shadow_stack` system call to allow user space to explicitly map shadow stack memory in its address space. It is useful to allocate shadow for different contexts managed by a single thread (green threads or contexts) risc-v implements this system call as well. signal management: If shadow stack is enabled for a task, kernel performs an asynchronous control flow diversion to deliver the signal and eventually expects userspace to issue sigreturn so that original execution can be resumed. Even though resume context is prepared by kernel, it is in user space memory and is subject to memory corruption and corruption bugs can be utilized by attacker in this race window to perform arbitrary sigreturn and eventually bypass cfi mechanism. Another issue is how to ensure that cfi related state on sigcontext area is not trampled by legacy apps or apps compiled with old kernel headers. In order to mitigate control-flow hijacting, kernel prepares a token and place it on shadow stack before signal delivery and places address of token in sigcontext structure. During sigreturn, kernel obtains address of token from sigcontext struture, reads token from shadow stack and validates it and only then allow sigreturn to succeed. Compatiblity issue is solved by adopting dynamic sigcontext management introduced for vector extension. This series re-factor the code little bit to allow future sigcontext management easy (as proposed by Andy Chiu from SiFive) config and compilation: Introduce a new risc-v config option `CONFIG_RISCV_USER_CFI`. Selecting this config option picks the kernel support for user control flow integrity. This optin is presented only if toolchain has shadow stack and landing pad support. And is on purpose guarded by toolchain support. Reason being that eventually vDSO also needs to be compiled in with shadow stack and landing pad support. vDSO compile patches are not included as of now because landing pad labeling scheme is yet to settle for usermode runtime. To get more information on kernel interactions with respect to zicfilp and zicfiss, patch series adds documentation for `zicfilp` and `zicfiss` in following: Documentation/arch/riscv/zicfiss.rst Documentation/arch/riscv/zicfilp.rst How to test this series ======================= Toolchain --------- $ git clone git@github.com:sifive/riscv-gnu-toolchain.git -b cfi-dev $ riscv-gnu-toolchain/configure --prefix=<path-to-where-to-build> --with-arch=rv64gc_zicfilp_zicfiss --enable-linux --disable-gdb --with-extra-multilib-test="rv64gc_zicfilp_zicfiss-lp64d:-static" $ make -j$(nproc) Qemu ---- Get the lastest qemu $ cd qemu $ mkdir build $ cd build $ ../configure --target-list=riscv64-softmmu $ make -j$(nproc) Opensbi ------- $ git clone git@github.com:deepak0414/opensbi.git -b v6_cfi_spec_split_opensbi $ make CROSS_COMPILE=<your riscv toolchain> -j$(nproc) PLATFORM=generic Linux ----- Running defconfig is fine. CFI is enabled by default if the toolchain supports it. $ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) defconfig $ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) In case you're building your own rootfs using toolchain, please make sure you pick following patch to ensure that vDSO compiled with lpad and shadow stack. "arch/riscv: compile vdso with landing pad" Branch where above patch can be picked https://github.com/deepak0414/linux-riscv-cfi/tree/vdso_user_cfi_v6.12-rc1 Running ------- Modify your qemu command to have: -bios <path-to-cfi-opensbi>/build/platform/generic/firmware/fw_dynamic.bin -cpu rv64,zicfilp=true,zicfiss=true,zimop=true,zcmop=true vDSO related Opens (in the flux) ================================= I am listing these opens for laying out plan and what to expect in future patch sets. And of course for the sake of discussion. Shadow stack and landing pad enabling in vDSO ---------------------------------------------- vDSO must have shadow stack and landing pad support compiled in for task to have shadow stack and landing pad support. This patch series doesn't enable that (yet). Enabling shadow stack support in vDSO should be straight forward (intend to do that in next versions of patch set). Enabling landing pad support in vDSO requires some collaboration with toolchain folks to follow a single label scheme for all object binaries. This is necessary to ensure that all indirect call-sites are setting correct label and target landing pads are decorated with same label scheme. How many vDSOs --------------- Shadow stack instructions are carved out of zimop (may be operations) and if CPU doesn't implement zimop, they're illegal instructions. Kernel could be running on a CPU which may or may not implement zimop. And thus kernel will have to carry 2 different vDSOs and expose the appropriate one depending on whether CPU implements zimop or not. References ========== [1] - https://github.com/riscv/riscv-cfi [2] - https://lore.kernel.org/all/20240814081126.956287-1-samuel.holland@sifive.c… [3] - https://lwn.net/Articles/889475/ [4] - https://developer.arm.com/documentation/109576/0100/Branch-Target-Identific… [5] - https://www.intel.com/content/dam/develop/external/us/en/documents/catc17-i… [6] - https://lwn.net/Articles/940403/ --- changelog --------- v11: - patch "arch/riscv: compile vdso with landing pad" was unconditionally selecting `_zicfilp` for vDSO compile. fixed that. Changed `lpad 1` to to `lpad 0`. v10: - dropped "mm: helper `is_shadow_stack_vma` to check shadow stack vma". This patch is not that interesting to this patch series for risc-v. There are instances in arch directories where VM_SHADOW_STACK flag is anyways used. Dropping this patch to expedite merging in riscv tree. - Took suggestions from `Clement` on "riscv: zicfiss / zicfilp enumeration" to validate presence of cfi based on config. - Added a patch for vDSO to have `lpad 0`. I had omitted this earlier to make sure we add single vdso object with cfi enabled. But a vdso object with scheme of zero labeled landing pad is least common denominator and should work with all objects of zero labeled as well as function-signature labeled objects. v9: - rebased on master (39a803b754d5 fix braino in "9p: fix ->rename_sem exclusion") - dropped "mm: Introduce ARCH_HAS_USER_SHADOW_STACK" (master has it from arm64/gcs) - dropped "prctl: arch-agnostic prctl for shadow stack" (master has it from arm64/gcs) v8: - rebased on palmer/for-next - dropped samuel holland's `envcfg` context switch patches. they are in parlmer/for-next v7: - Removed "riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv" Instead using `deactivate_mm` flow to clean up. see here for more context https://lore.kernel.org/all/20230908203655.543765-1-rick.p.edgecombe@intel.… - Changed the header include in `kselftest`. Hopefully this fixes compile issue faced by Zong Li at SiFive. - Cleaned up an orphaned change to `mm/mmap.c` in below patch "riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE" - Lock interfaces for shadow stack and indirect branch tracking expect arg == 0 Any future evolution of this interface should accordingly define how arg should be setup. - `mm/map.c` has an instance of using `VM_SHADOW_STACK`. Fixed it to use helper `is_shadow_stack_vma`. - Link to v6: https://lore.kernel.org/r/20241008-v5_user_cfi_series-v6-0-60d9fe073f37@riv… v6: - Picked up Samuel Holland's changes as is with `envcfg` placed in `thread` instead of `thread_info` - fixed unaligned newline escapes in kselftest - cleaned up messages in kselftest and included test output in commit message - fixed a bug in clone path reported by Zong Li - fixed a build issue if CONFIG_RISCV_ISA_V is not selected (this was introduced due to re-factoring signal context management code) v5: - rebased on v6.12-rc1 - Fixed schema related issues in device tree file - Fixed some of the documentation related issues in zicfilp/ss.rst (style issues and added index) - added `SHADOW_STACK_SET_MARKER` so that implementation can define base of shadow stack. - Fixed warnings on definitions added in usercfi.h when CONFIG_RISCV_USER_CFI is not selected. - Adopted context header based signal handling as proposed by Andy Chiu - Added support for enabling kernel mode access to shadow stack using FWFT (https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-firmware…) - Link to v5: https://lore.kernel.org/r/20241001-v5_user_cfi_series-v1-0-3ba65b6e550f@riv… (Note: I had an issue in my workflow due to which version number wasn't picked up correctly while sending out patches) v4: - rebased on 6.11-rc6 - envcfg: Converged with Samuel Holland's patches for envcfg management on per- thread basis. - vma_is_shadow_stack is renamed to is_vma_shadow_stack - picked up Mark Brown's `ARCH_HAS_USER_SHADOW_STACK` patch - signal context: using extended context management to maintain compatibility. - fixed `-Wmissing-prototypes` compiler warnings for prctl functions - Documentation fixes and amending typos. - Link to v4: https://lore.kernel.org/all/20240912231650.3740732-1-debug@rivosinc.com/ v3: - envcfg logic to pick up base envcfg had a bug where `ENVCFG_CBZE` could have been picked on per task basis, even though CPU didn't implement it. Fixed in this series. - dt-bindings As suggested, split into separate commit. fixed the messaging that spec is in public review - arch_is_shadow_stack change arch_is_shadow_stack changed to vma_is_shadow_stack - hwprobe zicfiss / zicfilp if present will get enumerated in hwprobe - selftests As suggested, added object and binary filenames to .gitignore Selftest binary anyways need to be compiled with cfi enabled compiler which will make sure that landing pad and shadow stack are enabled. Thus removed separate enable/disable tests. Cleaned up tests a bit. - Link to v3: https://lore.kernel.org/lkml/20240403234054.2020347-1-debug@rivosinc.com/ v2: - Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow integrity for user mode programs can be compiled in the kernel. - Enabling of control flow integrity for user programs is left to user runtime - This patch series introduces arch agnostic `prctls` to enable shadow stack and indirect branch tracking. And implements them on riscv. --- --- Changes in v11: - EDITME: describe what is new in this series revision. - EDITME: use bulletpoints and terse descriptions. - Link to v10: https://lore.kernel.org/r/20250210-v5_user_cfi_series-v10-0-163dcfa31c60@ri… --- Andy Chiu (1): riscv: signal: abstract header saving for setup_sigcontext Clément Léger (1): riscv: Add Firmware Feature SBI extensions definitions Deepak Gupta (24): mm: VM_SHADOW_STACK definition for riscv dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml) riscv: zicfiss / zicfilp enumeration riscv: zicfiss / zicfilp extension csr and bit definitions riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE riscv mm: manufacture shadow stack pte riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs riscv mmu: write protect and shadow stack riscv/mm: Implement map_shadow_stack() syscall riscv/shstk: If needed allocate a new shadow stack on clone riscv: Implements arch agnostic shadow stack prctls prctl: arch-agnostic prctl for indirect branch tracking riscv/traps: Introduce software check exception riscv/signal: save and restore of shadow stack for signal riscv/kernel: update __show_regs to print shadow stack register riscv/ptrace: riscv cfi status and state via ptrace and in core files riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe riscv: enable kernel access to shadow stack memory via FWFT sbi call riscv: kernel command line option to opt out of user cfi riscv: create a config for shadow stack and landing pad instr support riscv: Documentation for landing pad / indirect branch tracking riscv: Documentation for shadow stack on riscv kselftest/riscv: kselftest for user mode cfi Jim Shu (1): arch/riscv: compile vdso with landing pad Documentation/arch/riscv/index.rst | 2 + Documentation/arch/riscv/zicfilp.rst | 115 +++++ Documentation/arch/riscv/zicfiss.rst | 176 +++++++ .../devicetree/bindings/riscv/extensions.yaml | 14 + arch/riscv/Kconfig | 20 + arch/riscv/Makefile | 7 +- arch/riscv/include/asm/asm-prototypes.h | 1 + arch/riscv/include/asm/assembler.h | 44 ++ arch/riscv/include/asm/cpufeature.h | 13 + arch/riscv/include/asm/csr.h | 16 + arch/riscv/include/asm/entry-common.h | 2 + arch/riscv/include/asm/hwcap.h | 2 + arch/riscv/include/asm/mman.h | 25 + arch/riscv/include/asm/mmu_context.h | 7 + arch/riscv/include/asm/pgtable.h | 30 +- arch/riscv/include/asm/processor.h | 2 + arch/riscv/include/asm/sbi.h | 26 + arch/riscv/include/asm/thread_info.h | 3 + arch/riscv/include/asm/usercfi.h | 89 ++++ arch/riscv/include/asm/vector.h | 3 + arch/riscv/include/uapi/asm/hwprobe.h | 2 + arch/riscv/include/uapi/asm/ptrace.h | 22 + arch/riscv/include/uapi/asm/sigcontext.h | 1 + arch/riscv/kernel/Makefile | 1 + arch/riscv/kernel/asm-offsets.c | 8 + arch/riscv/kernel/cpufeature.c | 13 + arch/riscv/kernel/entry.S | 31 +- arch/riscv/kernel/head.S | 12 + arch/riscv/kernel/process.c | 26 +- arch/riscv/kernel/ptrace.c | 83 ++++ arch/riscv/kernel/signal.c | 142 +++++- arch/riscv/kernel/sys_hwprobe.c | 2 + arch/riscv/kernel/sys_riscv.c | 10 + arch/riscv/kernel/traps.c | 43 ++ arch/riscv/kernel/usercfi.c | 524 +++++++++++++++++++++ arch/riscv/kernel/vdso/Makefile | 12 + arch/riscv/kernel/vdso/flush_icache.S | 4 + arch/riscv/kernel/vdso/getcpu.S | 4 + arch/riscv/kernel/vdso/rt_sigreturn.S | 4 + arch/riscv/kernel/vdso/sys_hwprobe.S | 4 + arch/riscv/mm/init.c | 2 +- arch/riscv/mm/pgtable.c | 17 + include/linux/cpu.h | 4 + include/linux/mm.h | 7 + include/uapi/linux/elf.h | 1 + include/uapi/linux/prctl.h | 27 ++ kernel/sys.c | 30 ++ tools/testing/selftests/riscv/Makefile | 2 +- tools/testing/selftests/riscv/cfi/.gitignore | 3 + tools/testing/selftests/riscv/cfi/Makefile | 10 + tools/testing/selftests/riscv/cfi/cfi_rv_test.h | 84 ++++ tools/testing/selftests/riscv/cfi/riscv_cfi_test.c | 78 +++ tools/testing/selftests/riscv/cfi/shadowstack.c | 375 +++++++++++++++ tools/testing/selftests/riscv/cfi/shadowstack.h | 37 ++ 54 files changed, 2193 insertions(+), 29 deletions(-) --- base-commit: 39a803b754d5224a3522016b564113ee1e4091b2 change-id: 20240930-v5_user_cfi_series-3dc332f8f5b2 -- - debug

4 months, 3 weeks

2
47
0 0

[PATCH] rust: enable `clippy::ptr_as_ptr` lint

by Tamir Duberstein

In Rust 1.51.0, Clippy introduced the `ignored_unit_patterns` lint [1]: > Though `as` casts between raw pointers are not terrible, > `pointer::cast` is safer because it cannot accidentally change the > pointer's mutability, nor cast the pointer to other types like `usize`. There are a few classes of changes required: - Modules generated by bindgen are marked `#[allow(clippy::ptr_as_ptr)]`. - Inferred casts (` as _`) are replaced with `.cast()`. - Ascribed casts (` as *... T`) are replaced with `.cast::<T>()`. - Multistep casts from references (` as *const _ as *const T`) are replaced with `let x: *const _ = &x;` and `.cast()` or `.cast::<T>()` according to the previous rules. The intermediate `let` binding is required because `(x as *const _).cast::<T>()` results in inference failure. - Native literal C strings are replaced with `c_str!().as_char_ptr()`. Apply these changes and enable the lint -- no functional change intended. Link: https://rust-lang.github.io/rust-clippy/master/index.html#ptr_as_ptr [1] Signed-off-by: Tamir Duberstein <tamird(a)gmail.com> --- Makefile | 1 + rust/bindings/lib.rs | 1 + rust/kernel/alloc/allocator_test.rs | 2 +- rust/kernel/alloc/kvec.rs | 4 ++-- rust/kernel/device.rs | 5 +++-- rust/kernel/devres.rs | 2 +- rust/kernel/error.rs | 2 +- rust/kernel/fs/file.rs | 2 +- rust/kernel/kunit.rs | 15 +++++++-------- rust/kernel/lib.rs | 4 ++-- rust/kernel/list/impl_list_item_mod.rs | 2 +- rust/kernel/pci.rs | 2 +- rust/kernel/platform.rs | 4 +++- rust/kernel/print.rs | 11 +++++------ rust/kernel/seq_file.rs | 3 ++- rust/kernel/str.rs | 2 +- rust/kernel/sync/poll.rs | 2 +- rust/kernel/workqueue.rs | 10 +++++----- rust/uapi/lib.rs | 1 + 19 files changed, 40 insertions(+), 35 deletions(-) diff --git a/Makefile b/Makefile index 70bdbf2218fc..ec8efc8e23ba 100644 --- a/Makefile +++ b/Makefile @@ -483,6 +483,7 @@ export rust_common_flags := --edition=2021 \ -Wclippy::needless_continue \ -Aclippy::needless_lifetimes \ -Wclippy::no_mangle_with_rust_abi \ + -Wclippy::ptr_as_ptr \ -Wclippy::undocumented_unsafe_blocks \ -Wclippy::unnecessary_safety_comment \ -Wclippy::unnecessary_safety_doc \ diff --git a/rust/bindings/lib.rs b/rust/bindings/lib.rs index 014af0d1fc70..0486a32ed314 100644 --- a/rust/bindings/lib.rs +++ b/rust/bindings/lib.rs @@ -25,6 +25,7 @@ )] #[allow(dead_code)] +#[allow(clippy::ptr_as_ptr)] #[allow(clippy::undocumented_unsafe_blocks)] mod bindings_raw { // Manual definition for blocklisted types. diff --git a/rust/kernel/alloc/allocator_test.rs b/rust/kernel/alloc/allocator_test.rs index c37d4c0c64e9..8017aa9d5213 100644 --- a/rust/kernel/alloc/allocator_test.rs +++ b/rust/kernel/alloc/allocator_test.rs @@ -82,7 +82,7 @@ unsafe fn realloc( // SAFETY: Returns either NULL or a pointer to a memory allocation that satisfies or // exceeds the given size and alignment requirements. - let dst = unsafe { libc_aligned_alloc(layout.align(), layout.size()) } as *mut u8; + let dst = unsafe { libc_aligned_alloc(layout.align(), layout.size()) }.cast::<u8>(); let dst = NonNull::new(dst).ok_or(AllocError)?; if flags.contains(__GFP_ZERO) { diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs index ae9d072741ce..c12844764671 100644 --- a/rust/kernel/alloc/kvec.rs +++ b/rust/kernel/alloc/kvec.rs @@ -262,7 +262,7 @@ pub fn spare_capacity_mut(&mut self) -> &mut [MaybeUninit<T>] { // - `self.len` is smaller than `self.capacity` and hence, the resulting pointer is // guaranteed to be part of the same allocated object. // - `self.len` can not overflow `isize`. - let ptr = unsafe { self.as_mut_ptr().add(self.len) } as *mut MaybeUninit<T>; + let ptr = unsafe { self.as_mut_ptr().add(self.len) }.cast::<MaybeUninit<T>>(); // SAFETY: The memory between `self.len` and `self.capacity` is guaranteed to be allocated // and valid, but uninitialized. @@ -554,7 +554,7 @@ fn drop(&mut self) { // - `ptr` points to memory with at least a size of `size_of::<T>() * len`, // - all elements within `b` are initialized values of `T`, // - `len` does not exceed `isize::MAX`. - unsafe { Vec::from_raw_parts(ptr as _, len, len) } + unsafe { Vec::from_raw_parts(ptr.cast(), len, len) } } } diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs index db2d9658ba47..9e500498835d 100644 --- a/rust/kernel/device.rs +++ b/rust/kernel/device.rs @@ -168,16 +168,17 @@ pub fn pr_dbg(&self, args: fmt::Arguments<'_>) { /// `KERN_*`constants, for example, `KERN_CRIT`, `KERN_ALERT`, etc. #[cfg_attr(not(CONFIG_PRINTK), allow(unused_variables))] unsafe fn printk(&self, klevel: &[u8], msg: fmt::Arguments<'_>) { + let msg: *const _ = &msg; // SAFETY: `klevel` is null-terminated and one of the kernel constants. `self.as_raw` // is valid because `self` is valid. The "%pA" format string expects a pointer to // `fmt::Arguments`, which is what we're passing as the last argument. #[cfg(CONFIG_PRINTK)] unsafe { bindings::_dev_printk( - klevel as *const _ as *const crate::ffi::c_char, + klevel.as_ptr().cast::<crate::ffi::c_char>(), self.as_raw(), c_str!("%pA").as_char_ptr(), - &msg as *const _ as *const crate::ffi::c_void, + msg.cast::<crate::ffi::c_void>(), ) }; } diff --git a/rust/kernel/devres.rs b/rust/kernel/devres.rs index 942376f6f3af..3a9d998ec371 100644 --- a/rust/kernel/devres.rs +++ b/rust/kernel/devres.rs @@ -157,7 +157,7 @@ fn remove_action(this: &Arc<Self>) { #[allow(clippy::missing_safety_doc)] unsafe extern "C" fn devres_callback(ptr: *mut kernel::ffi::c_void) { - let ptr = ptr as *mut DevresInner<T>; + let ptr = ptr.cast::<DevresInner<T>>(); // Devres owned this memory; now that we received the callback, drop the `Arc` and hence the // reference. // SAFETY: Safe, since we leaked an `Arc` reference to devm_add_action() in diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs index f6ecf09cb65f..8654d52b0bb9 100644 --- a/rust/kernel/error.rs +++ b/rust/kernel/error.rs @@ -152,7 +152,7 @@ pub(crate) fn to_blk_status(self) -> bindings::blk_status_t { /// Returns the error encoded as a pointer. pub fn to_ptr<T>(self) -> *mut T { // SAFETY: `self.0` is a valid error due to its invariant. - unsafe { bindings::ERR_PTR(self.0.get() as _) as *mut _ } + unsafe { bindings::ERR_PTR(self.0.get() as _).cast() } } /// Returns a string representing the error, if one exists. diff --git a/rust/kernel/fs/file.rs b/rust/kernel/fs/file.rs index e03dbe14d62a..8936afc234a4 100644 --- a/rust/kernel/fs/file.rs +++ b/rust/kernel/fs/file.rs @@ -364,7 +364,7 @@ fn deref(&self) -> &LocalFile { // // By the type invariants, there are no `fdget_pos` calls that did not take the // `f_pos_lock` mutex. - unsafe { LocalFile::from_raw_file(self as *const File as *const bindings::file) } + unsafe { LocalFile::from_raw_file((self as *const Self).cast()) } } } diff --git a/rust/kernel/kunit.rs b/rust/kernel/kunit.rs index 824da0e9738a..7ed2063c1af0 100644 --- a/rust/kernel/kunit.rs +++ b/rust/kernel/kunit.rs @@ -8,19 +8,20 @@ use core::{ffi::c_void, fmt}; +#[cfg(CONFIG_PRINTK)] +use crate::c_str; + /// Prints a KUnit error-level message. /// /// Public but hidden since it should only be used from KUnit generated code. #[doc(hidden)] pub fn err(args: fmt::Arguments<'_>) { + let args: *const _ = &args; // SAFETY: The format string is null-terminated and the `%pA` specifier matches the argument we // are passing. #[cfg(CONFIG_PRINTK)] unsafe { - bindings::_printk( - c"\x013%pA".as_ptr() as _, - &args as *const _ as *const c_void, - ); + bindings::_printk(c_str!("\x013%pA").as_char_ptr(), args.cast::<c_void>()); } } @@ -29,14 +30,12 @@ pub fn err(args: fmt::Arguments<'_>) { /// Public but hidden since it should only be used from KUnit generated code. #[doc(hidden)] pub fn info(args: fmt::Arguments<'_>) { + let args: *const _ = &args; // SAFETY: The format string is null-terminated and the `%pA` specifier matches the argument we // are passing. #[cfg(CONFIG_PRINTK)] unsafe { - bindings::_printk( - c"\x016%pA".as_ptr() as _, - &args as *const _ as *const c_void, - ); + bindings::_printk(c_str!("\x016%pA").as_char_ptr(), args.cast::<c_void>()); } } diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index 7697c60b2d1a..01264e459c92 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -196,9 +196,9 @@ fn panic(info: &core::panic::PanicInfo<'_>) -> ! { #[macro_export] macro_rules! container_of { ($ptr:expr, $type:ty, $($f:tt)*) => {{ - let ptr = $ptr as *const _ as *const u8; + let ptr: *const _ = $ptr; let offset: usize = ::core::mem::offset_of!($type, $($f)*); - ptr.sub(offset) as *const $type + ptr.cast::<u8>().sub(offset).cast::<$type>() }} } diff --git a/rust/kernel/list/impl_list_item_mod.rs b/rust/kernel/list/impl_list_item_mod.rs index a0438537cee1..1f9498c1458f 100644 --- a/rust/kernel/list/impl_list_item_mod.rs +++ b/rust/kernel/list/impl_list_item_mod.rs @@ -34,7 +34,7 @@ pub unsafe trait HasListLinks<const ID: u64 = 0> { unsafe fn raw_get_list_links(ptr: *mut Self) -> *mut ListLinks<ID> { // SAFETY: The caller promises that the pointer is valid. The implementer promises that the // `OFFSET` constant is correct. - unsafe { (ptr as *mut u8).add(Self::OFFSET) as *mut ListLinks<ID> } + unsafe { ptr.cast::<u8>().add(Self::OFFSET).cast() } } } diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs index 4c98b5b9aa1e..206f71d33ab2 100644 --- a/rust/kernel/pci.rs +++ b/rust/kernel/pci.rs @@ -75,7 +75,7 @@ extern "C" fn probe_callback( // Let the `struct pci_dev` own a reference of the driver's private data. // SAFETY: By the type invariant `pdev.as_raw` returns a valid pointer to a // `struct pci_dev`. - unsafe { bindings::pci_set_drvdata(pdev.as_raw(), data.into_foreign() as _) }; + unsafe { bindings::pci_set_drvdata(pdev.as_raw(), data.into_foreign().cast()) }; } Err(err) => return Error::to_errno(err), } diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs index 50e6b0421813..8f9e6b125faf 100644 --- a/rust/kernel/platform.rs +++ b/rust/kernel/platform.rs @@ -66,7 +66,9 @@ extern "C" fn probe_callback(pdev: *mut bindings::platform_device) -> kernel::ff // Let the `struct platform_device` own a reference of the driver's private data. // SAFETY: By the type invariant `pdev.as_raw` returns a valid pointer to a // `struct platform_device`. - unsafe { bindings::platform_set_drvdata(pdev.as_raw(), data.into_foreign() as _) }; + unsafe { + bindings::platform_set_drvdata(pdev.as_raw(), data.into_foreign().cast()) + }; } Err(err) => return Error::to_errno(err), } diff --git a/rust/kernel/print.rs b/rust/kernel/print.rs index b19ee490be58..0245c145ea32 100644 --- a/rust/kernel/print.rs +++ b/rust/kernel/print.rs @@ -25,7 +25,7 @@ // SAFETY: The C contract guarantees that `buf` is valid if it's less than `end`. let mut w = unsafe { RawFormatter::from_ptrs(buf.cast(), end.cast()) }; // SAFETY: TODO. - let _ = w.write_fmt(unsafe { *(ptr as *const fmt::Arguments<'_>) }); + let _ = w.write_fmt(unsafe { *ptr.cast::<fmt::Arguments<'_>>() }); w.pos().cast() } @@ -102,6 +102,7 @@ pub unsafe fn call_printk( module_name: &[u8], args: fmt::Arguments<'_>, ) { + let args: *const _ = &args; // `_printk` does not seem to fail in any path. #[cfg(CONFIG_PRINTK)] // SAFETY: TODO. @@ -109,7 +110,7 @@ pub unsafe fn call_printk( bindings::_printk( format_string.as_ptr(), module_name.as_ptr(), - &args as *const _ as *const c_void, + args.cast::<c_void>(), ); } } @@ -122,15 +123,13 @@ pub unsafe fn call_printk( #[doc(hidden)] #[cfg_attr(not(CONFIG_PRINTK), allow(unused_variables))] pub fn call_printk_cont(args: fmt::Arguments<'_>) { + let args: *const _ = &args; // `_printk` does not seem to fail in any path. // // SAFETY: The format string is fixed. #[cfg(CONFIG_PRINTK)] unsafe { - bindings::_printk( - format_strings::CONT.as_ptr(), - &args as *const _ as *const c_void, - ); + bindings::_printk(format_strings::CONT.as_ptr(), args.cast::<c_void>()); } } diff --git a/rust/kernel/seq_file.rs b/rust/kernel/seq_file.rs index 04947c672979..90545d28e6b7 100644 --- a/rust/kernel/seq_file.rs +++ b/rust/kernel/seq_file.rs @@ -31,12 +31,13 @@ pub unsafe fn from_raw<'a>(ptr: *mut bindings::seq_file) -> &'a SeqFile { /// Used by the [`seq_print`] macro. pub fn call_printf(&self, args: core::fmt::Arguments<'_>) { + let args: *const _ = &args; // SAFETY: Passing a void pointer to `Arguments` is valid for `%pA`. unsafe { bindings::seq_printf( self.inner.get(), c_str!("%pA").as_char_ptr(), - &args as *const _ as *const crate::ffi::c_void, + args.cast::<crate::ffi::c_void>(), ); } } diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs index 28e2201604d6..6a1a982b946d 100644 --- a/rust/kernel/str.rs +++ b/rust/kernel/str.rs @@ -191,7 +191,7 @@ pub unsafe fn from_char_ptr<'a>(ptr: *const crate::ffi::c_char) -> &'a Self { // to a `NUL`-terminated C string. let len = unsafe { bindings::strlen(ptr) } + 1; // SAFETY: Lifetime guaranteed by the safety precondition. - let bytes = unsafe { core::slice::from_raw_parts(ptr as _, len) }; + let bytes = unsafe { core::slice::from_raw_parts(ptr.cast(), len) }; // SAFETY: As `len` is returned by `strlen`, `bytes` does not contain interior `NUL`. // As we have added 1 to `len`, the last byte is known to be `NUL`. unsafe { Self::from_bytes_with_nul_unchecked(bytes) } diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs index d5f17153b424..a151f54cde91 100644 --- a/rust/kernel/sync/poll.rs +++ b/rust/kernel/sync/poll.rs @@ -73,7 +73,7 @@ pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) { // be destroyed, the destructor must run. That destructor first removes all waiters, // and then waits for an rcu grace period. Therefore, `cv.wait_queue_head` is valid for // long enough. - unsafe { qproc(file.as_ptr() as _, cv.wait_queue_head.get(), self.0.get()) }; + unsafe { qproc(file.as_ptr().cast(), cv.wait_queue_head.get(), self.0.get()) }; } } } diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs index 0cd100d2aefb..8ff54105be3f 100644 --- a/rust/kernel/workqueue.rs +++ b/rust/kernel/workqueue.rs @@ -170,7 +170,7 @@ impl Queue { pub unsafe fn from_raw<'a>(ptr: *const bindings::workqueue_struct) -> &'a Queue { // SAFETY: The `Queue` type is `#[repr(transparent)]`, so the pointer cast is valid. The // caller promises that the pointer is not dangling. - unsafe { &*(ptr as *const Queue) } + unsafe { &*ptr.cast::<Queue>() } } /// Enqueues a work item. @@ -457,7 +457,7 @@ fn get_work_offset(&self) -> usize { #[inline] unsafe fn raw_get_work(ptr: *mut Self) -> *mut Work<T, ID> { // SAFETY: The caller promises that the pointer is valid. - unsafe { (ptr as *mut u8).add(Self::OFFSET) as *mut Work<T, ID> } + unsafe { ptr.cast::<u8>().add(Self::OFFSET).cast::<Work<T, ID>>() } } /// Returns a pointer to the struct containing the [`Work<T, ID>`] field. @@ -472,7 +472,7 @@ unsafe fn work_container_of(ptr: *mut Work<T, ID>) -> *mut Self { // SAFETY: The caller promises that the pointer points at a field of the right type in the // right kind of struct. - unsafe { (ptr as *mut u8).sub(Self::OFFSET) as *mut Self } + unsafe { ptr.cast::<u8>().sub(Self::OFFSET).cast::<Self>() } } } @@ -538,7 +538,7 @@ unsafe impl<T, const ID: u64> WorkItemPointer<ID> for Arc<T> { unsafe extern "C" fn run(ptr: *mut bindings::work_struct) { // The `__enqueue` method always uses a `work_struct` stored in a `Work<T, ID>`. - let ptr = ptr as *mut Work<T, ID>; + let ptr = ptr.cast::<Work<T, ID>>(); // SAFETY: This computes the pointer that `__enqueue` got from `Arc::into_raw`. let ptr = unsafe { T::work_container_of(ptr) }; // SAFETY: This pointer comes from `Arc::into_raw` and we've been given back ownership. @@ -591,7 +591,7 @@ unsafe impl<T, const ID: u64> WorkItemPointer<ID> for Pin<KBox<T>> { unsafe extern "C" fn run(ptr: *mut bindings::work_struct) { // The `__enqueue` method always uses a `work_struct` stored in a `Work<T, ID>`. - let ptr = ptr as *mut Work<T, ID>; + let ptr = ptr.cast::<Work<T, ID>>(); // SAFETY: This computes the pointer that `__enqueue` got from `Arc::into_raw`. let ptr = unsafe { T::work_container_of(ptr) }; // SAFETY: This pointer comes from `Arc::into_raw` and we've been given back ownership. diff --git a/rust/uapi/lib.rs b/rust/uapi/lib.rs index 13495910271f..fe9bf7b5a306 100644 --- a/rust/uapi/lib.rs +++ b/rust/uapi/lib.rs @@ -15,6 +15,7 @@ #![allow( clippy::all, clippy::undocumented_unsafe_blocks, + clippy::ptr_as_ptr, dead_code, missing_docs, non_camel_case_types, --- base-commit: ff64846bee0e7e3e7bc9363ebad3bab42dd27e24 change-id: 20250307-ptr-as-ptr-21b1867fc4d4 Best regards, -- Tamir Duberstein <tamird(a)gmail.com>

4 months, 3 weeks

5
12
0 0

[PATCH net v2 0/3] vsock/bpf: Handle races between sockmap update and connect() disconnecting

by Michal Luczaj

Signal delivery during connect() may disconnect an already established socket. Problem is that such socket might have been placed in a sockmap before the connection was closed. PATCH 1 ensures this race won't lead to an unconnected vsock staying in the sockmap. PATCH 2 selftests it. PATCH 3 fixes a related race. Note that here the race window is rather difficult to hit and I can't think of an easy way of testing it. Signed-off-by: Michal Luczaj <mhal(a)rbox.co> --- Changes in v2: - Handle one more path of tripping the warning - Add a selftest - Collect R-b [Stefano] - Link to v1: https://lore.kernel.org/r/20250307-vsock-trans-signal-race-v1-1-3aca3f771fb… --- Michal Luczaj (3): vsock/bpf: Fix EINTR connect() racing sockmap update selftest/bpf: Add test for AF_VSOCK connect() racing sockmap update vsock/bpf: Fix bpf recvmsg() racing transport reassignment net/vmw_vsock/af_vsock.c | 10 +- net/vmw_vsock/vsock_bpf.c | 24 +++-- .../selftests/bpf/prog_tests/sockmap_basic.c | 111 +++++++++++++++++++++ 3 files changed, 136 insertions(+), 9 deletions(-) --- base-commit: da9e8efe7ee10e8425dc356a9fc593502c8e3933 change-id: 20250305-vsock-trans-signal-race-d62f7718d099 Best regards, -- Michal Luczaj <mhal(a)rbox.co>

4 months, 3 weeks

1
3
0 0

[PATCH v3 00/17] riscv: add SBI FWFT misaligned exception delegation support

by Clément Léger

The SBI Firmware Feature extension allows the S-mode to request some specific features (either hardware or software) to be enabled. This series uses this extension to request misaligned access exception delegation to S-mode in order to let the kernel handle it. It also adds support for the KVM FWFT SBI extension based on the misaligned access handling infrastructure. FWFT SBI extension is part of the SBI V3.0 specifications [1]. It can be tested using the qemu provided at [2] which contains the series from [3]. kvm-unit-tests [4] can be used inside kvm to tests the correct delegation of misaligned exceptions. Upstream OpenSBI can be used. Note: Since SBI V3.0 is not yet ratified, FWFT extension API is split between interface only and implementation, allowing to pick only the interface which do not have hard dependencies on SBI. The tests can be run using the included kselftest: $ qemu-system-riscv64 \ -cpu rv64,trap-misaligned-access=true,v=true \ -M virt \ -m 1024M \ -bios fw_dynamic.bin \ -kernel Image ... # ./misaligned TAP version 13 1..23 # Starting 23 tests from 1 test cases. # RUN global.gp_load_lh ... # OK global.gp_load_lh ok 1 global.gp_load_lh # RUN global.gp_load_lhu ... # OK global.gp_load_lhu ok 2 global.gp_load_lhu # RUN global.gp_load_lw ... # OK global.gp_load_lw ok 3 global.gp_load_lw # RUN global.gp_load_lwu ... # OK global.gp_load_lwu ok 4 global.gp_load_lwu # RUN global.gp_load_ld ... # OK global.gp_load_ld ok 5 global.gp_load_ld # RUN global.gp_load_c_lw ... # OK global.gp_load_c_lw ok 6 global.gp_load_c_lw # RUN global.gp_load_c_ld ... # OK global.gp_load_c_ld ok 7 global.gp_load_c_ld # RUN global.gp_load_c_ldsp ... # OK global.gp_load_c_ldsp ok 8 global.gp_load_c_ldsp # RUN global.gp_load_sh ... # OK global.gp_load_sh ok 9 global.gp_load_sh # RUN global.gp_load_sw ... # OK global.gp_load_sw ok 10 global.gp_load_sw # RUN global.gp_load_sd ... # OK global.gp_load_sd ok 11 global.gp_load_sd # RUN global.gp_load_c_sw ... # OK global.gp_load_c_sw ok 12 global.gp_load_c_sw # RUN global.gp_load_c_sd ... # OK global.gp_load_c_sd ok 13 global.gp_load_c_sd # RUN global.gp_load_c_sdsp ... # OK global.gp_load_c_sdsp ok 14 global.gp_load_c_sdsp # RUN global.fpu_load_flw ... # OK global.fpu_load_flw ok 15 global.fpu_load_flw # RUN global.fpu_load_fld ... # OK global.fpu_load_fld ok 16 global.fpu_load_fld # RUN global.fpu_load_c_fld ... # OK global.fpu_load_c_fld ok 17 global.fpu_load_c_fld # RUN global.fpu_load_c_fldsp ... # OK global.fpu_load_c_fldsp ok 18 global.fpu_load_c_fldsp # RUN global.fpu_store_fsw ... # OK global.fpu_store_fsw ok 19 global.fpu_store_fsw # RUN global.fpu_store_fsd ... # OK global.fpu_store_fsd ok 20 global.fpu_store_fsd # RUN global.fpu_store_c_fsd ... # OK global.fpu_store_c_fsd ok 21 global.fpu_store_c_fsd # RUN global.fpu_store_c_fsdsp ... # OK global.fpu_store_c_fsdsp ok 22 global.fpu_store_c_fsdsp # RUN global.gen_sigbus ... [12797.988647] misaligned[618]: unhandled signal 7 code 0x1 at 0x0000000000014dc0 in misaligned[4dc0,10000+76000] [12797.988990] CPU: 0 UID: 0 PID: 618 Comm: misaligned Not tainted 6.13.0-rc6-00008-g4ec4468967c9-dirty #51 [12797.989169] Hardware name: riscv-virtio,qemu (DT) [12797.989264] epc : 0000000000014dc0 ra : 0000000000014d00 sp : 00007fffe165d100 [12797.989407] gp : 000000000008f6e8 tp : 0000000000095760 t0 : 0000000000000008 [12797.989544] t1 : 00000000000965d8 t2 : 000000000008e830 s0 : 00007fffe165d160 [12797.989692] s1 : 000000000000001a a0 : 0000000000000000 a1 : 0000000000000002 [12797.989831] a2 : 0000000000000000 a3 : 0000000000000000 a4 : ffffffffdeadbeef [12797.989964] a5 : 000000000008ef61 a6 : 626769735f6e0000 a7 : fffffffffffff000 [12797.990094] s2 : 0000000000000001 s3 : 00007fffe165d838 s4 : 00007fffe165d848 [12797.990238] s5 : 000000000000001a s6 : 0000000000010442 s7 : 0000000000010200 [12797.990391] s8 : 000000000000003a s9 : 0000000000094508 s10: 0000000000000000 [12797.990526] s11: 0000555567460668 t3 : 00007fffe165d070 t4 : 00000000000965d0 [12797.990656] t5 : fefefefefefefeff t6 : 0000000000000073 [12797.990756] status: 0000000200004020 badaddr: 000000000008ef61 cause: 0000000000000006 [12797.990911] Code: 8793 8791 3423 fcf4 3783 fc84 c737 dead 0713 eef7 (c398) 0001 # OK global.gen_sigbus ok 23 global.gen_sigbus # PASSED: 23 / 23 tests passed. # Totals: pass:23 fail:0 xfail:0 xpass:0 skip:0 error:0 With kvm-tools: # lkvm run -k sbi.flat -m 128 Info: # lkvm run -k sbi.flat -m 128 -c 1 --name guest-97 Info: Removed ghost socket file "/root/.lkvm//guest-97.sock". ########################################################################## # kvm-unit-tests ########################################################################## ... [test messages elided] PASS: sbi: fwft: FWFT extension probing no error PASS: sbi: fwft: get/set reserved feature 0x6 error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0x3fffffff error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0x80000000 error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0xbfffffff error == SBI_ERR_DENIED PASS: sbi: fwft: misaligned_deleg: Get misaligned deleg feature no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 0 PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 1 PASS: sbi: fwft: misaligned_deleg: Verify misaligned load exception trap in supervisor SUMMARY: 50 tests, 2 unexpected failures, 12 skipped This series is available at [6]. Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/vv3.0-rc2/… [1] Link: https://github.com/rivosinc/qemu/tree/dev/cleger/misaligned [2] Link: https://lore.kernel.org/all/20241211211933.198792-3-fkonrad@amd.com/T/ [3] Link: https://github.com/clementleger/kvm-unit-tests/tree/dev/cleger/fwft_v1 [4] Link: https://github.com/clementleger/unaligned_test [5] Link: https://github.com/rivosinc/linux/tree/dev/cleger/fwft_v1 [6] --- V3: - Added comment about kvm sbi fwft supported/set/get callback requirements - Move struct kvm_sbi_fwft_feature in kvm_sbi_fwft.c - Add a FWFT interface V2: - Added Kselftest for misaligned testing - Added get_user() usage instead of __get_user() - Reenable interrupt when possible in misaligned access handling - Document that riscv supports unaligned-traps - Fix KVM extension state when an init function is present - Rework SBI misaligned accesses trap delegation code - Added support for CPU hotplugging - Added KVM SBI reset callback - Added reset for KVM SBI FWFT lock - Return SBI_ERR_DENIED_LOCKED when LOCK flag is set Clément Léger (17): riscv: add Firmware Feature (FWFT) SBI extensions definitions riscv: sbi: add FWFT extension interface riscv: sbi: add SBI FWFT extension calls riscv: misaligned: request misaligned exception from SBI riscv: misaligned: use on_each_cpu() for scalar misaligned access probing riscv: misaligned: use correct CONFIG_ ifdef for misaligned_access_speed riscv: misaligned: move emulated access uniformity check in a function riscv: misaligned: add a function to check misalign trap delegability riscv: misaligned: factorize trap handling riscv: misaligned: enable IRQs while handling misaligned accesses riscv: misaligned: use get_user() instead of __get_user() Documentation/sysctl: add riscv to unaligned-trap supported archs selftests: riscv: add misaligned access testing RISC-V: KVM: add SBI extension init()/deinit() functions RISC-V: KVM: add SBI extension reset callback RISC-V: KVM: add support for FWFT SBI extension RISC-V: KVM: add support for SBI_FWFT_MISALIGNED_DELEG Documentation/admin-guide/sysctl/kernel.rst | 4 +- arch/riscv/include/asm/cpufeature.h | 8 +- arch/riscv/include/asm/kvm_host.h | 5 +- arch/riscv/include/asm/kvm_vcpu_sbi.h | 12 + arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 31 +++ arch/riscv/include/asm/sbi.h | 38 +++ arch/riscv/include/uapi/asm/kvm.h | 1 + arch/riscv/kernel/sbi.c | 123 +++++++++ arch/riscv/kernel/traps.c | 57 ++-- arch/riscv/kernel/traps_misaligned.c | 119 +++++++- arch/riscv/kernel/unaligned_access_speed.c | 11 +- arch/riscv/kvm/Makefile | 1 + arch/riscv/kvm/vcpu.c | 7 +- arch/riscv/kvm/vcpu_sbi.c | 57 ++++ arch/riscv/kvm/vcpu_sbi_fwft.c | 251 +++++++++++++++++ arch/riscv/kvm/vcpu_sbi_sta.c | 3 +- .../selftests/riscv/misaligned/.gitignore | 1 + .../selftests/riscv/misaligned/Makefile | 12 + .../selftests/riscv/misaligned/common.S | 33 +++ .../testing/selftests/riscv/misaligned/fpu.S | 180 +++++++++++++ tools/testing/selftests/riscv/misaligned/gp.S | 103 +++++++ .../selftests/riscv/misaligned/misaligned.c | 254 ++++++++++++++++++ 22 files changed, 1264 insertions(+), 47 deletions(-) create mode 100644 arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h create mode 100644 arch/riscv/kvm/vcpu_sbi_fwft.c create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile create mode 100644 tools/testing/selftests/riscv/misaligned/common.S create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c -- 2.47.2

4 months, 3 weeks

2
37
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-kselftest-mirror March 2025