October 2025 - Linux-kselftest-mirror

[PATCH] gpio-selftests: replace fixed sleep with polling+timeout

by zntsproj

Replace the hard-coded sleep 0.1 with a polling loop with timeout to check
the sysfs GPIO value. This avoids timing-dependent flaky failures in CI and
on slower machines.
---
 .../testing/selftests/gpio/gpio-aggregator.sh | 59 +++++++++++++++----
 1 file changed, 46 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/gpio/gpio-aggregator.sh b/tools/testing/selftests/gpio/gpio-aggregator.sh
index 9b6f80ad9..1e81e62e9 100755
--- a/tools/testing/selftests/gpio/gpio-aggregator.sh
+++ b/tools/testing/selftests/gpio/gpio-aggregator.sh
@@ -671,26 +671,59 @@ teardown_4() {
 	agg_configfs_cleanup
 }
 
+# helper: wait for sysfs file to become a given value (timeout in seconds)
+wait_for_sysfs_value() {
+	file="$1"
+	expected="$2"
+	timeout="${3:-2}" # seconds
+	interval="0.01" # seconds per poll
+	max=$((timeout * 100))
+	i=0
+
+	while [ "$i" -lt "$max" ]; do
+		if [ "$(cat "$file")" = "$expected" ]; then
+			return 0
+		fi
+		sleep "$interval"
+		i=$((i + 1))
+	done
+
+	return 1
+}
+
 echo "4.1. Forwarding set values"
 setup_4
 OFFSET=0
 for SETTING in $SETTINGS; do
-	CHIP=$(echo "$SETTING" | cut -d: -f1)
-	BANK=$(echo "$SETTING" | cut -d: -f2)
-	LINE=$(echo "$SETTING" | cut -d: -f3)
-	DEVNAME=$(cat "$CONFIGFS_SIM_DIR/$CHIP/dev_name")
-	CHIPNAME=$(cat "$CONFIGFS_SIM_DIR/$CHIP/$BANK/chip_name")
-	VAL_PATH="/sys/devices/platform/$DEVNAME/$CHIPNAME/sim_gpio${LINE}/value"
-	test $(cat $VAL_PATH) = "0" || fail "incorrect value read from sysfs"
-	$BASE_DIR/gpio-mockup-cdev -s 1 "/dev/$(agg_configfs_chip_name agg0)" "$OFFSET" &
-	mock_pid=$!
-	sleep 0.1 # FIXME Any better way?
-	test "$(cat $VAL_PATH)" = "1" || fail "incorrect value read from sysfs"
-	kill "$mock_pid"
-	OFFSET=$(expr $OFFSET + 1)
+	CHIP=$(echo "$SETTING" | cut -d: -f1)
+	BANK=$(echo "$SETTING" | cut -d: -f2)
+	LINE=$(echo "$SETTING" | cut -d: -f3)
+	DEVNAME=$(cat "$CONFIGFS_SIM_DIR/$CHIP/dev_name")
+	CHIPNAME=$(cat "$CONFIGFS_SIM_DIR/$CHIP/$BANK/chip_name")
+	VAL_PATH="/sys/devices/platform/$DEVNAME/$CHIPNAME/sim_gpio${LINE}/value"
+
+	test "$(cat "$VAL_PATH")" = "0" || fail "incorrect value read from sysfs"
+
+	$BASE_DIR/gpio-mockup-cdev -s 1 "/dev/$(agg_configfs_chip_name agg0)" "$OFFSET" &
+	mock_pid=$!
+
+	# wait up to 2s for value to flip to "1"
+	if ! wait_for_sysfs_value "$VAL_PATH" "1" 2; then
+		kill "$mock_pid" 2>/dev/null || true
+		wait "$mock_pid" 2>/dev/null || true
+		fail "timeout waiting for $VAL_PATH to become 1"
+	fi
+
+	test "$(cat "$VAL_PATH")" = "1" || fail "incorrect value read from sysfs"
+
+	kill "$mock_pid" 2>/dev/null || true
+	wait "$mock_pid" 2>/dev/null || true
+
+	OFFSET=$((OFFSET + 1))
 done
 teardown_4
+
 echo "4.2. Forwarding set config"
 setup_4
 OFFSET=0
--
2.51.2
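The helper in the patch bounds the wait by poll count (timeout x 100 polls of 10 ms), so the real wall-clock limit also includes the time each `cat` takes. As a rough sketch only, here is a deadline-based variant of the same polling idea. The name `wait_for_file_value`, the 50 ms interval and the temporary file standing in for the sysfs attribute are all hypothetical and not part of the patch; it also assumes `date +%s` and fractional `sleep` are available (as with GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): poll FILE until it contains
# EXPECTED, giving up once a wall-clock deadline has passed.
wait_for_file_value() {
	file="$1"
	expected="$2"
	timeout="${3:-2}"	# seconds
	# date +%s has whole-second resolution, so add one second to make
	# sure we never give up earlier than the requested timeout.
	deadline=$(( $(date +%s) + timeout + 1 ))

	while :; do
		# A missing or unreadable file simply counts as "not yet".
		if [ "$(cat "$file" 2>/dev/null)" = "$expected" ]; then
			return 0
		fi
		[ "$(date +%s)" -ge "$deadline" ] && return 1
		sleep 0.05
	done
}

# Demonstration against a temporary file instead of a real sysfs attribute.
tmp=$(mktemp)
echo 0 > "$tmp"
( sleep 0.2; echo 1 > "$tmp" ) &	# simulated asynchronous update
writer_pid=$!
if wait_for_file_value "$tmp" 1 2; then
	echo "value flipped"
else
	echo "timed out"
fi
wait "$writer_pid"
rm -f "$tmp"
```

The trade-off is granularity: the deadline is only accurate to about a second, whereas the counting loop in the patch is finer-grained but can overshoot on a loaded machine.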


[PATCH v2] selftests: af_unix: Add tests for ECONNRESET and EOF semantics

by Sunday Adelodun

Add selftests to verify and document Linux’s intended behaviour for
UNIX domain sockets (SOCK_STREAM and SOCK_DGRAM) when a peer closes.
The tests cover:

  1. EOF returned when a SOCK_STREAM peer closes normally.
  2. ECONNRESET returned when a SOCK_STREAM peer closes with unread data.
  3. SOCK_DGRAM sockets not returning ECONNRESET on peer close.

This follows up on review feedback suggesting a selftest to clarify
Linux’s semantics.

Suggested-by: Kuniyuki Iwashima <kuniyu(a)google.com>
Signed-off-by: Sunday Adelodun <adelodunolaoluwa(a)yahoo.com>
---
Changelog:

Changes made from v1:
- Patch prefix updated to selftest: af_unix:.
- All mentions of “UNIX” changed to AF_UNIX.
- Removed BSD references from comments.
- Shared setup refactored using FIXTURE_VARIANT().
- Cleanup moved to FIXTURE_TEARDOWN() to always run.
- Tests consolidated to reduce duplication: EOF, ECONNRESET, SOCK_DGRAM peer close.
- Corrected ASSERT usage and initialization style.
- Makefile updated for new directory af_unix.

 tools/testing/selftests/net/af_unix/Makefile  |   1 +
 .../selftests/net/af_unix/unix_connreset.c    | 161 ++++++++++++++++++
 2 files changed, 162 insertions(+)
 create mode 100644 tools/testing/selftests/net/af_unix/unix_connreset.c

diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile
index de805cbbdf69..5826a8372451 100644
--- a/tools/testing/selftests/net/af_unix/Makefile
+++ b/tools/testing/selftests/net/af_unix/Makefile
@@ -7,6 +7,7 @@ TEST_GEN_PROGS := \
 	scm_pidfd \
 	scm_rights \
 	unix_connect \
+	unix_connreset \
 # end of TEST_GEN_PROGS
 
 include ../../lib.mk
diff --git a/tools/testing/selftests/net/af_unix/unix_connreset.c b/tools/testing/selftests/net/af_unix/unix_connreset.c
new file mode 100644
index 000000000000..c65ec997d77d
--- /dev/null
+++ b/tools/testing/selftests/net/af_unix/unix_connreset.c
@@ -0,0 +1,161 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Selftest for AF_UNIX socket close and ECONNRESET behaviour.
+ *
+ * This test verifies that:
+ * 1. SOCK_STREAM sockets return EOF when peer closes normally.
+ * 2. SOCK_STREAM sockets return ECONNRESET if peer closes with unread data.
+ * 3. SOCK_DGRAM sockets do not return ECONNRESET when peer closes.
+ *
+ * These tests document the intended Linux behaviour.
+ *
+ */
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <errno.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include "../../kselftest_harness.h"
+
+#define SOCK_PATH "/tmp/af_unix_connreset.sock"
+
+static void remove_socket_file(void)
+{
+	unlink(SOCK_PATH);
+}
+
+FIXTURE(unix_sock)
+{
+	int server;
+	int client;
+	int child;
+};
+
+FIXTURE_VARIANT(unix_sock)
+{
+	int socket_type;
+	const char *name;
+};
+
+/* Define variants: stream and datagram */
+FIXTURE_VARIANT_ADD(unix_sock, stream) {
+	.socket_type = SOCK_STREAM,
+	.name = "SOCK_STREAM",
+};
+
+FIXTURE_VARIANT_ADD(unix_sock, dgram) {
+	.socket_type = SOCK_DGRAM,
+	.name = "SOCK_DGRAM",
+};
+
+FIXTURE_SETUP(unix_sock)
+{
+	struct sockaddr_un addr = {};
+	int err;
+
+	addr.sun_family = AF_UNIX;
+	strcpy(addr.sun_path, SOCK_PATH);
+
+	self->server = socket(AF_UNIX, variant->socket_type, 0);
+	ASSERT_LT(-1, self->server);
+
+	err = bind(self->server, (struct sockaddr *)&addr, sizeof(addr));
+	ASSERT_EQ(0, err);
+
+	if (variant->socket_type == SOCK_STREAM) {
+		err = listen(self->server, 1);
+		ASSERT_EQ(0, err);
+
+		self->client = socket(AF_UNIX, SOCK_STREAM, 0);
+		ASSERT_LT(-1, self->client);
+
+		err = connect(self->client, (struct sockaddr *)&addr, sizeof(addr));
+		ASSERT_EQ(0, err);
+
+		self->child = accept(self->server, NULL, NULL);
+		ASSERT_LT(-1, self->child);
+	} else {
+		/* Datagram: bind and connect only */
+		self->client = socket(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0);
+		ASSERT_LT(-1, self->client);
+
+		err = connect(self->client, (struct sockaddr *)&addr, sizeof(addr));
+		ASSERT_EQ(0, err);
+	}
+}
+
+FIXTURE_TEARDOWN(unix_sock)
+{
+	if (variant->socket_type == SOCK_STREAM)
+		close(self->child);
+
+	close(self->client);
+	close(self->server);
+	remove_socket_file();
+}
+
+/* Test 1: peer closes normally */
+TEST_F(unix_sock, eof)
+{
+	char buf[16] = {};
+	ssize_t n;
+
+	if (variant->socket_type != SOCK_STREAM)
+		SKIP(return, "This test only applies to SOCK_STREAM");
+
+	/* Peer closes normally */
+	close(self->child);
+
+	n = recv(self->client, buf, sizeof(buf), 0);
+	TH_LOG("%s: recv=%zd errno=%d (%s)", variant->name, n, errno, strerror(errno));
+	if (n == -1)
+		ASSERT_EQ(ECONNRESET, errno);
+
+	if (n != -1)
+		ASSERT_EQ(0, n);
+}
+
+/* Test 2: peer closes with unread data */
+TEST_F(unix_sock, reset_unread)
+{
+	char buf[16] = {};
+	ssize_t n;
+
+	if (variant->socket_type != SOCK_STREAM)
+		SKIP(return, "This test only applies to SOCK_STREAM");
+
+	/* Send data that will remain unread by client */
+	send(self->client, "hello", 5, 0);
+	close(self->child);
+
+	n = recv(self->client, buf, sizeof(buf), 0);
+	TH_LOG("%s: recv=%zd errno=%d (%s)", variant->name, n, errno, strerror(errno));
+	ASSERT_EQ(-1, n);
+	ASSERT_EQ(ECONNRESET, errno);
+}
+
+/* Test 3: SOCK_DGRAM peer close */
+TEST_F(unix_sock, dgram_reset)
+{
+	char buf[16] = {};
+	ssize_t n;
+
+	if (variant->socket_type != SOCK_DGRAM)
+		SKIP(return, "This test only applies to SOCK_DGRAM");
+
+	send(self->client, "hello", 5, 0);
+	close(self->server);
+
+	n = recv(self->client, buf, sizeof(buf), 0);
+	TH_LOG("%s: recv=%zd errno=%d (%s)", variant->name, n, errno, strerror(errno));
+	/* Expect EAGAIN because there is no datagram and peer is closed. */
+	ASSERT_EQ(-1, n);
+	ASSERT_EQ(EAGAIN, errno);
+}
+
+TEST_HARNESS_MAIN
+
--
2.43.0


[PATCH v22 00/28] riscv control-flow integrity for usermode

by Deepak Gupta

v22: fixes a build error caused by -march=zicfiss being accepted by gcc-13
and above even though the compiler does not actually generate code for, or
recognize the instructions of, zicfiss. v22 therefore makes
CONFIG_RISCV_USER_CFI depend on the `-fcf-protection=full` compiler flag, so
that the option is visible in menuconfig only when the toolchain really has
support.

v21: fixed build errors.

Basics and overview
===================

Software with larger attack surfaces (e.g. network facing apps like databases,
browsers or apps relying on browser runtimes) suffers from memory corruption
issues which can be utilized by attackers to bend control flow of the program
to eventually gain control (by making their payload executable). Attackers are
able to perform such attacks by leveraging call-sites which rely on indirect
calls or return sites which rely on obtaining return address from stack memory.

To mitigate such attacks, risc-v extension zicfilp enforces that all indirect
calls must land on a landing pad instruction `lpad`, else the cpu will raise a
software check exception (a new cpu exception cause code on riscv).
Similarly for return flow, risc-v extension zicfiss extends the architecture
with

- `sspush` instruction to push return address on a shadow stack
- `sspopchk` instruction to pop return address from shadow stack
  and compare with input operand (i.e. return address on stack)
- `sspopchk` to raise software check exception if the comparison above
  was a mismatch
- Protection mechanism using which shadow stack is not writeable via
  regular store instructions

More information and details can be found at the extensions github repo [1].

Equivalent to landing pad (zicfilp) on x86 is the `ENDBRANCH` instruction in
Intel CET [3] and branch target identification (BTI) [4] on arm.
Similarly x86's Intel CET has shadow stack [5] and arm64 has guarded control
stack (GCS) [6] which are very similar to risc-v's zicfiss shadow stack.

x86 and arm64 support for user mode shadow stack is already in mainline.

Kernel awareness for user control flow integrity
================================================

This series picks up Samuel Holland's envcfg changes [2] as well. So if those
are being applied independently, they should be removed from this series.

Enabling:

In order to maintain compatibility and not break anything in user mode, the
kernel doesn't enable control flow integrity cpu extensions on a binary by
default. Instead, it exposes a prctl interface to enable, disable and lock the
shadow stack or landing pad feature for a task. This allows userspace (loader)
to enumerate if all objects in its address space are compiled with shadow stack
and landing pad support and accordingly enable the feature. Additionally if a
subsequent `dlopen` happens on a library, user mode can take a decision again
to disable the feature (if the incoming library is not compiled with support)
OR terminate the task (if user mode policy is strict to have all objects in
address space to be compiled with control flow integrity cpu feature). prctl
to enable shadow stack results in allocating shadow stack from virtual memory
and activating it for the user address space. x86 and arm64 are also following
the same direction due to similar reason(s).

clone/fork:

On clone and fork, cfi state for the task is inherited by the child. Shadow
stack is part of virtual memory and is writeable memory from the kernel
perspective (writeable via a restricted set of instructions aka shadow stack
instructions). Thus kernel changes ensure that this memory is converted into
read-only when fork/clone happens and COWed when a fault is taken due to
sspush, sspopchk or ssamoswap. In case `CLONE_VM` is specified and shadow stack
is to be enabled, the kernel will automatically allocate a shadow stack for
that clone call.

map_shadow_stack:

x86 introduced the `map_shadow_stack` system call to allow user space to
explicitly map shadow stack memory in its address space. It is useful to
allocate shadow stacks for different contexts managed by a single thread
(green threads or contexts). risc-v implements this system call as well.

signal management:

If shadow stack is enabled for a task, the kernel performs an asynchronous
control flow diversion to deliver the signal and eventually expects userspace
to issue sigreturn so that original execution can be resumed. Even though the
resume context is prepared by the kernel, it is in user space memory and is
subject to memory corruption, and corruption bugs can be utilized by an
attacker in this race window to perform an arbitrary sigreturn and eventually
bypass the cfi mechanism. Another issue is how to ensure that cfi related state
in the sigcontext area is not trampled by legacy apps or apps compiled with old
kernel headers.

In order to mitigate control-flow hijacking, the kernel prepares a token and
places it on the shadow stack before signal delivery and places the address of
the token in the sigcontext structure. During sigreturn, the kernel obtains the
address of the token from the sigcontext structure, reads the token from the
shadow stack, validates it and only then allows sigreturn to succeed. The
compatibility issue is solved by adopting dynamic sigcontext management
introduced for the vector extension. This series refactors the code a little to
make future sigcontext management easier (as proposed by Andy Chiu from SiFive).

config and compilation:

Introduce a new risc-v config option `CONFIG_RISCV_USER_CFI`. Selecting this
config option picks the kernel support for user control flow integrity. This
option is presented only if the toolchain has shadow stack and landing pad
support, and is on purpose guarded by toolchain support. The reason is that
eventually the vDSO also needs to be compiled with shadow stack and landing
pad support. vDSO compile patches are not included as of now because the
landing pad labeling scheme is yet to settle for usermode runtime.
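
The toolchain gating described above comes down to asking the compiler whether it accepts a given flag. A minimal sketch of such a probe follows; the function name `cc_accepts_flags` is hypothetical, this is not the kernel's own Kconfig check, and it assumes a gcc-compatible driver that understands `-Werror`, `-c` and `-x c -`. Note that, per the v22 note, a compiler can accept a flag without implementing the feature behind it (the `-march=zicfiss` case on gcc-13), so a probe like this only shows acceptance.

```shell
#!/bin/sh
# Hypothetical probe (not the kernel's Kconfig implementation): succeed
# only if compiler CC accepts all of the given flags.
cc_accepts_flags() {
	cc="$1"
	shift
	out=$(mktemp) || return 1
	# -Werror turns "unknown option" warnings into hard failures.
	# "-x c -" makes the compiler read a trivial C file from stdin.
	echo 'int probe;' | "$cc" -Werror "$@" -c -x c - -o "$out" 2>/dev/null
	status=$?
	rm -f "$out"
	return "$status"
}

# CROSS_COMPILE comes from the environment; with it unset this probes the
# host gcc, which may give a different answer than a riscv toolchain.
CC="${CROSS_COMPILE:-}gcc"
if cc_accepts_flags "$CC" -fcf-protection=full; then
	echo "$CC accepts -fcf-protection=full"
else
	echo "$CC rejects -fcf-protection=full (or is not installed)"
fi
```

Compiling a throwaway translation unit, rather than parsing `--help` output, is the design choice here: it exercises the same driver path a real build would use.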

To get more information on kernel interactions with respect to
zicfilp and zicfiss, the patch series adds documentation for
`zicfilp` and `zicfiss` in the following:

Documentation/arch/riscv/zicfiss.rst
Documentation/arch/riscv/zicfilp.rst

How to test this series
=======================

Toolchain
---------
$ git clone git@github.com:sifive/riscv-gnu-toolchain.git -b cfi-dev
$ riscv-gnu-toolchain/configure --prefix=<path-to-where-to-build> --with-arch=rv64gc_zicfilp_zicfiss --enable-linux --disable-gdb --with-extra-multilib-test="rv64gc_zicfilp_zicfiss-lp64d:-static"
$ make -j$(nproc)

Qemu
----
Get the latest qemu
$ cd qemu
$ mkdir build
$ cd build
$ ../configure --target-list=riscv64-softmmu
$ make -j$(nproc)

Opensbi
-------
$ git clone git@github.com:deepak0414/opensbi.git -b v6_cfi_spec_split_opensbi
$ make CROSS_COMPILE=<your riscv toolchain> -j$(nproc) PLATFORM=generic

Linux
-----
Running defconfig is fine. CFI is enabled by default if the toolchain
supports it.

$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) defconfig
$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc)

Running
-------

Modify your qemu command to have:
-bios <path-to-cfi-opensbi>/build/platform/generic/firmware/fw_dynamic.bin
-cpu rv64,zicfilp=true,zicfiss=true,zimop=true,zcmop=true

References
==========
[1] - https://github.com/riscv/riscv-cfi
[2] - https://lore.kernel.org/all/20240814081126.956287-1-samuel.holland@sifive.c…
[3] - https://lwn.net/Articles/889475/
[4] - https://developer.arm.com/documentation/109576/0100/Branch-Target-Identific…
[5] - https://www.intel.com/content/dam/develop/external/us/en/documents/catc17-i…
[6] - https://lwn.net/Articles/940403/

To: Thomas Gleixner <tglx(a)linutronix.de>
To: Ingo Molnar <mingo(a)redhat.com>
To: Borislav Petkov <bp(a)alien8.de>
To: Dave Hansen <dave.hansen(a)linux.intel.com>
To: x86(a)kernel.org
To: H. Peter Anvin <hpa(a)zytor.com>
To: Andrew Morton <akpm(a)linux-foundation.org>
To: Liam R. Howlett <Liam.Howlett(a)oracle.com>
To: Vlastimil Babka <vbabka(a)suse.cz>
To: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
To: Paul Walmsley <paul.walmsley(a)sifive.com>
To: Palmer Dabbelt <palmer(a)dabbelt.com>
To: Albert Ou <aou(a)eecs.berkeley.edu>
To: Conor Dooley <conor(a)kernel.org>
To: Rob Herring <robh(a)kernel.org>
To: Krzysztof Kozlowski <krzk+dt(a)kernel.org>
To: Arnd Bergmann <arnd(a)arndb.de>
To: Christian Brauner <brauner(a)kernel.org>
To: Peter Zijlstra <peterz(a)infradead.org>
To: Oleg Nesterov <oleg(a)redhat.com>
To: Eric Biederman <ebiederm(a)xmission.com>
To: Kees Cook <kees(a)kernel.org>
To: Jonathan Corbet <corbet(a)lwn.net>
To: Shuah Khan <shuah(a)kernel.org>
To: Jann Horn <jannh(a)google.com>
To: Conor Dooley <conor+dt(a)kernel.org>
To: Miguel Ojeda <ojeda(a)kernel.org>
To: Alex Gaynor <alex.gaynor(a)gmail.com>
To: Boqun Feng <boqun.feng(a)gmail.com>
To: Gary Guo <gary(a)garyguo.net>
To: Björn Roy Baron <bjorn3_gh(a)protonmail.com>
To: Benno Lossin <benno.lossin(a)proton.me>
To: Andreas Hindborg <a.hindborg(a)kernel.org>
To: Alice Ryhl <aliceryhl(a)google.com>
To: Trevor Gross <tmgross(a)umich.edu>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-fsdevel(a)vger.kernel.org
Cc: linux-mm(a)kvack.org
Cc: linux-riscv(a)lists.infradead.org
Cc: devicetree(a)vger.kernel.org
Cc: linux-arch(a)vger.kernel.org
Cc: linux-doc(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: alistair.francis(a)wdc.com
Cc: richard.henderson(a)linaro.org
Cc: jim.shu(a)sifive.com
Cc: andybnac(a)gmail.com
Cc: kito.cheng(a)sifive.com
Cc: charlie(a)rivosinc.com
Cc: atishp(a)rivosinc.com
Cc: evan(a)rivosinc.com
Cc: cleger(a)rivosinc.com
Cc: alexghiti(a)rivosinc.com
Cc: samitolvanen(a)google.com
Cc: broonie(a)kernel.org
Cc: rick.p.edgecombe(a)intel.com
Cc: rust-for-linux(a)vger.kernel.org

changelog
---------
v22:
- CONFIG_RISCV_USER_CFI was by default "n".
  With dual vdso support it is default "y" (if toolchain supports it).
  Fixing build error due to "-march=zicfiss" being picked in gcc-13
  partially. gcc-13 only recognizes the flag but does not actually do any
  codegen or recognize instructions for zicfiss. Change in v22 makes
  dependence on `-fcf-protection=full` compiler flag to ensure that
  toolchain has support and then only CONFIG_RISCV_USER_CFI will be
  visible in menuconfig.
- picked up tags and some cosmetic changes in commit message for dual vdso
  patch.

v21:
- Fixing build errors due to changes in arch/riscv/include/asm/vdso.h
  Using #ifdef instead of IS_ENABLED in arch/riscv/include/asm/vdso.h
  vdso-cfi-offsets.h should be included only when CONFIG_RISCV_USER_CFI
  is selected.

v20:
- rebased on v6.18-rc1.
- Added two vDSO support. If `CONFIG_RISCV_USER_CFI` is selected
  two vDSOs are compiled (one for hardware prior to RVA23 and one
  for RVA23 onwards). Kernel exposes RVA23 vDSO if hardware/cpu
  implements zimop else exposes existing vDSO to userspace.
- default selection for `CONFIG_RISCV_USER_CFI` is "Yes".
- replaced "__ASSEMBLY__" with "__ASSEMBLER__"

v19:
- riscv_nousercfi was `int`. changed it to unsigned long.
  Thanks to Alex Ghiti for reporting it. It was a bug.
- ELP is cleared on trap entry only when CONFIG_64BIT.
- restore ssp back on return to usermode was being done
  before `riscv_v_context_nesting_end` on trap exit path.
  If kernel shadow stack were enabled this would result in
  kernel operating on user shadow stack and panic (as I found
  in my testing of kcfi patch series). So fixed that.

v18:
- rebased on 6.16-rc1
- uprobe handling clears ELP in sstatus image in pt_regs
- vdso was missing shadow stack elf note for object files.
  added that. Additional asm file for vdso needed the elf marker
  flag. toolchain should complain if `-fcf-protection=full` and
  marker is missing for object generated from asm file. Asked
  toolchain folks to fix this. Although no reason to gate the merge
  on that.
- Split up compile options for march and fcf-protection in vdso
  Makefile
- CONFIG_RISCV_USER_CFI option is moved under "Kernel features" menu
  Added `arch/riscv/configs/hardening.config` fragment which selects
  CONFIG_RISCV_USER_CFI

v17:
- fixed warnings due to empty macros in usercfi.h (reported by alexg)
- fixed prefixes in commit titles reported by alexg
- took below uprobe with fcfi v2 patch from Zong Li and squashed it with
  "riscv/traps: Introduce software check exception and uprobe handling"
  https://lore.kernel.org/all/20250604093403.10916-1-zong.li@sifive.com/

v16:
- If FWFT is not implemented or returns error for shadow stack activation,
  then no_usercfi is set to disable shadow stack. Although this should be
  picked up by extension validation and activation. Fixed this bug for
  zicfilp and zicfiss both. Thanks to Charlie Jenkins for reporting this.
- If toolchain doesn't support cfi, cfi kselftest shouldn't build.
  Suggested by Charlie Jenkins.
- Default for CONFIG_RISCV_USER_CFI is set to no. Charlie/Atish suggested
  to keep it off till we have more hardware availability with RVA23
  profile and zimop/zcmop implemented. Else this will start breaking
  people's workflow
- Includes the fix if "!RV64 and !SBI" then definitions for FWFT in
  asm-offsets.c error.

v15:
- Toolchain has been updated to include `-fcf-protection` flag. This
  exists for x86 as well. Updated kernel patches to compile vDSO and
  selftest to compile with `fcf-protection=full` flag.
- selecting CONFIG_RISCV_USERCFI selects CONFIG_RISCV_SBI.
- Patch to enable shadow stack for kernel wasn't hidden behind
  CONFIG_RISCV_USERCFI and CONFIG_RISCV_SBI. fixed that.

v14:
- rebased on top of palmer/sbi-v3. Thus dropped Clement's FWFT patches
  Updated RISCV_ISA_EXT_XXXX in hwcap and hwprobe constants.
- Took Radim's suggestions on bitfields.
- Placed cfi_state at the end of thread_info block so that current situation
  is not disturbed with respect to member fields of thread_info in single
  cacheline.

v13:
- cpu_supports_shadow_stack/cpu_supports_indirect_br_lp_instr uses
  riscv_has_extension_unlikely()
- uses nops(count) to create nop slide
- RISCV_ACQUIRE_BARRIER is not needed in `amo_user_shstk`. Removed it
- changed ternaries to simply use implicit casting to convert to bool.
- kernel command line allows to disable zicfilp and zicfiss independently.
  updated kernel-parameters.txt.
- ptrace user abi for cfi uses bitmasks instead of bitfields. Added ptrace
  kselftest.
- cosmetic and grammatical changes to documentation.

v12:
- It seems like I had accidentally squashed arch agnostic indirect branch
  tracking prctl and riscv implementation of those prctls. Split them again.
- set_shstk_status/set_indir_lp_status perform CSR writes only when CPU
  support is available. As suggested by Zong Li.
- Some minor clean up in kselftests as suggested by Zong Li.

v11:
- patch "arch/riscv: compile vdso with landing pad" was unconditionally
  selecting `_zicfilp` for vDSO compile. fixed that. Changed `lpad 1` to
  `lpad 0`.

v10:
- dropped "mm: helper `is_shadow_stack_vma` to check shadow stack vma". This
  patch is not that interesting to this patch series for risc-v. There are
  instances in arch directories where VM_SHADOW_STACK flag is anyways used.
  Dropping this patch to expedite merging in riscv tree.
- Took suggestions from `Clement` on "riscv: zicfiss / zicfilp enumeration" to
  validate presence of cfi based on config.
- Added a patch for vDSO to have `lpad 0`. I had omitted this earlier to make
  sure we add single vdso object with cfi enabled. But a vdso object with
  scheme of zero labeled landing pad is least common denominator and should
  work with all objects of zero labeled as well as function-signature labeled
  objects.

v9:
- rebased on master (39a803b754d5 fix braino in "9p: fix ->rename_sem exclusion")
- dropped "mm: Introduce ARCH_HAS_USER_SHADOW_STACK" (master has it from
  arm64/gcs)
- dropped "prctl: arch-agnostic prctl for shadow stack" (master has it from
  arm64/gcs)

v8:
- rebased on palmer/for-next
- dropped samuel holland's `envcfg` context switch patches.
  they are in palmer/for-next

v7:
- Removed "riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv"
  Instead using `deactivate_mm` flow to clean up.
  see here for more context
  https://lore.kernel.org/all/20230908203655.543765-1-rick.p.edgecombe@intel.…
- Changed the header include in `kselftest`. Hopefully this fixes compile
  issue faced by Zong Li at SiFive.
- Cleaned up an orphaned change to `mm/mmap.c` in below patch
  "riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE"
- Lock interfaces for shadow stack and indirect branch tracking expect arg == 0
  Any future evolution of this interface should accordingly define how arg
  should be setup.
- `mm/map.c` has an instance of using `VM_SHADOW_STACK`. Fixed it to use helper
  `is_shadow_stack_vma`.
- Link to v6: https://lore.kernel.org/r/20241008-v5_user_cfi_series-v6-0-60d9fe073f37@riv…

v6:
- Picked up Samuel Holland's changes as is with `envcfg` placed in
  `thread` instead of `thread_info`
- fixed unaligned newline escapes in kselftest
- cleaned up messages in kselftest and included test output in commit message
- fixed a bug in clone path reported by Zong Li
- fixed a build issue if CONFIG_RISCV_ISA_V is not selected
  (this was introduced due to re-factoring signal context
  management code)

v5:
- rebased on v6.12-rc1
- Fixed schema related issues in device tree file
- Fixed some of the documentation related issues in zicfilp/ss.rst
  (style issues and added index)
- added `SHADOW_STACK_SET_MARKER` so that implementation can define base
  of shadow stack.
- Fixed warnings on definitions added in usercfi.h when
  CONFIG_RISCV_USER_CFI is not selected.
- Adopted context header based signal handling as proposed by Andy Chiu
- Added support for enabling kernel mode access to shadow stack using
  FWFT
  (https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-firmware…)
- Link to v5: https://lore.kernel.org/r/20241001-v5_user_cfi_series-v1-0-3ba65b6e550f@riv…
  (Note: I had an issue in my workflow due to which version number wasn't
  picked up correctly while sending out patches)

v4:
- rebased on 6.11-rc6
- envcfg: Converged with Samuel Holland's patches for envcfg management on
  per-thread basis.
- vma_is_shadow_stack is renamed to is_vma_shadow_stack
- picked up Mark Brown's `ARCH_HAS_USER_SHADOW_STACK` patch
- signal context: using extended context management to maintain
  compatibility.
- fixed `-Wmissing-prototypes` compiler warnings for prctl functions
- Documentation fixes and amending typos.
- Link to v4: https://lore.kernel.org/all/20240912231650.3740732-1-debug@rivosinc.com/

v3:
- envcfg logic to pick up base envcfg had a bug where `ENVCFG_CBZE` could
  have been picked on per task basis, even though CPU didn't implement it.
  Fixed in this series.
- dt-bindings
  As suggested, split into separate commit. fixed the messaging that spec
  is in public review
- arch_is_shadow_stack change
  arch_is_shadow_stack changed to vma_is_shadow_stack
- hwprobe
  zicfiss / zicfilp if present will get enumerated in hwprobe
- selftests
  As suggested, added object and binary filenames to .gitignore
  Selftest binary anyways need to be compiled with cfi enabled compiler
  which will make sure that landing pad and shadow stack are enabled. Thus
  removed separate enable/disable tests. Cleaned up tests a bit.
- Link to v3: https://lore.kernel.org/lkml/20240403234054.2020347-1-debug@rivosinc.com/

v2:
- Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow
  integrity for user mode programs can be compiled in the kernel.
- Enabling of control flow integrity for user programs is left to user runtime
- This patch series introduces arch agnostic `prctls` to enable shadow stack
  and indirect branch tracking. And implements them on riscv.

---
Changes in v22:
- Link to v21: https://lore.kernel.org/r/20251015-v5_user_cfi_series-v21-0-6a07856e90e7@ri…

Changes in v21:
- Link to v20: https://lore.kernel.org/r/20251013-v5_user_cfi_series-v20-0-b9de4be9912e@ri…

Changes in v20:
- Link to v19: https://lore.kernel.org/r/20250731-v5_user_cfi_series-v19-0-09b468d7beab@ri…

Changes in v19:
- Link to v18: https://lore.kernel.org/r/20250711-v5_user_cfi_series-v18-0-a8ee62f9f38e@ri…

Changes in v18:
- Link to v17: https://lore.kernel.org/r/20250604-v5_user_cfi_series-v17-0-4565c2cf869f@ri…

Changes in v17:
- Link to v16: https://lore.kernel.org/r/20250522-v5_user_cfi_series-v16-0-64f61a35eee7@ri…

Changes in v16:
- Link to v15: https://lore.kernel.org/r/20250502-v5_user_cfi_series-v15-0-914966471885@ri…

Changes in v15:
- changelog posted just below cover letter
- Link to v14: https://lore.kernel.org/r/20250429-v5_user_cfi_series-v14-0-5239410d012a@ri…

Changes in v14:
- changelog posted just below cover letter
- Link to v13: https://lore.kernel.org/r/20250424-v5_user_cfi_series-v13-0-971437de586a@ri…

Changes in v13:
- changelog posted just below cover letter
- Link to v12: https://lore.kernel.org/r/20250314-v5_user_cfi_series-v12-0-e51202b53138@ri…

Changes in v12:
- changelog posted just below cover letter
- Link to v11: https://lore.kernel.org/r/20250310-v5_user_cfi_series-v11-0-86b36cbfb910@ri…

Changes in v11:
- changelog posted just below cover letter
- Link to v10: https://lore.kernel.org/r/20250210-v5_user_cfi_series-v10-0-163dcfa31c60@ri…

---
Andy Chiu (1):
      riscv: signal: abstract header saving for setup_sigcontext

Deepak Gupta (26):
      mm: VM_SHADOW_STACK definition for riscv
      dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml)
      riscv: zicfiss / zicfilp enumeration
      riscv: zicfiss / zicfilp extension csr and bit definitions
      riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit
      riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE
      riscv/mm: manufacture shadow stack pte
      riscv/mm: teach pte_mkwrite to manufacture shadow stack PTEs
      riscv/mm: write protect and shadow stack
      riscv/mm: Implement map_shadow_stack() syscall
      riscv/shstk: If needed allocate a new shadow stack on clone
      riscv: Implements arch agnostic shadow stack prctls
      prctl: arch-agnostic prctl for indirect branch tracking
      riscv: Implements arch agnostic indirect branch tracking prctls
      riscv/traps: Introduce software check exception and uprobe handling
      riscv/signal: save and restore of shadow stack for signal
      riscv/kernel: update __show_regs to print shadow stack register
      riscv/ptrace: riscv cfi status and state via ptrace and in core files
      riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe
      riscv: kernel command line option to opt out of user cfi
      riscv: enable kernel access to shadow stack memory via FWFT sbi call
      arch/riscv: dual vdso creation logic and select vdso based on hw
      riscv: create a config for shadow stack and landing pad instr support
      riscv: Documentation for landing pad / indirect branch tracking
      riscv: Documentation for shadow stack on riscv
      kselftest/riscv: kselftest for user mode cfi

Jim Shu (1):
      arch/riscv: compile vdso with landing pad and shadow stack note

 Documentation/admin-guide/kernel-parameters.txt    |   8 +
 Documentation/arch/riscv/index.rst                 |   2 +
 Documentation/arch/riscv/zicfilp.rst               | 115 +++++
 Documentation/arch/riscv/zicfiss.rst               | 179 +++++++
 .../devicetree/bindings/riscv/extensions.yaml      |  14 +
 arch/riscv/Kconfig                                 |  22 +
 arch/riscv/Makefile                                |   8 +-
 arch/riscv/configs/hardening.config                |   4 +
 arch/riscv/include/asm/asm-prototypes.h            |   1 +
 arch/riscv/include/asm/assembler.h                 |  44 ++
 arch/riscv/include/asm/cpufeature.h                |  12 +
 arch/riscv/include/asm/csr.h                       |  16 +
 arch/riscv/include/asm/entry-common.h              |   2 +
 arch/riscv/include/asm/hwcap.h                     |   2 +
 arch/riscv/include/asm/mman.h                      |  26 +
 arch/riscv/include/asm/mmu_context.h               |   7 +
 arch/riscv/include/asm/pgtable.h                   |  30 +-
 arch/riscv/include/asm/processor.h                 |   1 +
 arch/riscv/include/asm/thread_info.h               |   3 +
 arch/riscv/include/asm/usercfi.h                   |  95 ++++
 arch/riscv/include/asm/vdso.h                      |  13 +-
 arch/riscv/include/asm/vector.h                    |   3 +
 arch/riscv/include/uapi/asm/hwprobe.h              |   2 +
 arch/riscv/include/uapi/asm/ptrace.h               |  34 ++
 arch/riscv/include/uapi/asm/sigcontext.h           |   1 +
 arch/riscv/kernel/Makefile                         |   2 +
 arch/riscv/kernel/asm-offsets.c                    |  10 +
 arch/riscv/kernel/cpufeature.c                     |  27 +
 arch/riscv/kernel/entry.S                          |  38 ++
 arch/riscv/kernel/head.S                           |  27 +
 arch/riscv/kernel/process.c                        |  27 +-
 arch/riscv/kernel/ptrace.c                         |  95 ++++
 arch/riscv/kernel/signal.c                         | 148 +++++-
 arch/riscv/kernel/sys_hwprobe.c                    |   2 +
 arch/riscv/kernel/sys_riscv.c                      |  10 +
 arch/riscv/kernel/traps.c                          |  54 ++
 arch/riscv/kernel/usercfi.c                        | 545 +++++++++++++++++++++
 arch/riscv/kernel/vdso.c                           |   7 +
 arch/riscv/kernel/vdso/Makefile                    |  40 +-
 arch/riscv/kernel/vdso/flush_icache.S              |   4 +
 arch/riscv/kernel/vdso/gen_vdso_offsets.sh         |   4 +-
 arch/riscv/kernel/vdso/getcpu.S                    |   4 +
 arch/riscv/kernel/vdso/note.S                      |   3 +
 arch/riscv/kernel/vdso/rt_sigreturn.S              |   4 +
 arch/riscv/kernel/vdso/sys_hwprobe.S               |   4 +
 arch/riscv/kernel/vdso/vgetrandom-chacha.S         |   5 +-
 arch/riscv/kernel/vdso_cfi/Makefile                |  25 +
 arch/riscv/kernel/vdso_cfi/vdso-cfi.S              |  11 +
 arch/riscv/mm/init.c                               |   2 +-
 arch/riscv/mm/pgtable.c                            |  16 +
 include/linux/cpu.h                                |   4 +
 include/linux/mm.h                                 |   7 +
 include/uapi/linux/elf.h                           |   2 +
 include/uapi/linux/prctl.h                         |  27 +
 kernel/sys.c                                       |  30 ++
 tools/testing/selftests/riscv/Makefile             |   2 +-
 tools/testing/selftests/riscv/cfi/.gitignore       |   3 +
 tools/testing/selftests/riscv/cfi/Makefile         |  16 +
 tools/testing/selftests/riscv/cfi/cfi_rv_test.h    |  82 ++++
 tools/testing/selftests/riscv/cfi/riscv_cfi_test.c | 173 +++++++
 tools/testing/selftests/riscv/cfi/shadowstack.c    | 385 +++++++++++++++
 tools/testing/selftests/riscv/cfi/shadowstack.h    |  27 +
 62 files changed, 2475 insertions(+), 41 deletions(-)
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
change-id: 20240930-v5_user_cfi_series-3dc332f8f5b2
--
- debug


[PATCH bpf-next v7 00/15] selftests/bpf: Integrate test_xsk.c to test_progs framework

by Bastien Curutchet (eBPF Foundation)

Hi all,

The test_xsk.sh script covers many AF_XDP use cases. The tests it runs are defined in xskxceiver.c. Since this script is used to test real hardware, the goal here is to leave it as it is, and only integrate the tests that run on veth peers into the test_progs framework.

PATCH 1 extracts test_xsk[.c/.h] from xskxceiver[.c/.h] to make the tests available to test_progs.
PATCH 2 to 7 fix small issues in the current test.
PATCH 8 to 13 handle all errors to release resources instead of calling exit() when any error occurs.
PATCH 14 isolates the tests that won't fit in the CI.
PATCH 15 integrates the CI tests to the test_progs framework.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
---
Changes in v7:
- Restore 'test_ns' prefix to allow parallel execution.
- PATCH 11: fix potential uninitialized variable spotted by AI.
- PATCH 12: fix potential resource leak spotted by AI.
- Link to v6: https://lore.kernel.org/r/20251029-xsk-v6-0-5a63a64dff98@bootlin.com

Changes in v6:
- Setup veth peer once for each mode instead of once for each subtest
- Rename the 'flaky' table 'skip-ci' table and move the automatically skipped and the longest tests into it
- Link to v5: https://lore.kernel.org/r/20251016-xsk-v5-0-662c95eb8005@bootlin.com

Changes in v5:
- Rebase on latest bpf-next_base
- Move XDP_ADJUST_TAIL_SHRINK_MULTI_BUFF to the flaky table
- Add Maciej's reviewed-by
- Link to v4: https://lore.kernel.org/r/20250924-xsk-v4-0-20e57537b876@bootlin.com

Changes in v4:
- Fix test_xsk.sh's summary report.
- Merge PATCH 11 & 12 together, otherwise PATCH 11 fails to build.
- Split old PATCH 3 in two patches. The first one fixes testapp_stats_rx_dropped(), the second one fixes testapp_xdp_shared_umem(). The unnecessary frees (in testapp_stats_rx_full() and testapp_stats_fill_empty()) are removed.
- Link to v3: https://lore.kernel.org/r/20250904-xsk-v3-0-ce382e331485@bootlin.com

Changes in v3:
- Rebase on latest bpf-next_base to integrate commit c9110e6f7237 ("selftests/bpf: Fix count write in testapp_xdp_metadata_copy()").
- Move XDP_METADATA_COPY_* tests from flaky-tests to nominal tests
- Link to v2: https://lore.kernel.org/r/20250902-xsk-v2-0-17c6345d5215@bootlin.com

Changes in v2:
- Rebase on the latest bpf-next_base and integrate the newly added tests to the work (adjust_tail* and tx_queue_consumer tests)
- Re-order patches to split xskxceiver sooner.
- Fix the bug reported by Maciej.
- Fix verbose mode in test_xsk.sh by keeping kselftest (remove PATCH 1, 7 and 8)
- Link to v1: https://lore.kernel.org/r/20250313-xsk-v1-0-7374729a93b9@bootlin.com

---
Bastien Curutchet (eBPF Foundation) (15):
  selftests/bpf: test_xsk: Split xskxceiver
  selftests/bpf: test_xsk: Initialize bitmap before use
  selftests/bpf: test_xsk: Fix __testapp_validate_traffic()'s return value
  selftests/bpf: test_xsk: fix memory leak in testapp_stats_rx_dropped()
  selftests/bpf: test_xsk: fix memory leak in testapp_xdp_shared_umem()
  selftests/bpf: test_xsk: Wrap test clean-up in functions
  selftests/bpf: test_xsk: Release resources when swap fails
  selftests/bpf: test_xsk: Add return value to init_iface()
  selftests/bpf: test_xsk: Don't exit immediately when xsk_attach fails
  selftests/bpf: test_xsk: Don't exit immediately when gettimeofday fails
  selftests/bpf: test_xsk: Don't exit immediately when workers fail
  selftests/bpf: test_xsk: Don't exit immediately if validate_traffic fails
  selftests/bpf: test_xsk: Don't exit immediately on allocation failures
  selftests/bpf: test_xsk: Isolate non-CI tests
  selftests/bpf: test_xsk: Integrate test_xsk.c to test_progs framework

 tools/testing/selftests/bpf/Makefile | 11 +-
 tools/testing/selftests/bpf/prog_tests/test_xsk.c | 2596 ++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/test_xsk.h | 298 +++
 tools/testing/selftests/bpf/prog_tests/xsk.c | 151 ++
 tools/testing/selftests/bpf/xskxceiver.c | 2696 +--------------------
 tools/testing/selftests/bpf/xskxceiver.h | 156 --
 6 files changed, 3184 insertions(+), 2724 deletions(-)
---
base-commit: 1e2d874b04ba46a3b9fe6697097aa437641f4339
change-id: 20250218-xsk-0cf90e975d14

Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>


[PATCH v7 00/15] Consolidate iommu page table implementations (AMD)

by Jason Gunthorpe

[Kevin has done a great job to get through reviews on all these, and Vasant/Ankit have been looking at it on AMD systems, I think we are close to being done now!]

Currently each of the iommu page table formats duplicates all of the logic to maintain the page table and perform map/unmap/etc operations. There are several different versions of the algorithms between all the different formats. The io-pgtable system provides an interface to help isolate the page table code from the iommu driver, but doesn't provide tools to implement the common algorithms.

This makes it very hard to improve the state of the pagetable code under the iommu domains as any proposed improvement needs to alter a large number of different driver code paths. Combined with a lack of software based testing this makes improvement in this area very hard.

iommufd wants several new page table operations:
- More efficient map/unmap operations, using iommufd's batching logic
- unmap that returns the physical addresses into a batch as it progresses
- cut that allows splitting areas so large pages can have holes poked in them dynamically (i.e. guestmemfd hitless shared/private transitions)
- More aggressive freeing of table memory to avoid waste
- Fragmenting large pages so that dirty tracking can be more granular
- Reassembling large pages so that VMs can run at full IO performance in migration/dirty tracking error flows
- KHO integration for kernel live upgrade

Together these are algorithmically complex enough to be a very significant task to go and implement in all the page table formats we support. Just the "server" focused drivers use almost all the formats (ARMv8 S1&S2 / x86 PAE / AMDv1 / VT-d SS / RISCV)

Instead of doing the duplicated work, this series takes the first step to consolidate the algorithms into one place. In spirit it is similar to the work Christoph did a few years back to pull the redundant get_user_pages() implementations out of the arch code into core MM. This unlocked a great deal of improvement in that space in the following years. I would like to see the same benefit in iommu as well.

My first RFC showed a bigger picture with almost all formats and more algorithms. This series reorganizes that to be narrowly focused on just enough to convert the AMD driver to use the new mechanism.

kunit tests are provided that allow good testing of the algorithms and all formats on x86, nothing is arch specific.

AMD is one of the simpler options as the HW is quite uniform with few different options/bugs while still requiring the complicated contiguous pages support. The HW also has a very simple range based invalidation approach that is easy to implement.

The AMD v1 and AMD v2 page table formats are implemented bit for bit identical to the current code, tested using a compare kunit test that checks against the io-pgtable version (on github, see below).

Updating the AMD driver to replace the io-pgtable layer with the new stuff is fairly straightforward now. The layering is fixed up in the new version so that all the invalidation goes through function pointers.

Several small fixing patches have come out of this as I've been fixing the problems that the test suite uncovers in the current code, and implementing the fixed version in iommupt.

On performance, there is quite a wide variety of implementation designs across all the drivers.
Looking at some key performance across the main formats:

iommu_map():
     pgsz , avg new,old ns , min new,old ns , min % (+ve is better)
     2^12 ,  53,66         ,  51,63         ,  19.19  (AMDv1)
 256*2^12 , 386,1909       , 367,1795       ,  79.79
 256*2^21 , 362,1633       , 355,1556       ,  77.77
     2^12 ,  56,62         ,  52,59         ,  11.11  (AMDv2)
 256*2^12 , 405,1355       , 357,1292       ,  72.72
 256*2^21 , 393,1160       , 358,1114       ,  67.67
     2^12 ,  55,65         ,  53,62         ,  14.14  (VT-d second stage)
 256*2^12 , 391,518        , 332,512        ,  35.35
 256*2^21 , 383,635        , 336,624        ,  46.46
     2^12 ,  57,65         ,  55,63         ,  12.12  (ARM 64 bit)
 256*2^12 , 380,389        , 361,369        ,   2.02
 256*2^21 , 358,419        , 345,400        ,  13.13

iommu_unmap():
     pgsz , avg new,old ns , min new,old ns , min % (+ve is better)
     2^12 ,  69,88         ,  65,85         ,  23.23  (AMDv1)
 256*2^12 , 353,6498       , 331,6029       ,  94.94
 256*2^21 , 373,6014       , 360,5706       ,  93.93
     2^12 ,  71,72         ,  66,69         ,   4.04  (AMDv2)
 256*2^12 , 228,891        , 206,871        ,  76.76
 256*2^21 , 254,721        , 245,711        ,  65.65
     2^12 ,  69,87         ,  65,82         ,  20.20  (VT-d second stage)
 256*2^12 , 210,321        , 200,315        ,  36.36
 256*2^21 , 255,349        , 238,342        ,  30.30
     2^12 ,  72,77         ,  68,74         ,   8.08  (ARM 64 bit)
 256*2^12 , 521,357        , 447,346        , -29.29
 256*2^21 , 489,358        , 433,345        , -25.25

* Above numbers include additional patches to remove the iommu_pgsize() overheads. gcc 13.3.0, i7-12700

This version provides fairly consistent performance across formats. ARM unmap performance is quite different because this version supports contiguous pages and uses a very different algorithm for unmapping. Though I haven't yet figured out why it is so much worse compared to AMDv1.

The per-format commits include a more detailed chart.
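The "min %" column can be cross-checked by hand. The sketch below assumes (this is inferred from the posted numbers, it is not stated in the cover letter) that the column is the relative improvement of the new minimum over the old one, (old - new) / old, truncated to a whole percent. `pct_improvement` is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the series: whole-percent improvement
# of the new timing over the old one. Positive means the new code is faster.
# Assumed formula: (old - new) * 100 / old, using integer arithmetic.
pct_improvement() {
	old="$1"
	new="$2"
	echo $(( (old - new) * 100 / old ))
}

# AMDv1 iommu_map(), pgsz 2^12: min new = 51 ns, min old = 63 ns
pct_improvement 63 51      # prints 19

# ARM 64 bit iommu_unmap(), pgsz 256*2^12: min new = 447 ns, min old = 346 ns
pct_improvement 346 447    # prints -29
```

For these two rows the whole-percent part agrees with the posted column; the digits after the decimal point are kept in the table exactly as posted.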
There is a second branch:
   https://github.com/jgunthorpe/linux/commits/iommu_pt_all
Containing supporting work and future steps:
- ARM short descriptor (32 bit), ARM long descriptor (64 bit) formats
- RISCV format and RISCV conversion
   https://github.com/jgunthorpe/linux/commits/iommu_pt_riscv
- Support for a DMA incoherent HW page table walker
- VT-d second stage format and VT-d conversion
   https://github.com/jgunthorpe/linux/commits/iommu_pt_vtd
- DART v1 & v2 format
- Draft of an iommufd 'cut' operation to break down huge pages
- A compare test that checks the iommupt formats against the iopgtable interface, including updating AMD to have a working iopgtable and patches to make VT-d have an iopgtable for testing.
- A performance test to micro-benchmark map and unmap against iopgtable

My strategy is to go one by one for the drivers:
- AMD driver conversion
- RISCV page table and driver
- Intel VT-d driver and VTDSS page table
- Flushing improvements for RISCV
- ARM SMMUv3

And concurrently work on the algorithm side:
- debugfs content dump, like VT-d has
- Cut support
- Increase/Decrease page size support
- map/unmap batching
- KHO

As we make more algorithm improvements the value to convert the drivers increases.
This is on github: https://github.com/jgunthorpe/linux/commits/iommu_pt

v7:
 - Rebase to v6.18-rc2
 - Improve comments and documentation
 - Add a few missed __sme_sets() for AMD CC
 - Rename
    pt_iommu_flush_ops -> pt_iommu_driver_ops
    VT-D -> VT-d
    pt_clear_entry -> pt_clear_entries
    pt_entry_write_is_dirty -> pt_entry_is_write_dirty
    pt_entry_set_write_clean -> pt_entry_make_write_clean
 - Tidy some of the map flow into a new function do_map()
 - Fix ffz64()
v6: https://patch.msgid.link/r/0-v6-0fb54a1d9850+36b-iommu_pt_jgg@nvidia.com
 - Improve comments and documentation
 - Rename
    pt_entry_oa_full -> pt_entry_oa_exact
    pt_has_system_page -> pt_has_system_page_size
    pt_max_output_address_lg2 -> pt_max_oa_lg2
    log2_f*() -> vaf* / oaf* / f*_t
    pt_item_fully_covered -> pt_entry_fully_covered
 - Fix missed constant propagation causing division
 - Consolidate debugging checks to pt_check_install_leaf_args()
 - Change collect->ignore_mapped to check_mapped
 - Shuffle some hunks around to more appropriate patches
 - Two new mini kunit tests
v5: https://patch.msgid.link/r/0-v5-116c4948af3d+68091-iommu_pt_jgg@nvidia.com
 - Text grammar updates and kdoc fixes
v4: https://patch.msgid.link/r/0-v4-0d6a6726a372+18959-iommu_pt_jgg@nvidia.com
 - Rebase on v6.16-rc3
 - Integrate the HATS/HATDis changes
 - Remove 'default n' from kconfig
 - Remove unused 'PT_FIXED_TOP_LEVEL'
 - Improve comments and documentation
 - Fix some compile warnings from kbuild robots
v3: https://patch.msgid.link/r/0-v3-a93aab628dbc+521-iommu_pt_jgg@nvidia.com
 - Rebase on v6.16-rc2
 - s/PT_ENTRY_WORD_SIZE/PT_ITEM_WORD_SIZE/s to follow the language better
 - Comment and documentation updates
 - Add PT_TOP_PHYS_MASK to help manage alignment restrictions on the top pointer
 - Add missed force_aperture = true
 - Make pt_iommu_deinit() take care of the not-yet-inited error case internally as AMD/RISCV/VTD all shared this logic
 - Change gather_range() into gather_range_pages() so it also deals with the page list. This makes the following cache flushing series simpler
 - Fix missed update of unmap->unmapped in some error cases
 - Change clear_contig() to order the gather more logically
 - Remove goto from the error handling in __map_range_leaf()
 - s/log2_/oalog2_/ in places where the argument is an oaddr_t
 - Pass the pts to pt_table_install64/32()
 - Do not use SIGN_EXTEND for the AMDv2 page table because of Vasant's information on how PASID 0 works.
v2: https://patch.msgid.link/r/0-v2-5c26bde5c22d+58b-iommu_pt_jgg@nvidia.com
 - AMD driver only, many code changes
RFC: https://lore.kernel.org/all/0-v1-01fa10580981+1d-iommu_pt_jgg@nvidia.com/

Cc: Michael Roth <michael.roth(a)amd.com>
Cc: Alexey Kardashevskiy <aik(a)amd.com>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: James Gowans <jgowans(a)amazon.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>

Alejandro Jimenez (1):
  iommu/amd: Use the generic iommu page table

Jason Gunthorpe (14):
  genpt: Generic Page Table base API
  genpt: Add Documentation/ files
  iommupt: Add the basic structure of the iommu implementation
  iommupt: Add the AMD IOMMU v1 page table format
  iommupt: Add iova_to_phys op
  iommupt: Add unmap_pages op
  iommupt: Add map_pages op
  iommupt: Add read_and_clear_dirty op
  iommupt: Add a kunit test for Generic Page Table
  iommupt: Add a mock pagetable format for iommufd selftest to use
  iommufd: Change the selftest to use iommupt instead of xarray
  iommupt: Add the x86 64 bit page table format
  iommu/amd: Remove AMD io_pgtable support
  iommupt: Add a kunit test for the IOMMU implementation

 .clang-format | 1 +
 Documentation/driver-api/generic_pt.rst | 142 ++
 Documentation/driver-api/index.rst | 1 +
 drivers/iommu/Kconfig | 2 +
 drivers/iommu/Makefile | 1 +
 drivers/iommu/amd/Kconfig | 5 +-
 drivers/iommu/amd/Makefile | 2 +-
 drivers/iommu/amd/amd_iommu.h | 1 -
 drivers/iommu/amd/amd_iommu_types.h | 110 +-
 drivers/iommu/amd/io_pgtable.c | 577 --------
 drivers/iommu/amd/io_pgtable_v2.c | 370 ------
 drivers/iommu/amd/iommu.c | 538 ++++----
 drivers/iommu/generic_pt/.kunitconfig | 13 +
 drivers/iommu/generic_pt/Kconfig | 68 +
 drivers/iommu/generic_pt/fmt/Makefile | 26 +
 drivers/iommu/generic_pt/fmt/amdv1.h | 415 ++++++
 drivers/iommu/generic_pt/fmt/defs_amdv1.h | 21 +
 drivers/iommu/generic_pt/fmt/defs_x86_64.h | 21 +
 drivers/iommu/generic_pt/fmt/iommu_amdv1.c | 15 +
 drivers/iommu/generic_pt/fmt/iommu_mock.c | 10 +
 drivers/iommu/generic_pt/fmt/iommu_template.h | 48 +
 drivers/iommu/generic_pt/fmt/iommu_x86_64.c | 11 +
 drivers/iommu/generic_pt/fmt/x86_64.h | 259 ++++
 drivers/iommu/generic_pt/iommu_pt.h | 1162 +++++++++++++++++
 drivers/iommu/generic_pt/kunit_generic_pt.h | 713 ++++++++++
 drivers/iommu/generic_pt/kunit_iommu.h | 183 +++
 drivers/iommu/generic_pt/kunit_iommu_pt.h | 487 +++++++
 drivers/iommu/generic_pt/pt_common.h | 358 +++++
 drivers/iommu/generic_pt/pt_defs.h | 329 +++++
 drivers/iommu/generic_pt/pt_fmt_defaults.h | 233 ++++
 drivers/iommu/generic_pt/pt_iter.h | 636 +++++++++
 drivers/iommu/generic_pt/pt_log2.h | 122 ++
 drivers/iommu/io-pgtable.c | 4 -
 drivers/iommu/iommufd/Kconfig | 1 +
 drivers/iommu/iommufd/iommufd_test.h | 11 +-
 drivers/iommu/iommufd/selftest.c | 438 +++----
 include/linux/generic_pt/common.h | 167 +++
 include/linux/generic_pt/iommu.h | 271 ++++
 include/linux/io-pgtable.h | 2 -
 include/linux/irqchip/riscv-imsic.h | 3 +-
 tools/testing/selftests/iommu/iommufd.c | 60 +-
 tools/testing/selftests/iommu/iommufd_utils.h | 12 +
 42 files changed, 6237 insertions(+), 1612 deletions(-)
 create mode 100644 Documentation/driver-api/generic_pt.rst
 delete mode 100644 drivers/iommu/amd/io_pgtable.c
 delete mode 100644 drivers/iommu/amd/io_pgtable_v2.c
 create mode 100644 drivers/iommu/generic_pt/.kunitconfig
 create mode 100644 drivers/iommu/generic_pt/Kconfig
 create mode 100644 drivers/iommu/generic_pt/fmt/Makefile
 create mode 100644 drivers/iommu/generic_pt/fmt/amdv1.h
 create mode 100644 drivers/iommu/generic_pt/fmt/defs_amdv1.h
 create mode 100644 drivers/iommu/generic_pt/fmt/defs_x86_64.h
 create mode 100644 drivers/iommu/generic_pt/fmt/iommu_amdv1.c
 create mode 100644 drivers/iommu/generic_pt/fmt/iommu_mock.c
 create mode 100644 drivers/iommu/generic_pt/fmt/iommu_template.h
 create mode 100644 drivers/iommu/generic_pt/fmt/iommu_x86_64.c
 create mode 100644 drivers/iommu/generic_pt/fmt/x86_64.h
 create mode 100644 drivers/iommu/generic_pt/iommu_pt.h
 create mode 100644 drivers/iommu/generic_pt/kunit_generic_pt.h
 create mode 100644 drivers/iommu/generic_pt/kunit_iommu.h
 create mode 100644 drivers/iommu/generic_pt/kunit_iommu_pt.h
 create mode 100644 drivers/iommu/generic_pt/pt_common.h
 create mode 100644 drivers/iommu/generic_pt/pt_defs.h
 create mode 100644 drivers/iommu/generic_pt/pt_fmt_defaults.h
 create mode 100644 drivers/iommu/generic_pt/pt_iter.h
 create mode 100644 drivers/iommu/generic_pt/pt_log2.h
 create mode 100644 include/linux/generic_pt/common.h
 create mode 100644 include/linux/generic_pt/iommu.h

base-commit: bf3db0366052dcdf7dea89a07929b690aac59b15
--
2.43.0


[PATCH v3] selftests/run_kselftest.sh: exit with error if tests fail

by Brendan Jackman

Parsing KTAP is quite an inconvenience, but most of the time the thing you really want to know is "did anything fail"?

Let's give the user this information without them needing to parse anything.

Because of the use of subshells and namespaces, this needs to be communicated via a file. Just write arbitrary data into the file and treat non-empty content as a signal that something failed.

In case any user depends on the current behaviour, such as running this from a script with `set -e` and parsing the result for failures afterwards, add a flag they can set to get the old behaviour, namely --no-error-on-fail.

Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
Changes in v3:
- Fixed quoting
- Link to v2: https://lore.kernel.org/r/20251014-b4-ksft-error-on-fail-v2-1-b3e2657237b8@…

Changes in v2:
- Fixed bug in report_failure()
- Made error-on-fail the default
- Link to v1: https://lore.kernel.org/r/20251007-b4-ksft-error-on-fail-v1-1-71bf058f5662@…
---
 tools/testing/selftests/kselftest/runner.sh | 14 ++++++++++----
 tools/testing/selftests/run_kselftest.sh | 14 ++++++++++++++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
index 2c3c58e65a419f5ee8d7dc51a37671237a07fa0b..3a62039fa6217f3453423ff011575d0a1eb8c275 100644
--- a/tools/testing/selftests/kselftest/runner.sh
+++ b/tools/testing/selftests/kselftest/runner.sh
@@ -44,6 +44,12 @@ tap_timeout()
 	fi
 }
 
+report_failure()
+{
+	echo "not ok $*"
+	echo "$*" >> "$kselftest_failures_file"
+}
+
 run_one()
 {
 	DIR="$1"
@@ -105,7 +111,7 @@ run_one()
 	echo "# $TEST_HDR_MSG"
 	if [ ! -e "$TEST" ]; then
 		echo "# Warning: file $TEST is missing!"
-		echo "not ok $test_num $TEST_HDR_MSG"
+		report_failure "$test_num $TEST_HDR_MSG"
 	else
 		if [ -x /usr/bin/stdbuf ]; then
 			stdbuf="/usr/bin/stdbuf --output=L "
@@ -123,7 +129,7 @@ run_one()
 				interpreter=$(head -n 1 "$TEST" | cut -c 3-)
 				cmd="$stdbuf $interpreter ./$BASENAME_TEST"
 			else
-				echo "not ok $test_num $TEST_HDR_MSG"
+				report_failure "$test_num $TEST_HDR_MSG"
 				return
 			fi
 		fi
@@ -137,9 +143,9 @@ run_one()
 			echo "ok $test_num $TEST_HDR_MSG # SKIP"
 		elif [ $rc -eq $timeout_rc ]; then \
 			echo "#"
-			echo "not ok $test_num $TEST_HDR_MSG # TIMEOUT $kselftest_timeout seconds"
+			report_failure "$test_num $TEST_HDR_MSG # TIMEOUT $kselftest_timeout seconds"
 		else
-			echo "not ok $test_num $TEST_HDR_MSG # exit=$rc"
+			report_failure "$test_num $TEST_HDR_MSG # exit=$rc"
 		fi)
 		cd - >/dev/null
 	fi
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index 0443beacf3621ae36cb12ffd57f696ddef3526b5..d4be97498b32e975c63a1167d3060bdeba674c8c 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -33,6 +33,7 @@ Usage: $0 [OPTIONS]
   -c | --collection COLLECTION  Run all tests from COLLECTION
   -l | --list                   List the available collection:test entries
   -d | --dry-run                Don't actually run any tests
+  -f | --no-error-on-fail       Don't exit with an error just because tests failed
   -n | --netns                  Run each test in namespace
   -h | --help                   Show this usage info
   -o | --override-timeout       Number of seconds after which we timeout
@@ -44,6 +45,7 @@ COLLECTIONS=""
 TESTS=""
 dryrun=""
 kselftest_override_timeout=""
+ERROR_ON_FAIL=true
 while true; do
 	case "$1" in
 		-s | --summary)
@@ -65,6 +67,9 @@ while true; do
 		-d | --dry-run)
 			dryrun="echo"
 			shift ;;
+		-f | --no-error-on-fail)
+			ERROR_ON_FAIL=false
+			shift ;;
 		-n | --netns)
 			RUN_IN_NETNS=1
 			shift ;;
@@ -105,9 +110,18 @@ if [ -n "$TESTS" ]; then
 	available="$(echo "$valid" | sed -e 's/ /\n/g')"
 fi
 
+kselftest_failures_file="$(mktemp --tmpdir kselftest-failures-XXXXXX)"
+export kselftest_failures_file
+
 collections=$(echo "$available" | cut -d: -f1 | sort | uniq)
 for collection in $collections ; do
 	[ -w /dev/kmsg ] && echo "kselftest: Running tests in $collection" >> /dev/kmsg
 	tests=$(echo "$available" | grep "^$collection:" | cut -d: -f2)
 	($dryrun cd "$collection" && $dryrun run_many $tests)
 done
+
+failures="$(cat "$kselftest_failures_file")"
+rm "$kselftest_failures_file"
+if "$ERROR_ON_FAIL" && [ "$failures" ]; then
+	exit 1
+fi
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20251007-b4-ksft-error-on-fail-0c2cb3246041

Best regards,
--
Brendan Jackman <jackmanb(a)google.com>
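The reason a file is needed can be seen in isolation: a variable set inside a `( ... )` subshell is lost when the subshell exits, while a line appended to a file survives. The following is a minimal standalone sketch of that idea, not the patch itself; `failures_file`, `report_failure_demo`, `run_group` and `status` are hypothetical names chosen for this illustration, and the "tests" are just the `true` and `false` commands.

```shell
#!/bin/sh
# Sketch only: propagate "something failed" out of subshells via a file.
# All names below are hypothetical and not taken from the kselftest scripts.

failures_file="$(mktemp)"
export failures_file

# Print a KTAP-style failure line and record the failure in the shared file.
report_failure_demo() {
	echo "not ok $*"
	echo "$*" >> "$failures_file"
}

# Run each argument as a command inside a subshell. A variable set in
# here would not be visible to the caller, so failures go to the file.
run_group() {
	(
		for t in "$@"; do
			if sh -c "$t" >/dev/null 2>&1; then
				echo "ok $t"
			else
				report_failure_demo "$t"
			fi
		done
	)
}

run_group true false
run_group true

# Non-empty content means at least one test failed.
failures="$(cat "$failures_file")"
rm -f "$failures_file"
if [ -n "$failures" ]; then
	status=1
else
	status=0
fi
echo "would exit with status $status"
```

A real runner would end with `exit "$status"`; the sketch only prints the value so it can be sourced or extended without terminating the calling shell.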


[PATCH 0/6] KVM: LoongArch: selftests: Add timer test case

by Bibo Mao

This patch set adds a timer test case for LoongArch systems, based on the common arch_timer test case. It covers the time counter function, one-shot/period mode interrupts, and the software emulated timer function.

Bibo Mao (6):
  KVM: LoongArch: selftests: Add system registers save and restore on exception
  KVM: LoongArch: selftests: Add exception handler register interface
  KVM: LoongArch: selftests: Add basic interfaces
  KVM: LoongArch: selftests: Add timer test case with one-shot mode
  KVM: LoongArch: selftests: Add period mode timer and time counter test
  KVM: LoongArch: selftests: Add SW emulated timer test

 tools/testing/selftests/kvm/Makefile.kvm | 10 +-
 .../kvm/include/loongarch/arch_timer.h | 84 ++++++++
 .../kvm/include/loongarch/processor.h | 81 +++++++-
 .../selftests/kvm/lib/loongarch/exception.S | 6 +
 .../selftests/kvm/lib/loongarch/processor.c | 38 +++-
 .../selftests/kvm/loongarch/arch_timer.c | 187 ++++++++++++++++++
 6 files changed, 400 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/loongarch/arch_timer.h
 create mode 100644 tools/testing/selftests/kvm/loongarch/arch_timer.c

base-commit: e53642b87a4f4b03a8d7e5f8507fc3cd0c595ea6
--
2.39.3


[GIT PULL] kselftest fixes update for Linux 6.18-rc4

by Shuah Khan

Hi Linus,

Please pull the following kselftest fixes update for Linux 6.18-rc4.

Fixes build warning in cachestat found during clang build and adds tmpshmcstat to .gitignore.

diff is attached.

thanks,
-- Shuah

----------------------------------------------------------------
The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-fixes-6.18-rc4

for you to fetch changes up to 920aa3a7705a061cb3004572d8b7932b54463dbf:

  selftests: cachestat: Fix warning on declaration under label (2025-10-22 09:23:18 -0600)

----------------------------------------------------------------
linux_kselftest-fixes-6.18-rc4

Fixes build warning in cachestat found during clang build and adds tmpshmcstat to .gitignore.

----------------------------------------------------------------
Madhur Kumar (1):
      selftests/cachestat: add tmpshmcstat file to .gitignore

Sidharth Seela (1):
      selftests: cachestat: Fix warning on declaration under label

 tools/testing/selftests/cachestat/.gitignore | 1 +
 tools/testing/selftests/cachestat/test_cachestat.c | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------


[GIT PULL] kunit fixes update for Linux 6.18-rc4

by Shuah Khan

Hi Linus,

Please pull the following kunit fixes update for Linux 6.18-rc4.

Fixes log overwrite in param_tests and fixes incorrect cast of priv pointer in test_dev_action(). Updates email address for Rae Moar in MAINTAINERS KUnit entry.

diff is attached.

thanks,
-- Shuah

----------------------------------------------------------------
The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-fixes-6.18-rc4

for you to fetch changes up to f3903ec76ae6afcdba0347681d1dda005fb145cd:

  MAINTAINERS: Update KUnit email address for Rae Moar (2025-10-29 14:57:54 -0600)

----------------------------------------------------------------
linux_kselftest-kunit-fixes-6.18-rc4

Fixes log overwrite in param_tests and fixes incorrect cast of priv pointer in test_dev_action(). Updates email address for Rae Moar in MAINTAINERS KUnit entry.

----------------------------------------------------------------
Carlos Llamas (1):
      kunit: prevent log overwrite in param_tests

Florian Schmaus (1):
      kunit: test_dev_action: Correctly cast 'priv' pointer to long*

Rae Moar (1):
      MAINTAINERS: Update KUnit email address for Rae Moar

 .mailmap | 1 +
 MAINTAINERS | 2 +-
 lib/kunit/kunit-test.c | 2 +-
 lib/kunit/test.c | 3 ++-
 4 files changed, 5 insertions(+), 3 deletions(-)
----------------------------------------------------------------


[PATCH bpf 0/2] use rqspinlock for bpf lru map

by Menglong Dong

Convert the raw_spinlock to rqspinlock to fix the possible deadlock in [1] for bpf lru map. Meanwhile, add the testcase for the deadlock.

Link: https://lore.kernel.org/bpf/CAEf4BzbTJCUx0D=zjx6+5m5iiGhwLzaP94hnw36ZMDHAf4…

Menglong Dong (2):
  bpf: use rqspinlock for lru map
  selftests/bpf: test map deadlock caused by NMI

 kernel/bpf/bpf_lru_list.c | 47 +++---
 kernel/bpf/bpf_lru_list.h | 5 +-
 .../selftests/bpf/prog_tests/map_deadlock.c | 134 ++++++++++++++++++
 .../selftests/bpf/progs/map_deadlock.c | 52 +++++++
 4 files changed, 217 insertions(+), 21 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_deadlock.c
 create mode 100644 tools/testing/selftests/bpf/progs/map_deadlock.c

--
2.51.2

