- Linux-kselftest-mirror - lists.linaro.org

[PATCHv4 net-next 0/3] bonding: support aggregator selection based on port priority

by Hangbin Liu

This patchset introduces a new per-port bonding option: `ad_actor_port_prio`. It allows users to configure the actor's port priority, which can then be used by the bonding driver for aggregator selection based on port priority. This provides finer control over LACP aggregator choice, especially in setups with multiple eligible aggregators over 2 switches. v4: a) fix actor_port_prio minimal value (Jay Vosburgh) b) fix ad_agg_selection_test comment order (Paolo Abeni) c) restruct selftest, reduce duplication (Paolo Abeni) v3: a) add comments when init slave port_priority (Jonas Gorski) b) rename ad_lacp_port_prio to lacp_port_prio (Jay Vosburgh) v2: a) set default bond option value for port priority (Nikolay Aleksandrov) b) fix __agg_ports_priority coding style (Nikolay Aleksandrov) c) fix shellcheck warns Hangbin Liu (3): bonding: add support for per-port LACP actor priority bonding: support aggregator selection based on port priority selftests: bonding: add test for LACP actor port priority Documentation/networking/bonding.rst | 18 ++- drivers/net/bonding/bond_3ad.c | 31 +++++ drivers/net/bonding/bond_netlink.c | 16 +++ drivers/net/bonding/bond_options.c | 37 ++++++ include/net/bond_3ad.h | 2 + include/net/bond_options.h | 1 + include/uapi/linux/if_link.h | 1 + .../selftests/drivers/net/bonding/Makefile | 3 +- .../drivers/net/bonding/bond_lacp_prio.sh | 107 ++++++++++++++++++ tools/testing/selftests/net/forwarding/lib.sh | 24 ---- tools/testing/selftests/net/lib.sh | 24 ++++ 11 files changed, 238 insertions(+), 26 deletions(-) create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_lacp_prio.sh -- 2.50.1

3 months, 3 weeks

2
6
0 0

[bpf-next v2 2/2] selftests/bpf: Add tests for bpf_strnstr

by Rong Tao

From: Rong Tao <rongtao(a)cestc.cn> Add two tests for bpf_strnstr(): bpf_strnstr("", "", 0) = 0 bpf_strnstr("hello world", "hello", 5) = 0 Signed-off-by: Rong Tao <rongtao(a)cestc.cn> --- tools/testing/selftests/bpf/progs/string_kfuncs_success.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/string_kfuncs_success.c b/tools/testing/selftests/bpf/progs/string_kfuncs_success.c index 46697f381878..f8fe14787b2e 100644 --- a/tools/testing/selftests/bpf/progs/string_kfuncs_success.c +++ b/tools/testing/selftests/bpf/progs/string_kfuncs_success.c @@ -30,6 +30,8 @@ __test(2) int test_strcspn(void *ctx) { return bpf_strcspn(str, "lo"); } __test(6) int test_strstr_found(void *ctx) { return bpf_strstr(str, "world"); } __test(-ENOENT) int test_strstr_notfound(void *ctx) { return bpf_strstr(str, "hi"); } __test(0) int test_strstr_empty(void *ctx) { return bpf_strstr(str, ""); } +__test(0) int test_strnstr_found(void *ctx) { return bpf_strnstr("", "", 0); } +__test(0) int test_strnstr_found(void *ctx) { return bpf_strnstr(str, "hello", 5); } __test(0) int test_strnstr_found(void *ctx) { return bpf_strnstr(str, "hello", 6); } __test(-ENOENT) int test_strnstr_notfound(void *ctx) { return bpf_strnstr(str, "hi", 10); } __test(0) int test_strnstr_empty(void *ctx) { return bpf_strnstr(str, "", 1); } -- 2.51.0

3 months, 3 weeks

1
0
0 0

[bpf-next v2 1/2] bpf/helpers: bpf_strnstr: Exact match length

by Rong Tao

From: Rong Tao <rongtao(a)cestc.cn> strnstr should not treat the ending '\0' of s2 as a matching character, otherwise the parameter 'len' will be meaningless, for example: 1. bpf_strnstr("openat", "open", 4) = -ENOENT 2. bpf_strnstr("openat", "open", 5) = 0 This patch makes (1) return 0, indicating a successful match. Signed-off-by: Rong Tao <rongtao(a)cestc.cn> --- kernel/bpf/helpers.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 401b4932cc49..ced7132980fe 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -3672,10 +3672,12 @@ __bpf_kfunc int bpf_strnstr(const char *s1__ign, const char *s2__ign, size_t len guard(pagefault)(); for (i = 0; i < XATTR_SIZE_MAX; i++) { - for (j = 0; i + j < len && j < XATTR_SIZE_MAX; j++) { + for (j = 0; i + j <= len && j < XATTR_SIZE_MAX; j++) { __get_kernel_nofault(&c2, s2__ign + j, char, err_out); if (c2 == '\0') return i; + if (i + j == len) + break; __get_kernel_nofault(&c1, s1__ign + j, char, err_out); if (c1 == '\0') return -ENOENT; -- 2.51.0

3 months, 3 weeks

1
0
0 0

[bpf-next v2 0/2] bpf/helpers: Fix bpf_strnstr len error

by Rong Tao

From: Rong Tao <rongtao(a)cestc.cn> Fix bpf_strnstr() wrong 'len' parameter, bpf_strnstr("open", "open", 4) should return 0 instead of -ENOENT. Rong Tao (2): bpf/helpers: bpf_strnstr: Exact match length selftests/bpf: Add tests for bpf_strnstr kernel/bpf/helpers.c | 4 +++- tools/testing/selftests/bpf/progs/string_kfuncs_success.c | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-) --- v2: Follow Andrii Nakryiko's advise, fix the 'wrong fix'; v1: https://lore.kernel.org/lkml/tencent_65E5988AD52BEC280D22964189505CD6ED06@q… -- 2.51.0

3 months, 3 weeks

1
0
0 0

[PATCH bpf-next v16 0/2] libbpf: fix USDT SIB argument handling causing unrecognized register error

by Jiawei Zhao

When using GCC on x86-64 to compile an usdt prog with -O1 or higher optimization, the compiler will generate SIB addressing mode for global array, e.g. "1@-96(%rbp,%rax,8)". The current USDT implementation in libbpf cannot parse these two formats, causing `bpf_program__attach_usdt()` to fail with -ENOENT (unrecognized register). This patch series adds support for SIB addressing mode in USDT probes. The main changes include: - add correct handling logic for SIB-addressed arguments in `parse_usdt_arg`. - add an usdt_o2 test case to cover SIB addressing mode. Testing shows that the SIB probe correctly generates 8@(%rcx,%rax,8) argument spec and passes all validation checks. The modification history of this patch series: Change since v1: - refactor the code to make it more readable - modify the commit message to explain why and how Change since v2: - fix the `scale` uninitialized error Change since v3: - force -O2 optimization for usdt.test.o to generate SIB addressing usdt and pass all test cases. Change since v4: - split the patch into two parts, one for the fix and the other for the test Change since v5: - Only enable optimization for x86 architecture to generate SIB addressing usdt argument spec. Change since v6: - Add an usdt_o2 test case to cover SIB addressing mode. - Reinstate the usdt.c test case. Change since v7: - Refactor modifications to __bpf_usdt_arg_spec to avoid increasing its size, achieving better compatibility - Fix some minor code style issues - Refactor the usdt_o2 test case, removing semaphore and adding GCC attribute to force -O2 optimization Change since v8: - Refactor the usdt_o2 test case, using assembly to force SIB addressing mode. Change since v9: - Only enable the usdt_o2 test case on x86_64 and i386 architectures since the SIB addressing mode is only supported on x86_64 and i386. Change since v10: - Replace `__attribute__((optimize("O2")))` with `#pragma GCC optimize("O1")` to fix the issue where the optimized compilation condition works improperly. - Renamed test case usdt_o2 and relevant files name to usdt_o1 in that O1 level optimization is enough to generate SIB addressing usdt argument spec. Change since v11: - Replace `STAP_PROBE1` with `STAP_PROBE_ASM` - Use bit fields instead of bit shifting operations - Merge the usdt_o1 test case into the usdt test case Change since v12: - This patch is same with the v12 but with a new version number. Change since v13(resolve some review comments): - https://lore.kernel.org/bpf/CAEf4BzZWd2zUC=U6uGJFF3EMZ7zWGLweQAG3CJWTeHy-5y… - https://lore.kernel.org/bpf/CAEf4Bzbs3hV_Q47+d93tTX13WkrpkpOb4=U04mZCjHyZg4… Change since v14: - fix a typo in __bpf_usdt_arg_spec Change since v15(resolve some review comments): - https://lore.kernel.org/bpf/CAEf4BzaxuYijEfQMDFZ+CQdjxLuDZiesUXNA-SiopS+5+V… - https://lore.kernel.org/bpf/CAEf4BzaHi5kpuJ6OVvDU62LT5g0qHbWYMfb_FBQ3iuvvUF… - https://lore.kernel.org/bpf/d438bf3a-a9c9-4d34-b814-63f2e9bb3a85@linux.dev/ Jiawei Zhao (2): libbpf: fix USDT SIB argument handling causing unrecognized register error selftests/bpf: Enrich subtest_basic_usdt case in selftests to cover SIB handling logic tools/lib/bpf/usdt.bpf.h | 44 +++++++++- tools/lib/bpf/usdt.c | 69 +++++++++++++-- tools/testing/selftests/bpf/prog_tests/usdt.c | 84 ++++++++++++++++++- tools/testing/selftests/bpf/progs/test_usdt.c | 31 +++++++ 4 files changed, 219 insertions(+), 9 deletions(-) -- 2.43.0

3 months, 3 weeks

2
4
0 0

[PATCH v2] selftests/bpf: Fix typos and grammar in test sources

by slopixelz＠gmail.com

From: Shubham Sharma <slopixelz(a)gmail.com> Fixed the spelling typo and checked other BPF selftests sources for similar typos. Follow-up to patch series 990629 v2:Instead of sending multiple tiny patches for minor comment fixes, combined them into a single pass across the affected files. Signed-off-by: Shubham Sharma <slopixelz(a)gmail.com> --- tools/testing/selftests/bpf/Makefile | 2 +- tools/testing/selftests/bpf/bench.c | 2 +- tools/testing/selftests/bpf/prog_tests/btf_dump.c | 2 +- tools/testing/selftests/bpf/prog_tests/fd_array.c | 2 +- .../testing/selftests/bpf/prog_tests/kprobe_multi_test.c | 2 +- tools/testing/selftests/bpf/prog_tests/module_attach.c | 2 +- tools/testing/selftests/bpf/prog_tests/reg_bounds.c | 4 ++-- .../selftests/bpf/prog_tests/stacktrace_build_id.c | 2 +- .../selftests/bpf/prog_tests/stacktrace_build_id_nmi.c | 2 +- tools/testing/selftests/bpf/prog_tests/stacktrace_map.c | 2 +- .../selftests/bpf/prog_tests/stacktrace_map_raw_tp.c | 2 +- .../selftests/bpf/prog_tests/stacktrace_map_skip.c | 2 +- tools/testing/selftests/bpf/progs/bpf_cc_cubic.c | 2 +- tools/testing/selftests/bpf/progs/bpf_dctcp.c | 2 +- .../selftests/bpf/progs/freplace_connect_v4_prog.c | 2 +- tools/testing/selftests/bpf/progs/iters_state_safety.c | 2 +- tools/testing/selftests/bpf/progs/rbtree_search.c | 2 +- .../testing/selftests/bpf/progs/struct_ops_kptr_return.c | 2 +- tools/testing/selftests/bpf/progs/struct_ops_refcounted.c | 2 +- tools/testing/selftests/bpf/progs/test_cls_redirect.c | 2 +- .../selftests/bpf/progs/test_cls_redirect_dynptr.c | 2 +- tools/testing/selftests/bpf/progs/uretprobe_stack.c | 4 ++-- tools/testing/selftests/bpf/progs/verifier_scalar_ids.c | 2 +- tools/testing/selftests/bpf/progs/verifier_var_off.c | 6 +++--- tools/testing/selftests/bpf/test_sockmap.c | 2 +- tools/testing/selftests/bpf/verifier/calls.c | 8 ++++---- tools/testing/selftests/bpf/xdping.c | 2 +- tools/testing/selftests/bpf/xsk.h | 4 ++-- 28 files changed, 36 insertions(+), 36 deletions(-) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 4863106034df..de0418f7a661 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -398,7 +398,7 @@ $(HOST_BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers endif -# vmlinux.h is first dumped to a temprorary file and then compared to +# vmlinux.h is first dumped to a temporary file and then compared to # the previous version. This helps to avoid unnecessary re-builds of # $(TRUNNER_BPF_OBJS) $(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR) diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c index ddd73d06a1eb..3ecc226ea7b2 100644 --- a/tools/testing/selftests/bpf/bench.c +++ b/tools/testing/selftests/bpf/bench.c @@ -499,7 +499,7 @@ extern const struct bench bench_rename_rawtp; extern const struct bench bench_rename_fentry; extern const struct bench bench_rename_fexit; -/* pure counting benchmarks to establish theoretical lmits */ +/* pure counting benchmarks to establish theoretical limits */ extern const struct bench bench_trig_usermode_count; extern const struct bench bench_trig_syscall_count; extern const struct bench bench_trig_kernel_count; diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dump.c b/tools/testing/selftests/bpf/prog_tests/btf_dump.c index 82903585c870..10cba526d3e6 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_dump.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_dump.c @@ -63,7 +63,7 @@ static int test_btf_dump_case(int n, struct btf_dump_test_case *t) /* tests with t->known_ptr_sz have no "long" or "unsigned long" type, * so it's impossible to determine correct pointer size; but if they - * do, it should be 8 regardless of host architecture, becaues BPF + * do, it should be 8 regardless of host architecture, because BPF * target is always 64-bit */ if (!t->known_ptr_sz) { diff --git a/tools/testing/selftests/bpf/prog_tests/fd_array.c b/tools/testing/selftests/bpf/prog_tests/fd_array.c index 241b2c8c6e0f..c534b4d5f9da 100644 --- a/tools/testing/selftests/bpf/prog_tests/fd_array.c +++ b/tools/testing/selftests/bpf/prog_tests/fd_array.c @@ -293,7 +293,7 @@ static int get_btf_id_by_fd(int btf_fd, __u32 *id) * 1) Create a new btf, it's referenced only by a file descriptor, so refcnt=1 * 2) Load a BPF prog with fd_array[0] = btf_fd; now btf's refcnt=2 * 3) Close the btf_fd, now refcnt=1 - * Wait and check that BTF stil exists. + * Wait and check that BTF still exists. */ static void check_fd_array_cnt__referenced_btfs(void) { diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c index e19ef509ebf8..f377bea0b82d 100644 --- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c +++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c @@ -463,7 +463,7 @@ static bool skip_entry(char *name) return false; } -/* Do comparision by ignoring '.llvm.<hash>' suffixes. */ +/* Do comparison by ignoring '.llvm.<hash>' suffixes. */ static int compare_name(const char *name1, const char *name2) { const char *res1, *res2; diff --git a/tools/testing/selftests/bpf/prog_tests/module_attach.c b/tools/testing/selftests/bpf/prog_tests/module_attach.c index 6d391d95f96e..70fa7ae93173 100644 --- a/tools/testing/selftests/bpf/prog_tests/module_attach.c +++ b/tools/testing/selftests/bpf/prog_tests/module_attach.c @@ -90,7 +90,7 @@ void test_module_attach(void) test_module_attach__detach(skel); - /* attach fentry/fexit and make sure it get's module reference */ + /* attach fentry/fexit and make sure it gets module reference */ link = bpf_program__attach(skel->progs.handle_fentry); if (!ASSERT_OK_PTR(link, "attach_fentry")) goto cleanup; diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c index e261b0e872db..d93a0c7b1786 100644 --- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c +++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c @@ -623,7 +623,7 @@ static void range_cond(enum num_t t, struct range x, struct range y, *newx = range(t, x.a, x.b); *newy = range(t, y.a + 1, y.b); } else if (x.a == x.b && x.b == y.b) { - /* X is a constant matching rigth side of Y */ + /* X is a constant matching right side of Y */ *newx = range(t, x.a, x.b); *newy = range(t, y.a, y.b - 1); } else if (y.a == y.b && x.a == y.a) { @@ -631,7 +631,7 @@ static void range_cond(enum num_t t, struct range x, struct range y, *newx = range(t, x.a + 1, x.b); *newy = range(t, y.a, y.b); } else if (y.a == y.b && x.b == y.b) { - /* Y is a constant matching rigth side of X */ + /* Y is a constant matching right side of X */ *newx = range(t, x.a, x.b - 1); *newy = range(t, y.a, y.b); } else { diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c index b7ba5cd47d96..271b5cc9fc01 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c @@ -39,7 +39,7 @@ void test_stacktrace_build_id(void) bpf_map_update_elem(control_map_fd, &key, &val, 0); /* for every element in stackid_hmap, we can find a corresponding one - * in stackmap, and vise versa. + * in stackmap, and vice versa. */ err = compare_map_keys(stackid_hmap_fd, stackmap_fd); if (CHECK(err, "compare_map_keys stackid_hmap vs. stackmap", diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c index 0832fd787457..b277dddd5af7 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c @@ -66,7 +66,7 @@ void test_stacktrace_build_id_nmi(void) bpf_map_update_elem(control_map_fd, &key, &val, 0); /* for every element in stackid_hmap, we can find a corresponding one - * in stackmap, and vise versa. + * in stackmap, and vice versa. */ err = compare_map_keys(stackid_hmap_fd, stackmap_fd); if (CHECK(err, "compare_map_keys stackid_hmap vs. stackmap", diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c index df59e4ae2951..84a7e405e912 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c @@ -50,7 +50,7 @@ void test_stacktrace_map(void) bpf_map_update_elem(control_map_fd, &key, &val, 0); /* for every element in stackid_hmap, we can find a corresponding one - * in stackmap, and vise versa. + * in stackmap, and vice versa. */ err = compare_map_keys(stackid_hmap_fd, stackmap_fd); if (CHECK(err, "compare_map_keys stackid_hmap vs. stackmap", diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c index c6ef06f55cdb..e0cb4697b4b3 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c @@ -46,7 +46,7 @@ void test_stacktrace_map_raw_tp(void) bpf_map_update_elem(control_map_fd, &key, &val, 0); /* for every element in stackid_hmap, we can find a corresponding one - * in stackmap, and vise versa. + * in stackmap, and vice versa. */ err = compare_map_keys(stackid_hmap_fd, stackmap_fd); if (CHECK(err, "compare_map_keys stackid_hmap vs. stackmap", diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_skip.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_skip.c index 1932b1e0685c..dc2ccf6a14d1 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_skip.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_skip.c @@ -40,7 +40,7 @@ void test_stacktrace_map_skip(void) skel->bss->control = 1; /* for every element in stackid_hmap, we can find a corresponding one - * in stackmap, and vise versa. + * in stackmap, and vice versa. */ err = compare_map_keys(stackid_hmap_fd, stackmap_fd); if (!ASSERT_OK(err, "compare_map_keys stackid_hmap vs. stackmap")) diff --git a/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c index 1654a530aa3d..4e51785e7606 100644 --- a/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c +++ b/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c @@ -101,7 +101,7 @@ static void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, tp->snd_cwnd = pkts_in_flight + sndcnt; } -/* Decide wheather to run the increase function of congestion control. */ +/* Decide whether to run the increase function of congestion control. */ static bool tcp_may_raise_cwnd(const struct sock *sk, const int flag) { if (tcp_sk(sk)->reordering > TCP_REORDERING) diff --git a/tools/testing/selftests/bpf/progs/bpf_dctcp.c b/tools/testing/selftests/bpf/progs/bpf_dctcp.c index 7cd73e75f52a..32c511bcd60b 100644 --- a/tools/testing/selftests/bpf/progs/bpf_dctcp.c +++ b/tools/testing/selftests/bpf/progs/bpf_dctcp.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2019 Facebook */ -/* WARNING: This implemenation is not necessarily the same +/* WARNING: This implementation is not necessarily the same * as the tcp_dctcp.c. The purpose is mainly for testing * the kernel BPF logic. */ diff --git a/tools/testing/selftests/bpf/progs/freplace_connect_v4_prog.c b/tools/testing/selftests/bpf/progs/freplace_connect_v4_prog.c index 544e5ac90461..d09bbd8ae8a8 100644 --- a/tools/testing/selftests/bpf/progs/freplace_connect_v4_prog.c +++ b/tools/testing/selftests/bpf/progs/freplace_connect_v4_prog.c @@ -12,7 +12,7 @@ SEC("freplace/connect_v4_prog") int new_connect_v4_prog(struct bpf_sock_addr *ctx) { - // return value thats in invalid range + // return value that's in invalid range return 255; } diff --git a/tools/testing/selftests/bpf/progs/iters_state_safety.c b/tools/testing/selftests/bpf/progs/iters_state_safety.c index f41257eadbb2..b381ac0c736c 100644 --- a/tools/testing/selftests/bpf/progs/iters_state_safety.c +++ b/tools/testing/selftests/bpf/progs/iters_state_safety.c @@ -345,7 +345,7 @@ int __naked read_from_iter_slot_fail(void) "r3 = 1000;" "call %[bpf_iter_num_new];" - /* attemp to leak bpf_iter_num state */ + /* attempt to leak bpf_iter_num state */ "r7 = *(u64 *)(r6 + 0);" "r8 = *(u64 *)(r6 + 8);" diff --git a/tools/testing/selftests/bpf/progs/rbtree_search.c b/tools/testing/selftests/bpf/progs/rbtree_search.c index 098ef970fac1..b05565d1db0d 100644 --- a/tools/testing/selftests/bpf/progs/rbtree_search.c +++ b/tools/testing/selftests/bpf/progs/rbtree_search.c @@ -183,7 +183,7 @@ long test_##op##_spinlock_##dolock(void *ctx) \ } /* - * Use a spearate MSG macro instead of passing to TEST_XXX(..., MSG) + * Use a separate MSG macro instead of passing to TEST_XXX(..., MSG) * to ensure the message itself is not in the bpf prog lineinfo * which the verifier includes in its log. * Otherwise, the test_loader will incorrectly match the prog lineinfo diff --git a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c index 36386b3c23a1..2b98b7710816 100644 --- a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c +++ b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c @@ -9,7 +9,7 @@ void bpf_task_release(struct task_struct *p) __ksym; /* This test struct_ops BPF programs returning referenced kptr. The verifier should * allow a referenced kptr or a NULL pointer to be returned. A referenced kptr to task - * here is acquried automatically as the task argument is tagged with "__ref". + * here is acquired automatically as the task argument is tagged with "__ref". */ SEC("struct_ops/test_return_ref_kptr") struct task_struct *BPF_PROG(kptr_return, int dummy, diff --git a/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c b/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c index 76dcb6089d7f..9c0a65466356 100644 --- a/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c +++ b/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c @@ -9,7 +9,7 @@ __attribute__((nomerge)) extern void bpf_task_release(struct task_struct *p) __k /* This is a test BPF program that uses struct_ops to access a referenced * kptr argument. This is a test for the verifier to ensure that it - * 1) recongnizes the task as a referenced object (i.e., ref_obj_id > 0), and + * 1) recognizes the task as a referenced object (i.e., ref_obj_id > 0), and * 2) the same reference can be acquired from multiple paths as long as it * has not been released. */ diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect.c b/tools/testing/selftests/bpf/progs/test_cls_redirect.c index f344c6835e84..823169fb6e4c 100644 --- a/tools/testing/selftests/bpf/progs/test_cls_redirect.c +++ b/tools/testing/selftests/bpf/progs/test_cls_redirect.c @@ -129,7 +129,7 @@ typedef uint8_t *net_ptr __attribute__((align_value(8))); typedef struct buf { struct __sk_buff *skb; net_ptr head; - /* NB: tail musn't have alignment other than 1, otherwise + /* NB: tail mustn't have alignment other than 1, otherwise * LLVM will go and eliminate code, e.g. when checking packet lengths. */ uint8_t *const tail; diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c b/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c index d0f7670351e5..dfd4a2710391 100644 --- a/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c +++ b/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c @@ -494,7 +494,7 @@ static ret_t get_next_hop(struct bpf_dynptr *dynptr, __u64 *offset, encap_header *offset += sizeof(*next_hop); - /* Skip the remainig next hops (may be zero). */ + /* Skip the remaining next hops (may be zero). */ return skip_next_hops(offset, encap->unigue.hop_count - encap->unigue.next_hop - 1); } diff --git a/tools/testing/selftests/bpf/progs/uretprobe_stack.c b/tools/testing/selftests/bpf/progs/uretprobe_stack.c index 9fdcf396b8f4..a2951e2f1711 100644 --- a/tools/testing/selftests/bpf/progs/uretprobe_stack.c +++ b/tools/testing/selftests/bpf/progs/uretprobe_stack.c @@ -26,8 +26,8 @@ int usdt_len; SEC("uprobe//proc/self/exe:target_1") int BPF_UPROBE(uprobe_1) { - /* target_1 is recursive wit depth of 2, so we capture two separate - * stack traces, depending on which occurence it is + /* target_1 is recursive with depth of 2, so we capture two separate + * stack traces, depending on which occurrence it is */ static bool recur = false; diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c index 7c5e5e6d10eb..dba3ca728f6e 100644 --- a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c +++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c @@ -349,7 +349,7 @@ __naked void precision_two_ids(void) SEC("socket") __success __log_level(2) __flag(BPF_F_TEST_STATE_FREQ) -/* check thar r0 and r6 have different IDs after 'if', +/* check that r0 and r6 have different IDs after 'if', * collect_linked_regs() can't tie more than 6 registers for a single insn. */ __msg("8: (25) if r0 > 0x7 goto pc+0 ; R0=scalar(id=1") diff --git a/tools/testing/selftests/bpf/progs/verifier_var_off.c b/tools/testing/selftests/bpf/progs/verifier_var_off.c index 1d36d01b746e..f345466bca68 100644 --- a/tools/testing/selftests/bpf/progs/verifier_var_off.c +++ b/tools/testing/selftests/bpf/progs/verifier_var_off.c @@ -114,8 +114,8 @@ __naked void stack_write_priv_vs_unpriv(void) } /* Similar to the previous test, but this time also perform a read from the - * address written to with a variable offset. The read is allowed, showing that, - * after a variable-offset write, a priviledged program can read the slots that + * address written to with a variable offet. The read is allowed, showing that, + * after a variable-offset write, a privileged program can read the slots that * were in the range of that write (even if the verifier doesn't actually know if * the slot being read was really written to or not. * @@ -157,7 +157,7 @@ __naked void stack_write_followed_by_read(void) SEC("socket") __description("variable-offset stack write clobbers spilled regs") __failure -/* In the priviledged case, dereferencing a spilled-and-then-filled +/* In the privileged case, dereferencing a spilled-and-then-filled * register is rejected because the previous variable offset stack * write might have overwritten the spilled pointer (i.e. we lose track * of the spilled register when we analyze the write). diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c index fd2da2234cc9..76568db7a664 100644 --- a/tools/testing/selftests/bpf/test_sockmap.c +++ b/tools/testing/selftests/bpf/test_sockmap.c @@ -1372,7 +1372,7 @@ static int run_options(struct sockmap_options *options, int cg_fd, int test) } else fprintf(stderr, "unknown test\n"); out: - /* Detatch and zero all the maps */ + /* Detach and zero all the maps */ bpf_prog_detach2(bpf_program__fd(progs[3]), cg_fd, BPF_CGROUP_SOCK_OPS); for (i = 0; i < ARRAY_SIZE(links); i++) { diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c index f3492efc8834..c8d640802cce 100644 --- a/tools/testing/selftests/bpf/verifier/calls.c +++ b/tools/testing/selftests/bpf/verifier/calls.c @@ -1375,7 +1375,7 @@ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), /* write into map value */ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0), - /* fetch secound map_value_ptr from the stack */ + /* fetch second map_value_ptr from the stack */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -16), BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), /* write into map value */ @@ -1439,7 +1439,7 @@ /* second time with fp-16 */ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4), BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 1, 2), - /* fetch secound map_value_ptr from the stack */ + /* fetch second map_value_ptr from the stack */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_7, 0), /* write into map value */ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0), @@ -1493,7 +1493,7 @@ /* second time with fp-16 */ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4), BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2), - /* fetch secound map_value_ptr from the stack */ + /* fetch second map_value_ptr from the stack */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_7, 0), /* write into map value */ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0), @@ -2380,7 +2380,7 @@ */ BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 1), BPF_MOV64_REG(BPF_REG_9, BPF_REG_8), - /* r9 = *r9 ; verifier get's to this point via two paths: + /* r9 = *r9 ; verifier gets to this point via two paths: * ; (I) one including r9 = r8, verified first; * ; (II) one excluding r9 = r8, verified next. * ; After load of *r9 to r9 the frame[0].fp[-24].id == r9.id. diff --git a/tools/testing/selftests/bpf/xdping.c b/tools/testing/selftests/bpf/xdping.c index 1503a1d2faa0..9ed8c796645d 100644 --- a/tools/testing/selftests/bpf/xdping.c +++ b/tools/testing/selftests/bpf/xdping.c @@ -155,7 +155,7 @@ int main(int argc, char **argv) } if (!server) { - /* Only supports IPv4; see hints initiailization above. */ + /* Only supports IPv4; see hints initialization above. */ if (getaddrinfo(argv[optind], NULL, &hints, &a) || !a) { fprintf(stderr, "Could not resolve %s\n", argv[optind]); return 1; diff --git a/tools/testing/selftests/bpf/xsk.h b/tools/testing/selftests/bpf/xsk.h index 93c2cc413cfc..48729da142c2 100644 --- a/tools/testing/selftests/bpf/xsk.h +++ b/tools/testing/selftests/bpf/xsk.h @@ -93,8 +93,8 @@ static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb) /* Refresh the local tail pointer. * cached_cons is r->size bigger than the real consumer pointer so * that this addition can be avoided in the more frequently - * executed code that computs free_entries in the beginning of - * this function. Without this optimization it whould have been + * executed code that computes free_entries in the beginning of + * this function. Without this optimization it would have been * free_entries = r->cached_prod - r->cached_cons + r->size. */ r->cached_cons = __atomic_load_n(r->consumer, __ATOMIC_ACQUIRE); -- 2.48.1

3 months, 3 weeks

2
1
0 0

[PATCH v2 00/30] vfio: Introduce selftests for VFIO

by David Matlack

This series introduces VFIO selftests, located in tools/testing/selftests/vfio/. VFIO selftests aim to enable kernel developers to write and run tests that take the form of userspace programs that interact with VFIO and IOMMUFD uAPIs. VFIO selftests can be used to write functional tests for new features, regression tests for bugs, and performance tests for optimizations. These tests are designed to interact with real PCI devices, i.e. they do not rely on mocking out or faking any behavior in the kernel. This allows the tests to exercise not only VFIO but also IOMMUFD, the IOMMU driver, interrupt remapping, IRQ handling, etc. For more background on the motivation and design of this series, please see the RFC: https://lore.kernel.org/kvm/20250523233018.1702151-1-dmatlack@google.com/ This series can also be found on GitHub: https://github.com/dmatlack/linux/tree/vfio/selftests/v2 Changelog ----------------------------------------------------------------------- v1: https://lore.kernel.org/kvm/20250620232031.2705638-1-dmatlack@google.com/ - Collect various Acks - Switch myself from Reviewer to Maintainer of VFIO selftests - Re-order the new MAINTAINERS entry to be alphabetical - Drop the KVM selftests patches from the series - Reorder the tools header commits to be closer to the commits that use them (Vinicius) - Use host virtual addresses instead of magic numbers for IOVAs in vfio_pci_driver_test and vfio_dma_mapping_test RFC: https://lore.kernel.org/kvm/20250523233018.1702151-1-dmatlack@google.com/ - Add symlink to linux/pci_ids.h instead of copying (Jason) - Add symlinks to drivers/dma/*/*.h instead of copying (Jason) - Automatically replicate vfio_dma_mapping_test across backing sources using fixture variants (Jason) - Automatically replicate vfio_dma_mapping_test and vfio_pci_driver_test across all iommu_modes using fixture variants (Jason) - Invert access() check in vfio_dma_mapping_test (me) - Use driver_override instead of add/remove_id (Alex) - Allow tests to get BDF from env var (Alex) - Use KSFT_FAIL instead of 1 to exit with failure (Alex) - Unconditionally create $(LIBVFIO_O_DIRS) to avoid target conflict with ../cgroup/lib/libcgroup.mk when building KVM selftests (me) - Allow VFIO selftests to run automatically by switching from TEST_GEN_PROGS_EXTENDED to TEST_GEN_PROGS. Automatically run selftests will use $VFIO_SELFTESTS_BDF environment variable to know which device to use (Alex) - Replace hardcoded SZ_4K with getpagesize() in vfio_dma_mapping_test to support platforms with other page sizes (me) - Make all global variables static where possible (me) - Pass argc and argv to test_harness_main() so that users can pass flags to the kselftest harness (me) Instructions ----------------------------------------------------------------------- Running VFIO selftests requires at a PCI device bound to vfio-pci for the tests to use. The address of this device is passed to the test as a segment:bus:device.function string, which must match the path to the device in /sys/bus/pci/devices/ (e.g. 0000:00:04.0). Once you have chosen a device, there is a helper script provided to unbind the device from its current driver, bind it to vfio-pci, export the environment variable $VFIO_SELFTESTS_BDF, and launch a shell: $ tools/testing/selftests/vfio/run.sh -d 0000:00:04.0 -s The -d option tells the script which device to use and the -s option tells the script to launch a shell. Additionally, the VFIO selftest vfio_dma_mapping_test has test cases that rely on HugeTLB pages being available, otherwise they are skipped. To enable those tests make sure at least 1 2MB and 1 1GB HugeTLB pages are available. $ echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages $ echo 1 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages To run all VFIO selftests using make: $ make -C tools/testing/selftests/vfio run_tests To run individual tests: $ tools/testing/selftests/vfio/vfio_dma_mapping_test $ tools/testing/selftests/vfio/vfio_dma_mapping_test -v iommufd_anonymous_hugetlb_2mb $ tools/testing/selftests/vfio/vfio_dma_mapping_test -r vfio_dma_mapping_test.iommufd_anonymous_hugetlb_2mb.dma_map_unmap The environment variable $VFIO_SELFTESTS_BDF can be overridden for a specific test by passing in the BDF on the command line as the last positional argument. $ tools/testing/selftests/vfio/vfio_dma_mapping_test 0000:00:04.0 $ tools/testing/selftests/vfio/vfio_dma_mapping_test -v iommufd_anonymous_hugetlb_2mb 0000:00:04.0 $ tools/testing/selftests/vfio/vfio_dma_mapping_test -r vfio_dma_mapping_test.iommufd_anonymous_hugetlb_2mb.dma_map_unmap 0000:00:04.0 When you are done, free the HugeTLB pages and exit the shell started by run.sh. Exiting the shell will cause the device to be unbound from vfio-pci and bound back to its original driver. $ echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages $ echo 0 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages $ exit It's also possible to use run.sh to run just a single test hermetically, rather than dropping into a shell: $ tools/testing/selftests/vfio/run.sh -d 0000:00:04.0 -- tools/testing/selftests/vfio/vfio_dma_mapping_test -v iommufd_anonymous Tests ----------------------------------------------------------------------- There are 4 tests in this series, mostly to demonstrate as a proof-of-concept: - tools/testing/selftests/vfio/vfio_pci_device_test.c - tools/testing/selftests/vfio/vfio_pci_driver_test.c - tools/testing/selftests/vfio/vfio_iommufd_setup_test.c - tools/testing/selftests/vfio/vfio_dma_mapping_test.c Future Areas of Development ----------------------------------------------------------------------- Library: - Driver support for devices that can be used on AMD, ARM, and other platforms (e.g. mlx5). - Driver support for a device available in QEMU VMs (e.g. pcie-ats-testdev [1]) - Support for tests that use multiple devices. - Support for IOMMU groups with multiple devices. - Support for multiple devices sharing the same container/iommufd. - Sharing TEST_ASSERT() macros and other common code between KVM and VFIO selftests. Tests: - DMA mapping performance tests for BARs/HugeTLB/etc. - Porting tests from https://github.com/awilliam/tests/commits/for-clg/ to selftests. - Live Update selftests. - Resend Sean's KVM selftest for posted interrupts using the VFIO selftests library [2][3] Cc: Alex Williamson <alex.williamson(a)redhat.com> Cc: Jason Gunthorpe <jgg(a)nvidia.com> Cc: Kevin Tian <kevin.tian(a)intel.com> Cc: Paolo Bonzini <pbonzini(a)redhat.com> Cc: Sean Christopherson <seanjc(a)google.com> Cc: Vipin Sharma <vipinsh(a)google.com> Cc: Josh Hilke <jrhilke(a)google.com> Cc: Aaron Lewis <aaronlewis(a)google.com> Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com> Cc: Saeed Mahameed <saeedm(a)nvidia.com> Cc: Adithya Jayachandran <ajayachandra(a)nvidia.com> Cc: Joel Granados <joel.granados(a)kernel.org> [1] https://github.com/Joelgranados/qemu/blob/pcie-testdev/hw/misc/pcie-ats-tes… [2] https://lore.kernel.org/kvm/20250404193923.1413163-68-seanjc@google.com/ [3] https://lore.kernel.org/kvm/20250620232031.2705638-32-dmatlack@google.com/ David Matlack (25): selftests: Create tools/testing/selftests/vfio vfio: selftests: Add a helper library for VFIO selftests vfio: selftests: Introduce vfio_pci_device_test vfio: selftests: Keep track of DMA regions mapped into the device vfio: selftests: Enable asserting MSI eventfds not firing vfio: selftests: Add a helper for matching vendor+device IDs vfio: selftests: Add driver framework vfio: sefltests: Add vfio_pci_driver_test tools headers: Add stub definition for __iomem tools headers: Import asm-generic MMIO helpers tools headers: Import x86 MMIO helper overrides tools headers: Add symlink to linux/pci_ids.h dmaengine: ioat: Move system_has_dca_enabled() to dma.h vfio: selftests: Add driver for Intel CBDMA tools headers: Import iosubmit_cmds512() dmaengine: idxd: Allow registers.h to be included from tools/ vfio: selftests: Add driver for Intel DSA vfio: selftests: Move helper to get cdev path to libvfio vfio: selftests: Encapsulate IOMMU mode vfio: selftests: Replicate tests across all iommu_modes vfio: selftests: Add vfio_type1v2_mode vfio: selftests: Add iommufd_compat_type1{,v2} modes vfio: selftests: Add iommufd mode vfio: selftests: Make iommufd the default iommu_mode vfio: selftests: Add a script to help with running VFIO selftests Josh Hilke (5): vfio: selftests: Test basic VFIO and IOMMUFD integration vfio: selftests: Move vfio dma mapping test to their own file vfio: selftests: Add test to reset vfio device. vfio: selftests: Add DMA mapping tests for 2M and 1G HugeTLB vfio: selftests: Validate 2M/1G HugeTLB are mapped as 2M/1G in IOMMU MAINTAINERS | 7 + drivers/dma/idxd/registers.h | 4 + drivers/dma/ioat/dma.h | 2 + drivers/dma/ioat/hw.h | 3 - tools/arch/x86/include/asm/io.h | 101 +++ tools/arch/x86/include/asm/special_insns.h | 27 + tools/include/asm-generic/io.h | 482 ++++++++++++++ tools/include/asm/io.h | 11 + tools/include/linux/compiler.h | 4 + tools/include/linux/io.h | 4 +- tools/include/linux/pci_ids.h | 1 + tools/testing/selftests/Makefile | 1 + tools/testing/selftests/vfio/.gitignore | 7 + tools/testing/selftests/vfio/Makefile | 21 + .../selftests/vfio/lib/drivers/dsa/dsa.c | 416 ++++++++++++ .../vfio/lib/drivers/dsa/registers.h | 1 + .../selftests/vfio/lib/drivers/ioat/hw.h | 1 + .../selftests/vfio/lib/drivers/ioat/ioat.c | 235 +++++++ .../vfio/lib/drivers/ioat/registers.h | 1 + .../selftests/vfio/lib/include/vfio_util.h | 295 +++++++++ tools/testing/selftests/vfio/lib/libvfio.mk | 24 + .../selftests/vfio/lib/vfio_pci_device.c | 594 ++++++++++++++++++ .../selftests/vfio/lib/vfio_pci_driver.c | 126 ++++ tools/testing/selftests/vfio/run.sh | 109 ++++ .../selftests/vfio/vfio_dma_mapping_test.c | 199 ++++++ .../selftests/vfio/vfio_iommufd_setup_test.c | 127 ++++ .../selftests/vfio/vfio_pci_device_test.c | 176 ++++++ .../selftests/vfio/vfio_pci_driver_test.c | 244 +++++++ 28 files changed, 3219 insertions(+), 4 deletions(-) create mode 100644 tools/arch/x86/include/asm/io.h create mode 100644 tools/arch/x86/include/asm/special_insns.h create mode 100644 tools/include/asm-generic/io.h create mode 100644 tools/include/asm/io.h create mode 120000 tools/include/linux/pci_ids.h create mode 100644 tools/testing/selftests/vfio/.gitignore create mode 100644 tools/testing/selftests/vfio/Makefile create mode 100644 tools/testing/selftests/vfio/lib/drivers/dsa/dsa.c create mode 120000 tools/testing/selftests/vfio/lib/drivers/dsa/registers.h create mode 120000 tools/testing/selftests/vfio/lib/drivers/ioat/hw.h create mode 100644 tools/testing/selftests/vfio/lib/drivers/ioat/ioat.c create mode 120000 tools/testing/selftests/vfio/lib/drivers/ioat/registers.h create mode 100644 tools/testing/selftests/vfio/lib/include/vfio_util.h create mode 100644 tools/testing/selftests/vfio/lib/libvfio.mk create mode 100644 tools/testing/selftests/vfio/lib/vfio_pci_device.c create mode 100644 tools/testing/selftests/vfio/lib/vfio_pci_driver.c create mode 100755 tools/testing/selftests/vfio/run.sh create mode 100644 tools/testing/selftests/vfio/vfio_dma_mapping_test.c create mode 100644 tools/testing/selftests/vfio/vfio_iommufd_setup_test.c create mode 100644 tools/testing/selftests/vfio/vfio_pci_device_test.c create mode 100644 tools/testing/selftests/vfio/vfio_pci_driver_test.c base-commit: c17b750b3ad9f45f2b6f7e6f7f4679844244f0b9 -- 2.51.0.rc2.233.g662b1ed5c5-goog

3 months, 3 weeks

2
32
0 0

[PATCH v19 0/8] fork: Support shadow stacks in clone3()

by Mark Brown

[ I think at this point everyone is OK with the ABI, and the x86 implementation has been tested so hopefully we are near to being able to get this merged? If there are any outstanding issues let me know and I can look at addressing them. The one possible issue I am aware of is that the RISC-V shadow stack support was briefly in -next but got dropped along with the general RISC-V issues during the last merge window, rebasing for that is still in progress. I guess ideally this could be applied on a branch and then pulled into the RISC-V tree? ] The kernel has recently added support for shadow stacks, currently x86 only using their CET feature but both arm64 and RISC-V have equivalent features (GCS and Zicfiss respectively), I am actively working on GCS[1]. With shadow stacks the hardware maintains an additional stack containing only the return addresses for branch instructions which is not generally writeable by userspace and ensures that any returns are to the recorded addresses. This provides some protection against ROP attacks and making it easier to collect call stacks. These shadow stacks are allocated in the address space of the userspace process. Our API for shadow stacks does not currently offer userspace any flexiblity for managing the allocation of shadow stacks for newly created threads, instead the kernel allocates a new shadow stack with the same size as the normal stack whenever a thread is created with the feature enabled. The stacks allocated in this way are freed by the kernel when the thread exits or shadow stacks are disabled for the thread. This lack of flexibility and control isn't ideal, in the vast majority of cases the shadow stack will be over allocated and the implicit allocation and deallocation is not consistent with other interfaces. As far as I can tell the interface is done in this manner mainly because the shadow stack patches were in development since before clone3() was implemented. Since clone3() is readily extensible let's add support for specifying a shadow stack when creating a new thread or process, keeping the current implicit allocation behaviour if one is not specified either with clone3() or through the use of clone(). The user must provide a shadow stack pointer, this must point to memory mapped for use as a shadow stackby map_shadow_stack() with an architecture specified shadow stack token at the top of the stack. Yuri Khrustalev has raised questions from the libc side regarding discoverability of extended clone3() structure sizes[2], this seems like a general issue with clone3(). There was a suggestion to add a hwcap on arm64 which isn't ideal but is doable there, though architecture specific mechanisms would also be needed for x86 (and RISC-V if it's support gets merged before this does). The idea has, however, had strong pushback from the architecture maintainers and it is possible to detect support for this in clone3() by attempting a call with a misaligned shadow stack pointer specified so no hwcap has been added. [1] https://lore.kernel.org/linux-arm-kernel/20241001-arm64-gcs-v13-0-222b78d87… [2] https://lore.kernel.org/r/aCs65ccRQtJBnZ_5@arm.com Signed-off-by: Mark Brown <broonie(a)kernel.org> --- Changes in v19: - Rebase onto v6.17-rc1. - Link to v18: https://lore.kernel.org/r/20250702-clone3-shadow-stack-v18-0-7965d2b694db@k… Changes in v18: - Rebase onto v6.16-rc3. - Thanks to pointers from Yuri Khrustalev this version has been tested on x86 so I have removed the RFT tag. - Clarify clone3_shadow_stack_valid() comment about the Kconfig check. - Remove redundant GCSB DSYNCs in arm64 code. - Fix token validation on x86. - Link to v17: https://lore.kernel.org/r/20250609-clone3-shadow-stack-v17-0-8840ed97ff6f@k… Changes in v17: - Rebase onto v6.16-rc1. - Link to v16: https://lore.kernel.org/r/20250416-clone3-shadow-stack-v16-0-2ffc9ca3917b@k… Changes in v16: - Rebase onto v6.15-rc2. - Roll in fixes from x86 testing from Rick Edgecombe. - Rework so that the argument is shadow_stack_token. - Link to v15: https://lore.kernel.org/r/20250408-clone3-shadow-stack-v15-0-3fa245c6e3be@k… Changes in v15: - Rebase onto v6.15-rc1. - Link to v14: https://lore.kernel.org/r/20250206-clone3-shadow-stack-v14-0-805b53af73b9@k… Changes in v14: - Rebase onto v6.14-rc1. - Link to v13: https://lore.kernel.org/r/20241203-clone3-shadow-stack-v13-0-93b89a81a5ed@k… Changes in v13: - Rebase onto v6.13-rc1. - Link to v12: https://lore.kernel.org/r/20241031-clone3-shadow-stack-v12-0-7183eb8bee17@k… Changes in v12: - Add the regular prctl() to the userspace API document since arm64 support is queued in -next. - Link to v11: https://lore.kernel.org/r/20241005-clone3-shadow-stack-v11-0-2a6a2bd6d651@k… Changes in v11: - Rebase onto arm64 for-next/gcs, which is based on v6.12-rc1, and integrate arm64 support. - Rework the interface to specify a shadow stack pointer rather than a base and size like we do for the regular stack. - Link to v10: https://lore.kernel.org/r/20240821-clone3-shadow-stack-v10-0-06e8797b9445@k… Changes in v10: - Integrate fixes & improvements for the x86 implementation from Rick Edgecombe. - Require that the shadow stack be VM_WRITE. - Require that the shadow stack base and size be sizeof(void *) aligned. - Clean up trailing newline. - Link to v9: https://lore.kernel.org/r/20240819-clone3-shadow-stack-v9-0-962d74f99464@ke… Changes in v9: - Pull token validation earlier and report problems with an error return to parent rather than signal delivery to the child. - Verify that the top of the supplied shadow stack is VM_SHADOW_STACK. - Rework token validation to only do the page mapping once. - Drop no longer needed support for testing for signals in selftest. - Fix typo in comments. - Link to v8: https://lore.kernel.org/r/20240808-clone3-shadow-stack-v8-0-0acf37caf14c@ke… Changes in v8: - Fix token verification with user specified shadow stack. - Don't track user managed shadow stacks for child processes. - Link to v7: https://lore.kernel.org/r/20240731-clone3-shadow-stack-v7-0-a9532eebfb1d@ke… Changes in v7: - Rebase onto v6.11-rc1. - Typo fixes. - Link to v6: https://lore.kernel.org/r/20240623-clone3-shadow-stack-v6-0-9ee7783b1fb9@ke… Changes in v6: - Rebase onto v6.10-rc3. - Ensure we don't try to free the parent shadow stack in error paths of x86 arch code. - Spelling fixes in userspace API document. - Additional cleanups and improvements to the clone3() tests to support the shadow stack tests. - Link to v5: https://lore.kernel.org/r/20240203-clone3-shadow-stack-v5-0-322c69598e4b@ke… Changes in v5: - Rebase onto v6.8-rc2. - Rework ABI to have the user allocate the shadow stack memory with map_shadow_stack() and a token. - Force inlining of the x86 shadow stack enablement. - Move shadow stack enablement out into a shared header for reuse by other tests. - Link to v4: https://lore.kernel.org/r/20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@ke… Changes in v4: - Formatting changes. - Use a define for minimum shadow stack size and move some basic validation to fork.c. - Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke… Changes in v3: - Rebase onto v6.7-rc2. - Remove stale shadow_stack in internal kargs. - If a shadow stack is specified unconditionally use it regardless of CLONE_ parameters. - Force enable shadow stacks in the selftest. - Update changelogs for RISC-V feature rename. - Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@ke… Changes in v2: - Rebase onto v6.7-rc1. - Remove ability to provide preallocated shadow stack, just specify the desired size. - Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@ke… --- Mark Brown (8): arm64/gcs: Return a success value from gcs_alloc_thread_stack() Documentation: userspace-api: Add shadow stack API documentation selftests: Provide helper header for shadow stack testing fork: Add shadow stack support to clone3() selftests/clone3: Remove redundant flushes of output streams selftests/clone3: Factor more of main loop into test_clone3() selftests/clone3: Allow tests to flag if -E2BIG is a valid error code selftests/clone3: Test shadow stack support Documentation/userspace-api/index.rst | 1 + Documentation/userspace-api/shadow_stack.rst | 44 +++++ arch/arm64/include/asm/gcs.h | 8 +- arch/arm64/kernel/process.c | 8 +- arch/arm64/mm/gcs.c | 55 +++++- arch/x86/include/asm/shstk.h | 11 +- arch/x86/kernel/process.c | 2 +- arch/x86/kernel/shstk.c | 53 ++++- include/asm-generic/cacheflush.h | 11 ++ include/linux/sched/task.h | 17 ++ include/uapi/linux/sched.h | 9 +- kernel/fork.c | 93 +++++++-- tools/testing/selftests/clone3/clone3.c | 226 ++++++++++++++++++---- tools/testing/selftests/clone3/clone3_selftests.h | 65 ++++++- tools/testing/selftests/ksft_shstk.h | 98 ++++++++++ 15 files changed, 620 insertions(+), 81 deletions(-) --- base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585 change-id: 20231019-clone3-shadow-stack-15d40d2bf536 Best regards, -- Mark Brown <broonie(a)kernel.org>

3 months, 3 weeks

3
10
0 0

[PATCH] selftests: cgroup: Make test_pids backwards compatible

by Michal Koutný

The predicates in test expect event counting from 73e75e6fc352b ("cgroup/pids: Separate semantics of pids.events related to pids.max") and the test would fail on older kernels. We want to have one version of tests for all, so detect the feature and skip the test on old kernels. (The test could even switch to check v1 semantics based on the flag but keep it simple for now.) Fixes: 9f34c566027b6 ("selftests: cgroup: Add basic tests for pids controller") Signed-off-by: Michal Koutný <mkoutny(a)suse.com> Tested-by: Sebastian Chlad <sebastian.chlad(a)suse.com> --- tools/testing/selftests/cgroup/lib/cgroup_util.c | 12 ++++++++++++ .../selftests/cgroup/lib/include/cgroup_util.h | 1 + tools/testing/selftests/cgroup/test_pids.c | 3 +++ 3 files changed, 16 insertions(+) diff --git a/tools/testing/selftests/cgroup/lib/cgroup_util.c b/tools/testing/selftests/cgroup/lib/cgroup_util.c index 0e89fcff4d05d..44c52f620fda1 100644 --- a/tools/testing/selftests/cgroup/lib/cgroup_util.c +++ b/tools/testing/selftests/cgroup/lib/cgroup_util.c @@ -522,6 +522,18 @@ int proc_mount_contains(const char *option) return strstr(buf, option) != NULL; } +int cgroup_feature(const char *feature) +{ + char buf[PAGE_SIZE]; + ssize_t read; + + read = read_text("/sys/kernel/cgroup/features", buf, sizeof(buf)); + if (read < 0) + return read; + + return strstr(buf, feature) != NULL; +} + ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size) { char path[PATH_MAX]; diff --git a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h index c69cab66254b4..9dc90a1b386d7 100644 --- a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h +++ b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h @@ -60,6 +60,7 @@ extern int cg_run_nowait(const char *cgroup, extern int cg_wait_for_proc_count(const char *cgroup, int count); extern int cg_killall(const char *cgroup); int proc_mount_contains(const char *option); +int cgroup_feature(const char *feature); extern ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size); extern int proc_read_strstr(int pid, bool thread, const char *item, const char *needle); extern pid_t clone_into_cgroup(int cgroup_fd); diff --git a/tools/testing/selftests/cgroup/test_pids.c b/tools/testing/selftests/cgroup/test_pids.c index 9ecb83c6cc5cb..d8a1d1cd50072 100644 --- a/tools/testing/selftests/cgroup/test_pids.c +++ b/tools/testing/selftests/cgroup/test_pids.c @@ -77,6 +77,9 @@ static int test_pids_events(const char *root) char *cg_parent = NULL, *cg_child = NULL; int pid; + if (cgroup_feature("pids_localevents") <= 0) + return KSFT_SKIP; + cg_parent = cg_name(root, "pids_parent"); cg_child = cg_name(cg_parent, "pids_child"); if (!cg_parent || !cg_child) base-commit: 04a4d6c24eef8a1fc89d8b6129ac00ca2f638aff -- 2.51.0

3 months, 3 weeks

2
1
0 0

[PATCH v2 0/3] HID: hidraw: rework ioctls

by Benjamin Tissoires

Arnd sent the v1 of the series in July, and it was bogus. So with a little help from claude-sonnet I built up the missing ioctls tests and tried to figure out a way to apply Arnd's logic without breaking the existing ioctls. The end result is in patch 3/3, which makes use of subfunctions to keep the main ioctl code path clean. Arnd, I kept your From: and SoB fields, please shout if you are unhappy. Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org> --- changes in v2: - add new hidraw ioctls tests - refactor Arnd's patch to keep the existing error path logic - link to v1: https://lore.kernel.org/linux-input/20250711072847.2836962-1-arnd@kernel.or… --- Jiri, checkpatch.pl complains about my co-develop tag. Did we get some consensus for AI-assisted tag? --- Arnd Bergmann (1): HID: tighten ioctl command parsing Benjamin Tissoires (2): selftests/hid: hidraw: add more coverage for hidraw ioctls selftests/hid: hidraw: forge wrong ioctls and tests them drivers/hid/hidraw.c | 224 ++++++++------- include/uapi/linux/hidraw.h | 2 + tools/testing/selftests/hid/hid_common.h | 6 + tools/testing/selftests/hid/hidraw.c | 473 +++++++++++++++++++++++++++++++ 4 files changed, 603 insertions(+), 102 deletions(-) --- base-commit: b80a75cf6999fb79971b41eaec7af2bb4b514714 change-id: 20250825-b4-hidraw-ioctls-66f34297032a Best regards, -- Benjamin Tissoires <bentiss(a)kernel.org>

3 months, 3 weeks

3
4
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-kselftest-mirror