November 2024 - Linux-kselftest-mirror

[PATCH] kunit: constify return of string literals

by Christian Göttsche

From: Christian Göttsche <cgzones(a)googlemail.com>

The function kunit_status_to_ok_not_ok() returns string literals, thus
declare the return value as such.

Reported by clang:

    ./include/kunit/test.h:143:10: warning: returning 'const char[3]' from a function with result type 'char *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
      143 |                 return "ok";
          |                        ^~~~
    ./include/kunit/test.h:145:10: warning: returning 'const char[7]' from a function with result type 'char *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
      145 |                 return "not ok";
          |                        ^~~~~~~~
    ./include/kunit/test.h:147:9: warning: returning 'const char[8]' from a function with result type 'char *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
      147 |         return "invalid";
          |                ^~~~~~~~~

Signed-off-by: Christian Göttsche <cgzones(a)googlemail.com>
---
 include/kunit/test.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/kunit/test.h b/include/kunit/test.h
index 34b71e42fb10..ae1b57578476 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -135,7 +135,7 @@ struct kunit_case {
 	struct string_stream *log;
 };
 
-static inline char *kunit_status_to_ok_not_ok(enum kunit_status status)
+static inline const char *kunit_status_to_ok_not_ok(enum kunit_status status)
 {
 	switch (status) {
 	case KUNIT_SKIPPED:
-- 
2.45.2
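The pattern behind this patch can be tried outside the kernel with a minimal sketch. The enum `demo_status` and the functions `demo_status_to_ok_not_ok()` and `demo_self_check()` are hypothetical stand-ins, not the kernel's definitions. The sketch also assumes a build where string literals are treated as const (for example with -Wwrite-strings); with the old `char *` return type such a build would report the qualifier warning quoted above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for enum kunit_status; not the kernel definition. */
enum demo_status {
	DEMO_SUCCESS,
	DEMO_FAILURE,
	DEMO_SKIPPED,
};

/*
 * Every value returned here is a string literal, so the return type is
 * 'const char *'. Callers cannot write through the pointer without an
 * explicit cast, and no qualifier is discarded on return.
 */
static inline const char *demo_status_to_ok_not_ok(enum demo_status status)
{
	switch (status) {
	case DEMO_SKIPPED:
	case DEMO_SUCCESS:
		return "ok";
	case DEMO_FAILURE:
		return "not ok";
	}
	/* Reached only for values outside the enum. */
	return "invalid";
}

/* Small self-check showing the intended read-only usage. */
static inline void demo_self_check(void)
{
	const char *s = demo_status_to_ok_not_ok(DEMO_FAILURE);

	assert(strcmp(s, "not ok") == 0);
}
```

Callers that previously stored the result in a `char *` would need the same const qualifier on their side; whether any such callers exist in the tree is not stated in the patch.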

9 months, 2 weeks


[PATCH bpf-next v2] arm64, bpf: Add 12-argument support for bpf trampoline

by Puranjay Mohan

The arm64 bpf JIT currently supports attaching the trampoline to
functions with <= 8 arguments. This is because up to 8 arguments can be
passed in registers r0-r7. If there are more than 8 arguments then the
9th and later arguments are passed on the stack, with SP pointing to the
first stacked argument. See aapcs64[1] for more details.

If the 8th argument is a structure of size > 8B, then it is passed fully
on stack and r7 is not used for passing any argument. If there is a 9th
argument, it will be passed on the stack, even though r7 is available.

Add the support of storing and restoring arguments passed on the stack
to the arm64 bpf trampoline. This will allow attaching the trampoline to
functions that take up to 12 arguments.

[1] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#parame…

Signed-off-by: Puranjay Mohan <puranjay(a)kernel.org>
---
Changes in V1 -> V2:
V1: https://lore.kernel.org/all/20240704173227.130491-1-puranjay@kernel.org/
- Fixed the argument handling for composite types (structs)
---
 arch/arm64/net/bpf_jit_comp.c                | 139 ++++++++++++++-----
 tools/testing/selftests/bpf/DENYLIST.aarch64 |   3 -
 2 files changed, 107 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 751331f5ba90..063bf5e11fc6 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -30,6 +30,8 @@
 #define TMP_REG_3 (MAX_BPF_JIT_REG + 3)
 #define FP_BOTTOM (MAX_BPF_JIT_REG + 4)
 #define ARENA_VM_START (MAX_BPF_JIT_REG + 5)
+/* Up to eight function arguments are passed in registers r0-r7 */
+#define ARM64_MAX_REG_ARGS 8
 
 #define check_imm(bits, imm) do {				\
 	if ((((imm) > 0) && ((imm) >> (bits))) ||		\
@@ -2001,26 +2003,51 @@ static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
 	}
 }
 
-static void save_args(struct jit_ctx *ctx, int args_off, int nregs)
+static void save_args(struct jit_ctx *ctx, int args_off, int orig_sp_off,
+		      int nargs, int nreg_args)
 {
+	const u8 tmp = bpf2a64[TMP_REG_1];
+	int arg_pos;
 	int i;
 
-	for (i = 0; i < nregs; i++) {
-		emit(A64_STR64I(i, A64_SP, args_off), ctx);
+	for (i = 0; i < nargs; i++) {
+		if (i < nreg_args) {
+			emit(A64_STR64I(i, A64_SP, args_off), ctx);
+		} else {
+			arg_pos = orig_sp_off + (i - nreg_args) * 8;
+			emit(A64_LDR64I(tmp, A64_SP, arg_pos), ctx);
+			emit(A64_STR64I(tmp, A64_SP, args_off), ctx);
+		}
 		args_off += 8;
 	}
 }
 
-static void restore_args(struct jit_ctx *ctx, int args_off, int nregs)
+static void restore_args(struct jit_ctx *ctx, int args_off, int nreg_args)
 {
 	int i;
 
-	for (i = 0; i < nregs; i++) {
+	for (i = 0; i < nreg_args; i++) {
 		emit(A64_LDR64I(i, A64_SP, args_off), ctx);
 		args_off += 8;
 	}
 }
 
+static void restore_stack_args(struct jit_ctx *ctx, int args_off, int stk_arg_off,
+			       int nargs, int nreg_args)
+{
+	const u8 tmp = bpf2a64[TMP_REG_1];
+	int arg_pos;
+	int i;
+
+	for (i = nreg_args; i < nargs; i++) {
+		arg_pos = args_off + i * 8;
+		emit(A64_LDR64I(tmp, A64_SP, arg_pos), ctx);
+		emit(A64_STR64I(tmp, A64_SP, stk_arg_off), ctx);
+
+		stk_arg_off += 8;
+	}
+}
+
 /* Based on the x86's implementation of arch_prepare_bpf_trampoline().
  *
  * bpf prog and function entry before bpf trampoline hooked:
@@ -2034,15 +2061,17 @@ static void restore_args(struct jit_ctx *ctx, int args_off, int nregs)
  */
 static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 			      struct bpf_tramp_links *tlinks, void *func_addr,
-			      int nregs, u32 flags)
+			      int nargs, int nreg_args, u32 flags)
 {
 	int i;
 	int stack_size;
+	int stk_arg_off;
+	int orig_sp_off;
 	int retaddr_off;
 	int regs_off;
 	int retval_off;
 	int args_off;
-	int nregs_off;
+	int nargs_off;
 	int ip_off;
 	int run_ctx_off;
 	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
@@ -2052,6 +2081,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 	__le32 **branches = NULL;
 
 	/* trampoline stack layout:
+	 * SP + orig_sp_off [ first stack arg   ] if nargs > 8
 	 *                  [ parent ip         ]
 	 *                  [ FP                ]
 	 * SP + retaddr_off [ self ip           ]
@@ -2069,14 +2099,24 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 	 *                  [ ...               ]
 	 * SP + args_off    [ arg reg 1         ]
 	 *
-	 * SP + nregs_off   [ arg regs count    ]
+	 * SP + nargs_off   [ arg count         ]
 	 *
 	 * SP + ip_off      [ traced function   ] BPF_TRAMP_F_IP_ARG flag
 	 *
 	 * SP + run_ctx_off [ bpf_tramp_run_ctx ]
+	 *
+	 *                  [ stack_argN        ]
+	 *                  [ ...               ]
+	 * SP + stk_arg_off [ stack_arg1        ] BPF_TRAMP_F_CALL_ORIG
 	 */
 
 	stack_size = 0;
+	stk_arg_off = stack_size;
+	if ((flags & BPF_TRAMP_F_CALL_ORIG) && (nargs - nreg_args > 0)) {
+		/* room for saving arguments passed on stack */
+		stack_size += (nargs - nreg_args) * 8;
+	}
+
 	run_ctx_off = stack_size;
 	/* room for bpf_tramp_run_ctx */
 	stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
@@ -2086,13 +2126,13 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 	if (flags & BPF_TRAMP_F_IP_ARG)
 		stack_size += 8;
 
-	nregs_off = stack_size;
+	nargs_off = stack_size;
 	/* room for args count */
 	stack_size += 8;
 
 	args_off = stack_size;
 	/* room for args */
-	stack_size += nregs * 8;
+	stack_size += nargs * 8;
 
 	/* room for return value */
 	retval_off = stack_size;
@@ -2110,6 +2150,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 	/* return address locates above FP */
 	retaddr_off = stack_size + 8;
 
+	/* original SP position
+	 * stack_size + parent function frame + patched function frame
+	 */
+	orig_sp_off = stack_size + 32;
+
 	/* bpf trampoline may be invoked by 3 instruction types:
 	 * 1. bl, attached to bpf prog or kernel function via short jump
 	 * 2. br, attached to bpf prog or kernel function via long jump
@@ -2135,12 +2180,12 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 		emit(A64_STR64I(A64_R(10), A64_SP, ip_off), ctx);
 	}
 
-	/* save arg regs count*/
-	emit(A64_MOVZ(1, A64_R(10), nregs, 0), ctx);
-	emit(A64_STR64I(A64_R(10), A64_SP, nregs_off), ctx);
+	/* save argument count */
+	emit(A64_MOVZ(1, A64_R(10), nargs, 0), ctx);
+	emit(A64_STR64I(A64_R(10), A64_SP, nargs_off), ctx);
 
-	/* save arg regs */
-	save_args(ctx, args_off, nregs);
+	/* save arguments passed in regs and on the stack */
+	save_args(ctx, args_off, orig_sp_off, nargs, nreg_args);
 
 	/* save callee saved registers */
 	emit(A64_STR64I(A64_R(19), A64_SP, regs_off), ctx);
@@ -2167,7 +2212,10 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 	}
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
-		restore_args(ctx, args_off, nregs);
+		/* restore arguments that were passed in registers */
+		restore_args(ctx, args_off, nreg_args);
+		/* restore arguments that were passed on the stack */
+		restore_stack_args(ctx, args_off, stk_arg_off, nargs, nreg_args);
 		/* call original func */
 		emit(A64_LDR64I(A64_R(10), A64_SP, retaddr_off), ctx);
 		emit(A64_ADR(A64_LR, AARCH64_INSN_SIZE * 2), ctx);
@@ -2196,7 +2244,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 	}
 
 	if (flags & BPF_TRAMP_F_RESTORE_REGS)
-		restore_args(ctx, args_off, nregs);
+		restore_args(ctx, args_off, nreg_args);
 
 	/* restore callee saved register x19 and x20 */
 	emit(A64_LDR64I(A64_R(19), A64_SP, regs_off), ctx);
@@ -2228,19 +2276,42 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
 	return ctx->idx;
 }
 
-static int btf_func_model_nregs(const struct btf_func_model *m)
+static int btf_func_model_nargs(const struct btf_func_model *m)
 {
-	int nregs = m->nr_args;
+	int nargs = m->nr_args;
 	int i;
 
-	/* extra registers needed for struct argument */
+	/* extra registers or stack slots needed for struct argument */
 	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
 		/* The arg_size is at most 16 bytes, enforced by the verifier. */
 		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
-			nregs += (m->arg_size[i] + 7) / 8 - 1;
+			nargs += (m->arg_size[i] + 7) / 8 - 1;
 	}
 
-	return nregs;
+	return nargs;
+}
+
+/* get the count of the regs that are used to pass arguments */
+static int btf_func_model_nreg_args(const struct btf_func_model *m)
+{
+	int nargs = m->nr_args;
+	int nreg_args = 0;
+	int i;
+
+	for (i = 0; i < nargs; i++) {
+		/* The arg_size is at most 16 bytes, enforced by the verifier. */
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
+			/* struct members are all in the registers or all
+			 * on the stack.
+			 */
+			if (nreg_args + ((m->arg_size[i] + 7) / 8 - 1) > 7)
+				break;
+			nreg_args += (m->arg_size[i] + 7) / 8 - 1;
+		}
+		nreg_args++;
+	}
+
+	return (nreg_args > ARM64_MAX_REG_ARGS ? ARM64_MAX_REG_ARGS : nreg_args);
 }
 
 int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
@@ -2251,14 +2322,16 @@ int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
 		.idx = 0,
 	};
 	struct bpf_tramp_image im;
-	int nregs, ret;
+	int nargs, nreg_args, ret;
 
-	nregs = btf_func_model_nregs(m);
-	/* the first 8 registers are used for arguments */
-	if (nregs > 8)
+	nargs = btf_func_model_nargs(m);
+	if (nargs > MAX_BPF_FUNC_ARGS)
 		return -ENOTSUPP;
 
-	ret = prepare_trampoline(&ctx, &im, tlinks, func_addr, nregs, flags);
+	nreg_args = btf_func_model_nreg_args(m);
+
+	ret = prepare_trampoline(&ctx, &im, tlinks, func_addr, nargs, nreg_args,
+				 flags);
 	if (ret < 0)
 		return ret;
 
@@ -2285,7 +2358,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
 				u32 flags, struct bpf_tramp_links *tlinks,
 				void *func_addr)
 {
-	int ret, nregs;
+	int ret, nargs, nreg_args;
 	void *image, *tmp;
 	u32 size = ro_image_end - ro_image;
 
@@ -2302,13 +2375,15 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
 		.idx = 0,
 	};
 
-	nregs = btf_func_model_nregs(m);
-	/* the first 8 registers are used for arguments */
-	if (nregs > 8)
+	nargs = btf_func_model_nargs(m);
+	if (nargs > MAX_BPF_FUNC_ARGS)
 		return -ENOTSUPP;
 
+	nreg_args = btf_func_model_nreg_args(m);
+
 	jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
-	ret = prepare_trampoline(&ctx, im, tlinks, func_addr, nregs, flags);
+	ret = prepare_trampoline(&ctx, im, tlinks, func_addr, nargs, nreg_args,
+				 flags);
 
 	if (ret > 0 && validate_code(&ctx) < 0) {
 		ret = -EINVAL;
diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index 3c7c3e79aa93..e865451e90d2 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -4,9 +4,6 @@ fexit_sleep                                      # The test never returns. The r
 kprobe_multi_bench_attach                        # needs CONFIG_FPROBE
 kprobe_multi_test                                # needs CONFIG_FPROBE
 module_attach                                    # prog 'kprobe_multi': failed to auto-attach: -95
-fentry_test/fentry_many_args                     # fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524
-fexit_test/fexit_many_args                       # fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524
-tracing_struct/struct_many_args                  # struct_many_args:FAIL:tracing_struct_many_args__attach unexpected error: -524
 fill_link_info/kprobe_multi_link_info            # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 fill_link_info/kretprobe_multi_link_info         # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 fill_link_info/kprobe_multi_invalid_ubuff        # bpf_program__attach_kprobe_multi_opts unexpected error: -95
-- 
2.40.1

9 months, 3 weeks


[PATCH v2 0/6] cleanups, fixes, and progress towards avoiding "make headers"

by John Hubbard

Jeff Xu, I apologize for this churn: I was forced to drop your
Reviewed-by and Tested-by tags from 2 of the 3 mseal patches, because
the __NR_mseal fix is completely different now.

Changes since v1:

a) Reworked the mseal fix to use the kernel's in-tree unistd*.h files,
   instead of hacking in a __NR_mseal definition directly. (Thanks to
   David Hildenbrand for pointing out that this needed to be done.)

b) Fixed the subject line of the kvm and mdwe patch.

c) Reordered the patches so as to group the mseal changes together.

d) ADDED an additional patch, 6/6, to remove various __NR_xx items and
   checks from the mm selftests.

Cover letter, updated for v2:

Eventually, once the build succeeds on a sufficiently old distro, the
idea is to delete $(KHDR_INCLUDES) from the selftests/mm build, and then
after that, from selftests/lib.mk and all of the other selftest builds.

For now, this series merely achieves a clean build of selftests/mm on a
not-so-old distro: Ubuntu 23.04. In other words, after this series is
applied, it is possible to delete $(KHDR_INCLUDES) from
selftests/mm/Makefile and the build will still succeed.

1. Add tools/uapi/asm/unistd_[32|x32|64].h files, which include
   definitions of __NR_mseal, and include them (indirectly) from the
   files that use __NR_mseal. The new files are copied from
   ./usr/include/asm, which is how we have agreed to do this sort of
   thing, see [1].

2. Add fs.h, similarly created: it was copied directly from a snapshot
   of ./usr/include/linux/fs.h after running "make headers".

3. Add a few selected prctl.h values that the ksm and mdwe tests require.

4. Factor out some common code from mseal_test.c and seal_elf.c, into a
   new mseal_helpers.h file.

5. Remove local __NR_* definitions and checks.

[1] commit e076eaca5906 ("selftests: break the dependency upon local
    header files")

John Hubbard (6):
  selftests/mm: mseal, self_elf: fix missing __NR_mseal
  selftests/mm: mseal, self_elf: factor out test macros and other
    duplicated items
  selftests/mm: mseal, self_elf: rename TEST_END_CHECK to
    REPORT_TEST_PASS
  selftests/mm: fix vm_util.c build failures: add snapshot of fs.h
  selftests/mm: kvm, mdwe fixes to avoid requiring "make headers"
  selftests/mm: remove local __NR_* definitions

 tools/include/uapi/asm/unistd_32.h            | 458 ++++++++++++++++++
 tools/include/uapi/asm/unistd_64.h            | 380 +++++++++++++++
 tools/include/uapi/asm/unistd_x32.h           | 369 ++++++++++++++
 tools/include/uapi/linux/fs.h                 | 392 +++++++++++++++
 tools/testing/selftests/mm/hugepage-mremap.c  |   2 +-
 .../selftests/mm/ksm_functional_tests.c       |   8 +-
 tools/testing/selftests/mm/mdwe_test.c        |   1 +
 tools/testing/selftests/mm/memfd_secret.c     |  14 +-
 tools/testing/selftests/mm/mkdirty.c          |   8 +-
 tools/testing/selftests/mm/mlock2.h           |   1 +
 tools/testing/selftests/mm/mrelease_test.c    |   2 +-
 tools/testing/selftests/mm/mseal_helpers.h    |  41 ++
 tools/testing/selftests/mm/mseal_test.c       | 143 ++----
 tools/testing/selftests/mm/pagemap_ioctl.c    |   2 +-
 tools/testing/selftests/mm/protection_keys.c  |   2 +-
 tools/testing/selftests/mm/seal_elf.c         |  37 +-
 tools/testing/selftests/mm/uffd-common.c      |   4 -
 tools/testing/selftests/mm/uffd-stress.c      |  16 +-
 tools/testing/selftests/mm/uffd-unit-tests.c  |  14 +-
 tools/testing/selftests/mm/vm_util.h          |  15 +
 20 files changed, 1717 insertions(+), 192 deletions(-)
 create mode 100644 tools/include/uapi/asm/unistd_32.h
 create mode 100644 tools/include/uapi/asm/unistd_64.h
 create mode 100644 tools/include/uapi/asm/unistd_x32.h
 create mode 100644 tools/include/uapi/linux/fs.h
 create mode 100644 tools/testing/selftests/mm/mseal_helpers.h

base-commit: 2ccbdf43d5e758f8493a95252073cf9078a5fea5
-- 
2.45.2

10 months


[PATCH v2 0/2] unicode: kunit: refactor selftest to kunit tests

by Pedro Orlando

Hey all,

We are making these changes as part of a KUnit Hackathon at LKCamp [1].

This patch sets out to refactor fs/unicode/utf8-selftest.c to KUnit tests.

The main benefit of this change is that we can leverage KUnit's test suite for
quickly compiling and testing the functions in utf8, instead of compiling the
kernel and loading the previous utf8-selftest module, as well as adopting a
pattern across all kernel tests.

The first commit is the refactoring itself from self test into KUnit, which kept
the original test logic intact -- maintaining the purpose of the original
tests -- with the added benefit of including these tests into the KUnit test
suite.

The second commit applies the naming style and file path conventions defined on
Documentation/dev-tools/kunit/style.rst

We appreciate any feedback and suggestions. :)

[1] https://lkcamp.dev/about/

Co-developed-by: Pedro Orlando <porlando(a)lkcamp.dev>
Signed-off-by: Pedro Orlando <porlando(a)lkcamp.dev>
Co-developed-by: Danilo Pereira <dpereira(a)lkcamp.dev>
Signed-off-by: Danilo Pereira <dpereira(a)lkcamp.dev>
Signed-off-by: Gabriela Bittencourt <gbittencourt(a)lkcamp.dev>

Gabriela Bittencourt (2):
  unicode: kunit: refactor selftest to kunit tests
  unicode: kunit: change tests filename and path

 fs/unicode/Kconfig                            |   5 +-
 fs/unicode/Makefile                           |   2 +-
 fs/unicode/tests/.kunitconfig                 |   3 +
 .../{utf8-selftest.c => tests/utf8_kunit.c}   | 149 ++++++++----------
 4 files changed, 76 insertions(+), 83 deletions(-)
 create mode 100644 fs/unicode/tests/.kunitconfig
 rename fs/unicode/{utf8-selftest.c => tests/utf8_kunit.c} (64%)

-- 
2.34.1

10 months


[PATCH 0/6] KUnit test moves / renames

by David Gow

As discussed in [1], the KUnit test naming scheme has changed to avoid
name conflicts (and tab-completion woes) with the files being tested.
These renames and moves have caused a nasty set of merge conflicts, so
this series collates and rebases them all to be applied via
mm-nonmm-unstable alongside any lib/ changes[2].

Thanks to everyone whose patches appear here, and everyone who reviewed
on the original series. I hope I didn't break them too much during the
rebase!

Link: https://lore.kernel.org/lkml/20240720165441.it.320-kees@kernel.org/ [1]
Link: https://lore.kernel.org/lkml/CABVgOSmbSzcGUi=E4piSojh3A4_0GjE0fAYbqKjtYGbE9… [2]

---

Bruno Sobreira França (1):
  lib/math: Add int_log test suite

Diego Vieira (1):
  lib/tests/kfifo_kunit.c: add tests for the kfifo structure

Gabriela Bittencourt (2):
  unicode: kunit: refactor selftest to kunit tests
  unicode: kunit: change tests filename and path

Kees Cook (1):
  lib: Move KUnit tests into tests/ subdirectory

Luis Felipe Hernandez (1):
  lib: math: Move kunit tests into tests/ subdir

 MAINTAINERS                                   |  19 +-
 arch/m68k/configs/amiga_defconfig             |   2 +-
 arch/m68k/configs/apollo_defconfig            |   2 +-
 arch/m68k/configs/atari_defconfig             |   2 +-
 arch/m68k/configs/bvme6000_defconfig          |   2 +-
 arch/m68k/configs/hp300_defconfig             |   2 +-
 arch/m68k/configs/mac_defconfig               |   2 +-
 arch/m68k/configs/multi_defconfig             |   2 +-
 arch/m68k/configs/mvme147_defconfig           |   2 +-
 arch/m68k/configs/mvme16x_defconfig           |   2 +-
 arch/m68k/configs/q40_defconfig               |   2 +-
 arch/m68k/configs/sun3_defconfig              |   2 +-
 arch/m68k/configs/sun3x_defconfig             |   2 +-
 arch/powerpc/configs/ppc64_defconfig          |   2 +-
 fs/unicode/Kconfig                            |   5 +-
 fs/unicode/Makefile                           |   2 +-
 fs/unicode/tests/.kunitconfig                 |   3 +
 .../{utf8-selftest.c => tests/utf8_kunit.c}   | 149 ++++++------
 fs/unicode/utf8-norm.c                        |   2 +-
 lib/Kconfig.debug                             |  31 ++-
 lib/Makefile                                  |  36 +--
 lib/math/Makefile                             |   5 +-
 lib/math/tests/Makefile                       |   6 +-
 .../{test_div64.c => tests/div64_kunit.c}     |   0
 lib/math/tests/int_log_kunit.c                |  75 ++++++
 .../mul_u64_u64_div_u64_kunit.c}              |   2 +-
 .../rational_kunit.c}                         |   0
 lib/tests/Makefile                            |  39 +++
 lib/{ => tests}/bitfield_kunit.c              |   0
 lib/{ => tests}/checksum_kunit.c              |   0
 lib/{ => tests}/cmdline_kunit.c               |   0
 lib/{ => tests}/cpumask_kunit.c               |   0
 lib/{ => tests}/fortify_kunit.c               |   0
 lib/{ => tests}/hashtable_test.c              |   0
 lib/{ => tests}/is_signed_type_kunit.c        |   0
 lib/tests/kfifo_kunit.c                       | 224 ++++++++++++++++++
 lib/{ => tests}/kunit_iov_iter.c              |   0
 lib/{ => tests}/list-test.c                   |   0
 lib/{ => tests}/memcpy_kunit.c                |   0
 lib/{ => tests}/overflow_kunit.c              |   0
 lib/{ => tests}/siphash_kunit.c               |   0
 lib/{ => tests}/slub_kunit.c                  |   0
 lib/{ => tests}/stackinit_kunit.c             |   0
 lib/{ => tests}/string_helpers_kunit.c        |   0
 lib/{ => tests}/string_kunit.c                |   0
 lib/{ => tests}/test_bits.c                   |   0
 lib/{ => tests}/test_fprobe.c                 |   0
 lib/{ => tests}/test_hash.c                   |   0
 lib/{ => tests}/test_kprobes.c                |   0
 lib/{ => tests}/test_linear_ranges.c          |   0
 lib/{ => tests}/test_list_sort.c              |   0
 lib/{ => tests}/test_sort.c                   |   0
 lib/{ => tests}/usercopy_kunit.c              |   0
 53 files changed, 474 insertions(+), 150 deletions(-)
 create mode 100644 fs/unicode/tests/.kunitconfig
 rename fs/unicode/{utf8-selftest.c => tests/utf8_kunit.c} (64%)
 rename lib/math/{test_div64.c => tests/div64_kunit.c} (100%)
 create mode 100644 lib/math/tests/int_log_kunit.c
 rename lib/math/{test_mul_u64_u64_div_u64.c => tests/mul_u64_u64_div_u64_kunit.c} (98%)
 rename lib/math/{rational-test.c => tests/rational_kunit.c} (100%)
 create mode 100644 lib/tests/Makefile
 rename lib/{ => tests}/bitfield_kunit.c (100%)
 rename lib/{ => tests}/checksum_kunit.c (100%)
 rename lib/{ => tests}/cmdline_kunit.c (100%)
 rename lib/{ => tests}/cpumask_kunit.c (100%)
 rename lib/{ => tests}/fortify_kunit.c (100%)
 rename lib/{ => tests}/hashtable_test.c (100%)
 rename lib/{ => tests}/is_signed_type_kunit.c (100%)
 create mode 100644 lib/tests/kfifo_kunit.c
 rename lib/{ => tests}/kunit_iov_iter.c (100%)
 rename lib/{ => tests}/list-test.c (100%)
 rename lib/{ => tests}/memcpy_kunit.c (100%)
 rename lib/{ => tests}/overflow_kunit.c (100%)
 rename lib/{ => tests}/siphash_kunit.c (100%)
 rename lib/{ => tests}/slub_kunit.c (100%)
 rename lib/{ => tests}/stackinit_kunit.c (100%)
 rename lib/{ => tests}/string_helpers_kunit.c (100%)
 rename lib/{ => tests}/string_kunit.c (100%)
 rename lib/{ => tests}/test_bits.c (100%)
 rename lib/{ => tests}/test_fprobe.c (100%)
 rename lib/{ => tests}/test_hash.c (100%)
 rename lib/{ => tests}/test_kprobes.c (100%)
 rename lib/{ => tests}/test_linear_ranges.c (100%)
 rename lib/{ => tests}/test_list_sort.c (100%)
 rename lib/{ => tests}/test_sort.c (100%)
 rename lib/{ => tests}/usercopy_kunit.c (100%)

-- 
2.47.0.rc1.288.g06298d1525-goog

10 months, 1 week


Re: [PATCH 2/3] KVM: x86: Add support for VMware guest specific hypercalls

by Doug Covelli

On Wed, Nov 13, 2024 at 2:31 AM Paolo Bonzini <pbonzini(a)redhat.com> wrote:
>
> On Tue, Nov 12, 2024, 21:44 Doug Covelli <doug.covelli(a)broadcom.com> wrote:
>>
>> > Split irqchip should be the best tradeoff. Without it, moves from cr8
>> > stay in the kernel, but moves to cr8 always go to userspace with a
>> > KVM_EXIT_SET_TPR exit. You also won't be able to use Intel
>> > flexpriority (in-processor accelerated TPR) because KVM does not know
>> > which bits are set in IRR. So it will be *really* every move to cr8
>> > that goes to userspace.
>>
>> Sorry to hijack this thread but is there a technical reason not to allow CR8
>> based accesses to the TPR (not MMIO accesses) when the in-kernel local APIC is
>> not in use?
>
> No worries, you're not hijacking :) The only reason is that it would be
> more code for a seldom used feature and anyway with worse performance.
> (To be clear, CR8 based accesses are allowed, but stores cause an exit
> in order to check the new TPR against IRR. That's because KVM's API does
> not have an equivalent of the TPR threshold as you point out below).

I have not really looked at the code but it seems like it could also simplify
things as CR8 would be handled more uniformly regardless of who is virtualizing
the local APIC.

>> Also I could not find these documented anywhere but with MSFT's APIC our monitor
>> relies on extensions for trapping certain events such as INIT/SIPI plus LINT0
>> and SVR writes:
>>
>> UINT64 X64ApicInitSipiExitTrap : 1; // WHvRunVpExitReasonX64ApicInitSipiTrap
>> UINT64 X64ApicWriteLint0ExitTrap : 1; // WHvRunVpExitReasonX64ApicWriteTrap
>> UINT64 X64ApicWriteLint1ExitTrap : 1; // WHvRunVpExitReasonX64ApicWriteTrap
>> UINT64 X64ApicWriteSvrExitTrap : 1; // WHvRunVpExitReasonX64ApicWriteTrap
>
> There's no need for this in KVM's in-kernel APIC model. INIT and SIPI
> are handled in the hypervisor and you can get the current state of APs
> via KVM_GET_MPSTATE. LINT0 and LINT1 are injected with KVM_INTERRUPT and
> KVM_NMI respectively, and they obey IF/PPR and NMI blocking
> respectively, plus the interrupt shadow; so there's no need for
> userspace to know when LINT0/LINT1 themselves change. The spurious
> interrupt vector register is also handled completely in kernel.

I realize that KVM can handle LINT0/SVR updates themselves but our interrupt
subsystem relies on knowing the current values of these registers even when not
virtualizing the local APIC. I suppose we could use KVM_GET_LAPIC to sync
things up on demand but that seems like it might not be great from a performance
point of view.

>> I did not see any similar functionality for KVM. Does anything like that exist?
>> In any case we would be happy to add support for handling CR8 accesses w/o
>> exiting w/o the in-kernel APIC along with some sort of a way to configure the
>> TPR threshold if folks are not opposed to that.
>
> As far as I know everybody who's using KVM (whether proprietary or open
> source) has had no need for that, so I don't think it's a good idea to
> make the API more complex. Performance of Windows guests is going to be
> bad anyway with userspace APIC.

From what I have seen the exit cost with KVM is significantly lower than with
WHP/Hyper-V. I don't think performance of Windows guests with userspace APIC
emulation would be bad if CR8 exits could be avoided (Linux guests perf isn't
bad from what I have observed and the main difference is the astronomical number
of CR8 exits). It seems like it would be pretty decent although I agree if you
want the absolute best performance then you would want to use the in kernel APIC
to speed up handling of ICR/EOI writes but those are relatively infrequent
compared to CR8 accesses.

Anyway I just saw Sean's response while writing this and it seems he is not in
favor of avoiding CR8 exits w/o the in kernel APIC either so I suppose we will
have to look into making use of the in kernel APIC.

Doug

> Paolo
>
>> Doug
>>
>> > > For now I think it makes sense to handle BDOOR_CMD_GET_VCPU_INFO at userlevel
>> > > like we do on Windows and macOS.
>> > >
>> > > BDOOR_CMD_GETTIME/BDOOR_CMD_GETTIMEFULL are similar with the former being
>> > > deprecated in favor of the latter. Both do essentially the same thing which is
>> > > to return the host OS's time - on Linux this is obtained via gettimeofday. I
>> > > believe this is mainly used by tools to fix up the VM's time when resuming from
>> > > suspend. I think it is fine to continue handling these at userlevel.
>> >
>> > As long as the TSC is not involved it should be okay.
>> >
>> > Paolo
>> >
>> > > > >> Anyway, one question apart from this: is the API the same for the I/O
>> > > > >> port and hypercall backdoors?
>> > > > >
>> > > > > Yeah the calls and arguments are the same. The hypercall based
>> > > > > interface is an attempt to modernize the backdoor since as you pointed
>> > > > > out the I/O based interface is kind of hacky as it bypasses the normal
>> > > > > checks for an I/O port access at CPL3. It would be nice to get rid of
>> > > > > it but unfortunately I don't think that will happen in the foreseeable
>> > > > > future as there are a lot of existing VMs out there with older SW that
>> > > > > still uses this interface.
>> > > >
>> > > > Yeah, but I think it still justifies that the KVM_ENABLE_CAP API can
>> > > > enable the hypercall but not the I/O port.
>> > > >
>> > > > Paolo
>> >
>>
>> --
>> This electronic communication and the information and any files transmitted
>> with it, or attached to it, are confidential and are intended solely for
>> the use of the individual or entity to whom it is addressed and may contain
>> information that is confidential, legally privileged, protected by privacy
>> laws, or otherwise restricted from disclosure to anyone else. If you are
>> not the intended recipient or the person responsible for delivering the
>> e-mail to the intended recipient, you are hereby notified that any use,
>> copying, distributing, dissemination, forwarding, printing, or copying of
>> this e-mail is strictly prohibited. If you received this e-mail in error,
>> please return the e-mail to the sender, delete it from your computer, and
>> destroy any printed copy of it.
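To make the "sync things up on demand" idea from the reply concrete, here is a minimal sketch of decoding TPR, SVR, LINT0 and LINT1 from a snapshot of the local APIC register page, which is what KVM_GET_LAPIC hands back. This is an illustration under stated assumptions, not code from either participant: `struct demo_lapic_state` is a hypothetical stand-in with the same 0x400-byte size as KVM's register page, all `demo_*` names are invented, and the register offsets are the architectural xAPIC offsets. A real VMM would fill the buffer with an ioctl on a vCPU file descriptor, which is omitted here so that the sketch stays self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_APIC_REG_SIZE 0x400 /* size of the APIC register page snapshot */

/* Architectural xAPIC register offsets. */
#define DEMO_APIC_TPR   0x080 /* task priority register */
#define DEMO_APIC_SVR   0x0F0 /* spurious interrupt vector register */
#define DEMO_APIC_LINT0 0x350 /* LVT LINT0 */
#define DEMO_APIC_LINT1 0x360 /* LVT LINT1 */

/* Hypothetical stand-in for the buffer filled by KVM_GET_LAPIC. */
struct demo_lapic_state {
	char regs[DEMO_APIC_REG_SIZE];
};

/* Hypothetical userspace shadow of the registers the monitor tracks. */
struct demo_apic_shadow {
	uint32_t tpr;
	uint32_t svr;
	uint32_t lint0;
	uint32_t lint1;
};

/* Read one 32-bit register; registers sit on 16-byte boundaries. */
static inline uint32_t demo_lapic_read(const struct demo_lapic_state *s,
				       unsigned int off)
{
	uint32_t v;

	assert(off % 16 == 0 && off + sizeof(v) <= DEMO_APIC_REG_SIZE);
	memcpy(&v, &s->regs[off], sizeof(v));
	return v;
}

/* Refresh the shadow from a snapshot taken on demand. */
static inline void demo_sync_shadow(const struct demo_lapic_state *s,
				    struct demo_apic_shadow *out)
{
	out->tpr = demo_lapic_read(s, DEMO_APIC_TPR);
	out->svr = demo_lapic_read(s, DEMO_APIC_SVR);
	out->lint0 = demo_lapic_read(s, DEMO_APIC_LINT0);
	out->lint1 = demo_lapic_read(s, DEMO_APIC_LINT1);
}
```

Each refresh copies the whole register page from the kernel, which is presumably the performance concern raised in the reply; how often a monitor would need to refresh is not stated in the thread.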

10 months, 1 week


[PATCH v11 00/14] riscv: Add support for xtheadvector

by Charlie Jenkins

xtheadvector is a custom extension that is based upon riscv vector
version 0.7.1 [1]. All of the vector routines have been modified to
support this alternative vector version based upon whether xtheadvector
was determined to be supported at boot.

vlenb is not supported on the existing xtheadvector hardware, so a
devicetree property thead,vlenb is added to provide the vlenb to Linux.

There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.

Support for xtheadvector is also added to the vector kselftests.

Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>

[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361…

---
This series is a continuation of a different series that was fragmented
into two other series in an attempt to get part of it merged in the 6.10
merge window. The split-off series did not get merged due to a NAK on
the series that added the generic riscv,vlenb devicetree entry. This
series has converted riscv,vlenb to thead,vlenb to remedy this issue.

The original series is titled "riscv: Support vendor extensions and
xtheadvector" [3].

I have tested this with an Allwinner Nezha board. I used SkiffOS [1] to
manage building the image, but upgraded the U-Boot version to Samuel
Holland's more up-to-date version [2] and changed out the device tree
used by U-Boot with the device trees that are present in upstream linux
and this series. Thank you Samuel for all of the work you did to make
this task possible.

[1] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[2] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9…
[3] https://lore.kernel.org/all/20240503-dev-charlie-support_thead_vector_6_9-v…
[4] https://lore.kernel.org/lkml/20240719-support_vendor_extensions-v3-4-0af758…

---
Changes in v11:
- Fix an issue where the mitigation was not being properly skipped when
  requested
- Fix vstate_discard issue
- Fix issue when -1 was passed into
  __riscv_isa_vendor_extension_available()
- Remove some artifacts from being placed in the test directory
- Link to v10: https://lore.kernel.org/r/20240911-xtheadvector-v10-0-8d3930091246@rivosinc…

Changes in v10:
- In DT probing disable vector with new function to clear vendor
  extension bits for xtheadvector
- Add ghostwrite mitigations for c9xx CPUs. This disables xtheadvector
  unless mitigations=off is set as a kernel boot arg
- Link to v9: https://lore.kernel.org/r/20240806-xtheadvector-v9-0-62a56d2da5d0@rivosinc.…

Changes in v9:
- Rebase onto palmer's for-next
- Fix sparse error in arch/riscv/kernel/vendor_extensions/thead.c
- Fix maybe-uninitialized warning in
  arch/riscv/include/asm/vendor_extensions/vendor_hwprobe.h
- Wrap some long lines
- Link to v8: https://lore.kernel.org/r/20240724-xtheadvector-v8-0-cf043168e137@rivosinc.…

Changes in v8:
- Rebase onto palmer's for-next
- Link to v7: https://lore.kernel.org/r/20240724-xtheadvector-v7-0-b741910ada3e@rivosinc.…

Changes in v7:
- Add defs for has_xtheadvector_no_alternatives() and
  has_xtheadvector() when vector disabled. (Palmer)
- Link to v6: https://lore.kernel.org/r/20240722-xtheadvector-v6-0-c9af0130fa00@rivosinc.…

Changes in v6:
- Fix return type of is_vector_supported()/is_xthead_supported() to be
  bool
- Link to v5: https://lore.kernel.org/r/20240719-xtheadvector-v5-0-4b485fc7d55f@rivosinc.…

Changes in v5:
- Rebase on for-next
- Link to v4: https://lore.kernel.org/r/20240702-xtheadvector-v4-0-2bad6820db11@rivosinc.…

Changes in v4:
- Replace inline asm with C (Samuel)
- Rename VCSRs to CSRs (Samuel)
- Replace .insn directives with .4byte directives
- Link to v3: https://lore.kernel.org/r/20240619-xtheadvector-v3-0-bff39eb9668e@rivosinc.…

Changes in v3:
- Add back Heiko's signed-off-by (Conor)
- Mark RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 as a bitmask
- Link to v2: https://lore.kernel.org/r/20240610-xtheadvector-v2-0-97a48613ad64@rivosinc.…

Changes in v2:
- Removed extraneous references to "riscv,vlenb" (Jess)
- Moved declaration of "thead,vlenb" into cpus.yaml and added
  restriction that it's only applicable to thead cores (Conor)
- Check CONFIG_RISCV_ISA_XTHEADVECTOR instead of CONFIG_RISCV_ISA_V for
  thead,vlenb (Jess)
- Fix naming of hwprobe variables (Evan)
- Link to v1: https://lore.kernel.org/r/20240609-xtheadvector-v1-0-3fe591d7f109@rivosinc.…

---
Charlie Jenkins (13):
  dt-bindings: riscv: Add xtheadvector ISA extension description
  dt-bindings: cpus: add a thead vlen register length property
  riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
  riscv: Add thead and xtheadvector as a vendor extension
  riscv: vector: Use vlenb from DT for thead
  riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
  riscv: Add xtheadvector instruction definitions
  riscv: vector: Support xtheadvector save/restore
  riscv: hwprobe: Add thead vendor extension probing
  riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
  selftests: riscv: Fix vector tests
  selftests: riscv: Support xtheadvector in vector tests
  riscv: Add ghostwrite vulnerability

Heiko Stuebner (1):
  RISC-V:
define the elements of the VCSR vector CSR Documentation/arch/riscv/hwprobe.rst | 10 + Documentation/devicetree/bindings/riscv/cpus.yaml | 19 ++ .../devicetree/bindings/riscv/extensions.yaml | 10 + arch/riscv/Kconfig.errata | 11 + arch/riscv/Kconfig.vendor | 26 ++ arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi | 3 +- arch/riscv/errata/thead/errata.c | 28 ++ arch/riscv/include/asm/bugs.h | 22 ++ arch/riscv/include/asm/cpufeature.h | 2 + arch/riscv/include/asm/csr.h | 15 + arch/riscv/include/asm/errata_list.h | 3 +- arch/riscv/include/asm/hwprobe.h | 3 +- arch/riscv/include/asm/switch_to.h | 2 +- arch/riscv/include/asm/vector.h | 222 +++++++++++---- arch/riscv/include/asm/vendor_extensions/thead.h | 47 ++++ .../include/asm/vendor_extensions/thead_hwprobe.h | 19 ++ .../include/asm/vendor_extensions/vendor_hwprobe.h | 37 +++ arch/riscv/include/uapi/asm/hwprobe.h | 3 +- arch/riscv/include/uapi/asm/vendor/thead.h | 3 + arch/riscv/kernel/Makefile | 2 + arch/riscv/kernel/bugs.c | 60 ++++ arch/riscv/kernel/cpufeature.c | 59 +++- arch/riscv/kernel/kernel_mode_vector.c | 8 +- arch/riscv/kernel/process.c | 4 +- arch/riscv/kernel/signal.c | 6 +- arch/riscv/kernel/sys_hwprobe.c | 5 + arch/riscv/kernel/vector.c | 24 +- arch/riscv/kernel/vendor_extensions.c | 10 + arch/riscv/kernel/vendor_extensions/Makefile | 2 + arch/riscv/kernel/vendor_extensions/thead.c | 29 ++ .../riscv/kernel/vendor_extensions/thead_hwprobe.c | 19 ++ drivers/base/cpu.c | 3 + include/linux/cpu.h | 1 + tools/testing/selftests/riscv/vector/.gitignore | 3 +- tools/testing/selftests/riscv/vector/Makefile | 17 +- .../selftests/riscv/vector/v_exec_initval_nolibc.c | 94 +++++++ tools/testing/selftests/riscv/vector/v_helpers.c | 68 +++++ tools/testing/selftests/riscv/vector/v_helpers.h | 8 + tools/testing/selftests/riscv/vector/v_initval.c | 22 ++ .../selftests/riscv/vector/v_initval_nolibc.c | 68 ----- .../selftests/riscv/vector/vstate_exec_nolibc.c | 20 +- .../testing/selftests/riscv/vector/vstate_prctl.c | 305 
+++++++++++++-------- 42 files changed, 1051 insertions(+), 271 deletions(-) --- base-commit: 0eb512779d642b21ced83778287a0f7a3ca8f2a1 change-id: 20240530-xtheadvector-833d3d17b423 -- - Charlie


[PATCH v4 0/9] mm: workingset reporting

by Yuanchu Xie

This patch series provides workingset reporting of user pages in lruvecs, of which coldness can be tracked by accessed bits and fd references. However, the concept of workingset applies generically to all types of memory, which could be kernel slab caches, discardable userspace caches (databases), or CXL.mem. Therefore, data sources might come from slab shrinkers, device drivers, or the userspace.

Another interesting idea might be hugepage workingset, so that we can measure the proportion of hugepages backing cold memory. However, with architectures like arm, there may be too many hugepage sizes leading to a combinatorial explosion when exporting stats to the userspace.

Nonetheless, the kernel should provide a set of workingset interfaces that is generic enough to accommodate the various use cases, and extensible to potential future use cases.

Use cases
==========

Job scheduling
On overcommitted hosts, workingset information improves efficiency and reliability by allowing the job scheduler to have better stats on the exact memory requirements of each job. This can manifest in efficiency by landing more jobs on the same host or NUMA node. On the other hand, the job scheduler can also ensure each node has a sufficient amount of memory and does not enter direct reclaim or the kernel OOM path. With workingset information and job priority, the userspace OOM killing or proactive reclaim policy can kick in before the system is under memory pressure. If the job shape is very different from the machine shape, knowing the workingset per-node can also help inform page allocation policies.

Proactive reclaim
Workingset information allows a container manager to proactively reclaim memory while not impacting a job's performance. While PSI may provide a reactive measure of when a proactive reclaim has reclaimed too much, workingset reporting allows the policy to be more accurate and flexible.

Ballooning (similar to proactive reclaim)
The last patch of the series extends the virtio-balloon device to report the guest workingset. Balloon policies benefit from workingset to more precisely determine the size of the memory balloon. On end-user devices where memory is scarce and overcommitted, the balloon sizing in multiple VMs running on the same device can be orchestrated with workingset reports from each one. On the server side, workingset reporting allows the balloon controller to inflate the balloon without causing too much file cache to be reclaimed in the guest.

Promotion/Demotion
If different mechanisms are used for promotion and demotion, workingset information can help connect the two and avoid pages being migrated back and forth. For example, given a promotion hot page threshold defined in reaccess distance of N seconds (promote pages accessed more often than every N seconds), the threshold N should be set so that ~80% (e.g.) of pages on the fast memory node pass the threshold. This calculation can be done with workingset reports. To be directly useful for promotion policies, the workingset report interfaces need to be extended to report hotness and gather hotness information from the devices[1].

[1] https://www.opencompute.org/documents/ocp-cms-hotness-tracking-requirements…

Sysfs and Cgroup Interfaces
==========

The interfaces are detailed in the patches that introduce them. The main idea here is we break down the workingset per-node per-memcg into time intervals (ms), e.g.

1000 anon=137368 file=24530
20000 anon=34342 file=0
30000 anon=353232 file=333608
40000 anon=407198 file=206052
9223372036854775807 anon=4925624 file=892892

Implementation
==========

The reporting of user pages is based off of MGLRU, and therefore requires CONFIG_LRU_GEN=y. We would benefit from more MGLRU generations for a more fine-grained workingset report, but we can already gather a lot of data with just four generations. The workingset reporting mechanism is gated behind CONFIG_WORKINGSET_REPORT, and the aging thread is behind CONFIG_WORKINGSET_REPORT_AGING.

Benchmarks
==========

Ghait Ouled Amar Ben Cheikh has implemented a simple policy and run Linux compile and redis benchmarks from openbenchmarking.org. The policy and runner are referred to as WMO (Workload Memory Optimization). The results were based on v3 of the series, but v4 doesn't change the core of the working set reporting and just adds the ballooning counterpart.

The timed Linux kernel compilation benchmark shows improvements in peak memory usage with a policy of "swap out all bytes colder than 10 seconds every 40 seconds". A swapfile is configured on SSD.

--------------------------------------------
peak memory usage (with WMO): 4982.61328 MiB
peak memory usage (control): 9569.1367 MiB
peak memory reduction: 47.9%
--------------------------------------------
Benchmark | Experimental | Control | Experimental_Std_Dev | Control_Std_Dev
Timed Linux Kernel Compilation - allmodconfig (sec) | 708.486 (95.91%) | 679.499 (100%) | 0.6% | 0.1%
--------------------------------------------
Seconds, fewer is better

The redis benchmark employs the same policy:

--------------------------------------------
peak memory usage (with WMO): 375.9023 MiB
peak memory usage (control): 509.765 MiB
peak memory reduction: 26%
--------------------------------------------
Benchmark | Experimental | Control | Experimental_Std_Dev | Control_Std_Dev
Redis - LPOP (Reqs/sec) | 2023130 (98.22%) | 2059849 (100%) | 1.2% | 2%
Redis - SADD (Reqs/sec) | 2539662 (98.63%) | 2574811 (100%) | 2.3% | 1.4%
Redis - LPUSH (Reqs/sec) | 2024880 (100%) | 2000884 (98.81%) | 1.1% | 0.8%
Redis - GET (Reqs/sec) | 2835764 (100%) | 2763722 (97.46%) | 2.7% | 1.6%
Redis - SET (Reqs/sec) | 2340723 (100%) | 2327372 (99.43%) | 2.4% | 1.8%
--------------------------------------------
Reqs/sec, more is better

The detailed report and benchmarking results are in Ghait's repo:
https://github.com/miloudi98/WMO

Changelog
==========

Changes from PATCH v3 -> v4:
- Added documentation for cgroup-v2 (Waiman Long)
- Fixed types in documentation (Randy Dunlap)
- Added implementation for the ballooning use case
- Added detailed description of benchmark results (Andrew Morton)

Changes from PATCH v2 -> v3:
- Fixed typos in commit messages and documentation (Lance Yang, Randy Dunlap)
- Split out the force_scan patch to be reviewed separately
- Added benchmarks from Ghait Ouled Amar Ben Cheikh
- Fixed reported compile error without CONFIG_MEMCG

Changes from PATCH v1 -> v2:
- Updated selftest to use ksft_test_result_code instead of switch-case (Muhammad Usama Anjum)
- Included more use cases in the cover letter (Huang, Ying)
- Added documentation for sysfs and memcg interfaces
- Added an aging-specific struct lru_gen_mm_walk in struct pglist_data to avoid allocating for each lruvec.

[v1] https://lore.kernel.org/linux-mm/20240504073011.4000534-1-yuanchu@google.co…
[v2] https://lore.kernel.org/linux-mm/20240604020549.1017540-1-yuanchu@google.co…
[v3] https://lore.kernel.org/linux-mm/20240813165619.748102-1-yuanchu@google.com/

Yuanchu Xie (9):
  mm: aggregate workingset information into histograms
  mm: use refresh interval to rate-limit workingset report aggregation
  mm: report workingset during memory pressure driven scanning
  mm: extend workingset reporting to memcgs
  mm: add kernel aging thread for workingset reporting
  selftest: test system-wide workingset reporting
  Docs/admin-guide/mm/workingset_report: document sysfs and memcg interfaces
  Docs/admin-guide/cgroup-v2: document workingset reporting
  virtio-balloon: add workingset reporting

 Documentation/admin-guide/cgroup-v2.rst | 35 +
 Documentation/admin-guide/mm/index.rst | 1 +
 .../admin-guide/mm/workingset_report.rst | 105 +++
 drivers/base/node.c | 6 +
 drivers/virtio/virtio_balloon.c | 390 ++++++++++-
 include/linux/balloon_compaction.h | 1 +
 include/linux/memcontrol.h | 21 +
 include/linux/mmzone.h | 13 +
 include/linux/workingset_report.h | 167 +++++
 include/uapi/linux/virtio_balloon.h | 30 +
 mm/Kconfig | 15 +
 mm/Makefile | 2 +
 mm/internal.h | 19 +
 mm/memcontrol.c | 162 ++++-
 mm/mm_init.c | 2 +
 mm/mmzone.c | 2 +
 mm/vmscan.c | 56 +-
 mm/workingset_report.c | 653 ++++++++++++++++++
 mm/workingset_report_aging.c | 127 ++++
 tools/testing/selftests/mm/.gitignore | 1 +
 tools/testing/selftests/mm/Makefile | 3 +
 tools/testing/selftests/mm/run_vmtests.sh | 5 +
 .../testing/selftests/mm/workingset_report.c | 306 ++++++++
 .../testing/selftests/mm/workingset_report.h | 39 ++
 .../selftests/mm/workingset_report_test.c | 330 +++++++++
 25 files changed, 2482 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/admin-guide/mm/workingset_report.rst
 create mode 100644 include/linux/workingset_report.h
 create mode 100644 mm/workingset_report.c
 create mode 100644 mm/workingset_report_aging.c
 create mode 100644 tools/testing/selftests/mm/workingset_report.c
 create mode 100644 tools/testing/selftests/mm/workingset_report.h
 create mode 100644 tools/testing/selftests/mm/workingset_report_test.c

--
2.47.0.338.g60cca15819-goog
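To make the interval report from the cover letter more concrete, here is a rough userspace sketch of how a policy daemon might consume it. It is not code from the series. It assumes (this is my reading, not something the cover letter states) that the first number on each line is the upper bound of an idle-age bin in milliseconds and that the bins are consecutive. The names parse_report, colder_than and smallest_threshold are hypothetical, and the sample text is the example from the cover letter.

```python
from typing import List, NamedTuple


class Bin(NamedTuple):
    upper_ms: int  # assumed: upper bound of the idle-age interval, in ms
    anon: int
    file: int


# Example report copied from the cover letter.
SAMPLE = """\
1000 anon=137368 file=24530
20000 anon=34342 file=0
30000 anon=353232 file=333608
40000 anon=407198 file=206052
9223372036854775807 anon=4925624 file=892892
"""


def parse_report(text: str) -> List[Bin]:
    """Parse lines of the form '<interval_ms> anon=<n> file=<n>'."""
    bins = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        counts = dict(field.split("=", 1) for field in fields[1:])
        bins.append(Bin(int(fields[0]), int(counts["anon"]), int(counts["file"])))
    return bins


def colder_than(bins: List[Bin], threshold_ms: int) -> int:
    """Sum anon+file over bins whose (assumed) lower bound is >= threshold_ms."""
    total = 0
    lower = 0
    for b in bins:
        if lower >= threshold_ms:
            total += b.anon + b.file
        lower = b.upper_ms
    return total


def smallest_threshold(bins: List[Bin], fraction: float) -> int:
    """Smallest bin upper bound N such that at least `fraction` of the
    reported memory was accessed within the last N ms."""
    grand_total = sum(b.anon + b.file for b in bins)
    running = 0
    for b in bins:
        running += b.anon + b.file
        if running >= fraction * grand_total:
            return b.upper_ms
    return bins[-1].upper_ms


if __name__ == "__main__":
    report = parse_report(SAMPLE)
    print("colder than 20 s:", colder_than(report, 20000))
    print("threshold covering 10%:", smallest_threshold(report, 0.10))
```

A reclaim policy like the one used in the benchmarks ("swap out all bytes colder than 10 seconds") would use something like colder_than() to size a reclaim request, while a promotion policy would use something like smallest_threshold() to choose N.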


[PATCH v5 0/3] selftests/lam: get_user additions and LAM enabled check

by Maciej Wieczor-Retman

A recent change in how get_user() handles pointers [1] has a specific case for LAM. It assigns a different bitmask that's later used to check whether a pointer comes from userland in get_user(). While currently commented out (until LASS [2] is merged into the kernel) it's worth making changes to the LAM selftest ahead of time.

Modify cpu_has_la57() so it provides current paging level information instead of the cpuid one.

Add a test case to LAM that utilizes an ioctl (FIOASYNC) syscall which uses get_user() in its implementation. Execute the syscall with differently tagged pointers to verify that valid user pointers are passing through and invalid kernel/non-canonical pointers are not.

Also, to avoid unhelpful test failures, add a check in main() to skip running tests if LAM was not compiled into the kernel.

Code was tested on a Sierra Forest Xeon machine that's LAM capable. The test was run without issues with both the LAM lines from [1] untouched and commented out. The test was also run without issues with LAM_SUP both enabled and disabled.

4/5 level pagetables code paths were also successfully tested in Simics on a 5-level capable machine.

[1] https://lore.kernel.org/all/20241024013214.129639-1-torvalds@linux-foundati…
[2] https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@lin…

Maciej Wieczor-Retman (3):
  selftests/lam: Move cpu_has_la57() to use cpuinfo flag
  selftests/lam: Skip test if LAM is disabled
  selftests/lam: Test get_user() LAM pointer handling

 tools/testing/selftests/x86/lam.c | 120 ++++++++++++++++++++++++++++--
 1 file changed, 115 insertions(+), 5 deletions(-)

--
2.47.1
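For readers unfamiliar with what "differently tagged pointers" means here, the following is a small illustration of the bit layout only. It is not the selftest and not the kernel's get_user() check. It assumes the LAM_U57 mode, in which six metadata bits live in bits 62:57 of a user pointer, and it uses bit 63 as the user/kernel distinction on x86-64. The function names are hypothetical and the addresses are made-up examples.

```python
U64_MAX = (1 << 64) - 1

# Assumed mode: LAM_U57, metadata in bits 62:57 (6 bits).
LAM_U57_BITS = 6
LAM_U57_SHIFT = 57
LAM_U57_MASK = ((1 << LAM_U57_BITS) - 1) << LAM_U57_SHIFT


def tag_pointer(addr: int, tag: int) -> int:
    """Place `tag` into the metadata bits of a 64-bit address."""
    if not 0 <= tag < (1 << LAM_U57_BITS):
        raise ValueError("tag does not fit in the LAM_U57 metadata bits")
    return ((addr & ~LAM_U57_MASK) | (tag << LAM_U57_SHIFT)) & U64_MAX


def untag_pointer(addr: int) -> int:
    """Clear the metadata bits, giving back the plain address."""
    return addr & ~LAM_U57_MASK & U64_MAX


def looks_like_user_pointer(addr: int) -> bool:
    """On x86-64 a user address has bit 63 clear, with or without a tag."""
    return (addr >> 63) == 0


if __name__ == "__main__":
    user_addr = 0x00007F0012345678      # illustrative user address
    kernel_addr = 0xFFFF888000000000    # illustrative kernel address
    tagged = tag_pointer(user_addr, 0x3F)
    print(hex(tagged), looks_like_user_pointer(tagged))
    print(hex(kernel_addr), looks_like_user_pointer(kernel_addr))
```

The point of the sketch is that a tagged user pointer still has bit 63 clear, so a check keyed on the top bit can accept it, while a kernel or otherwise invalid pointer is rejected regardless of what sits in the metadata bits.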


"stty sane" in kunit.py

by Brendan Jackman

Hi all,

Does anyone know what the 'stty sane' invocation in kunit.py is about? The other day I ran into an issue when running it via watchexec[1].

At the time I believed that it was there to clean up after the firmware that QEMU runs potentially messed up the terminal. However, I just realised I'm not sure if that makes sense - stty is about setting terminal settings via ioctl. I don't think QEMU or its guests are messing up the terminal with ioctls, they're just writing funny control characters.

What's going on here? I guess one of:

1. Terminal is messed up with ctrl chars but ioctls are the easiest/only way to reliably clean it up.
2. Nobody thought about this unimportant detail so hard before and there's no particular rationale in place here.
3. I made bad assumptions about why the `stty sane` is there.

If it's 1 or 2 I wonder if there's an alternative way to clean up without getting the SIGTTOU issue. Or, maybe it doesn't matter and the fact that this was ever a problem is just a bug in watchexec (maybe you can tell I haven't actually taken the time to research the SIGTTOU thing properly). But thought I'd raise it in case this points to issues people might have using kunit.py in CI.

[1] https://github.com/watchexec/watchexec/issues/874
[2] https://gist.github.com/bjackman/27fd9980d87c5556c20e67a6ed891500
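One possible alternative to shelling out to `stty sane`, offered only as a sketch and not as what kunit.py does: save the terminal attributes before launching the child and restore them afterwards with termios. POSIX specifies that a process in a background process group calling tcsetattr() on its controlling terminal is sent SIGTTOU, unless the process is ignoring or blocking that signal, in which case the call is allowed to proceed. The helper names below are hypothetical, and whether this avoids the watchexec problem has not been verified.

```python
import os
import signal
import termios


def save_tty(fd: int):
    """Return the current termios attributes, or None if fd is not a terminal."""
    if not os.isatty(fd):
        return None
    return termios.tcgetattr(fd)


def restore_tty(fd: int, saved) -> bool:
    """Restore attributes captured by save_tty(); returns False if there was
    nothing to restore. SIGTTOU is ignored for the duration of the call so a
    background process group is not stopped by tcsetattr()."""
    if saved is None:
        return False
    previous = signal.signal(signal.SIGTTOU, signal.SIG_IGN)
    try:
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
    finally:
        # signal.signal() returns None when the old handler was not
        # installed from Python; in that case fall back to the default.
        signal.signal(signal.SIGTTOU,
                      previous if previous is not None else signal.SIG_DFL)
    return True


if __name__ == "__main__":
    state = save_tty(0)
    # ... run QEMU or any child that may leave the terminal in a bad state ...
    print("restored" if restore_tty(0, state) else "stdin is not a terminal")
```

Note the difference in behaviour: `stty sane` resets the terminal to a fixed set of defaults, whereas this restores whatever was in effect before the child ran, and it does nothing about control characters already written to the screen.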

