- Linux-kselftest-mirror - lists.linaro.org

[PATCH] kunit: tool: cosmetic: don't specify duplicate kunit_shutdown's

by Daniel Latypov

Context:
When using a non-UML arch, kunit.py will boot the test kernel with these
options by default:
> mem=1G console=tty kunit_shutdown=halt console=ttyS0 kunit_shutdown=reboot

For QEMU, we need to use 'reboot', and for UML we need to use 'halt'.
If you switch them, kunit.py will hang until the --timeout expires.

So the code currently unconditionally adds 'kunit_shutdown=halt' but
then appends 'reboot' when using QEMU (which overwrites it).

This patch:
Having these duplicate options is a bit noisy.
Switch so we only add 'halt' for UML.

I.e. we now get
UML: 'mem=1G console=tty console=ttyS0 kunit_shutdown=halt'
QEMU: 'mem=1G console=tty console=ttyS0 kunit_shutdown=reboot'

Side effect: you can't overwrite kunit_shutdown on UML w/ --kernel_arg.
But you already couldn't for QEMU, and why would you want to?

Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
 tools/testing/kunit/kunit_kernel.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index 483f78e15ce9..9731ceb7ad92 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -158,7 +158,7 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
 	def start(self, params: List[str], build_dir: str) -> subprocess.Popen:
 		"""Runs the Linux UML binary. Must be named 'linux'."""
 		linux_bin = os.path.join(build_dir, 'linux')
-		return subprocess.Popen([linux_bin] + params,
+		return subprocess.Popen([linux_bin] + params + ['kunit_shutdown=halt'],
 					   stdin=subprocess.PIPE,
 					   stdout=subprocess.PIPE,
 					   stderr=subprocess.STDOUT,
@@ -332,7 +332,7 @@ class LinuxSourceTree(object):
 	def run_kernel(self, args=None, build_dir='', filter_glob='', timeout=None) -> Iterator[str]:
 		if not args:
 			args = []
-		args.extend(['mem=1G', 'console=tty', 'kunit_shutdown=halt'])
+		args.extend(['mem=1G', 'console=tty'])
 		if filter_glob:
 			args.append('kunit.filter_glob='+filter_glob)
 

base-commit: b04d1a8dc7e7ff7ca91a20bef053bcc04265d83a
--
2.35.1.1178.g4f1659d476-goog
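The behaviour this patch describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in kunit_kernel.py: the helper names `shutdown_mode`, `build_kernel_args` and `effective_value` are hypothetical, the arch string 'um' for UML is assumed, and console=ttyS0 is left out to keep the focus on kunit_shutdown.

```python
from typing import List


def shutdown_mode(arch: str) -> str:
    """Return the kunit_shutdown value for an arch (hypothetical helper)."""
    # Per the patch description: UML needs 'halt', QEMU needs 'reboot'.
    return 'halt' if arch == 'um' else 'reboot'


def build_kernel_args(arch: str, user_args: List[str]) -> List[str]:
    """Build a kernel command line with exactly one tool-added kunit_shutdown."""
    args = list(user_args)  # copy so the caller's list is not modified
    args.extend(['mem=1G', 'console=tty'])
    # Appended last, so a kunit_shutdown passed by the user cannot win.
    args.append('kunit_shutdown=' + shutdown_mode(arch))
    return args


def effective_value(args: List[str], key: str) -> str:
    """Return the last value given for key, modelling 'later option wins'."""
    value = ''
    for arg in args:
        name, sep, val = arg.partition('=')
        if sep and name == key:
            value = val
    return value
```

With this model, the old duplicated command line resolves to 'reboot' because it comes last, and a user-supplied kunit_shutdown is always overridden by the one the tool appends, which matches the side effect noted in the commit message.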

3 years, 2 months


[PATCH V2] testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set

by Athira Rajeev

The selftest "mqueue/mq_perf_tests.c" uses CPU_ALLOC to allocate a
CPU set. This CPU set is then used by pthread_attr_setaffinity_np
and by pthread_create. In the current code, however, the allocated
CPU set is never freed.

Fix this by adding CPU_FREE in the "shutdown" function, which is
called in most of the error/exit paths for cleanup. Also add
CPU_FREE in the error paths where shutdown is not called.

Fixes: 7820b0715b6f ("tools/selftests: add mq_perf_tests")
Signed-off-by: Athira Rajeev <atrajeev(a)linux.vnet.ibm.com>
---
Changelog:
From v1 -> v2:
 Addressed review comment from Shuah Khan to add
 CPU_FREE in other exit paths where it is needed

 tools/testing/selftests/mqueue/mq_perf_tests.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c
index b019e0b8221c..182434c7898d 100644
--- a/tools/testing/selftests/mqueue/mq_perf_tests.c
+++ b/tools/testing/selftests/mqueue/mq_perf_tests.c
@@ -180,6 +180,9 @@ void shutdown(int exit_val, char *err_cause, int line_no)
 	if (in_shutdown++)
 		return;
 
+	/* Free the cpu_set allocated using CPU_ALLOC in main function */
+	CPU_FREE(cpu_set);
+
 	for (i = 0; i < num_cpus_to_pin; i++)
 		if (cpu_threads[i]) {
 			pthread_kill(cpu_threads[i], SIGUSR1);
@@ -589,6 +592,7 @@ int main(int argc, char *argv[])
 						     cpu_set)) {
 					fprintf(stderr, "Any given CPU may "
 						"only be given once.\n");
+					CPU_FREE(cpu_set);
 					exit(1);
 				} else
 					CPU_SET_S(cpus_to_pin[cpu],
@@ -607,6 +611,7 @@ int main(int argc, char *argv[])
 			queue_path = malloc(strlen(option) + 2);
 			if (!queue_path) {
 				perror("malloc()");
+				CPU_FREE(cpu_set);
 				exit(1);
 			}
 			queue_path[0] = '/';
@@ -619,6 +624,7 @@ int main(int argc, char *argv[])
 	}
 
 	if (continuous_mode && num_cpus_to_pin == 0) {
+		CPU_FREE(cpu_set);
 		fprintf(stderr, "Must pass at least one CPU to continuous "
 			"mode.\n");
 		poptPrintUsage(popt_context, stderr, 0);
@@ -628,10 +634,12 @@ int main(int argc, char *argv[])
 		cpus_to_pin[0] = cpus_online - 1;
 	}
 
-	if (getuid() != 0)
+	if (getuid() != 0) {
+		CPU_FREE(cpu_set);
 		ksft_exit_skip("Not running as root, but almost all tests "
 			"require root in order to modify\nsystem settings. "
 			"Exiting.\n");
+	}
 
 	max_msgs = fopen(MAX_MSGS, "r+");
 	max_msgsize = fopen(MAX_MSGSIZE, "r+");
--
2.35.1

3 years, 2 months


[PATCH v2 0/4] memcg: introduce per-memcg proactive reclaim

by Yosry Ahmed

This patch series adds a memory.reclaim proactive reclaim interface.
The rationale behind the interface and how it works are in the first
patch.

---

Changes in V2:
- Add the interface to root as well.
- Added a selftest.
- Documented the interface as a nested-keyed interface, which makes
  adding optional arguments in the future easier (see doc updates in the
  first patch).
- Modified the commit message to reflect changes and add a timeout
  argument as a suggested possible extension
- Return -EAGAIN if the kernel fails to reclaim the full requested
  amount.

---

Shakeel Butt (1):
  memcg: introduce per-memcg reclaim interface

Yosry Ahmed (3):
  selftests: cgroup: return the errno of write() in cg_write() on
    failure
  selftests: cgroup: fix alloc_anon_noexit() instantly freeing memory
  selftests: cgroup: add a selftest for memory.reclaim

 Documentation/admin-guide/cgroup-v2.rst       | 21 +++++
 mm/memcontrol.c                               | 37 ++++++++
 tools/testing/selftests/cgroup/cgroup_util.c  | 11 ++-
 .../selftests/cgroup/test_memcontrol.c        | 94 ++++++++++++++++++-
 4 files changed, 156 insertions(+), 7 deletions(-)

--
2.35.1.1178.g4f1659d476-goog

3 years, 2 months


[PATCH bpf-next] selftests: bpf: use MIN for TCP CC tests

by Geliang Tang

Use macro MIN() in sys/param.h for TCP CC tests, instead of defining a
new one.

Signed-off-by: Geliang Tang <geliang.tang(a)suse.com>
---
 tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
index 8f7a1cef7d87..ceed369361fc 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
@@ -3,6 +3,7 @@
 
 #include <linux/err.h>
 #include <netinet/tcp.h>
+#include <sys/param.h>
 #include <test_progs.h>
 #include "network_helpers.h"
 #include "bpf_dctcp.skel.h"
@@ -10,8 +11,6 @@
 #include "bpf_tcp_nogpl.skel.h"
 #include "bpf_dctcp_release.skel.h"
 
-#define min(a, b) ((a) < (b) ? (a) : (b))
-
 #ifndef ENOTSUPP
 #define ENOTSUPP 524
 #endif
@@ -53,7 +52,7 @@ static void *server(void *arg)
 
 	while (bytes < total_bytes && !READ_ONCE(stop)) {
 		nr_sent = send(fd, &batch,
-			       min(total_bytes - bytes, sizeof(batch)), 0);
+			       MIN(total_bytes - bytes, sizeof(batch)), 0);
 		if (nr_sent == -1 && errno == EINTR)
 			continue;
 		if (nr_sent == -1) {
@@ -146,7 +145,7 @@ static void do_test(const char *tcp_ca, const struct bpf_map *sk_stg_map)
 	/* recv total_bytes */
 	while (bytes < total_bytes && !READ_ONCE(stop)) {
 		nr_recv = recv(fd, &batch,
-			       min(total_bytes - bytes, sizeof(batch)), 0);
+			       MIN(total_bytes - bytes, sizeof(batch)), 0);
 		if (nr_recv == -1 && errno == EINTR)
 			continue;
 		if (nr_recv == -1)
--
2.34.1

3 years, 2 months


[PATCH] userfaultfd/selftests: use swap() instead of open coding it

by Guo Zhengkui

Address the following coccicheck warnings:

tools/testing/selftests/vm/userfaultfd.c:1536:21-22: WARNING opportunity
for swap().
tools/testing/selftests/vm/userfaultfd.c:1540:33-34: WARNING opportunity
for swap().

by using swap() to exchange the variable values, and drop `tmp_area`,
which is no longer needed.

The `swap()` macro in userfaultfd.c was introduced in commit 681696862bc18
("selftests: vm: remove dependecy from internal kernel macros").

It has been tested with gcc (Debian 8.3.0-6) 8.3.0.

Signed-off-by: Guo Zhengkui <guozhengkui(a)vivo.com>
---
 tools/testing/selftests/vm/userfaultfd.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 92a4516f8f0d..7aba3ced7545 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -1422,7 +1422,6 @@ static void userfaultfd_pagemap_test(unsigned int test_pgsize)
 static int userfaultfd_stress(void)
 {
 	void *area;
-	char *tmp_area;
 	unsigned long nr;
 	struct uffdio_register uffdio_register;
 	struct uffd_stats uffd_stats[nr_cpus];
@@ -1533,13 +1532,9 @@ static int userfaultfd_stress(void)
 				    count_verify[nr], nr);
 
 		/* prepare next bounce */
-		tmp_area = area_src;
-		area_src = area_dst;
-		area_dst = tmp_area;
+		swap(area_src, area_dst);
 
-		tmp_area = area_src_alias;
-		area_src_alias = area_dst_alias;
-		area_dst_alias = tmp_area;
+		swap(area_src_alias, area_dst_alias);
 
 		uffd_stats_report(uffd_stats, nr_cpus);
 	}
--
2.20.1

3 years, 2 months


WARNING: at arch/x86/kvm/../../../virt/kvm/kvm_main.c:3156 mark_page_dirty_in_slot

by Naresh Kamboju

While running kselftest kvm test cases on x86_64 devices the following
kernel warning was reported.

metadata:
  git_ref: master
  git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
  git_sha: 1930a6e739c4b4a654a69164dbe39e554d228915
  git_describe: v5.17-12882-g1930a6e739c4
  kernel_version: 5.17.0
  kernel-config: https://builds.tuxbuild.com/272RGo17Agp9s62duqGs3mP2d0S/config

# selftests: kvm: evmcs_test
# Running L1 which uses EVMCS to run L2
# Injecting NMI into L1 before L2 had a chance to run after restore
# Trying extra KVM_GET_NESTED_STATE/KVM_SET_NESTED_STATE cycle
ok 4 selftests: kvm: evmcs_test
# selftests: kvm: emulator_error_test
# module parameter 'allow_smaller_maxphyaddr' is not set. Skipping test.
ok 5 selftests: kvm: emulator_error_test
# selftests: kvm: hyperv_clock
[   62.510388] ------------[ cut here ]------------
[   62.515064] WARNING: CPU: 1 PID: 915 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:3156 mark_page_dirty_in_slot+0xba/0xd0
[   62.525968] Modules linked in: x86_pkg_temp_thermal fuse
[   62.531307] CPU: 1 PID: 915 Comm: hyperv_clock Not tainted 5.17.0 #1
[   62.537691] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017
[   62.545185] RIP: 0010:mark_page_dirty_in_slot+0xba/0xd0
[   62.550452] Code: 89 ea 09 c6 e8 57 d4 00 00 5b 41 5c 41 5d 41 5e 5d c3 48 8b 83 c0 00 00 00 49 63 d5 f0 48 0f ab 10 5b 41 5c 41 5d 41 5e 5d c3 <0f> 0b 5b 41 5c 41 5d 41 5e 5d c3 0f 1f 44 00 00 eb 80 0f 1f 40 00
[   62.569265] RSP: 0018:ffffa347c1663b50 EFLAGS: 00010246
[   62.574502] RAX: 0000000080000000 RBX: ffff8f01149ce600 RCX: 0000000000000000
[   62.581700] RDX: 0000000000000000 RSI: ffffffffa302ab31 RDI: ffffffffa302ab31
[   62.588874] RBP: ffffa347c1663b70 R08: 0000000000000000 R09: 0000000000000001
[   62.596046] R10: 0000000000000001 R11: 0000000000000000 R12: ffffa347c1665000
[   62.603213] R13: 0000000000000022 R14: 0000000000000000 R15: 0000000000000004
[   62.610389] FS: 00007fe3799c1740(0000) GS:ffff8f041fc80000(0000) knlGS:0000000000000000
[   62.618697] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.624467] CR2: 0000000000000000 CR3: 000000010614e004 CR4: 00000000003726e0
[   62.631684] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   62.638833] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   62.646009] Call Trace:
[   62.648480] <TASK>
[   62.650604] __kvm_write_guest_page+0xc8/0x100
[   62.655112] kvm_write_guest+0x61/0xb0
[   62.658884] kvm_hv_invalidate_tsc_page+0xd3/0x140
[   62.663699] ? kvm_hv_invalidate_tsc_page+0x72/0x140
[   62.668684] kvm_arch_vm_ioctl+0x20f/0xbc0
[   62.672798] ? __lock_acquire+0x3af/0x2450
[   62.676956] ? __this_cpu_preempt_check+0x13/0x20
[   62.681706] kvm_vm_ioctl+0x6f1/0xe20
[   62.685423] ? ktime_get_coarse_real_ts64+0xc7/0xd0
[   62.690323] ? __this_cpu_preempt_check+0x13/0x20
[   62.695048] ? lockdep_hardirqs_on+0x7e/0x100
[   62.699423] ? blk_log_with_error+0x3b/0x70
[   62.703644] ? __audit_syscall_entry+0xcd/0x130
[   62.708220] ? selinux_file_ioctl+0xa6/0x130
[   62.712542] ? selinux_file_ioctl+0xa6/0x130
[   62.716869] __x64_sys_ioctl+0x91/0xc0
[   62.720686] do_syscall_64+0x5c/0x80
[   62.724305] ? __this_cpu_preempt_check+0x13/0x20
[   62.729059] ? lock_is_held_type+0xdd/0x130
[   62.733264] ? do_syscall_64+0x69/0x80
[   62.737069] ? __this_cpu_preempt_check+0x13/0x20
[   62.741791] ? lockdep_hardirqs_on+0x7e/0x100
[   62.746219] ? syscall_exit_to_user_mode+0x3e/0x50
[   62.751082] ? do_syscall_64+0x69/0x80
[   62.754904] entry_SYSCALL_64_after_hwframe+0x44/0xae
[   62.760027] RIP: 0033:0x7fe3792bf8f7
[   62.763687] Code: b3 66 90 48 8b 05 a1 35 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 71 35 2c 00 f7 d8 64 89 01 48
[   62.782497] RSP: 002b:00007ffe035acf38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   62.790131] RAX: ffffffffffffffda RBX: 000000004030ae7b RCX: 00007fe3792bf8f7
[   62.797334] RDX: 00007ffe035acf70 RSI: 000000004030ae7b RDI: 0000000000000006
[   62.804539] RBP: 0000000000000007 R08: 000000000040e320 R09: 0000000000000007
[   62.811737] R10: 000000000004da6b R11: 0000000000000246 R12: 00007fe3799c7000
[   62.818914] R13: 0000000000000007 R14: 00000000000058cb R15: 00000000000e8f42
[   62.826141] </TASK>
[   62.828378] irq event stamp: 6435
[   62.831765] hardirqs last enabled at (6445): [<ffffffffa3272a88>] __up_console_sem+0x58/0x60
[   62.840354] hardirqs last disabled at (6454): [<ffffffffa3272a6d>] __up_console_sem+0x3d/0x60
[   62.848944] softirqs last enabled at (6392): [<ffffffffa4600341>] __do_softirq+0x341/0x4cc
[   62.857362] softirqs last disabled at (6473): [<ffffffffa31ef29f>] irq_exit_rcu+0xdf/0x140
[   62.865700] ---[ end trace 0000000000000000 ]---
ok 6 selftests: kvm: hyperv_clock

Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>

--
Linaro LKFT
https://lkft.linaro.org

[1] https://lkft.validation.linaro.org/scheduler/job/4805876#L1528

3 years, 2 months


Re: [PATCH v3] net: fix wrong network header length

by Lina Wang

On Thu, Feb 17, 2022 at 12:45 AM Paolo Abeni <pabeni(a)redhat.com> wrote:
>
> Hello,
>
> On Thu, 2022-02-17 at 15:01 +0800, Lina Wang wrote:

> So that bpf helper has to be somehow involved, but iperf udp test
> says nothing about it.
> Please craft a _complete_ selftest.

I have finally written a selftest; please check:

https://lore.kernel.org/bpf/20220407084727.10241-1-lina.wang@mediatek.com/
https://lore.kernel.org/bpf/20220407084727.10241-2-lina.wang@mediatek.com/
https://lore.kernel.org/bpf/20220407084727.10241-3-lina.wang@mediatek.com/

Thanks

3 years, 2 months


[PATCH] testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set

by Athira Rajeev

The selftest "mqueue/mq_perf_tests.c" uses CPU_ALLOC to allocate a
CPU set. This CPU set is then used by pthread_attr_setaffinity_np
and by pthread_create. In the current code, however, the allocated
CPU set is never freed. Fix this by adding CPU_FREE once the CPU set
is no longer used.

Signed-off-by: Athira Rajeev <atrajeev(a)linux.vnet.ibm.com>
---
 tools/testing/selftests/mqueue/mq_perf_tests.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c
index b019e0b8221c..17c41f216bef 100644
--- a/tools/testing/selftests/mqueue/mq_perf_tests.c
+++ b/tools/testing/selftests/mqueue/mq_perf_tests.c
@@ -732,6 +732,7 @@ int main(int argc, char *argv[])
 		pthread_attr_destroy(&thread_attr);
 	}
 
+	CPU_FREE(cpu_set);
 	if (!continuous_mode) {
 		pthread_join(cpu_threads[0], &retval);
 		shutdown((long)retval, "perf_test_thread()", __LINE__);
--
2.35.1

3 years, 2 months


[PATCH v2] selftest/vm clarify error statement in gup_test

by Sidhartha Kumar

Print three possible reasons /sys/kernel/debug/gup_test
cannot be opened to help users of this test diagnose
failures.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
---
v2:
 - Add support for skipping the test due to unmet dependencies.
 - Use errno to print a more specific message.
 - Add check for root privileges.
 - dropped CC to stable.

 tools/testing/selftests/vm/gup_test.c     | 22 +++++++++++++--
 tools/testing/selftests/vm/run_vmtests.sh | 33 ++++++++++++++++-------
 2 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c
index fe043f67798b0..bdedaa6c58e18 100644
--- a/tools/testing/selftests/vm/gup_test.c
+++ b/tools/testing/selftests/vm/gup_test.c
@@ -1,7 +1,9 @@
 #include <fcntl.h>
+#include <errno.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include <dirent.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
 #include <sys/stat.h>
@@ -9,6 +11,7 @@
 #include <pthread.h>
 #include <assert.h>
 #include "../../../../mm/gup_test.h"
+#include "../kselftest.h"
 
 #define MB (1UL << 20)
 #define PAGE_SIZE sysconf(_SC_PAGESIZE)
@@ -205,8 +208,23 @@ int main(int argc, char **argv)
 
 	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
 	if (gup_fd == -1) {
-		perror("open");
-		exit(1);
+		switch (errno) {
+		case EACCES:
+			if (getuid())
+				printf("Please run this test as root\n");
+			break;
+		case ENOENT:
+			if (opendir("/sys/kernel/debug") == NULL) {
+				printf("mount debugfs at /sys/kernel/debug\n");
+				break;
+			}
+			printf("check if CONFIG_GUP_TEST is enabled in kernel config\n");
+			break;
+		default:
+			perror("failed to open /sys/kernel/debug/gup_test");
+			break;
+		}
+		exit(KSFT_SKIP);
 	}
 
 	p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, filed, 0);
diff --git a/tools/testing/selftests/vm/run_vmtests.sh b/tools/testing/selftests/vm/run_vmtests.sh
index 45e803af7c775..88e15fbb50278 100755
--- a/tools/testing/selftests/vm/run_vmtests.sh
+++ b/tools/testing/selftests/vm/run_vmtests.sh
@@ -127,22 +127,32 @@ echo "------------------------------------------------------"
 echo "running: gup_test -u # get_user_pages_fast() benchmark"
 echo "------------------------------------------------------"
 ./gup_test -u
-if [ $? -ne 0 ]; then
+ret_val=$?
+
+if [ $ret_val -eq 0 ]; then
+	echo "[PASS]"
+elif [ $ret_val -eq $ksft_skip ]; then
+	echo "[SKIP]"
+	exitcode=$ksft_skip
+else
 	echo "[FAIL]"
 	exitcode=1
-else
-	echo "[PASS]"
 fi
 
 echo "------------------------------------------------------"
 echo "running: gup_test -a # pin_user_pages_fast() benchmark"
 echo "------------------------------------------------------"
 ./gup_test -a
-if [ $? -ne 0 ]; then
+ret_val=$?
+
+if [ $ret_val -eq 0 ]; then
+	echo "[PASS]"
+elif [ $ret_val -eq $ksft_skip ]; then
+	echo "[SKIP]"
+	exitcode=$ksft_skip
+else
 	echo "[FAIL]"
 	exitcode=1
-else
-	echo "[PASS]"
 fi
 
 echo "------------------------------------------------------------"
@@ -150,11 +160,16 @@ echo "# Dump pages 0, 19, and 4096, using pin_user_pages:"
 echo "running: gup_test -ct -F 0x1 0 19 0x1000 # dump_page() test"
 echo "------------------------------------------------------------"
 ./gup_test -ct -F 0x1 0 19 0x1000
-if [ $? -ne 0 ]; then
+ret_val=$?
+
+if [ $ret_val -eq 0 ]; then
+	echo "[PASS]"
+elif [ $ret_val -eq $ksft_skip ]; then
+	echo "[SKIP]"
+	exitcode=$ksft_skip
+else
 	echo "[FAIL]"
 	exitcode=1
-else
-	echo "[PASS]"
 fi
 
 echo "-------------------"
--
2.27.0
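The two decisions this patch introduces, which hint to print when the open fails and how the runner labels the exit status, can be modelled as small pure functions. The sketch below is an illustration only: the function names `open_failure_hint` and `classify` are hypothetical, and the numeric value of KSFT_SKIP (4) is assumed from the kselftest framework rather than shown in the patch.

```python
import errno

KSFT_SKIP = 4  # assumed value of the kselftest skip code; not shown in the patch


def open_failure_hint(err: int, is_root: bool, debugfs_present: bool) -> str:
    """Map an open() errno to a hint, mirroring the switch in the patch."""
    if err == errno.EACCES:
        # The patch prints a hint only when the caller is not root.
        return '' if is_root else 'Please run this test as root'
    if err == errno.ENOENT:
        if not debugfs_present:
            return 'mount debugfs at /sys/kernel/debug'
        return 'check if CONFIG_GUP_TEST is enabled in kernel config'
    return 'failed to open /sys/kernel/debug/gup_test'


def classify(ret_val: int) -> str:
    """Map a test's exit status to the label run_vmtests.sh would echo."""
    if ret_val == 0:
        return '[PASS]'
    if ret_val == KSFT_SKIP:
        return '[SKIP]'
    return '[FAIL]'
```

Note that in the patch every open failure exits with KSFT_SKIP, including the default branch, so the runner reports [SKIP] rather than [FAIL] for all three diagnosed causes.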

3 years, 3 months


[PATCH bpf v5 1/2] bpf: Support dual-stack sockets in bpf_tcp_check_syncookie

by Maxim Mikityanskiy

bpf_tcp_gen_syncookie looks at the IP version in the IP header and
validates the address family of the socket. It supports IPv4 packets in
AF_INET6 dual-stack sockets.

On the other hand, bpf_tcp_check_syncookie looks only at the address
family of the socket, ignoring the real IP version in headers, and
validates only the packet size. This implementation has some drawbacks:

1. Packets are not validated properly, allowing a BPF program to trick
   bpf_tcp_check_syncookie into handling an IPv6 packet on an IPv4
   socket.

2. Dual-stack sockets fail the checks on IPv4 packets. IPv4 clients end
   up receiving a SYNACK with the cookie, but the following ACK gets
   dropped.

This patch fixes these issues by changing the checks in
bpf_tcp_check_syncookie to match the ones in bpf_tcp_gen_syncookie. IP
version from the header is taken into account, and it is validated
properly with address family.

Fixes: 399040847084 ("bpf: add helper to check for a valid SYN cookie")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)nvidia.com>
Reviewed-by: Tariq Toukan <tariqt(a)nvidia.com>
Acked-by: Arthur Fabre <afabre(a)cloudflare.com>
---
 net/core/filter.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index a7044e98765e..64470a727ef7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -7016,24 +7016,33 @@ BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len
 	if (!th->ack || th->rst || th->syn)
 		return -ENOENT;
 
+	if (unlikely(iph_len < sizeof(struct iphdr)))
+		return -EINVAL;
+
 	if (tcp_synq_no_recent_overflow(sk))
 		return -ENOENT;
 
 	cookie = ntohl(th->ack_seq) - 1;
 
-	switch (sk->sk_family) {
-	case AF_INET:
-		if (unlikely(iph_len < sizeof(struct iphdr)))
+	/* Both struct iphdr and struct ipv6hdr have the version field at the
+	 * same offset so we can cast to the shorter header (struct iphdr).
+	 */
+	switch (((struct iphdr *)iph)->version) {
+	case 4:
+		if (sk->sk_family == AF_INET6 && ipv6_only_sock(sk))
 			return -EINVAL;
 
 		ret = __cookie_v4_check((struct iphdr *)iph, th, cookie);
 		break;
 
 #if IS_BUILTIN(CONFIG_IPV6)
-	case AF_INET6:
+	case 6:
 		if (unlikely(iph_len < sizeof(struct ipv6hdr)))
 			return -EINVAL;
 
+		if (sk->sk_family != AF_INET6)
+			return -EINVAL;
+
 		ret = __cookie_v6_check((struct ipv6hdr *)iph, th, cookie);
 		break;
 #endif /* CONFIG_IPV6 */
--
2.30.2
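The validation order in the patched helper (minimum length first, then IP version from the header, then socket family) can be modelled in userspace as follows. This is a sketch of the decision logic only, not kernel code: the function name `check_ip_header` and the two length constants are hypothetical, and because the default branch of the switch is not shown in the patch, returning -EINVAL for an unknown version is an assumption.

```python
import errno
import socket

IPV4_HDR_LEN = 20  # size of an IPv4 header without options
IPV6_HDR_LEN = 40  # size of the fixed IPv6 header


def check_ip_header(iph: bytes, sk_family: int, ipv6_only: bool) -> int:
    """Return the IP version (4 or 6) if accepted, else -EINVAL."""
    # Both header layouts keep the version in the top 4 bits of byte 0,
    # so the shorter IPv4 length is enough to read it safely.
    if len(iph) < IPV4_HDR_LEN:
        return -errno.EINVAL
    version = iph[0] >> 4
    if version == 4:
        # IPv4 is accepted on AF_INET and on dual-stack AF_INET6 sockets.
        if sk_family == socket.AF_INET6 and ipv6_only:
            return -errno.EINVAL
        return 4
    if version == 6:
        if len(iph) < IPV6_HDR_LEN:
            return -errno.EINVAL
        if sk_family != socket.AF_INET6:
            return -errno.EINVAL
        return 6
    return -errno.EINVAL  # assumption: the patch does not show this branch
```

For example, a 20-byte buffer starting with 0x45 is accepted on a dual-stack AF_INET6 socket, which is the case the commit message says used to fail, while a buffer starting with 0x60 is rejected on an AF_INET socket.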

3 years, 3 months

