in randomize function, there is a open function, but there is no
close function in the randomize, which is easy to cause memory leaks.
Signed-off-by: Liu Jing <liujing(a)cmss.chinamobile.com>
---
tools/testing/selftests/net/tcp_mmap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/tcp_mmap.c b/tools/testing/selftests/net/tcp_mmap.c
index 4fcce5150850..ab305e262d0a 100644
--- a/tools/testing/selftests/net/tcp_mmap.c
+++ b/tools/testing/selftests/net/tcp_mmap.c
@@ -438,6 +438,7 @@ static void randomize(void *target, size_t count)
perror("read /dev/urandom");
exit(1);
}
+ close(urandom);
}
int main(int argc, char *argv[])
--
2.33.0
+ Dave Miller, Jakub Kicinski, Paolo Abeni, Shuah Khan,
linux-kselftest(a)vger.kernel.org
On Mon, Jul 08, 2024 at 09:04:05PM +0000, zijianzhang(a)bytedance.com wrote:
> From: Zijian Zhang <zijianzhang(a)bytedance.com>
>
> We update selftests/net/msg_zerocopy.c to accommodate the new mechanism,
> cfg_notification_limit has the same semantics for both methods. Test
> results are as follows, we update skb_orphan_frags_rx to the same as
> skb_orphan_frags to support zerocopy in the localhost test.
>
> cfg_notification_limit = 1, both method get notifications after 1 calling
> of sendmsg. In this case, the new method has around 17% cpu savings in TCP
> and 23% cpu savings in UDP.
> +---------------------+---------+---------+---------+---------+
> | Test Type / Protocol| TCP v4 | TCP v6 | UDP v4 | UDP v6 |
> +---------------------+---------+---------+---------+---------+
> | ZCopy (MB) | 7523 | 7706 | 7489 | 7304 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy (MB) | 8834 | 8993 | 9053 | 9228 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy / ZCopy | 117.42% | 116.70% | 120.88% | 126.34% |
> +---------------------+---------+---------+---------+---------+
>
> cfg_notification_limit = 32, both get notifications after 32 calling of
> sendmsg, which means more chances to coalesce notifications, and less
> overhead of poll + recvmsg for the original method. In this case, the new
> method has around 7% cpu savings in TCP and slightly better cpu usage in
> UDP. In the context of selftest, notifications of TCP are more likely to
> out of order than UDP, it's easier to coalesce more notifications in UDP.
> The original method can get one notification with range of 32 in a recvmsg
> most of the time. In TCP, most notifications' range is around 2, so the
> original method needs around 16 recvmsgs to get notified in one round.
> That's the reason for the "New ZCopy / ZCopy" diff in TCP and UDP here.
> +---------------------+---------+---------+---------+---------+
> | Test Type / Protocol| TCP v4 | TCP v6 | UDP v4 | UDP v6 |
> +---------------------+---------+---------+---------+---------+
> | ZCopy (MB) | 8842 | 8735 | 10072 | 9380 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy (MB) | 9366 | 9477 | 10108 | 9385 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy / ZCopy | 106.00% | 108.28% | 100.31% | 100.01% |
> +---------------------+---------+---------+---------+---------+
>
> In conclusion, when notification interval is small or notifications are
> hard to be coalesced, the new mechanism is highly recommended. Otherwise,
> the performance gain from the new mechanism is very limited.
>
> Signed-off-by: Zijian Zhang <zijianzhang(a)bytedance.com>
> Signed-off-by: Xiaochun Lu <xiaochun.lu(a)bytedance.com>
> ---
> tools/testing/selftests/net/msg_zerocopy.c | 111 ++++++++++++++++++--
> tools/testing/selftests/net/msg_zerocopy.sh | 1 +
> 2 files changed, 105 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/net/msg_zerocopy.c b/tools/testing/selftests/net/msg_zerocopy.c
...
> @@ -466,6 +504,44 @@ static void do_recv_completions(int fd, int domain)
> sends_since_notify = 0;
> }
>
> +static void do_recv_completions2(void)
> +{
> + struct cmsghdr *cm = (struct cmsghdr *)zc_ckbuf;
> + struct zc_info *zc_info;
> + __u32 hi, lo, range;
> + __u8 zerocopy;
> + int i;
> +
> + zc_info = (struct zc_info *)CMSG_DATA(cm);
> + for (i = 0; i < zc_info->size; i++) {
> + hi = zc_info->arr[i].hi;
> + lo = zc_info->arr[i].lo;
> + zerocopy = zc_info->arr[i].zerocopy;
> + range = hi - lo + 1;
> +
> + if (cfg_verbose && lo != next_completion)
> + fprintf(stderr, "gap: %u..%u does not append to %u\n",
> + lo, hi, next_completion);
> + next_completion = hi + 1;
> +
> + if (zerocopied == -1)
> + zerocopied = zerocopy;
> + else if (zerocopied != zerocopy) {
> + fprintf(stderr, "serr: inconsistent\n");
> + zerocopied = zerocopy;
> + }
nit: If any arms of a conditional have {}, then all arms should have them
> +
> + completions += range;
> +
> + if (cfg_verbose >= 2)
> + fprintf(stderr, "completed: %u (h=%u l=%u)\n",
> + range, hi, lo);
> + }
> +
> + sends_since_notify = 0;
> + added_zcopy_info = false;
> +}
...
From: Geliang Tang <tanggeliang(a)kylinos.cn>
Run this BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
platform, a kernel panic occurs:
'''
Oops[#1]:
CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
... ...
ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 0000000c (PPLV0 +PIE +PWE)
EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
BADV: 0000000000000040
PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
Stack : ...
...
Call Trace:
[<9000000004162774>] copy_page_to_iter+0x74/0x1c0
[<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
[<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
[<90000000049aae34>] inet_recvmsg+0x54/0x100
[<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
[<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
[<900000000481e27c>] sys_recvfrom+0x1c/0x40
[<9000000004c076ec>] do_syscall+0x8c/0xc0
[<9000000003731da4>] handle_syscall+0xc4/0x160
Code: ...
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3510000
.text @ 0x9000000003710000
.data @ 0x9000000004d70000
.bss @ 0x9000000006469400
---[ end Kernel panic - not syncing: Fatal exception ]---
'''
This crash happens every time when running sockmap_skb_verdict_shutdown
subtest in sockmap_basic.
This crash is because a NULL pointer is passed to page_address() in
sk_msg_recvmsg(). Due to the difference implementations depending on the
architecture, page_address(NULL) will trigger a panic on Loongarch
platform but not on X86 platform. So this bug was hidden on X86 platform
for a while, but now it is exposed on Loongarch platform.
The root cause is a zero length skb (skb->len == 0) is put on the queue.
This zero length skb is a TCP FIN packet, which is sent by shutdown(),
invoked in test_sockmap_skb_verdict_shutdown():
shutdown(p1, SHUT_WR);
In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
page is put to this sge (see sg_set_page in sg_set_page), but this empty
sge is queued into ingress_msg list.
And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by
sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it
to kmap_local_page() and to page_address(), then kernel panics.
To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(),
if copy is zero, that means it's a zero length skb, skip invoking
copy_page_to_iter(). We are using the EFAULT return triggered by
copy_page_to_iter to check for is_fin in tcp_bpf.c.
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Suggested-by: John Fastabend <john.fastabend(a)gmail.com>
Signed-off-by: Geliang Tang <tanggeliang(a)kylinos.cn>
---
v5:
- update v5 as John suggested.
- skmsg: skip zero length skb in sk_msg_recvmsg
v4:
- skmsg: skip empty sge in sk_msg_recvmsg
v3:
- skmsg: prevent empty ingress skb from enqueuing
v2:
- skmsg: null check for sg_page in sk_msg_recvmsg
---
net/core/skmsg.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..bbf40b999713 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -434,7 +434,8 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
page = sg_page(sge);
if (copied + copy > len)
copy = len - copied;
- copy = copy_page_to_iter(page, sge->offset, copy, iter);
+ if (copy)
+ copy = copy_page_to_iter(page, sge->offset, copy, iter);
if (!copy) {
copied = copied ? copied : -EFAULT;
goto out;
--
2.43.0
Changes v3:
- Reworked patch 2.
- Changed minor things in patch 1 like function name and made
corrections to the patch message.
Changes v2:
- Removed patches 2 and 3 since now this part will be supported by the
kernel.
Sub-Numa Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.
SNC support in the kernel is currently in review [1]. With SNC enabled
and kernel support in place all the tests will function normally (aside
from effective cache size). There might be a problem when SNC is enabled
but the system is still using an older kernel version without SNC
support. Currently the only message displayed in that situation is a
guess that SNC might be enabled and is causing issues. That message also
is displayed whenever the test fails on an Intel platform.
Add a mechanism to discover kernel support for SNC which will add more
meaning and certainty to the error message.
Add runtime SNC mode detection and verify how reliable that information
is.
Series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also ran with and without kernel support for
SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240628215619.76401-1-tony.luck@intel.com/
Previous versions of this series:
[v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (2):
selftests/resctrl: Adjust effective L3 cache size with SNC enabled
selftests/resctrl: Adjust SNC support messages
tools/testing/selftests/resctrl/cache.c | 3 +
tools/testing/selftests/resctrl/cmt_test.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 4 +
tools/testing/selftests/resctrl/mbm_test.c | 6 +-
tools/testing/selftests/resctrl/resctrl.h | 8 +
.../testing/selftests/resctrl/resctrl_tests.c | 7 +
tools/testing/selftests/resctrl/resctrlfs.c | 138 ++++++++++++++++++
7 files changed, 166 insertions(+), 4 deletions(-)
--
2.45.2
Even if a vgem device is configured in, we will skip the import_vgem_fd()
test almost every time.
TAP version 13
1..11
# Testing heap: system
# =======================================
# Testing allocation and importing:
ok 1 # SKIP Could not open vgem -1
The problem is that we use the DRM_IOCTL_VERSION ioctl to query the driver
version information but leave the name field a non-null-terminated string.
Terminate it properly to actually test against the vgem device.
Signed-off-by: Zenghui Yu <yuzenghui(a)huawei.com>
---
tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
index 5f541522364f..2fcc74998fa9 100644
--- a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
+++ b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
@@ -32,6 +32,8 @@ static int check_vgem(int fd)
if (ret)
return 0;
+ name[4] = '\0';
+
return !strcmp(name, "vgem");
}
--
2.33.0
From: Geliang Tang <tanggeliang(a)kylinos.cn>
v10:
- a new patch 10 is added.
- patches 1-6, 8-9 unchanged, only commit logs updated.
- "err = -errno" is used in patches 7, 11, 12 to get the real error
number before checking value of "err".
v9:
- new patches 5-7, new struct member expect_errno for network_helper_opts.
- patches 1-4, 8-9 unchanged.
- update patches 10-11 to make sure all tests pass.
v8:
- only patch 8 updated, to fix errors reported by CI.
v7:
- address Martin's comments in v6. (thanks)
- use MAX(opts->backlog, 0) instead of opts->backlog.
- use connect_to_fd_opts instead connect_to_fd.
- more ASSERT_* to check errors.
v6:
- update patch 6 as Daniel suggested. (thanks)
v5:
- keep make_server and make_client as Eduard suggested.
v4:
- a new patch to use make_sockaddr in sockmap_ktls.
- a new patch to close fd in error path in drop_on_reuseport.
- drop make_server() in patch 7.
- drop make_client() too in patch 9.
v3:
- a new patch to add backlog for network_helper_opts.
- use start_server_str in sockmap_ktls now, not start_server.
v2:
- address Eduard's comments in v1. (thanks)
- fix errors reported by CI.
This patch set uses network helpers in sockmap_ktls and sk_lookup, and
drop three local helpers tcp_server(), inetaddr_len() and make_socket()
in them.
Geliang Tang (12):
selftests/bpf: Add backlog for network_helper_opts
selftests/bpf: Use start_server_str in sockmap_ktls
selftests/bpf: Use connect_to_fd_opts in sockmap_ktls
selftests/bpf: Use make_sockaddr in sockmap_ktls
selftests/bpf: Add network_helper_opts for connect_fd_to_fd
selftests/bpf: Add expect_errno for network_helper_opts
selftests/bpf: Set expect_errno for cgroup_skb_sk_lookup
selftests/bpf: Close fd in error path in drop_on_reuseport
selftests/bpf: Use start_server_str in sk_lookup
selftests/bpf: Use connect_fd_to_fd in sk_lookup
selftests/bpf: Use connect_to_addr in sk_lookup
selftests/bpf: Drop make_socket in sk_lookup
tools/testing/selftests/bpf/network_helpers.c | 23 ++-
tools/testing/selftests/bpf/network_helpers.h | 8 +-
.../testing/selftests/bpf/prog_tests/bpf_nf.c | 5 +-
.../bpf/prog_tests/cgroup_skb_sk_lookup.c | 8 +-
.../selftests/bpf/prog_tests/cgroup_tcp_skb.c | 4 +-
.../selftests/bpf/prog_tests/cgroup_v1v2.c | 1 +
.../selftests/bpf/prog_tests/sk_lookup.c | 162 +++++++-----------
.../selftests/bpf/prog_tests/sockmap_ktls.c | 53 ++----
8 files changed, 108 insertions(+), 156 deletions(-)
--
2.43.0
I realized this while having a map containing both a struct bpf_timer and
a struct bpf_wq: the third argument provided to the bpf_wq callback is
not the struct bpf_wq pointer itself, but the pointer to the value in
the map.
Which means that the users need to double cast the provided "value" as
this is not a struct bpf_wq *.
This is a change of API, but there doesn't seem to be much users of bpf_wq
right now, so we should be able to go with this right now.
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Changes in v2:
- amended the selftests to retrieve something from the third argument of
the callback
- Link to v1: https://lore.kernel.org/r/20240705-fix-wq-v1-0-91b4d82cd825@kernel.org
---
Benjamin Tissoires (2):
bpf: helpers: fix bpf_wq_set_callback_impl signature
selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
kernel/bpf/helpers.c | 2 +-
tools/testing/selftests/bpf/bpf_experimental.h | 2 +-
tools/testing/selftests/bpf/progs/wq.c | 19 ++++++++++++++-----
tools/testing/selftests/bpf/progs/wq_failures.c | 4 ++--
4 files changed, 18 insertions(+), 9 deletions(-)
---
base-commit: fd8db07705c55a995c42b1e71afc42faad675b0b
change-id: 20240705-fix-wq-f069c7fb36c3
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
From: Geliang Tang <tanggeliang(a)kylinos.cn>
BPF selftests seem to have not been fully tested on Loongarch platforms.
There are so many "ENOTSUPP" (-524) errors when running BPF selftests on
them since lacking BPF trampoline on Loongarch.
For these "ENOTSUPP" tests, it's better to skip them, instead of reporting
some "ENOTSUPP" errors. This patchset skips ENOTSUPP in ASSERT_OK/
ASSERT_OK_PTR/ASSERT_GE helpers to fix them. This is useful for running BPF
selftests for other architectures too.
Geliang Tang (6):
selftests/bpf: Define ENOTSUPP in testing_helpers.h
selftests/bpf: Skip ENOTSUPP in ASSERT_OK
selftests/bpf: Use ASSERT_OK to skip ENOTSUPP
selftests/bpf: Null checks for link in bpf_tcp_ca
selftests/bpf: Skip ENOTSUPP in ASSERT_OK_PTR
selftests/bpf: Skip ENOTSUPP in ASSERT_GE
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 20 +++++++++-------
.../testing/selftests/bpf/prog_tests/d_path.c | 2 +-
.../selftests/bpf/prog_tests/lsm_cgroup.c | 10 +-------
.../selftests/bpf/prog_tests/module_attach.c | 2 +-
.../selftests/bpf/prog_tests/ringbuf.c | 2 +-
.../selftests/bpf/prog_tests/sock_addr.c | 4 ----
.../selftests/bpf/prog_tests/test_bprm_opts.c | 2 +-
.../selftests/bpf/prog_tests/test_ima.c | 2 +-
.../selftests/bpf/prog_tests/trace_ext.c | 2 +-
tools/testing/selftests/bpf/test_maps.c | 4 ----
tools/testing/selftests/bpf/test_progs.h | 24 ++++++++++++++-----
tools/testing/selftests/bpf/test_verifier.c | 4 ----
tools/testing/selftests/bpf/testing_helpers.h | 4 ++++
13 files changed, 41 insertions(+), 41 deletions(-)
--
2.43.0
I realized this while having a map containing both a struct bpf_timer and
a struct bpf_wq: the third argument provided to the bpf_wq callback is
not the struct bpf_wq pointer itself, but the pointer to the value in
the map.
Which means that the users need to double cast the provided "value" as
this is not a struct bpf_wq *.
This is a change of API, but there doesn't seem to be much users of bpf_wq
right now, so we should be able to go with this right now.
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Benjamin Tissoires (2):
bpf: helpers: fix bpf_wq_set_callback_impl signature
selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
kernel/bpf/helpers.c | 2 +-
tools/testing/selftests/bpf/bpf_experimental.h | 2 +-
tools/testing/selftests/bpf/progs/wq.c | 8 ++++----
tools/testing/selftests/bpf/progs/wq_failures.c | 4 ++--
4 files changed, 8 insertions(+), 8 deletions(-)
---
base-commit: fd8db07705c55a995c42b1e71afc42faad675b0b
change-id: 20240705-fix-wq-f069c7fb36c3
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>