v3: https://patchwork.kernel.org/project/netdevbpf/list/?series=929401&state=*
===
Address minor comments from RFCv2 and fix a few build warnings and
ynl-regen issues. No major changes.
RFC v2: https://patchwork.kernel.org/project/netdevbpf/list/?series=920056&state=*
=======
RFC v2 addresses much of the feedback from RFC v1. I plan on sending
something close to this as net-next reopens, sending it slightly early
to get feedback if any.
Major changes:
--------------
- much improved UAPI as suggested by Stan. We now interpret the iov_base
of the passed in iov from userspace as the offset into the dmabuf to
send from. This removes the need to set iov.iov_base = NULL which may
be confusing to users, and enables us to send multiple iovs in the
same sendmsg() call. ncdevmem and the docs show a sample use of that.
- Removed the duplicate dmabuf iov_iter in binding->iov_iter. I think
this is good improvment as it was confusing to keep track of
2 iterators for the same sendmsg, and mistracking both iterators
caused a couple of bugs reported in the last iteration that are now
resolved with this streamlining.
- Improved test coverage in ncdevmem. Now muliple sendmsg() are tested,
and sending multiple iovs in the same sendmsg() is tested.
- Fixed issue where dmabuf unmapping was happening in invalid context
(Stan).
====================================================================
The TX path had been dropped from the Device Memory TCP patch series
post RFCv1 [1], to make that series slightly easier to review. This
series rebases the implementation of the TX path on top of the
net_iov/netmem framework agreed upon and merged. The motivation for
the feature is thoroughly described in the docs & cover letter of the
original proposal, so I don't repeat the lengthy descriptions here, but
they are available in [1].
Sending this series as RFC as the winder closure is immenient. I plan on
reposting as non-RFC once the tree re-opens, addressing any feedback
I receive in the meantime.
Full outline on usage of the TX path is detailed in the documentation
added in the first patch.
Test example is available via the kselftest included in the series as well.
The series is relatively small, as the TX path for this feature largely
piggybacks on the existing MSG_ZEROCOPY implementation.
Patch Overview:
---------------
1. Documentation & tests to give high level overview of the feature
being added.
2. Add netmem refcounting needed for the TX path.
3. Devmem TX netlink API.
4. Devmem TX net stack implementation.
Testing:
--------
Testing is very similar to devmem TCP RX path. The ncdevmem test used
for the RX path is now augemented with client functionality to test TX
path.
* Test Setup:
Kernel: net-next with this RFC and memory provider API cherry-picked
locally.
Hardware: Google Cloud A3 VMs.
NIC: GVE with header split & RSS & flow steering support.
Performance results are not included with this version, unfortunately.
I'm having issues running the dma-buf exporter driver against the
upstream kernel on my test setup. The issues are specific to that
dma-buf exporter and do not affect this patch series. I plan to follow
up this series with perf fixes if the tests point to issues once they're
up and running.
Special thanks to Stan who took a stab at rebasing the TX implementation
on top of the netmem/net_iov framework merged. Parts of his proposal [2]
that are reused as-is are forked off into their own patches to give full
credit.
[1] https://lore.kernel.org/netdev/20240909054318.1809580-1-almasrymina@google.…
[2] https://lore.kernel.org/netdev/20240913150913.1280238-2-sdf@fomichev.me/T/#…
Cc: sdf(a)fomichev.me
Cc: asml.silence(a)gmail.com
Cc: dw(a)davidwei.uk
Cc: Jamal Hadi Salim <jhs(a)mojatatu.com>
Cc: Victor Nogueira <victor(a)mojatatu.com>
Cc: Pedro Tammela <pctammela(a)mojatatu.com>
Cc: Samiullah Khawaja <skhawaja(a)google.com>
Mina Almasry (5):
net: add devmem TCP TX documentation
selftests: ncdevmem: Implement devmem TCP TX
net: add get_netmem/put_netmem support
net: devmem: Implement TX path
net: devmem: make dmabuf unbinding scheduled work
Stanislav Fomichev (1):
net: devmem: TCP tx netlink api
Documentation/netlink/specs/netdev.yaml | 12 +
Documentation/networking/devmem.rst | 144 ++++++++-
include/linux/skbuff.h | 15 +-
include/linux/skbuff_ref.h | 4 +-
include/net/netmem.h | 3 +
include/net/sock.h | 1 +
include/uapi/linux/netdev.h | 1 +
include/uapi/linux/uio.h | 6 +-
net/core/datagram.c | 41 ++-
net/core/devmem.c | 111 ++++++-
net/core/devmem.h | 70 +++-
net/core/netdev-genl-gen.c | 13 +
net/core/netdev-genl-gen.h | 1 +
net/core/netdev-genl.c | 66 +++-
net/core/skbuff.c | 36 ++-
net/core/sock.c | 8 +
net/ipv4/tcp.c | 36 ++-
net/vmw_vsock/virtio_transport_common.c | 3 +-
tools/include/uapi/linux/netdev.h | 1 +
.../selftests/drivers/net/hw/ncdevmem.c | 300 +++++++++++++++++-
20 files changed, 819 insertions(+), 53 deletions(-)
--
2.48.1.362.g079036d154-goog
A task in the kernel (task_mm_cid_work) runs somewhat periodically to
compact the mm_cid for each process. Add a test to validate that it runs
correctly and timely.
The test spawns 1 thread pinned to each CPU, then each thread, including
the main one, runs in short bursts for some time. During this period, the
mm_cids should be spanning all numbers between 0 and nproc.
At the end of this phase, a thread with high enough mm_cid (>= nproc/2)
is selected to be the new leader, all other threads terminate.
After some time, the only running thread should see 0 as mm_cid, if that
doesn't happen, the compaction mechanism didn't work and the test fails.
The test never fails if only 1 core is available, in which case, we
cannot test anything as the only available mm_cid is 0.
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Signed-off-by: Gabriele Monaco <gmonaco(a)redhat.com>
---
tools/testing/selftests/rseq/.gitignore | 1 +
tools/testing/selftests/rseq/Makefile | 2 +-
.../selftests/rseq/mm_cid_compaction_test.c | 200 ++++++++++++++++++
3 files changed, 202 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c
diff --git a/tools/testing/selftests/rseq/.gitignore b/tools/testing/selftests/rseq/.gitignore
index 16496de5f6ce4..2c89f97e4f737 100644
--- a/tools/testing/selftests/rseq/.gitignore
+++ b/tools/testing/selftests/rseq/.gitignore
@@ -3,6 +3,7 @@ basic_percpu_ops_test
basic_percpu_ops_mm_cid_test
basic_test
basic_rseq_op_test
+mm_cid_compaction_test
param_test
param_test_benchmark
param_test_compare_twice
diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
index 5a3432fceb586..ce1b38f46a355 100644
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -16,7 +16,7 @@ OVERRIDE_TARGETS = 1
TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_mm_cid_test param_test \
param_test_benchmark param_test_compare_twice param_test_mm_cid \
- param_test_mm_cid_benchmark param_test_mm_cid_compare_twice
+ param_test_mm_cid_benchmark param_test_mm_cid_compare_twice mm_cid_compaction_test
TEST_GEN_PROGS_EXTENDED = librseq.so
diff --git a/tools/testing/selftests/rseq/mm_cid_compaction_test.c b/tools/testing/selftests/rseq/mm_cid_compaction_test.c
new file mode 100644
index 0000000000000..7ddde3b657dd6
--- /dev/null
+++ b/tools/testing/selftests/rseq/mm_cid_compaction_test.c
@@ -0,0 +1,200 @@
+// SPDX-License-Identifier: LGPL-2.1
+#define _GNU_SOURCE
+#include <assert.h>
+#include <pthread.h>
+#include <sched.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stddef.h>
+
+#include "../kselftest.h"
+#include "rseq.h"
+
+#define VERBOSE 0
+#define printf_verbose(fmt, ...) \
+ do { \
+ if (VERBOSE) \
+ printf(fmt, ##__VA_ARGS__); \
+ } while (0)
+
+/* 0.5 s */
+#define RUNNER_PERIOD 500000
+/* Number of runs before we terminate or get the token */
+#define THREAD_RUNS 5
+
+/*
+ * Number of times we check that the mm_cid were compacted.
+ * Checks are repeated every RUNNER_PERIOD.
+ */
+#define MM_CID_COMPACT_TIMEOUT 10
+
+struct thread_args {
+ int cpu;
+ int num_cpus;
+ pthread_mutex_t *token;
+ pthread_barrier_t *barrier;
+ pthread_t *tinfo;
+ struct thread_args *args_head;
+};
+
+static void __noreturn *thread_runner(void *arg)
+{
+ struct thread_args *args = arg;
+ int i, ret, curr_mm_cid;
+ cpu_set_t cpumask;
+
+ CPU_ZERO(&cpumask);
+ CPU_SET(args->cpu, &cpumask);
+ ret = pthread_setaffinity_np(pthread_self(), sizeof(cpumask), &cpumask);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to set affinity");
+ abort();
+ }
+ pthread_barrier_wait(args->barrier);
+
+ for (i = 0; i < THREAD_RUNS; i++)
+ usleep(RUNNER_PERIOD);
+ curr_mm_cid = rseq_current_mm_cid();
+ /*
+ * We select one thread with high enough mm_cid to be the new leader.
+ * All other threads (including the main thread) will terminate.
+ * After some time, the mm_cid of the only remaining thread should
+ * converge to 0, if not, the test fails.
+ */
+ if (curr_mm_cid >= args->num_cpus / 2 &&
+ !pthread_mutex_trylock(args->token)) {
+ printf_verbose(
+ "cpu%d has mm_cid=%d and will be the new leader.\n",
+ sched_getcpu(), curr_mm_cid);
+ for (i = 0; i < args->num_cpus; i++) {
+ if (args->tinfo[i] == pthread_self())
+ continue;
+ ret = pthread_join(args->tinfo[i], NULL);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to join thread");
+ abort();
+ }
+ }
+ pthread_barrier_destroy(args->barrier);
+ free(args->tinfo);
+ free(args->token);
+ free(args->barrier);
+ free(args->args_head);
+
+ for (i = 0; i < MM_CID_COMPACT_TIMEOUT; i++) {
+ curr_mm_cid = rseq_current_mm_cid();
+ printf_verbose("run %d: mm_cid=%d on cpu%d.\n", i,
+ curr_mm_cid, sched_getcpu());
+ if (curr_mm_cid == 0)
+ exit(EXIT_SUCCESS);
+ usleep(RUNNER_PERIOD);
+ }
+ exit(EXIT_FAILURE);
+ }
+ printf_verbose("cpu%d has mm_cid=%d and is going to terminate.\n",
+ sched_getcpu(), curr_mm_cid);
+ pthread_exit(NULL);
+}
+
+int test_mm_cid_compaction(void)
+{
+ cpu_set_t affinity;
+ int i, j, ret = 0, num_threads;
+ pthread_t *tinfo;
+ pthread_mutex_t *token;
+ pthread_barrier_t *barrier;
+ struct thread_args *args;
+
+ sched_getaffinity(0, sizeof(affinity), &affinity);
+ num_threads = CPU_COUNT(&affinity);
+ tinfo = calloc(num_threads, sizeof(*tinfo));
+ if (!tinfo) {
+ perror("Error: failed to allocate tinfo");
+ return -1;
+ }
+ args = calloc(num_threads, sizeof(*args));
+ if (!args) {
+ perror("Error: failed to allocate args");
+ ret = -1;
+ goto out_free_tinfo;
+ }
+ token = malloc(sizeof(*token));
+ if (!token) {
+ perror("Error: failed to allocate token");
+ ret = -1;
+ goto out_free_args;
+ }
+ barrier = malloc(sizeof(*barrier));
+ if (!barrier) {
+ perror("Error: failed to allocate barrier");
+ ret = -1;
+ goto out_free_token;
+ }
+ if (num_threads == 1) {
+ fprintf(stderr, "Cannot test on a single cpu. "
+ "Skipping mm_cid_compaction test.\n");
+ /* only skipping the test, this is not a failure */
+ goto out_free_barrier;
+ }
+ pthread_mutex_init(token, NULL);
+ ret = pthread_barrier_init(barrier, NULL, num_threads);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to initialise barrier");
+ goto out_free_barrier;
+ }
+ for (i = 0, j = 0; i < CPU_SETSIZE && j < num_threads; i++) {
+ if (!CPU_ISSET(i, &affinity))
+ continue;
+ args[j].num_cpus = num_threads;
+ args[j].tinfo = tinfo;
+ args[j].token = token;
+ args[j].barrier = barrier;
+ args[j].cpu = i;
+ args[j].args_head = args;
+ if (!j) {
+ /* The first thread is the main one */
+ tinfo[0] = pthread_self();
+ ++j;
+ continue;
+ }
+ ret = pthread_create(&tinfo[j], NULL, thread_runner, &args[j]);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to create thread");
+ abort();
+ }
+ ++j;
+ }
+ printf_verbose("Started %d threads.\n", num_threads);
+
+ /* Also main thread will terminate if it is not selected as leader */
+ thread_runner(&args[0]);
+
+ /* only reached in case of errors */
+out_free_barrier:
+ free(barrier);
+out_free_token:
+ free(token);
+out_free_args:
+ free(args);
+out_free_tinfo:
+ free(tinfo);
+
+ return ret;
+}
+
+int main(int argc, char **argv)
+{
+ if (!rseq_mm_cid_available()) {
+ fprintf(stderr, "Error: rseq_mm_cid unavailable\n");
+ return -1;
+ }
+ if (test_mm_cid_compaction())
+ return -1;
+ return 0;
+}
--
2.48.1
Hello everyone,
Some minor grammer issues that I have fixed:
1. echo "If an error every occurs --> echo "If an error occurs, every execution
2. Example uses --> Example Usage
Signed-off-by: Sumya Hoque <sumyahoque2012(a)gmail.com>
---
tools/testing/selftests/sysctl/sysctl.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/sysctl/sysctl.sh b/tools/testing/selftests/sysctl/sysctl.sh
index 84472b436c07..a4d76147ed8a 100755
--- a/tools/testing/selftests/sysctl/sysctl.sh
+++ b/tools/testing/selftests/sysctl/sysctl.sh
@@ -891,11 +891,11 @@ usage()
echo " -l List all test ID list"
echo " -h|--help Help"
echo
- echo "If an error every occurs execution will immediately terminate."
+ echo "If an error occurs, every execution will immediately terminate."
echo "If you are adding a new test try using -w <test-ID> first to"
echo "make sure the test passes a series of tests."
echo
- echo Example uses:
+ echo Example usage:
echo
echo "$TEST_NAME.sh -- executes all tests"
echo "$TEST_NAME.sh -t 0002 -- Executes test ID 0002 number of times is recomended"
--
2.34.1
From: Kevin Brodsky <kevin.brodsky(a)arm.com>
[ Upstream commit 46036188ea1f5266df23a6149dea0df1c77cd1c7 ]
The mm kselftests are currently built with no optimisation (-O0). It's
unclear why, and besides being obviously suboptimal, this also prevents
the pkeys tests from working as intended. Let's build all the tests with
-O2.
[kevin.brodsky(a)arm.com: silence unused-result warnings]
Link: https://lkml.kernel.org/r/20250107170110.2819685-1-kevin.brodsky@arm.com
Link: https://lkml.kernel.org/r/20241209095019.1732120-6-kevin.brodsky@arm.com
Signed-off-by: Kevin Brodsky <kevin.brodsky(a)arm.com>
Cc: Aruna Ramakrishna <aruna.ramakrishna(a)oracle.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Joey Gouly <joey.gouly(a)arm.com>
Cc: Keith Lucas <keith.lucas(a)oracle.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
(cherry picked from commit 46036188ea1f5266df23a6149dea0df1c77cd1c7)
[Yifei: This commit also fix the failure of pkey_sighandler_tests_64,
which is also in linux-6.12.y, thus backport this commit. It is already
backported to linux-6.13.y by commit d9eb5a1e76f56]
Signed-off-by: Yifei Liu <yifei.l.liu(a)oracle.com>
---
tools/testing/selftests/mm/Makefile | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 02e1204971b0..c0138cb19705 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -33,9 +33,16 @@ endif
# LDLIBS.
MAKEFLAGS += --no-builtin-rules
-CFLAGS = -Wall -I $(top_srcdir) $(EXTRA_CFLAGS) $(KHDR_INCLUDES) $(TOOLS_INCLUDES)
+CFLAGS = -Wall -O2 -I $(top_srcdir) $(EXTRA_CFLAGS) $(KHDR_INCLUDES) $(TOOLS_INCLUDES)
LDLIBS = -lrt -lpthread -lm
+# Some distributions (such as Ubuntu) configure GCC so that _FORTIFY_SOURCE is
+# automatically enabled at -O1 or above. This triggers various unused-result
+# warnings where functions such as read() or write() are called and their
+# return value is not checked. Disable _FORTIFY_SOURCE to silence those
+# warnings.
+CFLAGS += -U_FORTIFY_SOURCE
+
TEST_GEN_FILES = cow
TEST_GEN_FILES += compaction_test
TEST_GEN_FILES += gup_longterm
--
2.46.0
The kprobe_multi feature was disabled on ARM64 due to the lack of fprobe
support.
The fprobe rewrite on function_graph has been recently merged and thus
brought support for fprobes on arm64. This then enables kprobe_multi
support on arm64, and so the corresponding tests can now be run on this
architecture.
Remove the tests depending on kprobe_multi from DENYLIST.aarch64 to
allow those to run in CI. CONFIG_FPROBE is already correctly set in
tools/testing/selftests/bpf/config
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore(a)bootlin.com>
---
The tests being enabled with this series have been run locally in an
ARM64 qemu environment, and in Github CI.
I only did some testing to ensure that the tests depending on kprobe_multi
now run correctly on arm64, it is fair to stress that all the hard
work has actually been done by M. Hiramatsu ([0])
[0] https://lore.kernel.org/bpf/173518987627.391279.3307342580035322889.stgit@d…
---
tools/testing/selftests/bpf/DENYLIST.aarch64 | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index 901349da680fa67896d279d184db78e964d9ae27..6d8feda27ce9de07d77d6e384666082923e3dc76 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -1,12 +1,3 @@
-bpf_cookie/multi_kprobe_attach_api # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3
-bpf_cookie/multi_kprobe_link_api # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3
-kprobe_multi_bench_attach # needs CONFIG_FPROBE
-kprobe_multi_test # needs CONFIG_FPROBE
-module_attach # prog 'kprobe_multi': failed to auto-attach: -95
fentry_test/fentry_many_args # fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524
fexit_test/fexit_many_args # fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524
tracing_struct/struct_many_args # struct_many_args:FAIL:tracing_struct_many_args__attach unexpected error: -524
-fill_link_info/kprobe_multi_link_info # bpf_program__attach_kprobe_multi_opts unexpected error: -95
-fill_link_info/kretprobe_multi_link_info # bpf_program__attach_kprobe_multi_opts unexpected error: -95
-fill_link_info/kprobe_multi_invalid_ubuff # bpf_program__attach_kprobe_multi_opts unexpected error: -95
-missed/kprobe_recursion # missed_kprobe_recursion__attach unexpected error: -95 (errno 95)
---
base-commit: d3417ac824b98e8773bc04b93e09c4b93c2c6cad
change-id: 20250219-enable_kprobe_multi_tests-c8d53336e5cd
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
This is one of just 3 remaining "Test Module" kselftests (the others
being bitmap and scanf), the rest having been converted to KUnit.
I tested this using:
$ tools/testing/kunit/kunit.py run --arch arm64 --make_options LLVM=1 printf
I have also sent out a series converting scanf[0].
Link: https://lore.kernel.org/all/20250204-scanf-kunit-convert-v3-0-386d7c3ee714@… [0]
Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v4:
- Add patch "implicate test line in failure messages".
- Rebase on linux-next, move scanf_kunit.c into lib/tests/.
- Link to v3: https://lore.kernel.org/r/20250210-printf-kunit-convert-v3-0-ee6ac5500f5e@g…
Changes in v3:
- Remove extraneous trailing newlines from failure messages.
- Replace `pr_warn` with `kunit_warn`.
- Drop arch changes.
- Remove KUnit boilerplate from CONFIG_PRINTF_KUNIT_TEST help text.
- Restore `total_tests` counting.
- Remove tc_fail macro in last patch.
- Link to v2: https://lore.kernel.org/r/20250207-printf-kunit-convert-v2-0-057b23860823@g…
Changes in v2:
- Incorporate code review from prior work[0] by Arpitha Raghunandan.
- Link to v1: https://lore.kernel.org/r/20250204-printf-kunit-convert-v1-0-ecf1b846a4de@g…
Link: https://lore.kernel.org/lkml/20200817043028.76502-1-98.arpi@gmail.com/t/#u [0]
---
Tamir Duberstein (3):
printf: convert self-test to KUnit
printf: break kunit into test cases
printf: implicate test line in failure messages
Documentation/core-api/printk-formats.rst | 4 +-
MAINTAINERS | 2 +-
lib/Kconfig.debug | 12 +-
lib/Makefile | 1 -
lib/tests/Makefile | 1 +
lib/{test_printf.c => tests/printf_kunit.c} | 437 ++++++++++++----------------
tools/testing/selftests/lib/config | 1 -
tools/testing/selftests/lib/printf.sh | 4 -
8 files changed, 200 insertions(+), 262 deletions(-)
---
base-commit: 7b7a883c7f4de1ee5040bd1c32aabaafde54d209
change-id: 20250131-printf-kunit-convert-fd4012aa2ec6
Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>