This commit adds a new kernel selftest to verify RTNLGRP_IPV4_MCADDR
and RTNLGRP_IPV6_MCADDR notifications. The test works by adding and
removing a dummy interface and then confirming that the system
correctly receives join and removal notifications for the 224.0.0.1
and ff02::1 multicast addresses.
The test relies on the iproute2 version to be 6.13+.
Tested by the following command:
$ vng -v --user root --cpus 16 -- \
make -C tools/testing/selftests TARGETS=net
TEST_PROGS=rtnetlink_notification.sh \
TEST_GEN_PROGS="" run_tests
Cc: Maciej Żenczykowski <maze(a)google.com>
Cc: Lorenzo Colitti <lorenzo(a)google.com>
Signed-off-by: Yuyang Huang <yuyanghuang(a)google.com>
---
Changelog since v2:
- Move the test cases to a separate file.
Changelog since v1:
- Skip the test if the iproute2 is too old.
tools/testing/selftests/net/Makefile | 1 +
.../selftests/net/rtnetlink_notification.sh | 159 ++++++++++++++++++
2 files changed, 160 insertions(+)
create mode 100755 tools/testing/selftests/net/rtnetlink_notification.sh
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 70a38f485d4d..ad258b25bc9d 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -40,6 +40,7 @@ TEST_PROGS += netns-name.sh
TEST_PROGS += link_netns.py
TEST_PROGS += nl_netdev.py
TEST_PROGS += rtnetlink.py
+TEST_PROGS += rtnetlink_notification.sh
TEST_PROGS += srv6_end_dt46_l3vpn_test.sh
TEST_PROGS += srv6_end_dt4_l3vpn_test.sh
TEST_PROGS += srv6_end_dt6_l3vpn_test.sh
diff --git a/tools/testing/selftests/net/rtnetlink_notification.sh b/tools/testing/selftests/net/rtnetlink_notification.sh
new file mode 100755
index 000000000000..a2c1afed5023
--- /dev/null
+++ b/tools/testing/selftests/net/rtnetlink_notification.sh
@@ -0,0 +1,159 @@
+#!/bin/bash
+#
+# This test is for checking rtnetlink notification callpaths, and get as much
+# coverage as possible.
+#
+# set -e
+
+ALL_TESTS="
+ kci_test_mcast_addr_notification
+"
+
+VERBOSE=0
+PAUSE=no
+PAUSE_ON_FAIL=no
+
+source lib.sh
+
+# set global exit status, but never reset nonzero one.
+check_err()
+{
+ if [ $ret -eq 0 ]; then
+ ret=$1
+ fi
+ [ -n "$2" ] && echo "$2"
+}
+
+run_cmd_common()
+{
+ local cmd="$*"
+ local out
+ if [ "$VERBOSE" = "1" ]; then
+ echo "COMMAND: ${cmd}"
+ fi
+ out=$($cmd 2>&1)
+ rc=$?
+ if [ "$VERBOSE" = "1" -a -n "$out" ]; then
+ echo " $out"
+ fi
+ return $rc
+}
+
+run_cmd() {
+ run_cmd_common "$@"
+ rc=$?
+ check_err $rc
+ return $rc
+}
+
+end_test()
+{
+ echo "$*"
+ [ "${VERBOSE}" = "1" ] && echo
+
+ if [[ $ret -ne 0 ]] && [[ "${PAUSE_ON_FAIL}" = "yes" ]]; then
+ echo "Hit enter to continue"
+ read a
+ fi;
+
+ if [ "${PAUSE}" = "yes" ]; then
+ echo "Hit enter to continue"
+ read a
+ fi
+
+}
+
+kci_test_mcast_addr_notification()
+{
+ local tmpfile
+ local monitor_pid
+ local match_result
+
+ tmpfile=$(mktemp)
+
+ ip monitor maddr > $tmpfile &
+ monitor_pid=$!
+ sleep 1
+ if [ ! -e "/proc/$monitor_pid" ]; then
+ end_test "SKIP: mcast addr notification: iproute2 too old"
+ rm $tmpfile
+ return $ksft_skip
+ fi
+
+ run_cmd ip link add name test-dummy1 type dummy
+ run_cmd ip link set test-dummy1 up
+ run_cmd ip link del dev test-dummy1
+ sleep 1
+
+ match_result=$(grep -cE "test-dummy1.*(224.0.0.1|ff02::1)" $tmpfile)
+
+ kill $monitor_pid
+ rm $tmpfile
+ # There should be 4 line matches as follows.
+ # 13: test-dummy1 inet6 mcast ff02::1 scope global
+ # 13: test-dummy1 inet mcast 224.0.0.1 scope global
+ # Deleted 13: test-dummy1 inet mcast 224.0.0.1 scope global
+ # Deleted 13: test-dummy1 inet6 mcast ff02::1 scope global
+ if [ $match_result -ne 4 ];then
+ end_test "FAIL: mcast addr notification"
+ return 1
+ fi
+ end_test "PASS: mcast addr notification"
+}
+
+kci_test_rtnl()
+{
+ local current_test
+ local ret=0
+
+ for current_test in ${TESTS:-$ALL_TESTS}; do
+ $current_test
+ check_err $?
+ done
+
+ return $ret
+}
+
+usage()
+{
+ cat <<EOF
+usage: ${0##*/} OPTS
+
+ -t <test> Test(s) to run (default: all)
+ (options: $(echo $ALL_TESTS))
+ -v Verbose mode (show commands and output)
+ -P Pause after every test
+ -p Pause after every failing test before cleanup (for debugging)
+EOF
+}
+
+#check for needed privileges
+if [ "$(id -u)" -ne 0 ];then
+ end_test "SKIP: Need root privileges"
+ exit $ksft_skip
+fi
+
+for x in ip;do
+ $x -Version 2>/dev/null >/dev/null
+ if [ $? -ne 0 ];then
+ end_test "SKIP: Could not run test without the $x tool"
+ exit $ksft_skip
+ fi
+done
+
+while getopts t:hvpP o; do
+ case $o in
+ t) TESTS=$OPTARG;;
+ v) VERBOSE=1;;
+ p) PAUSE_ON_FAIL=yes;;
+ P) PAUSE=yes;;
+ h) usage; exit 0;;
+ *) usage; exit 1;;
+ esac
+done
+
+[ $PAUSE = "yes" ] && PAUSE_ON_FAIL="no"
+
+kci_test_rtnl
+
+exit $?
--
2.50.0.rc1.591.g9c95f17f64-goog
This patch series introduces a new feature to netconsole which allows
appending a message ID to the userdata dictionary.
If the msgid feature is enabled, the message ID is built from a per-target 32
bit counter that is incremented and appended to every message sent to the target.
Example::
echo 1 > "/sys/kernel/config/netconsole/cmdline0/userdata/msgid_enabled"
echo "This is message #1" > /dev/kmsg
echo "This is message #2" > /dev/kmsg
13,434,54928466,-;This is message #1
msgid=1
13,435,54934019,-;This is message #2
msgid=2
This feature can be used by the target to detect if messages were dropped or
reordered before reaching the target. This allows system administrators to
assess the reliability of their netconsole pipeline and detect loss of messages
due to network contention or temporary unavailability.
Suggested-by: Breno Leitao <leitao(a)debian.org>
Signed-off-by: Gustavo Luiz Duarte <gustavold(a)gmail.com>
---
Changes in v3:
- Add kdoc documentation for msgcounter.
- Link to v2: https://lore.kernel.org/r/20250612-netconsole-msgid-v2-0-d4c1abc84bac@gmail…
Changes in v2:
- Use wrapping_assign_add() to avoid warnings in UBSAN and friends.
- Improve documentation to clarify wrapping and distinguish msgid from sequnum.
- Rebase and fix conflict in prepare_extradata().
- Link to v1: https://lore.kernel.org/r/20250611-netconsole-msgid-v1-0-1784a51feb1e@gmail…
---
Gustavo Luiz Duarte (5):
netconsole: introduce 'msgid' as a new sysdata field
netconsole: implement configfs for msgid_enabled
netconsole: append msgid to sysdata
selftests: netconsole: Add tests for 'msgid' feature in sysdata
docs: netconsole: document msgid feature
Documentation/networking/netconsole.rst | 32 +++++++++++
drivers/net/netconsole.c | 66 ++++++++++++++++++++++
.../selftests/drivers/net/netcons_sysdata.sh | 30 ++++++++++
3 files changed, 128 insertions(+)
---
base-commit: 08207f42d3ffee43c97f16baf03d7426a3c353ca
change-id: 20250609-netconsole-msgid-b93c6f8e9c60
Best regards,
--
Gustavo Luiz Duarte <gustavold(a)gmail.com>
We already have a selftest for harness, while there is not usage of
FIXTURE_VARIANT.
Patch 3 add FIXTURE_VARIANT usage in the selftest.
Patch 1/2 are trivial fix.
Wei Yang (3):
selftests: correct one typo in comment
selftests: print 0 if no test is chosen
selftests: harness: Add kselftest harness selftest with variant
tools/testing/selftests/kselftest.h | 2 +-
tools/testing/selftests/kselftest_harness.h | 2 +-
.../kselftest_harness/harness-selftest.c | 34 +++++++++++++++++++
.../harness-selftest.expected | 22 +++++++++---
4 files changed, 54 insertions(+), 6 deletions(-)
--
2.34.1
Trivial fix to a couple of outdated netmem comments. No code changes,
just more accurately describing current code.
Signed-off-by: Mina Almasry <almasrymina(a)google.com>
---
v2: https://lore.kernel.org/netdev/20250613042804.3259045-2-almasrymina@google.…
- Adjust comment for clearing lsb as (Jakub)
---
include/net/netmem.h | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 386164fb9c18..850869b45b45 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -89,8 +89,7 @@ static inline unsigned int net_iov_idx(const struct net_iov *niov)
* typedef netmem_ref - a nonexistent type marking a reference to generic
* network memory.
*
- * A netmem_ref currently is always a reference to a struct page. This
- * abstraction is introduced so support for new memory types can be added.
+ * A netmem_ref can be a struct page* or a struct net_iov* underneath.
*
* Use the supplied helpers to obtain the underlying memory pointer and fields.
*/
@@ -117,9 +116,6 @@ static inline struct page *__netmem_to_page(netmem_ref netmem)
return (__force struct page *)netmem;
}
-/* This conversion fails (returns NULL) if the netmem_ref is not struct page
- * backed.
- */
static inline struct page *netmem_to_page(netmem_ref netmem)
{
if (WARN_ON_ONCE(netmem_is_net_iov(netmem)))
@@ -178,6 +174,21 @@ static inline unsigned long netmem_pfn_trace(netmem_ref netmem)
return page_to_pfn(netmem_to_page(netmem));
}
+/* __netmem_clear_lsb - convert netmem_ref to struct net_iov * for access to
+ * common fields.
+ * @netmem: netmem reference to extract as net_iov.
+ *
+ * All the sub types of netmem_ref (page, net_iov) have the same pp, pp_magic,
+ * dma_addr, and pp_ref_count fields at the same offsets. Thus, we can access
+ * these fields without a type check to make sure that the underlying mem is
+ * net_iov or page.
+ *
+ * The resulting value of this function can only be used to access the fields
+ * that are NET_IOV_ASSERT_OFFSET'd. Accessing any other fields will result in
+ * undefined behavior.
+ *
+ * Return: the netmem_ref cast to net_iov* regardless of its underlying type.
+ */
static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
{
return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
base-commit: 8909f5f4ecd551c2299b28e05254b77424c8c7dc
--
2.50.0.rc1.591.g9c95f17f64-goog
This commit adds a new kernel selftest to verify RTNLGRP_IPV4_MCADDR
and RTNLGRP_IPV6_MCADDR notifications. The test works by adding and
removing a dummy interface and then confirming that the system
correctly receives join and removal notifications for the 224.0.0.1
and ff02::1 multicast addresses.
The test relies on the iproute2 version to be 6.13+.
Tested by the following command:
$ vng -v --user root --cpus 16 -- \
make -C tools/testing/selftests TARGETS=net
TEST_PROGS=rtnetlink_notification.sh \
TEST_GEN_PROGS="" run_tests
Cc: Maciej Żenczykowski <maze(a)google.com>
Cc: Lorenzo Colitti <lorenzo(a)google.com>
Signed-off-by: Yuyang Huang <yuyanghuang(a)google.com>
---
Changelog since v3:
- Refactor the test to use utilities provided by lib.h.
- Fix shellcheck warnings.
Changelog since v2:
- Move the test case to a separate file.
Changelog since v1:
- Skip the test if the iproute2 is too old.
tools/testing/selftests/net/Makefile | 1 +
.../selftests/net/rtnetlink_notification.sh | 70 +++++++++++++++++++
2 files changed, 71 insertions(+)
create mode 100755 tools/testing/selftests/net/rtnetlink_notification.sh
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index ab996bd22a5f..3abb74d563a7 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -41,6 +41,7 @@ TEST_PROGS += netns-name.sh
TEST_PROGS += link_netns.py
TEST_PROGS += nl_netdev.py
TEST_PROGS += rtnetlink.py
+TEST_PROGS += rtnetlink_notification.sh
TEST_PROGS += srv6_end_dt46_l3vpn_test.sh
TEST_PROGS += srv6_end_dt4_l3vpn_test.sh
TEST_PROGS += srv6_end_dt6_l3vpn_test.sh
diff --git a/tools/testing/selftests/net/rtnetlink_notification.sh b/tools/testing/selftests/net/rtnetlink_notification.sh
new file mode 100755
index 000000000000..39c1b815bbe4
--- /dev/null
+++ b/tools/testing/selftests/net/rtnetlink_notification.sh
@@ -0,0 +1,70 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# This test is for checking rtnetlink notification callpaths, and get as much
+# coverage as possible.
+#
+# set -e
+
+ALL_TESTS="
+ kci_test_mcast_addr_notification
+"
+
+source lib.sh
+
+kci_test_mcast_addr_notification()
+{
+ RET=0
+ local tmpfile
+ local monitor_pid
+ local match_result
+ local test_dev="test-dummy1"
+
+ tmpfile=$(mktemp)
+ defer rm "$tmpfile"
+
+ ip monitor maddr > $tmpfile &
+ monitor_pid=$!
+ defer kill_process "$monitor_pid"
+
+ sleep 1
+
+ if [ ! -e "/proc/$monitor_pid" ]; then
+ RET=$ksft_skip
+ log_test "mcast addr notification: iproute2 too old"
+ return $RET
+ fi
+
+ ip link add name "$test_dev" type dummy
+ check_err $? "failed to add dummy interface"
+ ip link set "$test_dev" up
+ check_err $? "failed to set dummy interface up"
+ ip link del dev "$test_dev"
+ check_err $? "Failed to delete dummy interface"
+ sleep 1
+
+ # There should be 4 line matches as follows.
+ # 13: test-dummy1 inet6 mcast ff02::1 scope global
+ # 13: test-dummy1 inet mcast 224.0.0.1 scope global
+ # Deleted 13: test-dummy1 inet mcast 224.0.0.1 scope global
+ # Deleted 13: test-dummy1 inet6 mcast ff02::1 scope global
+ match_result=$(grep -cE "$test_dev.*(224.0.0.1|ff02::1)" "$tmpfile")
+ if [ "$match_result" -ne 4 ]; then
+ RET=$ksft_fail
+ fi
+ log_test "mcast addr notification: Expected 4 matches, got $match_result"
+ return $RET
+}
+
+#check for needed privileges
+if [ "$(id -u)" -ne 0 ];then
+ RET=$ksft_skip
+ log_test "need root privileges"
+ exit $RET
+fi
+
+require_command ip
+
+tests_run
+
+exit $EXIT_STATUS
--
2.50.0.rc1.591.g9c95f17f64-goog
The netdevsim driver previously lacked RX statistics support, which
prevented its use with the GenerateTraffic() test framework, as this
framework verifies traffic flow by checking RX byte counts.
This patch migrates netdevsim from its custom statistics collection to
the NETDEV_PCPU_STAT_DSTATS framework, as suggested by Jakub. This
change not only standardizes the statistics handling but also adds the
necessary RX statistics support required by the test framework.
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v3:
- Rely on netdev from caller instead of napi->dev in nsim_queue_free().
- Link to v2: https://lore.kernel.org/r/20250613-netdevsim_stat-v2-0-98fa38836c48@debian.…
Changes in v2:
- Changed the RX collection place from nsim_napi_rx() to nsim_rcv (Joe
Damato)
- Collect RX dropped packets statistic in nsim_queue_free() (Jakub)
- Added a helper in dstat to add values to RX dropped packets
- Link to v1: https://lore.kernel.org/r/20250611-netdevsim_stat-v1-0-c11b657d96bf@debian.…
---
Breno Leitao (4):
netdevsim: migrate to dstats stats collection
netdevsim: collect statistics at RX side
net: add dev_dstats_rx_dropped_add() helper
netdevsim: account dropped packet length in stats on queue free
drivers/net/netdevsim/netdev.c | 54 +++++++++++++++------------------------
drivers/net/netdevsim/netdevsim.h | 5 ----
include/linux/netdevice.h | 10 ++++++++
3 files changed, 31 insertions(+), 38 deletions(-)
---
base-commit: 3b5b1c428260152e47c9584bc176f358b87ca82d
change-id: 20250610-netdevsim_stat-95995921e03e
Best regards,
--
Breno Leitao <leitao(a)debian.org>
This comes useful when using selftests from mainline on older
kernels/setups that still rely on cgroup v1 while maintains single
variant of selftests.
The core tests that rely on v2 specific features are skipped, the
remaining ones are adjusted to work with a v1 hierarchy.
The expected output on v1 system:
ok 1 # SKIP test_cgcore_internal_process_constraint
ok 2 # SKIP test_cgcore_top_down_constraint_enable
ok 3 # SKIP test_cgcore_top_down_constraint_disable
ok 4 # SKIP test_cgcore_no_internal_process_constraint_on_threads
ok 5 # SKIP test_cgcore_parent_becomes_threaded
ok 6 # SKIP test_cgcore_invalid_domain
ok 7 # SKIP test_cgcore_populated
ok 8 test_cgcore_proc_migration
ok 9 test_cgcore_thread_migration
ok 10 test_cgcore_destroy
ok 11 test_cgcore_lesser_euid_open
ok 12 # SKIP test_cgcore_lesser_ns_open
Michal Koutný (4):
selftests: cgroup_util: Add helpers for testing named v1 hierarchies
selftests: cgroup: Add support for named v1 hierarchies in test_core
selftests: cgroup: Optionally set up v1 environment
selftests: cgroup: Fix compilation on pre-cgroupns kernels
.../selftests/cgroup/lib/cgroup_util.c | 4 +-
.../cgroup/lib/include/cgroup_util.h | 5 ++
tools/testing/selftests/cgroup/test_core.c | 84 ++++++++++++++++---
3 files changed, 82 insertions(+), 11 deletions(-)
base-commit: 9afe652958c3ee88f24df1e4a97f298afce89407
--
2.49.0
This series is a follow-up to [1], which adds mTHP support to khugepaged.
mTHP khugepaged support is a "loose" dependency for the sysfs/sysctl
configs to make sense. Without it global="defer" and mTHP="inherit" case
is "undefined" behavior.
We've seen cases were customers switching from RHEL7 to RHEL8 see a
significant increase in the memory footprint for the same workloads.
Through our investigations we found that a large contributing factor to
the increase in RSS was an increase in THP usage.
For workloads like MySQL, or when using allocators like jemalloc, it is
often recommended to set /transparent_hugepages/enabled=never. This is
in part due to performance degradations and increased memory waste.
This series introduces enabled=defer, this setting acts as a middle
ground between always and madvise. If the mapping is MADV_HUGEPAGE, the
page fault handler will act normally, making a hugepage if possible. If
the allocation is not MADV_HUGEPAGE, then the page fault handler will
default to the base size allocation. The caveat is that khugepaged can
still operate on pages that are not MADV_HUGEPAGE.
This allows for three things... one, applications specifically designed to
use hugepages will get them, and two, applications that don't use
hugepages can still benefit from them without aggressively inserting
THPs at every possible chance. This curbs the memory waste, and defers
the use of hugepages to khugepaged. Khugepaged can then scan the memory
for eligible collapsing. Lastly there is the added benefit for those who
want THPs but experience higher latency PFs. Now you can get base page
performance at the PF handler and Hugepage performance for those mappings
after they collapse.
Admins may want to lower max_ptes_none, if not, khugepaged may
aggressively collapse single allocations into hugepages.
TESTING:
- Built for x86_64, aarch64, ppc64le, and s390x
- selftests mm
- In [1] I provided a script [2] that has multiple access patterns
- lots of general use.
- redis testing. This test was my original case for the defer mode. What I
was able to prove was that THP=always leads to increased max_latency
cases; hence why it is recommended to disable THPs for redis servers.
However with 'defer' we dont have the max_latency spikes and can still
get the system to utilize THPs. I further tested this with the mTHP
defer setting and found that redis (and probably other jmalloc users)
can utilize THPs via defer (+mTHP defer) without a large latency
penalty and some potential gains. I uploaded some mmtest results
here[3] which compares:
stock+thp=never
stock+(m)thp=always
khugepaged-mthp + defer (max_ptes_none=64)
The results show that (m)THPs can cause some throughput regression in
some cases, but also has gains in other cases. The mTHP+defer results
have more gains and less losses over the (m)THP=always case.
V6 Changes:
- nits
- rebased dependent series and added review tags
V5 Changes:
- rebased dependent series
- added reviewed-by tag on 2/4
V4 Changes:
- Minor Documentation fixes
- rebased the dependent series [1] onto mm-unstable
commit 0e68b850b1d3 ("vmalloc: use atomic_long_add_return_relaxed()")
V3 Changes:
- Combined the documentation commits into one, and moved a section to the
khugepaged mthp patchset
V2 Changes:
- base changes on mTHP khugepaged support
- Fix selftests parsing issue
- add mTHP defer option
- add mTHP defer Documentation
[1] - https://lore.kernel.org/all/20250515032226.128900-1-npache@redhat.com/
[2] - https://gitlab.com/npache/khugepaged_mthp_test
[3] - https://people.redhat.com/npache/mthp_khugepaged_defer/testoutput2/output.h…
Nico Pache (4):
mm: defer THP insertion to khugepaged
mm: document (m)THP defer usage
khugepaged: add defer option to mTHP options
selftests: mm: add defer to thp setting parser
Documentation/admin-guide/mm/transhuge.rst | 31 +++++++---
include/linux/huge_mm.h | 18 +++++-
mm/huge_memory.c | 69 +++++++++++++++++++---
mm/khugepaged.c | 8 +--
tools/testing/selftests/mm/thp_settings.c | 1 +
tools/testing/selftests/mm/thp_settings.h | 1 +
6 files changed, 106 insertions(+), 22 deletions(-)
--
2.49.0