This patch series was motivated by fixing a few bugs in the bonding
driver related to xfrm state migration on device failover.
struct xfrm_dev_offload has two net_device pointers: dev and real_dev.
The first one is the device the xfrm_state is offloaded on and the
second one is used by the bonding driver to manage the underlying device
xfrm_states are actually offloaded on. When bonding isn't used, the two
pointers are the same.
This causes confusion in drivers: Which device pointer should they use?
If they want to support bonding, they need to only use real_dev and
never look at dev.
Furthermore, real_dev is used without proper locking from multiple code
paths and changing it is dangerous. See commit [1] for example.
This patch series clears things out by removing all uses of real_dev
from outside the bonding driver.
Then, the bonding driver is refactored to fix a couple of long standing
races and the original bug which motivated this patch series.
[1] commit f8cde9805981 ("bonding: fix xfrm real_dev null pointer
dereference")
Cosmin Ratiu (6):
Cleaning up unnecessary uses of xso.real_dev:
net/mlx5: Avoid using xso.real_dev unnecessarily
xfrm: Use xdo.dev instead of xdo.real_dev
xfrm: Remove unneeded device check from validate_xmit_xfrm
Refactoring device operations to get an explicit device pointer:
xfrm: Add explicit dev to .xdo_dev_state_{add,delete,free}
Fixing a bonding xfrm state migration bug:
bonding: Mark active offloaded xfrm_states
Fixing long standing races in bonding:
bonding: Fix multiple long standing offload races
Documentation/networking/xfrm_device.rst | 10 +-
drivers/net/bonding/bond_main.c | 93 +++++++++++--------
.../net/ethernet/chelsio/cxgb4/cxgb4_main.c | 20 ++--
.../inline_crypto/ch_ipsec/chcr_ipsec.c | 18 ++--
.../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 40 ++++----
drivers/net/ethernet/intel/ixgbevf/ipsec.c | 20 ++--
.../marvell/octeontx2/nic/cn10k_ipsec.c | 18 ++--
.../mellanox/mlx5/core/en_accel/ipsec.c | 28 +++---
.../mellanox/mlx5/core/en_accel/ipsec.h | 1 +
.../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 +--
drivers/net/netdevsim/ipsec.c | 15 ++-
include/linux/netdevice.h | 10 +-
include/net/xfrm.h | 8 ++
net/xfrm/xfrm_device.c | 13 +--
net/xfrm/xfrm_state.c | 16 ++--
15 files changed, 175 insertions(+), 146 deletions(-)
--
2.45.0
This patchset adds the base infrastructure for modular BPF verifier.
The motivation remains unchanged from the LSFMMBPF25 proposal [0].
However, the design has diverged. Rather than immediately going for the
facade described in [0], we instead make a stop first at the continously
exported copies of the verifier in an out-of-tree repository, with a
separate copy for each kernel release. Each copy will receive as many
verifier backports as possible within the "boundary" of the modular
portions.
For example, a patch that changes the verifier at the same time as one
of the kernel symbols it depends on cannot be applied, as at runtime
only the verifier portion can be updated. However, a patch that only
changes verifier.c can be applied, as it's within the boundary. Rough
analysis of past data shows that most verifier changes fall within the
latter category. The jupyter notebook for this can be found here [1].
From here, we'll gradually enlarge the "boundary" to enable backports of
more and more patches, with the north star being the facade as described
in the proposal. Ideally, completion of the facade will render the
out-of-tree repository useless.
[0]: https://lore.kernel.org/bpf/nahst74z46ov7ii3vmriyhk25zo6tkf2f3hsulzjzselvob…
[1]: https://github.com/danobi/verifier-analysis/blob/master/analysis.ipynb
Daniel Xu (13):
bpf: Move bpf_prog_ctx_arg_info_init() body into header
bpf: Move BTF related globals out of verifier.c
bpf: Move percpu memory allocator definition into core
bpf: Move bpf_check_attach_target() to core
bpf: Remove map_set_for_each_callback_args callback for maps
bpf: Move kfunc definitions out of verifier.c
bpf: Make bpf_free_kfunc_btf_tab() static in core
selftests: bpf: Avoid attaching to bpf_check()
perf: Export perf_snapshot_branch_stack static key
bpf: verifier: Add indirection to kallsyms_lookup_name()
treewide: bpf: Export symbols used by verifier
bpf: verifier: Make verifier loadable
bpf: Supporting building verifier.ko out-of-tree
arch/x86/net/bpf_jit_comp.c | 2 +
drivers/media/rc/bpf-lirc.c | 1 +
fs/bpf_fs_kfuncs.c | 4 +
include/linux/bpf.h | 82 ++-
include/linux/bpf_verifier.h | 7 -
include/linux/btf.h | 4 +
kernel/bpf/Kbuild | 8 +
kernel/bpf/Kconfig | 12 +
kernel/bpf/Makefile | 3 +-
kernel/bpf/arraymap.c | 2 -
kernel/bpf/bpf_iter.c | 1 +
kernel/bpf/bpf_lsm.c | 5 +
kernel/bpf/bpf_struct_ops.c | 2 +
kernel/bpf/btf.c | 61 +-
kernel/bpf/cgroup.c | 4 +
kernel/bpf/core.c | 463 ++++++++++++++++
kernel/bpf/disasm.c | 4 +
kernel/bpf/hashtab.c | 4 -
kernel/bpf/helpers.c | 2 +
kernel/bpf/local_storage.c | 2 +
kernel/bpf/log.c | 12 +
kernel/bpf/map_iter.c | 1 +
kernel/bpf/memalloc.c | 3 +
kernel/bpf/offload.c | 10 +
kernel/bpf/syscall.c | 52 +-
kernel/bpf/tnum.c | 20 +
kernel/bpf/token.c | 1 +
kernel/bpf/trampoline.c | 5 +
kernel/bpf/verifier.c | 521 ++----------------
kernel/events/callchain.c | 3 +
kernel/events/core.c | 1 +
kernel/trace/bpf_trace.c | 9 +
lib/error-inject.c | 2 +
net/core/filter.c | 26 +
net/core/xdp.c | 2 +
net/netfilter/nf_bpf_link.c | 1 +
.../selftests/bpf/progs/exceptions_assert.c | 2 +-
.../selftests/bpf/progs/exceptions_fail.c | 4 +-
38 files changed, 834 insertions(+), 514 deletions(-)
create mode 100644 kernel/bpf/Kbuild
--
2.47.1
When running the mincore_selftest on a system with an XFS file system, it
failed the "check_file_mmap" test case due to the read-ahead pages reaching
the end of the file. The failure log is as below:
RUN global.check_file_mmap ...
mincore_selftest.c:264:check_file_mmap:Expected i (1024) < vec_size (1024)
mincore_selftest.c:265:check_file_mmap:Read-ahead pages reached the end of the file
check_file_mmap: Test failed
FAIL global.check_file_mmap
This is because the read-ahead window size of the XFS file system on this
machine is 4 MB, which is larger than the size from the #PF address to the
end of the file. As a result, all the pages for this file are populated.
blockdev --getra /dev/nvme0n1p5
8192
blockdev --getbsz /dev/nvme0n1p5
512
This issue can be fixed by extending the current FILE_SIZE 4MB to a larger
number, but it will still fail if the read-ahead window size of the file
system is larger enough. Additionally, in the real world, read-ahead pages
reaching the end of the file can happen and is an expected behavior.
Therefore, allowing read-ahead pages to reach the end of the file is a
better choice for the "check_file_mmap" test case.
Reported-by: Yi Lai <yi1.lai(a)intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo(a)intel.com>
---
tools/testing/selftests/mincore/mincore_selftest.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/tools/testing/selftests/mincore/mincore_selftest.c b/tools/testing/selftests/mincore/mincore_selftest.c
index e949a43a6145..efabfcbe0b49 100644
--- a/tools/testing/selftests/mincore/mincore_selftest.c
+++ b/tools/testing/selftests/mincore/mincore_selftest.c
@@ -261,9 +261,6 @@ TEST(check_file_mmap)
TH_LOG("No read-ahead pages found in memory");
}
- EXPECT_LT(i, vec_size) {
- TH_LOG("Read-ahead pages reached the end of the file");
- }
/*
* End of the readahead window. The rest of the pages shouldn't
* be in memory.
--
2.17.1
Hi Linus,
Please pull the following kunit fixes update for Linux 6.15-rc2
Fixes tool to report test count in case of a late test plan when tests
are specified before the test plan. Fixes spelling error in the commit
that went into 6.15-rc1.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 0af2f6be1b4281385b618cb86ad946eded089ac8:
Linux 6.15-rc1 (2025-04-06 13:11:33 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-6.15-rc2
for you to fetch changes up to d1be0cf3b8aeae75bc8fff5b7a3e01ebfe276008:
kunit: Spelling s/slowm/slow/ (2025-04-08 14:57:24 -0600)
----------------------------------------------------------------
linux_kselftest-kunit-6.15-rc2
Fixes tool to report test count in case of a late test plan when tests
are specified before the test plan. Fixes spelling error in the commit
that went into 6.15-rc1.
----------------------------------------------------------------
Geert Uytterhoeven (1):
kunit: Spelling s/slowm/slow/
Rae Moar (1):
kunit: tool: fix count of tests if late test plan
include/kunit/test.h | 2 +-
tools/testing/kunit/kunit_parser.py | 4 ++++
tools/testing/kunit/kunit_tool_test.py | 4 ++--
3 files changed, 7 insertions(+), 3 deletions(-)
----------------------------------------------------------------
Recently we had some issues in parallel TDC where some of IFE tests are
failing due to some of IFE's submodules (like act_meta_skbtcindex and
act_meta_skbprio) taking too long to load [1]. To avoid that issue,
pre-load IFE and all its submodules before running any of the tests in
tdc.sh
[1] https://lore.kernel.org/netdev/e909b2a0-244e-4141-9fa9-1b7d96ab7d71@mojatat…
Signed-off-by: Victor Nogueira <victor(a)mojatatu.com>
---
tools/testing/selftests/tc-testing/tdc.sh | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/tc-testing/tdc.sh b/tools/testing/selftests/tc-testing/tdc.sh
index cddff1772e10..589b18ed758a 100755
--- a/tools/testing/selftests/tc-testing/tdc.sh
+++ b/tools/testing/selftests/tc-testing/tdc.sh
@@ -31,6 +31,10 @@ try_modprobe act_skbedit
try_modprobe act_skbmod
try_modprobe act_tunnel_key
try_modprobe act_vlan
+try_modprobe act_ife
+try_modprobe act_meta_mark
+try_modprobe act_meta_skbtcindex
+try_modprobe act_meta_skbprio
try_modprobe cls_basic
try_modprobe cls_bpf
try_modprobe cls_cgroup
--
2.49.0
Recently, during a debugging session using local MPTCP connections, I
noticed MPJoinAckHMacFailure was strangely not zero on the server side.
The first patch fixes this issue -- present since v5.9 -- and the second
one validates it in the selftests.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Matthieu Baerts (NGI0) (2):
mptcp: only inc MPJoinAckHMacFailure for HMAC failures
selftests: mptcp: validate MPJoin HMacFailure counters
net/mptcp/subflow.c | 8 ++++++--
tools/testing/selftests/net/mptcp/mptcp_join.sh | 18 ++++++++++++++++++
2 files changed, 24 insertions(+), 2 deletions(-)
---
base-commit: 61f96e684edd28ca40555ec49ea1555df31ba619
change-id: 20250407-net-mptcp-hmac-failure-mib-66f599305ff3
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Testcase should fail if -EWOULDBLOCK is not returned when expected value
differs from actual value from the waiter.
Fixes: 9d57f7c79748920636f8293d2f01192d702fe390 ("selftests: futex: Test sys_futex_waitv() wouldblock")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
---
.../testing/selftests/futex/functional/futex_wait_wouldblock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c b/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c
index 7d7a6a06cdb7..2d8230da9064 100644
--- a/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c
+++ b/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c
@@ -98,7 +98,7 @@ int main(int argc, char *argv[])
info("Calling futex_waitv on f1: %u @ %p with val=%u\n", f1, &f1, f1+1);
res = futex_waitv(&waitv, 1, 0, &to, CLOCK_MONOTONIC);
if (!res || errno != EWOULDBLOCK) {
- ksft_test_result_pass("futex_waitv returned: %d %s\n",
+ ksft_test_result_fail("futex_waitv returned: %d %s\n",
res ? errno : res,
res ? strerror(errno) : "");
ret = RET_FAIL;
--
2.49.0.504.g3bcea36a83-goog