This series fixes a race condition in netconsole's userdata handling where concurrent message transmission could read partially updated userdata fields, resulting in corrupted netconsole output.
The first patch adds a selftest that reproduces the race condition by continuously sending messages while rapidly changing userdata values, detecting any torn reads in the output.
The second patch fixes the issue by ensuring update_userdata() holds the target_list_lock while updating both extradata_complete and userdata_length, preventing readers from seeing inconsistent state.
This targets net tree as it fixes a bug introduced in commit df03f830d099 ("net: netconsole: cache userdata formatted string in netconsole_target").
Signed-off-by: Gustavo Luiz Duarte gustavold@gmail.com --- Gustavo Luiz Duarte (2): selftests: netconsole: Add race condition test for userdata corruption netconsole: Fix race condition in between reader and writer of userdata
drivers/net/netconsole.c | 5 ++ .../selftests/drivers/net/netcons_race_userdata.sh | 87 ++++++++++++++++++++++ 2 files changed, 92 insertions(+) --- base-commit: ffff5c8fc2af2218a3332b3d5b97654599d50cde change-id: 20251020-netconsole-fix-race-f465f37b57ea
Best regards,
Add a test to verify that netconsole userdata handling is properly synchronized under concurrent read/write operations. The test creates two competing loops: one continuously sending netconsole messages (which read userdata), and another rapidly alternating userdata values between two distinct 198-byte patterns filled with 'A' and 'B' characters.
Without proper synchronization, concurrent reads and writes could result in torn reads where a message contains mixed userdata (e.g., starting with 'A' but containing 'B', or vice versa). The test monitors 10,000 messages and fails if it detects any such corruption, ensuring that the netconsole implementation maintains data consistency through proper locking mechanisms.
This test validates the fix for potential race conditions in the netconsole userdata path and serves as a regression test to prevent similar issues in the future.
Signed-off-by: Gustavo Luiz Duarte gustavold@gmail.com --- .../selftests/drivers/net/netcons_race_userdata.sh | 87 ++++++++++++++++++++++ 1 file changed, 87 insertions(+)
diff --git a/tools/testing/selftests/drivers/net/netcons_race_userdata.sh b/tools/testing/selftests/drivers/net/netcons_race_userdata.sh new file mode 100755 index 0000000000000..d6574f0364ead --- /dev/null +++ b/tools/testing/selftests/drivers/net/netcons_race_userdata.sh @@ -0,0 +1,87 @@ +#!/usr/bin/env bash +# SPDX-License-Identifier: GPL-2.0 + +# This test verifies that netconsole userdata remains consistent under concurrent +# read/write operations. It creates two loops: one continuously writing netconsole +# messages (which read userdata) and another rapidly alternating userdata values +# between two distinct patterns. The test checks that no message contains corrupted +# or mixed userdata, ensuring proper synchronization in the netconsole implementation. +# +# Author: Gustavo Luiz Duarte gustavold@gmail.com + +set -euo pipefail + +SCRIPTDIR=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") + +source "${SCRIPTDIR}"/lib/sh/lib_netcons.sh + +function loop_set_userdata() { + MSGA=$(printf 'A%.0s' {1..198}) + MSGB=$(printf 'B%.0s' {1..198}) + + while true; do + echo "$MSGA" > "${NETCONS_PATH}/userdata/${USERDATA_KEY}/value" + echo "$MSGB" > "${NETCONS_PATH}/userdata/${USERDATA_KEY}/value" + done +} + +function loop_print_msg() { + while true; do + echo "test msg" > /dev/kmsg + done +} + +cleanup_children() { + pkill_socat + # Remove the namespace, interfaces and netconsole target + cleanup + kill $child1 $child2 2> /dev/null || true + wait $child1 $child2 2> /dev/null || true +} + +modprobe netdevsim 2> /dev/null || true +modprobe netconsole 2> /dev/null || true + +OUTPUT_FILE="stdout" +# Check for basic system dependency and exit if not found +check_for_dependencies +# Set current loglevel to KERN_INFO(6), and default to KERN_NOTICE(5) +echo "6 5" > /proc/sys/kernel/printk +# kill child processes and remove interfaces on exit +trap cleanup_children EXIT + +# Create one namespace and two interfaces +set_network +# Create a dynamic target for netconsole +create_dynamic_target +# Set userdata "key" with the "value" value +set_user_data + +# Start userdata read loop (printk) +loop_print_msg & +child1=$! + +# Start userdata write loop +loop_set_userdata & +child2=$! + +# Start socat to listen for netconsole messages and check for corrupted userdata. +MAX_COUNT=10000 +i=0 +while read line; do + if [ $i -ge $MAX_COUNT ]; then + echo "Test passed." + exit ${ksft_pass} + fi + + if [[ "$line" == "key=A"* && "$line" == *"B"* || + "$line" == "key=B"* && "$line" == *"A"* ]]; then + echo "Test failed. Found corrupted userdata: $line" + exit ${ksft_fail} + fi + + i=$((i + 1)) +done < <(listen_port_and_save_to ${OUTPUT_FILE} 2> /dev/null) + +echo "socat died before we could check $MAX_COUNT messages. Skipping test. ${ksft_skip}" +exit ${ksft_skip}
Hi Gustavo,
On Mon, Oct 20, 2025 at 02:22:34PM -0700, Gustavo Luiz Duarte wrote:
This test validates the fix for potential race conditions in the netconsole userdata path and serves as a regression test to prevent similar issues in the future.
I noticed the test was not added to the TEST_PROGS in the Makefile like other selftests. Is that intentional?
You might also need to change the order of the patches in the series to make sure the test passes in CI.
+cleanup_children() {
- pkill_socat
- # Remove the namespace, interfaces and netconsole target
- cleanup
- kill $child1 $child2 2> /dev/null || true
- wait $child1 $child2 2> /dev/null || true
+}
Calling cleanup before stopping loop_set_userdata causes writing the userdata to fail. You might want to move the kill and wait lines to before call to cleanup. Additionally, shellcheck also suggests wrapping $child1 and $child2 with double quotes.
The update_userdata() function constructs the complete userdata string in nt->extradata_complete and updates nt->userdata_length. This data is then read by write_msg() and write_ext_msg() when sending netconsole messages. However, update_userdata() was not holding target_list_lock during this process, allowing concurrent message transmission to read partially updated userdata.
This race condition could result in netconsole messages containing incomplete or inconsistent userdata - for example, reading the old userdata_length with new extradata_complete content, or vice versa, leading to truncated or corrupted output.
Fix this by acquiring target_list_lock with spin_lock_irqsave() before updating extradata_complete and userdata_length, and releasing it after both fields are fully updated. This ensures that readers see a consistent view of the userdata, preventing corruption during concurrent access.
The fix aligns with the existing locking pattern used throughout the netconsole code, where target_list_lock protects access to target fields including buf[] and msgcounter that are accessed during message transmission.
Fixes: df03f830d099 ("net: netconsole: cache userdata formatted string in netconsole_target")
Signed-off-by: Gustavo Luiz Duarte gustavold@gmail.com --- drivers/net/netconsole.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c index 194570443493b..1f9cf6b12dfc5 100644 --- a/drivers/net/netconsole.c +++ b/drivers/net/netconsole.c @@ -888,6 +888,9 @@ static void update_userdata(struct netconsole_target *nt) { int complete_idx = 0, child_count = 0; struct list_head *entry; + unsigned long flags; + + spin_lock_irqsave(&target_list_lock, flags);
/* Clear the current string in case the last userdatum was deleted */ nt->userdata_length = 0; @@ -918,6 +921,8 @@ static void update_userdata(struct netconsole_target *nt) } nt->userdata_length = strnlen(nt->extradata_complete, sizeof(nt->extradata_complete)); + + spin_unlock_irqrestore(&target_list_lock, flags); }
static ssize_t userdatum_value_store(struct config_item *item, const char *buf,
On Mon, Oct 20, 2025 at 02:22:35PM -0700, Gustavo Luiz Duarte wrote:
The update_userdata() function constructs the complete userdata string in nt->extradata_complete and updates nt->userdata_length. This data is then read by write_msg() and write_ext_msg() when sending netconsole messages. However, update_userdata() was not holding target_list_lock during this process, allowing concurrent message transmission to read partially updated userdata.
This race condition could result in netconsole messages containing incomplete or inconsistent userdata - for example, reading the old userdata_length with new extradata_complete content, or vice versa, leading to truncated or corrupted output.
Fix this by acquiring target_list_lock with spin_lock_irqsave() before updating extradata_complete and userdata_length, and releasing it after both fields are fully updated. This ensures that readers see a consistent view of the userdata, preventing corruption during concurrent access.
The fix aligns with the existing locking pattern used throughout the netconsole code, where target_list_lock protects access to target fields including buf[] and msgcounter that are accessed during message transmission.
Fixes: df03f830d099 ("net: netconsole: cache userdata formatted string in netconsole_target")
nit: no blank line here please
Signed-off-by: Gustavo Luiz Duarte gustavold@gmail.com
drivers/net/netconsole.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c index 194570443493b..1f9cf6b12dfc5 100644 --- a/drivers/net/netconsole.c +++ b/drivers/net/netconsole.c @@ -888,6 +888,9 @@ static void update_userdata(struct netconsole_target *nt) { int complete_idx = 0, child_count = 0; struct list_head *entry;
- unsigned long flags;
- spin_lock_irqsave(&target_list_lock, flags);
/* Clear the current string in case the last userdatum was deleted */ nt->userdata_length = 0; @@ -918,6 +921,8 @@ static void update_userdata(struct netconsole_target *nt) } nt->userdata_length = strnlen(nt->extradata_complete, sizeof(nt->extradata_complete));
- spin_unlock_irqrestore(&target_list_lock, flags);
} static ssize_t userdatum_value_store(struct config_item *item, const char *buf,
-- 2.47.3
linux-kselftest-mirror@lists.linaro.org