Replace all occurrences of `addr_of!(place)` and `addr_of_mut!(place)`
with `&raw const place` and `&raw mut place` respectively.
This reduces macro complexity and improves consistency with the
existing reference syntax: `&raw const` and `&raw mut` mirror `&` and
`&mut`, so they fit more naturally with the surrounding code.
Suggested-by: Benno Lossin <benno.lossin(a)proton.me>
Link: https://github.com/Rust-for-Linux/linux/issues/1148
Signed-off-by: Antonio Hickey <contact(a)antoniohickey.com>
---
rust/kernel/kunit.rs | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/rust/kernel/kunit.rs b/rust/kernel/kunit.rs
index 824da0e9738a..a17ef3b2e860 100644
--- a/rust/kernel/kunit.rs
+++ b/rust/kernel/kunit.rs
@@ -128,9 +128,9 @@ unsafe impl Sync for UnaryAssert {}
unsafe {
$crate::bindings::__kunit_do_failed_assertion(
kunit_test,
- core::ptr::addr_of!(LOCATION.0),
+ &raw const LOCATION.0,
$crate::bindings::kunit_assert_type_KUNIT_ASSERTION,
- core::ptr::addr_of!(ASSERTION.0.assert),
+ &raw const ASSERTION.0.assert,
Some($crate::bindings::kunit_unary_assert_format),
core::ptr::null(),
);
--
2.48.1
I am submitting a series of patches that introduce a new feature for the
netconsole subsystem, specifically the addition of the 'release' field
to the sysdata structure. This feature allows the kernel release/version
to be appended to the userdata dictionary in every message sent,
enhancing the information available for debugging and monitoring
purposes.
This complements the already supported release prepend feature, which
was added some time ago. The release prepend feature places the release
information in the message header, which is not ideal for two reasons:
1) It is difficult to determine if a message includes this information,
making it hard and resource-intensive to parse.
2) When a message is fragmented, the release information is repeated in
every message fragment, consuming valuable space in the packet.
The "release prepend" feature was created before the concept of userdata
and sysdata. Now that this format has proven successful, we are
implementing the release feature as part of this enhanced structure.
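As a rough illustration of what a keyed entry in the userdata dictionary
could look like, here is a minimal userspace sketch. The helper name
format_release_entry(), the " release=" key spelling, and the buffer
handling are assumptions made for illustration; they are not taken from
the patches in this series.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the series): write one
 * " release=<value>\n" entry into buf.
 *
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the entry does not fit in len bytes.
 */
int format_release_entry(char *buf, size_t len, const char *release)
{
	int n;

	assert(buf != NULL && release != NULL);

	n = snprintf(buf, len, " release=%s\n", release);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

With a keyed entry like this, a parser can look for the key instead of
guessing whether the header carries a release string.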
This patch series aims to improve the netconsole subsystem by providing
a more efficient and user-friendly way to include kernel release
information in messages. I believe these changes will significantly aid
in system analysis and troubleshooting.
Suggested-by: Manu Bretelle <chantr4(a)gmail.com>
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Breno Leitao (6):
netconsole: introduce 'release' as a new sysdata field
netconsole: implement configfs for release_enabled
netconsole: add 'sysdata' suffix to related functions
netconsole: append release to sysdata
selftests: netconsole: Add tests for 'release' feature in sysdata
docs: netconsole: document release feature
Documentation/networking/netconsole.rst | 25 ++++++++
drivers/net/netconsole.c | 71 ++++++++++++++++++++--
.../selftests/drivers/net/netcons_sysdata.sh | 44 +++++++++++++-
3 files changed, 133 insertions(+), 7 deletions(-)
---
base-commit: 941defcea7e11ad7ff8f0d4856716dd637d757dd
change-id: 20250314-netcons_release-dc1f1f5ca0f7
Best regards,
--
Breno Leitao <leitao(a)debian.org>
Lei Chen raised an issue with CLOCK_MONOTONIC_COARSE seeing
time inconsistencies.
Lei tracked down that this was being caused by the adjustment
tk->tkr_mono.xtime_nsec -= offset;
which is made to compensate for the unaccumulated cycles in
offset when the mult value is adjusted forward, so that
the non-_COARSE clockids don't see inconsistencies.
However, the _COARSE clockids don't use the mult*offset value
in their calculations, so this subtraction can cause the
_COARSE clockids to jump back a bit.
Now, by design, this negative adjustment should be fine, because
the logic run from timekeeping_adjust() is done after we
accumulate approx mult*interval_cycles into xtime_nsec.
The accumulated (mult*interval_cycles) will be larger than the
(mult_adj*offset) value subtracted from xtime_nsec, and both
operations are done together under the tk_core.lock, so the net
change to xtime_nsec should always be positive.
However, do_adjtimex() calls into timekeeping_advance() as well,
since we want to apply the ntp freq adjustment immediately.
In this case, we don't return early when the offset is smaller
than interval_cycles, so we don't end up accumulating any time
into xtime_nsec. But we do go on to call timekeeping_adjust(),
which modifies the mult value, and subtracts from xtime_nsec
to correct for the new mult value.
Here, because we did not accumulate anything, we have a window
where the _COARSE clockids, which don't utilize the mult*offset
value, can see an inconsistency.
So to fix this, rework the timekeeping_advance() logic a bit
so that when we are called from do_adjtimex() and the offset
is smaller than cycle_interval, we call
timekeeping_forward() to first accumulate the sub-interval
time into xtime_nsec. Then with no unaccumulated cycles in
offset, we can do the mult adjustment without worry of the
subtraction having an impact.
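The arithmetic behind the problem and the fix can be sketched with a toy
userspace model. Everything here (struct toy_tk and the toy_*() helpers)
is hypothetical and heavily simplified: it ignores NTP error tracking,
masking and overflow, and only mirrors the xtime_nsec/mult/offset
relationship described above.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the timekeeper fields discussed above. */
struct toy_tk {
	uint64_t cycle_last;	/* cycle count at last accumulation */
	uint64_t xtime_nsec;	/* accumulated time, in shifted nsec */
	uint32_t mult;		/* cycles -> shifted nsec multiplier */
	uint32_t shift;
};

/* Non-_COARSE style read: includes the unaccumulated mult*offset part. */
uint64_t toy_read_fine(const struct toy_tk *tk, uint64_t now)
{
	return (tk->xtime_nsec + (now - tk->cycle_last) * tk->mult) >> tk->shift;
}

/* _COARSE style read: only looks at what was already accumulated. */
uint64_t toy_read_coarse(const struct toy_tk *tk)
{
	return tk->xtime_nsec >> tk->shift;
}

/*
 * Raise mult by mult_adj and subtract mult_adj*offset so that the fine
 * read is unchanged for the given unaccumulated offset.
 */
void toy_adjust_mult(struct toy_tk *tk, uint64_t offset, uint32_t mult_adj)
{
	tk->mult += mult_adj;
	tk->xtime_nsec -= offset * mult_adj;
}

/* Accumulate everything up to 'now', leaving no unaccumulated cycles. */
void toy_forward(struct toy_tk *tk, uint64_t now)
{
	assert(now >= tk->cycle_last);
	tk->xtime_nsec += (now - tk->cycle_last) * tk->mult;
	tk->cycle_last = now;
}
```

In this model, start with cycle_last = 0, xtime_nsec = 1000 << 8,
mult = 256, shift = 8 and 100 unaccumulated cycles. Calling
toy_adjust_mult(&tk, 100, 64) directly keeps the fine read at 1100 but
drops the coarse read from 1000 to 975. Calling toy_forward(&tk, 100)
first and then toy_adjust_mult(&tk, 0, 64) leaves the coarse read at
1100, so it never moves backwards.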
NOTE: This was implemented as a potential alternative to
Thomas' approach here:
https://lore.kernel.org/lkml/87cyej5rid.ffs@tglx/
And similarly, it needs some additional review and testing,
as it was developed while packing for conference travel.
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Stephen Boyd <sboyd(a)kernel.org>
Cc: Anna-Maria Behnsen <anna-maria(a)linutronix.de>
Cc: Frederic Weisbecker <frederic(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Miroslav Lichvar <mlichvar(a)redhat.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: kernel-team(a)android.com
Cc: Lei Chen <lei.chen(a)smartx.com>
Fixes: da15cfdae033 ("time: Introduce CLOCK_REALTIME_COARSE")
Reported-by: Lei Chen <lei.chen(a)smartx.com>
Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/
Diagnosed-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: John Stultz <jstultz(a)google.com>
---
kernel/time/timekeeping.c | 87 ++++++++++++++++++++++++++++-----------
1 file changed, 62 insertions(+), 25 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 1e67d076f1955..6f3a145e7b113 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -682,18 +682,18 @@ static void timekeeping_update_from_shadow(struct tk_data *tkd, unsigned int act
}
/**
- * timekeeping_forward_now - update clock to the current time
+ * timekeeping_forward - update clock to given cycle now value
* @tk: Pointer to the timekeeper to update
+ * @cycle_now: Current clocksource read value
*
* Forward the current clock to update its state since the last call to
* update_wall_time(). This is useful before significant clock changes,
* as it avoids having to deal with this time offset explicitly.
*/
-static void timekeeping_forward_now(struct timekeeper *tk)
+static void timekeeping_forward(struct timekeeper *tk, u64 cycle_now)
{
- u64 cycle_now, delta;
+ u64 delta;
- cycle_now = tk_clock_read(&tk->tkr_mono);
delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last, tk->tkr_mono.mask,
tk->tkr_mono.clock->max_raw_delta);
tk->tkr_mono.cycle_last = cycle_now;
@@ -710,6 +710,21 @@ static void timekeeping_forward_now(struct timekeeper *tk)
}
}
+/**
+ * timekeeping_forward_now - update clock to the current time
+ * @tk: Pointer to the timekeeper to update
+ *
+ * Forward the current clock to update its state since the last call to
+ * update_wall_time(). This is useful before significant clock changes,
+ * as it avoids having to deal with this time offset explicitly.
+ */
+static void timekeeping_forward_now(struct timekeeper *tk)
+{
+ u64 cycle_now = tk_clock_read(&tk->tkr_mono);
+
+ timekeeping_forward(tk, cycle_now);
+}
+
/**
* ktime_get_real_ts64 - Returns the time of day in a timespec64.
* @ts: pointer to the timespec to be set
@@ -2151,6 +2166,45 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
return offset;
}
+static u64 timekeeping_accumulate(struct timekeeper *tk, u64 now, u64 offset,
+ unsigned int *clock_set)
+{
+ struct timekeeper *real_tk = &tk_core.timekeeper;
+ int shift = 0, maxshift;
+
+ /*
+ * If we have a sub-cycle_interval offset, we
+ * are likely doing a TK_FREQ_ADJ, so accumulate
+ * everything so we don't have a remainder offset
+ * when later adjusting the multiplier
+ */
+ if (offset < real_tk->cycle_interval) {
+ timekeeping_forward(tk, now);
+ *clock_set = 1;
+ return 0;
+ }
+
+ /*
+ * With NO_HZ we may have to accumulate many cycle_intervals
+ * (think "ticks") worth of time at once. To do this efficiently,
+ * we calculate the largest doubling multiple of cycle_intervals
+ * that is smaller than the offset. We then accumulate that
+ * chunk in one go, and then try to consume the next smaller
+ * doubled multiple.
+ */
+ shift = ilog2(offset) - ilog2(tk->cycle_interval);
+ shift = max(0, shift);
+ /* Bound shift to one less than what overflows tick_length */
+ maxshift = (64 - (ilog2(ntp_tick_length()) + 1)) - 1;
+ shift = min(shift, maxshift);
+ while (offset >= tk->cycle_interval) {
+ offset = logarithmic_accumulation(tk, offset, shift, clock_set);
+ if (offset < tk->cycle_interval << shift)
+ shift--;
+ }
+ return offset;
+}
+
/*
* timekeeping_advance - Updates the timekeeper to the current time and
* current NTP tick length
@@ -2160,8 +2214,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct timekeeper *real_tk = &tk_core.timekeeper;
unsigned int clock_set = 0;
- int shift = 0, maxshift;
- u64 offset;
+ u64 cycle_now, offset;
guard(raw_spinlock_irqsave)(&tk_core.lock);
@@ -2169,7 +2222,8 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
if (unlikely(timekeeping_suspended))
return false;
- offset = clocksource_delta(tk_clock_read(&tk->tkr_mono),
+ cycle_now = tk_clock_read(&tk->tkr_mono);
+ offset = clocksource_delta(cycle_now,
tk->tkr_mono.cycle_last, tk->tkr_mono.mask,
tk->tkr_mono.clock->max_raw_delta);
@@ -2177,24 +2231,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
if (offset < real_tk->cycle_interval && mode == TK_ADV_TICK)
return false;
- /*
- * With NO_HZ we may have to accumulate many cycle_intervals
- * (think "ticks") worth of time at once. To do this efficiently,
- * we calculate the largest doubling multiple of cycle_intervals
- * that is smaller than the offset. We then accumulate that
- * chunk in one go, and then try to consume the next smaller
- * doubled multiple.
- */
- shift = ilog2(offset) - ilog2(tk->cycle_interval);
- shift = max(0, shift);
- /* Bound shift to one less than what overflows tick_length */
- maxshift = (64 - (ilog2(ntp_tick_length())+1)) - 1;
- shift = min(shift, maxshift);
- while (offset >= tk->cycle_interval) {
- offset = logarithmic_accumulation(tk, offset, shift, &clock_set);
- if (offset < tk->cycle_interval<<shift)
- shift--;
- }
+ offset = timekeeping_accumulate(tk, cycle_now, offset, &clock_set);
/* Adjust the multiplier to correct NTP error */
timekeeping_adjust(tk, offset);
--
2.49.0.rc1.451.g8f38331e32-goog
Hi all,
This is v8 of the Rust/KUnit integration patch. I think all of the
suggestions have at least been responded to (even if there are a few I'm
leaving as either future projects or matters of taste). Hopefully this
is good-to-go for 6.15, so we can start using it concurrently with
making any additional improvements we may wish.
This series was originally written by José Expósito, and has been
modified and updated by Matt Gilbride, Miguel Ojeda, and myself. The
original version can be found here:
https://github.com/Rust-for-Linux/linux/pull/950
Add support for writing KUnit tests in Rust. While Rust doctests are
already converted to KUnit tests and run, they're really better suited
for examples, rather than as first-class unit tests.
This series implements a series of direct Rust bindings for KUnit tests,
as well as a new macro which allows KUnit tests to be written using a
close variant of normal Rust unit test syntax. The only change required
is replacing '#[cfg(test)]' with '#[kunit_tests(kunit_test_suite_name)]'.
An example test would look like:
    #[kunit_tests(rust_kernel_hid_driver)]
    mod tests {
        use super::*;
        use crate::{c_str, driver, hid, prelude::*};
        use core::ptr;
        struct SimpleTestDriver;
        impl Driver for SimpleTestDriver {
            type Data = ();
        }
        #[test]
        fn rust_test_hid_driver_adapter() {
            let mut hid = bindings::hid_driver::default();
            let name = c_str!("SimpleTestDriver");
            static MODULE: ThisModule = unsafe { ThisModule::from_ptr(ptr::null_mut()) };
            let res = unsafe {
                <hid::Adapter<SimpleTestDriver> as driver::DriverOps>::register(&mut hid, name, &MODULE)
            };
            assert_eq!(res, Err(ENODEV)); // The mock returns -19
        }
    }
Please give this a go, and make sure I haven't broken it! There are almost
certainly a lot of improvements which can be made -- and there's a fair
case to be made for replacing some of this with generated C code which
can use the C macros -- but this is hopefully an adequate implementation
for now, and the interface can (with luck) remain the same even if the
implementation changes.
A few small notable missing features:
- Attributes (like the speed of a test) are hardcoded to the default
value.
- Similarly, the module name attribute is hardcoded to NULL. In C, we
use the KBUILD_MODNAME macro, but I couldn't find a way to use this
from Rust which wasn't more ugly than just disabling it.
- Assertions are not automatically rewritten to use KUnit assertions.
---
Changes since v7:
https://lore.kernel.org/rust-for-linux/20250214074051.1619256-1-davidgow@go…
- Reworked the SAFETY comment for addr_of_mut! use with statics in
kunit_unsafe_test_suite!() (again)
- Removed the second mocking example, which was causing confusion.
The first example of in_kunit_test() should be clear enough.
Changes since v6:
https://lore.kernel.org/rust-for-linux/20250214074051.1619256-1-davidgow@go…
- Fixed an [allow(unused_unsafe)] which ended up in patch 2 instead of
patch 1. (Thanks, Tamir!)
- Doc comments now have several useful links. (Thanks, Tamir!)
- Fix a potential compile error under macOS. (Thanks, Tamir!)
- Several small tidy-ups to limit unsafe usage. (Thanks, Tamir!)
Changes since v5:
https://lore.kernel.org/all/20241213081035.2069066-1-davidgow@google.com/
- Rebased against 6.14-rc1
- Fixed a bunch of warnings / clippy lints introduced in Rust 1.83 and
1.84.
- No longer needs static_mut_refs / const_mut_refs, and is much cleaned
up as a result. (Thanks, Miguel)
- Major documentation and example fixes. (Thanks, Miguel)
Changes since v4:
https://lore.kernel.org/linux-kselftest/20241101064505.3820737-1-davidgow@g…
- Rebased against 6.13-rc1
- Allowed an unused_unsafe warning after the behaviour of addr_of_mut!()
changed in Rust 1.82. (Thanks Boqun, Miguel)
- "Expect" that the sample assert_eq!(1+1, 2) produces a clippy warning
due to a redundant assertion. (Thanks Boqun, Miguel)
- Fix some missing safety comments, and remove some unneeded 'unsafe'
blocks. (Thanks Boqun)
- Fix a couple of minor rustfmt issues which were triggering checkpatch
warnings.
Changes since v3:
https://lore.kernel.org/linux-kselftest/20241030045719.3085147-2-davidgow@g…
- The kunit_unsafe_test_suite!() macro now panic!s if the suite name is
too long, triggering a compile error. (Thanks, Alice!)
- The #[kunit_tests()] macro now preserves span information, so
errors can be better reported. (Thanks, Boqun!)
- The example tests have been updated to no longer use assert_eq!() with
a constant bool argument (which triggered a clippy warning now we
have the span info).
Changes since v2:
https://lore.kernel.org/linux-kselftest/20241029092422.2884505-1-davidgow@g…
- Include missing rust/macros/kunit.rs file from v2. (Thanks Boqun!)
- The kunit_unsafe_test_suite!() macro will truncate the name of the
suite if it is too long. (Thanks Alice!)
- The proc macro now emits an error if the suite name is too long.
- We no longer needlessly use UnsafeCell<> in
kunit_unsafe_test_suite!(). (Thanks Alice!)
Changes since v1:
https://lore.kernel.org/lkml/20230720-rustbind-v1-0-c80db349e3b5@google.com…
- Rebase on top of the latest rust-next (commit 718c4069896c)
- Make kunit_case a const fn, rather than a macro (Thanks Boqun)
- As a result, the null terminator is now created with
kernel::kunit::kunit_case_null()
- Use the C kunit_get_current_test() function to implement
in_kunit_test(), rather than re-implementing it (less efficiently)
ourselves.
Changes since the GitHub PR:
- Rebased on top of kselftest/kunit
- Add const_mut_refs feature
This may conflict with https://lore.kernel.org/lkml/20230503090708.2524310-6-nmi@metaspace.dk/
- Add rust/macros/kunit.rs to the KUnit MAINTAINERS entry
---
José Expósito (3):
rust: kunit: add KUnit case and suite macros
rust: macros: add macro to easily run KUnit tests
rust: kunit: allow to know if we are in a test
MAINTAINERS | 1 +
rust/kernel/kunit.rs | 171 +++++++++++++++++++++++++++++++++++++++++++
rust/macros/kunit.rs | 161 ++++++++++++++++++++++++++++++++++++++++
rust/macros/lib.rs | 29 ++++++++
4 files changed, 362 insertions(+)
create mode 100644 rust/macros/kunit.rs
--
2.49.0.rc0.332.g42c0ae87b1-goog
Signal delivery during connect() may disconnect an already established
socket. The problem is that such a socket might have been placed in a
sockmap before the connection was closed.
PATCH 1 ensures this race won't lead to an unconnected vsock staying in the
sockmap. PATCH 2 selftests it.
PATCH 3 fixes a related race. Note that the selftest in PATCH 2 does test this
code as well, but winning this race variant may take more than 2 seconds,
so I'm not advertising it.
Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Changes in v4:
- Selftest: send signal to only our own process
- Link to v3: https://lore.kernel.org/r/20250316-vsock-trans-signal-race-v3-0-17a6862277c…
Changes in v3:
- Selftest: drop unnecessary variable initialization and reorder the calls
- Link to v2: https://lore.kernel.org/r/20250314-vsock-trans-signal-race-v2-0-421a41f60f4…
Changes in v2:
- Handle one more path of tripping the warning
- Add a selftest
- Collect R-b [Stefano]
- Link to v1: https://lore.kernel.org/r/20250307-vsock-trans-signal-race-v1-1-3aca3f771fb…
---
Michal Luczaj (3):
vsock/bpf: Fix EINTR connect() racing sockmap update
selftest/bpf: Add test for AF_VSOCK connect() racing sockmap update
vsock/bpf: Fix bpf recvmsg() racing transport reassignment
net/vmw_vsock/af_vsock.c | 10 ++-
net/vmw_vsock/vsock_bpf.c | 24 ++++--
.../selftests/bpf/prog_tests/sockmap_basic.c | 99 ++++++++++++++++++++++
3 files changed, 124 insertions(+), 9 deletions(-)
---
base-commit: da9e8efe7ee10e8425dc356a9fc593502c8e3933
change-id: 20250305-vsock-trans-signal-race-d62f7718d099
Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
From: Björn Töpel <bjorn(a)rivosinc.com>
There are scenarios where env.{sub,}test_state->stdout_saved can be
NULL, e.g. sometimes when the watchdog timeout kicks in, or if
open_memstream() is not available.
Avoid crashing test_progs by adding an explicit NULL check prior to the
fclose() call.
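The guard can be sketched in isolation as below. The helper name
close_saved_stream() is hypothetical and does not appear in test_progs.c;
the point is only that fclose() is not specified to accept a NULL stream,
so the check has to happen before the call.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: close the stream *fp points to only if it is
 * non-NULL, then clear the pointer so a repeated call is harmless.
 */
void close_saved_stream(FILE **fp)
{
	assert(fp != NULL);

	if (*fp)
		fclose(*fp);
	*fp = NULL;
}
```

Calling the helper twice on the same pointer is safe, because the second
call sees NULL and skips fclose().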
Signed-off-by: Björn Töpel <bjorn(a)rivosinc.com>
---
tools/testing/selftests/bpf/test_progs.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index d4ec9586b98c..309d9d4a8ace 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -103,12 +103,14 @@ static void stdio_restore(void)
pthread_mutex_lock(&stdout_lock);
if (env.subtest_state) {
- fclose(env.subtest_state->stdout_saved);
+ if (env.subtest_state->stdout_saved)
+ fclose(env.subtest_state->stdout_saved);
env.subtest_state->stdout_saved = NULL;
stdout = env.test_state->stdout_saved;
stderr = env.test_state->stdout_saved;
} else {
- fclose(env.test_state->stdout_saved);
+ if (env.test_state->stdout_saved)
+ fclose(env.test_state->stdout_saved);
env.test_state->stdout_saved = NULL;
stdout = env.stdout_saved;
stderr = env.stderr_saved;
base-commit: f3f8649585a445414521a6d5b76f41b51205086d
--
2.45.2
Hi all, this is the v3 version.
===
Syzkaller reported this issue [1].
The current sockmap has a dependency on sk_socket in both read and write
stages, but there is a possibility that sk->sk_socket is released during
the process, leading to panic situations. For a detailed reproduction,
please refer to the description in the v2:
https://lore.kernel.org/bpf/20250228055106.58071-1-jiayuan.chen@linux.dev/
The corresponding fix approaches are described in the commit messages of
each patch.
By the way, the current sockmap lacks statistical information, especially
global statistics, such as the number of successful or failed rx and tx
operations. These statistics cannot be obtained from the socket interface
itself.
These data will be of great help in troubleshooting issues and observing
sockmap behavior.
If the maintainers/reviewers do not object, I think we can provide this
statistical information in the future, through proc, tracing, or bpftool.
[1] https://syzkaller.appspot.com/bug?extid=dd90a702f518e0eac072
---
v2 -> v3:
1. Michal Luczaj reported a similar race in the sockmap sending path.
2. The RCU lock conflicts with mutex_lock in the unix socket read implementation.
https://lore.kernel.org/bpf/20250228055106.58071-1-jiayuan.chen@linux.dev/
v1 -> v2:
1. Add Fixes tag.
2. Extend selftest of edge case for TCP/UDP sockets.
3. Add Reviewed-by and Acked-by tag.
https://lore.kernel.org/bpf/20250226132242.52663-1-jiayuan.chen@linux.dev/T…
Jiayuan Chen (3):
bpf, sockmap: avoid using sk_socket after free when sending
bpf, sockmap: avoid using sk_socket after free when reading
selftests/bpf: Add edge case tests for sockmap
net/core/skmsg.c | 22 ++++++-
.../selftests/bpf/prog_tests/socket_helpers.h | 13 +++-
.../selftests/bpf/prog_tests/sockmap_basic.c | 60 +++++++++++++++++++
3 files changed, 91 insertions(+), 4 deletions(-)
--
2.47.1