On 6/17/20 5:23 AM, Petr Mladek wrote:
On Mon 2020-06-15 13:27:54, Joe Lawrence wrote:
The dmesg utility already comes with a command line switch to omit kernel timestamps, so let's use it instead of applying an extra regex to filter them out.
Now without the '[timestamp]: ' prefix at the beginning of the log entry, revise the filtering regex to search for the 'livepatch:' subsystem prefix at the beginning of the line.
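For illustration only, the anchored filter described above could look roughly like the sketch below. It is not the code from the patch: the log lines are simulated rather than read from `dmesg --notime`, and the non-livepatch messages are made up.

```shell
#!/bin/bash
# Simulated log lines in the shape printed by `dmesg --notime`
# (no '[timestamp] ' prefix). The non-livepatch lines are hypothetical.
log="livepatch: enabling patch 'test_klp_livepatch'
usb 1-1: hypothetical unrelated message
foo: mentions livepatch: in the middle of the line"

# Anchor the subsystem prefix at the start of the line, so that only
# messages emitted by the livepatch core are kept.
filtered=$(printf '%s\n' "$log" | grep '^livepatch:')

printf '%s\n' "$filtered"
```

Only the first simulated line survives the filter; the last one mentions 'livepatch:' but not at the start of the line.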
I wanted to push this patchset and ran the full test suite after each patch. Suddenly the tests started to fail, for example:
Hi Petr,
Thank you for running additional tests on your end. I ran this on x86_64, ppc64le and s390 across a bunch of hosts and VMs, but never repeatedly, so I never saw this interesting combination of issues.
$/tools/testing/selftests/livepatch> ./test-livepatch.sh
TEST: basic function patching ... ok
TEST: multiple livepatches ... not ok

--- expected
+++ result
@@ -1,3 +1,9 @@
+% echo 0 > /sys/kernel/livepatch/test_klp_livepatch/enabled
+livepatch: 'test_klp_livepatch': initializing unpatching transition
+livepatch: 'test_klp_livepatch': starting unpatching transition
+livepatch: 'test_klp_livepatch': completing unpatching transition
+livepatch: 'test_klp_livepatch': unpatching complete
+% rmmod test_klp_livepatch
 % modprobe test_klp_livepatch
 livepatch: enabling patch 'test_klp_livepatch'
 livepatch: 'test_klp_livepatch': initializing patching transition
@@ -20,9 +26,3 @@ livepatch: 'test_klp_atomic_replace': co
 livepatch: 'test_klp_atomic_replace': unpatching complete
 % rmmod test_klp_atomic_replace
 test_klp_livepatch: this has been live patched
-% echo 0 > /sys/kernel/livepatch/test_klp_livepatch/enabled
-livepatch: 'test_klp_livepatch': initializing unpatching transition
-livepatch: 'test_klp_livepatch': starting unpatching transition
-livepatch: 'test_klp_livepatch': completing unpatching transition
-livepatch: 'test_klp_livepatch': unpatching complete
-% rmmod test_klp_livepatch

ERROR: livepatch kselftest(s) failed
The problem is a combination of:
- the 1st patch, which means that old kernel messages are no longer cleared
- the 2nd patch, which removes timestamps from the diff
- the oldest messages being lost because the internal kernel log buffer overflowed
- running the same tests several times
As a result, the diff might match with an incomplete log from the previous run.
D'oh. The referenced commit f131d9edc29d uses dmesg without any options, so it didn't suffer this gotcha.
Everything works when this 2nd patch is not committed. The timestamp helps to distinguish old and new messages. The lost messages are ignored thanks to the diff parameters:
--changed-group-format='%>' --unchanged-group-format=''
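
The effect can be reproduced with a small sketch. It is only an illustration under assumptions: the two logs are simulated files instead of real dmesg output, the patch name 'demo' and the timestamps are made up, and the helpers new_lines and strip_ts are hypothetical names, not functions from the selftests. It needs GNU diff.

```shell
#!/bin/bash
# Sketch with simulated logs; no real dmesg output is read.

# Print the lines of the current log ($2) that diff reports as new
# relative to the saved log ($1). GNU diff exits with status 1 when
# the inputs differ, so its exit status is ignored.
new_lines() {
    diff --changed-group-format='%>' --unchanged-group-format='' "$1" "$2" || true
}

# Drop a leading '[timestamp] ' prefix from every line of file $1.
strip_ts() {
    sed 's/^\[[ 0-9.]*\] //' "$1"
}

tmp=$(mktemp -d)

# Log saved before the second run: the complete first run.
cat > "$tmp/saved" <<'EOF'
[   10.000000] livepatch: 'demo': starting patching transition
[   11.000000] livepatch: 'demo': patching complete
EOF

# Log after the second run: the oldest line of the first run was lost
# to a buffer overflow, then the same test logged the same messages.
cat > "$tmp/current" <<'EOF'
[   11.000000] livepatch: 'demo': patching complete
[   20.000000] livepatch: 'demo': starting patching transition
[   21.000000] livepatch: 'demo': patching complete
EOF

# With timestamps, the old and new lines can never be confused.
with_ts=$(new_lines "$tmp/saved" "$tmp/current")

# Without timestamps, the new run matches the saved one line for line.
strip_ts "$tmp/saved" > "$tmp/saved.nots"
strip_ts "$tmp/current" > "$tmp/current.nots"
without_ts=$(new_lines "$tmp/saved.nots" "$tmp/current.nots")

printf 'with timestamps:\n%s\n\n' "$with_ts"
printf 'without timestamps:\n%s\n' "$without_ts"

rm -rf "$tmp"
```

With timestamps, the result is exactly the two lines of the second run. Without them, diff pairs the second run with the saved first run and reports only the leftover 'patching complete' line, so the result no longer matches what the test expects.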
If you agree, I'll solve this problem by not committing this patch to the livepatch.git repo.
Very good catch and explanation. I'd be okay w/skipping the 2nd patch, hopefully the others don't conflict too much by removing it.
It would be great to add a comment that the timestamp is actually important. But that could be done in a follow-up patch.
Yeah, something that subtle should probably have a comment to that effect. We all thought this was the "easy" change in the set, but never thought it through :)
-- Joe