So far KSM can only be enabled by calling madvise for memory regions. What is
required to enable KSM for more workloads is to enable / disable it at the
process / cgroup level.
1. New options for prctl system command
This patch series adds two new options to the prctl system call. The first
one allows to enable KSM at the process level and the second one to query the
setting.
The setting will be inherited by child processes.
With the above setting, KSM can be enabled for the seed process of a cgroup
and all processes in the cgroup will inherit the setting.
2. Changes to KSM processing
When KSM is enabled at the process level, the KSM code will iterate over all
the VMA's and enable KSM for the eligible VMA's.
When forking a process that has KSM enabled, the setting will be inherited by
the new child process.
In addition when KSM is disabled for a process, KSM will be disabled for the
VMA's where KSM has been enabled.
3. Add tracepoints to KSM
Currently KSM has no tracepoints. This adds tracepoints to the key KSM functions
to make it easier to debug KSM.
4. Add general_profit metric
The general_profit metric of KSM is specified in the documentation, but not
calculated. This adds the general profit metric to /sys/kernel/debug/mm/ksm.
5. Add more metrics to ksm_stat
This adds the process profit and ksm type metric to /proc/<pid>/ksm_stat.
6. Add more tests to ksm_tests
This adds an option to specify the merge type to the ksm_tests. This allows to
test madvise and prctl KSM. It also adds a new option to query if prctl KSM has
been enabled. It adds a fork test to verify that the KSM process setting is
inherited by client processes.
Stefan Roesch (20):
mm: add new flag to enable ksm per process
mm: add flag to __ksm_enter
mm: add flag to __ksm_exit call
mm: invoke madvise for all vmas in scan_get_next_rmap_item
mm: support disabling of ksm for a process
mm: add new prctl option to get and set ksm for a process
mm: add tracepoints to ksm
mm: split off pages_volatile function
mm: expose general_profit metric
docs: document general_profit sysfs knob
mm: calculate ksm process profit metric
mm: add ksm_merge_type() function
mm: expose ksm process profit metric in ksm_stat
mm: expose ksm merge type in ksm_stat
docs: document new procfs ksm knobs
tools: add new prctl flags to prctl in tools dir
selftests/vm: add KSM prctl merge test
selftests/vm: add KSM get merge type test
selftests/vm: add KSM fork test
selftests/vm: add two functions for debugging merge outcome
Documentation/ABI/testing/sysfs-kernel-mm-ksm | 8 +
Documentation/admin-guide/mm/ksm.rst | 8 +-
MAINTAINERS | 1 +
fs/proc/base.c | 5 +
include/linux/ksm.h | 19 +-
include/linux/sched/coredump.h | 1 +
include/trace/events/ksm.h | 257 ++++++++++++++++++
include/uapi/linux/prctl.h | 2 +
kernel/sys.c | 29 ++
mm/ksm.c | 134 ++++++++-
tools/include/uapi/linux/prctl.h | 2 +
tools/testing/selftests/vm/Makefile | 3 +-
tools/testing/selftests/vm/ksm_tests.c | 254 ++++++++++++++---
13 files changed, 665 insertions(+), 58 deletions(-)
create mode 100644 include/trace/events/ksm.h
base-commit: c1649ec55708ae42091a2f1bca1ab49ecd722d55
--
2.30.2
kvm selftests build fails with below info:
rseq_test.c:48:13: error: conflicting types for ‘sys_getcpu’; have ‘void(unsigned int *)’
48 | static void sys_getcpu(unsigned *cpu)
| ^~~~~~~~~~
In file included from rseq_test.c:23:
../rseq/rseq.c:82:12: note: previous definition of ‘sys_getcpu’ with type ‘int(unsigned int *, unsigned int *)’
82 | static int sys_getcpu(unsigned *cpu, unsigned *node)
| ^~~~~~~~~~
commit 66d42ac73fc6 ("KVM: selftests: Make rseq compatible with glibc-2.35")
has include "../rseq/rseq.c", and commit 99babd04b250 ("selftests/rseq: Implement rseq numa node id field selftest")
add sys_getcpu() implement, so use sys_getcpu in rseq/rseq.c to fix this.
Fixes: 99babd04b250 ("selftests/rseq: Implement rseq numa node id field selftest")
Signed-off-by: YueHaibing <yuehaibing(a)huawei.com>
---
tools/testing/selftests/kvm/rseq_test.c | 19 ++++++-------------
1 file changed, 6 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
index 3045fdf9bdf5..69ff39aa2991 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -41,18 +41,6 @@ static void guest_code(void)
GUEST_SYNC(0);
}
-/*
- * We have to perform direct system call for getcpu() because it's
- * not available until glic 2.29.
- */
-static void sys_getcpu(unsigned *cpu)
-{
- int r;
-
- r = syscall(__NR_getcpu, cpu, NULL, NULL);
- TEST_ASSERT(!r, "getcpu failed, errno = %d (%s)", errno, strerror(errno));
-}
-
static int next_cpu(int cpu)
{
/*
@@ -249,7 +237,12 @@ int main(int argc, char *argv[])
* across the seq_cnt reads.
*/
smp_rmb();
- sys_getcpu(&cpu);
+ /*
+ * We have to perform direct system call for getcpu() because it's
+ * not available until glic 2.29.
+ */
+ r = sys_getcpu(&cpu, NULL);
+ TEST_ASSERT(!r, "getcpu failed, errno = %d (%s)", errno, strerror(errno));
rseq_cpu = rseq_current_cpu_raw();
smp_rmb();
} while (snapshot != atomic_read(&seq_cnt));
--
2.34.1
The guest used in s390 kvm selftests is not be set up to handle all
instructions the compiler might emit, i.e. vector instructions, leading
to crashes.
Limit what the compiler emits to the oldest machine model currently
supported by Linux.
Signed-off-by: Nina Schoetterl-Glausch <nsg(a)linux.ibm.com>
---
Should we also set -mtune?
Since it are vector instructions that caused the problem here, there
are some alternatives:
* use -mno-vx
* set the required guest control bit to enable vector instructions on
models supporting them
-march=z10 might prevent similar issues with other instructions, but I
don't know if there actually exist other relevant instructions, so it
could be needlessly restricting.
tools/testing/selftests/kvm/Makefile | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 1750f91dd936..df0989949eb5 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -200,6 +200,9 @@ CFLAGS += -Wall -Wstrict-prototypes -Wuninitialized -O2 -g -std=gnu99 \
-I$(LINUX_TOOL_ARCH_INCLUDE) -I$(LINUX_HDR_PATH) -Iinclude \
-I$(<D) -Iinclude/$(ARCH_DIR) -I ../rseq -I.. $(EXTRA_CFLAGS) \
$(KHDR_INCLUDES)
+ifeq ($(ARCH),s390)
+ CFLAGS += -march=z10
+endif
no-pie-option := $(call try-run, echo 'int main(void) { return 0; }' | \
$(CC) -Werror $(CFLAGS) -no-pie -x c - -o "$$TMP", -no-pie)
--
2.34.1
"tcpdump" is used to capture traffic in these tests while using a random,
temporary and not suffixed file for it. This can interfere with apparmor
configuration where the tool is only allowed to read from files with
'known' extensions.
The MINE type application/vnd.tcpdump.pcap was registered with IANA for
pcap files and .pcap is the extension that is both most common but also
aligned with standard apparmor configurations. See TCPDUMP(8) for more
details.
This improves compatibility with standard apparmor configurations by
using ".pcap" as the file extension for the tests' temporary files.
Signed-off-by: Andrei Gherzan <andrei.gherzan(a)canonical.com>
---
tools/testing/selftests/net/cmsg_ipv6.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/cmsg_ipv6.sh b/tools/testing/selftests/net/cmsg_ipv6.sh
index 2d89cb0ad288..330d0b1ceced 100755
--- a/tools/testing/selftests/net/cmsg_ipv6.sh
+++ b/tools/testing/selftests/net/cmsg_ipv6.sh
@@ -6,7 +6,7 @@ ksft_skip=4
NS=ns
IP6=2001:db8:1::1/64
TGT6=2001:db8:1::2
-TMPF=`mktemp`
+TMPF=$(mktemp --suffix ".pcap")
cleanup()
{
--
2.34.1
Akanksha J N wrote:
> Commit 97f88a3d723162 ("powerpc/kprobes: Fix null pointer reference in
> arch_prepare_kprobe()") fixed a recent kernel oops that was caused as
> ftrace-based kprobe does not generate kprobe::ainsn::insn and it gets
> set to NULL.
> Extend multiple kprobes test to add kprobes on first 256 bytes within a
> function, to be able to test potential issues with kprobes on
> successive instructions.
> The '|| true' is added with the echo statement to ignore errors that are
> caused by trying to add kprobes to non probeable lines and continue with
> the test.
>
> Signed-off-by: Akanksha J N <akanksha(a)linux.ibm.com>
> ---
> .../selftests/ftrace/test.d/kprobe/multiple_kprobes.tc | 4 ++++
> 1 file changed, 4 insertions(+)
Thanks for adding this test!
>
> diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/multiple_kprobes.tc b/tools/testing/selftests/ftrace/test.d/kprobe/multiple_kprobes.tc
> index be754f5bcf79..f005c2542baa 100644
> --- a/tools/testing/selftests/ftrace/test.d/kprobe/multiple_kprobes.tc
> +++ b/tools/testing/selftests/ftrace/test.d/kprobe/multiple_kprobes.tc
> @@ -25,6 +25,10 @@ if [ $L -ne 256 ]; then
> exit_fail
> fi
>
> +for i in `seq 0 255`; do
> + echo p $FUNCTION_FORK+${i} >> kprobe_events || true
> +done
> +
> cat kprobe_events >> $testlog
>
> echo 1 > events/kprobes/enable
Thinking about this more, I wonder if we should add an explicit fork
after enabling the events, similar to kprobe_args.tc:
( echo "forked" )
That will ensure we hit all the probes we added. With that change:
Acked-by: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
- Naveen
There are scenes that we want to show the character value of traced
arguments other than a decimal or hexadecimal or string value for debug
convinience. I add a new type named 'char' to do it and a new test case
file named 'kprobe_args_char.tc' to do selftest for char type.
For example:
The to be traced function is 'void demo_func(char type, char *name);', we
can add a kprobe event as follows to show argument values as we want:
echo 'p:myprobe demo_func $arg1:char +0($arg2):char[5]' > kprobe_events
we will get the following trace log:
... myprobe: (demo_func+0x0/0x29) arg1='A' arg2={'b','p','f','1',''}
Signed-off-by: Donglin Peng <dolinux.peng(a)gmail.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Reported-by: kernel test robot <lkp(a)intel.com>
---
Changes in v6:
- change "\'%c\'" to "'%c'" in trace_probe.c
Changes in v5:
- wrap the output character with single quotes
- add a test case named kprobe_args_char.tc to do selftest
Changes in v4:
- update the example in the commit log
Changes in v3:
- update readme_msg
Changes in v2:
- fix build warnings reported by kernel test robot
- modify commit log
---
Documentation/trace/kprobetrace.rst | 3 +-
kernel/trace/trace.c | 2 +-
kernel/trace/trace_probe.c | 2 +
kernel/trace/trace_probe.h | 1 +
.../ftrace/test.d/kprobe/kprobe_args_char.tc | 47 +++++++++++++++++++
5 files changed, 53 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_char.tc
diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst
index 4274cc6a2f94..007972a3c5c4 100644
--- a/Documentation/trace/kprobetrace.rst
+++ b/Documentation/trace/kprobetrace.rst
@@ -58,7 +58,7 @@ Synopsis of kprobe_events
NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
(u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types
- (x8/x16/x32/x64), "string", "ustring" and bitfield
+ (x8/x16/x32/x64), "char", "string", "ustring" and bitfield
are supported.
(\*1) only for the probe on function entry (offs == 0).
@@ -80,6 +80,7 @@ E.g. 'x16[4]' means an array of x16 (2bytes hex) with 4 elements.
Note that the array can be applied to memory type fetchargs, you can not
apply it to registers/stack-entries etc. (for example, '$stack1:x8[8]' is
wrong, but '+8($stack):x8[8]' is OK.)
+Char type can be used to show the character value of traced arguments.
String type is a special type, which fetches a "null-terminated" string from
kernel space. This means it will fail and store NULL if the string container
has been paged out. "ustring" type is an alternative of string for user-space.
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 6d7ef130f57e..c602081e64c8 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5615,7 +5615,7 @@ static const char readme_msg[] =
"\t $stack<index>, $stack, $retval, $comm,\n"
#endif
"\t +|-[u]<offset>(<fetcharg>), \\imm-value, \\\"imm-string\"\n"
- "\t type: s8/16/32/64, u8/16/32/64, x8/16/32/64, string, symbol,\n"
+ "\t type: s8/16/32/64, u8/16/32/64, x8/16/32/64, char, string, symbol,\n"
"\t b<bit-width>@<bit-offset>/<container-size>, ustring,\n"
"\t <type>\\[<array-size>\\]\n"
#ifdef CONFIG_HIST_TRIGGERS
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index bb2f95d7175c..794a21455396 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -50,6 +50,7 @@ DEFINE_BASIC_PRINT_TYPE_FUNC(x8, u8, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(x16, u16, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(x32, u32, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(x64, u64, "0x%Lx")
+DEFINE_BASIC_PRINT_TYPE_FUNC(char, u8, "'%c'")
int PRINT_TYPE_FUNC_NAME(symbol)(struct trace_seq *s, void *data, void *ent)
{
@@ -93,6 +94,7 @@ static const struct fetch_type probe_fetch_types[] = {
ASSIGN_FETCH_TYPE_ALIAS(x16, u16, u16, 0),
ASSIGN_FETCH_TYPE_ALIAS(x32, u32, u32, 0),
ASSIGN_FETCH_TYPE_ALIAS(x64, u64, u64, 0),
+ ASSIGN_FETCH_TYPE_ALIAS(char, u8, u8, 0),
ASSIGN_FETCH_TYPE_ALIAS(symbol, ADDR_FETCH_TYPE, ADDR_FETCH_TYPE, 0),
ASSIGN_FETCH_TYPE_END
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index de38f1c03776..8c86aaa8b0c9 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -164,6 +164,7 @@ DECLARE_BASIC_PRINT_TYPE_FUNC(x16);
DECLARE_BASIC_PRINT_TYPE_FUNC(x32);
DECLARE_BASIC_PRINT_TYPE_FUNC(x64);
+DECLARE_BASIC_PRINT_TYPE_FUNC(char);
DECLARE_BASIC_PRINT_TYPE_FUNC(string);
DECLARE_BASIC_PRINT_TYPE_FUNC(symbol);
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_char.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_char.tc
new file mode 100644
index 000000000000..285b4770efad
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_char.tc
@@ -0,0 +1,47 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Kprobe event char type argument
+# requires: kprobe_events
+
+case `uname -m` in
+x86_64)
+ ARG1=%di
+;;
+i[3456]86)
+ ARG1=%ax
+;;
+aarch64)
+ ARG1=%x0
+;;
+arm*)
+ ARG1=%r0
+;;
+ppc64*)
+ ARG1=%r3
+;;
+ppc*)
+ ARG1=%r3
+;;
+s390*)
+ ARG1=%r2
+;;
+mips*)
+ ARG1=%r4
+;;
+*)
+ echo "Please implement other architecture here"
+ exit_untested
+esac
+
+: "Test get argument (1)"
+echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):char" > kprobe_events
+echo 1 > events/kprobes/testprobe/enable
+echo "p:test $FUNCTION_FORK" >> kprobe_events
+grep -qe "testprobe.* arg1='t'" trace
+
+echo 0 > events/kprobes/testprobe/enable
+: "Test get argument (2)"
+echo "p:testprobe tracefs_create_dir arg1=+0(${ARG1}):char arg2=+0(${ARG1}):char[4]" > kprobe_events
+echo 1 > events/kprobes/testprobe/enable
+echo "p:test $FUNCTION_FORK" >> kprobe_events
+grep -qe "testprobe.* arg1='t' arg2={'t','e','s','t'}" trace
--
2.25.1
Hi Linus,
Please pull the following Kselftest fixes update for Linux 6.2-rc6.
This Kselftest fixes update for Linux 6.2-rc6 consists of a single
fix to a amd-pstate test Makefile bug that deletes source files
during make clean run.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 9fdaca2c1e157dc0a3c0faecf3a6a68e7d8d0c7b:
kselftest: Fix error message for unconfigured LLVM builds (2023-01-12 13:38:04 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux-kselftest-fixes-6.2-rc6
for you to fetch changes up to a49fb7218ed84a4c5e6c56b9fd933498b9730912:
selftests: amd-pstate: Don't delete source files via Makefile (2023-01-25 10:01:35 -0700)
----------------------------------------------------------------
linux-kselftest-fixes-6.2-rc6
This Kselftest fixes update for Linux 6.2-rc6 consists of a single
fix to a amd-pstate test Makefile bug that deletes source files
during make clean run.
----------------------------------------------------------------
Doug Smythies (1):
selftests: amd-pstate: Don't delete source files via Makefile
tools/testing/selftests/amd-pstate/Makefile | 5 -----
1 file changed, 5 deletions(-)
----------------------------------------------------------------
The net/bpf Makefile uses a similar build infrastructure to BPF[1] while
building libbpf as a dependency of nat6to4. This change adds a .gitignore
entry for SCRATCH_DIR where libbpf and its headers end up built/installed.
[1] Introduced in commit 837a3d66d698 ("selftests: net: Add
cross-compilation support for BPF programs")
Signed-off-by: Andrei Gherzan <andrei.gherzan(a)canonical.com>
---
tools/testing/selftests/net/.gitignore | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore
index a6911cae368c..0d07dd13c973 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -40,6 +40,7 @@ test_unix_oob
timestamping
tls
toeplitz
+/tools
tun
txring_overwrite
txtimestamp
--
2.34.1
The udpgro_frglist.sh uses nat6to4.o which is tested for existence in
bpf/nat6to4.o (relative to the script). This is where the object is
compiled. Even so, the script attempts to use it as part of tc with a
different path (../bpf/nat6to4.o). As a consequence, this fails the script:
Error opening object ../bpf/nat6to4.o: No such file or directory
Cannot initialize ELF context!
Unable to load program
This change refactors these references to use a variable for consistency
and also reformats two long lines.
Signed-off-by: Andrei Gherzan <andrei.gherzan(a)canonical.com>
---
tools/testing/selftests/net/udpgro_frglist.sh | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/net/udpgro_frglist.sh b/tools/testing/selftests/net/udpgro_frglist.sh
index c9c4b9d65839..1fdf2d53944d 100755
--- a/tools/testing/selftests/net/udpgro_frglist.sh
+++ b/tools/testing/selftests/net/udpgro_frglist.sh
@@ -6,6 +6,7 @@
readonly PEER_NS="ns-peer-$(mktemp -u XXXXXX)"
BPF_FILE="../bpf/xdp_dummy.bpf.o"
+BPF_NAT6TO4_FILE="./bpf/nat6to4.o"
cleanup() {
local -r jobs="$(jobs -p)"
@@ -40,8 +41,12 @@ run_one() {
ip -n "${PEER_NS}" link set veth1 xdp object ${BPF_FILE} section xdp
tc -n "${PEER_NS}" qdisc add dev veth1 clsact
- tc -n "${PEER_NS}" filter add dev veth1 ingress prio 4 protocol ipv6 bpf object-file ../bpf/nat6to4.o section schedcls/ingress6/nat_6 direct-action
- tc -n "${PEER_NS}" filter add dev veth1 egress prio 4 protocol ip bpf object-file ../bpf/nat6to4.o section schedcls/egress4/snat4 direct-action
+ tc -n "${PEER_NS}" filter add dev veth1 ingress prio 4 protocol \
+ ipv6 bpf object-file "$BPF_NAT6TO4_FILE" section \
+ schedcls/ingress6/nat_6 direct-action
+ tc -n "${PEER_NS}" filter add dev veth1 egress prio 4 protocol \
+ ip bpf object-file "$BPF_NAT6TO4_FILE" section \
+ schedcls/egress4/snat4 direct-action
echo ${rx_args}
ip netns exec "${PEER_NS}" ./udpgso_bench_rx ${rx_args} -r &
@@ -88,7 +93,7 @@ if [ ! -f ${BPF_FILE} ]; then
exit -1
fi
-if [ ! -f bpf/nat6to4.o ]; then
+if [ ! -f "$BPF_NAT6TO4_FILE" ]; then
echo "Missing nat6to4 helper. Build bpfnat6to4.o selftest first"
exit -1
fi
--
2.34.1