Introduce a new kselftest to identify slowdowns in key boot events.
This test uses ftrace to monitor the start and end times, as well as
the durations of all initcalls, and compares these timings to reference
values to identify significant slowdowns.
The script operates in two modes: the 'generate' mode creates
a JSON file containing initial reference timings for all initcalls from
a known stable kernel. The 'test' mode can be used during subsequent
boots to assess current timings against the reference values and
determine if there are any significant differences.
The test ships with a bootconfig file for setting up ftrace and a
configuration fragment for the necessary kernel configs.
Signed-off-by: Laura Nao <laura.nao(a)collabora.com>
---
Hello,
This v2 is a follow-up to RFCv1[1] and includes changes based on feedback
from the LPC 2024 session [2], along with some other fixes.
[1] https://lore.kernel.org/all/20240725110622.96301-1-laura.nao@collabora.com/
[2] https://www.youtube.com/watch?v=rWhW2-Vzi40
After reviewing other available tests and considering the feedback from
discussions at Plumbers, I decided to stick with the bootconfig file
approach but extend it to track all initcalls instead of a fixed set of
functions or events. The bootconfig file can be expanded and adapted to
track additional functions if needed for specific use cases.
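
For illustration only (this is not part of the patch), the histogram that
the bootconfig sets up can be read back one line at a time, roughly as
sketched below. The regular expression has the same shape as the one in
test_boot_time.py; the sample line, including its address, function name
and timings, is made up, and real histogram output may be spaced
differently.

```python
import re

# Same shape as the pattern used by test_boot_time.py in this patch.
HIST_LINE = re.compile(
    r'\{ func: \[\w+\] ([\w_]+)\s*, start: *(\d+), end: *(\d+) \} '
    r'hitcount: *\d+ lat: *(\d+)'
)


def parse_hist_line(line):
    """Return (func, timings) for a histogram line, or None if no match."""
    match = HIST_LINE.search(line)
    if match is None:
        return None
    func, start, end, lat = match.groups()
    return func, {"start": int(start), "end": int(end), "latency": int(lat)}


# Hypothetical histogram line: address, name and timings are invented.
sample = ("{ func: [ffffffff81234567] example_init      , start:    1200000, "
          "end:    1201500 } hitcount:          1 lat:       1500")

print(parse_hist_line(sample))
```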
I also defined a synthetic event to calculate initcall durations, while
still tracking their start and end times. Users are then allowed to choose
whether to compare start times, end times, or durations. Support for
specifying different rules for different initcalls has also been added.
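
A minimal sketch of how the per-initcall rules combine with the default
threshold, mirroring the logic in run_test() from the patch. The helper
names and the example rule are hypothetical and only serve the
illustration.

```python
import re


def effective_delta(func_name, default_delta, overrides):
    """Return the delta of the first matching override, else the default."""
    for regex, delta in overrides.items():
        if re.match(regex, func_name):
            return delta
    return default_delta


def is_slowdown(ref_value, cur_value, delta):
    """Only growth of at least `delta` usecs counts as a regression."""
    return cur_value > ref_value and (cur_value - ref_value) >= delta


# Hypothetical rule: allow ACPI initcalls only 50 usecs of extra time.
overrides = {r"^acpi_.*": 50}

for name in ("acpi_init", "pci_init"):
    print(name, effective_delta(name, 1000, overrides))
```

Values that got faster never fail the comparison, whatever the metric
(start, end or latency) being compared.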
In RFCv1, there was some discussion about using existing tools like
bootgraph.py. However, the output from these tools is mainly for manual
inspection (e.g., HTML visual output), whereas this test is designed to run
in automated CI environments too. The kselftest proposed here combines the
process of generating reference data and running tests into a single script
with two modes, making it easy to integrate into automated workflows.
Many of the features in this v2 (e.g., generating a JSON reference file,
comparing timings, and reporting results in KTAP format) could potentially
be integrated into bootgraph.py with some effort.
However, since this test is intended for automated execution rather than
manual use, I've decided to keep it separate for now and explore the
options suggested at LPC, such as using ftrace histograms for initcall
latencies. I'm open to revisiting this decision and working toward
integrating the changes into bootgraph.py if there's a strong preference
for unifying the tools.
Let me know your thoughts.
A comprehensive changelog is reported below.
Thanks,
Laura
---
Changes in v2:
- Updated ftrace configuration to track all initcall start times, end
times, and durations, and generate a histogram.
- Modified test logic to compare initcall durations by default, with the
option to compare start or end times if needed.
- Added warnings if the initcalls in the reference file differ from those
detected in the running system.
- Combined the scripts into a single script with two modes: one for
generating the reference file and one for running the test.
- Added support for specifying different rules for individual initcalls.
- Switched the reference format from YAML to JSON.
- Added metadata to the reference file, including kernel version, kernel
configuration, and cmdline.
- Link to v1: https://lore.kernel.org/all/20240725110622.96301-1-laura.nao@collabora.com/
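
As a rough sketch of the reference file layout and of the mismatch
warnings listed above: the "metadata"/"data" keys follow the script in
this patch, while the kernel version, cmdline, initcall names and the
diff_initcalls() helper are invented for the example.

```python
import json


def diff_initcalls(ref_entries, cur_entries):
    """Return (missing_in_current, missing_in_reference) as sorted lists."""
    ref_names = set(ref_entries)
    cur_names = set(cur_entries)
    return sorted(ref_names - cur_names), sorted(cur_names - ref_names)


# Hypothetical reference file content.
reference = {
    "metadata": {"version": "6.12.0-example", "cmdline": "console=ttyS0"},
    "data": {
        "a_init": {"start": 10, "end": 30, "latency": 20},
        "b_init": {"start": 30, "end": 45, "latency": 15},
    },
}

# Round-trip through JSON, as the generate and test modes would.
loaded = json.loads(json.dumps(reference, indent=4))

# Hypothetical timings from the running system.
current = {
    "b_init": {"start": 31, "end": 47, "latency": 16},
    "c_init": {"start": 47, "end": 50, "latency": 3},
}

missing, new = diff_initcalls(loaded["data"], current)
print("not found in current data:", missing)
print("not found in reference data:", new)
```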
---
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/boot-time/Makefile | 16 ++
tools/testing/selftests/boot-time/bootconfig | 15 +
tools/testing/selftests/boot-time/config | 6 +
.../selftests/boot-time/test_boot_time.py | 265 ++++++++++++++++++
5 files changed, 303 insertions(+)
create mode 100644 tools/testing/selftests/boot-time/Makefile
create mode 100644 tools/testing/selftests/boot-time/bootconfig
create mode 100644 tools/testing/selftests/boot-time/config
create mode 100755 tools/testing/selftests/boot-time/test_boot_time.py
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index b38199965f99..1bb20d1e3854 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -3,6 +3,7 @@ TARGETS += acct
TARGETS += alsa
TARGETS += amd-pstate
TARGETS += arm64
+TARGETS += boot-time
TARGETS += bpf
TARGETS += breakpoints
TARGETS += cachestat
diff --git a/tools/testing/selftests/boot-time/Makefile b/tools/testing/selftests/boot-time/Makefile
new file mode 100644
index 000000000000..cdcdc1bbe779
--- /dev/null
+++ b/tools/testing/selftests/boot-time/Makefile
@@ -0,0 +1,16 @@
+PY3 = $(shell which python3 2>/dev/null)
+
+ifneq ($(PY3),)
+
+TEST_PROGS := test_boot_time.py
+
+include ../lib.mk
+
+else
+
+all: no_py3_warning
+
+no_py3_warning:
+ @echo "Missing python3. This test will be skipped."
+
+endif
\ No newline at end of file
diff --git a/tools/testing/selftests/boot-time/bootconfig b/tools/testing/selftests/boot-time/bootconfig
new file mode 100644
index 000000000000..e4b89a33b7a3
--- /dev/null
+++ b/tools/testing/selftests/boot-time/bootconfig
@@ -0,0 +1,15 @@
+ftrace.event {
+ synthetic.initcall_latency {
+ # Synthetic event to record initcall latency, start, and end times
+ fields = "unsigned long func", "u64 lat", "u64 start", "u64 end"
+ actions = "hist:keys=func.sym,start,end:vals=lat:sort=lat"
+ }
+ initcall.initcall_start {
+ # Capture the start time (ts0) when initcall starts
+ actions = "hist:keys=func:ts0=common_timestamp.usecs"
+ }
+ initcall.initcall_finish {
+ # Capture the end time, calculate latency, and trigger synthetic event
+ actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:start=$ts0:end=common_timestamp.usecs:onmatch(initcall.initcall_start).initcall_latency(func,$lat,$start,$end)"
+ }
+}
\ No newline at end of file
diff --git a/tools/testing/selftests/boot-time/config b/tools/testing/selftests/boot-time/config
new file mode 100644
index 000000000000..bcb646ec3cd8
--- /dev/null
+++ b/tools/testing/selftests/boot-time/config
@@ -0,0 +1,6 @@
+CONFIG_TRACING=y
+CONFIG_BOOTTIME_TRACING=y
+CONFIG_BOOT_CONFIG_EMBED=y
+CONFIG_BOOT_CONFIG_EMBED_FILE="tools/testing/selftests/boot-time/bootconfig"
+CONFIG_SYNTH_EVENTS=y
+CONFIG_HIST_TRIGGERS=y
\ No newline at end of file
diff --git a/tools/testing/selftests/boot-time/test_boot_time.py b/tools/testing/selftests/boot-time/test_boot_time.py
new file mode 100755
index 000000000000..556dacf04b6d
--- /dev/null
+++ b/tools/testing/selftests/boot-time/test_boot_time.py
@@ -0,0 +1,265 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2024 Collabora Ltd
+#
+# This script reads the
+# /sys/kernel/debug/tracing/events/synthetic/initcall_latency/hist file,
+# extracts function names and timings, and compares them against reference
+# timings provided in an input JSON file to identify significant boot
+# slowdowns.
+# The script operates in two modes:
+# - Generate Mode: parses initcall timings from the current kernel's ftrace
+# event histogram and generates a JSON reference file with function
+# names, start times, end times, and latencies.
+# - Test Mode: compares current initcall timings against the reference
+# file, allowing users to define a maximum allowed difference between the
+# values (delta). Users can also apply custom delta thresholds for
+# specific initcalls using regex-based overrides. The comparison can be
+# done on latency, start, or end times.
+#
+
+import os
+import sys
+import argparse
+import gzip
+import json
+import re
+import subprocess
+
+this_dir = os.path.dirname(os.path.realpath(__file__))
+sys.path.append(os.path.join(this_dir, "../kselftest/"))
+
+import ksft
+
+def load_reference_from_json(file_path):
+ """
+ Load reference data from a JSON file and returns the parsed data.
+ @file_path: path to the JSON file.
+ """
+
+ try:
+ with open(file_path, 'r', encoding="utf-8") as file:
+ return json.load(file)
+ except FileNotFoundError:
+ ksft.print_msg(f"Error: File {file_path} not found.")
+ ksft.exit_fail()
+ except json.JSONDecodeError:
+ ksft.print_msg(f"Error: Failed to decode JSON from {file_path}.")
+ ksft.exit_fail()
+
+
+def mount_debugfs(path):
+ """
+ Mount debugfs at the specified path if it is not already mounted.
+ @path: path where debugfs should be mounted
+ """
+ # Check if debugfs is already mounted
+ with open('/proc/mounts', 'r', encoding="utf-8") as mounts:
+ for line in mounts:
+ if 'debugfs' in line and path in line:
+ print(f"debugfs is already mounted at {path}")
+ return True
+
+ # Mount debugfs
+ try:
+ subprocess.run(['mount', '-t', 'debugfs', 'none', path], check=True)
+ return True
+ except subprocess.CalledProcessError as e:
+ print(f"Failed to mount debugfs: {e.stderr}")
+ return False
+
+
+def ensure_unique_function_name(func, initcall_entries):
+ """
+ Ensure the function name is unique by appending a suffix if necessary.
+ @func: the original function name.
+ @initcall_entries: a dictionary containing parsed initcall entries.
+ """
+ i = 2
+ base_func = func
+ while func in initcall_entries:
+ func = f'{base_func}[{i}]'
+ i += 1
+ return func
+
+
+def parse_initcall_latency_hist():
+ """
+ Parse the ftrace histogram for the initcall_latency event, extracting
+ function names, start times, end times, and latencies. Return a
+ dictionary where each entry is structured as follows:
+ {
+ <function symbolic name>: {
+ "start": <start time>,
+ "end": <end time>,
+ "latency": <latency>
+ }
+ }
+ """
+
+ pattern = re.compile(r'\{ func: \[\w+\] ([\w_]+)\s*, start: *(\d+), end: *(\d+) \} hitcount: *\d+ lat: *(\d+)')
+ initcall_entries = {}
+
+ try:
+ with open('/sys/kernel/debug/tracing/events/synthetic/initcall_latency/hist', 'r', encoding="utf-8") as hist_file:
+ for line in hist_file:
+ match = pattern.search(line)
+ if match:
+ func = match.group(1).strip()
+ start = int(match.group(2))
+ end = int(match.group(3))
+ latency = int(match.group(4))
+
+ # filter out unresolved names
+ if not func.startswith("0x"):
+ func = ensure_unique_function_name(func, initcall_entries)
+
+ initcall_entries[func] = {
+ "start": start,
+ "end": end,
+ "latency": latency
+ }
+ except FileNotFoundError:
+ print("Error: Histogram file not found.")
+
+ return initcall_entries
+
+
+def compare_initcall_list(ref_initcall_entries, cur_initcall_entries):
+ """
+ Compare the current list of initcall functions against the reference
+ file. Print warnings if there are unique entries in either.
+ @ref_initcall_entries: reference initcall entries.
+ @cur_initcall_entries: current initcall entries.
+ """
+ ref_entries = set(ref_initcall_entries.keys())
+ cur_entries = set(cur_initcall_entries.keys())
+
+ unique_to_ref = ref_entries - cur_entries
+ unique_to_cur = cur_entries - ref_entries
+
+ if (unique_to_ref):
+ ksft.print_msg(
+ f"Warning: {list(unique_to_ref)} not found in current data. Consider updating reference file.")
+ if unique_to_cur:
+ ksft.print_msg(
+ f"Warning: {list(unique_to_cur)} not found in reference data. Consider updating reference file.")
+
+
+def run_test(ref_file_path, delta, overrides, mode):
+ """
+ Run the test comparing the current timings with the reference values.
+ @ref_file_path: path to the JSON file containing reference values.
+ @delta: default allowed difference between reference and current
+ values.
+ @overrides: override rules in the form of regex:threshold.
+ @mode: the comparison mode (either 'start', 'end', or 'latency').
+ """
+
+ ref_data = load_reference_from_json(ref_file_path)
+
+ ref_initcall_entries = ref_data['data']
+ cur_initcall_entries = parse_initcall_latency_hist()
+
+ compare_initcall_list(ref_initcall_entries, cur_initcall_entries)
+
+ ksft.set_plan(len(ref_initcall_entries))
+
+ for func_name in ref_initcall_entries:
+ effective_delta = delta
+ for regex, override_delta in overrides.items():
+ if re.match(regex, func_name):
+ effective_delta = override_delta
+ break
+ if (func_name in cur_initcall_entries):
+ ref_metric = ref_initcall_entries[func_name].get(mode)
+ cur_metric = cur_initcall_entries[func_name].get(mode)
+ if (cur_metric > ref_metric and (cur_metric - ref_metric) >= effective_delta):
+ ksft.test_result_fail(func_name)
+ ksft.print_msg(f"'{func_name}' {mode} differs by "
+ f"{(cur_metric - ref_metric)} usecs.")
+ else:
+ ksft.test_result_pass(func_name)
+ else:
+ ksft.test_result_skip(func_name)
+
+
+def generate_reference_file(file_path):
+ """
+ Generate a reference file in JSON format, containing kernel metadata
+ and initcall timing data.
+ @file_path: output file path.
+ """
+ metadata = {}
+
+ config_file = "/proc/config.gz"
+ if os.path.isfile(config_file):
+ with gzip.open(config_file, "rt", encoding="utf-8") as f:
+ config = f.read()
+ metadata["config"] = config
+
+ metadata["version"] = os.uname().release
+
+ cmdline_file = "/proc/cmdline"
+ if os.path.isfile(cmdline_file):
+ with open(cmdline_file, "r", encoding="utf-8") as f:
+ cmdline = f.read().strip()
+ metadata["cmdline"] = cmdline
+
+ ref_data = {
+ "metadata": metadata,
+ "data": parse_initcall_latency_hist(),
+ }
+
+ with open(file_path, "w", encoding='utf-8') as f:
+ json.dump(ref_data, f, indent=4)
+ print(f"Generated {file_path}")
+
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser(
+ description="")
+
+ subparsers = parser.add_subparsers(dest='mode', required=True, help='Choose between generate or test modes')
+
+ generate_parser = subparsers.add_parser('generate', help="Generate a reference file")
+ generate_parser.add_argument('out_ref_file', nargs='?', default='reference_initcall_timings.json',
+ help='Path to output JSON reference file (default: reference_initcall_timings.json)')
+
+ compare_parser = subparsers.add_parser('test', help='Test against a reference file')
+ compare_parser.add_argument('in_ref_file', help='Path to JSON reference file')
+ compare_parser.add_argument(
+ 'delta', type=int, help='Maximum allowed delta between the current and the reference timings (usecs)')
+ compare_parser.add_argument('--override', '-o', action='append', type=str,
+ help="Specify regex-based rules as regex:delta (e.g., '^acpi_.*:50')")
+ compare_parser.add_argument('--mode', '-m', default='latency', choices=[
+ 'start', 'end', 'latency'],
+ help="Comparison mode: 'latency' (default) for latency, 'start' for start times, or 'end' for end times.")
+
+ args = parser.parse_args()
+
+ if args.mode == 'generate':
+ generate_reference_file(args.out_ref_file)
+ sys.exit(0)
+
+ # Process overrides
+ overrides = {}
+ if args.override:
+ for override in args.override:
+ try:
+ pattern, delta = override.split(":")
+ overrides[pattern] = int(delta)
+ except ValueError:
+ print(f"Invalid override format: {override}. Expected format is 'regex:delta'.")
+ sys.exit(1)
+
+ # Ensure debugfs is mounted
+ if not mount_debugfs("/sys/kernel/debug"):
+ ksft.exit_fail()
+
+ ksft.print_header()
+
+ run_test(args.in_ref_file, args.delta, overrides, args.mode)
+
+ ksft.finished()
--
2.30.2
This patch series adds some not yet picked selftests to the kvm s390x
selftest suite.
The additional test cases are covering:
* Assert KVM_EXIT_S390_UCONTROL exit on not mapped memory access
* Assert functionality of storage keys in ucontrol VM
* Assert that memory region operations are rejected for ucontrol VMs
Running the test cases requires sys_admin capabilities to start the
ucontrol VM.
This can be achieved by running as root or with a command like:
sudo setpriv --reuid nobody --inh-caps -all,+sys_admin \
--ambient-caps -all,+sys_admin --bounding-set -all,+sys_admin \
./ucontrol_test
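
Whether the capability is actually effective can be checked before
running the tests. The sketch below is an illustration, not part of the
series: it parses the CapEff line of /proc/self/status and tests bit 21
(CAP_SYS_ADMIN). The helper name is hypothetical.

```python
import os

CAP_SYS_ADMIN = 21  # capability bit number from linux/capability.h


def has_cap_sys_admin(status_text):
    """Return True if the CapEff mask in a /proc/<pid>/status dump has CAP_SYS_ADMIN."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)
            return bool(mask & (1 << CAP_SYS_ADMIN))
    return False


# Only meaningful on Linux; skipped where procfs is not available.
if os.path.exists("/proc/self/status"):
    with open("/proc/self/status", "r", encoding="utf-8") as status:
        print("CAP_SYS_ADMIN effective:", has_cap_sys_admin(status.read()))
```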
---
The patches in this series have been part of the previous patch series.
The test cases added here do depend on the fixture added in the earlier
patches.
Since v5 PATCH 7-9, the segment and page table generation has been removed
and DAT has been disabled, since DAT is not necessary to validate the KVM
code.
https://lore.kernel.org/kvm/20240807154512.316936-1-schlameuss@linux.ibm.co…
v6:
- add instruction intercept handling for skey specific instructions
(iske, sske, rrbe) in addition to kss intercept to work properly on
all machines
- reorder local variables
- fixup some method comments
- add a patch correcting the IP.b value length in a debug message
v5:
- rebased to current upstream master
- corrected assertion on 0x00 to 0
- reworded fixup commit so that it can be merged on top of current
upstream
v4:
- fix whitespaces in pointer function arguments (thanks Claudio)
- fix whitespaces in comments (thanks Janosch)
v3:
- fix skey assertion (thanks Claudio)
- introduce a wrapper around UCAS map and unmap ioctls to improve
readability (Claudio)
- add a displacement to accessed memory to assert translation
intercepts actually point to segments to the uc_map_unmap test
- add a misaligned failing mapping try to the uc_map_unmap test
v2:
- Reenable KSS intercept and handle it within skey test.
- Modify the checked register between storing (sske) and reading (iske)
it within the test program to make sure the.
- Add an additional state assertion in the end of uc_skey
- Fix some typos and white spaces.
v1:
- Remove segment and page table generation and disable DAT. This is not
necessary to validate the KVM code.
Christoph Schlameuss (5):
selftests: kvm: s390: Add uc_map_unmap VM test case
selftests: kvm: s390: Add uc_skey VM test case
selftests: kvm: s390: Verify reject memory region operations for
ucontrol VMs
selftests: kvm: s390: Fix whitespace confusion in ucontrol test
selftests: kvm: s390: correct IP.b length in uc_handle_sieic debug
output
.../selftests/kvm/include/s390x/processor.h | 6 +
.../selftests/kvm/s390x/ucontrol_test.c | 307 +++++++++++++++++-
2 files changed, 305 insertions(+), 8 deletions(-)
base-commit: eca631b8fe808748d7585059c4307005ca5c5820
--
2.47.0
virtio-net has two uses of hashes: one is RSS and the other is hash
reporting. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.
Another approach is to use an eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF.
Introduce code to compute hashes in the kernel in order to overcome
these challenges.
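
For readers unfamiliar with RSS hashing, the following is a minimal
Python sketch of a Toeplitz hash over an IPv4/TCP flow. It is an
illustration under stated assumptions, not the kernel code added by this
series: the key is a sample key commonly used in RSS verification
examples, the addresses and ports are made up, and the queue lookup via
a small indirection table is a simplification.

```python
import socket
import struct

# Sample 40-byte key commonly used in RSS hash verification examples.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b4"
    "77cb2da38030f20c6a42b73bbeac01fa"
)


def toeplitz_hash(key, data):
    """XOR together the 32-bit key windows selected by the set bits of data."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                # Window starts at input bit position i * 8 + bit.
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


# Hypothetical flow: source address, destination address, then both ports.
flow = (socket.inet_aton("192.0.2.1") + socket.inet_aton("192.0.2.2")
        + struct.pack("!HH", 1024, 80))
flow_hash = toeplitz_hash(RSS_KEY, flow)

# Hypothetical 4-entry indirection table mapping hash values to queues.
indirection_table = [0, 1, 2, 3]
queue = indirection_table[flow_hash % len(indirection_table)]
print(f"hash=0x{flow_hash:08x} queue={queue}")
```

The point made above is visible here: the queue is derived from the
hash, so the hash has to be known before the queue is chosen.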
An alternative solution is to extend the eBPF steering program so that it
will be able to report to the userspace, but it is based on context
rewrites, which is in feature freeze. We can adopt kfuncs, but they will
not be UAPIs. We opt for ioctl to align with other relevant UAPIs (KVM
and vhost_net).
The patches for QEMU to use this new feature were submitted as RFC and
are available at:
https://patchew.org/QEMU/20240915-hash-v3-0-79cb08d28647@daynix.com/
This work was presented at LPC 2024:
https://lpc.events/event/18/contributions/1963/
V1 -> V2:
Changed to introduce a new BPF program type.
Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
Changes in v5:
- Fixed a compilation error with CONFIG_TUN_VNET_CROSS_LE.
- Optimized the calculation of the hash value according to:
https://git.dpdk.org/dpdk/commit/?id=3fb1ea032bd6ff8317af5dac9af901f1f324ca…
- Added patch "tun: Unify vnet implementation".
- Dropped patch "tap: Pad virtio header with zero".
- Added patch "selftest: tun: Test vnet ioctls without device".
- Reworked selftests to skip for older kernels.
- Documented the case when the underlying device is deleted and packets
have queue_mapping set by TC.
- Reordered test harness arguments.
- Added code to handle fragmented packets.
- Link to v4: https://lore.kernel.org/r/20240924-rss-v4-0-84e932ec0e6c@daynix.com
Changes in v4:
- Moved tun_vnet_hash_ext to if_tun.h.
- Renamed virtio_net_toeplitz() to virtio_net_toeplitz_calc().
- Replaced htons() with cpu_to_be16().
- Changed virtio_net_hash_rss() to return void.
- Reordered variable declarations in virtio_net_hash_rss().
- Removed virtio_net_hdr_v1_hash_from_skb().
- Updated messages of "tap: Pad virtio header with zero" and
"tun: Pad virtio header with zero".
- Fixed vnet_hash allocation size.
- Ensured to free vnet_hash when destructing tun_struct.
- Link to v3: https://lore.kernel.org/r/20240915-rss-v3-0-c630015db082@daynix.com
Changes in v3:
- Reverted back to add ioctl.
- Split patch "tun: Introduce virtio-net hashing feature" into
"tun: Introduce virtio-net hash reporting feature" and
"tun: Introduce virtio-net RSS".
- Changed to reuse hash values computed for automq instead of performing
RSS hashing when hash reporting is requested but RSS is not.
- Extracted relevant data from struct tun_struct to keep it minimal.
- Added kernel-doc.
- Changed to allow calling TUNGETVNETHASHCAP before TUNSETIFF.
- Initialized num_buffers with 1.
- Added a test case for unclassified packets.
- Fixed error handling in tests.
- Changed tests to verify that the queue index will not overflow.
- Rebased.
- Link to v2: https://lore.kernel.org/r/20231015141644.260646-1-akihiko.odaki@daynix.com
---
Akihiko Odaki (10):
virtio_net: Add functions for hashing
skbuff: Introduce SKB_EXT_TUN_VNET_HASH
net: flow_dissector: Export flow_keys_dissector_symmetric
tun: Unify vnet implementation
tun: Pad virtio header with zero
tun: Introduce virtio-net hash reporting feature
tun: Introduce virtio-net RSS
selftest: tun: Test vnet ioctls without device
selftest: tun: Add tests for virtio-net hashing
vhost/net: Support VIRTIO_NET_F_HASH_REPORT
Documentation/networking/tuntap.rst | 7 +
MAINTAINERS | 1 +
drivers/net/Kconfig | 1 +
drivers/net/tap.c | 218 ++++--------
drivers/net/tun.c | 293 ++++++----------
drivers/net/tun_vnet.h | 342 +++++++++++++++++++
drivers/vhost/net.c | 16 +-
include/linux/if_tap.h | 2 +
include/linux/skbuff.h | 3 +
include/linux/virtio_net.h | 188 +++++++++++
include/net/flow_dissector.h | 1 +
include/uapi/linux/if_tun.h | 75 +++++
net/core/flow_dissector.c | 3 +-
net/core/skbuff.c | 4 +
tools/testing/selftests/net/Makefile | 2 +-
tools/testing/selftests/net/tun.c | 630 ++++++++++++++++++++++++++++++++++-
16 files changed, 1430 insertions(+), 356 deletions(-)
---
base-commit: 752ebcbe87aceeb6334e846a466116197711a982
change-id: 20240403-rss-e737d89efa77
Best regards,
--
Akihiko Odaki <akihiko.odaki(a)daynix.com>
If you wish to utilise a pidfd interface to refer to the current process or
thread it is rather cumbersome, requiring something like:
int pidfd = pidfd_open(getpid(), 0 or PIDFD_THREAD);
...
close(pidfd);
Or the equivalent call opening /proc/self. It is more convenient to use a
sentinel value to indicate to an interface that accepts a pidfd that we
simply wish to refer to the current process or thread.
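
The same open/use/close pattern can be sketched in Python (3.9 or later,
on Linux), which exposes os.pidfd_open() and signal.pidfd_send_signal().
This is only an illustration of the boilerplate the series aims to
avoid; the helper name is hypothetical, and the sentinels introduced
here are not used because Python does not expose them.

```python
import os
import signal


def probe_self_via_pidfd():
    """Open a pidfd for this process, send the null signal, then close it."""
    if not (hasattr(os, "pidfd_open") and hasattr(signal, "pidfd_send_signal")):
        return False  # Python older than 3.9 or a non-Linux platform
    try:
        pidfd = os.pidfd_open(os.getpid())
    except OSError:
        return False  # kernel without pidfd support, or call not permitted
    try:
        # Signal 0 only performs the existence and permission checks.
        signal.pidfd_send_signal(pidfd, 0)
        return True
    except OSError:
        return False
    finally:
        os.close(pidfd)


print("pidfd round trip succeeded:", probe_self_via_pidfd())
```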
This series introduces sentinels for this purpose which can be passed as
the pidfd in this instance rather than having to establish a dummy fd for
this purpose.
It is useful to refer to both the current thread from the userland's
perspective for which we use PIDFD_SELF, and the current process from the
userland's perspective, for which we use PIDFD_SELF_PROCESS.
There is unfortunately some confusion between the kernel and userland as to
what constitutes a process - a thread from the userland perspective is a
process in the kernel, and a userland process is a thread group (more
specifically the thread group leader) from the kernel perspective. We
therefore alias things thusly:
* PIDFD_SELF_THREAD aliased by PIDFD_SELF - use PIDTYPE_PID.
* PIDFD_SELF_THREAD_GROUP aliased by PIDFD_SELF_PROCESS - use PIDTYPE_TGID.
In all of the kernel code we refer to PIDFD_SELF_THREAD and
PIDFD_SELF_THREAD_GROUP. However we expect users to use PIDFD_SELF and
PIDFD_SELF_PROCESS.
This matters for cases where, for instance, a user unshare()'s FDs or does
thread-specific signal handling and where the user would be hugely confused
if the FDs referenced or signal processed referred to the thread group
leader rather than the individual thread.
We ensure that pidfd_send_signal() and pidfd_getfd() work correctly, and
assert as much in selftests. All other interfaces except setns() will work
implicitly with this new interface, however it doesn't make sense to test
waitid(P_PIDFD, ...) as waiting on ourselves is a blocking operation.
In the case of setns() we explicitly disallow use of PIDFD_SELF* as it
doesn't make sense to obtain the namespaces of our own process, and it
would require work to implement this functionality there that would be of
no use.
We also do not provide the ability to utilise PIDFD_SELF* in ordinary fd
operations such as open() or poll(), as this would require extensive work
and be of no real use.
v3:
* Do not fput() an invalid fd as reported by kernel test bot.
* Fix unintended churn from moving variable declaration.
v2:
* Fix tests as reported by Shuah.
* Correct RFC version lore link.
https://lore.kernel.org/linux-mm/cover.1728643714.git.lorenzo.stoakes@oracl…
Non-RFC v1:
* Removed RFC tag - there seems to be general consensus that this change is
a good idea, but perhaps some debate to be had on implementation. It
seems sensible then to move forward with the RFC flag removed.
* Introduced PIDFD_SELF_THREAD, PIDFD_SELF_THREAD_GROUP and their aliases
PIDFD_SELF and PIDFD_SELF_PROCESS respectively.
* Updated testing accordingly.
https://lore.kernel.org/linux-mm/cover.1728578231.git.lorenzo.stoakes@oracl…
RFC version:
https://lore.kernel.org/linux-mm/cover.1727644404.git.lorenzo.stoakes@oracl…
Lorenzo Stoakes (3):
pidfd: extend pidfd_get_pid() and de-duplicate pid lookup
pidfd: add PIDFD_SELF_* sentinels to refer to own thread/process
selftests: pidfd: add tests for PIDFD_SELF_*
include/linux/pid.h | 43 +++++-
include/uapi/linux/pidfd.h | 15 ++
kernel/exit.c | 3 +-
kernel/nsproxy.c | 1 +
kernel/pid.c | 73 ++++++---
kernel/signal.c | 26 +---
tools/testing/selftests/pidfd/pidfd.h | 8 +
.../selftests/pidfd/pidfd_getfd_test.c | 141 ++++++++++++++++++
.../selftests/pidfd/pidfd_setns_test.c | 11 ++
tools/testing/selftests/pidfd/pidfd_test.c | 76 ++++++++--
10 files changed, 342 insertions(+), 55 deletions(-)
--
2.46.2