Hi,
Here is version 3 of the patch series to support accessing function entry data
from function *return* probes (including kretprobe and fprobe-exit event).
The previous version is here:
https://lore.kernel.org/all/170891987362.609861.6767830614537418260.stgit@d…
In this version, [1/8] is a bugfix patch (note that it is already pushed to
probes-fixes-v6.8-rc5 and is included just for reference). I also updated the
changelog and fixed a build error in [4/8], fixed a selftests error in [6/8],
updated the document in [8/8], and added Steve's Reviewed-by.
This allows us to access the results of functions that return only an
error code and pass their results back via a function parameter, such as a
structure-initialization function.
For example, vfs_open() links the file structure to the inode and updates
the mode, so we can trace those changes.
# echo 'f vfs_open mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
# echo 'f vfs_open%return mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
# echo 1 > events/fprobes/enable
# cat trace
sh-131 [006] ...1. 1945.714346: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x2 inode=0x0
sh-131 [006] ...1. 1945.714358: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4d801e inode=0xffff888008470168
cat-143 [007] ...1. 1945.717949: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
cat-143 [007] ...1. 1945.717956: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4a801d inode=0xffff888005f78d28
cat-143 [007] ...1. 1945.720616: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
cat-143 [007] ...1. 1945.728263: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0xa800d inode=0xffff888004ada8d8
As you can see, those fields are initialized by the time the function exits.
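To make the pattern concrete, here is a minimal userspace C sketch (not
kernel code) of a function that returns only an error code and hands its
result back through a pointer parameter. The names demo_inode, demo_file,
demo_snapshot, demo_open() and demo_snapshot_of() are hypothetical stand-ins
invented for illustration; the point is only that an observer at return
needs the entry-time argument (the file pointer) to see what was filled in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel's structs. */
struct demo_inode {
	int ino;
};

struct demo_file {
	unsigned int f_mode;
	struct demo_inode *f_inode;
};

/* The values an observer would record from the 'file' argument. */
struct demo_snapshot {
	unsigned int mode;
	const struct demo_inode *inode;
};

/*
 * Returns 0 on success or -1 on error. The useful result is not the
 * return value but the fields written through 'file'.
 */
int demo_open(struct demo_file *file, struct demo_inode *inode,
	      unsigned int mode)
{
	if (!file || !inode)
		return -1;
	file->f_inode = inode;
	file->f_mode = mode;
	return 0;
}

/* Dereference the saved entry argument, like "file->f_mode" in the event. */
struct demo_snapshot demo_snapshot_of(const struct demo_file *file)
{
	struct demo_snapshot snap;

	assert(file != NULL);
	snap.mode = file->f_mode;
	snap.inode = file->f_inode;
	return snap;
}
```

Taking a snapshot of the same pointer before and after demo_open() mirrors
the vfs_open__entry / vfs_open__exit pair above: the inode field is NULL
before the call and set after it.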
This series is based on the v6.8-rc5 kernel, or you can check it out from
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=t…
Thank you,
---
Masami Hiramatsu (Google) (8):
fprobe: Fix to allocate entry_data_size buffer with rethook instances
tracing/fprobe-event: cleanup: Fix a wrong comment in fprobe event
tracing/probes: Cleanup probe argument parser
tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init
tracing: Remove redundant #else block for BTF args from README
tracing/probes: Support $argN in return probe (kprobe and fprobe)
selftests/ftrace: Add test cases for entry args at function exit
Documentation: tracing: Add entry argument access at function exit
Documentation/trace/fprobetrace.rst | 31 +
Documentation/trace/kprobetrace.rst | 9
kernel/trace/fprobe.c | 14 -
kernel/trace/trace.c | 5
kernel/trace/trace_eprobe.c | 8
kernel/trace/trace_fprobe.c | 59 ++-
kernel/trace/trace_kprobe.c | 58 ++-
kernel/trace/trace_probe.c | 417 ++++++++++++++------
kernel/trace/trace_probe.h | 30 +
kernel/trace/trace_probe_tmpl.h | 10
kernel/trace/trace_uprobe.c | 14 -
.../ftrace/test.d/dynevent/fprobe_entry_arg.tc | 18 +
.../ftrace/test.d/dynevent/fprobe_syntax_errors.tc | 4
.../ftrace/test.d/kprobe/kprobe_syntax_errors.tc | 2
.../ftrace/test.d/kprobe/kretprobe_entry_arg.tc | 18 +
15 files changed, 521 insertions(+), 176 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Hi,
Here is version 2 of the patch series to support accessing function entry data
from function *return* probes (including kretprobe and fprobe-exit event).
In this version, I added another cleanup [4/7], updated the README [5/7],
added test cases [6/7], and updated the document [7/7].
This allows us to access the results of functions that return only an
error code and pass their results back via a function parameter, such as a
structure-initialization function.
For example, vfs_open() links the file structure to the inode and updates
the mode, so we can trace those changes.
# echo 'f vfs_open mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
# echo 'f vfs_open%return mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
# echo 1 > events/fprobes/enable
# cat trace
sh-131 [006] ...1. 1945.714346: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x2 inode=0x0
sh-131 [006] ...1. 1945.714358: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4d801e inode=0xffff888008470168
cat-143 [007] ...1. 1945.717949: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
cat-143 [007] ...1. 1945.717956: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4a801d inode=0xffff888005f78d28
cat-143 [007] ...1. 1945.720616: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
cat-143 [007] ...1. 1945.728263: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0xa800d inode=0xffff888004ada8d8
As you can see, those fields are initialized by the time the function exits.
This series is based on the v6.8-rc5 kernel, or you can check it out from
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=t…
Thank you,
---
Masami Hiramatsu (Google) (7):
tracing/fprobe-event: cleanup: Fix a wrong comment in fprobe event
tracing/probes: Cleanup probe argument parser
tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init
tracing: Remove redundant #else block for BTF args from README
tracing/probes: Support $argN in return probe (kprobe and fprobe)
selftests/ftrace: Add test cases for entry args at function exit
Documentation: tracing: Add entry argument access at function exit
Documentation/trace/fprobetrace.rst | 7
Documentation/trace/kprobetrace.rst | 7
kernel/trace/trace.c | 5
kernel/trace/trace_eprobe.c | 8
kernel/trace/trace_fprobe.c | 59 ++-
kernel/trace/trace_kprobe.c | 58 ++-
kernel/trace/trace_probe.c | 417 ++++++++++++++------
kernel/trace/trace_probe.h | 30 +
kernel/trace/trace_probe_tmpl.h | 10
kernel/trace/trace_uprobe.c | 14 -
.../ftrace/test.d/dynevent/fprobe_entry_arg.tc | 18 +
.../ftrace/test.d/kprobe/kretprobe_entry_arg.tc | 18 +
12 files changed, 483 insertions(+), 168 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
This series includes 6 types of fixes:
- Patch 1 fixes the support of v4-mapped-in-v6 addresses in the userspace
PM when asking to delete a subflow. The mapping was already done
everywhere else, but not there. Patch 2 validates the modification,
thanks to a subtest in mptcp_join.sh. These patches can be backported
up to v5.19.
- Patch 3 is a small fix for a recent bug-fix patch, just to avoid
printing an irrelevant warning (pr_warn()) once. It can be backported
up to v5.6, alongside the bug-fix that was introduced in v6.8-rc5.
- Patches 4 to 6 are fixes for bugs found by Paolo while working on
TCP_NOTSENT_LOWAT support for MPTCP. These fixes can improve
performance in some cases. The patches can be backported up to v5.6,
v5.11 and v6.7 respectively.
- Patch 7 makes sure 'ss -M' is available when starting MPTCP Join
selftest as it is required for some subtests since v5.18.
- Patch 8 fixes a possible double-free on socket dismantle. The issue
has always existed, but went unnoticed because it has not caused any
problems so far. This fix can be backported up to v5.6.
- Patch 9 is a fix for a very recent patch causing lockdep warnings in
subflow diag. The patch causing the regression -- which fixes another
issue present since v5.7 -- should be part of the future v6.8-rc6.
Patch 10 validates the modification, thanks to a new subtest in
diag.sh.
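As background for Patch 1, the following userspace C sketch shows what a
v4-mapped-in-v6 address is and why two addresses have to be mapped to the
same form before they are compared. It is not the kernel code from the
patch: the helper names v4_to_v4mapped(), parse_as_v6() and
same_addr_after_mapping() are hypothetical, and only standard socket API
calls (inet_pton(), memcpy(), memcmp()) are used.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build ::ffff:a.b.c.d from an IPv4 address. */
void v4_to_v4mapped(const struct in_addr *v4, struct in6_addr *v6)
{
	assert(v4 != NULL && v6 != NULL);
	memset(v6, 0, sizeof(*v6));
	v6->s6_addr[10] = 0xff;
	v6->s6_addr[11] = 0xff;
	/* s_addr is already in network byte order, so copy it as-is. */
	memcpy(&v6->s6_addr[12], &v4->s_addr, sizeof(v4->s_addr));
}

/* Parse a textual IPv4 or IPv6 address into IPv6 form; 0 on success. */
int parse_as_v6(const char *text, struct in6_addr *out)
{
	struct in_addr v4;

	if (inet_pton(AF_INET, text, &v4) == 1) {
		v4_to_v4mapped(&v4, out);
		return 0;
	}
	return inet_pton(AF_INET6, text, out) == 1 ? 0 : -1;
}

/* 1 if equal after mapping, 0 if different, -1 if either fails to parse. */
int same_addr_after_mapping(const char *a, const char *b)
{
	struct in6_addr a6, b6;

	if (parse_as_v6(a, &a6) || parse_as_v6(b, &b6))
		return -1;
	return memcmp(&a6, &b6, sizeof(a6)) == 0;
}
```

With this sketch, "10.0.1.1" and "::ffff:10.0.1.1" compare as the same
address, whereas a byte-wise comparison of the raw IPv4 and IPv6 forms
would never match.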
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Davide Caratti (1):
mptcp: fix double-free on socket dismantle
Geliang Tang (3):
mptcp: map v4 address to v6 when destroying subflow
selftests: mptcp: rm subflow with v4/v4mapped addr
selftests: mptcp: join: add ss mptcp support check
Matthieu Baerts (NGI0) (1):
mptcp: avoid printing warning once on client side
Paolo Abeni (5):
mptcp: push at DSS boundaries
mptcp: fix snd_wnd initialization for passive socket
mptcp: fix potential wake-up event loss
mptcp: fix possible deadlock in subflow diag
selftests: mptcp: explicitly trigger the listener diag code-path
net/mptcp/diag.c | 3 ++
net/mptcp/options.c | 2 +-
net/mptcp/pm_userspace.c | 10 +++++
net/mptcp/protocol.c | 52 ++++++++++++++++++++++++-
net/mptcp/protocol.h | 21 +++++-----
tools/testing/selftests/net/mptcp/diag.sh | 30 +++++++++++++-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 33 ++++++++++------
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 4 +-
8 files changed, 128 insertions(+), 27 deletions(-)
---
base-commit: b0b1210bc150fbd741b4b9fce8a24541306b40fc
change-id: 20240223-upstream-net-20240223-misc-fixes-1630cd6b3b0a
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
This series extends the KVM RISC-V ONE_REG interface to report a few more
ISA extensions, namely Ztso and Zacas. These extensions are already
supported by the HWPROBE interface in the Linux-6.8 kernel.
To test these patches, use KVMTOOL from the riscv_more_exts_round2_v1
branch at: https://github.com/avpatel/kvmtool.git
These patches can also be found in the riscv_kvm_more_exts_round2_v1
branch at: https://github.com/avpatel/linux.git
Anup Patel (5):
RISC-V: KVM: Forward SEED CSR access to user space
RISC-V: KVM: Allow Ztso extension for Guest/VM
KVM: riscv: selftests: Add Ztso extension to get-reg-list test
RISC-V: KVM: Allow Zacas extension for Guest/VM
KVM: riscv: selftests: Add Zacas extension to get-reg-list test
arch/riscv/include/uapi/asm/kvm.h | 2 ++
arch/riscv/kvm/vcpu_insn.c | 13 +++++++++++++
arch/riscv/kvm/vcpu_onereg.c | 4 ++++
tools/testing/selftests/kvm/riscv/get-reg-list.c | 8 ++++++++
4 files changed, 27 insertions(+)
--
2.34.1
This series fixes a bug in the complete phase of UDP in GRO, in which
socket lookup fails due to using network_header when parsing encapsulated
packets. The fix is to keep track of both outer and inner offsets.
The last commit leverages the first commit to remove some state from
napi_gro_cb, and stateful code in {ipv6,inet}_gro_receive which may be
unnecessarily complicated due to encapsulation support in GRO.
In addition, udpgro_fwd selftest is adjusted to include the socket lookup
case for vxlan. This selftest will test its intended functionality once
local bind support is merged (https://lore.kernel.org/netdev/df300a49-7811-4126-a56a-a77100c8841b@gmail.c…).
Richard Gobert (3):
net: gro: set {inner_,}network_header in receive phase
selftests/net: add local address bind in vxlan selftest
net: gro: move L3 flush checks to tcp_gro_receive
include/net/gro.h | 23 ++++---
net/8021q/vlan_core.c | 3 +
net/core/gro.c | 3 -
net/ipv4/af_inet.c | 44 ++------------
net/ipv4/tcp_offload.c | 73 ++++++++++++++++++-----
net/ipv4/udp_offload.c | 2 +-
net/ipv6/ip6_offload.c | 22 ++-----
net/ipv6/tcpv6_offload.c | 2 +-
net/ipv6/udp_offload.c | 2 +-
tools/testing/selftests/net/udpgro_fwd.sh | 10 +++-
10 files changed, 97 insertions(+), 87 deletions(-)
--
2.36.1
The previous patch series[1] changed the mmap behavior so that the hint
address is treated as the upper bound of the mmap address range. The
motivation of the previous patch series was that some user space software
may assume a 48-bit address space and use the higher bits to encode some
information, which may collide with the large virtual addresses mmap may
return. However, to make sv48 the default, we don't need to change the
meaning of the hint address on mmap to be the upper bound of the mmap
address range, especially when this behavior only shows up on RISC-V. This
behavior also breaks some user space software which assumes mmap should try
to create the mapping at the hint address if possible. As the mmap manpage
says:
> If addr is not NULL, then the kernel takes it as a hint about where to
> place the mapping; on Linux, the kernel will pick a nearby page boundary
> (but always above or equal to the value specified by
> /proc/sys/vm/mmap_min_addr) and attempt to create the mapping there.
Unfortunately, what the mmap manpage says has not been true on RISC-V since
kernel v6.6. Other ISAs with larger than 48-bit virtual address space, like
x86, arm64, and powerpc, do not have this special mmap behavior for the hint
address. They all just use a 48-bit / 47-bit virtual address space by
default, and if user space software wants a larger virtual address space,
it only needs to specify a hint address larger than 48-bit / 47-bit.
Thus, this patch series keeps the change of mmap to use sv48 by default but
does not treat the hint address as the upper bound of the mmap address
range. After this patch, the behavior of mmap will align with the existing
behavior on other ISAs with larger than 48-bit virtual address space like
x86, arm64, and powerpc. User space software will no longer need to be
rewritten to fit this special mmap behavior that exists only on RISC-V.
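The behavior described above can be observed from user space with a small C
helper. This is a minimal sketch, assuming a Linux system; map_with_hint()
and hint_was_honored() are hypothetical names, and the hint value is
whatever the caller picks. Without MAP_FIXED the kernel is free to place the
mapping elsewhere, so the sketch only reports whether the hint was honored
rather than assuming that it was.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: request an anonymous private mapping of 'len'
 * bytes, passing 'hint' as the addr argument of mmap(). MAP_FIXED is
 * deliberately not used, so 'hint' is only a hint.
 * Returns the mapped address, or MAP_FAILED on error.
 */
void *map_with_hint(uintptr_t hint, size_t len)
{
	assert(len > 0); /* mmap() rejects zero-length mappings */
	return mmap((void *)hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Hypothetical helper: 1 if the mapping landed exactly at the hint. */
int hint_was_honored(const void *addr, uintptr_t hint)
{
	return addr != MAP_FAILED && (uintptr_t)addr == hint;
}
```

A caller would pick a page-aligned hint, call map_with_hint(), and then use
hint_was_honored() to see whether the kernel placed the mapping at the hint
on that system, releasing the mapping with munmap() afterwards.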
My concern is that the change of mmap behavior on the hint address has
already been in the upstream kernel since v6.6, and it might be hard to
revert even though it already causes regressions in some user space
software. Reverting it will also be harder than adding it was in v6.6,
because mmap not creating the mapping at the hint address is very common,
especially when running on a machine without sv57 / sv48. However, if some
user space software has already adopted this special mmap behavior on
RISC-V, we should not return an mmap address larger than the hint if the
address is larger than BIT(38). In my opinion, reverting this change in the
next kernel release might be a good choice: only a few hardware platforms
support sv57 / sv48 now, and these changes will have no impact on sv39
systems.
Moreover, the previous patch series said it makes sv48 the default, which is
stated in the cover letter, the kernel documentation and the MMAP_VA_BITS
definition. However, the code in the arch_get_mmap_end and
arch_get_mmap_base macros still uses sv39 by default, which confuses me. I
still use sv48 by default in this patch series, including in
arch_get_mmap_end and arch_get_mmap_base.
Changes in v2:
- correct arch_get_mmap_end and arch_get_mmap_base
- Add description in documentation about mmap behavior on kernel v6.6-6.7.
- Improve commit message and cover letter
- Rebase to newest riscv/for-next branch
- Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8B…
[1]. https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosin…
Yangyu Chen (3):
RISC-V: mm: do not treat hint addr on mmap as the upper bound to
search
RISC-V: mm: only test mmap without hint
Documentation: riscv: correct sv57 kernel behavior
Documentation/arch/riscv/vm-layout.rst | 54 ++++++++++++-------
arch/riscv/include/asm/processor.h | 38 +++----------
.../selftests/riscv/mm/mmap_bottomup.c | 12 -----
.../testing/selftests/riscv/mm/mmap_default.c | 12 -----
tools/testing/selftests/riscv/mm/mmap_test.h | 30 -----------
5 files changed, 41 insertions(+), 105 deletions(-)
--
2.43.0