This series introduces stats counters for psp. Device key rotations,
and so called 'stale-events' are common to all drivers and are tracked
by the core.
A driver facing api is provided for reporting stats required by the
"Implementation Requirements" section of the PSP Architecture
Specification. Drivers must implement these stats.
Lastly, implementations of the driver stats api for mlx5 and netdevsim
are included.
Here is the output of running the psp selftest suite and then
printing out stats with the ynl cli on system with a psp-capable CX7:
$ ./ksft-psp-stats/drivers/net/psp.py
TAP version 13
1..28
ok 1 psp.test_case # SKIP Test requires IPv4 connectivity
ok 2 psp.data_basic_send_v0_ip6
ok 3 psp.test_case # SKIP Test requires IPv4 connectivity
ok 4 psp.data_basic_send_v1_ip6
ok 5 psp.test_case # SKIP Test requires IPv4 connectivity
ok 6 psp.data_basic_send_v2_ip6 # SKIP ('PSP version not supported', 'hdr0-aes-gmac-128')
ok 7 psp.test_case # SKIP Test requires IPv4 connectivity
ok 8 psp.data_basic_send_v3_ip6 # SKIP ('PSP version not supported', 'hdr0-aes-gmac-256')
ok 9 psp.test_case # SKIP Test requires IPv4 connectivity
ok 10 psp.data_mss_adjust_ip6
ok 11 psp.dev_list_devices
ok 12 psp.dev_get_device
ok 13 psp.dev_get_device_bad
ok 14 psp.dev_rotate
ok 15 psp.dev_rotate_spi
ok 16 psp.assoc_basic
ok 17 psp.assoc_bad_dev
ok 18 psp.assoc_sk_only_conn
ok 19 psp.assoc_sk_only_mismatch
ok 20 psp.assoc_sk_only_mismatch_tx
ok 21 psp.assoc_sk_only_unconn
ok 22 psp.assoc_version_mismatch
ok 23 psp.assoc_twice
ok 24 psp.data_send_bad_key
ok 25 psp.data_send_disconnect
ok 26 psp.data_stale_key
ok 27 psp.removal_device_rx # XFAIL Test only works on netdevsim
ok 28 psp.removal_device_bi # XFAIL Test only works on netdevsim
# Totals: pass:19 fail:0 xfail:2 xpass:0 skip:7 error:0
#
# Responder logs (0):
# STDERR:
# Set PSP enable on device 1 to 0x3
# Set PSP enable on device 1 to 0x0
$ cd ynl/
$ ./pyynl/cli.py --spec netlink/specs/psp.yaml --dump get-stats
[{'dev-id': 1,
'key-rotations': 5,
'rx-auth-fail': 21,
'rx-bad': 0,
'rx-bytes': 11844,
'rx-error': 0,
'rx-packets': 94,
'stale-events': 6,
'tx-bytes': 1128456,
'tx-error': 0,
'tx-packets': 780}]
CHANGES:
v3:
- simplify error path in accel_psp_fs_init_tx()
- avoid casting argument in mlx5e_accel_psp_fs_get_stats_fill()
- delete unused member stats member in mlx5e_psp
- remove zero length array from psp_dev_stats
v2: https://lore.kernel.org/netdev/20251028000018.3869664-1-daniel.zahka@gmail.…
- don't return skb->len from psp_nl_get_stats_dumpit() on success and
EMSGSIZE
- use %pe to print PTR_ERR()
v1: https://lore.kernel.org/netdev/20251022193739.1376320-1-daniel.zahka@gmail.…
Daniel Zahka (2):
selftests: drv-net: psp: add assertions on core-tracked psp dev stats
netdevsim: implement psp device stats
Jakub Kicinski (3):
psp: report basic stats from the core
psp: add stats from psp spec to driver facing api
net/mlx5e: Add PSP stats support for Rx/Tx flows
Documentation/netlink/specs/psp.yaml | 95 +++++++
.../mellanox/mlx5/core/en_accel/psp.c | 233 ++++++++++++++++--
.../mellanox/mlx5/core/en_accel/psp.h | 16 ++
.../mellanox/mlx5/core/en_accel/psp_rxtx.c | 1 +
.../net/ethernet/mellanox/mlx5/core/en_main.c | 5 +
drivers/net/netdevsim/netdevsim.h | 5 +
drivers/net/netdevsim/psp.c | 27 ++
include/net/psp/types.h | 32 +++
include/uapi/linux/psp.h | 18 ++
net/psp/psp-nl-gen.c | 19 ++
net/psp/psp-nl-gen.h | 2 +
net/psp/psp_main.c | 3 +-
net/psp/psp_nl.c | 93 +++++++
net/psp/psp_sock.c | 4 +-
tools/testing/selftests/drivers/net/psp.py | 13 +
15 files changed, 549 insertions(+), 17 deletions(-)
--
2.47.3
From: Alexander Sverdlin <alexander.sverdlin(a)siemens.com>
It seems that most of the tests prepare the interfaces once before the test
run (setup_prepare()), rely on setup_wait() to wait for link and only then
run the test(s).
local_termination brings the physical interfaces down and up during test
run but never wait for them to come up. If the auto-negotiation takes
some seconds, first test packets are being lost, which leads to
false-negative test results.
Use setup_wait() in run_test() to make sure auto-negotiation has been
completed after all simple_if_init() calls on physical interfaces and test
packets will not be lost because of the race against link establishment.
Fixes: 90b9566aa5cd3f ("selftests: forwarding: add a test for local_termination.sh")
Reviewed-by: Vladimir Oltean <vladimir.oltean(a)nxp.com>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin(a)siemens.com>
---
Changelog:
v3:
- moved setup_wait() from individual test groups into run_test()
v2:
- replaced "setup_wait_dev $h1; setup_wait_dev $h2" with setup_wait()
tools/testing/selftests/net/forwarding/local_termination.sh | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/net/forwarding/local_termination.sh b/tools/testing/selftests/net/forwarding/local_termination.sh
index ecd34f364125c..892895659c7e4 100755
--- a/tools/testing/selftests/net/forwarding/local_termination.sh
+++ b/tools/testing/selftests/net/forwarding/local_termination.sh
@@ -176,6 +176,8 @@ run_test()
local rcv_dmac=$(mac_get $rcv_if_name)
local should_receive
+ setup_wait
+
tcpdump_start $rcv_if_name
mc_route_prepare $send_if_name
--
2.51.1
The current netconsole implementation allocates a static buffer for
extradata (userdata + sysdata) with a fixed size of
MAX_EXTRADATA_ENTRY_LEN * MAX_EXTRADATA_ITEMS bytes for every target,
regardless of whether userspace actually uses this feature. This forces
us to keep MAX_EXTRADATA_ITEMS small (16), which is restrictive for
users who need to attach more metadata to their log messages.
This patch series enables dynamic allocation of the userdata buffer,
allowing it to grow on-demand based on actual usage. The series:
1. Refactors send_fragmented_body() to simplify handling of separated
userdata and sysdata (patch 1/4)
2. Splits userdata and sysdata into separate buffers (patch 2/4)
3. Implements dynamic allocation for the userdata buffer (patch 3/4)
4. Increases MAX_USERDATA_ITEMS from 16 to 256 now that we can do so
without memory waste (patch 4/4)
Benefits:
- No memory waste when userdata is not used
- Targets that use userdata only consume what they need
- Users can attach significantly more metadata without impacting systems
that don't use this feature
Signed-off-by: Gustavo Luiz Duarte <gustavold(a)gmail.com>
---
Gustavo Luiz Duarte (4):
netconsole: Simplify send_fragmented_body()
netconsole: Split userdata and sysdata
netconsole: Dynamic allocation of userdata buffer
netconsole: Increase MAX_USERDATA_ITEMS
drivers/net/netconsole.c | 338 +++++++++------------
.../selftests/drivers/net/netcons_overflow.sh | 2 +-
2 files changed, 152 insertions(+), 188 deletions(-)
---
base-commit: 89aec171d9d1ab168e43fcf9754b82e4c0aef9b9
change-id: 20251007-netconsole_dynamic_extradata-21bd9d726568
Best regards,
--
Gustavo Duarte <gustavold(a)meta.com>
The zt-test output is awkward to read, as the 'Expected' value isn't
dumped on its own line and isn't aligned with the 'Got' value beneath.
For example:
Mismatch: PID=5281, iteration=3270249 Expected [00a1146901a1146902a1146903a1146904a1146905a1146906a1146907a1146908a1146909a114690aa114690ba114690ca114690da114690ea114690fa11469]
Got [00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000]
SVCR: 2
Add a newline, matching the other FPSIMD/SVE/SME tests, so that we get
output that can be read more easily:
Mismatch: PID=5281, iteration=3270249
Expected [00a1146901a1146902a1146903a1146904a1146905a1146906a1146907a1146908a1146909a114690aa114690ba114690ca114690da114690ea114690fa11469]
Got [00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000]
SVCR: 2
Admittedly this isn't all that important when the 'Got' value is all
zeroes, but otherwise this would be a major help for identifying which
portion of the 'Got' value is not as expected.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Will Deacon <will(a)kernel.org>
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/arm64/fp/zt-test.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/fp/zt-test.S b/tools/testing/selftests/arm64/fp/zt-test.S
index 38080f3c32804..a8df057716707 100644
--- a/tools/testing/selftests/arm64/fp/zt-test.S
+++ b/tools/testing/selftests/arm64/fp/zt-test.S
@@ -276,7 +276,7 @@ function barf
bl putdec
puts ", iteration="
mov x0, x22
- bl putdec
+ bl putdecn
puts "\tExpected ["
mov x0, x10
mov x1, x12
--
2.30.2
[ based on kvm/next ]
Implement guest_memfd population via the write syscall.
This is useful in non-CoCo use cases where the host can access guest
memory. Even though the same can also be achieved via userspace mapping
and memcpying from userspace, write provides a more performant option
because it does not need to set page tables and it does not cause a page
fault for every page like memcpy would. Note that memcpy cannot be
accelerated via MADV_POPULATE_WRITE as it is not supported by
guest_memfd and relies on GUP.
Populating 512MiB of guest_memfd on a x86 machine:
- via memcpy: 436 ms
- via write: 202 ms (-54%)
The write syscall support is conditional on kvm_gmem_supports_mmap.
When in-place shared/private conversion is supported, write should only
be allowed on shared pages.
v6:
- Make write support conditional on mmap support instead of relying on
the up-to-date flag to decide whether writing to a page is allowed
- James: Remove depenendencies on folio_test_large
- James: Remove page alignment restriction
- James: Formatting fixes
v5:
- https://lore.kernel.org/kvm/20250902111951.58315-1-kalyazin@amazon.com/
- Replace the call to the unexported filemap_remove_folio with
zeroing the bytes that could not be copied
- Fix checkpatch findings
v4:
- https://lore.kernel.org/kvm/20250828153049.3922-1-kalyazin@amazon.com
- Switch from implementing the write callback to write_iter
- Remove conditional compilation
v3:
- https://lore.kernel.org/kvm/20250303130838.28812-1-kalyazin@amazon.com
- David/Mike D: Only compile support for the write syscall if
CONFIG_KVM_GMEM_SHARED_MEM (now gone) is enabled.
v2:
- https://lore.kernel.org/kvm/20241129123929.64790-1-kalyazin@amazon.com
- Switch from an ioctl to the write syscall to implement population
v1:
- https://lore.kernel.org/kvm/20241024095429.54052-1-kalyazin@amazon.com
Nikita Kalyazin (2):
KVM: guest_memfd: add generic population via write
KVM: selftests: update guest_memfd write tests
.../testing/selftests/kvm/guest_memfd_test.c | 51 ++++++++++++++++---
virt/kvm/guest_memfd.c | 49 ++++++++++++++++++
2 files changed, 94 insertions(+), 6 deletions(-)
base-commit: 6b36119b94d0b2bb8cea9d512017efafd461d6ac
--
2.50.1
Since Armv9.6, FEAT_LSUI supplies the load/store instructions for
previleged level to access to access user memory without clearing
PSTATE.PAN bit.
This patchset support FEAT_LSUI and applies in futex atomic operation
and user_swpX emulation where can replace from ldxr/st{l}xr
pair implmentation with clearing PSTATE.PAN bit to correspondant
load/store unprevileged atomic operation without clearing PSTATE.PAN bit.
Patch Sequences
================
Patch #1 adds cpufeature for FEAT_LSUI
Patch #2-#3 expose FEAT_LSUI to guest
Patch #4 adds Kconfig for FEAT_LSUI
Patch #5-#6 support futex atomic-op with FEAT_LSUI
Patch #7-#9 support user_swpX emulation with FEAT_LSUI
Patch History
==============
from v10 to v11:
- use cast instruction to emulate deprecated swpb instruction
- https://lore.kernel.org/all/20251103163224.818353-1-yeoreum.yun@arm.com/
from v9 to v10:
- apply FEAT_LSUI to user_swpX emulation.
- add test coverage for LSUI bit in ID_AA64ISAR3_EL1
- rebase to v6.18-rc4
- https://lore.kernel.org/all/20250922102244.2068414-1-yeoreum.yun@arm.com/
from v8 to v9:
- refotoring __lsui_cmpxchg64()
- rebase to v6.17-rc7
- https://lore.kernel.org/all/20250917110838.917281-1-yeoreum.yun@arm.com/
from v7 to v8:
- implements futex_atomic_eor() and futex_atomic_cmpxchg() with casalt
with C helper.
- Drop the small optimisation on ll/sc futex_atomic_set operation.
- modify some commit message.
- https://lore.kernel.org/all/20250816151929.197589-1-yeoreum.yun@arm.com/
from v6 to v7:
- wrap FEAT_LSUI with CONFIG_AS_HAS_LSUI in cpufeature
- remove unnecessary addition of indentation.
- remove unnecessary mte_tco_enable()/disable() on LSUI operation.
- https://lore.kernel.org/all/20250811163635.1562145-1-yeoreum.yun@arm.com/
from v5 to v6:
- rebase to v6.17-rc1
- https://lore.kernel.org/all/20250722121956.1509403-1-yeoreum.yun@arm.com/
from v4 to v5:
- remove futex_ll_sc.h futext_lsui and lsui.h and move them to futex.h
- reorganize the patches.
- https://lore.kernel.org/all/20250721083618.2743569-1-yeoreum.yun@arm.com/
from v3 to v4:
- rebase to v6.16-rc7
- modify some patch's title.
- https://lore.kernel.org/all/20250617183635.1266015-1-yeoreum.yun@arm.com/
from v2 to v3:
- expose FEAT_LUSI to guest
- add help section for LUSI Kconfig
- https://lore.kernel.org/all/20250611151154.46362-1-yeoreum.yun@arm.com/
from v1 to v2:
- remove empty v9.6 menu entry
- locate HAS_LUSI in cpucaps in order
- https://lore.kernel.org/all/20250611104916.10636-1-yeoreum.yun@arm.com/
Yeoreum Yun (9):
arm64: cpufeature: add FEAT_LSUI
KVM: arm64: expose FEAT_LSUI to guest
KVM: arm64: kselftest: set_id_regs: add test for FEAT_LSUI
arm64: Kconfig: Detect toolchain support for LSUI
arm64: futex: refactor futex atomic operation
arm64: futex: support futex with FEAT_LSUI
arm64: separate common LSUI definitions into lsui.h
arm64: armv8_deprecated: convert user_swpX to inline function
arm64: armv8_deprecated: apply FEAT_LSUI for swpX emulation.
arch/arm64/Kconfig | 5 +
arch/arm64/include/asm/futex.h | 291 +++++++++++++++---
arch/arm64/include/asm/lsui.h | 25 ++
arch/arm64/kernel/armv8_deprecated.c | 111 +++++--
arch/arm64/kernel/cpufeature.c | 10 +
arch/arm64/kvm/sys_regs.c | 3 +-
arch/arm64/tools/cpucaps | 1 +
.../testing/selftests/kvm/arm64/set_id_regs.c | 1 +
8 files changed, 381 insertions(+), 66 deletions(-)
create mode 100644 arch/arm64/include/asm/lsui.h
base-commit: 6146a0f1dfae5d37442a9ddcba012add260bceb0
--
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
This series adds namespace support to vhost-vsock and loopback. It does
not add namespaces to any of the other guest transports (virtio-vsock,
hyperv, or vmci).
The current revision supports two modes: local and global. Local
mode is complete isolation of namespaces, while global mode is complete
sharing between namespaces of CIDs (the original behavior).
The mode is set using /proc/sys/net/vsock/ns_mode.
Modes are per-netns and write-once. This allows a system to configure
namespaces independently (some may share CIDs, others are completely
isolated). This also supports future possible mixed use cases, where
there may be namespaces in global mode spinning up VMs while there are
mixed mode namespaces that provide services to the VMs, but are not
allowed to allocate from the global CID pool (this mode not implemented
in this series).
If a socket or VM is created when a namespace is global but the
namespace changes to local, the socket or VM will continue working
normally. That is, the socket or VM assumes the mode behavior of the
namespace at the time the socket/VM was created. The original mode is
captured in vsock_create() and so occurs at the time of socket(2) and
accept(2) for sockets and open(2) on /dev/vhost-vsock for VMs. This
prevents a socket/VM connection from suddenly breaking due to a
namespace mode change. Any new sockets/VMs created after the mode change
will adopt the new mode's behavior.
Additionally, added tests for the new namespace features:
tools/testing/selftests/vsock/vmtest.sh
1..30
ok 1 vm_server_host_client
ok 2 vm_client_host_server
ok 3 vm_loopback
ok 4 ns_host_vsock_ns_mode_ok
ok 5 ns_host_vsock_ns_mode_write_once_ok
ok 6 ns_global_same_cid_fails
ok 7 ns_local_same_cid_ok
ok 8 ns_global_local_same_cid_ok
ok 9 ns_local_global_same_cid_ok
ok 10 ns_diff_global_host_connect_to_global_vm_ok
ok 11 ns_diff_global_host_connect_to_local_vm_fails
ok 12 ns_diff_global_vm_connect_to_global_host_ok
ok 13 ns_diff_global_vm_connect_to_local_host_fails
ok 14 ns_diff_local_host_connect_to_local_vm_fails
ok 15 ns_diff_local_vm_connect_to_local_host_fails
ok 16 ns_diff_global_to_local_loopback_local_fails
ok 17 ns_diff_local_to_global_loopback_fails
ok 18 ns_diff_local_to_local_loopback_fails
ok 19 ns_diff_global_to_global_loopback_ok
ok 20 ns_same_local_loopback_ok
ok 21 ns_same_local_host_connect_to_local_vm_ok
ok 22 ns_same_local_vm_connect_to_local_host_ok
ok 23 ns_mode_change_connection_continue_vm_ok
ok 24 ns_mode_change_connection_continue_host_ok
ok 25 ns_mode_change_connection_continue_both_ok
ok 26 ns_delete_vm_ok
ok 27 ns_delete_host_ok
ok 28 ns_delete_both_ok
ok 29 ns_loopback_global_global_late_module_load_ok
ok 30 ns_loopback_local_local_late_module_load_fails
SUMMARY: PASS=30 SKIP=0 FAIL=0
Dependent on series:
https://lore.kernel.org/all/20251022-vsock-selftests-fixes-and-improvements…
Thanks again for everyone's help and reviews!
Signed-off-by: Bobby Eshleman <bobbyeshleman(a)gmail.com>
To: Stefano Garzarella <sgarzare(a)redhat.com>
To: Shuah Khan <shuah(a)kernel.org>
To: David S. Miller <davem(a)davemloft.net>
To: Eric Dumazet <edumazet(a)google.com>
To: Jakub Kicinski <kuba(a)kernel.org>
To: Paolo Abeni <pabeni(a)redhat.com>
To: Simon Horman <horms(a)kernel.org>
To: Stefan Hajnoczi <stefanha(a)redhat.com>
To: Michael S. Tsirkin <mst(a)redhat.com>
To: Jason Wang <jasowang(a)redhat.com>
To: Xuan Zhuo <xuanzhuo(a)linux.alibaba.com>
To: Eugenio Pérez <eperezma(a)redhat.com>
To: K. Y. Srinivasan <kys(a)microsoft.com>
To: Haiyang Zhang <haiyangz(a)microsoft.com>
To: Wei Liu <wei.liu(a)kernel.org>
To: Dexuan Cui <decui(a)microsoft.com>
To: Bryan Tan <bryan-bt.tan(a)broadcom.com>
To: Vishnu Dasa <vishnu.dasa(a)broadcom.com>
To: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: virtualization(a)lists.linux.dev
Cc: netdev(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: kvm(a)vger.kernel.org
Cc: linux-hyperv(a)vger.kernel.org
Cc: berrange(a)redhat.com
Changes in v8:
- Break generic cleanup/refactoring patches into standalone series,
remove those from this series
- Link to dependency: https://lore.kernel.org/all/20251022-vsock-selftests-fixes-and-improvements…
- Link to v7: https://lore.kernel.org/r/20251021-vsock-vmtest-v7-0-0661b7b6f081@meta.com
Changes in v7:
- fix hv_sock build
- break out vmtest patches into distinct, more well-scoped patches
- change `orig_net_mode` to `net_mode`
- many fixes and style changes in per-patch change sets (see individual
patches for specific changes)
- optimize `virtio_vsock_skb_cb` layout
- update commit messages with more useful descriptions
- vsock_loopback: use orig_net_mode instead of current net mode
- add tests for edge cases (ns deletion, mode changing, loopback module
load ordering)
- Link to v6: https://lore.kernel.org/r/20250916-vsock-vmtest-v6-0-064d2eb0c89d@meta.com
Changes in v6:
- define behavior when mode changes to local while socket/VM is alive
- af_vsock: clarify description of CID behavior
- af_vsock: use stronger langauge around CID rules (dont use "may")
- af_vsock: improve naming of buf/buffer
- af_vsock: improve string length checking on proc writes
- vsock_loopback: add space in struct to clarify lock protection
- vsock_loopback: do proper cleanup/unregister on vsock_loopback_exit()
- vsock_loopback: use virtio_vsock_skb_net() instead of sock_net()
- vsock_loopback: set loopback to NULL after kfree()
- vsock_loopback: use pernet_operations and remove callback mechanism
- vsock_loopback: add macros for "global" and "local"
- vsock_loopback: fix length checking
- vmtest.sh: check for namespace support in vmtest.sh
- Link to v5: https://lore.kernel.org/r/20250827-vsock-vmtest-v5-0-0ba580bede5b@meta.com
Changes in v5:
- /proc/net/vsock_ns_mode -> /proc/sys/net/vsock/ns_mode
- vsock_global_net -> vsock_global_dummy_net
- fix netns lookup in vhost_vsock to respect pid namespaces
- add callbacks for vsock_loopback to avoid circular dependency
- vmtest.sh loads vsock_loopback module
- remove vsock_net_mode_can_set()
- change vsock_net_write_mode() to return true/false based on success
- make vsock_net_mode enum instead of u8
- Link to v4: https://lore.kernel.org/r/20250805-vsock-vmtest-v4-0-059ec51ab111@meta.com
Changes in v4:
- removed RFC tag
- implemented loopback support
- renamed new tests to better reflect behavior
- completed suite of tests with permutations of ns modes and vsock_test
as guest/host
- simplified socat bridging with unix socket instead of tcp + veth
- only use vsock_test for success case, socat for failure case (context
in commit message)
- lots of cleanup
Changes in v3:
- add notion of "modes"
- add procfs /proc/net/vsock_ns_mode
- local and global modes only
- no /dev/vhost-vsock-netns
- vmtest.sh already merged, so new patch just adds new tests for NS
- Link to v2:
https://lore.kernel.org/kvm/20250312-vsock-netns-v2-0-84bffa1aa97a@gmail.com
Changes in v2:
- only support vhost-vsock namespaces
- all g2h namespaces retain old behavior, only common API changes
impacted by vhost-vsock changes
- add /dev/vhost-vsock-netns for "opt-in"
- leave /dev/vhost-vsock to old behavior
- removed netns module param
- Link to v1:
https://lore.kernel.org/r/20200116172428.311437-1-sgarzare@redhat.com
Changes in v1:
- added 'netns' module param to vsock.ko to enable the
network namespace support (disabled by default)
- added 'vsock_net_eq()' to check the "net" assigned to a socket
only when 'netns' support is enabled
- Link to RFC: https://patchwork.ozlabs.org/cover/1202235/
---
Bobby Eshleman (14):
vsock: a per-net vsock NS mode state
vsock/virtio: pack struct virtio_vsock_skb_cb
vsock: add netns to vsock skb cb
vsock: add netns to vsock core
vsock/loopback: add netns support
vsock/virtio: add netns to virtio transport common
vhost/vsock: add netns support
selftests/vsock: add namespace helpers to vmtest.sh
selftests/vsock: prepare vm management helpers for namespaces
selftests/vsock: add tests for proc sys vsock ns_mode
selftests/vsock: add namespace tests for CID collisions
selftests/vsock: add tests for host <-> vm connectivity with namespaces
selftests/vsock: add tests for namespace deletion and mode changes
selftests/vsock: add tests for module loading order
MAINTAINERS | 1 +
drivers/vhost/vsock.c | 48 +-
include/linux/virtio_vsock.h | 47 +-
include/net/af_vsock.h | 70 ++-
include/net/net_namespace.h | 4 +
include/net/netns/vsock.h | 22 +
net/vmw_vsock/af_vsock.c | 264 +++++++-
net/vmw_vsock/virtio_transport.c | 7 +-
net/vmw_vsock/virtio_transport_common.c | 21 +-
net/vmw_vsock/vsock_loopback.c | 89 ++-
tools/testing/selftests/vsock/vmtest.sh | 1044 ++++++++++++++++++++++++++++++-
11 files changed, 1532 insertions(+), 85 deletions(-)
---
base-commit: 962ac5ca99a5c3e7469215bf47572440402dfd59
change-id: 20250325-vsock-vmtest-b3a21d2102c2
prerequisite-message-id: <20251022-vsock-selftests-fixes-and-improvements-v1-0-edeb179d6463(a)meta.com>
prerequisite-patch-id: a2eecc3851f2509ed40009a7cab6990c6d7cfff5
prerequisite-patch-id: 501db2100636b9c8fcb3b64b8b1df797ccbede85
prerequisite-patch-id: ba1a2f07398a035bc48ef72edda41888614be449
prerequisite-patch-id: fd5cc5445aca9355ce678e6d2bfa89fab8a57e61
prerequisite-patch-id: 795ab4432ffb0843e22b580374782e7e0d99b909
prerequisite-patch-id: 1499d263dc933e75366c09e045d2125ca39f7ddd
prerequisite-patch-id: f92d99bb1d35d99b063f818a19dcda999152d74c
prerequisite-patch-id: e3296f38cdba6d903e061cff2bbb3e7615e8e671
prerequisite-patch-id: bc4662b4710d302d4893f58708820fc2a0624325
prerequisite-patch-id: f8991f2e98c2661a706183fde6b35e2b8d9aedcf
prerequisite-patch-id: 44bf9ed69353586d284e5ee63d6fffa30439a698
prerequisite-patch-id: d50621bc630eeaf608bbaf260370c8dabf6326df
Best regards,
--
Bobby Eshleman <bobbyeshleman(a)meta.com>
This is a follow-up series of [1]. It tries to fix a possible UAF in the
fops of cros_ec_chardev after the underlying protocol device has gone by
using revocable.
The 1st patch introduces the revocable which is an implementation of ideas
from the talk [2].
The 2nd and 3rd patches add test cases for revocable in Kunit and selftest.
The 4th patch converts existing protocol devices to resource providers
of cros_ec_device.
The 5th - 7th are PoC patches for showing the use case of "Replace file
operations" below.
---
I came out with 2 possible usages of revocable.
1. Use primitive APIs
Use the primitive APIs of revocable directly.
The file operations make sure the resources are available when using them.
This is what the series original proposed[3][4]. Even though it has the
finest grain for accessing the resources, it makes the user code verbose.
Per feedback from the community, I'm looking for some subsystem level
helpers so that user code can be simlper.
2. Replace file operations
Replace filp->f_op to revocable-aware warppers.
The warppers make sure the resources are available in the file operations.
The user code needs to provide a callback .try_access() to tell the wrappers
where/how to *save* the pointers of resources.
Known drawback:
- The warppers reserve the resources for all file operations even if they
might be unused.
- The user code still needs to be revocable-aware.
- The whole file operation becomes a SRCU read-side critical section. Are
there any functions can't be called in the critical section? If there is,
the file operations may not be awared of that.
See 5th - 7th patches for an example usage.
[1] https://lore.kernel.org/chrome-platform/20250721044456.2736300-6-tzungbi@ke…
[2] https://lpc.events/event/17/contributions/1627/
[3] https://lore.kernel.org/chrome-platform/20250912081718.3827390-5-tzungbi@ke…
[4] https://lore.kernel.org/chrome-platform/20250912081718.3827390-6-tzungbi@ke…
v5:
- Rebase onto next-20251015.
- Add more context about the PoC.
- Support multiple revocable providers in the PoC.
v4: https://lore.kernel.org/chrome-platform/20250923075302.591026-1-tzungbi@ker…
- Rebase onto next-20250922.
- Remove the 5th patch from v3.
- Add fops replacement PoC in 5th - 7th patches.
v3: https://lore.kernel.org/chrome-platform/20250912081718.3827390-1-tzungbi@ke…
- Rebase onto https://lore.kernel.org/chrome-platform/20250828083601.856083-1-tzungbi@ker…
and next-20250912.
- The 4th patch changed accordingly.
v2: https://lore.kernel.org/chrome-platform/20250820081645.847919-1-tzungbi@ker…
- Rename "ref_proxy" -> "revocable".
- Add test cases in Kunit and selftest.
v1: https://lore.kernel.org/chrome-platform/20250814091020.1302888-1-tzungbi@ke…
Tzung-Bi Shih (7):
revocable: Revocable resource management
revocable: Add Kunit test cases
selftests: revocable: Add kselftest cases
platform/chrome: Protect cros_ec_device lifecycle with revocable
revocable: Add fops replacement
char: misc: Leverage revocable fops replacement
platform/chrome: cros_ec_chardev: Secure cros_ec_device via revocable
.../driver-api/driver-model/index.rst | 1 +
.../driver-api/driver-model/revocable.rst | 87 +++++++
MAINTAINERS | 9 +
drivers/base/Kconfig | 8 +
drivers/base/Makefile | 5 +-
drivers/base/revocable.c | 233 ++++++++++++++++++
drivers/base/revocable_test.c | 110 +++++++++
drivers/char/misc.c | 8 +
drivers/platform/chrome/cros_ec.c | 5 +
drivers/platform/chrome/cros_ec_chardev.c | 22 +-
fs/Makefile | 2 +-
fs/fs_revocable.c | 154 ++++++++++++
include/linux/fs.h | 2 +
include/linux/fs_revocable.h | 21 ++
include/linux/miscdevice.h | 4 +
include/linux/platform_data/cros_ec_proto.h | 4 +
include/linux/revocable.h | 53 ++++
tools/testing/selftests/Makefile | 1 +
.../selftests/drivers/base/revocable/Makefile | 7 +
.../drivers/base/revocable/revocable_test.c | 116 +++++++++
.../drivers/base/revocable/test-revocable.sh | 39 +++
.../base/revocable/test_modules/Makefile | 10 +
.../revocable/test_modules/revocable_test.c | 188 ++++++++++++++
23 files changed, 1086 insertions(+), 3 deletions(-)
create mode 100644 Documentation/driver-api/driver-model/revocable.rst
create mode 100644 drivers/base/revocable.c
create mode 100644 drivers/base/revocable_test.c
create mode 100644 fs/fs_revocable.c
create mode 100644 include/linux/fs_revocable.h
create mode 100644 include/linux/revocable.h
create mode 100644 tools/testing/selftests/drivers/base/revocable/Makefile
create mode 100644 tools/testing/selftests/drivers/base/revocable/revocable_test.c
create mode 100755 tools/testing/selftests/drivers/base/revocable/test-revocable.sh
create mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/Makefile
create mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/revocable_test.c
--
2.51.0.788.g6d19910ace-goog