This patch series introduces LANDLOCK_SCOPE_MEMFD_EXEC, a new Landlock
scoping mechanism that restricts execution of anonymous memory file
descriptors (memfd) created via memfd_create(2). This addresses security
gaps where processes can bypass W^X policies and execute arbitrary code
through anonymous memory objects.
Closes: https://github.com/landlock-lsm/linux/issues/37
SECURITY PROBLEM
================
Current Landlock filesystem restrictions do not cover memfd objects,
allowing processes to:
1. Write-to-execute bypass: Create a writable memfd, inject code,
then execute it via mmap(PROT_EXEC) or direct execve()
2. Anonymous execution: Execute code without touching the filesystem via
execve("/proc/self/fd/N") where N is a memfd descriptor (see the
sketch below)
3. Cross-domain access violations: Pass memfd between processes to
bypass domain restrictions
These scenarios can occur in sandboxed environments where filesystem
access is restricted but memfd creation remains possible.
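As an illustration of scenario 2, a minimal userspace sketch (a
hypothetical reproducer, not part of this series; a real attacker would
write a complete ELF image into the memfd):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	char path[64];
	char *const argv[] = { "payload", NULL };
	int fd = memfd_create("payload", MFD_CLOEXEC);

	if (fd < 0)
		return 1;
	/* An attacker would write a complete ELF image here (elided). */
	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	execve(path, argv, NULL);	/* no filesystem object is ever touched */
	perror("execve");
	return 1;
}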
IMPLEMENTATION
==============
The implementation adds hierarchical execution control through domain
scoping:
Core Components:
- is_memfd_file(): Reliable memfd detection via the "memfd:" dentry
  prefix (see the sketch after this list)
- domain_is_scoped(): Cross-domain hierarchy checking (moved to domain.c)
- LSM hooks: mmap_file, file_mprotect, bprm_creds_for_exec
- Creation-time restrictions: hook_file_alloc_security
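A minimal sketch of that detection as I understand it (illustrative,
not the literal patch code):

/* memfd_create() names the backing dentry "memfd:<name>", so the
 * prefix is sufficient to identify memfd-backed files. */
static bool is_memfd_file(const struct file *file)
{
	const struct dentry *dentry = file ? file->f_path.dentry : NULL;

	return dentry &&
	       !strncmp((const char *)dentry->d_name.name, "memfd:", 6);
}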
Security Matrix:
Execution decisions follow domain hierarchy rules preventing both
same-domain bypass attempts and cross-domain access violations while
preserving legitimate hierarchical access patterns.
Domain Hierarchy with LANDLOCK_SCOPE_MEMFD_EXEC:
================================================
Root (no domain) - No restrictions
|
+-- Domain A [SCOPE_MEMFD_EXEC] Layer 1
| +-- memfd_A (tagged with Domain A as creator)
| |
| +-- Domain A1 (child) [NO SCOPE] Layer 2
| | +-- Inherits Layer 1 restrictions from parent
| | +-- memfd_A1 (can create, inherits restrictions)
| | +-- Domain A1a [SCOPE_MEMFD_EXEC] Layer 3
| | +-- memfd_A1a (tagged with Domain A1a)
| |
| +-- Domain A2 (child) [SCOPE_MEMFD_EXEC] Layer 2
| +-- memfd_A2 (tagged with Domain A2 as creator)
| +-- CANNOT access memfd_A1 (different subtree)
|
+-- Domain B [SCOPE_MEMFD_EXEC] Layer 1
+-- memfd_B (tagged with Domain B as creator)
+-- CANNOT access ANY memfd from Domain A subtree
Execution Decision Matrix:
==========================
Executor ->  |  A  | A1  | A1a | A2  |  B  | Root
Creator      |     |     |     |     |     |
-------------|-----|-----|-----|-----|-----|-----
Domain A     |  X  |  X  |  X  |  X  |  X  |  Y
Domain A1    |  Y  |  X  |  X  |  X  |  X  |  Y
Domain A1a   |  Y  |  Y  |  X  |  X  |  X  |  Y
Domain A2    |  Y  |  X  |  X  |  X  |  X  |  Y
Domain B     |  X  |  X  |  X  |  X  |  X  |  Y
Root         |  Y  |  Y  |  Y  |  Y  |  Y  |  Y
Legend: Y = Execution allowed, X = Execution denied
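For context, userspace opts into the scope the same way as the existing
LANDLOCK_SCOPE_* flags; a hypothetical usage sketch (the flag value
itself comes from the updated UAPI header):

#define _GNU_SOURCE
#include <linux/landlock.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	const struct landlock_ruleset_attr attr = {
		.scoped = LANDLOCK_SCOPE_MEMFD_EXEC,	/* added by this series */
	};
	int ruleset_fd = syscall(SYS_landlock_create_ruleset,
				 &attr, sizeof(attr), 0);

	if (ruleset_fd < 0)
		return 1;
	/* Mandatory before landlock_restrict_self(). */
	prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
	if (syscall(SYS_landlock_restrict_self, ruleset_fd, 0))
		return 1;
	close(ruleset_fd);
	/* memfds created from here on are tagged with this domain and
	 * subject to the decision matrix above. */
	return 0;
}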
Scenarios Covered:
- Direct mmap(PROT_EXEC) on memfd files
- Two-stage mmap(PROT_READ) + mprotect(PROT_EXEC) bypass attempts
  (reproducer sketched below)
- execve("/proc/self/fd/N") anonymous execution
- execveat() and fexecve() file descriptor execution
- Cross-process memfd inheritance and IPC passing
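The two-stage case is the subtle one; a hypothetical reproducer
(illustrative only) of the pattern the file_mprotect hook is meant to
deny:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	const unsigned char ret_insn = 0xc3;	/* x86-64 "ret" */
	int fd = memfd_create("probe", MFD_CLOEXEC);
	void *map;

	if (fd < 0 || ftruncate(fd, 4096) < 0)
		return 1;
	write(fd, &ret_insn, 1);

	/* Stage 1: a PROT_READ mapping looks harmless on its own. */
	map = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED)
		return 1;

	/* Stage 2: flipping the mapping executable afterwards is what
	 * the file_mprotect hook must catch. */
	if (mprotect(map, 4096, PROT_READ | PROT_EXEC) == 0)
		puts("PROT_EXEC granted - W^X bypassed");
	else
		puts("PROT_EXEC denied");
	return 0;
}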
TESTING
=======
All patches have been validated with:
- scripts/checkpatch.pl --strict (clean)
- Selftests covering same-domain restrictions, cross-domain
hierarchy enforcement, and regular file isolation
- KUnit tests for memfd detection edge cases
DISCLAIMER
==========
My understanding of Landlock scoping semantics may be incomplete; this
implementation reflects my reading of the available documentation and
code. I welcome feedback and corrections regarding the scoping logic
and domain hierarchy enforcement.
Signed-off-by: Abhinav Saxena <xandfury@gmail.com>
---
Abhinav Saxena (4):
landlock: add LANDLOCK_SCOPE_MEMFD_EXEC scope
landlock: implement memfd detection
landlock: add memfd exec LSM hooks and scoping
selftests/landlock: add memfd execution tests
include/uapi/linux/landlock.h | 5 +
security/landlock/.kunitconfig | 1 +
security/landlock/audit.c | 4 +
security/landlock/audit.h | 1 +
security/landlock/cred.c | 14 -
security/landlock/domain.c | 67 ++++
security/landlock/domain.h | 4 +
security/landlock/fs.c | 405 ++++++++++++++++++++-
security/landlock/limits.h | 2 +-
security/landlock/task.c | 67 ----
.../selftests/landlock/scoped_memfd_exec_test.c | 325 +++++++++++++++++
11 files changed, 812 insertions(+), 83 deletions(-)
---
base-commit: 5b74b2eff1eeefe43584e5b7b348c8cd3b723d38
change-id: 20250716-memfd-exec-ac0d582018c3
Best regards,
--
Abhinav Saxena <xandfury@gmail.com>
Currently the vDSO selftests use the time-related types from libc.
This works on glibc by chance today but will break with other libc
implementations or on distributions which switch to 64-bit times
everywhere.
The kernel's UAPI headers provide the proper types to use with the vDSO
(and raw syscalls) but are not necessarily compatible with libc types.
Introduce new headers that make the UAPI types compatible with the
libc. The series also contains some related cleanups.
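To make that concrete, a sketch of the kind of definitions such a
header can carry (vdso_time_t is named in the v2 changelog below; the
rest is illustrative):

/* Fixed-ABI types for calling the vDSO, independent of the libc. */
#include <asm/posix_types.h>	/* __kernel_old_time_t */
#include <linux/time_types.h>	/* struct __kernel_timespec */

typedef __kernel_old_time_t vdso_time_t;

/* Callers then pass kernel UAPI structures straight to the vDSO,
 * e.g. a clock_gettime symbol taking struct __kernel_timespec *,
 * with no dependency on the libc's time_t width. */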
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
---
Changes in v2:
- Use __kernel_old_time_t in vdso_time_t.
- Add vdso_syscalls.h.
- Add a test for the time() function.
- Validate return value of syscall(clock_getres) in vdso_test_abi.
- Link to v1: https://lore.kernel.org/r/20251111-vdso-test-types-v1-0-03b31f88c659@linutr…
---
Thomas Weißschuh (14):
Revert "selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers"
selftests: vDSO: Introduce vdso_types.h
selftests: vDSO: Introduce vdso_syscalls.h
selftests: vDSO: vdso_test_gettimeofday: Remove nolibc checks
selftests: vDSO: vdso_test_gettimeofday: Use types from vdso_types.h
selftests: vDSO: vdso_test_abi: Use types from vdso_types.h
selftests: vDSO: vdso_test_abi: Validate return value of syscall(clock_getres)
selftests: vDSO: vdso_test_abi: Use system call wrappers from vdso_syscalls.h
selftests: vDSO: vdso_test_correctness: Drop SYS_getcpu fallbacks
selftests: vDSO: vdso_test_correctness: Make ts_leq() and tv_leq() more generic
selftests: vDSO: vdso_test_correctness: Use types from vdso_types.h
selftests: vDSO: vdso_test_correctness: Use system call wrappers from vdso_syscalls.h
selftests: vDSO: vdso_test_correctness: Use facilities from parse_vdso.c
selftests: vDSO: vdso_test_correctness: Add a test for time()
tools/testing/selftests/vDSO/Makefile | 6 +-
tools/testing/selftests/vDSO/parse_vdso.c | 3 +-
tools/testing/selftests/vDSO/vdso_syscalls.h | 93 ++++++++++
tools/testing/selftests/vDSO/vdso_test_abi.c | 46 +++--
.../testing/selftests/vDSO/vdso_test_correctness.c | 190 +++++++++++----------
.../selftests/vDSO/vdso_test_gettimeofday.c | 9 +-
tools/testing/selftests/vDSO/vdso_types.h | 70 ++++++++
7 files changed, 285 insertions(+), 132 deletions(-)
---
base-commit: 1b2eb8c1324859864f4aa79dc3cfbb2f7ef5c524
change-id: 20251110-vdso-test-types-68ce0c712b79
Best regards,
--
Thomas Weißschuh <thomas.weissschuh@linutronix.de>
test_memcg_sock() currently requires that memory.stat's "sock " counter
is exactly zero immediately after the TCP server exits. On a busy system
this assumption is too strict:
- Socket memory may be freed with a small delay (e.g. RCU callbacks).
- memcg statistics are updated asynchronously via the rstat flushing
worker, so the "sock " value in memory.stat can stay non-zero for a
short period of time even after all socket memory has been uncharged.
As a result, test_memcg_sock() can intermittently fail even though socket
memory accounting is working correctly.
Make the test more robust by polling memory.stat for the "sock "
counter and allowing it some time to drop to zero instead of checking
it only once. The timeout is set to 3 seconds to cover the periodic
rstat flush interval (FLUSH_TIME = 2*HZ by default) plus some
scheduling slack. If the counter does not become zero within the
timeout, the test still fails as before.
On my test system, running test_memcontrol 50 times produced:
- Before this patch: 6/50 runs passed.
- After this patch: 50/50 runs passed.
Suggested-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
---
v2:
- Mention the periodic rstat flush interval (FLUSH_TIME = 2*HZ) in
the comment and clarify the rationale for the 3s timeout.
- Replace the hard-coded retry count and wait interval with macros
to avoid magic numbers and make the 3s timeout calculation explicit.
---
.../selftests/cgroup/test_memcontrol.c | 30 ++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4e1647568c5b..7bea656658a2 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -24,6 +24,9 @@
 static bool has_localevents;
 static bool has_recursiveprot;
 
+#define MEMCG_SOCKSTAT_WAIT_RETRIES 30 /* 3s total */
+#define MEMCG_SOCKSTAT_WAIT_INTERVAL_US (100 * 1000) /* 100 ms */
+
 int get_temp_fd(void)
 {
 	return open(".", O_TMPFILE | O_RDWR | O_EXCL);
@@ -1384,6 +1387,8 @@ static int test_memcg_sock(const char *root)
 	int bind_retries = 5, ret = KSFT_FAIL, pid, err;
 	unsigned short port;
 	char *memcg;
+	long sock_post = -1;
+	int i;
 
 	memcg = cg_name(root, "memcg_test");
 	if (!memcg)
@@ -1432,7 +1437,30 @@ static int test_memcg_sock(const char *root)
 	if (cg_read_long(memcg, "memory.current") < 0)
 		goto cleanup;
 
-	if (cg_read_key_long(memcg, "memory.stat", "sock "))
+	/*
+	 * memory.stat is updated asynchronously via the memcg rstat
+	 * flushing worker, which runs periodically (every 2 seconds,
+	 * see FLUSH_TIME). On a busy system, the "sock " counter may
+	 * stay non-zero for a short period of time after the TCP
+	 * connection is closed and all socket memory has been
+	 * uncharged.
+	 *
+	 * Poll memory.stat for up to 3 seconds (~FLUSH_TIME plus some
+	 * scheduling slack) and require that the "sock " counter
+	 * eventually drops to zero.
+	 */
+	for (i = 0; i < MEMCG_SOCKSTAT_WAIT_RETRIES; i++) {
+		sock_post = cg_read_key_long(memcg, "memory.stat", "sock ");
+		if (sock_post < 0)
+			goto cleanup;
+
+		if (!sock_post)
+			break;
+
+		usleep(MEMCG_SOCKSTAT_WAIT_INTERVAL_US);
+	}
+
+	if (sock_post)
 		goto cleanup;
 
 	ret = KSFT_PASS;
--
2.25.1