- Linux-kselftest-mirror - lists.linaro.org

[PATCH v4 00/23] iommufd: Add vIOMMU infrastructure (Part-4 HW QUEUE)

by Nicolin Chen

The vIOMMU object is designed to represent a slice of an IOMMU HW for its virtualization features shared with or passed to user space (mostly a VM) by way of HW acceleration. This extends the HWPT-based design for more advanced virtualization features.

HW QUEUE, introduced by this series as a part of the vIOMMU infrastructure, represents a HW accelerated queue/buffer for a VM to use exclusively, e.g.
 - NVIDIA's Virtual Command Queue
 - AMD vIOMMU's Command Buffer, Event Log Buffer, and PPR Log Buffer
each of which allows its IOMMU HW to directly access a queue memory owned by a guest VM and allows a guest OS to control the HW queue directly, avoiding VM Exit overheads to improve performance.

Introduce IOMMUFD_OBJ_HW_QUEUE and its pairing IOMMUFD_CMD_HW_QUEUE_ALLOC, allowing the VMM to forward the IOMMU-specific queue info, such as queue base address, size, etc.

Meanwhile, a guest-owned queue needs the guest kernel to control the queue by reading/writing its consumer and producer indexes, via MMIO accesses to the hardware MMIO registers. Introduce an mmap infrastructure for iommufd to support passing through a piece of MMIO region from the host physical address space to the guest physical address space. The mmap info (offset/length) used by an mmap syscall must be pre-allocated and returned to user space via an output driver-data during an IOMMUFD_CMD_HW_QUEUE_ALLOC call. Thus, it requires driver-specific user data support in the vIOMMU allocation flow.

As a real-world use case, this series implements HW QUEUE support in the tegra241-cmdqv driver for VCMDQs on NVIDIA Grace CPU. In other words, it is also the Tegra CMDQV series Part-2 (user-space support), reworked from the previous RFCv1:
    https://lore.kernel.org/all/cover.1712978212.git.nicolinc@nvidia.com/
This enables the HW accelerated feature for NVIDIA Grace CPU.
Compared to the standard SMMUv3 operating in the nested translation mode trapping CMDQ for TLBI and ATC_INV commands, this gives a huge performance improvement: 70% to 90% reductions of invalidation time were measured by various DMA unmap tests running in a guest OS.

// Unmap latencies from "dma_map_benchmark -g @granule -t @threads",
// by toggling "/sys/kernel/debug/iommu/tegra241_cmdqv/bypass_vcmdq"
@granule | @threads | bypass_vcmdq=1 | bypass_vcmdq=0
    4KB        1          35.7 us          5.3 us
   16KB        1          41.8 us          6.8 us
   64KB        1          68.9 us          9.9 us
  128KB        1         109.0 us         12.6 us
  256KB        1         187.1 us         18.0 us
    4KB        2          96.9 us          6.8 us
   16KB        2          97.8 us          7.5 us
   64KB        2         151.5 us         10.7 us
  128KB        2         257.8 us         12.7 us
  256KB        2         443.0 us         17.9 us

This is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_hw_queue-v4

Pairing QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_hw_queue-v4

Changelog
v4
 * Rebase on v6.15-rc5
 * Add Reviewed-by from Vasant
 * Rename "vQUEUE" to "HW QUEUE"
 * Use "offset" and "length" for all mmap-related variables
 * [iommufd] Use u64 for guest PA
 * [iommufd] Fix typo in uAPI doc
 * [iommufd] Rename immap_id to offset
 * [iommufd] Drop the partial-size mmap support
 * [iommufd] Do not replace WARN_ON with WARN_ON_ONCE
 * [iommufd] Use "u64 base_addr" for queue base address
 * [iommufd] Use u64 base_pfn/num_pfns for immap structure
 * [iommufd] Correct the size passed in to mtree_alloc_range()
 * [iommufd] Add IOMMUFD_VIOMMU_FLAG_HW_QUEUE_READS_PA to viommu_ops
v3
 https://lore.kernel.org/all/cover.1746139811.git.nicolinc@nvidia.com/
 * Add Reviewed-by from Baolu, Pranjal, and Alok
 * Revise kdocs, uAPI docs, and commit logs
 * Rename "vCMDQ" back to "vQUEUE" for AMD cases
 * [tegra] Add tegra241_vcmdq_hw_flush_timeout()
 * [tegra] Rename vsmmu_alloc to alloc_vintf_user
 * [tegra] Use writel for SID replacement registers
 * [tegra] Move mmap removal call to vsmmu_destroy op
 * [tegra] Fix revert in tegra241_vintf_alloc_lvcmdq_user()
 * [iommufd] Replace "& ~PAGE_MASK" with PAGE_ALIGNED()
 * [iommufd] Add an object-type "owner" to immap structure
 * [iommufd] Drop the ictx input in the new for-driver APIs
 * [iommufd] Add iommufd_vma_ops to keep track of mmap lifecycle
 * [iommufd] Add viommu-based iommufd_viommu_alloc/destroy_mmap helpers
 * [iommufd] Rename iommufd_ctx_alloc/free_mmap to _iommufd_alloc/destroy_mmap
v2
 https://lore.kernel.org/all/cover.1745646960.git.nicolinc@nvidia.com/
 * Add Reviewed-by from Jason
 * [smmu] Fix vsmmu initial value
 * [smmu] Support impl for hw_info
 * [tegra] Rename "slot" to "vsid"
 * [tegra] Update kdocs and commit logs
 * [tegra] Map/unmap LVCMDQ dynamically
 * [tegra] Refcount the previous LVCMDQ
 * [tegra] Return -EEXIST if LVCMDQ exists
 * [tegra] Simplify VINTF cleanup routine
 * [tegra] Use vmid and s2_domain in vsmmu
 * [tegra] Rename "mmap_pgoff" to "immap_id"
 * [tegra] Add more addr and length validation
 * [iommufd] Add more narrative to mmap's kdoc
 * [iommufd] Add iommufd_struct_depend/undepend()
 * [iommufd] Rename vcmdq_free op to vcmdq_destroy
 * [iommufd] Fix bug in iommu_copy_struct_to_user()
 * [iommufd] Drop is_io from iommufd_ctx_alloc_mmap()
 * [iommufd] Test the queue memory for its contiguity
 * [iommufd] Return -ENXIO if address or length fails
 * [iommufd] Do not change @min_last in mock_viommu_alloc()
 * [iommufd] Generalize TEGRA241_VCMDQ data in core structure
 * [iommufd] Add selftest coverage for IOMMUFD_CMD_VCMDQ_ALLOC
 * [iommufd] Add iopt_pin_pages() to prevent queue memory from unmapping
v1
 https://lore.kernel.org/all/cover.1744353300.git.nicolinc@nvidia.com/

Thanks
Nicolin

Nicolin Chen (23):
  iommufd/viommu: Add driver-allocated vDEVICE support
  iommu: Pass in a driver-level user data structure to viommu_alloc op
  iommufd/viommu: Allow driver-specific user data for a vIOMMU object
  iommu: Add iommu_copy_struct_to_user helper
  iommufd/driver: Let iommufd_viommu_alloc helper save ictx to viommu->ictx
  iommufd/driver: Add iommufd_struct_destroy to revert iommufd_viommu_alloc
  iommufd/selftest: Support user_data in mock_viommu_alloc
  iommufd/selftest: Add covearge for viommu data
  iommufd: Abstract iopt_pin_pages and iopt_unpin_pages helpers
  iommufd/viommu: Introduce IOMMUFD_OBJ_HW_QUEUE and its related struct
  iommufd/viommu: Add IOMMUFD_CMD_HW_QUEUE_ALLOC ioctl
  iommufd/driver: Add iommufd_hw_queue_depend/undepend() helpers
  iommufd/selftest: Add coverage for IOMMUFD_CMD_HW_QUEUE_ALLOC
  iommufd: Add mmap interface
  iommufd/selftest: Add coverage for the new mmap interface
  Documentation: userspace-api: iommufd: Update HW QUEUE
  iommu/arm-smmu-v3-iommufd: Add vsmmu_alloc impl op
  iommu/arm-smmu-v3-iommufd: Support implementation-defined hw_info
  iommu/tegra241-cmdqv: Use request_threaded_irq
  iommu/tegra241-cmdqv: Simplify deinit flow in tegra241_cmdqv_remove_vintf()
  iommu/tegra241-cmdqv: Do not statically map LVCMDQs
  iommu/tegra241-cmdqv: Add user-space use support
  iommu/tegra241-cmdqv: Add IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV support

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h       |  25 +-
 drivers/iommu/iommufd/io_pagetable.h              |   8 +
 drivers/iommu/iommufd/iommufd_private.h           |  28 +-
 drivers/iommu/iommufd/iommufd_test.h              |  20 +
 include/linux/iommu.h                             |  43 +-
 include/linux/iommufd.h                           | 186 ++++++-
 include/uapi/linux/iommufd.h                      | 116 ++++-
 tools/testing/selftests/iommu/iommufd_utils.h     |  52 +-
 .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c         |  43 +-
 .../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c        | 490 +++++++++++++++++-
 drivers/iommu/iommufd/device.c                    | 117 +----
 drivers/iommu/iommufd/driver.c                    |  94 ++++
 drivers/iommu/iommufd/io_pagetable.c              |  95 ++++
 drivers/iommu/iommufd/main.c                      |  80 ++-
 drivers/iommu/iommufd/selftest.c                  | 133 ++++-
 drivers/iommu/iommufd/viommu.c                    | 121 ++++-
 tools/testing/selftests/iommu/iommufd.c           |  97 +++-
 .../selftests/iommu/iommufd_fail_nth.c            |  11 +-
 Documentation/userspace-api/iommufd.rst           |  12 +
 19 files changed, 1577 insertions(+), 194 deletions(-)

base-commit: 92a09c47464d040866cf2b4cd052bc60555185fb
--
2.43.0
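The mmap flow described in the cover letter above (an offset/length pair returned by the allocation call, then handed to the mmap syscall) might look roughly like this from user space. This is only a sketch: struct example_mmap_info and example_map_region() are hypothetical names, not part of the series, and the whole-page check is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical holder for the mmap info that would come back through the
 * output driver-data; the field names are illustrative only.
 */
struct example_mmap_info {
	uint64_t offset;	/* passed as the mmap() offset argument */
	uint64_t length;	/* passed as the mmap() length argument */
};

/* Map the region described by @info from @fd; MAP_FAILED on any error. */
static void *example_map_region(int fd, const struct example_mmap_info *info)
{
	long ps = sysconf(_SC_PAGESIZE);
	uint64_t page;

	assert(ps > 0);
	page = (uint64_t)ps;

	/* Assumption: only whole, page-aligned regions are accepted. */
	if (info->length == 0 || info->offset % page || info->length % page)
		return MAP_FAILED;

	return mmap(NULL, (size_t)info->length, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, (off_t)info->offset);
}
```

A caller would pass the iommufd file descriptor as @fd and the values it received from the allocation call, unmodified, as @info.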


[PATCH v1 0/6] VMM can handle guest SEA via KVM_EXIT_ARM_SEA

by Jiaqi Yan

Problem
=======

When host APEI is unable to claim a synchronous external abort (SEA) during a stage-2 guest abort, KVM today directly injects an async SError into the VCPU and then resumes it. The injected SError usually results in an unpleasant guest kernel panic.

One major cause of guest SEA is a VCPU consuming a recoverable uncorrected memory error (UER), which is not uncommon at all in modern datacenter servers with large amounts of physical memory. Although SError and guest panic are sufficient to stop the propagation of corrupted memory, there is still room to recover from memory UER in a more graceful manner.

Proposed Solution
=================

Alternatively, KVM can replay the SEA to the faulting VCPU via the existing KVM_SET_VCPU_EVENTS API. If the memory poison consumption or the fault that causes the SEA is not from the guest kernel, the blast radius can be limited to the consuming or faulting guest userspace process, so the VM can keep running.

In addition, instead of handling the SEA under the hood without involving userspace, there are benefits to redirecting it to the VMM:

- VM customers care about the disruptions caused by memory errors, and the VMM usually has the responsibility to start the process of notifying the customers of memory error events in their VMs. For example, some cloud providers emit a critical log in their observability UI [1] and provide a playbook for customers on how to mitigate disruptions to their workloads.

- The VMM can protect against future memory error consumption or faults by unmapping the poisoned pages from the stage-2 page table with KVM userfault [2], which is more performant than splitting the memslot that contains the poisoned guest pages.

- The VMM can keep track of SEA events in the VM. When the VMM thinks the status of the host or the VM is bad enough, e.g. the number of distinct SEAs exceeds a threshold, it can restart the VM on another healthy host.

- Behavior parity with x86 architecture.
  When a machine check exception (MCE) is caused by a VCPU, the kernel or KVM signals userspace with SIGBUS to let the VMM either recover from the MCE or terminate itself along with the VM. The prior RFC proposed implementing SIGBUS on arm64 as well, but Marc preferred a VCPU exit over a signal [3]. However, implementation aside, returning SEA to the VMM is on par with returning MCE to the VMM.

Once SEA is redirected to the VMM, among other actions, the VMM is encouraged to inject external aborts into the faulting VCPU. KVM on arm64 already supports this; KVM_SET_VCPU_EVENTS does not fully cover it today, and this patchset fills that gap.

New UAPIs
=========

This patchset introduces the following userspace-visible changes to empower the VMM to control what happens next for a guest SEA:

- KVM_CAP_ARM_SEA_TO_USER. If userspace enables this new capability at VM creation, KVM will not inject SError while taking SEA, but will exit to userspace instead.

- KVM_EXIT_ARM_SEA. This is the VM exit reason the VMM gets. The details about the SEA are provided in arm_sea as much as possible, including the ESR value at EL2, whether the guest virtual and physical addresses (GVA and GPA) are available, and their values if so.

- KVM_CAP_ARM_INJECT_EXT_IABT. The VMM today can inject an external data abort into a VCPU via the KVM_SET_VCPU_EVENTS API. However, in the case of an instruction abort, the VMM cannot inject it via KVM_SET_VCPU_EVENTS. KVM_CAP_ARM_INJECT_EXT_IABT is a natural extension of KVM_CAP_ARM_INJECT_EXT_DABT that tells the VMM that KVM_SET_VCPU_EVENTS now supports external instruction aborts.

The patchset utilizes commit 26fbdf369227 ("KVM: arm64: Don't translate FAR if invalid/unsafe") from [4], available already in kvmarm/next. [4] makes KVM safely do address translation for HPFAR_EL2, including in the event of SEA, and indicate whether HPFAR_EL2 is valid in the NS bit. This patchset depends on [4] to tell userspace whether the GPA is valid and, if so, its value.
Patchset is based on commit 68ec8b4e84446 ("Merge branch kvm-arm64/pkvm-6.16 into kvmarm-master/next")

[1] https://cloud.google.com/solutions/sap/docs/manage-host-errors
[2] https://lpc.events/event/18/contributions/1757/attachments/1442/3073/LPC_%2…
[3] https://lore.kernel.org/kvm/86pljbqqh0.wl-maz@kernel.org
[4] https://lore.kernel.org/all/174369514508.3034362.13165690020799838042.b4-ty…

Jiaqi Yan (5):
  KVM: arm64: VM exit to userspace to handle SEA
  KVM: arm64: Set FnV for VCPU when FAR_EL2 is invalid
  KVM: selftests: Test for KVM_EXIT_ARM_SEA and KVM_CAP_ARM_SEA_TO_USER
  KVM: selftests: Test for KVM_CAP_INJECT_EXT_IABT
  Documentation: kvm: new uAPI for handling SEA

Raghavendra Rao Ananta (1):
  KVM: arm64: Allow userspace to inject external instruction aborts

 Documentation/virt/kvm/api.rst                    | 120 ++++++-
 arch/arm64/include/asm/kvm_emulate.h              |  12 +
 arch/arm64/include/asm/kvm_host.h                 |   8 +
 arch/arm64/include/asm/kvm_ras.h                  |  21 +-
 arch/arm64/include/uapi/asm/kvm.h                 |   3 +-
 arch/arm64/kvm/Makefile                           |   3 +-
 arch/arm64/kvm/arm.c                              |   6 +
 arch/arm64/kvm/guest.c                            |  13 +-
 arch/arm64/kvm/inject_fault.c                     |   3 +
 arch/arm64/kvm/kvm_ras.c                          |  54 +++
 arch/arm64/kvm/mmu.c                              |  12 +-
 include/uapi/linux/kvm.h                          |  12 +
 tools/arch/arm64/include/uapi/asm/kvm.h           |   3 +-
 tools/testing/selftests/kvm/Makefile.kvm          |   2 +
 .../testing/selftests/kvm/arm64/inject_iabt.c     | 100 ++++++
 .../testing/selftests/kvm/arm64/sea_to_user.c     | 324 ++++++++++++++++++
 tools/testing/selftests/kvm/lib/kvm_util.c        |   1 +
 17 files changed, 654 insertions(+), 43 deletions(-)
 create mode 100644 arch/arm64/kvm/kvm_ras.c
 create mode 100644 tools/testing/selftests/kvm/arm64/inject_iabt.c
 create mode 100644 tools/testing/selftests/kvm/arm64/sea_to_user.c

--
2.49.0.967.g6a0df3ecc3-goog
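The VMM-side policy mentioned in the cover letter above (track SEA events and restart the VM elsewhere once the number of distinct SEAs exceeds a threshold) could be sketched as below. Everything here is hypothetical: the tracker structure, the action names, and the fixed capacity are illustrative and do not come from the patchset or from any existing VMM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_TRACKED 64	/* illustrative capacity, not from the series */

enum example_sea_action {
	EXAMPLE_SEA_INJECT_ABORT,	/* replay the abort into the faulting VCPU */
	EXAMPLE_SEA_RESTART_ELSEWHERE,	/* too many distinct SEAs on this host */
};

struct example_sea_tracker {
	uint64_t gpa[EXAMPLE_MAX_TRACKED];	/* distinct faulting GPAs seen so far */
	size_t count;				/* number of valid entries in gpa[] */
	size_t threshold;			/* restart once count exceeds this */
};

/* Record one SEA at @gpa and decide what the VMM should do next. */
static enum example_sea_action
example_sea_record(struct example_sea_tracker *t, uint64_t gpa)
{
	bool seen = false;
	size_t i;

	/* The threshold must be reachable within the table capacity. */
	assert(t->threshold < EXAMPLE_MAX_TRACKED);

	for (i = 0; i < t->count; i++) {
		if (t->gpa[i] == gpa) {
			seen = true;
			break;
		}
	}

	/* Only distinct addresses count towards the threshold. */
	if (!seen && t->count < EXAMPLE_MAX_TRACKED)
		t->gpa[t->count++] = gpa;

	return t->count > t->threshold ? EXAMPLE_SEA_RESTART_ELSEWHERE
				       : EXAMPLE_SEA_INJECT_ABORT;
}
```

Keying the tracker on the faulting GPA is one possible choice; a VMM could just as well key on host physical pages or on time windows.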


[PATCH] kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests

by Richard Fitzgerald

Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests.config. This helps to detect use of uninitialized local variables.

This option found an uninitialized data bug in the cs_dsp test.

Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
---
 tools/testing/kunit/configs/all_tests.config | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/kunit/configs/all_tests.config b/tools/testing/kunit/configs/all_tests.config
index cdd9782f9646..4a60bb71fe72 100644
--- a/tools/testing/kunit/configs/all_tests.config
+++ b/tools/testing/kunit/configs/all_tests.config
@@ -10,6 +10,7 @@ CONFIG_KUNIT_EXAMPLE_TEST=y
 CONFIG_KUNIT_ALL_TESTS=y
 CONFIG_FORTIFY_SOURCE=y
+CONFIG_INIT_STACK_ALL_PATTERN=y
 CONFIG_IIO=y
--
2.39.5
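To illustrate why pattern-initialized stack variables make such bugs visible, the sketch below fills a structure with a repeating byte and shows that an unset length field then fails a simple bounds check. It is a user-space simulation, not kernel code: the 0xAA fill byte is an assumption about what the compiler's pattern initialization typically uses, and struct example_msg and both helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct example_msg {
	uint32_t len;		/* number of valid bytes in data[] */
	uint8_t data[16];
};

/*
 * Stand-in for the compiler's pattern initialization of a local variable.
 * Assumption: a repeating 0xAA byte is used as the pattern.
 */
static void example_pattern_fill(void *p, size_t n)
{
	assert(p != NULL);
	memset(p, 0xAA, n);
}

/* Return 0 if the length field is within bounds, -1 otherwise. */
static int example_check_len(const struct example_msg *m)
{
	return m->len <= sizeof(m->data) ? 0 : -1;
}
```

With a zero-filled stack the same forgotten assignment would leave len at 0 and pass the check silently, which is why a non-zero pattern tends to surface the bug in a test run.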


[PATCH v8 0/9] ublk: decouple server threads from ublk_queues/hctxs

by Uday Shankar

This patch set aims to allow ublk server threads to better balance load amongst themselves by decoupling server threads from ublk_queues/hctxs, so that multiple threads can service I/Os that are issued from a single CPU. This can improve performance for workloads in which ublk server CPU is a bottleneck, and for which load is issued from CPUs which are not balanced across ublk_queues/hctxs.

Performance
-----------

First create two ublk devices with:

ublkb0: ./kublk add -t null -q 2 --nthreads 2
ublkb1: ./kublk add -t null -q 2 --nthreads 2 --per_io_tasks

Then run load with:

taskset -c 1 fio/t/io_uring -r5 -p0 /dev/ublkb0: 1.90M IOPS
taskset -c 1 fio/t/io_uring -r5 -p0 /dev/ublkb1: 2.18M IOPS

Since ublkb1 has per-io-tasks, the second command is able to make use of both ublk server worker threads and therefore has increased max throughput.

Caveats:
- This testing was done on a system with 2 numa nodes, but the penalty of having I/O cross a numa (or LLC) boundary in the per_io_tasks case is quite high. So these numbers were obtained after moving all ublk server threads and the application threads to CPUs on the same numa node/LLC.
- One might expect the scaling to be linear - because ublkb1 can make use of twice as many ublk server threads, it should be able to drive twice the throughput. However this is not true (the improvement is ~15%), and needs further investigation.
Signed-off-by: Uday Shankar <ushankar(a)purestorage.com>
---
Changes in v8:
- Fix queue_rqs batch dispatch OOPS when dispatching a list of requests associated to > 1 ublk_queue (Ming Lei, Caleb Sander Mateos)
- Simplify queue_rqs (Caleb Sander Mateos)
- Narrow a couple of types (Ming Lei)
- Add stress test for per io daemons (Ming Lei)
- Link to v7: https://lore.kernel.org/r/20250527-ublk_task_per_io-v7-0-cbdbaf283baa@pures…

Changes in v7:
- Fix queue_rqs batch dispatch for per-io daemons
- Kick round-robin tag allocation changes to a followup
- Add explicit feature flag for per-task daemons (Ming Lei, Caleb Sander Mateos)
- Move some variable assignments to avoid redundant computation (Caleb Sander Mateos)
- Switch from storing pointers in ublk_io to computing based on address with container_of in a couple places (Ming Lei)
- Link to v6: https://lore.kernel.org/r/20250507-ublk_task_per_io-v6-0-a2a298783c01@pures…

Changes in v6:
- Add a feature flag for this feature, called UBLK_F_RR_TAGS (Ming Lei)
- Add test for this feature (Ming Lei)
- Add documentation for this feature (Ming Lei)
- Link to v5: https://lore.kernel.org/r/20250416-ublk_task_per_io-v5-0-9261ad7bff20@pures…

Changes in v5:
- Set io->task before ublk_mark_io_ready (Caleb Sander Mateos)
- Set io->task atomically, read it atomically when needed
- Return 0 on success from command-specific helpers in __ublk_ch_uring_cmd (Caleb Sander Mateos)
- Rename ublk_handle_need_get_data to ublk_get_data (Caleb Sander Mateos)
- Link to v4: https://lore.kernel.org/r/20250415-ublk_task_per_io-v4-0-54210b91a46f@pures…

Changes in v4:
- Drop "ublk: properly serialize all FETCH_REQs" since Ming is taking it in another set
- Prevent data races by marking data structures which should be read-only in the I/O path as const (Ming Lei)
- Link to v3: https://lore.kernel.org/r/20250410-ublk_task_per_io-v3-0-b811e8f4554a@pures…

Changes in v3:
- Check for UBLK_IO_FLAG_ACTIVE on I/O again after taking lock to ensure that two concurrent FETCH_REQs on the same I/O can't succeed (Caleb Sander Mateos)
- Link to v2: https://lore.kernel.org/r/20250408-ublk_task_per_io-v2-0-b97877e6fd50@pures…

Changes in v2:
- Remove changes split into other patches
- To ease error handling/synchronization, associate each I/O (instead of each queue) to the last task that issues a FETCH_REQ against it. Only that task is allowed to operate on the I/O.
- Link to v1: https://lore.kernel.org/r/20241002224437.3088981-1-ushankar@purestorage.com

---
Uday Shankar (9):
      ublk: have a per-io daemon instead of a per-queue daemon
      selftests: ublk: kublk: plumb q_id in io_uring user_data
      selftests: ublk: kublk: tie sqe allocation to io instead of queue
      selftests: ublk: kublk: lift queue initialization out of thread
      selftests: ublk: kublk: move per-thread data out of ublk_queue
      selftests: ublk: kublk: decouple ublk_queues from ublk server threads
      selftests: ublk: add functional test for per io daemons
      selftests: ublk: add stress test for per io daemons
      Documentation: ublk: document UBLK_F_PER_IO_DAEMON

 Documentation/block/ublk.rst                       |  35 ++-
 drivers/block/ublk_drv.c                           | 111 +++----
 include/uapi/linux/ublk_cmd.h                      |   9 +
 tools/testing/selftests/ublk/Makefile              |   2 +
 tools/testing/selftests/ublk/fault_inject.c        |   4 +-
 tools/testing/selftests/ublk/file_backed.c         |  20 +-
 tools/testing/selftests/ublk/kublk.c               | 344 ++++++++++++++-------
 tools/testing/selftests/ublk/kublk.h               |  73 +++--
 tools/testing/selftests/ublk/null.c                |  22 +-
 tools/testing/selftests/ublk/stripe.c              |  17 +-
 tools/testing/selftests/ublk/test_common.sh        |   5 +
 tools/testing/selftests/ublk/test_generic_12.sh    |  55 ++++
 tools/testing/selftests/ublk/test_stress_06.sh     |  36 +++
 .../selftests/ublk/trace/count_ios_per_tid.bt      |  11 +
 14 files changed, 512 insertions(+), 232 deletions(-)
---
base-commit: 533c87e2ed742454957f14d7bef9f48d5a72e72d
change-id: 20250408-ublk_task_per_io-c693cf608d7a

Best regards,
--
Uday Shankar <ushankar(a)purestorage.com>
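The difference between a per-queue daemon and per-I/O daemons, as described in the cover letter above, can be pictured with two small mapping functions. This is a sketch under assumptions: the round-robin formula and both function names are hypothetical and are not taken from ublk_drv.c or kublk; they only show how I/Os from a single queue can end up on more than one server thread.

```c
#include <assert.h>

/*
 * Per-queue model: every I/O of queue @q_id is handled by the same thread.
 * Hypothetical mapping, for illustration only.
 */
static unsigned int example_thread_per_queue(unsigned int q_id,
					      unsigned int nthreads)
{
	assert(nthreads > 0);
	return q_id % nthreads;
}

/*
 * Per-I/O model: each (queue, tag) pair may be owned by a different thread.
 * Assumption: a simple round-robin over all I/O slots.
 */
static unsigned int example_thread_per_io(unsigned int q_id, unsigned int tag,
					   unsigned int queue_depth,
					   unsigned int nthreads)
{
	assert(nthreads > 0);
	return (q_id * queue_depth + tag) % nthreads;
}
```

With two threads, all tags of queue 0 map to thread 0 in the first model, while tags 0 and 1 of queue 0 map to threads 0 and 1 in the second. That is consistent with the single-CPU benchmark above reaching both workers, where 2.18M / 1.90M IOPS is roughly the ~15% improvement quoted.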


[PATCH bpf-next 1/2] bpftool: Use appropriate permissions for map access

by Slava Imameev

Modify several functions in tools/bpf/bpftool/common.c to allow specification of requested access for file descriptors, such as read-only access.

Update bpftool to request only read access for maps when write access is not required. This fixes errors when reading from maps that are protected from modification via security_bpf_map.

Signed-off-by: Slava Imameev <slava.imameev(a)crowdstrike.com>
---
 tools/bpf/bpftool/btf.c           |  3 +-
 tools/bpf/bpftool/common.c        | 57 ++++++++++++++++++++++---------
 tools/bpf/bpftool/iter.c          |  2 +-
 tools/bpf/bpftool/link.c          |  2 +-
 tools/bpf/bpftool/main.h          | 13 ++++---
 tools/bpf/bpftool/map.c           | 56 +++++++++++++++++-------------
 tools/bpf/bpftool/map_perf_ring.c |  3 +-
 tools/bpf/bpftool/prog.c          |  4 +--
 8 files changed, 90 insertions(+), 50 deletions(-)

diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
index 6b14cbfa58aa..1ba27cb03348 100644
--- a/tools/bpf/bpftool/btf.c
+++ b/tools/bpf/bpftool/btf.c
@@ -905,7 +905,8 @@ static int do_dump(int argc, char **argv)
 			return -1;
 		}
-		fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+		fd = map_parse_fd_and_info(&argc, &argv, &info, &len,
+					   BPF_F_RDONLY);
 		if (fd < 0)
 			return -1;
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index ecfa790adc13..ff1c99281beb 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -193,7 +193,8 @@ int mount_tracefs(const char *target)
 	return err;
 }
-int open_obj_pinned(const char *path, bool quiet)
+int open_obj_pinned(const char *path, bool quiet,
+		    const struct bpf_obj_get_opts *opts)
 {
 	char *pname;
 	int fd = -1;
@@ -205,7 +206,7 @@ int open_obj_pinned(const char *path, bool quiet)
 		goto out_ret;
 	}
-	fd = bpf_obj_get(pname);
+	fd = bpf_obj_get_opts(pname, opts);
 	if (fd < 0) {
 		if (!quiet)
 			p_err("bpf obj get (%s): %s", pname,
@@ -221,12 +222,13 @@ int open_obj_pinned(const char *path, bool quiet)
 	return fd;
 }
-int open_obj_pinned_any(const char *path, enum bpf_obj_type exp_type)
+int open_obj_pinned_any(const char *path, enum bpf_obj_type exp_type,
+			const struct bpf_obj_get_opts *opts)
 {
 	enum bpf_obj_type type;
 	int fd;
-	fd = open_obj_pinned(path, false);
+	fd = open_obj_pinned(path, false, opts);
 	if (fd < 0)
 		return -1;
@@ -555,7 +557,7 @@ static int do_build_table_cb(const char *fpath, const struct stat *sb,
 	if (typeflag != FTW_F)
 		goto out_ret;
-	fd = open_obj_pinned(fpath, true);
+	fd = open_obj_pinned(fpath, true, NULL);
 	if (fd < 0)
 		goto out_ret;
@@ -928,7 +930,7 @@ int prog_parse_fds(int *argc, char ***argv, int **fds)
 		path = **argv;
 		NEXT_ARGP();
-		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_PROG);
+		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_PROG, NULL);
 		if ((*fds)[0] < 0)
 			return -1;
 		return 1;
@@ -965,7 +967,8 @@ int prog_parse_fd(int *argc, char ***argv)
 	return fd;
 }
-static int map_fd_by_name(char *name, int **fds)
+static int map_fd_by_name(char *name, int **fds,
+			  const struct bpf_get_fd_by_id_opts *opts)
 {
 	unsigned int id = 0;
 	int fd, nb_fds = 0;
@@ -973,6 +976,7 @@ static int map_fd_by_name(char *name, int **fds)
 	int err;
 	while (true) {
+		LIBBPF_OPTS(bpf_get_fd_by_id_opts, opts_ro);
 		struct bpf_map_info info = {};
 		__u32 len = sizeof(info);
@@ -985,7 +989,9 @@ static int map_fd_by_name(char *name, int **fds)
 			return nb_fds;
 		}
-		fd = bpf_map_get_fd_by_id(id);
+		/* Request a read-only fd to query the map info */
+		opts_ro.open_flags = BPF_F_RDONLY;
+		fd = bpf_map_get_fd_by_id_opts(id, &opts_ro);
 		if (fd < 0) {
 			p_err("can't get map by id (%u): %s", id,
 			      strerror(errno));
@@ -1004,6 +1010,15 @@ static int map_fd_by_name(char *name, int **fds)
 			continue;
 		}
+		/* Get an fd with the requested options. */
+		close(fd);
+		fd = bpf_map_get_fd_by_id_opts(id, opts);
+		if (fd < 0) {
+			p_err("can't get map by id (%u): %s", id,
+			      strerror(errno));
+			goto err_close_fds;
+		}
+
 		if (nb_fds > 0) {
 			tmp = realloc(*fds, (nb_fds + 1) * sizeof(int));
 			if (!tmp) {
@@ -1023,8 +1038,16 @@ static int map_fd_by_name(char *name, int **fds)
 	return -1;
 }
-int map_parse_fds(int *argc, char ***argv, int **fds)
+int map_parse_fds(int *argc, char ***argv, int **fds, __u32 open_flags)
 {
+	LIBBPF_OPTS(bpf_get_fd_by_id_opts, opts);
+
+	if (open_flags & ~BPF_F_RDONLY) {
+		p_err("invalid open_flags: %x", open_flags);
+		return -1;
+	}
+	opts.open_flags = open_flags;
+
 	if (is_prefix(**argv, "id")) {
 		unsigned int id;
 		char *endptr;
@@ -1038,7 +1061,7 @@ int map_parse_fds(int *argc, char ***argv, int **fds)
 		}
 		NEXT_ARGP();
-		(*fds)[0] = bpf_map_get_fd_by_id(id);
+		(*fds)[0] = bpf_map_get_fd_by_id_opts(id, &opts);
 		if ((*fds)[0] < 0) {
 			p_err("get map by id (%u): %s", id, strerror(errno));
 			return -1;
@@ -1056,16 +1079,18 @@ int map_parse_fds(int *argc, char ***argv, int **fds)
 		}
 		NEXT_ARGP();
-		return map_fd_by_name(name, fds);
+		return map_fd_by_name(name, fds, &opts);
 	} else if (is_prefix(**argv, "pinned")) {
 		char *path;
+		LIBBPF_OPTS(bpf_obj_get_opts, get_opts);
+		get_opts.file_flags = open_flags;
 		NEXT_ARGP();
 		path = **argv;
 		NEXT_ARGP();
-		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_MAP);
+		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_MAP, &get_opts);
 		if ((*fds)[0] < 0)
 			return -1;
 		return 1;
@@ -1075,7 +1100,7 @@ int map_parse_fds(int *argc, char ***argv, int **fds)
 	return -1;
 }
-int map_parse_fd(int *argc, char ***argv)
+int map_parse_fd(int *argc, char ***argv, __u32 open_flags)
 {
 	int *fds = NULL;
 	int nb_fds, fd;
@@ -1085,7 +1110,7 @@ int map_parse_fd(int *argc, char ***argv)
 		p_err("mem alloc failed");
 		return -1;
 	}
-	nb_fds = map_parse_fds(argc, argv, &fds);
+	nb_fds = map_parse_fds(argc, argv, &fds, open_flags);
 	if (nb_fds != 1) {
 		if (nb_fds > 1) {
 			p_err("several maps match this handle");
@@ -1103,12 +1128,12 @@ int map_parse_fd(int *argc, char ***argv)
 }
 int map_parse_fd_and_info(int *argc, char ***argv, struct bpf_map_info *info,
-			  __u32 *info_len)
+			  __u32 *info_len, __u32 open_flags)
 {
 	int err;
 	int fd;
-	fd = map_parse_fd(argc, argv);
+	fd = map_parse_fd(argc, argv, open_flags);
 	if (fd < 0)
 		return -1;
diff --git a/tools/bpf/bpftool/iter.c b/tools/bpf/bpftool/iter.c
index 5c39c2ed36a2..ad318a8667a4 100644
--- a/tools/bpf/bpftool/iter.c
+++ b/tools/bpf/bpftool/iter.c
@@ -37,7 +37,7 @@ static int do_pin(int argc, char **argv)
 			return -1;
 		}
-		map_fd = map_parse_fd(&argc, &argv);
+		map_fd = map_parse_fd(&argc, &argv, 0);
 		if (map_fd < 0)
 			return -1;
diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index 3535afc80a49..8523be11dcd9 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -117,7 +117,7 @@ static int link_parse_fd(int *argc, char ***argv)
 		path = **argv;
 		NEXT_ARGP();
-		return open_obj_pinned_any(path, BPF_OBJ_LINK);
+		return open_obj_pinned_any(path, BPF_OBJ_LINK, NULL);
 	}
 	p_err("expected 'id' or 'pinned', got: '%s'?", **argv);
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 9eb764fe4cc8..6db704fda5c0 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -15,6 +15,7 @@
 #include <bpf/hashmap.h>
 #include <bpf/libbpf.h>
+#include <bpf/bpf.h>
 #include "json_writer.h"
@@ -140,8 +141,10 @@ void get_prog_full_name(const struct bpf_prog_info *prog_info, int prog_fd,
 int get_fd_type(int fd);
 const char *get_fd_type_name(enum bpf_obj_type type);
 char *get_fdinfo(int fd, const char *key);
-int open_obj_pinned(const char *path, bool quiet);
-int open_obj_pinned_any(const char *path, enum bpf_obj_type exp_type);
+int open_obj_pinned(const char *path, bool quiet,
+		    const struct bpf_obj_get_opts *opts);
+int open_obj_pinned_any(const char *path, enum bpf_obj_type exp_type,
+			const struct bpf_obj_get_opts *opts);
 int mount_bpffs_for_file(const char *file_name);
 int create_and_mount_bpffs_dir(const char *dir_name);
 int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(int *, char ***));
@@ -167,10 +170,10 @@ int do_iter(int argc, char **argv) __weak;
 int parse_u32_arg(int *argc, char ***argv, __u32 *val, const char *what);
 int prog_parse_fd(int *argc, char ***argv);
 int prog_parse_fds(int *argc, char ***argv, int **fds);
-int map_parse_fd(int *argc, char ***argv);
-int map_parse_fds(int *argc, char ***argv, int **fds);
+int map_parse_fd(int *argc, char ***argv, __u32 open_flags);
+int map_parse_fds(int *argc, char ***argv, int **fds, __u32 open_flags);
 int map_parse_fd_and_info(int *argc, char ***argv, struct bpf_map_info *info,
-			  __u32 *info_len);
+			  __u32 *info_len, __u32 open_flags);
 struct bpf_prog_linfo;
 #if defined(HAVE_LLVM_SUPPORT) || defined(HAVE_LIBBFD_SUPPORT)
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index 81cc668b4b05..c7dc2eae8ba8 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -337,9 +337,9 @@ static void fill_per_cpu_value(struct bpf_map_info *info, void *value)
 		memcpy(value + i * step, value, info->value_size);
 }
-static int parse_elem(char **argv, struct bpf_map_info *info,
-		      void *key, void *value, __u32 key_size, __u32 value_size,
-		      __u32 *flags, __u32 **value_fd)
+static int parse_elem(char **argv, struct bpf_map_info *info, void *key,
+		      void *value, __u32 key_size, __u32 value_size,
+		      __u32 *flags, __u32 **value_fd, __u32 open_flags)
 {
 	if (!*argv) {
 		if (!key && !value)
@@ -362,7 +362,7 @@ static int parse_elem(char **argv, struct bpf_map_info *info,
 			return -1;
 		return parse_elem(argv, info, NULL, value, key_size, value_size,
-				  flags, value_fd);
+				  flags, value_fd, open_flags);
 	} else if (is_prefix(*argv, "value")) {
 		int fd;
@@ -388,7 +388,7 @@ static int parse_elem(char **argv, struct bpf_map_info *info,
 				return -1;
 			}
-			fd = map_parse_fd(&argc, &argv);
+			fd = map_parse_fd(&argc, &argv, open_flags);
 			if (fd < 0)
 				return -1;
@@ -424,7 +424,7 @@ static int parse_elem(char **argv, struct bpf_map_info *info,
 		}
 		return parse_elem(argv, info, key, NULL, key_size, value_size,
-				  flags, NULL);
+				  flags, NULL, open_flags);
 	} else if (is_prefix(*argv, "any") || is_prefix(*argv, "noexist") ||
 		   is_prefix(*argv, "exist")) {
 		if (!flags) {
@@ -440,7 +440,7 @@ static int parse_elem(char **argv, struct bpf_map_info *info,
 			*flags = BPF_EXIST;
 		return parse_elem(argv + 1, info, key, value, key_size,
-				  value_size, NULL, value_fd);
+				  value_size, NULL, value_fd, open_flags);
 	}
 	p_err("expected key or value, got: %s", *argv);
@@ -639,7 +639,7 @@ static int do_show_subset(int argc, char **argv)
 		p_err("mem alloc failed");
 		return -1;
 	}
-	nb_fds = map_parse_fds(&argc, &argv, &fds);
+	nb_fds = map_parse_fds(&argc, &argv, &fds, BPF_F_RDONLY);
 	if (nb_fds < 1)
 		goto exit_free;
@@ -672,12 +672,15 @@ static int do_show_subset(int argc, char **argv)
 static int do_show(int argc, char **argv)
 {
+	LIBBPF_OPTS(bpf_get_fd_by_id_opts, opts);
 	struct bpf_map_info info = {};
 	__u32 len = sizeof(info);
 	__u32 id = 0;
 	int err;
 	int fd;
+	opts.open_flags = BPF_F_RDONLY;
+
 	if (show_pinned) {
 		map_table = hashmap__new(hash_fn_for_key_as_id,
 					 equal_fn_for_key_as_id, NULL);
@@ -707,7 +710,7 @@ static int do_show(int argc, char **argv)
 			break;
 		}
-		fd = bpf_map_get_fd_by_id(id);
+		fd = bpf_map_get_fd_by_id_opts(id, &opts);
 		if (fd < 0) {
 			if (errno == ENOENT)
 				continue;
@@ -909,7 +912,7 @@ static int do_dump(int argc, char **argv)
 		p_err("mem alloc failed");
 		return -1;
 	}
-	nb_fds = map_parse_fds(&argc, &argv, &fds);
+	nb_fds = map_parse_fds(&argc, &argv, &fds, BPF_F_RDONLY);
 	if (nb_fds < 1)
 		goto exit_free;
@@ -997,7 +1000,7 @@ static int do_update(int argc, char **argv)
 	if (argc < 2)
 		usage();
-	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len, 0);
 	if (fd < 0)
 		return -1;
@@ -1006,7 +1009,7 @@ static int do_update(int argc, char **argv)
 		goto exit_free;
 	err = parse_elem(argv, &info, key, value, info.key_size,
-			 info.value_size, &flags, &value_fd);
+			 info.value_size, &flags, &value_fd, 0);
 	if (err)
 		goto exit_free;
@@ -1076,7 +1079,7 @@ static int do_lookup(int argc, char **argv)
 	if (argc < 2)
 		usage();
-	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len, BPF_F_RDONLY);
 	if (fd < 0)
 		return -1;
@@ -1084,7 +1087,8 @@ static int do_lookup(int argc, char **argv)
 	if (err)
 		goto exit_free;
-	err = parse_elem(argv, &info, key, NULL, info.key_size, 0, NULL, NULL);
+	err = parse_elem(argv, &info, key, NULL, info.key_size, 0, NULL, NULL,
+			 BPF_F_RDONLY);
 	if (err)
 		goto exit_free;
@@ -1127,7 +1131,7 @@ static int do_getnext(int argc, char **argv)
 	if (argc < 2)
 		usage();
-	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len, BPF_F_RDONLY);
 	if (fd < 0)
 		return -1;
@@ -1140,8 +1144,8 @@ static int do_getnext(int argc, char **argv)
 	}
 	if (argc) {
-		err = parse_elem(argv, &info, key, NULL, info.key_size, 0,
-				 NULL, NULL);
+		err = parse_elem(argv, &info, key, NULL, info.key_size, 0, NULL,
+				 NULL, BPF_F_RDONLY);
 		if (err)
 			goto exit_free;
 	} else {
@@ -1198,7 +1202,7 @@ static int do_delete(int argc, char **argv)
 	if (argc < 2)
 		usage();
-	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len, BPF_F_RDONLY);
 	if (fd < 0)
 		return -1;
@@ -1209,7 +1213,8 @@ static int do_delete(int argc, char **argv)
 		goto exit_free;
 	}
-	err = parse_elem(argv, &info, key, NULL, info.key_size, 0, NULL, NULL);
+	err = parse_elem(argv, &info, key, NULL, info.key_size, 0, NULL, NULL,
+			 0);
 	if (err)
 		goto exit_free;
@@ -1226,11 +1231,16 @@ static int do_delete(int argc, char **argv)
 	return err;
 }
+static int map_parse_read_only_fd(int *argc, char ***argv)
+{
+	return map_parse_fd(argc, argv, BPF_F_RDONLY);
+}
+
 static int do_pin(int argc, char **argv)
 {
 	int err;
-	err = do_pin_any(argc, argv, map_parse_fd);
+	err = do_pin_any(argc, argv, map_parse_read_only_fd);
 	if (!err && json_output)
 		jsonw_null(json_wtr);
 	return err;
@@ -1319,7 +1329,7 @@ static int do_create(int argc, char **argv)
 			if (!REQ_ARGS(2))
 				usage();
 			inner_map_fd = map_parse_fd_and_info(&argc, &argv,
-							     &info, &len);
+							     &info, &len, 0);
 			if (inner_map_fd < 0)
 				return -1;
 			attr.inner_map_fd = inner_map_fd;
@@ -1368,7 +1378,7 @@ static int do_pop_dequeue(int argc, char **argv)
 	if (argc < 2)
 		usage();
-	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len, 0);
 	if (fd < 0)
 		return -1;
@@ -1407,7 +1417,7 @@ static int do_freeze(int argc, char **argv)
 	if (!REQ_ARGS(2))
 		return -1;
-	fd = map_parse_fd(&argc, &argv);
+	fd = map_parse_fd(&argc, &argv, 0);
 	if (fd < 0)
 		return -1;
diff --git a/tools/bpf/bpftool/map_perf_ring.c b/tools/bpf/bpftool/map_perf_ring.c
index 552b4ca40c27..bcb767e2d673 100644
--- a/tools/bpf/bpftool/map_perf_ring.c
+++ b/tools/bpf/bpftool/map_perf_ring.c
@@ -128,7 +128,8 @@ int do_event_pipe(int argc, char **argv)
 	int err, map_fd;
 	map_info_len = sizeof(map_info);
-	map_fd = map_parse_fd_and_info(&argc, &argv, &map_info, &map_info_len);
+	map_fd = map_parse_fd_and_info(&argc, &argv, &map_info, &map_info_len,
+				       0);
 	if (map_fd < 0)
 		return -1;
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 96eea8a67225..deeaa5c1ed7d 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -1062,7 +1062,7 @@ static int parse_attach_detach_args(int argc, char **argv, int *progfd,
 	if (!REQ_ARGS(2))
 		return -EINVAL;
-	*mapfd = map_parse_fd(&argc, &argv);
+	*mapfd = map_parse_fd(&argc, &argv, 0);
 	if (*mapfd < 0)
 		return *mapfd;
@@ -1608,7 +1608,7 @@ static int load_with_options(int argc, char **argv, bool first_prog_only)
 			}
 			NEXT_ARG();
-			fd = map_parse_fd(&argc, &argv);
+			fd = map_parse_fd(&argc, &argv, 0);
 			if (fd < 0)
 				goto err_free_reuse_maps;
--
2.34.1


[PATCH] selftests/filesystems: Fix build of anon_inode_test

by Mark Brown

The anon_inode_test test fails to build because it attempts to include a
nonexistent overlayfs/wrappers.h:

anon_inode_test.c:10:10: fatal error: overlayfs/wrappers.h: No such file or directory
   10 | #include "overlayfs/wrappers.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~

This is due to 0bd92b9fe538 ("selftests/filesystems: move wrapper.h out
of overlayfs subdir"), which was added in the vfs-6.16.selftests branch.
That branch was based on -rc5 and does not contain the newly added test,
so once things were merged into vfs.all the build started failing - both
parent commits are fine.

Fixes: feaa00dbff45a ("Merge branch 'vfs-6.16.selftests' into vfs.all")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
 tools/testing/selftests/filesystems/anon_inode_test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/filesystems/anon_inode_test.c b/tools/testing/selftests/filesystems/anon_inode_test.c
index e8e0ef1460d2..73e0a4d4fb2f 100644
--- a/tools/testing/selftests/filesystems/anon_inode_test.c
+++ b/tools/testing/selftests/filesystems/anon_inode_test.c
@@ -7,7 +7,7 @@
 #include <sys/stat.h>
 
 #include "../kselftest_harness.h"
-#include "overlayfs/wrappers.h"
+#include "wrappers.h"
 
 TEST(anon_inode_no_chown)
 {

---
base-commit: feaa00dbff45ad9a0dcd04a92f88c745bf880f55
change-id: 20250516-selftests-anon-inode-build-007e206e8422

Best regards,
--
Mark Brown <broonie(a)kernel.org>


[PATCH net] selftests: net: build net/lib dependency in all target

by Bui Quang Minh

Currently, we only build the net/lib dependency in the install target.
This commit moves that to the all target so that net/lib is included in
in-tree builds and run_tests.

Signed-off-by: Bui Quang Minh <minhquangbui99(a)gmail.com>
---
 tools/testing/selftests/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 6aa11cd3db42..5b04d83ad9a1 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -205,7 +205,7 @@ export KHDR_INCLUDES
 
 all:
 	@ret=1;							\
-	for TARGET in $(TARGETS); do				\
+	for TARGET in $(TARGETS) $(INSTALL_DEP_TARGETS); do	\
 		BUILD_TARGET=$$BUILD/$$TARGET;			\
 		mkdir $$BUILD_TARGET  -p;			\
 		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET	\
@@ -270,7 +270,7 @@ ifdef INSTALL_PATH
 	install -m 744 run_kselftest.sh $(INSTALL_PATH)/
 	rm -f $(TEST_LIST)
 	@ret=1;	\
-	for TARGET in $(TARGETS) $(INSTALL_DEP_TARGETS); do \
+	for TARGET in $(TARGETS); do \
 		BUILD_TARGET=$$BUILD/$$TARGET;	\
 		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET install \
 				INSTALL_PATH=$(INSTALL_PATH)/$$TARGET \
--
2.43.0


[PATCH] selftests/seccomp: report errno and add hints on failure

by Sameeksha Sankpal

Signed-off-by: Sameeksha Sankpal <sameekshasankpal(a)gmail.com>
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 14ba51b52095..d6a85d7b26da 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -4508,7 +4508,11 @@ static char get_proc_stat(struct __test_metadata *_metadata, pid_t pid)
 
 	snprintf(proc_path, sizeof(proc_path), "/proc/%d/stat", pid);
 	ASSERT_EQ(get_nth(_metadata, proc_path, 3, &line), 1);
-
+	int rc = get_nth(_metadata, proc_path, 3, &line);
+	if (rc != 1) {
+		printf("[ERROR] user_notification_fifo: failed to read stat for PID %d (rc=%d)\n", pid, rc);
+	}
+	ASSERT_EQ(rc, 1);
 	status = *line;
 	free(line);
 
@@ -4518,6 +4522,7 @@ static char get_proc_stat(struct __test_metadata *_metadata, pid_t pid)
 TEST(user_notification_fifo)
 {
 	struct seccomp_notif_resp resp = {};
+	ksft_print_msg("[INFO] Starting FIFO notification test\n");
 	struct seccomp_notif req = {};
 	int i, status, listener;
 	pid_t pid, pids[3];
@@ -4535,6 +4540,7 @@ TEST(user_notification_fifo)
 	listener = user_notif_syscall(__NR_getppid,
 				      SECCOMP_FILTER_FLAG_NEW_LISTENER);
 	ASSERT_GE(listener, 0);
+	ksft_print_msg("[INFO] user_notification_fifo: listener PID is %d\n", listener);
 
 	pid = fork();
 	ASSERT_GE(pid, 0);
--
2.43.0
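
For readers following the thread, the idea in the subject line (report errno and add a hint before the assertion fires) can be sketched outside the kselftest harness. This is only an illustration, not code from seccomp_bpf.c: read_first_char() and checked_read() are hypothetical names, and read_first_char() merely stands in for a helper such as get_nth(). The point it shows is calling the helper once, keeping the result, and saving errno before formatting the message.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a helper like get_nth(): read the first byte
 * of a file. Returns 0 on success, -1 on failure with errno set.
 */
static int read_first_char(const char *path, char *out)
{
	FILE *f = fopen(path, "r");
	int c;

	if (!f)
		return -1;		/* errno was set by fopen() */

	c = fgetc(f);
	fclose(f);
	if (c == EOF) {
		errno = EIO;		/* empty or unreadable file */
		return -1;
	}
	*out = (char)c;
	return 0;
}

/*
 * Call the helper exactly once, keep its result, and build a hint on
 * failure. The caller can then assert on the stored return value.
 */
static int checked_read(const char *path, char *out, char *msg, size_t msg_len)
{
	int rc = read_first_char(path, out);

	if (rc != 0) {
		int saved_errno = errno;	/* save before another call can clobber it */

		snprintf(msg, msg_len, "failed to read %s: rc=%d errno=%d (%s)",
			 path, rc, saved_errno, strerror(saved_errno));
	}
	return rc;
}
```

In a kselftest the stored rc would typically be passed to an ASSERT_* macro afterwards, with the hint printed through the harness logging helpers rather than collected in a buffer; the buffer is used here only to keep the sketch self-contained.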


[PATCH v10 0/5] rust: replace kernel::str::CStr w/ core::ffi::CStr

by Tamir Duberstein

This picks up from Michal Rostecki's work[0]. Per Michal's guidance I
have omitted Co-authored tags, as the end result is quite different.

Link: https://lore.kernel.org/rust-for-linux/20240819153656.28807-2-vadorovsky@pr… [0]
Closes: https://github.com/Rust-for-Linux/linux/issues/1075

Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v10:
- Rebase on cbeaa41dfe26b72639141e87183cb23e00d4b0dd.
- Implement Alice's suggestion to use a proc macro to work around orphan
  rules otherwise preventing `core::ffi::CStr` to be directly printed
  with `{}`.
- Link to v9: https://lore.kernel.org/r/20250317-cstr-core-v9-0-51d6cc522f62@gmail.com

Changes in v9:
- Rebase on rust-next.
- Restore `impl Display for BStr` which exists upstream[1].
- Link: https://doc.rust-lang.org/nightly/std/bstr/struct.ByteStr.html#impl-Display… [1]
- Link to v8: https://lore.kernel.org/r/20250203-cstr-core-v8-0-cb3f26e78686@gmail.com

Changes in v8:
- Move `{from,as}_char_ptr` back to `CStrExt`. This reduces the diff
  some.
- Restore `from_bytes_with_nul_unchecked_mut`, `to_cstring`.
- Link to v7: https://lore.kernel.org/r/20250202-cstr-core-v7-0-da1802520438@gmail.com

Changes in v7:
- Rebased on mainline.
- Restore functionality added in commit a321f3ad0a5d ("rust: str: add
  {make,to}_{upper,lower}case() to CString").
- Used `diff.algorithm patience` to improve diff readability.
- Link to v6: https://lore.kernel.org/r/20250202-cstr-core-v6-0-8469cd6d29fd@gmail.com

Changes in v6:
- Split the work into several commits for ease of review.
- Restore `{from,as}_char_ptr` to allow building on ARM (see commit
  message).
- Add `CStrExt` to `kernel::prelude`. (Alice Ryhl)
- Remove `CStrExt::from_bytes_with_nul_unchecked_mut` and restore
  `DerefMut for CString`. (Alice Ryhl)
- Rename and hide `kernel::c_str!` to encourage use of C-String
  literals.
- Drop implementation and invocation changes in kunit.rs. (Trevor Gross)
- Drop docs on `Display` impl. (Trevor Gross)
- Rewrite docs in the style of the standard library.
- Restore the `test_cstr_debug` unit tests to demonstrate that the
  implementation has changed.

Changes in v5:
- Keep the `test_cstr_display*` unit tests.

Changes in v4:
- Provide the `CStrExt` trait with `display()` method, which returns a
  `CStrDisplay` wrapper with `Display` implementation. This addresses
  the lack of `Display` implementation for `core::ffi::CStr`.
- Provide `from_bytes_with_nul_unchecked_mut()` method in `CStrExt`,
  which might be useful and is going to prevent manual, unsafe casts.
- Fix a typo (s/preffered/prefered/).

Changes in v3:
- Fix the commit message.
- Remove redundant braces in `use`, when only one item is imported.

Changes in v2:
- Do not remove `c_str` macro. While it's preferred to use C-string
  literals, there are two cases where `c_str` is helpful:
  - When working with macros, which already return a Rust string literal
    (e.g. `stringify!`).
  - When building macros, where we want to take a Rust string literal as an
    argument (for caller's convenience), but still use it as a C-string
    internally.
- Use Rust literals as arguments in macros (`new_mutex`, `new_condvar`,
  `new_mutex`). Use the `c_str` macro to convert these literals to C-string
  literals.
- Use `c_str` in kunit.rs for converting the output of `stringify!` to a
  `CStr`.
- Remove `DerefMut` implementation for `CString`.

---
Tamir Duberstein (5):
      rust: retitle "Example" section as "Examples"
      rust: support formatting of foreign types
      rust: replace `CStr` with `core::ffi::CStr`
      rust: replace `kernel::c_str!` with C-Strings
      rust: remove core::ffi::CStr reexport

 drivers/block/rnull.rs | 2 +-
 drivers/gpu/drm/drm_panic_qr.rs | 5 +-
 drivers/gpu/nova-core/driver.rs | 2 +-
 drivers/gpu/nova-core/firmware.rs | 2 +-
 drivers/net/phy/ax88796b_rust.rs | 8 +-
 drivers/net/phy/qt2025.rs | 6 +-
 rust/kernel/block/mq.rs | 2 +-
 rust/kernel/device.rs | 9 +-
 rust/kernel/devres.rs | 2 +-
 rust/kernel/driver.rs | 4 +-
 rust/kernel/error.rs | 10 +-
 rust/kernel/faux.rs | 5 +-
 rust/kernel/firmware.rs | 16 +-
 rust/kernel/fmt.rs | 77 +++++++
 rust/kernel/kunit.rs | 21 +-
 rust/kernel/lib.rs | 3 +-
 rust/kernel/miscdevice.rs | 5 +-
 rust/kernel/net/phy.rs | 12 +-
 rust/kernel/of.rs | 5 +-
 rust/kernel/pci.rs | 2 +-
 rust/kernel/platform.rs | 6 +-
 rust/kernel/prelude.rs | 5 +-
 rust/kernel/print.rs | 4 +-
 rust/kernel/seq_file.rs | 6 +-
 rust/kernel/str.rs | 415 ++++++++++-------------------------
 rust/kernel/sync.rs | 7 +-
 rust/kernel/sync/condvar.rs | 4 +-
 rust/kernel/sync/lock.rs | 4 +-
 rust/kernel/sync/lock/global.rs | 6 +-
 rust/kernel/sync/poll.rs | 1 +
 rust/kernel/workqueue.rs | 1 +
 rust/macros/fmt.rs | 118 ++++++++++
 rust/macros/kunit.rs | 6 +-
 rust/macros/lib.rs | 21 +-
 rust/macros/module.rs | 2 +-
 samples/rust/rust_driver_faux.rs | 4 +-
 samples/rust/rust_driver_pci.rs | 4 +-
 samples/rust/rust_driver_platform.rs | 4 +-
 samples/rust/rust_misc_device.rs | 3 +-
 scripts/rustdoc_test_gen.rs | 2 +-
 40 files changed, 426 insertions(+), 395 deletions(-)
---
base-commit: cbeaa41dfe26b72639141e87183cb23e00d4b0dd
change-id: 20250201-cstr-core-d4b9b69120cf

Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>


[PATCH bpf-next v1 1/2] bpf: Restrict usage scope of bpf_get_cgroup_classid

by Jiayuan Chen

A previous commit expanded the usage scope of bpf_get_cgroup_classid() to
all contexts (see Fixes tag), but this was inappropriate.

First, syzkaller reported a bug [1].

Second, it uses skb as an argument, but its implementation varies across
different bpf prog types. For example, in sock_filter and sock_addr, it
retrieves the classid from the current context
(bpf_get_cgroup_classid_curr_proto) instead of from skb. In tc egress and
lwt, it fetches the classid from skb->sk, but in tc ingress, it returns 0.

In summary, the definition of bpf_get_cgroup_classid() is ambiguous and
its usage scenarios are limited. It should not be treated as a
general-purpose helper. This patch reverts part of the previous commit.

[1] https://syzkaller.appspot.com/bug?extid=9767c7ed68b95cfa69e6

Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
Reported-by: syzbot+9767c7ed68b95cfa69e6(a)syzkaller.appspotmail.com
Signed-off-by: Jiayuan Chen <jiayuan.chen(a)linux.dev>
---
 include/linux/bpf-cgroup.h |  8 ++++++++
 kernel/bpf/cgroup.c        | 25 +++++++++++++++++++++++++
 kernel/bpf/helpers.c       |  4 ----
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 4847dcade917..9de7adb68294 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -427,6 +427,8 @@ int cgroup_bpf_prog_query(const union bpf_attr *attr,
 
 const struct bpf_func_proto *
 cgroup_common_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog);
+const struct bpf_func_proto *
+cgroup_current_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog);
 #else
 
 static inline int cgroup_bpf_inherit(struct cgroup *cgrp) { return 0; }
@@ -463,6 +465,12 @@ cgroup_common_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	return NULL;
 }
 
+static inline const struct bpf_func_proto *
+cgroup_current_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+	return NULL;
+}
+
 static inline int bpf_cgroup_storage_assign(struct bpf_prog_aux *aux,
 					    struct bpf_map *map) { return 0; }
 static inline struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 62a1d8deb3dc..a99b72e6f1c9 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1653,6 +1653,10 @@ cgroup_dev_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	if (func_proto)
 		return func_proto;
 
+	func_proto = cgroup_current_func_proto(func_id, prog);
+	if (func_proto)
+		return func_proto;
+
 	switch (func_id) {
 	case BPF_FUNC_perf_event_output:
 		return &bpf_event_output_data_proto;
@@ -2200,6 +2204,10 @@ sysctl_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	if (func_proto)
 		return func_proto;
 
+	func_proto = cgroup_current_func_proto(func_id, prog);
+	if (func_proto)
+		return func_proto;
+
 	switch (func_id) {
 	case BPF_FUNC_sysctl_get_name:
 		return &bpf_sysctl_get_name_proto;
@@ -2343,6 +2351,10 @@ cg_sockopt_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	if (func_proto)
 		return func_proto;
 
+	func_proto = cgroup_current_func_proto(func_id, prog);
+	if (func_proto)
+		return func_proto;
+
 	switch (func_id) {
 #ifdef CONFIG_NET
 	case BPF_FUNC_get_netns_cookie:
@@ -2589,3 +2601,16 @@ cgroup_common_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return NULL;
 	}
 }
+
+const struct bpf_func_proto *
+cgroup_current_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+	switch (func_id) {
+#ifdef CONFIG_CGROUP_NET_CLASSID
+	case BPF_FUNC_get_cgroup_classid:
+		return &bpf_get_cgroup_classid_curr_proto;
+#endif
+	default:
+		return NULL;
+	}
+}
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index b71e428ad936..9d0d54f4f0de 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2024,10 +2024,6 @@ bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_current_ancestor_cgroup_id_proto;
 	case BPF_FUNC_current_task_under_cgroup:
 		return &bpf_current_task_under_cgroup_proto;
-#endif
-#ifdef CONFIG_CGROUP_NET_CLASSID
-	case BPF_FUNC_get_cgroup_classid:
-		return &bpf_get_cgroup_classid_curr_proto;
 #endif
 	case BPF_FUNC_task_storage_get:
 		if (bpf_prog_check_recur(prog))
--
2.47.1
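
The lookup order that the patch above sets up can be modelled in a few lines of plain C. This is a toy sketch under stated assumptions, not kernel code: the enum, struct, and every function name below are hypothetical simplifications (the real functions also take a `const struct bpf_prog *`, and the real base lookup handles many other helpers). It shows only the shape of the change: the classid helper is resolved by a cgroup-only lookup that selected program types consult first, while the shared base lookup no longer knows about it.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-ins for enum bpf_func_id and struct bpf_func_proto. */
enum func_id { FUNC_GET_CGROUP_CLASSID, FUNC_PERF_EVENT_OUTPUT, FUNC_UNKNOWN };

struct func_proto {
	const char *name;
};

static const struct func_proto classid_curr_proto = { "get_cgroup_classid (current task)" };
static const struct func_proto event_output_proto = { "perf_event_output" };

/* Models the new cgroup-only lookup: it resolves the classid helper and nothing else. */
static const struct func_proto *current_func_proto(enum func_id id)
{
	switch (id) {
	case FUNC_GET_CGROUP_CLASSID:
		return &classid_curr_proto;
	default:
		return NULL;
	}
}

/* Models the shared base lookup after the patch: the classid case is gone. */
static const struct func_proto *base_func_proto(enum func_id id)
{
	(void)id;
	return NULL;
}

/* Models a cgroup program type: try the cgroup-only lookup, then its own cases, then the base. */
static const struct func_proto *cgroup_dev_lookup(enum func_id id)
{
	const struct func_proto *proto = current_func_proto(id);

	if (proto)
		return proto;

	switch (id) {
	case FUNC_PERF_EVENT_OUTPUT:
		return &event_output_proto;
	default:
		return base_func_proto(id);
	}
}

/* Models any program type that only consults the shared base lookup. */
static const struct func_proto *other_prog_lookup(enum func_id id)
{
	return base_func_proto(id);
}
```

With this model, cgroup_dev_lookup(FUNC_GET_CGROUP_CLASSID) still finds the helper, while other_prog_lookup(FUNC_GET_CGROUP_CLASSID) returns NULL, which corresponds to the restricted scope the commit message describes.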
