- Linux-kselftest-mirror - lists.linaro.org

[PATCH v3 bpf 0/2] bpf: skip non exist keys in generic_map_lookup_batch

by Yan Zhai

The generic_map_lookup_batch currently returns EINTR if it fails with ENOENT and retries several times on bpf_map_copy_value. The next batch would start from the same location, presuming it's a transient issue. This is incorrect if a map can actually have "holes", i.e. "get_next_key" can return a key that does not point to a valid value. At least the array of maps type may contain such holes legitly. Right now these holes show up, generic batch lookup cannot proceed any more. It will always fail with EINTR errors. This patch fixes this behavior by skipping the non-existing key, and does not return EINTR any more. V2->V3: deleted a unused macro V1->V2: split the fix and selftests; fixed a few selftests issues. V2: https://lore.kernel.org/bpf/cover.1738905497.git.yan@cloudflare.com/ V1: https://lore.kernel.org/bpf/Z6OYbS4WqQnmzi2z@debian.debian/ Yan Zhai (2): bpf: skip non exist keys in generic_map_lookup_batch selftests: bpf: test batch lookup on array of maps with holes kernel/bpf/syscall.c | 18 ++---- .../bpf/map_tests/map_in_map_batch_ops.c | 62 +++++++++++++------ 2 files changed, 49 insertions(+), 31 deletions(-) -- 2.39.5

4 months, 2 weeks

4
7
0 0

[PATCH bpf-next v5 0/6] selftests/bpf: Migrate test_xdp_redirect_multi.sh to test_progs

by Bastien Curutchet (eBPF Foundation)

Hi all, This patch series continues the work to migrate the *.sh tests into prog_tests framework. test_xdp_redirect_multi.sh tests the XDP redirections done through bpf_redirect_map(). This is already partly covered by test_xdp_veth.c that already tests map redirections at XDP level. What isn't covered yet by test_xdp_veth is the use of the broadcast flags (BPF_F_BROADCAST or BPF_F_EXCLUDE_INGRESS) and XDP egress programs. Hence, this patch series add test cases to test_xdp_veth.c to get rid of the test_xdp_redirect_multi.sh: - PATCH 1 & 2 Rework test_xdp_veth.c to avoid using the root namespace - PATCH 3 and 4 cover the broadcast flags - PATCH 5 covers the XDP egress programs NOTE: While working on this iteration I ran into a memory leak in net/core/rtnetlink.c that leads to oom-kill when running ./test_progs in a loop. This leak has been fixed by commit 1438f5d07b9a ("rtnetlink: fix netns leak with rtnl_setlink()") in the net tree. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com> --- Changes in v5: - Remove the patches that were applied from previous iteration - Add PATCH 1 & 2 to avoid using the root namespace so the veth indexes don't get incremented on every ./test_progs call - PATCH 3: Remove unnecessary <linux/ip.h> header - Link to v4: https://lore.kernel.org/r/20250131-redirect-multi-v4-0-970b33678512@bootlin… Changes in v4: - Remove the NO_IP #define - append_tid() takes string's size as input to ensure there is enough space to fit the thread ID at the end - Fix PATCH 12's commit log - Link to v3: https://lore.kernel.org/r/20250128-redirect-multi-v3-0-c1ce69997c01@bootlin… Changes in v3: - Add append_tid() helper and use unique names to allow parallel testing - Check create_network()'s return value through ASSERT_OK() - Remove check_ping() and unused defines - Change next_veth type (from string to int) - Link to v2: https://lore.kernel.org/r/20250121-redirect-multi-v2-0-fc9cacabc6b2@bootlin… Changes in v2: - Use serial_test_* to avoid conflict between tests - Link to v1: https://lore.kernel.org/r/20250121-redirect-multi-v1-0-b215e35ff505@bootlin… --- Bastien Curutchet (eBPF Foundation) (6): selftests/bpf: test_xdp_veth: Create struct net_configuration selftests/bpf: test_xdp_veth: Use a dedicated namespace selftests/bpf: Optionally select broadcasting flags selftests/bpf: test_xdp_veth: Add XDP broadcast redirection tests selftests/bpf: test_xdp_veth: Add XDP program on egress test selftests/bpf: Remove test_xdp_redirect_multi.sh tools/testing/selftests/bpf/Makefile | 2 - .../selftests/bpf/prog_tests/test_xdp_veth.c | 435 ++++++++++++++++++--- .../testing/selftests/bpf/progs/xdp_redirect_map.c | 88 +++++ .../selftests/bpf/progs/xdp_redirect_multi_kern.c | 41 +- .../selftests/bpf/test_xdp_redirect_multi.sh | 214 ---------- tools/testing/selftests/bpf/xdp_redirect_multi.c | 226 ----------- 6 files changed, 491 insertions(+), 515 deletions(-) --- base-commit: 36ab3d3de536753a4b9b2b4c4ce26af41917a378 change-id: 20250103-redirect-multi-245d6eafb5d1 Best regards, -- Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>

4 months, 2 weeks

2
7
0 0

[PATCH v4 00/27] KVM: arm64: Implement support for SME in non-protected guests

by Mark Brown

I've removed the RFC tag from this version of the series, but the items that I'm looking for feedback on remains the same: - The userspace ABI, in particular: - The vector length used for the SVE registers, access to the SVE registers and access to ZA and (if available) ZT0 depending on the current state of PSTATE.{SM,ZA}. - The use of a single finalisation for both SVE and SME. - The addition of control for enabling fine grained traps in a similar manner to FGU but without the UNDEF, I'm not clear if this is desired at all and at present this requires symmetric read and write traps like FGU. That seemed like it might be desired from an implementation point of view but we already have one case where we enable an asymmetric trap (for ARM64_WORKAROUND_AMPERE_AC03_CPU_38) and it seems generally useful to enable asymmetrically. This series implements support for SME use in non-protected KVM guests. Much of this is very similar to SVE, the main additional challenge that SME presents is that it introduces a new vector length similar to the SVE vector length and two new controls which change the registers seen by guests: - PSTATE.ZA enables the ZA matrix register and, if SME2 is supported, the ZT0 LUT register. - PSTATE.SM enables streaming mode, a new floating point mode which uses the SVE register set with the separately configured SME vector length. In streaming mode implementation of the FFR register is optional. It is also permitted to build systems which support SME without SVE, in this case when not in streaming mode no SVE registers or instructions are available. Further, there is no requirement that there be any overlap in the set of vector lengths supported by SVE and SME in a system, this is expected to be a common situation in practical systems. Since there is a new vector length to configure we introduce a new feature parallel to the existing SVE one with a new pseudo register for the streaming mode vector length. Due to the overlap with SVE caused by streaming mode rather than finalising SME as a separate feature we use the existing SVE finalisation to also finalise SME, a new define KVM_ARM_VCPU_VEC is provided to help make user code clearer. Finalising SVE and SME separately would introduce complication with register access since finalising SVE makes the SVE regsiters writeable by userspace and doing multiple finalisations results in an error being reported. Dealing with a state where the SVE registers are writeable due to one of SVE or SME being finalised but may have their VL changed by the other being finalised seems like needless complexity with minimal practical utility, it seems clearer to just express directly that only one finalisation can be done in the ABI. Access to the floating point registers follows the architecture: - When both SVE and SME are present: - If PSTATE.SM == 0 the vector length used for the Z and P registers is the SVE vector length. - If PSTATE.SM == 1 the vector length used for the Z and P registers is the SME vector length. - If only SME is present: - If PSTATE.SM == 0 the Z and P registers are inaccessible and the floating point state accessed via the encodings for the V registers. - If PSTATE.SM == 1 the vector length used for the Z and P registers - The SME specific ZA and ZT0 registers are only accessible if SVCR.ZA is 1. The VMM must understand this, in particular when loading state SVCR should be configured before other state. There are a large number of subfeatures for SME, most of which only offer additional instructions but some of which (SME2 and FA64) add architectural state. These are configured via the ID registers as per usual. The new KVM_ARM_VCPU_VEC feature and ZA and ZT0 registers have not been added to the get-reg-list selftest, the idea of supporting additional features there without restructuring the program to generate all possible feature combinations has been rejected. I will post a separate series which does that restructuring. No support is present for protected guests, this is expected to be added separately, the series is already rather large and pKVM in general offers a subset of features. This series is based on Mark Rutland's fix series: https://lore.kernel.org/r/20250210195226.1215254-1-mark.rutland@arm.com Signed-off-by: Mark Brown <broonie(a)kernel.org> --- Changes in v4: - Rebase onto v6.14-rc2 and Mark Rutland's fixes. - Expose SME to nested guests. - Additional cleanups and test fixes following on from the rebase. - Link to v3: https://lore.kernel.org/r/20241220-kvm-arm64-sme-v3-0-05b018c1ffeb@kernel.o… Changes in v3: - Rebase onto v6.12-rc2. - Link to v2: https://lore.kernel.org/r/20231222-kvm-arm64-sme-v2-0-da226cb180bb@kernel.o… Changes in v2: - Rebase onto v6.7-rc3. - Configure subfeatures based on host system only. - Complete nVHE support. - There was some snafu with sending v1 out, it didn't make it to the lists but in case it hit people's inboxes I'm sending as v2. --- Mark Brown (27): arm64/fpsimd: Update FA64 and ZT0 enables when loading SME state arm64/fpsimd: Decide to save ZT0 and streaming mode FFR at bind time arm64/fpsimd: Check enable bit for FA64 when saving EFI state arm64/fpsimd: Determine maximum virtualisable SME vector length KVM: arm64: Introduce non-UNDEF FGT control KVM: arm64: Pay attention to FFR parameter in SVE save and load KVM: arm64: Pull ctxt_has_ helpers to start of sysreg-sr.h KVM: arm64: Move SVE state access macros after feature test macros KVM: arm64: Rename SVE finalization constants to be more general KVM: arm64: Document the KVM ABI for SME KVM: arm64: Define internal features for SME KVM: arm64: Rename sve_state_reg_region KVM: arm64: Store vector lengths in an array KVM: arm64: Implement SME vector length configuration KVM: arm64: Support SME control registers KVM: arm64: Support TPIDR2_EL0 KVM: arm64: Support SME identification registers for guests KVM: arm64: Support SME priority registers KVM: arm64: Provide assembly for SME state restore KVM: arm64: Support userspace access to streaming mode Z and P registers KVM: arm64: Expose SME specific state to userspace KVM: arm64: Context switch SME state for normal guests KVM: arm64: Handle SME exceptions KVM: arm64: Expose SME to nested guests KVM: arm64: Provide interface for configuring and enabling SME for guests KVM: arm64: selftests: Add SME system registers to get-reg-list KVM: arm64: selftests: Add SME to set_id_regs test Documentation/virt/kvm/api.rst | 117 +++++++--- arch/arm64/include/asm/fpsimd.h | 22 ++ arch/arm64/include/asm/kvm_emulate.h | 12 +- arch/arm64/include/asm/kvm_host.h | 143 ++++++++++--- arch/arm64/include/asm/kvm_hyp.h | 4 +- arch/arm64/include/asm/kvm_pkvm.h | 2 +- arch/arm64/include/asm/vncr_mapping.h | 2 + arch/arm64/include/uapi/asm/kvm.h | 33 +++ arch/arm64/kernel/cpufeature.c | 2 - arch/arm64/kernel/fpsimd.c | 86 ++++---- arch/arm64/kvm/arm.c | 10 + arch/arm64/kvm/fpsimd.c | 19 +- arch/arm64/kvm/guest.c | 262 ++++++++++++++++++++--- arch/arm64/kvm/handle_exit.c | 14 ++ arch/arm64/kvm/hyp/fpsimd.S | 18 +- arch/arm64/kvm/hyp/include/hyp/switch.h | 141 ++++++++++-- arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 77 ++++--- arch/arm64/kvm/hyp/nvhe/hyp-main.c | 9 +- arch/arm64/kvm/hyp/nvhe/pkvm.c | 4 +- arch/arm64/kvm/hyp/nvhe/switch.c | 11 +- arch/arm64/kvm/hyp/vhe/switch.c | 21 +- arch/arm64/kvm/nested.c | 3 +- arch/arm64/kvm/reset.c | 154 +++++++++---- arch/arm64/kvm/sys_regs.c | 118 +++++++++- include/uapi/linux/kvm.h | 1 + tools/testing/selftests/kvm/arm64/get-reg-list.c | 32 ++- tools/testing/selftests/kvm/arm64/set_id_regs.c | 29 ++- 27 files changed, 1078 insertions(+), 268 deletions(-) --- base-commit: 6a25088d268ce4c2163142ead7fe1975bb687cb7 change-id: 20230301-kvm-arm64-sme-06a1246d3636 prerequisite-message-id: 20250210195226.1215254-1-mark.rutland(a)arm.com prerequisite-patch-id: 615ab9c526e9f6242bd5b8d7188e96fb66fb0e64 prerequisite-patch-id: e5c4f2ff9c9ba01a0f659dd1e8bf6396de46e197 prerequisite-patch-id: 0794d28526755180847841c045a6b7cb3d800c16 prerequisite-patch-id: 079d3a8a680f793b593268eeba000acc55a0b4ec prerequisite-patch-id: a3428f67a5ee49f2b01208f30b57984d5409d8f5 prerequisite-patch-id: 26393e401e9eae7cff5bb1d3bdb18b4e29ffc8fe prerequisite-patch-id: 64f9819f751da4a1c73b9d1b292ccee6afda89f6 prerequisite-patch-id: 0355baaa8ceb31dc85d015b56084c33416f78041 Best regards, -- Mark Brown <broonie(a)kernel.org>

4 months, 2 weeks

2
31
0 0

[PATCH AUTOSEL 6.12 01/31] sched_ext: selftests/dsp_local_on: Fix sporadic failures

by Sasha Levin

From: Tejun Heo <tj(a)kernel.org> [ Upstream commit e9fe182772dcb2630964724fd93e9c90b68ea0fd ] dsp_local_on has several incorrect assumptions, one of which is that p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task is scheduled out while migration is disabled - p->cpus_ptr is temporarily overridden to the previous CPU while p->nr_cpus_allowed remains unchanged. This led to sporadic test faliures when dsp_local_on_dispatch() tries to put a migration disabled task to a different CPU. Fix it by keeping the previous CPU when migration is disabled. There are SCX schedulers that make use of p->nr_cpus_allowed. They should also implement explicit handling for p->migration_disabled. Signed-off-by: Tejun Heo <tj(a)kernel.org> Reported-by: Ihor Solodrai <ihor.solodrai(a)pm.me> Cc: Andrea Righi <arighi(a)nvidia.com> Cc: Changwoo Min <changwoo(a)igalia.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- tools/testing/selftests/sched_ext/dsp_local_on.bpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c index c9a2da0575a0f..eea06decb6f59 100644 --- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c +++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c @@ -43,7 +43,7 @@ void BPF_STRUCT_OPS(dsp_local_on_dispatch, s32 cpu, struct task_struct *prev) if (!p) return; - if (p->nr_cpus_allowed == nr_cpus) + if (p->nr_cpus_allowed == nr_cpus && !p->migration_disabled) target = bpf_get_prandom_u32() % nr_cpus; else target = scx_bpf_task_cpu(p); -- 2.39.5

4 months, 2 weeks

1
1
0 0

[PATCH AUTOSEL 6.13 01/31] sched_ext: selftests/dsp_local_on: Fix sporadic failures

by Sasha Levin

From: Tejun Heo <tj(a)kernel.org> [ Upstream commit e9fe182772dcb2630964724fd93e9c90b68ea0fd ] dsp_local_on has several incorrect assumptions, one of which is that p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task is scheduled out while migration is disabled - p->cpus_ptr is temporarily overridden to the previous CPU while p->nr_cpus_allowed remains unchanged. This led to sporadic test faliures when dsp_local_on_dispatch() tries to put a migration disabled task to a different CPU. Fix it by keeping the previous CPU when migration is disabled. There are SCX schedulers that make use of p->nr_cpus_allowed. They should also implement explicit handling for p->migration_disabled. Signed-off-by: Tejun Heo <tj(a)kernel.org> Reported-by: Ihor Solodrai <ihor.solodrai(a)pm.me> Cc: Andrea Righi <arighi(a)nvidia.com> Cc: Changwoo Min <changwoo(a)igalia.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- tools/testing/selftests/sched_ext/dsp_local_on.bpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c index fbda6bf546712..758b479bd1ee1 100644 --- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c +++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c @@ -43,7 +43,7 @@ void BPF_STRUCT_OPS(dsp_local_on_dispatch, s32 cpu, struct task_struct *prev) if (!p) return; - if (p->nr_cpus_allowed == nr_cpus) + if (p->nr_cpus_allowed == nr_cpus && !p->migration_disabled) target = bpf_get_prandom_u32() % nr_cpus; else target = scx_bpf_task_cpu(p); -- 2.39.5

4 months, 2 weeks

1
1
0 0

[PATCH 0/2] tools: Unify top-level quiet infrastructure

by Charlie Jenkins

The quiet infrastructure was moved out of Makefile.build to accomidate the new syscall table generation scripts in perf. Syscall table generation wanted to also be able to be quiet, so instead of again copying the code to set the quiet variables, the code was moved into Makefile.perf to be used globally. This was not the right solution. It should have been moved even further upwards in the call chain. Makefile.include is imported in many files so this seems like a proper place to put it. To: Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com> --- Charlie Jenkins (2): tools: Unify top-level quiet infrastructure tools: Remove redundant quiet setup tools/arch/arm64/tools/Makefile | 6 ----- tools/bpf/Makefile | 6 ----- tools/bpf/bpftool/Documentation/Makefile | 6 ----- tools/bpf/bpftool/Makefile | 6 ----- tools/bpf/resolve_btfids/Makefile | 2 -- tools/bpf/runqslower/Makefile | 5 +--- tools/build/Makefile | 8 +----- tools/lib/bpf/Makefile | 13 ---------- tools/lib/perf/Makefile | 13 ---------- tools/lib/thermal/Makefile | 13 ---------- tools/objtool/Makefile | 6 ----- tools/perf/Makefile.perf | 41 ------------------------------- tools/scripts/Makefile.include | 31 ++++++++++++++++++++++- tools/testing/selftests/bpf/Makefile.docs | 6 ----- tools/testing/selftests/hid/Makefile | 2 -- tools/thermal/lib/Makefile | 13 ---------- tools/tracing/latency/Makefile | 6 ----- tools/tracing/rtla/Makefile | 6 ----- tools/verification/rv/Makefile | 6 ----- 19 files changed, 32 insertions(+), 163 deletions(-) --- base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b change-id: 20250203-quiet_tools-9a6ea9d65a19 -- - Charlie

4 months, 2 weeks

5
12
0 0

[PATCH v7 0/8] Buddy allocator like (or non-uniform) folio split

by Zi Yan

Hi Matthew, Can you please take a look at Patch 1 and let me know if the new xarray function looks good to you? I will send __filemap_add_folio() and shmem_split_large_entry() changes separately. Hi all, This patchset adds a new buddy allocator like (or non-uniform) large folio split from a order-n folio to order-m with m < n. It reduces 1. the total number of after-split folios from 2^(n-m) to n-m+1; 2. the amount of memory needed for multi-index xarray split from 2^(n/6-m/6) to n/6-m/6, assuming XA_CHUNK_SHIFT=6; 3. keep more large folios after a split from all order-m folios to order-(n-1) to order-m folios. For example, to split an order-9 to order-0, folio split generates 10 (or 11 for anonymous memory) folios instead of 512, allocates 1 xa_node instead of 8, and leaves 1 order-8, 1 order-7, ..., 1 order-1 and 2 order-0 folios (or 4 order-0 for anonymous memory) instead of 512 order-0 folios. It is on top of mm-everything-2025-02-07-05-27 with V6 reverted. It is ready to be merged. Instead of duplicating existing split_huge_page*() code, __folio_split() is introduced as the shared backend code for both split_huge_page_to_list_to_order() and folio_split(). __folio_split() can support both uniform split and buddy allocator like (or non-uniform) split. All existing split_huge_page*() users can be gradually converted to use folio_split() if possible. In this patchset, I converted truncate_inode_partial_folio() to use folio_split(). xfstests quick group passed for both tmpfs and xfs. Changelog === From V6[8]: 1. Added an xarray function xas_try_split() to support iterative folio split, removing the need of using xas_split_alloc() and xas_split(). The function guarantees that at most one xa_node is allocated for each call. 2. Added concrete numbers of after-split folios and xa_node savings to cover letter, commit log. (per Andrew) From V5[7]: 1. Split shmem to any lower order patches are in mm tree, so dropped from this series. 2. Rename split_folio_at() to try_folio_split() to clarify that non-uniform split will not be used if it is not supported. From V4[6]: 1. Enabled shmem support in both uniform and buddy allocator like split and added selftests for it. 2. Added functions to check if uniform split and buddy allocator like split are supported for the given folio and order. 3. Made truncate fall back to uniform split if buddy allocator split is not supported (CONFIG_READ_ONLY_THP_FOR_FS and FS without large folio). 4. Added the missing folio_clear_has_hwpoisoned() to __split_unmapped_folio(). From V3[5]: 1. Used xas_split_alloc(GFP_NOWAIT) instead of xas_nomem(), since extra operations inside xas_split_alloc() are needed for correctness. 2. Enabled folio_split() for shmem and no issue was found with xfstests quick test group. 3. Split both ends of a truncate range in truncate_inode_partial_folio() to avoid wasting memory in shmem truncate (per David Hildenbrand). 4. Removed page_in_folio_offset() since page_folio() does the same thing. 5. Finished truncate related tests from xfstests quick test group on XFS and tmpfs without issues. 6. Disabled buddy allocator like split on CONFIG_READ_ONLY_THP_FOR_FS and FS without large folio. This check was missed in the prior versions. From V2[3]: 1. Incorporated all the feedback from Kirill[4]. 2. Used GFP_NOWAIT for xas_nomem(). 3. Tested the code path when xas_nomem() fails. 4. Added selftests for folio_split(). 5. Fixed no THP config build error. From V1[2]: 1. Split the original patch 1 into multiple ones for easy review (per Kirill). 2. Added xas_destroy() to avoid memory leak. 3. Fixed nr_dropped not used error (per kernel test robot). 4. Added proper error handling when xas_nomem() fails to allocate memory for xas_split() during buddy allocator like split. From RFC[1]: 1. Merged backend code of split_huge_page_to_list_to_order() and folio_split(). The same code is used for both uniform split and buddy allocator like split. 2. Use xas_nomem() instead of xas_split_alloc() for folio_split(). 3. folio_split() now leaves the first after-split folio unlocked, instead of the one containing the given page, since the caller of truncate_inode_partial_folio() locks and unlocks the first folio. 4. Extended split_huge_page debugfs to use folio_split(). 5. Added truncate_inode_partial_folio() as first user of folio_split(). Design === folio_split() splits a large folio in the same way as buddy allocator splits a large free page for allocation. The purpose is to minimize the number of folios after the split. For example, if user wants to free the 3rd subpage in a order-9 folio, folio_split() will split the order-9 folio as: O-0, O-0, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is anon O-1, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-9 if it is pagecache Since anon folio does not support order-1 yet. The split process is similar to existing approach: 1. Unmap all page mappings (split PMD mappings if exist); 2. Split meta data like memcg, page owner, page alloc tag; 3. Copy meta data in struct folio to sub pages, but instead of spliting the whole folio into multiple smaller ones with the same order in a shot, this approach splits the folio iteratively. Taking the example above, this approach first splits the original order-9 into two order-8, then splits left part of order-8 to two order-7 and so on; 4. Post-process split folios, like write mapping->i_pages for pagecache, adjust folio refcounts, add split folios to corresponding list; 5. Remap split folios 6. Unlock split folios. __split_unmapped_folio() and __split_folio_to_order() replace __split_huge_page() and __split_huge_page_tail() respectively. __split_unmapped_folio() uses different approaches to perform uniform split and buddy allocator like split: 1. uniform split: one single call to __split_folio_to_order() is used to uniformly split the given folio. All resulting folios are put back to the list after split. The folio containing the given page is left to caller to unlock and others are unlocked. 2. buddy allocator like (or non-uniform) split: (old_order - new_order) calls to __split_folio_to_order() are used to split the given folio at order N to order N-1. After each call, the target folio is changed to the one containing the page, which is given as a folio_split() parameter. After each call, folios not containing the page are put back to the list. The folio containing the page is put back to the list when its order is new_order. All folios are unlocked except the first folio, which is left to caller to unlock. Patch Overview === 1. Patch 1 added a new xarray function xas_try_split() to perform iterative xarray split. 2. Patch 2 added __split_unmapped_folio() and __split_folio_to_order() to prepare for moving to new backend split code. 3. Patch 3 moved common code in split_huge_page_to_list_to_order() to __folio_split(). 4. Patch 4 added new folio_split() and made split_huge_page_to_list_to_order() share the new __split_unmapped_folio() with folio_split(). 5. Patch 5 removed no longer used __split_huge_page() and __split_huge_page_tail(). 6. Patch 6 added a new in_folio_offset to split_huge_page debugfs for folio_split() test. 7. Patch 7 used try_folio_split() for truncate operation. 8. Patch 8 added folio_split() tests. Any comments and/or suggestions are welcome. Thanks. [1] https://lore.kernel.org/linux-mm/20241008223748.555845-1-ziy@nvidia.com/ [2] https://lore.kernel.org/linux-mm/20241028180932.1319265-1-ziy@nvidia.com/ [3] https://lore.kernel.org/linux-mm/20241101150357.1752726-1-ziy@nvidia.com/ [4] https://lore.kernel.org/linux-mm/e6ppwz5t4p4kvir6eqzoto4y5fmdjdxdyvxvtw43nc… [5] https://lore.kernel.org/linux-mm/20241205001839.2582020-1-ziy@nvidia.com/ [6] https://lore.kernel.org/linux-mm/20250106165513.104899-1-ziy@nvidia.com/ [7] https://lore.kernel.org/linux-mm/20250116211042.741543-1-ziy@nvidia.com/ [8] https://lore.kernel.org/linux-mm/20250205031417.1771278-1-ziy@nvidia.com/ Zi Yan (8): xarray: add xas_try_split() to split a multi-index entry. mm/huge_memory: add two new (not yet used) functions for folio_split() mm/huge_memory: move folio split common code to __folio_split() mm/huge_memory: add buddy allocator like (non-uniform) folio_split() mm/huge_memory: remove the old, unused __split_huge_page() mm/huge_memory: add folio_split() to debugfs testing interface. mm/truncate: use buddy allocator like folio split for truncate operation. selftests/mm: add tests for folio_split(), buddy allocator like split. Documentation/core-api/xarray.rst | 14 +- include/linux/huge_mm.h | 36 + include/linux/xarray.h | 7 + lib/test_xarray.c | 47 ++ lib/xarray.c | 136 +++- mm/huge_memory.c | 751 ++++++++++++------ mm/truncate.c | 31 +- tools/testing/radix-tree/Makefile | 1 + .../selftests/mm/split_huge_page_test.c | 34 +- 9 files changed, 772 insertions(+), 285 deletions(-) -- 2.47.2

4 months, 2 weeks

3
25
0 0

[PATCH v6 0/3] rust: kunit: Support KUnit tests with a user-space like syntax

by David Gow

Hi all, After much delay, v6 of the KUnit/Rust integration patchset is here. This change incorporates most of Miguels suggestions from v5 (save for some of the copyright headers I wasn't comfortable unilaterally changing). This means the documentation is much improved, and it should work more cleanly on Rust 1.83 and 1.84, no longer requiring static_mut_refs or const_mut_refs. (I'm not 100% sure I understand all of the details of this, but I'm comfortable enough with how it's ended up.) This has been rebased against 6.14-rc1/rust-next, and should be able to comfortably go in via either the KUnit or Rust trees. My suspicion is that there's more likely to be conflicts with the Rust work (due to the changes in rust/macros/lib.rs) than with KUnit, where there are no current patches which would break the API, so maybe it makes the most sense for it to go in via Rust for 6.15. This series was originally written by José Expósito, and has been modified and updated by Matt Gilbride, Miguel Ojeda, and myself. The original version can be found here: https://github.com/Rust-for-Linux/linux/pull/950 Add support for writing KUnit tests in Rust. While Rust doctests are already converted to KUnit tests and run, they're really better suited for examples, rather than as first-class unit tests. This series implements a series of direct Rust bindings for KUnit tests, as well as a new macro which allows KUnit tests to be written using a close variant of normal Rust unit test syntax. The only change required is replacing '#[cfg(test)]' with '#[kunit_tests(kunit_test_suite_name)]' An example test would look like: #[kunit_tests(rust_kernel_hid_driver)] mod tests { use super::*; use crate::{c_str, driver, hid, prelude::*}; use core::ptr; struct SimpleTestDriver; impl Driver for SimpleTestDriver { type Data = (); } #[test] fn rust_test_hid_driver_adapter() { let mut hid = bindings::hid_driver::default(); let name = c_str!("SimpleTestDriver"); static MODULE: ThisModule = unsafe { ThisModule::from_ptr(ptr::null_mut()) }; let res = unsafe { <hid::Adapter<SimpleTestDriver> as driver::DriverOps>::register(&mut hid, name, &MODULE) }; assert_eq!(res, Err(ENODEV)); // The mock returns -19 } } Please give this a go, and make sure I haven't broken it! There's almost certainly a lot of improvements which can be made -- and there's a fair case to be made for replacing some of this with generated C code which can use the C macros -- but this is hopefully an adequate implementation for now, and the interface can (with luck) remain the same even if the implementation changes. A few small notable missing features: - Attributes (like the speed of a test) are hardcoded to the default value. - Similarly, the module name attribute is hardcoded to NULL. In C, we use the KBUILD_MODNAME macro, but I couldn't find a way to use this from Rust which wasn't more ugly than just disabling it. - Assertions are not automatically rewritten to use KUnit assertions. --- Changes since v5: https://lore.kernel.org/all/20241213081035.2069066-1-davidgow@google.com/ - Rebased against 6.14-rc1 - Fixed a bunch of warnings / clippy lints introduced in Rust 1.83 and 1.84. - No longer needs static_mut_refs / const_mut_refs, and is much cleaned up as a result. (Thanks, Miguel) - Major documentation and example fixes. (Thanks, Miguel) Changes since v4: https://lore.kernel.org/linux-kselftest/20241101064505.3820737-1-davidgow@g… - Rebased against 6.13-rc1 - Allowed an unused_unsafe warning after the behaviour of addr_of_mut!() changed in Rust 1.82. (Thanks Boqun, Miguel) - "Expect" that the sample assert_eq!(1+1, 2) produces a clippy warning due to a redundant assertion. (Thanks Boqun, Miguel) - Fix some missing safety comments, and remove some unneeded 'unsafe' blocks. (Thanks Boqun) - Fix a couple of minor rustfmt issues which were triggering checkpatch warnings. Changes since v3: https://lore.kernel.org/linux-kselftest/20241030045719.3085147-2-davidgow@g… - The kunit_unsafe_test_suite!() macro now panic!s if the suite name is too long, triggering a compile error. (Thanks, Alice!) - The #[kunit_tests()] macro now preserves span information, so errors can be better reported. (Thanks, Boqun!) - The example tests have been updated to no longer use assert_eq!() with a constant bool argument (which triggered a clippy warning now we have the span info). Changes since v2: https://lore.kernel.org/linux-kselftest/20241029092422.2884505-1-davidgow@g… - Include missing rust/macros/kunit.rs file from v2. (Thanks Boqun!) - The kunit_unsafe_test_suite!() macro will truncate the name of the suite if it is too long. (Thanks Alice!) - The proc macro now emits an error if the suite name is too long. - We no longer needlessly use UnsafeCell<> in kunit_unsafe_test_suite!(). (Thanks Alice!) Changes since v1: https://lore.kernel.org/lkml/20230720-rustbind-v1-0-c80db349e3b5@google.com… - Rebase on top of the latest rust-next (commit 718c4069896c) - Make kunit_case a const fn, rather than a macro (Thanks Boqun) - As a result, the null terminator is now created with kernel::kunit::kunit_case_null() - Use the C kunit_get_current_test() function to implement in_kunit_test(), rather than re-implementing it (less efficiently) ourselves. Changes since the GitHub PR: - Rebased on top of kselftest/kunit - Add const_mut_refs feature This may conflict with https://lore.kernel.org/lkml/20230503090708.2524310-6-nmi@metaspace.dk/ - Add rust/macros/kunit.rs to the KUnit MAINTAINERS entry --- José Expósito (3): rust: kunit: add KUnit case and suite macros rust: macros: add macro to easily run KUnit tests rust: kunit: allow to know if we are in a test MAINTAINERS | 1 + rust/kernel/kunit.rs | 199 +++++++++++++++++++++++++++++++++++++++++++ rust/macros/kunit.rs | 161 ++++++++++++++++++++++++++++++++++ rust/macros/lib.rs | 29 +++++++ 4 files changed, 390 insertions(+) create mode 100644 rust/macros/kunit.rs -- 2.48.1.601.g30ceb7b040-goog

4 months, 2 weeks

3
17
0 0

[PATCH net 0/4] sockmap, vsock: For connectible sockets allow only connected

by Michal Luczaj

Series deals with one more case of vsock surprising BPF/sockmap by being inconsistency about (having an) assigned transport. KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127] CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+ Workqueue: vsock-loopback vsock_loopback_work RIP: 0010:vsock_read_skb+0x4b/0x90 Call Trace: sk_psock_verdict_data_ready+0xa4/0x2e0 virtio_transport_recv_pkt+0x1ca8/0x2acc vsock_loopback_work+0x27d/0x3f0 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x35a/0x700 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 This bug, similarly to commit f6abafcd32f9 ("vsock/bpf: return early if transport is not assigned"), could be fixed with a single NULL check. But instead, let's explore another approach: take a hint from vsock_bpf_update_proto() and teach sockmap to accept only vsocks that are already connected (no risk of transport being dropped or reassigned). At the same time straight reject the listeners (vsock listening sockets do not carry any transport anyway). This way BPF does not have to worry about vsk->transport becoming NULL. Signed-off-by: Michal Luczaj <mhal(a)rbox.co> --- Michal Luczaj (4): sockmap, vsock: For connectible sockets allow only connected vsock/bpf: Warn on socket without transport selftest/bpf: Adapt vsock_delete_on_close to sockmap rejecting unconnected selftest/bpf: Add vsock test for sockmap rejecting unconnected net/core/sock_map.c | 3 + net/vmw_vsock/af_vsock.c | 3 + net/vmw_vsock/vsock_bpf.c | 2 +- .../selftests/bpf/prog_tests/sockmap_basic.c | 70 ++++++++++++++++------ 4 files changed, 59 insertions(+), 19 deletions(-) --- base-commit: 9c01a177c2e4b55d2bcce8a1f6bdd1d46a8320e3 change-id: 20250210-vsock-listen-sockmap-nullptr-e6e82ca79611 Best regards, -- Michal Luczaj <mhal(a)rbox.co>

4 months, 2 weeks

3
13
0 0

[RESEND v4 0/3] mm/pkey: Add PKEY_UNRESTRICTED macro

by Yury Khrustalev

Add PKEY_UNRESTRICTED macro to mman.h and use it in selftests. For context, this change will also allow for more consistent update of the Glibc manual which in turn will help with introducing memory protection keys on AArch64 targets. Applies to 5bc55a333a2f (tag: v6.13-rc7). Note that I couldn't build ppc tests so I would appreciate if someone could check the 3rd patch. Thank you! Signed-off-by: Yury Khrustalev <yury.khrustalev(a)arm.com> --- Changes in v4: - Removed change to tools/include/uapi/asm-generic/mman-common.h as it is not necessary. Link to v3: https://lore.kernel.org/all/20241028090715.509527-1-yury.khrustalev@arm.com/ Changes in v3: - Replaced previously missed 0-s tools/testing/selftests/mm/mseal_test.c - Replaced previously missed 0-s in tools/testing/selftests/mm/mseal_test.c Link to v2: https://lore.kernel.org/linux-arch/20241027170006.464252-2-yury.khrustalev@… Changes in v2: - Update tools/include/uapi/asm-generic/mman-common.h as well - Add usages of the new macro to selftests. Link to v1: https://lore.kernel.org/linux-arch/20241022120128.359652-1-yury.khrustalev@… --- Yury Khrustalev (3): mm/pkey: Add PKEY_UNRESTRICTED macro selftests/mm: Use PKEY_UNRESTRICTED macro selftests/powerpc: Use PKEY_UNRESTRICTED macro include/uapi/asm-generic/mman-common.h | 1 + tools/testing/selftests/mm/mseal_test.c | 6 +++--- tools/testing/selftests/mm/pkey-helpers.h | 3 ++- tools/testing/selftests/mm/pkey_sighandler_tests.c | 4 ++-- tools/testing/selftests/mm/protection_keys.c | 2 +- tools/testing/selftests/powerpc/include/pkeys.h | 2 +- tools/testing/selftests/powerpc/mm/pkey_exec_prot.c | 2 +- tools/testing/selftests/powerpc/mm/pkey_siginfo.c | 2 +- tools/testing/selftests/powerpc/ptrace/core-pkey.c | 6 +++--- tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c | 6 +++--- 10 files changed, 18 insertions(+), 16 deletions(-) -- 2.39.5

4 months, 2 weeks

3
7
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-kselftest-mirror