This series is a follow-up to [1], which adds mTHP support to khugepaged.
mTHP khugepaged support is a "loose" dependency for the sysfs/sysctl
configs to make sense. Without it global="defer" and mTHP="inherit" case
is "undefined" behavior.
We've seen cases where customers switching from RHEL7 to RHEL8 see a
significant increase in the memory footprint for the same workloads.
Through our investigations we found that a large contributing factor to
the increase in RSS was an increase in THP usage.
For workloads like MySQL, or when using allocators like jemalloc, it is
often recommended to set /sys/kernel/mm/transparent_hugepage/enabled to
never. This is in part due to performance degradations and increased
memory waste.
This series introduces enabled=defer, a setting that acts as a middle
ground between always and madvise. If the mapping is MADV_HUGEPAGE, the
page fault handler will act normally, making a hugepage if possible. If
the mapping is not MADV_HUGEPAGE, then the page fault handler will
default to the base size allocation. The caveat is that khugepaged can
still operate on pages that are not MADV_HUGEPAGE.
This allows for three things. First, applications specifically designed to
use hugepages will get them. Second, applications that don't use
hugepages can still benefit from them without aggressively inserting
THPs at every possible chance. This curbs the memory waste and defers
the use of hugepages to khugepaged, which can then scan memory for ranges
eligible for collapse. Lastly, there is the added benefit for those who want
THPs but experience higher-latency PFs: you get base page performance in
the PF handler and hugepage performance for those mappings after they collapse.
Admins may want to lower max_ptes_none; otherwise, khugepaged may
aggressively collapse single allocations into hugepages.
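The behavior described above can be summarized in a small sketch. This is a simplified model, not the kernel implementation: the enum and all three helper names are hypothetical, MADV_NOHUGEPAGE and the per-size mTHP settings are ignored, and the max_ptes_none check assumes 512 PTEs per PMD (x86_64 with 4K pages).

```c
#include <stdbool.h>

/* Hypothetical model of the global THP "enabled" setting; not kernel code. */
enum thp_mode { THP_NEVER, THP_MADVISE, THP_DEFER, THP_ALWAYS };

/* Would the page fault handler try to install a hugepage for this mapping? */
static bool fault_uses_hugepage(enum thp_mode mode, bool vma_madv_hugepage)
{
	switch (mode) {
	case THP_ALWAYS:
		return true;
	case THP_MADVISE:
	case THP_DEFER:
		/* defer behaves like madvise at fault time */
		return vma_madv_hugepage;
	default:
		return false;
	}
}

/* May khugepaged later collapse this mapping into a hugepage? */
static bool khugepaged_may_collapse(enum thp_mode mode, bool vma_madv_hugepage)
{
	switch (mode) {
	case THP_ALWAYS:
	case THP_DEFER:
		/* defer behaves like always for khugepaged */
		return true;
	case THP_MADVISE:
		return vma_madv_hugepage;
	default:
		return false;
	}
}

/*
 * Simplified max_ptes_none check: a range is collapsible only if the
 * number of empty (none) PTEs does not exceed the tunable.
 */
static bool within_max_ptes_none(unsigned int none_ptes,
				 unsigned int max_ptes_none)
{
	return none_ptes <= max_ptes_none;
}
```

With the default max_ptes_none of 511, a PMD range with a single populated page (511 empty PTEs) passes the check, which is the aggressive collapse mentioned above; with max_ptes_none=64 the same range is left alone.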
TESTING:
- Built for x86_64, aarch64, ppc64le, and s390x
- selftests mm
- In [1] I provided a script [2] that has multiple access patterns
- lots of general use. These changes have been running in my VM for some time
- redis testing. This test was my original case for the defer mode. What I was
able to prove was that THP=always leads to increased max_latency cases, which
is why it is recommended to disable THPs for redis servers. However, with 'defer'
we don't have the max_latency spikes and can still get the system to utilize
THPs. I further tested this with the mTHP defer setting and found that redis
(and probably other jemalloc users) can utilize THPs via defer (+mTHP defer)
without a large latency penalty and with some potential gains.
I uploaded some mmtest results here [3] that compare:
stock+thp=never
stock+(m)thp=always
khugepaged-mthp + defer (max_ptes_none=64)
The results show that (m)THPs can cause some throughput regression in some
cases, but also have gains in other cases. The mTHP+defer results have more
gains and fewer losses than the (m)THP=always case.
V3 Changes:
- moved some Documentation to the other series and merged the remaining
Documentation updates into one
V2 Changes:
- base changes on mTHP khugepaged support
- Fix selftests parsing issue
- add mTHP defer option
- add mTHP defer Documentation
[1] - https://lore.kernel.org/lkml/20250414220557.35388-1-npache@redhat.com/
[2] - https://gitlab.com/npache/khugepaged_mthp_test
[3] - https://people.redhat.com/npache/mthp_khugepaged_defer/testoutput2/output.h…
Nico Pache (4):
mm: defer THP insertion to khugepaged
mm: document (m)THP defer usage
khugepaged: add defer option to mTHP options
selftests: mm: add defer to thp setting parser
Documentation/admin-guide/mm/transhuge.rst | 31 +++++++---
include/linux/huge_mm.h | 18 +++++-
mm/huge_memory.c | 69 +++++++++++++++++++---
mm/khugepaged.c | 10 ++--
tools/testing/selftests/mm/thp_settings.c | 1 +
tools/testing/selftests/mm/thp_settings.h | 1 +
6 files changed, 107 insertions(+), 23 deletions(-)
--
2.48.1
This is v6 of the TDX selftests that follow RFC v5 sent more than a year
ago. While it has been a while since the previous posting, the TDX
selftests kept up to date with the latest TDX development and supported
the health of the TDX base series.
With TDX base support now in kvm-coco-queue it is a good opportunity to
share the TDX selftests again and also remove the "RFC" to convey that
this work is now ready to be considered for inclusion in support of the
TDX base support.
Apart from the addition of one new test ("KVM: selftests: TDX: Test
LOG_DIRTY_PAGES flag to a non-GUEST_MEMFD memslot") this series should be
familiar to anybody that previously looked at "RFC v5". All previous feedback
has been addressed. At the same time the changes to TDX base support needed
several matching changes in the TDX selftests that prompted dropping all
previously received "Reviewed-by" tags to indicate that the patches deserve
a new look. In support of upstream inclusion this version also includes many
non-functional changes intended to follow the style and customs of this area.
This series is based on: commit 58dd191cf39c ("KVM: x86: Forbid the use of
kvm_load_host_xsave_state() with guest_state_protected") from branch
kvm-coco-queue on git://git.kernel.org/pub/scm/virt/kvm/kvm.git
While the kvm-coco-queue already contains these selftests, this is a
more up-to-date version of the patches.
The tree can be found at:
https://github.com/googleprodkernel/linux-cc/tree/tdx-selftests-v6
I would like to acknowledge the following people, who helped keep these
patches up to date with the latest TDX patches and prepare them for
review:
Reinette Chatre <reinette.chatre(a)intel.com>
Isaku Yamahata <isaku.yamahata(a)intel.com>
Binbin Wu <binbin.wu(a)linux.intel.com>
Adrian Hunter <adrian.hunter(a)intel.com>
Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Links to earlier patch series:
RFC v5: https://lore.kernel.org/all/20231212204647.2170650-1-sagis@google.com/
RFC v4: https://lore.kernel.org/lkml/20230725220132.2310657-1-afranji@google.com/
RFC v3: https://lore.kernel.org/lkml/20230121001542.2472357-1-ackerleytng@google.co…
RFC v2: https://lore.kernel.org/lkml/20220830222000.709028-1-sagis@google.com/T/#u
RFC v1: https://lore.kernel.org/lkml/20210726183816.1343022-1-erdemaktas@google.com…
Ackerley Tng (12):
KVM: selftests: Add function to allow one-to-one GVA to GPA mappings
KVM: selftests: Expose function that sets up sregs based on VM's mode
KVM: selftests: Store initial stack address in struct kvm_vcpu
KVM: selftests: Add vCPU descriptor table initialization utility
KVM: selftests: TDX: Use KVM_TDX_CAPABILITIES to validate TDs'
attribute configuration
KVM: selftests: TDX: Update load_td_memory_region() for VM memory
backed by guest memfd
KVM: selftests: Add functions to allow mapping as shared
KVM: selftests: KVM: selftests: Expose new vm_vaddr_alloc_private()
KVM: selftests: TDX: Add support for TDG.MEM.PAGE.ACCEPT
KVM: selftests: TDX: Add support for TDG.VP.VEINFO.GET
KVM: selftests: TDX: Add TDX UPM selftest
KVM: selftests: TDX: Add TDX UPM selftests for implicit conversion
Erdem Aktas (3):
KVM: selftests: Add helper functions to create TDX VMs
KVM: selftests: TDX: Add TDX lifecycle test
KVM: selftests: TDX: Add TDX HLT exit test
Isaku Yamahata (1):
KVM: selftests: Update kvm_init_vm_address_properties() for TDX
Roger Wang (1):
KVM: selftests: TDX: Add TDG.VP.INFO test
Ryan Afranji (2):
KVM: selftests: TDX: Verify the behavior when host consumes a TD
private memory
KVM: selftests: TDX: Add shared memory test
Sagi Shahar (10):
KVM: selftests: TDX: Add report_fatal_error test
KVM: selftests: TDX: Adding test case for TDX port IO
KVM: selftests: TDX: Add basic TDX CPUID test
KVM: selftests: TDX: Add basic TDG.VP.VMCALL<GetTdVmCallInfo> test
KVM: selftests: TDX: Add TDX IO writes test
KVM: selftests: TDX: Add TDX IO reads test
KVM: selftests: TDX: Add TDX MSR read/write tests
KVM: selftests: TDX: Add TDX MMIO reads test
KVM: selftests: TDX: Add TDX MMIO writes test
KVM: selftests: TDX: Add TDX CPUID TDVMCALL test
Yan Zhao (1):
KVM: selftests: TDX: Test LOG_DIRTY_PAGES flag to a non-GUEST_MEMFD
memslot
tools/testing/selftests/kvm/Makefile.kvm | 8 +
.../testing/selftests/kvm/include/kvm_util.h | 36 +
.../selftests/kvm/include/x86/kvm_util_arch.h | 1 +
.../selftests/kvm/include/x86/processor.h | 2 +
.../selftests/kvm/include/x86/tdx/td_boot.h | 83 ++
.../kvm/include/x86/tdx/td_boot_asm.h | 16 +
.../selftests/kvm/include/x86/tdx/tdcall.h | 54 +
.../selftests/kvm/include/x86/tdx/tdx.h | 67 +
.../selftests/kvm/include/x86/tdx/tdx_util.h | 23 +
.../selftests/kvm/include/x86/tdx/test_util.h | 133 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 74 +-
.../testing/selftests/kvm/lib/x86/processor.c | 108 +-
.../selftests/kvm/lib/x86/tdx/td_boot.S | 100 ++
.../selftests/kvm/lib/x86/tdx/tdcall.S | 163 +++
tools/testing/selftests/kvm/lib/x86/tdx/tdx.c | 243 ++++
.../selftests/kvm/lib/x86/tdx/tdx_util.c | 643 +++++++++
.../selftests/kvm/lib/x86/tdx/test_util.c | 187 +++
.../selftests/kvm/x86/tdx_shared_mem_test.c | 129 ++
.../testing/selftests/kvm/x86/tdx_upm_test.c | 461 ++++++
tools/testing/selftests/kvm/x86/tdx_vm_test.c | 1254 +++++++++++++++++
20 files changed, 3742 insertions(+), 43 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/td_boot.h
create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/td_boot_asm.h
create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/tdcall.h
create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/tdx.h
create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/tdx_util.h
create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/test_util.h
create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/td_boot.S
create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/tdcall.S
create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/tdx.c
create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/tdx_util.c
create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/test_util.c
create mode 100644 tools/testing/selftests/kvm/x86/tdx_shared_mem_test.c
create mode 100644 tools/testing/selftests/kvm/x86/tdx_upm_test.c
create mode 100644 tools/testing/selftests/kvm/x86/tdx_vm_test.c
--
2.49.0.504.g3bcea36a83-goog
Hello there,
Static analyser cppcheck says:
linux-6.15-rc2/tools/testing/selftests/kvm/lib/arm64/processor.c:107:2: style: int result is returned as long value. If the return value is long to avoid loss of information, then you have loss of information. [truncLongCastReturn]
Source code is
return 1 << (vm->va_bits - shift);
Maybe better code:
return 1UL << (vm->va_bits - shift);
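A minimal sketch of why the literal's type matters, assuming a hypothetical stand-in helper (ptrs_per_level is not the selftest function, and 1ULL is used here instead of 1UL so the result is 64-bit on 32-bit hosts as well). The shift is evaluated in the type of its left operand, so a plain 1 is shifted as int and only widened afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* The shift result has the (promoted) type of the left operand. */
_Static_assert(sizeof(1 << 3) == sizeof(int), "plain 1 shifts as int");
_Static_assert(sizeof(1ULL << 3) == sizeof(unsigned long long),
	       "1ULL shifts as unsigned long long");

/* Hypothetical stand-in for the helper flagged by cppcheck. */
static uint64_t ptrs_per_level(unsigned int va_bits, unsigned int shift)
{
	/* Guard against a negative or out-of-range shift count. */
	assert(shift <= va_bits && va_bits - shift < 64);

	/* The wide literal makes the shift itself happen in 64 bits. */
	return 1ULL << (va_bits - shift);
}
```

With the int form, a shift count of 31 or more would overflow int before the widening conversion to the return type takes place.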
Regards
David Binderman
v6:
- The memcg_test_low failure is indeed due to the memory_recursiveprot
mount option which is enabled by default in systemd cgroup v2 setting.
So adopt Michal's suggestion to adjust the low event checking
according to whether memory_recursiveprot is enabled or not.
v5:
- Use mem_cgroup_usage() in patch 1 as originally suggested by Johannes.
The test_memcontrol selftest consistently fails its test_memcg_low
sub-test (with memory_recursiveprot enabled) and sporadically fails
its test_memcg_min sub-test. This patchset fixes the test_memcg_min
and test_memcg_low failures by skipping the !usage case in
shrink_node_memcgs() and adjusting the test_memcontrol selftest to fix
other causes of the test failures.
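Since the low event check now depends on whether memory_recursiveprot is enabled, the test needs a way to detect that mount option. A minimal sketch of such a check follows; mount_opts_contain() is a hypothetical helper, not the function used in the selftest, and it assumes the caller has already read the comma-separated option string of the cgroup2 mount (for example from /proc/mounts).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the comma-separated mount option list contains an
 * option exactly equal to name (hypothetical helper for illustration).
 */
static bool mount_opts_contain(const char *opts, const char *name)
{
	size_t len = strlen(name);
	const char *p = opts;

	while (*p) {
		const char *end = strchr(p, ',');
		size_t tok = end ? (size_t)(end - p) : strlen(p);

		/* Compare whole tokens so a longer option does not match. */
		if (tok == len && strncmp(p, name, len) == 0)
			return true;
		if (!end)
			break;
		p = end + 1;
	}
	return false;
}
```

Matching whole tokens rather than using a substring search avoids a false positive if another option ever shares the same prefix.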
Waiman Long (2):
mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()
selftests: memcg: Increase error tolerance of child memory.current
check in test_memcg_protection()
mm/internal.h | 9 +++++++++
mm/memcontrol-v1.h | 2 --
mm/vmscan.c | 4 ++++
.../selftests/cgroup/test_memcontrol.c | 20 ++++++++++++-------
4 files changed, 26 insertions(+), 9 deletions(-)
--
2.48.1
This series fixes the KConfig for cs_dsp and cs-amp-lib tests so that
CONFIG_KUNIT_ALL_TESTS doesn't cause them to add modules to the build.
Patch 1 adds the ASoC CS35L56 driver to KUnit all_tests.config so that
cs_dsp and cs-amp-lib will be included in the test build.
Patches 2 and 3 fix up the KConfig entries for cs_dsp and cs-amp-lib.
Nico Pache (1):
firmware: cs_dsp: tests: Depend on FW_CS_DSP rather then enabling it
Richard Fitzgerald (2):
kunit: configs: Add some Cirrus Logic modules to all_tests
ASoC: cs-amp-lib-test: Don't select SND_SOC_CS_AMP_LIB
drivers/firmware/cirrus/Kconfig | 5 +----
sound/soc/codecs/Kconfig | 5 ++---
tools/testing/kunit/configs/all_tests.config | 2 ++
3 files changed, 5 insertions(+), 7 deletions(-)
--
2.39.5
As part of LKFT’s re-validation of known issues, we have observed that
the selftests: cgroup suite is consistently failing across almost all
LKFT-supported devices due to:
- Test timeouts (45 seconds limit reached)
- OOM-killer invocation
## Key Questions for Discussion:
- Would it be beneficial to increase the test timeout to ~180 seconds
to allow sufficient execution time?
- Should we enhance logging to explicitly print failure reasons when a
test fails?
- Are there any missing dependencies that could be causing these failures?
Note: The required selftests/cgroup/config options were included in
LKFT's build and test plans.
## Devices Affected:
The following DUTs consistently experience these failures:
- dragonboard-410c (arm64)
- dragonboard-845c (arm64)
- e850-96 (arm64)
- juno-r2 (arm64)
- qemu-arm64 (arm64)
- qemu-armv7
- qemu-x86_64
- rk3399-rock-pi-4b (arm64)
- x15 (arm)
- x86_64
## Regression Analysis:
- New regression? No (these failures have been observed for months/years).
- Reproducibility? Yes, the failures occur consistently.
- Test suite affected? selftests: cgroup (timeouts and OOM-related failures).
Test regression: selftests cgroup fails timeout and oom-killer
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
## Test log:
# selftests: cgroup: test_cpu
# ok 1 test_cpucg_subtree_control
# ok 2 test_cpucg_stats
# ok 3 test_cpucg_nice
# not ok 4 test_cpucg_weight_overprovisioned
# ok 5 test_cpucg_weight_underprovisioned
# ok 6 test_cpucg_nested_weight_overprovisioned
# ok 7 test_cpucg_nested_weight_underprovisioned
#
not ok 2 selftests: cgroup: test_cpu # TIMEOUT 45 seconds
<trim>
# selftests: cgroup: test_freezer
# ok 1 test_cgfreezer_simple
# ok 2 test_cgfreezer_tree
# ok 3 test_cgfreezer_forkbomb
# ok 4 test_cgfreezer_mkdir
# ok 5 test_cgfreezer_rmdir
# ok 6 test_cgfreezer_migrate
# Cgroup /sys/fs/cgroup/cg_test_ptrace isn't frozen
# not ok 7 test_cgfreezer_ptrace
# ok 8 test_cgfreezer_stopped
# ok 9 test_cgfreezer_ptraced
# ok 10 test_cgfreezer_vfork
not ok 4 selftests: cgroup: test_freezer # exit=1
<trim>
selftests: cgroup: test_kmem
#
not ok 7 selftests: cgroup: test_kmem # TIMEOUT 45 seconds
<trim>
# selftests: cgroup: test_memcontrol
# ok 1 test_memcg_subtree_control
# not ok 2 test_memcg_current_peak
# not ok 3 test_memcg_min
# not ok 4 test_memcg_low
# not ok 5 test_memcg_high
# ok 6 test_memcg_high_sync
[ 270.699078] test_memcontrol invoked oom-killer:
gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[ 270.699921] CPU: 1 UID: 0 PID: 946 Comm: test_memcontrol Not
tainted 6.14.0-rc5-next-20250303 #1
[ 270.699930] Hardware name: Radxa ROCK Pi 4B (DT)
<trim>
[ 270.729527] Memory cgroup out of memory: Killed process 946
(test_memcontrol) total-vm:104840kB, anon-rss:30596kB,
file-rss:1056kB, shmem-rss:0kB, UID:0 pgtables:104kB oom_score_adj:0
# not ok 7 test_memcg_max
# not ok 8 test_memcg_reclaim
<trim>
not ok 8 selftests: cgroup: test_memcontrol # exit=1
## Source
* Kernel version: 6.14.0-rc5-next-20250303
* Git tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
* Git sha: cd3215bbcb9d4321def93fea6cfad4d5b42b9d1d
* Git describe: 6.14.0-rc5-next-20250303
* Project details:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250303/
## Test data
* Test log: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250303/te…
* Test history:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250303/te…
* Test details:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250303/te…
* Test logs rock pi:
https://lkft.validation.linaro.org/scheduler/job/8148789#L1774
* Test logs x86: https://lkft.validation.linaro.org/scheduler/job/8148731#L1948
--
Linaro LKFT
https://lkft.linaro.org
Enabling a (modular) test should not silently enable additional kernel
functionality, as that may increase the attack vector of a product.
Fix this by making PRIME_NUMBERS_KUNIT_TEST depend on PRIME_NUMBERS
instead of selecting it.
After this, one can safely enable CONFIG_KUNIT_ALL_TESTS=m to build
modules for all appropriate tests for one's system, without pulling in
extra unwanted functionality, while still allowing a tester to manually
enable PRIME_NUMBERS and this test suite on a system where PRIME_NUMBERS
is not enabled by default.
Fixes: 313b38a6ecb46db4 ("lib/prime_numbers: convert self-test to KUnit")
Signed-off-by: Geert Uytterhoeven <geert(a)linux-m68k.org>
---
lib/Kconfig.debug | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 4060a89866626c0a..51722f5d041970aa 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -3326,7 +3326,7 @@ config GCD_KUNIT_TEST
config PRIME_NUMBERS_KUNIT_TEST
tristate "Prime number generator test" if !KUNIT_ALL_TESTS
depends on KUNIT
- select PRIME_NUMBERS
+ depends on PRIME_NUMBERS
default KUNIT_ALL_TESTS
help
This option enables the KUnit test suite for the {is,next}_prime_number
--
2.43.0