On 03.04.23 17:50, Stefan Roesch wrote:
I guess the interpreter could enable it (like a memory allocator could enable it for the whole heap). But I get that it's much easier to enable this per-process, and possibly only when many instances of the same process are running in that particular environment.
We don't want it to get enabled for all workloads of that interpreter, instead we want to be able to select for which workloads we enable KSM.
Right.
- New options for prctl system call
  This patch series adds two new options to the prctl system call. The first one enables KSM at the process level and the second one queries the setting. The setting is inherited by child processes. With this, KSM can be enabled for the seed process of a cgroup, and all processes in the cgroup will inherit the setting.
- Changes to KSM processing
  When KSM is enabled at the process level, the KSM code will iterate over all the VMA's and enable KSM for the eligible VMA's. When forking a process that has KSM enabled, the setting will be inherited by the new child process. In addition, when KSM is disabled for a process, KSM will be disabled for the VMA's where KSM has been enabled.
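For illustration, a userspace caller of the two options described above might look like the sketch below. It assumes the option names and numbers match what mainline <linux/prctl.h> later defined (PR_SET_MEMORY_MERGE = 67, PR_GET_MEMORY_MERGE = 68); the thread itself only names PR_SET_MEMORY_MERGE. The helpers ksm_set_process() and ksm_get_process() are hypothetical wrappers, not part of the series.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/* Assumed values, taken from mainline <linux/prctl.h>; older headers lack them. */
#ifndef PR_SET_MEMORY_MERGE
#define PR_SET_MEMORY_MERGE 67
#endif
#ifndef PR_GET_MEMORY_MERGE
#define PR_GET_MEMORY_MERGE 68
#endif

/*
 * Hypothetical wrapper: enable (non-zero) or disable (0) KSM for the
 * calling process. Returns 0 on success or -errno on failure, e.g.
 * -EINVAL on kernels without the option, or -EPERM if the capability
 * check from this version of the patch rejects the caller.
 */
int ksm_set_process(int enable)
{
	if (prctl(PR_SET_MEMORY_MERGE, enable ? 1UL : 0UL, 0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/*
 * Hypothetical wrapper: query the per-process setting.
 * Returns 1 if enabled, 0 if disabled, or -errno on failure.
 */
int ksm_get_process(void)
{
	int ret = prctl(PR_GET_MEMORY_MERGE, 0UL, 0UL, 0UL, 0UL);

	return ret == -1 ? -errno : ret;
}
```

A seed process would call ksm_set_process(1) before forking its workers, so that the children inherit the setting; on a kernel without the feature the call simply fails and nothing changes.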
Do we want to make MADV_MERGEABLE/MADV_UNMERGEABLE fail while the new prctl is enabled for a process?
I decided to allow enabling KSM with prctl even when MADV_MERGEABLE has already been used; this allows more flexibility.
MADV_MERGEABLE will be a nop. But IIUC, MADV_UNMERGEABLE will end up calling unmerge_ksm_pages() and clear VM_MERGEABLE. But then, the next KSM scan will merge the pages in there again.
Not sure if that flexibility is worth having.
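The re-merge concern can be made concrete with a small userspace model. This is only a sketch: struct toy_mm, struct toy_vma and the toy_* helpers are hypothetical stand-ins, and it assumes the scanner treats every VMA of a process with MMF_VM_MERGE_ANY as a candidate regardless of VM_MERGEABLE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_VMAS 2

struct toy_vma {
	bool vm_mergeable;	/* models VM_MERGEABLE on the VMA */
	bool merged;		/* models "pages in this VMA are merged" */
};

struct toy_mm {
	bool merge_any;		/* models MMF_VM_MERGE_ANY on the mm */
	struct toy_vma vmas[TOY_NR_VMAS];
};

/* Assumption: a VMA is scanned if it is flagged itself or the mm opted in. */
bool toy_vma_scanned(const struct toy_mm *mm, const struct toy_vma *vma)
{
	return mm->merge_any || vma->vm_mergeable;
}

/* Models MADV_UNMERGEABLE: unmerge the pages and clear the VMA flag. */
void toy_madvise_unmergeable(struct toy_vma *vma)
{
	vma->merged = false;
	vma->vm_mergeable = false;
}

/* Models one KSM scan pass: every scanned VMA ends up merged. */
void toy_ksm_scan(struct toy_mm *mm)
{
	for (size_t i = 0; i < TOY_NR_VMAS; i++)
		if (toy_vma_scanned(mm, &mm->vmas[i]))
			mm->vmas[i].merged = true;
}
```

In this model, with merge_any set, a VMA that was just unmerged via toy_madvise_unmergeable() is merged again by the next toy_ksm_scan(); without merge_any it stays unmerged.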
[...]
@@ -2661,6 +2662,32 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_SET_VMA:
 		error = prctl_set_vma(arg2, arg3, arg4, arg5);
 		break;
+#ifdef CONFIG_KSM
+	case PR_SET_MEMORY_MERGE:
+		if (!capable(CAP_SYS_RESOURCE))
+			return -EPERM;
+		if (arg2) {
+			if (mmap_write_lock_killable(me->mm))
+				return -EINTR;
+			if (!test_bit(MMF_VM_MERGE_ANY, &me->mm->flags))
+				error = __ksm_enter(me->mm, MMF_VM_MERGE_ANY);
Hm, I think this might be problematic if we already called __ksm_enter() via madvise(). Maybe we should really consider making MMF_VM_MERGE_ANY set MMF_VM_MERGEABLE instead. Like:
	error = 0;
	/* Only enter KSM if madvise() has not already done so. */
	if (!test_bit(MMF_VM_MERGEABLE, &me->mm->flags))
		error = __ksm_enter(me->mm);
	if (!error)
		set_bit(MMF_VM_MERGE_ANY, &me->mm->flags);
If we make that change, we would no longer be able to distinguish if MMF_VM_MERGEABLE or MMF_VM_MERGE_ANY have been set.
Why would you need that exactly? To cleanup? See below.
+			mmap_write_unlock(me->mm);
+		} else {
+			__ksm_exit(me->mm, MMF_VM_MERGE_ANY);
Hm, I'd prefer if we really only call __ksm_exit() when we really exit the process. Is there a strong requirement to optimize disabling of KSM or would it be sufficient to clear the MMF_VM_MERGE_ANY flag here?
Then we still have the mm_slot allocated until the process gets terminated.
Which is the same as using MADV_UNMERGEABLE, no?
Also, I wonder what happens if we have another VMA in that process that has it enabled ..
Last but not least, wouldn't we want to do the same thing as MADV_UNMERGEABLE and actually unmerge the KSM pages?
Do you want to call unmerge for all VMA's?
The question is what clearing MMF_VM_MERGE_ANY is supposed to do. If it's supposed to disable KSM (like MADV_UNMERGEABLE would), then I guess you should go over all VMA's and unmerge.
Also, it depends on how you want to handle VM_MERGEABLE with MMF_VM_MERGE_ANY. If MMF_VM_MERGE_ANY would not set VM_MERGEABLE, then you'd only unmerge where VM_MERGEABLE is not set. Otherwise, you'd unshare everywhere where VM_MERGEABLE is set (and clear VM_MERGEABLE while at it).
Unsharing when clearing MMF_VM_MERGE_ANY might be the right thing to do IMHO.
I guess the main questions regarding implementation are:
1) Do we want setting MMF_VM_MERGE_ANY to set VM_MERGEABLE on all candidate VMA's (go over all VMA's and set VM_MERGEABLE)? Then, clearing MMF_VM_MERGE_ANY would simply unmerge and clear VM_MERGEABLE on all VMA's.
2) Do we want to make MMF_VM_MERGE_ANY imply MMF_VM_MERGEABLE? You could still disable KSM (__ksm_exit()) when clearing MMF_VM_MERGE_ANY, after going over all VMA's (where you might want to unshare already either way).
I guess the code will end up simpler if you make MMF_VM_MERGE_ANY simply piggy-back on MMF_VM_MERGEABLE + VM_MERGEABLE. I might be wrong, of course.
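To make the piggy-back idea more tangible, here is a rough userspace model of it, not kernel code. The pb_* types and helpers are hypothetical; the model assumes that setting the process-wide flag enters KSM at most once and marks every eligible VMA, and that clearing it unmerges and unmarks every marked VMA while leaving the mm registered until exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PB_NR_VMAS 2

struct pb_vma {
	bool eligible;		/* models "KSM may operate on this VMA" */
	bool vm_mergeable;	/* models VM_MERGEABLE */
	bool merged;		/* models "pages in this VMA are merged" */
};

struct pb_mm {
	bool mmf_vm_mergeable;	/* models MMF_VM_MERGEABLE */
	bool mmf_vm_merge_any;	/* models MMF_VM_MERGE_ANY */
	int ksm_enter_calls;	/* counts modelled __ksm_enter() calls */
	struct pb_vma vmas[PB_NR_VMAS];
};

/* Models __ksm_enter(): register the mm with KSM. */
int pb_ksm_enter(struct pb_mm *mm)
{
	mm->ksm_enter_calls++;
	mm->mmf_vm_mergeable = true;
	return 0;
}

/* Models enabling: enter KSM only if needed, then mark candidate VMA's. */
int pb_set_merge_any(struct pb_mm *mm)
{
	int error = 0;

	if (!mm->mmf_vm_mergeable)
		error = pb_ksm_enter(mm);
	if (error)
		return error;

	mm->mmf_vm_merge_any = true;
	for (size_t i = 0; i < PB_NR_VMAS; i++)
		if (mm->vmas[i].eligible)
			mm->vmas[i].vm_mergeable = true;
	return 0;
}

/* Models disabling: unmerge and clear the flag on every marked VMA. */
void pb_clear_merge_any(struct pb_mm *mm)
{
	mm->mmf_vm_merge_any = false;
	for (size_t i = 0; i < PB_NR_VMAS; i++) {
		if (!mm->vmas[i].vm_mergeable)
			continue;
		mm->vmas[i].merged = false;
		mm->vmas[i].vm_mergeable = false;
	}
	/* mmf_vm_mergeable stays set: the mm is only dropped at exit. */
}
```

One consequence visible in the model is that pb_clear_merge_any() also unmarks VMA's that were originally enabled via madvise(), since the two sources can no longer be told apart once everything piggy-backs on the same flags.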