In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(),
folio_mapcount() is used to check whether the folio is shared. But it's
not correct as folio_mapcount() returns total mapcount of large folio.
Use folio_estimated_sharers() here as the estimated number is enough.
This patchset will fix the cases:
User space application call madvise() with MADV_FREE, MADV_COLD and
MADV_PAGEOUT for specific address range. There are THP mapped to the
range. Without the patchset, the THP is skipped. With the patch, the
THP will be split and handled accordingly.
David reported the cow self test skip some cases because of
MADV_PAGEOUT skip THP:
https://lore.kernel.org/linux-mm/9e92e42d-488f-47db-ac9d-75b24cd0d037@intel…
and I confirmed this patchset make it work again.
Changelog from v1:
- Avoid two Fixes tags make backport harder. Thank Andrew for pointing
this out.
- Add note section to mention this is a temporary fix which is fine
to reduce user-visble effects. For long term fix, we should wait for
David's solution. Thank Ryan and David for pointing this out.
- Spell user-visible effects out. Then people could decide whether
these patches are necessary for stable branch. Thank Andrew for
pointing this out.
V1:
https://lore.kernel.org/linux-mm/9e92e42d-488f-47db-ac9d-75b24cd0d037@intel…
Yin Fengwei (3):
madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount()
against large folio for sharing check
madvise:madvise_free_huge_pmd(): don't use mapcount() against large
folio for sharing check
madvise:madvise_free_pte_range(): don't use mapcount() against large
folio for sharing check
mm/huge_memory.c | 2 +-
mm/madvise.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
--
2.39.2
We hit softlocup with following call trace:
? asm_sysvec_apic_timer_interrupt+0x16/0x20
xa_erase+0x21/0xb0
? sgx_free_epc_page+0x20/0x50
sgx_vepc_release+0x75/0x220
__fput+0x89/0x250
task_work_run+0x59/0x90
do_exit+0x337/0x9a0
Similar like commit
8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves")
The test system has 64GB of enclave memory, and all assigned to a single
VM. Release vepc take longer time and triggers the softlockup warning.
Add a cond_resched() to give other tasks a chance to run and placate
the softlockup detector.
Cc: Jarkko Sakkinen <jarkko(a)kernel.org>
Cc: Haitao Huang <haitao.huang(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Fixes: 540745ddbc70 ("x86/sgx: Introduce virtual EPC for use by KVM guests")
Reported-by: Yu Zhang <yu.zhang(a)ionos.com>
Tested-by: Yu Zhang <yu.zhang(a)ionos.com>
Acked-by: Haitao Huang <haitao.huang(a)linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jack Wang <jinpu.wang(a)ionos.com>
---
v2:
* collects review and test by.
* add fixes tag
* trim call trace.
arch/x86/kernel/cpu/sgx/virt.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index c3e37eaec8ec..01d2795792cc 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -204,6 +204,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
continue;
xa_erase(&vepc->page_array, index);
+ cond_resched();
}
/*
@@ -222,6 +223,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
list_add_tail(&epc_page->list, &secs_pages);
xa_erase(&vepc->page_array, index);
+ cond_resched();
}
/*
--
2.34.1
Your attention please!
My efforts to reaching you many times always not through.
Please may you kindly let me know if you are still using this email
address as my previous messages to you were not responded to.
I await hearing from you once more if my previous messages were not received.
Reach me via my email: privateemail01112(a)gmail.com
My regards,
Kein Briggs.