On Tue, 2022-01-18 at 11:14 -0800, Reinette Chatre wrote:
Vijay reported that the "unclobbered_vdso_oversubscribed" selftest triggers the softlockup detector.
Actual SGX systems have 128GB of enclave memory or more. The "unclobbered_vdso_oversubscribed" selftest creates one enclave which consumes all of the enclave memory on the system. Tearing down such a large enclave takes around a minute, most of it in the loop where the EREMOVE instruction is applied to each individual 4k enclave page.
Spending one minute in a loop triggers the softlockup detector.
Add a cond_resched() to give other tasks a chance to run and placate the softlockup detector.
Cc: stable@vger.kernel.org
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Reported-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Softlockup message:
watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [test_sgx:11502]
Kernel panic - not syncing: softlockup: hung tasks
<snip>
sgx_encl_release+0x86/0x1c0
sgx_release+0x11c/0x130
__fput+0xb0/0x280
____fput+0xe/0x10
task_work_run+0x6c/0xc0
exit_to_user_mode_prepare+0x1eb/0x1f0
syscall_exit_to_user_mode+0x1d/0x50
do_syscall_64+0x46/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
 arch/x86/kernel/cpu/sgx/encl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..ab2b79327a8a 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -410,6 +410,7 @@ void sgx_encl_release(struct kref *ref)
 		}
 
 		kfree(entry);
+		cond_resched();
 	}
 
 	xa_destroy(&encl->page_array);
I'd add a comment, e.g.
/* Invoke scheduler to prevent soft lockups. */
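To illustrate the placement, here is a rough user-space sketch of the pattern, not the actual sgx_encl_release() code. The names cond_resched_stub(), struct page_entry and release_all() are made up for the example, and sched_yield() only stands in for cond_resched(), which is not available outside the kernel:

```c
#include <sched.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for the kernel's cond_resched(). */
static void cond_resched_stub(void)
{
	sched_yield();
}

/* Hypothetical stand-in for a per-page enclave entry. */
struct page_entry {
	int has_epc_page;
};

/*
 * Free every entry in the array, offering to yield the CPU after each one.
 * Returns the number of entries that still had a backing page.
 */
static unsigned long release_all(struct page_entry **entries, unsigned long nr)
{
	unsigned long removed = 0;
	unsigned long i;

	for (i = 0; i < nr; i++) {
		struct page_entry *entry = entries[i];

		if (entry->has_epc_page) {
			/* The kernel loop applies EREMOVE at this point. */
			entry->has_epc_page = 0;
			removed++;
		}

		free(entry);
		entries[i] = NULL;
		/* Invoke scheduler to prevent soft lockups. */
		cond_resched_stub();
	}

	return removed;
}
```

The comment sits directly above the call, at the end of the loop body, so it is clear why the loop gives up the CPU on every iteration.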
Other than that, makes sense.
BR, Jarkko