On Fri, Jan 31, 2025 at 06:32:04PM -0800, Vishal Annapurve wrote:
On Fri, Jan 31, 2025 at 12:13 AM Kirill A. Shutemov kirill@shutemov.name wrote:
On Thu, Jan 30, 2025 at 11:45:01AM -0800, Vishal Annapurve wrote:
On Thu, Jan 30, 2025 at 10:48 AM Kirill A. Shutemov kirill@shutemov.name wrote:
...
I think it is worth to putting this into a separate patch and not backport. The rest of the patch is bugfix and this doesn't belong.
Otherwise, looks good to me:
Reviewed-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com@linux.intel.com>
-- Kiryl Shutsemau / Kirill A. Shutemov
Thanks Kirill for the review.
Thinking more about this fix, now I am wondering why the efforts [1] to move halt/safe_halt under CONFIG_PARAVIRT were abandoned. Currently proposed fix is incomplete as it would not handle scenarios where CONFIG_PARAVIRT_XXL is disabled. I am tilting towards reviving [1] and requiring CONFIG_PARAVIRT for TDX VMs. WDYT?
[1] https://lore.kernel.org/lkml/20210517235008.257241-1-sathyanarayanan.kuppusw...
Many people dislike paravirt callbacks. We tried to avoid relying on them for core TDX enabling.
Can you explain the issue you see with CONFIG_PARAVIRT_XXL being disabled? I don't think I follow.
Relevant callers of *_safe_halt() are:
- kvm_wait() -> safe_halt() -> raw_safe_halt() -> arch_safe_halt()
Okay, I didn't realized that CONFIG_PARAVIRT_SPINLOCKS doesn't depend on CONFIG_PARAVIRT_XXL.
It would be interesting to check if paravirtualized spinlocks make sense for TDX given the cost of TD exit.
Maybe we should avoid advertising KVM_FEATURE_PV_UNHALT to the TDX guests?
Are you hinting towards a model where TDX guest prohibits such call sites from being configured? I am not sure if it's a sustainable model if we just rely on the host not advertising these features as the guest kernel can still add new paths that are not controlled by the host that lead to *_safe_halt().
I've asked TDX module folks to provide additional information in ve_info to help handle STI shadow correctly. They will implement it, but it will take some time.
So we need some kind of stopgap until we have it.
I am reluctant to commit to paravirt calls for this workaround. They will likely stick forever. It is possible, I would like to avoid them. If not, oh well.
- acpi_safe_halt() -> safe_halt() -> raw_safe_halt() -> arch_safe_halt()
Have you checked why you get there? I don't see a reason for TDX guest to get into ACPI idle stuff. We don't have C-states to manage.
Apparently userspace VMM is advertising pblock_address through SSDT tables in my configuration which causes guests to enable ACPI cpuidle drivers. Do you know if future generations of TDX hardware will not support different c-states for TDX VMs?
I have very limited understanding of power management, but I don't see how C-states can be meaningfully supported by any virtualized environment. To me, C-states only make sense for baremetal.