On Mon, Feb 03, 2025 at 09:01:41AM -0800, Vishal Annapurve wrote:
On Mon, Feb 3, 2025 at 8:00 AM Kirill A. Shutemov <kirill@shutemov.name> wrote:
...
Are you hinting towards a model where TDX guest prohibits such call sites from being configured? I am not sure if it's a sustainable model if we just rely on the host not advertising these features as the guest kernel can still add new paths that are not controlled by the host that lead to *_safe_halt().
I've asked TDX module folks to provide additional information in ve_info to help handle STI shadow correctly. They will implement it, but it will take some time.
What will the final solution look like?
VMX has GUEST_INTERRUPTIBILITY_INFO. This info is going to be passed via ve_info. Details are TBD.
With the info in hand, we can check if we are in STI shadow (regardless of instruction) and skip interrupt enabling in that case.
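To make the idea concrete, here is a rough sketch of what such a check could look like. It is only an illustration: the struct and its "interruptibility" field are hypothetical stand-ins, since the real ve_info layout is still TBD, and the helper names are made up. The only thing taken from VMX is the bit layout of the interruptibility state (bit 0 is blocking by STI, bit 1 is blocking by MOV SS).

```c
#include <stdbool.h>
#include <stdint.h>

/* VMX guest interruptibility state bits, per the Intel SDM. */
#define GUEST_INTR_STATE_STI    0x00000001u  /* blocking by STI */
#define GUEST_INTR_STATE_MOV_SS 0x00000002u  /* blocking by MOV SS */

/*
 * Hypothetical stand-in for struct ve_info. The field name and width
 * are assumptions; the TDX module interface is not defined yet.
 */
struct ve_info_sketch {
	uint32_t interruptibility;
};

/* True if the #VE was raised while the guest was in STI shadow. */
static bool ve_in_sti_shadow(const struct ve_info_sketch *ve)
{
	return (ve->interruptibility & GUEST_INTR_STATE_STI) != 0;
}

/*
 * Decide whether the #VE handler may enable interrupts. irqs_were_on
 * models the IF flag of the interrupted context. In STI shadow we skip
 * enabling, whatever instruction triggered the #VE.
 */
static bool ve_may_enable_irqs(const struct ve_info_sketch *ve,
			       bool irqs_were_on)
{
	return irqs_were_on && !ve_in_sti_shadow(ve);
}
```

The check is keyed on the shadow bit rather than on the faulting instruction, which is what would let it replace an HLT-specific test later.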
So we need some kind of stopgap until we have it.
Does it make sense to carry the patch suggested by Sean [1] as a stopgap for now?
[1] https://lore.kernel.org/lkml/Z5l6L3Hen9_Y3SGC@google.com/
I like it more than paravirt calls. And in the future, HLT check can be replaced with STI shadow check if the info is available.
I am reluctant to commit to paravirt calls for this workaround. They will likely stick forever. If it is possible, I would like to avoid them. If not, oh well.
- acpi_safe_halt() -> safe_halt() -> raw_safe_halt() -> arch_safe_halt()
Have you checked why you get there? I don't see a reason for a TDX guest to get into ACPI idle stuff. We don't have C-states to manage.
Apparently the userspace VMM is advertising pblock_address through SSDT tables in my configuration, which causes guests to enable ACPI cpuidle drivers. Do you know whether future generations of TDX hardware will support different C-states for TDX VMs?
I have very limited understanding of power management, but I don't see how C-states can be meaningfully supported by any virtualized environment. To me, C-states only make sense for baremetal.
One possibility is that the host can tell guests to use "mwait" as the C-state entry mechanism, as an alternative to halt, if it is supported.
You don't need cpuidle for that. If MWAIT is supported, just enumerate MWAIT to the guest and select_idle_routine() will pick it over the TDX-specific one.
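Roughly, the priority order looks like the simplified model below. This is not the kernel code: the enum, struct and function names are made up for illustration, and the real select_idle_routine() applies further conditions before it prefers MWAIT.

```c
#include <stdbool.h>

/* Illustrative names only; not the kernel's identifiers. */
enum idle_routine_sketch {
	IDLE_MWAIT,		/* MWAIT-based idle */
	IDLE_TDX_HALT,		/* TDX-specific halt via hypercall */
	IDLE_DEFAULT_HALT,	/* plain HLT-based idle */
};

struct cpu_caps_sketch {
	bool mwait_usable;	/* MWAIT enumerated and deemed usable */
	bool tdx_guest;		/* running as a TDX guest */
};

/* MWAIT is checked first, so it wins over the TDX-specific routine. */
static enum idle_routine_sketch
pick_idle_routine(const struct cpu_caps_sketch *caps)
{
	if (caps->mwait_usable)
		return IDLE_MWAIT;
	if (caps->tdx_guest)
		return IDLE_TDX_HALT;
	return IDLE_DEFAULT_HALT;
}
```

So a TDX guest that sees MWAIT never reaches the halt path at all, with no cpuidle driver involved.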