On 2025-10-19 10:15 AM, Lukas Wunner wrote:
On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
On 2025-10-18 09:17:33, Lukas Wunner wrote:
On Fri, Oct 17, 2025 at 05:07:07PM -0700, Vipin Sharma wrote:
Move struct pci_saved_state{} and struct pci_cap_saved_data{} to linux/pci.h so that they are available to code outside of the PCI core.
These structs will be used in subsequent commits to serialize and deserialize PCI state across Live Update.
That's not sufficient as a justification to make these public in my view.
There are already pci_store_saved_state() and pci_load_saved_state() helpers to serialize PCI state. Why do you need anything more? (Honest question.)
In LUO ecosystem, currently, we do not have a solid solution to do proper serialization/deserialization of structs along with versioning between different kernel versions. This work is still being discussed.
Here, I created separate structs (exactly same as the original one) to have little bit control on what gets saved in serialized state and correctly gets deserialized after kexec.
For example, if I am using existing structs and not creating my own structs then I cannot just do a blind memcpy() between whole of the PCI state prior to kexec to PCI state after the kexec. In the new kernel layout might have changed like addition or removal of a field.
The last time we changed those structs was in 2013 by fd0f7f73ca96. So changes are extremely rare.
What could change in theory is the layout of the individual capabilities (the data[] in struct pci_cap_saved_data). E.g. maybe we decide that we need to save an additional register. But that's also rare. Normally we add all the mutable registers when a new capability is supported and have no need to amend that afterwards.
Yeah that has me worried. A totally innocuous commit that adds, removes, or reorders a register stashed in data[] could lead a broken device when VFIO does pci_restore_state() after a Live Update.
Turing pci_save_state into an actual ABI would require adding the registers into the save state probably, rather than assuming their order.
But... I wonder if we truly need to preserve the PCI save state across Live Update.
Based on this comment in drivers/vfio/pci/vfio_pci_core.c, the PCI save/restore stuff in VFIO is for cleaning up devices that do not support resets:
648 /* 649 * If we have saved state, restore it. If we can reset the device, 650 * even better. Resetting with current state seems better than 651 * nothing, but saving and restoring current state without reset 652 * is just busy work. 653 */ 654 if (pci_load_and_free_saved_state(pdev, &vdev->pci_saved_state)) { 655 pci_info(pdev, "%s: Couldn't reload saved state\n", __func__); 656 657 if (!vdev->reset_works) 658 goto out; 659 660 pci_save_state(pdev); 661 }
So if we just limit Live Update support to devices with reset_works, then we don't have to deal with preserving the save state.
I will have to double check that reset_works is true for all the devices we care about supporting for Live Update, but I imagine it will be. They're all relatively modern PCIe devices.