(I apologize in advance if my email comes out formatted strangely, I haven't used ProtonMail for LKML before. I don't think it is line-wrapping properly.)
On Wednesday, December 3rd, 2025 at 12:45 PM, Yazen Ghannam yazen.ghannam@amd.com wrote:
On Fri, Nov 14, 2025 at 07:57:35PM +0000, Steven Noonan wrote:
Thanks Steven for the patch.
On a Xen dom0 boot, this feature does not behave, and we end up calculating:
num_roots = 1 num_nodes = 2 roots_per_node = 0
This causes a divide-by-zero in the modulus inside the loop.
Can you please share more details of the system topology?
I think the list of PCI devices is a good start.
Sure, but it's running as the paravirtual control domain for Xen. The `lspci` topology output won't differ between bare-metal and dom0, but dom0's accesses to certain MSRs and PCI registers may be masked and manipulated, which is probably why this is breaking.
I've attached `lspci -nn` and a CPUID dump from CPU0 -- both of these are while running under Xen.
This change adds a couple of guards for invalid states where we might get a divide-by-zero.
This statement should be imperative, ex. "Add a couple of guards...".
Also, the commit message should generally be in a passive voice (no "we"), ex. "...where a divide-by-zero may result."
Ack. I can fix these and the subsequent suggestions for version 2.
Signed-off-by: Steven Noonan steven@uplinklabs.net Signed-off-by: Ariadne Conill ariadne@ariadne.space
The Signed-off-by lines should be in the order of handling. If you are sending the patch, then your line should be last. If there are other contributors, then they should have a Co-developed-by tag in addition to Signed-off-by.
CC: Yazen Ghannam yazen.ghannam@amd.com CC: x86@vger.kernel.org CC: stable@vger.kernel.org
There should be a Fixes tag along with "Cc: stable", if possible.
Ack.
arch/x86/kernel/amd_node.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/amd_node.c b/arch/x86/kernel/amd_node.c index 3d0a4768d603c..cdc6ba224d4ad 100644 --- a/arch/x86/kernel/amd_node.c +++ b/arch/x86/kernel/amd_node.c @@ -282,6 +282,17 @@ static int __init amd_smn_init(void) return -ENODEV;
num_nodes = amd_num_nodes();
- if (!num_nodes)
- return -ENODEV;
This is generally a good check. But I think it is unnecessary in this case, since the minimum value is '1'. The topology init code initializes the factors used in amd_num_nodes() to '1' before trying to find the true values from CPUID, etc.
- /* Possibly a virtualized environment (e.g. Xen) where we wi
Multi-line comments should start on the next line according to kernel coding style.
/*
- Comment
- Info
*/
ll get
- roots_per_node=0 if the number of roots is fewer than number of
- nodes
- */
- if (num_roots < num_nodes)
- return -ENODEV;
I think this is a fair check. But I'd like to understand how the topology looks in this case.
Thanks, Yazen