From: Ye Bin yebin10@huawei.com
[ Upstream commit 7bb797757bf5720543f1c5115b40a8d646d5c1cc ]
Currently, when ARM64 displays CPU information, every call to c_show() assembles all CPU information. However, as the number of CPUs increases, this can lead to insufficient buffer space due to excessive assembly in a single call, causing repeated expansion and multiple calls to c_show().
To prevent this invalid c_show() call, only one CPU's information is assembled each time c_show() is called.
Signed-off-by: Ye Bin yebin10@huawei.com Link: https://lore.kernel.org/r/20250421062947.4072855-1-yebin@huaweicloud.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org ---
**YES** This commit should be backported to stable kernel trees. Here's my extensive analysis: ## Technical Analysis of the Code Changes The commit addresses a fundamental scalability issue in ARM64's `/proc/cpuinfo` implementation. The key changes are: ### Before (Problematic Implementation): ```c static int c_show(struct seq_file *m, void *v) { for_each_online_cpu(i) { // Shows ALL CPUs in one call struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i); seq_printf(m, "processor\t: %d\n", i); // ... extensive CPU information for ALL CPUs } } static void *c_start(struct seq_file *m, loff_t *pos) { return *pos < 1 ? (void *)1 : NULL; // Only called once } ``` ### After (Fixed Implementation): ```c static int c_show(struct seq_file *m, void *v) { int cpu = m->index; // Show only THIS CPU struct cpuinfo_arm64 *cpuinfo = v; seq_printf(m, "processor\t: %d\n", cpu); // ... information for just this one CPU } static void *c_start(struct seq_file *m, loff_t *pos) { *pos = cpumask_next(*pos - 1, cpu_online_mask); return *pos < nr_cpu_ids ? &per_cpu(cpu_data, *pos) : NULL; } ``` ## Why This is a Good Backport Candidate ### 1. **Fixes a Real Scalability Bug** - **Problem**: On ARM64 systems with many CPUs (64+, 128+), the original code dumps ALL CPU information in a single seq_file call - **Buffer overflow risk**: This can exceed seq_file buffer limits, causing repeated buffer reallocations and performance degradation - **User impact**: Users of high-CPU-count ARM64 servers experience slow or failing `/proc/cpuinfo` reads ### 2. **Architectural Alignment** The fix brings ARM64 in line with other architectures. For comparison, x86 has always used the correct pattern: - **x86 approach**: One CPU per `c_show()` call - **ARM64 before**: All CPUs per `c_show()` call (broken) - **ARM64 after**: One CPU per `c_show()` call (fixed) ### 3. **Small, Contained Change** - **Risk assessment**: Low risk - the change is localized to the cpuinfo seq_file operations - **No side effects**: Doesn't affect kernel functionality beyond `/proc/cpuinfo` display - **Well-established pattern**: Uses standard seq_file iteration patterns proven in other architectures ### 4. **Important for Modern Hardware** - **ARM64 server growth**: High-core-count ARM64 systems are becoming common - **Future-proofing**: Essential for scalability as ARM64 systems grow to 256+ cores - **Cloud environments**: Critical for ARM64 cloud instances with many vCPUs ### 5. **Clear Bug vs. Feature** This is clearly a **bug fix**, not a feature addition: - **Fixes broken behavior**: Prevents buffer overflow issues - **No new features**: Only changes how existing information is displayed - **Behavioral correction**: Makes ARM64 behave correctly like other architectures ## Stable Tree Criteria Assessment ✅ **Important bugfix**: Fixes scalability issues on high-CPU-count systems ✅ **Small and contained**: Changes only affect cpuinfo display logic ✅ **No architectural changes**: Uses established seq_file patterns ✅ **Minimal regression risk**: Low risk of breaking existing functionality ✅ **Clear user benefit**: Improves performance and reliability on many-CPU systems ## Conclusion This commit perfectly fits stable tree criteria. It fixes a real scalability bug that affects ARM64 systems with many CPUs, uses a small and well-understood fix, and brings ARM64 in line with other architectures' proven implementations. The change is essential for modern high-core-count ARM64 deployments and should be backported to ensure stable kernel trees support these systems properly.
arch/arm64/kernel/cpuinfo.c | 107 ++++++++++++++++++------------------ 1 file changed, 53 insertions(+), 54 deletions(-)
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 285d7d5383420..750864d0165bd 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -209,80 +209,79 @@ static const char *const compat_hwcap2_str[] = {
static int c_show(struct seq_file *m, void *v) { - int i, j; + int j; + int cpu = m->index; bool compat = personality(current->personality) == PER_LINUX32; + struct cpuinfo_arm64 *cpuinfo = v; + u32 midr = cpuinfo->reg_midr;
- for_each_online_cpu(i) { - struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i); - u32 midr = cpuinfo->reg_midr; - - /* - * glibc reads /proc/cpuinfo to determine the number of - * online processors, looking for lines beginning with - * "processor". Give glibc what it expects. - */ - seq_printf(m, "processor\t: %d\n", i); - if (compat) - seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n", - MIDR_REVISION(midr), COMPAT_ELF_PLATFORM); + /* + * glibc reads /proc/cpuinfo to determine the number of + * online processors, looking for lines beginning with + * "processor". Give glibc what it expects. + */ + seq_printf(m, "processor\t: %d\n", cpu); + if (compat) + seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n", + MIDR_REVISION(midr), COMPAT_ELF_PLATFORM);
- seq_printf(m, "BogoMIPS\t: %lu.%02lu\n", - loops_per_jiffy / (500000UL/HZ), - loops_per_jiffy / (5000UL/HZ) % 100); + seq_printf(m, "BogoMIPS\t: %lu.%02lu\n", + loops_per_jiffy / (500000UL/HZ), + loops_per_jiffy / (5000UL/HZ) % 100);
- /* - * Dump out the common processor features in a single line. - * Userspace should read the hwcaps with getauxval(AT_HWCAP) - * rather than attempting to parse this, but there's a body of - * software which does already (at least for 32-bit). - */ - seq_puts(m, "Features\t:"); - if (compat) { + /* + * Dump out the common processor features in a single line. + * Userspace should read the hwcaps with getauxval(AT_HWCAP) + * rather than attempting to parse this, but there's a body of + * software which does already (at least for 32-bit). + */ + seq_puts(m, "Features\t:"); + if (compat) { #ifdef CONFIG_COMPAT - for (j = 0; j < ARRAY_SIZE(compat_hwcap_str); j++) { - if (compat_elf_hwcap & (1 << j)) { - /* - * Warn once if any feature should not - * have been present on arm64 platform. - */ - if (WARN_ON_ONCE(!compat_hwcap_str[j])) - continue; - - seq_printf(m, " %s", compat_hwcap_str[j]); - } + for (j = 0; j < ARRAY_SIZE(compat_hwcap_str); j++) { + if (compat_elf_hwcap & (1 << j)) { + /* + * Warn once if any feature should not + * have been present on arm64 platform. + */ + if (WARN_ON_ONCE(!compat_hwcap_str[j])) + continue; + + seq_printf(m, " %s", compat_hwcap_str[j]); } + }
- for (j = 0; j < ARRAY_SIZE(compat_hwcap2_str); j++) - if (compat_elf_hwcap2 & (1 << j)) - seq_printf(m, " %s", compat_hwcap2_str[j]); + for (j = 0; j < ARRAY_SIZE(compat_hwcap2_str); j++) + if (compat_elf_hwcap2 & (1 << j)) + seq_printf(m, " %s", compat_hwcap2_str[j]); #endif /* CONFIG_COMPAT */ - } else { - for (j = 0; j < ARRAY_SIZE(hwcap_str); j++) - if (cpu_have_feature(j)) - seq_printf(m, " %s", hwcap_str[j]); - } - seq_puts(m, "\n"); - - seq_printf(m, "CPU implementer\t: 0x%02x\n", - MIDR_IMPLEMENTOR(midr)); - seq_printf(m, "CPU architecture: 8\n"); - seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr)); - seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr)); - seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr)); + } else { + for (j = 0; j < ARRAY_SIZE(hwcap_str); j++) + if (cpu_have_feature(j)) + seq_printf(m, " %s", hwcap_str[j]); } + seq_puts(m, "\n"); + + seq_printf(m, "CPU implementer\t: 0x%02x\n", + MIDR_IMPLEMENTOR(midr)); + seq_puts(m, "CPU architecture: 8\n"); + seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr)); + seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr)); + seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr));
return 0; }
static void *c_start(struct seq_file *m, loff_t *pos) { - return *pos < 1 ? (void *)1 : NULL; + *pos = cpumask_next(*pos - 1, cpu_online_mask); + return *pos < nr_cpu_ids ? &per_cpu(cpu_data, *pos) : NULL; }
static void *c_next(struct seq_file *m, void *v, loff_t *pos) { ++*pos; - return NULL; + return c_start(m, pos); }
static void c_stop(struct seq_file *m, void *v)