From: Zhongqiu Han <quic_zhonhan@quicinc.com>
[ Upstream commit 208baa3ec9043a664d9acfb8174b332e6b17fb69 ]
If malloc() returns NULL due to low memory, the 'config' pointer can be NULL. Add a check to prevent a NULL pointer dereference.
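For illustration, a minimal userspace sketch of the same defensive pattern (the struct and function names here are hypothetical, not the cpupower code itself):

	#include <stdio.h>
	#include <stdlib.h>

	struct example_config {
		int verbose;
	};

	static struct example_config *example_alloc_config(void)
	{
		struct example_config *config = malloc(sizeof(*config));

		/* Bail out early instead of dereferencing a NULL pointer later. */
		if (!config) {
			perror("malloc");
			return NULL;
		}

		config->verbose = 0;
		return config;
	}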
Link: https://lore.kernel.org/r/20250219122715.3892223-1-quic_zhonhan@quicinc.com
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/power/cpupower/bench/parse.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/tools/power/cpupower/bench/parse.c b/tools/power/cpupower/bench/parse.c
index e63dc11fa3a53..48e25be6e1635 100644
--- a/tools/power/cpupower/bench/parse.c
+++ b/tools/power/cpupower/bench/parse.c
@@ -120,6 +120,10 @@ FILE *prepare_output(const char *dirname)
 struct config *prepare_default_config()
 {
 	struct config *config = malloc(sizeof(struct config));
+	if (!config) {
+		perror("malloc");
+		return NULL;
+	}
 
 	dprintf("loading defaults\n");
From: "Matthew Wilcox (Oracle)" willy@infradead.org
[ Upstream commit c1fcf41cf37f7a3fd3bbf6f0c04aba3ea4258888 ]
The bit pattern of _PAGE_DIRTY set and _PAGE_RW clear is used to mark shadow stacks. This is currently checked for in mk_pte() but not in pfn_pte(). If we add the check to pfn_pte(), it catches vfree() calling set_direct_map_invalid_noflush(), which calls __change_page_attr(), which in turn loads the old protection bits from the PTE, clears the specified bits and uses pfn_pte() to construct the new PTE.
We should, therefore, for kernel mappings, clear the _PAGE_DIRTY bit consistently whenever we clear _PAGE_RW. I opted to do it in the callers in case we want to use __change_page_attr() to create shadow stacks inside the kernel at some point in the future. Arguably, we might also want to clear _PAGE_ACCESSED here.
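As a rough sketch of the encoding described above (illustrative only; it assumes the _PAGE_* definitions from <asm/pgtable_types.h> and is not the actual mk_pte()/pfn_pte() check):

	/*
	 * Dirty=1, RW=0 marks a shadow-stack PTE; with Dirty cleared the
	 * entry is just an ordinary read-only mapping.
	 */
	static inline bool example_pte_is_shstk(pteval_t val)
	{
		return (val & (_PAGE_DIRTY | _PAGE_RW)) == _PAGE_DIRTY;
	}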
Note that the three functions involved:

  __set_pages_np()
  kernel_map_pages_in_pgd()
  kernel_unmap_pages_in_pgd()

only ever manipulate non-swappable kernel mappings, so the DIRTY:1|RW:0 special pattern for shadow stacks and the DIRTY:0 pattern for non-shadow-stack entries can be maintained consistently, without unintentionally clearing a live dirty bit that could corrupt (destroy) dirty bit information for user mappings.
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/174051422675.10177.13226545170101706336.tip-bot2@t...
Closes: https://lore.kernel.org/oe-lkp/202502241646.719f4651-lkp@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/mm/pat/set_memory.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 2d850f6bae701..525794f1eefb3 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2374,7 +2374,7 @@ static int __set_pages_np(struct page *page, int numpages)
 		.pgd = NULL,
 		.numpages = numpages,
 		.mask_set = __pgprot(0),
-		.mask_clr = __pgprot(_PAGE_PRESENT | _PAGE_RW),
+		.mask_clr = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY),
 		.flags = CPA_NO_CHECK_ALIAS };
 
 	/*
@@ -2453,7 +2453,7 @@ int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 		.pgd = pgd,
 		.numpages = numpages,
 		.mask_set = __pgprot(0),
-		.mask_clr = __pgprot(~page_flags & (_PAGE_NX|_PAGE_RW)),
+		.mask_clr = __pgprot(~page_flags & (_PAGE_NX|_PAGE_RW|_PAGE_DIRTY)),
 		.flags = CPA_NO_CHECK_ALIAS,
 	};
@@ -2496,7 +2496,7 @@ int __init kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address,
 		.pgd = pgd,
 		.numpages = numpages,
 		.mask_set = __pgprot(0),
-		.mask_clr = __pgprot(_PAGE_PRESENT | _PAGE_RW),
+		.mask_clr = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY),
 		.flags = CPA_NO_CHECK_ALIAS,
 	};
From: "Xin Li (Intel)" xin@zytor.com
[ Upstream commit ad546940b5991d3e141238cd80a6d1894b767184 ]
The first GDT descriptor is reserved as 'NULL descriptor'. As bits 0 and 1 of a segment selector, i.e., the RPL bits, are NOT used to index GDT, selector values 0~3 all point to the NULL descriptor, thus values 0, 1, 2 and 3 are all valid NULL selector values.
When a NULL selector value is to be loaded into a segment register, reload_segments() sets its RPL bits. Later, IRET zeros the ES, FS, GS, and DS segment registers if any of them is found to hold a nonzero NULL selector value. The two operations cancel each other out and effectively amount to a nop.
Besides, zeroing the RPL bits of NULL selector values is an information leak on pre-FRED systems: userspace can spot any interrupt/exception by loading a nonzero NULL selector and waiting for it to become zero. But there is nothing software can do to prevent this before FRED.
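A hypothetical userspace sketch of the probe described above (x86-64 GCC inline asm, illustrative only):

	unsigned short sel = 2, cur;	/* 2 is a nonzero NULL selector value */

	/* Park a nonzero NULL selector in ES, then poll it. */
	asm volatile("movw %0, %%es" : : "r" (sel));
	do {
		asm volatile("movw %%es, %0" : "=r" (cur));
	} while (cur == sel);

	/* cur is now 0: an interrupt/exception was delivered and IRET
	 * zeroed ES on the return to userspace. */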
ERETU, the only legit instruction to return to userspace from the kernel under FRED, by design does NOT zero any segment register, which avoids this problematic behavior.
As such, leave NULL selector values 0~3 unchanged and close the leak.
Do the same on 32-bit kernel as well.
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241126184529.1607334-1-xin@zytor.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kernel/signal_32.c | 62 +++++++++++++++++++++++++------------
 1 file changed, 43 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index c12624bc82a31..983f8f5893a48 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -33,25 +33,55 @@
 #include <asm/smap.h>
 #include <asm/gsseg.h>
 
+/*
+ * The first GDT descriptor is reserved as 'NULL descriptor'. As bits 0
+ * and 1 of a segment selector, i.e., the RPL bits, are NOT used to index
+ * GDT, selector values 0~3 all point to the NULL descriptor, thus values
+ * 0, 1, 2 and 3 are all valid NULL selector values.
+ *
+ * However IRET zeros ES, FS, GS, and DS segment registers if any of them
+ * is found to have any nonzero NULL selector value, which can be used by
+ * userspace in pre-FRED systems to spot any interrupt/exception by loading
+ * a nonzero NULL selector and waiting for it to become zero. Before FRED
+ * there was nothing software could do to prevent such an information leak.
+ *
+ * ERETU, the only legit instruction to return to userspace from kernel
+ * under FRED, by design does NOT zero any segment register to avoid this
+ * problem behavior.
+ *
+ * As such, leave NULL selector values 0~3 unchanged.
+ */
+static inline u16 fixup_rpl(u16 sel)
+{
+	return sel <= 3 ? sel : sel | 3;
+}
+
 #ifdef CONFIG_IA32_EMULATION
 #include <asm/ia32_unistd.h>
 
 static inline void reload_segments(struct sigcontext_32 *sc)
 {
-	unsigned int cur;
+	u16 cur;
 
+	/*
+	 * Reload fs and gs if they have changed in the signal
+	 * handler. This does not handle long fs/gs base changes in
+	 * the handler, but does not clobber them at least in the
+	 * normal case.
+	 */
 	savesegment(gs, cur);
-	if ((sc->gs | 0x03) != cur)
-		load_gs_index(sc->gs | 0x03);
+	if (fixup_rpl(sc->gs) != cur)
+		load_gs_index(fixup_rpl(sc->gs));
 	savesegment(fs, cur);
-	if ((sc->fs | 0x03) != cur)
-		loadsegment(fs, sc->fs | 0x03);
+	if (fixup_rpl(sc->fs) != cur)
+		loadsegment(fs, fixup_rpl(sc->fs));
+
 	savesegment(ds, cur);
-	if ((sc->ds | 0x03) != cur)
-		loadsegment(ds, sc->ds | 0x03);
+	if (fixup_rpl(sc->ds) != cur)
+		loadsegment(ds, fixup_rpl(sc->ds));
 	savesegment(es, cur);
-	if ((sc->es | 0x03) != cur)
-		loadsegment(es, sc->es | 0x03);
+	if (fixup_rpl(sc->es) != cur)
+		loadsegment(es, fixup_rpl(sc->es));
 }
 
 #define sigset32_t compat_sigset_t
@@ -105,18 +135,12 @@ static bool ia32_restore_sigcontext(struct pt_regs *regs,
 	regs->orig_ax = -1;
 
 #ifdef CONFIG_IA32_EMULATION
-	/*
-	 * Reload fs and gs if they have changed in the signal
-	 * handler. This does not handle long fs/gs base changes in
-	 * the handler, but does not clobber them at least in the
-	 * normal case.
-	 */
 	reload_segments(&sc);
 #else
-	loadsegment(gs, sc.gs);
-	regs->fs = sc.fs;
-	regs->es = sc.es;
-	regs->ds = sc.ds;
+	loadsegment(gs, fixup_rpl(sc.gs));
+	regs->fs = fixup_rpl(sc.fs);
+	regs->es = fixup_rpl(sc.es);
+	regs->ds = fixup_rpl(sc.ds);
 #endif
 
 	return fpu__restore_sig(compat_ptr(sc.fpstate), 1);
From: Max Grobecker <max@grobecker.info>
[ Upstream commit a4248ee16f411ac1ea7dfab228a6659b111e3d65 ]
When running in a virtual machine, we might see the original hardware CPU vendor string (i.e. "AuthenticAMD"), but a model and family ID set by the hypervisor. In case we run on AMD hardware and the hypervisor sets a model ID < 0x14, the LAHF CPU feature is eliminated from the list of CPU capabilities presented, to circumvent a bug with some BIOSes in conjunction with AMD K8 processors.
Parsing the flags list from /proc/cpuinfo seems to happen mostly in bash scripts and prebuilt Docker containers, as it does not require any additional tools to be present, even though more reliable ways, like using "kcpuid", which calls the CPUID instruction instead of parsing a list, should be preferred. Scripts that use /proc/cpuinfo to determine whether the current CPU is "compliant" with defined microarchitecture levels like x86-64-v2 will falsely claim the CPU is incapable of modern CPU instructions when "lahf_lm" is missing from that flags list.
This can prevent some Docker containers from starting, or cause build scripts to create unoptimized binaries.
Admittedly, this is more of a small inconvenience than a severe bug in the kernel, and the shoddy scripts that rely on parsing /proc/cpuinfo should be fixed instead.
This patch adds an additional check to see if we're running inside a virtual machine (X86_FEATURE_HYPERVISOR is present), which, to my understanding, can't be present on a real K8 processor as it was introduced only with the later/other Athlon64 models.
Example output with the "lahf_lm" flag missing in the flags list (should be shown between "hypervisor" and "abm"):
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 6
model name      : Common KVM processor
stepping        : 1
microcode       : 0x1000065
cpu MHz         : 2599.998
cache size      : 512 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c hypervisor abm 3dnowprefetch vmmcall bmi1 avx2 bmi2 xsaveopt
... while kcpuid shows the feature to be present in the CPU:
# kcpuid -d | grep lahf
	lahf_lm             - LAHF/SAHF available in 64-bit mode
[ mingo: Updated the comment a bit, incorporated Boris's review feedback. ]
Signed-off-by: Max Grobecker <max@grobecker.info>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kernel/cpu/amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 9413fb767c6a7..be378b44ca35f 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -825,7 +825,7 @@ static void init_amd_k8(struct cpuinfo_x86 *c)
 	 * (model = 0x14) and later actually support it.
 	 * (AMD Erratum #110, docId: 25759).
 	 */
-	if (c->x86_model < 0x14 && cpu_has(c, X86_FEATURE_LAHF_LM)) {
+	if (c->x86_model < 0x14 && cpu_has(c, X86_FEATURE_LAHF_LM) && !cpu_has(c, X86_FEATURE_HYPERVISOR)) {
 		clear_cpu_cap(c, X86_FEATURE_LAHF_LM);
 		if (!rdmsrl_amd_safe(0xc001100d, &value)) {
 			value &= ~BIT_64(32);
From: Mark Rutland <mark.rutland@arm.com>
[ Upstream commit dcca27bc1eccb9abc2552aab950b18a9742fb8e7 ]
Currently armpmu_add() tries to handle a newly-allocated counter having a stale associated event, but this should not be possible, and if this were to happen the current mitigation is insufficient and potentially expensive. It would be better to warn if we encounter the impossible case.
Calls to pmu::add() and pmu::del() are serialized by the core perf code, and armpmu_del() clears the relevant slot in pmu_hw_events::events[] before clearing the bit in pmu_hw_events::used_mask such that the counter can be reallocated. Thus when armpmu_add() allocates a counter index from pmu_hw_events::used_mask, it should not be possible to observe a stale event in pmu_hw_events::events[] unless either pmu_hw_events::used_mask or pmu_hw_events::events[] have been corrupted.
If this were to happen, we'd end up with two events with the same event->hw.idx, which would clash with each other during reprogramming, deletion, etc, and produce bogus results. Add a WARN_ON_ONCE() for this case so that we can detect if this ever occurs in practice.
That possibility aside, there's no need to call arm_pmu::disable(event) for the new event. The PMU reset code initialises the counter in a disabled state, and armpmu_del() will disable the counter before it can be reused. Remove the redundant disable.
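For context, the del-side ordering referred to above looks roughly like this (a simplified sketch of the armpmu_del() path, not the verbatim driver code):

	static void example_del_side(struct pmu_hw_events *hw_events, int idx)
	{
		/* Clear the events[] slot first ... */
		hw_events->events[idx] = NULL;

		/*
		 * ... and only then release the counter index, so by the time
		 * armpmu_add() can allocate idx again the slot is already NULL.
		 * pmu::add()/pmu::del() themselves are serialized by core perf.
		 */
		clear_bit(idx, hw_events->used_mask);
	}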
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-2-4e9922fc2e8e@kernel.or...
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/perf/arm_pmu.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index d712a19e47ac1..b3bb4d62f13dc 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -342,12 +342,10 @@ armpmu_add(struct perf_event *event, int flags)
 	if (idx < 0)
 		return idx;
 
-	/*
-	 * If there is an event in the counter we are going to use then make
-	 * sure it is disabled.
-	 */
+	/* The newly-allocated counter should be empty */
+	WARN_ON_ONCE(hw_events->events[idx]);
+
 	event->hw.idx = idx;
-	armpmu->disable(event);
 	hw_events->events[idx] = event;
 
 	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
From: Douglas Anderson <dianders@chromium.org>
[ Upstream commit 401c3333bb2396aa52e4121887a6f6a6e2f040bc ]
Add a definition for the Qualcomm Kryo 300-series Gold cores.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Trilok Soni <quic_tsoni@quicinc.com>
Link: https://lore.kernel.org/r/20241219131107.v3.1.I18e0288742871393228249a768e5d...
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/arm64/include/asm/cputype.h | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 488f8e7513495..c8058f91a5bd3 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -119,6 +119,7 @@
 #define QCOM_CPU_PART_KRYO 0x200
 #define QCOM_CPU_PART_KRYO_2XX_GOLD 0x800
 #define QCOM_CPU_PART_KRYO_2XX_SILVER 0x801
+#define QCOM_CPU_PART_KRYO_3XX_GOLD 0x802
 #define QCOM_CPU_PART_KRYO_3XX_SILVER 0x803
 #define QCOM_CPU_PART_KRYO_4XX_GOLD 0x804
 #define QCOM_CPU_PART_KRYO_4XX_SILVER 0x805
@@ -195,6 +196,7 @@
 #define MIDR_QCOM_KRYO MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO)
 #define MIDR_QCOM_KRYO_2XX_GOLD MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_2XX_GOLD)
 #define MIDR_QCOM_KRYO_2XX_SILVER MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_2XX_SILVER)
+#define MIDR_QCOM_KRYO_3XX_GOLD MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_3XX_GOLD)
 #define MIDR_QCOM_KRYO_3XX_SILVER MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_3XX_SILVER)
 #define MIDR_QCOM_KRYO_4XX_GOLD MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_4XX_GOLD)
 #define MIDR_QCOM_KRYO_4XX_SILVER MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_4XX_SILVER)
From: Kees Cook <kees@kernel.org>
[ Upstream commit 1c3dfc7c6b0f551fdca3f7c1f1e4c73be8adb17d ]
When a character array without a terminating NUL character has a static initializer, GCC 15's -Wunterminated-string-initialization will only warn if the array lacks the "nonstring" attribute[1]. Mark the arrays with __nonstring to correctly identify them as "not a C string" and thereby eliminate the warning.
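For reference, a minimal sketch of the annotation (the fallback #define is an assumption for toolchains without the attribute; in the kernel __nonstring comes from <linux/compiler_attributes.h>):

	#ifndef __nonstring
	# define __nonstring __attribute__((__nonstring__))
	#endif

	/*
	 * Exactly 12 bytes with no terminating NUL; the attribute tells GCC
	 * this is "not a C string", so -Wunterminated-string-initialization
	 * stays quiet.
	 */
	static const char example_signature[12] __nonstring = "MACHINECHECK";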
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117178 [1]
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Kees Cook <kees@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250310222234.work.473-kees@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/xen/interface/xen-mca.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/xen/interface/xen-mca.h b/include/xen/interface/xen-mca.h
index 464aa6b3a5f92..1c9afbe8cc260 100644
--- a/include/xen/interface/xen-mca.h
+++ b/include/xen/interface/xen-mca.h
@@ -372,7 +372,7 @@ struct xen_mce {
 #define XEN_MCE_LOG_LEN 32
 
 struct xen_mce_log {
-	char signature[12]; /* "MACHINECHECK" */
+	char signature[12] __nonstring; /* "MACHINECHECK" */
 	unsigned len; /* = XEN_MCE_LOG_LEN */
 	unsigned next;
 	unsigned flags;
From: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com
[ Upstream commit f666c92090a41ac5524dade63ff96b3adcf8c2ab ]
The current calculation of the 'next' virtual address in the page table initialization functions in arch/x86/mm/ident_map.c doesn't protect against wrapping to zero.
This is a theoretical issue that cannot happen currently, the problematic case is possible only if the user sets a high enough x86_mapping_info::offset value - which no current code in the upstream kernel does.
( The wrapping to zero only occurs if the top PGD entry is accessed. There are no such users upstream. Only hibernate_64.c uses x86_mapping_info::offset, and it operates on the direct mapping range, which is not the top PGD entry. )
Should such an overflow happen, it can result in page table corruption and a hang.
To future-proof this code, replace the manual 'next' calculation with p?d_addr_end() which handles wrapping correctly.
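For illustration, the wrap-safe idea behind the p?d_addr_end() helpers looks roughly like this (a simplified userspace sketch with a made-up PUD size, not the kernel macro itself):

	#define EXAMPLE_PUD_SIZE	(1UL << 30)
	#define EXAMPLE_PUD_MASK	(~(EXAMPLE_PUD_SIZE - 1))

	static unsigned long example_pud_addr_end(unsigned long addr,
						  unsigned long end)
	{
		unsigned long boundary = (addr + EXAMPLE_PUD_SIZE) & EXAMPLE_PUD_MASK;

		/*
		 * Comparing 'boundary - 1' with 'end - 1' stays correct even if
		 * 'boundary' wraps around to zero, which the open-coded
		 * 'if (next > end) next = end;' pattern did not handle.
		 */
		return (boundary - 1 < end - 1) ? boundary : end;
	}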
[ Backporter's note: there's no need to backport this patch. ]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20241016111458.846228-2-kirill.shutemov@linux.inte...
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/mm/ident_map.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index fe0b2e66ded93..7adf7001473e7 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -28,9 +28,7 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
 		pmd_t *pmd;
 		bool use_gbpage;
 
-		next = (addr & PUD_MASK) + PUD_SIZE;
-		if (next > end)
-			next = end;
+		next = pud_addr_end(addr, end);
 
 		/* if this is already a gbpage, this portion is already mapped */
 		if (pud_leaf(*pud))
@@ -81,10 +79,7 @@ static int ident_p4d_init(struct x86_mapping_info *info, p4d_t *p4d_page,
 		p4d_t *p4d = p4d_page + p4d_index(addr);
 		pud_t *pud;
 
-		next = (addr & P4D_MASK) + P4D_SIZE;
-		if (next > end)
-			next = end;
-
+		next = p4d_addr_end(addr, end);
 		if (p4d_present(*p4d)) {
 			pud = pud_offset(p4d, 0);
 			result = ident_pud_init(info, pud, addr, next);
@@ -126,10 +121,7 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
 		pgd_t *pgd = pgd_page + pgd_index(addr);
 		p4d_t *p4d;
 
-		next = (addr & PGDIR_MASK) + PGDIR_SIZE;
-		if (next > end)
-			next = end;
-
+		next = pgd_addr_end(addr, end);
 		if (pgd_present(*pgd)) {
 			p4d = p4d_offset(pgd, 0);
 			result = ident_p4d_init(info, p4d, addr, next);