A few tests that require running CPUID do so with a private implementation of a wrapper for CPUID. This duplication of the CPUID wrapper should be avoided but having one is also unnecessary because of the existence of a macro that can be used instead.
This series replaces private CPUID wrappers with calls to the __cpuid_count() macro from cpuid.h as made available by gcc and clang/llvm.
Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Ram Pai linuxram@us.ibm.com Cc: Sandipan Das sandipan@linux.ibm.com Cc: Florian Weimer fweimer@redhat.com Cc: "Desnes A. Nunes do Rosario" desnesn@linux.vnet.ibm.com Cc: Ingo Molnar mingo@kernel.org Cc: Thiago Jung Bauermann bauerman@linux.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Michal Suchanek msuchanek@suse.de Cc: linux-mm@kvack.org Cc: Chang S. Bae chang.seok.bae@intel.com Cc: Borislav Petkov bp@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: x86@kernel.org Cc: Andy Lutomirski luto@kernel.org
Reinette Chatre (3): selftests/vm/pkeys: Use existing __cpuid_count() macro selftests/x86/amx: Use existing __cpuid_count() macro selftests/x86/corrupt_xstate_header: Use existing __cpuid_count() macro
tools/testing/selftests/vm/pkey-x86.h | 22 +++--------------- tools/testing/selftests/x86/amx.c | 23 +++++-------------- .../selftests/x86/corrupt_xstate_header.c | 17 ++------------ 3 files changed, 11 insertions(+), 51 deletions(-)
gcc as well as clang makes the __cpuid_count() macro available via cpuid.h to conveniently call the CPUID instruction.
Below is a copy of the macro as found in cpuid.h:
#define __cpuid_count(level, count, a, b, c, d) \ __asm__ ("cpuid\n\t" \ : "=a" (a), "=b" (b), "=c" (c), "=d" (d) \ : "0" (level), "2" (count))
The pkeys test contains a local function used as wrapper of the CPUID instruction. One difference between the pkeys implementation and the macro from cpuid.h is that the pkeys implementation provides the "volatile" qualifier to the asm() call. The "volatile" qualifier is necessary if CPUID has side effects and thus specifies that any optimizations should be avoided. The pkeys test uses CPUID to query the system for its protection keys status and save area properties, queries without side effect, the "volatile" qualifier is not required and the macro from cpuid.h can be used instead.
Remove the duplicated wrapper to CPUID and use __cpuid_count() from cpuid.h instead.
Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Ram Pai linuxram@us.ibm.com Cc: Sandipan Das sandipan@linux.ibm.com Cc: Florian Weimer fweimer@redhat.com Cc: "Desnes A. Nunes do Rosario" desnesn@linux.vnet.ibm.com Cc: Ingo Molnar mingo@kernel.org Cc: Thiago Jung Bauermann bauerman@linux.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Michal Suchanek msuchanek@suse.de Cc: linux-mm@kvack.org Signed-off-by: Reinette Chatre reinette.chatre@intel.com --- tools/testing/selftests/vm/pkey-x86.h | 22 +++------------------- 1 file changed, 3 insertions(+), 19 deletions(-)
diff --git a/tools/testing/selftests/vm/pkey-x86.h b/tools/testing/selftests/vm/pkey-x86.h index e4a4ce2b826d..17b20af8d8f8 100644 --- a/tools/testing/selftests/vm/pkey-x86.h +++ b/tools/testing/selftests/vm/pkey-x86.h @@ -2,6 +2,7 @@
#ifndef _PKEYS_X86_H #define _PKEYS_X86_H +#include <cpuid.h>
#ifdef __i386__
@@ -80,19 +81,6 @@ static inline void __write_pkey_reg(u64 pkey_reg) assert(pkey_reg == __read_pkey_reg()); }
-static inline void __cpuid(unsigned int *eax, unsigned int *ebx, - unsigned int *ecx, unsigned int *edx) -{ - /* ecx is often an input as well as an output. */ - asm volatile( - "cpuid;" - : "=a" (*eax), - "=b" (*ebx), - "=c" (*ecx), - "=d" (*edx) - : "0" (*eax), "2" (*ecx)); -} - /* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */ #define X86_FEATURE_PKU (1<<3) /* Protection Keys for Userspace */ #define X86_FEATURE_OSPKE (1<<4) /* OS Protection Keys Enable */ @@ -104,9 +92,7 @@ static inline int cpu_has_pkeys(void) unsigned int ecx; unsigned int edx;
- eax = 0x7; - ecx = 0x0; - __cpuid(&eax, &ebx, &ecx, &edx); + __cpuid_count(0x7, 0x0, eax, ebx, ecx, edx);
if (!(ecx & X86_FEATURE_PKU)) { dprintf2("cpu does not have PKU\n"); @@ -142,9 +128,7 @@ int pkey_reg_xstate_offset(void) /* assume that XSTATE_PKEY is set in XCR0 */ leaf = XSTATE_PKEY_BIT; { - eax = XSTATE_CPUID; - ecx = leaf; - __cpuid(&eax, &ebx, &ecx, &edx); + __cpuid_count(XSTATE_CPUID, leaf, eax, ebx, ecx, edx);
if (leaf == XSTATE_PKEY_BIT) { xstate_offset = ebx;
gcc as well as clang makes the __cpuid_count() macro available via cpuid.h to conveniently call the CPUID instruction.
Below is a copy of the macro as found in cpuid.h:
#define __cpuid_count(level, count, a, b, c, d) \ __asm__ ("cpuid\n\t" \ : "=a" (a), "=b" (b), "=c" (c), "=d" (d) \ : "0" (level), "2" (count))
The amx test contains a local function used as wrapper to the CPUID instruction. One difference between the amx implementation and the macro from cpuid.h is that the amx implementation provides the "volatile" qualifier to the asm() call. The "volatile" qualifier is necessary if CPUID has side effects and thus any optimizations should be avoided. The amx test uses CPUID to query the system's support for XSAVE/XRSTOR and the save area properties, queries without side effect, the "volatile" qualifier is thus not required and the macro from cpuid.h can be used instead.
Remove the duplicated wrapper to CPUID and use __cpuid_count() from cpuid.h instead.
Cc: Chang S. Bae chang.seok.bae@intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Borislav Petkov bp@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: x86@kernel.org Signed-off-by: Reinette Chatre reinette.chatre@intel.com --- tools/testing/selftests/x86/amx.c | 23 ++++++----------------- 1 file changed, 6 insertions(+), 17 deletions(-)
diff --git a/tools/testing/selftests/x86/amx.c b/tools/testing/selftests/x86/amx.c index 3615ef4a48bb..6476ab826603 100644 --- a/tools/testing/selftests/x86/amx.c +++ b/tools/testing/selftests/x86/amx.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0
#define _GNU_SOURCE +#include <cpuid.h> #include <err.h> #include <errno.h> #include <pthread.h> @@ -45,13 +46,6 @@ static inline uint64_t xgetbv(uint32_t index) return eax + ((uint64_t)edx << 32); }
-static inline void cpuid(uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx) -{ - asm volatile("cpuid;" - : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx) - : "0" (*eax), "2" (*ecx)); -} - static inline void xsave(struct xsave_buffer *xbuf, uint64_t rfbm) { uint32_t rfbm_lo = rfbm; @@ -115,9 +109,7 @@ static inline void check_cpuid_xsave(void) * support for the XSAVE feature set, including * XGETBV. */ - eax = 1; - ecx = 0; - cpuid(&eax, &ebx, &ecx, &edx); + __cpuid_count(1, 0, eax, ebx, ecx, edx); if (!(ecx & CPUID_LEAF1_ECX_XSAVE_MASK)) fatal_error("cpuid: no CPU xsave support"); if (!(ecx & CPUID_LEAF1_ECX_OSXSAVE_MASK)) @@ -140,9 +132,8 @@ static void check_cpuid_xtiledata(void) { uint32_t eax, ebx, ecx, edx;
- eax = CPUID_LEAF_XSTATE; - ecx = CPUID_SUBLEAF_XSTATE_USER; - cpuid(&eax, &ebx, &ecx, &edx); + __cpuid_count(CPUID_LEAF_XSTATE, CPUID_SUBLEAF_XSTATE_USER, + eax, ebx, ecx, edx);
/* * EBX enumerates the size (in bytes) required by the XSAVE @@ -153,10 +144,8 @@ static void check_cpuid_xtiledata(void) */ xbuf_size = ebx;
- eax = CPUID_LEAF_XSTATE; - ecx = XFEATURE_XTILEDATA; - - cpuid(&eax, &ebx, &ecx, &edx); + __cpuid_count(CPUID_LEAF_XSTATE, XFEATURE_XTILEDATA, + eax, ebx, ecx, edx); /* * eax: XTILEDATA state component size * ebx: XTILEDATA state component offset in user buffer
gcc as well as clang makes the __cpuid_count() macro available via cpuid.h to conveniently call the CPUID instruction.
Below is a copy of the macro as found in cpuid.h:
#define __cpuid_count(level, count, a, b, c, d) \ __asm__ ("cpuid\n\t" \ : "=a" (a), "=b" (b), "=c" (c), "=d" (d) \ : "0" (level), "2" (count))
The corrupt_xstate_header test contains a local function used as wrapper to the CPUID instruction. One difference between the corrupt_xstate_header implementation and the macro from cpuid.h is that the corrupt_xstate_header implementation provides the "volatile" qualifier to the asm() call. The "volatile" qualifier is necessary when CPUID has side effects and thus any optimizations should be avoided. CPUID is used in the corrupt_xstate_header test to query the system for its XSAVE/XRSTOR support, a query without side effect, the "volatile" qualifier is thus not required and the macro from cpuid.h can be used instead.
Remove the duplicated wrapper to CPUID and use __cpuid_count() from cpuid.h instead.
Cc: Andy Lutomirski luto@kernel.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@suse.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: x86@kernel.org Signed-off-by: Reinette Chatre reinette.chatre@intel.com --- .../selftests/x86/corrupt_xstate_header.c | 17 ++--------------- 1 file changed, 2 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/x86/corrupt_xstate_header.c b/tools/testing/selftests/x86/corrupt_xstate_header.c index ab8599c10ce5..7a877612bc98 100644 --- a/tools/testing/selftests/x86/corrupt_xstate_header.c +++ b/tools/testing/selftests/x86/corrupt_xstate_header.c @@ -7,6 +7,7 @@
#define _GNU_SOURCE
+#include <cpuid.h> #include <stdlib.h> #include <stdio.h> #include <string.h> @@ -17,25 +18,11 @@ #include <stdint.h> #include <sys/wait.h>
-static inline void __cpuid(unsigned int *eax, unsigned int *ebx, - unsigned int *ecx, unsigned int *edx) -{ - asm volatile( - "cpuid;" - : "=a" (*eax), - "=b" (*ebx), - "=c" (*ecx), - "=d" (*edx) - : "0" (*eax), "2" (*ecx)); -} - static inline int xsave_enabled(void) { unsigned int eax, ebx, ecx, edx;
- eax = 0x1; - ecx = 0x0; - __cpuid(&eax, &ebx, &ecx, &edx); + __cpuid_count(0x1, 0x0, eax, ebx, ecx, edx);
/* Is CR4.OSXSAVE enabled ? */ return ecx & (1U << 27);
On 2/4/22 12:17 PM, Reinette Chatre wrote:
A few tests that require running CPUID do so with a private implementation of a wrapper for CPUID. This duplication of the CPUID wrapper should be avoided but having one is also unnecessary because of the existence of a macro that can be used instead.
This series replaces private CPUID wrappers with calls to the __cpuid_count() macro from cpuid.h as made available by gcc and clang/llvm.
Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Ram Pai linuxram@us.ibm.com Cc: Sandipan Das sandipan@linux.ibm.com Cc: Florian Weimer fweimer@redhat.com Cc: "Desnes A. Nunes do Rosario" desnesn@linux.vnet.ibm.com Cc: Ingo Molnar mingo@kernel.org Cc: Thiago Jung Bauermann bauerman@linux.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Michal Suchanek msuchanek@suse.de Cc: linux-mm@kvack.org Cc: Chang S. Bae chang.seok.bae@intel.com Cc: Borislav Petkov bp@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: x86@kernel.org Cc: Andy Lutomirski luto@kernel.org
Reinette Chatre (3): selftests/vm/pkeys: Use existing __cpuid_count() macro selftests/x86/amx: Use existing __cpuid_count() macro selftests/x86/corrupt_xstate_header: Use existing __cpuid_count() macro
tools/testing/selftests/vm/pkey-x86.h | 22 +++--------------- tools/testing/selftests/x86/amx.c | 23 +++++-------------- .../selftests/x86/corrupt_xstate_header.c | 17 ++------------ 3 files changed, 11 insertions(+), 51 deletions(-)
I am all for this cleanup. However, I am not finding __cpuid_count() marco on my system with gcc:
gcc --version gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0
My concern is regression on older gcc versions.
thanks, -- Shuah
Hi Shuah,
On 2/4/2022 3:39 PM, Shuah Khan wrote:
On 2/4/22 12:17 PM, Reinette Chatre wrote:
A few tests that require running CPUID do so with a private implementation of a wrapper for CPUID. This duplication of the CPUID wrapper should be avoided but having one is also unnecessary because of the existence of a macro that can be used instead.
This series replaces private CPUID wrappers with calls to the __cpuid_count() macro from cpuid.h as made available by gcc and clang/llvm.
Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Ram Pai linuxram@us.ibm.com Cc: Sandipan Das sandipan@linux.ibm.com Cc: Florian Weimer fweimer@redhat.com Cc: "Desnes A. Nunes do Rosario" desnesn@linux.vnet.ibm.com Cc: Ingo Molnar mingo@kernel.org Cc: Thiago Jung Bauermann bauerman@linux.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Michal Suchanek msuchanek@suse.de Cc: linux-mm@kvack.org Cc: Chang S. Bae chang.seok.bae@intel.com Cc: Borislav Petkov bp@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: x86@kernel.org Cc: Andy Lutomirski luto@kernel.org
Reinette Chatre (3): selftests/vm/pkeys: Use existing __cpuid_count() macro selftests/x86/amx: Use existing __cpuid_count() macro selftests/x86/corrupt_xstate_header: Use existing __cpuid_count() macro
tools/testing/selftests/vm/pkey-x86.h | 22 +++--------------- tools/testing/selftests/x86/amx.c | 23 +++++-------------- .../selftests/x86/corrupt_xstate_header.c | 17 ++------------ 3 files changed, 11 insertions(+), 51 deletions(-)
I am all for this cleanup. However, I am not finding __cpuid_count() marco on my system with gcc:
gcc --version gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0
My concern is regression on older gcc versions.
Please see this message from our earlier thread where you were able to find it on your system: https://lore.kernel.org/linux-kselftest/63293c72-55ca-9446-35eb-74aff4c8ba5d...
As mentioned in that thread, on my system it arrived via user space's libgcc-dev package. This does not seem to be the first time including files from this source - I did a quick check and from what I can tell existing kselftest includes like stdarg.h, stdbool.h, stdatomic.h, unwind.h, x86intrin.h ... arrive via libgcc-dev.
Reinette
On 2/4/22 5:11 PM, Reinette Chatre wrote:
Hi Shuah,
On 2/4/2022 3:39 PM, Shuah Khan wrote:
On 2/4/22 12:17 PM, Reinette Chatre wrote:
A few tests that require running CPUID do so with a private implementation of a wrapper for CPUID. This duplication of the CPUID wrapper should be avoided but having one is also unnecessary because of the existence of a macro that can be used instead.
This series replaces private CPUID wrappers with calls to the __cpuid_count() macro from cpuid.h as made available by gcc and clang/llvm.
Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Ram Pai linuxram@us.ibm.com Cc: Sandipan Das sandipan@linux.ibm.com Cc: Florian Weimer fweimer@redhat.com Cc: "Desnes A. Nunes do Rosario" desnesn@linux.vnet.ibm.com Cc: Ingo Molnar mingo@kernel.org Cc: Thiago Jung Bauermann bauerman@linux.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Michal Suchanek msuchanek@suse.de Cc: linux-mm@kvack.org Cc: Chang S. Bae chang.seok.bae@intel.com Cc: Borislav Petkov bp@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: x86@kernel.org Cc: Andy Lutomirski luto@kernel.org
Reinette Chatre (3): selftests/vm/pkeys: Use existing __cpuid_count() macro selftests/x86/amx: Use existing __cpuid_count() macro selftests/x86/corrupt_xstate_header: Use existing __cpuid_count() macro
tools/testing/selftests/vm/pkey-x86.h | 22 +++--------------- tools/testing/selftests/x86/amx.c | 23 +++++-------------- .../selftests/x86/corrupt_xstate_header.c | 17 ++------------ 3 files changed, 11 insertions(+), 51 deletions(-)
I am all for this cleanup. However, I am not finding __cpuid_count() marco on my system with gcc:
gcc --version gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0
My concern is regression on older gcc versions.
Please see this message from our earlier thread where you were able to find it on your system: https://lore.kernel.org/linux-kselftest/63293c72-55ca-9446-35eb-74aff4c8ba5d...
Right. After I sent off the response, I was thinking we discussed this before. Thanks for the refresh.
As mentioned in that thread, on my system it arrived via user space's libgcc-dev package. This does not seem to be the first time including files from this source - I did a quick check and from what I can tell existing kselftest includes like stdarg.h, stdbool.h, stdatomic.h, unwind.h, x86intrin.h ... arrive via libgcc-dev.
This will work fine on newer versions of gcc/clang. However this could fail when mainline kselftest is used on stable releases on test rings and so on, especially if they have older versions of gcc/clang.
We will have to find a solution for this. Instead of deleting the local define, let's keep it under ifndef __cpuid_count
/usr/lib/gcc/x86_64-linux-gnu/11/include/cpuid.h
#define __cpuid_count(level, count, a, b, c, d) \ __asm__ __volatile__ ("cpuid\n\t" \ : "=a" (a), "=b" (b), "=c" (c), "=d" (d) \ : "0" (level), "2" (count))
thanks, -- Shuah
Hi Shuah,
On 2/7/2022 10:00 AM, Shuah Khan wrote:
This will work fine on newer versions of gcc/clang. However this could fail when mainline kselftest is used on stable releases on test rings and so on, especially if they have older versions of gcc/clang.
Indeed. It thus seems that kselftest has a minimal required version for gcc/clang that is not the current mainline minimal version but the minimal version of the oldest supported stable kernel, which is v4.9.
__cpuid_count() was added to gcc in commit: cb0dee885cb30b4e9beeef070cf000baa7d09abe and thus available since gcc 4.4.
Looking at Documentation/Changes or later Documentation/process/changes.rst kernels v4.9 and v4.14 have the minimal required version of gcc of 3.2. So this change would encounter an issue if mainline kselftest is used to test a v4.9 or v4.14 kernel on a system that only supports its minimal gcc.
Kernel v4.19 moved the gcc minimal required version to 4.6 that does contain this macro.
There does not seem to be a minimum required version of clang/LLVM in v4.19. The first time I see a minimal version for Clang/LLVM for a stable kernel is in kernel v5.10 with Clang/LLVM minimal version 10.0.1 and from what I can tell the __cpuid_count() macro was added to Clang/LLVM in version 3.4.0 (commit 4dcb5dbb53ea4fbeab48bc6bc3c4d392361dabc1).
We will have to find a solution for this. Instead of deleting the local define, let's keep it under ifndef __cpuid_count
/usr/lib/gcc/x86_64-linux-gnu/11/include/cpuid.h
#define __cpuid_count(level, count, a, b, c, d) \ __asm__ __volatile__ ("cpuid\n\t" \ : "=a" (a), "=b" (b), "=c" (c), "=d" (d) \ : "0" (level), "2" (count))
Will do. I see that gcc obtained the volatile qualifier in v11.1 so I can use the most recent macro as you have here.
Reinette
linux-kselftest-mirror@lists.linaro.org