Dear Friend,
I am writing to you to make a proposal regarding Investing in your
country. I am proposing to you a business development Investment in
housing and health sector or any other sector you can recommend. My name
is Wahid Majrooh. Former acting Minister of Public Health of
Afghanistan.
Sincerely
Wahid
This series provides initial support for the ARMv9 Scalable Matrix
Extension (SME). SME takes the approach used for vectors in SVE and
extends this to provide architectural support for matrix operations. A
more detailed overview can be found in [1].
For the kernel SME can be thought of as a series of features which are
intended to be used together by applications but operate mostly
orthogonally:
- The ZA matrix register.
- Streaming mode, in which ZA can be accessed and a subset of SVE
features are available.
- A second vector length, used for streaming mode SVE and ZA and
controlled using a similar interface to that for SVE.
- TPIDR2, a new userspace controllable system register intended for use
by the C library for storing context related to the ZA ABI.
A substantial part of the series is dedicated to refactoring the
existing SVE support so that we don't need to duplicate code for
handling vector lengths and the SVE registers, this involves creating an
array of vector types and making the users take the vector type as a
parameter. I'm not 100% happy with this but wasn't able to come up with
anything better, duplicating code definitely felt like a bad idea so
this felt like the least bad thing. If this approach makes sense to
people it might make sense to split this off into a separate series
and/or merge it while the rest is pending review to try to make things a
little more digestable, the series is very large so it'd probably make
things easier to digest if some of the preparatory refactoring could be
merged before the rest is ready.
One feature of the architecture of particular note is that switching
to and from streaming mode may change the size of and invalidate the
contents of the SVE registers, and when in streaming mode the FFR is not
accessible. This complicates aspects of the ABI like signal handling
and ptrace.
This initial implementation is mainly intended to get the ABI in place,
there are several areas which will be worked on going forwards - some of
these will be blockers, others could be handled in followup serieses:
- KVM is not currently supported and we depend on !KVM, this is
obviously not good - in hopefully the next version I will add support
for coexisting with KVM and then in a subsequent series implement
support for use of SME by KVM guests.
- It is likely some build configurations have issues, I've not fully
checked this yet. In general testing is still ongoing, I anticipate
finding and fixing some issues in the implementation.
- No support is currently provided for scheduler control of SME or SME
applications, given the size of the SME register state the context
switch overhead may be noticable so this may be needed especially for
real time applications. Similar concerns already exist for larger
SVE vector lengths but are amplified for SME, particularly as the
vector length increases.
- There has been no work on optimising the performance of anything the
kernel does.
It is not expected that any systems will be encountered that support SME
but not SVE, SME is an ARMv9 feature and SVE is mandatory for ARMv9.
The code attempts to handle any such systems that are encountered but
this hasn't been tested extensively.
Due to dependencies on changes already upstreamed this series is based
on a merge of for-next/kselftest and for-next/sve in the arm64 tree.
v4:
- Rebase onto merged patches.
- Remove an uneeded NULL check in vec_proc_do_default_vl().
- Include patch to factor out utility routines in kselftests written in
assembler.
- Specify -ffreestanding when building TPIDR2 test.
v3:
- Skip FFR rather than predicate registers in sve_flush_live().
- Don't assume a bool is all zeros in sve_flush_live() as per AAPCS.
- Don't redundantly specify a zero index when clearing FFR.
v2:
- Fix several issues with !SME and !SVE configurations.
- Preserve TPIDR2 when creating a new thread/process unless
CLONE_SETTLS is set.
- Report traps due to using features in an invalid mode as SIGILL.
- Spell out streaming mode behaviour in SVE ABI documentation more
directly.
- Document TPIDR2 in the ABI document.
- Use SMSTART and SMSTOP rather than read/modify/write sequences.
- Rework logic for exiting streaming mode on syscall.
- Don't needlessly initialise SVCR on access trap.
- Always restore SME VL for userspace if SME traps are disabled.
- Only yield to encourage preemption every 128 iterations in za-test,
otherwise do a getpid(), and validate SVCR after syscall.
- Leave streaming mode disabled except when reading the vector length
in za-test, and disable ZA after detecting a mismatch.
- Add SME support to vlset.
- Clarifications and typo fixes in comments.
- Move sme_alloc() forward declaration back a patch.
[1] https://community.arm.com/developer/ip-products/processors/b/processors-ip-…
Mark Brown (33):
arm64/sve: Make sysctl interface for SVE reusable by SME
arm64/sve: Generalise vector length configuration prctl() for SME
kselftest/arm64: Parameterise ptrace vector length information
kselftest/arm64: Allow signal tests to trigger from a function
tools/nolibc: Implement gettid()
arm64/sme: Provide ABI documentation for SME
arm64/sme: System register and exception syndrome definitions
arm64/sme: Define macros for manually encoding SME instructions
arm64/sme: Early CPU setup for SME
arm64/sme: Basic enumeration support
arm64/sme: Identify supported SME vector lengths at boot
arm64/sme: Implement sysctl to set the default vector length
arm64/sme: Implement vector length configuration prctl()s
arm64/sme: Implement support for TPIDR2
arm64/sme: Implement SVCR context switching
arm64/sme: Implement streaming SVE context switching
arm64/sme: Implement ZA context switching
arm64/sme: Implement traps and syscall handling for SME
arm64/sme: Implement streaming SVE signal handling
arm64/sme: Implement ZA signal handling
arm64/sme: Implement ptrace support for streaming mode SVE registers
arm64/sme: Add ptrace support for ZA
arm64/sme: Disable streaming mode and ZA when flushing CPU state
arm64/sme: Save and restore streaming mode over EFI runtime calls
arm64/sme: Provide Kconfig for SME
kselftest/arm64: sme: Add streaming SME support to vlset
kselftest/arm64: Add tests for TPIDR2
kselftest/arm64: Extend vector configuration API tests to cover SME
kselftest/arm64: sme: Provide streaming mode SVE stress test
kselftest/arm64: Add stress test for SME ZA context switching
kselftest/arm64: signal: Add SME signal handling tests
kselftest/arm64: Add streaming SVE to SVE ptrace tests
kselftest/arm64: Add coverage for the ZA ptrace interface
Documentation/arm64/elf_hwcaps.rst | 29 +
Documentation/arm64/index.rst | 1 +
Documentation/arm64/sme.rst | 428 ++++++++++++
Documentation/arm64/sve.rst | 69 +-
arch/arm64/Kconfig | 11 +
arch/arm64/include/asm/cpu.h | 4 +
arch/arm64/include/asm/cpufeature.h | 18 +
arch/arm64/include/asm/el2_setup.h | 36 +
arch/arm64/include/asm/esr.h | 13 +-
arch/arm64/include/asm/exception.h | 1 +
arch/arm64/include/asm/fpsimd.h | 111 ++-
arch/arm64/include/asm/fpsimdmacros.h | 77 +++
arch/arm64/include/asm/hwcap.h | 7 +
arch/arm64/include/asm/kvm_arm.h | 1 +
arch/arm64/include/asm/processor.h | 18 +-
arch/arm64/include/asm/sysreg.h | 53 ++
arch/arm64/include/asm/thread_info.h | 2 +
arch/arm64/include/uapi/asm/hwcap.h | 7 +
arch/arm64/include/uapi/asm/ptrace.h | 69 +-
arch/arm64/include/uapi/asm/sigcontext.h | 55 +-
arch/arm64/kernel/cpufeature.c | 90 +++
arch/arm64/kernel/cpuinfo.c | 12 +
arch/arm64/kernel/entry-common.c | 10 +
arch/arm64/kernel/entry-fpsimd.S | 31 +
arch/arm64/kernel/fpsimd.c | 640 ++++++++++++++++--
arch/arm64/kernel/process.c | 28 +-
arch/arm64/kernel/ptrace.c | 358 ++++++++--
arch/arm64/kernel/signal.c | 187 ++++-
arch/arm64/kernel/syscall.c | 43 +-
arch/arm64/kernel/traps.c | 1 +
arch/arm64/kvm/fpsimd.c | 3 +-
arch/arm64/kvm/reset.c | 8 +-
arch/arm64/tools/cpucaps | 1 +
include/uapi/linux/elf.h | 2 +
include/uapi/linux/prctl.h | 9 +
kernel/sys.c | 12 +
tools/include/nolibc/nolibc.h | 18 +
tools/testing/selftests/arm64/Makefile | 2 +-
tools/testing/selftests/arm64/abi/.gitignore | 1 +
tools/testing/selftests/arm64/abi/Makefile | 13 +
tools/testing/selftests/arm64/abi/tpidr2.c | 298 ++++++++
tools/testing/selftests/arm64/fp/.gitignore | 4 +
tools/testing/selftests/arm64/fp/Makefile | 12 +-
tools/testing/selftests/arm64/fp/rdvl-sme.c | 14 +
tools/testing/selftests/arm64/fp/rdvl.S | 16 +
tools/testing/selftests/arm64/fp/rdvl.h | 1 +
tools/testing/selftests/arm64/fp/ssve-stress | 59 ++
tools/testing/selftests/arm64/fp/sve-ptrace.c | 230 ++++---
tools/testing/selftests/arm64/fp/sve-test.S | 30 +
tools/testing/selftests/arm64/fp/vec-syscfg.c | 10 +
tools/testing/selftests/arm64/fp/vlset.c | 10 +-
tools/testing/selftests/arm64/fp/za-ptrace.c | 353 ++++++++++
tools/testing/selftests/arm64/fp/za-stress | 59 ++
tools/testing/selftests/arm64/fp/za-test.S | 431 ++++++++++++
.../testing/selftests/arm64/signal/.gitignore | 2 +
.../selftests/arm64/signal/test_signals.h | 2 +
.../arm64/signal/test_signals_utils.c | 5 +-
.../testcases/fake_sigreturn_sme_change_vl.c | 92 +++
.../arm64/signal/testcases/sme_trap_za.c | 36 +
.../selftests/arm64/signal/testcases/sme_vl.c | 70 ++
.../arm64/signal/testcases/ssve_regs.c | 129 ++++
61 files changed, 4090 insertions(+), 252 deletions(-)
create mode 100644 Documentation/arm64/sme.rst
create mode 100644 tools/testing/selftests/arm64/abi/.gitignore
create mode 100644 tools/testing/selftests/arm64/abi/Makefile
create mode 100644 tools/testing/selftests/arm64/abi/tpidr2.c
create mode 100644 tools/testing/selftests/arm64/fp/rdvl-sme.c
create mode 100644 tools/testing/selftests/arm64/fp/ssve-stress
create mode 100644 tools/testing/selftests/arm64/fp/za-ptrace.c
create mode 100644 tools/testing/selftests/arm64/fp/za-stress
create mode 100644 tools/testing/selftests/arm64/fp/za-test.S
create mode 100644 tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sme_change_vl.c
create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_trap_za.c
create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_vl.c
create mode 100644 tools/testing/selftests/arm64/signal/testcases/ssve_regs.c
base-commit: ac972c1eafce10fe893df7698f56f25121426f5d
--
2.30.2
Stop tracing while reading the trace file by default, to prevent
the test results while checking it and to avoid taking a long time
to check the result.
If there is any testcase which wants to test the tracing while reading
the trace file, please override this setting inside the test case.
This also recovers the pause-on-trace when clean it up.
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
---
Changes in v2:
- Recover pause-on-trace to 0 when exit.
---
tools/testing/selftests/ftrace/ftracetest | 2 +-
tools/testing/selftests/ftrace/test.d/functions | 12 ++++++++++++
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index 8ec1922e974e..c3311c8c4089 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -428,7 +428,7 @@ for t in $TEST_CASES; do
exit 1
fi
done
-(cd $TRACING_DIR; initialize_ftrace) # for cleanup
+(cd $TRACING_DIR; finish_ftrace) # for cleanup
prlog ""
prlog "# of passed: " `echo $PASSED_CASES | wc -w`
diff --git a/tools/testing/selftests/ftrace/test.d/functions b/tools/testing/selftests/ftrace/test.d/functions
index 000fd05e84b1..5f6cbec847fc 100644
--- a/tools/testing/selftests/ftrace/test.d/functions
+++ b/tools/testing/selftests/ftrace/test.d/functions
@@ -124,10 +124,22 @@ initialize_ftrace() { # Reset ftrace to initial-state
[ -f uprobe_events ] && echo > uprobe_events
[ -f synthetic_events ] && echo > synthetic_events
[ -f snapshot ] && echo 0 > snapshot
+
+# Stop tracing while reading the trace file by default, to prevent
+# the test results while checking it and to avoid taking a long time
+# to check the result.
+ [ -f options/pause-on-trace ] && echo 1 > options/pause-on-trace
+
clear_trace
enable_tracing
}
+finish_ftrace() {
+ initialize_ftrace
+# And recover it to default.
+ [ -f options/pause-on-trace ] && echo 0 > options/pause-on-trace
+}
+
check_requires() { # Check required files and tracers
for i in "$@" ; do
r=${i%:README}
On Tue, Oct 26, 2021 at 01:38:51AM +0000, Luis Machado wrote:
> A few nits below...
Thanks. Hopefully I spotted everything and rolled it in, there's no
flagging of what bits are quoted and you've not deleted any of the extra
context so I might've missed some comment - if so sorry about that.
Intel Advanced Matrix Extensions (AMX) are a new set registers
and ISA. They are conceptually similar to the earlier AVX and
SSE ISA. But, the registers as a whole are *really* big: ~8k
verus 2k for AVX-512.
Those amply-sized registers present some potential problems with
task_struct and signal stack bloat. To fix those issues, most of
the new AMX state is dynamically allocated with the help of a new
CPU feature.
This new selftest exercises the new dynamic allocation ABI and
also ensures that AMX state is properly context-switched.
Processors that support AMX (Sapphire Rapids) are not publicly
available. The kernel support needed to run these tests is not
upstream. This selftest was developed against this tree:
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git/log/?h=x86/f…
These tests were primarily written by Chang Bae. He's busy
working on the real kernel support, so I stole these and cleaned
them up a bit.
Chang S. Bae (2):
selftest/x86/amx: Test cases for the AMX state management
selftest/x86/amx: Add context switch test
tools/testing/selftests/x86/Makefile | 2 +-
tools/testing/selftests/x86/amx.c | 851 +++++++++++++++++++++++++++
2 files changed, 852 insertions(+), 1 deletion(-)
Signed-off-by: Chang S. Bae <chang.seok.bae(a)intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: x86(a)kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
kunit.py currently crashes and fails to parse kernel output if it's not
fully valid utf-8.
This can come from memory corruption or or just inadvertently printing
out binary data as strings.
E.g. adding this line into a kunit test
pr_info("\x80")
will cause this exception
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 1961: invalid start byte
We can tell Python how to handle errors, see
https://docs.python.org/3/library/codecs.html#error-handlers
Unfortunately, it doesn't seem like there's a way to specify this in
just one location, so we need to repeat ourselves quite a bit.
Specify `errors='backslashreplace'` so we instead:
* print out the offending byte as '\x80'
* try and continue parsing the output.
* as long as the TAP lines themselves are valid, we're fine.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
tools/testing/kunit/kunit.py | 3 ++-
tools/testing/kunit/kunit_kernel.py | 4 ++--
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index 9c9ed4071e9e..28ae096d4b53 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -457,9 +457,10 @@ def main(argv, linux=None):
sys.exit(1)
elif cli_args.subcommand == 'parse':
if cli_args.file == None:
+ sys.stdin.reconfigure(errors='backslashreplace')
kunit_output = sys.stdin
else:
- with open(cli_args.file, 'r') as f:
+ with open(cli_args.file, 'r', errors='backslashreplace') as f:
kunit_output = f.read().splitlines()
request = KunitParseRequest(cli_args.raw_output,
None,
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index faa6320e900e..f08c6c36a947 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -135,7 +135,7 @@ class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
- text=True, shell=True)
+ text=True, shell=True, errors='backslashreplace')
class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
"""An abstraction over command line operations performed on a source tree."""
@@ -172,7 +172,7 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
- text=True)
+ text=True, errors='backslashreplace')
def get_kconfig_path(build_dir) -> str:
return get_file_path(build_dir, KCONFIG_PATH)
base-commit: a032094fc1ed17070df01de4a7883da7bb8d5741
--
2.33.0.882.g93a45727a2-goog
Currently, we have these errors:
$ mypy ./tools/testing/kunit/*.py
tools/testing/kunit/kunit_kernel.py:213: error: Item "_Loader" of "Optional[_Loader]" has no attribute "exec_module"
tools/testing/kunit/kunit_kernel.py:213: error: Item "None" of "Optional[_Loader]" has no attribute "exec_module"
tools/testing/kunit/kunit_kernel.py:214: error: Module has no attribute "QEMU_ARCH"
tools/testing/kunit/kunit_kernel.py:215: error: Module has no attribute "QEMU_ARCH"
exec_module
===========
pytype currently reports no errors, but that's because there's a comment
disabling it on 213.
This is due to https://github.com/python/typeshed/pull/2626.
The fix is to assert the loaded module implements the ABC
(abstract base class) we want which has exec_module support.
QEMU_ARCH
=========
pytype is fine with this, but mypy is not:
https://github.com/python/mypy/issues/5059
Add a check that the loaded module does indeed have QEMU_ARCH.
Note: this is not enough to appease mypy, so we also add a comment to
squash the warning.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
tools/testing/kunit/kunit_kernel.py | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index faa6320e900e..c68b17905481 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -207,12 +207,15 @@ def get_source_tree_ops_from_qemu_config(config_path: str,
module_path = '.' + os.path.join(os.path.basename(QEMU_CONFIGS_DIR), os.path.basename(config_path))
spec = importlib.util.spec_from_file_location(module_path, config_path)
config = importlib.util.module_from_spec(spec)
- # TODO(brendanhiggins(a)google.com): I looked this up and apparently other
- # Python projects have noted that pytype complains that "No attribute
- # 'exec_module' on _importlib_modulespec._Loader". Disabling for now.
- spec.loader.exec_module(config) # pytype: disable=attribute-error
- return config.QEMU_ARCH.linux_arch, LinuxSourceTreeOperationsQemu(
- config.QEMU_ARCH, cross_compile=cross_compile)
+ # See https://github.com/python/typeshed/pull/2626 for context.
+ assert isinstance(spec.loader, importlib.abc.Loader)
+ spec.loader.exec_module(config)
+
+ if not hasattr(config, 'QEMU_ARCH'):
+ raise ValueError('qemu_config module missing "QEMU_ARCH": ' + config_path)
+ params: qemu_config.QemuArchParams = config.QEMU_ARCH # type: ignore
+ return params.linux_arch, LinuxSourceTreeOperationsQemu(
+ params, cross_compile=cross_compile)
class LinuxSourceTree(object):
"""Represents a Linux kernel source tree with KUnit tests."""
base-commit: 17ac23eb43f0cbefc8bfce44ad51a9f065895f9f
--
2.33.0.1079.g6e70778dc9-goog