- Linux-kselftest-mirror - lists.linaro.org

Re: [PATCH 2/4] KVM: x86: Introduce paravirt feature CR0/CR4 pinning

by Andersen, John

On Thu, Jul 09, 2020 at 09:27:43AM -0700, Andy Lutomirski wrote: > On Thu, Jul 9, 2020 at 9:22 AM Dave Hansen <dave.hansen(a)intel.com> wrote: > > > > On 7/9/20 9:07 AM, Andy Lutomirski wrote: > > > On Thu, Jul 9, 2020 at 8:56 AM Dave Hansen <dave.hansen(a)intel.com> wrote: > > >> On 7/9/20 8:44 AM, Andersen, John wrote: > > >>> Bits which are allowed to be pinned default to WP for CR0 and SMEP, > > >>> SMAP, and UMIP for CR4. > > >> I think it also makes sense to have FSGSBASE in this set. > > >> > > >> I know it hasn't been tested, but I think we should do the legwork to > > >> test it. If not in this set, can we agree that it's a logical next step? > > > I have no objection to pinning FSGSBASE, but is there a clear > > > description of the threat model that this whole series is meant to > > > address? The idea is to provide a degree of protection against an > > > attacker who is able to convince a guest kernel to write something > > > inappropriate to CR4, right? How realistic is this? > > > > If a quick search can find this: > > > > > https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-… > > > > I'd pretty confident that the guys doing actual bad things have it in > > their toolbox too. > > > > True, but we have the existing software CR4 pinning. I suppose the > virtualization version is stronger. > Yes, as Kees said this will be stronger because it stops ROP and other gadget based techniques which avoid the use of native_write_cr0/4(). With regards to what should be done in this patchset and what in other patchsets. I have a fix for kexec thanks to Arvind's note about TRAMPOLINE_32BIT_CODE_SIZE. The physical host boots fine now and the virtual one can kexec fine. What remains to be done on that front is to add some identifying information to the kernel image to declare that it supports paravirtualized control register pinning or not. Liran suggested adding a section to the built image acting as a flag to signify support for being kexec'd by a kernel with pinning enabled. If anyone has any opinions on how they'd like to see this implemented please let me know. Otherwise I'll just take a stab at it and you'll all see it hopefully in the next version. With regards to FSGSBASE, are we open to validating and adding that to the DEFAULT set as a part of a separate patchset? This patchset is focused on replicating the functionality we already have natively. (If anyone got this email twice, sorry I messed up the From: field the first time around)

4 years, 11 months

2
2
0 0

Re: [PATCH 0/3] readfile(2): a new syscall to make open/read/close faster

by Jan Ziak

Hello At first, I thought that the proposed system call is capable of reading *multiple* small files using a single system call - which would help increase HDD/SSD queue utilization and increase IOPS (I/O operations per second) - but that isn't the case and the proposed system call can read just a single file. Without the ability to read multiple small files using a single system call, it is impossible to increase IOPS (unless an application is using multiple reader threads or somehow instructs the kernel to prefetch multiple files into memory). While you are at it, why not also add a readfiles system call to read multiple, presumably small, files? The initial unoptimized implementation of readfiles syscall can simply call readfile sequentially. Sincerely Jan (atomsymbol)

4 years, 11 months

8
26
0 0

[PATCH v2] selftests/livepatch: adopt to newer sysctl error format

by Petr Mladek

With procfs v3.3.16, the sysctl command doesn't print the set key and value on error. This change breaks livepatch selftest test-ftrace.sh, that tests the interaction of sysctl ftrace_enabled: Make it work with all sysctl versions using '-q' option. Explicitly print the final status on success so that it can be verified in the log. The error message is enough on failure. Reported-by: Kamalesh Babulal <kamalesh(a)linux.vnet.ibm.com> Signed-off-by: Petr Mladek <pmladek(a)suse.com> --- The patch has been created against livepatch.git, branch for-5.9/selftests-cleanup. But it applies also against the current Linus' tree. tools/testing/selftests/livepatch/functions.sh | 3 ++- tools/testing/selftests/livepatch/test-ftrace.sh | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/livepatch/functions.sh b/tools/testing/selftests/livepatch/functions.sh index 408529d94ddb..1aba83c87ad3 100644 --- a/tools/testing/selftests/livepatch/functions.sh +++ b/tools/testing/selftests/livepatch/functions.sh @@ -75,7 +75,8 @@ function set_dynamic_debug() { } function set_ftrace_enabled() { - result=$(sysctl kernel.ftrace_enabled="$1" 2>&1 | paste --serial --delimiters=' ') + result=$(sysctl -q kernel.ftrace_enabled="$1" 2>&1 && \ + sysctl kernel.ftrace_enabled 2>&1) echo "livepatch: $result" > /dev/kmsg } diff --git a/tools/testing/selftests/livepatch/test-ftrace.sh b/tools/testing/selftests/livepatch/test-ftrace.sh index 9160c9ec3b6f..552e165512f4 100755 --- a/tools/testing/selftests/livepatch/test-ftrace.sh +++ b/tools/testing/selftests/livepatch/test-ftrace.sh @@ -51,7 +51,7 @@ livepatch: '$MOD_LIVEPATCH': initializing patching transition livepatch: '$MOD_LIVEPATCH': starting patching transition livepatch: '$MOD_LIVEPATCH': completing patching transition livepatch: '$MOD_LIVEPATCH': patching complete -livepatch: sysctl: setting key \"kernel.ftrace_enabled\": Device or resource busy kernel.ftrace_enabled = 0 +livepatch: sysctl: setting key \"kernel.ftrace_enabled\": Device or resource busy % echo 0 > /sys/kernel/livepatch/$MOD_LIVEPATCH/enabled livepatch: '$MOD_LIVEPATCH': initializing unpatching transition livepatch: '$MOD_LIVEPATCH': starting unpatching transition -- 2.26.2

4 years, 11 months

4
4
0 0

[PATCH] selftests: fib_nexthop_multiprefix: fix cleanup() netns deletion

by Paolo Pisati

During setup(): ... for ns in h0 r1 h1 h2 h3 do create_ns ${ns} done ... while in cleanup(): ... for n in h1 r1 h2 h3 h4 do ip netns del ${n} 2>/dev/null done ... and after removing the stderr redirection in cleanup(): $ sudo ./fib_nexthop_multiprefix.sh ... TEST: IPv4: host 0 to host 3, mtu 1400 [ OK ] TEST: IPv6: host 0 to host 3, mtu 1400 [ OK ] Cannot remove namespace file "/run/netns/h4": No such file or directory $ echo $? 1 and a non-zero return code, make kselftests fail (even if the test itself is fine): ... not ok 34 selftests: net: fib_nexthop_multiprefix.sh # exit=1 ... Signed-off-by: Paolo Pisati <paolo.pisati(a)canonical.com> --- tools/testing/selftests/net/fib_nexthop_multiprefix.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/fib_nexthop_multiprefix.sh b/tools/testing/selftests/net/fib_nexthop_multiprefix.sh index 9dc35a16e415..51df5e305855 100755 --- a/tools/testing/selftests/net/fib_nexthop_multiprefix.sh +++ b/tools/testing/selftests/net/fib_nexthop_multiprefix.sh @@ -144,7 +144,7 @@ setup() cleanup() { - for n in h1 r1 h2 h3 h4 + for n in h0 r1 h1 h2 h3 do ip netns del ${n} 2>/dev/null done -- 2.25.1

4 years, 11 months

3
2
0 0

[PATCH v1 0/2] kunit: tool: fix kunit_tool unit tests

by Brendan Higgins

Apparently we haven't run the unit tests for kunit_tool in a while and consequently some things have broken. This patchset fixes those issues. Brendan Higgins (2): kunit: tool: fix broken default args in unit tests kunit: tool: fix improper treatment of file location tools/testing/kunit/kunit.py | 24 ------------------------ tools/testing/kunit/kunit_tool_test.py | 14 +++++++------- 2 files changed, 7 insertions(+), 31 deletions(-) base-commit: a581387e415bbb0085e7e67906c8f4a99746590e -- 2.27.0.389.gc38d7665816-goog

4 years, 11 months

1
2
0 0

[RFC PATCH 00/15] PKS: Add Protection Keys Supervisor (PKS) support

by ira.weiny＠intel.com

From: Ira Weiny <ira.weiny(a)intel.com> This RFC series has been reviewed by Dave Hansen. This patch set introduces a new page protection mechanism for supervisor pages, Protection Key Supervisor (PKS) and an initial user of them, persistent memory, PMEM. PKS enables protections on 'domains' of supervisor pages to limit supervisor mode access to those pages beyond the normal paging protections. They work in a similar fashion to user space pkeys. Like User page pkeys (PKU), supervisor pkeys are checked in addition to normal paging protections and Access or Writes can be disabled via a MSR update without TLB flushes when permissions change. A page mapping is assigned to a domain by setting a pkey in the page table entry. Unlike User pkeys no new instructions are added; rather WRMSR/RDMSR are used to update the PKRS register. XSAVE is not supported for the PKRS MSR. To reduce software complexity the implementation saves/restores the MSR across context switches but not during irqs. This is a compromise which results is a hardening of unwanted access without absolute restriction. For consistent behavior with current paging protections, pkey 0 is reserved and configured to allow full access via the pkey mechanism, thus preserving the default paging protections on mappings with the default pkey value of 0. Other keys, (1-15) are allocated by an allocator which prepares us for key contention from day one. Kernel users should be prepared for the allocator to fail either because of key exhaustion or due to PKS not being supported on the arch and/or CPU instance. Protecting against stray writes is particularly important for PMEM because, unlike writes to anonymous memory, writes to PMEM persists across a reboot. Thus data corruption could result in permanent loss of data. The following attributes of PKS makes it perfect as a mechanism to protect PMEM from stray access within the kernel: 1) Fast switching of permissions 2) Prevents access without page table manipulations 3) Works on a per thread basis 4) No TLB flushes required The second half of this series thus uses the PKS mechanism to protect PMEM from stray access. Implementation details ---------------------- Modifications of task struct in patches: (x86/pks: Preserve the PKRS MSR on context switch) (memremap: Add zone device access protection) Because pkey access is per-thread 2 modifications are made to the task struct. The first is a saved copy of the MSR during context switches. The second reference counts access to the device domain to correctly handle kmap nesting properly. Maintain PKS setting in a re-entrant manner in patch: (memremap: Add zone device access protection) Using local_irq_save() seems to be the safest and fastest way to maintain kmap as re-entrant. But there may be a better way. spin_lock_irq() and atomic counters were considered. But atomic counters do not properly protect the pkey update and spin_lock_irq() is unnecessary as the pkey protections are thread local. Suggestions are welcome. The use of kmap in patch: (kmap: Add stray write protection for device pages) To keep general access to PMEM pages general, we piggy back on the kmap() interface as there are many places in the kernel who do not have, nor should be required to have, a priori knowledge that a page is PMEM. The modifications to the kmap code is careful to quickly determine which pages don't require special handling to reduce overhead for non PMEM pages. Breakdown of patches -------------------- Implement PKS within x86 arch: x86/pkeys: Create pkeys_internal.h x86/fpu: Refactor arch_set_user_pkey_access() for PKS support x86/pks: Enable Protection Keys Supervisor (PKS) x86/pks: Preserve the PKRS MSR on context switch x86/pks: Add PKS kernel API x86/pks: Add a debugfs file for allocated PKS keys Documentation/pkeys: Update documentation for kernel pkeys x86/pks: Add PKS Test code pre-req bug fixes for dax: fs/dax: Remove unused size parameter drivers/dax: Expand lock scope to cover the use of addresses Add stray write protection to PMEM: memremap: Add zone device access protection kmap: Add stray write protection for device pages dax: Stray write protection for dax_direct_access() nvdimm/pmem: Stray write protection for pmem->virt_addr [dax|pmem]: Enable stray write protection Fenghua Yu (4): x86/fpu: Refactor arch_set_user_pkey_access() for PKS support x86/pks: Enable Protection Keys Supervisor (PKS) x86/pks: Add PKS kernel API x86/pks: Add a debugfs file for allocated PKS keys Ira Weiny (11): x86/pkeys: Create pkeys_internal.h x86/pks: Preserve the PKRS MSR on context switch Documentation/pkeys: Update documentation for kernel pkeys x86/pks: Add PKS Test code fs/dax: Remove unused size parameter drivers/dax: Expand lock scope to cover the use of addresses memremap: Add zone device access protection kmap: Add stray write protection for device pages dax: Stray write protection for dax_direct_access() nvdimm/pmem: Stray write protection for pmem->virt_addr [dax|pmem]: Enable stray write protection Documentation/core-api/protection-keys.rst | 81 +++- arch/x86/Kconfig | 1 + arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 1 + arch/x86/include/asm/pgtable.h | 13 +- arch/x86/include/asm/pgtable_types.h | 4 + arch/x86/include/asm/pkeys.h | 43 ++ arch/x86/include/asm/pkeys_internal.h | 35 ++ arch/x86/include/asm/processor.h | 13 + arch/x86/include/uapi/asm/processor-flags.h | 2 + arch/x86/kernel/cpu/common.c | 17 + arch/x86/kernel/fpu/xstate.c | 17 +- arch/x86/kernel/process.c | 35 ++ arch/x86/mm/fault.c | 16 +- arch/x86/mm/pkeys.c | 174 +++++++- drivers/dax/device.c | 2 + drivers/dax/super.c | 5 +- drivers/nvdimm/pmem.c | 6 + fs/dax.c | 13 +- include/linux/highmem.h | 32 +- include/linux/memremap.h | 1 + include/linux/mm.h | 33 ++ include/linux/pkeys.h | 18 + include/linux/sched.h | 3 + init/init_task.c | 3 + kernel/fork.c | 3 + lib/Kconfig.debug | 12 + lib/Makefile | 3 + lib/pks/Makefile | 3 + lib/pks/pks_test.c | 452 ++++++++++++++++++++ mm/Kconfig | 15 + mm/memremap.c | 111 +++++ tools/testing/selftests/x86/Makefile | 3 +- tools/testing/selftests/x86/test_pks.c | 65 +++ 34 files changed, 1175 insertions(+), 61 deletions(-) create mode 100644 arch/x86/include/asm/pkeys_internal.h create mode 100644 lib/pks/Makefile create mode 100644 lib/pks/pks_test.c create mode 100644 tools/testing/selftests/x86/test_pks.c -- 2.25.1

4 years, 11 months

4
29
0 0

[PATCH] selftests/harness: Limit step counter reporting

by Kees Cook

When the selftest "step" counter grew beyond 255, non-fatal warnings were being emitted, which is noisy and pointless. There are selftests with more than 255 steps (especially those in loops, etc). Instead, just cap "steps" to 254 and do not report the saturation. Reported-by: Ralph Campbell <rcampbell(a)nvidia.com> Tested-by: Ralph Campbell <rcampbell(a)nvidia.com> Fixes: 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP") Signed-off-by: Kees Cook <keescook(a)chromium.org> --- tools/testing/selftests/kselftest_harness.h | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index 935029d4fb21..4f78e4805633 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -680,7 +680,8 @@ __bail(_assert, _metadata->no_print, _metadata->step)) #define __INC_STEP(_metadata) \ - if (_metadata->passed && _metadata->step < 255) \ + /* Keep "step" below 255 (which is used for "SKIP" reporting). */ \ + if (_metadata->passed && _metadata->step < 253) \ _metadata->step++; #define is_signed_type(var) (!!(((__typeof__(var))(-1)) < (__typeof__(var))1)) @@ -976,12 +977,6 @@ void __run_test(struct __fixture_metadata *f, t->passed = 0; } else if (t->pid == 0) { t->fn(t, variant); - /* Make sure step doesn't get lost in reporting */ - if (t->step >= 255) { - ksft_print_msg("Too many test steps (%u)!?\n", t->step); - t->step = 254; - } - /* Use 255 for SKIP */ if (t->skip) _exit(255); /* Pass is exit 0 */ -- 2.25.1 -- Kees Cook

4 years, 11 months

1
0
0 0

[PATCH] selftests/livepatch: Use "comm" instead of "diff" for dmesg

by Joe Lawrence

BusyBox diff doesn't support the GNU diff '--LTYPE-line-format' options that were used in the selftests to filter older kernel log messages from dmesg output. Use "comm" which is more available in smaller boot environments. Reported-by: Naresh Kamboju <naresh.kamboju(a)linaro.org> Signed-off-by: Joe Lawrence <joe.lawrence(a)redhat.com> --- based-on: livepatching.git/for-5.9/selftests-cleanup merge-thru: livepatching.git tools/testing/selftests/livepatch/functions.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/livepatch/functions.sh b/tools/testing/selftests/livepatch/functions.sh index 36648ca367c2..408529d94ddb 100644 --- a/tools/testing/selftests/livepatch/functions.sh +++ b/tools/testing/selftests/livepatch/functions.sh @@ -277,7 +277,7 @@ function check_result { # help differentiate repeated testing runs. Remove them with a # post-comparison sed filter. - result=$(dmesg | diff --changed-group-format='%>' --unchanged-group-format='' "$SAVED_DMESG" - | \ + result=$(dmesg | comm -13 "$SAVED_DMESG" - | \ grep -e 'livepatch:' -e 'test_klp' | \ grep -v '$tainting\|taints$ kernel' | \ sed 's/^\[[ 0-9.]*\] //') -- 2.21.3

4 years, 11 months

4
3
0 0

Re: [PATCH 2/4] KVM: x86: Introduce paravirt feature CR0/CR4 pinning

by Andersen, John

On Thu, Jul 09, 2020 at 09:27:43AM -0700, Andy Lutomirski wrote: > On Thu, Jul 9, 2020 at 9:22 AM Dave Hansen <dave.hansen(a)intel.com> wrote: > > > > On 7/9/20 9:07 AM, Andy Lutomirski wrote: > > > On Thu, Jul 9, 2020 at 8:56 AM Dave Hansen <dave.hansen(a)intel.com> wrote: > > >> On 7/9/20 8:44 AM, Andersen, John wrote: > > >>> Bits which are allowed to be pinned default to WP for CR0 and SMEP, > > >>> SMAP, and UMIP for CR4. > > >> I think it also makes sense to have FSGSBASE in this set. > > >> > > >> I know it hasn't been tested, but I think we should do the legwork to > > >> test it. If not in this set, can we agree that it's a logical next step? > > > I have no objection to pinning FSGSBASE, but is there a clear > > > description of the threat model that this whole series is meant to > > > address? The idea is to provide a degree of protection against an > > > attacker who is able to convince a guest kernel to write something > > > inappropriate to CR4, right? How realistic is this? > > > > If a quick search can find this: > > > > > https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-… > > > > I'd pretty confident that the guys doing actual bad things have it in > > their toolbox too. > > > > True, but we have the existing software CR4 pinning. I suppose the > virtualization version is stronger. > Yes, as Kees said this will be stronger because it stops ROP and other gadget based techniques which avoid the use of native_write_cr0/4(). With regards to what should be done in this patchset and what in other patchsets. I have a fix for kexec thanks to Arvind's note about TRAMPOLINE_32BIT_CODE_SIZE. The physical host boots fine now and the virtual one can kexec fine. What remains to be done on that front is to add some identifying information to the kernel image to declare that it supports paravirtualized control register pinning or not. Liran suggested adding a section to the built image acting as a flag to signify support for being kexec'd by a kernel with pinning enabled. If anyone has any opinions on how they'd like to see this implemented please let me know. Otherwise I'll just take a stab at it and you'll all see it hopefully in the next version. With regards to FSGSBASE, are we open to validating and adding that to the DEFAULT set as a part of a separate patchset? This patchset is focused on replicating the functionality we already have natively.

4 years, 11 months

1
0
0 0

[PATCH v2 0/8] selftests/harness: Switch to TAP output

by Kees Cook

Hi, v2: - switch harness from XFAIL to SKIP - pass skip reason from test into TAP output - add acks/reviews v1: https://lore.kernel.org/lkml/20200611224028.3275174-1-keescook@chromium.org/ I finally got around to converting the kselftest_harness.h API to actually use the kselftest.h API so all the tools using it can actually report TAP correctly. As part of this, there are a bunch of related cleanups, API updates, and additions. Thanks! -Kees Kees Cook (8): selftests/clone3: Reorder reporting output selftests: Remove unneeded selftest API headers selftests/binderfs: Fix harness API usage selftests: Add header documentation and helpers selftests/harness: Switch to TAP output selftests/harness: Refactor XFAIL into SKIP selftests/harness: Display signed values correctly selftests/harness: Report skip reason tools/testing/selftests/clone3/clone3.c | 2 +- .../selftests/clone3/clone3_clear_sighand.c | 3 +- .../testing/selftests/clone3/clone3_set_tid.c | 2 +- .../filesystems/binderfs/binderfs_test.c | 284 +++++++++--------- tools/testing/selftests/kselftest.h | 78 ++++- tools/testing/selftests/kselftest_harness.h | 169 ++++++++--- .../pid_namespace/regression_enomem.c | 1 - .../selftests/pidfd/pidfd_getfd_test.c | 1 - .../selftests/pidfd/pidfd_setns_test.c | 1 - tools/testing/selftests/seccomp/seccomp_bpf.c | 8 +- .../selftests/uevent/uevent_filtering.c | 1 - 11 files changed, 356 insertions(+), 194 deletions(-) -- 2.25.1

4 years, 11 months

3
14
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-kselftest-mirror