This is the start of the stable review cycle for the 6.1.6 release. There are 10 patches in this series; all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 14 Jan 2023 13:53:18 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.6-rc1.g...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman gregkh@linuxfoundation.org
    Linux 6.1.6-rc1

Frederick Lawler fred@cloudflare.com
    net: sched: disallow noqueue for qdisc classes

Linus Torvalds torvalds@linux-foundation.org
    gcc: disable -Warray-bounds for gcc-11 too

Chuck Lever chuck.lever@oracle.com
    Revert "SUNRPC: Use RMW bitops in single-threaded hot paths"

Kyle Huey me@kylehuey.com
    selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace

Kyle Huey me@kylehuey.com
    x86/fpu: Emulate XRSTOR's behavior if the xfeatures PKRU bit is not set

Kyle Huey me@kylehuey.com
    x86/fpu: Allow PKRU to be (once again) written by ptrace.

Kyle Huey me@kylehuey.com
    x86/fpu: Add a pkru argument to copy_uabi_to_xstate()

Kyle Huey me@kylehuey.com
    x86/fpu: Add a pkru argument to copy_uabi_from_kernel_to_xstate().

Kyle Huey me@kylehuey.com
    x86/fpu: Take task_struct* in copy_sigframe_from_user_to_xstate()

Helge Deller deller@gmx.de
    parisc: Align parisc MADV_XXX constants with all other architectures

-------------
Diffstat:
 Makefile                                     |   4 +-
 arch/parisc/include/uapi/asm/mman.h          |  29 +++---
 arch/parisc/kernel/sys_parisc.c              |  28 ++++++
 arch/parisc/kernel/syscalls/syscall.tbl      |   2 +-
 arch/x86/kernel/fpu/core.c                   |  19 ++--
 arch/x86/kernel/fpu/regset.c                 |   2 +-
 arch/x86/kernel/fpu/signal.c                 |   2 +-
 arch/x86/kernel/fpu/xstate.c                 |  52 ++++++++++-
 arch/x86/kernel/fpu/xstate.h                 |   4 +-
 fs/nfsd/nfs4proc.c                           |   7 +-
 fs/nfsd/nfs4xdr.c                            |   2 +-
 init/Kconfig                                 |   6 +-
 net/sched/sch_api.c                          |   5 +
 net/sunrpc/auth_gss/svcauth_gss.c            |   4 +-
 net/sunrpc/svc.c                             |   6 +-
 net/sunrpc/svc_xprt.c                        |   2 +-
 net/sunrpc/svcsock.c                         |   8 +-
 net/sunrpc/xprtrdma/svc_rdma_transport.c     |   2 +-
 tools/arch/parisc/include/uapi/asm/mman.h    |  12 +--
 tools/perf/bench/bench.h                     |  12 ---
 tools/testing/selftests/vm/pkey-x86.h        |  12 +++
 tools/testing/selftests/vm/protection_keys.c | 131 ++++++++++++++++++++++++++-
 22 files changed, 276 insertions(+), 75 deletions(-)
From: Helge Deller deller@gmx.de
commit 71bdea6f798b425bc0003780b13e3fdecb16a010 upstream.
Adjust some MADV_XXX constants to be in sync with their values on all other platforms. There is currently no reason for parisc to have its own numbering, and it requires workarounds in many userspace sources (e.g. glibc, qemu, ...), which are often forgotten and thus introduce bugs and different behaviour on parisc.
A wrapper avoids an ABI breakage for existing userspace applications by translating any old values to the new ones, so this change allows us to move over all programs to the new ABI over time.
Signed-off-by: Helge Deller deller@gmx.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/parisc/include/uapi/asm/mman.h       | 29 ++++++++++++++---------------
 arch/parisc/kernel/sys_parisc.c           | 28 ++++++++++++++++++++++++++++
 arch/parisc/kernel/syscalls/syscall.tbl   |  2 +-
 tools/arch/parisc/include/uapi/asm/mman.h | 12 ++++++------
 tools/perf/bench/bench.h                  | 12 ------------
 5 files changed, 49 insertions(+), 34 deletions(-)
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -49,6 +49,19 @@
 #define MADV_DONTFORK	10		/* don't inherit across fork */
 #define MADV_DOFORK	11		/* do inherit across fork */
 
+#define MADV_MERGEABLE   12		/* KSM may merge identical pages */
+#define MADV_UNMERGEABLE 13		/* KSM may not merge identical pages */
+
+#define MADV_HUGEPAGE	14		/* Worth backing with hugepages */
+#define MADV_NOHUGEPAGE 15		/* Not worth backing with hugepages */
+
+#define MADV_DONTDUMP   16		/* Explicity exclude from the core dump,
+					   overrides the coredump filter bits */
+#define MADV_DODUMP	17		/* Clear the MADV_NODUMP flag */
+
+#define MADV_WIPEONFORK 18		/* Zero memory on fork, child only */
+#define MADV_KEEPONFORK 19		/* Undo MADV_WIPEONFORK */
+
 #define MADV_COLD	20		/* deactivate these pages */
 #define MADV_PAGEOUT	21		/* reclaim these pages */
 
@@ -57,27 +70,13 @@
 
 #define MADV_DONTNEED_LOCKED	24	/* like DONTNEED, but drop locked pages too */
 
-#define MADV_MERGEABLE   65		/* KSM may merge identical pages */
-#define MADV_UNMERGEABLE 66		/* KSM may not merge identical pages */
-
-#define MADV_HUGEPAGE	67		/* Worth backing with hugepages */
-#define MADV_NOHUGEPAGE	68		/* Not worth backing with hugepages */
-
-#define MADV_DONTDUMP   69		/* Explicity exclude from the core dump,
-					   overrides the coredump filter bits */
-#define MADV_DODUMP	70		/* Clear the MADV_NODUMP flag */
-
-#define MADV_WIPEONFORK 71		/* Zero memory on fork, child only */
-#define MADV_KEEPONFORK 72		/* Undo MADV_WIPEONFORK */
-
-#define MADV_COLLAPSE	73		/* Synchronous hugepage collapse */
+#define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
 
 #define MADV_HWPOISON     100		/* poison a page for testing */
 #define MADV_SOFT_OFFLINE 101		/* soft offline page for testing */
 
 /* compatibility flags */
 #define MAP_FILE	0
-#define MAP_VARIABLE	0
 #define PKEY_DISABLE_ACCESS	0x1
 #define PKEY_DISABLE_WRITE	0x2
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -465,3 +465,31 @@ asmlinkage long parisc_inotify_init1(int
 	flags = FIX_O_NONBLOCK(flags);
 	return sys_inotify_init1(flags);
 }
+
+/*
+ * madvise() wrapper
+ *
+ * Up to kernel v6.1 parisc has different values than all other
+ * platforms for the MADV_xxx flags listed below.
+ * To keep binary compatibility with existing userspace programs
+ * translate the former values to the new values.
+ *
+ * XXX: Remove this wrapper in year 2025 (or later)
+ */
+
+asmlinkage notrace long parisc_madvise(unsigned long start, size_t len_in, int behavior)
+{
+	switch (behavior) {
+	case 65: behavior = MADV_MERGEABLE;	break;
+	case 66: behavior = MADV_UNMERGEABLE;	break;
+	case 67: behavior = MADV_HUGEPAGE;	break;
+	case 68: behavior = MADV_NOHUGEPAGE;	break;
+	case 69: behavior = MADV_DONTDUMP;	break;
+	case 70: behavior = MADV_DODUMP;	break;
+	case 71: behavior = MADV_WIPEONFORK;	break;
+	case 72: behavior = MADV_KEEPONFORK;	break;
+	case 73: behavior = MADV_COLLAPSE;	break;
+	}
+
+	return sys_madvise(start, len_in, behavior);
+}
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -131,7 +131,7 @@
 116	common	sysinfo			sys_sysinfo			compat_sys_sysinfo
 117	common	shutdown		sys_shutdown
 118	common	fsync			sys_fsync
-119	common	madvise			sys_madvise
+119	common	madvise			parisc_madvise
 120	common	clone			sys_clone_wrapper
 121	common	setdomainname		sys_setdomainname
 122	common	sendfile		sys_sendfile			compat_sys_sendfile
--- a/tools/arch/parisc/include/uapi/asm/mman.h
+++ b/tools/arch/parisc/include/uapi/asm/mman.h
@@ -1,20 +1,20 @@
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
 #ifndef TOOLS_ARCH_PARISC_UAPI_ASM_MMAN_FIX_H
 #define TOOLS_ARCH_PARISC_UAPI_ASM_MMAN_FIX_H
-#define MADV_DODUMP	70
+#define MADV_DODUMP	17
 #define MADV_DOFORK	11
-#define MADV_DONTDUMP   69
+#define MADV_DONTDUMP   16
 #define MADV_DONTFORK	10
 #define MADV_DONTNEED   4
 #define MADV_FREE	8
-#define MADV_HUGEPAGE	67
-#define MADV_MERGEABLE   65
-#define MADV_NOHUGEPAGE	68
+#define MADV_HUGEPAGE	14
+#define MADV_MERGEABLE   12
+#define MADV_NOHUGEPAGE	15
 #define MADV_NORMAL     0
 #define MADV_RANDOM     1
 #define MADV_REMOVE	9
 #define MADV_SEQUENTIAL 2
-#define MADV_UNMERGEABLE 66
+#define MADV_UNMERGEABLE 13
 #define MADV_WILLNEED   3
 #define MAP_ANONYMOUS	0x10
 #define MAP_DENYWRITE	0x0800
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -10,25 +10,13 @@ extern struct timeval bench__start, benc
  * The madvise transparent hugepage constants were added in glibc
  * 2.13. For compatibility with older versions of glibc, define these
  * tokens if they are not already defined.
- *
- * PA-RISC uses different madvise values from other architectures and
- * needs to be special-cased.
  */
-#ifdef __hppa__
-# ifndef MADV_HUGEPAGE
-# define MADV_HUGEPAGE		67
-# endif
-# ifndef MADV_NOHUGEPAGE
-# define MADV_NOHUGEPAGE	68
-# endif
-#else
 # ifndef MADV_HUGEPAGE
 # define MADV_HUGEPAGE		14
 # endif
 # ifndef MADV_NOHUGEPAGE
 # define MADV_NOHUGEPAGE	15
 # endif
-#endif
 
 int bench_numa(int argc, const char **argv);
 int bench_sched_messaging(int argc, const char **argv);
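
For readers who want to see the compatibility translation in isolation, the following is a minimal userspace sketch of what the parisc_madvise() wrapper in this patch does. It is not the kernel code: the function name translate_legacy_madv() and the NEW_MADV_* enumerators are hypothetical names chosen for this example, and only the numeric values are taken from the diff above.

```c
/*
 * New parisc MADV_* values as introduced by the patch above.
 * The NEW_ prefix is an assumption of this sketch, used only to avoid
 * clashing with any MADV_* macros a libc header might define.
 */
enum {
	NEW_MADV_MERGEABLE   = 12,
	NEW_MADV_UNMERGEABLE = 13,
	NEW_MADV_HUGEPAGE    = 14,
	NEW_MADV_NOHUGEPAGE  = 15,
	NEW_MADV_DONTDUMP    = 16,
	NEW_MADV_DODUMP      = 17,
	NEW_MADV_WIPEONFORK  = 18,
	NEW_MADV_KEEPONFORK  = 19,
	NEW_MADV_COLLAPSE    = 25,
};

/*
 * Hypothetical userspace model of the translation: map the former
 * parisc-only values (65..73) to the new values and pass every other
 * value through unchanged.
 */
static int translate_legacy_madv(int behavior)
{
	switch (behavior) {
	case 65: return NEW_MADV_MERGEABLE;
	case 66: return NEW_MADV_UNMERGEABLE;
	case 67: return NEW_MADV_HUGEPAGE;
	case 68: return NEW_MADV_NOHUGEPAGE;
	case 69: return NEW_MADV_DONTDUMP;
	case 70: return NEW_MADV_DODUMP;
	case 71: return NEW_MADV_WIPEONFORK;
	case 72: return NEW_MADV_KEEPONFORK;
	case 73: return NEW_MADV_COLLAPSE;
	default: return behavior;
	}
}
```

Values that were never renumbered (for example MADV_DONTNEED, 4) and the new values themselves fall through the default case, so binaries built against either set of headers end up with the same behavior value.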
From: Kyle Huey me@kylehuey.com
commit 6a877d2450ace4f27c012519e5a1ae818f931983 upstream.
This will allow copy_sigframe_from_user_to_xstate() to grab the address of thread_struct's pkru value in a later patch.
Signed-off-by: Kyle Huey me@kylehuey.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Link: https://lore.kernel.org/all/20221115230932.7126-2-khuey%40kylehuey.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/fpu/signal.c | 2 +-
 arch/x86/kernel/fpu/xstate.c | 4 ++--
 arch/x86/kernel/fpu/xstate.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -396,7 +396,7 @@ static bool __fpu_restore_sig(void __use

 	fpregs = &fpu->fpstate->regs;
 	if (use_xsave() && !fx_only) {
-		if (copy_sigframe_from_user_to_xstate(fpu->fpstate, buf_fx))
+		if (copy_sigframe_from_user_to_xstate(tsk, buf_fx))
 			return false;
 	} else {
 		if (__copy_from_user(&fpregs->fxsave, buf_fx,
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1278,10 +1278,10 @@ int copy_uabi_from_kernel_to_xstate(stru
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
  */
-int copy_sigframe_from_user_to_xstate(struct fpstate *fpstate,
+int copy_sigframe_from_user_to_xstate(struct task_struct *tsk,
 				      const void __user *ubuf)
 {
-	return copy_uabi_to_xstate(fpstate, NULL, ubuf);
+	return copy_uabi_to_xstate(tsk->thread.fpu.fpstate, NULL, ubuf);
 }

 static bool validate_independent_components(u64 mask)
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -47,7 +47,7 @@ extern void __copy_xstate_to_uabi_buf(st
 extern void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
 				    enum xstate_copy_mode mode);
 extern int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf);
-extern int copy_sigframe_from_user_to_xstate(struct fpstate *fpstate, const void __user *ubuf);
+extern int copy_sigframe_from_user_to_xstate(struct task_struct *tsk, const void __user *ubuf);


 extern void fpu__init_cpu_xstate(void);
From: Kyle Huey me@kylehuey.com
commit 1c813ce0305571e1b2e4cc4acca451da9e6ad18f upstream.
Both KVM (through KVM_SET_XSTATE) and ptrace (through PTRACE_SETREGSET with NT_X86_XSTATE) ultimately call copy_uabi_from_kernel_to_xstate(), but the canonical locations for the current PKRU value for KVM guests and processes in a ptrace stop are different (in the kvm_vcpu_arch and the thread_struct structs respectively).
In preparation for eventually handling PKRU in copy_uabi_to_xstate, pass in a pointer to the PKRU location.
Signed-off-by: Kyle Huey me@kylehuey.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Link: https://lore.kernel.org/all/20221115230932.7126-3-khuey%40kylehuey.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/fpu/core.c   | 2 +-
 arch/x86/kernel/fpu/regset.c | 2 +-
 arch/x86/kernel/fpu/xstate.c | 2 +-
 arch/x86/kernel/fpu/xstate.h | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -406,7 +406,7 @@ int fpu_copy_uabi_to_guest_fpstate(struc
 	if (ustate->xsave.header.xfeatures & ~xcr0)
 		return -EINVAL;

-	ret = copy_uabi_from_kernel_to_xstate(kstate, ustate);
+	ret = copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);
 	if (ret)
 		return ret;

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -167,7 +167,7 @@ int xstateregs_set(struct task_struct *t
 	}

 	fpu_force_restore(fpu);
-	ret = copy_uabi_from_kernel_to_xstate(fpu->fpstate, kbuf ?: tmpbuf);
+	ret = copy_uabi_from_kernel_to_xstate(fpu->fpstate, kbuf ?: tmpbuf, &target->thread.pkru);

 out:
 	vfree(tmpbuf);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1268,7 +1268,7 @@ static int copy_uabi_to_xstate(struct fp
  * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S]
  * format and copy to the target thread. Used by ptrace and KVM.
  */
-int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf)
+int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf, u32 *pkru)
 {
 	return copy_uabi_to_xstate(fpstate, kbuf, NULL);
 }
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -46,7 +46,7 @@ extern void __copy_xstate_to_uabi_buf(st
 				      u32 pkru_val, enum xstate_copy_mode copy_mode);
 extern void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
 				    enum xstate_copy_mode mode);
-extern int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf);
+extern int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf, u32 *pkru);
 extern int copy_sigframe_from_user_to_xstate(struct task_struct *tsk, const void __user *ubuf);
From: Kyle Huey me@kylehuey.com
commit 2c87767c35ee9744f666ccec869d5fe742c3de0a upstream.
In preparation for moving PKRU handling code out of fpu_copy_uabi_to_guest_fpstate() and into copy_uabi_to_xstate(), add an argument that copy_uabi_from_kernel_to_xstate() can use to pass the canonical location of the PKRU value. For copy_sigframe_from_user_to_xstate() the kernel will actually restore the PKRU value from the fpstate, but pass in the thread_struct's pkru location anyways for consistency.
Signed-off-by: Kyle Huey me@kylehuey.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Link: https://lore.kernel.org/all/20221115230932.7126-4-khuey%40kylehuey.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/fpu/xstate.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1200,8 +1200,18 @@ static int copy_from_buffer(void *dst, u
 }


+/**
+ * copy_uabi_to_xstate - Copy a UABI format buffer to the kernel xstate
+ * @fpstate:	The fpstate buffer to copy to
+ * @kbuf:	The UABI format buffer, if it comes from the kernel
+ * @ubuf:	The UABI format buffer, if it comes from userspace
+ * @pkru:	unused
+ *
+ * Converts from the UABI format into the kernel internal hardware
+ * dependent format.
+ */
 static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
-			       const void __user *ubuf)
+			       const void __user *ubuf, u32 *pkru)
 {
 	struct xregs_state *xsave = &fpstate->regs.xsave;
 	unsigned int offset, size;
@@ -1270,7 +1280,7 @@ static int copy_uabi_to_xstate(struct fp
  */
 int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf, u32 *pkru)
 {
-	return copy_uabi_to_xstate(fpstate, kbuf, NULL);
+	return copy_uabi_to_xstate(fpstate, kbuf, NULL, pkru);
 }

 /*
@@ -1281,7 +1291,7 @@ int copy_uabi_from_kernel_to_xstate(stru
 int copy_sigframe_from_user_to_xstate(struct task_struct *tsk,
 				      const void __user *ubuf)
 {
-	return copy_uabi_to_xstate(tsk->thread.fpu.fpstate, NULL, ubuf);
+	return copy_uabi_to_xstate(tsk->thread.fpu.fpstate, NULL, ubuf, &tsk->thread.pkru);
 }

 static bool validate_independent_components(u64 mask)
From: Kyle Huey me@kylehuey.com
commit 4a804c4f8356393d6b5eff7600f07615d7869c13 upstream.
Move KVM's PKRU handling code in fpu_copy_uabi_to_guest_fpstate() to copy_uabi_to_xstate() so that it is shared with other APIs that write the XSTATE such as PTRACE_SETREGSET with NT_X86_XSTATE.
This restores the pre-5.14 behavior of ptrace. The regression can be seen by running gdb and executing `p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the write to the PKRU register (which gdb performs through ptrace) is ignored.
[ dhansen: removed stable@ tag for now. The ABI was broken for long enough that this is not urgent material. Let's let it stew in tip for a few weeks before it's submitted to stable because there are so many ABIs potentially affected. ]
Fixes: e84ba47e313d ("x86/fpu: Hook up PKRU into ptrace()")
Signed-off-by: Kyle Huey me@kylehuey.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Link: https://lore.kernel.org/all/20221115230932.7126-5-khuey%40kylehuey.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/fpu/core.c   | 13 +------------
 arch/x86/kernel/fpu/xstate.c | 21 ++++++++++++++++++++-
 2 files changed, 21 insertions(+), 13 deletions(-)
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -391,8 +391,6 @@ int fpu_copy_uabi_to_guest_fpstate(struc
 {
 	struct fpstate *kstate = gfpu->fpstate;
 	const union fpregs_state *ustate = buf;
-	struct pkru_state *xpkru;
-	int ret;

 	if (!cpu_feature_enabled(X86_FEATURE_XSAVE)) {
 		if (ustate->xsave.header.xfeatures & ~XFEATURE_MASK_FPSSE)
@@ -406,16 +404,7 @@ int fpu_copy_uabi_to_guest_fpstate(struc
 	if (ustate->xsave.header.xfeatures & ~xcr0)
 		return -EINVAL;

-	ret = copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);
-	if (ret)
-		return ret;
-
-	/* Retrieve PKRU if not in init state */
-	if (kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU) {
-		xpkru = get_xsave_addr(&kstate->regs.xsave, XFEATURE_PKRU);
-		*vpkru = xpkru->pkru;
-	}
-	return 0;
+	return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);
 }
 EXPORT_SYMBOL_GPL(fpu_copy_uabi_to_guest_fpstate);
 #endif /* CONFIG_KVM */
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1205,10 +1205,22 @@ static int copy_from_buffer(void *dst, u
  * @fpstate:	The fpstate buffer to copy to
  * @kbuf:	The UABI format buffer, if it comes from the kernel
  * @ubuf:	The UABI format buffer, if it comes from userspace
- * @pkru:	unused
+ * @pkru:	The location to write the PKRU value to
  *
  * Converts from the UABI format into the kernel internal hardware
  * dependent format.
+ *
+ * This function ultimately has three different callers with distinct PKRU
+ * behavior.
+ * 1.	When called from sigreturn the PKRU register will be restored from
+ *	@fpstate via an XRSTOR. Correctly copying the UABI format buffer to
+ *	@fpstate is sufficient to cover this case, but the caller will also
+ *	pass a pointer to the thread_struct's pkru field in @pkru and updating
+ *	it is harmless.
+ * 2.	When called from ptrace the PKRU register will be restored from the
+ *	thread_struct's pkru field. A pointer to that is passed in @pkru.
+ * 3.	When called from KVM the PKRU register will be restored from the vcpu's
+ *	pkru field. A pointer to that is passed in @pkru.
  */
 static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
 			       const void __user *ubuf, u32 *pkru)
@@ -1260,6 +1272,13 @@ static int copy_uabi_to_xstate(struct fp
 		}
 	}

+	if (hdr.xfeatures & XFEATURE_MASK_PKRU) {
+		struct pkru_state *xpkru;
+
+		xpkru = __raw_xsave_addr(xsave, XFEATURE_PKRU);
+		*pkru = xpkru->pkru;
+	}
+
 	/*
 	 * The state that came in from userspace was user-state only.
 	 * Mask all the user states out of 'xfeatures':
From: Kyle Huey me@kylehuey.com
commit d7e5aceace514a2b1b3ca3dc44f93f1704766ca7 upstream.
The hardware XRSTOR instruction resets the PKRU register to its hardware init value (namely 0) if the PKRU bit is not set in the xfeatures mask. Emulating that here restores the pre-5.14 behavior for PTRACE_SETREGSET with NT_X86_XSTATE, and makes sigreturn (which still uses XRSTOR) and ptrace behave identically. KVM has never used XRSTOR and never had this behavior, so KVM opts out of this emulation by passing a NULL pkru pointer to copy_uabi_to_xstate().
Fixes: e84ba47e313d ("x86/fpu: Hook up PKRU into ptrace()")
Signed-off-by: Kyle Huey me@kylehuey.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Link: https://lore.kernel.org/all/20221115230932.7126-6-khuey%40kylehuey.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/fpu/core.c   |  8 ++++++++
 arch/x86/kernel/fpu/xstate.c | 15 ++++++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -404,6 +404,14 @@ int fpu_copy_uabi_to_guest_fpstate(struc
 	if (ustate->xsave.header.xfeatures & ~xcr0)
 		return -EINVAL;

+	/*
+	 * Nullify @vpkru to preserve its current value if PKRU's bit isn't set
+	 * in the header. KVM's odd ABI is to leave PKRU untouched in this
+	 * case (all other components are eventually re-initialized).
+	 */
+	if (!(ustate->xsave.header.xfeatures & XFEATURE_MASK_PKRU))
+		vpkru = NULL;
+
 	return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);
 }
 EXPORT_SYMBOL_GPL(fpu_copy_uabi_to_guest_fpstate);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1219,8 +1219,14 @@ static int copy_from_buffer(void *dst, u
  *	it is harmless.
  * 2.	When called from ptrace the PKRU register will be restored from the
  *	thread_struct's pkru field. A pointer to that is passed in @pkru.
+ *	The kernel will restore it manually, so the XRSTOR behavior that resets
+ *	the PKRU register to the hardware init value (0) if the corresponding
+ *	xfeatures bit is not set is emulated here.
  * 3.	When called from KVM the PKRU register will be restored from the vcpu's
- *	pkru field. A pointer to that is passed in @pkru.
+ *	pkru field. A pointer to that is passed in @pkru. KVM hasn't used
+ *	XRSTOR and hasn't had the PKRU resetting behavior described above. To
+ *	preserve that KVM behavior, it passes NULL for @pkru if the xfeatures
+ *	bit is not set.
  */
 static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
 			       const void __user *ubuf, u32 *pkru)
@@ -1277,6 +1283,13 @@ static int copy_uabi_to_xstate(struct fp

 		xpkru = __raw_xsave_addr(xsave, XFEATURE_PKRU);
 		*pkru = xpkru->pkru;
+	} else {
+		/*
+		 * KVM may pass NULL here to indicate that it does not need
+		 * PKRU updated.
+		 */
+		if (pkru)
+			*pkru = 0;
 	}

 	/*
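
The PKRU decision logic that copy_uabi_to_xstate() ends up with after this patch can be modelled in plain userspace C, as sketched below. This is an illustration under assumptions, not the kernel code: model_update_pkru() and MODEL_XFEATURE_MASK_PKRU are made-up names, and the PKRU value from the buffer is passed in directly instead of being looked up in an XSAVE area. The bit position (9) matches the one the selftest in this series uses for PKRU in XSTATE_BV.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: PKRU is xfeature bit 9. */
#define MODEL_XFEATURE_MASK_PKRU (UINT64_C(1) << 9)

/*
 * Hypothetical model of the PKRU handling:
 *  - PKRU bit set in xfeatures: store the buffer's PKRU value through
 *    @pkru (callers pass a valid pointer in this case).
 *  - PKRU bit clear: emulate XRSTOR by resetting PKRU to its init value
 *    (0), unless the caller passed NULL to opt out, as KVM does.
 */
static void model_update_pkru(uint64_t xfeatures, uint32_t buf_pkru,
			      uint32_t *pkru)
{
	if (xfeatures & MODEL_XFEATURE_MASK_PKRU) {
		*pkru = buf_pkru;
	} else if (pkru) {
		*pkru = 0;
	}
}
```

In this model a ptrace-style caller always passes a real pointer and therefore sees PKRU go to 0 when the bit is clear, while a KVM-style caller passes NULL in that case and keeps its previous value.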
From: Kyle Huey me@kylehuey.com
commit 6ea25770b043c7997ab21d1ce95ba5de4d3d85d9 upstream.
This tests PTRACE_SETREGSET with NT_X86_XSTATE modifying PKRU directly and removing the PKRU bit from XSTATE_BV.
Signed-off-by: Kyle Huey me@kylehuey.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Link: https://lore.kernel.org/all/20221115230932.7126-7-khuey%40kylehuey.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/testing/selftests/vm/pkey-x86.h        |  12 ++
 tools/testing/selftests/vm/protection_keys.c | 131 ++++++++++++++++++++++++++-
 2 files changed, 141 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/vm/pkey-x86.h +++ b/tools/testing/selftests/vm/pkey-x86.h @@ -104,6 +104,18 @@ static inline int cpu_has_pkeys(void) return 1; }
+static inline int cpu_max_xsave_size(void) +{ + unsigned long XSTATE_CPUID = 0xd; + unsigned int eax; + unsigned int ebx; + unsigned int ecx; + unsigned int edx; + + __cpuid_count(XSTATE_CPUID, 0, eax, ebx, ecx, edx); + return ecx; +} + static inline u32 pkey_bit_position(int pkey) { return pkey * PKEY_BITS_PER_PKEY; --- a/tools/testing/selftests/vm/protection_keys.c +++ b/tools/testing/selftests/vm/protection_keys.c @@ -18,12 +18,13 @@ * do a plain mprotect() to a mprotect_pkey() area and make sure the pkey sticks * * Compile like this: - * gcc -o protection_keys -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm - * gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm + * gcc -mxsave -o protection_keys -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm + * gcc -mxsave -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm */ #define _GNU_SOURCE #define __SANE_USERSPACE_TYPES__ #include <errno.h> +#include <linux/elf.h> #include <linux/futex.h> #include <time.h> #include <sys/time.h> @@ -1550,6 +1551,129 @@ void test_implicit_mprotect_exec_only_me do_not_expect_pkey_fault("plain read on recently PROT_EXEC area"); }
+#if defined(__i386__) || defined(__x86_64__) +void test_ptrace_modifies_pkru(int *ptr, u16 pkey) +{ + u32 new_pkru; + pid_t child; + int status, ret; + int pkey_offset = pkey_reg_xstate_offset(); + size_t xsave_size = cpu_max_xsave_size(); + void *xsave; + u32 *pkey_register; + u64 *xstate_bv; + struct iovec iov; + + new_pkru = ~read_pkey_reg(); + /* Don't make PROT_EXEC mappings inaccessible */ + new_pkru &= ~3; + + child = fork(); + pkey_assert(child >= 0); + dprintf3("[%d] fork() ret: %d\n", getpid(), child); + if (!child) { + ptrace(PTRACE_TRACEME, 0, 0, 0); + /* Stop and allow the tracer to modify PKRU directly */ + raise(SIGSTOP); + + /* + * need __read_pkey_reg() version so we do not do shadow_pkey_reg + * checking + */ + if (__read_pkey_reg() != new_pkru) + exit(1); + + /* Stop and allow the tracer to clear XSTATE_BV for PKRU */ + raise(SIGSTOP); + + if (__read_pkey_reg() != 0) + exit(1); + + /* Stop and allow the tracer to examine PKRU */ + raise(SIGSTOP); + + exit(0); + } + + pkey_assert(child == waitpid(child, &status, 0)); + dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status); + pkey_assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP); + + xsave = (void *)malloc(xsave_size); + pkey_assert(xsave > 0); + + /* Modify the PKRU register directly */ + iov.iov_base = xsave; + iov.iov_len = xsave_size; + ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov); + pkey_assert(ret == 0); + + pkey_register = (u32 *)(xsave + pkey_offset); + pkey_assert(*pkey_register == read_pkey_reg()); + + *pkey_register = new_pkru; + + ret = ptrace(PTRACE_SETREGSET, child, (void *)NT_X86_XSTATE, &iov); + pkey_assert(ret == 0); + + /* Test that the modification is visible in ptrace before any execution */ + memset(xsave, 0xCC, xsave_size); + ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov); + pkey_assert(ret == 0); + pkey_assert(*pkey_register == new_pkru); + + /* Execute the tracee */ + ret = ptrace(PTRACE_CONT, child, 0, 0); + 
pkey_assert(ret == 0); + + /* Test that the tracee saw the PKRU value change */ + pkey_assert(child == waitpid(child, &status, 0)); + dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status); + pkey_assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP); + + /* Test that the modification is visible in ptrace after execution */ + memset(xsave, 0xCC, xsave_size); + ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov); + pkey_assert(ret == 0); + pkey_assert(*pkey_register == new_pkru); + + /* Clear the PKRU bit from XSTATE_BV */ + xstate_bv = (u64 *)(xsave + 512); + *xstate_bv &= ~(1 << 9); + + ret = ptrace(PTRACE_SETREGSET, child, (void *)NT_X86_XSTATE, &iov); + pkey_assert(ret == 0); + + /* Test that the modification is visible in ptrace before any execution */ + memset(xsave, 0xCC, xsave_size); + ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov); + pkey_assert(ret == 0); + pkey_assert(*pkey_register == 0); + + ret = ptrace(PTRACE_CONT, child, 0, 0); + pkey_assert(ret == 0); + + /* Test that the tracee saw the PKRU value go to 0 */ + pkey_assert(child == waitpid(child, &status, 0)); + dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status); + pkey_assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP); + + /* Test that the modification is visible in ptrace after execution */ + memset(xsave, 0xCC, xsave_size); + ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov); + pkey_assert(ret == 0); + pkey_assert(*pkey_register == 0); + + ret = ptrace(PTRACE_CONT, child, 0, 0); + pkey_assert(ret == 0); + pkey_assert(child == waitpid(child, &status, 0)); + dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status); + pkey_assert(WIFEXITED(status)); + pkey_assert(WEXITSTATUS(status) == 0); + free(xsave); +} +#endif + void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey) { int size = PAGE_SIZE; @@ -1585,6 +1709,9 @@ void (*pkey_tests[])(int *ptr, u16 pkey) test_pkey_syscalls_bad_args, 
test_pkey_alloc_exhaust, test_pkey_alloc_free_attach_pkey0, +#if defined(__i386__) || defined(__x86_64__) + test_ptrace_modifies_pkru, +#endif };
void run_tests_once(void)
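
The step in the selftest that removes the PKRU bit from XSTATE_BV can be looked at on its own. The sketch below assumes, as the test does, that XSTATE_BV is the 64-bit field at byte offset 512 of the XSAVE buffer and that PKRU is bit 9. The helper name clear_pkru_in_xstate_bv() and the two macros are hypothetical; memcpy() is used here instead of a pointer cast only to sidestep alignment and aliasing questions in a standalone example.

```c
#include <stdint.h>
#include <string.h>

/* Assumptions taken from the selftest above: header offset and PKRU bit. */
#define MODEL_XSTATE_BV_OFFSET 512u
#define MODEL_PKRU_BIT         9

/*
 * Clear the PKRU bit in the XSTATE_BV field of a UABI-format XSAVE
 * buffer. The buffer must be at least MODEL_XSTATE_BV_OFFSET + 8 bytes.
 */
static void clear_pkru_in_xstate_bv(unsigned char *xsave)
{
	uint64_t xstate_bv;

	memcpy(&xstate_bv, xsave + MODEL_XSTATE_BV_OFFSET, sizeof(xstate_bv));
	xstate_bv &= ~(UINT64_C(1) << MODEL_PKRU_BIT);
	memcpy(xsave + MODEL_XSTATE_BV_OFFSET, &xstate_bv, sizeof(xstate_bv));
}
```

A tracer would call such a helper on the buffer returned by PTRACE_GETREGSET before handing it back with PTRACE_SETREGSET, which is the sequence the test exercises.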
From: Chuck Lever chuck.lever@oracle.com
commit 7827c81f0248e3c2f40d438b020f3d222f002171 upstream.
The premise that "Once an svc thread is scheduled and executing an RPC, no other processes will touch svc_rqst::rq_flags" is false. svc_xprt_enqueue() examines the RQ_BUSY flag in scheduled nfsd threads when determining which thread to wake up next.
Found via KCSAN.
Fixes: 28df0988815f ("SUNRPC: Use RMW bitops in single-threaded hot paths")
Reviewed-by: Jeff Layton jlayton@kernel.org
Signed-off-by: Chuck Lever chuck.lever@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/nfsd/nfs4proc.c                       | 7 +++----
 fs/nfsd/nfs4xdr.c                        | 2 +-
 net/sunrpc/auth_gss/svcauth_gss.c        | 4 ++--
 net/sunrpc/svc.c                         | 6 +++---
 net/sunrpc/svc_xprt.c                    | 2 +-
 net/sunrpc/svcsock.c                     | 8 ++++----
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 2 +-
 7 files changed, 15 insertions(+), 16 deletions(-)
--- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -928,7 +928,7 @@ nfsd4_read(struct svc_rqst *rqstp, struc * the client wants us to do more in this compound: */ if (!nfsd4_last_compound_op(rqstp)) - __clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags); + clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
/* check stateid */ status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh, @@ -2615,12 +2615,11 @@ nfsd4_proc_compound(struct svc_rqst *rqs cstate->minorversion = args->minorversion; fh_init(current_fh, NFS4_FHSIZE); fh_init(save_fh, NFS4_FHSIZE); - /* * Don't use the deferral mechanism for NFSv4; compounds make it * too hard to avoid non-idempotency problems. */ - __clear_bit(RQ_USEDEFERRAL, &rqstp->rq_flags); + clear_bit(RQ_USEDEFERRAL, &rqstp->rq_flags);
 	/*
 	 * According to RFC3010, this takes precedence over all other errors.
@@ -2742,7 +2741,7 @@ encode_op:
 out:
 	cstate->status = status;
 	/* Reset deferral mechanism for RPC deferrals */
-	__set_bit(RQ_USEDEFERRAL, &rqstp->rq_flags);
+	set_bit(RQ_USEDEFERRAL, &rqstp->rq_flags);
 	return rpc_success;
 }
 
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2464,7 +2464,7 @@ nfsd4_decode_compound(struct nfsd4_compo
 	argp->rqstp->rq_cachetype = cachethis ? RC_REPLBUFF : RC_NOCACHE;
 
 	if (readcount > 1 || max_reply > PAGE_SIZE - auth_slack)
-		__clear_bit(RQ_SPLICE_OK, &argp->rqstp->rq_flags);
+		clear_bit(RQ_SPLICE_OK, &argp->rqstp->rq_flags);
 
 	return true;
 }
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -900,7 +900,7 @@ unwrap_integ_data(struct svc_rqst *rqstp
 	 * rejecting the server-computed MIC in this somewhat rare case,
 	 * do not use splice with the GSS integrity service.
 	 */
-	__clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
+	clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
 
 	/* Did we already verify the signature on the original pass through? */
 	if (rqstp->rq_deferred)
@@ -972,7 +972,7 @@ unwrap_priv_data(struct svc_rqst *rqstp,
 	int pad, remaining_len, offset;
 	u32 rseqno;
 
-	__clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
+	clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
 
 	priv_len = svc_getnl(&buf->head[0]);
 	if (rqstp->rq_deferred) {
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1244,10 +1244,10 @@ svc_process_common(struct svc_rqst *rqst
 		goto err_short_len;
 
 	/* Will be turned off by GSS integrity and privacy services */
-	__set_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
+	set_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
 	/* Will be turned off only when NFSv4 Sessions are used */
-	__set_bit(RQ_USEDEFERRAL, &rqstp->rq_flags);
-	__clear_bit(RQ_DROPME, &rqstp->rq_flags);
+	set_bit(RQ_USEDEFERRAL, &rqstp->rq_flags);
+	clear_bit(RQ_DROPME, &rqstp->rq_flags);
 
 	svc_putu32(resv, rqstp->rq_xid);
 
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -1238,7 +1238,7 @@ static struct cache_deferred_req *svc_de
 	trace_svc_defer(rqstp);
 	svc_xprt_get(rqstp->rq_xprt);
 	dr->xprt = rqstp->rq_xprt;
-	__set_bit(RQ_DROPME, &rqstp->rq_flags);
+	set_bit(RQ_DROPME, &rqstp->rq_flags);
 
 	dr->handle.revisit = svc_revisit;
 	return &dr->handle;
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -298,9 +298,9 @@ static void svc_sock_setbufsize(struct s
 static void svc_sock_secure_port(struct svc_rqst *rqstp)
 {
 	if (svc_port_is_privileged(svc_addr(rqstp)))
-		__set_bit(RQ_SECURE, &rqstp->rq_flags);
+		set_bit(RQ_SECURE, &rqstp->rq_flags);
 	else
-		__clear_bit(RQ_SECURE, &rqstp->rq_flags);
+		clear_bit(RQ_SECURE, &rqstp->rq_flags);
 }
 
 /*
@@ -1008,9 +1008,9 @@ static int svc_tcp_recvfrom(struct svc_r
 	rqstp->rq_xprt_ctxt = NULL;
 	rqstp->rq_prot = IPPROTO_TCP;
 	if (test_bit(XPT_LOCAL, &svsk->sk_xprt.xpt_flags))
-		__set_bit(RQ_LOCAL, &rqstp->rq_flags);
+		set_bit(RQ_LOCAL, &rqstp->rq_flags);
 	else
-		__clear_bit(RQ_LOCAL, &rqstp->rq_flags);
+		clear_bit(RQ_LOCAL, &rqstp->rq_flags);
 
 	p = (__be32 *)rqstp->rq_arg.head[0].iov_base;
 	calldir = p[1];
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -602,7 +602,7 @@ static int svc_rdma_has_wspace(struct sv
 
 static void svc_rdma_secure_port(struct svc_rqst *rqstp)
 {
-	__set_bit(RQ_SECURE, &rqstp->rq_flags);
+	set_bit(RQ_SECURE, &rqstp->rq_flags);
 }
 
 static void svc_rdma_kill_temp_xprt(struct svc_xprt *xprt)
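Every hunk in the diff above swaps a double-underscore helper (`__set_bit()`, `__clear_bit()`) back to the plain one (`set_bit()`, `clear_bit()`). In the kernel the plain helpers are atomic read-modify-write operations, while the double-underscore variants are non-atomic and only safe when nothing else can modify the same word concurrently. The following is a minimal userspace sketch of that distinction using C11 atomics. It is not kernel code, and the `demo_*` names and flag numbers are hypothetical, chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag numbers, loosely modelled on the RQ_* bits in the diff. */
enum { DEMO_SECURE = 0, DEMO_SPLICE_OK = 1, DEMO_DROPME = 2 };

/*
 * Non-atomic variants (the role __set_bit()/__clear_bit() play in the
 * kernel): a plain load, modify, store. A concurrent writer to the same
 * word can have its update lost between the load and the store.
 */
static inline void demo_set_bit_nonatomic(unsigned int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

static inline void demo_clear_bit_nonatomic(unsigned int nr, unsigned long *flags)
{
	*flags &= ~(1UL << nr);
}

/*
 * Atomic variants (the role set_bit()/clear_bit() play in the kernel):
 * the whole read-modify-write happens as one indivisible step.
 */
static inline void demo_set_bit_atomic(unsigned int nr, atomic_ulong *flags)
{
	atomic_fetch_or(flags, 1UL << nr);
}

static inline void demo_clear_bit_atomic(unsigned int nr, atomic_ulong *flags)
{
	atomic_fetch_and(flags, ~(1UL << nr));
}

static inline bool demo_test_bit(unsigned int nr, atomic_ulong *flags)
{
	return (atomic_load(flags) >> nr) & 1UL;
}
```

Single-threaded, both variants produce the same result. They differ only when two contexts update different bits of the same flags word at the same time.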
From: Linus Torvalds torvalds@linux-foundation.org
commit 5a41237ad1d4b62008f93163af1d9b1da90729d8 upstream.
We had already disabled this warning for gcc-12 due to bugs in the value range analysis, but it turns out we end up having some similar problems with gcc-11.3 too, so let's disable it there too.
Older gcc versions end up being increasingly less relevant, and hopefully clang and newer version of gcc (ie gcc-13) end up working reliably enough that we still get the build coverage even when we disable this for some versions.
Link: https://lore.kernel.org/all/20221227002941.GA2691687@roeck-us.net/
Link: https://lore.kernel.org/all/D8BDBF66-E44C-45D4-9758-BAAA4F0C1998@kernel.org/
Cc: Kees Cook kees@kernel.org
Cc: Vlastimil Babka vbabka@suse.cz
Cc: Guenter Roeck linux@roeck-us.net
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 init/Kconfig | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/init/Kconfig
+++ b/init/Kconfig
@@ -892,13 +892,17 @@ config CC_IMPLICIT_FALLTHROUGH
 	default "-Wimplicit-fallthrough=5" if CC_IS_GCC && $(cc-option,-Wimplicit-fallthrough=5)
 	default "-Wimplicit-fallthrough" if CC_IS_CLANG && $(cc-option,-Wunreachable-code-fallthrough)
 
-# Currently, disable gcc-12 array-bounds globally.
+# Currently, disable gcc-11,12 array-bounds globally.
 # We may want to target only particular configurations some day.
+config GCC11_NO_ARRAY_BOUNDS
+	def_bool y
+
 config GCC12_NO_ARRAY_BOUNDS
 	def_bool y
 
 config CC_NO_ARRAY_BOUNDS
 	bool
+	default y if CC_IS_GCC && GCC_VERSION >= 110000 && GCC_VERSION < 120000 && GCC11_NO_ARRAY_BOUNDS
 	default y if CC_IS_GCC && GCC_VERSION >= 120000 && GCC_VERSION < 130000 && GCC12_NO_ARRAY_BOUNDS
 
 #
On 12. 01. 23, 14:56, Greg Kroah-Hartman wrote:
> From: Linus Torvalds torvalds@linux-foundation.org
>
> commit 5a41237ad1d4b62008f93163af1d9b1da90729d8 upstream.
>
> We had already disabled this warning for gcc-12 due to bugs in the value range analysis, but it turns out we end up having some similar problems with gcc-11.3 too, so let's disable it there too.
>
> Older gcc versions end up being increasingly less relevant, and hopefully clang and newer version of gcc (ie gcc-13) end up working reliably enough that we still get the build coverage even when we disable this for some versions.
No, this did not happen, and I sent a patch for gcc 13 a long time ago (it was not applied): https://lore.kernel.org/all/20221031114212.10266-1-jirislaby@kernel.org/
So should we simply make it CC_IS_GCC && GCC_VERSION >= 110000 until this gets resolved eventually?
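To make the difference between the patch and the suggestion in the reply concrete, here is a small sketch of the two version checks. It assumes the usual `GCC_VERSION` encoding (major * 10000 + minor * 100 + patchlevel) and ignores the other terms of the Kconfig conditions (`CC_IS_GCC` and the `GCC11_NO_ARRAY_BOUNDS` / `GCC12_NO_ARRAY_BOUNDS` symbols), treating them as true. The `demo_*` function names are hypothetical.

```c
#include <stdbool.h>

/* Same encoding as the GCC_VERSION value tested in the Kconfig hunk,
 * e.g. gcc 11.3.0 -> 110300. */
static int demo_gcc_version(int major, int minor, int patch)
{
	return major * 10000 + minor * 100 + patch;
}

/* Ranges as written in the patch: gcc-11 and gcc-12 only. */
static bool demo_no_array_bounds_patch(int v)
{
	return (v >= 110000 && v < 120000) ||
	       (v >= 120000 && v < 130000);
}

/* Open-ended check suggested in the reply: everything from gcc-11 on. */
static bool demo_no_array_bounds_open(int v)
{
	return v >= 110000;
}
```

The two predicates agree for gcc-11 and gcc-12 and for anything older. They differ only from gcc-13 onward, where the patch as posted leaves the warning enabled and the open-ended form would disable it.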
From: Frederick Lawler fred@cloudflare.com
commit 96398560f26aa07e8f2969d73c8197e6a6d10407 upstream.
While experimenting with applying noqueue to a classful queue discipline, we discovered a NULL pointer dereference in the __dev_queue_xmit() path that generates a kernel OOPS:
# dev=enp0s5
# tc qdisc replace dev $dev root handle 1: htb default 1
# tc class add dev $dev parent 1: classid 1:1 htb rate 10mbit
# tc qdisc add dev $dev parent 1:1 handle 10: noqueue
# ping -I $dev -w 1 -c 1 1.1.1.1

[ 2.172856] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 2.173217] #PF: supervisor instruction fetch in kernel mode
...
[ 2.178451] Call Trace:
[ 2.178577] <TASK>
[ 2.178686] htb_enqueue+0x1c8/0x370
[ 2.178880] dev_qdisc_enqueue+0x15/0x90
[ 2.179093] __dev_queue_xmit+0x798/0xd00
[ 2.179305] ? _raw_write_lock_bh+0xe/0x30
[ 2.179522] ? __local_bh_enable_ip+0x32/0x70
[ 2.179759] ? ___neigh_create+0x610/0x840
[ 2.179968] ? eth_header+0x21/0xc0
[ 2.180144] ip_finish_output2+0x15e/0x4f0
[ 2.180348] ? dst_output+0x30/0x30
[ 2.180525] ip_push_pending_frames+0x9d/0xb0
[ 2.180739] raw_sendmsg+0x601/0xcb0
[ 2.180916] ? _raw_spin_trylock+0xe/0x50
[ 2.181112] ? _raw_spin_unlock_irqrestore+0x16/0x30
[ 2.181354] ? get_page_from_freelist+0xcd6/0xdf0
[ 2.181594] ? sock_sendmsg+0x56/0x60
[ 2.181781] sock_sendmsg+0x56/0x60
[ 2.181958] __sys_sendto+0xf7/0x160
[ 2.182139] ? handle_mm_fault+0x6e/0x1d0
[ 2.182366] ? do_user_addr_fault+0x1e1/0x660
[ 2.182627] __x64_sys_sendto+0x1b/0x30
[ 2.182881] do_syscall_64+0x38/0x90
[ 2.183085] entry_SYSCALL_64_after_hwframe+0x63/0xcd
...
[ 2.187402] </TASK>
Previously in commit d66d6c3152e8 ("net: sched: register noqueue qdisc"), NULL was set for the noqueue discipline on noqueue init so that __dev_queue_xmit() falls through for the noqueue case. This also sets a bypass of the enqueue NULL check in the register_qdisc() function for the struct noqueue_disc_ops.
Classful queue disciplines make it past the NULL check in __dev_queue_xmit() because the discipline is set to htb (in this case), and then in the call to __dev_xmit_skb(), it calls into htb_enqueue() which grabs a leaf node for a class and then calls qdisc_enqueue() by passing in a queue discipline which assumes ->enqueue() is not set to NULL.
Fix this by not allowing classes to be assigned to the noqueue discipline. Linux TC Notes states that classes cannot be set to the noqueue discipline. [1] Let's enforce that here.
Links: 1. https://linux-tc-notes.sourceforge.net/tc/doc/sch_noqueue.txt
Fixes: d66d6c3152e8 ("net: sched: register noqueue qdisc")
Cc: stable@vger.kernel.org
Signed-off-by: Frederick Lawler fred@cloudflare.com
Reviewed-by: Jakub Sitnicki jakub@cloudflare.com
Link: https://lore.kernel.org/r/20230109163906.706000-1-fred@cloudflare.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/sched/sch_api.c | 5 +++++
 1 file changed, 5 insertions(+)

--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1132,6 +1132,11 @@ skip:
 			return -ENOENT;
 		}
 
+		if (new && new->ops == &noqueue_qdisc_ops) {
+			NL_SET_ERR_MSG(extack, "Cannot assign noqueue to a class");
+			return -EINVAL;
+		}
+
 		err = cops->graft(parent, cl, new, &old, extack);
 		if (err)
 			return err;
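The failure mode and the fix described in this commit can be modelled in a few lines of userspace C. This is only a sketch under simplifying assumptions, not the kernel's data structures: every `demo_*` type and function is a hypothetical, much-reduced stand-in. It shows a classful parent that calls its leaf's `->enqueue()` unconditionally, and a graft step that refuses a leaf whose ops are the no-enqueue discipline.

```c
#include <errno.h>
#include <stddef.h>

struct demo_qdisc;

/* Hypothetical, much-reduced stand-in for a qdisc ops table. */
struct demo_qdisc_ops {
	const char *id;
	int (*enqueue)(struct demo_qdisc *q, int pkt);
};

struct demo_qdisc {
	const struct demo_qdisc_ops *ops;
	int backlog;
};

static int demo_fifo_enqueue(struct demo_qdisc *q, int pkt)
{
	(void)pkt;
	q->backlog++;
	return 0;
}

static const struct demo_qdisc_ops demo_fifo_ops = { "fifo", demo_fifo_enqueue };
/* Like noqueue in the commit message: no enqueue handler at all. */
static const struct demo_qdisc_ops demo_noqueue_ops = { "noqueue", NULL };

struct demo_class {
	struct demo_qdisc *leaf;
};

/* Mirrors the shape of the added check: refuse to attach noqueue to a class. */
static int demo_graft(struct demo_class *cl, struct demo_qdisc *new)
{
	if (new && new->ops == &demo_noqueue_ops)
		return -EINVAL;
	cl->leaf = new;
	return 0;
}

/*
 * Classful parent forwarding to its leaf. It calls ->enqueue without a
 * NULL check, which is why a leaf with a NULL handler must be rejected
 * at graft time rather than discovered here.
 */
static int demo_class_enqueue(struct demo_class *cl, int pkt)
{
	return cl->leaf->ops->enqueue(cl->leaf, pkt);
}
```

Rejecting the assignment at configuration time keeps the transmit path free of an extra NULL check, which matches the approach the patch takes.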
Hey Greg,
Looks like not much in it that's of concern for RISC-V, but FWIW:
Tested-by: Conor Dooley conor.dooley@microchip.com

Thanks,
Conor.
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks,
-- Shuah
No regressions found on WSL x86_64 or WSL arm64
Built, booted, and reviewed dmesg.
Thank you.
Tested-by: Kelsey Steele kelseysteele@linux.microsoft.com
Hi Greg
6.1.6-rc1 tested.
x86_64
build successfully completed
boot successfully completed

Lenovo ThinkPad X1 Nano Gen1 (Intel i5-1130G7, Arch Linux)
Thanks
Tested-by: Takeshi Ogasawara takeshi.ogasawara@futuring-girl.com
Build results:
	total: 155 pass: 155 fail: 0
Qemu test results:
	total: 500 pass: 500 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and powerpc (ps3_defconfig, GCC 12.2.0).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
Hi Greg,
Build test (gcc version 12.2.1 20221127):
mips: 52 configs -> no failure
arm: 100 configs -> no failure
arm64: 3 configs -> no failure
x86_64: 4 configs -> no failure
alpha allmodconfig -> no failure
csky allmodconfig -> no failure
powerpc allmodconfig -> no failure
riscv allmodconfig -> no failure
s390 allmodconfig -> no failure
xtensa allmodconfig -> no failure

Boot test:
x86_64: Booted on my test laptop. No regression.
x86_64: Booted on qemu. No regression. [1]
arm64: Booted on rpi4b (4GB model). No regression. [2]
mips: Booted on ci20 board. No regression. [3]

[1]. https://openqa.qa.codethink.co.uk/tests/2630
[2]. https://openqa.qa.codethink.co.uk/tests/2631
[3]. https://openqa.qa.codethink.co.uk/tests/2634
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build
* kernel: 6.1.6-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-6.1.y
* git commit: 5eedeabf82ee83045c63fd32b2473e1d61885204
* git describe: v6.1.5-11-g5eedeabf82ee
* test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/build/v6.1.5-...
## Test Regressions (compared to v6.1.4-160-g06bcfb15cd3b)
## Metric Regressions (compared to v6.1.4-160-g06bcfb15cd3b)
## Test Fixes (compared to v6.1.4-160-g06bcfb15cd3b)
## Metric Fixes (compared to v6.1.4-160-g06bcfb15cd3b)
## Test result summary
total: 183886, pass: 153292, fail: 5185, skip: 25382, xfail: 27

## Build Summary
* arc: 5 total, 5 passed, 0 failed
* arm: 151 total, 148 passed, 3 failed
* arm64: 51 total, 50 passed, 1 failed
* i386: 39 total, 36 passed, 3 failed
* mips: 30 total, 28 passed, 2 failed
* parisc: 8 total, 8 passed, 0 failed
* powerpc: 38 total, 32 passed, 6 failed
* riscv: 16 total, 15 passed, 1 failed
* s390: 16 total, 14 passed, 2 failed
* sh: 14 total, 12 passed, 2 failed
* sparc: 8 total, 8 passed, 0 failed
* x86_64: 44 total, 44 passed, 0 failed
## Test suites summary
* boot
* fwts
* igt-gpu-tools
* kselftest-android
* kselftest-arm64
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-drivers-dma-buf
* kselftest-efivarfs
* kselftest-filesystems
* kselftest-filesystems-binderfs
* kselftest-firmware
* kselftest-fpu
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-kvm
* kselftest-lib
* kselftest-livepatch
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net-forwarding
* kselftest-net-mptcp
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-vm
* kselftest-x86
* kselftest-zram
* kunit
* kvm-unit-tests
* libgpiod
* libhugetlbfs
* log-parser-boot
* log-parser-test
* ltp-cap_bounds
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-filecaps
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-fsx
* ltp-hugetlb
* ltp-io
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-open-posix-tests
* ltp-pty
* ltp-sched
* ltp-securebits
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* network-basic-tests
* packetdrill
* perf
* rcutorture
* v4l2-compliance
* vdso
--
Linaro LKFT
https://lkft.linaro.org
Compiled and booted on my x86_64 and ARM64 test systems. No errors or regressions.
Tested-by: Allen Pais apais@linux.microsoft.com
Thanks.
Hi Greg,
6.1.6-rc1 tested.
Run tested on:
- Allwinner H6 (Tanix TX6)
- Intel Alder Lake x86_64 (nuc12 i7-1260P)

In addition - build tested for:
- Allwinner A64
- Allwinner H3
- Allwinner H5
- NXP iMX6
- NXP iMX8
- Qualcomm Dragonboard
- Rockchip RK3288
- Rockchip RK3328
- Rockchip RK3399pro
- Samsung Exynos

Tested-by: Rudi Heitbaum rudi@heitbaum.com
--
Rudi