Because Linux credentials are managed per thread, user space relies on some hack to synchronize credential update across threads from the same process. This is required by the Native POSIX Threads Library and implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to synchronize threads. See nptl(7) and libpsx(3). Furthermore, some runtimes like Go do not enable developers to have control over threads [1].
To avoid potential issues, and because threads are not security boundaries, let's relax the Landlock (optional) signal scoping to always allow signals sent between threads of the same process. This exception is similar to the __ptrace_may_access() one.
hook_file_set_fowner() now checks if the target task is part of the same process as the caller. If this is the case, then the related signal triggered by the socket will always be allowed.
Scoping of abstract UNIX sockets is not changed because kernel objects (e.g. sockets) should be tied to their creator's domain at creation time.
Note that creating one Landlock domain per thread puts each of these threads (and their future children) in their own scope, which is probably not what users expect, especially in Go where we do not control threads. However, being able to drop permissions on all threads should not be restricted by signal scoping. We are working on a way to make it possible to atomically restrict all threads of a process with the same domain [2].
Add erratum for signal scoping.
Closes: https://github.com/landlock-lsm/go-landlock/issues/36 Fixes: 54a6e6bbf3be ("landlock: Add signal scoping") Fixes: c8994965013e ("selftests/landlock: Test signal scoping for threads") Depends-on: 26f204380a3c ("fs: Fix file_set_fowner LSM hook inconsistencies") Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1] Link: https://github.com/landlock-lsm/linux/issues/2 [2] Cc: Günther Noack gnoack@google.com Cc: Paul Moore paul@paul-moore.com Cc: Serge Hallyn serge@hallyn.com Cc: Tahera Fahimi fahimitahera@gmail.com Cc: stable@vger.kernel.org Acked-by: Christian Brauner brauner@kernel.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net ---
Changes since v1: - Add Acked-by Christian. - Add Landlock erratum. - Update subject. --- security/landlock/errata/abi-6.h | 19 ++++++++++++++++ security/landlock/fs.c | 22 +++++++++++++++---- security/landlock/task.c | 12 ++++++++++ .../selftests/landlock/scoped_signal_test.c | 2 +- 4 files changed, 50 insertions(+), 5 deletions(-) create mode 100644 security/landlock/errata/abi-6.h
diff --git a/security/landlock/errata/abi-6.h b/security/landlock/errata/abi-6.h new file mode 100644 index 000000000000..df7bc0e1fdf4 --- /dev/null +++ b/security/landlock/errata/abi-6.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +/** + * DOC: erratum_2 + * + * Erratum 2: Scoped signal handling + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * + * This fix addresses an issue where signal scoping was overly restrictive, + * preventing sandboxed threads from signaling other threads within the same + * process if they belonged to different domains. Because threads are not + * security boundaries, user space might assume that any thread within the same + * process can send signals between themselves (see :manpage:`nptl(7)` and + * :manpage:`libpsx(3)`). Consistent with :manpage:`ptrace(2)` behavior, direct + * interaction between threads of the same process should always be allowed. + * This change ensures that any thread is allowed to send signals to any other + * thread within the same process, regardless of their domain. + */ +LANDLOCK_ERRATUM(2) diff --git a/security/landlock/fs.c b/security/landlock/fs.c index 71b9dc331aae..47c862fe14e4 100644 --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -27,7 +27,9 @@ #include <linux/mount.h> #include <linux/namei.h> #include <linux/path.h> +#include <linux/pid.h> #include <linux/rcupdate.h> +#include <linux/sched/signal.h> #include <linux/spinlock.h> #include <linux/stat.h> #include <linux/types.h> @@ -1630,15 +1632,27 @@ static int hook_file_ioctl_compat(struct file *file, unsigned int cmd,
static void hook_file_set_fowner(struct file *file) { - struct landlock_ruleset *new_dom, *prev_dom; + struct fown_struct *fown = file_f_owner(file); + struct landlock_ruleset *new_dom = NULL; + struct landlock_ruleset *prev_dom; + struct task_struct *p;
/* * Lock already held by __f_setown(), see commit 26f204380a3c ("fs: Fix * file_set_fowner LSM hook inconsistencies"). */ - lockdep_assert_held(&file_f_owner(file)->lock); - new_dom = landlock_get_current_domain(); - landlock_get_ruleset(new_dom); + lockdep_assert_held(&fown->lock); + + /* + * Always allow sending signals between threads of the same process. This + * ensures consistency with hook_task_kill(). + */ + p = pid_task(fown->pid, fown->pid_type); + if (!same_thread_group(p, current)) { + new_dom = landlock_get_current_domain(); + landlock_get_ruleset(new_dom); + } + prev_dom = landlock_file(file)->fown_domain; landlock_file(file)->fown_domain = new_dom;
diff --git a/security/landlock/task.c b/security/landlock/task.c index dc7dab78392e..4578ce6e319d 100644 --- a/security/landlock/task.c +++ b/security/landlock/task.c @@ -13,6 +13,7 @@ #include <linux/lsm_hooks.h> #include <linux/rcupdate.h> #include <linux/sched.h> +#include <linux/sched/signal.h> #include <net/af_unix.h> #include <net/sock.h>
@@ -264,6 +265,17 @@ static int hook_task_kill(struct task_struct *const p, /* Dealing with USB IO. */ dom = landlock_cred(cred)->domain; } else { + /* + * Always allow sending signals between threads of the same process. + * This is required for process credential changes by the Native POSIX + * Threads Library and implemented by the set*id(2) wrappers and + * libcap(3) with tgkill(2). See nptl(7) and libpsx(3). + * + * This exception is similar to the __ptrace_may_access() one. + */ + if (same_thread_group(p, current)) + return 0; + dom = landlock_get_current_domain(); } dom = landlock_get_applicable_domain(dom, signal_scope); diff --git a/tools/testing/selftests/landlock/scoped_signal_test.c b/tools/testing/selftests/landlock/scoped_signal_test.c index 475ee62a832d..767f117703b7 100644 --- a/tools/testing/selftests/landlock/scoped_signal_test.c +++ b/tools/testing/selftests/landlock/scoped_signal_test.c @@ -281,7 +281,7 @@ TEST(signal_scoping_threads) /* Restricts the domain after creating the first thread. */ create_scoped_domain(_metadata, LANDLOCK_SCOPE_SIGNAL);
- ASSERT_EQ(EPERM, pthread_kill(no_sandbox_thread, 0)); + ASSERT_EQ(0, pthread_kill(no_sandbox_thread, 0)); ASSERT_EQ(1, write(thread_pipe[1], ".", 1));
ASSERT_EQ(0, pthread_create(&scoped_thread, NULL, thread_func, NULL));
On Tue, Mar 18, 2025 at 05:14:40PM +0100, Mickaël Salaün wrote:
Because Linux credentials are managed per thread, user space relies on some hack to synchronize credential update across threads from the same process. This is required by the Native POSIX Threads Library and implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to synchronize threads. See nptl(7) and libpsx(3). Furthermore, some runtimes like Go do not enable developers to have control over threads [1].
To avoid potential issues, and because threads are not security boundaries, let's relax the Landlock (optional) signal scoping to always allow signals sent between threads of the same process. This exception is similar to the __ptrace_may_access() one.
hook_file_set_fowner() now checks if the target task is part of the same process as the caller. If this is the case, then the related signal triggered by the socket will always be allowed.
Scoping of abstract UNIX sockets is not changed because kernel objects (e.g. sockets) should be tied to their creator's domain at creation time.
Note that creating one Landlock domain per thread puts each of these threads (and their future children) in their own scope, which is probably not what users expect, especially in Go where we do not control threads. However, being able to drop permissions on all threads should not be restricted by signal scoping. We are working on a way to make it possible to atomically restrict all threads of a process with the same domain [2].
Add erratum for signal scoping.
Closes: https://github.com/landlock-lsm/go-landlock/issues/36 Fixes: 54a6e6bbf3be ("landlock: Add signal scoping") Fixes: c8994965013e ("selftests/landlock: Test signal scoping for threads") Depends-on: 26f204380a3c ("fs: Fix file_set_fowner LSM hook inconsistencies") Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1] Link: https://github.com/landlock-lsm/linux/issues/2 [2] Cc: Günther Noack gnoack@google.com Cc: Paul Moore paul@paul-moore.com Cc: Serge Hallyn serge@hallyn.com Cc: Tahera Fahimi fahimitahera@gmail.com Cc: stable@vger.kernel.org Acked-by: Christian Brauner brauner@kernel.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net
index 71b9dc331aae..47c862fe14e4 100644 --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -27,7 +27,9 @@ #include <linux/mount.h> #include <linux/namei.h> #include <linux/path.h> +#include <linux/pid.h> #include <linux/rcupdate.h> +#include <linux/sched/signal.h> #include <linux/spinlock.h> #include <linux/stat.h> #include <linux/types.h> @@ -1630,15 +1632,27 @@ static int hook_file_ioctl_compat(struct file *file, unsigned int cmd, static void hook_file_set_fowner(struct file *file) {
- struct landlock_ruleset *new_dom, *prev_dom;
- struct fown_struct *fown = file_f_owner(file);
- struct landlock_ruleset *new_dom = NULL;
- struct landlock_ruleset *prev_dom;
- struct task_struct *p;
/* * Lock already held by __f_setown(), see commit 26f204380a3c ("fs: Fix * file_set_fowner LSM hook inconsistencies"). */
- lockdep_assert_held(&file_f_owner(file)->lock);
- new_dom = landlock_get_current_domain();
- landlock_get_ruleset(new_dom);
- lockdep_assert_held(&fown->lock);
- /*
* Always allow sending signals between threads of the same process. This
* ensures consistency with hook_task_kill().
*/
- p = pid_task(fown->pid, fown->pid_type);
- if (!same_thread_group(p, current)) {
There is a missing pointer check. I'll apply this:
- if (!same_thread_group(p, current)) { + if (!p || !same_thread_group(p, current)) {
new_dom = landlock_get_current_domain();
landlock_get_ruleset(new_dom);
- }
- prev_dom = landlock_file(file)->fown_domain; landlock_file(file)->fown_domain = new_dom;
Hello,
kernel test robot noticed "Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN" on:
commit: b9fb5bfdb361fd6d945c09c93d351740310a26c7 ("[PATCH v2 5/8] landlock: Always allow signals between threads of the same process") url: https://github.com/intel-lab-lkp/linux/commits/Micka-l-Sala-n/landlock-Move-... patch link: https://lore.kernel.org/all/20250318161443.279194-6-mic@digikod.net/ patch subject: [PATCH v2 5/8] landlock: Always allow signals between threads of the same process
in testcase: trinity version: trinity-x86_64-ba2360ed-1_20241228 with following parameters:
runtime: 300s group: group-03 nr_groups: 5
config: x86_64-randconfig-005-20250325 compiler: gcc-12 test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
we noticed the issue happens randomly (35 times out of 200 runs as below). but parent keeps clean.
37897789c51dd898 b9fb5bfdb361fd6d945c09c93d3 ---------------- --------------------------- fail:runs %reproduction fail:runs | | | :200 18% 35:200 dmesg.KASAN:null-ptr-deref_in_range[#-#] :200 18% 35:200 dmesg.Kernel_panic-not_syncing:Fatal_exception :200 18% 35:200 dmesg.Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN :200 18% 35:200 dmesg.RIP:hook_file_set_fowner
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot oliver.sang@intel.com | Closes: https://lore.kernel.org/oe-lkp/202503261534.22d970e8-lkp@intel.com
[ 354.738531][ T222] [ 355.199494][ T222] [main] 2245715 iterations. [F:1644455 S:601688 HI:11581] [ 355.199514][ T222] [ 355.934630][ T222] [main] 2273938 iterations. [F:1665198 S:609188 HI:11581] [ 355.934650][ T222] [ 356.308897][ T3147] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000151: 0000 [#1] SMP KASAN [ 356.309510][ T3147] KASAN: null-ptr-deref in range [0x0000000000000a88-0x0000000000000a8f] [ 356.309910][ T3147] CPU: 1 UID: 65534 PID: 3147 Comm: trinity-c2 Not tainted 6.14.0-rc5-00005-gb9fb5bfdb361 #1 145c38dc5407add8933da653ccf9cf31d58da93c [ 356.310560][ T3147] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 356.311050][ T3147] RIP: 0010:hook_file_set_fowner (kbuild/src/consumer/include/linux/sched/signal.h:707 (discriminator 9) kbuild/src/consumer/security/landlock/fs.c:1651 (discriminator 9)) [ 356.311345][ T3147] Code: 49 8b 7c 24 50 65 4c 8b 25 e7 e4 0c 7e e8 52 63 33 ff 48 ba 00 00 00 00 00 fc ff df 48 8d b8 88 0a 00 00 48 89 f9 48 c1 e9 03 <80> 3c 11 00 0f 85 7e 02 00 00 49 8d bc 24 88 0a 00 00 4c 8b a8 88 All code ======== 0: 49 8b 7c 24 50 mov 0x50(%r12),%rdi 5: 65 4c 8b 25 e7 e4 0c mov %gs:0x7e0ce4e7(%rip),%r12 # 0x7e0ce4f4 c: 7e d: e8 52 63 33 ff call 0xffffffffff336364 12: 48 ba 00 00 00 00 00 movabs $0xdffffc0000000000,%rdx 19: fc ff df 1c: 48 8d b8 88 0a 00 00 lea 0xa88(%rax),%rdi 23: 48 89 f9 mov %rdi,%rcx 26: 48 c1 e9 03 shr $0x3,%rcx 2a:* 80 3c 11 00 cmpb $0x0,(%rcx,%rdx,1) <-- trapping instruction 2e: 0f 85 7e 02 00 00 jne 0x2b2 34: 49 8d bc 24 88 0a 00 lea 0xa88(%r12),%rdi 3b: 00 3c: 4c rex.WR 3d: 8b .byte 0x8b 3e: a8 88 test $0x88,%al
Code starting with the faulting instruction =========================================== 0: 80 3c 11 00 cmpb $0x0,(%rcx,%rdx,1) 4: 0f 85 7e 02 00 00 jne 0x288 a: 49 8d bc 24 88 0a 00 lea 0xa88(%r12),%rdi 11: 00 12: 4c rex.WR 13: 8b .byte 0x8b 14: a8 88 test $0x88,%al [ 356.312254][ T3147] RSP: 0018:ffffc9000883fd20 EFLAGS: 00010002 [ 356.312556][ T3147] RAX: 0000000000000000 RBX: ffff88816ee4c700 RCX: 0000000000000151 [ 356.312933][ T3147] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000a88 [ 356.313310][ T3147] RBP: ffffc9000883fd48 R08: 0000000000000000 R09: 0000000000000000 [ 356.313687][ T3147] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88814f0c8000 [ 356.314063][ T3147] R13: ffff88814f92b700 R14: ffff888161e71450 R15: ffff888161e71408 [ 356.314440][ T3147] FS: 00007f3c72136740(0000) GS:ffff8883af000000(0000) knlGS:0000000000000000 [ 356.314879][ T3147] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 356.315194][ T3147] CR2: 00007f3c708bd000 CR3: 0000000165606000 CR4: 00000000000406f0 [ 356.315573][ T3147] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 356.315950][ T3147] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 356.316334][ T3147] Call Trace: [ 356.316498][ T3147] <TASK> [ 356.316645][ T3147] ? show_regs (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:479) [ 356.316859][ T3147] ? die_addr (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:421 kbuild/src/consumer/arch/x86/kernel/dumpstack.c:460) [ 356.317066][ T3147] ? exc_general_protection (kbuild/src/consumer/arch/x86/kernel/traps.c:751 kbuild/src/consumer/arch/x86/kernel/traps.c:693) [ 356.317349][ T3147] ? asm_exc_general_protection (kbuild/src/consumer/arch/x86/include/asm/idtentry.h:574)
The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20250326/202503261534.22d970e8-lkp@i...
Thanks for the report.
I assumed __f_setown() was only called in an RCU read-side critical section but that's not the case (e.g. fcntl_dirnotify). I moved the pid_task() call in a dedicated function to protect it with an RCU guard. Here is the new hunk:
@@ -1628,21 +1630,46 @@ static int hook_file_ioctl_compat(struct file *file, unsigned int cmd, return -EACCES; }
-static void hook_file_set_fowner(struct file *file) +/* + * Always allow sending signals between threads of the same process. This + * ensures consistency with hook_task_kill(). + */ +static bool control_current_fowner(struct fown_struct *const fown) { - struct landlock_ruleset *new_dom, *prev_dom; + struct task_struct *p;
/* * Lock already held by __f_setown(), see commit 26f204380a3c ("fs: Fix * file_set_fowner LSM hook inconsistencies"). */ - lockdep_assert_held(&file_f_owner(file)->lock); - new_dom = landlock_get_current_domain(); - landlock_get_ruleset(new_dom); + lockdep_assert_held(&fown->lock); + + /* + * Some callers (e.g. fcntl_dirnotify) may not be in an RCU read-side + * critical section. + */ + guard(rcu)(); + p = pid_task(fown->pid, fown->pid_type); + if (!p) + return true; + + return !same_thread_group(p, current); +} + +static void hook_file_set_fowner(struct file *file) +{ + struct landlock_ruleset *prev_dom; + struct landlock_ruleset *new_dom = NULL; + + if (control_current_fowner(file_f_owner(file))) { + new_dom = landlock_get_current_domain(); + landlock_get_ruleset(new_dom); + } + prev_dom = landlock_file(file)->fown_domain; landlock_file(file)->fown_domain = new_dom;
- /* Called in an RCU read-side critical section. */ + /* May be called in an RCU read-side critical section. */ landlock_put_ruleset_deferred(prev_dom); }
This other report gives more details: https://lore.kernel.org/all/202503261510.f9652c11-lkp@intel.com/
On Wed, Mar 26, 2025 at 03:51:50PM +0800, kernel test robot wrote:
Hello,
kernel test robot noticed "Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN" on:
commit: b9fb5bfdb361fd6d945c09c93d351740310a26c7 ("[PATCH v2 5/8] landlock: Always allow signals between threads of the same process") url: https://github.com/intel-lab-lkp/linux/commits/Micka-l-Sala-n/landlock-Move-... patch link: https://lore.kernel.org/all/20250318161443.279194-6-mic@digikod.net/ patch subject: [PATCH v2 5/8] landlock: Always allow signals between threads of the same process
in testcase: trinity version: trinity-x86_64-ba2360ed-1_20241228 with following parameters:
runtime: 300s group: group-03 nr_groups: 5
config: x86_64-randconfig-005-20250325 compiler: gcc-12 test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
we noticed the issue happens randomly (35 times out of 200 runs as below). but parent keeps clean.
37897789c51dd898 b9fb5bfdb361fd6d945c09c93d3
fail:runs %reproduction fail:runs | | | :200 18% 35:200 dmesg.KASAN:null-ptr-deref_in_range[#-#] :200 18% 35:200 dmesg.Kernel_panic-not_syncing:Fatal_exception :200 18% 35:200 dmesg.Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN :200 18% 35:200 dmesg.RIP:hook_file_set_fowner
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot oliver.sang@intel.com | Closes: https://lore.kernel.org/oe-lkp/202503261534.22d970e8-lkp@intel.com
[ 354.738531][ T222] [ 355.199494][ T222] [main] 2245715 iterations. [F:1644455 S:601688 HI:11581] [ 355.199514][ T222] [ 355.934630][ T222] [main] 2273938 iterations. [F:1665198 S:609188 HI:11581] [ 355.934650][ T222] [ 356.308897][ T3147] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000151: 0000 [#1] SMP KASAN [ 356.309510][ T3147] KASAN: null-ptr-deref in range [0x0000000000000a88-0x0000000000000a8f] [ 356.309910][ T3147] CPU: 1 UID: 65534 PID: 3147 Comm: trinity-c2 Not tainted 6.14.0-rc5-00005-gb9fb5bfdb361 #1 145c38dc5407add8933da653ccf9cf31d58da93c [ 356.310560][ T3147] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 356.311050][ T3147] RIP: 0010:hook_file_set_fowner (kbuild/src/consumer/include/linux/sched/signal.h:707 (discriminator 9) kbuild/src/consumer/security/landlock/fs.c:1651 (discriminator 9)) [ 356.311345][ T3147] Code: 49 8b 7c 24 50 65 4c 8b 25 e7 e4 0c 7e e8 52 63 33 ff 48 ba 00 00 00 00 00 fc ff df 48 8d b8 88 0a 00 00 48 89 f9 48 c1 e9 03 <80> 3c 11 00 0f 85 7e 02 00 00 49 8d bc 24 88 0a 00 00 4c 8b a8 88 All code ======== 0: 49 8b 7c 24 50 mov 0x50(%r12),%rdi 5: 65 4c 8b 25 e7 e4 0c mov %gs:0x7e0ce4e7(%rip),%r12 # 0x7e0ce4f4 c: 7e d: e8 52 63 33 ff call 0xffffffffff336364 12: 48 ba 00 00 00 00 00 movabs $0xdffffc0000000000,%rdx 19: fc ff df 1c: 48 8d b8 88 0a 00 00 lea 0xa88(%rax),%rdi 23: 48 89 f9 mov %rdi,%rcx 26: 48 c1 e9 03 shr $0x3,%rcx 2a:* 80 3c 11 00 cmpb $0x0,(%rcx,%rdx,1) <-- trapping instruction 2e: 0f 85 7e 02 00 00 jne 0x2b2 34: 49 8d bc 24 88 0a 00 lea 0xa88(%r12),%rdi 3b: 00 3c: 4c rex.WR 3d: 8b .byte 0x8b 3e: a8 88 test $0x88,%al
Code starting with the faulting instruction
0: 80 3c 11 00 cmpb $0x0,(%rcx,%rdx,1) 4: 0f 85 7e 02 00 00 jne 0x288 a: 49 8d bc 24 88 0a 00 lea 0xa88(%r12),%rdi 11: 00 12: 4c rex.WR 13: 8b .byte 0x8b 14: a8 88 test $0x88,%al [ 356.312254][ T3147] RSP: 0018:ffffc9000883fd20 EFLAGS: 00010002 [ 356.312556][ T3147] RAX: 0000000000000000 RBX: ffff88816ee4c700 RCX: 0000000000000151 [ 356.312933][ T3147] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000a88 [ 356.313310][ T3147] RBP: ffffc9000883fd48 R08: 0000000000000000 R09: 0000000000000000 [ 356.313687][ T3147] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88814f0c8000 [ 356.314063][ T3147] R13: ffff88814f92b700 R14: ffff888161e71450 R15: ffff888161e71408 [ 356.314440][ T3147] FS: 00007f3c72136740(0000) GS:ffff8883af000000(0000) knlGS:0000000000000000 [ 356.314879][ T3147] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 356.315194][ T3147] CR2: 00007f3c708bd000 CR3: 0000000165606000 CR4: 00000000000406f0 [ 356.315573][ T3147] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 356.315950][ T3147] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 356.316334][ T3147] Call Trace: [ 356.316498][ T3147] <TASK> [ 356.316645][ T3147] ? show_regs (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:479) [ 356.316859][ T3147] ? die_addr (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:421 kbuild/src/consumer/arch/x86/kernel/dumpstack.c:460) [ 356.317066][ T3147] ? exc_general_protection (kbuild/src/consumer/arch/x86/kernel/traps.c:751 kbuild/src/consumer/arch/x86/kernel/traps.c:693) [ 356.317349][ T3147] ? asm_exc_general_protection (kbuild/src/consumer/arch/x86/include/asm/idtentry.h:574)
The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20250326/202503261534.22d970e8-lkp@i...
-- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
On Fri, Mar 28, 2025 at 05:26:32PM +0800, Oliver Sang wrote:
hi, Mickaël Salaün,
On Fri, Mar 28, 2025 at 09:31:15AM +0100, Mickaël Salaün wrote:
On Fri, Mar 28, 2025 at 02:05:59PM +0800, Oliver Sang wrote:
hi, Mickaël Salaün,
On Fri, Mar 28, 2025 at 11:00:37AM +0800, Oliver Sang wrote:
hi, Mickaël Salaün,
On Thu, Mar 27, 2025 at 07:41:08PM +0100, Mickaël Salaün wrote:
Hi Olivier,
I pushed an updated yesterday in linux-next that should fix this issue and this other issue too: https://lore.kernel.org/all/20250326.yee0ba6Yai3m@digikod.net/
Could you please confirm that these issues are really fixed? Or otherwise, please let me know when I should expect (or not) an email from kernel test robot. :)
ok, we've started the tests for both issues. upon the commit: db8da9da41bce (tag: next-20250327, linux-next/master) Add linux-next specific files for 20250327
sorry that due to unknown reason, we cannot build sucessfully upon db8da9da41bce for both tests (which are using randconfig). could you give us the commit-id of your fix? we could try test upon that fix commit again.
The new commit is 18eb75f3af40 ("landlock: Always allow signals between threads of the same process").
it turned out the build failure is due to my typo. shamed...
we finished tests still upon db8da9da41bce, for the WARNING:suspicious_RCU_usage issue in this report, we run the tests 20 times, cannot reproduce now.
for the random issues we reported in https://lore.kernel.org/all/202503261534.22d970e8-lkp@intel.com/ now we cannot reproduce it with db8da9da41bce by 500 runs.
Yes, that makes sense with my fix.
we think both issues are solved.
and since db8da9da41bce includes the 18eb75f3af40, we won't test again upon 18eb75f3af40. thanks!
Thanks for the confirmation!
if this is not the correct commit to check, please let us know. thanks!
Regards, Mickaël
On Wed, Mar 26, 2025 at 04:00:12PM +0800, kernel test robot wrote:
hi, Mickaël Salaün,
we just reported a random "Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN" issue in https://lore.kernel.org/all/202503261534.22d970e8-lkp@intel.com/
now we noticed this commit is also in linux-next/master.
we don't have enough knowledge to check the difference, but we found a persistent issue for this commit.
6d9ac5e4d70eba3e 9d65581539252fdb1666917a095
fail:runs %reproduction fail:runs | | | :6 100% 6:6 dmesg.WARNING:suspicious_RCU_usage :6 100% 6:6 dmesg.boot_failures :6 100% 6:6 dmesg.kernel/pid.c:#suspicious_rcu_dereference_check()usage
below full report FYI.
Hello,
kernel test robot noticed "WARNING:suspicious_RCU_usage" on:
commit: 9d65581539252fdb1666917a09549c13090fe9e5 ("landlock: Always allow signals between threads of the same process") https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
[test failed on linux-next/master eb4bc4b07f66f01618d9cb1aa4eaef59b1188415]
in testcase: trinity version: trinity-x86_64-ba2360ed-1_20241228 with following parameters:
runtime: 300s group: group-00 nr_groups: 5
config: x86_64-randconfig-101-20250325 compiler: gcc-12 test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot oliver.sang@intel.com | Closes: https://lore.kernel.org/oe-lkp/202503261510.f9652c11-lkp@intel.com
[ 166.893101][ T3747] WARNING: suspicious RCU usage [ 166.893462][ T3747] 6.14.0-rc5-00006-g9d6558153925 #1 Not tainted [ 166.893895][ T3747] ----------------------------- [ 166.894239][ T3747] kernel/pid.c:414 suspicious rcu_dereference_check() usage! [ 166.894747][ T3747] [ 166.894747][ T3747] other info that might help us debug this: [ 166.894747][ T3747] [ 166.895450][ T3747] [ 166.895450][ T3747] rcu_scheduler_active = 2, debug_locks = 1 [ 166.896030][ T3747] 3 locks held by trinity-c2/3747: [ 166.896415][ T3747] #0: ffff888114a5a930 (&group->mark_mutex){+.+.}-{4:4}, at: fcntl_dirnotify (include/linux/sched/mm.h:332 include/linux/sched/mm.h:386 include/linux/fsnotify_backend.h:279 fs/notify/dnotify/dnotify.c:329) [ 166.897165][ T3747] #1: ffff888148bbea60 (&mark->lock){+.+.}-{3:3}, at: fcntl_dirnotify (fs/notify/dnotify/dnotify.c:349) [ 166.897831][ T3747] #2: ffff888108a53220 (&f_owner->lock){....}-{3:3}, at: __f_setown (fs/fcntl.c:137) [ 166.898481][ T3747] [ 166.898481][ T3747] stack backtrace: [ 166.898901][ T3747] CPU: 0 UID: 65534 PID: 3747 Comm: trinity-c2 Not tainted 6.14.0-rc5-00006-g9d6558153925 #1 [ 166.898908][ T3747] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 166.898912][ T3747] Call Trace: [ 166.898916][ T3747] <TASK> [ 166.898921][ T3747] dump_stack_lvl (lib/dump_stack.c:123) [ 166.898932][ T3747] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6848) [ 166.898945][ T3747] pid_task (kernel/pid.c:414 (discriminator 11)) [ 166.898954][ T3747] hook_file_set_fowner (security/landlock/fs.c:1651 (discriminator 9)) [ 166.898963][ T3747] security_file_set_fowner (arch/x86/include/asm/atomic.h:23 (discriminator 4) include/linux/atomic/atomic-arch-fallback.h:457 (discriminator 4) include/linux/jump_label.h:262 (discriminator 4) security/security.c:3062 (discriminator 4)) [ 166.898969][ T3747] __f_setown (fs/fcntl.c:145) [ 166.898980][ T3747] fcntl_dirnotify (fs/notify/dnotify/dnotify.c:233 fs/notify/dnotify/dnotify.c:371) [ 166.898996][ T3747] do_fcntl (fs/fcntl.c:539) [ 166.899002][ T3747] ? f_getown (fs/fcntl.c:448) [ 166.899007][ T3747] ? check_prev_add (kernel/locking/lockdep.c:3862) [ 166.899011][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) [ 166.899023][ T3747] ? syscall_exit_to_user_mode (include/linux/entry-common.h:361 kernel/entry/common.c:220) [ 166.899038][ T3747] __x64_sys_fcntl (fs/fcntl.c:591 fs/fcntl.c:576 fs/fcntl.c:576) [ 166.899050][ T3747] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) [ 166.899062][ T3747] ? find_held_lock (kernel/locking/lockdep.c:5341) [ 166.899072][ T3747] ? __lock_release+0x10b/0x440 [ 166.899076][ T3747] ? __task_pid_nr_ns (include/linux/rcupdate.h:347 include/linux/rcupdate.h:880 kernel/pid.c:514) [ 166.899082][ T3747] ? reacquire_held_locks (kernel/locking/lockdep.c:5502) [ 166.899087][ T3747] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4470) [ 166.899093][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) [ 166.899099][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) [ 166.899111][ T3747] ? syscall_exit_to_user_mode (include/linux/entry-common.h:361 kernel/entry/common.c:220) [ 166.899119][ T3747] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:4409) [ 166.899124][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) [ 166.899129][ T3747] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4470) [ 166.899134][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) [ 166.899139][ T3747] ? do_int80_emulation (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/jump_label.h:262 arch/x86/entry/common.c:230) [ 166.899149][ T3747] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) [ 166.899155][ T3747] RIP: 0033:0x7f55ad007719 [ 166.899159][ T3747] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 06 0d 00 f7 d8 64 89 01 48 All code ======== 0: 08 89 e8 5b 5d c3 or %cl,-0x3ca2a418(%rcx) 6: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) d: 00 00 00 10: 90 nop 11: 48 89 f8 mov %rdi,%rax 14: 48 89 f7 mov %rsi,%rdi 17: 48 89 d6 mov %rdx,%rsi 1a: 48 89 ca mov %rcx,%rdx 1d: 4d 89 c2 mov %r8,%r10 20: 4d 89 c8 mov %r9,%r8 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d b7 06 0d 00 mov 0xd06b7(%rip),%rcx # 0xd06f1 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W
Code starting with the faulting instruction
0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d b7 06 0d 00 mov 0xd06b7(%rip),%rcx # 0xd06c7 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W [ 166.899164][ T3747] RSP: 002b:00007ffff6eefb48 EFLAGS: 00000246 ORIG_RAX: 0000000000000048 [ 166.899168][ T3747] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f55ad007719 [ 166.899172][ T3747] RDX: 0000000000000027 RSI: 0000000000000402 RDI: 0000000000000043 [ 166.899174][ T3747] RBP: 00007f55ab92f058 R08: 0000000099999999 R09: 00000000377dd000 [ 166.899177][ T3747] R10: 0000000084848484 R11: 0000000000000246 R12: 0000000000000048 [ 166.899180][ T3747] R13: 00007f55acf036c0 R14: 00007f55ab92f058 R15: 00007f55ab92f000 [ 166.899203][ T3747] </TASK>
The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20250326/202503261510.f9652c11-lkp@i...
-- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
linux-stable-mirror@lists.linaro.org