Hi Greg,
Syzbot has been complaining about KASAN splats due to use-after-free
issues in the l2tp code on 4.9 Android kernels (although I reproduced
with latest 4.9 stable on my laptop):
https://syzkaller.appspot.com/bug?id=3c27eae7bdba97293b68e79c9700ac110f977e…
These have been fixed upstream, but for some reason didn't get picked up
for stable. This series applies to 4.9.y and I'll send patches for 4.4.y
separately as there are a few dependencies to deal with over there.
Thanks,
Will
--->8
Guillaume Nault (2):
l2tp: ensure sessions are freed after their PPPOL2TP socket
l2tp: fix race between l2tp_session_delete() and
l2tp_tunnel_closeall()
net/l2tp/l2tp_core.c | 6 ++++++
net/l2tp/l2tp_core.h | 1 +
net/l2tp/l2tp_ppp.c | 8 ++++----
3 files changed, 11 insertions(+), 4 deletions(-)
--
2.26.0.rc2.310.g2932bb562d-goog
This patch partially reverts the commit for
nvme_fc: add module to ops template to allow module references
The original patch:
Added an ops parameter of "module" to be set by the lldd, and the
lldds were updated to provide their value.
Used the parameter to take module references when a controller was
created or terminated.
The original patch was to resolve the lldd being able to be unloaded
while being used to talk to the boot device of the system. However, the
end result of the original patch is that any driver unload while a nvme
controller is live via the lldd is not being prohibited. Given the module
reference, the module teardown routine can't be called, thus there's no
way, other than manual actions to terminate the controllers.
This patch reverts the portion of the patch that takes module references
on controller creation. It leaves the module parameter so that it could
be used in the future.
As such, there will still remain the issue of detaching from the boot
device, yet needing boot device access to load a new module to replace
the lldd that was unloaded. A solution will be looked for later.
-- james
Fixes: 863fbae929c7 ("nvme_fc: add module to ops template to allow module references")
Cc: <stable(a)vger.kernel.org> # v5.4+
Signed-off-by: James Smart <jsmart2021(a)gmail.com>
Cc: Himanshu Madhani <himanshu.madhani(a)oracle.com>
CC: Christoph Hellwig <hch(a)lst.de>
CC: Keith Busch <kbusch(a)kernel.org>
---
drivers/nvme/host/fc.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index a8bf2fb1287b..1419c8c41fd8 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2016,7 +2016,6 @@ nvme_fc_ctrl_free(struct kref *ref)
{
struct nvme_fc_ctrl *ctrl =
container_of(ref, struct nvme_fc_ctrl, ref);
- struct nvme_fc_lport *lport = ctrl->lport;
unsigned long flags;
if (ctrl->ctrl.tagset) {
@@ -2043,7 +2042,6 @@ nvme_fc_ctrl_free(struct kref *ref)
if (ctrl->ctrl.opts)
nvmf_free_options(ctrl->ctrl.opts);
kfree(ctrl);
- module_put(lport->ops->module);
}
static void
@@ -3074,15 +3072,10 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
goto out_fail;
}
- if (!try_module_get(lport->ops->module)) {
- ret = -EUNATCH;
- goto out_free_ctrl;
- }
-
idx = ida_simple_get(&nvme_fc_ctrl_cnt, 0, 0, GFP_KERNEL);
if (idx < 0) {
ret = -ENOSPC;
- goto out_mod_put;
+ goto out_free_ctrl;
}
ctrl->ctrl.opts = opts;
@@ -3232,8 +3225,6 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
out_free_ida:
put_device(ctrl->dev);
ida_simple_remove(&nvme_fc_ctrl_cnt, ctrl->cnum);
-out_mod_put:
- module_put(lport->ops->module);
out_free_ctrl:
kfree(ctrl);
out_fail:
--
2.16.4
Replace the 32bit exec_id with a 64bit exec_id to make it impossible
to wrap the exec_id counter. With care an attacker can cause exec_id
wrap and send arbitrary signals to a newly exec'd parent. This
bypasses the signal sending checks if the parent changes their
credentials during exec.
The severity of this problem can been seen that in my limited testing
of a 32bit exec_id it can take as little as 19s to exec 65536 times.
Which means that it can take as little as 14 days to wrap a 32bit
exec_id. Adam Zabrocki has succeeded wrapping the self_exe_id in 7
days. Even my slower timing is in the uptime of a typical server.
Which means self_exec_id is simply a speed bump today, and if exec
gets noticably faster self_exec_id won't even be a speed bump.
Extending self_exec_id to 64bits introduces a problem on 32bit
architectures where reading self_exec_id is no longer atomic and can
take two read instructions. Which means that is is possible to hit
a window where the read value of exec_id does not match the written
value. So with very lucky timing after this change this still
remains expoiltable.
I have updated the update of exec_id on exec to use WRITE_ONCE
and the read of exec_id in do_notify_parent to use READ_ONCE
to make it clear that there is no locking between these two
locations.
Link: https://lore.kernel.org/kernel-hardening/20200324215049.GA3710@pi3.com.pl
Fixes: 2.3.23pre2
Cc: stable(a)vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
---
Linus would you prefer to take this patch directly or I could put it in
a brach and send you a pull request.
fs/exec.c | 2 +-
include/linux/sched.h | 4 ++--
kernel/signal.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 0e46ec57fe0a..d55710a36056 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1413,7 +1413,7 @@ void setup_new_exec(struct linux_binprm * bprm)
/* An exec changes our domain. We are no longer part of the thread
group */
- current->self_exec_id++;
+ WRITE_ONCE(current->self_exec_id, current->self_exec_id + 1);
flush_signal_handlers(current, 0);
}
EXPORT_SYMBOL(setup_new_exec);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 04278493bf15..0323e4f0982a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -939,8 +939,8 @@ struct task_struct {
struct seccomp seccomp;
/* Thread group tracking: */
- u32 parent_exec_id;
- u32 self_exec_id;
+ u64 parent_exec_id;
+ u64 self_exec_id;
/* Protection against (de-)allocation: mm, files, fs, tty, keyrings, mems_allowed, mempolicy: */
spinlock_t alloc_lock;
diff --git a/kernel/signal.c b/kernel/signal.c
index 9ad8dea93dbb..5383b562df85 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1926,7 +1926,7 @@ bool do_notify_parent(struct task_struct *tsk, int sig)
* This is only possible if parent == real_parent.
* Check if it has changed security domain.
*/
- if (tsk->parent_exec_id != tsk->parent->self_exec_id)
+ if (tsk->parent_exec_id != READ_ONCE(tsk->parent->self_exec_id))
sig = SIGCHLD;
}
--
2.20.1