On Tue, Mar 28, 2017 at 02:30:56PM +0200, Greg Kroah-Hartman wrote:
4.10-stable review patch. If anyone has any objections, please let me know.
- if (!(auditd_test_task(current) ||
(current == __mutex_owner(&audit_cmd_mutex)))) {
long stime = audit_backlog_wait_time;
Since I cannot find the original email on lkml, NAK on this. __mutex_owner() is not a general purpose helper function.
On Tue, Feb 20, 2018 at 7:37 AM, Peter Zijlstra peterz@infradead.org wrote:
On Tue, Mar 28, 2017 at 02:30:56PM +0200, Greg Kroah-Hartman wrote:
4.10-stable review patch. If anyone has any objections, please let me know.
if (!(auditd_test_task(current) ||
(current == __mutex_owner(&audit_cmd_mutex)))) {
long stime = audit_backlog_wait_time;
Since I cannot find the original email on lkml, NAK on this. __mutex_owner() is not a general purpose helper function.
Since this code also exists in the current kernel, I need to ask what recommended alternatives exist for determining the mutex owner?
I imagine we could track the mutex owner separately in the audit subsystem, but I'd much prefer to leverage an existing mechanism if possible.
On Tue, Feb 20, 2018 at 08:25:21AM -0500, Paul Moore wrote:
On Tue, Feb 20, 2018 at 7:37 AM, Peter Zijlstra peterz@infradead.org wrote:
On Tue, Mar 28, 2017 at 02:30:56PM +0200, Greg Kroah-Hartman wrote:
4.10-stable review patch. If anyone has any objections, please let me know.
if (!(auditd_test_task(current) ||
(current == __mutex_owner(&audit_cmd_mutex)))) {
long stime = audit_backlog_wait_time;
Since I cannot find the original email on lkml, NAK on this. __mutex_owner() is not a general purpose helper function.
Since this code also exists in the current kernel, I need to ask what recommended alternatives exist for determining the mutex owner?
I imagine we could track the mutex owner separately in the audit subsystem, but I'd much prefer to leverage an existing mechanism if possible.
It's not at all clear to me what that code does, I just stumbled upon __mutex_owner() outside of the mutex code itself and went WTF.
The comment (aside from having the most horribly style) is wrong too, because it claims it will not block when we hold that lock, while, afaict, it will in fact do just that.
Maybe if you could explain how that code is supposed to work and why it doesn't know if it holds a lock I could make a suggestion...
On Tue, Feb 20, 2018 at 9:06 AM, Peter Zijlstra peterz@infradead.org wrote:
On Tue, Feb 20, 2018 at 08:25:21AM -0500, Paul Moore wrote:
On Tue, Feb 20, 2018 at 7:37 AM, Peter Zijlstra peterz@infradead.org wrote:
On Tue, Mar 28, 2017 at 02:30:56PM +0200, Greg Kroah-Hartman wrote:
4.10-stable review patch. If anyone has any objections, please let me know.
if (!(auditd_test_task(current) ||
(current == __mutex_owner(&audit_cmd_mutex)))) {
long stime = audit_backlog_wait_time;
Since I cannot find the original email on lkml, NAK on this. __mutex_owner() is not a general purpose helper function.
Since this code also exists in the current kernel, I need to ask what recommended alternatives exist for determining the mutex owner?
I imagine we could track the mutex owner separately in the audit subsystem, but I'd much prefer to leverage an existing mechanism if possible.
It's not at all clear to me what that code does, I just stumbled upon __mutex_owner() outside of the mutex code itself and went WTF.
If you don't want people to use __mutex_owner() outside of the mutex code I might suggest adding a rather serious comment at the top of the function, because right now I don't see anything suggesting that function shouldn't be used. Yes, there is the double underscore prefix, but that can mean a few different things these days.
The comment (aside from having the most horribly style) ...
Yeah, your dog is ugly too. Notice how neither comment is constructive?
... is wrong too, because it claims it will not block when we hold that lock, while, afaict, it will in fact do just that.
A mutex blocks when it is held, but the audit_log_start() function should not block for the task that currently holds the audit_cmd_mutex; that is what the comment is meant to convey. I believe the comment makes sense, but I did write it so I'll concede that I'm probably the not best judge. If anyone would like to offer a different wording I'm happy to consider it.
Maybe if you could explain how that code is supposed to work and why it doesn't know if it holds a lock I could make a suggestion...
I just spent a few minutes looking back over the bits available in include/linux/mutex.h and I'm not seeing anything beyond __mutex_owner() which would allow us to determine the mutex owning task. It's probably easiest for us to just track ownership ourselves. I'll put together a patch later today.
On Tue, Feb 20, 2018 at 09:51:08AM -0500, Paul Moore wrote:
On Tue, Feb 20, 2018 at 9:06 AM, Peter Zijlstra peterz@infradead.org wrote:
It's not at all clear to me what that code does, I just stumbled upon __mutex_owner() outside of the mutex code itself and went WTF.
If you don't want people to use __mutex_owner() outside of the mutex code I might suggest adding a rather serious comment at the top of the function, because right now I don't see anything suggesting that function shouldn't be used. Yes, there is the double underscore prefix, but that can mean a few different things these days.
Find below.
The comment (aside from having the most horribly style) ...
Yeah, your dog is ugly too. Notice how neither comment is constructive?
I'm sure you've seen this one:
https://lkml.org/lkml/2016/7/8/625
It's all about reading code; inconsistent and unbalanced styles are just _really_ hard on the brain.
... is wrong too, because it claims it will not block when we hold that lock, while, afaict, it will in fact do just that.
A mutex blocks when it is held, but the audit_log_start() function should not block for the task that currently holds the audit_cmd_mutex; that is what the comment is meant to convey. I believe the comment makes sense, but I did write it so I'll concede that I'm probably the not best judge. If anyone would like to offer a different wording I'm happy to consider it.
The comment uses 'sleep' which is typically used to mean anything that schedules, but then it does the schedule_timeout() thing.
Maybe if you could explain how that code is supposed to work and why it doesn't know if it holds a lock I could make a suggestion...
I just spent a few minutes looking back over the bits available in include/linux/mutex.h and I'm not seeing anything beyond __mutex_owner() which would allow us to determine the mutex owning task. It's probably easiest for us to just track ownership ourselves. I'll put together a patch later today.
Note that up until recently the mutex implementation didn't even have a consistent owner field. And the thing is, it's very easy to use wrong, only today I've seen a patch do: "__mutex_owner() == task", where task was allowed to be !current, which is just wrong.
Looking through kernel/audit.c I'm not even sure I see how you would end up in audit_log_start() with audit_cmd_mutex held.
Can you give me a few code paths that trigger this? Simple git-grep is failing me.
--- Subject: mutex: Add comment to __mutex_owner() From: Peter Zijlstra peterz@infradead.org Date: Tue Feb 20 16:01:36 CET 2018
Attempt to deter usage, this is not a public interface. It is entirely possibly to implement a conformant mutex without having this owner field (in fact, we used to have that).
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org --- --- a/include/linux/mutex.h +++ b/include/linux/mutex.h @@ -66,6 +66,11 @@ struct mutex { #endif };
+/* + * Internal helper function; C doesn't allow us to hide it :/ + * + * DO NOT USE (outside of mutex code). + */ static inline struct task_struct *__mutex_owner(struct mutex *lock) { return (struct task_struct *)(atomic_long_read(&lock->owner) & ~0x07);
On Tue, Feb 20, 2018 at 10:18 AM, Peter Zijlstra peterz@infradead.org wrote:
On Tue, Feb 20, 2018 at 09:51:08AM -0500, Paul Moore wrote:
On Tue, Feb 20, 2018 at 9:06 AM, Peter Zijlstra peterz@infradead.org wrote:
It's not at all clear to me what that code does, I just stumbled upon __mutex_owner() outside of the mutex code itself and went WTF.
If you don't want people to use __mutex_owner() outside of the mutex code I might suggest adding a rather serious comment at the top of the function, because right now I don't see anything suggesting that function shouldn't be used. Yes, there is the double underscore prefix, but that can mean a few different things these days.
Find below.
The comment (aside from having the most horribly style) ...
Yeah, your dog is ugly too. Notice how neither comment is constructive?
I'm sure you've seen this one:
Yep. I stand behind my earlier comment in this thread.
Maybe if you could explain how that code is supposed to work and why it doesn't know if it holds a lock I could make a suggestion...
I just spent a few minutes looking back over the bits available in include/linux/mutex.h and I'm not seeing anything beyond __mutex_owner() which would allow us to determine the mutex owning task. It's probably easiest for us to just track ownership ourselves. I'll put together a patch later today.
Note that up until recently the mutex implementation didn't even have a consistent owner field. And the thing is, it's very easy to use wrong, only today I've seen a patch do: "__mutex_owner() == task", where task was allowed to be !current, which is just wrong.
Arguably all the more reason why a strongly worded warning is important (which I see you've included below, feel free to include my Reviewed-by).
Looking through kernel/audit.c I'm not even sure I see how you would end up in audit_log_start() with audit_cmd_mutex held.
Can you give me a few code paths that trigger this? Simple git-grep is failing me.
Basically look at the code in audit_receive_msg(), but I wasn't asking your opinion on how we should rewrite the audit subsystem, I was just asking how one could determine if the current task was holding a given mutex in a way that was acceptable to you. Based on your comments, and some further inspection of the mutex code, it appears that is/was not something that the core mutex code wants to support/make-visible. Which is perfectly fine, I just wanted to make sure I wasn't missing something before I went ahead and wrote a wrapper around the mutex code for use by audit.
FWIW, I just put together the following patch which removes the __mutex_owner() call from audit and doesn't appear to break anything on the audit side (you're CC'd on the patch). It has only been lightly tested, but I'm going to bang on it for a day or so and if I hear no objections I'll merge it into audit/next.
* https://www.redhat.com/archives/linux-audit/2018-February/msg00066.html
Subject: mutex: Add comment to __mutex_owner() From: Peter Zijlstra peterz@infradead.org Date: Tue Feb 20 16:01:36 CET 2018
Attempt to deter usage, this is not a public interface. It is entirely possibly to implement a conformant mutex without having this owner field (in fact, we used to have that).
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org
--- a/include/linux/mutex.h +++ b/include/linux/mutex.h @@ -66,6 +66,11 @@ struct mutex { #endif };
+/*
- Internal helper function; C doesn't allow us to hide it :/
- DO NOT USE (outside of mutex code).
- */
static inline struct task_struct *__mutex_owner(struct mutex *lock) { return (struct task_struct *)(atomic_long_read(&lock->owner) & ~0x07);
Reviewed-by: Paul Moore paul@paul-moore.com
* Paul Moore paul@paul-moore.com wrote:
On Tue, Feb 20, 2018 at 10:18 AM, Peter Zijlstra peterz@infradead.org wrote:
On Tue, Feb 20, 2018 at 09:51:08AM -0500, Paul Moore wrote:
On Tue, Feb 20, 2018 at 9:06 AM, Peter Zijlstra peterz@infradead.org wrote:
It's not at all clear to me what that code does, I just stumbled upon __mutex_owner() outside of the mutex code itself and went WTF.
If you don't want people to use __mutex_owner() outside of the mutex code I might suggest adding a rather serious comment at the top of the function, because right now I don't see anything suggesting that function shouldn't be used. Yes, there is the double underscore prefix, but that can mean a few different things these days.
Find below.
The comment (aside from having the most horribly style) ...
Yeah, your dog is ugly too. Notice how neither comment is constructive?
I'm sure you've seen this one:
Yep. I stand behind my earlier comment in this thread.
Maybe if you could explain how that code is supposed to work and why it doesn't know if it holds a lock I could make a suggestion...
I just spent a few minutes looking back over the bits available in include/linux/mutex.h and I'm not seeing anything beyond __mutex_owner() which would allow us to determine the mutex owning task. It's probably easiest for us to just track ownership ourselves. I'll put together a patch later today.
Note that up until recently the mutex implementation didn't even have a consistent owner field. And the thing is, it's very easy to use wrong, only today I've seen a patch do: "__mutex_owner() == task", where task was allowed to be !current, which is just wrong.
Arguably all the more reason why a strongly worded warning is important (which I see you've included below, feel free to include my Reviewed-by).
Looking through kernel/audit.c I'm not even sure I see how you would end up in audit_log_start() with audit_cmd_mutex held.
Can you give me a few code paths that trigger this? Simple git-grep is failing me.
Basically look at the code in audit_receive_msg(), but I wasn't asking your opinion on how we should rewrite the audit subsystem, I was just asking how one could determine if the current task was holding a given mutex in a way that was acceptable to you. Based on your comments, and some further inspection of the mutex code, it appears that is/was not something that the core mutex code wants to support/make-visible. Which is perfectly fine, I just wanted to make sure I wasn't missing something before I went ahead and wrote a wrapper around the mutex code for use by audit.
FWIW, I just put together the following patch which removes the __mutex_owner() call from audit and doesn't appear to break anything on the audit side (you're CC'd on the patch). It has only been lightly tested, but I'm going to bang on it for a day or so and if I hear no objections I'll merge it into audit/next.
Could you please explain the audit_ctl_lock()/unlock() primitive you are introducing there? You seem to be implementing some sort of recursive locking primitive, but in a strange way.
AFAICS the primary problem appears to be this code path:
audit_receive() -> audit_receive_msg() -> AUDIT_TTY_SET -> audit_log_common_recv_msg() -> audit_log_start()
where we can arrive already holding the lock.
I.e. recursive mutex, kinda.
What's the thinking there? Neither the changelog nor the code explains this.
Thanks,
Ingo
On Wed, Feb 21, 2018 at 09:46:02AM +0100, Ingo Molnar wrote:
AFAICS the primary problem appears to be this code path:
audit_receive() -> audit_receive_msg() -> AUDIT_TTY_SET -> audit_log_common_recv_msg() -> audit_log_start()
where we can arrive already holding the lock.
I.e. recursive mutex, kinda.
I _think_ something like the below ought to work, but I've no idea how to even begin testing audit.
--- kernel/audit.c | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-)
diff --git a/kernel/audit.c b/kernel/audit.c index 227db99b0f19..24175754f79d 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -184,6 +184,9 @@ static char *audit_feature_names[2] = { /* Serialize requests from userspace. */ DEFINE_MUTEX(audit_cmd_mutex);
+static struct audit_buffer *__audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, + int type, bool recursive); + /* AUDIT_BUFSIZ is the size of the temporary buffer used for formatting * audit records. Since printk uses a 1024 byte buffer, this buffer * should be at least that large. */ @@ -357,7 +360,7 @@ static int audit_log_config_change(char *function_name, u32 new, u32 old, struct audit_buffer *ab; int rc = 0;
- ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); + ab = __audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE, true); if (unlikely(!ab)) return rc; audit_log_format(ab, "%s=%u old=%u", function_name, new, old); @@ -1024,7 +1027,7 @@ static void audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type) return; }
- *ab = audit_log_start(NULL, GFP_KERNEL, msg_type); + *ab = __audit_log_start(NULL, GFP_KERNEL, msg_type, true); if (unlikely(!*ab)) return; audit_log_format(*ab, "pid=%d uid=%u", pid, uid); @@ -1057,7 +1060,7 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature if (audit_enabled == AUDIT_OFF) return;
- ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_FEATURE_CHANGE); + ab = __audit_log_start(NULL, GFP_KERNEL, AUDIT_FEATURE_CHANGE, true); audit_log_task_info(ab, current); audit_log_format(ab, " feature=%s old=%u new=%u old_lock=%u new_lock=%u res=%d", audit_feature_names[which], !!old_feature, !!new_feature, @@ -1578,6 +1581,12 @@ static int __init audit_enable(char *str)
if (audit_default == AUDIT_OFF) audit_initialized = AUDIT_DISABLED; + /* + * Normally audit_set_enabled() would need to be called under + * @audit_cmd_mutex, however since audit_do_config_change() will not in + * fact call audit_log_config_change() when 'audit_enabled == + * AUDIT_OFF', we can use it here without issue. + */ if (audit_set_enabled(audit_default)) panic("audit: error setting audit state (%d)\n", audit_default);
@@ -1690,8 +1699,8 @@ static inline void audit_get_stamp(struct audit_context *ctx, * will be written at syscall exit. If there is no associated task, then * task context (ctx) should be NULL. */ -struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, - int type) +static struct audit_buffer *__audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, + int type, bool recursive) { struct audit_buffer *ab; struct timespec64 t; @@ -1703,6 +1712,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, if (unlikely(!audit_filter(type, AUDIT_FILTER_TYPE))) return NULL;
+ if (recursive) + lockdep_assert_held(&audit_cmd_mutex); + /* NOTE: don't ever fail/sleep on these two conditions: * 1. auditd generated record - since we need auditd to drain the * queue; also, when we are checking for auditd, compare PIDs using @@ -1710,8 +1722,7 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, * using a PID anchored in the caller's namespace * 2. generator holding the audit_cmd_mutex - we don't want to block * while holding the mutex */ - if (!(auditd_test_task(current) || - (current == __mutex_owner(&audit_cmd_mutex)))) { + if (!(auditd_test_task(current) || recursive)) { long stime = audit_backlog_wait_time;
while (audit_backlog_limit && @@ -1753,6 +1764,12 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, return ab; }
+struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, + int type) +{ + return __audit_log_start(ctx, gfp_mask, type, false); +} + /** * audit_expand - expand skb in the audit buffer * @ab: audit_buffer
On Wed, Feb 21, 2018 at 3:46 AM, Ingo Molnar mingo@kernel.org wrote:
- Paul Moore paul@paul-moore.com wrote:
On Tue, Feb 20, 2018 at 10:18 AM, Peter Zijlstra peterz@infradead.org wrote:
On Tue, Feb 20, 2018 at 09:51:08AM -0500, Paul Moore wrote:
On Tue, Feb 20, 2018 at 9:06 AM, Peter Zijlstra peterz@infradead.org wrote:
It's not at all clear to me what that code does, I just stumbled upon __mutex_owner() outside of the mutex code itself and went WTF.
If you don't want people to use __mutex_owner() outside of the mutex code I might suggest adding a rather serious comment at the top of the function, because right now I don't see anything suggesting that function shouldn't be used. Yes, there is the double underscore prefix, but that can mean a few different things these days.
Find below.
The comment (aside from having the most horribly style) ...
Yeah, your dog is ugly too. Notice how neither comment is constructive?
I'm sure you've seen this one:
Yep. I stand behind my earlier comment in this thread.
Maybe if you could explain how that code is supposed to work and why it doesn't know if it holds a lock I could make a suggestion...
I just spent a few minutes looking back over the bits available in include/linux/mutex.h and I'm not seeing anything beyond __mutex_owner() which would allow us to determine the mutex owning task. It's probably easiest for us to just track ownership ourselves. I'll put together a patch later today.
Note that up until recently the mutex implementation didn't even have a consistent owner field. And the thing is, it's very easy to use wrong, only today I've seen a patch do: "__mutex_owner() == task", where task was allowed to be !current, which is just wrong.
Arguably all the more reason why a strongly worded warning is important (which I see you've included below, feel free to include my Reviewed-by).
Looking through kernel/audit.c I'm not even sure I see how you would end up in audit_log_start() with audit_cmd_mutex held.
Can you give me a few code paths that trigger this? Simple git-grep is failing me.
Basically look at the code in audit_receive_msg(), but I wasn't asking your opinion on how we should rewrite the audit subsystem, I was just asking how one could determine if the current task was holding a given mutex in a way that was acceptable to you. Based on your comments, and some further inspection of the mutex code, it appears that is/was not something that the core mutex code wants to support/make-visible. Which is perfectly fine, I just wanted to make sure I wasn't missing something before I went ahead and wrote a wrapper around the mutex code for use by audit.
FWIW, I just put together the following patch which removes the __mutex_owner() call from audit and doesn't appear to break anything on the audit side (you're CC'd on the patch). It has only been lightly tested, but I'm going to bang on it for a day or so and if I hear no objections I'll merge it into audit/next.
Could you please explain the audit_ctl_lock()/unlock() primitive you are introducing there? You seem to be implementing some sort of recursive locking primitive, but in a strange way.
AFAICS the primary problem appears to be this code path:
audit_receive() -> audit_receive_msg() -> AUDIT_TTY_SET -> audit_log_common_recv_msg() -> audit_log_start()
where we can arrive already holding the lock.
I.e. recursive mutex, kinda.
What's the thinking there? Neither the changelog nor the code explains this.
I don't really go into great detail in the changelog, or comments in the code, because I'm not really doing anything new with respect to locking in this commit. The patch simply wraps the existing mutex_{lock,unlock}() calls so that we can track the mutex owner. It doesn't fundamentally change the locking, it's a quick patch to get rid our our __mutex_owner() usage as Peter doesn't want anyone, outside the mutex code, to use that function.
Based on your comments above, I'm guessing some of the misunderstanding comes from the __mutex_owner()/audit_ctl_owner_current() call in audit_log_start(). We try to determine the mutex/lock owner in audit_log_start() not because we are trying to avoid a recursive lock, we do the check as an optimization to skip the normal queue managment so that the lock holder isn't subject to the same rescheduling/queue-management (is "queue calming" a term?) as regular tasks.
linux-stable-mirror@lists.linaro.org