On Thu, Dec 15, 2022 at 9:30 AM Mimi Zohar zohar@linux.ibm.com wrote:
On Thu, 2022-12-15 at 21:15 +0800, Guozihua (Scott) wrote:
On 2022/12/15 18:49, Mimi Zohar wrote:
On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
On 2022/12/14 20:19, Mimi Zohar wrote:
On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
On 2022/12/13 23:30, Mimi Zohar wrote: > On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote: >> Hi community. >> >> Previously our team reported a race condition in IMA relates to LSM >> based rules which would case IMA to match files that should be filtered >> out under normal condition. The issue was originally analyzed and fixed >> on mainstream. The patch and the discussion could be found here: >> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ >> >> After that, we did a regression test on 4.19 LTS and the same issue >> arises. Further analysis reveled that the issue is from a completely >> different cause. >> >> The cause is that selinux_audit_rule_init() would set the rule (which is >> a second level pointer) to NULL immediately after called. The relevant >> codes are as shown: >> >> security/selinux/ss/services.c: >>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>> { >>> struct selinux_state *state = &selinux_state; >>> struct policydb *policydb = &state->ss->policydb; >>> struct selinux_audit_rule *tmprule; >>> struct role_datum *roledatum; >>> struct type_datum *typedatum; >>> struct user_datum *userdatum; >>> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; >>> int rc = 0; >>> >>> *rule = NULL; >> *rule is set to NULL here, which means the rule on IMA side is also NULL. >>> >>> if (!state->initialized) >>> return -EOPNOTSUPP; >> ... >>> out: >>> read_unlock(&state->ss->policy_rwlock); >>> >>> if (rc) { >>> selinux_audit_rule_free(tmprule); >>> tmprule = NULL; >>> } >>> >>> *rule = tmprule; >> rule is updated at the end of the function. >>> >>> return rc; >>> } >> >> security/integrity/ima/ima_policy.c: >>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, >>> const struct cred *cred, u32 secid, >>> enum ima_hooks func, int mask) >>> {... >>> for (i = 0; i < MAX_LSM_RULES; i++) { >>> int rc = 0; >>> u32 osid; >>> int retried = 0; >>> >>> if (!rule->lsm[i].rule) >>> continue; >> Setting rule to NULL would lead to LSM based rule matching being skipped. >>> retry: >>> switch (i) { >> >> To solve this issue, there are multiple approaches we might take and I >> would like some input from the community. >> >> The first proposed solution would be to change >> selinux_audit_rule_init(). Remove the set to NULL bit and update the >> rule pointer with cmpxchg. >> >>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c >>> index a9f2bc8443bd..aa74b04ccaf7 100644 >>> --- a/security/selinux/ss/services.c >>> +++ b/security/selinux/ss/services.c >>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>> struct type_datum *typedatum; >>> struct user_datum *userdatum; >>> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; >>> + struct selinux_audit_rule *orig = rule; >>> int rc = 0; >>> >>> - *rule = NULL; >>> - >>> if (!state->initialized) >>> return -EOPNOTSUPP; >>> >>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>> tmprule = NULL; >>> } >>> >>> - *rule = tmprule; >>> + if (cmpxchg(rule, orig, tmprule) != orig) >>> + selinux_audit_rule_free(tmprule); >>> >>> return rc; >>> } >> >> This solution would be an easy fix, but might influence other modules >> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, >> only auditfilter and IMA it seems). And it might be worth returning an >> error code such as -EAGAIN. >> >> Or, we can access rules via RCU, similar to what we do on 5.10. This >> could means more code change and testing. > > In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as > needed. IMA waits for selinux_audit_rule_init() to complete and > shouldn't see NULL, unless there is an SELinux failure. Before > "fixing" the problem, what exactly is the problem?
IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL LSM based rules in one go without using RCU, which would still allow other cores to access the rule being updated. And that's the issue.
An example scenario would be: CPU1 | CPU2 opened a file and starts | updating LSM based rules. | | opened a file and starts | matching rules. | set a LSM based rule to NULL. | access the same LSM based rule and | see that it's NULL.
In this situation, CPU 2 would recognize this rule as not LSM based and ignore the LSM part of the rule while matching.
Would picking up just ima_lsm_update_rule(), without changing to the lsm policy update notifier, from upstream and calling it from ima_lsm_update_rules() resolve the RCU locking issue? Or are there other issues?
Hi Mimi,
That should resolve the issue just fine. However, that'd mean having a customized ima_lsm_update_rules as well as a customized ima_lsm_update_rule functions on 4.19.y, potentially decrease the maintainability. The customization of the two functions mentioned above would be to ripoff the RCU part so that we can leave the other rule-list access untouched.
Hi Scott,
Neither do we want to backport new functionality. Perhaps it is only a subset of commit b16942455193 ("ima: use the lsm policy update notifier").
Yes we are able to backport part of this commit to get a mainline-like behavior which would solve the issue at hand as well. However from a maintenance standpoint it might be better to form a different commit and not re-use the commit message from mainline commit.
I assume that is fine, but cherry-pick the original commit with the -x option, so there is a correlation to the upstream commit. The patch description would mention that the patch is a partial backport.
FWIW, if the changes in the backport are significant I tend to use the following approach as it captures both the original commit as well as the details on what changes were made and why.
ima: use the lsm policy update notifier
Really good explanation of what changes were necessary from the original patch, including why they were necessary in the first place.
commit b169424551930a9325f700f502802f4d515194e5 Author: Janne Karhunen janne.karhunen@gmail.com Date: Fri Jun 14 15:20:15 2019 +0300
ima: use the lsm policy update notifier
Don't do lazy policy updates while running the rule matching, run the updates as they happen.
Depends on commit f242064c5df3 ("LSM: switch to blocking policy update notifiers")
Signed-off-by: Janne Karhunen janne.karhunen@gmail.com Signed-off-by: Mimi Zohar zohar@linux.ibm.com
Although your suggested solution of using "cmpxchg" isn't needed in recent kernels, it wouldn't hurt either, right? Assuming that Paul would be willing to accept it, that could be another option.
The cmpxchg part is merely to avoid multiple updates on the same rule item. However I am not so sure why SELinux would set the rule to NULL. My guess is that SELinux is trying to stop others from using an invalidated rule?
Would Paul be able to provide some insight? as well as some Comment on the proposed solution?
I'm not comfortable with what might happen with a cmpxchg assignment when multiple threads are in a related RCU critical section; I'm assuming they would see the new value immediately (it is atomic, right?), which I imagine could cause some consistency problems. However, if someone who understands the intersection of cmpxchg/RCU better than I do can assure me this isn't a problem we can consider it.
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?