Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
The cause is that selinux_audit_rule_init() would set the rule (which is a second level pointer) to NULL immediately after called. The relevant codes are as shown:
security/selinux/ss/services.c:
int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) { struct selinux_state *state = &selinux_state; struct policydb *policydb = &state->ss->policydb; struct selinux_audit_rule *tmprule; struct role_datum *roledatum; struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; int rc = 0;
*rule = NULL;
*rule is set to NULL here, which means the rule on IMA side is also NULL.
if (!state->initialized) return -EOPNOTSUPP;
...
out: read_unlock(&state->ss->policy_rwlock);
if (rc) { selinux_audit_rule_free(tmprule); tmprule = NULL; } *rule = tmprule;
rule is updated at the end of the function.
return rc;
}
security/integrity/ima/ima_policy.c:
static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, const struct cred *cred, u32 secid, enum ima_hooks func, int mask) {... for (i = 0; i < MAX_LSM_RULES; i++) { int rc = 0; u32 osid; int retried = 0;
if (!rule->lsm[i].rule) continue;
Setting rule to NULL would lead to LSM based rule matching being skipped.
retry: switch (i) {
To solve this issue, there are multiple approaches we might take and I would like some input from the community.
The first proposed solution would be to change selinux_audit_rule_init(). Remove the set to NULL bit and update the rule pointer with cmpxchg.
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index a9f2bc8443bd..aa74b04ccaf7 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
struct selinux_audit_rule *orig = rule; int rc = 0;
*rule = NULL;
if (!state->initialized) return -EOPNOTSUPP;
@@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) tmprule = NULL; }
*rule = tmprule;
if (cmpxchg(rule, orig, tmprule) != orig)
selinux_audit_rule_free(tmprule);
return rc; }
This solution would be an easy fix, but might influence other modules calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, only auditfilter and IMA it seems). And it might be worth returning an error code such as -EAGAIN.
Or, we can access rules via RCU, similar to what we do on 5.10. This could means more code change and testing.
Reported-by: Huaxin Lu luhuaxin1@huawei.com
On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
What commit in the tree fixed this in newer kernels? Why can't we just backport that one to 4.19.y as well?
thanks,
greg k-h
On 2022/12/9 15:12, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
What commit in the tree fixed this in newer kernels? Why can't we just backport that one to 4.19.y as well?
thanks,
greg k-h
Hi Greg,
The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE returned by ima_filter_rule_match()"). However, these patches cannot be picked directly into 4.19.y due to code difference.
The commit which introduced the issue on mainline was believed to be b16942455193 ("ima: use the lsm policy update notifier"), which is not in 4.19.y. And the mainline patch is designed to handle the situation when IMA rules are accessed through RCU which has not been implemented on 4.19.y either.
On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 15:12, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
What commit in the tree fixed this in newer kernels? Why can't we just backport that one to 4.19.y as well?
thanks,
greg k-h
Hi Greg,
The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE returned by ima_filter_rule_match()"). However, these patches cannot be picked directly into 4.19.y due to code difference.
Ok, so it's much more than just 4.19 that's an issue here. And are those commits tagged for stable inclusion?
The commit which introduced the issue on mainline was believed to be b16942455193 ("ima: use the lsm policy update notifier"), which is not in 4.19.y. And the mainline patch is designed to handle the situation when IMA rules are accessed through RCU which has not been implemented on 4.19.y either.
Ok, then provide a series of backports to 4.19 and we will be glad to review them.
thanks,
greg k-h
On 2022/12/9 16:46, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 15:12, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
What commit in the tree fixed this in newer kernels? Why can't we just backport that one to 4.19.y as well?
thanks,
greg k-h
Hi Greg,
The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE returned by ima_filter_rule_match()"). However, these patches cannot be picked directly into 4.19.y due to code difference.
Ok, so it's much more than just 4.19 that's an issue here. And are those commits tagged for stable inclusion?
Not actually, not on the commit itself.
The commit which introduced the issue on mainline was believed to be b16942455193 ("ima: use the lsm policy update notifier"), which is not in 4.19.y. And the mainline patch is designed to handle the situation when IMA rules are accessed through RCU which has not been implemented on 4.19.y either.
Ok, then provide a series of backports to 4.19 and we will be glad to review them.
If we are backporting these commits to 4.19 then maybe we would have to start with the commit that makes rule access in IMA RCU protected. I'll have a look into whether it's easy to do.
thanks,
greg k-h
On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 16:46, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 15:12, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
What commit in the tree fixed this in newer kernels? Why can't we just backport that one to 4.19.y as well?
thanks,
greg k-h
Hi Greg,
The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE returned by ima_filter_rule_match()"). However, these patches cannot be picked directly into 4.19.y due to code difference.
Ok, so it's much more than just 4.19 that's an issue here. And are those commits tagged for stable inclusion?
Not actually, not on the commit itself.
That's not good. When they hit Linus's tree, please submit backports to the stable mailing list so that they can be picked up.
thanks,
greg k-h
On 2022/12/9 17:00, Greg KH wrote:
On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 16:46, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 15:12, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
What commit in the tree fixed this in newer kernels? Why can't we just backport that one to 4.19.y as well?
thanks,
greg k-h
Hi Greg,
The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE returned by ima_filter_rule_match()"). However, these patches cannot be picked directly into 4.19.y due to code difference.
Ok, so it's much more than just 4.19 that's an issue here. And are those commits tagged for stable inclusion?
Not actually, not on the commit itself.
That's not good. When they hit Linus's tree, please submit backports to the stable mailing list so that they can be picked up.
Thing is these commits cannot be simply backported to 4.19.y. Preceding patches are missing. How do we do backporting in this situation? Do we first backport the preceding patches? Or maybe we develop another solution for 4.19.y?
thanks,
greg k-h
On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 17:00, Greg KH wrote:
On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 16:46, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 15:12, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote: > Hi community. > > Previously our team reported a race condition in IMA relates to LSM based > rules which would case IMA to match files that should be filtered out under > normal condition. The issue was originally analyzed and fixed on mainstream. > The patch and the discussion could be found here: > https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ > > After that, we did a regression test on 4.19 LTS and the same issue arises. > Further analysis reveled that the issue is from a completely different > cause.
What commit in the tree fixed this in newer kernels? Why can't we just backport that one to 4.19.y as well?
thanks,
greg k-h
Hi Greg,
The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE returned by ima_filter_rule_match()"). However, these patches cannot be picked directly into 4.19.y due to code difference.
Ok, so it's much more than just 4.19 that's an issue here. And are those commits tagged for stable inclusion?
Not actually, not on the commit itself.
That's not good. When they hit Linus's tree, please submit backports to the stable mailing list so that they can be picked up.
Thing is these commits cannot be simply backported to 4.19.y. Preceding patches are missing. How do we do backporting in this situation? Do we first backport the preceding patches? Or maybe we develop another solution for 4.19.y?
First they need to go to newer kernel trees, and then worry about 4.19. We never want anyone to upgrade to a newer kernel and have a regression.
Also, we can't do anything until they hit Linus's tree, as per the stable kernel rules.
thanks,
greg k-h
On 2022/12/9 17:22, Greg KH wrote:
On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 17:00, Greg KH wrote:
On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 16:46, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 15:12, Greg KH wrote: > On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote: >> Hi community. >> >> Previously our team reported a race condition in IMA relates to LSM based >> rules which would case IMA to match files that should be filtered out under >> normal condition. The issue was originally analyzed and fixed on mainstream. >> The patch and the discussion could be found here: >> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ >> >> After that, we did a regression test on 4.19 LTS and the same issue arises. >> Further analysis reveled that the issue is from a completely different >> cause. > > What commit in the tree fixed this in newer kernels? Why can't we just > backport that one to 4.19.y as well? > > thanks, > > greg k-h
Hi Greg,
The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE returned by ima_filter_rule_match()"). However, these patches cannot be picked directly into 4.19.y due to code difference.
Ok, so it's much more than just 4.19 that's an issue here. And are those commits tagged for stable inclusion?
Not actually, not on the commit itself.
That's not good. When they hit Linus's tree, please submit backports to the stable mailing list so that they can be picked up.
Thing is these commits cannot be simply backported to 4.19.y. Preceding patches are missing. How do we do backporting in this situation? Do we first backport the preceding patches? Or maybe we develop another solution for 4.19.y?
First they need to go to newer kernel trees, and then worry about 4.19. We never want anyone to upgrade to a newer kernel and have a regression.
Also, we can't do anything until they hit Linus's tree, as per the stable kernel rules.
Alright. We'll wait for these patches to be in Linus' tree. But should we stick to a backport from mainstream or we form a different solution for LTS?
On 2022/12/9 17:32, Guozihua (Scott) wrote:
On 2022/12/9 17:22, Greg KH wrote:
On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 17:00, Greg KH wrote:
On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 16:46, Greg KH wrote:
On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote: > On 2022/12/9 15:12, Greg KH wrote: >> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote: >>> Hi community. >>> >>> Previously our team reported a race condition in IMA relates to LSM based >>> rules which would case IMA to match files that should be filtered out under >>> normal condition. The issue was originally analyzed and fixed on mainstream. >>> The patch and the discussion could be found here: >>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ >>> >>> After that, we did a regression test on 4.19 LTS and the same issue arises. >>> Further analysis reveled that the issue is from a completely different >>> cause. >> >> What commit in the tree fixed this in newer kernels? Why can't we just >> backport that one to 4.19.y as well? >> >> thanks, >> >> greg k-h > > Hi Greg, > > The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: > Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE > returned by ima_filter_rule_match()"). However, these patches cannot be > picked directly into 4.19.y due to code difference.
Ok, so it's much more than just 4.19 that's an issue here. And are those commits tagged for stable inclusion?
Not actually, not on the commit itself.
That's not good. When they hit Linus's tree, please submit backports to the stable mailing list so that they can be picked up.
Thing is these commits cannot be simply backported to 4.19.y. Preceding patches are missing. How do we do backporting in this situation? Do we first backport the preceding patches? Or maybe we develop another solution for 4.19.y?
First they need to go to newer kernel trees, and then worry about 4.19. We never want anyone to upgrade to a newer kernel and have a regression.
Also, we can't do anything until they hit Linus's tree, as per the stable kernel rules.
Alright. We'll wait for these patches to be in Linus' tree. But should we stick to a backport from mainstream or we form a different solution for LTS?
BTW, I have a look into it and if we are backporting mainstream's solution, we would also needs to backport b16942455193 ("ima: use the lsm policy update notifier")
On Fri, Dec 09, 2022 at 05:38:00PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 17:32, Guozihua (Scott) wrote:
On 2022/12/9 17:22, Greg KH wrote:
On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 17:00, Greg KH wrote:
On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 16:46, Greg KH wrote: > On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote: >> On 2022/12/9 15:12, Greg KH wrote: >>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote: >>>> Hi community. >>>> >>>> Previously our team reported a race condition in IMA relates to LSM based >>>> rules which would case IMA to match files that should be filtered out under >>>> normal condition. The issue was originally analyzed and fixed on mainstream. >>>> The patch and the discussion could be found here: >>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ >>>> >>>> After that, we did a regression test on 4.19 LTS and the same issue arises. >>>> Further analysis reveled that the issue is from a completely different >>>> cause. >>> >>> What commit in the tree fixed this in newer kernels? Why can't we just >>> backport that one to 4.19.y as well? >>> >>> thanks, >>> >>> greg k-h >> >> Hi Greg, >> >> The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: >> Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE >> returned by ima_filter_rule_match()"). However, these patches cannot be >> picked directly into 4.19.y due to code difference. > > Ok, so it's much more than just 4.19 that's an issue here. And are > those commits tagged for stable inclusion?
Not actually, not on the commit itself.
That's not good. When they hit Linus's tree, please submit backports to the stable mailing list so that they can be picked up.
Thing is these commits cannot be simply backported to 4.19.y. Preceding patches are missing. How do we do backporting in this situation? Do we first backport the preceding patches? Or maybe we develop another solution for 4.19.y?
First they need to go to newer kernel trees, and then worry about 4.19. We never want anyone to upgrade to a newer kernel and have a regression.
Also, we can't do anything until they hit Linus's tree, as per the stable kernel rules.
Alright. We'll wait for these patches to be in Linus' tree. But should we stick to a backport from mainstream or we form a different solution for LTS?
We always want to have a normal backport of what is in Linus's tree if at all possible. Whenever we diverge from that, we almost always get it wrong and have to fix it up again later.
BTW, I have a look into it and if we are backporting mainstream's solution, we would also needs to backport b16942455193 ("ima: use the lsm policy update notifier")
That's fine, please just send a patch series to the stable list when needed.
thanks,
greg k-h
On 2022/12/9 18:27, Greg KH wrote:
On Fri, Dec 09, 2022 at 05:38:00PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 17:32, Guozihua (Scott) wrote:
On 2022/12/9 17:22, Greg KH wrote:
On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
On 2022/12/9 17:00, Greg KH wrote:
On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote: > On 2022/12/9 16:46, Greg KH wrote: >> On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote: >>> On 2022/12/9 15:12, Greg KH wrote: >>>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote: >>>>> Hi community. >>>>> >>>>> Previously our team reported a race condition in IMA relates to LSM based >>>>> rules which would case IMA to match files that should be filtered out under >>>>> normal condition. The issue was originally analyzed and fixed on mainstream. >>>>> The patch and the discussion could be found here: >>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ >>>>> >>>>> After that, we did a regression test on 4.19 LTS and the same issue arises. >>>>> Further analysis reveled that the issue is from a completely different >>>>> cause. >>>> >>>> What commit in the tree fixed this in newer kernels? Why can't we just >>>> backport that one to 4.19.y as well? >>>> >>>> thanks, >>>> >>>> greg k-h >>> >>> Hi Greg, >>> >>> The fix for mainline is now on linux-next, commit d57378d3aa4d ("ima: >>> Simplify ima_lsm_copy_rule") and c7423dbdbc9ece ("ima: Handle -ESTALE >>> returned by ima_filter_rule_match()"). However, these patches cannot be >>> picked directly into 4.19.y due to code difference. >> >> Ok, so it's much more than just 4.19 that's an issue here. And are >> those commits tagged for stable inclusion? > > Not actually, not on the commit itself.
That's not good. When they hit Linus's tree, please submit backports to the stable mailing list so that they can be picked up.
Thing is these commits cannot be simply backported to 4.19.y. Preceding patches are missing. How do we do backporting in this situation? Do we first backport the preceding patches? Or maybe we develop another solution for 4.19.y?
First they need to go to newer kernel trees, and then worry about 4.19. We never want anyone to upgrade to a newer kernel and have a regression.
Also, we can't do anything until they hit Linus's tree, as per the stable kernel rules.
Alright. We'll wait for these patches to be in Linus' tree. But should we stick to a backport from mainstream or we form a different solution for LTS?
We always want to have a normal backport of what is in Linus's tree if at all possible. Whenever we diverge from that, we almost always get it wrong and have to fix it up again later.
BTW, I have a look into it and if we are backporting mainstream's solution, we would also needs to backport b16942455193 ("ima: use the lsm policy update notifier")
That's fine, please just send a patch series to the stable list when needed.
thanks,
greg k-h
Thanks Greg.
Any thought from Mimi?
On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
The cause is that selinux_audit_rule_init() would set the rule (which is a second level pointer) to NULL immediately after called. The relevant codes are as shown:
security/selinux/ss/services.c:
int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) { struct selinux_state *state = &selinux_state; struct policydb *policydb = &state->ss->policydb; struct selinux_audit_rule *tmprule; struct role_datum *roledatum; struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; int rc = 0;
*rule = NULL;
*rule is set to NULL here, which means the rule on IMA side is also NULL.
if (!state->initialized) return -EOPNOTSUPP;
...
out: read_unlock(&state->ss->policy_rwlock);
if (rc) { selinux_audit_rule_free(tmprule); tmprule = NULL; } *rule = tmprule;
rule is updated at the end of the function.
return rc;
}
security/integrity/ima/ima_policy.c:
static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, const struct cred *cred, u32 secid, enum ima_hooks func, int mask) {... for (i = 0; i < MAX_LSM_RULES; i++) { int rc = 0; u32 osid; int retried = 0;
if (!rule->lsm[i].rule) continue;
Setting rule to NULL would lead to LSM based rule matching being skipped.
retry: switch (i) {
To solve this issue, there are multiple approaches we might take and I would like some input from the community.
The first proposed solution would be to change selinux_audit_rule_init(). Remove the set to NULL bit and update the rule pointer with cmpxchg.
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index a9f2bc8443bd..aa74b04ccaf7 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
struct selinux_audit_rule *orig = rule; int rc = 0;
*rule = NULL;
if (!state->initialized) return -EOPNOTSUPP;
@@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) tmprule = NULL; }
*rule = tmprule;
if (cmpxchg(rule, orig, tmprule) != orig)
selinux_audit_rule_free(tmprule);
return rc; }
This solution would be an easy fix, but might influence other modules calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, only auditfilter and IMA it seems). And it might be worth returning an error code such as -EAGAIN.
Or, we can access rules via RCU, similar to what we do on 5.10. This could means more code change and testing.
In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as needed. IMA waits for selinux_audit_rule_init() to complete and shouldn't see NULL, unless there is an SELinux failure. Before "fixing" the problem, what exactly is the problem?
thanks,
Mimi
On 2022/12/13 23:30, Mimi Zohar wrote:
On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
The cause is that selinux_audit_rule_init() would set the rule (which is a second level pointer) to NULL immediately after called. The relevant codes are as shown:
security/selinux/ss/services.c:
int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) { struct selinux_state *state = &selinux_state; struct policydb *policydb = &state->ss->policydb; struct selinux_audit_rule *tmprule; struct role_datum *roledatum; struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; int rc = 0;
*rule = NULL;
*rule is set to NULL here, which means the rule on IMA side is also NULL.
if (!state->initialized) return -EOPNOTSUPP;
...
out: read_unlock(&state->ss->policy_rwlock);
if (rc) { selinux_audit_rule_free(tmprule); tmprule = NULL; } *rule = tmprule;
rule is updated at the end of the function.
return rc;
}
security/integrity/ima/ima_policy.c:
static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, const struct cred *cred, u32 secid, enum ima_hooks func, int mask) {... for (i = 0; i < MAX_LSM_RULES; i++) { int rc = 0; u32 osid; int retried = 0;
if (!rule->lsm[i].rule) continue;
Setting rule to NULL would lead to LSM based rule matching being skipped.
retry: switch (i) {
To solve this issue, there are multiple approaches we might take and I would like some input from the community.
The first proposed solution would be to change selinux_audit_rule_init(). Remove the set to NULL bit and update the rule pointer with cmpxchg.
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index a9f2bc8443bd..aa74b04ccaf7 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
struct selinux_audit_rule *orig = rule; int rc = 0;
*rule = NULL;
if (!state->initialized) return -EOPNOTSUPP;
@@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) tmprule = NULL; }
*rule = tmprule;
if (cmpxchg(rule, orig, tmprule) != orig)
selinux_audit_rule_free(tmprule);
return rc; }
This solution would be an easy fix, but might influence other modules calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, only auditfilter and IMA it seems). And it might be worth returning an error code such as -EAGAIN.
Or, we can access rules via RCU, similar to what we do on 5.10. This could means more code change and testing.
In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as needed. IMA waits for selinux_audit_rule_init() to complete and shouldn't see NULL, unless there is an SELinux failure. Before "fixing" the problem, what exactly is the problem?
IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL LSM based rules in one go without using RCU, which would still allow other cores to access the rule being updated. And that's the issue.
An example scenario would be: CPU1 | CPU2 opened a file and starts | updating LSM based rules. | | opened a file and starts | matching rules. | set a LSM based rule to NULL. | access the same LSM based rule and | see that it's NULL.
In this situation, CPU 2 would recognize this rule as not LSM based and ignore the LSM part of the rule while matching.
On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
On 2022/12/13 23:30, Mimi Zohar wrote:
On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
The cause is that selinux_audit_rule_init() would set the rule (which is a second level pointer) to NULL immediately after called. The relevant codes are as shown:
security/selinux/ss/services.c:
int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) { struct selinux_state *state = &selinux_state; struct policydb *policydb = &state->ss->policydb; struct selinux_audit_rule *tmprule; struct role_datum *roledatum; struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; int rc = 0;
*rule = NULL;
*rule is set to NULL here, which means the rule on IMA side is also NULL.
if (!state->initialized) return -EOPNOTSUPP;
...
out: read_unlock(&state->ss->policy_rwlock);
if (rc) { selinux_audit_rule_free(tmprule); tmprule = NULL; } *rule = tmprule;
rule is updated at the end of the function.
return rc;
}
security/integrity/ima/ima_policy.c:
static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, const struct cred *cred, u32 secid, enum ima_hooks func, int mask) {... for (i = 0; i < MAX_LSM_RULES; i++) { int rc = 0; u32 osid; int retried = 0;
if (!rule->lsm[i].rule) continue;
Setting rule to NULL would lead to LSM based rule matching being skipped.
retry: switch (i) {
To solve this issue, there are multiple approaches we might take and I would like some input from the community.
The first proposed solution would be to change selinux_audit_rule_init(). Remove the set to NULL bit and update the rule pointer with cmpxchg.
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index a9f2bc8443bd..aa74b04ccaf7 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
struct selinux_audit_rule *orig = rule; int rc = 0;
*rule = NULL;
if (!state->initialized) return -EOPNOTSUPP;
@@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) tmprule = NULL; }
*rule = tmprule;
if (cmpxchg(rule, orig, tmprule) != orig)
selinux_audit_rule_free(tmprule);
return rc; }
This solution would be an easy fix, but might influence other modules calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, only auditfilter and IMA it seems). And it might be worth returning an error code such as -EAGAIN.
Or, we can access rules via RCU, similar to what we do on 5.10. This could means more code change and testing.
In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as needed. IMA waits for selinux_audit_rule_init() to complete and shouldn't see NULL, unless there is an SELinux failure. Before "fixing" the problem, what exactly is the problem?
IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL LSM based rules in one go without using RCU, which would still allow other cores to access the rule being updated. And that's the issue.
An example scenario would be: CPU1 | CPU2 opened a file and starts | updating LSM based rules. | | opened a file and starts | matching rules. | set a LSM based rule to NULL. | access the same LSM based rule and | see that it's NULL.
In this situation, CPU 2 would recognize this rule as not LSM based and ignore the LSM part of the rule while matching.
Would picking up just ima_lsm_update_rule(), without changing to the lsm policy update notifier, from upstream and calling it from ima_lsm_update_rules() resolve the RCU locking issue? Or are there other issues?
thanks,
Mimi
On 2022/12/14 20:19, Mimi Zohar wrote:
On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
On 2022/12/13 23:30, Mimi Zohar wrote:
On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
The cause is that selinux_audit_rule_init() would set the rule (which is a second level pointer) to NULL immediately after called. The relevant codes are as shown:
security/selinux/ss/services.c:
int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) { struct selinux_state *state = &selinux_state; struct policydb *policydb = &state->ss->policydb; struct selinux_audit_rule *tmprule; struct role_datum *roledatum; struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; int rc = 0;
*rule = NULL;
*rule is set to NULL here, which means the rule on IMA side is also NULL.
if (!state->initialized) return -EOPNOTSUPP;
...
out: read_unlock(&state->ss->policy_rwlock);
if (rc) { selinux_audit_rule_free(tmprule); tmprule = NULL; } *rule = tmprule;
rule is updated at the end of the function.
return rc;
}
security/integrity/ima/ima_policy.c:
static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, const struct cred *cred, u32 secid, enum ima_hooks func, int mask) {... for (i = 0; i < MAX_LSM_RULES; i++) { int rc = 0; u32 osid; int retried = 0;
if (!rule->lsm[i].rule) continue;
Setting rule to NULL would lead to LSM based rule matching being skipped.
retry: switch (i) {
To solve this issue, there are multiple approaches we might take and I would like some input from the community.
The first proposed solution would be to change selinux_audit_rule_init(). Remove the set to NULL bit and update the rule pointer with cmpxchg.
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index a9f2bc8443bd..aa74b04ccaf7 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
struct selinux_audit_rule *orig = rule; int rc = 0;
*rule = NULL;
if (!state->initialized) return -EOPNOTSUPP;
@@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) tmprule = NULL; }
*rule = tmprule;
if (cmpxchg(rule, orig, tmprule) != orig)
selinux_audit_rule_free(tmprule);
return rc; }
This solution would be an easy fix, but might influence other modules calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, only auditfilter and IMA it seems). And it might be worth returning an error code such as -EAGAIN.
Or, we can access rules via RCU, similar to what we do on 5.10. This could means more code change and testing.
In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as needed. IMA waits for selinux_audit_rule_init() to complete and shouldn't see NULL, unless there is an SELinux failure. Before "fixing" the problem, what exactly is the problem?
IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL LSM based rules in one go without using RCU, which would still allow other cores to access the rule being updated. And that's the issue.
An example scenario would be: CPU1 | CPU2 opened a file and starts | updating LSM based rules. | | opened a file and starts | matching rules. | set a LSM based rule to NULL. | access the same LSM based rule and | see that it's NULL.
In this situation, CPU 2 would recognize this rule as not LSM based and ignore the LSM part of the rule while matching.
Would picking up just ima_lsm_update_rule(), without changing to the lsm policy update notifier, from upstream and calling it from ima_lsm_update_rules() resolve the RCU locking issue? Or are there other issues?
Hi Mimi,
That should resolve the issue just fine. However, that'd mean having a customized ima_lsm_update_rules as well as a customized ima_lsm_update_rule functions on 4.19.y, potentially decrease the maintainability. The customization of the two functions mentioned above would be to ripoff the RCU part so that we can leave the other rule-list access untouched.
On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
On 2022/12/14 20:19, Mimi Zohar wrote:
On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
On 2022/12/13 23:30, Mimi Zohar wrote:
On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
The cause is that selinux_audit_rule_init() would set the rule (which is a second level pointer) to NULL immediately after called. The relevant codes are as shown:
security/selinux/ss/services.c:
int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) { struct selinux_state *state = &selinux_state; struct policydb *policydb = &state->ss->policydb; struct selinux_audit_rule *tmprule; struct role_datum *roledatum; struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; int rc = 0;
*rule = NULL;
*rule is set to NULL here, which means the rule on IMA side is also NULL.
if (!state->initialized) return -EOPNOTSUPP;
...
out: read_unlock(&state->ss->policy_rwlock);
if (rc) { selinux_audit_rule_free(tmprule); tmprule = NULL; } *rule = tmprule;
rule is updated at the end of the function.
return rc;
}
security/integrity/ima/ima_policy.c:
static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, const struct cred *cred, u32 secid, enum ima_hooks func, int mask) {... for (i = 0; i < MAX_LSM_RULES; i++) { int rc = 0; u32 osid; int retried = 0;
if (!rule->lsm[i].rule) continue;
Setting rule to NULL would lead to LSM based rule matching being skipped.
retry: switch (i) {
To solve this issue, there are multiple approaches we might take and I would like some input from the community.
The first proposed solution would be to change selinux_audit_rule_init(). Remove the set to NULL bit and update the rule pointer with cmpxchg.
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index a9f2bc8443bd..aa74b04ccaf7 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) struct type_datum *typedatum; struct user_datum *userdatum; struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
struct selinux_audit_rule *orig = rule; int rc = 0;
*rule = NULL;
if (!state->initialized) return -EOPNOTSUPP;
@@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) tmprule = NULL; }
*rule = tmprule;
if (cmpxchg(rule, orig, tmprule) != orig)
selinux_audit_rule_free(tmprule);
return rc; }
This solution would be an easy fix, but might influence other modules calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, only auditfilter and IMA it seems). And it might be worth returning an error code such as -EAGAIN.
Or, we can access rules via RCU, similar to what we do on 5.10. This could means more code change and testing.
In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as needed. IMA waits for selinux_audit_rule_init() to complete and shouldn't see NULL, unless there is an SELinux failure. Before "fixing" the problem, what exactly is the problem?
IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL LSM based rules in one go without using RCU, which would still allow other cores to access the rule being updated. And that's the issue.
An example scenario would be: CPU1 | CPU2 opened a file and starts | updating LSM based rules. | | opened a file and starts | matching rules. | set a LSM based rule to NULL. | access the same LSM based rule and | see that it's NULL.
In this situation, CPU 2 would recognize this rule as not LSM based and ignore the LSM part of the rule while matching.
Would picking up just ima_lsm_update_rule(), without changing to the lsm policy update notifier, from upstream and calling it from ima_lsm_update_rules() resolve the RCU locking issue? Or are there other issues?
Hi Mimi,
That should resolve the issue just fine. However, that'd mean having a customized ima_lsm_update_rules as well as a customized ima_lsm_update_rule functions on 4.19.y, potentially decrease the maintainability. The customization of the two functions mentioned above would be to ripoff the RCU part so that we can leave the other rule-list access untouched.
Hi Scott,
Neither do we want to backport new functionality. Perhaps it is only a subset of commit b16942455193 ("ima: use the lsm policy update notifier").
Although your suggested solution of using "cmpxchg" isn't needed in recent kernels, it wouldn't hurt either, right? Assuming that Paul would be willing to accept it, that could be another option.
On 2022/12/15 18:49, Mimi Zohar wrote:
On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
On 2022/12/14 20:19, Mimi Zohar wrote:
On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
On 2022/12/13 23:30, Mimi Zohar wrote:
On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
Hi community.
Previously our team reported a race condition in IMA relates to LSM based rules which would case IMA to match files that should be filtered out under normal condition. The issue was originally analyzed and fixed on mainstream. The patch and the discussion could be found here: https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue arises. Further analysis reveled that the issue is from a completely different cause.
The cause is that selinux_audit_rule_init() would set the rule (which is a second level pointer) to NULL immediately after called. The relevant codes are as shown:
security/selinux/ss/services.c: > int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) > { > struct selinux_state *state = &selinux_state; > struct policydb *policydb = &state->ss->policydb; > struct selinux_audit_rule *tmprule; > struct role_datum *roledatum; > struct type_datum *typedatum; > struct user_datum *userdatum; > struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; > int rc = 0; > > *rule = NULL; *rule is set to NULL here, which means the rule on IMA side is also NULL. > > if (!state->initialized) > return -EOPNOTSUPP; ... > out: > read_unlock(&state->ss->policy_rwlock); > > if (rc) { > selinux_audit_rule_free(tmprule); > tmprule = NULL; > } > > *rule = tmprule; rule is updated at the end of the function. > > return rc; > }
security/integrity/ima/ima_policy.c: > static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, > const struct cred *cred, u32 secid, > enum ima_hooks func, int mask) > {... > for (i = 0; i < MAX_LSM_RULES; i++) { > int rc = 0; > u32 osid; > int retried = 0; > > if (!rule->lsm[i].rule) > continue; Setting rule to NULL would lead to LSM based rule matching being skipped. > retry: > switch (i) {
To solve this issue, there are multiple approaches we might take and I would like some input from the community.
The first proposed solution would be to change selinux_audit_rule_init(). Remove the set to NULL bit and update the rule pointer with cmpxchg.
> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c > index a9f2bc8443bd..aa74b04ccaf7 100644 > --- a/security/selinux/ss/services.c > +++ b/security/selinux/ss/services.c > @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) > struct type_datum *typedatum; > struct user_datum *userdatum; > struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; > + struct selinux_audit_rule *orig = rule; > int rc = 0; > > - *rule = NULL; > - > if (!state->initialized) > return -EOPNOTSUPP; > > @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) > tmprule = NULL; > } > > - *rule = tmprule; > + if (cmpxchg(rule, orig, tmprule) != orig) > + selinux_audit_rule_free(tmprule); > > return rc; > }
This solution would be an easy fix, but might influence other modules calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, only auditfilter and IMA it seems). And it might be worth returning an error code such as -EAGAIN.
Or, we can access rules via RCU, similar to what we do on 5.10. This could means more code change and testing.
In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as needed. IMA waits for selinux_audit_rule_init() to complete and shouldn't see NULL, unless there is an SELinux failure. Before "fixing" the problem, what exactly is the problem?
IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL LSM based rules in one go without using RCU, which would still allow other cores to access the rule being updated. And that's the issue.
An example scenario would be: CPU1 | CPU2 opened a file and starts | updating LSM based rules. | | opened a file and starts | matching rules. | set a LSM based rule to NULL. | access the same LSM based rule and | see that it's NULL.
In this situation, CPU 2 would recognize this rule as not LSM based and ignore the LSM part of the rule while matching.
Would picking up just ima_lsm_update_rule(), without changing to the lsm policy update notifier, from upstream and calling it from ima_lsm_update_rules() resolve the RCU locking issue? Or are there other issues?
Hi Mimi,
That should resolve the issue just fine. However, that'd mean having a customized ima_lsm_update_rules as well as a customized ima_lsm_update_rule functions on 4.19.y, potentially decrease the maintainability. The customization of the two functions mentioned above would be to ripoff the RCU part so that we can leave the other rule-list access untouched.
Hi Scott,
Neither do we want to backport new functionality. Perhaps it is only a subset of commit b16942455193 ("ima: use the lsm policy update notifier").
Yes we are able to backport part of this commit to get a mainline-like behavior which would solve the issue at hand as well. However from a maintenance standpoint it might be better to form a different commit and not re-use the commit message from mainline commit.
Although your suggested solution of using "cmpxchg" isn't needed in recent kernels, it wouldn't hurt either, right? Assuming that Paul would be willing to accept it, that could be another option.
The cmpxchg part is merely to avoid multiple updates on the same rule item. However I am not so sure why SELinux would set the rule to NULL. My guess is that SELinux is trying to stop others from using an invalidated rule?
Would Paul be able to provide some insight? as well as some Comment on the proposed solution?
On Thu, 2022-12-15 at 21:15 +0800, Guozihua (Scott) wrote:
On 2022/12/15 18:49, Mimi Zohar wrote:
On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
On 2022/12/14 20:19, Mimi Zohar wrote:
On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
On 2022/12/13 23:30, Mimi Zohar wrote:
On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote: > Hi community. > > Previously our team reported a race condition in IMA relates to LSM > based rules which would case IMA to match files that should be filtered > out under normal condition. The issue was originally analyzed and fixed > on mainstream. The patch and the discussion could be found here: > https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ > > After that, we did a regression test on 4.19 LTS and the same issue > arises. Further analysis reveled that the issue is from a completely > different cause. > > The cause is that selinux_audit_rule_init() would set the rule (which is > a second level pointer) to NULL immediately after called. The relevant > codes are as shown: > > security/selinux/ss/services.c: >> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >> { >> struct selinux_state *state = &selinux_state; >> struct policydb *policydb = &state->ss->policydb; >> struct selinux_audit_rule *tmprule; >> struct role_datum *roledatum; >> struct type_datum *typedatum; >> struct user_datum *userdatum; >> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; >> int rc = 0; >> >> *rule = NULL; > *rule is set to NULL here, which means the rule on IMA side is also NULL. >> >> if (!state->initialized) >> return -EOPNOTSUPP; > ... >> out: >> read_unlock(&state->ss->policy_rwlock); >> >> if (rc) { >> selinux_audit_rule_free(tmprule); >> tmprule = NULL; >> } >> >> *rule = tmprule; > rule is updated at the end of the function. >> >> return rc; >> } > > security/integrity/ima/ima_policy.c: >> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, >> const struct cred *cred, u32 secid, >> enum ima_hooks func, int mask) >> {... >> for (i = 0; i < MAX_LSM_RULES; i++) { >> int rc = 0; >> u32 osid; >> int retried = 0; >> >> if (!rule->lsm[i].rule) >> continue; > Setting rule to NULL would lead to LSM based rule matching being skipped. >> retry: >> switch (i) { > > To solve this issue, there are multiple approaches we might take and I > would like some input from the community. > > The first proposed solution would be to change > selinux_audit_rule_init(). Remove the set to NULL bit and update the > rule pointer with cmpxchg. > >> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c >> index a9f2bc8443bd..aa74b04ccaf7 100644 >> --- a/security/selinux/ss/services.c >> +++ b/security/selinux/ss/services.c >> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >> struct type_datum *typedatum; >> struct user_datum *userdatum; >> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; >> + struct selinux_audit_rule *orig = rule; >> int rc = 0; >> >> - *rule = NULL; >> - >> if (!state->initialized) >> return -EOPNOTSUPP; >> >> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >> tmprule = NULL; >> } >> >> - *rule = tmprule; >> + if (cmpxchg(rule, orig, tmprule) != orig) >> + selinux_audit_rule_free(tmprule); >> >> return rc; >> } > > This solution would be an easy fix, but might influence other modules > calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, > only auditfilter and IMA it seems). And it might be worth returning an > error code such as -EAGAIN. > > Or, we can access rules via RCU, similar to what we do on 5.10. This > could means more code change and testing.
In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as needed. IMA waits for selinux_audit_rule_init() to complete and shouldn't see NULL, unless there is an SELinux failure. Before "fixing" the problem, what exactly is the problem?
IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL LSM based rules in one go without using RCU, which would still allow other cores to access the rule being updated. And that's the issue.
An example scenario would be: CPU1 | CPU2 opened a file and starts | updating LSM based rules. | | opened a file and starts | matching rules. | set a LSM based rule to NULL. | access the same LSM based rule and | see that it's NULL.
In this situation, CPU 2 would recognize this rule as not LSM based and ignore the LSM part of the rule while matching.
Would picking up just ima_lsm_update_rule(), without changing to the lsm policy update notifier, from upstream and calling it from ima_lsm_update_rules() resolve the RCU locking issue? Or are there other issues?
Hi Mimi,
That should resolve the issue just fine. However, that'd mean having a customized ima_lsm_update_rules as well as a customized ima_lsm_update_rule functions on 4.19.y, potentially decrease the maintainability. The customization of the two functions mentioned above would be to ripoff the RCU part so that we can leave the other rule-list access untouched.
Hi Scott,
Neither do we want to backport new functionality. Perhaps it is only a subset of commit b16942455193 ("ima: use the lsm policy update notifier").
Yes we are able to backport part of this commit to get a mainline-like behavior which would solve the issue at hand as well. However from a maintenance standpoint it might be better to form a different commit and not re-use the commit message from mainline commit.
I assume that is fine, but cherry-pick the original commit with the -x option, so there is a correlation to the upstream commit. The patch description would mention that the patch is a partial backport.
thanks,
Mimi
Although your suggested solution of using "cmpxchg" isn't needed in recent kernels, it wouldn't hurt either, right? Assuming that Paul would be willing to accept it, that could be another option.
The cmpxchg part is merely to avoid multiple updates on the same rule item. However I am not so sure why SELinux would set the rule to NULL. My guess is that SELinux is trying to stop others from using an invalidated rule?
Would Paul be able to provide some insight? as well as some Comment on the proposed solution?
On Thu, Dec 15, 2022 at 9:30 AM Mimi Zohar zohar@linux.ibm.com wrote:
On Thu, 2022-12-15 at 21:15 +0800, Guozihua (Scott) wrote:
On 2022/12/15 18:49, Mimi Zohar wrote:
On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
On 2022/12/14 20:19, Mimi Zohar wrote:
On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
On 2022/12/13 23:30, Mimi Zohar wrote: > On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote: >> Hi community. >> >> Previously our team reported a race condition in IMA relates to LSM >> based rules which would case IMA to match files that should be filtered >> out under normal condition. The issue was originally analyzed and fixed >> on mainstream. The patch and the discussion could be found here: >> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ >> >> After that, we did a regression test on 4.19 LTS and the same issue >> arises. Further analysis reveled that the issue is from a completely >> different cause. >> >> The cause is that selinux_audit_rule_init() would set the rule (which is >> a second level pointer) to NULL immediately after called. The relevant >> codes are as shown: >> >> security/selinux/ss/services.c: >>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>> { >>> struct selinux_state *state = &selinux_state; >>> struct policydb *policydb = &state->ss->policydb; >>> struct selinux_audit_rule *tmprule; >>> struct role_datum *roledatum; >>> struct type_datum *typedatum; >>> struct user_datum *userdatum; >>> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; >>> int rc = 0; >>> >>> *rule = NULL; >> *rule is set to NULL here, which means the rule on IMA side is also NULL. >>> >>> if (!state->initialized) >>> return -EOPNOTSUPP; >> ... >>> out: >>> read_unlock(&state->ss->policy_rwlock); >>> >>> if (rc) { >>> selinux_audit_rule_free(tmprule); >>> tmprule = NULL; >>> } >>> >>> *rule = tmprule; >> rule is updated at the end of the function. >>> >>> return rc; >>> } >> >> security/integrity/ima/ima_policy.c: >>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, >>> const struct cred *cred, u32 secid, >>> enum ima_hooks func, int mask) >>> {... >>> for (i = 0; i < MAX_LSM_RULES; i++) { >>> int rc = 0; >>> u32 osid; >>> int retried = 0; >>> >>> if (!rule->lsm[i].rule) >>> continue; >> Setting rule to NULL would lead to LSM based rule matching being skipped. >>> retry: >>> switch (i) { >> >> To solve this issue, there are multiple approaches we might take and I >> would like some input from the community. >> >> The first proposed solution would be to change >> selinux_audit_rule_init(). Remove the set to NULL bit and update the >> rule pointer with cmpxchg. >> >>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c >>> index a9f2bc8443bd..aa74b04ccaf7 100644 >>> --- a/security/selinux/ss/services.c >>> +++ b/security/selinux/ss/services.c >>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>> struct type_datum *typedatum; >>> struct user_datum *userdatum; >>> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; >>> + struct selinux_audit_rule *orig = rule; >>> int rc = 0; >>> >>> - *rule = NULL; >>> - >>> if (!state->initialized) >>> return -EOPNOTSUPP; >>> >>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>> tmprule = NULL; >>> } >>> >>> - *rule = tmprule; >>> + if (cmpxchg(rule, orig, tmprule) != orig) >>> + selinux_audit_rule_free(tmprule); >>> >>> return rc; >>> } >> >> This solution would be an easy fix, but might influence other modules >> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, >> only auditfilter and IMA it seems). And it might be worth returning an >> error code such as -EAGAIN. >> >> Or, we can access rules via RCU, similar to what we do on 5.10. This >> could means more code change and testing. > > In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as > needed. IMA waits for selinux_audit_rule_init() to complete and > shouldn't see NULL, unless there is an SELinux failure. Before > "fixing" the problem, what exactly is the problem?
IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL LSM based rules in one go without using RCU, which would still allow other cores to access the rule being updated. And that's the issue.
An example scenario would be: CPU1 | CPU2 opened a file and starts | updating LSM based rules. | | opened a file and starts | matching rules. | set a LSM based rule to NULL. | access the same LSM based rule and | see that it's NULL.
In this situation, CPU 2 would recognize this rule as not LSM based and ignore the LSM part of the rule while matching.
Would picking up just ima_lsm_update_rule(), without changing to the lsm policy update notifier, from upstream and calling it from ima_lsm_update_rules() resolve the RCU locking issue? Or are there other issues?
Hi Mimi,
That should resolve the issue just fine. However, that'd mean having a customized ima_lsm_update_rules as well as a customized ima_lsm_update_rule functions on 4.19.y, potentially decrease the maintainability. The customization of the two functions mentioned above would be to ripoff the RCU part so that we can leave the other rule-list access untouched.
Hi Scott,
Neither do we want to backport new functionality. Perhaps it is only a subset of commit b16942455193 ("ima: use the lsm policy update notifier").
Yes we are able to backport part of this commit to get a mainline-like behavior which would solve the issue at hand as well. However from a maintenance standpoint it might be better to form a different commit and not re-use the commit message from mainline commit.
I assume that is fine, but cherry-pick the original commit with the -x option, so there is a correlation to the upstream commit. The patch description would mention that the patch is a partial backport.
FWIW, if the changes in the backport are significant I tend to use the following approach as it captures both the original commit as well as the details on what changes were made and why.
ima: use the lsm policy update notifier
Really good explanation of what changes were necessary from the original patch, including why they were necessary in the first place.
commit b169424551930a9325f700f502802f4d515194e5 Author: Janne Karhunen janne.karhunen@gmail.com Date: Fri Jun 14 15:20:15 2019 +0300
ima: use the lsm policy update notifier
Don't do lazy policy updates while running the rule matching, run the updates as they happen.
Depends on commit f242064c5df3 ("LSM: switch to blocking policy update notifiers")
Signed-off-by: Janne Karhunen janne.karhunen@gmail.com Signed-off-by: Mimi Zohar zohar@linux.ibm.com
Although your suggested solution of using "cmpxchg" isn't needed in recent kernels, it wouldn't hurt either, right? Assuming that Paul would be willing to accept it, that could be another option.
The cmpxchg part is merely to avoid multiple updates on the same rule item. However I am not so sure why SELinux would set the rule to NULL. My guess is that SELinux is trying to stop others from using an invalidated rule?
Would Paul be able to provide some insight? as well as some Comment on the proposed solution?
I'm not comfortable with what might happen with a cmpxchg assignment when multiple threads are in a related RCU critical section; I'm assuming they would see the new value immediately (it is atomic, right?), which I imagine could cause some consistency problems. However, if someone who understands the intersection of cmpxchg/RCU better than I do can assure me this isn't a problem we can consider it.
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?
On 2022/12/16 5:04, Paul Moore wrote:
On Thu, Dec 15, 2022 at 9:30 AM Mimi Zohar zohar@linux.ibm.com wrote:
On Thu, 2022-12-15 at 21:15 +0800, Guozihua (Scott) wrote:
On 2022/12/15 18:49, Mimi Zohar wrote:
On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
On 2022/12/14 20:19, Mimi Zohar wrote:
On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote: > On 2022/12/13 23:30, Mimi Zohar wrote: >> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote: >>> Hi community. >>> >>> Previously our team reported a race condition in IMA relates to LSM >>> based rules which would case IMA to match files that should be filtered >>> out under normal condition. The issue was originally analyzed and fixed >>> on mainstream. The patch and the discussion could be found here: >>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/ >>> >>> After that, we did a regression test on 4.19 LTS and the same issue >>> arises. Further analysis reveled that the issue is from a completely >>> different cause. >>> >>> The cause is that selinux_audit_rule_init() would set the rule (which is >>> a second level pointer) to NULL immediately after called. The relevant >>> codes are as shown: >>> >>> security/selinux/ss/services.c: >>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>>> { >>>> struct selinux_state *state = &selinux_state; >>>> struct policydb *policydb = &state->ss->policydb; >>>> struct selinux_audit_rule *tmprule; >>>> struct role_datum *roledatum; >>>> struct type_datum *typedatum; >>>> struct user_datum *userdatum; >>>> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; >>>> int rc = 0; >>>> >>>> *rule = NULL; >>> *rule is set to NULL here, which means the rule on IMA side is also NULL. >>>> >>>> if (!state->initialized) >>>> return -EOPNOTSUPP; >>> ... >>>> out: >>>> read_unlock(&state->ss->policy_rwlock); >>>> >>>> if (rc) { >>>> selinux_audit_rule_free(tmprule); >>>> tmprule = NULL; >>>> } >>>> >>>> *rule = tmprule; >>> rule is updated at the end of the function. >>>> >>>> return rc; >>>> } >>> >>> security/integrity/ima/ima_policy.c: >>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, >>>> const struct cred *cred, u32 secid, >>>> enum ima_hooks func, int mask) >>>> {... >>>> for (i = 0; i < MAX_LSM_RULES; i++) { >>>> int rc = 0; >>>> u32 osid; >>>> int retried = 0; >>>> >>>> if (!rule->lsm[i].rule) >>>> continue; >>> Setting rule to NULL would lead to LSM based rule matching being skipped. >>>> retry: >>>> switch (i) { >>> >>> To solve this issue, there are multiple approaches we might take and I >>> would like some input from the community. >>> >>> The first proposed solution would be to change >>> selinux_audit_rule_init(). Remove the set to NULL bit and update the >>> rule pointer with cmpxchg. >>> >>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c >>>> index a9f2bc8443bd..aa74b04ccaf7 100644 >>>> --- a/security/selinux/ss/services.c >>>> +++ b/security/selinux/ss/services.c >>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>>> struct type_datum *typedatum; >>>> struct user_datum *userdatum; >>>> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule; >>>> + struct selinux_audit_rule *orig = rule; >>>> int rc = 0; >>>> >>>> - *rule = NULL; >>>> - >>>> if (!state->initialized) >>>> return -EOPNOTSUPP; >>>> >>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule) >>>> tmprule = NULL; >>>> } >>>> >>>> - *rule = tmprule; >>>> + if (cmpxchg(rule, orig, tmprule) != orig) >>>> + selinux_audit_rule_free(tmprule); >>>> >>>> return rc; >>>> } >>> >>> This solution would be an easy fix, but might influence other modules >>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, >>> only auditfilter and IMA it seems). And it might be worth returning an >>> error code such as -EAGAIN. >>> >>> Or, we can access rules via RCU, similar to what we do on 5.10. This >>> could means more code change and testing. >> >> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as >> needed. IMA waits for selinux_audit_rule_init() to complete and >> shouldn't see NULL, unless there is an SELinux failure. Before >> "fixing" the problem, what exactly is the problem? > > IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL > LSM based rules in one go without using RCU, which would still allow > other cores to access the rule being updated. And that's the issue. > > An example scenario would be: > CPU1 | CPU2 > opened a file and starts | > updating LSM based rules. | > | opened a file and starts > | matching rules. > | > set a LSM based rule to NULL. | access the same LSM based rule and > | see that it's NULL. > > In this situation, CPU 2 would recognize this rule as not LSM based and > ignore the LSM part of the rule while matching.
Would picking up just ima_lsm_update_rule(), without changing to the lsm policy update notifier, from upstream and calling it from ima_lsm_update_rules() resolve the RCU locking issue? Or are there other issues?
Hi Mimi,
That should resolve the issue just fine. However, that'd mean having a customized ima_lsm_update_rules as well as a customized ima_lsm_update_rule functions on 4.19.y, potentially decrease the maintainability. The customization of the two functions mentioned above would be to ripoff the RCU part so that we can leave the other rule-list access untouched.
Hi Scott,
Neither do we want to backport new functionality. Perhaps it is only a subset of commit b16942455193 ("ima: use the lsm policy update notifier").
Yes we are able to backport part of this commit to get a mainline-like behavior which would solve the issue at hand as well. However from a maintenance standpoint it might be better to form a different commit and not re-use the commit message from mainline commit.
I assume that is fine, but cherry-pick the original commit with the -x option, so there is a correlation to the upstream commit. The patch description would mention that the patch is a partial backport.
FWIW, if the changes in the backport are significant I tend to use the following approach as it captures both the original commit as well as the details on what changes were made and why.
ima: use the lsm policy update notifier
Really good explanation of what changes were necessary from the original patch, including why they were necessary in the first place.
commit b169424551930a9325f700f502802f4d515194e5 Author: Janne Karhunen janne.karhunen@gmail.com Date: Fri Jun 14 15:20:15 2019 +0300
ima: use the lsm policy update notifier
Don't do lazy policy updates while running the rule matching, run the updates as they happen.
Depends on commit f242064c5df3 ("LSM: switch to blocking policy update notifiers")
Signed-off-by: Janne Karhunen janne.karhunen@gmail.com Signed-off-by: Mimi Zohar zohar@linux.ibm.com
Thanks for the suggestion Mimi and Paul.
Although your suggested solution of using "cmpxchg" isn't needed in recent kernels, it wouldn't hurt either, right? Assuming that Paul would be willing to accept it, that could be another option.
The cmpxchg part is merely to avoid multiple updates on the same rule item. However I am not so sure why SELinux would set the rule to NULL. My guess is that SELinux is trying to stop others from using an invalidated rule?
Would Paul be able to provide some insight? as well as some Comment on the proposed solution?
I'm not comfortable with what might happen with a cmpxchg assignment when multiple threads are in a related RCU critical section; I'm assuming they would see the new value immediately (it is atomic, right?), which I imagine could cause some consistency problems. However, if someone who understands the intersection of cmpxchg/RCU better than I do can assure me this isn't a problem we can consider it.
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?
It might not be that bad, I'll try to post a version next Monday.
On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote:
On 2022/12/16 5:04, Paul Moore wrote:
...
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?
It might not be that bad, I'll try to post a version next Monday.
Thanks for giving it a shot.
On 2022/12/16 11:04, Paul Moore wrote:
On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote:
On 2022/12/16 5:04, Paul Moore wrote:
...
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?
It might not be that bad, I'll try to post a version next Monday.
Thanks for giving it a shot.
When I am trying a partial backport of b16942455193 ("ima: use the lsm policy update notifier"), I took a closer look into it and if we rip off the RCU and the notifier part, there would be a potential UAF issue when multiple processes are calling ima_lsm_update_rule() and ima_match_rules() at the same time. ima_lsm_update_rule() would free the old rule if the new rule is successfully copied and initialized, leading to ima_match_rules() accessing a freed rule.
To reserve the mainline solution, we would have to either introduce RCU for rule access, which would work better with notifier mechanism or the same rule would be updated multiple times, or we would have to introduce a lock for LSM based rule update.
On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
On 2022/12/16 11:04, Paul Moore wrote:
On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote:
On 2022/12/16 5:04, Paul Moore wrote:
...
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?
It might not be that bad, I'll try to post a version next Monday.
Thanks for giving it a shot.
When I am trying a partial backport of b16942455193 ("ima: use the lsm policy update notifier"), I took a closer look into it and if we rip off the RCU and the notifier part, there would be a potential UAF issue when multiple processes are calling ima_lsm_update_rule() and ima_match_rules() at the same time. ima_lsm_update_rule() would free the old rule if the new rule is successfully copied and initialized, leading to ima_match_rules() accessing a freed rule.
To reserve the mainline solution, we would have to either introduce RCU for rule access, which would work better with notifier mechanism or the same rule would be updated multiple times, or we would have to introduce a lock for LSM based rule update.
Even with the RCU changes, the rules will be updated multiple times. With your "ima: Handle -ESTALE returned by ima_filter_rule_match()" patch, upstream makes a single local copy of the rule to avoid updating it multiple times. Without the notifier, it's updating all the rules.
Perhaps an atomic variable to detect if the rules are already being updated would suffice. If the atomic variable is set, make a single local copy of the rule.
On 2022/12/19 21:11, Mimi Zohar wrote:
On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
On 2022/12/16 11:04, Paul Moore wrote:
On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote:
On 2022/12/16 5:04, Paul Moore wrote:
...
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?
It might not be that bad, I'll try to post a version next Monday.
Thanks for giving it a shot.
When I am trying a partial backport of b16942455193 ("ima: use the lsm policy update notifier"), I took a closer look into it and if we rip off the RCU and the notifier part, there would be a potential UAF issue when multiple processes are calling ima_lsm_update_rule() and ima_match_rules() at the same time. ima_lsm_update_rule() would free the old rule if the new rule is successfully copied and initialized, leading to ima_match_rules() accessing a freed rule.
To reserve the mainline solution, we would have to either introduce RCU for rule access, which would work better with notifier mechanism or the same rule would be updated multiple times, or we would have to introduce a lock for LSM based rule update.
Even with the RCU changes, the rules will be updated multiple times. With your "ima: Handle -ESTALE returned by ima_filter_rule_match()" patch, upstream makes a single local copy of the rule to avoid updating it multiple times. Without the notifier, it's updating all the rules.
That's true. However, in the mainline solution, we are only making a local copy of the rule. In 4.19, because of the lazy update mechanism, we are replacing the rule on the rule list multiple times and is trying to free the original rule.
Perhaps an atomic variable to detect if the rules are already being updated would suffice. If the atomic variable is set, make a single local copy of the rule.
That should do it. I'll send a patch set soon.
On 2022/12/20 9:11, Guozihua (Scott) wrote:
On 2022/12/19 21:11, Mimi Zohar wrote:
On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
On 2022/12/16 11:04, Paul Moore wrote:
On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote:
On 2022/12/16 5:04, Paul Moore wrote:
...
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?
It might not be that bad, I'll try to post a version next Monday.
Thanks for giving it a shot.
When I am trying a partial backport of b16942455193 ("ima: use the lsm policy update notifier"), I took a closer look into it and if we rip off the RCU and the notifier part, there would be a potential UAF issue when multiple processes are calling ima_lsm_update_rule() and ima_match_rules() at the same time. ima_lsm_update_rule() would free the old rule if the new rule is successfully copied and initialized, leading to ima_match_rules() accessing a freed rule.
To reserve the mainline solution, we would have to either introduce RCU for rule access, which would work better with notifier mechanism or the same rule would be updated multiple times, or we would have to introduce a lock for LSM based rule update.
Even with the RCU changes, the rules will be updated multiple times. With your "ima: Handle -ESTALE returned by ima_filter_rule_match()" patch, upstream makes a single local copy of the rule to avoid updating it multiple times. Without the notifier, it's updating all the rules.
That's true. However, in the mainline solution, we are only making a local copy of the rule. In 4.19, because of the lazy update mechanism, we are replacing the rule on the rule list multiple times and is trying to free the original rule.
Perhaps an atomic variable to detect if the rules are already being updated would suffice. If the atomic variable is set, make a single local copy of the rule.
That should do it. I'll send a patch set soon.
Including Huaxin Lu in the loop. Sorry for forgotten about it for quite some time.
I tried the backported solution, it seems that it's causing RCU stall. It seems on 4.19.y IMA is already accessing rules through RCU. Still debugging it.
The backported patches are as below
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c index 0819b7600649..20349ef6383b 100644 --- a/security/integrity/ima/ima_policy.c +++ b/security/integrity/ima/ima_policy.c @@ -353,6 +353,8 @@ static void ima_lsm_update_rules(void) } } +static bool rule_updating = false;
/**
- ima_match_rules - determine whether an inode matches the measure rule.
- @rule: a pointer to a rule
@@ -369,6 +371,9 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, enum ima_hooks func, int mask) { int i;
bool result = false;
struct ima_rule_entry *lsm_rule = rule;
bool rule_reinitialized = false;
if ((rule->flags & IMA_FUNC) && (rule->func != func && func != POST_SETATTR)) @@ -408,7 +413,7 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, u32 osid; int retried = 0;
if (!rule->lsm[i].rule)
if (!lsm_rule->lsm[i].rule) continue;
retry: switch (i) { @@ -417,31 +422,49 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode, case LSM_OBJ_TYPE: security_inode_getsecid(inode, &osid); rc = security_filter_rule_match(osid,
rule->lsm[i].type,
lsm_rule->lsm[i].type, Audit_equal,
rule->lsm[i].rule,
lsm_rule->lsm[i].rule, NULL); break; case LSM_SUBJ_USER: case LSM_SUBJ_ROLE: case LSM_SUBJ_TYPE: rc = security_filter_rule_match(secid,
rule->lsm[i].type,
lsm_rule->lsm[i].type, Audit_equal,
rule->lsm[i].rule,
lsm_rule->lsm[i].rule, NULL); default: break; }> if ((rc < 0) && (!retried)) { retried = 1;
ima_lsm_update_rules();
goto retry;
if (READ_ONCE(rule_updating)) {
lsm_rule = ima_lsm_copy_rule(rule);
if (lsm_rule) {
rule_reinitialized = true;
goto retry;
}
} else {
WRITE_ONCE(rule_updating, true);
ima_lsm_update_rules();
WRITE_ONCE(rule_updating, false);
goto retry;
}
}
if (!rc) {
result = false;
goto out; }
if (!rc)
return false; }
return true;
result = true;
+out:
if (rule_reinitialized) {
ima_lsm_free_rule(lsm_rule);
}
return result;
} diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c index b2dadff3626b..0819b7600649 100644 --- a/security/integrity/ima/ima_policy.c +++ b/security/integrity/ima/ima_policy.c @@ -256,27 +256,99 @@ static void ima_free_rule(struct ima_rule_entry *entry) kfree(entry); } +static void ima_lsm_free_rule(struct ima_rule_entry *entry) +{
int i;
for (i = 0; i < MAX_LSM_RULES; i++) {
kfree(entry->lsm[i].rule);
kfree(entry->lsm[i].args_p);
}
kfree(entry);
+}
+static struct ima_rule_entry *ima_lsm_copy_rule(struct ima_rule_entry *entry) +{
struct ima_rule_entry *nentry;
int i, result;
nentry = kmalloc(sizeof(*nentry), GFP_KERNEL);
if (!nentry)
return NULL;
/*
* Immutable elements are copied over as pointers and data; only
* lsm rules can change
*/
memcpy(nentry, entry, sizeof(*nentry));
memset(nentry->lsm, 0, FIELD_SIZEOF(struct ima_rule_entry, lsm));
for (i = 0; i < MAX_LSM_RULES; i++) {
if (!entry->lsm[i].rule)
continue;
nentry->lsm[i].type = entry->lsm[i].type;
nentry->lsm[i].args_p = kstrdup(entry->lsm[i].args_p,
GFP_KERNEL);
if (!nentry->lsm[i].args_p)
goto out_err;
result = security_filter_rule_init(nentry->lsm[i].type,
Audit_equal,
nentry->lsm[i].args_p,
&nentry->lsm[i].rule);
if (result == -EINVAL)
pr_warn("ima: rule for LSM \'%d\' is undefined\n",
entry->lsm[i].type);
}
return nentry;
+out_err:
ima_lsm_free_rule(nentry);
return NULL;
+}
+static int ima_lsm_update_rule(struct ima_rule_entry *entry) +{
struct ima_rule_entry *nentry;> +
nentry = ima_lsm_copy_rule(entry);
if (!nentry)
return -ENOMEM;
list_replace_rcu(&entry->list, &nentry->list);
synchronize_rcu();
ima_lsm_free_rule(entry);
return 0;
+}
/*
- The LSM policy can be reloaded, leaving the IMA LSM based rules referring
- to the old, stale LSM policy. Update the IMA LSM based rules to reflect
- the reloaded LSM policy. We assume the rules still exist; and BUG_ON() if
- they don't.
*/
- the reloaded LSM policy.
static void ima_lsm_update_rules(void) {
struct ima_rule_entry *entry;
int result;
int i;
struct ima_rule_entry *entry, *e;
int i, result, needs_update;
list_for_each_entry(entry, &ima_policy_rules, list) {
list_for_each_entry_safe(entry, e, &ima_policy_rules, list) {
needs_update = 0; for (i = 0; i < MAX_LSM_RULES; i++) {
if (!entry->lsm[i].rule)
continue;> - result =
security_filter_rule_init(entry->lsm[i].type,
Audit_equal,
entry->lsm[i].args_p,
&entry->lsm[i].rule);
BUG_ON(!entry->lsm[i].rule);
if (entry->lsm[i].rule) {
needs_update = 1;
break;
}
}
if (!needs_update)
continue;
result = ima_lsm_update_rule(entry);
if (result) {
pr_err("ima: lsm rule update error %d\n",
result);
return; } }
}
On 2022/12/21 18:51, Guozihua (Scott) wrote:
On 2022/12/20 9:11, Guozihua (Scott) wrote:
On 2022/12/19 21:11, Mimi Zohar wrote:
On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
On 2022/12/16 11:04, Paul Moore wrote:
On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote:
On 2022/12/16 5:04, Paul Moore wrote:
...
> How bad is the backport really? Perhaps it is worth doing it to see > what it looks like? > It might not be that bad, I'll try to post a version next Monday.
Thanks for giving it a shot.
When I am trying a partial backport of b16942455193 ("ima: use the lsm policy update notifier"), I took a closer look into it and if we rip off the RCU and the notifier part, there would be a potential UAF issue when multiple processes are calling ima_lsm_update_rule() and ima_match_rules() at the same time. ima_lsm_update_rule() would free the old rule if the new rule is successfully copied and initialized, leading to ima_match_rules() accessing a freed rule.
To reserve the mainline solution, we would have to either introduce RCU for rule access, which would work better with notifier mechanism or the same rule would be updated multiple times, or we would have to introduce a lock for LSM based rule update.
Even with the RCU changes, the rules will be updated multiple times. With your "ima: Handle -ESTALE returned by ima_filter_rule_match()" patch, upstream makes a single local copy of the rule to avoid updating it multiple times. Without the notifier, it's updating all the rules.
That's true. However, in the mainline solution, we are only making a local copy of the rule. In 4.19, because of the lazy update mechanism, we are replacing the rule on the rule list multiple times and is trying to free the original rule.
Perhaps an atomic variable to detect if the rules are already being updated would suffice. If the atomic variable is set, make a single local copy of the rule.
That should do it. I'll send a patch set soon.
Including Huaxin Lu in the loop. Sorry for forgotten about it for quite some time.
I tried the backported solution, it seems that it's causing RCU stall. It seems on 4.19.y IMA is already accessing rules through RCU. Still debugging it.
It seems that after the backport, a NULL pointer deference pops out. I'll have to look into it.
On 2022/12/23 16:04, Guozihua (Scott) wrote:
On 2022/12/21 18:51, Guozihua (Scott) wrote:
On 2022/12/20 9:11, Guozihua (Scott) wrote:
On 2022/12/19 21:11, Mimi Zohar wrote:
On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
On 2022/12/16 11:04, Paul Moore wrote:
On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote: > On 2022/12/16 5:04, Paul Moore wrote:
...
>> How bad is the backport really? Perhaps it is worth doing it to see >> what it looks like? >> > It might not be that bad, I'll try to post a version next Monday.
Thanks for giving it a shot.
When I am trying a partial backport of b16942455193 ("ima: use the lsm policy update notifier"), I took a closer look into it and if we rip off the RCU and the notifier part, there would be a potential UAF issue when multiple processes are calling ima_lsm_update_rule() and ima_match_rules() at the same time. ima_lsm_update_rule() would free the old rule if the new rule is successfully copied and initialized, leading to ima_match_rules() accessing a freed rule.
To reserve the mainline solution, we would have to either introduce RCU for rule access, which would work better with notifier mechanism or the same rule would be updated multiple times, or we would have to introduce a lock for LSM based rule update.
Even with the RCU changes, the rules will be updated multiple times. With your "ima: Handle -ESTALE returned by ima_filter_rule_match()" patch, upstream makes a single local copy of the rule to avoid updating it multiple times. Without the notifier, it's updating all the rules.
That's true. However, in the mainline solution, we are only making a local copy of the rule. In 4.19, because of the lazy update mechanism, we are replacing the rule on the rule list multiple times and is trying to free the original rule.
Perhaps an atomic variable to detect if the rules are already being updated would suffice. If the atomic variable is set, make a single local copy of the rule.
That should do it. I'll send a patch set soon.
Including Huaxin Lu in the loop. Sorry for forgotten about it for quite some time.
I tried the backported solution, it seems that it's causing RCU stall. It seems on 4.19.y IMA is already accessing rules through RCU. Still debugging it.
It seems that after the backport, a NULL pointer deference pops out. I'll have to look into it.
It seems that any other means except from a full RCU or locking won't be able to prevent race condition between policy update and rule match. Any other suggestions?
On 2022/12/24 11:41, Guozihua (Scott) wrote:
On 2022/12/23 16:04, Guozihua (Scott) wrote:
On 2022/12/21 18:51, Guozihua (Scott) wrote:
On 2022/12/20 9:11, Guozihua (Scott) wrote:
On 2022/12/19 21:11, Mimi Zohar wrote:
On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
On 2022/12/16 11:04, Paul Moore wrote: > On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote: >> On 2022/12/16 5:04, Paul Moore wrote: > > ... > >>> How bad is the backport really? Perhaps it is worth doing it to see >>> what it looks like? >>> >> It might not be that bad, I'll try to post a version next Monday. > > Thanks for giving it a shot. > When I am trying a partial backport of b16942455193 ("ima: use the lsm policy update notifier"), I took a closer look into it and if we rip off the RCU and the notifier part, there would be a potential UAF issue when multiple processes are calling ima_lsm_update_rule() and ima_match_rules() at the same time. ima_lsm_update_rule() would free the old rule if the new rule is successfully copied and initialized, leading to ima_match_rules() accessing a freed rule.
To reserve the mainline solution, we would have to either introduce RCU for rule access, which would work better with notifier mechanism or the same rule would be updated multiple times, or we would have to introduce a lock for LSM based rule update.
Even with the RCU changes, the rules will be updated multiple times. With your "ima: Handle -ESTALE returned by ima_filter_rule_match()" patch, upstream makes a single local copy of the rule to avoid updating it multiple times. Without the notifier, it's updating all the rules.
That's true. However, in the mainline solution, we are only making a local copy of the rule. In 4.19, because of the lazy update mechanism, we are replacing the rule on the rule list multiple times and is trying to free the original rule.
Perhaps an atomic variable to detect if the rules are already being updated would suffice. If the atomic variable is set, make a single local copy of the rule.
That should do it. I'll send a patch set soon.
Including Huaxin Lu in the loop. Sorry for forgotten about it for quite some time.
I tried the backported solution, it seems that it's causing RCU stall. It seems on 4.19.y IMA is already accessing rules through RCU. Still debugging it.
It seems that after the backport, a NULL pointer deference pops out. I'll have to look into it.
It seems that any other means except from a full RCU or locking won't be able to prevent race condition between policy update and rule match. Any other suggestions?
Correction: full RCU won't be enough as RCU won't work well without notifier. My suggestion would be to backport the notifier mechanism.
On Thu, 2022-12-15 at 22:04 -0500, Paul Moore wrote:
On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) guozihua@huawei.com wrote:
On 2022/12/16 5:04, Paul Moore wrote:
...
How bad is the backport really? Perhaps it is worth doing it to see what it looks like?
It might not be that bad, I'll try to post a version next Monday.
Thanks for giving it a shot.
FYI, in the end backporting the atomic to blocking LSM notifier change was the best solution. Other than one minor correction, v6 of the "ima: Fix IMA mishandling of LSM based rule during" looks good.
linux-stable-mirror@lists.linaro.org