This is a backport for stable of my series to fix a race condition in selinuxfs during policy load:
selinux: Create function for selinuxfs directory cleanup https://lore.kernel.org/selinux/20200819195935.1720168-2-dburgener@linux.mic...
selinux: Refactor selinuxfs directory populating functions https://lore.kernel.org/selinux/20200819195935.1720168-3-dburgener@linux.mic...
selinux: Standardize string literal usage for selinuxfs directory names https://lore.kernel.org/selinux/20200819195935.1720168-4-dburgener@linux.mic...
selinux: Create new booleans and class dirs out of tree https://lore.kernel.org/selinux/20200819195935.1720168-5-dburgener@linux.mic...
Several changes were necessary to backport. They are detailed in the commit message for the third commit in the series. I also dropped the original third commit from this because it was only a style change.
The bulk of the original cover letter is reproduced below.
In the current implementation, on policy load /sys/fs/selinux is updated by deleting the previous contents of /sys/fs/selinux/{class,booleans} and then recreating them. This means that there is a period of time when the contents of these directories do not exist which can cause race conditions as userspace relies on them for information about the policy. In addition, it means that error recovery in the event of failure is challenging.
This patch series follows the design outlined by Al Viro in a previous e-mail to the list[1]. This approach is to first create the new directory structures out of tree, then to perform the swapover, and finally to delete the old directories. Not handled in this series is error recovery in the event of failure.
Error recovery in the selinuxfs recreation is unhandled in the current code, so this series will not cause any regression in this regard. Handling directory recreation in this manner is a prerequisite to make proper error handling possible.
In order to demonstrate the race condition that this series fixes, you can use the following commands:
while true; do cat /sys/fs/selinux/class/service/perms/status
/dev/null; done &
while true; do load_policy; done;
In the existing code, this will display errors fairly often as the class lookup fails. (In normal operation from systemd, this would result in a permission check which would be allowed or denied based on policy settings around unknown object classes.) After applying this patch series you should expect to no longer see such error messages.
Daniel Burgener (3): selinux: Create function for selinuxfs directory cleanup selinux: Refactor selinuxfs directory populating functions selinux: Create new booleans and class dirs out of tree
security/selinux/selinuxfs.c | 160 +++++++++++++++++++++++++++-------- 1 file changed, 123 insertions(+), 37 deletions(-)
Separating the cleanup from the creation will simplify two things in future patches in this series. First, the creation can be made generic, to create directories not tied to the selinux_fs_info structure. Second, we will ultimately want to reorder creation and deletion so that the deletions aren't performed until the new directory structures have already been moved into place.
Signed-off-by: Daniel Burgener dburgener@linux.microsoft.com --- security/selinux/selinuxfs.c | 39 +++++++++++++++++++++++------------- 1 file changed, 25 insertions(+), 14 deletions(-)
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c index e9eaff90cbcc..092c7295f78d 100644 --- a/security/selinux/selinuxfs.c +++ b/security/selinux/selinuxfs.c @@ -348,6 +348,9 @@ static int sel_make_policycap(struct selinux_fs_info *fsi); static struct dentry *sel_make_dir(struct dentry *dir, const char *name, unsigned long *ino);
+/* declaration for sel_remove_old_policy_nodes */ +static void sel_remove_entries(struct dentry *de); + static ssize_t sel_read_mls(struct file *filp, char __user *buf, size_t count, loff_t *ppos) { @@ -502,10 +505,32 @@ static const struct file_operations sel_policy_ops = { .llseek = generic_file_llseek, };
+static void sel_remove_old_policy_nodes(struct selinux_fs_info *fsi) +{ + u32 i; + + /* bool_dir cleanup */ + for (i = 0; i < fsi->bool_num; i++) + kfree(fsi->bool_pending_names[i]); + kfree(fsi->bool_pending_names); + kfree(fsi->bool_pending_values); + fsi->bool_num = 0; + fsi->bool_pending_names = NULL; + fsi->bool_pending_values = NULL; + + sel_remove_entries(fsi->bool_dir); + + /* class_dir cleanup */ + sel_remove_entries(fsi->class_dir); + +} + static int sel_make_policy_nodes(struct selinux_fs_info *fsi) { int ret;
+ sel_remove_old_policy_nodes(fsi); + ret = sel_make_bools(fsi); if (ret) { pr_err("SELinux: failed to load policy booleans\n"); @@ -1336,17 +1361,6 @@ static int sel_make_bools(struct selinux_fs_info *fsi) int *values = NULL; u32 sid;
- /* remove any existing files */ - for (i = 0; i < fsi->bool_num; i++) - kfree(fsi->bool_pending_names[i]); - kfree(fsi->bool_pending_names); - kfree(fsi->bool_pending_values); - fsi->bool_num = 0; - fsi->bool_pending_names = NULL; - fsi->bool_pending_values = NULL; - - sel_remove_entries(dir); - ret = -ENOMEM; page = (char *)get_zeroed_page(GFP_KERNEL); if (!page) @@ -1798,9 +1812,6 @@ static int sel_make_classes(struct selinux_fs_info *fsi) int rc, nclasses, i; char **classes;
- /* delete any existing entries */ - sel_remove_entries(fsi->class_dir); - rc = security_get_classes(fsi->state, &classes, &nclasses); if (rc) return rc;
On Thu, Oct 15, 2020 at 03:29:54PM -0400, Daniel Burgener wrote:
Separating the cleanup from the creation will simplify two things in future patches in this series. First, the creation can be made generic, to create directories not tied to the selinux_fs_info structure. Second, we will ultimately want to reorder creation and deletion so that the deletions aren't performed until the new directory structures have already been moved into place.
Signed-off-by: Daniel Burgener dburgener@linux.microsoft.com
security/selinux/selinuxfs.c | 39 +++++++++++++++++++++++------------- 1 file changed, 25 insertions(+), 14 deletions(-)
What is the git commit id of this patch upstream in Linus's tree?
Same for the other 2, we need those ids.
thanks,
gre gk-h
Make sel_make_bools and sel_make_classes take the specific elements of selinux_fs_info that they need rather than the entire struct.
This will allow a future patch to pass temporary elements that are not in the selinux_fs_info struct to these functions so that the original elements can be preserved until we are ready to perform the switch over.
Signed-off-by: Daniel Burgener dburgener@linux.microsoft.com --- security/selinux/selinuxfs.c | 40 +++++++++++++++++++++--------------- 1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c index 092c7295f78d..ea21f3ef4a6f 100644 --- a/security/selinux/selinuxfs.c +++ b/security/selinux/selinuxfs.c @@ -340,8 +340,11 @@ static const struct file_operations sel_policyvers_ops = { };
/* declaration for sel_write_load */ -static int sel_make_bools(struct selinux_fs_info *fsi); -static int sel_make_classes(struct selinux_fs_info *fsi); +static int sel_make_bools(struct selinux_fs_info *fsi, struct dentry *bool_dir, + unsigned int *bool_num, char ***bool_pending_names, + unsigned int **bool_pending_values); +static int sel_make_classes(struct selinux_fs_info *fsi, struct dentry *class_dir, + unsigned long *last_class_ino); static int sel_make_policycap(struct selinux_fs_info *fsi);
/* declaration for sel_make_class_dirs */ @@ -531,13 +534,15 @@ static int sel_make_policy_nodes(struct selinux_fs_info *fsi)
sel_remove_old_policy_nodes(fsi);
- ret = sel_make_bools(fsi); + ret = sel_make_bools(fsi, fsi->bool_dir, &fsi->bool_num, + &fsi->bool_pending_names, &fsi->bool_pending_values); if (ret) { pr_err("SELinux: failed to load policy booleans\n"); return ret; }
- ret = sel_make_classes(fsi); + ret = sel_make_classes(fsi, fsi->class_dir, + &fsi->last_class_ino); if (ret) { pr_err("SELinux: failed to load policy classes\n"); return ret; @@ -1348,12 +1353,13 @@ static void sel_remove_entries(struct dentry *de)
#define BOOL_DIR_NAME "booleans"
-static int sel_make_bools(struct selinux_fs_info *fsi) +static int sel_make_bools(struct selinux_fs_info *fsi, struct dentry *bool_dir, + unsigned int *bool_num, char ***bool_pending_names, + unsigned int **bool_pending_values) { int i, ret; ssize_t len; struct dentry *dentry = NULL; - struct dentry *dir = fsi->bool_dir; struct inode *inode = NULL; struct inode_security_struct *isec; char **names = NULL, *page; @@ -1372,12 +1378,12 @@ static int sel_make_bools(struct selinux_fs_info *fsi)
for (i = 0; i < num; i++) { ret = -ENOMEM; - dentry = d_alloc_name(dir, names[i]); + dentry = d_alloc_name(bool_dir, names[i]); if (!dentry) goto out;
ret = -ENOMEM; - inode = sel_make_inode(dir->d_sb, S_IFREG | S_IRUGO | S_IWUSR); + inode = sel_make_inode(bool_dir->d_sb, S_IFREG | S_IRUGO | S_IWUSR); if (!inode) { dput(dentry); goto out; @@ -1406,9 +1412,9 @@ static int sel_make_bools(struct selinux_fs_info *fsi) inode->i_ino = i|SEL_BOOL_INO_OFFSET; d_add(dentry, inode); } - fsi->bool_num = num; - fsi->bool_pending_names = names; - fsi->bool_pending_values = values; + *bool_num = num; + *bool_pending_names = names; + *bool_pending_values = values;
free_page((unsigned long)page); return 0; @@ -1421,7 +1427,7 @@ static int sel_make_bools(struct selinux_fs_info *fsi) kfree(names); } kfree(values); - sel_remove_entries(dir); + sel_remove_entries(bool_dir);
return ret; } @@ -1806,7 +1812,9 @@ static int sel_make_class_dir_entries(char *classname, int index, return rc; }
-static int sel_make_classes(struct selinux_fs_info *fsi) +static int sel_make_classes(struct selinux_fs_info *fsi, + struct dentry *class_dir, + unsigned long *last_class_ino) {
int rc, nclasses, i; @@ -1817,13 +1825,13 @@ static int sel_make_classes(struct selinux_fs_info *fsi) return rc;
/* +2 since classes are 1-indexed */ - fsi->last_class_ino = sel_class_to_ino(nclasses + 2); + *last_class_ino = sel_class_to_ino(nclasses + 2);
for (i = 0; i < nclasses; i++) { struct dentry *class_name_dir;
- class_name_dir = sel_make_dir(fsi->class_dir, classes[i], - &fsi->last_class_ino); + class_name_dir = sel_make_dir(class_dir, classes[i], + last_class_ino); if (IS_ERR(class_name_dir)) { rc = PTR_ERR(class_name_dir); goto out;
In order to avoid concurrency issues around selinuxfs resource availability during policy load, we first create new directories out of tree for reloaded resources, then swap them in, and finally delete the old versions.
This fix focuses on concurrency in each of the two subtrees swapped, and not concurrency between the trees. This means that it is still possible that subsequent reads to eg the booleans directory and the class directory during a policy load could see the old state for one and the new for the other. The problem of ensuring that policy loads are fully atomic from the perspective of userspace is larger than what is dealt with here. This commit focuses on ensuring that the directories contents always match either the new or the old policy state from the perspective of userspace.
In the previous implementation, on policy load /sys/fs/selinux is updated by deleting the previous contents of /sys/fs/selinux/{class,booleans} and then recreating them. This means that there is a period of time when the contents of these directories do not exist which can cause race conditions as userspace relies on them for information about the policy. In addition, it means that error recovery in the event of failure is challenging.
In order to demonstrate the race condition that this series fixes, you can use the following commands:
while true; do cat /sys/fs/selinux/class/service/perms/status
/dev/null; done &
while true; do load_policy; done;
In the existing code, this will display errors fairly often as the class lookup fails. (In normal operation from systemd, this would result in a permission check which would be allowed or denied based on policy settings around unknown object classes.) After applying this patch series you should expect to no longer see such error messages.
This has been backported to 5.4 for inclusion in stable. Because prior to this series in the main kernel some refactoring of SELinux policy loading had been done, the backport required changing some function call arguments.
The most significant change of note in the backport is as follows:
In previous versions of the kernel, on a policy load, three directories in the selinuxfs were recreated: class, booleans and policy_capabilities. Changes to the selinuxfs code after 5.4 but prior to this series removed the recreation of the policy_capabilities code, so that was not modified in this series. For this backport, I left the existing recreation functionality of policy_capabilities intact, modifying only the class and booleans directories, as in the original series.
As a final note about changes from the original patch, I dropped a patch that was only style cleanup, so this patch no longer takes advantage of the style changes.
Signed-off-by: Daniel Burgener dburgener@linux.microsoft.com --- security/selinux/selinuxfs.c | 119 +++++++++++++++++++++++++++-------- 1 file changed, 93 insertions(+), 26 deletions(-)
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c index ea21f3ef4a6f..ac553d88cb35 100644 --- a/security/selinux/selinuxfs.c +++ b/security/selinux/selinuxfs.c @@ -20,6 +20,7 @@ #include <linux/fs_context.h> #include <linux/mount.h> #include <linux/mutex.h> +#include <linux/namei.h> #include <linux/init.h> #include <linux/string.h> #include <linux/security.h> @@ -351,7 +352,11 @@ static int sel_make_policycap(struct selinux_fs_info *fsi); static struct dentry *sel_make_dir(struct dentry *dir, const char *name, unsigned long *ino);
-/* declaration for sel_remove_old_policy_nodes */ +/* declaration for sel_make_policy_nodes */ +static struct dentry *sel_make_disconnected_dir(struct super_block *sb, + unsigned long *ino); + +/* declaration for sel_make_policy_nodes */ static void sel_remove_entries(struct dentry *de);
static ssize_t sel_read_mls(struct file *filp, char __user *buf, @@ -508,53 +513,101 @@ static const struct file_operations sel_policy_ops = { .llseek = generic_file_llseek, };
-static void sel_remove_old_policy_nodes(struct selinux_fs_info *fsi) +static void sel_remove_old_bool_data(unsigned int bool_num, char **bool_names, + unsigned int *bool_values) { u32 i;
/* bool_dir cleanup */ - for (i = 0; i < fsi->bool_num; i++) - kfree(fsi->bool_pending_names[i]); - kfree(fsi->bool_pending_names); - kfree(fsi->bool_pending_values); - fsi->bool_num = 0; - fsi->bool_pending_names = NULL; - fsi->bool_pending_values = NULL; - - sel_remove_entries(fsi->bool_dir); - - /* class_dir cleanup */ - sel_remove_entries(fsi->class_dir); - + for (i = 0; i < bool_num; i++) + kfree(bool_names[i]); + kfree(bool_names); + kfree(bool_values); }
+#define BOOL_DIR_NAME "booleans" + static int sel_make_policy_nodes(struct selinux_fs_info *fsi) { - int ret; + int ret = 0; + struct dentry *tmp_parent, *tmp_bool_dir, *tmp_class_dir, *old_dentry; + unsigned int tmp_bool_num, old_bool_num; + char **tmp_bool_names, **old_bool_names; + unsigned int *tmp_bool_values, *old_bool_values; + unsigned long tmp_ino = fsi->last_ino; /* Don't increment last_ino in this function */
- sel_remove_old_policy_nodes(fsi); + tmp_parent = sel_make_disconnected_dir(fsi->sb, &tmp_ino); + if (IS_ERR(tmp_parent)) + return PTR_ERR(tmp_parent);
- ret = sel_make_bools(fsi, fsi->bool_dir, &fsi->bool_num, - &fsi->bool_pending_names, &fsi->bool_pending_values); + tmp_ino = fsi->bool_dir->d_inode->i_ino - 1; /* sel_make_dir will increment and set */ + tmp_bool_dir = sel_make_dir(tmp_parent, BOOL_DIR_NAME, &tmp_ino); + if (IS_ERR(tmp_bool_dir)) { + ret = PTR_ERR(tmp_bool_dir); + goto out; + } + + tmp_ino = fsi->class_dir->d_inode->i_ino - 1; /* sel_make_dir will increment and set */ + tmp_class_dir = sel_make_dir(tmp_parent, "classes", &tmp_ino); + if (IS_ERR(tmp_class_dir)) { + ret = PTR_ERR(tmp_class_dir); + goto out; + } + + ret = sel_make_bools(fsi, tmp_bool_dir, &tmp_bool_num, + &tmp_bool_names, &tmp_bool_values); if (ret) { pr_err("SELinux: failed to load policy booleans\n"); - return ret; + goto out; }
- ret = sel_make_classes(fsi, fsi->class_dir, + ret = sel_make_classes(fsi, tmp_class_dir, &fsi->last_class_ino); if (ret) { pr_err("SELinux: failed to load policy classes\n"); - return ret; + goto out; }
ret = sel_make_policycap(fsi); if (ret) { pr_err("SELinux: failed to load policy capabilities\n"); - return ret; + goto out; }
- return 0; + /* booleans */ + old_dentry = fsi->bool_dir; + lock_rename(tmp_bool_dir, old_dentry); + d_exchange(tmp_bool_dir, fsi->bool_dir); + + old_bool_num = fsi->bool_num; + old_bool_names = fsi->bool_pending_names; + old_bool_values = fsi->bool_pending_values; + + fsi->bool_num = tmp_bool_num; + fsi->bool_pending_names = tmp_bool_names; + fsi->bool_pending_values = tmp_bool_values; + + sel_remove_old_bool_data(old_bool_num, old_bool_names, old_bool_values); + + fsi->bool_dir = tmp_bool_dir; + unlock_rename(tmp_bool_dir, old_dentry); + + /* classes */ + old_dentry = fsi->class_dir; + lock_rename(tmp_class_dir, old_dentry); + d_exchange(tmp_class_dir, fsi->class_dir); + fsi->class_dir = tmp_class_dir; + unlock_rename(tmp_class_dir, old_dentry); + +out: + /* Since the other temporary dirs are children of tmp_parent + * this will handle all the cleanup in the case of a failure before + * the swapover + */ + sel_remove_entries(tmp_parent); + dput(tmp_parent); /* d_genocide() only handles the children */ + + return ret; }
static ssize_t sel_write_load(struct file *file, const char __user *buf, @@ -1351,8 +1404,6 @@ static void sel_remove_entries(struct dentry *de) shrink_dcache_parent(de); }
-#define BOOL_DIR_NAME "booleans" - static int sel_make_bools(struct selinux_fs_info *fsi, struct dentry *bool_dir, unsigned int *bool_num, char ***bool_pending_names, unsigned int **bool_pending_values) @@ -1910,6 +1961,22 @@ static struct dentry *sel_make_dir(struct dentry *dir, const char *name, return dentry; }
+static struct dentry *sel_make_disconnected_dir(struct super_block *sb, + unsigned long *ino) +{ + struct inode *inode = sel_make_inode(sb, S_IFDIR | S_IRUGO | S_IXUGO); + + if (!inode) + return ERR_PTR(-ENOMEM); + + inode->i_op = &simple_dir_inode_operations; + inode->i_fop = &simple_dir_operations; + inode->i_ino = ++(*ino); + /* directory inodes start off with i_nlink == 2 (for "." entry) */ + inc_nlink(inode); + return d_obtain_alias(inode); +} + #define NULL_FILE_NAME "null"
static int sel_fill_super(struct super_block *sb, struct fs_context *fc)
On Thu, Oct 15, 2020 at 03:29:56PM -0400, Daniel Burgener wrote:
As a final note about changes from the original patch, I dropped a patch that was only style cleanup, so this patch no longer takes advantage of the style changes.
I'd suggest to bring that patch back for this series. We're trying to minimize the delta between upstream and stable rather than minimize stable itself.
On Thu, Oct 15, 2020 at 03:29:53PM -0400, Daniel Burgener wrote:
This is a backport for stable of my series to fix a race condition in selinuxfs during policy load:
selinux: Create function for selinuxfs directory cleanup https://lore.kernel.org/selinux/20200819195935.1720168-2-dburgener@linux.mic...
selinux: Refactor selinuxfs directory populating functions https://lore.kernel.org/selinux/20200819195935.1720168-3-dburgener@linux.mic...
selinux: Standardize string literal usage for selinuxfs directory names https://lore.kernel.org/selinux/20200819195935.1720168-4-dburgener@linux.mic...
selinux: Create new booleans and class dirs out of tree https://lore.kernel.org/selinux/20200819195935.1720168-5-dburgener@linux.mic...
Several changes were necessary to backport. They are detailed in the commit message for the third commit in the series. I also dropped the original third commit from this because it was only a style change.
As Sasha said, we want to take the original commits as much as possible to reduce the delta. It is ok to take style changes if other patches depend on them, because every time we do something that is not upstream, we create bugs.
So can you redo this series, and backport the original commits, and provide the ids for them as well?
thanks,
greg k-h
On 10/16/20 1:00 AM, Greg KH wrote:
On Thu, Oct 15, 2020 at 03:29:53PM -0400, Daniel Burgener wrote:
This is a backport for stable of my series to fix a race condition in selinuxfs during policy load:
selinux: Create function for selinuxfs directory cleanup https://lore.kernel.org/selinux/20200819195935.1720168-2-dburgener@linux.mic...
selinux: Refactor selinuxfs directory populating functions https://lore.kernel.org/selinux/20200819195935.1720168-3-dburgener@linux.mic...
selinux: Standardize string literal usage for selinuxfs directory names https://lore.kernel.org/selinux/20200819195935.1720168-4-dburgener@linux.mic...
selinux: Create new booleans and class dirs out of tree https://lore.kernel.org/selinux/20200819195935.1720168-5-dburgener@linux.mic...
Several changes were necessary to backport. They are detailed in the commit message for the third commit in the series. I also dropped the original third commit from this because it was only a style change.
As Sasha said, we want to take the original commits as much as possible to reduce the delta. It is ok to take style changes if other patches depend on them, because every time we do something that is not upstream, we create bugs.
So can you redo this series, and backport the original commits, and provide the ids for them as well?
thanks,
greg k-h
Yes, thank you. I will fix up the series with the third commit included, and add commit ids. Thanks.
-Daniel
On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener dburgener@linux.microsoft.com wrote:
Yes, thank you. I will fix up the series with the third commit included, and add commit ids. Thanks.
Greg and I have different opinions on what is classified as a good candidate for the -stable trees, but in my opinion this patch series doesn't qualify. There are a lot of dependencies, it is intertwined with a lot of code, and the issue that this patchset fixes has been around for a *long* time. I personally feel the risk of backporting this to -stable does not outweigh the potential wins.
My current opinion is that backporting this patchset is not a good idea; it gets a NACK from me.
Daniel, in the future if this is something you want to see backported please bring this issue up on the SELinux mailing list when the patchset is originally posted so we can have a discussion about it and plan accordingly.
On 10/16/20 9:55 AM, Paul Moore wrote:
On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener dburgener@linux.microsoft.com wrote:
Yes, thank you. I will fix up the series with the third commit included, and add commit ids. Thanks.
Greg and I have different opinions on what is classified as a good candidate for the -stable trees, but in my opinion this patch series doesn't qualify. There are a lot of dependencies, it is intertwined with a lot of code, and the issue that this patchset fixes has been around for a *long* time. I personally feel the risk of backporting this to -stable does not outweigh the potential wins.
My current opinion is that backporting this patchset is not a good idea; it gets a NACK from me.
Daniel, in the future if this is something you want to see backported please bring this issue up on the SELinux mailing list when the patchset is originally posted so we can have a discussion about it and plan accordingly.
Noted. Thanks for the feedback. I will make sure to bring such things up with the selinux list in the future.
-Daniel
On Fri, Oct 16, 2020 at 09:55:25AM -0400, Paul Moore wrote:
On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener dburgener@linux.microsoft.com wrote:
Yes, thank you. I will fix up the series with the third commit included, and add commit ids. Thanks.
Greg and I have different opinions on what is classified as a good candidate for the -stable trees, but in my opinion this patch series doesn't qualify. There are a lot of dependencies, it is intertwined with a lot of code, and the issue that this patchset fixes has been around for a *long* time. I personally feel the risk of backporting this to -stable does not outweigh the potential wins.
My understanding is that while the issue Daniel is fixing here has been around for a while, it's also very real - the reports suggest a failure rate of 1-2% on boot.
I do understand your concerns around this series, but given it was just fixed upstream we don't have a better story than "sit tight for the next LTS" to tell to users affected by this issue.
Is there a scenario where you'd feel safer with the series? I suspect that if it doesn't go into upstream stable Daniel will end up carrying it out of tree anyway, so maybe we can ask Daniel to do targetted testing for the next week or two and report back?
On 10/16/20 10:22 AM, Sasha Levin wrote:
On Fri, Oct 16, 2020 at 09:55:25AM -0400, Paul Moore wrote:
On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener dburgener@linux.microsoft.com wrote:
Yes, thank you. I will fix up the series with the third commit included, and add commit ids. Thanks.
Greg and I have different opinions on what is classified as a good candidate for the -stable trees, but in my opinion this patch series doesn't qualify. There are a lot of dependencies, it is intertwined with a lot of code, and the issue that this patchset fixes has been around for a *long* time. I personally feel the risk of backporting this to -stable does not outweigh the potential wins.
My understanding is that while the issue Daniel is fixing here has been around for a while, it's also very real - the reports suggest a failure rate of 1-2% on boot.
As a point of clarity, I think that the issue occurs much less frequently on boot than it does with a policy load during ordinary operation, since there are a much higher volume of userspace policy manager lookups on a policy_load once the system is up. I think 1-2% is roughly accurate for what we're seeing in the environment I'm working on for a policy load during normal steady state operation. I don't have hard numbers on policy load during boot, but I would expect it to be quite a bit lower. We have seen it, but it's not the common case we're seeing.
I do understand your concerns around this series, but given it was just fixed upstream we don't have a better story than "sit tight for the next LTS" to tell to users affected by this issue.
Is there a scenario where you'd feel safer with the series? I suspect that if it doesn't go into upstream stable Daniel will end up carrying it out of tree anyway, so maybe we can ask Daniel to do targetted testing for the next week or two and report back?
I believe my team will intend to carry this out of tree, yes. If additional data from that would be helpful, I'd be happy to provide it.
-Daniel
linux-stable-mirror@lists.linaro.org