The user triggers the creation of a pseudo-locked region when writing the requested schemata to the schemata resctrl file. The pseudo-locking of a region is required to be done on a CPU that is associated with the cache on which the pseudo-locked region will reside. In order to run the locking code on a specific CPU the needed CPU has to be selected and ensured to remain online during the entire locking sequence.
At this time the cpu_hotplug_lock is not taken during the pseudo-lock region creation and it is thus possible for a CPU to be selected to run the pseudo-locking code and then that CPU to go offline before the thread is able to run on it.
Fix this by ensuring that the cpu_hotplug_lock is taken while the CPU on which code has to run needs to be controlled. Since the cpu_hotplug_lock is always taken before rdtgroup_mutex the lock order is maintained.
Fixes: e0bdfe8e36f3 ("x86/intel_rdt: Support creation/removal of pseudo-locked region") Signed-off-by: Reinette Chatre reinette.chatre@intel.com Cc: stable@vger.kernel.org --- V2: - Rebase against tip/x86/urgent - Modify subject from x86/resctrl to x86/intel_rdt to match subject used before the code reorganization.
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c index 27937458c231..efa4a519f5e5 100644 --- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c +++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c @@ -23,6 +23,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/cpu.h> #include <linux/kernfs.h> #include <linux/seq_file.h> #include <linux/slab.h> @@ -310,9 +311,11 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, return -EINVAL; buf[nbytes - 1] = '\0';
+ cpus_read_lock(); rdtgrp = rdtgroup_kn_lock_live(of->kn); if (!rdtgrp) { rdtgroup_kn_unlock(of->kn); + cpus_read_unlock(); return -ENOENT; } rdt_last_cmd_clear(); @@ -367,6 +370,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
out: rdtgroup_kn_unlock(of->kn); + cpus_read_unlock(); return ret ?: nbytes; }
On Mon, Dec 10, 2018 at 01:21:54PM -0800, Reinette Chatre wrote:
The user triggers the creation of a pseudo-locked region when writing the requested schemata to the schemata resctrl file. The pseudo-locking of a region is required to be done on a CPU that is associated with the cache on which the pseudo-locked region will reside. In order to run the locking code on a specific CPU the needed CPU has to be selected and ensured to remain online during the entire locking sequence.
At this time the cpu_hotplug_lock is not taken during the pseudo-lock region creation and it is thus possible for a CPU to be selected to run the pseudo-locking code and then that CPU to go offline before the thread is able to run on it.
Fix this by ensuring that the cpu_hotplug_lock is taken while the CPU on which code has to run needs to be controlled. Since the cpu_hotplug_lock is always taken before rdtgroup_mutex the lock order is maintained.
Fixes: e0bdfe8e36f3 ("x86/intel_rdt: Support creation/removal of pseudo-locked region") Signed-off-by: Reinette Chatre reinette.chatre@intel.com Cc: stable@vger.kernel.org
V2:
- Rebase against tip/x86/urgent
- Modify subject from x86/resctrl to x86/intel_rdt to match subject used before the code reorganization.
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++++ 1 file changed, 4 insertions(+)
I took it but changed the subject to the more straight-forward:
"x86/intel_rdt: Disable CPU hotplug while modifying schemata"
Now, your second patch:
Subject: [PATCH V2] x86/resctrl: Fix rdt_find_domain() return value and checks Message-Id: b88cd4ff6a75995bf8db9b0ea546908fe50f69f3.1544479852.git.reinette.chatre@intel.com
has the new file paths, has Fixes: tags but no CC:stable.
I'm guessing it needs to go in the next merge window with the rest of the new stuff and not now, with the urgent pile?
I'm thinking that because it is not really a bug now, as the negative ID happens to work.
Right?
Hi Boris,
On 12/11/2018 4:34 AM, Borislav Petkov wrote:
On Mon, Dec 10, 2018 at 01:21:54PM -0800, Reinette Chatre wrote:
The user triggers the creation of a pseudo-locked region when writing the requested schemata to the schemata resctrl file. The pseudo-locking of a region is required to be done on a CPU that is associated with the cache on which the pseudo-locked region will reside. In order to run the locking code on a specific CPU the needed CPU has to be selected and ensured to remain online during the entire locking sequence.
At this time the cpu_hotplug_lock is not taken during the pseudo-lock region creation and it is thus possible for a CPU to be selected to run the pseudo-locking code and then that CPU to go offline before the thread is able to run on it.
Fix this by ensuring that the cpu_hotplug_lock is taken while the CPU on which code has to run needs to be controlled. Since the cpu_hotplug_lock is always taken before rdtgroup_mutex the lock order is maintained.
Fixes: e0bdfe8e36f3 ("x86/intel_rdt: Support creation/removal of pseudo-locked region") Signed-off-by: Reinette Chatre reinette.chatre@intel.com Cc: stable@vger.kernel.org
V2:
- Rebase against tip/x86/urgent
- Modify subject from x86/resctrl to x86/intel_rdt to match subject used before the code reorganization.
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++++ 1 file changed, 4 insertions(+)
I took it but changed the subject to the more straight-forward:
Thank you very much.
"x86/intel_rdt: Disable CPU hotplug while modifying schemata"
I am not sure that this is an issue when updating a schemata in the general case. In the case when just CAT schemata (without pseudo-locking) is updated then the cpu mask associated with the cache instance is indeed used to determine which CPUs should have their registers changed but only the current CPU is not checked for being online, for the other CPUs smp_call_function_many() is used that includes an online check.
Now, your second patch:
Subject: [PATCH V2] x86/resctrl: Fix rdt_find_domain() return value and checks Message-Id: b88cd4ff6a75995bf8db9b0ea546908fe50f69f3.1544479852.git.reinette.chatre@intel.com
has the new file paths, has Fixes: tags but no CC:stable.
I'm guessing it needs to go in the next merge window with the rest of the new stuff and not now, with the urgent pile?
I'm thinking that because it is not really a bug now, as the negative ID happens to work.
I had the same question in V1's notes to the maintainer :)
My initial concern was the lack of IS_ERR checking. Understanding the flow better now it seems to me that this is indeed not a bug now. The reasoning is that an ERR_PTR is only returned when a negative id is provided in the parameters to rdt_find_domain(). There are currently only two places where a negative id could be provided to rdt_find_domain(), domain_add_cpu() and domain_remove_cpu(), and both locations test the return value using IS_ERR.
Reinette
On Tue, Dec 11, 2018 at 10:33:59AM -0800, Reinette Chatre wrote:
I am not sure that this is an issue when updating a schemata in the general case. In the case when just CAT schemata (without pseudo-locking) is updated then the cpu mask associated with the cache instance is indeed used to determine which CPUs should have their registers changed but only the current CPU is not checked for being online, for the other CPUs smp_call_function_many() is used that includes an online check.
Well, in your fix rdtgroup_schemata_write() disables hotplug for its whole duration and doesn't look at what schemata update is being done, right?
I had the same question in V1's notes to the maintainer :)
Whoops, and I read that... Sorry. :-\
My initial concern was the lack of IS_ERR checking. Understanding the flow better now it seems to me that this is indeed not a bug now. The reasoning is that an ERR_PTR is only returned when a negative id is provided in the parameters to rdt_find_domain(). There are currently only two places where a negative id could be provided to rdt_find_domain(), domain_add_cpu() and domain_remove_cpu(), and both locations test the return value using IS_ERR.
Right. I'll queue it for the normal merge window.
Thx.
Hi Boris,
On 12/11/2018 10:50 AM, Borislav Petkov wrote:
On Tue, Dec 11, 2018 at 10:33:59AM -0800, Reinette Chatre wrote:
I am not sure that this is an issue when updating a schemata in the general case. In the case when just CAT schemata (without pseudo-locking) is updated then the cpu mask associated with the cache instance is indeed used to determine which CPUs should have their registers changed but only the current CPU is not checked for being online, for the other CPUs smp_call_function_many() is used that includes an online check.
Well, in your fix rdtgroup_schemata_write() disables hotplug for its whole duration and doesn't look at what schemata update is being done, right?
Correct.
I just wanted to emphasize that it is not the schemata writing that needs to be protected, but instead the pseudo-locking code that runs after schemata programming that needs to run on a particular CPU. The new patch subject could be interpreted to mean the former ... but that is starting to sound like nitpicking by me.
I had the same question in V1's notes to the maintainer :)
Whoops, and I read that... Sorry. :-\
Not a problem at all :)
My initial concern was the lack of IS_ERR checking. Understanding the flow better now it seems to me that this is indeed not a bug now. The reasoning is that an ERR_PTR is only returned when a negative id is provided in the parameters to rdt_find_domain(). There are currently only two places where a negative id could be provided to rdt_find_domain(), domain_add_cpu() and domain_remove_cpu(), and both locations test the return value using IS_ERR.
Right. I'll queue it for the normal merge window.
Thank you very much.
Reinette
On Tue, Dec 11, 2018 at 11:02:08AM -0800, Reinette Chatre wrote:
I just wanted to emphasize that it is not the schemata writing that needs to be protected, but instead the pseudo-locking code that runs after schemata programming that needs to run on a particular CPU. The new patch subject could be interpreted to mean the former ... but that is starting to sound like nitpicking by me.
Nah, that's not nitpicking - it is important that we sort out stuff fully before committing.
Now, I'm trying to understand what you're telling me and I believe you mean what update_domains() does, yes?
And I guess a more fitting subject in that case could be:
x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence
or so, and then the commit message explains in more detail what that title actually means. :)
Hmmm.
Hi Boris,
On 12/11/2018 12:43 PM, Borislav Petkov wrote:
On Tue, Dec 11, 2018 at 11:02:08AM -0800, Reinette Chatre wrote:
I just wanted to emphasize that it is not the schemata writing that needs to be protected, but instead the pseudo-locking code that runs after schemata programming that needs to run on a particular CPU. The new patch subject could be interpreted to mean the former ... but that is starting to sound like nitpicking by me.
Nah, that's not nitpicking - it is important that we sort out stuff fully before committing.
I really appreciate your patience.
Now, I'm trying to understand what you're telling me and I believe you mean what update_domains() does, yes?
Yes.
And I guess a more fitting subject in that case could be:
x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence
or so, and then the commit message explains in more detail what that title actually means. :)
Hmmm.
That subject line matches exactly the goal of this patch. Thank you very much. Would you prefer that I send a new version with this subject line or would you rather make the change yourself while applying this patch?
Reinette
On Tue, Dec 11, 2018 at 12:51:12PM -0800, Reinette Chatre wrote:
I really appreciate your patience.
That goes both ways. Thx. :-)
That subject line matches exactly the goal of this patch. Thank you very much. Would you prefer that I send a new version with this subject line or would you rather make the change yourself while applying this patch?
Nah, no need. I'll fix them up.
Thx.
Commit-ID: 80b71c340f17705ec145911b9a193ea781811b16 Gitweb: https://git.kernel.org/tip/80b71c340f17705ec145911b9a193ea781811b16 Author: Reinette Chatre reinette.chatre@intel.com AuthorDate: Mon, 10 Dec 2018 13:21:54 -0800 Committer: Borislav Petkov bp@suse.de CommitDate: Tue, 11 Dec 2018 21:59:01 +0100
x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence
The user triggers the creation of a pseudo-locked region when writing the requested schemata to the schemata resctrl file. The pseudo-locking of a region is required to be done on a CPU that is associated with the cache on which the pseudo-locked region will reside. In order to run the locking code on a specific CPU, the needed CPU has to be selected and ensured to remain online during the entire locking sequence.
At this time, the cpu_hotplug_lock is not taken during the pseudo-lock region creation and it is thus possible for a CPU to be selected to run the pseudo-locking code and then that CPU to go offline before the thread is able to run on it.
Fix this by ensuring that the cpu_hotplug_lock is taken while the CPU on which code has to run needs to be controlled. Since the cpu_hotplug_lock is always taken before rdtgroup_mutex the lock order is maintained.
Fixes: e0bdfe8e36f3 ("x86/intel_rdt: Support creation/removal of pseudo-locked region") Signed-off-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Borislav Petkov bp@suse.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: Fenghua Yu fenghua.yu@intel.com Cc: Ingo Molnar mingo@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Tony Luck tony.luck@intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: stable stable@vger.kernel.org Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/b7b17432a80f95a1fa21a1698ba643014f58ad31.154447642... --- arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c index 27937458c231..efa4a519f5e5 100644 --- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c +++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c @@ -23,6 +23,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/cpu.h> #include <linux/kernfs.h> #include <linux/seq_file.h> #include <linux/slab.h> @@ -310,9 +311,11 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, return -EINVAL; buf[nbytes - 1] = '\0';
+ cpus_read_lock(); rdtgrp = rdtgroup_kn_lock_live(of->kn); if (!rdtgrp) { rdtgroup_kn_unlock(of->kn); + cpus_read_unlock(); return -ENOENT; } rdt_last_cmd_clear(); @@ -367,6 +370,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
out: rdtgroup_kn_unlock(of->kn); + cpus_read_unlock(); return ret ?: nbytes; }
linux-stable-mirror@lists.linaro.org