Hi Boris,
On 12/11/2018 4:34 AM, Borislav Petkov wrote:
On Mon, Dec 10, 2018 at 01:21:54PM -0800, Reinette Chatre wrote:
The user triggers the creation of a pseudo-locked region when writing the requested schemata to the schemata resctrl file. The pseudo-locking of a region is required to be done on a CPU that is associated with the cache on which the pseudo-locked region will reside. In order to run the locking code on a specific CPU the needed CPU has to be selected and ensured to remain online during the entire locking sequence.
At this time the cpu_hotplug_lock is not taken during pseudo-lock region creation, so a CPU can be selected to run the pseudo-locking code and then go offline before the thread is able to run on it.
Fix this by ensuring that the cpu_hotplug_lock is taken while the CPU on which code has to run needs to be controlled. Since the cpu_hotplug_lock is always taken before rdtgroup_mutex the lock order is maintained.
Fixes: e0bdfe8e36f3 ("x86/intel_rdt: Support creation/removal of pseudo-locked region")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org
V2:
- Rebase against tip/x86/urgent
- Modify subject from x86/resctrl to x86/intel_rdt to match subject used before the code reorganization.
 arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++++
 1 file changed, 4 insertions(+)
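The race described in the commit message can be illustrated with a toy, single-threaded userspace model. This is only a sketch, not the kernel code: every toy_* name is hypothetical, the hotplug lock is reduced to a reader count, and the racing hotplug request is simulated inline rather than coming from another thread. It shows that a CPU picked while the hotplug lock stand-in is held stays online until the work has run, and that the hotplug lock is taken before the rdtgroup mutex stand-in.

```c
#include <assert.h>
#include <stdint.h>

/* Toy single-threaded model; none of these names are kernel APIs. */
struct toy_sys {
	uint64_t online;      /* bit n set: CPU n is online */
	int hotplug_readers;  /* read-side holders of the cpu_hotplug_lock stand-in */
	int rdtgroup_locked;  /* stand-in for rdtgroup_mutex */
};

static void toy_cpus_read_lock(struct toy_sys *s)   { s->hotplug_readers++; }
static void toy_cpus_read_unlock(struct toy_sys *s) { s->hotplug_readers--; }

/* A hotplug request succeeds only while no reader holds the lock. */
static int toy_cpu_offline(struct toy_sys *s, unsigned int cpu)
{
	if (s->hotplug_readers > 0)
		return -1; /* the real hotplug path would wait instead */
	s->online &= ~((uint64_t)1 << cpu);
	return 0;
}

/* First online CPU in cache_cpus, or -1 if there is none. */
static int toy_pick_cpu(const struct toy_sys *s, uint64_t cache_cpus)
{
	uint64_t candidates = cache_cpus & s->online;

	for (int cpu = 0; cpu < 64; cpu++)
		if (candidates & ((uint64_t)1 << cpu))
			return cpu;
	return -1;
}

/*
 * Select a CPU and "run" on it. With hold_hotplug_lock set, the lock
 * order is hotplug lock first, then the rdtgroup mutex stand-in.
 * Returns the CPU if it was still online when the work ran, else -1.
 */
static int toy_pseudo_lock(struct toy_sys *s, uint64_t cache_cpus,
			   int hold_hotplug_lock)
{
	int cpu, ret;

	if (hold_hotplug_lock)
		toy_cpus_read_lock(s);
	s->rdtgroup_locked = 1;

	cpu = toy_pick_cpu(s, cache_cpus);
	if (cpu >= 0)
		toy_cpu_offline(s, (unsigned int)cpu); /* simulated racing hotplug request */
	ret = (cpu >= 0 && (s->online & ((uint64_t)1 << cpu))) ? cpu : -1;

	s->rdtgroup_locked = 0;
	if (hold_hotplug_lock)
		toy_cpus_read_unlock(s);
	return ret;
}
```

With CPUs 0-3 online and a cache shared by CPUs 2 and 3, calling toy_pseudo_lock() with the lock held returns CPU 2 and leaves it online. Without the lock, the simulated offline request wins and the call returns -1.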
I took it but changed the subject to the more straight-forward:
Thank you very much.
"x86/intel_rdt: Disable CPU hotplug while modifying schemata"
I am not sure that this is an issue when updating a schemata in the general case. When just a CAT schemata (without pseudo-locking) is updated, the CPU mask associated with the cache instance is indeed used to determine which CPUs should have their registers changed. Only the current CPU is not checked for being online, though. For the other CPUs smp_call_function_many() is used, and that includes an online check.
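To make the split between the current CPU and the remote CPUs concrete, here is a small userspace sketch. It assumes a toy 64-bit bitmask in place of struct cpumask, and the names toy_cpumask, remote_targets() and update_locally() are made up for illustration. It only models which CPUs would be reached; it is not how the kernel implements smp_call_function_many().

```c
#include <assert.h>
#include <stdint.h>

/* Toy mask: bit n set means CPU n. Handles at most 64 CPUs. */
typedef uint64_t toy_cpumask;
#define CPU_BIT(n) ((toy_cpumask)1 << (n))

/*
 * CPUs a cross-call would reach: present in the requested mask,
 * currently online, and not the calling CPU itself.
 */
static toy_cpumask remote_targets(toy_cpumask mask, toy_cpumask online,
				  unsigned int this_cpu)
{
	return mask & online & ~CPU_BIT(this_cpu);
}

/*
 * Whether the caller updates its own registers directly. No online
 * check is needed here: the caller is running, so it is online.
 */
static int update_locally(toy_cpumask mask, unsigned int this_cpu)
{
	return (mask & CPU_BIT(this_cpu)) != 0;
}
```

For a cache shared by CPUs 0-3 with CPU 2 offline and the caller on CPU 1, remote_targets() yields CPUs 0 and 3, and update_locally() reports that CPU 1 handles itself.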
Now, your second patch:
Subject: [PATCH V2] x86/resctrl: Fix rdt_find_domain() return value and checks
Message-Id: <b88cd4ff6a75995bf8db9b0ea546908fe50f69f3.1544479852.git.reinette.chatre@intel.com>
has the new file paths, has Fixes: tags but no CC:stable.
I'm guessing it needs to go in the next merge window with the rest of the new stuff and not now, with the urgent pile?
I'm thinking that because it is not really a bug now, as the negative ID happens to work.
I had the same question in V1's notes to the maintainer :)
My initial concern was the lack of IS_ERR checking. Understanding the flow better now, it seems to me that this is indeed not a bug at present. The reasoning is that an ERR_PTR is only returned when a negative id is provided in the parameters to rdt_find_domain(). There are currently only two places where a negative id could be provided to rdt_find_domain(), domain_add_cpu() and domain_remove_cpu(), and both locations test the return value using IS_ERR.
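For readers less familiar with the ERR_PTR convention, the three possible outcomes of such a lookup can be sketched in userspace C. The ERR_PTR()/IS_ERR()/PTR_ERR() definitions below are simplified stand-ins for the kernel helpers. struct domain, find_domain() and classify() are hypothetical, and returning -ENODEV for a negative id is an assumption made for this example only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, simplified domain; the real struct has more fields. */
struct domain {
	int id;
};

/*
 * Simplified lookup with three outcomes:
 *   id < 0     -> ERR_PTR(-ENODEV)   (error code assumed for this sketch)
 *   id found   -> pointer to the domain
 *   id missing -> NULL
 */
static struct domain *find_domain(struct domain *doms, size_t n, int id)
{
	if (id < 0)
		return ERR_PTR(-ENODEV);
	for (size_t i = 0; i < n; i++)
		if (doms[i].id == id)
			return &doms[i];
	return NULL;
}

enum lookup_result { LOOKUP_ERROR, LOOKUP_MISSING, LOOKUP_FOUND };

/* A caller that may pass a negative id must test IS_ERR() before NULL. */
static enum lookup_result classify(const struct domain *d)
{
	if (IS_ERR(d))
		return LOOKUP_ERROR;
	if (!d)
		return LOOKUP_MISSING;
	return LOOKUP_FOUND;
}
```

A caller that can never pass a negative id only has to distinguish NULL from a valid pointer, which is why the missing IS_ERR checks elsewhere are harmless today.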
Reinette