On 3/3/22 6:20 AM, Ammar Faizi wrote:
On 3/3/22 12:26 AM, Yazen Ghannam wrote:
Hi Ammar,
Hi Yazen,
... The threshold interrupt handler uses this pointer. I think the goal here is to set this pointer when the list is fully formed and clear this pointer before making any changes to the list. Otherwise, the interrupt handler will operate on incomplete data if an interrupt comes in the middle of these updates.
The changes below should deal with memory leak issue while avoiding a race with the threshold interrupt. What do you think?
Thanks for taking a look into this. I didn't notice that before. The changes look good to me, extra improvements:
- _mce_threshold_remove_device() should be static as we don't use it
in another translation unit. 2) Minor cleanup, we don't need "goto out_err", just early return directly.
I will fold them in...
Please review the patch below, if you think it looks good, I will send this for the v5 series. I added your sign-off.
From cae3965734a67d11a5286c612dfddf52398defc8 Mon Sep 17 00:00:00 2001 From: Ammar Faizi ammarfaizi2@gnuweeb.org Date: Thu, 3 Mar 2022 05:07:38 +0700 Subject: [PATCH v5 2/2] x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails
In mce_threshold_create_device(), when threshold_create_bank() fails, the @bp will be leaked, because mce_threshold_remove_device() will not free the @bp. It only frees the @bp when we've already written the @bp to the @threshold_banks per-CPU variable, but at the point, we haven't.
Fix this by extracting the cleanup part into a new static function _mce_threshold_remove_device(), then use it from create and remove device function.
Also, eliminate the "goto out_err". Just early return inside the loop when we fail.
Cc: Borislav Petkov bp@alien8.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable@vger.kernel.org # v5.8+ Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-authored-by: Alviro Iskandar Setiawan alviro.iskandar@gnuweeb.org Signed-off-by: Alviro Iskandar Setiawan alviro.iskandar@gnuweeb.org Co-authored-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Ammar Faizi ammarfaizi2@gnuweeb.org --- arch/x86/kernel/cpu/mce/amd.c | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 9f4b508886dd..ac7246a4de08 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1293,10 +1293,22 @@ static void threshold_remove_bank(struct threshold_bank *bank) kfree(bank); }
+static void _mce_threshold_remove_device(struct threshold_bank **bp) +{ + unsigned int bank, numbanks = this_cpu_read(mce_num_banks); + + for (bank = 0; bank < numbanks; bank++) { + if (bp[bank]) { + threshold_remove_bank(bp[bank]); + bp[bank] = NULL; + } + } + kfree(bp); +} + int mce_threshold_remove_device(unsigned int cpu) { struct threshold_bank **bp = this_cpu_read(threshold_banks); - unsigned int bank, numbanks = this_cpu_read(mce_num_banks);
if (!bp) return 0; @@ -1307,13 +1319,7 @@ int mce_threshold_remove_device(unsigned int cpu) */ this_cpu_write(threshold_banks, NULL);
- for (bank = 0; bank < numbanks; bank++) { - if (bp[bank]) { - threshold_remove_bank(bp[bank]); - bp[bank] = NULL; - } - } - kfree(bp); + _mce_threshold_remove_device(bp); return 0; }
@@ -1350,15 +1356,14 @@ int mce_threshold_create_device(unsigned int cpu) if (!(this_cpu_read(bank_map) & (1 << bank))) continue; err = threshold_create_bank(bp, cpu, bank); - if (err) - goto out_err; + if (err) { + _mce_threshold_remove_device(bp); + return err; + } } this_cpu_write(threshold_banks, bp);
if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt; return 0; -out_err: - mce_threshold_remove_device(cpu); - return err; }