Hi,
This is the v6 of two x86 fixes.
1) x86/delay: Fix the wrong Assembly constraint in delay_loop() function. 2) x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails.
## Changelog
v6: - Remove unnecessary Cc tags. - Undo the stable mark for patch 1. - Update commit message, emphasize the danger when the compiler decides to inline the function. - Fix the Fixes tag sha1 in patch 1. - Change the helper function name to __threshold_remove_device().
v5: - Mark patch #1 for stable. - Commit message improvement for patch #1 and #2. - Fold in changes from Yazen and Alviro (for patch #2).
v4: - Address comment from Greg, sha1 commit Fixes only needs to be 12 chars. - Add the author of the fixed commit to the CC list.
v3: - Fold in changes from Alviro, the previous version is still leaking @bank[n].
v2: - Fix wrong copy/paste.
Signed-off-by: Alviro Iskandar Setiawan alviro.iskandar@gnuweeb.org Signed-off-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Ammar Faizi ammarfaizi2@gnuweeb.org --- Ammar Faizi (2): x86/delay: Fix the wrong asm constraint in `delay_loop()` x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails
arch/x86/kernel/cpu/mce/amd.c | 32 +++++++++++++++++++------------- arch/x86/lib/delay.c | 4 ++-- 2 files changed, 21 insertions(+), 15 deletions(-)
base-commit: 1930a6e739c4b4a654a69164dbe39e554d228915
The asm constraint does not reflect that the asm statement can modify the value of @loops. But the asm statement in delay_loop() does modify the @loops.
Specifiying the wrong constraint may lead to undefined behavior, it may clobber random stuff (e.g. local variable, important temporary value in regs, etc.). This is especially dangerous when the compiler decides to inline the function and since it doesn't know that the value gets modified, it might decide to use it from a register directly without reloading it.
Fix this by changing the constraint from "a" (as an input) to "+a" (as an input and output).
Cc: David Laight David.Laight@ACULAB.COM Cc: Jiri Hladky hladky.jiri@googlemail.com Cc: Alviro Iskandar Setiawan alviro.iskandar@gnuweeb.org Fixes: e01b70ef3eb3 ("x86: fix bug in arch/i386/lib/delay.c file, delay_loop function") Signed-off-by: Ammar Faizi ammarfaizi2@gnuweeb.org ---
v6: - Remove unnecessary Cc tags. - Update commit message, emphasize the danger when the compiler decides to inline the function. - Fix the Fixes tag sha1. - Remove stable Cc.
--- arch/x86/lib/delay.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c index 65d15df6212d..0e65d00e2339 100644 --- a/arch/x86/lib/delay.c +++ b/arch/x86/lib/delay.c @@ -54,8 +54,8 @@ static void delay_loop(u64 __loops) " jnz 2b \n" "3: dec %0 \n"
- : /* we don't need output */ - :"a" (loops) + : "+a" (loops) + : ); }
On 3/29/22 03:47, Ammar Faizi wrote:
The asm constraint does not reflect that the asm statement can modify the value of @loops. But the asm statement in delay_loop() does modify the @loops.
Specifiying the wrong constraint may lead to undefined behavior, it may clobber random stuff (e.g. local variable, important temporary value in regs, etc.). This is especially dangerous when the compiler decides to inline the function and since it doesn't know that the value gets modified, it might decide to use it from a register directly without reloading it.
Fix this by changing the constraint from "a" (as an input) to "+a" (as an input and output).
Was this found by inspection or was it causing real-world problems?
On 4/2/22 12:42 AM, Dave Hansen wrote:
Was this found by inspection or was it causing real-world problems?
It was found by inspection, no real-world problems found so far.
On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
The asm constraint does not reflect that the asm statement can modify the value of @loops. But the asm statement in delay_loop() does modify the @loops.
Specifiying the wrong constraint may lead to undefined behavior, it may clobber random stuff (e.g. local variable, important temporary value in regs, etc.). This is especially dangerous when the compiler decides to inline the function and since it doesn't know that the value gets modified, it might decide to use it from a register directly without reloading it.
Fix this by changing the constraint from "a" (as an input) to "+a" (as an input and output).
This analysis is plain wrong. The assembly code operates on a register and not on memory:
asm volatile( " test %0,%0 \n" " jz 3f \n" " jmp 1f \n"
".align 16 \n" "1: jmp 2f \n"
".align 16 \n" "2: dec %0 \n" " jnz 2b \n" "3: dec %0 \n"
: /* we don't need output */ ----> :"a" (loops)
This tells the compiler to use [RE]AX and initialize it from the variable 'loops'. It's never written back because all '%0' in the above assembly are substituted with [RE]AX. This also tells the compiler that the inline assembly clobbers [RE]AX and that's all it needs to know.
Nothing to fix here, whether the code is inlined or not.
Thanks,
tglx
On 4/3/22 11:57 PM, Thomas Gleixner wrote:
On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
The asm constraint does not reflect that the asm statement can modify the value of @loops. But the asm statement in delay_loop() does modify the @loops.
Specifiying the wrong constraint may lead to undefined behavior, it may clobber random stuff (e.g. local variable, important temporary value in regs, etc.). This is especially dangerous when the compiler decides to inline the function and since it doesn't know that the value gets modified, it might decide to use it from a register directly without reloading it.
Fix this by changing the constraint from "a" (as an input) to "+a" (as an input and output).
This analysis is plain wrong. The assembly code operates on a register and not on memory:
asm volatile( " test %0,%0 \n" " jz 3f \n" " jmp 1f \n"
".align 16 \n" "1: jmp 2f \n" ".align 16 \n" "2: dec %0 \n" " jnz 2b \n" "3: dec %0 \n" : /* we don't need output */
----> :"a" (loops)
This tells the compiler to use [RE]AX and initialize it from the variable 'loops'. It's never written back because all '%0' in the above assembly are substituted with [RE]AX. This also tells the compiler that the inline assembly clobbers [RE]AX and that's all it needs to know.
Hi Thomas,
Thanks for taking a look. I doubt about your sentence "This also tells the compiler that the inline assembly clobbers [RE]AX".
How come it tells the compiler that the inline ASM clobbers [RE]AX?
That's an input constraint. Doesn't that mean it is read-only for the ASM statement? That means the compiler is allowed to assume [RE]AX doesn't change inside the ASM statement.
Those `dec`s do really change the [RE]AX. Please review this again.
Thanks!
On Sun, Apr 03 2022 at 18:57, Thomas Gleixner wrote:
On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
The asm constraint does not reflect that the asm statement can modify the value of @loops. But the asm statement in delay_loop() does modify the @loops.
Specifiying the wrong constraint may lead to undefined behavior, it may clobber random stuff (e.g. local variable, important temporary value in regs, etc.). This is especially dangerous when the compiler decides to inline the function and since it doesn't know that the value gets modified, it might decide to use it from a register directly without reloading it.
Ignore me, I misread this part of the explanation.
Thanks,
tglx
In mce_threshold_create_device(), if threshold_create_bank() fails, the @bp will be leaked, because the call to mce_threshold_remove_device() will not free the @bp. mce_threshold_remove_device() frees @threshold_banks. At that point, the @bp has not been written to @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
Fix this by extracting the cleanup part into a new static function __threshold_remove_device(), then call it from create/remove device functions.
Also, eliminate the "goto out_err", just early return inside the loop if the creation fails.
Cc: stable@vger.kernel.org # v5.8+ Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-developed-by: Alviro Iskandar Setiawan alviro.iskandar@gnuweeb.org Signed-off-by: Alviro Iskandar Setiawan alviro.iskandar@gnuweeb.org Co-developed-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Ammar Faizi ammarfaizi2@gnuweeb.org ---
v6: - Change the helper function name to __threshold_remove_device().
--- arch/x86/kernel/cpu/mce/amd.c | 32 +++++++++++++++++++------------- 1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 1940d305db1c..d293ae088d6b 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1294,10 +1294,23 @@ static void threshold_remove_bank(struct threshold_bank *bank) kfree(bank); }
+static void __threshold_remove_device(struct threshold_bank **bp, + unsigned int numbanks) +{ + unsigned int bank; + + for (bank = 0; bank < numbanks; bank++) { + if (bp[bank]) { + threshold_remove_bank(bp[bank]); + bp[bank] = NULL; + } + } + kfree(bp); +} + int mce_threshold_remove_device(unsigned int cpu) { struct threshold_bank **bp = this_cpu_read(threshold_banks); - unsigned int bank, numbanks = this_cpu_read(mce_num_banks);
if (!bp) return 0; @@ -1308,13 +1321,7 @@ int mce_threshold_remove_device(unsigned int cpu) */ this_cpu_write(threshold_banks, NULL);
- for (bank = 0; bank < numbanks; bank++) { - if (bp[bank]) { - threshold_remove_bank(bp[bank]); - bp[bank] = NULL; - } - } - kfree(bp); + __threshold_remove_device(bp, this_cpu_read(mce_num_banks)); return 0; }
@@ -1351,15 +1358,14 @@ int mce_threshold_create_device(unsigned int cpu) if (!(this_cpu_read(bank_map) & (1 << bank))) continue; err = threshold_create_bank(bp, cpu, bank); - if (err) - goto out_err; + if (err) { + __threshold_remove_device(bp, numbanks); + return err; + } } this_cpu_write(threshold_banks, bp);
if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt; return 0; -out_err: - mce_threshold_remove_device(cpu); - return err; }
On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
In mce_threshold_create_device(), if threshold_create_bank() fails, the @bp will be leaked, because the call to mce_threshold_remove_device() will not free the @bp. mce_threshold_remove_device() frees @threshold_banks. At that point, the @bp has not been written to @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
Fix this by extracting the cleanup part into a new static function __threshold_remove_device(), then call it from create/remove device functions.
The way simpler fix is to move
} this_cpu_write(threshold_banks, bp);
before the loop. That's safe because the banks cannot yet be reached via an MCE as the vector is not yet enabled:
if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt;
Thanks,
tglx
On 4/4/22 12:03 AM, Thomas Gleixner wrote:
On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
In mce_threshold_create_device(), if threshold_create_bank() fails, the @bp will be leaked, because the call to mce_threshold_remove_device() will not free the @bp. mce_threshold_remove_device() frees @threshold_banks. At that point, the @bp has not been written to @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
Fix this by extracting the cleanup part into a new static function __threshold_remove_device(), then call it from create/remove device functions.
The way simpler fix is to move
} this_cpu_write(threshold_banks, bp);
before the loop. That's safe because the banks cannot yet be reached via an MCE as the vector is not yet enabled:
if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt;
Thomas,
I did like what you said (in the patch v4), but after Yazen and Borislav reviewed it, we got a conclusion that it's not safe.
See [1] and [2] for the full message.
[1]: https://lore.kernel.org/lkml/YkFsQhpGGXIFTMyp@zn.tnic/ [2]: https://lore.kernel.org/lkml/Yh+oyD%2F5M3TW5ZMM@yaz-ubuntu/
Yazen, Borislav, please take a deeper look on this again. I will send a v7 revision to really make it simpler by moving that "per-CPU var write" before the loop.
Thanks!
On 4/4/22 12:43 AM, Ammar Faizi wrote:
On 4/4/22 12:03 AM, Thomas Gleixner wrote:
On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
In mce_threshold_create_device(), if threshold_create_bank() fails, the @bp will be leaked, because the call to mce_threshold_remove_device() will not free the @bp. mce_threshold_remove_device() frees @threshold_banks. At that point, the @bp has not been written to @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
Fix this by extracting the cleanup part into a new static function __threshold_remove_device(), then call it from create/remove device functions.
The way simpler fix is to move
} this_cpu_write(threshold_banks, bp);
before the loop. That's safe because the banks cannot yet be reached via an MCE as the vector is not yet enabled:
if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt;
Thomas,
I did like what you said (in the patch v4), but after Yazen and Borislav reviewed it, we got a conclusion that it's not safe.
See [1] and [2] for the full message.
Yazen, Borislav, please take a deeper look on this again. I will send a v7 revision to really make it simpler by moving that "per-CPU var write" before the loop.
(only if it's really safe)
On Mon, Apr 04, 2022 at 12:45:04AM +0700, Ammar Faizi wrote:
Yazen, Borislav, please take a deeper look on this again. I will send a v7 revision to really make it simpler by moving that "per-CPU var write" before the loop.
(only if it's really safe)
Don't bother - I've discussed it with tglx offlist. The current solution remains.
The following commit has been merged into the ras/core branch of tip:
Commit-ID: e5f28623ceb103e13fc3d7bd45edf9818b227fd0 Gitweb: https://git.kernel.org/tip/e5f28623ceb103e13fc3d7bd45edf9818b227fd0 Author: Ammar Faizi ammarfaizi2@gnuweeb.org AuthorDate: Tue, 29 Mar 2022 17:47:05 +07:00 Committer: Borislav Petkov bp@suse.de CommitterDate: Tue, 05 Apr 2022 21:24:37 +02:00
x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails
In mce_threshold_create_device(), if threshold_create_bank() fails, the previously allocated threshold banks array @bp will be leaked because the call to mce_threshold_remove_device() will not free it.
This happens because mce_threshold_remove_device() fetches the pointer through the threshold_banks per-CPU variable but bp is written there only after the bank creation is successful, and not before, when threshold_create_bank() fails.
Add a helper which unwinds all the bank creation work previously done and pass into it the previously allocated threshold banks array for freeing.
[ bp: Massage. ]
Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-developed-by: Alviro Iskandar Setiawan alviro.iskandar@gnuweeb.org Signed-off-by: Alviro Iskandar Setiawan alviro.iskandar@gnuweeb.org Co-developed-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Ammar Faizi ammarfaizi2@gnuweeb.org Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220329104705.65256-3-ammarfaizi2@gnuweeb.org --- arch/x86/kernel/cpu/mce/amd.c | 32 +++++++++++++++++++------------- 1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 1940d30..1c87501 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1294,10 +1294,23 @@ out_free: kfree(bank); }
+static void __threshold_remove_device(struct threshold_bank **bp) +{ + unsigned int bank, numbanks = this_cpu_read(mce_num_banks); + + for (bank = 0; bank < numbanks; bank++) { + if (!bp[bank]) + continue; + + threshold_remove_bank(bp[bank]); + bp[bank] = NULL; + } + kfree(bp); +} + int mce_threshold_remove_device(unsigned int cpu) { struct threshold_bank **bp = this_cpu_read(threshold_banks); - unsigned int bank, numbanks = this_cpu_read(mce_num_banks);
if (!bp) return 0; @@ -1308,13 +1321,7 @@ int mce_threshold_remove_device(unsigned int cpu) */ this_cpu_write(threshold_banks, NULL);
- for (bank = 0; bank < numbanks; bank++) { - if (bp[bank]) { - threshold_remove_bank(bp[bank]); - bp[bank] = NULL; - } - } - kfree(bp); + __threshold_remove_device(bp); return 0; }
@@ -1351,15 +1358,14 @@ int mce_threshold_create_device(unsigned int cpu) if (!(this_cpu_read(bank_map) & (1 << bank))) continue; err = threshold_create_bank(bp, cpu, bank); - if (err) - goto out_err; + if (err) { + __threshold_remove_device(bp); + return err; + } } this_cpu_write(threshold_banks, bp);
if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt; return 0; -out_err: - mce_threshold_remove_device(cpu); - return err; }
linux-stable-mirror@lists.linaro.org