A warning is emitted in set_return_thunk() when the return thunk is overwritten since this is likely a bug and will result in mitigations not functioning and the mitigation information displayed in sysfs being incorrect.
There is a special case when the return thunk is overwritten from retbleed_return_thunk to srso_return_thunk since srso_return_thunk provides a superset of the functionality of retbleed_return_thunk, and this is handled correctly in entry_untrain_ret(). Avoid emitting the warning in this scenario to clarify that this is not an issue.
This situation occurs on certain AMD processors (e.g. Zen2) which are affected by both retbleed and srso.
Fixes: f4818881c47fd ("x86/its: Enable Indirect Target Selection mitigation") Cc: stable@vger.kernel.org # 5.15.x- Signed-off-by: Suraj Jitindar Singh surajjs@amazon.com --- arch/x86/kernel/cpu/bugs.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 8596ce85026c..b7797636140f 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -69,7 +69,16 @@ void (*x86_return_thunk)(void) __ro_after_init = __x86_return_thunk;
static void __init set_return_thunk(void *thunk) { - if (x86_return_thunk != __x86_return_thunk) + /* + * There can only be one return thunk enabled at a time, so issue a + * warning when overwriting it. retbleed_return_thunk is a special case + * which is safe to be overwritten with srso_return_thunk since it + * provides a superset of the functionality and is handled correctly in + * entry_untrain_ret(). + */ + if ((x86_return_thunk != __x86_return_thunk) && + (thunk != srso_return_thunk || + x86_return_thunk != retbleed_return_thunk)) pr_warn("x86/bugs: return thunk changed\n");
x86_return_thunk = thunk;
On Wed, May 14, 2025 at 03:08:35PM -0700, Suraj Jitindar Singh wrote:
- if (x86_return_thunk != __x86_return_thunk)
- /*
* There can only be one return thunk enabled at a time, so issue a
* warning when overwriting it. retbleed_return_thunk is a special case
* which is safe to be overwritten with srso_return_thunk since it
* provides a superset of the functionality and is handled correctly in
* entry_untrain_ret().
*/
- if ((x86_return_thunk != __x86_return_thunk) &&
(thunk != srso_return_thunk ||
x86_return_thunk != retbleed_return_thunk))
Instead of making this an unreadable conditional, why don't we ...
pr_warn("x86/bugs: return thunk changed\n");
... turn this into a
pr_info("set return thunk to: %ps\n", ...)
and simply say which thunk was set?
On Thu, May 15, 2025 at 12:25:07AM +0200, Borislav Petkov wrote:
On Wed, May 14, 2025 at 03:08:35PM -0700, Suraj Jitindar Singh wrote:
- if (x86_return_thunk != __x86_return_thunk)
- /*
* There can only be one return thunk enabled at a time, so issue a
* warning when overwriting it. retbleed_return_thunk is a special case
* which is safe to be overwritten with srso_return_thunk since it
* provides a superset of the functionality and is handled correctly in
* entry_untrain_ret().
*/
- if ((x86_return_thunk != __x86_return_thunk) &&
(thunk != srso_return_thunk ||
x86_return_thunk != retbleed_return_thunk))
Instead of making this an unreadable conditional, why don't we ...
pr_warn("x86/bugs: return thunk changed\n");
... turn this into a
pr_info("set return thunk to: %ps\n", ...)
and simply say which thunk was set?
This was discussed during the mitigation, and pr_warn() was chosen because it was not obvious that srso mitigation also mitigates retbleed. (On a retrospect, there should have been a comment about it).
The conclusion was to make the srso and retbleed relationship clear and then take care of the pr_warn().
On Wed, May 14, 2025 at 04:30:22PM -0700, Pawan Gupta wrote:
This was discussed during the mitigation, and pr_warn() was chosen because it was not obvious that srso mitigation also mitigates retbleed. (On a retrospect, there should have been a comment about it).
Why is that important?
We have multiple cases where a mitigation strategy addresses multiple attacks.
The conclusion was to make the srso and retbleed relationship clear and then take care of the pr_warn().
So let's ask ourselves: who is really going to see what single-line warning?
What are we *actually* trying to prevent here?
How about a big fat splat at least if we're really trying to prevent something nasty which causes a panic on warn...?
Thx.
On Thu, May 15, 2025 at 11:36:52AM +0200, Borislav Petkov wrote:
On Wed, May 14, 2025 at 04:30:22PM -0700, Pawan Gupta wrote:
This was discussed during the mitigation, and pr_warn() was chosen because it was not obvious that srso mitigation also mitigates retbleed. (On a retrospect, there should have been a comment about it).
Why is that important?
There are 4 mitigations that currently use return thunks, retbleed=stuff(Call Depth Tracking), retbleed=unret, SRSO and ITS. They all set the return thunks they want without checking if return thunks are already set by another mitigation. I understand the SRSO mitigates retbleed(BTC), but the same is not true for retbleed(RSB underflow mitigated by CDT) and ITS. If ITS overrides CDT return thunk, it will make CDT ineffective.
We have multiple cases where a mitigation strategy addresses multiple attacks.
Agree, but here we are talking about the opposite case where a mitigation unintentionally renders a previously set mitigation ineffective.
The conclusion was to make the srso and retbleed relationship clear and then take care of the pr_warn().
So let's ask ourselves: who is really going to see what single-line warning?
Don't know. I guess some do, hence this patch.
What are we *actually* trying to prevent here?
As I said above, a mitigation unintentionally make another mitigation ineffective.
How about a big fat splat at least if we're really trying to prevent something nasty which causes a panic on warn...?
Yes, maybe a WARN_ON() conditional to sanity checks for retbleed/SRSO.
On Thu, May 15, 2025 at 10:06:33AM -0700, Pawan Gupta wrote:
As I said above, a mitigation unintentionally make another mitigation ineffective.
I actually didn't need an analysis - my point is: if you're going to warn about it, then make it big so that it gets caught.
Yes, maybe a WARN_ON() conditional to sanity checks for retbleed/SRSO.
Yes, that.
At least.
The next step would be if this whole "let's set a thunk without overwriting a previously set one" can be fixed differently.
For now, though, the *least* what should be done here is catch the critical cases where a mitigation is rendered ineffective. And warning Joe Normal User about it doesn't bring anything. We do decide for the user what is safe or not, practically. At least this has been the strategy until now.
So the goal here should be to make Joe catch this and tell us to fix it.
Makes sense?
On Thu, May 15, 2025 at 07:23:55PM +0200, Borislav Petkov wrote:
On Thu, May 15, 2025 at 10:06:33AM -0700, Pawan Gupta wrote:
As I said above, a mitigation unintentionally make another mitigation ineffective.
I actually didn't need an analysis - my point is: if you're going to warn about it, then make it big so that it gets caught.
Yes, maybe a WARN_ON() conditional to sanity checks for retbleed/SRSO.
Yes, that.
At least.
The next step would be if this whole "let's set a thunk without overwriting a previously set one" can be fixed differently.
For now, though, the *least* what should be done here is catch the critical cases where a mitigation is rendered ineffective. And warning Joe Normal User about it doesn't bring anything. We do decide for the user what is safe or not, practically. At least this has been the strategy until now.
So the goal here should be to make Joe catch this and tell us to fix it.
Makes sense?
Absolutely makes sense.
Suraj, do want to revise this patch? Or else I can do it too.
On Thu, 2025-05-15 at 10:38 -0700, Pawan Gupta wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
On Thu, May 15, 2025 at 07:23:55PM +0200, Borislav Petkov wrote:
On Thu, May 15, 2025 at 10:06:33AM -0700, Pawan Gupta wrote:
As I said above, a mitigation unintentionally make another mitigation ineffective.
I actually didn't need an analysis - my point is: if you're going to warn about it, then make it big so that it gets caught.
Yes, maybe a WARN_ON() conditional to sanity checks for retbleed/SRSO.
Yes, that.
At least.
The next step would be if this whole "let's set a thunk without overwriting a previously set one" can be fixed differently.
For now, though, the *least* what should be done here is catch the critical cases where a mitigation is rendered ineffective. And warning Joe Normal User about it doesn't bring anything. We do decide for the user what is safe or not, practically. At least this has been the strategy until now.
So the goal here should be to make Joe catch this and tell us to fix it.
Makes sense?
Absolutely makes sense.
Suraj, do want to revise this patch? Or else I can do it too.
Happy to revise it.
To be clear, based on my understanding the request is to make the warning more obvious with a WARN()?
A warning message is emitted in set_return_thunk() when the return thunk is overwritten since this is likely a bug and will result in a mitigation not functioning and the mitigation information displayed in sysfs being incorrect.
Make this louder by using a WARN().
Cc: stable@vger.kernel.org # 5.15.x- Signed-off-by: Suraj Jitindar Singh surajjs@amazon.com --- arch/x86/kernel/cpu/bugs.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 8596ce85026c..9679fa30563c 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -69,8 +69,15 @@ void (*x86_return_thunk)(void) __ro_after_init = __x86_return_thunk;
static void __init set_return_thunk(void *thunk) { - if (x86_return_thunk != __x86_return_thunk) - pr_warn("x86/bugs: return thunk changed\n"); + /* + * There can only be one return thunk enabled at a time, so issue a + * warning when overwriting it as this is likely a bug which will + * result in a mitigation getting disabled and a vulnerability being + * incorrectly reported in sysfs. + */ + WARN(x86_return_thunk != __x86_return_thunk, + "x86/bugs: return thunk changed from %ps to %ps\n", + x86_return_thunk, thunk);
x86_return_thunk = thunk; }
There is a special case when the return thunk is overwritten from retbleed_return_thunk to srso_return_thunk since srso_return_thunk provides a superset of the functionality of retbleed_return_thunk, and this is handled correctly in entry_untrain_ret(). Avoid emitting the warning in this scenario to clarify that this is not an issue.
This situation occurs on certain AMD processors (e.g. Zen2) which are affected by both retbleed and srso.
Fixes: f4818881c47fd ("x86/its: Enable Indirect Target Selection mitigation") Cc: stable@vger.kernel.org # 5.15.x- Signed-off-by: Suraj Jitindar Singh surajjs@amazon.com --- arch/x86/kernel/cpu/bugs.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 9679fa30563c..840902cee670 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -73,9 +73,14 @@ static void __init set_return_thunk(void *thunk) * There can only be one return thunk enabled at a time, so issue a * warning when overwriting it as this is likely a bug which will * result in a mitigation getting disabled and a vulnerability being - * incorrectly reported in sysfs. + * incorrectly reported in sysfs. retbleed_return_thunk is a special + * case which is safe to be overwritten with srso_return_thunk since it + * provides a superset of the functionality and is handled correctly in + * entry_untrain_ret(). */ - WARN(x86_return_thunk != __x86_return_thunk, + WARN((x86_return_thunk != __x86_return_thunk) && + (thunk != srso_return_thunk || + x86_return_thunk != retbleed_return_thunk), "x86/bugs: return thunk changed from %ps to %ps\n", x86_return_thunk, thunk);
On Thu, May 15, 2025 at 04:34:33PM -0700, Suraj Jitindar Singh wrote:
- WARN(x86_return_thunk != __x86_return_thunk,
- WARN((x86_return_thunk != __x86_return_thunk) &&
(thunk != srso_return_thunk ||
x86_return_thunk != retbleed_return_thunk), "x86/bugs: return thunk changed from %ps to %ps\n", x86_return_thunk, thunk);
This is still adding that nasty conditional which I'd like to avoid.
And I just had this other idea: we're switching to select/update/apply logic with the mitigations and I'm sure we can use that new ability to select the proper mitigation when other mitigations are influencing the decision, to select the proper return thunk.
I'm thinking for retbleed and SRSO we could set it only once, perhaps in srso_select_mitigation() as it runs last.
I don't want to introduce an amd_return_thunk... :-)
But David might have a better idea...
Thx.
[AMD Official Use Only - AMD Internal Distribution Only]
-----Original Message----- From: Borislav Petkov bp@alien8.de Sent: Friday, May 16, 2025 2:48 AM To: Suraj Jitindar Singh surajjs@amazon.com; Kaplan, David David.Kaplan@amd.com Cc: linux-kernel@vger.kernel.org; x86@kernel.org; Thomas Gleixner tglx@linutronix.de; Peter Zijlstra peterz@infradead.org; Josh Poimboeuf jpoimboe@kernel.org; Pawan Gupta pawan.kumar.gupta@linux.intel.com; Ingo Molnar mingo@redhat.com; Dave Hansen dave.hansen@linux.intel.com; stable@vger.kernel.org Subject: Re: [PATCH 2/2] x86/bugs: Don't WARN() when overwriting retbleed_return_thunk with srso_return_thunk
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
On Thu, May 15, 2025 at 04:34:33PM -0700, Suraj Jitindar Singh wrote:
WARN(x86_return_thunk != __x86_return_thunk,
WARN((x86_return_thunk != __x86_return_thunk) &&
(thunk != srso_return_thunk ||
x86_return_thunk != retbleed_return_thunk), "x86/bugs: return thunk changed from %ps to %ps\n", x86_return_thunk, thunk);
This is still adding that nasty conditional which I'd like to avoid.
And I just had this other idea: we're switching to select/update/apply logic with the mitigations and I'm sure we can use that new ability to select the proper mitigation when other mitigations are influencing the decision, to select the proper return thunk.
I'm thinking for retbleed and SRSO we could set it only once, perhaps in srso_select_mitigation() as it runs last.
I don't want to introduce an amd_return_thunk... :-)
But David might have a better idea...
Hmm. Since SRSO is kind of a superset of retbleed, it might make sense to create a new mitigation, RETBLEED_MITIGATION_SAFE_RET.
retbleed_update_mitigation() can change its mitigation to this if srso_mitigation is SAFE_RET (or SAFE_RET_UCODE_NEEDED). RETBLEED_MITIGATION_SAFE_RET can do nothing in retbleed_apply_mitigation() because it means that srso is taking care of things. Thoughts?
This also made me realize there's another minor missing interaction here, which is that if spec_rstack_overflow=ibpb, then that should set retbleed_mitigation to IBPB as well.
--David Kaplan
On Fri, May 16, 2025 at 03:18:30PM +0000, Kaplan, David wrote:
Hmm. Since SRSO is kind of a superset of retbleed, it might make sense to create a new mitigation, RETBLEED_MITIGATION_SAFE_RET.
retbleed_update_mitigation() can change its mitigation to this if srso_mitigation is SAFE_RET (or SAFE_RET_UCODE_NEEDED). RETBLEED_MITIGATION_SAFE_RET can do nothing in retbleed_apply_mitigation() because it means that srso is taking care of things. Thoughts?
This also made me realize there's another minor missing interaction here, which is that if spec_rstack_overflow=ibpb, then that should set retbleed_mitigation to IBPB as well.
Ok, this sounds like we should expedite our srso mitigation cleanup intentions. :-)
Lemme find you on chat...
linux-stable-mirror@lists.linaro.org