On Wed, Feb 03, 2021 at 12:27:40AM +0200, Jarkko Sakkinen wrote:
On Tue, Feb 02, 2021 at 09:58:24AM -0800, James Bottomley wrote:
On Tue, 2021-02-02 at 11:26 -0600, Serge E. Hallyn wrote:
On Tue, Feb 02, 2021 at 05:33:17PM +0200, jarkko@kernel.org wrote:
From: Jarkko Sakkinen <jarkko@kernel.org>
An unexpected status from the TPM chip is not an irrecoverable failure of the kernel. It is only an undesirable situation. Thus, change the WARN_ONCE instance inside tpm_tis_status() to pr_warn_once().
In addition: print the status in the log message because it is actually useful information lacking from the existing log message.
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org
Fixes: 6f4f57f0b909 ("tpm: ibmvtpm: fix error return code in tpm_ibmvtpm_probe()")
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
 drivers/char/tpm/tpm_tis_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 431919d5f48a..21f67c6366cb 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -202,7 +202,7 @@ static u8 tpm_tis_status(struct tpm_chip *chip)
 		 * acquired. Usually because tpm_try_get_ops() hasn't
 		 * been called before doing a TPM operation.
 		 */
-		WARN_ONCE(1, "TPM returned invalid status\n");
+		pr_warn_once("TPM returned invalid status: 0x%x\n", status);
 		return 0;
 	}
Actually in this case I don't understand why _once, especially based on the comment. Would ratelimited not be better? So we can see if it happens repeatedly? Even better would be if we could see when it next gave a valid status after an invalid one.
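To make the difference between the two policies concrete, here is a minimal userspace sketch. It does not use the kernel's pr_warn_once() or pr_warn_ratelimited() macros; once_allow(), ratelimit_allow() and the two count_logged_*() helpers are hypothetical stand-ins, and the interval and burst values are assumptions chosen only for illustration. Time is a plain tick counter rather than jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a "print once" policy. */
struct once_state {
	bool printed;
};

/* Allow only the very first message; every later one is dropped. */
static bool once_allow(struct once_state *s)
{
	if (s->printed)
		return false;
	s->printed = true;
	return true;
}

/*
 * Hypothetical stand-in for a rate-limit policy: at most `burst`
 * messages per window of `interval` ticks.
 */
struct ratelimit_state {
	unsigned long interval;
	unsigned int burst;
	unsigned long begin;	/* start of the current window */
	unsigned int printed;	/* messages allowed in this window */
	bool started;
};

static bool ratelimit_allow(struct ratelimit_state *rs, unsigned long now)
{
	assert(rs->interval > 0 && rs->burst > 0);

	/* Open a new window on first use or when the old one expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/* Count how many invalid-status events would reach the log. */
static unsigned int count_logged_once(const unsigned long *times,
				      unsigned int n)
{
	struct once_state s = { .printed = false };
	unsigned int logged = 0;

	for (unsigned int i = 0; i < n; i++)
		if (once_allow(&s))
			logged++;
	return logged;
}

static unsigned int count_logged_ratelimited(const unsigned long *times,
					     unsigned int n,
					     unsigned long interval,
					     unsigned int burst)
{
	struct ratelimit_state rs = { .interval = interval, .burst = burst };
	unsigned int logged = 0;

	for (unsigned int i = 0; i < n; i++)
		if (ratelimit_allow(&rs, times[i]))
			logged++;
	return logged;
}
```

With invalid statuses at ticks 0, 1, 2, 100, 101 and 500, the once policy logs a single message, while a limit of one message per 50 ticks logs one per burst of failures, so a recurrence after a quiet period stays visible in the log.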
The reason was that we're trying to catch and kill paths to the status where the locality is incorrect. If you do some operation that finds an incorrect path, the likelihood is you'll exercise it more than once, but all we need to identify it is the call trace from a single access. The symptom the user space process sees is a TPM timeout, but we still have the in-kernel trace to tell us why.
I don't agree with this reasoning. This warning could also be triggered by a chip that does not follow the TCG spec. The patch does provide the status code, which is always useful information.
I.e. WARN() usually implies that there is something wrong in the kernel state that risks its stability, which *is not* the case here. Thus, it's best to make the status code visible, not the stack trace, and draw the rest of the conclusions from that.
/Jarkko