On Mon, Aug 19, 2024 at 05:06:58PM +0200, Johan Hovold wrote:
On Sun, Aug 18, 2024 at 04:17:38PM -0700, Bjorn Andersson wrote:
Commit 'caa855189104 ("soc: qcom: pmic_glink: Fix race during initialization")'
This commit does not exist, but I think you really meant to refer to
9329933699b3 ("soc: qcom: pmic_glink: Make client-lock non-sleeping")
and possibly also
635ce0db8956 ("soc: qcom: pmic_glink: don't traverse clients list without a lock")
here.
Yeah, I copy-pasted the wrong SHA1. Prior to commit 9329933699b3 ("soc: qcom: pmic_glink: Make client-lock non-sleeping") the PDR notification happened from a worker with only mutexes held.
moved the pmic_glink client list under a spinlock, as it is accessed by the rpmsg/glink callback, which in turn is invoked from IRQ context.
This means that ucsi_unregister() is now called from IRQ context, which isn't feasible as it's expecting a sleepable context.
But this is not correct as you say above that the callback has always been made in IRQ context. Then this bug has been there since the introduction of the UCSI driver by commit
No, I'm stating that commit 9329933699b3 ("soc: qcom: pmic_glink: Make client-lock non-sleeping") was needed because the client list is traversed under the separate glink callback, which has always been made in IRQ context.
62b5412b1f4a ("usb: typec: ucsi: add PMIC Glink UCSI driver")
An effort is under way to get GLINK to invoke its callbacks in a sleepable context, but until then lets schedule the unregistration.
A side effect of this is that ucsi_unregister() can now happen after the remote processor, and thereby the communication link with it, is gone. pmic_glink_send() is amended with a check to avoid the resulting NULL pointer dereference, but it becomes expecting to see a failing send
Perhaps you can rephrase this bit ("becomes expecting to see").
Sure.
upon shutting down the remote processor (e.g. during a restart following a firmware crash):
ucsi_glink.pmic_glink_ucsi pmic_glink.ucsi.0: failed to send UCSI write request: -5
Fixes: caa855189104 ("soc: qcom: pmic_glink: Fix race during initialization")
So this should be
Fixes: 62b5412b1f4a ("usb: typec: ucsi: add PMIC Glink UCSI driver")
I think it should be:
9329933699b3 ("soc: qcom: pmic_glink: Make client-lock non-sleeping")
Cc: stable@vger.kernel.org Signed-off-by: Bjorn Andersson quic_bjorande@quicinc.com
diff --git a/drivers/usb/typec/ucsi/ucsi_glink.c b/drivers/usb/typec/ucsi/ucsi_glink.c index ac53a81c2a81..a33056eec83d 100644 --- a/drivers/usb/typec/ucsi/ucsi_glink.c +++ b/drivers/usb/typec/ucsi/ucsi_glink.c @@ -68,6 +68,9 @@ struct pmic_glink_ucsi { struct work_struct notify_work; struct work_struct register_work;
- spinlock_t state_lock;
- unsigned int pdr_state;
- unsigned int new_pdr_state;
Should these be int to match the notify callback (and enum servreg_service_state)?
Ohh my. I made it unsigned because I made it unsigned in pmic_glink, when I wrote that. But as you point out, the type passed around is an enum servreg_service_state and it's mostly handled as a signed int.
That said, pmic_glink actually filters the value space down to UP/DOWN, so making this "bool pdr_up" (pd_running?) and "bool ucsi_registered" would make this cleaner...
u8 read_buf[UCSI_BUF_SIZE]; }; @@ -244,8 +247,22 @@ static void pmic_glink_ucsi_notify(struct work_struct *work) static void pmic_glink_ucsi_register(struct work_struct *work) { struct pmic_glink_ucsi *ucsi = container_of(work, struct pmic_glink_ucsi, register_work);
- unsigned long flags;
- unsigned int new_state;
Then int here too.
Yes.
- spin_lock_irqsave(&ucsi->state_lock, flags);
- new_state = ucsi->new_pdr_state;
- spin_unlock_irqrestore(&ucsi->state_lock, flags);
- if (ucsi->pdr_state != SERVREG_SERVICE_STATE_UP) {
if (new_state == SERVREG_SERVICE_STATE_UP)
ucsi_register(ucsi->ucsi);
- } else {
if (new_state == SERVREG_SERVICE_STATE_DOWN)
ucsi_unregister(ucsi->ucsi);
Do you risk a double deregistration (and UAF/double free) here?
I believe we're good.
Thank you, Bjorn
- }
- ucsi_register(ucsi->ucsi);
- ucsi->pdr_state = new_state;
}
Johan