smp_call_function_single disables IRQs when executing the callback. To prevent deadlocks, we must disable IRQs when taking cgr_lock elsewhere. This is already done by qman_update_cgr and qman_delete_cgr; fix the other lockers.
Fixes: 96f413f47677 ("soc/fsl/qbman: fix issue in qman_delete_cgr_safe()")
CC: stable@vger.kernel.org
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
Resent from a non-mangling email.
(no changes since v3)
Changes in v3:
- Change blamed commit to something more appropriate
Changes in v2:
- Fix one additional call to spin_unlock
 drivers/soc/fsl/qbman/qman.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
index 739e4eee6b75..1bf1f1ea67f0 100644
--- a/drivers/soc/fsl/qbman/qman.c
+++ b/drivers/soc/fsl/qbman/qman.c
@@ -1456,11 +1456,11 @@ static void qm_congestion_task(struct work_struct *work)
 	union qm_mc_result *mcr;
 	struct qman_cgr *cgr;
 
-	spin_lock(&p->cgr_lock);
+	spin_lock_irq(&p->cgr_lock);
 	qm_mc_start(&p->p);
 	qm_mc_commit(&p->p, QM_MCC_VERB_QUERYCONGESTION);
 	if (!qm_mc_result_timeout(&p->p, &mcr)) {
-		spin_unlock(&p->cgr_lock);
+		spin_unlock_irq(&p->cgr_lock);
 		dev_crit(p->config->dev, "QUERYCONGESTION timeout\n");
 		qman_p_irqsource_add(p, QM_PIRQ_CSCI);
 		return;
@@ -1476,7 +1476,7 @@ static void qm_congestion_task(struct work_struct *work)
 	list_for_each_entry(cgr, &p->cgr_cbs, node)
 		if (cgr->cb && qman_cgrs_get(&c, cgr->cgrid))
 			cgr->cb(p, cgr, qman_cgrs_get(&rr, cgr->cgrid));
-	spin_unlock(&p->cgr_lock);
+	spin_unlock_irq(&p->cgr_lock);
 	qman_p_irqsource_add(p, QM_PIRQ_CSCI);
 }
@@ -2440,7 +2440,7 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags,
 	preempt_enable();
 
 	cgr->chan = p->config->channel;
-	spin_lock(&p->cgr_lock);
+	spin_lock_irq(&p->cgr_lock);
 
 	if (opts) {
 		struct qm_mcc_initcgr local_opts = *opts;
@@ -2477,7 +2477,7 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags,
 	    qman_cgrs_get(&p->cgrs[1], cgr->cgrid))
 		cgr->cb(p, cgr, 1);
 out:
-	spin_unlock(&p->cgr_lock);
+	spin_unlock_irq(&p->cgr_lock);
 	put_affine_portal();
 	return ret;
 }
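The deadlock described in the commit message above can be hard to see from the diff alone. Here is a minimal, hypothetical sketch of it; the names example_lock, example_ipi_callback and example_task_context are invented for illustration and are not part of the driver:

#include <linux/smp.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_lock);		/* stand-in for cgr_lock */

/* Runs in hard IRQ context on the target CPU, with IRQs disabled. */
static void example_ipi_callback(void *unused)
{
	spin_lock(&example_lock);	/* IRQs are already off here */
	/* ... update state shared with task context ... */
	spin_unlock(&example_lock);
}

static void example_task_context(void)
{
	/*
	 * BAD: if the IPI running example_ipi_callback() lands on this CPU
	 * while the lock is held, the callback spins on example_lock with
	 * IRQs off and never returns, so the lock is never released.
	 */
	spin_lock(&example_lock);
	/* ... */
	spin_unlock(&example_lock);

	/*
	 * GOOD: with IRQs disabled, the IPI cannot be delivered on this CPU
	 * until the lock has been released.
	 */
	spin_lock_irq(&example_lock);
	/* ... */
	spin_unlock_irq(&example_lock);

	/* The cross-call that eventually runs example_ipi_callback(). */
	smp_call_function_single(0, example_ipi_callback, NULL, 1);
}

qman_update_cgr() and qman_delete_cgr() already follow the second pattern; the hunks above bring qm_congestion_task() and qman_create_cgr() in line with it.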
cgr_lock may be locked with interrupts already disabled by smp_call_function_single. As such, we must use a raw spinlock to avoid problems on PREEMPT_RT kernels. Although this bug has existed for a while, it was not apparent until commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change") which invokes smp_call_function_single via qman_update_cgr_safe every time a link goes up or down.
Fixes: 96f413f47677 ("soc/fsl/qbman: fix issue in qman_delete_cgr_safe()")
CC: stable@vger.kernel.org
Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Closes: https://lore.kernel.org/all/20230323153935.nofnjucqjqnz34ej@skbuf/
Reported-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Closes: https://lore.kernel.org/linux-arm-kernel/87wmsyvclu.fsf@pengutronix.de/
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
Changes in v4:
- Add a note about how raw spinlocks aren't quite right
Changes in v3:
- Change blamed commit to something more appropriate
 drivers/soc/fsl/qbman/qman.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
index 1bf1f1ea67f0..7e9074519ad2 100644
--- a/drivers/soc/fsl/qbman/qman.c
+++ b/drivers/soc/fsl/qbman/qman.c
@@ -991,7 +991,7 @@ struct qman_portal {
 	/* linked-list of CSCN handlers. */
 	struct list_head cgr_cbs;
 	/* list lock */
-	spinlock_t cgr_lock;
+	raw_spinlock_t cgr_lock;
 	struct work_struct congestion_work;
 	struct work_struct mr_work;
 	char irqname[MAX_IRQNAME];
@@ -1281,7 +1281,7 @@ static int qman_create_portal(struct qman_portal *portal,
 		/* if the given mask is NULL, assume all CGRs can be seen */
 		qman_cgrs_fill(&portal->cgrs[0]);
 	INIT_LIST_HEAD(&portal->cgr_cbs);
-	spin_lock_init(&portal->cgr_lock);
+	raw_spin_lock_init(&portal->cgr_lock);
 	INIT_WORK(&portal->congestion_work, qm_congestion_task);
 	INIT_WORK(&portal->mr_work, qm_mr_process_task);
 	portal->bits = 0;
@@ -1456,11 +1456,14 @@ static void qm_congestion_task(struct work_struct *work)
 	union qm_mc_result *mcr;
 	struct qman_cgr *cgr;
 
-	spin_lock_irq(&p->cgr_lock);
+	/*
+	 * FIXME: QM_MCR_TIMEOUT is 10ms, which is too long for a raw spinlock!
+	 */
+	raw_spin_lock_irq(&p->cgr_lock);
 	qm_mc_start(&p->p);
 	qm_mc_commit(&p->p, QM_MCC_VERB_QUERYCONGESTION);
 	if (!qm_mc_result_timeout(&p->p, &mcr)) {
-		spin_unlock_irq(&p->cgr_lock);
+		raw_spin_unlock_irq(&p->cgr_lock);
 		dev_crit(p->config->dev, "QUERYCONGESTION timeout\n");
 		qman_p_irqsource_add(p, QM_PIRQ_CSCI);
 		return;
@@ -1476,7 +1479,7 @@ static void qm_congestion_task(struct work_struct *work)
 	list_for_each_entry(cgr, &p->cgr_cbs, node)
 		if (cgr->cb && qman_cgrs_get(&c, cgr->cgrid))
 			cgr->cb(p, cgr, qman_cgrs_get(&rr, cgr->cgrid));
-	spin_unlock_irq(&p->cgr_lock);
+	raw_spin_unlock_irq(&p->cgr_lock);
 	qman_p_irqsource_add(p, QM_PIRQ_CSCI);
 }
@@ -2440,7 +2443,7 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags,
 	preempt_enable();
 
 	cgr->chan = p->config->channel;
-	spin_lock_irq(&p->cgr_lock);
+	raw_spin_lock_irq(&p->cgr_lock);
 
 	if (opts) {
 		struct qm_mcc_initcgr local_opts = *opts;
@@ -2477,7 +2480,7 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags,
 	    qman_cgrs_get(&p->cgrs[1], cgr->cgrid))
 		cgr->cb(p, cgr, 1);
 out:
-	spin_unlock_irq(&p->cgr_lock);
+	raw_spin_unlock_irq(&p->cgr_lock);
 	put_affine_portal();
 	return ret;
 }
@@ -2512,7 +2515,7 @@ int qman_delete_cgr(struct qman_cgr *cgr)
 		return -EINVAL;
 
 	memset(&local_opts, 0, sizeof(struct qm_mcc_initcgr));
-	spin_lock_irqsave(&p->cgr_lock, irqflags);
+	raw_spin_lock_irqsave(&p->cgr_lock, irqflags);
 	list_del(&cgr->node);
 	/*
 	 * If there are no other CGR objects for this CGRID in the list,
@@ -2537,7 +2540,7 @@ int qman_delete_cgr(struct qman_cgr *cgr)
 		/* add back to the list */
 		list_add(&cgr->node, &p->cgr_cbs);
 release_lock:
-	spin_unlock_irqrestore(&p->cgr_lock, irqflags);
+	raw_spin_unlock_irqrestore(&p->cgr_lock, irqflags);
 	put_affine_portal();
 	return ret;
 }
@@ -2577,9 +2580,9 @@ static int qman_update_cgr(struct qman_cgr *cgr, struct qm_mcc_initcgr *opts)
 	if (!p)
 		return -EINVAL;
 
-	spin_lock_irqsave(&p->cgr_lock, irqflags);
+	raw_spin_lock_irqsave(&p->cgr_lock, irqflags);
 	ret = qm_modify_cgr(cgr, 0, opts);
-	spin_unlock_irqrestore(&p->cgr_lock, irqflags);
+	raw_spin_unlock_irqrestore(&p->cgr_lock, irqflags);
 	put_affine_portal();
 	return ret;
 }
On 22/02/2024 at 18:07, Sean Anderson wrote:
> cgr_lock may be locked with interrupts already disabled by smp_call_function_single. As such, we must use a raw spinlock to avoid problems on PREEMPT_RT kernels. Although this bug has existed for a while, it was not apparent until commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change") which invokes smp_call_function_single via qman_update_cgr_safe every time a link goes up or down.
Why a raw spinlock to avoid problems on PREEMPT_RT, can you elaborate?

If the problem is that interrupts are already disabled, shouldn't you just change the spin_lock_irq() to spin_lock_irqsave()?
Christophe
On 2/23/24 00:38, Christophe Leroy wrote:
> On 22/02/2024 at 18:07, Sean Anderson wrote:
>> cgr_lock may be locked with interrupts already disabled by smp_call_function_single. As such, we must use a raw spinlock to avoid problems on PREEMPT_RT kernels. Although this bug has existed for a while, it was not apparent until commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change") which invokes smp_call_function_single via qman_update_cgr_safe every time a link goes up or down.

> Why a raw spinlock to avoid problems on PREEMPT_RT, can you elaborate?
smp_call_function always runs its callback in hard IRQ context, even on PREEMPT_RT, where spinlocks can sleep. So we need to use raw spinlocks to ensure we aren't waiting on a sleeping task. See the first bug report for more discussion.
In the longer term it would be better to switch to some other abstraction.
--Sean
> If the problem is that interrupts are already disabled, shouldn't you just change the spin_lock_irq() to spin_lock_irqsave()?

> Christophe
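To make the PREEMPT_RT point above concrete, here is a minimal, hypothetical sketch (the names example_cgr_lock, example_smp_callback and example_update are invented for illustration and are not the driver's code). On PREEMPT_RT a spinlock_t is a sleeping lock and must not be taken from the hard-IRQ context that smp_call_function_single() callbacks run in, whereas a raw_spinlock_t still spins with IRQs disabled and is safe there:

#include <linux/smp.h>
#include <linux/spinlock.h>

static DEFINE_RAW_SPINLOCK(example_cgr_lock);	/* stand-in for cgr_lock */

/* Executed in hard IRQ context on the target CPU, even on PREEMPT_RT. */
static void example_smp_callback(void *unused)
{
	unsigned long flags;

	/*
	 * A spinlock_t here would be a sleeping lock on PREEMPT_RT and
	 * would trigger a "sleeping function called from invalid context"
	 * splat; a raw spinlock keeps spinning and is legal in this context.
	 */
	raw_spin_lock_irqsave(&example_cgr_lock, flags);
	/* ... touch state shared with process context ... */
	raw_spin_unlock_irqrestore(&example_cgr_lock, flags);
}

static void example_update(void)
{
	/* Runs the callback on CPU 0 and waits for it to finish. */
	smp_call_function_single(0, example_smp_callback, NULL, 1);
}

This is also why the FIXME added in qm_congestion_task() notes that holding the lock across the 10 ms QM_MCR_TIMEOUT is undesirable: a raw spinlock critical section is not preemptible even on PREEMPT_RT.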
Hi,
On 2/23/24 11:02, Sean Anderson wrote:
> On 2/23/24 00:38, Christophe Leroy wrote:
>> On 22/02/2024 at 18:07, Sean Anderson wrote:
>>> cgr_lock may be locked with interrupts already disabled by smp_call_function_single. As such, we must use a raw spinlock to avoid problems on PREEMPT_RT kernels. Although this bug has existed for a while, it was not apparent until commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change") which invokes smp_call_function_single via qman_update_cgr_safe every time a link goes up or down.

>> Why a raw spinlock to avoid problems on PREEMPT_RT, can you elaborate?

> smp_call_function always runs its callback in hard IRQ context, even on PREEMPT_RT, where spinlocks can sleep. So we need to use raw spinlocks to ensure we aren't waiting on a sleeping task. See the first bug report for more discussion.

> In the longer term it would be better to switch to some other abstraction.
Does this make sense to you?
--Sean
On 05/03/2024 at 19:14, Sean Anderson wrote:
> Hi,

> On 2/23/24 11:02, Sean Anderson wrote:
>> On 2/23/24 00:38, Christophe Leroy wrote:
>>> On 22/02/2024 at 18:07, Sean Anderson wrote:
>>>> cgr_lock may be locked with interrupts already disabled by smp_call_function_single. As such, we must use a raw spinlock to avoid problems on PREEMPT_RT kernels. Although this bug has existed for a while, it was not apparent until commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change") which invokes smp_call_function_single via qman_update_cgr_safe every time a link goes up or down.

>>> Why a raw spinlock to avoid problems on PREEMPT_RT, can you elaborate?

>> smp_call_function always runs its callback in hard IRQ context, even on PREEMPT_RT, where spinlocks can sleep. So we need to use raw spinlocks to ensure we aren't waiting on a sleeping task. See the first bug report for more discussion.

>> In the longer term it would be better to switch to some other abstraction.

> Does this make sense to you?
Yes, that's fine, thanks for the clarification. Maybe you can explain that in the patch description in case you send a v5.
Christophe
On 3/5/24 17:18, Christophe Leroy wrote:
> On 05/03/2024 at 19:14, Sean Anderson wrote:
>> Hi,

>> On 2/23/24 11:02, Sean Anderson wrote:
>>> On 2/23/24 00:38, Christophe Leroy wrote:
>>>> On 22/02/2024 at 18:07, Sean Anderson wrote:
>>>>> cgr_lock may be locked with interrupts already disabled by smp_call_function_single. As such, we must use a raw spinlock to avoid problems on PREEMPT_RT kernels. Although this bug has existed for a while, it was not apparent until commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change") which invokes smp_call_function_single via qman_update_cgr_safe every time a link goes up or down.

>>>> Why a raw spinlock to avoid problems on PREEMPT_RT, can you elaborate?

>>> smp_call_function always runs its callback in hard IRQ context, even on PREEMPT_RT, where spinlocks can sleep. So we need to use raw spinlocks to ensure we aren't waiting on a sleeping task. See the first bug report for more discussion.

>>> In the longer term it would be better to switch to some other abstraction.

>> Does this make sense to you?

> Yes, that's fine, thanks for the clarification. Maybe you can explain that in the patch description in case you send a v5.
Hm, I thought I put this description in the commit message already. Maybe something like
| smp_call_function always runs its callback in hard IRQ context, even on
| PREEMPT_RT, where spinlocks can sleep. So we need to use a raw spinlock
| for cgr_lock to ensure we aren't waiting on a sleeping task.
|
| Although this bug has existed for a while, it was not apparent until
| commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change")
| which invokes smp_call_function_single via qman_update_cgr_safe every
| time a link goes up or down.
would be clearer.
--Sean