When the best selected CPU is offline, work_on_cpu() will get stuck forever. This can happen if a node is online while all its CPUs are offline (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore, in this case, we should call local_pci_probe() instead of work_on_cpu().
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
---
v1 -> v2
	Added the method to reproduce this issue
---
 drivers/pci/pci-driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index af2996d0d17f..32a99828e6a3 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 		free_cpumask_var(wq_domain_mask);
 	}
 
-	if (cpu < nr_cpu_ids)
+	if ((cpu < nr_cpu_ids) && cpu_online(cpu))
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
 	else
 		error = local_pci_probe(&ddi);
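The effect of the added cpu_online() check can be seen outside the kernel with a small userspace sketch. This is only an illustration under stated assumptions: every name ending in _sim, the probe_path enum, and the choose_path_*() helpers are hypothetical stand-ins invented for this example, not kernel APIs, and the online mask is hard-coded to mimic a "maxcpus=1" boot where only CPU 0 is online.

```c
/*
 * Userspace sketch of the dispatch decision changed by the patch.
 * Names ending in _sim and the choose_path_*() helpers are hypothetical;
 * they only model the condition, not work_on_cpu() itself.
 */
#include <assert.h>
#include <stdbool.h>

#define NR_CPU_IDS_SIM 4u	/* assumed number of possible CPU ids */

/* Bit n set means CPU n is online. Only CPU 0 is online here. */
static unsigned int online_mask_sim = 0x1u;

static bool cpu_online_sim(unsigned int cpu)
{
	return cpu < NR_CPU_IDS_SIM && ((online_mask_sim >> cpu) & 1u);
}

enum probe_path {
	PROBE_LOCAL,	/* models calling local_pci_probe() directly */
	PROBE_ON_CPU	/* models handing the probe to work_on_cpu() */
};

/* Old condition: only checks that some CPU id was selected. */
static enum probe_path choose_path_old(unsigned int cpu)
{
	return (cpu < NR_CPU_IDS_SIM) ? PROBE_ON_CPU : PROBE_LOCAL;
}

/* Patched condition: the selected CPU must also be online. */
static enum probe_path choose_path_new(unsigned int cpu)
{
	return (cpu < NR_CPU_IDS_SIM && cpu_online_sim(cpu)) ?
		PROBE_ON_CPU : PROBE_LOCAL;
}

/* Self-check: an offline CPU is dispatched by the old test only. */
void check_offline_cpu_falls_back(void)
{
	assert(choose_path_old(2) == PROBE_ON_CPU);
	assert(choose_path_new(2) == PROBE_LOCAL);
}
```

With CPU 2 offline, the old condition still selects the work_on_cpu() path, which is the case reported to hang, while the patched condition falls back to the local probe.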
Hi, Hongchen,
It seems you forgot to update the title which I have pointed out. :)
And Bjorn,
Could you please take some time to review this patch? Thank you.
Huacai
On Wed, Jun 5, 2024 at 3:54 PM Hongchen Zhang zhanghongchen@loongson.cn wrote:
> When the best selected CPU is offline, work_on_cpu() will get stuck forever. This can happen if a node is online while all its CPUs are offline (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore, in this case, we should call local_pci_probe() instead of work_on_cpu().
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
> ---
> v1 -> v2
> 	Added the method to reproduce this issue
> ---
>  drivers/pci/pci-driver.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index af2996d0d17f..32a99828e6a3 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>  		free_cpumask_var(wq_domain_mask);
>  	}
>
> -	if (cpu < nr_cpu_ids)
> +	if ((cpu < nr_cpu_ids) && cpu_online(cpu))
>  		error = work_on_cpu(cpu, local_pci_probe, &ddi);
>  	else
>  		error = local_pci_probe(&ddi);
> --
> 2.33.0
…
> This can happen if a node is online while all its CPUs are offline (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore, in this case, we should call local_pci_probe() instead of work_on_cpu().
* Please pay closer attention to text layout, in particular the use of paragraphs. https://elixir.bootlin.com/linux/v6.10-rc3/source/Documentation/process/main...
* Please reword the change description in the imperative mood. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu...
* Would you like to add a “Fixes” tag accordingly?
* What do you think about specifying the name of the affected function in the summary phrase?
Regards, Markus
Hi Markus, Thanks for your review.
On 2024/6/13 2:08 AM, Markus Elfring wrote:
…
> This can happen if a node is online while all its CPUs are offline (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore, in this case, we should call local_pci_probe() instead of work_on_cpu().
>
> - Please pay closer attention to text layout, in particular the use of paragraphs. https://elixir.bootlin.com/linux/v6.10-rc3/source/Documentation/process/main...

Let me rewrite the commit message.

> - Please reword the change description in the imperative mood. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu...

OK, let me use imperative wording.

> - Would you like to add a “Fixes” tag accordingly?

OK, let me add a Fixes tag.

> - What do you think about specifying the name of the affected function in the summary phrase?

OK, let me add the affected function to the summary phrase.

> Regards,
> Markus
linux-stable-mirror@lists.linaro.org