Since the addition of platform MSI support, there have been two helpers meant to allocate/free IRQs for a device:
    platform_msi_domain_alloc_irqs()
    platform_msi_domain_free_irqs()
In these helpers, IRQ descriptors are allocated in the "alloc" routine and freed in the "free" one.
Later, two other helpers were added to handle IRQ domains on top of MSI domains:
    platform_msi_domain_alloc()
    platform_msi_domain_free()
Seen from the outside, the logic is pretty close to that of the former helpers, and people used them the same way as before: a platform_msi_domain_alloc() call should be balanced with a platform_msi_domain_free() call. While this is probably what was intended, platform_msi_domain_free() does not remove/free the IRQ descriptor(s) created/inserted in platform_msi_domain_alloc().
One effect of this is that removing a module that requested an IRQ leaves one orphaned IRQ descriptor (with an allocated MSI entry) in the device's descriptor list. The next time the module is inserted, the allocation happens twice in the MSI domain: once for the leftover descriptor and once for the new one. As a side effect, the maximum number of allocated MSIs is quickly exceeded, which then prevents any module requesting an interrupt in the same domain from being inserted.
This situation was observed with loops of insertion/removal of the mvpp2.ko module (requesting 15 MSIs each time).
Fixes: 552c494a7666 ("platform-msi: Allow creation of a MSI-based stacked irq domain")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 drivers/base/platform-msi.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c
index 60d6cc618f1c..b9d9d1729215 100644
--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -354,6 +354,20 @@ platform_msi_create_device_domain(struct device *dev,
 	return NULL;
 }
 
+static void platform_msi_domain_free_descs(struct irq_domain *domain, int virq,
+					   int nvec)
+{
+	struct platform_msi_priv_data *data = domain->host_data;
+	struct msi_desc *desc, *tmp;
+
+	list_for_each_entry_safe(desc, tmp, dev_to_msi_list(data->dev), list) {
+		if (desc->irq >= virq && desc->irq < (virq + nvec)) {
+			list_del(&desc->list);
+			free_msi_entry(desc);
+		}
+	}
+}
+
 /**
  * platform_msi_domain_free - Free interrupts associated with a platform-msi
  *                            domain
@@ -375,6 +389,8 @@ void platform_msi_domain_free(struct irq_domain *domain, unsigned int virq,
 		irq_domain_free_irqs_common(domain, desc->irq, 1);
 	}
+
+	platform_msi_domain_free_descs(domain, virq, nvec);
 }
 
 /**
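The exhaustion scenario described in the commit message can be approximated with a small userspace model. This is only a sketch and not kernel code: `struct dev_model`, `model_alloc()`, `model_free()`, `cycles_before_failure()` and the `MAX_MSI` limit of 32 are all hypothetical names and values chosen for illustration. Only the figure of 15 MSIs per load comes from the report. The model assumes that each "alloc" adds new descriptors and then allocates one MSI per descriptor found on the list, stale ones included.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_MSI        32   /* hypothetical capacity of the parent MSI domain */
#define VECS_PER_LOAD  15   /* the report mentions 15 MSIs per mvpp2.ko load */

struct dev_model {
	int ndescs;   /* descriptors currently on the device's list */
	int in_use;   /* MSIs currently allocated in the parent domain */
};

/*
 * Model of "alloc": add nvec descriptors, then allocate one MSI per
 * descriptor found on the list, leftovers from earlier loads included.
 */
static bool model_alloc(struct dev_model *d, int nvec)
{
	assert(nvec > 0);
	d->ndescs += nvec;
	if (d->ndescs > MAX_MSI)
		return false;   /* domain exhausted */
	d->in_use = d->ndescs;
	return true;
}

/* Model of "free": release the MSIs; drop the descriptors only if fixed. */
static void model_free(struct dev_model *d, int nvec, bool fixed)
{
	assert(nvec > 0);
	d->in_use = 0;
	if (fixed)
		d->ndescs -= nvec;
}

/* Count how many load/unload cycles succeed, up to max_cycles. */
static int cycles_before_failure(bool fixed, int max_cycles)
{
	struct dev_model d = { 0, 0 };
	int i;

	for (i = 0; i < max_cycles; i++) {
		if (!model_alloc(&d, VECS_PER_LOAD))
			break;
		model_free(&d, VECS_PER_LOAD, fixed);
	}
	return i;
}
```

Under these assumptions, the unfixed model stops accepting loads after a couple of cycles because stale descriptors pile up, while the fixed model keeps a constant descriptor count across cycles.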
Hi Miquel,
On Fri, 07 Sep 2018 16:01:29 +0100, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
[...]
Good catch, but I wonder why you don't use the existing helper instead. Something like this (untested):
diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c
index 60d6cc618f1c..87808ac08bfb 100644
--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -375,6 +375,8 @@ void platform_msi_domain_free(struct irq_domain *domain, unsigned int virq,
 		irq_domain_free_irqs_common(domain, desc->irq, 1);
 	}
+
+	platform_msi_free_descs(data->dev, virq, nvec);
 }
 
 /**
Thanks,
M.
Hi Marc,
Marc Zyngier <marc.zyngier@arm.com> wrote on Thu, 20 Sep 2018 19:39:21 +0100:
[...]
I first tried exactly what you propose. However, platform_msi_free_descs() takes a "base" IRQ number that is not the Linux global virq number but the local MSI domain index.
In the example above, virq would be checked against desc->platform.msi_index (0, 1, 2, ..., 29), whereas in the function I wrote virq is checked against desc->irq (12, 13, ..., 19, 26, 27, ..., 48).
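The mismatch between the two numbering spaces can be sketched in plain C. This is a userspace illustration, not the kernel helpers: `struct desc_model`, `fill_descs()`, `first_irq_by_index()` and `first_irq_by_virq()` are hypothetical, and the layout (index i mapped to virq 12 + i, contiguous) is a simplifying assumption rather than the actual numbering quoted above.

```c
#include <assert.h>
#include <stddef.h>

struct desc_model {             /* stand-in for struct msi_desc */
	unsigned int irq;       /* Linux virq */
	unsigned int msi_index; /* index local to the MSI domain */
};

/* Hypothetical layout: index i gets virq first_virq + i. */
static void fill_descs(struct desc_model *d, size_t n, unsigned int first_virq)
{
	size_t i;

	assert(d != NULL);
	for (i = 0; i < n; i++) {
		d[i].irq = first_virq + (unsigned int)i;
		d[i].msi_index = (unsigned int)i;
	}
}

/*
 * Return the virq of the first descriptor whose msi_index falls in
 * [base, base + nvec), or 0 if none matches.
 */
static unsigned int first_irq_by_index(const struct desc_model *d, size_t n,
				       unsigned int base, unsigned int nvec)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (d[i].msi_index >= base && d[i].msi_index < base + nvec)
			return d[i].irq;
	return 0;
}

/* Same range test, but applied to the virq itself. */
static unsigned int first_irq_by_virq(const struct desc_model *d, size_t n,
				      unsigned int virq, unsigned int nvec)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (d[i].irq >= virq && d[i].irq < virq + nvec)
			return d[i].irq;
	return 0;
}
```

With 30 descriptors laid out this way, passing virq 12 to the index-based matcher selects the descriptor at index 12 (virq 24 in this model) instead of the one that actually owns virq 12, which is the kind of wrong match the reply describes.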
If you prefer not to add a "platform_msi_domain_free_descs()" helper, I see another solution: use desc->platform.msi_index instead of virq and call platform_msi_free_descs() inside the for_each_msi_entry() loop (switching to a _safe iterator, since the loop would otherwise crash once the descriptor it is visiting has been destroyed).
See the code below.
I personally prefer the function in my first proposal, which avoids calling platform_msi_free_descs() once for each descriptor to free.
diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c
index b9d9d1729215..fb9aa6fcdad9 100644
--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -380,17 +380,16 @@ void platform_msi_domain_free(struct irq_domain *domain, unsigned int virq,
 			      unsigned int nvec)
 {
 	struct platform_msi_priv_data *data = domain->host_data;
-	struct msi_desc *desc;
-	for_each_msi_entry(desc, data->dev) {
+	struct msi_desc *desc, *tmp;
+	for_each_msi_entry_safe(desc, tmp, data->dev) {
 		if (WARN_ON(!desc->irq || desc->nvec_used != 1))
 			return;
 		if (!(desc->irq >= virq && desc->irq < (virq + nvec)))
 			continue;
 
 		irq_domain_free_irqs_common(domain, desc->irq, 1);
+		platform_msi_free_descs(data->dev, desc->platform.msi_index, 1);
 	}
 }
 
 /**
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 5839d8062dfc..be8ec813dbfb 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -116,6 +116,8 @@ struct msi_desc {
 	list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
 #define for_each_msi_entry(desc, dev)	\
 	list_for_each_entry((desc), dev_to_msi_list((dev)), list)
+#define for_each_msi_entry_safe(desc, tmp, dev)	\
+	list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
 
 #ifdef CONFIG_PCI_MSI
 #define first_pci_msi_entry(pdev)	first_msi_entry(&(pdev)->dev)
Thanks, Miquèl
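The reason a "_safe" iterator is needed when entries are destroyed during the walk can be shown with a minimal userspace list. This is a sketch of the general technique only: `struct node`, `list_init()`, `list_add_irq()`, `free_range_safe()` and `list_len()` are hypothetical stand-ins, not the kernel's list_head API, although the circular doubly-linked layout mimics it.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
	struct node *prev, *next;   /* circular list, similar to struct list_head */
	unsigned int irq;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

/* Append a node carrying irq at the tail; returns NULL on allocation failure. */
static struct node *list_add_irq(struct node *head, unsigned int irq)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->irq = irq;
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
	return n;
}

/*
 * Free every node whose irq lies in [virq, virq + nvec).
 * tmp caches the successor before pos may be freed, so the loop never
 * reads pos->next from released memory.
 */
static unsigned int free_range_safe(struct node *head, unsigned int virq,
				    unsigned int nvec)
{
	struct node *pos, *tmp;
	unsigned int freed = 0;

	assert(head != NULL);
	for (pos = head->next, tmp = pos->next; pos != head;
	     pos = tmp, tmp = pos->next) {
		if (pos->irq >= virq && pos->irq < virq + nvec) {
			pos->prev->next = pos->next;
			pos->next->prev = pos->prev;
			free(pos);
			freed++;
		}
	}
	return freed;
}

static unsigned int list_len(const struct node *head)
{
	const struct node *pos;
	unsigned int n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

A plain iterator would advance with `pos = pos->next` after `free(pos)`, reading freed memory. Saving the successor first is what the "_safe" variants of the list macros are for.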
Hi Miquel,
On 24/09/18 14:39, Miquel Raynal wrote:
[...]
> I first tried exactly what you propose. However, platform_msi_free_descs() takes a "base" IRQ number that is not the Linux global virq number but the local MSI domain index.
Indeed, you're absolutely right. Apologies.
[...]
> 		irq_domain_free_irqs_common(domain, desc->irq, 1);
> +		platform_msi_free_descs(data->dev, desc->platform.msi_index, 1);
At that stage, you're better off just calling
	list_del(&desc->list);
	free_msi_entry(desc);
I like this approach better as we only traverse the list once.
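The cost difference behind this preference can be sketched by counting how many descriptors each variant examines. This is an array-based userspace model under stated assumptions, not the kernel code: `struct slot`, `fill_slots()`, `free_by_index()`, `free_nested()` and `free_inline()` are hypothetical, and the model assumes the index-based helper rescans every remaining descriptor on each call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct slot {                   /* stand-in for one descriptor on the list */
	unsigned int irq;
	unsigned int msi_index;
	bool live;              /* false once the descriptor has been freed */
};

static size_t visits;           /* live descriptors examined so far */

/* Hypothetical layout: index i gets virq first_virq + i. */
static void fill_slots(struct slot *s, size_t n, unsigned int first_virq)
{
	assert(s != NULL);
	for (size_t i = 0; i < n; i++) {
		s[i].irq = first_virq + (unsigned int)i;
		s[i].msi_index = (unsigned int)i;
		s[i].live = true;
	}
}

/* Model of a helper that rescans all descriptors for an index range. */
static void free_by_index(struct slot *s, size_t n, unsigned int base,
			  unsigned int nvec)
{
	for (size_t i = 0; i < n; i++) {
		if (!s[i].live)
			continue;
		visits++;
		if (s[i].msi_index >= base && s[i].msi_index < base + nvec)
			s[i].live = false;
	}
}

/* Variant 1: outer walk by virq, one helper call (inner walk) per match. */
static size_t free_nested(struct slot *s, size_t n, unsigned int virq,
			  unsigned int nvec)
{
	visits = 0;
	for (size_t i = 0; i < n; i++) {
		if (!s[i].live)
			continue;
		visits++;
		if (s[i].irq >= virq && s[i].irq < virq + nvec)
			free_by_index(s, n, s[i].msi_index, 1);
	}
	return visits;
}

/* Variant 2: release the descriptor in place, a single walk in total. */
static size_t free_inline(struct slot *s, size_t n, unsigned int virq,
			  unsigned int nvec)
{
	visits = 0;
	for (size_t i = 0; i < n; i++) {
		if (!s[i].live)
			continue;
		visits++;
		if (s[i].irq >= virq && s[i].irq < virq + nvec)
			s[i].live = false;
	}
	return visits;
}
```

Both variants release the same descriptors in this model. The in-place variant examines each one exactly once, while the nested variant pays for an extra scan per freed descriptor.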
[...]
If you respin this, I'll queue it right away.
Thanks,
M.
Hi Marc,
[...]
> At that stage, you're better off just calling
>
> 	list_del(&desc->list);
> 	free_msi_entry(desc);
>
> I like this approach better as we only traverse the list once.
Right.
[...]
> If you respin this, I'll queue it right away.
Let me test the new version to be sure I'm not breaking anything and I'll send a v2.
Thanks, Miquèl
linux-stable-mirror@lists.linaro.org