The Hyper-V driver has advertised support for multi-MSI, but any attempt
to use the feature would fall back to a single MSI (a non-starter for
devices that require multi-MSI). The fallback also covered up other
bugs in multi-MSI functionality, rooted in the driver not being
able to tell MSIs apart.
These patches fix those bugs by enabling hv multi-MSI through IOMMU
remapping, distinguishing multi-MSIs from the initial MSI of the MSI
block, preventing retargeting of MSI subsets from invalidating the IRTE
block, and helping the hypervisor preserve the block of requests.
Tested on 5.4.205.
Jeffrey Hugo (4):
PCI: hv: Fix multi-MSI to allow more than one MSI vector
PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI
PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()
PCI: hv: Fix interrupt mapping for multi-MSI
drivers/pci/controller/pci-hyperv.c | 99 +++++++++++++++++++++--------
1 file changed, 73 insertions(+), 26 deletions(-)
--
2.25.1
From: Sean Wang <sean.wang(a)mediatek.com>
This reverts commit 663457f421d41e9d2fcb1e84baf43d1433f80c08, which is
commit 44c4237cf3436bda2b185ff728123651ad133f69 upstream.
There was a mistake in
'649178c0493e ("mt76: mt7921e: fix possible probe failure after reboot")'
that keeps WiFi reset from working properly, as described in the reported issue
"PROBLEM: [Stable v5.15.42+] [mt7921] Wake after suspend locks up system
when mt7921-driver is used on a Lenovo ThinkPad E15 G3" at
http://lists.infradead.org/pipermail/linux-mediatek/2022-June/042668.html
This patch has to be reverted first so that the revert of
'649178c0493e ("mt76: mt7921e: fix possible probe failure after reboot")'
does not conflict. It will be applied back later, after the fix.
Signed-off-by: Sean Wang <sean.wang(a)mediatek.com>
---
v2: update changelog text
---
drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
index 3d35838ef306..7d9b23a00238 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
@@ -254,10 +254,8 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
dev->bus_ops = dev->mt76.bus;
bus_ops = devm_kmemdup(dev->mt76.dev, dev->bus_ops, sizeof(*bus_ops),
GFP_KERNEL);
- if (!bus_ops) {
- ret = -ENOMEM;
- goto err_free_dev;
- }
+ if (!bus_ops)
+ return -ENOMEM;
bus_ops->rr = mt7921_rr;
bus_ops->wr = mt7921_wr;
@@ -266,7 +264,7 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
ret = __mt7921_mcu_drv_pmctrl(dev);
if (ret)
- goto err_free_dev;
+ return ret;
mdev->rev = (mt7921_l1_rr(dev, MT_HW_CHIPID) << 16) |
(mt7921_l1_rr(dev, MT_HW_REV) & 0xff);
--
2.25.1
Syzkaller reports a use-after-free of net_device objects in 5.10 stable releases.
The problem has been fixed by the following patch series, which
applies cleanly to the 5.10 branch.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Using bin_attributes with a 0 size causes fstat and friends to return that
0 size. This breaks userspace code that retrieves the size before reading
the file. Rather than reverting 75bd50fa841 ("drivers/base/node.c: use
bin_attribute to break the size limitation of cpumap ABI") let's put in a
size value at compile time.
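The userspace pattern that a reported size of 0 breaks can be sketched roughly as follows. This is an illustration only, not code from any particular tool: the helper name read_sized_by_fstat() is hypothetical, and it assumes a reader that trusts st_size to size its buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: size the read buffer from fstat() and read once.
 * Returns the number of bytes read and stores a malloc'ed buffer in *out,
 * or returns -1 on error. When st_size is 0 nothing is read and *out stays
 * NULL, which is how a 0-sized sysfs file appears empty to such a reader.
 */
static ssize_t read_sized_by_fstat(const char *path, char **out)
{
	struct stat st;
	ssize_t n;
	char *buf;
	int fd;

	*out = NULL;
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	if (st.st_size == 0) {
		/* Reported size is 0: the reader gives up without reading. */
		close(fd);
		return 0;
	}
	buf = malloc((size_t)st.st_size);
	if (!buf) {
		close(fd);
		return -1;
	}
	n = read(fd, buf, (size_t)st.st_size);
	close(fd);
	if (n < 0) {
		free(buf);
		return -1;
	}
	*out = buf;
	return n;
}
```

With a non-zero size reported, the same helper allocates a buffer and returns the file contents, so a compile-time upper bound is enough to keep this kind of reader working.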
For cpulist the maximum size is on the order of
NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2
which for 8192 is 20480, i.e. (8192 * 5)/2. To get near that you'd need
a system with every other CPU on one node. For example: (0,2,4,6, ... ).
To simplify the math and support larger NR_CPUS in the future we are using
(NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older
behavior for smaller NR_CPUS.
For the cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1
(or NR_CPUS * 9/32 - 1) including the ","s.
Add a set of macros for these values to cpumask.h so they can be used in
multiple places. Apply these to the handful of such files in
drivers/base/topology.c as well as node.c.
As an example, on an 80 cpu 4-node system (NR_CPUS == 8192):
before:
-r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist
-r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap
after:
-r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist
-r--r--r--. 1 root root 4096 Jul 13 11:31 system/node/node0/cpumap
after, with CONFIG_NR_CPUS = 16384:
-r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist
-r--r--r--. 1 root root 4607 Jul 13 14:02 system/node/node0/cpumap
The actual number of cpus doesn't matter for the reported sizes since they
are based on NR_CPUS.
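As a cross-check of the sizes listed above, the two formulas can be evaluated in plain C. This is a sketch, not the kernel macros themselves: the function names are hypothetical, NR_CPUS and PAGE_SIZE are passed as parameters, and the arithmetic is deliberately signed so that a very small CPU count cannot wrap around.

```c
/*
 * Exact cpumap length per the formula above, NR_CPUS * 9/32 - 1,
 * with PAGE_SIZE as the minimum. Signed arithmetic on purpose.
 */
static long cpumap_file_max_bytes(long nr_cpus, long page_size)
{
	long exact = (nr_cpus * 9) / 32 - 1;

	return exact > page_size ? exact : page_size;
}

/*
 * Upper bound for cpulist per the formula above, NR_CPUS * 7/2,
 * with PAGE_SIZE as the minimum.
 */
static long cpulist_file_max_bytes(long nr_cpus, long page_size)
{
	long bound = (nr_cpus * 7) / 2;

	return bound > page_size ? bound : page_size;
}
```

Assuming a 4096-byte page, calling these with 8192 gives 4096 and 28672, and with 16384 gives 4607 and 57344, matching the listings above.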
Fixes: 75bd50fa841 ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI")
Fixes: bb9ec13d156 ("topology: use bin_attribute to break the size limitation of cpumap ABI")
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Yury Norov <yury.norov(a)gmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Phil Auld <pauld(a)redhat.com>
---
v2: Fix cpumap size calculation. Increase multiplier for cpulist size.
v3: Add comments in code.
v4: Define constants in cpumask.h. Move comments there. Also fix
topology.c.
v5: Fixed math based on Yury's corrections.
drivers/base/node.c | 4 ++--
drivers/base/topology.c | 32 ++++++++++++++++----------------
include/linux/cpumask.h | 18 ++++++++++++++++++
3 files changed, 36 insertions(+), 18 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 0ac6376ef7a1..eb0f43784c2b 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -45,7 +45,7 @@ static inline ssize_t cpumap_read(struct file *file, struct kobject *kobj,
return n;
}
-static BIN_ATTR_RO(cpumap, 0);
+static BIN_ATTR_RO(cpumap, CPUMAP_FILE_MAX_BYTES);
static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
struct bin_attribute *attr, char *buf,
@@ -66,7 +66,7 @@ static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
return n;
}
-static BIN_ATTR_RO(cpulist, 0);
+static BIN_ATTR_RO(cpulist, CPULIST_FILE_MAX_BYTES);
/**
* struct node_access_nodes - Access class device to hold user visible
diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index ac6ad9ab67f9..89f98be5c5b9 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -62,47 +62,47 @@ define_id_show_func(ppin, "0x%llx");
static DEVICE_ATTR_ADMIN_RO(ppin);
define_siblings_read_func(thread_siblings, sibling_cpumask);
-static BIN_ATTR_RO(thread_siblings, 0);
-static BIN_ATTR_RO(thread_siblings_list, 0);
+static BIN_ATTR_RO(thread_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(thread_siblings_list, CPULIST_FILE_MAX_BYTES);
define_siblings_read_func(core_cpus, sibling_cpumask);
-static BIN_ATTR_RO(core_cpus, 0);
-static BIN_ATTR_RO(core_cpus_list, 0);
+static BIN_ATTR_RO(core_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(core_cpus_list, CPULIST_FILE_MAX_BYTES);
define_siblings_read_func(core_siblings, core_cpumask);
-static BIN_ATTR_RO(core_siblings, 0);
-static BIN_ATTR_RO(core_siblings_list, 0);
+static BIN_ATTR_RO(core_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(core_siblings_list, CPULIST_FILE_MAX_BYTES);
#ifdef TOPOLOGY_CLUSTER_SYSFS
define_siblings_read_func(cluster_cpus, cluster_cpumask);
-static BIN_ATTR_RO(cluster_cpus, 0);
-static BIN_ATTR_RO(cluster_cpus_list, 0);
+static BIN_ATTR_RO(cluster_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(cluster_cpus_list, CPULIST_FILE_MAX_BYTES);
#endif
#ifdef TOPOLOGY_DIE_SYSFS
define_siblings_read_func(die_cpus, die_cpumask);
-static BIN_ATTR_RO(die_cpus, 0);
-static BIN_ATTR_RO(die_cpus_list, 0);
+static BIN_ATTR_RO(die_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(die_cpus_list, CPULIST_FILE_MAX_BYTES);
#endif
define_siblings_read_func(package_cpus, core_cpumask);
-static BIN_ATTR_RO(package_cpus, 0);
-static BIN_ATTR_RO(package_cpus_list, 0);
+static BIN_ATTR_RO(package_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(package_cpus_list, CPULIST_FILE_MAX_BYTES);
#ifdef TOPOLOGY_BOOK_SYSFS
define_id_show_func(book_id, "%d");
static DEVICE_ATTR_RO(book_id);
define_siblings_read_func(book_siblings, book_cpumask);
-static BIN_ATTR_RO(book_siblings, 0);
-static BIN_ATTR_RO(book_siblings_list, 0);
+static BIN_ATTR_RO(book_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(book_siblings_list, CPULIST_FILE_MAX_BYTES);
#endif
#ifdef TOPOLOGY_DRAWER_SYSFS
define_id_show_func(drawer_id, "%d");
static DEVICE_ATTR_RO(drawer_id);
define_siblings_read_func(drawer_siblings, drawer_cpumask);
-static BIN_ATTR_RO(drawer_siblings, 0);
-static BIN_ATTR_RO(drawer_siblings_list, 0);
+static BIN_ATTR_RO(drawer_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(drawer_siblings_list, CPULIST_FILE_MAX_BYTES);
#endif
static struct bin_attribute *bin_attrs[] = {
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index fe29ac7cc469..4592d0845941 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -1071,4 +1071,22 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
[0] = 1UL \
} }
+/*
+ * Provide a valid theoretical max size for cpumap and cpulist sysfs files
+ * to avoid breaking userspace which may allocate a buffer based on the size
+ * reported by e.g. fstat.
+ *
+ * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
+ *
+ * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
+ * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
+ * cover a worst-case of every other cpu being on one of two nodes for a
+ * very large NR_CPUS.
+ *
+ * Use PAGE_SIZE as a minimum for smaller configurations.
+ */
+#define CPUMAP_FILE_MAX_BYTES ((((NR_CPUS * 9)/32 - 1) > PAGE_SIZE) \
+ ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
+#define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
+
#endif /* __LINUX_CPUMASK_H */
--
2.31.1
The Hyper-V driver has advertised support for multi-MSI, but any attempt
to use the feature would fall back to a single MSI (a non-starter for
devices that require multi-MSI). The fallback also covered up other
bugs in multi-MSI functionality, rooted in the driver not being
able to tell MSIs apart.
These patches fix those bugs by enabling hv multi-MSI through IOMMU
remapping, distinguishing multi-MSIs from the initial MSI of the MSI
block, preventing retargeting of MSI subsets from invalidating the IRTE
block, and helping the hypervisor preserve the block of requests.
Tested on 5.18.10-stable. The goal is to backport to all long-term stable
branches (after any necessary merge-conflict resolution and testing,
to be documented in future patch sets).
Jeffrey Hugo (4):
PCI: hv: Fix multi-MSI to allow more than one MSI vector
PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI
PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()
PCI: hv: Fix interrupt mapping for multi-MSI
drivers/pci/controller/pci-hyperv.c | 99 +++++++++++++++++++++--------
1 file changed, 73 insertions(+), 26 deletions(-)
--
2.25.1
Hello,
Oleksandr, thank you for Cc-ing Andrii. Andrii, thank you for the comment!
On Fri, 15 Jul 2022 15:00:10 +0300 Andrii Chepurnyi <andrii.chepurnyi82(a)gmail.com> wrote:
> Hello All,
>
> I faced the mentioned issue recently. To give more context, here is
> our setup:
> We use the pvblock backend for an Android guest. The guest starts in u-boot
> with pvblock support (whose frontend doesn't support the persistent grants
> feature), which later loads and starts the Linux kernel (whose frontend
> supports the persistent grants feature). So in total, we have two
> different frontends reconnecting in sequence, the first of which doesn't
> support persistent grants.
> So the original patch [1] perfectly solves the original issue and provides
> the ability to use persistent grants after the reconnection, when the Linux
> frontend, which supports persistent grants, comes into play.
> At the same time, [2] will disable the persistent grants feature for both
> the first and the second frontend.
Thank you for this great explanation of your situation.
> Is it possible to keep [1] as is?
Yes. My concerns about Max's original patch[1] are that its behavior conflicts
with the description in the document[1] and differs from that of the
blkfront-side 'feature_persistent' parameter. I will post Max's patch again,
together with patches for the blkfront behavior change and document updates.
[1] https://lore.kernel.org/xen-devel/20220121102309.27802-1-sj@kernel.org/
Thanks,
SJ
>
> [1]
> https://lore.kernel.org/xen-devel/20220106091013.126076-1-mheyne@amazon.de/
> [2] https://lore.kernel.org/xen-devel/20220714224410.51147-1-sj@kernel.org/
>
> Best regards,
> Andrii
>
> On Fri, Jul 15, 2022 at 1:15 PM Oleksandr <olekstysh(a)gmail.com> wrote:
>
> >
> > On 15.07.22 01:44, SeongJae Park wrote:
> >
> >
> > Hello all.
> >
> > Adding Andrii Chepurnyi to CC who have played with the use-case which
> > required reconnect recently and faced some issues with
> > feature_persistent handling.
[...]
The bitops compile-time optimization series revealed one more
problem in olpc-xo1-sci.c:send_ebook_state(), which resulted in GCC
warnings:
arch/x86/platform/olpc/olpc-xo1-sci.c: In function 'send_ebook_state':
arch/x86/platform/olpc/olpc-xo1-sci.c:83:63: warning: logical not is only applied to the left hand side of comparison [-Wlogical-not-parentheses]
83 | if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
| ^~
arch/x86/platform/olpc/olpc-xo1-sci.c:83:13: note: add parentheses around left hand side expression to silence this warning
Although this code works as intended, the redundant double
negation of a boolean value, together with the comparison to a `char`
with no explicit conversion to bool, makes compilers think
the author made an unintentional logical mistake here.
Make it the other way around and negate the char instead
to silence the warnings.
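The idea of normalising the char side can be sketched in standalone C. This is an illustration, not the driver code: bit_is_set() is a hypothetical stand-in for a bit test that already returns a strict bool, and state_unchanged() is a made-up name for the comparison.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for a bit test that returns a strict bool.
 * The caller must keep nr below the bit width of unsigned long.
 */
static bool bit_is_set(unsigned long word, unsigned int nr)
{
	return (word >> nr) & 1UL;
}

/*
 * Compare a bool against a char-typed state. The !! collapses any
 * non-zero state to 1, so both sides of == are 0 or 1.
 */
static bool state_unchanged(unsigned long sw, unsigned int nr,
			    unsigned char state)
{
	return bit_is_set(sw, nr) == !!state;
}
```

In this sketch a non-zero state such as 0x40 compares equal to a set bit, because the negation is applied to the side that is not already a bool.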
Fixes: d2aa37411b8e ("x86/olpc/xo1/sci: Produce wakeup events for buttons and switches")
Cc: stable(a)vger.kernel.org # 3.5+
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Reported-by: kernel test robot <lkp(a)intel.com>
Reviewed-and-tested-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Alexander Lobakin <alexandr.lobakin(a)intel.com>
---
arch/x86/platform/olpc/olpc-xo1-sci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/platform/olpc/olpc-xo1-sci.c b/arch/x86/platform/olpc/olpc-xo1-sci.c
index f03a6883dcc6..89f25af4b3c3 100644
--- a/arch/x86/platform/olpc/olpc-xo1-sci.c
+++ b/arch/x86/platform/olpc/olpc-xo1-sci.c
@@ -80,7 +80,7 @@ static void send_ebook_state(void)
return;
}
- if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
+ if (test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == !!state)
return; /* Nothing new to report. */
input_report_switch(ebook_switch_idev, SW_TABLET_MODE, state);
--
2.36.1