From: Bjorn Helgaas bhelgaas@google.com
commit 51c48b310183ab6ba5419edfc6a8de889cc04521 upstream.
pci_bridge_check_ranges() determines whether a bridge supports the optional I/O and prefetchable memory windows and sets the flag bits in the bridge resources. This *could* be done once during enumeration except that the resource allocation code completely clears the flag bits, e.g., in the pci_assign_unassigned_bridge_resources() path.
The problem with pci_bridge_check_ranges() in the resource allocation path is that we may allocate resources after devices have been claimed by drivers, and pci_bridge_check_ranges() *changes* the window registers to determine whether they're writable. This may break concurrent accesses to devices behind the bridge.
Add a new pci_read_bridge_windows() to determine whether a bridge supports the optional windows, call it once during enumeration, remember the results, and change pci_bridge_check_ranges() so it doesn't touch the bridge windows but sets the flag bits based on those remembered results.
Link: https://lore.kernel.org/linux-pci/1506151482-113560-1-git-send-email-wangzho...
Link: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02082.html
Reported-by: Yandong Xu xuyandong2@huawei.com
Tested-by: Yandong Xu xuyandong2@huawei.com
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Cc: Michael S. Tsirkin mst@redhat.com
Cc: Sagi Grimberg sagi@grimberg.me
Cc: Ofer Hayut ofer@lightbitslabs.com
Cc: Roy Shterman roys@lightbitslabs.com
Cc: Keith Busch keith.busch@intel.com
Cc: Zhou Wang wangzhou1@hisilicon.com
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208371
Signed-off-by: Dima Stepanov dimastep@yandex-team.ru
---
 drivers/pci/probe.c     | 52 +++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/setup-bus.c | 45 ++++--------------------------------------
 include/linux/pci.h     |  3 +++
 3 files changed, 59 insertions(+), 41 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 257b9f6..2ef8b95 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -348,6 +348,57 @@ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
 	}
 }
 
+static void pci_read_bridge_windows(struct pci_dev *bridge)
+{
+	u16 io;
+	u32 pmem, tmp;
+
+	pci_read_config_word(bridge, PCI_IO_BASE, &io);
+	if (!io) {
+		pci_write_config_word(bridge, PCI_IO_BASE, 0xe0f0);
+		pci_read_config_word(bridge, PCI_IO_BASE, &io);
+		pci_write_config_word(bridge, PCI_IO_BASE, 0x0);
+	}
+	if (io)
+		bridge->io_window = 1;
+
+	/*
+	 * DECchip 21050 pass 2 errata: the bridge may miss an address
+	 * disconnect boundary by one PCI data phase. Workaround: do not
+	 * use prefetching on this device.
+	 */
+	if (bridge->vendor == PCI_VENDOR_ID_DEC && bridge->device == 0x0001)
+		return;
+
+	pci_read_config_dword(bridge, PCI_PREF_MEMORY_BASE, &pmem);
+	if (!pmem) {
+		pci_write_config_dword(bridge, PCI_PREF_MEMORY_BASE,
+					       0xffe0fff0);
+		pci_read_config_dword(bridge, PCI_PREF_MEMORY_BASE, &pmem);
+		pci_write_config_dword(bridge, PCI_PREF_MEMORY_BASE, 0x0);
+	}
+	if (!pmem)
+		return;
+
+	bridge->pref_window = 1;
+
+	if ((pmem & PCI_PREF_RANGE_TYPE_MASK) == PCI_PREF_RANGE_TYPE_64) {
+
+		/*
+		 * Bridge claims to have a 64-bit prefetchable memory
+		 * window; verify that the upper bits are actually
+		 * writable.
+		 */
+		pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32, &pmem);
+		pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
+				       0xffffffff);
+		pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32, &tmp);
+		pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32, pmem);
+		if (tmp)
+			bridge->pref_64_window = 1;
+	}
+}
+
 static void pci_read_bridge_io(struct pci_bus *child)
 {
 	struct pci_dev *dev = child->self;
@@ -1739,6 +1790,7 @@ int pci_setup_device(struct pci_dev *dev)
 		pci_read_irq(dev);
 		dev->transparent = ((dev->class & 0xff) == 1);
 		pci_read_bases(dev, 2, PCI_ROM_ADDRESS1);
+		pci_read_bridge_windows(dev);
 		set_pcie_hotplug_bridge(dev);
 		pos = pci_find_capability(dev, PCI_CAP_ID_SSVID);
 		if (pos) {
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index ed96043..1941bb0 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -735,58 +735,21 @@ int pci_claim_bridge_resource(struct pci_dev *bridge, int i)
    base/limit registers must be read-only and read as 0. */
 static void pci_bridge_check_ranges(struct pci_bus *bus)
 {
-	u16 io;
-	u32 pmem;
 	struct pci_dev *bridge = bus->self;
-	struct resource *b_res;
+	struct resource *b_res = &bridge->resource[PCI_BRIDGE_RESOURCES];
 
-	b_res = &bridge->resource[PCI_BRIDGE_RESOURCES];
 	b_res[1].flags |= IORESOURCE_MEM;
 
-	pci_read_config_word(bridge, PCI_IO_BASE, &io);
-	if (!io) {
-		pci_write_config_word(bridge, PCI_IO_BASE, 0xe0f0);
-		pci_read_config_word(bridge, PCI_IO_BASE, &io);
-		pci_write_config_word(bridge, PCI_IO_BASE, 0x0);
-	}
-	if (io)
+	if (bridge->io_window)
 		b_res[0].flags |= IORESOURCE_IO;
 
-	/* DECchip 21050 pass 2 errata: the bridge may miss an address
-	   disconnect boundary by one PCI data phase.
-	   Workaround: do not use prefetching on this device. */
-	if (bridge->vendor == PCI_VENDOR_ID_DEC && bridge->device == 0x0001)
-		return;
-
-	pci_read_config_dword(bridge, PCI_PREF_MEMORY_BASE, &pmem);
-	if (!pmem) {
-		pci_write_config_dword(bridge, PCI_PREF_MEMORY_BASE,
-					       0xffe0fff0);
-		pci_read_config_dword(bridge, PCI_PREF_MEMORY_BASE, &pmem);
-		pci_write_config_dword(bridge, PCI_PREF_MEMORY_BASE, 0x0);
-	}
-	if (pmem) {
+	if (bridge->pref_window) {
 		b_res[2].flags |= IORESOURCE_MEM | IORESOURCE_PREFETCH;
-		if ((pmem & PCI_PREF_RANGE_TYPE_MASK) ==
-		    PCI_PREF_RANGE_TYPE_64) {
+		if (bridge->pref_64_window) {
 			b_res[2].flags |= IORESOURCE_MEM_64;
 			b_res[2].flags |= PCI_PREF_RANGE_TYPE_64;
 		}
 	}
-
-	/* double check if bridge does support 64 bit pref */
-	if (b_res[2].flags & IORESOURCE_MEM_64) {
-		u32 mem_base_hi, tmp;
-		pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32,
-					 &mem_base_hi);
-		pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
-					       0xffffffff);
-		pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32, &tmp);
-		if (!tmp)
-			b_res[2].flags &= ~IORESOURCE_MEM_64;
-		pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
-				       mem_base_hi);
-	}
 }
 
 /* Helper function for sizing routines: find first available
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 65f1d8c..40b327b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -373,6 +373,9 @@ struct pci_dev {
 	bool		match_driver;		/* Skip attaching driver */
 
 	unsigned int	transparent:1;		/* Subtractive decode bridge */
+	unsigned int	io_window:1;		/* Bridge has I/O window */
+	unsigned int	pref_window:1;		/* Bridge has pref mem window */
+	unsigned int	pref_64_window:1;	/* Pref mem window is 64-bit */
 	unsigned int	multifunction:1;	/* Multi-function device */
 
 	unsigned int	is_busmaster:1;		/* Is busmaster */
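
For readers who want to try the idea outside the kernel, here is a minimal user-space sketch of the probe-once-and-remember pattern that the commit message describes. Everything in it is hypothetical: struct fake_reg, struct fake_bridge, make_bridge(), probe_windows(), window_flags(), config_writes and the WIN_* values are illustrative names, and modelling a config register as a value plus a write mask is an assumption made for the example, not how the kernel represents config space. Only the probe values (0xe0f0, 0xffe0fff0, 0xffffffff) and the 64-bit type check are taken from the diff above. The DECchip 21050 quirk is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one config register: bits outside wmask are read-only. */
struct fake_reg {
	uint32_t value;
	uint32_t wmask;
};

/* Hypothetical bridge: three window registers plus the remembered results. */
struct fake_bridge {
	struct fake_reg io;		/* stands in for PCI_IO_BASE */
	struct fake_reg pref;		/* stands in for PCI_PREF_MEMORY_BASE */
	struct fake_reg pref_upper;	/* stands in for PCI_PREF_BASE_UPPER32 */
	unsigned int io_window:1;
	unsigned int pref_window:1;
	unsigned int pref_64_window:1;
};

enum { WIN_IO = 1, WIN_PREF = 2, WIN_PREF_64 = 4 };

/* Counts every simulated config write, so callers can see who touches "hardware". */
static unsigned int config_writes;

static uint32_t reg_read(const struct fake_reg *r)
{
	return r->value;
}

static void reg_write(struct fake_reg *r, uint32_t v)
{
	config_writes++;
	r->value = (r->value & ~r->wmask) | (v & r->wmask);
}

static struct fake_bridge make_bridge(uint32_t io_wmask, uint32_t pref_value,
				      uint32_t pref_wmask, uint32_t upper_wmask)
{
	struct fake_bridge b = {0};

	b.io.wmask = io_wmask;
	b.pref.value = pref_value;
	b.pref.wmask = pref_wmask;
	b.pref_upper.wmask = upper_wmask;
	return b;
}

/* Enumeration-time probe: the only place that writes the window registers. */
static void probe_windows(struct fake_bridge *b)
{
	uint32_t io, pmem, saved, tmp;

	io = reg_read(&b->io);
	if (!io) {
		reg_write(&b->io, 0xe0f0);
		io = reg_read(&b->io);
		reg_write(&b->io, 0x0);
	}
	if (io)
		b->io_window = 1;

	pmem = reg_read(&b->pref);
	if (!pmem) {
		reg_write(&b->pref, 0xffe0fff0);
		pmem = reg_read(&b->pref);
		reg_write(&b->pref, 0x0);
	}
	if (!pmem)
		return;
	b->pref_window = 1;

	/* Low nibble 0x1 models PCI_PREF_RANGE_TYPE_64. */
	if ((pmem & 0x0f) == 0x01) {
		saved = reg_read(&b->pref_upper);
		reg_write(&b->pref_upper, 0xffffffff);
		tmp = reg_read(&b->pref_upper);
		reg_write(&b->pref_upper, saved);
		if (tmp)
			b->pref_64_window = 1;
	}
}

/* Allocation-time check: uses only the remembered bits, never the registers. */
static unsigned int window_flags(const struct fake_bridge *b)
{
	unsigned int flags = 0;

	if (b->io_window)
		flags |= WIN_IO;
	if (b->pref_window) {
		flags |= WIN_PREF;
		if (b->pref_64_window)
			flags |= WIN_PREF_64;
	}
	return flags;
}
```

In this sketch, calling probe_windows() once and then window_flags() any number of times leaves config_writes unchanged after the probe. That is the property the patch is after: the resource allocation path can recompute the flag bits without rewriting window registers while drivers may be using devices behind the bridge.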
On Mon, Aug 10, 2020 at 05:19:42PM +0300, Dima Stepanov wrote:
> From: Bjorn Helgaas bhelgaas@google.com
>
> commit 51c48b310183ab6ba5419edfc6a8de889cc04521 upstream.
>
> pci_bridge_check_ranges() determines whether a bridge supports the optional I/O and prefetchable memory windows and sets the flag bits in the bridge resources. This *could* be done once during enumeration except that the resource allocation code completely clears the flag bits, e.g., in the pci_assign_unassigned_bridge_resources() path.
>
> The problem with pci_bridge_check_ranges() in the resource allocation path is that we may allocate resources after devices have been claimed by drivers, and pci_bridge_check_ranges() *changes* the window registers to determine whether they're writable. This may break concurrent accesses to devices behind the bridge.
>
> Add a new pci_read_bridge_windows() to determine whether a bridge supports the optional windows, call it once during enumeration, remember the results, and change pci_bridge_check_ranges() so it doesn't touch the bridge windows but sets the flag bits based on those remembered results.
>
> Link: https://lore.kernel.org/linux-pci/1506151482-113560-1-git-send-email-wangzho...
> Link: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02082.html
> Reported-by: Yandong Xu xuyandong2@huawei.com
> Tested-by: Yandong Xu xuyandong2@huawei.com
> Signed-off-by: Bjorn Helgaas bhelgaas@google.com
> Cc: Michael S. Tsirkin mst@redhat.com
> Cc: Sagi Grimberg sagi@grimberg.me
> Cc: Ofer Hayut ofer@lightbitslabs.com
> Cc: Roy Shterman roys@lightbitslabs.com
> Cc: Keith Busch keith.busch@intel.com
> Cc: Zhou Wang wangzhou1@hisilicon.com
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208371
> Signed-off-by: Dima Stepanov dimastep@yandex-team.ru
> ---
>  drivers/pci/probe.c     | 52 +++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/pci/setup-bus.c | 45 ++++--------------------------------------
>  include/linux/pci.h     |  3 +++
>  3 files changed, 59 insertions(+), 41 deletions(-)
Why is this now needed in 4.19.y? What changed to require it and what prevents the users from just using 5.4.y instead?
A bit of an explanation when backporting patches that are not obvious "fixes" to much older kernels is always appreciated :)
thanks,
greg k-h
On Mon, Aug 10, 2020 at 04:54:50PM +0200, Greg KH wrote:
> Why is this now needed in 4.19.y? What changed to require it and what prevents the users from just using 5.4.y instead?
>
> A bit of an explanation when backporting patches that are not obvious "fixes" to much older kernels is always appreciated :)
>
> thanks,
>
> greg k-h
Hi Greg,
Sorry, I was not sure how to do this properly. I'll try to describe the history of this issue:
- in 2017: https://lore.kernel.org/linux-pci/1506151482-113560-1-git-send-email-wangzho...
- in 2018: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02082.html
- in 2019 it was fixed by commit 51c48b310183ab6ba5419edfc6a8de889cc04521. There was also an idea to add this patch to stable if a bugzilla report was filed: https://lkml.org/lkml/2019/2/5/600. But as I understand it, there were some problems with reproducing the issue.
- in 2020 we hit it again and filed a bug with the steps to reproduce: https://bugzilla.kernel.org/show_bug.cgi?id=208371

Because of this, I think it really is an issue that gets triggered from time to time.

Some words about the motivation:
- What changed to require it? We filed a bugzilla bug and tried to show that it is a real issue (not just a possibility).
- In general, nothing prevents users from using 5.4.y. But in big, complicated environments (clouds) it is not obvious that exactly this issue leads to such behaviour. Also, users may rely on default distribution kernels.

Sorry again for the confusion. I am not very familiar with the process, but I hope this description helps. What do you think about it?
Thanks, Dima.
linux-stable-mirror@lists.linaro.org