On Sat, Feb 5, 2022 at 3:13 PM dann frazier <dann.frazier@canonical.com> wrote:
On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@kernel.org> wrote:
On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote:
On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup") broke PCI support on XGene. The cause is that the IB resources are now sorted in address order instead of being in DT dma-ranges order. As a result, the inbound registers used for each region are swapped. I don't know the details about this h/w, but it appears that IB region 0 registers can't handle a size greater than 4GB. In any case, limiting the size for region 0 is enough to get back to the original assignment of dma-ranges to regions.
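To make the ordering problem concrete, here is a minimal Python model of a first-fit, size-based region selection. It is only a sketch under stated assumptions: the function names, the size thresholds, and the two example windows are hypothetical and are not taken from the driver. It shows how sorting by address can swap the region assignment, and how capping region 0 at 4GB restores it.

```python
SZ_1K = 1 << 10
SZ_1M = 1 << 20
SZ_16M = 1 << 24
SZ_4G = 1 << 32
SZ_1T = 1 << 40


def select_ib_region(used, size, region0_limit):
    """Pick the first free inbound region whose size bounds fit `size`."""
    # (region index, exclusive lower bound, exclusive upper bound).
    # These bounds are assumptions for illustration only.
    candidates = [
        (1, 4, SZ_16M),
        (0, SZ_1K, region0_limit),
        (2, SZ_1M, SZ_1T),
    ]
    for region, lo, hi in candidates:
        if region not in used and lo < size < hi:
            used.add(region)
            return region
    raise ValueError("no inbound region fits this window")


def assign(windows, region0_limit):
    """Assign regions to windows in the order they are given."""
    used = set()
    return {name: select_ib_region(used, size, region0_limit)
            for name, _addr, size in windows}


# Hypothetical windows: (name, bus address, size in bytes).
dt_order = [("small", 0x80_0000_0000, 2 << 30), ("large", 0x0, 512 << 30)]
# Sorting by address puts the large window first.
sorted_order = sorted(dt_order, key=lambda w: w[1])

print(assign(dt_order, SZ_1T))      # DT order, no 4GB cap on region 0
print(assign(sorted_order, SZ_1T))  # address order, no cap: regions swap
print(assign(sorted_order, SZ_4G))  # address order, region 0 capped at 4GB
```

In this model the cap works only because the two windows differ in size, which is why keying off size alone is fragile.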
hey Rob!
I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) - only during network installs - that I also bisected down to commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was hoping that this patch that fixed the issue on Stéphane's X-Gene2 system would also fix my issue, but no luck. In fact, it seems to just make it fail differently. Reverting both patches is required to get a v5.17-rc kernel to boot.
I've collected the following logs - let me know if anything else would be useful.
v5.17-rc2+ (unmodified): http://dannf.org/bugs/m400-no-reverts.log Note that the mlx4 driver fails initialization.
v5.17-rc2+, w/o the commit that fixed Stéphane's system: http://dannf.org/bugs/m400-xgene2-fix-reverted.log Note the mlx4 MSI-X timeout, and later panic.
v5.17-rc2+, w/ both commits reverted (works): http://dannf.org/bugs/m400-both-reverted.log
The ranges and dma-ranges addresses don't appear to match up with any upstream dts files. Can you send me the DT?
The first fix certainly is a problem. It's going to need something besides size to key off of (originally it was dependent on order of dma-ranges entries).
The 2nd issue is that 'dma-ranges' has a second entry that is now ignored:
dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00>, <0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
Based on the flags (3rd addr cell: 0x0), we have an inbound config space which the kernel now ignores because inbound config space accesses make no sense. But clearly some setup is needed. Upstream, in contrast, sets up a memory range that includes this region, so the setup does happen:
<0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>
Minimally, I suspect it will work if you change dma-ranges 2nd entry to:
<0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>
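For reference, the flags discussed above live in the first cell of a PCI address (phys.hi in the PCI bus binding), where bits 25:24 are the space code and bit 30 is the prefetchable flag. The following is a small illustrative decoder, not kernel code; the function name `decode_phys_hi` is made up for this sketch.

```python
# Space codes from the "ss" field (bits 25:24) of the phys.hi cell.
SPACE_CODES = {
    0b00: "config",
    0b01: "I/O",
    0b10: "32-bit memory",
    0b11: "64-bit memory",
}


def decode_phys_hi(cell):
    """Decode the space code and prefetchable bit of a PCI phys.hi cell."""
    return {
        "space": SPACE_CODES[(cell >> 24) & 0x3],
        "prefetchable": bool(cell & (1 << 30)),
    }


# The two flag values that appear in this thread.
for flags in (0x42000000, 0x00):
    print(hex(flags), decode_phys_hi(flags))
```

A flags cell of 0x42000000 decodes as prefetchable 32-bit memory space, while 0x00 decodes as config space, which is why the second entry is skipped.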
While we shouldn't break existing DTs, the moonshot DT doesn't use what's documented upstream. There are multiple differences compared to what's documented. Is upstream supposed to support upstream DTs, downstream DTs, and ACPI for XGene, which is an abandoned platform with only a handful of users?
Rob