On 5/15/24 09:40, Florian Fainelli wrote:
On 5/15/24 09:20, Linus Torvalds wrote:
On Wed, 15 May 2024 at 09:17, Mark Brown <broonie@kernel.org> wrote:
A bisect claims that "net: bcmgenet: synchronize EXT_RGMII_OOB_CTRL access" is the first commit that breaks; I'm not seeing issues with other stables.
That's d85cf67a3396 ("net: bcmgenet: synchronize EXT_RGMII_OOB_CTRL access") upstream. Is upstream ok?
Yes, upstream is OK.
Exact same issue on 6.1, not just on Raspberry Pi chips but also on ARCH_BRCMSTB systems, which use the GENET driver as well.
Doug and I will take a look and provide an updated set of fixes. Meanwhile, I would recommend dropping all of Doug's patches until we can post a revised series and/or the missing dependencies.
OK, I think I see the problem. The upstream patch was applied to the wrong context. In upstream, the critical section is added to bcmgenet_mii_config(), which previously did not run with phy_device::lock held, and that is the race we are trying to close. In linux-stable-rc/linux-6.1.y, however, the context somehow resolved to applying the critical section to bcmgenet_mac_config(), where we are already running with phy_device::lock held by virtue of the PHY library running our bcmgenet_mac_config() callback.
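To illustrate why taking phy_device::lock inside a callback that already runs under that lock is a problem, here is a minimal userspace sketch. It is only an analogy, not driver code: kernel mutexes are not recursive, so a second mutex_lock() from the same context would not return. The sketch uses a POSIX error-checking mutex so the relock reports EDEADLK instead of hanging. All names (fake_phydev, mac_config_backport, mac_config_fixed, call_with_lock_held) are hypothetical stand-ins.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for struct phy_device; only the lock matters here. */
struct fake_phydev {
	pthread_mutex_t lock;
};

/* Use an error-checking mutex so a relock returns EDEADLK instead of hanging. */
static int fake_phydev_init(struct fake_phydev *phydev)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&phydev->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Models the misapplied backport: the callback takes the lock itself. */
static int mac_config_backport(struct fake_phydev *phydev)
{
	int ret = pthread_mutex_lock(&phydev->lock);

	if (ret)
		return ret; /* EDEADLK when the caller already holds the lock */
	/* The register read-modify-write would go here. */
	pthread_mutex_unlock(&phydev->lock);
	return 0;
}

/* Models the corrected callback: it relies on the lock held by its caller. */
static int mac_config_fixed(struct fake_phydev *phydev)
{
	(void)phydev; /* The register read-modify-write would go here. */
	return 0;
}

/* Models a library that invokes a driver callback with the lock held. */
static int call_with_lock_held(struct fake_phydev *phydev,
			       int (*cb)(struct fake_phydev *))
{
	int ret = pthread_mutex_lock(&phydev->lock);

	if (ret)
		return ret;
	ret = cb(phydev);
	pthread_mutex_unlock(&phydev->lock);
	return ret;
}
```

Calling call_with_lock_held() with mac_config_backport fails with EDEADLK, while mac_config_fixed succeeds. The same locking code is fine on a path that does not already hold the lock, which is why the critical section belongs in bcmgenet_mii_config() rather than bcmgenet_mac_config().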
The corrected incremental diff would look like this:
diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c
index 94e0e858266e..46252d96b90e 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
@@ -71,12 +71,10 @@ static void bcmgenet_mac_config(struct net_device *dev)
 	 * transmit -- 25MHz(100Mbps) or 125MHz(1Gbps).
 	 * Receive clock is provided by the PHY.
 	 */
-	mutex_lock(&phydev->lock);
 	reg = bcmgenet_ext_readl(priv, EXT_RGMII_OOB_CTRL);
 	reg &= ~OOB_DISABLE;
 	reg |= RGMII_LINK;
 	bcmgenet_ext_writel(priv, reg, EXT_RGMII_OOB_CTRL);
-	mutex_unlock(&phydev->lock);
 
 	spin_lock_bh(&priv->reg_lock);
 	reg = bcmgenet_umac_readl(priv, UMAC_CMD);
@@ -271,6 +269,7 @@ int bcmgenet_mii_config(struct net_device *dev, bool init)
 	 * block for the interface to work
 	 */
 	if (priv->ext_phy) {
+		mutex_lock(&phydev->lock);
 		reg = bcmgenet_ext_readl(priv, EXT_RGMII_OOB_CTRL);
 		reg &= ~ID_MODE_DIS;
 		reg |= id_mode_dis;
@@ -279,6 +278,7 @@ int bcmgenet_mii_config(struct net_device *dev, bool init)
 		else
 			reg |= RGMII_MODE_EN;
 		bcmgenet_ext_writel(priv, reg, EXT_RGMII_OOB_CTRL);
+		mutex_unlock(&phydev->lock);
 	}
 
 	if (init)
We could, and should, consider taking 696450c05181559a35d4d5bee55c465b1ac6fe2e ("net: bcmgenet: Clear RGMII_LINK upon link down") as a prerequisite as well, to provide the right context when applying d85cf67a339685beae1d0aee27b7f61da95455be ("net: bcmgenet: synchronize EXT_RGMII_OOB_CTRL access").
Mark, can you try that, too?