On Thu, Apr 17, 2025 at 04:08:08PM +0200, Andrew Lunn wrote:
On Thu, Apr 17, 2025 at 02:12:36PM +0100, Qasim Ijaz wrote:
On Tue, Apr 15, 2025 at 08:56:48PM -0700, Jakub Kicinski wrote:
On Tue, 15 Apr 2025 20:52:30 -0700 Jakub Kicinski wrote:
On Tue, 15 Apr 2025 03:35:07 +0200 Andrew Lunn wrote:
@@ -182,7 +182,7 @@ static int ch9200_mdio_read(struct net_device *netdev, int phy_id, int loc) __func__, phy_id, loc); if (phy_id != 0)
return -ENODEV;
return 0;
An actually MDIO bus would return 0xffff is asked to read from a PHY which is not on the bus. But i've no idea how the ancient mii code handles this.
If this code every gets updated to using phylib, many of the changes you are making will need reverting because phylib actually wants to see the errors. So i'm somewhat reluctant to make changes like this.
Right.
I mean most of the patches seem to be adding error checking, unlike this one, but since Qasim doesn't have access to this HW they are more likely to break stuff than fix. I'm going to apply the first patch, Qasim if you'd like to clean up the rest I think it should be done separately without the Fixes tags, if at all.
Ah, no, patch 1 also does return 0. Hm. Maybe let's propagate the real error to silence the syzbot error and if someone with access to the HW
Hi Andrew and Jakub
Since there is uncertainty on whether these patches would break things how about I refactor the patches to instead return what the function already returns, this way we include error handling but maintain consistency with what the function already returns and does so there is no chance of breaking stuff. I think including the error handling would be a good idea overall because we have already seen 1 bug where the root cause is insufficient error handling right? Furthermore this driver has not been updated in 4 years, so for the near‑term surely improving these aspects can only be a good thing.
It is not a simple thing to decided if we should make changes or not, if we don't have the hardware. The test robot is saying things are potentially wrong, but we don't have any users complaining it is broken. If we make the test robot happy, without testing the changes, we can make users unhappy by breaking it. And that is the opposite of what we want.
For patch 2 the kernel test robot said it is missing a Cc stable tag in the sign-off area, it didnt highlight any build or functional errors so I don't understand what you mean there.
We also need to think about "return on investment". Is anybody actually using this device still? Would it be better to spend our time on other devices we know are actually used?
If you can find a board which actually has this device, or can find somebody to run tests, then great, we are likely to accept them. But otherwise please focus on minimum low risk changes which are obviously
So going forward what should we do? I gave my thoughts for each patch above and how I think we should change it to minimise breaking things while adding error handling, which ones do you agree/ don't agree with?
Thanks Qasim
correct, or just leave the test robot unhappy.
Andrew