From: Jakub Kicinski kuba@kernel.org Sent: Wednesday, October 27, 2021 5:04 PM To: Ariel Elior aelior@marvell.com Cc: Manish Chopra manishc@marvell.com; Greg KH gregkh@linuxfoundation.org; netdev@vger.kernel.org; stable@vger.kernel.org; Sudarsana Reddy Kalluru skalluru@marvell.com; malin1024@gmail.com; Shai Malin smalin@marvell.com; Omkar Kulkarni okulkarni@marvell.com; Nilesh Javali njavali@marvell.com; GR-everest- linux-l2@marvell.com; Andrew Lunn andrew@lunn.ch Subject: Re: [EXT] Re: [PATCH net-next 1/2] bnx2x: Utilize firmware 7.13.20.0
On Wed, 27 Oct 2021 05:17:43 +0000 Ariel Elior wrote:
You may recall we had a discussion on this during our last FW upgrade too.
"During our last FW upgrade" is pretty misleading here. The discussion seems to have been after user reported that you broke their systems:
https://urldefense.proofpoint.com/v2/url?u=https- 3A__lore.kernel.org_netdev_ffbcf99c-2D8274-2Deca1-2D5166- 2Defc0828ca05b- 40molgen.mpg.de_&d=DwICAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=cWBgNIF UifZRx2xhypdcaYrfIsMGt93NxP1r8GQtC0s&m=7SlExiNTnqs5yykfN6rmIWZvB wQtSHCEW1ZmdHhe3S4&s=5OfVYQjTBOqb9GeN7GmGw70U_7MZLYz9YDyl o1iqdtQ&e=
Now you want to make your users' lives even more miserable by pushing your changes into stable.
Please note this is not FW which resides in flash, which may or may not be updated during the life cycle of a specific board deployment, but rather an initialization sequence recipe which happens to contain FW content (as well
as
many other register and memory initializations) which is activated when
driver
loads. We do have Flash based FW as well, with which we are fully
backwards and
forwards compatible. There is no method to build the init sequence in a backwards compatible mode for these devices - it would basically mean duplicating most of the device interaction logic (control plane and data
plane).
To support these products we need to be able to update this from time to
time.
And the driver can't support two versions of init sequence because...?
Well. Generally speaking on the architecture of these devices, the init sequence includes programing the PRAM of the processors on the fastpath, so two different init sequences would mean two different FW versions, and supporting two different data planes. It would also mean drastic changes in control plane as well, for the same reason: the fastpath FW expects completely different structures and messaging from one version to the next. For these two versions, however, changes are admittedly not that big.
Please note these devices are EOLing, and therefore this may well be the
last
update to this FW.
Solid argument.
At this time we don't have any plans on further replacing the FW, so hopefully we won't find ourselves here again. Another important point: the major fix we are pushing here is a breakage in the SR-IOV virtual function compatibility implementation in the FW (introduced in the previous FW version). If we would maintain the ability to support that older FW in the PF driver, we would also be maintaining the presence of the breakage. In other words you may be better off with the driver failing to load with a clear message of "need new FW file" rather than having it load, but then having all your virtual functions which are passed through to virtual machines with distro kernel fail in weird and unpredictable ways. For this reason we also ask to expedite the acceptance of this change, as it is desirable to limit the exposure of this problem. Back to the EOLing point: the set of people still familiar with this
13 years old architecture is severely reduced, so would rather not to try
and invent new ways of doing things.
The only theoretical way we can think of getting around this if we had to is adding the entire thing as a huge header file and have the driver compile with it. This would detach the dependency on the FW file being present on disk, but has big disadvantages of making the compiled driver huge, and bloating the kernel with redundant headers filled with what is essentially a binary blob. We do make sure to add the FW files to the FW sub tree in advance of modifying the drivers to use them.
All the patch is doing is changing some offsets. Why can't you just make the offset the driver uses dependent on the FW version?
Would be great if the engineer who wrote the code could answer that.
My original answer was more for the general design/architecture of these products, and a general design of supporting multiple FW versions on them. Specifically for this FW change, as relatively little has changed, we *could* maintain two sets of offsets based on the FW version. This might impact driver performance, however, as some of the offsets are used in data plane (e.g. updating the L2 ring producers), although I suppose we could also work around that by preparing a new "generic" array of offsets, populate it at FW load time and use that. However, as I stated above, I don't think it is desirable in this case, as the previous version contains some nasty behavior, and it may take us some considerable time to build that design, allowing the problematic FW to spread to further Linux distributions.
Thanks, Ariel