Hello Miquel,
On Thu, Jan 05, 2023 at 12:33:34PM +0100, Miquel Raynal wrote:
miquel.raynal@bootlin.com wrote on Mon, 2 Jan 2023 10:40:04 +0100:
francesco@dolcini.it wrote on Fri, 16 Dec 2022 17:30:18 +0100:
On Fri, Dec 16, 2022 at 04:35:01PM +0100, Miquel Raynal wrote:
marex@denx.de wrote on Fri, 16 Dec 2022 15:32:28 +0100:
The second part of the message, as far as I understand it, is "ignore problems this will cause to users of boards we do not know about, let them run into unbootable systems after some linux kernel update,
Now you know what kernel update will break them, so you can prevent it from happening.
For boards without even a dtsi in the kernel, should we care?
Would caring for those boards not be just exactly the same as caring for some UEFI/ACPI mess for which no source code is normally available and nobody really knows at which point the various vendors have forked their source code from some Intel or AMD or whatever reference code?
I am sorry I don't know UEFI/ACPI well enough to discuss it.
IMHO we should care for the multiple reasons I have already written in my previous emails.
And honestly, just as a side comment, I would feel way more happy to know that the elevator control system in the elevator I use every day or the chemical industrial plant HMI next to my home is running an up-to-date Linux system that is not affected by known security vulnerabilities, and that they did not stop updating it just because there was some random bug preventing the updated kernel from booting and nobody had the time/skill to investigate and fix it. [1]
The issue comes from a very specific U-Boot function that should have never existed. I hope people working on chemical plants do not make use of these and will not disregard the "your DT is broken there [...]" warning we plan to add right before their updated board will fail. We are not leaving people in the dark, I agreed to a warning, but I don't think applying the proposed fix blindly is wise and future-proof.
Let's move forward with this. Let's assume my fears are baseless. We might consider the situation where someone tries to hide the partitions by setting #size-cells to 0 even more wrong and too unlikely. Hopefully we will not break any other existing setups by applying an always-on fix.
Nice, good!
I would still like to see U-Boot partitions handling evolve, at least:
- fix #size-cells in fdt_fixup_mtd()
- avoid the fdt_fixup_mtd() call from Colibri boards (i.e. an example that can be followed by the other users)
Fine, I can do it.
However I am just not 100% sure about your proposal, I wonder if we should just deprecate this function or we should fix it. The exact end result will depend on the discussion with the U-Boot folks, but I absolutely agree that the current situation needs to change. I'll keep you in CC on those patches.
On the Linux side, let's fix #size-cells like you proposed without filtering against a list of compatibles. We however need to improve the heuristics:
- Do it only when there are partitions declared within a NAND controller node.
- Change the warning to avoid mentioning backward compatibility, just mention this is utterly wrong and thus the value will be set to 1 instead of 0.
- Mention in the comment above this only works on systems with <4GiB chips.
If you think about other conditions please feel free to add them.
Do you concur?
Yes, I do agree.
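To make the agreed heuristics concrete, here is a minimal sketch in plain C. It is not the actual kernel patch: `struct node_info`, `effective_size_cells()` and `size_fits_one_cell()` are hypothetical names made up for illustration, and the warning text is only a placeholder.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified view of a device-tree node; not a kernel structure. */
struct node_info {
	bool is_nand_controller;  /* the node is a NAND controller */
	bool has_partitions;      /* partitions are declared inside this node */
	uint32_t size_cells;      /* #size-cells as found in the DT */
};

/*
 * Return the #size-cells value to use. Only the broken case discussed above
 * (0 in a NAND controller node that declares partitions) is overridden.
 */
static uint32_t effective_size_cells(const struct node_info *n)
{
	if (n->size_cells == 0 && n->is_nand_controller && n->has_partitions) {
		/* Placeholder wording: says the DT is wrong, no mention of compatibility. */
		fprintf(stderr, "#size-cells = <0> is wrong here, using 1 instead\n");
		return 1;
	}
	return n->size_cells;
}

/*
 * With #size-cells = <1> a size is a single 32-bit cell, so only sizes
 * below 4GiB can be encoded. This is the limit to mention in the comment.
 */
static bool size_fits_one_cell(uint64_t size_bytes)
{
	return size_bytes <= UINT32_MAX;
}
```

In this sketch a NAND controller node with partitions and #size-cells = <0> is the only case that gets 1; every other node keeps whatever value the DT provides.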
Side comment, I have been recently busy with other life AND work priorities and this task was just idling on the bottom of my backlog. I do not see the situation improving that much in the next few weeks.
That said, patches are coming. I am committed to having this sorted out before the next Linux kernel merge window. For U-Boot the merge window opens in 3 days and I am already late; let's see, this might as well be considered a fix that is fine for a late integration.
Francesco