marex@denx.de wrote on Fri, 16 Dec 2022 11:46:18 +0100:
On 12/16/22 08:45, Francesco Dolcini wrote:
Hello Marek and Miquel,
Hi,
On Thu, Dec 15, 2022 at 08:16:04AM +0100, Miquel Raynal wrote:
So my first first idea was to avoid using the broken "fixup mtdparts" function in U-Boot and I am still convinced this is what we should do in priority.
This is something that was already discussed, but I was not really thinking much on it till now. Do you think that the whole idea of editing the MTD partitions from the firmware is wrong and we should just pass the partition on the command line OR that the current implementation is broken and can/should be fixed?
No, patching the partition layout into DT is fine. Firmwares of all kinds have been patching various parts of the DT before passing it to OS since forever, or more recently, merging multiple DT fragments and passing the composite DT to Linux.
As far as I recall, OF predates Linux and the OF tree has been usually assembled by the Forth firmware of that era from various chunks stored in different parts of the system. So this patching is fundamental part of the design since the beginning.
It is difficult to describe complex structure like the partition mapping on kernel command line, it should really be in DT or other such structure, so patching it into the DT is fine.
I think describing it in the DT is fine and welcome. I think patching it in the DT is ugly. My 2cts.
The only detail here is, it should be patched into the DT correctly ... and ... if old firmwares do not do that, Linux should fix it up. You don't throw away your old PC just because it doesn't have perfect ACPI tables one would expect today, I don't see why we should do that with DT machines.
I am still against piggy hacks in the generic ofpart.c driver, but what we could do however is a DT fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very much like this: https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.... Plus a warning there saying "your dt is broken, update your firmware".
I have a couple of concerns/question with this approach:
- do we have a single point to handle this? Different architectures are affected by these issue. Duplicating the fixup code in multiple place does not seems a great idea
- If we believe that the device tree is wrong, in the i.MX7 case because of #size-cells should be set to 0 and not 1, we should not alter the FDT. Other part of the code could rely on this being correctly set to 0 moving forward.
If I understood you are proposing to have a fixup at the machine level that is converting a valid nand-controller node definition to a "broken" one. Unless I misunderstood you and you are thinking about rewriting the whole MTD partition from a broken definition to a proper one.
No, quite the opposite.
Either size-cell is wrong which makes the description totally inconsistent (if size-cell is there, it must have a use, otherwise why do we keep it?) and we must fix it, or it is right and we should not touch it.
What I propose is to check very early whether the description is consistent on the board known to have this problem. If the description is wrong, we fix it and the generic parser can then do its work properly.
On Thu, Dec 15, 2022 at 09:04:46AM +0100, Miquel Raynal wrote:
marex@denx.de wrote on Thu, 15 Dec 2022 08:45:33 +0100:
Sadly, it does only fix the known cases, not the unknown cases like downstream forks which never get any bootloader updates ever, and which you can't find in upstream U-Boot, and which you therefore cannot easily catch in the arch side fixup.
And ?
I'm not personally and directly concerned, since the machine I care are all available upstream and known, however this is a general problem with U-Boot code being at the same time widely used on a range of embedded products and producing a broken MTD partition list.
I think we will just silently break boards and just creating a lot of issues to people. We would just introduce regression to the users, being aware of it and deliberately decide to not care and move the problem to someone else. I do not think this is a good way to go.
What? Since when my proposal is breaking boards? My proposal leads to a situation where: - If you have a board that has an inconsistent description but worked, it will still work. - If you have a board that has a consistent description and worked, it will still work. - If your have a board that has an inconsistent description and got broken *recently* by another change (typically you "fix" the DT in Linux to comply with the bindings), then you get a warning that leads you on the right path, you then update your bootloader if you can, but either way you add your machine compatible to the list of devices which need the early fix and your boot is fixed. - If you add support for a new board with an old kernel and have an inconsistent description it does not "just work because we have an automatic piggy hack in the driver". I am against it.
Whether or not it is acceptable by arch maintainer is something else, we won't know until we include them in the loop with a proper patch.
Thanks, Miquèl