Hi Marek,
marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100:
On 12/2/22 15:05, Miquel Raynal wrote:
Hi Francesco,
Hi,
[...]
I still strongly disagree with the initial proposal but what I think we can do is:
To prevent future breakages: Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any kernel should work.
To help tracking down situations like that: Keep the warning in ofpart.c but continue to fail.
To fix the current situation: Immediately revert commit (and prevent it from being backported): 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") This way your own boot flow is fixed in the short term.
Here I disagree, the fix is correct and I think we shouldn't proliferate incorrect DTs which don't match the binding document.
I agree we should not proliferate incorrect DTs, so let's use a modern description then, with a controller and a child node which defines the chip.
Rather, if a bootloader generates incorrect (new) DT entries, I believe the driver should implement a fixup and warn user about this. PC does that as well with broken ACPI tables as far as I can tell.
I'm not convinced making a DT non-compliant with bindings again,
I am sorry to say so, but while warnings reported by the tools should be fixed, it's not because the tool does not scream at you that the description is valid. We are actively working on enhancing the schema so that "all" improper descriptions get warnings (see the series pointed earlier), but in no way this change makes the node compliant with modern bindings.
I'm not saying the fix is wrong, but let's be pragmatic, it currently leads to boot failures.
only to work around a problem induced by bootloader, is the right approach here.
When a patch breaks a board and there is no straight fix, you revert it, then you think harder. That's what I am saying. This is a temporary solution.
This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke.
Please, you know this is not valid DT clean up. We've been decoupling controller and chip description since 2016. What I am proposing is a valid DT cleanup, not to the latest standard, but way closer than the current solution.
- There is no reason to partially fix a DT like what the above did besides trying to avoid warnings emitted by the DT check tools.
Note that the 3. does not partially fix the DT, it fixes the node fully.
If complying with modern bindings is a goal (and I think it should be), then we can modernize this DT without breaking the boot flow: Instead of only setting #size-cell = <0>, you can as well define in your DT a subnode to define the NAND chip. NAND chips are not supposed to have #size-cells properties, but in the past they did, which means #address-cells and #size-cells are allowed (and marked deprecated in the schema). So in practice, the dt-schema will not warn you if they are there, which means you can still set #size-cell = <1>.
I am really not convinced we should hack around this on the DT end and try to push some sort of convoluted workaround there,
"convoluted workaround" :-)
instead of fixing it on the driver side (and bootloader side, in the long run).
Please mind, the tools have been updated very recently to match
s/tools/schema/
what I am describing above, so they will likely still report errors until v6.2-rc1, see: https://lore.kernel.org/linux-mtd/20221114090315.848208-1-miquel.raynal@bootlin.com/
Does this sound reasonable?
[...]
Thanks, Miquèl