Hi Marek,
marex@denx.de wrote on Fri, 2 Dec 2022 16:23:29 +0100:
On 12/2/22 16:00, Miquel Raynal wrote:
Hi Marek,
Hi,
marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100:
On 12/2/22 15:05, Miquel Raynal wrote:
Hi Francesco,
Hi,
[...]
I still strongly disagree with the initial proposal but what I think we can do is:
To prevent future breakages: Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any kernel should work.
To help tracking down situations like that: Keep the warning in ofpart.c but continue to fail.
To fix the current situation: Immediately revert commit (and prevent it from being backported): 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") This way your own boot flow is fixed in the short term.
Here I disagree, the fix is correct and I think we shouldn't proliferate incorrect DTs which don't match the binding document.
I agree we should not proliferate incorrect DTs, so let's use a modern description then
Yes please !
, with a controller and a child node which defines the chip.
But what if there is no chip connected to the controller node ?
If I understand the proposal here right (please correct me if I'm wrong), then:
Good idea to summarize.
- This is the original, old, wrong binding:
    &gpmi {
        #size-cells = <1>;
        ...
        partition@N { ... };
    };
Yes.
- This is the newer, but still wrong binding:
    &gpmi {
        #size-cells = <0>;
        ...
        partitions {
            partition@N { ... };
        };
    };
Well, this is a wrong description, but it would work (for compat reasons, even though I don't think this is considered valid DT by the schemas).
- This is the newest binding, what we want:
    &gpmi {
        #size-cells = <0>;
        ...
        nand-chip {
            partitions {
                partition@N { ... };
            };
        };
    };
Yes
But if there is no physical NAND chip connected to the controller, would we end up with an empty nand-chip node in DT, like this?
    &gpmi {
        #size-cells = <X>;
        ...
        nand-chip { /* empty */ };
    };
Is this really a concern? If there is no NAND chip, the controller should be disabled, no? I guess technically you could even use the status property in the nand-chip node...
However, it should not be empty, at the very least a reg property should indicate on which CS it is wired, as expected there: https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documenta...
But, as nand-chip.yaml references mtd.yaml, you can as well use whatever is described here: https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documenta...
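For illustration, here is a minimal sketch of the controller/chip split described above. It assumes the chip sits on chip select 0 and uses a "fixed-partitions" container. The nand@0 node name, the partition label, and the offset and size are hypothetical placeholders, not values taken from any board in this thread.

```dts
&gpmi {
	#address-cells = <1>;	/* one cell: the chip-select number */
	#size-cells = <0>;	/* child chip nodes carry no size */
	status = "okay";	/* would stay "disabled" if no chip is wired */

	nand@0 {			/* the "nand-chip" node of the discussion */
		reg = <0>;		/* assumption: chip wired on CS0 */

		partitions {
			compatible = "fixed-partitions";
			#address-cells = <1>;
			#size-cells = <1>;

			partition@0 {
				label = "boot";		/* hypothetical */
				reg = <0x0 0x80000>;	/* hypothetical offset/size */
			};
		};
	};
};
```

With this layout the partitions' #size-cells = <1> lives in the "partitions" container under the chip node, so the controller node itself can keep #size-cells = <0>.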
What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ?
The commit that was pointed to in the original fix clearly stated that the NAND chip node was targeted, not the NAND controller node. I hope this is correctly supported in U-Boot though. So if there is a NAND chip subnode, I suppose U-Boot would try to create the partitions inside it, or even in the "partitions" sub-container.
Rather, if a bootloader generates incorrect (new) DT entries, I believe the driver should implement a fixup and warn the user about this. PCs do that as well with broken ACPI tables, as far as I can tell.
I'm not convinced making a DT non-compliant with bindings again,
I am sorry to say so, but while warnings reported by the tools should be fixed, the fact that the tool does not scream at you does not mean the description is valid. We are actively working on enhancing the schema so that "all" improper descriptions get warnings (see the series pointed to earlier), but in no way does this change make the node compliant with modern bindings.
I'm not saying the fix is wrong, but let's be pragmatic, it currently leads to boot failures.
I fully agree that we do have a problem, and the fact that it trickled into stable makes it even worse. Maybe I don't fully understand the nand-chip proposal, see my question above, esp. the last part.
only to work around a problem induced by the bootloader, is the right approach here.
When a patch breaks a board and there is no straight fix, you revert it, then you think harder. That's what I am saying. This is a temporary solution.
Isn't this patch the straight fix, at least until the bootloader can be updated to generate the nand-chip node correctly ?
This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and something broke after a valid DT clean up.
Please, you know this is not valid DT clean up. We've been decoupling controller and chip description since 2016. What I am proposing is a valid DT cleanup, not to the latest standard, but way closer than the current solution.
I think I really need one more explanation of the nand-chip part above.
I hope things are clearer now.
Thanks, Miquèl