In OpenWrt Project the flash write error caused on some products. Also the issue can be fixed by using chip_good() instead of chip_ready(). The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Signed-off-by: Tokunori Ikegami ikegami@allied-telesis.co.jp Signed-off-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Koen Vandeputte koen.vandeputte@ncentric.com Signed-off-by: Fabio Bettoni fbettoni@gmail.com Co-Developed-by: Hauke Mehrtens hauke@hauke-m.de Co-Developed-by: Koen Vandeputte koen.vandeputte@ncentric.com Co-Developed-by: Fabio Bettoni fbettoni@gmail.com Reported-by: Fabio Bettoni fbettoni@gmail.com Cc: Chris Packham chris.packham@alliedtelesis.co.nz Cc: Joakim Tjernlund Joakim.Tjernlund@infinera.com Cc: Boris Brezillon boris.brezillon@free-electrons.com Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org --- Changes since v2: - Just update the commit message for the comment.
Changes since v1: - Just update the commit message.
Background: This is required for OpenWrt Project to result the flash write issue as below patche. https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=ddc11c3932c7b7b7df7d5fbd48f207e77619eaa7
Also the original patch in OpenWRT is below. https://github.com/openwrt/openwrt/blob/v18.06.0/target/linux/ar71xx/patches-4.9/403-mtd_fix_cfi_cmdset_0002_status_check.patch
The reason to use chip_good() is that just actually fix the issue. And also in the past I had fixed the erase function also as same way by the patch below. https://patchwork.ozlabs.org/patch/922656/ Note: The reason for the patch for erase is same.
In my understanding the chip_ready() is just checked the value twice from flash. So I think that sometimes incorrect value is read twice and it is depended on the flash device behavior but not sure..
So change to use chip_good() instead of chip_ready().
drivers/mtd/chips/cfi_cmdset_0002.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c index 72428b6bfc47..251c9e1675bd 100644 --- a/drivers/mtd/chips/cfi_cmdset_0002.c +++ b/drivers/mtd/chips/cfi_cmdset_0002.c @@ -1627,31 +1627,37 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip, continue; }
- if (time_after(jiffies, timeo) && !chip_ready(map, adr)){ + if (chip_good(map, adr, datum)) + break; + + if (time_after(jiffies, timeo)){ xip_enable(map, chip, adr); printk(KERN_WARNING "MTD %s(): software timeout\n", __func__); xip_disable(map, chip, adr); + ret = -EIO; break; }
- if (chip_ready(map, adr)) - break; - /* Latency issues. Drop the lock, wait a while and retry */ UDELAY(map, chip, adr, 1); } + /* Did we succeed? */ - if (!chip_good(map, adr, datum)) { + if (ret) { /* reset on all failures. */ map_write(map, CMD(0xF0), chip->start); /* FIXME - should have reset delay before continuing */
- if (++retry_cnt <= MAX_RETRIES) + if (++retry_cnt <= MAX_RETRIES) { + ret = 0; goto retry; + }
ret = -EIO; } + xip_enable(map, chip, adr); + op_done: if (mode == FL_OTP_WRITE) otp_exit(map, chip, adr, map_bankwidth(map));
On Fri, 26 Oct 2018 01:32:09 +0900 Tokunori Ikegami ikegami@allied-telesis.co.jp wrote:
In OpenWrt Project the flash write error caused on some products.
It's okay to mention that the issue was discovered by the OpenWRT team, but I'd rephrase it differently.
"As reported by the OpenWRT team, write requests sometimes fail on some platforms".
Also the issue can be fixed by using chip_good() instead of chip_ready(). The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should be used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function is used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things to the memory array (it's probably working for other CFI commands, but I guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write is done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
Anyway, this is the sort of clarification I'd like to have.
Signed-off-by: Tokunori Ikegami ikegami@allied-telesis.co.jp Signed-off-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Koen Vandeputte koen.vandeputte@ncentric.com Signed-off-by: Fabio Bettoni fbettoni@gmail.com
Has the patch really gone through all those people? SoB is used when you apply a patch in your tree or when you're the original author.
Co-Developed-by: Hauke Mehrtens hauke@hauke-m.de Co-Developed-by: Koen Vandeputte koen.vandeputte@ncentric.com Co-Developed-by: Fabio Bettoni fbettoni@gmail.com
Not sure we want to add new undocumented tags, but you can mention that all those people helped you find/debug the issue. They can also add their Reviewed-by/Tested-by if they like.
Reported-by: Fabio Bettoni fbettoni@gmail.com Cc: Chris Packham chris.packham@alliedtelesis.co.nz Cc: Joakim Tjernlund Joakim.Tjernlund@infinera.com Cc: Boris Brezillon boris.brezillon@free-electrons.com Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org
Changes since v2:
- Just update the commit message for the comment.
Changes since v1:
- Just update the commit message.
Background: This is required for OpenWrt Project to result the flash write issue as below patche. https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=ddc11c3932c7b7b7df7d5fbd48f207e77619eaa7
Also the original patch in OpenWRT is below. https://github.com/openwrt/openwrt/blob/v18.06.0/target/linux/ar71xx/patches-4.9/403-mtd_fix_cfi_cmdset_0002_status_check.patch
The reason to use chip_good() is that just actually fix the issue. And also in the past I had fixed the erase function also as same way by the patch below. https://patchwork.ozlabs.org/patch/922656/ Note: The reason for the patch for erase is same.
In my understanding the chip_ready() is just checked the value twice from flash. So I think that sometimes incorrect value is read twice and it is depended on the flash device behavior but not sure..
So change to use chip_good() instead of chip_ready().
drivers/mtd/chips/cfi_cmdset_0002.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c index 72428b6bfc47..251c9e1675bd 100644 --- a/drivers/mtd/chips/cfi_cmdset_0002.c +++ b/drivers/mtd/chips/cfi_cmdset_0002.c @@ -1627,31 +1627,37 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip, continue; }
if (time_after(jiffies, timeo) && !chip_ready(map, adr)){
if (chip_good(map, adr, datum))
break;
if (time_after(jiffies, timeo)){ xip_enable(map, chip, adr); printk(KERN_WARNING "MTD %s(): software timeout\n", __func__); xip_disable(map, chip, adr);
}ret = -EIO; break;
if (chip_ready(map, adr))
break;
- /* Latency issues. Drop the lock, wait a while and retry */ UDELAY(map, chip, adr, 1); }
- /* Did we succeed? */
- if (!chip_good(map, adr, datum)) {
- if (ret) { /* reset on all failures. */ map_write(map, CMD(0xF0), chip->start); /* FIXME - should have reset delay before continuing */
if (++retry_cnt <= MAX_RETRIES)
if (++retry_cnt <= MAX_RETRIES) {
ret = 0; goto retry;
}
ret = -EIO; }
- xip_enable(map, chip, adr);
Not a big deal, but I'd prefer to not have coding style changes mixed with functional changes (in other words, you can drop the addition of blanks lines around xip_enable()).
op_done: if (mode == FL_OTP_WRITE) otp_exit(map, chip, adr, map_bankwidth(map));
On Mon, 2018-11-05 at 11:15 +0100, Boris Brezillon wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On Fri, 26 Oct 2018 01:32:09 +0900 Tokunori Ikegami ikegami@allied-telesis.co.jp wrote:
In OpenWrt Project the flash write error caused on some products.
It's okay to mention that the issue was discovered by the OpenWRT team, but I'd rephrase it differently.
"As reported by the OpenWRT team, write requests sometimes fail on some platforms".
Also the issue can be fixed by using chip_good() instead of chip_ready(). The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should be used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function is used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things to the memory array (it's probably working for other CFI commands, but I guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write is done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
The 0001 cmd set is quite different to 0002 and 0001 is the superior one. If you look at recent 0002 cmd sets they offer an alternative cmd set to replace the all the "toggle" ones with something that is same/similar to what 0001 offers.
Jocke
On Mon, 5 Nov 2018 12:03:04 +0000 Joakim Tjernlund Joakim.Tjernlund@infinera.com wrote:
On Mon, 2018-11-05 at 11:15 +0100, Boris Brezillon wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On Fri, 26 Oct 2018 01:32:09 +0900 Tokunori Ikegami ikegami@allied-telesis.co.jp wrote:
In OpenWrt Project the flash write error caused on some products.
It's okay to mention that the issue was discovered by the OpenWRT team, but I'd rephrase it differently.
"As reported by the OpenWRT team, write requests sometimes fail on some platforms".
Also the issue can be fixed by using chip_good() instead of chip_ready(). The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should be used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function is used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things to the memory array (it's probably working for other CFI commands, but I guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write is done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
The 0001 cmd set is quite different to 0002 and 0001 is the superior one. If you look at recent 0002 cmd sets they offer an alternative cmd set to replace the all the "toggle" ones with something that is same/similar to what 0001 offers.
Okay. Do you know when chip_ready() (the one that checks if something changes between 2 reads) should be used and when it shouldn't?
On Mon, 2018-11-05 at 13:52 +0100, Boris Brezillon wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On Mon, 5 Nov 2018 12:03:04 +0000 Joakim Tjernlund Joakim.Tjernlund@infinera.com wrote:
On Mon, 2018-11-05 at 11:15 +0100, Boris Brezillon wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On Fri, 26 Oct 2018 01:32:09 +0900 Tokunori Ikegami ikegami@allied-telesis.co.jp wrote:
In OpenWrt Project the flash write error caused on some products.
It's okay to mention that the issue was discovered by the OpenWRT team, but I'd rephrase it differently.
"As reported by the OpenWRT team, write requests sometimes fail on some platforms".
Also the issue can be fixed by using chip_good() instead of chip_ready(). The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should be used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function is used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things to the memory array (it's probably working for other CFI commands, but I guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write is done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
The 0001 cmd set is quite different to 0002 and 0001 is the superior one. If you look at recent 0002 cmd sets they offer an alternative cmd set to replace the all the "toggle" ones with something that is same/similar to what 0001 offers.
Okay. Do you know when chip_ready() (the one that checks if something changes between 2 reads) should be used and when it shouldn't?
It is next to impossible to do proper error handling(analysing status) with toggle method, especially when you have interleaved chips. Try with erase suspend when something goes wrong and you want to address that properly. Best is to add support for the extended 0002 cmd set and use that whenever possible.
Jocke
Hi Joakim,
On Mon, 5 Nov 2018 13:22:19 +0000 Joakim Tjernlund Joakim.Tjernlund@infinera.com wrote:
On Mon, 2018-11-05 at 13:52 +0100, Boris Brezillon wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On Mon, 5 Nov 2018 12:03:04 +0000 Joakim Tjernlund Joakim.Tjernlund@infinera.com wrote:
On Mon, 2018-11-05 at 11:15 +0100, Boris Brezillon wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On Fri, 26 Oct 2018 01:32:09 +0900 Tokunori Ikegami ikegami@allied-telesis.co.jp wrote:
In OpenWrt Project the flash write error caused on some products.
It's okay to mention that the issue was discovered by the OpenWRT team, but I'd rephrase it differently.
"As reported by the OpenWRT team, write requests sometimes fail on some platforms".
Also the issue can be fixed by using chip_good() instead of chip_ready(). The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should be used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function is used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things to the memory array (it's probably working for other CFI commands, but I guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write is done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
The 0001 cmd set is quite different to 0002 and 0001 is the superior one. If you look at recent 0002 cmd sets they offer an alternative cmd set to replace the all the "toggle" ones with something that is same/similar to what 0001 offers.
Okay. Do you know when chip_ready() (the one that checks if something changes between 2 reads) should be used and when it shouldn't?
It is next to impossible to do proper error handling(analysing status) with toggle method, especially when you have interleaved chips.
It's probably me who does not understand how CFI works, but it sounds weird to have chip_ready() called on something that's not a status register (this is my understanding of what do_write_oneword() does).
Try with erase suspend when something goes wrong and you want to address that properly.
I trust you when you say it does not work when using chip_ready(), but I'd like to understand why. Well, first I'd like to understand what chip_ready() is supposed to do, and on which kind of access/address it's supposed to be used. As you already noticed I don't know a lot about CFI, and that's why it's important to me to have things clearly explained in the commit message.
Best is to add support for the extended 0002 cmd set and use that whenever possible.
Okay, does that mean we should replace all chip_ready() calls by chip_good() ones until support for ext 0002 cmdset is added? To be honest, I have a hard time understanding what chip_ready() is supposed to tell us. To me it's something that should return 1 when the chip is ready to accept new requests, but I don't see how comparing values returned by 2 successive reads can provide me this information. Can you maybe point me to the CFI 0002 cmdset spec describing that?
Thanks,
Boris
Thank you so much for your reviewing.
-----Original Message----- From: Boris Brezillon [mailto:boris.brezillon@bootlin.com] Sent: Monday, November 05, 2018 7:16 PM To: IKEGAMI Tokunori Cc: boris.brezillon@free-electrons.com; Hauke Mehrtens; stable@vger.kernel.org; Joakim Tjernlund; PACKHAM Chris; linux-mtd@lists.infradead.org; Koen Vandeputte; Fabio Bettoni Subject: Re: [PATCH v3 01/11] mtd: cfi_cmdset_0002: Change do_write_oneword() to use chip_good()
On Fri, 26 Oct 2018 01:32:09 +0900 Tokunori Ikegami ikegami@allied-telesis.co.jp wrote:
In OpenWrt Project the flash write error caused on some products.
It's okay to mention that the issue was discovered by the OpenWRT team, but I'd rephrase it differently.
"As reported by the OpenWRT team, write requests sometimes fail on some platforms".
Yes I will do fix as this.
Also the issue can be fixed by using chip_good() instead of chip_ready(). The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should be used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function is used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things to the memory array (it's probably working for other CFI commands, but I guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write is done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
Anyway, this is the sort of clarification I'd like to have.
I am thinking to update the commit message as below.
mtd: cfi_cmdset_0002: Use chip_good() to retry in do_write_oneword()
As reported by the OpenWRT team, write requests sometimes fail on some platforms. Currently to check the state chip_ready() is used correctly as described by the flash memory S29GL256P11TFI01 datasheet. Also chip_good() is used to check if the write is succeeded and it was implemented by the commit fb4a90bfcd6d8 ("[MTD] CFI-0002 - Improve error checking"). But actually the write failure is caused on some platforms and also it can be fixed by using chip_good() to check the state and retry instead. It is depended on the actual flash chip behavior so the root cause is unknown.
If any comment please let me know.
Signed-off-by: Tokunori Ikegami ikegami@allied-telesis.co.jp Signed-off-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Koen Vandeputte koen.vandeputte@ncentric.com Signed-off-by: Fabio Bettoni fbettoni@gmail.com
Has the patch really gone through all those people? SoB is used when you apply a patch in your tree or when you're the original author.
I have just checked the OpenWRT git log again and it looks that it was originally implemented by Felix Fietkau nbd@openwrt.org by the patch below so I will update the Signed-off-by tag as so. https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=2530640f07cd2b3b14fe9ec03fa63a586452cc5f
Co-Developed-by: Hauke Mehrtens hauke@hauke-m.de Co-Developed-by: Koen Vandeputte koen.vandeputte@ncentric.com Co-Developed-by: Fabio Bettoni fbettoni@gmail.com
Not sure we want to add new undocumented tags, but you can mention that all those people helped you find/debug the issue. They can also add their Reviewed-by/Tested-by if they like.
Yes so I am thinking to change as below.
Signed-off-by: Tokunori Ikegami ikegami@allied-telesis.co.jp Signed-off-by: Felix Fietkau nbd@openwrt.org Tested-by: Fabio Bettoni fbettoni@gmail.com Reported-by: Fabio Bettoni fbettoni@gmail.com Cc: Hauke Mehrtens hauke@hauke-m.de Cc: Koen Vandeputte koen.vandeputte@ncentric.com
If any problem let me know.
By the way the patch has been tested by Fabio-san then there was still the failure behavior. And it was not followed the original patch changes correctly. So I will update the patch change a little by the next version v4 patch. Note: It has been retested by the Fabio-san as okay.
Reported-by: Fabio Bettoni fbettoni@gmail.com Cc: Chris Packham chris.packham@alliedtelesis.co.nz Cc: Joakim Tjernlund Joakim.Tjernlund@infinera.com Cc: Boris Brezillon boris.brezillon@free-electrons.com Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org
Changes since v2:
- Just update the commit message for the comment.
Changes since v1:
- Just update the commit message.
Background: This is required for OpenWrt Project to result the flash write issue as below patche.
Also the original patch in OpenWRT is below.
The reason to use chip_good() is that just actually fix the issue. And also in the past I had fixed the erase function also as same way by
the
patch below. https://patchwork.ozlabs.org/patch/922656/ Note: The reason for the patch for erase is same.
In my understanding the chip_ready() is just checked the value twice from flash. So I think that sometimes incorrect value is read twice and it is depended on the flash device behavior but not sure..
So change to use chip_good() instead of chip_ready().
drivers/mtd/chips/cfi_cmdset_0002.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c
b/drivers/mtd/chips/cfi_cmdset_0002.c
index 72428b6bfc47..251c9e1675bd 100644 --- a/drivers/mtd/chips/cfi_cmdset_0002.c +++ b/drivers/mtd/chips/cfi_cmdset_0002.c @@ -1627,31 +1627,37 @@ static int __xipram do_write_oneword(struct
map_info *map, struct flchip *chip,
continue; }
if (time_after(jiffies, timeo) && !chip_ready(map, adr)){
if (chip_good(map, adr, datum))
break;
if (time_after(jiffies, timeo)){ xip_enable(map, chip, adr); printk(KERN_WARNING "MTD %s(): software
timeout\n", __func__);
xip_disable(map, chip, adr);
}ret = -EIO; break;
if (chip_ready(map, adr))
break;
- /* Latency issues. Drop the lock, wait a while and retry
*/
UDELAY(map, chip, adr, 1);
}
- /* Did we succeed? */
- if (!chip_good(map, adr, datum)) {
- if (ret) { /* reset on all failures. */ map_write(map, CMD(0xF0), chip->start); /* FIXME - should have reset delay before continuing */
if (++retry_cnt <= MAX_RETRIES)
if (++retry_cnt <= MAX_RETRIES) {
ret = 0; goto retry;
}
ret = -EIO; }
xip_enable(map, chip, adr);
Not a big deal, but I'd prefer to not have coding style changes mixed with functional changes (in other words, you can drop the addition of blanks lines around xip_enable()).
Yes I will do fix this.
Regards, Ikegami
op_done: if (mode == FL_OTP_WRITE) otp_exit(map, chip, adr, map_bankwidth(map));
Hi IKEGAMI,
On Tue, 6 Nov 2018 00:25:43 +0000 IKEGAMI Tokunori ikegami@allied-telesis.co.jp wrote:
Also the issue can be fixed by using chip_good() instead of chip_ready(). The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should be used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function is used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things to the memory array (it's probably working for other CFI commands, but I guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write is done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
Anyway, this is the sort of clarification I'd like to have.
I am thinking to update the commit message as below.
mtd: cfi_cmdset_0002: Use chip_good() to retry in do_write_oneword() As reported by the OpenWRT team, write requests sometimes fail on some platforms. Currently to check the state chip_ready() is used correctly as described by the flash memory S29GL256P11TFI01 datasheet.
I had a look at the S29GL256P datasheet here [1], and if I'm correct, it's using cmdset 0001.
Also chip_good() is used to check if the write is succeeded and it was implemented by the commit fb4a90bfcd6d8 ("[MTD] CFI-0002 - Improve error checking"). But actually the write failure is caused on some platforms and also it can be fixed by using chip_good() to check the state and retry instead.
Do you know on which NOR chips this happens? Do you have access to the datasheet?
It is depended on the actual flash chip behavior so the root cause is unknown.
Yes, and that's what I'd like you to figure out, or at least have a good idea why this doesn't work on some chips but works on others.
If any comment please let me know.
Signed-off-by: Tokunori Ikegami ikegami@allied-telesis.co.jp Signed-off-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Koen Vandeputte koen.vandeputte@ncentric.com Signed-off-by: Fabio Bettoni fbettoni@gmail.com
Has the patch really gone through all those people? SoB is used when you apply a patch in your tree or when you're the original author.
I have just checked the OpenWRT git log again and it looks that it was originally implemented by Felix Fietkau nbd@openwrt.org by the patch below so I will update the Signed-off-by tag as so. https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=2530640f07cd2b3b14fe9ec03fa63a586452cc5f
Co-Developed-by: Hauke Mehrtens hauke@hauke-m.de Co-Developed-by: Koen Vandeputte koen.vandeputte@ncentric.com Co-Developed-by: Fabio Bettoni fbettoni@gmail.com
Not sure we want to add new undocumented tags, but you can mention that all those people helped you find/debug the issue. They can also add their Reviewed-by/Tested-by if they like.
My bad, I just noticed these are valid flags [2], so you can keep them, and according to the doc, you should also keep the SoB.
Regards,
Boris
[1]http://www.cypress.com/file/219926/download [2]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu...
Hi Boris-san,
-----Original Message----- From: stable-owner@vger.kernel.org [mailto:stable-owner@vger.kernel.org] On Behalf Of Boris Brezillon Sent: Tuesday, November 06, 2018 5:34 PM To: IKEGAMI Tokunori Cc: boris.brezillon@free-electrons.com; Felix Fietkau; Hauke Mehrtens; stable@vger.kernel.org; Joakim Tjernlund; PACKHAM Chris; linux-mtd@lists.infradead.org; Koen Vandeputte; Fabio Bettoni Subject: Re: [PATCH v3 01/11] mtd: cfi_cmdset_0002: Change do_write_oneword() to use chip_good()
Hi IKEGAMI,
On Tue, 6 Nov 2018 00:25:43 +0000 IKEGAMI Tokunori ikegami@allied-telesis.co.jp wrote:
Also the issue can be fixed by using chip_good() instead of
chip_ready().
The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the chip_good(). So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should
be
used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function is used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things to the memory array (it's probably working for other CFI commands, but
I
guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write is done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
Anyway, this is the sort of clarification I'd like to have.
I am thinking to update the commit message as below.
mtd: cfi_cmdset_0002: Use chip_good() to retry in do_write_oneword() As reported by the OpenWRT team, write requests sometimes fail on
some
platforms. Currently to check the state chip_ready() is used correctly as
described by
the flash memory S29GL256P11TFI01 datasheet.
I had a look at the S29GL256P datasheet here [1], and if I'm correct, it's using cmdset 0001.
No actually the cmdset 0002 is used on the flash chip. The manufacturer ID xx01h and Device ID 2201h are used to decide.
There is information from Fobis-san below also about this.
On forum thread musashino posted picture of flash chip: https://forum.openwrt.org/t/impossible-to-install-update-any-packages-on-wzr... http://www.cypress.com/part/s29gl256p11tfi010
[ 0.862264] physmap platform flash device: 02000000 at 1e000000 [ 0.868331] physmap-flash: Found 1 x16 devices at 0x0 in 16-bit bank. Manufacturer ID 0x000001 Chip ID 0x002201 [ 0.878493] Amd/Fujitsu Extended Query Table at 0x0040 [ 0.883668] Amd/Fujitsu Extended Query version 1.3. [ 0.888768] number of CFI chips: 1 [ 0.894557] Searching for RedBoot partition table in physmap-flash at offset 0x1fc0000 [ 0.918009] Searching for RedBoot partition table in physmap-flash at offset 0x1fe0000 [ 0.941464] No RedBoot partition table detected in physmap-flash [ 0.947926] Creating 5 MTD partitions on "physmap-flash": [ 0.953384] 0x000000000000-0x000000040000 : "u-boot" [ 0.960853] 0x000000040000-0x000000060000 : "u-boot-env" [ 0.968803] 0x000000060000-0x000001fc0000 : "firmware" [ 0.981859] 2 uimage-fw partitions found on MTD device firmware [ 0.987900] 0x000000060000-0x0000001b5706 : "kernel" [ 0.994916] 0x0000001b5706-0x000001fc0000 : "rootfs" [ 1.001986] mtd: device 4 (rootfs) set to be root filesystem [ 1.007789] 1 squashfs-split partitions found on MTD device rootfs [ 1.014014] 0x0000003c0000-0x000001fc0000 : "rootfs_data" [ 1.022093] 0x000001fc0000-0x000001fe0000 : "user_property" [ 1.030283] 0x000001fe0000-0x000002000000 : "art"
Maybe you could post links to forum thread, and data sheet.
Also chip_good() is used to check if the write is succeeded and it
was
implemented by the commit fb4a90bfcd6d8 ("[MTD] CFI-0002 - Improve
error
checking"). But actually the write failure is caused on some platforms and also
it can
be fixed by using chip_good() to check the state and retry instead.
Do you know on which NOR chips this happens? Do you have access to the datasheet?
But it looks SST49LF008A [3] from the changes below but I am not sure at this moment and probably it should be confirmed to the authr Eric W. Biedermann ebiederman@lnxi.com to make sure.
+#define SST49LF008A 0x005a
static int cfi_amdstd_read (struct mtd_info *, loff_t, size_t, size_t *, u_char *); static int cfi_amdstd_write_words(struct mtd_info *, loff_t, size_t, size_t *, const u_char *); @@ -191,6 +192,7 @@ static struct cfi_fixup cfi_fixup_table[] = { }; static struct cfi_fixup jedec_fixup_table[] = { { MANUFACTURER_SST, SST49LF004B, fixup_use_fwh_lock, NULL, }, + { MANUFACTURER_SST, SST49LF008A, fixup_use_fwh_lock, NULL, },
It is depended on the actual flash chip behavior so the root cause
is
unknown.
Yes, and that's what I'd like you to figure out, or at least have a good idea why this doesn't work on some chips but works on others.
I see. But it is a little bit difficult situation since I do not have the failure environment locally at this moment. But if needed I may ask to get the help for this to Fabio-san.
If any comment please let me know.
Signed-off-by: Tokunori Ikegami ikegami@allied-telesis.co.jp Signed-off-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Koen Vandeputte koen.vandeputte@ncentric.com Signed-off-by: Fabio Bettoni fbettoni@gmail.com
Has the patch really gone through all those people? SoB is used when
you
apply a patch in your tree or when you're the original author.
I have just checked the OpenWRT git log again and it looks that it was
originally
implemented by Felix Fietkau nbd@openwrt.org by the patch below so I
will update the Signed-off-by tag as so.
Co-Developed-by: Hauke Mehrtens hauke@hauke-m.de Co-Developed-by: Koen Vandeputte koen.vandeputte@ncentric.com Co-Developed-by: Fabio Bettoni fbettoni@gmail.com
Not sure we want to add new undocumented tags, but you can mention that all those people helped you find/debug the issue. They can also add their Reviewed-by/Tested-by if they like.
My bad, I just noticed these are valid flags [2], so you can keep them, and according to the doc, you should also keep the SoB.
I see. Yes I had also checked it.
By the way in near future my company email address will be not able to use. So I will change the mail address to my personal email address [4] after that or before.
Regards, Ikegami
Regards,
Boris
[1]http://www.cypress.com/file/219926/download [2]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ tree/Documentation/process/submitting-patches.rst?h=v4.20-rc1#n546
[3]https://www.microchip.com/wwwproducts/en/SST49LF008A [4]ikegami_to@yahoo.co.jp
Sorry let me resend the mail below by changing the email address of Felix-san.
-----Original Message----- From: IKEGAMI Tokunori Sent: Tuesday, November 06, 2018 6:37 PM To: 'Boris Brezillon'; 'ikegami_to@yahoo.co.jp' Cc: boris.brezillon@free-electrons.com; Felix Fietkau; Hauke Mehrtens; stable@vger.kernel.org; Joakim Tjernlund; PACKHAM Chris; linux-mtd@lists.infradead.org; Koen Vandeputte; Fabio Bettoni Subject: RE: [PATCH v3 01/11] mtd: cfi_cmdset_0002: Change do_write_oneword() to use chip_good()
Hi Boris-san,
-----Original Message----- From: stable-owner@vger.kernel.org [mailto:stable-owner@vger.kernel.org] On Behalf Of Boris Brezillon Sent: Tuesday, November 06, 2018 5:34 PM To: IKEGAMI Tokunori Cc: boris.brezillon@free-electrons.com; Felix Fietkau; Hauke Mehrtens; stable@vger.kernel.org; Joakim Tjernlund; PACKHAM Chris; linux-mtd@lists.infradead.org; Koen Vandeputte; Fabio Bettoni Subject: Re: [PATCH v3 01/11] mtd: cfi_cmdset_0002: Change do_write_oneword() to use chip_good()
Hi IKEGAMI,
On Tue, 6 Nov 2018 00:25:43 +0000 IKEGAMI Tokunori ikegami@allied-telesis.co.jp wrote:
Also the issue can be fixed by using chip_good() instead of
chip_ready().
The chip_ready() just checks the value from flash memory twice. And the chip_good() checks the value with the expected value. Probably the issue can be fixed as checked correctly by the
chip_good().
So change to use chip_good() instead of chip_ready().
Well, that's not really explaining why you think chip_good() should
be
used instead of chip_ready(). So I went on and looked at the chip_good(), chip_ready() and do_write_oneword() implementation, and also looked at users of do_write_oneword(). It seems this function
is
used to write data to the flash, and apparently the "one bit should toggle to reflect a busy state" does not apply when writing things
to
the memory array (it's probably working for other CFI commands, but
I
guess it takes more time to actually change the level of a NOR cell, hence the result of 2 identical reads does not mean that the write
is
done).
Also, it seems that cmdset_0001 is not implementing chip_ready() the same way, and I wonder if cmdset_0002 implementation is correct to start with. Or maybe I don't get what chip_ready() is for.
Anyway, this is the sort of clarification I'd like to have.
I am thinking to update the commit message as below.
mtd: cfi_cmdset_0002: Use chip_good() to retry in
do_write_oneword()
As reported by the OpenWRT team, write requests sometimes fail on
some
platforms. Currently to check the state chip_ready() is used correctly as
described by
the flash memory S29GL256P11TFI01 datasheet.
I had a look at the S29GL256P datasheet here [1], and if I'm correct, it's using cmdset 0001.
No actually the cmdset 0002 is used on the flash chip. The manufacturer ID xx01h and Device ID 2201h are used to decide.
There is information from Fobis-san below also about this.
On forum thread musashino posted picture of flash chip: https://forum.openwrt.org/t/impossible-to-install-update-any-packages- on-wzr-hp-g300nh-18-06-1 http://www.cypress.com/part/s29gl256p11tfi010
[ 0.862264] physmap platform flash device: 02000000 at 1e000000 [ 0.868331] physmap-flash: Found 1 x16 devices at 0x0 in 16-bit bank. Manufacturer ID 0x000001 Chip ID 0x002201 [ 0.878493] Amd/Fujitsu Extended Query Table at 0x0040 [ 0.883668] Amd/Fujitsu Extended Query version 1.3. [ 0.888768] number of CFI chips: 1 [ 0.894557] Searching for RedBoot partition table in physmap-flash at offset 0x1fc0000 [ 0.918009] Searching for RedBoot partition table in physmap-flash at offset 0x1fe0000 [ 0.941464] No RedBoot partition table detected in physmap-flash [ 0.947926] Creating 5 MTD partitions on "physmap-flash": [ 0.953384] 0x000000000000-0x000000040000 : "u-boot" [ 0.960853] 0x000000040000-0x000000060000 : "u-boot-env" [ 0.968803] 0x000000060000-0x000001fc0000 : "firmware" [ 0.981859] 2 uimage-fw partitions found on MTD device firmware [ 0.987900] 0x000000060000-0x0000001b5706 : "kernel" [ 0.994916] 0x0000001b5706-0x000001fc0000 : "rootfs" [ 1.001986] mtd: device 4 (rootfs) set to be root filesystem [ 1.007789] 1 squashfs-split partitions found on MTD device rootfs [ 1.014014] 0x0000003c0000-0x000001fc0000 : "rootfs_data" [ 1.022093] 0x000001fc0000-0x000001fe0000 : "user_property" [ 1.030283] 0x000001fe0000-0x000002000000 : "art"
Maybe you could post links to forum thread, and data sheet.
Also chip_good() is used to check if the write is succeeded and
it
was
implemented by the commit fb4a90bfcd6d8 ("[MTD] CFI-0002 - Improve
error
checking"). But actually the write failure is caused on some platforms and also
it can
be fixed by using chip_good() to check the state and retry instead.
Do you know on which NOR chips this happens? Do you have access to the datasheet?
But it looks SST49LF008A [3] from the changes below but I am not sure at this moment and probably it should be confirmed to the authr Eric W. Biedermann ebiederman@lnxi.com to make sure.
+#define SST49LF008A 0x005a
static int cfi_amdstd_read (struct mtd_info *, loff_t, size_t, size_t *, u_char *); static int cfi_amdstd_write_words(struct mtd_info *, loff_t, size_t, size_t *, const u_char *); @@ -191,6 +192,7 @@ static struct cfi_fixup cfi_fixup_table[] = { }; static struct cfi_fixup jedec_fixup_table[] = { { MANUFACTURER_SST, SST49LF004B, fixup_use_fwh_lock, NULL, }, + { MANUFACTURER_SST, SST49LF008A, fixup_use_fwh_lock, NULL, },
It is depended on the actual flash chip behavior so the root cause
is
unknown.
Yes, and that's what I'd like you to figure out, or at least have a good idea why this doesn't work on some chips but works on others.
I see. But it is a little bit difficult situation since I do not have the failure environment locally at this moment. But if needed I may ask to get the help for this to Fabio-san.
If any comment please let me know.
Signed-off-by: Tokunori Ikegami ikegami@allied-telesis.co.jp Signed-off-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Koen Vandeputte koen.vandeputte@ncentric.com Signed-off-by: Fabio Bettoni fbettoni@gmail.com
Has the patch really gone through all those people? SoB is used when
you
apply a patch in your tree or when you're the original author.
I have just checked the OpenWRT git log again and it looks that it was
originally
implemented by Felix Fietkau nbd@openwrt.org by the patch below so
I
will update the Signed-off-by tag as so.
<https://git.openwrt.org/?p=openwrt/openwrt.git%3Ba=commitdiff%3Bh=2530640
f07cd2b3b14fe9ec03fa63a586452cc5f>
Co-Developed-by: Hauke Mehrtens hauke@hauke-m.de Co-Developed-by: Koen Vandeputte koen.vandeputte@ncentric.com Co-Developed-by: Fabio Bettoni fbettoni@gmail.com
Not sure we want to add new undocumented tags, but you can mention that all those people helped you find/debug the issue. They can also add their Reviewed-by/Tested-by if they like.
My bad, I just noticed these are valid flags [2], so you can keep them, and according to the doc, you should also keep the SoB.
I see. Yes I had also checked it.
By the way in near future my company email address will be not able to use. So I will change the mail address to my personal email address [4] after that or before.
Regards, Ikegami
Regards,
Boris
[2]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
tree/Documentation/process/submitting-patches.rst?h=v4.20-rc1#n546
[3]https://www.microchip.com/wwwproducts/en/SST49LF008A [4]ikegami_to@yahoo.co.jp
linux-stable-mirror@lists.linaro.org