Hi Christophe,
christophe.kerello@foss.st.com wrote on Fri, 9 Feb 2024 14:35:44 +0100:
Hi Miquel,
I am testing last nand/next branch with the MP1 board, and i get an issue since this patch was applied.
When I read the SLC NAND using nandump tool (reading page 0 and page 1), the OOB is not displayed at expected. For page 1, oob is displayed when for page 0 the first data of the page are displayed.
The nanddump command used is: nanddump -c -o -l 0x2000 /dev/mtd9
Page 0: OOB Data: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............| OOB Data: 03 00 28 00 01 00 00 00 a4 03 00 00 34 00 00 00 |..(.........4...| OOB Data: 7c 11 00 00 00 04 00 05 34 00 20 00 06 00 28 00 ||.......4. ...(.| OOB Data: 1b 00 1a 00 01 00 00 00 00 00 00 00 00 00 00 00 |................| OOB Data: 00 00 00 00 10 05 00 00 10 05 00 00 05 00 00 00 |................| OOB Data: 00 00 01 00 01 00 00 00 e8 0e 00 00 e8 0e 01 00 |................| OOB Data: e8 0e 01 00 44 01 00 00 48 01 00 00 06 00 00 00 |....D...H.......| OOB Data: 00 00 01 00 02 00 00 00 f0 0e 00 00 f0 0e 01 00 |................| OOB Data: f0 0e 01 00 10 01 00 00 10 01 00 00 06 00 00 00 |................| OOB Data: 04 00 00 00 04 00 00 00 f4 00 00 00 f4 00 00 00 |................| OOB Data: f4 00 00 00 44 00 00 00 44 00 00 00 04 00 00 00 |....D...D.......| OOB Data: 04 00 00 00 51 e5 74 64 00 00 00 00 00 00 00 00 |....Q.td........| OOB Data: 00 00 00 00 00 00 00 00 00 00 00 00 06 00 00 00 |................| OOB Data: 10 00 00 00 52 e5 74 64 e8 0e 00 00 e8 0e 01 00 |....R.td........|
Page 1: OOB Data: ff ff 94 25 8c 3c c7 44 e7 c0 b7 b0 92 5e 50 fb |...%.<.D.....^P.| OOB Data: 80 ca a3 de e2 73 b4 4e 58 39 fe b4 85 76 65 31 |.....s.NX9...ve1| OOB Data: 48 86 91 f3 58 0b 59 df 2c 08 75 8b 6f 48 36 a6 |H...X.Y.,.u.oH6.| OOB Data: bc 16 61 58 db 52 08 75 8b 6f 48 36 a6 bc 16 61 |..aX.R.u.oH6...a| OOB Data: 58 db 52 08 75 8b 6f 48 36 a6 bc 16 61 58 db 52 |X.R.u.oH6...aX.R| OOB Data: 08 75 8b 6f 48 36 a6 bc 16 61 58 db 52 08 75 8b |.u.oH6...aX.R.u.| OOB Data: 6f 48 36 a6 bc 16 61 58 db 52 ff ff ff ff ff ff |oH6...aX.R......| OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................| OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................| OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................| OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................| OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................| OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................| OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I have checked what is happening in rawnand_enable_cont_reads function, and for page 0, con_read.ongoing = true when for page 1 con_read.ongoing = false
page 0: [ 51.785623] rawnand_enable_cont_reads: page=0, col=0, readlen=4096, mtd->writesize=4096 [ 51.793751] rawnand_enable_cont_reads: end_page=1, end_col=0 [ 51.799356] rawnand_enable_cont_reads: con_read.ongoing=1
page 1: [ 53.493337] rawnand_enable_cont_reads: page=1, col=0, readlen=4096, mtd->writesize=4096 [ 53.501413] rawnand_enable_cont_reads: end_page=1, end_col=0 [ 53.507013] rawnand_enable_cont_reads: con_read.ongoing=0
I do not expect con_read.ongoing set to true when we read one page.
I have also dumped what happened when we read the bad block table and it is also strange for me in particular the value end_page.
[ 1.581940] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xd3 [ 1.581966] nand: Micron MT29F8G08ABACAH4 [ 1.581974] nand: 1024 MiB, SLC, erase size: 256 KiB, page size: 4096, OOB size: 224 [ 1.582379] rawnand_enable_cont_reads: page=262080, col=0, readlen=5, mtd->writesize=4096 [ 1.582411] rawnand_enable_cont_reads: end_page=0, end_col=5 [ 1.582419] rawnand_enable_cont_reads: con_read.ongoing=0 [ 1.585817] Bad block table found at page 262080, version 0x01 [ 1.585943] rawnand_enable_cont_reads: page=262080, col=0, readlen=5, mtd->writesize=4096 [ 1.585960] rawnand_enable_cont_reads: end_page=0, end_col=5 [ 1.585968] rawnand_enable_cont_reads: con_read.ongoing=0 [ 1.586677] rawnand_enable_cont_reads: page=262016, col=0, readlen=5, mtd->writesize=4096 [ 1.586700] rawnand_enable_cont_reads: end_page=0, end_col=5 [ 1.586708] rawnand_enable_cont_reads: con_read.ongoing=0 [ 1.587139] Bad block table found at page 262016, version 0x01 [ 1.587168] rawnand_enable_cont_reads: page=262081, col=5, readlen=1019, mtd->writesize=4096 [ 1.587181] rawnand_enable_cont_reads: end_page=0, end_col=1024 [ 1.587189] rawnand_enable_cont_reads: con_read.ongoing=0 [ 1.587672] rawnand_enable_cont_reads: page=262081, col=1024, readlen=5, mtd->writesize=4096 [ 1.587692] rawnand_enable_cont_reads: end_page=0, end_col=1029 [ 1.587700] rawnand_enable_cont_reads: con_read.ongoing=0
Interesting, I played with those corner cases in earlier tests but for this series I was focused on playing with filesystems and the fact that sometimes continuous read was very sporadically breaking, so I played with much more complex patterns but I don't remember checking these two basic cases again...
Sorry for the breakage, I will have a look and keep you updated. I believe the continuous read feature is fine per se, but the problem here is that there is a mismatch between the actual operation and the continuous read configuration on "top" of it, which should in these cases not be enabled at all.
I am away this week, I will look into this when I'm back.
Thanks for the useful report, Miquèl