This is a note to let you know that I've just added the patch titled
mtd: nand: mtk: fix infinite ECC decode IRQ issue
to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git%3Ba=su...
The filename of the patch is: mtd-nand-mtk-fix-infinite-ecc-decode-irq-issue.patch and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree, please let stable@vger.kernel.org know about it.
From 1d2fcdcf33339c7c8016243de0f7f31cf6845e8d Mon Sep 17 00:00:00 2001
From: Xiaolei Li xiaolei.li@mediatek.com Date: Mon, 30 Oct 2017 10:39:56 +0800 Subject: mtd: nand: mtk: fix infinite ECC decode IRQ issue
From: Xiaolei Li xiaolei.li@mediatek.com
commit 1d2fcdcf33339c7c8016243de0f7f31cf6845e8d upstream.
For MT2701 NAND Controller, there may generate infinite ECC decode IRQ during long time burn test on some platforms. Once this issue occurred, the ECC decode IRQ status cannot be cleared in the IRQ handler function, and threads cannot be scheduled.
ECC HW generates decode IRQ each sector, so there will have more than one decode IRQ if read one page of large page NAND.
Currently, ECC IRQ handle flow is that we will check whether it is decode IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be cleared at the same time. Secondly, we will check whether all sectors are decoded by reading the register ECC_DECDONE. This is because the current IRQ may be not dealed in time, and the next sectors have been decoded before reading the register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not be generated. Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the next ECC IRQ.
But, there is a timing issue between step one and two. When we read the reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector, and the ECC IRQ signal is cleared. But the last sector is decoded before reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and it means we will receive one extra ECC IRQ later. In step three, we will find that all sectors were decoded, then disable ECC IRQ and return. When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared anymore. That is because the register ECC_DECIRQ_STA can only be cleared when the register ECC_IRQ_REG(op) is enabled. But actually we have disabled ECC IRQ in the previous ECC IRQ handle. So, there will keep receiving ECC decode IRQ.
Now, we read the register ECC_DECIRQ_STA once again before completing the ecc done event. This ensures that there will be no extra ECC decode IRQ.
Also, remove writel(0, ecc->regs + ECC_IRQ_REG(op)) from irq handler, because ECC IRQ is disabled in mtk_ecc_disable(). And clear ECC_DECIRQ_STA in mtk_ecc_disable() in case there is a timeout to wait decode IRQ.
Fixes: 1d6b1e464950 ("mtd: mediatek: driver for MTK Smart Device") Signed-off-by: Xiaolei Li xiaolei.li@mediatek.com Signed-off-by: Boris Brezillon boris.brezillon@free-electrons.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mtd/nand/mtk_ecc.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/drivers/mtd/nand/mtk_ecc.c +++ b/drivers/mtd/nand/mtk_ecc.c @@ -116,6 +116,11 @@ static irqreturn_t mtk_ecc_irq(int irq, op = ECC_DECODE; dec = readw(ecc->regs + ECC_DECDONE); if (dec & ecc->sectors) { + /* + * Clear decode IRQ status once again to ensure that + * there will be no extra IRQ. + */ + readw(ecc->regs + ECC_DECIRQ_STA); ecc->sectors = 0; complete(&ecc->done); } else { @@ -131,8 +136,6 @@ static irqreturn_t mtk_ecc_irq(int irq, } }
- writel(0, ecc->regs + ECC_IRQ_REG(op)); - return IRQ_HANDLED; }
@@ -342,6 +345,12 @@ void mtk_ecc_disable(struct mtk_ecc *ecc
/* disable it */ mtk_ecc_wait_idle(ecc, op); + if (op == ECC_DECODE) + /* + * Clear decode IRQ status in case there is a timeout to wait + * decode IRQ. + */ + readw(ecc->regs + ECC_DECIRQ_STA); writew(0, ecc->regs + ECC_IRQ_REG(op)); writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op));
Patches currently in stable-queue which might be from xiaolei.li@mediatek.com are
queue-4.9/mtd-nand-mtk-fix-infinite-ecc-decode-irq-issue.patch