The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 40c36e2741d7fe1e66d6ec55477ba5fd19c9c5d2 Mon Sep 17 00:00:00 2001
From: Tony Luck <tony.luck(a)intel.com>
Date: Fri, 22 Jun 2018 11:54:23 +0200
Subject: [PATCH] x86/mce: Fix incorrect "Machine check from unknown source"
message
Some injection testing resulted in the following console log:
mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
Kernel panic - not syncing: Machine check from unknown source
This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".
The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.
It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.
We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).
[ bp: Remove unneeded severity assignment. ]
Signed-off-by: Tony Luck <tony.luck(a)intel.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo(a)intel.com>
Cc: linux-edac <linux-edac(a)vger.kernel.org>
Cc: stable(a)vger.kernel.org # 4.2
Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.152728389…
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 7e6f51a9d917..e93670d736a6 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1207,13 +1207,18 @@ void do_machine_check(struct pt_regs *regs, long error_code)
lmce = m.mcgstatus & MCG_STATUS_LMCES;
/*
+ * Local machine check may already know that we have to panic.
+ * Broadcast machine check begins rendezvous in mce_start()
* Go through all banks in exclusion of the other CPUs. This way we
* don't report duplicated events on shared banks because the first one
- * to see it will clear it. If this is a Local MCE, then no need to
- * perform rendezvous.
+ * to see it will clear it.
*/
- if (!lmce)
+ if (lmce) {
+ if (no_way_out)
+ mce_panic("Fatal local machine check", &m, msg);
+ } else {
order = mce_start(&no_way_out);
+ }
for (i = 0; i < cfg->banks; i++) {
__clear_bit(i, toclear);
@@ -1289,12 +1294,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
no_way_out = worst >= MCE_PANIC_SEVERITY;
} else {
/*
- * Local MCE skipped calling mce_reign()
- * If we found a fatal error, we need to panic here.
+ * If there was a fatal machine check we should have
+ * already called mce_panic earlier in this function.
+ * Since we re-read the banks, we might have found
+ * something new. Check again to see if we found a
+ * fatal error. We call "mce_severity()" again to
+ * make sure we have the right "msg".
*/
- if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
- mce_panic("Machine check from unknown source",
- NULL, NULL);
+ if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
+ mce_severity(&m, cfg->tolerant, &msg, true);
+ mce_panic("Local fatal machine check!", &m, msg);
+ }
}
/*
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Hi Greg,
Pleae pull commits for Linux 3.18 .
I've sent a review request for all commits over a week ago and all
comments were addressed.
Thanks,
Sasha
=====
The following changes since commit b0b357c20ca6171b8ac698351f5202402b7ad7d5:
Linux 3.18.112 (2018-05-30 22:08:04 +0200)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git tags/for-greg-3.18-20062018
for you to fetch changes up to b73b55ee0dd09bb24fa6a4fee94b21d97dbc51ba:
net/sonic: Use dma_mapping_error() (2018-06-07 15:40:45 -0400)
- ----------------------------------------------------------------
for-greg-3.18-20062018
- ----------------------------------------------------------------
Finn Thain (1):
net/sonic: Use dma_mapping_error()
Ivan Bornyakov (1):
atm: zatm: fix memcmp casting
Josh Hill (1):
net: qmi_wwan: Add Netgear Aircard 779S
Paolo Abeni (1):
netfilter: ebtables: handle string from userspace with care
drivers/atm/zatm.c | 4 ++--
drivers/net/ethernet/natsemi/sonic.c | 2 +-
drivers/net/usb/qmi_wwan.c | 1 +
net/bridge/netfilter/ebtables.c | 3 ++-
4 files changed, 6 insertions(+), 4 deletions(-)
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAlsrD34ACgkQ3qZv95d3
LNxyQA//Td2TjwiNv4O7a5DTBTY11L1m/4aM/Sg9AR3yUtykiIfatVi5rAwdKzXl
ZBhutqVj7eK7STEyNfvqT0p/qwzoXWQUMSq0BpaH+D6Vfcvo3WNfXSdwODVsYy7D
d8N814Libu3zUHj4x+KXp2og8SCKqAYNgyBhd45bnrRUVbq/UQPcZsXqDSUIX6WK
RHkGU96VogA/4VEdD8Tiv9wX86L4PTQwve+GpYdxRc6Cc5rAuu3tS1BQgl50nfSU
MSgorHcL+RO9wWu/Ij6L1SDOgNfxrk9b9DPt9Jahm5NeF+et5t/vLuJ804osJRYE
dbXk6NQ+8EMQmgCOdDKUZhiHjN6ztk2NMsqmQb1z9hsPB+5gmmPMP2vZVA0BUyjl
UZVQcRwYC1+s/bOb/uqhvBD4DH5pj76hMMkz10rR8pAvVRCvoBGFU9uwtSlu3gZo
24xHREUUSrpecc61tUq3QyomzlUZUQgyvkU8GnqD+2vhiJfsuyZ33AfjVmwEl+Vn
37qgenBJLBKhXzDxOMqg+xg0mUGo63CQeIx2ERSp3DiTHOElOWJINu5MVZQYKC8w
rxSBacxj+5w24IM/OxNpWZ3PAoUSlAAvmkPiQJ5Aavw0StpqvNHN+DElmLf6eVrl
2Q5ZnkFhUOB2xSRkCYpkRhq9GZWGLkWjn4ePToil4z1+X13T6Zs=
=mpuV
-----END PGP SIGNATURE-----
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 781932375ffc6411713ee0926ccae8596ed0261c Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard(a)nod.at>
Date: Mon, 28 May 2018 22:04:32 +0200
Subject: [PATCH] ubi: fastmap: Correctly handle interrupted erasures in EBA
Fastmap cannot track the LEB unmap operation, therefore it can
happen that after an interrupted erasure the mapping still looks
good from Fastmap's point of view, while reading from the PEB will
cause an ECC error and confuses the upper layer.
Instead of teaching users of UBI how to deal with that, we read back
the VID header and check for errors. If the PEB is empty or shows ECC
errors we fixup the mapping and schedule the PEB for erasure.
Fixes: dbb7d2a88d2a ("UBI: Add fastmap core")
Cc: <stable(a)vger.kernel.org>
Reported-by: martin bayern <Martinbayern(a)outlook.com>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
index 250e30fac61b..593a4f9d97e3 100644
--- a/drivers/mtd/ubi/eba.c
+++ b/drivers/mtd/ubi/eba.c
@@ -490,6 +490,82 @@ int ubi_eba_unmap_leb(struct ubi_device *ubi, struct ubi_volume *vol,
return err;
}
+#ifdef CONFIG_MTD_UBI_FASTMAP
+/**
+ * check_mapping - check and fixup a mapping
+ * @ubi: UBI device description object
+ * @vol: volume description object
+ * @lnum: logical eraseblock number
+ * @pnum: physical eraseblock number
+ *
+ * Checks whether a given mapping is valid. Fastmap cannot track LEB unmap
+ * operations, if such an operation is interrupted the mapping still looks
+ * good, but upon first read an ECC is reported to the upper layer.
+ * Normaly during the full-scan at attach time this is fixed, for Fastmap
+ * we have to deal with it while reading.
+ * If the PEB behind a LEB shows this symthom we change the mapping to
+ * %UBI_LEB_UNMAPPED and schedule the PEB for erasure.
+ *
+ * Returns 0 on success, negative error code in case of failure.
+ */
+static int check_mapping(struct ubi_device *ubi, struct ubi_volume *vol, int lnum,
+ int *pnum)
+{
+ int err;
+ struct ubi_vid_io_buf *vidb;
+
+ if (!ubi->fast_attach)
+ return 0;
+
+ vidb = ubi_alloc_vid_buf(ubi, GFP_NOFS);
+ if (!vidb)
+ return -ENOMEM;
+
+ err = ubi_io_read_vid_hdr(ubi, *pnum, vidb, 0);
+ if (err > 0 && err != UBI_IO_BITFLIPS) {
+ int torture = 0;
+
+ switch (err) {
+ case UBI_IO_FF:
+ case UBI_IO_FF_BITFLIPS:
+ case UBI_IO_BAD_HDR:
+ case UBI_IO_BAD_HDR_EBADMSG:
+ break;
+ default:
+ ubi_assert(0);
+ }
+
+ if (err == UBI_IO_BAD_HDR_EBADMSG || err == UBI_IO_FF_BITFLIPS)
+ torture = 1;
+
+ down_read(&ubi->fm_eba_sem);
+ vol->eba_tbl->entries[lnum].pnum = UBI_LEB_UNMAPPED;
+ up_read(&ubi->fm_eba_sem);
+ ubi_wl_put_peb(ubi, vol->vol_id, lnum, *pnum, torture);
+
+ *pnum = UBI_LEB_UNMAPPED;
+ } else if (err < 0) {
+ ubi_err(ubi, "unable to read VID header back from PEB %i: %i",
+ *pnum, err);
+
+ goto out_free;
+ }
+
+ err = 0;
+
+out_free:
+ ubi_free_vid_buf(vidb);
+
+ return err;
+}
+#else
+static int check_mapping(struct ubi_device *ubi, struct ubi_volume *vol, int lnum,
+ int *pnum)
+{
+ return 0;
+}
+#endif
+
/**
* ubi_eba_read_leb - read data.
* @ubi: UBI device description object
@@ -522,7 +598,13 @@ int ubi_eba_read_leb(struct ubi_device *ubi, struct ubi_volume *vol, int lnum,
return err;
pnum = vol->eba_tbl->entries[lnum].pnum;
- if (pnum < 0) {
+ if (pnum >= 0) {
+ err = check_mapping(ubi, vol, lnum, &pnum);
+ if (err < 0)
+ goto out_unlock;
+ }
+
+ if (pnum == UBI_LEB_UNMAPPED) {
/*
* The logical eraseblock is not mapped, fill the whole buffer
* with 0xFF bytes. The exception is static volumes for which
@@ -930,6 +1012,12 @@ int ubi_eba_write_leb(struct ubi_device *ubi, struct ubi_volume *vol, int lnum,
return err;
pnum = vol->eba_tbl->entries[lnum].pnum;
+ if (pnum >= 0) {
+ err = check_mapping(ubi, vol, lnum, &pnum);
+ if (err < 0)
+ goto out;
+ }
+
if (pnum >= 0) {
dbg_eba("write %d bytes at offset %d of LEB %d:%d, PEB %d",
len, offset, vol_id, lnum, pnum);
The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 781932375ffc6411713ee0926ccae8596ed0261c Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard(a)nod.at>
Date: Mon, 28 May 2018 22:04:32 +0200
Subject: [PATCH] ubi: fastmap: Correctly handle interrupted erasures in EBA
Fastmap cannot track the LEB unmap operation, therefore it can
happen that after an interrupted erasure the mapping still looks
good from Fastmap's point of view, while reading from the PEB will
cause an ECC error and confuses the upper layer.
Instead of teaching users of UBI how to deal with that, we read back
the VID header and check for errors. If the PEB is empty or shows ECC
errors we fixup the mapping and schedule the PEB for erasure.
Fixes: dbb7d2a88d2a ("UBI: Add fastmap core")
Cc: <stable(a)vger.kernel.org>
Reported-by: martin bayern <Martinbayern(a)outlook.com>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
index 250e30fac61b..593a4f9d97e3 100644
--- a/drivers/mtd/ubi/eba.c
+++ b/drivers/mtd/ubi/eba.c
@@ -490,6 +490,82 @@ int ubi_eba_unmap_leb(struct ubi_device *ubi, struct ubi_volume *vol,
return err;
}
+#ifdef CONFIG_MTD_UBI_FASTMAP
+/**
+ * check_mapping - check and fixup a mapping
+ * @ubi: UBI device description object
+ * @vol: volume description object
+ * @lnum: logical eraseblock number
+ * @pnum: physical eraseblock number
+ *
+ * Checks whether a given mapping is valid. Fastmap cannot track LEB unmap
+ * operations, if such an operation is interrupted the mapping still looks
+ * good, but upon first read an ECC is reported to the upper layer.
+ * Normaly during the full-scan at attach time this is fixed, for Fastmap
+ * we have to deal with it while reading.
+ * If the PEB behind a LEB shows this symthom we change the mapping to
+ * %UBI_LEB_UNMAPPED and schedule the PEB for erasure.
+ *
+ * Returns 0 on success, negative error code in case of failure.
+ */
+static int check_mapping(struct ubi_device *ubi, struct ubi_volume *vol, int lnum,
+ int *pnum)
+{
+ int err;
+ struct ubi_vid_io_buf *vidb;
+
+ if (!ubi->fast_attach)
+ return 0;
+
+ vidb = ubi_alloc_vid_buf(ubi, GFP_NOFS);
+ if (!vidb)
+ return -ENOMEM;
+
+ err = ubi_io_read_vid_hdr(ubi, *pnum, vidb, 0);
+ if (err > 0 && err != UBI_IO_BITFLIPS) {
+ int torture = 0;
+
+ switch (err) {
+ case UBI_IO_FF:
+ case UBI_IO_FF_BITFLIPS:
+ case UBI_IO_BAD_HDR:
+ case UBI_IO_BAD_HDR_EBADMSG:
+ break;
+ default:
+ ubi_assert(0);
+ }
+
+ if (err == UBI_IO_BAD_HDR_EBADMSG || err == UBI_IO_FF_BITFLIPS)
+ torture = 1;
+
+ down_read(&ubi->fm_eba_sem);
+ vol->eba_tbl->entries[lnum].pnum = UBI_LEB_UNMAPPED;
+ up_read(&ubi->fm_eba_sem);
+ ubi_wl_put_peb(ubi, vol->vol_id, lnum, *pnum, torture);
+
+ *pnum = UBI_LEB_UNMAPPED;
+ } else if (err < 0) {
+ ubi_err(ubi, "unable to read VID header back from PEB %i: %i",
+ *pnum, err);
+
+ goto out_free;
+ }
+
+ err = 0;
+
+out_free:
+ ubi_free_vid_buf(vidb);
+
+ return err;
+}
+#else
+static int check_mapping(struct ubi_device *ubi, struct ubi_volume *vol, int lnum,
+ int *pnum)
+{
+ return 0;
+}
+#endif
+
/**
* ubi_eba_read_leb - read data.
* @ubi: UBI device description object
@@ -522,7 +598,13 @@ int ubi_eba_read_leb(struct ubi_device *ubi, struct ubi_volume *vol, int lnum,
return err;
pnum = vol->eba_tbl->entries[lnum].pnum;
- if (pnum < 0) {
+ if (pnum >= 0) {
+ err = check_mapping(ubi, vol, lnum, &pnum);
+ if (err < 0)
+ goto out_unlock;
+ }
+
+ if (pnum == UBI_LEB_UNMAPPED) {
/*
* The logical eraseblock is not mapped, fill the whole buffer
* with 0xFF bytes. The exception is static volumes for which
@@ -930,6 +1012,12 @@ int ubi_eba_write_leb(struct ubi_device *ubi, struct ubi_volume *vol, int lnum,
return err;
pnum = vol->eba_tbl->entries[lnum].pnum;
+ if (pnum >= 0) {
+ err = check_mapping(ubi, vol, lnum, &pnum);
+ if (err < 0)
+ goto out;
+ }
+
if (pnum >= 0) {
dbg_eba("write %d bytes at offset %d of LEB %d:%d, PEB %d",
len, offset, vol_id, lnum, pnum);
commit e9893e6fa932f42c90c4ac5849fa9aa0f0f00a34 upstream.
Positive return value from read_oob() is making false BAD
blocks. For some of the NAND controllers, OOB bytes will be
protected with ECC and read_oob() will return number of bitflips.
If there is any bitflip in ECC protected OOB bytes for BAD block
status page, then that block is getting treated as BAD.
Fixes: c120e75e0e7d ("mtd: nand: use read_oob() instead of cmdfunc() for bad block check")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Abhishek Sahu <absahu(a)codeaurora.org>
Reviewed-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
[backported to 4.14.y]
Signed-off-by: Abhishek Sahu <absahu(a)codeaurora.org>
---
This is backported patch for failed patch mentioned in
https://www.spinics.net/lists/stable/msg245833.html
The failure happened due to file rename.
drivers/mtd/nand/nand_base.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 528e04f..d410de3 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -440,7 +440,7 @@ static int nand_block_bad(struct mtd_info *mtd, loff_t ofs)
for (; page < page_end; page++) {
res = chip->ecc.read_oob(mtd, chip, page);
- if (res)
+ if (res < 0)
return res;
bad = chip->oob_poi[chip->badblockpos];
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc.
is a member of Code Aurora Forum, hosted by The Linux Foundation
Hi Greg,
please apply the attched backported patch to the 4.9 stable tree.
It is needed to boot Xen PV guests (broken since commit
c43b4ff972a986c85bdd8dc1aa05fe23b29ef99c which I didn't realize
due to my kernel parameters when testing).
Juergen
Hi Greg,
subject "ARM: dts: imx6q: Use correct SDMA script for SPI5 core"
commit df07101e1c4a29e820df02f9989a066988b160e6 upstream.
Please apply to v4.17.x, v4.14.x, v4.9.x, v4.4.x.
I forgot to write CC: <stable(a)vger.kernel.org> in the commit msg.
I hope this can go in the stable trees.
Please let me know if the commit was in the queue already and I had done
everything correctly. :-)
BR
/Sean