This is a note to let you know that I've just added the patch titled
mtdchar: fix usage of mtd_ooblayout_ecc()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mtdchar-fix-usage-of-mtd_ooblayout_ecc.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6de564939e14327148e31ddcf769e34105176447 Mon Sep 17 00:00:00 2001
From: OuYang ZhiZhong <ouyzz(a)yealink.com>
Date: Sun, 11 Mar 2018 15:59:07 +0800
Subject: mtdchar: fix usage of mtd_ooblayout_ecc()
From: OuYang ZhiZhong <ouyzz(a)yealink.com>
commit 6de564939e14327148e31ddcf769e34105176447 upstream.
Section was not properly computed. The value of OOB region definition is
always ECC section 0 information in the OOB area, but we want to get all
the ECC bytes information, so we should call
mtd_ooblayout_ecc(mtd, section++, &oobregion) until it returns -ERANGE.
Fixes: c2b78452a9db ("mtd: use mtd_ooblayout_xxx() helpers where appropriate")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: OuYang ZhiZhong <ouyzz(a)yealink.com>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/mtd/mtdchar.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -487,7 +487,7 @@ static int shrink_ecclayout(struct mtd_i
for (i = 0; i < MTD_MAX_ECCPOS_ENTRIES;) {
u32 eccpos;
- ret = mtd_ooblayout_ecc(mtd, section, &oobregion);
+ ret = mtd_ooblayout_ecc(mtd, section++, &oobregion);
if (ret < 0) {
if (ret != -ERANGE)
return ret;
@@ -534,7 +534,7 @@ static int get_oobinfo(struct mtd_info *
for (i = 0; i < ARRAY_SIZE(to->eccpos);) {
u32 eccpos;
- ret = mtd_ooblayout_ecc(mtd, section, &oobregion);
+ ret = mtd_ooblayout_ecc(mtd, section++, &oobregion);
if (ret < 0) {
if (ret != -ERANGE)
return ret;
Patches currently in stable-queue which might be from ouyzz(a)yealink.com are
queue-4.14/mtdchar-fix-usage-of-mtd_ooblayout_ecc.patch
This is a note to let you know that I've just added the patch titled
mtd: nand: fsl_ifc: Read ECCSTAT0 and ECCSTAT1 registers for IFC 2.0
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mtd-nand-fsl_ifc-read-eccstat0-and-eccstat1-registers-for-ifc-2.0.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6b00c35138b404be98b85f4a703be594cbed501c Mon Sep 17 00:00:00 2001
From: Jagdish Gediya <jagdish.gediya(a)nxp.com>
Date: Thu, 22 Mar 2018 01:08:10 +0530
Subject: mtd: nand: fsl_ifc: Read ECCSTAT0 and ECCSTAT1 registers for IFC 2.0
From: Jagdish Gediya <jagdish.gediya(a)nxp.com>
commit 6b00c35138b404be98b85f4a703be594cbed501c upstream.
Due to missing information in Hardware manual, current
implementation doesn't read ECCSTAT0 and ECCSTAT1 registers
for IFC 2.0.
Add support to read ECCSTAT0 and ECCSTAT1 registers during
ecccheck for IFC 2.0.
Fixes: 656441478ed5 ("mtd: nand: ifc: Fix location of eccstat registers for IFC V1.0")
Cc: stable(a)vger.kernel.org # v3.18+
Signed-off-by: Jagdish Gediya <jagdish.gediya(a)nxp.com>
Reviewed-by: Prabhakar Kushwaha <prabhakar.kushwaha(a)nxp.com>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/mtd/nand/fsl_ifc_nand.c | 6 +-----
include/linux/fsl_ifc.h | 6 +-----
2 files changed, 2 insertions(+), 10 deletions(-)
--- a/drivers/mtd/nand/fsl_ifc_nand.c
+++ b/drivers/mtd/nand/fsl_ifc_nand.c
@@ -227,11 +227,7 @@ static void fsl_ifc_run_command(struct m
int sector_end = sector_start + chip->ecc.steps - 1;
__be32 *eccstat_regs;
- if (ctrl->version >= FSL_IFC_VERSION_2_0_0)
- eccstat_regs = ifc->ifc_nand.v2_nand_eccstat;
- else
- eccstat_regs = ifc->ifc_nand.v1_nand_eccstat;
-
+ eccstat_regs = ifc->ifc_nand.nand_eccstat;
eccstat = ifc_in32(&eccstat_regs[sector_start / 4]);
for (i = sector_start; i <= sector_end; i++) {
--- a/include/linux/fsl_ifc.h
+++ b/include/linux/fsl_ifc.h
@@ -734,11 +734,7 @@ struct fsl_ifc_nand {
u32 res19[0x10];
__be32 nand_fsr;
u32 res20;
- /* The V1 nand_eccstat is actually 4 words that overlaps the
- * V2 nand_eccstat.
- */
- __be32 v1_nand_eccstat[2];
- __be32 v2_nand_eccstat[6];
+ __be32 nand_eccstat[8];
u32 res21[0x1c];
__be32 nanndcr;
u32 res22[0x2];
Patches currently in stable-queue which might be from jagdish.gediya(a)nxp.com are
queue-4.14/mtd-nand-fsl_ifc-fix-nand-waitfunc-return-value.patch
queue-4.14/mtd-nand-fsl_ifc-fix-eccstat-array-overflow-for-ifc-ver-2.0.0.patch
queue-4.14/mtd-nand-fsl_ifc-read-eccstat0-and-eccstat1-registers-for-ifc-2.0.patch
This is a note to let you know that I've just added the patch titled
mtd: nand: fsl_ifc: Fix nand waitfunc return value
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mtd-nand-fsl_ifc-fix-nand-waitfunc-return-value.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From fa8e6d58c5bc260f4369c6699683d69695daed0a Mon Sep 17 00:00:00 2001
From: Jagdish Gediya <jagdish.gediya(a)nxp.com>
Date: Wed, 21 Mar 2018 04:31:36 +0530
Subject: mtd: nand: fsl_ifc: Fix nand waitfunc return value
From: Jagdish Gediya <jagdish.gediya(a)nxp.com>
commit fa8e6d58c5bc260f4369c6699683d69695daed0a upstream.
As per the IFC hardware manual, Most significant 2 bytes in
nand_fsr register are the outcome of NAND READ STATUS command.
So status value need to be shifted and aligned as per the nand
framework requirement.
Fixes: 82771882d960 ("NAND Machine support for Integrated Flash Controller")
Cc: stable(a)vger.kernel.org # v3.18+
Signed-off-by: Jagdish Gediya <jagdish.gediya(a)nxp.com>
Reviewed-by: Prabhakar Kushwaha <prabhakar.kushwaha(a)nxp.com>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/mtd/nand/fsl_ifc_nand.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/mtd/nand/fsl_ifc_nand.c
+++ b/drivers/mtd/nand/fsl_ifc_nand.c
@@ -626,6 +626,7 @@ static int fsl_ifc_wait(struct mtd_info
struct fsl_ifc_ctrl *ctrl = priv->ctrl;
struct fsl_ifc_runtime __iomem *ifc = ctrl->rregs;
u32 nand_fsr;
+ int status;
/* Use READ_STATUS command, but wait for the device to be ready */
ifc_out32((IFC_FIR_OP_CW0 << IFC_NAND_FIR0_OP0_SHIFT) |
@@ -640,12 +641,12 @@ static int fsl_ifc_wait(struct mtd_info
fsl_ifc_run_command(mtd);
nand_fsr = ifc_in32(&ifc->ifc_nand.nand_fsr);
-
+ status = nand_fsr >> 24;
/*
* The chip always seems to report that it is
* write-protected, even when it is not.
*/
- return nand_fsr | NAND_STATUS_WP;
+ return status | NAND_STATUS_WP;
}
/*
Patches currently in stable-queue which might be from jagdish.gediya(a)nxp.com are
queue-4.14/mtd-nand-fsl_ifc-fix-nand-waitfunc-return-value.patch
queue-4.14/mtd-nand-fsl_ifc-fix-eccstat-array-overflow-for-ifc-ver-2.0.0.patch
queue-4.14/mtd-nand-fsl_ifc-read-eccstat0-and-eccstat1-registers-for-ifc-2.0.patch
This is a note to let you know that I've just added the patch titled
mtd: nand: fsl_ifc: Fix eccstat array overflow for IFC ver >= 2.0.0
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mtd-nand-fsl_ifc-fix-eccstat-array-overflow-for-ifc-ver-2.0.0.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 843c3a59997f18060848b8632607dd04781b52d1 Mon Sep 17 00:00:00 2001
From: Jagdish Gediya <jagdish.gediya(a)nxp.com>
Date: Wed, 21 Mar 2018 05:51:46 +0530
Subject: mtd: nand: fsl_ifc: Fix eccstat array overflow for IFC ver >= 2.0.0
From: Jagdish Gediya <jagdish.gediya(a)nxp.com>
commit 843c3a59997f18060848b8632607dd04781b52d1 upstream.
Number of ECC status registers i.e. (ECCSTATx) has been increased in IFC
version 2.0.0 due to increase in SRAM size. This is causing eccstat
array to over flow.
So, replace eccstat array with u32 variable to make it fail-safe and
independent of number of ECC status registers or SRAM size.
Fixes: bccb06c353af ("mtd: nand: ifc: update bufnum mask for ver >= 2.0.0")
Cc: stable(a)vger.kernel.org # 3.18+
Signed-off-by: Prabhakar Kushwaha <prabhakar.kushwaha(a)nxp.com>
Signed-off-by: Jagdish Gediya <jagdish.gediya(a)nxp.com>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/mtd/nand/fsl_ifc_nand.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)
--- a/drivers/mtd/nand/fsl_ifc_nand.c
+++ b/drivers/mtd/nand/fsl_ifc_nand.c
@@ -173,14 +173,9 @@ static void set_addr(struct mtd_info *mt
/* returns nonzero if entire page is blank */
static int check_read_ecc(struct mtd_info *mtd, struct fsl_ifc_ctrl *ctrl,
- u32 *eccstat, unsigned int bufnum)
+ u32 eccstat, unsigned int bufnum)
{
- u32 reg = eccstat[bufnum / 4];
- int errors;
-
- errors = (reg >> ((3 - bufnum % 4) * 8)) & 15;
-
- return errors;
+ return (eccstat >> ((3 - bufnum % 4) * 8)) & 15;
}
/*
@@ -193,7 +188,7 @@ static void fsl_ifc_run_command(struct m
struct fsl_ifc_ctrl *ctrl = priv->ctrl;
struct fsl_ifc_nand_ctrl *nctrl = ifc_nand_ctrl;
struct fsl_ifc_runtime __iomem *ifc = ctrl->rregs;
- u32 eccstat[4];
+ u32 eccstat;
int i;
/* set the chip select for NAND Transaction */
@@ -228,8 +223,8 @@ static void fsl_ifc_run_command(struct m
if (nctrl->eccread) {
int errors;
int bufnum = nctrl->page & priv->bufnum_mask;
- int sector = bufnum * chip->ecc.steps;
- int sector_end = sector + chip->ecc.steps - 1;
+ int sector_start = bufnum * chip->ecc.steps;
+ int sector_end = sector_start + chip->ecc.steps - 1;
__be32 *eccstat_regs;
if (ctrl->version >= FSL_IFC_VERSION_2_0_0)
@@ -237,10 +232,12 @@ static void fsl_ifc_run_command(struct m
else
eccstat_regs = ifc->ifc_nand.v1_nand_eccstat;
- for (i = sector / 4; i <= sector_end / 4; i++)
- eccstat[i] = ifc_in32(&eccstat_regs[i]);
+ eccstat = ifc_in32(&eccstat_regs[sector_start / 4]);
+
+ for (i = sector_start; i <= sector_end; i++) {
+ if (i != sector_start && !(i % 4))
+ eccstat = ifc_in32(&eccstat_regs[i / 4]);
- for (i = sector; i <= sector_end; i++) {
errors = check_read_ecc(mtd, ctrl, eccstat, i);
if (errors == 15) {
Patches currently in stable-queue which might be from jagdish.gediya(a)nxp.com are
queue-4.14/mtd-nand-fsl_ifc-fix-nand-waitfunc-return-value.patch
queue-4.14/mtd-nand-fsl_ifc-fix-eccstat-array-overflow-for-ifc-ver-2.0.0.patch
queue-4.14/mtd-nand-fsl_ifc-read-eccstat0-and-eccstat1-registers-for-ifc-2.0.patch
This is a note to let you know that I've just added the patch titled
mm/vmscan: wake up flushers for legacy cgroups too
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-vmscan-wake-up-flushers-for-legacy-cgroups-too.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1c610d5f93c709df56787f50b3576704ac271826 Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Date: Thu, 22 Mar 2018 16:17:42 -0700
Subject: mm/vmscan: wake up flushers for legacy cgroups too
From: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
commit 1c610d5f93c709df56787f50b3576704ac271826 upstream.
Commit 726d061fbd36 ("mm: vmscan: kick flushers when we encounter dirty
pages on the LRU") added flusher invocation to shrink_inactive_list()
when many dirty pages on the LRU are encountered.
However, shrink_inactive_list() doesn't wake up flushers for legacy
cgroup reclaim, so the next commit bbef938429f5 ("mm: vmscan: remove old
flusher wakeup from direct reclaim path") removed the only source of
flusher's wake up in legacy mem cgroup reclaim path.
This leads to premature OOM if there is too many dirty pages in cgroup:
# mkdir /sys/fs/cgroup/memory/test
# echo $$ > /sys/fs/cgroup/memory/test/tasks
# echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
# dd if=/dev/zero of=tmp_file bs=1M count=100
Killed
dd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
Call Trace:
dump_stack+0x46/0x65
dump_header+0x6b/0x2ac
oom_kill_process+0x21c/0x4a0
out_of_memory+0x2a5/0x4b0
mem_cgroup_out_of_memory+0x3b/0x60
mem_cgroup_oom_synchronize+0x2ed/0x330
pagefault_out_of_memory+0x24/0x54
__do_page_fault+0x521/0x540
page_fault+0x45/0x50
Task in /test killed as a result of limit of /test
memory: usage 51200kB, limit 51200kB, failcnt 73
memory+swap: usage 51200kB, limit 9007199254740988kB, failcnt 0
kmem: usage 296kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /test: cache:49632KB rss:1056KB rss_huge:0KB shmem:0KB
mapped_file:0KB dirty:49500KB writeback:0KB swap:0KB inactive_anon:0KB
active_anon:1168KB inactive_file:24760KB active_file:24960KB unevictable:0KB
Memory cgroup out of memory: Kill process 3861 (bash) score 88 or sacrifice child
Killed process 3876 (dd) total-vm:8484kB, anon-rss:1052kB, file-rss:1720kB, shmem-rss:0kB
oom_reaper: reaped process 3876 (dd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Wake up flushers in legacy cgroup reclaim too.
Link: http://lkml.kernel.org/r/20180315164553.17856-1-aryabinin@virtuozzo.com
Fixes: bbef938429f5 ("mm: vmscan: remove old flusher wakeup from direct reclaim path")
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Tested-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Michal Hocko <mhocko(a)suse.cz>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/vmscan.c | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1847,6 +1847,20 @@ shrink_inactive_list(unsigned long nr_to
set_bit(PGDAT_WRITEBACK, &pgdat->flags);
/*
+ * If dirty pages are scanned that are not queued for IO, it
+ * implies that flushers are not doing their job. This can
+ * happen when memory pressure pushes dirty pages to the end of
+ * the LRU before the dirty limits are breached and the dirty
+ * data has expired. It can also happen when the proportion of
+ * dirty pages grows not through writes but through memory
+ * pressure reclaiming all the clean cache. And in some cases,
+ * the flushers simply cannot keep up with the allocation
+ * rate. Nudge the flusher threads in case they are asleep.
+ */
+ if (stat.nr_unqueued_dirty == nr_taken)
+ wakeup_flusher_threads(0, WB_REASON_VMSCAN);
+
+ /*
* Legacy memcg will stall in page writeback so avoid forcibly
* stalling here.
*/
@@ -1858,22 +1872,9 @@ shrink_inactive_list(unsigned long nr_to
if (stat.nr_dirty && stat.nr_dirty == stat.nr_congested)
set_bit(PGDAT_CONGESTED, &pgdat->flags);
- /*
- * If dirty pages are scanned that are not queued for IO, it
- * implies that flushers are not doing their job. This can
- * happen when memory pressure pushes dirty pages to the end of
- * the LRU before the dirty limits are breached and the dirty
- * data has expired. It can also happen when the proportion of
- * dirty pages grows not through writes but through memory
- * pressure reclaiming all the clean cache. And in some cases,
- * the flushers simply cannot keep up with the allocation
- * rate. Nudge the flusher threads in case they are asleep, but
- * also allow kswapd to start writing pages during reclaim.
- */
- if (stat.nr_unqueued_dirty == nr_taken) {
- wakeup_flusher_threads(0, WB_REASON_VMSCAN);
+ /* Allow kswapd to start writing pages during reclaim. */
+ if (stat.nr_unqueued_dirty == nr_taken)
set_bit(PGDAT_DIRTY, &pgdat->flags);
- }
/*
* If kswapd scans pages marked marked for immediate
Patches currently in stable-queue which might be from aryabinin(a)virtuozzo.com are
queue-4.14/mm-vmscan-wake-up-flushers-for-legacy-cgroups-too.patch
This is a note to let you know that I've just added the patch titled
mm/vmalloc: add interfaces to free unmapped page table
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-vmalloc-add-interfaces-to-free-unmapped-page-table.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b6bdb7517c3d3f41f20e5c2948d6bc3f8897394e Mon Sep 17 00:00:00 2001
From: Toshi Kani <toshi.kani(a)hpe.com>
Date: Thu, 22 Mar 2018 16:17:20 -0700
Subject: mm/vmalloc: add interfaces to free unmapped page table
From: Toshi Kani <toshi.kani(a)hpe.com>
commit b6bdb7517c3d3f41f20e5c2948d6bc3f8897394e upstream.
On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
create pud/pmd mappings. A kernel panic was observed on arm64 systems
with Cortex-A75 in the following steps as described by Hanjun Guo.
1. ioremap a 4K size, valid page table will build,
2. iounmap it, pte0 will set to 0;
3. ioremap the same address with 2M size, pgd/pmd is unchanged,
then set the a new value for pmd;
4. pte0 is leaked;
5. CPU may meet exception because the old pmd is still in TLB,
which will lead to kernel panic.
This panic is not reproducible on x86. INVLPG, called from iounmap,
purges all levels of entries associated with purged address on x86. x86
still has memory leak.
The patch changes the ioremap path to free unmapped page table(s) since
doing so in the unmap path has the following issues:
- The iounmap() path is shared with vunmap(). Since vmap() only
supports pte mappings, making vunmap() to free a pte page is an
overhead for regular vmap users as they do not need a pte page freed
up.
- Checking if all entries in a pte page are cleared in the unmap path
is racy, and serializing this check is expensive.
- The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
purge.
Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
clear a given pud/pmd entry and free up a page for the lower level
entries.
This patch implements their stub functions on x86 and arm64, which work
as workaround.
[akpm(a)linux-foundation.org: fix typo in pmd_free_pte_page() stub]
Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings")
Reported-by: Lei Li <lious.lilei(a)hisilicon.com>
Signed-off-by: Toshi Kani <toshi.kani(a)hpe.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Wang Xuefeng <wxf.wang(a)hisilicon.com>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: Hanjun Guo <guohanjun(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Chintan Pandya <cpandya(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm64/mm/mmu.c | 10 ++++++++++
arch/x86/mm/pgtable.c | 24 ++++++++++++++++++++++++
include/asm-generic/pgtable.h | 10 ++++++++++
lib/ioremap.c | 6 ++++--
4 files changed, 48 insertions(+), 2 deletions(-)
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -937,3 +937,13 @@ int pmd_clear_huge(pmd_t *pmd)
pmd_clear(pmd);
return 1;
}
+
+int pud_free_pmd_page(pud_t *pud)
+{
+ return pud_none(*pud);
+}
+
+int pmd_free_pte_page(pmd_t *pmd)
+{
+ return pmd_none(*pmd);
+}
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -702,4 +702,28 @@ int pmd_clear_huge(pmd_t *pmd)
return 0;
}
+
+/**
+ * pud_free_pmd_page - Clear pud entry and free pmd page.
+ * @pud: Pointer to a PUD.
+ *
+ * Context: The pud range has been unmaped and TLB purged.
+ * Return: 1 if clearing the entry succeeded. 0 otherwise.
+ */
+int pud_free_pmd_page(pud_t *pud)
+{
+ return pud_none(*pud);
+}
+
+/**
+ * pmd_free_pte_page - Clear pmd entry and free pte page.
+ * @pmd: Pointer to a PMD.
+ *
+ * Context: The pmd range has been unmaped and TLB purged.
+ * Return: 1 if clearing the entry succeeded. 0 otherwise.
+ */
+int pmd_free_pte_page(pmd_t *pmd)
+{
+ return pmd_none(*pmd);
+}
#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -976,6 +976,8 @@ int pud_set_huge(pud_t *pud, phys_addr_t
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot);
int pud_clear_huge(pud_t *pud);
int pmd_clear_huge(pmd_t *pmd);
+int pud_free_pmd_page(pud_t *pud);
+int pmd_free_pte_page(pmd_t *pmd);
#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
static inline int p4d_set_huge(p4d_t *p4d, phys_addr_t addr, pgprot_t prot)
{
@@ -1001,6 +1003,14 @@ static inline int pmd_clear_huge(pmd_t *
{
return 0;
}
+static inline int pud_free_pmd_page(pud_t *pud)
+{
+ return 0;
+}
+static inline int pmd_free_pte_page(pmd_t *pmd)
+{
+ return 0;
+}
#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
#ifndef __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
--- a/lib/ioremap.c
+++ b/lib/ioremap.c
@@ -91,7 +91,8 @@ static inline int ioremap_pmd_range(pud_
if (ioremap_pmd_enabled() &&
((next - addr) == PMD_SIZE) &&
- IS_ALIGNED(phys_addr + addr, PMD_SIZE)) {
+ IS_ALIGNED(phys_addr + addr, PMD_SIZE) &&
+ pmd_free_pte_page(pmd)) {
if (pmd_set_huge(pmd, phys_addr + addr, prot))
continue;
}
@@ -117,7 +118,8 @@ static inline int ioremap_pud_range(p4d_
if (ioremap_pud_enabled() &&
((next - addr) == PUD_SIZE) &&
- IS_ALIGNED(phys_addr + addr, PUD_SIZE)) {
+ IS_ALIGNED(phys_addr + addr, PUD_SIZE) &&
+ pud_free_pmd_page(pud)) {
if (pud_set_huge(pud, phys_addr + addr, prot))
continue;
}
Patches currently in stable-queue which might be from toshi.kani(a)hpe.com are
queue-4.14/mm-vmalloc-add-interfaces-to-free-unmapped-page-table.patch
queue-4.14/x86-mm-implement-free-pmd-pte-page-interfaces.patch
This is a note to let you know that I've just added the patch titled
mm/thp: do not wait for lock_page() in deferred_split_scan()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-thp-do-not-wait-for-lock_page-in-deferred_split_scan.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From fa41b900c30b45fab03783724932dc30cd46a6be Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Date: Thu, 22 Mar 2018 16:17:31 -0700
Subject: mm/thp: do not wait for lock_page() in deferred_split_scan()
From: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
commit fa41b900c30b45fab03783724932dc30cd46a6be upstream.
deferred_split_scan() gets called from reclaim path. Waiting for page
lock may lead to deadlock there.
Replace lock_page() with trylock_page() and skip the page if we failed
to lock it. We will get to the page on the next scan.
Link: http://lkml.kernel.org/r/20180315150747.31945-1-kirill.shutemov@linux.intel…
Fixes: 9a982250f773 ("thp: introduce deferred_split_huge_page()")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/huge_memory.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2739,11 +2739,13 @@ static unsigned long deferred_split_scan
list_for_each_safe(pos, next, &list) {
page = list_entry((void *)pos, struct page, mapping);
- lock_page(page);
+ if (!trylock_page(page))
+ goto next;
/* split_huge_page() removes page from list on success */
if (!split_huge_page(page))
split++;
unlock_page(page);
+next:
put_page(page);
}
Patches currently in stable-queue which might be from kirill.shutemov(a)linux.intel.com are
queue-4.14/mm-khugepaged.c-convert-vm_bug_on-to-collapse-fail.patch
queue-4.14/mm-thp-do-not-wait-for-lock_page-in-deferred_split_scan.patch
queue-4.14/hugetlbfs-check-for-pgoff-value-overflow.patch
queue-4.14/mm-shmem-do-not-wait-for-lock_page-in-shmem_unused_huge_shrink.patch
This is a note to let you know that I've just added the patch titled
mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-shmem-do-not-wait-for-lock_page-in-shmem_unused_huge_shrink.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b3cd54b257ad95d344d121dc563d943ca39b0921 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Date: Thu, 22 Mar 2018 16:17:35 -0700
Subject: mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink()
From: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
commit b3cd54b257ad95d344d121dc563d943ca39b0921 upstream.
shmem_unused_huge_shrink() gets called from reclaim path. Waiting for
page lock may lead to deadlock there.
There was a bug report that may be attributed to this:
http://lkml.kernel.org/r/alpine.LRH.2.11.1801242349220.30642@mail.ewheeler.…
Replace lock_page() with trylock_page() and skip the page if we failed
to lock it. We will get to the page on the next scan.
We can test for the PageTransHuge() outside the page lock as we only
need protection against splitting the page under us. Holding pin oni
the page is enough for this.
Link: http://lkml.kernel.org/r/20180316210830.43738-1-kirill.shutemov@linux.intel…
Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: Eric Wheeler <linux-mm(a)lists.ewheeler.net>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/shmem.c | 31 ++++++++++++++++++++-----------
1 file changed, 20 insertions(+), 11 deletions(-)
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -493,36 +493,45 @@ next:
info = list_entry(pos, struct shmem_inode_info, shrinklist);
inode = &info->vfs_inode;
- if (nr_to_split && split >= nr_to_split) {
- iput(inode);
- continue;
- }
+ if (nr_to_split && split >= nr_to_split)
+ goto leave;
- page = find_lock_page(inode->i_mapping,
+ page = find_get_page(inode->i_mapping,
(inode->i_size & HPAGE_PMD_MASK) >> PAGE_SHIFT);
if (!page)
goto drop;
+ /* No huge page at the end of the file: nothing to split */
if (!PageTransHuge(page)) {
- unlock_page(page);
put_page(page);
goto drop;
}
+ /*
+ * Leave the inode on the list if we failed to lock
+ * the page at this time.
+ *
+ * Waiting for the lock may lead to deadlock in the
+ * reclaim path.
+ */
+ if (!trylock_page(page)) {
+ put_page(page);
+ goto leave;
+ }
+
ret = split_huge_page(page);
unlock_page(page);
put_page(page);
- if (ret) {
- /* split failed: leave it on the list */
- iput(inode);
- continue;
- }
+ /* If split failed leave the inode on the list */
+ if (ret)
+ goto leave;
split++;
drop:
list_del_init(&info->shrinklist);
removed++;
+leave:
iput(inode);
}
Patches currently in stable-queue which might be from kirill.shutemov(a)linux.intel.com are
queue-4.14/mm-khugepaged.c-convert-vm_bug_on-to-collapse-fail.patch
queue-4.14/mm-thp-do-not-wait-for-lock_page-in-deferred_split_scan.patch
queue-4.14/hugetlbfs-check-for-pgoff-value-overflow.patch
queue-4.14/mm-shmem-do-not-wait-for-lock_page-in-shmem_unused_huge_shrink.patch
This is a note to let you know that I've just added the patch titled
mm/khugepaged.c: convert VM_BUG_ON() to collapse fail
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-khugepaged.c-convert-vm_bug_on-to-collapse-fail.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From fece2029a9e65b9a990831afe2a2b83290cbbe26 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Date: Thu, 22 Mar 2018 16:17:28 -0700
Subject: mm/khugepaged.c: convert VM_BUG_ON() to collapse fail
From: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
commit fece2029a9e65b9a990831afe2a2b83290cbbe26 upstream.
khugepaged is not yet able to convert PTE-mapped huge pages back to PMD
mapped. We do not collapse such pages. See check
khugepaged_scan_pmd().
But if between khugepaged_scan_pmd() and __collapse_huge_page_isolate()
somebody managed to instantiate THP in the range and then split the PMD
back to PTEs we would have a problem --
VM_BUG_ON_PAGE(PageCompound(page)) will get triggered.
It's possible since we drop mmap_sem during collapse to re-take for
write.
Replace the VM_BUG_ON() with graceful collapse fail.
Link: http://lkml.kernel.org/r/20180315152353.27989-1-kirill.shutemov@linux.intel…
Fixes: b1caa957ae6d ("khugepaged: ignore pmd tables with THP mapped with ptes")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Laura Abbott <labbott(a)redhat.com>
Cc: Jerome Marchand <jmarchan(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/khugepaged.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -530,7 +530,12 @@ static int __collapse_huge_page_isolate(
goto out;
}
- VM_BUG_ON_PAGE(PageCompound(page), page);
+ /* TODO: teach khugepaged to collapse THP mapped with pte */
+ if (PageCompound(page)) {
+ result = SCAN_PAGE_COMPOUND;
+ goto out;
+ }
+
VM_BUG_ON_PAGE(!PageAnon(page), page);
/*
Patches currently in stable-queue which might be from kirill.shutemov(a)linux.intel.com are
queue-4.14/mm-khugepaged.c-convert-vm_bug_on-to-collapse-fail.patch
queue-4.14/mm-thp-do-not-wait-for-lock_page-in-deferred_split_scan.patch
queue-4.14/hugetlbfs-check-for-pgoff-value-overflow.patch
queue-4.14/mm-shmem-do-not-wait-for-lock_page-in-shmem_unused_huge_shrink.patch
This is a note to let you know that I've just added the patch titled
libnvdimm, {btt, blk}: do integrity setup before add_disk()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
libnvdimm-btt-blk-do-integrity-setup-before-add_disk.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3ffb0ba9b567a8efb9a04ed3d1ec15ff333ada22 Mon Sep 17 00:00:00 2001
From: Vishal Verma <vishal.l.verma(a)intel.com>
Date: Mon, 5 Mar 2018 16:56:13 -0700
Subject: libnvdimm, {btt, blk}: do integrity setup before add_disk()
From: Vishal Verma <vishal.l.verma(a)intel.com>
commit 3ffb0ba9b567a8efb9a04ed3d1ec15ff333ada22 upstream.
Prior to 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk")
we needed to temporarily add a zero-capacity disk before registering for
blk-integrity. But adding a zero-capacity disk caused the partition
table scanning to bail early, and this resulted in partitions not coming
up after a probe of the BTT or blk namespaces.
We can now register for integrity before the disk has been added, and
this fixes the rescan problems.
Fixes: 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk")
Reported-by: Dariusz Dokupil <dariusz.dokupil(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvdimm/blk.c | 3 +--
drivers/nvdimm/btt.c | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -278,8 +278,6 @@ static int nsblk_attach_disk(struct nd_n
disk->queue = q;
disk->flags = GENHD_FL_EXT_DEVT;
nvdimm_namespace_disk_name(&nsblk->common, disk->disk_name);
- set_capacity(disk, 0);
- device_add_disk(dev, disk);
if (devm_add_action_or_reset(dev, nd_blk_release_disk, disk))
return -ENOMEM;
@@ -292,6 +290,7 @@ static int nsblk_attach_disk(struct nd_n
}
set_capacity(disk, available_disk_size >> SECTOR_SHIFT);
+ device_add_disk(dev, disk);
revalidate_disk(disk);
return 0;
}
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1542,8 +1542,6 @@ static int btt_blk_init(struct btt *btt)
queue_flag_set_unlocked(QUEUE_FLAG_NONROT, btt->btt_queue);
btt->btt_queue->queuedata = btt;
- set_capacity(btt->btt_disk, 0);
- device_add_disk(&btt->nd_btt->dev, btt->btt_disk);
if (btt_meta_size(btt)) {
int rc = nd_integrity_init(btt->btt_disk, btt_meta_size(btt));
@@ -1555,6 +1553,7 @@ static int btt_blk_init(struct btt *btt)
}
}
set_capacity(btt->btt_disk, btt->nlba * btt->sector_size >> 9);
+ device_add_disk(&btt->nd_btt->dev, btt->btt_disk);
btt->nd_btt->size = btt->nlba * (u64)btt->sector_size;
revalidate_disk(btt->btt_disk);
Patches currently in stable-queue which might be from vishal.l.verma(a)intel.com are
queue-4.14/libnvdimm-btt-blk-do-integrity-setup-before-add_disk.patch