The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From da7d554f7c62d0c17c1ac3cc2586473c2d99f0bd Mon Sep 17 00:00:00 2001
From: Alexander Aring <aahringo(a)redhat.com>
Date: Mon, 26 Oct 2020 10:52:29 -0400
Subject: [PATCH] gfs2: Wake up when sd_glock_disposal becomes zero
Commit fc0e38dae645 ("GFS2: Fix glock deallocation race") fixed a
sd_glock_disposal accounting bug by adding a missing atomic_dec
statement, but it failed to wake up sd_glock_wait when that decrement
causes sd_glock_disposal to reach zero. As a consequence,
gfs2_gl_hash_clear can now run into a 10-minute timeout instead of
being woken up. Add the missing wakeup.
Fixes: fc0e38dae645 ("GFS2: Fix glock deallocation race")
Cc: stable(a)vger.kernel.org # v2.6.39+
Signed-off-by: Alexander Aring <aahringo(a)redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 5441c17562c5..d98a2e5dab9f 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1078,7 +1078,8 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
out_free:
kfree(gl->gl_lksb.sb_lvbptr);
kmem_cache_free(cachep, gl);
- atomic_dec(&sdp->sd_glock_disposal);
+ if (atomic_dec_and_test(&sdp->sd_glock_disposal))
+ wake_up(&sdp->sd_glock_wait);
out:
return ret;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From da7d554f7c62d0c17c1ac3cc2586473c2d99f0bd Mon Sep 17 00:00:00 2001
From: Alexander Aring <aahringo(a)redhat.com>
Date: Mon, 26 Oct 2020 10:52:29 -0400
Subject: [PATCH] gfs2: Wake up when sd_glock_disposal becomes zero
Commit fc0e38dae645 ("GFS2: Fix glock deallocation race") fixed a
sd_glock_disposal accounting bug by adding a missing atomic_dec
statement, but it failed to wake up sd_glock_wait when that decrement
causes sd_glock_disposal to reach zero. As a consequence,
gfs2_gl_hash_clear can now run into a 10-minute timeout instead of
being woken up. Add the missing wakeup.
Fixes: fc0e38dae645 ("GFS2: Fix glock deallocation race")
Cc: stable(a)vger.kernel.org # v2.6.39+
Signed-off-by: Alexander Aring <aahringo(a)redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 5441c17562c5..d98a2e5dab9f 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1078,7 +1078,8 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
out_free:
kfree(gl->gl_lksb.sb_lvbptr);
kmem_cache_free(cachep, gl);
- atomic_dec(&sdp->sd_glock_disposal);
+ if (atomic_dec_and_test(&sdp->sd_glock_disposal))
+ wake_up(&sdp->sd_glock_wait);
out:
return ret;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3f08842098e842c51e3b97d0dcdebf810b32558e Mon Sep 17 00:00:00 2001
From: Shijie Luo <luoshijie1(a)huawei.com>
Date: Sun, 1 Nov 2020 17:07:40 -0800
Subject: [PATCH] mm: mempolicy: fix potential pte_unmap_unlock pte error
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.
queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
-EIO when MPOL_MF_STRICT is specified") and early break on the first pte
in the range results in pte_unmap_unlock on an underflow pte. This can
lead to lockups later on when somebody tries to lock the pte resp.
page_table_lock again..
Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo <luoshijie1(a)huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Feilong Lin <linfeilong(a)huawei.com>
Cc: Shijie Luo <luoshijie1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3fde772ef5ef..3ca4898f3f24 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -525,7 +525,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long flags = qp->flags;
int ret;
bool has_unmovable = false;
- pte_t *pte;
+ pte_t *pte, *mapped_pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
@@ -539,7 +539,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (pmd_trans_unstable(pmd))
return 0;
- pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ mapped_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
for (; addr != end; pte++, addr += PAGE_SIZE) {
if (!pte_present(*pte))
continue;
@@ -571,7 +571,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
} else
break;
}
- pte_unmap_unlock(pte - 1, ptl);
+ pte_unmap_unlock(mapped_pte, ptl);
cond_resched();
if (has_unmovable)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3f08842098e842c51e3b97d0dcdebf810b32558e Mon Sep 17 00:00:00 2001
From: Shijie Luo <luoshijie1(a)huawei.com>
Date: Sun, 1 Nov 2020 17:07:40 -0800
Subject: [PATCH] mm: mempolicy: fix potential pte_unmap_unlock pte error
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.
queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
-EIO when MPOL_MF_STRICT is specified") and early break on the first pte
in the range results in pte_unmap_unlock on an underflow pte. This can
lead to lockups later on when somebody tries to lock the pte resp.
page_table_lock again..
Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo <luoshijie1(a)huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Feilong Lin <linfeilong(a)huawei.com>
Cc: Shijie Luo <luoshijie1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3fde772ef5ef..3ca4898f3f24 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -525,7 +525,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long flags = qp->flags;
int ret;
bool has_unmovable = false;
- pte_t *pte;
+ pte_t *pte, *mapped_pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
@@ -539,7 +539,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (pmd_trans_unstable(pmd))
return 0;
- pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ mapped_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
for (; addr != end; pte++, addr += PAGE_SIZE) {
if (!pte_present(*pte))
continue;
@@ -571,7 +571,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
} else
break;
}
- pte_unmap_unlock(pte - 1, ptl);
+ pte_unmap_unlock(mapped_pte, ptl);
cond_resched();
if (has_unmovable)
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3f08842098e842c51e3b97d0dcdebf810b32558e Mon Sep 17 00:00:00 2001
From: Shijie Luo <luoshijie1(a)huawei.com>
Date: Sun, 1 Nov 2020 17:07:40 -0800
Subject: [PATCH] mm: mempolicy: fix potential pte_unmap_unlock pte error
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.
queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
-EIO when MPOL_MF_STRICT is specified") and early break on the first pte
in the range results in pte_unmap_unlock on an underflow pte. This can
lead to lockups later on when somebody tries to lock the pte resp.
page_table_lock again..
Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo <luoshijie1(a)huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Feilong Lin <linfeilong(a)huawei.com>
Cc: Shijie Luo <luoshijie1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3fde772ef5ef..3ca4898f3f24 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -525,7 +525,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long flags = qp->flags;
int ret;
bool has_unmovable = false;
- pte_t *pte;
+ pte_t *pte, *mapped_pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
@@ -539,7 +539,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (pmd_trans_unstable(pmd))
return 0;
- pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ mapped_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
for (; addr != end; pte++, addr += PAGE_SIZE) {
if (!pte_present(*pte))
continue;
@@ -571,7 +571,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
} else
break;
}
- pte_unmap_unlock(pte - 1, ptl);
+ pte_unmap_unlock(mapped_pte, ptl);
cond_resched();
if (has_unmovable)
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8de15e920dc85d1705ab9c202c95d56845bc2d48 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro(a)fb.com>
Date: Sun, 1 Nov 2020 17:07:34 -0800
Subject: [PATCH] mm: memcg: link page counters to root if use_hierarchy is
false
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Richard reported a warning which can be reproduced by running the LTP
madvise6 test (cgroup v1 in the non-hierarchical mode should be used):
WARNING: CPU: 0 PID: 12 at mm/page_counter.c:57 page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
Modules linked in:
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc7-22-default #77
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812d-rebuilt.opensuse.org 04/01/2014
Workqueue: events drain_local_stock
RIP: 0010:page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
Call Trace:
__memcg_kmem_uncharge (mm/memcontrol.c:3022)
drain_obj_stock (./include/linux/rcupdate.h:689 mm/memcontrol.c:3114)
drain_local_stock (mm/memcontrol.c:2255)
process_one_work (./arch/x86/include/asm/jump_label.h:25 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2274)
worker_thread (./include/linux/list.h:282 kernel/workqueue.c:2416)
kthread (kernel/kthread.c:292)
ret_from_fork (arch/x86/entry/entry_64.S:300)
The problem occurs because in the non-hierarchical mode non-root page
counters are not linked to root page counters, so the charge is not
propagated to the root memory cgroup.
After the removal of the original memory cgroup and reparenting of the
object cgroup, the root cgroup might be uncharged by draining a objcg
stock, for example. It leads to an eventual underflow of the charge and
triggers a warning.
Fix it by linking all page counters to corresponding root page counters
in the non-hierarchical mode.
Please note, that in the non-hierarchical mode all objcgs are always
reparented to the root memory cgroup, even if the hierarchy has more
than 1 level. This patch doesn't change it.
The patch also doesn't affect how the hierarchical mode is working,
which is the only sane and truly supported mode now.
Thanks to Richard for reporting, debugging and providing an alternative
version of the fix!
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Reported-by: <ltp(a)lists.linux.it>
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Michal Koutný <mkoutny(a)suse.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201026231326.3212225-1-guro@fb.com
Debugged-by: Richard Palethorpe <rpalethorpe(a)suse.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c3b6dc7d5c94..3dcbf24d2227 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5345,17 +5345,22 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
memcg->swappiness = mem_cgroup_swappiness(parent);
memcg->oom_kill_disable = parent->oom_kill_disable;
}
- if (parent && parent->use_hierarchy) {
+ if (!parent) {
+ page_counter_init(&memcg->memory, NULL);
+ page_counter_init(&memcg->swap, NULL);
+ page_counter_init(&memcg->kmem, NULL);
+ page_counter_init(&memcg->tcpmem, NULL);
+ } else if (parent->use_hierarchy) {
memcg->use_hierarchy = true;
page_counter_init(&memcg->memory, &parent->memory);
page_counter_init(&memcg->swap, &parent->swap);
page_counter_init(&memcg->kmem, &parent->kmem);
page_counter_init(&memcg->tcpmem, &parent->tcpmem);
} else {
- page_counter_init(&memcg->memory, NULL);
- page_counter_init(&memcg->swap, NULL);
- page_counter_init(&memcg->kmem, NULL);
- page_counter_init(&memcg->tcpmem, NULL);
+ page_counter_init(&memcg->memory, &root_mem_cgroup->memory);
+ page_counter_init(&memcg->swap, &root_mem_cgroup->swap);
+ page_counter_init(&memcg->kmem, &root_mem_cgroup->kmem);
+ page_counter_init(&memcg->tcpmem, &root_mem_cgroup->tcpmem);
/*
* Deeper hierachy with use_hierarchy == false doesn't make
* much sense so let cgroup subsystem know about this
Here are backports of some fixes to the 4.14 stable branch.
I tested the blktrace fix with the script referenced in the commit
message.
I wasn't able to test the i40e changes (no hardware and no reproducer
available).
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
Here are backports of some fixes to the 4.19 stable branch.
I tested the blktrace fix with the script referenced in the commit
message.
I tested the btrfs changes with the reproducers for CVE-2019-19039,
CVE-2019-19377, and CVE-2019-19816, and checked for regressions with
xfstests.
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom