The quilt patch titled
Subject: mm/damon/sysfs-schemes: skip stats update if the scheme directory is removed
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-schemes-skip-stats-update-if-the-scheme-directory-is-removed.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/sysfs-schemes: skip stats update if the scheme directory is removed
Date: Mon, 14 Nov 2022 17:55:52 +0000
A DAMON sysfs interface user can start DAMON with a scheme, remove the
sysfs directory for the scheme, and then request an update of the scheme's
stats. Because the scheme stats update logic isn't aware of the situation,
it results in an invalid memory access. Fix the bug by checking whether
the scheme sysfs directory still exists.
Link: https://lkml.kernel.org/r/20221114175552.1951-1-sj@kernel.org
Fixes: 0ac32b8affb5 ("mm/damon/sysfs: support DAMOS stats")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [v5.18]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-schemes-skip-stats-update-if-the-scheme-directory-is-removed
+++ a/mm/damon/sysfs.c
@@ -2339,6 +2339,10 @@ static int damon_sysfs_upd_schemes_stats
 	damon_for_each_scheme(scheme, ctx) {
 		struct damon_sysfs_stats *sysfs_stats;
 
+		/* user could have removed the scheme sysfs dir */
+		if (schemes_idx >= sysfs_schemes->nr)
+			break;
+
 		sysfs_stats = sysfs_schemes->schemes_arr[schemes_idx++]->stats;
 		sysfs_stats->nr_tried = scheme->stat.nr_tried;
 		sysfs_stats->sz_tried = scheme->stat.sz_tried;
_
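To illustrate the bug the added check closes, here is a minimal userspace
model of the stats-update loop (the names echo the kernel code, but the
types and numbers are invented for this sketch):

    #include <stdio.h>

    struct damon_sysfs_stats { unsigned long nr_tried; };

    int main(void)
    {
        struct damon_sysfs_stats s0 = { 0 }, s1 = { 0 };
        struct damon_sysfs_stats *schemes_arr[] = { &s0, &s1 };
        int nr = 1;             /* scheme dirs left after the user's rmdir */
        int ctx_nr_schemes = 2; /* schemes the running DAMON context holds */
        int schemes_idx = 0;

        for (int i = 0; i < ctx_nr_schemes; i++) {
            /* the added check: never index at or past ->nr; in the
             * kernel those entries have already been freed */
            if (schemes_idx >= nr)
                break;
            schemes_arr[schemes_idx++]->nr_tried++;
        }
        printf("updated %d of %d scheme stats\n", schemes_idx, ctx_nr_schemes);
        return 0;
    }

Without the bounds check, the loop keeps walking the kernel's scheme list
and dereferences a stats pointer for a sysfs scheme that no longer exists.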
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-sysfs-fix-wrong-empty-schemes-assumption-under-online-tuning-in-damon_sysfs_set_schemes.patch
mm-damon-core-split-out-damos-charged-region-skip-logic-into-a-new-function.patch
mm-damon-core-split-damos-application-logic-into-a-new-function.patch
mm-damon-core-split-out-scheme-stat-update-logic-into-a-new-function.patch
mm-damon-core-split-out-scheme-quota-adjustment-logic-into-a-new-function.patch
mm-damon-sysfs-use-damon_addr_range-for-regions-start-and-end-values.patch
mm-damon-sysfs-remove-parameters-of-damon_sysfs_region_alloc.patch
mm-damon-sysfs-move-sysfs_lock-to-common-module.patch
mm-damon-sysfs-move-unsigned-long-range-directory-to-common-module.patch
mm-damon-sysfs-split-out-kdamond-independent-schemes-stats-update-logic-into-a-new-function.patch
mm-damon-sysfs-split-out-schemes-directory-implementation-to-separate-file.patch
mm-damon-modules-deduplicate-init-steps-for-damon-context-setup.patch
mm-damon-reclaimlru_sort-remove-unnecessarily-included-headers.patch
mm-damon-reclaim-enable-and-disable-synchronously.patch
selftests-damon-add-tests-for-damon_reclaims-enabled-parameter.patch
mm-damon-lru_sort-enable-and-disable-synchronously.patch
selftests-damon-add-tests-for-damon_lru_sorts-enabled-parameter.patch
docs-admin-guide-mm-damon-usage-describe-the-rules-of-sysfs-region-directories.patch
docs-admin-guide-mm-damon-usage-fix-wrong-usage-example-of-init_regions-file.patch
mm-damon-core-add-a-callback-for-scheme-target-regions-check.patch
mm-damon-sysfs-schemes-implement-schemes-tried_regions-directory.patch
mm-damon-sysfs-schemes-implement-scheme-region-directory.patch
mm-damon-sysfs-implement-damos-tried-regions-update-command.patch
mm-damon-sysfs-implement-damos-tried-regions-update-command-fix.patch
mm-damon-sysfs-schemes-implement-damos-tried-regions-clear-command.patch
mm-damon-sysfs-schemes-implement-damos-tried-regions-clear-command-fix.patch
tools-selftets-damon-sysfs-test-tried_regions-directory-existence.patch
docs-admin-guide-mm-damon-usage-document-schemes-s-tried_regions-sysfs-directory.patch
docs-abi-damon-document-schemes-s-tried_regions-sysfs-directory.patch
selftests-damon-test-non-context-inputs-to-rm_contexts-file.patch
The quilt patch titled
Subject: mm: correctly charge compressed memory to its memcg
has been removed from the -mm tree. Its filename was
mm-correctly-charge-compressed-memory-to-its-memcg.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Li Liguang <liliguang(a)baidu.com>
Subject: mm: correctly charge compressed memory to its memcg
Date: Mon, 14 Nov 2022 14:48:28 -0500
Kswapd reclaims memory when memory pressure is high; if zswap is enabled,
the reclaimed anonymous memory is compressed and stored in the zpool. The
memcg_kmem_bypass() check in get_obj_cgroup_from_page() bypasses kernel
threads such as kswapd, causing the compressed memory not to be charged to
its memory cgroup.
Remove the memcg_kmem_bypass() call and properly charge compressed memory
to its corresponding memory cgroup.
Link: https://lore.kernel.org/linux-mm/CALvZod4nnn8BHYqAM4xtcR0Ddo2-Wr8uKm9h_CHWU…
Link: https://lkml.kernel.org/r/20221114194828.100822-1-hannes@cmpxchg.org
Fixes: f4840ccfca25 ("zswap: memcg accounting")
Signed-off-by: Li Liguang <liliguang(a)baidu.com>
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Acked-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: <stable(a)vger.kernel.org> [5.19+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/memcontrol.c~mm-correctly-charge-compressed-memory-to-its-memcg
+++ a/mm/memcontrol.c
@@ -3026,7 +3026,7 @@ struct obj_cgroup *get_obj_cgroup_from_p
 {
 	struct obj_cgroup *objcg;
 
-	if (!memcg_kmem_enabled() || memcg_kmem_bypass())
+	if (!memcg_kmem_enabled())
 		return NULL;
 
 	if (PageMemcgKmem(page)) {
_
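For context, the removed bypass declines charging whenever the memcg to
charge can't be derived from current, which covers all kernel threads.
The helper looked approximately like this in kernels of that era (quoted
from memory, so treat the exact body as an assumption):

    static inline bool memcg_kmem_bypass(void)
    {
        /* Allow remote memcg charging from any context. */
        if (unlikely(active_memcg()))
            return false;

        /* Memcg to charge can't be determined. */
        if (!in_task() || !current->mm || (current->flags & PF_KTHREAD))
            return true;

        return false;
    }

Since kswapd runs with PF_KTHREAD set, get_obj_cgroup_from_page() returned
NULL for it and the zswap-compressed pages went uncharged; with the check
removed, the objcg is derived from the page's own memcg instead.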
Patches currently in -mm which might be from liliguang(a)baidu.com are
The quilt patch titled
Subject: mm: vmscan: fix extreme overreclaim and swap floods
has been removed from the -mm tree. Its filename was
mm-vmscan-fix-extreme-overreclaim-and-swap-floods.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: vmscan: fix extreme overreclaim and swap floods
Date: Tue, 2 Aug 2022 12:28:11 -0400
During proactive reclaim, we sometimes observe severe overreclaim, with
several thousand times more pages reclaimed than requested.
This trace was obtained from shrink_lruvec() during such an instance:
  prio:0 anon_cost:1141521 file_cost:7767
  nr_reclaimed:4387406 nr_to_reclaim:1047 (or_factor:4190)
  nr=[7161123 345 578 1111]
While the reclaimer requested 4M, vmscan reclaimed close to 16G, most of it
by swapping. These requests take over a minute, during which the write()
to memory.reclaim is unkillably stuck inside the kernel.
Digging into the source, this is caused by the proportional reclaim
bailout logic. This code tries to resolve a fundamental conflict: to
reclaim roughly what was requested, while also aging all LRUs fairly and
in accordance with their size, swappiness, refault rates etc. The way it
attempts fairness is that once the reclaim goal has been reached, it stops
scanning the LRUs with the smaller remaining scan targets, and adjusts the
remainder of the bigger LRUs according to how much of the smaller LRUs was
scanned. It then finishes scanning that remainder regardless of the
reclaim goal.
This works fine if priority levels are low and the LRU lists are
comparable in size. However, in this instance, the cgroup that is
targeted by proactive reclaim has almost no files left - they've already
been squeezed out by proactive reclaim earlier - and the remaining anon
pages are hot. Anon rotations cause the priority level to drop to 0,
which results in reclaim targeting all of anon (a lot) and all of file
(almost nothing). By the time reclaim decides to bail, it has scanned
most or all of the file target, and therefore must also scan most or all of
the enormous anon target. This target is thousands of times larger than
the reclaim goal, thus causing the overreclaim.
The bailout code hasn't changed in years, so why is this failing now? The
most likely explanations are two other recent changes in anon reclaim:
1. Before the series starting with commit 5df741963d52 ("mm: fix LRU
balancing effect of new transparent huge pages"), the VM was
overall relatively reluctant to swap at all, even if swap was
configured. This means the LRU balancing code didn't come into play
as often as it does now, and mostly in high pressure situations
where pronounced swap activity wouldn't be as surprising.
2. For historic reasons, shrink_lruvec() loops on the scan targets of
all LRU lists except the active anon one, meaning it would bail if
the only remaining pages to scan were active anon - even if there
were a lot of them.
Before the series starting with commit ccc5dc67340c ("mm/vmscan:
make active/inactive ratio as 1:1 for anon lru"), most anon pages
would live on the active LRU; the inactive one would contain only a
handful of preselected reclaim candidates. After the series, anon
gets aged similarly to file, and the inactive list is the default
for new anon pages as well, making it often the much bigger list.
As a result, the VM is now more likely to actually finish large
anon targets than before.
Change the code such that only one SWAP_CLUSTER_MAX-sized nudge toward the
larger LRU lists is made before bailing out on a met reclaim goal.
This fixes the extreme overreclaim problem.
Fairness is more subtle and harder to evaluate. No obvious misbehavior
was observed on the test workload, in any case. Conceptually, fairness
should primarily be a cumulative effect from regular, lower priority
scans. Once the VM is in trouble and needs to escalate scan targets to
make forward progress, fairness needs to take a backseat. This is also
acknowledged by the myriad exceptions in get_scan_count(). This patch
makes fairness decrease gradually, as it keeps fairness work static over
increasing priority levels with growing scan targets. This should make
more sense - although we may have to revisit the exact values.
Link: https://lkml.kernel.org/r/20220802162811.39216-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-fix-extreme-overreclaim-and-swap-floods
+++ a/mm/vmscan.c
@@ -5844,8 +5844,8 @@ static void shrink_lruvec(struct lruvec
 	enum lru_list lru;
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
+	bool proportional_reclaim;
 	struct blk_plug plug;
-	bool scan_adjusted;
 
 	if (lru_gen_enabled()) {
 		lru_gen_shrink_lruvec(lruvec, sc);
@@ -5868,8 +5868,8 @@ static void shrink_lruvec(struct lruvec
 	 * abort proportional reclaim if either the file or anon lru has already
 	 * dropped to zero at the first pass.
 	 */
-	scan_adjusted = (!cgroup_reclaim(sc) && !current_is_kswapd() &&
-			 sc->priority == DEF_PRIORITY);
+	proportional_reclaim = (!cgroup_reclaim(sc) && !current_is_kswapd() &&
+				sc->priority == DEF_PRIORITY);
 
 	blk_start_plug(&plug);
 	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
@@ -5889,7 +5889,7 @@ static void shrink_lruvec(struct lruvec
 
 		cond_resched();
 
-		if (nr_reclaimed < nr_to_reclaim || scan_adjusted)
+		if (nr_reclaimed < nr_to_reclaim || proportional_reclaim)
 			continue;
 
 		/*
@@ -5940,8 +5940,6 @@ static void shrink_lruvec(struct lruvec
 		nr_scanned = targets[lru] - nr[lru];
 		nr[lru] = targets[lru] * (100 - percentage) / 100;
 		nr[lru] -= min(nr[lru], nr_scanned);
-
-		scan_adjusted = true;
 	}
 	blk_finish_plug(&plug);
 	sc->nr_reclaimed += nr_reclaimed;
_
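To make the trace's numbers concrete, here is a small standalone program
modelling just the percentage arithmetic of the old bailout, with targets
taken from the nr=[] line above (illustrative, not the kernel loop):

    #include <stdio.h>

    int main(void)
    {
        /* inactive anon / inactive file from nr=[7161123 345 578 1111] */
        unsigned long target_anon = 7161123, target_file = 578;
        unsigned long nr_to_reclaim = 1047;

        /* the tiny file list is exhausted by the time the goal is met */
        unsigned long scanned_file = target_file;
        unsigned long percentage = 100 * scanned_file / target_file;

        /* old behaviour: the bigger list is trimmed to the same
         * percentage and then scanned to completion regardless of the
         * reclaim goal */
        unsigned long anon_remaining = target_anon * (100 - percentage) / 100;
        unsigned long anon_to_scan = target_anon - anon_remaining;

        printf("goal %lu pages, anon still to scan ~%lu pages\n",
               nr_to_reclaim, anon_to_scan);
        return 0;
    }

Scanning ~7.1M anon pages against a 1047-page goal is precisely the
or_factor:4190 class of overreclaim shown in the trace; the patch caps the
post-goal work at one SWAP_CLUSTER_MAX-sized nudge instead.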
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
mm-vmscan-split-khugepaged-stats-from-direct-reclaim-stats.patch
zswap-fix-writeback-lock-ordering-for-zsmalloc.patch
zpool-clean-out-dead-code.patch
The patch titled
Subject: mm: migrate: Fix THP's mapcount on isolation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-migrate-fix-thps-mapcount-on-isolation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Gavin Shan <gshan(a)redhat.com>
Subject: mm: migrate: Fix THP's mapcount on isolation
Date: Wed, 23 Nov 2022 08:57:52 +0800
The issue was reported when removing memory through a virtio_mem device.
A transparent huge page which has experienced a copy-on-write fault is
wrongly regarded as pinned, so it escapes isolation in
isolate_migratepages_block(). The page then can't be migrated and the
corresponding memory block can't be put into the offline state.
Fix it by replacing page_mapcount() with total_mapcount(). With this, the
transparent huge page can be isolated and migrated, and the memory block
can be put into the offline state.
Link: https://lkml.kernel.org/r/20221123005752.161003-1-gshan@redhat.com
Fixes: 3917c80280c9 ("thp: change CoW semantics for anon-THP")
Signed-off-by: Gavin Shan <gshan(a)redhat.com>
Reported-by: Zhenyu Zhang <zhenyzha(a)redhat.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: William Kucharski <william.kucharski(a)oracle.com>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: <stable(a)vger.kernel.org> [v5.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/compaction.c~mm-migrate-fix-thps-mapcount-on-isolation
+++ a/mm/compaction.c
@@ -990,7 +990,7 @@ isolate_migratepages_block(struct compac
 		 * admittedly racy check.
 		 */
 		mapping = page_mapping(page);
-		if (!mapping && page_count(page) > page_mapcount(page))
+		if (!mapping && page_count(page) > total_mapcount(page))
 			goto isolate_fail;
 
 		/*
_
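For a rough intuition of why the old check misfires, consider an anon THP
whose PMD mapping was split into per-subpage PTE mappings by the CoW
fault; the numbers below are idealized (the real refcounting has further
contributions):

    #include <stdio.h>

    int main(void)
    {
        /* a 2M THP consists of 512 subpages, each PTE-mapped once */
        int page_count = 512;     /* one page reference per PTE mapping    */
        int page_mapcount = 1;    /* counts mappings of the head page only */
        int total_mapcount = 512; /* sums the mapcounts of all subpages    */

        printf("old check, pinned: %d\n", page_count > page_mapcount);  /* 1 */
        printf("new check, pinned: %d\n", page_count > total_mapcount); /* 0 */
        return 0;
    }

With page_mapcount(), 511 of the 512 references look like extra pins and
isolation fails; total_mapcount() accounts for every subpage mapping, so
the page is correctly seen as unpinned and migratable.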
Patches currently in -mm which might be from gshan(a)redhat.com are
mm-migrate-fix-thps-mapcount-on-isolation.patch
Hi Sasha,
On Tue, Nov 22, 2022 at 10:19 AM -05, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> l2tp: Serialize access to sk_user_data with sk_callback_lock
>
> to the 6.0-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> l2tp-serialize-access-to-sk_user_data-with-sk_callba.patch
> and it can be found in the queue-6.0 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
"Just when I thought I was out, they pull me back in!"
Greg assured me yesterday that this was dropped from stable queues:
https://lore.kernel.org/stable/Y3thooxAN2Are7Ai@kroah.com/
> commit 1ea60c1db42da0b5b40eb7e9bf8d5937f6f475cc
> Author: Jakub Sitnicki <jakub(a)cloudflare.com>
> Date: Mon Nov 14 20:16:19 2022 +0100
>
> l2tp: Serialize access to sk_user_data with sk_callback_lock
>
> [ Upstream commit b68777d54fac21fc833ec26ea1a2a84f975ab035 ]
>
> sk->sk_user_data has multiple users, which are not compatible with each
> other. Writers must synchronize by grabbing the sk->sk_callback_lock.
>
> l2tp currently fails to grab the lock when modifying the underlying tunnel
> socket fields. Fix it by adding appropriate locking.
>
> We err on the side of safety and grab the sk_callback_lock also inside the
> sk_destruct callback overridden by l2tp, even though there should be no
> refs allowing access to the sock at the time when sk_destruct gets called.
>
> v4:
> - serialize write to sk_user_data in l2tp sk_destruct
>
> v3:
> - switch from sock lock to sk_callback_lock
> - document write-protection for sk_user_data
>
> v2:
> - update Fixes to point to origin of the bug
> - use real names in Reported/Tested-by tags
>
> Cc: Tom Parkin <tparkin(a)katalix.com>
> Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core")
> Reported-by: Haowei Yan <g1042620637(a)gmail.com>
> Signed-off-by: Jakub Sitnicki <jakub(a)cloudflare.com>
> Signed-off-by: David S. Miller <davem(a)davemloft.net>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index f6e6838c82df..03a4ebe3ccc8 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -323,7 +323,7 @@ struct sk_filter;
>   *	@sk_tskey: counter to disambiguate concurrent tstamp requests
>   *	@sk_zckey: counter to order MSG_ZEROCOPY notifications
>   *	@sk_socket: Identd and reporting IO signals
> - *	@sk_user_data: RPC layer private data
> + *	@sk_user_data: RPC layer private data. Write-protected by @sk_callback_lock.
>   *	@sk_frag: cached page frag
>   *	@sk_peek_off: current peek_offset value
>   *	@sk_send_head: front of stuff to transmit
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 7499c51b1850..754fdda8a5f5 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1150,8 +1150,10 @@ static void l2tp_tunnel_destruct(struct sock *sk)
>  	}
>
>  	/* Remove hooks into tunnel socket */
> +	write_lock_bh(&sk->sk_callback_lock);
>  	sk->sk_destruct = tunnel->old_sk_destruct;
>  	sk->sk_user_data = NULL;
> +	write_unlock_bh(&sk->sk_callback_lock);
>
>  	/* Call the original destructor */
>  	if (sk->sk_destruct)
> @@ -1469,16 +1471,18 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>  		sock = sockfd_lookup(tunnel->fd, &ret);
>  		if (!sock)
>  			goto err;
> -
> -		ret = l2tp_validate_socket(sock->sk, net, tunnel->encap);
> -		if (ret < 0)
> -			goto err_sock;
>  	}
>
> +	sk = sock->sk;
> +	write_lock(&sk->sk_callback_lock);
> +
> +	ret = l2tp_validate_socket(sk, net, tunnel->encap);
> +	if (ret < 0)
> +		goto err_sock;
> +
>  	tunnel->l2tp_net = net;
>  	pn = l2tp_pernet(net);
>
> -	sk = sock->sk;
>  	sock_hold(sk);
>  	tunnel->sock = sk;
>
> @@ -1504,7 +1508,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>
>  		setup_udp_tunnel_sock(net, sock, &udp_cfg);
>  	} else {
> -		sk->sk_user_data = tunnel;
> +		rcu_assign_sk_user_data(sk, tunnel);
>  	}
>
>  	tunnel->old_sk_destruct = sk->sk_destruct;
> @@ -1518,6 +1522,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>  	if (tunnel->fd >= 0)
>  		sockfd_put(sock);
>
> +	write_unlock(&sk->sk_callback_lock);
>  	return 0;
>
>  err_sock:
> @@ -1525,6 +1530,8 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>  		sock_release(sock);
>  	else
>  		sockfd_put(sock);
> +
> +	write_unlock(&sk->sk_callback_lock);
>  err:
>  	return ret;
>  }
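The rule the patch enforces, namely that every writer of sk->sk_user_data
must hold sk->sk_callback_lock, can be modelled in userspace with a
pthread rwlock standing in for the callback lock (a sketch of the locking
discipline only, not of the l2tp code):

    #include <pthread.h>
    #include <stdio.h>

    struct sock {
        pthread_rwlock_t sk_callback_lock;
        void *sk_user_data;
    };

    /* every writer takes the callback lock, as the patch does with
     * write_lock_bh(&sk->sk_callback_lock) and write_lock() */
    static void set_user_data(struct sock *sk, void *data)
    {
        pthread_rwlock_wrlock(&sk->sk_callback_lock);
        sk->sk_user_data = data;
        pthread_rwlock_unlock(&sk->sk_callback_lock);
    }

    int main(void)
    {
        struct sock sk = { .sk_callback_lock = PTHREAD_RWLOCK_INITIALIZER };
        int tunnel = 0;

        set_user_data(&sk, &tunnel); /* l2tp_tunnel_register() analogue */
        set_user_data(&sk, NULL);    /* l2tp_tunnel_destruct() analogue */
        printf("sk_user_data = %p\n", sk.sk_user_data);
        return 0;
    }

The kernel additionally uses the _bh variant in the destructor path,
presumably because readers there can run in softirq context.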