From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/dbgfs: protect targets destructions with kdamond_lock
DAMON debugfs interface iterates current monitoring targets in
'dbgfs_target_ids_read()' while holding the corresponding 'kdamond_lock'.
However, it also destructs the monitoring targets in
'dbgfs_before_terminate()' without holding the lock. This can result in a
use-after-free bug. This commit avoids the race by protecting the
destruction with the corresponding 'kdamond_lock'.
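The locking rule described above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the DAMON code itself: 'struct target', 'struct ctx' and the 'ctx_*' helpers are hypothetical stand-ins, and a pthread mutex plays the role of 'kdamond_lock'. The point is only that the reader and the destroyer take the same lock, so the reader can never walk freed nodes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a monitoring target. */
struct target {
	unsigned long id;
	struct target *next;
};

/* Hypothetical stand-in for the context; 'lock' models kdamond_lock. */
struct ctx {
	pthread_mutex_t lock;
	struct target *targets;
};

void ctx_init(struct ctx *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->targets = NULL;
}

int ctx_add_target(struct ctx *c, unsigned long id)
{
	struct target *t = malloc(sizeof(*t));

	if (!t)
		return -1;
	t->id = id;
	pthread_mutex_lock(&c->lock);
	t->next = c->targets;
	c->targets = t;
	pthread_mutex_unlock(&c->lock);
	return 0;
}

/* Reader side: iterates the targets while holding the lock. */
size_t ctx_count_targets(struct ctx *c)
{
	size_t n = 0;

	pthread_mutex_lock(&c->lock);
	for (struct target *t = c->targets; t; t = t->next)
		n++;
	pthread_mutex_unlock(&c->lock);
	return n;
}

/* Destroyer side: frees every target under the same lock as the reader. */
void ctx_destroy_targets(struct ctx *c)
{
	struct target *t, *next;

	pthread_mutex_lock(&c->lock);
	for (t = c->targets; t; t = next) {
		next = t->next;	/* save before freeing, like the _safe iterator */
		free(t);
	}
	c->targets = NULL;
	pthread_mutex_unlock(&c->lock);
}
```

Without the lock in 'ctx_destroy_targets()', a concurrent 'ctx_count_targets()' could dereference a node that has already been freed, which is the shape of the bug this patch closes.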
Link: https://lkml.kernel.org/r/20211221094447.2241-1-sj@kernel.org
Reported-by: Sangwoo Bae <sangwoob(a)amazon.com>
Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [5.15.x]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/dbgfs.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-protect-targets-destructions-with-kdamond_lock
+++ a/mm/damon/dbgfs.c
@@ -650,10 +650,12 @@ static void dbgfs_before_terminate(struc
 	if (!targetid_is_pid(ctx))
 		return;
+	mutex_lock(&ctx->kdamond_lock);
 	damon_for_each_target_safe(t, next, ctx) {
 		put_pid((struct pid *)t->id);
 		damon_destroy_target(t);
 	}
+	mutex_unlock(&ctx->kdamond_lock);
 }
 static struct damon_ctx *dbgfs_new_ctx(void)
_
From: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Subject: mm, hwpoison: fix condition in free hugetlb page path
When a memory error hits a tail page of a free hugepage,
__page_handle_poison() is expected to be called to isolate the error in
4kB units, but it's not called due to the outdated if-condition in
memory_failure_hugetlb(). This loses the chance to isolate the error at
the finer granularity, so it's not optimal. Drop the condition.
This "(p != head && TestSetPageHWPoison(head))" condition is based on the
old semantics of PageHWPoison on hugepages (where the PG_hwpoison flag was
set on the subpage), so it's not necessary any more. Now that PG_hwpoison
is set on the head page for hugepages, concurrent error events on
different subpages in a single hugepage are prevented by
TestSetPageHWPoison(head) at the beginning of memory_failure_hugetlb().
So dropping the condition should not reopen the race window originally
mentioned in commit b985194c8c0a ("hwpoison, hugetlb:
lock_page/unlock_page does not match for handling a free hugepage").
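To illustrate why a single test-and-set on the head page is enough to serialize error events on different subpages, here is a minimal user-space sketch using C11 atomics. It is not the kernel implementation: 'struct hugepage_model' and 'handle_subpage_error()' are hypothetical names, and the 'atomic_flag' merely models the PG_hwpoison bit on the head page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a hugepage; the flag stands in for PG_hwpoison on the head. */
struct hugepage_model {
	atomic_flag head_hwpoison;
	int isolated_subpage;	/* -1 until one handler has isolated a subpage */
};

void hugepage_model_init(struct hugepage_model *hp)
{
	atomic_flag_clear(&hp->head_hwpoison);
	hp->isolated_subpage = -1;
}

/*
 * Returns true if this caller claimed the hugepage and handled the error,
 * false if another error event on the same hugepage got there first.
 */
bool handle_subpage_error(struct hugepage_model *hp, int subpage)
{
	/* Atomically set the flag and learn whether it was already set. */
	if (atomic_flag_test_and_set(&hp->head_hwpoison))
		return false;

	/* Only the single winner reaches this point. */
	hp->isolated_subpage = subpage;
	return true;
}
```

Because the flag lives on the head rather than on each subpage, a second event on any other subpage of the same hugepage fails the test-and-set, so no additional per-subpage check is needed later in the path.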
[naoya.horiguchi(a)linux.dev: fix "HardwareCorrupted" counter]
Link: https://lkml.kernel.org/r/20211220084851.GA1460264@u2004
Link: https://lkml.kernel.org/r/20211210110208.879740-1-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reported-by: Fei Luo <luofei(a)unicloud.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org> [5.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
--- a/mm/memory-failure.c~mm-hwpoison-fix-condition-in-free-hugetlb-page-path
+++ a/mm/memory-failure.c
@@ -1470,17 +1470,12 @@ static int memory_failure_hugetlb(unsign
 	if (!(flags & MF_COUNT_INCREASED)) {
 		res = get_hwpoison_page(p, flags);
 		if (!res) {
-			/*
-			 * Check "filter hit" and "race with other subpage."
-			 */
 			lock_page(head);
-			if (PageHWPoison(head)) {
-				if ((hwpoison_filter(p) && TestClearPageHWPoison(p))
-				    || (p != head && TestSetPageHWPoison(head))) {
+			if (hwpoison_filter(p)) {
+				if (TestClearPageHWPoison(head))
 					num_poisoned_pages_dec();
-					unlock_page(head);
-					return 0;
-				}
+				unlock_page(head);
+				return 0;
 			}
 			unlock_page(head);
 			res = MF_FAILED;
_
Hi, this is your Linux kernel regression tracker speaking.
Forwarding a regression reported in bugzilla.kernel.org, to ensure
all the interested parties are aware of it, as quite a few (many?)
subsystems don't react at all to reports in that bug tracker.
https://bugzilla.kernel.org/show_bug.cgi?id=215401
> Martin Mokrejs 2021-12-23 20:25:45 UTC
>
> Created attachment 300133 [details] dmesg-5.4.167.txt
>
> Hi, I jumped from 5.4.143 to 5.4.167 but the connection to wifi was
> so unstable I had to reboot to use the old kernel. I never used git
> bisect and am not sure I have that much time to play with that.
> However, let me say that I lost about 5x connection to AP. Sooner or
> later after each situation I disconnected from the AP using nm-applet
> and re-connected. That has helped for a short while, like a few
> minutes, then I again lost network connection. Maybe you can find the
> event in the dmesg output.
>
> Once, for some reason, there is also a stacktrace from the kernel.
> Why just once instead of about 5 times I have no idea.
>
> I could provide the same kernel messages supplemented with daemon
> messages from syslog.
>
> Hope this helps to some extent,
Feel free to either continue discussing this here or in the ticket, I
don't care.
To be sure this issue doesn't fall through the cracks unnoticed, I'm
also adding it to regzbot, my Linux kernel regression tracking bot:
#regzbot introduced v5.4.143 to v5.4.167
#regzbot title: net: iwlwifi: frequently losing connection to AP
#regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215401
Reminder: when fixing the issue, please link to this mail and the bug
entry with a link tag.
Ciao, Thorsten (wearing his 'Linux kernel regression tracker' hat).
P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. Unfortunately
therefore I sometimes will get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply. That's in everyone's interest, as
what I wrote above might be misleading to everyone reading this; any
suggestion I gave thus might send someone reading this down the wrong
rabbit hole, which none of us wants.
BTW, I have no personal interest in this issue, which is tracked using
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
this mail to get things rolling again and hence don't need to be CC on
all further activities wrt this regression.
---
Additional information about regzbot:
If you want to know more about regzbot, check out its web-interface, the
getting started guide, and/or the reference documentation:
https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md
The last two documents will explain how you can interact with regzbot
yourself if you want to.
Hint for reporters: when reporting a regression it's in your interest to
tell #regzbot about it in the report, as that will ensure the regression
gets on the radar of regzbot and the regression tracker. That's in your
interest, as they will make sure the report won't fall through the
cracks unnoticed.
Hint for developers: you normally don't need to care about regzbot once
it's involved. Fix the issue as you normally would, just remember to
include a 'Link:' tag to the report in the commit message, as explained
in Documentation/process/submitting-patches.rst
That aspect was recently made more explicit in commit 1f57bd42b77c:
https://git.kernel.org/linus/1f57bd42b77c
When a bfqq is shared by multiple processes it can happen that one of the
processes gets moved to a different cgroup (or just starts submitting IO
for a different cgroup). In case that happens we need to split the merged
bfqq as otherwise we will have IO for multiple cgroups in one bfqq and
we will just account IO time to the wrong entities etc.
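The decision described above can be modeled in a few lines of plain C. This is only a sketch under simplifying assumptions, not the BFQ code: 'struct queue_model', 'struct io_ctx_model' and 'change_group()' are hypothetical names, and a cgroup is reduced to an integer id. It shows the one rule the patch adds: when a shared (cooperating) queue is about to move to another group, flag it for splitting and drop the "stably merged" state first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a queue that may be shared between processes. */
struct queue_model {
	bool coop;		/* queue is shared with another process */
	bool split_coop;	/* queue should be split at the next occasion */
	int group;		/* id of the group the queue is accounted to */
};

/* Hypothetical model of one process's IO context. */
struct io_ctx_model {
	bool stably_merged;
	struct queue_model *sync_queue;
};

/* Move the context's sync queue to new_group, requesting a split if it is shared. */
void change_group(struct io_ctx_model *ic, int new_group)
{
	struct queue_model *q = ic->sync_queue;

	if (q == NULL || q->group == new_group)
		return;

	if (q->coop) {
		/* The sharing processes may now be in different groups. */
		ic->stably_merged = false;
		q->split_coop = true;
	}
	q->group = new_group;
}
```

A queue that is not shared is simply moved, as before; only the shared case gains the extra bookkeeping.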
CC: stable(a)vger.kernel.org
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
block/bfq-cgroup.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 24a5c5329bcd..1f5fb723bed7 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -730,8 +730,18 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
 	if (sync_bfqq) {
 		entity = &sync_bfqq->entity;
-		if (entity->sched_data != &bfqg->sched_data)
+		if (entity->sched_data != &bfqg->sched_data) {
+			/*
+			 * Moving bfqq that is shared with another process?
+			 * Split the queues at the nearest occasion as the
+			 * processes can be in different cgroups now.
+			 */
+			if (bfq_bfqq_coop(sync_bfqq)) {
+				bic->stably_merged = false;
+				bfq_mark_bfqq_split_coop(sync_bfqq);
+			}
 			bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
+		}
 	}
 	return bfqg;
--
2.26.2