The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f1796544a0ca0f14386a679d3d05fbc69235015e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to 'stable@vger.kernel.org' --in-reply-to '2024022759-crave-busily-bef7@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
f1796544a0ca ("memcg: fix use-after-free in uncharge_batch")
1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting")
8d22a9351035 ("mm/memcg: fix refcount error while moving and swapping")
d9eb1ea2bf87 ("mm: memcontrol: delete unused lrucare handling")
4c6355b25e8b ("mm: memcontrol: charge swapin pages on instantiation")
f0e45fb4da29 ("mm: memcontrol: drop unused try/commit/cancel charge API")
9d82c69438d0 ("mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API")
468c398233da ("mm: memcontrol: switch to native NR_ANON_THPS counter")
be5d0a74c62d ("mm: memcontrol: switch to native NR_ANON_MAPPED counter")
0d1c20722ab3 ("mm: memcontrol: switch to native NR_FILE_PAGES and NR_SHMEM counters")
49e50d277ba2 ("mm: memcontrol: prepare move_account for removal of private page type counters")
9f762dbe19b9 ("mm: memcontrol: prepare uncharging for removal of private page type counters")
3fea5a499d57 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API")
6caa6a0703e0 ("mm: memcontrol: move out cgroup swaprate throttling")
14235ab36019 ("mm: shmem: remove rare optimization when swapin races with hole punching")
3fba69a56e16 ("mm: memcontrol: drop @compound parameter from memcg charging API")
abb242f57196 ("mm: memcontrol: fix stat-corrupting race in charge moving")
f4129ea3591a ("mm: fix NUMA node file count error in replace_page_cache()")
ffe945e633b5 ("khugepaged: do not stop collapse if less than half PTEs are referenced")
396bcc5299c2 ("mm: remove CONFIG_TRANSPARENT_HUGE_PAGECACHE")
85b9f46e8ea4 ("mm, thp: track fallbacks due to failed memcg charges separately")
dcdf11ee1441 ("mm, shmem: add vmstat for hugepage fallback")
9c315e4d7d8c ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()")
92d0510c3585 ("mm: kmem: switch to nr_pages in (__)memcg_kmem_charge_memcg()")
f4b00eab5004 ("mm: kmem: rename memcg_kmem_(un)charge() into memcg_kmem_(un)charge_page()")
50591183fa86 ("mm: kmem: cleanup memcg_kmem_uncharge_memcg() arguments")
10eaec2f63b6 ("mm: kmem: cleanup (__)memcg_kmem_charge_memcg() arguments")
47e29d32afba ("mm/gup: page->hpage_pinned_refcount: exact pin counts for huge pages")
3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
3b78d8347d31 ("mm/gup: pass gup flags to two more routines")
c23a0c99793f ("mm/migrate: clean up some minor coding style")
92855270ff08 ("mm/memcontrol.c: cleanup some useless code")
f1f6a7dd9b53 ("mm, tree-wide: rename put_user_page*() to unpin_user_page*()")
aa4b87fe9ea3 ("powerpc: book3s64: convert to pin_user_pages() and put_user_page()")
19fed0dae94d ("vfio, mm: pin_user_pages (FOLL_PIN) and put_user_page() conversion")
1f815afcfca7 ("media/v4l2-core: pin_user_pages (FOLL_PIN) and put_user_page() conversion")
803e4572d7c5 ("mm/process_vm_access: set FOLL_PIN via pin_user_pages_remote()")
57459435cff5 ("goldish_pipe: convert to pin_user_pages() and put_user_page()")
eddb1c228f79 ("mm/gup: introduce pin_user_pages*() and FOLL_PIN")
3c7470b6f684 ("media/v4l2-core: set pages dirty upon releasing DMA buffers")
f4000fdf435b ("mm/gup: allow FOLL_FORCE for get_user_pages_fast()")
3567813eae5e ("vfio: fix FOLL_LONGTERM use, simplify get_user_pages_remote() call")
c4237f8b1f4f ("mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM")
a707cdd55f0f ("mm/gup: move try_get_compound_head() to top, fix minor issues")
a43e982082c2 ("mm/gup: factor out duplicate code from four routines")
fac0516b5534 ("mm: thp: don't need care deferred split queue in memcg charge move path")
f1fe80d4ae33 ("mm, thp: do not queue fully unmapped pages for deferred split")
acbfb087e3b1 ("mm/hugetlb: avoid looping to the same hugepage if !pages and !vmas")
867e5e1de14b ("mm: clean up and clarify lruvec lookup procedure")
242c37b459ce ("include/linux/memcontrol.h: fix comments based on per-node memcg")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f1796544a0ca0f14386a679d3d05fbc69235015e Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Fri, 4 Sep 2020 16:35:24 -0700
Subject: [PATCH] memcg: fix use-after-free in uncharge_batch
syzbot has reported a use-after-free in the uncharge_batch path:
BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline]
BUG: KASAN: use-after-free in atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
BUG: KASAN: use-after-free in atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
BUG: KASAN: use-after-free in page_counter_cancel mm/page_counter.c:54 [inline]
BUG: KASAN: use-after-free in page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
Write of size 8 at addr ffff8880371c0148 by task syz-executor.0/9304

CPU: 0 PID: 9304 Comm: syz-executor.0 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1f0/0x31e lib/dump_stack.c:118
 print_address_description+0x66/0x620 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report+0x132/0x1d0 mm/kasan/report.c:530
 check_memory_region_inline mm/kasan/generic.c:183 [inline]
 check_memory_region+0x2b5/0x2f0 mm/kasan/generic.c:192
 instrument_atomic_write include/linux/instrumented.h:71 [inline]
 atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
 atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
 page_counter_cancel mm/page_counter.c:54 [inline]
 page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
 uncharge_batch+0x6c/0x350 mm/memcontrol.c:6764
 uncharge_page+0x115/0x430 mm/memcontrol.c:6796
 uncharge_list mm/memcontrol.c:6835 [inline]
 mem_cgroup_uncharge_list+0x70/0xe0 mm/memcontrol.c:6877
 release_pages+0x13a2/0x1550 mm/swap.c:911
 tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
 tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249
 tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:328
 exit_mmap+0x296/0x550 mm/mmap.c:3185
 __mmput+0x113/0x370 kernel/fork.c:1076
 exit_mm+0x4cd/0x550 kernel/exit.c:483
 do_exit+0x576/0x1f20 kernel/exit.c:793
 do_group_exit+0x161/0x2d0 kernel/exit.c:903
 get_signal+0x139b/0x1d30 kernel/signal.c:2743
 arch_do_signal+0x33/0x610 arch/x86/kernel/signal.c:811
 exit_to_user_mode_loop kernel/entry/common.c:135 [inline]
 exit_to_user_mode_prepare+0x8d/0x1b0 kernel/entry/common.c:166
 syscall_exit_to_user_mode+0x5e/0x1a0 kernel/entry/common.c:241
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Commit 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting") reworked the memcg lifetime to be bound to the struct page rather than charges. It also removed the css_put_many from uncharge_batch and that is causing the above splat.
uncharge_batch() is supposed to uncharge accumulated charges for all pages freed from the same memcg. The queuing is done by uncharge_page which however drops the memcg reference after it adds charges to the batch. If the current page happens to be the last one holding the reference for its memcg then the memcg is OK to go and the next page to be freed will trigger batched uncharge which needs to access the memcg which is gone already.
Fix the issue by taking a reference for the memcg in the current batch.
Fixes: 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting")
Reported-by: syzbot+b305848212deec86eabe@syzkaller.appspotmail.com
Reported-by: syzbot+b5ea6fb6f139c8b9482b@syzkaller.appspotmail.com
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Hugh Dickins <hughd@google.com>
Link: https://lkml.kernel.org/r/20200820090341.GC5033@dhcp22.suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b807952b4d43..cfa6cbad21d5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6774,6 +6774,9 @@ static void uncharge_batch(const struct uncharge_gather *ug)
 	__this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages);
 	memcg_check_events(ug->memcg, ug->dummy_page);
 	local_irq_restore(flags);
+
+	/* drop reference from uncharge_page */
+	css_put(&ug->memcg->css);
 }
 
 static void uncharge_page(struct page *page, struct uncharge_gather *ug)
@@ -6797,6 +6800,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
 			uncharge_gather_clear(ug);
 		}
 		ug->memcg = page->mem_cgroup;
+
+		/* pairs with css_put in uncharge_batch */
+		css_get(&ug->memcg->css);
 	}
 
 	nr_pages = compound_nr(page);
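
For readers following along, the lifetime rule described in the commit message can be modelled in plain userspace C. This is only a sketch under assumptions, not kernel code: struct group, struct batch, group_get(), group_put(), batch_add_page(), batch_flush() and model_demo() are hypothetical stand-ins for the memcg, the uncharge_gather batch, css_get()/css_put(), uncharge_page() and uncharge_batch(), and a "freed" flag replaces the real free so the state can still be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: a refcount plus a charge counter. */
struct group {
	int refs;	/* models the css reference count */
	long charged;	/* models the page counter */
	int freed;	/* set instead of freeing, so the model stays inspectable */
};

/* Hypothetical stand-in for struct uncharge_gather. */
struct batch {
	struct group *g;
	long nr_pages;
};

static void group_get(struct group *g)
{
	g->refs++;
}

static void group_put(struct group *g)
{
	if (--g->refs == 0)
		g->freed = 1;	/* real code would release the object here */
}

/* Uncharge everything queued, then drop the reference held by the batch. */
static void batch_flush(struct batch *b)
{
	if (!b->g)
		return;
	assert(!b->g->freed);	/* touching a freed group is the reported bug */
	b->g->charged -= b->nr_pages;
	group_put(b->g);	/* pairs with group_get() in batch_add_page() */
	b->g = NULL;
	b->nr_pages = 0;
}

/* Queue one page; the page's own reference is dropped once it is queued. */
static void batch_add_page(struct batch *b, struct group *page_group)
{
	if (b->g != page_group) {
		batch_flush(b);
		b->g = page_group;
		group_get(b->g);	/* keeps the group alive until the flush */
	}
	b->nr_pages++;
	group_put(page_group);
}

/* Two pages charged to one group; returns 0 if the group outlives the batch. */
int model_demo(void)
{
	struct group g = { .refs = 2, .charged = 2, .freed = 0 };
	struct batch b = { NULL, 0 };

	batch_add_page(&b, &g);
	batch_add_page(&b, &g);
	if (g.freed)	/* would be true here without group_get() */
		return -1;
	batch_flush(&b);
	return (g.freed && g.charged == 0) ? 0 : -1;
}
```

In this model, removing the group_get() call lets the second batch_add_page() drop the last reference while the batch still points at the group, so the assert in batch_flush() fires. That corresponds to the situation the commit message describes.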
Why is this applied to 5.4?
$ git describe-ver 1a3e1f40962c
v5.9-rc1~97^2~97
I do not see 1a3e1f40962c in 5.4 stable tree. What am I missing?
On Tue 27-02-24 14:12:00, Greg KH wrote: [...]
> Fixes: 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting")
> Reported-by: syzbot+b305848212deec86eabe@syzkaller.appspotmail.com
> Reported-by: syzbot+b5ea6fb6f139c8b9482b@syzkaller.appspotmail.com
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Reviewed-by: Shakeel Butt <shakeelb@google.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Hugh Dickins <hughd@google.com>
> Link: https://lkml.kernel.org/r/20200820090341.GC5033@dhcp22.suse.cz
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index b807952b4d43..cfa6cbad21d5 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6774,6 +6774,9 @@ static void uncharge_batch(const struct uncharge_gather *ug)
>  	__this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages);
>  	memcg_check_events(ug->memcg, ug->dummy_page);
>  	local_irq_restore(flags);
> +
> +	/* drop reference from uncharge_page */
> +	css_put(&ug->memcg->css);
>  }
>  
>  static void uncharge_page(struct page *page, struct uncharge_gather *ug)
> @@ -6797,6 +6800,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
>  			uncharge_gather_clear(ug);
>  		}
>  		ug->memcg = page->mem_cgroup;
> +
> +		/* pairs with css_put in uncharge_batch */
> +		css_get(&ug->memcg->css);
>  	}
>  
>  	nr_pages = compound_nr(page);
On Tue, Feb 27, 2024 at 02:29:12PM +0100, Michal Hocko wrote:
> Why is this applied to 5.4?
> $ git describe-ver 1a3e1f40962c
> v5.9-rc1~97^2~97
>
> I do not see 1a3e1f40962c in 5.4 stable tree. What am I missing?
It is queued up for this next round of releases in the 5.4.y and 4.19.y trees.
thanks,
greg k-h
On Tue 27-02-24 14:32:20, Greg KH wrote:
> On Tue, Feb 27, 2024 at 02:29:12PM +0100, Michal Hocko wrote:
> > Why is this applied to 5.4?
> > $ git describe-ver 1a3e1f40962c
> > v5.9-rc1~97^2~97
> >
> > I do not see 1a3e1f40962c in 5.4 stable tree. What am I missing?
>
> It is queued up for this next round of releases in the 5.4.y and 4.19.y trees.
OK, now I remember the partial backport of 1a3e1f40962c (http://lkml.kernel.org/r/20240222030237.82486-1-gongruiqi1@huawei.com) but I need to have a look whether the follow up patch is really needed.
On Tue 27-02-24 16:49:50, Michal Hocko wrote:
> On Tue 27-02-24 14:32:20, Greg KH wrote:
> > On Tue, Feb 27, 2024 at 02:29:12PM +0100, Michal Hocko wrote:
> > > Why is this applied to 5.4?
> > > $ git describe-ver 1a3e1f40962c
> > > v5.9-rc1~97^2~97
> > >
> > > I do not see 1a3e1f40962c in 5.4 stable tree. What am I missing?
> >
> > It is queued up for this next round of releases in the 5.4.y and 4.19.y trees.
>
> OK, now I remember the partial backport of 1a3e1f40962c (http://lkml.kernel.org/r/20240222030237.82486-1-gongruiqi1@huawei.com) but I need to have a look whether the follow up patch is really needed.
AFAICS f1796544a0ca ("memcg: fix use-after-free in uncharge_batch") is only needed if the full 1a3e1f40962c is backported. The one staged for 5.4 shouldn't need a follow up as it only touches the pcp cache. I would feel safer if other maintainers double check my thinking though.
Thanks
linux-stable-mirror@lists.linaro.org