5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wonhyuk Yang <vvghjk1234@gmail.com>
[ Upstream commit 10e0f7530205799e7e971aba699a7cb3a47456de ]
Currently, the tracepoint mm_page_alloc_zone_locked() doesn't show correct information.

First, when alloc_flags contains ALLOC_HARDER/ALLOC_CMA, the page can be allocated from MIGRATE_HIGHATOMIC/MIGRATE_CMA. Nevertheless, the tracepoint reports the requested migration type, not MIGRATE_HIGHATOMIC or MIGRATE_CMA.
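For example, in rmqueue() before this patch, a page drawn from the highatomic reserve was traced with the caller's requested type (the lines removed by the third hunk below, reproduced here for illustration):

	/* Old code (removed below): the page comes from
	 * MIGRATE_HIGHATOMIC, yet the tracepoint reports the
	 * requested 'migratetype'.
	 */
	if (order > 0 && alloc_flags & ALLOC_HARDER) {
		page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
		if (page)
			trace_mm_page_alloc_zone_locked(page, order, migratetype);
	}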
Second, since commit 44042b4498728 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists"), per-cpu lists can store high-order pages. But the tracepoint determines whether an allocation is a refill of a per-cpu list by comparing the requested order against 0.
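For reference, eligibility for the per-cpu lists is decided by pcp_allowed_order(), used in the first hunk below; a simplified sketch of that upstream helper (paraphrased here, not part of this patch):

	static inline bool pcp_allowed_order(unsigned int order)
	{
		if (order <= PAGE_ALLOC_COSTLY_ORDER)	/* orders 0..3 */
			return true;
	#ifdef CONFIG_TRANSPARENT_HUGEPAGE
		if (order == pageblock_order)		/* THP-sized pages */
			return true;
	#endif
		return false;
	}

So a high-order allocation can also be a per-cpu-list refill, and "order == 0" is no longer an accurate test.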
To handle these problems, make mm_page_alloc_zone_locked() be called only by __rmqueue_smallest(), with the correct migration type. A new argument, percpu_refill, shows roughly whether the allocation is a refill of a per-cpu list.
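With the new argument, a trace line looks like this (field values below are made up for illustration):

	mm_page_alloc_zone_locked: page=00000000e15eccf0 pfn=0x14aa00 order=3 migratetype=1 percpu_refill=1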
Link: https://lkml.kernel.org/r/20220512025307.57924-1-vvghjk1234@gmail.com
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Baik Song An <bsahn@etri.re.kr>
Cc: Hong Yeon Kim <kimhy@etri.re.kr>
Cc: Taeung Song <taeung@reallinux.co.kr>
Cc: <linuxgeek@linuxgeek.io>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 281dd25c1a01 ("mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/trace/events/kmem.h | 14 +++++++++-----
 mm/page_alloc.c             | 13 +++++--------
 2 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index ddc8c944f417a..f89fb3afcd46a 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -229,20 +229,23 @@ TRACE_EVENT(mm_page_alloc,
 
 DECLARE_EVENT_CLASS(mm_page,
 
-	TP_PROTO(struct page *page, unsigned int order, int migratetype),
+	TP_PROTO(struct page *page, unsigned int order, int migratetype,
+		 int percpu_refill),
 
-	TP_ARGS(page, order, migratetype),
+	TP_ARGS(page, order, migratetype, percpu_refill),
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	pfn		)
 		__field(	unsigned int,	order		)
 		__field(	int,		migratetype	)
+		__field(	int,		percpu_refill	)
 	),
 
 	TP_fast_assign(
 		__entry->pfn		= page ? page_to_pfn(page) : -1UL;
 		__entry->order		= order;
 		__entry->migratetype	= migratetype;
+		__entry->percpu_refill	= percpu_refill;
 	),
 
 	TP_printk("page=%p pfn=0x%lx order=%u migratetype=%d percpu_refill=%d",
@@ -250,14 +253,15 @@ DECLARE_EVENT_CLASS(mm_page,
 		__entry->pfn != -1UL ? __entry->pfn : 0,
 		__entry->order,
 		__entry->migratetype,
-		__entry->order == 0)
+		__entry->percpu_refill)
 );
 
 DEFINE_EVENT(mm_page, mm_page_alloc_zone_locked,
 
-	TP_PROTO(struct page *page, unsigned int order, int migratetype),
+	TP_PROTO(struct page *page, unsigned int order, int migratetype,
+		 int percpu_refill),
 
-	TP_ARGS(page, order, migratetype)
+	TP_ARGS(page, order, migratetype, percpu_refill)
 );
 
 TRACE_EVENT(mm_page_pcpu_drain,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 474150584ba48..264cb1914ab5b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2461,6 +2461,9 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
 		del_page_from_free_list(page, zone, current_order);
 		expand(zone, page, order, current_order, migratetype);
 		set_pcppage_migratetype(page, migratetype);
+		trace_mm_page_alloc_zone_locked(page, order, migratetype,
+				pcp_allowed_order(order) &&
+				migratetype < MIGRATE_PCPTYPES);
 		return page;
 	}
 
@@ -2988,7 +2991,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 				zone_page_state(zone, NR_FREE_PAGES) / 2) {
 			page = __rmqueue_cma_fallback(zone, order);
 			if (page)
-				goto out;
+				return page;
 		}
 	}
 retry:
@@ -3001,9 +3004,6 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 								alloc_flags))
 			goto retry;
 	}
-out:
-	if (page)
-		trace_mm_page_alloc_zone_locked(page, order, migratetype);
 	return page;
 }
 
@@ -3708,11 +3708,8 @@ struct page *rmqueue(struct zone *preferred_zone,
 		 * reserved for high-order atomic allocation, so order-0
 		 * request should skip it.
 		 */
-		if (order > 0 && alloc_flags & ALLOC_HARDER) {
+		if (order > 0 && alloc_flags & ALLOC_HARDER)
 			page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
-			if (page)
-				trace_mm_page_alloc_zone_locked(page, order, migratetype);
-		}
 		if (!page) {
 			page = __rmqueue(zone, order, migratetype, alloc_flags);
 			if (!page)