On Tue, Sep 12, 2023 at 03:31:16PM +0000, Pillai, Aurabindo wrote:
> [AMD Official Use Only - General]
>
> Hi Greg,
>
> It was reverted but has been re-applied.
>
> Here is a chronological summary of what happened:
>
>
> 1. Michel bisected some major issues to "drm/amd/display: Do not set drr on pipe commit", and that patch was reverted upstream. Along with it, "drm/amd/display: Block optimize on consecutive FAMS enables" was also reverted because it depends on the first patch.
> 2. We found that reverting these patches caused some multi monitor configurations to hang on RDNA3.
> 3. We debugged Michel's issue and merged a workaround (https://gitlab.freedesktop.org/agd5f/linux/-/commit/cc225c8af276396c3379885…
> 4. Subsequently, the two patches were reapplied (https://gitlab.freedesktop.org/agd5f/linux/-/commit/bfe1b43c1acee1251ddb091… and https://gitlab.freedesktop.org/agd5f/linux/-/commit/f3c2a89c5103b4ffdd88f09…)
>
> Hence, the stable kernel should have all 3 patches - the workaround and 2 others. Hope that clarifies the situation.
Great, what are the ids of those in Linus's tree?
thanks,
greg k-h
On 23-06-22 04:49 pm, Mathias Nyman wrote:
> From: Hongyu Xie <xy521521(a)gmail.com>
>
> The IRQ is disabled in xhci_quiesce() (called by xhci_halt(), with bit 2
> cleared in the USBCMD register), but xhci_run() (called by usb_add_hcd())
> re-enables it. It is possible to receive thousands of interrupt requests
> after initialization of the 2.0 roothub, along with many warnings like
> "xHCI dying, ignoring interrupt. Shouldn't IRQs be disabled?".
> This volume of interrupt requests will cause the entire system to
> freeze.
> This problem was first found on a device with an ASM2142 host
> controller.
>
> [tidy up old code while moving it, reword header -Mathias]
> Cc: stable(a)kernel.org
> Signed-off-by: Hongyu Xie <xiehongyu1(a)kylinos.cn>
> Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
> ---
> drivers/usb/host/xhci.c | 35 ++++++++++++++++++++++-------------
> 1 file changed, 22 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 9ac56e9ffc64..cb99bed5f755 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -611,15 +611,37 @@ static int xhci_init(struct usb_hcd *hcd)
>
>  static int xhci_run_finished(struct xhci_hcd *xhci)
>  {
> +	unsigned long flags;
> +	u32 temp;
> +
> +	/*
> +	 * Enable interrupts before starting the host (xhci 4.2 and 5.5.2).
> +	 * Protect the short window before host is running with a lock
> +	 */
> +	spin_lock_irqsave(&xhci->lock, flags);
> +
> +	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "Enable interrupts");
> +	temp = readl(&xhci->op_regs->command);
> +	temp |= (CMD_EIE);
> +	writel(temp, &xhci->op_regs->command);
> +
> +	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "Enable primary interrupter");
> +	temp = readl(&xhci->ir_set->irq_pending);
> +	writel(ER_IRQ_ENABLE(temp), &xhci->ir_set->irq_pending);
> +
>  	if (xhci_start(xhci)) {
>  		xhci_halt(xhci);
> +		spin_unlock_irqrestore(&xhci->lock, flags);
>  		return -ENODEV;
>  	}
> +
>  	xhci->cmd_ring_state = CMD_RING_STATE_RUNNING;
>
>  	if (xhci->quirks & XHCI_NEC_HOST)
>  		xhci_ring_cmd_db(xhci);
>
> +	spin_unlock_irqrestore(&xhci->lock, flags);
> +
>  	return 0;
>  }
>
> @@ -668,19 +690,6 @@ int xhci_run(struct usb_hcd *hcd)
>  	temp |= (xhci->imod_interval / 250) & ER_IRQ_INTERVAL_MASK;
>  	writel(temp, &xhci->ir_set->irq_control);
>
> -	/* Set the HCD state before we enable the irqs */
> -	temp = readl(&xhci->op_regs->command);
> -	temp |= (CMD_EIE);
> -	xhci_dbg_trace(xhci, trace_xhci_dbg_init,
> -			"// Enable interrupts, cmd = 0x%x.", temp);
> -	writel(temp, &xhci->op_regs->command);
> -
> -	temp = readl(&xhci->ir_set->irq_pending);
> -	xhci_dbg_trace(xhci, trace_xhci_dbg_init,
> -			"// Enabling event ring interrupter %p by writing 0x%x to irq_pending",
> -			xhci->ir_set, (unsigned int) ER_IRQ_ENABLE(temp));
> -	writel(ER_IRQ_ENABLE(temp), &xhci->ir_set->irq_pending);
> -
>  	if (xhci->quirks & XHCI_NEC_HOST) {
>  		struct xhci_command *command;
>
This is not available in older kernels [< 5.19]. Can we get this
backported to 5.15 as well? Please let me know if there is some other
way to do it.
Cc: <stable(a)vger.kernel.org> # 5.15
Thanks,
Prashanth K
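For readers following the backport request, the ordering the quoted patch establishes can be sketched in userspace. This is only a rough model under stated assumptions: the MODEL_ names and struct model_hc are hypothetical, a plain variable stands in for the memory-mapped USBCMD register, and the spinlock that the real patch takes around this window is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions follow the USBCMD layout; the MODEL_ names are hypothetical. */
#define MODEL_CMD_RUN (1u << 0) /* Run/Stop */
#define MODEL_CMD_EIE (1u << 2) /* interrupter enable, the "bit 2" above */

/* Hypothetical stand-in for the memory-mapped command register. */
struct model_hc {
	uint32_t command;
};

/* Models the quiesce step: stop the controller and mask its interrupt. */
static void model_quiesce(struct model_hc *hc)
{
	assert(hc != NULL);
	hc->command &= ~(MODEL_CMD_RUN | MODEL_CMD_EIE);
}

/* Models the patched ordering: enable the interrupt, then start right away. */
static void model_run_finished(struct model_hc *hc)
{
	uint32_t temp;

	assert(hc != NULL);

	temp = hc->command;     /* read */
	temp |= MODEL_CMD_EIE;  /* modify */
	hc->command = temp;     /* write back */

	hc->command |= MODEL_CMD_RUN;
}
```

The point of the sketch is only that the interrupt-enable bit is set in the same step that starts the controller, rather than earlier during setup while the host is still halted.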
From: Tero Kristo <tero.kristo(a)linux.intel.com>
The synth traces incorrectly print a pointer to the synthetic event value
instead of the actual value when using the u64 type. Fix by addressing the
contents of the union properly.
Link: https://lore.kernel.org/linux-trace-kernel/20230911141704.3585965-1-tero.kr…
Fixes: ddeea494a16f ("tracing/synthetic: Use union instead of casts")
Cc: stable(a)vger.kernel.org
Signed-off-by: Tero Kristo <tero.kristo(a)linux.intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_events_synth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index 9897d0bfcab7..14cb275a0bab 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -337,7 +337,7 @@ static void print_synth_event_num_val(struct trace_seq *s,
 		break;
 	default:
-		trace_seq_printf(s, print_fmt, name, val, space);
+		trace_seq_printf(s, print_fmt, name, val->as_u64, space);
 		break;
 	}
 }
--
2.40.1
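The one-line change above can be illustrated outside the kernel. The following is a minimal sketch, assuming a hypothetical union model_synth_field that stands in for the kernel's synthetic-field union and snprintf in place of trace_seq_printf; it shows the fixed form, where the formatter reads the 64-bit member rather than being handed the pointer itself.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's synthetic-field union. */
union model_synth_field {
	uint8_t  as_u8;
	uint16_t as_u16;
	uint32_t as_u32;
	uint64_t as_u64;
};

/* Formats "name=value" from the union's 64-bit member, not from the pointer. */
static int model_format_u64(char *buf, size_t len, const char *name,
			    const union model_synth_field *val)
{
	assert(buf != NULL && name != NULL && val != NULL);
	return snprintf(buf, len, "%s=%" PRIu64, name, val->as_u64);
}
```

Passing val instead of val->as_u64 to a variadic formatter is not caught as a type error in the kernel case because the format string is not a literal, which is presumably how the pointer ended up in the trace output.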
The patch titled
Subject: mm: page_alloc: free pages to correct buddy list after PCP lock contention
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-page_alloc-free-pages-to-correct-buddy-list-after-pcp-lock-contention.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mel Gorman <mgorman(a)techsingularity.net>
Subject: mm: page_alloc: free pages to correct buddy list after PCP lock contention
Date: Tue, 5 Sep 2023 10:09:22 +0100
Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
returns pages to the buddy list on PCP lock contention. However, for
migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
been clobbered already for pages that are not being isolated. In
practice, this means that CMA pages may be returned to the wrong
buddy list. While this might be harmless in some cases as it is
MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
and prevent a future CMA allocation. Look up the PCP migratetype
again unconditionally if the PCP lock is contended.
[lecopzer.chen(a)mediatek.com: CMA-specific fix]
Link: https://lkml.kernel.org/r/20230905090922.zy7srh33rg5c3zao@techsingularity.n…
Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Signed-off-by: Mel Gorman <mgorman(a)techsingularity.net>
Reported-by: Joe Liu <joe.liu(a)mediatek.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-page_alloc-free-pages-to-correct-buddy-list-after-pcp-lock-contention
+++ a/mm/page_alloc.c
@@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page,
 		free_unref_page_commit(zone, pcp, page, migratetype, order);
 		pcp_spin_unlock(pcp);
 	} else {
-		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
+		/*
+		 * The page migratetype may have been clobbered for types
+		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
+		 * must be rechecked.
+		 */
+		free_one_page(zone, page, pfn, order,
+			      get_pcppage_migratetype(page), FPI_NONE);
 	}
 	pcp_trylock_finish(UP_flags);
 }
_
Patches currently in -mm which might be from mgorman(a)techsingularity.net are
mm-page_alloc-free-pages-to-correct-buddy-list-after-pcp-lock-contention.patch
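The clobbering described in the changelog can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel code: the MODEL_ enum values, struct model_page, and model_free_target() are hypothetical names, and the recheck flag merely switches between the pre-patch and post-patch behaviour on the contended path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the values are illustrative, not the kernel's. */
enum model_migratetype {
	MODEL_UNMOVABLE,
	MODEL_MOVABLE,
	MODEL_RECLAIMABLE,
	MODEL_PCPTYPES,	/* count of types that have their own PCP list */
	MODEL_CMA,
	MODEL_ISOLATE,
};

/* Hypothetical page carrying the migratetype recorded when the free began. */
struct model_page {
	enum model_migratetype cached_type;
};

/* Returns the list type the page would be freed to in this model. */
static enum model_migratetype
model_free_target(const struct model_page *page, bool pcp_locked, bool recheck)
{
	enum model_migratetype mt;

	assert(page != NULL);
	mt = page->cached_type;

	if (mt >= MODEL_PCPTYPES) {
		if (mt == MODEL_ISOLATE)
			return mt;	/* isolated pages bypass the PCP lists */
		mt = MODEL_MOVABLE;	/* local copy clobbered for PCP use */
	}

	if (pcp_locked)
		return mt;		/* page goes onto the PCP list for mt */

	/* Contended lock: free straight to the buddy lists. */
	return recheck ? page->cached_type : mt;
}
```

With recheck false, a CMA page on the contended path comes back as MODEL_MOVABLE, which is the wrong-list case the changelog describes; with recheck true it is returned as MODEL_CMA.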
The quilt patch titled
Subject: mm: page_alloc: free pages to correct buddy list after PCP lock contention
has been removed from the -mm tree. Its filename was
mm-page_alloc-free-pages-to-correct-buddy-list-after-pcp-lock-contention.patch
This patch was dropped because an alternative patch was or shall be merged
------------------------------------------------------
From: Mel Gorman <mgorman(a)techsingularity.net>
Subject: mm: page_alloc: free pages to correct buddy list after PCP lock contention
Date: Tue, 5 Sep 2023 10:09:22 +0100
Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
returns pages to the buddy list on PCP lock contention. However, for
migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
been clobbered already for pages that are not being isolated. In
practice, this means that CMA pages may be returned to the wrong
buddy list. While this might be harmless in some cases as it is
MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
and prevent a future CMA allocation. Look up the PCP migratetype
again unconditionally if the PCP lock is contended.
[lecopzer.chen(a)mediatek.com: CMA-specific fix]
Link: https://lkml.kernel.org/r/20230905090922.zy7srh33rg5c3zao@techsingularity.n…
Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Signed-off-by: Mel Gorman <mgorman(a)techsingularity.net>
Reported-by: Joe Liu <joe.liu(a)mediatek.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-page_alloc-free-pages-to-correct-buddy-list-after-pcp-lock-contention
+++ a/mm/page_alloc.c
@@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page,
 		free_unref_page_commit(zone, pcp, page, migratetype, order);
 		pcp_spin_unlock(pcp);
 	} else {
-		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
+		/*
+		 * The page migratetype may have been clobbered for types
+		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
+		 * must be rechecked.
+		 */
+		free_one_page(zone, page, pfn, order,
+			      get_pcppage_migratetype(page), FPI_NONE);
 	}
 	pcp_trylock_finish(UP_flags);
 }
_
Patches currently in -mm which might be from mgorman(a)techsingularity.net are
On Tue, Sep 12, 2023 at 03:10:46PM +0000, Pillai, Aurabindo wrote:
> [AMD Official Use Only - General]
>
> Hi Greg,
>
> NAK on this revert. This would cause hangs on multi monitor configurations on recent asics. The original issue that required this patch to be reverted was fixed through a monitor specific workaround (https://gitlab.freedesktop.org/agd5f/linux/-/commit/cc225c8af276396c3379885…)
>
But this revert is upstream. So should the revert be reverted?
confused,
greg k-h