[Why]
This hasn't been well tested and leads to complete system hangs on DCN1
based systems, possibly others.
The system hang can be reproduced by gesturing the video on the YouTube
Android app on ChromeOS into full screen.
[How]
Reject atomic commits with non-zero drm_plane_state.src_x or src_y values.
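The check described above can be sketched in plain userspace C. This is
only an illustration, not the driver code: struct plane_src and
check_nv12_src_offset() are hypothetical names, and the only detail
taken from the patch is that src_x/src_y are 16.16 fixed-point values
whose integer part is compared against zero.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of drm_plane_state used here. */
struct plane_src {
	uint32_t src_x;   /* 16.16 fixed point */
	uint32_t src_y;   /* 16.16 fixed point */
	bool is_nv12;     /* framebuffer format is NV12 */
};

/* Reject NV12 planes whose integer source offset is non-zero. */
static int check_nv12_src_offset(const struct plane_src *s)
{
	uint32_t x = s->src_x >> 16;   /* drop the fractional 16 bits */
	uint32_t y = s->src_y >> 16;

	if (s->is_nv12 && (x != 0 || y != 0))
		return -EINVAL;

	return 0;
}
```

Because the comparison happens after the shift, a purely fractional
offset (for example 0x8000, i.e. half a pixel) is not rejected in this
sketch.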
v2:
- Add code comment describing the reason we're rejecting non-zero
src_x and src_y
- Drop gerrit Change-Id
- Add stable CC
- Based on amd-staging-drm-next
v3: removed trailing whitespace
Signed-off-by: Harry Wentland <harry.wentland(a)amd.com>
Cc: stable(a)vger.kernel.org
Cc: nicholas.kazlauskas(a)amd.com
Cc: amd-gfx(a)lists.freedesktop.org
Cc: alexander.deucher(a)amd.com
Cc: Roman.Li(a)amd.com
Cc: hersenxs.wu(a)amd.com
Cc: danny.wang(a)amd.com
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
---
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index be1769d29742..aeedc5a3fb36 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4089,6 +4089,23 @@ static int fill_dc_scaling_info(const struct drm_plane_state *state,
scaling_info->src_rect.x = state->src_x >> 16;
scaling_info->src_rect.y = state->src_y >> 16;
+ /*
+ * For reasons we don't (yet) fully understand a non-zero
+ * src_y coordinate into an NV12 buffer can cause a
+ * system hang. To avoid hangs (and maybe be overly cautious)
+ * let's reject both non-zero src_x and src_y.
+ *
+ * We currently know of only one use-case to reproduce a
+ * scenario with non-zero src_x and src_y for NV12, which
+ * is to gesture the YouTube Android app into full screen
+ * on ChromeOS.
+ */
+ if (state->fb &&
+ state->fb->format->format == DRM_FORMAT_NV12 &&
+ (scaling_info->src_rect.x != 0 ||
+ scaling_info->src_rect.y != 0))
+ return -EINVAL;
+
scaling_info->src_rect.width = state->src_w >> 16;
if (scaling_info->src_rect.width == 0)
return -EINVAL;
--
2.31.1
We found this problem in our kernel src tree:
[ 14.816231] ------------[ cut here ]------------
[ 14.816231] kernel BUG at irq.c:99!
[ 14.816232] Internal error: Oops - BUG: 0 [#1] SMP
[ 14.816232] Process swapper/0 (pid: 0, stack limit = 0x(____ptrval____))
[ 14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.19.95.aarch64 #14
[ 14.816233] Hardware name: evb (DT)
[ 14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[ 14.816234] pc : asm_nmi_enter+0x94/0x98
[ 14.816235] lr : asm_nmi_enter+0x18/0x98
[ 14.816235] sp : ffff000008003c50
[ 14.816235] pmr_save: 00000070
[ 14.816237] x29: ffff000008003c50 x28: ffff0000095f56c0
[ 14.816238] x27: 0000000000000000 x26: ffff000008004000
[ 14.816239] x25: 00000000015e0000 x24: ffff8008fb916000
[ 14.816240] x23: 0000000020400005 x22: ffff0000080817cc
[ 14.816241] x21: ffff000008003da0 x20: 0000000000000060
[ 14.816242] x19: 00000000000003ff x18: ffffffffffffffff
[ 14.816243] x17: 0000000000000008 x16: 003d090000000000
[ 14.816244] x15: ffff0000095ea6c8 x14: ffff8008fff5ab40
[ 14.816244] x13: ffff8008fff58b9d x12: 0000000000000000
[ 14.816245] x11: ffff000008c8a200 x10: 000000008e31fca5
[ 14.816246] x9 : ffff000008c8a208 x8 : 000000000000000f
[ 14.816247] x7 : 0000000000000004 x6 : ffff8008fff58b9e
[ 14.816248] x5 : 0000000000000000 x4 : 0000000080000000
[ 14.816249] x3 : 0000000000000000 x2 : 0000000080000000
[ 14.816250] x1 : 0000000000120000 x0 : ffff0000095f56c0
[ 14.816251] Call trace:
[ 14.816251] asm_nmi_enter+0x94/0x98
[ 14.816251] el1_irq+0x8c/0x180 (IRQ C)
[ 14.816252] gic_handle_irq+0xbc/0x2e4
[ 14.816252] el1_irq+0xcc/0x180 (IRQ B)
[ 14.816253] arch_timer_handler_virt+0x38/0x58
[ 14.816253] handle_percpu_devid_irq+0x90/0x240
[ 14.816253] generic_handle_irq+0x34/0x50
[ 14.816254] __handle_domain_irq+0x68/0xc0
[ 14.816254] gic_handle_irq+0xf8/0x2e4
[ 14.816255] el1_irq+0xcc/0x180 (IRQ A)
[ 14.816255] arch_cpu_idle+0x34/0x1c8
[ 14.816255] default_idle_call+0x24/0x44
[ 14.816256] do_idle+0x1d0/0x2c8
[ 14.816256] cpu_startup_entry+0x28/0x30
[ 14.816256] rest_init+0xb8/0xc8
[ 14.816257] start_kernel+0x4c8/0x4f4
[ 14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d4210000)
[ 14.816258] Modules linked in: start_dp(O) smeth(O)
[ 15.103092] ---[ end trace 701753956cb14aa8 ]---
[ 15.103093] Kernel panic - not syncing: Fatal exception in interrupt
[ 15.103099] SMP: stopping secondary CPUs
[ 15.103100] Kernel Offset: disabled
[ 15.103100] CPU features: 0x36,a2400218
[ 15.103100] Memory Limit: none
I looked into this issue and found that it is caused by
'BUG_ON(in_nmi())' in nmi_enter(). The call trace shows three
interrupts, which I have marked as IRQ A, B and C. By adding some
prints, I found that IRQ B also calls nmi_enter(), but its priority is
not GICD_INT_NMI_PRI and its irq number is 1023. It enables irqs by
calling gic_arch_enable_irqs() in gic_handle_irq(). At this moment,
IRQ C, which is an NMI, preempts IRQ B while the current context is
already in NMI. That appears to be the problem.
When handling spurious interrupts, we shouldn't enable irqs. For a
spurious interrupt we may enter NMI context in el1_irq() because the
current PMR may be GIC_PRIO_IRQOFF. If we enable irqs at this point,
another NMI may arrive and preempt the spurious interrupt while the
context is already in NMI. That triggers the BUG_ON if nested NMIs are
not supported. Even where nested NMIs are supported, this is not a
normal scenario.
Though the issue was reported on our private tree, I think it also
exists on the latest tree for the reasons above. To fix this issue,
check for spurious interrupts right after the read of ICC_IAR1_EL1 and
return directly for spurious interrupts.
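The effect of moving the check can be sketched with a toy model in
userspace C. This is not the irqchip code: model_unmasks_irqs() and its
parameters are hypothetical, NMI-priority handling is left out, and
"unmasking" merely stands in for the call to gic_arch_enable_irqs().

```c
#include <stdbool.h>
#include <stdint.h>

/* INTIDs 1020-1023 are the special IDs; 1023 is the spurious one. */
static bool is_special_intid(uint32_t irqnr)
{
	return irqnr >= 1020 && irqnr <= 1023;
}

/*
 * Toy model of the early control flow in the handler. Returns true if
 * the model would have unmasked interrupts while handling irqnr.
 * check_special_first selects the ordering proposed by this patch.
 */
static bool model_unmasks_irqs(uint32_t irqnr, bool prio_masking,
			       bool check_special_first)
{
	if (check_special_first && is_special_intid(irqnr))
		return false;	/* bail out before touching the mask */

	/* NMI-priority handling is omitted from this model. */

	if (prio_masking)
		return true;	/* stands in for gic_arch_enable_irqs() */

	return false;
}
```

With the old ordering the model unmasks interrupts even for INTID 1023,
which is the window in which IRQ C could preempt IRQ B. With the new
ordering the special IDs return before the mask is touched.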
Fixes: 17ce302f3117 ("arm64: Fix interrupt tracing in the presence of NMIs")
Signed-off-by: He Ying <heying24(a)huawei.com>
---
v2:
- Move the check right after the read of ICC_IAR1_EL1, as suggested by Marc.
drivers/irqchip/irq-gic-v3.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 94b89258d045..37a23aa6de37 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -648,6 +648,10 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
irqnr = gic_read_iar();
+ /* Check for special IDs first */
+ if ((irqnr >= 1020 && irqnr <= 1023))
+ return;
+
if (gic_supports_nmi() &&
unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
gic_handle_nmi(irqnr, regs);
@@ -659,10 +663,6 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
gic_arch_enable_irqs();
}
- /* Check for special IDs first */
- if ((irqnr >= 1020 && irqnr <= 1023))
- return;
-
if (static_branch_likely(&supports_deactivate_key))
gic_write_eoir(irqnr);
else
--
2.17.1
From: Joerg Roedel <joro(a)sev.home.8bytes.org>
Annotate the firmware files CCP might need using MODULE_FIRMWARE().
This will get them included into an initrd when CCP is also included
there. Otherwise the CCP module will not find its firmware when loaded
before the root-fs is mounted.
This can cause problems when the pre-loaded SEV firmware is too old to
support current SEV and SEV-ES virtualization features.
Cc: stable(a)vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
Resending with correct Signed-off-by.
drivers/crypto/ccp/sev-dev.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index cb9b4c4e371e..9883e3afe10b 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -42,6 +42,9 @@ static int psp_probe_timeout = 5;
module_param(psp_probe_timeout, int, 0644);
MODULE_PARM_DESC(psp_probe_timeout, " default timeout value, in seconds, during PSP device probe");
+MODULE_FIRMWARE("amd/amd_sev_fam17h_model0xh.sbin");
+MODULE_FIRMWARE("amd/amd_sev_fam17h_model3xh.sbin");
+
static bool psp_dead;
static int psp_timeout;
--
2.31.1
We encountered a kernel crash when disabling wbt by setting
min_lat_nsec to zero. We found that resetting wb_max to zero in
calc_wb_limits() breaks the normal scale logic, which causes the
scale_step value to overflow and the kernel to crash. Below is the
crash backtrace:
[43061417.487135] task: ffff9250828d6540 task.stack: ffffbc8b839f0000
[43061417.487331] RIP: 0010:rwb_arm_timer+0x52/0x60
[43061417.487472] RSP: 0000:ffff9250bfec3ea8 EFLAGS: 00010206
[43061417.487646] RAX: 000000005f5e1000 RBX: ffff9250ab6113c0 RCX: 0000000000000000
[43061417.487877] RDX: 0000000000000000 RSI: ffffffff9fe4a484 RDI: 000000005f5e1000
[43061417.488109] RBP: 0000000000000100 R08: ffffffff00000000 R09: 00000000ffffffff
[43061417.488343] R10: 0000000000000000 R11: ffffdc8b3fdcf938 R12: ffff9250a9324d90
[43061417.488575] R13: ffffffff9f3583a0 R14: ffff9250a9324d80 R15: 0000000000000000
[43061417.488808] FS: 00007f7aadbee700(0000) GS:ffff9250bfec0000(0000) knlGS:0000000000000000
[43061417.489069] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[43061417.489258] CR2: 00007f43b7c809b8 CR3: 0000007e42994006 CR4: 00000000007606e0
[43061417.489490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[43061417.489722] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[43061417.489952] PKRU: 55555554
[43061417.490046] Call Trace:
[43061417.490136] <IRQ>
[43061417.490206] call_timer_fn+0x2e/0x130
[43061417.490328] run_timer_softirq+0x1d4/0x420
[43061417.490466] ? timerqueue_add+0x54/0x80
[43061417.490593] ? enqueue_hrtimer+0x38/0x80
[43061417.490722] __do_softirq+0x108/0x2a9
[43061417.490846] irq_exit+0xc2/0xd0
[43061417.490953] smp_apic_timer_interrupt+0x6c/0x120
[43061417.491106] apic_timer_interrupt+0x7d/0x90
[43061417.491245] </IRQ>
As seen from the crash dump, scale_step became a very big value and
overflowed into a zero divisor in div_u64, so the kernel crashed.
Since wbt uses wb_max == 1 and the scaled_max flag as the scale min/max
points, we only reset wb_normal and wb_background when min_lat_nsec is
set to zero, and leave wb_max and scaled_max to be driven by the scale
timer.
Kernels newer than v4.18 include a code refactoring patchset that
splits the scale up/down logic from calc_wb_limits(), so disabling wbt
by setting min_lat_nsec to zero does NOT affect the normal scale logic.
But we don't want to backport that patchset because its very large code
changes may introduce other problems. So this patch just fixes the
crash bug.
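Why a zero wb_max lets scale_step run away can be sketched with a
heavily simplified userspace model. It assumes only what is stated
above: that scaling down stops once wb_max reaches 1. The struct, the
function names and the depth formula are illustrative assumptions, not
the actual blk-wbt code.

```c
#include <stdbool.h>

/* Hypothetical, stripped-down stand-in for struct rq_wb. */
struct wb_model {
	unsigned int queue_depth;
	unsigned int wb_max;
	int scale_step;
	bool disabled;		/* models min_lat_nsec == 0 */
};

/* old_behaviour: reset wb_max to 0 when disabled (pre-patch logic). */
static void model_calc_limits(struct wb_model *m, bool old_behaviour)
{
	unsigned int depth = m->queue_depth;

	if (m->disabled && old_behaviour) {
		m->wb_max = 0;
		return;
	}

	if (m->scale_step > 0) {
		int shift = m->scale_step < 31 ? m->scale_step : 31;

		depth = 1 + ((depth - 1) >> shift);
	}
	m->wb_max = depth;
}

/* Try to scale down 'attempts' times; return the final scale_step. */
static int model_scale_down(struct wb_model *m, bool old_behaviour,
			    int attempts)
{
	model_calc_limits(m, old_behaviour);
	for (int i = 0; i < attempts; i++) {
		if (m->wb_max == 1)	/* floor reached, stop scaling */
			break;
		m->scale_step++;
		model_calc_limits(m, old_behaviour);
	}
	return m->scale_step;
}
```

In this model, once wb_max is pinned at 0 the "wb_max == 1" floor is
never hit, so scale_step grows on every attempt. When wb_max is left
alone it settles at 1 after a few steps and scale_step stays bounded.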
Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism")
Cc: <stable(a)vger.kernel.org> # 4.9.x
Signed-off-by: Chengming Zhou <zhouchengming(a)bytedance.com>
---
block/blk-wbt.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 5c105514bca7..24c84ee39029 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -194,11 +194,6 @@ static bool calc_wb_limits(struct rq_wb *rwb)
unsigned int depth;
bool ret = false;
- if (!rwb->min_lat_nsec) {
- rwb->wb_max = rwb->wb_normal = rwb->wb_background = 0;
- return false;
- }
-
/*
* For QD=1 devices, this is a special case. It's important for those
* to have one request ready when one completes, so force a depth of
@@ -244,6 +239,9 @@ static bool calc_wb_limits(struct rq_wb *rwb)
rwb->wb_background = (rwb->wb_max + 3) / 4;
}
+ if (!rwb->min_lat_nsec)
+ rwb->wb_normal = rwb->wb_background = 0;
+
return ret;
}
--
2.11.0
This is a note to let you know that I've just added the patch titled
usb: dwc3: core: Do core softreset when switch mode
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From f88359e1588b85cf0e8209ab7d6620085f3441d9 Mon Sep 17 00:00:00 2001
From: Yu Chen <chenyu56(a)huawei.com>
Date: Thu, 15 Apr 2021 15:20:30 -0700
Subject: usb: dwc3: core: Do core softreset when switch mode
From: John Stultz <john.stultz(a)linaro.org>
According to the programming guide, to switch mode for DRD controller,
the driver needs to do the following.
To switch from device to host:
1. Reset controller with GCTL.CoreSoftReset
2. Set GCTL.PrtCapDir(host mode)
3. Reset the host with USBCMD.HCRESET
4. Then follow up with the initializing host registers sequence
To switch from host to device:
1. Reset controller with GCTL.CoreSoftReset
2. Set GCTL.PrtCapDir(device mode)
3. Reset the device with DCTL.CSftRst
4. Then follow up with the initializing registers sequence
Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
switching from host to device. John Stultz reported a lockup issue seen
with the HiKey960 platform without these steps[1]. A similar issue is
observed with Ferry's testing platform[2].
So, apply the required steps along with some fixes to Yu Chen's and John
Stultz's version. The main fixes to their versions are adding the
missing wait for clock synchronization before clearing
GCTL.CoreSoftReset and only applying DCTL.CSftRst when switching from
host to device.
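The two sequences from the programming guide can be written down as a
small table-building function. This is a userspace sketch for
illustration only: the enum names and drd_switch_steps() are
hypothetical and do not exist in the driver; only the order of the
steps is taken from the description above.

```c
/* Hypothetical step labels, one per item in the lists above. */
enum drd_step {
	STEP_GCTL_CORE_SOFT_RESET,	/* 1. set, wait for clocks, clear */
	STEP_SET_PRTCAPDIR,		/* 2. GCTL.PrtCapDir */
	STEP_HOST_RESET,		/* 3. USBCMD.HCRESET (host only) */
	STEP_DEVICE_SOFT_RESET,		/* 3. DCTL.CSftRst (device only) */
	STEP_INIT_REGISTERS,		/* 4. register initialization */
};

enum drd_role { ROLE_HOST, ROLE_DEVICE };

/* Fill out[] with the four steps needed to switch to 'target'. */
static void drd_switch_steps(enum drd_role target, enum drd_step out[4])
{
	out[0] = STEP_GCTL_CORE_SOFT_RESET;
	out[1] = STEP_SET_PRTCAPDIR;
	/* Only the third step differs between the two directions. */
	out[2] = (target == ROLE_DEVICE) ? STEP_DEVICE_SOFT_RESET
					 : STEP_HOST_RESET;
	out[3] = STEP_INIT_REGISTERS;
}
```

Keeping the direction-specific reset in a single place mirrors the
second fix mentioned above: DCTL.CSftRst is applied only when switching
from host to device.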
[1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro…
[2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmai…
Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
Cc: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Cc: Ferry Toth <fntoth(a)gmail.com>
Cc: Wesley Cheng <wcheng(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org>
Tested-by: John Stultz <john.stultz(a)linaro.org>
Tested-by: Wesley Cheng <wcheng(a)codeaurora.org>
Signed-off-by: Yu Chen <chenyu56(a)huawei.com>
Signed-off-by: John Stultz <john.stultz(a)linaro.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Link: https://lore.kernel.org/r/374440f8dcd4f06c02c2caf4b1efde86774e02d9.16185216…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
drivers/usb/dwc3/core.h | 5 +++++
2 files changed, 32 insertions(+)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 5c25e6a72dbd..2f118ad43571 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -114,6 +114,8 @@ void dwc3_set_prtcap(struct dwc3 *dwc, u32 mode)
dwc->current_dr_role = mode;
}
+static int dwc3_core_soft_reset(struct dwc3 *dwc);
+
static void __dwc3_set_mode(struct work_struct *work)
{
struct dwc3 *dwc = work_to_dwc(work);
@@ -121,6 +123,8 @@ static void __dwc3_set_mode(struct work_struct *work)
int ret;
u32 reg;
+ mutex_lock(&dwc->mutex);
+
pm_runtime_get_sync(dwc->dev);
if (dwc->current_dr_role == DWC3_GCTL_PRTCAP_OTG)
@@ -154,6 +158,25 @@ static void __dwc3_set_mode(struct work_struct *work)
break;
}
+ /* For DRD host or device mode only */
+ if (dwc->desired_dr_role != DWC3_GCTL_PRTCAP_OTG) {
+ reg = dwc3_readl(dwc->regs, DWC3_GCTL);
+ reg |= DWC3_GCTL_CORESOFTRESET;
+ dwc3_writel(dwc->regs, DWC3_GCTL, reg);
+
+ /*
+ * Wait for internal clocks to synchronized. DWC_usb31 and
+ * DWC_usb32 may need at least 50ms (less for DWC_usb3). To
+ * keep it consistent across different IPs, let's wait up to
+ * 100ms before clearing GCTL.CORESOFTRESET.
+ */
+ msleep(100);
+
+ reg = dwc3_readl(dwc->regs, DWC3_GCTL);
+ reg &= ~DWC3_GCTL_CORESOFTRESET;
+ dwc3_writel(dwc->regs, DWC3_GCTL, reg);
+ }
+
spin_lock_irqsave(&dwc->lock, flags);
dwc3_set_prtcap(dwc, dwc->desired_dr_role);
@@ -178,6 +201,8 @@ static void __dwc3_set_mode(struct work_struct *work)
}
break;
case DWC3_GCTL_PRTCAP_DEVICE:
+ dwc3_core_soft_reset(dwc);
+
dwc3_event_buffers_setup(dwc);
if (dwc->usb2_phy)
@@ -200,6 +225,7 @@ static void __dwc3_set_mode(struct work_struct *work)
out:
pm_runtime_mark_last_busy(dwc->dev);
pm_runtime_put_autosuspend(dwc->dev);
+ mutex_unlock(&dwc->mutex);
}
void dwc3_set_mode(struct dwc3 *dwc, u32 mode)
@@ -1553,6 +1579,7 @@ static int dwc3_probe(struct platform_device *pdev)
dwc3_cache_hwparams(dwc);
spin_lock_init(&dwc->lock);
+ mutex_init(&dwc->mutex);
pm_runtime_set_active(dev);
pm_runtime_use_autosuspend(dev);
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 695ff2d791e4..7e3afa5378e8 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -13,6 +13,7 @@
#include <linux/device.h>
#include <linux/spinlock.h>
+#include <linux/mutex.h>
#include <linux/ioport.h>
#include <linux/list.h>
#include <linux/bitops.h>
@@ -947,6 +948,7 @@ struct dwc3_scratchpad_array {
* @scratch_addr: dma address of scratchbuf
* @ep0_in_setup: one control transfer is completed and enter setup phase
* @lock: for synchronizing
+ * @mutex: for mode switching
* @dev: pointer to our struct device
* @sysdev: pointer to the DMA-capable device
* @xhci: pointer to our xHCI child
@@ -1088,6 +1090,9 @@ struct dwc3 {
/* device lock */
spinlock_t lock;
+ /* mode switching lock */
+ struct mutex mutex;
+
struct device *dev;
struct device *sysdev;
--
2.31.1