Before reseting the engine, we suspend the execution of the guilty
request, so that we can continue execution with a new context while we
slowly compress the captured error state for the guilty context. However,
if the reset fails, we will promptly attempt to reset the same request
again, and discover the ongoing capture. Ignore the second attempt to
suspend and capture the same request.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: <stable(a)vger.kernel.org> # v5.7+
---
drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 43703efb36d1..1d209a8a95e8 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2823,6 +2823,9 @@ static void __execlists_hold(struct i915_request *rq)
static bool execlists_hold(struct intel_engine_cs *engine,
struct i915_request *rq)
{
+ if (i915_request_on_hold(rq))
+ return false;
+
spin_lock_irq(&engine->active.lock);
if (i915_request_completed(rq)) { /* too late! */
--
2.20.1
This is a note to let you know that I've just added the patch titled
of: fix linker-section match-table corruption
to my driver-core git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git
in the driver-core-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the driver-core-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 5812b32e01c6d86ba7a84110702b46d8a8531fe9 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 23 Nov 2020 11:23:12 +0100
Subject: of: fix linker-section match-table corruption
Specify type alignment when declaring linker-section match-table entries
to prevent gcc from increasing alignment and corrupting the various
tables with padding (e.g. timers, irqchips, clocks, reserved memory).
This is specifically needed on x86 where gcc (typically) aligns larger
objects like struct of_device_id with static extent on 32-byte
boundaries which at best prevents matching on anything but the first
entry. Specifying alignment when declaring variables suppresses this
optimisation.
Here's a 64-bit example where all entries are corrupt as 16 bytes of
padding has been inserted before the first entry:
ffffffff8266b4b0 D __clk_of_table
ffffffff8266b4c0 d __of_table_fixed_factor_clk
ffffffff8266b5a0 d __of_table_fixed_clk
ffffffff8266b680 d __clk_of_table_sentinel
And here's a 32-bit example where the 8-byte-aligned table happens to be
placed on a 32-byte boundary so that all but the first entry are corrupt
due to the 28 bytes of padding inserted between entries:
812b3ec0 D __irqchip_of_table
812b3ec0 d __of_table_irqchip1
812b3fa0 d __of_table_irqchip2
812b4080 d __of_table_irqchip3
812b4160 d irqchip_of_match_end
Verified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte
alignment), and on arm using gcc-7.2.
Note that there are no in-tree users of these tables on x86 currently
(even if they are included in the image).
Fixes: 54196ccbe0ba ("of: consolidate linker section OF match table declarations")
Fixes: f6e916b82022 ("irqchip: add basic infrastructure")
Cc: stable <stable(a)vger.kernel.org> # 3.9
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20201123102319.8090-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/of.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/of.h b/include/linux/of.h
index 5d51891cbf1a..af655d264f10 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -1300,6 +1300,7 @@ static inline int of_get_available_child_count(const struct device_node *np)
#define _OF_DECLARE(table, name, compat, fn, fn_type) \
static const struct of_device_id __of_table_##name \
__used __section("__" #table "_of_table") \
+ __aligned(__alignof__(struct of_device_id)) \
= { .compatible = compat, \
.data = (fn == (fn_type)NULL) ? fn : fn }
#else
--
2.29.2
Specify type alignment when declaring linker-section match-table entries
to prevent gcc from increasing alignment and corrupting the various
tables with padding (e.g. timers, irqchips, clocks, reserved memory).
This is specifically needed on x86 where gcc (typically) aligns larger
objects like struct of_device_id with static extent on 32-byte
boundaries which at best prevents matching on anything but the first
entry. Specifying alignment when declaring variables suppresses this
optimisation.
Here's a 64-bit example where all entries are corrupt as 16 bytes of
padding has been inserted before the first entry:
ffffffff8266b4b0 D __clk_of_table
ffffffff8266b4c0 d __of_table_fixed_factor_clk
ffffffff8266b5a0 d __of_table_fixed_clk
ffffffff8266b680 d __clk_of_table_sentinel
And here's a 32-bit example where the 8-byte-aligned table happens to be
placed on a 32-byte boundary so that all but the first entry are corrupt
due to the 28 bytes of padding inserted between entries:
812b3ec0 D __irqchip_of_table
812b3ec0 d __of_table_irqchip1
812b3fa0 d __of_table_irqchip2
812b4080 d __of_table_irqchip3
812b4160 d irqchip_of_match_end
Verified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte
alignment), and on arm using gcc-7.2.
Note that there are no in-tree users of these tables on x86 currently
(even if they are included in the image).
Fixes: 54196ccbe0ba ("of: consolidate linker section OF match table declarations")
Fixes: f6e916b82022 ("irqchip: add basic infrastructure")
Cc: stable <stable(a)vger.kernel.org> # 3.9
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
include/linux/of.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/of.h b/include/linux/of.h
index 5d51891cbf1a..af655d264f10 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -1300,6 +1300,7 @@ static inline int of_get_available_child_count(const struct device_node *np)
#define _OF_DECLARE(table, name, compat, fn, fn_type) \
static const struct of_device_id __of_table_##name \
__used __section("__" #table "_of_table") \
+ __aligned(__alignof__(struct of_device_id)) \
= { .compatible = compat, \
.data = (fn == (fn_type)NULL) ? fn : fn }
#else
--
2.26.2
Debounce filter setting should be independent from IRQ type setting
because according to the ACPI specs, there are separate arguments for
specifying debounce timeout and IRQ type in GpioIo() and GpioInt().
Together with commit 06abe8291bc31839950f7d0362d9979edc88a666
("pinctrl: amd: fix incorrect way to disable debounce filter") and
Andy's patch "gpiolib: acpi: Take into account debounce settings" [1],
this will fix broken touchpads for laptops whose BIOS set the
debounce timeout to a relatively large value. For example, the BIOS
of Lenovo AMD gaming laptops including Legion-5 15ARH05 (R7000),
Legion-5P (R7000P) and IdeaPad Gaming 3 15ARH05, set the debounce
timeout to 124.8ms. This led to the kernel receiving only ~7 HID
reports per second from the Synaptics touchpad
(MSFT0001:00 06CB:7F28).
Existing touchpads like [2][3] are not troubled by this bug because
the debounce timeout has been set to 0 by the BIOS before enabling
the debounce filter in setting IRQ type.
[1] https://lore.kernel.org/linux-gpio/20201111222008.39993-11-andriy.shevchenk…
[2] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-721331582
[3] https://forum.manjaro.org/t/random-short-touchpad-freezes/30832/28
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Cc: Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1887190
Link: https://lore.kernel.org/linux-gpio/CAHp75VcwiGREBUJ0A06EEw-SyabqYsp%2Bdqs2D…
Signed-off-by: Coiby Xu <coiby.xu(a)gmail.com>
---
Changelog v4:
- Note in the commit message that this patch depends on other two
patches to fix the broken touchpad [Hans de Goede]
- Add in the commit message that one more touchpad could be fixed.
Changelog v3:
- Explain why the driver mistakenly took the slightly deviated value
of debounce timeout in the commit message (patch 2/4) [Andy Shevchenko]
- Explain why other touchpads are affected by the issue of setting debounce
filter in IRQ type setting in the commit message (patch 4/4)
Changelog v2:
- Message-Id to Link and grammar fixes for commit messages [Andy Shevchenko]
---
drivers/pinctrl/pinctrl-amd.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 4aea3e05e8c6..899c16c17b6d 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -429,7 +429,6 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
pin_reg &= ~BIT(LEVEL_TRIG_OFF);
pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
pin_reg |= ACTIVE_HIGH << ACTIVE_LEVEL_OFF;
- pin_reg |= DB_TYPE_REMOVE_GLITCH << DB_CNTRL_OFF;
irq_set_handler_locked(d, handle_edge_irq);
break;
@@ -437,7 +436,6 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
pin_reg &= ~BIT(LEVEL_TRIG_OFF);
pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
- pin_reg |= DB_TYPE_REMOVE_GLITCH << DB_CNTRL_OFF;
irq_set_handler_locked(d, handle_edge_irq);
break;
@@ -445,7 +443,6 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
pin_reg &= ~BIT(LEVEL_TRIG_OFF);
pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
pin_reg |= BOTH_EADGE << ACTIVE_LEVEL_OFF;
- pin_reg |= DB_TYPE_REMOVE_GLITCH << DB_CNTRL_OFF;
irq_set_handler_locked(d, handle_edge_irq);
break;
@@ -453,8 +450,6 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
pin_reg |= ACTIVE_HIGH << ACTIVE_LEVEL_OFF;
- pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
- pin_reg |= DB_TYPE_PRESERVE_LOW_GLITCH << DB_CNTRL_OFF;
irq_set_handler_locked(d, handle_level_irq);
break;
@@ -462,8 +457,6 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
- pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
- pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
irq_set_handler_locked(d, handle_level_irq);
break;
--
2.29.2
In the course of discovering and closing many races with context closure
and execbuf submission, since commit 61231f6bd056 ("drm/i915/gem: Check
that the context wasn't closed during setup") we started checking that
the context was not closed by another userspace thread during the execbuf
ioctl. In doing so we cancelled the inflight request (by telling it to be
skipped), but kept reporting success since we do submit a request, albeit
one that doesn't execute. As the error is known before we return from the
ioctl, we can report the error we detect immediately, rather than leave
it on the fence status. With the immediate propagation of the error, it
is easier for userspace to handle.
Fixes: 61231f6bd056 ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: <stable(a)vger.kernel.org> # v5.7+
---
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1904e6e5ea64..b07dc1156a0e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3097,7 +3097,7 @@ static void retire_requests(struct intel_timeline *tl, struct i915_request *end)
break;
}
-static void eb_request_add(struct i915_execbuffer *eb)
+static int eb_request_add(struct i915_execbuffer *eb, int err)
{
struct i915_request *rq = eb->request;
struct intel_timeline * const tl = i915_request_timeline(rq);
@@ -3118,6 +3118,7 @@ static void eb_request_add(struct i915_execbuffer *eb)
/* Serialise with context_close via the add_to_timeline */
i915_request_set_error_once(rq, -ENOENT);
__i915_request_skip(rq);
+ err = -ENOENT; /* override any transient errors */
}
__i915_request_queue(rq, &attr);
@@ -3127,6 +3128,8 @@ static void eb_request_add(struct i915_execbuffer *eb)
retire_requests(tl, prev);
mutex_unlock(&tl->mutex);
+
+ return err;
}
static const i915_user_extension_fn execbuf_extensions[] = {
@@ -3332,7 +3335,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
err = eb_submit(&eb, batch);
err_request:
i915_request_get(eb.request);
- eb_request_add(&eb);
+ err = eb_request_add(&eb, err);
if (eb.fences)
signal_fence_array(&eb);
--
2.20.1
This patch applies to 5.9 and earlier kernels only.
Since 5.10, this has been fortunately fixed by the commit
e5e179aa3a39 ("pseries/drmem: don't cache node id in drmem_lmb struct").
When LMBs are added to a running system, the node id assigned to the LMB is
fetched from the temporary DT node provided by the hypervisor.
However, LMBs added are always assigned to the first online node. This is a
mistake and this is because hot_add_drconf_scn_to_nid() called by
lmb_set_nid() is checking for the LMB flags DRCONF_MEM_ASSIGNED which is
set later in dlpar_add_lmb().
To fix this issue, simply set that flag earlier in dlpar_add_lmb().
Note, this code has been rewrote in 5.10 and thus this fix has no meaning
since this version.
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: linuxppc-dev(a)lists.ozlabs.org
Cc: stable(a)vger.kernel.org
---
arch/powerpc/platforms/pseries/hotplug-memory.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index e54dcbd04b2f..92d83915c629 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -663,12 +663,14 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
return rc;
}
+ lmb->flags |= DRCONF_MEM_ASSIGNED;
lmb_set_nid(lmb);
block_sz = memory_block_size_bytes();
/* Add the memory */
rc = __add_memory(lmb->nid, lmb->base_addr, block_sz);
if (rc) {
+ lmb->flags &= ~DRCONF_MEM_ASSIGNED;
invalidate_lmb_associativity_index(lmb);
return rc;
}
@@ -676,10 +678,9 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
rc = dlpar_online_lmb(lmb);
if (rc) {
__remove_memory(lmb->nid, lmb->base_addr, block_sz);
+ lmb->flags &= ~DRCONF_MEM_ASSIGNED;
invalidate_lmb_associativity_index(lmb);
lmb_clear_nid(lmb);
- } else {
- lmb->flags |= DRCONF_MEM_ASSIGNED;
}
return rc;
--
2.29.2