This is a note to let you know that I've just added the patch titled
bcache: recover data from backing when data is clean
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bcache-recover-data-from-backing-when-data-is-clean.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From e393aa2446150536929140739f09c6ecbcbea7f0 Mon Sep 17 00:00:00 2001
From: Rui Hua <huarui.dev(a)gmail.com>
Date: Fri, 24 Nov 2017 15:14:26 -0800
Subject: bcache: recover data from backing when data is clean
From: Rui Hua <huarui.dev(a)gmail.com>
commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream.
When we send a read request and hit clean data in the cache device, a
situation called a cache read race can occur in bcache (see the comment at
the tail of cache_look_up(); the following explanation is copied from there):
The bucket we're reading from might be reused while our bio is in flight,
and we could then end up reading the wrong data. We guard against this
by checking (in bch_cache_read_endio()) if the pointer is stale again;
if so, we treat it as an error (s->iop.error = -EINTR) and reread from
the backing device (but we don't pass that error up anywhere)
Note that a cache read race happens under normal circumstances, not
because the SSD failed. It is counted and shown in
/sys/fs/bcache/XXX/internal/cache_read_races.
Without this patch, in writeback mode we never reread from the backing
device when a cache read race happens until the whole cache device is
clean, because the condition
(s->recoverable && (dc && !atomic_read(&dc->has_dirty))) is false in
cached_dev_read_error(). In this situation s->iop.error (= -EINTR) is
passed up, and the user eventually receives -EINTR when its bio ends.
This is not appropriate, and it looks weird to upper-layer applications.
In this patch, we use s->read_dirty_data to judge whether the read
request hit dirty data in the cache device; it is safe to reread data from
the backing device when the read request hit clean data. This not only
handles cache read races, but also recovers data when a read request to
the cache device fails.
[edited by mlyle to fix up whitespace, commit log title, comment
spelling]
Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
Signed-off-by: Hua Rui <huarui.dev(a)gmail.com>
Reviewed-by: Michael Lyle <mlyle(a)lyle.org>
Reviewed-by: Coly Li <colyli(a)suse.de>
Signed-off-by: Michael Lyle <mlyle(a)lyle.org>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/bcache/request.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -707,16 +707,15 @@ static void cached_dev_read_error(struct
{
struct search *s = container_of(cl, struct search, cl);
struct bio *bio = &s->bio.bio;
- struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
/*
- * If cache device is dirty (dc->has_dirty is non-zero), then
- * recovery a failed read request from cached device may get a
- * stale data back. So read failure recovery is only permitted
- * when cache device is clean.
+ * If read request hit dirty data (s->read_dirty_data is true),
+ * then recovery a failed read request from cached device may
+ * get a stale data back. So read failure recovery is only
+ * permitted when read request hit clean data in cache device,
+ * or when cache read race happened.
*/
- if (s->recoverable &&
- (dc && !atomic_read(&dc->has_dirty))) {
+ if (s->recoverable && !s->read_dirty_data) {
/* Retry from the backing device: */
trace_bcache_read_retry(s->orig_bio);
Patches currently in stable-queue which might be from huarui.dev(a)gmail.com are
queue-4.4/bcache-recover-data-from-backing-when-data-is-clean.patch
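The change to the retry condition in cached_dev_read_error() can be
illustrated with a small user-space model. This is only a sketch: struct
read_state and the helper functions are hypothetical names invented for the
illustration, only the field names come from the patches, and the model
assumes that a reread from the backing device succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the state consulted by
 * cached_dev_read_error(). This is not the kernel's struct search;
 * only the field names are taken from the patches.
 */
struct read_state {
	bool recoverable;      /* models s->recoverable */
	bool read_dirty_data;  /* models s->read_dirty_data (this request) */
	int  has_dirty;        /* models dc->has_dirty (whole cache device) */
};

/* Condition from d59b23795933: retry only if the whole cache is clean. */
bool may_retry_old(const struct read_state *s)
{
	return s->recoverable && !s->has_dirty;
}

/* Condition from e393aa244615: retry unless this request hit dirty data. */
bool may_retry_new(const struct read_state *s)
{
	return s->recoverable && !s->read_dirty_data;
}

/*
 * Error seen by the submitter of the bio: 0 if the request is reread from
 * the backing device (assumed to succeed here), otherwise the cache error.
 */
int user_visible_error(bool may_retry, int cache_error)
{
	assert(cache_error < 0);  /* only meaningful on the error path */
	return may_retry ? 0 : cache_error;
}
```

In this model, a read race on clean data while other dirty data exists in
the cache (recoverable set, read_dirty_data clear, has_dirty non-zero) is
refused a retry by the old condition and surfaces -EINTR, while the new
condition allows the reread.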
This is a note to let you know that I've just added the patch titled
bcache: only permit to recovery read error when cache device is clean
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bcache-only-permit-to-recovery-read-error-when-cache-device-is-clean.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From d59b23795933678c9638fd20c942d2b4f3cd6185 Mon Sep 17 00:00:00 2001
From: Coly Li <colyli(a)suse.de>
Date: Mon, 30 Oct 2017 14:46:31 -0700
Subject: bcache: only permit to recovery read error when cache device is clean
From: Coly Li <colyli(a)suse.de>
commit d59b23795933678c9638fd20c942d2b4f3cd6185 upstream.
When bcache does read I/Os, for example in writeback or writethrough mode,
if a read request on the cache device fails, bcache will try to recover
the request by reading from the cached device. If the data on the cached
device is not synced with the cache device, the requester will get stale
data.
For critical storage systems like databases, providing stale data from
recovery may result in application-level data corruption, which is
unacceptable.
With this patch, for a failed read request in writeback or writethrough
mode, recovery of a recoverable read request only happens when the cache
device is clean. That is to say, all data on the cached device is up to
date.
For other cache modes in bcache, read requests will never hit
cached_dev_read_error(), so they don't need this patch.
Please note, because cache mode can be switched arbitrarily at run time, a
writethrough mode might have been switched from a writeback mode. Therefore
checking dc->has_dirty in writethrough mode still makes sense.
Changelog:
v4: Fix parens error pointed out by Michael Lyle.
v3: In his response, Kent Overstreet says recovering stale data is a
bug to fix, and an option to permit it is unnecessary. So in this version
the sysfs file is removed.
v2: rename sysfs entry from allow_stale_data_on_failure to
allow_stale_data_on_failure, and fix the confusing commit log.
v1: initial patch posted.
[small change to patch comment spelling by mlyle]
Signed-off-by: Coly Li <colyli(a)suse.de>
Signed-off-by: Michael Lyle <mlyle(a)lyle.org>
Reported-by: Arne Wolf <awolf(a)lenovo.com>
Reviewed-by: Michael Lyle <mlyle(a)lyle.org>
Cc: Kent Overstreet <kent.overstreet(a)gmail.com>
Cc: Nix <nix(a)esperi.org.uk>
Cc: Kai Krakow <hurikhan77(a)gmail.com>
Cc: Eric Wheeler <bcache(a)lists.ewheeler.net>
Cc: Junhui Tang <tang.junhui(a)zte.com.cn>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/bcache/request.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -707,8 +707,16 @@ static void cached_dev_read_error(struct
{
struct search *s = container_of(cl, struct search, cl);
struct bio *bio = &s->bio.bio;
+ struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
- if (s->recoverable) {
+ /*
+ * If cache device is dirty (dc->has_dirty is non-zero), then
+ * recovery a failed read request from cached device may get a
+ * stale data back. So read failure recovery is only permitted
+ * when cache device is clean.
+ */
+ if (s->recoverable &&
+ (dc && !atomic_read(&dc->has_dirty))) {
/* Retry from the backing device: */
trace_bcache_read_retry(s->orig_bio);
Patches currently in stable-queue which might be from colyli(a)suse.de are
queue-4.4/bcache-only-permit-to-recovery-read-error-when-cache-device-is-clean.patch
queue-4.4/bcache-recover-data-from-backing-when-data-is-clean.patch
This is a note to let you know that I've just added the patch titled
bcache: recover data from backing when data is clean
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bcache-recover-data-from-backing-when-data-is-clean.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From e393aa2446150536929140739f09c6ecbcbea7f0 Mon Sep 17 00:00:00 2001
From: Rui Hua <huarui.dev(a)gmail.com>
Date: Fri, 24 Nov 2017 15:14:26 -0800
Subject: bcache: recover data from backing when data is clean
From: Rui Hua <huarui.dev(a)gmail.com>
commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream.
When we send a read request and hit clean data in the cache device, a
situation called a cache read race can occur in bcache (see the comment at
the tail of cache_look_up(); the following explanation is copied from there):
The bucket we're reading from might be reused while our bio is in flight,
and we could then end up reading the wrong data. We guard against this
by checking (in bch_cache_read_endio()) if the pointer is stale again;
if so, we treat it as an error (s->iop.error = -EINTR) and reread from
the backing device (but we don't pass that error up anywhere)
Note that a cache read race happens under normal circumstances, not
because the SSD failed. It is counted and shown in
/sys/fs/bcache/XXX/internal/cache_read_races.
Without this patch, in writeback mode we never reread from the backing
device when a cache read race happens until the whole cache device is
clean, because the condition
(s->recoverable && (dc && !atomic_read(&dc->has_dirty))) is false in
cached_dev_read_error(). In this situation s->iop.error (= -EINTR) is
passed up, and the user eventually receives -EINTR when its bio ends.
This is not appropriate, and it looks weird to upper-layer applications.
In this patch, we use s->read_dirty_data to judge whether the read
request hit dirty data in the cache device; it is safe to reread data from
the backing device when the read request hit clean data. This not only
handles cache read races, but also recovers data when a read request to
the cache device fails.
[edited by mlyle to fix up whitespace, commit log title, comment
spelling]
Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
Signed-off-by: Hua Rui <huarui.dev(a)gmail.com>
Reviewed-by: Michael Lyle <mlyle(a)lyle.org>
Reviewed-by: Coly Li <colyli(a)suse.de>
Signed-off-by: Michael Lyle <mlyle(a)lyle.org>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/bcache/request.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -705,16 +705,15 @@ static void cached_dev_read_error(struct
{
struct search *s = container_of(cl, struct search, cl);
struct bio *bio = &s->bio.bio;
- struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
/*
- * If cache device is dirty (dc->has_dirty is non-zero), then
- * recovery a failed read request from cached device may get a
- * stale data back. So read failure recovery is only permitted
- * when cache device is clean.
+ * If read request hit dirty data (s->read_dirty_data is true),
+ * then recovery a failed read request from cached device may
+ * get a stale data back. So read failure recovery is only
+ * permitted when read request hit clean data in cache device,
+ * or when cache read race happened.
*/
- if (s->recoverable &&
- (dc && !atomic_read(&dc->has_dirty))) {
+ if (s->recoverable && !s->read_dirty_data) {
/* Retry from the backing device: */
trace_bcache_read_retry(s->orig_bio);
Patches currently in stable-queue which might be from huarui.dev(a)gmail.com are
queue-3.18/bcache-recover-data-from-backing-when-data-is-clean.patch
This is a note to let you know that I've just added the patch titled
bcache: only permit to recovery read error when cache device is clean
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bcache-only-permit-to-recovery-read-error-when-cache-device-is-clean.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From d59b23795933678c9638fd20c942d2b4f3cd6185 Mon Sep 17 00:00:00 2001
From: Coly Li <colyli(a)suse.de>
Date: Mon, 30 Oct 2017 14:46:31 -0700
Subject: bcache: only permit to recovery read error when cache device is clean
From: Coly Li <colyli(a)suse.de>
commit d59b23795933678c9638fd20c942d2b4f3cd6185 upstream.
When bcache does read I/Os, for example in writeback or writethrough mode,
if a read request on the cache device fails, bcache will try to recover
the request by reading from the cached device. If the data on the cached
device is not synced with the cache device, the requester will get stale
data.
For critical storage systems like databases, providing stale data from
recovery may result in application-level data corruption, which is
unacceptable.
With this patch, for a failed read request in writeback or writethrough
mode, recovery of a recoverable read request only happens when the cache
device is clean. That is to say, all data on the cached device is up to
date.
For other cache modes in bcache, read requests will never hit
cached_dev_read_error(), so they don't need this patch.
Please note, because cache mode can be switched arbitrarily at run time, a
writethrough mode might have been switched from a writeback mode. Therefore
checking dc->has_dirty in writethrough mode still makes sense.
Changelog:
v4: Fix parens error pointed out by Michael Lyle.
v3: In his response, Kent Overstreet says recovering stale data is a
bug to fix, and an option to permit it is unnecessary. So in this version
the sysfs file is removed.
v2: rename sysfs entry from allow_stale_data_on_failure to
allow_stale_data_on_failure, and fix the confusing commit log.
v1: initial patch posted.
[small change to patch comment spelling by mlyle]
Signed-off-by: Coly Li <colyli(a)suse.de>
Signed-off-by: Michael Lyle <mlyle(a)lyle.org>
Reported-by: Arne Wolf <awolf(a)lenovo.com>
Reviewed-by: Michael Lyle <mlyle(a)lyle.org>
Cc: Kent Overstreet <kent.overstreet(a)gmail.com>
Cc: Nix <nix(a)esperi.org.uk>
Cc: Kai Krakow <hurikhan77(a)gmail.com>
Cc: Eric Wheeler <bcache(a)lists.ewheeler.net>
Cc: Junhui Tang <tang.junhui(a)zte.com.cn>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/bcache/request.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -705,8 +705,16 @@ static void cached_dev_read_error(struct
{
struct search *s = container_of(cl, struct search, cl);
struct bio *bio = &s->bio.bio;
+ struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
- if (s->recoverable) {
+ /*
+ * If cache device is dirty (dc->has_dirty is non-zero), then
+ * recovery a failed read request from cached device may get a
+ * stale data back. So read failure recovery is only permitted
+ * when cache device is clean.
+ */
+ if (s->recoverable &&
+ (dc && !atomic_read(&dc->has_dirty))) {
/* Retry from the backing device: */
trace_bcache_read_retry(s->orig_bio);
Patches currently in stable-queue which might be from colyli(a)suse.de are
queue-3.18/bcache-only-permit-to-recovery-read-error-when-cache-device-is-clean.patch
queue-3.18/bcache-recover-data-from-backing-when-data-is-clean.patch
The get_modes() callback might be called asynchronously from the DRM core
and is not synchronized with bridge_enable(), which sets the proper runtime
PM state of the main DP device. Fix this by calling pm_runtime_get_sync()
before calling drm_get_edid(), which in turn calls drm_dp_i2c_xfer() and
analogix_dp_transfer(), to ensure that the main DP device is runtime active
during any access to its registers.
This fixes the following kernel issue on Samsung Exynos5250 Snow board:
Unhandled fault: imprecise external abort (0x406) at 0x00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: : 406 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 62 Comm: kworker/0:2 Not tainted 4.13.0-rc2-00364-g4a97a3da420b #3357
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
Workqueue: events output_poll_execute
task: edc14800 task.stack: edcb2000
PC is at analogix_dp_transfer+0x15c/0x2fc
LR is at analogix_dp_transfer+0x134/0x2fc
pc : [<c0468538>] lr : [<c0468510>] psr: 60000013
sp : edcb3be8 ip : 0000002a fp : 00000001
r10: 00000000 r9 : edcb3cd8 r8 : edcb3c40
r7 : 00000000 r6 : edd3b380 r5 : edd3b010 r4 : 00000064
r3 : 00000000 r2 : f0ad3000 r1 : edcb3c40 r0 : edd3b010
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 4000406a DAC: 00000051
Process kworker/0:2 (pid: 62, stack limit = 0xedcb2210)
Stack: (0xedcb3be8 to 0xedcb4000)
[<c0468538>] (analogix_dp_transfer) from [<c0424ba4>] (drm_dp_i2c_do_msg+0x8c/0x2b4)
[<c0424ba4>] (drm_dp_i2c_do_msg) from [<c0424e64>] (drm_dp_i2c_xfer+0x98/0x214)
[<c0424e64>] (drm_dp_i2c_xfer) from [<c057b2d8>] (__i2c_transfer+0x140/0x29c)
[<c057b2d8>] (__i2c_transfer) from [<c057b4a4>] (i2c_transfer+0x70/0xe4)
[<c057b4a4>] (i2c_transfer) from [<c0441de4>] (drm_do_probe_ddc_edid+0xb4/0x114)
[<c0441de4>] (drm_do_probe_ddc_edid) from [<c0441e5c>] (drm_probe_ddc+0x18/0x28)
[<c0441e5c>] (drm_probe_ddc) from [<c0445728>] (drm_get_edid+0x124/0x2d4)
[<c0445728>] (drm_get_edid) from [<c0465ea0>] (analogix_dp_get_modes+0x90/0x114)
[<c0465ea0>] (analogix_dp_get_modes) from [<c0425e8c>] (drm_helper_probe_single_connector_modes+0x198/0x68c)
[<c0425e8c>] (drm_helper_probe_single_connector_modes) from [<c04325d4>] (drm_setup_crtcs+0x1b4/0xd18)
[<c04325d4>] (drm_setup_crtcs) from [<c04344a8>] (drm_fb_helper_hotplug_event+0x94/0xd0)
[<c04344a8>] (drm_fb_helper_hotplug_event) from [<c0425a50>] (drm_kms_helper_hotplug_event+0x24/0x28)
[<c0425a50>] (drm_kms_helper_hotplug_event) from [<c04263ec>] (output_poll_execute+0x6c/0x174)
[<c04263ec>] (output_poll_execute) from [<c0136f18>] (process_one_work+0x188/0x3fc)
[<c0136f18>] (process_one_work) from [<c01371f4>] (worker_thread+0x30/0x4b8)
[<c01371f4>] (worker_thread) from [<c013daf8>] (kthread+0x128/0x164)
[<c013daf8>] (kthread) from [<c0108510>] (ret_from_fork+0x14/0x24)
Code: 0a000002 ea000009 e2544001 0a00004a (e59537c8)
---[ end trace cddc7919c79f7878 ]---
Reported-by: Misha Komarovskiy <zombah(a)gmail.com>
CC: stable(a)vger.kernel.org # v4.10+
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
---
This issue was there from the beginning, but recent changes to the DRM
core revealed it. It makes sense to backport it as far as patch
f0a8b49c03d2 ("drm/bridge: analogix dp: Fix runtime PM state on driver
bind"), which fixed a similar issue on driver bind, thus I've marked it
for stable v4.10+.
Best regards
Marek Szyprowski
Samsung R&D Institute Poland
---
drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index 5dd3f1cd074a..a8905049b9da 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -946,7 +946,9 @@ static int analogix_dp_get_modes(struct drm_connector *connector)
return 0;
}
+ pm_runtime_get_sync(dp->dev);
edid = drm_get_edid(connector, &dp->aux.ddc);
+ pm_runtime_put(dp->dev);
if (edid) {
drm_mode_connector_update_edid_property(&dp->connector,
edid);
--
2.14.2
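The get/put bracketing added by the patch above can be sketched with a
user-space model of runtime PM. Everything named model_* is hypothetical
and exists only for this illustration. The model also simplifies
pm_runtime_put() by suspending immediately once the usage count reaches
zero, whereas the kernel only queues an idle request at that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device under runtime PM; not a kernel type. */
struct model_dev {
	int  usage_count;  /* models the runtime PM usage counter */
	bool powered;      /* true while the device is runtime active */
};

/* Models pm_runtime_get_sync(): take a reference and resume the device. */
void model_pm_get_sync(struct model_dev *dev)
{
	dev->usage_count++;
	dev->powered = true;
}

/* Models pm_runtime_put(): drop the reference, suspend at zero (simplified). */
void model_pm_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0)
		dev->powered = false;
}

/* Models a register access: false where real hardware would raise an abort. */
bool model_reg_read(const struct model_dev *dev)
{
	return dev->powered;
}

/* Models the patched get_modes(): bracket the EDID read with get/put. */
bool model_get_modes(struct model_dev *dev)
{
	bool ok;

	model_pm_get_sync(dev);
	ok = model_reg_read(dev);  /* stands in for drm_get_edid() */
	model_pm_put(dev);
	return ok;
}
```

In the model, a bare model_reg_read() on a suspended device fails, which
corresponds to the external abort in the trace, while model_get_modes()
succeeds and leaves the usage count balanced.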
From the shrinker paths, we want to relinquish the GPU and GGTT access to
the object, releasing the backing storage back to the system for
swapout. As a part of that process we would unpin the pages, marking
them for access by the CPU (for the swapout/swapin). However, if that
process was interrupted after unbinding the vma, we missed a flush of the
inflight GGTT writes before we made that GTT space available again for
reuse, with the prospect that we would redirect them to another page.
The bug dates back to the introduction of multiple GGTT vma, but the
code itself dates to commit 02bef8f98d26 ("drm/i915: Unbind closed vma
for i915_gem_object_unbind()").
Fixes: 02bef8f98d26 ("drm/i915: Unbind closed vma for i915_gem_object_unbind()")
Fixes: c5ad54cf7dd8 ("drm/i915: Use partial view in mmap fault handler")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e083f242b8dc..80b78fb5daac 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -330,17 +330,10 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj)
* must wait for all rendering to complete to the object (as unbinding
* must anyway), and retire the requests.
*/
- ret = i915_gem_object_wait(obj,
- I915_WAIT_INTERRUPTIBLE |
- I915_WAIT_LOCKED |
- I915_WAIT_ALL,
- MAX_SCHEDULE_TIMEOUT,
- NULL);
+ ret = i915_gem_object_set_to_cpu_domain(obj, false);
if (ret)
return ret;
- i915_gem_retire_requests(to_i915(obj->base.dev));
-
while ((vma = list_first_entry_or_null(&obj->vma_list,
struct i915_vma,
obj_link))) {
--
2.15.1
In
commit 613051dac40da1751ab269572766d3348d45a197
Author: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Date: Wed Dec 14 00:08:06 2016 +0100
drm: locking&new iterators for connector_list
we went to extreme lengths to make sure connector iteration works
in any context, without introducing any additional locking context.
This worked, except for a small fumble in the implementation:
When we actually race with a concurrent connector unplug event, and
our temporary connector reference turns out to be the final one, then
everything breaks: We call the connector release function from
whatever context we happen to be in, which can be an irq/atomic
context. And connector freeing grabs all kinds of locks and stuff.
Fix this by creating a specially safe put function for connector_iter,
which (in this rare case) punts the cleanup to a worker.
Reported-by: Ben Widawsky <ben(a)bwidawsk.net>
Cc: Ben Widawsky <ben(a)bwidawsk.net>
Fixes: 613051dac40d ("drm: locking&new iterators for connector_list")
Cc: Dave Airlie <airlied(a)gmail.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Sean Paul <seanpaul(a)chromium.org>
Cc: <stable(a)vger.kernel.org> # v4.11+
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
---
drivers/gpu/drm/drm_connector.c | 28 ++++++++++++++++++++++++++--
drivers/gpu/drm/drm_mode_config.c | 2 ++
include/drm/drm_connector.h | 8 ++++++++
3 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 25f4b2e9a44f..482014137953 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -152,6 +152,16 @@ static void drm_connector_free(struct kref *kref)
connector->funcs->destroy(connector);
}
+static void drm_connector_free_work_fn(struct work_struct *work)
+{
+ struct drm_connector *connector =
+ container_of(work, struct drm_connector, free_work);
+ struct drm_device *dev = connector->dev;
+
+ drm_mode_object_unregister(dev, &connector->base);
+ connector->funcs->destroy(connector);
+}
+
/**
* drm_connector_init - Init a preallocated connector
* @dev: DRM device
@@ -181,6 +191,8 @@ int drm_connector_init(struct drm_device *dev,
if (ret)
return ret;
+ INIT_WORK(&connector->free_work, drm_connector_free_work_fn);
+
connector->base.properties = &connector->properties;
connector->dev = dev;
connector->funcs = funcs;
@@ -529,6 +541,18 @@ void drm_connector_list_iter_begin(struct drm_device *dev,
}
EXPORT_SYMBOL(drm_connector_list_iter_begin);
+/*
+ * Extra-safe connector put function that works in any context. Should only be
+ * used from the connector_iter functions, where we never really expect to
+ * actually release the connector when dropping our final reference.
+ */
+static void
+drm_connector_put_safe(struct drm_connector *conn)
+{
+ if (refcount_dec_and_test(&conn->base.refcount.refcount))
+ schedule_work(&conn->free_work);
+}
+
/**
* drm_connector_list_iter_next - return next connector
* @iter: connectr_list iterator
@@ -561,7 +585,7 @@ drm_connector_list_iter_next(struct drm_connector_list_iter *iter)
spin_unlock_irqrestore(&config->connector_list_lock, flags);
if (old_conn)
- drm_connector_put(old_conn);
+ drm_connector_put_safe(old_conn);
return iter->conn;
}
@@ -580,7 +604,7 @@ void drm_connector_list_iter_end(struct drm_connector_list_iter *iter)
{
iter->dev = NULL;
if (iter->conn)
- drm_connector_put(iter->conn);
+ drm_connector_put_safe(iter->conn);
lock_release(&connector_list_iter_dep_map, 0, _RET_IP_);
}
EXPORT_SYMBOL(drm_connector_list_iter_end);
diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
index 7623607c0f1e..346c19c6ce01 100644
--- a/drivers/gpu/drm/drm_mode_config.c
+++ b/drivers/gpu/drm/drm_mode_config.c
@@ -431,6 +431,8 @@ void drm_mode_config_cleanup(struct drm_device *dev)
drm_connector_put(connector);
}
drm_connector_list_iter_end(&conn_iter);
+ /* connector_iter drops references in a work item. */
+ flush_scheduled_work();
if (WARN_ON(!list_empty(&dev->mode_config.connector_list))) {
drm_connector_list_iter_begin(dev, &conn_iter);
drm_for_each_connector_iter(connector, &conn_iter)
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 66d6c99d15e5..c5c753a1be85 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -926,6 +926,14 @@ struct drm_connector {
uint8_t num_h_tile, num_v_tile;
uint8_t tile_h_loc, tile_v_loc;
uint16_t tile_h_size, tile_v_size;
+
+ /**
+ * @free_work:
+ *
+ * Work used only by &drm_connector_iter to be able to clean up a
+ * connector from any context.
+ */
+ struct work_struct free_work;
};
#define obj_to_connector(x) container_of(x, struct drm_connector, base)
--
2.15.0
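The idea behind drm_connector_put_safe() in the patch above, dropping the
final reference without running the release path in the caller's context,
can be sketched in user space as follows. The model_* names are
hypothetical, and a single pending flag stands in for the kernel workqueue;
this is an illustration of the pattern, not the DRM implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a refcounted object with deferred release. */
struct model_conn {
	int  refcount;
	bool free_pending;  /* models a queued free_work item */
	bool destroyed;     /* set once the release path has run */
};

/* Models the release path, which in the kernel may sleep and take locks. */
void model_conn_destroy(struct model_conn *conn)
{
	conn->destroyed = true;
}

/* Models drm_connector_put_safe(): never destroys in the caller's context. */
void model_conn_put_safe(struct model_conn *conn)
{
	assert(conn->refcount > 0);
	if (--conn->refcount == 0)
		conn->free_pending = true;  /* models schedule_work() */
}

/* Models the worker, or flush_scheduled_work(), running the queued item. */
void model_flush_work(struct model_conn *conn)
{
	if (conn->free_pending) {
		conn->free_pending = false;
		model_conn_destroy(conn);
	}
}
```

After the last model_conn_put_safe() the object is still intact and only
marked for release; the destroy step runs when the work is flushed, which is
why the patch adds flush_scheduled_work() to drm_mode_config_cleanup().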
On 04.12.2017 23:10, rwarsow(a)gmx.de wrote:
> Hallo
>
> someone and I got an regression with e1000e since kernel 4.14.3 and it seems there is 4.14.4 on the way without a fix.
>
>
> bug report is here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=198047
( added stable and netdev to CC )
Yes, I have a box with e1000e, and it seems something at least breaks NM after 4.14.3.
Interestingly, when using connman the connection is stable.
Regards,
Gabriel C
Hi Christoph,
A kernel bug report was opened against Ubuntu [0]. After a kernel
bisect, it was found that reverting the following commit resolved this bug:
909657615d9b ("scsi: libsas: allow async aborts")
The regression was introduced as of v4.12-rc1, and it still exists in
4.14 mainline.
I was hoping to get your feedback, since you are the patch author. Do
you think gathering any additional data will help diagnose this issue,
or would it be best to submit a revert request?
Thanks,
Joe
[0] http://pad.lv/1726519
On Tue, Dec 05, 2017 at 08:23:27AM +0100, Christian Hesse wrote:
> Greg KH <gregkh(a)linuxfoundation.org> on Mon, 2017/12/04 19:37:
> > On Mon, Dec 04, 2017 at 04:47:00PM +0100, Christian Hesse wrote:
> > > Amit Pundir <amit.pundir(a)linaro.org> on Mon, 2017/11/27 18:23:
> > > > Hi Greg,
> > > >
> > > > Found few e100e upstream fixes from Benjamin Poirier in lede
> > > > source tree, https://git.lede-project.org/?p=source.git, and
> > > > these fixes seem reasonable enough for 4.14.y too.
> > > >
> > > > Also submitting an e1000e buffer overrun fix by Sasha Neftin.
> > > >
> > > > Cherry-picked and build tested for linux v4.14.2 for ARCH=arm/arm64.
> > > >
> > > > Regards,
> > > > Amit Pundir
> > > >
> > > >
> > > > Benjamin Poirier (4):
> > > > e1000e: Fix error path in link detection
> > > > e1000e: Fix return value test
> > > > e1000e: Separate signaling for link check/link up
> > > > e1000e: Avoid receiver overrun interrupt bursts
> > > >
> > > > Sasha Neftin (1):
> > > > e1000e: fix buffer overrun while the I219 is processing DMA
> > > > transactions
> > >
> > > Hello everybody,
> > >
> > > looks like one of these breaks connectivity on my Thinkpad X250.
> > > Just downgraded to linux 4.14.2 to verify.
> >
> > Can you try the -rc release I just did? It has a fix for this series in
> > it.
>
> It connects with the notebook's built in ethernet port (did not check with
> 4.14.3) but still fails to see a link when placed in docking station.
Do you have the same issues with 4.15-rc2?
thanks,
greg k-h
Hi,
On Tue, Nov 28, 2017 at 11:47 PM, Maxime Ripard
<maxime.ripard(a)free-electrons.com> wrote:
> On Mon, Nov 27, 2017 at 08:05:34PM +0100, Stefan Brüns wrote:
>> Include the OF-based modalias in the uevent sent when registering devices
>> on the sunxi RSB bus, so that user space has a chance to autoload the
>> kernel module for the device.
>>
>> Fixes a regression caused by commit 3f241bfa60bd ("arm64: allwinner: a64:
>> pine64: Use dcdc1 regulator for mmc0"). When the axp20x-rsb module for
>> the AXP803 PMIC is built as a module, it is not loaded and the system
>> ends up with an disfunctional MMC controller.
>>
Tags should be:
Fixes: d787dcdb9c8f ("bus: sunxi-rsb: Add driver for Allwinner Reduced Serial Bus")
Cc: stable <stable(a)vger.kernel.org> # 4.4.x 7a3b7cd332db of: device: Export of_device_{get_modalias, uevent_modalias} to modules
>> Cc: stable <stable(a)vger.kernel.org>
>> Signed-off-by: Stefan Brüns <stefan.bruens(a)rwth-aachen.de>
>
> Acked-by: Maxime Ripard <maxime.ripard(a)free-electrons.com>
Acked-by: Chen-Yu Tsai <wens(a)csie.org>
Maxime, could you merge this as a fix to get it in fast?
ChenYu
This is the start of the stable review cycle for the 4.9.67 release.
There are 38 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed Dec 6 15:59:56 UTC 2017.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.67-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.67-rc1
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Prevent zero length "index" write
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Don't try indexed reads to alternate slave addresses
NeilBrown <neilb(a)suse.com>
NFS: revalidate "." etc correctly on "open".
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "x86/entry/64: Add missing irqflags tracing to native_load_gs_index()"
Rex Zhu <Rex.Zhu(a)amd.com>
drm/amd/pp: fix typecast error in powerplay.
Christian König <christian.koenig(a)amd.com>
drm/ttm: once more fix ttm_buffer_object_transfer
Peter Griffin <peter.griffin(a)linaro.org>
drm/hisilicon: Ensure LDI regs are properly configured.
Jonathan Liu <net147(a)gmail.com>
drm/panel: simple: Add missing panel_simple_unprepare() calls
Roman Kapl <rka(a)sysgo.com>
drm/radeon: fix atombios on big endian
Dan Carpenter <dan.carpenter(a)oracle.com>
drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
Dan Carpenter <dan.carpenter(a)oracle.com>
drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/radeon: dont switch vt on suspend"
Jeff Lien <jeff.lien(a)wdc.com>
nvme-pci: add quirk for delay before CHK RDY for WDC SN200
Peter Rosin <peda(a)axentia.se>
hwmon: (jc42) optionally try to disable the SMBUS timeout
Huacai Chen <chenhc(a)lemote.com>
bcache: Fix building error on MIPS
Hans de Goede <hdegoede(a)redhat.com>
i2c: i801: Fix Failed to allocate irq -2147483648 error
Heiner Kallweit <hkallweit1(a)gmail.com>
eeprom: at24: check at24_read/write arguments
Bartosz Golaszewski <brgl(a)bgdev.pl>
eeprom: at24: correctly set the size for at24mac402
Heiner Kallweit <hkallweit1(a)gmail.com>
eeprom: at24: fix reading from 24MAC402/24MAC602
Bastian Stender <bst(a)pengutronix.de>
mmc: core: prepend 0x to OCR entry in sysfs
Adrian Hunter <adrian.hunter(a)intel.com>
mmc: core: Do not leave the block driver in a suspended state
Dr. David Alan Gilbert <dgilbert(a)redhat.com>
KVM: lapic: Fixup LDR on load in x2apic
Dr. David Alan Gilbert <dgilbert(a)redhat.com>
KVM: lapic: Split out x2apic ldr calculation
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: inject exceptions produced by x86_decode_insn
Liran Alon <liran.alon(a)oracle.com>
KVM: x86: Exit to user-mode on #UD intercept when emulator requires
Liran Alon <liran.alon(a)oracle.com>
KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk
Adam Ford <aford173(a)gmail.com>
ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate
Adam Ford <aford173(a)gmail.com>
mfd: twl4030-power: Fix pmic for boards that need vmmc1 on reboot
Naofumi Honda <honda(a)math.sci.hokudai.ac.jp>
nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat
Trond Myklebust <trond.myklebust(a)primarydata.com>
nfsd: Fix another OPEN stateid race
Trond Myklebust <trond.myklebust(a)primarydata.com>
nfsd: Fix stateid races between OPEN and CLOSE
Josef Bacik <jbacik(a)fb.com>
btrfs: clear space cache inode generation always
chenjie <chenjie6(a)huawei.com>
mm/madvise.c: fix madvise() infinite loop under special circumstances
Dan Williams <dan.j.williams(a)intel.com>
mm, hugetlbfs: introduce ->split() to vm_operations_struct
Mike Kravetz <mike.kravetz(a)oracle.com>
mm/cma: fix alloc_contig_range ret code/potential leak
Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d()
Adam Ford <aford173(a)gmail.com>
ARM: dts: omap3: logicpd-torpedo-37xx-devkit: Fix MMC1 cd-gpio
Adam Ford <aford173(a)gmail.com>
ARM: dts: LogicPD Torpedo: Fix camera pin mux
-------------
Diffstat:
Documentation/devicetree/bindings/hwmon/jc42.txt | 4 +
Makefile | 4 +-
arch/arm/boot/dts/logicpd-torpedo-37xx-devkit.dts | 8 +-
arch/arm/mach-omap2/pdata-quirks.c | 2 +-
arch/x86/entry/entry_64.S | 10 +--
arch/x86/kvm/lapic.c | 12 ++-
arch/x86/kvm/svm.c | 2 +
arch/x86/kvm/vmx.c | 2 +
arch/x86/kvm/x86.c | 5 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 38 ++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
.../amd/powerplay/hwmgr/process_pptables_v1_0.c | 4 +-
drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 3 +
drivers/gpu/drm/i915/intel_i2c.c | 4 +-
drivers/gpu/drm/panel/panel-simple.c | 2 +
drivers/gpu/drm/radeon/atombios_dp.c | 38 ++++-----
drivers/gpu/drm/radeon/radeon_fb.c | 1 -
drivers/gpu/drm/ttm/ttm_bo_util.c | 1 +
drivers/hwmon/jc42.c | 21 +++++
drivers/i2c/busses/i2c-i801.c | 3 +
drivers/md/bcache/alloc.c | 2 +-
drivers/md/bcache/extents.c | 2 +-
drivers/md/bcache/journal.c | 2 +-
drivers/mfd/twl4030-power.c | 1 +
drivers/misc/eeprom/at24.c | 19 ++++-
drivers/mmc/core/bus.c | 3 +
drivers/mmc/core/mmc.c | 2 +-
drivers/mmc/core/sd.c | 2 +-
drivers/nvme/host/nvme.h | 2 +-
drivers/nvme/host/pci.c | 2 +
fs/btrfs/extent-tree.c | 14 +--
fs/nfs/dir.c | 3 +-
fs/nfsd/nfs4state.c | 99 ++++++++++++++++------
include/linux/mm.h | 1 +
include/uapi/linux/bcache.h | 2 +-
mm/huge_memory.c | 19 ++---
mm/hugetlb.c | 8 ++
mm/madvise.c | 4 +-
mm/mmap.c | 8 +-
mm/page_alloc.c | 9 +-
41 files changed, 250 insertions(+), 122 deletions(-)
This is the start of the stable review cycle for the 3.18.86 release.
There are 12 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed Dec 6 15:59:06 UTC 2017.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.86-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 3.18.86-rc1
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Prevent zero length "index" write
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Don't try indexed reads to alternate slave addresses
NeilBrown <neilb(a)suse.com>
NFS: revalidate "." etc correctly on "open".
Jonathan Liu <net147(a)gmail.com>
drm/panel: simple: Add missing panel_simple_unprepare() calls
Heiner Kallweit <hkallweit1(a)gmail.com>
eeprom: at24: check at24_read/write arguments
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: inject exceptions produced by x86_decode_insn
Liran Alon <liran.alon(a)oracle.com>
KVM: x86: Exit to user-mode on #UD intercept when emulator requires
Josef Bacik <jbacik(a)fb.com>
btrfs: clear space cache inode generation always
chenjie <chenjie6(a)huawei.com>
mm/madvise.c: fix madvise() infinite loop under special circumstances
Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d()
Herbert Xu <herbert(a)gondor.apana.org.au>
ipsec: Fix aborted xfrm policy dump crash
Tom Herbert <tom(a)herbertland.com>
netlink: add a start callback for starting a netlink dump
-------------
Diffstat:
Makefile | 4 ++--
arch/x86/kvm/svm.c | 2 ++
arch/x86/kvm/vmx.c | 2 ++
arch/x86/kvm/x86.c | 2 ++
drivers/gpu/drm/i915/intel_i2c.c | 4 +++-
drivers/gpu/drm/panel/panel-simple.c | 2 ++
drivers/misc/eeprom/at24.c | 6 ++++++
fs/btrfs/extent-tree.c | 14 +++++++-------
fs/nfs/dir.c | 3 ++-
include/linux/netlink.h | 2 ++
include/net/genetlink.h | 2 ++
mm/huge_memory.c | 14 ++++----------
mm/madvise.c | 3 +--
net/netlink/af_netlink.c | 4 ++++
net/netlink/genetlink.c | 16 ++++++++++++++++
net/xfrm/xfrm_user.c | 25 +++++++++++++++----------
16 files changed, 72 insertions(+), 33 deletions(-)