This is a note to let you know that I've just added the patch titled
bcache: only permit to recovery read error when cache device is clean
to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git%3Ba=su...
The filename of the patch is: bcache-only-permit-to-recovery-read-error-when-cache-device-is-clean.patch and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree, please let stable@vger.kernel.org know about it.
From d59b23795933678c9638fd20c942d2b4f3cd6185 Mon Sep 17 00:00:00 2001
From: Coly Li colyli@suse.de
Date: Mon, 30 Oct 2017 14:46:31 -0700
Subject: bcache: only permit to recovery read error when cache device is clean
From: Coly Li colyli@suse.de
commit d59b23795933678c9638fd20c942d2b4f3cd6185 upstream.
When bcache serves read I/O, for example in writeback or writethrough mode, and a read request on the cache device fails, bcache tries to recover the request by reading from the cached device. If the data on the cached device is not in sync with the cache device, the requester gets stale data.
For critical storage systems such as databases, returning stale data from recovery may cause application-level data corruption, which is unacceptable.
With this patch, a failed read request in writeback or writethrough mode is recovered only when the cache device is clean, that is, when all data on the cached device is up to date.
In the other bcache cache modes, read requests never reach cached_dev_read_error(), so they do not need this patch.
Please note that because the cache mode can be switched arbitrarily at run time, a device in writethrough mode may previously have been in writeback mode. Therefore checking dc->has_dirty in writethrough mode still makes sense.
Changelog:
V4: Fix parens error pointed out by Michael Lyle.
v3: Following feedback from Kent Overstreet, who considers recovering stale data a bug to fix and an option to permit it unnecessary, the sysfs file is removed in this version.
v2: rename sysfs entry from allow_stale_data_on_failure to allow_stale_data_on_failure, and fix the confusing commit log.
v1: initial patch posted.
[small change to patch comment spelling by mlyle]
Signed-off-by: Coly Li colyli@suse.de
Signed-off-by: Michael Lyle mlyle@lyle.org
Reported-by: Arne Wolf awolf@lenovo.com
Reviewed-by: Michael Lyle mlyle@lyle.org
Cc: Kent Overstreet kent.overstreet@gmail.com
Cc: Nix nix@esperi.org.uk
Cc: Kai Krakow hurikhan77@gmail.com
Cc: Eric Wheeler bcache@lists.ewheeler.net
Cc: Junhui Tang tang.junhui@zte.com.cn
Signed-off-by: Jens Axboe axboe@kernel.dk
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/md/bcache/request.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -702,8 +702,16 @@ static void cached_dev_read_error(struct
 {
 	struct search *s = container_of(cl, struct search, cl);
 	struct bio *bio = &s->bio.bio;
+	struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
 
-	if (s->recoverable) {
+	/*
+	 * If cache device is dirty (dc->has_dirty is non-zero), then
+	 * recovery a failed read request from cached device may get a
+	 * stale data back. So read failure recovery is only permitted
+	 * when cache device is clean.
+	 */
+	if (s->recoverable &&
+	    (dc && !atomic_read(&dc->has_dirty))) {
 		/* Retry from the backing device: */
 		trace_bcache_read_retry(s->orig_bio);
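The retry condition described in the commit message can be modelled outside the kernel as a small truth table. The following is only a userspace sketch, not kernel code: struct cached_dev_model, may_retry_from_backing() and check_policy() are hypothetical names introduced for illustration, and a plain int stands in for the atomic dc->has_dirty counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cached_dev; only the field the patch reads. */
struct cached_dev_model {
	int has_dirty;	/* models atomic_read(&dc->has_dirty); non-zero means dirty */
};

/*
 * Models the patched condition: a failed cache read may be retried from
 * the backing (cached) device only if the request is recoverable, dc is
 * valid, and the cache device holds no dirty data.
 */
bool may_retry_from_backing(bool recoverable, const struct cached_dev_model *dc)
{
	return recoverable && dc != NULL && dc->has_dirty == 0;
}

/* Walks the cases the commit message distinguishes. */
void check_policy(void)
{
	struct cached_dev_model clean = { .has_dirty = 0 };
	struct cached_dev_model dirty = { .has_dirty = 1 };

	assert(may_retry_from_backing(true, &clean));	/* clean cache: retry allowed */
	assert(!may_retry_from_backing(true, &dirty));	/* dirty cache: could return stale data */
	assert(!may_retry_from_backing(false, &clean));	/* unrecoverable request: never retried */
	assert(!may_retry_from_backing(true, NULL));	/* no device: never retried */
}
```

Before the patch, only the recoverable flag was consulted, so the second case would have been retried and could have returned stale data.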
Patches currently in stable-queue which might be from colyli@suse.de are
queue-4.9/bcache-only-permit-to-recovery-read-error-when-cache-device-is-clean.patch
queue-4.9/bcache-check-ca-alloc_thread-initialized-before-wake-up-it.patch
On 27/11/17 16:06, gregkh@linuxfoundation.org wrote:
> This is a note to let you know that I've just added the patch titled
> bcache: only permit to recovery read error when cache device is clean
> to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git%3Ba=su...
> The filename of the patch is: bcache-only-permit-to-recovery-read-error-when-cache-device-is-clean.patch and it can be found in the queue-4.9 subdirectory.
> If you, or anyone else, feels it should not be added to the stable tree, please let stable@vger.kernel.org know about it.
Hi Greg,
This patch is an important fix for a possible data corruption issue, but it also introduces a bug that can produce read errors under certain circumstances. The issue introduced by this patch is not as severe as the issue it fixes, but can lead to e.g. upper layer fs remounting read only. In my environment upper layer handled it badly and indirectly resulted in data loss (not directly bcache's fault of course).
Michael Lyle CC'd the stable list with a follow up fix for the issue introduced by this patch, on 24th Nov, subject "bcache: recover data from backing when data is clean".
However, the followup fix is not in Linus' tree yet, only in Michael's. I guess that means you can't pick it up yet. Never-the-less I felt it important to point this out here.
thanks, Eddie
On 11/27/2017 09:45 AM, Eddie Chapman wrote:
> On 27/11/17 16:06, gregkh@linuxfoundation.org wrote:
>> This is a note to let you know that I've just added the patch titled
>> bcache: only permit to recovery read error when cache device is clean
>> to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git%3Ba=su...
>> The filename of the patch is: bcache-only-permit-to-recovery-read-error-when-cache-device-is-clean.patch and it can be found in the queue-4.9 subdirectory.
>> If you, or anyone else, feels it should not be added to the stable tree, please let stable@vger.kernel.org know about it.
> Hi Greg,
> This patch is an important fix for a possible data corruption issue, but it also introduces a bug that can produce read errors under certain circumstances. The issue introduced by this patch is not as severe as the issue it fixes, but can lead to e.g. upper layer fs remounting read only. In my environment upper layer handled it badly and indirectly resulted in data loss (not directly bcache's fault of course).
> Michael Lyle CC'd the stable list with a follow up fix for the issue introduced by this patch, on 24th Nov, subject "bcache: recover data from backing when data is clean".
> However, the followup fix is not in Linus' tree yet, only in Michael's. I guess that means you can't pick it up yet. Never-the-less I felt it important to point this out here.
It's commit e393aa2446150536929140739f09c6ecbcbea7f0 in my tree and will go upstream shortly - but yes, probably should not add this one, before both can be pulled in.