The patch below was submitted to be applied to the 4.14-stable tree.
I fail to see how this patch meets the stable kernel rules as found at
Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to
<stable(a)vger.kernel.org> and let me know why this patch should be
applied. Otherwise, it is now dropped from my patch queues, never to be
seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8a7a8e1eab929eb3a5b735a788a23b9731139046 Mon Sep 17 00:00:00 2001
From: Dou Liyang <douly.fnst(a)cn.fujitsu.com>
Date: Mon, 13 Nov 2017 13:49:04 +0800
Subject: [PATCH] timekeeping: Eliminate the stale declaration of
ktime_get_raw_and_real_ts64()
Commit ba26621e63ce got rid of ktime_get_raw_and_real_ts64(), but left its
declaration behind.
Remove it.
Fixes: ba26621e63ce ("time: Remove duplicated code in ktime_get_raw_and_real()")
Signed-off-by: Dou Liyang <douly.fnst(a)cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Christopher S. Hall <christopher.s.hall(a)intel.com>
Cc: joelaf(a)google.com
Cc: arnd(a)arndb.de
Cc: gregkh(a)linuxfoundation.org
Cc: john.stultz(a)linaro.org
Cc: deepa.kernel(a)gmail.com
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/1510552144-20831-1-git-send-email-douly.fnst@cn.f…
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 0021575fe871..51293e1aa4da 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -272,12 +272,6 @@ extern bool timekeeping_rtc_skipresume(void);
extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);
-/*
- * PPS accessor
- */
-extern void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw,
- struct timespec64 *ts_real);
-
/*
* struct system_time_snapshot - simultaneous raw/real time capture with
* counter value
This is a note to let you know that I've just added the patch titled
dm bufio: fix integer overflow when limiting maximum cache size
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-bufio-fix-integer-overflow-when-limiting-maximum-cache-size.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 74d4108d9e681dbbe4a2940ed8fdff1f6868184c Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Wed, 15 Nov 2017 16:38:09 -0800
Subject: dm bufio: fix integer overflow when limiting maximum cache size
From: Eric Biggers <ebiggers(a)google.com>
commit 74d4108d9e681dbbe4a2940ed8fdff1f6868184c upstream.
The default max_cache_size_bytes for dm-bufio is meant to be the lesser
of 25% of the size of the vmalloc area and 2% of the size of lowmem.
However, on 32-bit systems the intermediate result in the expression
(VMALLOC_END - VMALLOC_START) * DM_BUFIO_VMALLOC_PERCENT / 100
overflows, causing the wrong result to be computed. For example, on a
32-bit system where the vmalloc area is 520093696 bytes, the result is
1174405 rather than the expected 130023424, which makes the maximum
cache size much too small (far less than 2% of lowmem). This causes
severe performance problems for dm-verity users on affected systems.
Fix this by using mult_frac() to correctly multiply by a percentage. Do
this for all places in dm-bufio that multiply by a percentage. Also
replace (VMALLOC_END - VMALLOC_START) with VMALLOC_TOTAL, which contrary
to the comment is now defined in include/linux/vmalloc.h.
Depends-on: 9993bc635 ("sched/x86: Fix overflow in cyc2ns_offset")
Fixes: 95d402f057f2 ("dm: add bufio")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-bufio.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -937,7 +937,8 @@ static void __get_memory_limit(struct dm
buffers = c->minimum_buffers;
*limit_buffers = buffers;
- *threshold_buffers = buffers * DM_BUFIO_WRITEBACK_PERCENT / 100;
+ *threshold_buffers = mult_frac(buffers,
+ DM_BUFIO_WRITEBACK_PERCENT, 100);
}
/*
@@ -1856,19 +1857,15 @@ static int __init dm_bufio_init(void)
memset(&dm_bufio_caches, 0, sizeof dm_bufio_caches);
memset(&dm_bufio_cache_names, 0, sizeof dm_bufio_cache_names);
- mem = (__u64)((totalram_pages - totalhigh_pages) *
- DM_BUFIO_MEMORY_PERCENT / 100) << PAGE_SHIFT;
+ mem = (__u64)mult_frac(totalram_pages - totalhigh_pages,
+ DM_BUFIO_MEMORY_PERCENT, 100) << PAGE_SHIFT;
if (mem > ULONG_MAX)
mem = ULONG_MAX;
#ifdef CONFIG_MMU
- /*
- * Get the size of vmalloc space the same way as VMALLOC_TOTAL
- * in fs/proc/internal.h
- */
- if (mem > (VMALLOC_END - VMALLOC_START) * DM_BUFIO_VMALLOC_PERCENT / 100)
- mem = (VMALLOC_END - VMALLOC_START) * DM_BUFIO_VMALLOC_PERCENT / 100;
+ if (mem > mult_frac(VMALLOC_TOTAL, DM_BUFIO_VMALLOC_PERCENT, 100))
+ mem = mult_frac(VMALLOC_TOTAL, DM_BUFIO_VMALLOC_PERCENT, 100);
#endif
dm_bufio_default_cache_size = mem;
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.9/lib-mpi-call-cond_resched-from-mpi_powm-loop.patch
queue-4.9/dm-bufio-fix-integer-overflow-when-limiting-maximum-cache-size.patch
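The arithmetic in the dm-bufio changelog above can be reproduced in userspace. What follows is a minimal sketch, not the kernel code: the function names percent_naive32() and percent_mult_frac32() are made up for illustration, and uint32_t stands in for a 32-bit unsigned long. The second function follows the same divide-first arithmetic as the kernel's mult_frac() macro.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Naive percentage, as in the old dm-bufio expression.  With a 32-bit
 * type the product x * percent wraps modulo 2^32 before the division.
 */
uint32_t percent_naive32(uint32_t x, uint32_t percent)
{
	return x * percent / 100;
}

/*
 * Divide first, then add back the remainder's share.  The intermediate
 * products stay small as long as the final result itself fits in the
 * type, so the overflow above does not occur.
 */
uint32_t percent_mult_frac32(uint32_t x, uint32_t numer, uint32_t denom)
{
	uint32_t quot, rem;

	assert(denom != 0);
	quot = x / denom;
	rem = x % denom;
	return quot * numer + rem * numer / denom;
}
```

With the changelog's numbers, percent_naive32(520093696, 25) yields 1174405, while percent_mult_frac32(520093696, 25, 100) yields the expected 130023424.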
This is a note to let you know that I've just added the patch titled
dm: allocate struct mapped_device with kvzalloc
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-allocate-struct-mapped_device-with-kvzalloc.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 856eb0916d181da6d043cc33e03f54d5c5bbe54a Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Tue, 31 Oct 2017 19:33:02 -0400
Subject: dm: allocate struct mapped_device with kvzalloc
From: Mikulas Patocka <mpatocka(a)redhat.com>
commit 856eb0916d181da6d043cc33e03f54d5c5bbe54a upstream.
The structure srcu_struct can be very big; its size is proportional to the
value of CONFIG_NR_CPUS. The Fedora kernel has CONFIG_NR_CPUS 8192, so the
field io_barrier in struct mapped_device takes 84kB in the debugging kernel
and 50kB in the non-debugging kernel. The large size may result in failure
of the function kzalloc_node.
In order to avoid the allocation failure, we use the function
kvzalloc_node, which falls back to vmalloc if a large contiguous
chunk of memory is not available. This patch also moves the field
io_barrier to the last position of struct mapped_device - the reason is
that on many processor architectures, short memory offsets result in
smaller code than long memory offsets - on x86-64 it reduces code size by
320 bytes.
Note to stable kernel maintainers - the kernels 4.11 and older don't have
the function kvzalloc_node, you can use the function vzalloc_node instead.
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-core.h | 3 ++-
drivers/md/dm.c | 7 ++++---
2 files changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -29,7 +29,6 @@ struct dm_kobject_holder {
* DM targets must _not_ deference a mapped_device to directly access its members!
*/
struct mapped_device {
- struct srcu_struct io_barrier;
struct mutex suspend_lock;
/*
@@ -127,6 +126,8 @@ struct mapped_device {
struct blk_mq_tag_set *tag_set;
bool use_blk_mq:1;
bool init_tio_pdu:1;
+
+ struct srcu_struct io_barrier;
};
void dm_init_md_queue(struct mapped_device *md);
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -21,6 +21,7 @@
#include <linux/delay.h>
#include <linux/wait.h>
#include <linux/pr.h>
+#include <linux/vmalloc.h>
#define DM_MSG_PREFIX "core"
@@ -1511,7 +1512,7 @@ static struct mapped_device *alloc_dev(i
struct mapped_device *md;
void *old_md;
- md = kzalloc_node(sizeof(*md), GFP_KERNEL, numa_node_id);
+ md = vzalloc_node(sizeof(*md), numa_node_id);
if (!md) {
DMWARN("unable to allocate device, out of memory.");
return NULL;
@@ -1605,7 +1606,7 @@ bad_io_barrier:
bad_minor:
module_put(THIS_MODULE);
bad_module_get:
- kfree(md);
+ kvfree(md);
return NULL;
}
@@ -1624,7 +1625,7 @@ static void free_dev(struct mapped_devic
free_minor(minor);
module_put(THIS_MODULE);
- kfree(md);
+ kvfree(md);
}
static void __bind_mempools(struct mapped_device *md, struct dm_table *t)
Patches currently in stable-queue which might be from mpatocka(a)redhat.com are
queue-4.9/dm-allocate-struct-mapped_device-with-kvzalloc.patch
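The field reordering described in the mapped_device changelog above can be illustrated with offsetof(). This is a rough userspace sketch under stated assumptions: the two structs and the 50kB array are stand-ins invented for the example, not the real struct mapped_device or srcu_struct. It only shows that moving the big member to the end gives every other member a small offset; on x86-64, small offsets can be encoded in fewer instruction bytes than large ones, which is presumably where the reported code size saving comes from.

```c
#include <stddef.h>

/* Hypothetical stand-in for a large srcu_struct (about 50kB). */
#define BIG_FIELD_BYTES (50 * 1024)

/* Big member first: every later member sits at a large offset. */
struct big_first {
	char io_barrier[BIG_FIELD_BYTES];
	int suspend_lock;
	int flags;
};

/* Big member last: the frequently used members keep small offsets. */
struct big_last {
	int suspend_lock;
	int flags;
	char io_barrier[BIG_FIELD_BYTES];
};

size_t offset_big_first(void)
{
	return offsetof(struct big_first, suspend_lock);
}

size_t offset_big_last(void)
{
	return offsetof(struct big_last, suspend_lock);
}
```

Both layouts have the same total size; only the offsets of the small members change.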
This is a note to let you know that I've just added the patch titled
dm bufio: fix integer overflow when limiting maximum cache size
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-bufio-fix-integer-overflow-when-limiting-maximum-cache-size.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 74d4108d9e681dbbe4a2940ed8fdff1f6868184c Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Wed, 15 Nov 2017 16:38:09 -0800
Subject: dm bufio: fix integer overflow when limiting maximum cache size
From: Eric Biggers <ebiggers(a)google.com>
commit 74d4108d9e681dbbe4a2940ed8fdff1f6868184c upstream.
The default max_cache_size_bytes for dm-bufio is meant to be the lesser
of 25% of the size of the vmalloc area and 2% of the size of lowmem.
However, on 32-bit systems the intermediate result in the expression
(VMALLOC_END - VMALLOC_START) * DM_BUFIO_VMALLOC_PERCENT / 100
overflows, causing the wrong result to be computed. For example, on a
32-bit system where the vmalloc area is 520093696 bytes, the result is
1174405 rather than the expected 130023424, which makes the maximum
cache size much too small (far less than 2% of lowmem). This causes
severe performance problems for dm-verity users on affected systems.
Fix this by using mult_frac() to correctly multiply by a percentage. Do
this for all places in dm-bufio that multiply by a percentage. Also
replace (VMALLOC_END - VMALLOC_START) with VMALLOC_TOTAL, which contrary
to the comment is now defined in include/linux/vmalloc.h.
Depends-on: 9993bc635 ("sched/x86: Fix overflow in cyc2ns_offset")
Fixes: 95d402f057f2 ("dm: add bufio")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-bufio.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -928,7 +928,8 @@ static void __get_memory_limit(struct dm
buffers = c->minimum_buffers;
*limit_buffers = buffers;
- *threshold_buffers = buffers * DM_BUFIO_WRITEBACK_PERCENT / 100;
+ *threshold_buffers = mult_frac(buffers,
+ DM_BUFIO_WRITEBACK_PERCENT, 100);
}
/*
@@ -1829,19 +1830,15 @@ static int __init dm_bufio_init(void)
memset(&dm_bufio_caches, 0, sizeof dm_bufio_caches);
memset(&dm_bufio_cache_names, 0, sizeof dm_bufio_cache_names);
- mem = (__u64)((totalram_pages - totalhigh_pages) *
- DM_BUFIO_MEMORY_PERCENT / 100) << PAGE_SHIFT;
+ mem = (__u64)mult_frac(totalram_pages - totalhigh_pages,
+ DM_BUFIO_MEMORY_PERCENT, 100) << PAGE_SHIFT;
if (mem > ULONG_MAX)
mem = ULONG_MAX;
#ifdef CONFIG_MMU
- /*
- * Get the size of vmalloc space the same way as VMALLOC_TOTAL
- * in fs/proc/internal.h
- */
- if (mem > (VMALLOC_END - VMALLOC_START) * DM_BUFIO_VMALLOC_PERCENT / 100)
- mem = (VMALLOC_END - VMALLOC_START) * DM_BUFIO_VMALLOC_PERCENT / 100;
+ if (mem > mult_frac(VMALLOC_TOTAL, DM_BUFIO_VMALLOC_PERCENT, 100))
+ mem = mult_frac(VMALLOC_TOTAL, DM_BUFIO_VMALLOC_PERCENT, 100);
#endif
dm_bufio_default_cache_size = mem;
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.4/lib-mpi-call-cond_resched-from-mpi_powm-loop.patch
queue-4.4/dm-bufio-fix-integer-overflow-when-limiting-maximum-cache-size.patch
This is a note to let you know that I've just added the patch titled
ovl: Put upperdentry if ovl_check_origin() fails
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ovl-put-upperdentry-if-ovl_check_origin-fails.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5455f92b54e516995a9ca45bbf790d3629c27a93 Mon Sep 17 00:00:00 2001
From: Vivek Goyal <vgoyal(a)redhat.com>
Date: Wed, 1 Nov 2017 15:37:22 -0400
Subject: ovl: Put upperdentry if ovl_check_origin() fails
From: Vivek Goyal <vgoyal(a)redhat.com>
commit 5455f92b54e516995a9ca45bbf790d3629c27a93 upstream.
If ovl_check_origin() fails, we should put upperdentry. We have a reference
on it by now. So goto out_put_upper instead of out.
Fixes: a9d019573e88 ("ovl: lookup non-dir copy-up-origin by file handle")
Signed-off-by: Vivek Goyal <vgoyal(a)redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/overlayfs/namei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -630,7 +630,7 @@ struct dentry *ovl_lookup(struct inode *
err = ovl_check_origin(upperdentry, roe->lowerstack,
roe->numlower, &stack, &ctr);
if (err)
- goto out;
+ goto out_put_upper;
}
if (d.redirect) {
Patches currently in stable-queue which might be from vgoyal(a)redhat.com are
queue-4.14/ovl-put-upperdentry-if-ovl_check_origin-fails.patch
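The one-line ovl fix above is an instance of the usual goto-based cleanup pattern: once a reference is held, every error exit has to go through the label that drops it. A minimal sketch, assuming a toy reference counter; struct ref, ref_get(), ref_put() and lookup_sketch() are hypothetical names for illustration and are not overlayfs APIs.

```c
/* Toy reference counter standing in for a dentry reference. */
struct ref {
	int count;
};

void ref_get(struct ref *r)
{
	r->count++;
}

void ref_put(struct ref *r)
{
	r->count--;
}

/*
 * check_fails simulates the origin check returning an error after the
 * reference on "upper" has already been taken.
 */
int lookup_sketch(struct ref *upper, int check_fails)
{
	int err;

	ref_get(upper);			/* we hold a reference from here on */

	if (check_fails) {
		err = -1;
		goto out_put_upper;	/* a plain "goto out" would leak it */
	}

	return 0;			/* success: the caller keeps the reference */

out_put_upper:
	ref_put(upper);
	return err;
}
```

After a failing call the counter is back where it started; after a successful call the caller owns one reference.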
This is a note to let you know that I've just added the patch titled
dm zoned: ignore last smaller runt zone
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-zoned-ignore-last-smaller-runt-zone.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 114e025968b5990ad0b57bf60697ea64ee206aac Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)wdc.com>
Date: Sat, 28 Oct 2017 16:39:34 +0900
Subject: dm zoned: ignore last smaller runt zone
From: Damien Le Moal <damien.lemoal(a)wdc.com>
commit 114e025968b5990ad0b57bf60697ea64ee206aac upstream.
The SCSI layer allows ZBC drives to have a smaller last runt zone. For
such a device, specifying the entire capacity for a dm-zoned target
table entry fails because the specified capacity is not aligned on a
device zone size indicated in the request queue structure of the
device.
Fix this problem by ignoring the last runt zone in the entry length
when setting up the dm-zoned target (ctr method) and when iterating table
entries of the target (iterate_devices method). This allows dm-zoned
users to still easily set up a target using the entire device capacity
(as mandated by dm-zoned) or the aligned capacity excluding the last
runt zone.
While at it, replace direct references to the device queue chunk_sectors
limit with calls to the accessor blk_queue_zone_sectors().
Reported-by: Peter Desnoyers <pjd(a)ccs.neu.edu>
Signed-off-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-zoned-target.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -660,6 +660,7 @@ static int dmz_get_zoned_device(struct d
struct dmz_target *dmz = ti->private;
struct request_queue *q;
struct dmz_dev *dev;
+ sector_t aligned_capacity;
int ret;
/* Get the target device */
@@ -685,15 +686,17 @@ static int dmz_get_zoned_device(struct d
goto err;
}
+ q = bdev_get_queue(dev->bdev);
dev->capacity = i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT;
- if (ti->begin || (ti->len != dev->capacity)) {
+ aligned_capacity = dev->capacity & ~(blk_queue_zone_sectors(q) - 1);
+ if (ti->begin ||
+ ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
ti->error = "Partial mapping not supported";
ret = -EINVAL;
goto err;
}
- q = bdev_get_queue(dev->bdev);
- dev->zone_nr_sectors = q->limits.chunk_sectors;
+ dev->zone_nr_sectors = blk_queue_zone_sectors(q);
dev->zone_nr_sectors_shift = ilog2(dev->zone_nr_sectors);
dev->zone_nr_blocks = dmz_sect2blk(dev->zone_nr_sectors);
@@ -929,8 +932,10 @@ static int dmz_iterate_devices(struct dm
iterate_devices_callout_fn fn, void *data)
{
struct dmz_target *dmz = ti->private;
+ struct dmz_dev *dev = dmz->dev;
+ sector_t capacity = dev->capacity & ~(dev->zone_nr_sectors - 1);
- return fn(ti, dmz->ddev, 0, dmz->dev->capacity, data);
+ return fn(ti, dmz->ddev, 0, capacity, data);
}
static struct target_type dmz_type = {
Patches currently in stable-queue which might be from damien.lemoal(a)wdc.com are
queue-4.14/dm-zoned-ignore-last-smaller-runt-zone.patch
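The capacity check introduced by the dm-zoned patch above boils down to a power-of-two mask. Here is a small userspace sketch of that arithmetic; aligned_capacity() and len_is_acceptable() are illustrative names only, and the sketch assumes the zone size in sectors is a power of two, which the mask relies on. The real check additionally requires the target to start at sector 0.

```c
#include <assert.h>
#include <stdint.h>

/* Round capacity down to a multiple of the zone size (in sectors). */
uint64_t aligned_capacity(uint64_t capacity, uint64_t zone_sectors)
{
	/* The mask below is only valid for a power-of-two zone size. */
	assert(zone_sectors != 0 && (zone_sectors & (zone_sectors - 1)) == 0);
	return capacity & ~(zone_sectors - 1);
}

/*
 * Accept a table entry length that covers either the whole device or
 * the device minus its last runt zone; anything else is a partial
 * mapping.
 */
int len_is_acceptable(uint64_t len, uint64_t capacity, uint64_t zone_sectors)
{
	return len == capacity ||
	       len == aligned_capacity(capacity, zone_sectors);
}
```

For example, with 524288-sector zones and a capacity of 5243880 sectors (ten full zones plus a 1000-sector runt), both 5243880 and 5242880 are accepted as lengths.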
This is a note to let you know that I've just added the patch titled
dm mpath: remove annoying message of 'blk_get_request() returned -11'
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-mpath-remove-annoying-message-of-blk_get_request-returned-11.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9dc112e2daf87b40607fd8d357e2d7de32290d45 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Sat, 30 Sep 2017 19:46:48 +0800
Subject: dm mpath: remove annoying message of 'blk_get_request() returned -11'
From: Ming Lei <ming.lei(a)redhat.com>
commit 9dc112e2daf87b40607fd8d357e2d7de32290d45 upstream.
It is very normal to see allocation failure, especially with blk-mq
request_queues, so it's unnecessary to report this error and annoy
people.
In practice this 'blk_get_request() returned -11' error gets logged
quite frequently when a blk-mq DM multipath device sees heavy IO.
This change is marked for stable@ because the annoying message in
question was included in stable@ commit 7083abbbf.
Fixes: 7083abbbf ("dm mpath: avoid that path removal can trigger an infinite loop")
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-mpath.c | 2 --
1 file changed, 2 deletions(-)
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -499,8 +499,6 @@ static int multipath_clone_and_map(struc
if (IS_ERR(clone)) {
/* EBUSY, ENODEV or EWOULDBLOCK: requeue */
bool queue_dying = blk_queue_dying(q);
- DMERR_LIMIT("blk_get_request() returned %ld%s - requeuing",
- PTR_ERR(clone), queue_dying ? " (path offline)" : "");
if (queue_dying) {
atomic_inc(&m->pg_init_in_progress);
activate_or_offline_path(pgpath);
Patches currently in stable-queue which might be from ming.lei(a)redhat.com are
queue-4.14/dm-mpath-remove-annoying-message-of-blk_get_request-returned-11.patch
This is a note to let you know that I've just added the patch titled
dm integrity: allow unaligned bv_offset
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-integrity-allow-unaligned-bv_offset.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 95b1369a9638cfa322ad1c0cde8efbe524059884 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Tue, 7 Nov 2017 10:40:40 -0500
Subject: dm integrity: allow unaligned bv_offset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Mikulas Patocka <mpatocka(a)redhat.com>
commit 95b1369a9638cfa322ad1c0cde8efbe524059884 upstream.
When slub_debug is enabled kmalloc returns unaligned memory. XFS uses
this unaligned memory for its buffers (if an unaligned buffer crosses a
page, XFS frees it and allocates a full page instead - see the function
xfs_buf_allocate_memory).
dm-integrity checks if bv_offset is aligned on page size and this check
fails with slub_debug and XFS.
Fix this bug by removing the bv_offset check, leaving only the check for
bv_len.
Fixes: 7eada909bfd7 ("dm: add integrity target")
Reported-by: Bruno Prémont <bonbons(a)sysophe.eu>
Reviewed-by: Milan Broz <gmazyland(a)gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-integrity.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -1376,7 +1376,7 @@ static int dm_integrity_map(struct dm_ta
struct bvec_iter iter;
struct bio_vec bv;
bio_for_each_segment(bv, bio, iter) {
- if (unlikely((bv.bv_offset | bv.bv_len) & ((ic->sectors_per_block << SECTOR_SHIFT) - 1))) {
+ if (unlikely(bv.bv_len & ((ic->sectors_per_block << SECTOR_SHIFT) - 1))) {
DMERR("Bio vector (%u,%u) is not aligned on %u-sector boundary",
bv.bv_offset, bv.bv_len, ic->sectors_per_block);
return DM_MAPIO_KILL;
Patches currently in stable-queue which might be from mpatocka(a)redhat.com are
queue-4.14/dm-allocate-struct-mapped_device-with-kvzalloc.patch
queue-4.14/dm-integrity-allow-unaligned-bv_offset.patch
queue-4.14/dm-crypt-allow-unaligned-bv_offset.patch
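The effect of dropping bv_offset from the dm-integrity mask test above can be seen with plain integers. This is a hedged sketch, not driver code: old_check_rejects() and new_check_rejects() are names made up here, and block_bytes stands for ic->sectors_per_block << SECTOR_SHIFT, assumed to be a power of two.

```c
#include <stdint.h>

/* Old test: reject if either the offset or the length is unaligned. */
int old_check_rejects(uint32_t bv_offset, uint32_t bv_len,
		      uint32_t block_bytes)
{
	return ((bv_offset | bv_len) & (block_bytes - 1)) != 0;
}

/* New test: only the length has to be a multiple of the block size. */
int new_check_rejects(uint32_t bv_len, uint32_t block_bytes)
{
	return (bv_len & (block_bytes - 1)) != 0;
}
```

A segment at offset 520 with length 4096 and a 512-byte block is rejected by the old test but passes the new one, while a length of 100 is still rejected by both.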
This is a note to let you know that I've just added the patch titled
dm crypt: allow unaligned bv_offset
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-crypt-allow-unaligned-bv_offset.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0440d5c0ca9744b92a07aeb6df0a9a75db6f4280 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Tue, 7 Nov 2017 10:35:57 -0500
Subject: dm crypt: allow unaligned bv_offset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Mikulas Patocka <mpatocka(a)redhat.com>
commit 0440d5c0ca9744b92a07aeb6df0a9a75db6f4280 upstream.
When slub_debug is enabled kmalloc returns unaligned memory. XFS uses
this unaligned memory for its buffers (if an unaligned buffer crosses a
page, XFS frees it and allocates a full page instead - see the function
xfs_buf_allocate_memory).
dm-crypt checks if bv_offset is aligned on page size and these checks
fail with slub_debug and XFS.
Fix this bug by removing the bv_offset checks. Switch to checking if
bv_len is aligned instead of bv_offset (this check should be sufficient
to prevent overruns if a bio with too small bv_len is received).
Fixes: 8f0009a22517 ("dm crypt: optionally support larger encryption sector size")
Reported-by: Bruno Prémont <bonbons(a)sysophe.eu>
Tested-by: Bruno Prémont <bonbons(a)sysophe.eu>
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reviewed-by: Milan Broz <gmazyland(a)gmail.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-crypt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1075,7 +1075,7 @@ static int crypt_convert_block_aead(stru
BUG_ON(cc->integrity_iv_size && cc->integrity_iv_size != cc->iv_size);
/* Reject unexpected unaligned bio. */
- if (unlikely(bv_in.bv_offset & (cc->sector_size - 1)))
+ if (unlikely(bv_in.bv_len & (cc->sector_size - 1)))
return -EIO;
dmreq = dmreq_of_req(cc, req);
@@ -1168,7 +1168,7 @@ static int crypt_convert_block_skcipher(
int r = 0;
/* Reject unexpected unaligned bio. */
- if (unlikely(bv_in.bv_offset & (cc->sector_size - 1)))
+ if (unlikely(bv_in.bv_len & (cc->sector_size - 1)))
return -EIO;
dmreq = dmreq_of_req(cc, req);
Patches currently in stable-queue which might be from mpatocka(a)redhat.com are
queue-4.14/dm-allocate-struct-mapped_device-with-kvzalloc.patch
queue-4.14/dm-integrity-allow-unaligned-bv_offset.patch
queue-4.14/dm-crypt-allow-unaligned-bv_offset.patch
This is a note to let you know that I've just added the patch titled
dm cache: fix race condition in the writeback mode overwrite_bio optimisation
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-cache-fix-race-condition-in-the-writeback-mode-overwrite_bio-optimisation.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d1260e2a3f85f4c1010510a15f89597001318b1b Mon Sep 17 00:00:00 2001
From: Joe Thornber <ejt(a)redhat.com>
Date: Fri, 10 Nov 2017 07:53:31 -0500
Subject: dm cache: fix race condition in the writeback mode overwrite_bio optimisation
From: Joe Thornber <ejt(a)redhat.com>
commit d1260e2a3f85f4c1010510a15f89597001318b1b upstream.
When a DM cache in writeback mode moves data between the slow and fast
device it can often avoid a copy if the triggering bio either:
i) covers the whole block (no point copying if we're about to overwrite it)
ii) the migration is a promotion and the origin block is currently discarded
Prior to this fix there was a race with case (ii). The discard status
was checked with a shared lock held (rather than exclusive). This meant
another bio could run in parallel and write data to the origin, removing
the discard state. After the promotion the parallel write would have
been lost.
With this fix the discard status is re-checked once the exclusive lock
has been acquired. If the block is no longer discarded it falls back to
the slower full copy path.
Fixes: b29d4986d ("dm cache: significant rework to leverage dm-bio-prison-v2")
Signed-off-by: Joe Thornber <ejt(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm-cache-target.c | 86 ++++++++++++++++++++++++++-----------------
1 file changed, 53 insertions(+), 33 deletions(-)
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -1201,6 +1201,18 @@ static void background_work_end(struct c
/*----------------------------------------------------------------*/
+static bool bio_writes_complete_block(struct cache *cache, struct bio *bio)
+{
+ return (bio_data_dir(bio) == WRITE) &&
+ (bio->bi_iter.bi_size == (cache->sectors_per_block << SECTOR_SHIFT));
+}
+
+static bool optimisable_bio(struct cache *cache, struct bio *bio, dm_oblock_t block)
+{
+ return writeback_mode(&cache->features) &&
+ (is_discarded_oblock(cache, block) || bio_writes_complete_block(cache, bio));
+}
+
static void quiesce(struct dm_cache_migration *mg,
void (*continuation)(struct work_struct *))
{
@@ -1474,13 +1486,51 @@ static void mg_upgrade_lock(struct work_
}
}
+static void mg_full_copy(struct work_struct *ws)
+{
+ struct dm_cache_migration *mg = ws_to_mg(ws);
+ struct cache *cache = mg->cache;
+ struct policy_work *op = mg->op;
+ bool is_policy_promote = (op->op == POLICY_PROMOTE);
+
+ if ((!is_policy_promote && !is_dirty(cache, op->cblock)) ||
+ is_discarded_oblock(cache, op->oblock)) {
+ mg_upgrade_lock(ws);
+ return;
+ }
+
+ init_continuation(&mg->k, mg_upgrade_lock);
+
+ if (copy(mg, is_policy_promote)) {
+ DMERR_LIMIT("%s: migration copy failed", cache_device_name(cache));
+ mg->k.input = BLK_STS_IOERR;
+ mg_complete(mg, false);
+ }
+}
+
static void mg_copy(struct work_struct *ws)
{
- int r;
struct dm_cache_migration *mg = ws_to_mg(ws);
if (mg->overwrite_bio) {
/*
+ * No exclusive lock was held when we last checked if the bio
+ * was optimisable. So we have to check again in case things
+ * have changed (eg, the block may no longer be discarded).
+ */
+ if (!optimisable_bio(mg->cache, mg->overwrite_bio, mg->op->oblock)) {
+ /*
+ * Fallback to a real full copy after doing some tidying up.
+ */
+ bool rb = bio_detain_shared(mg->cache, mg->op->oblock, mg->overwrite_bio);
+ BUG_ON(rb); /* An exclussive lock must _not_ be held for this block */
+ mg->overwrite_bio = NULL;
+ inc_io_migrations(mg->cache);
+ mg_full_copy(ws);
+ return;
+ }
+
+ /*
* It's safe to do this here, even though it's new data
* because all IO has been locked out of the block.
*
@@ -1489,26 +1539,8 @@ static void mg_copy(struct work_struct *
*/
overwrite(mg, mg_update_metadata_after_copy);
- } else {
- struct cache *cache = mg->cache;
- struct policy_work *op = mg->op;
- bool is_policy_promote = (op->op == POLICY_PROMOTE);
-
- if ((!is_policy_promote && !is_dirty(cache, op->cblock)) ||
- is_discarded_oblock(cache, op->oblock)) {
- mg_upgrade_lock(ws);
- return;
- }
-
- init_continuation(&mg->k, mg_upgrade_lock);
-
- r = copy(mg, is_policy_promote);
- if (r) {
- DMERR_LIMIT("%s: migration copy failed", cache_device_name(cache));
- mg->k.input = BLK_STS_IOERR;
- mg_complete(mg, false);
- }
- }
+ } else
+ mg_full_copy(ws);
}
static int mg_lock_writes(struct dm_cache_migration *mg)
@@ -1748,18 +1780,6 @@ static void inc_miss_counter(struct cach
/*----------------------------------------------------------------*/
-static bool bio_writes_complete_block(struct cache *cache, struct bio *bio)
-{
- return (bio_data_dir(bio) == WRITE) &&
- (bio->bi_iter.bi_size == (cache->sectors_per_block << SECTOR_SHIFT));
-}
-
-static bool optimisable_bio(struct cache *cache, struct bio *bio, dm_oblock_t block)
-{
- return writeback_mode(&cache->features) &&
- (is_discarded_oblock(cache, block) || bio_writes_complete_block(cache, bio));
-}
-
static int map_bio(struct cache *cache, struct bio *bio, dm_oblock_t block,
bool *commit_needed)
{
Patches currently in stable-queue which might be from ejt(a)redhat.com are
queue-4.14/dm-cache-fix-race-condition-in-the-writeback-mode-overwrite_bio-optimisation.patch
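The re-check described in the dm cache changelog above follows a common pattern: a decision taken without the exclusive lock is treated as a hint and is evaluated again once the lock is held. The sketch below is a single-threaded simulation under stated assumptions; struct block_state, plan_overwrite() and choose_path_locked() are hypothetical names, no real locking is performed, and the writeback-mode condition of the real optimisable_bio() is left out.

```c
#include <stdbool.h>

/* Toy per-block state; "discarded" may be cleared by a parallel write. */
struct block_state {
	bool discarded;
};

enum copy_path {
	PATH_OVERWRITE,		/* fast path: skip the copy */
	PATH_FULL_COPY		/* slow path: copy the whole block */
};

/* Early decision, taken while only a shared lock would be held. */
bool plan_overwrite(const struct block_state *b, bool bio_covers_block)
{
	return b->discarded || bio_covers_block;
}

/*
 * Final decision, meant to run with the exclusive lock held: the early
 * plan is honoured only if its condition still holds.
 */
enum copy_path choose_path_locked(const struct block_state *b,
				  bool planned_overwrite,
				  bool bio_covers_block)
{
	if (planned_overwrite && (b->discarded || bio_covers_block))
		return PATH_OVERWRITE;
	return PATH_FULL_COPY;
}
```

If a block is planned for overwrite because it was discarded, and the discard state is cleared before the final decision, choose_path_locked() falls back to PATH_FULL_COPY rather than losing the intervening write.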