In ext4_move_extents(), moved_len is only updated when all moves are
successfully executed, and only discards orig_inode and donor_inode
preallocations when moved_len is not zero. When the loop fails to exit
after successfully moving some extents, moved_len is not updated and
remains at 0, so it does not discard the preallocations.
If the moved extents overlap with the preallocated extents, the
overlapped extents are freed twice in ext4_mb_release_inode_pa() and
ext4_process_freed_data() (as described in commit 94d7c16cbbbd), and
bb_free is incremented twice. Hence when trim is executed, a zero-division
bug is triggered in mb_update_avg_fragment_size() because bb_free is not
zero and bb_fragments is zero.
Therefore, update move_len after each extent move to avoid the issue.
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Reported-by: xingwei lee <xrivendell7(a)gmail.com>
Closes: https://lore.kernel.org/r/CAO4mrferzqBUnCag8R3m2zf897ts9UEuhjFQGPtODT92rYyR…
Fixes: fcf6b1b729bc ("ext4: refactor ext4_move_extents code base")
CC: stable(a)vger.kernel.org # 3.18
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
---
fs/ext4/move_extent.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 3aa57376d9c2..4b9b503c6346 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -672,7 +672,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
*/
ext4_double_up_write_data_sem(orig_inode, donor_inode);
/* Swap original branches with new branches */
- move_extent_per_page(o_filp, donor_inode,
+ *moved_len += move_extent_per_page(o_filp, donor_inode,
orig_page_index, donor_page_index,
offset_in_page, cur_len,
unwritten, &ret);
@@ -682,7 +682,6 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
o_start += cur_len;
d_start += cur_len;
}
- *moved_len = o_start - orig_blk;
if (*moved_len > len)
*moved_len = len;
--
2.31.1
Hi all,
5.10.194 includes 331d85f0bc6e (ovl: Always reevaluate the file
signature for IMA), which resulted in a performance regression for
workloads that use IMA and run from overlayfs. 5.10.202 includes
cd5a262a07a5 (ima: detect changes to the backing overlay file), which
resolved the regression in the upstream kernel. However, from my
testing [1], this change doesn't resolve the regression on stable
kernels.
From what I can tell, cd5a262a07a5 (ima: detect changes to the
backing overlay file) depends on both db1d1e8b9867 (IMA: use
vfs_getattr_nosec to get the i_version) and a1175d6b1bda (vfs: plumb
i_version handling into struct kstat). These two dependent changes
were not backported to stable kernels. As a result, IMA seems to be
caching the wrong i_version value when using overlayfs. From my
testing, backporting these two dependent changes is sufficient to
resolve the issue in stable kernels.
Would it make sense to backport those changes to stable kernels? It's
possible that they may not follow the stable kernel patching rules. I
think the issue can also be fixed directly in stable trees with the
following diff (which doesn't make sense in the upstream kernel):
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index 70efd4aa1bd1..c84ae6b62b3a 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -239,7 +239,7 @@ int ima_collect_measurement(struct integrity_iint_cache *iint,
* which do not support i_version, support is limited to an initial
* measurement/appraisal/audit.
*/
- i_version = inode_query_iversion(inode);
+ i_version = inode_query_iversion(real_inode);
hash.hdr.algo = algo;
/* Initialize hash digest to 0's in case of failure */
I've verified that this diff resolves the performance regression.
Which approach would make the most sense to fix the issue in stable
kernels? Backporting the dependent commits, or merging the above diff?
I'd be happy to prepare and mail patches either way. Looking forward
to your thoughts.
Thanks,
-Robert
[1] I'm benchmarking with the following:
cat <<EOF > /sys/kernel/security/ima/policy
audit func=BPRM_CHECK
audit func=FILE_MMAP mask=MAY_EXEC
EOF
docker run --rm debian bash -c "TIMEFORMAT=%3R; time bash -c :"
A good result looks like 0.002 (units are seconds), and a bad result
looks like 0.034.
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 73ea73affe8622bdf292de898da869d441da6a9d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023121132-these-deviation-5ab6@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
73ea73affe86 ("USB: gadget: core: adjust uevent timing on gadget unbind")
50966da807c8 ("usb: gadget: udc: core: Offload usb_udc_vbus_handler processing")
1016fc0c096c ("USB: gadget: Fix obscure lockdep violation for udc_mutex")
f9d76d15072c ("USB: gadget: Add ID numbers to gadget names")
fc274c1e9973 ("USB: gadget: Add a new bus for gadgets")
6ebb449f9f25 ("USB: gadget: Register udc before gadget")
af1969a2d734 ("USB: gadget: Rename usb_gadget_probe_driver()")
710f5d627a98 ("Merge tag 'usb-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 73ea73affe8622bdf292de898da869d441da6a9d Mon Sep 17 00:00:00 2001
From: Roy Luo <royluo(a)google.com>
Date: Tue, 28 Nov 2023 22:17:56 +0000
Subject: [PATCH] USB: gadget: core: adjust uevent timing on gadget unbind
The KOBJ_CHANGE uevent is sent before gadget unbind is actually
executed, resulting in inaccurate uevent emitted at incorrect timing
(the uevent would have USB_UDC_DRIVER variable set while it would
soon be removed).
Move the KOBJ_CHANGE uevent to the end of the unbind function so that
uevent is sent only after the change has been made.
Fixes: 2ccea03a8f7e ("usb: gadget: introduce UDC Class")
Cc: stable(a)vger.kernel.org
Signed-off-by: Roy Luo <royluo(a)google.com>
Link: https://lore.kernel.org/r/20231128221756.2591158-1-royluo@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index ded9531f141b..d59f94464b87 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -1646,8 +1646,6 @@ static void gadget_unbind_driver(struct device *dev)
dev_dbg(&udc->dev, "unbinding gadget driver [%s]\n", driver->function);
- kobject_uevent(&udc->dev.kobj, KOBJ_CHANGE);
-
udc->allow_connect = false;
cancel_work_sync(&udc->vbus_work);
mutex_lock(&udc->connect_lock);
@@ -1667,6 +1665,8 @@ static void gadget_unbind_driver(struct device *dev)
driver->is_bound = false;
udc->driver = NULL;
mutex_unlock(&udc_lock);
+
+ kobject_uevent(&udc->dev.kobj, KOBJ_CHANGE);
}
/* ------------------------------------------------------------------------- */
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 6376a824595607e99d032a39ba3394988b4fce96
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023121843-pension-tactile-868b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
6376a8245956 ("mm/damon/core: make damon_start() waits until kdamond_fn() starts")
4472edf63d66 ("mm/damon/core: use number of passed access sampling as a timer")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6376a824595607e99d032a39ba3394988b4fce96 Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Fri, 8 Dec 2023 17:50:18 +0000
Subject: [PATCH] mm/damon/core: make damon_start() waits until kdamond_fn()
starts
The cleanup tasks of kdamond threads including reset of corresponding
DAMON context's ->kdamond field and decrease of global nr_running_ctxs
counter is supposed to be executed by kdamond_fn(). However, commit
0f91d13366a4 ("mm/damon: simplify stop mechanism") made neither
damon_start() nor damon_stop() ensure the corresponding kdamond has
started the execution of kdamond_fn().
As a result, the cleanup can be skipped if damon_stop() is called fast
enough after the previous damon_start(). Especially the skipped reset
of ->kdamond could cause a use-after-free.
Fix it by waiting for start of kdamond_fn() execution from
damon_start().
Link: https://lkml.kernel.org/r/20231208175018.63880-1-sj@kernel.org
Fixes: 0f91d13366a4 ("mm/damon: simplify stop mechanism")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reported-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Changbin Du <changbin.du(a)intel.com>
Cc: Jakub Acs <acsjakub(a)amazon.de>
Cc: <stable(a)vger.kernel.org> # 5.15.x
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/damon.h b/include/linux/damon.h
index ab2f17d9926b..e00ddf1ed39c 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -559,6 +559,8 @@ struct damon_ctx {
* update
*/
unsigned long next_ops_update_sis;
+ /* for waiting until the execution of the kdamond_fn is started */
+ struct completion kdamond_started;
/* public: */
struct task_struct *kdamond;
diff --git a/mm/damon/core.c b/mm/damon/core.c
index ce1562783e7e..3a05e71509b9 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -445,6 +445,8 @@ struct damon_ctx *damon_new_ctx(void)
if (!ctx)
return NULL;
+ init_completion(&ctx->kdamond_started);
+
ctx->attrs.sample_interval = 5 * 1000;
ctx->attrs.aggr_interval = 100 * 1000;
ctx->attrs.ops_update_interval = 60 * 1000 * 1000;
@@ -668,11 +670,14 @@ static int __damon_start(struct damon_ctx *ctx)
mutex_lock(&ctx->kdamond_lock);
if (!ctx->kdamond) {
err = 0;
+ reinit_completion(&ctx->kdamond_started);
ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond.%d",
nr_running_ctxs);
if (IS_ERR(ctx->kdamond)) {
err = PTR_ERR(ctx->kdamond);
ctx->kdamond = NULL;
+ } else {
+ wait_for_completion(&ctx->kdamond_started);
}
}
mutex_unlock(&ctx->kdamond_lock);
@@ -1433,6 +1438,7 @@ static int kdamond_fn(void *data)
pr_debug("kdamond (%d) starts\n", current->pid);
+ complete(&ctx->kdamond_started);
kdamond_init_intervals_sis(ctx);
if (ctx->ops.init)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 6376a824595607e99d032a39ba3394988b4fce96
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023121849-ambulance-violate-e5b2@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
6376a8245956 ("mm/damon/core: make damon_start() waits until kdamond_fn() starts")
4472edf63d66 ("mm/damon/core: use number of passed access sampling as a timer")
2f5bef5a590b ("mm/damon/core: update monitoring results for new monitoring attributes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6376a824595607e99d032a39ba3394988b4fce96 Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Fri, 8 Dec 2023 17:50:18 +0000
Subject: [PATCH] mm/damon/core: make damon_start() waits until kdamond_fn()
starts
The cleanup tasks of kdamond threads including reset of corresponding
DAMON context's ->kdamond field and decrease of global nr_running_ctxs
counter is supposed to be executed by kdamond_fn(). However, commit
0f91d13366a4 ("mm/damon: simplify stop mechanism") made neither
damon_start() nor damon_stop() ensure the corresponding kdamond has
started the execution of kdamond_fn().
As a result, the cleanup can be skipped if damon_stop() is called fast
enough after the previous damon_start(). Especially the skipped reset
of ->kdamond could cause a use-after-free.
Fix it by waiting for start of kdamond_fn() execution from
damon_start().
Link: https://lkml.kernel.org/r/20231208175018.63880-1-sj@kernel.org
Fixes: 0f91d13366a4 ("mm/damon: simplify stop mechanism")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reported-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Changbin Du <changbin.du(a)intel.com>
Cc: Jakub Acs <acsjakub(a)amazon.de>
Cc: <stable(a)vger.kernel.org> # 5.15.x
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/damon.h b/include/linux/damon.h
index ab2f17d9926b..e00ddf1ed39c 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -559,6 +559,8 @@ struct damon_ctx {
* update
*/
unsigned long next_ops_update_sis;
+ /* for waiting until the execution of the kdamond_fn is started */
+ struct completion kdamond_started;
/* public: */
struct task_struct *kdamond;
diff --git a/mm/damon/core.c b/mm/damon/core.c
index ce1562783e7e..3a05e71509b9 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -445,6 +445,8 @@ struct damon_ctx *damon_new_ctx(void)
if (!ctx)
return NULL;
+ init_completion(&ctx->kdamond_started);
+
ctx->attrs.sample_interval = 5 * 1000;
ctx->attrs.aggr_interval = 100 * 1000;
ctx->attrs.ops_update_interval = 60 * 1000 * 1000;
@@ -668,11 +670,14 @@ static int __damon_start(struct damon_ctx *ctx)
mutex_lock(&ctx->kdamond_lock);
if (!ctx->kdamond) {
err = 0;
+ reinit_completion(&ctx->kdamond_started);
ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond.%d",
nr_running_ctxs);
if (IS_ERR(ctx->kdamond)) {
err = PTR_ERR(ctx->kdamond);
ctx->kdamond = NULL;
+ } else {
+ wait_for_completion(&ctx->kdamond_started);
}
}
mutex_unlock(&ctx->kdamond_lock);
@@ -1433,6 +1438,7 @@ static int kdamond_fn(void *data)
pr_debug("kdamond (%d) starts\n", current->pid);
+ complete(&ctx->kdamond_started);
kdamond_init_intervals_sis(ctx);
if (ctx->ops.init)