MADV_COLLAPSE acts on one hugepage-aligned/sized region at a time, until
it has collapsed all eligible memory contained within the bounds
supplied by the user.
At the top of each hugepage iteration we (re)lock mmap_lock and
(re)validate the VMA for eligibility and update variables that might
have changed while mmap_lock was dropped. One thing that might occur,
is that the VMA could be resized, and as such, we refetch vma->vm_end
to make sure we don't collapse past the end of the VMA's new end.
However, it's possible that when refetching vma>vm_end that we expand the
region acted on by MADV_COLLAPSE if vma->vm_end is greater than size+len
supplied by the user.
The consequence here is that we may attempt to collapse more memory than
requested, possibly yielding either "too much success" or "false
failure" user-visible results. An example of the former is if we
MADV_COLLAPSE the first 4MiB of a 2TiB mmap()'d file, the incorrect
refetch would cause the operation to block for much longer than
anticipated as we attempt to collapse the entire TiB region. An example
of the latter is that applying MADV_COLLPSE to a 4MiB file mapped to the
start of a 6MiB VMA will successfully collapse the first 4MiB, then
incorrectly attempt to collapse the last hugepage-aligned/sized region
-- fail (since readahead/page cache lookup will fail) -- and report a
failure to the user.
Don't expand the acted-on region when refetching vma->vm_end.
Fixes: 4d24de9425f7 ("mm: MADV_COLLAPSE: refetch vm_end after reacquiring mmap_lock")
Reported-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
v2->v3: Add 'Cc: stable(a)vger.kernel.org' as per stable-kernel-rules.
v1->v2: Updated changelog to make clear what user-visible issues this
patch addresses, as well makes the case for backporting (Andrew
Morton).
While there aren't any stability risks, without this patch there exist
trivial examples where MADV_COLLAPSE won't work; as such, this should be
backported to stable 6.1.X to make MADV_COLLAPSE dependable in such
cases.
v1: https://lore.kernel.org/linux-mm/CAAa6QmRx_b2UCJWE2XZ3=3c3-_N3R4cDGX6Wm4OT7…
---
mm/khugepaged.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5cb401aa2b9d..b4d2ec0a94ed 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2649,7 +2649,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
goto out_nolock;
}
- hend = vma->vm_end & HPAGE_PMD_MASK;
+ hend = min(hend, vma->vm_end & HPAGE_PMD_MASK);
}
mmap_assert_locked(mm);
memset(cc->node_load, 0, sizeof(cc->node_load));
--
2.39.0.314.g84b9a713c41-goog
MADV_COLLAPSE acts on one hugepage-aligned/sized region at a time, until
it has collapsed all eligible memory contained within the bounds
supplied by the user.
At the top of each hugepage iteration we (re)lock mmap_lock and
(re)validate the VMA for eligibility and update variables that might
have changed while mmap_lock was dropped. One thing that might occur,
is that the VMA could be resized, and as such, we refetch vma->vm_end
to make sure we don't collapse past the end of the VMA's new end.
However, it's possible that when refetching vma>vm_end that we expand the
region acted on by MADV_COLLAPSE if vma->vm_end is greater than size+len
supplied by the user.
The consequence here is that we may attempt to collapse more memory than
requested, possibly yielding either "too much success" or "false
failure" user-visible results. An example of the former is if we
MADV_COLLAPSE the first 4MiB of a 2TiB mmap()'d file, the incorrect
refetch would cause the operation to block for much longer than
anticipated as we attempt to collapse the entire TiB region. An example
of the latter is that applying MADV_COLLPSE to a 4MiB file mapped to the
start of a 6MiB VMA will successfully collapse the first 4MiB, then
incorrectly attempt to collapse the last hugepage-aligned/sized region
-- fail (since readahead/page cache lookup will fail) -- and report a
failure to the user.
Don't expand the acted-on region when refetching vma->vm_end.
Fixes: 4d24de9425f7 ("mm: MADV_COLLAPSE: refetch vm_end after reacquiring mmap_lock")
Reported-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
Cc: Yang Shi <shy828301(a)gmail.com>
---
v1->v2 : Updated changelog to make clear what user-visible issues this
patch addresses, as well makes the case for backporting (Andrew
Morton).
While there aren't any stability risks, without this patch there exist
trivial examples where MADV_COLLAPSE won't work; as such, this should be
backported to stable 6.1.X to make MADV_COLLAPSE dependable in such
cases.
v1: https://lore.kernel.org/linux-mm/CAAa6QmRx_b2UCJWE2XZ3=3c3-_N3R4cDGX6Wm4OT7…
---
mm/khugepaged.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5cb401aa2b9d..b4d2ec0a94ed 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2649,7 +2649,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
goto out_nolock;
}
- hend = vma->vm_end & HPAGE_PMD_MASK;
+ hend = min(hend, vma->vm_end & HPAGE_PMD_MASK);
}
mmap_assert_locked(mm);
memset(cc->node_load, 0, sizeof(cc->node_load));
--
2.39.0.314.g84b9a713c41-goog
From: Yang Yingliang <yangyingliang(a)huawei.com>
[ Upstream commit 1662cea4623f75d8251adf07370bbaa958f0355d ]
Inject fault while loading module, kset_register() may fail.
If it fails, the kset.kobj.name allocated by kobject_set_name()
which must be called before a call to kset_register() may be
leaked, since refcount of kobj was set in kset_init().
To mitigate this, we free the name in kset_register() when an
error is encountered, i.e. when kset_register() returns an error.
A kset may be embedded in a larger structure which may be dynamically
allocated in callers, it needs to be freed in ktype.release() or error
path in callers, in this case, we can not call kset_put() in kset_register(),
or it will cause double free, so just call kfree_const() to free the
name and set it to NULL to avoid accessing bad pointer in callers.
With this fix, the callers don't need care about freeing the name
and may call kset_put() if kset_register() fails.
Suggested-by: Luben Tuikov <luben.tuikov(a)amd.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: <luben.tuikov(a)amd.com>
Link: https://lore.kernel.org/r/20221025071549.1280528-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
lib/kobject.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/kobject.c b/lib/kobject.c
index bbbb067de8ec..be01d49abb62 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -806,6 +806,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
/**
* kset_register - initialize and add a kset.
* @k: kset.
+ *
+ * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
+ * is freed, it can not be used any more.
*/
int kset_register(struct kset *k)
{
@@ -816,8 +819,12 @@ int kset_register(struct kset *k)
kset_init(k);
err = kobject_add_internal(&k->kobj);
- if (err)
+ if (err) {
+ kfree_const(k->kobj.name);
+ /* Set it to NULL to avoid accessing bad pointer in callers. */
+ k->kobj.name = NULL;
return err;
+ }
kobject_uevent(&k->kobj, KOBJ_ADD);
return 0;
}
--
2.35.1
From: Yang Yingliang <yangyingliang(a)huawei.com>
[ Upstream commit 1662cea4623f75d8251adf07370bbaa958f0355d ]
Inject fault while loading module, kset_register() may fail.
If it fails, the kset.kobj.name allocated by kobject_set_name()
which must be called before a call to kset_register() may be
leaked, since refcount of kobj was set in kset_init().
To mitigate this, we free the name in kset_register() when an
error is encountered, i.e. when kset_register() returns an error.
A kset may be embedded in a larger structure which may be dynamically
allocated in callers, it needs to be freed in ktype.release() or error
path in callers, in this case, we can not call kset_put() in kset_register(),
or it will cause double free, so just call kfree_const() to free the
name and set it to NULL to avoid accessing bad pointer in callers.
With this fix, the callers don't need care about freeing the name
and may call kset_put() if kset_register() fails.
Suggested-by: Luben Tuikov <luben.tuikov(a)amd.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: <luben.tuikov(a)amd.com>
Link: https://lore.kernel.org/r/20221025071549.1280528-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
lib/kobject.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/kobject.c b/lib/kobject.c
index bbbb067de8ec..be01d49abb62 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -806,6 +806,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
/**
* kset_register - initialize and add a kset.
* @k: kset.
+ *
+ * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
+ * is freed, it can not be used any more.
*/
int kset_register(struct kset *k)
{
@@ -816,8 +819,12 @@ int kset_register(struct kset *k)
kset_init(k);
err = kobject_add_internal(&k->kobj);
- if (err)
+ if (err) {
+ kfree_const(k->kobj.name);
+ /* Set it to NULL to avoid accessing bad pointer in callers. */
+ k->kobj.name = NULL;
return err;
+ }
kobject_uevent(&k->kobj, KOBJ_ADD);
return 0;
}
--
2.35.1
From: Yang Yingliang <yangyingliang(a)huawei.com>
[ Upstream commit 1662cea4623f75d8251adf07370bbaa958f0355d ]
Inject fault while loading module, kset_register() may fail.
If it fails, the kset.kobj.name allocated by kobject_set_name()
which must be called before a call to kset_register() may be
leaked, since refcount of kobj was set in kset_init().
To mitigate this, we free the name in kset_register() when an
error is encountered, i.e. when kset_register() returns an error.
A kset may be embedded in a larger structure which may be dynamically
allocated in callers, it needs to be freed in ktype.release() or error
path in callers, in this case, we can not call kset_put() in kset_register(),
or it will cause double free, so just call kfree_const() to free the
name and set it to NULL to avoid accessing bad pointer in callers.
With this fix, the callers don't need care about freeing the name
and may call kset_put() if kset_register() fails.
Suggested-by: Luben Tuikov <luben.tuikov(a)amd.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: <luben.tuikov(a)amd.com>
Link: https://lore.kernel.org/r/20221025071549.1280528-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
lib/kobject.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/kobject.c b/lib/kobject.c
index 97d86dc17c42..1eb1230a2d28 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -821,6 +821,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
/**
* kset_register - initialize and add a kset.
* @k: kset.
+ *
+ * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
+ * is freed, it can not be used any more.
*/
int kset_register(struct kset *k)
{
@@ -831,8 +834,12 @@ int kset_register(struct kset *k)
kset_init(k);
err = kobject_add_internal(&k->kobj);
- if (err)
+ if (err) {
+ kfree_const(k->kobj.name);
+ /* Set it to NULL to avoid accessing bad pointer in callers. */
+ k->kobj.name = NULL;
return err;
+ }
kobject_uevent(&k->kobj, KOBJ_ADD);
return 0;
}
--
2.35.1
From: Yang Yingliang <yangyingliang(a)huawei.com>
[ Upstream commit 1662cea4623f75d8251adf07370bbaa958f0355d ]
Inject fault while loading module, kset_register() may fail.
If it fails, the kset.kobj.name allocated by kobject_set_name()
which must be called before a call to kset_register() may be
leaked, since refcount of kobj was set in kset_init().
To mitigate this, we free the name in kset_register() when an
error is encountered, i.e. when kset_register() returns an error.
A kset may be embedded in a larger structure which may be dynamically
allocated in callers, it needs to be freed in ktype.release() or error
path in callers, in this case, we can not call kset_put() in kset_register(),
or it will cause double free, so just call kfree_const() to free the
name and set it to NULL to avoid accessing bad pointer in callers.
With this fix, the callers don't need care about freeing the name
and may call kset_put() if kset_register() fails.
Suggested-by: Luben Tuikov <luben.tuikov(a)amd.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: <luben.tuikov(a)amd.com>
Link: https://lore.kernel.org/r/20221025071549.1280528-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
lib/kobject.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/kobject.c b/lib/kobject.c
index 0c6d17503a11..3ce572d7c26d 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -869,6 +869,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
/**
* kset_register() - Initialize and add a kset.
* @k: kset.
+ *
+ * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
+ * is freed, it can not be used any more.
*/
int kset_register(struct kset *k)
{
@@ -879,8 +882,12 @@ int kset_register(struct kset *k)
kset_init(k);
err = kobject_add_internal(&k->kobj);
- if (err)
+ if (err) {
+ kfree_const(k->kobj.name);
+ /* Set it to NULL to avoid accessing bad pointer in callers. */
+ k->kobj.name = NULL;
return err;
+ }
kobject_uevent(&k->kobj, KOBJ_ADD);
return 0;
}
--
2.35.1
From: Yang Yingliang <yangyingliang(a)huawei.com>
[ Upstream commit 1662cea4623f75d8251adf07370bbaa958f0355d ]
Inject fault while loading module, kset_register() may fail.
If it fails, the kset.kobj.name allocated by kobject_set_name()
which must be called before a call to kset_register() may be
leaked, since refcount of kobj was set in kset_init().
To mitigate this, we free the name in kset_register() when an
error is encountered, i.e. when kset_register() returns an error.
A kset may be embedded in a larger structure which may be dynamically
allocated in callers, it needs to be freed in ktype.release() or error
path in callers, in this case, we can not call kset_put() in kset_register(),
or it will cause double free, so just call kfree_const() to free the
name and set it to NULL to avoid accessing bad pointer in callers.
With this fix, the callers don't need care about freeing the name
and may call kset_put() if kset_register() fails.
Suggested-by: Luben Tuikov <luben.tuikov(a)amd.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: <luben.tuikov(a)amd.com>
Link: https://lore.kernel.org/r/20221025071549.1280528-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
lib/kobject.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/kobject.c b/lib/kobject.c
index ea53b30cf483..743e629d60d2 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -866,6 +866,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
/**
* kset_register() - Initialize and add a kset.
* @k: kset.
+ *
+ * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
+ * is freed, it can not be used any more.
*/
int kset_register(struct kset *k)
{
@@ -876,8 +879,12 @@ int kset_register(struct kset *k)
kset_init(k);
err = kobject_add_internal(&k->kobj);
- if (err)
+ if (err) {
+ kfree_const(k->kobj.name);
+ /* Set it to NULL to avoid accessing bad pointer in callers. */
+ k->kobj.name = NULL;
return err;
+ }
kobject_uevent(&k->kobj, KOBJ_ADD);
return 0;
}
--
2.35.1
From: Yang Yingliang <yangyingliang(a)huawei.com>
[ Upstream commit 1662cea4623f75d8251adf07370bbaa958f0355d ]
Inject fault while loading module, kset_register() may fail.
If it fails, the kset.kobj.name allocated by kobject_set_name()
which must be called before a call to kset_register() may be
leaked, since refcount of kobj was set in kset_init().
To mitigate this, we free the name in kset_register() when an
error is encountered, i.e. when kset_register() returns an error.
A kset may be embedded in a larger structure which may be dynamically
allocated in callers, it needs to be freed in ktype.release() or error
path in callers, in this case, we can not call kset_put() in kset_register(),
or it will cause double free, so just call kfree_const() to free the
name and set it to NULL to avoid accessing bad pointer in callers.
With this fix, the callers don't need care about freeing the name
and may call kset_put() if kset_register() fails.
Suggested-by: Luben Tuikov <luben.tuikov(a)amd.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: <luben.tuikov(a)amd.com>
Link: https://lore.kernel.org/r/20221025071549.1280528-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
lib/kobject.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/kobject.c b/lib/kobject.c
index ea53b30cf483..743e629d60d2 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -866,6 +866,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
/**
* kset_register() - Initialize and add a kset.
* @k: kset.
+ *
+ * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
+ * is freed, it can not be used any more.
*/
int kset_register(struct kset *k)
{
@@ -876,8 +879,12 @@ int kset_register(struct kset *k)
kset_init(k);
err = kobject_add_internal(&k->kobj);
- if (err)
+ if (err) {
+ kfree_const(k->kobj.name);
+ /* Set it to NULL to avoid accessing bad pointer in callers. */
+ k->kobj.name = NULL;
return err;
+ }
kobject_uevent(&k->kobj, KOBJ_ADD);
return 0;
}
--
2.35.1
From: Yang Yingliang <yangyingliang(a)huawei.com>
[ Upstream commit 1662cea4623f75d8251adf07370bbaa958f0355d ]
Inject fault while loading module, kset_register() may fail.
If it fails, the kset.kobj.name allocated by kobject_set_name()
which must be called before a call to kset_register() may be
leaked, since refcount of kobj was set in kset_init().
To mitigate this, we free the name in kset_register() when an
error is encountered, i.e. when kset_register() returns an error.
A kset may be embedded in a larger structure which may be dynamically
allocated in callers, it needs to be freed in ktype.release() or error
path in callers, in this case, we can not call kset_put() in kset_register(),
or it will cause double free, so just call kfree_const() to free the
name and set it to NULL to avoid accessing bad pointer in callers.
With this fix, the callers don't need care about freeing the name
and may call kset_put() if kset_register() fails.
Suggested-by: Luben Tuikov <luben.tuikov(a)amd.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: <luben.tuikov(a)amd.com>
Link: https://lore.kernel.org/r/20221025071549.1280528-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
lib/kobject.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/kobject.c b/lib/kobject.c
index 5f0e71ab292c..0f9cc0b93d99 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -834,6 +834,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
/**
* kset_register() - Initialize and add a kset.
* @k: kset.
+ *
+ * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
+ * is freed, it can not be used any more.
*/
int kset_register(struct kset *k)
{
@@ -844,8 +847,12 @@ int kset_register(struct kset *k)
kset_init(k);
err = kobject_add_internal(&k->kobj);
- if (err)
+ if (err) {
+ kfree_const(k->kobj.name);
+ /* Set it to NULL to avoid accessing bad pointer in callers. */
+ k->kobj.name = NULL;
return err;
+ }
kobject_uevent(&k->kobj, KOBJ_ADD);
return 0;
}
--
2.35.1