Since commits
7b9eb53e8591 ("media: cx18: Access v4l2_fh from file")
9ba9d11544f9 ("media: ivtv: Access v4l2_fh from file")
All the ioctl handlers access their private data structures
from file *
The ivtv and cx18 drivers call the ioctl handlers from their
DVB layer without a valid file *, causing invalid memory access.
The issue has been reported by smatch in
"[bug report] media: cx18: Access v4l2_fh from file"
Fix this by providing wrappers for the ioctl handlers to be
used by the DVB layer that do not require a valid file *.
Signed-off-by: Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
---
Changes in v3:
- Change helpers to accept the type they're going to operate on instead
of using the open_id wrapper type as suggested by Laurent
- Link to v2: https://lore.kernel.org/r/20250818-cx18-v4l2-fh-v2-0-3f53ce423663@ideasonbo…
Changes in v2:
- Add Cc: stable(a)vger.kernel.org per-patch
---
Jacopo Mondi (2):
media: cx18: Fix invalid access to file *
media: ivtv: Fix invalid access to file *
drivers/media/pci/cx18/cx18-driver.c | 9 +++------
drivers/media/pci/cx18/cx18-ioctl.c | 30 +++++++++++++++++++-----------
drivers/media/pci/cx18/cx18-ioctl.h | 8 +++++---
drivers/media/pci/ivtv/ivtv-driver.c | 11 ++++-------
drivers/media/pci/ivtv/ivtv-ioctl.c | 22 +++++++++++++++++-----
drivers/media/pci/ivtv/ivtv-ioctl.h | 6 ++++--
6 files changed, 52 insertions(+), 34 deletions(-)
---
base-commit: a75b8d198c55e9eb5feb6f6e155496305caba2dc
change-id: 20250818-cx18-v4l2-fh-7eaa6199fdde
Best regards,
--
Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
Hi,
在 2025/08/17 22:18, Sasha Levin 写道:
> This is a note to let you know that I've just added the patch titled
>
> md: call del_gendisk in control path
>
> to the 6.6-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> md-call-del_gendisk-in-control-path.patch
> and it can be found in the queue-6.6 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
This patch should be be backported to any stable kernel, this change
will break user tools mdadm:
https://lore.kernel.org/all/f654db67-a5a5-114b-09b8-00db303daab7@redhat.com/
Thanks,
Kuai
>
> commit fa738623105e2dd4865274dc8525856feaec3ae9
> Author: Xiao Ni <xni(a)redhat.com>
> Date: Wed Jun 11 15:31:06 2025 +0800
>
> md: call del_gendisk in control path
>
> [ Upstream commit 9e59d609763f70a992a8f3808dabcce60f14eb5c ]
>
> Now del_gendisk and put_disk are called asynchronously in workqueue work.
> The asynchronous way has a problem that the device node can still exist
> after mdadm --stop command returns in a short window. So udev rule can
> open this device node and create the struct mddev in kernel again. So put
> del_gendisk in control path and still leave put_disk in md_kobj_release
> to avoid uaf of gendisk.
>
> Function del_gendisk can't be called with reconfig_mutex. If it's called
> with reconfig mutex, a deadlock can happen. del_gendisk waits all sysfs
> files access to finish and sysfs file access waits reconfig mutex. So
> put del_gendisk after releasing reconfig mutex.
>
> But there is still a window that sysfs can be accessed between mddev_unlock
> and del_gendisk. So some actions (add disk, change level, .e.g) can happen
> which lead unexpected results. MD_DELETED is used to resolve this problem.
> MD_DELETED is set before releasing reconfig mutex and it should be checked
> for these sysfs access which need reconfig mutex. For sysfs access which
> don't need reconfig mutex, del_gendisk will wait them to finish.
>
> But it doesn't need to do this in function mddev_lock_nointr. There are
> ten places that call it.
> * Five of them are in dm raid which we don't need to care. MD_DELETED is
> only used for md raid.
> * stop_sync_thread, md_do_sync and md_start_sync are related sync request,
> and it needs to wait sync thread to finish before stopping an array.
> * md_ioctl: md_open is called before md_ioctl, so ->openers is added. It
> will fail to stop the array. So it doesn't need to check MD_DELETED here
> * md_set_readonly:
> It needs to call mddev_set_closing_and_sync_blockdev when setting readonly
> or read_auto. So it will fail to stop the array too because MD_CLOSING is
> already set.
>
> Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
> Signed-off-by: Xiao Ni <xni(a)redhat.com>
> Link: https://lore.kernel.org/linux-raid/20250611073108.25463-2-xni@redhat.com
> Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index b086cbf24086..8e3939c0d2ed 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -639,9 +639,6 @@ static void __mddev_put(struct mddev *mddev)
> mddev->ctime || mddev->hold_active)
> return;
>
> - /* Array is not configured at all, and not held active, so destroy it */
> - set_bit(MD_DELETED, &mddev->flags);
> -
> /*
> * Call queue_work inside the spinlock so that flush_workqueue() after
> * mddev_find will succeed in waiting for the work to be done.
> @@ -837,6 +834,16 @@ void mddev_unlock(struct mddev *mddev)
> kobject_del(&rdev->kobj);
> export_rdev(rdev, mddev);
> }
> +
> + /* Call del_gendisk after release reconfig_mutex to avoid
> + * deadlock (e.g. call del_gendisk under the lock and an
> + * access to sysfs files waits the lock)
> + * And MD_DELETED is only used for md raid which is set in
> + * do_md_stop. dm raid only uses md_stop to stop. So dm raid
> + * doesn't need to check MD_DELETED when getting reconfig lock
> + */
> + if (test_bit(MD_DELETED, &mddev->flags))
> + del_gendisk(mddev->gendisk);
> }
> EXPORT_SYMBOL_GPL(mddev_unlock);
>
> @@ -5616,19 +5623,30 @@ md_attr_store(struct kobject *kobj, struct attribute *attr,
> struct md_sysfs_entry *entry = container_of(attr, struct md_sysfs_entry, attr);
> struct mddev *mddev = container_of(kobj, struct mddev, kobj);
> ssize_t rv;
> + struct kernfs_node *kn = NULL;
>
> if (!entry->store)
> return -EIO;
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
> +
> + if (entry->store == array_state_store && cmd_match(page, "clear"))
> + kn = sysfs_break_active_protection(kobj, attr);
> +
> spin_lock(&all_mddevs_lock);
> if (!mddev_get(mddev)) {
> spin_unlock(&all_mddevs_lock);
> + if (kn)
> + sysfs_unbreak_active_protection(kn);
> return -EBUSY;
> }
> spin_unlock(&all_mddevs_lock);
> rv = entry->store(mddev, page, length);
> mddev_put(mddev);
> +
> + if (kn)
> + sysfs_unbreak_active_protection(kn);
> +
> return rv;
> }
>
> @@ -5636,12 +5654,6 @@ static void md_kobj_release(struct kobject *ko)
> {
> struct mddev *mddev = container_of(ko, struct mddev, kobj);
>
> - if (mddev->sysfs_state)
> - sysfs_put(mddev->sysfs_state);
> - if (mddev->sysfs_level)
> - sysfs_put(mddev->sysfs_level);
> -
> - del_gendisk(mddev->gendisk);
> put_disk(mddev->gendisk);
> }
>
> @@ -6531,8 +6543,9 @@ static int do_md_stop(struct mddev *mddev, int mode,
> mddev->bitmap_info.offset = 0;
>
> export_array(mddev);
> -
> md_clean(mddev);
> + set_bit(MD_DELETED, &mddev->flags);
> +
> if (mddev->hold_active == UNTIL_STOP)
> mddev->hold_active = 0;
> }
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 46995558d3bd..0a7c9122db50 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -589,11 +589,26 @@ static inline bool is_md_suspended(struct mddev *mddev)
>
> static inline int __must_check mddev_lock(struct mddev *mddev)
> {
> - return mutex_lock_interruptible(&mddev->reconfig_mutex);
> + int ret;
> +
> + ret = mutex_lock_interruptible(&mddev->reconfig_mutex);
> +
> + /* MD_DELETED is set in do_md_stop with reconfig_mutex.
> + * So check it here.
> + */
> + if (!ret && test_bit(MD_DELETED, &mddev->flags)) {
> + ret = -ENODEV;
> + mutex_unlock(&mddev->reconfig_mutex);
> + }
> +
> + return ret;
> }
>
> /* Sometimes we need to take the lock in a situation where
> * failure due to interrupts is not acceptable.
> + * It doesn't need to check MD_DELETED here, the owner which
> + * holds the lock here can't be stopped. And all paths can't
> + * call this function after do_md_stop.
> */
> static inline void mddev_lock_nointr(struct mddev *mddev)
> {
> @@ -602,7 +617,14 @@ static inline void mddev_lock_nointr(struct mddev *mddev)
>
> static inline int mddev_trylock(struct mddev *mddev)
> {
> - return mutex_trylock(&mddev->reconfig_mutex);
> + int ret;
> +
> + ret = mutex_trylock(&mddev->reconfig_mutex);
> + if (!ret && test_bit(MD_DELETED, &mddev->flags)) {
> + ret = -ENODEV;
> + mutex_unlock(&mddev->reconfig_mutex);
> + }
> + return ret;
> }
> extern void mddev_unlock(struct mddev *mddev);
>
> .
>
Hi,
The first four patches in this series are miscellaneous fixes and
improvements in the Cadence and TI CSI-RX drivers around probing, fwnode
and link creation.
The last two patches add support for transmitting multiple pixels per
clock on the internal bus between Cadence CSI-RX bridge and TI CSI-RX
wrapper. As this internal bus is 32-bit wide, the maximum number of
pixels that can be transmitted per cycle depend upon the format's bit
width. Secondly, the downstream element must support unpacking of
multiple pixels.
Thus we export a module function that can be used by the downstream
driver to negotiate the pixels per cycle on the output pixel stream of
the Cadence bridge.
Signed-off-by: Jai Luthra <jai.luthra(a)ideasonboard.com>
---
Changes in v4:
- Rebase on top of v6.17-rc1
- Add missing include for linux/export.h in cdns-csi2rx.c
- Link to v3: https://lore.kernel.org/r/20250626-probe_fixes-v3-0-83e735ae466e@ideasonboa…
Changes in v3:
- Move cdns-csi2rx header to include/media
- Export symbol from cdns-csi2rx.c to be used only through
the j721e-csi2rx.c module namespace
- Other minor fixes suggested by Sakari
- Add Abhilash's T-by tags
- Link to v2: https://lore.kernel.org/r/20250410-probe_fixes-v2-0-801bc6eebdea@ideasonboa…
Changes in v2:
- Rebase on v6.15-rc1
- Fix lkp warnings in PATCH 5/6 missing header for FIELD_PREP
- Add R-By tags from Devarsh and Changhuang
- Link to v1: https://lore.kernel.org/r/20250324-probe_fixes-v1-0-5cd5b9e1cfac@ideasonboa…
---
Jai Luthra (6):
media: ti: j721e-csi2rx: Use devm_of_platform_populate
media: ti: j721e-csi2rx: Use fwnode_get_named_child_node
media: ti: j721e-csi2rx: Fix source subdev link creation
media: cadence: csi2rx: Implement get_fwnode_pad op
media: cadence: cdns-csi2rx: Support multiple pixels per clock cycle
media: ti: j721e-csi2rx: Support multiple pixels per clock
MAINTAINERS | 1 +
drivers/media/platform/cadence/cdns-csi2rx.c | 75 ++++++++++++++++------
drivers/media/platform/ti/Kconfig | 3 +-
.../media/platform/ti/j721e-csi2rx/j721e-csi2rx.c | 65 ++++++++++++++-----
include/media/cadence/cdns-csi2rx.h | 19 ++++++
5 files changed, 128 insertions(+), 35 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250314-probe_fixes-7e0ec33c7fee
Best regards,
--
Jai Luthra <jai.luthra(a)ideasonboard.com>
The patch titled
Subject: proc: fix missing pde_set_flags() for net proc files
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
proc-fix-missing-pde_set_flags-for-net-proc-files.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: wangzijie <wangzijie1(a)honor.com>
Subject: proc: fix missing pde_set_flags() for net proc files
Date: Mon, 18 Aug 2025 20:31:02 +0800
To avoid potential UAF issues during module removal races, we use
pde_set_flags() to save proc_ops flags in PDE itself before
proc_register(), and then use pde_has_proc_*() helpers instead of directly
dereferencing pde->proc_ops->*.
However, the pde_set_flags() call was missing when creating net related
proc files. This omission caused incorrect behavior which FMODE_LSEEK was
being cleared inappropriately in proc_reg_open() for net proc files. Lars
reported it in this link[1].
Fix this by ensuring pde_set_flags() is called when register proc entry,
and add NULL check for proc_ops in pde_set_flags().
[1]: https://lore.kernel.org/all/20250815195616.64497967@chagall.paradoxon.rec/
Link: https://lkml.kernel.org/r/20250818123102.959595-1-wangzijie1@honor.com
Fixes: ff7ec8dc1b64 ("proc: use the same treatment to check proc_lseek as ones for proc_read_iter et.al)
Signed-off-by: wangzijie <wangzijie1(a)honor.com>
Reported-by: Lars Wendler <polynomial-c(a)gmx.de>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe(a)intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Kirill A. Shutemov <k.shutemov(a)gmail.com>
Cc: wangzijie <wangzijie1(a)honor.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/generic.c | 36 +++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 17 deletions(-)
--- a/fs/proc/generic.c~proc-fix-missing-pde_set_flags-for-net-proc-files
+++ a/fs/proc/generic.c
@@ -367,6 +367,23 @@ static const struct inode_operations pro
.setattr = proc_notify_change,
};
+static void pde_set_flags(struct proc_dir_entry *pde)
+{
+ if (!pde->proc_ops)
+ return;
+
+ if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT)
+ pde->flags |= PROC_ENTRY_PERMANENT;
+ if (pde->proc_ops->proc_read_iter)
+ pde->flags |= PROC_ENTRY_proc_read_iter;
+#ifdef CONFIG_COMPAT
+ if (pde->proc_ops->proc_compat_ioctl)
+ pde->flags |= PROC_ENTRY_proc_compat_ioctl;
+#endif
+ if (pde->proc_ops->proc_lseek)
+ pde->flags |= PROC_ENTRY_proc_lseek;
+}
+
/* returns the registered entry, or frees dp and returns NULL on failure */
struct proc_dir_entry *proc_register(struct proc_dir_entry *dir,
struct proc_dir_entry *dp)
@@ -374,6 +391,8 @@ struct proc_dir_entry *proc_register(str
if (proc_alloc_inum(&dp->low_ino))
goto out_free_entry;
+ pde_set_flags(dp);
+
write_lock(&proc_subdir_lock);
dp->parent = dir;
if (pde_subdir_insert(dir, dp) == false) {
@@ -561,20 +580,6 @@ struct proc_dir_entry *proc_create_reg(c
return p;
}
-static void pde_set_flags(struct proc_dir_entry *pde)
-{
- if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT)
- pde->flags |= PROC_ENTRY_PERMANENT;
- if (pde->proc_ops->proc_read_iter)
- pde->flags |= PROC_ENTRY_proc_read_iter;
-#ifdef CONFIG_COMPAT
- if (pde->proc_ops->proc_compat_ioctl)
- pde->flags |= PROC_ENTRY_proc_compat_ioctl;
-#endif
- if (pde->proc_ops->proc_lseek)
- pde->flags |= PROC_ENTRY_proc_lseek;
-}
-
struct proc_dir_entry *proc_create_data(const char *name, umode_t mode,
struct proc_dir_entry *parent,
const struct proc_ops *proc_ops, void *data)
@@ -585,7 +590,6 @@ struct proc_dir_entry *proc_create_data(
if (!p)
return NULL;
p->proc_ops = proc_ops;
- pde_set_flags(p);
return proc_register(parent, p);
}
EXPORT_SYMBOL(proc_create_data);
@@ -636,7 +640,6 @@ struct proc_dir_entry *proc_create_seq_p
p->proc_ops = &proc_seq_ops;
p->seq_ops = ops;
p->state_size = state_size;
- pde_set_flags(p);
return proc_register(parent, p);
}
EXPORT_SYMBOL(proc_create_seq_private);
@@ -667,7 +670,6 @@ struct proc_dir_entry *proc_create_singl
return NULL;
p->proc_ops = &proc_single_ops;
p->single_show = show;
- pde_set_flags(p);
return proc_register(parent, p);
}
EXPORT_SYMBOL(proc_create_single_data);
_
Patches currently in -mm which might be from wangzijie1(a)honor.com are
proc-fix-missing-pde_set_flags-for-net-proc-files.patch
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 005b0a0c24e1628313e951516b675109a92cacfe
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081848-kilobyte-skirmish-7ccd@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 005b0a0c24e1628313e951516b675109a92cacfe Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Fri, 18 Jul 2025 13:07:29 +0100
Subject: [PATCH] btrfs: send: use fallocate for hole punching with send stream
v2
Currently holes are sent as writes full of zeroes, which results in
unnecessarily using disk space at the receiving end and increasing the
stream size.
In some cases we avoid sending writes of zeroes, like during a full
send operation where we just skip writes for holes.
But for some cases we fill previous holes with writes of zeroes too, like
in this scenario:
1) We have a file with a hole in the range [2M, 3M), we snapshot the
subvolume and do a full send. The range [2M, 3M) stays as a hole at
the receiver since we skip sending write commands full of zeroes;
2) We punch a hole for the range [3M, 4M) in our file, so that now it
has a 2M hole in the range [2M, 4M), and snapshot the subvolume.
Now if we do an incremental send, we will send write commands full
of zeroes for the range [2M, 4M), removing the hole for [2M, 3M) at
the receiver.
We could improve cases such as this last one by doing additional
comparisons of file extent items (or their absence) between the parent
and send snapshots, but that's a lot of code to add plus additional CPU
and IO costs.
Since the send stream v2 already has a fallocate command and btrfs-progs
implements a callback to execute fallocate since the send stream v2
support was added to it, update the kernel to use fallocate for punching
holes for V2+ streams.
Test coverage is provided by btrfs/284 which is a version of btrfs/007
that exercises send stream v2 instead of v1, using fsstress with random
operations and fssum to verify file contents.
Link: https://github.com/kdave/btrfs-progs/issues/1001
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Boris Burkov <boris(a)bur.io>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 09822e766e41..7664025a5af4 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -4,6 +4,7 @@
*/
#include <linux/bsearch.h>
+#include <linux/falloc.h>
#include <linux/fs.h>
#include <linux/file.h>
#include <linux/sort.h>
@@ -5405,6 +5406,30 @@ static int send_update_extent(struct send_ctx *sctx,
return ret;
}
+static int send_fallocate(struct send_ctx *sctx, u32 mode, u64 offset, u64 len)
+{
+ struct fs_path *path;
+ int ret;
+
+ path = get_cur_inode_path(sctx);
+ if (IS_ERR(path))
+ return PTR_ERR(path);
+
+ ret = begin_cmd(sctx, BTRFS_SEND_C_FALLOCATE);
+ if (ret < 0)
+ return ret;
+
+ TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, path);
+ TLV_PUT_U32(sctx, BTRFS_SEND_A_FALLOCATE_MODE, mode);
+ TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset);
+ TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len);
+
+ ret = send_cmd(sctx);
+
+tlv_put_failure:
+ return ret;
+}
+
static int send_hole(struct send_ctx *sctx, u64 end)
{
struct fs_path *p = NULL;
@@ -5412,6 +5437,14 @@ static int send_hole(struct send_ctx *sctx, u64 end)
u64 offset = sctx->cur_inode_last_extent;
int ret = 0;
+ /*
+ * Starting with send stream v2 we have fallocate and can use it to
+ * punch holes instead of sending writes full of zeroes.
+ */
+ if (proto_cmd_ok(sctx, BTRFS_SEND_C_FALLOCATE))
+ return send_fallocate(sctx, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+ offset, end - offset);
+
/*
* A hole that starts at EOF or beyond it. Since we do not yet support
* fallocate (for extent preallocation and hole punching), sending a
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x d2845519b0723c5d5a0266cbf410495f9b8fd65c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081859-matcher-handprint-c398@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d2845519b0723c5d5a0266cbf410495f9b8fd65c Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch(a)lst.de>
Date: Wed, 23 Jul 2025 14:19:44 +0200
Subject: [PATCH] xfs: fully decouple XFS_IBULK* flags from XFS_IWALK* flags
Fix up xfs_inumbers to now pass in the XFS_IBULK* flags into the flags
argument to xfs_inobt_walk, which expects the XFS_IWALK* flags.
Currently passing the wrong flags works for non-debug builds because
the only XFS_IWALK* flag has the same encoding as the corresponding
XFS_IBULK* flag, but in debug builds it can trigger an assert that no
incorrect flag is passed. Instead just extra the relevant flag.
Fixes: 5b35d922c52798 ("xfs: Decouple XFS_IBULK flags from XFS_IWALK flags")
Cc: <stable(a)vger.kernel.org> # v5.19
Reported-by: cen zhang <zzzccc427(a)gmail.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index c8c9b8d8309f..5116842420b2 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -447,17 +447,21 @@ xfs_inumbers(
.breq = breq,
};
struct xfs_trans *tp;
+ unsigned int iwalk_flags = 0;
int error = 0;
if (xfs_bulkstat_already_done(breq->mp, breq->startino))
return 0;
+ if (breq->flags & XFS_IBULK_SAME_AG)
+ iwalk_flags |= XFS_IWALK_SAME_AG;
+
/*
* Grab an empty transaction so that we can use its recursive buffer
* locking abilities to detect cycles in the inobt without deadlocking.
*/
tp = xfs_trans_alloc_empty(breq->mp);
- error = xfs_inobt_walk(breq->mp, tp, breq->startino, breq->flags,
+ error = xfs_inobt_walk(breq->mp, tp, breq->startino, iwalk_flags,
xfs_inumbers_walk, breq->icount, &ic);
xfs_trans_cancel(tp);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1ef94169db0958d6de39f9ea6e063ce887342e2d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081854-eject-aloft-03ff@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ef94169db0958d6de39f9ea6e063ce887342e2d Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Wed, 2 Jul 2025 15:08:13 +0930
Subject: [PATCH] btrfs: populate otime when logging an inode item
[TEST FAILURE WITH EXPERIMENTAL FEATURES]
When running test case generic/508, the test case will fail with the new
btrfs shutdown support:
generic/508 - output mismatch (see /home/adam/xfstests/results//generic/508.out.bad)
--- tests/generic/508.out 2022-05-11 11:25:30.806666664 +0930
+++ /home/adam/xfstests/results//generic/508.out.bad 2025-07-02 14:53:22.401824212 +0930
@@ -1,2 +1,6 @@
QA output created by 508
Silence is golden
+Before:
+After : stat.btime = Thu Jan 1 09:30:00 1970
+Before:
+After : stat.btime = Wed Jul 2 14:53:22 2025
...
(Run 'diff -u /home/adam/xfstests/tests/generic/508.out /home/adam/xfstests/results//generic/508.out.bad' to see the entire diff)
Ran: generic/508
Failures: generic/508
Failed 1 of 1 tests
Please note that the test case requires shutdown support, thus the test
case will be skipped using the current upstream kernel, as it doesn't
have shutdown ioctl support.
[CAUSE]
The direct cause the 0 time stamp in the log tree:
leaf 30507008 items 2 free space 16057 generation 9 owner TREE_LOG
leaf 30507008 flags 0x1(WRITTEN) backref revision 1
checksum stored e522548d
checksum calced e522548d
fs uuid 57d45451-481e-43e4-aa93-289ad707a3a0
chunk uuid d52bd3fd-5163-4337-98a7-7986993ad398
item 0 key (257 INODE_ITEM 0) itemoff 16123 itemsize 160
generation 9 transid 9 size 0 nbytes 0
block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0
sequence 1 flags 0x0(none)
atime 1751432947.492000000 (2025-07-02 14:39:07)
ctime 1751432947.492000000 (2025-07-02 14:39:07)
mtime 1751432947.492000000 (2025-07-02 14:39:07)
otime 0.0 (1970-01-01 09:30:00) <<<
But the old fs tree has all the correct time stamp:
btrfs-progs v6.12
fs tree key (FS_TREE ROOT_ITEM 0)
leaf 30425088 items 2 free space 16061 generation 5 owner FS_TREE
leaf 30425088 flags 0x1(WRITTEN) backref revision 1
checksum stored 48f6c57e
checksum calced 48f6c57e
fs uuid 57d45451-481e-43e4-aa93-289ad707a3a0
chunk uuid d52bd3fd-5163-4337-98a7-7986993ad398
item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
generation 3 transid 0 size 0 nbytes 16384
block group 0 mode 40755 links 1 uid 0 gid 0 rdev 0
sequence 0 flags 0x0(none)
atime 1751432947.0 (2025-07-02 14:39:07)
ctime 1751432947.0 (2025-07-02 14:39:07)
mtime 1751432947.0 (2025-07-02 14:39:07)
otime 1751432947.0 (2025-07-02 14:39:07) <<<
The root cause is that fill_inode_item() in tree-log.c is only
populating a/c/m time, not the otime (or btime in statx output).
Part of the reason is that, the vfs inode only has a/c/m time, no native
btime support yet.
[FIX]
Thankfully btrfs has its otime stored in btrfs_inode::i_otime_sec and
btrfs_inode::i_otime_nsec.
So what we really need is just fill the otime time stamp in
fill_inode_item() of tree-log.c
There is another fill_inode_item() in inode.c, which is doing the proper
otime population.
Fixes: 94edf4ae43a5 ("Btrfs: don't bother committing delayed inode updates when fsyncing")
CC: stable(a)vger.kernel.org
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 1e805dabfc4b..ab0815d9e7e5 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4233,6 +4233,9 @@ static void fill_inode_item(struct btrfs_trans_handle *trans,
btrfs_set_timespec_sec(leaf, &item->ctime, inode_get_ctime_sec(inode));
btrfs_set_timespec_nsec(leaf, &item->ctime, inode_get_ctime_nsec(inode));
+ btrfs_set_timespec_sec(leaf, &item->otime, BTRFS_I(inode)->i_otime_sec);
+ btrfs_set_timespec_nsec(leaf, &item->otime, BTRFS_I(inode)->i_otime_nsec);
+
/*
* We do not need to set the nbytes field, in fact during a fast fsync
* its value may not even be correct, since a fast fsync does not wait
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x d2845519b0723c5d5a0266cbf410495f9b8fd65c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081859-buckskin-outwit-e0f0@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d2845519b0723c5d5a0266cbf410495f9b8fd65c Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch(a)lst.de>
Date: Wed, 23 Jul 2025 14:19:44 +0200
Subject: [PATCH] xfs: fully decouple XFS_IBULK* flags from XFS_IWALK* flags
Fix up xfs_inumbers to now pass in the XFS_IBULK* flags into the flags
argument to xfs_inobt_walk, which expects the XFS_IWALK* flags.
Currently passing the wrong flags works for non-debug builds because
the only XFS_IWALK* flag has the same encoding as the corresponding
XFS_IBULK* flag, but in debug builds it can trigger an assert that no
incorrect flag is passed. Instead just extra the relevant flag.
Fixes: 5b35d922c52798 ("xfs: Decouple XFS_IBULK flags from XFS_IWALK flags")
Cc: <stable(a)vger.kernel.org> # v5.19
Reported-by: cen zhang <zzzccc427(a)gmail.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index c8c9b8d8309f..5116842420b2 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -447,17 +447,21 @@ xfs_inumbers(
.breq = breq,
};
struct xfs_trans *tp;
+ unsigned int iwalk_flags = 0;
int error = 0;
if (xfs_bulkstat_already_done(breq->mp, breq->startino))
return 0;
+ if (breq->flags & XFS_IBULK_SAME_AG)
+ iwalk_flags |= XFS_IWALK_SAME_AG;
+
/*
* Grab an empty transaction so that we can use its recursive buffer
* locking abilities to detect cycles in the inobt without deadlocking.
*/
tp = xfs_trans_alloc_empty(breq->mp);
- error = xfs_inobt_walk(breq->mp, tp, breq->startino, breq->flags,
+ error = xfs_inobt_walk(breq->mp, tp, breq->startino, iwalk_flags,
xfs_inumbers_walk, breq->icount, &ic);
xfs_trans_cancel(tp);
The patch titled
Subject: mm/mremap: fix WARN with uffd that has remap events disabled
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-mremap-fix-warn-with-uffd-that-has-remap-events-disabled.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/mremap: fix WARN with uffd that has remap events disabled
Date: Mon, 18 Aug 2025 19:53:58 +0200
Registering userfaultd on a VMA that spans at least one PMD and then
mremap()'ing that VMA can trigger a WARN when recovering from a failed
page table move due to a page table allocation error.
The code ends up doing the right thing (recurse, avoiding moving actual
page tables), but triggering that WARN is unpleasant:
WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_normal_pmd mm/mremap.c:357 [inline]
WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_pgt_entry mm/mremap.c:595 [inline]
WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_page_tables+0x3832/0x44a0 mm/mremap.c:852
Modules linked in:
CPU: 2 UID: 0 PID: 6133 Comm: syz.0.19 Not tainted 6.17.0-rc1-syzkaller-00004-g53e760d89498 #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:move_normal_pmd mm/mremap.c:357 [inline]
RIP: 0010:move_pgt_entry mm/mremap.c:595 [inline]
RIP: 0010:move_page_tables+0x3832/0x44a0 mm/mremap.c:852
Code: ...
RSP: 0018:ffffc900037a76d8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000032930007 RCX: ffffffff820c6645
RDX: ffff88802e56a440 RSI: ffffffff820c7201 RDI: 0000000000000007
RBP: ffff888037728fc0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000032930007 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc900037a79a8 R14: 0000000000000001 R15: dffffc0000000000
FS: 000055556316a500(0000) GS:ffff8880d68bc000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30863fff CR3: 0000000050171000 CR4: 0000000000352ef0
Call Trace:
<TASK>
copy_vma_and_data+0x468/0x790 mm/mremap.c:1215
move_vma+0x548/0x1780 mm/mremap.c:1282
mremap_to+0x1b7/0x450 mm/mremap.c:1406
do_mremap+0xfad/0x1f80 mm/mremap.c:1921
__do_sys_mremap+0x119/0x170 mm/mremap.c:1977
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f00d0b8ebe9
Code: ...
RSP: 002b:00007ffe5ea5ee98 EFLAGS: 00000246 ORIG_RAX: 0000000000000019
RAX: ffffffffffffffda RBX: 00007f00d0db5fa0 RCX: 00007f00d0b8ebe9
RDX: 0000000000400000 RSI: 0000000000c00000 RDI: 0000200000000000
RBP: 00007ffe5ea5eef0 R08: 0000200000c00000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000002
R13: 00007f00d0db5fa0 R14: 00007f00d0db5fa0 R15: 0000000000000005
</TASK>
The underlying issue is that we recurse during the original page table
move, but not during the recovery move.
Fix it by checking for both VMAs and performing the check before the
pmd_none() sanity check.
Add a new helper where we perform+document that check for the PMD and PUD
level.
Thanks to Harry for bisecting.
Link: https://lkml.kernel.org/r/20250818175358.1184757-1-david@redhat.com
Fixes: 0cef0bb836e ("mm: clear uffd-wp PTE/PMD state on mremap()")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: syzbot+4d9a13f0797c46a29e42(a)syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/689bb893.050a0220.7f033.013a.GAE@google.com
Cc: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Jann Horn <jannh(a)google.com>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Cc: Harry Yoo <harry.yoo(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mremap.c | 41 +++++++++++++++++++++++------------------
1 file changed, 23 insertions(+), 18 deletions(-)
--- a/mm/mremap.c~mm-mremap-fix-warn-with-uffd-that-has-remap-events-disabled
+++ a/mm/mremap.c
@@ -323,6 +323,25 @@ static inline bool arch_supports_page_ta
}
#endif
+static inline bool uffd_supports_page_table_move(struct pagetable_move_control *pmc)
+{
+ /*
+ * If we are moving a VMA that has uffd-wp registered but with
+ * remap events disabled (new VMA will not be registered with uffd), we
+ * need to ensure that the uffd-wp state is cleared from all pgtables.
+ * This means recursing into lower page tables in move_page_tables().
+ *
+ * We might get called with VMAs reversed when recovering from a
+ * failed page table move. In that case, the
+ * "old"-but-actually-"originally new" VMA during recovery will not have
+ * a uffd context. Recursing into lower page tables during the original
+ * move but not during the recovery move will cause trouble, because we
+ * run into already-existing page tables. So check both VMAs.
+ */
+ return !vma_has_uffd_without_event_remap(pmc->old) &&
+ !vma_has_uffd_without_event_remap(pmc->new);
+}
+
#ifdef CONFIG_HAVE_MOVE_PMD
static bool move_normal_pmd(struct pagetable_move_control *pmc,
pmd_t *old_pmd, pmd_t *new_pmd)
@@ -335,6 +354,8 @@ static bool move_normal_pmd(struct paget
if (!arch_supports_page_table_move())
return false;
+ if (!uffd_supports_page_table_move(pmc))
+ return false;
/*
* The destination pmd shouldn't be established, free_pgtables()
* should have released it.
@@ -361,15 +382,6 @@ static bool move_normal_pmd(struct paget
if (WARN_ON_ONCE(!pmd_none(*new_pmd)))
return false;
- /* If this pmd belongs to a uffd vma with remap events disabled, we need
- * to ensure that the uffd-wp state is cleared from all pgtables. This
- * means recursing into lower page tables in move_page_tables(), and we
- * can reuse the existing code if we simply treat the entry as "not
- * moved".
- */
- if (vma_has_uffd_without_event_remap(vma))
- return false;
-
/*
* We don't have to worry about the ordering of src and dst
* ptlocks because exclusive mmap_lock prevents deadlock.
@@ -418,6 +430,8 @@ static bool move_normal_pud(struct paget
if (!arch_supports_page_table_move())
return false;
+ if (!uffd_supports_page_table_move(pmc))
+ return false;
/*
* The destination pud shouldn't be established, free_pgtables()
* should have released it.
@@ -425,15 +439,6 @@ static bool move_normal_pud(struct paget
if (WARN_ON_ONCE(!pud_none(*new_pud)))
return false;
- /* If this pud belongs to a uffd vma with remap events disabled, we need
- * to ensure that the uffd-wp state is cleared from all pgtables. This
- * means recursing into lower page tables in move_page_tables(), and we
- * can reuse the existing code if we simply treat the entry as "not
- * moved".
- */
- if (vma_has_uffd_without_event_remap(vma))
- return false;
-
/*
* We don't have to worry about the ordering of src and dst
* ptlocks because exclusive mmap_lock prevents deadlock.
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-mremap-fix-warn-with-uffd-that-has-remap-events-disabled.patch
mm-migrate-remove-migratepage_unmap.patch
treewide-remove-migratepage_success.patch
mm-huge_memory-move-more-common-code-into-insert_pmd.patch
mm-huge_memory-move-more-common-code-into-insert_pud.patch
mm-huge_memory-support-huge-zero-folio-in-vmf_insert_folio_pmd.patch
fs-dax-use-vmf_insert_folio_pmd-to-insert-the-huge-zero-folio.patch
mm-huge_memory-mark-pmd-mappings-of-the-huge-zero-folio-special.patch
powerpc-ptdump-rename-struct-pgtable_level-to-struct-ptdump_pglevel.patch
mm-rmap-convert-enum-rmap_level-to-enum-pgtable_level.patch
mm-memory-convert-print_bad_pte-to-print_bad_page_map.patch
mm-memory-factor-out-common-code-from-vm_normal_page_.patch
mm-introduce-and-use-vm_normal_page_pud.patch
mm-rename-vm_ops-find_special_page-to-vm_ops-find_normal_page.patch
prctl-extend-pr_set_thp_disable-to-optionally-exclude-vm_hugepage.patch
mm-huge_memory-convert-tva_flags-to-enum-tva_type.patch
mm-huge_memory-respect-madv_collapse-with-pr_thp_disable_except_advised.patch