driver_find_device() calls get_device() to increment the reference
count once a matching device is found, but there is no put_device() to
balance the reference count. To avoid reference count leakage, add
put_device() to decrease the reference count.
Found by code review.
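
For illustration only, the imbalance can be modelled in userspace roughly
as below. This is a sketch, not kernel code: struct model_device and all
model_* helpers are hypothetical stand-ins, and the only property carried
over from the description above is that a successful lookup returns with
a reference held.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct device; not the kernel type. */
struct model_device {
	const char *name;
	int refcount;
};

static struct model_device *model_get_device(struct model_device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

static void model_put_device(struct model_device *dev)
{
	if (dev)
		dev->refcount--;
}

/* Like driver_find_device(): a successful lookup returns with a reference held. */
static struct model_device *model_find_device(struct model_device *devs,
					      size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(devs[i].name, name) == 0)
			return model_get_device(&devs[i]);
	return NULL;
}

/*
 * Looks up the companion and "couples" to it. Returns the companion's
 * refcount afterwards, or -1 if no companion was found.
 */
static int model_couple(struct model_device *devs, size_t n,
			const char *name, int balanced)
{
	struct model_device *companion = model_find_device(devs, n, name);

	if (!companion)
		return -1;

	/* ... the companion would be used here ... */

	if (balanced)
		model_put_device(companion);	/* drop the lookup reference */

	return companion->refcount;
}
```

With balanced set to 0 the count stays one above its starting value after
every call, which is the leak described above; with balanced set to 1 it
returns to where it started.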
Cc: stable(a)vger.kernel.org
Fixes: a31500fe7055 ("drm/tegra: dc: Restore coupling of display controllers")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/gpu/drm/tegra/dc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 59d5c1ba145a..6c84bd69b11f 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -3148,6 +3148,7 @@ static int tegra_dc_couple(struct tegra_dc *dc)
dc->client.parent = &parent->client;
dev_dbg(dc->dev, "coupled to %s\n", dev_name(companion));
+ put_device(companion);
}
return 0;
--
2.17.1
The previous commit 0718a78f6a9f ("ALSA: usb-audio: Kill timer properly at
removal") patched a UAF issue caused by the error timer.
However, because the error timer kill added in that patch occurs after the
endpoint delete, a race leading to a UAF can still occur, albeit rarely.
Therefore, to prevent this, the error timer must be killed before freeing
the heap memory.
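
The ordering argument can be sketched with a small single-threaded
userspace model. Everything below is hypothetical (the model_* names are
not kernel APIs, and the "timer expiry" is simulated by a direct call
placed inside the race window), so treat it as an illustration of the
ordering only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_endpoint {
	int urbs_submitted;
};

struct model_midi {
	struct model_endpoint *ep;	/* memory the timer callback touches */
	bool timer_armed;		/* stands in for umidi->error_timer */
};

/*
 * Simulated timer callback. Returns false when it would have touched an
 * endpoint that is already gone (the use-after-free case).
 */
static bool model_timer_fire(struct model_midi *m)
{
	if (!m->timer_armed)
		return true;	/* shut down: the callback never runs */
	if (!m->ep)
		return false;	/* endpoint already deleted */
	m->ep->urbs_submitted++;
	return true;
}

static void model_timer_shutdown(struct model_midi *m)
{
	m->timer_armed = false;
}

static void model_endpoint_delete(struct model_midi *m)
{
	m->ep = NULL;		/* models freeing the endpoint */
}

/*
 * Tears down in the chosen order and lets the timer "expire" at the worst
 * possible moment. Returns true if the callback never saw a dead endpoint.
 */
static bool model_free(bool shutdown_first)
{
	struct model_endpoint ep = { 0 };
	struct model_midi m = { .ep = &ep, .timer_armed = true };
	bool safe;

	if (shutdown_first) {
		model_timer_shutdown(&m);
		model_endpoint_delete(&m);
		safe = model_timer_fire(&m);	/* late expiry is a no-op */
	} else {
		model_endpoint_delete(&m);
		safe = model_timer_fire(&m);	/* expiry inside the window */
		model_timer_shutdown(&m);
	}
	return safe;
}
```

model_free(true) corresponds to the order this patch establishes;
model_free(false) corresponds to the previous order.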
Cc: <stable(a)vger.kernel.org>
Fixes: 0718a78f6a9f ("ALSA: usb-audio: Kill timer properly at removal")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
sound/usb/midi.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sound/usb/midi.c b/sound/usb/midi.c
index acb3bf92857c..8d15f1caa92b 100644
--- a/sound/usb/midi.c
+++ b/sound/usb/midi.c
@@ -1522,6 +1522,8 @@ static void snd_usbmidi_free(struct snd_usb_midi *umidi)
{
int i;
+ timer_shutdown_sync(&umidi->error_timer);
+
for (i = 0; i < MIDI_MAX_ENDPOINTS; ++i) {
struct snd_usb_midi_endpoint *ep = &umidi->endpoints[i];
if (ep->out)
@@ -1530,7 +1532,6 @@ static void snd_usbmidi_free(struct snd_usb_midi *umidi)
snd_usbmidi_in_endpoint_delete(ep->in);
}
mutex_destroy(&umidi->mutex);
- timer_shutdown_sync(&umidi->error_timer);
kfree(umidi);
}
--
driver_find_device() calls get_device() to increment the reference
count once a matching device is found. device_release_driver()
releases the driver, but it does not decrease the reference count that
was incremented by driver_find_device(). At the end of the loop, there
is no put_device() to balance the reference count. To avoid reference
count leakage, add put_device() to decrease the reference count.
Found by code review.
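
As a rough userspace sketch of the loop (not kernel code): the model_*
names are hypothetical, and the model assumes, as a simplification, that
the lookup only returns devices still bound to the driver, so releasing
the driver is what makes the loop advance.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device bound to the driver. */
struct model_dev {
	bool bound;
	int refcount;
};

/* Returns the first still-bound device with an extra reference held. */
static struct model_dev *model_find_bound(struct model_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].bound) {
			devs[i].refcount++;
			return &devs[i];
		}
	}
	return NULL;
}

/* Unbinds the driver; deliberately leaves the refcount untouched. */
static void model_release_driver(struct model_dev *dev)
{
	dev->bound = false;
}

static void model_put(struct model_dev *dev)
{
	dev->refcount--;
}

/* Unbinds every device and returns the sum of all refcounts afterwards. */
static int model_unregister(struct model_dev *devs, size_t n, bool with_put)
{
	struct model_dev *dev;
	int total = 0;

	while ((dev = model_find_bound(devs, n))) {
		model_release_driver(dev);
		if (with_put)
			model_put(dev);	/* balance the lookup reference */
	}

	for (size_t i = 0; i < n; i++)
		total += devs[i].refcount;
	return total;
}
```

Starting from three bound devices with one reference each, the total
ends at 6 without the put and at 3 with it, i.e. one leaked reference
per loop iteration.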
Cc: stable(a)vger.kernel.org
Fixes: bfc653aa89cb ("perf: arm_cspmu: Separate Arm and vendor module")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/perf/arm_cspmu/arm_cspmu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index efa9b229e701..e0d4293f06f9 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -1365,8 +1365,10 @@ void arm_cspmu_impl_unregister(const struct arm_cspmu_impl_match *impl_match)
/* Unbind the driver from all matching backend devices. */
while ((dev = driver_find_device(&arm_cspmu_driver.driver, NULL,
- match, arm_cspmu_match_device)))
+ match, arm_cspmu_match_device))) {
device_release_driver(dev);
+ put_device(dev);
+ }
mutex_lock(&arm_cspmu_lock);
--
2.17.1
Here are some patches for the MPTCP PM, including some refactoring that
I thought it would be best to send at the end of a cycle to avoid
conflicts between net and net-next that could last a few weeks.
The most interesting changes are in the first and last patch, the rest
are patches refactoring the code & tests to validate the modifications.
- Patches 1 & 2: When servers set the C-flag in their MP_CAPABLE to tell
clients not to create subflows to the initial address and port -- e.g.
a deployment behind a L4 load balancer like a typical CDN deployment
-- clients will not use their other endpoints when default settings
are used. That's because the in-kernel path-manager uses the 'subflow'
endpoints to create subflows only to the initial address and port. The
first patch fixes that (for >=v5.14), and the second one validates it.
- Patches 3-14: various patches refactoring the code around the
in-kernel PM (mainly): split too long functions, rename variables and
functions to avoid confusions, reduce structure size, and compare IDs
instead of IP addresses. Note that one patch modifies one internal
variable used in one BPF selftest.
- Patch 15: ability to control endpoints that are used in reaction to a
new address announced by the other peer. With that, endpoints can be
used only once.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Notes:
- Patches 1 & 2 are sent to net-next on purpose: to delay the backports
a bit, just in case, and, since we are at the end of a cycle, to avoid
delaying the other refactoring patches.
- Sorry, I wanted to send this series earlier on, but due to some
unrelated issues (and holiday), it got delayed. Most patches are
pure refactoring ones.
---
Matthieu Baerts (NGI0) (15):
mptcp: pm: in-kernel: usable client side with C-flag
selftests: mptcp: join: validate C-flag + def limit
mptcp: pm: in-kernel: refactor fill_local_addresses_vec
mptcp: pm: in-kernel: refactor fill_remote_addresses_vec
mptcp: pm: rename 'subflows' to 'extra_subflows'
mptcp: pm: in-kernel: rename 'subflows_max' to 'limit_extra_subflows'
mptcp: pm: in-kernel: rename 'add_addr_signal_max' to 'endp_signal_max'
mptcp: pm: in-kernel: rename 'add_addr_accept_max' to 'limit_add_addr_accepted'
mptcp: pm: in-kernel: rename 'local_addr_max' to 'endp_subflow_max'
mptcp: pm: in-kernel: rename 'local_addr_list' to 'endp_list'
mptcp: pm: in-kernel: rename 'addrs' to 'endpoints'
mptcp: pm: in-kernel: remove stale_loss_cnt
mptcp: pm: in-kernel: reduce pernet struct size
mptcp: pm: in-kernel: compare IDs instead of addresses
mptcp: pm: in-kernel: add laminar endpoints
include/uapi/linux/mptcp.h | 11 +-
net/mptcp/pm.c | 32 +-
net/mptcp/pm_kernel.c | 569 ++++++++++++++--------
net/mptcp/pm_userspace.c | 2 +-
net/mptcp/protocol.h | 21 +-
net/mptcp/sockopt.c | 22 +-
tools/testing/selftests/bpf/progs/mptcp_subflow.c | 2 +-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 11 +
8 files changed, 441 insertions(+), 229 deletions(-)
---
base-commit: a1f1f2422e098485b09e55a492de05cf97f9954d
change-id: 20250925-net-next-mptcp-c-flag-laminar-f8442e4d4bd9
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
The patch titled
Subject: Squashfs: reject negative file sizes in squashfs_read_inode()
has been added to the -mm mm-nonmm-unstable branch. Its filename is
squashfs-reject-negative-file-sizes-in-squashfs_read_inode.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: reject negative file sizes in squashfs_read_inode()
Date: Fri, 26 Sep 2025 22:59:35 +0100
Syzkaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file
system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
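
To illustrate how an on-disk size can end up negative, here is a small
userspace sketch. model_check_file_size() is a hypothetical helper, not
part of Squashfs, and it differs from the patch in one respect: it tests
the unsigned value before converting it, rather than assigning to the
signed field first and then checking for "< 0", so that the example does
not rely on implementation-defined conversion behaviour.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of Squashfs. Takes an on-disk size that
 * has already been converted to CPU byte order and stores it in a signed
 * 64-bit value, as a file size field of that type would hold it.
 */
static int model_check_file_size(uint64_t on_disk_size, int64_t *i_size)
{
	/*
	 * Any value with the top bit set does not fit in a signed 64-bit
	 * type and would show up as a negative size, so reject it.
	 */
	if (on_disk_size > (uint64_t)INT64_MAX)
		return -EINVAL;

	*i_size = (int64_t)on_disk_size;
	return 0;
}
```

A corrupted or maliciously crafted image only needs the top bit of a
64-bit size field set to hit the rejected case.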
Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk
Fixes: 6545b246a2c8 ("Squashfs: inode operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+f754e01116421e9754b9(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/inode.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
--- a/fs/squashfs/inode.c~squashfs-reject-negative-file-sizes-in-squashfs_read_inode
+++ a/fs/squashfs/inode.c
@@ -145,6 +145,10 @@ int squashfs_read_inode(struct inode *in
goto failed_read;
inode->i_size = le32_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
@@ -197,6 +201,10 @@ int squashfs_read_inode(struct inode *in
goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
frag = le32_to_cpu(sqsh_ino->fragment);
if (frag != SQUASHFS_INVALID_FRAG) {
/*
@@ -249,8 +257,12 @@ int squashfs_read_inode(struct inode *in
if (err < 0)
goto failed_read;
- set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
inode->i_size = le16_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
+ set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
inode->i_op = &squashfs_dir_inode_ops;
inode->i_fop = &squashfs_dir_ops;
inode->i_mode |= S_IFDIR;
@@ -273,9 +285,13 @@ int squashfs_read_inode(struct inode *in
if (err < 0)
goto failed_read;
+ inode->i_size = le32_to_cpu(sqsh_ino->file_size);
+ if (inode->i_size < 0) {
+ err = -EINVAL;
+ goto failed_read;
+ }
xattr_id = le32_to_cpu(sqsh_ino->xattr);
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
- inode->i_size = le32_to_cpu(sqsh_ino->file_size);
inode->i_op = &squashfs_dir_inode_ops;
inode->i_fop = &squashfs_dir_ops;
inode->i_mode |= S_IFDIR;
@@ -302,7 +318,7 @@ int squashfs_read_inode(struct inode *in
goto failed_read;
inode->i_size = le32_to_cpu(sqsh_ino->symlink_size);
- if (inode->i_size > PAGE_SIZE) {
+ if (inode->i_size < 0 || inode->i_size > PAGE_SIZE) {
ERROR("Corrupted symlink\n");
return -EINVAL;
}
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
squashfs-fix-uninit-value-in-squashfs_get_parent.patch
squashfs-add-additional-inode-sanity-checking.patch
squashfs-add-seek_data-seek_hole-support.patch
squashfs-reject-negative-file-sizes-in-squashfs_read_inode.patch
This patch series enables a future version of tune2fs to be able to
modify certain parts of the ext4 superblock without writing to the
block device.
The first patch fixes a potential buffer overrun caused by a
maliciously modified superblock. The second patch adds support for
32-bit uids and gids which can have access to the reserved blocks pool.
The last patch adds the ioctl's which will be used by tune2fs.
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
---
Changes in v2:
- fix bugs that were detected using sparse
- remove the (unsafe) ability to clear certain compat features
- add the ability to set the encoding and encoding flags for case folding
- Link to v1: https://lore.kernel.org/r/20250908-tune2fs-v1-0-e3a6929f3355@mit.edu
---
Theodore Ts'o (3):
ext4: avoid potential buffer over-read in parse_apply_sb_mount_options()
ext4: add support for 32-bit default reserved uid and gid values
ext4: implemet new ioctls to set and get superblock parameters
fs/ext4/ext4.h | 16 +++-
fs/ext4/ioctl.c | 312 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
fs/ext4/super.c | 25 +++----
include/uapi/linux/ext4.h | 53 +++++++++++++
4 files changed, 382 insertions(+), 24 deletions(-)
---
base-commit: b320789d6883cc00ac78ce83bccbfe7ed58afcf0
change-id: 20250830-tune2fs-3376beb72403
Best regards,
--
Theodore Ts'o <tytso(a)mit.edu>
Make sure to drop the reference taken to the ocmem platform device when
looking up its driver data.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Also note that commit 0ff027027e05 ("soc: qcom: ocmem: Fix missing
put_device() call in of_get_ocmem") fixed the leak in a lookup error
path, but the reference is still leaking on success.
Fixes: 88c1e9404f1d ("soc: qcom: add OCMEM driver")
Cc: stable(a)vger.kernel.org # 5.5: 0ff027027e05
Cc: Brian Masney <bmasney(a)redhat.com>
Cc: Miaoqian Lin <linmq006(a)gmail.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/soc/qcom/ocmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/qcom/ocmem.c b/drivers/soc/qcom/ocmem.c
index 9c3bd37b6579..71130a2f62e9 100644
--- a/drivers/soc/qcom/ocmem.c
+++ b/drivers/soc/qcom/ocmem.c
@@ -202,9 +202,9 @@ struct ocmem *of_get_ocmem(struct device *dev)
}
ocmem = platform_get_drvdata(pdev);
+ put_device(&pdev->dev);
if (!ocmem) {
dev_err(dev, "Cannot get ocmem\n");
- put_device(&pdev->dev);
return ERR_PTR(-ENODEV);
}
return ocmem;
--
2.49.1
Hello,
This is Chenglong from Google Container Optimized OS. I'm reporting a
severe CPU hang regression that occurs after a high volume of file
creation and subsequent cgroup cleanup.
Through bisection, the issue appears to be caused by a chain reaction
between three commits related to writeback, unbound workqueues, and
CPU-hogging detection. The issue is greatly alleviated on the latest
mainline kernel but is not fully resolved, still occurring
intermittently (~1 in 10 runs).
How to reproduce
The kernel v6.1 is good. The hang is reliably triggered (over 80%
chance) on kernels v6.6 and v6.12 and intermittently on
mainline (6.17-rc7) with the following steps:
Environment: A machine with a fast SSD and a high core count (e.g.,
Google Cloud's N2-standard-128).
Workload: Concurrently generate a large number of files (e.g., 2
million) using multiple services managed by systemd-run. This creates
significant I/O and cgroup churn.
Trigger: After the file generation completes, terminate the
systemd-run services.
Result: Shortly after the services are killed, the system's CPU load
spikes, leading to a massive number of kworker/+inode_switch_wbs
threads and a system-wide hang/livelock where the machine becomes
unresponsive (20s - 300s).
Analysis and Problematic Commits
1. The initial commit: The process begins with a worker that can get
stuck busy-waiting on a spinlock.
Commit: ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Effect: This introduced the inode_switch_wbs_work_fn worker to clean
up cgroup writeback structures. Under our test load, this worker
appears to hit a highly contended wb->list_lock spinlock, causing it
to burn 100% CPU without sleeping.
2. The Kworker Explosion: A subsequent change misinterprets the
spinning worker from Stage 1, leading to a runaway feedback loop of
worker creation.
Commit: 616db8779b1e ("workqueue: Automatically mark CPU-hogging work
items CPU_INTENSIVE")
Effect: This logic sees the spinning worker, marks it as
CPU_INTENSIVE, and excludes it from concurrency management. To handle
the work backlog, it spawns a new kworker, which then also gets stuck
on the same lock, repeating the cycle. This directly causes the
kworker count to explode from <50 to 100-2000+.
3. The System-Wide Lockdown: The final piece allows this localized
worker explosion to saturate the entire system.
Commit: 8639ecebc9b1 ("workqueue: Implement non-strict affinity scope
for unbound workqueues")
Effect: This change introduced non-strict affinity as the default. It
allows the hundreds of kworkers created in Stage 2 to be spread by the
scheduler across all available CPU cores, turning the problem into a
system-wide hang.
Current Status and Mitigation
Mainline Status: On the latest mainline kernel, the hang is far less
frequent and the kworker counts are reduced back to normal (<50),
suggesting other changes have partially mitigated the issue. However,
the hang still occurs, and when it does, the kworker count still
explodes (e.g., 300+), indicating the underlying feedback loop
remains.
Workaround: A reliable mitigation is to revert to the old workqueue
behavior by setting affinity_strict to 1. This contains the kworker
proliferation to a single CPU pod, preventing the system-wide hang.
Questions
Given that the issue is not fully resolved, could you please provide
some guidance?
1. Is this a known issue, and are there patches in development that
might fully address the underlying spinlock contention or the kworker
feedback loop?
2. Is there a better long-term mitigation we can apply other than
forcing strict affinity?
Thank you for your time and help.
Best regards,
Chenglong
On Wed, Sep 24, 2025 at 05:24:15PM -0700, Chenglong Tang wrote:
> The kernel v6.1 is good. The hang is reliably triggered(over 80% chance) on
> kernels v6.6 and 6.12 and intermittently on mainline(6.17-rc7) with the
> following steps:
> - *Environment:* A machine with a fast SSD and a high core count (e.g.,
>   Google Cloud's N2-standard-128).
> - *Workload:* Concurrently generate a large number of files (e.g., 2 million)
>   using multiple services managed by systemd-run. This creates significant
>   I/O and cgroup churn.
> - *Trigger:* After the file generation completes, terminate the systemd-run
>   services.
> - *Result:* Shortly after the services are killed, the system's CPU load
>   spikes, leading to a massive number of kworker/+inode_switch_wbs threads
>   and a system-wide hang/livelock where the machine becomes unresponsive
>   (20s - 300s).
Sounds like:
http://lkml.kernel.org/r/20250912103522.2935-1-jack@suse.cz
Can you see whether those patches resolve the problem?
Thanks.
--
tejun
Make sure to drop the reference taken to the pbs platform device when
looking up its driver data.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Fixes: 5b2dd77be1d8 ("soc: qcom: add QCOM PBS driver")
Cc: stable(a)vger.kernel.org # 6.9
Cc: Anjelique Melendez <quic_amelende(a)quicinc.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/soc/qcom/qcom-pbs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/soc/qcom/qcom-pbs.c b/drivers/soc/qcom/qcom-pbs.c
index 1cc5d045f9dd..06b4a596e275 100644
--- a/drivers/soc/qcom/qcom-pbs.c
+++ b/drivers/soc/qcom/qcom-pbs.c
@@ -173,6 +173,8 @@ struct pbs_dev *get_pbs_client_device(struct device *dev)
return ERR_PTR(-EINVAL);
}
+ platform_device_put(pdev);
+
return pbs;
}
EXPORT_SYMBOL_GPL(get_pbs_client_device);
--
2.49.1
Make sure to drop the reference taken to the mbox platform device when
looking up its driver data.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Fixes: 6e1457fcad3f ("soc: apple: mailbox: Add ASC/M3 mailbox driver")
Cc: stable(a)vger.kernel.org # 6.8
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/soc/apple/mailbox.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/soc/apple/mailbox.c b/drivers/soc/apple/mailbox.c
index 49a0955e82d6..1685da1da23d 100644
--- a/drivers/soc/apple/mailbox.c
+++ b/drivers/soc/apple/mailbox.c
@@ -299,11 +299,18 @@ struct apple_mbox *apple_mbox_get(struct device *dev, int index)
return ERR_PTR(-EPROBE_DEFER);
mbox = platform_get_drvdata(pdev);
- if (!mbox)
- return ERR_PTR(-EPROBE_DEFER);
+ if (!mbox) {
+ mbox = ERR_PTR(-EPROBE_DEFER);
+ goto out_put_pdev;
+ }
+
+ if (!device_link_add(dev, &pdev->dev, DL_FLAG_AUTOREMOVE_CONSUMER)) {
+ mbox = ERR_PTR(-ENODEV);
+ goto out_put_pdev;
+ }
- if (!device_link_add(dev, &pdev->dev, DL_FLAG_AUTOREMOVE_CONSUMER))
- return ERR_PTR(-ENODEV);
+out_put_pdev:
+ put_device(&pdev->dev);
return mbox;
}
--
2.49.1
ACPI has ways to annotate the location of a USB device. Wire that
annotation to a v4l2 control.
To support all possible devices, add a way to annotate USB devices on DT
as well. The original binding discussion happened here:
https://lore.kernel.org/linux-devicetree/20241212-usb-orientation-v1-1-0b69…
The following patches are needed regardless of whether we end up adding
support for orientation and rotation:
- media: uvcvideo: Always set default_value
- media: uvcvideo: Set a function for UVC_EXT_GPIO_UNIT
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v3:
- refactor dt bindings
- add media: uvcvideo: Use current_value for read-only controls
- get_(max|cur|def) = swentity_get_cur
- virtual_entity add codestyle
- Codestyle
- Fix xu get_info and get_len
- Drop ACPI_DEVICE_SWNODE_DEV_ROTATION
- Add missing select V4L2_FWNODE
- Link to v2: https://lore.kernel.org/r/20250605-uvc-orientation-v2-0-5710f9d030aa@chromi…
Changes in v2:
- Add support for rotation
- Rename fwnode to swentity
- Remove the patch to move the gpio file
- Remove patches already in media-committers
- Change priority of data origins
- Patch mipi-disco
- Link to v1: https://lore.kernel.org/r/20250403-uvc-orientation-v1-0-1a0cc595a62d@chromi…
---
Ricardo Ribalda (12):
media: uvcvideo: Always set default_value
media: uvcvideo: Set a function for UVC_EXT_GPIO_UNIT
media: v4l: fwnode: Support ACPI's _PLD for v4l2_fwnode_device_parse
ACPI: mipi-disco-img: Do not duplicate rotation info into swnodes
media: ipu-bridge: Use v4l2_fwnode_device_parse helper
media: ipu-bridge: Use v4l2_fwnode for unknown rotations
dt-bindings: media: Add usb-camera-module
media: uvcvideo: Add support for V4L2_CID_CAMERA_ORIENTATION
media: uvcvideo: Fill ctrl->info.selector earlier
media: uvcvideo: Add uvc_ctrl_query_entity helper
media: uvcvideo: Use current_value for read-only controls
media: uvcvideo: Add support for V4L2_CID_CAMERA_ROTATION
.../bindings/media/usb-camera-module.yaml | 46 +++++
MAINTAINERS | 1 +
drivers/acpi/mipi-disco-img.c | 15 --
drivers/media/pci/intel/Kconfig | 1 +
drivers/media/pci/intel/ipu-bridge.c | 58 +++---
drivers/media/usb/uvc/Kconfig | 1 +
drivers/media/usb/uvc/Makefile | 3 +-
drivers/media/usb/uvc/uvc_ctrl.c | 201 +++++++++++++++------
drivers/media/usb/uvc/uvc_driver.c | 22 ++-
drivers/media/usb/uvc/uvc_entity.c | 3 +-
drivers/media/usb/uvc/uvc_swentity.c | 107 +++++++++++
drivers/media/usb/uvc/uvcvideo.h | 22 +++
drivers/media/v4l2-core/v4l2-fwnode.c | 84 ++++++++-
include/acpi/acpi_bus.h | 1 -
include/linux/usb/uvc.h | 3 +
15 files changed, 441 insertions(+), 127 deletions(-)
---
base-commit: afb100a5ea7a13d7e6937dcd3b36b19dc6cc9328
change-id: 20250403-uvc-orientation-5f7f19da5adb
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
From: Conor Dooley <conor.dooley(a)microchip.com>
mpfs_gpio_direction_output() actually sets the line to input mode.
Use the correct register settings for output mode so that this function
actually works as intended.
This was a copy-paste mistake made when converting to regmap during the
driver submission process. It went unnoticed because my test for output
mode is toggling LEDs on an Icicle kit which functions with the
incorrect code. The internal reporter has yet to test the patch, but on
their system the incorrect setting may be the reason for failures to
drive the GPIO lines on the BeagleV-fire board.
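
A small userspace sketch of why the old call left the line in input
mode. The bit positions below are assumed for illustration only and may
not match the real register layout; the MODEL_* names and model_*
functions are hypothetical. The update rule is the usual masked
read-modify-write that regmap_update_bits() performs.

```c
#include <stdint.h>

/* Assumed bit layout, for illustration only; check the driver for real values. */
#define MODEL_EN_OUT		(1u << 0)
#define MODEL_EN_IN		(1u << 1)
#define MODEL_EN_OUT_BUF	(1u << 2)
#define MODEL_DIR_MASK		(MODEL_EN_OUT | MODEL_EN_IN | MODEL_EN_OUT_BUF)

/* Masked read-modify-write: only the bits selected by mask change. */
static uint32_t model_update_bits(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | (val & mask);
}

/* What the function did before the fix: it selects input mode. */
static uint32_t model_direction_output_buggy(uint32_t ctrl)
{
	return model_update_bits(ctrl, MODEL_DIR_MASK, MODEL_EN_IN);
}

/* What the function does after the fix: output plus output buffer. */
static uint32_t model_direction_output_fixed(uint32_t ctrl)
{
	return model_update_bits(ctrl, MODEL_DIR_MASK,
				 MODEL_EN_OUT | MODEL_EN_OUT_BUF);
}
```

In both variants the bits outside the direction mask are preserved; only
the value written into the masked field differs.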
CC: stable(a)vger.kernel.org
Fixes: a987b78f3615e ("gpio: mpfs: add polarfire soc gpio support")
Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
---
CC: Conor Dooley <conor.dooley(a)microchip.com>
CC: Daire McNamara <daire.mcnamara(a)microchip.com>
CC: Cyril.Jean(a)microchip.com
CC: Linus Walleij <linus.walleij(a)linaro.org>
CC: Bartosz Golaszewski <brgl(a)bgdev.pl>
CC: linux-riscv(a)lists.infradead.org
CC: linux-gpio(a)vger.kernel.org
CC: linux-kernel(a)vger.kernel.org
---
drivers/gpio/gpio-mpfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-mpfs.c b/drivers/gpio/gpio-mpfs.c
index 82d557a7e5d8d..9468795b96348 100644
--- a/drivers/gpio/gpio-mpfs.c
+++ b/drivers/gpio/gpio-mpfs.c
@@ -69,7 +69,7 @@ static int mpfs_gpio_direction_output(struct gpio_chip *gc, unsigned int gpio_in
struct mpfs_gpio_chip *mpfs_gpio = gpiochip_get_data(gc);
regmap_update_bits(mpfs_gpio->regs, MPFS_GPIO_CTRL(gpio_index),
- MPFS_GPIO_DIR_MASK, MPFS_GPIO_EN_IN);
+ MPFS_GPIO_DIR_MASK, MPFS_GPIO_EN_OUT | MPFS_GPIO_EN_OUT_BUF);
regmap_update_bits(mpfs_gpio->regs, mpfs_gpio->offsets->outp, BIT(gpio_index),
value << gpio_index);
--
2.47.3
From: Max Kellermann <max.kellermann(a)ionos.com>
Commit 20d72b00ca81 ("netfs: Fix the request's work item to not
require a ref") modified netfs_alloc_request() to initialize the
reference counter to 2 instead of 1. The rationale was that the
request's "work" would release the second reference after completion
(via netfs_{read,write}_collection_worker()). That works most of the
time if all goes well.
However, it leaks this additional reference if the request is released
before the I/O operation has been submitted: the error code path only
decrements the reference counter once and the work item will never be
queued because there will never be a completion.
This has caused outages of our whole server cluster today because
tasks were blocked in netfs_wait_for_outstanding_io(), leading to
deadlocks in Ceph (another bug that I will address soon in another
patch). This was caused by a netfs_pgpriv2_begin_copy_to_cache() call
which failed in fscache_begin_write_operation(). The leaked
netfs_io_request was never completed, leaving `netfs_inode.io_count`
with a positive value forever.
All of this is super-fragile code. It is hard to see which code paths
will lead to an eventual completion and which will not:
- Some functions like netfs_create_write_req() allocate a request, but
will never submit any I/O.
- netfs_unbuffered_read_iter_locked() calls netfs_unbuffered_read()
and then netfs_put_request(); however, netfs_unbuffered_read() can
also fail early before submitting the I/O request, therefore another
netfs_put_request() call must be added there.
A rule of thumb is that functions that return a `netfs_io_request` do
not submit I/O, and all of their callers must be checked.
For my taste, the whole netfs code needs an overhaul to make reference
counting easier to understand and less fragile & obscure. But to fix
this bug here and now and produce a patch that is adequate for a
stable backport, I tried a minimal approach that quickly frees the
request object upon early failure.
I decided against adding a second netfs_put_request() each time
because that would cause code duplication which obscures the code
further. Instead, I added the function netfs_put_failed_request()
which frees such a failed request synchronously under the assumption
that the reference count is exactly 2 (as initially set by
netfs_alloc_request() and never touched), verified by a
WARN_ON_ONCE(). It then deinitializes the request object (without
going through the "cleanup_work" indirection) and frees the allocation
(with RCU protection to protect against concurrent access by
netfs_requests_seq_start()).
All code paths that fail early have been changed to call
netfs_put_failed_request() instead of netfs_put_request().
Additionally, I have added a netfs_put_request() call to
netfs_unbuffered_read() as explained above because the
netfs_put_failed_request() approach does not work there.
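
The reference accounting described above can be modelled in userspace
roughly as follows. This is a simplified sketch with hypothetical
model_* names; it ignores RCU, tracing and the "cleanup_work"
indirection, and only keeps the "initial count of 2" rule and the
early-failure helper.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a request object. */
struct model_request {
	int ref;
	bool freed;
};

/* Allocation starts with two references: the caller's and the work item's. */
static void model_alloc(struct model_request *rq)
{
	rq->ref = 2;
	rq->freed = false;
}

static void model_put(struct model_request *rq)
{
	if (--rq->ref == 0)
		rq->freed = true;
}

/*
 * Early-failure helper in the spirit of netfs_put_failed_request(): only
 * valid while both initial references are untouched. Returns false if
 * that assumption does not hold.
 */
static bool model_put_failed(struct model_request *rq)
{
	if (rq->ref != 2)
		return false;
	rq->ref = 0;
	rq->freed = true;
	return true;
}

/* Returns true if the request ends up freed. */
static bool model_run(bool io_submitted, bool use_failed_helper)
{
	struct model_request rq;

	model_alloc(&rq);

	if (io_submitted) {
		model_put(&rq);	/* the work item drops its reference */
		model_put(&rq);	/* the caller drops its own reference */
	} else if (use_failed_helper) {
		model_put_failed(&rq);
	} else {
		model_put(&rq);	/* only one put: the request leaks */
	}

	return rq.freed;
}
```

model_run(false, false) is the leaking early-failure path;
model_run(false, true) is the same path using the new helper.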
Fixes: 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref")
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: Paulo Alcantara <pc(a)manguebit.org>
cc: netfs(a)lists.linux.dev,
cc: linux-fsdevel(a)vger.kernel.org,
cc: stable(a)vger.kernel.org
---
Changes
=======
ver #3)
- Log the refcount in the tracepoint in netfs_put_failed_request().
ver #2)
- Fix missing RCU handling in netfs_put_failed_request().
fs/netfs/buffered_read.c | 10 +++++-----
fs/netfs/direct_read.c | 7 ++++++-
fs/netfs/direct_write.c | 6 +++++-
fs/netfs/internal.h | 1 +
fs/netfs/objects.c | 28 +++++++++++++++++++++++++---
fs/netfs/read_pgpriv2.c | 2 +-
fs/netfs/read_single.c | 2 +-
fs/netfs/write_issue.c | 3 +--
8 files changed, 45 insertions(+), 14 deletions(-)
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 18b3dc74c70e..37ab6f28b5ad 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -369,7 +369,7 @@ void netfs_readahead(struct readahead_control *ractl)
return netfs_put_request(rreq, netfs_rreq_trace_put_return);
cleanup_free:
- return netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ return netfs_put_failed_request(rreq);
}
EXPORT_SYMBOL(netfs_readahead);
@@ -472,7 +472,7 @@ static int netfs_read_gaps(struct file *file, struct folio *folio)
return ret < 0 ? ret : 0;
discard:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
alloc_error:
folio_unlock(folio);
return ret;
@@ -532,7 +532,7 @@ int netfs_read_folio(struct file *file, struct folio *folio)
return ret < 0 ? ret : 0;
discard:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
alloc_error:
folio_unlock(folio);
return ret;
@@ -699,7 +699,7 @@ int netfs_write_begin(struct netfs_inode *ctx,
return 0;
error_put:
- netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(rreq);
error:
if (folio) {
folio_unlock(folio);
@@ -754,7 +754,7 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
return ret < 0 ? ret : 0;
error_put:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
error:
_leave(" = %d", ret);
return ret;
diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c
index a05e13472baf..a498ee8d6674 100644
--- a/fs/netfs/direct_read.c
+++ b/fs/netfs/direct_read.c
@@ -131,6 +131,7 @@ static ssize_t netfs_unbuffered_read(struct netfs_io_request *rreq, bool sync)
if (rreq->len == 0) {
pr_err("Zero-sized read [R=%x]\n", rreq->debug_id);
+ netfs_put_request(rreq, netfs_rreq_trace_put_discard);
return -EIO;
}
@@ -205,7 +206,7 @@ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *i
if (user_backed_iter(iter)) {
ret = netfs_extract_user_iter(iter, rreq->len, &rreq->buffer.iter, 0);
if (ret < 0)
- goto out;
+ goto error_put;
rreq->direct_bv = (struct bio_vec *)rreq->buffer.iter.bvec;
rreq->direct_bv_count = ret;
rreq->direct_bv_unpin = iov_iter_extract_will_pin(iter);
@@ -238,6 +239,10 @@ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *i
if (ret > 0)
orig_count -= ret;
return ret;
+
+error_put:
+ netfs_put_failed_request(rreq);
+ return ret;
}
EXPORT_SYMBOL(netfs_unbuffered_read_iter_locked);
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index a16660ab7f83..a9d1c3b2c084 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -57,7 +57,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0);
if (n < 0) {
ret = n;
- goto out;
+ goto error_put;
}
wreq->direct_bv = (struct bio_vec *)wreq->buffer.iter.bvec;
wreq->direct_bv_count = n;
@@ -101,6 +101,10 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
out:
netfs_put_request(wreq, netfs_rreq_trace_put_return);
return ret;
+
+error_put:
+ netfs_put_failed_request(wreq);
+ return ret;
}
EXPORT_SYMBOL(netfs_unbuffered_write_iter_locked);
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index d4f16fefd965..4319611f5354 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -87,6 +87,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
void netfs_get_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace what);
void netfs_clear_subrequests(struct netfs_io_request *rreq);
void netfs_put_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace what);
+void netfs_put_failed_request(struct netfs_io_request *rreq);
struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
static inline void netfs_see_request(struct netfs_io_request *rreq,
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index e8c99738b5bb..39d5e13f7248 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -116,10 +116,8 @@ static void netfs_free_request_rcu(struct rcu_head *rcu)
netfs_stat_d(&netfs_n_rh_rreq);
}
-static void netfs_free_request(struct work_struct *work)
+static void netfs_deinit_request(struct netfs_io_request *rreq)
{
- struct netfs_io_request *rreq =
- container_of(work, struct netfs_io_request, cleanup_work);
struct netfs_inode *ictx = netfs_inode(rreq->inode);
unsigned int i;
@@ -149,6 +147,14 @@ static void netfs_free_request(struct work_struct *work)
if (atomic_dec_and_test(&ictx->io_count))
wake_up_var(&ictx->io_count);
+}
+
+static void netfs_free_request(struct work_struct *work)
+{
+ struct netfs_io_request *rreq =
+ container_of(work, struct netfs_io_request, cleanup_work);
+
+ netfs_deinit_request(rreq);
call_rcu(&rreq->rcu, netfs_free_request_rcu);
}
@@ -167,6 +173,22 @@ void netfs_put_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace
}
}
+/*
+ * Free a request (synchronously) that was just allocated but has
+ * failed before it could be submitted.
+ */
+void netfs_put_failed_request(struct netfs_io_request *rreq)
+{
+ /* new requests have two references (see
+ * netfs_alloc_request(), and this function is only allowed on
+ * new request objects
+ */
+ WARN_ON_ONCE(refcount_read(&rreq->ref) != 2);
+
+ trace_netfs_rreq_ref(rreq->debug_id, 0, netfs_rreq_trace_put_failed);
+ netfs_free_request(&rreq->cleanup_work);
+}
+
/*
* Allocate and partially initialise an I/O request structure.
*/
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index 8097bc069c1d..a1489aa29f78 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -118,7 +118,7 @@ static struct netfs_io_request *netfs_pgpriv2_begin_copy_to_cache(
return creq;
cancel_put:
- netfs_put_request(creq, netfs_rreq_trace_put_return);
+ netfs_put_failed_request(creq);
cancel:
rreq->copy_to_cache = ERR_PTR(-ENOBUFS);
clear_bit(NETFS_RREQ_FOLIO_COPY_TO_CACHE, &rreq->flags);
diff --git a/fs/netfs/read_single.c b/fs/netfs/read_single.c
index fa622a6cd56d..5c0dc4efc792 100644
--- a/fs/netfs/read_single.c
+++ b/fs/netfs/read_single.c
@@ -189,7 +189,7 @@ ssize_t netfs_read_single(struct inode *inode, struct file *file, struct iov_ite
return ret;
cleanup_free:
- netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(rreq);
return ret;
}
EXPORT_SYMBOL(netfs_read_single);
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 0584cba1a043..dd8743bc8d7f 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -133,8 +133,7 @@ struct netfs_io_request *netfs_create_write_req(struct address_space *mapping,
return wreq;
nomem:
- wreq->error = -ENOMEM;
- netfs_put_request(wreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(wreq);
return ERR_PTR(-ENOMEM);
}
A chip freeze is observed on i.MX7D when the PCIe RC kicks off the PM_PME
message and no devices are connected to the port.
To work around this issue, skip the PME_Turn_Off message if no endpoint
is connected.
Cc: stable(a)vger.kernel.org
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method")
Signed-off-by: Richard Zhu <hongxing.zhu(a)nxp.com>
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
---
drivers/pci/controller/dwc/pcie-designware-host.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 57a1ba08c427..b303a74b0fd7 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -1008,12 +1008,15 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
u32 val;
int ret;
- if (pci->pp.ops->pme_turn_off) {
- pci->pp.ops->pme_turn_off(&pci->pp);
- } else {
- ret = dw_pcie_pme_turn_off(pci);
- if (ret)
- return ret;
+ /* Skip PME_Turn_Off message if there is no endpoint connected */
+ if (dw_pcie_get_ltssm(pci) > DW_PCIE_LTSSM_DETECT_WAIT) {
+ if (pci->pp.ops->pme_turn_off) {
+ pci->pp.ops->pme_turn_off(&pci->pp);
+ } else {
+ ret = dw_pcie_pme_turn_off(pci);
+ if (ret)
+ return ret;
+ }
}
if (dwc_quirk(pci, QUIRK_NOL2POLL_IN_PM)) {
--
2.37.1
The quilt patch titled
Subject: mm/damon/sysfs: do not ignore callback's return value in damon_sysfs_damon_call()
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-do-not-ignore-callbacks-return-value-in-damon_sysfs_damon_call.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Akinobu Mita <akinobu.mita(a)gmail.com>
Subject: mm/damon/sysfs: do not ignore callback's return value in damon_sysfs_damon_call()
Date: Sat, 20 Sep 2025 22:25:46 +0900
The callback return value is ignored in damon_sysfs_damon_call(), which
means that it is not possible to detect invalid user input when writing
commands such as 'commit' to
/sys/kernel/mm/damon/admin/kdamonds/<K>/state. Fix it.
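The difference between the two return paths can be illustrated with a
minimal userspace sketch. All names here (struct call_control_model,
run_call_model(), reject_input() and the two wrappers) are hypothetical
stand-ins and not the DAMON API; the sketch only assumes that the
dispatcher reports whether the callback ran, while the callback's own
result is stored in a return_code field.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct damon_call_control. */
struct call_control_model {
	int (*fn)(void *data);
	void *data;
	int return_code;	/* result of fn(), filled in by the dispatcher */
};

/* Hypothetical stand-in for damon_call(): 0 means "fn was executed". */
static int run_call_model(struct call_control_model *c)
{
	if (!c->fn)
		return -EINVAL;
	c->return_code = c->fn(c->data);
	return 0;
}

/* Old behaviour: only the dispatch status is returned. */
static int call_ignoring_result(struct call_control_model *c)
{
	return run_call_model(c);
}

/* Fixed behaviour: a dispatch error wins, else the callback's result. */
static int call_propagating_result(struct call_control_model *c)
{
	int err = run_call_model(c);

	return err ? err : c->return_code;
}

/* Example callback that rejects its input, as 'commit' might. */
static int reject_input(void *data)
{
	(void)data;
	return -EINVAL;
}
```

With a callback such as reject_input(), the first wrapper reports
success while the second reports -EINVAL, which is what lets the sysfs
write fail visibly.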
Link: https://lkml.kernel.org/r/20250920132546.5822-1-akinobu.mita@gmail.com
Fixes: f64539dcdb87 ("mm/damon/sysfs: use damon_call() for update_schemes_stats")
Signed-off-by: Akinobu Mita <akinobu.mita(a)gmail.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-do-not-ignore-callbacks-return-value-in-damon_sysfs_damon_call
+++ a/mm/damon/sysfs.c
@@ -1592,12 +1592,14 @@ static int damon_sysfs_damon_call(int (*
struct damon_sysfs_kdamond *kdamond)
{
struct damon_call_control call_control = {};
+ int err;
if (!kdamond->damon_ctx)
return -EINVAL;
call_control.fn = fn;
call_control.data = kdamond;
- return damon_call(kdamond->damon_ctx, &call_control);
+ err = damon_call(kdamond->damon_ctx, &call_control);
+ return err ? err : call_control.return_code;
}
struct damon_sysfs_schemes_walk_data {
_
Patches currently in -mm which might be from akinobu.mita(a)gmail.com are
The quilt patch titled
Subject: fs/proc/task_mmu: check p->vec_buf for NULL
has been removed from the -mm tree. Its filename was
fs-proc-task_mmu-check-p-vec_buf-for-null.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jakub Acs <acsjakub(a)amazon.de>
Subject: fs/proc/task_mmu: check p->vec_buf for NULL
Date: Mon, 22 Sep 2025 08:22:05 +0000
When a PAGEMAP_SCAN ioctl invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), the kernel panics with a null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(), that
page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking p->vec_buf for NULL before dereferencing.
Other sites that might run into the same deref issue are already (directly
or transitively) protected by checking p->vec_buf.
Note:
According to the PAGEMAP_SCAN man page, vec_len = 0 appears to be valid
when no output is requested and the caller is only interested in the side
effects, hence it passes the check in pagemap_scan_get_args().
This issue was found by syzkaller.
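The guard can be modelled in userspace as follows. This is a simplified
sketch, not the kernel function: struct scan_model and
backout_range_model() are hypothetical names, only the two fields used
here are modelled, and the body of the else branch is an assumption
since the hunk below does not show it. Unlike the kernel hunk, the
sketch forms the element pointer only after the NULL check, because
pointer arithmetic on a NULL pointer is undefined in portable C.

```c
#include <stddef.h>

struct page_region_model {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical subset of the scan state. */
struct scan_model {
	struct page_region_model *vec_buf;	/* NULL when vec_len == 0 */
	size_t vec_buf_index;
};

static void backout_range_model(struct scan_model *p, unsigned long addr)
{
	struct page_region_model *cur_buf;

	/* No output buffer was allocated, so there is nothing to back out. */
	if (!p->vec_buf)
		return;

	cur_buf = &p->vec_buf[p->vec_buf_index];
	if (cur_buf->start != addr)
		cur_buf->end = addr;	/* trim the region back to addr */
	else
		cur_buf->start = cur_buf->end = 0;	/* assumed: drop it */
}
```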
Link: https://lkml.kernel.org/r/20250922082206.6889-1-acsjakub@amazon.de
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Michał Mirosław" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/proc/task_mmu.c~fs-proc-task_mmu-check-p-vec_buf-for-null
+++ a/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(s
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!p->vec_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
_
Patches currently in -mm which might be from acsjakub(a)amazon.de are
The quilt patch titled
Subject: kmsan: fix out-of-bounds access to shadow memory
has been removed from the -mm tree. Its filename was
kmsan-fix-out-of-bounds-access-to-shadow-memory.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Eric Biggers <ebiggers(a)kernel.org>
Subject: kmsan: fix out-of-bounds access to shadow memory
Date: Thu, 11 Sep 2025 12:58:58 -0700
Running sha224_kunit on a KMSAN-enabled kernel results in a crash in
kmsan_internal_set_shadow_origin():
BUG: unable to handle page fault for address: ffffbc3840291000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary)
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100
[...]
Call Trace:
<TASK>
__msan_memset+0xee/0x1a0
sha224_final+0x9e/0x350
test_hash_buffer_overruns+0x46f/0x5f0
? kmsan_get_shadow_origin_ptr+0x46/0xa0
? __pfx_test_hash_buffer_overruns+0x10/0x10
kunit_try_run_case+0x198/0xa00
This occurs when memset() is called on a buffer that is not 4-byte aligned
and extends to the end of a guard page, i.e. the next page is unmapped.
The bug is that the loop at the end of kmsan_internal_set_shadow_origin()
accesses the wrong shadow memory bytes when the address is not 4-byte
aligned. Since each 4 bytes are associated with an origin, it rounds the
address and size so that it can access all the origins that contain the
buffer. However, when it checks the corresponding shadow bytes for a
particular origin, it incorrectly uses the original unrounded shadow
address. This results in reads from shadow memory beyond the end of the
buffer's shadow memory, which crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
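The index arithmetic can be checked with a small userspace model. It is
only a sketch under stated assumptions: shadow byte offsets are taken to
equal address offsets, ORIGIN_SIZE stands in for KMSAN_ORIGIN_SIZE, and
struct span and origin_loop_span() are hypothetical names. The function
returns the half-open range of shadow bytes that the final loop would
read, either from the unrounded shadow address (old behaviour) or from
the aligned one (fixed behaviour).

```c
#include <stddef.h>
#include <stdint.h>

#define ORIGIN_SIZE 4u	/* stands in for KMSAN_ORIGIN_SIZE */

/* Half-open range [first, end) of shadow byte offsets that get read. */
struct span {
	uint64_t first;
	uint64_t end;
};

static struct span origin_loop_span(uint64_t address, size_t size, int fixed)
{
	uint64_t shadow = address;	/* model: shadow offset == address */
	uint64_t pad = address % ORIGIN_SIZE;
	uint64_t base;
	struct span s;

	/* Round the range outwards so that it covers whole origin slots. */
	size += pad;
	size = (size + ORIGIN_SIZE - 1) & ~(size_t)(ORIGIN_SIZE - 1);

	/* Old code indexed from 'shadow'; the fix indexes from 'shadow - pad'. */
	base = fixed ? shadow - pad : shadow;

	s.first = base;
	s.end = base + size;
	return s;
}
```

For a 3-byte buffer at offset 4093 of a 4096-byte page, the unrounded
base gives the range [4093, 4097), which touches one byte of the next
page, while the aligned base gives [4092, 4096) and stays inside the
page.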
Link: https://lkml.kernel.org/r/20250911195858.394235-1-ebiggers@kernel.org
Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning")
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
Tested-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kmsan/core.c | 10 +++++++---
mm/kmsan/kmsan_test.c | 16 ++++++++++++++++
2 files changed, 23 insertions(+), 3 deletions(-)
--- a/mm/kmsan/core.c~kmsan-fix-out-of-bounds-access-to-shadow-memory
+++ a/mm/kmsan/core.c
@@ -195,7 +195,8 @@ void kmsan_internal_set_shadow_origin(vo
u32 origin, bool checked)
{
u64 address = (u64)addr;
- u32 *shadow_start, *origin_start;
+ void *shadow_start;
+ u32 *aligned_shadow, *origin_start;
size_t pad = 0;
KMSAN_WARN_ON(!kmsan_metadata_is_contiguous(addr, size));
@@ -214,9 +215,12 @@ void kmsan_internal_set_shadow_origin(vo
}
__memset(shadow_start, b, size);
- if (!IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ if (IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ aligned_shadow = shadow_start;
+ } else {
pad = address % KMSAN_ORIGIN_SIZE;
address -= pad;
+ aligned_shadow = shadow_start - pad;
size += pad;
}
size = ALIGN(size, KMSAN_ORIGIN_SIZE);
@@ -230,7 +234,7 @@ void kmsan_internal_set_shadow_origin(vo
* corresponding shadow slot is zero.
*/
for (int i = 0; i < size / KMSAN_ORIGIN_SIZE; i++) {
- if (origin || !shadow_start[i])
+ if (origin || !aligned_shadow[i])
origin_start[i] = origin;
}
}
--- a/mm/kmsan/kmsan_test.c~kmsan-fix-out-of-bounds-access-to-shadow-memory
+++ a/mm/kmsan/kmsan_test.c
@@ -556,6 +556,21 @@ DEFINE_TEST_MEMSETXX(16)
DEFINE_TEST_MEMSETXX(32)
DEFINE_TEST_MEMSETXX(64)
+/* Test case: ensure that KMSAN does not access shadow memory out of bounds. */
+static void test_memset_on_guarded_buffer(struct kunit *test)
+{
+ void *buf = vmalloc(PAGE_SIZE);
+
+ kunit_info(test,
+ "memset() on ends of guarded buffer should not crash\n");
+
+ for (size_t size = 0; size <= 128; size++) {
+ memset(buf, 0xff, size);
+ memset(buf + PAGE_SIZE - size, 0xff, size);
+ }
+ vfree(buf);
+}
+
static noinline void fibonacci(int *array, int size, int start)
{
if (start < 2 || (start == size))
@@ -677,6 +692,7 @@ static struct kunit_case kmsan_test_case
KUNIT_CASE(test_memset16),
KUNIT_CASE(test_memset32),
KUNIT_CASE(test_memset64),
+ KUNIT_CASE(test_memset_on_guarded_buffer),
KUNIT_CASE(test_long_origin_chain),
KUNIT_CASE(test_stackdepot_roundtrip),
KUNIT_CASE(test_unpoison_memory),
_
Patches currently in -mm which might be from ebiggers(a)kernel.org are
The quilt patch titled
Subject: mm/hugetlb: fix folio is still mapped when deleted
has been removed from the -mm tree. Its filename was
mm-hugetlb-fix-folio-is-still-mapped-when-deleted.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Subject: mm/hugetlb: fix folio is still mapped when deleted
Date: Fri, 12 Sep 2025 15:41:39 +0800
Migration may race with fallocate hole punching. remove_inode_single_folio()
will unmap the folio if it is still mapped. However, it is called without
the folio lock held. If the folio is being migrated and the mapped pte has
been converted to a migration entry, folio_mapped() returns false, and the
folio is not unmapped. Due to the extra refcount held by
remove_inode_single_folio(), migration fails, the migration entry is
restored to a normal pte, and the folio is mapped again. As a result, a
BUG is triggered in filemap_unaccount_folio().
The log is as follows:
BUG: Bad page cache in process hugetlb pfn:156c00
page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f4(hugetlb)
page dumped because: still mapped when deleted
CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x70
filemap_unaccount_folio+0xc4/0x1c0
__filemap_remove_folio+0x38/0x1c0
filemap_remove_folio+0x41/0xd0
remove_inode_hugepages+0x142/0x250
hugetlbfs_fallocate+0x471/0x5a0
vfs_fallocate+0x149/0x380
Hold the folio lock before checking whether the folio is mapped, to avoid
the race with migration.
Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- a/fs/hugetlbfs/inode.c~mm-hugetlb-fix-folio-is-still-mapped-when-deleted
+++ a/fs/hugetlbfs/inode.c
@@ -517,14 +517,16 @@ static bool remove_inode_single_folio(st
/*
* If folio is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) while holding
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the folio.
+ * unmapped in caller or hugetlb_vmdelete_list() skips
+ * unmapping it due to fail to grab lock. Unmap (again)
+ * while holding the fault mutex. The mutex will prevent
+ * faults until we finish removing the folio. Hold folio
+ * lock to guarantee no concurrent migration.
*/
+ folio_lock(folio);
if (unlikely(folio_mapped(folio)))
hugetlb_unmap_file_folio(h, mapping, folio, index);
- folio_lock(folio);
/*
* We must remove the folio from page cache before removing
* the region/ reserve map (hugetlb_unreserve_pages). In
_
Patches currently in -mm which might be from tujinjiang(a)huawei.com are
The quilt patch titled
Subject: hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
has been removed from the -mm tree. Its filename was
hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list.patch
This patch was dropped because it had testing failures
------------------------------------------------------
From: Deepanshu Kartikey <kartikey406(a)gmail.com>
Subject: hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
Date: Thu, 25 Sep 2025 20:19:32 +0530
hugetlb_vmdelete_list() uses trylock to acquire VMA locks during truncate
operations. As per the original design in commit 40549ba8f8e0 ("hugetlb:
use new vma_lock for pmd sharing synchronization"), if the trylock fails
or the VMA has no lock, it should skip that VMA. Any remaining mapped
pages are handled by remove_inode_hugepages() which is called after
hugetlb_vmdelete_list() and uses proper lock ordering to guarantee
unmapping success.
Currently, when hugetlb_vma_trylock_write() returns success (1) for VMAs
without shareable locks, the code proceeds to call unmap_hugepage_range().
This causes assertion failures in huge_pmd_unshare() →
hugetlb_vma_assert_locked() because no lock is actually held:
WARNING: CPU: 1 PID: 6594 Comm: syz.0.28 Not tainted
Call Trace:
hugetlb_vma_assert_locked+0x1dd/0x250
huge_pmd_unshare+0x2c8/0x540
__unmap_hugepage_range+0x6e3/0x1aa0
unmap_hugepage_range+0x32e/0x410
hugetlb_vmdelete_list+0x189/0x1f0
Fix by explicitly skipping VMAs without shareable locks after trylock
succeeds, consistent with the original design where such VMAs are deferred
to remove_inode_hugepages() for proper handling.
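The control flow described above can be sketched in userspace. The
model below is an assumption-laden illustration, not the hugetlb code:
struct vma_model, trylock_write_model() and may_unmap() are hypothetical
names, and the trylock is reduced to the one property the text relies
on, namely that it reports success for a VMA without a shareable lock
while taking no lock at all.

```c
#include <stdbool.h>

/* Hypothetical, simplified VMA with only the state this sketch needs. */
struct vma_model {
	bool has_shareable_lock;	/* models __vma_shareable_lock() */
	bool contended;			/* another task holds the lock */
	bool locked;			/* set only when the lock was taken */
};

/* Returns 1 on "success", including the case where no lock exists. */
static int trylock_write_model(struct vma_model *vma)
{
	if (!vma->has_shareable_lock)
		return 1;	/* nothing to lock, yet reported as success */
	if (vma->contended)
		return 0;
	vma->locked = true;
	return 1;
}

/* Returns true only when unmapping may proceed with the lock held. */
static bool may_unmap(struct vma_model *vma)
{
	if (!trylock_write_model(vma))
		return false;	/* trylock failed: skip this VMA */
	if (!vma->has_shareable_lock)
		return false;	/* the additional check: skip, no lock held */
	return vma->locked;
}
```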
Link: https://lkml.kernel.org/r/20250925144934.150299-1-kartikey406@gmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
Reported-by: syzbot+f26d7c75c26ec19790e7(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=f26d7c75c26ec19790e7
Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
Tested-by: syzbot+f26d7c75c26ec19790e7(a)syzkaller.appspotmail.com
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list
+++ a/fs/hugetlbfs/inode.c
@@ -487,7 +487,8 @@ hugetlb_vmdelete_list(struct rb_root_cac
if (!hugetlb_vma_trylock_write(vma))
continue;
-
+ if (!__vma_shareable_lock(vma))
+ continue;
v_start = vma_offset_start(vma, start);
v_end = vma_offset_end(vma, end);
_
Patches currently in -mm which might be from kartikey406(a)gmail.com are
The quilt patch titled
Subject: mm/memblock: correct totalram_pages accounting with KMSAN
has been removed from the -mm tree. Its filename was
mm-memblock-correct-totalram_pages-accounting-with-kmsan.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: Alexander Potapenko <glider(a)google.com>
Subject: mm/memblock: correct totalram_pages accounting with KMSAN
Date: Wed, 24 Sep 2025 12:03:01 +0200
When KMSAN is enabled, `kmsan_memblock_free_pages()` can hold back pages
for metadata instead of returning them to the early allocator. The
callers, however, would unconditionally increment `totalram_pages`,
assuming the pages were always freed. This resulted in an incorrect
calculation of the total available RAM, causing the kernel to believe it
had more memory than it actually did.
This patch refactors `memblock_free_pages()` to return the number of pages
it successfully frees. If KMSAN stashes the pages, the function now
returns 0; otherwise, it returns the number of pages in the block.
The callers in `memblock.c` have been updated to use this return value,
ensuring that `totalram_pages` is incremented only by the number of pages
actually returned to the allocator. This corrects the total RAM
accounting when KMSAN is active.
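The accounting change can be illustrated with a short userspace sketch.
The names struct block_model, free_block_model(), count_freed() and
count_assumed() are hypothetical; the held_back flag stands in for the
case where KMSAN keeps a block for its metadata.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one block passed to the free routine. */
struct block_model {
	unsigned int order;	/* block covers 1 << order pages */
	bool held_back;		/* true when the pages are kept for metadata */
};

/* Returns the number of pages actually handed to the allocator. */
static unsigned long free_block_model(const struct block_model *b)
{
	if (b->held_back)
		return 0;
	return 1UL << b->order;
}

/* New accounting: sum what the free routine reports. */
static unsigned long count_freed(const struct block_model *blocks, size_t n)
{
	unsigned long freed = 0;

	for (size_t i = 0; i < n; i++)
		freed += free_block_model(&blocks[i]);
	return freed;
}

/* Old accounting: assume every block was freed. */
static unsigned long count_assumed(const struct block_model *blocks, size_t n)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++)
		total += 1UL << blocks[i].order;
	return total;
}
```

For three blocks of order 0, 3 and 2 where the order-3 block is held
back, the old accounting yields 13 pages and the new one yields 5.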
Link: https://lkml.kernel.org/r/20250924100301.1558645-1-glider@google.com
Fixes: 3c2065098260 ("init: kmsan: call KMSAN initialization routines")
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Aleksandr Nogikh <nogikh(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Markus Elfring <Markus.Elfring(a)web.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/internal.h | 4 ++--
mm/memblock.c | 21 +++++++++++----------
mm/mm_init.c | 9 +++++----
3 files changed, 18 insertions(+), 16 deletions(-)
--- a/mm/internal.h~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/internal.h
@@ -742,8 +742,8 @@ static inline void clear_zone_contiguous
extern int __isolate_free_page(struct page *page, unsigned int order);
extern void __putback_isolated_page(struct page *page, unsigned int order,
int mt);
-extern void memblock_free_pages(struct page *page, unsigned long pfn,
- unsigned int order);
+unsigned long memblock_free_pages(struct page *page, unsigned long pfn,
+ unsigned int order);
extern void __free_pages_core(struct page *page, unsigned int order,
enum meminit_context context);
--- a/mm/memblock.c~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/memblock.c
@@ -1826,6 +1826,7 @@ void *__init __memblock_alloc_or_panic(p
void __init memblock_free_late(phys_addr_t base, phys_addr_t size)
{
phys_addr_t cursor, end;
+ unsigned long freed_pages = 0;
end = base + size - 1;
memblock_dbg("%s: [%pa-%pa] %pS\n",
@@ -1834,10 +1835,9 @@ void __init memblock_free_late(phys_addr
cursor = PFN_UP(base);
end = PFN_DOWN(base + size);
- for (; cursor < end; cursor++) {
- memblock_free_pages(pfn_to_page(cursor), cursor, 0);
- totalram_pages_inc();
- }
+ for (; cursor < end; cursor++)
+ freed_pages += memblock_free_pages(pfn_to_page(cursor), cursor, 0);
+ totalram_pages_add(freed_pages);
}
/*
@@ -2259,9 +2259,11 @@ static void __init free_unused_memmap(vo
#endif
}
-static void __init __free_pages_memory(unsigned long start, unsigned long end)
+static unsigned long __init __free_pages_memory(unsigned long start,
+ unsigned long end)
{
int order;
+ unsigned long freed = 0;
while (start < end) {
/*
@@ -2279,14 +2281,15 @@ static void __init __free_pages_memory(u
while (start + (1UL << order) > end)
order--;
- memblock_free_pages(pfn_to_page(start), start, order);
+ freed += memblock_free_pages(pfn_to_page(start), start, order);
start += (1UL << order);
}
+ return freed;
}
static unsigned long __init __free_memory_core(phys_addr_t start,
- phys_addr_t end)
+ phys_addr_t end)
{
unsigned long start_pfn = PFN_UP(start);
unsigned long end_pfn = PFN_DOWN(end);
@@ -2297,9 +2300,7 @@ static unsigned long __init __free_memor
if (start_pfn >= end_pfn)
return 0;
- __free_pages_memory(start_pfn, end_pfn);
-
- return end_pfn - start_pfn;
+ return __free_pages_memory(start_pfn, end_pfn);
}
static void __init memmap_init_reserved_pages(void)
--- a/mm/mm_init.c~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/mm_init.c
@@ -2547,24 +2547,25 @@ void *__init alloc_large_system_hash(con
return table;
}
-void __init memblock_free_pages(struct page *page, unsigned long pfn,
- unsigned int order)
+unsigned long __init memblock_free_pages(struct page *page, unsigned long pfn,
+ unsigned int order)
{
if (IS_ENABLED(CONFIG_DEFERRED_STRUCT_PAGE_INIT)) {
int nid = early_pfn_to_nid(pfn);
if (!early_page_initialised(pfn, nid))
- return;
+ return 0;
}
if (!kmsan_memblock_free_pages(page, order)) {
/* KMSAN will take care of these pages. */
- return;
+ return 0;
}
/* pages were reserved and not allocated */
clear_page_tag_ref(page);
__free_pages_core(page, order, MEMINIT_EARLY);
+ return 1UL << order;
}
DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_ALLOC_DEFAULT_ON, init_on_alloc);
_
Patches currently in -mm which might be from glider(a)google.com are
The patch titled
Subject: hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
has been added to the -mm mm-new branch. Its filename is
hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Deepanshu Kartikey <kartikey406(a)gmail.com>
Subject: hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
Date: Thu, 25 Sep 2025 20:19:32 +0530
hugetlb_vmdelete_list() uses trylock to acquire VMA locks during truncate
operations. As per the original design in commit 40549ba8f8e0 ("hugetlb:
use new vma_lock for pmd sharing synchronization"), if the trylock fails
or the VMA has no lock, it should skip that VMA. Any remaining mapped
pages are handled by remove_inode_hugepages() which is called after
hugetlb_vmdelete_list() and uses proper lock ordering to guarantee
unmapping success.
Currently, when hugetlb_vma_trylock_write() returns success (1) for VMAs
without shareable locks, the code proceeds to call unmap_hugepage_range().
This causes assertion failures in huge_pmd_unshare() →
hugetlb_vma_assert_locked() because no lock is actually held:
WARNING: CPU: 1 PID: 6594 Comm: syz.0.28 Not tainted
Call Trace:
hugetlb_vma_assert_locked+0x1dd/0x250
huge_pmd_unshare+0x2c8/0x540
__unmap_hugepage_range+0x6e3/0x1aa0
unmap_hugepage_range+0x32e/0x410
hugetlb_vmdelete_list+0x189/0x1f0
Fix by explicitly skipping VMAs without shareable locks after trylock
succeeds, consistent with the original design where such VMAs are deferred
to remove_inode_hugepages() for proper handling.
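As a rough userspace illustration of the control flow (not kernel code),
the skip can be modelled as below. All model_* names and the struct layout
are hypothetical. The only behaviour taken from the description above is
that the trylock reports success for a VMA that has no shareable lock, and
that the unmap step asserts that a lock is really held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
struct model_vma {
	bool has_shareable_lock;	/* stands in for __vma_shareable_lock() */
	bool lock_busy;			/* another task holds the lock */
	bool lock_held;			/* this task holds the lock */
	bool unmapped;			/* set once the unmap step has run */
};

/* Reports success (1) even when there is no lock to take. */
int model_trylock_write(struct model_vma *vma)
{
	if (!vma->has_shareable_lock)
		return 1;
	if (vma->lock_busy)
		return 0;
	vma->lock_held = true;
	return 1;
}

/* The unmap step insists that the lock is held, then releases it. */
void model_unmap(struct model_vma *vma)
{
	assert(vma->lock_held);
	vma->unmapped = true;
	vma->lock_held = false;
}

/* Loop with the extra skip; returns how many VMAs were unmapped here. */
size_t model_vmdelete_list(struct model_vma *vmas, size_t n)
{
	size_t done = 0;

	for (size_t i = 0; i < n; i++) {
		if (!model_trylock_write(&vmas[i]))
			continue;
		if (!vmas[i].has_shareable_lock)
			continue;	/* left for the later pass */
		model_unmap(&vmas[i]);
		done++;
	}
	return done;
}
```

Without the second check, the lockless VMA would reach model_unmap() and
trip the assertion, which is the model's analogue of the warning above.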
Link: https://lkml.kernel.org/r/20250925144934.150299-1-kartikey406@gmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
Reported-by: syzbot+f26d7c75c26ec19790e7(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=f26d7c75c26ec19790e7
Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
Tested-by: syzbot+f26d7c75c26ec19790e7(a)syzkaller.appspotmail.com
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list
+++ a/fs/hugetlbfs/inode.c
@@ -487,7 +487,8 @@ hugetlb_vmdelete_list(struct rb_root_cac
if (!hugetlb_vma_trylock_write(vma))
continue;
-
+ if (!__vma_shareable_lock(vma))
+ continue;
v_start = vma_offset_start(vma, start);
v_end = vma_offset_end(vma, end);
_
Patches currently in -mm which might be from kartikey406(a)gmail.com are
hugetlbfs-skip-vmas-without-shareable-locks-in-hugetlb_vmdelete_list.patch
When a PAGEMAP_SCAN ioctl invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), the kernel panics with a null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(),
that page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking p->vec_buf for NULL before dereferencing.
Other sites that might run into the same deref issue are already (directly
or transitively) protected by checking p->vec_buf.
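A minimal userspace sketch of the guard follows. The model_* names are
hypothetical and the struct is reduced to the two fields mentioned above.
Two things are choices of the sketch rather than of the patch: the NULL
check is placed before the element address is formed, and the body of the
else branch (not shown in the hunk) is assumed to clear the region.

```c
#include <assert.h>
#include <stddef.h>

struct model_region {
	unsigned long start, end;
};

struct model_scan {
	struct model_region *vec_buf;	/* NULL when vec_len == 0 */
	size_t vec_buf_index;
};

/* Returns 1 if a region was adjusted, 0 if there was no buffer at all. */
int model_backout_range(struct model_scan *p, unsigned long addr)
{
	struct model_region *cur;

	assert(p);
	if (!p->vec_buf)
		return 0;	/* nothing was allocated, nothing to back out */

	cur = &p->vec_buf[p->vec_buf_index];
	if (cur->start != addr)
		cur->end = addr;
	else
		cur->start = cur->end = 0;	/* assumed else-branch body */
	return 1;
}
```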
Note:
From the PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no
output is requested and the caller is only interested in the side effects,
hence it passes the check in pagemap_scan_get_args().
This issue was found by syzkaller.
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Michał Mirosław" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 29cca0e6d0ff..b26ae556b446 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!p->vec_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
--
2.47.3
devm_kcalloc() may fail. ndtest_probe() allocates three DMA address
arrays (dcr_dma, label_dma, dimm_dma) and later unconditionally uses
them in ndtest_nvdimm_init(), which can lead to a NULL pointer
dereference under low-memory conditions.
Check all three allocations and return -ENOMEM if any allocation fails,
jumping to the common error path. Do not emit an extra error message
since the allocator already warns on allocation failure.
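The check-and-goto pattern can be sketched in userspace as follows. This
is only an illustration: calloc()/free() stand in for devm_kcalloc(), and
model_calloc(), model_fail_at and struct model_priv are hypothetical
helpers added so that an allocation failure can be simulated. In the
kernel the devm_ allocations are released by the device-managed
framework, so the explicit free() calls exist only in this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the three DMA address arrays. */
struct model_priv {
	unsigned long *dcr_dma;
	unsigned long *label_dma;
	unsigned long *dimm_dma;
};

/*
 * Test knobs: the allocation whose 1-based call number equals
 * model_fail_at returns NULL; 0 means "never fail".
 */
int model_alloc_calls;
int model_fail_at;

void *model_calloc(size_t n, size_t size)
{
	if (++model_alloc_calls == model_fail_at)
		return NULL;
	return calloc(n, size);
}

int model_probe(struct model_priv *p, size_t n)
{
	int rc;

	assert(p);
	p->dcr_dma = p->label_dma = p->dimm_dma = NULL;

	p->dcr_dma = model_calloc(n, sizeof(*p->dcr_dma));
	if (!p->dcr_dma) {
		rc = -ENOMEM;
		goto err;
	}
	p->label_dma = model_calloc(n, sizeof(*p->label_dma));
	if (!p->label_dma) {
		rc = -ENOMEM;
		goto err;
	}
	p->dimm_dma = model_calloc(n, sizeof(*p->dimm_dma));
	if (!p->dimm_dma) {
		rc = -ENOMEM;
		goto err;
	}
	return 0;

err:
	/* free(NULL) is a no-op, so one label covers every failure point. */
	free(p->dimm_dma);
	free(p->label_dma);
	free(p->dcr_dma);
	p->dcr_dma = p->label_dma = p->dimm_dma = NULL;
	return rc;
}
```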
Fixes: 9399ab61ad82 ("ndtest: Add dimms to the two buses")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
changelog:
v3:
- Add NULL checks for all three devm_kcalloc() calls and goto the common
error label on failure.
v2:
- Drop pr_err() on allocation failure; only NULL-check and return -ENOMEM.
- No other changes.
---
tools/testing/nvdimm/test/ndtest.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/nvdimm/test/ndtest.c b/tools/testing/nvdimm/test/ndtest.c
index 68a064ce598c..8e3b6be53839 100644
--- a/tools/testing/nvdimm/test/ndtest.c
+++ b/tools/testing/nvdimm/test/ndtest.c
@@ -850,11 +850,22 @@ static int ndtest_probe(struct platform_device *pdev)
p->dcr_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
sizeof(dma_addr_t), GFP_KERNEL);
+ if (!p->dcr_dma) {
+ rc = -ENOMEM;
+ goto err;
+ }
p->label_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
sizeof(dma_addr_t), GFP_KERNEL);
+ if (!p->label_dma) {
+ rc = -ENOMEM;
+ goto err;
+ }
p->dimm_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
sizeof(dma_addr_t), GFP_KERNEL);
-
+ if (!p->dimm_dma) {
+ rc = -ENOMEM;
+ goto err;
+ }
rc = ndtest_nvdimm_init(p);
if (rc)
goto err;
--
2.43.0
I have a couple more fixes I'm testing but the issues have
been with us for a long time, and they come from
code review not from the field IIUC so no rush I think.
The following changes since commit 76eeb9b8de9880ca38696b2fb56ac45ac0a25c6c:
Linux 6.17-rc5 (2025-09-07 14:22:57 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to cde7e7c3f8745a61458cea61aa28f37c3f5ae2b4:
MAINTAINERS, mailmap: Update address for Peter Hilber (2025-09-21 17:44:20 -0400)
----------------------------------------------------------------
virtio,vhost: last minute fixes
More small fixes. Most notably this fixes crashes and hangs in
vhost-net.
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
----------------------------------------------------------------
Alok Tiwari (1):
vhost-scsi: fix argument order in tport allocation error message
Alyssa Ross (1):
virtio_config: clarify output parameters
Ashwini Sahu (1):
uapi: vduse: fix typo in comment
Jason Wang (2):
vhost-net: unbreak busy polling
vhost-net: flush batched before enabling notifications
Michael S. Tsirkin (1):
Revert "vhost/net: Defer TX queue re-enable until after sendmsg"
Peter Hilber (1):
MAINTAINERS, mailmap: Update address for Peter Hilber
Sebastian Andrzej Siewior (1):
vhost: Take a reference on the task in struct vhost_task.
.mailmap | 1 +
MAINTAINERS | 2 +-
drivers/vhost/net.c | 40 +++++++++++++++++-----------------------
drivers/vhost/scsi.c | 2 +-
include/linux/virtio_config.h | 11 ++++++-----
include/uapi/linux/vduse.h | 2 +-
kernel/vhost_task.c | 3 ++-
7 files changed, 29 insertions(+), 32 deletions(-)
Hi Sasha,
As the commit message notes, this patch should not be applied to kernel
v6.4 or earlier. Please exclude it from your queue for the
following versions:
* Patch "firewire: core: fix overlooked update of subsystem ABI version" has been added to the 5.4-stable tree
* Patch "firewire: core: fix overlooked update of subsystem ABI version" has been added to the 5.10-stable tree
* Patch "firewire: core: fix overlooked update of subsystem ABI version" has been added to the 5.15-stable tree
* Patch "firewire: core: fix overlooked update of subsystem ABI version" has been added to the 6.1-stable tree
Thanks
Takashi Sakamoto
On Thu, Sep 25, 2025 at 07:33:23AM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> firewire: core: fix overlooked update of subsystem ABI version
>
> to the 6.1-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> firewire-core-fix-overlooked-update-of-subsystem-abi.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit bbb4ab1d7b1ad7fff7a85aa9144d0edc5a70bacc
> Author: Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
> Date: Sat Sep 20 11:51:48 2025 +0900
>
> firewire: core: fix overlooked update of subsystem ABI version
>
> [ Upstream commit 853a57ba263adfecf4430b936d6862bc475b4bb5 ]
>
> In kernel v6.5, several functions were added to the cdev layer. This
> required updating the default version of subsystem ABI up to 6, but
> this requirement was overlooked.
>
> This commit updates the version accordingly.
>
> Fixes: 6add87e9764d ("firewire: cdev: add new version of ABI to notify time stamp at request/response subaction of transaction#")
> Link: https://lore.kernel.org/r/20250920025148.163402-1-o-takashi@sakamocchi.jp
> Signed-off-by: Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
> index 958aa4662ccb0..5cb0059f57e6b 100644
> --- a/drivers/firewire/core-cdev.c
> +++ b/drivers/firewire/core-cdev.c
> @@ -39,7 +39,7 @@
> /*
> * ABI version history is documented in linux/firewire-cdev.h.
> */
> -#define FW_CDEV_KERNEL_VERSION 5
> +#define FW_CDEV_KERNEL_VERSION 6
> #define FW_CDEV_VERSION_EVENT_REQUEST2 4
> #define FW_CDEV_VERSION_ALLOCATE_REGION_END 4
> #define FW_CDEV_VERSION_AUTO_FLUSH_ISO_OVERFLOW 5
The SMBus block read path trusts the device-provided count byte and
copies that many bytes from the master buffer:
buf[0] = readb(p3);
read_count = buf[0];
memcpy_fromio(&buf[1], p3 + 1, read_count);
Without validating 'read_count', a malicious or misbehaving device can
cause an out-of-bounds write to the caller's buffer and may also trigger
out-of-range MMIO reads beyond the controller's buffer window.
SMBus Block Read returns up to 32 data bytes as per the kernel
documentation, so clamp the length to [1, I2C_SMBUS_BLOCK_MAX], verify
the caller's buffer has at least 'read_count + 1' bytes available, and
defensively ensure it does not exceed the controller buffer. Also break
out of the chunking loop after a successful SMBus read.
Return -EPROTO for invalid counts and -EMSGSIZE when the provided buffer
is too small.
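The first two checks can be illustrated with a small userspace model. The
function and macro names are hypothetical, a plain byte array stands in
for the controller's MMIO buffer, and the third check against the
controller buffer size is not modelled. The limit of 32 data bytes is the
SMBus block maximum mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SMBUS_BLOCK_MAX 32	/* data bytes in one SMBus block read */

/*
 * dev points at the device response: one count byte followed by data.
 * The model assumes dev holds at least MODEL_SMBUS_BLOCK_MAX + 1 bytes.
 * Returns the number of data bytes copied, or a negative errno.
 */
int model_block_read(const uint8_t *dev, uint8_t *buf, size_t buf_len)
{
	uint8_t cnt;

	assert(dev && buf);
	cnt = dev[0];

	if (cnt == 0 || cnt > MODEL_SMBUS_BLOCK_MAX)
		return -EPROTO;		/* count is not a valid block length */
	if (buf_len < (size_t)cnt + 1)
		return -EMSGSIZE;	/* no room for count byte plus data */

	buf[0] = cnt;
	memcpy(&buf[1], &dev[1], cnt);
	return cnt;
}
```

The size comparison is written as buf_len < cnt + 1 in size_t so that a
zero-length buffer cannot wrap around in this model.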
Fixes: 361693697249 ("i2c: microchip: pci1xxxx: Add driver for I2C host controller in multifunction endpoint of pci1xxxx switch")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
drivers/i2c/busses/i2c-mchp-pci1xxxx.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-mchp-pci1xxxx.c b/drivers/i2c/busses/i2c-mchp-pci1xxxx.c
index 5ef136c3ecb1..2307c8ec2dc7 100644
--- a/drivers/i2c/busses/i2c-mchp-pci1xxxx.c
+++ b/drivers/i2c/busses/i2c-mchp-pci1xxxx.c
@@ -880,7 +880,22 @@ static int pci1xxxx_i2c_read(struct pci1xxxx_i2c *i2c, u8 slaveaddr,
}
if (i2c->flags & I2C_FLAGS_SMB_BLK_READ) {
- buf[0] = readb(p3);
+ u8 cnt = readb(p3);
+
+ if (!cnt || cnt > I2C_SMBUS_BLOCK_MAX) {
+ retval = -EPROTO;
+ goto cleanup;
+ }
+ if (cnt > total_len - 1) {
+ retval = -EMSGSIZE;
+ goto cleanup;
+ }
+ if (cnt > (SMBUS_BUF_MAX_SIZE - 1)) {
+ retval = -EOVERFLOW;
+ goto cleanup;
+ }
+
+ buf[0] = cnt;
read_count = buf[0];
memcpy_fromio(&buf[1], p3 + 1, read_count);
} else {
--
2.43.0
The pmsr_lock spinlock used to be necessary to synchronize access to the
PMSR register, because that access could have been triggered from either
config space access in rcar_pcie_config_access() or an exception handler
rcar_pcie_aarch32_abort_handler().
The rcar_pcie_aarch32_abort_handler() case is no longer applicable since
commit 6e36203bc14c ("PCI: rcar: Use PCI_SET_ERROR_RESPONSE after read
which triggered an exception"), which performs more accurate, controlled
invocation of the exception, and a fixup.
This leaves rcar_pcie_config_access() as the only call site from which
rcar_pcie_wakeup() is called. rcar_pcie_config_access() can only be
called from the controller struct pci_ops .read and .write callbacks,
and those are serialized in drivers/pci/access.c using the raw spinlock
'pci_lock'. CONFIG_PCI_LOCKLESS_CONFIG is never set on this platform.
Since 'pci_lock' is a raw spinlock and 'pmsr_lock' is not a raw
spinlock, this combination triggers 'BUG: Invalid wait context'
with CONFIG_PROVE_RAW_LOCK_NESTING=y.
Remove the pmsr_lock to fix the locking.
Fixes: a115b1bd3af0 ("PCI: rcar: Add L1 link state fix into data abort hook")
Reported-by: Duy Nguyen <duy.nguyen.rh(a)renesas.com>
Reported-by: Thuan Nguyen <thuan.nguyen-hong(a)banvien.com.vn>
Cc: stable(a)vger.kernel.org
Signed-off-by: Marek Vasut <marek.vasut+renesas(a)mailbox.org>
---
=============================
[ BUG: Invalid wait context ]
6.17.0-rc4-next-20250905-00048-ga08e553145e7-dirty #1116 Not tainted
-----------------------------
swapper/0/1 is trying to lock:
ffffffd92cf69c30 (pmsr_lock){....}-{3:3}, at: rcar_pcie_config_access+0x48/0x260
other info that might help us debug this:
context-{5:5}
3 locks held by swapper/0/1:
#0: ffffff84c0f890f8 (&dev->mutex){....}-{4:4}, at: device_lock+0x14/0x1c
#1: ffffffd92cf675b0 (pci_rescan_remove_lock){+.+.}-{4:4}, at: pci_lock_rescan_remove+0x18/0x20
#2: ffffffd92cf674a0 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x54/0xd8
stack backtrace:
CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc4-next-20250905-00048-ga08e553145e7-dirty #1116 PREEMPT
Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
Call trace:
dump_backtrace+0x6c/0x7c (C)
show_stack+0x14/0x1c
dump_stack_lvl+0x68/0x8c
dump_stack+0x14/0x1c
__lock_acquire+0x3e8/0x1064
lock_acquire+0x17c/0x2ac
_raw_spin_lock_irqsave+0x54/0x70
rcar_pcie_config_access+0x48/0x260
rcar_pcie_read_conf+0x44/0xd8
pci_bus_read_config_dword+0x78/0xd8
pci_bus_generic_read_dev_vendor_id+0x30/0x138
pci_bus_read_dev_vendor_id+0x60/0x68
pci_scan_single_device+0x11c/0x1ec
pci_scan_slot+0x7c/0x170
pci_scan_child_bus_extend+0x5c/0x29c
pci_scan_child_bus+0x10/0x18
pci_scan_root_bus_bridge+0x90/0xc8
pci_host_probe+0x24/0xc4
rcar_pcie_probe+0x5e8/0x650
platform_probe+0x58/0x88
really_probe+0x190/0x350
__driver_probe_device+0x120/0x138
driver_probe_device+0x38/0xec
__driver_attach+0x158/0x168
bus_for_each_dev+0x7c/0xd0
driver_attach+0x20/0x28
bus_add_driver+0xe0/0x1d8
driver_register+0xac/0xe8
__platform_driver_register+0x1c/0x24
rcar_pcie_driver_init+0x18/0x20
do_one_initcall+0xd4/0x220
kernel_init_freeable+0x308/0x30c
kernel_init+0x20/0x11c
ret_from_fork+0x10/0x20
---
Cc: "Krzysztof Wilczyński" <kwilczynski(a)kernel.org>
Cc: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: Geert Uytterhoeven <geert+renesas(a)glider.be>
Cc: Lorenzo Pieralisi <lpieralisi(a)kernel.org>
Cc: Magnus Damm <magnus.damm(a)gmail.com>
Cc: Manivannan Sadhasivam <mani(a)kernel.org>
Cc: Marc Zyngier <maz(a)kernel.org>
Cc: Rob Herring <robh(a)kernel.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Cc: linux-pci(a)vger.kernel.org
Cc: linux-renesas-soc(a)vger.kernel.org
---
drivers/pci/controller/pcie-rcar-host.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/drivers/pci/controller/pcie-rcar-host.c b/drivers/pci/controller/pcie-rcar-host.c
index 4780e0109e583..625a00f3b2230 100644
--- a/drivers/pci/controller/pcie-rcar-host.c
+++ b/drivers/pci/controller/pcie-rcar-host.c
@@ -52,20 +52,13 @@ struct rcar_pcie_host {
int (*phy_init_fn)(struct rcar_pcie_host *host);
};
-static DEFINE_SPINLOCK(pmsr_lock);
-
static int rcar_pcie_wakeup(struct device *pcie_dev, void __iomem *pcie_base)
{
- unsigned long flags;
u32 pmsr, val;
int ret = 0;
- spin_lock_irqsave(&pmsr_lock, flags);
-
- if (!pcie_base || pm_runtime_suspended(pcie_dev)) {
- ret = -EINVAL;
- goto unlock_exit;
- }
+ if (!pcie_base || pm_runtime_suspended(pcie_dev))
+ return -EINVAL;
pmsr = readl(pcie_base + PMSR);
@@ -87,8 +80,6 @@ static int rcar_pcie_wakeup(struct device *pcie_dev, void __iomem *pcie_base)
writel(L1FAEG | PMEL1RX, pcie_base + PMSR);
}
-unlock_exit:
- spin_unlock_irqrestore(&pmsr_lock, flags);
return ret;
}
--
2.51.0
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during probe_device().
Fixes: b17336c55d89 ("iommu/mediatek: add support for mtk iommu generation one HW")
Cc: stable(a)vger.kernel.org # 4.8
Cc: Honghui Zhang <honghui.zhang(a)mediatek.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/mtk_iommu_v1.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 10cc0b1197e8..de9153c0a82f 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -435,6 +435,8 @@ static int mtk_iommu_v1_create_mapping(struct device *dev,
return -EINVAL;
dev_iommu_priv_set(dev, platform_get_drvdata(m4updev));
+
+ put_device(&m4updev->dev);
}
ret = iommu_fwspec_add_ids(dev, args->args, 1);
--
2.49.1
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Fixes: 7b2d59611fef ("iommu/ipmmu-vmsa: Replace local utlb code with fwspec ids")
Cc: stable(a)vger.kernel.org # 4.14
Cc: Magnus Damm <damm+renesas(a)opensource.se>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/ipmmu-vmsa.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index ffa892f65714..02a2a55ffa0a 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -720,6 +720,8 @@ static int ipmmu_init_platform_device(struct device *dev,
dev_iommu_priv_set(dev, platform_get_drvdata(ipmmu_pdev));
+ put_device(&ipmmu_pdev->dev);
+
return 0;
}
--
2.49.1
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Fixes: aa759fd376fb ("iommu/exynos: Add callback for initializing devices from device tree")
Cc: stable(a)vger.kernel.org # 4.2
Cc: Marek Szyprowski <m.szyprowski(a)samsung.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/exynos-iommu.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index b6edd178fe25..ce9e935cb84c 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1446,17 +1446,14 @@ static int exynos_iommu_of_xlate(struct device *dev,
return -ENODEV;
data = platform_get_drvdata(sysmmu);
- if (!data) {
- put_device(&sysmmu->dev);
+ put_device(&sysmmu->dev);
+ if (!data)
return -ENODEV;
- }
if (!owner) {
owner = kzalloc(sizeof(*owner), GFP_KERNEL);
- if (!owner) {
- put_device(&sysmmu->dev);
+ if (!owner)
return -ENOMEM;
- }
INIT_LIST_HEAD(&owner->controllers);
mutex_init(&owner->rpm_lock);
--
2.49.1
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Note that commit e2eae09939a8 ("iommu/qcom: add missing put_device()
call in qcom_iommu_of_xlate()") fixed the leak in a couple of error
paths, but the reference is still leaking on success and late failures.
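To illustrate why dropping the reference right after the lookup balances
every return path, here is a small userspace model. It is a sketch under
assumptions: struct model_device, model_find_device() and
model_put_device() are hypothetical stand-ins for the driver-core lookup
and put_device(), and the model does not address how the driver data
itself is kept alive after the put.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_device {
	int refcount;
	void *drvdata;
};

/* Lookup helper: returns the device with an extra reference taken. */
struct model_device *model_find_device(struct model_device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

void model_put_device(struct model_device *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

int model_of_xlate(struct model_device *iommu, int asid, int max_asid)
{
	struct model_device *pdev = model_find_device(iommu);
	void *data;

	if (!pdev)
		return -ENODEV;

	data = pdev->drvdata;
	model_put_device(pdev);	/* balanced before any further return */

	if (!data)
		return -ENODEV;
	if (asid > max_asid)
		return -EINVAL;
	return 0;
}
```

With the put placed here, the success path and both failure paths leave
the count where it started, so no per-path put_device() is needed.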
Fixes: 0ae349a0f33f ("iommu/qcom: Add qcom_iommu")
Cc: stable(a)vger.kernel.org # 4.14: e2eae09939a8
Cc: Rob Clark <robin.clark(a)oss.qualcomm.com>
Cc: Yu Kuai <yukuai3(a)huawei.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/arm/arm-smmu/qcom_iommu.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index c5be95e56031..9c1166a3af6c 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -565,14 +565,14 @@ static int qcom_iommu_of_xlate(struct device *dev,
qcom_iommu = platform_get_drvdata(iommu_pdev);
+ put_device(&iommu_pdev->dev);
+
/* make sure the asid specified in dt is valid, so we don't have
* to sanity check this elsewhere:
*/
if (WARN_ON(asid > qcom_iommu->max_asid) ||
- WARN_ON(qcom_iommu->ctxs[asid] == NULL)) {
- put_device(&iommu_pdev->dev);
+ WARN_ON(qcom_iommu->ctxs[asid] == NULL))
return -EINVAL;
- }
if (!dev_iommu_priv_get(dev)) {
dev_iommu_priv_set(dev, qcom_iommu);
@@ -581,10 +581,8 @@ static int qcom_iommu_of_xlate(struct device *dev,
* multiple different iommu devices. Multiple context
* banks are ok, but multiple devices are not:
*/
- if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev))) {
- put_device(&iommu_pdev->dev);
+ if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))
return -EINVAL;
- }
}
return iommu_fwspec_add_ids(dev, &asid, 1);
--
2.49.1
Make sure to drop the reference taken to the iommu platform device when
looking up its driver data during of_xlate().
Fixes: 46d1fb072e76 ("iommu/dart: Add DART iommu driver")
Cc: stable(a)vger.kernel.org # 5.15
Cc: Sven Peter <sven(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/iommu/apple-dart.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 190f28d76615..1aa7c10262a8 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -790,6 +790,8 @@ static int apple_dart_of_xlate(struct device *dev,
struct apple_dart *cfg_dart;
int i, sid;
+ put_device(&iommu_pdev->dev);
+
if (args->args_count != 1)
return -EINVAL;
sid = args->args[0];
--
2.49.1
From: Shawn Guo <shawnguo(a)kernel.org>
A regression is seen with the 6.6 -> 6.12 kernel upgrade on platforms where
the cpufreq-dt driver sets cpuinfo.transition_latency to CPUFREQ_ETERNAL (-1)
because the platform's DT doesn't provide the optional property
'clock-latency-ns'. The dbs sampling_rate was 10000 us on 6.6 and
suddenly becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these
platforms, because the default transition delay was dropped by the commits
below.
commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us")
commit e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")
This dramatically slows down the dbs governor's reaction to CPU load
changes. Also, as transition_delay_us is used by the schedutil governor
as rate_limit_us, it has a negative impact on device idle power
consumption, because the device spends slightly less time in the lowest OPP.
Fix the regressions by defining a default transition latency for
handling the case of CPUFREQ_ETERNAL.
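The arithmetic can be checked with a small userspace model. The MODEL_*
names are hypothetical, CPUFREQ_ETERNAL is taken to be (unsigned int)-1
as described above, and the "plus 50%" step is written as
latency + (latency >> 1), which is an assumption about how the
"4294967295 / 1000 * 1.5" figure is computed in integer math. The
zero-latency fallback is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NSEC_PER_USEC		1000u
#define MODEL_ETERNAL			UINT32_MAX	/* (unsigned int)-1 */
#define MODEL_DEFAULT_LATENCY_NS	1000000u	/* 1 ms */

static_assert(MODEL_DEFAULT_LATENCY_NS / MODEL_NSEC_PER_USEC == 1000u,
	      "the default latency is 1000 us");

/* Transition latency in ns -> delay in us with 50% headroom. */
uint32_t model_delay_us(uint32_t latency_ns)
{
	uint32_t latency_us = latency_ns / MODEL_NSEC_PER_USEC;

	return latency_us + (latency_us >> 1);
}

/* Same computation with the "eternal" value replaced by the default. */
uint32_t model_delay_us_patched(uint32_t latency_ns)
{
	if (latency_ns == MODEL_ETERNAL)
		latency_ns = MODEL_DEFAULT_LATENCY_NS;
	return model_delay_us(latency_ns);
}
```

In this model, model_delay_us(MODEL_ETERNAL) yields 6442450, matching the
figure quoted above, while model_delay_us_patched(MODEL_ETERNAL) yields
1500.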
Cc: stable(a)vger.kernel.org
Fixes: 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
---
Changes for v2:
- Follow Rafael's suggestion to define a default transition latency for
handling CPUFREQ_ETERNAL, and pave the way to get rid of
CPUFREQ_ETERNAL completely later.
v1: https://lkml.org/lkml/2025/9/10/294
drivers/cpufreq/cpufreq.c | 3 +++
include/linux/cpufreq.h | 2 ++
2 files changed, 5 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index fc7eace8b65b..c69d10f0e8ec 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -549,6 +549,9 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
if (policy->transition_delay_us)
return policy->transition_delay_us;
+ if (policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)
+ policy->cpuinfo.transition_latency = CPUFREQ_DEFAULT_TANSITION_LATENCY_NS;
+
latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
if (latency)
/* Give a 50% breathing room between updates */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 95f3807c8c55..935e9a660039 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -36,6 +36,8 @@
/* Print length for names. Extra 1 space for accommodating '\n' in prints */
#define CPUFREQ_NAME_PLEN (CPUFREQ_NAME_LEN + 1)
+#define CPUFREQ_DEFAULT_TANSITION_LATENCY_NS NSEC_PER_MSEC
+
struct cpufreq_governor;
enum cpufreq_table_sorting {
--
2.43.0
Commit 20d72b00ca81 ("netfs: Fix the request's work item to not
require a ref") modified netfs_alloc_request() to initialize the
reference counter to 2 instead of 1. The rationale was that the
request's "work" would release the second reference after completion
(via netfs_{read,write}_collection_worker()). That works most of the
time if all goes well.
However, it leaks this additional reference if the request is released
before the I/O operation has been submitted: the error code path only
decrements the reference counter once and the work item will never be
queued because there will never be a completion.
This has caused outages of our whole server cluster today because
tasks were blocked in netfs_wait_for_outstanding_io(), leading to
deadlocks in Ceph (another bug that I will address soon in another
patch). This was caused by a netfs_pgpriv2_begin_copy_to_cache() call
which failed in fscache_begin_write_operation(). The leaked
netfs_io_request was never completed, leaving `netfs_inode.io_count`
with a positive value forever.
All of this is super-fragile code. It is hard to see which code paths
will lead to an eventual completion and which will not:
- Some functions like netfs_create_write_req() allocate a request, but
will never submit any I/O.
- netfs_unbuffered_read_iter_locked() calls netfs_unbuffered_read()
and then netfs_put_request(); however, netfs_unbuffered_read() can
also fail early before submitting the I/O request, therefore another
netfs_put_request() call must be added there.
A rule of thumb is that functions that return a `netfs_io_request` do
not submit I/O, and all of their callers must be checked.
For my taste, the whole netfs code needs an overhaul to make reference
counting easier to understand and less fragile & obscure. But to fix
this bug here and now and produce a patch that is adequate for a
stable backport, I tried a minimal approach that quickly frees the
request object upon early failure.
I decided against adding a second netfs_put_request() each time
because that would cause code duplication which obscures the code
further. Instead, I added the function netfs_put_failed_request()
which frees such a failed request synchronously under the assumption
that the reference count is exactly 2 (as initially set by
netfs_alloc_request() and never touched), verified by a
WARN_ON_ONCE(). It then deinitializes the request object (without
going through the "cleanup_work" indirection) and frees the allocation
(with RCU protection to protect against concurrent access by
netfs_requests_seq_start()).
All code paths that fail early have been changed to call
netfs_put_failed_request() instead of netfs_put_request().
Additionally, I have added a netfs_put_request() call to
netfs_unbuffered_read() as explained above because the
netfs_put_failed_request() approach does not work there.
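The leak and the fix can be illustrated with a userspace model of the
reference counting. This is a sketch under assumptions, not the netfs
code: the model_* names are hypothetical, a plain int replaces the atomic
counters, assert() stands in for WARN_ON_ONCE(), and incrementing
io_count at allocation time is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_request {
	int ref;
	int *io_count;	/* outstanding requests on the owning inode */
};

/* New requests start with two references: caller and work item. */
struct model_request *model_alloc_request(int *io_count)
{
	struct model_request *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	r->ref = 2;
	r->io_count = io_count;
	(*io_count)++;
	return r;
}

/* Ordinary put; returns true if this call freed the request. */
bool model_put_request(struct model_request *r)
{
	if (--r->ref > 0)
		return false;
	(*r->io_count)--;
	free(r);
	return true;
}

/*
 * Early-failure put: the request was never submitted, so no work item
 * will ever drop the second reference. Free it in one step.
 */
void model_put_failed_request(struct model_request *r)
{
	assert(r->ref == 2);
	(*r->io_count)--;
	free(r);
}
```

A single model_put_request() after an early failure leaves ref at 1 and
io_count elevated, which is the model's version of the stuck waiter
described above; model_put_failed_request() releases both at once.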
Fixes: 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref")
Cc: stable(a)vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
---
v1->v2: free the request with call_rcu() because a proc reader might
be accessing it (suggested by David Howells)
---
fs/netfs/buffered_read.c | 10 +++++-----
fs/netfs/direct_read.c | 7 ++++++-
fs/netfs/direct_write.c | 6 +++++-
fs/netfs/internal.h | 1 +
fs/netfs/objects.c | 28 +++++++++++++++++++++++++---
fs/netfs/read_pgpriv2.c | 2 +-
fs/netfs/read_single.c | 2 +-
fs/netfs/write_issue.c | 3 +--
8 files changed, 45 insertions(+), 14 deletions(-)
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 18b3dc74c70e..37ab6f28b5ad 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -369,7 +369,7 @@ void netfs_readahead(struct readahead_control *ractl)
return netfs_put_request(rreq, netfs_rreq_trace_put_return);
cleanup_free:
- return netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ return netfs_put_failed_request(rreq);
}
EXPORT_SYMBOL(netfs_readahead);
@@ -472,7 +472,7 @@ static int netfs_read_gaps(struct file *file, struct folio *folio)
return ret < 0 ? ret : 0;
discard:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
alloc_error:
folio_unlock(folio);
return ret;
@@ -532,7 +532,7 @@ int netfs_read_folio(struct file *file, struct folio *folio)
return ret < 0 ? ret : 0;
discard:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
alloc_error:
folio_unlock(folio);
return ret;
@@ -699,7 +699,7 @@ int netfs_write_begin(struct netfs_inode *ctx,
return 0;
error_put:
- netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(rreq);
error:
if (folio) {
folio_unlock(folio);
@@ -754,7 +754,7 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
return ret < 0 ? ret : 0;
error_put:
- netfs_put_request(rreq, netfs_rreq_trace_put_discard);
+ netfs_put_failed_request(rreq);
error:
_leave(" = %d", ret);
return ret;
diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c
index a05e13472baf..a498ee8d6674 100644
--- a/fs/netfs/direct_read.c
+++ b/fs/netfs/direct_read.c
@@ -131,6 +131,7 @@ static ssize_t netfs_unbuffered_read(struct netfs_io_request *rreq, bool sync)
if (rreq->len == 0) {
pr_err("Zero-sized read [R=%x]\n", rreq->debug_id);
+ netfs_put_request(rreq, netfs_rreq_trace_put_discard);
return -EIO;
}
@@ -205,7 +206,7 @@ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *i
if (user_backed_iter(iter)) {
ret = netfs_extract_user_iter(iter, rreq->len, &rreq->buffer.iter, 0);
if (ret < 0)
- goto out;
+ goto error_put;
rreq->direct_bv = (struct bio_vec *)rreq->buffer.iter.bvec;
rreq->direct_bv_count = ret;
rreq->direct_bv_unpin = iov_iter_extract_will_pin(iter);
@@ -238,6 +239,10 @@ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *i
if (ret > 0)
orig_count -= ret;
return ret;
+
+error_put:
+ netfs_put_failed_request(rreq);
+ return ret;
}
EXPORT_SYMBOL(netfs_unbuffered_read_iter_locked);
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index a16660ab7f83..a9d1c3b2c084 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -57,7 +57,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0);
if (n < 0) {
ret = n;
- goto out;
+ goto error_put;
}
wreq->direct_bv = (struct bio_vec *)wreq->buffer.iter.bvec;
wreq->direct_bv_count = n;
@@ -101,6 +101,10 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
out:
netfs_put_request(wreq, netfs_rreq_trace_put_return);
return ret;
+
+error_put:
+ netfs_put_failed_request(wreq);
+ return ret;
}
EXPORT_SYMBOL(netfs_unbuffered_write_iter_locked);
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index d4f16fefd965..4319611f5354 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -87,6 +87,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
void netfs_get_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace what);
void netfs_clear_subrequests(struct netfs_io_request *rreq);
void netfs_put_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace what);
+void netfs_put_failed_request(struct netfs_io_request *rreq);
struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
static inline void netfs_see_request(struct netfs_io_request *rreq,
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index e8c99738b5bb..39d5e13f7248 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -116,10 +116,8 @@ static void netfs_free_request_rcu(struct rcu_head *rcu)
netfs_stat_d(&netfs_n_rh_rreq);
}
-static void netfs_free_request(struct work_struct *work)
+static void netfs_deinit_request(struct netfs_io_request *rreq)
{
- struct netfs_io_request *rreq =
- container_of(work, struct netfs_io_request, cleanup_work);
struct netfs_inode *ictx = netfs_inode(rreq->inode);
unsigned int i;
@@ -149,6 +147,14 @@ static void netfs_free_request(struct work_struct *work)
if (atomic_dec_and_test(&ictx->io_count))
wake_up_var(&ictx->io_count);
+}
+
+static void netfs_free_request(struct work_struct *work)
+{
+ struct netfs_io_request *rreq =
+ container_of(work, struct netfs_io_request, cleanup_work);
+
+ netfs_deinit_request(rreq);
call_rcu(&rreq->rcu, netfs_free_request_rcu);
}
@@ -167,6 +173,22 @@ void netfs_put_request(struct netfs_io_request *rreq, enum netfs_rreq_ref_trace
}
}
+/*
+ * Free a request (synchronously) that was just allocated but has
+ * failed before it could be submitted.
+ */
+void netfs_put_failed_request(struct netfs_io_request *rreq)
+{
+ /* new requests have two references (see
+ * netfs_alloc_request()), and this function is only allowed on
+ * new request objects
+ */
+ WARN_ON_ONCE(refcount_read(&rreq->ref) != 2);
+
+ trace_netfs_rreq_ref(rreq->debug_id, 0, netfs_rreq_trace_put_failed);
+ netfs_free_request(&rreq->cleanup_work);
+}
+
/*
* Allocate and partially initialise an I/O request structure.
*/
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index 8097bc069c1d..a1489aa29f78 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -118,7 +118,7 @@ static struct netfs_io_request *netfs_pgpriv2_begin_copy_to_cache(
return creq;
cancel_put:
- netfs_put_request(creq, netfs_rreq_trace_put_return);
+ netfs_put_failed_request(creq);
cancel:
rreq->copy_to_cache = ERR_PTR(-ENOBUFS);
clear_bit(NETFS_RREQ_FOLIO_COPY_TO_CACHE, &rreq->flags);
diff --git a/fs/netfs/read_single.c b/fs/netfs/read_single.c
index fa622a6cd56d..5c0dc4efc792 100644
--- a/fs/netfs/read_single.c
+++ b/fs/netfs/read_single.c
@@ -189,7 +189,7 @@ ssize_t netfs_read_single(struct inode *inode, struct file *file, struct iov_ite
return ret;
cleanup_free:
- netfs_put_request(rreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(rreq);
return ret;
}
EXPORT_SYMBOL(netfs_read_single);
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 0584cba1a043..dd8743bc8d7f 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -133,8 +133,7 @@ struct netfs_io_request *netfs_create_write_req(struct address_space *mapping,
return wreq;
nomem:
- wreq->error = -ENOMEM;
- netfs_put_request(wreq, netfs_rreq_trace_put_failed);
+ netfs_put_failed_request(wreq);
return ERR_PTR(-ENOMEM);
}
--
2.47.3
From: Johannes Berg <johannes.berg(a)intel.com>
commit 586e3cb33ba6890054b95aa0ade0a165890efabd upstream.
For devices handled by iwldvm, bc_table_dword was never set, but I missed
that during the removal thereof. Change the logic to not treat the byte
count table as dwords for devices older than 9000 series to fix that.
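The range check described above can be sketched in isolation. This is a minimal illustration, assuming a hypothetical, ordered `enum dev_family` and a hypothetical helper `bc_table_len()`; neither is the driver's real type or function, and `DIV_ROUND_UP` is redefined locally so the sketch builds outside the kernel.

```c
/* Hypothetical, ordered device generations (not the driver's real enum). */
enum dev_family {
	FAMILY_OLD = 1,		/* older than the 9000 series */
	FAMILY_9000,
	FAMILY_22000,
	FAMILY_AX210,
};

/* Local copy of the usual round-up division helper. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Convert a byte length to the unit the byte-count table expects. */
static unsigned int bc_table_len(enum dev_family family, unsigned int len)
{
	/* Only generations in [9000, AX210) count in dwords. */
	if (family >= FAMILY_9000 && family < FAMILY_AX210)
		len = DIV_ROUND_UP(len, 4);
	return len;
}
```

With only the upper bound in place, `FAMILY_OLD` would also be rounded to dwords, which is the behaviour the patch removes.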
Fixes: 6570ea227826 ("wifi: iwlwifi: remove bc_table_dword transport config")
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach(a)intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
Link: https://patch.msgid.link/20250828095500.eccd7d3939f1.Ibaffa06d0b3aa5f35a945…
---
drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index eee55428749c..5ca9712dd7f0 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -2093,7 +2093,8 @@ static void iwl_txq_gen1_update_byte_cnt_tbl(struct iwl_trans *trans,
break;
}
- if (trans->mac_cfg->device_family < IWL_DEVICE_FAMILY_AX210)
+ if (trans->mac_cfg->device_family >= IWL_DEVICE_FAMILY_9000 &&
+ trans->mac_cfg->device_family < IWL_DEVICE_FAMILY_AX210)
len = DIV_ROUND_UP(len, 4);
if (WARN_ON(len > 0xFFF || write_ptr >= TFD_QUEUE_SIZE_MAX))
--
2.51.0
afs_put_server() accessed server->debug_id before the NULL check, which
could lead to a null pointer dereference. Move the debug_id assignment,
ensuring we never dereference a NULL server pointer.
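The ordering issue can be shown with a small userspace sketch. `struct obj` and `obj_put()` are hypothetical names used only for illustration; the point is that the field is read only after the NULL check has passed.

```c
#include <stddef.h>

/* Hypothetical object with a debug id and a reference count. */
struct obj {
	unsigned int debug_id;
	int ref;
};

/*
 * Drop one reference. Returns the debug id of the object that was put,
 * or 0 if obj was NULL.
 */
static unsigned int obj_put(struct obj *obj)
{
	unsigned int debug_id;	/* declared, but not yet read from obj */

	if (!obj)
		return 0;	/* nothing has been dereferenced so far */

	debug_id = obj->debug_id;	/* safe: obj is known to be non-NULL */
	obj->ref--;
	return debug_id;
}
```

Initialising `debug_id` in the declaration, as the old code did, would read through `obj` before the check runs.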
Fixes: 2757a4dc1849 ("afs: Fix access after dec in put functions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn>
---
fs/afs/server.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/afs/server.c b/fs/afs/server.c
index a97562f831eb..c4428ebddb1d 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -331,13 +331,14 @@ struct afs_server *afs_use_server(struct afs_server *server, bool activate,
void afs_put_server(struct afs_net *net, struct afs_server *server,
enum afs_server_trace reason)
{
- unsigned int a, debug_id = server->debug_id;
+ unsigned int a, debug_id;
bool zero;
int r;
if (!server)
return;
+ debug_id = server->debug_id;
a = atomic_read(&server->active);
zero = __refcount_dec_and_test(&server->ref, &r);
trace_afs_server(debug_id, r - 1, a, reason);
--
2.20.1
On 9/24/25 13:41, Joseph Salisbury wrote:
> Hi Greg/Sasha,
>
> I am reaching out to confirm the projected EOL for the Linux 5.4
> stable kernel.
>
> According to the information listed on kernel.org [0], the EOL is
> currently slated for December 2025. We are using this projection for
> planning, so we would be grateful if you could confirm it is still
> accurate.
>
> Thank you very much for your time and for all the work you do in
> maintaining the stable kernel releases!
>
> Thanks,
>
> Joe Salisbury
>
>
> [0] https://www.kernel.org/category/releases.html
Sorry, I forgot to CC stable for the wider audience. Doing that now.
Commit 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in
af_alg_sendmsg") changed some fields from bool to 1-bit bitfields of
type u32. However, some assignments to these fields, specifically
'more' and 'merge', assign values greater than 1. These relied on C's
implicit conversion to bool, such that zero becomes false and nonzero
becomes true. With a 1-bit bitfield of type u32, the value is taken
mod 2 instead, resulting in 0 being assigned in some cases when 1 was
intended. Fix this by restoring the bool type.
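The difference between the two field types can be reproduced in a few lines of standard C. This is a sketch only: `struct flags_uint`, `struct flags_bool` and the two helper functions are hypothetical names, and `unsigned int` stands in for the kernel's u32.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two layouts discussed above. */
struct flags_uint {
	unsigned int more:1;	/* 1-bit unsigned bitfield: keeps only bit 0 */
};

struct flags_bool {
	bool more:1;		/* 1-bit bool bitfield: nonzero converts to 1 */
};

/* Store an arbitrary value into the unsigned field and read it back. */
static unsigned int store_uint_bit(unsigned int v)
{
	struct flags_uint f = { 0 };

	f.more = v;		/* value is reduced modulo 2 */
	return f.more;
}

/* Store an arbitrary value into the bool field and read it back. */
static unsigned int store_bool_bit(unsigned int v)
{
	struct flags_bool f = { 0 };

	f.more = v;		/* value is first converted to bool */
	return f.more;
}
```

Any even nonzero input, for example the result of masking with a single flag bit above bit 0, is stored as 0 in the unsigned field and as 1 in the bool field.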
Fixes: 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
---
v2: keep the bitfields and just change the type, as suggested by Linus
include/crypto/if_alg.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 0c70f3a555750..107b797c33ecf 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -150,11 +150,11 @@ struct af_alg_ctx {
struct crypto_wait wait;
size_t used;
atomic_t rcvused;
- u32 more:1,
+ bool more:1,
merge:1,
enc:1,
write:1,
init:1;
base-commit: cec1e6e5d1ab33403b809f79cd20d6aff124ccfe
--
2.51.0
The patch below was submitted to be applied to the 6.16-stable tree.
I fail to see how this patch meets the stable kernel rules as found at
Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to
<stable(a)vger.kernel.org> and let me know why this patch should be
applied. Otherwise, it is now dropped from my patch queues, never to be
seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 091b29d53fe645781c5c1f405bc9fcd50ce5792b Mon Sep 17 00:00:00 2001
From: Tao Cui <cuitao(a)kylinos.cn>
Date: Thu, 18 Sep 2025 19:44:22 +0800
Subject: [PATCH] LoongArch: KVM: Remove unused returns and semicolons
The default branch has already handled all undefined cases, so the final
return statement is redundant. Redundant semicolons are removed, too.
Cc: stable(a)vger.kernel.org
Reviewed-by: Bibo Mao <maobibo(a)loongson.cn>
Signed-off-by: Tao Cui <cuitao(a)kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c
index 2ce41f93b2a4..6c9c7de7226b 100644
--- a/arch/loongarch/kvm/exit.c
+++ b/arch/loongarch/kvm/exit.c
@@ -778,10 +778,8 @@ static long kvm_save_notify(struct kvm_vcpu *vcpu)
return 0;
default:
return KVM_HCALL_INVALID_CODE;
- };
-
- return KVM_HCALL_INVALID_CODE;
-};
+ }
+}
/*
* kvm_handle_lsx_disabled() - Guest used LSX while disabled in root.
From: David Sterba <dsterba(a)suse.com>
[ Upstream commit ed4e6b5d644c4dd2bc2872ffec036b7da0ec2e27 ]
Syzbot hits a problem with enabled ref-verify, ignorebadroots and a
fuzzed/damaged extent tree. There's no fallback option like in other
places that can deal with it so disable the whole ref-verify as it is
just a debugging feature.
Reported-by: syzbot+9c3e0cdfbfe351b0bc0e(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/0000000000001b6052062139be1c@google.com/
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my exhaustive analysis of the commit, code changes, and
extensive research through multiple specialized agents, here is my
determination:
## **Backport Status: YES**
This commit should be backported to stable kernel trees v5.11 and later.
## Detailed Explanation
### Nature of the Fix
The commit fixes a **NULL pointer dereference** that occurs when
mounting a corrupted BTRFS filesystem with both `rescue=ignorebadroots`
and `ref_verify` options enabled. The code change adds a critical
IS_ERR() check:
```c
+ extent_root = btrfs_extent_root(fs_info, 0);
+ /* If the extent tree is damaged we cannot ignore it (IGNOREBADROOTS). */
+ if (IS_ERR(extent_root)) {
+ btrfs_warn(fs_info, "ref-verify: extent tree not available, disabling");
+ btrfs_clear_opt(fs_info->mount_opt, REF_VERIFY);
+ return 0;
+ }
```
Previously, the code used `extent_root` immediately without checking
whether it was valid, causing a crash at the line
`eb = btrfs_read_lock_root_node(extent_root)`.
### Why It Qualifies for Stable Backport
1. **Fixes a Real Bug**: Prevents kernel panic during filesystem mount
operations
2. **Small and Contained**: Only 7 lines of defensive code added
3. **No Side Effects**: Simply disables the debug feature gracefully
instead of crashing
4. **Follows Stable Rules**:
- Important bugfix (prevents crashes)
- Minimal risk (simple NULL check)
- Confined to subsystem (BTRFS ref-verify)
### Specific Code Analysis
The fix properly handles the interaction between two features introduced
at different times:
- **ref-verify**: Debug feature from v4.15 (2017)
- **IGNOREBADROOTS**: Recovery option from v5.11 (2020)
The incompatibility wasn't caught until syzbot fuzzing discovered it.
The fix:
- Checks if `btrfs_extent_root()` returns an error (damaged extent tree)
- Gracefully disables ref-verify with a warning message
- Allows mount to proceed for recovery purposes
- Returns success (0) to continue mounting
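The steps listed above can be modelled in userspace. This is a sketch under stated assumptions: `err_ptr()` and `is_err()` are local re-creations of the kernel's error-pointer convention, and `struct fs_state` and `build_verify_tree()` are hypothetical stand-ins, not BTRFS code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Userspace re-creation of the kernel's error-pointer helpers. */
static void *err_ptr(long error)
{
	return (void *)error;
}

static bool is_err(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct fs_state {
	bool verify_enabled;	/* hypothetical stand-in for REF_VERIFY */
};

/* Returns 0 in every case; on an error pointer it only disables the feature. */
static int build_verify_tree(struct fs_state *fs, void *root)
{
	if (!fs->verify_enabled)
		return 0;
	if (is_err(root)) {
		fs->verify_enabled = false;	/* degrade instead of crashing */
		return 0;
	}
	/* ... the real work would use root here ... */
	return 0;
}
```

Note that under this test a NULL pointer is not an error pointer, so a check of this shape only catches encoded error values.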
### Risk Assessment
**Extremely Low Risk**:
- Pattern already exists in 7+ other places in BTRFS for IGNOREBADROOTS
handling
- ref-verify is a debug feature (`CONFIG_BTRFS_FS_REF_VERIFY=n` by
default)
- Not enabled in production kernels (Ubuntu, RHEL, SUSE)
- Clean cherry-pick expected to v5.11+
- Syzbot tested and verified the fix
### Comparison with Similar Commits
My research found similar BTRFS NULL-check fixes routinely backported:
- Commit 6aecd91a5c5b: Similar fix for scrub with IGNOREBADROOTS
- Multiple IS_ERR() checks added throughout disk-io.c for root tree
handling
- Pattern of "disable debug feature on error" is established in
ref-verify.c
### Conclusion
While this primarily affects debug configurations, it meets all criteria
for stable backporting:
- Prevents kernel crashes (primary stable criterion)
- Simple, obviously correct fix
- Helps filesystem recovery operations
- Zero regression risk
- Follows established BTRFS error handling patterns
The commit should be backported to **stable kernels v5.11 through v6.5**
(kernels where both IGNOREBADROOTS and ref-verify coexist).
fs/btrfs/ref-verify.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/ref-verify.c b/fs/btrfs/ref-verify.c
index 2928abf7eb827..fc46190d26c8e 100644
--- a/fs/btrfs/ref-verify.c
+++ b/fs/btrfs/ref-verify.c
@@ -998,11 +998,18 @@ int btrfs_build_ref_tree(struct btrfs_fs_info *fs_info)
if (!btrfs_test_opt(fs_info, REF_VERIFY))
return 0;
+ extent_root = btrfs_extent_root(fs_info, 0);
+ /* If the extent tree is damaged we cannot ignore it (IGNOREBADROOTS). */
+ if (IS_ERR(extent_root)) {
+ btrfs_warn(fs_info, "ref-verify: extent tree not available, disabling");
+ btrfs_clear_opt(fs_info->mount_opt, REF_VERIFY);
+ return 0;
+ }
+
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
- extent_root = btrfs_extent_root(fs_info, 0);
eb = btrfs_read_lock_root_node(extent_root);
level = btrfs_header_level(eb);
path->nodes[level] = eb;
--
2.51.0
The patch titled
Subject: lib/genalloc: fix device leak in of_gen_pool_get()
has been added to the -mm mm-nonmm-unstable branch. Its filename is
lib-genalloc-fix-device-leak-in-of_gen_pool_get.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Johan Hovold <johan(a)kernel.org>
Subject: lib/genalloc: fix device leak in of_gen_pool_get()
Date: Wed, 24 Sep 2025 10:02:07 +0200
Make sure to drop the reference taken when looking up the genpool platform
device in of_gen_pool_get() before returning the pool.
Note that holding a reference to a device typically does not prevent its
devres managed resources from being released, so there is no point in
keeping the reference.
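The balance between lookup and release can be sketched with a toy reference count. Everything here is hypothetical and userspace only: the local `struct device`, `find_device()`, `put_device_ref()` and `get_pool()` merely model a lookup that returns with a reference held.

```c
/* Hypothetical refcounted device and the pool it owns. */
struct device {
	int refcount;
	int pool_id;	/* stands in for the managed pool */
};

/* Lookup helper that, like a find function, returns with a reference held. */
static struct device *find_device(struct device *dev)
{
	dev->refcount++;
	return dev;
}

static void put_device_ref(struct device *dev)
{
	dev->refcount--;
}

/* Look up the pool and drop the lookup reference before returning. */
static int get_pool(struct device *candidate)
{
	struct device *dev = find_device(candidate);
	int pool = dev->pool_id;

	put_device_ref(dev);	/* balance the reference taken by the lookup */
	return pool;
}
```

Without the release, every call would leave the count one higher than before, which is the leak the patch addresses.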
Link: https://lkml.kernel.org/r/20250924080207.18006-1-johan@kernel.org
Fixes: 9375db07adea ("genalloc: add devres support, allow to find a managed pool by device")
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Cc: Philipp Zabel <p.zabel(a)pengutronix.de>
Cc: Vladimir Zapolskiy <vz(a)mleia.com>
Cc: <stable(a)vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/genalloc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/lib/genalloc.c~lib-genalloc-fix-device-leak-in-of_gen_pool_get
+++ a/lib/genalloc.c
@@ -899,8 +899,11 @@ struct gen_pool *of_gen_pool_get(struct
if (!name)
name = of_node_full_name(np_pool);
}
- if (pdev)
+ if (pdev) {
pool = gen_pool_get(&pdev->dev, name);
+ put_device(&pdev->dev);
+ }
+
of_node_put(np_pool);
return pool;
_
Patches currently in -mm which might be from johan(a)kernel.org are
lib-genalloc-fix-device-leak-in-of_gen_pool_get.patch
The patch titled
Subject: mm/memblock: correct totalram_pages accounting with KMSAN
has been added to the -mm mm-new branch. Its filename is
mm-memblock-correct-totalram_pages-accounting-with-kmsan.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fix up patches in mm-new.
------------------------------------------------------
From: Alexander Potapenko <glider(a)google.com>
Subject: mm/memblock: correct totalram_pages accounting with KMSAN
Date: Wed, 24 Sep 2025 12:03:01 +0200
When KMSAN is enabled, `kmsan_memblock_free_pages()` can hold back pages
for metadata instead of returning them to the early allocator. The
callers, however, would unconditionally increment `totalram_pages`,
assuming the pages were always freed. This resulted in an incorrect
calculation of the total available RAM, causing the kernel to believe it
had more memory than it actually did.
This patch refactors `memblock_free_pages()` to return the number of pages
it successfully frees. If KMSAN stashes the pages, the function now
returns 0; otherwise, it returns the number of pages in the block.
The callers in `memblock.c` have been updated to use this return value,
ensuring that `totalram_pages` is incremented only by the number of pages
actually returned to the allocator. This corrects the total RAM
accounting when KMSAN is active.
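The accounting change can be modelled with a small userspace sketch. The names `block_is_stashed()`, `free_block()` and `free_range()` are hypothetical, and the rule deciding which blocks are held back is arbitrary; only the shape (return the count, let the caller sum it) mirrors the description above.

```c
#include <stdbool.h>

/* Hypothetical stand-in: true when the block is held back for metadata. */
static bool block_is_stashed(unsigned long pfn)
{
	return (pfn % 2) == 1;	/* arbitrary rule, for illustration only */
}

/* Return how many pages were really handed to the allocator. */
static unsigned long free_block(unsigned long pfn, unsigned int order)
{
	if (block_is_stashed(pfn))
		return 0;		/* held back: must not be counted */
	return 1UL << order;		/* 2^order pages were freed */
}

/* The caller sums the return values instead of counting every call. */
static unsigned long free_range(unsigned long start, unsigned long end)
{
	unsigned long freed = 0;
	unsigned long pfn;

	for (pfn = start; pfn < end; pfn++)
		freed += free_block(pfn, 0);
	return freed;
}
```

Counting one page per loop iteration, as the old callers effectively did, would report the full range even when some blocks never reached the allocator.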
Link: https://lkml.kernel.org/r/20250924100301.1558645-1-glider@google.com
Fixes: 3c2065098260 ("init: kmsan: call KMSAN initialization routines")
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Aleksandr Nogikh <nogikh(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Markus Elfring <Markus.Elfring(a)web.de>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/internal.h | 4 ++--
mm/memblock.c | 21 +++++++++++----------
mm/mm_init.c | 9 +++++----
3 files changed, 18 insertions(+), 16 deletions(-)
--- a/mm/internal.h~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/internal.h
@@ -742,8 +742,8 @@ static inline void clear_zone_contiguous
extern int __isolate_free_page(struct page *page, unsigned int order);
extern void __putback_isolated_page(struct page *page, unsigned int order,
int mt);
-extern void memblock_free_pages(struct page *page, unsigned long pfn,
- unsigned int order);
+unsigned long memblock_free_pages(struct page *page, unsigned long pfn,
+ unsigned int order);
extern void __free_pages_core(struct page *page, unsigned int order,
enum meminit_context context);
--- a/mm/memblock.c~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/memblock.c
@@ -1826,6 +1826,7 @@ void *__init __memblock_alloc_or_panic(p
void __init memblock_free_late(phys_addr_t base, phys_addr_t size)
{
phys_addr_t cursor, end;
+ unsigned long freed_pages = 0;
end = base + size - 1;
memblock_dbg("%s: [%pa-%pa] %pS\n",
@@ -1834,10 +1835,9 @@ void __init memblock_free_late(phys_addr
cursor = PFN_UP(base);
end = PFN_DOWN(base + size);
- for (; cursor < end; cursor++) {
- memblock_free_pages(pfn_to_page(cursor), cursor, 0);
- totalram_pages_inc();
- }
+ for (; cursor < end; cursor++)
+ freed_pages += memblock_free_pages(pfn_to_page(cursor), cursor, 0);
+ totalram_pages_add(freed_pages);
}
/*
@@ -2259,9 +2259,11 @@ static void __init free_unused_memmap(vo
#endif
}
-static void __init __free_pages_memory(unsigned long start, unsigned long end)
+static unsigned long __init __free_pages_memory(unsigned long start,
+ unsigned long end)
{
int order;
+ unsigned long freed = 0;
while (start < end) {
/*
@@ -2279,14 +2281,15 @@ static void __init __free_pages_memory(u
while (start + (1UL << order) > end)
order--;
- memblock_free_pages(pfn_to_page(start), start, order);
+ freed += memblock_free_pages(pfn_to_page(start), start, order);
start += (1UL << order);
}
+ return freed;
}
static unsigned long __init __free_memory_core(phys_addr_t start,
- phys_addr_t end)
+ phys_addr_t end)
{
unsigned long start_pfn = PFN_UP(start);
unsigned long end_pfn = PFN_DOWN(end);
@@ -2297,9 +2300,7 @@ static unsigned long __init __free_memor
if (start_pfn >= end_pfn)
return 0;
- __free_pages_memory(start_pfn, end_pfn);
-
- return end_pfn - start_pfn;
+ return __free_pages_memory(start_pfn, end_pfn);
}
static void __init memmap_init_reserved_pages(void)
--- a/mm/mm_init.c~mm-memblock-correct-totalram_pages-accounting-with-kmsan
+++ a/mm/mm_init.c
@@ -2547,24 +2547,25 @@ void *__init alloc_large_system_hash(con
return table;
}
-void __init memblock_free_pages(struct page *page, unsigned long pfn,
- unsigned int order)
+unsigned long __init memblock_free_pages(struct page *page, unsigned long pfn,
+ unsigned int order)
{
if (IS_ENABLED(CONFIG_DEFERRED_STRUCT_PAGE_INIT)) {
int nid = early_pfn_to_nid(pfn);
if (!early_page_initialised(pfn, nid))
- return;
+ return 0;
}
if (!kmsan_memblock_free_pages(page, order)) {
/* KMSAN will take care of these pages. */
- return;
+ return 0;
}
/* pages were reserved and not allocated */
clear_page_tag_ref(page);
__free_pages_core(page, order, MEMINIT_EARLY);
+ return 1UL << order;
}
DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_ALLOC_DEFAULT_ON, init_on_alloc);
_
Patches currently in -mm which might be from glider(a)google.com are
mm-memblock-correct-totalram_pages-accounting-with-kmsan.patch
The patch titled
Subject: mm: swap: check for stable address space before operating on the VMA
has been added to the -mm mm-new branch. Its filename is
mm-swap-check-for-stable-address-space-before-operating-on-the-vma.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Charan Teja Kalla <charan.kalla(a)oss.qualcomm.com>
Subject: mm: swap: check for stable address space before operating on the VMA
Date: Wed, 24 Sep 2025 23:41:38 +0530
It is possible to hit a zero entry while traversing the VMAs in unuse_mm(),
called from the swapoff path, and accessing it causes an oops:
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000446 --> a load from offset 0x40, using XA_ZERO_ENTRY as
the base address.
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
The issue arises from the following race between fork() on a process and
swapoff:
fork(dup_mmap())                            swapoff(unuse_mm)
----------------                            -----------------
1) Identical mtree is built using
   __mt_dup().
2) copy_pte_range()-->
   copy_nonpresent_pte():
   The dst mm is added into the
   mmlist to be visible to the
   swapoff operation.
3) Fatal signal is sent to the parent
   process (which is current during the
   fork), thus skipping the duplication
   of the vmas and marking the vma range
   with XA_ZERO_ENTRY as a marker for
   this process that helps during
   exit_mmap().
                                            4) swapoff is tried on the
                                               'mm' added to the 'mmlist'
                                               as part of step 2.
                                            5) unuse_mm(), which iterates
                                               through the vmas of this
                                               'mm', will hit the non-NULL
                                               zero entry, and operating on
                                               this zero entry as a vma
                                               results in the oops.
The proper fix would be around not exposing this partially-valid tree to
others when dropping the mmap lock, which is being solved with [1]. A
simpler solution would be checking for MMF_UNSTABLE, as it is set if
mm_struct is not fully initialized in dup_mmap().
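The simpler solution can be pictured with a userspace model. This is only a sketch: `struct mm`, its `unstable` flag and `sum_slots()` are hypothetical, with the flag standing in for MMF_UNSTABLE and a NULL slot standing in for a marker entry that must never be treated as a valid object.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified address space: a flag plus an array of slots. */
struct mm {
	bool unstable;	/* set while the copy is incomplete */
	int **slots;	/* a slot may hold a marker that is not a valid pointer */
	size_t nr;
};

/* Sum the values behind all slots, or return -1 if the mm must be skipped. */
static int sum_slots(const struct mm *mm, int *sum)
{
	size_t i;

	*sum = 0;
	if (mm->unstable)
		return -1;	/* skipped: slots must not be dereferenced */

	for (i = 0; i < mm->nr; i++)
		*sum += *mm->slots[i];
	return 0;
}
```

The check runs once, before the loop, so the walker never has to recognise marker entries itself.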
Thanks to Liam/Lorenzo/David for all the suggestions in fixing this
issue.
Link: https://lkml.kernel.org/r/20250924181138.1762750-1-charan.kalla@oss.qualcom…
Link: https://lore.kernel.org/all/20250815191031.3769540-1-Liam.Howlett@oracle.co… [1]
Fixes: d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
Signed-off-by: Charan Teja Kalla <charan.kalla(a)oss.qualcomm.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Kairui Song <kasong(a)tencent.com>
Cc: Kemeng Shi <shikemeng(a)huaweicloud.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Nhat Pham <nphamcs(a)gmail.com>
Cc: Peng Zhang <zhangpeng.00(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/swapfile.c~mm-swap-check-for-stable-address-space-before-operating-on-the-vma
+++ a/mm/swapfile.c
@@ -2389,6 +2389,8 @@ static int unuse_mm(struct mm_struct *mm
VMA_ITERATOR(vmi, mm, 0);
mmap_read_lock(mm);
+ if (check_stable_address_space(mm))
+ goto unlock;
for_each_vma(vmi, vma) {
if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
ret = unuse_vma(vma, type);
@@ -2398,6 +2400,7 @@ static int unuse_mm(struct mm_struct *mm
cond_resched();
}
+unlock:
mmap_read_unlock(mm);
return ret;
}
_
Patches currently in -mm which might be from charan.kalla(a)oss.qualcomm.com are
mm-swap-check-for-stable-address-space-before-operating-on-the-vma.patch
Commit 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in
af_alg_sendmsg") changed some fields from bool to 1-bit bitfields.
However, some assignments to these fields, specifically 'more' and
'merge', assign values greater than 1. These relied on C's implicit
conversion to bool, such that zero becomes false and nonzero becomes
true. With a 1-bit bitfield, the value is taken mod 2 instead,
resulting in 0 being assigned in some cases when 1 was
intended. Fix this by restoring the bool type.
Fixes: 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
---
include/crypto/if_alg.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 0c70f3a55575..02fb7c1d9ef7 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -150,15 +150,15 @@ struct af_alg_ctx {
struct crypto_wait wait;
size_t used;
atomic_t rcvused;
- u32 more:1,
- merge:1,
- enc:1,
- write:1,
- init:1;
+ bool more;
+ bool merge;
+ bool enc;
+ bool write;
+ bool init;
unsigned int len;
unsigned int inflight;
};
base-commit: cec1e6e5d1ab33403b809f79cd20d6aff124ccfe
--
2.51.0.536.g15c5d4f767-goog
From: Allison Henderson <allison.henderson(a)oracle.com>
[ Upstream commit f103df763563ad6849307ed5985d1513acc586dd ]
With parent pointers enabled, a rename operation can update up to 5
inodes: src_dp, target_dp, src_ip, target_ip and wip. This causes
their dquots to be attached to the transaction chain, so we need
to increase XFS_QM_TRANS_MAXDQS. This patch also adds a helper
function xfs_dqlockn to lock an arbitrary number of dquots.
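Locking an arbitrary number of objects is commonly done by sorting them by a stable key and taking the locks in that order, so that every caller uses the same order. The sketch below illustrates that general technique in userspace; `struct item`, `item_cmp()` and `lock_all()` are hypothetical names, and the integer `locked` flag merely stands in for a real lock.

```c
#include <stdlib.h>

/* Hypothetical lockable object identified by a numeric id. */
struct item {
	unsigned int id;
	int locked;
	int lock_seq;	/* order in which the lock was taken */
};

/* Three-way comparison on id, written without subtraction to avoid overflow. */
static int item_cmp(const void *a, const void *b)
{
	const struct item *const *ia = a;
	const struct item *const *ib = b;

	if ((*ia)->id > (*ib)->id)
		return 1;
	if ((*ia)->id < (*ib)->id)
		return -1;
	return 0;
}

/* Sort the pointer array by id, then take every lock in that order. */
static void lock_all(struct item **items, size_t n)
{
	size_t i;

	qsort(items, n, sizeof(*items), item_cmp);
	for (i = 0; i < n; i++) {
		items[i]->locked = 1;
		items[i]->lock_seq = (int)i;
	}
}
```

This sketch assumes the ids are unique, so the sorted order is fully determined.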
Signed-off-by: Allison Henderson <allison.henderson(a)oracle.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Darrick J. Wong <djwong(a)kernel.org>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
[amir: backport to kernels prior to parent pointers to fix an old bug]
A rename operation of a directory (i.e. mv A/C/ B/) may end up changing
three different dquot accounts under the following conditions:
1. user (or group) quotas are enabled
2. A/ B/ and C/ have different owner uids (or gids)
3. A/ blocks shrink after removal of entry C/
4. B/ blocks grow before addition of entry C/
5. A/ ino <= XFS_DIR2_MAX_SHORT_INUM
6. B/ ino > XFS_DIR2_MAX_SHORT_INUM
7. C/ is converted from sf to block format, because its parent entry
needs to be stored as 8 bytes (see xfs_dir2_sf_replace_needblock)
When all conditions are met (observed in the wild) we get this assertion:
XFS: Assertion failed: qtrx, file: fs/xfs/xfs_trans_dquot.c, line: 207
The upstream commit fixed this bug as a side effect, so I decided to apply
it as is rather than changing XFS_QM_TRANS_MAXDQS to 3 in stable kernels.
The Fixes commit below is NOT the commit that introduced the bug, but
for some reason, which is not explained in the commit message, it fixes
the comment to state that highest number of dquots of one type is 3 and
not 2 (which leads to the assertion), without actually fixing it.
The change of wording from "usr, grp OR prj" to "usr, grp and prj"
suggests that there may have been a confusion between "the number of
dquote of one type" and "the number of dquot types" (which is also 3),
so the comment change was only accidentally correct.
Fixes: 10f73d27c8e9 ("xfs: fix the comment explaining xfs_trans_dqlockedjoin")
Cc: stable(a)vger.kernel.org
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
---
Christoph,
This is a cognitive challenge. Can you say what you were thinking in
2013 when making the comment change in the Fixes commit?
Is my speculation above correct?
Catherine and Leah,
I decided that cherry-picking this upstream commit as is, with a commit
message addendum, was the best stable tree strategy.
The commit applies cleanly to 5.15.y, so I assume it does for 6.6 and
6.1 as well. I ran my tests on 5.15.y and nothing fell out, but did not
try to reproduce this complex assertion in a test.
Could you take this candidate backport patch for a spin on your test
branch?
What do you all think about this?
Thanks,
Amir.
fs/xfs/xfs_dquot.c | 41 ++++++++++++++++++++++++++++++++++++++++
fs/xfs/xfs_dquot.h | 1 +
fs/xfs/xfs_qm.h | 2 +-
fs/xfs/xfs_trans_dquot.c | 15 ++++++++++-----
4 files changed, 53 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index c15d61d47a06..6b05d47aa19b 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1360,6 +1360,47 @@ xfs_dqlock2(
}
}
+static int
+xfs_dqtrx_cmp(
+ const void *a,
+ const void *b)
+{
+ const struct xfs_dqtrx *qa = a;
+ const struct xfs_dqtrx *qb = b;
+
+ if (qa->qt_dquot->q_id > qb->qt_dquot->q_id)
+ return 1;
+ if (qa->qt_dquot->q_id < qb->qt_dquot->q_id)
+ return -1;
+ return 0;
+}
+
+void
+xfs_dqlockn(
+ struct xfs_dqtrx *q)
+{
+ unsigned int i;
+
+ BUILD_BUG_ON(XFS_QM_TRANS_MAXDQS > MAX_LOCKDEP_SUBCLASSES);
+
+ /* Sort in order of dquot id, do not allow duplicates */
+ for (i = 0; i < XFS_QM_TRANS_MAXDQS && q[i].qt_dquot != NULL; i++) {
+ unsigned int j;
+
+ for (j = 0; j < i; j++)
+ ASSERT(q[i].qt_dquot != q[j].qt_dquot);
+ }
+ if (i == 0)
+ return;
+
+ sort(q, i, sizeof(struct xfs_dqtrx), xfs_dqtrx_cmp, NULL);
+
+ mutex_lock(&q[0].qt_dquot->q_qlock);
+ for (i = 1; i < XFS_QM_TRANS_MAXDQS && q[i].qt_dquot != NULL; i++)
+ mutex_lock_nested(&q[i].qt_dquot->q_qlock,
+ XFS_QLOCK_NESTED + i - 1);
+}
+
int __init
xfs_qm_init(void)
{
diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
index 6b5e3cf40c8b..0e954f88811f 100644
--- a/fs/xfs/xfs_dquot.h
+++ b/fs/xfs/xfs_dquot.h
@@ -231,6 +231,7 @@ int xfs_qm_dqget_uncached(struct xfs_mount *mp,
void xfs_qm_dqput(struct xfs_dquot *dqp);
void xfs_dqlock2(struct xfs_dquot *, struct xfs_dquot *);
+void xfs_dqlockn(struct xfs_dqtrx *q);
void xfs_dquot_set_prealloc_limits(struct xfs_dquot *);
diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
index 442a0f97a9d4..f75c12c4c6a0 100644
--- a/fs/xfs/xfs_qm.h
+++ b/fs/xfs/xfs_qm.h
@@ -121,7 +121,7 @@ enum {
XFS_QM_TRANS_PRJ,
XFS_QM_TRANS_DQTYPES
};
-#define XFS_QM_TRANS_MAXDQS 2
+#define XFS_QM_TRANS_MAXDQS 5
struct xfs_dquot_acct {
struct xfs_dqtrx dqs[XFS_QM_TRANS_DQTYPES][XFS_QM_TRANS_MAXDQS];
};
diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
index 955c457e585a..99a03acd4488 100644
--- a/fs/xfs/xfs_trans_dquot.c
+++ b/fs/xfs/xfs_trans_dquot.c
@@ -268,24 +268,29 @@ xfs_trans_mod_dquot(
/*
* Given an array of dqtrx structures, lock all the dquots associated and join
- * them to the transaction, provided they have been modified. We know that the
- * highest number of dquots of one type - usr, grp and prj - involved in a
- * transaction is 3 so we don't need to make this very generic.
+ * them to the transaction, provided they have been modified.
*/
STATIC void
xfs_trans_dqlockedjoin(
struct xfs_trans *tp,
struct xfs_dqtrx *q)
{
+ unsigned int i;
ASSERT(q[0].qt_dquot != NULL);
if (q[1].qt_dquot == NULL) {
xfs_dqlock(q[0].qt_dquot);
xfs_trans_dqjoin(tp, q[0].qt_dquot);
- } else {
- ASSERT(XFS_QM_TRANS_MAXDQS == 2);
+ } else if (q[2].qt_dquot == NULL) {
xfs_dqlock2(q[0].qt_dquot, q[1].qt_dquot);
xfs_trans_dqjoin(tp, q[0].qt_dquot);
xfs_trans_dqjoin(tp, q[1].qt_dquot);
+ } else {
+ xfs_dqlockn(q);
+ for (i = 0; i < XFS_QM_TRANS_MAXDQS; i++) {
+ if (q[i].qt_dquot == NULL)
+ break;
+ xfs_trans_dqjoin(tp, q[i].qt_dquot);
+ }
}
}
--
2.47.1
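The xfs_dqlockn() helper in the patch above sorts the dquots by id and then takes their locks in that order, so that every caller acquires the same set of locks in the same sequence. The following is a minimal userspace sketch of that sort-then-lock idea, assuming POSIX mutexes in place of the kernel's mutex_lock_nested() and a hypothetical `struct demo_obj` in place of a dquot. It does not model lockdep subclasses or the duplicate-entry assertions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_OBJS 5 /* same value as XFS_QM_TRANS_MAXDQS after the patch */

/* Hypothetical stand-in for a dquot: an id plus its lock. */
struct demo_obj {
	unsigned int id;
	pthread_mutex_t lock;
};

/* qsort() comparator: ascending id, same shape as xfs_dqtrx_cmp(). */
static int demo_obj_cmp(const void *a, const void *b)
{
	const struct demo_obj *oa = *(const struct demo_obj * const *)a;
	const struct demo_obj *ob = *(const struct demo_obj * const *)b;

	if (oa->id > ob->id)
		return 1;
	if (oa->id < ob->id)
		return -1;
	return 0;
}

/*
 * Count the leading non-NULL entries, sort them by id, then lock them
 * in ascending id order. Returns the number of locks taken.
 */
static size_t demo_lock_all(struct demo_obj *objs[DEMO_MAX_OBJS])
{
	size_t n = 0, i;

	while (n < DEMO_MAX_OBJS && objs[n] != NULL)
		n++;
	if (n == 0)
		return 0;

	qsort(objs, n, sizeof(objs[0]), demo_obj_cmp);

	for (i = 0; i < n; i++)
		pthread_mutex_lock(&objs[i]->lock);
	return n;
}

/* Release the first n locks in reverse acquisition order. */
static void demo_unlock_all(struct demo_obj *objs[DEMO_MAX_OBJS], size_t n)
{
	while (n > 0)
		pthread_mutex_unlock(&objs[--n]->lock);
}
```

Because the order depends only on the ids, two threads that lock overlapping sets of objects cannot end up each holding a lock the other is waiting for.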
Guangshuo Li wrote:
> Hi Alison, Dave, and all,
>
> Thanks for the feedback. I’ve adopted your suggestions. Below is what I
> plan to take in v3.
I would just post v3. The review tags given on that version will be
picked up when the patch is merged if it is ok.
Thanks,
Ira
[snip]
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Commit 43c51bb573aa ("sc16is7xx: make sure device is in suspend once
probed") permanently enabled access to the enhanced features in
sc16is7xx_probe(), and it is never disabled after that.
Therefore, remove the redundant re-enabling of enhanced features in
sc16is7xx_set_baud().
Fixes: 43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
drivers/tty/serial/sc16is7xx.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 1a2c4c14f6aac..c7435595dce13 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -588,13 +588,6 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
div /= prescaler;
}
- /* Enable enhanced features */
- sc16is7xx_efr_lock(port);
- sc16is7xx_port_update(port, SC16IS7XX_EFR_REG,
- SC16IS7XX_EFR_ENABLE_BIT,
- SC16IS7XX_EFR_ENABLE_BIT);
- sc16is7xx_efr_unlock(port);
-
/* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
--
2.39.5
The comedi_buf_munge() function performs a modulo operation
`async->munge_chan %= async->cmd.chanlist_len` without first
checking if chanlist_len is zero. If a user program submits a command with
chanlist_len set to zero, this causes a divide-by-zero error when the device
processes data in the interrupt handler path.
Add a check for zero chanlist_len at the beginning of the
function, similar to the existing checks for !map and
CMDF_RAWDATA flag. When chanlist_len is zero, update
munge_count and return early, indicating the data was
handled without munging.
This prevents potential kernel panics from malformed user commands.
Reported-by: syzbot+f6c3c066162d2c43a66c(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f6c3c066162d2c43a66c
Cc: stable(a)vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
---
v2: Merged the chanlist_len check with existing early return
check as suggested by Ian Abbott
---
drivers/comedi/comedi_buf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/comedi/comedi_buf.c b/drivers/comedi/comedi_buf.c
index 002c0e76baff..c7c262a2d8ca 100644
--- a/drivers/comedi/comedi_buf.c
+++ b/drivers/comedi/comedi_buf.c
@@ -317,7 +317,7 @@ static unsigned int comedi_buf_munge(struct comedi_subdevice *s,
unsigned int count = 0;
const unsigned int num_sample_bytes = comedi_bytes_per_sample(s);
- if (!s->munge || (async->cmd.flags & CMDF_RAWDATA)) {
+ if (!s->munge || (async->cmd.flags & CMDF_RAWDATA) || async->cmd.chanlist_len == 0) {
async->munge_count += num_bytes;
return num_bytes;
}
--
2.43.0
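The early return added in the patch above can be illustrated outside the kernel. The following is a simplified sketch, assuming a hypothetical `struct demo_async` and `demo_munge()` that only mimic the bookkeeping described in the commit message. It is not the comedi implementation, and the extra check on `bytes_per_sample` is this sketch's own guard rather than something the patch adds.

```c
#include <assert.h>

/* Hypothetical, reduced version of the state touched by the munge path. */
struct demo_async {
	unsigned int munge_chan;   /* index of the next channel to munge */
	unsigned int munge_count;  /* total bytes accounted as handled */
	unsigned int chanlist_len; /* user-supplied, may be zero */
	int rawdata;               /* nonzero: caller asked for raw data */
};

/* Returns the number of bytes accounted as handled. */
static unsigned int demo_munge(struct demo_async *async,
			       unsigned int num_bytes,
			       unsigned int bytes_per_sample)
{
	unsigned int count = 0;

	/*
	 * Skip munging when it is not wanted or not possible. Checking
	 * chanlist_len here keeps the modulo below from dividing by zero.
	 */
	if (async->rawdata || async->chanlist_len == 0 ||
	    bytes_per_sample == 0) {
		async->munge_count += num_bytes;
		return num_bytes;
	}

	while (num_bytes - count >= bytes_per_sample) {
		/* A real driver would convert one sample here. */
		async->munge_chan = (async->munge_chan + 1) % async->chanlist_len;
		count += bytes_per_sample;
	}

	async->munge_count += count;
	return count;
}
```

With `chanlist_len` set to zero the function reports all bytes as handled and never reaches the modulo, which is the behavior the commit message describes.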
From: Lance Yang <lance.yang(a)linux.dev>
When both THP and MTE are enabled, splitting a THP and replacing its
zero-filled subpages with the shared zeropage can cause MTE tag mismatch
faults in userspace.
Remapping zero-filled subpages to the shared zeropage is unsafe, as the
zeropage has a fixed tag of zero, which may not match the tag expected by
the userspace pointer.
KSM already avoids this problem by using memcmp_pages(), which on arm64
intentionally reports MTE-tagged pages as non-identical to prevent unsafe
merging.
As suggested by David[1], this patch adopts the same pattern, replacing the
memchr_inv() byte-level check with a call to pages_identical(). This
leverages existing architecture-specific logic to determine if a page is
truly identical to the shared zeropage.
Having both the THP shrinker and KSM rely on pages_identical() makes the
design more future-proof, IMO. Instead of handling quirks in generic code,
we just let the architecture decide what makes two pages identical.
[1] https://lore.kernel.org/all/ca2106a3-4bb2-4457-81af-301fd99fbef4@redhat.com
Cc: <stable(a)vger.kernel.org>
Reported-by: Qun-wei Lin <Qun-wei.Lin(a)mediatek.com>
Closes: https://lore.kernel.org/all/a7944523fcc3634607691c35311a5d59d1a3f8d4.camel@…
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Suggested-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
Tested on x86_64 and on QEMU for arm64 (with and without MTE support),
and the fix works as expected.
mm/huge_memory.c | 15 +++------------
mm/migrate.c | 8 +-------
2 files changed, 4 insertions(+), 19 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 32e0ec2dde36..28d4b02a1aa5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4104,29 +4104,20 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
static bool thp_underused(struct folio *folio)
{
int num_zero_pages = 0, num_filled_pages = 0;
- void *kaddr;
int i;
for (i = 0; i < folio_nr_pages(folio); i++) {
- kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
- if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
- num_zero_pages++;
- if (num_zero_pages > khugepaged_max_ptes_none) {
- kunmap_local(kaddr);
+ if (pages_identical(folio_page(folio, i), ZERO_PAGE(0))) {
+ if (++num_zero_pages > khugepaged_max_ptes_none)
return true;
- }
} else {
/*
* Another path for early exit once the number
* of non-zero filled pages exceeds threshold.
*/
- num_filled_pages++;
- if (num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) {
- kunmap_local(kaddr);
+ if (++num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none)
return false;
- }
}
- kunmap_local(kaddr);
}
return false;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index aee61a980374..ce83c2c3c287 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -300,9 +300,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
unsigned long idx)
{
struct page *page = folio_page(folio, idx);
- bool contains_data;
pte_t newpte;
- void *addr;
if (PageCompound(page))
return false;
@@ -319,11 +317,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
* this subpage has been non present. If the subpage is only zero-filled
* then map it to the shared zeropage.
*/
- addr = kmap_local_page(page);
- contains_data = memchr_inv(addr, 0, PAGE_SIZE);
- kunmap_local(addr);
-
- if (contains_data)
+ if (!pages_identical(page, ZERO_PAGE(0)))
return false;
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
--
2.49.0
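The reasoning in the commit message above can be modeled in a few lines: a byte-level zero check says a page may be replaced by the shared zeropage, while an architecture-aware comparison refuses when the page carries memory tags. The following is a userspace sketch under stated assumptions. `struct demo_page`, its `mte_tagged` flag, and the `demo_*` functions are hypothetical, and the tag rule is a simplification of the arm64 behavior the message describes, not the kernel's memcmp_pages().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

/* Hypothetical page model: contents plus a "has MTE tags" marker. */
struct demo_page {
	unsigned char data[DEMO_PAGE_SIZE];
	bool mte_tagged;
};

/* Static storage is zero-initialized: all-zero data, untagged. */
static const struct demo_page demo_zero_page;

/*
 * Architecture-aware comparison: a tagged page is never reported as
 * identical, even if its bytes match (simplified assumption).
 */
static bool demo_pages_identical(const struct demo_page *a,
				 const struct demo_page *b)
{
	if (a->mte_tagged || b->mte_tagged)
		return false;
	return memcmp(a->data, b->data, DEMO_PAGE_SIZE) == 0;
}

/* A subpage may be remapped only if it is identical to the zero page. */
static bool demo_can_map_to_zeropage(const struct demo_page *page)
{
	return demo_pages_identical(page, &demo_zero_page);
}
```

In this model a zero-filled but tagged page is rejected, whereas a plain scan for nonzero bytes would have accepted it.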
This is the mail system at host zihnyunrui.com.
I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.
For further assistance, please send mail to postmaster.
If you do so, please include this problem report. You can
delete your own text from the attached returned message.
The mail system
<linux-stable-mirror(a)lists.linaro.org>: host lists.linaro.org[3.208.193.21]
said: 554 5.7.1 Spam message rejected (in reply to end of DATA command)