dev_coredumpm() creates a devcoredump device and adds it
to the core kernel framework which eventually end up
sending uevent to the user space and later creates a
symbolic link to the failed device. An application
running in userspace may be interested in this symbolic
link to get the name of the failed device.
In a issue scenario, once uevent sent to the user space
it start reading '/sys/class/devcoredump/devcdX/failing_device'
to get the actual name of the device which might not been
created and it is in its path of creation.
To fix this, suppress sending uevent till the failing device
symbolic link gets created and send uevent once symbolic
link is created successfully.
Fixes: 833c95456a70 ("device coredump: add new device coredump class")
Signed-off-by: Mukesh Ojha <quic_mojha(a)quicinc.com>
Cc: stable(a)vger.kernel.org
---
Change in v3:
- Cced stable.
Change in v2:
- Added Fixes tag as per suggestion from [Johannes]
drivers/base/devcoredump.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index 91536ee05f14..7e2d1f0d903a 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -362,6 +362,7 @@ void dev_coredumpm(struct device *dev, struct module *owner,
devcd->devcd_dev.class = &devcd_class;
mutex_lock(&devcd->mutex);
+ dev_set_uevent_suppress(&devcd->devcd_dev, true);
if (device_add(&devcd->devcd_dev))
goto put_device;
@@ -376,6 +377,8 @@ void dev_coredumpm(struct device *dev, struct module *owner,
"devcoredump"))
dev_warn(dev, "devcoredump create_link failed\n");
+ dev_set_uevent_suppress(&devcd->devcd_dev, false);
+ kobject_uevent(&devcd->devcd_dev.kobj, KOBJ_ADD);
INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
mutex_unlock(&devcd->mutex);
--
2.7.4
This patch enhances error handling in scenarios with RTS (Request to
Send) messages arriving closely. It replaces the less informative WARN_ON_ONCE
backtraces with a new error handling method. This provides clearer error
messages and allows for the early termination of problematic sessions.
Previously, sessions were only released at the end of j1939_xtp_rx_rts().
Potentially this could be reproduced with something like:
testj1939 -r vcan0:0x80 &
while true; do
# send first RTS
cansend vcan0 18EC8090#1014000303002301;
# send second RTS
cansend vcan0 18EC8090#1014000303002301;
# send abort
cansend vcan0 18EC8090#ff00000000002301;
done
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Reported-by: syzbot+daa36413a5cedf799ae4(a)syzkaller.appspotmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
---
net/can/j1939/transport.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index fe3df23a2595..c6569f98d251 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -1593,8 +1593,8 @@ j1939_session *j1939_xtp_rx_rts_session_new(struct j1939_priv *priv,
struct j1939_sk_buff_cb skcb = *j1939_skb_to_cb(skb);
struct j1939_session *session;
const u8 *dat;
+ int len, ret;
pgn_t pgn;
- int len;
netdev_dbg(priv->ndev, "%s\n", __func__);
@@ -1653,7 +1653,22 @@ j1939_session *j1939_xtp_rx_rts_session_new(struct j1939_priv *priv,
session->tskey = priv->rx_tskey++;
j1939_sk_errqueue(session, J1939_ERRQUEUE_RX_RTS);
- WARN_ON_ONCE(j1939_session_activate(session));
+ ret = j1939_session_activate(session);
+ if (ret) {
+ /* Entering this scope indicates an issue with the J1939 bus.
+ * Possible scenarios include:
+ * - A time lapse occurred, and a new session was initiated
+ * due to another packet being sent correctly. This could
+ * have been caused by too long interrupt, debugger, or being
+ * out-scheduled by another task.
+ * - The bus is receiving numerous erroneous packets, either
+ * from a malfunctioning device or during a test scenario.
+ */
+ netdev_alert(priv->ndev, "%s: 0x%p: concurrent session with same addr (%02x %02x) is already active.\n",
+ __func__, session, skcb.addr.sa, skcb.addr.da);
+ j1939_session_put(session);
+ return NULL;
+ }
return session;
}
--
2.39.2
The vmd_pm_enable_quirk() helper is called from pci_walk_bus() during
probe to enable ASPM for controllers with VMD_FEAT_BIOS_PM_QUIRK set.
Since pci_walk_bus() already holds a pci_bus_sem read lock, use the new
locked helper to enable link states in order to avoid a potential
deadlock (e.g. in case someone takes a write lock before reacquiring
the read lock).
Fixes: f492edb40b54 ("PCI: vmd: Add quirk to configure PCIe ASPM and LTR")
Cc: stable(a)vger.kernel.org # 6.3
Cc: Michael Bottini <michael.a.bottini(a)linux.intel.com>
Cc: David E. Box <david.e.box(a)linux.intel.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/pci/controller/vmd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 94ba61fe1c44..0452cbc362ee 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -751,7 +751,7 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
return 0;
- pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);
+ pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
if (!pos)
--
2.41.0
From: Xiubo Li <xiubli(a)redhat.com>
The lock order is incorrect between denty and its parent, we should
always make sure that the parent get the lock first.
Switch to use the 'dget_parent()' to get the parent dentry and also
keep holding the parent until the corresponding inode is not being
refereenced.
Cc: stable(a)vger.kernel.org
Reported-by: Al Viro <viro(a)zeniv.linux.org.uk>
URL: https://www.spinics.net/lists/ceph-devel/msg58622.html
Fixes: adf0d68701c7 ("ceph: fix unsafe dcache access in ceph_encode_dentry_release")
Cc: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Xiubo Li <xiubli(a)redhat.com>
---
fs/ceph/caps.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 284424659329..6f7a56a800c2 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -4923,6 +4923,11 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry,
int force = 0;
int ret;
+ if (!dir) {
+ parent = dget_parent(dentry);
+ dir = d_inode(parent);
+ }
+
/*
* force an record for the directory caps if we have a dentry lease.
* this is racy (can't take i_ceph_lock and d_lock together), but it
@@ -4932,14 +4937,9 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry,
spin_lock(&dentry->d_lock);
if (di->lease_session && di->lease_session->s_mds == mds)
force = 1;
- if (!dir) {
- parent = dget(dentry->d_parent);
- dir = d_inode(parent);
- }
spin_unlock(&dentry->d_lock);
ret = ceph_encode_inode_release(p, dir, mds, drop, unless, force);
- dput(parent);
cl = ceph_inode_to_client(dir);
spin_lock(&dentry->d_lock);
@@ -4952,8 +4952,10 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry,
if (IS_ENCRYPTED(dir) && fscrypt_has_encryption_key(dir)) {
int ret2 = ceph_encode_encrypted_fname(dir, dentry, *p);
- if (ret2 < 0)
+ if (ret2 < 0) {
+ dput(parent);
return ret2;
+ }
rel->dname_len = cpu_to_le32(ret2);
*p += ret2;
@@ -4965,6 +4967,8 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry,
} else {
spin_unlock(&dentry->d_lock);
}
+
+ dput(parent);
return ret;
}
--
2.39.1
Hi Trond and Greg:
LTS 4.19 reported null-ptr-deref BUG as follows:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
Call Trace:
nfs_inode_add_request+0x1cc/0x5b8
nfs_setup_write_request+0x1fa/0x1fc
nfs_writepage_setup+0x2d/0x7d
nfs_updatepage+0x8b8/0x936
nfs_write_end+0x61d/0xd45
generic_perform_write+0x19a/0x3f0
nfs_file_write+0x2cc/0x6e5
new_sync_write+0x442/0x560
__vfs_write+0xda/0xef
vfs_write+0x176/0x48b
ksys_write+0x10a/0x1e9
__se_sys_write+0x24/0x29
__x64_sys_write+0x79/0x93
do_syscall_64+0x16d/0x4bb
entry_SYSCALL_64_after_hwframe+0x5c/0xc1
The reason is: generic_error_remove_page set page->mapping to NULL when
nfs server have a fatal error:
nfs_updatepage
nfs_writepage_setup
nfs_setup_write_request
nfs_try_to_update_request // return NULL
nfs_wb_page // return 0
nfs_writepage_locked // return 0
nfs_do_writepage // return 0
nfs_page_async_flush // return 0
nfs_error_is_fatal_on_server
generic_error_remove_page
truncate_inode_page
delete_from_page_cache
__delete_from_page_cache
page_cache_tree_delete
page->mapping = NULL // this is point
nfs_create_request
req->wb_page = page // the page is freed
nfs_inode_add_request
mapping = page_file_mapping(req->wb_page)
return page->mapping
spin_lock(&mapping->private_lock) // mapping is NULL
It is reasonable by reverting the patch "89047634f5ce NFS: Don't
interrupt file writeout due to fatal errors" to fix this bug?
This patch is one patch of patchset [Fix up soft mounts for
NFSv4.x](https://lore.kernel.org/all/20190407175912.23528-1-trond.myklebust…,
the patchset replace custom error reporting mechanism. it seams that we
should merge all the patchset to LTS 4.19, or all patchs should not be
merged. And the "Fixes:" label is not correct, this patch is a
refactoring patch, not for fixing bugs.
From: Tony Lindgren <tony(a)atomide.com>
[ Upstream commit fbb74e56378d8306f214658e3d525a8b3f000c5a ]
We need to check for an active device as otherwise we get warnings
for some mcbsp instances for "Runtime PM usage count underflow!".
Reported-by: Andreas Kemnade <andreas(a)kemnade.info>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Link: https://lore.kernel.org/r/20231030052340.13415-1-tony@atomide.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/ti/omap-mcbsp.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/soc/ti/omap-mcbsp.c b/sound/soc/ti/omap-mcbsp.c
index fdabed5133e83..b399d86f22777 100644
--- a/sound/soc/ti/omap-mcbsp.c
+++ b/sound/soc/ti/omap-mcbsp.c
@@ -74,14 +74,16 @@ static int omap2_mcbsp_set_clks_src(struct omap_mcbsp *mcbsp, u8 fck_src_id)
return 0;
}
- pm_runtime_put_sync(mcbsp->dev);
+ if (mcbsp->active)
+ pm_runtime_put_sync(mcbsp->dev);
r = clk_set_parent(mcbsp->fclk, fck_src);
if (r)
dev_err(mcbsp->dev, "CLKS: could not clk_set_parent() to %s\n",
src);
- pm_runtime_get_sync(mcbsp->dev);
+ if (mcbsp->active)
+ pm_runtime_get_sync(mcbsp->dev);
clk_put(fck_src);
--
2.42.0
#regzbot introduced: 89c290ea758911e660878e26270e084d862c03b0
#regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273
#regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=218124
## Reproducing
1. Boot system to framebuffer console.
2. Run `systemctl suspend`. If undocked without secondary display,
suspend fails. If docked with secondary display, suspend succeeds.
3. Resume from suspend if applicable.
4. System is now in a broken state.
## Testing
- culprit commit is 89c290ea758911e660878e26270e084d862c03b0
- v6.6 fails
- v6.6 with culprit commit reverted does not fail
- Compiled with
<https://gitlab.freedesktop.org/drm/nouveau/uploads/788d7faf22ba2884dcc09d7b…>
## Hardware
- ThinkPad W530 2438-52U
- Dock with Nvidia-connected DVI ports
- Secondary display connected via DVI
- Nvidia Optimus GPU switching system
```console
$ lspci | grep -i vga
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core
processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107GLM [Quadro
K2000M] (rev a1)
```
## Decoded logs from v6.6
- System is not docked and fails to suspend:
<https://gitlab.freedesktop.org/drm/nouveau/uploads/fb8fdf5a6bed1b1491d2544a…>
- System is docked and fails after resume:
<https://gitlab.freedesktop.org/drm/nouveau/uploads/cb3d5ac55c01f663cd80fa00…>