There are regression reports of soft-reset issues caused by the recent changes.
Revert them, as they are an incomplete/incorrect fix [*].
[*] https://lore.kernel.org/linux-usb/ZW8sJoTEKVmDdk5Y@xhacker/
Thinh Nguyen (2):
Revert "usb: dwc3: Soft reset phy on probe for host"
Revert "usb: dwc3: don't reset device side if dwc3 was configured as host-only"
drivers/usb/dwc3/core.c | 39 +--------------------------------------
1 file changed, 1 insertion(+), 38 deletions(-)
base-commit: ab241a0ab5abd70036c3d959146e534a02447d17
--
2.28.0
From: Huang Ying <ying.huang(a)intel.com>
The decoder_populate_targets() helper walks all of the targets in a port
and makes sure they can be looked up in @target_map, where @target_map
is a lookup table from target position to target id (corresponding to a
cxl_dport instance). However, @target_map is only responsible for
conveying the active dport instances as indicated by interleave_ways.
When nr_targets > interleave_ways, decoder_populate_targets() walks off
the end of the valid entries in @target_map. Given that @target_map is
initialized to 0, the dport lookup fails if position 0 is not mapped to
a dport with an id of 0:
cxl_port port3: Failed to populate active decoder targets
cxl_port port3: Failed to add decoder
cxl_port port3: Failed to add decoder3.0
cxl_bus_probe: cxl_port port3: probe: -6
This bug also highlights that when the decoder's ->targets[] array is
written in cxl_port_setup_targets() it is missing a hold of the
target_lock to synchronize against sysfs readers of the target list. A
fix for that is saved for a later patch.
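For illustration only, the overrun described above can be reproduced in a
small userspace sketch. The struct dport type and the find_dport_by_id() and
populate_targets() helpers are hypothetical stand-ins, not the kernel's
cxl_dport or find_dport(); only the choice of loop bound mirrors the change.

```c
#include <stddef.h>

/* Hypothetical stand-in for a downstream port; not the kernel's cxl_dport. */
struct dport {
	int port_id;
};

/* Return the dport whose id matches, or NULL when none does. */
static const struct dport *find_dport_by_id(const struct dport *dports,
					    size_t nr_dports, int id)
{
	for (size_t i = 0; i < nr_dports; i++)
		if (dports[i].port_id == id)
			return &dports[i];
	return NULL;
}

/*
 * Resolve the first 'limit' entries of target_map into targets[].
 * Returns 0 on success, -1 when an entry names no known dport.
 */
static int populate_targets(const struct dport *dports, size_t nr_dports,
			    const int *target_map, size_t limit,
			    const struct dport **targets)
{
	for (size_t i = 0; i < limit; i++) {
		const struct dport *d =
			find_dport_by_id(dports, nr_dports, target_map[i]);

		if (!d)
			return -1;
		targets[i] = d;
	}
	return 0;
}
```

With dport ids {1, 2} and a zero-initialized four-entry map whose only valid
entry is 1, passing a limit of 4 (the nr_targets case) fails on the unused
zero entries, while a limit of 1 (the interleave_ways case) succeeds.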
Fixes: a5c258021689 ("cxl/bus: Populate the target list at decoder create")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: "Huang, Ying" <ying.huang(a)intel.com>
[djbw: rewrite the changelog, find the Fixes: tag]
Co-developed-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
drivers/cxl/core/port.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index b7c93bb18f6e..57495cdc181f 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1644,7 +1644,7 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
return -EINVAL;
write_seqlock(&cxlsd->target_lock);
- for (i = 0; i < cxlsd->nr_targets; i++) {
+ for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
struct cxl_dport *dport = find_dport(port, target_map[i]);
if (!dport) {
In sniff_min_interval_set():
if (val == 0 || val % 2 || val > hdev->sniff_max_interval)
return -EINVAL;
hci_dev_lock(hdev);
hdev->sniff_min_interval = val;
hci_dev_unlock(hdev);
In sniff_max_interval_set():
if (val == 0 || val % 2 || val < hdev->sniff_min_interval)
return -EINVAL;
hci_dev_lock(hdev);
hdev->sniff_max_interval = val;
hci_dev_unlock(hdev);
The atomicity violation occurs due to concurrent execution of set_min and
set_max funcs. Consider a scenario where setmin writes a new, valid 'min'
value, and concurrently, setmax writes a value that is greater than the
old 'min' but smaller than the new 'min'. In this case, setmax might check
against the old 'min' value (before acquiring the lock) but write its
value after the 'min' has been updated by setmin. This leads to a
situation where the 'max' value ends up being smaller than the 'min'
value, which is an inconsistency.
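The interleaving above can be replayed deterministically in a single-threaded
sketch. This is an illustration under assumptions, not kernel code: struct
limits, max_check() and replay_race() are hypothetical names, and the steps
of the two setters are simply executed in the problematic order.

```c
/* Hypothetical stand-in for the two hci_dev fields involved. */
struct limits {
	unsigned long long min;
	unsigned long long max;
};

/* The unlocked check from set_max: validate against the 'min' visible now. */
static int max_check(const struct limits *l, unsigned long long val)
{
	return !(val == 0 || val % 2 || val < l->min);
}

/*
 * Replay: set_max checks, set_min checks and writes, set_max writes.
 * Returns 1 when the final state has max < min, 0 otherwise.
 */
static int replay_race(struct limits *l, unsigned long long new_min,
		       unsigned long long new_max)
{
	/* set_max: check against the old 'min', no lock held yet */
	int max_ok = max_check(l, new_max);

	/* set_min: runs to completion in between */
	if (!(new_min == 0 || new_min % 2 || new_min > l->max))
		l->min = new_min;

	/* set_max: write happens after 'min' has already changed */
	if (max_ok)
		l->max = new_max;

	return l->max < l->min;
}
```

Starting from min = 2 and max = 100, replaying with a new min of 50 and a
new max of 10 passes both checks individually and leaves min = 50, max = 10.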
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug
when analyzing an x86_64 allyesconfig build. Due to the lack of
associated hardware, we could not test the patch at runtime and verified
it only against the code logic.
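As a rough userspace analogue of the suggested change, the sketch below
moves the check inside the critical section. It assumes a pthread mutex in
place of hci_dev_lock(); struct sniff_cfg and the sniff_cfg_*() helpers are
hypothetical names used only for this illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the hci_dev fields and the lock guarding them. */
struct sniff_cfg {
	pthread_mutex_t lock;
	unsigned long long min_interval;
	unsigned long long max_interval;
};

static void sniff_cfg_init(struct sniff_cfg *cfg, unsigned long long min,
			   unsigned long long max)
{
	pthread_mutex_init(&cfg->lock, NULL);
	cfg->min_interval = min;
	cfg->max_interval = max;
}

/* Check and write under one lock hold, unlocking on the error path too. */
static int sniff_cfg_set_min(struct sniff_cfg *cfg, unsigned long long val)
{
	pthread_mutex_lock(&cfg->lock);
	if (val == 0 || val % 2 || val > cfg->max_interval) {
		pthread_mutex_unlock(&cfg->lock);
		return -EINVAL;
	}
	cfg->min_interval = val;
	pthread_mutex_unlock(&cfg->lock);
	return 0;
}

static int sniff_cfg_set_max(struct sniff_cfg *cfg, unsigned long long val)
{
	pthread_mutex_lock(&cfg->lock);
	if (val == 0 || val % 2 || val < cfg->min_interval) {
		pthread_mutex_unlock(&cfg->lock);
		return -EINVAL;
	}
	cfg->max_interval = val;
	pthread_mutex_unlock(&cfg->lock);
	return 0;
}
```

Because each setter now reads the other bound while holding the lock, a
set_max call that follows a successful set_min always sees the new 'min'.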
[1] https://sites.google.com/view/basscheck/
Fixes: 71c3b60ec6d2 ("Bluetooth: Move BR/EDR debugfs file creation ...")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
v2:
* Adjust the format to pass the CI.
---
net/bluetooth/hci_debugfs.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 6b7741f6e95b..f032fdf8f481 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -566,11 +566,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(idle_timeout_fops, idle_timeout_get,
static int sniff_min_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val == 0 || val % 2 || val > hdev->sniff_max_interval)
+
+ hci_dev_lock(hdev);
+ if (val == 0 || val % 2 || val > hdev->sniff_max_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->sniff_min_interval = val;
hci_dev_unlock(hdev);
@@ -594,11 +596,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(sniff_min_interval_fops, sniff_min_interval_get,
static int sniff_max_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val == 0 || val % 2 || val < hdev->sniff_min_interval)
+
+ hci_dev_lock(hdev);
+ if (val == 0 || val % 2 || val < hdev->sniff_min_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->sniff_max_interval = val;
hci_dev_unlock(hdev);
--
2.34.1
In conn_info_min_age_set():
if (val == 0 || val > hdev->conn_info_max_age)
return -EINVAL;
hci_dev_lock(hdev);
hdev->conn_info_min_age = val;
hci_dev_unlock(hdev);
In conn_info_max_age_set():
if (val == 0 || val < hdev->conn_info_min_age)
return -EINVAL;
hci_dev_lock(hdev);
hdev->conn_info_max_age = val;
hci_dev_unlock(hdev);
The atomicity violation occurs due to concurrent execution of set_min and
set_max funcs. Consider a scenario where setmin writes a new, valid 'min'
value, and concurrently, setmax writes a value that is greater than the
old 'min' but smaller than the new 'min'. In this case, setmax might check
against the old 'min' value (before acquiring the lock) but write its
value after the 'min' has been updated by setmin. This leads to a
situation where the 'max' value ends up being smaller than the 'min'
value, which is an inconsistency.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug
when analyzing an x86_64 allyesconfig build. Due to the lack of
associated hardware, we could not test the patch at runtime and verified
it only against the code logic.
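The same locking rule can also be written with a single unlock site, which
avoids repeating the unlock on the error path. This is only a sketch of an
alternative shape, not what the patch does: it assumes a pthread mutex in
place of hci_dev_lock(), and struct age_cfg and the age_cfg_*() helpers are
hypothetical names.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the conn_info min/max age fields and their lock. */
struct age_cfg {
	pthread_mutex_t lock;
	unsigned long long min_age;
	unsigned long long max_age;
};

static void age_cfg_init(struct age_cfg *cfg, unsigned long long min,
			 unsigned long long max)
{
	pthread_mutex_init(&cfg->lock, NULL);
	cfg->min_age = min;
	cfg->max_age = max;
}

/* Validate and store under the lock; one unlock serves both outcomes. */
static int age_cfg_set_min(struct age_cfg *cfg, unsigned long long val)
{
	int err = 0;

	pthread_mutex_lock(&cfg->lock);
	if (val == 0 || val > cfg->max_age)
		err = -EINVAL;
	else
		cfg->min_age = val;
	pthread_mutex_unlock(&cfg->lock);
	return err;
}

static int age_cfg_set_max(struct age_cfg *cfg, unsigned long long val)
{
	int err = 0;

	pthread_mutex_lock(&cfg->lock);
	if (val == 0 || val < cfg->min_age)
		err = -EINVAL;
	else
		cfg->max_age = val;
	pthread_mutex_unlock(&cfg->lock);
	return err;
}
```

Either shape keeps the check and the write in one critical section; the
choice between them is a matter of style.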
[1] https://sites.google.com/view/basscheck/
Fixes: 40ce72b1951c ("Bluetooth: Move common debugfs file creation ...")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
v2:
* Adjust the format to pass the CI.
---
net/bluetooth/hci_debugfs.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 6b7741f6e95b..d4ce2769c939 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -217,11 +217,13 @@ DEFINE_SHOW_ATTRIBUTE(remote_oob);
static int conn_info_min_age_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val == 0 || val > hdev->conn_info_max_age)
+
+ hci_dev_lock(hdev);
+ if (val == 0 || val > hdev->conn_info_max_age) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->conn_info_min_age = val;
hci_dev_unlock(hdev);
@@ -245,11 +247,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(conn_info_min_age_fops, conn_info_min_age_get,
static int conn_info_max_age_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val == 0 || val < hdev->conn_info_min_age)
+
+ hci_dev_lock(hdev);
+ if (val == 0 || val < hdev->conn_info_min_age) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->conn_info_max_age = val;
hci_dev_unlock(hdev);
--
2.34.1
In {conn,adv}_min_interval_set():
if (val < ... || val > ... || val > hdev->le_{conn,adv}_max_interval)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_{conn,adv}_min_interval = val;
hci_dev_unlock(hdev);
In {conn,adv}_max_interval_set():
if (val < ... || val > ... || val < hdev->le_{conn,adv}_min_interval)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_{conn,adv}_max_interval = val;
hci_dev_unlock(hdev);
The atomicity violation occurs due to concurrent execution of set_min and
set_max funcs. Consider a scenario where setmin writes a new, valid 'min'
value, and concurrently, setmax writes a value that is greater than the
old 'min' but smaller than the new 'min'. In this case, setmax might check
against the old 'min' value (before acquiring the lock) but write its
value after the 'min' has been updated by setmin. This leads to a
situation where the 'max' value ends up being smaller than the 'min'
value, which is an inconsistency.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug
when analyzing an x86_64 allyesconfig build. Due to the lack of
associated hardware, we could not test the patch at runtime and verified
it only against the code logic.
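Since the four LE setters differ only in their fixed range and in which
bound they compare against, the locked check can be sketched as one shared
helper. This is a hypothetical userspace illustration: struct le_interval
and le_interval_set() do not exist in the kernel, a pthread mutex stands in
for hci_dev_lock(), and one lock per pair is a simplification of the single
per-device lock. The 0x0006..0x0c80 range is the conn range from the diff.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pairing of one min/max setting with its fixed valid range. */
struct le_interval {
	pthread_mutex_t lock;
	uint64_t lo, hi;	/* fixed range, e.g. 0x0006..0x0c80 for conn */
	uint64_t min, max;	/* current settings */
};

static void le_interval_init(struct le_interval *iv, uint64_t lo, uint64_t hi,
			     uint64_t min, uint64_t max)
{
	pthread_mutex_init(&iv->lock, NULL);
	iv->lo = lo;
	iv->hi = hi;
	iv->min = min;
	iv->max = max;
}

/* Set 'max' when is_max is nonzero, otherwise 'min'; all checks are locked. */
static int le_interval_set(struct le_interval *iv, uint64_t val, int is_max)
{
	int err = 0;

	pthread_mutex_lock(&iv->lock);
	if (val < iv->lo || val > iv->hi)
		err = -EINVAL;			/* outside the fixed range */
	else if (is_max ? val < iv->min : val > iv->max)
		err = -EINVAL;			/* would cross the other bound */
	else if (is_max)
		iv->max = val;
	else
		iv->min = val;
	pthread_mutex_unlock(&iv->lock);
	return err;
}
```

The initial min/max values passed to le_interval_init() are up to the
caller; the helper only guarantees that min <= max is preserved by every
successful call.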
[1] https://sites.google.com/view/basscheck/
Fixes: 3a5c82b78fd2 ("Bluetooth: Move LE debugfs file creation into ...")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
v2:
* Adjust the format to pass the CI.
---
net/bluetooth/hci_debugfs.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 6b7741f6e95b..6fdda807f2cf 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -849,11 +849,13 @@ DEFINE_SHOW_ATTRIBUTE(long_term_keys);
static int conn_min_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val < 0x0006 || val > 0x0c80 || val > hdev->le_conn_max_interval)
+
+ hci_dev_lock(hdev);
+ if (val < 0x0006 || val > 0x0c80 || val > hdev->le_conn_max_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_conn_min_interval = val;
hci_dev_unlock(hdev);
@@ -877,11 +879,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(conn_min_interval_fops, conn_min_interval_get,
static int conn_max_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val < 0x0006 || val > 0x0c80 || val < hdev->le_conn_min_interval)
+
+ hci_dev_lock(hdev);
+ if (val < 0x0006 || val > 0x0c80 || val < hdev->le_conn_min_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_conn_max_interval = val;
hci_dev_unlock(hdev);
@@ -989,11 +993,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(adv_channel_map_fops, adv_channel_map_get,
static int adv_min_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val < 0x0020 || val > 0x4000 || val > hdev->le_adv_max_interval)
+
+ hci_dev_lock(hdev);
+ if (val < 0x0020 || val > 0x4000 || val > hdev->le_adv_max_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_adv_min_interval = val;
hci_dev_unlock(hdev);
@@ -1018,10 +1024,12 @@ static int adv_max_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
- if (val < 0x0020 || val > 0x4000 || val < hdev->le_adv_min_interval)
+ hci_dev_lock(hdev);
+ if (val < 0x0020 || val > 0x4000 || val < hdev->le_adv_min_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_adv_max_interval = val;
hci_dev_unlock(hdev);
--
2.34.1
This is the start of the stable review cycle for the 5.15.145 release.
There are 159 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 22 Dec 2023 16:08:59 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.145-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.145-rc1
Arnd Bergmann <arnd(a)arndb.de>
kasan: disable kasan_non_canonical_hook() for HW tags
Francis Laniel <flaniel(a)linux.microsoft.com>
tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols
Amit Pundir <amit.pundir(a)linaro.org>
Revert "drm/bridge: lt9611uxc: Switch to devm MIPI-DSI helpers"
Amit Pundir <amit.pundir(a)linaro.org>
Revert "drm/bridge: lt9611uxc: Register and attach our DSI device at probe"
Amit Pundir <amit.pundir(a)linaro.org>
Revert "drm/bridge: lt9611uxc: fix the race in the error path"
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: don't update ->op_state as OPLOCK_STATE_NONE on error
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: move setting SMB2_FLAGS_ASYNC_COMMAND and AsyncId
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: release interim response after sending status pending response
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: move oplock handling after unlock parent dir
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: separately allocate ci per dentry
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix possible deadlock in smb2_open
Zongmin Zhou <zhouzongmin(a)kylinos.cn>
ksmbd: prevent memory leak on error return
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: handle malformed smb1 message
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix kernel-doc comment of ksmbd_vfs_kern_path_locked()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: no need to wait for binded connection termination at logoff
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add support for surrogate pair conversion
Kangjing Huang <huangkangjing(a)gmail.com>
ksmbd: fix missing RDMA-capable flag for IPoIB device in ksmbd_rdma_capable_netdev()
Marios Makassikis <mmakassikis(a)freebox.fr>
ksmbd: fix recursive locking in vfs helpers
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix kernel-doc comment of ksmbd_vfs_setxattr()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: reorganize ksmbd_iov_pin_rsp()
Cheng-Han Wu <hank20010209(a)gmail.com>
ksmbd: Remove unused field in ksmbd_user struct
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix potential double free on smb2_read_pipe() error path
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix Null pointer dereferences in ksmbd_update_fstate()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix wrong error response status by using set_smb2_rsp_status()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix race condition between tree conn lookup and disconnect
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix race condition from parallel smb2 lock requests
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix race condition from parallel smb2 logoff requests
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix race condition with fp
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix race condition between session lookup and expire
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: check iov vector index in ksmbd_conn_write()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: return invalid parameter error response if smb2 request is invalid
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix passing freed memory 'aux_payload_buf'
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: remove unneeded mark_inode_dirty in set_info_sec()
Steve French <stfrench(a)microsoft.com>
ksmbd: remove experimental warning
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add missing calling smb2_set_err_rsp() on error
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix slub overflow in ksmbd_decode_ntlmssp_auth_blob()
Yang Li <yang.lee(a)linux.alibaba.com>
ksmbd: Fix one kernel-doc comment
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: reduce descriptor size if remaining bytes is less than request size
Atte Heikkilä <atteh.mailbox(a)gmail.com>
ksmbd: fix `force create mode' and `force directory mode'
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix wrong interim response on compound
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add support for read compound
Yang Yingliang <yangyingliang(a)huawei.com>
ksmbd: switch to use kmemdup_nul() helper
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix out of bounds in init_smb2_rsp_hdr()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: validate session id and tree id in compound request
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: check if a mount point is crossed during path lookup
Wang Ming <machel(a)vivo.com>
ksmbd: Fix unsigned expression compared with zero
Gustavo A. R. Silva <gustavoars(a)kernel.org>
ksmbd: Replace one-element array with flexible-array member
Gustavo A. R. Silva <gustavoars(a)kernel.org>
ksmbd: Use struct_size() helper in ksmbd_negotiate_smb_dialect()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add missing compound request handing in some commands
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix out of bounds read in smb2_sess_setup
Lu Hongfei <luhongfei(a)vivo.com>
ksmbd: Replace the ternary conditional operator with min()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: use kvzalloc instead of kvmalloc
Lu Hongfei <luhongfei(a)vivo.com>
ksmbd: Change the return value of ksmbd_vfs_query_maximal_access to void
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: return a literal instead of 'err' in ksmbd_vfs_kern_path_locked()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: use kzalloc() instead of __GFP_ZERO
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: remove unused ksmbd_tree_conn_share function
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add mnt_want_write to ksmbd vfs functions
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: validate smb request protocol id
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: check the validation of pdu_size in ksmbd_conn_handler_loop
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix posix_acls and acls dereferencing possible ERR_PTR()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix out-of-bound read in parse_lease_state()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix out-of-bound read in deassemble_neg_contexts()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: call putname after using the last component
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix UAF issue from opinfo->conn
Kuan-Ting Chen <h3xrabbit(a)gmail.com>
ksmbd: fix multiple out-of-bounds read during context decoding
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix uninitialized pointer read in smb2_create_link()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix racy issue under cocurrent smb2 tree disconnect
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix racy issue from smb2 close and logoff with multichannel
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: block asynchronous requests when making a delay on session setup
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: destroy expired sessions
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix racy issue from session setup and logoff
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix racy issue from using ->d_parent and ->d_name
Al Viro <viro(a)zeniv.linux.org.uk>
fs: introduce lock_rename_child() helper
David Disseldorp <ddiss(a)suse.de>
ksmbd: remove unused compression negotiate ctx packing
David Disseldorp <ddiss(a)suse.de>
ksmbd: avoid duplicate negotiate ctx offset increments
David Disseldorp <ddiss(a)suse.de>
ksmbd: set NegotiateContextCount once instead of every inc
David Disseldorp <ddiss(a)suse.de>
ksmbd: avoid out of bounds access in decode_preauth_ctxt()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdr
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: delete asynchronous work from list
Tom Rix <trix(a)redhat.com>
ksmbd: remove unused is_char_allowed function
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix wrong signingkey creation when encryption is AES256
Hangyu Hua <hbh25y(a)gmail.com>
ksmbd: fix possible memory leak in smb2_lock()
Jiapeng Chong <jiapeng.chong(a)linux.alibaba.com>
ksmbd: Fix parameter name and comment mismatch
Colin Ian King <colin.i.king(a)gmail.com>
ksmbd: Fix spelling mistake "excceed" -> "exceeded"
Steve French <stfrench(a)microsoft.com>
ksmbd: update Kconfig to note Kerberos support and fix indentation
Dawei Li <set_pte_at(a)outlook.com>
ksmbd: Remove duplicated codes
Dawei Li <set_pte_at(a)outlook.com>
ksmbd: fix typo, syncronous->synchronous
Dawei Li <set_pte_at(a)outlook.com>
ksmbd: Implements sess->rpc_handle_list as xarray
Dawei Li <set_pte_at(a)outlook.com>
ksmbd: Implements sess->ksmbd_chann_list as xarray
Marios Makassikis <mmakassikis(a)freebox.fr>
ksmbd: send proper error response in smb2_tree_connect()
ye xingchen <ye.xingchen(a)zte.com.cn>
ksmbd: Convert to use sysfs_emit()/sysfs_emit_at() APIs
Marios Makassikis <mmakassikis(a)freebox.fr>
ksmbd: Fix resource leak in smb2_lock()
Jeff Layton <jlayton(a)kernel.org>
ksmbd: use F_SETLK when unlocking a file
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: set SMB2_SESSION_FLAG_ENCRYPT_DATA when enforcing data encryption for this share
Gustavo A. R. Silva <gustavoars(a)kernel.org>
ksmbd: replace one-element arrays with flexible-array members
Atte Heikkilä <atteh.mailbox(a)gmail.com>
ksmbd: validate share name from share config response
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: call ib_drain_qp when disconnected
Atte Heikkilä <atteh.mailbox(a)gmail.com>
ksmbd: make utf-8 file name comparison work in __caseless_lookup()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: hide socket error message when ipv6 config is disable
Tom Talpey <tom(a)talpey.com>
ksmbd: reduce server smbdirect max send/receive segment sizes
Tom Talpey <tom(a)talpey.com>
ksmbd: decrease the number of SMB3 smbdirect server SGEs
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: set NTLMSSP_NEGOTIATE_SEAL flag to challenge blob
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix encryption failure issue for session logoff response
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fill sids in SMB_FIND_FILE_POSIX_INFO response
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: set file permission mode to match Samba server posix extension behavior
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: change security id to the one samba used for posix extension
Atte Heikkilä <atteh.mailbox(a)gmail.com>
ksmbd: casefold utf-8 share names and fix ascii lowercase conversion
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: remove generic_fillattr use in smb2_open()
Al Viro <viro(a)zeniv.linux.org.uk>
ksmbd: constify struct path
Al Viro <viro(a)zeniv.linux.org.uk>
ksmbd: don't open-code %pD
Al Viro <viro(a)zeniv.linux.org.uk>
ksmbd: don't open-code file_path()
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: remove unnecessary generic_fillattr in smb2_open
Atte Heikkilä <atteh.mailbox(a)gmail.com>
ksmbd: request update to stale share config
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: use wait_event instead of schedule_timeout()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: remove unused ksmbd_share_configs_cleanup function
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: remove duplicate flag set in smb2_write
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
ksmbd: smbd: Remove useless license text when SPDX-License-Identifier is already used
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: relax the count of sges required
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: fix connection dropped issue
Yang Li <yang.lee(a)linux.alibaba.com>
ksmbd: Fix some kernel-doc comments
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix wrong smbd max read/write size check
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: handle multiple Buffer descriptors
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: change the return value of get_sg_list
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: simplify tracking pending packets
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: introduce read/write credits for RDMA read/write
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: change prototypes of RDMA read/write related functions
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: validate length in smb2_write()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: remove filename in ksmbd_file
Steve French <stfrench(a)microsoft.com>
smb3: fix ksmbd bigendian bug in oplock break, and move its struct to smbfs_common
Jakob Koschel <jakobkoschel(a)gmail.com>
ksmbd: replace usage of found with dedicated list iterator variable
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
ksmbd: Remove a redundant zeroing of memory
Steve French <stfrench(a)microsoft.com>
ksmbd: shorten experimental warning on loading the module
Paulo Alcantara (SUSE) <pc(a)cjr.nz>
ksmbd: store fids as opaque u64 integers
Tobias Klauser <tklauser(a)distanz.ch>
ksmbd: use netif_is_bridge_port
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add support for key exchange
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: validate buffer descriptor structures
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: fix missing client's memory region invalidation
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add smb-direct shutdown
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: change the default maximum read/write, receive size
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: create MR pool
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: smbd: call rdma_accept() under CM handler
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: set 445 port to smbdirect port by default
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: register ksmbd ib client with ib_register_client()
Yang Li <yang.lee(a)linux.alibaba.com>
ksmbd: Fix smb2_get_name() kernel-doc comment
Yang Li <yang.lee(a)linux.alibaba.com>
ksmbd: Delete an invalid argument description in smb2_populate_readdir_entry()
Yang Li <yang.lee(a)linux.alibaba.com>
ksmbd: Fix smb2_set_info_file() kernel-doc comment
Yang Li <yang.lee(a)linux.alibaba.com>
ksmbd: Fix buffer_check_err() kernel-doc comment
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: set both ipv4 and ipv6 in FSCTL_QUERY_NETWORK_INTERFACE_INFO
Marios Makassikis <mmakassikis(a)freebox.fr>
ksmbd: Remove unused fields from ksmbd_file struct definition
Marios Makassikis <mmakassikis(a)freebox.fr>
ksmbd: Remove unused parameter from smb2_get_name()
Hyunchul Lee <hyc.lee(a)gmail.com>
ksmbd: use oid registry functions to decode OIDs
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: change LeaseKey data type to u8 array
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: remove smb2_buf_length in smb2_transform_hdr
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: remove smb2_buf_length in smb2_hdr
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: remove md4 leftovers
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
ksmbd: Remove redundant 'flush_workqueue()' calls
Ralph Boehme <slow(a)samba.org>
ksmdb: use cmd helper variable in smb2_get_ksmbd_tcon()
Ralph Boehme <slow(a)samba.org>
ksmbd: use ksmbd_req_buf_next() in ksmbd_verify_smb_message()
-------------
Diffstat:
Makefile | 4 +-
drivers/gpu/drm/bridge/lontium-lt9611uxc.c | 75 +-
fs/ksmbd/Kconfig | 11 +-
fs/ksmbd/asn1.c | 173 +--
fs/ksmbd/auth.c | 72 +-
fs/ksmbd/auth.h | 3 +-
fs/ksmbd/connection.c | 169 +--
fs/ksmbd/connection.h | 92 +-
fs/ksmbd/ksmbd_netlink.h | 7 +-
fs/ksmbd/ksmbd_work.c | 101 +-
fs/ksmbd/ksmbd_work.h | 40 +-
fs/ksmbd/mgmt/share_config.c | 56 +-
fs/ksmbd/mgmt/share_config.h | 36 +-
fs/ksmbd/mgmt/tree_connect.c | 78 +-
fs/ksmbd/mgmt/tree_connect.h | 15 +-
fs/ksmbd/mgmt/user_config.h | 1 -
fs/ksmbd/mgmt/user_session.c | 180 +--
fs/ksmbd/mgmt/user_session.h | 8 +-
fs/ksmbd/misc.c | 94 +-
fs/ksmbd/misc.h | 6 +-
fs/ksmbd/oplock.c | 256 ++--
fs/ksmbd/oplock.h | 4 -
fs/ksmbd/server.c | 54 +-
fs/ksmbd/smb2misc.c | 4 +-
fs/ksmbd/smb2ops.c | 10 +-
fs/ksmbd/smb2pdu.c | 2047 ++++++++++++++--------------
fs/ksmbd/smb2pdu.h | 83 +-
fs/ksmbd/smb_common.c | 176 ++-
fs/ksmbd/smb_common.h | 20 +-
fs/ksmbd/smbacl.c | 26 +-
fs/ksmbd/smbacl.h | 8 +-
fs/ksmbd/transport_ipc.c | 4 +-
fs/ksmbd/transport_rdma.c | 648 ++++++---
fs/ksmbd/transport_rdma.h | 6 +-
fs/ksmbd/transport_tcp.c | 9 +-
fs/ksmbd/unicode.c | 191 ++-
fs/ksmbd/unicode.h | 3 +-
fs/ksmbd/vfs.c | 677 ++++-----
fs/ksmbd/vfs.h | 56 +-
fs/ksmbd/vfs_cache.c | 72 +-
fs/ksmbd/vfs_cache.h | 26 +-
fs/namei.c | 125 +-
include/linux/kasan.h | 6 +-
include/linux/namei.h | 7 +
kernel/trace/trace_kprobe.c | 74 +
kernel/trace/trace_probe.h | 1 +
mm/kasan/report.c | 4 +-
47 files changed, 3279 insertions(+), 2539 deletions(-)
In {conn,adv}_min_interval_set():
if (val < ... || val > ... || val > hdev->le_{conn,adv}_max_interval)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_{conn,adv}_min_interval = val;
hci_dev_unlock(hdev);
In {conn,adv}_max_interval_set():
if (val < ... || val > ... || val < hdev->le_{conn,adv}_min_interval)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_{conn,adv}_max_interval = val;
hci_dev_unlock(hdev);
The atomicity violation occurs due to concurrent execution of set_min and
set_max funcs which may lead to inconsistent reads and writes of the min
value and the max value. The checks for value validity are ineffective as
the min/max values could change immediately after being checked, raising
the risk of the min value being greater than the max value and causing
invalid settings.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug
when analyzing an x86_64 allyesconfig build. Due to the lack of
associated hardware, we could not test the patch at runtime and verified
it only against the code logic.
[1] https://sites.google.com/view/basscheck/
Fixes: 3a5c82b78fd28 ("Bluetooth: Move LE debugfs file creation into ...")
Cc: stable(a)vger.kernel.org
Reported-by: BassCheck <bass(a)buaa.edu.cn>
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
net/bluetooth/hci_debugfs.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 6b7741f6e95b..6fdda807f2cf 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -849,11 +849,13 @@ DEFINE_SHOW_ATTRIBUTE(long_term_keys);
static int conn_min_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val < 0x0006 || val > 0x0c80 || val > hdev->le_conn_max_interval)
+
+ hci_dev_lock(hdev);
+ if (val < 0x0006 || val > 0x0c80 || val > hdev->le_conn_max_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_conn_min_interval = val;
hci_dev_unlock(hdev);
@@ -877,11 +879,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(conn_min_interval_fops, conn_min_interval_get,
static int conn_max_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val < 0x0006 || val > 0x0c80 || val < hdev->le_conn_min_interval)
+
+ hci_dev_lock(hdev);
+ if (val < 0x0006 || val > 0x0c80 || val < hdev->le_conn_min_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_conn_max_interval = val;
hci_dev_unlock(hdev);
@@ -989,11 +993,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(adv_channel_map_fops, adv_channel_map_get,
static int adv_min_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val < 0x0020 || val > 0x4000 || val > hdev->le_adv_max_interval)
+
+ hci_dev_lock(hdev);
+ if (val < 0x0020 || val > 0x4000 || val > hdev->le_adv_max_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_adv_min_interval = val;
hci_dev_unlock(hdev);
@@ -1018,10 +1024,12 @@ static int adv_max_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
- if (val < 0x0020 || val > 0x4000 || val < hdev->le_adv_min_interval)
+ hci_dev_lock(hdev);
+ if (val < 0x0020 || val > 0x4000 || val < hdev->le_adv_min_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_adv_max_interval = val;
hci_dev_unlock(hdev);
--
2.34.1
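The check-under-lock pattern used by the patch above can be sketched in
userspace as follows. This is only an illustration: struct interval_cfg,
interval_set_min() and interval_set_max() are hypothetical names, and a
pthread mutex stands in for hci_dev_lock(). The range limits are the ones
from the conn_{min,max}_interval hunks.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the two hci_dev fields and their lock. */
struct interval_cfg {
	pthread_mutex_t lock;
	uint64_t min;
	uint64_t max;
};

/* Validate and store the new minimum inside one critical section. */
static int interval_set_min(struct interval_cfg *cfg, uint64_t val)
{
	int ret = 0;

	pthread_mutex_lock(&cfg->lock);
	if (val < 0x0006 || val > 0x0c80 || val > cfg->max)
		ret = -EINVAL;
	else
		cfg->min = val;
	pthread_mutex_unlock(&cfg->lock);

	return ret;
}

/* Same pattern for the maximum: the check sees a stable minimum. */
static int interval_set_max(struct interval_cfg *cfg, uint64_t val)
{
	int ret = 0;

	pthread_mutex_lock(&cfg->lock);
	if (val < 0x0006 || val > 0x0c80 || val < cfg->min)
		ret = -EINVAL;
	else
		cfg->max = val;
	pthread_mutex_unlock(&cfg->lock);

	return ret;
}
```

Using a single exit path with a 'ret' variable is a choice made for this
sketch; the patch instead unlocks explicitly in the error branch.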
A couple of reports pointed at some strange failures happening a bit
randomly since the introduction of sequential page reads support. After
investigation it turned out the most likely reason for these issues was
the fact that sometimes a (longer) read might happen, starting at the
same page that was read previously. This is optimized by the raw NAND
core, by not sending the READ_PAGE command to the NAND device and just
reading out the data in a local cache. When this page is also flagged as
being the starting point for a sequential read, it means the page right
next will be accessed without the right instructions. The NAND chip will
be confused and will not output correct data. To prevent this, handle
the case with a bit of additional logic that postpones the
initialization of the read sequence by one page.
Reported-by: Alexander Shiyan <eagle.alexander923(a)gmail.com>
Closes: https://lore.kernel.org/linux-mtd/CAP1tNvS=NVAm-vfvYWbc3k9Cx9YxMc2uZZkmXk8h…
Reported-by: Måns Rullgård <mans(a)mansr.com>
Closes: https://lore.kernel.org/linux-mtd/yw1xfs6j4k6q.fsf@mansr.com/
Reported-by: Martin Hundebøll <martin(a)geanix.com>
Closes: https://lore.kernel.org/linux-mtd/9d0c42fcde79bfedfe5b05d6a4e9fdef71d3dd52.…
Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
drivers/mtd/nand/raw/nand_base.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 04e80ace4182..1b0a984d181d 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -3478,6 +3478,18 @@ static void rawnand_enable_cont_reads(struct nand_chip *chip, unsigned int page,
rawnand_cap_cont_reads(chip);
}
+static void rawnand_cont_read_skip_first_page(struct nand_chip *chip, unsigned int page)
+{
+ if (!chip->cont_read.ongoing || page != chip->cont_read.first_page)
+ return;
+
+ chip->cont_read.first_page++;
+ if (chip->cont_read.first_page == chip->cont_read.pause_page)
+ chip->cont_read.first_page++;
+ if (chip->cont_read.first_page >= chip->cont_read.last_page)
+ chip->cont_read.ongoing = false;
+}
+
/**
* nand_setup_read_retry - [INTERN] Set the READ RETRY mode
* @chip: NAND chip object
@@ -3652,6 +3664,8 @@ static int nand_do_read_ops(struct nand_chip *chip, loff_t from,
buf += bytes;
max_bitflips = max_t(unsigned int, max_bitflips,
chip->pagecache.bitflips);
+
+ rawnand_cont_read_skip_first_page(chip, page);
}
readlen -= bytes;
--
2.34.1
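To see what the new helper in the patch above does to the read window, the
same logic can be exercised in userspace. This is a minimal sketch: struct
cont_read_state is a hypothetical, trimmed-down copy of the fields the
helper touches, not the kernel's struct nand_chip.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down copy of the continuous-read state. */
struct cont_read_state {
	bool ongoing;
	unsigned int first_page;
	unsigned int pause_page;
	unsigned int last_page;
};

/*
 * Called after 'page' was served from the page cache: if it was meant to
 * start the sequential read, start the sequence one page later instead.
 */
static void cont_read_skip_first_page(struct cont_read_state *cr,
				      unsigned int page)
{
	if (!cr->ongoing || page != cr->first_page)
		return;

	cr->first_page++;
	/* Never start on the pause page, move past it. */
	if (cr->first_page == cr->pause_page)
		cr->first_page++;
	/* Nothing left to read sequentially: drop the optimization. */
	if (cr->first_page >= cr->last_page)
		cr->ongoing = false;
}
```

For a window of pages 10 to 14 the start simply moves to page 11. For a
window of pages 10 to 12 with a pause at page 11, the start moves to page
12 and the sequential read is cancelled.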
Some devices support sequential reads when using the on-die ECC engines,
some others do not. It is a bit hard to know which ones will break other
than experimentally, so in order to avoid such a difficult and painful
task, let's just pretend all devices should avoid using this
optimization when configured like this.
Cc: stable(a)vger.kernel.org
Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads")
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
drivers/mtd/nand/raw/nand_base.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 1b0a984d181d..139fdf3e58c0 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -5170,6 +5170,14 @@ static void rawnand_late_check_supported_ops(struct nand_chip *chip)
/* The supported_op fields should not be set by individual drivers */
WARN_ON_ONCE(chip->controller->supported_op.cont_read);
+ /*
+ * Too many devices do not support sequential cached reads with on-die
+ * ECC correction enabled, so in this case refuse to perform the
+ * automation.
+ */
+ if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE)
+ return;
+
if (!nand_has_exec_op(chip))
return;
--
2.34.1
In conn_info_min_age_set():
if (val == 0 || val > hdev->conn_info_max_age)
return -EINVAL;
hci_dev_lock(hdev);
hdev->conn_info_min_age = val;
hci_dev_unlock(hdev);
In conn_info_max_age_set():
if (val == 0 || val < hdev->conn_info_min_age)
return -EINVAL;
hci_dev_lock(hdev);
hdev->conn_info_max_age = val;
hci_dev_unlock(hdev);
The atomicity violation occurs when the set_min and set_max functions run
concurrently. Each one validates 'val' against the other bound before
taking hci_dev_lock(), so that bound can change between the check and the
store. The validity checks are therefore ineffective, and the min value
can end up greater than the max value, producing invalid settings.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug,
with the kernel configuration allyesconfig for x86_64. Due to the lack of
associated hardware, we cannot test the patch in runtime testing, and just
verify it according to the code logic.
[1] https://sites.google.com/view/basscheck/
Fixes: 40ce72b1951c5 ("Bluetooth: Move common debugfs file creation ...")
Cc: stable(a)vger.kernel.org
Reported-by: BassCheck <bass(a)buaa.edu.cn>
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
net/bluetooth/hci_debugfs.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 6b7741f6e95b..d4ce2769c939 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -217,11 +217,13 @@ DEFINE_SHOW_ATTRIBUTE(remote_oob);
static int conn_info_min_age_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val == 0 || val > hdev->conn_info_max_age)
+
+ hci_dev_lock(hdev);
+ if (val == 0 || val > hdev->conn_info_max_age) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->conn_info_min_age = val;
hci_dev_unlock(hdev);
@@ -245,11 +247,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(conn_info_min_age_fops, conn_info_min_age_get,
static int conn_info_max_age_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val == 0 || val < hdev->conn_info_min_age)
+
+ hci_dev_lock(hdev);
+ if (val == 0 || val < hdev->conn_info_min_age) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->conn_info_max_age = val;
hci_dev_unlock(hdev);
--
2.34.1
In sniff_min_interval_set():
if (val == 0 || val % 2 || val > hdev->sniff_max_interval)
return -EINVAL;
hci_dev_lock(hdev);
hdev->sniff_min_interval = val;
hci_dev_unlock(hdev);
In sniff_max_interval_set():
if (val == 0 || val % 2 || val < hdev->sniff_min_interval)
return -EINVAL;
hci_dev_lock(hdev);
hdev->sniff_max_interval = val;
hci_dev_unlock(hdev);
The atomicity violation occurs when the set_min and set_max functions run
concurrently. Each one validates 'val' against the other bound before
taking hci_dev_lock(), so that bound can change between the check and the
store. The validity checks are therefore ineffective, and the min value
can end up greater than the max value, producing invalid settings.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug,
with the kernel configuration allyesconfig for x86_64. Due to the lack of
associated hardware, we cannot test the patch in runtime testing, and just
verify it according to the code logic.
[1] https://sites.google.com/view/basscheck/
Fixes: 71c3b60ec6d28 ("Bluetooth: Move BR/EDR debugfs file creation ...")
Cc: stable(a)vger.kernel.org
Reported-by: BassCheck <bass(a)buaa.edu.cn>
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
net/bluetooth/hci_debugfs.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 6b7741f6e95b..f032fdf8f481 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -566,11 +566,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(idle_timeout_fops, idle_timeout_get,
static int sniff_min_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val == 0 || val % 2 || val > hdev->sniff_max_interval)
+
+ hci_dev_lock(hdev);
+ if (val == 0 || val % 2 || val > hdev->sniff_max_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->sniff_min_interval = val;
hci_dev_unlock(hdev);
@@ -594,11 +596,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(sniff_min_interval_fops, sniff_min_interval_get,
static int sniff_max_interval_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val == 0 || val % 2 || val < hdev->sniff_min_interval)
+
+ hci_dev_lock(hdev);
+ if (val == 0 || val % 2 || val < hdev->sniff_min_interval) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->sniff_max_interval = val;
hci_dev_unlock(hdev);
--
2.34.1
In min_key_size_set():
if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
In max_key_size_set():
if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
The atomicity violation occurs when the set_min and set_max functions run
concurrently. Each one validates 'val' against the other bound before
taking hci_dev_lock(), so that bound can change between the check and the
store. The validity checks are therefore ineffective, and the min value
can end up greater than the max value, producing invalid settings.
This possible bug is found by an experimental static analysis tool
developed by our team, BassCheck[1]. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations. The above
possible bug is reported when our tool analyzes the source code of
Linux 5.17.
To resolve this issue, it is suggested to encompass the validity checks
within the locked sections in both set_min and set_max funcs. The
modification ensures that the validation of 'val' against the
current min/max values is atomic, thus maintaining the integrity of the
settings. With this patch applied, our tool no longer reports the bug,
with the kernel configuration allyesconfig for x86_64. Due to the lack of
associated hardware, we cannot test the patch in runtime testing, and just
verify it according to the code logic.
[1] https://sites.google.com/view/basscheck/
Fixes: 18f81241b74fb ("Bluetooth: Move {min,max}_key_size debugfs ...")
Cc: stable(a)vger.kernel.org
Reported-by: BassCheck <bass(a)buaa.edu.cn>
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
net/bluetooth/hci_debugfs.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 6b7741f6e95b..3ffbf3f25363 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -1045,11 +1045,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(adv_max_interval_fops, adv_max_interval_get,
static int min_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
+
+ hci_dev_lock(hdev);
+ if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
@@ -1073,11 +1075,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(min_key_size_fops, min_key_size_get,
static int max_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
+
+ hci_dev_lock(hdev);
+ if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
--
2.34.1
From: Guo Ren <guoren(a)linux.alibaba.com>
In COMPAT mode, the STACK_TOP is 0x80000000, but the TASK_SIZE is
0x7ffff000. When the user stack is above 0x7ffff000, it causes a user
segmentation fault. Sometimes this causes a boot failure when the whole
rootfs is rv32:
Freeing unused kernel image (initmem) memory: 2236K
Run /sbin/init as init process
Starting init: /sbin/init exists but couldn't execute it (error -14)
Run /etc/init as init process
...
Cc: stable(a)vger.kernel.org
Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57")
Signed-off-by: Guo Ren <guoren(a)linux.alibaba.com>
Signed-off-by: Guo Ren <guoren(a)kernel.org>
---
arch/riscv/include/asm/pgtable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index ab00235b018f..74ffb2178f54 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -881,7 +881,7 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
#define TASK_SIZE_MIN (PGDIR_SIZE_L3 * PTRS_PER_PGD / 2)
#ifdef CONFIG_COMPAT
-#define TASK_SIZE_32 (_AC(0x80000000, UL) - PAGE_SIZE)
+#define TASK_SIZE_32 (_AC(0x80000000, UL))
#define TASK_SIZE (test_thread_flag(TIF_32BIT) ? \
TASK_SIZE_32 : TASK_SIZE_64)
#else
--
2.40.1
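The arithmetic behind the patch above can be checked with a small
userspace sketch. It assumes 4 KiB pages, and range_ok() is a hypothetical,
simplified stand-in for a user-address range check, not the kernel's
implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_4K      UINT64_C(0x1000)      /* assumption: 4 KiB pages */
#define STACK_TOP_32      UINT64_C(0x80000000)
#define OLD_TASK_SIZE_32  (STACK_TOP_32 - PAGE_SIZE_4K)
#define NEW_TASK_SIZE_32  STACK_TOP_32

/* Simplified check: the whole [addr, addr + size) range must sit below
 * task_size. */
static bool range_ok(uint64_t addr, uint64_t size, uint64_t task_size)
{
	return size <= task_size && addr <= task_size - size;
}
```

With the old limit of 0x7ffff000, an 8-byte access 16 bytes below
STACK_TOP is rejected even though it lies inside the stack. With the new
limit the same access passes.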
The current implementation blocks running operations when plug-out and
plug-in are performed repeatedly: the process gets stuck in
dwc3_thread_interrupt().
Code Flow:
CPU1
->Gadget_start
->dwc3_interrupt
->dwc3_thread_interrupt
->dwc3_process_event_buf
->dwc3_process_event_entry
->dwc3_endpoint_interrupt
->dwc3_ep0_interrupt
->dwc3_ep0_inspect_setup
->dwc3_ep0_stall_and_restart
At this point, if pending_list is not empty, the function takes the next
request from the list and calls dwc3_gadget_giveback(), which unmaps the
request and calls its complete() callback to notify upper layers that it
has completed. Currently the status passed to dwc3_gadget_giveback() is
always -ECONNRESET, whereas it should be -ESHUTDOWN when dwc->connected
is false.
Cc: <stable(a)vger.kernel.org>
Fixes: d742220b3577 ("usb: dwc3: ep0: giveback requests on stall_and_restart")
Signed-off-by: Uttkarsh Aggarwal <quic_uaggarwa(a)quicinc.com>
---
Changes in v2:
Added dwc->connected check to set status to either -ESHUTDOWN or
-ECONNRESET in dwc3_gadget_giveback.
Link to v1:
https://lore.kernel.org/all/20231122091127.3636-1-quic_uaggarwa@quicinc.com…
drivers/usb/dwc3/ep0.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
index b94243237293..816b8eea73d6 100644
--- a/drivers/usb/dwc3/ep0.c
+++ b/drivers/usb/dwc3/ep0.c
@@ -238,7 +238,10 @@ void dwc3_ep0_stall_and_restart(struct dwc3 *dwc)
struct dwc3_request *req;
req = next_request(&dep->pending_list);
- dwc3_gadget_giveback(dep, req, -ECONNRESET);
+ if (!dwc->connected)
+ dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
+ else
+ dwc3_gadget_giveback(dep, req, -ECONNRESET);
}
dwc->eps[0]->trb_enqueue = 0;
--
2.17.1
A process may map only some of the pages in a folio, and might be missed
if it maps the poisoned page but not the head page. Or it might be
unnecessarily hit if it maps the head page, but not the poisoned page.
Fixes: 7af446a841a2 ("HWPOISON, hugetlb: enable error handling path for hugepage")
Cc: stable(a)vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
---
mm/memory-failure.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 6953bda11e6e..82e15baabb48 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1570,7 +1570,7 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
* This check implies we don't kill processes if their pages
* are in the swap cache early. Those are always late kills.
*/
- if (!page_mapped(hpage))
+ if (!page_mapped(p))
return true;
if (PageSwapCache(p)) {
@@ -1621,10 +1621,10 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
try_to_unmap(folio, ttu);
}
- unmap_success = !page_mapped(hpage);
+ unmap_success = !page_mapped(p);
if (!unmap_success)
pr_err("%#lx: failed to unmap page (mapcount=%d)\n",
- pfn, page_mapcount(hpage));
+ pfn, page_mapcount(p));
/*
* try_to_unmap() might put mlocked page in lru cache, so call
--
2.42.0
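The difference between testing the head page and testing the poisoned page
can be modelled with one flag per page of a folio. This is a toy model
with hypothetical names and has nothing to do with how page_mapped() is
implemented. It only shows the two failure modes the changelog above
describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* mapped[i] is true when the process maps page i of the folio. */

/* Old behaviour: only the head page (index 0) is consulted. */
static bool hit_by_head_check(const bool *mapped, size_t poisoned)
{
	(void)poisoned;
	return mapped[0];
}

/* New behaviour: the poisoned page itself is consulted. */
static bool hit_by_poisoned_check(const bool *mapped, size_t poisoned)
{
	return mapped[poisoned];
}
```

A process mapping only page 2 of a folio whose page 2 is poisoned is
missed by the head check. A process mapping only the head page is hit by
it unnecessarily.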
There are two major types of uncorrected error (UC):
- Action Required: The error is detected and the processor has already
consumed the memory. The OS must take action (for example, offline the
failing page or kill the failing thread) to recover from this
uncorrectable error.
- Action Optional: The error is detected outside of processor execution
context. Some data in memory is corrupted, but the data has not been
consumed. The OS may optionally take action to recover from this
uncorrectable error.
On x86 platforms, these two types are easy to distinguish based on the
MCA bank. On arm64 platforms, however, the memory failure flags for all
UCs whose severity is GHES_SEV_RECOVERABLE are currently set to 0, i.e.
Action Optional.
If UC is detected by a background scrubber, it is obviously an Action
Optional error. For other errors, we should conservatively regard them
as Action Required.
cper_sec_mem_err::error_type identifies the type of error that occurred
if CPER_MEM_VALID_ERROR_TYPE is set. So, set memory failure flags as 0
for Scrub Uncorrected Error (type 14). Otherwise, set memory failure
flags as MF_ACTION_REQUIRED.
Signed-off-by: Shuai Xue <xueshuai(a)linux.alibaba.com>
---
drivers/acpi/apei/ghes.c | 10 ++++++++--
include/linux/cper.h | 3 +++
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 80ad530583c9..6c03059cbfc6 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -474,8 +474,14 @@ static bool ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata,
if (sec_sev == GHES_SEV_CORRECTED &&
(gdata->flags & CPER_SEC_ERROR_THRESHOLD_EXCEEDED))
flags = MF_SOFT_OFFLINE;
- if (sev == GHES_SEV_RECOVERABLE && sec_sev == GHES_SEV_RECOVERABLE)
- flags = 0;
+ if (sev == GHES_SEV_RECOVERABLE && sec_sev == GHES_SEV_RECOVERABLE) {
+ if (mem_err->validation_bits & CPER_MEM_VALID_ERROR_TYPE)
+ flags = mem_err->error_type == CPER_MEM_SCRUB_UC ?
+ 0 :
+ MF_ACTION_REQUIRED;
+ else
+ flags = MF_ACTION_REQUIRED;
+ }
if (flags != -1)
return ghes_do_memory_failure(mem_err->physical_addr, flags);
diff --git a/include/linux/cper.h b/include/linux/cper.h
index eacb7dd7b3af..b77ab7636614 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -235,6 +235,9 @@ enum {
#define CPER_MEM_VALID_BANK_ADDRESS 0x100000
#define CPER_MEM_VALID_CHIP_ID 0x200000
+#define CPER_MEM_SCRUB_CE 13
+#define CPER_MEM_SCRUB_UC 14
+
#define CPER_MEM_EXT_ROW_MASK 0x3
#define CPER_MEM_EXT_ROW_SHIFT 16
--
2.20.1.9.gb50a0d7
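The flag selection added by the patch above reduces to one small decision,
sketched here in userspace. The SKETCH_* constants are illustrative
placeholders rather than values copied from kernel headers. The only
exception is the error type 14 for Scrub Uncorrected Error, which the
changelog states.

```c
#include <stdint.h>

/* Placeholder values for illustration only. */
#define SKETCH_VALID_ERROR_TYPE    (UINT64_C(1) << 0)
#define SKETCH_MF_ACTION_REQUIRED  1
/* Given in the changelog: Scrub Uncorrected Error is type 14. */
#define SKETCH_SCRUB_UC            14

/*
 * Flags for a recoverable UC: Action Optional (0) only when the record
 * carries a valid error type and that type is a scrubber-detected UC.
 * Everything else is conservatively treated as Action Required.
 */
static int recoverable_mf_flags(uint64_t validation_bits, uint8_t error_type)
{
	if ((validation_bits & SKETCH_VALID_ERROR_TYPE) &&
	    error_type == SKETCH_SCRUB_UC)
		return 0;

	return SKETCH_MF_ACTION_REQUIRED;
}
```

Note that a record with error type 14 but without the validity bit set
still gets Action Required, matching the else branch in the diff.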
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Commit 834449872105 ("sc16is7xx: Fix for multi-channel stall") stopped
sc16is7xx_port_irq() from looping while there were still interrupts to
serve. It simply changed the do {} while(1) loop to a do {} while(0)
loop, which makes the loop itself obsolete.
Clean up the code by removing this obsolete do {} while(0) loop.
Fixes: 834449872105 ("sc16is7xx: Fix for multi-channel stall")
Cc: stable(a)vger.kernel.org
Suggested-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
drivers/tty/serial/sc16is7xx.c | 81 ++++++++++++++++------------------
1 file changed, 39 insertions(+), 42 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index ced2446909a2..44a11c89c949 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -725,58 +725,55 @@ static void sc16is7xx_update_mlines(struct sc16is7xx_one *one)
static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno)
{
bool rc = true;
+ unsigned int iir, rxlen;
struct uart_port *port = &s->p[portno].port;
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
mutex_lock(&one->efr_lock);
- do {
- unsigned int iir, rxlen;
+ iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG);
+ if (iir & SC16IS7XX_IIR_NO_INT_BIT) {
+ rc = false;
+ goto out_port_irq;
+ }
- iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG);
- if (iir & SC16IS7XX_IIR_NO_INT_BIT) {
- rc = false;
- goto out_port_irq;
- }
+ iir &= SC16IS7XX_IIR_ID_MASK;
- iir &= SC16IS7XX_IIR_ID_MASK;
+ switch (iir) {
+ case SC16IS7XX_IIR_RDI_SRC:
+ case SC16IS7XX_IIR_RLSE_SRC:
+ case SC16IS7XX_IIR_RTOI_SRC:
+ case SC16IS7XX_IIR_XOFFI_SRC:
+ rxlen = sc16is7xx_port_read(port, SC16IS7XX_RXLVL_REG);
- switch (iir) {
- case SC16IS7XX_IIR_RDI_SRC:
- case SC16IS7XX_IIR_RLSE_SRC:
- case SC16IS7XX_IIR_RTOI_SRC:
- case SC16IS7XX_IIR_XOFFI_SRC:
- rxlen = sc16is7xx_port_read(port, SC16IS7XX_RXLVL_REG);
+ /*
+ * There is a silicon bug that makes the chip report a
+ * time-out interrupt but no data in the FIFO. This is
+ * described in errata section 18.1.4.
+ *
+ * When this happens, read one byte from the FIFO to
+ * clear the interrupt.
+ */
+ if (iir == SC16IS7XX_IIR_RTOI_SRC && !rxlen)
+ rxlen = 1;
- /*
- * There is a silicon bug that makes the chip report a
- * time-out interrupt but no data in the FIFO. This is
- * described in errata section 18.1.4.
- *
- * When this happens, read one byte from the FIFO to
- * clear the interrupt.
- */
- if (iir == SC16IS7XX_IIR_RTOI_SRC && !rxlen)
- rxlen = 1;
-
- if (rxlen)
- sc16is7xx_handle_rx(port, rxlen, iir);
- break;
+ if (rxlen)
+ sc16is7xx_handle_rx(port, rxlen, iir);
+ break;
/* CTSRTS interrupt comes only when CTS goes inactive */
- case SC16IS7XX_IIR_CTSRTS_SRC:
- case SC16IS7XX_IIR_MSI_SRC:
- sc16is7xx_update_mlines(one);
- break;
- case SC16IS7XX_IIR_THRI_SRC:
- sc16is7xx_handle_tx(port);
- break;
- default:
- dev_err_ratelimited(port->dev,
- "ttySC%i: Unexpected interrupt: %x",
- port->line, iir);
- break;
- }
- } while (0);
+ case SC16IS7XX_IIR_CTSRTS_SRC:
+ case SC16IS7XX_IIR_MSI_SRC:
+ sc16is7xx_update_mlines(one);
+ break;
+ case SC16IS7XX_IIR_THRI_SRC:
+ sc16is7xx_handle_tx(port);
+ break;
+ default:
+ dev_err_ratelimited(port->dev,
+ "ttySC%i: Unexpected interrupt: %x",
+ port->line, iir);
+ break;
+ }
out_port_irq:
mutex_unlock(&one->efr_lock);
--
2.39.2
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
The original comment is confusing because it implies that variants other
than the SC16IS762 support SPI modes other than SPI_MODE_0.
Extract from datasheet:
The SC16IS762 differs from the SC16IS752 in that it supports SPI clock
speeds up to 15 Mbit/s instead of the 4 Mbit/s supported by the
SC16IS752... In all other aspects, the SC16IS762 is functionally and
electrically the same as the SC16IS752.
The same is also true of the SC16IS760 variant versus the SC16IS740 and
SC16IS750 variants.
For all variants, only SPI mode 0 is supported.
Change comment and abort probing if the specified SPI mode is not
SPI_MODE_0.
Fixes: 2c837a8a8f9f ("sc16is7xx: spi interface is added")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
drivers/tty/serial/sc16is7xx.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 17b90f971f96..798fa115b28a 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -1733,7 +1733,10 @@ static int sc16is7xx_spi_probe(struct spi_device *spi)
/* Setup SPI bus */
spi->bits_per_word = 8;
- /* only supports mode 0 on SC16IS762 */
+ /* For all variants, only mode 0 is supported */
+ if ((spi->mode & SPI_MODE_X_MASK) != SPI_MODE_0)
+ return dev_err_probe(&spi->dev, -EINVAL, "Unsupported SPI mode\n");
+
spi->mode = spi->mode ? : SPI_MODE_0;
spi->max_speed_hz = spi->max_speed_hz ? : 15000000;
ret = spi_setup(spi);
--
2.39.2
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
If an error occurs during probing, the sc16is7xx_lines bitfield may be left
in a state that doesn't represent the correct state of lines allocation.
For example, in a system with two SC16 devices, if an error occurs only
during probing of channel (port) B of the second device, sc16is7xx_lines
final state will be 00001011b instead of the expected 00000011b.
This is caused in part because of the "i--" in the for/loop located in
the out_ports: error path.
Fix this by checking the return value of uart_add_one_port() and set line
allocation bit only if this was successful. This allows the refactor of
the obfuscated for(i--...) loop in the error path, and properly call
uart_remove_one_port() only when needed, and properly unset line allocation
bits.
Also use same mechanism in remove() when calling uart_remove_one_port().
Fixes: c64349722d14 ("sc16is7xx: support multiple devices")
Cc: stable(a)vger.kernel.org
Cc: Yury Norov <yury.norov(a)gmail.com>
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
drivers/tty/serial/sc16is7xx.c | 44 ++++++++++++++--------------------
1 file changed, 18 insertions(+), 26 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index e40e4a99277e..17b90f971f96 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -407,19 +407,6 @@ static void sc16is7xx_port_update(struct uart_port *port, u8 reg,
regmap_update_bits(one->regmap, reg, mask, val);
}
-static int sc16is7xx_alloc_line(void)
-{
- int i;
-
- BUILD_BUG_ON(SC16IS7XX_MAX_DEVS > BITS_PER_LONG);
-
- for (i = 0; i < SC16IS7XX_MAX_DEVS; i++)
- if (!test_and_set_bit(i, &sc16is7xx_lines))
- break;
-
- return i;
-}
-
static void sc16is7xx_power(struct uart_port *port, int on)
{
sc16is7xx_port_update(port, SC16IS7XX_IER_REG,
@@ -1550,6 +1537,13 @@ static int sc16is7xx_probe(struct device *dev,
SC16IS7XX_IOCONTROL_SRESET_BIT);
for (i = 0; i < devtype->nr_uart; ++i) {
+ s->p[i].port.line = find_first_zero_bit(&sc16is7xx_lines,
+ SC16IS7XX_MAX_DEVS);
+ if (s->p[i].port.line >= SC16IS7XX_MAX_DEVS) {
+ ret = -ERANGE;
+ goto out_ports;
+ }
+
/* Initialize port data */
s->p[i].port.dev = dev;
s->p[i].port.irq = irq;
@@ -1569,14 +1563,8 @@ static int sc16is7xx_probe(struct device *dev,
s->p[i].port.rs485_supported = sc16is7xx_rs485_supported;
s->p[i].port.ops = &sc16is7xx_ops;
s->p[i].old_mctrl = 0;
- s->p[i].port.line = sc16is7xx_alloc_line();
s->p[i].regmap = regmaps[i];
- if (s->p[i].port.line >= SC16IS7XX_MAX_DEVS) {
- ret = -ENOMEM;
- goto out_ports;
- }
-
mutex_init(&s->p[i].efr_lock);
ret = uart_get_rs485_mode(&s->p[i].port);
@@ -1594,8 +1582,13 @@ static int sc16is7xx_probe(struct device *dev,
kthread_init_work(&s->p[i].tx_work, sc16is7xx_tx_proc);
kthread_init_work(&s->p[i].reg_work, sc16is7xx_reg_proc);
kthread_init_delayed_work(&s->p[i].ms_work, sc16is7xx_ms_proc);
+
/* Register port */
- uart_add_one_port(&sc16is7xx_uart, &s->p[i].port);
+ ret = uart_add_one_port(&sc16is7xx_uart, &s->p[i].port);
+ if (ret)
+ goto out_ports;
+
+ set_bit(s->p[i].port.line, &sc16is7xx_lines);
/* Enable EFR */
sc16is7xx_port_write(&s->p[i].port, SC16IS7XX_LCR_REG,
@@ -1653,10 +1646,9 @@ static int sc16is7xx_probe(struct device *dev,
#endif
out_ports:
- for (i--; i >= 0; i--) {
- uart_remove_one_port(&sc16is7xx_uart, &s->p[i].port);
- clear_bit(s->p[i].port.line, &sc16is7xx_lines);
- }
+ for (i = 0; i < devtype->nr_uart; i++)
+ if (test_and_clear_bit(s->p[i].port.line, &sc16is7xx_lines))
+ uart_remove_one_port(&sc16is7xx_uart, &s->p[i].port);
kthread_stop(s->kworker_task);
@@ -1678,8 +1670,8 @@ static void sc16is7xx_remove(struct device *dev)
for (i = 0; i < s->devtype->nr_uart; i++) {
kthread_cancel_delayed_work_sync(&s->p[i].ms_work);
- uart_remove_one_port(&sc16is7xx_uart, &s->p[i].port);
- clear_bit(s->p[i].port.line, &sc16is7xx_lines);
+ if (test_and_clear_bit(s->p[i].port.line, &sc16is7xx_lines))
+ uart_remove_one_port(&sc16is7xx_uart, &s->p[i].port);
sc16is7xx_power(&s->p[i].port, 0);
}
--
2.39.2
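The idea of the patch above, claiming a line bit only after the port was
registered successfully, can be sketched in userspace. All names here are
hypothetical: 'lines' models sc16is7xx_lines, and register_ok[] simulates
whether registering each port succeeds. The cleanup is simplified for the
sketch: it walks back over the ports claimed so far instead of using the
patch's test_and_clear_bit() loop.

```c
#include <stdbool.h>

#define MAX_LINES 8u

static unsigned long lines;	/* bit n set means line n is in use */

/* Return the lowest free line, or MAX_LINES when none is left. */
static unsigned int first_free_line(void)
{
	unsigned int i;

	for (i = 0; i < MAX_LINES; i++)
		if (!(lines & (1UL << i)))
			break;

	return i;
}

/* Probe one device with nr_ports ports; returns 0 or -1. */
static int probe_device(unsigned int *line, const bool *register_ok,
			unsigned int nr_ports)
{
	unsigned int i;

	for (i = 0; i < nr_ports; i++) {
		line[i] = first_free_line();
		if (line[i] >= MAX_LINES || !register_ok[i])
			goto out_ports;
		/* Claim the line only once registration succeeded. */
		lines |= 1UL << line[i];
	}

	return 0;

out_ports:
	/* Release only the lines this probe actually claimed. */
	while (i-- > 0)
		lines &= ~(1UL << line[i]);

	return -1;
}
```

After one successful two-port probe and a second probe whose port B
fails, 'lines' stays at 00000011b, which is the expected state named in
the changelog.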
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
If an error occurs during probing, the sc16is7xx_lines bitfield may be left
in a state that doesn't represent the correct state of lines allocation.
For example, in a system with two SC16 devices, if an error occurs only
during probing of channel (port) B of the second device, sc16is7xx_lines
final state will be 00001011b instead of the expected 00000011b.
This is caused in part because of the "i--" in the for/loop located in
the out_ports: error path.
Fix this by checking the return value of uart_add_one_port() and set line
allocation bit only if this was successful. This allows the refactor of
the obfuscated for(i--...) loop in the error path, and properly call
uart_remove_one_port() only when needed, and properly unset line allocation
bits.
Also use same mechanism in remove() when calling uart_remove_one_port().
Fixes: c64349722d14 ("sc16is7xx: support multiple devices")
Cc: stable(a)vger.kernel.org
Cc: Yury Norov <yury.norov(a)gmail.com>
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
There is already a patch by Yury Norov <yury.norov(a)gmail.com> to simplify
sc16is7xx_alloc_line():
https://lore.kernel.org/all/20231212022749.625238-30-yury.norov@gmail.com/
Since my patch gets rid of sc16is7xx_alloc_line() entirely, it would make
Yury's patch unnecessary.
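For illustration only, the bitfield states quoted in the commit message can
be reproduced with a small userspace model. This is a sketch, not driver
code: probe_old(), probe_new(), first_zero_bit(), MAX_LINES, NR_UART and
LINE_NONE are names assumed for the example, a plain unsigned long stands in
for sc16is7xx_lines, and each device is assumed to have two ports.

```c
#define MAX_LINES 8	/* assumed size of the line bitmap in this model */
#define NR_UART   2	/* assumed number of ports per device */
#define LINE_NONE (-1)	/* model-only marker for a slot that never got a line */

/* Return the index of the lowest clear bit, or MAX_LINES if none is free. */
static int first_zero_bit(unsigned long map)
{
	int i;

	for (i = 0; i < MAX_LINES; i++)
		if (!(map & (1UL << i)))
			return i;
	return MAX_LINES;
}

/*
 * Old scheme: the bit is reserved when the line is allocated, and the
 * error path unwinds with "i--", which skips the port that failed.
 * fail_port is the port index whose setup fails, or -1 for no failure.
 */
static int probe_old(unsigned long *lines, int fail_port)
{
	int line[NR_UART];
	int i;

	for (i = 0; i < NR_UART; i++) {
		line[i] = first_zero_bit(*lines);
		if (line[i] >= MAX_LINES)
			goto out_ports;
		*lines |= 1UL << line[i];	/* reserved before registration */
		if (i == fail_port)
			goto out_ports;		/* setup error for this port */
	}
	return 0;

out_ports:
	for (i--; i >= 0; i--)
		*lines &= ~(1UL << line[i]);
	return -1;
}

/*
 * New scheme: the bit is set only after the port is registered, and the
 * error path visits every port, releasing only lines whose bit is set.
 */
static int probe_new(unsigned long *lines, int fail_port)
{
	int line[NR_UART];
	int i;

	for (i = 0; i < NR_UART; i++)
		line[i] = LINE_NONE;

	for (i = 0; i < NR_UART; i++) {
		line[i] = first_zero_bit(*lines);
		if (line[i] >= MAX_LINES || i == fail_port)
			goto out_ports;
		*lines |= 1UL << line[i];	/* set once registration succeeded */
	}
	return 0;

out_ports:
	for (i = 0; i < NR_UART; i++) {
		if (line[i] < 0 || line[i] >= MAX_LINES)
			continue;
		if (*lines & (1UL << line[i]))	/* test_and_clear_bit() stand-in */
			*lines &= ~(1UL << line[i]);
	}
	return -1;
}
```

Starting from an empty bitmap, a successful first probe followed by a second
probe that fails on port B (fail_port = 1) leaves 00001011b with probe_old()
and 00000011b with probe_new(). The LINE_NONE marker is a choice of this
model and says nothing about how the driver initializes port.line.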
---
drivers/tty/serial/sc16is7xx.c | 44 ++++++++++++++--------------------
1 file changed, 18 insertions(+), 26 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index b585663c1e6e..b92fd01cfeec 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -407,19 +407,6 @@ static void sc16is7xx_port_update(struct uart_port *port, u8 reg,
regmap_update_bits(one->regmap, reg, mask, val);
}
-static int sc16is7xx_alloc_line(void)
-{
- int i;
-
- BUILD_BUG_ON(SC16IS7XX_MAX_DEVS > BITS_PER_LONG);
-
- for (i = 0; i < SC16IS7XX_MAX_DEVS; i++)
- if (!test_and_set_bit(i, &sc16is7xx_lines))
- break;
-
- return i;
-}
-
static void sc16is7xx_power(struct uart_port *port, int on)
{
sc16is7xx_port_update(port, SC16IS7XX_IER_REG,
@@ -1550,6 +1537,13 @@ static int sc16is7xx_probe(struct device *dev,
SC16IS7XX_IOCONTROL_SRESET_BIT);
for (i = 0; i < devtype->nr_uart; ++i) {
+ s->p[i].port.line = find_first_zero_bit(&sc16is7xx_lines,
+ SC16IS7XX_MAX_DEVS);
+ if (s->p[i].port.line >= SC16IS7XX_MAX_DEVS) {
+ ret = -ERANGE;
+ goto out_ports;
+ }
+
/* Initialize port data */
s->p[i].port.dev = dev;
s->p[i].port.irq = irq;
@@ -1569,14 +1563,8 @@ static int sc16is7xx_probe(struct device *dev,
s->p[i].port.rs485_supported = sc16is7xx_rs485_supported;
s->p[i].port.ops = &sc16is7xx_ops;
s->p[i].old_mctrl = 0;
- s->p[i].port.line = sc16is7xx_alloc_line();
s->p[i].regmap = regmaps[i];
- if (s->p[i].port.line >= SC16IS7XX_MAX_DEVS) {
- ret = -ENOMEM;
- goto out_ports;
- }
-
mutex_init(&s->p[i].efr_lock);
ret = uart_get_rs485_mode(&s->p[i].port);
@@ -1594,8 +1582,13 @@ static int sc16is7xx_probe(struct device *dev,
kthread_init_work(&s->p[i].tx_work, sc16is7xx_tx_proc);
kthread_init_work(&s->p[i].reg_work, sc16is7xx_reg_proc);
kthread_init_delayed_work(&s->p[i].ms_work, sc16is7xx_ms_proc);
+
/* Register port */
- uart_add_one_port(&sc16is7xx_uart, &s->p[i].port);
+ ret = uart_add_one_port(&sc16is7xx_uart, &s->p[i].port);
+ if (ret)
+ goto out_ports;
+
+ set_bit(s->p[i].port.line, &sc16is7xx_lines);
/* Enable EFR */
sc16is7xx_port_write(&s->p[i].port, SC16IS7XX_LCR_REG,
@@ -1653,10 +1646,9 @@ static int sc16is7xx_probe(struct device *dev,
#endif
out_ports:
- for (i--; i >= 0; i--) {
- uart_remove_one_port(&sc16is7xx_uart, &s->p[i].port);
- clear_bit(s->p[i].port.line, &sc16is7xx_lines);
- }
+ for (i = 0; i < devtype->nr_uart; i++)
+ if (test_and_clear_bit(s->p[i].port.line, &sc16is7xx_lines))
+ uart_remove_one_port(&sc16is7xx_uart, &s->p[i].port);
kthread_stop(s->kworker_task);
@@ -1683,8 +1675,8 @@ static void sc16is7xx_remove(struct device *dev)
for (i = 0; i < s->devtype->nr_uart; i++) {
kthread_cancel_delayed_work_sync(&s->p[i].ms_work);
- uart_remove_one_port(&sc16is7xx_uart, &s->p[i].port);
- clear_bit(s->p[i].port.line, &sc16is7xx_lines);
+ if (test_and_clear_bit(s->p[i].port.line, &sc16is7xx_lines))
+ uart_remove_one_port(&sc16is7xx_uart, &s->p[i].port);
sc16is7xx_power(&s->p[i].port, 0);
}
--
2.39.2
Hi Sasha,
Thank you for picking this patch.
On Thu, 21 Dec 2023 09:53:01 -0500 Sasha Levin <sashal(a)kernel.org> wrote:
> This is a note to let you know that I've just added the patch titled
>
> mm/damon/core: use number of passed access sampling as a timer
>
> to the 6.6-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> mm-damon-core-use-number-of-passed-access-sampling-a.patch
> and it can be found in the queue-6.6 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit dfda8d41e94ee98ebd2ad78c7cb49625a8c92474
> Author: SeongJae Park <sj(a)kernel.org>
> Date: Thu Sep 14 02:15:23 2023 +0000
>
> mm/damon/core: use number of passed access sampling as a timer
>
> [ Upstream commit 4472edf63d6630e6cf65e205b4fc8c3c94d0afe5 ]
>
> DAMON sleeps for the sampling interval after each sampling, and checks whether
> the aggregation interval and the ops update interval have passed using
> ktime_get_coarse_ts64() and baseline timestamps for the intervals. That
> design is for making the operations occur at deterministic timing
> regardless of the time spent on each piece of work. However, it turned out
> to be not that useful, and to incur not-that-intuitive results.
>
> After all, timer functions, and especially sleep functions that DAMON uses
> to wait for specific timing, are not necessarily strictly accurate. It is
> legal design, so no problem. However, depending on such inaccuracies, the
> nr_accesses can be larger than aggregation interval divided by sampling
> interval. For example, with the default setting (5 ms sampling interval
> and 100 ms aggregation interval) we frequently show regions having
> nr_accesses larger than 20. Also, if the execution of a DAMOS scheme
> takes a long time, next aggregation could happen before enough number of
> samples are collected. This is not what usual users would intuitively
> expect.
>
> Since access check sampling is the smallest unit work of DAMON, using the
> number of passed sampling intervals as the DAMON-internal timer can easily
> avoid these problems. That is, convert aggregation and ops update
> intervals to numbers of sampling intervals that need to be passed before
> those operations be executed, count the number of passed sampling
> intervals, and invoke the operations as soon as the specific amount of
> sampling intervals passed. Make the change.
>
> Note that this could make a behavioral change to settings that use
> intervals that are not aligned to the sampling interval. For example, if the
> sampling interval is 5 ms and the aggregation interval is 12 ms, DAMON
> effectively uses 15 ms as its aggregation interval, because it checks
> whether the aggregation interval has passed after sleeping the sampling interval.
> This change will make DAMON to effectively use 10 ms as aggregation
> interval, since it uses 'aggregation interval / sampling interval *
> sampling interval' as the effective aggregation interval, and we don't use
> floating point types. Usual users would have used aligned intervals, so
> this behavioral change is not expected to make any meaningful impact, so
> just make this change.
>
> Link: https://lkml.kernel.org/r/20230914021523.60649-1-sj@kernel.org
> Signed-off-by: SeongJae Park <sj(a)kernel.org>
> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
> Stable-dep-of: 6376a8245956 ("mm/damon/core: make damon_start() waits until kdamond_fn() starts")
I think adding this patch to 6.6.y is no problem. Nonetheless, Greg notified
me that the patch that depends on this ("mm/damon/core: make damon_start() waits
until kdamond_fn() starts") cannot be cleanly applied to 6.1.y and 6.6.y[1,2], and
hence I sent conflict-resolved patches for those[3,4] before.
Hence this patch might not really be required, but I also think adding it now
might help merging future fixes. I don't have a strong opinion on whether this
patch should be added to 6.6.y or not. I hope you will select whichever way is
better for minimizing stable kernel maintenance overhead.
[1] https://lore.kernel.org/stable/2023121849-ambulance-violate-e5b2@gregkh/
[2] https://lore.kernel.org/stable/2023121843-pension-tactile-868b@gregkh/
[3] https://lore.kernel.org/r/20231218175939.99263-1-sj@kernel.org
[4] https://lore.kernel.org/r/20231218175959.99278-1-sj@kernel.org
Thanks,
SJ
[...]
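As a side note on the interval arithmetic described in the quoted commit
message, the 12 ms / 5 ms example can be worked through with plain integer
math. This is only a sketch: the function names are hypothetical, intervals
are taken in microseconds, sample_us is assumed to be non-zero, and the
"timestamp" variant assumes ideal sleeps of exactly one sampling interval.

```c
/* Number of whole sampling intervals that fit in the aggregation interval. */
static unsigned long samples_per_aggregation(unsigned long sample_us,
					     unsigned long aggr_us)
{
	return aggr_us / sample_us;	/* integer division, rounds down */
}

/* Effective aggregation interval when counting passed sampling intervals. */
static unsigned long effective_aggr_counted(unsigned long sample_us,
					    unsigned long aggr_us)
{
	return samples_per_aggregation(sample_us, aggr_us) * sample_us;
}

/*
 * Effective aggregation interval when a timestamp is checked after each
 * sleep: the check first succeeds at the next multiple of the sampling
 * interval, so the value rounds up.
 */
static unsigned long effective_aggr_timestamp(unsigned long sample_us,
					      unsigned long aggr_us)
{
	return (aggr_us + sample_us - 1) / sample_us * sample_us;
}
```

With a 5000 us sampling interval and a 12000 us aggregation interval, the
counted variant yields 10000 us and the timestamp variant 15000 us, matching
the 10 ms and 15 ms figures in the commit message. With the default 5 ms and
100 ms settings both yield 100 ms, that is, 20 samples per aggregation.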
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The synth_event_gen_test module can be built in, if someone wants to run
the tests at boot up and not have to load them.
The synth_event_gen_test_init() function creates and enables the synthetic
events and runs its tests.
The synth_event_gen_test_exit() disables the events it created and
destroys the events.
If the module is builtin, the events are never disabled. The issue is, the
events should be disabled after the tests are run. This could be an issue
if the rest of the boot up tests are enabled, as they expect the events to
be in a known state before testing. That known state happens to be
disabled.
When CONFIG_SYNTH_EVENT_GEN_TEST=y and CONFIG_EVENT_TRACE_STARTUP_TEST=y
a warning will trigger:
Running tests on trace events:
Testing event create_synth_test:
Enabled event during self test!
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1 at kernel/trace/trace_events.c:4150 event_trace_self_tests+0x1c2/0x480
Modules linked in:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc2-test-00031-gb803d7c664d5-dirty #276
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:event_trace_self_tests+0x1c2/0x480
Code: bb e8 a2 ab 5d fc 48 8d 7b 48 e8 f9 3d 99 fc 48 8b 73 48 40 f6 c6 01 0f 84 d6 fe ff ff 48 c7 c7 20 b6 ad bb e8 7f ab 5d fc 90 <0f> 0b 90 48 89 df e8 d3 3d 99 fc 48 8b 1b 4c 39 f3 0f 85 2c ff ff
RSP: 0000:ffffc9000001fdc0 EFLAGS: 00010246
RAX: 0000000000000029 RBX: ffff88810399ca80 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffffb9f19478 RDI: ffff88823c734e64
RBP: ffff88810399f300 R08: 0000000000000000 R09: fffffbfff79eb32a
R10: ffffffffbcf59957 R11: 0000000000000001 R12: ffff888104068090
R13: ffffffffbc89f0a0 R14: ffffffffbc8a0f08 R15: 0000000000000078
FS: 0000000000000000(0000) GS:ffff88823c700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001f6282001 CR4: 0000000000170ef0
Call Trace:
<TASK>
? __warn+0xa5/0x200
? event_trace_self_tests+0x1c2/0x480
? report_bug+0x1f6/0x220
? handle_bug+0x6f/0x90
? exc_invalid_op+0x17/0x50
? asm_exc_invalid_op+0x1a/0x20
? tracer_preempt_on+0x78/0x1c0
? event_trace_self_tests+0x1c2/0x480
? __pfx_event_trace_self_tests_init+0x10/0x10
event_trace_self_tests_init+0x27/0xe0
do_one_initcall+0xd6/0x3c0
? __pfx_do_one_initcall+0x10/0x10
? kasan_set_track+0x25/0x30
? rcu_is_watching+0x38/0x60
kernel_init_freeable+0x324/0x450
? __pfx_kernel_init+0x10/0x10
kernel_init+0x1f/0x1e0
? _raw_spin_unlock_irq+0x33/0x50
ret_from_fork+0x34/0x60
? __pfx_kernel_init+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
This is because the synth_event_gen_test_init() left the synthetic events
that it created enabled. By having it disable them after testing, the
other selftests will run fine.
Link: https://lore.kernel.org/linux-trace-kernel/20231220111525.2f0f49b0@gandalf.…
Cc: stable(a)vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Fixes: 9fe41efaca084 ("tracing: Add synth event generation test module")
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Reported-by: Alexander Graf <graf(a)amazon.com>
Tested-by: Alexander Graf <graf(a)amazon.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/synth_event_gen_test.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/kernel/trace/synth_event_gen_test.c b/kernel/trace/synth_event_gen_test.c
index 8dfe85499d4a..354c2117be43 100644
--- a/kernel/trace/synth_event_gen_test.c
+++ b/kernel/trace/synth_event_gen_test.c
@@ -477,6 +477,17 @@ static int __init synth_event_gen_test_init(void)
ret = test_trace_synth_event();
WARN_ON(ret);
+
+ /* Disable when done */
+ trace_array_set_clr_event(gen_synth_test->tr,
+ "synthetic",
+ "gen_synth_test", false);
+ trace_array_set_clr_event(empty_synth_test->tr,
+ "synthetic",
+ "empty_synth_test", false);
+ trace_array_set_clr_event(create_synth_test->tr,
+ "synthetic",
+ "create_synth_test", false);
out:
return ret;
}
--
2.42.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Dongliang reported:
I found that in the latest version, the nodes of tracefs have been
changed to dynamically created.
This has caused me to encounter a problem where the gid I specified in
the mounting parameters cannot apply to all files, as in the following
situation:
/data/tmp/events # mount | grep tracefs
tracefs on /data/tmp type tracefs (rw,seclabel,relatime,gid=3012)
gid 3012 = readtracefs
/data/tmp # ls -lh
total 0
-r--r----- 1 root readtracefs 0 1970-01-01 08:00 README
-r--r----- 1 root readtracefs 0 1970-01-01 08:00 available_events
ums9621_1h10:/data/tmp/events # ls -lh
total 0
drwxr-xr-x 2 root root 0 2023-12-19 00:56 alarmtimer
drwxr-xr-x 2 root root 0 2023-12-19 00:56 asoc
It will prevent certain applications from accessing tracefs properly, I
try to avoid this issue by making the following modifications.
To fix this, have the files created default to taking the ownership of
the parent dentry unless the ownership was previously set by the user.
Link: https://lore.kernel.org/linux-trace-kernel/1703063706-30539-1-git-send-emai…
Link: https://lore.kernel.org/linux-trace-kernel/20231220105017.1489d790@gandalf.…
Cc: stable(a)vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Hongyu Jin <hongyu.jin(a)unisoc.com>
Fixes: 28e12c09f5aa0 ("eventfs: Save ownership and mode")
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Reported-by: Dongliang Cui <cuidongliang390(a)gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
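For readers skimming the diff below, the ownership rule from the changelog
can be modelled in a few lines of userspace C. This is a hedged sketch only:
struct owner, struct saved_attr, resolve_owner() and the SAVE_UID/SAVE_GID
flag values are hypothetical stand-ins for the inode, struct eventfs_attr
and the EVENTFS_SAVE_* bits, and the caller is assumed to pass a non-NULL
attr.

```c
#define SAVE_UID (1u << 0)	/* hypothetical flag values for this model */
#define SAVE_GID (1u << 1)

struct owner {
	unsigned int uid;
	unsigned int gid;
};

struct saved_attr {
	unsigned int flags;	/* which of uid/gid the user set explicitly */
	unsigned int uid;
	unsigned int gid;
};

/*
 * Pick the owner of a newly created entry: an explicitly saved value
 * wins, otherwise the value is inherited from the parent.
 */
static struct owner resolve_owner(const struct saved_attr *attr,
				  struct owner parent)
{
	struct owner o = parent;	/* default: inherit from the parent */

	if (attr->flags & SAVE_UID)
		o.uid = attr->uid;
	if (attr->flags & SAVE_GID)
		o.gid = attr->gid;
	return o;
}
```

In this model, a parent owned by uid 0 and gid 3012 with no saved attributes
gives a child owned by 0:3012, which is the behaviour the reporter expected
from the gid= mount option.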
fs/tracefs/event_inode.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 43e237864a42..2ccc849a5bda 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -148,7 +148,8 @@ static const struct file_operations eventfs_file_operations = {
.release = eventfs_release,
};
-static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, umode_t mode)
+static void update_inode_attr(struct dentry *dentry, struct inode *inode,
+ struct eventfs_attr *attr, umode_t mode)
{
if (!attr) {
inode->i_mode = mode;
@@ -162,9 +163,13 @@ static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, um
if (attr->mode & EVENTFS_SAVE_UID)
inode->i_uid = attr->uid;
+ else
+ inode->i_uid = d_inode(dentry->d_parent)->i_uid;
if (attr->mode & EVENTFS_SAVE_GID)
inode->i_gid = attr->gid;
+ else
+ inode->i_gid = d_inode(dentry->d_parent)->i_gid;
}
/**
@@ -206,7 +211,7 @@ static struct dentry *create_file(const char *name, umode_t mode,
return eventfs_failed_creating(dentry);
/* If the user updated the directory's attributes, use them */
- update_inode_attr(inode, attr, mode);
+ update_inode_attr(dentry, inode, attr, mode);
inode->i_op = &eventfs_file_inode_operations;
inode->i_fop = fop;
@@ -242,7 +247,8 @@ static struct dentry *create_dir(struct eventfs_inode *ei, struct dentry *parent
return eventfs_failed_creating(dentry);
/* If the user updated the directory's attributes, use them */
- update_inode_attr(inode, &ei->attr, S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
+ update_inode_attr(dentry, inode, &ei->attr,
+ S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
inode->i_op = &eventfs_root_dir_inode_operations;
inode->i_fop = &eventfs_file_operations;
--
2.42.0
Dear Stable,
> Lee pointed out issue found by syzkaller [0] hitting BUG in prog array
> map poke update in prog_array_map_poke_run function due to error value
> returned from bpf_arch_text_poke function.
>
> There's race window where bpf_arch_text_poke can fail due to missing
> bpf program kallsym symbols, which is accounted for with check for
> -EINVAL in that BUG_ON call.
>
> The problem is that in such case we won't update the tail call jump
> and cause imbalance for the next tail call update check which will
> fail with -EBUSY in bpf_arch_text_poke.
>
> I'm hitting following race during the program load:
>
> CPU 0 CPU 1
>
> bpf_prog_load
> bpf_check
> do_misc_fixups
> prog_array_map_poke_track
>
> map_update_elem
> bpf_fd_array_map_update_elem
> prog_array_map_poke_run
>
> bpf_arch_text_poke returns -EINVAL
>
> bpf_prog_kallsyms_add
>
> After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
> poke update fails on expected jump instruction check in bpf_arch_text_poke
> with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
>
> Similar race exists on the program unload.
>
> Fixing this by moving the update to bpf_arch_poke_desc_update function which
> makes sure we call __bpf_arch_text_poke that skips the bpf address check.
>
> Each architecture has slightly different approach wrt looking up bpf address
> in bpf_arch_text_poke, so instead of splitting the function or adding new
> 'checkip' argument in previous version, it seems best to move the whole
> map_poke_run update as arch specific code.
>
> [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
>
> Cc: Lee Jones <lee(a)kernel.org>
> Cc: Maciej Fijalkowski <maciej.fijalkowski(a)intel.com>
> Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> Reported-by: syzbot+97a4fe20470e9bc30810(a)syzkaller.appspotmail.com
> Acked-by: Yonghong Song <yonghong.song(a)linux.dev>
> Signed-off-by: Jiri Olsa <jolsa(a)kernel.org>
> ---
> arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> include/linux/bpf.h | 3 ++
> kernel/bpf/arraymap.c | 58 +++++++------------------------------
> 3 files changed, 59 insertions(+), 48 deletions(-)
Please could we have this backported?
Guided by the Fixes: tag.
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 8c10d9abc239..e89e415aa743 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -3025,3 +3025,49 @@ void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp
> #endif
> WARN(1, "verification of programs using bpf_throw should have failed\n");
> }
> +
> +void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
> + struct bpf_prog *new, struct bpf_prog *old)
> +{
> + u8 *old_addr, *new_addr, *old_bypass_addr;
> + int ret;
> +
> + old_bypass_addr = old ? NULL : poke->bypass_addr;
> + old_addr = old ? (u8 *)old->bpf_func + poke->adj_off : NULL;
> + new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL;
> +
> + /*
> + * On program loading or teardown, the program's kallsym entry
> + * might not be in place, so we use __bpf_arch_text_poke to skip
> + * the kallsyms check.
> + */
> + if (new) {
> + ret = __bpf_arch_text_poke(poke->tailcall_target,
> + BPF_MOD_JUMP,
> + old_addr, new_addr);
> + BUG_ON(ret < 0);
> + if (!old) {
> + ret = __bpf_arch_text_poke(poke->tailcall_bypass,
> + BPF_MOD_JUMP,
> + poke->bypass_addr,
> + NULL);
> + BUG_ON(ret < 0);
> + }
> + } else {
> + ret = __bpf_arch_text_poke(poke->tailcall_bypass,
> + BPF_MOD_JUMP,
> + old_bypass_addr,
> + poke->bypass_addr);
> + BUG_ON(ret < 0);
> + /* let other CPUs finish the execution of program
> + * so that it will not possible to expose them
> + * to invalid nop, stack unwind, nop state
> + */
> + if (!ret)
> + synchronize_rcu();
> + ret = __bpf_arch_text_poke(poke->tailcall_target,
> + BPF_MOD_JUMP,
> + old_addr, NULL);
> + BUG_ON(ret < 0);
> + }
> +}
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 6762dac3ef76..cff5bb08820e 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -3175,6 +3175,9 @@ enum bpf_text_poke_type {
> int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
> void *addr1, void *addr2);
>
> +void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
> + struct bpf_prog *new, struct bpf_prog *old);
> +
> void *bpf_arch_text_copy(void *dst, void *src, size_t len);
> int bpf_arch_text_invalidate(void *dst, size_t len);
>
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 2058e89b5ddd..c85ff9162a5c 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -1012,11 +1012,16 @@ static void prog_array_map_poke_untrack(struct bpf_map *map,
> mutex_unlock(&aux->poke_mutex);
> }
>
> +void __weak bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
> + struct bpf_prog *new, struct bpf_prog *old)
> +{
> + WARN_ON_ONCE(1);
> +}
> +
> static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
> struct bpf_prog *old,
> struct bpf_prog *new)
> {
> - u8 *old_addr, *new_addr, *old_bypass_addr;
> struct prog_poke_elem *elem;
> struct bpf_array_aux *aux;
>
> @@ -1025,7 +1030,7 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
>
> list_for_each_entry(elem, &aux->poke_progs, list) {
> struct bpf_jit_poke_descriptor *poke;
> - int i, ret;
> + int i;
>
> for (i = 0; i < elem->aux->size_poke_tab; i++) {
> poke = &elem->aux->poke_tab[i];
> @@ -1044,21 +1049,10 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
> * activated, so tail call updates can arrive from here
> * while JIT is still finishing its final fixup for
> * non-activated poke entries.
> - * 3) On program teardown, the program's kallsym entry gets
> - * removed out of RCU callback, but we can only untrack
> - * from sleepable context, therefore bpf_arch_text_poke()
> - * might not see that this is in BPF text section and
> - * bails out with -EINVAL. As these are unreachable since
> - * RCU grace period already passed, we simply skip them.
> - * 4) Also programs reaching refcount of zero while patching
> + * 3) Also programs reaching refcount of zero while patching
> * is in progress is okay since we're protected under
> * poke_mutex and untrack the programs before the JIT
> - * buffer is freed. When we're still in the middle of
> - * patching and suddenly kallsyms entry of the program
> - * gets evicted, we just skip the rest which is fine due
> - * to point 3).
> - * 5) Any other error happening below from bpf_arch_text_poke()
> - * is a unexpected bug.
> + * buffer is freed.
> */
> if (!READ_ONCE(poke->tailcall_target_stable))
> continue;
> @@ -1068,39 +1062,7 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
> poke->tail_call.key != key)
> continue;
>
> - old_bypass_addr = old ? NULL : poke->bypass_addr;
> - old_addr = old ? (u8 *)old->bpf_func + poke->adj_off : NULL;
> - new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL;
> -
> - if (new) {
> - ret = bpf_arch_text_poke(poke->tailcall_target,
> - BPF_MOD_JUMP,
> - old_addr, new_addr);
> - BUG_ON(ret < 0 && ret != -EINVAL);
> - if (!old) {
> - ret = bpf_arch_text_poke(poke->tailcall_bypass,
> - BPF_MOD_JUMP,
> - poke->bypass_addr,
> - NULL);
> - BUG_ON(ret < 0 && ret != -EINVAL);
> - }
> - } else {
> - ret = bpf_arch_text_poke(poke->tailcall_bypass,
> - BPF_MOD_JUMP,
> - old_bypass_addr,
> - poke->bypass_addr);
> - BUG_ON(ret < 0 && ret != -EINVAL);
> - /* let other CPUs finish the execution of program
> - * so that it will not possible to expose them
> - * to invalid nop, stack unwind, nop state
> - */
> - if (!ret)
> - synchronize_rcu();
> - ret = bpf_arch_text_poke(poke->tailcall_target,
> - BPF_MOD_JUMP,
> - old_addr, NULL);
> - BUG_ON(ret < 0 && ret != -EINVAL);
> - }
> + bpf_arch_poke_desc_update(poke, new, old);
> }
> }
> }
> --
> 2.43.0
>
--
Lee Jones [李琼斯]
Syzkaller reports "memory leak in p9pdu_readf" in 5.10 stable releases.
I've attached reproducers in Bugzilla [1].
The problem has been fixed by the following patch which can be applied
to the 5.10 branch.
The fix is already present in all stable branches starting from 5.15.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=218235
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
On 32-bit systems, we'll lose the top bits of index because arithmetic
will be performed in unsigned long instead of unsigned long long. This
affects files over 4GB in size.
Fixes: 6100e34b2526 ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages")
Cc: stable(a)vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
---
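The truncation described above can be shown in isolation. The sketch below
is a userspace model, not kernel code: uint32_t stands in for a 32-bit
unsigned long, int64_t for loff_t, PAGE_SHIFT is assumed to be 12, and the
function names are hypothetical. The alignment mask from the real
expression is deliberately left out so that only the shift is modelled.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Shift performed in 32 bits: bits above bit 31 are lost before widening. */
static int64_t offset_truncated(uint32_t index)
{
	return (int64_t)(uint32_t)(index << PAGE_SHIFT);
}

/* Widen first, then shift: the full byte offset is preserved. */
static int64_t offset_widened(uint32_t index)
{
	return (int64_t)index << PAGE_SHIFT;
}
```

Page index 0x100000 corresponds to a byte offset of exactly 4 GiB. The
32-bit shift wraps to 0, while the widened shift gives 4294967296, so any
page at or beyond the 4 GiB mark is affected.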
mm/memory-failure.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 82e15baabb48..455093f73a70 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1704,7 +1704,7 @@ static void unmap_and_kill(struct list_head *to_kill, unsigned long pfn,
* mapping being torn down is communicated in siginfo, see
* kill_proc()
*/
- loff_t start = (index << PAGE_SHIFT) & ~(size - 1);
+ loff_t start = ((loff_t)index << PAGE_SHIFT) & ~(size - 1);
unmap_mapping_range(mapping, start, size, 0);
}
--
2.42.0
On Wed, Dec 20, 2023 at 09:53:29PM +0000, Vitaly Rodionov wrote:
> commit 99bf5b0baac941176a6a3d5cef7705b29808de34 upstream
>
> Please backport to 6.2 and 6.3
6.2 and 6.3 are long end-of-life, look at the front page of kernel.org to
see the active kernel versions that we support.
> Ubuntu 22.04.3 LTS, is released with the Linux kernel 6.2, and we need to
> backport this patch to prevent regression for HW with 2 Cirrus Logic CS42L42 codecs.
Then work with Ubuntu, they are the only ones that can support this old
and obsolete kernel, not us, thankfully!
good luck!
greg k-h
GCC seems to incorrectly fail to evaluate skb_ext_total_length() at
compile time under certain conditions.
The issue even occurs if all values in skb_ext_type_len[] are "0",
ruling out the possibility of an actual overflow.
As the patch has been in mainline since v6.6 without triggering the
problem it seems to be a very uncommon occurrence.
As the issue only occurs when -fno-tree-loop-im is specified as part of
CFLAGS_GCOV, disable the BUILD_BUG_ON() only when building with coverage
reporting enabled.
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312171924.4FozI5FG-lkp@intel.com/
Suggested-by: Arnd Bergmann <arnd(a)arndb.de>
Link: https://lore.kernel.org/lkml/487cfd35-fe68-416f-9bfd-6bb417f98304@app.fastm…
Fixes: 5d21d0a65b57 ("net: generalize calculation of skb extensions length")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
net/core/skbuff.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 83af8aaeb893..94cc40a6f797 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4825,7 +4825,9 @@ static __always_inline unsigned int skb_ext_total_length(void)
static void skb_extensions_init(void)
{
BUILD_BUG_ON(SKB_EXT_NUM >= 8);
+#if !IS_ENABLED(CONFIG_KCOV_INSTRUMENT_ALL)
BUILD_BUG_ON(skb_ext_total_length() > 255);
+#endif
skbuff_ext_cache = kmem_cache_create("skbuff_ext_cache",
SKB_EXT_ALIGN_VALUE * skb_ext_total_length(),
---
base-commit: ceb6a6f023fd3e8b07761ed900352ef574010bcb
change-id: 20231218-net-skbuff-build-bug-4a7c1103d0a6
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hi all,
please include b65ba0c362be665192381cc59e3ac3ef6f0dd1e1 also on the
stable-trees up to v5.10 (I think v5.13 was the first fixed tree).
Serial gadget on AM335X is also affected, breaks with NULL pointer
references and needs this patch. Here is the patch for the v4.19
tree, cherry picked and manually applied from original commit
b65ba0c362be665192381cc59e3ac3ef6f0dd1e1:
From 483d904168b08cf1497c73516c432bde9ae94055 Mon Sep 17 00:00:00 2001
From: Thomas Petazzoni <thomas.petazzoni(a)bootlin.com>
Date: Fri, 28 May 2021 16:04:46 +0200
Subject: [PATCH] usb: musb: fix MUSB_QUIRK_B_DISCONNECT_99 handling
In commit 92af4fc6ec33 ("usb: musb: Fix suspend with devices
connected for a64"), the logic to support the
MUSB_QUIRK_B_DISCONNECT_99 quirk was modified to only conditionally
schedule the musb->irq_work delayed work.
This commit badly breaks ECM Gadget on AM335X. Indeed, with this
commit, one can observe massive packet loss:
$ ping 192.168.0.100
...
15 packets transmitted, 3 received, 80% packet loss, time 14316ms
Reverting this commit brings back a properly functioning ECM
Gadget. An analysis of the commit seems to indicate that a mistake was
made: the previous code was not falling through into the
MUSB_QUIRK_B_INVALID_VBUS_91, but now it is, unless the condition is
taken.
Changing the logic to be as it was before the problematic commit *and*
only conditionally scheduling musb->irq_work resolves the regression:
$ ping 192.168.0.100
...
64 packets transmitted, 64 received, 0% packet loss, time 64475ms
Fixes: 92af4fc6ec33 ("usb: musb: Fix suspend with devices connected for a64")
Cc: stable(a)vger.kernel.org
Tested-by: Alexandre Belloni <alexandre.belloni(a)bootlin.com>
Tested-by: Drew Fustini <drew(a)beagleboard.org>
Acked-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni(a)bootlin.com>
Link: https://lore.kernel.org/r/20210528140446.278076-1-thomas.petazzoni@bootlin.…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
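The control-flow difference discussed in the changelog can be reduced to a
small model. This is a sketch under assumptions: the enum, struct trace and
both function names are hypothetical, and a single "retries" argument stands
in for the driver's real condition on quirk_retries and flush_irq_work.

```c
enum quirk {
	QUIRK_B_DISCONNECT_99,
	QUIRK_B_INVALID_VBUS_91,
};

struct trace {
	int scheduled_work;	/* the delayed work was scheduled */
	int entered_vbus_91;	/* the VBUS_91 case body was executed */
};

/* Shape after the problematic commit: break only inside the if. */
static struct trace check_session_broken(enum quirk q, int retries)
{
	struct trace t = { 0, 0 };

	switch (q) {
	case QUIRK_B_DISCONNECT_99:
		if (retries) {
			t.scheduled_work = 1;
			break;
		}
		/* fall through */
	case QUIRK_B_INVALID_VBUS_91:
		t.entered_vbus_91 = 1;
		break;
	}
	return t;
}

/* Shape after this fix: conditional scheduling, unconditional break. */
static struct trace check_session_fixed(enum quirk q, int retries)
{
	struct trace t = { 0, 0 };

	switch (q) {
	case QUIRK_B_DISCONNECT_99:
		if (retries)
			t.scheduled_work = 1;
		break;
	case QUIRK_B_INVALID_VBUS_91:
		t.entered_vbus_91 = 1;
		break;
	}
	return t;
}
```

With retries equal to 0, the broken shape runs the VBUS_91 body for the
DISCONNECT_99 quirk, while the fixed shape does not. With retries non-zero
both shapes schedule the work and leave the switch.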
drivers/usb/musb/musb_core.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
index 2a874058dff1..4d2de9ce03f9 100644
--- a/drivers/usb/musb/musb_core.c
+++ b/drivers/usb/musb/musb_core.c
@@ -1873,9 +1873,8 @@ static void musb_pm_runtime_check_session(struct musb *musb)
schedule_delayed_work(&musb->irq_work,
msecs_to_jiffies(1000));
musb->quirk_retries--;
- break;
}
- /* fall through */
+ break;
case MUSB_QUIRK_B_INVALID_VBUS_91:
if (musb->quirk_retries && !musb->flush_irq_work) {
musb_dbg(musb,
--
2.30.2
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Commit 834449872105 ("sc16is7xx: Fix for multi-channel stall") stopped
sc16is7xx_port_irq() from looping multiple times when there were still
interrupts to serve. It simply changed the do {} while(1) loop to a
do {} while(0) loop, which makes the loop itself now obsolete.
Clean the code by removing this obsolete do {} while(0) loop.
Fixes: 834449872105 ("sc16is7xx: Fix for multi-channel stall")
Cc: stable(a)vger.kernel.org
Suggested-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
drivers/tty/serial/sc16is7xx.c | 85 ++++++++++++++++------------------
1 file changed, 41 insertions(+), 44 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index b92fd01cfeec..b2d0f6d307bd 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -724,58 +724,55 @@ static void sc16is7xx_update_mlines(struct sc16is7xx_one *one)
static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno)
{
bool rc = true;
+ unsigned int iir, rxlen;
struct uart_port *port = &s->p[portno].port;
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
mutex_lock(&one->efr_lock);
- do {
- unsigned int iir, rxlen;
+ iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG);
+ if (iir & SC16IS7XX_IIR_NO_INT_BIT) {
+ rc = false;
+ goto out_port_irq;
+ }
- iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG);
- if (iir & SC16IS7XX_IIR_NO_INT_BIT) {
- rc = false;
- goto out_port_irq;
- }
+ iir &= SC16IS7XX_IIR_ID_MASK;
- iir &= SC16IS7XX_IIR_ID_MASK;
-
- switch (iir) {
- case SC16IS7XX_IIR_RDI_SRC:
- case SC16IS7XX_IIR_RLSE_SRC:
- case SC16IS7XX_IIR_RTOI_SRC:
- case SC16IS7XX_IIR_XOFFI_SRC:
- rxlen = sc16is7xx_port_read(port, SC16IS7XX_RXLVL_REG);
-
- /*
- * There is a silicon bug that makes the chip report a
- * time-out interrupt but no data in the FIFO. This is
- * described in errata section 18.1.4.
- *
- * When this happens, read one byte from the FIFO to
- * clear the interrupt.
- */
- if (iir == SC16IS7XX_IIR_RTOI_SRC && !rxlen)
- rxlen = 1;
-
- if (rxlen)
- sc16is7xx_handle_rx(port, rxlen, iir);
- break;
+ switch (iir) {
+ case SC16IS7XX_IIR_RDI_SRC:
+ case SC16IS7XX_IIR_RLSE_SRC:
+ case SC16IS7XX_IIR_RTOI_SRC:
+ case SC16IS7XX_IIR_XOFFI_SRC:
+ rxlen = sc16is7xx_port_read(port, SC16IS7XX_RXLVL_REG);
+
+ /*
+ * There is a silicon bug that makes the chip report a
+ * time-out interrupt but no data in the FIFO. This is
+ * described in errata section 18.1.4.
+ *
+ * When this happens, read one byte from the FIFO to
+ * clear the interrupt.
+ */
+ if (iir == SC16IS7XX_IIR_RTOI_SRC && !rxlen)
+ rxlen = 1;
+
+ if (rxlen)
+ sc16is7xx_handle_rx(port, rxlen, iir);
+ break;
/* CTSRTS interrupt comes only when CTS goes inactive */
- case SC16IS7XX_IIR_CTSRTS_SRC:
- case SC16IS7XX_IIR_MSI_SRC:
- sc16is7xx_update_mlines(one);
- break;
- case SC16IS7XX_IIR_THRI_SRC:
- sc16is7xx_handle_tx(port);
- break;
- default:
- dev_err_ratelimited(port->dev,
- "ttySC%i: Unexpected interrupt: %x",
- port->line, iir);
- break;
- }
- } while (0);
+ case SC16IS7XX_IIR_CTSRTS_SRC:
+ case SC16IS7XX_IIR_MSI_SRC:
+ sc16is7xx_update_mlines(one);
+ break;
+ case SC16IS7XX_IIR_THRI_SRC:
+ sc16is7xx_handle_tx(port);
+ break;
+ default:
+ dev_err_ratelimited(port->dev,
+ "ttySC%i: Unexpected interrupt: %x",
+ port->line, iir);
+ break;
+ }
out_port_irq:
mutex_unlock(&one->efr_lock);
--
2.39.2
commit 99bf5b0baac941176a6a3d5cef7705b29808de34 upstream
Please backport to 6.2 and 6.3.
Ubuntu 22.04.3 LTS is released with Linux kernel 6.2, and we need to
backport this patch to prevent a regression on hardware with two Cirrus
Logic CS42L42 codecs.
This patch went into the 6.4 release.
The quilt patch titled
Subject: mm/memory-failure: cast index to loff_t before shifting it
has been removed from the -mm tree. Its filename was
mm-memory-failure-cast-index-to-loff_t-before-shifting-it.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: mm/memory-failure: cast index to loff_t before shifting it
Date: Mon, 18 Dec 2023 13:58:37 +0000
On 32-bit systems, we'll lose the top bits of index because arithmetic
will be performed in unsigned long instead of unsigned long long. This
affects files over 4GB in size.
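The truncation can be reproduced in user space. The following is a minimal sketch, not kernel code: it assumes a 32-bit unsigned long by modelling the index as uint32_t and loff_t as int64_t, and the names start_truncated(), start_widened() and DEMO_PAGE_SHIFT are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12

/* Models the old code: the shift happens in 32 bits, then is widened. */
static int64_t start_truncated(uint32_t index, int64_t size)
{
	assert(size > 0 && (size & (size - 1)) == 0); /* power of two */
	return (int64_t)(uint32_t)(index << DEMO_PAGE_SHIFT) & ~(size - 1);
}

/* Models the fix: widen to 64 bits first, then shift. */
static int64_t start_widened(uint32_t index, int64_t size)
{
	assert(size > 0 && (size & (size - 1)) == 0); /* power of two */
	return ((int64_t)index << DEMO_PAGE_SHIFT) & ~(size - 1);
}
```

For a page index of 0x100000 (the page at the 4 GiB mark), the first helper loses the top bit of the shifted value, while the second keeps the full 64-bit offset. For small indexes the two agree, which is why the bug only shows up on large files.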
Link: https://lkml.kernel.org/r/20231218135837.3310403-4-willy@infradead.org
Fixes: 6100e34b2526 ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/memory-failure.c~mm-memory-failure-cast-index-to-loff_t-before-shifting-it
+++ a/mm/memory-failure.c
@@ -1704,7 +1704,7 @@ static void unmap_and_kill(struct list_h
* mapping being torn down is communicated in siginfo, see
* kill_proc()
*/
- loff_t start = (index << PAGE_SHIFT) & ~(size - 1);
+ loff_t start = ((loff_t)index << PAGE_SHIFT) & ~(size - 1);
unmap_mapping_range(mapping, start, size, 0);
}
_
Patches currently in -mm which might be from willy(a)infradead.org are
buffer-return-bool-from-grow_dev_folio.patch
buffer-calculate-block-number-inside-folio_init_buffers.patch
buffer-fix-grow_buffers-for-block-size-page_size.patch
buffer-cast-block-to-loff_t-before-shifting-it.patch
buffer-fix-various-functions-for-block-size-page_size.patch
buffer-handle-large-folios-in-__block_write_begin_int.patch
buffer-fix-more-functions-for-block-size-page_size.patch
mm-convert-ksm_might_need_to_copy-to-work-on-folios.patch
mm-convert-ksm_might_need_to_copy-to-work-on-folios-fix.patch
mm-remove-pageanonexclusive-assertions-in-unuse_pte.patch
mm-convert-unuse_pte-to-use-a-folio-throughout.patch
mm-remove-some-calls-to-page_add_new_anon_rmap.patch
mm-remove-stale-example-from-comment.patch
mm-remove-references-to-page_add_new_anon_rmap-in-comments.patch
mm-convert-migrate_vma_insert_page-to-use-a-folio.patch
mm-convert-collapse_huge_page-to-use-a-folio.patch
mm-remove-page_add_new_anon_rmap-and-lru_cache_add_inactive_or_unevictable.patch
mm-return-the-folio-from-__read_swap_cache_async.patch
mm-pass-a-folio-to-__swap_writepage.patch
mm-pass-a-folio-to-swap_writepage_fs.patch
mm-pass-a-folio-to-swap_writepage_bdev_sync.patch
mm-pass-a-folio-to-swap_writepage_bdev_async.patch
mm-pass-a-folio-to-swap_readpage_fs.patch
mm-pass-a-folio-to-swap_readpage_bdev_sync.patch
mm-pass-a-folio-to-swap_readpage_bdev_async.patch
mm-convert-swap_page_sector-to-swap_folio_sector.patch
mm-convert-swap_readpage-to-swap_read_folio.patch
mm-remove-page_swap_info.patch
mm-return-a-folio-from-read_swap_cache_async.patch
mm-convert-swap_cluster_readahead-and-swap_vma_readahead-to-return-a-folio.patch
mm-convert-swap_cluster_readahead-and-swap_vma_readahead-to-return-a-folio-fix.patch
fs-remove-clean_page_buffers.patch
fs-convert-clean_buffers-to-take-a-folio.patch
fs-reduce-stack-usage-in-__mpage_writepage.patch
fs-reduce-stack-usage-in-do_mpage_readpage.patch
adfs-remove-writepage-implementation.patch
bfs-remove-writepage-implementation.patch
hfs-really-remove-hfs_writepage.patch
hfsplus-really-remove-hfsplus_writepage.patch
minix-remove-writepage-implementation.patch
ocfs2-remove-writepage-implementation.patch
sysv-remove-writepage-implementation.patch
ufs-remove-writepage-implementation.patch
fs-convert-block_write_full_page-to-block_write_full_folio.patch
fs-remove-the-bh_end_io-argument-from-__block_write_full_folio.patch
The quilt patch titled
Subject: mm/memory-failure: check the mapcount of the precise page
has been removed from the -mm tree. Its filename was
mm-memory-failure-check-the-mapcount-of-the-precise-page.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: mm/memory-failure: check the mapcount of the precise page
Date: Mon, 18 Dec 2023 13:58:36 +0000
A process may map only some of the pages in a folio, and might be missed
if it maps the poisoned page but not the head page. Or it might be
unnecessarily hit if it maps the head page, but not the poisoned page.
Link: https://lkml.kernel.org/r/20231218135837.3310403-3-willy@infradead.org
Fixes: 7af446a841a2 ("HWPOISON, hugetlb: enable error handling path for hugepage")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-check-the-mapcount-of-the-precise-page
+++ a/mm/memory-failure.c
@@ -1570,7 +1570,7 @@ static bool hwpoison_user_mappings(struc
* This check implies we don't kill processes if their pages
* are in the swap cache early. Those are always late kills.
*/
- if (!page_mapped(hpage))
+ if (!page_mapped(p))
return true;
if (PageSwapCache(p)) {
@@ -1621,10 +1621,10 @@ static bool hwpoison_user_mappings(struc
try_to_unmap(folio, ttu);
}
- unmap_success = !page_mapped(hpage);
+ unmap_success = !page_mapped(p);
if (!unmap_success)
pr_err("%#lx: failed to unmap page (mapcount=%d)\n",
- pfn, page_mapcount(hpage));
+ pfn, page_mapcount(p));
/*
* try_to_unmap() might put mlocked page in lru cache, so call
_
The quilt patch titled
Subject: mm/memory-failure: pass the folio and the page to collect_procs()
has been removed from the -mm tree. Its filename was
mm-memory-failure-pass-the-folio-and-the-page-to-collect_procs.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: mm/memory-failure: pass the folio and the page to collect_procs()
Date: Mon, 18 Dec 2023 13:58:35 +0000
Patch series "Three memory-failure fixes".
I've been looking at the memory-failure code and I believe I have found
three bugs that need fixing -- one going all the way back to 2010! I'll
have more patches later to use folios more extensively but didn't want
these bugfixes to get caught up in that.
This patch (of 3):
Both collect_procs_anon() and collect_procs_file() iterate over the VMA
interval trees looking for a single pgoff, so it is wrong to look for the
pgoff of the head page as is currently done. However, it is also wrong to
look at page->mapping of the precise page as this is invalid for tail
pages. Clear up the confusion by passing both the folio and the precise
page to collect_procs().
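The split of responsibilities can be illustrated with a toy model. This is only a sketch under stated assumptions: struct demo_page, struct demo_folio and demo_page_pgoff() are hypothetical stand-ins, not the kernel's types or helpers. It shows the mapping being taken from the folio while the page offset is computed for the precise page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page / struct folio. */
struct demo_page {
	const void *mapping;      /* only meaningful on the head page */
};

struct demo_folio {
	const void *mapping;      /* valid for the whole folio */
	unsigned long index;      /* file offset (in pages) of the head page */
	unsigned long nr_pages;
	struct demo_page *pages;  /* pages[0] is the head page */
};

/* File offset (in pages) of one precise page inside the folio. */
static unsigned long demo_page_pgoff(const struct demo_folio *folio,
				     const struct demo_page *page)
{
	ptrdiff_t delta = page - folio->pages;

	assert(delta >= 0 && (unsigned long)delta < folio->nr_pages);
	return folio->index + (unsigned long)delta;
}
```

In this model, looking up the head page's offset for an error in pages[3] would search the wrong pgoff, and reading pages[3].mapping would not yield the file mapping. That is the confusion the patch removes by passing both pointers.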
Link: https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20231218135837.3310403-2-willy@infradead.org
Fixes: 415c64c1453a ("mm/memory-failure: split thp earlier in memory error handling")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-pass-the-folio-and-the-page-to-collect_procs
+++ a/mm/memory-failure.c
@@ -595,10 +595,9 @@ struct task_struct *task_early_kill(stru
/*
* Collect processes when the error hit an anonymous page.
*/
-static void collect_procs_anon(struct page *page, struct list_head *to_kill,
- int force_early)
+static void collect_procs_anon(struct folio *folio, struct page *page,
+ struct list_head *to_kill, int force_early)
{
- struct folio *folio = page_folio(page);
struct vm_area_struct *vma;
struct task_struct *tsk;
struct anon_vma *av;
@@ -633,12 +632,12 @@ static void collect_procs_anon(struct pa
/*
* Collect processes when the error hit a file mapped page.
*/
-static void collect_procs_file(struct page *page, struct list_head *to_kill,
- int force_early)
+static void collect_procs_file(struct folio *folio, struct page *page,
+ struct list_head *to_kill, int force_early)
{
struct vm_area_struct *vma;
struct task_struct *tsk;
- struct address_space *mapping = page->mapping;
+ struct address_space *mapping = folio->mapping;
pgoff_t pgoff;
i_mmap_lock_read(mapping);
@@ -704,17 +703,17 @@ static void collect_procs_fsdax(struct p
/*
* Collect the processes who have the corrupted page mapped to kill.
*/
-static void collect_procs(struct page *page, struct list_head *tokill,
- int force_early)
+static void collect_procs(struct folio *folio, struct page *page,
+ struct list_head *tokill, int force_early)
{
- if (!page->mapping)
+ if (!folio->mapping)
return;
if (unlikely(PageKsm(page)))
collect_procs_ksm(page, tokill, force_early);
else if (PageAnon(page))
- collect_procs_anon(page, tokill, force_early);
+ collect_procs_anon(folio, page, tokill, force_early);
else
- collect_procs_file(page, tokill, force_early);
+ collect_procs_file(folio, page, tokill, force_early);
}
struct hwpoison_walk {
@@ -1602,7 +1601,7 @@ static bool hwpoison_user_mappings(struc
* mapped in dirty form. This has to be done before try_to_unmap,
* because ttu takes the rmap data structures down.
*/
- collect_procs(hpage, &tokill, flags & MF_ACTION_REQUIRED);
+ collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);
if (PageHuge(hpage) && !PageAnon(hpage)) {
/*
@@ -1772,7 +1771,7 @@ static int mf_generic_kill_procs(unsigne
* SIGBUS (i.e. MF_MUST_KILL)
*/
flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
- collect_procs(&folio->page, &to_kill, true);
+ collect_procs(folio, &folio->page, &to_kill, true);
unmap_and_kill(&to_kill, pfn, folio->mapping, folio->index, flags);
unlock:
_
The quilt patch titled
Subject: selftests: secretmem: floor the memory size to the multiple of page_size
has been removed from the -mm tree. Its filename was
selftests-secretmem-floor-the-memory-size-to-the-multiple-of-page_size.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Subject: selftests: secretmem: floor the memory size to the multiple of page_size
Date: Thu, 14 Dec 2023 15:19:30 +0500
The "locked-in-memory size" limit per process need not be a multiple of
page_size. mmap() fails if we try to allocate locked-in-memory with the
same size as the allowed limit when that limit isn't a multiple of
page_size, because mmap() rounds the requested size up to the next
multiple of page_size.
Fix this by flooring the length to be allocated with mmap() to the
previous multiple of the page_size.
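The flooring step can be sketched as a standalone helper. The name floor_to_page() is hypothetical; the actual test does the arithmetic inline, as the diff below shows.

```c
#include <assert.h>
#include <stddef.h>

/* Round len down to the previous multiple of page_size. */
static size_t floor_to_page(size_t len, size_t page_size)
{
	assert(page_size > 0);
	return (len / page_size) * page_size;
}
```

With a 4096-byte page, a limit of 65537 bytes is floored to 65536, so the page-rounded mmap() request no longer exceeds the limit. A limit that is already a multiple of the page size is left unchanged.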
This was getting triggered on KernelCI regularly because of different
ulimit settings which weren't a multiple of page_size. Find logs
here: https://linux.kernelci.org/test/plan/id/657654bd8e81e654fae13532/
The bug has been present since the test was first added.
Link: https://lkml.kernel.org/r/20231214101931.1155586-1-usama.anjum@collabora.com
Fixes: 76fe17ef588a ("secretmem: test: add basic selftest for memfd_secret(2)")
Signed-off-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Reported-by: "kernelci.org bot" <bot(a)kernelci.org>
Closes: https://linux.kernelci.org/test/plan/id/657654bd8e81e654fae13532/
Cc: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/memfd_secret.c | 3 +++
1 file changed, 3 insertions(+)
--- a/tools/testing/selftests/mm/memfd_secret.c~selftests-secretmem-floor-the-memory-size-to-the-multiple-of-page_size
+++ a/tools/testing/selftests/mm/memfd_secret.c
@@ -62,6 +62,9 @@ static void test_mlock_limit(int fd)
char *mem;
len = mlock_limit_cur;
+ if (len % page_size != 0)
+ len = (len/page_size) * page_size;
+
mem = mmap(NULL, len, prot, mode, fd, 0);
if (mem == MAP_FAILED) {
fail("unable to mmap secret memory\n");
_
Patches currently in -mm which might be from usama.anjum(a)collabora.com are
The quilt patch titled
Subject: mm: migrate high-order folios in swap cache correctly
has been removed from the -mm tree. Its filename was
mm-migrate-high-order-folios-in-swap-cache-correctly.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Charan Teja Kalla <quic_charante(a)quicinc.com>
Subject: mm: migrate high-order folios in swap cache correctly
Date: Thu, 14 Dec 2023 04:58:41 +0000
Large folios occupy N consecutive entries in the swap cache instead of
using multi-index entries like the page cache. However, if a large folio
is re-added to the LRU list, it can be migrated. The migration code was
not aware of the difference between the swap cache and the page cache and
assumed that a single xas_store() would be sufficient.
This leaves potentially many stale pointers to the now-migrated folio in
the swap cache, which can lead to almost arbitrary data corruption in the
future. This can also manifest as infinite loops with the RCU read lock
held.
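The stale-pointer problem can be shown with a toy array standing in for the swap cache. This is a sketch only: the slot array and the demo_*() helpers are hypothetical and do not use the real XArray API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy "swap cache": one slot per page, each pointing at its folio. */

/* Models the old behaviour: a single store at the first index. */
static void demo_replace_one(const void **slots, size_t first,
			     const void *newfolio)
{
	slots[first] = newfolio;
}

/* Models the fix: store the new folio in all nr consecutive slots. */
static void demo_replace_all(const void **slots, size_t first, size_t nr,
			     const void *newfolio)
{
	size_t i;

	assert(nr > 0);
	for (i = 0; i < nr; i++)
		slots[first + i] = newfolio;
}

/* Count slots in [first, first + nr) still pointing at the old folio. */
static size_t demo_count_stale(const void **slots, size_t first, size_t nr,
			       const void *oldfolio)
{
	size_t i, stale = 0;

	for (i = 0; i < nr; i++)
		if (slots[first + i] == oldfolio)
			stale++;
	return stale;
}
```

For a 4-page folio, the single store leaves 3 slots pointing at the old folio. The loop leaves none, which corresponds to the xas_store()/xas_next() loop in the diff below.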
[willy(a)infradead.org: modifications to the changelog & tweaked the fix]
Fixes: 3417013e0d18 ("mm/migrate: Add folio_migrate_mapping()")
Link: https://lkml.kernel.org/r/20231214045841.961776-1-willy@infradead.org
Signed-off-by: Charan Teja Kalla <quic_charante(a)quicinc.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reported-by: Charan Teja Kalla <quic_charante(a)quicinc.com>
Closes: https://lkml.kernel.org/r/1700569840-17327-1-git-send-email-quic_charante@q…
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/migrate.c~mm-migrate-high-order-folios-in-swap-cache-correctly
+++ a/mm/migrate.c
@@ -405,6 +405,7 @@ int folio_migrate_mapping(struct address
int dirty;
int expected_count = folio_expected_refs(mapping, folio) + extra_count;
long nr = folio_nr_pages(folio);
+ long entries, i;
if (!mapping) {
/* Anonymous page without mapping */
@@ -442,8 +443,10 @@ int folio_migrate_mapping(struct address
folio_set_swapcache(newfolio);
newfolio->private = folio_get_private(folio);
}
+ entries = nr;
} else {
VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
+ entries = 1;
}
/* Move dirty while page refs frozen and newpage not yet exposed */
@@ -453,7 +456,11 @@ int folio_migrate_mapping(struct address
folio_set_dirty(newfolio);
}
- xas_store(&xas, newfolio);
+ /* Swap cache still stores N entries instead of a high-order entry */
+ for (i = 0; i < entries; i++) {
+ xas_store(&xas, newfolio);
+ xas_next(&xas);
+ }
/*
* Drop cache reference from old page by unfreezing
_
Patches currently in -mm which might be from quic_charante(a)quicinc.com are
mm-sparsemem-fix-race-in-accessing-memory_section-usage.patch
mm-sparsemem-fix-race-in-accessing-memory_section-usage-v2.patch
The quilt patch titled
Subject: maple_tree: do not preallocate nodes for slot stores
has been removed from the -mm tree. Its filename was
maple_tree-do-not-preallocate-nodes-for-slot-stores.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Subject: maple_tree: do not preallocate nodes for slot stores
Date: Wed, 13 Dec 2023 12:50:57 -0800
mas_preallocate() defaults to requesting 1 node for preallocation and then,
depending on the type of store, will update the request variable. There
isn't a check for a slot store type, so slot stores are preallocating the
default 1 node. Slot stores do not require any additional nodes, so add a
check for the slot store case that will bypass node_count_gfp(). Update
the tests to reflect that slot stores do not require allocations.
User visible effects of this bug include increased memory usage from the
unneeded node that was allocated.
Link: https://lkml.kernel.org/r/20231213205058.386589-1-sidhartha.kumar@oracle.com
Fixes: 0b8bb544b1a7 ("maple_tree: update mas_preallocate() testing")
Signed-off-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peng Zhang <zhangpeng.00(a)bytedance.com>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/maple_tree.c | 11 +++++++++++
tools/testing/radix-tree/maple.c | 2 +-
2 files changed, 12 insertions(+), 1 deletion(-)
--- a/lib/maple_tree.c~maple_tree-do-not-preallocate-nodes-for-slot-stores
+++ a/lib/maple_tree.c
@@ -5501,6 +5501,17 @@ int mas_preallocate(struct ma_state *mas
mas_wr_end_piv(&wr_mas);
node_size = mas_wr_new_end(&wr_mas);
+
+ /* Slot store, does not require additional nodes */
+ if (node_size == wr_mas.node_end) {
+ /* reuse node */
+ if (!mt_in_rcu(mas->tree))
+ return 0;
+ /* shifting boundary */
+ if (wr_mas.offset_end - mas->offset == 1)
+ return 0;
+ }
+
if (node_size >= mt_slots[wr_mas.type]) {
/* Split, worst case for now. */
request = 1 + mas_mt_height(mas) * 2;
--- a/tools/testing/radix-tree/maple.c~maple_tree-do-not-preallocate-nodes-for-slot-stores
+++ a/tools/testing/radix-tree/maple.c
@@ -35538,7 +35538,7 @@ static noinline void __init check_preall
MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
allocated = mas_allocated(&mas);
height = mas_mt_height(&mas);
- MT_BUG_ON(mt, allocated != 1);
+ MT_BUG_ON(mt, allocated != 0);
mas_store_prealloc(&mas, ptr);
MT_BUG_ON(mt, mas_allocated(&mas) != 0);
_
Patches currently in -mm which might be from sidhartha.kumar(a)oracle.com are
The quilt patch titled
Subject: mm/filemap: avoid buffered read/write race to read inconsistent data
has been removed from the -mm tree. Its filename was
mm-filemap-avoid-buffered-read-write-race-to-read-inconsistent-data.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Baokun Li <libaokun1(a)huawei.com>
Subject: mm/filemap: avoid buffered read/write race to read inconsistent data
Date: Wed, 13 Dec 2023 14:23:24 +0800
The following concurrency may cause the data read to be inconsistent with
the data on disk:
         cpu1                           cpu2
------------------------------|------------------------------
                               // Buffered write 2048 from 0
                               ext4_buffered_write_iter
                                generic_perform_write
                                 copy_page_from_iter_atomic
                                 ext4_da_write_end
                                  ext4_da_do_write_end
                                   block_write_end
                                    __block_commit_write
                                     folio_mark_uptodate
// Buffered read 4096 from 0          smp_wmb()
ext4_file_read_iter                   set_bit(PG_uptodate, folio_flags)
 generic_file_read_iter            i_size_write // 2048
  filemap_read                     unlock_page(page)
   filemap_get_pages
    filemap_get_read_batch
    folio_test_uptodate(folio)
     ret = test_bit(PG_uptodate, folio_flags)
     if (ret)
      smp_rmb();
      // Ensure that the data in page 0-2048 is up-to-date.

                               // New buffered write 2048 from 2048
                               ext4_buffered_write_iter
                                generic_perform_write
                                 copy_page_from_iter_atomic
                                 ext4_da_write_end
                                  ext4_da_do_write_end
                                   block_write_end
                                    __block_commit_write
                                     folio_mark_uptodate
                                      smp_wmb()
                                      set_bit(PG_uptodate, folio_flags)
                                   i_size_write // 4096
                                   unlock_page(page)

   isize = i_size_read(inode) // 4096
   // Read the latest isize 4096, but without smp_rmb(), there may be
   // Load-Load disorder resulting in the data in the 2048-4096 range
   // in the page is not up-to-date.
   copy_page_to_iter
   // copyout 4096
In the concurrency above, we read the updated i_size, but there is no read
barrier to ensure that the data in the page is the same as the i_size at
this point, so we may copy the unsynchronized page out. Hence add the
missing read memory barrier to fix this.
This is a Load-Load reordering issue, which only occurs on some weak
mem-ordering architectures (e.g. ARM64, ALPHA), but not on strong
mem-ordering architectures (e.g. X86). And theoretically the problem
doesn't only happen on ext4, filesystems that call filemap_read() but
don't hold inode lock (e.g. btrfs, f2fs, ubifs ...) will have this
problem, while filesystems with inode lock (e.g. xfs, nfs) won't have
this problem.
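The pairing of barriers can be sketched in portable C11. This is an analogy under stated assumptions, not the kernel implementation: struct demo_file, demo_write() and demo_read() are hypothetical, and C11 release/acquire fences only approximate the roles of smp_wmb() and smp_rmb(). The model also drops the PG_uptodate step and keeps only the ordering between the data and the published size.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

struct demo_file {
	char page[DEMO_PAGE_SIZE];   /* models the page cache page */
	atomic_size_t isize;         /* models inode->i_size */
};

/* Writer: fill the data first, then publish the new size. */
static void demo_write(struct demo_file *f, size_t pos, const char *buf,
		       size_t len)
{
	assert(pos + len <= DEMO_PAGE_SIZE);
	memcpy(f->page + pos, buf, len);
	/* Plays the role of the write-side barrier. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&f->isize, pos + len, memory_order_relaxed);
}

/* Reader: load the size, then fence before touching the data. */
static size_t demo_read(struct demo_file *f, char *out, size_t want)
{
	size_t isize = atomic_load_explicit(&f->isize, memory_order_relaxed);
	size_t n = want < isize ? want : isize;

	/* Plays the role of the smp_rmb() added by this patch. */
	atomic_thread_fence(memory_order_acquire);
	memcpy(out, f->page, n);
	return n;
}
```

Without the acquire fence in demo_read(), a weakly ordered CPU would be allowed to perform the data loads before the size load, which is the Load-Load reordering described above.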
Link: https://lkml.kernel.org/r/20231213062324.739009-1-libaokun1@huawei.com
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Andreas Dilger <adilger.kernel(a)dilger.ca>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Cc: Theodore Ts'o <tytso(a)mit.edu>
Cc: yangerkun <yangerkun(a)huawei.com>
Cc: Yu Kuai <yukuai3(a)huawei.com>
Cc: Zhang Yi <yi.zhang(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/filemap.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/mm/filemap.c~mm-filemap-avoid-buffered-read-write-race-to-read-inconsistent-data
+++ a/mm/filemap.c
@@ -2608,6 +2608,15 @@ ssize_t filemap_read(struct kiocb *iocb,
end_offset = min_t(loff_t, isize, iocb->ki_pos + iter->count);
/*
+ * Pairs with a barrier in
+ * block_write_end()->mark_buffer_dirty() or other page
+ * dirtying routines like iomap_write_end() to ensure
+ * changes to page contents are visible before we see
+ * increased inode size.
+ */
+ smp_rmb();
+
+ /*
* Once we start copying data, we don't want to be touching any
* cachelines that might be contended:
*/
_
Patches currently in -mm which might be from libaokun1(a)huawei.com are
From: Wayne Lin <wayne.lin(a)amd.com>
link_rate is sometimes changed when a DP MST connector is hotplugged, so
pbn_div also needs to be updated; otherwise it will mismatch link_rate,
causing no output on the external monitor.
Cc: stable(a)vger.kernel.org
Reviewed-by: Jerry Zuo <jerry.zuo(a)amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Wade Wang <wade.wang(a)hp.com>
Signed-off-by: Wayne Lin <wayne.lin(a)amd.com>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 2845c884398e..9ff87cee4c61 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6980,8 +6980,7 @@ static int dm_encoder_helper_atomic_check(struct drm_encoder *encoder,
if (IS_ERR(mst_state))
return PTR_ERR(mst_state);
- if (!mst_state->pbn_div)
- mst_state->pbn_div = dm_mst_get_pbn_divider(aconnector->mst_root->dc_link);
+ mst_state->pbn_div = dm_mst_get_pbn_divider(aconnector->mst_root->dc_link);
if (!state->duplicated) {
int max_bpc = conn_state->max_requested_bpc;
--
2.42.0
This is a note to let you know that I've just added the patch titled
iio: adc: ad7091r: Pass iio_dev to event handler
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From a25a7df518fc71b1ba981d691e9322e645d2689c Mon Sep 17 00:00:00 2001
From: Marcelo Schmitt <marcelo.schmitt(a)analog.com>
Date: Sat, 16 Dec 2023 14:46:11 -0300
Subject: iio: adc: ad7091r: Pass iio_dev to event handler
The previous version of the ad7091r event handler received the ADC state
pointer and retrieved the iio device from the driver data field with
dev_get_drvdata(). However, no driver data has ever been set, which led to
a null pointer dereference when running the event handler.
Pass the iio device to the event handler and retrieve the ADC state struct
from it so we avoid the null pointer dereference and save the driver from
filling the driver data field.
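The shape of the fix can be sketched with toy types. Everything here is hypothetical (struct demo_iio_dev, struct demo_state, demo_iio_priv(), demo_event_handler()) and only mirrors the idea: the IRQ cookie is the device, and the state is derived from it rather than looked up through driver data that was never set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct ad7091r_state and struct iio_dev. */
struct demo_state {
	int alerts;
};

struct demo_iio_dev {
	const char *name;
	struct demo_state priv;   /* models the area returned by iio_priv() */
};

static struct demo_state *demo_iio_priv(struct demo_iio_dev *iio_dev)
{
	return &iio_dev->priv;
}

/* Handler in the style of the fix: cookie is the device itself. */
static int demo_event_handler(int irq, void *private)
{
	struct demo_iio_dev *iio_dev = private;
	struct demo_state *st;

	(void)irq;
	assert(iio_dev != NULL);
	st = demo_iio_priv(iio_dev);
	st->alerts++;
	return st->alerts;
}
```

The caller registers the handler with the device pointer as its cookie, matching the change of the last devm_request_threaded_irq() argument from st to iio_dev in the diff below.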
Fixes: ca69300173b6 ("iio: adc: Add support for AD7091R5 ADC")
Signed-off-by: Marcelo Schmitt <marcelo.schmitt(a)analog.com>
Link: https://lore.kernel.org/r/5024b764107463de9578d5b3b0a3d5678e307b1a.17027462…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/ad7091r-base.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/iio/adc/ad7091r-base.c b/drivers/iio/adc/ad7091r-base.c
index 8e252cde735b..0e5d3d2e9c98 100644
--- a/drivers/iio/adc/ad7091r-base.c
+++ b/drivers/iio/adc/ad7091r-base.c
@@ -174,8 +174,8 @@ static const struct iio_info ad7091r_info = {
 static irqreturn_t ad7091r_event_handler(int irq, void *private)
 {
-	struct ad7091r_state *st = (struct ad7091r_state *) private;
-	struct iio_dev *iio_dev = dev_get_drvdata(st->dev);
+	struct iio_dev *iio_dev = private;
+	struct ad7091r_state *st = iio_priv(iio_dev);
 	unsigned int i, read_val;
 	int ret;
 	s64 timestamp = iio_get_time_ns(iio_dev);
@@ -234,7 +234,7 @@ int ad7091r_probe(struct device *dev, const char *name,
 	if (irq) {
 		ret = devm_request_threaded_irq(dev, irq, NULL,
 				ad7091r_event_handler,
-				IRQF_TRIGGER_FALLING | IRQF_ONESHOT, name, st);
+				IRQF_TRIGGER_FALLING | IRQF_ONESHOT, name, iio_dev);
 		if (ret)
 			return ret;
 	}
--
2.43.0
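For readers following the changelog above, the difference between the two lookup paths can be sketched in plain userspace C. This is only an illustrative model, not kernel code: every `model_*` name, the struct layouts, and `MODEL_IRQ_HANDLED` are hypothetical stand-ins invented for this sketch. It assumes, as the changelog states, that the driver data field is never filled, and that the private state is always reachable from the device object, which is the property iio_priv() is relied on for in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for irqreturn_t's "handled" value. */
#define MODEL_IRQ_HANDLED 1

/* Hypothetical stand-in for struct device: only the drvdata slot. */
struct model_device {
	void *driver_data;	/* stays NULL unless someone sets it */
};

/* Hypothetical stand-in for struct ad7091r_state. */
struct model_state {
	int last_irq;		/* records the last irq the handler saw */
};

/* Hypothetical stand-in for struct iio_dev. */
struct model_iio_dev {
	void *priv;		/* driver-private area owned by the device */
};

/* Models dev_get_drvdata(): returns whatever was stored, possibly NULL. */
void *model_dev_get_drvdata(const struct model_device *dev)
{
	return dev->driver_data;
}

/* Models iio_priv(): the private area is reachable from the device. */
void *model_iio_priv(const struct model_iio_dev *iio_dev)
{
	return iio_dev->priv;
}

/* Models allocating the device together with its private state. */
struct model_iio_dev *model_iio_device_alloc(size_t priv_size)
{
	struct model_iio_dev *iio_dev = calloc(1, sizeof(*iio_dev));

	if (!iio_dev)
		return NULL;
	iio_dev->priv = calloc(1, priv_size);
	if (!iio_dev->priv) {
		free(iio_dev);
		return NULL;
	}
	return iio_dev;
}

void model_iio_device_free(struct model_iio_dev *iio_dev)
{
	if (!iio_dev)
		return;
	free(iio_dev->priv);
	free(iio_dev);
}

/*
 * Handler in the style of the fixed code: the cookie is the device, and
 * the state is derived from it, so no drvdata lookup is needed.
 */
int model_event_handler(int irq, void *private)
{
	struct model_iio_dev *iio_dev = private;
	struct model_state *st = model_iio_priv(iio_dev);

	st->last_irq = irq;
	return MODEL_IRQ_HANDLED;
}
```

In this model, a lookup through model_dev_get_drvdata() on a device whose driver_data was never set yields NULL, which mirrors the failure described in the changelog, while model_event_handler() called with the device as its cookie always finds its state.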
This is a note to let you know that I've just added the patch titled
iio: adc: ad7091r: Pass iio_dev to event handler
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
From a25a7df518fc71b1ba981d691e9322e645d2689c Mon Sep 17 00:00:00 2001
From: Marcelo Schmitt <marcelo.schmitt(a)analog.com>
Date: Sat, 16 Dec 2023 14:46:11 -0300
Subject: iio: adc: ad7091r: Pass iio_dev to event handler
The previous version of the ad7091r event handler received the ADC state pointer
and retrieved the iio device from the driver data field with dev_get_drvdata().
However, no driver data was ever set, which led to a NULL pointer
dereference when running the event handler.
Pass the iio device to the event handler and retrieve the ADC state struct
from it so we avoid the null pointer dereference and save the driver from
filling the driver data field.
Fixes: ca69300173b6 ("iio: adc: Add support for AD7091R5 ADC")
Signed-off-by: Marcelo Schmitt <marcelo.schmitt(a)analog.com>
Link: https://lore.kernel.org/r/5024b764107463de9578d5b3b0a3d5678e307b1a.17027462…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/ad7091r-base.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/iio/adc/ad7091r-base.c b/drivers/iio/adc/ad7091r-base.c
index 8e252cde735b..0e5d3d2e9c98 100644
--- a/drivers/iio/adc/ad7091r-base.c
+++ b/drivers/iio/adc/ad7091r-base.c
@@ -174,8 +174,8 @@ static const struct iio_info ad7091r_info = {
 static irqreturn_t ad7091r_event_handler(int irq, void *private)
 {
-	struct ad7091r_state *st = (struct ad7091r_state *) private;
-	struct iio_dev *iio_dev = dev_get_drvdata(st->dev);
+	struct iio_dev *iio_dev = private;
+	struct ad7091r_state *st = iio_priv(iio_dev);
 	unsigned int i, read_val;
 	int ret;
 	s64 timestamp = iio_get_time_ns(iio_dev);
@@ -234,7 +234,7 @@ int ad7091r_probe(struct device *dev, const char *name,
 	if (irq) {
 		ret = devm_request_threaded_irq(dev, irq, NULL,
 				ad7091r_event_handler,
-				IRQF_TRIGGER_FALLING | IRQF_ONESHOT, name, st);
+				IRQF_TRIGGER_FALLING | IRQF_ONESHOT, name, iio_dev);
 		if (ret)
 			return ret;
 	}
--
2.43.0