This is a note to let you know that I've just added the patch titled
iio: backend: fix out-of-bound write
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From da9374819eb3885636934c1006d450c3cb1a02ed Mon Sep 17 00:00:00 2001
From: Markus Burri <markus.burri(a)mt.com>
Date: Thu, 8 May 2025 15:06:07 +0200
Subject: iio: backend: fix out-of-bound write
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The buffer is set to 80 character. If a caller write more characters,
count is truncated to the max available space in "simple_write_to_buffer".
But afterwards a string terminator is written to the buffer at offset count
without boundary check. The zero termination is written OUT-OF-BOUND.
Add a check that the given buffer is smaller then the buffer to prevent.
Fixes: 035b4989211d ("iio: backend: make sure to NULL terminate stack buffer")
Signed-off-by: Markus Burri <markus.burri(a)mt.com>
Reviewed-by: Nuno Sá <nuno.sa(a)analog.com>
Link: https://patch.msgid.link/20250508130612.82270-2-markus.burri@mt.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/industrialio-backend.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/industrialio-backend.c b/drivers/iio/industrialio-backend.c
index c1eb9ef9db08..266e1b29bf91 100644
--- a/drivers/iio/industrialio-backend.c
+++ b/drivers/iio/industrialio-backend.c
@@ -155,11 +155,14 @@ static ssize_t iio_backend_debugfs_write_reg(struct file *file,
ssize_t rc;
int ret;
+ if (count >= sizeof(buf))
+ return -ENOSPC;
+
rc = simple_write_to_buffer(buf, sizeof(buf) - 1, ppos, userbuf, count);
if (rc < 0)
return rc;
- buf[count] = '\0';
+ buf[rc] = '\0';
ret = sscanf(buf, "%i %i", &back->cached_reg_addr, &val);
--
2.50.0
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
---
I'm carrying over Vitaly Chikunov's "Tested-by" from v1.
Changes in v2:
- disable it for 32-bit entirely (Dave Hansen)
- Link to v1: https://lore.kernel.org/r/20250630-x86-2level-hugetlb-v1-1-077cd53d8255@goo…
---
arch/x86/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
---
base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
change-id: 20250630-x86-2level-hugetlb-b1d8feb255ce
--
Jann Horn <jannh(a)google.com>
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
commit 74abd086d2ee ("gpiolib: sanitize the return value of
gpio_chip::get_multiple()") altered the value returned by
gc->get_multiple() in case it is positive (> 0), but failed to
return for other cases (<= 0).
This may result in the "if (gc->get)" block being executed and thus
negates the performance gain that is normally obtained by using
gc->get_multiple().
Fix by returning the result of gc->get_multiple() if it is <= 0.
Also move the "ret" variable to the scope where it is used, which as an
added bonus fixes an indentation error introduced by the aforementioned
commit.
Fixes: 74abd086d2ee ("gpiolib: sanitize the return value of gpio_chip::get_multiple()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
drivers/gpio/gpiolib.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index fdafa0df1b43..3a3eca5b4c40 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -3297,14 +3297,15 @@ static int gpiod_get_raw_value_commit(const struct gpio_desc *desc)
static int gpio_chip_get_multiple(struct gpio_chip *gc,
unsigned long *mask, unsigned long *bits)
{
- int ret;
-
lockdep_assert_held(&gc->gpiodev->srcu);
if (gc->get_multiple) {
+ int ret;
+
ret = gc->get_multiple(gc, mask, bits);
if (ret > 0)
return -EBADE;
+ return ret;
}
if (gc->get) {
base-commit: b4911fb0b060899e4eebca0151eb56deb86921ec
--
2.39.5
Make fscrypt no longer use Crypto API drivers for non-inline crypto
engines, even when the Crypto API prioritizes them over CPU-based code
(which unfortunately it often does). These drivers tend to be really
problematic, especially for fscrypt's workload. This commit has no
effect on inline crypto engines, which are different and do work well.
Specifically, exclude drivers that have CRYPTO_ALG_KERN_DRIVER_ONLY or
CRYPTO_ALG_ALLOCATES_MEMORY set. (Later, CRYPTO_ALG_ASYNC should be
excluded too. That's omitted for now to keep this commit backportable,
since until recently some CPU-based code had CRYPTO_ALG_ASYNC set.)
There are two major issues with these drivers: bugs and performance.
First, these drivers tend to be buggy. They're fundamentally much more
error-prone and harder to test than the CPU-based code. They often
don't get tested before kernel releases, and even if they do, the crypto
self-tests don't properly test these drivers. Released drivers have
en/decrypted or hashed data incorrectly. These bugs cause issues for
fscrypt users who often didn't even want to use these drivers, e.g.:
- https://github.com/google/fscryptctl/issues/32
- https://github.com/google/fscryptctl/issues/9
- https://lore.kernel.org/r/PH0PR02MB731916ECDB6C613665863B6CFFAA2@PH0PR02MB7…
These drivers have also similarly caused issues for dm-crypt users,
including data corruption and deadlocks. Since Linux v5.10, dm-crypt
has disabled most of them by excluding CRYPTO_ALG_ALLOCATES_MEMORY.
Second, these drivers tend to be *much* slower than the CPU-based code.
This may seem counterintuitive, but benchmarks clearly show it. There's
a *lot* of overhead associated with going to a hardware driver, off the
CPU, and back again. To prove this, I gathered as many systems with
this type of crypto engine as I could, and I measured synchronous
encryption of 4096-byte messages (which matches fscrypt's workload):
Intel Emerald Rapids server:
AES-256-XTS:
xts-aes-vaes-avx512 16171 MB/s [CPU-based, Vector AES]
qat_aes_xts 289 MB/s [Offload, Intel QuickAssist]
Qualcomm SM8650 HDK:
AES-256-XTS:
xts-aes-ce 4301 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts-aes-qce 73 MB/s [Offload, Qualcomm Crypto Engine]
i.MX 8M Nano LPDDR4 EVK:
AES-256-XTS:
xts-aes-ce 647 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts(ecb-aes-caam) 20 MB/s [Offload, CAAM]
AES-128-CBC-ESSIV:
essiv(cbc-aes-caam,sha256-lib) 23 MB/s [Offload, CAAM]
STM32MP157F-DK2:
AES-256-XTS:
xts-aes-neonbs 13.2 MB/s [CPU-based, ARM NEON]
xts(stm32-ecb-aes) 3.1 MB/s [Offload, STM32 crypto engine]
AES-128-CBC-ESSIV:
essiv(cbc-aes-neonbs,sha256-lib)
14.7 MB/s [CPU-based, ARM NEON]
essiv(stm32-cbc-aes,sha256-lib)
3.2 MB/s [Offload, STM32 crypto engine]
Adiantum:
adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon)
52.8 MB/s [CPU-based, ARM scalar + NEON]
So, there was no case in which the crypto engine was even *close* to
being faster. On the first three, which have AES instructions in the
CPU, the CPU was 30 to 55 times faster (!). Even on STM32MP157F-DK2
which has a Cortex-A7 CPU that doesn't have AES instructions, AES was
over 4 times faster on the CPU. And Adiantum encryption, which is what
actually should be used on CPUs like that, was over 17 times faster.
Other justifications that have been given for these non-inline crypto
engines (almost always coming from the hardware vendors, not actual
users) don't seem very plausible either:
- The crypto engine throughput could be improved by processing
multiple requests concurrently. Currently irrelevant to fscrypt,
since it doesn't do that. This would also be complex, and unhelpful
in many cases. 2 of the 4 engines I tested even had only one queue.
- Some of the engines, e.g. STM32, support hardware keys. Also
currently irrelevant to fscrypt, since it doesn't support these.
Interestingly, the STM32 driver itself doesn't support this either.
- Free up CPU for other tasks and/or reduce energy usage. Not very
plausible considering the "short" message length, driver overhead,
and scheduling overhead. There's just very little time for the CPU
to do something else like run another task or enter low-power state,
before the message finishes and it's time to process the next one.
- Some of these engines resist power analysis and electromagnetic
attacks, while the CPU-based crypto generally does not. In theory,
this sounds great. In practice, if this benefit requires the use of
an off-CPU offload that massively regresses performance and has a
low-quality, buggy driver, the price for this hardening (which is
not relevant to most fscrypt users, and tends to be incomplete) is
just too high. Inline crypto engines are much more promising here,
as are on-CPU solutions like RISC-V High Assurance Cryptography.
Fixes: b30ab0e03407 ("ext4 crypto: add ext4 encryption facilities")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
---
Changed in v3:
- Further improved the commit message and comment. Added data for
STM32MP157F-DK2 and i.MX 8M Nano LPDDR4 EVK.
- Updated fscrypt.rst
Changed in v2:
- Improved commit message and comment
- Dropped CRYPTO_ALG_ASYNC from the mask, to make this patch
backport-friendly
- Added Fixes and Cc stable
Documentation/filesystems/fscrypt.rst | 37 +++++++++++----------------
fs/crypto/fscrypt_private.h | 17 ++++++++++++
fs/crypto/hkdf.c | 2 +-
fs/crypto/keysetup.c | 3 ++-
fs/crypto/keysetup_v1.c | 3 ++-
5 files changed, 37 insertions(+), 25 deletions(-)
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 29e84d125e02..4a3e844b790c 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -145,13 +145,12 @@ However, these ioctls have some limitations:
caches are freed but not wiped. Therefore, portions thereof may be
recoverable from freed memory, even after the corresponding key(s)
were wiped. To partially solve this, you can add init_on_free=1 to
your kernel command line. However, this has a performance cost.
-- Secret keys might still exist in CPU registers, in crypto
- accelerator hardware (if used by the crypto API to implement any of
- the algorithms), or in other places not explicitly considered here.
+- Secret keys might still exist in CPU registers or in other places
+ not explicitly considered here.
Full system compromise
~~~~~~~~~~~~~~~~~~~~~~
An attacker who gains "root" access and/or the ability to execute
@@ -404,13 +403,16 @@ of hardware acceleration for AES. Adiantum is a wide-block cipher
that uses XChaCha12 and AES-256 as its underlying components. Most of
the work is done by XChaCha12, which is much faster than AES when AES
acceleration is unavailable. For more information about Adiantum, see
`the Adiantum paper <https://eprint.iacr.org/2018/720.pdf>`_.
-The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair exists only to support
-systems whose only form of AES acceleration is an off-CPU crypto
-accelerator such as CAAM or CESA that does not support XTS.
+The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair was added to try to
+provide a more efficient option for systems that lack AES instructions
+in the CPU but do have a non-inline crypto engine such as CAAM or CESA
+that supports AES-CBC (and not AES-XTS). This is deprecated. It has
+been shown that just doing AES on the CPU is actually faster.
+Moreover, Adiantum is faster still and is recommended on such systems.
The remaining mode pairs are the "national pride ciphers":
- (SM4-XTS, SM4-CBC-CTS)
@@ -1324,26 +1326,17 @@ that systems implementing a form of "verified boot" take advantage of
this by validating all top-level encryption policies prior to access.
Inline encryption support
=========================
-By default, fscrypt uses the kernel crypto API for all cryptographic
-operations (other than HKDF, which fscrypt partially implements
-itself). The kernel crypto API supports hardware crypto accelerators,
-but only ones that work in the traditional way where all inputs and
-outputs (e.g. plaintexts and ciphertexts) are in memory. fscrypt can
-take advantage of such hardware, but the traditional acceleration
-model isn't particularly efficient and fscrypt hasn't been optimized
-for it.
-
-Instead, many newer systems (especially mobile SoCs) have *inline
-encryption hardware* that can encrypt/decrypt data while it is on its
-way to/from the storage device. Linux supports inline encryption
-through a set of extensions to the block layer called *blk-crypto*.
-blk-crypto allows filesystems to attach encryption contexts to bios
-(I/O requests) to specify how the data will be encrypted or decrypted
-in-line. For more information about blk-crypto, see
+Many newer systems (especially mobile SoCs) have *inline encryption
+hardware* that can encrypt/decrypt data while it is on its way to/from
+the storage device. Linux supports inline encryption through a set of
+extensions to the block layer called *blk-crypto*. blk-crypto allows
+filesystems to attach encryption contexts to bios (I/O requests) to
+specify how the data will be encrypted or decrypted in-line. For more
+information about blk-crypto, see
:ref:`Documentation/block/inline-encryption.rst <inline_encryption>`.
On supported filesystems (currently ext4 and f2fs), fscrypt can use
blk-crypto instead of the kernel crypto API to encrypt/decrypt file
contents. To enable this, set CONFIG_FS_ENCRYPTION_INLINE_CRYPT=y in
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index c1d92074b65c..6e7164530a1e 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -43,10 +43,27 @@
* hardware-wrapped keys has made it misleading as it's only for raw keys.
* Don't use it in kernel code; use one of the above constants instead.
*/
#undef FSCRYPT_MAX_KEY_SIZE
+/*
+ * This mask is passed as the third argument to the crypto_alloc_*() functions
+ * to prevent fscrypt from using the Crypto API drivers for non-inline crypto
+ * engines. Those drivers have been problematic for fscrypt. fscrypt users
+ * have reported hangs and even incorrect en/decryption with these drivers.
+ * Since going to the driver, off CPU, and back again is really slow, such
+ * drivers can be over 50 times slower than the CPU-based code for fscrypt's
+ * workload. Even on platforms that lack AES instructions on the CPU, using the
+ * offloads has been shown to be slower, even staying with AES. (Of course,
+ * Adiantum is faster still, and is the recommended option on such platforms...)
+ *
+ * Note that fscrypt also supports inline crypto engines. Those don't use the
+ * Crypto API and work much better than the old-style (non-inline) engines.
+ */
+#define FSCRYPT_CRYPTOAPI_MASK \
+ (CRYPTO_ALG_ALLOCATES_MEMORY | CRYPTO_ALG_KERN_DRIVER_ONLY)
+
#define FSCRYPT_CONTEXT_V1 1
#define FSCRYPT_CONTEXT_V2 2
/* Keep this in sync with include/uapi/linux/fscrypt.h */
#define FSCRYPT_MODE_MAX FSCRYPT_MODE_AES_256_HCTR2
diff --git a/fs/crypto/hkdf.c b/fs/crypto/hkdf.c
index 0f3028adc9c7..5b9c21cfe2b4 100644
--- a/fs/crypto/hkdf.c
+++ b/fs/crypto/hkdf.c
@@ -56,11 +56,11 @@ int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
struct crypto_shash *hmac_tfm;
static const u8 default_salt[HKDF_HASHLEN];
u8 prk[HKDF_HASHLEN];
int err;
- hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, 0);
+ hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(hmac_tfm)) {
fscrypt_err(NULL, "Error allocating " HKDF_HMAC_ALG ": %ld",
PTR_ERR(hmac_tfm));
return PTR_ERR(hmac_tfm);
}
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index 0d71843af946..d8113a719697 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -101,11 +101,12 @@ fscrypt_allocate_skcipher(struct fscrypt_mode *mode, const u8 *raw_key,
const struct inode *inode)
{
struct crypto_skcipher *tfm;
int err;
- tfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
+ tfm = crypto_alloc_skcipher(mode->cipher_str, 0,
+ FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
if (PTR_ERR(tfm) == -ENOENT) {
fscrypt_warn(inode,
"Missing crypto API support for %s (API name: \"%s\")",
mode->friendly_name, mode->cipher_str);
diff --git a/fs/crypto/keysetup_v1.c b/fs/crypto/keysetup_v1.c
index b70521c55132..158ceae8a5bc 100644
--- a/fs/crypto/keysetup_v1.c
+++ b/fs/crypto/keysetup_v1.c
@@ -50,11 +50,12 @@ static int derive_key_aes(const u8 *master_key,
{
int res = 0;
struct skcipher_request *req = NULL;
DECLARE_CRYPTO_WAIT(wait);
struct scatterlist src_sg, dst_sg;
- struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
+ struct crypto_skcipher *tfm =
+ crypto_alloc_skcipher("ecb(aes)", 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
res = PTR_ERR(tfm);
tfm = NULL;
goto out;
base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
--
2.50.0
The handling of the `COMEDI_INSNLIST` ioctl allocates a kernel buffer to
hold the array of `struct comedi_insn`, getting the length from the
`n_insns` member of the `struct comedi_insnlist` supplied by the user.
The allocation will fail with a WARNING and a stack dump if it is too
large.
Avoid that by failing with an `-EINVAL` error if the supplied `n_insns`
value is unreasonable.
Define the limit on the `n_insns` value in the `MAX_INSNS` macro. Set
this to the same value as `MAX_SAMPLES` (65536), which is the maximum
allowed sum of the values of the member `n` in the array of `struct
comedi_insn`, and sensible comedi instructions will have an `n` of at
least 1.
Reported-by: syzbot+d6995b62e5ac7d79557a(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d6995b62e5ac7d79557a
Fixes: ed9eccbe8970 ("Staging: add comedi core")
Tested-by: Ian Abbott <abbotti(a)mev.co.uk>
Cc: <stable(a)vger.kernel.org> # 5.13+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
---
Patch does not apply cleanly to longterm kernels 5.4.x and 5.10.x.
---
drivers/comedi/comedi_fops.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/comedi/comedi_fops.c b/drivers/comedi/comedi_fops.c
index 3383a7ce27ff..962fb9b18a52 100644
--- a/drivers/comedi/comedi_fops.c
+++ b/drivers/comedi/comedi_fops.c
@@ -1589,6 +1589,16 @@ static int do_insnlist_ioctl(struct comedi_device *dev,
return i;
}
+#define MAX_INSNS MAX_SAMPLES
+static int check_insnlist_len(struct comedi_device *dev, unsigned int n_insns)
+{
+ if (n_insns > MAX_INSNS) {
+ dev_dbg(dev->class_dev, "insnlist length too large\n");
+ return -EINVAL;
+ }
+ return 0;
+}
+
/*
* COMEDI_INSN ioctl
* synchronous instruction
@@ -2239,6 +2249,9 @@ static long comedi_unlocked_ioctl(struct file *file, unsigned int cmd,
rc = -EFAULT;
break;
}
+ rc = check_insnlist_len(dev, insnlist.n_insns);
+ if (rc)
+ break;
insns = kcalloc(insnlist.n_insns, sizeof(*insns), GFP_KERNEL);
if (!insns) {
rc = -ENOMEM;
@@ -3142,6 +3155,9 @@ static int compat_insnlist(struct file *file, unsigned long arg)
if (copy_from_user(&insnlist32, compat_ptr(arg), sizeof(insnlist32)))
return -EFAULT;
+ rc = check_insnlist_len(dev, insnlist32.n_insns);
+ if (rc)
+ return rc;
insns = kcalloc(insnlist32.n_insns, sizeof(*insns), GFP_KERNEL);
if (!insns)
return -ENOMEM;
--
2.47.2
From: Guo Xuenan <guoxuenan(a)huawei.com>
[ Upstream commit 575689fc0ffa6c4bb4e72fd18e31a6525a6124e0 ]
xfs log io error will trigger xlog shut down, and end_io worker call
xlog_state_shutdown_callbacks to unpin and release the buf log item.
The race condition is that when there are some thread doing transaction
commit and happened not to be intercepted by xlog_is_shutdown, then,
these log item will be insert into CIL, when unpin and release these
buf log item, UAF will occur. BTW, add delay before `xlog_cil_commit`
can increase recurrence probability.
The following call graph actually encountered this bad situation.
fsstress io end worker kworker/0:1H-216
xlog_ioend_work
->xlog_force_shutdown
->xlog_state_shutdown_callbacks
->xlog_cil_process_committed
->xlog_cil_committed
->xfs_trans_committed_bulk
->xfs_trans_apply_sb_deltas ->li_ops->iop_unpin(lip, 1);
->xfs_trans_getsb
->_xfs_trans_bjoin
->xfs_buf_item_init
->if (bip) { return 0;} //relog
->xlog_cil_commit
->xlog_cil_insert_items //insert into CIL
->xfs_buf_ioend_fail(bp);
->xfs_buf_ioend
->xfs_buf_item_done
->xfs_buf_item_relse
->xfs_buf_item_free
when cil push worker gather percpu cil and insert super block buf log item
into ctx->log_items then uaf occurs.
==================================================================
BUG: KASAN: use-after-free in xlog_cil_push_work+0x1c8f/0x22f0
Write of size 8 at addr ffff88801800f3f0 by task kworker/u4:4/105
CPU: 0 PID: 105 Comm: kworker/u4:4 Tainted: G W
6.1.0-rc1-00001-g274115149b42 #136
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-cil/sda xlog_cil_push_work
Call Trace:
<TASK>
dump_stack_lvl+0x4d/0x66
print_report+0x171/0x4a6
kasan_report+0xb3/0x130
xlog_cil_push_work+0x1c8f/0x22f0
process_one_work+0x6f9/0xf70
worker_thread+0x578/0xf30
kthread+0x28c/0x330
ret_from_fork+0x1f/0x30
</TASK>
Allocated by task 2145:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_slab_alloc+0x54/0x60
kmem_cache_alloc+0x14a/0x510
xfs_buf_item_init+0x160/0x6d0
_xfs_trans_bjoin+0x7f/0x2e0
xfs_trans_getsb+0xb6/0x3f0
xfs_trans_apply_sb_deltas+0x1f/0x8c0
__xfs_trans_commit+0xa25/0xe10
xfs_symlink+0xe23/0x1660
xfs_vn_symlink+0x157/0x280
vfs_symlink+0x491/0x790
do_symlinkat+0x128/0x220
__x64_sys_symlink+0x7a/0x90
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 216:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x40
__kasan_slab_free+0x105/0x1a0
kmem_cache_free+0xb6/0x460
xfs_buf_ioend+0x1e9/0x11f0
xfs_buf_item_unpin+0x3d6/0x840
xfs_trans_committed_bulk+0x4c2/0x7c0
xlog_cil_committed+0xab6/0xfb0
xlog_cil_process_committed+0x117/0x1e0
xlog_state_shutdown_callbacks+0x208/0x440
xlog_force_shutdown+0x1b3/0x3a0
xlog_ioend_work+0xef/0x1d0
process_one_work+0x6f9/0xf70
worker_thread+0x578/0xf30
kthread+0x28c/0x330
ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff88801800f388
which belongs to the cache xfs_buf_item of size 272
The buggy address is located 104 bytes inside of
272-byte region [ffff88801800f388, ffff88801800f498)
The buggy address belongs to the physical page:
page:ffffea0000600380 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff88801800f208 pfn:0x1800e
head:ffffea0000600380 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea0000699788 ffff88801319db50 ffff88800fb50640
raw: ffff88801800f208 000000000015000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88801800f280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88801800f300: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88801800f380: fc fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88801800f400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88801800f480: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
[ Backport to 5.15: context cleanly applied with no semantic changes.
Build-tested. ]
Signed-off-by: Guo Xuenan <guoxuenan(a)huawei.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Pranav Tyagi <pranav.tyagi03(a)gmail.com>
---
fs/xfs/xfs_buf_item.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index b1ab100c09e1..ffe318eb897f 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -1017,6 +1017,8 @@ xfs_buf_item_relse(
trace_xfs_buf_item_relse(bp, _RET_IP_);
ASSERT(!test_bit(XFS_LI_IN_AIL, &bip->bli_item.li_flags));
+ if (atomic_read(&bip->bli_refcount))
+ return;
bp->b_log_item = NULL;
xfs_buf_rele(bp);
xfs_buf_item_free(bip);
--
2.49.0
Hi,
I'm seeing a regression on an HP ZBook using the e1000e driver
(chipset PCI ID: [8086:57a0]) -- the system can't get an IP address
after hot-plugging an Ethernet cable. In this case, the Ethernet cable
was unplugged at boot. The network interface eno1 was present but
stuck in the DHCP process. Using tcpdump, only TX packets were visible
and never got any RX -- indicating a possible packet loss or
link-layer issue.
This is on the vanilla Linux 6.16-rc4 (commit
62f224733431dbd564c4fe800d4b67a0cf92ed10).
Bisect says it's this commit:
commit efaaf344bc2917cbfa5997633bc18a05d3aed27f
Author: Vitaly Lifshits <vitaly.lifshits(a)intel.com>
Date: Thu Mar 13 16:05:56 2025 +0200
e1000e: change k1 configuration on MTP and later platforms
Starting from Meteor Lake, the Kumeran interface between the integrated
MAC and the I219 PHY works at a different frequency. This causes sporadic
MDI errors when accessing the PHY, and in rare circumstances could lead
to packet corruption.
To overcome this, introduce minor changes to the Kumeran idle
state (K1) parameters during device initialization. Hardware reset
reverts this configuration, therefore it needs to be applied in a few
places.
Fixes: cc23f4f0b6b9 ("e1000e: Add support for Meteor Lake")
Signed-off-by: Vitaly Lifshits <vitaly.lifshits(a)intel.com>
Tested-by: Avigail Dahan <avigailx.dahan(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
drivers/net/ethernet/intel/e1000e/defines.h | 3 +++
drivers/net/ethernet/intel/e1000e/ich8lan.c | 80
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
drivers/net/ethernet/intel/e1000e/ich8lan.h | 4 ++++
3 files changed, 82 insertions(+), 5 deletions(-)
Reverting this patch resolves the issue.
Based on the symptoms and the bisect result, this issue might be
similar to https://lore.kernel.org/intel-wired-lan/20250626153544.1853d106@onyx.my.dom…
Affected machine is:
HP ZBook X G1i 16 inch Mobile Workstation PC, BIOS 01.02.03 05/27/2025
(see end of message for dmesg from boot)
CPU model name:
Intel(R) Core(TM) Ultra 7 265H (Arrow Lake)
ethtool output:
driver: e1000e
version: 6.16.0-061600rc4-generic
firmware-version: 0.1-4
expansion-rom-version:
bus-info: 0000:00:1f.6
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
lspci output:
0:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:57a0]
DeviceName: Onboard Ethernet
Subsystem: Hewlett-Packard Company Device [103c:8e1d]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin D routed to IRQ 162
IOMMU group: 17
Region 0: Memory at 92280000 (32-bit, non-prefetchable) [size=128K]
Capabilities: [c8] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00798 Data: 0000
Kernel driver in use: e1000e
Kernel modules: e1000e
The relevant dmesg:
<<<cable disconnected>>>
[ 0.927394] e1000e: Intel(R) PRO/1000 Network Driver
[ 0.927398] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 0.927933] e1000e 0000:00:1f.6: enabling device (0000 -> 0002)
[ 0.928249] e1000e 0000:00:1f.6: Interrupt Throttling Rate
(ints/sec) set to dynamic conservative mode
[ 1.155716] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized):
registered PHC clock
[ 1.220694] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width
x1) 24:fb:e3:bf:28:c6
[ 1.220721] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
[ 1.220903] e1000e 0000:00:1f.6 eth0: MAC: 16, PHY: 12, PBA No: FFFFFF-0FF
[ 1.222632] e1000e 0000:00:1f.6 eno1: renamed from eth0
<<<cable connected>>>
[ 153.932626] e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Half
Duplex, Flow Control: None
[ 153.934527] e1000e 0000:00:1f.6 eno1: NIC Link is Down
[ 157.622238] e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full
Duplex, Flow Control: None
No error message seen after hot-plugging the Ethernet cable.
--
Best Regards,
En-Wei.