Hi all,
This backport wires up AMD perfmon v2 so BPF and other software clients
can snapshot LBR stacks on demand, similar to the Intel support
upstream. The series keeps the LBR-freeze path branchless, adds the
perf_snapshot_branch_stack callback for AMD, and drops the
sampling-only restriction now that snapshots can be taken from software
contexts.
Leon Hwang (4):
perf/x86/amd: Ensure amd_pmu_core_disable_all() is always inlined
perf/x86/amd: Avoid taking branches before disabling LBR
perf/x86/amd: Support capturing LBR from software events
perf/x86/amd: Don't reject non-sampling events with configured LBR
arch/x86/events/amd/core.c | 37 +++++++++++++++++++++++++++++++++++-
arch/x86/events/amd/lbr.c | 13 +------------
arch/x86/events/perf_event.h | 13 +++++++++++++
3 files changed, 50 insertions(+), 13 deletions(-)
--
2.52.0
Hi and happy new year!
I would like to report a problem that I am encountering with the sdcard
storage.
I have a laptop/convertible where /home is on an sd-card (because the
internal disk is too small). The card is luks encrypted and has a btrfs
filesystem on it.
When the laptop sleeps and then resumes, there is a race. The sdcard
gets accessed for read/write but is not yet ready, so there are I/O
errors. BTRFS is not happy with them and tends to remount RO.
This issue is well known to purism developers (e.g. see
https://source.puri.sm/Librem5/linux/-/issues/484 and
https://forums.puri.sm/t/sdcard-becomes-read-only-after-waking-up-from-susp…).
My kernel logs are identical to those in
https://source.puri.sm/Librem5/linux/-/issues/484 (first comment), apart
from the fact that I get the errors from BTRFS, while the reporter there
gets the errors from EXT4. This indicates that the race is not specific
to BTRFS.
The errors in the kernel logs come right after the PM: suspend exit message.
From what I understand:
1. The error is more frequent with the SD-LUKS-FILESYSTEM
stratification, but not specific to it
2. A phone/tablet set up such as those that purism developers address
will generally use sdcard for storage and require suspend, being a good
trigger for the problem. However, the problem is in no means specific to
phones, ARM devices, etc. I am getting it on an X86-64 laptop.
3. It is unclear to me if there is a real risk of data loss. Possibly
with BTRFS that has a more complex data management this can be the case.
4.Even without data loss, the issue requires a reboot to get the
filesystem back to RW, so it is annoying.
5. Purism developers have a kernel patch for it at
https://source.puri.sm/Librem5/linux/-/merge_requests/788, but I believe
it is not in linux mainline or stable. Would it make sense to consider
that patch?
6. For stable kernels, there is a mitigation consisting in a systemd
sleep-resume hook as in
#!/bin/sh
/usr/bin/systemd-cat -p5 /usr/bin/echo ${1} ${2}
case "${1}" in
post)
sleep 1.5
systemd-cat -p4 /usr/bin/echo "hack, wait for sdcard"
;;
esac
see https://source.puri.sm/Librem5/linux/-/issues/484#note_277648
This appears to reduce the occurrence of the problem, but not to
eliminate it completely.
Thanks for the attention
Sergio
The issue occurs when gfs2_freeze_lock_shared() fails in
gfs2_fill_super(). If !sb_rdonly(sb), threads for the quotad and logd
were started, however, in the error path for gfs2_freeze_lock_shared(),
the threads are not stopped by gfs2_destroy_threads() before jumping to
fail_per_node.
Introduce fail_threads to handle stopping the threads if the threads were
started.
Reported-by: syzbot+4cb0d0336db6bc6930e9(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4cb0d0336db6bc6930e9
Fixes: a28dc123fa66 ("gfs2: init system threads before freeze lock")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ryota Sakamoto <sakamo.ryota(a)gmail.com>
---
Changes in v2:
- Fix commit message style (imperative mood) as suggested by Markus Elfring.
- Add parentheses to function name in subject as suggested by Markus Elfring.
- Link to v1: https://lore.kernel.org/r/20251230-fix-use-after-free-gfs2-v1-1-ef0e46db6ec…
---
fs/gfs2/ops_fstype.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index e7a88b717991ae3647c1da039636daef7005a7f0..4b5ac1a7050f1fd34e10be4100a2bc381f49c83d 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1269,21 +1269,23 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
error = gfs2_freeze_lock_shared(sdp);
if (error)
- goto fail_per_node;
+ goto fail_threads;
if (!sb_rdonly(sb))
error = gfs2_make_fs_rw(sdp);
if (error) {
gfs2_freeze_unlock(sdp);
- gfs2_destroy_threads(sdp);
fs_err(sdp, "can't make FS RW: %d\n", error);
- goto fail_per_node;
+ goto fail_threads;
}
gfs2_glock_dq_uninit(&mount_gh);
gfs2_online_uevent(sdp);
return 0;
+fail_threads:
+ if (!sb_rdonly(sb))
+ gfs2_destroy_threads(sdp);
fail_per_node:
init_per_node(sdp, UNDO);
fail_inodes:
---
base-commit: 7839932417dd53bb09eb5a585a7a92781dfd7cb2
change-id: 20251230-fix-use-after-free-gfs2-66cfbe23baa8
Best regards,
--
Ryota Sakamoto <sakamo.ryota(a)gmail.com>
Dear,
Please grant me the permission to share important discussions with you. I
am looking forward to hearing from you at your earliest convenience.
Best Regards.
Mrs. Mariam Kouame
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x f5e1e5ec204da11fa87fdf006d451d80ce06e118
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025122951-giggle-reveler-2e6b@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f5e1e5ec204da11fa87fdf006d451d80ce06e118 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj(a)kernel.org>
Date: Thu, 11 Dec 2025 15:45:04 -1000
Subject: [PATCH] sched_ext: Fix missing post-enqueue handling in
move_local_task_to_local_dsq()
move_local_task_to_local_dsq() is used when moving a task from a non-local
DSQ to a local DSQ on the same CPU. It directly manipulates the local DSQ
without going through dispatch_enqueue() and was missing the post-enqueue
handling that triggers preemption when SCX_ENQ_PREEMPT is set or the idle
task is running.
The function is used by move_task_between_dsqs() which backs
scx_bpf_dsq_move() and may be called while the CPU is busy.
Add local_dsq_post_enq() call to move_local_task_to_local_dsq(). As the
dispatch path doesn't need post-enqueue handling, add SCX_RQ_IN_BALANCE
early exit to keep consume_dispatch_q() behavior unchanged and avoid
triggering unnecessary resched when scx_bpf_dsq_move() is used from the
dispatch path.
Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
Cc: stable(a)vger.kernel.org # v6.12+
Reviewed-by: Andrea Righi <arighi(a)nvidia.com>
Reviewed-by: Emil Tsalapatis <emil(a)etsalapatis.com>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index c78efa99406f..695503a2f7d1 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -988,6 +988,14 @@ static void local_dsq_post_enq(struct scx_dispatch_q *dsq, struct task_struct *p
struct rq *rq = container_of(dsq, struct rq, scx.local_dsq);
bool preempt = false;
+ /*
+ * If @rq is in balance, the CPU is already vacant and looking for the
+ * next task to run. No need to preempt or trigger resched after moving
+ * @p into its local DSQ.
+ */
+ if (rq->scx.flags & SCX_RQ_IN_BALANCE)
+ return;
+
if ((enq_flags & SCX_ENQ_PREEMPT) && p != rq->curr &&
rq->curr->sched_class == &ext_sched_class) {
rq->curr->scx.slice = 0;
@@ -1636,6 +1644,8 @@ static void move_local_task_to_local_dsq(struct task_struct *p, u64 enq_flags,
dsq_mod_nr(dst_dsq, 1);
p->scx.dsq = dst_dsq;
+
+ local_dsq_post_enq(dst_dsq, p, enq_flags);
}
/**
When software issues a Cache Maintenance Operation (CMO) targeting a
dirty cache line, the CPU and DSU cluster may optimize the operation by
combining the CopyBack Write and CMO into a single combined CopyBack
Write plus CMO transaction presented to the interconnect (MCN).
For these combined transactions, the MCN splits the operation into two
separate transactions, one Write and one CMO, and then propagates the
write and optionally the CMO to the downstream memory system or external
Point of Serialization (PoS).
However, the MCN may return an early CompCMO response to the DSU cluster
before the corresponding Write and CMO transactions have completed at
the external PoS or downstream memory. As a result, stale data may be
observed by external observers that are directly connected to the
external PoS or downstream memory.
This erratum affects any system topology in which the following
conditions apply:
- The Point of Serialization (PoS) is located downstream of the
interconnect.
- A downstream observer accesses memory directly, bypassing the
interconnect.
Conditions:
This erratum occurs only when all of the following conditions are met:
1. Software executes a data cache maintenance operation, specifically,
a clean or invalidate by virtual address (DC CVAC, DC CIVAC, or DC
IVAC), that hits on unique dirty data in the CPU or DSU cache. This
results in a combined CopyBack and CMO being issued to the
interconnect.
2. The interconnect splits the combined transaction into separate Write
and CMO transactions and returns an early completion response to the
CPU or DSU before the write has completed at the downstream memory
or PoS.
3. A downstream observer accesses the affected memory address after the
early completion response is issued but before the actual memory
write has completed. This allows the observer to read stale data
that has not yet been updated at the PoS or downstream memory.
The implementation of workaround put a second loop of CMOs at the same
virtual address whose operation meet erratum conditions to wait until
cache data be cleaned to PoC.. This way of implementation mitigates
performance panalty compared to purly duplicate orignial CMO.
Reported-by: kernel test robot <lkp(a)intel.com>
Cc: stable(a)vger.kernel.org # 6.12.x
Signed-off-by: Lucas Wei <lucaswei(a)google.com>
---
Changes in v2:
1. Fixed warning from kernel test robot by changing
arm_si_l1_workaround_4311569 to static
[Reported-by: kernel test robot <lkp(a)intel.com>]
---
Documentation/arch/arm64/silicon-errata.rst | 3 ++
arch/arm64/Kconfig | 19 +++++++++++++
arch/arm64/include/asm/assembler.h | 10 +++++++
arch/arm64/kernel/cpu_errata.c | 31 +++++++++++++++++++++
arch/arm64/mm/cache.S | 13 ++++++++-
arch/arm64/tools/cpucaps | 1 +
6 files changed, 76 insertions(+), 1 deletion(-)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst
index a7ec57060f64..98efdf528719 100644
--- a/Documentation/arch/arm64/silicon-errata.rst
+++ b/Documentation/arch/arm64/silicon-errata.rst
@@ -213,6 +213,9 @@ stable kernels.
| ARM | GIC-700 | #2941627 | ARM64_ERRATUM_2941627 |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | SI L1 | #4311569 | ARM64_ERRATUM_4311569 |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 |
+----------------+-----------------+-----------------+-----------------------------+
| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_843419 |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 65db12f66b8f..a834d30859cc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1153,6 +1153,25 @@ config ARM64_ERRATUM_3194386
If unsure, say Y.
+config ARM64_ERRATUM_4311569
+ bool "SI L1: 4311569: workaround for premature CMO completion erratum"
+ default y
+ help
+ This option adds the workaround for ARM SI L1 erratum 4311569.
+
+ The erratum of SI L1 can cause an early response to a combined write
+ and cache maintenance operation (WR+CMO) before the operation is fully
+ completed to the Point of Serialization (POS).
+ This can result in a non-I/O coherent agent observing stale data,
+ potentially leading to system instability or incorrect behavior.
+
+ Enabling this option implements a software workaround by inserting a
+ second loop of Cache Maintenance Operation (CMO) immediately following the
+ end of function to do CMOs. This ensures that the data is correctly serialized
+ before the buffer is handed off to a non-coherent agent.
+
+ If unsure, say Y.
+
config CAVIUM_ERRATUM_22375
bool "Cavium erratum 22375, 24313"
default y
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index f0ca7196f6fa..d3d46e5f7188 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -381,6 +381,9 @@ alternative_endif
.macro dcache_by_myline_op op, domain, start, end, linesz, tmp, fixup
sub \tmp, \linesz, #1
bic \start, \start, \tmp
+alternative_if ARM64_WORKAROUND_4311569
+ mov \tmp, \start
+alternative_else_nop_endif
.Ldcache_op\@:
.ifc \op, cvau
__dcache_op_workaround_clean_cache \op, \start
@@ -402,6 +405,13 @@ alternative_endif
add \start, \start, \linesz
cmp \start, \end
b.lo .Ldcache_op\@
+alternative_if ARM64_WORKAROUND_4311569
+ .ifnc \op, cvau
+ mov \start, \tmp
+ mov \tmp, xzr
+ cbnz \start, .Ldcache_op\@
+ .endif
+alternative_else_nop_endif
dsb \domain
_cond_uaccess_extable .Ldcache_op\@, \fixup
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 8cb3b575a031..5c0ab6bfd44a 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -141,6 +141,30 @@ has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
return (ctr_real != sys) && (ctr_raw != sys);
}
+#ifdef CONFIG_ARM64_ERRATUM_4311569
+static DEFINE_STATIC_KEY_FALSE(arm_si_l1_workaround_4311569);
+static int __init early_arm_si_l1_workaround_4311569_cfg(char *arg)
+{
+ static_branch_enable(&arm_si_l1_workaround_4311569);
+ pr_info("Enabling cache maintenance workaround for ARM SI-L1 erratum 4311569\n");
+
+ return 0;
+}
+early_param("arm_si_l1_workaround_4311569", early_arm_si_l1_workaround_4311569_cfg);
+
+/*
+ * We have some earlier use cases to call cache maintenance operation functions, for example,
+ * dcache_inval_poc() and dcache_clean_poc() in head.S, before making decision to turn on this
+ * workaround. Since the scope of this workaround is limited to non-coherent DMA agents, its
+ * safe to have the workaround off by default.
+ */
+static bool
+need_arm_si_l1_workaround_4311569(const struct arm64_cpu_capabilities *entry, int scope)
+{
+ return static_branch_unlikely(&arm_si_l1_workaround_4311569);
+}
+#endif
+
static void
cpu_enable_trap_ctr_access(const struct arm64_cpu_capabilities *cap)
{
@@ -870,6 +894,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
ERRATA_MIDR_RANGE_LIST(erratum_spec_ssbs_list),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_4311569
+ {
+ .capability = ARM64_WORKAROUND_4311569,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = need_arm_si_l1_workaround_4311569,
+ },
+#endif
#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
{
.desc = "ARM errata 2966298, 3117295",
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 503567c864fd..ddf0097624ed 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -143,9 +143,14 @@ SYM_FUNC_END(dcache_clean_pou)
* - end - kernel end address of region
*/
SYM_FUNC_START(__pi_dcache_inval_poc)
+alternative_if ARM64_WORKAROUND_4311569
+ mov x4, x0
+ mov x5, x1
+ mov x6, #1
+alternative_else_nop_endif
dcache_line_size x2, x3
sub x3, x2, #1
- tst x1, x3 // end cache line aligned?
+again: tst x1, x3 // end cache line aligned?
bic x1, x1, x3
b.eq 1f
dc civac, x1 // clean & invalidate D / U line
@@ -158,6 +163,12 @@ SYM_FUNC_START(__pi_dcache_inval_poc)
3: add x0, x0, x2
cmp x0, x1
b.lo 2b
+alternative_if ARM64_WORKAROUND_4311569
+ mov x0, x4
+ mov x1, x5
+ sub x6, x6, #1
+ cbz x6, again
+alternative_else_nop_endif
dsb sy
ret
SYM_FUNC_END(__pi_dcache_inval_poc)
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 1b32c1232d28..3b18734f9744 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -101,6 +101,7 @@ WORKAROUND_2077057
WORKAROUND_2457168
WORKAROUND_2645198
WORKAROUND_2658417
+WORKAROUND_4311569
WORKAROUND_AMPERE_AC03_CPU_38
WORKAROUND_AMPERE_AC04_CPU_23
WORKAROUND_TRBE_OVERWRITE_FILL_MODE
base-commit: edde060637b92607f3522252c03d64ad06369933
--
2.52.0.358.g0dd7633a29-goog
Hi,
This series intends to fix the race between the MHI stack and the MHI client
drivers due to the MHI 'auto_queue' feature. As it turns out often, the best
way to fix an issue in a feature is to drop the feature itself and this series
does exactly that.
There is no real benefit in having the 'auto_queue' feature in the MHI stack,
other than saving a few lines of code in the client drivers. Since the QRTR is
the only client driver which makes use of this feature, this series reworks the
QRTR driver to manage the buffer on its own.
Testing
=======
Tested on Qcom X1E based Lenovo Thinkpad T14s laptop with WLAN device.
Merge Strategy
==============
Since this series modifies many subsystem drivers, I'd like to get acks from
relevant subsystem maintainers and take the series through MHI tree.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)oss.qualcomm.com>
---
Changes in v2:
- Used mhi_get_free_desc_count() to queue the buffers
- Collected tags
- Link to v1: https://lore.kernel.org/r/20251217-qrtr-fix-v1-0-f6142a3ec9d8@oss.qualcomm.…
---
Manivannan Sadhasivam (2):
net: qrtr: Drop the MHI auto_queue feature for IPCR DL channels
bus: mhi: host: Drop the auto_queue support
drivers/accel/qaic/mhi_controller.c | 44 -------------------
drivers/bus/mhi/host/init.c | 10 -----
drivers/bus/mhi/host/internal.h | 3 --
drivers/bus/mhi/host/main.c | 81 +----------------------------------
drivers/bus/mhi/host/pci_generic.c | 20 +--------
drivers/net/wireless/ath/ath11k/mhi.c | 4 --
drivers/net/wireless/ath/ath12k/mhi.c | 4 --
include/linux/mhi.h | 14 ------
net/qrtr/mhi.c | 69 ++++++++++++++++++++++++-----
9 files changed, 62 insertions(+), 187 deletions(-)
---
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
change-id: 20251217-qrtr-fix-c058251d8d1a
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam(a)oss.qualcomm.com>
After discussion with the devicetree maintainers we agreed to not extend
lists with the generic compatible "apple,admac" anymore [1]. Use
"apple,t8103-admac" as base compatible as it is the SoC the driver and
bindings were written for.
[1]: https://lore.kernel.org/asahi/12ab93b7-1fc2-4ce0-926e-c8141cfe81bf@kernel.o…
Fixes: b127315d9a78 ("dmaengine: apple-admac: Add Apple ADMAC driver")
Cc: stable(a)vger.kernel.org
Reviewed-by: Neal Gompa <neal(a)gompa.dev>
Signed-off-by: Janne Grunau <j(a)jannau.net>
---
This is split off from the v1 series adding Apple M2 Pro/Max/Ultra
device trees in
https://lore.kernel.org/r/20250828-dt-apple-t6020-v1-0-507ba4c4b98e@jannau.…
Changes compared to the patch in that series:
- rebased onto v6.19-rc1
- added "Fixes:" and "Cc: stable" tags for handling this as fix adding a
device id
- added Neal's Reviewed-by:
---
drivers/dma/apple-admac.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/dma/apple-admac.c b/drivers/dma/apple-admac.c
index bd49f03742912198804a02a22e38da2c21093761..04bbd774b3b444f928986c266c53becf286daeea 100644
--- a/drivers/dma/apple-admac.c
+++ b/drivers/dma/apple-admac.c
@@ -936,6 +936,7 @@ static void admac_remove(struct platform_device *pdev)
}
static const struct of_device_id admac_of_match[] = {
+ { .compatible = "apple,t8103-admac", },
{ .compatible = "apple,admac", },
{ }
};
---
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
change-id: 20251231-apple-admac-t8103-base-compat-18288e7d62b3
Best regards,
--
Janne Grunau <j(a)jannau.net>