Modify the framework to adapt to more map modes, add benchmark
support for dma_map_sg, and add support sg map mode in ioctl.
The result:
[root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
dma mapping mode: DMA_MAP_SG_MODE
dma mapping benchmark: threads:8 seconds:30 node:-1 dir:FROM_DEVICE granule/sg_nents: 8
average map latency(us):1.4 standard deviation:0.3
average unmap latency(us):1.3 standard deviation:0.3
[root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
dma mapping mode: DMA_MAP_SINGLE_MODE
dma mapping benchmark: threads:8 seconds:30 node:-1 dir:FROM_DEVICE granule/sg_nents: 8
average map latency(us):1.0 standard deviation:0.3
average unmap latency(us):1.3 standard deviation:0.5
---
Changes since V2:
- Address the comments from Barry and ALOK, some commit information and function
input parameter names are modified to make them more accurate.
- Link: https://lore.kernel.org/all/20250506030100.394376-1-xiaqinxin@huawei.com/
Changes since V1:
- Address the comments from Barry, added some comments and changed the unmap type to void.
- Link: https://lore.kernel.org/lkml/20250212022718.1995504-1-xiaqinxin@huawei.com/
Qinxin Xia (4):
dma-mapping: benchmark: Add padding to ensure uABI remained consistent
dma-mapping: benchmark: modify the framework to adapt to more map modes
dma-mapping: benchmark: add support for dma_map_sg
selftests/dma: Add dma_map_sg support
include/linux/map_benchmark.h | 46 +++-
kernel/dma/map_benchmark.c | 225 ++++++++++++++++--
.../testing/selftests/dma/dma_map_benchmark.c | 16 +-
3 files changed, 252 insertions(+), 35 deletions(-)
--
2.33.0
This series adds support for camera clock controller base driver,
bindings and DT support on sc8180x platform.
Signed-off-by: Satya Priya Kakitapalli <quic_skakitap(a)quicinc.com>
---
Changes in v3:
- Drop Fixes tag in patch [1/4]. Dropped unused gpu_iref and
aggre_ufs_card_2 clk bindings.
- Move the allOf block below required block in bindings patch.
- Remove the unused cam_cc_parent_data_7 and cam_cc_parent_map_7
in the driver patch. Reported by kernel test bot.
- Link to v2: https://lore.kernel.org/r/20250430-sc8180x-camcc-support-v2-0-6bbb514f467c@…
Changes in v2:
- New patch [1/4] to add all the missing gcc bindings along with
the required GCC_CAMERA_AHB_CLOCK
- As per Konrad's comments, add the camera AHB clock dependency in the
DT and yaml bindings.
- As per Vladimir's comments, update the Kconfig to add the SC8180X config
in correct alphanumerical order.
- Link to v1: https://lore.kernel.org/r/20250422-sc8180x-camcc-support-v1-0-691614d13f06@…
---
Satya Priya Kakitapalli (4):
dt-bindings: clock: qcom: Add missing bindings on gcc-sc8180x
dt-bindings: clock: Add Qualcomm SC8180X Camera clock controller
clk: qcom: camcc-sc8180x: Add SC8180X camera clock controller driver
arm64: dts: qcom: Add camera clock controller for sc8180x
.../bindings/clock/qcom,sc8180x-camcc.yaml | 67 +
arch/arm64/boot/dts/qcom/sc8180x.dtsi | 14 +
drivers/clk/qcom/Kconfig | 10 +
drivers/clk/qcom/Makefile | 1 +
drivers/clk/qcom/camcc-sc8180x.c | 2889 ++++++++++++++++++++
include/dt-bindings/clock/qcom,gcc-sc8180x.h | 10 +
include/dt-bindings/clock/qcom,sc8180x-camcc.h | 181 ++
7 files changed, 3172 insertions(+)
---
base-commit: bc8aa6cdadcc00862f2b5720e5de2e17f696a081
change-id: 20250422-sc8180x-camcc-support-9a82507d2a39
Best regards,
--
Satya Priya Kakitapalli <quic_skakitap(a)quicinc.com>
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x be8250786ca94952a19ce87f98ad9906448bc9ef
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025050521-crispy-study-e836@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be8250786ca94952a19ce87f98ad9906448bc9ef Mon Sep 17 00:00:00 2001
From: Zhenhua Huang <quic_zhenhuah(a)quicinc.com>
Date: Mon, 21 Apr 2025 15:52:32 +0800
Subject: [PATCH] mm, slab: clean up slab->obj_exts always
When memory allocation profiling is disabled at runtime or due to an
error, shutdown_mem_profiling() is called: slab->obj_exts which
previously allocated remains.
It won't be cleared by unaccount_slab() because of
mem_alloc_profiling_enabled() not true. It's incorrect, slab->obj_exts
should always be cleaned up in unaccount_slab() to avoid following error:
[...]BUG: Bad page state in process...
..
[...]page dumped because: page still charged to cgroup
[andriy.shevchenko(a)linux.intel.com: fold need_slab_obj_ext() into its only user]
Fixes: 21c690a349ba ("mm: introduce slabobj_ext to support slab object extensions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhenhua Huang <quic_zhenhuah(a)quicinc.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Acked-by: Harry Yoo <harry.yoo(a)oracle.com>
Tested-by: Harry Yoo <harry.yoo(a)oracle.com>
Acked-by: Suren Baghdasaryan <surenb(a)google.com>
Link: https://patch.msgid.link/20250421075232.2165527-1-quic_zhenhuah@quicinc.com
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
diff --git a/mm/slub.c b/mm/slub.c
index dc9e729e1d26..be8b09e09d30 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2028,8 +2028,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
return 0;
}
-/* Should be called only if mem_alloc_profiling_enabled() */
-static noinline void free_slab_obj_exts(struct slab *slab)
+static inline void free_slab_obj_exts(struct slab *slab)
{
struct slabobj_ext *obj_exts;
@@ -2049,18 +2048,6 @@ static noinline void free_slab_obj_exts(struct slab *slab)
slab->obj_exts = 0;
}
-static inline bool need_slab_obj_ext(void)
-{
- if (mem_alloc_profiling_enabled())
- return true;
-
- /*
- * CONFIG_MEMCG creates vector of obj_cgroup objects conditionally
- * inside memcg_slab_post_alloc_hook. No other users for now.
- */
- return false;
-}
-
#else /* CONFIG_SLAB_OBJ_EXT */
static inline void init_slab_obj_exts(struct slab *slab)
@@ -2077,11 +2064,6 @@ static inline void free_slab_obj_exts(struct slab *slab)
{
}
-static inline bool need_slab_obj_ext(void)
-{
- return false;
-}
-
#endif /* CONFIG_SLAB_OBJ_EXT */
#ifdef CONFIG_MEM_ALLOC_PROFILING
@@ -2129,7 +2111,7 @@ __alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
static inline void
alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
{
- if (need_slab_obj_ext())
+ if (mem_alloc_profiling_enabled())
__alloc_tagging_slab_alloc_hook(s, object, flags);
}
@@ -2601,8 +2583,12 @@ static __always_inline void account_slab(struct slab *slab, int order,
static __always_inline void unaccount_slab(struct slab *slab, int order,
struct kmem_cache *s)
{
- if (memcg_kmem_online() || need_slab_obj_ext())
- free_slab_obj_exts(slab);
+ /*
+ * The slab object extensions should now be freed regardless of
+ * whether mem_alloc_profiling_enabled() or not because profiling
+ * might have been disabled after slab->obj_exts got allocated.
+ */
+ free_slab_obj_exts(slab);
mod_node_page_state(slab_pgdat(slab), cache_vmstat_idx(s),
-(PAGE_SIZE << order));
After a recent change [1] in clang's randstruct implementation to
randomize structures that only contain function pointers, there is an
error because qede_ll_ops get randomized but does not use a designated
initializer for the first member:
drivers/net/ethernet/qlogic/qede/qede_main.c:206:2: error: a randomized struct can only be initialized with a designated initializer
206 | {
| ^
Explicitly initialize the common member using a designated initializer
to fix the build.
Cc: stable(a)vger.kernel.org
Fixes: 035f7f87b729 ("randstruct: Enable Clang support")
Link: https://github.com/llvm/llvm-project/commit/04364fb888eea6db9811510607bed4b… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
drivers/net/ethernet/qlogic/qede/qede_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 99df00c30b8c..b5d744d2586f 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -203,7 +203,7 @@ static struct pci_driver qede_pci_driver = {
};
static struct qed_eth_cb_ops qede_ll_ops = {
- {
+ .common = {
#ifdef CONFIG_RFS_ACCEL
.arfs_filter_op = qede_arfs_filter_op,
#endif
---
base-commit: 9540984da649d46f699c47f28c68bbd3c9d99e4c
change-id: 20250507-qede-fix-clang-randstruct-13d8c593cb58
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
When CONFIG_PREEMPT_COUNT is not configured (i.e. CONFIG_PREEMPT_NONE/
CONFIG_PREEMPT_VOLUNTARY), preempt_disable() / preempt_enable() merely
acts as a barrier(). However, in these cases cond_resched() can still
trigger a context switch and modify the CSR.EUEN, resulting in do_fpu()
exception being activated within the kernel-fpu critical sections, as
demonstrated in the following path:
dcn32_calculate_wm_and_dlg()
DC_FP_START()
dcn32_calculate_wm_and_dlg_fpu()
dcn32_find_dummy_latency_index_for_fw_based_mclk_switch()
dcn32_internal_validate_bw()
dcn32_enable_phantom_stream()
dc_create_stream_for_sink()
kzalloc(GFP_KERNEL)
__kmem_cache_alloc_node()
__cond_resched()
DC_FP_END()
This patch is similar to commit d021985 (x86/fpu: Improve crypto
performance by making kernel-mode FPU reliably usable in softirqs). It
uses local_bh_disable() instead of preempt_disable() for non-RT kernels
so it can avoid the cond_resched() issue, and also extend the kernel-fpu
application scenarios to the softirq context.
Cc: stable(a)vger.kernel.org
Signed-off-by: Tianyang Zhang <zhangtianyang(a)loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
arch/loongarch/kernel/kfpu.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/arch/loongarch/kernel/kfpu.c b/arch/loongarch/kernel/kfpu.c
index ec5b28e570c9..4e469b021cf4 100644
--- a/arch/loongarch/kernel/kfpu.c
+++ b/arch/loongarch/kernel/kfpu.c
@@ -18,11 +18,28 @@ static unsigned int euen_mask = CSR_EUEN_FPEN;
static DEFINE_PER_CPU(bool, in_kernel_fpu);
static DEFINE_PER_CPU(unsigned int, euen_current);
+static inline void fpregs_lock(void)
+{
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+ local_bh_disable();
+ else
+ preempt_disable();
+}
+
+static inline void fpregs_unlock(void)
+{
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+ local_bh_enable();
+ else
+ preempt_enable();
+}
+
void kernel_fpu_begin(void)
{
unsigned int *euen_curr;
- preempt_disable();
+ if (!irqs_disabled())
+ fpregs_lock();
WARN_ON(this_cpu_read(in_kernel_fpu));
@@ -73,7 +90,8 @@ void kernel_fpu_end(void)
this_cpu_write(in_kernel_fpu, false);
- preempt_enable();
+ if (!irqs_disabled())
+ fpregs_unlock();
}
EXPORT_SYMBOL_GPL(kernel_fpu_end);
--
2.20.1
commit 968f19c5b1b7d5595423b0ac0020cc18dfed8cb5 upstream.
[BUG]
It is a long known bug that VM image on btrfs can lead to data csum
mismatch, if the qemu is using direct-io for the image (this is commonly
known as cache mode 'none').
[CAUSE]
Inside the VM, if the fs is EXT4 or XFS, or even NTFS from Windows, the
fs is allowed to dirty/modify the folio even if the folio is under
writeback (as long as the address space doesn't have AS_STABLE_WRITES
flag inherited from the block device).
This is a valid optimization to improve the concurrency, and since these
filesystems have no extra checksum on data, the content change is not a
problem at all.
But the final write into the image file is handled by btrfs, which needs
the content not to be modified during writeback, or the checksum will
not match the data (checksum is calculated before submitting the bio).
So EXT4/XFS/NTRFS assume they can modify the folio under writeback, but
btrfs requires no modification, this leads to the false csum mismatch.
This is only a controlled example, there are even cases where
multi-thread programs can submit a direct IO write, then another thread
modifies the direct IO buffer for whatever reason.
For such cases, btrfs has no sane way to detect such cases and leads to
false data csum mismatch.
[FIX]
I have considered the following ideas to solve the problem:
- Make direct IO to always skip data checksum
This not only requires a new incompatible flag, as it breaks the
current per-inode NODATASUM flag.
But also requires extra handling for no csum found cases.
And this also reduces our checksum protection.
- Let hardware handle all the checksum
AKA, just nodatasum mount option.
That requires trust for hardware (which is not that trustful in a lot
of cases), and it's not generic at all.
- Always fallback to buffered write if the inode requires checksum
This was suggested by Christoph, and is the solution utilized by this
patch.
The cost is obvious, the extra buffer copying into page cache, thus it
reduces the performance.
But at least it's still user configurable, if the end user still wants
the zero-copy performance, just set NODATASUM flag for the inode
(which is a common practice for VM images on btrfs).
Since we cannot trust user space programs to keep the buffer
consistent during direct IO, we have no choice but always falling back
to buffered IO. At least by this, we avoid the more deadly false data
checksum mismatch error.
CC: stable(a)vger.kernel.org # 6.6
Suggested-by: Christoph Hellwig <hch(a)infradead.org>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
[ Fix a conflict due to the movement of the function. ]
---
fs/btrfs/file.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index e794606e7c78..f1456c745c6d 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1515,6 +1515,23 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
goto buffered;
}
+ /*
+ * We can't control the folios being passed in, applications can write
+ * to them while a direct IO write is in progress. This means the
+ * content might change after we calculated the data checksum.
+ * Therefore we can end up storing a checksum that doesn't match the
+ * persisted data.
+ *
+ * To be extra safe and avoid false data checksum mismatch, if the
+ * inode requires data checksum, just fallback to buffered IO.
+ * For buffered IO we have full control of page cache and can ensure
+ * no one is modifying the content during writeback.
+ */
+ if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)) {
+ btrfs_inode_unlock(BTRFS_I(inode), ilock_flags);
+ goto buffered;
+ }
+
/*
* The iov_iter can be mapped to the same file range we are writing to.
* If that's the case, then we will deadlock in the iomap code, because
--
2.49.0