The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d36f6ed761b53933b0b4126486c10d3da7751e7f Mon Sep 17 00:00:00 2001
From: Baokun Li <libaokun1(a)huawei.com>
Date: Wed, 18 May 2022 20:08:16 +0800
Subject: [PATCH] ext4: fix bug_on in __es_tree_search
Hulk Robot reported a BUG_ON:
==================================================================
kernel BUG at fs/ext4/extents_status.c:199!
[...]
RIP: 0010:ext4_es_end fs/ext4/extents_status.c:199 [inline]
RIP: 0010:__es_tree_search+0x1e0/0x260 fs/ext4/extents_status.c:217
[...]
Call Trace:
ext4_es_cache_extent+0x109/0x340 fs/ext4/extents_status.c:766
ext4_cache_extents+0x239/0x2e0 fs/ext4/extents.c:561
ext4_find_extent+0x6b7/0xa20 fs/ext4/extents.c:964
ext4_ext_map_blocks+0x16b/0x4b70 fs/ext4/extents.c:4384
ext4_map_blocks+0xe26/0x19f0 fs/ext4/inode.c:567
ext4_getblk+0x320/0x4c0 fs/ext4/inode.c:980
ext4_bread+0x2d/0x170 fs/ext4/inode.c:1031
ext4_quota_read+0x248/0x320 fs/ext4/super.c:6257
v2_read_header+0x78/0x110 fs/quota/quota_v2.c:63
v2_check_quota_file+0x76/0x230 fs/quota/quota_v2.c:82
vfs_load_quota_inode+0x5d1/0x1530 fs/quota/dquot.c:2368
dquot_enable+0x28a/0x330 fs/quota/dquot.c:2490
ext4_quota_enable fs/ext4/super.c:6137 [inline]
ext4_enable_quotas+0x5d7/0x960 fs/ext4/super.c:6163
ext4_fill_super+0xa7c9/0xdc00 fs/ext4/super.c:4754
mount_bdev+0x2e9/0x3b0 fs/super.c:1158
mount_fs+0x4b/0x1e4 fs/super.c:1261
[...]
==================================================================
Above issue may happen as follows:
-------------------------------------
ext4_fill_super
ext4_enable_quotas
ext4_quota_enable
ext4_iget
__ext4_iget
ext4_ext_check_inode
ext4_ext_check
__ext4_ext_check
ext4_valid_extent_entries
Check for overlapping extents does't take effect
dquot_enable
vfs_load_quota_inode
v2_check_quota_file
v2_read_header
ext4_quota_read
ext4_bread
ext4_getblk
ext4_map_blocks
ext4_ext_map_blocks
ext4_find_extent
ext4_cache_extents
ext4_es_cache_extent
ext4_es_cache_extent
__es_tree_search
ext4_es_end
BUG_ON(es->es_lblk + es->es_len < es->es_lblk)
The error ext4 extents is as follows:
0af3 0300 0400 0000 00000000 extent_header
00000000 0100 0000 12000000 extent1
00000000 0100 0000 18000000 extent2
02000000 0400 0000 14000000 extent3
In the ext4_valid_extent_entries function,
if prev is 0, no error is returned even if lblock<=prev.
This was intended to skip the check on the first extent, but
in the error image above, prev=0+1-1=0 when checking the second extent,
so even though lblock<=prev, the function does not return an error.
As a result, bug_ON occurs in __es_tree_search and the system panics.
To solve this problem, we only need to check that:
1. The lblock of the first extent is not less than 0.
2. The lblock of the next extent is not less than
the next block of the previous extent.
The same applies to extent_idx.
Cc: stable(a)kernel.org
Fixes: 5946d089379a ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
Reported-by: Hulk Robot <hulkci(a)huawei.com>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20220518120816.1541863-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 474479ce76e0..c148bb97b527 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -372,7 +372,7 @@ static int ext4_valid_extent_entries(struct inode *inode,
{
unsigned short entries;
ext4_lblk_t lblock = 0;
- ext4_lblk_t prev = 0;
+ ext4_lblk_t cur = 0;
if (eh->eh_entries == 0)
return 1;
@@ -396,11 +396,11 @@ static int ext4_valid_extent_entries(struct inode *inode,
/* Check for overlapping extents */
lblock = le32_to_cpu(ext->ee_block);
- if ((lblock <= prev) && prev) {
+ if (lblock < cur) {
*pblk = ext4_ext_pblock(ext);
return 0;
}
- prev = lblock + ext4_ext_get_actual_len(ext) - 1;
+ cur = lblock + ext4_ext_get_actual_len(ext);
ext++;
entries--;
}
@@ -420,13 +420,13 @@ static int ext4_valid_extent_entries(struct inode *inode,
/* Check for overlapping index extents */
lblock = le32_to_cpu(ext_idx->ei_block);
- if ((lblock <= prev) && prev) {
+ if (lblock < cur) {
*pblk = ext4_idx_pblock(ext_idx);
return 0;
}
ext_idx++;
entries--;
- prev = lblock;
+ cur = lblock + 1;
}
}
return 1;
Hey all,
I recently used a testing program to test the 4.19 stable branch kernel and found that a crash occurred immediately. The test source code link is:
https://github.com/Backmyheart/src0358/blob/master/vxlan_fdb_destroy.c
The test command is as follows:
gcc vxlan_fdb_destroy.c -o vxlan_fdb_destroy -lpthread
According to its stack, upstream has relevant repair patch, the commit id is 7c31e54aeee517d1318dfc0bde9fa7de75893dc6.
May i ask if the 4.19 kernel will port this patch ?
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 7044dcff8301b29269016ebd17df27c4736140d2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042940-plod-embellish-5a76@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
7044dcff8301 ("rust: macros: fix soundness issue in `module!` macro")
1b6170ff7a20 ("rust: module: place generated init_module() function in .init.text")
41bdc6decda0 ("btf, scripts: rust: drop is_rust_module.sh")
310897659cf0 ("Merge tag 'rust-6.4' of https://github.com/Rust-for-Linux/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7044dcff8301b29269016ebd17df27c4736140d2 Mon Sep 17 00:00:00 2001
From: Benno Lossin <benno.lossin(a)proton.me>
Date: Mon, 1 Apr 2024 18:52:50 +0000
Subject: [PATCH] rust: macros: fix soundness issue in `module!` macro
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The `module!` macro creates glue code that are called by C to initialize
the Rust modules using the `Module::init` function. Part of this glue
code are the local functions `__init` and `__exit` that are used to
initialize/destroy the Rust module.
These functions are safe and also visible to the Rust mod in which the
`module!` macro is invoked. This means that they can be called by other
safe Rust code. But since they contain `unsafe` blocks that rely on only
being called at the right time, this is a soundness issue.
Wrap these generated functions inside of two private modules, this
guarantees that the public functions cannot be called from the outside.
Make the safe functions `unsafe` and add SAFETY comments.
Cc: stable(a)vger.kernel.org
Reported-by: Björn Roy Baron <bjorn3_gh(a)protonmail.com>
Closes: https://github.com/Rust-for-Linux/linux/issues/629
Fixes: 1fbde52bde73 ("rust: add `macros` crate")
Signed-off-by: Benno Lossin <benno.lossin(a)proton.me>
Reviewed-by: Wedson Almeida Filho <walmeida(a)microsoft.com>
Link: https://lore.kernel.org/r/20240401185222.12015-1-benno.lossin@proton.me
[ Moved `THIS_MODULE` out of the private-in-private modules since it
should remain public, as Dirk Behme noticed [1]. Capitalized comments,
avoided newline in non-list SAFETY comments and reworded to add
Reported-by and newline. ]
Link: https://rust-for-linux.zulipchat.com/#narrow/stream/291565-Help/topic/x/nea… [1]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
diff --git a/rust/macros/module.rs b/rust/macros/module.rs
index 27979e582e4b..acd0393b5095 100644
--- a/rust/macros/module.rs
+++ b/rust/macros/module.rs
@@ -199,17 +199,6 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
/// Used by the printing macros, e.g. [`info!`].
const __LOG_PREFIX: &[u8] = b\"{name}\\0\";
- /// The \"Rust loadable module\" mark.
- //
- // This may be best done another way later on, e.g. as a new modinfo
- // key or a new section. For the moment, keep it simple.
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[used]
- static __IS_RUST_MODULE: () = ();
-
- static mut __MOD: Option<{type_}> = None;
-
// SAFETY: `__this_module` is constructed by the kernel at load time and will not be
// freed until the module is unloaded.
#[cfg(MODULE)]
@@ -221,81 +210,132 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
kernel::ThisModule::from_ptr(core::ptr::null_mut())
}};
- // Loadable modules need to export the `{{init,cleanup}}_module` identifiers.
- /// # Safety
- ///
- /// This function must not be called after module initialization, because it may be
- /// freed after that completes.
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[no_mangle]
- #[link_section = \".init.text\"]
- pub unsafe extern \"C\" fn init_module() -> core::ffi::c_int {{
- __init()
- }}
+ // Double nested modules, since then nobody can access the public items inside.
+ mod __module_init {{
+ mod __module_init {{
+ use super::super::{type_};
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn cleanup_module() {{
- __exit()
- }}
+ /// The \"Rust loadable module\" mark.
+ //
+ // This may be best done another way later on, e.g. as a new modinfo
+ // key or a new section. For the moment, keep it simple.
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[used]
+ static __IS_RUST_MODULE: () = ();
- // Built-in modules are initialized through an initcall pointer
- // and the identifiers need to be unique.
- #[cfg(not(MODULE))]
- #[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
- #[doc(hidden)]
- #[link_section = \"{initcall_section}\"]
- #[used]
- pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
+ static mut __MOD: Option<{type_}> = None;
- #[cfg(not(MODULE))]
- #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
- core::arch::global_asm!(
- r#\".section \"{initcall_section}\", \"a\"
- __{name}_initcall:
- .long __{name}_init - .
- .previous
- \"#
- );
+ // Loadable modules need to export the `{{init,cleanup}}_module` identifiers.
+ /// # Safety
+ ///
+ /// This function must not be called after module initialization, because it may be
+ /// freed after that completes.
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[no_mangle]
+ #[link_section = \".init.text\"]
+ pub unsafe extern \"C\" fn init_module() -> core::ffi::c_int {{
+ // SAFETY: This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name.
+ unsafe {{ __init() }}
+ }}
- #[cfg(not(MODULE))]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
- __init()
- }}
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn cleanup_module() {{
+ // SAFETY:
+ // - This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name,
+ // - furthermore it is only called after `init_module` has returned `0`
+ // (which delegates to `__init`).
+ unsafe {{ __exit() }}
+ }}
- #[cfg(not(MODULE))]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn __{name}_exit() {{
- __exit()
- }}
+ // Built-in modules are initialized through an initcall pointer
+ // and the identifiers need to be unique.
+ #[cfg(not(MODULE))]
+ #[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
+ #[doc(hidden)]
+ #[link_section = \"{initcall_section}\"]
+ #[used]
+ pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
- fn __init() -> core::ffi::c_int {{
- match <{type_} as kernel::Module>::init(&THIS_MODULE) {{
- Ok(m) => {{
- unsafe {{
- __MOD = Some(m);
+ #[cfg(not(MODULE))]
+ #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
+ core::arch::global_asm!(
+ r#\".section \"{initcall_section}\", \"a\"
+ __{name}_initcall:
+ .long __{name}_init - .
+ .previous
+ \"#
+ );
+
+ #[cfg(not(MODULE))]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
+ // SAFETY: This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // placement above in the initcall section.
+ unsafe {{ __init() }}
+ }}
+
+ #[cfg(not(MODULE))]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn __{name}_exit() {{
+ // SAFETY:
+ // - This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name,
+ // - furthermore it is only called after `__{name}_init` has returned `0`
+ // (which delegates to `__init`).
+ unsafe {{ __exit() }}
+ }}
+
+ /// # Safety
+ ///
+ /// This function must only be called once.
+ unsafe fn __init() -> core::ffi::c_int {{
+ match <{type_} as kernel::Module>::init(&super::super::THIS_MODULE) {{
+ Ok(m) => {{
+ // SAFETY: No data race, since `__MOD` can only be accessed by this
+ // module and there only `__init` and `__exit` access it. These
+ // functions are only called once and `__exit` cannot be called
+ // before or during `__init`.
+ unsafe {{
+ __MOD = Some(m);
+ }}
+ return 0;
+ }}
+ Err(e) => {{
+ return e.to_errno();
+ }}
}}
- return 0;
}}
- Err(e) => {{
- return e.to_errno();
+
+ /// # Safety
+ ///
+ /// This function must
+ /// - only be called once,
+ /// - be called after `__init` has been called and returned `0`.
+ unsafe fn __exit() {{
+ // SAFETY: No data race, since `__MOD` can only be accessed by this module
+ // and there only `__init` and `__exit` access it. These functions are only
+ // called once and `__init` was already called.
+ unsafe {{
+ // Invokes `drop()` on `__MOD`, which should be used for cleanup.
+ __MOD = None;
+ }}
}}
+
+ {modinfo}
}}
}}
-
- fn __exit() {{
- unsafe {{
- // Invokes `drop()` on `__MOD`, which should be used for cleanup.
- __MOD = None;
- }}
- }}
-
- {modinfo}
",
type_ = info.type_,
name = info.name,
Two enclave threads may try to add and remove the same enclave page
simultaneously (e.g., if the SGX runtime supports both lazy allocation
and `MADV_DONTNEED` semantics). Consider this race:
1. T1 performs page removal in sgx_encl_remove_pages() and stops right
after removing the page table entry and right before re-acquiring the
enclave lock to EREMOVE and xa_erase(&encl->page_array) the page.
2. T2 tries to access the page, and #PF[not_present] is raised. The
condition to EAUG in sgx_vma_fault() is not satisfied because the
page is still present in encl->page_array, thus the SGX driver
assumes that the fault happened because the page was swapped out. The
driver continues on a code path that installs a page table entry
*without* performing EAUG.
3. The enclave page metadata is in inconsistent state: the PTE is
installed but there was no EAUG. Thus, T2 in userspace infinitely
receives SIGSEGV on this page (and EACCEPT always fails).
Fix this by making sure that T1 (the page-removing thread) always wins
this data race. In particular, the page-being-removed is marked as such,
and T2 retries until the page is fully removed.
Fixes: 9849bb27152c ("x86/sgx: Support complete page removal")
Cc: stable(a)vger.kernel.org
Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii(a)intel.com>
---
arch/x86/kernel/cpu/sgx/encl.c | 3 ++-
arch/x86/kernel/cpu/sgx/encl.h | 3 +++
arch/x86/kernel/cpu/sgx/ioctl.c | 1 +
3 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 41f14b1a3025..7ccd8b2fce5f 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -257,7 +257,8 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
/* Entry successfully located. */
if (entry->epc_page) {
- if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
+ if (entry->desc & (SGX_ENCL_PAGE_BEING_RECLAIMED |
+ SGX_ENCL_PAGE_BEING_REMOVED))
return ERR_PTR(-EBUSY);
return entry;
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index f94ff14c9486..fff5f2293ae7 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -25,6 +25,9 @@
/* 'desc' bit marking that the page is being reclaimed. */
#define SGX_ENCL_PAGE_BEING_RECLAIMED BIT(3)
+/* 'desc' bit marking that the page is being removed. */
+#define SGX_ENCL_PAGE_BEING_REMOVED BIT(2)
+
struct sgx_encl_page {
unsigned long desc;
unsigned long vm_max_prot_bits:8;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index b65ab214bdf5..c542d4dd3e64 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -1142,6 +1142,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
* Do not keep encl->lock because of dependency on
* mmap_lock acquired in sgx_zap_enclave_ptes().
*/
+ entry->desc |= SGX_ENCL_PAGE_BEING_REMOVED;
mutex_unlock(&encl->lock);
sgx_zap_enclave_ptes(encl, addr);
--
2.34.1
Two enclave threads may try to access the same non-present enclave page
simultaneously (e.g., if the SGX runtime supports lazy allocation). The
threads will end up in sgx_encl_eaug_page(), racing to acquire the
enclave lock. The winning thread will perform EAUG, set up the page
table entry, and insert the page into encl->page_array. The losing
thread will then get -EBUSY on xa_insert(&encl->page_array) and proceed
to error handling path.
This error handling path contains two bugs: (1) SIGBUS is sent to
userspace even though the enclave page is correctly installed by another
thread, and (2) sgx_encl_free_epc_page() is called that performs EREMOVE
even though the enclave page was never intended to be removed. The first
bug is less severe because it impacts only the user space; the second
bug is more severe because it also impacts the OS state by ripping the
page (added by the winning thread) from the enclave.
Fix these two bugs (1) by returning VM_FAULT_NOPAGE to the generic Linux
fault handler so that no signal is sent to userspace, and (2) by
replacing sgx_encl_free_epc_page() with sgx_free_epc_page() so that no
EREMOVE is performed.
Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave")
Cc: stable(a)vger.kernel.org
Reported-by: Marcelina Kościelnicka <mwk(a)invisiblethingslab.com>
Suggested-by: Reinette Chatre <reinette.chatre(a)intel.com>
Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii(a)intel.com>
---
arch/x86/kernel/cpu/sgx/encl.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 279148e72459..41f14b1a3025 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -382,8 +382,11 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
* If ret == -EBUSY then page was created in another flow while
* running without encl->lock
*/
- if (ret)
+ if (ret) {
+ if (ret == -EBUSY)
+ vmret = VM_FAULT_NOPAGE;
goto err_out_shrink;
+ }
pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
pginfo.addr = encl_page->desc & PAGE_MASK;
@@ -419,7 +422,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
err_out_shrink:
sgx_encl_shrink(encl, va_page);
err_out_epc:
- sgx_encl_free_epc_page(epc_page);
+ sgx_free_epc_page(epc_page);
err_out_unlock:
mutex_unlock(&encl->lock);
kfree(encl_page);
--
2.34.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 6c41468c7c12d74843bb414fc00307ea8a6318c3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023041134-curvature-campsite-e51b@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
6c41468c7c12 ("KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection")
d4963e319f1f ("KVM: x86: Make kvm_queued_exception a properly named, visible struct")
6ad75c5c99f7 ("KVM: x86: Rename kvm_x86_ops.queue_exception to inject_exception")
5623f751bd9c ("KVM: x86: Treat #DBs from the emulator as fault-like (code and DR7.GD=1)")
8d178f460772 ("KVM: nVMX: Treat General Detect #DB (DR7.GD=1) as fault-like")
eba9799b5a6e ("KVM: VMX: Drop bits 31:16 when shoving exception error code into VMCS")
a61d7c5432ac ("KVM: x86: Trace re-injected exceptions")
6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction")
3741aec4c38f ("KVM: SVM: Stuff next_rip on emulated INT3 injection if NRIPS is supported")
cd9e6da8048c ("KVM: SVM: Unwind "speculative" RIP advancement if INTn injection "fails"")
00f08d99dd7d ("KVM: nSVM: Sync next_rip field from vmcb12 to vmcb02")
9bd1f0efa859 ("KVM: nVMX: Clear IDT vectoring on nested VM-Exit for double/triple fault")
c3634d25fbee ("KVM: nVMX: Leave most VM-Exit info fields unmodified on failed VM-Entry")
1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
db663af4a001 ("kvm: x86: SVM: use vmcb* instead of svm->vmcb where it makes sense")
b9f3973ab3a8 ("KVM: x86: nSVM: implement nested VMLOAD/VMSAVE")
23e5092b6e2a ("KVM: SVM: Rename hook implementations to conform to kvm_x86_ops' names")
e27bc0440ebd ("KVM: x86: Rename kvm_x86_ops pointers to align w/ preferred vendor names")
068f7ea61895 ("KVM: SVM: improve split between svm_prepare_guest_switch and sev_es_prepare_guest_switch")
e1779c2714c3 ("KVM: x86: nSVM: fix potential NULL derefernce on nested migration")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6c41468c7c12d74843bb414fc00307ea8a6318c3 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Wed, 22 Mar 2023 07:32:59 -0700
Subject: [PATCH] KVM: x86: Clear "has_error_code", not "error_code", for RM
exception injection
When injecting an exception into a vCPU in Real Mode, suppress the error
code by clearing the flag that tracks whether the error code is valid, not
by clearing the error code itself. The "typo" was introduced by recent
fix for SVM's funky Paged Real Mode.
Opportunistically hoist the logic above the tracepoint so that the trace
is coherent with respect to what is actually injected (this was also the
behavior prior to the buggy commit).
Fixes: b97f07458373 ("KVM: x86: determine if an exception has an error code only when injecting it.")
Cc: stable(a)vger.kernel.org
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20230322143300.2209476-2-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 45017576ad5e..7d6f98b7635f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9908,13 +9908,20 @@ int kvm_check_nested_events(struct kvm_vcpu *vcpu)
static void kvm_inject_exception(struct kvm_vcpu *vcpu)
{
+ /*
+ * Suppress the error code if the vCPU is in Real Mode, as Real Mode
+ * exceptions don't report error codes. The presence of an error code
+ * is carried with the exception and only stripped when the exception
+ * is injected as intercepted #PF VM-Exits for AMD's Paged Real Mode do
+ * report an error code despite the CPU being in Real Mode.
+ */
+ vcpu->arch.exception.has_error_code &= is_protmode(vcpu);
+
trace_kvm_inj_exception(vcpu->arch.exception.vector,
vcpu->arch.exception.has_error_code,
vcpu->arch.exception.error_code,
vcpu->arch.exception.injected);
- if (vcpu->arch.exception.error_code && !is_protmode(vcpu))
- vcpu->arch.exception.error_code = false;
static_call(kvm_x86_inject_exception)(vcpu);
}
There is nothing preventing kernel memory allocators from allocating a
page that overlaps with PTR_ERR(), except for architecture-specific
code that setup memblock.
It was discovered that RISCV architecture doesn't setup memblock
corectly, leading to a page overlapping with PTR_ERR() being allocated,
and subsequently crashing the kernel (link in Close: )
The reported crash has nothing to do with PTR_ERR(): the last page
(at address 0xfffff000) being allocated leads to an unexpected
arithmetic overflow in ext4; but still, this page shouldn't be
allocated in the first place.
Because PTR_ERR() is an architecture-independent thing, we shouldn't
ask every single architecture to set this up. There may be other
architectures beside RISCV that have the same problem.
Fix this one and for all by reserving the physical memory page that
may be mapped to the last virtual memory page as part of low memory.
Unfortunately, this means if there is actual memory at this reserved
location, that memory will become inaccessible. However, if this page
is not reserved, it can only be accessed as high memory, so this
doesn't matter if high memory is not supported. Even if high memory is
supported, it is still only one page.
Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong…
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: <stable(a)vger.kernel.org> # all versions
---
init/main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/init/main.c b/init/main.c
index 881f6230ee59..f8d2793c4641 100644
--- a/init/main.c
+++ b/init/main.c
@@ -900,6 +900,7 @@ void start_kernel(void)
page_address_init();
pr_notice("%s", linux_banner);
early_security_init();
+ memblock_reserve(__pa(-PAGE_SIZE), PAGE_SIZE); /* reserve last page for ERR_PTR */
setup_arch(&command_line);
setup_boot_config();
setup_command_line(command_line);
--
2.39.2
Fuzzing reports a possible deadlock in jbd2_log_wait_commit.
The problem occurs in ext4_ind_migrate due to an incorrect order of
unlocking of the journal and write semaphores - the order of unlocking
must be the reverse of the order of locking.
Found by Linux Verification Center (linuxtesting.org) with syzkaller.
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Signed-off-by: Artem Sadovnikov <ancowi69(a)gmail.com>
Signed-off-by: Mikhail Ukhin <mish.uxin2012(a)yandex.ru>
---
v2: New addresses have been added and Ritesh Harjani has been noted as a
reviewer.
fs/ext4/migrate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index b0ea646454ac..59290356aa5b 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -663,8 +663,8 @@ int ext4_ind_migrate(struct inode *inode)
if (unlikely(ret2 && !ret))
ret = ret2;
errout:
- ext4_journal_stop(handle);
up_write(&EXT4_I(inode)->i_data_sem);
+ ext4_journal_stop(handle);
out_unlock:
percpu_up_write(&sbi->s_writepages_rwsem);
return ret;
--
2.25.1
Hi,
Greg and Sasha add the "X-stable: review" to their patch bombs, with the
intention that people will be able to filter these out should they
desire to do so. For example, I usually want all threads that match code
I care about, but I don't regularly want to see thousand-patch stable
series. So this header is helpful.
However, I'm not able to formulate a query for lore (to pass to `lei q`)
that will match on negating it. The idea would be to exclude the thread
if the parent has this header. It looks like public inbox might only
index on some headers, but can't generically search all? I'm not sure
how it works, but queries only seem to half way work when searching for
that header.
In the meantime, I've been using this ugly bash script, which gets the
job done, but means I have to download everything locally first:
#!/bin/bash
PWD="${BASH_SOURCE[0]}"
PWD="${PWD%/*}"
set -e
cd "$PWD"
echo "[+] Syncing new mail" >&2
lei up "$PWD"
echo "[+] Cleaning up stable patch bombs" >&2
mapfile -d $'\0' -t parents < <(grep -F -x -Z -r -l 'X-stable: review' cur tmp new)
{
[[ -f stable-message-ids ]] && cat stable-message-ids
[[ ${#parents[@]} -gt 0 ]] && sed -n 's/^Message-ID: <\(.*\)>$/\1/p' "${parents[@]}"
} | sort -u > stable-message-ids.new
mv stable-message-ids.new stable-message-ids
[[ -s stable-message-ids ]] || exit 0
mapfile -d $'\0' -t children < <(grep -F -Z -r -l -f - cur tmp new < stable-message-ids)
total=$(( ${#parents[@]} + ${#children[@]} ))
[[ $total -gt 0 ]] || exit 0
echo "# rm <...$total messages...>" >&2
rm -f "${parents[@]}" "${children[@]}"
This results in something like:
zx2c4@thinkpad ~/Projects/lkml $ ./update.bash
[+] Syncing new mail
# https://lore.kernel.org/all/ limiting ...
# /usr/bin/curl -gSf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=(...
[+] Cleaning up stable patch bombs
# rm <...24593 messages...>
It works, but it'd be nice to not even download these messages in the
first place. Since I'm deleting message I don't want, I have to keep
track of the message IDs of those deleted messages with the stable
header in case replies come in later. That's some book keeping, sheesh!
Any thoughts on this workflow?
Jason
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index b82423a72f685..b1107b424bf64 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -86,7 +86,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
There is an issue with ACPI overlay table removal specifically related
to I2C multiplexers.
Consider an ACPI SSDT Overlay that defines a PCA9548 I2C mux on an
existing I2C bus. When this table is loaded we see the creation of a
device for the overall PCA9548 chip and 8 further devices - one
i2c_adapter each for the mux channels. These are all bound to their
ACPI equivalents via an eventual invocation of acpi_bind_one().
When we unload the SSDT overlay we run into the problem. The ACPI
devices are deleted as normal via acpi_device_del_work_fn() and the
acpi_device_del_list.
However, the following warning and stack trace is output as the
deletion does not go smoothly:
------------[ cut here ]------------
kernfs: can not remove 'physical_node', no directory
WARNING: CPU: 1 PID: 11 at fs/kernfs/dir.c:1674 kernfs_remove_by_name_ns+0xb9/0xc0
Modules linked in:
CPU: 1 PID: 11 Comm: kworker/u128:0 Not tainted 6.8.0-rc6+ #1
Hardware name: congatec AG conga-B7E3/conga-B7E3, BIOS 5.13 05/16/2023
Workqueue: kacpi_hotplug acpi_device_del_work_fn
RIP: 0010:kernfs_remove_by_name_ns+0xb9/0xc0
Code: e4 00 48 89 ef e8 07 71 db ff 5b b8 fe ff ff ff 5d 41 5c 41 5d e9 a7 55 e4 00 0f 0b eb a6 48 c7 c7 f0 38 0d 9d e8 97 0a d5 ff <0f> 0b eb dc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0018:ffff9f864008fb28 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8ef90a8d4940 RCX: 0000000000000000
RDX: ffff8f000e267d10 RSI: ffff8f000e25c780 RDI: ffff8f000e25c780
RBP: ffff8ef9186f9870 R08: 0000000000013ffb R09: 00000000ffffbfff
R10: 00000000ffffbfff R11: ffff8f000e0a0000 R12: ffff9f864008fb50
R13: ffff8ef90c93dd60 R14: ffff8ef9010d0958 R15: ffff8ef9186f98c8
FS: 0000000000000000(0000) GS:ffff8f000e240000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f48f5253a08 CR3: 00000003cb82e000 CR4: 00000000003506f0
Call Trace:
<TASK>
? kernfs_remove_by_name_ns+0xb9/0xc0
? __warn+0x7c/0x130
? kernfs_remove_by_name_ns+0xb9/0xc0
? report_bug+0x171/0x1a0
? handle_bug+0x3c/0x70
? exc_invalid_op+0x17/0x70
? asm_exc_invalid_op+0x1a/0x20
? kernfs_remove_by_name_ns+0xb9/0xc0
? kernfs_remove_by_name_ns+0xb9/0xc0
acpi_unbind_one+0x108/0x180
device_del+0x18b/0x490
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
device_unregister+0xd/0x30
i2c_del_adapter.part.0+0x1bf/0x250
i2c_mux_del_adapters+0xa1/0xe0
i2c_device_remove+0x1e/0x80
device_release_driver_internal+0x19a/0x200
bus_remove_device+0xbf/0x100
device_del+0x157/0x490
? __pfx_device_match_fwnode+0x10/0x10
? srso_return_thunk+0x5/0x5f
device_unregister+0xd/0x30
i2c_acpi_notify+0x10f/0x140
notifier_call_chain+0x58/0xd0
blocking_notifier_call_chain+0x3a/0x60
acpi_device_del_work_fn+0x85/0x1d0
process_one_work+0x134/0x2f0
worker_thread+0x2f0/0x410
? __pfx_worker_thread+0x10/0x10
kthread+0xe3/0x110
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2f/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
---[ end trace 0000000000000000 ]---
...
repeated 7 more times, 1 for each channel of the mux
...
The issue is that the binding of the ACPI devices to their peer I2C
adapters is not correctly cleaned up. Digging deeper into the issue we
see that the deletion order is such that the ACPI devices matching the
mux channel i2c adapters are deleted first during the SSDT overlay
removal. For each of the channels we see a call to i2c_acpi_notify()
with ACPI_RECONFIG_DEVICE_REMOVE but, because these devices are not
actually i2c_clients, nothing is done for them.
Later on, after each of the mux channels has been dealt with, we come
to delete the i2c_client representing the PCA9548 device. This is the
call stack we see above, whereby the kernel cleans up the i2c_client
including destruction of the mux and its channel adapters. At this
point we do attempt to unbind from the ACPI peers but those peers no
longer exist and so we hit the kernfs errors.
The fix is to augment i2c_acpi_notify() to handle i2c_adapters. But,
given that the life cycle of the adapters is linked to the i2c_client,
instead of deleting the i2c_adapters during the i2c_acpi_notify(), we
just trigger unbinding of the ACPI device from the adapter device, and
allow the clean up of the adapter to continue in the way it always has.
Signed-off-by: Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)kernel.org>
Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications")
Cc: <stable(a)vger.kernel.org> # v4.8+
---
Notes:
v4:
Resolve Build failure noted by:
Linux Kernel Functional Testing <lkft(a)linaro.org>, and
kernel test robot <lkp(a)intel.com>
These failures led to revert of the v3 version of this patch that had been accepted earlier.
v3:
Add reviewed by tags (Mika Westerberg and Andi Shyti) and Fixes tag.
v2:
Moved long problem description from cover letter to commit description at Mika's suggestion
drivers/i2c/i2c-core-acpi.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/i2c/i2c-core-acpi.c b/drivers/i2c/i2c-core-acpi.c
index d6037a328669..14ae0cfc325e 100644
--- a/drivers/i2c/i2c-core-acpi.c
+++ b/drivers/i2c/i2c-core-acpi.c
@@ -445,6 +445,11 @@ static struct i2c_client *i2c_acpi_find_client_by_adev(struct acpi_device *adev)
return i2c_find_device_by_fwnode(acpi_fwnode_handle(adev));
}
+static struct i2c_adapter *i2c_acpi_find_adapter_by_adev(struct acpi_device *adev)
+{
+ return i2c_find_adapter_by_fwnode(acpi_fwnode_handle(adev));
+}
+
static int i2c_acpi_notify(struct notifier_block *nb, unsigned long value,
void *arg)
{
@@ -471,11 +476,17 @@ static int i2c_acpi_notify(struct notifier_block *nb, unsigned long value,
break;
client = i2c_acpi_find_client_by_adev(adev);
- if (!client)
- break;
+ if (client) {
+ i2c_unregister_device(client);
+ put_device(&client->dev);
+ }
+
+ adapter = i2c_acpi_find_adapter_by_adev(adev);
+ if (adapter) {
+ acpi_unbind_one(&adapter->dev);
+ put_device(&adapter->dev);
+ }
- i2c_unregister_device(client);
- put_device(&client->dev);
break;
}
--
2.43.2
Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
added defaults for i_uid/i_gid when set_ownership() is not implemented.
It missed to also adjust net_ctl_set_ownership() to use the same default
values in case the computation of a better value fails.
Instead always initialize i_uid/i_gid inside the sysfs core so
set_ownership() can safely skip setting them.
Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v3:
- Rebase onto v6.9-rc1
- Reword commit message and mention correct fixed commit
- Link to v2: https://lore.kernel.org/r/20240322-sysctl-net-ownership-v2-1-a8b4a3306542@w…
Changes in v2:
- Move the fallback logic to the sysctl core
- Link to v1: https://lore.kernel.org/r/20240315-sysctl-net-ownership-v1-1-2b465555a292@w…
---
fs/proc/proc_sysctl.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 37cde0efee57..9e34ab9c21e4 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -479,12 +479,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
make_empty_dir_inode(inode);
}
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
if (root->set_ownership)
root->set_ownership(head, table, &inode->i_uid, &inode->i_gid);
- else {
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- }
return inode;
}
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240315-sysctl-net-ownership-bc4e17eaeea6
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
From: Vitor Soares <vitor.soares(a)toradex.com>
When the mcp251xfd_start_xmit() function fails, the driver stops
processing messages, and the interrupt routine does not return,
running indefinitely even after killing the running application.
Error messages:
[ 441.298819] mcp251xfd spi2.0 can0: ERROR in mcp251xfd_start_xmit: -16
[ 441.306498] mcp251xfd spi2.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000017c7, tef_tail=0x000017cf, tef_head=0x000017d0, tx_head=0x000017d3).
... and repeat forever.
The issue can be triggered when multiple devices share the same
SPI interface. And there is concurrent access to the bus.
The problem occurs because tx_ring->head increments even if
mcp251xfd_start_xmit() fails. Consequently, the driver skips one
TX package while still expecting a response in
mcp251xfd_handle_tefif_one().
This patch resolves the issue by decreasing tx_ring->head if
mcp251xfd_start_xmit() fails. With the fix, if we trigger the issue and
the err = -EBUSY, the driver returns NETDEV_TX_BUSY. The network stack
retries to transmit the message.
Otherwise, it prints an error and discards the message.
Fixes: 55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vitor Soares <vitor.soares(a)toradex.com>
---
V2->V3:
- Add tx_dropped stats.
- netdev_sent_queue() only if can_put_echo_skb() succeed.
V1->V2:
- Return NETDEV_TX_BUSY if mcp251xfd_tx_obj_write() == -EBUSY.
- Rework the commit message to address the change above.
- Change can_put_echo_skb() to be called after mcp251xfd_tx_obj_write() succeed.
Otherwise, we get Kernel NULL pointer dereference error.
drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c | 34 ++++++++++++--------
1 file changed, 21 insertions(+), 13 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
index 160528d3cc26..146c44e47c60 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
@@ -166,6 +166,7 @@ netdev_tx_t mcp251xfd_start_xmit(struct sk_buff *skb,
struct net_device *ndev)
{
struct mcp251xfd_priv *priv = netdev_priv(ndev);
+ struct net_device_stats *stats = &ndev->stats;
struct mcp251xfd_tx_ring *tx_ring = priv->tx;
struct mcp251xfd_tx_obj *tx_obj;
unsigned int frame_len;
@@ -181,25 +182,32 @@ netdev_tx_t mcp251xfd_start_xmit(struct sk_buff *skb,
tx_obj = mcp251xfd_get_tx_obj_next(tx_ring);
mcp251xfd_tx_obj_from_skb(priv, tx_obj, skb, tx_ring->head);
- /* Stop queue if we occupy the complete TX FIFO */
tx_head = mcp251xfd_get_tx_head(tx_ring);
- tx_ring->head++;
- if (mcp251xfd_get_tx_free(tx_ring) == 0)
- netif_stop_queue(ndev);
-
frame_len = can_skb_get_frame_len(skb);
- err = can_put_echo_skb(skb, ndev, tx_head, frame_len);
- if (!err)
- netdev_sent_queue(priv->ndev, frame_len);
+
+ tx_ring->head++;
err = mcp251xfd_tx_obj_write(priv, tx_obj);
- if (err)
- goto out_err;
+ if (err) {
+ tx_ring->head--;
- return NETDEV_TX_OK;
+ if (err == -EBUSY)
+ return NETDEV_TX_BUSY;
- out_err:
- netdev_err(priv->ndev, "ERROR in %s: %d\n", __func__, err);
+ stats->tx_dropped++;
+
+ if (net_ratelimit())
+ netdev_err(priv->ndev,
+ "ERROR in %s: %d\n", __func__, err);
+ } else {
+ err = can_put_echo_skb(skb, ndev, tx_head, frame_len);
+ if (!err)
+ netdev_sent_queue(priv->ndev, frame_len);
+
+ /* Stop queue if we occupy the complete TX FIFO */
+ if (mcp251xfd_get_tx_free(tx_ring) == 0)
+ netif_stop_queue(ndev);
+ }
return NETDEV_TX_OK;
}
--
2.34.1
From: Bjorn Helgaas <bhelgaas(a)google.com>
Arul, Mateusz, Imcarneiro91, and Aman reported a regression caused by
07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map"). On the
Lenovo Legion 9i laptop, that commit removes the area containing ECAM from
E820, which means the early E820 validation started failing, which meant we
didn't enable ECAM in the "early MCFG" path
The lack of ECAM caused many ACPI methods to fail, resulting in the
embedded controller, PS/2, audio, trackpad, and battery devices not being
detected. The _OSC method also failed, so Linux could not take control of
the PCIe hotplug, PME, and AER features:
# pci_mmcfg_early_init()
PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0]
PCI: not using ECAM ([mem 0xc0000000-0xce0fffff] not reserved)
ACPI Error: AE_ERROR, Returned by Handler for [PCI_Config] (20230628/evregion-300)
ACPI: Interpreter enabled
ACPI: Ignoring error and continuing table load
ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162)
ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
ACPI: Skipping parse of AML opcode: OpcodeName unavailable (0x0010)
ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162)
ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
...
ACPI Error: Aborting method \_SB.PC00._OSC due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_NOT_FOUND)
# pci_mmcfg_late_init()
PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0]
PCI: [Firmware Info]: ECAM [mem 0xc0000000-0xce0fffff] not reserved in ACPI motherboard resources
PCI: ECAM [mem 0xc0000000-0xce0fffff] is EfiMemoryMappedIO; assuming valid
PCI: ECAM [mem 0xc0000000-0xce0fffff] reserved to work around lack of ACPI motherboard _CRS
Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be reserved by a PNP0C02
resource, but it need not be mentioned in E820, so we shouldn't look at
E820 to validate the ECAM space described by MCFG.
946f2ee5c731 ("[PATCH] i386/x86-64: Check that MCFG points to an e820
reserved area") added a sanity check of E820 to work around buggy MCFG
tables, but that over-aggressive validation causes failures like this one.
Keep the E820 validation check only for older BIOSes (pre-2016) so the
buggy 2006-era machines don't break. Skip the early E820 check for 2016
and newer BIOSes.
Fixes: 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map")
Reported-by: Mateusz Kaduk <mateusz.kaduk(a)gmail.com>
Reported-by: Arul <...>
Reported-by: Imcarneiro91 <...>
Reported-by: Aman <...>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218444
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Tested-by: Mateusz Kaduk <mateusz.kaduk(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/pci/mmconfig-shared.c | 35 +++++++++++++++++++++++++++-------
1 file changed, 28 insertions(+), 7 deletions(-)
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 0cc9520666ef..53c7afa606c3 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -518,7 +518,34 @@ static bool __ref pci_mmcfg_reserved(struct device *dev,
{
struct resource *conflict;
- if (!early && !acpi_disabled) {
+ if (early) {
+
+ /*
+ * Don't try to do this check unless configuration type 1
+ * is available. How about type 2?
+ */
+
+ /*
+ * 946f2ee5c731 ("Check that MCFG points to an e820
+ * reserved area") added this E820 check in 2006 to work
+ * around BIOS defects.
+ *
+ * Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be
+ * reserved by a PNP0C02 resource, but it need not be
+ * mentioned in E820. Before the ACPI interpreter is
+ * available, we can't check for PNP0C02 resources, so
+ * there's no reliable way to verify the region in this
+ * early check. Keep it only for the old machines that
+ * motivated 946f2ee5c731.
+ */
+ if (dmi_get_bios_year() < 2016 && raw_pci_ops)
+ return is_mmconf_reserved(e820__mapped_all, cfg, dev,
+ "E820 entry");
+
+ return true;
+ }
+
+ if (!acpi_disabled) {
if (is_mmconf_reserved(is_acpi_reserved, cfg, dev,
"ACPI motherboard resource"))
return true;
@@ -554,12 +581,6 @@ static bool __ref pci_mmcfg_reserved(struct device *dev,
if (pci_mmcfg_running_state)
return true;
- /* Don't try to do this check unless configuration
- type 1 is available. how about type 2 ?*/
- if (raw_pci_ops)
- return is_mmconf_reserved(e820__mapped_all, cfg, dev,
- "E820 entry");
-
return false;
}
--
2.34.1
__split_huge_pmd_locked() can be called for a present THP, devmap or
(non-present) migration entry. It calls pmdp_invalidate()
unconditionally on the pmdp and only determines if it is present or not
based on the returned old pmd.
But arm64's pmd_mkinvalid(), called by pmdp_invalidate(),
unconditionally sets the PMD_PRESENT_INVALID flag, which causes future
pmd_present() calls to return true - even for a swap pmd. Therefore any
lockless pgtable walker could see the migration entry pmd in this state
and start interpretting the fields (e.g. pmd_pfn()) as if it were
present, leading to BadThings (TM). GUP-fast appears to be one such
lockless pgtable walker.
While the obvious fix is for core-mm to avoid such calls for non-present
pmds (pmdp_invalidate() will also issue TLBI which is not necessary for
this case either), all other arches that implement pmd_mkinvalid() do it
in such a way that it is robust to being called with a non-present pmd.
So it is simpler and safer to make arm64 robust too. This approach means
we can even add tests to debug_vm_pgtable.c to validate the required
behaviour.
This is a theoretical bug found during code review. I don't have any
test case to trigger it in practice.
Cc: stable(a)vger.kernel.org
Fixes: 53fa117bb33c ("arm64/mm: Enable THP migration")
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
Hi all,
v1 of this fix [1] took the approach of fixing core-mm to never call
pmdp_invalidate() on a non-present pmd. But Zi Yan highlighted that only arm64
suffers this problem; all other arches are robust. So his suggestion was to
instead make arm64 robust in the same way and add tests to validate it. Despite
my stated reservations in the context of the v1 discussion, having thought on it
for a bit, I now agree with Zi Yan. Hence this post.
Andrew has v1 in mm-unstable at the moment, so probably the best thing to do is
remove it from there and have this go in through the arm64 tree? Assuming there
is agreement that this approach is right one.
This applies on top of v6.9-rc5. Passes all the mm selftests on arm64.
[1] https://lore.kernel.org/linux-mm/20240425170704.3379492-1-ryan.roberts@arm.…
Thanks,
Ryan
arch/arm64/include/asm/pgtable.h | 12 +++++--
mm/debug_vm_pgtable.c | 61 ++++++++++++++++++++++++++++++++
2 files changed, 71 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index afdd56d26ad7..7d580271a46d 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -511,8 +511,16 @@ static inline int pmd_trans_huge(pmd_t pmd)
static inline pmd_t pmd_mkinvalid(pmd_t pmd)
{
- pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
- pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
+ /*
+ * If not valid then either we are already present-invalid or we are
+ * not-present (i.e. none or swap entry). We must not convert
+ * not-present to present-invalid. Unbelievably, the core-mm may call
+ * pmd_mkinvalid() for a swap entry and all other arches can handle it.
+ */
+ if (pmd_valid(pmd)) {
+ pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
+ pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
+ }
return pmd;
}
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 65c19025da3d..7e9c387d06b0 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -956,6 +956,65 @@ static void __init hugetlb_basic_tests(struct pgtable_debug_args *args) { }
#endif /* CONFIG_HUGETLB_PAGE */
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#if !defined(__HAVE_ARCH_PMDP_INVALIDATE) && defined(CONFIG_ARCH_ENABLE_THP_MIGRATION)
+static void __init swp_pmd_mkinvalid_tests(struct pgtable_debug_args *args)
+{
+ unsigned long max_swap_offset;
+ swp_entry_t swp_set, swp_clear, swp_convert;
+ pmd_t pmd_set, pmd_clear;
+
+ /*
+ * See generic_max_swapfile_size(): probe the maximum offset, then
+ * create swap entry will all possible bits set and a swap entry will
+ * all bits clear.
+ */
+ max_swap_offset = swp_offset(pmd_to_swp_entry(swp_entry_to_pmd(swp_entry(0, ~0UL))));
+ swp_set = swp_entry((1 << MAX_SWAPFILES_SHIFT) - 1, max_swap_offset);
+ swp_clear = swp_entry(0, 0);
+
+ /* Convert to pmd. */
+ pmd_set = swp_entry_to_pmd(swp_set);
+ pmd_clear = swp_entry_to_pmd(swp_clear);
+
+ /*
+ * Sanity check that the pmds are not-present, not-huge and swap entry
+ * is recoverable without corruption.
+ */
+ WARN_ON(pmd_present(pmd_set));
+ WARN_ON(pmd_trans_huge(pmd_set));
+ swp_convert = pmd_to_swp_entry(pmd_set);
+ WARN_ON(swp_type(swp_set) != swp_type(swp_convert));
+ WARN_ON(swp_offset(swp_set) != swp_offset(swp_convert));
+ WARN_ON(pmd_present(pmd_clear));
+ WARN_ON(pmd_trans_huge(pmd_clear));
+ swp_convert = pmd_to_swp_entry(pmd_clear);
+ WARN_ON(swp_type(swp_clear) != swp_type(swp_convert));
+ WARN_ON(swp_offset(swp_clear) != swp_offset(swp_convert));
+
+ /* Now invalidate the pmd. */
+ pmd_set = pmd_mkinvalid(pmd_set);
+ pmd_clear = pmd_mkinvalid(pmd_clear);
+
+ /*
+ * Since its a swap pmd, invalidation should effectively be a noop and
+ * the checks we already did should give the same answer. Check the
+ * invalidation didn't corrupt any fields.
+ */
+ WARN_ON(pmd_present(pmd_set));
+ WARN_ON(pmd_trans_huge(pmd_set));
+ swp_convert = pmd_to_swp_entry(pmd_set);
+ WARN_ON(swp_type(swp_set) != swp_type(swp_convert));
+ WARN_ON(swp_offset(swp_set) != swp_offset(swp_convert));
+ WARN_ON(pmd_present(pmd_clear));
+ WARN_ON(pmd_trans_huge(pmd_clear));
+ swp_convert = pmd_to_swp_entry(pmd_clear);
+ WARN_ON(swp_type(swp_clear) != swp_type(swp_convert));
+ WARN_ON(swp_offset(swp_clear) != swp_offset(swp_convert));
+}
+#else
+static void __init swp_pmd_mkinvalid_tests(struct pgtable_debug_args *args) { }
+#endif /* !__HAVE_ARCH_PMDP_INVALIDATE && CONFIG_ARCH_ENABLE_THP_MIGRATION */
+
static void __init pmd_thp_tests(struct pgtable_debug_args *args)
{
pmd_t pmd;
@@ -982,6 +1041,8 @@ static void __init pmd_thp_tests(struct pgtable_debug_args *args)
WARN_ON(!pmd_trans_huge(pmd_mkinvalid(pmd_mkhuge(pmd))));
WARN_ON(!pmd_present(pmd_mkinvalid(pmd_mkhuge(pmd))));
#endif /* __HAVE_ARCH_PMDP_INVALIDATE */
+
+ swp_pmd_mkinvalid_tests(args);
}
#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
--
2.25.1
Change 'sent' to 'send'
Signed-off-by: Tim Bird <tim.bird(a)sony.com>
---
Documentation/process/stable-kernel-rules.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/process/stable-kernel-rules.rst b/Documentation/process/stable-kernel-rules.rst
index 1704f1c686d0..3178bef6fca3 100644
--- a/Documentation/process/stable-kernel-rules.rst
+++ b/Documentation/process/stable-kernel-rules.rst
@@ -78,7 +78,7 @@ in the sign-off area. Once the patch is mainlined it will be applied to the
stable tree without anything else needing to be done by the author or
subsystem maintainer.
-To sent additional instructions to the stable team, use a shell-style inline
+To send additional instructions to the stable team, use a shell-style inline
comment:
* To specify any additional patch prerequisites for cherry picking use the
--
2.25.1
After a recent discussion regarding "do we need a 'nobackport' tag" I
set out to create one change for stable-kernel-rules.rst. This is now
the last patch in the series, which links to that discussion with
all the details; the other stuff is fine-tuning that happened along the
way.
Ciao, Thorsten
---
v1->v2:
* Add reviewed-by tag from Greg to the first patch.
* Change the backport example in 2 as suggested by Greg.
* Improve description of patch 3 while also making the change remove a
level of indenting.
* Add patch explaining stable(a)kernel.org (w/o @vger.)
* Move the patch adding a 'make AUTOSEL et. al. ignore a change' flag to
the end of the series and use stable+noautosel(a)kernel.org as
suggested my Konstantin and ACKed by Greg.
v1: https://lore.kernel.org/all/cover.1712812895.git.linux@leemhuis.info/
Thorsten Leemhuis (5):
docs: stable-kernel-rules: reduce redundancy
docs: stable-kernel-rules: call mainline by its name and change
example
docs: stable-kernel-rules: remove code-labels tags and a indention
level
docs: stable-kernel-rules: explain use of stable(a)kernel.org (w/o
@vger.)
docs: stable-kernel-rules: create special tag to flag 'no backporting'
Documentation/process/stable-kernel-rules.rst | 234 ++++++++----------
1 file changed, 110 insertions(+), 124 deletions(-)
base-commit: 5eb4573ea63d0c83bf58fb7c243fc2c2b6966c02
--
2.44.0
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
When using a high speed clock with a low baud rate, the 4x prescaler is
automatically selected if required. In that case, sc16is7xx_set_baud()
properly configures the chip registers, but returns an incorrect baud
rate by not taking into account the prescaler value. This incorrect baud
rate is then fed to uart_update_timeout().
For example, with an input clock of 80MHz, and a selected baud rate of 50,
sc16is7xx_set_baud() will return 200 instead of 50.
Fix this by first changing the prescaler variable to hold the selected
prescaler value instead of the MCR bitfield. Then properly take into
account the selected prescaler value in the return value computation.
Also add better documentation about the divisor value computation.
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
Changes for V2:
- Change prescaler type to "unsigned int" (Jiri S.)
---
drivers/tty/serial/sc16is7xx.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 03cf30e20b75..bf0065d1c8e9 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -555,16 +555,28 @@ static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg)
return reg == SC16IS7XX_RHR_REG;
}
+/*
+ * Configure programmable baud rate generator (divisor) according to the
+ * desired baud rate.
+ *
+ * From the datasheet, the divisor is computed according to:
+ *
+ * XTAL1 input frequency
+ * -----------------------
+ * prescaler
+ * divisor = ---------------------------
+ * baud-rate x sampling-rate
+ */
static int sc16is7xx_set_baud(struct uart_port *port, int baud)
{
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
u8 lcr;
- u8 prescaler = 0;
+ unsigned int prescaler = 1;
unsigned long clk = port->uartclk, div = clk / 16 / baud;
if (div >= BIT(16)) {
- prescaler = SC16IS7XX_MCR_CLKSEL_BIT;
- div /= 4;
+ prescaler = 4;
+ div /= prescaler;
}
/* Enable enhanced features */
@@ -574,9 +586,10 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
SC16IS7XX_EFR_ENABLE_BIT);
sc16is7xx_efr_unlock(port);
+ /* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
- prescaler);
+ prescaler == 1 ? 0 : SC16IS7XX_MCR_CLKSEL_BIT);
/* Backup LCR and access special register set (DLL/DLH) */
lcr = sc16is7xx_port_read(port, SC16IS7XX_LCR_REG);
@@ -592,7 +605,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
/* Restore LCR and access to general register set */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr);
- return DIV_ROUND_CLOSEST(clk / 16, div);
+ return DIV_ROUND_CLOSEST((clk / prescaler) / 16, div);
}
static void sc16is7xx_handle_rx(struct uart_port *port, unsigned int rxlen,
base-commit: 660a708098569a66a47d0abdad998e29e1259de6
--
2.39.2
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 7044dcff8301b29269016ebd17df27c4736140d2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042939-ended-heavily-2e5c@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
7044dcff8301 ("rust: macros: fix soundness issue in `module!` macro")
1b6170ff7a20 ("rust: module: place generated init_module() function in .init.text")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7044dcff8301b29269016ebd17df27c4736140d2 Mon Sep 17 00:00:00 2001
From: Benno Lossin <benno.lossin(a)proton.me>
Date: Mon, 1 Apr 2024 18:52:50 +0000
Subject: [PATCH] rust: macros: fix soundness issue in `module!` macro
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The `module!` macro creates glue code that are called by C to initialize
the Rust modules using the `Module::init` function. Part of this glue
code are the local functions `__init` and `__exit` that are used to
initialize/destroy the Rust module.
These functions are safe and also visible to the Rust mod in which the
`module!` macro is invoked. This means that they can be called by other
safe Rust code. But since they contain `unsafe` blocks that rely on only
being called at the right time, this is a soundness issue.
Wrap these generated functions inside of two private modules, this
guarantees that the public functions cannot be called from the outside.
Make the safe functions `unsafe` and add SAFETY comments.
Cc: stable(a)vger.kernel.org
Reported-by: Björn Roy Baron <bjorn3_gh(a)protonmail.com>
Closes: https://github.com/Rust-for-Linux/linux/issues/629
Fixes: 1fbde52bde73 ("rust: add `macros` crate")
Signed-off-by: Benno Lossin <benno.lossin(a)proton.me>
Reviewed-by: Wedson Almeida Filho <walmeida(a)microsoft.com>
Link: https://lore.kernel.org/r/20240401185222.12015-1-benno.lossin@proton.me
[ Moved `THIS_MODULE` out of the private-in-private modules since it
should remain public, as Dirk Behme noticed [1]. Capitalized comments,
avoided newline in non-list SAFETY comments and reworded to add
Reported-by and newline. ]
Link: https://rust-for-linux.zulipchat.com/#narrow/stream/291565-Help/topic/x/nea… [1]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
diff --git a/rust/macros/module.rs b/rust/macros/module.rs
index 27979e582e4b..acd0393b5095 100644
--- a/rust/macros/module.rs
+++ b/rust/macros/module.rs
@@ -199,17 +199,6 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
/// Used by the printing macros, e.g. [`info!`].
const __LOG_PREFIX: &[u8] = b\"{name}\\0\";
- /// The \"Rust loadable module\" mark.
- //
- // This may be best done another way later on, e.g. as a new modinfo
- // key or a new section. For the moment, keep it simple.
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[used]
- static __IS_RUST_MODULE: () = ();
-
- static mut __MOD: Option<{type_}> = None;
-
// SAFETY: `__this_module` is constructed by the kernel at load time and will not be
// freed until the module is unloaded.
#[cfg(MODULE)]
@@ -221,81 +210,132 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
kernel::ThisModule::from_ptr(core::ptr::null_mut())
}};
- // Loadable modules need to export the `{{init,cleanup}}_module` identifiers.
- /// # Safety
- ///
- /// This function must not be called after module initialization, because it may be
- /// freed after that completes.
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[no_mangle]
- #[link_section = \".init.text\"]
- pub unsafe extern \"C\" fn init_module() -> core::ffi::c_int {{
- __init()
- }}
+ // Double nested modules, since then nobody can access the public items inside.
+ mod __module_init {{
+ mod __module_init {{
+ use super::super::{type_};
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn cleanup_module() {{
- __exit()
- }}
+ /// The \"Rust loadable module\" mark.
+ //
+ // This may be best done another way later on, e.g. as a new modinfo
+ // key or a new section. For the moment, keep it simple.
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[used]
+ static __IS_RUST_MODULE: () = ();
- // Built-in modules are initialized through an initcall pointer
- // and the identifiers need to be unique.
- #[cfg(not(MODULE))]
- #[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
- #[doc(hidden)]
- #[link_section = \"{initcall_section}\"]
- #[used]
- pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
+ static mut __MOD: Option<{type_}> = None;
- #[cfg(not(MODULE))]
- #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
- core::arch::global_asm!(
- r#\".section \"{initcall_section}\", \"a\"
- __{name}_initcall:
- .long __{name}_init - .
- .previous
- \"#
- );
+ // Loadable modules need to export the `{{init,cleanup}}_module` identifiers.
+ /// # Safety
+ ///
+ /// This function must not be called after module initialization, because it may be
+ /// freed after that completes.
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[no_mangle]
+ #[link_section = \".init.text\"]
+ pub unsafe extern \"C\" fn init_module() -> core::ffi::c_int {{
+ // SAFETY: This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name.
+ unsafe {{ __init() }}
+ }}
- #[cfg(not(MODULE))]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
- __init()
- }}
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn cleanup_module() {{
+ // SAFETY:
+ // - This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name,
+ // - furthermore it is only called after `init_module` has returned `0`
+ // (which delegates to `__init`).
+ unsafe {{ __exit() }}
+ }}
- #[cfg(not(MODULE))]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn __{name}_exit() {{
- __exit()
- }}
+ // Built-in modules are initialized through an initcall pointer
+ // and the identifiers need to be unique.
+ #[cfg(not(MODULE))]
+ #[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
+ #[doc(hidden)]
+ #[link_section = \"{initcall_section}\"]
+ #[used]
+ pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
- fn __init() -> core::ffi::c_int {{
- match <{type_} as kernel::Module>::init(&THIS_MODULE) {{
- Ok(m) => {{
- unsafe {{
- __MOD = Some(m);
+ #[cfg(not(MODULE))]
+ #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
+ core::arch::global_asm!(
+ r#\".section \"{initcall_section}\", \"a\"
+ __{name}_initcall:
+ .long __{name}_init - .
+ .previous
+ \"#
+ );
+
+ #[cfg(not(MODULE))]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
+ // SAFETY: This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // placement above in the initcall section.
+ unsafe {{ __init() }}
+ }}
+
+ #[cfg(not(MODULE))]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn __{name}_exit() {{
+ // SAFETY:
+ // - This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name,
+ // - furthermore it is only called after `__{name}_init` has returned `0`
+ // (which delegates to `__init`).
+ unsafe {{ __exit() }}
+ }}
+
+ /// # Safety
+ ///
+ /// This function must only be called once.
+ unsafe fn __init() -> core::ffi::c_int {{
+ match <{type_} as kernel::Module>::init(&super::super::THIS_MODULE) {{
+ Ok(m) => {{
+ // SAFETY: No data race, since `__MOD` can only be accessed by this
+ // module and there only `__init` and `__exit` access it. These
+ // functions are only called once and `__exit` cannot be called
+ // before or during `__init`.
+ unsafe {{
+ __MOD = Some(m);
+ }}
+ return 0;
+ }}
+ Err(e) => {{
+ return e.to_errno();
+ }}
}}
- return 0;
}}
- Err(e) => {{
- return e.to_errno();
+
+ /// # Safety
+ ///
+ /// This function must
+ /// - only be called once,
+ /// - be called after `__init` has been called and returned `0`.
+ unsafe fn __exit() {{
+ // SAFETY: No data race, since `__MOD` can only be accessed by this module
+ // and there only `__init` and `__exit` access it. These functions are only
+ // called once and `__init` was already called.
+ unsafe {{
+ // Invokes `drop()` on `__MOD`, which should be used for cleanup.
+ __MOD = None;
+ }}
}}
+
+ {modinfo}
}}
}}
-
- fn __exit() {{
- unsafe {{
- // Invokes `drop()` on `__MOD`, which should be used for cleanup.
- __MOD = None;
- }}
- }}
-
- {modinfo}
",
type_ = info.type_,
name = info.name,
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x 7044dcff8301b29269016ebd17df27c4736140d2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042938-computing-synthetic-f9fe@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
7044dcff8301 ("rust: macros: fix soundness issue in `module!` macro")
1b6170ff7a20 ("rust: module: place generated init_module() function in .init.text")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7044dcff8301b29269016ebd17df27c4736140d2 Mon Sep 17 00:00:00 2001
From: Benno Lossin <benno.lossin(a)proton.me>
Date: Mon, 1 Apr 2024 18:52:50 +0000
Subject: [PATCH] rust: macros: fix soundness issue in `module!` macro
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The `module!` macro creates glue code that are called by C to initialize
the Rust modules using the `Module::init` function. Part of this glue
code are the local functions `__init` and `__exit` that are used to
initialize/destroy the Rust module.
These functions are safe and also visible to the Rust mod in which the
`module!` macro is invoked. This means that they can be called by other
safe Rust code. But since they contain `unsafe` blocks that rely on only
being called at the right time, this is a soundness issue.
Wrap these generated functions inside of two private modules, this
guarantees that the public functions cannot be called from the outside.
Make the safe functions `unsafe` and add SAFETY comments.
Cc: stable(a)vger.kernel.org
Reported-by: Björn Roy Baron <bjorn3_gh(a)protonmail.com>
Closes: https://github.com/Rust-for-Linux/linux/issues/629
Fixes: 1fbde52bde73 ("rust: add `macros` crate")
Signed-off-by: Benno Lossin <benno.lossin(a)proton.me>
Reviewed-by: Wedson Almeida Filho <walmeida(a)microsoft.com>
Link: https://lore.kernel.org/r/20240401185222.12015-1-benno.lossin@proton.me
[ Moved `THIS_MODULE` out of the private-in-private modules since it
should remain public, as Dirk Behme noticed [1]. Capitalized comments,
avoided newline in non-list SAFETY comments and reworded to add
Reported-by and newline. ]
Link: https://rust-for-linux.zulipchat.com/#narrow/stream/291565-Help/topic/x/nea… [1]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
diff --git a/rust/macros/module.rs b/rust/macros/module.rs
index 27979e582e4b..acd0393b5095 100644
--- a/rust/macros/module.rs
+++ b/rust/macros/module.rs
@@ -199,17 +199,6 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
/// Used by the printing macros, e.g. [`info!`].
const __LOG_PREFIX: &[u8] = b\"{name}\\0\";
- /// The \"Rust loadable module\" mark.
- //
- // This may be best done another way later on, e.g. as a new modinfo
- // key or a new section. For the moment, keep it simple.
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[used]
- static __IS_RUST_MODULE: () = ();
-
- static mut __MOD: Option<{type_}> = None;
-
// SAFETY: `__this_module` is constructed by the kernel at load time and will not be
// freed until the module is unloaded.
#[cfg(MODULE)]
@@ -221,81 +210,132 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
kernel::ThisModule::from_ptr(core::ptr::null_mut())
}};
- // Loadable modules need to export the `{{init,cleanup}}_module` identifiers.
- /// # Safety
- ///
- /// This function must not be called after module initialization, because it may be
- /// freed after that completes.
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[no_mangle]
- #[link_section = \".init.text\"]
- pub unsafe extern \"C\" fn init_module() -> core::ffi::c_int {{
- __init()
- }}
+ // Double nested modules, since then nobody can access the public items inside.
+ mod __module_init {{
+ mod __module_init {{
+ use super::super::{type_};
- #[cfg(MODULE)]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn cleanup_module() {{
- __exit()
- }}
+ /// The \"Rust loadable module\" mark.
+ //
+ // This may be best done another way later on, e.g. as a new modinfo
+ // key or a new section. For the moment, keep it simple.
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[used]
+ static __IS_RUST_MODULE: () = ();
- // Built-in modules are initialized through an initcall pointer
- // and the identifiers need to be unique.
- #[cfg(not(MODULE))]
- #[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
- #[doc(hidden)]
- #[link_section = \"{initcall_section}\"]
- #[used]
- pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
+ static mut __MOD: Option<{type_}> = None;
- #[cfg(not(MODULE))]
- #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
- core::arch::global_asm!(
- r#\".section \"{initcall_section}\", \"a\"
- __{name}_initcall:
- .long __{name}_init - .
- .previous
- \"#
- );
+ // Loadable modules need to export the `{{init,cleanup}}_module` identifiers.
+ /// # Safety
+ ///
+ /// This function must not be called after module initialization, because it may be
+ /// freed after that completes.
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[no_mangle]
+ #[link_section = \".init.text\"]
+ pub unsafe extern \"C\" fn init_module() -> core::ffi::c_int {{
+ // SAFETY: This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name.
+ unsafe {{ __init() }}
+ }}
- #[cfg(not(MODULE))]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
- __init()
- }}
+ #[cfg(MODULE)]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn cleanup_module() {{
+ // SAFETY:
+ // - This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name,
+ // - furthermore it is only called after `init_module` has returned `0`
+ // (which delegates to `__init`).
+ unsafe {{ __exit() }}
+ }}
- #[cfg(not(MODULE))]
- #[doc(hidden)]
- #[no_mangle]
- pub extern \"C\" fn __{name}_exit() {{
- __exit()
- }}
+ // Built-in modules are initialized through an initcall pointer
+ // and the identifiers need to be unique.
+ #[cfg(not(MODULE))]
+ #[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
+ #[doc(hidden)]
+ #[link_section = \"{initcall_section}\"]
+ #[used]
+ pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
- fn __init() -> core::ffi::c_int {{
- match <{type_} as kernel::Module>::init(&THIS_MODULE) {{
- Ok(m) => {{
- unsafe {{
- __MOD = Some(m);
+ #[cfg(not(MODULE))]
+ #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
+ core::arch::global_asm!(
+ r#\".section \"{initcall_section}\", \"a\"
+ __{name}_initcall:
+ .long __{name}_init - .
+ .previous
+ \"#
+ );
+
+ #[cfg(not(MODULE))]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
+ // SAFETY: This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // placement above in the initcall section.
+ unsafe {{ __init() }}
+ }}
+
+ #[cfg(not(MODULE))]
+ #[doc(hidden)]
+ #[no_mangle]
+ pub extern \"C\" fn __{name}_exit() {{
+ // SAFETY:
+ // - This function is inaccessible to the outside due to the double
+ // module wrapping it. It is called exactly once by the C side via its
+ // unique name,
+ // - furthermore it is only called after `__{name}_init` has returned `0`
+ // (which delegates to `__init`).
+ unsafe {{ __exit() }}
+ }}
+
+ /// # Safety
+ ///
+ /// This function must only be called once.
+ unsafe fn __init() -> core::ffi::c_int {{
+ match <{type_} as kernel::Module>::init(&super::super::THIS_MODULE) {{
+ Ok(m) => {{
+ // SAFETY: No data race, since `__MOD` can only be accessed by this
+ // module and there only `__init` and `__exit` access it. These
+ // functions are only called once and `__exit` cannot be called
+ // before or during `__init`.
+ unsafe {{
+ __MOD = Some(m);
+ }}
+ return 0;
+ }}
+ Err(e) => {{
+ return e.to_errno();
+ }}
}}
- return 0;
}}
- Err(e) => {{
- return e.to_errno();
+
+ /// # Safety
+ ///
+ /// This function must
+ /// - only be called once,
+ /// - be called after `__init` has been called and returned `0`.
+ unsafe fn __exit() {{
+ // SAFETY: No data race, since `__MOD` can only be accessed by this module
+ // and there only `__init` and `__exit` access it. These functions are only
+ // called once and `__init` was already called.
+ unsafe {{
+ // Invokes `drop()` on `__MOD`, which should be used for cleanup.
+ __MOD = None;
+ }}
}}
+
+ {modinfo}
}}
}}
-
- fn __exit() {{
- unsafe {{
- // Invokes `drop()` on `__MOD`, which should be used for cleanup.
- __MOD = None;
- }}
- }}
-
- {modinfo}
",
type_ = info.type_,
name = info.name,
From: Vitaly Lifshits <vitaly.lifshits(a)intel.com>
This is a partial revert of commit 6dbdd4de0362 ("e1000e: Workaround
for sporadic MDI error on Meteor Lake systems"). The referenced commit
used usleep_range inside the PHY access routines, which are sometimes
called from an atomic context. This can lead to a kernel panic in some
scenarios, such as cable disconnection and reconnection on vPro systems.
Solve this by changing the usleep_range calls back to udelay.
Fixes: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
Cc: stable(a)vger.kernel.org
Reported-by: Jérôme Carretero <cJ(a)zougloub.eu>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218740
Closes: https://lore.kernel.org/lkml/a7eb665c74b5efb5140e6979759ed243072cb24a.camel…
Co-developed-by: Sasha Neftin <sasha.neftin(a)intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin(a)intel.com>
Signed-off-by: Vitaly Lifshits <vitaly.lifshits(a)intel.com>
Tested-by: Dima Ruinskiy <dima.ruinskiy(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
---
drivers/net/ethernet/intel/e1000e/phy.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/e1000e/phy.c b/drivers/net/ethernet/intel/e1000e/phy.c
index 93544f1cc2a5..f7ae0e0aa4a4 100644
--- a/drivers/net/ethernet/intel/e1000e/phy.c
+++ b/drivers/net/ethernet/intel/e1000e/phy.c
@@ -157,7 +157,7 @@ s32 e1000e_read_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 *data)
* the lower time out
*/
for (i = 0; i < (E1000_GEN_POLL_TIMEOUT * 3); i++) {
- usleep_range(50, 60);
+ udelay(50);
mdic = er32(MDIC);
if (mdic & E1000_MDIC_READY)
break;
@@ -181,7 +181,7 @@ s32 e1000e_read_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 *data)
* reading duplicate data in the next MDIC transaction.
*/
if (hw->mac.type == e1000_pch2lan)
- usleep_range(100, 150);
+ udelay(100);
if (success) {
*data = (u16)mdic;
@@ -237,7 +237,7 @@ s32 e1000e_write_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 data)
* the lower time out
*/
for (i = 0; i < (E1000_GEN_POLL_TIMEOUT * 3); i++) {
- usleep_range(50, 60);
+ udelay(50);
mdic = er32(MDIC);
if (mdic & E1000_MDIC_READY)
break;
@@ -261,7 +261,7 @@ s32 e1000e_write_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 data)
* reading duplicate data in the next MDIC transaction.
*/
if (hw->mac.type == e1000_pch2lan)
- usleep_range(100, 150);
+ udelay(100);
if (success)
return 0;
--
2.41.0
To be secure, when a userptr is invalidated the pages should be dma
unmapped ensuring the device can no longer touch the invalidated pages.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Fixes: 12f4b58a37f4 ("drm/xe: Use hmm_range_fault to populate user pages")
Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # 6.8
Signed-off-by: Matthew Brost <matthew.brost(a)intel.com>
---
drivers/gpu/drm/xe/xe_vm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index dfd31b346021..964a5b4d47d8 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -637,6 +637,9 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
XE_WARN_ON(err);
}
+ if (userptr->sg)
+ xe_hmm_userptr_free_sg(uvma);
+
trace_xe_vma_userptr_invalidate_complete(vma);
return true;
--
2.34.1
The arm64 crypto drivers duplicate driver names when adding simd
variants, which after backported commit 27016f75f5ed ("crypto: api -
Disallow identical driver names"), causes an error that leads to the
aes algs not being installed. On weaker processors this results in hangs
due to falling back to SW crypto.
Use simd_skcipher_create() as it will properly namespace the new algs.
This issue does not exist in mainline/latest (and stable v6.1+) as the
driver has been refactored to remove the simd algs from this code path.
Fixes: 27016f75f5ed ("crypto: api - Disallow identical driver names")
Cc: Herbert Xu <herbert(a)gondor.apana.org.au>
Signed-off-by: Liam Kearney <liam.kearney(a)canonical.com>
---
arch/arm64/crypto/aes-glue.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index aa13344a3a5e..af862e52a36b 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -1028,7 +1028,6 @@ static int __init aes_init(void)
struct simd_skcipher_alg *simd;
const char *basename;
const char *algname;
- const char *drvname;
int err;
int i;
@@ -1045,9 +1044,8 @@ static int __init aes_init(void)
continue;
algname = aes_algs[i].base.cra_name + 2;
- drvname = aes_algs[i].base.cra_driver_name + 2;
basename = aes_algs[i].base.cra_driver_name;
- simd = simd_skcipher_create_compat(algname, drvname, basename);
+ simd = simd_skcipher_create(algname, basename);
err = PTR_ERR(simd);
if (IS_ERR(simd))
goto unregister_simds;
--
2.40.1
Currently all controller IP/revisions except DWC3_usb3 >= 310a
wait 1ms unconditionally for ENDXFER completion when IOC is not
set. This is because DWC_usb3 controller revisions >= 3.10a
supports GUCTL2[14: Rst_actbitlater] bit which allows polling
CMDACT bit to know whether ENDXFER command is completed.
Consider a case where an IN request was queued, and parallelly
soft_disconnect was called (due to ffs_epfile_release). This
eventually calls stop_active_transfer with IOC cleared, hence
send_gadget_ep_cmd() skips waiting for CMDACT cleared during
EndXfer. For DWC3 controllers with revisions >= 310a, we don't
forcefully wait for 1ms either, and we proceed by unmapping the
requests. If ENDXFER didn't complete by this time, it leads to
SMMU faults since the controller would still be accessing those
requests.
Fix this by ensuring ENDXFER completion by adding 1ms delay in
__dwc3_stop_active_transfer() unconditionally.
Cc: <stable(a)vger.kernel.org>
Fixes: b353eb6dc285 ("usb: dwc3: gadget: Skip waiting for CMDACT cleared during endxfer")
Signed-off-by: Prashanth K <quic_prashk(a)quicinc.com>
---
Changes in v2:
Changed the patch logic from CMDACT polling to 1ms mdelay.
Updated subject and commit accordingly.
Link to v1: https://lore.kernel.org/all/20240422090539.3986723-1-quic_prashk@quicinc.co…
drivers/usb/dwc3/gadget.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 4df2661f6675..666eae94524f 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1724,8 +1724,7 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int
dep->resource_index = 0;
if (!interrupt) {
- if (!DWC3_IP_IS(DWC3) || DWC3_VER_IS_PRIOR(DWC3, 310A))
- mdelay(1);
+ mdelay(1);
dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
} else if (!ret) {
dep->flags |= DWC3_EP_END_TRANSFER_PENDING;
--
2.25.1
The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.
Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
mask off the upper-32-bits of DPA addresses. The cxl_poison event is
unaffected.
If userspace was doing its own DPA-to-HPA translation this could lead to
incorrect page retirement decisions, but there is no known consumer
(like rasdaemon) of this event today.
Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc: <stable(a)vger.kernel.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Signed-off-by: Shiyang Ruan <ruansy.fnst(a)fujitsu.com>
---
drivers/cxl/core/trace.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..cdfce932d5b1 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
* DRAM Event Record
* CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/
-#define CXL_DPA_FLAGS_MASK 0x3F
+#define CXL_DPA_FLAGS_MASK 0x3FULL
#define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
#define CXL_DPA_VOLATILE BIT(0)
--
2.34.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x f489c948028b69cea235d9c0de1cc10eeb26a172
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042905-puppy-heritage-e422@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
f489c948028b ("ACPI: CPPC: Fix access width used for PCC registers")
2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
0651ab90e4ad ("ACPI: CPPC: Check _OSC for flexible address space")
c42fa24b4475 ("ACPI: bus: Avoid using CPPC if not supported by firmware")
2ca8e6285250 ("Revert "ACPI: Pass the same capabilities to the _OSC regardless of the query flag"")
f684b1075128 ("ACPI: CPPC: Drop redundant local variable from cpc_read()")
5f51c7ce1dc3 ("ACPI: CPPC: Fix up I/O port access in cpc_read()")
a2c8f92bea5f ("ACPI: CPPC: Implement support for SystemIO registers")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f489c948028b69cea235d9c0de1cc10eeb26a172 Mon Sep 17 00:00:00 2001
From: Vanshidhar Konda <vanshikonda(a)os.amperecomputing.com>
Date: Thu, 11 Apr 2024 16:18:44 -0700
Subject: [PATCH] ACPI: CPPC: Fix access width used for PCC registers
commit 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system
memory accesses") modified cpc_read()/cpc_write() to use access_width to
read CPC registers.
However, for PCC registers the access width field in the ACPI register
macro specifies the PCC subspace ID. For non-zero PCC subspace ID it is
incorrectly treated as access width. This causes errors when reading
from PCC registers in the CPPC driver.
For PCC registers, base the size of read/write on the bit width field.
The debug message in cpc_read()/cpc_write() is updated to print relevant
information for the address space type used to read the register.
Fixes: 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Vanshidhar Konda <vanshikonda(a)os.amperecomputing.com>
Tested-by: Jarred White <jarredwhite(a)linux.microsoft.com>
Reviewed-by: Jarred White <jarredwhite(a)linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha(a)linux.microsoft.com>
Cc: 5.15+ <stable(a)vger.kernel.org> # 5.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 00a30ca35e78..a40b6f3946ef 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1002,14 +1002,14 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
}
*val = 0;
+ size = GET_BIT_WIDTH(reg);
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
- u32 width = GET_BIT_WIDTH(reg);
u32 val_u32;
acpi_status status;
status = acpi_os_read_port((acpi_io_address)reg->address,
- &val_u32, width);
+ &val_u32, size);
if (ACPI_FAILURE(status)) {
pr_debug("Error: Failed to read SystemIO port %llx\n",
reg->address);
@@ -1018,17 +1018,22 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
*val = val_u32;
return 0;
- } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0)
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) {
+ /*
+ * For registers in PCC space, the register size is determined
+ * by the bit width field; the access size is used to indicate
+ * the PCC subspace id.
+ */
+ size = reg->bit_width;
vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
+ }
else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
vaddr = reg_res->sys_mem_vaddr;
else if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE)
return cpc_read_ffh(cpu, reg, val);
else
return acpi_os_read_memory((acpi_physical_address)reg->address,
- val, reg->bit_width);
-
- size = GET_BIT_WIDTH(reg);
+ val, size);
switch (size) {
case 8:
@@ -1044,8 +1049,13 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
*val = readq_relaxed(vaddr);
break;
default:
- pr_debug("Error: Cannot read %u bit width from PCC for ss: %d\n",
- reg->bit_width, pcc_ss_id);
+ if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
+ pr_debug("Error: Cannot read %u bit width from system memory: 0x%llx\n",
+ size, reg->address);
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
+ pr_debug("Error: Cannot read %u bit width from PCC for ss: %d\n",
+ size, pcc_ss_id);
+ }
return -EFAULT;
}
@@ -1063,12 +1073,13 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
struct cpc_reg *reg = ®_res->cpc_entry.reg;
+ size = GET_BIT_WIDTH(reg);
+
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
- u32 width = GET_BIT_WIDTH(reg);
acpi_status status;
status = acpi_os_write_port((acpi_io_address)reg->address,
- (u32)val, width);
+ (u32)val, size);
if (ACPI_FAILURE(status)) {
pr_debug("Error: Failed to write SystemIO port %llx\n",
reg->address);
@@ -1076,17 +1087,22 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
}
return 0;
- } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0)
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) {
+ /*
+ * For registers in PCC space, the register size is determined
+ * by the bit width field; the access size is used to indicate
+ * the PCC subspace id.
+ */
+ size = reg->bit_width;
vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
+ }
else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
vaddr = reg_res->sys_mem_vaddr;
else if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE)
return cpc_write_ffh(cpu, reg, val);
else
return acpi_os_write_memory((acpi_physical_address)reg->address,
- val, reg->bit_width);
-
- size = GET_BIT_WIDTH(reg);
+ val, size);
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
val = MASK_VAL(reg, val);
@@ -1105,8 +1121,13 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
writeq_relaxed(val, vaddr);
break;
default:
- pr_debug("Error: Cannot write %u bit width to PCC for ss: %d\n",
- reg->bit_width, pcc_ss_id);
+ if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
+ pr_debug("Error: Cannot write %u bit width to system memory: 0x%llx\n",
+ size, reg->address);
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
+ pr_debug("Error: Cannot write %u bit width to PCC for ss: %d\n",
+ size, pcc_ss_id);
+ }
ret_val = -EFAULT;
break;
}
The NVM configuration files used by WCN3988 and WCN3990/1/8 have two
sets of configuration tags that are enclosed by a type-length header of
type four which the current parser fails to account for.
Instead the driver happily parses random data as if it were valid tags,
something which can lead to the configuration data being corrupted if it
ever encounters the words 0x0011 or 0x001b.
As is clear from commit b63882549b2b ("Bluetooth: btqca: Fix the NVM
baudrate tag offcet for wcn3991") the intention has always been to
process the configuration data also for WCN3991 and WCN3998 which
encodes the baud rate at a different offset.
Fix the parser so that it can handle the WCN3xxx configuration files,
which has an enclosing type-length header of type four and two sets of
TLV tags enclosed by a type-length header of type two and three,
respectively.
Note that only the first set, which contains the tags the driver is
currently looking for, will be parsed for now.
With the parser fixed, the software in-band sleep bit will now be set
for WCN3991 and WCN3998 (as it is for later controllers) and the default
baud rate 3200000 may be updated by the driver also for WCN3xxx
controllers.
Notably the deep-sleep feature bit is already set by default in all
configuration files in linux-firmware.
Fixes: 4219d4686875 ("Bluetooth: btqca: Add wcn3990 firmware download support.")
Cc: stable(a)vger.kernel.org # 4.19
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/bluetooth/btqca.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c
index 6743b0a79d7a..f6c9f89a6311 100644
--- a/drivers/bluetooth/btqca.c
+++ b/drivers/bluetooth/btqca.c
@@ -281,6 +281,7 @@ static int qca_tlv_check_data(struct hci_dev *hdev,
struct tlv_type_patch *tlv_patch;
struct tlv_type_nvm *tlv_nvm;
uint8_t nvm_baud_rate = config->user_baud_rate;
+ u8 type;
config->dnld_mode = QCA_SKIP_EVT_NONE;
config->dnld_type = QCA_SKIP_EVT_NONE;
@@ -346,11 +347,30 @@ static int qca_tlv_check_data(struct hci_dev *hdev,
tlv = (struct tlv_type_hdr *)fw_data;
type_len = le32_to_cpu(tlv->type_len);
- length = (type_len >> 8) & 0x00ffffff;
+ length = type_len >> 8;
+ type = type_len & 0xff;
- BT_DBG("TLV Type\t\t : 0x%x", type_len & 0x000000ff);
+ /* Some NVM files have more than one set of tags, only parse
+ * the first set when it has type 2 for now. When there is
+ * more than one set there is an enclosing header of type 4.
+ */
+ if (type == 4) {
+ if (fw_size < 2 * sizeof(struct tlv_type_hdr))
+ return -EINVAL;
+
+ tlv++;
+
+ type_len = le32_to_cpu(tlv->type_len);
+ length = type_len >> 8;
+ type = type_len & 0xff;
+ }
+
+ BT_DBG("TLV Type\t\t : 0x%x", type);
BT_DBG("Length\t\t : %d bytes", length);
+ if (type != 2)
+ break;
+
if (fw_size < length + (tlv->data - fw_data))
return -EINVAL;
--
2.43.2