The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x edc0a2b5957652f4685ef3516f519f84807087db
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052818-glider-vivacious-f7d5@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
edc0a2b59576 ("x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid platforms")
7745f03eb395 ("x86/topology: Add CPUID.1F multi-die/package support")
95f3d39ccf7a ("x86/cpu/topology: Provide detect_extended_topology_early()")
55e6d279abd9 ("x86/cpu: Remove the pointless CPU printout")
9305bd6ca7b4 ("x86/CPU: Move x86_cpuinfo::x86_max_cores assignment to detect_num_cpu_cores()")
a2aa578fec8c ("x86/Centaur: Report correct CPU/cache topology")
2cc61be60e37 ("x86/CPU: Make intel_num_cpu_cores() generic")
b5cf8707e6c9 ("x86/CPU: Move cpu local function declarations to local header")
6c4f5abaf356 ("x86/CPU: Modify detect_extended_topology() to return result")
60882cc159e1 ("x86/Centaur: Initialize supported CPU features properly")
7d5905dc14a8 ("x86 / CPU: Always show current CPU frequency in /proc/cpuinfo")
b29c6ef7bb12 ("x86 / CPU: Avoid unnecessary IPIs in arch_freq_get_on_cpu()")
d6ec9d9a4def ("Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From edc0a2b5957652f4685ef3516f519f84807087db Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang(a)intel.com>
Date: Thu, 23 Mar 2023 09:56:40 +0800
Subject: [PATCH] x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid
platforms
Traditionally, all CPUs in a system have identical numbers of SMT
siblings. That changes with hybrid processors where some logical CPUs
have a sibling and others have none.
Today, the CPU boot code sets the global variable smp_num_siblings when
every CPU thread is brought up. The last thread to boot will overwrite
it with the number of siblings of *that* thread. That last thread to
boot will "win". If the thread is a Pcore, smp_num_siblings == 2. If it
is an Ecore, smp_num_siblings == 1.
smp_num_siblings describes if the *system* supports SMT. It should
specify the maximum number of SMT threads among all cores.
Ensure that smp_num_siblings represents the system-wide maximum number
of siblings by always increasing its value. Never allow it to decrease.
On MeteorLake-P platform, this fixes a problem that the Ecore CPUs are
not updated in any cpu sibling map because the system is treated as an
UP system when probing Ecore CPUs.
Below shows part of the CPU topology information before and after the
fix, for both Pcore and Ecore CPU (cpu0 is Pcore, cpu 12 is Ecore).
...
-/sys/devices/system/cpu/cpu0/topology/package_cpus:000fff
-/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-11
+/sys/devices/system/cpu/cpu0/topology/package_cpus:3fffff
+/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-21
...
-/sys/devices/system/cpu/cpu12/topology/package_cpus:001000
-/sys/devices/system/cpu/cpu12/topology/package_cpus_list:12
+/sys/devices/system/cpu/cpu12/topology/package_cpus:3fffff
+/sys/devices/system/cpu/cpu12/topology/package_cpus_list:0-21
Notice that the "before" 'package_cpus_list' has only one CPU. This
means that userspace tools like lscpu will see a little laptop like
an 11-socket system:
-Core(s) per socket: 1
-Socket(s): 11
+Core(s) per socket: 16
+Socket(s): 1
This is also expected to make the scheduler do rather wonky things
too.
[ dhansen: remove CPUID detail from changelog, add end user effects ]
CC: stable(a)kernel.org
Fixes: bbb65d2d365e ("x86: use cpuid vector 0xb when available for detecting cpu topology")
Fixes: 95f3d39ccf7a ("x86/cpu/topology: Provide detect_extended_topology_early()")
Suggested-by: Len Brown <len.brown(a)intel.com>
Signed-off-by: Zhang Rui <rui.zhang(a)intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Link: https://lore.kernel.org/all/20230323015640.27906-1-rui.zhang%40intel.com
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 5e868b62a7c4..0270925fe013 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -79,7 +79,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
* initial apic id, which also represents 32-bit extended x2apic id.
*/
c->initial_apicid = edx;
- smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
#endif
return 0;
}
@@ -109,7 +109,8 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
*/
cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
c->initial_apicid = edx;
- core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x edc0a2b5957652f4685ef3516f519f84807087db
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052817-extras-parasail-a281@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
edc0a2b59576 ("x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid platforms")
7745f03eb395 ("x86/topology: Add CPUID.1F multi-die/package support")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From edc0a2b5957652f4685ef3516f519f84807087db Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang(a)intel.com>
Date: Thu, 23 Mar 2023 09:56:40 +0800
Subject: [PATCH] x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid
platforms
Traditionally, all CPUs in a system have identical numbers of SMT
siblings. That changes with hybrid processors where some logical CPUs
have a sibling and others have none.
Today, the CPU boot code sets the global variable smp_num_siblings when
every CPU thread is brought up. The last thread to boot will overwrite
it with the number of siblings of *that* thread. That last thread to
boot will "win". If the thread is a Pcore, smp_num_siblings == 2. If it
is an Ecore, smp_num_siblings == 1.
smp_num_siblings describes if the *system* supports SMT. It should
specify the maximum number of SMT threads among all cores.
Ensure that smp_num_siblings represents the system-wide maximum number
of siblings by always increasing its value. Never allow it to decrease.
On MeteorLake-P platform, this fixes a problem that the Ecore CPUs are
not updated in any cpu sibling map because the system is treated as an
UP system when probing Ecore CPUs.
Below shows part of the CPU topology information before and after the
fix, for both Pcore and Ecore CPU (cpu0 is Pcore, cpu 12 is Ecore).
...
-/sys/devices/system/cpu/cpu0/topology/package_cpus:000fff
-/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-11
+/sys/devices/system/cpu/cpu0/topology/package_cpus:3fffff
+/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-21
...
-/sys/devices/system/cpu/cpu12/topology/package_cpus:001000
-/sys/devices/system/cpu/cpu12/topology/package_cpus_list:12
+/sys/devices/system/cpu/cpu12/topology/package_cpus:3fffff
+/sys/devices/system/cpu/cpu12/topology/package_cpus_list:0-21
Notice that the "before" 'package_cpus_list' has only one CPU. This
means that userspace tools like lscpu will see a little laptop like
an 11-socket system:
-Core(s) per socket: 1
-Socket(s): 11
+Core(s) per socket: 16
+Socket(s): 1
This is also expected to make the scheduler do rather wonky things
too.
[ dhansen: remove CPUID detail from changelog, add end user effects ]
CC: stable(a)kernel.org
Fixes: bbb65d2d365e ("x86: use cpuid vector 0xb when available for detecting cpu topology")
Fixes: 95f3d39ccf7a ("x86/cpu/topology: Provide detect_extended_topology_early()")
Suggested-by: Len Brown <len.brown(a)intel.com>
Signed-off-by: Zhang Rui <rui.zhang(a)intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Link: https://lore.kernel.org/all/20230323015640.27906-1-rui.zhang%40intel.com
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 5e868b62a7c4..0270925fe013 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -79,7 +79,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
* initial apic id, which also represents 32-bit extended x2apic id.
*/
c->initial_apicid = edx;
- smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
#endif
return 0;
}
@@ -109,7 +109,8 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
*/
cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
c->initial_apicid = edx;
- core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
On some arch (for example IPQ8074) and other with
KERNEL_STACKPROTECTOR_STRONG enabled, the following compilation error is
triggered:
drivers/net/wireguard/allowedips.c: In function 'root_remove_peer_lists':
drivers/net/wireguard/allowedips.c:80:1: error: the frame size of 1040 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
80 | }
| ^
drivers/net/wireguard/allowedips.c: In function 'root_free_rcu':
drivers/net/wireguard/allowedips.c:67:1: error: the frame size of 1040 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
67 | }
| ^
cc1: all warnings being treated as errors
Since these are free function and returns void, using function that can
fail is not ideal since an error would result in data not freed.
Since the free are under RCU lock, we can allocate the required stack
array as static outside the function and memset when needed.
This effectively fix the stack frame warning without changing how the
function work.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
Changes v2:
- Fix double Fixes in fixes tag
drivers/net/wireguard/allowedips.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c
index 5bf7822c53f1..c129082f04c6 100644
--- a/drivers/net/wireguard/allowedips.c
+++ b/drivers/net/wireguard/allowedips.c
@@ -53,12 +53,16 @@ static void node_free_rcu(struct rcu_head *rcu)
kmem_cache_free(node_cache, container_of(rcu, struct allowedips_node, rcu));
}
+static struct allowedips_node *tmpstack[MAX_ALLOWEDIPS_BITS];
+
static void root_free_rcu(struct rcu_head *rcu)
{
- struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = {
- container_of(rcu, struct allowedips_node, rcu) };
+ struct allowedips_node *node, **stack = tmpstack;
unsigned int len = 1;
+ memset(stack, 0, sizeof(*stack) * MAX_ALLOWEDIPS_BITS);
+ stack[0] = container_of(rcu, struct allowedips_node, rcu);
+
while (len > 0 && (node = stack[--len])) {
push_rcu(stack, node->bit[0], &len);
push_rcu(stack, node->bit[1], &len);
@@ -68,9 +72,12 @@ static void root_free_rcu(struct rcu_head *rcu)
static void root_remove_peer_lists(struct allowedips_node *root)
{
- struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = { root };
+ struct allowedips_node *node, **stack = tmpstack;
unsigned int len = 1;
+ memset(stack, 0, sizeof(*stack) * MAX_ALLOWEDIPS_BITS);
+ stack[0] = root;
+
while (len > 0 && (node = stack[--len])) {
push_rcu(stack, node->bit[0], &len);
push_rcu(stack, node->bit[1], &len);
--
2.39.2
We are Zhao & Associates. We contacting you prior to the COVID relief donation
from Joseph Tsai foundation. You were selected amongst 4 others from different
parts of the world. Kindly confirm your email with us, before we proceed.
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 46930b7cc7727271c9c27aac1fdc97a8645e2d00
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052842-facedown-sliceable-49ca@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
46930b7cc772 ("block: fix bio-cache for passthru IO")
7e2e355dd9c9 ("block: extend bio-cache for non-polled requests")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 46930b7cc7727271c9c27aac1fdc97a8645e2d00 Mon Sep 17 00:00:00 2001
From: Anuj Gupta <anuj20.g(a)samsung.com>
Date: Tue, 23 May 2023 16:47:09 +0530
Subject: [PATCH] block: fix bio-cache for passthru IO
commit <8af870aa5b847> ("block: enable bio caching use for passthru IO")
introduced bio-cache for passthru IO. In case when nr_vecs are greater
than BIO_INLINE_VECS, bio and bvecs are allocated from mempool (instead
of percpu cache) and REQ_ALLOC_CACHE is cleared. This causes the side
effect of not freeing bio/bvecs into mempool on completion.
This patch lets the passthru IO fallback to allocation using bio_kmalloc
when nr_vecs are greater than BIO_INLINE_VECS. The corresponding bio
is freed during call to blk_mq_map_bio_put during completion.
Cc: stable(a)vger.kernel.org # 6.1
fixes <8af870aa5b847> ("block: enable bio caching use for passthru IO")
Signed-off-by: Anuj Gupta <anuj20.g(a)samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k(a)samsung.com>
Link: https://lore.kernel.org/r/20230523111709.145676-1-anuj20.g@samsung.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/block/blk-map.c b/block/blk-map.c
index 04c55f1c492e..46eed2e627c3 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -248,7 +248,7 @@ static struct bio *blk_rq_map_bio_alloc(struct request *rq,
{
struct bio *bio;
- if (rq->cmd_flags & REQ_ALLOC_CACHE) {
+ if (rq->cmd_flags & REQ_ALLOC_CACHE && (nr_vecs <= BIO_INLINE_VECS)) {
bio = bio_alloc_bioset(NULL, nr_vecs, rq->cmd_flags, gfp_mask,
&fs_bio_set);
if (!bio)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 597441b3436a43011f31ce71dc0a6c0bf5ce958a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052843-pants-drearily-a496@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
597441b3436a ("btrfs: use nofs when cleaning up aborted transactions")
fe816d0f1d4c ("btrfs: Fix delalloc inodes invalidation during transaction abort")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 597441b3436a43011f31ce71dc0a6c0bf5ce958a Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Thu, 11 May 2023 12:45:59 -0400
Subject: [PATCH] btrfs: use nofs when cleaning up aborted transactions
Our CI system caught a lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
6.3.0-rc7+ #1167 Not tainted
------------------------------------------------------
kswapd0/46 is trying to acquire lock:
ffff8c6543abd650 (sb_internal#2){++++}-{0:0}, at: btrfs_commit_inode_delayed_inode+0x5f/0x120
but task is already holding lock:
ffffffffabe61b40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x4aa/0x7a0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire+0xa5/0xe0
kmem_cache_alloc+0x31/0x2c0
alloc_extent_state+0x1d/0xd0
__clear_extent_bit+0x2e0/0x4f0
try_release_extent_mapping+0x216/0x280
btrfs_release_folio+0x2e/0x90
invalidate_inode_pages2_range+0x397/0x470
btrfs_cleanup_dirty_bgs+0x9e/0x210
btrfs_cleanup_one_transaction+0x22/0x760
btrfs_commit_transaction+0x3b7/0x13a0
create_subvol+0x59b/0x970
btrfs_mksubvol+0x435/0x4f0
__btrfs_ioctl_snap_create+0x11e/0x1b0
btrfs_ioctl_snap_create_v2+0xbf/0x140
btrfs_ioctl+0xa45/0x28f0
__x64_sys_ioctl+0x88/0xc0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
-> #0 (sb_internal#2){++++}-{0:0}:
__lock_acquire+0x1435/0x21a0
lock_acquire+0xc2/0x2b0
start_transaction+0x401/0x730
btrfs_commit_inode_delayed_inode+0x5f/0x120
btrfs_evict_inode+0x292/0x3d0
evict+0xcc/0x1d0
inode_lru_isolate+0x14d/0x1e0
__list_lru_walk_one+0xbe/0x1c0
list_lru_walk_one+0x58/0x80
prune_icache_sb+0x39/0x60
super_cache_scan+0x161/0x1f0
do_shrink_slab+0x163/0x340
shrink_slab+0x1d3/0x290
shrink_node+0x300/0x720
balance_pgdat+0x35c/0x7a0
kswapd+0x205/0x410
kthread+0xf0/0x120
ret_from_fork+0x29/0x50
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(sb_internal#2);
lock(fs_reclaim);
lock(sb_internal#2);
*** DEADLOCK ***
3 locks held by kswapd0/46:
#0: ffffffffabe61b40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x4aa/0x7a0
#1: ffffffffabe50270 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x113/0x290
#2: ffff8c6543abd0e0 (&type->s_umount_key#44){++++}-{3:3}, at: super_cache_scan+0x38/0x1f0
stack backtrace:
CPU: 0 PID: 46 Comm: kswapd0 Not tainted 6.3.0-rc7+ #1167
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x58/0x90
check_noncircular+0xd6/0x100
? save_trace+0x3f/0x310
? add_lock_to_list+0x97/0x120
__lock_acquire+0x1435/0x21a0
lock_acquire+0xc2/0x2b0
? btrfs_commit_inode_delayed_inode+0x5f/0x120
start_transaction+0x401/0x730
? btrfs_commit_inode_delayed_inode+0x5f/0x120
btrfs_commit_inode_delayed_inode+0x5f/0x120
btrfs_evict_inode+0x292/0x3d0
? lock_release+0x134/0x270
? __pfx_wake_bit_function+0x10/0x10
evict+0xcc/0x1d0
inode_lru_isolate+0x14d/0x1e0
__list_lru_walk_one+0xbe/0x1c0
? __pfx_inode_lru_isolate+0x10/0x10
? __pfx_inode_lru_isolate+0x10/0x10
list_lru_walk_one+0x58/0x80
prune_icache_sb+0x39/0x60
super_cache_scan+0x161/0x1f0
do_shrink_slab+0x163/0x340
shrink_slab+0x1d3/0x290
shrink_node+0x300/0x720
balance_pgdat+0x35c/0x7a0
kswapd+0x205/0x410
? __pfx_autoremove_wake_function+0x10/0x10
? __pfx_kswapd+0x10/0x10
kthread+0xf0/0x120
? __pfx_kthread+0x10/0x10
ret_from_fork+0x29/0x50
</TASK>
This happens because when we abort the transaction in the transaction
commit path we call invalidate_inode_pages2_range on our block group
cache inodes (if we have space cache v1) and any delalloc inodes we may
have. The plain invalidate_inode_pages2_range() call passes through
GFP_KERNEL, which makes sense in most cases, but not here. Wrap these
two invalidate callees with memalloc_nofs_save/memalloc_nofs_restore to
make sure we don't end up with the fs reclaim dependency under the
transaction dependency.
CC: stable(a)vger.kernel.org # 4.14+
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index fbf9006c6234..6de6dcf2743e 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4936,7 +4936,11 @@ static void btrfs_destroy_delalloc_inodes(struct btrfs_root *root)
*/
inode = igrab(&btrfs_inode->vfs_inode);
if (inode) {
+ unsigned int nofs_flag;
+
+ nofs_flag = memalloc_nofs_save();
invalidate_inode_pages2(inode->i_mapping);
+ memalloc_nofs_restore(nofs_flag);
iput(inode);
}
spin_lock(&root->delalloc_lock);
@@ -5042,7 +5046,12 @@ static void btrfs_cleanup_bg_io(struct btrfs_block_group *cache)
inode = cache->io_ctl.inode;
if (inode) {
+ unsigned int nofs_flag;
+
+ nofs_flag = memalloc_nofs_save();
invalidate_inode_pages2(inode->i_mapping);
+ memalloc_nofs_restore(nofs_flag);
+
BTRFS_I(inode)->generation = 0;
cache->io_ctl.inode = NULL;
iput(inode);
Partially backport v6.3 commit 11f75a01448f ("selftests/memfd: add
tests for MFD_NOEXEC_SEAL MFD_EXEC") to fix an unknown type name
build error.
In some systems, the __u64 typedef is not present due to differences
in system headers, causing compilation errors like this one:
fuse_test.c:64:8: error: unknown type name '__u64'
64 | static __u64 mfd_assert_get_seals(int fd)
This header includes the __u64 typedef which increases the
likelihood of successful compilation on a wider variety of systems.
Signed-off-by: Hardik Garg <hargar(a)linux.microsoft.com>
---
tools/testing/selftests/memfd/fuse_test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/memfd/fuse_test.c b/tools/testing/selftests/memfd/fuse_test.c
index be675002f918..93798c8c5d54 100644
--- a/tools/testing/selftests/memfd/fuse_test.c
+++ b/tools/testing/selftests/memfd/fuse_test.c
@@ -22,6 +22,7 @@
#include <linux/falloc.h>
#include <fcntl.h>
#include <linux/memfd.h>
+#include <linux/types.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
--
2.25.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x ce0b15d11ad837fbacc5356941712218e38a0a83
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052612-reproach-snowbird-d3a2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
ce0b15d11ad8 ("x86/mm: Avoid incomplete Global INVLPG flushes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ce0b15d11ad837fbacc5356941712218e38a0a83 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Tue, 16 May 2023 12:24:25 -0700
Subject: [PATCH] x86/mm: Avoid incomplete Global INVLPG flushes
The INVLPG instruction is used to invalidate TLB entries for a
specified virtual address. When PCIDs are enabled, INVLPG is supposed
to invalidate TLB entries for the specified address for both the
current PCID *and* Global entries. (Note: Only kernel mappings set
Global=1.)
Unfortunately, some INVLPG implementations can leave Global
translations unflushed when PCIDs are enabled.
As a workaround, never enable PCIDs on affected processors.
I expect there to eventually be microcode mitigations to replace this
software workaround. However, the exact version numbers where that
will happen are not known today. Once the version numbers are set in
stone, the processor list can be tweaked to only disable PCIDs on
affected processors with affected microcode.
Note: if anyone wants a quick fix that doesn't require patching, just
stick 'nopcid' on your kernel command-line.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 3cdac0f0055d..8192452d1d2d 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -9,6 +9,7 @@
#include <linux/sched/task.h>
#include <asm/set_memory.h>
+#include <asm/cpu_device_id.h>
#include <asm/e820/api.h>
#include <asm/init.h>
#include <asm/page.h>
@@ -261,6 +262,24 @@ static void __init probe_page_size_mask(void)
}
}
+#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
+ .family = 6, \
+ .model = _model, \
+ }
+/*
+ * INVLPG may not properly flush Global entries
+ * on these CPUs when PCIDs are enabled.
+ */
+static const struct x86_cpu_id invlpg_miss_ids[] = {
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+ {}
+};
+
static void setup_pcid(void)
{
if (!IS_ENABLED(CONFIG_X86_64))
@@ -269,6 +288,12 @@ static void setup_pcid(void)
if (!boot_cpu_has(X86_FEATURE_PCID))
return;
+ if (x86_match_cpu(invlpg_miss_ids)) {
+ pr_info("Incomplete global flushes, disabling PCID");
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+ return;
+ }
+
if (boot_cpu_has(X86_FEATURE_PGE)) {
/*
* This can't be cr4_set_bits_and_update_boot() -- the
Hi Greg, Sasha,
[ This is v2 including two initial missing backport patches and one new
patch at the end of this batch. ]
This is second round of -stable backport fixes for 4.14. This batch
includes dependency patches which are not currently in the 4.14 branch.
The following list shows the backported patches, I am using original
commit IDs for reference:
1) 4f16d25c68ec ("netfilter: nftables: add nft_parse_register_load() and use it")
2) 345023b0db31 ("netfilter: nftables: add nft_parse_register_store() and use it")
3) 08a01c11a5bb ("netfilter: nftables: statify nft_parse_register()")
4) 6e1acfa387b9 ("netfilter: nf_tables: validate registers coming from userspace.")
5) 20a1452c3542 ("netfilter: nf_tables: add nft_setelem_parse_key()")
6) fdb9c405e35b ("netfilter: nf_tables: allow up to 64 bytes in the set element data area")
7) 7e6bc1f6cabc ("netfilter: nf_tables: stricter validation of element data")
8) 215a31f19ded ("netfilter: nft_dynset: do not reject set updates with NFT_SET_EVAL")
9) 36d5b2913219 ("netfilter: nf_tables: do not allow RULE_ID to refer to another chain")
10) 470ee20e069a ("netfilter: nf_tables: do not allow SET_ID to refer to another table")
11) d209df3e7f70 ("netfilter: nf_tables: fix register ordering")
Please, apply.
Thanks.
Florian Westphal (1):
netfilter: nf_tables: fix register ordering
Pablo Neira Ayuso (10):
netfilter: nftables: add nft_parse_register_load() and use it
netfilter: nftables: add nft_parse_register_store() and use it
netfilter: nftables: statify nft_parse_register()
netfilter: nf_tables: validate registers coming from userspace.
netfilter: nf_tables: add nft_setelem_parse_key()
netfilter: nf_tables: allow up to 64 bytes in the set element data area
netfilter: nf_tables: stricter validation of element data
netfilter: nft_dynset: do not reject set updates with NFT_SET_EVAL
netfilter: nf_tables: do not allow RULE_ID to refer to another chain
netfilter: nf_tables: do not allow SET_ID to refer to another table
include/net/netfilter/nf_tables.h | 17 +-
include/net/netfilter/nf_tables_core.h | 14 +-
include/net/netfilter/nft_fib.h | 2 +-
include/net/netfilter/nft_masq.h | 4 +-
include/net/netfilter/nft_meta.h | 4 +-
include/net/netfilter/nft_redir.h | 4 +-
include/uapi/linux/netfilter/nf_tables.h | 2 +-
net/bridge/netfilter/nft_meta_bridge.c | 5 +-
net/ipv4/netfilter/nft_dup_ipv4.c | 18 +-
net/ipv6/netfilter/nft_dup_ipv6.c | 18 +-
net/netfilter/nf_tables_api.c | 220 +++++++++++++++--------
net/netfilter/nft_bitwise.c | 14 +-
net/netfilter/nft_byteorder.c | 14 +-
net/netfilter/nft_cmp.c | 8 +-
net/netfilter/nft_ct.c | 12 +-
net/netfilter/nft_dup_netdev.c | 6 +-
net/netfilter/nft_dynset.c | 16 +-
net/netfilter/nft_exthdr.c | 14 +-
net/netfilter/nft_fib.c | 5 +-
net/netfilter/nft_fwd_netdev.c | 6 +-
net/netfilter/nft_hash.c | 25 ++-
net/netfilter/nft_immediate.c | 8 +-
net/netfilter/nft_lookup.c | 14 +-
net/netfilter/nft_masq.c | 14 +-
net/netfilter/nft_meta.c | 8 +-
net/netfilter/nft_nat.c | 35 ++--
net/netfilter/nft_numgen.c | 15 +-
net/netfilter/nft_objref.c | 6 +-
net/netfilter/nft_payload.c | 10 +-
net/netfilter/nft_queue.c | 12 +-
net/netfilter/nft_range.c | 6 +-
net/netfilter/nft_redir.c | 14 +-
net/netfilter/nft_rt.c | 7 +-
33 files changed, 320 insertions(+), 257 deletions(-)
--
2.30.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x d1d8875c8c13517f6fd1ff8d4d3e1ac366a17e07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052834-shakable-mobile-fdb0@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
d1d8875c8c13 ("binder: fix UAF of alloc->vma in race with munmap()")
c0fd2101781e ("Revert "android: binder: stop saving a pointer to the VMA"")
7b0dbd940765 ("binder: fix binder_alloc kernel-doc warnings")
d6d04d71daae ("binder: remove binder_alloc_set_vma()")
e66b77e50522 ("binder: rename alloc->vma_vm_mm to alloc->mm")
50e177c5bfd9 ("Merge 6.0-rc4 into char-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d1d8875c8c13517f6fd1ff8d4d3e1ac366a17e07 Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Fri, 19 May 2023 19:59:49 +0000
Subject: [PATCH] binder: fix UAF of alloc->vma in race with munmap()
[ cmllamas: clean forward port from commit 015ac18be7de ("binder: fix
UAF of alloc->vma in race with munmap()") in 5.10 stable. It is needed
in mainline after the revert of commit a43cfc87caaf ("android: binder:
stop saving a pointer to the VMA") as pointed out by Liam. The commit
log and tags have been tweaked to reflect this. ]
In commit 720c24192404 ("ANDROID: binder: change down_write to
down_read") binder assumed the mmap read lock is sufficient to protect
alloc->vma inside binder_update_page_range(). This used to be accurate
until commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in
munmap"), which now downgrades the mmap_lock after detaching the vma
from the rbtree in munmap(). Then it proceeds to teardown and free the
vma with only the read lock held.
This means that accesses to alloc->vma in binder_update_page_range() now
will race with vm_area_free() in munmap() and can cause a UAF as shown
in the following KASAN trace:
==================================================================
BUG: KASAN: use-after-free in vm_insert_page+0x7c/0x1f0
Read of size 8 at addr ffff16204ad00600 by task server/558
CPU: 3 PID: 558 Comm: server Not tainted 5.10.150-00001-gdc8dcf942daa #1
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x2a0
show_stack+0x18/0x2c
dump_stack+0xf8/0x164
print_address_description.constprop.0+0x9c/0x538
kasan_report+0x120/0x200
__asan_load8+0xa0/0xc4
vm_insert_page+0x7c/0x1f0
binder_update_page_range+0x278/0x50c
binder_alloc_new_buf+0x3f0/0xba0
binder_transaction+0x64c/0x3040
binder_thread_write+0x924/0x2020
binder_ioctl+0x1610/0x2e5c
__arm64_sys_ioctl+0xd4/0x120
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Allocated by task 559:
kasan_save_stack+0x38/0x6c
__kasan_kmalloc.constprop.0+0xe4/0xf0
kasan_slab_alloc+0x18/0x2c
kmem_cache_alloc+0x1b0/0x2d0
vm_area_alloc+0x28/0x94
mmap_region+0x378/0x920
do_mmap+0x3f0/0x600
vm_mmap_pgoff+0x150/0x17c
ksys_mmap_pgoff+0x284/0x2dc
__arm64_sys_mmap+0x84/0xa4
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Freed by task 560:
kasan_save_stack+0x38/0x6c
kasan_set_track+0x28/0x40
kasan_set_free_info+0x24/0x4c
__kasan_slab_free+0x100/0x164
kasan_slab_free+0x14/0x20
kmem_cache_free+0xc4/0x34c
vm_area_free+0x1c/0x2c
remove_vma+0x7c/0x94
__do_munmap+0x358/0x710
__vm_munmap+0xbc/0x130
__arm64_sys_munmap+0x4c/0x64
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
[...]
==================================================================
To prevent the race above, revert back to taking the mmap write lock
inside binder_update_page_range(). One might expect an increase of mmap
lock contention. However, binder already serializes these calls via top
level alloc->mutex. Also, there was no performance impact shown when
running the binder benchmark tests.
Fixes: c0fd2101781e ("Revert "android: binder: stop saving a pointer to the VMA"")
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Reported-by: Jann Horn <jannh(a)google.com>
Closes: https://lore.kernel.org/all/20230518144052.xkj6vmddccq4v66b@revolver
Cc: <stable(a)vger.kernel.org>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Acked-by: Todd Kjos <tkjos(a)google.com>
Link: https://lore.kernel.org/r/20230519195950.1775656-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index e7c9d466f8e8..662a2a2e2e84 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -212,7 +212,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
mm = alloc->mm;
if (mm) {
- mmap_read_lock(mm);
+ mmap_write_lock(mm);
vma = alloc->vma;
}
@@ -270,7 +270,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
trace_binder_alloc_page_end(alloc, index);
}
if (mm) {
- mmap_read_unlock(mm);
+ mmap_write_unlock(mm);
mmput(mm);
}
return 0;
@@ -303,7 +303,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
}
err_no_vma:
if (mm) {
- mmap_read_unlock(mm);
+ mmap_write_unlock(mm);
mmput(mm);
}
return vma ? -ENOMEM : -ESRCH;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x d1d8875c8c13517f6fd1ff8d4d3e1ac366a17e07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052832-opposing-flavoring-f084@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
d1d8875c8c13 ("binder: fix UAF of alloc->vma in race with munmap()")
c0fd2101781e ("Revert "android: binder: stop saving a pointer to the VMA"")
7b0dbd940765 ("binder: fix binder_alloc kernel-doc warnings")
d6d04d71daae ("binder: remove binder_alloc_set_vma()")
e66b77e50522 ("binder: rename alloc->vma_vm_mm to alloc->mm")
50e177c5bfd9 ("Merge 6.0-rc4 into char-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d1d8875c8c13517f6fd1ff8d4d3e1ac366a17e07 Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Fri, 19 May 2023 19:59:49 +0000
Subject: [PATCH] binder: fix UAF of alloc->vma in race with munmap()
[ cmllamas: clean forward port from commit 015ac18be7de ("binder: fix
UAF of alloc->vma in race with munmap()") in 5.10 stable. It is needed
in mainline after the revert of commit a43cfc87caaf ("android: binder:
stop saving a pointer to the VMA") as pointed out by Liam. The commit
log and tags have been tweaked to reflect this. ]
In commit 720c24192404 ("ANDROID: binder: change down_write to
down_read") binder assumed the mmap read lock is sufficient to protect
alloc->vma inside binder_update_page_range(). This used to be accurate
until commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in
munmap"), which now downgrades the mmap_lock after detaching the vma
from the rbtree in munmap(). Then it proceeds to teardown and free the
vma with only the read lock held.
This means that accesses to alloc->vma in binder_update_page_range() now
will race with vm_area_free() in munmap() and can cause a UAF as shown
in the following KASAN trace:
==================================================================
BUG: KASAN: use-after-free in vm_insert_page+0x7c/0x1f0
Read of size 8 at addr ffff16204ad00600 by task server/558
CPU: 3 PID: 558 Comm: server Not tainted 5.10.150-00001-gdc8dcf942daa #1
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x2a0
show_stack+0x18/0x2c
dump_stack+0xf8/0x164
print_address_description.constprop.0+0x9c/0x538
kasan_report+0x120/0x200
__asan_load8+0xa0/0xc4
vm_insert_page+0x7c/0x1f0
binder_update_page_range+0x278/0x50c
binder_alloc_new_buf+0x3f0/0xba0
binder_transaction+0x64c/0x3040
binder_thread_write+0x924/0x2020
binder_ioctl+0x1610/0x2e5c
__arm64_sys_ioctl+0xd4/0x120
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Allocated by task 559:
kasan_save_stack+0x38/0x6c
__kasan_kmalloc.constprop.0+0xe4/0xf0
kasan_slab_alloc+0x18/0x2c
kmem_cache_alloc+0x1b0/0x2d0
vm_area_alloc+0x28/0x94
mmap_region+0x378/0x920
do_mmap+0x3f0/0x600
vm_mmap_pgoff+0x150/0x17c
ksys_mmap_pgoff+0x284/0x2dc
__arm64_sys_mmap+0x84/0xa4
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Freed by task 560:
kasan_save_stack+0x38/0x6c
kasan_set_track+0x28/0x40
kasan_set_free_info+0x24/0x4c
__kasan_slab_free+0x100/0x164
kasan_slab_free+0x14/0x20
kmem_cache_free+0xc4/0x34c
vm_area_free+0x1c/0x2c
remove_vma+0x7c/0x94
__do_munmap+0x358/0x710
__vm_munmap+0xbc/0x130
__arm64_sys_munmap+0x4c/0x64
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
[...]
==================================================================
To prevent the race above, revert back to taking the mmap write lock
inside binder_update_page_range(). One might expect an increase of mmap
lock contention. However, binder already serializes these calls via top
level alloc->mutex. Also, there was no performance impact shown when
running the binder benchmark tests.
Fixes: c0fd2101781e ("Revert "android: binder: stop saving a pointer to the VMA"")
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Reported-by: Jann Horn <jannh(a)google.com>
Closes: https://lore.kernel.org/all/20230518144052.xkj6vmddccq4v66b@revolver
Cc: <stable(a)vger.kernel.org>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Acked-by: Todd Kjos <tkjos(a)google.com>
Link: https://lore.kernel.org/r/20230519195950.1775656-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index e7c9d466f8e8..662a2a2e2e84 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -212,7 +212,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
mm = alloc->mm;
if (mm) {
- mmap_read_lock(mm);
+ mmap_write_lock(mm);
vma = alloc->vma;
}
@@ -270,7 +270,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
trace_binder_alloc_page_end(alloc, index);
}
if (mm) {
- mmap_read_unlock(mm);
+ mmap_write_unlock(mm);
mmput(mm);
}
return 0;
@@ -303,7 +303,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
}
err_no_vma:
if (mm) {
- mmap_read_unlock(mm);
+ mmap_write_unlock(mm);
mmput(mm);
}
return vma ? -ENOMEM : -ESRCH;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x d1d8875c8c13517f6fd1ff8d4d3e1ac366a17e07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052830-unblended-envision-4bbd@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
d1d8875c8c13 ("binder: fix UAF of alloc->vma in race with munmap()")
c0fd2101781e ("Revert "android: binder: stop saving a pointer to the VMA"")
7b0dbd940765 ("binder: fix binder_alloc kernel-doc warnings")
d6d04d71daae ("binder: remove binder_alloc_set_vma()")
e66b77e50522 ("binder: rename alloc->vma_vm_mm to alloc->mm")
50e177c5bfd9 ("Merge 6.0-rc4 into char-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d1d8875c8c13517f6fd1ff8d4d3e1ac366a17e07 Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Fri, 19 May 2023 19:59:49 +0000
Subject: [PATCH] binder: fix UAF of alloc->vma in race with munmap()
[ cmllamas: clean forward port from commit 015ac18be7de ("binder: fix
UAF of alloc->vma in race with munmap()") in 5.10 stable. It is needed
in mainline after the revert of commit a43cfc87caaf ("android: binder:
stop saving a pointer to the VMA") as pointed out by Liam. The commit
log and tags have been tweaked to reflect this. ]
In commit 720c24192404 ("ANDROID: binder: change down_write to
down_read") binder assumed the mmap read lock is sufficient to protect
alloc->vma inside binder_update_page_range(). This used to be accurate
until commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in
munmap"), which now downgrades the mmap_lock after detaching the vma
from the rbtree in munmap(). Then it proceeds to teardown and free the
vma with only the read lock held.
This means that accesses to alloc->vma in binder_update_page_range() now
will race with vm_area_free() in munmap() and can cause a UAF as shown
in the following KASAN trace:
==================================================================
BUG: KASAN: use-after-free in vm_insert_page+0x7c/0x1f0
Read of size 8 at addr ffff16204ad00600 by task server/558
CPU: 3 PID: 558 Comm: server Not tainted 5.10.150-00001-gdc8dcf942daa #1
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x2a0
show_stack+0x18/0x2c
dump_stack+0xf8/0x164
print_address_description.constprop.0+0x9c/0x538
kasan_report+0x120/0x200
__asan_load8+0xa0/0xc4
vm_insert_page+0x7c/0x1f0
binder_update_page_range+0x278/0x50c
binder_alloc_new_buf+0x3f0/0xba0
binder_transaction+0x64c/0x3040
binder_thread_write+0x924/0x2020
binder_ioctl+0x1610/0x2e5c
__arm64_sys_ioctl+0xd4/0x120
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Allocated by task 559:
kasan_save_stack+0x38/0x6c
__kasan_kmalloc.constprop.0+0xe4/0xf0
kasan_slab_alloc+0x18/0x2c
kmem_cache_alloc+0x1b0/0x2d0
vm_area_alloc+0x28/0x94
mmap_region+0x378/0x920
do_mmap+0x3f0/0x600
vm_mmap_pgoff+0x150/0x17c
ksys_mmap_pgoff+0x284/0x2dc
__arm64_sys_mmap+0x84/0xa4
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Freed by task 560:
kasan_save_stack+0x38/0x6c
kasan_set_track+0x28/0x40
kasan_set_free_info+0x24/0x4c
__kasan_slab_free+0x100/0x164
kasan_slab_free+0x14/0x20
kmem_cache_free+0xc4/0x34c
vm_area_free+0x1c/0x2c
remove_vma+0x7c/0x94
__do_munmap+0x358/0x710
__vm_munmap+0xbc/0x130
__arm64_sys_munmap+0x4c/0x64
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
[...]
==================================================================
To prevent the race above, revert back to taking the mmap write lock
inside binder_update_page_range(). One might expect an increase of mmap
lock contention. However, binder already serializes these calls via top
level alloc->mutex. Also, there was no performance impact shown when
running the binder benchmark tests.
Fixes: c0fd2101781e ("Revert "android: binder: stop saving a pointer to the VMA"")
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Reported-by: Jann Horn <jannh(a)google.com>
Closes: https://lore.kernel.org/all/20230518144052.xkj6vmddccq4v66b@revolver
Cc: <stable(a)vger.kernel.org>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Acked-by: Todd Kjos <tkjos(a)google.com>
Link: https://lore.kernel.org/r/20230519195950.1775656-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index e7c9d466f8e8..662a2a2e2e84 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -212,7 +212,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
mm = alloc->mm;
if (mm) {
- mmap_read_lock(mm);
+ mmap_write_lock(mm);
vma = alloc->vma;
}
@@ -270,7 +270,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
trace_binder_alloc_page_end(alloc, index);
}
if (mm) {
- mmap_read_unlock(mm);
+ mmap_write_unlock(mm);
mmput(mm);
}
return 0;
@@ -303,7 +303,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
}
err_no_vma:
if (mm) {
- mmap_read_unlock(mm);
+ mmap_write_unlock(mm);
mmput(mm);
}
return vma ? -ENOMEM : -ESRCH;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x bdc1c5fac982845a58d28690cdb56db8c88a530d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052804-exploit-passion-ce51@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
bdc1c5fac982 ("binder: fix UAF caused by faulty buffer cleanup")
9864bb480133 ("Binder: add TF_UPDATE_TXN to replace outdated txn")
32e9f56a96d8 ("binder: don't detect sender/target during buffer cleanup")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bdc1c5fac982845a58d28690cdb56db8c88a530d Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Fri, 5 May 2023 20:30:20 +0000
Subject: [PATCH] binder: fix UAF caused by faulty buffer cleanup
In binder_transaction_buffer_release() the 'failed_at' offset indicates
the number of objects to clean up. However, this function was changed by
commit 44d8047f1d87 ("binder: use standard functions to allocate fds"),
to release all the objects in the buffer when 'failed_at' is zero.
This introduced an issue when a transaction buffer is released without
any objects having been processed so far. In this case, 'failed_at' is
indeed zero yet it is misinterpreted as releasing the entire buffer.
This leads to use-after-free errors where nodes are incorrectly freed
and subsequently accessed. Such is the case in the following KASAN
report:
==================================================================
BUG: KASAN: slab-use-after-free in binder_thread_read+0xc40/0x1f30
Read of size 8 at addr ffff4faf037cfc58 by task poc/474
CPU: 6 PID: 474 Comm: poc Not tainted 6.3.0-12570-g7df047b3f0aa #5
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x94/0xec
show_stack+0x18/0x24
dump_stack_lvl+0x48/0x60
print_report+0xf8/0x5b8
kasan_report+0xb8/0xfc
__asan_load8+0x9c/0xb8
binder_thread_read+0xc40/0x1f30
binder_ioctl+0xd9c/0x1768
__arm64_sys_ioctl+0xd4/0x118
invoke_syscall+0x60/0x188
[...]
Allocated by task 474:
kasan_save_stack+0x3c/0x64
kasan_set_track+0x2c/0x40
kasan_save_alloc_info+0x24/0x34
__kasan_kmalloc+0xb8/0xbc
kmalloc_trace+0x48/0x5c
binder_new_node+0x3c/0x3a4
binder_transaction+0x2b58/0x36f0
binder_thread_write+0x8e0/0x1b78
binder_ioctl+0x14a0/0x1768
__arm64_sys_ioctl+0xd4/0x118
invoke_syscall+0x60/0x188
[...]
Freed by task 475:
kasan_save_stack+0x3c/0x64
kasan_set_track+0x2c/0x40
kasan_save_free_info+0x38/0x5c
__kasan_slab_free+0xe8/0x154
__kmem_cache_free+0x128/0x2bc
kfree+0x58/0x70
binder_dec_node_tmpref+0x178/0x1fc
binder_transaction_buffer_release+0x430/0x628
binder_transaction+0x1954/0x36f0
binder_thread_write+0x8e0/0x1b78
binder_ioctl+0x14a0/0x1768
__arm64_sys_ioctl+0xd4/0x118
invoke_syscall+0x60/0x188
[...]
==================================================================
In order to avoid these issues, let's always calculate the intended
'failed_at' offset beforehand. This is renamed and wrapped in a helper
function to make it clear and convenient.
Fixes: 32e9f56a96d8 ("binder: don't detect sender/target during buffer cleanup")
Reported-by: Zi Fan Tan <zifantan(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Acked-by: Todd Kjos <tkjos(a)google.com>
Link: https://lore.kernel.org/r/20230505203020.4101154-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index fb56bfc45096..8fb7672021ee 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -1934,24 +1934,23 @@ static void binder_deferred_fd_close(int fd)
static void binder_transaction_buffer_release(struct binder_proc *proc,
struct binder_thread *thread,
struct binder_buffer *buffer,
- binder_size_t failed_at,
+ binder_size_t off_end_offset,
bool is_failure)
{
int debug_id = buffer->debug_id;
- binder_size_t off_start_offset, buffer_offset, off_end_offset;
+ binder_size_t off_start_offset, buffer_offset;
binder_debug(BINDER_DEBUG_TRANSACTION,
"%d buffer release %d, size %zd-%zd, failed at %llx\n",
proc->pid, buffer->debug_id,
buffer->data_size, buffer->offsets_size,
- (unsigned long long)failed_at);
+ (unsigned long long)off_end_offset);
if (buffer->target_node)
binder_dec_node(buffer->target_node, 1, 0);
off_start_offset = ALIGN(buffer->data_size, sizeof(void *));
- off_end_offset = is_failure && failed_at ? failed_at :
- off_start_offset + buffer->offsets_size;
+
for (buffer_offset = off_start_offset; buffer_offset < off_end_offset;
buffer_offset += sizeof(binder_size_t)) {
struct binder_object_header *hdr;
@@ -2111,6 +2110,21 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
}
}
+/* Clean up all the objects in the buffer */
+static inline void binder_release_entire_buffer(struct binder_proc *proc,
+ struct binder_thread *thread,
+ struct binder_buffer *buffer,
+ bool is_failure)
+{
+ binder_size_t off_end_offset;
+
+ off_end_offset = ALIGN(buffer->data_size, sizeof(void *));
+ off_end_offset += buffer->offsets_size;
+
+ binder_transaction_buffer_release(proc, thread, buffer,
+ off_end_offset, is_failure);
+}
+
static int binder_translate_binder(struct flat_binder_object *fp,
struct binder_transaction *t,
struct binder_thread *thread)
@@ -2806,7 +2820,7 @@ static int binder_proc_transaction(struct binder_transaction *t,
t_outdated->buffer = NULL;
buffer->transaction = NULL;
trace_binder_transaction_update_buffer_release(buffer);
- binder_transaction_buffer_release(proc, NULL, buffer, 0, 0);
+ binder_release_entire_buffer(proc, NULL, buffer, false);
binder_alloc_free_buf(&proc->alloc, buffer);
kfree(t_outdated);
binder_stats_deleted(BINDER_STAT_TRANSACTION);
@@ -3775,7 +3789,7 @@ binder_free_buf(struct binder_proc *proc,
binder_node_inner_unlock(buf_node);
}
trace_binder_transaction_buffer_release(buffer);
- binder_transaction_buffer_release(proc, thread, buffer, 0, is_failure);
+ binder_release_entire_buffer(proc, thread, buffer, is_failure);
binder_alloc_free_buf(&proc->alloc, buffer);
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 0fa53349c3acba0239369ba4cd133740a408d246
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052832-ideally-gleeful-6ac6@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
0fa53349c3ac ("binder: add lockless binder_alloc_(set|get)_vma()")
c0fd2101781e ("Revert "android: binder: stop saving a pointer to the VMA"")
b15655b12ddc ("Revert "binder_alloc: add missing mmap_lock calls when using the VMA"")
7b0dbd940765 ("binder: fix binder_alloc kernel-doc warnings")
d6d04d71daae ("binder: remove binder_alloc_set_vma()")
e66b77e50522 ("binder: rename alloc->vma_vm_mm to alloc->mm")
50e177c5bfd9 ("Merge 6.0-rc4 into char-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0fa53349c3acba0239369ba4cd133740a408d246 Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Tue, 2 May 2023 20:12:19 +0000
Subject: [PATCH] binder: add lockless binder_alloc_(set|get)_vma()
Bring back the original lockless design in binder_alloc to determine
whether the buffer setup has been completed by the ->mmap() handler.
However, this time use smp_load_acquire() and smp_store_release() to
wrap all the ordering in a single macro call.
Also, add comments to make it evident that binder uses alloc->vma to
determine when the binder_alloc has been fully initialized. In these
scenarios acquiring the mmap_lock is not required.
Fixes: a43cfc87caaf ("android: binder: stop saving a pointer to the VMA")
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Link: https://lore.kernel.org/r/20230502201220.1756319-3-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index eb082b33115b..e7c9d466f8e8 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -309,17 +309,18 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
return vma ? -ENOMEM : -ESRCH;
}
+static inline void binder_alloc_set_vma(struct binder_alloc *alloc,
+ struct vm_area_struct *vma)
+{
+ /* pairs with smp_load_acquire in binder_alloc_get_vma() */
+ smp_store_release(&alloc->vma, vma);
+}
+
static inline struct vm_area_struct *binder_alloc_get_vma(
struct binder_alloc *alloc)
{
- struct vm_area_struct *vma = NULL;
-
- if (alloc->vma) {
- /* Look at description in binder_alloc_set_vma */
- smp_rmb();
- vma = alloc->vma;
- }
- return vma;
+ /* pairs with smp_store_release in binder_alloc_set_vma() */
+ return smp_load_acquire(&alloc->vma);
}
static bool debug_low_async_space_locked(struct binder_alloc *alloc, int pid)
@@ -382,6 +383,7 @@ static struct binder_buffer *binder_alloc_new_buf_locked(
size_t size, data_offsets_size;
int ret;
+ /* Check binder_alloc is fully initialized */
if (!binder_alloc_get_vma(alloc)) {
binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
"%d: binder_alloc_buf, no vma\n",
@@ -777,7 +779,9 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
buffer->free = 1;
binder_insert_free_buffer(alloc, buffer);
alloc->free_async_space = alloc->buffer_size / 2;
- alloc->vma = vma;
+
+ /* Signal binder_alloc is fully initialized */
+ binder_alloc_set_vma(alloc, vma);
return 0;
@@ -959,7 +963,7 @@ int binder_alloc_get_allocated_count(struct binder_alloc *alloc)
*/
void binder_alloc_vma_close(struct binder_alloc *alloc)
{
- alloc->vma = 0;
+ binder_alloc_set_vma(alloc, NULL);
}
/**
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x c0fd2101781ef761b636769b2f445351f71c3626
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052814-bannister-undaunted-57c2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
c0fd2101781e ("Revert "android: binder: stop saving a pointer to the VMA"")
7b0dbd940765 ("binder: fix binder_alloc kernel-doc warnings")
d6d04d71daae ("binder: remove binder_alloc_set_vma()")
e66b77e50522 ("binder: rename alloc->vma_vm_mm to alloc->mm")
50e177c5bfd9 ("Merge 6.0-rc4 into char-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c0fd2101781ef761b636769b2f445351f71c3626 Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Tue, 2 May 2023 20:12:18 +0000
Subject: [PATCH] Revert "android: binder: stop saving a pointer to the VMA"
This reverts commit a43cfc87caaf46710c8027a8c23b8a55f1078f19.
This patch fixed an issue reported by syzkaller in [1]. However, this
turned out to be only a band-aid in binder. The root cause, as bisected
by syzkaller, was fixed by commit 5789151e48ac ("mm/mmap: undo ->mmap()
when mas_preallocate() fails"). We no longer need the patch for binder.
Reverting such patch allows us to have a lockless access to alloc->vma
in specific cases where the mmap_lock is not required. This approach
avoids the contention that caused a performance regression.
[1] https://lore.kernel.org/all/0000000000004a0dbe05e1d749e0@google.com
[cmllamas: resolved conflicts with rework of alloc->mm and removal of
binder_alloc_set_vma() also fixed comment section]
Fixes: a43cfc87caaf ("android: binder: stop saving a pointer to the VMA")
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Link: https://lore.kernel.org/r/20230502201220.1756319-2-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 92c814ec44fe..eb082b33115b 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -213,7 +213,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
if (mm) {
mmap_read_lock(mm);
- vma = vma_lookup(mm, alloc->vma_addr);
+ vma = alloc->vma;
}
if (!vma && need_mm) {
@@ -314,9 +314,11 @@ static inline struct vm_area_struct *binder_alloc_get_vma(
{
struct vm_area_struct *vma = NULL;
- if (alloc->vma_addr)
- vma = vma_lookup(alloc->mm, alloc->vma_addr);
-
+ if (alloc->vma) {
+ /* Look at description in binder_alloc_set_vma */
+ smp_rmb();
+ vma = alloc->vma;
+ }
return vma;
}
@@ -775,7 +777,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
buffer->free = 1;
binder_insert_free_buffer(alloc, buffer);
alloc->free_async_space = alloc->buffer_size / 2;
- alloc->vma_addr = vma->vm_start;
+ alloc->vma = vma;
return 0;
@@ -805,8 +807,7 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
buffers = 0;
mutex_lock(&alloc->mutex);
- BUG_ON(alloc->vma_addr &&
- vma_lookup(alloc->mm, alloc->vma_addr));
+ BUG_ON(alloc->vma);
while ((n = rb_first(&alloc->allocated_buffers))) {
buffer = rb_entry(n, struct binder_buffer, rb_node);
@@ -958,7 +959,7 @@ int binder_alloc_get_allocated_count(struct binder_alloc *alloc)
*/
void binder_alloc_vma_close(struct binder_alloc *alloc)
{
- alloc->vma_addr = 0;
+ alloc->vma = 0;
}
/**
diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h
index 0f811ac4bcff..138d1d5af9ce 100644
--- a/drivers/android/binder_alloc.h
+++ b/drivers/android/binder_alloc.h
@@ -75,7 +75,7 @@ struct binder_lru_page {
/**
* struct binder_alloc - per-binder proc state for binder allocator
* @mutex: protects binder_alloc fields
- * @vma_addr: vm_area_struct->vm_start passed to mmap_handler
+ * @vma: vm_area_struct passed to mmap_handler
* (invariant after mmap)
* @mm: copy of task->mm (invariant after open)
* @buffer: base of per-proc address space mapped via mmap
@@ -99,7 +99,7 @@ struct binder_lru_page {
*/
struct binder_alloc {
struct mutex mutex;
- unsigned long vma_addr;
+ struct vm_area_struct *vma;
struct mm_struct *mm;
void __user *buffer;
struct list_head buffers;
diff --git a/drivers/android/binder_alloc_selftest.c b/drivers/android/binder_alloc_selftest.c
index 43a881073a42..c2b323bc3b3a 100644
--- a/drivers/android/binder_alloc_selftest.c
+++ b/drivers/android/binder_alloc_selftest.c
@@ -287,7 +287,7 @@ void binder_selftest_alloc(struct binder_alloc *alloc)
if (!binder_selftest_run)
return;
mutex_lock(&binder_selftest_lock);
- if (!binder_selftest_run || !alloc->vma_addr)
+ if (!binder_selftest_run || !alloc->vma)
goto done;
pr_info("STARTED\n");
binder_selftest_alloc_offset(alloc, end_offset, 0);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x b15655b12ddca7ade09807f790bafb6fab61b50a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052842-ability-badness-50c6@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
b15655b12ddc ("Revert "binder_alloc: add missing mmap_lock calls when using the VMA"")
e66b77e50522 ("binder: rename alloc->vma_vm_mm to alloc->mm")
50e177c5bfd9 ("Merge 6.0-rc4 into char-misc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b15655b12ddca7ade09807f790bafb6fab61b50a Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Tue, 2 May 2023 20:12:17 +0000
Subject: [PATCH] Revert "binder_alloc: add missing mmap_lock calls when using
the VMA"
This reverts commit 44e602b4e52f70f04620bbbf4fe46ecb40170bde.
This caused a performance regression particularly when pages are getting
reclaimed. We don't need to acquire the mmap_lock to determine when the
binder buffer has been fully initialized. A subsequent patch will bring
back the lockless approach for this.
[cmllamas: resolved trivial conflicts with renaming of alloc->mm]
Fixes: 44e602b4e52f ("binder_alloc: add missing mmap_lock calls when using the VMA")
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Link: https://lore.kernel.org/r/20230502201220.1756319-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 55a3c3c2409f..92c814ec44fe 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -380,15 +380,12 @@ static struct binder_buffer *binder_alloc_new_buf_locked(
size_t size, data_offsets_size;
int ret;
- mmap_read_lock(alloc->mm);
if (!binder_alloc_get_vma(alloc)) {
- mmap_read_unlock(alloc->mm);
binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
"%d: binder_alloc_buf, no vma\n",
alloc->pid);
return ERR_PTR(-ESRCH);
}
- mmap_read_unlock(alloc->mm);
data_offsets_size = ALIGN(data_size, sizeof(void *)) +
ALIGN(offsets_size, sizeof(void *));
@@ -916,25 +913,17 @@ void binder_alloc_print_pages(struct seq_file *m,
* Make sure the binder_alloc is fully initialized, otherwise we might
* read inconsistent state.
*/
-
- mmap_read_lock(alloc->mm);
- if (binder_alloc_get_vma(alloc) == NULL) {
- mmap_read_unlock(alloc->mm);
- goto uninitialized;
- }
-
- mmap_read_unlock(alloc->mm);
- for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
- page = &alloc->pages[i];
- if (!page->page_ptr)
- free++;
- else if (list_empty(&page->lru))
- active++;
- else
- lru++;
+ if (binder_alloc_get_vma(alloc) != NULL) {
+ for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
+ page = &alloc->pages[i];
+ if (!page->page_ptr)
+ free++;
+ else if (list_empty(&page->lru))
+ active++;
+ else
+ lru++;
+ }
}
-
-uninitialized:
mutex_unlock(&alloc->mutex);
seq_printf(m, " pages: %d:%d:%d\n", active, lru, free);
seq_printf(m, " pages high watermark: %zu\n", alloc->pages_high);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0f554e37dad416f445cd3ec5935f5aec1b0e7ba5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052807-thorn-tavern-5f16@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
0f554e37dad4 ("arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at super speed")
a8bd7f155126 ("arm64: dts: imx8qxp: add cadence usb3 support")
8065fc937f0f ("arm64: dts: imx8dxl: add usb1 and usb2 support")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0f554e37dad416f445cd3ec5935f5aec1b0e7ba5 Mon Sep 17 00:00:00 2001
From: Frank Li <Frank.Li(a)nxp.com>
Date: Mon, 15 May 2023 12:20:53 -0400
Subject: [PATCH] arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at
super speed
Resolve USB 3.0 gadget failure for QM and QXPB0 in super speed mode with
single IN and OUT endpoints, like mass storage devices, due to incorrect
ACTUAL_MEM_SIZE in ep_cap2 (32k instead of actual 18k). Implement dt
property cdns,on-chip-buff-size to override ep_cap2 and set it to 18k for
imx8QM and imx8QXP chips. No adverse effects for 8QXP C0.
Cc: stable(a)vger.kernel.org
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number")
Signed-off-by: Frank Li <Frank.Li(a)nxp.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi b/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
index 2209c1ac6e9b..e62a43591361 100644
--- a/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
@@ -171,6 +171,7 @@ usbotg3_cdns3: usb@5b120000 {
interrupt-names = "host", "peripheral", "otg", "wakeup";
phys = <&usb3_phy>;
phy-names = "cdns3,usb3-phy";
+ cdns,on-chip-buff-size = /bits/ 16 <18>;
status = "disabled";
};
};
The patch below does not apply to the 6.3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.3.y
git checkout FETCH_HEAD
git cherry-pick -x 0f554e37dad416f445cd3ec5935f5aec1b0e7ba5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052806-unlawful-nearest-9622@gregkh' --subject-prefix 'PATCH 6.3.y' HEAD^..
Possible dependencies:
0f554e37dad4 ("arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at super speed")
a8bd7f155126 ("arm64: dts: imx8qxp: add cadence usb3 support")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0f554e37dad416f445cd3ec5935f5aec1b0e7ba5 Mon Sep 17 00:00:00 2001
From: Frank Li <Frank.Li(a)nxp.com>
Date: Mon, 15 May 2023 12:20:53 -0400
Subject: [PATCH] arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at
super speed
Resolve USB 3.0 gadget failure for QM and QXPB0 in super speed mode with
single IN and OUT endpoints, like mass storage devices, due to incorrect
ACTUAL_MEM_SIZE in ep_cap2 (32k instead of actual 18k). Implement dt
property cdns,on-chip-buff-size to override ep_cap2 and set it to 18k for
imx8QM and imx8QXP chips. No adverse effects for 8QXP C0.
Cc: stable(a)vger.kernel.org
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number")
Signed-off-by: Frank Li <Frank.Li(a)nxp.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi b/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
index 2209c1ac6e9b..e62a43591361 100644
--- a/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
@@ -171,6 +171,7 @@ usbotg3_cdns3: usb@5b120000 {
interrupt-names = "host", "peripheral", "otg", "wakeup";
phys = <&usb3_phy>;
phy-names = "cdns3,usb3-phy";
+ cdns,on-chip-buff-size = /bits/ 16 <18>;
status = "disabled";
};
};
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 3bf8c6307bad5c0cc09cde982e146d847859b651
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052832-arise-impeding-18c7@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
3bf8c6307bad ("cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()")
2dd6d0ebf740 ("cpufreq: amd-pstate: Add guided autonomous mode")
3e6e07805764 ("Documentation: cpufreq: amd-pstate: Move amd_pstate param to alphabetical order")
5014603e409b ("Documentation: introduce amd pstate active mode kernel command line options")
ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
36c5014e5460 ("cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()")
4f3085f87b51 ("cpufreq: amd-pstate: fix kernel hang issue while amd-pstate unregistering")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3bf8c6307bad5c0cc09cde982e146d847859b651 Mon Sep 17 00:00:00 2001
From: Wyes Karny <wyes.karny(a)amd.com>
Date: Thu, 18 May 2023 05:58:19 +0000
Subject: [PATCH] cpufreq: amd-pstate: Update policy->cur in
amd_pstate_adjust_perf()
Driver should update policy->cur after updating the frequency.
Currently amd_pstate doesn't update policy->cur when `adjust_perf`
is used. Which causes /proc/cpuinfo to show wrong cpu frequency.
Fix this by updating policy->cur with correct frequency value in
adjust_perf function callback.
- Before the fix: (setting min freq to 1.5 MHz)
[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
1 cpu MHz : 1777.016
1 cpu MHz : 1797.160
1 cpu MHz : 1797.270
189 cpu MHz : 400.000
- After the fix: (setting min freq to 1.5 MHz)
[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
1 cpu MHz : 1753.353
1 cpu MHz : 1756.838
1 cpu MHz : 1776.466
1 cpu MHz : 1776.873
1 cpu MHz : 1777.308
1 cpu MHz : 1779.900
183 cpu MHz : 1805.231
1 cpu MHz : 1956.815
1 cpu MHz : 2246.203
1 cpu MHz : 2259.984
Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Wyes Karny <wyes.karny(a)amd.com>
[ rjw: Subject edits ]
Cc: 5.17+ <stable(a)vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ac54779f5a49..ddd346a239e0 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -501,12 +501,14 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
unsigned long capacity)
{
unsigned long max_perf, min_perf, des_perf,
- cap_perf, lowest_nonlinear_perf;
+ cap_perf, lowest_nonlinear_perf, max_freq;
struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
struct amd_cpudata *cpudata = policy->driver_data;
+ unsigned int target_freq;
cap_perf = READ_ONCE(cpudata->highest_perf);
lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
+ max_freq = READ_ONCE(cpudata->max_freq);
des_perf = cap_perf;
if (target_perf < capacity)
@@ -523,6 +525,10 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
if (max_perf < min_perf)
max_perf = min_perf;
+ des_perf = clamp_t(unsigned long, des_perf, min_perf, max_perf);
+ target_freq = div_u64(des_perf * max_freq, max_perf);
+ policy->cur = target_freq;
+
amd_pstate_update(cpudata, min_perf, des_perf, max_perf, true,
policy->governor->flags);
cpufreq_cpu_put(policy);
The patch below does not apply to the 6.3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.3.y
git checkout FETCH_HEAD
git cherry-pick -x 3bf8c6307bad5c0cc09cde982e146d847859b651
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052831-chest-shabby-7961@gregkh' --subject-prefix 'PATCH 6.3.y' HEAD^..
Possible dependencies:
3bf8c6307bad ("cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()")
2dd6d0ebf740 ("cpufreq: amd-pstate: Add guided autonomous mode")
3e6e07805764 ("Documentation: cpufreq: amd-pstate: Move amd_pstate param to alphabetical order")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3bf8c6307bad5c0cc09cde982e146d847859b651 Mon Sep 17 00:00:00 2001
From: Wyes Karny <wyes.karny(a)amd.com>
Date: Thu, 18 May 2023 05:58:19 +0000
Subject: [PATCH] cpufreq: amd-pstate: Update policy->cur in
amd_pstate_adjust_perf()
Driver should update policy->cur after updating the frequency.
Currently amd_pstate doesn't update policy->cur when `adjust_perf`
is used. Which causes /proc/cpuinfo to show wrong cpu frequency.
Fix this by updating policy->cur with correct frequency value in
adjust_perf function callback.
- Before the fix: (setting min freq to 1.5 MHz)
[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
1 cpu MHz : 1777.016
1 cpu MHz : 1797.160
1 cpu MHz : 1797.270
189 cpu MHz : 400.000
- After the fix: (setting min freq to 1.5 MHz)
[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
1 cpu MHz : 1753.353
1 cpu MHz : 1756.838
1 cpu MHz : 1776.466
1 cpu MHz : 1776.873
1 cpu MHz : 1777.308
1 cpu MHz : 1779.900
183 cpu MHz : 1805.231
1 cpu MHz : 1956.815
1 cpu MHz : 2246.203
1 cpu MHz : 2259.984
Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Wyes Karny <wyes.karny(a)amd.com>
[ rjw: Subject edits ]
Cc: 5.17+ <stable(a)vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ac54779f5a49..ddd346a239e0 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -501,12 +501,14 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
unsigned long capacity)
{
unsigned long max_perf, min_perf, des_perf,
- cap_perf, lowest_nonlinear_perf;
+ cap_perf, lowest_nonlinear_perf, max_freq;
struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
struct amd_cpudata *cpudata = policy->driver_data;
+ unsigned int target_freq;
cap_perf = READ_ONCE(cpudata->highest_perf);
lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
+ max_freq = READ_ONCE(cpudata->max_freq);
des_perf = cap_perf;
if (target_perf < capacity)
@@ -523,6 +525,10 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
if (max_perf < min_perf)
max_perf = min_perf;
+ des_perf = clamp_t(unsigned long, des_perf, min_perf, max_perf);
+ target_freq = div_u64(des_perf * max_freq, max_perf);
+ policy->cur = target_freq;
+
amd_pstate_update(cpudata, min_perf, des_perf, max_perf, true,
policy->governor->flags);
cpufreq_cpu_put(policy);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x eb0764b822b9b26880b28ccb9100b2983e01bc17
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052837-carve-antics-2e83@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports")
7bba261e0aa6 ("cxl/port: Scan single-target ports for decoders")
a5fcd228ca1d ("Merge branch 'for-6.3/cxl-rr-emu' into cxl/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eb0764b822b9b26880b28ccb9100b2983e01bc17 Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Wed, 17 May 2023 20:19:43 -0700
Subject: [PATCH] cxl/port: Enable the HDM decoder capability for switch ports
Derick noticed, when testing hot plug, that hot-add behaves nominally
after a removal. However, if the hot-add is done without a prior
removal, CXL.mem accesses fail. It turns out that the original
implementation of the port driver and region programming wrongly assumed
that platform-firmware always enables the host-bridge HDM decoder
capability. Add support turning on switch-level HDM decoders in the case
where platform-firmware has not.
The implementation is careful to only arrange for the enable to be
undone if the current instance of the driver was the one that did the
enable. This is to interoperate with platform-firmware that may expect
CXL.mem to remain active after the driver is shutdown. This comes at the
cost of potentially not shutting down the enable on kexec flows, but it
is mitigated by the fact that the related HDM decoders still need to be
enabled on an individual basis.
Cc: <stable(a)vger.kernel.org>
Reported-by: Derick Marks <derick.w.marks(a)intel.com>
Fixes: 54cdbf845cf7 ("cxl/port: Add a driver for 'struct cxl_port' objects")
Reviewed-by: Ira Weiny <ira.weiny(a)intel.com>
Link: https://lore.kernel.org/r/168437998331.403037.15719879757678389217.stgit@dw…
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index f332fe7af92b..c35c002b65dd 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -241,17 +241,36 @@ static void disable_hdm(void *_cxlhdm)
hdm + CXL_HDM_DECODER_CTRL_OFFSET);
}
-static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
+int devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm)
{
- void __iomem *hdm = cxlhdm->regs.hdm_decoder;
+ void __iomem *hdm;
u32 global_ctrl;
+ /*
+ * If the hdm capability was not mapped there is nothing to enable and
+ * the caller is responsible for what happens next. For example,
+ * emulate a passthrough decoder.
+ */
+ if (IS_ERR(cxlhdm))
+ return 0;
+
+ hdm = cxlhdm->regs.hdm_decoder;
global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+
+ /*
+ * If the HDM decoder capability was enabled on entry, skip
+ * registering disable_hdm() since this decode capability may be
+ * owned by platform firmware.
+ */
+ if (global_ctrl & CXL_HDM_DECODER_ENABLE)
+ return 0;
+
writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
hdm + CXL_HDM_DECODER_CTRL_OFFSET);
- return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
+ return devm_add_action_or_reset(&port->dev, disable_hdm, cxlhdm);
}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_enable_hdm, CXL);
int cxl_dvsec_rr_decode(struct device *dev, int d,
struct cxl_endpoint_dvsec_info *info)
@@ -425,7 +444,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
if (info->mem_enabled)
return 0;
- rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
+ rc = devm_cxl_enable_hdm(port, cxlhdm);
if (rc)
return rc;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 044a92d9813e..f93a28538962 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -710,6 +710,7 @@ struct cxl_endpoint_dvsec_info {
struct cxl_hdm;
struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
struct cxl_endpoint_dvsec_info *info);
+int devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm);
int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
struct cxl_endpoint_dvsec_info *info);
int devm_cxl_add_passthrough_decoder(struct cxl_port *port);
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index eb57324c4ad4..17a95f469c26 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -60,13 +60,17 @@ static int discover_region(struct device *dev, void *root)
static int cxl_switch_port_probe(struct cxl_port *port)
{
struct cxl_hdm *cxlhdm;
- int rc;
+ int rc, nr_dports;
- rc = devm_cxl_port_enumerate_dports(port);
- if (rc < 0)
- return rc;
+ nr_dports = devm_cxl_port_enumerate_dports(port);
+ if (nr_dports < 0)
+ return nr_dports;
cxlhdm = devm_cxl_setup_hdm(port, NULL);
+ rc = devm_cxl_enable_hdm(port, cxlhdm);
+ if (rc)
+ return rc;
+
if (!IS_ERR(cxlhdm))
return devm_cxl_enumerate_decoders(cxlhdm, NULL);
@@ -75,7 +79,7 @@ static int cxl_switch_port_probe(struct cxl_port *port)
return PTR_ERR(cxlhdm);
}
- if (rc == 1) {
+ if (nr_dports == 1) {
dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
return devm_cxl_add_passthrough_decoder(port);
}
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index fba7bec96acd..6f9347ade82c 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -6,6 +6,7 @@ ldflags-y += --wrap=acpi_pci_find_root
ldflags-y += --wrap=nvdimm_bus_register
ldflags-y += --wrap=devm_cxl_port_enumerate_dports
ldflags-y += --wrap=devm_cxl_setup_hdm
+ldflags-y += --wrap=devm_cxl_enable_hdm
ldflags-y += --wrap=devm_cxl_add_passthrough_decoder
ldflags-y += --wrap=devm_cxl_enumerate_decoders
ldflags-y += --wrap=cxl_await_media_ready
diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
index de3933a776fd..284416527644 100644
--- a/tools/testing/cxl/test/mock.c
+++ b/tools/testing/cxl/test/mock.c
@@ -149,6 +149,21 @@ struct cxl_hdm *__wrap_devm_cxl_setup_hdm(struct cxl_port *port,
}
EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_setup_hdm, CXL);
+int __wrap_devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm)
+{
+ int index, rc;
+ struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+ if (ops && ops->is_mock_port(port->uport))
+ rc = 0;
+ else
+ rc = devm_cxl_enable_hdm(port, cxlhdm);
+ put_cxl_mock_ops(index);
+
+ return rc;
+}
+EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_enable_hdm, CXL);
+
int __wrap_devm_cxl_add_passthrough_decoder(struct cxl_port *port)
{
int rc, index;
[Resending as a plain text email]
Turned out that this is a mixture of an ACPICA issue and an EFISTUB issue.
Kernel v6.2 can boot by reverting the *both* of the following two commits:
- 5c62d5aab8752e5ee7bfbe75ed6060db1c787f98 "ACPICA: Events: Support
fixed PCIe wake event"
- e346bebbd36b1576a3335331fed61bb48c6d8823 "efi: libstub: Always
enable initrd command line loader and bump version"
Kernel v6.3 can boot by just reverting e346bebb, as 5c62d5a has been
already reverted in 8e41e0a575664d26bb87e012c39435c4c3914ed9.
The situation is the same for v6.4-rc3 too.
Note that in my test I let Virtualization.framework directly load
bzImage without GRUB (akin to `qemu-system-x86_64 -kernel bzImage`).
Apparently, reverting e346bebb is not necessary for loading bzImage via GRUB.
> Also, the reporter can't provide dmesg log (forget to attach serial console?).
Uploaded v6.1 dmesg in the bugzilla.
v6.2 dmesg can't be provided, as it hangs before printing something in
console=hvc0.
(IIUC, console=ttyS0 (RS-232C) is not implemented in Virtualization.framework.)
> 2023年5月25日(木) 21:46 Bagas Sanjaya <bagasdotme(a)gmail.com>:
>>
>> Hi,
>>
>> I notice a regression report on Bugzilla [1]. Quoting from it:
>>
>> > Linux kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64).
>> >
>> > It is reported that the issue is not reproducible on ARM64: https://github.com/lima-vm/lima/issues/1577#issuecomment-1561577694
>> >
>> >
>> > ## Reproduction
>> > - Checkout the kernel repo, and run `make defconfig bzImage`.
>> >
>> > - Create an initrd (see the attached `initrd-example.txt`)
>> >
>> > - Transfer the bzImage and initrd to an Intel Mac.
>> >
>> > - On Mac, download `RunningLinuxInAVirtualMachine.zip` from https://developer.apple.com/documentation/virtualization/running_linux_in_a… , and build the `LinuxVirtualMachine` binary with Xcode.
>> > Building this binary with Xcode requires logging in to Apple.
>> > If you do not like logging in, a third party equivalent such as https://github.com/Code-Hex/vz/blob/v3.0.6/example/linux/main.go can be used.
>> >
>> > - Run `LinuxVirtualMachine /tmp/bzImage /tmp/initrd.img`.
>> > v6.1 successfully boots into the busybox shell.
>> > v6.2 just hangs before printing something in the console.
>> >
>> >
>> > ## Tested versions
>> > ```
>> > v6.1: OK
>> > ...
>> > v6.1.0-rc2-00002-g60f2096b59bc (included in v6.2-rc1): OK
>> > v6.1.0-rc2-00003-g5c62d5aab875 (included in v6.2-rc1): NG <-- This commit caused a regression
>> > ...
>> > v6.2-rc1: NG
>> > ...
>> > v6.2: NG
>> > ...
>> > v6.3.0-rc7-00181-g8e41e0a57566 (included in v6.3): NG <-- Reverts 5c62d5aab875 but still NG
>> > ...
>> > v6.3: NG
>> > v6.4-rc3: NG
>> > ```
>> >
>> > Tested on MacBookPro 2020 (Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz) running macOS 13.4.
>> >
>> >
>> > The issue seems a regression in [5c62d5aab8752e5ee7bfbe75ed6060db1c787f98](https://git.kernel.org/pub/scm/li… "ACPICA: Events: Support fixed PCIe wake event".
>> >
>> > This commit was introduced in v6.2-rc1, and apparently reverted in v6.3 ([8e41e0a575664d26bb87e012c39435c4c3914ed9](https://git.kernel.org/pub/scm/li…).
>> > However, v6.3 and the latest v6.4-rc3 still don't boot.
>>
>> See bugzilla for the full thread.
>>
>> Interestingly, this regression still occurs despite the culprit is
>> reverted in 8e41e0a575664d ("Revert "ACPICA: Events: Support fixed
>> PCIe wake event""), so this (obviously) isn't wake-on-lan regression,
>> but rather early boot one.
>>
>> Also, the reporter can't provide dmesg log (forget to attach serial
>> console?).
>>
>> Anyway, I'm adding it to regzbot:
>>
>> #regzbot introduced: 5c62d5aab8752e https://bugzilla.kernel.org/show_bug.cgi?id=217485
>> #regzbot title: Linux v6.2+ (x86_64) no longer boots on Apple's Virtualization framework (ACPICA issue)
>>
>> Thanks.
>>
>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217485
>>
>> --
>> An old man doll... just what I always wanted! - Clara
Hi Greg,
A regression in 6.3.0 has been identified in XFS that causes
filesystem corruption. It has been seen in the wild by a number of
users, and bisected down to an issued we'd already fixed in 6.4-rc1
with commit:
9419092fb263 ("xfs: fix livelock in delayed allocation at ENOSPC")
This was reported with much less harmful symptoms (alloctor
livelock) and it wasn't realised that it could have other, more
impactful symptoms. A reproducer for the corruption was found
yesterday and, soon after than, the cause of the corruption reports
was identified.
The commit applies cleanly to a 6.3.0 kernel here, so it should also
apply cleanly to a current 6.3.x kernel. I've included the entire
commit below in case that's easier for you.
Can you please pull this commit into the next 6.3.x release as a
matter of priority?
Cheers,
Dave.
--
Dave Chinner
david(a)fromorbit.com
xfs: fix livelock in delayed allocation at ENOSPC
From: Dave Chinner <dchinner(a)redhat.com>
On a filesystem with a non-zero stripe unit and a large sequential
write, delayed allocation will set a minimum allocation length of
the stripe unit. If allocation fails because there are no extents
long enough for an aligned minlen allocation, it is supposed to
fall back to unaligned allocation which allows single block extents
to be allocated.
When the allocator code was rewritting in the 6.3 cycle, this
fallback was broken - the old code used args->fsbno as the both the
allocation target and the allocation result, the new code passes the
target as a separate parameter. The conversion didn't handle the
aligned->unaligned fallback path correctly - it reset args->fsbno to
the target fsbno on failure which broke allocation failure detection
in the high level code and so it never fell back to unaligned
allocations.
This resulted in a loop in writeback trying to allocate an aligned
block, getting a false positive success, trying to insert the result
in the BMBT. This did nothing because the extent already was in the
BMBT (merge results in an unchanged extent) and so it returned the
prior extent to the conversion code as the current iomap.
Because the iomap returned didn't cover the offset we tried to map,
xfs_convert_blocks() then retries the allocation, which fails in the
same way and now we have a livelock.
Reported-and-tested-by: Brian Foster <bfoster(a)redhat.com>
Fixes: 85843327094f ("xfs: factor xfs_bmap_btalloc()")
Signed-off-by: Dave Chinner <dchinner(a)redhat.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
---
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 1a4e446194dd..b512de0540d5 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3540,7 +3540,6 @@ xfs_bmap_btalloc_at_eof(
* original non-aligned state so the caller can proceed on allocation
* failure as if this function was never called.
*/
- args->fsbno = ap->blkno;
args->alignment = 1;
return 0;
}
Greetings.
I am Ms Nadage Lassou,I have something important to discuss with you.
i will send you the details once i hear from you.
Thanks,
Ms Nadage Lassou
Western Union Money Transfer
Republic of Burkina Faso
Address: CG46+G6G, Somgandé, Ouagadougou, Burkina Faso
website: www.westernunion.com
Money Available For Pick Up,
Western Union Money Transfer in conjunction with the United Nation during
their recently concluded meeting with the International Monetary Fund (IMF)
Payment Reconciliation Board Committee agreed to pay you a total amount of
($350.000.00) Three Hundred and Fifty Thousand US Dollars Only
compensation fund. The awarded compensation amount has been signed under
giving credence ensuring your payment transaction is devoid of any form of
illegality to be paid to you for the money you lost to scammers through
internet frauds.
Your name appeared on our payment schedule list 2023 of beneficiaries that
will receive their funds in this 2nd Quarter Payment of the year 2023. The
sum of $5,000.00 daily sending or its equivalent in your local currency
will be transferred to you via Western Union every day until all the
($350.000.00) Three Hundred and Fifty Thousand US Dollars Only is
completely sent across to you.
Meanwhile, we have registered your First Payment Transfer Sum $5.000 USD
online through Western Union Money Transfer. For you to confirm your First
Payment Transfer please follow the instructions. Open this website:
www.westernunion.com, then click "Track Transfer", then enter this Tracking
Number (MTCN) 9361788041 and click "Track Transfer" to confirm you will see
the status of the transaction online.
Below are the details you need to present to the Western Union officials.
Also go with your ID Card.
MTCN: 9361788041
Sender Names: Prince Maxwell Okoronkwo
Sender City: Ouagadougou
Sender Country: Burkina Faso
Amount sent: $5.000 Dollars
Money Transfer | Global Money Transfer | Western Union Go to any Western
Union Office in your area and pick it up. Make sure you go with an
Identification of yourself like a Driving License or National ID or
International Passport.
Sincerely,
Ms Omarh Waleed.
Send Money Worldwide Registered C 2007 - 2023 Western Union Money Transfer.
All Right Reserved. This email is intended for the named recipient only and
may contain privileged and confidential information. If you have received
this email in error, please notify us immediately. Please do not disclose
the contents to anyone or copy it to outside parties. Our website: http.
westernunion.com. Message from Western Union Aspect of your payment Burkina
Faso
On some arch (for example IPQ8074) and other with
KERNEL_STACKPROTECTOR_STRONG enabled, the following compilation error is
triggered:
drivers/net/wireguard/allowedips.c: In function 'root_remove_peer_lists':
drivers/net/wireguard/allowedips.c:80:1: error: the frame size of 1040 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
80 | }
| ^
drivers/net/wireguard/allowedips.c: In function 'root_free_rcu':
drivers/net/wireguard/allowedips.c:67:1: error: the frame size of 1040 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
67 | }
| ^
cc1: all warnings being treated as errors
Since these are free function and returns void, using function that can
fail is not ideal since an error would result in data not freed.
Since the free are under RCU lock, we can allocate the required stack
array as static outside the function and memset when needed.
This effectively fix the stack frame warning without changing how the
function work.
Fixes: Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
drivers/net/wireguard/allowedips.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c
index 5bf7822c53f1..c129082f04c6 100644
--- a/drivers/net/wireguard/allowedips.c
+++ b/drivers/net/wireguard/allowedips.c
@@ -53,12 +53,16 @@ static void node_free_rcu(struct rcu_head *rcu)
kmem_cache_free(node_cache, container_of(rcu, struct allowedips_node, rcu));
}
+static struct allowedips_node *tmpstack[MAX_ALLOWEDIPS_BITS];
+
static void root_free_rcu(struct rcu_head *rcu)
{
- struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = {
- container_of(rcu, struct allowedips_node, rcu) };
+ struct allowedips_node *node, **stack = tmpstack;
unsigned int len = 1;
+ memset(stack, 0, sizeof(*stack) * MAX_ALLOWEDIPS_BITS);
+ stack[0] = container_of(rcu, struct allowedips_node, rcu);
+
while (len > 0 && (node = stack[--len])) {
push_rcu(stack, node->bit[0], &len);
push_rcu(stack, node->bit[1], &len);
@@ -68,9 +72,12 @@ static void root_free_rcu(struct rcu_head *rcu)
static void root_remove_peer_lists(struct allowedips_node *root)
{
- struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = { root };
+ struct allowedips_node *node, **stack = tmpstack;
unsigned int len = 1;
+ memset(stack, 0, sizeof(*stack) * MAX_ALLOWEDIPS_BITS);
+ stack[0] = root;
+
while (len > 0 && (node = stack[--len])) {
push_rcu(stack, node->bit[0], &len);
push_rcu(stack, node->bit[1], &len);
--
2.39.2
Hello dear
Do you have the passion for humanitarian welfare?
Can you devote your time and be totally committed and devoted
to run multi-million pounds humanitarian charity project sponsored
totally by me; with an incentive/compensation accrual to you for
your time and effort and at no cost to you.
If interested, reply me for the full details
Thanks & regards
Les Scadding.
� ��������� � ��� � ������ �������!
�������� ����� ����������
���������-�������� ��� "��������������"
���.: +7 (48444) 6-91-69 ���. 495
e-mail: info(a)ludinovocable.ru
�������������� � ������ ��������
Commit ebeb20a9cd3f ("soc: qcom: mdt_loader: Always invoke PAS
mem_setup") dropped the relocate check and made pas_mem_setup run
unconditionally. The code was later moved with commit f4e526ff7e38
("soc: qcom: mdt_loader: Extract PAS operations") to
qcom_mdt_pas_init() effectively losing track of what was actually
done.
The assumption that PAS mem_setup can be done anytime was effectively
wrong, with no good reason and this caused regression on some SoC
that use remoteproc to bringup ath11k. One example is IPQ8074 SoC that
effectively broke resulting in remoteproc silently die and ath11k not
working.
On this SoC FW relocate is not enabled and PAS mem_setup was correctly
skipped in previous kernel version resulting in correct bringup and
function of remoteproc and ath11k.
To fix the regression, reintroduce the relocate check in
qcom_mdt_pas_init() and correctly skip PAS mem_setup where relocate is
not enabled.
Fixes: ebeb20a9cd3f ("soc: qcom: mdt_loader: Always invoke PAS mem_setup")
Tested-by: Robert Marko <robimarko(a)gmail.com>
Co-developed-by: Robert Marko <robimarko(a)gmail.com>
Signed-off-by: Robert Marko <robimarko(a)gmail.com>
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
drivers/soc/qcom/mdt_loader.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c
index 33dd8c315eb7..46820bcdae98 100644
--- a/drivers/soc/qcom/mdt_loader.c
+++ b/drivers/soc/qcom/mdt_loader.c
@@ -210,6 +210,7 @@ int qcom_mdt_pas_init(struct device *dev, const struct firmware *fw,
const struct elf32_hdr *ehdr;
phys_addr_t min_addr = PHYS_ADDR_MAX;
phys_addr_t max_addr = 0;
+ bool relocate = false;
size_t metadata_len;
void *metadata;
int ret;
@@ -224,6 +225,9 @@ int qcom_mdt_pas_init(struct device *dev, const struct firmware *fw,
if (!mdt_phdr_valid(phdr))
continue;
+ if (phdr->p_flags & QCOM_MDT_RELOCATABLE)
+ relocate = true;
+
if (phdr->p_paddr < min_addr)
min_addr = phdr->p_paddr;
@@ -246,11 +250,13 @@ int qcom_mdt_pas_init(struct device *dev, const struct firmware *fw,
goto out;
}
- ret = qcom_scm_pas_mem_setup(pas_id, mem_phys, max_addr - min_addr);
- if (ret) {
- /* Unable to set up relocation */
- dev_err(dev, "error %d setting up firmware %s\n", ret, fw_name);
- goto out;
+ if (relocate) {
+ ret = qcom_scm_pas_mem_setup(pas_id, mem_phys, max_addr - min_addr);
+ if (ret) {
+ /* Unable to set up relocation */
+ dev_err(dev, "error %d setting up firmware %s\n", ret, fw_name);
+ goto out;
+ }
}
out:
--
2.39.2
If 'aggr_interval' is smaller than 'sample_interval', max_nr_accesses
in damon_nr_accesses_to_accesses_bp() becomes zero which leads to divide
error, let's validate the values of them in damon_set_attrs() to fix it,
which similar to others attrs check.
Reported-by: syzbot+841a46899768ec7bec67(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=841a46899768ec7bec67
Link: https://lore.kernel.org/damon/00000000000055fc4e05fc975bc2@google.com/
Fixes: 2f5bef5a590b ("mm/damon/core: update monitoring results for new monitoring attributes")
Cc: <stable(a)vger.kernel.org> # 6.3.x-
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
---
v2: close checkpatch warning, add RB/cc stable, per SJ
mm/damon/core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index d9ef62047bf5..91cff7f2997e 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -551,6 +551,8 @@ int damon_set_attrs(struct damon_ctx *ctx, struct damon_attrs *attrs)
return -EINVAL;
if (attrs->min_nr_regions > attrs->max_nr_regions)
return -EINVAL;
+ if (attrs->sample_interval > attrs->aggr_interval)
+ return -EINVAL;
damon_update_monitoring_results(ctx, attrs);
ctx->attrs = *attrs;
--
2.35.3
Libuv recently started using it so there is at least one consumer now.
Cc: stable(a)vger.kernel.org
Fixes: 61a2732af4b0 ("io_uring: deprecate epoll_ctl support")
Link: https://github.com/libuv/libuv/pull/3979
Signed-off-by: Ben Noordhuis <info(a)bnoordhuis.nl>
---
io_uring/epoll.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 9aa74d2c80bc..89bff2068a19 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -25,10 +25,6 @@ int io_epoll_ctl_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
struct io_epoll *epoll = io_kiocb_to_cmd(req, struct io_epoll);
- pr_warn_once("%s: epoll_ctl support in io_uring is deprecated and will "
- "be removed in a future Linux kernel version.\n",
- current->comm);
-
if (sqe->buf_index || sqe->splice_fd_in)
return -EINVAL;
--
2.39.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x ce0b15d11ad837fbacc5356941712218e38a0a83
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052616-audibly-grinning-73b4@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
ce0b15d11ad8 ("x86/mm: Avoid incomplete Global INVLPG flushes")
6cff64b86aaa ("x86/mm: Use INVPCID for __native_flush_tlb_single()")
6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
48e111982cda ("x86/mm: Abstract switching CR3")
2ea907c4fe7b ("x86/mm: Allow flushing for future ASID switches")
aa8c6248f8c7 ("x86/mm/pti: Add infrastructure for page table isolation")
8a09317b895f ("x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching")
613e396bc0d4 ("init: Invoke init_espfix_bsp() from mm_init()")
1a3b0caeb77e ("x86/mm: Create asm/invpcid.h")
dd95f1a4b5ca ("x86/mm: Put MMU to hardware ASID translation in one place")
cb0a9144a744 ("x86/mm: Remove hard-coded ASID limit checks")
50fb83a62cf4 ("x86/mm: Move the CR3 construction functions to tlbflush.h")
3f67af51e56f ("x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what")
23cb7d46f371 ("x86/microcode: Dont abuse the TLB-flush interface")
c482feefe1ae ("x86/entry/64: Make cpu_entry_area.tss read-only")
0f9a48100fba ("x86/entry: Clean up the SYSENTER_stack code")
7fbbd5cbebf1 ("x86/entry/64: Remove the SYSENTER stack canary")
40e7f949e0d9 ("x86/entry/64: Move the IST stacks into struct cpu_entry_area")
3386bc8aed82 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
3e3b9293d392 ("x86/entry/64: Return to userspace from the trampoline stack")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ce0b15d11ad837fbacc5356941712218e38a0a83 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Tue, 16 May 2023 12:24:25 -0700
Subject: [PATCH] x86/mm: Avoid incomplete Global INVLPG flushes
The INVLPG instruction is used to invalidate TLB entries for a
specified virtual address. When PCIDs are enabled, INVLPG is supposed
to invalidate TLB entries for the specified address for both the
current PCID *and* Global entries. (Note: Only kernel mappings set
Global=1.)
Unfortunately, some INVLPG implementations can leave Global
translations unflushed when PCIDs are enabled.
As a workaround, never enable PCIDs on affected processors.
I expect there to eventually be microcode mitigations to replace this
software workaround. However, the exact version numbers where that
will happen are not known today. Once the version numbers are set in
stone, the processor list can be tweaked to only disable PCIDs on
affected processors with affected microcode.
Note: if anyone wants a quick fix that doesn't require patching, just
stick 'nopcid' on your kernel command-line.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 3cdac0f0055d..8192452d1d2d 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -9,6 +9,7 @@
#include <linux/sched/task.h>
#include <asm/set_memory.h>
+#include <asm/cpu_device_id.h>
#include <asm/e820/api.h>
#include <asm/init.h>
#include <asm/page.h>
@@ -261,6 +262,24 @@ static void __init probe_page_size_mask(void)
}
}
+#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
+ .family = 6, \
+ .model = _model, \
+ }
+/*
+ * INVLPG may not properly flush Global entries
+ * on these CPUs when PCIDs are enabled.
+ */
+static const struct x86_cpu_id invlpg_miss_ids[] = {
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+ {}
+};
+
static void setup_pcid(void)
{
if (!IS_ENABLED(CONFIG_X86_64))
@@ -269,6 +288,12 @@ static void setup_pcid(void)
if (!boot_cpu_has(X86_FEATURE_PCID))
return;
+ if (x86_match_cpu(invlpg_miss_ids)) {
+ pr_info("Incomplete global flushes, disabling PCID");
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+ return;
+ }
+
if (boot_cpu_has(X86_FEATURE_PGE)) {
/*
* This can't be cr4_set_bits_and_update_boot() -- the
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x ce0b15d11ad837fbacc5356941712218e38a0a83
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052615-manila-armoire-d077@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
ce0b15d11ad8 ("x86/mm: Avoid incomplete Global INVLPG flushes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ce0b15d11ad837fbacc5356941712218e38a0a83 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Tue, 16 May 2023 12:24:25 -0700
Subject: [PATCH] x86/mm: Avoid incomplete Global INVLPG flushes
The INVLPG instruction is used to invalidate TLB entries for a
specified virtual address. When PCIDs are enabled, INVLPG is supposed
to invalidate TLB entries for the specified address for both the
current PCID *and* Global entries. (Note: Only kernel mappings set
Global=1.)
Unfortunately, some INVLPG implementations can leave Global
translations unflushed when PCIDs are enabled.
As a workaround, never enable PCIDs on affected processors.
I expect there to eventually be microcode mitigations to replace this
software workaround. However, the exact version numbers where that
will happen are not known today. Once the version numbers are set in
stone, the processor list can be tweaked to only disable PCIDs on
affected processors with affected microcode.
Note: if anyone wants a quick fix that doesn't require patching, just
stick 'nopcid' on your kernel command-line.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 3cdac0f0055d..8192452d1d2d 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -9,6 +9,7 @@
#include <linux/sched/task.h>
#include <asm/set_memory.h>
+#include <asm/cpu_device_id.h>
#include <asm/e820/api.h>
#include <asm/init.h>
#include <asm/page.h>
@@ -261,6 +262,24 @@ static void __init probe_page_size_mask(void)
}
}
+#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
+ .family = 6, \
+ .model = _model, \
+ }
+/*
+ * INVLPG may not properly flush Global entries
+ * on these CPUs when PCIDs are enabled.
+ */
+static const struct x86_cpu_id invlpg_miss_ids[] = {
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+ {}
+};
+
static void setup_pcid(void)
{
if (!IS_ENABLED(CONFIG_X86_64))
@@ -269,6 +288,12 @@ static void setup_pcid(void)
if (!boot_cpu_has(X86_FEATURE_PCID))
return;
+ if (x86_match_cpu(invlpg_miss_ids)) {
+ pr_info("Incomplete global flushes, disabling PCID");
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+ return;
+ }
+
if (boot_cpu_has(X86_FEATURE_PGE)) {
/*
* This can't be cr4_set_bits_and_update_boot() -- the
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x ce0b15d11ad837fbacc5356941712218e38a0a83
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052614-ascertain-reshuffle-d127@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
ce0b15d11ad8 ("x86/mm: Avoid incomplete Global INVLPG flushes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ce0b15d11ad837fbacc5356941712218e38a0a83 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Tue, 16 May 2023 12:24:25 -0700
Subject: [PATCH] x86/mm: Avoid incomplete Global INVLPG flushes
The INVLPG instruction is used to invalidate TLB entries for a
specified virtual address. When PCIDs are enabled, INVLPG is supposed
to invalidate TLB entries for the specified address for both the
current PCID *and* Global entries. (Note: Only kernel mappings set
Global=1.)
Unfortunately, some INVLPG implementations can leave Global
translations unflushed when PCIDs are enabled.
As a workaround, never enable PCIDs on affected processors.
I expect there to eventually be microcode mitigations to replace this
software workaround. However, the exact version numbers where that
will happen are not known today. Once the version numbers are set in
stone, the processor list can be tweaked to only disable PCIDs on
affected processors with affected microcode.
Note: if anyone wants a quick fix that doesn't require patching, just
stick 'nopcid' on your kernel command-line.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 3cdac0f0055d..8192452d1d2d 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -9,6 +9,7 @@
#include <linux/sched/task.h>
#include <asm/set_memory.h>
+#include <asm/cpu_device_id.h>
#include <asm/e820/api.h>
#include <asm/init.h>
#include <asm/page.h>
@@ -261,6 +262,24 @@ static void __init probe_page_size_mask(void)
}
}
+#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
+ .family = 6, \
+ .model = _model, \
+ }
+/*
+ * INVLPG may not properly flush Global entries
+ * on these CPUs when PCIDs are enabled.
+ */
+static const struct x86_cpu_id invlpg_miss_ids[] = {
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+ {}
+};
+
static void setup_pcid(void)
{
if (!IS_ENABLED(CONFIG_X86_64))
@@ -269,6 +288,12 @@ static void setup_pcid(void)
if (!boot_cpu_has(X86_FEATURE_PCID))
return;
+ if (x86_match_cpu(invlpg_miss_ids)) {
+ pr_info("Incomplete global flushes, disabling PCID");
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+ return;
+ }
+
if (boot_cpu_has(X86_FEATURE_PGE)) {
/*
* This can't be cr4_set_bits_and_update_boot() -- the
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x ce0b15d11ad837fbacc5356941712218e38a0a83
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052613-galore-flame-b5de@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
ce0b15d11ad8 ("x86/mm: Avoid incomplete Global INVLPG flushes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ce0b15d11ad837fbacc5356941712218e38a0a83 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Tue, 16 May 2023 12:24:25 -0700
Subject: [PATCH] x86/mm: Avoid incomplete Global INVLPG flushes
The INVLPG instruction is used to invalidate TLB entries for a
specified virtual address. When PCIDs are enabled, INVLPG is supposed
to invalidate TLB entries for the specified address for both the
current PCID *and* Global entries. (Note: Only kernel mappings set
Global=1.)
Unfortunately, some INVLPG implementations can leave Global
translations unflushed when PCIDs are enabled.
As a workaround, never enable PCIDs on affected processors.
I expect there to eventually be microcode mitigations to replace this
software workaround. However, the exact version numbers where that
will happen are not known today. Once the version numbers are set in
stone, the processor list can be tweaked to only disable PCIDs on
affected processors with affected microcode.
Note: if anyone wants a quick fix that doesn't require patching, just
stick 'nopcid' on your kernel command-line.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 3cdac0f0055d..8192452d1d2d 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -9,6 +9,7 @@
#include <linux/sched/task.h>
#include <asm/set_memory.h>
+#include <asm/cpu_device_id.h>
#include <asm/e820/api.h>
#include <asm/init.h>
#include <asm/page.h>
@@ -261,6 +262,24 @@ static void __init probe_page_size_mask(void)
}
}
+#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
+ .family = 6, \
+ .model = _model, \
+ }
+/*
+ * INVLPG may not properly flush Global entries
+ * on these CPUs when PCIDs are enabled.
+ */
+static const struct x86_cpu_id invlpg_miss_ids[] = {
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE ),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+ {}
+};
+
static void setup_pcid(void)
{
if (!IS_ENABLED(CONFIG_X86_64))
@@ -269,6 +288,12 @@ static void setup_pcid(void)
if (!boot_cpu_has(X86_FEATURE_PCID))
return;
+ if (x86_match_cpu(invlpg_miss_ids)) {
+ pr_info("Incomplete global flushes, disabling PCID");
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+ return;
+ }
+
if (boot_cpu_has(X86_FEATURE_PGE)) {
/*
* This can't be cr4_set_bits_and_update_boot() -- the
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 81302b1c7c997e8a56c1c2fc63a296ebeb0cd2d0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052649-kilt-concert-cd8a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
81302b1c7c99 ("ALSA: hda: Fix unhandled register update during auto-suspend period")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 81302b1c7c997e8a56c1c2fc63a296ebeb0cd2d0 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Thu, 18 May 2023 13:35:20 +0200
Subject: [PATCH] ALSA: hda: Fix unhandled register update during auto-suspend
period
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It's reported that the recording started right after the driver probe
doesn't work properly, and it turned out that this is related with the
codec auto-suspend. Namely, after the probe phase, the usage count
goes zero, and the auto-suspend is programmed, but the codec is kept
still active until the auto-suspend expiration. When an application
(e.g. alsactl) updates the mixer values at this moment, the values are
cached but not actually written. Then, starting arecord thereafter
also results in the silence because of the missing unmute.
The root cause is the handling of "lazy update" mode; when a mixer
value is updated *after* the suspend, it should update only the cache
and exits. At the resume, the cached value is written to the device,
in turn. The problem is that the current code misinterprets the state
of auto-suspend as if it were already suspended.
Although we can add the check of the actual device state after
pm_runtime_get_if_in_use() for catching the missing state, this won't
suffice; the second call of regmap_update_bits_check() will skip
writing the register because the cache has been already updated by the
first call. So we'd need fixes in two different places.
OTOH, a simpler fix is to replace pm_runtime_get_if_in_use() with
pm_runtime_get_if_active() (with ign_usage_count=true). This change
implies that the driver takes the pm refcount if the device is still
in ACTIVE state and continues the processing. A small caveat is that
this will leave the auto-suspend timer. But, since the timer callback
itself checks the device state and aborts gracefully when it's active,
this won't be any substantial problem.
Long story short: we address the missing register-write problem just
by replacing the pm_runtime_*() call in snd_hda_keep_power_up().
Fixes: fc4f000bf8c0 ("ALSA: hda - Fix unexpected resume through regmap code path")
Reported-by: Amadeusz Sławiński <amadeuszx.slawinski(a)linux.intel.com>
Closes: https://lore.kernel.org/r/a7478636-af11-92ab-731c-9b13c582a70d@linux.intel.…
Suggested-by: Cezary Rojewski <cezary.rojewski(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20230518113520.15213-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c
index accc9d279ce5..6c043fbd606f 100644
--- a/sound/hda/hdac_device.c
+++ b/sound/hda/hdac_device.c
@@ -611,7 +611,7 @@ EXPORT_SYMBOL_GPL(snd_hdac_power_up_pm);
int snd_hdac_keep_power_up(struct hdac_device *codec)
{
if (!atomic_inc_not_zero(&codec->in_pm)) {
- int ret = pm_runtime_get_if_in_use(&codec->dev);
+ int ret = pm_runtime_get_if_active(&codec->dev, true);
if (!ret)
return -1;
if (ret < 0)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 81302b1c7c997e8a56c1c2fc63a296ebeb0cd2d0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052647-had-steering-38d3@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
81302b1c7c99 ("ALSA: hda: Fix unhandled register update during auto-suspend period")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 81302b1c7c997e8a56c1c2fc63a296ebeb0cd2d0 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Thu, 18 May 2023 13:35:20 +0200
Subject: [PATCH] ALSA: hda: Fix unhandled register update during auto-suspend
period
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It's reported that the recording started right after the driver probe
doesn't work properly, and it turned out that this is related with the
codec auto-suspend. Namely, after the probe phase, the usage count
goes zero, and the auto-suspend is programmed, but the codec is kept
still active until the auto-suspend expiration. When an application
(e.g. alsactl) updates the mixer values at this moment, the values are
cached but not actually written. Then, starting arecord thereafter
also results in the silence because of the missing unmute.
The root cause is the handling of "lazy update" mode; when a mixer
value is updated *after* the suspend, it should update only the cache
and exits. At the resume, the cached value is written to the device,
in turn. The problem is that the current code misinterprets the state
of auto-suspend as if it were already suspended.
Although we can add the check of the actual device state after
pm_runtime_get_if_in_use() for catching the missing state, this won't
suffice; the second call of regmap_update_bits_check() will skip
writing the register because the cache has been already updated by the
first call. So we'd need fixes in two different places.
OTOH, a simpler fix is to replace pm_runtime_get_if_in_use() with
pm_runtime_get_if_active() (with ign_usage_count=true). This change
implies that the driver takes the pm refcount if the device is still
in ACTIVE state and continues the processing. A small caveat is that
this will leave the auto-suspend timer. But, since the timer callback
itself checks the device state and aborts gracefully when it's active,
this won't be any substantial problem.
Long story short: we address the missing register-write problem just
by replacing the pm_runtime_*() call in snd_hda_keep_power_up().
Fixes: fc4f000bf8c0 ("ALSA: hda - Fix unexpected resume through regmap code path")
Reported-by: Amadeusz Sławiński <amadeuszx.slawinski(a)linux.intel.com>
Closes: https://lore.kernel.org/r/a7478636-af11-92ab-731c-9b13c582a70d@linux.intel.…
Suggested-by: Cezary Rojewski <cezary.rojewski(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20230518113520.15213-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c
index accc9d279ce5..6c043fbd606f 100644
--- a/sound/hda/hdac_device.c
+++ b/sound/hda/hdac_device.c
@@ -611,7 +611,7 @@ EXPORT_SYMBOL_GPL(snd_hdac_power_up_pm);
int snd_hdac_keep_power_up(struct hdac_device *codec)
{
if (!atomic_inc_not_zero(&codec->in_pm)) {
- int ret = pm_runtime_get_if_in_use(&codec->dev);
+ int ret = pm_runtime_get_if_active(&codec->dev, true);
if (!ret)
return -1;
if (ret < 0)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 81302b1c7c997e8a56c1c2fc63a296ebeb0cd2d0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052628-overfull-secular-50e4@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
81302b1c7c99 ("ALSA: hda: Fix unhandled register update during auto-suspend period")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 81302b1c7c997e8a56c1c2fc63a296ebeb0cd2d0 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Thu, 18 May 2023 13:35:20 +0200
Subject: [PATCH] ALSA: hda: Fix unhandled register update during auto-suspend
period
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It's reported that the recording started right after the driver probe
doesn't work properly, and it turned out that this is related with the
codec auto-suspend. Namely, after the probe phase, the usage count
goes zero, and the auto-suspend is programmed, but the codec is kept
still active until the auto-suspend expiration. When an application
(e.g. alsactl) updates the mixer values at this moment, the values are
cached but not actually written. Then, starting arecord thereafter
also results in the silence because of the missing unmute.
The root cause is the handling of "lazy update" mode; when a mixer
value is updated *after* the suspend, it should update only the cache
and exits. At the resume, the cached value is written to the device,
in turn. The problem is that the current code misinterprets the state
of auto-suspend as if it were already suspended.
Although we can add the check of the actual device state after
pm_runtime_get_if_in_use() for catching the missing state, this won't
suffice; the second call of regmap_update_bits_check() will skip
writing the register because the cache has been already updated by the
first call. So we'd need fixes in two different places.
OTOH, a simpler fix is to replace pm_runtime_get_if_in_use() with
pm_runtime_get_if_active() (with ign_usage_count=true). This change
implies that the driver takes the pm refcount if the device is still
in ACTIVE state and continues the processing. A small caveat is that
this will leave the auto-suspend timer. But, since the timer callback
itself checks the device state and aborts gracefully when it's active,
this won't be any substantial problem.
Long story short: we address the missing register-write problem just
by replacing the pm_runtime_*() call in snd_hda_keep_power_up().
Fixes: fc4f000bf8c0 ("ALSA: hda - Fix unexpected resume through regmap code path")
Reported-by: Amadeusz Sławiński <amadeuszx.slawinski(a)linux.intel.com>
Closes: https://lore.kernel.org/r/a7478636-af11-92ab-731c-9b13c582a70d@linux.intel.…
Suggested-by: Cezary Rojewski <cezary.rojewski(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20230518113520.15213-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c
index accc9d279ce5..6c043fbd606f 100644
--- a/sound/hda/hdac_device.c
+++ b/sound/hda/hdac_device.c
@@ -611,7 +611,7 @@ EXPORT_SYMBOL_GPL(snd_hdac_power_up_pm);
int snd_hdac_keep_power_up(struct hdac_device *codec)
{
if (!atomic_inc_not_zero(&codec->in_pm)) {
- int ret = pm_runtime_get_if_in_use(&codec->dev);
+ int ret = pm_runtime_get_if_active(&codec->dev, true);
if (!ret)
return -1;
if (ret < 0)
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
The similar approach was applied to all functions called from the locked
and the unlocked context, which safely mitigates both deadlocks and race
conditions in the driver.
__test_dev_config_update_bool(), __test_dev_config_update_u8() and
__test_dev_config_update_size_t() unlocked versions of the functions
were introduced to be called from the locked contexts as a workaround
without releasing the main driver's lock and thereof causing a race
condition.
The test_dev_config_update_bool(), test_dev_config_update_u8() and
test_dev_config_update_size_t() locked versions of the functions
are being called from driver methods without the unnecessary multiplying
of the locking and unlocking code for each method, and complicating
the code with saving of the return value across lock.
Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
---
lib/test_firmware.c | 52 ++++++++++++++++++++++++++++++---------------
1 file changed, 35 insertions(+), 17 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..35417e0af3f4 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -353,16 +353,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (kstrtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -373,7 +383,8 @@ static ssize_t test_dev_config_show_bool(char *buf, bool val)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_size_t(const char *buf,
+static int __test_dev_config_update_size_t(
+ const char *buf,
size_t size,
size_t *cfg)
{
@@ -384,9 +395,7 @@ static int test_dev_config_update_size_t(const char *buf,
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(size_t *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
@@ -402,7 +411,7 @@ static ssize_t test_dev_config_show_int(char *buf, int val)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
@@ -411,14 +420,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 val)
{
return snprintf(buf, PAGE_SIZE, "%u\n", val);
@@ -471,10 +489,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -518,10 +536,10 @@ static ssize_t config_buf_size_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->buf_size);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->buf_size);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -548,10 +566,10 @@ static ssize_t config_file_offset_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->file_offset);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->file_offset);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.30.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 3632679d9e4f879f49949bb5b050e0de553e4739
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052637-smudge-shucking-ac36@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
91d0b78c5177 ("inet: Add IP_LOCAL_PORT_RANGE socket option")
28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
d2c135619cb8 ("inet: add READ_ONCE(sk->sk_bound_dev_if) in inet_csk_bind_conflict()")
ca7af0402550 ("tcp: add small random increments to the source port")
ffa84b5ffb37 ("net: add netns refcount tracker to struct sock")
938cca9e4109 ("sock: fix /proc/net/sockstat underflow in sk_clone_lock()")
990c74e3f41d ("memcg: enable accounting for inet_bin_bucket cache")
333bb73f620e ("tcp: Keep TCP_CLOSE sockets in the reuseport group.")
5c040eaf5d17 ("tcp: Add num_closed_socks to struct sock_reuseport.")
c579bd1b4021 ("tcp: add some entropy in __inet_hash_connect()")
190cc82489f4 ("tcp: change source port randomizarion at connect() time")
bbc20b70424a ("net: reduce indentation level in sk_clone_lock()")
62ffc589abb1 ("net: refactor bind_bucket fastreuse into helper")
47ec5303d73e ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3632679d9e4f879f49949bb5b050e0de553e4739 Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Mon, 22 May 2023 14:08:20 +0200
Subject: [PATCH] ipv{4,6}/raw: fix output xfrm lookup wrt protocol
With a raw socket bound to IPPROTO_RAW (ie with hdrincl enabled), the
protocol field of the flow structure, build by raw_sendmsg() /
rawv6_sendmsg()), is set to IPPROTO_RAW. This breaks the ipsec policy
lookup when some policies are defined with a protocol in the selector.
For ipv6, the sin6_port field from 'struct sockaddr_in6' could be used to
specify the protocol. Just accept all values for IPPROTO_RAW socket.
For ipv4, the sin_port field of 'struct sockaddr_in' could not be used
without breaking backward compatibility (the value of this field was never
checked). Let's add a new kind of control message, so that the userland
could specify which protocol is used.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
CC: stable(a)vger.kernel.org
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Link: https://lore.kernel.org/r/20230522120820.1319391-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/include/net/ip.h b/include/net/ip.h
index c3fffaa92d6e..acec504c469a 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -76,6 +76,7 @@ struct ipcm_cookie {
__be32 addr;
int oif;
struct ip_options_rcu *opt;
+ __u8 protocol;
__u8 ttl;
__s16 tos;
char priority;
@@ -96,6 +97,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
ipcm->sockc.tsflags = inet->sk.sk_tsflags;
ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
ipcm->addr = inet->inet_saddr;
+ ipcm->protocol = inet->inet_num;
}
#define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb))
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 4b7f2df66b99..e682ab628dfa 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -163,6 +163,7 @@ struct in_addr {
#define IP_MULTICAST_ALL 49
#define IP_UNICAST_IF 50
#define IP_LOCAL_PORT_RANGE 51
+#define IP_PROTOCOL 52
#define MCAST_EXCLUDE 0
#define MCAST_INCLUDE 1
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index b511ff0adc0a..8e97d8d4cc9d 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -317,7 +317,14 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
ipc->tos = val;
ipc->priority = rt_tos2priority(ipc->tos);
break;
-
+ case IP_PROTOCOL:
+ if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
+ return -EINVAL;
+ val = *(int *)CMSG_DATA(cmsg);
+ if (val < 1 || val > 255)
+ return -EINVAL;
+ ipc->protocol = val;
+ break;
default:
return -EINVAL;
}
@@ -1761,6 +1768,9 @@ int do_ip_getsockopt(struct sock *sk, int level, int optname,
case IP_LOCAL_PORT_RANGE:
val = inet->local_port_range.hi << 16 | inet->local_port_range.lo;
break;
+ case IP_PROTOCOL:
+ val = inet_sk(sk)->inet_num;
+ break;
default:
sockopt_release_sock(sk);
return -ENOPROTOOPT;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index ff712bf2a98d..eadf1c9ef7e4 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -532,6 +532,9 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
}
ipcm_init_sk(&ipc, inet);
+ /* Keep backward compat */
+ if (hdrincl)
+ ipc.protocol = IPPROTO_RAW;
if (msg->msg_controllen) {
err = ip_cmsg_send(sk, msg, &ipc, false);
@@ -599,7 +602,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
flowi4_init_output(&fl4, ipc.oif, ipc.sockc.mark, tos,
RT_SCOPE_UNIVERSE,
- hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+ hdrincl ? ipc.protocol : sk->sk_protocol,
inet_sk_flowi_flags(sk) |
(hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
daddr, saddr, 0, 0, sk->sk_uid);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 7d0adb612bdd..44ee7a2e72ac 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -793,7 +793,8 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
if (!proto)
proto = inet->inet_num;
- else if (proto != inet->inet_num)
+ else if (proto != inet->inet_num &&
+ inet->inet_num != IPPROTO_RAW)
return -EINVAL;
if (proto > 255)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 3632679d9e4f879f49949bb5b050e0de553e4739
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052636-scrounger-dirtiness-37d2@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
91d0b78c5177 ("inet: Add IP_LOCAL_PORT_RANGE socket option")
28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
d2c135619cb8 ("inet: add READ_ONCE(sk->sk_bound_dev_if) in inet_csk_bind_conflict()")
ca7af0402550 ("tcp: add small random increments to the source port")
ffa84b5ffb37 ("net: add netns refcount tracker to struct sock")
938cca9e4109 ("sock: fix /proc/net/sockstat underflow in sk_clone_lock()")
990c74e3f41d ("memcg: enable accounting for inet_bin_bucket cache")
333bb73f620e ("tcp: Keep TCP_CLOSE sockets in the reuseport group.")
5c040eaf5d17 ("tcp: Add num_closed_socks to struct sock_reuseport.")
c579bd1b4021 ("tcp: add some entropy in __inet_hash_connect()")
190cc82489f4 ("tcp: change source port randomizarion at connect() time")
bbc20b70424a ("net: reduce indentation level in sk_clone_lock()")
62ffc589abb1 ("net: refactor bind_bucket fastreuse into helper")
47ec5303d73e ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3632679d9e4f879f49949bb5b050e0de553e4739 Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Mon, 22 May 2023 14:08:20 +0200
Subject: [PATCH] ipv{4,6}/raw: fix output xfrm lookup wrt protocol
With a raw socket bound to IPPROTO_RAW (ie with hdrincl enabled), the
protocol field of the flow structure, build by raw_sendmsg() /
rawv6_sendmsg()), is set to IPPROTO_RAW. This breaks the ipsec policy
lookup when some policies are defined with a protocol in the selector.
For ipv6, the sin6_port field from 'struct sockaddr_in6' could be used to
specify the protocol. Just accept all values for IPPROTO_RAW socket.
For ipv4, the sin_port field of 'struct sockaddr_in' could not be used
without breaking backward compatibility (the value of this field was never
checked). Let's add a new kind of control message, so that the userland
could specify which protocol is used.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
CC: stable(a)vger.kernel.org
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Link: https://lore.kernel.org/r/20230522120820.1319391-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/include/net/ip.h b/include/net/ip.h
index c3fffaa92d6e..acec504c469a 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -76,6 +76,7 @@ struct ipcm_cookie {
__be32 addr;
int oif;
struct ip_options_rcu *opt;
+ __u8 protocol;
__u8 ttl;
__s16 tos;
char priority;
@@ -96,6 +97,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
ipcm->sockc.tsflags = inet->sk.sk_tsflags;
ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
ipcm->addr = inet->inet_saddr;
+ ipcm->protocol = inet->inet_num;
}
#define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb))
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 4b7f2df66b99..e682ab628dfa 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -163,6 +163,7 @@ struct in_addr {
#define IP_MULTICAST_ALL 49
#define IP_UNICAST_IF 50
#define IP_LOCAL_PORT_RANGE 51
+#define IP_PROTOCOL 52
#define MCAST_EXCLUDE 0
#define MCAST_INCLUDE 1
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index b511ff0adc0a..8e97d8d4cc9d 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -317,7 +317,14 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
ipc->tos = val;
ipc->priority = rt_tos2priority(ipc->tos);
break;
-
+ case IP_PROTOCOL:
+ if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
+ return -EINVAL;
+ val = *(int *)CMSG_DATA(cmsg);
+ if (val < 1 || val > 255)
+ return -EINVAL;
+ ipc->protocol = val;
+ break;
default:
return -EINVAL;
}
@@ -1761,6 +1768,9 @@ int do_ip_getsockopt(struct sock *sk, int level, int optname,
case IP_LOCAL_PORT_RANGE:
val = inet->local_port_range.hi << 16 | inet->local_port_range.lo;
break;
+ case IP_PROTOCOL:
+ val = inet_sk(sk)->inet_num;
+ break;
default:
sockopt_release_sock(sk);
return -ENOPROTOOPT;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index ff712bf2a98d..eadf1c9ef7e4 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -532,6 +532,9 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
}
ipcm_init_sk(&ipc, inet);
+ /* Keep backward compat */
+ if (hdrincl)
+ ipc.protocol = IPPROTO_RAW;
if (msg->msg_controllen) {
err = ip_cmsg_send(sk, msg, &ipc, false);
@@ -599,7 +602,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
flowi4_init_output(&fl4, ipc.oif, ipc.sockc.mark, tos,
RT_SCOPE_UNIVERSE,
- hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+ hdrincl ? ipc.protocol : sk->sk_protocol,
inet_sk_flowi_flags(sk) |
(hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
daddr, saddr, 0, 0, sk->sk_uid);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 7d0adb612bdd..44ee7a2e72ac 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -793,7 +793,8 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
if (!proto)
proto = inet->inet_num;
- else if (proto != inet->inet_num)
+ else if (proto != inet->inet_num &&
+ inet->inet_num != IPPROTO_RAW)
return -EINVAL;
if (proto > 255)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 3632679d9e4f879f49949bb5b050e0de553e4739
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052635-styling-unbutton-ac91@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
91d0b78c5177 ("inet: Add IP_LOCAL_PORT_RANGE socket option")
28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
d2c135619cb8 ("inet: add READ_ONCE(sk->sk_bound_dev_if) in inet_csk_bind_conflict()")
ca7af0402550 ("tcp: add small random increments to the source port")
ffa84b5ffb37 ("net: add netns refcount tracker to struct sock")
938cca9e4109 ("sock: fix /proc/net/sockstat underflow in sk_clone_lock()")
990c74e3f41d ("memcg: enable accounting for inet_bin_bucket cache")
333bb73f620e ("tcp: Keep TCP_CLOSE sockets in the reuseport group.")
5c040eaf5d17 ("tcp: Add num_closed_socks to struct sock_reuseport.")
c579bd1b4021 ("tcp: add some entropy in __inet_hash_connect()")
190cc82489f4 ("tcp: change source port randomizarion at connect() time")
bbc20b70424a ("net: reduce indentation level in sk_clone_lock()")
62ffc589abb1 ("net: refactor bind_bucket fastreuse into helper")
47ec5303d73e ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3632679d9e4f879f49949bb5b050e0de553e4739 Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Mon, 22 May 2023 14:08:20 +0200
Subject: [PATCH] ipv{4,6}/raw: fix output xfrm lookup wrt protocol
With a raw socket bound to IPPROTO_RAW (ie with hdrincl enabled), the
protocol field of the flow structure, build by raw_sendmsg() /
rawv6_sendmsg()), is set to IPPROTO_RAW. This breaks the ipsec policy
lookup when some policies are defined with a protocol in the selector.
For ipv6, the sin6_port field from 'struct sockaddr_in6' could be used to
specify the protocol. Just accept all values for IPPROTO_RAW socket.
For ipv4, the sin_port field of 'struct sockaddr_in' could not be used
without breaking backward compatibility (the value of this field was never
checked). Let's add a new kind of control message, so that the userland
could specify which protocol is used.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
CC: stable(a)vger.kernel.org
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Link: https://lore.kernel.org/r/20230522120820.1319391-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/include/net/ip.h b/include/net/ip.h
index c3fffaa92d6e..acec504c469a 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -76,6 +76,7 @@ struct ipcm_cookie {
__be32 addr;
int oif;
struct ip_options_rcu *opt;
+ __u8 protocol;
__u8 ttl;
__s16 tos;
char priority;
@@ -96,6 +97,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
ipcm->sockc.tsflags = inet->sk.sk_tsflags;
ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
ipcm->addr = inet->inet_saddr;
+ ipcm->protocol = inet->inet_num;
}
#define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb))
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 4b7f2df66b99..e682ab628dfa 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -163,6 +163,7 @@ struct in_addr {
#define IP_MULTICAST_ALL 49
#define IP_UNICAST_IF 50
#define IP_LOCAL_PORT_RANGE 51
+#define IP_PROTOCOL 52
#define MCAST_EXCLUDE 0
#define MCAST_INCLUDE 1
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index b511ff0adc0a..8e97d8d4cc9d 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -317,7 +317,14 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
ipc->tos = val;
ipc->priority = rt_tos2priority(ipc->tos);
break;
-
+ case IP_PROTOCOL:
+ if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
+ return -EINVAL;
+ val = *(int *)CMSG_DATA(cmsg);
+ if (val < 1 || val > 255)
+ return -EINVAL;
+ ipc->protocol = val;
+ break;
default:
return -EINVAL;
}
@@ -1761,6 +1768,9 @@ int do_ip_getsockopt(struct sock *sk, int level, int optname,
case IP_LOCAL_PORT_RANGE:
val = inet->local_port_range.hi << 16 | inet->local_port_range.lo;
break;
+ case IP_PROTOCOL:
+ val = inet_sk(sk)->inet_num;
+ break;
default:
sockopt_release_sock(sk);
return -ENOPROTOOPT;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index ff712bf2a98d..eadf1c9ef7e4 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -532,6 +532,9 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
}
ipcm_init_sk(&ipc, inet);
+ /* Keep backward compat */
+ if (hdrincl)
+ ipc.protocol = IPPROTO_RAW;
if (msg->msg_controllen) {
err = ip_cmsg_send(sk, msg, &ipc, false);
@@ -599,7 +602,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
flowi4_init_output(&fl4, ipc.oif, ipc.sockc.mark, tos,
RT_SCOPE_UNIVERSE,
- hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+ hdrincl ? ipc.protocol : sk->sk_protocol,
inet_sk_flowi_flags(sk) |
(hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
daddr, saddr, 0, 0, sk->sk_uid);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 7d0adb612bdd..44ee7a2e72ac 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -793,7 +793,8 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
if (!proto)
proto = inet->inet_num;
- else if (proto != inet->inet_num)
+ else if (proto != inet->inet_num &&
+ inet->inet_num != IPPROTO_RAW)
return -EINVAL;
if (proto > 255)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 3632679d9e4f879f49949bb5b050e0de553e4739
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052634-surgical-sulfite-a551@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
91d0b78c5177 ("inet: Add IP_LOCAL_PORT_RANGE socket option")
28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
d2c135619cb8 ("inet: add READ_ONCE(sk->sk_bound_dev_if) in inet_csk_bind_conflict()")
ca7af0402550 ("tcp: add small random increments to the source port")
ffa84b5ffb37 ("net: add netns refcount tracker to struct sock")
938cca9e4109 ("sock: fix /proc/net/sockstat underflow in sk_clone_lock()")
990c74e3f41d ("memcg: enable accounting for inet_bin_bucket cache")
333bb73f620e ("tcp: Keep TCP_CLOSE sockets in the reuseport group.")
5c040eaf5d17 ("tcp: Add num_closed_socks to struct sock_reuseport.")
c579bd1b4021 ("tcp: add some entropy in __inet_hash_connect()")
190cc82489f4 ("tcp: change source port randomizarion at connect() time")
bbc20b70424a ("net: reduce indentation level in sk_clone_lock()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3632679d9e4f879f49949bb5b050e0de553e4739 Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Mon, 22 May 2023 14:08:20 +0200
Subject: [PATCH] ipv{4,6}/raw: fix output xfrm lookup wrt protocol
With a raw socket bound to IPPROTO_RAW (ie with hdrincl enabled), the
protocol field of the flow structure, build by raw_sendmsg() /
rawv6_sendmsg()), is set to IPPROTO_RAW. This breaks the ipsec policy
lookup when some policies are defined with a protocol in the selector.
For ipv6, the sin6_port field from 'struct sockaddr_in6' could be used to
specify the protocol. Just accept all values for IPPROTO_RAW socket.
For ipv4, the sin_port field of 'struct sockaddr_in' could not be used
without breaking backward compatibility (the value of this field was never
checked). Let's add a new kind of control message, so that the userland
could specify which protocol is used.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
CC: stable(a)vger.kernel.org
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Link: https://lore.kernel.org/r/20230522120820.1319391-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/include/net/ip.h b/include/net/ip.h
index c3fffaa92d6e..acec504c469a 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -76,6 +76,7 @@ struct ipcm_cookie {
__be32 addr;
int oif;
struct ip_options_rcu *opt;
+ __u8 protocol;
__u8 ttl;
__s16 tos;
char priority;
@@ -96,6 +97,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
ipcm->sockc.tsflags = inet->sk.sk_tsflags;
ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
ipcm->addr = inet->inet_saddr;
+ ipcm->protocol = inet->inet_num;
}
#define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb))
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 4b7f2df66b99..e682ab628dfa 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -163,6 +163,7 @@ struct in_addr {
#define IP_MULTICAST_ALL 49
#define IP_UNICAST_IF 50
#define IP_LOCAL_PORT_RANGE 51
+#define IP_PROTOCOL 52
#define MCAST_EXCLUDE 0
#define MCAST_INCLUDE 1
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index b511ff0adc0a..8e97d8d4cc9d 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -317,7 +317,14 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
ipc->tos = val;
ipc->priority = rt_tos2priority(ipc->tos);
break;
-
+ case IP_PROTOCOL:
+ if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
+ return -EINVAL;
+ val = *(int *)CMSG_DATA(cmsg);
+ if (val < 1 || val > 255)
+ return -EINVAL;
+ ipc->protocol = val;
+ break;
default:
return -EINVAL;
}
@@ -1761,6 +1768,9 @@ int do_ip_getsockopt(struct sock *sk, int level, int optname,
case IP_LOCAL_PORT_RANGE:
val = inet->local_port_range.hi << 16 | inet->local_port_range.lo;
break;
+ case IP_PROTOCOL:
+ val = inet_sk(sk)->inet_num;
+ break;
default:
sockopt_release_sock(sk);
return -ENOPROTOOPT;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index ff712bf2a98d..eadf1c9ef7e4 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -532,6 +532,9 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
}
ipcm_init_sk(&ipc, inet);
+ /* Keep backward compat */
+ if (hdrincl)
+ ipc.protocol = IPPROTO_RAW;
if (msg->msg_controllen) {
err = ip_cmsg_send(sk, msg, &ipc, false);
@@ -599,7 +602,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
flowi4_init_output(&fl4, ipc.oif, ipc.sockc.mark, tos,
RT_SCOPE_UNIVERSE,
- hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+ hdrincl ? ipc.protocol : sk->sk_protocol,
inet_sk_flowi_flags(sk) |
(hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
daddr, saddr, 0, 0, sk->sk_uid);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 7d0adb612bdd..44ee7a2e72ac 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -793,7 +793,8 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
if (!proto)
proto = inet->inet_num;
- else if (proto != inet->inet_num)
+ else if (proto != inet->inet_num &&
+ inet->inet_num != IPPROTO_RAW)
return -EINVAL;
if (proto > 255)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 3632679d9e4f879f49949bb5b050e0de553e4739
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052623-available-vagueness-5c1e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3632679d9e4f879f49949bb5b050e0de553e4739 Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Mon, 22 May 2023 14:08:20 +0200
Subject: [PATCH] ipv{4,6}/raw: fix output xfrm lookup wrt protocol
With a raw socket bound to IPPROTO_RAW (ie with hdrincl enabled), the
protocol field of the flow structure, build by raw_sendmsg() /
rawv6_sendmsg()), is set to IPPROTO_RAW. This breaks the ipsec policy
lookup when some policies are defined with a protocol in the selector.
For ipv6, the sin6_port field from 'struct sockaddr_in6' could be used to
specify the protocol. Just accept all values for IPPROTO_RAW socket.
For ipv4, the sin_port field of 'struct sockaddr_in' could not be used
without breaking backward compatibility (the value of this field was never
checked). Let's add a new kind of control message, so that the userland
could specify which protocol is used.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
CC: stable(a)vger.kernel.org
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Link: https://lore.kernel.org/r/20230522120820.1319391-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/include/net/ip.h b/include/net/ip.h
index c3fffaa92d6e..acec504c469a 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -76,6 +76,7 @@ struct ipcm_cookie {
__be32 addr;
int oif;
struct ip_options_rcu *opt;
+ __u8 protocol;
__u8 ttl;
__s16 tos;
char priority;
@@ -96,6 +97,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
ipcm->sockc.tsflags = inet->sk.sk_tsflags;
ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
ipcm->addr = inet->inet_saddr;
+ ipcm->protocol = inet->inet_num;
}
#define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb))
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 4b7f2df66b99..e682ab628dfa 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -163,6 +163,7 @@ struct in_addr {
#define IP_MULTICAST_ALL 49
#define IP_UNICAST_IF 50
#define IP_LOCAL_PORT_RANGE 51
+#define IP_PROTOCOL 52
#define MCAST_EXCLUDE 0
#define MCAST_INCLUDE 1
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index b511ff0adc0a..8e97d8d4cc9d 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -317,7 +317,14 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
ipc->tos = val;
ipc->priority = rt_tos2priority(ipc->tos);
break;
-
+ case IP_PROTOCOL:
+ if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
+ return -EINVAL;
+ val = *(int *)CMSG_DATA(cmsg);
+ if (val < 1 || val > 255)
+ return -EINVAL;
+ ipc->protocol = val;
+ break;
default:
return -EINVAL;
}
@@ -1761,6 +1768,9 @@ int do_ip_getsockopt(struct sock *sk, int level, int optname,
case IP_LOCAL_PORT_RANGE:
val = inet->local_port_range.hi << 16 | inet->local_port_range.lo;
break;
+ case IP_PROTOCOL:
+ val = inet_sk(sk)->inet_num;
+ break;
default:
sockopt_release_sock(sk);
return -ENOPROTOOPT;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index ff712bf2a98d..eadf1c9ef7e4 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -532,6 +532,9 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
}
ipcm_init_sk(&ipc, inet);
+ /* Keep backward compat */
+ if (hdrincl)
+ ipc.protocol = IPPROTO_RAW;
if (msg->msg_controllen) {
err = ip_cmsg_send(sk, msg, &ipc, false);
@@ -599,7 +602,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
flowi4_init_output(&fl4, ipc.oif, ipc.sockc.mark, tos,
RT_SCOPE_UNIVERSE,
- hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+ hdrincl ? ipc.protocol : sk->sk_protocol,
inet_sk_flowi_flags(sk) |
(hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
daddr, saddr, 0, 0, sk->sk_uid);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 7d0adb612bdd..44ee7a2e72ac 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -793,7 +793,8 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
if (!proto)
proto = inet->inet_num;
- else if (proto != inet->inet_num)
+ else if (proto != inet->inet_num &&
+ inet->inet_num != IPPROTO_RAW)
return -EINVAL;
if (proto > 255)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 7e01c7f7046efc2c7c192c3619db43292b98e997
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052621-karaoke-try-d2ba@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
7e01c7f7046e ("net: cdc_ncm: Deal with too low values of dwNtbOutMaxSize")
2be6d4d16a08 ("net: cdc_ncm: Allow for dwNtbOutMaxSize to be unset or zero")
0fa81b304a79 ("cdc_ncm: Implement the 32-bit version of NCM Transfer Block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7e01c7f7046efc2c7c192c3619db43292b98e997 Mon Sep 17 00:00:00 2001
From: Tudor Ambarus <tudor.ambarus(a)linaro.org>
Date: Wed, 17 May 2023 13:38:08 +0000
Subject: [PATCH] net: cdc_ncm: Deal with too low values of dwNtbOutMaxSize
Currently in cdc_ncm_check_tx_max(), if dwNtbOutMaxSize is lower than
the calculated "min" value, but greater than zero, the logic sets
tx_max to dwNtbOutMaxSize. This is then used to allocate a new SKB in
cdc_ncm_fill_tx_frame() where all the data is handled.
For small values of dwNtbOutMaxSize the memory allocated during
alloc_skb(dwNtbOutMaxSize, GFP_ATOMIC) will have the same size, due to
how size is aligned at alloc time:
size = SKB_DATA_ALIGN(size);
size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
Thus we hit the same bug that we tried to squash with
commit 2be6d4d16a084 ("net: cdc_ncm: Allow for dwNtbOutMaxSize to be unset or zero")
Low values of dwNtbOutMaxSize do not cause an issue presently because at
alloc_skb() time more memory (512b) is allocated than required for the
SKB headers alone (320b), leaving some space (512b - 320b = 192b)
for CDC data (172b).
However, if more elements (for example 3 x u64 = [24b]) were added to
one of the SKB header structs, say 'struct skb_shared_info',
increasing its original size (320b [320b aligned]) to something larger
(344b [384b aligned]), then suddenly the CDC data (172b) no longer
fits in the spare SKB data area (512b - 384b = 128b).
Consequently the SKB bounds checking semantics fails and panics:
skbuff: skb_over_panic: text:ffffffff831f755b len:184 put:172 head:ffff88811f1c6c00 data:ffff88811f1c6c00 tail:0xb8 end:0x80 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:113!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 57 Comm: kworker/0:2 Not tainted 5.15.106-syzkaller-00249-g19c0ed55a470 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
Workqueue: mld mld_ifc_work
RIP: 0010:skb_panic net/core/skbuff.c:113 [inline]
RIP: 0010:skb_over_panic+0x14c/0x150 net/core/skbuff.c:118
[snip]
Call Trace:
<TASK>
skb_put+0x151/0x210 net/core/skbuff.c:2047
skb_put_zero include/linux/skbuff.h:2422 [inline]
cdc_ncm_ndp16 drivers/net/usb/cdc_ncm.c:1131 [inline]
cdc_ncm_fill_tx_frame+0x11ab/0x3da0 drivers/net/usb/cdc_ncm.c:1308
cdc_ncm_tx_fixup+0xa3/0x100
Deal with too low values of dwNtbOutMaxSize, clamp it in the range
[USB_CDC_NCM_NTB_MIN_OUT_SIZE, CDC_NCM_NTB_MAX_SIZE_TX]. We ensure
enough data space is allocated to handle CDC data by making sure
dwNtbOutMaxSize is not smaller than USB_CDC_NCM_NTB_MIN_OUT_SIZE.
Fixes: 289507d3364f ("net: cdc_ncm: use sysfs for rx/tx aggregation tuning")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+9f575a1f15fc0c01ed69(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=b982f1059506db48409d
Link: https://lore.kernel.org/all/20211202143437.1411410-1-lee.jones@linaro.org/
Signed-off-by: Tudor Ambarus <tudor.ambarus(a)linaro.org>
Reviewed-by: Simon Horman <simon.horman(a)corigine.com>
Link: https://lore.kernel.org/r/20230517133808.1873695-2-tudor.ambarus@linaro.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 6ce8f4f0c70e..db05622f1f70 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -181,9 +181,12 @@ static u32 cdc_ncm_check_tx_max(struct usbnet *dev, u32 new_tx)
else
min = ctx->max_datagram_size + ctx->max_ndp_size + sizeof(struct usb_cdc_ncm_nth32);
- max = min_t(u32, CDC_NCM_NTB_MAX_SIZE_TX, le32_to_cpu(ctx->ncm_parm.dwNtbOutMaxSize));
- if (max == 0)
+ if (le32_to_cpu(ctx->ncm_parm.dwNtbOutMaxSize) == 0)
max = CDC_NCM_NTB_MAX_SIZE_TX; /* dwNtbOutMaxSize not set */
+ else
+ max = clamp_t(u32, le32_to_cpu(ctx->ncm_parm.dwNtbOutMaxSize),
+ USB_CDC_NCM_NTB_MIN_OUT_SIZE,
+ CDC_NCM_NTB_MAX_SIZE_TX);
/* some devices set dwNtbOutMaxSize too low for the above default */
min = min(min, max);
@@ -1244,6 +1247,9 @@ cdc_ncm_fill_tx_frame(struct usbnet *dev, struct sk_buff *skb, __le32 sign)
* further.
*/
if (skb_out == NULL) {
+ /* If even the smallest allocation fails, abort. */
+ if (ctx->tx_curr_size == USB_CDC_NCM_NTB_MIN_OUT_SIZE)
+ goto alloc_failed;
ctx->tx_low_mem_max_cnt = min(ctx->tx_low_mem_max_cnt + 1,
(unsigned)CDC_NCM_LOW_MEM_MAX_CNT);
ctx->tx_low_mem_val = ctx->tx_low_mem_max_cnt;
@@ -1262,13 +1268,8 @@ cdc_ncm_fill_tx_frame(struct usbnet *dev, struct sk_buff *skb, __le32 sign)
skb_out = alloc_skb(ctx->tx_curr_size, GFP_ATOMIC);
/* No allocation possible so we will abort */
- if (skb_out == NULL) {
- if (skb != NULL) {
- dev_kfree_skb_any(skb);
- dev->net->stats.tx_dropped++;
- }
- goto exit_no_skb;
- }
+ if (!skb_out)
+ goto alloc_failed;
ctx->tx_low_mem_val--;
}
if (ctx->is_ndp16) {
@@ -1461,6 +1462,11 @@ cdc_ncm_fill_tx_frame(struct usbnet *dev, struct sk_buff *skb, __le32 sign)
return skb_out;
+alloc_failed:
+ if (skb) {
+ dev_kfree_skb_any(skb);
+ dev->net->stats.tx_dropped++;
+ }
exit_no_skb:
/* Start timer, if there is a remaining non-empty skb */
if (ctx->tx_curr_skb != NULL && n > 0)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 003fb0a51162d940f25fc35e70b0996a12c9e08a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052611-wrangle-clock-fb09@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
003fb0a51162 ("mmc: block: ensure error propagation for non-blk")
b84ba30b6c7a ("block: remove the gendisk argument to blk_execute_rq")
4054cff92c35 ("block: remove blk-exec.c")
0bf6d96cb829 ("block: remove blk_{get,put}_request")
4abafdc4360d ("block: remove the initialize_rq_fn blk_mq_ops method")
68ec3b819a5d ("scsi: add a scsi_alloc_request helper")
5a72e899ceb4 ("block: add a struct io_comp_batch argument to fops->iopoll()")
013a7f954381 ("block: provide helpers for rq_list manipulation")
afd7de03c526 ("block: remove some blk_mq_hw_ctx debugfs entries")
3e08773c3841 ("block: switch polling to be bio based")
6ce913fe3eee ("block: rename REQ_HIPRI to REQ_POLLED")
d729cf9acb93 ("io_uring: don't sleep when polling for I/O")
ef99b2d37666 ("block: replace the spin argument to blk_iopoll with a flags argument")
28a1ae6b9dab ("blk-mq: remove blk_qc_t_valid")
efbabbe121f9 ("blk-mq: remove blk_qc_t_to_tag and blk_qc_t_is_internal")
c6699d6fe0ff ("blk-mq: factor out a "classic" poll helper")
f70299f0d58e ("blk-mq: factor out a blk_qc_to_hctx helper")
71fc3f5e2c00 ("block: don't try to poll multi-bio I/Os in __blkdev_direct_IO")
349302da8352 ("block: improve batched tag allocation")
0f38d7664615 ("blk-mq: cleanup blk_mq_submit_bio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 003fb0a51162d940f25fc35e70b0996a12c9e08a Mon Sep 17 00:00:00 2001
From: Christian Loehle <CLoehle(a)hyperstone.com>
Date: Wed, 26 Apr 2023 16:59:39 +0000
Subject: [PATCH] mmc: block: ensure error propagation for non-blk
Requests to the mmc layer usually come through a block device IO.
The exceptions are the ioctl interface, RPMB chardev ioctl
and debugfs, which issue their own blk_mq requests through
blk_execute_rq and do not query the BLK_STS error but the
mmcblk-internal drv_op_result. This patch ensures that drv_op_result
defaults to an error and has to be overwritten by the operation
to be considered successful.
The behavior leads to a bug where the request never propagates
the error, e.g. by directly erroring out at mmc_blk_mq_issue_rq if
mmc_blk_part_switch fails. The ioctl caller of the rpmb chardev then
can never see an error (BLK_STS_IOERR, but drv_op_result is unchanged)
and thus may assume that their call executed successfully when it did not.
While always checking the blk_execute_rq return value would be
advised, let's eliminate the error by always setting
drv_op_result as -EIO to be overwritten on success (or other error)
Fixes: 614f0388f580 ("mmc: block: move single ioctl() commands to block requests")
Signed-off-by: Christian Loehle <cloehle(a)hyperstone.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/59c17ada35664b818b7bd83752119b2d@hyperstone.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 00c33edb9fb9..d920c4178389 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -264,6 +264,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
goto out_put;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
blk_mq_free_request(req);
@@ -651,6 +652,7 @@ static int mmc_blk_ioctl_cmd(struct mmc_blk_data *md,
idatas[0] = idata;
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idatas;
req_to_mmc_queue_req(req)->ioc_count = 1;
blk_execute_rq(req, false);
@@ -722,6 +724,7 @@ static int mmc_blk_ioctl_multi_cmd(struct mmc_blk_data *md,
}
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idata;
req_to_mmc_queue_req(req)->ioc_count = n;
blk_execute_rq(req, false);
@@ -2806,6 +2809,7 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
if (IS_ERR(req))
return PTR_ERR(req);
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_CARD_STATUS;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
if (ret >= 0) {
@@ -2844,6 +2848,7 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
goto out_free;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_EXT_CSD;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
blk_execute_rq(req, false);
err = req_to_mmc_queue_req(req)->drv_op_result;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 003fb0a51162d940f25fc35e70b0996a12c9e08a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052610-deluxe-amuck-9628@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
003fb0a51162 ("mmc: block: ensure error propagation for non-blk")
b84ba30b6c7a ("block: remove the gendisk argument to blk_execute_rq")
4054cff92c35 ("block: remove blk-exec.c")
0bf6d96cb829 ("block: remove blk_{get,put}_request")
4abafdc4360d ("block: remove the initialize_rq_fn blk_mq_ops method")
68ec3b819a5d ("scsi: add a scsi_alloc_request helper")
5a72e899ceb4 ("block: add a struct io_comp_batch argument to fops->iopoll()")
013a7f954381 ("block: provide helpers for rq_list manipulation")
afd7de03c526 ("block: remove some blk_mq_hw_ctx debugfs entries")
3e08773c3841 ("block: switch polling to be bio based")
6ce913fe3eee ("block: rename REQ_HIPRI to REQ_POLLED")
d729cf9acb93 ("io_uring: don't sleep when polling for I/O")
ef99b2d37666 ("block: replace the spin argument to blk_iopoll with a flags argument")
28a1ae6b9dab ("blk-mq: remove blk_qc_t_valid")
efbabbe121f9 ("blk-mq: remove blk_qc_t_to_tag and blk_qc_t_is_internal")
c6699d6fe0ff ("blk-mq: factor out a "classic" poll helper")
f70299f0d58e ("blk-mq: factor out a blk_qc_to_hctx helper")
71fc3f5e2c00 ("block: don't try to poll multi-bio I/Os in __blkdev_direct_IO")
349302da8352 ("block: improve batched tag allocation")
0f38d7664615 ("blk-mq: cleanup blk_mq_submit_bio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 003fb0a51162d940f25fc35e70b0996a12c9e08a Mon Sep 17 00:00:00 2001
From: Christian Loehle <CLoehle(a)hyperstone.com>
Date: Wed, 26 Apr 2023 16:59:39 +0000
Subject: [PATCH] mmc: block: ensure error propagation for non-blk
Requests to the mmc layer usually come through a block device IO.
The exceptions are the ioctl interface, RPMB chardev ioctl
and debugfs, which issue their own blk_mq requests through
blk_execute_rq and do not query the BLK_STS error but the
mmcblk-internal drv_op_result. This patch ensures that drv_op_result
defaults to an error and has to be overwritten by the operation
to be considered successful.
The behavior leads to a bug where the request never propagates
the error, e.g. by directly erroring out at mmc_blk_mq_issue_rq if
mmc_blk_part_switch fails. The ioctl caller of the rpmb chardev then
can never see an error (BLK_STS_IOERR, but drv_op_result is unchanged)
and thus may assume that their call executed successfully when it did not.
While always checking the blk_execute_rq return value would be
advised, let's eliminate the error by always setting
drv_op_result as -EIO to be overwritten on success (or other error)
Fixes: 614f0388f580 ("mmc: block: move single ioctl() commands to block requests")
Signed-off-by: Christian Loehle <cloehle(a)hyperstone.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/59c17ada35664b818b7bd83752119b2d@hyperstone.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 00c33edb9fb9..d920c4178389 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -264,6 +264,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
goto out_put;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
blk_mq_free_request(req);
@@ -651,6 +652,7 @@ static int mmc_blk_ioctl_cmd(struct mmc_blk_data *md,
idatas[0] = idata;
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idatas;
req_to_mmc_queue_req(req)->ioc_count = 1;
blk_execute_rq(req, false);
@@ -722,6 +724,7 @@ static int mmc_blk_ioctl_multi_cmd(struct mmc_blk_data *md,
}
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idata;
req_to_mmc_queue_req(req)->ioc_count = n;
blk_execute_rq(req, false);
@@ -2806,6 +2809,7 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
if (IS_ERR(req))
return PTR_ERR(req);
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_CARD_STATUS;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
if (ret >= 0) {
@@ -2844,6 +2848,7 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
goto out_free;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_EXT_CSD;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
blk_execute_rq(req, false);
err = req_to_mmc_queue_req(req)->drv_op_result;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 003fb0a51162d940f25fc35e70b0996a12c9e08a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052609-regulator-registry-2991@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
003fb0a51162 ("mmc: block: ensure error propagation for non-blk")
b84ba30b6c7a ("block: remove the gendisk argument to blk_execute_rq")
4054cff92c35 ("block: remove blk-exec.c")
0bf6d96cb829 ("block: remove blk_{get,put}_request")
4abafdc4360d ("block: remove the initialize_rq_fn blk_mq_ops method")
68ec3b819a5d ("scsi: add a scsi_alloc_request helper")
5a72e899ceb4 ("block: add a struct io_comp_batch argument to fops->iopoll()")
013a7f954381 ("block: provide helpers for rq_list manipulation")
afd7de03c526 ("block: remove some blk_mq_hw_ctx debugfs entries")
3e08773c3841 ("block: switch polling to be bio based")
6ce913fe3eee ("block: rename REQ_HIPRI to REQ_POLLED")
d729cf9acb93 ("io_uring: don't sleep when polling for I/O")
ef99b2d37666 ("block: replace the spin argument to blk_iopoll with a flags argument")
28a1ae6b9dab ("blk-mq: remove blk_qc_t_valid")
efbabbe121f9 ("blk-mq: remove blk_qc_t_to_tag and blk_qc_t_is_internal")
c6699d6fe0ff ("blk-mq: factor out a "classic" poll helper")
f70299f0d58e ("blk-mq: factor out a blk_qc_to_hctx helper")
71fc3f5e2c00 ("block: don't try to poll multi-bio I/Os in __blkdev_direct_IO")
349302da8352 ("block: improve batched tag allocation")
0f38d7664615 ("blk-mq: cleanup blk_mq_submit_bio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 003fb0a51162d940f25fc35e70b0996a12c9e08a Mon Sep 17 00:00:00 2001
From: Christian Loehle <CLoehle(a)hyperstone.com>
Date: Wed, 26 Apr 2023 16:59:39 +0000
Subject: [PATCH] mmc: block: ensure error propagation for non-blk
Requests to the mmc layer usually come through a block device IO.
The exceptions are the ioctl interface, RPMB chardev ioctl
and debugfs, which issue their own blk_mq requests through
blk_execute_rq and do not query the BLK_STS error but the
mmcblk-internal drv_op_result. This patch ensures that drv_op_result
defaults to an error and has to be overwritten by the operation
to be considered successful.
The behavior leads to a bug where the request never propagates
the error, e.g. by directly erroring out at mmc_blk_mq_issue_rq if
mmc_blk_part_switch fails. The ioctl caller of the rpmb chardev then
can never see an error (BLK_STS_IOERR, but drv_op_result is unchanged)
and thus may assume that their call executed successfully when it did not.
While always checking the blk_execute_rq return value would be
advised, let's eliminate the error by always setting
drv_op_result as -EIO to be overwritten on success (or other error)
Fixes: 614f0388f580 ("mmc: block: move single ioctl() commands to block requests")
Signed-off-by: Christian Loehle <cloehle(a)hyperstone.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/59c17ada35664b818b7bd83752119b2d@hyperstone.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 00c33edb9fb9..d920c4178389 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -264,6 +264,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
goto out_put;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
blk_mq_free_request(req);
@@ -651,6 +652,7 @@ static int mmc_blk_ioctl_cmd(struct mmc_blk_data *md,
idatas[0] = idata;
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idatas;
req_to_mmc_queue_req(req)->ioc_count = 1;
blk_execute_rq(req, false);
@@ -722,6 +724,7 @@ static int mmc_blk_ioctl_multi_cmd(struct mmc_blk_data *md,
}
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idata;
req_to_mmc_queue_req(req)->ioc_count = n;
blk_execute_rq(req, false);
@@ -2806,6 +2809,7 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
if (IS_ERR(req))
return PTR_ERR(req);
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_CARD_STATUS;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
if (ret >= 0) {
@@ -2844,6 +2848,7 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
goto out_free;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_EXT_CSD;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
blk_execute_rq(req, false);
err = req_to_mmc_queue_req(req)->drv_op_result;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 003fb0a51162d940f25fc35e70b0996a12c9e08a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052608-hybrid-jugular-3bc2@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
003fb0a51162 ("mmc: block: ensure error propagation for non-blk")
b84ba30b6c7a ("block: remove the gendisk argument to blk_execute_rq")
4054cff92c35 ("block: remove blk-exec.c")
0bf6d96cb829 ("block: remove blk_{get,put}_request")
4abafdc4360d ("block: remove the initialize_rq_fn blk_mq_ops method")
68ec3b819a5d ("scsi: add a scsi_alloc_request helper")
5a72e899ceb4 ("block: add a struct io_comp_batch argument to fops->iopoll()")
013a7f954381 ("block: provide helpers for rq_list manipulation")
afd7de03c526 ("block: remove some blk_mq_hw_ctx debugfs entries")
3e08773c3841 ("block: switch polling to be bio based")
6ce913fe3eee ("block: rename REQ_HIPRI to REQ_POLLED")
d729cf9acb93 ("io_uring: don't sleep when polling for I/O")
ef99b2d37666 ("block: replace the spin argument to blk_iopoll with a flags argument")
28a1ae6b9dab ("blk-mq: remove blk_qc_t_valid")
efbabbe121f9 ("blk-mq: remove blk_qc_t_to_tag and blk_qc_t_is_internal")
c6699d6fe0ff ("blk-mq: factor out a "classic" poll helper")
f70299f0d58e ("blk-mq: factor out a blk_qc_to_hctx helper")
71fc3f5e2c00 ("block: don't try to poll multi-bio I/Os in __blkdev_direct_IO")
349302da8352 ("block: improve batched tag allocation")
0f38d7664615 ("blk-mq: cleanup blk_mq_submit_bio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 003fb0a51162d940f25fc35e70b0996a12c9e08a Mon Sep 17 00:00:00 2001
From: Christian Loehle <CLoehle(a)hyperstone.com>
Date: Wed, 26 Apr 2023 16:59:39 +0000
Subject: [PATCH] mmc: block: ensure error propagation for non-blk
Requests to the mmc layer usually come through a block device IO.
The exceptions are the ioctl interface, RPMB chardev ioctl
and debugfs, which issue their own blk_mq requests through
blk_execute_rq and do not query the BLK_STS error but the
mmcblk-internal drv_op_result. This patch ensures that drv_op_result
defaults to an error and has to be overwritten by the operation
to be considered successful.
The behavior leads to a bug where the request never propagates
the error, e.g. by directly erroring out at mmc_blk_mq_issue_rq if
mmc_blk_part_switch fails. The ioctl caller of the rpmb chardev then
can never see an error (BLK_STS_IOERR, but drv_op_result is unchanged)
and thus may assume that their call executed successfully when it did not.
While always checking the blk_execute_rq return value would be
advised, let's eliminate the error by always setting
drv_op_result as -EIO to be overwritten on success (or other error)
Fixes: 614f0388f580 ("mmc: block: move single ioctl() commands to block requests")
Signed-off-by: Christian Loehle <cloehle(a)hyperstone.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/59c17ada35664b818b7bd83752119b2d@hyperstone.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 00c33edb9fb9..d920c4178389 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -264,6 +264,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
goto out_put;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
blk_mq_free_request(req);
@@ -651,6 +652,7 @@ static int mmc_blk_ioctl_cmd(struct mmc_blk_data *md,
idatas[0] = idata;
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idatas;
req_to_mmc_queue_req(req)->ioc_count = 1;
blk_execute_rq(req, false);
@@ -722,6 +724,7 @@ static int mmc_blk_ioctl_multi_cmd(struct mmc_blk_data *md,
}
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idata;
req_to_mmc_queue_req(req)->ioc_count = n;
blk_execute_rq(req, false);
@@ -2806,6 +2809,7 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
if (IS_ERR(req))
return PTR_ERR(req);
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_CARD_STATUS;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
if (ret >= 0) {
@@ -2844,6 +2848,7 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
goto out_free;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_EXT_CSD;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
blk_execute_rq(req, false);
err = req_to_mmc_queue_req(req)->drv_op_result;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 003fb0a51162d940f25fc35e70b0996a12c9e08a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052607-catering-creamed-fc70@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
003fb0a51162 ("mmc: block: ensure error propagation for non-blk")
b84ba30b6c7a ("block: remove the gendisk argument to blk_execute_rq")
4054cff92c35 ("block: remove blk-exec.c")
0bf6d96cb829 ("block: remove blk_{get,put}_request")
4abafdc4360d ("block: remove the initialize_rq_fn blk_mq_ops method")
68ec3b819a5d ("scsi: add a scsi_alloc_request helper")
5a72e899ceb4 ("block: add a struct io_comp_batch argument to fops->iopoll()")
013a7f954381 ("block: provide helpers for rq_list manipulation")
afd7de03c526 ("block: remove some blk_mq_hw_ctx debugfs entries")
3e08773c3841 ("block: switch polling to be bio based")
6ce913fe3eee ("block: rename REQ_HIPRI to REQ_POLLED")
d729cf9acb93 ("io_uring: don't sleep when polling for I/O")
ef99b2d37666 ("block: replace the spin argument to blk_iopoll with a flags argument")
28a1ae6b9dab ("blk-mq: remove blk_qc_t_valid")
efbabbe121f9 ("blk-mq: remove blk_qc_t_to_tag and blk_qc_t_is_internal")
c6699d6fe0ff ("blk-mq: factor out a "classic" poll helper")
f70299f0d58e ("blk-mq: factor out a blk_qc_to_hctx helper")
71fc3f5e2c00 ("block: don't try to poll multi-bio I/Os in __blkdev_direct_IO")
349302da8352 ("block: improve batched tag allocation")
0f38d7664615 ("blk-mq: cleanup blk_mq_submit_bio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 003fb0a51162d940f25fc35e70b0996a12c9e08a Mon Sep 17 00:00:00 2001
From: Christian Loehle <CLoehle(a)hyperstone.com>
Date: Wed, 26 Apr 2023 16:59:39 +0000
Subject: [PATCH] mmc: block: ensure error propagation for non-blk
Requests to the mmc layer usually come through a block device IO.
The exceptions are the ioctl interface, RPMB chardev ioctl
and debugfs, which issue their own blk_mq requests through
blk_execute_rq and do not query the BLK_STS error but the
mmcblk-internal drv_op_result. This patch ensures that drv_op_result
defaults to an error and has to be overwritten by the operation
to be considered successful.
The behavior leads to a bug where the request never propagates
the error, e.g. by directly erroring out at mmc_blk_mq_issue_rq if
mmc_blk_part_switch fails. The ioctl caller of the rpmb chardev then
can never see an error (BLK_STS_IOERR, but drv_op_result is unchanged)
and thus may assume that their call executed successfully when it did not.
While always checking the blk_execute_rq return value would be
advised, let's eliminate the error by always setting
drv_op_result as -EIO to be overwritten on success (or other error)
Fixes: 614f0388f580 ("mmc: block: move single ioctl() commands to block requests")
Signed-off-by: Christian Loehle <cloehle(a)hyperstone.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/59c17ada35664b818b7bd83752119b2d@hyperstone.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 00c33edb9fb9..d920c4178389 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -264,6 +264,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
goto out_put;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
blk_mq_free_request(req);
@@ -651,6 +652,7 @@ static int mmc_blk_ioctl_cmd(struct mmc_blk_data *md,
idatas[0] = idata;
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idatas;
req_to_mmc_queue_req(req)->ioc_count = 1;
blk_execute_rq(req, false);
@@ -722,6 +724,7 @@ static int mmc_blk_ioctl_multi_cmd(struct mmc_blk_data *md,
}
req_to_mmc_queue_req(req)->drv_op =
rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = idata;
req_to_mmc_queue_req(req)->ioc_count = n;
blk_execute_rq(req, false);
@@ -2806,6 +2809,7 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
if (IS_ERR(req))
return PTR_ERR(req);
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_CARD_STATUS;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
blk_execute_rq(req, false);
ret = req_to_mmc_queue_req(req)->drv_op_result;
if (ret >= 0) {
@@ -2844,6 +2848,7 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
goto out_free;
}
req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_EXT_CSD;
+ req_to_mmc_queue_req(req)->drv_op_result = -EIO;
req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
blk_execute_rq(req, false);
err = req_to_mmc_queue_req(req)->drv_op_result;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 81dce1490e28439c3cd8a8650b862a712f3061ba
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052651-reporter-deuce-3950@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
81dce1490e28 ("mmc: sdhci-esdhc-imx: make "no-mmc-hs400" works")
f002f45a00ee ("mmc: sdhci-esdhc-imx: use the correct host caps for MMC_CAP_8_BIT_DATA")
1ed5c3b22fc7 ("mmc: sdhci-esdhc-imx: Propagate ESDHC_FLAG_HS400* only on 8bit bus")
2991ad76d253 ("mmc: sdhci-esdhc-imx: advertise HS400 mode through MMC caps")
854a22997ad5 ("mmc: sdhci-esdhc-imx: Convert the driver to DT-only")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 81dce1490e28439c3cd8a8650b862a712f3061ba Mon Sep 17 00:00:00 2001
From: Haibo Chen <haibo.chen(a)nxp.com>
Date: Thu, 4 May 2023 19:22:22 +0800
Subject: [PATCH] mmc: sdhci-esdhc-imx: make "no-mmc-hs400" works
After commit 1ed5c3b22fc7 ("mmc: sdhci-esdhc-imx: Propagate
ESDHC_FLAG_HS400* only on 8bit bus"), the property "no-mmc-hs400"
from device tree file do not work any more.
This patch reorder the code, which can avoid the warning message
"drop HS400 support since no 8-bit bus" and also make the property
"no-mmc-hs400" from dts file works.
Fixes: 1ed5c3b22fc7 ("mmc: sdhci-esdhc-imx: Propagate ESDHC_FLAG_HS400* only on 8bit bus")
Signed-off-by: Haibo Chen <haibo.chen(a)nxp.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230504112222.3599602-1-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
index d7c0c0b9e26c..eebf94604a7f 100644
--- a/drivers/mmc/host/sdhci-esdhc-imx.c
+++ b/drivers/mmc/host/sdhci-esdhc-imx.c
@@ -1634,6 +1634,10 @@ sdhci_esdhc_imx_probe_dt(struct platform_device *pdev,
if (ret)
return ret;
+ /* HS400/HS400ES require 8 bit bus */
+ if (!(host->mmc->caps & MMC_CAP_8_BIT_DATA))
+ host->mmc->caps2 &= ~(MMC_CAP2_HS400 | MMC_CAP2_HS400_ES);
+
if (mmc_gpio_get_cd(host->mmc) >= 0)
host->quirks &= ~SDHCI_QUIRK_BROKEN_CARD_DETECTION;
@@ -1724,10 +1728,6 @@ static int sdhci_esdhc_imx_probe(struct platform_device *pdev)
host->mmc_host_ops.init_card = usdhc_init_card;
}
- err = sdhci_esdhc_imx_probe_dt(pdev, host, imx_data);
- if (err)
- goto disable_ahb_clk;
-
if (imx_data->socdata->flags & ESDHC_FLAG_MAN_TUNING)
sdhci_esdhc_ops.platform_execute_tuning =
esdhc_executing_tuning;
@@ -1735,15 +1735,13 @@ static int sdhci_esdhc_imx_probe(struct platform_device *pdev)
if (imx_data->socdata->flags & ESDHC_FLAG_ERR004536)
host->quirks |= SDHCI_QUIRK_BROKEN_ADMA;
- if (host->mmc->caps & MMC_CAP_8_BIT_DATA &&
- imx_data->socdata->flags & ESDHC_FLAG_HS400)
+ if (imx_data->socdata->flags & ESDHC_FLAG_HS400)
host->mmc->caps2 |= MMC_CAP2_HS400;
if (imx_data->socdata->flags & ESDHC_FLAG_BROKEN_AUTO_CMD23)
host->quirks2 |= SDHCI_QUIRK2_ACMD23_BROKEN;
- if (host->mmc->caps & MMC_CAP_8_BIT_DATA &&
- imx_data->socdata->flags & ESDHC_FLAG_HS400_ES) {
+ if (imx_data->socdata->flags & ESDHC_FLAG_HS400_ES) {
host->mmc->caps2 |= MMC_CAP2_HS400_ES;
host->mmc_host_ops.hs400_enhanced_strobe =
esdhc_hs400_enhanced_strobe;
@@ -1765,6 +1763,10 @@ static int sdhci_esdhc_imx_probe(struct platform_device *pdev)
goto disable_ahb_clk;
}
+ err = sdhci_esdhc_imx_probe_dt(pdev, host, imx_data);
+ if (err)
+ goto disable_ahb_clk;
+
sdhci_esdhc_imx_hwinit(host);
err = sdhci_add_host(host);
On 5/23/23 15:37, Vegard Nossum wrote:
>
> On 5/21/23 02:12, kernel test robot wrote:
>> Hi Vegard,
>>
>> FYI, the error/warning still remains.
>>
>> tree:
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git queue/5.4
>> head: 9b5924fbde0d84c8b30d7ee297a08ca441a760de
>> commit: 3910babeac1ab031f4e178042cbd1af9a9a0ec51 [4610/23441]
>> compiler.h: fix error in BUILD_BUG_ON() reporting
>> config: sparc64-randconfig-c44-20230521
>> compiler: sparc64-linux-gcc (GCC) 12.1.0
[...]
> I'm not sure why this flags my patch as the culprit.
>
> I just tried this (with the supplied config):
>
> git checkout stable/linux-5.4.y
> git revert 3910babeac1ab031f4e178042cbd1af9a9a0ec51 # revert my patch
> make drivers/net/wireless/mediatek/mt76/mt7615/mac.o
>
> and it still outputs the same error.
>
> The FIELD_GET() call was added in bf92e76851009 and seems to have been
> broken from the start as far as I can tell? If I checkout bf92e76851009^
> then it builds, if I checkout bf92e76851009 then it fails.
>
> Should we just redefine to_rssi() as a macro so it actually passes the
> field as a literal/constant?
Ah, there is a mainline patch that fixes this, doing exactly that:
commit f53300fdaa84dc02f96ab9446b5bac4d20016c43
Author: Pablo Greco <pgreco(a)centosproject.org>
Date: Sun Dec 1 15:17:10 2019 -0300
mt76: mt7615: Fix build with older compilers
[...]
-static inline s8 to_rssi(u32 field, u32 rxv)
-{
- return (FIELD_GET(field, rxv) - 220) / 2;
-}
+#define to_rssi(field, rxv) ((FIELD_GET(field, rxv) - 220) / 2)
Greg, Sasha, does it make sense to pick that for 5.4 (as it doesn't seem
to be in there) to shut up the kernel test robot?
If so, should we add this to the changelog as well?
>> If you fix the issue, kindly add following tag where applicable
>> | Reported-by: kernel test robot <lkp(a)intel.com>
>> | Closes:
>>
https://lore.kernel.org/oe-kbuild-all/202305210701.TND2uZBJ-lkp@intel.com/
>>
Vegard
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 2efbafb91e12ff5a16cbafb0085e4c10c3fca493
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052235-cut-gulp-ad69@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
2efbafb91e12 ("arm64: Also reset KASAN tag if page is not PG_mte_tagged")
e74a68468062 ("arm64: Reset KASAN tag in copy_highpage with HW tags only")
d77e59a8fccd ("arm64: mte: Lock a page for MTE tag initialisation")
e059853d14ca ("arm64: mte: Fix/clarify the PG_mte_tagged semantics")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2efbafb91e12ff5a16cbafb0085e4c10c3fca493 Mon Sep 17 00:00:00 2001
From: Peter Collingbourne <pcc(a)google.com>
Date: Thu, 20 Apr 2023 14:09:45 -0700
Subject: [PATCH] arm64: Also reset KASAN tag if page is not PG_mte_tagged
Consider the following sequence of events:
1) A page in a PROT_READ|PROT_WRITE VMA is faulted.
2) Page migration allocates a page with the KASAN allocator,
causing it to receive a non-match-all tag, and uses it
to replace the page faulted in 1.
3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1.
As a result of step 3, we are left with a non-match-all tag for a page
with tags accessible to userspace, which can lead to the same kind of
tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in
copy_highpage with HW tags only") intended to fix.
The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED
is that they cannot have a non-match-all tag. As a result of step 2, the
invariant is broken. This means that the fix in the referenced commit
was incomplete and we also need to reset the tag for pages without
PG_mte_tagged.
Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags")
Cc: <stable(a)vger.kernel.org> # 5.15
Link: https://linux-review.googlesource.com/id/I7409cdd41acbcb215c2a7417c1e50d37b…
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Link: https://lore.kernel.org/r/20230420210945.2313627-1-pcc@google.com
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 4aadcfb01754..a7bb20055ce0 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -21,9 +21,10 @@ void copy_highpage(struct page *to, struct page *from)
copy_page(kto, kfrom);
+ if (kasan_hw_tags_enabled())
+ page_kasan_tag_reset(to);
+
if (system_supports_mte() && page_mte_tagged(from)) {
- if (kasan_hw_tags_enabled())
- page_kasan_tag_reset(to);
/* It's a new page, shouldn't have been tagged yet */
WARN_ON_ONCE(!try_page_mte_tagging(to));
mte_copy_page_tags(kto, kfrom);
Hello,
I would like to request the backport of the commit below to address a
kernel panic in ocfs2 that was identified by Valentin Vidić in this
thread:
https://lore.kernel.org/linux-security-module/20230401214151.1243189-1-vvid…
While Valentin provides his own patch in the original message, the
preferred patch is one that went up to Linus during the last merge
window; Valentin has tested the patch and confirmed that it resolved
the reported problem.
commit de3004c874e740304cc4f4a83d6200acb511bbda
Author: Roberto Sassu <roberto.sassu(a)huawei.com>
Date: Tue Mar 14 09:17:16 2023 +0100
ocfs2: Switch to security_inode_init_security()
In preparation for removing security_old_inode_init_security(), switch to
security_inode_init_security().
Extend the existing ocfs2_initxattrs() to take the
ocfs2_security_xattr_info structure from fs_info, and populate the
name/value/len triple with the first xattr provided by LSMs.
As fs_info was not used before, ocfs2_initxattrs() can now handle the case
of replicating the behavior of security_old_inode_init_security(), i.e.
just obtaining the xattr, in addition to setting all xattrs provided by
LSMs.
Supporting multiple xattrs is not currently supported where
security_old_inode_init_security() was called (mknod, symlink), as it
requires non-trivial changes that can be done at a later time. Like for
reiserfs, even if EVM is invoked, it will not provide an xattr (if it is
not the first to set it, its xattr will be discarded; if it is the first,
it does not have xattrs to calculate the HMAC on).
Finally, since security_inode_init_security(), unlike
security_old_inode_init_security(), returns zero instead of -EOPNOTSUPP if
no xattrs were provided by LSMs or if inodes are private, additionally
check in ocfs2_init_security_get() if the xattr name is set.
If not, act as if security_old_inode_init_security() returned -EOPNOTSUPP,
and set si->enable to zero to notify to the functions following
ocfs2_init_security_get() that no xattrs are available.
Signed-off-by: Roberto Sassu <roberto.sassu(a)huawei.com>
Reviewed-by: Casey Schaufler <casey(a)schaufler-ca.com>
Acked-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Reviewed-by: Mimi Zohar <zohar(a)linux.ibm.com>
Signed-off-by: Paul Moore <paul(a)paul-moore.com>
--
paul-moore.com
When removing a rule with an objref expression and the object it references in
the same batch, it will return EBUSY. The backported commit has a Fixes line
for one of the commits recently backported.
Pablo Neira Ayuso (1):
netfilter: nf_tables: bogus EBUSY in helper removal from transaction
net/netfilter/nft_objref.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
--
2.34.1
Dear User,
Congratulations...
Your email won you a Cash prize of $600,000.00 in the just concluded
GOOGLE AWARD PROMO Held on 24th of May. 2023. Your Ref No:
(GOOLGXQW1563), For claim Email us your Name,Address,Occupation,and
Phone number for more details.
Note: All winnings MUST be claimed in 2 weeks otherwise all winnings
will be returned as unclaimed funds.
Mr. George Harris
Promo Co-coordinator
From: Robin Chen <robin.chen(a)amd.com>
[Why]
This is the fix for the defect of commit ab144f0b4ad6
("drm/amd/display: Allow individual control of eDP hotplug support").
[How]
To revise the default eDP hotplug setting and use the enum to git rid
of the magic number for different options.
Fixes: ab144f0b4ad6 ("drm/amd/display: Allow individual control of eDP hotplug support")
Cc: stable(a)vger.kernel.org
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu(a)amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo(a)amd.com>
Signed-off-by: Robin Chen <robin.chen(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit eeefe7c4820b6baa0462a8b723ea0a3b5846ccae)
Hand modified for missing file rename changes and symbol moves in 6.1.y.
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
This will help some unhandled interrupts that are related to MST
and eDP use.
drivers/gpu/drm/amd/display/dc/core/dc_link.c | 9 +++++++--
drivers/gpu/drm/amd/display/dc/dc_types.h | 6 ++++++
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 6299130663a3..5d53e54ebe90 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -1634,14 +1634,18 @@ static bool dc_link_construct_legacy(struct dc_link *link,
link->irq_source_hpd = DC_IRQ_SOURCE_INVALID;
switch (link->dc->config.allow_edp_hotplug_detection) {
- case 1: // only the 1st eDP handles hotplug
+ case HPD_EN_FOR_ALL_EDP:
+ link->irq_source_hpd_rx =
+ dal_irq_get_rx_source(link->hpd_gpio);
+ break;
+ case HPD_EN_FOR_PRIMARY_EDP_ONLY:
if (link->link_index == 0)
link->irq_source_hpd_rx =
dal_irq_get_rx_source(link->hpd_gpio);
else
link->irq_source_hpd = DC_IRQ_SOURCE_INVALID;
break;
- case 2: // only the 2nd eDP handles hotplug
+ case HPD_EN_FOR_SECONDARY_EDP_ONLY:
if (link->link_index == 1)
link->irq_source_hpd_rx =
dal_irq_get_rx_source(link->hpd_gpio);
@@ -1649,6 +1653,7 @@ static bool dc_link_construct_legacy(struct dc_link *link,
link->irq_source_hpd = DC_IRQ_SOURCE_INVALID;
break;
default:
+ link->irq_source_hpd = DC_IRQ_SOURCE_INVALID;
break;
}
}
diff --git a/drivers/gpu/drm/amd/display/dc/dc_types.h b/drivers/gpu/drm/amd/display/dc/dc_types.h
index ad9041472cca..6050a3469a57 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_types.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_types.h
@@ -993,4 +993,10 @@ struct display_endpoint_id {
enum display_endpoint_type ep_type;
};
+enum dc_hpd_enable_select {
+ HPD_EN_FOR_ALL_EDP = 0,
+ HPD_EN_FOR_PRIMARY_EDP_ONLY,
+ HPD_EN_FOR_SECONDARY_EDP_ONLY,
+};
+
#endif /* DC_TYPES_H_ */
--
2.34.1
Hi,
I would like to request the commit below to be applied to the 6.1-stable tree:
91e87045a5ef ("net: dsa: mv88e6xxx: Add RGMII delay to 88E6320")
Without this commit, there is a failure to retrieve an IP address via DHCP.
Thanks,
Fabio Estevam
The patch titled
Subject: mm/gup_test: fix ioctl fail for compat task
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-gup_test-fix-ioctl-fail-for-compat-task.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Haibo Li <haibo.li(a)mediatek.com>
Subject: mm/gup_test: fix ioctl fail for compat task
Date: Fri, 26 May 2023 10:21:25 +0800
When tools/testing/selftests/mm/gup_test.c is compiled as 32bit, then run
on arm64 kernel, it reports "ioctl: Inappropriate ioctl for device".
Fix it by filling compat_ioctl in gup_test_fops
Link: https://lkml.kernel.org/r/20230526022125.175728-1-haibo.li@mediatek.com
Signed-off-by: Haibo Li <haibo.li(a)mediatek.com>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
Cc: Matthias Brugger <matthias.bgg(a)gmail.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup_test.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/gup_test.c~mm-gup_test-fix-ioctl-fail-for-compat-task
+++ a/mm/gup_test.c
@@ -381,6 +381,7 @@ static int gup_test_release(struct inode
static const struct file_operations gup_test_fops = {
.open = nonseekable_open,
.unlocked_ioctl = gup_test_ioctl,
+ .compat_ioctl = compat_ptr_ioctl,
.release = gup_test_release,
};
_
Patches currently in -mm which might be from haibo.li(a)mediatek.com are
mm-gup_test-fix-ioctl-fail-for-compat-task.patch
The patch titled
Subject: nilfs2: reject devices with insufficient block count
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-reject-devices-with-insufficient-block-count.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: reject devices with insufficient block count
Date: Fri, 26 May 2023 11:13:32 +0900
The current sanity check for nilfs2 geometry information lacks checks for
the number of segments stored in superblocks, so even for device images
that have been destructively truncated or have an unusually high number of
segments, the mount operation may succeed.
This causes out-of-bounds block I/O on file system block reads or log
writes to the segments, the latter in particular causing
"a_ops->writepages" to repeatedly fail, resulting in sync_inodes_sb() to
hang.
Fix this issue by checking the number of segments stored in the superblock
and avoiding mounting devices that can cause out-of-bounds accesses. To
eliminate the possibility of overflow when calculating the number of
blocks required for the device from the number of segments, this also adds
a helper function to calculate the upper bound on the number of segments
and inserts a check using it.
Link: https://lkml.kernel.org/r/20230526021332.3431-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+7d50f1e54a12ba3aeae2(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=7d50f1e54a12ba3aeae2
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/the_nilfs.c | 43 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 42 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/the_nilfs.c~nilfs2-reject-devices-with-insufficient-block-count
+++ a/fs/nilfs2/the_nilfs.c
@@ -405,6 +405,18 @@ unsigned long nilfs_nrsvsegs(struct the_
100));
}
+/**
+ * nilfs_max_segment_count - calculate the maximum number of segments
+ * @nilfs: nilfs object
+ */
+static u64 nilfs_max_segment_count(struct the_nilfs *nilfs)
+{
+ u64 max_count = U64_MAX;
+
+ do_div(max_count, nilfs->ns_blocks_per_segment);
+ return min_t(u64, max_count, ULONG_MAX);
+}
+
void nilfs_set_nsegments(struct the_nilfs *nilfs, unsigned long nsegs)
{
nilfs->ns_nsegments = nsegs;
@@ -414,6 +426,8 @@ void nilfs_set_nsegments(struct the_nilf
static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
struct nilfs_super_block *sbp)
{
+ u64 nsegments, nblocks;
+
if (le32_to_cpu(sbp->s_rev_level) < NILFS_MIN_SUPP_REV) {
nilfs_err(nilfs->ns_sb,
"unsupported revision (superblock rev.=%d.%d, current rev.=%d.%d). Please check the version of mkfs.nilfs(2).",
@@ -457,7 +471,34 @@ static int nilfs_store_disk_layout(struc
return -EINVAL;
}
- nilfs_set_nsegments(nilfs, le64_to_cpu(sbp->s_nsegments));
+ nsegments = le64_to_cpu(sbp->s_nsegments);
+ if (nsegments > nilfs_max_segment_count(nilfs)) {
+ nilfs_err(nilfs->ns_sb,
+ "segment count %llu exceeds upper limit (%llu segments)",
+ (unsigned long long)nsegments,
+ (unsigned long long)nilfs_max_segment_count(nilfs));
+ return -EINVAL;
+ }
+
+ nblocks = sb_bdev_nr_blocks(nilfs->ns_sb);
+ if (nblocks) {
+ u64 min_block_count = nsegments * nilfs->ns_blocks_per_segment;
+ /*
+ * To avoid failing to mount early device images without a
+ * second superblock, exclude that block count from the
+ * "min_block_count" calculation.
+ */
+
+ if (nblocks < min_block_count) {
+ nilfs_err(nilfs->ns_sb,
+ "total number of segment blocks %llu exceeds device size (%llu blocks)",
+ (unsigned long long)min_block_count,
+ (unsigned long long)nblocks);
+ return -EINVAL;
+ }
+ }
+
+ nilfs_set_nsegments(nilfs, nsegments);
nilfs->ns_crc_seed = le32_to_cpu(sbp->s_crc_seed);
return 0;
}
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-incomplete-buffer-cleanup-in-nilfs_btnode_abort_change_key.patch
nilfs2-fix-possible-out-of-bounds-segment-allocation-in-resize-ioctl.patch
nilfs2-reject-devices-with-insufficient-block-count.patch
Please backport to stable 5.10.y
ee2aacb6f3a9 ("ARM: dts: stm32: fix AV96 board SAI2 pin muxing on
stm32mp15")
Full commit ID
ee2aacb6f3a901a95b1dd68964b69c92cdbbf213
Thank you
On Tue, May 23, 2023 at 11:31 AM Lopez, Jorge A (Security)
<jorge.lopez2(a)hp.com> wrote:
>
> I investigate the compile failure and appears the latest patch reverted the code to an older version.
> The latest code shows the proper implementation and compiling the code does not report any failures.
>
> enum hp_wmi_radio r = (long)data;
>
> instead of
>
> enum hp_wmi_radio r = (enum hp_wmi_radio) data;
Looks like
commit ce95010ef62d ("platform/x86: hp-wmi: Fix cast to smaller
integer type warning")
is the fixup necessary for 5.15.y.
Dear stable kernel maintainers, please consider cherry-picking the
above commit to linux-5.15.y to avoid the new compiler diagnostic
introduced by
commit 6e9b8992b122 ("platform/x86: Move existing HP drivers to a new
hp subdir")
Hans, thanks for the fix. You may need to additionally add Fixes tag
to such commits to help out automation for stable. (or perhaps
ce95010ef62d failed to apply to linux-5.15.y? I think an email gets
sent when that's the case).
>
>
>
> Regards,
>
> Jorge Lopez
>
>
> Regards,
>
> Jorge Lopez
> HP Inc
>
> "Once you stop learning, you start dying"
> Albert Einstein
>
> From: kernel test robot <lkp(a)intel.com>
> Sent: Saturday, May 20, 2023 4:30 PM
> To: Lopez, Jorge A (Security) <jorge.lopez2(a)hp.com>
> Cc: llvm(a)lists.linux.dev; oe-kbuild-all(a)lists.linux.dev; Sasha Levin <sashal(a)kernel.org>; Hans de Goede <hdegoede(a)redhat.com>
> Subject: [linux-stable-rc:queue/5.15 105/106] drivers/platform/x86/hp/hp-wmi.c:342:24: warning: cast to smaller integer type 'enum hp_wmi_radio' from 'void *'
>
> CAUTION: External Email
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git queue/5.15
> head: 632aeb02f8e831197a9a01b1e93cb00b4363be05
> commit: 8d6ed410e942fa6f60e434729e1cbbc9ce0ccd54 [105/106] platform/x86: Move existing HP drivers to a new hp subdir
> config: x86_64-randconfig-x052
> compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/…
> git remote add linux-stable-rc https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> git fetch --no-tags linux-stable-rc queue/5.15
> git checkout 8d6ed410e942fa6f60e434729e1cbbc9ce0ccd54
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/platform/x86/hp/
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <mailto:lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202305210504.yw7qgOom-lkp@intel.com
>
> All warnings (new ones prefixed by >>):
>
> >> drivers/platform/x86/hp/hp-wmi.c:342:24: warning: cast to smaller integer type 'enum hp_wmi_radio' from 'void *' [-Wvoid-pointer-to-enum-cast]
> enum hp_wmi_radio r = (enum hp_wmi_radio) data;
> ^~~~~~~~~~~~~~~~~~~~~~~~
> 1 warning generated.
>
>
> vim +342 drivers/platform/x86/hp/hp-wmi.c
>
> f82bdd0d77b6bf drivers/platform/x86/hp-wmi.c Kyle Evans 2014-06-09 339
> 19d337dff95cbf drivers/platform/x86/hp-wmi.c Johannes Berg 2009-06-02 340 static int hp_wmi_set_block(void *data, bool blocked)
> 62ec30d45ecbb8 drivers/misc/hp-wmi.c Matthew Garrett 2008-07-25 341 {
> e5fbba85a7acc2 drivers/platform/x86/hp-wmi.c Alan Jenkins 2009-07-21 @342 enum hp_wmi_radio r = (enum hp_wmi_radio) data;
> e5fbba85a7acc2 drivers/platform/x86/hp-wmi.c Alan Jenkins 2009-07-21 343 int query = BIT(r + 8) | ((!blocked) << r);
> 6d96e00cef3503 drivers/platform/x86/hp-wmi.c Thomas Renninger 2010-05-21 344 int ret;
> 62ec30d45ecbb8 drivers/misc/hp-wmi.c Matthew Garrett 2008-07-25 345
> d8193cff33906e drivers/platform/x86/hp-wmi.c Darren Hart (VMware 2017-04-19 346) ret = hp_wmi_perform_query(HPWMI_WIRELESS_QUERY, HPWMI_WRITE,
> c3021ea1beeeb1 drivers/platform/x86/hp-wmi.c Anssi Hannula 2011-02-20 347 &query, sizeof(query), 0);
> 527376c89caf59 drivers/platform/x86/hp-wmi.c Darren Hart (VMware 2017-04-19 348)
> 527376c89caf59 drivers/platform/x86/hp-wmi.c Darren Hart (VMware 2017-04-19 349) return ret <= 0 ? ret : -EINVAL;
> 62ec30d45ecbb8 drivers/misc/hp-wmi.c Matthew Garrett 2008-07-25 350 }
> 62ec30d45ecbb8 drivers/misc/hp-wmi.c Matthew Garrett 2008-07-25 351
>
> :::::: The code at line 342 was first introduced by commit
> :::::: e5fbba85a7acc2626d4fe14501816811d702f3e9 hp-wmi: improve rfkill support
>
> :::::: TO: Alan Jenkins <mailto:alan-jenkins@tuffmail.co.uk>
> :::::: CC: Len Brown <mailto:len.brown@intel.com>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
>
--
Thanks,
~Nick Desaulniers
I'm announcing the release of the 5.15.104 kernel.
All users of the 5.15 kernel series must upgrade.
The updated 5.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.15.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/filesystems/vfs.rst | 2
Makefile | 2
arch/riscv/include/asm/mmu.h | 2
arch/riscv/include/asm/tlbflush.h | 18
arch/riscv/mm/context.c | 40 -
arch/riscv/mm/tlbflush.c | 28 -
arch/s390/boot/ipl_report.c | 8
arch/s390/pci/pci.c | 16
arch/s390/pci/pci_bus.c | 12
arch/s390/pci/pci_bus.h | 3
arch/x86/kernel/cpu/mce/core.c | 1
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 7
arch/x86/kernel/cpu/resctrl/internal.h | 1
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 25 +
arch/x86/kvm/vmx/nested.c | 10
arch/x86/mm/mem_encrypt_identity.c | 3
drivers/block/loop.c | 25 -
drivers/block/null_blk/main.c | 6
drivers/block/sunvdc.c | 2
drivers/clk/Kconfig | 2
drivers/cpuidle/cpuidle-psci-domain.c | 3
drivers/firmware/xilinx/zynqmp.c | 2
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 9
drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 5
drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 43 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 9
drivers/gpu/drm/i915/display/intel_display_types.h | 2
drivers/gpu/drm/i915/display/intel_psr.c | 207 +++++++---
drivers/gpu/drm/i915/gt/intel_ring.c | 2
drivers/gpu/drm/i915/i915_active.c | 24 -
drivers/gpu/drm/meson/meson_vpp.c | 2
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2
drivers/gpu/drm/sun4i/sun4i_drv.c | 6
drivers/hid/hid-core.c | 18
drivers/hid/uhid.c | 1
drivers/hwmon/adt7475.c | 8
drivers/hwmon/ina3221.c | 2
drivers/hwmon/ltc2992.c | 1
drivers/hwmon/pmbus/adm1266.c | 1
drivers/hwmon/pmbus/ucd9000.c | 75 +++
drivers/hwmon/tmp513.c | 2
drivers/hwmon/xgene-hwmon.c | 1
drivers/interconnect/core.c | 4
drivers/interconnect/samsung/exynos.c | 6
drivers/media/i2c/m5mols/m5mols_core.c | 2
drivers/mmc/host/atmel-mci.c | 3
drivers/mmc/host/sdhci_am654.c | 2
drivers/net/bonding/bond_main.c | 23 -
drivers/net/dsa/mt7530.c | 64 +--
drivers/net/dsa/mv88e6xxx/chip.c | 16
drivers/net/ethernet/intel/i40e/i40e_main.c | 1
drivers/net/ethernet/intel/ice/ice.h | 14
drivers/net/ethernet/intel/ice/ice_main.c | 19
drivers/net/ethernet/intel/ice/ice_xsk.c | 4
drivers/net/ethernet/qlogic/qed/qed_dev.c | 5
drivers/net/ethernet/qlogic/qed/qed_mng_tlv.c | 2
drivers/net/ethernet/renesas/ravb_main.c | 12
drivers/net/ethernet/renesas/sh_eth.c | 12
drivers/net/ethernet/sun/ldmvsw.c | 3
drivers/net/ethernet/sun/sunvnet.c | 3
drivers/net/ipvlan/ipvlan_l3s.c | 1
drivers/net/phy/nxp-c45-tja11xx.c | 2
drivers/net/phy/smsc.c | 5
drivers/net/usb/smsc75xx.c | 7
drivers/nfc/pn533/usb.c | 1
drivers/nfc/st-nci/ndlc.c | 6
drivers/nvme/host/core.c | 28 -
drivers/nvme/host/pci.c | 2
drivers/nvme/target/core.c | 4
drivers/pci/bus.c | 21 +
drivers/pci/pci-driver.c | 4
drivers/pci/pci.c | 57 +-
drivers/pci/pci.h | 16
drivers/pci/pcie/dpc.c | 4
drivers/scsi/hosts.c | 3
drivers/scsi/mpt3sas/mpt3sas_transport.c | 14
drivers/tty/serial/8250/8250_em.c | 4
drivers/tty/serial/8250/8250_fsl.c | 4
drivers/tty/serial/fsl_lpuart.c | 12
drivers/vdpa/vdpa_sim/vdpa_sim.c | 13
drivers/video/fbdev/stifb.c | 27 +
fs/cifs/smb2inode.c | 31 +
fs/cifs/transport.c | 21 -
fs/ext4/inode.c | 18
fs/ext4/namei.c | 4
fs/ext4/super.c | 7
fs/ext4/xattr.c | 11
fs/jffs2/file.c | 15
include/drm/drm_bridge.h | 4
include/linux/hid.h | 3
include/linux/netdevice.h | 6
include/linux/pci.h | 1
include/linux/sh_intc.h | 5
include/linux/tracepoint.h | 15
io_uring/io_uring.c | 4
kernel/events/core.c | 2
kernel/trace/ftrace.c | 3
kernel/trace/trace.c | 2
kernel/trace/trace_events_hist.c | 3
kernel/trace/trace_hwlat.c | 3
mm/huge_memory.c | 6
net/9p/client.c | 2
net/ipv4/fib_frontend.c | 3
net/ipv4/ip_tunnel.c | 12
net/ipv4/tcp_output.c | 2
net/ipv6/ip6_tunnel.c | 4
net/iucv/iucv.c | 2
net/mptcp/pm_netlink.c | 16
net/mptcp/subflow.c | 12
net/netfilter/nft_masq.c | 2
net/netfilter/nft_nat.c | 2
net/netfilter/nft_redir.c | 4
net/smc/smc_cdc.c | 3
net/smc/smc_core.c | 2
net/xfrm/xfrm_state.c | 3
scripts/kconfig/confdata.c | 6
sound/hda/intel-dsp-config.c | 9
sound/pci/hda/hda_intel.c | 5
sound/pci/hda/patch_realtek.c | 1
tools/testing/selftests/net/devlink_port_split.py | 36 +
120 files changed, 919 insertions(+), 439 deletions(-)
Alex Hung (1):
drm/amd/display: fix shift-out-of-bounds in CalculateVMAndRowBytes
Alexandra Winter (1):
net/iucv: Fix size of interrupt data
Arınç ÜNAL (2):
net: dsa: mt7530: remove now incorrect comment regarding port 5
net: dsa: mt7530: set PLL frequency and trgmii only when trgmii is used
Baokun Li (3):
ext4: fail ext4_iget if special inode unallocated
ext4: update s_journal_inum if it changes after journal replay
ext4: fix task hung in ext4_xattr_delete_inode
Bard Liao (1):
ALSA: hda: intel-dsp-config: add MTL PCI id
Bart Van Assche (2):
scsi: core: Fix a procfs host directory removal regression
loop: Fix use-after-free issues
Biju Das (1):
serial: 8250_em: Fix UART port type
Bjorn Helgaas (1):
ALSA: hda: Match only Intel devices with CONTROLLER_IN_GPU()
Breno Leitao (1):
tcp: tcp_make_synack() can be called from process context
Budimir Markovic (1):
perf: Fix check before add_event_to_groups() in perf_group_detach()
Błażej Szczygieł (1):
drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume
Chen Zhongjin (1):
ftrace: Fix invalid address access in lookup_rec() when index is 0
Christian Hewitt (1):
drm/meson: fix 1px pink line on GXM when scaling video overlay
D. Wythe (1):
net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()
Damien Le Moal (2):
block: null_blk: Fix handling of fake timeout request
nvmet: avoid potential UAF in nvmet_req_complete()
Daniil Tatianin (2):
qed/qed_dev: guard against a possible division by zero
qed/qed_mng_tlv: correctly zero out ->min instead of ->hour
Dave Ertman (1):
ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
David Hildenbrand (1):
mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage
Dmitry Osipenko (2):
drm/panfrost: Don't sync rpm suspension after mmu flushing
drm/shmem-helper: Remove another errant put in error path
Elmer Miroslav Mosher Golovin (1):
nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000
Eric Dumazet (1):
net: tunnels: annotate lockless accesses to dev->needed_headroom
Eric Van Hensbergen (1):
net/9p: fix bug in client create for .L
Eugenio Pérez (2):
vdpa_sim: not reset state in vdpasim_queue_ready
vdpa_sim: set last_used_idx as last_avail_idx in vdpasim_queue_ready
Fedor Pchelkin (2):
nfc: pn533: initialize struct pn533_out_arg properly
io_uring: avoid null-ptr-deref in io_arm_poll_handler
Francesco Dolcini (1):
mmc: sdhci_am654: lower power-on failed message severity
Geliang Tang (1):
mptcp: add ro_after_init for tcp{,v6}_prot_override
Glenn Washburn (1):
docs: Correct missing "d_" prefix for dentry_operations member d_weak_revalidate
Greg Kroah-Hartman (1):
Linux 5.15.104
Guo Ren (1):
riscv: asid: Fixup stale TLB entry cause application crash
Hamidreza H. Fard (1):
ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro
Heiner Kallweit (1):
net: phy: smsc: bail out in lan87xx_read_status if genphy_read_status fails
Helge Deller (1):
fbdev: stifb: Provide valid pixelclock and add fb_check_var() checks
Herbert Xu (1):
xfrm: Allow transport-mode states with AF_UNSPEC selector
Ido Schimmel (1):
ipv4: Fix incorrect table ID in IOCTL path
Ivan Vecera (1):
i40e: Fix kernel crash during reboot when adapter is in recovery mode
Janusz Krzysztofik (1):
drm/i915/active: Fix misuse of non-idle barriers as fence trackers
Jeremy Sowden (4):
netfilter: nft_nat: correct length for loading protocol registers
netfilter: nft_masq: correct length for loading protocol registers
netfilter: nft_redir: correct length for loading protocol registers
netfilter: nft_redir: correct value of inet type `.maxattrs`
Jianguo Wu (1):
ipvlan: Make skb->skb_iif track skb->dev for l3s mode
Johan Hovold (4):
serial: 8250_fsl: fix handle_irq locking
interconnect: fix mem leak when freeing nodes
interconnect: exynos: fix node leak in probe PM QoS error path
drm/sun4i: fix missing component unbind on bind errors
John Harrison (1):
drm/i915: Don't use stolen memory for ring buffers with LLC
José Roberto de Souza (3):
drm/i915/display: Workaround cursor left overs with PSR2 selective fetch enabled
drm/i915/display/psr: Use drm damage helpers to calculate plane damaged area
drm/i915/display/psr: Handle plane and pipe restrictions at every page flip
Jouni Högander (1):
drm/i915/psr: Use calculated io and fast wake lines
Jurica Vukadin (1):
kconfig: Update config changed flag before calling callback
Krzysztof Kozlowski (1):
hwmon: tmp512: drop of_match_ptr for ID table
Lars-Peter Clausen (3):
hwmon: (ucd90320) Add minimum delay between bus accesses
hwmon: (adm1266) Set `can_sleep` flag for GPIO chip
hwmon: (ltc2992) Set `can_sleep` flag for GPIO chip
Lee Jones (2):
HID: core: Provide new max_buffer_size attribute to over-ride the default
HID: uhid: Over-ride the default maximum data buffer value with our own
Liang He (2):
block: sunvdc: add check for mdesc_grab() returning NULL
ethernet: sun: add check for the mdesc_grab()
Linus Torvalds (1):
media: m5mols: fix off-by-one loop termination error
Liu Ying (1):
drm/bridge: Fix returned array size name for atomic_get_input_bus_fmts kdoc
Lukas Wunner (2):
PCI: Unify delay handling for reset and resume
PCI/DPC: Await readiness of secondary bus after reset
Maciej Fijalkowski (1):
ice: xsk: disable txq irq before flushing hw
Marcus Folkesson (1):
hwmon: (ina3221) return prober error code
Matthieu Baerts (1):
mptcp: avoid setting TCP_CLOSE state twice
Michael Karcher (1):
sh: intc: Avoid spurious sizeof-pointer-div warning
Ming Lei (1):
nvme: fix handling single range discard request
Nikita Zhandarovich (1):
x86/mm: Fix use of uninitialized buffer in sme_enable()
Niklas Schnelle (1):
PCI: s390: Fix use-after-free of PCI resources with per-function hotplug
Nikolay Aleksandrov (2):
bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change
bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails
Paolo Abeni (2):
mptcp: fix possible deadlock in subflow_error_report
mptcp: fix lockdep false positive in mptcp_pm_nl_create_listen_socket()
Paolo Bonzini (1):
KVM: nVMX: add missing consistency checks for CR0 and CR4
Po-Hsu Lin (1):
selftests: net: devlink_port_split.py: skip test if no suitable device available
Qu Huang (1):
drm/amdkfd: Fix an illegal memory access
Radu Pirea (OSS) (1):
net: phy: nxp-c45-tja11xx: fix MII_BASIC_CONFIG_REV bit
Randy Dunlap (1):
clk: HI655X: select REGMAP instead of depending on it
Roman Gushchin (1):
firmware: xilinx: don't make a sleepable memory allocation from an atomic context
Sergey Matyukevich (1):
Revert "riscv: mm: notify remote harts about mmu cache updates"
Shawn Guo (1):
cpuidle: psci: Iterate backwards over list in psci_pd_remove()
Shawn Wang (1):
x86/resctrl: Clear staged_config[] before and after it is used
Sherry Sun (1):
tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
Steven Rostedt (Google) (2):
tracing: Check field value in hist_field_name()
tracing: Make tracepoint lockdep check actually test something
Sung-hun Kim (1):
tracing: Make splice_read available again
Sven Schnelle (1):
s390/ipl: add missing intersection check to ipl_report handling
Szymon Heidrich (2):
net: usb: smsc75xx: Limit packet length to skb->len
net: usb: smsc75xx: Move packet length check to prevent kernel panic in skb_pull
Tero Kristo (1):
trace/hwlat: Do not wipe the contents of per-cpu thread data
Theodore Ts'o (1):
ext4: fix possible double unlock when moving a directory
Tobias Schramm (1):
mmc: atmel-mci: fix race between stop command and start of next command
Tom Rix (1):
drm/i915/display: clean up comments
Tony O'Brien (2):
hwmon: (adt7475) Display smoothing attributes in correct order
hwmon: (adt7475) Fix masking of hysteresis registers
Vladimir Oltean (1):
net: dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290
Volker Lendecke (1):
cifs: Fix smb2_set_path_size()
Wenchao Hao (1):
scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()
Wenjia Zhang (1):
net/smc: fix deadlock triggered by cancel_delayed_work_syn()
Wolfram Sang (2):
ravb: avoid PHY being resumed when interface is not up
sh_eth: avoid PHY being resumed when interface is not up
Yazen Ghannam (1):
x86/mce: Make sure logged MCEs are processed after sysfs update
Yifei Liu (1):
jffs2: correct logic when creating a hole in jffs2_write_begin
Zhang Xiaoxu (1):
cifs: Move the in_send statistic to __smb_send_rqst()
Zheng Wang (2):
nfc: st-nci: Fix use after free bug in ndlc_remove due to race condition
hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition
Hi,
The following commits are required for stable 6.1.y kernel to fix
suspend/resume failure with AMD Navi3x dGPU.
1e7bbdba68ba "drm/amd/amdgpu: update mes11 api def"
a6b3b618c0f7 "drm/amdgpu/mes11: enable reg active poll"
Regards,
Richard
Hi,
The following commit helps fix the watchdog timer on various AMD SoCs.
Please backport it to 5.4.y and later.
4eda19cc8a29 ("watchdog: sp5100_tco: Immediately trigger upon starting.")
Thanks!
Touching privately mapped GPA that is not properly converted to private
with MapGPA and accepted leads to unrecoverable exit to VMM.
load_unaligned_zeropad() can touch memory that is not owned by the
caller, but just happened to next after the owned memory.
This load_unaligned_zeropad() behaviour makes it important when kernel
asks VMM to convert a GPA from shared to private or back. Kernel must
never have a page mapped into direct mapping (and aliases) as private
when the GPA is already converted to shared or when GPA is not yet
converted to private.
guest.enc_status_change_prepare() called before adjusting direct mapping
and therefore it is responsible for converting the memory to private.
guest.enc_tlb_flush_required() called after adjusting direct mapping and
it converts the memory to shared.
It is okay to have a shared mapping of memory that is not converted
properly. handle_mmio() knows how to deal with load_unaligned_zeropad()
stepping on it.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Fixes: 7dbde7631629 ("x86/mm/cpa: Add support for TDX shared memory")
Cc: stable(a)vger.kernel.org
---
arch/x86/coco/tdx/tdx.c | 56 ++++++++++++++++++++++++++++++++++++++---
1 file changed, 53 insertions(+), 3 deletions(-)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index e146b599260f..84525df750d4 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -840,6 +840,30 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
return true;
}
+static bool tdx_enc_status_change_prepare(unsigned long vaddr, int numpages,
+ bool enc)
+{
+ /*
+ * Only handle shared->private conversion here.
+ * See the comment in tdx_early_init().
+ */
+ if (enc)
+ return tdx_enc_status_changed(vaddr, numpages, enc);
+ return true;
+}
+
+static bool tdx_enc_status_change_finish(unsigned long vaddr, int numpages,
+ bool enc)
+{
+ /*
+ * Only handle private->shared conversion here.
+ * See the comment in tdx_early_init().
+ */
+ if (!enc)
+ return tdx_enc_status_changed(vaddr, numpages, enc);
+ return true;
+}
+
void __init tdx_early_init(void)
{
u64 cc_mask;
@@ -867,9 +891,35 @@ void __init tdx_early_init(void)
*/
physical_mask &= cc_mask - 1;
- x86_platform.guest.enc_cache_flush_required = tdx_cache_flush_required;
- x86_platform.guest.enc_tlb_flush_required = tdx_tlb_flush_required;
- x86_platform.guest.enc_status_change_finish = tdx_enc_status_changed;
+ /*
+ * Touching privately mapped GPA that is not properly converted to
+ * private with MapGPA and accepted leads to unrecoverable exit
+ * to VMM.
+ *
+ * load_unaligned_zeropad() can touch memory that is not owned by
+ * the caller, but just happened to next after the owned memory.
+ * This load_unaligned_zeropad() behaviour makes it important when
+ * kernel asks VMM to convert a GPA from shared to private or back.
+ * Kernel must never have a page mapped into direct mapping (and
+ * aliases) as private when the GPA is already converted to shared or
+ * when GPA is not yet converted to private.
+ *
+ * guest.enc_status_change_prepare() called before adjusting direct
+ * mapping and therefore it is responsible for converting the memory
+ * to private.
+ *
+ * guest.enc_tlb_flush_required() called after adjusting direct mapping
+ * and it converts the memory to shared.
+ *
+ * It is okay to have a shared mapping of memory that is not converted
+ * properly. handle_mmio() knows how to deal with load_unaligned_zeropad()
+ * stepping on it.
+ */
+ x86_platform.guest.enc_status_change_prepare = tdx_enc_status_change_prepare;
+ x86_platform.guest.enc_status_change_finish = tdx_enc_status_change_finish;
+
+ x86_platform.guest.enc_cache_flush_required = tdx_cache_flush_required;
+ x86_platform.guest.enc_tlb_flush_required = tdx_tlb_flush_required;
pr_info("Guest detected\n");
}
--
2.39.3
v2 -> v3:
- Rephrase "Write to VIDC_CTRL_INIT after unmasking interrupts" commit msg
- Drop "Remap bufreq fields on HFI6XX"
- Rephrase "Introduce VPU version distinction" commit msg
- Better explain "Leave a clue for homegrown porters"
- Drop incorrect fixes tags/rephrase version check alternations
- Drop AR50L/IRIS1 from if-conditions, they'll be introduced separately
- pick up tags
- rebase on next-20230517 (no effective changes)
v2: https://lore.kernel.org/r/20230228-topic-venus-v2-0-d95d14949c79@linaro.org
v1 -> v2:
- Move "Write to VIDC_CTRL_INIT after unmasking interrupts" up and add
a Fixes tag & Cc stable
- Reword the comment in "Correct IS_V6() checks"
- Move up "media: venus: Remap bufreq fields on HFI6XX", add Fixes and
Cc stable
- Use better English in "Use newly-introduced hfi_buffer_requirements
accessors" commit message
- Mention "Restrict writing SCIACMDARG3 to Venus V1/V2" doesn't seem to
regress SM8250 in the commit message
- Pick up tags (note: I capitalized the R in Dikshita's 'reviewed-by'
and removed one occurrence of random '**' to make sure review tools
like b4 don't go crazy)
- Handle AR50_LITE in "Assign registers based on VPU version"
- Drop /* VPUn */ comments, they're invalid as explained by Vikash
- Take a different approach to the sys_idle problem in patch 1
v1: https://lore.kernel.org/r/20230228-topic-venus-v1-0-58c2c88384e9@linaro.org
Currently upstream assumes all (well, almost all - see 7280 or CrOS
specific checks) Venus implementations using the same version of the
Hardware Firmware Interface can be treated the same way. This is
however not the case.
This series tries to introduce the groundwork to start differentiating
them based on the VPU (Video Processing Unit) hardware type, fixes a
couple of issues that were an effect of that generalized assumption
and lays the foundation for supporting 8150 (IRIS1) and SM6115/QCM2290
(AR50 Lite), which will hopefully come soon.
Tested on 8250, but pretty please test it on your boards too!
Signed-off-by: Konrad Dybcio <konrad.dybcio(a)linaro.org>
---
Konrad Dybcio (17):
media: venus: hfi_venus: Only consider sys_idle_indicator on V1
media: venus: hfi_venus: Write to VIDC_CTRL_INIT after unmasking interrupts
media: venus: Introduce VPU version distinction
media: venus: Add vpu_version to most SoCs
media: venus: firmware: Leave a clue about obtaining CP VARs
media: venus: hfi_venus: Sanitize venus_boot_core() per-VPU-version
media: venus: core: Assign registers based on VPU version
media: venus: hfi_venus: Sanitize venus_halt_axi() per-VPU-version
media: venus: hfi_venus: Sanitize venus_isr() per-VPU-version
media: venus: hfi_venus: Sanitize venus_cpu_and_video_core_idle() per-VPU-version
media: venus: hfi_venus: Sanitize venus_cpu_idle_and_pc_ready() per-VPU-version
media: venus: firmware: Correct IS_V6() checks
media: venus: hfi_platform: Check vpu_version instead of device compatible
media: venus: vdec: Sanitize vdec_set_work_route() per-VPU-version
media: venus: Introduce accessors for remapped hfi_buffer_reqs members
media: venus: Use newly-introduced hfi_buffer_requirements accessors
media: venus: hfi_venus: Restrict writing SCIACMDARG3 to Venus V1/V2
drivers/media/platform/qcom/venus/core.c | 7 ++-
drivers/media/platform/qcom/venus/core.h | 15 ++++++
drivers/media/platform/qcom/venus/firmware.c | 22 ++++++--
drivers/media/platform/qcom/venus/helpers.c | 7 +--
drivers/media/platform/qcom/venus/hfi_helper.h | 61 +++++++++++++++++++---
drivers/media/platform/qcom/venus/hfi_msgs.c | 2 +-
.../media/platform/qcom/venus/hfi_plat_bufs_v6.c | 22 ++++----
drivers/media/platform/qcom/venus/hfi_platform.c | 2 +-
drivers/media/platform/qcom/venus/hfi_venus.c | 45 ++++++++--------
drivers/media/platform/qcom/venus/vdec.c | 10 ++--
drivers/media/platform/qcom/venus/vdec_ctrls.c | 2 +-
drivers/media/platform/qcom/venus/venc.c | 4 +-
drivers/media/platform/qcom/venus/venc_ctrls.c | 2 +-
13 files changed, 139 insertions(+), 62 deletions(-)
---
base-commit: 065efa589871e93b6610c70c1e9de274ef1f1ba2
change-id: 20230228-topic-venus-70ea3bc76688
Best regards,
--
Konrad Dybcio <konrad.dybcio(a)linaro.org>
The patch titled
Subject: nilfs2: fix possible out-of-bounds segment allocation in resize ioctl
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-fix-possible-out-of-bounds-segment-allocation-in-resize-ioctl.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix possible out-of-bounds segment allocation in resize ioctl
Date: Wed, 24 May 2023 18:43:48 +0900
Syzbot reports that in its stress test for resize ioctl, the log writing
function nilfs_segctor_do_construct hits a WARN_ON in
nilfs_segctor_truncate_segments().
It turned out that there is a problem with the current implementation of
the resize ioctl, which changes the writable range on the device (the
range of allocatable segments) at the end of the resize process.
This order is necessary for file system expansion to avoid corrupting the
superblock at trailing edge. However, in the case of a file system
shrink, if log writes occur after truncating out-of-bounds trailing
segments and before the resize is complete, segments may be allocated from
the truncated space.
The userspace resize tool was fine as it limits the range of allocatable
segments before performing the resize, but it can run into this issue if
the resize ioctl is called alone.
Fix this issue by changing nilfs_sufile_resize() to update the range of
allocatable segments immediately after successful truncation of segment
space in case of file system shrink.
Link: https://lkml.kernel.org/r/20230524094348.3784-1-konishi.ryusuke@gmail.com
Fixes: 4e33f9eab07e ("nilfs2: implement resize ioctl")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+33494cd0df2ec2931851(a)syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/0000000000005434c405fbbafdc5@google.com
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/sufile.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/fs/nilfs2/sufile.c~nilfs2-fix-possible-out-of-bounds-segment-allocation-in-resize-ioctl
+++ a/fs/nilfs2/sufile.c
@@ -779,6 +779,15 @@ int nilfs_sufile_resize(struct inode *su
goto out_header;
sui->ncleansegs -= nsegs - newnsegs;
+
+ /*
+ * If the sufile is successfully truncated, immediately adjust
+ * the segment allocation space while locking the semaphore
+ * "mi_sem" so that nilfs_sufile_alloc() never allocates
+ * segments in the truncated space.
+ */
+ sui->allocmax = newnsegs - 1;
+ sui->allocmin = 0;
}
kaddr = kmap_atomic(header_bh->b_page);
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-incomplete-buffer-cleanup-in-nilfs_btnode_abort_change_key.patch
nilfs2-fix-possible-out-of-bounds-segment-allocation-in-resize-ioctl.patch
As of now, in tce_freemulti_pSeriesLP(), there is no limit on how many TCEs
are passed to H_STUFF_TCE hcall. This was not an issue until now. Newer
firmware releases have started enforcing this requirement.
The interface has been in it's current form since the beginning.
Cc: stable(a)vger.kernel.org
Signed-off-by: Gaurav Batra <gbatra(a)linux.vnet.ibm.com>
Reviewed-by: Brian King <brking(a)linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/iommu.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index c74b71d4733d..f159a195101d 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -306,13 +306,22 @@ static void tce_free_pSeriesLP(unsigned long liobn, long tcenum, long tceshift,
static void tce_freemulti_pSeriesLP(struct iommu_table *tbl, long tcenum, long npages)
{
u64 rc;
+ long rpages = npages;
+ unsigned long limit;
if (!firmware_has_feature(FW_FEATURE_STUFF_TCE))
return tce_free_pSeriesLP(tbl->it_index, tcenum,
tbl->it_page_shift, npages);
- rc = plpar_tce_stuff((u64)tbl->it_index,
- (u64)tcenum << tbl->it_page_shift, 0, npages);
+ do {
+ limit = min_t(unsigned long, rpages, 512);
+
+ rc = plpar_tce_stuff((u64)tbl->it_index,
+ (u64)tcenum << tbl->it_page_shift, 0, limit);
+
+ rpages -= limit;
+ tcenum += limit;
+ } while (rpages > 0 && !rc);
if (rc && printk_ratelimit()) {
printk("tce_freemulti_pSeriesLP: plpar_tce_stuff failed\n");
--
2.39.2 (Apple Git-143)
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: uapi: Fix [GS]_ROUTING ACTIVE flag value
Author: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Mon Apr 24 15:22:37 2023 +0300
The value of the V4L2_SUBDEV_ROUTE_FL_ACTIVE is 1, not 0. Use hexadecimal
numbers as is done elsewhere in the documentation.
Cc: stable(a)vger.kernel.org # for >= v6.3
Fixes: ea73eda50813 ("media: Documentation: Add GS_ROUTING documentation")
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
index 68ca343c3b44..2d6e3bbdd040 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
@@ -122,7 +122,7 @@ for all the route entries and call ``VIDIOC_SUBDEV_G_ROUTING`` again.
:widths: 3 1 4
* - V4L2_SUBDEV_ROUTE_FL_ACTIVE
- - 0
+ - 0x0001
- The route is enabled. Set by applications.
Return Value
Hi,
I notice a regression report on Bugzilla [1]. Quoting from it:
> Original Summary:
> absent both plymouth, and video= on linu lines, vtty[1-6] framebuffers produce vast raster right and bottom borders on the larger resolution of two displays
>
> To reproduce:
> 1-connect two unequal native resolution displays to a Tesla or Firmi GPU
> 2-don't have plymouth in use (I don't ever have it installed, so don't know whether it impacts)
> 3-don't include e.g. video=1440x900@60 directive on Grub's linu lines
> 4-boot Tumbleweed or Fedora 38
> 5-switch to a vtty, e.g. Ctrl-Alt-F3
>
> Actual behavior:
> 1-Both displays utilize the resolution (same pixel grid) of the lower resolution display
> 2-Lower resolution display behaves as expected (light text on black background)
> 3-Higher resolution display uses same pixels as lower resolution display, with light text on black background, leaving right side and bottom raster instead of black
>
> Expected behavior:
> 1-Both displays utilize the resolution (same pixel grid) of the lower resolution display
> 2-Lower resolution display behaves as expected
> 3-Entire higher resolution display's background is black instead of portions in raster
>
> Workaround: add e.g. video=1440x900@60 to Grub's linu lines, which causes both displays to use the same nominal mode on the full display space.
>
> Typical other linu line options:
> noresume consoleblank=0 net.ifnames=0 ipv6.disable=1 preempt=full mitigations=none
>
> My Tesla has HDMI and DVI outputs, tested with 1920x1200 and 1680x1050 displays.
> My Fermi has dual DisplayPort, tested with 2560x1440 and 1680x1050 displays.
> Occurs Tumbleweed with 6.3.2 and 6.2.12 kernel-default, and with 6.2.15 on Fedora 38, and (partially with Tesla, right side only) with 6.2.12 and 6.3.3 on Mageia 9.
> Does not occur with 6.1.12 kernel-default on NVidia, or with AMD Caicos (Terascale2) GPU, or with Intel Eaglelake GPU.
> Tested only on legacy booting (no UEFI support).
> Others might describe what I call "raster" as multicolored snow.
See bugzilla for the full thread and attached dmesg.
Anyway, I'm adding it to regzbot:
#regzbot introduced: v6.1.12..v6.2.12
#regzbot title: vast raster right and bottom borders on larger display (two displays with inequal resolution) unless forcing resolution with video= parameter
Thanks.
--
An old man doll... just what I always wanted! - Clara
Hi,
I notice a regression report on Bugzilla [1]. Quoting from it:
> Linux kernel >= v6.2 no longer boots on Apple's Virtualization.framework (x86_64).
>
> It is reported that the issue is not reproducible on ARM64: https://github.com/lima-vm/lima/issues/1577#issuecomment-1561577694
>
>
> ## Reproduction
> - Checkout the kernel repo, and run `make defconfig bzImage`.
>
> - Create an initrd (see the attached `initrd-example.txt`)
>
> - Transfer the bzImage and initrd to an Intel Mac.
>
> - On Mac, download `RunningLinuxInAVirtualMachine.zip` from https://developer.apple.com/documentation/virtualization/running_linux_in_a… , and build the `LinuxVirtualMachine` binary with Xcode.
> Building this binary with Xcode requires logging in to Apple.
> If you do not like logging in, a third party equivalent such as https://github.com/Code-Hex/vz/blob/v3.0.6/example/linux/main.go can be used.
>
> - Run `LinuxVirtualMachine /tmp/bzImage /tmp/initrd.img`.
> v6.1 successfully boots into the busybox shell.
> v6.2 just hangs before printing something in the console.
>
>
> ## Tested versions
> ```
> v6.1: OK
> ...
> v6.1.0-rc2-00002-g60f2096b59bc (included in v6.2-rc1): OK
> v6.1.0-rc2-00003-g5c62d5aab875 (included in v6.2-rc1): NG <-- This commit caused a regression
> ...
> v6.2-rc1: NG
> ...
> v6.2: NG
> ...
> v6.3.0-rc7-00181-g8e41e0a57566 (included in v6.3): NG <-- Reverts 5c62d5aab875 but still NG
> ...
> v6.3: NG
> v6.4-rc3: NG
> ```
>
> Tested on MacBookPro 2020 (Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz) running macOS 13.4.
>
>
> The issue seems a regression in [5c62d5aab8752e5ee7bfbe75ed6060db1c787f98](https://git.kernel.org/pub/scm/li… "ACPICA: Events: Support fixed PCIe wake event".
>
> This commit was introduced in v6.2-rc1, and apparently reverted in v6.3 ([8e41e0a575664d26bb87e012c39435c4c3914ed9](https://git.kernel.org/pub/scm/li…).
> However, v6.3 and the latest v6.4-rc3 still don't boot.
See bugzilla for the full thread.
Interestingly, this regression still occurs despite the culprit is
reverted in 8e41e0a575664d ("Revert "ACPICA: Events: Support fixed
PCIe wake event""), so this (obviously) isn't wake-on-lan regression,
but rather early boot one.
Also, the reporter can't provide dmesg log (forget to attach serial
console?).
Anyway, I'm adding it to regzbot:
#regzbot introduced: 5c62d5aab8752e https://bugzilla.kernel.org/show_bug.cgi?id=217485
#regzbot title: Linux v6.2+ (x86_64) no longer boots on Apple's Virtualization framework (ACPICA issue)
Thanks.
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=217485
--
An old man doll... just what I always wanted! - Clara
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: uapi: Fix [GS]_ROUTING ACTIVE flag value
Author: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Mon Apr 24 15:22:37 2023 +0300
The value of the V4L2_SUBDEV_ROUTE_FL_ACTIVE is 1, not 0. Use hexadecimal
numbers as is done elsewhere in the documentation.
Cc: stable(a)vger.kernel.org # for >= v6.3
Fixes: ea73eda50813 ("media: Documentation: Add GS_ROUTING documentation")
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Jacopo Mondi <jacopo.mondi(a)ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
index 68ca343c3b44..2d6e3bbdd040 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
@@ -122,7 +122,7 @@ for all the route entries and call ``VIDIOC_SUBDEV_G_ROUTING`` again.
:widths: 3 1 4
* - V4L2_SUBDEV_ROUTE_FL_ACTIVE
- - 0
+ - 0x0001
- The route is enabled. Set by applications.
Return Value
When a directory is moved to a different directory, some filesystems
(udf, ext4, ocfs2, f2fs, and likely gfs2, reiserfs, and others) need to
update their pointer to the parent and this must not race with other
operations on the directory. Lock the directories when they are moved.
Although not all filesystems need this locking, we perform it in
vfs_rename() because getting the lock ordering right is really difficult
and we don't want to expose these locking details to filesystems.
CC: stable(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
.../filesystems/directory-locking.rst | 26 ++++++++++---------
fs/namei.c | 22 ++++++++++------
2 files changed, 28 insertions(+), 20 deletions(-)
diff --git a/Documentation/filesystems/directory-locking.rst b/Documentation/filesystems/directory-locking.rst
index 504ba940c36c..dccd61c7c5c3 100644
--- a/Documentation/filesystems/directory-locking.rst
+++ b/Documentation/filesystems/directory-locking.rst
@@ -22,12 +22,11 @@ exclusive.
3) object removal. Locking rules: caller locks parent, finds victim,
locks victim and calls the method. Locks are exclusive.
-4) rename() that is _not_ cross-directory. Locking rules: caller locks
-the parent and finds source and target. In case of exchange (with
-RENAME_EXCHANGE in flags argument) lock both. In any case,
-if the target already exists, lock it. If the source is a non-directory,
-lock it. If we need to lock both, lock them in inode pointer order.
-Then call the method. All locks are exclusive.
+4) rename() that is _not_ cross-directory. Locking rules: caller locks the
+parent and finds source and target. We lock both (provided they exist). If we
+need to lock two inodes of different type (dir vs non-dir), we lock directory
+first. If we need to lock two inodes of the same type, lock them in inode
+pointer order. Then call the method. All locks are exclusive.
NB: we might get away with locking the source (and target in exchange
case) shared.
@@ -44,15 +43,17 @@ All locks are exclusive.
rules:
* lock the filesystem
- * lock parents in "ancestors first" order.
+ * lock parents in "ancestors first" order. If one is not ancestor of
+ the other, lock them in inode pointer order.
* find source and target.
* if old parent is equal to or is a descendent of target
fail with -ENOTEMPTY
* if new parent is equal to or is a descendent of source
fail with -ELOOP
- * If it's an exchange, lock both the source and the target.
- * If the target exists, lock it. If the source is a non-directory,
- lock it. If we need to lock both, do so in inode pointer order.
+ * Lock both the source and the target provided they exist. If we
+ need to lock two inodes of different type (dir vs non-dir), we lock
+ the directory first. If we need to lock two inodes of the same type,
+ lock them in inode pointer order.
* call the method.
All ->i_rwsem are taken exclusive. Again, we might get away with locking
@@ -66,8 +67,9 @@ If no directory is its own ancestor, the scheme above is deadlock-free.
Proof:
- First of all, at any moment we have a partial ordering of the
- objects - A < B iff A is an ancestor of B.
+ First of all, at any moment we have a linear ordering of the
+ objects - A < B iff (A is an ancestor of B) or (B is not an ancestor
+ of A and ptr(A) < ptr(B)).
That ordering can change. However, the following is true:
diff --git a/fs/namei.c b/fs/namei.c
index 148570aabe74..6a5e26a529e1 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4731,7 +4731,7 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
* sb->s_vfs_rename_mutex. We might be more accurate, but that's another
* story.
* c) we have to lock _four_ objects - parents and victim (if it exists),
- * and source (if it is not a directory).
+ * and source.
* And that - after we got ->i_mutex on parents (until then we don't know
* whether the target exists). Solution: try to be smart with locking
* order for inodes. We rely on the fact that tree topology may change
@@ -4815,10 +4815,16 @@ int vfs_rename(struct renamedata *rd)
take_dentry_name_snapshot(&old_name, old_dentry);
dget(new_dentry);
- if (!is_dir || (flags & RENAME_EXCHANGE))
- lock_two_nondirectories(source, target);
- else if (target)
- inode_lock(target);
+ /*
+ * Lock all moved children. Moved directories may need to change parent
+ * pointer so they need the lock to prevent against concurrent
+ * directory changes moving parent pointer. For regular files we've
+ * historically always done this. The lockdep locking subclasses are
+ * somewhat arbitrary but RENAME_EXCHANGE in particular can swap
+ * regular files and directories so it's difficult to tell which
+ * subclasses to use.
+ */
+ lock_two_inodes(source, target, I_MUTEX_NORMAL, I_MUTEX_NONDIR2);
error = -EPERM;
if (IS_SWAPFILE(source) || (target && IS_SWAPFILE(target)))
@@ -4866,9 +4872,9 @@ int vfs_rename(struct renamedata *rd)
d_exchange(old_dentry, new_dentry);
}
out:
- if (!is_dir || (flags & RENAME_EXCHANGE))
- unlock_two_nondirectories(source, target);
- else if (target)
+ if (source)
+ inode_unlock(source);
+ if (target)
inode_unlock(target);
dput(new_dentry);
if (!error) {
--
2.35.3
Remove locking of moved directory in ext4_rename2(). We will take care
of it in VFS instead. This effectively reverts commit 0813299c586b
("ext4: Fix possible corruption when moving a directory") and followup
fixes.
CC: Ted Tso <tytso(a)mit.edu>
CC: stable(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/ext4/namei.c | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 45b579805c95..0caf6c730ce3 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -3834,19 +3834,10 @@ static int ext4_rename(struct mnt_idmap *idmap, struct inode *old_dir,
return retval;
}
- /*
- * We need to protect against old.inode directory getting converted
- * from inline directory format into a normal one.
- */
- if (S_ISDIR(old.inode->i_mode))
- inode_lock_nested(old.inode, I_MUTEX_NONDIR2);
-
old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
&old.inlined);
- if (IS_ERR(old.bh)) {
- retval = PTR_ERR(old.bh);
- goto unlock_moved_dir;
- }
+ if (IS_ERR(old.bh))
+ return PTR_ERR(old.bh);
/*
* Check for inode number is _not_ due to possible IO errors.
@@ -4043,10 +4034,6 @@ static int ext4_rename(struct mnt_idmap *idmap, struct inode *old_dir,
brelse(old.bh);
brelse(new.bh);
-unlock_moved_dir:
- if (S_ISDIR(old.inode->i_mode))
- inode_unlock(old.inode);
-
return retval;
}
--
2.35.3
In the case of fast device addition/removal, it's possible that
hv_eject_device_work() can start to run before create_root_hv_pci_bus()
starts to run; as a result, the pci_get_domain_bus_and_slot() in
hv_eject_device_work() can return a 'pdev' of NULL, and
hv_eject_device_work() can remove the 'hpdev', and immediately send a
message PCI_EJECTION_COMPLETE to the host, and the host immediately
unassigns the PCI device from the guest; meanwhile,
create_root_hv_pci_bus() and the PCI device driver can be probing the
dead PCI device and reporting timeout errors.
Fix the issue by adding a per-bus mutex 'state_lock' and grabbing the
mutex before powering on the PCI bus in hv_pci_enter_d0(): when
hv_eject_device_work() starts to run, it's able to find the 'pdev' and call
pci_stop_and_remove_bus_device(pdev): if the PCI device driver has
loaded, the PCI device driver's probe() function is already called in
create_root_hv_pci_bus() -> pci_bus_add_devices(), and now
hv_eject_device_work() -> pci_stop_and_remove_bus_device() is able
to call the PCI device driver's remove() function and remove the device
reliably; if the PCI device driver hasn't loaded yet, the function call
hv_eject_device_work() -> pci_stop_and_remove_bus_device() is able to
remove the PCI device reliably and the PCI device driver's probe()
function won't be called; if the PCI device driver's probe() is already
running (e.g., systemd-udev is loading the PCI device driver), it must
be holding the per-device lock, and after the probe() finishes and releases
the lock, hv_eject_device_work() -> pci_stop_and_remove_bus_device() is
able to proceed to remove the device reliably.
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Reviewed-by: Michael Kelley <mikelley(a)microsoft.com>
Cc: stable(a)vger.kernel.org
---
v2:
Removed the "debug code".
Fixed the "goto out" in hv_pci_resume() [Michael Kelley]
Added Cc:stable
v3:
Added Michael's Reviewed-by.
drivers/pci/controller/pci-hyperv.c | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 48feab095a144..3ae2f99dea8c2 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -489,7 +489,10 @@ struct hv_pcibus_device {
struct fwnode_handle *fwnode;
/* Protocol version negotiated with the host */
enum pci_protocol_version_t protocol_version;
+
+ struct mutex state_lock;
enum hv_pcibus_state state;
+
struct hv_device *hdev;
resource_size_t low_mmio_space;
resource_size_t high_mmio_space;
@@ -2512,6 +2515,8 @@ static void pci_devices_present_work(struct work_struct *work)
if (!dr)
return;
+ mutex_lock(&hbus->state_lock);
+
/* First, mark all existing children as reported missing. */
spin_lock_irqsave(&hbus->device_list_lock, flags);
list_for_each_entry(hpdev, &hbus->children, list_entry) {
@@ -2593,6 +2598,8 @@ static void pci_devices_present_work(struct work_struct *work)
break;
}
+ mutex_unlock(&hbus->state_lock);
+
kfree(dr);
}
@@ -2741,6 +2748,8 @@ static void hv_eject_device_work(struct work_struct *work)
hpdev = container_of(work, struct hv_pci_dev, wrk);
hbus = hpdev->hbus;
+ mutex_lock(&hbus->state_lock);
+
/*
* Ejection can come before or after the PCI bus has been set up, so
* attempt to find it and tear down the bus state, if it exists. This
@@ -2777,6 +2786,8 @@ static void hv_eject_device_work(struct work_struct *work)
put_pcichild(hpdev);
put_pcichild(hpdev);
/* hpdev has been freed. Do not use it any more. */
+
+ mutex_unlock(&hbus->state_lock);
}
/**
@@ -3562,6 +3573,7 @@ static int hv_pci_probe(struct hv_device *hdev,
return -ENOMEM;
hbus->bridge = bridge;
+ mutex_init(&hbus->state_lock);
hbus->state = hv_pcibus_init;
hbus->wslot_res_allocated = -1;
@@ -3670,9 +3682,11 @@ static int hv_pci_probe(struct hv_device *hdev,
if (ret)
goto free_irq_domain;
+ mutex_lock(&hbus->state_lock);
+
ret = hv_pci_enter_d0(hdev);
if (ret)
- goto free_irq_domain;
+ goto release_state_lock;
ret = hv_pci_allocate_bridge_windows(hbus);
if (ret)
@@ -3690,12 +3704,15 @@ static int hv_pci_probe(struct hv_device *hdev,
if (ret)
goto free_windows;
+ mutex_unlock(&hbus->state_lock);
return 0;
free_windows:
hv_pci_free_bridge_windows(hbus);
exit_d0:
(void) hv_pci_bus_exit(hdev, true);
+release_state_lock:
+ mutex_unlock(&hbus->state_lock);
free_irq_domain:
irq_domain_remove(hbus->irq_domain);
free_fwnode:
@@ -3945,20 +3962,26 @@ static int hv_pci_resume(struct hv_device *hdev)
if (ret)
goto out;
+ mutex_lock(&hbus->state_lock);
+
ret = hv_pci_enter_d0(hdev);
if (ret)
- goto out;
+ goto release_state_lock;
ret = hv_send_resources_allocated(hdev);
if (ret)
- goto out;
+ goto release_state_lock;
prepopulate_bars(hbus);
hv_pci_restore_msi_state(hbus);
hbus->state = hv_pcibus_installed;
+ mutex_unlock(&hbus->state_lock);
return 0;
+
+release_state_lock:
+ mutex_unlock(&hbus->state_lock);
out:
vmbus_close(hdev->channel);
return ret;
--
2.25.1
This patch prevents potential stack corruption on 68020/030 when
delivering signals following a bus error or (theoretically) an
address error.
Changed since RFC:
- Dropped patch 1 because, as Andreas pointed out, it will not work
properly.
Finn Thain (1):
m68k: Move signal frame following exception on 68020/030
arch/m68k/kernel/signal.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
--
2.37.5
Greetings From Mr. Abebe Aemro Selassie
I have a Mutual/Beneficial Business Project that would be beneficial
to you. I only have two questions to ask of you, if you are
interested.
The reason why I contacted you is because am the account officer of
Mr.Jin Wei-Liang,here in our bank,who died in covid19 pandemic with
his family,since that time until now,no one has come for the money,the
meeting we hosted last week with the central bank president,bank
management agreed to take the money as government property,that is the
reason why I contacted you so that you can apply to our bank as a
cousin to Mr.Jin Wei-Liang,because I have all the documents concerning
the disease customer in my office,I will be here as asider and be
giving you informations,anything bank asked from you,I will give it to
you because in this life opportunity comes but once,I have been
working for this bank for good 13 years now and am based on monthly
salary and never achieved a tangible thing and if I don't do the
business with you,bank will still take the money so this is the reason
why I contacted you so that we can do the business together,the
disease money is (18.6 million dollars),50 percent for you,50 percent
for me,if you are interested respond my email but if you are not
interested do well to inform me so that I will look for another
partner and please don't expose me,delete my message because if bank
finds out,I will be in big trouble..These are the two questions I
would like you to answer:
1. Can you handle this project?
2. Can I give you this trust?
Please note that the deal requires high level of maturity, honesty and
secrecy. This will involve moving some money from my office, on trust
to your hands or bank account. Also note that i will do everything to
make sure that the money is moved as a purely legitimate fund, so you
will not be exposed to any risk.
I request for your full co-operation. I will give you details and
procedure when I receive your reply, to commence this transaction, I
require you to immediately indicate your interest by a return reply. I
will be waiting for your response in a timely manner.
Best Regard,
Mr. Abebe Aemro Selassie
Prior to the commit:
"763bd29fd3d1 ("thermal: int340x_thermal: Use sysfs_emit_at() instead of
scnprintf()"
there was a new line after each UUID string. With the newline removed,
existing user space like "thermald" fails to compare each supported UUID
as it is using getline() to read UUID and apply correct thermal table.
To avoid breaking existing user space, add newline after each UUID string.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Fixes: 763bd29fd3d1 ("thermal: int340x_thermal: Use sysfs_emit_at() instead of scnprintf()")
Cc: stable(a)vger.kernel.org # v6.3+
---
drivers/thermal/intel/int340x_thermal/int3400_thermal.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
index 810231b59dcd..5e1164226ada 100644
--- a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
+++ b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
@@ -131,7 +131,7 @@ static ssize_t available_uuids_show(struct device *dev,
for (i = 0; i < INT3400_THERMAL_MAXIMUM_UUID; i++) {
if (priv->uuid_bitmap & (1 << i))
- length += sysfs_emit_at(buf, length, int3400_thermal_uuids[i]);
+ length += sysfs_emit_at(buf, length, "%s\n", int3400_thermal_uuids[i]);
}
return length;
@@ -149,7 +149,7 @@ static ssize_t current_uuid_show(struct device *dev,
for (i = 0; i <= INT3400_THERMAL_CRITICAL; i++) {
if (priv->os_uuid_mask & BIT(i))
- length += sysfs_emit_at(buf, length, int3400_thermal_uuids[i]);
+ length += sysfs_emit_at(buf, length, "%s\n", int3400_thermal_uuids[i]);
}
if (length)
--
2.39.1
Commit 0813299c586b ("ext4: Fix possible corruption when moving a
directory") forgot that handling of RENAME_EXCHANGE renames needs the
protection of inode lock when changing directory parents for moved
directories. Add proper locking for that case as well.
CC: stable(a)vger.kernel.org
Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory")
Reported-by: "Darrick J. Wong" <djwong(a)kernel.org>
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/ext4/namei.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 45b579805c95..b91abea1c781 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -4083,10 +4083,25 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
if (retval)
return retval;
+ /*
+ * We need to protect against old.inode and new.inode directory getting
+ * converted from inline directory format into a normal one. The lock
+ * ordering does not matter here as old and new are guaranteed to be
+ * incomparable in the directory hierarchy.
+ */
+ if (S_ISDIR(old.inode->i_mode))
+ inode_lock(old.inode);
+ if (S_ISDIR(new.inode->i_mode))
+ inode_lock_nested(new.inode, I_MUTEX_NONDIR2);
+
old.bh = ext4_find_entry(old.dir, &old.dentry->d_name,
&old.de, &old.inlined);
- if (IS_ERR(old.bh))
- return PTR_ERR(old.bh);
+ if (IS_ERR(old.bh)) {
+ retval = PTR_ERR(old.bh);
+ old.bh = NULL;
+ goto end_rename;
+ }
+
/*
* Check for inode number is _not_ due to possible IO errors.
* We might rmdir the source, keep it as pwd of some process
@@ -4186,6 +4201,10 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
retval = 0;
end_rename:
+ if (S_ISDIR(old.inode->i_mode))
+ inode_unlock(old.inode);
+ if (S_ISDIR(new.inode->i_mode))
+ inode_unlock(new.inode);
brelse(old.dir_bh);
brelse(new.dir_bh);
brelse(old.bh);
--
2.35.3
Hello!
Would the stable maintainers please consider backporting the following
commit to the 6.1? We are trying to build gki_defconfig (plus a few
extras) on Arm64 and test it under Qemu-arm64, but it fails to boot.
Bisection has pointed here.
We have verified that cherry-picking this patch on top of v6.1.29
applies cleanly and allows the kernel to boot.
commit 12d6c1d3a2ad0c199ec57c201cdc71e8e157a232
Author: Kees Cook <keescook(a)chromium.org>
Date: Tue Oct 25 15:39:35 2022 -0700
skbuff: Proactively round up to kmalloc bucket size
Instead of discovering the kmalloc bucket size _after_ allocation, round
up proactively so the allocation is explicitly made for the full size,
allowing the compiler to correctly reason about the resulting size of
the buffer through the existing __alloc_size() hint.
This will allow for kernels built with CONFIG_UBSAN_BOUNDS or the
coming dynamic bounds checking under CONFIG_FORTIFY_SOURCE to gain
back the __alloc_size() hints that were temporarily reverted in commit
93dd04ab0b2b ("slab: remove __alloc_size attribute from
__kmalloc_track_caller")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: netdev(a)vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: David Rientjes <rientjes(a)google.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20221021234713.you.031…
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Link: https://lore.kernel.org/r/20221025223811.up.360-kees@kernel.org
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Thanks and greetings!
Daniel Díaz
daniel.diaz(a)linaro.org
Hi!
I found a regression in DMA handling on one of our SAMA5D3 boards.
While combing through the regressing commit, a found two unrelated
strange things. The first is the actually problematic change. The
second is a number of suspect defines, that I fail to see how they
can ever do any good.
Cheers,
Peter
Changes since v1 [1], after comments from Tudor Ambarus:
Patch 1/2:
- Don't convert to inline functions.
- Cc stable
Patch 2/2:
- Extend the field instead of killing "too big" field values.
- Add Fixes and R-b tags.
- Cc stable
[1] https://lore.kernel.org/lkml/dc4834cb-fadf-17a5-fbc7-cf500db88f20@axentia.s…
Peter Rosin (2):
dmaengine: at_hdmac: Repair bitfield macros for peripheral ID handling
dmaengine: at_hdmac: Extend the Flow Controller bitfield to three bits
drivers/dma/at_hdmac.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
--
2.20.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1398aa803f198b7a386fdd8404666043e95f4c16
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052208-chirping-preset-9644@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
1398aa803f19 ("tpm_tis: Use tpm_chip_{start,stop} decoration inside tpm_tis_resume")
955df4f87760 ("tpm, tpm_tis: Claim locality when interrupts are reenabled on resume")
7a2f55d0be29 ("tpm, tpm: Implement usage counter for locality")
e87fcf0dc2b4 ("tpm, tpm_tis: Only handle supported interrupts")
15d7aa4e46eb ("tpm, tpm_tis: Claim locality before writing interrupt registers")
ed9be0e6c892 ("tpm, tpm_tis: Do not skip reset of original interrupt vector")
6d789ad72695 ("tpm, tpm_tis: Disable interrupts if tpm_tis_probe_irq() failed")
282657a8bd7f ("tpm, tpm_tis: Claim locality before writing TPM_INT_ENABLE register")
858e8b792d06 ("tpm, tpm_tis: Avoid cache incoherency in test for interrupts")
7bfda9c73fa9 ("tpm: Add flag to use default cancellation policy")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1398aa803f198b7a386fdd8404666043e95f4c16 Mon Sep 17 00:00:00 2001
From: Jarkko Sakkinen <jarkko(a)kernel.org>
Date: Wed, 26 Apr 2023 20:29:27 +0300
Subject: [PATCH] tpm_tis: Use tpm_chip_{start,stop} decoration inside
tpm_tis_resume
Before sending a TPM command, CLKRUN protocol must be disabled. This is not
done in the case of tpm1_do_selftest() call site inside tpm_tis_resume().
Address this by decorating the calls with tpm_chip_{start,stop}, which
should be always used to arm and disarm the TPM chip for transmission.
Finally, move the call to the main TPM driver callback as the last step
because it should arm the chip by itself, if it needs that type of
functionality.
Cc: stable(a)vger.kernel.org
Reported-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Closes: https://lore.kernel.org/linux-integrity/CS68AWILHXS4.3M36M1EKZLUMS@suppilov…
Fixes: a3fbfae82b4c ("tpm: take TPM chip power gating out of tpm_transmit()")
Reviewed-by: Jerry Snitselaar <jsnitsel(a)redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 02945d53fcef..558144fa707a 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -1209,25 +1209,20 @@ static void tpm_tis_reenable_interrupts(struct tpm_chip *chip)
u32 intmask;
int rc;
- if (chip->ops->clk_enable != NULL)
- chip->ops->clk_enable(chip, true);
-
- /* reenable interrupts that device may have lost or
- * BIOS/firmware may have disabled
+ /*
+ * Re-enable interrupts that device may have lost or BIOS/firmware may
+ * have disabled.
*/
rc = tpm_tis_write8(priv, TPM_INT_VECTOR(priv->locality), priv->irq);
- if (rc < 0)
- goto out;
+ if (rc < 0) {
+ dev_err(&chip->dev, "Setting IRQ failed.\n");
+ return;
+ }
intmask = priv->int_mask | TPM_GLOBAL_INT_ENABLE;
-
- tpm_tis_write32(priv, TPM_INT_ENABLE(priv->locality), intmask);
-
-out:
- if (chip->ops->clk_enable != NULL)
- chip->ops->clk_enable(chip, false);
-
- return;
+ rc = tpm_tis_write32(priv, TPM_INT_ENABLE(priv->locality), intmask);
+ if (rc < 0)
+ dev_err(&chip->dev, "Enabling interrupts failed.\n");
}
int tpm_tis_resume(struct device *dev)
@@ -1235,27 +1230,27 @@ int tpm_tis_resume(struct device *dev)
struct tpm_chip *chip = dev_get_drvdata(dev);
int ret;
- ret = tpm_tis_request_locality(chip, 0);
- if (ret < 0)
+ ret = tpm_chip_start(chip);
+ if (ret)
return ret;
if (chip->flags & TPM_CHIP_FLAG_IRQ)
tpm_tis_reenable_interrupts(chip);
- ret = tpm_pm_resume(dev);
- if (ret)
- goto out;
-
/*
* TPM 1.2 requires self-test on resume. This function actually returns
* an error code but for unknown reason it isn't handled.
*/
if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
tpm1_do_selftest(chip);
-out:
- tpm_tis_relinquish_locality(chip, 0);
- return ret;
+ tpm_chip_stop(chip);
+
+ ret = tpm_pm_resume(dev);
+ if (ret)
+ return ret;
+
+ return 0;
}
EXPORT_SYMBOL_GPL(tpm_tis_resume);
#endif
The patch below does not apply to the 6.3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.3.y
git checkout FETCH_HEAD
git cherry-pick -x d461aac924b937bcb4fd0ca1242b3ef6868ecddd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052230-sprout-playful-a41f@gregkh' --subject-prefix 'PATCH 6.3.y' HEAD^..
Possible dependencies:
d461aac924b9 ("zsmalloc: move LRU update from zs_map_object() to zs_malloc()")
4c7ac97285d8 ("zsmalloc: fine-grained inuse ratio based fullness grouping")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d461aac924b937bcb4fd0ca1242b3ef6868ecddd Mon Sep 17 00:00:00 2001
From: Nhat Pham <nphamcs(a)gmail.com>
Date: Fri, 5 May 2023 11:50:54 -0700
Subject: [PATCH] zsmalloc: move LRU update from zs_map_object() to zs_malloc()
Under memory pressure, we sometimes observe the following crash:
[ 5694.832838] ------------[ cut here ]------------
[ 5694.842093] list_del corruption, ffff888014b6a448->next is LIST_POISON1 (dead000000000100)
[ 5694.858677] WARNING: CPU: 33 PID: 418824 at lib/list_debug.c:47 __list_del_entry_valid+0x42/0x80
[ 5694.961820] CPU: 33 PID: 418824 Comm: fuse_counters.s Kdump: loaded Tainted: G S 5.19.0-0_fbk3_rc3_hoangnhatpzsdynshrv41_10870_g85a9558a25de #1
[ 5694.990194] Hardware name: Wiwynn Twin Lakes MP/Twin Lakes Passive MP, BIOS YMM16 05/24/2021
[ 5695.007072] RIP: 0010:__list_del_entry_valid+0x42/0x80
[ 5695.017351] Code: 08 48 83 c2 22 48 39 d0 74 24 48 8b 10 48 39 f2 75 2c 48 8b 51 08 b0 01 48 39 f2 75 34 c3 48 c7 c7 55 d7 78 82 e8 4e 45 3b 00 <0f> 0b eb 31 48 c7 c7 27 a8 70 82 e8 3e 45 3b 00 0f 0b eb 21 48 c7
[ 5695.054919] RSP: 0018:ffffc90027aef4f0 EFLAGS: 00010246
[ 5695.065366] RAX: 41fe484987275300 RBX: ffff888008988180 RCX: 0000000000000000
[ 5695.079636] RDX: ffff88886006c280 RSI: ffff888860060480 RDI: ffff888860060480
[ 5695.093904] RBP: 0000000000000002 R08: 0000000000000000 R09: ffffc90027aef370
[ 5695.108175] R10: 0000000000000000 R11: ffffffff82fdf1c0 R12: 0000000010000002
[ 5695.122447] R13: ffff888014b6a448 R14: ffff888014b6a420 R15: 00000000138dc240
[ 5695.136717] FS: 00007f23a7d3f740(0000) GS:ffff888860040000(0000) knlGS:0000000000000000
[ 5695.152899] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5695.164388] CR2: 0000560ceaab6ac0 CR3: 000000001c06c001 CR4: 00000000007706e0
[ 5695.178659] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5695.192927] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5695.207197] PKRU: 55555554
[ 5695.212602] Call Trace:
[ 5695.217486] <TASK>
[ 5695.221674] zs_map_object+0x91/0x270
[ 5695.229000] zswap_frontswap_store+0x33d/0x870
[ 5695.237885] ? do_raw_spin_lock+0x5d/0xa0
[ 5695.245899] __frontswap_store+0x51/0xb0
[ 5695.253742] swap_writepage+0x3c/0x60
[ 5695.261063] shrink_page_list+0x738/0x1230
[ 5695.269255] shrink_lruvec+0x5ec/0xcd0
[ 5695.276749] ? shrink_slab+0x187/0x5f0
[ 5695.284240] ? mem_cgroup_iter+0x6e/0x120
[ 5695.292255] shrink_node+0x293/0x7b0
[ 5695.299402] do_try_to_free_pages+0xea/0x550
[ 5695.307940] try_to_free_pages+0x19a/0x490
[ 5695.316126] __folio_alloc+0x19ff/0x3e40
[ 5695.323971] ? __filemap_get_folio+0x8a/0x4e0
[ 5695.332681] ? walk_component+0x2a8/0xb50
[ 5695.340697] ? generic_permission+0xda/0x2a0
[ 5695.349231] ? __filemap_get_folio+0x8a/0x4e0
[ 5695.357940] ? walk_component+0x2a8/0xb50
[ 5695.365955] vma_alloc_folio+0x10e/0x570
[ 5695.373796] ? walk_component+0x52/0xb50
[ 5695.381634] wp_page_copy+0x38c/0xc10
[ 5695.388953] ? filename_lookup+0x378/0xbc0
[ 5695.397140] handle_mm_fault+0x87f/0x1800
[ 5695.405157] do_user_addr_fault+0x1bd/0x570
[ 5695.413520] exc_page_fault+0x5d/0x110
[ 5695.421017] asm_exc_page_fault+0x22/0x30
After some investigation, I have found the following issue: unlike other
zswap backends, zsmalloc performs the LRU list update at the object
mapping time, rather than when the slot for the object is allocated.
This deviation was discussed and agreed upon during the review process
of the zsmalloc writeback patch series:
https://lore.kernel.org/lkml/Y3flcAXNxxrvy3ZH@cmpxchg.org/
Unfortunately, this introduces a subtle bug that occurs when there is a
concurrent store and reclaim, which interleave as follows:
zswap_frontswap_store() shrink_worker()
zs_malloc() zs_zpool_shrink()
spin_lock(&pool->lock) zs_reclaim_page()
zspage = find_get_zspage()
spin_unlock(&pool->lock)
spin_lock(&pool->lock)
zspage = list_first_entry(&pool->lru)
list_del(&zspage->lru)
zspage->lru.next = LIST_POISON1
zspage->lru.prev = LIST_POISON2
spin_unlock(&pool->lock)
zs_map_object()
spin_lock(&pool->lock)
if (!list_empty(&zspage->lru))
list_del(&zspage->lru)
CHECK_DATA_CORRUPTION(next == LIST_POISON1) /* BOOM */
With the current upstream code, this issue rarely happens. zswap only
triggers writeback when the pool is already full, at which point all
further store attempts are short-circuited. This creates an implicit
pseudo-serialization between reclaim and store. I am working on a new
zswap shrinking mechanism, which makes interleaving reclaim and store
more likely, exposing this bug.
zbud and z3fold do not have this problem, because they perform the LRU
list update in the alloc function, while still holding the pool's lock.
This patch fixes the aforementioned bug by moving the LRU update back to
zs_malloc(), analogous to zbud and z3fold.
Link: https://lkml.kernel.org/r/20230505185054.2417128-1-nphamcs@gmail.com
Fixes: 64f768c6b32e ("zsmalloc: add a LRU to zs_pool to keep track of zspages in LRU order")
Signed-off-by: Nhat Pham <nphamcs(a)gmail.com>
Suggested-by: Johannes Weiner <hannes(a)cmpxchg.org>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Acked-by: Minchan Kim <minchan(a)kernel.org>
Cc: Dan Streetman <ddstreet(a)ieee.org>
Cc: Nitin Gupta <ngupta(a)vflare.org>
Cc: Seth Jennings <sjenning(a)redhat.com>
Cc: Vitaly Wool <vitaly.wool(a)konsulko.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 44ddaf5d601e..02f7f414aade 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1331,31 +1331,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
obj_to_location(obj, &page, &obj_idx);
zspage = get_zspage(page);
-#ifdef CONFIG_ZPOOL
- /*
- * Move the zspage to front of pool's LRU.
- *
- * Note that this is swap-specific, so by definition there are no ongoing
- * accesses to the memory while the page is swapped out that would make
- * it "hot". A new entry is hot, then ages to the tail until it gets either
- * written back or swaps back in.
- *
- * Furthermore, map is also called during writeback. We must not put an
- * isolated page on the LRU mid-reclaim.
- *
- * As a result, only update the LRU when the page is mapped for write
- * when it's first instantiated.
- *
- * This is a deviation from the other backends, which perform this update
- * in the allocation function (zbud_alloc, z3fold_alloc).
- */
- if (mm == ZS_MM_WO) {
- if (!list_empty(&zspage->lru))
- list_del(&zspage->lru);
- list_add(&zspage->lru, &pool->lru);
- }
-#endif
-
/*
* migration cannot move any zpages in this zspage. Here, pool->lock
* is too heavy since callers would take some time until they calls
@@ -1525,9 +1500,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
fix_fullness_group(class, zspage);
record_obj(handle, obj);
class_stat_inc(class, ZS_OBJS_INUSE, 1);
- spin_unlock(&pool->lock);
- return handle;
+ goto out;
}
spin_unlock(&pool->lock);
@@ -1550,6 +1524,14 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
/* We completely set up zspage so mark them as movable */
SetZsPageMovable(pool, zspage);
+out:
+#ifdef CONFIG_ZPOOL
+ /* Add/move zspage to beginning of LRU */
+ if (!list_empty(&zspage->lru))
+ list_del(&zspage->lru);
+ list_add(&zspage->lru, &pool->lru);
+#endif
+
spin_unlock(&pool->lock);
return handle;
Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added
pci_lock_rescan_remove() and pci_unlock_rescan_remove() in
create_root_hv_pci_bus() and in hv_eject_device_work() to address the
race between create_root_hv_pci_bus() and hv_eject_device_work(), but it
turns that grabing the pci_rescan_remove_lock mutex is not enough:
refer to the earlier fix "PCI: hv: Add a per-bus mutex state_lock".
Now with hbus->state_lock and other fixes, the race is resolved, so
remove pci_{lock,unlock}_rescan_remove() in create_root_hv_pci_bus():
this removes the serialization in hv_pci_probe() and hence allows
async-probing (PROBE_PREFER_ASYNCHRONOUS) to work.
Add the async-probing flag to hv_pci_drv.
pci_{lock,unlock}_rescan_remove() in hv_eject_device_work() and in
hv_pci_remove() are still kept: according to the comment before
drivers/pci/probe.c: static DEFINE_MUTEX(pci_rescan_remove_lock),
"PCI device removal routines should always be executed under this mutex".
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Reviewed-by: Michael Kelley <mikelley(a)microsoft.com>
Reviewed-by: Long Li <longli(a)microsoft.com>
Cc: stable(a)vger.kernel.org
---
v2:
No change to the patch body.
Improved the commit message [Michael Kelley]
Added Cc:stable
v3:
Added Michael's and Long Li's Reviewed-by.
Fixed a typo in the commit message: grubing -> grabing [Thanks, Michael!]
drivers/pci/controller/pci-hyperv.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 3ae2f99dea8c2..2ea2b1b8a4c9a 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2312,12 +2312,16 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus)
if (error)
return error;
- pci_lock_rescan_remove();
+ /*
+ * pci_lock_rescan_remove() and pci_unlock_rescan_remove() are
+ * unnecessary here, because we hold the hbus->state_lock, meaning
+ * hv_eject_device_work() and pci_devices_present_work() can't race
+ * with create_root_hv_pci_bus().
+ */
hv_pci_assign_numa_node(hbus);
pci_bus_assign_resources(bridge->bus);
hv_pci_assign_slots(hbus);
pci_bus_add_devices(bridge->bus);
- pci_unlock_rescan_remove();
hbus->state = hv_pcibus_installed;
return 0;
}
@@ -4003,6 +4007,9 @@ static struct hv_driver hv_pci_drv = {
.remove = hv_pci_remove,
.suspend = hv_pci_suspend,
.resume = hv_pci_resume,
+ .driver = {
+ .probe_type = PROBE_PREFER_ASYNCHRONOUS,
+ },
};
static void __exit exit_hv_pci_drv(void)
--
2.25.1
vmbus_wait_for_unload() may be called in the panic path after other
CPUs are stopped. vmbus_wait_for_unload() currently loops through
online CPUs looking for the UNLOAD response message. But the values of
CONFIG_KEXEC_CORE and crash_kexec_post_notifiers affect the path used
to stop the other CPUs, and in one of the paths the stopped CPUs
are removed from cpu_online_mask. This removal happens in both
x86/x64 and arm64 architectures. In such a case, vmbus_wait_for_unload()
only checks the panic'ing CPU, and misses the UNLOAD response message
except when the panic'ing CPU is CPU 0. vmbus_wait_for_unload()
eventually times out, but only after waiting 100 seconds.
Fix this by looping through *present* CPUs in vmbus_wait_for_unload().
The cpu_present_mask is not modified by stopping the other CPUs in the
panic path, nor should it be.
Also, in a CoCo VM the synic_message_page is not allocated in
hv_synic_alloc(), but is set and cleared in hv_synic_enable_regs()
and hv_synic_disable_regs() such that it is set only when the CPU is
online. If not all present CPUs are online when vmbus_wait_for_unload()
is called, the synic_message_page might be NULL. Add a check for this.
Fixes: cd95aad55793 ("Drivers: hv: vmbus: handle various crash scenarios")
Cc: stable(a)vger.kernel.org
Reported-by: John Starks <jostarks(a)microsoft.com>
Signed-off-by: Michael Kelley <mikelley(a)microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets(a)redhat.com>
---
Changes in v2:
* Added a comment and updated commit messages to describe scenario
where synic_message_page might be NULL
* Added Cc: stable(a)vger.kernel.org
drivers/hv/channel_mgmt.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 007f26d..2f4d09c 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -829,11 +829,22 @@ static void vmbus_wait_for_unload(void)
if (completion_done(&vmbus_connection.unload_event))
goto completed;
- for_each_online_cpu(cpu) {
+ for_each_present_cpu(cpu) {
struct hv_per_cpu_context *hv_cpu
= per_cpu_ptr(hv_context.cpu_context, cpu);
+ /*
+ * In a CoCo VM the synic_message_page is not allocated
+ * in hv_synic_alloc(). Instead it is set/cleared in
+ * hv_synic_enable_regs() and hv_synic_disable_regs()
+ * such that it is set only when the CPU is online. If
+ * not all present CPUs are online, the message page
+ * might be NULL, so skip such CPUs.
+ */
page_addr = hv_cpu->synic_message_page;
+ if (!page_addr)
+ continue;
+
msg = (struct hv_message *)page_addr
+ VMBUS_MESSAGE_SINT;
@@ -867,11 +878,14 @@ static void vmbus_wait_for_unload(void)
* maybe-pending messages on all CPUs to be able to receive new
* messages after we reconnect.
*/
- for_each_online_cpu(cpu) {
+ for_each_present_cpu(cpu) {
struct hv_per_cpu_context *hv_cpu
= per_cpu_ptr(hv_context.cpu_context, cpu);
page_addr = hv_cpu->synic_message_page;
+ if (!page_addr)
+ continue;
+
msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT;
msg->header.message_type = HVMSG_NONE;
}
--
1.8.3.1