This is similar to commit 62b6dee1b44a ("PCI/portdrv: Prevent LS7A Bus
Master clearing on shutdown"), which prevents LS7A Bus Master clearing
on kexec.
The key point of this is to work around the LS7A defect that clearing
PCI_COMMAND_MASTER prevents MMIO requests from going downstream, and
we may need to do that even after .shutdown(), e.g., to print console
messages. And in this case we rely on .shutdown() for the downstream
devices to disable interrupts and DMA.
Only skip Bus Master clearing on bridges because endpoint devices still
need it.
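The resulting condition can be sketched as a standalone predicate. This is
only an illustrative user-space model, not the kernel code: model_pci_dev,
model_pci_state and model_should_clear_master() are hypothetical stand-ins
for struct pci_dev, the PCI_D* power states and the check in
pci_device_shutdown().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the PCI power states; the ordering mirrors D0..D3cold. */
enum model_pci_state { MODEL_D0, MODEL_D1, MODEL_D2, MODEL_D3HOT, MODEL_D3COLD };

/* Hypothetical stand-in for the two struct pci_dev properties the check uses. */
struct model_pci_dev {
	bool is_bridge;
	enum model_pci_state current_state;
};

/*
 * Bus Master is cleared only on a kexec reboot, only for non-bridge
 * devices, and only while the device is still accessible (D3hot or
 * shallower). Bridges are skipped so that MMIO can still go downstream.
 */
static bool model_should_clear_master(bool kexec_in_progress,
				      const struct model_pci_dev *dev)
{
	assert(dev != NULL);
	return kexec_in_progress && !dev->is_bridge &&
	       dev->current_state <= MODEL_D3HOT;
}
```

In this model an endpoint in D0 during kexec has Bus Master cleared, while
a bridge in the same state is left untouched.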
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Ming Wang <wangming01(a)loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
drivers/pci/pci-driver.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 602838416e6a..8a1e32367a06 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -517,7 +517,7 @@ static void pci_device_shutdown(struct device *dev)
* If it is not a kexec reboot, firmware will hit the PCI
* devices with big hammer and stop their DMA any way.
*/
- if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
+ if (kexec_in_progress && !pci_is_bridge(pci_dev) && (pci_dev->current_state <= PCI_D3hot))
pci_clear_master(pci_dev);
}
--
2.47.1
From: Hongchen Zhang <zhanghongchen(a)loongson.cn>
When the best selected CPU is offline, work_on_cpu() will get stuck forever.
This can happen if a node is online while all its CPUs are offline
(we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore,
in this case, we should call local_pci_probe() instead of work_on_cpu().
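The selection rule can be sketched as a small user-space model. The names
below (MODEL_NR_CPU_IDS, model_cpu_online(), model_probe_on_remote_cpu())
and the bitmask representation of online CPUs are assumptions made for
illustration; the kernel uses nr_cpu_ids and cpu_online() on real cpumasks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical number of possible CPU ids in this model. */
#define MODEL_NR_CPU_IDS 8u

/* One bit per CPU; a set bit means the CPU is online. */
static bool model_cpu_online(uint32_t online_mask, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPU_IDS);
	return (online_mask >> cpu) & 1u;
}

/*
 * Returns true when the probe may be handed to the selected CPU, and
 * false when the caller should probe locally instead. An offline CPU
 * never runs queued work, so waiting on it would never finish.
 */
static bool model_probe_on_remote_cpu(uint32_t online_mask, unsigned int cpu)
{
	return cpu < MODEL_NR_CPU_IDS && model_cpu_online(online_mask, cpu);
}
```

With only CPU 0 online (mask 0x1), a request to probe on CPU 3 falls back
to the local path in this model.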
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen(a)loongson.cn>
---
drivers/pci/pci-driver.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index c8bd71a739f7..602838416e6a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
free_cpumask_var(wq_domain_mask);
}
- if (cpu < nr_cpu_ids)
+ if ((cpu < nr_cpu_ids) && cpu_online(cpu))
error = work_on_cpu(cpu, local_pci_probe, &ddi);
else
error = local_pci_probe(&ddi);
--
2.47.1
Hello,
Please consider applying the following commits for 6.12.y:
c104c16073b7 ("Kunit to check the longest symbol length")
54338750578f ("x86/tools: Drop duplicate unlikely() definition in
insn_decoder_test.c")
They should apply cleanly.
Those two commits implement a kunit test to verify that a symbol with
KSYM_NAME_LEN of 512 can be read.
The first commit implements the test. This commit also includes
a fix for the test x86/insn_decoder_test. In the case a symbol exceeds the
symbol length limit, an error will happen:
arch/x86/tools/insn_decoder_test: error: malformed line 1152000:
tBb_+0xf2>
..which overflowed by 10 characters reading this line:
ffffffff81458193: 74 3d je
ffffffff814581d2
<_RNvXse_NtNtNtCshGpAVYOtgW1_4core4iter8adapters7flattenINtB5_13FlattenCompatINtNtB7_3map3MapNtNtNtBb_3str4iter5CharsNtB1v_17CharEscapeDefaultENtNtBb_4char13EscapeDefaultENtNtBb_3fmt5Debug3fmtBb_+0xf2>
The fix was proposed in [1] and initially mentioned at [2].
The second commit fixes a warning when building with clang because
there was a definition of unlikely from compiler.h in tools/include/linux,
which conflicted with the one in the instruction decoder selftest.
[1] https://lore.kernel.org/lkml/Y9ES4UKl%2F+DtvAVS@gmail.com/
[2] https://lore.kernel.org/lkml/320c4dba-9919-404b-8a26-a8af16be1845@app.fastm…
I will send something similar to 6.6.y and 6.1.y
Thanks!
Best regards,
Sergio
If speed_hz < AMD_SPI_MIN_HZ, amd_set_spi_freq() iterates over the
entire amd_spi_freq array without breaking out early, causing 'i' to go
beyond the array bounds.
Fix that by stopping the loop when it gets to the last entry, so the low
speed_hz value gets clamped up to AMD_SPI_MIN_HZ.
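The effect of the tighter loop bound can be sketched in user space. The
table contents and the names model_spi_freq and model_pick_freq_index()
are hypothetical; only the loop shape follows the patch, and the table is
assumed to be sorted from fastest to slowest with the minimum speed last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical frequency table, fastest first; the last entry is the minimum. */
static const uint32_t model_spi_freq[] = {
	100000000u, 66660000u, 33330000u, 22220000u, 16660000u, 800000u,
};

/*
 * Pick the first entry that does not exceed the requested speed.
 * Stopping one short of the array size means that a request below the
 * minimum leaves i at the last valid index instead of one past the end.
 */
static size_t model_pick_freq_index(uint32_t speed_hz)
{
	size_t i;

	for (i = 0; i < MODEL_ARRAY_SIZE(model_spi_freq) - 1; i++)
		if (speed_hz >= model_spi_freq[i])
			break;

	assert(i < MODEL_ARRAY_SIZE(model_spi_freq));
	return i;
}
```

In this model a request of 1 Hz selects the last entry (800000 Hz) rather
than indexing past the table.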
Fixes the following warning with a UBSAN kernel:
drivers/spi/spi-amd.o: error: objtool: amd_set_spi_freq() falls through to next function amd_spi_set_opcode()
Fixes: 3fe26121dc3a ("spi: amd: Configure device speed")
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)kernel.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Acked-by: Mark Brown <broonie(a)kernel.org>
Cc: Raju Rangoju <Raju.Rangoju(a)amd.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Link: https://lore.kernel.org/r/78fef0f2434f35be9095bcc9ffa23dd8cab667b9.17428528…
Closes: https://lore.kernel.org/r/202503161828.RUk9EhWx-lkp@intel.com/
Signed-off-by: Dmitriy Privalov <d.privalov(a)omp.ru>
---
drivers/spi/spi-amd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/spi/spi-amd.c b/drivers/spi/spi-amd.c
index bfc3ab5f39ea..b53301e563bc 100644
--- a/drivers/spi/spi-amd.c
+++ b/drivers/spi/spi-amd.c
@@ -243,7 +243,7 @@ static int amd_set_spi_freq(struct amd_spi *amd_spi, u32 speed_hz)
if (speed_hz < AMD_SPI_MIN_HZ)
return -EINVAL;
- for (i = 0; i < ARRAY_SIZE(amd_spi_freq); i++)
+ for (i = 0; i < ARRAY_SIZE(amd_spi_freq)-1; i++)
if (speed_hz >= amd_spi_freq[i].speed_hz)
break;
--
2.34.1
[ Upstream commit 6043b794c7668c19dabc4a93c75b924a19474d59 ]
During ILA address translations, the L4 checksums can be handled in
different ways. One of them, adj-transport, consists in parsing the
transport layer and updating any found checksum. This logic relies on
inet_proto_csum_replace_by_diff and produces an incorrect skb->csum when
in state CHECKSUM_COMPLETE.
This bug can be reproduced with a simple ILA to SIR mapping, assuming
packets are received with CHECKSUM_COMPLETE:
$ ip a show dev eth0
14: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 62:ae:35:9e:0f:8d brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 3333:0:0:1::c078/64 scope global
valid_lft forever preferred_lft forever
inet6 fd00:10:244:1::c078/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 fe80::60ae:35ff:fe9e:f8d/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
$ ip ila add loc_match fd00:10:244:1 loc 3333:0:0:1 \
csum-mode adj-transport ident-type luid dev eth0
Then I hit [fd00:10:244:1::c078]:8000 with a server listening only on
[3333:0:0:1::c078]:8000. With the bug, the SYN packet is dropped with
SKB_DROP_REASON_TCP_CSUM after inet_proto_csum_replace_by_diff changed
skb->csum. The translation and drop are visible on pwru [1] traces:
IFACE TUPLE FUNC
eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) ipv6_rcv
eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) ip6_rcv_core
eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) nf_hook_slow
eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) inet_proto_csum_replace_by_diff
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) tcp_v6_early_demux
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_route_input
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_input
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_input_finish
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_protocol_deliver_rcu
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) raw6_local_deliver
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ipv6_raw_deliver
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) tcp_v6_rcv
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) __skb_checksum_complete
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) kfree_skb_reason(SKB_DROP_REASON_TCP_CSUM)
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_release_head_state
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_release_data
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_free_head
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) kfree_skbmem
This is happening because inet_proto_csum_replace_by_diff is updating
skb->csum when it shouldn't. The L4 checksum is updated such that it
"cancels" the IPv6 address change in terms of checksum computation, so
the impact on skb->csum is null.
Note this would be different for an IPv4 packet since three fields
would be updated: the IPv4 address, the IP checksum, and the L4
checksum. Two would cancel each other and skb->csum would still need
to be updated to take the L4 checksum change into account.
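The cancellation described above can be checked with a small user-space
model of one's-complement arithmetic. This is a sketch under stated
assumptions: the packet is a flat array of 16-bit words that contains both
the rewritten address word and the L4 checksum field, the incremental
update follows RFC 1624, and all model_* names are hypothetical (the
kernel's own helpers operate on __wsum and __sum16 instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t model_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xFFFFu) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * One's-complement sum over all words, standing in for skb->csum.
 * The result is reduced modulo 0xFFFF so that the two encodings of
 * zero (0x0000 and 0xFFFF) compare equal.
 */
static uint16_t model_csum(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	assert(words != NULL && n <= 65536);
	for (i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)(model_fold(sum) % 0xFFFFu);
}

/* RFC 1624 incremental update: HC' = ~(~HC + ~m + m'). */
static uint16_t model_csum_update(uint16_t check, uint16_t old_word,
				  uint16_t new_word)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old_word;
	sum += new_word;
	return (uint16_t)~model_fold(sum);
}

/*
 * Rewrite one address word and compensate in the checksum field.
 * Returns nonzero when the sum over the whole packet is unchanged.
 */
static int model_rewrite_keeps_sum(uint16_t *pkt, size_t n, size_t addr_idx,
				   size_t check_idx, uint16_t new_word)
{
	uint16_t before = model_csum(pkt, n);

	pkt[check_idx] = model_csum_update(pkt[check_idx], pkt[addr_idx],
					   new_word);
	pkt[addr_idx] = new_word;
	return model_csum(pkt, n) == before;
}
```

Because the checksum field moves by exactly the opposite of the address
change, the sum over the whole buffer stays the same in this model, which
is why no further adjustment of the packet-wide sum is needed for IPv6.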
This patch fixes it by passing an ipv6 flag to
inet_proto_csum_replace_by_diff, to skip the skb->csum update if we're
in the IPv6 case. Note the behavior of the only other user of
inet_proto_csum_replace_by_diff, the BPF subsystem, is left as is in
this patch and fixed in the subsequent patch.
With the fix, using the reproduction from above, I can confirm
skb->csum is not touched by inet_proto_csum_replace_by_diff and the TCP
SYN proceeds to the application after the ILA translation.
Link: https://github.com/cilium/pwru [1]
Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
Signed-off-by: Paul Chaignon <paul.chaignon(a)gmail.com>
Acked-by: Daniel Borkmann <daniel(a)iogearbox.net>
Link: https://patch.msgid.link/b5539869e3550d46068504feb02d37653d939c0b.174850948…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Paul Chaignon <paul.chaignon(a)gmail.com>
---
include/net/checksum.h | 2 +-
net/core/filter.c | 2 +-
net/core/utils.c | 4 ++--
net/ipv6/ila/ila_common.c | 6 +++---
4 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 1338cb92c8e7..28b101f26636 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -158,7 +158,7 @@ void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
const __be32 *from, const __be32 *to,
bool pseudohdr);
void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb,
- __wsum diff, bool pseudohdr);
+ __wsum diff, bool pseudohdr, bool ipv6);
static __always_inline
void inet_proto_csum_replace2(__sum16 *sum, struct sk_buff *skb,
diff --git a/net/core/filter.c b/net/core/filter.c
index 99b23fd2f509..e0d978c1a4cd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1999,7 +1999,7 @@ BPF_CALL_5(bpf_l4_csum_replace, struct sk_buff *, skb, u32, offset,
if (unlikely(from != 0))
return -EINVAL;
- inet_proto_csum_replace_by_diff(ptr, skb, to, is_pseudo);
+ inet_proto_csum_replace_by_diff(ptr, skb, to, is_pseudo, false);
break;
case 2:
inet_proto_csum_replace2(ptr, skb, from, to, is_pseudo);
diff --git a/net/core/utils.c b/net/core/utils.c
index 27f4cffaae05..b8c21a859e27 100644
--- a/net/core/utils.c
+++ b/net/core/utils.c
@@ -473,11 +473,11 @@ void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
EXPORT_SYMBOL(inet_proto_csum_replace16);
void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb,
- __wsum diff, bool pseudohdr)
+ __wsum diff, bool pseudohdr, bool ipv6)
{
if (skb->ip_summed != CHECKSUM_PARTIAL) {
csum_replace_by_diff(sum, diff);
- if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr)
+ if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr && !ipv6)
skb->csum = ~csum_sub(diff, skb->csum);
} else if (pseudohdr) {
*sum = ~csum_fold(csum_add(diff, csum_unfold(*sum)));
diff --git a/net/ipv6/ila/ila_common.c b/net/ipv6/ila/ila_common.c
index 95e9146918cc..b8d43ed4689d 100644
--- a/net/ipv6/ila/ila_common.c
+++ b/net/ipv6/ila/ila_common.c
@@ -86,7 +86,7 @@ static void ila_csum_adjust_transport(struct sk_buff *skb,
diff = get_csum_diff(ip6h, p);
inet_proto_csum_replace_by_diff(&th->check, skb,
- diff, true);
+ diff, true, true);
}
break;
case NEXTHDR_UDP:
@@ -97,7 +97,7 @@ static void ila_csum_adjust_transport(struct sk_buff *skb,
if (uh->check || skb->ip_summed == CHECKSUM_PARTIAL) {
diff = get_csum_diff(ip6h, p);
inet_proto_csum_replace_by_diff(&uh->check, skb,
- diff, true);
+ diff, true, true);
if (!uh->check)
uh->check = CSUM_MANGLED_0;
}
@@ -111,7 +111,7 @@ static void ila_csum_adjust_transport(struct sk_buff *skb,
diff = get_csum_diff(ip6h, p);
inet_proto_csum_replace_by_diff(&ih->icmp6_cksum, skb,
- diff, true);
+ diff, true, true);
}
break;
}
--
2.43.0
From: Igor Pylypiv <ipylypiv(a)google.com>
[ Upstream commit e4f949ef1516c0d74745ee54a0f4882c1f6c7aea ]
pm8001_phy_control() populates the enable_completion pointer with a stack
address, sends a PHY_LINK_RESET / PHY_HARD_RESET, waits 300 ms, and
returns. The problem arises when a phy control response comes late. After
300 ms the pm8001_phy_control() function returns and the passed
enable_completion stack address is no longer valid. Late phy control
response invokes complete() on a dangling enable_completion pointer which
leads to a kernel crash.
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Signed-off-by: Terrence Adams <tadamsjr(a)google.com>
Link: https://lore.kernel.org/r/20240627155924.2361370-2-tadamsjr@google.com
Acked-by: Jack Wang <jinpu.wang(a)ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Xiangyu Chen <xiangyu.chen(a)windriver.com>
Signed-off-by: He Zhe <zhe.he(a)windriver.com>
---
Test info:
Building OS: Ubuntu 22.04.5
GCC: gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
Base Tree: https://web.git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
Builds:
make defconfig; make bzImage
make allyesconfig; make bzImage
make allmodconfig; make bzImage
Boot target: Intel Basking Ridge
---
drivers/scsi/pm8001/pm8001_sas.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index a87c3d7e3e5c..f491edf73e23 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -168,7 +168,7 @@ int pm8001_phy_control(struct asd_sas_phy *sas_phy, enum phy_func func,
unsigned long flags;
pm8001_ha = sas_phy->ha->lldd_ha;
phy = &pm8001_ha->phy[phy_id];
- pm8001_ha->phy[phy_id].enable_completion = &completion;
+
switch (func) {
case PHY_FUNC_SET_LINK_RATE:
rates = funcdata;
@@ -181,6 +181,7 @@ int pm8001_phy_control(struct asd_sas_phy *sas_phy, enum phy_func func,
rates->maximum_linkrate;
}
if (pm8001_ha->phy[phy_id].phy_state == PHY_LINK_DISABLE) {
+ pm8001_ha->phy[phy_id].enable_completion = &completion;
PM8001_CHIP_DISP->phy_start_req(pm8001_ha, phy_id);
wait_for_completion(&completion);
}
@@ -189,6 +190,7 @@ int pm8001_phy_control(struct asd_sas_phy *sas_phy, enum phy_func func,
break;
case PHY_FUNC_HARD_RESET:
if (pm8001_ha->phy[phy_id].phy_state == PHY_LINK_DISABLE) {
+ pm8001_ha->phy[phy_id].enable_completion = &completion;
PM8001_CHIP_DISP->phy_start_req(pm8001_ha, phy_id);
wait_for_completion(&completion);
}
@@ -197,6 +199,7 @@ int pm8001_phy_control(struct asd_sas_phy *sas_phy, enum phy_func func,
break;
case PHY_FUNC_LINK_RESET:
if (pm8001_ha->phy[phy_id].phy_state == PHY_LINK_DISABLE) {
+ pm8001_ha->phy[phy_id].enable_completion = &completion;
PM8001_CHIP_DISP->phy_start_req(pm8001_ha, phy_id);
wait_for_completion(&completion);
}
--
2.34.1
commit de5fbbe1531f645c8b56098be8d1faf31e46f7f0 upstream
The appletbdrm driver is exclusively for Touch Bars on x86 Intel Macs.
The M1 Macs have a separate driver. So, let's avoid compiling it for
other architectures.
Signed-off-by: Aditya Garg <gargaditya08(a)live.com>
Reviewed-by: Alyssa Rosenzweig <alyssa(a)rosenzweig.io>
Link: https://lore.kernel.org/r/PN3PR01MB95970778982F28E4A3751392B8B72@PN3PR01MB9…
Signed-off-by: Alyssa Rosenzweig <alyssa(a)rosenzweig.io>
---
Sending this since https://lore.kernel.org/stable/20250617152509.019353397@linuxfoundation.org/
was also backported to 6.15
drivers/gpu/drm/tiny/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
index 95c1457d7..d66681d0e 100644
--- a/drivers/gpu/drm/tiny/Kconfig
+++ b/drivers/gpu/drm/tiny/Kconfig
@@ -3,6 +3,7 @@
config DRM_APPLETBDRM
tristate "DRM support for Apple Touch Bars"
depends on DRM && USB && MMU
+ depends on X86 || COMPILE_TEST
select DRM_GEM_SHMEM_HELPER
select DRM_KMS_HELPER
help
--
2.49.0
Jan,
I noticed that fanotify22, the FAN_FS_ERROR test has regressed in the
5.15.y stable tree.
This is because commit d3476f3dad4a ("ext4: don't set SB_RDONLY after
filesystem errors") was backported to 5.15.y and the later Fixes
commit could not be cleanly applied to 5.15.y over the new mount api
re-factoring.
I am not sure it is critical to fix this regression, because it is
mostly a regression in a test feature, but I think the backport is
pretty simple, although I could be missing something.
Please ACK if you agree that this backport should be applied to 5.15.y.
Thanks,
Amir.
Amir Goldstein (2):
ext4: make 'abort' mount option handling standard
ext4: avoid remount errors with 'abort' mount option
fs/ext4/ext4.h | 1 +
fs/ext4/super.c | 15 +++++++++------
2 files changed, 10 insertions(+), 6 deletions(-)
--
2.47.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f90fff1e152dedf52b932240ebbd670d83330eca
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025061744-precinct-rubble-45c9@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f90fff1e152dedf52b932240ebbd670d83330eca Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg(a)redhat.com>
Date: Fri, 13 Jun 2025 19:26:50 +0200
Subject: [PATCH] posix-cpu-timers: fix race between handle_posix_cpu_timers()
and posix_cpu_timer_del()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If an exiting non-autoreaping task has already passed exit_notify() and
calls handle_posix_cpu_timers() from IRQ, it can be reaped by its parent
or debugger right after unlock_task_sighand().
If a concurrent posix_cpu_timer_del() runs at that moment, it won't be
able to detect timer->it.cpu.firing != 0: cpu_timer_task_rcu() and/or
lock_task_sighand() will fail.
Add the tsk->exit_state check into run_posix_cpu_timers() to fix this.
This fix is not needed if CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y, because
exit_task_work() is called before exit_notify(). But the check still
makes sense, task_work_add(&tsk->posix_cputimers_work.work) will fail
anyway in this case.
Cc: stable(a)vger.kernel.org
Reported-by: Benoît Sevens <bsevens(a)google.com>
Fixes: 0bdd2ed4138e ("sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()")
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 50e8d04ab661..2e5b89d7d866 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -1405,6 +1405,15 @@ void run_posix_cpu_timers(void)
lockdep_assert_irqs_disabled();
+ /*
+ * Ensure that release_task(tsk) can't happen while
+ * handle_posix_cpu_timers() is running. Otherwise, a concurrent
+ * posix_cpu_timer_del() may fail to lock_task_sighand(tsk) and
+ * miss timer->it.cpu.firing != 0.
+ */
+ if (tsk->exit_state)
+ return;
+
/*
* If the actual expiry is deferred to task work context and the
* work is already scheduled there is no point to do anything here.
The arm64 page table dump code can race with concurrent modification of the
kernel page tables. When leaf entries are modified concurrently, the dump
code may log stale or inconsistent information for a VA range, but this is
otherwise not harmful.
When intermediate levels of table are freed, the dump code will continue to
use memory which has been freed and potentially reallocated for another
purpose. In such cases, the dump code may dereference bogus addresses,
leading to a number of potential problems.
This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
("arm64/mm: Hold memory hotplug lock while walking for kernel page table
dump")' but the same was missed for ptdump_check_wx() which faced the race
condition as well. Let's just take the memory hotplug lock while executing
ptdump_check_wx().
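The wrapper pattern can be sketched as a user-space model. Everything
prefixed with model_ is hypothetical: a plain counter stands in for the
memory hotplug lock taken by get_online_mems()/put_online_mems(), and the
walk only records whether the lock was held when it ran.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the memory hotplug lock: a reader count. */
static int model_hotplug_readers;

/* Records whether the last walk ran with the lock held. */
static bool model_walk_was_protected;

static void model_get_online_mems(void)
{
	model_hotplug_readers++;
}

static void model_put_online_mems(void)
{
	assert(model_hotplug_readers > 0);
	model_hotplug_readers--;
}

/* The original walk, renamed; it must only run with the lock held. */
static bool model_check_wx_unlocked(void)
{
	model_walk_was_protected = model_hotplug_readers > 0;
	return true; /* pretend no writable and executable mapping was found */
}

/* Public entry point: take the lock around the walk and pass the result on. */
static bool model_check_wx(void)
{
	bool ret;

	model_get_online_mems();
	ret = model_check_wx_unlocked();
	model_put_online_mems();
	return ret;
}
```

Keeping the walk in a separate static function lets the public entry point
own the lock, so every caller gets the protection without having to
remember to take it.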
Cc: stable(a)vger.kernel.org
Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
Reported-by: Dev Jain <dev.jain(a)arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
---
This patch applies on v6.16-rc1
Dev Jain found this via code inspection.
arch/arm64/mm/ptdump.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 421a5de806c62..551f80d41e8d2 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -328,7 +328,7 @@ static struct ptdump_info kernel_ptdump_info __ro_after_init = {
.mm = &init_mm,
};
-bool ptdump_check_wx(void)
+static bool __ptdump_check_wx(void)
{
struct ptdump_pg_state st = {
.seq = NULL,
@@ -367,6 +367,16 @@ bool ptdump_check_wx(void)
}
}
+bool ptdump_check_wx(void)
+{
+ bool ret;
+
+ get_online_mems();
+ ret = __ptdump_check_wx();
+ put_online_mems();
+ return ret;
+}
+
static int __init ptdump_init(void)
{
u64 page_offset = _PAGE_OFFSET(vabits_actual);
--
2.30.2
This reverts commit 5ff79cabb23a2f14d2ed29e9596aec908905a0e6.
Although the Alienware m16 R1 AMD model supports G-Mode, it actually has
a lower power ceiling than plain "performance" profile, which results in
lower performance.
Reported-by: Cihan Ozakca <cozakca(a)outlook.com>
Cc: stable(a)vger.kernel.org # 6.15.x
Signed-off-by: Kurt Borja <kuurtb(a)gmail.com>
---
Hi all,
Contrary to (my) intuition, imitating Windows behavior actually results
in LOWER performance.
I was having second thoughts about this revert because users will notice
that "performance" no longer turns on the G-Mode key found in this
laptop. Some users may think this is actually a regression, but IMO
lower performance is worse.
---
drivers/platform/x86/dell/alienware-wmi-wmax.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/platform/x86/dell/alienware-wmi-wmax.c b/drivers/platform/x86/dell/alienware-wmi-wmax.c
index c42f9228b0b255fe962b735ac96486824e83945f..20ec122a9fe0571a1ecd2ccf630615564ab30481 100644
--- a/drivers/platform/x86/dell/alienware-wmi-wmax.c
+++ b/drivers/platform/x86/dell/alienware-wmi-wmax.c
@@ -119,7 +119,7 @@ static const struct dmi_system_id awcc_dmi_table[] __initconst = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m16 R1 AMD"),
},
- .driver_data = &g_series_quirks,
+ .driver_data = &generic_quirks,
},
{
.ident = "Alienware m16 R2",
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250611-m16-rev-8109b82dee30
--
~ Kurt
After commit 1aaf8c122918 ("mm: gup: fix infinite loop within
__get_longterm_locked") we are able to longterm pin folios that are not
supposed to get longterm pinned, simply because they temporarily have
the LRU flag cleared (esp. temporarily isolated).
For example, two __get_longterm_locked() callers can race, or
__get_longterm_locked() can race with anything else that temporarily
isolates folios.
The introducing commit mentions the use case of a driver that uses
vm_ops->fault to insert pages allocated through cma_alloc() into the
page tables, assuming they can later get longterm pinned. These pages/
folios would never have the LRU flag set and consequently cannot get
isolated. There is no known in-tree user making use of that so far,
fortunately.
To handle that in the future -- and avoid retrying forever to
isolate/migrate them -- we will need a different mechanism for the CMA
area *owner* to indicate that it actually already allocated the page and
is fine with longterm pinning it. The LRU flag is not suitable for that.
Probably we can look up the relevant CMA area and query the bitmap; we
only have to care about some races, probably. If already allocated,
we could just allow longterm pinning.
Anyhow, let's fix the "must not be longterm pinned" problem first by
reverting the original commit.
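The difference between "nothing was isolated" and "nothing needs to move"
can be sketched with a user-space model. The model_folio fields and the
model_* functions are hypothetical simplifications: device-coherent folios
and duplicate entries are ignored, and isolation is assumed to succeed
exactly when the folio currently has the LRU flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a folio for this model. */
struct model_folio {
	bool longterm_pinnable;
	bool on_lru; /* false while temporarily isolated by someone else */
};

struct model_collect_result {
	unsigned long collected; /* folios that must not be longterm pinned */
	unsigned long isolated;  /* subset that could be put on the list */
};

/* Count every unpinnable folio, whether or not it could be isolated. */
static struct model_collect_result
model_collect_unpinnable(const struct model_folio *folios, size_t n)
{
	struct model_collect_result res = { 0, 0 };
	size_t i;

	assert(folios != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (folios[i].longterm_pinnable)
			continue;
		res.collected++;
		if (folios[i].on_lru)
			res.isolated++;
	}
	return res;
}

/*
 * The caller decides on the collected count. An empty isolation list
 * alone does not prove that all folios are safe to pin.
 */
static bool model_must_migrate_or_retry(const struct model_collect_result *res)
{
	assert(res != NULL);
	return res->collected != 0;
}
```

In this model an unpinnable folio that is temporarily off the LRU yields
collected == 1 and isolated == 0, so the caller still goes through the
migrate-or-retry path instead of reporting success.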
Fixes: 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
Closes: https://lore.kernel.org/all/20250522092755.GA3277597@tiffany/
Reported-by: Hyesoo Yu <hyesoo.yu(a)samsung.com>
Cc: <Stable(a)vger.kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Zhaoyang Huang <zhaoyang.huang(a)unisoc.com>
Cc: Aijun Sun <aijun.sun(a)unisoc.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
mm/gup.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/mm/gup.c b/mm/gup.c
index e065a49842a87..3c39cbbeebef1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2303,13 +2303,13 @@ static void pofs_unpin(struct pages_or_folios *pofs)
/*
* Returns the number of collected folios. Return value is always >= 0.
*/
-static void collect_longterm_unpinnable_folios(
+static unsigned long collect_longterm_unpinnable_folios(
struct list_head *movable_folio_list,
struct pages_or_folios *pofs)
{
+ unsigned long i, collected = 0;
struct folio *prev_folio = NULL;
bool drain_allow = true;
- unsigned long i;
for (i = 0; i < pofs->nr_entries; i++) {
struct folio *folio = pofs_get_folio(pofs, i);
@@ -2321,6 +2321,8 @@ static void collect_longterm_unpinnable_folios(
if (folio_is_longterm_pinnable(folio))
continue;
+ collected++;
+
if (folio_is_device_coherent(folio))
continue;
@@ -2342,6 +2344,8 @@ static void collect_longterm_unpinnable_folios(
NR_ISOLATED_ANON + folio_is_file_lru(folio),
folio_nr_pages(folio));
}
+
+ return collected;
}
/*
@@ -2418,9 +2422,11 @@ static long
check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
{
LIST_HEAD(movable_folio_list);
+ unsigned long collected;
- collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
- if (list_empty(&movable_folio_list))
+ collected = collect_longterm_unpinnable_folios(&movable_folio_list,
+ pofs);
+ if (!collected)
return 0;
return migrate_longterm_unpinnable_folios(&movable_folio_list, pofs);
--
2.49.0
This patch fixes Type-C compliance test TD 4.7.6 - Try.SNK DRP Connect
SNKAS.
tVbusON has a limit of 275ms when entering SRC_ATTACHED. Compliance
testers can interpret the TryWait.Src to Attached.Src transition after
Try.Snk as being in Attached.Src the entire time, so ~170ms is lost
to the debounce timer.
Setting the data role can be a costly operation in host mode, and when
completed after 100ms can cause Type-C compliance test check TD 4.7.5.V.4
to fail.
Turn VBUS on before tcpm_set_roles to meet timing requirement.
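The reordered sequence and its unwind path can be sketched as a user-space
model. The model_port structure, the model_step values and the injected
failure point are hypothetical, and VCONN is enabled unconditionally here,
whereas the driver only does so for certain cable orientations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step identifiers, listed in the order they are attempted. */
enum model_step { STEP_NONE, STEP_VCONN, STEP_VBUS, STEP_ROLES, STEP_PD_RX };

struct model_port {
	bool vconn, vbus, roles, pd_rx;
	enum model_step fail_at; /* step that should fail, or STEP_NONE */
};

/* Perform one step, or fail if this is the injected failure point. */
static int model_do(const struct model_port *p, enum model_step s, bool *flag)
{
	if (p->fail_at == s)
		return -1;
	*flag = true;
	return 0;
}

/* VBUS is turned on before the potentially slow role switch. */
static int model_src_attach(struct model_port *p)
{
	int ret;

	assert(p != NULL);
	ret = model_do(p, STEP_VCONN, &p->vconn);
	if (ret < 0)
		return ret;
	ret = model_do(p, STEP_VBUS, &p->vbus);
	if (ret < 0)
		goto out_disable_vconn;
	ret = model_do(p, STEP_ROLES, &p->roles);
	if (ret < 0)
		goto out_disable_vbus;
	ret = model_do(p, STEP_PD_RX, &p->pd_rx);
	if (ret < 0)
		goto out_disable_mux;
	return 0;

out_disable_mux:
	p->roles = false;
out_disable_vbus:
	p->vbus = false;
out_disable_vconn:
	p->vconn = false;
	return ret;
}
```

Because the error labels are ordered as the reverse of the setup steps, a
failure at any point in this model undoes exactly what was already enabled.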
Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)")
Cc: stable(a)vger.kernel.org
Signed-off-by: RD Babiera <rdbabiera(a)google.com>
Reviewed-by: Badhri Jagan Sridharan <badhri(a)google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
---
Changes since v1:
* Rebased on top of usb-linus for v6.15
Changes since v2:
* Restored to v1, usb-next and usb-linus are both synced to v6.16
so there is no longer a version mismatch possibility.
---
drivers/usb/typec/tcpm/tcpm.c | 34 +++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index 1a1f9e1f8e4e..1f6fdfaa34bf 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -4410,17 +4410,6 @@ static int tcpm_src_attach(struct tcpm_port *port)
tcpm_enable_auto_vbus_discharge(port, true);
- ret = tcpm_set_roles(port, true, TYPEC_STATE_USB,
- TYPEC_SOURCE, tcpm_data_role_for_source(port));
- if (ret < 0)
- return ret;
-
- if (port->pd_supported) {
- ret = port->tcpc->set_pd_rx(port->tcpc, true);
- if (ret < 0)
- goto out_disable_mux;
- }
-
/*
* USB Type-C specification, version 1.2,
* chapter 4.5.2.2.8.1 (Attached.SRC Requirements)
@@ -4430,13 +4419,24 @@ static int tcpm_src_attach(struct tcpm_port *port)
(polarity == TYPEC_POLARITY_CC2 && port->cc1 == TYPEC_CC_RA)) {
ret = tcpm_set_vconn(port, true);
if (ret < 0)
- goto out_disable_pd;
+ return ret;
}
ret = tcpm_set_vbus(port, true);
if (ret < 0)
goto out_disable_vconn;
+ ret = tcpm_set_roles(port, true, TYPEC_STATE_USB, TYPEC_SOURCE,
+ tcpm_data_role_for_source(port));
+ if (ret < 0)
+ goto out_disable_vbus;
+
+ if (port->pd_supported) {
+ ret = port->tcpc->set_pd_rx(port->tcpc, true);
+ if (ret < 0)
+ goto out_disable_mux;
+ }
+
port->pd_capable = false;
port->partner = NULL;
@@ -4447,14 +4447,14 @@ static int tcpm_src_attach(struct tcpm_port *port)
return 0;
-out_disable_vconn:
- tcpm_set_vconn(port, false);
-out_disable_pd:
- if (port->pd_supported)
- port->tcpc->set_pd_rx(port->tcpc, false);
out_disable_mux:
tcpm_mux_set(port, TYPEC_STATE_SAFE, USB_ROLE_NONE,
TYPEC_ORIENTATION_NONE);
+out_disable_vbus:
+ tcpm_set_vbus(port, false);
+out_disable_vconn:
+ tcpm_set_vconn(port, false);
+
return ret;
}
base-commit: e04c78d86a9699d136910cfc0bdcf01087e3267e
--
2.50.0.rc2.701.gf1e915cc24-goog
Sohil reported seeing a split lock warning when running a test that
generates userspace #DB:
x86/split lock detection: #DB: sigtrap_loop_64/4614 took a bus_lock trap at address: 0x4011ae
We investigated the issue and figured out:
1) The warning is a false positive.
2) It is not caused by the test itself.
3) It occurs even when Bus Lock Detection (BLD) is disabled.
4) It only happens on the first #DB on a CPU.
And the root cause is, at boot time, Linux zeros DR6. This leads to
different DR6 values depending on whether the CPU supports BLD:
1) On CPUs with BLD support, DR6 becomes 0xFFFF07F0 (bit 11, DR6.BLD,
is cleared).
2) On CPUs without BLD, DR6 becomes 0xFFFF0FF0.
Since only BLD-induced #DB exceptions clear DR6.BLD and other debug
exceptions leave it unchanged, even if the first #DB is unrelated to
BLD, DR6.BLD is still cleared. As a result, such a first #DB is
misinterpreted as a BLD #DB, and a false warning is triggered.
Fix the bug by initializing DR6 by writing its architectural reset
value at boot time.
DR7 suffers from a similar issue, so apply the same fix.
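The two DR6 values quoted above can be reproduced with a simplified
user-space model. The MODEL_* constants and model_* functions are
hypothetical names; the model only covers the reserved-as-one bits and the
active-low BLD bit (bit 11) described in this cover letter and ignores
every other DR6 field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the description above; the names are hypothetical. */
#define MODEL_DR6_RESET 0xFFFF0FF0u
#define MODEL_DR6_BLD   (1u << 11)

/*
 * Model of what DR6 reads back after a write. Bits that always read as
 * one are forced on. On a CPU with bus lock detection, bit 11 is a real
 * writable bit, so writing zero leaves it cleared.
 */
static uint32_t model_dr6_after_write(uint32_t written, bool cpu_has_bld)
{
	uint32_t forced_ones = MODEL_DR6_RESET;

	assert((MODEL_DR6_RESET & MODEL_DR6_BLD) != 0);
	if (cpu_has_bld)
		forced_ones &= ~MODEL_DR6_BLD;
	return written | forced_ones;
}

/* DR6.BLD is active low: a cleared bit is read as "bus lock detected". */
static bool model_looks_like_bus_lock(uint32_t dr6)
{
	return !(dr6 & MODEL_DR6_BLD);
}
```

In this model, zeroing DR6 on a BLD-capable CPU leaves 0xFFFF07F0, which
reads as a bus lock on the next #DB, while writing the reset value leaves
bit 11 set and avoids the false positive.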
This patch set is based on tip/x86/urgent branch as of today.
Link to the previous patch set v2:
https://lore.kernel.org/lkml/20250617073234.1020644-1-xin@zytor.com/
Changes in v3:
*) Polish initialize_debug_regs() (PeterZ).
*) Rewrite the comment for DR6_RESERVED definition (Sohil and Sean).
*) Reword the patch 2's changelog using Sean's description.
*) Explain the definition of DR7_FIXED_1 (Sohil).
*) Collect TB, RB, AB (PeterZ, Sohil and Sean).
Xin Li (Intel) (2):
x86/traps: Initialize DR6 by writing its architectural reset value
x86/traps: Initialize DR7 by writing its architectural reset value
arch/x86/include/asm/debugreg.h | 19 ++++++++++++----
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/include/uapi/asm/debugreg.h | 21 ++++++++++++++++-
arch/x86/kernel/cpu/common.c | 24 ++++++++------------
arch/x86/kernel/kgdb.c | 2 +-
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
arch/x86/kernel/traps.c | 34 +++++++++++++++++-----------
arch/x86/kvm/x86.c | 4 ++--
9 files changed, 72 insertions(+), 38 deletions(-)
base-commit: 2aebf5ee43bf0ed225a09a30cf515d9f2813b759
--
2.49.0
From: anvithdosapati <anvithdosapati(a)google.com>
In ufshcd_host_reset_and_restore, scale up clocks only when clock
scaling is supported. Without this change cpu latency is voted for 0
(ufshcd_pm_qos_update) during resume unconditionally.
Signed-off-by: anvithdosapati <anvithdosapati(a)google.com>
Fixes: a3cd5ec55f6c7 ("scsi: ufs: add load based scaling of UFS gear")
Cc: stable(a)vger.kernel.org
---
v2:
- Update commit message
- Add Fixes and Cc stable
drivers/ufs/core/ufshcd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 4410e7d93b7d..fac381ea2b3a 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -7802,7 +7802,8 @@ static int ufshcd_host_reset_and_restore(struct ufs_hba *hba)
hba->silence_err_logs = false;
/* scale up clocks to max frequency before full reinitialization */
- ufshcd_scale_clks(hba, ULONG_MAX, true);
+ if (ufshcd_is_clkscaling_supported(hba))
+ ufshcd_scale_clks(hba, ULONG_MAX, true);
err = ufshcd_hba_enable(hba);
--
2.50.0.rc1.591.g9c95f17f64-goog
Hi dear LKML and community, this is my first post here, so I'd
appreciate any guidance or redirection if it's due.
Starting from commit
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…,
HDMI handling for certain refresh rates on my intel iGPU is broken.
The error is still present in 6.16-rc1.
Specifically, this is the command that disambiguates the newer broken
kernels:
xrandr --output HDMI-1 --auto --scale 1x1 --mode 1920x1080
--rate 120 --pos 0x0 --output eDP-1 --off
The important parts are 1920x1080 and 120Hz. When run on commits prior
to the bisected above, it behaves as expected, delivering 1920x1080 @
120Hz. When run on kernel builds with the above commit included (that
commit or later), the monitor goes completely blank. After about 30
seconds, it shuts down entirely (which I assume means that from the
monitor's perspective, HDMI got "disconnected").
On this link you can see my original report in the ArchLinux community,
where Christian Heusel (@gromit) kindly guided me through the bisection
process and built the bisection images for me to try:
https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/14…
This link also contains the bisection history.
Additional info:
* The monitor and the notebook are connected via an HDMI cable, the
monitor itself is a 4k@120Hz monitor.
* According to `lsmod | grep -E "(i915|Xe)"`, I'm using the i915 kernel
driver for the GPU.
* The GPU is an iGPU from intel, specifically `Intel Core Ultra 7 155H`.
* One symptom that disambiguates the working and non-working kernels,
besides whether they actually have the bug, is that the broken kernels
cause xrandr to additionally report the 144.05 refresh rate as possible
for the monitor, whereas the non-broken kernels consistently cause
xrandr to only list refresh rate 120 and below as possible. I'm only
ever testing the refresh rate of 120, but the presence of the 144.05
rate is correlated.
If any other information or anything is needed, please write.
Thank you,
Vas
----
#regzbot introduced: 1efd5384277eb71fce20922579061cd3acdb07cf
#regzbot title: intel iGPU with HDMI PLL stopped working at 1080p@120Hz
1efd5384
#regzbot link:
https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/145