[ Upstream commit 83b0177a6c48 ]
[ Patch for 6.1.y, 6.6.y, 6.12.y trees ]
To avoid racing with should_flush_tlb, setting loaded_mm to
LOADED_MM_SWITCHING must happen before reading tlb_gen.
This patch differs from the upstream fix since the ordering issue in
stable trees is different: here, the relevant code blocks are in the
wrong order due to a bad rebase earlier, so the fix is to reorder
them.
Signed-off-by: Stephen Dolan <sdolan(a)janestreet.com>
Acked-by: Dave Hansen <dave.hansen(a)intel.com>
Link: https://lore.kernel.org/lkml/CAHDw0oGd0B4=uuv8NGqbUQ_ZVmSheU2bN70e4QhFXWvuA…
---
arch/x86/mm/tlb.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 8629d90fdcd9..ed182831063c 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -606,6 +606,14 @@ void switch_mm_irqs_off(struct mm_struct *unused,
struct mm_struct *next,
*/
cond_mitigation(tsk);
+ /*
+ * Indicate that CR3 is about to change. nmi_uaccess_okay()
+ * and others are sensitive to the window where mm_cpumask(),
+ * CR3 and cpu_tlbstate.loaded_mm are not all in sync.
+ */
+ this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
+ barrier();
+
/*
* Stop remote flushes for the previous mm.
* Skip kernel threads; we never send init_mm TLB
flushing IPIs,
@@ -623,14 +631,6 @@ void switch_mm_irqs_off(struct mm_struct *unused,
struct mm_struct *next,
next_tlb_gen = atomic64_read(&next->context.tlb_gen);
choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
-
- /*
- * Indicate that CR3 is about to change. nmi_uaccess_okay()
- * and others are sensitive to the window where mm_cpumask(),
- * CR3 and cpu_tlbstate.loaded_mm are not all in sync.
- */
- this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
- barrier();
}
new_lam = mm_lam_cr3_mask(next);
--
2.43.7
From: Edward Adam Davis <eadavis(a)qq.com>
commit 96cb948408b3adb69df7e451ba7da9d21f814d00 upstream.
The reproducer passed in an irq number(0x80008000) that was too large,
which triggered the oob.
Added an interrupt number check to prevent users from passing in an irq
number that was too large.
If `it->options[1]` is 31, then `1 << it->options[1]` is still invalid
because it shifts a 1-bit into the sign bit (which is UB in C).
Possible solutions include reducing the upper bound on the
`it->options[1]` value to 30 or lower, or using `1U << it->options[1]`.
The old code would just not attempt to request the IRQ if the
`options[1]` value were invalid. And it would still configure the
device without interrupts even if the call to `request_irq` returned an
error. So it would be better to combine this test with the test below.
Fixes: fff46207245c ("staging: comedi: pcl726: enable the interrupt support code")
Cc: stable <stable(a)kernel.org> # 5.13+
Reported-by: syzbot+5cd373521edd68bebcb3(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5cd373521edd68bebcb3
Tested-by: syzbot+5cd373521edd68bebcb3(a)syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis(a)qq.com>
Reviewed-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/tencent_3C66983CC1369E962436264A50759176BF09@qq.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[Andrey Troshin: backport fix from drivers/comedi/drivers/pcl726.c to drivers/staging/comedi/drivers/pcl726.c]
Signed-off-by: Andrey Troshin <drtrosh(a)yandex-team.ru>
---
Backport fix for CVE-2025-39685
Link: https://nvd.nist.gov/vuln/detail/CVE-2025-39685
---
drivers/staging/comedi/drivers/pcl726.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/comedi/drivers/pcl726.c b/drivers/staging/comedi/drivers/pcl726.c
index 64eb649c9813..f26d7f07d749 100644
--- a/drivers/staging/comedi/drivers/pcl726.c
+++ b/drivers/staging/comedi/drivers/pcl726.c
@@ -327,7 +327,8 @@ static int pcl726_attach(struct comedi_device *dev,
* Hook up the external trigger source interrupt only if the
* user config option is valid and the board supports interrupts.
*/
- if (it->options[1] && (board->irq_mask & (1 << it->options[1]))) {
+ if (it->options[1] > 0 && it->options[1] < 16 &&
+ (board->irq_mask & (1U << it->options[1]))) {
ret = request_irq(it->options[1], pcl726_interrupt, 0,
dev->board_name, dev);
if (ret == 0) {
--
2.34.1
The padding field in the structure was previously reserved to
maintain a stable interface for potential new fields, ensuring
compatibility with user-space shared data structures.
However,it was accidentally removed by tiantao in a prior commit,
which may lead to incompatibility between user space and the kernel.
This patch reinstates the padding to restore the original structure
layout and preserve compatibility.
Fixes: 8ddde07a3d28 ("dma-mapping: benchmark: extract a common header file for map_benchmark definition")
Cc: stable(a)vger.kernel.org
Acked-by: Barry Song <baohua(a)kernel.org>
Signed-off-by: Qinxin Xia <xiaqinxin(a)huawei.com>
---
include/linux/map_benchmark.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/map_benchmark.h b/include/linux/map_benchmark.h
index 62674c83bde4..48e2ff95332f 100644
--- a/include/linux/map_benchmark.h
+++ b/include/linux/map_benchmark.h
@@ -27,5 +27,6 @@ struct map_benchmark {
__u32 dma_dir; /* DMA data direction */
__u32 dma_trans_ns; /* time for DMA transmission in ns */
__u32 granule; /* how many PAGE_SIZE will do map/unmap once a time */
+ __u8 expansion[76]; /* For future use */
};
#endif /* _KERNEL_DMA_BENCHMARK_H */
--
2.33.0
On s390 systems, which use a machine level hypervisor, PCI devices are
always accessed through a form of PCI pass-through which fundamentally
operates on a per PCI function granularity. This is also reflected in the
s390 PCI hotplug driver which creates hotplug slots for individual PCI
functions. Its reset_slot() function, which is a wrapper for
zpci_hot_reset_device(), thus also resets individual functions.
Currently, the kernel's PCI_SLOT() macro assigns the same pci_slot object
to multifunction devices. This approach worked fine on s390 systems that
only exposed virtual functions as individual PCI domains to the operating
system. Since commit 44510d6fa0c0 ("s390/pci: Handling multifunctions")
s390 supports exposing the topology of multifunction PCI devices by
grouping them in a shared PCI domain. When attempting to reset a function
through the hotplug driver, the shared slot assignment causes the wrong
function to be reset instead of the intended one. It also leaks memory as
we do create a pci_slot object for the function, but don't correctly free
it in pci_slot_release().
Add a flag for struct pci_slot to allow per function PCI slots for
functions managed through a hypervisor, which exposes individual PCI
functions while retaining the topology.
Fixes: 44510d6fa0c0 ("s390/pci: Handling multifunctions")
Cc: stable(a)vger.kernel.org
Suggested-by: Niklas Schnelle <schnelle(a)linux.ibm.com>
Signed-off-by: Farhan Ali <alifm(a)linux.ibm.com>
---
drivers/pci/pci.c | 5 +++--
drivers/pci/slot.c | 25 ++++++++++++++++++++++---
include/linux/pci.h | 1 +
3 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..36ee38e0d817 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4980,8 +4980,9 @@ static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, bool probe)
static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
{
- if (dev->multifunction || dev->subordinate || !dev->slot ||
- dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
+ if (dev->subordinate || !dev->slot ||
+ dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
+ (dev->multifunction && !dev->slot->per_func_slot))
return -ENOTTY;
return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
index 50fb3eb595fe..ed10fa3ae727 100644
--- a/drivers/pci/slot.c
+++ b/drivers/pci/slot.c
@@ -63,6 +63,22 @@ static ssize_t cur_speed_read_file(struct pci_slot *slot, char *buf)
return bus_speed_read(slot->bus->cur_bus_speed, buf);
}
+static bool pci_dev_matches_slot(struct pci_dev *dev, struct pci_slot *slot)
+{
+ if (slot->per_func_slot)
+ return dev->devfn == slot->number;
+
+ return PCI_SLOT(dev->devfn) == slot->number;
+}
+
+static bool pci_slot_enabled_per_func(void)
+{
+ if (IS_ENABLED(CONFIG_S390))
+ return true;
+
+ return false;
+}
+
static void pci_slot_release(struct kobject *kobj)
{
struct pci_dev *dev;
@@ -73,7 +89,7 @@ static void pci_slot_release(struct kobject *kobj)
down_read(&pci_bus_sem);
list_for_each_entry(dev, &slot->bus->devices, bus_list)
- if (PCI_SLOT(dev->devfn) == slot->number)
+ if (pci_dev_matches_slot(dev, slot))
dev->slot = NULL;
up_read(&pci_bus_sem);
@@ -166,7 +182,7 @@ void pci_dev_assign_slot(struct pci_dev *dev)
mutex_lock(&pci_slot_mutex);
list_for_each_entry(slot, &dev->bus->slots, list)
- if (PCI_SLOT(dev->devfn) == slot->number)
+ if (pci_dev_matches_slot(dev, slot))
dev->slot = slot;
mutex_unlock(&pci_slot_mutex);
}
@@ -265,6 +281,9 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
slot->bus = pci_bus_get(parent);
slot->number = slot_nr;
+ if (pci_slot_enabled_per_func())
+ slot->per_func_slot = 1;
+
slot->kobj.kset = pci_slots_kset;
slot_name = make_slot_name(name);
@@ -285,7 +304,7 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
down_read(&pci_bus_sem);
list_for_each_entry(dev, &parent->devices, bus_list)
- if (PCI_SLOT(dev->devfn) == slot_nr)
+ if (pci_dev_matches_slot(dev, slot))
dev->slot = slot;
up_read(&pci_bus_sem);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d1fdf81fbe1e..6ad194597ab5 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -78,6 +78,7 @@ struct pci_slot {
struct list_head list; /* Node in list of slots */
struct hotplug_slot *hotplug; /* Hotplug info (move here) */
unsigned char number; /* PCI_SLOT(pci_dev->devfn) */
+ unsigned int per_func_slot:1; /* Allow per function slot */
struct kobject kobj;
};
--
2.43.0
Commit 56a06bd40fab ("virtio_net: enable gso over UDP tunnel support.")
switches to check the alignment of the virtio_net_hdr_v1_hash_tunnel
even when doing the transmission even if the feature is not
negotiated. This will cause a series performance degradation of pktgen
as the skb->data can't satisfy the alignment requirement due to the
increase of the header size then virtio-net must prepare at least 2
sgs with indirect descriptors which will introduce overheads in the
device.
Fixing this by calculate the header alignment during probe so when
tunnel gso is not negotiated, we can less strict.
Pktgen in guest + XDP_DROP on TAP + vhost_net shows the TX PPS is
recovered from 2.4Mpps to 4.45Mpps.
Note that we still need a way to recover the performance when tunnel
gso is enabled, probably a new vnet header format.
Fixes: 56a06bd40fab ("virtio_net: enable gso over UDP tunnel support.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
---
drivers/net/virtio_net.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 31bd32bdecaf..5b851df749c0 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -441,6 +441,9 @@ struct virtnet_info {
/* Packet virtio header size */
u8 hdr_len;
+ /* header alignment */
+ size_t hdr_align;
+
/* Work struct for delayed refilling if we run low on memory. */
struct delayed_work refill;
@@ -3308,8 +3311,9 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb, bool orphan)
pr_debug("%s: xmit %p %pM\n", vi->dev->name, skb, dest);
can_push = vi->any_header_sg &&
- !((unsigned long)skb->data & (__alignof__(*hdr) - 1)) &&
+ !((unsigned long)skb->data & (vi->hdr_align - 1)) &&
!skb_header_cloned(skb) && skb_headroom(skb) >= hdr_len;
+
/* Even if we can, don't push here yet as this would skew
* csum_start offset below. */
if (can_push)
@@ -6926,15 +6930,20 @@ static int virtnet_probe(struct virtio_device *vdev)
}
if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) ||
- virtio_has_feature(vdev, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO))
+ virtio_has_feature(vdev, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO)) {
vi->hdr_len = sizeof(struct virtio_net_hdr_v1_hash_tunnel);
- else if (vi->has_rss_hash_report)
+ vi->hdr_align = __alignof__(struct virtio_net_hdr_v1_hash_tunnel);
+ } else if (vi->has_rss_hash_report) {
vi->hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
- else if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
- virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
+ vi->hdr_align = __alignof__(struct virtio_net_hdr_v1_hash);
+ } else if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
+ virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
vi->hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
- else
+ vi->hdr_align = __alignof__(struct virtio_net_hdr_mrg_rxbuf);
+ } else {
vi->hdr_len = sizeof(struct virtio_net_hdr);
+ vi->hdr_align = __alignof__(struct virtio_net_hdr);
+ }
if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM))
vi->rx_tnl_csum = true;
--
2.31.1
The function mtk_dp_dt_parse() calls of_graph_get_endpoint_by_regs()
to get the endpoint device node, but fails to call of_node_put() to release
the reference when the function returns. This results in a device node
reference leak.
Fix this by adding the missing of_node_put() call before returning from
the function.
Found via static analysis and code review.
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/gpu/drm/mediatek/mtk_dp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/mediatek/mtk_dp.c b/drivers/gpu/drm/mediatek/mtk_dp.c
index bef6eeb30d3e..b0b1e158600f 100644
--- a/drivers/gpu/drm/mediatek/mtk_dp.c
+++ b/drivers/gpu/drm/mediatek/mtk_dp.c
@@ -2087,6 +2087,7 @@ static int mtk_dp_dt_parse(struct mtk_dp *mtk_dp,
endpoint = of_graph_get_endpoint_by_regs(pdev->dev.of_node, 1, -1);
len = of_property_count_elems_of_size(endpoint,
"data-lanes", sizeof(u32));
+ of_node_put(endpoint);
if (len < 0 || len > 4 || len == 3) {
dev_err(dev, "invalid data lane size: %d\n", len);
return -EINVAL;
--
2.39.5 (Apple Git-154)
of_graph_get_endpoint_by_regs() gets a reference to the endpoint node
to read the "bus-width" property but fails to call of_node_put()
to release the reference, causing a reference count leak.
Add the missing of_node_put() call to fix this.
Found via static analysis and code review.
Fixes: d284ccd8588c ("drm/bridge: sii902x: Set input bus format based on bus-width")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/gpu/drm/bridge/sii902x.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/bridge/sii902x.c b/drivers/gpu/drm/bridge/sii902x.c
index d537b1d036fb..3a247ac3c7dd 100644
--- a/drivers/gpu/drm/bridge/sii902x.c
+++ b/drivers/gpu/drm/bridge/sii902x.c
@@ -1189,8 +1189,10 @@ static int sii902x_probe(struct i2c_client *client)
sii902x->bus_width = 24;
endpoint = of_graph_get_endpoint_by_regs(dev->of_node, 0, -1);
- if (endpoint)
+ if (endpoint) {
of_property_read_u32(endpoint, "bus-width", &sii902x->bus_width);
+ of_node_put(endpoint);
+ }
endpoint = of_graph_get_endpoint_by_regs(dev->of_node, 1, -1);
if (endpoint) {
--
2.39.5 (Apple Git-154)
The function samsung_dsim_parse_dt() calls of_graph_get_endpoint_by_regs()
to get the endpoint device node, but fails to call of_node_put() to release
the reference when the function returns. This results in a device node
reference leak.
Fix this by adding the missing of_node_put() call before returning from
the function.
Found via static analysis and code review.
Fixes: 77169a11d4e9 ("drm/bridge: samsung-dsim: add driver support for exynos7870 DSIM bridge")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/gpu/drm/bridge/samsung-dsim.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c b/drivers/gpu/drm/bridge/samsung-dsim.c
index eabc4c32f6ab..1a5acd5077ad 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -2086,6 +2086,7 @@ static int samsung_dsim_parse_dt(struct samsung_dsim *dsi)
if (lane_polarities[1])
dsi->swap_dn_dp_data = true;
}
+ of_node_put(endpoint);
return 0;
}
--
2.39.5 (Apple Git-154)
The bus_find_device_by_name() function returns a device pointer with an
incremented reference count, but the original code was missing put_device()
calls in some return paths, leading to reference count leaks.
Fix this by ensuring put_device() is called before function exit after
bus_find_device_by_name() succeeds
This follows the same pattern used elsewhere in the kernel where
bus_find_device_by_name() is properly paired with put_device().
Found via static analysis and code review.
Fixes: 4f8ef33dd44a ("ASoC: soc_sdw_utils: skip the endpoint that doesn't present")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
sound/soc/sdw_utils/soc_sdw_utils.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/sound/soc/sdw_utils/soc_sdw_utils.c b/sound/soc/sdw_utils/soc_sdw_utils.c
index 270c66b90228..ea594f84f11a 100644
--- a/sound/soc/sdw_utils/soc_sdw_utils.c
+++ b/sound/soc/sdw_utils/soc_sdw_utils.c
@@ -1278,7 +1278,7 @@ static int is_sdca_endpoint_present(struct device *dev,
struct sdw_slave *slave;
struct device *sdw_dev;
const char *sdw_codec_name;
- int i;
+ int ret, i;
dlc = kzalloc(sizeof(*dlc), GFP_KERNEL);
if (!dlc)
@@ -1308,13 +1308,16 @@ static int is_sdca_endpoint_present(struct device *dev,
}
slave = dev_to_sdw_dev(sdw_dev);
- if (!slave)
- return -EINVAL;
+ if (!slave) {
+ ret = -EINVAL;
+ goto put_device;
+ }
/* Make sure BIOS provides SDCA properties */
if (!slave->sdca_data.interface_revision) {
dev_warn(&slave->dev, "SDCA properties not found in the BIOS\n");
- return 1;
+ ret = 1;
+ goto put_device;
}
for (i = 0; i < slave->sdca_data.num_functions; i++) {
@@ -1323,7 +1326,8 @@ static int is_sdca_endpoint_present(struct device *dev,
if (dai_type == dai_info->dai_type) {
dev_dbg(&slave->dev, "DAI type %d sdca function %s found\n",
dai_type, slave->sdca_data.function[i].name);
- return 1;
+ ret = 1;
+ goto put_device;
}
}
@@ -1331,7 +1335,11 @@ static int is_sdca_endpoint_present(struct device *dev,
"SDCA device function for DAI type %d not supported, skip endpoint\n",
dai_info->dai_type);
- return 0;
+ ret = 0;
+
+put_device:
+ put_device(sdw_dev);
+ return ret;
}
int asoc_sdw_parse_sdw_endpoints(struct snd_soc_card *card,
--
2.39.5 (Apple Git-154)
If SMT is disabled or a partial SMT state is enabled, when a new kernel
image is loaded for kexec, on reboot the following warning is observed:
kexec: Waking offline cpu 228.
WARNING: CPU: 0 PID: 9062 at arch/powerpc/kexec/core_64.c:223 kexec_prepare_cpus+0x1b0/0x1bc
[snip]
NIP kexec_prepare_cpus+0x1b0/0x1bc
LR kexec_prepare_cpus+0x1a0/0x1bc
Call Trace:
kexec_prepare_cpus+0x1a0/0x1bc (unreliable)
default_machine_kexec+0x160/0x19c
machine_kexec+0x80/0x88
kernel_kexec+0xd0/0x118
__do_sys_reboot+0x210/0x2c4
system_call_exception+0x124/0x320
system_call_vectored_common+0x15c/0x2ec
This occurs as add_cpu() fails due to cpu_bootable() returning false for
CPUs that fail the cpu_smt_thread_allowed() check or non primary
threads if SMT is disabled.
Fix the issue by enabling SMT and resetting the number of SMT threads to
the number of threads per core, before attempting to wake up all present
CPUs.
Fixes: 38253464bc82 ("cpu/SMT: Create topology_smt_thread_allowed()")
Reported-by: Sachin P Bappalige <sachinpb(a)linux.ibm.com>
Cc: stable(a)vger.kernel.org # v6.6+
Signed-off-by: Nysal Jan K.A. <nysal(a)linux.ibm.com>
---
arch/powerpc/kexec/core_64.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 222aa326dace..ff6df43720c4 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -216,6 +216,11 @@ static void wake_offline_cpus(void)
{
int cpu = 0;
+ lock_device_hotplug();
+ cpu_smt_num_threads = threads_per_core;
+ cpu_smt_control = CPU_SMT_ENABLED;
+ unlock_device_hotplug();
+
for_each_present_cpu(cpu) {
if (!cpu_online(cpu)) {
printk(KERN_INFO "kexec: Waking offline cpu %d.\n",
--
2.51.0
From: Hao Ge <gehao(a)kylinos.cn>
When alloc_slab_obj_exts() fails and then later succeeds in allocating
a slab extension vector, it calls handle_failed_objexts_alloc() to
mark all objects in the vector as empty. As a result all objects in
this slab (slabA) will have their extensions set to CODETAG_EMPTY.
Later on if this slabA is used to allocate a slabobj_ext vector for
another slab (slabB), we end up with the slabB->obj_exts pointing to a
slabobj_ext vector that itself has a non-NULL slabobj_ext equal to
CODETAG_EMPTY. When slabB gets freed, free_slab_obj_exts() is called
to free slabB->obj_exts vector. free_slab_obj_exts() calls
mark_objexts_empty(slabB->obj_exts) which will generate a warning
because it expects slabobj_ext vectors to have a NULL obj_ext, not
CODETAG_EMPTY.
Modify mark_objexts_empty() to skip the warning and setting the
obj_ext value if it's already set to CODETAG_EMPTY.
Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Hao Ge <gehao(a)kylinos.cn>
---
v2: Update the commit message and code comments for greater accuracy,
incorporating Suren's suggestions.
Thanks for Suren's help.
---
mm/slub.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
index d4367f25b20d..589c596163c4 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2046,7 +2046,11 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
if (slab_exts) {
unsigned int offs = obj_to_index(obj_exts_slab->slab_cache,
obj_exts_slab, obj_exts);
- /* codetag should be NULL */
+
+ if (unlikely(is_codetag_empty(&slab_exts[offs].ref)))
+ return;
+
+ /* codetag should be NULL here */
WARN_ON(slab_exts[offs].ref.ct);
set_codetag_empty(&slab_exts[offs].ref);
}
--
2.25.1
The patch titled
Subject: codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
codetag-debug-handle-existing-codetag_empty-in-mark_objexts_empty-for-slabobj_ext.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hao Ge <gehao(a)kylinos.cn>
Subject: codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
Date: Wed, 29 Oct 2025 09:43:17 +0800
When alloc_slab_obj_exts() fails and then later succeeds in allocating a
slab extension vector, it calls handle_failed_objexts_alloc() to mark all
objects in the vector as empty. As a result all objects in this slab
(slabA) will have their extensions set to CODETAG_EMPTY.
Later on if this slabA is used to allocate a slabobj_ext vector for
another slab (slabB), we end up with the slabB->obj_exts pointing to a
slabobj_ext vector that itself has a non-NULL slabobj_ext equal to
CODETAG_EMPTY. When slabB gets freed, free_slab_obj_exts() is called to
free slabB->obj_exts vector.
free_slab_obj_exts() calls mark_objexts_empty(slabB->obj_exts) which will
generate a warning because it expects slabobj_ext vectors to have a NULL
obj_ext, not CODETAG_EMPTY.
Modify mark_objexts_empty() to skip the warning and setting the obj_ext
value if it's already set to CODETAG_EMPTY.
To quickly detect this WARN, I modified the code
from:WARN_ON(slab_exts[offs].ref.ct) to WARN_ON(slab_exts[offs].ref.ct
== 1);
We then obtained this message:
[21630.898561] ------------[ cut here ]------------
[21630.898596] kernel BUG at mm/slub.c:2050!
[21630.898611] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[21630.900372] Modules linked in: squashfs isofs vfio_iommu_type1
vhost_vsock vfio vhost_net vmw_vsock_virtio_transport_common vhost tap
vhost_iotlb iommufd vsock binfmt_misc nfsv3 nfs_acl nfs lockd grace
netfs tls rds dns_resolver tun brd overlay ntfs3 exfat btrfs
blake2b_generic xor xor_neon raid6_pq loop sctp ip6_udp_tunnel
udp_tunnel nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib
nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
nf_tables rfkill ip_set sunrpc vfat fat joydev sg sch_fq_codel nfnetlink
virtio_gpu sr_mod cdrom drm_client_lib virtio_dma_buf drm_shmem_helper
drm_kms_helper drm ghash_ce backlight virtio_net virtio_blk virtio_scsi
net_failover virtio_console failover virtio_mmio dm_mirror
dm_region_hash dm_log dm_multipath dm_mod fuse i2c_dev virtio_pci
virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring autofs4
aes_neon_bs aes_ce_blk [last unloaded: hwpoison_inject]
[21630.909177] CPU: 3 UID: 0 PID: 3787 Comm: kylin-process-m Kdump:
loaded Tainted: G�������������� W�������������������� 6.18.0-rc1+ #74 PREEMPT(voluntary)
[21630.910495] Tainted: [W]=WARN
[21630.910867] Hardware name: QEMU KVM Virtual Machine, BIOS unknown
2/2/2022
[21630.911625] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[21630.912392] pc : __free_slab+0x228/0x250
[21630.912868] lr : __free_slab+0x18c/0x250[21630.913334] sp :
ffff8000a02f73e0
[21630.913830] x29: ffff8000a02f73e0 x28: fffffdffc43fc800 x27:
ffff0000c0011c40
[21630.914677] x26: ffff0000c000cac0 x25: ffff00010fe5e5f0 x24:
ffff000102199b40
[21630.915469] x23: 0000000000000003 x22: 0000000000000003 x21:
ffff0000c0011c40
[21630.916259] x20: fffffdffc4086600 x19: fffffdffc43fc800 x18:
0000000000000000
[21630.917048] x17: 0000000000000000 x16: 0000000000000000 x15:
0000000000000000
[21630.917837] x14: 0000000000000000 x13: 0000000000000000 x12:
ffff70001405ee66
[21630.918640] x11: 1ffff0001405ee65 x10: ffff70001405ee65 x9 :
ffff800080a295dc
[21630.919442] x8 : ffff8000a02f7330 x7 : 0000000000000000 x6 :
0000000000003000
[21630.920232] x5 : 0000000024924925 x4 : 0000000000000001 x3 :
0000000000000007
[21630.921021] x2 : 0000000000001b40 x1 : 000000000000001f x0 :
0000000000000001
[21630.921810] Call trace:
[21630.922130]�� __free_slab+0x228/0x250 (P)
[21630.922669]�� free_slab+0x38/0x118
[21630.923079]�� free_to_partial_list+0x1d4/0x340
[21630.923591]�� __slab_free+0x24c/0x348
[21630.924024]�� ___cache_free+0xf0/0x110
[21630.924468]�� qlist_free_all+0x78/0x130
[21630.924922]�� kasan_quarantine_reduce+0x114/0x148
[21630.925525]�� __kasan_slab_alloc+0x7c/0xb0
[21630.926006]�� kmem_cache_alloc_noprof+0x164/0x5c8
[21630.926699]�� __alloc_object+0x44/0x1f8
[21630.927153]�� __create_object+0x34/0xc8
[21630.927604]�� kmemleak_alloc+0xb8/0xd8
[21630.928052]�� kmem_cache_alloc_noprof+0x368/0x5c8
[21630.928606]�� getname_flags.part.0+0xa4/0x610
[21630.929112]�� getname_flags+0x80/0xd8
[21630.929557]�� vfs_fstatat+0xc8/0xe0
[21630.929975]�� __do_sys_newfstatat+0xa0/0x100
[21630.930469]�� __arm64_sys_newfstatat+0x90/0xd8
[21630.931046]�� invoke_syscall+0xd4/0x258
[21630.931685]�� el0_svc_common.constprop.0+0xb4/0x240
[21630.932467]�� do_el0_svc+0x48/0x68
[21630.932972]�� el0_svc+0x40/0xe0
[21630.933472]�� el0t_64_sync_handler+0xa0/0xe8
[21630.934151]�� el0t_64_sync+0x1ac/0x1b0
[21630.934923] Code: aa1803e0 97ffef2b a9446bf9 17ffff9c (d4210000)
[21630.936461] SMP: stopping secondary CPUs
[21630.939550] Starting crashdump kernel...
[21630.940108] Bye!
Link: https://lkml.kernel.org/r/20251029014317.1533488-1-hao.ge@linux.dev
Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations")
Signed-off-by: Hao Ge <gehao(a)kylinos.cn>
Cc: Christoph Lameter (Ampere) <cl(a)gentwo.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: gehao <gehao(a)kylinos.cn>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/mm/slub.c~codetag-debug-handle-existing-codetag_empty-in-mark_objexts_empty-for-slabobj_ext
+++ a/mm/slub.c
@@ -2046,7 +2046,11 @@ static inline void mark_objexts_empty(st
if (slab_exts) {
unsigned int offs = obj_to_index(obj_exts_slab->slab_cache,
obj_exts_slab, obj_exts);
- /* codetag should be NULL */
+
+ if (unlikely(is_codetag_empty(&slab_exts[offs].ref)))
+ return;
+
+ /* codetag should be NULL here */
WARN_ON(slab_exts[offs].ref.ct);
set_codetag_empty(&slab_exts[offs].ref);
}
_
Patches currently in -mm which might be from gehao(a)kylinos.cn are
codetag-debug-handle-existing-codetag_empty-in-mark_objexts_empty-for-slabobj_ext.patch
--
Hi,
PERDIS SUPER U is a leading retail group in France with numerous
outlets across the country. After reviewing your company profile and
products, we’re very interested in establishing a long-term partnership.
Kindly share your product catalog or website so we can review your
offerings and pricing. We are ready to place orders and begin
cooperation.Please note: Our payment terms are SWIFT, 14 days after
delivery.
Looking forward to your response.
Best regards,
Dominique Schelcher
Director, PERDIS SUPER U
RUE DE SAVOIE, 45600 SAINT-PÈRE-SUR-LOIRE
VAT: FR65380071464
www.magasins-u.com
The patch titled
Subject: mm/mremap: honour writable bit in mremap pte batching
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-mremap-honour-writable-bit-in-mremap-pte-batching.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Dev Jain <dev.jain(a)arm.com>
Subject: mm/mremap: honour writable bit in mremap pte batching
Date: Tue, 28 Oct 2025 12:09:52 +0530
Currently mremap folio pte batch ignores the writable bit during figuring
out a set of similar ptes mapping the same folio. Suppose that the first
pte of the batch is writable while the others are not - set_ptes will end
up setting the writable bit on the other ptes, which is a violation of
mremap semantics. Therefore, use FPB_RESPECT_WRITE to check the writable
bit while determining the pte batch.
Link: https://lkml.kernel.org/r/20251028063952.90313-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
Fixes: f822a9a81a31 ("mm: optimize mremap() by PTE batching")
Reported-by: David Hildenbrand <david(a)redhat.com>
Debugged-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Pedro Falcato <pfalcato(a)suse.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org> [6.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mremap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/mremap.c~mm-mremap-honour-writable-bit-in-mremap-pte-batching
+++ a/mm/mremap.c
@@ -187,7 +187,7 @@ static int mremap_folio_pte_batch(struct
if (!folio || !folio_test_large(folio))
return 1;
- return folio_pte_batch(folio, ptep, pte, max_nr);
+ return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, FPB_RESPECT_WRITE);
}
static int move_ptes(struct pagetable_move_control *pmc,
_
Patches currently in -mm which might be from dev.jain(a)arm.com are
mm-mremap-honour-writable-bit-in-mremap-pte-batching.patch
The patch titled
Subject: gcov: add support for GCC 15
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
gcov-add-support-for-gcc-15.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Oberparleiter <oberpar(a)linux.ibm.com>
Subject: gcov: add support for GCC 15
Date: Tue, 28 Oct 2025 12:51:25 +0100
Using gcov on kernels compiled with GCC 15 results in truncated 16-byte
long .gcda files with no usable data. To fix this, update GCOV_COUNTERS
to match the value defined by GCC 15.
Tested with GCC 14.3.0 and GCC 15.2.0.
Link: https://lkml.kernel.org/r/20251028115125.1319410-1-oberpar@linux.ibm.com
Signed-off-by: Peter Oberparleiter <oberpar(a)linux.ibm.com>
Reported-by: Matthieu Baerts <matttbe(a)kernel.org>
Closes: https://github.com/linux-test-project/lcov/issues/445
Tested-by: Matthieu Baerts <matttbe(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/gcov/gcc_4_7.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/kernel/gcov/gcc_4_7.c~gcov-add-support-for-gcc-15
+++ a/kernel/gcov/gcc_4_7.c
@@ -18,7 +18,9 @@
#include <linux/mm.h>
#include "gcov.h"
-#if (__GNUC__ >= 14)
+#if (__GNUC__ >= 15)
+#define GCOV_COUNTERS 10
+#elif (__GNUC__ >= 14)
#define GCOV_COUNTERS 9
#elif (__GNUC__ >= 10)
#define GCOV_COUNTERS 8
_
Patches currently in -mm which might be from oberpar(a)linux.ibm.com are
gcov-add-support-for-gcc-15.patch
The patch titled
Subject: s390: fix HugeTLB vmemmap optimization crash
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
s390-fix-hugetlb-vmemmap-optimization-crash.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Luiz Capitulino <luizcap(a)redhat.com>
Subject: s390: fix HugeTLB vmemmap optimization crash
Date: Tue, 28 Oct 2025 17:15:33 -0400
A reproducible crash occurs when enabling HugeTLB vmemmap optimization
(HVO) on s390. The crash and the proposed fix were worked on an s390 KVM
guest running on an older hypervisor, as I don't have access to an LPAR.
However, the same issue should occur on bare-metal.
Reproducer (it may take a few runs to trigger):
# sysctl vm.hugetlb_optimize_vmemmap=1
# echo 1 > /proc/sys/vm/nr_hugepages
# echo 0 > /proc/sys/vm/nr_hugepages
Crash log:
[ 52.340369] list_del corruption. prev->next should be 000000d382110008, but was 000000d7116d3880. (prev=000000d7116d3910)
[ 52.340420] ------------[ cut here ]------------
[ 52.340424] kernel BUG at lib/list_debug.c:62!
[ 52.340566] monitor event: 0040 ilc:2 [#1]SMP
[ 52.340573] Modules linked in: ctcm fsm qeth ccwgroup zfcp scsi_transport_fc qdio dasd_fba_mod dasd_eckd_mod dasd_mod xfs ghash_s390 prng des_s390 libdes sha3_512_s390 sha3_256_s390 virtio_net virtio_blk net_failover sha_common failover dm_mirror dm_region_hash dm_log dm_mod paes_s390 crypto_engine pkey_cca pkey_ep11 zcrypt pkey_pckmo pkey aes_s390
[ 52.340606] CPU: 1 UID: 0 PID: 1672 Comm: root-rep2 Kdump: loaded Not tainted 6.18.0-rc3 #1 NONE
[ 52.340610] Hardware name: IBM 3931 LA1 400 (KVM/Linux)
[ 52.340611] Krnl PSW : 0704c00180000000 0000015710cda7fe (__list_del_entry_valid_or_report+0xfe/0x128)
[ 52.340619] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 52.340622] Krnl GPRS: c0000000ffffefff 0000000100000027 000000000000006d 0000000000000000
[ 52.340623] 000000d7116d35d8 000000d7116d35d0 0000000000000002 000000d7116d39b0
[ 52.340625] 000000d7116d3880 000000d7116d3910 000000d7116d3910 000000d382110008
[ 52.340626] 000003ffac1ccd08 000000d7116d39b0 0000015710cda7fa 000000d7116d37d0
[ 52.340632] Krnl Code: 0000015710cda7ee: c020003e496f larl %r2,00000157114a3acc
0000015710cda7f4: c0e5ffd5280e brasl %r14,000001571077f810
#0000015710cda7fa: af000000 mc 0,0
>0000015710cda7fe: b9040029 lgr %r2,%r9
0000015710cda802: c0e5ffe5e193 brasl %r14,0000015710996b28
0000015710cda808: e34090080004 lg %r4,8(%r9)
0000015710cda80e: b9040059 lgr %r5,%r9
0000015710cda812: b9040038 lgr %r3,%r8
[ 52.340643] Call Trace:
[ 52.340645] [<0000015710cda7fe>] __list_del_entry_valid_or_report+0xfe/0x128
[ 52.340649] ([<0000015710cda7fa>] __list_del_entry_valid_or_report+0xfa/0x128)
[ 52.340652] [<0000015710a30b2e>] hugetlb_vmemmap_restore_folios+0x96/0x138
[ 52.340655] [<0000015710a268ac>] update_and_free_pages_bulk+0x64/0x150
[ 52.340659] [<0000015710a26f8a>] set_max_huge_pages+0x4ca/0x6f0
[ 52.340662] [<0000015710a273ba>] hugetlb_sysctl_handler_common+0xea/0x120
[ 52.340665] [<0000015710a27484>] hugetlb_sysctl_handler+0x44/0x50
[ 52.340667] [<0000015710b53ffa>] proc_sys_call_handler+0x17a/0x280
[ 52.340672] [<0000015710a90968>] vfs_write+0x2c8/0x3a0
[ 52.340676] [<0000015710a90bd2>] ksys_write+0x72/0x100
[ 52.340679] [<00000157111483a8>] __do_syscall+0x150/0x318
[ 52.340682] [<0000015711153a5e>] system_call+0x6e/0x90
[ 52.340684] Last Breaking-Event-Address:
[ 52.340684] [<000001571077f85c>] _printk+0x4c/0x58
[ 52.340690] Kernel panic - not syncing: Fatal exception: panic_on_oops
This issue was introduced by commit f13b83fdd996 ("hugetlb: batch TLB
flushes when freeing vmemmap"). Before that change, the HVO
implementation called flush_tlb_kernel_range() each time a vmemmap PMD
split and remapping was performed. The mentioned commit changed this to
issue a few flush_tlb_all() calls after performing all remappings.
However, on s390, flush_tlb_kernel_range() expands to __tlb_flush_kernel()
while flush_tlb_all() is not implemented. As a result, we went from
flushing the TLB for every remapping to no flushing at all.
This commit fixes this by implementing flush_tlb_all() on s390 as an alias
to __tlb_flush_global(). This should cause a flush on all TLB entries on
all CPUs as expected by the flush_tlb_all() semantics.
Link: https://lkml.kernel.org/r/20251028211533.47694-1-luizcap@redhat.com
Fixes: f13b83fdd996 ("hugetlb: batch TLB flushes when freeing vmemmap")
Signed-off-by: Luiz Capitulino <luizcap(a)redhat.com>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Joao Martins <joao.m.martins(a)oracle.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/s390/include/asm/tlbflush.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/arch/s390/include/asm/tlbflush.h~s390-fix-hugetlb-vmemmap-optimization-crash
+++ a/arch/s390/include/asm/tlbflush.h
@@ -103,9 +103,13 @@ static inline void __tlb_flush_mm_lazy(s
* flush_tlb_range functions need to do the flush.
*/
#define flush_tlb() do { } while (0)
-#define flush_tlb_all() do { } while (0)
#define flush_tlb_page(vma, addr) do { } while (0)
+static inline void flush_tlb_all(void)
+{
+ __tlb_flush_global();
+}
+
static inline void flush_tlb_mm(struct mm_struct *mm)
{
__tlb_flush_mm_lazy(mm);
_
Patches currently in -mm which might be from luizcap(a)redhat.com are
s390-fix-hugetlb-vmemmap-optimization-crash.patch
The patch titled
Subject: codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
has been added to the -mm mm-new branch. Its filename is
codetag-debug-handle-existing-codetag_empty-in-mark_objexts_empty-for-slabobj_ext.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hao Ge <gehao(a)kylinos.cn>
Subject: codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
Date: Mon, 27 Oct 2025 16:52:14 +0800
Even though obj_exts was created with the __GFP_NO_OBJ_EXT flag, objects
in the same slab may have their extensions allocated via
alloc_slab_obj_exts, and handle_failed_objexts_alloc may be called within
alloc_slab_obj_exts to set their codetag to CODETAG_EMPTY.
Therefore, both NULL and CODETAG_EMPTY are valid for the codetag of
slabobj_ext, as we do not need to re-set it to CODETAG_EMPTY if it is
already CODETAG_EMPTY. It also resolves the warning triggered when the
codetag is CODETAG_EMPTY during slab freeing.
This issue also occurred during our memory stress testing.
The possibility of its occurrence should be as follows:
When a slab allocates a slabobj_ext, the slab to which this slabobj_ext
belongs may have already allocated its own slabobj_ext
and called handle_failed_objexts_alloc. That is to say, the codetag of
this slabobj_ext has been set to CODETAG_EMPTY.
To quickly detect this WARN, I modified the code
from:WARN_ON(slab_exts[offs].ref.ct) to WARN_ON(slab_exts[offs].ref.ct
== 1);
We then obtained this message:
[21630.898561] ------------[ cut here ]------------
[21630.898596] kernel BUG at mm/slub.c:2050!
[21630.898611] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[21630.900372] Modules linked in: squashfs isofs vfio_iommu_type1
vhost_vsock vfio vhost_net vmw_vsock_virtio_transport_common vhost tap
vhost_iotlb iommufd vsock binfmt_misc nfsv3 nfs_acl nfs lockd grace
netfs tls rds dns_resolver tun brd overlay ntfs3 exfat btrfs
blake2b_generic xor xor_neon raid6_pq loop sctp ip6_udp_tunnel
udp_tunnel nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib
nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
nf_tables rfkill ip_set sunrpc vfat fat joydev sg sch_fq_codel nfnetlink
virtio_gpu sr_mod cdrom drm_client_lib virtio_dma_buf drm_shmem_helper
drm_kms_helper drm ghash_ce backlight virtio_net virtio_blk virtio_scsi
net_failover virtio_console failover virtio_mmio dm_mirror
dm_region_hash dm_log dm_multipath dm_mod fuse i2c_dev virtio_pci
virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring autofs4
aes_neon_bs aes_ce_blk [last unloaded: hwpoison_inject]
[21630.909177] CPU: 3 UID: 0 PID: 3787 Comm: kylin-process-m Kdump:
loaded Tainted: G�������������� W�������������������� 6.18.0-rc1+ #74 PREEMPT(voluntary)
[21630.910495] Tainted: [W]=WARN
[21630.910867] Hardware name: QEMU KVM Virtual Machine, BIOS unknown
2/2/2022
[21630.911625] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[21630.912392] pc : __free_slab+0x228/0x250
[21630.912868] lr : __free_slab+0x18c/0x250[21630.913334] sp :
ffff8000a02f73e0
[21630.913830] x29: ffff8000a02f73e0 x28: fffffdffc43fc800 x27:
ffff0000c0011c40
[21630.914677] x26: ffff0000c000cac0 x25: ffff00010fe5e5f0 x24:
ffff000102199b40
[21630.915469] x23: 0000000000000003 x22: 0000000000000003 x21:
ffff0000c0011c40
[21630.916259] x20: fffffdffc4086600 x19: fffffdffc43fc800 x18:
0000000000000000
[21630.917048] x17: 0000000000000000 x16: 0000000000000000 x15:
0000000000000000
[21630.917837] x14: 0000000000000000 x13: 0000000000000000 x12:
ffff70001405ee66
[21630.918640] x11: 1ffff0001405ee65 x10: ffff70001405ee65 x9 :
ffff800080a295dc
[21630.919442] x8 : ffff8000a02f7330 x7 : 0000000000000000 x6 :
0000000000003000
[21630.920232] x5 : 0000000024924925 x4 : 0000000000000001 x3 :
0000000000000007
[21630.921021] x2 : 0000000000001b40 x1 : 000000000000001f x0 :
0000000000000001
[21630.921810] Call trace:
[21630.922130]�� __free_slab+0x228/0x250 (P)
[21630.922669]�� free_slab+0x38/0x118
[21630.923079]�� free_to_partial_list+0x1d4/0x340
[21630.923591]�� __slab_free+0x24c/0x348
[21630.924024]�� ___cache_free+0xf0/0x110
[21630.924468]�� qlist_free_all+0x78/0x130
[21630.924922]�� kasan_quarantine_reduce+0x114/0x148
[21630.925525]�� __kasan_slab_alloc+0x7c/0xb0
[21630.926006]�� kmem_cache_alloc_noprof+0x164/0x5c8
[21630.926699]�� __alloc_object+0x44/0x1f8
[21630.927153]�� __create_object+0x34/0xc8
[21630.927604]�� kmemleak_alloc+0xb8/0xd8
[21630.928052]�� kmem_cache_alloc_noprof+0x368/0x5c8
[21630.928606]�� getname_flags.part.0+0xa4/0x610
[21630.929112]�� getname_flags+0x80/0xd8
[21630.929557]�� vfs_fstatat+0xc8/0xe0
[21630.929975]�� __do_sys_newfstatat+0xa0/0x100
[21630.930469]�� __arm64_sys_newfstatat+0x90/0xd8
[21630.931046]�� invoke_syscall+0xd4/0x258
[21630.931685]�� el0_svc_common.constprop.0+0xb4/0x240
[21630.932467]�� do_el0_svc+0x48/0x68
[21630.932972]�� el0_svc+0x40/0xe0
[21630.933472]�� el0t_64_sync_handler+0xa0/0xe8
[21630.934151]�� el0t_64_sync+0x1ac/0x1b0
[21630.934923] Code: aa1803e0 97ffef2b a9446bf9 17ffff9c (d4210000)
[21630.936461] SMP: stopping secondary CPUs
[21630.939550] Starting crashdump kernel...
[21630.940108] Bye!
Link: https://lkml.kernel.org/r/20251027085214.184672-1-hao.ge@linux.dev
Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations")
Signed-off-by: Hao Ge <gehao(a)kylinos.cn>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Christoph Lameter (Ampere) <cl(a)gentwo.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
--- a/mm/slub.c~codetag-debug-handle-existing-codetag_empty-in-mark_objexts_empty-for-slabobj_ext
+++ a/mm/slub.c
@@ -2046,7 +2046,17 @@ static inline void mark_objexts_empty(st
if (slab_exts) {
unsigned int offs = obj_to_index(obj_exts_slab->slab_cache,
obj_exts_slab, obj_exts);
- /* codetag should be NULL */
+
+ /*
+ * codetag should be either NULL or CODETAG_EMPTY.
+ * When the same slab calls handle_failed_objexts_alloc,
+ * it will set us to CODETAG_EMPTY.
+ *
+ * If codetag is already CODETAG_EMPTY, no action is needed here.
+ */
+ if (unlikely(is_codetag_empty(&slab_exts[offs].ref)))
+ return;
+
WARN_ON(slab_exts[offs].ref.ct);
set_codetag_empty(&slab_exts[offs].ref);
}
_
Patches currently in -mm which might be from gehao(a)kylinos.cn are
codetag-debug-handle-existing-codetag_empty-in-mark_objexts_empty-for-slabobj_ext.patch
The patch titled
Subject: mm/mm_init: fix hash table order logging in alloc_large_system_hash()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-mm_init-fix-hash-table-order-logging-in-alloc_large_system_hash.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Isaac J. Manjarres" <isaacmanjarres(a)google.com>
Subject: mm/mm_init: fix hash table order logging in alloc_large_system_hash()
Date: Tue, 28 Oct 2025 12:10:12 -0700
When emitting the order of the allocation for a hash table,
alloc_large_system_hash() unconditionally subtracts PAGE_SHIFT from log
base 2 of the allocation size. This is not correct if the allocation size
is smaller than a page, and yields a negative value for the order as seen
below:
TCP established hash table entries: 32 (order: -4, 256 bytes, linear) TCP
bind hash table entries: 32 (order: -2, 1024 bytes, linear)
Use get_order() to compute the order when emitting the hash table
information to correctly handle cases where the allocation size is smaller
than a page:
TCP established hash table entries: 32 (order: 0, 256 bytes, linear) TCP
bind hash table entries: 32 (order: 0, 1024 bytes, linear)
Link: https://lkml.kernel.org/r/20251028191020.413002-1-isaacmanjarres@google.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mm_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/mm_init.c~mm-mm_init-fix-hash-table-order-logging-in-alloc_large_system_hash
+++ a/mm/mm_init.c
@@ -2469,7 +2469,7 @@ void *__init alloc_large_system_hash(con
panic("Failed to allocate %s hash table\n", tablename);
pr_info("%s hash table entries: %ld (order: %d, %lu bytes, %s)\n",
- tablename, 1UL << log2qty, ilog2(size) - PAGE_SHIFT, size,
+ tablename, 1UL << log2qty, get_order(size), size,
virt ? (huge ? "vmalloc hugepage" : "vmalloc") : "linear");
if (_hash_shift)
_
Patches currently in -mm which might be from isaacmanjarres(a)google.com are
mm-mm_init-fix-hash-table-order-logging-in-alloc_large_system_hash.patch
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102058-citadel-crinkly-125f@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 Mon Sep 17 00:00:00 2001
From: Babu Moger <babu.moger(a)amd.com>
Date: Fri, 10 Oct 2025 12:08:35 -0500
Subject: [PATCH] x86/resctrl: Fix miscount of bandwidth event when
reactivating previously unavailable RMID
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets
MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
remains set on first read after tracking re-starts and is clear on all
subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the
"Unavailable" state back to being tracked. This happens because when the
hardware starts counting again after resetting the counter to zero, resctrl
in turn compares the new count against the counter value stored from the
previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting
(when new counter is more than stored counter) or a mistaken overflow (when
new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
zero whenever the RMID is in the "Unavailable" state to ensure accurate
counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow
==================================================
1. The resctrl filesystem is mounted, and a task is assigned to a
monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl
$mkdir /sys/fs/resctrl/mon_groups/test1/
$echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
21323 <- Total bytes on domain 0
"Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The
counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
7345357 <- Total bytes on domain 0
4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable"
state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
"Unavailable" <- Total bytes on domain 0
434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually
return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
17592178699059 <- Overflow on domain 0
3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to
active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously
saved MSR value (7345357). Consequently, the resctrl's overflow logic is
triggered, it compares the previous value (7345357) with the new, smaller
value and incorrectly interprets this as a counter overflow, adding a large
delta.
In reality, this is a false positive: the counter did not overflow but was
simply reset when the RMID transitioned from "Unavailable" back to active
state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better
consumption. ]
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Tested-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org # needs adjustments for <= v6.17
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c8945610d455..2cd25a0d4637 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
u32 unused, u32 rmid, enum resctrl_event_id eventid,
u64 *val, void *ignored)
{
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
int cpu = cpumask_any(&d->hdr.cpu_mask);
+ struct arch_mbm_state *am;
u64 msr_val;
u32 prmid;
int ret;
@@ -251,12 +253,16 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
prmid = logical_rmid_to_physical_rmid(cpu, rmid);
ret = __rmid_read_phys(prmid, eventid, &msr_val);
- if (ret)
- return ret;
- *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ if (!ret) {
+ *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ } else if (ret == -EINVAL) {
+ am = get_arch_mbm_state(hw_dom, rmid, eventid);
+ if (am)
+ am->prev_msr = 0;
+ }
- return 0;
+ return ret;
}
static int __cntr_id_read(u32 cntr_id, u64 *val)
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 388eff894d6bc5f921e9bfff0e4b0ab2684a96e9
Gitweb: https://git.kernel.org/tip/388eff894d6bc5f921e9bfff0e4b0ab2684a96e9
Author: Chang S. Bae <chang.seok.bae(a)intel.com>
AuthorDate: Mon, 09 Jun 2025 17:16:59 -07:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Tue, 28 Oct 2025 12:10:59 -07:00
x86/fpu: Ensure XFD state on signal delivery
Sean reported [1] the following splat when running KVM tests:
WARNING: CPU: 232 PID: 15391 at xfd_validate_state+0x65/0x70
Call Trace:
<TASK>
fpu__clear_user_states+0x9c/0x100
arch_do_signal_or_restart+0x142/0x210
exit_to_user_mode_loop+0x55/0x100
do_syscall_64+0x205/0x2c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Chao further identified [2] a reproducible scenario involving signal
delivery: a non-AMX task is preempted by an AMX-enabled task which
modifies the XFD MSR.
When the non-AMX task resumes and reloads XSTATE with init values,
a warning is triggered due to a mismatch between fpstate::xfd and the
CPU's current XFD state. fpu__clear_user_states() does not currently
re-synchronize the XFD state after such preemption.
Invoke xfd_update_state() which detects and corrects the mismatch if
there is a dynamic feature.
This also benefits the sigreturn path, as fpu__restore_sig() may call
fpu__clear_user_states() when the sigframe is inaccessible.
[ dhansen: minor changelog munging ]
Closes: https://lore.kernel.org/lkml/aDCo_SczQOUaB2rS@google.com [1]
Fixes: 672365477ae8a ("x86/fpu: Update XFD state where required")
Reported-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Chang S. Bae <chang.seok.bae(a)intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Chao Gao <chao.gao(a)intel.com>
Tested-by: Chao Gao <chao.gao(a)intel.com>
Link: https://lore.kernel.org/all/aDWbctO%2FRfTGiCg3@intel.com [2]
Cc:stable@vger.kernel.org
Link: https://patch.msgid.link/20250610001700.4097-1-chang.seok.bae%40intel.com
---
arch/x86/kernel/fpu/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1f71cc1..e88eacb 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -825,6 +825,9 @@ void fpu__clear_user_states(struct fpu *fpu)
!fpregs_state_valid(fpu, smp_processor_id()))
os_xrstor_supervisor(fpu->fpstate);
+ /* Ensure XFD state is in sync before reloading XSTATE */
+ xfd_update_state(fpu->fpstate);
+
/* Reset user states in registers. */
restore_fpregs_from_init_fpstate(XFEATURE_MASK_USER_RESTORE);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102056-qualify-unopposed-be08@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 Mon Sep 17 00:00:00 2001
From: Babu Moger <babu.moger(a)amd.com>
Date: Fri, 10 Oct 2025 12:08:35 -0500
Subject: [PATCH] x86/resctrl: Fix miscount of bandwidth event when
reactivating previously unavailable RMID
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets
MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
remains set on first read after tracking re-starts and is clear on all
subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the
"Unavailable" state back to being tracked. This happens because when the
hardware starts counting again after resetting the counter to zero, resctrl
in turn compares the new count against the counter value stored from the
previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting
(when new counter is more than stored counter) or a mistaken overflow (when
new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
zero whenever the RMID is in the "Unavailable" state to ensure accurate
counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow
==================================================
1. The resctrl filesystem is mounted, and a task is assigned to a
monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl
$mkdir /sys/fs/resctrl/mon_groups/test1/
$echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
21323 <- Total bytes on domain 0
"Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The
counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
7345357 <- Total bytes on domain 0
4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable"
state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
"Unavailable" <- Total bytes on domain 0
434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually
return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
17592178699059 <- Overflow on domain 0
3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to
active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously
saved MSR value (7345357). Consequently, the resctrl's overflow logic is
triggered, it compares the previous value (7345357) with the new, smaller
value and incorrectly interprets this as a counter overflow, adding a large
delta.
In reality, this is a false positive: the counter did not overflow but was
simply reset when the RMID transitioned from "Unavailable" back to active
state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better
consumption. ]
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Tested-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org # needs adjustments for <= v6.17
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c8945610d455..2cd25a0d4637 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
u32 unused, u32 rmid, enum resctrl_event_id eventid,
u64 *val, void *ignored)
{
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
int cpu = cpumask_any(&d->hdr.cpu_mask);
+ struct arch_mbm_state *am;
u64 msr_val;
u32 prmid;
int ret;
@@ -251,12 +253,16 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
prmid = logical_rmid_to_physical_rmid(cpu, rmid);
ret = __rmid_read_phys(prmid, eventid, &msr_val);
- if (ret)
- return ret;
- *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ if (!ret) {
+ *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ } else if (ret == -EINVAL) {
+ am = get_arch_mbm_state(hw_dom, rmid, eventid);
+ if (am)
+ am->prev_msr = 0;
+ }
- return 0;
+ return ret;
}
static int __cntr_id_read(u32 cntr_id, u64 *val)
From: Sakari Ailus <sakari.ailus(a)linux.intel.com>
[ Upstream commit d9f866b2bb3eec38b3734f1fed325ec7c55ccdfa ]
fwnode_graph_get_next_subnode() may return fwnode backed by ACPI
device nodes and there has been no check these devices are present
in the system, unlike there has been on fwnode OF backend.
In order to provide consistent behaviour towards callers,
add a check for device presence by introducing
a new function acpi_get_next_present_subnode(), used as the
get_next_child_node() fwnode operation that also checks device
node presence.
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron(a)huawei.com>
Link: https://patch.msgid.link/20251001102636.1272722-2-sakari.ailus@linux.intel.…
[ rjw: Kerneldoc comment and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES – this change fixes a real behavioural bug in the ACPI fwnode
backend and should go to stable.
- `drivers/acpi/property.c:1375` adds `acpi_get_next_present_subnode()`
so `.get_next_child_node` skips ACPI child devices whose `_STA` says
they are absent (`acpi_device_is_present()`), while still returning
data subnodes unchanged.
- The new helper is now wired into the ACPI fwnode ops
(`drivers/acpi/property.c:1731`), making generic helpers such as
`fwnode_get_next_child_node()` and macros like
`fwnode_for_each_child_node` (`include/linux/property.h:167`) behave
the same as the OF backend, which already filtered unavailable
children via `of_get_next_available_child()`
(`drivers/of/property.c:1070`).
- Several core helpers assume disabled endpoints never surface: e.g.
`fwnode_graph_get_endpoint_by_id()` in `drivers/base/property.c:1286`
promises to hide endpoints on disabled devices, and higher layers such
as `v4l2_fwnode_reference_get_int_prop()`
(`drivers/media/v4l2-core/v4l2-fwnode.c:1064`) iterate child nodes
without rechecking availability. On ACPI systems today they still see
powered-off devices, leading to asynchronous notifiers that wait
forever for hardware that can’t appear, or to bogus graph
enumerations. This patch closes that gap.
Risk is low: it only suppresses ACPI device nodes already known to be
absent, aligns behaviour with DT userspace expectations, and leaves data
nodes untouched. No extra dependencies are required, so the fix is self-
contained and appropriate for stable backporting. Suggest running
existing ACPI graph users (media/typec drivers) after backport to
confirm no regressions.
drivers/acpi/property.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
index c086786fe84cb..d74678f0ba4af 100644
--- a/drivers/acpi/property.c
+++ b/drivers/acpi/property.c
@@ -1357,6 +1357,28 @@ struct fwnode_handle *acpi_get_next_subnode(const struct fwnode_handle *fwnode,
return NULL;
}
+/*
+ * acpi_get_next_present_subnode - Return the next present child node handle
+ * @fwnode: Firmware node to find the next child node for.
+ * @child: Handle to one of the device's child nodes or a null handle.
+ *
+ * Like acpi_get_next_subnode(), but the device nodes returned by
+ * acpi_get_next_present_subnode() are guaranteed to be present.
+ *
+ * Returns: The fwnode handle of the next present sub-node.
+ */
+static struct fwnode_handle *
+acpi_get_next_present_subnode(const struct fwnode_handle *fwnode,
+ struct fwnode_handle *child)
+{
+ do {
+ child = acpi_get_next_subnode(fwnode, child);
+ } while (is_acpi_device_node(child) &&
+ !acpi_device_is_present(to_acpi_device_node(child)));
+
+ return child;
+}
+
/**
* acpi_node_get_parent - Return parent fwnode of this fwnode
* @fwnode: Firmware node whose parent to get
@@ -1701,7 +1723,7 @@ static int acpi_fwnode_irq_get(const struct fwnode_handle *fwnode,
.property_read_string_array = \
acpi_fwnode_property_read_string_array, \
.get_parent = acpi_node_get_parent, \
- .get_next_child_node = acpi_get_next_subnode, \
+ .get_next_child_node = acpi_get_next_present_subnode, \
.get_named_child_node = acpi_fwnode_get_named_child_node, \
.get_name = acpi_fwnode_get_name, \
.get_name_prefix = acpi_fwnode_get_name_prefix, \
--
2.51.0
Sean reported [1] the following splat when running KVM tests:
WARNING: CPU: 232 PID: 15391 at xfd_validate_state+0x65/0x70
Call Trace:
<TASK>
fpu__clear_user_states+0x9c/0x100
arch_do_signal_or_restart+0x142/0x210
exit_to_user_mode_loop+0x55/0x100
do_syscall_64+0x205/0x2c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Chao further identified [2] a reproducible scenarios involving signal
delivery: a non-AMX task is preempted by an AMX-enabled task which
modifies the XFD MSR.
When the non-AMX task resumes and reloads XSTATE with init values,
a warning is triggered due to a mismatch between fpstate::xfd and the
CPU's current XFD state. fpu__clear_user_states() does not currently
re-synchronize the XFD state after such preemption.
Invoke xfd_update_state() which detects and corrects the mismatch if the
dynamic feature is enabled.
This also benefits the sigreturn path, as fpu__restore_sig() may call
fpu__clear_user_states() when the sigframe is inaccessible.
Fixes: 672365477ae8a ("x86/fpu: Update XFD state where required")
Reported-by: Sean Christopherson <seanjc(a)google.com>
Closes: https://lore.kernel.org/lkml/aDCo_SczQOUaB2rS@google.com [1]
Tested-by: Chao Gao <chao.gao(a)intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/aDWbctO%2FRfTGiCg3@intel.com [2]
---
arch/x86/kernel/fpu/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index ea138583dd92..5fa782a2ae7c 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -800,6 +800,9 @@ void fpu__clear_user_states(struct fpu *fpu)
!fpregs_state_valid(fpu, smp_processor_id()))
os_xrstor_supervisor(fpu->fpstate);
+ /* Ensure XFD state is in sync before reloading XSTATE */
+ xfd_update_state(fpu->fpstate);
+
/* Reset user states in registers. */
restore_fpregs_from_init_fpstate(XFEATURE_MASK_USER_RESTORE);
--
2.48.1
From: Shawn Guo <shawnguo(a)kernel.org>
Per commit 9442490a0286 ("regmap: irq: Support wake IRQ mask inversion")
the wake_invert flag is to support enable register, so cleared bits are
wake disabled.
Fixes: 68622bdfefb9 ("regmap: irq: document mask/wake_invert flags")
Cc: stable(a)vger.kernel.org
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
---
include/linux/regmap.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index 4e1ac1fbcec4..55343795644b 100644
--- a/include/linux/regmap.h
+++ b/include/linux/regmap.h
@@ -1643,7 +1643,7 @@ struct regmap_irq_chip_data;
* @status_invert: Inverted status register: cleared bits are active interrupts.
* @status_is_level: Status register is actuall signal level: Xor status
* register with previous value to get active interrupts.
- * @wake_invert: Inverted wake register: cleared bits are wake enabled.
+ * @wake_invert: Inverted wake register: cleared bits are wake disabled.
* @type_in_mask: Use the mask registers for controlling irq type. Use this if
* the hardware provides separate bits for rising/falling edge
* or low/high level interrupts and they should be combined into
--
2.43.0
Currently mremap folio pte batch ignores the writable bit during figuring
out a set of similar ptes mapping the same folio. Suppose that the first
pte of the batch is writable while the others are not - set_ptes will
end up setting the writable bit on the other ptes, which is a violation
of mremap semantics. Therefore, use FPB_RESPECT_WRITE to check the writable
bit while determining the pte batch.
Cc: stable(a)vger.kernel.org #6.17
Fixes: f822a9a81a31 ("mm: optimize mremap() by PTE batching")
Reported-by: David Hildenbrand <david(a)redhat.com>
Debugged-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
---
mm-selftests pass. Based on mm-new. Need David H. to confirm whether
the repro passes.
mm/mremap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/mremap.c b/mm/mremap.c
index a7f531c17b79..8ad06cf50783 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -187,7 +187,7 @@ static int mremap_folio_pte_batch(struct vm_area_struct *vma, unsigned long addr
if (!folio || !folio_test_large(folio))
return 1;
- return folio_pte_batch(folio, ptep, pte, max_nr);
+ return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, FPB_RESPECT_WRITE);
}
static int move_ptes(struct pagetable_move_control *pmc,
--
2.30.2
Evaluaciones Psicométricas para RR.HH.
body {
margin: 0;
padding: 0;
font-family: Arial, Helvetica, sans-serif;
font-size: 14px;
color: #333;
background-color: #ffffff;
}
table {
border-spacing: 0;
width: 100%;
max-width: 600px;
margin: auto;
}
td {
padding: 12px 20px;
}
a {
color: #1a73e8;
text-decoration: none;
}
.footer {
font-size: 12px;
color: #888888;
text-align: center;
}
Mejora tus procesos de selección con evaluaciones psicométricas fáciles y confiables.
Hola, ,
Sabemos que encontrar al candidato ideal va más allá del currículum. Por eso quiero contarte brevemente sobre PsicoSmart, una plataforma que ayuda a equipos de RR.HH. a evaluar talento con pruebas psicométricas rápidas, confiables y fáciles de aplicar.
Con PsicoSmart puedes:
Aplicar evaluaciones psicométricas 100% en línea.
Elegir entre más de 31 pruebas psicométricas
Generar reportes automáticos, visuales y fáciles de interpretar.
Comparar resultados entre candidatos en segundos.
Ahorrar horas valiosas del proceso de selección.
Si estás buscando mejorar tus contrataciones, te lo recomiendo muchísimo. Si quieres conocer más puedes responder este correo o simplemente contactarme, mis datos están abajo.
Saludos,
-----------------------------
Atte.: Valeria Pérez
Ciudad de México: (55) 5018 0565
WhatsApp: +52 33 1607 2089
Si no deseas recibir más correos, haz clic aquí para darte de baja.
Para remover su dirección de esta lista haga <a href="https://s1.arrobamail.com/unsuscribe.php?id=yiwtsrewisppiseup">click aquí</a>
From: Junrui Luo <moonafterrain(a)outlook.com>
The asd_pci_remove() function fails to synchronize with pending tasklets
before freeing the asd_ha structure, leading to a potential use-after-free
vulnerability.
When a device removal is triggered (via hot-unplug or module unload), race condition can occur.
The fix adds tasklet_kill() before freeing the asd_ha structure, ensuring
all scheduled tasklets complete before cleanup proceeds.
Reported-by: Yuhao Jiang <danisjiang(a)gmail.com>
Reported-by: Junrui Luo <moonafterrain(a)outlook.com>
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Junrui Luo <moonafterrain(a)outlook.com>
---
drivers/scsi/aic94xx/aic94xx_init.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/scsi/aic94xx/aic94xx_init.c b/drivers/scsi/aic94xx/aic94xx_init.c
index adf3d9145606..95f3620059f7 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -882,6 +882,9 @@ static void asd_pci_remove(struct pci_dev *dev)
asd_disable_ints(asd_ha);
+ /* Ensure all scheduled tasklets complete before freeing resources */
+ tasklet_kill(&asd_ha->seq.dl_tasklet);
+
asd_remove_dev_attrs(asd_ha);
/* XXX more here as needed */
--
2.51.1.dirty
From: Junrui Luo <moonafterrain(a)outlook.com>
The uninorth_insert_memory and uninorth_remove_memory functions lack
proper validation of the pg_start parameter before using it as an array
index into the GATT (Graphics Address Translation Table).
The current bounds check fails to reject negative
pg_start values and potentially causes out-of-bounds writes.
Fix by explicitly checking that pg_start is non-negative before
performing bounds checking. This makes the security requirement clear
and does not rely on implicit type conversion behavior.
The uninorth_remove_memory function has no bounds checking at all, so
add the same validation there.
Reported-by: Yuhao Jiang <danisjiang(a)gmail.com>
Reported-by: Junrui Luo <moonafterrain(a)outlook.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Junrui Luo <moonafterrain(a)outlook.com>
---
drivers/char/agp/uninorth-agp.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/char/agp/uninorth-agp.c b/drivers/char/agp/uninorth-agp.c
index b8d7115b8c9e..4e0b949016f7 100644
--- a/drivers/char/agp/uninorth-agp.c
+++ b/drivers/char/agp/uninorth-agp.c
@@ -169,7 +169,9 @@ static int uninorth_insert_memory(struct agp_memory *mem, off_t pg_start, int ty
temp = agp_bridge->current_size;
num_entries = A_SIZE_32(temp)->num_entries;
- if ((pg_start + mem->page_count) > num_entries)
+ if (pg_start < 0 ||
+ (pg_start + mem->page_count) > num_entries ||
+ (pg_start + mem->page_count) < pg_start)
return -EINVAL;
gp = (u32 *) &agp_bridge->gatt_table[pg_start];
@@ -200,6 +202,9 @@ static int uninorth_insert_memory(struct agp_memory *mem, off_t pg_start, int ty
static int uninorth_remove_memory(struct agp_memory *mem, off_t pg_start, int type)
{
size_t i;
+ int num_entries;
+ void *temp;
+
u32 *gp;
int mask_type;
@@ -215,6 +220,14 @@ static int uninorth_remove_memory(struct agp_memory *mem, off_t pg_start, int ty
if (mem->page_count == 0)
return 0;
+ temp = agp_bridge->current_size;
+ num_entries = A_SIZE_32(temp)->num_entries;
+
+ if (pg_start < 0 ||
+ (pg_start + mem->page_count) > num_entries ||
+ (pg_start + mem->page_count) < pg_start)
+ return -EINVAL;
+
gp = (u32 *) &agp_bridge->gatt_table[pg_start];
for (i = 0; i < mem->page_count; ++i) {
gp[i] = scratch_value;
--
2.51.1.dirty
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102054-slacks-ambush-f774@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 Mon Sep 17 00:00:00 2001
From: Babu Moger <babu.moger(a)amd.com>
Date: Fri, 10 Oct 2025 12:08:35 -0500
Subject: [PATCH] x86/resctrl: Fix miscount of bandwidth event when
reactivating previously unavailable RMID
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets
MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
remains set on first read after tracking re-starts and is clear on all
subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the
"Unavailable" state back to being tracked. This happens because when the
hardware starts counting again after resetting the counter to zero, resctrl
in turn compares the new count against the counter value stored from the
previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting
(when new counter is more than stored counter) or a mistaken overflow (when
new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
zero whenever the RMID is in the "Unavailable" state to ensure accurate
counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow
==================================================
1. The resctrl filesystem is mounted, and a task is assigned to a
monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl
$mkdir /sys/fs/resctrl/mon_groups/test1/
$echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
21323 <- Total bytes on domain 0
"Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The
counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
7345357 <- Total bytes on domain 0
4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable"
state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
"Unavailable" <- Total bytes on domain 0
434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually
return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
17592178699059 <- Overflow on domain 0
3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to
active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously
saved MSR value (7345357). Consequently, the resctrl's overflow logic is
triggered, it compares the previous value (7345357) with the new, smaller
value and incorrectly interprets this as a counter overflow, adding a large
delta.
In reality, this is a false positive: the counter did not overflow but was
simply reset when the RMID transitioned from "Unavailable" back to active
state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better
consumption. ]
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Tested-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org # needs adjustments for <= v6.17
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c8945610d455..2cd25a0d4637 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
u32 unused, u32 rmid, enum resctrl_event_id eventid,
u64 *val, void *ignored)
{
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
int cpu = cpumask_any(&d->hdr.cpu_mask);
+ struct arch_mbm_state *am;
u64 msr_val;
u32 prmid;
int ret;
@@ -251,12 +253,16 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
prmid = logical_rmid_to_physical_rmid(cpu, rmid);
ret = __rmid_read_phys(prmid, eventid, &msr_val);
- if (ret)
- return ret;
- *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ if (!ret) {
+ *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ } else if (ret == -EINVAL) {
+ am = get_arch_mbm_state(hw_dom, rmid, eventid);
+ if (am)
+ am->prev_msr = 0;
+ }
- return 0;
+ return ret;
}
static int __cntr_id_read(u32 cntr_id, u64 *val)
From: Christian Hitz <christian.hitz(a)bbv.ch>
If a GPIO is used to control the chip's enable pin, it needs to be pulled
high before any i2c communication is attempted.
Currently, the enable GPIO handling is not correct.
Assume the enable GPIO is low when the probe function is entered. In this
case the device is in SHUTDOWN mode and does not react to i2c commands.
During probe the following sequence happens:
1. The call to lp50xx_reset() on line 548 has no effect as i2c is not
possible yet.
2. Then - on line 552 - lp50xx_enable_disable() is called. As
"priv->enable_gpio“ has not yet been initialized, setting the GPIO has
no effect. Also the i2c enable command is not executed as the device
is still in SHUTDOWN.
3. On line 556 the call to lp50xx_probe_dt() finally parses the rest of
the DT and the configured priv->enable_gpio is set up.
As a result the device is still in SHUTDOWN mode and not ready for
operation.
Split lp50xx_enable_disable() into distinct enable and disable functions
to enforce correct ordering between enable_gpio manipulations and i2c
commands.
Read enable_gpio configuration from DT before attempting to manipulate
enable_gpio.
Add delays to observe correct wait timing after manipulating enable_gpio
and before any i2c communication.
Fixes: 242b81170fb8 ("leds: lp50xx: Add the LP50XX family of the RGB LED driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Hitz <christian.hitz(a)bbv.ch>
---
Changes in v2:
- Unconditionally reset in lp50xx_enable
- Define magic numbers
- Improve log message
---
drivers/leds/leds-lp50xx.c | 55 +++++++++++++++++++++++++++-----------
1 file changed, 40 insertions(+), 15 deletions(-)
diff --git a/drivers/leds/leds-lp50xx.c b/drivers/leds/leds-lp50xx.c
index 94f8ef6b482c..d3485d814cf4 100644
--- a/drivers/leds/leds-lp50xx.c
+++ b/drivers/leds/leds-lp50xx.c
@@ -50,6 +50,12 @@
#define LP50XX_SW_RESET 0xff
#define LP50XX_CHIP_EN BIT(6)
+#define LP50XX_CHIP_DISABLE 0x00
+#define LP50XX_START_TIME_US 500
+#define LP50XX_RESET_TIME_US 3
+
+#define LP50XX_EN_GPIO_LOW 0
+#define LP50XX_EN_GPIO_HIGH 1
/* There are 3 LED outputs per bank */
#define LP50XX_LEDS_PER_MODULE 3
@@ -371,19 +377,42 @@ static int lp50xx_reset(struct lp50xx *priv)
return regmap_write(priv->regmap, priv->chip_info->reset_reg, LP50XX_SW_RESET);
}
-static int lp50xx_enable_disable(struct lp50xx *priv, int enable_disable)
+static int lp50xx_enable(struct lp50xx *priv)
{
int ret;
- ret = gpiod_direction_output(priv->enable_gpio, enable_disable);
+ if (priv->enable_gpio) {
+ ret = gpiod_direction_output(priv->enable_gpio, LP50XX_EN_GPIO_HIGH);
+ if (ret)
+ return ret;
+
+ udelay(LP50XX_START_TIME_US);
+ }
+
+ ret = lp50xx_reset(priv);
if (ret)
return ret;
- if (enable_disable)
- return regmap_write(priv->regmap, LP50XX_DEV_CFG0, LP50XX_CHIP_EN);
- else
- return regmap_write(priv->regmap, LP50XX_DEV_CFG0, 0);
+ return regmap_write(priv->regmap, LP50XX_DEV_CFG0, LP50XX_CHIP_EN);
+}
+static int lp50xx_disable(struct lp50xx *priv)
+{
+ int ret;
+
+ ret = regmap_write(priv->regmap, LP50XX_DEV_CFG0, LP50XX_CHIP_DISABLE);
+ if (ret)
+ return ret;
+
+ if (priv->enable_gpio) {
+ ret = gpiod_direction_output(priv->enable_gpio, LP50XX_EN_GPIO_LOW);
+ if (ret)
+ return ret;
+
+ udelay(LP50XX_RESET_TIME_US);
+ }
+
+ return 0;
}
static int lp50xx_probe_leds(struct fwnode_handle *child, struct lp50xx *priv,
@@ -447,6 +476,10 @@ static int lp50xx_probe_dt(struct lp50xx *priv)
return dev_err_probe(priv->dev, PTR_ERR(priv->enable_gpio),
"Failed to get enable GPIO\n");
+ ret = lp50xx_enable(priv);
+ if (ret)
+ return ret;
+
priv->regulator = devm_regulator_get(priv->dev, "vled");
if (IS_ERR(priv->regulator))
priv->regulator = NULL;
@@ -547,14 +580,6 @@ static int lp50xx_probe(struct i2c_client *client)
return ret;
}
- ret = lp50xx_reset(led);
- if (ret)
- return ret;
-
- ret = lp50xx_enable_disable(led, 1);
- if (ret)
- return ret;
-
return lp50xx_probe_dt(led);
}
@@ -563,7 +588,7 @@ static void lp50xx_remove(struct i2c_client *client)
struct lp50xx *led = i2c_get_clientdata(client);
int ret;
- ret = lp50xx_enable_disable(led, 0);
+ ret = lp50xx_disable(led);
if (ret)
dev_err(led->dev, "Failed to disable chip\n");
--
2.51.1
A user reports that on their Lenovo Corsola Magneton with EC firmware
steelix-15194.270.0 the driver probe fails with EINVAL. It turns out
that the power LED does not contain any color components as indicated
by the following "ectool led power query" output:
Brightness range for LED 1:
red : 0x0
green : 0x0
blue : 0x0
yellow : 0x0
white : 0x0
amber : 0x0
The LED also does not react to commands sent manually through ectool and
is generally non-functional.
Instead of failing the probe for all LEDs managed by the EC when one
without color components is encountered, silently skip those.
Fixes: 8d6ce6f3ec9d ("leds: Add ChromeOS EC driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
drivers/leds/leds-cros_ec.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/leds/leds-cros_ec.c b/drivers/leds/leds-cros_ec.c
index 377cf04e202a..bea3cc3fbfd2 100644
--- a/drivers/leds/leds-cros_ec.c
+++ b/drivers/leds/leds-cros_ec.c
@@ -142,9 +142,6 @@ static int cros_ec_led_count_subleds(struct device *dev,
}
}
- if (!num_subleds)
- return -EINVAL;
-
*max_brightness = common_range;
return num_subleds;
}
@@ -189,6 +186,8 @@ static int cros_ec_led_probe_one(struct device *dev, struct cros_ec_device *cros
&priv->led_mc_cdev.led_cdev.max_brightness);
if (num_subleds < 0)
return num_subleds;
+ if (num_subleds == 0)
+ return 0; /* LED without any colors, skip */
priv->cros_ec = cros_ec;
priv->led_id = id;
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
change-id: 20251028-cros_ec-leds-no-colors-18eb8d1efa92
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
When simple_write_to_buffer() succeeds, it returns the number of bytes
actually copied to the buffer, which may be less than the requested
'count' if the buffer size is insufficient. However, the current code
incorrectly uses 'count' as the index for null termination instead of
the actual bytes copied, leading to out-of-bound write.
Add a check for the count and use the return value as the index.
Found via static analysis. This is similar to the
commit da9374819eb3 ("iio: backend: fix out-of-bound write")
Fixes: b1c5d68ea66e ("iio: dac: ad3552r-hs: add support for internal ramp")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/iio/dac/ad3552r-hs.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/dac/ad3552r-hs.c b/drivers/iio/dac/ad3552r-hs.c
index 41b96b48ba98..a9578afa7015 100644
--- a/drivers/iio/dac/ad3552r-hs.c
+++ b/drivers/iio/dac/ad3552r-hs.c
@@ -549,12 +549,15 @@ static ssize_t ad3552r_hs_write_data_source(struct file *f,
guard(mutex)(&st->lock);
+ if (count >= sizeof(buf))
+ return -ENOSPC;
+
ret = simple_write_to_buffer(buf, sizeof(buf) - 1, ppos, userbuf,
count);
if (ret < 0)
return ret;
- buf[count] = '\0';
+ buf[ret] = '\0';
ret = match_string(dbgfs_attr_source, ARRAY_SIZE(dbgfs_attr_source),
buf);
--
2.39.5 (Apple Git-154)
A recent change fixed device reference leaks when looking up drm
platform device driver data during bind() but failed to remove a partial
fix which had been added by commit 80805b62ea5b ("drm/mediatek: Fix
kobject put for component sub-drivers").
This results in a reference imbalance on component bind() failures and
on unbind() which could lead to a user-after-free.
Make sure to only drop the references after retrieving the driver data
by effectively reverting the previous partial fix.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Fixes: 1f403699c40f ("drm/mediatek: Fix device/node reference count leaks in mtk_drm_get_all_drm_priv")
Reported-by: Sjoerd Simons <sjoerd(a)collabora.com>
Link: https://lore.kernel.org/r/20251003-mtk-drm-refcount-v1-1-3b3f2813b0db@colla…
Cc: stable(a)vger.kernel.org
Cc: Ma Ke <make24(a)iscas.ac.cn>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/gpu/drm/mediatek/mtk_drm_drv.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 384b0510272c..a94c51a83261 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -686,10 +686,6 @@ static int mtk_drm_bind(struct device *dev)
for (i = 0; i < private->data->mmsys_dev_num; i++)
private->all_drm_private[i]->drm = NULL;
err_put_dev:
- for (i = 0; i < private->data->mmsys_dev_num; i++) {
- /* For device_find_child in mtk_drm_get_all_priv() */
- put_device(private->all_drm_private[i]->dev);
- }
put_device(private->mutex_dev);
return ret;
}
@@ -697,18 +693,12 @@ static int mtk_drm_bind(struct device *dev)
static void mtk_drm_unbind(struct device *dev)
{
struct mtk_drm_private *private = dev_get_drvdata(dev);
- int i;
/* for multi mmsys dev, unregister drm dev in mmsys master */
if (private->drm_master) {
drm_dev_unregister(private->drm);
mtk_drm_kms_deinit(private->drm);
drm_dev_put(private->drm);
-
- for (i = 0; i < private->data->mmsys_dev_num; i++) {
- /* For device_find_child in mtk_drm_get_all_priv() */
- put_device(private->all_drm_private[i]->dev);
- }
put_device(private->mutex_dev);
}
private->mtk_drm_bound = false;
--
2.49.1
The code in bmc150-accel-core.c unconditionally calls
bmc150_accel_set_interrupt() in the iio_buffer_setup_ops,
such as on the runtime PM resume path giving a kernel
splat like this if the device has no interrupts:
Unable to handle kernel NULL pointer dereference at virtual
address 00000001 when read
CPU: 0 UID: 0 PID: 393 Comm: iio-sensor-prox Not tainted
6.18.0-rc1-postmarketos-stericsson-00001-g6b43386e3737 #73 PREEMPT
Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
PC is at bmc150_accel_set_interrupt+0x98/0x194
LR is at __pm_runtime_resume+0x5c/0x64
(...)
Call trace:
bmc150_accel_set_interrupt from bmc150_accel_buffer_postenable+0x40/0x108
bmc150_accel_buffer_postenable from __iio_update_buffers+0xbe0/0xcbc
__iio_update_buffers from enable_store+0x84/0xc8
enable_store from kernfs_fop_write_iter+0x154/0x1b4
kernfs_fop_write_iter from do_iter_readv_writev+0x178/0x1e4
do_iter_readv_writev from vfs_writev+0x158/0x3f4
vfs_writev from do_writev+0x74/0xe4
do_writev from __sys_trace_return+0x0/0x10
This bug seems to have been in the driver since the beginning,
but it only manifests recently, I do not know why.
Cc: stable(a)vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
---
drivers/iio/accel/bmc150-accel-core.c | 5 +++++
drivers/iio/accel/bmc150-accel.h | 1 +
2 files changed, 6 insertions(+)
diff --git a/drivers/iio/accel/bmc150-accel-core.c b/drivers/iio/accel/bmc150-accel-core.c
index 3c5d1560b163..f4421a3b0ef2 100644
--- a/drivers/iio/accel/bmc150-accel-core.c
+++ b/drivers/iio/accel/bmc150-accel-core.c
@@ -523,6 +523,10 @@ static int bmc150_accel_set_interrupt(struct bmc150_accel_data *data, int i,
const struct bmc150_accel_interrupt_info *info = intr->info;
int ret;
+ /* We do not always have an IRQ */
+ if (!data->has_irq)
+ return 0;
+
if (state) {
if (atomic_inc_return(&intr->users) > 1)
return 0;
@@ -1696,6 +1700,7 @@ int bmc150_accel_core_probe(struct device *dev, struct regmap *regmap, int irq,
}
if (irq > 0) {
+ data->has_irq = true;
ret = devm_request_threaded_irq(dev, irq,
bmc150_accel_irq_handler,
bmc150_accel_irq_thread_handler,
diff --git a/drivers/iio/accel/bmc150-accel.h b/drivers/iio/accel/bmc150-accel.h
index 7a7baf52e595..6e9bb69a1fd4 100644
--- a/drivers/iio/accel/bmc150-accel.h
+++ b/drivers/iio/accel/bmc150-accel.h
@@ -59,6 +59,7 @@ enum bmc150_accel_trigger_id {
struct bmc150_accel_data {
struct regmap *regmap;
struct regulator_bulk_data regulators[2];
+ bool has_irq;
struct bmc150_accel_interrupt interrupts[BMC150_ACCEL_INTERRUPTS];
struct bmc150_accel_trigger triggers[BMC150_ACCEL_TRIGGERS];
struct mutex mutex;
---
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
change-id: 20251027-fix-bmc150-7e568122b265
Best regards,
--
Linus Walleij <linus.walleij(a)linaro.org>
From: Mika Westerberg <mika.westerberg(a)linux.intel.com>
[ Upstream commit f7e37affbc9085f2b77ccb6596521a44eabf7505 ]
Add Intel Wildcat Lake SPI serial flash PCI ID to the list of supported
devices.
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Link: https://patch.msgid.link/20251020145415.3377022-4-mika.westerberg@linux.int…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
## BACKPORT ANALYSIS COMPLETE
**Backport Status: YES**
---
## DETAILED ANALYSIS
### 1. COMMIT CHARACTERIZATION
**Type**: Hardware enablement (PCI device ID addition)
**Code Changes**:
- Single line addition to `drivers/spi/spi-intel-pci.c`
- Adds PCI device ID `0x4d23` (Intel Wildcat Lake SPI serial flash)
- Maps to existing `cnl_info` configuration structure
- No behavioral changes to existing code paths
### 2. SEMANTIC CODE ANALYSIS FINDINGS
#### Tools Used:
- **Read**: Examined the complete file structure (drivers/spi/spi-intel-
pci.c:68-99)
- **Grep**: Located PCI device table usage and MODULE_DEVICE_TABLE macro
- **Git analysis**: Reviewed commit history and backport precedents
#### Impact Scope Analysis:
- **Direct impact**: Only affects systems with PCI ID 0x4d23 (Wildcat
Lake hardware)
- **Caller analysis**: The PCI device table is used by the kernel's PCI
subsystem for automatic device-driver matching
- **Dependency analysis**: Uses `cnl_info` structure (drivers/spi/spi-
intel-pci.c:38-41), which has existed since 2022
- **Risk assessment**: Zero risk to existing hardware - new entry only
triggers on matching PCI ID
### 3. BACKPORT PRECEDENT (Strong Evidence)
I found multiple similar commits that **WERE backported** to stable
trees:
**Example 1 - Lunar Lake-M** (commit ec33549be99fe):
```
commit ec33549be99fe25c6927c8b3d6ed13918b27656e
Author: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Commit: Sasha Levin <sashal(a)kernel.org> [STABLE MAINTAINER]
spi: intel-pci: Add support for Lunar Lake-M SPI serial flash
[ Upstream commit 8f44e3808200c1434c26ef459722f88f48b306df ]
```
**Example 2 - Granite Rapids** (commit 60446b5e74865):
```
commit 60446b5e74865acff1af5f2d89d99551c8e6e2c1
Author: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Commit: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> [STABLE
MAINTAINER]
spi: intel-pci: Add support for Granite Rapids SPI serial flash
[ Upstream commit 9855d60cfc720ff32355484c119acafd3c4dc806 ]
```
**Key Finding**: Both commits were backported to stable trees by stable
maintainers (Sasha Levin, Greg Kroah-Hartman) **despite having NO
explicit stable tags** in the original commits.
### 4. STABLE TREE COMPLIANCE
**Positive Factors**:
- ✅ Fixes a real user problem (Wildcat Lake users cannot access SPI
flash on stable kernels)
- ✅ Extremely low risk (single line, static table entry)
- ✅ No architectural changes
- ✅ No dependencies on unreleased features
- ✅ Follows established backporting pattern for PCI ID additions
- ✅ "Obviously correct" - uses tested configuration from 20+ other Intel
platforms
**Considerations**:
- ⚠️ Technically a "new feature" (new hardware support)
- ✅ However, PCI ID additions are a **special exception** routinely
accepted by stable maintainers
- ✅ Hardware enablement is considered "fixing" missing support for
existing hardware
### 5. DEPENDENCY VERIFICATION
- **cnl_info structure**: Present since 2022, exists in all stable
kernels that would receive this backport
- **PCI_VDEVICE macro**: Standard kernel infrastructure
- **intel_spi_probe()**: Core function, no recent changes that would
conflict
- **Clean backport**: No conflicts expected with any stable kernel
version
### 6. RISK ANALYSIS
**Regression Risk**: **MINIMAL**
- Only triggered when PCI ID 0x4d23 is detected
- No effect on systems without Wildcat Lake hardware
- Uses well-tested code path (cnl_info shared with 20+ other platforms)
- No user-space API changes
**Testing Coverage**:
- Same code path used by numerous Intel platforms (Ice Lake, Tiger Lake,
Alder Lake, etc.)
- Configuration structure (cnl_info) battle-tested across multiple
generations
### 7. COMMIT MESSAGE ANALYSIS
The commit message is:
- ✅ Clear and descriptive
- ✅ Identifies specific hardware (Intel Wildcat Lake)
- ✅ Consistent with previous similar commits
- ❌ No "Fixes:" tag (not applicable - this is new hardware support)
- ❌ No "Cc: stable(a)vger.kernel.org" tag (but precedent shows this
doesn't prevent backporting)
---
## RECOMMENDATION
**YES - This commit SHOULD be backported to stable kernel trees**
### Rationale:
1. **Strong precedent**: Multiple similar PCI ID additions for Intel
platforms (Lunar Lake, Granite Rapids, Arrow Lake) have been
backported to stable trees by stable maintainers
2. **User benefit**: Enables critical SPI flash functionality for
Wildcat Lake hardware owners running stable kernels
3. **Minimal risk**: Single-line change to static table with no impact
on existing hardware
4. **Well-established pattern**: PCI ID additions for hardware
enablement are routinely accepted for stable backports as they "fix"
missing hardware support
5. **Clean backport**: No dependencies or conflicts expected
### Target Stable Trees:
- All currently maintained stable kernels (6.17.x, 6.16.x, 6.15.x, etc.)
- The commit is from v6.18-rc3, so it would benefit users on any stable
kernel before 6.18
drivers/spi/spi-intel-pci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/spi/spi-intel-pci.c b/drivers/spi/spi-intel-pci.c
index 4b63cb98df9cc..4bb158a23349e 100644
--- a/drivers/spi/spi-intel-pci.c
+++ b/drivers/spi/spi-intel-pci.c
@@ -75,6 +75,7 @@ static const struct pci_device_id intel_spi_pci_ids[] = {
{ PCI_VDEVICE(INTEL, 0x38a4), (unsigned long)&bxt_info },
{ PCI_VDEVICE(INTEL, 0x43a4), (unsigned long)&cnl_info },
{ PCI_VDEVICE(INTEL, 0x4b24), (unsigned long)&bxt_info },
+ { PCI_VDEVICE(INTEL, 0x4d23), (unsigned long)&cnl_info },
{ PCI_VDEVICE(INTEL, 0x4da4), (unsigned long)&bxt_info },
{ PCI_VDEVICE(INTEL, 0x51a4), (unsigned long)&cnl_info },
{ PCI_VDEVICE(INTEL, 0x54a4), (unsigned long)&cnl_info },
--
2.51.0
In thermal_of_cm_lookup(), of_parse_phandle() returns a device node with
its reference count incremented. The caller is responsible for releasing
this reference when the node is no longer needed.
Add of_node_put(tr_np) to fix the reference leaks.
Found via static analysis.
Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/thermal/thermal_of.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 1a51a4d240ff..2bb1b8e471cf 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -284,8 +284,11 @@ static bool thermal_of_cm_lookup(struct device_node *cm_np,
int count, i;
tr_np = of_parse_phandle(child, "trip", 0);
- if (tr_np != trip->priv)
+ if (tr_np != trip->priv) {
+ of_node_put(tr_np);
continue;
+ }
+ of_node_put(tr_np);
/* The trip has been found, look up the cdev. */
count = of_count_phandle_with_args(child, "cooling-device",
--
2.39.5 (Apple Git-154)
The 2 argument version of v4l2_subdev_state_get_format() takes the pad
as second argument, not the stream.
Fixes: bc0e8d91feec ("media: v4l: subdev: Switch to stream-aware state functions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
---
Note the problem pre-exists the Fixes: commit but backporting further
will require manual backports and seems unnecessary.
---
drivers/media/i2c/ov01a10.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/i2c/ov01a10.c b/drivers/media/i2c/ov01a10.c
index f92867f542f0..0405ec7c75fd 100644
--- a/drivers/media/i2c/ov01a10.c
+++ b/drivers/media/i2c/ov01a10.c
@@ -731,7 +731,7 @@ static int ov01a10_set_format(struct v4l2_subdev *sd,
h_blank);
}
- format = v4l2_subdev_state_get_format(sd_state, fmt->stream);
+ format = v4l2_subdev_state_get_format(sd_state, fmt->pad);
*format = fmt->format;
return 0;
--
2.51.0
From: Benjamin Berg <benjamin.berg(a)intel.com>
The normal timer mechanism assume that timeout further in the future
need a lower accuracy. As an example, the granularity for a timer
scheduled 4096 ms in the future on a 1000 Hz system is already 512 ms.
This granularity is perfectly sufficient for e.g. timeouts, but there
are other types of events that will happen at a future point in time and
require a higher accuracy.
Add a new wiphy_hrtimer_work type that uses an hrtimer internally. The
API is almost identical to the existing wiphy_delayed_work and it can be
used as a drop-in replacement after minor adjustments. The work will be
scheduled relative to the current time with a slack of 1 millisecond.
CC: stable(a)vger.kernel.org # 6.4+
Signed-off-by: Benjamin Berg <benjamin.berg(a)intel.com>
Reviewed-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
---
include/net/cfg80211.h | 78 ++++++++++++++++++++++++++++++++++++++++++
net/wireless/core.c | 56 ++++++++++++++++++++++++++++++
net/wireless/trace.h | 21 ++++++++++++
3 files changed, 155 insertions(+)
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 781624f5913a..820e299f06b5 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -6435,6 +6435,11 @@ static inline void wiphy_delayed_work_init(struct wiphy_delayed_work *dwork,
* after wiphy_lock() was called. Therefore, wiphy_cancel_work() can
* use just cancel_work() instead of cancel_work_sync(), it requires
* being in a section protected by wiphy_lock().
+ *
+ * Note that these are scheduled with a timer where the accuracy
+ * becomes less the longer in the future the scheduled timer is. Use
+ * wiphy_hrtimer_work_queue() if the timer must be not be late by more
+ * than approximately 10 percent.
*/
void wiphy_delayed_work_queue(struct wiphy *wiphy,
struct wiphy_delayed_work *dwork,
@@ -6506,6 +6511,79 @@ void wiphy_delayed_work_flush(struct wiphy *wiphy,
bool wiphy_delayed_work_pending(struct wiphy *wiphy,
struct wiphy_delayed_work *dwork);
+struct wiphy_hrtimer_work {
+ struct wiphy_work work;
+ struct wiphy *wiphy;
+ struct hrtimer timer;
+};
+
+enum hrtimer_restart wiphy_hrtimer_work_timer(struct hrtimer *t);
+
+static inline void wiphy_hrtimer_work_init(struct wiphy_hrtimer_work *hrwork,
+ wiphy_work_func_t func)
+{
+ hrtimer_setup(&hrwork->timer, wiphy_hrtimer_work_timer,
+ CLOCK_BOOTTIME, HRTIMER_MODE_REL);
+ wiphy_work_init(&hrwork->work, func);
+}
+
+/**
+ * wiphy_hrtimer_work_queue - queue hrtimer work for the wiphy
+ * @wiphy: the wiphy to queue for
+ * @hrwork: the high resolution timer worker
+ * @delay: the delay given as a ktime_t
+ *
+ * Please refer to wiphy_delayed_work_queue(). The difference is that
+ * the hrtimer work uses a high resolution timer for scheduling. This
+ * may be needed if timeouts might be scheduled further in the future
+ * and the accuracy of the normal timer is not sufficient.
+ *
+ * Expect a delay of a few milliseconds as the timer is scheduled
+ * with some slack and some more time may pass between queueing the
+ * work and its start.
+ */
+void wiphy_hrtimer_work_queue(struct wiphy *wiphy,
+ struct wiphy_hrtimer_work *hrwork,
+ ktime_t delay);
+
+/**
+ * wiphy_hrtimer_work_cancel - cancel previously queued hrtimer work
+ * @wiphy: the wiphy, for debug purposes
+ * @hrtimer: the hrtimer work to cancel
+ *
+ * Cancel the work *without* waiting for it, this assumes being
+ * called under the wiphy mutex acquired by wiphy_lock().
+ */
+void wiphy_hrtimer_work_cancel(struct wiphy *wiphy,
+ struct wiphy_hrtimer_work *hrtimer);
+
+/**
+ * wiphy_hrtimer_work_flush - flush previously queued hrtimer work
+ * @wiphy: the wiphy, for debug purposes
+ * @hrwork: the hrtimer work to flush
+ *
+ * Flush the work (i.e. run it if pending). This must be called
+ * under the wiphy mutex acquired by wiphy_lock().
+ */
+void wiphy_hrtimer_work_flush(struct wiphy *wiphy,
+ struct wiphy_hrtimer_work *hrwork);
+
+/**
+ * wiphy_hrtimer_work_pending - Find out whether a wiphy hrtimer
+ * work item is currently pending.
+ *
+ * @wiphy: the wiphy, for debug purposes
+ * @hrwork: the hrtimer work in question
+ *
+ * Return: true if timer is pending, false otherwise
+ *
+ * Please refer to the wiphy_delayed_work_pending() documentation as
+ * this is the equivalent function for hrtimer based delayed work
+ * items.
+ */
+bool wiphy_hrtimer_work_pending(struct wiphy *wiphy,
+ struct wiphy_hrtimer_work *hrwork);
+
/**
* enum ieee80211_ap_reg_power - regulatory power for an Access Point
*
diff --git a/net/wireless/core.c b/net/wireless/core.c
index 797f9f2004a6..54a34d8d356e 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1787,6 +1787,62 @@ bool wiphy_delayed_work_pending(struct wiphy *wiphy,
}
EXPORT_SYMBOL_GPL(wiphy_delayed_work_pending);
+enum hrtimer_restart wiphy_hrtimer_work_timer(struct hrtimer *t)
+{
+ struct wiphy_hrtimer_work *hrwork =
+ container_of(t, struct wiphy_hrtimer_work, timer);
+
+ wiphy_work_queue(hrwork->wiphy, &hrwork->work);
+
+ return HRTIMER_NORESTART;
+}
+EXPORT_SYMBOL_GPL(wiphy_hrtimer_work_timer);
+
+void wiphy_hrtimer_work_queue(struct wiphy *wiphy,
+ struct wiphy_hrtimer_work *hrwork,
+ ktime_t delay)
+{
+ trace_wiphy_hrtimer_work_queue(wiphy, &hrwork->work, delay);
+
+ if (!delay) {
+ hrtimer_cancel(&hrwork->timer);
+ wiphy_work_queue(wiphy, &hrwork->work);
+ return;
+ }
+
+ hrwork->wiphy = wiphy;
+ hrtimer_start_range_ns(&hrwork->timer, delay,
+ 1000 * NSEC_PER_USEC, HRTIMER_MODE_REL);
+}
+EXPORT_SYMBOL_GPL(wiphy_hrtimer_work_queue);
+
+void wiphy_hrtimer_work_cancel(struct wiphy *wiphy,
+ struct wiphy_hrtimer_work *hrwork)
+{
+ lockdep_assert_held(&wiphy->mtx);
+
+ hrtimer_cancel(&hrwork->timer);
+ wiphy_work_cancel(wiphy, &hrwork->work);
+}
+EXPORT_SYMBOL_GPL(wiphy_hrtimer_work_cancel);
+
+void wiphy_hrtimer_work_flush(struct wiphy *wiphy,
+ struct wiphy_hrtimer_work *hrwork)
+{
+ lockdep_assert_held(&wiphy->mtx);
+
+ hrtimer_cancel(&hrwork->timer);
+ wiphy_work_flush(wiphy, &hrwork->work);
+}
+EXPORT_SYMBOL_GPL(wiphy_hrtimer_work_flush);
+
+bool wiphy_hrtimer_work_pending(struct wiphy *wiphy,
+ struct wiphy_hrtimer_work *hrwork)
+{
+ return hrtimer_is_queued(&hrwork->timer);
+}
+EXPORT_SYMBOL_GPL(wiphy_hrtimer_work_pending);
+
static int __init cfg80211_init(void)
{
int err;
diff --git a/net/wireless/trace.h b/net/wireless/trace.h
index 8a4c34112eb5..2b71f1d867a0 100644
--- a/net/wireless/trace.h
+++ b/net/wireless/trace.h
@@ -304,6 +304,27 @@ TRACE_EVENT(wiphy_delayed_work_queue,
__entry->delay)
);
+TRACE_EVENT(wiphy_hrtimer_work_queue,
+ TP_PROTO(struct wiphy *wiphy, struct wiphy_work *work,
+ ktime_t delay),
+ TP_ARGS(wiphy, work, delay),
+ TP_STRUCT__entry(
+ WIPHY_ENTRY
+ __field(void *, instance)
+ __field(void *, func)
+ __field(ktime_t, delay)
+ ),
+ TP_fast_assign(
+ WIPHY_ASSIGN;
+ __entry->instance = work;
+ __entry->func = work->func;
+ __entry->delay = delay;
+ ),
+ TP_printk(WIPHY_PR_FMT " instance=%p func=%pS delay=%llu",
+ WIPHY_PR_ARG, __entry->instance, __entry->func,
+ __entry->delay)
+);
+
TRACE_EVENT(wiphy_work_worker_start,
TP_PROTO(struct wiphy *wiphy),
TP_ARGS(wiphy),
--
2.34.1
In omap_usb2_probe(), of_parse_phandle() returns a device node with its
reference count incremented. The caller is responsible for releasing this
reference when the node is no longer needed.
Add of_node_put(control_node) after usage to fix the
reference leak.
Found via static analysis.
Fixes: 478b6c7436c2 ("usb: phy: omap-usb2: Don't use omap_get_control_dev()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/phy/ti/phy-omap-usb2.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/phy/ti/phy-omap-usb2.c b/drivers/phy/ti/phy-omap-usb2.c
index 1eb252604441..660df3181e4f 100644
--- a/drivers/phy/ti/phy-omap-usb2.c
+++ b/drivers/phy/ti/phy-omap-usb2.c
@@ -426,6 +426,7 @@ static int omap_usb2_probe(struct platform_device *pdev)
}
control_pdev = of_find_device_by_node(control_node);
+ of_node_put(control_node);
if (!control_pdev) {
dev_err(&pdev->dev, "Failed to get control device\n");
return -EINVAL;
--
2.39.5 (Apple Git-154)
The driver_find_device_by_of_node() function calls driver_find_device
and returns a device with its reference count incremented.
Add the missing put_device() call to
release this reference after the device is used.
Found via static analysis.
Fixes: 0b7c6075022c ("soc: samsung: exynos-pmu: Add regmap support for SoCs that protect PMU regs")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/soc/samsung/exynos-pmu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/soc/samsung/exynos-pmu.c b/drivers/soc/samsung/exynos-pmu.c
index 22c50ca2aa79..a53c1f882e1a 100644
--- a/drivers/soc/samsung/exynos-pmu.c
+++ b/drivers/soc/samsung/exynos-pmu.c
@@ -346,6 +346,7 @@ struct regmap *exynos_get_pmu_regmap_by_phandle(struct device_node *np,
if (!dev)
return ERR_PTR(-EPROBE_DEFER);
+ put_device(dev);
return syscon_node_to_regmap(pmu_np);
}
EXPORT_SYMBOL_GPL(exynos_get_pmu_regmap_by_phandle);
--
2.39.5 (Apple Git-154)
There are two reference count leaks in this driver:
1. In nforce2_fsb_read(): pci_get_subsys() increases the reference count
of the PCI device, but pci_dev_put() is never called to release it,
thus leaking the reference.
2. In nforce2_detect_chipset(): pci_get_subsys() gets a reference to the
nforce2_dev which is stored in a global variable, but the reference
is never released when the module is unloaded.
Fix both by:
- Adding pci_dev_put(nforce2_sub5) in nforce2_fsb_read() after reading
the configuration.
- Adding pci_dev_put(nforce2_dev) in nforce2_exit() to release the
global device reference.
Found via static analysis.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/cpufreq/cpufreq-nforce2.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
index fedad1081973..fbbbe501cf2d 100644
--- a/drivers/cpufreq/cpufreq-nforce2.c
+++ b/drivers/cpufreq/cpufreq-nforce2.c
@@ -145,6 +145,8 @@ static unsigned int nforce2_fsb_read(int bootfsb)
pci_read_config_dword(nforce2_sub5, NFORCE2_BOOTFSB, &fsb);
fsb /= 1000000;
+ pci_dev_put(nforce2_sub5);
+
/* Check if PLL register is already set */
pci_read_config_byte(nforce2_dev, NFORCE2_PLLENABLE, (u8 *)&temp);
@@ -426,6 +428,7 @@ static int __init nforce2_init(void)
static void __exit nforce2_exit(void)
{
cpufreq_unregister_driver(&nforce2_driver);
+ pci_dev_put(nforce2_dev);
}
module_init(nforce2_init);
--
2.39.5 (Apple Git-154)
F2FS can mount filesystems with corrupted directory depth values that
get runtime-clamped to MAX_DIR_HASH_DEPTH. When RENAME_WHITEOUT
operations are performed on such directories, f2fs_rename performs
directory modifications (updating target entry and deleting source
entry) before attempting to add the whiteout entry via f2fs_add_link.
If f2fs_add_link fails due to the corrupted directory structure, the
function returns an error to VFS, but the partial directory
modifications have already been committed to disk. VFS assumes the
entire rename operation failed and does not update the dentry cache,
leaving stale mappings.
In the error path, VFS does not call d_move() to update the dentry
cache. This results in new_dentry still pointing to the old inode
(new_inode) which has already had its i_nlink decremented to zero.
The stale cache causes subsequent operations to incorrectly reference
the freed inode.
This causes subsequent operations to use cached dentry information that
no longer matches the on-disk state. When a second rename targets the
same entry, VFS attempts to decrement i_nlink on the stale inode, which
may already have i_nlink=0, triggering a WARNING in drop_nlink().
Example sequence:
1. First rename (RENAME_WHITEOUT): file2 → file1
- f2fs updates file1 entry on disk (points to inode 8)
- f2fs deletes file2 entry on disk
- f2fs_add_link(whiteout) fails (corrupted directory)
- Returns error to VFS
- VFS does not call d_move() due to error
- VFS cache still has: file1 → inode 7 (stale!)
- inode 7 has i_nlink=0 (already decremented)
2. Second rename: file3 → file1
- VFS uses stale cache: file1 → inode 7
- Tries to drop_nlink on inode 7 (i_nlink already 0)
- WARNING in drop_nlink()
Fix this by explicitly invalidating old_dentry and new_dentry when
f2fs_add_link fails during whiteout creation. This forces VFS to
refresh from disk on subsequent operations, ensuring cache consistency
even when the rename partially succeeds.
Reproducer:
1. Mount F2FS image with corrupted i_current_depth
2. renameat2(file2, file1, RENAME_WHITEOUT)
3. renameat2(file3, file1, 0)
4. System triggers WARNING in drop_nlink()
Fixes: 7e01e7ad746b ("f2fs: support RENAME_WHITEOUT")
Reported-by: syzbot+632cf32276a9a564188d(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=632cf32276a9a564188d
Suggested-by: Chao Yu <chao(a)kernel.org>
Link: https://lore.kernel.org/all/20251022233349.102728-1-kartikey406@gmail.com/ [v1]
Cc: stable(a)vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
---
Changes in v2:
- Added detailed explanation about VFS not calling d_move() in error path,
resulting in new_dentry still pointing to inode with zeroed i_nlink
(suggested by Chao Yu)
- Added Fixes tag pointing to commit 7e01e7ad746b
- Added Cc: stable(a)vger.kernel.org for backporting to stable kernels
---
fs/f2fs/namei.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index b882771e4699..712479b7b93d 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -1053,9 +1053,11 @@ static int f2fs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
if (whiteout) {
set_inode_flag(whiteout, FI_INC_LINK);
err = f2fs_add_link(old_dentry, whiteout);
- if (err)
+ if (err) {
+ d_invalidate(old_dentry);
+ d_invalidate(new_dentry);
goto put_out_dir;
-
+ }
spin_lock(&whiteout->i_lock);
whiteout->i_state &= ~I_LINKABLE;
spin_unlock(&whiteout->i_lock);
--
2.43.0
The qm_get_qos_value() function calls bus_find_device_by_name() which
increases the device reference count, but fails to call put_device()
to balance the reference count and lead to a device reference leak.
Add put_device() calls in both the error path and success path to
properly balance the reference count.
Found via static analysis.
Fixes: 22d7a6c39cab ("crypto: hisilicon/qm - add pci bdf number check")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/crypto/hisilicon/qm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index a5b96adf2d1e..3b391a146635 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -3871,10 +3871,12 @@ static ssize_t qm_get_qos_value(struct hisi_qm *qm, const char *buf,
pdev = container_of(dev, struct pci_dev, dev);
if (pci_physfn(pdev) != qm->pdev) {
pci_err(qm->pdev, "the pdev input does not match the pf!\n");
+ put_device(dev);
return -EINVAL;
}
*fun_index = pdev->devfn;
+ put_device(dev);
return 0;
}
--
2.39.5 (Apple Git-154)
dclk_vop2_src currently has CLK_SET_RATE_PARENT | CLK_SET_RATE_NO_REPARENT
flags set, which is vastly different than dclk_vop0_src or dclk_vop1_src,
which have none of those.
With these flags in dclk_vop2_src, actually setting the clock then results
in a lot of other peripherals breaking, because setting the rate results
in the PLL source getting changed:
[ 14.898718] clk_core_set_rate_nolock: setting rate for dclk_vop2 to 152840000
[ 15.155017] clk_change_rate: setting rate for pll_gpll to 1680000000
[ clk adjusting every gpll user ]
This includes possibly the other vops, i2s, spdif and even the uarts.
Among other possible things, this breaks the uart console on a board
I use. Sometimes it recovers later on, but there will be a big block
of garbled output for a while at least.
Shared PLLs should not be changed by individual users, so drop these
flags from dclk_vop2_src and make the flags the same as on dclk_vop0
and dclk_vop1.
Fixes: f1c506d152ff ("clk: rockchip: add clock controller for the RK3588")
Cc: stable(a)vger.kernel.org
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
---
drivers/clk/rockchip/clk-rk3588.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/rockchip/clk-rk3588.c b/drivers/clk/rockchip/clk-rk3588.c
index 1694223f4f84..cf83242d1726 100644
--- a/drivers/clk/rockchip/clk-rk3588.c
+++ b/drivers/clk/rockchip/clk-rk3588.c
@@ -2094,7 +2094,7 @@ static struct rockchip_clk_branch rk3588_early_clk_branches[] __initdata = {
COMPOSITE(DCLK_VOP1_SRC, "dclk_vop1_src", gpll_cpll_v0pll_aupll_p, 0,
RK3588_CLKSEL_CON(111), 14, 2, MFLAGS, 9, 5, DFLAGS,
RK3588_CLKGATE_CON(52), 11, GFLAGS),
- COMPOSITE(DCLK_VOP2_SRC, "dclk_vop2_src", gpll_cpll_v0pll_aupll_p, CLK_SET_RATE_PARENT | CLK_SET_RATE_NO_REPARENT,
+ COMPOSITE(DCLK_VOP2_SRC, "dclk_vop2_src", gpll_cpll_v0pll_aupll_p, 0,
RK3588_CLKSEL_CON(112), 5, 2, MFLAGS, 0, 5, DFLAGS,
RK3588_CLKGATE_CON(52), 12, GFLAGS),
COMPOSITE_NODIV(DCLK_VOP0, "dclk_vop0", dclk_vop0_p,
--
2.47.2
From: Sven Eckelmann <sven(a)narfation.org>
Trying to dump the originators or the neighbors via netlink for a meshif
with an inactive primary interface is not allowed. The dump functions were
checking this correctly but they didn't handle non-existing primary
interfaces and existing _inactive_ interfaces differently.
(Primary) batadv_hard_ifaces hold a references to a net_device. And
accessing them is only allowed when either being in a RCU/spinlock
protected section or when holding a valid reference to them. The netlink
dump functions use the latter.
But because the missing specific error handling for inactive primary
interfaces, the reference was never dropped. This reference counting error
was only detected when the interface should have been removed from the
system:
unregister_netdevice: waiting for batadv_slave_0 to become free. Usage count = 2
Cc: stable(a)vger.kernel.org
Fixes: 6ecc4fd6c2f4 ("batman-adv: netlink: reduce duplicate code by returning interfaces")
Reported-by: syzbot+881d65229ca4f9ae8c84(a)syzkaller.appspotmail.com
Reported-by: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Signed-off-by: Sven Eckelmann <sven(a)narfation.org>
Signed-off-by: Simon Wunderlich <sw(a)simonwunderlich.de>
---
net/batman-adv/originator.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c
index a464ff96b9291..ed89d7fd1e7f4 100644
--- a/net/batman-adv/originator.c
+++ b/net/batman-adv/originator.c
@@ -764,11 +764,16 @@ int batadv_hardif_neigh_dump(struct sk_buff *msg, struct netlink_callback *cb)
bat_priv = netdev_priv(mesh_iface);
primary_if = batadv_primary_if_get_selected(bat_priv);
- if (!primary_if || primary_if->if_status != BATADV_IF_ACTIVE) {
+ if (!primary_if) {
ret = -ENOENT;
goto out_put_mesh_iface;
}
+ if (primary_if->if_status != BATADV_IF_ACTIVE) {
+ ret = -ENOENT;
+ goto out_put_primary_if;
+ }
+
hard_iface = batadv_netlink_get_hardif(bat_priv, cb);
if (IS_ERR(hard_iface) && PTR_ERR(hard_iface) != -ENONET) {
ret = PTR_ERR(hard_iface);
@@ -1333,11 +1338,16 @@ int batadv_orig_dump(struct sk_buff *msg, struct netlink_callback *cb)
bat_priv = netdev_priv(mesh_iface);
primary_if = batadv_primary_if_get_selected(bat_priv);
- if (!primary_if || primary_if->if_status != BATADV_IF_ACTIVE) {
+ if (!primary_if) {
ret = -ENOENT;
goto out_put_mesh_iface;
}
+ if (primary_if->if_status != BATADV_IF_ACTIVE) {
+ ret = -ENOENT;
+ goto out_put_primary_if;
+ }
+
hard_iface = batadv_netlink_get_hardif(bat_priv, cb);
if (IS_ERR(hard_iface) && PTR_ERR(hard_iface) != -ENONET) {
ret = PTR_ERR(hard_iface);
--
2.47.3
The current code directly overwrites the scratch pointer with the
return value of kvrealloc(). If kvrealloc() fails and returns NULL,
the original buffer becomes unreachable, causing a memory leak.
Fix this by using a temporary variable to store kvrealloc()'s return
value and only update the scratch pointer on success.
Found via static anlaysis and this is similar to commit 42378a9ca553
("bpf, verifier: Fix memory leak in array reallocation for stack state")
Fixes: be17c0df6795 ("riscv: module: Optimize PLT/GOT entry counting")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
arch/riscv/kernel/module-sections.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/module-sections.c b/arch/riscv/kernel/module-sections.c
index 75551ac6504c..1675cbad8619 100644
--- a/arch/riscv/kernel/module-sections.c
+++ b/arch/riscv/kernel/module-sections.c
@@ -119,6 +119,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
unsigned int num_plts = 0;
unsigned int num_gots = 0;
Elf_Rela *scratch = NULL;
+ Elf_Rela *new_scratch;
size_t scratch_size = 0;
int i;
@@ -168,9 +169,12 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
scratch_size_needed = (num_scratch_relas + num_relas) * sizeof(*scratch);
if (scratch_size_needed > scratch_size) {
scratch_size = scratch_size_needed;
- scratch = kvrealloc(scratch, scratch_size, GFP_KERNEL);
- if (!scratch)
+ new_scratch = kvrealloc(scratch, scratch_size, GFP_KERNEL);
+ if (!new_scratch) {
+ kvfree(scratch);
return -ENOMEM;
+ }
+ scratch = new_scratch;
}
for (size_t j = 0; j < num_relas; j++)
--
2.39.5 (Apple Git-154)
From: ZhangGuoDong <zhangguodong(a)kylinos.cn>
[ Upstream commit 6fced056d2cc8d01b326e6fcfabaacb9850b71a4 ]
Memory leak occurs when ksmbd_vfs_read() fails.
Fix this by adding the missing kvfree().
Co-developed-by: ChenXiaoSong <chenxiaosong(a)kylinos.cn>
Signed-off-by: ChenXiaoSong <chenxiaosong(a)kylinos.cn>
Signed-off-by: ZhangGuoDong <zhangguodong(a)kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
## **Backport Status: YES**
### **Extensive Analysis**
Based on comprehensive semantic code analysis and repository
examination, this commit **should be backported** to stable kernel
trees.
---
## **1. Code Change Analysis**
The commit adds a single line to fix a memory leak in
`fs/smb/server/smb2pdu.c`:
```c
nbytes = ksmbd_vfs_read(work, fp, length, &offset, aux_payload_buf);
if (nbytes < 0) {
+ kvfree(aux_payload_buf); // <-- Added line
err = nbytes;
goto out;
}
```
**What was wrong:** Memory allocated at line 6821 (`aux_payload_buf =
kvzalloc(ALIGN(length, 8), KSMBD_DEFAULT_GFP)`) was not freed when
`ksmbd_vfs_read()` fails, while all other error paths properly call
`kvfree()`.
---
## **2. Semantic Analysis Tools Used**
### **Tool 1: mcp__semcode__find_function**
- Located `smb2_read()` in `fs/smb/server/smb2pdu.c:6727-6895`
- Confirmed it's an SMB2 protocol handler (169 lines, 24 function calls)
- Return type: `int` (returns error codes)
### **Tool 2: mcp__semcode__find_callers**
- Result: No direct function callers
- However, cross-referenced with `smb2ops.c:183` showing `smb2_read` is
registered as a handler: `[SMB2_READ_HE] = { .proc = smb2_read }`
- **Conclusion:** This is a protocol handler invoked by the SMB2 message
dispatcher, meaning it's **directly user-triggerable** via network
requests
### **Tool 3: mcp__semcode__find_calls**
- Analyzed `ksmbd_vfs_read()` dependencies
- Found it can fail with multiple error codes: `-EISDIR`, `-EACCES`,
`-EAGAIN`, plus any errors from `kernel_read()`
- **All of these failure paths trigger the memory leak**
### **Tool 4: git blame & git log**
- Bug introduced: commit `e2f34481b24db2` (2021-03-16) - **4 years
old!**
- Recent modification: commit `06a025448b572c` (2024-11-30) changed
allocation to `ALIGN(length, 8)` but didn't fix the leak
- Found 15+ similar "memory leak" fixes in ksmbd history, indicating
active maintenance
---
## **3. Impact Scope Analysis**
### **User Exposure: CRITICAL**
- **Protocol Handler:** Any SMB client can trigger this by sending SMB2
READ requests
- **Network-facing:** ksmbd is a kernel SMB server exposed to network
clients
- **No authentication required to trigger:** The error path can be
reached even with permission errors
### **Trigger Conditions (from VFS analysis):**
1. **-EISDIR**: Client tries to read a directory
2. **-EACCES**: Permission denied (no FILE_READ_DATA or FILE_EXECUTE
access)
3. **-EAGAIN**: File is locked by another process
4. **kernel_read() failures**: Various VFS/filesystem errors
All of these are **easily triggerable** by malicious or misbehaving
clients.
### **Memory Leak Severity: HIGH**
- **Allocation size:** `ALIGN(length, 8)` where `length` is client-
controlled
- **Maximum per leak:** Up to `SMB3_MAX_IOSIZE` = **8 MB** (from
smb2pdu.h:28)
- **Default size:** `SMB21_DEFAULT_IOSIZE` = **1 MB** (from
smb2pdu.h:25)
- **Attack scenario:** An attacker could repeatedly:
1. Send READ requests for locked files (triggers -EAGAIN)
2. Each failed request leaks up to 8MB
3. 100 requests = 800MB leaked
4. Can exhaust server memory leading to **DoS**
---
## **4. Regression Risk Analysis**
### **Risk Level: VERY LOW**
- **Change size:** Single line addition
- **Operation:** Adding missing cleanup (defensive programming)
- **No behavior change:** Only affects error path that already returns
failure
- **Idempotent:** `kvfree()` is safe to call and simply frees allocated
memory
- **No dependencies:** No API changes or external impact
---
## **5. Stable Tree Compliance**
| Criterion | Status | Evidence |
|-----------|--------|----------|
| Fixes important bug | ✅ YES | Memory leak leading to potential DoS |
| Small and contained | ✅ YES | 1 line change, single function |
| No new features | ✅ YES | Pure bug fix |
| No architectural changes | ✅ YES | No structural modifications |
| Low regression risk | ✅ YES | Minimal, defensive change |
| Confined to subsystem | ✅ YES | Only affects ksmbd |
| User-facing impact | ✅ YES | Affects all ksmbd users |
---
## **6. Comparison with Similar Commits**
Recent ksmbd fixes in the repository show similar patterns:
- `379510a815cb2`: "fix possible refcount leak in smb2_sess_setup()"
- `5929e98f3bb76`: "fix potential double free on smb2_read_pipe() error
path"
- `e523a26c05672`: "fix passing freed memory 'aux_payload_buf'"
- `809d50d36e71b`: "fix memory leak in smb2_lock()"
**All these are typical stable tree backport candidates** with similar
characteristics (small resource leak fixes).
---
## **7. Missing Stable Tags**
**Notable:** The commit lacks explicit stable tree tags:
- ❌ No `Cc: stable(a)vger.kernel.org`
- ❌ No `Fixes:` tag pointing to the original buggy commit
However, this doesn't diminish backport worthiness - the technical
merits clearly justify backporting.
---
## **Conclusion**
This commit is an **excellent candidate for backporting** because:
1. **Security concern:** Remote memory exhaustion DoS vector
2. **Long-standing bug:** Present since ksmbd's introduction (4+ years)
3. **Wide impact:** Affects all ksmbd deployments
4. **Minimal risk:** Single-line defensive fix
5. **Clear bug fix:** Unambiguous resource leak on error path
6. **Pattern consistency:** Matches other successfully backported ksmbd
fixes
**Recommendation:** Backport to all active stable trees where ksmbd
exists (5.15+).
fs/smb/server/smb2pdu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
index 287200d7c0764..409b85af82e1c 100644
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -6826,6 +6826,7 @@ int smb2_read(struct ksmbd_work *work)
nbytes = ksmbd_vfs_read(work, fp, length, &offset, aux_payload_buf);
if (nbytes < 0) {
+ kvfree(aux_payload_buf);
err = nbytes;
goto out;
}
--
2.51.0
From: Dave Vasilevsky <dave(a)vasilevsky.ca>
On 32-bit book3s with hash-MMUs, tlb_flush() was a no-op. This was
unnoticed because all uses until recently were for unmaps, and thus
handled by __tlb_remove_tlb_entry().
After commit 4a18419f71cd ("mm/mprotect: use mmu_gather") in kernel 5.19,
tlb_gather_mmu() started being used for mprotect as well. This caused
mprotect to simply not work on these machines:
int *ptr = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
*ptr = 1; // force HPTE to be created
mprotect(ptr, 4096, PROT_READ);
*ptr = 2; // should segfault, but succeeds
Fixed by making tlb_flush() actually flush TLB pages. This finally
agrees with the behaviour of boot3s64's tlb_flush().
Fixes: 4a18419f71cd ("mm/mprotect: use mmu_gather")
Signed-off-by: Dave Vasilevsky <dave(a)vasilevsky.ca>
---
arch/powerpc/include/asm/book3s/32/tlbflush.h | 8 ++++++--
arch/powerpc/mm/book3s32/tlb.c | 6 ++++++
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/32/tlbflush.h b/arch/powerpc/include/asm/book3s/32/tlbflush.h
index e43534da5207aa3b0cb3c07b78e29b833c141f3f..b8c587ad2ea954f179246a57d6e86e45e91dcfdc 100644
--- a/arch/powerpc/include/asm/book3s/32/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/32/tlbflush.h
@@ -11,6 +11,7 @@
void hash__flush_tlb_mm(struct mm_struct *mm);
void hash__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
void hash__flush_range(struct mm_struct *mm, unsigned long start, unsigned long end);
+void hash__flush_gather(struct mmu_gather *tlb);
#ifdef CONFIG_SMP
void _tlbie(unsigned long address);
@@ -28,9 +29,12 @@ void _tlbia(void);
*/
static inline void tlb_flush(struct mmu_gather *tlb)
{
- /* 603 needs to flush the whole TLB here since it doesn't use a hash table. */
- if (!mmu_has_feature(MMU_FTR_HPTE_TABLE))
+ if (mmu_has_feature(MMU_FTR_HPTE_TABLE)) {
+ hash__flush_gather(tlb);
+ } else {
+ /* 603 needs to flush the whole TLB here since it doesn't use a hash table. */
_tlbia();
+ }
}
static inline void flush_range(struct mm_struct *mm, unsigned long start, unsigned long end)
diff --git a/arch/powerpc/mm/book3s32/tlb.c b/arch/powerpc/mm/book3s32/tlb.c
index 9ad6b56bfec96e989b96f027d075ad5812500854..3da95ecfbbb296303082e378425e92a5fbdbfac8 100644
--- a/arch/powerpc/mm/book3s32/tlb.c
+++ b/arch/powerpc/mm/book3s32/tlb.c
@@ -105,3 +105,9 @@ void hash__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
flush_hash_pages(mm->context.id, vmaddr, pmd_val(*pmd), 1);
}
EXPORT_SYMBOL(hash__flush_tlb_page);
+
+void hash__flush_gather(struct mmu_gather *tlb)
+{
+ hash__flush_range(tlb->mm, tlb->start, tlb->end);
+}
+EXPORT_SYMBOL(hash__flush_gather);
---
base-commit: dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa
change-id: 20251027-vasi-mprotect-g3-f8f5278d4140
Best regards,
--
Dave Vasilevsky <dave(a)vasilevsky.ca>
Fix the order of the freq-table-hz property, then convert to OPP tables
and add interconnect support for UFS for the SM6350 SoC.
Signed-off-by: Luca Weiss <luca.weiss(a)fairphone.com>
---
Changes in v3:
- Actually pick up tags, sorry about that
- Link to v2: https://lore.kernel.org/r/20251023-sm6350-ufs-things-v2-0-799d59178713@fair…
Changes in v2:
- Resend, pick up tags
- Link to v1: https://lore.kernel.org/r/20250314-sm6350-ufs-things-v1-0-3600362cc52c@fair…
---
Luca Weiss (3):
arm64: dts: qcom: sm6350: Fix wrong order of freq-table-hz for UFS
arm64: dts: qcom: sm6350: Add OPP table support to UFSHC
arm64: dts: qcom: sm6350: Add interconnect support to UFS
arch/arm64/boot/dts/qcom/sm6350.dtsi | 49 ++++++++++++++++++++++++++++--------
1 file changed, 39 insertions(+), 10 deletions(-)
---
base-commit: a92c761bcac3d5042559107fa7679470727a4bcb
change-id: 20250314-sm6350-ufs-things-53c5de9fec5e
Best regards,
--
Luca Weiss <luca.weiss(a)fairphone.com>
The patch titled
Subject: mm/truncate: unmap large folio on split failure
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-truncate-unmap-large-folio-on-split-failure.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Kiryl Shutsemau <kas(a)kernel.org>
Subject: mm/truncate: unmap large folio on split failure
Date: Mon, 27 Oct 2025 11:56:36 +0000
Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
supposed to generate SIGBUS.
This behavior might not be respected on truncation.
During truncation, the kernel splits a large folio in order to reclaim
memory. As a side effect, it unmaps the folio and destroys PMD mappings
of the folio. The folio will be refaulted as PTEs and SIGBUS semantics
are preserved.
However, if the split fails, PMD mappings are preserved and the user will
not receive SIGBUS on any accesses within the PMD.
Unmap the folio on split failure. It will lead to refault as PTEs and
preserve SIGBUS semantics.
Make an exception for shmem/tmpfs that for long time intentionally mapped
with PMDs across i_size.
Link: https://lkml.kernel.org/r/20251027115636.82382-3-kirill@shutemov.name
Signed-off-by: Kiryl Shutsemau <kas(a)kernel.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: "Darrick J. Wong" <djwong(a)kernel.org>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/truncate.c | 35 +++++++++++++++++++++++++++++------
1 file changed, 29 insertions(+), 6 deletions(-)
--- a/mm/truncate.c~mm-truncate-unmap-large-folio-on-split-failure
+++ a/mm/truncate.c
@@ -177,6 +177,32 @@ int truncate_inode_folio(struct address_
return 0;
}
+static int try_folio_split_or_unmap(struct folio *folio, struct page *split_at,
+ unsigned long min_order)
+{
+ enum ttu_flags ttu_flags =
+ TTU_SYNC |
+ TTU_SPLIT_HUGE_PMD |
+ TTU_IGNORE_MLOCK;
+ int ret;
+
+ ret = try_folio_split_to_order(folio, split_at, min_order);
+
+ /*
+ * If the split fails, unmap the folio, so it will be refaulted
+ * with PTEs to respect SIGBUS semantics.
+ *
+ * Make an exception for shmem/tmpfs that for long time
+ * intentionally mapped with PMDs across i_size.
+ */
+ if (ret && !shmem_mapping(folio->mapping)) {
+ try_to_unmap(folio, ttu_flags);
+ WARN_ON(folio_mapped(folio));
+ }
+
+ return ret;
+}
+
/*
* Handle partial folios. The folio may be entirely within the
* range if a split has raced with us. If not, we zero the part of the
@@ -226,7 +252,7 @@ bool truncate_inode_partial_folio(struct
min_order = mapping_min_folio_order(folio->mapping);
split_at = folio_page(folio, PAGE_ALIGN_DOWN(offset) / PAGE_SIZE);
- if (!try_folio_split_to_order(folio, split_at, min_order)) {
+ if (!try_folio_split_or_unmap(folio, split_at, min_order)) {
/*
* try to split at offset + length to make sure folios within
* the range can be dropped, especially to avoid memory waste
@@ -250,13 +276,10 @@ bool truncate_inode_partial_folio(struct
if (!folio_trylock(folio2))
goto out;
- /*
- * make sure folio2 is large and does not change its mapping.
- * Its split result does not matter here.
- */
+ /* make sure folio2 is large and does not change its mapping */
if (folio_test_large(folio2) &&
folio2->mapping == folio->mapping)
- try_folio_split_to_order(folio2, split_at2, min_order);
+ try_folio_split_or_unmap(folio2, split_at2, min_order);
folio_unlock(folio2);
out:
_
Patches currently in -mm which might be from kas(a)kernel.org are
mm-memory-do-not-populate-page-table-entries-beyond-i_size.patch
mm-truncate-unmap-large-folio-on-split-failure.patch
The patch titled
Subject: mm/memory: do not populate page table entries beyond i_size
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-do-not-populate-page-table-entries-beyond-i_size.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Kiryl Shutsemau <kas(a)kernel.org>
Subject: mm/memory: do not populate page table entries beyond i_size
Date: Mon, 27 Oct 2025 11:56:35 +0000
Patch series "Fix SIGBUS semantics with large folios", v3.
Accessing memory within a VMA, but beyond i_size rounded up to the next
page size, is supposed to generate SIGBUS.
Darrick reported[1] an xfstests regression in v6.18-rc1. generic/749
failed due to missing SIGBUS. This was caused by my recent changes that
try to fault in the whole folio where possible:
19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()")
357b92761d94 ("mm/filemap: map entire large folio faultaround")
These changes did not consider i_size when setting up PTEs, leading to
xfstest breakage.
However, the problem has been present in the kernel for a long time -
since huge tmpfs was introduced in 2016. The kernel happily maps
PMD-sized folios as PMD without checking i_size. And huge=always tmpfs
allocates PMD-size folios on any writes.
I considered this corner case when I implemented a large tmpfs, and my
conclusion was that no one in their right mind should rely on receiving a
SIGBUS signal when accessing beyond i_size. I cannot imagine how it could
be useful for the workload.
But apparently filesystem folks care a lot about preserving strict SIGBUS
semantics.
Generic/749 was introduced last year with reference to POSIX, but no real
workloads were mentioned. It also acknowledged the tmpfs deviation from
the test case.
POSIX indeed says[3]:
References within the address range starting at pa and
continuing for len bytes to whole pages following the end of an
object shall result in delivery of a SIGBUS signal.
The patchset fixes the regression introduced by recent changes as well as
more subtle SIGBUS breakage due to split failure on truncation.
This patch (of 2):
Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
supposed to generate SIGBUS.
Recent changes attempted to fault in full folio where possible. They did
not respect i_size, which led to populating PTEs beyond i_size and
breaking SIGBUS semantics.
Darrick reported generic/749 breakage because of this.
However, the problem existed before the recent changes. With huge=always
tmpfs, any write to a file leads to PMD-size allocation. Following the
fault-in of the folio will install PMD mapping regardless of i_size.
Fix filemap_map_pages() and finish_fault() to not install:
- PTEs beyond i_size;
- PMD mappings across i_size;
Make an exception for shmem/tmpfs that for long time intentionally
mapped with PMDs across i_size.
Link: https://lkml.kernel.org/r/20251027115636.82382-1-kirill@shutemov.name
Link: https://lkml.kernel.org/r/20251027115636.82382-2-kirill@shutemov.name
Signed-off-by: Kiryl Shutsemau <kas(a)kernel.org>
Fixes: 19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()")
Fixes: 357b92761d94 ("mm/filemap: map entire large folio faultaround")
Fixes: 01c70267053d ("fs: add a filesystem flag for THPs")
Reported-by: "Darrick J. Wong" <djwong(a)kernel.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/filemap.c | 28 ++++++++++++++++++++--------
mm/memory.c | 20 +++++++++++++++++++-
2 files changed, 39 insertions(+), 9 deletions(-)
--- a/mm/filemap.c~mm-memory-do-not-populate-page-table-entries-beyond-i_size
+++ a/mm/filemap.c
@@ -3681,7 +3681,8 @@ skip:
static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
struct folio *folio, unsigned long start,
unsigned long addr, unsigned int nr_pages,
- unsigned long *rss, unsigned short *mmap_miss)
+ unsigned long *rss, unsigned short *mmap_miss,
+ bool can_map_large)
{
unsigned int ref_from_caller = 1;
vm_fault_t ret = 0;
@@ -3696,7 +3697,7 @@ static vm_fault_t filemap_map_folio_rang
* The folio must not cross VMA or page table boundary.
*/
addr0 = addr - start * PAGE_SIZE;
- if (folio_within_vma(folio, vmf->vma) &&
+ if (can_map_large && folio_within_vma(folio, vmf->vma) &&
(addr0 & PMD_MASK) == ((addr0 + folio_size(folio) - 1) & PMD_MASK)) {
vmf->pte -= start;
page -= start;
@@ -3811,13 +3812,27 @@ vm_fault_t filemap_map_pages(struct vm_f
unsigned long rss = 0;
unsigned int nr_pages = 0, folio_type;
unsigned short mmap_miss = 0, mmap_miss_saved;
+ bool can_map_large;
rcu_read_lock();
folio = next_uptodate_folio(&xas, mapping, end_pgoff);
if (!folio)
goto out;
- if (filemap_map_pmd(vmf, folio, start_pgoff)) {
+ file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
+ end_pgoff = min(end_pgoff, file_end);
+
+ /*
+ * Do not allow to map with PTEs beyond i_size and with PMD
+ * across i_size to preserve SIGBUS semantics.
+ *
+ * Make an exception for shmem/tmpfs that for long time
+ * intentionally mapped with PMDs across i_size.
+ */
+ can_map_large = shmem_mapping(mapping) ||
+ file_end >= folio_next_index(folio);
+
+ if (can_map_large && filemap_map_pmd(vmf, folio, start_pgoff)) {
ret = VM_FAULT_NOPAGE;
goto out;
}
@@ -3830,10 +3845,6 @@ vm_fault_t filemap_map_pages(struct vm_f
goto out;
}
- file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
- if (end_pgoff > file_end)
- end_pgoff = file_end;
-
folio_type = mm_counter_file(folio);
do {
unsigned long end;
@@ -3850,7 +3861,8 @@ vm_fault_t filemap_map_pages(struct vm_f
else
ret |= filemap_map_folio_range(vmf, folio,
xas.xa_index - folio->index, addr,
- nr_pages, &rss, &mmap_miss);
+ nr_pages, &rss, &mmap_miss,
+ can_map_large);
folio_unlock(folio);
} while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL);
--- a/mm/memory.c~mm-memory-do-not-populate-page-table-entries-beyond-i_size
+++ a/mm/memory.c
@@ -65,6 +65,7 @@
#include <linux/gfp.h>
#include <linux/migrate.h>
#include <linux/string.h>
+#include <linux/shmem_fs.h>
#include <linux/memory-tiers.h>
#include <linux/debugfs.h>
#include <linux/userfaultfd_k.h>
@@ -5501,8 +5502,25 @@ fallback:
return ret;
}
+ if (!needs_fallback && vma->vm_file) {
+ struct address_space *mapping = vma->vm_file->f_mapping;
+ pgoff_t file_end;
+
+ file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
+
+ /*
+ * Do not allow to map with PTEs beyond i_size and with PMD
+ * across i_size to preserve SIGBUS semantics.
+ *
+ * Make an exception for shmem/tmpfs that for long time
+ * intentionally mapped with PMDs across i_size.
+ */
+ needs_fallback = !shmem_mapping(mapping) &&
+ file_end < folio_next_index(folio);
+ }
+
if (pmd_none(*vmf->pmd)) {
- if (folio_test_pmd_mappable(folio)) {
+ if (!needs_fallback && folio_test_pmd_mappable(folio)) {
ret = do_set_pmd(vmf, folio, page);
if (ret != VM_FAULT_FALLBACK)
return ret;
_
Patches currently in -mm which might be from kas(a)kernel.org are
mm-memory-do-not-populate-page-table-entries-beyond-i_size.patch
mm-truncate-unmap-large-folio-on-split-failure.patch
The function calls of_parse_phandle() to get a
device node, which increments the reference count of the node. However,
the function fails to call of_node_put() to decrement the reference.
This leads to a reference count leak.
This is found by static analysis and similar to the commit a508e33956b5
("ipmi:ipmb: Fix refcount leak in ipmi_ipmb_probe")
Fixes: 423de5b5bc5b ("thermal/of: Fix cdev lookup in thermal_of_should_bind()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/thermal/thermal_of.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 1a51a4d240ff..932291648683 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -284,8 +284,12 @@ static bool thermal_of_cm_lookup(struct device_node *cm_np,
int count, i;
tr_np = of_parse_phandle(child, "trip", 0);
- if (tr_np != trip->priv)
+ if (tr_np != trip->priv) {
+ of_node_put(tr_np);
continue;
+ }
+
+ of_node_put(tr_np);
/* The trip has been found, look up the cdev. */
count = of_count_phandle_with_args(child, "cooling-device",
--
2.39.5 (Apple Git-154)
From: Emanuele Ghidoli <emanuele.ghidoli(a)toradex.com>
While the DP83867 PHYs report EEE capability through their feature
registers, the actual hardware does not support EEE (see Links).
When the connected MAC enables EEE, it causes link instability and
communication failures.
The issue is reproducible with a iMX8MP and relevant stmmac ethernet port.
Since the introduction of phylink-managed EEE support in the stmmac driver,
EEE is now enabled by default, leading to issues on systems using the
DP83867 PHY.
Call phy_disable_eee during phy initialization to prevent EEE from being
enabled on DP83867 PHYs.
Link: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/1445…
Link: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/6586…
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
Cc: stable(a)vger.kernel.org
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli(a)toradex.com>
---
drivers/net/phy/dp83867.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c
index deeefb962566..36a0c1b7f59c 100644
--- a/drivers/net/phy/dp83867.c
+++ b/drivers/net/phy/dp83867.c
@@ -738,6 +738,12 @@ static int dp83867_config_init(struct phy_device *phydev)
return ret;
}
+ /* Although the DP83867 reports EEE capability through the
+ * MDIO_PCS_EEE_ABLE and MDIO_AN_EEE_ADV registers, the feature
+ * is not actually implemented in hardware.
+ */
+ phy_disable_eee(phydev);
+
if (phy_interface_is_rgmii(phydev) ||
phydev->interface == PHY_INTERFACE_MODE_SGMII) {
val = phy_read(phydev, MII_DP83867_PHYCTRL);
--
2.43.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x daeb4037adf7d3349b4a1fb792f4bc9824686a4b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102701-scabby-entrust-a162@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From daeb4037adf7d3349b4a1fb792f4bc9824686a4b Mon Sep 17 00:00:00 2001
From: Artem Shimko <a.shimko.dev(a)gmail.com>
Date: Sun, 19 Oct 2025 12:51:31 +0300
Subject: [PATCH] serial: 8250_dw: handle reset control deassert error
Check the return value of reset_control_deassert() in the probe
function to prevent continuing probe when reset deassertion fails.
Previously, reset_control_deassert() was called without checking its
return value, which could lead to probe continuing even when the
device reset wasn't properly deasserted.
The fix checks the return value and returns an error with dev_err_probe()
if reset deassertion fails, providing better error handling and
diagnostics.
Fixes: acbdad8dd1ab ("serial: 8250_dw: simplify optional reset handling")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Artem Shimko <a.shimko.dev(a)gmail.com>
Link: https://patch.msgid.link/20251019095131.252848-1-a.shimko.dev@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
index a53ba04d9770..710ae4d40aec 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -635,7 +635,9 @@ static int dw8250_probe(struct platform_device *pdev)
if (IS_ERR(data->rst))
return PTR_ERR(data->rst);
- reset_control_deassert(data->rst);
+ err = reset_control_deassert(data->rst);
+ if (err)
+ return dev_err_probe(dev, err, "failed to deassert resets\n");
err = devm_add_action_or_reset(dev, dw8250_reset_control_assert, data->rst);
if (err)
A maximum gain of 0xffff / 65525 seems unlikely and testing indeed shows
that the gain control wraps-around at 4096, so set the maximum gain to
0xfff / 4095.
The minimum gain of 0x100 is correct. Setting bits 8-11 to 0x0 results
in the same gain values as setting these bits to 0x1, with bits 0-7
still increasing the gain when going from 0x000 - 0x0ff in the exact
same range as when going from 0x100 - 0x1ff.
Fixes: 0827b58dabff ("media: i2c: add ov01a10 image sensor driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
---
drivers/media/i2c/ov01a10.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/i2c/ov01a10.c b/drivers/media/i2c/ov01a10.c
index 0b1a1ecfffd0..95d6a0f046a0 100644
--- a/drivers/media/i2c/ov01a10.c
+++ b/drivers/media/i2c/ov01a10.c
@@ -48,7 +48,7 @@
/* analog gain controls */
#define OV01A10_REG_ANALOG_GAIN 0x3508
#define OV01A10_ANAL_GAIN_MIN 0x100
-#define OV01A10_ANAL_GAIN_MAX 0xffff
+#define OV01A10_ANAL_GAIN_MAX 0xfff
#define OV01A10_ANAL_GAIN_STEP 1
/* digital gain controls */
--
2.51.0
During sensor calibration I noticed that with the hflip control set
to false/disabled the image was mirrored.
So it seems that the horizontal flip control is inverted and needs to
be set to 1 to not flip (just like the similar problem recently fixed
on the ov08x40 sensor).
Invert the hflip control to fix the sensor mirroring by default.
As the comment above the newly added OV01A10_MEDIA_BUS_FMT define explains
the control being inverted also means that the native Bayer-order of
the sensor actually is GBRG not BGGR, but so as to not break userspace
the Bayer-order is kept at BGGR.
Fixes: 0827b58dabff ("media: i2c: add ov01a10 image sensor driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
---
drivers/media/i2c/ov01a10.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/drivers/media/i2c/ov01a10.c b/drivers/media/i2c/ov01a10.c
index 141cb6f75b55..e5df01f97978 100644
--- a/drivers/media/i2c/ov01a10.c
+++ b/drivers/media/i2c/ov01a10.c
@@ -75,6 +75,15 @@
#define OV01A10_REG_X_WIN 0x3811
#define OV01A10_REG_Y_WIN 0x3813
+/*
+ * The native ov01a10 bayer-pattern is GBRG, but there was a driver bug enabling
+ * hflip/mirroring by default resulting in BGGR. Because of this bug Intel's
+ * proprietary IPU6 userspace stack expects BGGR. So we report BGGR to not break
+ * userspace and fix things up by shifting the crop window-x coordinate by 1
+ * when hflip is *disabled*.
+ */
+#define OV01A10_MEDIA_BUS_FMT MEDIA_BUS_FMT_SBGGR10_1X10
+
struct ov01a10_reg {
u16 address;
u8 val;
@@ -185,14 +194,14 @@ static const struct ov01a10_reg sensor_1280x800_setting[] = {
{0x380e, 0x03},
{0x380f, 0x80},
{0x3810, 0x00},
- {0x3811, 0x08},
+ {0x3811, 0x09},
{0x3812, 0x00},
{0x3813, 0x08},
{0x3814, 0x01},
{0x3815, 0x01},
{0x3816, 0x01},
{0x3817, 0x01},
- {0x3820, 0xa0},
+ {0x3820, 0xa8},
{0x3822, 0x13},
{0x3832, 0x28},
{0x3833, 0x10},
@@ -411,7 +420,7 @@ static int ov01a10_set_hflip(struct ov01a10 *ov01a10, u32 hflip)
int ret;
u32 val, offset;
- offset = hflip ? 0x9 : 0x8;
+ offset = hflip ? 0x8 : 0x9;
ret = ov01a10_write_reg(ov01a10, OV01A10_REG_X_WIN, 1, offset);
if (ret)
return ret;
@@ -420,8 +429,8 @@ static int ov01a10_set_hflip(struct ov01a10 *ov01a10, u32 hflip)
if (ret)
return ret;
- val = hflip ? val | FIELD_PREP(OV01A10_HFLIP_MASK, 0x1) :
- val & ~OV01A10_HFLIP_MASK;
+ val = hflip ? val & ~OV01A10_HFLIP_MASK :
+ val | FIELD_PREP(OV01A10_HFLIP_MASK, 0x1);
return ov01a10_write_reg(ov01a10, OV01A10_REG_FORMAT1, 1, val);
}
@@ -610,7 +619,7 @@ static void ov01a10_update_pad_format(const struct ov01a10_mode *mode,
{
fmt->width = mode->width;
fmt->height = mode->height;
- fmt->code = MEDIA_BUS_FMT_SBGGR10_1X10;
+ fmt->code = OV01A10_MEDIA_BUS_FMT;
fmt->field = V4L2_FIELD_NONE;
fmt->colorspace = V4L2_COLORSPACE_RAW;
}
@@ -751,7 +760,7 @@ static int ov01a10_enum_mbus_code(struct v4l2_subdev *sd,
if (code->index > 0)
return -EINVAL;
- code->code = MEDIA_BUS_FMT_SBGGR10_1X10;
+ code->code = OV01A10_MEDIA_BUS_FMT;
return 0;
}
@@ -761,7 +770,7 @@ static int ov01a10_enum_frame_size(struct v4l2_subdev *sd,
struct v4l2_subdev_frame_size_enum *fse)
{
if (fse->index >= ARRAY_SIZE(supported_modes) ||
- fse->code != MEDIA_BUS_FMT_SBGGR10_1X10)
+ fse->code != OV01A10_MEDIA_BUS_FMT)
return -EINVAL;
fse->min_width = supported_modes[fse->index].width;
--
2.51.0
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1c05bf6c0262f946571a37678250193e46b1ff0f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102739-fable-reroute-e6a6@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1c05bf6c0262f946571a37678250193e46b1ff0f Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Mon, 6 Oct 2025 10:20:02 -0400
Subject: [PATCH] serial: sc16is7xx: remove useless enable of enhanced features
Commit 43c51bb573aa ("sc16is7xx: make sure device is in suspend once
probed") permanently enabled access to the enhanced features in
sc16is7xx_probe(), and it is never disabled after that.
Therefore, remove re-enable of enhanced features in
sc16is7xx_set_baud(). This eliminates a potential useless read + write
cycle each time the baud rate is reconfigured.
Fixes: 43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Link: https://patch.msgid.link/20251006142002.177475-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 1a2c4c14f6aa..c7435595dce1 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -588,13 +588,6 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
div /= prescaler;
}
- /* Enable enhanced features */
- sc16is7xx_efr_lock(port);
- sc16is7xx_port_update(port, SC16IS7XX_EFR_REG,
- SC16IS7XX_EFR_ENABLE_BIT,
- SC16IS7XX_EFR_ENABLE_BIT);
- sc16is7xx_efr_unlock(port);
-
/* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x f3d12ec847b945d5d65846c85f062d07d5e73164
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102714-patriot-eel-32c8@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f3d12ec847b945d5d65846c85f062d07d5e73164 Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 14 Oct 2025 01:55:41 +0300
Subject: [PATCH] xhci: dbc: fix bogus 1024 byte prefix if ttyDBC read races
with stall event
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
DbC may add 1024 bogus bytes to the beginneing of the receiving endpoint
if DbC hw triggers a STALL event before any Transfer Blocks (TRBs) for
incoming data are queued, but driver handles the event after it queued
the TRBs.
This is possible as xHCI DbC hardware may trigger spurious STALL transfer
events even if endpoint is empty. The STALL event contains a pointer
to the stalled TRB, and "remaining" untransferred data length.
As there are no TRBs queued yet the STALL event will just point to first
TRB position of the empty ring, with '0' bytes remaining untransferred.
DbC driver is polling for events, and may not handle the STALL event
before /dev/ttyDBC0 is opened and incoming data TRBs are queued.
The DbC event handler will now assume the first queued TRB (length 1024)
has stalled with '0' bytes remaining untransferred, and copies the data
This race situation can be practically mitigated by making sure the event
handler handles all pending transfer events when DbC reaches configured
state, and only then create dev/ttyDbC0, and start queueing transfers.
The event handler can this way detect the STALL events on empty rings
and discard them before any transfers are queued.
This does in practice solve the issue, but still leaves a small possible
gap for the race to trigger.
We still need a way to distinguish spurious STALLs on empty rings with '0'
bytes remaing, from actual STALL events with all bytes transmitted.
Cc: stable <stable(a)kernel.org>
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index 63edf2d8f245..023a8ec6f305 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -892,7 +892,8 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC configured\n");
portsc = readl(&dbc->regs->portsc);
writel(portsc, &dbc->regs->portsc);
- return EVT_GSER;
+ ret = EVT_GSER;
+ break;
}
return EVT_DONE;
@@ -954,7 +955,8 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
break;
case TRB_TYPE(TRB_TRANSFER):
dbc_handle_xfer_event(dbc, evt);
- ret = EVT_XFER_DONE;
+ if (ret != EVT_GSER)
+ ret = EVT_XFER_DONE;
break;
default:
break;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x daeb4037adf7d3349b4a1fb792f4bc9824686a4b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102700-sleep-robotics-c9e3@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From daeb4037adf7d3349b4a1fb792f4bc9824686a4b Mon Sep 17 00:00:00 2001
From: Artem Shimko <a.shimko.dev(a)gmail.com>
Date: Sun, 19 Oct 2025 12:51:31 +0300
Subject: [PATCH] serial: 8250_dw: handle reset control deassert error
Check the return value of reset_control_deassert() in the probe
function to prevent continuing probe when reset deassertion fails.
Previously, reset_control_deassert() was called without checking its
return value, which could lead to probe continuing even when the
device reset wasn't properly deasserted.
The fix checks the return value and returns an error with dev_err_probe()
if reset deassertion fails, providing better error handling and
diagnostics.
Fixes: acbdad8dd1ab ("serial: 8250_dw: simplify optional reset handling")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Artem Shimko <a.shimko.dev(a)gmail.com>
Link: https://patch.msgid.link/20251019095131.252848-1-a.shimko.dev@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
index a53ba04d9770..710ae4d40aec 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -635,7 +635,9 @@ static int dw8250_probe(struct platform_device *pdev)
if (IS_ERR(data->rst))
return PTR_ERR(data->rst);
- reset_control_deassert(data->rst);
+ err = reset_control_deassert(data->rst);
+ if (err)
+ return dev_err_probe(dev, err, "failed to deassert resets\n");
err = devm_add_action_or_reset(dev, dw8250_reset_control_assert, data->rst);
if (err)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1c05bf6c0262f946571a37678250193e46b1ff0f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102739-casualty-hankering-f9ab@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1c05bf6c0262f946571a37678250193e46b1ff0f Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Mon, 6 Oct 2025 10:20:02 -0400
Subject: [PATCH] serial: sc16is7xx: remove useless enable of enhanced features
Commit 43c51bb573aa ("sc16is7xx: make sure device is in suspend once
probed") permanently enabled access to the enhanced features in
sc16is7xx_probe(), and it is never disabled after that.
Therefore, remove re-enable of enhanced features in
sc16is7xx_set_baud(). This eliminates a potential useless read + write
cycle each time the baud rate is reconfigured.
Fixes: 43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Link: https://patch.msgid.link/20251006142002.177475-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 1a2c4c14f6aa..c7435595dce1 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -588,13 +588,6 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
div /= prescaler;
}
- /* Enable enhanced features */
- sc16is7xx_efr_lock(port);
- sc16is7xx_port_update(port, SC16IS7XX_EFR_REG,
- SC16IS7XX_EFR_ENABLE_BIT,
- SC16IS7XX_EFR_ENABLE_BIT);
- sc16is7xx_efr_unlock(port);
-
/* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
Currently the driver only configure the data edge sampling partially. The
AM62 require it to be configured in two distincts registers: one in tidss
and one in the general device registers.
Introduce a new dt property to link the proper syscon node from the main
device registers into the tidss driver.
Fixes: 32a1795f57ee ("drm/tidss: New driver for TI Keystone platform Display SubSystem")
---
Cc: stable(a)vger.kernel.org
Signed-off-by: Louis Chauvet <louis.chauvet(a)bootlin.com>
---
Louis Chauvet (4):
dt-bindings: display: ti,am65x-dss: Add clk property for data edge synchronization
dt-bindings: mfd: syscon: Add ti,am625-dss-clk-ctrl
arm64: dts: ti: k3-am62-main: Add tidss clk-ctrl property
drm/tidss: Fix sampling edge configuration
.../devicetree/bindings/display/ti/ti,am65x-dss.yaml | 6 ++++++
Documentation/devicetree/bindings/mfd/syscon.yaml | 3 ++-
arch/arm64/boot/dts/ti/k3-am62-main.dtsi | 6 ++++++
drivers/gpu/drm/tidss/tidss_dispc.c | 14 ++++++++++++++
4 files changed, 28 insertions(+), 1 deletion(-)
---
base-commit: 85c23f28905cf20a86ceec3cfd7a0a5572c9eb13
change-id: 20250730-fix-edge-handling-9123f7438910
Best regards,
--
Louis Chauvet <louis.chauvet(a)bootlin.com>
When livepatch is attached to the same function as bpf trampoline with
a fexit program, bpf trampoline code calls register_ftrace_direct()
twice. The first time will fail with -EAGAIN, and the second time it
will succeed. This requires register_ftrace_direct() to unregister
the address on the first attempt. Otherwise, the bpf trampoline cannot
attach. Here is an easy way to reproduce this issue:
insmod samples/livepatch/livepatch-sample.ko
bpftrace -e 'fexit:cmdline_proc_show {}'
ERROR: Unable to attach probe: fexit:vmlinux:cmdline_proc_show...
Fix this by cleaning up the hash when register_ftrace_function_nolock hits
errors.
Also, move the code that resets ops->func and ops->trampoline to the error
path of register_ftrace_direct(); and add a helper function reset_direct()
in register_ftrace_direct() and unregister_ftrace_direct().
Fixes: d05cb470663a ("ftrace: Fix modification of direct_function hash while in use")
Cc: stable(a)vger.kernel.org # v6.6+
Reported-by: Andrey Grodzovsky <andrey.grodzovsky(a)crowdstrike.com>
Closes: https://lore.kernel.org/live-patching/c5058315a39d4615b333e485893345be@crow…
Cc: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Cc: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Acked-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky(a)crowdstrike.com>
Signed-off-by: Song Liu <song(a)kernel.org>
Reviewed-by: Jiri Olsa <jolsa(a)kernel.org>
---
kernel/bpf/trampoline.c | 5 -----
kernel/trace/ftrace.c | 20 ++++++++++++++------
2 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 5949095e51c3..f2cb0b097093 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -479,11 +479,6 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
* BPF_TRAMP_F_SHARE_IPMODIFY is set, we can generate the
* trampoline again, and retry register.
*/
- /* reset fops->func and fops->trampoline for re-register */
- tr->fops->func = NULL;
- tr->fops->trampoline = 0;
-
- /* free im memory and reallocate later */
bpf_tramp_image_free(im);
goto again;
}
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 42bd2ba68a82..cbeb7e833131 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5953,6 +5953,17 @@ static void register_ftrace_direct_cb(struct rcu_head *rhp)
free_ftrace_hash(fhp);
}
+static void reset_direct(struct ftrace_ops *ops, unsigned long addr)
+{
+ struct ftrace_hash *hash = ops->func_hash->filter_hash;
+
+ remove_direct_functions_hash(hash, addr);
+
+ /* cleanup for possible another register call */
+ ops->func = NULL;
+ ops->trampoline = 0;
+}
+
/**
* register_ftrace_direct - Call a custom trampoline directly
* for multiple functions registered in @ops
@@ -6048,6 +6059,8 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
ops->direct_call = addr;
err = register_ftrace_function_nolock(ops);
+ if (err)
+ reset_direct(ops, addr);
out_unlock:
mutex_unlock(&direct_mutex);
@@ -6080,7 +6093,6 @@ EXPORT_SYMBOL_GPL(register_ftrace_direct);
int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr,
bool free_filters)
{
- struct ftrace_hash *hash = ops->func_hash->filter_hash;
int err;
if (check_direct_multi(ops))
@@ -6090,13 +6102,9 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr,
mutex_lock(&direct_mutex);
err = unregister_ftrace_function(ops);
- remove_direct_functions_hash(hash, addr);
+ reset_direct(ops, addr);
mutex_unlock(&direct_mutex);
- /* cleanup for possible another register call */
- ops->func = NULL;
- ops->trampoline = 0;
-
if (free_filters)
ftrace_free_filter(ops);
return err;
--
2.47.3
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x f3d12ec847b945d5d65846c85f062d07d5e73164
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102713-cucumber-persevere-aa50@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f3d12ec847b945d5d65846c85f062d07d5e73164 Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 14 Oct 2025 01:55:41 +0300
Subject: [PATCH] xhci: dbc: fix bogus 1024 byte prefix if ttyDBC read races
with stall event
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
DbC may add 1024 bogus bytes to the beginneing of the receiving endpoint
if DbC hw triggers a STALL event before any Transfer Blocks (TRBs) for
incoming data are queued, but driver handles the event after it queued
the TRBs.
This is possible as xHCI DbC hardware may trigger spurious STALL transfer
events even if endpoint is empty. The STALL event contains a pointer
to the stalled TRB, and "remaining" untransferred data length.
As there are no TRBs queued yet the STALL event will just point to first
TRB position of the empty ring, with '0' bytes remaining untransferred.
DbC driver is polling for events, and may not handle the STALL event
before /dev/ttyDBC0 is opened and incoming data TRBs are queued.
The DbC event handler will now assume the first queued TRB (length 1024)
has stalled with '0' bytes remaining untransferred, and copies the data
This race situation can be practically mitigated by making sure the event
handler handles all pending transfer events when DbC reaches configured
state, and only then create dev/ttyDbC0, and start queueing transfers.
The event handler can this way detect the STALL events on empty rings
and discard them before any transfers are queued.
This does in practice solve the issue, but still leaves a small possible
gap for the race to trigger.
We still need a way to distinguish spurious STALLs on empty rings with '0'
bytes remaing, from actual STALL events with all bytes transmitted.
Cc: stable <stable(a)kernel.org>
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index 63edf2d8f245..023a8ec6f305 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -892,7 +892,8 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC configured\n");
portsc = readl(&dbc->regs->portsc);
writel(portsc, &dbc->regs->portsc);
- return EVT_GSER;
+ ret = EVT_GSER;
+ break;
}
return EVT_DONE;
@@ -954,7 +955,8 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
break;
case TRB_TYPE(TRB_TRANSFER):
dbc_handle_xfer_event(dbc, evt);
- ret = EVT_XFER_DONE;
+ if (ret != EVT_GSER)
+ ret = EVT_XFER_DONE;
break;
default:
break;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x daeb4037adf7d3349b4a1fb792f4bc9824686a4b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025102700-impish-precook-5377@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From daeb4037adf7d3349b4a1fb792f4bc9824686a4b Mon Sep 17 00:00:00 2001
From: Artem Shimko <a.shimko.dev(a)gmail.com>
Date: Sun, 19 Oct 2025 12:51:31 +0300
Subject: [PATCH] serial: 8250_dw: handle reset control deassert error
Check the return value of reset_control_deassert() in the probe
function to prevent continuing probe when reset deassertion fails.
Previously, reset_control_deassert() was called without checking its
return value, which could lead to probe continuing even when the
device reset wasn't properly deasserted.
The fix checks the return value and returns an error with dev_err_probe()
if reset deassertion fails, providing better error handling and
diagnostics.
Fixes: acbdad8dd1ab ("serial: 8250_dw: simplify optional reset handling")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Artem Shimko <a.shimko.dev(a)gmail.com>
Link: https://patch.msgid.link/20251019095131.252848-1-a.shimko.dev@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
index a53ba04d9770..710ae4d40aec 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -635,7 +635,9 @@ static int dw8250_probe(struct platform_device *pdev)
if (IS_ERR(data->rst))
return PTR_ERR(data->rst);
- reset_control_deassert(data->rst);
+ err = reset_control_deassert(data->rst);
+ if (err)
+ return dev_err_probe(dev, err, "failed to deassert resets\n");
err = devm_add_action_or_reset(dev, dw8250_reset_control_assert, data->rst);
if (err)