Currently, load_microcode_amd() iterates over all NUMA nodes, retrieves
their CPU masks and unconditionally accesses per-CPU data for the first
CPU of each mask.
According to Documentation/admin-guide/mm/numaperf.rst: "Some memory may
share the same node as a CPU, and others are provided as memory only
nodes." Therefore, some node CPU masks may be empty and wouldn't have a
"first CPU".
On a machine with far memory (and therefore CPU-less NUMA nodes):
- cpumask_of_node(nid) returns an empty cpumask
- cpumask_first() of an empty cpumask returns CONFIG_NR_CPUS
- cpu_data(CONFIG_NR_CPUS) accesses the cpu_info per-CPU array at an
  index that is one past the end of the array
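In code, the pre-fix loop is roughly the following (paraphrased from the
diff below, for illustration only):
	for_each_node(nid) {
		cpu = cpumask_first(cpumask_of_node(nid));
		c = &cpu_data(cpu);	/* out of bounds when the node has no CPUs */
		p = find_patch(cpu);
		...
	}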
This does not have any security implications since flashing microcode is
a privileged operation, but I believe it has reliability implications: it
can potentially corrupt memory while flashing a microcode update.
When booting with CONFIG_UBSAN_BOUNDS=y on an AMD machine that flashes a
microcode update, I get the following splat:
UBSAN: array-index-out-of-bounds in arch/x86/kernel/cpu/microcode/amd.c:X:Y
index 512 is out of range for type 'unsigned long[512]'
[...]
Call Trace:
dump_stack+0xdb/0x143
__ubsan_handle_out_of_bounds+0xf5/0x120
load_microcode_amd+0x58f/0x6b0
request_microcode_amd+0x17c/0x250
reload_store+0x174/0x2b0
kernfs_fop_write_iter+0x227/0x2d0
vfs_write+0x322/0x510
ksys_write+0xb5/0x160
do_syscall_64+0x6b/0xa0
entry_SYSCALL_64_after_hwframe+0x67/0xd1
Check that a NUMA node has CPUs before attempting to update its first
CPU's microcode.
Fixes: 7ff6edf4fef3 ("x86/microcode/AMD: Fix mixed steppings support")
Signed-off-by: Florent Revest <revest(a)chromium.org>
Cc: stable(a)vger.kernel.org
---
arch/x86/kernel/cpu/microcode/amd.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 95ac1c6a84fbe..7c06425edc1b5 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -1059,6 +1059,7 @@ static enum ucode_state _load_microcode_amd(u8 family, const u8 *data, size_t si
static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t size)
{
+ const struct cpumask *mask;
struct cpuinfo_x86 *c;
unsigned int nid, cpu;
struct ucode_patch *p;
@@ -1069,7 +1070,11 @@ static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t siz
return ret;
for_each_node(nid) {
- cpu = cpumask_first(cpumask_of_node(nid));
+ mask = cpumask_of_node(nid);
+ if (cpumask_empty(mask))
+ continue;
+
+ cpu = cpumask_first(mask);
c = &cpu_data(cpu);
p = find_patch(cpu);
--
2.49.0.rc0.332.g42c0ae87b1-goog
Hi Greg, Sasha,
Please consider applying the following commit to 6.12.y and 6.13.y:
0c5928deada1 ("rust: block: fix formatting in GenDisk doc")
It is trivial, and should apply cleanly.
This avoids a Clippy warning in the upcoming Rust 1.86.0 release (to
be released in a few weeks).
Thanks!
Cheers,
Miguel
Currently on stable trees we have support for netmem/devmem RX but not
TX. It is not safe to forward/redirect an RX unreadable netmem packet
into the device's TX path, as the device may call DMA-mapping APIs on
DMA addresses that should not be passed to it.
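To illustrate the hazard (a rough sketch, not taken from any particular
driver): a typical TX routine DMA-maps every frag of the skb, which must
not happen for an unreadable/devmem frag:
	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
		/* For an unreadable netmem frag, the underlying memory must
		 * not be passed to the DMA-mapping API by the driver. */
		dma = skb_frag_dma_map(dev, frag, 0, skb_frag_size(frag),
				       DMA_TO_DEVICE);
		...
	}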
Fix this by preventing the xmit of unreadable skbs.
Tested by configuring tc redirect:
sudo tc qdisc add dev eth1 ingress
sudo tc filter add dev eth1 ingress protocol ip prio 1 flower ip_proto \
tcp src_ip 192.168.1.12 action mirred egress redirect dev eth1
Before, I see unreadable skbs in the driver's TX path passed to dma
mapping APIs.
After, I don't see unreadable skbs in the driver's TX path passed to dma
mapping APIs.
Fixes: 65249feb6b3d ("net: add support for skbs with unreadable frags")
Suggested-by: Jakub Kicinski <kuba(a)kernel.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mina Almasry <almasrymina(a)google.com>
---
v2: https://lore.kernel.org/netdev/20250305191153.6d899a00@kernel.org/
- Put unreadable check at the top (Jakub)
---
net/core/dev.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index 30da277c5a6f..2f7f5fd9ffec 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3872,6 +3872,9 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
{
netdev_features_t features;
+ if (!skb_frags_readable(skb))
+ goto out_kfree_skb;
+
features = netif_skb_features(skb);
skb = validate_xmit_vlan(skb, features);
if (unlikely(!skb))
base-commit: f315296c92fd4b7716bdea17f727ab431891dc3b
--
2.49.0.rc0.332.g42c0ae87b1-goog
A shmem folio can be either in page cache or in swap cache, but not in both
at the same time. Namely, once it is in swap cache, folio->mapping should be
NULL, and the folio is no longer in a shmem mapping.
In __folio_migrate_mapping(), to determine the number of xarray entries
to update, folio_test_swapbacked() is used, but that conflates the
shmem-in-page-cache case with the shmem-in-swap-cache case. It leads to
xarray multi-index entry corruption, since it turns a sibling entry into a
normal entry during xas_store() (see [1] for a userspace reproduction).
Fix it by using only folio_test_swapcache() to determine whether the xarray
is storing swap cache entries, and thus choose the right number of xarray
entries to update.
[1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/
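For context, the store loop later in __folio_migrate_mapping() is roughly
the following (paraphrased): a folio in swap cache occupies nr separate
xarray slots, so entries must be nr there, whereas a page-cache folio is a
single multi-index entry and storing nr times over it clobbers its sibling
entries:
	/* Swap cache still stores N entries instead of a high-order entry */
	for (i = 0; i < entries; i++) {
		xas_store(&xas, newfolio);
		xas_next(&xas);
	}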
Note:
In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
to get the swap_cache address space, but that ignores the case of a shmem
folio in swap cache. It could lead to a NULL pointer dereference when an
in-swap-cache shmem folio is split at __xa_store(), since
!folio_test_anon() is true and folio->mapping is NULL. But fortunately,
its caller split_huge_page_to_list_to_order() bails out early with EBUSY
when folio->mapping is NULL. So no need to take care of it here.
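That early bail-out looks roughly like this (paraphrased, not the exact
code):
	mapping = folio->mapping;
	/* Truncated ? */
	if (!mapping) {
		ret = -EBUSY;
		goto out;
	}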
Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
Reported-by: Liu Shixin <liushixin2(a)huawei.com>
Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
Suggested-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
Cc: stable(a)vger.kernel.org
---
mm/migrate.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index fb4afd31baf0..c0adea67cd62 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -518,15 +518,13 @@ static int __folio_migrate_mapping(struct address_space *mapping,
if (folio_test_anon(folio) && folio_test_large(folio))
mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
folio_ref_add(newfolio, nr); /* add cache reference */
- if (folio_test_swapbacked(folio)) {
+ if (folio_test_swapbacked(folio))
__folio_set_swapbacked(newfolio);
- if (folio_test_swapcache(folio)) {
- folio_set_swapcache(newfolio);
- newfolio->private = folio_get_private(folio);
- }
+ if (folio_test_swapcache(folio)) {
+ folio_set_swapcache(newfolio);
+ newfolio->private = folio_get_private(folio);
entries = nr;
} else {
- VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
entries = 1;
}
--
2.47.2
The current implementation of iommufd_device_do_replace() implicitly
assumes that the input device has already been attached. However, there
is no explicit check to verify this assumption. If another device within
the same group has been attached, the replace operation might succeed,
but the input device itself may not have been attached yet.
As a result, the input device might not be tracked in the
igroup->device_list, and its reserved IOVA might not be added. Despite
this, the caller might incorrectly assume that the device has been
successfully replaced, which could lead to unexpected behavior or errors.
To address this issue, add a check to ensure that the input device has
been attached before proceeding with the replace operation. This check
will help maintain the integrity of the device tracking system and prevent
potential issues arising from incorrect assumptions about the device's
attachment status.
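For context, it is the attach path that puts a device on that list; a rough
sketch of that side (not the exact code) is:
	/* on a successful attach, roughly: */
	idev->igroup->hwpt = hwpt;
	list_add_tail(&idev->group_item, &idev->igroup->device_list);
so a device that was never attached is simply absent from
igroup->device_list, which is what the new helper below checks.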
Fixes: e88d4ec154a8 ("iommufd: Add iommufd_device_replace()")
Cc: stable(a)vger.kernel.org
Reviewed-by: Kevin Tian <kevin.tian(a)intel.com>
Signed-off-by: Yi Liu <yi.l.liu(a)intel.com>
---
Change log:
v2:
- Add r-b tag (Kevin)
- Minor tweaks. I swapped the order of the is_attached check with the
  if (igroup->hwpt == NULL) check, hence there is no need to add a WARN_ON.
v1: https://lore.kernel.org/linux-iommu/20250304120754.12450-1-yi.l.liu@intel.c…
---
drivers/iommu/iommufd/device.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index b2f0cb909e6d..bd50146e2ad0 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -471,6 +471,17 @@ iommufd_device_attach_reserved_iova(struct iommufd_device *idev,
/* The device attach/detach/replace helpers for attach_handle */
+/* Check if idev is attached to igroup->hwpt */
+static bool iommufd_device_is_attached(struct iommufd_device *idev)
+{
+ struct iommufd_device *cur;
+
+ list_for_each_entry(cur, &idev->igroup->device_list, group_item)
+ if (cur == idev)
+ return true;
+ return false;
+}
+
static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
struct iommufd_device *idev)
{
@@ -710,6 +721,11 @@ iommufd_device_do_replace(struct iommufd_device *idev,
goto err_unlock;
}
+ if (!iommufd_device_is_attached(idev)) {
+ rc = -EINVAL;
+ goto err_unlock;
+ }
+
if (hwpt == igroup->hwpt) {
mutex_unlock(&idev->igroup->lock);
return NULL;
--
2.34.1