March 2025 - Linux-stable-mirror

[PATCH] x86/ioremap: Maintain consistent IORES_MAP_ENCRYPTED for BIOS data

by Dan Williams

Nikolay reports [1] that accessing BIOS data (first 1MB of the physical address space) via /dev/mem results in an SEPT violation.

The cause is ioremap() (via xlate_dev_mem_ptr()) establishing an unencrypted mapping where the kernel had established an encrypted mapping previously.

Teach __ioremap_check_other() that this address space shall always be mapped as encrypted as historically it is memory resident data, not MMIO with side-effects.

Cc: <x86(a)kernel.org>
Cc: Vishal Annapurve <vannapurve(a)google.com>
Cc: Kirill Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: Nikolay Borisov <nik.borisov(a)suse.com>
Closes: http://lore.kernel.org/20250318113604.297726-1-nik.borisov@suse.com [1]
Tested-by: Nikolay Borisov <nik.borisov(a)suse.com>
Fixes: 9aa6ea69852c ("x86/tdx: Make pages shared in ioremap()")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
 arch/x86/mm/ioremap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 42c90b420773..9e81286a631e 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -122,6 +122,10 @@ static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *des
 		return;
 	}
 
+	/* Ensure BIOS data (see devmem_is_allowed()) is consistently mapped */
+	if (PHYS_PFN(addr) < 256)
+		desc->flags |= IORES_MAP_ENCRYPTED;
+
 	if (!IS_ENABLED(CONFIG_EFI))
 		return;
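The `PHYS_PFN(addr) < 256` test in the hunk above can be read as "the address lies below 1 MiB". The userspace sketch below works through that arithmetic. It assumes 4 KiB pages, and every name in it (`SKETCH_PAGE_SHIFT`, `sketch_phys_pfn()`, `sketch_is_bios_data()`, `sketch_self_check()`) is a hypothetical stand-in for illustration, not a kernel symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, so the page shift is 12. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for PHYS_PFN(): physical address -> page frame number. */
uint64_t sketch_phys_pfn(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* True when addr falls in the first 256 pages, i.e. below 256 * 4096 = 0x100000 (1 MiB). */
bool sketch_is_bios_data(uint64_t addr)
{
	return sketch_phys_pfn(addr) < 256;
}

/* Boundary check: the last byte below 1 MiB qualifies, the first byte at 1 MiB does not. */
void sketch_self_check(void)
{
	assert(sketch_is_bios_data(0x0));
	assert(sketch_is_bios_data(0xFFFFF));
	assert(!sketch_is_bios_data(0x100000));
}
```

With a different page size the constant 256 would cover a different byte range, so the 1 MiB reading holds only under the 4 KiB assumption stated above.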

[PATCH v3] gpio: tegra186: fix resource handling in ACPI probe path

by Guixin Liu

When the Tegra186 GPIO controller is probed through ACPI matching, the driver emits two error messages during probing:

  "tegra186-gpio NVDA0508:00: invalid resource (null)"
  "tegra186-gpio NVDA0508:00: invalid resource (null)"

Fix this by getting the resource first and then doing the ioremap.

Fixes: 2606e7c9f5fc ("gpio: tegra186: Add ACPI support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guixin Liu <kanie(a)linux.alibaba.com>
---
Changes from v2 to v3:
- Add "CC: stable" to commit body.

Changes from v1 to v2:
- Add "Fixes" to commit body.

 drivers/gpio/gpio-tegra186.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/gpio/gpio-tegra186.c b/drivers/gpio/gpio-tegra186.c
index 6895b65c86af..d27bfac6c9f5 100644
--- a/drivers/gpio/gpio-tegra186.c
+++ b/drivers/gpio/gpio-tegra186.c
@@ -823,6 +823,7 @@ static int tegra186_gpio_probe(struct platform_device *pdev)
 	struct gpio_irq_chip *irq;
 	struct tegra_gpio *gpio;
 	struct device_node *np;
+	struct resource *res;
 	char **names;
 	int err;
 
@@ -842,19 +843,19 @@ static int tegra186_gpio_probe(struct platform_device *pdev)
 		gpio->num_banks++;
 
 	/* get register apertures */
-	gpio->secure = devm_platform_ioremap_resource_byname(pdev, "security");
-	if (IS_ERR(gpio->secure)) {
-		gpio->secure = devm_platform_ioremap_resource(pdev, 0);
-		if (IS_ERR(gpio->secure))
-			return PTR_ERR(gpio->secure);
-	}
-
-	gpio->base = devm_platform_ioremap_resource_byname(pdev, "gpio");
-	if (IS_ERR(gpio->base)) {
-		gpio->base = devm_platform_ioremap_resource(pdev, 1);
-		if (IS_ERR(gpio->base))
-			return PTR_ERR(gpio->base);
-	}
+	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "security");
+	if (!res)
+		res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	gpio->secure = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(gpio->secure))
+		return PTR_ERR(gpio->secure);
+
+	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "gpio");
+	if (!res)
+		res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+	gpio->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(gpio->base))
+		return PTR_ERR(gpio->base);
 
 	err = platform_irq_count(pdev);
 	if (err < 0)
-- 
2.43.0
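The shape of the fix (look the resource up by name, quietly fall back to an index, and only then hand a single result to the mapping step) can be sketched outside the kernel. The mock below is only an illustration under stated assumptions: `struct mock_resource`, `struct mock_device` and the `mock_get_*()` helpers are hypothetical and merely imitate the lookup order of the patch; they are not the platform-device API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a memory resource; not the kernel's struct resource. */
struct mock_resource {
	const char *name;	/* NULL models firmware that supplies no resource names */
	unsigned long start;
};

/* Hypothetical stand-in for a device carrying a resource table. */
struct mock_device {
	const struct mock_resource *res;
	size_t num_res;
};

/* Name lookup that returns NULL silently when nothing matches. */
const struct mock_resource *mock_get_byname(const struct mock_device *dev,
					    const char *name)
{
	for (size_t i = 0; i < dev->num_res; i++)
		if (dev->res[i].name && strcmp(dev->res[i].name, name) == 0)
			return &dev->res[i];
	return NULL;
}

/* Index lookup that returns NULL when the index is out of range. */
const struct mock_resource *mock_get_byindex(const struct mock_device *dev,
					     size_t index)
{
	return index < dev->num_res ? &dev->res[index] : NULL;
}

/* Name first, index second; the caller maps (and reports errors) only once. */
const struct mock_resource *mock_get_aperture(const struct mock_device *dev,
					      const char *name, size_t index)
{
	const struct mock_resource *res = mock_get_byname(dev, name);

	if (!res)
		res = mock_get_byindex(dev, index);
	return res;
}
```

In this arrangement a failed name lookup is an expected, silent event, so an error is reported only if the final resource is missing as well.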

[PATCH v5] sched/topology: Enable topology_span_sane check only for debug builds

by Naman Jain

From: Saurabh Sengar <ssengar(a)linux.microsoft.com>

On an x86 system under test with 1780 CPUs, topology_span_sane() takes around 8 seconds cumulatively for all the iterations. It is an expensive operation which sanity-checks the non-NUMA topology masks.

CPU topology does not change very frequently, so make this check optional for systems where the topology is trusted and faster bootup is needed.

Restrict this to the sched_verbose kernel cmdline option so that this penalty can be avoided on systems that want to avoid it.

Cc: stable(a)vger.kernel.org
Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")
Signed-off-by: Saurabh Sengar <ssengar(a)linux.microsoft.com>
Co-developed-by: Naman Jain <namjain(a)linux.microsoft.com>
Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com>
Tested-by: K Prateek Nayak <kprateek.nayak(a)amd.com>
---
Changes since v4:
https://lore.kernel.org/all/20250306055354.52915-1-namjain@linux.microsoft.…
- Rephrased print statement and moved it to sched_domain_debug. (addressing Valentin's comments)

Changes since v3:
https://lore.kernel.org/all/20250203114738.3109-1-namjain@linux.microsoft.c…
- Minor typo correction in comment
- Added Tested-by tag from Prateek for x86

Changes since v2:
https://lore.kernel.org/all/1731922777-7121-1-git-send-email-ssengar@linux.…
- Use sched_debug() instead of using sched_debug_verbose variable directly (addressing Prateek's comment)

Changes since v1:
https://lore.kernel.org/all/1729619853-2597-1-git-send-email-ssengar@linux.…
- Use kernel cmdline param instead of compile time flag.

Adding a link to the other patch which is under review.
https://lore.kernel.org/lkml/20241031200431.182443-1-steve.wahl@hpe.com/
The above patch tries to optimize the topology sanity check, whereas this patch makes it optional. We believe both patches can coexist, as even with optimization, there will still be some performance overhead for this check.
---
 kernel/sched/topology.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index c49aea8c1025..d7254c47af45 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -132,8 +132,11 @@ static void sched_domain_debug(struct sched_domain *sd, int cpu)
 {
 	int level = 0;
 
-	if (!sched_debug_verbose)
+	if (!sched_debug_verbose) {
+		pr_info_once("%s: Scheduler topology debugging disabled, add 'sched_verbose' to the cmdline to enable it\n",
+			     __func__);
 		return;
+	}
 
 	if (!sd) {
 		printk(KERN_DEBUG "CPU%d attaching NULL sched-domain.\n", cpu);
@@ -2359,6 +2362,10 @@ static bool topology_span_sane(struct sched_domain_topology_level *tl,
 {
 	int i = cpu + 1;
 
+	/* Skip the topology sanity check for non-debug, as it is a time-consuming operation */
+	if (!sched_debug())
+		return true;
+
 	/* NUMA levels are allowed to overlap */
 	if (tl->flags & SDTL_OVERLAP)
 		return true;

base-commit: 7ec162622e66a4ff886f8f28712ea1b13069e1aa
-- 
2.34.1

[PATCH v3 1/3] Revert "drivers: core: synchronize really_probe() and dev_uevent()"

by Dmitry Torokhov

This reverts commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0.

Probing a device can take an arbitrarily long time. In the field we observed that, for example, probing bad micro-SD cards in an external USB card reader (or maybe the cards were good but the cables were flaky) sometimes takes longer than 2 minutes due to multiple retries at various levels of the stack. We cannot block the uevent_show() method for that long because udev reads that attribute very often, and blocking it stalls udev and interferes with booting of the system.

The change that introduced locking was concerned with dev_uevent() racing with unbinding the driver. However, we can handle it without locking (which will be done in a subsequent patch).

There was also a claim that synchronization with probe() is needed to properly load USB drivers; however, this is a red herring: the change adding the lock was introduced in May of last year, and USB loading and probing worked properly for many years before that.

Revert the harmful locking.

Cc: stable(a)vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
---
 drivers/base/core.c | 3 ---
 1 file changed, 3 deletions(-)

v3: no changes.
v2: added Cc: stable, no code changes.

diff --git a/drivers/base/core.c b/drivers/base/core.c
index d2f9d3a59d6b..f9c1c623bca5 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2726,11 +2726,8 @@ static ssize_t uevent_show(struct device *dev, struct device_attribute *attr,
 	if (!env)
 		return -ENOMEM;
 
-	/* Synchronize with really_probe() */
-	device_lock(dev);
 	/* let the kset specific function add its keys */
 	retval = kset->uevent_ops->uevent(&dev->kobj, env);
-	device_unlock(dev);
 	if (retval)
 		goto out;
 
-- 
2.49.0.rc0.332.g42c0ae87b1-goog

[PATCH v2] usb: dwc3: gadget: check that event count does not exceed event buffer length

by Frode Isaksen

From: Frode Isaksen <frode(a)meta.com>

The event count is read from register DWC3_GEVNTCOUNT. There is a check for the count being zero, but not for exceeding the event buffer length. Check that event count does not exceed event buffer length, avoiding an out-of-bounds access when memcpy'ing the event.

Crash log:
Unable to handle kernel paging request at virtual address ffffffc0129be000
pc : __memcpy+0x114/0x180
lr : dwc3_check_event_buf+0xec/0x348
x3 : 0000000000000030 x2 : 000000000000dfc4
x1 : ffffffc0129be000 x0 : ffffff87aad60080
Call trace:
__memcpy+0x114/0x180
dwc3_interrupt+0x24/0x34

Signed-off-by: Frode Isaksen <frode(a)meta.com>
Fixes: ebbb2d59398f ("usb: dwc3: gadget: use evt->cache for processing events")
Cc: stable(a)vger.kernel.org
---
v1 -> v2: Added Fixes and Cc tag.

This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device. Also tested on T.I. AM62x board.

 drivers/usb/dwc3/gadget.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 63fef4a1a498..548e112167f3 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -4564,7 +4564,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
 
 	count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
 	count &= DWC3_GEVNTCOUNT_MASK;
-	if (!count)
+	if (!count || count > evt->length)
 		return IRQ_NONE;
 
 	evt->count = count;
-- 
2.48.1
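The reasoning behind the one-line change, that a count read from a device register is untrusted and must be bounded before it drives a memcpy(), can be shown in a small userspace sketch. Everything here is an assumption made for illustration: `struct sketch_event_buffer`, `SKETCH_EVT_BUF_LEN` and `sketch_check_event_buf()` are hypothetical, and the copy is linear, ignoring the ring wrap-around that a real event buffer has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_EVT_BUF_LEN 64u	/* hypothetical size, not the driver's */

/* Hypothetical, simplified stand-in for an event buffer and its cache copy. */
struct sketch_event_buffer {
	uint8_t buf[SKETCH_EVT_BUF_LEN];	/* models memory written by the device */
	uint8_t cache[SKETCH_EVT_BUF_LEN];	/* models the CPU-side copy */
	uint32_t length;			/* valid bytes in buf, <= SKETCH_EVT_BUF_LEN */
	uint32_t count;				/* bytes accepted for processing */
};

/* Returns false (the analogue of IRQ_NONE) for a zero or oversized count. */
bool sketch_check_event_buf(struct sketch_event_buffer *evt, uint32_t hw_count)
{
	/* hw_count models a register read, so it is validated before use. */
	if (!hw_count || hw_count > evt->length)
		return false;

	evt->count = hw_count;
	/* In bounds because hw_count <= evt->length <= sizeof(evt->buf). */
	memcpy(evt->cache, evt->buf, hw_count);
	return true;
}
```

Rejecting the oversized count outright, rather than clamping it, matches the patch's choice of returning early when the register value cannot be trusted.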

Request for backporting c4af66a95aa3 ("cgroup/rstat: Fix forceidle time in cpu.stat")

by Tejun Heo

Hello, c4af66a95aa3 ("cgroup/rstat: Fix forceidle time in cpu.stat") fixes b824766504e4 ("cgroup/rstat: add force idle show helper") and should be backported to v6.11+ but I forgot to add the tag and the patch is currently queued in cgroup/for-6.15. Once the cgroup pull request is merged, can you please include the commit in -stable backports? Thanks. -- tejun

[GIT PULL] virtio: features, fixes, cleanups

by Michael S. Tsirkin

The following changes since commit d082ecbc71e9e0bf49883ee4afd435a77a5101b6:

  Linux 6.14-rc4 (2025-02-23 12:32:57 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to 9d8960672d63db4b3b04542f5622748b345c637a:

  vhost-scsi: Reduce response iov mem use (2025-02-25 07:10:46 -0500)

----------------------------------------------------------------
virtio: features, fixes, cleanups

A small number of improvements all over the place:

shutdown has been reworked to reset devices.
virtio fs is now allowed in vduse.
vhost-scsi memory use has been reduced.

cleanups, fixes all over the place.

A couple more fixes are being tested and will be merged after rc1.

Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>

----------------------------------------------------------------
Eugenio Pérez (1):
      vduse: add virtio_fs to allowed dev id

John Stultz (1):
      sound/virtio: Fix cancel_sync warnings on uninitialized work_structs

Konstantin Shkolnyy (1):
      vdpa/mlx5: Fix mlx5_vdpa_get_config() endianness on big-endian machines

Michael S. Tsirkin (1):
      virtio: break and reset virtio devices on device_shutdown()

Mike Christie (9):
      vhost-scsi: Fix handling of multiple calls to vhost_scsi_set_endpoint
      vhost-scsi: Reduce mem use by moving upages to per queue
      vhost-scsi: Allocate T10 PI structs only when enabled
      vhost-scsi: Add better resource allocation failure handling
      vhost-scsi: Return queue full for page alloc failures during copy
      vhost-scsi: Dynamically allocate scatterlists
      vhost-scsi: Stop duplicating se_cmd fields
      vhost-scsi: Allocate iov_iter used for unaligned copies when needed
      vhost-scsi: Reduce response iov mem use

Si-Wei Liu (1):
      vdpa/mlx5: Fix oversized null mkey longer than 32bit

Yufeng Wang (3):
      tools/virtio: Add DMA_MAPPING_ERROR and sg_dma_len api define for virtio test
      tools: virtio/linux/compiler.h: Add data_race() define.
      tools: virtio/linux/module.h add MODULE_DESCRIPTION() define.

 drivers/vdpa/mlx5/core/mr.c        |   7 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c  |   3 +
 drivers/vdpa/vdpa_user/vduse_dev.c |   1 +
 drivers/vhost/Kconfig              |   1 +
 drivers/vhost/scsi.c               | 549 +++++++++++++++++++++++--------------
 drivers/virtio/virtio.c            |  29 ++
 sound/virtio/virtio_pcm.c          |  21 +-
 tools/virtio/linux/compiler.h      |  25 ++
 tools/virtio/linux/dma-mapping.h   |  13 +
 tools/virtio/linux/module.h        |   7 +
 10 files changed, 439 insertions(+), 217 deletions(-)

[PATCH 6.1.y 0/7] Backported patches to fix selftest tpdir2

by Kang Wenlin

From: Wenlin Kang <wenlin.kang(a)windriver.com>

The selftest tpidr2 terminated with a 'Segmentation fault' during loading.

root@localhost:~# cd linux-kernel/tools/testing/selftests/arm64/abi && make
root@localhost:~/linux-kernel/tools/testing/selftests/arm64/abi# ./tpidr2
Segmentation fault

The cause of this is the __arch_clear_user() failure.

load_elf_binary() [fs/binfmt_elf.c]
  -> if (likely(elf_bss != elf_brk) && unlikely(padzero(elf_bss)))
    -> padzero()
      -> clear_user() [arch/arm64/include/asm/uaccess.h]
        -> __arch_clear_user() [arch/arm64/lib/clear_user.S]

For more details, please see:
https://lore.kernel.org/lkml/1d0342f3-0474-482b-b6db-81ca7820a462@t-8ch.de/…

This issue has been fixed in the mainline. Here I have backported the relevant commits for the linux-6.1.y branch and attached them.

With these patches, tpidr2 works as follows:

root@localhost:~/linux-kernel/tools/testing/selftests/arm64/abi# ./tpidr2
TAP version 13
1..5
ok 0 skipped, TPIDR2 not supported
ok 1 skipped, TPIDR2 not supported
ok 2 skipped, TPIDR2 not supported
ok 3 skipped, TPIDR2 not supported
ok 4 skipped, TPIDR2 not supported

The first patch only adjusts alignment so that the following patches apply. The issue itself is resolved by the second patch. However, to ensure functional completeness, all related patches were backported according to the following link.
https://lore.kernel.org/all/20230929031716.it.155-kees@kernel.org/#t

Bo Liu (1):
  binfmt_elf: replace IS_ERR() with IS_ERR_VALUE()

Eric W. Biederman (1):
  binfmt_elf: Support segments with 0 filesz and misaligned starts

Kees Cook (5):
  binfmt_elf: elf_bss no longer used by load_elf_binary()
  binfmt_elf: Use elf_load() for interpreter
  binfmt_elf: Use elf_load() for library
  binfmt_elf: Only report padzero() errors when PROT_WRITE
  mm: Remove unused vm_brk()

 fs/binfmt_elf.c    | 221 ++++++++++++++++-----------------------------
 include/linux/mm.h |   3 +-
 mm/mmap.c          |   6 --
 mm/nommu.c         |   5 -
 4 files changed, 79 insertions(+), 156 deletions(-)

-- 
2.39.2

+ x86-vmemmap-use-direct-mapped-va-instead-of-vmemmap-based-va.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: x86/vmemmap: use direct-mapped VA instead of vmemmap-based VA
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
     x86-vmemmap-use-direct-mapped-va-instead-of-vmemmap-based-va.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days

------------------------------------------------------
From: Gwan-gyeong Mun <gwan-gyeong.mun(a)intel.com>
Subject: x86/vmemmap: use direct-mapped VA instead of vmemmap-based VA
Date: Mon, 17 Feb 2025 13:41:33 +0200

Address an Oops that occurs when test-loading the XE GPU driver module after applying the GPU SVM and Xe SVM patch series [1] and the Dept patch series [2].

The issue occurs when loading the xe driver via modprobe [3], which adds a struct page for device memory via devm_memremap_pages(). When a process triggers the addition of a struct page to vmemmap (e.g. hot-plug), the page table entry for the newly added vmemmap-based virtual address is updated first in init_mm's page table and then synchronized later. If the vmemmap-based virtual address is accessed through the process's page table before this sync, a page fault will occur.
This patch translates the vmemmap-based virtual address to a direct-mapped virtual address and uses it, if the current top-level page table is not init_mm's page table when accessing a vmemmap-based virtual address before this sync.

[1] https://lore.kernel.org/dri-devel/20250213021112.1228481-1-matthew.brost@in…
[2] https://lore.kernel.org/lkml/20240508094726.35754-1-byungchul@sk.com/
[3]
[ 49.103630] xe 0000:00:04.0: [drm] Available VRAM: 0x0000000800000000, 0x00000002fb800000
[ 49.116710] BUG: unable to handle page fault for address: ffffeb3ff1200000
[ 49.117175] #PF: supervisor write access in kernel mode
[ 49.117511] #PF: error_code(0x0002) - not-present page
[ 49.117835] PGD 0 P4D 0
[ 49.118015] Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 49.118366] CPU: 3 UID: 0 PID: 302 Comm: modprobe Tainted: G W 6.13.0-drm-tip-test+ #62
[ 49.118976] Tainted: [W]=WARN
[ 49.119179] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 49.119710] RIP: 0010:vmemmap_set_pmd+0xff/0x230
[ 49.120011] Code: 77 22 02 a9 ff ff 1f 00 74 58 48 8b 3d 62 77 22 02 48 85 ff 0f 85 9a 00 00 00 48 8d 7d 08 48 89 e9 31 c0 48 89 ea 48 83 e7 f8 <48> c7 45 00 00 00 00 00 48 29 f9 48 c7 45 48 00 00 00 00 83 c1 50
[ 49.121158] RSP: 0018:ffffc900016d37a8 EFLAGS: 00010282
[ 49.121502] RAX: 0000000000000000 RBX: ffff888164000000 RCX: ffffeb3ff1200000
[ 49.121966] RDX: ffffeb3ff1200000 RSI: 80000000000001e3 RDI: ffffeb3ff1200008
[ 49.122499] RBP: ffffeb3ff1200000 R08: ffffeb3ff1280000 R09: 0000000000000000
[ 49.123032] R10: ffff88817b94dc48 R11: 0000000000000003 R12: ffffeb3ff1280000
[ 49.123566] R13: 0000000000000000 R14: ffff88817b94dc48 R15: 8000000163e001e3
[ 49.124096] FS: 00007f53ae71d740(0000) GS:ffff88843fd80000(0000) knlGS:0000000000000000
[ 49.124698] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 49.125129] CR2: ffffeb3ff1200000 CR3: 000000017c7d2000 CR4: 0000000000750ef0
[ 49.125662] PKRU: 55555554
[ 49.125880] Call Trace:
[ 49.126078] <TASK>
[ 49.126252] ? __die_body.cold+0x19/0x26
[ 49.126509] ? page_fault_oops+0xa2/0x240
[ 49.126736] ? preempt_count_add+0x47/0xa0
[ 49.126968] ? search_module_extables+0x4a/0x80
[ 49.127224] ? exc_page_fault+0x206/0x230
[ 49.127454] ? asm_exc_page_fault+0x22/0x30
[ 49.127691] ? vmemmap_set_pmd+0xff/0x230
[ 49.127919] vmemmap_populate_hugepages+0x176/0x180
[ 49.128194] vmemmap_populate+0x34/0x80
[ 49.128416] __populate_section_memmap+0x41/0x90
[ 49.128676] sparse_add_section+0x121/0x3e0
[ 49.128914] __add_pages+0xba/0x150
[ 49.129116] add_pages+0x1d/0x70
[ 49.129305] memremap_pages+0x3dc/0x810
[ 49.129529] devm_memremap_pages+0x1c/0x60
[ 49.129762] xe_devm_add+0x8b/0x100 [xe]
[ 49.130072] xe_tile_init_noalloc+0x6a/0x70 [xe]
[ 49.130408] xe_device_probe+0x48c/0x740 [xe]
[ 49.130714] ? __pfx___drmm_mutex_release+0x10/0x10
[ 49.130982] ? __drmm_add_action+0x85/0xd0
[ 49.131208] ? __pfx___drmm_mutex_release+0x10/0x10
[ 49.131478] xe_pci_probe+0x7ef/0xd90 [xe]
[ 49.131777] ? _raw_spin_unlock_irqrestore+0x66/0x90
[ 49.132049] ? lockdep_hardirqs_on+0xba/0x140
[ 49.132290] pci_device_probe+0x99/0x110
[ 49.132510] really_probe+0xdb/0x340
[ 49.132710] ? pm_runtime_barrier+0x50/0x90
[ 49.132941] ? __pfx___driver_attach+0x10/0x10
[ 49.133190] __driver_probe_device+0x78/0x110
[ 49.133433] driver_probe_device+0x1f/0xa0
[ 49.133661] __driver_attach+0xba/0x1c0
[ 49.133874] bus_for_each_dev+0x7a/0xd0
[ 49.134089] bus_add_driver+0x114/0x200
[ 49.134302] driver_register+0x6e/0xc0
[ 49.134515] xe_init+0x1e/0x50 [xe]
[ 49.134827] ? __pfx_xe_init+0x10/0x10 [xe]
[ 49.134926] xe 0000:00:04.0: [drm:process_one_work] GT1: GuC CT safe-mode canceled
[ 49.135112] do_one_initcall+0x5b/0x2b0
[ 49.135734] ? rcu_is_watching+0xd/0x40
[ 49.135995] ? __kmalloc_cache_noprof+0x231/0x310
[ 49.136315] do_init_module+0x60/0x210
[ 49.136572] init_module_from_file+0x86/0xc0
[ 49.136863] idempotent_init_module+0x12b/0x340
[ 49.137156] __x64_sys_finit_module+0x61/0xc0
[ 49.137437] do_syscall_64+0x69/0x140
[ 49.137681] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 49.137953] RIP: 0033:0x7f53ae1261fd
[ 49.138153] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e3 fa 0c 00 f7 d8 64 89 01 48
[ 49.139117] RSP: 002b:00007ffd0e9021e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 49.139525] RAX: ffffffffffffffda RBX: 000055c02951ee50 RCX: 00007f53ae1261fd
[ 49.139905] RDX: 0000000000000000 RSI: 000055bfff125478 RDI: 0000000000000010
[ 49.140282] RBP: 000055bfff125478 R08: 00007f53ae1f6b20 R09: 00007ffd0e902230
[ 49.140663] R10: 000055c029522000 R11: 0000000000000246 R12: 0000000000040000
[ 49.141040] R13: 000055c02951ef80 R14: 0000000000000000 R15: 000055c029521fc0
[ 49.141424] </TASK>
[ 49.141552] Modules linked in: xe(+) drm_ttm_helper gpu_sched drm_suballoc_helper drm_gpuvm drm_exec drm_gpusvm i2c_algo_bit drm_buddy video wmi ttm drm_display_helper drm_kms_helper crct10dif_pclmul crc32_pclmul i2c_piix4 e1000 ghash_clmulni_intel i2c_smbus fuse
[ 49.142824] CR2: ffffeb3ff1200000
[ 49.143010] ---[ end trace 0000000000000000 ]---
[ 49.143268] RIP: 0010:vmemmap_set_pmd+0xff/0x230
[ 49.143523] Code: 77 22 02 a9 ff ff 1f 00 74 58 48 8b 3d 62 77 22 02 48 85 ff 0f 85 9a 00 00 00 48 8d 7d 08 48 89 e9 31 c0 48 89 ea 48 83 e7 f8 <48> c7 45 00 00 00 00 00 48 29 f9 48 c7 45 48 00 00 00 00 83 c1 50
[ 49.144489] RSP: 0018:ffffc900016d37a8 EFLAGS: 00010282
[ 49.144775] RAX: 0000000000000000 RBX: ffff888164000000 RCX: ffffeb3ff1200000
[ 49.145154] RDX: ffffeb3ff1200000 RSI: 80000000000001e3 RDI: ffffeb3ff1200008
[ 49.145536] RBP: ffffeb3ff1200000 R08: ffffeb3ff1280000 R09: 0000000000000000
[ 49.145914] R10: ffff88817b94dc48 R11: 0000000000000003 R12: ffffeb3ff1280000
[ 49.146292] R13: 0000000000000000 R14: ffff88817b94dc48 R15: 8000000163e001e3
[ 49.146671] FS: 00007f53ae71d740(0000) GS:ffff88843fd80000(0000) knlGS:0000000000000000
[ 49.147097] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 49.147407] CR2: ffffeb3ff1200000 CR3: 000000017c7d2000 CR4: 0000000000750ef0
[ 49.147786] PKRU: 55555554
[ 49.147941] note: modprobe[302] exited with irqs disabled

When a process triggers the addition of a struct page to vmemmap (e.g. hot-plug), the page table entry for the newly added vmemmap-based virtual address is updated first in init_mm's page table and then synchronized later. If the vmemmap-based virtual address is accessed through the process's page table before this sync, a page fault will occur.

This translates the vmemmap-based virtual address to a direct-mapped virtual address and uses it, if the current top-level page table is not init_mm's page table when accessing a vmemmap-based virtual address before this sync.
Link: https://lkml.kernel.org/r/20250217114133.400063-2-gwan-gyeong.mun@intel.com
Fixes: faf1c0008a33 ("x86/vmemmap: optimize for consecutive sections in partial populated PMDs")
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun(a)intel.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Hyeonggon Yoo <42.hyeyoo(a)gmail.com>
Cc: Byungchul Park <byungchul(a)sk.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 arch/x86/mm/init_64.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

--- a/arch/x86/mm/init_64.c~x86-vmemmap-use-direct-mapped-va-instead-of-vmemmap-based-va
+++ a/arch/x86/mm/init_64.c
@@ -844,6 +844,17 @@ void __init paging_init(void)
  */
 static unsigned long unused_pmd_start __meminitdata;
 
+static void * __meminit vmemmap_va_to_kaddr(unsigned long vmemmap_va)
+{
+	void *kaddr = (void *)vmemmap_va;
+	pgd_t *pgd = __va(read_cr3_pa());
+
+	if (init_mm.pgd != pgd)
+		kaddr = __va(slow_virt_to_phys(kaddr));
+
+	return kaddr;
+}
+
 static void __meminit vmemmap_flush_unused_pmd(void)
 {
 	if (!unused_pmd_start)
@@ -851,7 +862,7 @@ static void __meminit vmemmap_flush_unus
 	/*
 	 * Clears (unused_pmd_start, PMD_END]
 	 */
-	memset((void *)unused_pmd_start, PAGE_UNUSED,
+	memset(vmemmap_va_to_kaddr(unused_pmd_start), PAGE_UNUSED,
 	       ALIGN(unused_pmd_start, PMD_SIZE) - unused_pmd_start);
 	unused_pmd_start = 0;
 }
@@ -882,7 +893,7 @@ static void __meminit __vmemmap_use_sub_
 	 * case the first memmap never gets initialized e.g., because the memory
 	 * block never gets onlined).
 	 */
-	memset((void *)start, 0, sizeof(struct page));
+	memset(vmemmap_va_to_kaddr(start), 0, sizeof(struct page));
 }
 
 static void __meminit vmemmap_use_sub_pmd(unsigned long start, unsigned long end)
@@ -924,7 +935,7 @@ static void __meminit vmemmap_use_new_su
 	 * Mark with PAGE_UNUSED the unused parts of the new memmap range
 	 */
 	if (!IS_ALIGNED(start, PMD_SIZE))
-		memset((void *)page, PAGE_UNUSED, start - page);
+		memset(vmemmap_va_to_kaddr(page), PAGE_UNUSED, start - page);
 
 	/*
 	 * We want to avoid memset(PAGE_UNUSED) when populating the vmemmap of
_

Patches currently in -mm which might be from gwan-gyeong.mun(a)intel.com are

x86-vmemmap-use-direct-mapped-va-instead-of-vmemmap-based-va.patch

[PATCH V2 1/3] drm/amd/display: Protect FPU in dml21_copy()

by Huacai Chen

Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context start") removes the FP context protection of dml2_create(), and it said "All the DC_FP_START/END should be used before call anything from DML2". However, dml21_copy() is not protected by its callers, causing errors such as:

 do_fpu invoked from kernel context![#1]:
 CPU: 0 UID: 0 PID: 240 Comm: kworker/0:5 Not tainted 6.14.0-rc6+ #1
 Workqueue: events work_for_cpu_fn
 pc ffff80000318bd2c ra ffff80000315750c tp 9000000105910000 sp 9000000105913810
 a0 0000000000000000 a1 0000000000000002 a2 900000013140d728 a3 900000013140d720
 a4 0000000000000000 a5 9000000131592d98 a6 0000000000017ae8 a7 00000000001312d0
 t0 9000000130751ff0 t1 ffff800003790000 t2 ffff800003790000 t3 9000000131592e28
 t4 000000000004c6a8 t5 00000000001b7740 t6 0000000000023e38 t7 0000000000249f00
 t8 0000000000000002 u0 0000000000000000 s9 900000012b010000 s0 9000000131400000
 s1 9000000130751fd8 s2 ffff800003408000 s3 9000000130752c78 s4 9000000131592da8
 s5 9000000131592120 s6 9000000130751ff0 s7 9000000131592e28 s8 9000000131400008
 ra: ffff80000315750c dml2_top_soc15_initialize_instance+0x20c/0x300 [amdgpu]
 ERA: ffff80000318bd2c mcg_dcn4_build_min_clock_table+0x14c/0x600 [amdgpu]
 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
 PRMD: 00000004 (PPLV0 +PIE -PWE)
 EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
 ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
 ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
 PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
 Process kworker/0:5 (pid: 240, threadinfo=00000000f1700428, task=0000000020d2e962)
 Stack : 0000000000000000 0000000000000000 0000000000000000 9000000130751fd8
         9000000131400000 ffff8000031574e0 9000000130751ff0 0000000000000000
         9000000131592e28 0000000000000000 0000000000000000 0000000000000000
         0000000000000000 0000000000000000 0000000000000000 0000000000000000
         0000000000000000 0000000000000000 0000000000000000 0000000000000000
         0000000000000000 0000000000000000 0000000000000000 f9175936df5d7fd2
         900000012b00ff08 900000012b000000 ffff800003409000 ffff8000034a1780
         90000001019634c0 900000012b000010 90000001307beeb8 90000001306b0000
         0000000000000001 ffff8000031942b4 9000000130780000 90000001306c0000
         9000000130780000 ffff8000031c276c 900000012b044bd0 ffff800003408000
         ...
 Call Trace:
 [<ffff80000318bd2c>] mcg_dcn4_build_min_clock_table+0x14c/0x600 [amdgpu]
 [<ffff800003157508>] dml2_top_soc15_initialize_instance+0x208/0x300 [amdgpu]
 [<ffff8000031942b0>] dml21_create_copy+0x30/0x60 [amdgpu]
 [<ffff8000031c2768>] dc_state_create_copy+0x68/0xe0 [amdgpu]
 [<ffff800002e98ea0>] amdgpu_dm_init+0x8c0/0x2060 [amdgpu]
 [<ffff800002e9a658>] dm_hw_init+0x18/0x60 [amdgpu]
 [<ffff800002b0a738>] amdgpu_device_init+0x1938/0x27e0 [amdgpu]
 [<ffff800002b0ce80>] amdgpu_driver_load_kms+0x20/0xa0 [amdgpu]
 [<ffff800002b008f0>] amdgpu_pci_probe+0x1b0/0x580 [amdgpu]
 [<9000000003c7eae4>] local_pci_probe+0x44/0xc0
 [<90000000032f2b18>] work_for_cpu_fn+0x18/0x40
 [<90000000032f5da0>] process_one_work+0x160/0x300
 [<90000000032f6718>] worker_thread+0x318/0x440
 [<9000000003301b8c>] kthread+0x12c/0x220
 [<90000000032b1484>] ret_from_kernel_thread+0x8/0xa4

Unfortunately, protecting dml21_copy() from outside DML2 causes "sleeping function called from invalid context", so protect it with DC_FP_START() and DC_FP_END() inside.

Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
 drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c
index fb80ba9287b6..a6b8df1d96e8 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_wrapper.c
@@ -412,8 +412,12 @@ void dml21_copy(struct dml2_context *dst_dml_ctx,
 
 	dst_dml_ctx->v21.mode_programming.programming = dst_dml2_programming;
 
+	DC_FP_START();
+
 	/* need to initialize copied instance for internal references to be correct */
 	dml2_initialize_instance(&dst_dml_ctx->v21.dml_init);
+
+	DC_FP_END();
 }
 
 bool dml21_create_copy(struct dml2_context **dst_dml_ctx,
-- 
2.47.1
