The sysfs interface can be used to trigger arbitrarily large memory
allocations. This can induce pressure on the VM layer to satisfy the
request only to fail anyways.
Reported-by: cheung wall <zzqq0103.hey(a)gmail.com>
Closes: https://lore.kernel.org/lkml/20250103091906.GD1977892@ZenIV/
Fixes: 73f37068d540 ("ptp: support ptp physical/virtual clocks conversion")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
The limit is completely made up, let me know if there is something
better.
I'm also wondering about the point of the max_vclocks sysfs attribute.
It could easily be removed and all its logic moved into the n_vclocks
attribute, simplifying the UAPI.
---
drivers/ptp/ptp_private.h | 1 +
drivers/ptp/ptp_sysfs.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h
index 18934e28469ee6e3bf9c9e6d1a1adb82808d88e6..07003339795e9c0fb813887e47eaee4ba0e20064 100644
--- a/drivers/ptp/ptp_private.h
+++ b/drivers/ptp/ptp_private.h
@@ -22,6 +22,7 @@
#define PTP_MAX_TIMESTAMPS 128
#define PTP_BUF_TIMESTAMPS 30
#define PTP_DEFAULT_MAX_VCLOCKS 20
+#define PTP_MAX_VCLOCKS_LIMIT 2048
#define PTP_MAX_CHANNELS 2048
struct timestamp_event_queue {
diff --git a/drivers/ptp/ptp_sysfs.c b/drivers/ptp/ptp_sysfs.c
index 6b1b8f57cd9510f269c86dd89a7a74f277f6916b..200eaf50069681eecc87d63c0e0440f28cccab77 100644
--- a/drivers/ptp/ptp_sysfs.c
+++ b/drivers/ptp/ptp_sysfs.c
@@ -284,7 +284,7 @@ static ssize_t max_vclocks_store(struct device *dev,
size_t size;
u32 max;
- if (kstrtou32(buf, 0, &max) || max == 0)
+ if (kstrtou32(buf, 0, &max) || max == 0 || max > PTP_MAX_VCLOCKS_LIMIT)
return -EINVAL;
if (max == ptp->max_vclocks)
---
base-commit: 582ef8a0c406e0b17030b0773392595ec331a0d2
change-id: 20250103-ptp-max_vclocks-0dab5b03b006
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
The QSPI peripheral control and status registers are
accessible via the SoC's APB bus, whereas MMIO transactions'
data travels on the AHB bus.
Microchip documentation and even sample code from Atmel
emphasises the need for a memory barrier before the first
MMIO transaction to the AHB-connected QSPI, and before the
last write to its registers via APB. This is achieved by
the following lines in `atmel_qspi_transfer()`:
/* Dummy read of QSPI_IFR to synchronize APB and AHB accesses */
(void)atmel_qspi_read(aq, QSPI_IFR);
However, the current documentation makes no mention to
synchronization requirements in the other direction, i.e.
after the last data written via AHB, and before the first
register access on APB.
In our case, we were facing an issue where the QSPI peripheral
would cease to send any new CSR (nCS Rise) interrupts,
leading to a timeout in `atmel_qspi_wait_for_completion()`
and ultimately this panic in higher levels:
ubi0 error: ubi_io_write: error -110 while writing 63108 bytes
to PEB 491:128, written 63104 bytes
After months of extensive research of the codebase, fiddling
around the debugger with kgdb, and back-and-forth with
Microchip, we came to the conclusion that the issue is
probably that the peripheral is still busy receiving on AHB
when the LASTXFER bit is written to its Control Register
on APB, therefore this write gets lost, and the peripheral
still thinks there is more data to come in the MMIO transfer.
This was first formulated when we noticed that doubling the
write() of QSPI_CR_LASTXFER seemed to solve the problem.
Ultimately, the solution is to introduce memory barriers
after the AHB-mapped MMIO transfers, to ensure ordering.
Fixes: d5433def3153 ("mtd: spi-nor: atmel-quadspi: Add spi-mem support to atmel-quadspi")
Cc: Hari.PrasathGE(a)microchip.com
Cc: Mahesh.Abotula(a)microchip.com
Cc: Marco.Cardellini(a)microchip.com
Cc: <stable(a)vger.kernel.org> # c0a0203cf579: ("spi: atmel-quadspi: Create `atmel_qspi_ops`"...)
Cc: <stable(a)vger.kernel.org> # 6.x.y
Signed-off-by: Bence Csókás <csokas.bence(a)prolan.hu>
---
Notes:
Changes in v2:
* dropping --- from commit msg
drivers/spi/atmel-quadspi.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/atmel-quadspi.c b/drivers/spi/atmel-quadspi.c
index 73cf0c3f1477..96fc1c56a221 100644
--- a/drivers/spi/atmel-quadspi.c
+++ b/drivers/spi/atmel-quadspi.c
@@ -625,13 +625,20 @@ static int atmel_qspi_transfer(struct spi_mem *mem,
(void)atmel_qspi_read(aq, QSPI_IFR);
/* Send/Receive data */
- if (op->data.dir == SPI_MEM_DATA_IN)
+ if (op->data.dir == SPI_MEM_DATA_IN) {
memcpy_fromio(op->data.buf.in, aq->mem + offset,
op->data.nbytes);
- else
+
+ /* Synchronize AHB and APB accesses again */
+ rmb();
+ } else {
memcpy_toio(aq->mem + offset, op->data.buf.out,
op->data.nbytes);
+ /* Synchronize AHB and APB accesses again */
+ wmb();
+ }
+
/* Release the chip-select */
atmel_qspi_write(QSPI_CR_LASTXFER, aq, QSPI_CR);
--
2.34.1
When cdev_device_add() failed, calling put_device() to explicitly
release dev->lirc_dev. Otherwise, it could cause the fault of the
reference count.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a6ddd4fecbb0 ("media: lirc: remove last remnants of lirc kapi")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/media/rc/lirc_dev.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/rc/lirc_dev.c b/drivers/media/rc/lirc_dev.c
index a2257dc2f25d..ed839e15fa16 100644
--- a/drivers/media/rc/lirc_dev.c
+++ b/drivers/media/rc/lirc_dev.c
@@ -765,6 +765,7 @@ int lirc_register(struct rc_dev *dev)
return 0;
out_ida:
+ put_device(&dev->lirc_dev);
ida_free(&lirc_ida, minor);
return err;
}
--
2.25.1
The reference count of the device incremented in device_initialize() is
not decremented when device_add() fails. Add a put_device() and delete
kfree() before returning from the function to decrement reference
count for cleanup. Or it could cause memory leak.
As comment of device_add() says, if device_add() succeeds, you should
call device_del() when you want to get rid of it. If device_add() has
not succeeded, use only put_device() to drop the reference count.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: ed542bed126c ("[SCSI] raid class: handle component-add errors")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/scsi/raid_class.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/raid_class.c b/drivers/scsi/raid_class.c
index 898a0bdf8df6..77c91dbf3b77 100644
--- a/drivers/scsi/raid_class.c
+++ b/drivers/scsi/raid_class.c
@@ -251,7 +251,7 @@ int raid_component_add(struct raid_template *r,struct device *raid_dev,
list_del(&rc->node);
rd->component_count--;
put_device(component_dev);
- kfree(rc);
+ put_device(&rc->dev);
return err;
}
EXPORT_SYMBOL(raid_component_add);
--
2.25.1
Commit c1cc1552616d ("arm64: MMU initialisation") optimizes the
vmemmap to populate at the PMD section level which was suitable
initially since hotplugging granule is always 128M. However,
commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
which added 2M hotplugging granule disrupted the arm64 assumptions.
Considering the vmemmap_free -> unmap_hotplug_pmd_range path, when
pmd_sect() is true, the entire PMD section is cleared, even if there is
other effective subsection. For example pagemap1 and pagemap2 are part
of a single PMD entry and they are hot-added sequentially. Then pagemap1
is removed, vmemmap_free() will clear the entire PMD entry freeing the
struct page metadata for the whole section, even though pagemap2 is still
active.
To address the issue, we need to prevent PMD/PUD/CONT mappings for both
linear and vmemmap for non-boot sections if the size exceeds 2MB
(considering sub-section is 2MB). We only permit 2MB blocks in a 4KB page
configuration.
Cc: stable(a)vger.kernel.org # v5.4+
Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Zhenhua Huang <quic_zhenhuah(a)quicinc.com>
---
Hi Catalin and Anshuman,
Based on your review comments, I concluded below patch and tested with my setup.
I have not folded patchset #2 since this patch seems to be enough for backporting..
Please see if you have further suggestions.
arch/arm64/mm/mmu.c | 33 +++++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e2739b69e11b..2b4d23f01d85 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -42,9 +42,11 @@
#include <asm/pgalloc.h>
#include <asm/kfence.h>
-#define NO_BLOCK_MAPPINGS BIT(0)
+#define NO_PMD_BLOCK_MAPPINGS BIT(0)
#define NO_CONT_MAPPINGS BIT(1)
#define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */
+#define NO_PUD_BLOCK_MAPPINGS BIT(3) /* Hotplug case: do not want block mapping for PUD */
+#define NO_BLOCK_MAPPINGS (NO_PMD_BLOCK_MAPPINGS | NO_PUD_BLOCK_MAPPINGS)
u64 kimage_voffset __ro_after_init;
EXPORT_SYMBOL(kimage_voffset);
@@ -254,7 +256,7 @@ static void init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
/* try section mapping first */
if (((addr | next | phys) & ~PMD_MASK) == 0 &&
- (flags & NO_BLOCK_MAPPINGS) == 0) {
+ (flags & NO_PMD_BLOCK_MAPPINGS) == 0) {
pmd_set_huge(pmdp, phys, prot);
/*
@@ -356,10 +358,11 @@ static void alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end,
/*
* For 4K granule only, attempt to put down a 1GB block
+ * Hotplug case: do not attempt 1GB block
*/
if (pud_sect_supported() &&
((addr | next | phys) & ~PUD_MASK) == 0 &&
- (flags & NO_BLOCK_MAPPINGS) == 0) {
+ (flags & NO_PUD_BLOCK_MAPPINGS) == 0) {
pud_set_huge(pudp, phys, prot);
/*
@@ -1175,9 +1178,16 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
struct vmem_altmap *altmap)
{
+ unsigned long start_pfn;
+ struct mem_section *ms;
+
WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
- if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
+ start_pfn = page_to_pfn((struct page *)start);
+ ms = __pfn_to_section(start_pfn);
+
+ /* Hotplugged section not support hugepages */
+ if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) || !early_section(ms))
return vmemmap_populate_basepages(start, end, node, altmap);
else
return vmemmap_populate_hugepages(start, end, node, altmap);
@@ -1339,9 +1349,24 @@ int arch_add_memory(int nid, u64 start, u64 size,
struct mhp_params *params)
{
int ret, flags = NO_EXEC_MAPPINGS;
+ unsigned long start_pfn = page_to_pfn((struct page *)start);
+ struct mem_section *ms = __pfn_to_section(start_pfn);
VM_BUG_ON(!mhp_range_allowed(start, size, true));
+ /* Should not be invoked by early section */
+ WARN_ON(early_section(ms));
+
+ if (IS_ENABLED(CONFIG_ARM64_4K_PAGES))
+ /*
+ * As per subsection granule is 2M, allow PMD block mapping in
+ * case 4K PAGES.
+ * Other cases forbid section mapping.
+ */
+ flags |= NO_PUD_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+ else
+ flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+
if (can_set_direct_map())
flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
--
2.25.1
The quilt patch titled
Subject: revert "vmstat: disable vmstat_work on vmstat_cpu_down_prep()"
has been removed from the -mm tree. Its filename was
revert-vmstat-disable-vmstat_work-on-vmstat_cpu_down_prep.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Andrew Morton <akpm(a)linux-foundation.org>
Subject: revert "vmstat: disable vmstat_work on vmstat_cpu_down_prep()"
Date: Mon Jan 6 06:24:12 PM PST 2025
Revert adcfb264c3ed ("vmstat: disable vmstat_work on
vmstat_cpu_down_prep()") due to "workqueue: work disable count
underflowed" WARNings.
Fixes: adcfb264c3ed ("vmstat: disable vmstat_work on vmstat_cpu_down_prep()")
Reported-by: Borislav Petkov <bp(a)alien8.de>
Reported-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Greg KH <greg(a)kroah.com>
Cc: Koichiro Den <koichiro.den(a)canonical.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmstat.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/mm/vmstat.c~revert-vmstat-disable-vmstat_work-on-vmstat_cpu_down_prep
+++ a/mm/vmstat.c
@@ -2148,14 +2148,13 @@ static int vmstat_cpu_online(unsigned in
if (!node_state(cpu_to_node(cpu), N_CPU)) {
node_set_state(cpu_to_node(cpu), N_CPU);
}
- enable_delayed_work(&per_cpu(vmstat_work, cpu));
return 0;
}
static int vmstat_cpu_down_prep(unsigned int cpu)
{
- disable_delayed_work_sync(&per_cpu(vmstat_work, cpu));
+ cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
return 0;
}
_
Patches currently in -mm which might be from akpm(a)linux-foundation.org are
mm-swap_cgroup-allocate-swap_cgroup-map-using-vcalloc-fix.patch
mm-page_alloc-add-some-detailed-comments-in-can_steal_fallback-fix.patch
mm-introduce-mmap_lock_speculate_try_beginretry-fix.patch
mm-damon-tests-vaddr-kunith-reduce-stack-consumption.patch
mm-damon-tests-vaddr-kunith-reduce-stack-consumption-fix.patch
mm-remove-an-avoidable-load-of-page-refcount-in-page_ref_add_unless-fix.patch
mm-fix-outdated-incorrect-code-comments-for-handle_mm_fault-fix.patch
mm-huge_memoryc-rename-shadowed-local.patch
replace-free-hugepage-folios-after-migration-fix.patch
xarray-port-tests-to-kunit-fix.patch
checkpatch-check-return-of-git_commit_info-fix.patch
fault-inject-use-prandom-where-cryptographically-secure-randomness-is-not-needed-fix.patch
The quilt patch titled
Subject: fs/proc: do_task_stat: fix ESP not readable during coredump
has been removed from the -mm tree. Its filename was
fs-proc-do_task_stat-fix-esp-not-readable-during-coredump.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Nam Cao <namcao(a)linutronix.de>
Subject: fs/proc: do_task_stat: fix ESP not readable during coredump
Date: Thu, 2 Jan 2025 09:22:56 +0100
The field "eip" (instruction pointer) and "esp" (stack pointer) of a task
can be read from /proc/PID/stat. These fields can be interesting for
coredump.
However, these fields were disabled by commit 0a1eb2d474ed ("fs/proc: Stop
reporting eip and esp in /proc/PID/stat"), because it is generally unsafe
to do so. But it is safe for a coredumping process, and therefore
exceptions were made:
- for a coredumping thread by commit fd7d56270b52 ("fs/proc: Report
eip/esp in /prod/PID/stat for coredumping").
- for all other threads in a coredumping process by commit cb8f381f1613
("fs/proc/array.c: allow reporting eip/esp for all coredumping
threads").
The above two commits check the PF_DUMPCORE flag to determine a coredump
thread and the PF_EXITING flag for the other threads.
Unfortunately, commit 92307383082d ("coredump: Don't perform any cleanups
before dumping core") moved coredump to happen earlier and before
PF_EXITING is set. Thus, checking PF_EXITING is no longer the correct way
to determine threads in a coredumping process.
Instead of PF_EXITING, use PF_POSTCOREDUMP to determine the other threads.
Checking of PF_EXITING was added for coredumping, so it probably can now
be removed. But it doesn't hurt to keep.
Link: https://lkml.kernel.org/r/d89af63d478d6c64cc46a01420b46fd6eb147d6f.17358057…
Fixes: 92307383082d ("coredump: Don't perform any cleanups before dumping core")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Acked-by: Oleg Nesterov <oleg(a)redhat.com>
Acked-by: Kees Cook <kees(a)kernel.org>
Cc: Eric W. Biederman <ebiederm(a)xmission.com>
Cc: Dylan Hatch <dylanbhatch(a)google.com>
Cc: John Ogness <john.ogness(a)linutronix.de>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/array.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/proc/array.c~fs-proc-do_task_stat-fix-esp-not-readable-during-coredump
+++ a/fs/proc/array.c
@@ -500,7 +500,7 @@ static int do_task_stat(struct seq_file
* a program is not able to use ptrace(2) in that case. It is
* safe because the task has stopped executing permanently.
*/
- if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE))) {
+ if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE|PF_POSTCOREDUMP))) {
if (try_get_task_stack(task)) {
eip = KSTK_EIP(task);
esp = KSTK_ESP(task);
_
Patches currently in -mm which might be from namcao(a)linutronix.de are