The quilt patch titled
Subject: crash: use macro to add crashk_res into iomem early for specific arch
has been removed from the -mm tree. Its filename was
crash-use-macro-to-add-crashk_res-into-iomem-early-for-specific-arch.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Baoquan He <bhe(a)redhat.com>
Subject: crash: use macro to add crashk_res into iomem early for specific arch
Date: Mon, 25 Mar 2024 09:50:50 +0800
There are regression reports[1][2] that crashkernel region on x86_64 can't
be added into iomem tree sometime. This causes the later failure of kdump
loading.
This happened after commit 4a693ce65b18 ("kdump: defer the insertion of
crashkernel resources") was merged.
Even though, these reported issues are proved to be related to other
component, they are just exposed after above commmit applied, I still
would like to keep crashk_res and crashk_low_res being added into iomem
early as before because the early adding has been always there on x86_64
and working very well. For safety of kdump, Let's change it back.
Here, add a macro HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY to limit that
only ARCH defining the macro can have the early adding
crashk_res/_low_res into iomem. Then define
HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY on x86 to enable it.
Note: In reserve_crashkernel_low(), there's a remnant of crashk_low_res
handling which was mistakenly added back in commit 85fcde402db1 ("kexec:
split crashkernel reservation code out from crash_core.c").
[1]
[PATCH V2] x86/kexec: do not update E820 kexec table for setup_data
https://lore.kernel.org/all/Zfv8iCL6CT2JqLIC@darkstar.users.ipa.redhat.com/…
[2]
Question about Address Range Validation in Crash Kernel Allocation
https://lore.kernel.org/all/4eeac1f733584855965a2ea62fa4da58@huawei.com/T/#u
Link: https://lkml.kernel.org/r/ZgDYemRQ2jxjLkq+@MiWiFi-R3L-srv
Fixes: 4a693ce65b18 ("kdump: defer the insertion of crashkernel resources")
Signed-off-by: Baoquan He <bhe(a)redhat.com>
Cc: Dave Young <dyoung(a)redhat.com>
Cc: Huacai Chen <chenhuacai(a)loongson.cn>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Jiri Bohac <jbohac(a)suse.cz>
Cc: Li Huafei <lihuafei1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/include/asm/crash_reserve.h | 2 ++
kernel/crash_reserve.c | 7 +++++++
2 files changed, 9 insertions(+)
--- a/arch/x86/include/asm/crash_reserve.h~crash-use-macro-to-add-crashk_res-into-iomem-early-for-specific-arch
+++ a/arch/x86/include/asm/crash_reserve.h
@@ -39,4 +39,6 @@ static inline unsigned long crash_low_si
#endif
}
+#define HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
+
#endif /* _X86_CRASH_RESERVE_H */
--- a/kernel/crash_reserve.c~crash-use-macro-to-add-crashk_res-into-iomem-early-for-specific-arch
+++ a/kernel/crash_reserve.c
@@ -366,8 +366,10 @@ static int __init reserve_crashkernel_lo
crashk_low_res.start = low_base;
crashk_low_res.end = low_base + low_size - 1;
+#ifdef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
insert_resource(&iomem_resource, &crashk_low_res);
#endif
+#endif
return 0;
}
@@ -448,8 +450,12 @@ retry:
crashk_res.start = crash_base;
crashk_res.end = crash_base + crash_size - 1;
+#ifdef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
+ insert_resource(&iomem_resource, &crashk_res);
+#endif
}
+#ifndef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
static __init int insert_crashkernel_resources(void)
{
if (crashk_res.start < crashk_res.end)
@@ -462,3 +468,4 @@ static __init int insert_crashkernel_res
}
early_initcall(insert_crashkernel_resources);
#endif
+#endif
_
Patches currently in -mm which might be from bhe(a)redhat.com are
mm-vmallocc-optimize-to-reduce-arguments-of-alloc_vmap_area.patch
x86-remove-unneeded-memblock_find_dma_reserve.patch
mm-mm_initc-remove-the-useless-dma_reserve.patch
mm-mm_initc-add-new-function-calc_nr_all_pages.patch
mm-mm_initc-remove-meaningless-calculation-of-zone-managed_pages-in-free_area_init_core.patch
mm-mm_initc-remove-unneeded-calc_memmap_size.patch
mm-mm_initc-remove-arch_reserved_kernel_pages.patch
The quilt patch titled
Subject: mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
has been removed from the -mm tree. Its filename was
mm-zswap-fix-data-loss-on-swp_synchronous_io-devices.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
Date: Sun, 24 Mar 2024 17:04:47 -0400
Zhongkun He reports data corruption when combining zswap with zram.
The issue is the exclusive loads we're doing in zswap. They assume
that all reads are going into the swapcache, which can assume
authoritative ownership of the data and so the zswap copy can go.
However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try to
bypass the swapcache. This results in an optimistic read of the swap data
into a page that will be dismissed if the fault fails due to races. In
this case, zswap mustn't drop its authoritative copy.
Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrD…
Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
Link: https://lkml.kernel.org/r/20240324210447.956973-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Zhongkun He <hezhongkun.hzk(a)bytedance.com>
Tested-by: Zhongkun He <hezhongkun.hzk(a)bytedance.com>
Acked-by: Yosry Ahmed <yosryahmed(a)google.com>
Acked-by: Barry Song <baohua(a)kernel.org>
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Reviewed-by: Nhat Pham <nphamcs(a)gmail.com>
Acked-by: Chris Li <chrisl(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.5+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/zswap.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
--- a/mm/zswap.c~mm-zswap-fix-data-loss-on-swp_synchronous_io-devices
+++ a/mm/zswap.c
@@ -1636,6 +1636,7 @@ bool zswap_load(struct folio *folio)
swp_entry_t swp = folio->swap;
pgoff_t offset = swp_offset(swp);
struct page *page = &folio->page;
+ bool swapcache = folio_test_swapcache(folio);
struct zswap_tree *tree = swap_zswap_tree(swp);
struct zswap_entry *entry;
u8 *dst;
@@ -1648,7 +1649,20 @@ bool zswap_load(struct folio *folio)
spin_unlock(&tree->lock);
return false;
}
- zswap_rb_erase(&tree->rbroot, entry);
+ /*
+ * When reading into the swapcache, invalidate our entry. The
+ * swapcache can be the authoritative owner of the page and
+ * its mappings, and the pressure that results from having two
+ * in-memory copies outweighs any benefits of caching the
+ * compression work.
+ *
+ * (Most swapins go through the swapcache. The notable
+ * exception is the singleton fault on SWP_SYNCHRONOUS_IO
+ * files, which reads into a private page and may free it if
+ * the fault fails. We remain the primary owner of the entry.)
+ */
+ if (swapcache)
+ zswap_rb_erase(&tree->rbroot, entry);
spin_unlock(&tree->lock);
if (entry->length)
@@ -1663,9 +1677,10 @@ bool zswap_load(struct folio *folio)
if (entry->objcg)
count_objcg_event(entry->objcg, ZSWPIN);
- zswap_entry_free(entry);
-
- folio_mark_dirty(folio);
+ if (swapcache) {
+ zswap_entry_free(entry);
+ folio_mark_dirty(folio);
+ }
return true;
}
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
mm-zswap-optimize-zswap-pool-size-tracking.patch
mm-zpool-return-pool-size-in-pages.patch
mm-page_alloc-remove-pcppage-migratetype-caching.patch
mm-page_alloc-optimize-free_unref_folios.patch
mm-page_alloc-fix-up-block-types-when-merging-compatible-blocks.patch
mm-page_alloc-move-free-pages-when-converting-block-during-isolation.patch
mm-page_alloc-fix-move_freepages_block-range-error.patch
mm-page_alloc-fix-freelist-movement-during-block-conversion.patch
mm-page_alloc-close-migratetype-race-between-freeing-and-stealing.patch
mm-page_isolation-prepare-for-hygienic-freelists.patch
mm-page_isolation-prepare-for-hygienic-freelists-fix.patch
mm-page_alloc-consolidate-free-page-accounting.patch
The quilt patch titled
Subject: selftests/mm: fix ARM related issue with fork after pthread_create
has been removed from the -mm tree. Its filename was
selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Edward Liaw <edliaw(a)google.com>
Subject: selftests/mm: fix ARM related issue with fork after pthread_create
Date: Mon, 25 Mar 2024 19:40:52 +0000
Following issue was observed while running the uffd-unit-tests selftest
on ARM devices. On x86_64 no issues were detected:
pthread_create followed by fork caused deadlock in certain cases wherein
fork required some work to be completed by the created thread. Used
synchronization to ensure that created thread's start function has started
before invoking fork.
[edliaw(a)google.com: refactored to use atomic_bool]
Link: https://lkml.kernel.org/r/20240325194100.775052-1-edliaw@google.com
Fixes: 760aee0b71e3 ("selftests/mm: add tests for RO pinning vs fork()")
Signed-off-by: Lokesh Gidra <lokeshgidra(a)google.com>
Signed-off-by: Edward Liaw <edliaw(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/uffd-common.c | 3 +++
tools/testing/selftests/mm/uffd-common.h | 2 ++
tools/testing/selftests/mm/uffd-unit-tests.c | 10 ++++++++++
3 files changed, 15 insertions(+)
--- a/tools/testing/selftests/mm/uffd-common.c~selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create
+++ a/tools/testing/selftests/mm/uffd-common.c
@@ -18,6 +18,7 @@ bool test_uffdio_wp = true;
unsigned long long *count_verify;
uffd_test_ops_t *uffd_test_ops;
uffd_test_case_ops_t *uffd_test_case_ops;
+atomic_bool ready_for_fork;
static int uffd_mem_fd_create(off_t mem_size, bool hugetlb)
{
@@ -518,6 +519,8 @@ void *uffd_poll_thread(void *arg)
pollfd[1].fd = pipefd[cpu*2];
pollfd[1].events = POLLIN;
+ ready_for_fork = true;
+
for (;;) {
ret = poll(pollfd, 2, -1);
if (ret <= 0) {
--- a/tools/testing/selftests/mm/uffd-common.h~selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create
+++ a/tools/testing/selftests/mm/uffd-common.h
@@ -32,6 +32,7 @@
#include <inttypes.h>
#include <stdint.h>
#include <sys/random.h>
+#include <stdatomic.h>
#include "../kselftest.h"
#include "vm_util.h"
@@ -103,6 +104,7 @@ extern bool map_shared;
extern bool test_uffdio_wp;
extern unsigned long long *count_verify;
extern volatile bool test_uffdio_copy_eexist;
+extern atomic_bool ready_for_fork;
extern uffd_test_ops_t anon_uffd_test_ops;
extern uffd_test_ops_t shmem_uffd_test_ops;
--- a/tools/testing/selftests/mm/uffd-unit-tests.c~selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create
+++ a/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -775,6 +775,8 @@ static void uffd_sigbus_test_common(bool
char c;
struct uffd_args args = { 0 };
+ ready_for_fork = false;
+
fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
if (uffd_register(uffd, area_dst, nr_pages * page_size,
@@ -790,6 +792,9 @@ static void uffd_sigbus_test_common(bool
if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args))
err("uffd_poll_thread create");
+ while (!ready_for_fork)
+ ; /* Wait for the poll_thread to start executing before forking */
+
pid = fork();
if (pid < 0)
err("fork");
@@ -829,6 +834,8 @@ static void uffd_events_test_common(bool
char c;
struct uffd_args args = { 0 };
+ ready_for_fork = false;
+
fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
if (uffd_register(uffd, area_dst, nr_pages * page_size,
true, wp, false))
@@ -838,6 +845,9 @@ static void uffd_events_test_common(bool
if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args))
err("uffd_poll_thread create");
+ while (!ready_for_fork)
+ ; /* Wait for the poll_thread to start executing before forking */
+
pid = fork();
if (pid < 0)
err("fork");
_
Patches currently in -mm which might be from edliaw(a)google.com are
The quilt patch titled
Subject: hexagon: vmlinux.lds.S: handle attributes section
has been removed from the -mm tree. Its filename was
hexagon-vmlinuxldss-handle-attributes-section.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Nathan Chancellor <nathan(a)kernel.org>
Subject: hexagon: vmlinux.lds.S: handle attributes section
Date: Tue, 19 Mar 2024 17:37:46 -0700
After the linked LLVM change, the build fails with
CONFIG_LD_ORPHAN_WARN_LEVEL="error", which happens with allmodconfig:
ld.lld: error: vmlinux.a(init/main.o):(.hexagon.attributes) is being placed in '.hexagon.attributes'
Handle the attributes section in a similar manner as arm and riscv by
adding it after the primary ELF_DETAILS grouping in vmlinux.lds.S, which
fixes the error.
Link: https://lkml.kernel.org/r/20240319-hexagon-handle-attributes-section-vmlinu…
Fixes: 113616ec5b64 ("hexagon: select ARCH_WANT_LD_ORPHAN_WARN")
Link: https://github.com/llvm/llvm-project/commit/31f4b329c8234fab9afa59494d7f8bd…
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Reviewed-by: Brian Cain <bcain(a)quicinc.com>
Cc: Bill Wendling <morbo(a)google.com>
Cc: Justin Stitt <justinstitt(a)google.com>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/hexagon/kernel/vmlinux.lds.S | 1 +
1 file changed, 1 insertion(+)
--- a/arch/hexagon/kernel/vmlinux.lds.S~hexagon-vmlinuxldss-handle-attributes-section
+++ a/arch/hexagon/kernel/vmlinux.lds.S
@@ -63,6 +63,7 @@ SECTIONS
STABS_DEBUG
DWARF_DEBUG
ELF_DETAILS
+ .hexagon.attributes 0 : { *(.hexagon.attributes) }
DISCARDS
}
_
Patches currently in -mm which might be from nathan(a)kernel.org are
The quilt patch titled
Subject: selftests/mm: sigbus-wp test requires UFFD_FEATURE_WP_HUGETLBFS_SHMEM
has been removed from the -mm tree. Its filename was
selftests-mm-sigbus-wp-test-requires-uffd_feature_wp_hugetlbfs_shmem.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Edward Liaw <edliaw(a)google.com>
Subject: selftests/mm: sigbus-wp test requires UFFD_FEATURE_WP_HUGETLBFS_SHMEM
Date: Thu, 21 Mar 2024 23:20:21 +0000
The sigbus-wp test requires the UFFD_FEATURE_WP_HUGETLBFS_SHMEM flag for
shmem and hugetlb targets. Otherwise it is not backwards compatible with
kernels <5.19 and fails with EINVAL.
Link: https://lkml.kernel.org/r/20240321232023.2064975-1-edliaw@google.com
Fixes: 73c1ea939b65 ("selftests/mm: move uffd sig/events tests into uffd unit tests")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/mm/uffd-unit-tests.c~selftests-mm-sigbus-wp-test-requires-uffd_feature_wp_hugetlbfs_shmem
+++ a/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -1427,7 +1427,8 @@ uffd_test_case_t uffd_tests[] = {
.uffd_fn = uffd_sigbus_wp_test,
.mem_targets = MEM_ALL,
.uffd_feature_required = UFFD_FEATURE_SIGBUS |
- UFFD_FEATURE_EVENT_FORK | UFFD_FEATURE_PAGEFAULT_FLAG_WP,
+ UFFD_FEATURE_EVENT_FORK | UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
},
{
.name = "events",
_
Patches currently in -mm which might be from edliaw(a)google.com are
The quilt patch titled
Subject: ARM: prctl: reject PR_SET_MDWE on pre-ARMv6
has been removed from the -mm tree. Its filename was
arm-prctl-reject-pr_set_mdwe-on-pre-armv6.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Zev Weiss <zev(a)bewilderbeest.net>
Subject: ARM: prctl: reject PR_SET_MDWE on pre-ARMv6
Date: Mon, 26 Feb 2024 17:35:42 -0800
On v5 and lower CPUs we can't provide MDWE protection, so ensure we fail
any attempt to enable it via prctl(PR_SET_MDWE).
Previously such an attempt would misleadingly succeed, leading to any
subsequent mmap(PROT_READ|PROT_WRITE) or execve() failing unconditionally
(the latter somewhat violently via force_fatal_sig(SIGSEGV) due to
READ_IMPLIES_EXEC).
Link: https://lkml.kernel.org/r/20240227013546.15769-6-zev@bewilderbeest.net
Signed-off-by: Zev Weiss <zev(a)bewilderbeest.net>
Cc: <stable(a)vger.kernel.org> [6.3+]
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Florent Revest <revest(a)chromium.org>
Cc: Helge Deller <deller(a)gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Miguel Ojeda <ojeda(a)kernel.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Ondrej Mosnacek <omosnace(a)redhat.com>
Cc: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Cc: Russell King (Oracle) <linux(a)armlinux.org.uk>
Cc: Sam James <sam(a)gentoo.org>
Cc: Stefan Roesch <shr(a)devkernel.io>
Cc: Yang Shi <yang(a)os.amperecomputing.com>
Cc: Yin Fengwei <fengwei.yin(a)intel.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm/include/asm/mman.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- /dev/null
+++ a/arch/arm/include/asm/mman.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_MMAN_H__
+#define __ASM_MMAN_H__
+
+#include <asm/system_info.h>
+#include <uapi/asm/mman.h>
+
+static inline bool arch_memory_deny_write_exec_supported(void)
+{
+ return cpu_architecture() >= CPU_ARCH_ARMv6;
+}
+#define arch_memory_deny_write_exec_supported arch_memory_deny_write_exec_supported
+
+#endif /* __ASM_MMAN_H__ */
_
Patches currently in -mm which might be from zev(a)bewilderbeest.net are
The quilt patch titled
Subject: prctl: generalize PR_SET_MDWE support check to be per-arch
has been removed from the -mm tree. Its filename was
prctl-generalize-pr_set_mdwe-support-check-to-be-per-arch.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Zev Weiss <zev(a)bewilderbeest.net>
Subject: prctl: generalize PR_SET_MDWE support check to be per-arch
Date: Mon, 26 Feb 2024 17:35:41 -0800
Patch series "ARM: prctl: Reject PR_SET_MDWE where not supported".
I noticed after a recent kernel update that my ARM926 system started
segfaulting on any execve() after calling prctl(PR_SET_MDWE). After some
investigation it appears that ARMv5 is incapable of providing the
appropriate protections for MDWE, since any readable memory is also
implicitly executable.
The prctl_set_mdwe() function already had some special-case logic added
disabling it on PARISC (commit 793838138c15, "prctl: Disable
prctl(PR_SET_MDWE) on parisc"); this patch series (1) generalizes that
check to use an arch_*() function, and (2) adds a corresponding override
for ARM to disable MDWE on pre-ARMv6 CPUs.
With the series applied, prctl(PR_SET_MDWE) is rejected on ARMv5 and
subsequent execve() calls (as well as mmap(PROT_READ|PROT_WRITE)) can
succeed instead of unconditionally failing; on ARMv6 the prctl works as it
did previously.
[0] https://lore.kernel.org/all/2023112456-linked-nape-bf19@gregkh/
This patch (of 2):
There exist systems other than PARISC where MDWE may not be feasible to
support; rather than cluttering up the generic code with additional
arch-specific logic let's add a generic function for checking MDWE support
and allow each arch to override it as needed.
Link: https://lkml.kernel.org/r/20240227013546.15769-4-zev@bewilderbeest.net
Link: https://lkml.kernel.org/r/20240227013546.15769-5-zev@bewilderbeest.net
Signed-off-by: Zev Weiss <zev(a)bewilderbeest.net>
Acked-by: Helge Deller <deller(a)gmx.de> [parisc]
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Florent Revest <revest(a)chromium.org>
Cc: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Miguel Ojeda <ojeda(a)kernel.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Ondrej Mosnacek <omosnace(a)redhat.com>
Cc: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Cc: Russell King (Oracle) <linux(a)armlinux.org.uk>
Cc: Sam James <sam(a)gentoo.org>
Cc: Stefan Roesch <shr(a)devkernel.io>
Cc: Yang Shi <yang(a)os.amperecomputing.com>
Cc: Yin Fengwei <fengwei.yin(a)intel.com>
Cc: <stable(a)vger.kernel.org> [6.3+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/parisc/include/asm/mman.h | 14 ++++++++++++++
include/linux/mman.h | 8 ++++++++
kernel/sys.c | 7 +++++--
3 files changed, 27 insertions(+), 2 deletions(-)
--- /dev/null
+++ a/arch/parisc/include/asm/mman.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_MMAN_H__
+#define __ASM_MMAN_H__
+
+#include <uapi/asm/mman.h>
+
+/* PARISC cannot allow mdwe as it needs writable stacks */
+static inline bool arch_memory_deny_write_exec_supported(void)
+{
+ return false;
+}
+#define arch_memory_deny_write_exec_supported arch_memory_deny_write_exec_supported
+
+#endif /* __ASM_MMAN_H__ */
--- a/include/linux/mman.h~prctl-generalize-pr_set_mdwe-support-check-to-be-per-arch
+++ a/include/linux/mman.h
@@ -162,6 +162,14 @@ calc_vm_flag_bits(unsigned long flags)
unsigned long vm_commit_limit(void);
+#ifndef arch_memory_deny_write_exec_supported
+static inline bool arch_memory_deny_write_exec_supported(void)
+{
+ return true;
+}
+#define arch_memory_deny_write_exec_supported arch_memory_deny_write_exec_supported
+#endif
+
/*
* Denies creating a writable executable mapping or gaining executable permissions.
*
--- a/kernel/sys.c~prctl-generalize-pr_set_mdwe-support-check-to-be-per-arch
+++ a/kernel/sys.c
@@ -2408,8 +2408,11 @@ static inline int prctl_set_mdwe(unsigne
if (bits & PR_MDWE_NO_INHERIT && !(bits & PR_MDWE_REFUSE_EXEC_GAIN))
return -EINVAL;
- /* PARISC cannot allow mdwe as it needs writable stacks */
- if (IS_ENABLED(CONFIG_PARISC))
+ /*
+ * EOPNOTSUPP might be more appropriate here in principle, but
+ * existing userspace depends on EINVAL specifically.
+ */
+ if (!arch_memory_deny_write_exec_supported())
return -EINVAL;
current_bits = get_current_mdwe();
_
Patches currently in -mm which might be from zev(a)bewilderbeest.net are
The quilt patch titled
Subject: mm: cachestat: fix two shmem bugs
has been removed from the -mm tree. Its filename was
mm-cachestat-fix-two-shmem-bugs.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: cachestat: fix two shmem bugs
Date: Fri, 15 Mar 2024 05:55:56 -0400
When cachestat on shmem races with swapping and invalidation, there
are two possible bugs:
1) A swapin error can have resulted in a poisoned swap entry in the
shmem inode's xarray. Calling get_shadow_from_swap_cache() on it
will result in an out-of-bounds access to swapper_spaces[].
Validate the entry with non_swap_entry() before going further.
2) When we find a valid swap entry in the shmem's inode, the shadow
entry in the swapcache might not exist yet: swap IO is still in
progress and we're before __remove_mapping; swapin, invalidation,
or swapoff have removed the shadow from swapcache after we saw the
shmem swap entry.
This will send a NULL to workingset_test_recent(). The latter
purely operates on pointer bits, so it won't crash - node 0, memcg
ID 0, eviction timestamp 0, etc. are all valid inputs - but it's a
bogus test. In theory that could result in a false "recently
evicted" count.
Such a false positive wouldn't be the end of the world. But for
code clarity and (future) robustness, be explicit about this case.
Bail on get_shadow_from_swap_cache() returning NULL.
Link: https://lkml.kernel.org/r/20240315095556.GC581298@cmpxchg.org
Fixes: cf264e1329fb ("cachestat: implement cachestat syscall")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Chengming Zhou <chengming.zhou(a)linux.dev> [Bug #1]
Reported-by: Jann Horn <jannh(a)google.com> [Bug #2]
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Reviewed-by: Nhat Pham <nphamcs(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [v6.5+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/filemap.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/mm/filemap.c~mm-cachestat-fix-two-shmem-bugs
+++ a/mm/filemap.c
@@ -4197,7 +4197,23 @@ static void filemap_cachestat(struct add
/* shmem file - in swap cache */
swp_entry_t swp = radix_to_swp_entry(folio);
+ /* swapin error results in poisoned entry */
+ if (non_swap_entry(swp))
+ goto resched;
+
+ /*
+ * Getting a swap entry from the shmem
+ * inode means we beat
+ * shmem_unuse(). rcu_read_lock()
+ * ensures swapoff waits for us before
+ * freeing the swapper space. However,
+ * we can race with swapping and
+ * invalidation, so there might not be
+ * a shadow in the swapcache (yet).
+ */
shadow = get_shadow_from_swap_cache(swp);
+ if (!shadow)
+ goto resched;
}
#endif
if (workingset_test_recent(shadow, true, &workingset))
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
mm-zswap-optimize-zswap-pool-size-tracking.patch
mm-zpool-return-pool-size-in-pages.patch
mm-page_alloc-remove-pcppage-migratetype-caching.patch
mm-page_alloc-optimize-free_unref_folios.patch
mm-page_alloc-fix-up-block-types-when-merging-compatible-blocks.patch
mm-page_alloc-move-free-pages-when-converting-block-during-isolation.patch
mm-page_alloc-fix-move_freepages_block-range-error.patch
mm-page_alloc-fix-freelist-movement-during-block-conversion.patch
mm-page_alloc-close-migratetype-race-between-freeing-and-stealing.patch
mm-page_isolation-prepare-for-hygienic-freelists.patch
mm-page_isolation-prepare-for-hygienic-freelists-fix.patch
mm-page_alloc-consolidate-free-page-accounting.patch
The quilt patch titled
Subject: init: open /initrd.image with O_LARGEFILE
has been removed from the -mm tree. Its filename was
init-open-initrdimage-with-o_largefile.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: John Sperbeck <jsperbeck(a)google.com>
Subject: init: open /initrd.image with O_LARGEFILE
Date: Sun, 17 Mar 2024 15:15:22 -0700
If initrd data is larger than 2Gb, we'll eventually fail to write to the
/initrd.image file when we hit that limit, unless O_LARGEFILE is set.
Link: https://lkml.kernel.org/r/20240317221522.896040-1-jsperbeck@google.com
Signed-off-by: John Sperbeck <jsperbeck(a)google.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
init/initramfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/init/initramfs.c~init-open-initrdimage-with-o_largefile
+++ a/init/initramfs.c
@@ -682,7 +682,7 @@ static void __init populate_initrd_image
printk(KERN_INFO "rootfs image is not initramfs (%s); looks like an initrd\n",
err);
- file = filp_open("/initrd.image", O_WRONLY | O_CREAT, 0700);
+ file = filp_open("/initrd.image", O_WRONLY|O_CREAT|O_LARGEFILE, 0700);
if (IS_ERR(file))
return;
_
Patches currently in -mm which might be from jsperbeck(a)google.com are
folio_is_secretmem() currently relies on secretmem folios being LRU folios,
to save some cycles.
However, folios might reside in a folio batch without the LRU flag set, or
temporarily have their LRU flag cleared. Consequently, the LRU flag is
unreliable for this purpose.
In particular, this is the case when secretmem_fault() allocates a
fresh page and calls filemap_add_folio()->folio_add_lru(). The folio might
be added to the per-cpu folio batch and won't get the LRU flag set until
the batch was drained using e.g., lru_add_drain().
Consequently, folio_is_secretmem() might not detect secretmem folios
and GUP-fast can succeed in grabbing a secretmem folio, crashing the
kernel when we would later try reading/writing to the folio, because
the folio has been unmapped from the directmap.
Fix it by removing that unreliable check.
Reported-by: xingwei lee <xrivendell7(a)gmail.com>
Reported-by: yue sun <samsun1006219(a)gmail.com>
Closes: https://lore.kernel.org/lkml/CABOYnLyevJeravW=QrH0JUPYEcDN160aZFb7kwndm-J2r…
Debugged-by: Miklos Szeredi <miklos(a)szeredi.hu>
Tested-by: Miklos Szeredi <mszeredi(a)redhat.com>
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
include/linux/secretmem.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 35f3a4a8ceb1..acf7e1a3f3de 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -13,10 +13,10 @@ static inline bool folio_is_secretmem(struct folio *folio)
/*
* Using folio_mapping() is quite slow because of the actual call
* instruction.
- * We know that secretmem pages are not compound and LRU so we can
+ * We know that secretmem pages are not compound, so we can
* save a couple of cycles here.
*/
- if (folio_test_large(folio) || !folio_test_lru(folio))
+ if (folio_test_large(folio))
return false;
mapping = (struct address_space *)
--
2.43.2
The index in the loop has already been added one and is equal to the
number of PDOs to be updated when leaving the loop.
Fixes: cd099cde4ed2 ("usb: typec: tcpm: Support multiple capabilities")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kyle Tso <kyletso(a)google.com>
---
drivers/usb/typec/tcpm/tcpm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index 3d505614bff1..566dad0cb9d3 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -6852,14 +6852,14 @@ static int tcpm_pd_set(struct typec_port *p, struct usb_power_delivery *pd)
if (data->sink_desc.pdo[0]) {
for (i = 0; i < PDO_MAX_OBJECTS && data->sink_desc.pdo[i]; i++)
port->snk_pdo[i] = data->sink_desc.pdo[i];
- port->nr_snk_pdo = i + 1;
+ port->nr_snk_pdo = i;
port->operating_snk_mw = data->operating_snk_mw;
}
if (data->source_desc.pdo[0]) {
for (i = 0; i < PDO_MAX_OBJECTS && data->source_desc.pdo[i]; i++)
port->snk_pdo[i] = data->source_desc.pdo[i];
- port->nr_src_pdo = i + 1;
+ port->nr_src_pdo = i;
}
switch (port->state) {
--
2.44.0.291.gc1ea87d7ee-goog
Hi,
The current version of delay reporting code can report incorrect
values when paired with a firmware which enables this feature.
Unfortunately there are several smaller issues that needed to be addressed
to correct the behavior:
Wrong information was used for the host side of counter
For MTL/LNL used incorrect (in a sense that it was verified only on MTL)
link side counter function.
The link side counter needs compensation logic if pause/resume is used.
The offset values were not refreshed from firmware.
Finally, not strictly connected, but the ALSA buffer size needs to be
constrained to avoid constant xrun from media players (like mpv)
The series applies cleanly for 6.9 and 6.8.y stable, but older stable
would need manual backport, but it is questionable if it is needed as
MTL/LNL is missing features.
Mark, can you pick these patches for the 6.9 cycle as fixes?
Regards,
Peter
---
Peter Ujfalusi (17):
ASoC: SOF: Add dsp_max_burst_size_in_ms member to snd_sof_pcm_stream
ASoC: SOF: ipc4-topology: Save the DMA maximum burst size for PCMs
ASoC: SOF: Intel: hda-pcm: Use dsp_max_burst_size_in_ms to place
constraint
ASoC: SOF: Intel: hda: Implement get_stream_position (Linear Link
Position)
ASoC: SOF: Intel: mtl/lnl: Use the generic get_stream_position
callback
ASoC: SOF: Introduce a new callback pair to be used for PCM delay
reporting
ASoC: SOF: Intel: Set the dai/host get frame/byte counter callbacks
ASoC: SOF: ipc4-pcm: Use the snd_sof_pcm_get_dai_frame_counter() for
pcm_delay
ASoC: SOF: Intel: hda-common-ops: Do not set the get_stream_position
callback
ASoC: SOF: Remove the get_stream_position callback
ASoC: SOF: ipc4-pcm: Move struct sof_ipc4_timestamp_info definition
locally
ASoC: SOF: ipc4-pcm: Combine the SOF_IPC4_PIPE_PAUSED cases in
pcm_trigger
ASoC: SOF: ipc4-pcm: Invalidate the stream_start_offset in PAUSED
state
ASoC: SOF: sof-pcm: Add pointer callback to sof_ipc_pcm_ops
ASoC: SOF: ipc4-pcm: Correct the delay calculation
ALSA: hda: Add pplcllpl/u members to hdac_ext_stream
ASoC: SOF: Intel: hda: Compensate LLP in case it is not reset
include/sound/hdaudio_ext.h | 3 +
sound/soc/sof/intel/hda-common-ops.c | 3 +
sound/soc/sof/intel/hda-dai-ops.c | 11 ++
sound/soc/sof/intel/hda-pcm.c | 29 ++++
sound/soc/sof/intel/hda-stream.c | 70 ++++++++++
sound/soc/sof/intel/hda.h | 6 +
sound/soc/sof/intel/lnl.c | 2 -
sound/soc/sof/intel/mtl.c | 14 --
sound/soc/sof/intel/mtl.h | 10 --
sound/soc/sof/ipc4-pcm.c | 191 ++++++++++++++++++++++-----
sound/soc/sof/ipc4-priv.h | 14 --
sound/soc/sof/ipc4-topology.c | 22 ++-
sound/soc/sof/ops.h | 24 +++-
sound/soc/sof/pcm.c | 8 ++
sound/soc/sof/sof-audio.h | 9 +-
sound/soc/sof/sof-priv.h | 24 +++-
16 files changed, 351 insertions(+), 89 deletions(-)
--
2.44.0
folio_is_secretmem() states that secretmem folios cannot be LRU folios:
so we may only exit early if we find an LRU folio. Yet, we exit early if
we find a folio that is not a secretmem folio.
Consequently, folio_is_secretmem() fails to detect secretmem folios and,
therefore, we can succeed in grabbing a secretmem folio during GUP-fast,
crashing the kernel when we later try reading/writing to the folio, because
the folio has been unmapped from the directmap.
Reported-by: xingwei lee <xrivendell7(a)gmail.com>
Reported-by: yue sun <samsun1006219(a)gmail.com>
Closes: https://lore.kernel.org/lkml/CABOYnLyevJeravW=QrH0JUPYEcDN160aZFb7kwndm-J2r…
Debugged-by: Miklos Szeredi <miklos(a)szeredi.hu>
Reviewed-by: Mike Rapoport (IBM) <rppt(a)kernel.org>
Tested-by: Miklos Szeredi <mszeredi(a)redhat.com>
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
include/linux/secretmem.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 35f3a4a8ceb1..6996f1f53f14 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -16,7 +16,7 @@ static inline bool folio_is_secretmem(struct folio *folio)
* We know that secretmem pages are not compound and LRU so we can
* save a couple of cycles here.
*/
- if (folio_test_large(folio) || !folio_test_lru(folio))
+ if (folio_test_large(folio) || folio_test_lru(folio))
return false;
mapping = (struct address_space *)
--
2.43.2
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Calling i915_gem_object_get_dma_address() from the vblank
evade critical section triggers might_sleep().
While we know that we've already pinned the framebuffer
and thus i915_gem_object_get_dma_address() will in fact
not sleep in this case, it seems reasonable to keep the
unconditional might_sleep() for maximum coverage.
So let's instead pre-populate the dma address during
fb pinning, which all happens before we enter the
vblank evade critical section.
We can use u32 for the dma address as this class of
hardware doesn't support >32bit addresses.
Cc: stable(a)vger.kernel.org
Fixes: 0225a90981c8 ("drm/i915: Make cursor plane registers unlocked")
Link: https://lore.kernel.org/intel-gfx/20240227100342.GAZd2zfmYcPS_SndtO@fat_cra…
Reported-by: Borislav Petkov <bp(a)alien8.de>
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_cursor.c | 4 +---
drivers/gpu/drm/i915/display/intel_display_types.h | 1 +
drivers/gpu/drm/i915/display/intel_fb_pin.c | 10 ++++++++++
3 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_cursor.c b/drivers/gpu/drm/i915/display/intel_cursor.c
index f8b33999d43f..0d3da55e1c24 100644
--- a/drivers/gpu/drm/i915/display/intel_cursor.c
+++ b/drivers/gpu/drm/i915/display/intel_cursor.c
@@ -36,12 +36,10 @@ static u32 intel_cursor_base(const struct intel_plane_state *plane_state)
{
struct drm_i915_private *dev_priv =
to_i915(plane_state->uapi.plane->dev);
- const struct drm_framebuffer *fb = plane_state->hw.fb;
- struct drm_i915_gem_object *obj = intel_fb_obj(fb);
u32 base;
if (DISPLAY_INFO(dev_priv)->cursor_needs_physical)
- base = i915_gem_object_get_dma_address(obj, 0);
+ base = plane_state->phys_dma_addr;
else
base = intel_plane_ggtt_offset(plane_state);
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h
index 8a35fb6b2ade..68f26a33870b 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -728,6 +728,7 @@ struct intel_plane_state {
#define PLANE_HAS_FENCE BIT(0)
struct intel_fb_view view;
+ u32 phys_dma_addr; /* for cursor_needs_physical */
/* Plane pxp decryption state */
bool decrypt;
diff --git a/drivers/gpu/drm/i915/display/intel_fb_pin.c b/drivers/gpu/drm/i915/display/intel_fb_pin.c
index 7b42aef37d2f..b6df9baf481b 100644
--- a/drivers/gpu/drm/i915/display/intel_fb_pin.c
+++ b/drivers/gpu/drm/i915/display/intel_fb_pin.c
@@ -255,6 +255,16 @@ int intel_plane_pin_fb(struct intel_plane_state *plane_state)
return PTR_ERR(vma);
plane_state->ggtt_vma = vma;
+
+ /*
+ * Pre-populate the dma address before we enter the vblank
+ * evade critical section as i915_gem_object_get_dma_address()
+ * will trigger might_sleep() even if it won't actually sleep,
+ * which is the case when the fb has already been pinned.
+ */
+ if (phys_cursor)
+ plane_state->phys_dma_addr =
+ i915_gem_object_get_dma_address(intel_fb_obj(fb), 0);
} else {
struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb);
--
2.43.2
Fix several issues discovered while debugging UCSI implementation on
Qualcomm platforms (ucsi_glink). With these patches I was able to
get a working Type-C port managament implementation. Tested on SC8280XP
(Lenovo X13s laptop) and SM8350-HDK.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
---
Dmitry Baryshkov (7):
usb: typec: ucsi: fix race condition in connection change ACK'ing
usb: typec: ucsi: acknowledge the UCSI_CCI_NOT_SUPPORTED
usb: typec: ucsi: make ACK_CC_CI rules more obvious
usb: typec: ucsi: allow non-partner GET_PDOS for Qualcomm devices
usb: typec: ucsi: limit the UCSI_NO_PARTNER_PDOS even further
usb: typec: ucsi: properly register partner's PD device
soc: qcom: pmic_glink: reenable UCSI on sc8280xp
drivers/soc/qcom/pmic_glink.c | 1 -
drivers/usb/typec/ucsi/ucsi.c | 51 +++++++++++++++++++++++++++++++++----------
2 files changed, 39 insertions(+), 13 deletions(-)
---
base-commit: ea86f16f605361af3779af0e0b4be18614282091
change-id: 20240312-qcom-ucsi-fixes-6578d236b60b
Best regards,
--
Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduces a workqueue to release the consumer and supplier devices used
in the devlink.
In the job queued, devices are release and in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.
Nothing is present to provide some synchronisation with this workqueue
in order to ensure that all ongoing releasing operations are done and
so, some other operations can be started safely.
For instance, in the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()
During the step 1, devices are released and related devlinks are removed
(jobs pushed in the workqueue).
During the step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to missing of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2
Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.
Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
Tested-by: Luca Ceresoli <luca.ceresoli(a)bootlin.com>
Reviewed-by: Nuno Sa <nuno.sa(a)analog.com>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
---
drivers/base/core.c | 26 +++++++++++++++++++++++---
include/linux/device.h | 1 +
2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 7e3af0ad770a..f2242aadffb0 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
static void __fw_devlink_link_to_consumers(struct device *dev);
static bool fw_devlink_drv_reg_done;
static bool fw_devlink_best_effort;
+static struct workqueue_struct *device_link_wq;
/**
* __fwnode_link_add - Create a link between two fwnode_handles.
@@ -533,12 +534,26 @@ static void devlink_dev_release(struct device *dev)
/*
* It may take a while to complete this work because of the SRCU
* synchronization in device_link_release_fn() and if the consumer or
- * supplier devices get deleted when it runs, so put it into the "long"
- * workqueue.
+ * supplier devices get deleted when it runs, so put it into the
+ * dedicated workqueue.
*/
- queue_work(system_long_wq, &link->rm_work);
+ queue_work(device_link_wq, &link->rm_work);
}
+/**
+ * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
+ */
+void device_link_wait_removal(void)
+{
+ /*
+ * devlink removal jobs are queued in the dedicated work queue.
+ * To be sure that all removal jobs are terminated, ensure that any
+ * scheduled work has run to completion.
+ */
+ flush_workqueue(device_link_wq);
+}
+EXPORT_SYMBOL_GPL(device_link_wait_removal);
+
static struct class devlink_class = {
.name = "devlink",
.dev_groups = devlink_groups,
@@ -4165,9 +4180,14 @@ int __init devices_init(void)
sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
if (!sysfs_dev_char_kobj)
goto char_kobj_err;
+ device_link_wq = alloc_workqueue("device_link_wq", 0, 0);
+ if (!device_link_wq)
+ goto wq_err;
return 0;
+ wq_err:
+ kobject_put(sysfs_dev_char_kobj);
char_kobj_err:
kobject_put(sysfs_dev_block_kobj);
block_kobj_err:
diff --git a/include/linux/device.h b/include/linux/device.h
index 1795121dee9a..d7d8305a72e8 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1249,6 +1249,7 @@ void device_link_del(struct device_link *link);
void device_link_remove(void *consumer, struct device *supplier);
void device_links_supplier_sync_state_pause(void);
void device_links_supplier_sync_state_resume(void);
+void device_link_wait_removal(void);
/* Create alias, so I can be autoloaded. */
#define MODULE_ALIAS_CHARDEV(major,minor) \
--
2.44.0
Commit fc663711b944 ("scsi: core: Remove the /proc/scsi/${proc_name} directory
earlier") fixed a bug related to modules loading/unloading, by adding a
call to scsi_proc_hostdir_rm() on scsi_remove_host(). But that led to a
potential duplicate call to the hostdir_rm() routine, since it's also
called from scsi_host_dev_release(). That triggered a regression report,
which was then fixed by commit be03df3d4bfe ("scsi: core: Fix a procfs host
directory removal regression"). The fix just dropped the hostdir_rm() call
from dev_release().
But happens that this proc directory is created on scsi_host_alloc(),
and that function "pairs" with scsi_host_dev_release(), while
scsi_remove_host() pairs with scsi_add_host(). In other words, it seems
the reason for removing the proc directory on dev_release() was meant to
cover cases in which a SCSI host structure was allocated, but the call
to scsi_add_host() didn't happen. And that pattern happens to exist in
some error paths, for example.
Syzkaller causes that by using USB raw gadget device, error'ing on usb-storage
driver, at usb_stor_probe2(). By checking that path, we can see that the
BadDevice label leads to a scsi_host_put() after a SCSI host allocation, but
there's no call to scsi_add_host() in such path. That leads to messages like
this in dmesg (and a leak of the SCSI host proc structure):
usb-storage 4-1:87.51: USB Mass Storage device detected
proc_dir_entry 'scsi/usb-storage' already registered
WARNING: CPU: 1 PID: 3519 at fs/proc/generic.c:377 proc_register+0x347/0x4e0 fs/proc/generic.c:376
The proper fix seems to still call scsi_proc_hostdir_rm() on dev_release(),
but guard that with the state check for SHOST_CREATED; there is even a
comment in scsi_host_dev_release() detailing that: such conditional is
meant for cases where the SCSI host was allocated but there was no calls
to {add,remove}_host(), like the usb-storage case.
This is what we propose here and with that, the error path of usb-storage
does not trigger the warning anymore.
Reported-by: syzbot+c645abf505ed21f931b5(a)syzkaller.appspotmail.com
Fixes: be03df3d4bfe ("scsi: core: Fix a procfs host directory removal regression")
Cc: stable(a)vger.kernel.org
Cc: Bart Van Assche <bvanassche(a)acm.org>
Cc: John Garry <john.g.garry(a)oracle.com>
Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
---
Hi folks, thanks in advance for reviews.
The issue with usb-storage happens upstream but despite we have a
repro in the aforementioned Syzkaller report, it's only easy to reproduce
the proc_dir_entry warning in a timely manner on stable right now.
The reason for that is commit 036abd614007 ("scsi: core: Introduce a new
list for SCSI proc directory entries") not being present on stable. This
commit (indirectly) bumps the ->present field from unsigned char to uint,
and the bug reproducer shows the warning whenever such field overflows,
hence it's way easier/faster to see this problem in stable, though it's
present in latest v6.8.0 too.
Cheers,
Guilherme
drivers/scsi/hosts.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index d7f51b84f3c7..445f4a220df3 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -353,12 +353,13 @@ static void scsi_host_dev_release(struct device *dev)
if (shost->shost_state == SHOST_CREATED) {
/*
- * Free the shost_dev device name here if scsi_host_alloc()
- * and scsi_host_put() have been called but neither
+ * Free the shost_dev device name and remove the proc host dir
+ * here if scsi_host_{alloc,put}() have been called but neither
* scsi_host_add() nor scsi_remove_host() has been called.
* This avoids that the memory allocated for the shost_dev
- * name is leaked.
+ * name as well as the proc dir structure are leaked.
*/
+ scsi_proc_hostdir_rm(shost->hostt);
kfree(dev_name(&shost->shost_dev));
}
--
2.43.0
sg_remove_sfp_usercontext() must not use sg_device_destroy() after
calling scsi_device_put().
sg_device_destroy() is accessling the device queue. Which will be set to
NULL if scsi_device_put() removes the last reference to the sg device.
Link: https://lore.kernel.org/r/20240305150509.23896-1-Alexander@wetzel-home.de
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Wetzel <Alexander(a)wetzel-home.de>
---
This is my best shot for a real fix of the issue.
I confirmed with printk's that I get the NULL pointer freeze ony when
scsi_device_put() is deleting the last reference to the device.
In the cases where it's not crashing there is still a reference left
after the call.
I don't see any obvious down side of simply swapping the calls.
The alternative would by my first patch, just without the WARN_ON.
Alexander
---
drivers/scsi/sg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 86210e4dd0d3..80e0d1981191 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -2232,8 +2232,8 @@ sg_remove_sfp_usercontext(struct work_struct *work)
"sg_remove_sfp: sfp=0x%p\n", sfp));
kfree(sfp);
- scsi_device_put(sdp->device);
kref_put(&sdp->d_ref, sg_device_destroy);
+ scsi_device_put(sdp->device);
module_put(THIS_MODULE);
}
--
2.44.0
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: c2ddeb29612f7ca84ed10c6d4f3ac99705135447
Gitweb: https://git.kernel.org/tip/c2ddeb29612f7ca84ed10c6d4f3ac99705135447
Author: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
AuthorDate: Mon, 25 Mar 2024 13:58:08 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Mon, 25 Mar 2024 23:45:21 +01:00
genirq: Introduce IRQF_COND_ONESHOT and use it in pinctrl-amd
There is a problem when a driver requests a shared interrupt line to run a
threaded handler on it without IRQF_ONESHOT set if that flag has been set
already for the IRQ in question by somebody else. Namely, the request
fails which usually leads to a probe failure even though the driver might
have worked just fine with IRQF_ONESHOT, but it does not want to use it by
default. Currently, the only way to handle this is to try to request the
IRQ without IRQF_ONESHOT, but with IRQF_PROBE_SHARED set and if this fails,
try again with IRQF_ONESHOT set. However, this is a bit cumbersome and not
very clean.
When commit 7a36b901a6eb ("ACPI: OSL: Use a threaded interrupt handler for
SCI") switched the ACPI subsystem over to using a threaded interrupt
handler for the SCI, it had to use IRQF_ONESHOT for it because that's
required due to the way the SCI handler works (it needs to walk all of the
enabled GPEs before the interrupt line can be unmasked). The SCI interrupt
line is not shared with other users very often due to the SCI handling
overhead, but on sone systems it is shared and when the other user of it
attempts to install a threaded handler, a flags mismatch related to
IRQF_ONESHOT may occur.
As it turned out, that happened to the pinctrl-amd driver and so commit
4451e8e8415e ("pinctrl: amd: Add IRQF_ONESHOT to the interrupt request")
attempted to address the issue by adding IRQF_ONESHOT to the interrupt
flags in that driver, but this is now causing an IRQF_ONESHOT-related
mismatch to occur on another system which cannot boot as a result of it.
Clearly, pinctrl-amd can work with IRQF_ONESHOT if need be, but it should
not set that flag by default, so it needs a way to indicate that to the
interrupt subsystem.
To that end, introdcuce a new interrupt flag, IRQF_COND_ONESHOT, which will
only have effect when the IRQ line is shared and IRQF_ONESHOT has been set
for it already, in which case it will be promoted to the latter.
This is sufficient for drivers sharing the interrupt line with the SCI as
it is requested by the ACPI subsystem before any drivers are probed, so
they will always see IRQF_ONESHOT set for the interrupt in question.
Fixes: 4451e8e8415e ("pinctrl: amd: Add IRQF_ONESHOT to the interrupt request")
Reported-by: Francisco Ayala Le Brun <francisco(a)videowindow.eu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Cc: 6.8+ <stable(a)vger.kernel.org> # 6.8+
Closes: https://lore.kernel.org/lkml/CAN-StX1HqWqi+YW=t+V52-38Mfp5fAz7YHx4aH-CQjgyN…
Link: https://lore.kernel.org/r/12417336.O9o76ZdvQC@kreacher
---
drivers/pinctrl/pinctrl-amd.c | 2 +-
include/linux/interrupt.h | 3 +++
kernel/irq/manage.c | 9 +++++++--
3 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 49f89b7..7f66ec7 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -1159,7 +1159,7 @@ static int amd_gpio_probe(struct platform_device *pdev)
}
ret = devm_request_irq(&pdev->dev, gpio_dev->irq, amd_gpio_irq_handler,
- IRQF_SHARED | IRQF_ONESHOT, KBUILD_MODNAME, gpio_dev);
+ IRQF_SHARED | IRQF_COND_ONESHOT, KBUILD_MODNAME, gpio_dev);
if (ret)
goto out2;
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 76121c2..5c9bdd3 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -67,6 +67,8 @@
* later.
* IRQF_NO_DEBUG - Exclude from runnaway detection for IPI and similar handlers,
* depends on IRQF_PERCPU.
+ * IRQF_COND_ONESHOT - Agree to do IRQF_ONESHOT if already set for a shared
+ * interrupt.
*/
#define IRQF_SHARED 0x00000080
#define IRQF_PROBE_SHARED 0x00000100
@@ -82,6 +84,7 @@
#define IRQF_COND_SUSPEND 0x00040000
#define IRQF_NO_AUTOEN 0x00080000
#define IRQF_NO_DEBUG 0x00100000
+#define IRQF_COND_ONESHOT 0x00200000
#define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index ad3eaf2..bf9ae8a 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1643,8 +1643,13 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
}
if (!((old->flags & new->flags) & IRQF_SHARED) ||
- (oldtype != (new->flags & IRQF_TRIGGER_MASK)) ||
- ((old->flags ^ new->flags) & IRQF_ONESHOT))
+ (oldtype != (new->flags & IRQF_TRIGGER_MASK)))
+ goto mismatch;
+
+ if ((old->flags & IRQF_ONESHOT) &&
+ (new->flags & IRQF_COND_ONESHOT))
+ new->flags |= IRQF_ONESHOT;
+ else if ((old->flags ^ new->flags) & IRQF_ONESHOT)
goto mismatch;
/* All handlers must agree on per-cpuness */
The commit 3a2dbc510c43 ("driver core: fw_devlink: Don't purge child
fwnode's consumer links") introduces the possibility to use the
supplier's parent device instead of the supplier itself.
In that case the supplier fwnode used is not updated and is no more
consistent with the supplier device used.
Use the fwnode consistent with the supplier device when checking flags.
Fixes: 3a2dbc510c43 ("driver core: fw_devlink: Don't purge child fwnode's consumer links")
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
---
Changes v2 -> v3:
Do not update the supplier handle in order to keep the original handle
for debug traces.
Changes v1 -> v2:
Remove sup_handle check and related pr_debug() call as sup_handle cannot be
invalid if sup_dev is valid.
drivers/base/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b93f3c5716ae..0d335b0dc396 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2163,7 +2163,7 @@ static int fw_devlink_create_devlink(struct device *con,
* supplier device indefinitely.
*/
if (sup_dev->links.status == DL_DEV_NO_DRIVER &&
- sup_handle->flags & FWNODE_FLAG_INITIALIZED) {
+ sup_dev->fwnode->flags & FWNODE_FLAG_INITIALIZED) {
dev_dbg(con,
"Not linking %pfwf - dev might never probe\n",
sup_handle);
--
2.44.0
The patch titled
Subject: crash: use macro to add crashk_res into iomem early for specific arch
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
crash-use-macro-to-add-crashk_res-into-iomem-early-for-specific-arch.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Baoquan He <bhe(a)redhat.com>
Subject: crash: use macro to add crashk_res into iomem early for specific arch
Date: Mon, 25 Mar 2024 09:50:50 +0800
There are regression reports[1][2] that crashkernel region on x86_64 can't
be added into iomem tree sometime. This causes the later failure of kdump
loading.
This happened after commit 4a693ce65b18 ("kdump: defer the insertion of
crashkernel resources") was merged.
Even though, these reported issues are proved to be related to other
component, they are just exposed after above commmit applied, I still
would like to keep crashk_res and crashk_low_res being added into iomem
early as before because the early adding has been always there on x86_64
and working very well. For safety of kdump, Let's change it back.
Here, add a macro HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY to limit that
only ARCH defining the macro can have the early adding
crashk_res/_low_res into iomem. Then define
HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY on x86 to enable it.
Note: In reserve_crashkernel_low(), there's a remnant of crashk_low_res
handling which was mistakenly added back in commit 85fcde402db1 ("kexec:
split crashkernel reservation code out from crash_core.c").
[1]
[PATCH V2] x86/kexec: do not update E820 kexec table for setup_data
https://lore.kernel.org/all/Zfv8iCL6CT2JqLIC@darkstar.users.ipa.redhat.com/…
[2]
Question about Address Range Validation in Crash Kernel Allocation
https://lore.kernel.org/all/4eeac1f733584855965a2ea62fa4da58@huawei.com/T/#u
Link: https://lkml.kernel.org/r/ZgDYemRQ2jxjLkq+@MiWiFi-R3L-srv
Fixes: 4a693ce65b18 ("kdump: defer the insertion of crashkernel resources")
Signed-off-by: Baoquan He <bhe(a)redhat.com>
Cc: Dave Young <dyoung(a)redhat.com>
Cc: Huacai Chen <chenhuacai(a)loongson.cn>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Jiri Bohac <jbohac(a)suse.cz>
Cc: Li Huafei <lihuafei1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/include/asm/crash_reserve.h | 2 ++
kernel/crash_reserve.c | 7 +++++++
2 files changed, 9 insertions(+)
--- a/arch/x86/include/asm/crash_reserve.h~crash-use-macro-to-add-crashk_res-into-iomem-early-for-specific-arch
+++ a/arch/x86/include/asm/crash_reserve.h
@@ -39,4 +39,6 @@ static inline unsigned long crash_low_si
#endif
}
+#define HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
+
#endif /* _X86_CRASH_RESERVE_H */
--- a/kernel/crash_reserve.c~crash-use-macro-to-add-crashk_res-into-iomem-early-for-specific-arch
+++ a/kernel/crash_reserve.c
@@ -366,8 +366,10 @@ static int __init reserve_crashkernel_lo
crashk_low_res.start = low_base;
crashk_low_res.end = low_base + low_size - 1;
+#ifdef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
insert_resource(&iomem_resource, &crashk_low_res);
#endif
+#endif
return 0;
}
@@ -448,8 +450,12 @@ retry:
crashk_res.start = crash_base;
crashk_res.end = crash_base + crash_size - 1;
+#ifdef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
+ insert_resource(&iomem_resource, &crashk_res);
+#endif
}
+#ifndef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
static __init int insert_crashkernel_resources(void)
{
if (crashk_res.start < crashk_res.end)
@@ -462,3 +468,4 @@ static __init int insert_crashkernel_res
}
early_initcall(insert_crashkernel_resources);
#endif
+#endif
_
Patches currently in -mm which might be from bhe(a)redhat.com are
crash-use-macro-to-add-crashk_res-into-iomem-early-for-specific-arch.patch
mm-vmallocc-optimize-to-reduce-arguments-of-alloc_vmap_area.patch
x86-remove-unneeded-memblock_find_dma_reserve.patch
mm-mm_initc-remove-the-useless-dma_reserve.patch
mm-mm_initc-add-new-function-calc_nr_all_pages.patch
mm-mm_initc-remove-meaningless-calculation-of-zone-managed_pages-in-free_area_init_core.patch
mm-mm_initc-remove-unneeded-calc_memmap_size.patch
mm-mm_initc-remove-arch_reserved_kernel_pages.patch
The patch titled
Subject: mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-zswap-fix-data-loss-on-swp_synchronous_io-devices.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
Date: Sun, 24 Mar 2024 17:04:47 -0400
Zhongkun He reports data corruption when combining zswap with zram.
The issue is the exclusive loads we're doing in zswap. They assume
that all reads are going into the swapcache, which can assume
authoritative ownership of the data and so the zswap copy can go.
However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try to
bypass the swapcache. This results in an optimistic read of the swap data
into a page that will be dismissed if the fault fails due to races. In
this case, zswap mustn't drop its authoritative copy.
Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrD…
Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
Link: https://lkml.kernel.org/r/20240324210447.956973-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Zhongkun He <hezhongkun.hzk(a)bytedance.com>
Tested-by: Zhongkun He <hezhongkun.hzk(a)bytedance.com>
Acked-by: Yosry Ahmed <yosryahmed(a)google.com>
Acked-by: Barry Song <baohua(a)kernel.org>
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Reviewed-by: Nhat Pham <nphamcs(a)gmail.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.5+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/zswap.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
--- a/mm/zswap.c~mm-zswap-fix-data-loss-on-swp_synchronous_io-devices
+++ a/mm/zswap.c
@@ -1636,6 +1636,7 @@ bool zswap_load(struct folio *folio)
swp_entry_t swp = folio->swap;
pgoff_t offset = swp_offset(swp);
struct page *page = &folio->page;
+ bool swapcache = folio_test_swapcache(folio);
struct zswap_tree *tree = swap_zswap_tree(swp);
struct zswap_entry *entry;
u8 *dst;
@@ -1648,7 +1649,20 @@ bool zswap_load(struct folio *folio)
spin_unlock(&tree->lock);
return false;
}
- zswap_rb_erase(&tree->rbroot, entry);
+ /*
+ * When reading into the swapcache, invalidate our entry. The
+ * swapcache can be the authoritative owner of the page and
+ * its mappings, and the pressure that results from having two
+ * in-memory copies outweighs any benefits of caching the
+ * compression work.
+ *
+ * (Most swapins go through the swapcache. The notable
+ * exception is the singleton fault on SWP_SYNCHRONOUS_IO
+ * files, which reads into a private page and may free it if
+ * the fault fails. We remain the primary owner of the entry.)
+ */
+ if (swapcache)
+ zswap_rb_erase(&tree->rbroot, entry);
spin_unlock(&tree->lock);
if (entry->length)
@@ -1663,9 +1677,10 @@ bool zswap_load(struct folio *folio)
if (entry->objcg)
count_objcg_event(entry->objcg, ZSWPIN);
- zswap_entry_free(entry);
-
- folio_mark_dirty(folio);
+ if (swapcache) {
+ zswap_entry_free(entry);
+ folio_mark_dirty(folio);
+ }
return true;
}
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
mm-cachestat-fix-two-shmem-bugs.patch
mm-zswap-fix-writeback-shinker-gfp_noio-gfp_nofs-recursion.patch
mm-zswap-fix-data-loss-on-swp_synchronous_io-devices.patch
mm-zswap-optimize-zswap-pool-size-tracking.patch
mm-zpool-return-pool-size-in-pages.patch
mm-page_alloc-remove-pcppage-migratetype-caching.patch
mm-page_alloc-optimize-free_unref_folios.patch
mm-page_alloc-fix-up-block-types-when-merging-compatible-blocks.patch
mm-page_alloc-move-free-pages-when-converting-block-during-isolation.patch
mm-page_alloc-fix-move_freepages_block-range-error.patch
mm-page_alloc-fix-freelist-movement-during-block-conversion.patch
mm-page_alloc-close-migratetype-race-between-freeing-and-stealing.patch
mm-page_isolation-prepare-for-hygienic-freelists.patch
mm-page_isolation-prepare-for-hygienic-freelists-fix.patch
mm-page_alloc-consolidate-free-page-accounting.patch
The patch titled
Subject: selftests/mm: fix ARM related issue with fork after pthread_create
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Edward Liaw <edliaw(a)google.com>
Subject: selftests/mm: fix ARM related issue with fork after pthread_create
Date: Mon, 25 Mar 2024 19:40:52 +0000
Following issue was observed while running the uffd-unit-tests selftest
on ARM devices. On x86_64 no issues were detected:
pthread_create followed by fork caused deadlock in certain cases wherein
fork required some work to be completed by the created thread. Used
synchronization to ensure that created thread's start function has started
before invoking fork.
[edliaw(a)google.com: refactored to use atomic_bool]
Link: https://lkml.kernel.org/r/20240325194100.775052-1-edliaw@google.com
Fixes: 760aee0b71e3 ("selftests/mm: add tests for RO pinning vs fork()")
Signed-off-by: Lokesh Gidra <lokeshgidra(a)google.com>
Signed-off-by: Edward Liaw <edliaw(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/uffd-common.c | 3 +++
tools/testing/selftests/mm/uffd-common.h | 2 ++
tools/testing/selftests/mm/uffd-unit-tests.c | 10 ++++++++++
3 files changed, 15 insertions(+)
--- a/tools/testing/selftests/mm/uffd-common.c~selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create
+++ a/tools/testing/selftests/mm/uffd-common.c
@@ -18,6 +18,7 @@ bool test_uffdio_wp = true;
unsigned long long *count_verify;
uffd_test_ops_t *uffd_test_ops;
uffd_test_case_ops_t *uffd_test_case_ops;
+atomic_bool ready_for_fork;
static int uffd_mem_fd_create(off_t mem_size, bool hugetlb)
{
@@ -518,6 +519,8 @@ void *uffd_poll_thread(void *arg)
pollfd[1].fd = pipefd[cpu*2];
pollfd[1].events = POLLIN;
+ ready_for_fork = true;
+
for (;;) {
ret = poll(pollfd, 2, -1);
if (ret <= 0) {
--- a/tools/testing/selftests/mm/uffd-common.h~selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create
+++ a/tools/testing/selftests/mm/uffd-common.h
@@ -32,6 +32,7 @@
#include <inttypes.h>
#include <stdint.h>
#include <sys/random.h>
+#include <stdatomic.h>
#include "../kselftest.h"
#include "vm_util.h"
@@ -103,6 +104,7 @@ extern bool map_shared;
extern bool test_uffdio_wp;
extern unsigned long long *count_verify;
extern volatile bool test_uffdio_copy_eexist;
+extern atomic_bool ready_for_fork;
extern uffd_test_ops_t anon_uffd_test_ops;
extern uffd_test_ops_t shmem_uffd_test_ops;
--- a/tools/testing/selftests/mm/uffd-unit-tests.c~selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create
+++ a/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -775,6 +775,8 @@ static void uffd_sigbus_test_common(bool
char c;
struct uffd_args args = { 0 };
+ ready_for_fork = false;
+
fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
if (uffd_register(uffd, area_dst, nr_pages * page_size,
@@ -790,6 +792,9 @@ static void uffd_sigbus_test_common(bool
if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args))
err("uffd_poll_thread create");
+ while (!ready_for_fork)
+ ; /* Wait for the poll_thread to start executing before forking */
+
pid = fork();
if (pid < 0)
err("fork");
@@ -829,6 +834,8 @@ static void uffd_events_test_common(bool
char c;
struct uffd_args args = { 0 };
+ ready_for_fork = false;
+
fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
if (uffd_register(uffd, area_dst, nr_pages * page_size,
true, wp, false))
@@ -838,6 +845,9 @@ static void uffd_events_test_common(bool
if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args))
err("uffd_poll_thread create");
+ while (!ready_for_fork)
+ ; /* Wait for the poll_thread to start executing before forking */
+
pid = fork();
if (pid < 0)
err("fork");
_
Patches currently in -mm which might be from edliaw(a)google.com are
selftests-mm-sigbus-wp-test-requires-uffd_feature_wp_hugetlbfs_shmem.patch
selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create.patch
The patch titled
Subject: mm/secretmem: fix GUP-fast succeeding on secretmem folios
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-secretmem-fix-gup-fast-succeeding-on-secretmem-folios.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/secretmem: fix GUP-fast succeeding on secretmem folios
Date: Mon, 25 Mar 2024 14:41:12 +0100
folio_is_secretmem() states that secretmem folios cannot be LRU folios: so
we may only exit early if we find an LRU folio. Yet, we exit early if we
find a folio that is not a secretmem folio.
Consequently, folio_is_secretmem() fails to detect secretmem folios and,
therefore, we can succeed in grabbing a secretmem folio during GUP-fast,
crashing the kernel when we later try reading/writing to the folio,
because the folio has been unmapped from the directmap.
Link: https://lkml.kernel.org/r/20240325134114.257544-2-david@redhat.com
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: xingwei lee <xrivendell7(a)gmail.com>
Reported-by: yue sun <samsun1006219(a)gmail.com>
Closes: https://lore.kernel.org/lkml/CABOYnLyevJeravW=QrH0JUPYEcDN160aZFb7kwndm-J2r…
Debugged-by: Miklos Szeredi <miklos(a)szeredi.hu>
Reviewed-by: Mike Rapoport (IBM) <rppt(a)kernel.org>
Tested-by: Miklos Szeredi <mszeredi(a)redhat.com>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: "Mike Rapoport (IBM)" <rppt(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/secretmem.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/secretmem.h~mm-secretmem-fix-gup-fast-succeeding-on-secretmem-folios
+++ a/include/linux/secretmem.h
@@ -16,7 +16,7 @@ static inline bool folio_is_secretmem(st
* We know that secretmem pages are not compound and LRU so we can
* save a couple of cycles here.
*/
- if (folio_test_large(folio) || !folio_test_lru(folio))
+ if (folio_test_large(folio) || folio_test_lru(folio))
return false;
mapping = (struct address_space *)
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-secretmem-fix-gup-fast-succeeding-on-secretmem-folios.patch
mm-madvise-make-madv_populate_readwrite-handle-vm_fault_retry-properly.patch
mm-madvise-dont-perform-madvise-vma-walk-for-madv_populate_readwrite.patch
mm-userfaultfd-dont-place-zeropages-when-zeropages-are-disallowed.patch
s390-mm-re-enable-the-shared-zeropage-for-pv-and-skeys-kvm-guests.patch
mm-convert-folio_estimated_sharers-to-folio_likely_mapped_shared.patch
An of_node can be duplicated from an existing device using
device_set_of_node_from_dev() or initialized using device_set_node()
In both cases, these functions have to be called before the device_add()
call in order to have the of_node link created in the device sysfs
directory. Further more, these function cannot prevent any of_node
and/or fwnode overwrites.
When adding an of_node on an already present device, the following
operations need to be done:
- Attach the of_node if no of_node were already attached
- Attach the of_node as a fwnode if no fwnode were already attached
- Create the of_node sysfs link if needed
This is the purpose of device_add_of_node().
device_remove_of_node() reverts the operations done by
device_add_of_node().
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
---
drivers/base/core.c | 74 ++++++++++++++++++++++++++++++++++++++++++
include/linux/device.h | 2 ++
2 files changed, 76 insertions(+)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 521757408fc0..7e3af0ad770a 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -5127,6 +5127,80 @@ void set_secondary_fwnode(struct device *dev, struct fwnode_handle *fwnode)
}
EXPORT_SYMBOL_GPL(set_secondary_fwnode);
+/**
+ * device_remove_of_node - Remove an of_node from a device
+ * @dev: device whose device-tree node is being removed
+ */
+void device_remove_of_node(struct device *dev)
+{
+ dev = get_device(dev);
+ if (!dev)
+ return;
+
+ if (!dev->of_node)
+ goto end;
+
+ sysfs_remove_link(&dev->kobj, "of_node");
+
+ if (dev->fwnode == of_fwnode_handle(dev->of_node))
+ dev->fwnode = NULL;
+
+ of_node_put(dev->of_node);
+ dev->of_node = NULL;
+
+end:
+ put_device(dev);
+}
+EXPORT_SYMBOL_GPL(device_remove_of_node);
+
+/**
+ * device_add_of_node - Add an of_node to an existing device
+ * @dev: device whose device-tree node is being added
+ * @of_node: of_node to add
+ */
+void device_add_of_node(struct device *dev, struct device_node *of_node)
+{
+ int ret;
+
+ if (!of_node)
+ return;
+
+ dev = get_device(dev);
+ if (!dev)
+ return;
+
+ if (dev->of_node) {
+ dev_warn(dev, "Replace node %pOF with %pOF\n", dev->of_node, of_node);
+ device_remove_of_node(dev);
+ }
+
+ dev->of_node = of_node_get(of_node);
+
+ if (!dev->fwnode)
+ dev->fwnode = of_fwnode_handle(of_node);
+
+ if (!dev->p) {
+ /*
+ * device_add() was not previously called.
+ * The of_node link will be created when device_add() is called.
+ */
+ goto end;
+ }
+
+ /*
+ * device_add() was previously called and so the of_node link was not
+ * created by device_add_class_symlinks().
+ * Create this link now.
+ */
+ ret = sysfs_create_link(&dev->kobj, of_node_kobj(of_node), "of_node");
+ if (ret)
+ dev_warn(dev, "Error %d creating of_node link\n", ret);
+
+end:
+ put_device(dev);
+}
+EXPORT_SYMBOL_GPL(device_add_of_node);
+
/**
* device_set_of_node_from_dev - reuse device-tree node of another device
* @dev: device whose device-tree node is being set
diff --git a/include/linux/device.h b/include/linux/device.h
index 97c4b046c09d..1795121dee9a 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1127,6 +1127,8 @@ int device_offline(struct device *dev);
int device_online(struct device *dev);
void set_primary_fwnode(struct device *dev, struct fwnode_handle *fwnode);
void set_secondary_fwnode(struct device *dev, struct fwnode_handle *fwnode);
+void device_add_of_node(struct device *dev, struct device_node *of_node);
+void device_remove_of_node(struct device *dev);
void device_set_of_node_from_dev(struct device *dev, const struct device *dev2);
void device_set_node(struct device *dev, struct fwnode_handle *fwnode);
--
2.44.0
In the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()
During the step 1, devices are destroyed and devlinks are removed.
During the step 2, OF nodes are destroyed but
__of_changeset_entry_destroy() can raise warnings related to missing
of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2 ...
Indeed, during the devlink removals performed at step 1, the removal
itself releasing the device (and the attached of_node) is done by a job
queued in a workqueue and so, it is done asynchronously with respect to
function calls.
When the warning is present, of_node_put() will be called but wrongly
too late from the workqueue job.
In order to be sure that any ongoing devlink removals are done before
the of_node destruction, synchronize the of_changeset_destroy() with the
devlink removals.
Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
Tested-by: Luca Ceresoli <luca.ceresoli(a)bootlin.com>
Reviewed-by: Nuno Sa <nuno.sa(a)analog.com>
---
drivers/of/dynamic.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
index 3bf27052832f..4d57a4e34105 100644
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -9,6 +9,7 @@
#define pr_fmt(fmt) "OF: " fmt
+#include <linux/device.h>
#include <linux/of.h>
#include <linux/spinlock.h>
#include <linux/slab.h>
@@ -667,6 +668,17 @@ void of_changeset_destroy(struct of_changeset *ocs)
{
struct of_changeset_entry *ce, *cen;
+ /*
+ * When a device is deleted, the device links to/from it are also queued
+ * for deletion. Until these device links are freed, the devices
+ * themselves aren't freed. If the device being deleted is due to an
+ * overlay change, this device might be holding a reference to a device
+ * node that will be freed. So, wait until all already pending device
+ * links are deleted before freeing a device node. This ensures we don't
+ * free any device node that has a non-zero reference count.
+ */
+ device_link_wait_removal();
+
list_for_each_entry_safe_reverse(ce, cen, &ocs->entries, node)
__of_changeset_entry_destroy(ce);
}
--
2.44.0
There is a bug when setting the RSS options in virtio_net that can break
the whole machine, getting the kernel into an infinite loop.
Running the following command in any QEMU virtual machine with virtionet
will reproduce this problem:
# ethtool -X eth0 hfunc toeplitz
This is how the problem happens:
1) ethtool_set_rxfh() calls virtnet_set_rxfh()
2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
3) virtnet_commit_rss_command() populates 4 entries for the rss
scatter-gather
4) Since the command above does not have a key, then the last
scatter-gatter entry will be zeroed, since rss_key_size == 0.
sg_buf_size = vi->rss_key_size;
5) This buffer is passed to qemu, but qemu is not happy with a buffer
with zero length, and do the following in virtqueue_map_desc() (QEMU
function):
if (!sz) {
virtio_error(vdev, "virtio: zero sized buffers are not allowed");
6) virtio_error() (also QEMU function) set the device as broken
vdev->broken = true;
7) Qemu bails out, and do not repond this crazy kernel.
8) The kernel is waiting for the response to come back (function
virtnet_send_command())
9) The kernel is waiting doing the following :
while (!virtqueue_get_buf(vi->cvq, &tmp) &&
!virtqueue_is_broken(vi->cvq))
cpu_relax();
10) None of the following functions above is true, thus, the kernel
loops here forever. Keeping in mind that virtqueue_is_broken() does
not look at the qemu `vdev->broken`, so, it never realizes that the
vitio is broken at QEMU side.
Fix it by not sending the key scatter-gatter key if it is not set.
Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Cc: stable(a)vger.kernel.org
Cc: qemu-devel(a)nongnu.org
---
drivers/net/virtio_net.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d7ce4a1011ea..5a7700b103f8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3041,11 +3041,16 @@ static int virtnet_set_ringparam(struct net_device *dev,
static bool virtnet_commit_rss_command(struct virtnet_info *vi)
{
struct net_device *dev = vi->dev;
+ int has_key = vi->rss_key_size;
struct scatterlist sgs[4];
unsigned int sg_buf_size;
+ int nents = 3;
+
+ if (has_key)
+ nents += 1;
/* prepare sgs */
- sg_init_table(sgs, 4);
+ sg_init_table(sgs, nents);
sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table);
sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size);
@@ -3057,8 +3062,13 @@ static bool virtnet_commit_rss_command(struct virtnet_info *vi)
- offsetof(struct virtio_net_ctrl_rss, max_tx_vq);
sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size);
- sg_buf_size = vi->rss_key_size;
- sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size);
+ if (has_key) {
+ /* Only populate if key is available, otherwise
+ * populating a buffer with zero size breaks virtio
+ */
+ sg_buf_size = vi->rss_key_size;
+ sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size);
+ }
if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ,
vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG
--
2.43.0
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: c567f2948f57bdc03ed03403ae0234085f376b7d
Gitweb: https://git.kernel.org/tip/c567f2948f57bdc03ed03403ae0234085f376b7d
Author: Ingo Molnar <mingo(a)kernel.org>
AuthorDate: Mon, 25 Mar 2024 11:47:51 +01:00
Committer: Ingo Molnar <mingo(a)kernel.org>
CommitterDate: Mon, 25 Mar 2024 11:54:35 +01:00
Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."
This reverts commit d794734c9bbfe22f86686dc2909c25f5ffe1a572.
While the original change tries to fix a bug, it also unintentionally broke
existing systems, see the regressions reported at:
https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjosep…
Since d794734c9bbf was also marked for -stable, let's back it out before
causing more damage.
Note that due to another upstream change the revert was not 100% automatic:
0a845e0f6348 mm/treewide: replace pud_large() with pud_leaf()
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Cc: Russ Anderson <rja(a)hpe.com>
Cc: Steve Wahl <steve.wahl(a)hpe.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Link: https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjosep…
Fixes: d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
---
arch/x86/mm/ident_map.c | 23 +++++------------------
1 file changed, 5 insertions(+), 18 deletions(-)
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index a204a33..968d700 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -26,31 +26,18 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
for (; addr < end; addr = next) {
pud_t *pud = pud_page + pud_index(addr);
pmd_t *pmd;
- bool use_gbpage;
next = (addr & PUD_MASK) + PUD_SIZE;
if (next > end)
next = end;
- /* if this is already a gbpage, this portion is already mapped */
- if (pud_leaf(*pud))
- continue;
-
- /* Is using a gbpage allowed? */
- use_gbpage = info->direct_gbpages;
-
- /* Don't use gbpage if it maps more than the requested region. */
- /* at the begining: */
- use_gbpage &= ((addr & ~PUD_MASK) == 0);
- /* ... or at the end: */
- use_gbpage &= ((next & ~PUD_MASK) == 0);
-
- /* Never overwrite existing mappings */
- use_gbpage &= !pud_present(*pud);
-
- if (use_gbpage) {
+ if (info->direct_gbpages) {
pud_t pudval;
+ if (pud_present(*pud))
+ continue;
+
+ addr &= PUD_MASK;
pudval = __pud((addr - info->offset) | info->page_flag);
set_pud(pud, pudval);
continue;
Although the Samsung SoC keypad binding defined
linux,keypad-no-autorepeat property, Linux driver never implemented it
and always used linux,input-no-autorepeat. Correct the DTS to use
property actually implemented.
This also fixes dtbs_check errors like:
exynos4210-smdkv310.dtb: keypad@100a0000: 'linux,keypad-no-autorepeat' does not match any of the regexes: '^key-[0-9a-z]+$', 'pinctrl-[0-9]+'
Cc: <stable(a)vger.kernel.org>
Fixes: 0561ceabd0f1 ("ARM: dts: Add intial dts file for EXYNOS4210 SoC, SMDKV310 and ORIGEN")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
---
arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts b/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts
index b566f878ed84..18f4f494093b 100644
--- a/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts
+++ b/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts
@@ -88,7 +88,7 @@ eeprom@52 {
&keypad {
samsung,keypad-num-rows = <2>;
samsung,keypad-num-columns = <8>;
- linux,keypad-no-autorepeat;
+ linux,input-no-autorepeat;
wakeup-source;
pinctrl-names = "default";
pinctrl-0 = <&keypad_rows &keypad_cols>;
--
2.34.1
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
If we have no VBT, or the VBT didn't declare the encoder
in question, we won't have the 'devdata' for the encoder.
Instead of oopsing just bail early.
We won't be able to tell whether the port is DP++ or not,
but so be it.
Cc: stable(a)vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10464
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_bios.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index c7841b3eede8..c13a98431a7b 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -3458,6 +3458,9 @@ bool intel_bios_encoder_supports_dp_dual_mode(const struct intel_bios_encoder_da
{
const struct child_device_config *child = &devdata->child;
+ if (!devdata)
+ return false;
+
if (!intel_bios_encoder_supports_dp(devdata) ||
!intel_bios_encoder_supports_hdmi(devdata))
return false;
--
2.43.2
MTD OTP logic is very fragile and can be problematic with some specific
kind of devices.
NVMEM across the years had various iteration on how Cells could be
declared in DT and MTD OTP probably was left behind and
add_legacy_fixed_of_cells was enabled without thinking of the consequences.
That option enables NVMEM to scan the provided of_node and treat each
child as a NVMEM Cell, this was to support legacy NVMEM implementation
and don't cause regression.
This is problematic if we have devices like Nand where the OTP is
triggered by setting a special mode in the flash. In this context real
partitions declared in the Nand node are registered as OTP Cells and
this cause probe fail with -EINVAL error.
This was never notice due to the fact that till now, no Nand supported
the OTP feature. With commit e87161321a40 ("mtd: rawnand: macronix: OTP
access for MX30LFxG18AC") this changed and coincidentally this Nand is
used on an FritzBox 7530 supported on OpenWrt.
Alternative and more robust way to declare OTP Cells are already
prossible by using the fixed-layout node or by declaring a child node
with the compatible set to "otp-user" or "otp-factory".
To fix this and limit any regression with other MTD that makes use of
declaring OTP as direct child of the dev node, disable
add_legacy_fixed_of_cells if we have a node called nand since it's the
standard property name to identify Nand devices attached to a Nand
Controller.
With the following logic, the OTP NVMEM entry is correctly created with
no Cells and the MTD Nand is correctly probed and partitions are
correctly exposed.
Fixes: 2cc3b37f5b6d ("nvmem: add explicit config option to read old syntax fixed OF cells")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
Changes v2:
- Use mtd_type_is_nand instead of node name check
drivers/mtd/mtdcore.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index 5887feb347a4..0de87bc63840 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -900,7 +900,7 @@ static struct nvmem_device *mtd_otp_nvmem_register(struct mtd_info *mtd,
config.name = compatible;
config.id = NVMEM_DEVID_AUTO;
config.owner = THIS_MODULE;
- config.add_legacy_fixed_of_cells = true;
+ config.add_legacy_fixed_of_cells = !mtd_type_is_nand(mtd);
config.type = NVMEM_TYPE_OTP;
config.root_only = true;
config.ignore_wp = true;
--
2.43.0
misc_cmd_type in exec_op have multiple problems. With commit a82990c8a409
("mtd: rawnand: qcom: Add read/read_start ops in exec_op path") it was
reworked and generalized but actually dropped the handling of the
RESET_DEVICE command.
The rework itself was correct with supporting case where a single misc
command is handled, but became problematic by the addition of exiting
early if we didn't had an ERASE or an OP_PROGRAM_PAGE operation.
Also additional logic was added without clear explaination causing the
erase command to be broken on testing it on a ipq806x nandc.
Add some additional logic to restore RESET_DEVICE command handling and
fix erase command.
Fixes: a82990c8a409 ("mtd: rawnand: qcom: Add read/read_start ops in exec_op path")
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
drivers/mtd/nand/raw/qcom_nandc.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/mtd/nand/raw/qcom_nandc.c b/drivers/mtd/nand/raw/qcom_nandc.c
index b079605c84d3..b8cff9240b28 100644
--- a/drivers/mtd/nand/raw/qcom_nandc.c
+++ b/drivers/mtd/nand/raw/qcom_nandc.c
@@ -2815,7 +2815,7 @@ static int qcom_misc_cmd_type_exec(struct nand_chip *chip, const struct nand_sub
host->cfg0_raw & ~(7 << CW_PER_PAGE));
nandc_set_reg(chip, NAND_DEV0_CFG1, host->cfg1_raw);
instrs = 3;
- } else {
+ } else if (q_op.cmd_reg != OP_RESET_DEVICE) {
return 0;
}
@@ -2830,9 +2830,8 @@ static int qcom_misc_cmd_type_exec(struct nand_chip *chip, const struct nand_sub
nandc_set_reg(chip, NAND_EXEC_CMD, 1);
write_reg_dma(nandc, NAND_FLASH_CMD, instrs, NAND_BAM_NEXT_SGL);
- (q_op.cmd_reg == OP_BLOCK_ERASE) ? write_reg_dma(nandc, NAND_DEV0_CFG0,
- 2, NAND_BAM_NEXT_SGL) : read_reg_dma(nandc,
- NAND_FLASH_STATUS, 1, NAND_BAM_NEXT_SGL);
+ if (q_op.cmd_reg == OP_BLOCK_ERASE)
+ write_reg_dma(nandc, NAND_DEV0_CFG0, 2, NAND_BAM_NEXT_SGL);
write_reg_dma(nandc, NAND_EXEC_CMD, 1, NAND_BAM_NEXT_SGL);
read_reg_dma(nandc, NAND_FLASH_STATUS, 1, NAND_BAM_NEXT_SGL);
--
2.43.0
Hi,
This commit needs to be backported to 5.4, 5.10, 5.15, it fixes
possible null deference from the following commit 'regmap: Add bulk
read/write callbacks into regmap_config' which was backported to these
kernels in the latest released versions (v5.15.152, v5.10.213, v5.4.272).
Commit 5c422f0b970d287efa864b8390a02face404db5d upstream.
Thanks,
MNAdam
The clk_alpha_pll_stromer_set_rate() function writes inproper
values into the ALPHA_VAL{,_U} registers which results in wrong
clock rates when the alpha value is used.
The broken behaviour can be seen on IPQ5018 for example, when
dynamic scaling sets the CPU frequency to 800000 KHz. In this
case the CPU cores are running only at 792031 KHz:
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
800000
# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
792031
This happens because the function ignores the fact that the alpha
value calculated by the alpha_pll_round_rate() function is only
32 bits wide which must be extended to 40 bits if it is used on
a hardware which supports 40 bits wide values.
Extend the clk_alpha_pll_stromer_set_rate() function to convert
the alpha value to 40 bits before wrinting that into the registers
in order to ensure that the hardware really uses the requested rate.
After the change the CPU frequency is correct:
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
800000
# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
800000
Cc: stable(a)vger.kernel.org
Fixes: e47a4f55f240 ("clk: qcom: clk-alpha-pll: Add support for Stromer PLLs")
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
---
Based on v6.8.
---
Based on v6.8.
Depends on the following patch:
https://lore.kernel.org/r/20240315-apss-ipq-pll-ipq5018-hang-v2-1-6fe30ada2…
---
drivers/clk/qcom/clk-alpha-pll.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/clk/qcom/clk-alpha-pll.c b/drivers/clk/qcom/clk-alpha-pll.c
index 05898d2a8b22c..4f5dba44411f6 100644
--- a/drivers/clk/qcom/clk-alpha-pll.c
+++ b/drivers/clk/qcom/clk-alpha-pll.c
@@ -2474,6 +2474,10 @@ static int clk_alpha_pll_stromer_set_rate(struct clk_hw *hw, unsigned long rate,
rate = alpha_pll_round_rate(rate, prate, &l, &a, ALPHA_REG_BITWIDTH);
regmap_write(pll->clkr.regmap, PLL_L_VAL(pll), l);
+
+ if (ALPHA_REG_BITWIDTH > ALPHA_BITWIDTH)
+ a <<= ALPHA_REG_BITWIDTH - ALPHA_BITWIDTH;
+
regmap_write(pll->clkr.regmap, PLL_ALPHA_VAL(pll), a);
regmap_write(pll->clkr.regmap, PLL_ALPHA_VAL_U(pll),
a >> ALPHA_BITWIDTH);
---
base-commit: 97483adf4c6181df2f3d8fe7c2aa057443298080
change-id: 20240324-alpha-pll-fix-stromer-set-rate-472376e624f0
Best regards,
--
Gabor Juhos <j4g8y7(a)gmail.com>
Hi.
Please apply these patches in this order. They fully resolve the issues
with link-local frames on the switches the MT7530 DSA subdriver
controls.
The commits apply to all stable trees as is.
To 5.15:
8332cf6fd7c7 net: dsa: mt7530: fix handling of LLDP frames
e94b590abfff net: dsa: mt7530: fix handling of 802.1X PAE frames
e8bf353577f3 net: dsa: mt7530: fix link-local frames that ingress vlan
filtering ports
69ddba9d170b net: dsa: mt7530: fix handling of all link-local frames
To 6.8, 6.7, 6.6, 6.1:
e8bf353577f3 net: dsa: mt7530: fix link-local frames that ingress vlan
filtering ports
69ddba9d170b net: dsa: mt7530: fix handling of all link-local frames
Arınç
In case a console is set up really large and contains a really long word
(> 256 characters), we have to stop before the length of the word buffer.
Signed-off-by: Samuel Thibault <samuel.thibault(a)ens-lyon.org>
Fixes: c6e3fd22cd538 ("Staging: add speakup to the staging directory")
Cc: stable(a)vger.kernel.org
---
drivers/accessibility/speakup/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/accessibility/speakup/main.c b/drivers/accessibility/speakup/main.c
index 1fbc9b921c4f..736c2eb8c0f3 100644
--- a/drivers/accessibility/speakup/main.c
+++ b/drivers/accessibility/speakup/main.c
@@ -574,7 +574,7 @@ static u_long get_word(struct vc_data *vc)
}
attr_ch = get_char(vc, (u_short *)tmp_pos, &spk_attr);
buf[cnt++] = attr_ch;
- while (tmpx < vc->vc_cols - 1) {
+ while (tmpx < vc->vc_cols - 1 && cnt < sizeof(buf) - 1) {
tmp_pos += 2;
tmpx++;
ch = get_char(vc, (u_short *)tmp_pos, &temp);
--
2.43.0
Hello,
On 3/22/24 19:17, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> wifi: wilc1000: revert reset line logic flip
>
> to the 6.1-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> wifi-wilc1000-revert-reset-line-logic-flip.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
This patch is expected to introduce a breakage on platforms using a wrong device
tree description. After discussing this consequence with wireless and DT people
(see this patch RFC in [1]), it has been decided that this is tolerable. However,
despite the Fixes tag I have put in the patch, I am not sure it is OK to also
introduce this breakage for people just updating their stable kernels ? My
opinion here is that they should get this break only when updating to a new
kernel release, not stable, so I _would_ keep this patch out of stable trees
(currently applied to 6.1, 6.6, 6.7 and 6.8, if I have followed correctly).
Thanks,
Alexis
[1]
https://lore.kernel.org/all/20240213-wilc_1000_reset_line-v1-1-e01da2b23fed…
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
In NUMMU kernel the value of linux_binprm::p is the offset inside the
temporary program arguments array maintained in separate pages in the
linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
thus must be adjusted when that array is copied to the user stack.
Without that adjustment the value passed by the NOMMU kernel to the ELF
program in the AT_EXECFN entry of the aux array doesn't make any sense
and it may break programs that try to access memory pointed to by that
entry.
Adjust linux_binprm::exec before the successful return from the
transfer_args_to_stack().
Cc: stable(a)vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc(a)gmail.com>
---
fs/exec.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/exec.c b/fs/exec.c
index af4fbb61cd53..5ee2545c3e18 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -895,6 +895,7 @@ int transfer_args_to_stack(struct linux_binprm *bprm,
goto out;
}
+ bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE;
*sp_location = sp;
out:
--
2.39.2
Several Qualcomm Bluetooth controllers lack persistent storage for the
device address and instead one can be provided by the boot firmware
using the 'local-bd-address' devicetree property.
The Bluetooth bindings clearly states that the address should be
specified in little-endian order, but due to a long-standing bug in the
Qualcomm driver which reversed the address some boot firmware has been
providing the address in big-endian order instead.
The boot firmware in SC7180 Trogdor Chromebooks is known to be affected
so mark the 'local-bd-address' property as broken to maintain backwards
compatibility with older firmware when fixing the underlying driver bug.
Note that ChromeOS always updates the kernel and devicetree in lockstep
so that there is no need to handle backwards compatibility with older
devicetrees.
Fixes: 7ec3e67307f8 ("arm64: dts: qcom: sc7180-trogdor: add initial trogdor and lazor dt")
Cc: stable(a)vger.kernel.org # 5.10
Cc: Rob Clark <robdclark(a)chromium.org>
Reviewed-by: Douglas Anderson <dianders(a)chromium.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
index 46aaeba28604..ebe37678102f 100644
--- a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
@@ -943,6 +943,8 @@ bluetooth: bluetooth {
vddrf-supply = <&pp1300_l2c>;
vddch0-supply = <&pp3300_l10c>;
max-speed = <3200000>;
+
+ qcom,local-bd-address-broken;
};
};
--
2.43.2
Since
7ee18d677989 ("x86/power: Make restore_processor_context() sane")
kmemleak reports this issue:
unreferenced object 0xf68241e0 (size 32):
comm "swapper/0", pid 1, jiffies 4294668610 (age 68.432s)
hex dump (first 32 bytes):
00 cc cc cc 29 10 01 c0 00 00 00 00 00 00 00 00 ....)...........
00 42 82 f6 cc cc cc cc cc cc cc cc cc cc cc cc .B..............
backtrace:
[<461c1d50>] __kmem_cache_alloc_node+0x106/0x260
[<ea65e13b>] __kmalloc+0x54/0x160
[<c3858cd2>] msr_build_context.constprop.0+0x35/0x100
[<46635aff>] pm_check_save_msr+0x63/0x80
[<6b6bb938>] do_one_initcall+0x41/0x1f0
[<3f3add60>] kernel_init_freeable+0x199/0x1e8
[<3b538fde>] kernel_init+0x1a/0x110
[<938ae2b2>] ret_from_fork+0x1c/0x28
Reproducer:
- Run rsync of whole kernel tree (multiple times if needed).
- start a kmemleak scan
- Note this is just an example: a lot of our internal tests hit these.
The root cause is we expect the same as the equivalent fix in commit
b0b592cf0836, i.e. the alignment within the packed struct saved_context
which has everything unaligned as there is only "u16 gs;" at start of
struct where in the past there were four u16 there thus aligning
everything afterwards. The issue is with the fact that Kmemleak only
searches for pointers that are aligned (see how pointers are scanned in
kmemleak.c) so when the struct members are not aligned it doesn't see
them.
Note we have picked this up on 5.4, 6.1 and 6.6 kernels but we expect it
is the same on all kernels >= 4.15 as the commit 7ee18d677989 which
changed from having four u16 to a single u16 at the start of the struct
was introduced in 4.15.
Fixes: 7ee18d677989 ("x86/power: Make restore_processor_context() sane")
Signed-off-by: Anton Altaparmakov <anton(a)tuxera.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/include/asm/suspend_32.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/suspend_32.h b/arch/x86/include/asm/suspend_32.h
index a800abb1a992..d8416b3bf832 100644
--- a/arch/x86/include/asm/suspend_32.h
+++ b/arch/x86/include/asm/suspend_32.h
@@ -12,11 +12,6 @@
/* image of the saved processor state */
struct saved_context {
- /*
- * On x86_32, all segment registers except gs are saved at kernel
- * entry in pt_regs.
- */
- u16 gs;
unsigned long cr0, cr2, cr3, cr4;
u64 misc_enable;
struct saved_msrs saved_msrs;
@@ -27,6 +22,11 @@ struct saved_context {
unsigned long tr;
unsigned long safety;
unsigned long return_address;
+ /*
+ * On x86_32, all segment registers except gs are saved at kernel
+ * entry in pt_regs.
+ */
+ u16 gs;
bool misc_enable_saved;
} __attribute__((packed));
--
2.39.3 (Apple Git-146)
The issue occurs when the devfreq cooling device uses the EM power model
and the get_real_power() callback is provided by the driver.
The EM power table is sorted ascending,can't index the table by cooling
device state,so convert cooling state to performance state by
dfc->max_state - dfc->capped_state.
Fixes: 615510fe13bd ("thermal: devfreq_cooling: remove old power model and use EM")
Cc: 5.11+ <stable(a)vger.kernel.org> # 5.11+
Signed-off-by: Ye Zhang <ye.zhang(a)rock-chips.com>
---
v1 -> v2:
- Update the commit message.
drivers/thermal/devfreq_cooling.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
index 50dec24e967a..8fd7cf1932cd 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -214,7 +214,7 @@ static int devfreq_cooling_get_requested_power(struct thermal_cooling_device *cd
res = dfc->power_ops->get_real_power(df, power, freq, voltage);
if (!res) {
- state = dfc->capped_state;
+ state = dfc->max_state - dfc->capped_state;
/* Convert EM power into milli-Watts first */
rcu_read_lock();
--
2.34.1
The patch titled
Subject: hexagon: vmlinux.lds.S: handle attributes section
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
hexagon-vmlinuxldss-handle-attributes-section.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Nathan Chancellor <nathan(a)kernel.org>
Subject: hexagon: vmlinux.lds.S: handle attributes section
Date: Tue, 19 Mar 2024 17:37:46 -0700
After the linked LLVM change, the build fails with
CONFIG_LD_ORPHAN_WARN_LEVEL="error", which happens with allmodconfig:
ld.lld: error: vmlinux.a(init/main.o):(.hexagon.attributes) is being placed in '.hexagon.attributes'
Handle the attributes section in a similar manner as arm and riscv by
adding it after the primary ELF_DETAILS grouping in vmlinux.lds.S, which
fixes the error.
Link: https://lkml.kernel.org/r/20240319-hexagon-handle-attributes-section-vmlinu…
Fixes: 113616ec5b64 ("hexagon: select ARCH_WANT_LD_ORPHAN_WARN")
Link: https://github.com/llvm/llvm-project/commit/31f4b329c8234fab9afa59494d7f8bd…
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Cc: Bill Wendling <morbo(a)google.com>
Cc: Brian Cain <bcain(a)quicinc.com>
Cc: Justin Stitt <justinstitt(a)google.com>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/hexagon/kernel/vmlinux.lds.S | 1 +
1 file changed, 1 insertion(+)
--- a/arch/hexagon/kernel/vmlinux.lds.S~hexagon-vmlinuxldss-handle-attributes-section
+++ a/arch/hexagon/kernel/vmlinux.lds.S
@@ -63,6 +63,7 @@ SECTIONS
STABS_DEBUG
DWARF_DEBUG
ELF_DETAILS
+ .hexagon.attributes 0 : { *(.hexagon.attributes) }
DISCARDS
}
_
Patches currently in -mm which might be from nathan(a)kernel.org are
hexagon-vmlinuxldss-handle-attributes-section.patch
The patch titled
Subject: tmpfs: fix race on handling dquot rbtree
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
tmpfs-fix-race-on-handling-dquot-rbtree.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Carlos Maiolino <cem(a)kernel.org>
Subject: tmpfs: fix race on handling dquot rbtree
Date: Wed, 20 Mar 2024 13:39:59 +0100
A syzkaller reproducer found a race while attempting to remove dquot
information from the rb tree.
Fetching the rb_tree root node must also be protected by the
dqopt->dqio_sem, otherwise, giving the right timing, shmem_release_dquot()
will trigger a warning because it couldn't find a node in the tree, when
the real reason was the root node changing before the search starts:
Thread 1 Thread 2
- shmem_release_dquot() - shmem_{acquire,release}_dquot()
- fetch ROOT - Fetch ROOT
- acquire dqio_sem
- wait dqio_sem
- do something, triger a tree rebalance
- release dqio_sem
- acquire dqio_sem
- start searching for the node, but
from the wrong location, missing
the node, and triggering a warning.
Link: https://lkml.kernel.org/r/20240320124011.398847-1-cem@kernel.org
Fixes: eafc474e202978 ("shmem: prepare shmem quota infrastructure")
Signed-off-by: Carlos Maiolino <cmaiolino(a)redhat.com>
Reported-by: Ubisectech Sirius <bugreport(a)ubisectech.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/shmem_quota.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
--- a/mm/shmem_quota.c~tmpfs-fix-race-on-handling-dquot-rbtree
+++ a/mm/shmem_quota.c
@@ -116,7 +116,7 @@ static int shmem_free_file_info(struct s
static int shmem_get_next_id(struct super_block *sb, struct kqid *qid)
{
struct mem_dqinfo *info = sb_dqinfo(sb, qid->type);
- struct rb_node *node = ((struct rb_root *)info->dqi_priv)->rb_node;
+ struct rb_node *node;
qid_t id = from_kqid(&init_user_ns, *qid);
struct quota_info *dqopt = sb_dqopt(sb);
struct quota_id *entry = NULL;
@@ -126,6 +126,7 @@ static int shmem_get_next_id(struct supe
return -ESRCH;
down_read(&dqopt->dqio_sem);
+ node = ((struct rb_root *)info->dqi_priv)->rb_node;
while (node) {
entry = rb_entry(node, struct quota_id, node);
@@ -165,7 +166,7 @@ out_unlock:
static int shmem_acquire_dquot(struct dquot *dquot)
{
struct mem_dqinfo *info = sb_dqinfo(dquot->dq_sb, dquot->dq_id.type);
- struct rb_node **n = &((struct rb_root *)info->dqi_priv)->rb_node;
+ struct rb_node **n;
struct shmem_sb_info *sbinfo = dquot->dq_sb->s_fs_info;
struct rb_node *parent = NULL, *new_node = NULL;
struct quota_id *new_entry, *entry;
@@ -176,6 +177,8 @@ static int shmem_acquire_dquot(struct dq
mutex_lock(&dquot->dq_lock);
down_write(&dqopt->dqio_sem);
+ n = &((struct rb_root *)info->dqi_priv)->rb_node;
+
while (*n) {
parent = *n;
entry = rb_entry(parent, struct quota_id, node);
@@ -264,7 +267,7 @@ static bool shmem_is_empty_dquot(struct
static int shmem_release_dquot(struct dquot *dquot)
{
struct mem_dqinfo *info = sb_dqinfo(dquot->dq_sb, dquot->dq_id.type);
- struct rb_node *node = ((struct rb_root *)info->dqi_priv)->rb_node;
+ struct rb_node *node;
qid_t id = from_kqid(&init_user_ns, dquot->dq_id);
struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
struct quota_id *entry = NULL;
@@ -275,6 +278,7 @@ static int shmem_release_dquot(struct dq
goto out_dqlock;
down_write(&dqopt->dqio_sem);
+ node = ((struct rb_root *)info->dqi_priv)->rb_node;
while (node) {
entry = rb_entry(node, struct quota_id, node);
_
Patches currently in -mm which might be from cem(a)kernel.org are
tmpfs-fix-race-on-handling-dquot-rbtree.patch
The patch titled
Subject: selftests/mm: sigbus-wp test requires UFFD_FEATURE_WP_HUGETLBFS_SHMEM
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftests-mm-sigbus-wp-test-requires-uffd_feature_wp_hugetlbfs_shmem.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Edward Liaw <edliaw(a)google.com>
Subject: selftests/mm: sigbus-wp test requires UFFD_FEATURE_WP_HUGETLBFS_SHMEM
Date: Thu, 21 Mar 2024 23:20:21 +0000
The sigbus-wp test requires the UFFD_FEATURE_WP_HUGETLBFS_SHMEM flag for
shmem and hugetlb targets. Otherwise it is not backwards compatible with
kernels <5.19 and fails with EINVAL.
Link: https://lkml.kernel.org/r/20240321232023.2064975-1-edliaw@google.com
Fixes: 73c1ea939b65 ("selftests/mm: move uffd sig/events tests into uffd unit tests")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/mm/uffd-unit-tests.c~selftests-mm-sigbus-wp-test-requires-uffd_feature_wp_hugetlbfs_shmem
+++ a/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -1427,7 +1427,8 @@ uffd_test_case_t uffd_tests[] = {
.uffd_fn = uffd_sigbus_wp_test,
.mem_targets = MEM_ALL,
.uffd_feature_required = UFFD_FEATURE_SIGBUS |
- UFFD_FEATURE_EVENT_FORK | UFFD_FEATURE_PAGEFAULT_FLAG_WP,
+ UFFD_FEATURE_EVENT_FORK | UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
},
{
.name = "events",
_
Patches currently in -mm which might be from edliaw(a)google.com are
selftests-mm-sigbus-wp-test-requires-uffd_feature_wp_hugetlbfs_shmem.patch
Commit 9bb66c179f50 ("drm/i915: Reserve some kernel space per
vm") has reserved an object for kernel space usage.
Userspace, though, needs to know the full address range.
In the former patch the reserved space was substructed from the
total amount of the VM space. Add it back when the user requests
the GTT size through ioctl (I915_CONTEXT_PARAM_GTT_SIZE).
Fixes: 9bb66c179f50 ("drm/i915: Reserve some kernel space per vm")
Signed-off-by: Andi Shyti <andi.shyti(a)linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda(a)intel.com>
Cc: Chris Wilson <chris.p.wilson(a)linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin(a)intel.com>
Cc: Michal Mrozek <michal.mrozek(a)intel.com>
Cc: Nirmoy Das <nirmoy.das(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.2+
Acked-by: Michal Mrozek <michal.mrozek(a)intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin(a)intel.com>
---
Hi,
Just proposing a different implementation that doesn't affect
i915 internally but provides the same result. Instead of not
substracting the space during the reservation, I add it back
during the ioctl call.
All the "vm->rsvd.vma->node.size" looks a bit ugly, but that's
how it is. Maybe a comment can help to understand better why
there is this addition.
I kept the Ack from Michal and Lionel, because the outcome from
userspace perspactive doesn't really change.
Andi
drivers/gpu/drm/i915/gem/i915_gem_context.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 81f65cab1330..60d9e7fe33b3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -2454,7 +2454,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
case I915_CONTEXT_PARAM_GTT_SIZE:
args->size = 0;
vm = i915_gem_context_get_eb_vm(ctx);
- args->value = vm->total;
+ args->value = vm->total + vm->rsvd.vma->node.size;
i915_vm_put(vm);
break;
--
2.43.0
Hello,
Please, add the following commit:
2814646f76f8518326964f12ff20aaee70ba154d HID: lenovo: Add
middleclick_workaround sysfs knob for cptkbd
Into stable versions: 5.10, 5.15, 6.1, 6.6, 6.7, 6.8
This versions received automatic backport of my previous attempts to
detect properly-behaving patched firmware which suffer from
false-positives causing regression for users of factory lenovo
firmware. The commit adds an explicit sysfs control hence avoids any
room for false positives and fixes the regression.
Thanks.
The flag I2C_HID_READ_PENDING is used to serialize I2C operations.
However, this is not necessary, because I2C core already has its own
locking for that.
More importantly, this flag can cause a lock-up: if the flag is set in
i2c_hid_xfer() and an interrupt happens, the interrupt handler
(i2c_hid_irq) will check this flag and return immediately without doing
anything, then the interrupt handler will be invoked again in an
infinite loop.
Since interrupt handler is an RT task, it takes over the CPU and the
flag-clearing task never gets scheduled, thus we have a lock-up.
Delete this unnecessary flag.
Reported-and-tested-by: Eva Kurchatova <nyandarknessgirl(a)gmail.com>
Closes: https://lore.kernel.org/r/CA+eeCSPUDpUg76ZO8dszSbAGn+UHjcyv8F1J-CUPVARAzEtW…
Fixes: 4a200c3b9a40 ("HID: i2c-hid: introduce HID over i2c specification implementation")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
---
drivers/hid/i2c-hid/i2c-hid-core.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
index 2df1ab3c31cc..1c86c97688e9 100644
--- a/drivers/hid/i2c-hid/i2c-hid-core.c
+++ b/drivers/hid/i2c-hid/i2c-hid-core.c
@@ -64,7 +64,6 @@
/* flags */
#define I2C_HID_STARTED 0
#define I2C_HID_RESET_PENDING 1
-#define I2C_HID_READ_PENDING 2
#define I2C_HID_PWR_ON 0x00
#define I2C_HID_PWR_SLEEP 0x01
@@ -190,15 +189,10 @@ static int i2c_hid_xfer(struct i2c_hid *ihid,
msgs[n].len = recv_len;
msgs[n].buf = recv_buf;
n++;
-
- set_bit(I2C_HID_READ_PENDING, &ihid->flags);
}
ret = i2c_transfer(client->adapter, msgs, n);
- if (recv_len)
- clear_bit(I2C_HID_READ_PENDING, &ihid->flags);
-
if (ret != n)
return ret < 0 ? ret : -EIO;
@@ -556,9 +550,6 @@ static irqreturn_t i2c_hid_irq(int irq, void *dev_id)
{
struct i2c_hid *ihid = dev_id;
- if (test_bit(I2C_HID_READ_PENDING, &ihid->flags))
- return IRQ_HANDLED;
-
i2c_hid_get_input(ihid);
return IRQ_HANDLED;
--
2.39.2
syzbot reports a memory leak in pppoe_sendmsg in 6.6 and 6.1 stable
releases. The problem has been fixed by the following patch which can be
cleanly applied to the 6.6 and 6.1 branches.
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with Syzkaller
Gavrilov Ilia (1):
pppoe: Fix memory leak in pppoe_sendmsg()
drivers/net/ppp/pppoe.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
--
2.39.2
The writes to setipnum_le/be register for APLIC in MSI-mode have special
consideration for level-triggered interrupts as-per the section "4.9.2
Special consideration for level-sensitive interrupt sources" of the RISC-V
AIA specification.
Particularly, the below text from the RISC-V AIA specification defines
the behaviour of writes to setipnum_le/be register for level-triggered
interrupts:
"A second option is for the interrupt service routine to write the
APLIC’s source identity number for the interrupt to the domain’s
setipnum register just before exiting. This will cause the interrupt’s
pending bit to be set to one again if the source is still asserting
an interrupt, but not if the source is not asserting an interrupt."
Fix setipnum_le/be write emulation for in-kernel APLIC by implementing
the above behaviour in aplic_write_pending() function.
Cc: stable(a)vger.kernel.org
Fixes: 74967aa208e2 ("RISC-V: KVM: Add in-kernel emulation of AIA APLIC")
Signed-off-by: Anup Patel <apatel(a)ventanamicro.com>
---
arch/riscv/kvm/aia_aplic.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/kvm/aia_aplic.c b/arch/riscv/kvm/aia_aplic.c
index 39e72aa016a4..5e842b92dc46 100644
--- a/arch/riscv/kvm/aia_aplic.c
+++ b/arch/riscv/kvm/aia_aplic.c
@@ -137,11 +137,21 @@ static void aplic_write_pending(struct aplic *aplic, u32 irq, bool pending)
raw_spin_lock_irqsave(&irqd->lock, flags);
sm = irqd->sourcecfg & APLIC_SOURCECFG_SM_MASK;
- if (!pending &&
- ((sm == APLIC_SOURCECFG_SM_LEVEL_HIGH) ||
- (sm == APLIC_SOURCECFG_SM_LEVEL_LOW)))
+ if (sm == APLIC_SOURCECFG_SM_INACTIVE)
goto skip_write_pending;
+ if (sm == APLIC_SOURCECFG_SM_LEVEL_HIGH ||
+ sm == APLIC_SOURCECFG_SM_LEVEL_LOW) {
+ if (!pending)
+ goto skip_write_pending;
+ if ((irqd->state & APLIC_IRQ_STATE_INPUT) &&
+ sm == APLIC_SOURCECFG_SM_LEVEL_LOW)
+ goto skip_write_pending;
+ if (!(irqd->state & APLIC_IRQ_STATE_INPUT) &&
+ sm == APLIC_SOURCECFG_SM_LEVEL_HIGH)
+ goto skip_write_pending;
+ }
+
if (pending)
irqd->state |= APLIC_IRQ_STATE_PENDING;
else
--
2.34.1
MTD OTP logic is very fragile and can be problematic with some specific
kind of devices.
NVMEM across the years had various iteration on how Cells could be
declared in DT and MTD OTP probably was left behind and
add_legacy_fixed_of_cells was enabled without thinking of the consequences.
That option enables NVMEM to scan the provided of_node and treat each
child as a NVMEM Cell, this was to support legacy NVMEM implementation
and don't cause regression.
This is problematic if we have devices like Nand where the OTP is
triggered by setting a special mode in the flash. In this context real
partitions declared in the Nand node are registered as OTP Cells and
this cause probe fail with -EINVAL error.
This was never notice due to the fact that till now, no Nand supported
the OTP feature. With commit e87161321a40 ("mtd: rawnand: macronix: OTP
access for MX30LFxG18AC") this changed and coincidentally this Nand is
used on an FritzBox 7530 supported on OpenWrt.
Alternative and more robust way to declare OTP Cells are already
prossible by using the fixed-layout node or by declaring a child node
with the compatible set to "otp-user" or "otp-factory".
To fix this and limit any regression with other MTD that makes use of
declaring OTP as direct child of the dev node, disable
add_legacy_fixed_of_cells if we have a node called nand since it's the
standard property name to identify Nand devices attached to a Nand
Controller.
With the following logic, the OTP NVMEM entry is correctly created with
no Cells and the MTD Nand is correctly probed and partitions are
correctly exposed.
Fixes: 2cc3b37f5b6d ("nvmem: add explicit config option to read old syntax fixed OF cells")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
drivers/mtd/mtdcore.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index 5887feb347a4..6872477a5129 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -900,7 +900,7 @@ static struct nvmem_device *mtd_otp_nvmem_register(struct mtd_info *mtd,
config.name = compatible;
config.id = NVMEM_DEVID_AUTO;
config.owner = THIS_MODULE;
- config.add_legacy_fixed_of_cells = true;
+ config.add_legacy_fixed_of_cells = !of_node_name_eq(mtd->dev.of_node, "nand");
config.type = NVMEM_TYPE_OTP;
config.root_only = true;
config.ignore_wp = true;
--
2.43.0
I have found a regression in userspace behaviour after commit 67b164a871a
got backported into 4.19.306 as commit 19af0310c8767. The regression
can be fixed by backporting two additional commits, detailed below.
The regression can be reproduced with the following sequence:
echo some text > plain.txt
openssl enc -k mysecret -aes-256-cbc -in plain.txt -out cipher.txt -engine afalg
It fails intermittently with the message "error writing to file", but
this error is a bit misleading, the actual problem is that the kernel
returns -16 (EBUSY) on the encoding operation.
The EBUSY comes from the newly added in-flight check. This check is correct,
however it fails on 4.19 kernel, because it is missing two earlier commits:
f3c802a1f3001 crypto: algif_aead - Only wake up when ctx->more is zero
21dfbcd1f5cbf crypto: algif_aead - fix uninitialized ctx->init
I was able to cherry-pick those into 4.19.y, with just a minor conflict
in one case. With those applied, the openssl command no longer fails.
Similar fixes are likely needed in 5.4.y, however I did not test this.
No change is needed in 5.10 or newer, as the two commits are present.
Please add the two commits to 4.19.y (and probably also 5.4.y).
Thanks,
-Ralph
[CCing the stable team, as it looks like two prerequisite changes for a
patch already applied are missing in at least 4.19.y]
On 15.03.24 18:55, Ralph Siemsen wrote:
>
> I have found a regression in userspace behaviour after this patch was
> merged into the 4.19.y kernel. The fix seems to involve backporting a
> few more changes. Could you review details below and confirm if this is
> the right approach?
FWIW, developers are totally free to not care about stable and longterm
kernels series. Not sure if Herbert is among those developers, but it
might explain why there is no reply yet. That's why I CCed the stable
maintainers, strictly speaking they are responsible.
> On Tue, Nov 28, 2023 at 04:25:49PM +0800, Herbert Xu wrote:
>> Having multiple in-flight AIO requests results in unpredictable
>> output because they all share the same IV. Fix this by only allowing
>> one request at a time.
> [...]
> This change got backported on the 4.19 kernel in January:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
>
> Since then, I am seeCiao, ing a regression in a simple openssl encoding test:
>
> openssl enc -k mysecret -aes-256-cbc -in plain.txt -out cipher.txt
> -engine afalg
>
> It fails intermittently with the message "error writing to file", but
> this error is a bit misleading, the actual problem is that the kernel
> returns -16 (EBUSY) on the encoding operation.
>
> This happens only in 4.19, and not under 5.10. The patch seems correct,
> however it seems we are missing a couple of other patches on 4.19:
>
> f3c802a1f3001 crypto: algif_aead - Only wake up when ctx->more is zero
> 21dfbcd1f5cbf crypto: algif_aead - fix uninitialized ctx->init
>
> I was able to cherry-pick those into 4.19.y, with just a minor conflict
> in one case. With those applied, the openssl command no longer fails.
Some feedback here from Herbert would of course be splendid, but maybe
your tests are all the stable team needs to pick those up for a future
4.19.y release.
> I suspect similar changes would be needed also in 5.4 kernel, however I
> neither checked that, nor have I run any tests on that version.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
RISC-V PLIC cannot "end-of-interrupt" (EOI) disabled interrupts, as
explained in the description of Interrupt Completion in the PLIC spec:
"The PLIC signals it has completed executing an interrupt handler by
writing the interrupt ID it received from the claim to the claim/complete
register. The PLIC does not check whether the completion ID is the same
as the last claim ID for that target. If the completion ID does not match
an interrupt source that *is currently enabled* for the target, the
completion is silently ignored."
Commit 69ea463021be ("irqchip/sifive-plic: Fixup EOI failed when masked")
ensured that EOI is successful by enabling interrupt first, before EOI.
Commit a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask
operations") removed the interrupt enabling code from the previous
commit, because it assumes that interrupt should already be enabled at the
point of EOI. However, this is incorrect: there is a window after a hart
claiming an interrupt and before irq_desc->lock getting acquired,
interrupt can be disabled during this window. Thus, EOI can be invoked
while the interrupt is disabled, effectively nullify this EOI. This
results in the interrupt never gets asserted again, and the device who
uses this interrupt appears frozen.
Make sure that interrupt is really enabled before EOI.
Fixes: a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask operations")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
---
v2:
- add unlikely() for optimization
- re-word commit message to make it clearer
drivers/irqchip/irq-sifive-plic.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index e1484905b7bd..0a233e9d9607 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -148,7 +148,13 @@ static void plic_irq_eoi(struct irq_data *d)
{
struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
- writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+ if (unlikely(irqd_irq_disabled(d))) {
+ plic_toggle(handler, d->hwirq, 1);
+ writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+ plic_toggle(handler, d->hwirq, 0);
+ } else {
+ writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+ }
}
#ifdef CONFIG_SMP
--
2.39.2
syzbot reports a memory leak in pppoe_sendmsg in 6.6 and 6.1 stable
releases. The problem has been fixed by the following patch which can be
cleanly applied to the 6.6 and 6.1 branches.
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with Syzkaller
Gavrilov Ilia (1):
pppoe: Fix memory leak in pppoe_sendmsg()
drivers/net/ppp/pppoe.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
--
2.39.2
From GCC commit 3f13154553f8546a ("df-scan: remove ad-hoc handling of
global regs in asms"), global registers will no longer be forced to add
to the def-use chain. Then current_thread_info(), current_stack_pointer
and __my_cpu_offset may be lifted out of the loop because they are no
longer treated as "volatile variables".
This optimization is still correct for the current_thread_info() and
current_stack_pointer usages because they are associated to a thread.
However it is wrong for __my_cpu_offset because it is associated to a
CPU rather than a thread: if the thread migrates to a different CPU in
the loop, __my_cpu_offset should be changed.
Change __my_cpu_offset definition to treat it as a "volatile variable",
in order to avoid such a mis-optimization.
Cc: stable(a)vger.kernel.org
Reported-by: Xiaotian Wu <wuxiaotian(a)loongson.cn>
Reported-by: Miao Wang <shankerwangmiao(a)gmail.com>
Signed-off-by: Xing Li <lixing(a)loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen(a)loongson.cn>
Signed-off-by: Rui Wang <wangrui(a)loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
arch/loongarch/include/asm/percpu.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
index 9b36ac003f89..03b98491d301 100644
--- a/arch/loongarch/include/asm/percpu.h
+++ b/arch/loongarch/include/asm/percpu.h
@@ -29,7 +29,12 @@ static inline void set_my_cpu_offset(unsigned long off)
__my_cpu_offset = off;
csr_write64(off, PERCPU_BASE_KS);
}
-#define __my_cpu_offset __my_cpu_offset
+
+#define __my_cpu_offset \
+({ \
+ __asm__ __volatile__("":"+r"(__my_cpu_offset)); \
+ __my_cpu_offset; \
+})
#define PERCPU_OP(op, asm_op, c_op) \
static __always_inline unsigned long __percpu_##op(void *ptr, \
--
2.43.0