The quilt patch titled
Subject: kasan: fix type cast in memory_is_poisoned_n
has been removed from the -mm tree. Its filename was
kasan-fix-type-cast-in-memory_is_poisoned_n.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Andrey Konovalov <andreyknvl(a)google.com>
Subject: kasan: fix type cast in memory_is_poisoned_n
Date: Tue, 4 Jul 2023 02:52:05 +0200
Commit bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13
builtins") introduced a bug into the memory_is_poisoned_n implementation:
it effectively removed the cast to a signed integer type after applying
KASAN_GRANULE_MASK.
As a result, KASAN started failing to properly check memset, memcpy, and
other similar functions.
Fix the bug by adding the cast back (through an additional signed integer
variable to make the code more readable).
Link: https://lkml.kernel.org/r/8c9e0251c2b8b81016255709d4ec42942dcaf018.16884318…
Fixes: bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13 builtins")
Signed-off-by: Andrey Konovalov <andreyknvl(a)google.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/generic.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/kasan/generic.c~kasan-fix-type-cast-in-memory_is_poisoned_n
+++ a/mm/kasan/generic.c
@@ -130,9 +130,10 @@ static __always_inline bool memory_is_po
if (unlikely(ret)) {
const void *last_byte = addr + size - 1;
s8 *last_shadow = (s8 *)kasan_mem_to_shadow(last_byte);
+ s8 last_accessible_byte = (unsigned long)last_byte & KASAN_GRANULE_MASK;
if (unlikely(ret != (unsigned long)last_shadow ||
- (((long)last_byte & KASAN_GRANULE_MASK) >= *last_shadow)))
+ last_accessible_byte >= *last_shadow))
return true;
}
return false;
_
Patches currently in -mm which might be from andreyknvl(a)google.com are
The quilt patch titled
Subject: bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
has been removed from the -mm tree. Its filename was
bootmem-remove-the-vmemmap-pages-from-kmemleak-in-free_bootmem_page.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Liu Shixin <liushixin2(a)huawei.com>
Subject: bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
Date: Tue, 4 Jul 2023 18:19:42 +0800
commit dd0ff4d12dd2 ("bootmem: remove the vmemmap pages from kmemleak in
put_page_bootmem") fix an overlaps existing problem of kmemleak. But the
problem still existed when HAVE_BOOTMEM_INFO_NODE is disabled, because in
this case, free_bootmem_page() will call free_reserved_page() directly.
Fix the problem by adding kmemleak_free_part() in free_bootmem_page() when
HAVE_BOOTMEM_INFO_NODE is disabled.
Link: https://lkml.kernel.org/r/20230704101942.2819426-1-liushixin2@huawei.com
Fixes: f41f2ed43ca5 ("mm: hugetlb: free the vmemmap pages associated with each HugeTLB page")
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Acked-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/bootmem_info.h | 2 ++
1 file changed, 2 insertions(+)
--- a/include/linux/bootmem_info.h~bootmem-remove-the-vmemmap-pages-from-kmemleak-in-free_bootmem_page
+++ a/include/linux/bootmem_info.h
@@ -3,6 +3,7 @@
#define __LINUX_BOOTMEM_INFO_H
#include <linux/mm.h>
+#include <linux/kmemleak.h>
/*
* Types for free bootmem stored in page->lru.next. These have to be in
@@ -59,6 +60,7 @@ static inline void get_page_bootmem(unsi
static inline void free_bootmem_page(struct page *page)
{
+ kmemleak_free_part(page_to_virt(page), PAGE_SIZE);
free_reserved_page(page);
}
#endif
_
Patches currently in -mm which might be from liushixin2(a)huawei.com are
The quilt patch titled
Subject: mm: call arch_swap_restore() from do_swap_page()
has been removed from the -mm tree. Its filename was
mm-call-arch_swap_restore-from-do_swap_page.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Peter Collingbourne <pcc(a)google.com>
Subject: mm: call arch_swap_restore() from do_swap_page()
Date: Mon, 22 May 2023 17:43:08 -0700
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance to
restore them. Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().
Link: https://lkml.kernel.org/r/20230523004312.1807357-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c51…
Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reported-by: Qun-wei Lin <Qun-wei.Lin(a)mediatek.com>
Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@…
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Steven Price <steven.price(a)arm.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: <stable(a)vger.kernel.org> [6.1+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/mm/memory.c~mm-call-arch_swap_restore-from-do_swap_page
+++ a/mm/memory.c
@@ -3954,6 +3954,13 @@ vm_fault_t do_swap_page(struct vm_fault
}
/*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, folio);
+
+ /*
* Remove the swap entry and conditionally try to free up the swapcache.
* We're already holding a reference on the page but haven't mapped it
* yet.
_
Patches currently in -mm which might be from pcc(a)google.com are
mm-call-arch_swap_restore-from-unuse_pte.patch
arm64-mte-simplify-swap-tag-restoration-logic.patch
The quilt patch titled
Subject: mm: disable CONFIG_PER_VMA_LOCK until its fixed
has been removed from the -mm tree. Its filename was
mm-disable-config_per_vma_lock-until-its-fixed.patch
This patch was dropped because it is obsolete
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: mm: disable CONFIG_PER_VMA_LOCK until its fixed
Date: Wed, 5 Jul 2023 18:14:00 -0700
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Disable per-VMA locks config to
prevent this issue until the fix is confirmed. This is expected to be a
temporary measure.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
Link: https://lkml.kernel.org/r/20230706011400.2949242-3-surenb@google.com
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Holger Hoffst��tte <holger(a)applied-asynchrony.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/Kconfig~mm-disable-config_per_vma_lock-until-its-fixed
+++ a/mm/Kconfig
@@ -1224,8 +1224,9 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
def_bool n
config PER_VMA_LOCK
- def_bool y
+ bool "Enable per-vma locking during page fault handling."
depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
+ depends on BROKEN
help
Allow per-vma locking during page fault handling.
_
Patches currently in -mm which might be from surenb(a)google.com are
mm-lock-a-vma-before-stack-expansion.patch
mm-lock-newly-mapped-vma-which-can-be-modified-after-it-becomes-visible.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
The quilt patch titled
Subject: fork: lock VMAs of the parent process when forking
has been removed from the -mm tree. Its filename was
fork-lock-vmas-of-the-parent-process-when-forking.patch
This patch was dropped because it is obsolete
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: fork: lock VMAs of the parent process when forking
Date: Wed, 5 Jul 2023 18:13:59 -0700
Patch series "Avoid memory corruption caused by per-VMA locks", v4.
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking while
forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork. I
tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~7% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disables per-VMA locks until the fix is tested and verified.
This patch (of 2):
When forking a child process, parent write-protects an anonymous page and
COW-shares it with the child being forked using copy_present_pte().
Parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because, COW-shared with the child), this might lead to some
stale writable TLB entries targeting the wrong (old) page. Similar issue
happened in the past with userfaultfd (see flush_tlb_page() call inside
do_wp_page()).
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue. This
fix can potentially regress some fork-heavy workloads. Kernel build time
did not show noticeable regression on a 56-core machine while a stress
test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~7%
regression. If such fork time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations
are possible if this regression proves to be problematic.
Link: https://lkml.kernel.org/r/20230706011400.2949242-1-surenb@google.com
Link: https://lkml.kernel.org/r/20230706011400.2949242-2-surenb@google.com
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffst��tte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-as…
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=3D217624
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Holger Hoffsttte <holger(a)applied-asynchrony.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/kernel/fork.c~fork-lock-vmas-of-the-parent-process-when-forking
+++ a/kernel/fork.c
@@ -658,6 +658,12 @@ static __latent_entropy int dup_mmap(str
retval = -EINTR;
goto fail_uprobe_end;
}
+#ifdef CONFIG_PER_VMA_LOCK
+ /* Disallow any page faults before calling flush_cache_dup_mm */
+ for_each_vma(old_vmi, mpnt)
+ vma_start_write(mpnt);
+ vma_iter_set(&old_vmi, 0);
+#endif
flush_cache_dup_mm(oldmm);
uprobe_dup_mmap(oldmm, mm);
/*
_
Patches currently in -mm which might be from surenb(a)google.com are
mm-disable-config_per_vma_lock-until-its-fixed.patch
mm-lock-a-vma-before-stack-expansion.patch
mm-lock-newly-mapped-vma-which-can-be-modified-after-it-becomes-visible.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
Somehow PR_GET_AUXV got added into PR_MCE_KILL's switch when
the patch was applied [1].
Thus move it out of the switch, to the place the patch added it.
In the recently released v6.4 kernel some user could, in
principle, be already using this feature by mapping the right
page and passing the PR_GET_AUXV constant as a pointer:
prctl(PR_MCE_KILL, PR_GET_AUXV, ...)
So this does change the behavior for users. We could keep the bug
since the other subcases in PR_MCE_KILL (PR_MCE_KILL_CLEAR and
PR_MCE_KILL_SET) do not overlap.
However, v6.4 may be recent enough (2 weeks old) that moving
the lines (rather than just adding a new case) does not break
anybody? Moreover, the documentation in man-pages was just
committed today [2].
Fixes: ddc65971bb67 ("prctl: add PR_GET_AUXV to copy auxv to userspace")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/d81864a7f7f43bca6afa2a09fc2e850e4050ab42.168061… [1]
Link: https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=8cf0… [2]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
kernel/sys.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/kernel/sys.c b/kernel/sys.c
index 339fee3eff6a..a36a27ebac33 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2529,11 +2529,6 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
else
return -EINVAL;
break;
- case PR_GET_AUXV:
- if (arg4 || arg5)
- return -EINVAL;
- error = prctl_get_auxv((void __user *)arg2, arg3);
- break;
default:
return -EINVAL;
}
@@ -2688,6 +2683,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
case PR_SET_VMA:
error = prctl_set_vma(arg2, arg3, arg4, arg5);
break;
+ case PR_GET_AUXV:
+ if (arg4 || arg5)
+ return -EINVAL;
+ error = prctl_get_auxv((void __user *)arg2, arg3);
+ break;
#ifdef CONFIG_KSM
case PR_SET_MEMORY_MERGE:
if (arg3 || arg4 || arg5)
base-commit: 6995e2de6891c724bfeb2db33d7b87775f913ad1
--
2.41.0
Lockdep is certainly right to complain about
(&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f
but task is already holding lock:
(&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db
Invert those to the usual ordering.
Fixes: 33313a747e81 ("mm: lock newly mapped VMA which can be modified after it becomes visible")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugh Dickins <hughd(a)google.com>
---
mm/mmap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 84c71431a527..3eda23c9ebe7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2809,11 +2809,11 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
if (vma_iter_prealloc(&vmi))
goto close_and_free_vma;
+ /* Lock the VMA since it is modified after insertion into VMA tree */
+ vma_start_write(vma);
if (vma->vm_file)
i_mmap_lock_write(vma->vm_file->f_mapping);
- /* Lock the VMA since it is modified after insertion into VMA tree */
- vma_start_write(vma);
vma_iter_store(&vmi, vma);
mm->map_count++;
if (vma->vm_file) {
--
2.35.3
Dear developer, I am a security researcher at Wuhan University. Recently I discovered a vulnerability in the driver of the USBcore module in the Linux kernel. This vulnerability will lead to an infinite loop in the probe process of the USB device, which will consume a lot of system resources. The vulnerability was found in kernel version 5.6.19 and tested to exist in the new 6.3.7 kernel version as well. I hope that after your review, you will be able to apply for a CVE number to disclose this vulnerability. If you need more detailed vulnerability information, please contact me. Thank you for your help.
With recent changes necessitating mmap_lock to be held for write while
expanding a stack, per-VMA locks should follow the same rules and be
write-locked to prevent page faults into the VMA being expanded. Add
the necessary locking.
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
---
mm/mmap.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/mmap.c b/mm/mmap.c
index 204ddcd52625..c66e4622a557 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1977,6 +1977,8 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
return -ENOMEM;
}
+ /* Lock the VMA before expanding to prevent concurrent page faults */
+ vma_start_write(vma);
/*
* vma->vm_start/vm_end cannot change under us because the caller
* is required to hold the mmap_lock in read mode. We need the
@@ -2064,6 +2066,8 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
return -ENOMEM;
}
+ /* Lock the VMA before expanding to prevent concurrent page faults */
+ vma_start_write(vma);
/*
* vma->vm_start/vm_end cannot change under us because the caller
* is required to hold the mmap_lock in read mode. We need the
--
2.41.0.255.g8b1d071c50-goog
The patch titled
Subject: mm: hugetlb_vmemmap: fix a race between vmemmap pmd split
has been added to the -mm mm-unstable branch. Its filename is
mm-hugetlb_vmemmap-fix-a-race-between-vmemmap-pmd-split.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb_vmemmap: fix a race between vmemmap pmd split
Date: Fri, 7 Jul 2023 11:38:59 +0800
The local variable @page in __split_vmemmap_huge_pmd() to obtain a pmd
page without holding page_table_lock may possiblely get the page table
page instead of a huge pmd page.
The effect may be in set_pte_at() since we may pass an invalid page
struct, if set_pte_at() wants to access the page struct (e.g.
CONFIG_PAGE_TABLE_CHECK is enabled), it may crash the kernel.
So fix it. And inline __split_vmemmap_huge_pmd() since it only has one
user.
Link: https://lkml.kernel.org/r/20230707033859.16148-1-songmuchun@bytedance.com
Fixes: d8d55f5616cf ("mm: sparsemem: use page table lock to protect kernel pmd operations")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb_vmemmap.c | 34 ++++++++++++++--------------------
1 file changed, 14 insertions(+), 20 deletions(-)
--- a/mm/hugetlb_vmemmap.c~mm-hugetlb_vmemmap-fix-a-race-between-vmemmap-pmd-split
+++ a/mm/hugetlb_vmemmap.c
@@ -36,14 +36,22 @@ struct vmemmap_remap_walk {
struct list_head *vmemmap_pages;
};
-static int __split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
+static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
{
pmd_t __pmd;
int i;
unsigned long addr = start;
- struct page *page = pmd_page(*pmd);
- pte_t *pgtable = pte_alloc_one_kernel(&init_mm);
+ struct page *head;
+ pte_t *pgtable;
+
+ spin_lock(&init_mm.page_table_lock);
+ head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
+ spin_unlock(&init_mm.page_table_lock);
+ if (!head)
+ return 0;
+
+ pgtable = pte_alloc_one_kernel(&init_mm);
if (!pgtable)
return -ENOMEM;
@@ -53,7 +61,7 @@ static int __split_vmemmap_huge_pmd(pmd_
pte_t entry, *pte;
pgprot_t pgprot = PAGE_KERNEL;
- entry = mk_pte(page + i, pgprot);
+ entry = mk_pte(head + i, pgprot);
pte = pte_offset_kernel(&__pmd, addr);
set_pte_at(&init_mm, addr, pte, entry);
}
@@ -65,8 +73,8 @@ static int __split_vmemmap_huge_pmd(pmd_
* be treated as indepdenent small pages (as they can be freed
* individually).
*/
- if (!PageReserved(page))
- split_page(page, get_order(PMD_SIZE));
+ if (!PageReserved(head))
+ split_page(head, get_order(PMD_SIZE));
/* Make pte visible before pmd. See comment in pmd_install(). */
smp_wmb();
@@ -80,20 +88,6 @@ static int __split_vmemmap_huge_pmd(pmd_
return 0;
}
-static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
-{
- int leaf;
-
- spin_lock(&init_mm.page_table_lock);
- leaf = pmd_leaf(*pmd);
- spin_unlock(&init_mm.page_table_lock);
-
- if (!leaf)
- return 0;
-
- return __split_vmemmap_huge_pmd(pmd, start);
-}
-
static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end,
struct vmemmap_remap_walk *walk)
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlb_vmemmap-fix-a-race-between-vmemmap-pmd-split.patch
Hi,
Can you please help in back porting the below patch to linux-5.15.y
stable tree:
commit d8c47cc7bf602ef73384a00869a70148146c1191("mm: page_io: fix psi
memory pressure error on cold swapins") .
With the absence of this patch we are seeing some user space tools, like
Android low memory killer based on PSI events, bit aggressive as it
makes the PSI is accounted for even cold swapin on a device where swap
is mounted on a zram with slower backing dev.
Thanks,
Charan
From: Rafał Miłecki <rafal(a)milecki.pl>
commit f99e6d7c4ed3be2531bd576425a5bd07fb133bd7 upstream.
While bringing hardware up we should perform a full reset including the
switch bit (BGMAC_BCMA_IOCTL_SW_RESET aka SICF_SWRST). It's what
specification says and what reference driver does.
This seems to be critical for the BCM5358. Without this hardware doesn't
get initialized properly and doesn't seem to transmit or receive any
packets.
Originally bgmac was calling bgmac_chip_reset() before setting
"has_robosw" property which resulted in expected behaviour. That has
changed as a side effect of adding platform device support which
regressed BCM5358 support.
Fixes: f6a95a24957a ("net: ethernet: bgmac: Add platform device support")
Cc: Jon Mason <jdmason(a)kudzu.us>
Signed-off-by: Rafał Miłecki <rafal(a)milecki.pl>
Reviewed-by: Leon Romanovsky <leonro(a)nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli(a)gmail.com>
Link: https://lore.kernel.org/r/20230227091156.19509-1-zajec5@gmail.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
Upstream commit wasn't backported to 5.4 (and older) because it couldn't
be cherry-picked cleanly. There was a small fuzz caused by a missing
commit 8c7da63978f1 ("bgmac: configure MTU and add support for frames
beyond 8192 byte size").
I've manually cherry-picked fix for BCM5358 to the linux-5.4.x.
---
drivers/net/ethernet/broadcom/bgmac.c | 8 ++++++--
drivers/net/ethernet/broadcom/bgmac.h | 2 ++
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
index 193722334d93..89a63fdbe0e3 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -890,13 +890,13 @@ static void bgmac_chip_reset_idm_config(struct bgmac *bgmac)
if (iost & BGMAC_BCMA_IOST_ATTACHED) {
flags = BGMAC_BCMA_IOCTL_SW_CLKEN;
- if (!bgmac->has_robosw)
+ if (bgmac->in_init || !bgmac->has_robosw)
flags |= BGMAC_BCMA_IOCTL_SW_RESET;
}
bgmac_clk_enable(bgmac, flags);
}
- if (iost & BGMAC_BCMA_IOST_ATTACHED && !bgmac->has_robosw)
+ if (iost & BGMAC_BCMA_IOST_ATTACHED && (bgmac->in_init || !bgmac->has_robosw))
bgmac_idm_write(bgmac, BCMA_IOCTL,
bgmac_idm_read(bgmac, BCMA_IOCTL) &
~BGMAC_BCMA_IOCTL_SW_RESET);
@@ -1489,6 +1489,8 @@ int bgmac_enet_probe(struct bgmac *bgmac)
struct net_device *net_dev = bgmac->net_dev;
int err;
+ bgmac->in_init = true;
+
bgmac_chip_intrs_off(bgmac);
net_dev->irq = bgmac->irq;
@@ -1538,6 +1540,8 @@ int bgmac_enet_probe(struct bgmac *bgmac)
net_dev->hw_features = net_dev->features;
net_dev->vlan_features = net_dev->features;
+ bgmac->in_init = false;
+
err = register_netdev(bgmac->net_dev);
if (err) {
dev_err(bgmac->dev, "Cannot register net device\n");
diff --git a/drivers/net/ethernet/broadcom/bgmac.h b/drivers/net/ethernet/broadcom/bgmac.h
index 40d02fec2747..76930b8353d6 100644
--- a/drivers/net/ethernet/broadcom/bgmac.h
+++ b/drivers/net/ethernet/broadcom/bgmac.h
@@ -511,6 +511,8 @@ struct bgmac {
int irq;
u32 int_mask;
+ bool in_init;
+
/* Current MAC state */
int mac_speed;
int mac_duplex;
--
2.35.3
From: Quan Zhou <quan.zhou(a)mediatek.com>
For some cases as below, we may encounter the unpreditable chip stats
in driver probe()
* The system reboot flow do not work properly, such as kernel oops while
rebooting, and then the driver do not go back to default status at
this moment.
* Similar to the flow above. If the device was enabled in BIOS or UEFI,
the system may switch to Linux without driver fully shutdown.
To avoid the problem, force push the device back to default in probe()
* mt7921e_mcu_fw_pmctrl() : return control privilege to chip side.
* mt7921_wfsys_reset() : cleanup chip config before resource init.
Error log
[59007.600714] mt7921e 0000:02:00.0: ASIC revision: 79220010
[59010.889773] mt7921e 0000:02:00.0: Message 00000010 (seq 1) timeout
[59010.889786] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59014.217839] mt7921e 0000:02:00.0: Message 00000010 (seq 2) timeout
[59014.217852] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59017.545880] mt7921e 0000:02:00.0: Message 00000010 (seq 3) timeout
[59017.545893] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59020.874086] mt7921e 0000:02:00.0: Message 00000010 (seq 4) timeout
[59020.874099] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59024.202019] mt7921e 0000:02:00.0: Message 00000010 (seq 5) timeout
[59024.202033] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59027.530082] mt7921e 0000:02:00.0: Message 00000010 (seq 6) timeout
[59027.530096] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59030.857888] mt7921e 0000:02:00.0: Message 00000010 (seq 7) timeout
[59030.857904] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59034.185946] mt7921e 0000:02:00.0: Message 00000010 (seq 8) timeout
[59034.185961] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59037.514249] mt7921e 0000:02:00.0: Message 00000010 (seq 9) timeout
[59037.514262] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59040.842362] mt7921e 0000:02:00.0: Message 00000010 (seq 10) timeout
[59040.842375] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59040.923845] mt7921e 0000:02:00.0: hardware init failed
Cc: stable(a)vger.kernel.org
Fixes: 5c14a5f944b9 ("mt76: mt7921: introduce mt7921e support")
Tested-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Tested-by: Juan Martinez <juan.martinez(a)amd.com>
Co-developed-by: Leon Yen <leon.yen(a)mediatek.com>
Signed-off-by: Leon Yen <leon.yen(a)mediatek.com>
Signed-off-by: Quan Zhou <quan.zhou(a)mediatek.com>
Signed-off-by: Deren Wu <deren.wu(a)mediatek.com>
---
v2: The v1 patch has been accpeted in wireless patchwork. However,
this patch is very important for existing system, we need to add
cc stable tag and hope this patch can be pulled to stable branch earlier.
---
drivers/net/wireless/mediatek/mt76/mt7921/dma.c | 4 ----
drivers/net/wireless/mediatek/mt76/mt7921/mcu.c | 8 --------
drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 8 ++++++++
3 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/dma.c b/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
index f0a80c2b476a..4153cd6c2a01 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
@@ -231,10 +231,6 @@ int mt7921_dma_init(struct mt7921_dev *dev)
if (ret)
return ret;
- ret = mt7921_wfsys_reset(dev);
- if (ret)
- return ret;
-
/* init tx queue */
ret = mt76_connac_init_tx_queues(dev->phy.mt76, MT7921_TXQ_BAND0,
MT7921_TX_RING_SIZE,
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
index c69ce6df4956..f55caa00ac69 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
@@ -476,12 +476,6 @@ static int mt7921_load_firmware(struct mt7921_dev *dev)
{
int ret;
- ret = mt76_get_field(dev, MT_CONN_ON_MISC, MT_TOP_MISC2_FW_N9_RDY);
- if (ret && mt76_is_mmio(&dev->mt76)) {
- dev_dbg(dev->mt76.dev, "Firmware is already download\n");
- goto fw_loaded;
- }
-
ret = mt76_connac2_load_patch(&dev->mt76, mt7921_patch_name(dev));
if (ret)
return ret;
@@ -504,8 +498,6 @@ static int mt7921_load_firmware(struct mt7921_dev *dev)
return -EIO;
}
-fw_loaded:
-
#ifdef CONFIG_PM
dev->mt76.hw->wiphy->wowlan = &mt76_connac_wowlan_support;
#endif /* CONFIG_PM */
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
index 1c727870bbdb..6c512bc75685 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
@@ -325,6 +325,10 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
bus_ops->rmw = mt7921_rmw;
dev->mt76.bus = bus_ops;
+ ret = mt7921e_mcu_fw_pmctrl(dev);
+ if (ret)
+ goto err_free_dev;
+
ret = __mt7921e_mcu_drv_pmctrl(dev);
if (ret)
goto err_free_dev;
@@ -333,6 +337,10 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
(mt7921_l1_rr(dev, MT_HW_REV) & 0xff);
dev_info(mdev->dev, "ASIC revision: %04x\n", mdev->rev);
+ ret = mt7921_wfsys_reset(dev);
+ if (ret)
+ goto err_free_dev;
+
mt76_wr(dev, MT_WFDMA0_HOST_INT_ENA, 0);
mt76_wr(dev, MT_PCIE_MAC_INT_ENABLE, 0xff);
--
2.18.0
From: Long Li <longli(a)microsoft.com>
It's inefficient to ring the doorbell page every time a WQE is posted to
the received queue. Excessive MMIO writes result in CPU spending more
time waiting on LOCK instructions (atomic operations), resulting in
poor scaling performance.
Move the code for ringing doorbell page to where after we have posted all
WQEs to the receive queue during a callback from napi_poll().
With this change, tests showed an improvement from 120G/s to 160G/s on a
200G physical link, with 16 or 32 hardware queues.
Tests showed no regression in network latency benchmarks on single
connection.
While we are making changes in this code path, change the code for
ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
hardware specification specifies that it should set to 0. Although
currently the hardware doesn't enforce the check, in the future releases
it may do.
Cc: stable(a)vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz(a)microsoft.com>
Reviewed-by: Dexuan Cui <decui(a)microsoft.com>
Signed-off-by: Long Li <longli(a)microsoft.com>
---
Change log:
v2:
Check for comp_read > 0 as it might be negative on completion error.
Set rq.wqe_cnt to 0 according to BNIC spec.
v3:
Add details in the commit on the reason of performance increase and test numbers.
Add details in the commit on why rq.wqe_cnt should be set to 0 according to hardware spec.
Add "Reviewed-by" from Haiyang and Dexuan.
drivers/net/ethernet/microsoft/mana/gdma_main.c | 5 ++++-
drivers/net/ethernet/microsoft/mana/mana_en.c | 10 ++++++++--
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 8f3f78b68592..3765d3389a9a 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -300,8 +300,11 @@ static void mana_gd_ring_doorbell(struct gdma_context *gc, u32 db_index,
void mana_gd_wq_ring_doorbell(struct gdma_context *gc, struct gdma_queue *queue)
{
+ /* Hardware Spec specifies that software client should set 0 for
+ * wqe_cnt for Receive Queues. This value is not used in Send Queues.
+ */
mana_gd_ring_doorbell(gc, queue->gdma_dev->doorbell, queue->type,
- queue->id, queue->head * GDMA_WQE_BU_SIZE, 1);
+ queue->id, queue->head * GDMA_WQE_BU_SIZE, 0);
}
void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index cd4d5ceb9f2d..1d8abe63fcb8 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1383,8 +1383,8 @@ static void mana_post_pkt_rxq(struct mana_rxq *rxq)
recv_buf_oob = &rxq->rx_oobs[curr_index];
- err = mana_gd_post_and_ring(rxq->gdma_rq, &recv_buf_oob->wqe_req,
- &recv_buf_oob->wqe_inf);
+ err = mana_gd_post_work_request(rxq->gdma_rq, &recv_buf_oob->wqe_req,
+ &recv_buf_oob->wqe_inf);
if (WARN_ON_ONCE(err))
return;
@@ -1654,6 +1654,12 @@ static void mana_poll_rx_cq(struct mana_cq *cq)
mana_process_rx_cqe(rxq, cq, &comp[i]);
}
+ if (comp_read > 0) {
+ struct gdma_context *gc = rxq->gdma_rq->gdma_dev->gdma_context;
+
+ mana_gd_wq_ring_doorbell(gc, rxq->gdma_rq);
+ }
+
if (rxq->xdp_flush)
xdp_do_flush();
}
--
2.34.1
A crash was reported in amd-sfh related to hid core initialization
before SFH initialization has run.
```
amdtp_hid_request+0x36/0x50 [amd_sfh
2e3095779aada9fdb1764f08ca578ccb14e41fe4]
sensor_hub_get_feature+0xad/0x170 [hid_sensor_hub
d6157999c9d260a1bfa6f27d4a0dc2c3e2c5654e]
hid_sensor_parse_common_attributes+0x217/0x310 [hid_sensor_iio_common
07a7935272aa9c7a28193b574580b3e953a64ec4]
hid_gyro_3d_probe+0x7f/0x2e0 [hid_sensor_gyro_3d
9f2eb51294a1f0c0315b365f335617cbaef01eab]
platform_probe+0x44/0xa0
really_probe+0x19e/0x3e0
```
Ensure that sensors have been set up before calling into
amd_sfh_get_report() or amd_sfh_set_report().
Cc: stable(a)vger.kernel.org
Cc: Linux regression tracking (Thorsten Leemhuis) <regressions(a)leemhuis.info>
Fixes: 7bcfdab3f0c6 ("HID: amd_sfh: if no sensors are enabled, clean up")
Reported-by: Haochen Tong <linux(a)hexchain.org>
Link: https://lore.kernel.org/all/3250319.ancTxkQ2z5@zen/T/
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
drivers/hid/amd-sfh-hid/amd_sfh_client.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
index d9b7b01900b5..88f3d913eaa1 100644
--- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
+++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
@@ -25,6 +25,9 @@ void amd_sfh_set_report(struct hid_device *hid, int report_id,
struct amdtp_cl_data *cli_data = hid_data->cli_data;
int i;
+ if (!cli_data->is_any_sensor_enabled)
+ return;
+
for (i = 0; i < cli_data->num_hid_devices; i++) {
if (cli_data->hid_sensor_hubs[i] == hid) {
cli_data->cur_hid_dev = i;
@@ -41,6 +44,9 @@ int amd_sfh_get_report(struct hid_device *hid, int report_id, int report_type)
struct request_list *req_list = &cli_data->req_list;
int i;
+ if (!cli_data->is_any_sensor_enabled)
+ return -ENODEV;
+
for (i = 0; i < cli_data->num_hid_devices; i++) {
if (cli_data->hid_sensor_hubs[i] == hid) {
struct request_list *new = kzalloc(sizeof(*new), GFP_KERNEL);
--
2.34.1
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking
while forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork.
I tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~7% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disables per-VMA locks until the fix is tested and verified.
Both patches apply cleanly over Linus' ToT and stable 6.4.y branch.
Changes from v3 posted at [3]:
- Replace vma_iter_init with vma_iter_set, per Liam R. Howlett
- Update the regression number caused by additional VMA tree walk
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
[3] https://lore.kernel.org/all/20230705171213.2843068-1-surenb@google.com
Suren Baghdasaryan (2):
fork: lock VMAs of the parent process when forking
mm: disable CONFIG_PER_VMA_LOCK until its fixed
kernel/fork.c | 6 ++++++
mm/Kconfig | 3 ++-
2 files changed, 8 insertions(+), 1 deletion(-)
--
2.41.0.255.g8b1d071c50-goog
This is the start of the stable review cycle for the 6.4.2 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 06 Jul 2023 08:46:01 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.2-rc2.…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.4.2-rc2
Linus Torvalds <torvalds(a)linux-foundation.org>
gup: avoid stack expansion warning for known-good case
SeongJae Park <sj(a)kernel.org>
arch/arm64/mm/fault: Fix undeclared variable error in do_page_fault()
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Demi Marie Obenour <demi(a)invisiblethingslab.com>
dm ioctl: Avoid double-fetch of version
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Mike Kravetz <mike.kravetz(a)oracle.com>
hugetlb: revert use of page_cache_next_miss()
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Dan Williams <dan.j.williams(a)intel.com>
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Jeff Layton <jlayton(a)kernel.org>
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Thomas Weißschuh <linux(a)weissschuh.net>
tools/nolibc: x86_64: disable stack protector for _start
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
arch/arm64/mm/fault.c | 2 --
drivers/cxl/core/pci.c | 27 +++--------------
drivers/cxl/cxl.h | 1 -
drivers/cxl/port.c | 14 ++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/md/dm-ioctl.c | 33 +++++++++++++--------
drivers/nubus/proc.c | 22 ++++++++++----
drivers/pci/pci-acpi.c | 53 +++++++++++++++++++++++++---------
fs/hugetlbfs/inode.c | 8 ++---
fs/nfs/inode.c | 2 +-
include/linux/mm.h | 4 ++-
mm/hugetlb.c | 12 ++++----
mm/memory.c | 4 +++
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/include/nolibc/arch-x86_64.h | 2 +-
tools/testing/cxl/Kbuild | 1 -
tools/testing/cxl/test/mock.c | 15 ----------
20 files changed, 132 insertions(+), 99 deletions(-)
This is the start of the stable review cycle for the 6.3.12 release.
There are 14 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 06 Jul 2023 08:46:01 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.3.12-rc2…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.3.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.3.12-rc2
Linus Torvalds <torvalds(a)linux-foundation.org>
gup: avoid stack expansion warning for known-good case
Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
drm/amd/display: Ensure vmin and vmax adjust for DCE
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Demi Marie Obenour <demi(a)invisiblethingslab.com>
dm ioctl: Avoid double-fetch of version
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Dan Williams <dan.j.williams(a)intel.com>
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Jeff Layton <jlayton(a)kernel.org>
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Aric Cyr <aric.cyr(a)amd.com>
drm/amd/display: Do not update DRR while BW optimizations pending
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
drivers/cxl/core/pci.c | 27 +++-------------
drivers/cxl/cxl.h | 1 -
drivers/cxl/port.c | 14 +++------
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/gpu/drm/amd/display/dc/core/dc.c | 49 +++++++++++++++++------------
drivers/md/dm-ioctl.c | 33 ++++++++++++--------
drivers/nubus/proc.c | 22 ++++++++++---
drivers/pci/pci-acpi.c | 53 ++++++++++++++++++++++++--------
fs/nfs/inode.c | 2 +-
include/linux/mm.h | 4 ++-
mm/memory.c | 4 +++
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/testing/cxl/Kbuild | 1 -
tools/testing/cxl/test/mock.c | 15 ---------
17 files changed, 152 insertions(+), 104 deletions(-)
When unloading the MANA driver, mana_dealloc_queues() waits for the MANA
hardware to complete any inflight packets and set the pending send count
to zero. But if the hardware has failed, mana_dealloc_queues()
could wait forever.
Fix this by adding a timeout to the wait. Set the timeout to 120 seconds,
which is a somewhat arbitrary value that is more than long enough for
functional hardware to complete any sends.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
---
V4 -> V5:
* Added fixes tag
* Changed the usleep_range from static to incremental value.
* Initialized timeout in the begining.
---
Signed-off-by: Souradeep Chakrabarti <schakrabarti(a)linux.microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 30 ++++++++++++++++---
1 file changed, 26 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index a499e460594b..56b7074db1a2 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2345,9 +2345,13 @@ int mana_attach(struct net_device *ndev)
static int mana_dealloc_queues(struct net_device *ndev)
{
struct mana_port_context *apc = netdev_priv(ndev);
+ unsigned long timeout = jiffies + 120 * HZ;
struct gdma_dev *gd = apc->ac->gdma_dev;
struct mana_txq *txq;
+ struct sk_buff *skb;
+ struct mana_cq *cq;
int i, err;
+ u32 tsleep;
if (apc->port_is_up)
return -EINVAL;
@@ -2363,15 +2367,33 @@ static int mana_dealloc_queues(struct net_device *ndev)
* to false, but it doesn't matter since mana_start_xmit() drops any
* new packets due to apc->port_is_up being false.
*
- * Drain all the in-flight TX packets
+ * Drain all the in-flight TX packets.
+ * A timeout of 120 seconds for all the queues is used.
+ * This will break the while loop when h/w is not responding.
+ * This value of 120 has been decided here considering max
+ * number of queues.
*/
+
for (i = 0; i < apc->num_queues; i++) {
txq = &apc->tx_qp[i].txq;
-
- while (atomic_read(&txq->pending_sends) > 0)
- usleep_range(1000, 2000);
+ tsleep = 1000;
+ while (atomic_read(&txq->pending_sends) > 0 &&
+ time_before(jiffies, timeout)) {
+ usleep_range(tsleep, tsleep << 1);
+ tsleep <<= 1;
+ }
}
+ for (i = 0; i < apc->num_queues; i++) {
+ txq = &apc->tx_qp[i].txq;
+ cq = &apc->tx_qp[i].tx_cq;
+ while (atomic_read(&txq->pending_sends)) {
+ skb = skb_dequeue(&txq->pending_skbs);
+ mana_unmap_skb(skb, apc);
+ dev_consume_skb_any(skb);
+ atomic_sub(1, &txq->pending_sends);
+ }
+ }
/* We're 100% sure the queues can no longer be woken up, because
* we're sure now mana_poll_tx_cq() can't be running.
*/
--
2.34.1
Making 'blk' sector_t (i.e. 64 bit if LBD support is active)
fails the 'blk>0' test in the partition block loop if a
value of (signed int) -1 is used to mark the end of the
partition block list.
This bug was introduced in patch 3 of my prior Amiga partition
support fixes series, and spotted by Christian Zigotzky when
testing the latest block updates.
Explicitly cast 'blk' to signed int to allow use of -1 to
terminate the partition block linked list.
Reported-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
Fixes: b6f3f28f60 ("block: add overflow checks for Amiga partition support")
Message-ID: 024ce4fa-cc6d-50a2-9aae-3701d0ebf668(a)xenosoft.de
Cc: <stable(a)vger.kernel.org> # 5.2
Link: https://lore.kernel.org/r/024ce4fa-cc6d-50a2-9aae-3701d0ebf668@xenosoft.de
Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com>
Reviewed-by: Martin Steigerwald <martin(a)lichtvoll.de>
Tested-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
--
Changes since v1:
- corrected Fixes: tag
- added Tested-by:
- reworded commit message to describe filesystem partition
size mismatch problem
Changes since v2:
Adrian Glaubitz:
- fix typo in commit message
Changes since v3:
Greg KH:
- fix stable tag
Geert Uytterhoeven:
- revert changes to commit message since v1
---
block/partitions/amiga.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c
index ed222b9c901b..506921095412 100644
--- a/block/partitions/amiga.c
+++ b/block/partitions/amiga.c
@@ -90,7 +90,7 @@ int amiga_partition(struct parsed_partitions *state)
}
blk = be32_to_cpu(rdb->rdb_PartitionList);
put_dev_sector(sect);
- for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) {
+ for (part = 1; (s32) blk>0 && part<=16; part++, put_dev_sector(sect)) {
/* Read in terms partition table understands */
if (check_mul_overflow(blk, (sector_t) blksize, &blk)) {
pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n",
--
2.17.1
PARF_SLV_ADDR_SPACE_SIZE_2_3_3 macro is used for IPQ8074
2_3_3 post_init ops. pcie slave addr size was initially set
to 0x358, but was wrongly changed to 0x168 as a part of
"PCI: qcom: Remove PCIE20_ prefix from register definitions"
Fixing it, by using the right macro PARF_SLV_ADDR_SPACE_SIZE
and removing the unused PARF_SLV_ADDR_SPACE_SIZE_2_3_3.
Without this pcie bring up on IPQ8074 is broken now.
Fixes: 39171b33f652 ("PCI: qcom: Remove PCIE20_ prefix from register definitions")
Signed-off-by: Sricharan Ramabadhran <quic_srichara(a)quicinc.com>
---
[V2] Fixed the 'fixes tag' correctly, subject, right macro usage
drivers/pci/controller/dwc/pcie-qcom.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 4ab30892f6ef..1689d072fe86 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -43,7 +43,6 @@
#define PARF_PHY_REFCLK 0x4c
#define PARF_CONFIG_BITS 0x50
#define PARF_DBI_BASE_ADDR 0x168
-#define PARF_SLV_ADDR_SPACE_SIZE_2_3_3 0x16c /* Register offset specific to IP ver 2.3.3 */
#define PARF_MHI_CLOCK_RESET_CTRL 0x174
#define PARF_AXI_MSTR_WR_ADDR_HALT 0x178
#define PARF_AXI_MSTR_WR_ADDR_HALT_V2 0x1a8
@@ -811,7 +810,7 @@ static int qcom_pcie_post_init_2_3_3(struct qcom_pcie *pcie)
u32 val;
writel(SLV_ADDR_SPACE_SZ,
- pcie->parf + PARF_SLV_ADDR_SPACE_SIZE_2_3_3);
+ pcie->parf + PARF_SLV_ADDR_SPACE_SIZE);
val = readl(pcie->parf + PARF_PHY_CTRL);
val &= ~PHY_TEST_PWR_DOWN;
--
2.34.1
The patch titled
Subject: mm: disable CONFIG_PER_VMA_LOCK until its fixed
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-disable-config_per_vma_lock-until-its-fixed.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: mm: disable CONFIG_PER_VMA_LOCK until its fixed
Date: Wed, 5 Jul 2023 18:14:00 -0700
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Disable per-VMA locks config to
prevent this issue until the fix is confirmed. This is expected to be a
temporary measure.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
Link: https://lkml.kernel.org/r/20230706011400.2949242-3-surenb@google.com
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Holger Hoffst��tte <holger(a)applied-asynchrony.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/Kconfig~mm-disable-config_per_vma_lock-until-its-fixed
+++ a/mm/Kconfig
@@ -1224,8 +1224,9 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
def_bool n
config PER_VMA_LOCK
- def_bool y
+ bool "Enable per-vma locking during page fault handling."
depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
+ depends on BROKEN
help
Allow per-vma locking during page fault handling.
_
Patches currently in -mm which might be from surenb(a)google.com are
fork-lock-vmas-of-the-parent-process-when-forking.patch
mm-disable-config_per_vma_lock-until-its-fixed.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
The patch titled
Subject: fork: lock VMAs of the parent process when forking
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fork-lock-vmas-of-the-parent-process-when-forking.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: fork: lock VMAs of the parent process when forking
Date: Wed, 5 Jul 2023 18:13:59 -0700
Patch series "Avoid memory corruption caused by per-VMA locks", v4.
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking while
forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork. I
tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~7% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disables per-VMA locks until the fix is tested and verified.
This patch (of 2):
When forking a child process, parent write-protects an anonymous page and
COW-shares it with the child being forked using copy_present_pte().
Parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because, COW-shared with the child), this might lead to some
stale writable TLB entries targeting the wrong (old) page. Similar issue
happened in the past with userfaultfd (see flush_tlb_page() call inside
do_wp_page()).
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue. This
fix can potentially regress some fork-heavy workloads. Kernel build time
did not show noticeable regression on a 56-core machine while a stress
test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~7%
regression. If such fork time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations
are possible if this regression proves to be problematic.
Link: https://lkml.kernel.org/r/20230706011400.2949242-1-surenb@google.com
Link: https://lkml.kernel.org/r/20230706011400.2949242-2-surenb@google.com
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffst��tte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-as…
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=3D217624
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/kernel/fork.c~fork-lock-vmas-of-the-parent-process-when-forking
+++ a/kernel/fork.c
@@ -658,6 +658,12 @@ static __latent_entropy int dup_mmap(str
retval = -EINTR;
goto fail_uprobe_end;
}
+#ifdef CONFIG_PER_VMA_LOCK
+ /* Disallow any page faults before calling flush_cache_dup_mm */
+ for_each_vma(old_vmi, mpnt)
+ vma_start_write(mpnt);
+ vma_iter_set(&old_vmi, 0);
+#endif
flush_cache_dup_mm(oldmm);
uprobe_dup_mmap(oldmm, mm);
/*
_
Patches currently in -mm which might be from surenb(a)google.com are
fork-lock-vmas-of-the-parent-process-when-forking.patch
mm-disable-config_per_vma_lock-until-its-fixed.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking
while forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork.
I tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disabled per-VMA locks until the fix is tested and verified.
Both patches apply cleanly over Linus' ToT and stable 6.4.y branch.
Changes from v2 posted at [3]:
- Move VMA locking before flush_cache_dup_mm, per David Hildenbrand
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
[3] https://lore.kernel.org/all/20230705063711.2670599-1-surenb@google.com/
Suren Baghdasaryan (2):
fork: lock VMAs of the parent process when forking
mm: disable CONFIG_PER_VMA_LOCK until its fixed
kernel/fork.c | 6 ++++++
mm/Kconfig | 3 ++-
2 files changed, 8 insertions(+), 1 deletion(-)
--
2.41.0.255.g8b1d071c50-goog
On 7/5/23 21:37, Ranjan kumar wrote:
> Hi Damien,
> Regarding delay:
> As Sathya already mentioned as this is our hardware specific behavior and
> we are confident that the increased retry count
> is sufficient from our hardware perspective for any new systems too. So, we
> want to go with this change .
Fine. Adding a comment above the macro definitions to explain something like
that would be nice.
> Apart from that, I will change the name as suggested .
>
> Thanks & Regards,
> Ranjan
Please avoid top posting.
> On Thu, 29 Jun 2023 at 05:24, Damien Le Moal <dlemoal(a)kernel.org> wrote:
>
>> On 6/28/23 16:05, Ranjan Kumar wrote:
>>> Doorbell and Host diagnostic registers could return 0 even
>>> after 3 retries and that leads to occasional resets of the
>>> controllers, hence increased the retry count to thirty.
>>
>> The magic value "3" for retry count was already that, magic. Why would
>> things
>> work better with 30 ? What is the reasoning ? Isn't a udelay needed to
>> avoid
>> that many retries ?
>>
>>>
>>> Fixes: b899202901a8 ("mpt3sas: Add separate function for aero doorbell
>> reads ")
>>> Cc: stable(a)vger.kernel.org
>>> Signed-off-by: Ranjan Kumar <ranjan.kumar(a)broadcom.com>
>>
>> [..]
>>
>>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h
>> b/drivers/scsi/mpt3sas/mpt3sas_base.h
>>> index 05364aa15ecd..3b8ec4fd2d21 100644
>>> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
>>> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
>>> @@ -160,6 +160,8 @@
>>>
>>> #define IOC_OPERATIONAL_WAIT_COUNT 10
>>>
>>> +#define READL_RETRY_COUNT_OF_THIRTY 30
>>> +#define READL_RETRY_COUNT_OF_THREE 3
>>
>> Less than ideal naming I think. If the values need to be changed again, a
>> lot of
>> code will need to change. What about soemthing like:
>>
>> #define READL_RETRY_COUNT 30
>> #define READL_RETRY_SHORT_COUNT 3
>>
>>> /*
>>> * NVMe defines
>>> */
>>> @@ -994,7 +996,7 @@ typedef void (*NVME_BUILD_PRP)(struct
>> MPT3SAS_ADAPTER *ioc, u16 smid,
>>> typedef void (*PUT_SMID_IO_FP_HIP) (struct MPT3SAS_ADAPTER *ioc, u16
>> smid,
>>> u16 funcdep);
>>> typedef void (*PUT_SMID_DEFAULT) (struct MPT3SAS_ADAPTER *ioc, u16
>> smid);
>>> -typedef u32 (*BASE_READ_REG) (const volatile void __iomem *addr);
>>> +typedef u32 (*BASE_READ_REG) (const volatile void __iomem *addr, u8
>> retry_count);
>>> /*
>>> * To get high iops reply queue's msix index when high iops mode is
>> enabled
>>> * else get the msix index of general reply queues.
>>
>> --
>> Damien Le Moal
>> Western Digital Research
>>
>>
>
--
Damien Le Moal
Western Digital Research
Hi,
The following commit landed in 6.4 that helps some occasional NULL
pointer dereferences when setting up MST devices, particularly on
suspend/resume cycles.
54d217406afe ("drm: use mgr->dev in drm_dbg_kms in
drm_dp_add_payload_part2")
Can you please take this to 6.1.y and 6.3.y as well? The NULL pointer
de-reference has been reported on both.
Thanks,
From: Yinjun Zhang <yinjun.zhang(a)corigine.com>
When moving devices from one namespace to another, mc addresses are
cleaned in software while not removed from application firmware. Thus
the mc addresses are remained and will cause resource leak.
Now use `__dev_mc_unsync` to clean mc addresses when closing port.
Fixes: e20aa071cd95 ("nfp: fix schedule in atomic context when sync mc address")
Cc: stable(a)vger.kernel.org
Signed-off-by: Yinjun Zhang <yinjun.zhang(a)corigine.com>
Acked-by: Simon Horman <simon.horman(a)corigine.com>
Signed-off-by: Louis Peens <louis.peens(a)corigine.com>
---
Changes since v2:
* Use function prototype to avoid moving code chunk.
Changes since v1:
* Use __dev_mc_unsyc to clean mc addresses instead of tracking mc addresses by
driver itself.
* Clean mc addresses when closing port instead of driver exits,
so that the issue of moving devices between namespaces can be fixed.
* Modify commit message accordingly.
drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 49f2f081ebb5..6b1fb5708434 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -53,6 +53,8 @@
#include "crypto/crypto.h"
#include "crypto/fw.h"
+static int nfp_net_mc_unsync(struct net_device *netdev, const unsigned char *addr);
+
/**
* nfp_net_get_fw_version() - Read and parse the FW version
* @fw_ver: Output fw_version structure to read to
@@ -1084,6 +1086,9 @@ static int nfp_net_netdev_close(struct net_device *netdev)
/* Step 2: Tell NFP
*/
+ if (nn->cap_w1 & NFP_NET_CFG_CTRL_MCAST_FILTER)
+ __dev_mc_unsync(netdev, nfp_net_mc_unsync);
+
nfp_net_clear_config_and_disable(nn);
nfp_port_configure(netdev, false);
--
2.34.1
I'm announcing the release of the 6.4.2 kernel.
All users of the 6.4 kernel series must upgrade.
The updated 6.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/process/changes.rst | 7 ++++
Makefile | 2 -
arch/arm64/mm/fault.c | 2 -
drivers/cxl/core/pci.c | 27 ++--------------
drivers/cxl/cxl.h | 1
drivers/cxl/port.c | 14 +++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++
drivers/md/dm-ioctl.c | 33 +++++++++++++-------
drivers/nubus/proc.c | 22 ++++++++++---
drivers/pci/pci-acpi.c | 53 ++++++++++++++++++++++++---------
fs/hugetlbfs/inode.c | 8 +---
fs/nfs/inode.c | 2 -
include/linux/mm.h | 4 +-
mm/hugetlb.c | 12 +++----
mm/nommu.c | 7 +++-
scripts/tags.sh | 9 ++++-
tools/include/nolibc/arch-x86_64.h | 2 -
tools/testing/cxl/Kbuild | 1
tools/testing/cxl/test/mock.c | 15 ---------
19 files changed, 127 insertions(+), 98 deletions(-)
Ahmed S. Darwish (2):
scripts/tags.sh: Resolve gtags empty index generation
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Bas Nieuwenhuizen (1):
drm/amdgpu: Validate VM ioctl flags.
Bjorn Helgaas (1):
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Dan Williams (1):
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Demi Marie Obenour (1):
dm ioctl: Avoid double-fetch of version
Finn Thain (1):
nubus: Partially revert proc_create_single_data() conversion
Greg Kroah-Hartman (1):
Linux 6.4.2
Jeff Layton (1):
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds (1):
execve: always mark stack as growing down during early stack setup
Mario Limonciello (1):
PCI/ACPI: Call _REG when transitioning D-states
Max Filippov (1):
xtensa: fix lock_mm_and_find_vma in case VMA not found
Mike Kravetz (1):
hugetlb: revert use of page_cache_next_miss()
SeongJae Park (1):
arch/arm64/mm/fault: Fix undeclared variable error in do_page_fault()
Thomas Weißschuh (1):
tools/nolibc: x86_64: disable stack protector for _start
I'm announcing the release of the 6.3.12 kernel.
All users of the 6.3 kernel series must upgrade.
The updated 6.3.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.3.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/process/changes.rst | 7 ++++
Makefile | 2 -
drivers/cxl/core/pci.c | 27 ++-------------
drivers/cxl/cxl.h | 1
drivers/cxl/port.c | 14 ++------
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++
drivers/gpu/drm/amd/display/dc/core/dc.c | 49 +++++++++++++++++-----------
drivers/md/dm-ioctl.c | 33 ++++++++++++-------
drivers/nubus/proc.c | 22 +++++++++---
drivers/pci/pci-acpi.c | 53 +++++++++++++++++++++++--------
fs/nfs/inode.c | 2 -
include/linux/mm.h | 4 +-
mm/nommu.c | 7 +++-
scripts/tags.sh | 9 ++++-
tools/testing/cxl/Kbuild | 1
tools/testing/cxl/test/mock.c | 15 --------
16 files changed, 147 insertions(+), 103 deletions(-)
Ahmed S. Darwish (2):
scripts/tags.sh: Resolve gtags empty index generation
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Aric Cyr (1):
drm/amd/display: Do not update DRR while BW optimizations pending
Bas Nieuwenhuizen (1):
drm/amdgpu: Validate VM ioctl flags.
Bjorn Helgaas (1):
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Dan Williams (1):
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Demi Marie Obenour (1):
dm ioctl: Avoid double-fetch of version
Finn Thain (1):
nubus: Partially revert proc_create_single_data() conversion
Greg Kroah-Hartman (1):
Linux 6.3.12
Jeff Layton (1):
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds (1):
execve: always mark stack as growing down during early stack setup
Mario Limonciello (1):
PCI/ACPI: Call _REG when transitioning D-states
Max Filippov (1):
xtensa: fix lock_mm_and_find_vma in case VMA not found
Rodrigo Siqueira (1):
drm/amd/display: Ensure vmin and vmax adjust for DCE
I'm announcing the release of the 6.1.38 kernel.
All users of the 6.1 kernel series must upgrade.
The updated 6.1.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.1.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/process/changes.rst | 7 ++++
Makefile | 2 -
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++
drivers/gpu/drm/amd/display/dc/core/dc.c | 50 ++++++++++++++++-------------
drivers/nubus/proc.c | 22 +++++++++---
drivers/pci/pci-acpi.c | 53 +++++++++++++++++++++++--------
include/linux/mm.h | 4 +-
mm/nommu.c | 7 +++-
scripts/tags.sh | 9 ++++-
tools/perf/util/symbol.c | 17 ++++++++-
10 files changed, 130 insertions(+), 45 deletions(-)
Ahmed S. Darwish (2):
scripts/tags.sh: Resolve gtags empty index generation
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Alvin Lee (1):
drm/amd/display: Remove optimization for VRR updates
Aric Cyr (1):
drm/amd/display: Do not update DRR while BW optimizations pending
Bas Nieuwenhuizen (1):
drm/amdgpu: Validate VM ioctl flags.
Bjorn Helgaas (1):
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Finn Thain (1):
nubus: Partially revert proc_create_single_data() conversion
Greg Kroah-Hartman (1):
Linux 6.1.38
Krister Johansen (1):
perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
Linus Torvalds (1):
execve: always mark stack as growing down during early stack setup
Mario Limonciello (1):
PCI/ACPI: Call _REG when transitioning D-states
Max Filippov (1):
xtensa: fix lock_mm_and_find_vma in case VMA not found
Rodrigo Siqueira (1):
drm/amd/display: Ensure vmin and vmax adjust for DCE
I'm announcing the release of the 5.15.120 kernel.
All users of the 5.15 kernel series must upgrade.
The updated 5.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.15.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
arch/parisc/include/asm/assembly.h | 4 --
arch/x86/kernel/cpu/microcode/amd.c | 2 -
arch/x86/kernel/smpboot.c | 24 ++++++++-------
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++
drivers/hid/hid-logitech-hidpp.c | 2 -
drivers/hid/wacom_wac.c | 6 +--
drivers/hid/wacom_wac.h | 2 -
drivers/nubus/proc.c | 22 ++++++++++---
drivers/thermal/mtk_thermal.c | 14 +-------
include/linux/highmem.h | 24 +++++++++++++++
include/linux/mm.h | 5 ++-
kernel/bpf/verifier.c | 7 +++-
mm/memory.c | 33 ++++++++++++++------
net/can/isotp.c | 5 +--
net/mptcp/protocol.c | 46 +++++++++++++----------------
net/mptcp/subflow.c | 17 ++++++----
scripts/tags.sh | 9 +++++
tools/perf/util/symbol.c | 17 +++++++++-
20 files changed, 158 insertions(+), 88 deletions(-)
Ahmed S. Darwish (1):
scripts/tags.sh: Resolve gtags empty index generation
Bas Nieuwenhuizen (1):
drm/amdgpu: Validate VM ioctl flags.
Ben Hutchings (1):
parisc: Delete redundant register definitions in <asm/assembly.h>
Borislav Petkov (AMD) (1):
x86/microcode/AMD: Load late on both threads too
Finn Thain (1):
nubus: Partially revert proc_create_single_data() conversion
Greg Kroah-Hartman (1):
Linux 5.15.120
Jane Chu (1):
mm, hwpoison: when copy-on-write hits poison, take page offline
Jason Gerecke (1):
HID: wacom: Use ktime_t rather than int when dealing with timestamps
Krister Johansen (2):
bpf: ensure main program has an extable
perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
Mike Hommey (1):
HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.
Oliver Hartkopp (1):
can: isotp: isotp_sendmsg(): fix return error fix on TX path
Paolo Abeni (2):
mptcp: fix possible divide by zero in recvmsg()
mptcp: consolidate fallback and non fallback state machine
Philip Yang (1):
drm/amdgpu: Set vmbo destroy after pt bo is created
Ricardo Cañuelo (1):
Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe"
Thomas Gleixner (1):
x86/smp: Use dedicated cache-line for mwait_play_dead()
Tony Luck (1):
mm, hwpoison: try to recover from copy-on write faults
This is the start of the stable review cycle for the 6.3.12 release.
There are 13 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.3.12-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.3.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.3.12-rc1
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Demi Marie Obenour <demi(a)invisiblethingslab.com>
dm ioctl: Avoid double-fetch of version
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Mike Kravetz <mike.kravetz(a)oracle.com>
hugetlb: revert use of page_cache_next_miss()
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Dan Williams <dan.j.williams(a)intel.com>
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Jeff Layton <jlayton(a)kernel.org>
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Aric Cyr <aric.cyr(a)amd.com>
drm/amd/display: Do not update DRR while BW optimizations pending
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
drivers/cxl/core/pci.c | 27 +++-------------
drivers/cxl/cxl.h | 1 -
drivers/cxl/port.c | 14 +++------
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/gpu/drm/amd/display/dc/core/dc.c | 48 +++++++++++++++++------------
drivers/md/dm-ioctl.c | 33 ++++++++++++--------
drivers/nubus/proc.c | 22 ++++++++++---
drivers/pci/pci-acpi.c | 53 ++++++++++++++++++++++++--------
fs/hugetlbfs/inode.c | 8 ++---
fs/nfs/inode.c | 2 +-
include/linux/mm.h | 4 ++-
mm/hugetlb.c | 12 ++++----
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/testing/cxl/Kbuild | 1 -
tools/testing/cxl/test/mock.c | 15 ---------
18 files changed, 156 insertions(+), 115 deletions(-)
The patch titled
Subject: fork: lock VMAs of the parent process when forking
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fork-lock-vmas-of-the-parent-process-when-forking-v3.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: fork: lock VMAs of the parent process when forking
Date: Wed, 5 Jul 2023 10:12:11 -0700
Patch series "Avoid memory corruption caused by per-VMA locks", v3.
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking while
forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork. I
tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem. This fix can potentially regress some
fork-heavy workloads. Kernel build time did not show noticeable
regression on a 56-core machine while a stress test mapping 10000 VMAs and
forking 5000 times in a tight loop shows ~5% regression. If such fork
time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should
restore its performance. Further optimizations are possible if this
regression proves to be problematic.
This patch (of 2):
When forking a child process, parent write-protects an anonymous page and
COW-shares it with the child being forked using copy_present_pte().
Parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because, COW-shared with the child), this might lead to some
stale writable TLB entries targeting the wrong (old) page. Similar issue
happened in the past with userfaultfd (see flush_tlb_page() call inside
do_wp_page()).
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue. This
fix can potentially regress some fork-heavy workloads. Kernel build time
did not show noticeable regression on a 56-core machine while a stress
test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5%
regression. If such fork time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations
are possible if this regression proves to be problematic.
Link: https://lkml.kernel.org/r/20230705171213.2843068-2-surenb@google.com
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first"=
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@ke=rnel.org/
Reported-by: Holger Hoffst��tte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@ap=plied-asynchrony.com/
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=3D217624
)
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chriscli(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Greg Thelen <gthelen(a)google.com>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Laurent Dufour <ldufour(a)linux.ibm.com>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Michel Lespinasse <michel(a)lespinasse.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Minchan Kim <minchan(a)google.com>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <peterz(a)infradead.org>
Cc: Punit Agrawal <punit.agrawal(a)bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/kernel/fork.c~fork-lock-vmas-of-the-parent-process-when-forking-v3
+++ a/kernel/fork.c
@@ -658,6 +658,12 @@ static __latent_entropy int dup_mmap(str
retval = -EINTR;
goto fail_uprobe_end;
}
+#ifdef CONFIG_PER_VMA_LOCK
+ /* Disallow any page faults before calling flush_cache_dup_mm */
+ for_each_vma(old_vmi, mpnt)
+ vma_start_write(mpnt);
+ vma_iter_init(&old_vmi, oldmm, 0);
+#endif
flush_cache_dup_mm(oldmm);
uprobe_dup_mmap(oldmm, mm);
/*
_
Patches currently in -mm which might be from surenb(a)google.com are
fork-lock-vmas-of-the-parent-process-when-forking-v3.patch
mm-disable-config_per_vma_lock-until-its-fixed.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
The quilt patch titled
Subject: fork: lock VMAs of the parent process when forking
has been removed from the -mm tree. Its filename was
fork-lock-vmas-of-the-parent-process-when-forking.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: fork: lock VMAs of the parent process when forking
Date: Tue, 4 Jul 2023 23:37:10 -0700
Patch series "Avoid memory corruption caused by per-VMA locks", v2.
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking while
forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork. I
tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disabled per-VMA locks until the fix is tested and verified.
This patch (of 2):
When forking a child process, parent write-protects an anonymous page and
COW-shares it with the child being forked using copy_present_pte().
Parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because, COW-shared with the child), this might lead to some
stale writable TLB entries targeting the wrong (old) page. Similar issue
happened in the past with userfaultfd (see flush_tlb_page() call inside
do_wp_page()).
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue. This
fix can potentially regress some fork-heavy workloads. Kernel build time
did not show noticeable regression on a 56-core machine while a stress
test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5%
regression. If such fork time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations
are possible if this regression proves to be problematic.
Link: https://lkml.kernel.org/r/20230705063711.2670599-1-surenb@google.com
Link: https://lkml.kernel.org/r/20230705063711.2670599-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffst��tte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-as…
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=3D217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Bagas Sanjaya <bagasdotme(a)gmail.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Laurent Dufour <ldufour(a)linux.ibm.com>
Cc: <regressions(a)lists.linux.dev>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chriscli(a)google.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Greg Thelen <gthelen(a)google.com>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Michel Lespinasse <michel(a)lespinasse.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Minchan Kim <minchan(a)google.com>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <peterz(a)infradead.org>
Cc: Punit Agrawal <punit.agrawal(a)bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/fork.c~fork-lock-vmas-of-the-parent-process-when-forking
+++ a/kernel/fork.c
@@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(str
for_each_vma(old_vmi, mpnt) {
struct file *file;
+ vma_start_write(mpnt);
if (mpnt->vm_flags & VM_DONTCOPY) {
vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
continue;
_
Patches currently in -mm which might be from surenb(a)google.com are
fork-lock-vmas-of-the-parent-process-when-forking-v3.patch
mm-disable-config_per_vma_lock-until-its-fixed.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking
while forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork.
I tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disabled per-VMA locks until the fix is tested and verified.
Both patches apply cleanly over Linus' ToT and stable 6.4.y branch.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
Suren Baghdasaryan (2):
fork: lock VMAs of the parent process when forking
mm: disable CONFIG_PER_VMA_LOCK until its fixed
kernel/fork.c | 1 +
mm/Kconfig | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)
--
2.41.0.255.g8b1d071c50-goog
The patch titled
Subject: kasan, slub: fix HW_TAGS zeroing with slub_debug
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
kasan-slub-fix-hw_tags-zeroing-with-slub_debug.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Andrey Konovalov <andreyknvl(a)google.com>
Subject: kasan, slub: fix HW_TAGS zeroing with slub_debug
Date: Wed, 5 Jul 2023 14:44:02 +0200
Commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated
kmalloc space than requested") added precise kmalloc redzone poisoning to
the slub_debug functionality.
However, this commit didn't account for HW_TAGS KASAN fully initializing
the object via its built-in memory initialization feature. Even though
HW_TAGS KASAN memory initialization contains special memory initialization
handling for when slub_debug is enabled, it does not account for in-object
slub_debug redzones. As a result, HW_TAGS KASAN can overwrite these
redzones and cause false-positive slub_debug reports.
To fix the issue, avoid HW_TAGS KASAN memory initialization when
slub_debug is enabled altogether. Implement this by moving the
__slub_debug_enabled check to slab_post_alloc_hook. Common slab code
seems like a more appropriate place for a slub_debug check anyway.
Link: https://lkml.kernel.org/r/678ac92ab790dba9198f9ca14f405651b97c8502.16885610…
Fixes: 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested")
Signed-off-by: Andrey Konovalov <andreyknvl(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Acked-by: Marco Elver <elver(a)google.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Feng Tang <feng.tang(a)intel.com>
Cc: Hyeonggon Yoo <42.hyeyoo(a)gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: kasan-dev(a)googlegroups.com
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: Peter Collingbourne <pcc(a)google.com>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan.h | 12 ------------
mm/slab.h | 16 ++++++++++++++--
2 files changed, 14 insertions(+), 14 deletions(-)
--- a/mm/kasan/kasan.h~kasan-slub-fix-hw_tags-zeroing-with-slub_debug
+++ a/mm/kasan/kasan.h
@@ -466,18 +466,6 @@ static inline void kasan_unpoison(const
if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
return;
- /*
- * Explicitly initialize the memory with the precise object size to
- * avoid overwriting the slab redzone. This disables initialization in
- * the arch code and may thus lead to performance penalty. This penalty
- * does not affect production builds, as slab redzones are not enabled
- * there.
- */
- if (__slub_debug_enabled() &&
- init && ((unsigned long)size & KASAN_GRANULE_MASK)) {
- init = false;
- memzero_explicit((void *)addr, size);
- }
size = round_up(size, KASAN_GRANULE_SIZE);
hw_set_mem_tag_range((void *)addr, size, tag, init);
--- a/mm/slab.h~kasan-slub-fix-hw_tags-zeroing-with-slub_debug
+++ a/mm/slab.h
@@ -723,6 +723,7 @@ static inline void slab_post_alloc_hook(
unsigned int orig_size)
{
unsigned int zero_size = s->object_size;
+ bool kasan_init = init;
size_t i;
flags &= gfp_allowed_mask;
@@ -740,6 +741,17 @@ static inline void slab_post_alloc_hook(
zero_size = orig_size;
/*
+ * When slub_debug is enabled, avoid memory initialization integrated
+ * into KASAN and instead zero out the memory via the memset below with
+ * the proper size. Otherwise, KASAN might overwrite SLUB redzones and
+ * cause false-positive reports. This does not lead to a performance
+ * penalty on production builds, as slub_debug is not intended to be
+ * enabled there.
+ */
+ if (__slub_debug_enabled())
+ kasan_init = false;
+
+ /*
* As memory initialization might be integrated into KASAN,
* kasan_slab_alloc and initialization memset must be
* kept together to avoid discrepancies in behavior.
@@ -747,8 +759,8 @@ static inline void slab_post_alloc_hook(
* As p[i] might get tagged, memset and kmemleak hook come after KASAN.
*/
for (i = 0; i < size; i++) {
- p[i] = kasan_slab_alloc(s, p[i], flags, init);
- if (p[i] && init && !kasan_has_integrated_init())
+ p[i] = kasan_slab_alloc(s, p[i], flags, kasan_init);
+ if (p[i] && init && (!kasan_init || !kasan_has_integrated_init()))
memset(p[i], 0, zero_size);
kmemleak_alloc_recursive(p[i], s->object_size, 1,
s->flags, flags);
_
Patches currently in -mm which might be from andreyknvl(a)google.com are
kasan-fix-type-cast-in-memory_is_poisoned_n.patch
kasan-slub-fix-hw_tags-zeroing-with-slub_debug.patch
The patch titled
Subject: mm: disable CONFIG_PER_VMA_LOCK until its fixed
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-disable-config_per_vma_lock-until-its-fixed.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: mm: disable CONFIG_PER_VMA_LOCK until its fixed
Date: Tue, 4 Jul 2023 23:37:11 -0700
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Disable per-VMA locks config to
prevent this issue while the problem is being investigated. This is
expected to be a temporary measure.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
Link: https://lkml.kernel.org/r/20230705063711.2670599-3-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Bagas Sanjaya <bagasdotme(a)gmail.com>
Cc: Chris Li <chriscli(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Greg Thelen <gthelen(a)google.com>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Holger Hoffst��tte <holger(a)applied-asynchrony.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Laurent Dufour <ldufour(a)linux.ibm.com>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Michel Lespinasse <michel(a)lespinasse.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Minchan Kim <minchan(a)google.com>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <peterz(a)infradead.org>
Cc: Punit Agrawal <punit.agrawal(a)bytedance.com>
Cc: <regressions(a)lists.linux.dev>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/Kconfig~mm-disable-config_per_vma_lock-until-its-fixed
+++ a/mm/Kconfig
@@ -1224,8 +1224,9 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
def_bool n
config PER_VMA_LOCK
- def_bool y
+ bool "Enable per-vma locking during page fault handling."
depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
+ depends on BROKEN
help
Allow per-vma locking during page fault handling.
_
Patches currently in -mm which might be from surenb(a)google.com are
fork-lock-vmas-of-the-parent-process-when-forking.patch
mm-disable-config_per_vma_lock-until-its-fixed.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
mm-disable-config_per_vma_lock-by-default-until-its-fixed.patch
The patch titled
Subject: fork: lock VMAs of the parent process when forking
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fork-lock-vmas-of-the-parent-process-when-forking.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: fork: lock VMAs of the parent process when forking
Date: Tue, 4 Jul 2023 23:37:10 -0700
Patch series "Avoid memory corruption caused by per-VMA locks", v2.
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking while
forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork. I
tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disabled per-VMA locks until the fix is tested and verified.
This patch (of 2):
When forking a child process, parent write-protects an anonymous page and
COW-shares it with the child being forked using copy_present_pte().
Parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because, COW-shared with the child), this might lead to some
stale writable TLB entries targeting the wrong (old) page. Similar issue
happened in the past with userfaultfd (see flush_tlb_page() call inside
do_wp_page()).
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue. This
fix can potentially regress some fork-heavy workloads. Kernel build time
did not show noticeable regression on a 56-core machine while a stress
test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5%
regression. If such fork time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations
are possible if this regression proves to be problematic.
Link: https://lkml.kernel.org/r/20230705063711.2670599-1-surenb@google.com
Link: https://lkml.kernel.org/r/20230705063711.2670599-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffst��tte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-as…
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=3D217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Bagas Sanjaya <bagasdotme(a)gmail.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Laurent Dufour <ldufour(a)linux.ibm.com>
Cc: <regressions(a)lists.linux.dev>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chriscli(a)google.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Greg Thelen <gthelen(a)google.com>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Michel Lespinasse <michel(a)lespinasse.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Minchan Kim <minchan(a)google.com>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <peterz(a)infradead.org>
Cc: Punit Agrawal <punit.agrawal(a)bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/fork.c~fork-lock-vmas-of-the-parent-process-when-forking
+++ a/kernel/fork.c
@@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(str
for_each_vma(old_vmi, mpnt) {
struct file *file;
+ vma_start_write(mpnt);
if (mpnt->vm_flags & VM_DONTCOPY) {
vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
continue;
_
Patches currently in -mm which might be from surenb(a)google.com are
fork-lock-vmas-of-the-parent-process-when-forking.patch
mm-disable-config_per_vma_lock-until-its-fixed.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
mm-disable-config_per_vma_lock-by-default-until-its-fixed.patch
The soundwire subsystem uses two completion structures that allow
drivers to wait for soundwire device to become enumerated on the bus and
initialised by their drivers, respectively.
The code implementing the signalling is currently broken as it does not
signal all current and future waiters and also uses the wrong
reinitialisation function, which can potentially lead to memory
corruption if there are still waiters on the queue.
Not signalling future waiters specifically breaks sound card probe
deferrals as codec drivers can not tell that the soundwire device is
already attached when being reprobed. Some codec runtime PM
implementations suffer from similar problems as waiting for enumeration
during resume can also timeout despite the device already having been
enumerated.
Fixes: fb9469e54fa7 ("soundwire: bus: fix race condition with enumeration_complete signaling")
Fixes: a90def068127 ("soundwire: bus: fix race condition with initialization_complete signaling")
Cc: stable(a)vger.kernel.org # 5.7
Cc: Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
Cc: Rander Wang <rander.wang(a)linux.intel.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/soundwire/bus.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index 1ea6a64f8c4a..66e5dba919fa 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -908,8 +908,8 @@ static void sdw_modify_slave_status(struct sdw_slave *slave,
"initializing enumeration and init completion for Slave %d\n",
slave->dev_num);
- init_completion(&slave->enumeration_complete);
- init_completion(&slave->initialization_complete);
+ reinit_completion(&slave->enumeration_complete);
+ reinit_completion(&slave->initialization_complete);
} else if ((status == SDW_SLAVE_ATTACHED) &&
(slave->status == SDW_SLAVE_UNATTACHED)) {
@@ -917,7 +917,7 @@ static void sdw_modify_slave_status(struct sdw_slave *slave,
"signaling enumeration completion for Slave %d\n",
slave->dev_num);
- complete(&slave->enumeration_complete);
+ complete_all(&slave->enumeration_complete);
}
slave->status = status;
mutex_unlock(&bus->bus_lock);
@@ -1941,7 +1941,7 @@ int sdw_handle_slave_status(struct sdw_bus *bus,
"signaling initialization completion for Slave %d\n",
slave->dev_num);
- complete(&slave->initialization_complete);
+ complete_all(&slave->initialization_complete);
/*
* If the manager became pm_runtime active, the peripherals will be
--
2.39.3
Hi Greg, Sasha,
The following list shows the backported patches, I am using original
commit IDs for reference:
1) 0854db2aaef3 ("netfilter: nf_tables: use net_generic infra for transaction data")
2) 81ea01066741 ("netfilter: nf_tables: add rescheduling points during loop detection walks")
3) 1240eb93f061 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE")
4) 4bedf9eee016 ("netfilter: nf_tables: fix chain binding transaction logic")
5) 26b5a5712eb8 ("netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain")
6) 938154b93be8 ("netfilter: nf_tables: reject unbound anonymous set before commit phase")
7) 62e1e94b246e ("netfilter: nf_tables: reject unbound chain set before commit phase")
8) f8bb7889af58 ("netfilter: nftables: rename set element data activation/deactivation functions")
9) 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase")
10) 3e70489721b6 ("netfilter: nf_tables: unbind non-anonymous set if rule construction fails")
Note that Patch #1 is a backported dependency patch required by these fixes.
Please, apply,
Thanks.
Florian Westphal (2):
netfilter: nf_tables: use net_generic infra for transaction data
netfilter: nf_tables: add rescheduling points during loop detection walks
Pablo Neira Ayuso (8):
netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE
netfilter: nf_tables: fix chain binding transaction logic
netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain
netfilter: nf_tables: reject unbound anonymous set before commit phase
netfilter: nf_tables: reject unbound chain set before commit phase
netfilter: nftables: rename set element data activation/deactivation functions
netfilter: nf_tables: drop map element references from preparation phase
netfilter: nf_tables: unbind non-anonymous set if rule construction fails
include/net/netfilter/nf_tables.h | 41 +-
include/net/netns/nftables.h | 7 -
net/netfilter/nf_tables_api.c | 696 +++++++++++++++++++++---------
net/netfilter/nf_tables_offload.c | 30 +-
net/netfilter/nft_chain_filter.c | 11 +-
net/netfilter/nft_dynset.c | 6 +-
net/netfilter/nft_immediate.c | 90 +++-
net/netfilter/nft_set_bitmap.c | 5 +-
net/netfilter/nft_set_hash.c | 23 +-
net/netfilter/nft_set_pipapo.c | 14 +-
net/netfilter/nft_set_rbtree.c | 5 +-
11 files changed, 682 insertions(+), 246 deletions(-)
--
2.30.2
Here is a first batch of fixes for v6.5 and older.
The fixes are not linked to each others.
Patch 1 ensures subflows are unhashed before cleaning the backlog to
avoid races. This fixes another recent fix from v6.4.
Patch 2 does not rely on implicit state check in mptcp_listen() to avoid
races when receiving an MP_FASTCLOSE. A regression from v5.17.
The rest fixes issues in the selftests.
Patch 3 makes sure errors when setting up the environment are no longer
ignored. For v5.17+.
Patch 4 uses 'iptables-legacy' if available to be able to run on older
kernels. A fix for v5.13 and newer.
Patch 5 catches errors when issues are detected with packet marks. Also
for v5.13+.
Patch 6 uses the correct variable instead of an undefined one. Even if
there was no visible impact, it can help to find regressions later. An
issue visible in v5.19+.
Patch 7 makes sure errors with some sub-tests are reported to have the
selftest marked as failed as expected. Also for v5.19+.
Patch 8 adds a kernel config that is required to execute MPTCP
selftests. It is valid for v5.9+.
Patch 9 fixes issues when validating the userspace path-manager with
32-bit arch, an issue affecting v5.19+.
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Matthieu Baerts (7):
selftests: mptcp: connect: fail if nft supposed to work
selftests: mptcp: sockopt: use 'iptables-legacy' if available
selftests: mptcp: sockopt: return error if wrong mark
selftests: mptcp: userspace_pm: use correct server port
selftests: mptcp: userspace_pm: report errors with 'remove' tests
selftests: mptcp: depend on SYN_COOKIES
selftests: mptcp: pm_nl_ctl: fix 32-bit support
Paolo Abeni (2):
mptcp: ensure subflow is unhashed before cleaning the backlog
mptcp: do not rely on implicit state check in mptcp_listen()
net/mptcp/protocol.c | 7 +++++-
tools/testing/selftests/net/mptcp/config | 1 +
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 3 +++
tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 29 ++++++++++++----------
tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 10 ++++----
tools/testing/selftests/net/mptcp/userspace_pm.sh | 4 ++-
6 files changed, 34 insertions(+), 20 deletions(-)
---
base-commit: 14bb236b29922c4f57d8c05bfdbcb82677f917c9
change-id: 20230704-upstream-net-20230704-misc-fixes-6-5-rc1-c52608649559
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
Making 'blk' sector_t (i.e. 64 bit if LBD support is active)
fails the 'blk>0' test in the partition block loop if a
value of (signed int) -1 is used to mark the end of the
partition block list.
This bug was introduced in patch 3 of my prior Amiga partition
support fixes series, and spotted by Christian Zigotzky when
testing the latest block updates.
Explicitly cast 'blk' to signed int to allow use of -1 to
terminate the partition block linked list.
Testing by Christian also exposed another aspect of the old
bug fixed in commits fc3d092c6b ("block: fix signed int
overflow in Amiga partition support") and b6f3f28f60
("block: add overflow checks for Amiga partition support"):
Partitions that did overflow the disk size (due to 32 bit int
overflow) were not skipped but truncated to the end of the
disk. Users who missed the warning message during boot would
go on to create a filesystem with a size exceeding the
actual partition size. Now that the 32 bit overflow has been
corrected, such filesystems may refuse to mount with a
'filesystem exceeds partition size' error. Users should
either correct the partition size, or resize the filesystem
before attempting to boot a kernel with the RDB fixes in
place.
Reported-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
Fixes: b6f3f28f60 ("block: add overflow checks for Amiga partition support")
Message-ID: 024ce4fa-cc6d-50a2-9aae-3701d0ebf668(a)xenosoft.de
Cc: <stable(a)vger.kernel.org> # 6.4
Link: https://lore.kernel.org/r/024ce4fa-cc6d-50a2-9aae-3701d0ebf668@xenosoft.de
Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com>
Tested-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
--
Changes since v2:
Adrian Glaubitz:
- fix typo in commit message
Changes since v1:
- corrected Fixes: tag
- added Tested-by:
- reworded commit message to describe filesystem partition
size mismatch problem
---
block/partitions/amiga.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c
index ed222b9c901b..506921095412 100644
--- a/block/partitions/amiga.c
+++ b/block/partitions/amiga.c
@@ -90,7 +90,7 @@ int amiga_partition(struct parsed_partitions *state)
}
blk = be32_to_cpu(rdb->rdb_PartitionList);
put_dev_sector(sect);
- for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) {
+ for (part = 1; (s32) blk>0 && part<=16; part++, put_dev_sector(sect)) {
/* Read in terms partition table understands */
if (check_mul_overflow(blk, (sector_t) blksize, &blk)) {
pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n",
--
2.17.1
We get the following crash caused by a null pointer access:
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
RIP: 0010:resume_execution+0x35/0x190
...
Call Trace:
<#DB>
kprobe_debug_handler+0x41/0xd0
exc_debug+0xe5/0x1b0
asm_exc_debug+0x19/0x30
RIP: 0010:copy_from_kernel_nofault.part.0+0x55/0xc0
...
</#DB>
process_fetch_insn+0xfb/0x720
kprobe_trace_func+0x199/0x2c0
? kernel_clone+0x5/0x2f0
kprobe_dispatcher+0x3d/0x60
aggr_pre_handler+0x40/0x80
? kernel_clone+0x1/0x2f0
kprobe_ftrace_handler+0x82/0xf0
? __se_sys_clone+0x65/0x90
ftrace_ops_assist_func+0x86/0x110
? rcu_nocb_try_bypass+0x1f3/0x370
0xffffffffc07e60c8
? kernel_clone+0x1/0x2f0
kernel_clone+0x5/0x2f0
The analysis reveals that kprobe and hardware breakpoints conflict in
the use of debug exceptions.
If we set a hardware breakpoint on a memory address and also have a
kprobe event to fetch the memory at this address. Then when kprobe
triggers, it goes to read the memory and triggers hardware breakpoint
monitoring. This time, since kprobe handles debug exceptions earlier
than hardware breakpoints, it will cause kprobe to incorrectly assume
that the exception is a kprobe trigger.
Notice that after the mainline commit 6256e668b7af ("x86/kprobes: Use
int3 instead of debug trap for single-step"), kprobe no longer uses
debug trap, avoiding the conflict with hardware breakpoints here. This
commit is to remove the IRET that returns to kernel, not to fix the
problem we have here. Also there are a bunch of merge conflicts when
trying to apply this commit to older kernels, so fixing it directly in
older kernels is probably a better option.
If the debug exception is triggered by kprobe, then regs->ip should be
located in the kprobe instruction slot. Add this check to
kprobe_debug_handler() to properly determine if a debug exception should
be handled by kprobe.
The stable kernels affected are 5.10, 5.4, 4.19, and 4.14. I made the
fix in 5.10, and we should probably apply this fix to other stable
kernels.
Signed-off-by: Li Huafei <lihuafei1(a)huawei.com>
---
arch/x86/kernel/kprobes/core.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 5de757099186..fd8d7d128807 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -900,7 +900,14 @@ int kprobe_debug_handler(struct pt_regs *regs)
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
- if (!cur)
+ if (!cur || !cur->ainsn.insn)
+ return 0;
+
+ /* regs->ip should be the address of next instruction to
+ * cur->ainsn.insn.
+ */
+ if (regs->ip < (unsigned long)cur->ainsn.insn ||
+ regs->ip - (unsigned long)cur->ainsn.insn > MAX_INSN_SIZE)
return 0;
resume_execution(cur, regs, kcb);
--
2.17.1
When forking a child process, parent write-protects an anonymous page
and COW-shares it with the child being forked using copy_present_pte().
Parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because, COW-shared with the child), this might lead to
some stale writable TLB entries targeting the wrong (old) page.
Similar issue happened in the past with userfaultfd (see flush_tlb_page()
call inside do_wp_page()).
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-as…
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/fork.c b/kernel/fork.c
index b85814e614a5..d2e12b6d2b18 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
for_each_vma(old_vmi, mpnt) {
struct file *file;
+ vma_start_write(mpnt);
if (mpnt->vm_flags & VM_DONTCOPY) {
vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
continue;
--
2.41.0.255.g8b1d071c50-goog
The patch titled
Subject: kasan: fix type cast in memory_is_poisoned_n
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
kasan-fix-type-cast-in-memory_is_poisoned_n.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Andrey Konovalov <andreyknvl(a)google.com>
Subject: kasan: fix type cast in memory_is_poisoned_n
Date: Tue, 4 Jul 2023 02:52:05 +0200
Commit bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13
builtins") introduced a bug into the memory_is_poisoned_n implementation:
it effectively removed the cast to a signed integer type after applying
KASAN_GRANULE_MASK.
As a result, KASAN started failing to properly check memset, memcpy, and
other similar functions.
Fix the bug by adding the cast back (through an additional signed integer
variable to make the code more readable).
Link: https://lkml.kernel.org/r/8c9e0251c2b8b81016255709d4ec42942dcaf018.16884318…
Fixes: bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13 builtins")
Signed-off-by: Andrey Konovalov <andreyknvl(a)google.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/generic.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/kasan/generic.c~kasan-fix-type-cast-in-memory_is_poisoned_n
+++ a/mm/kasan/generic.c
@@ -130,9 +130,10 @@ static __always_inline bool memory_is_po
if (unlikely(ret)) {
const void *last_byte = addr + size - 1;
s8 *last_shadow = (s8 *)kasan_mem_to_shadow(last_byte);
+ s8 last_accessible_byte = (unsigned long)last_byte & KASAN_GRANULE_MASK;
if (unlikely(ret != (unsigned long)last_shadow ||
- (((long)last_byte & KASAN_GRANULE_MASK) >= *last_shadow)))
+ last_accessible_byte >= *last_shadow))
return true;
}
return false;
_
Patches currently in -mm which might be from andreyknvl(a)google.com are
kasan-fix-type-cast-in-memory_is_poisoned_n.patch
Hallow und wie geht es dir heute?
Ich möchte, dass Ihre Partnerschaft Sie als Subunternehmer
präsentiert, damit Sie in meinem Namen 8,6 Mio.
Überrechnungsvertragsfonds erhalten können, die wir zu 65 % und 35 %
aufteilen können.
Diese Transaktion ist 100 % risikofrei; Du brauchst keine Angst zu haben.
Bitte senden Sie mir eine E-Mail an (kmrs41786(a)gmail.com), um
ausführliche Informationen zu erhalten und bei Interesse zu erfahren,
wie wir dies gemeinsam bewältigen können.
Sie müssen es mir also weiterleiten
Ihr vollständiger Name.........................
Telefon.............
Mit freundlichen Grüße,
Herr Jimoh Oyebisi.
Hallow and how are you today?
I seek for your partnership to present you as a sub-contractor so
that you can receive 8.6M Over-Invoice contract fund on my behalf and
we can split it 65% 35%.
This transaction is 100% risk -free; you need not to be afraid.
Please email me at ( kmrs41786(a)gmail.com ) for comprehensive details
and how we can handle this together if interested.
So I need you to forward it to me
your full name.........................
Telephone.............
Kind Regards,
Mr. Jimoh Oyebisi.
The patch titled
Subject: memcg: drop kmem.limit_in_bytes
has been added to the -mm mm-unstable branch. Its filename is
memcg-drop-kmemlimit_in_bytes.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Michal Hocko <mhocko(a)suse.com>
Subject: memcg: drop kmem.limit_in_bytes
Date: Tue, 4 Jul 2023 13:52:40 +0200
kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
deprecated since 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes") merged in 5.16. We haven't heard about any serious
users since then but it seems that the mere presence of the file is
causing more harm than good. We (SUSE) have had several bug reports from
customers where Docker based containers started to fail because a write to
kmem.limit_in_bytes has failed.
This was unexpected because runc code only expects ENOENT (kmem disabled)
or EBUSY (tasks already running within cgroup). So a new error code was
unexpected and the whole container startup failed. This has been later
addressed by
https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe7…
so current Docker runtimes do not suffer from the problem anymore. There
are still older version of Docker in use and likely hard to get rid of
completely.
Address this by wiping out the file completely and effectively get back to
pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
I would recommend backporting to stable trees which have picked up
58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
Link: https://lkml.kernel.org/r/20230704115240.14672-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
Documentation/admin-guide/cgroup-v1/memory.rst | 2 --
mm/memcontrol.c | 13 -------------
2 files changed, 15 deletions(-)
--- a/Documentation/admin-guide/cgroup-v1/memory.rst~memcg-drop-kmemlimit_in_bytes
+++ a/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,8 +92,6 @@ Brief summary of control files.
memory.oom_control set/show oom controls.
memory.numa_stat show the number of memory usage per numa
node
- memory.kmem.limit_in_bytes This knob is deprecated and writing to
- it will return -ENOTSUPP.
memory.kmem.usage_in_bytes show current kernel memory allocation
memory.kmem.failcnt show the number of kernel memory usage
hits limits
--- a/mm/memcontrol.c~memcg-drop-kmemlimit_in_bytes
+++ a/mm/memcontrol.c
@@ -3708,9 +3708,6 @@ static u64 mem_cgroup_read_u64(struct cg
case _MEMSWAP:
counter = &memcg->memsw;
break;
- case _KMEM:
- counter = &memcg->kmem;
- break;
case _TCP:
counter = &memcg->tcpmem;
break;
@@ -3871,10 +3868,6 @@ static ssize_t mem_cgroup_write(struct k
case _MEMSWAP:
ret = mem_cgroup_resize_max(memcg, nr_pages, true);
break;
- case _KMEM:
- /* kmem.limit_in_bytes is deprecated. */
- ret = -EOPNOTSUPP;
- break;
case _TCP:
ret = memcg_update_tcp_max(memcg, nr_pages);
break;
@@ -5086,12 +5079,6 @@ static struct cftype mem_cgroup_legacy_f
},
#endif
{
- .name = "kmem.limit_in_bytes",
- .private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
- .write = mem_cgroup_write,
- .read_u64 = mem_cgroup_read_u64,
- },
- {
.name = "kmem.usage_in_bytes",
.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
.read_u64 = mem_cgroup_read_u64,
_
Patches currently in -mm which might be from mhocko(a)suse.com are
memcg-drop-kmemlimit_in_bytes.patch
The patch titled
Subject: bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
bootmem-remove-the-vmemmap-pages-from-kmemleak-in-free_bootmem_page.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Liu Shixin <liushixin2(a)huawei.com>
Subject: bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
Date: Tue, 4 Jul 2023 18:19:42 +0800
commit dd0ff4d12dd2 ("bootmem: remove the vmemmap pages from kmemleak in
put_page_bootmem") fix an overlaps existing problem of kmemleak. But the
problem still existed when HAVE_BOOTMEM_INFO_NODE is disabled, because in
this case, free_bootmem_page() will call free_reserved_page() directly.
Fix the problem by adding kmemleak_free_part() in free_bootmem_page() when
HAVE_BOOTMEM_INFO_NODE is disabled.
Link: https://lkml.kernel.org/r/20230704101942.2819426-1-liushixin2@huawei.com
Fixes: f41f2ed43ca5 ("mm: hugetlb: free the vmemmap pages associated with each HugeTLB page")
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Acked-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/bootmem_info.h | 2 ++
1 file changed, 2 insertions(+)
--- a/include/linux/bootmem_info.h~bootmem-remove-the-vmemmap-pages-from-kmemleak-in-free_bootmem_page
+++ a/include/linux/bootmem_info.h
@@ -3,6 +3,7 @@
#define __LINUX_BOOTMEM_INFO_H
#include <linux/mm.h>
+#include <linux/kmemleak.h>
/*
* Types for free bootmem stored in page->lru.next. These have to be in
@@ -59,6 +60,7 @@ static inline void get_page_bootmem(unsi
static inline void free_bootmem_page(struct page *page)
{
+ kmemleak_free_part(page_to_virt(page), PAGE_SIZE);
free_reserved_page(page);
}
#endif
_
Patches currently in -mm which might be from liushixin2(a)huawei.com are
bootmem-remove-the-vmemmap-pages-from-kmemleak-in-free_bootmem_page.patch
The patch below does not apply to the 6.3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.3.y
git checkout FETCH_HEAD
git cherry-pick -x fd4aed8d985a3236d0877ff6d0c80ad39d4ce81a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070420-fraction-such-da8d@gregkh' --subject-prefix 'PATCH 6.3.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fd4aed8d985a3236d0877ff6d0c80ad39d4ce81a Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Date: Wed, 21 Jun 2023 14:24:03 -0700
Subject: [PATCH] hugetlb: revert use of page_cache_next_miss()
Ackerley Tng reported an issue with hugetlbfs fallocate as noted in the
Closes tag. The issue showed up after the conversion of hugetlb page
cache lookup code to use page_cache_next_miss. User visible effects are:
- hugetlbfs fallocate incorrectly returns -EEXIST if pages are presnet
in the file.
- hugetlb pages will not be included in core dumps if they need to be
brought in via GUP.
- userfaultfd UFFDIO_COPY will not notice pages already present in the
cache. It may try to allocate a new page and potentially return
ENOMEM as opposed to EEXIST.
Revert the use page_cache_next_miss() in hugetlb code.
IMPORTANT NOTE FOR STABLE BACKPORTS:
This patch will apply cleanly to v6.3. However, due to the change of
filemap_get_folio() return values, it will not function correctly. This
patch must be modified for stable backports.
[dan.carpenter(a)linaro.org: fix hugetlbfs_pagecache_present()]
Link: https://lkml.kernel.org/r/efa86091-6a2c-4064-8f55-9b44e1313015@moroto.mount…
Link: https://lkml.kernel.org/r/20230621212403.174710-2-mike.kravetz@oracle.com
Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Signed-off-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Reported-by: Ackerley Tng <ackerleytng(a)google.com>
Closes: https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com
Reviewed-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: Erdem Aktas <erdemaktas(a)google.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Vishal Annapurve <vannapurve(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 90361a922cec..7b17ccfa039d 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
*/
struct folio *folio;
unsigned long addr;
- bool present;
cond_resched();
@@ -842,10 +841,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/* See if already present in mapping to avoid alloc/free */
- rcu_read_lock();
- present = page_cache_next_miss(mapping, index, 1) != index;
- rcu_read_unlock();
- if (present) {
+ folio = filemap_get_folio(mapping, index);
+ if (!IS_ERR(folio)) {
+ folio_put(folio);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
continue;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d76574425da3..bce28cca73a1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5728,13 +5728,13 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
{
struct address_space *mapping = vma->vm_file->f_mapping;
pgoff_t idx = vma_hugecache_offset(h, vma, address);
- bool present;
-
- rcu_read_lock();
- present = page_cache_next_miss(mapping, idx, 1) != idx;
- rcu_read_unlock();
+ struct folio *folio;
- return present;
+ folio = filemap_get_folio(mapping, idx);
+ if (IS_ERR(folio))
+ return false;
+ folio_put(folio);
+ return true;
}
int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
This is the start of the stable review cycle for the 5.15.120 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.120-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.120-rc1
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Krister Johansen <kjlx(a)templeofstupid.com>
perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe"
Mike Hommey <mh(a)glandium.org>
HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.
Jason Gerecke <jason.gerecke(a)wacom.com>
HID: wacom: Use ktime_t rather than int when dealing with timestamps
Krister Johansen <kjlx(a)templeofstupid.com>
bpf: ensure main program has an extable
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: isotp: isotp_sendmsg(): fix return error fix on TX path
Thomas Gleixner <tglx(a)linutronix.de>
x86/smp: Use dedicated cache-line for mwait_play_dead()
Borislav Petkov (AMD) <bp(a)alien8.de>
x86/microcode/AMD: Load late on both threads too
Philip Yang <Philip.Yang(a)amd.com>
drm/amdgpu: Set vmbo destroy after pt bo is created
Jane Chu <jane.chu(a)oracle.com>
mm, hwpoison: when copy-on-write hits poison, take page offline
Tony Luck <tony.luck(a)intel.com>
mm, hwpoison: try to recover from copy-on write faults
Paolo Abeni <pabeni(a)redhat.com>
mptcp: consolidate fallback and non fallback state machine
Paolo Abeni <pabeni(a)redhat.com>
mptcp: fix possible divide by zero in recvmsg()
-------------
Diffstat:
Makefile | 4 +--
arch/x86/kernel/cpu/microcode/amd.c | 2 +-
arch/x86/kernel/smpboot.c | 24 +++++++++-------
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/hid/hid-logitech-hidpp.c | 2 +-
drivers/hid/wacom_wac.c | 6 ++--
drivers/hid/wacom_wac.h | 2 +-
drivers/thermal/mtk_thermal.c | 14 ++-------
include/linux/highmem.h | 24 ++++++++++++++++
include/linux/mm.h | 5 +++-
kernel/bpf/verifier.c | 7 +++--
mm/memory.c | 33 ++++++++++++++-------
net/can/isotp.c | 5 ++--
net/mptcp/protocol.c | 46 ++++++++++++++----------------
net/mptcp/subflow.c | 17 ++++++-----
scripts/tags.sh | 9 +++++-
tools/perf/util/symbol.c | 17 +++++++++--
18 files changed, 142 insertions(+), 80 deletions(-)
Making 'blk' sector_t (i.e. 64 bit if LBD support is active)
fails the 'blk>0' test in the partition block loop if a
value of (signed int) -1 is used to mark the end of the
partition block list.
This bug was introduced in patch 3 of my prior Amiga partition
support fixes series, and spotted by Christian Zigotzky when
testing the latest block updates.
Explicitly cast 'blk' to signed int to allow use of -1 to
terminate the partition block linked list.
Reported-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
Fixes: b6f3f28f60 ("Linux 6.4")
Message-ID: 024ce4fa-cc6d-50a2-9aae-3701d0ebf668(a)xenosoft.de
Cc: <stable(a)vger.kernel.org> # 6.4
Link: https://lore.kernel.org/r/024ce4fa-cc6d-50a2-9aae-3701d0ebf668@xenosoft.de
Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com>
---
block/partitions/amiga.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c
index ed222b9c901b..506921095412 100644
--- a/block/partitions/amiga.c
+++ b/block/partitions/amiga.c
@@ -90,7 +90,7 @@ int amiga_partition(struct parsed_partitions *state)
}
blk = be32_to_cpu(rdb->rdb_PartitionList);
put_dev_sector(sect);
- for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) {
+ for (part = 1; (s32) blk>0 && part<=16; part++, put_dev_sector(sect)) {
/* Read in terms partition table understands */
if (check_mul_overflow(blk, (sector_t) blksize, &blk)) {
pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n",
--
2.17.1
Making 'blk' sector_t (i.e. 64 bit if LBD support is active)
fails the 'blk>0' test in the partition block loop if a
value of (signed int) -1 is used to mark the end of the
partition block list.
This bug was introduced in patch 3 of my prior Amiga partition
support fixes series, and spotted by Christian Zigotzky when
testing the latest block updates.
Explicitly cast 'blk' to signed int to allow use of -1 to
terminate the partition block linked list.
Testing by Christian also exposed another aspect of the old
bug fixed in commits fc3d092c6b ("block: fix signed int
overflow in Amiga partition support") and b6f3f28f60
("block: add overflow checks for Amiga partition support"):
Partitons that did overflow the disk size (due to 32 bit int
overflow) were not skipped but truncated to the end of the
disk. Users who missed the warning message during boot would
go on to create a filesystem with a size exceeding the
actual partition size. Now that the 32 bit overflow has been
corrected, such filesystems may refuse to mount with a
'filesystem exceeds partition size' error. Users should
either correct the partition size, or resize the filesystem
before attempting to boot a kernel with the RDB fixes in
place.
Reported-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
Fixes: b6f3f28f60 ("block: add overflow checks for Amiga partition support")
Message-ID: 024ce4fa-cc6d-50a2-9aae-3701d0ebf668(a)xenosoft.de
Cc: <stable(a)vger.kernel.org> # 6.4
Link: https://lore.kernel.org/r/024ce4fa-cc6d-50a2-9aae-3701d0ebf668@xenosoft.de
Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com>
Tested-by: Christian Zigotzky <chzigotzky(a)xenosoft.de>
--
Changes since v1:
- corrected Fixes: tag
- added Tested-by:
- reworded commit message to describe filesystem partition
size mismatch problem
---
block/partitions/amiga.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c
index ed222b9c901b..506921095412 100644
--- a/block/partitions/amiga.c
+++ b/block/partitions/amiga.c
@@ -90,7 +90,7 @@ int amiga_partition(struct parsed_partitions *state)
}
blk = be32_to_cpu(rdb->rdb_PartitionList);
put_dev_sector(sect);
- for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) {
+ for (part = 1; (s32) blk>0 && part<=16; part++, put_dev_sector(sect)) {
/* Read in terms partition table understands */
if (check_mul_overflow(blk, (sector_t) blksize, &blk)) {
pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n",
--
2.17.1
This is the start of the stable review cycle for the 6.1.38 release.
There are 11 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.38-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.38-rc1
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Krister Johansen <kjlx(a)templeofstupid.com>
perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Aric Cyr <aric.cyr(a)amd.com>
drm/amd/display: Do not update DRR while BW optimizations pending
Alvin Lee <Alvin.Lee2(a)amd.com>
drm/amd/display: Remove optimization for VRR updates
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/gpu/drm/amd/display/dc/core/dc.c | 49 ++++++++++++++++-------------
drivers/nubus/proc.c | 22 ++++++++++---
drivers/pci/pci-acpi.c | 53 ++++++++++++++++++++++++--------
include/linux/mm.h | 4 ++-
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/perf/util/symbol.c | 17 ++++++++--
10 files changed, 130 insertions(+), 46 deletions(-)
__setup() handlers should return 1 to obsolete_checksetup() in
init/main.c to indicate that the boot option has been handled.
A return of 0 causes the boot option/value to be listed as an Unknown
kernel parameter and added to init's (limited) argument or environment
strings. Also, error return codes don't mean anything to
obsolete_checksetup() -- only non-zero (usually 1) or zero.
So return 1 from setup_nmi_watchdog().
Fixes: e5553a6d0442 ("sparc64: Implement NMI watchdog on capable cpus.")
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reported-by: Igor Zhbanov <izh1979(a)gmail.com>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: sparclinux(a)vger.kernel.org
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org
Cc: Arnd Bergmann <arnd(a)arndb.de>
---
v2: change From: Igor to Reported-by:
add more Cc's
v3: use Igor's current email address
v4: add Arnd to Cc: list
arch/sparc/kernel/nmi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -- a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
--- a/arch/sparc/kernel/nmi.c
+++ b/arch/sparc/kernel/nmi.c
@@ -279,7 +279,7 @@ static int __init setup_nmi_watchdog(cha
if (!strncmp(str, "panic", 5))
panic_on_timeout = 1;
- return 0;
+ return 1;
}
__setup("nmi_watchdog=", setup_nmi_watchdog);
__setup() handlers should return 1 to obsolete_checksetup() in
init/main.c to indicate that the boot option has been handled.
A return of 0 causes the boot option/value to be listed as an Unknown
kernel parameter and added to init's (limited) argument or environment
strings. Also, error return codes don't mean anything to
obsolete_checksetup() -- only non-zero (usually 1) or zero.
So return 1 from vdso_setup().
Fixes: 9a08862a5d2e ("vDSO for sparc")
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reported-by: Igor Zhbanov <izh1979(a)gmail.com>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: sparclinux(a)vger.kernel.org
Cc: Dan Carpenter <dan.carpenter(a)oracle.com>
Cc: Nick Alcock <nick.alcock(a)oracle.com>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org
Cc: Arnd Bergmann <arnd(a)arndb.de>
---
v2: correct the Fixes: tag (Dan Carpenter)
v3: add more Cc's;
correct Igor's email address;
change From: Igor to Reported-by: Igor;
v4: add Arnd to Cc: list
arch/sparc/vdso/vma.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff -- a/arch/sparc/vdso/vma.c b/arch/sparc/vdso/vma.c
--- a/arch/sparc/vdso/vma.c
+++ b/arch/sparc/vdso/vma.c
@@ -449,9 +449,8 @@ static __init int vdso_setup(char *s)
unsigned long val;
err = kstrtoul(s, 10, &val);
- if (err)
- return err;
- vdso_enabled = val;
- return 0;
+ if (!err)
+ vdso_enabled = val;
+ return 1;
}
__setup("vdso=", vdso_setup);
Hi Greg and Sasha,
On Sat, 10 Jun 2023 17:56:18 +0000 SeongJae Park <sj(a)kernel.org> wrote:
> On Sat, 10 Jun 2023 12:15:55 +0800 David Gow <davidgow(a)google.com> wrote:
>
> > [-- Attachment #1: Type: text/plain, Size: 2275 bytes --]
> >
> > On Sat, 10 Jun 2023 at 03:09, SeongJae Park <sj(a)kernel.org> wrote:
> > >
> > > Hi David and Brendan,
> > >
> > > On Tue, 2 May 2023 08:04:20 +0800 David Gow <davidgow(a)google.com> wrote:
> > >
> > > > [-- Attachment #1: Type: text/plain, Size: 1473 bytes --]
> > > >
> > > > On Tue, 2 May 2023 at 02:16, 'Daniel Latypov' via KUnit Development
> > > > <kunit-dev(a)googlegroups.com> wrote:
> > > > >
> > > > > Writing `subprocess.Popen[str]` requires python 3.9+.
> > > > > kunit.py has an assertion that the python version is 3.7+, so we should
> > > > > try to stay backwards compatible.
> > > > >
> > > > > This conflicts a bit with commit 1da2e6220e11 ("kunit: tool: fix
> > > > > pre-existing `mypy --strict` errors and update run_checks.py"), since
> > > > > mypy complains like so
> > > > > > kunit_kernel.py:95: error: Missing type parameters for generic type "Popen" [type-arg]
> > > > >
> > > > > Note: `mypy --strict --python-version 3.7` does not work.
> > > > >
> > > > > We could annotate each file with comments like
> > > > > `# mypy: disable-error-code="type-arg"
> > > > > but then we might still get nudged to break back-compat in other files.
> > > > >
> > > > > This patch adds a `mypy.ini` file since it seems like the only way to
> > > > > disable specific error codes for all our files.
> > > > >
> > > > > Note: run_checks.py doesn't need to specify `--config_file mypy.ini`,
> > > > > but I think being explicit is better, particularly since most kernel
> > > > > devs won't be familiar with how mypy works.
> > > > >
> > > > > Fixes: 695e26030858 ("kunit: tool: add subscripts for type annotations where appropriate")
> > > > > Reported-by: SeongJae Park <sj(a)kernel.org>
> > > > > Link: https://lore.kernel.org/linux-kselftest/20230501171520.138753-1-sj@kernel.o…
> > > > > Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
> > > > > ---
> > > >
> > > > Thanks for jumping on this.
> > > >
> > > > Looks good to me!
> > > >
> > > > Reviewed-by: David Gow <davidgow(a)google.com>
> > >
> > > Looks like this patch is still not merged in the mainline. May I ask the ETA,
> > > or any concern if you have?
> > >
> > >
> >
> > We've got this queued for 6.5 in the kselftest/kunit tree[1], so it
> > should land during the merge window. But I'll look into getting it
> > applied as a fix for 6.4, beforehand.
>
> Thank you for the kind answer, Gow! I was thinking this would be treated as a
> fix, and hence merged into the mainline before next merge window. I'm actually
> getting my personal test suite failures due to absence of this fix. It's not a
> critical problem, but it would definitely better for me if this could be merged
> into the mainline as early as possible.
This patch is now in the mainline (e30f65c4b3d671115bf2a9d9ef142285387f2aff).
However, this fix is not in 6.4.y yet, so the original issue is reproducible on
6.4.y. Could you please add this to 6.4.y? I confirmed the mainline commit
can cleanly applied on latest 6.1.y tree, and it fixes the issue.
Thanks,
SJ
>
>
> Thanks,
> SJ
>
> >
> > -- David
> >
> > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/c…
83efeeeb3d04 ("tty: Allow TIOCSTI to be disabled") broke BRLTTY's
ability to simulate keypresses on the console, thus effectively breaking
braille keyboards of blind users.
This restores the TIOCSTI feature for CAP_SYS_ADMIN processes, which
BRLTTY is, thus fixing braille keyboards without re-opening the security
issue.
Signed-off-by: Samuel Thibault <samuel.thibault(a)ens-lyon.org>
Fixes: 83efeeeb3d04 ("tty: Allow TIOCSTI to be disabled")
Index: linux-6.4/drivers/tty/tty_io.c
===================================================================
--- linux-6.4.orig/drivers/tty/tty_io.c
+++ linux-6.4/drivers/tty/tty_io.c
@@ -2276,7 +2276,7 @@ static int tiocsti(struct tty_struct *tt
char ch, mbz = 0;
struct tty_ldisc *ld;
- if (!tty_legacy_tiocsti)
+ if (!tty_legacy_tiocsti && !capable(CAP_SYS_ADMIN))
return -EIO;
if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
This is the start of the stable review cycle for the 6.4.2 release.
There are 13 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.2-rc1.…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.4.2-rc1
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Demi Marie Obenour <demi(a)invisiblethingslab.com>
dm ioctl: Avoid double-fetch of version
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Mike Kravetz <mike.kravetz(a)oracle.com>
hugetlb: revert use of page_cache_next_miss()
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Dan Williams <dan.j.williams(a)intel.com>
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Jeff Layton <jlayton(a)kernel.org>
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Thomas Weißschuh <linux(a)weissschuh.net>
tools/nolibc: x86_64: disable stack protector for _start
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
drivers/cxl/core/pci.c | 27 +++--------------
drivers/cxl/cxl.h | 1 -
drivers/cxl/port.c | 14 ++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/md/dm-ioctl.c | 33 +++++++++++++--------
drivers/nubus/proc.c | 22 ++++++++++----
drivers/pci/pci-acpi.c | 53 +++++++++++++++++++++++++---------
fs/hugetlbfs/inode.c | 8 ++---
fs/nfs/inode.c | 2 +-
include/linux/mm.h | 4 ++-
mm/hugetlb.c | 12 ++++----
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/include/nolibc/arch-x86_64.h | 2 +-
tools/testing/cxl/Kbuild | 1 -
tools/testing/cxl/test/mock.c | 15 ----------
18 files changed, 128 insertions(+), 97 deletions(-)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070344-agenda-establish-d050@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070343-agenda-customs-7f89@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070342-blah-levitate-41a6@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070341-aflame-earwig-4540@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070340-dice-enjoyably-e417@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 249bed821b4db6d95a99160f7d6d236ea5fe6362
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023070339-squire-expend-6932@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 249bed821b4db6d95a99160f7d6d236ea5fe6362 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sat, 3 Jun 2023 10:52:42 -0400
Subject: [PATCH] dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel. copy_params() then fetches the version
from userspace *again*, and this time no validation is done. The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.
Fix this flaw by not copying the version back to the kernel the second
time. This is not exploitable as the version is not further used in the
kernel. However, it could become a problem if future patches start
relying on the version field.
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a92abbe90981..bfaebc02833a 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1872,30 +1872,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
* As well as checking the version compatibility this always
* copies the kernel interface version out.
*/
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+ struct dm_ioctl *kernel_params)
{
- uint32_t version[3];
int r = 0;
- if (copy_from_user(version, user->version, sizeof(version)))
+ /* Make certain version is first member of dm_ioctl struct */
+ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
return -EFAULT;
- if ((version[0] != DM_VERSION_MAJOR) ||
- (version[1] > DM_VERSION_MINOR)) {
+ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+ (kernel_params->version[1] > DM_VERSION_MINOR)) {
DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
DM_VERSION_MAJOR, DM_VERSION_MINOR,
DM_VERSION_PATCHLEVEL,
- version[0], version[1], version[2], cmd);
+ kernel_params->version[0],
+ kernel_params->version[1],
+ kernel_params->version[2],
+ cmd);
r = -EINVAL;
}
/*
* Fill in the kernel version.
*/
- version[0] = DM_VERSION_MAJOR;
- version[1] = DM_VERSION_MINOR;
- version[2] = DM_VERSION_PATCHLEVEL;
- if (copy_to_user(user->version, version, sizeof(version)))
+ kernel_params->version[0] = DM_VERSION_MAJOR;
+ kernel_params->version[1] = DM_VERSION_MINOR;
+ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
return -EFAULT;
return r;
@@ -1921,7 +1927,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
unsigned int noio_flag;
- if (copy_from_user(param_kernel, user, minimum_data_size))
+ /* check_version() already copied version from userspace, avoid TOCTOU */
+ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+ (char __user *)user + sizeof(param_kernel->version),
+ minimum_data_size - sizeof(param_kernel->version)))
return -EFAULT;
if (param_kernel->data_size < minimum_data_size) {
@@ -2033,7 +2042,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
* Check the interface version passed in. This also
* writes out the kernel's interface version.
*/
- r = check_version(cmd, user);
+ r = check_version(cmd, user, ¶m_kernel);
if (r)
return r;
commit 1c249565426e3a9940102c0ba9f63914f7cda73d upstream.
This problem was encountered on an arm64 system with a lot of memory.
Without kernel debug symbols installed, and with both kcore and kallsyms
available, perf managed to get confused and returned "unknown" for all
of the kernel symbols that it tried to look up.
On this system, stext fell within the vmalloc segment. The kcore symbol
matching code tries to find the first segment that contains stext and
uses that to replace the segment generated from just the kallsyms
information. In this case, however, there were two: a very large
vmalloc segment, and the text segment. This caused perf to get confused
because multiple overlapping segments were inserted into the RB tree
that holds the discovered segments. However, that alone wasn't
sufficient to cause the problem. Even when we could find the segment,
the offsets were adjusted in such a way that the newly generated symbols
didn't line up with the instruction addresses in the trace. The most
obvious solution would be to consult which segment type is text from
kcore, but this information is not exposed to users.
Instead, select the smallest matching segment that contains stext
instead of the first matching segment. This allows us to match the text
segment instead of vmalloc, if one is contained within the other.
Reviewed-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Krister Johansen <kjlx(a)templeofstupid.com>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: David Reaver <me(a)davidreaver.com>
Cc: Ian Rogers <irogers(a)google.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Michael Petlan <mpetlan(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Link: http://lore.kernel.org/lkml/20230125183418.GD1963@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Krister Johansen <kjlx(a)templeofstupid.com>
---
tools/perf/util/symbol.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b1e5fd99e38a..80c54196e0e4 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1357,10 +1357,23 @@ static int dso__load_kcore(struct dso *dso, struct map *map,
/* Find the kernel map using the '_stext' symbol */
if (!kallsyms__get_function_start(kallsyms_filename, "_stext", &stext)) {
+ u64 replacement_size = 0;
+
list_for_each_entry(new_map, &md.maps, node) {
- if (stext >= new_map->start && stext < new_map->end) {
+ u64 new_size = new_map->end - new_map->start;
+
+ if (!(stext >= new_map->start && stext < new_map->end))
+ continue;
+
+ /*
+ * On some architectures, ARM64 for example, the kernel
+ * text can get allocated inside of the vmalloc segment.
+ * Select the smallest matching segment, in case stext
+ * falls within more than one in the list.
+ */
+ if (!replacement_map || new_size < replacement_size) {
replacement_map = new_map;
- break;
+ replacement_size = new_size;
}
}
}
--
2.25.1
commit fd4aed8d985a3236d0877ff6d0c80ad39d4ce81a upstream
Ackerley Tng reported an issue with hugetlbfs fallocate as noted in the
Closes tag. The issue showed up after the conversion of hugetlb page
cache lookup code to use page_cache_next_miss. User visible effects are:
- hugetlbfs fallocate incorrectly returns -EEXIST if pages are presnet
in the file.
- hugetlb pages will not be included in core dumps if they need to be
brought in via GUP.
- userfaultfd UFFDIO_COPY will not notice pages already present in the
cache. It may try to allocate a new page and potentially return
ENOMEM as opposed to EEXIST.
Revert the use page_cache_next_miss() in hugetlb code.
The upstream fix[2] cannot be used used directly as the return value for
filemap_get_folio() has been changed between 6.3 and upstream.
Closes: https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com
Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Cc: <stable(a)vger.kernel.org> #v6.3
Reported-by: Ackerley Tng <ackerleytng(a)google.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
[1] https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.co…
[2] https://lore.kernel.org/lkml/20230621230255.GD4155@monkey/
---
fs/hugetlbfs/inode.c | 8 +++-----
mm/hugetlb.c | 11 +++++------
2 files changed, 8 insertions(+), 11 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9062da6da5675..586767afb4cdb 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
*/
struct folio *folio;
unsigned long addr;
- bool present;
cond_resched();
@@ -845,10 +844,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
mutex_lock(&hugetlb_fault_mutex_table[hash]);
/* See if already present in mapping to avoid alloc/free */
- rcu_read_lock();
- present = page_cache_next_miss(mapping, index, 1) != index;
- rcu_read_unlock();
- if (present) {
+ folio = filemap_get_folio(mapping, index);
+ if (folio) {
+ folio_put(folio);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
hugetlb_drop_vma_policy(&pseudo_vma);
continue;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 245038a9fe4ea..29ab27d2a3ef5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5666,13 +5666,12 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
{
struct address_space *mapping = vma->vm_file->f_mapping;
pgoff_t idx = vma_hugecache_offset(h, vma, address);
- bool present;
-
- rcu_read_lock();
- present = page_cache_next_miss(mapping, idx, 1) != idx;
- rcu_read_unlock();
+ struct folio *folio;
- return present;
+ folio = filemap_get_folio(mapping, idx);
+ if (folio)
+ folio_put(folio);
+ return folio != NULL;
}
int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
--
2.40.1
split_if_spec expects a NULL-pointer as an end marker for the argument
list, but tuntap_probe never supplied that terminating NULL. As a result
incorrectly formatted interface specification string may cause a crash
because of the random memory access. Fix that by adding NULL terminator
to the split_if_spec argument list.
Cc: stable(a)vger.kernel.org
Fixes: 7282bee78798 ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 8")
Signed-off-by: Max Filippov <jcmvbkbc(a)gmail.com>
---
Changes v1->v2:
- fix commit message wording and add cc: stable
arch/xtensa/platforms/iss/network.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/xtensa/platforms/iss/network.c b/arch/xtensa/platforms/iss/network.c
index 7b97e6ab85a4..85c82cd42188 100644
--- a/arch/xtensa/platforms/iss/network.c
+++ b/arch/xtensa/platforms/iss/network.c
@@ -237,7 +237,7 @@ static int tuntap_probe(struct iss_net_private *lp, int index, char *init)
init += sizeof(TRANSPORT_TUNTAP_NAME) - 1;
if (*init == ',') {
- rem = split_if_spec(init + 1, &mac_str, &dev_name);
+ rem = split_if_spec(init + 1, &mac_str, &dev_name, NULL);
if (rem != NULL) {
pr_err("%s: extra garbage on specification : '%s'\n",
dev->name, rem);
--
2.30.2
__register_btf_kfunc_id_set() assumes .BTF to be part of the module's
.ko file if CONFIG_DEBUG_INFO_BTF is enabled. If that's not the case,
the function prints an error message and return an error. As a result,
such modules cannot be loaded.
However, the section could be stripped out during a build process. It
would be better to let the modules loaded, because their basic
functionalities have no problem[1], though the BTF functionalities will
not be supported. Make the function to lower the level of the message
from error to warn, and return no error.
[1] https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion/
Reported-by: Alexander Egorenkov <Alexander.Egorenkov(a)ibm.com>
Link: https://lore.kernel.org/bpf/87y228q66f.fsf@oc8242746057.ibm.com/
Suggested-by: Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
Link: https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion/
Fixes: c446fdacb10d ("bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF")
Cc: <stable(a)vger.kernel.org> # 5.18.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
Changes from v2
(https://lore.kernel.org/bpf/20230628164611.83038-1-sj@kernel.org/)
- Keep the error for vmlinux case.
Changes from v1
(https://lore.kernel.org/all/20230626181120.7086-1-sj@kernel.org/)
- Fix Fixes: tag (Jiri Olsa)
- Add 'Acked-by: ' from Jiri Olsa
Notes
- This is a fix. Hence, would better to merged into bpf tree, not
bpf-next.
- This doesn't cleanly applied on 6.1.y. I will send the backport to
stable@ as soon as this is merged into the mainline.
kernel/bpf/btf.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 29fe21099298..817204d53372 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7891,10 +7891,8 @@ static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook,
pr_err("missing vmlinux BTF, cannot register kfuncs\n");
return -ENOENT;
}
- if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES)) {
- pr_err("missing module BTF, cannot register kfuncs\n");
- return -ENOENT;
- }
+ if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES))
+ pr_warn("missing module BTF, cannot register kfuncs\n");
return 0;
}
if (IS_ERR(btf))
--
2.25.1
__register_btf_kfunc_id_set() assumes .BTF to be part of the module's
.ko file if CONFIG_DEBUG_INFO_BTF is enabled. If that's not the case,
the function prints an error message and return an error. As a result,
such modules cannot be loaded.
However, the section could be stripped out during a build process. It
would be better to let the modules loaded, because their basic
functionalities have no problem[1], though the BTF functionalities will
not be supported. Make the function to lower the level of the message
from error to warn, and return no error.
[1] https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion/
Reported-by: Alexander Egorenkov <Alexander.Egorenkov(a)ibm.com>
Link: https://lore.kernel.org/bpf/87y228q66f.fsf@oc8242746057.ibm.com/
Suggested-by: Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
Link: https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion/
Fixes: c446fdacb10d ("bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF")
Cc: <stable(a)vger.kernel.org> # 5.18.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
---
Changes from v1
(https://lore.kernel.org/all/20230626181120.7086-1-sj@kernel.org/)
- Fix Fixes: tag (Jiri Olsa)
- Add 'Acked-by: ' from Jiri Olsa
kernel/bpf/btf.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 6b682b8e4b50..d683f034996f 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7848,14 +7848,10 @@ static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook,
btf = btf_get_module_btf(kset->owner);
if (!btf) {
- if (!kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
- pr_err("missing vmlinux BTF, cannot register kfuncs\n");
- return -ENOENT;
- }
- if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES)) {
- pr_err("missing module BTF, cannot register kfuncs\n");
- return -ENOENT;
- }
+ if (!kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF))
+ pr_warn("missing vmlinux BTF, cannot register kfuncs\n");
+ if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES))
+ pr_warn("missing module BTF, cannot register kfuncs\n");
return 0;
}
if (IS_ERR(btf))
--
2.25.1
Make sure that the soundwire device used for register accesses has been
enumerated and initialised before trying to read the codec variant
during component probe.
This specifically avoids interpreting (a masked and shifted) -EBUSY
errno as the variant:
wcd938x_codec audio-codec: ASoC: error at soc_component_read_no_lock on audio-codec for register: [0x000034b0] -16
in case the soundwire device has not yet been initialised, which in turn
prevents some headphone controls from being registered.
Fixes: 8d78602aa87a ("ASoC: codecs: wcd938x: add basic driver")
Cc: stable(a)vger.kernel.org # 5.14
Cc: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Reported-by: Steev Klimaszewski <steev(a)kali.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
sound/soc/codecs/wcd938x.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/sound/soc/codecs/wcd938x.c b/sound/soc/codecs/wcd938x.c
index e3ae4fb2c4db..4571588fad62 100644
--- a/sound/soc/codecs/wcd938x.c
+++ b/sound/soc/codecs/wcd938x.c
@@ -3080,9 +3080,18 @@ static int wcd938x_irq_init(struct wcd938x_priv *wcd, struct device *dev)
static int wcd938x_soc_codec_probe(struct snd_soc_component *component)
{
struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component);
+ struct sdw_slave *tx_sdw_dev = wcd938x->tx_sdw_dev;
struct device *dev = component->dev;
+ unsigned long time_left;
int ret, i;
+ time_left = wait_for_completion_timeout(&tx_sdw_dev->initialization_complete,
+ msecs_to_jiffies(2000));
+ if (!time_left) {
+ dev_err(dev, "soundwire device init timeout\n");
+ return -ETIMEDOUT;
+ }
+
snd_soc_component_init_regmap(component, wcd938x->regmap);
ret = pm_runtime_resume_and_get(dev);
--
2.39.3
If nf_conntrack_init_start() fails (for example due to a
register_nf_conntrack_bpf() failure), the nf_conntrack_helper_fini()
clean-up path frees the nf_ct_helper_hash map.
When built with NF_CONNTRACK=y, further netfilter modules (e.g:
netfilter_conntrack_ftp) can still be loaded and call
nf_conntrack_helpers_register(), independently of whether nf_conntrack
initialized correctly. This accesses the nf_ct_helper_hash dangling
pointer and causes a uaf, possibly leading to random memory corruption.
This patch guards nf_conntrack_helper_register() from accessing a freed
or uninitialized nf_ct_helper_hash pointer and fixes possible
uses-after-free when loading a conntrack module.
Cc: stable(a)vger.kernel.org
Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure")
Signed-off-by: Florent Revest <revest(a)chromium.org>
---
net/netfilter/nf_conntrack_helper.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 0c4db2f2ac43..f22691f83853 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -360,6 +360,9 @@ int nf_conntrack_helper_register(struct nf_conntrack_helper *me)
BUG_ON(me->expect_class_max >= NF_CT_MAX_EXPECT_CLASSES);
BUG_ON(strlen(me->name) > NF_CT_HELPER_NAME_LEN - 1);
+ if (!nf_ct_helper_hash)
+ return -ENOENT;
+
if (me->expect_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
return -EINVAL;
@@ -515,4 +518,5 @@ int nf_conntrack_helper_init(void)
void nf_conntrack_helper_fini(void)
{
kvfree(nf_ct_helper_hash);
+ nf_ct_helper_hash = NULL;
}
--
2.41.0.255.g8b1d071c50-goog
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance
to restore them. Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c51…
Cc: <stable(a)vger.kernel.org> # 6.1
Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
Reported-by: Qun-wei Lin (林群崴) <Qun-wei.Lin(a)mediatek.com>
Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@…
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Steven Price <steven.price(a)arm.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
---
v2:
- Call arch_swap_restore() directly instead of via arch_do_swap_page()
mm/memory.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/mm/memory.c b/mm/memory.c
index f69fbc251198..fc25764016b3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3932,6 +3932,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
}
}
+ /*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, folio);
+
/*
* Remove the swap entry and conditionally try to free up the swapcache.
* We're already holding a reference on the page but haven't mapped it
--
2.40.1.698.g37aff9b760-goog
For some reason we ended up with a setup without this flag.
This resulted in inconsistent sound card devices numbers which
are also not starting as expected at dai_link->id.
(Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
With this patch patch now the MultiMedia1 PCM ends up with device number 0
as expected.
Fixes: 9b4fe0f1cd79 ("ASoC: qdsp6: audioreach: add q6apm-dai support")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
---
sound/soc/qcom/qdsp6/q6apm-dai.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/soc/qcom/qdsp6/q6apm-dai.c b/sound/soc/qcom/qdsp6/q6apm-dai.c
index 5eb0b864c740..c90db6daabbd 100644
--- a/sound/soc/qcom/qdsp6/q6apm-dai.c
+++ b/sound/soc/qcom/qdsp6/q6apm-dai.c
@@ -840,6 +840,7 @@ static const struct snd_soc_component_driver q6apm_fe_dai_component = {
.pointer = q6apm_dai_pointer,
.trigger = q6apm_dai_trigger,
.compress_ops = &q6apm_dai_compress_ops,
+ .use_dai_pcm_id = true,
};
static int q6apm_dai_probe(struct platform_device *pdev)
--
2.21.0
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index f4a2c5845bcbd..6c11cf2260542 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -591,12 +591,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1383,6 +1377,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
A switch from OSI to PC mode is only possible if all CPUs other than the
calling one are OFF, either through a call to CPU_OFF or not yet booted.
Currently OSI mode is enabled before power domains are created. In cases
where CPUidle states are not using hierarchical CPU topology the bail out
path tries to switch back to PC mode which gets denied by firmware since
other CPUs are online at this point and creates inconsistent state as
firmware is in OSI mode and Linux in PC mode.
This change moves enabling OSI mode after power domains are created,
this would makes sure that hierarchical CPU topology is used before
switching firmware to OSI mode.
Cc: stable(a)vger.kernel.org
Fixes: 70c179b49870 ("cpuidle: psci: Allow PM domain to be initialized even if no OSI mode")
Signed-off-by: Maulik Shah <quic_mkshah(a)quicinc.com>
Reviewed-by: Ulf Hansson <ulf.hansson(a)linaro.org>
---
drivers/cpuidle/cpuidle-psci-domain.c | 39 +++++++++------------------
1 file changed, 13 insertions(+), 26 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
index c2d6d9c3c930..b88af1262f1a 100644
--- a/drivers/cpuidle/cpuidle-psci-domain.c
+++ b/drivers/cpuidle/cpuidle-psci-domain.c
@@ -120,20 +120,6 @@ static void psci_pd_remove(void)
}
}
-static bool psci_pd_try_set_osi_mode(void)
-{
- int ret;
-
- if (!psci_has_osi_support())
- return false;
-
- ret = psci_set_osi_mode(true);
- if (ret)
- return false;
-
- return true;
-}
-
static void psci_cpuidle_domain_sync_state(struct device *dev)
{
/*
@@ -152,15 +138,12 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
{
struct device_node *np = pdev->dev.of_node;
struct device_node *node;
- bool use_osi;
+ bool use_osi = psci_has_osi_support();
int ret = 0, pd_count = 0;
if (!np)
return -ENODEV;
- /* If OSI mode is supported, let's try to enable it. */
- use_osi = psci_pd_try_set_osi_mode();
-
/*
* Parse child nodes for the "#power-domain-cells" property and
* initialize a genpd/genpd-of-provider pair when it's found.
@@ -170,33 +153,37 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
continue;
ret = psci_pd_init(node, use_osi);
- if (ret)
- goto put_node;
+ if (ret) {
+ of_node_put(node);
+ goto exit;
+ }
pd_count++;
}
/* Bail out if not using the hierarchical CPU topology. */
if (!pd_count)
- goto no_pd;
+ return 0;
/* Link genpd masters/subdomains to model the CPU topology. */
ret = dt_idle_pd_init_topology(np);
if (ret)
goto remove_pd;
+ /* let's try to enable OSI. */
+ ret = psci_set_osi_mode(use_osi);
+ if (ret)
+ goto remove_pd;
+
pr_info("Initialized CPU PM domain topology using %s mode\n",
use_osi ? "OSI" : "PC");
return 0;
-put_node:
- of_node_put(node);
remove_pd:
+ dt_idle_pd_remove_topology(np);
psci_pd_remove();
+exit:
pr_err("failed to create CPU PM domains ret=%d\n", ret);
-no_pd:
- if (use_osi)
- psci_set_osi_mode(false);
return ret;
}
--
2.17.1
Hi,
There are two issues that have been identified recently where the lack
of calling ACPI _REG when devices change D-states leads to functional
problems.
The first one is a case where some PCIe devices are not functional after
resuming from suspend (S3 or s0ix).
The second one is a case where as the kernel is initializing it gets
stuck in a busy loop waiting for AML that never returns.
In both cases this is fixed by cherry-picking these two commits:
5557b62634ab ("PCI/ACPI: Validate acpi_pci_set_power_state() parameter")
112a7f9c8edb ("PCI/ACPI: Call _REG when transitioning D-states")
Can you please backport these to 6.1.y and later?
Thanks,
Since commit 85e031154c7c ("powerpc/bpf: Perform complete extra passes
to update addresses"), two additional passes are performed to avoid
space and CPU time wastage on powerpc. But these extra passes led to
WARN_ON_ONCE() hits in bpf_add_extable_entry() as extable entries are
populated again, during the extra pass, without resetting the index.
Fix it by resetting entry index before repopulating extable entries,
if and when there is an additional pass.
Fixes: 85e031154c7c ("powerpc/bpf: Perform complete extra passes to update addresses")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hari Bathini <hbathini(a)linux.ibm.com>
---
arch/powerpc/net/bpf_jit_comp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index e93aefcfb83f..37043dfc1add 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -101,6 +101,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
bpf_hdr = jit_data->header;
proglen = jit_data->proglen;
extra_pass = true;
+ /* During extra pass, ensure index is reset before repopulating extable entries */
+ cgctx.exentry_idx = 0;
goto skip_init_ctx;
}
--
2.40.0
When building the boot wrapper assembly files with clang after
commit 648a1783fe25 ("powerpc/boot: Fix boot wrapper code generation
with CONFIG_POWER10_CPU"), the following warnings appear for each file
built:
'-prefixed' is not a recognized feature for this target (ignoring feature)
'-pcrel' is not a recognized feature for this target (ignoring feature)
While it is questionable whether or not LLVM should be emitting a
warning when passed negative versions of code generation flags when
building assembly files (since it does not emit a warning for the
altivec and vsx flags), it is easy enough to work around this by just
moving the disabled flags to BOOTCFLAGS after the assignment of
BOOTAFLAGS, so that they are not added when building assembly files.
Do so to silence the warnings.
Cc: stable(a)vger.kernel.org
Fixes: 648a1783fe25 ("powerpc/boot: Fix boot wrapper code generation with CONFIG_POWER10_CPU")
Link: https://github.com/ClangBuiltLinux/linux/issues/1839
Reviewed-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
I do not think that 648a1783fe25 is truly to blame for this but the
Fixes tag will help the stable team ensure that this change gets
backported with 648a1783fe25. This is the minimal fix for the problem
but the true fix is separating AFLAGS and CFLAGS, which should be done
by this in-flight series by Nick:
https://lore.kernel.org/20230426055848.402993-1-npiggin@gmail.com/
---
arch/powerpc/boot/Makefile | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 85cde5bf04b7..771b79423bbc 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -34,8 +34,6 @@ endif
BOOTCFLAGS := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
-fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx \
- $(call cc-option,-mno-prefixed) $(call cc-option,-mno-pcrel) \
- $(call cc-option,-mno-mma) \
$(call cc-option,-mno-spe) $(call cc-option,-mspe=no) \
-pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
$(LINUXINCLUDE)
@@ -71,6 +69,10 @@ BOOTAFLAGS := -D__ASSEMBLY__ $(BOOTCFLAGS) -nostdinc
BOOTARFLAGS := -crD
+BOOTCFLAGS += $(call cc-option,-mno-prefixed) \
+ $(call cc-option,-mno-pcrel) \
+ $(call cc-option,-mno-mma)
+
ifdef CONFIG_CC_IS_CLANG
BOOTCFLAGS += $(CLANG_FLAGS)
BOOTAFLAGS += $(CLANG_FLAGS)
---
base-commit: 169f8997968ab620d750d9a45e15c5288d498356
change-id: 20230427-remove-power10-args-from-boot-aflags-clang-268c43e8c1fc
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Using `% nr_cpumask_bits` is slow and complicated, and not totally
robust toward dynamic changes to CPU topologies. Rather than storing the
next CPU in the round-robin, just store the last one, and also return
that value. This simplifies the loop drastically into a much more common
pattern.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Cc: stable(a)vger.kernel.org
Reported-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Tested-by: Manuel Leiner <manuel.leiner(a)gmx.de>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/net/wireguard/queueing.c | 1 +
drivers/net/wireguard/queueing.h | 25 +++++++++++--------------
drivers/net/wireguard/receive.c | 2 +-
drivers/net/wireguard/send.c | 2 +-
4 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/drivers/net/wireguard/queueing.c b/drivers/net/wireguard/queueing.c
index 8084e7408c0a..26d235d15235 100644
--- a/drivers/net/wireguard/queueing.c
+++ b/drivers/net/wireguard/queueing.c
@@ -28,6 +28,7 @@ int wg_packet_queue_init(struct crypt_queue *queue, work_func_t function,
int ret;
memset(queue, 0, sizeof(*queue));
+ queue->last_cpu = -1;
ret = ptr_ring_init(&queue->ring, len, GFP_KERNEL);
if (ret)
return ret;
diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
index 125284b346a7..1ea4f874e367 100644
--- a/drivers/net/wireguard/queueing.h
+++ b/drivers/net/wireguard/queueing.h
@@ -117,20 +117,17 @@ static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
return cpu;
}
-/* This function is racy, in the sense that next is unlocked, so it could return
- * the same CPU twice. A race-free version of this would be to instead store an
- * atomic sequence number, do an increment-and-return, and then iterate through
- * every possible CPU until we get to that index -- choose_cpu. However that's
- * a bit slower, and it doesn't seem like this potential race actually
- * introduces any performance loss, so we live with it.
+/* This function is racy, in the sense that it's called while last_cpu is
+ * unlocked, so it could return the same CPU twice. Adding locking or using
+ * atomic sequence numbers is slower though, and the consequences of racing are
+ * harmless, so live with it.
*/
-static inline int wg_cpumask_next_online(int *next)
+static inline int wg_cpumask_next_online(int *last_cpu)
{
- int cpu = *next;
-
- while (unlikely(!cpumask_test_cpu(cpu, cpu_online_mask)))
- cpu = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
- *next = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
+ int cpu = cpumask_next(*last_cpu, cpu_online_mask);
+ if (cpu >= nr_cpu_ids)
+ cpu = cpumask_first(cpu_online_mask);
+ *last_cpu = cpu;
return cpu;
}
@@ -159,7 +156,7 @@ static inline void wg_prev_queue_drop_peeked(struct prev_queue *queue)
static inline int wg_queue_enqueue_per_device_and_peer(
struct crypt_queue *device_queue, struct prev_queue *peer_queue,
- struct sk_buff *skb, struct workqueue_struct *wq, int *next_cpu)
+ struct sk_buff *skb, struct workqueue_struct *wq)
{
int cpu;
@@ -173,7 +170,7 @@ static inline int wg_queue_enqueue_per_device_and_peer(
/* Then we queue it up in the device queue, which consumes the
* packet as soon as it can.
*/
- cpu = wg_cpumask_next_online(next_cpu);
+ cpu = wg_cpumask_next_online(&device_queue->last_cpu);
if (unlikely(ptr_ring_produce_bh(&device_queue->ring, skb)))
return -EPIPE;
queue_work_on(cpu, wq, &per_cpu_ptr(device_queue->worker, cpu)->work);
diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index 7135d51d2d87..0b3f0c843550 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -524,7 +524,7 @@ static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)
goto err;
ret = wg_queue_enqueue_per_device_and_peer(&wg->decrypt_queue, &peer->rx_queue, skb,
- wg->packet_crypt_wq, &wg->decrypt_queue.last_cpu);
+ wg->packet_crypt_wq);
if (unlikely(ret == -EPIPE))
wg_queue_enqueue_per_peer_rx(skb, PACKET_STATE_DEAD);
if (likely(!ret || ret == -EPIPE)) {
diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c
index 5368f7c35b4b..95c853b59e1d 100644
--- a/drivers/net/wireguard/send.c
+++ b/drivers/net/wireguard/send.c
@@ -318,7 +318,7 @@ static void wg_packet_create_data(struct wg_peer *peer, struct sk_buff *first)
goto err;
ret = wg_queue_enqueue_per_device_and_peer(&wg->encrypt_queue, &peer->tx_queue, first,
- wg->packet_crypt_wq, &wg->encrypt_queue.last_cpu);
+ wg->packet_crypt_wq);
if (unlikely(ret == -EPIPE))
wg_queue_enqueue_per_peer_tx(first, PACKET_STATE_DEAD);
err:
--
2.41.0
Packets bound for peers can queue up prior to the device private key
being set. For example, if persistent keepalive is set, a packet is
queued up to be sent as soon as the device comes up. However, if the
private key hasn't been set yet, the handshake message never sends, and
no timer is armed to retry, since that would be pointless.
But, if a user later sets a private key, the expectation is that those
queued packets, such as a persistent keepalive, are actually sent. So
adjust the configuration logic to account for this edge case, and add a
test case to make sure this works.
Maxim noticed this with a wg-quick(8) config to the tune of:
[Interface]
PostUp = wg set %i private-key somefile
[Peer]
PublicKey = ...
Endpoint = ...
PersistentKeepalive = 25
Here, the private key gets set after the device comes up using a PostUp
script, triggering the bug.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Cc: stable(a)vger.kernel.org
Reported-by: Maxim Cournoyer <maxim.cournoyer(a)gmail.com>
Tested-by: Maxim Cournoyer <maxim.cournoyer(a)gmail.com>
Link: https://lore.kernel.org/wireguard/87fs7xtqrv.fsf@gmail.com/
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/net/wireguard/netlink.c | 14 ++++++----
tools/testing/selftests/wireguard/netns.sh | 30 +++++++++++++++++++---
2 files changed, 35 insertions(+), 9 deletions(-)
diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c
index 43c8c84e7ea8..6d1bd9f52d02 100644
--- a/drivers/net/wireguard/netlink.c
+++ b/drivers/net/wireguard/netlink.c
@@ -546,6 +546,7 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info)
u8 *private_key = nla_data(info->attrs[WGDEVICE_A_PRIVATE_KEY]);
u8 public_key[NOISE_PUBLIC_KEY_LEN];
struct wg_peer *peer, *temp;
+ bool send_staged_packets;
if (!crypto_memneq(wg->static_identity.static_private,
private_key, NOISE_PUBLIC_KEY_LEN))
@@ -564,14 +565,17 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info)
}
down_write(&wg->static_identity.lock);
- wg_noise_set_static_identity_private_key(&wg->static_identity,
- private_key);
- list_for_each_entry_safe(peer, temp, &wg->peer_list,
- peer_list) {
+ send_staged_packets = !wg->static_identity.has_identity && netif_running(wg->dev);
+ wg_noise_set_static_identity_private_key(&wg->static_identity, private_key);
+ send_staged_packets = send_staged_packets && wg->static_identity.has_identity;
+
+ wg_cookie_checker_precompute_device_keys(&wg->cookie_checker);
+ list_for_each_entry_safe(peer, temp, &wg->peer_list, peer_list) {
wg_noise_precompute_static_static(peer);
wg_noise_expire_current_peer_keypairs(peer);
+ if (send_staged_packets)
+ wg_packet_send_staged_packets(peer);
}
- wg_cookie_checker_precompute_device_keys(&wg->cookie_checker);
up_write(&wg->static_identity.lock);
}
skip_set_private_key:
diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh
index 69c7796c7ca9..405ff262ca93 100755
--- a/tools/testing/selftests/wireguard/netns.sh
+++ b/tools/testing/selftests/wireguard/netns.sh
@@ -514,10 +514,32 @@ n2 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/all/rp_filter'
n1 ping -W 1 -c 1 192.168.241.2
[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.3:1" ]]
-ip1 link del veth1
-ip1 link del veth3
-ip1 link del wg0
-ip2 link del wg0
+ip1 link del dev veth3
+ip1 link del dev wg0
+ip2 link del dev wg0
+
+# Make sure persistent keep alives are sent when an adapter comes up
+ip1 link add dev wg0 type wireguard
+n1 wg set wg0 private-key <(echo "$key1") peer "$pub2" endpoint 10.0.0.1:1 persistent-keepalive 1
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -eq 0 ]]
+ip1 link set dev wg0 up
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -gt 0 ]]
+ip1 link del dev wg0
+# This should also happen even if the private key is set later
+ip1 link add dev wg0 type wireguard
+n1 wg set wg0 peer "$pub2" endpoint 10.0.0.1:1 persistent-keepalive 1
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -eq 0 ]]
+ip1 link set dev wg0 up
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -eq 0 ]]
+n1 wg set wg0 private-key <(echo "$key1")
+read _ _ tx_bytes < <(n1 wg show wg0 transfer)
+[[ $tx_bytes -gt 0 ]]
+ip1 link del dev veth1
+ip1 link del dev wg0
# We test that Netlink/IPC is working properly by doing things that usually cause split responses
ip0 link add dev wg0 type wireguard
--
2.41.0
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit a24c1aab652ebacf9ea62470a166514174c96fe1 ]
The rcu_data structure's ->rcu_cpu_has_work field can be modified by
any CPU attempting to wake up the rcuc kthread. Therefore, this commit
marks accesses to this field from the rcu_cpu_kthread() function.
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/tree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 615283404d9dc..98d64f107fbb7 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2457,12 +2457,12 @@ static void rcu_cpu_kthread(unsigned int cpu)
*statusp = RCU_KTHREAD_RUNNING;
local_irq_disable();
work = *workp;
- *workp = 0;
+ WRITE_ONCE(*workp, 0);
local_irq_enable();
if (work)
rcu_core();
local_bh_enable();
- if (*workp == 0) {
+ if (!READ_ONCE(*workp)) {
trace_rcu_utilization(TPS("End CPU kthread@rcu_wait"));
*statusp = RCU_KTHREAD_WAITING;
return;
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit a24c1aab652ebacf9ea62470a166514174c96fe1 ]
The rcu_data structure's ->rcu_cpu_has_work field can be modified by
any CPU attempting to wake up the rcuc kthread. Therefore, this commit
marks accesses to this field from the rcu_cpu_kthread() function.
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/tree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index df016f6d0662c..48f3e90c5de53 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2826,12 +2826,12 @@ static void rcu_cpu_kthread(unsigned int cpu)
*statusp = RCU_KTHREAD_RUNNING;
local_irq_disable();
work = *workp;
- *workp = 0;
+ WRITE_ONCE(*workp, 0);
local_irq_enable();
if (work)
rcu_core();
local_bh_enable();
- if (*workp == 0) {
+ if (!READ_ONCE(*workp)) {
trace_rcu_utilization(TPS("End CPU kthread@rcu_wait"));
*statusp = RCU_KTHREAD_WAITING;
return;
--
2.39.2
From: Mark Rutland <mark.rutland(a)arm.com>
[ Upstream commit ab9b4008092c86dc12497af155a0901cc1156999 ]
Both create_mapping_noalloc() and update_mapping_prot() sanity-check
their 'virt' parameter, but the check itself doesn't make much sense.
The condition used today appears to be a historical accident.
The sanity-check condition:
if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
[ ... warning here ... ]
return;
}
... can only be true for the KASAN shadow region or the module region,
and there's no reason to exclude these specifically for creating and
updateing mappings.
When arm64 support was first upstreamed in commit:
c1cc1552616d0f35 ("arm64: MMU initialisation")
... the condition was:
if (virt < VMALLOC_START) {
[ ... warning here ... ]
return;
}
At the time, VMALLOC_START was the lowest kernel address, and this was
checking whether 'virt' would be translated via TTBR1.
Subsequently in commit:
14c127c957c1c607 ("arm64: mm: Flip kernel VA space")
... the condition was changed to:
if ((virt >= VA_START) && (virt < VMALLOC_START)) {
[ ... warning here ... ]
return;
}
This appear to have been a thinko. The commit moved the linear map to
the bottom of the kernel address space, with VMALLOC_START being at the
halfway point. The old condition would warn for changes to the linear
map below this, and at the time VA_START was the end of the linear map.
Subsequently we cleaned up the naming of VA_START in commit:
77ad4ce69321abbe ("arm64: memory: rename VA_START to PAGE_END")
... keeping the erroneous condition as:
if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
[ ... warning here ... ]
return;
}
Correct the condition to check against the start of the TTBR1 address
space, which is currently PAGE_OFFSET. This simplifies the logic, and
more clearly matches the "outside kernel range" message in the warning.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Steve Capper <steve.capper(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
Link: https://lore.kernel.org/r/20230615102628.1052103-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/mm/mmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 5cf575f23af28..8e934bb44f12e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -399,7 +399,7 @@ static phys_addr_t pgd_pgtable_alloc(int shift)
static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot)
{
- if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
+ if (virt < PAGE_OFFSET) {
pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n",
&phys, virt);
return;
@@ -426,7 +426,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot)
{
- if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
+ if (virt < PAGE_OFFSET) {
pr_warn("BUG: not updating mapping for %pa at 0x%016lx - outside kernel range\n",
&phys, virt);
return;
--
2.39.2
From: Hans de Goede <hdegoede(a)redhat.com>
[ Upstream commit 69d6b37695c1f2320cfa330e1e1636d50dd5040a ]
The Nextbook Ares 8A is a x86 ACPI tablet which ships with Android x86
as factory OS. Its DSDT contains a bunch of I2C devices which are not
actually there (the Android x86 kernel fork ignores I2C devices described
in the DSDT).
On this specific model this just not cause resource conflicts, one of
the probe() calls for the non existing i2c_clients actually ends up
toggling a GPIO or executing a _PS3 after a failed probe which turns
the tablet off.
Add a ACPI_QUIRK_SKIP_I2C_CLIENTS for the Nextbook Ares 8 to the
acpi_quirk_skip_dmi_ids table to avoid the bogus i2c_clients and
to fix the tablet turning off during boot because of this.
Also add the "10EC5651" HID for the RealTek ALC5651 codec used
in this tablet to the list of HIDs for which not to skipi2c_client
instantiation, since the Intel SST sound driver relies on
the codec being instantiated through ACPI.
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/acpi/x86/utils.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c
index e45285d4e62a4..0b65c0d3f5a80 100644
--- a/drivers/acpi/x86/utils.c
+++ b/drivers/acpi/x86/utils.c
@@ -330,7 +330,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY),
},
{
- /* Nextbook Ares 8 */
+ /* Nextbook Ares 8 (BYT version)*/
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
DMI_MATCH(DMI_PRODUCT_NAME, "M890BAP"),
@@ -338,6 +338,16 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
.driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS |
ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY),
},
+ {
+ /* Nextbook Ares 8A (CHT version)*/
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "CherryTrail"),
+ DMI_MATCH(DMI_BIOS_VERSION, "M882"),
+ },
+ .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS |
+ ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY),
+ },
{
/* Whitelabel (sold as various brands) TM800A550L */
.matches = {
@@ -356,6 +366,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
#if IS_ENABLED(CONFIG_X86_ANDROID_TABLETS)
static const struct acpi_device_id i2c_acpi_known_good_ids[] = {
{ "10EC5640", 0 }, /* RealTek ALC5640 audio codec */
+ { "10EC5651", 0 }, /* RealTek ALC5651 audio codec */
{ "INT33F4", 0 }, /* X-Powers AXP288 PMIC */
{ "INT33FD", 0 }, /* Intel Crystal Cove PMIC */
{ "INT34D3", 0 }, /* Intel Whiskey Cove PMIC */
--
2.39.2
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 008b50da22246..705d330600485 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -553,12 +553,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1274,6 +1268,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 393114c10c285..4191f17379d2f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -590,12 +590,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1372,6 +1366,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 98a7a7b1471b7..dbf1e572f6e7d 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -591,12 +591,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1383,6 +1377,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
From: David Woodhouse <dwmw(a)amazon.co.uk>
[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Michael Kelley <mikelley(a)microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Tested-by: Helge Deller <deller(a)gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92ca6bb59..43e0a77f21e81 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -591,12 +591,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;
- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1383,6 +1377,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}
cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 6bc6e6b27524304aadb9c04611ddb1c84dd7617a ]
The ref_scale_shutdown() kthread/function uses wait_event() to wait for
the refscale test to complete. However, although the read-side tests
are normally extremely fast, there is no law against specifying a very
large value for the refscale.loops module parameter or against having
a slow read-side primitive. Either way, this might well trigger the
hung-task timeout.
This commit therefore replaces those wait_event() calls with calls to
wait_event_idle(), which do not trigger the hung-task timeout.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/refscale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 952595c678b37..4e419ca6d6114 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -625,7 +625,7 @@ ref_scale_cleanup(void)
static int
ref_scale_shutdown(void *arg)
{
- wait_event(shutdown_wq, shutdown_start);
+ wait_event_idle(shutdown_wq, shutdown_start);
smp_mb(); // Wake before output.
ref_scale_cleanup();
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 6bc6e6b27524304aadb9c04611ddb1c84dd7617a ]
The ref_scale_shutdown() kthread/function uses wait_event() to wait for
the refscale test to complete. However, although the read-side tests
are normally extremely fast, there is no law against specifying a very
large value for the refscale.loops module parameter or against having
a slow read-side primitive. Either way, this might well trigger the
hung-task timeout.
This commit therefore replaces those wait_event() calls with calls to
wait_event_idle(), which do not trigger the hung-task timeout.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/refscale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 66dc14cf5687e..5abb0cf52803a 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -777,7 +777,7 @@ ref_scale_cleanup(void)
static int
ref_scale_shutdown(void *arg)
{
- wait_event(shutdown_wq, shutdown_start);
+ wait_event_idle(shutdown_wq, shutdown_start);
smp_mb(); // Wake before output.
ref_scale_cleanup();
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 6bc6e6b27524304aadb9c04611ddb1c84dd7617a ]
The ref_scale_shutdown() kthread/function uses wait_event() to wait for
the refscale test to complete. However, although the read-side tests
are normally extremely fast, there is no law against specifying a very
large value for the refscale.loops module parameter or against having
a slow read-side primitive. Either way, this might well trigger the
hung-task timeout.
This commit therefore replaces those wait_event() calls with calls to
wait_event_idle(), which do not trigger the hung-task timeout.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/refscale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 435c884c02b5c..d49a9d66e0000 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -795,7 +795,7 @@ ref_scale_cleanup(void)
static int
ref_scale_shutdown(void *arg)
{
- wait_event(shutdown_wq, shutdown_start);
+ wait_event_idle(shutdown_wq, shutdown_start);
smp_mb(); // Wake before output.
ref_scale_cleanup();
--
2.39.2
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 6bc6e6b27524304aadb9c04611ddb1c84dd7617a ]
The ref_scale_shutdown() kthread/function uses wait_event() to wait for
the refscale test to complete. However, although the read-side tests
are normally extremely fast, there is no law against specifying a very
large value for the refscale.loops module parameter or against having
a slow read-side primitive. Either way, this might well trigger the
hung-task timeout.
This commit therefore replaces those wait_event() calls with calls to
wait_event_idle(), which do not trigger the hung-task timeout.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/rcu/refscale.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index afa3e1a2f6902..1970ce5f22d40 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -1031,7 +1031,7 @@ ref_scale_cleanup(void)
static int
ref_scale_shutdown(void *arg)
{
- wait_event(shutdown_wq, shutdown_start);
+ wait_event_idle(shutdown_wq, shutdown_start);
smp_mb(); // Wake before output.
ref_scale_cleanup();
--
2.39.2
The patch titled
Subject: mm: call arch_swap_restore() from do_swap_page()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-call-arch_swap_restore-from-do_swap_page.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Collingbourne <pcc(a)google.com>
Subject: mm: call arch_swap_restore() from do_swap_page()
Date: Mon, 22 May 2023 17:43:08 -0700
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance to
restore them. Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().
Link: https://lkml.kernel.org/r/20230523004312.1807357-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c51…
Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reported-by: Qun-wei Lin <Qun-wei.Lin(a)mediatek.com>
Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@…
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Steven Price <steven.price(a)arm.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: <stable(a)vger.kernel.org> [6.1+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/mm/memory.c~mm-call-arch_swap_restore-from-do_swap_page
+++ a/mm/memory.c
@@ -3954,6 +3954,13 @@ vm_fault_t do_swap_page(struct vm_fault
}
/*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, folio);
+
+ /*
* Remove the swap entry and conditionally try to free up the swapcache.
* We're already holding a reference on the page but haven't mapped it
* yet.
_
Patches currently in -mm which might be from pcc(a)google.com are
mm-call-arch_swap_restore-from-do_swap_page.patch
Hi,
A regression [1] was reported on suspend for AMD Stoney on 6.3.10.
It bisected down to:
1ca399f127e0 ("drm/amd/display: Add wrapper to call planes and stream
update")
This was requested by me to fix a PSR problem [2], which isn't used on
Stoney.
This was also backported into 6.1.36 (as e538342002cb) and 5.15.119 (as
3c1aa91b37f9).
It's fixed on 6.3.y by cherry-picking:
32953485c558 ("drm/amd/display: Do not update DRR while BW optimizations
pending")
It's fixed in 6.1.y by cherry-picking:
3442f4e0e555 ("drm/amd/display: Remove optimization for VRR updates")
32953485c558 ("drm/amd/display: Do not update DRR while BW optimizations
pending")
On 5.15.y it's not a reasonable backport to take the fix to stable
because there is a lot of missing Freesync code. Instead it's better to
revert the patch series that introduced it to 5.15.y because PSR-SU
isn't even introduced until later kernels anyway.
5a24be76af79 ("drm/amd/display: fix the system hang while disable PSR")
3c1aa91b37f9 ("drm/amd/display: Add wrapper to call planes and stream
update")
eea850c025b5 ("drm/amd/display: Use dc_update_planes_and_stream")
97ca308925a5 ("drm/amd/display: Add minimal pipe split transition state")
[1] https://gitlab.freedesktop.org/drm/amd/-/issues/2670
[2]
https://lore.kernel.org/stable/e2ae2999-2e39-31ad-198a-26ab3ae53ae7@amd.com/
Thanks,
Hallow und wie geht es dir heute?
Ich möchte, dass Ihre Partnerschaft Sie als Subunternehmer
präsentiert, damit Sie in meinem Namen 8,6 Millionen US-Dollar aus
Überrechnungsverträgen erhalten können, die wir zu 65 % und 35 %
aufteilen können.
Diese Transaktion ist 100 % risikofrei; Du brauchst keine Angst zu haben.
Bitte senden Sie mir eine E-Mail an (osbornemichel438(a)gmail.com), um
ausführliche Informationen zu erhalten und bei Interesse zu erfahren,
wie wir dies gemeinsam bewältigen können.
Sie müssen es mir also weiterleiten
Ihr vollständiger Name.........................
Telefon.............
Geburtsdatum .........................
Staatsangehörigkeit .................................
Mit freundlichen Grüße,
Osborne Michel.
Hallow and how are you today?
I seek for your partnership to present you as a sub-contractor so
that you can receive 8.6M Over-Invoice contract fund on my behalf and
we can split it 65% 35%.
This transaction is 100% risk -free; you need not to be afraid.
Please email me at ( osbornemichel438(a)gmail.com ) for comprehensive
details and how we can handle this together if interested.
So I need you to forward it to me
your full name.........................
Telephone.............
Date of Birth .........................
Nationality .................................
Kind Regards,
Osborne Michel.
From: Stewart Smith <trawets(a)amazon.com>
For both IPv4 and IPv6 incoming TCP connections are tracked in a hash
table with a hash over the source & destination addresses and ports.
However, the IPv6 hash is insufficient and can lead to a high rate of
collisions.
The IPv6 hash used an XOR to fit everything into the 96 bits for the
fast jenkins hash, meaning it is possible for an external entity to
ensure the hash collides, thus falling back to a linear search in the
bucket, which is slow.
We take the approach of hash half the data; hash the other half; and
then hash them together. We do this with 3x jenkins hashes rather than
2x to calculate the hashing value for the connection covering the full
length of the addresses and ports.
While this may look like it adds overhead, the reality of modern CPUs
means that this is unmeasurable in real world scenarios.
In simulating with llvm-mca, the increase in cycles for the hashing code
was ~5 cycles on Skylake (from a base of ~50), and an extra ~9 on
Nehalem (base of ~62).
In commit dd6d2910c5e0 ("netfilter: conntrack: switch to siphash")
netfilter switched from a jenkins hash to a siphash, but even the faster
hsiphash is a more significant overhead (~20-30%) in some preliminary
testing. So, in this patch, we keep to the more conservative approach to
ensure we don't add much overhead per SYN.
In testing, this results in a consistently even spread across the
connection buckets. In both testing and real-world scenarios, we have
not found any measurable performance impact.
Cc: stable(a)vger.kernel.org
Fixes: 08dcdbf6a7b9 ("ipv6: use a stronger hash for tcp")
Fixes: b3da2cf37c5c ("[INET]: Use jhash + random secret for ehash.")
Signed-off-by: Stewart Smith <trawets(a)amazon.com>
Signed-off-by: Samuel Mendoza-Jonas <samjonas(a)amazon.com>
---
include/net/ipv6.h | 4 +---
net/ipv6/inet6_hashtables.c | 5 ++++-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7332296eca44..f9bb54869d82 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -752,9 +752,7 @@ static inline u32 ipv6_addr_hash(const struct in6_addr *a)
/* more secured version of ipv6_addr_hash() */
static inline u32 __ipv6_addr_jhash(const struct in6_addr *a, const u32 initval)
{
- u32 v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1];
-
- return jhash_3words(v,
+ return jhash_3words((__force u32)a->s6_addr32[1],
(__force u32)a->s6_addr32[2],
(__force u32)a->s6_addr32[3],
initval);
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index b64b49012655..bb7198081974 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -33,7 +33,10 @@ u32 inet6_ehashfn(const struct net *net,
net_get_random_once(&inet6_ehash_secret, sizeof(inet6_ehash_secret));
net_get_random_once(&ipv6_hash_secret, sizeof(ipv6_hash_secret));
- lhash = (__force u32)laddr->s6_addr32[3];
+ lhash = jhash_3words((__force u32)laddr->s6_addr32[3],
+ (((u32)lport) << 16) | (__force u32)fport,
+ (__force u32)faddr->s6_addr32[0],
+ ipv6_hash_secret);
fhash = __ipv6_addr_jhash(faddr, ipv6_hash_secret);
return __inet6_ehashfn(lhash, lport, fhash, fport,
--
2.25.1