I'm announcing the release of the 6.4.2 kernel.
All users of the 6.4 kernel series must upgrade.
The updated 6.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/process/changes.rst | 7 ++++
Makefile | 2 -
arch/arm64/mm/fault.c | 2 -
drivers/cxl/core/pci.c | 27 ++--------------
drivers/cxl/cxl.h | 1
drivers/cxl/port.c | 14 +++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++
drivers/md/dm-ioctl.c | 33 +++++++++++++-------
drivers/nubus/proc.c | 22 ++++++++++---
drivers/pci/pci-acpi.c | 53 ++++++++++++++++++++++++---------
fs/hugetlbfs/inode.c | 8 +---
fs/nfs/inode.c | 2 -
include/linux/mm.h | 4 +-
mm/hugetlb.c | 12 +++----
mm/nommu.c | 7 +++-
scripts/tags.sh | 9 ++++-
tools/include/nolibc/arch-x86_64.h | 2 -
tools/testing/cxl/Kbuild | 1
tools/testing/cxl/test/mock.c | 15 ---------
19 files changed, 127 insertions(+), 98 deletions(-)
Ahmed S. Darwish (2):
scripts/tags.sh: Resolve gtags empty index generation
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Bas Nieuwenhuizen (1):
drm/amdgpu: Validate VM ioctl flags.
Bjorn Helgaas (1):
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Dan Williams (1):
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Demi Marie Obenour (1):
dm ioctl: Avoid double-fetch of version
Finn Thain (1):
nubus: Partially revert proc_create_single_data() conversion
Greg Kroah-Hartman (1):
Linux 6.4.2
Jeff Layton (1):
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds (1):
execve: always mark stack as growing down during early stack setup
Mario Limonciello (1):
PCI/ACPI: Call _REG when transitioning D-states
Max Filippov (1):
xtensa: fix lock_mm_and_find_vma in case VMA not found
Mike Kravetz (1):
hugetlb: revert use of page_cache_next_miss()
SeongJae Park (1):
arch/arm64/mm/fault: Fix undeclared variable error in do_page_fault()
Thomas Weißschuh (1):
tools/nolibc: x86_64: disable stack protector for _start
I'm announcing the release of the 6.3.12 kernel.
All users of the 6.3 kernel series must upgrade.
The updated 6.3.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.3.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/process/changes.rst | 7 ++++
Makefile | 2 -
drivers/cxl/core/pci.c | 27 ++-------------
drivers/cxl/cxl.h | 1
drivers/cxl/port.c | 14 ++------
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++
drivers/gpu/drm/amd/display/dc/core/dc.c | 49 +++++++++++++++++-----------
drivers/md/dm-ioctl.c | 33 ++++++++++++-------
drivers/nubus/proc.c | 22 +++++++++---
drivers/pci/pci-acpi.c | 53 +++++++++++++++++++++++--------
fs/nfs/inode.c | 2 -
include/linux/mm.h | 4 +-
mm/nommu.c | 7 +++-
scripts/tags.sh | 9 ++++-
tools/testing/cxl/Kbuild | 1
tools/testing/cxl/test/mock.c | 15 --------
16 files changed, 147 insertions(+), 103 deletions(-)
Ahmed S. Darwish (2):
scripts/tags.sh: Resolve gtags empty index generation
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Aric Cyr (1):
drm/amd/display: Do not update DRR while BW optimizations pending
Bas Nieuwenhuizen (1):
drm/amdgpu: Validate VM ioctl flags.
Bjorn Helgaas (1):
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Dan Williams (1):
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Demi Marie Obenour (1):
dm ioctl: Avoid double-fetch of version
Finn Thain (1):
nubus: Partially revert proc_create_single_data() conversion
Greg Kroah-Hartman (1):
Linux 6.3.12
Jeff Layton (1):
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds (1):
execve: always mark stack as growing down during early stack setup
Mario Limonciello (1):
PCI/ACPI: Call _REG when transitioning D-states
Max Filippov (1):
xtensa: fix lock_mm_and_find_vma in case VMA not found
Rodrigo Siqueira (1):
drm/amd/display: Ensure vmin and vmax adjust for DCE
I'm announcing the release of the 6.1.38 kernel.
All users of the 6.1 kernel series must upgrade.
The updated 6.1.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.1.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/process/changes.rst | 7 ++++
Makefile | 2 -
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++
drivers/gpu/drm/amd/display/dc/core/dc.c | 50 ++++++++++++++++-------------
drivers/nubus/proc.c | 22 +++++++++---
drivers/pci/pci-acpi.c | 53 +++++++++++++++++++++++--------
include/linux/mm.h | 4 +-
mm/nommu.c | 7 +++-
scripts/tags.sh | 9 ++++-
tools/perf/util/symbol.c | 17 ++++++++-
10 files changed, 130 insertions(+), 45 deletions(-)
Ahmed S. Darwish (2):
scripts/tags.sh: Resolve gtags empty index generation
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Alvin Lee (1):
drm/amd/display: Remove optimization for VRR updates
Aric Cyr (1):
drm/amd/display: Do not update DRR while BW optimizations pending
Bas Nieuwenhuizen (1):
drm/amdgpu: Validate VM ioctl flags.
Bjorn Helgaas (1):
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Finn Thain (1):
nubus: Partially revert proc_create_single_data() conversion
Greg Kroah-Hartman (1):
Linux 6.1.38
Krister Johansen (1):
perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
Linus Torvalds (1):
execve: always mark stack as growing down during early stack setup
Mario Limonciello (1):
PCI/ACPI: Call _REG when transitioning D-states
Max Filippov (1):
xtensa: fix lock_mm_and_find_vma in case VMA not found
Rodrigo Siqueira (1):
drm/amd/display: Ensure vmin and vmax adjust for DCE
I'm announcing the release of the 5.15.120 kernel.
All users of the 5.15 kernel series must upgrade.
The updated 5.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.15.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
arch/parisc/include/asm/assembly.h | 4 --
arch/x86/kernel/cpu/microcode/amd.c | 2 -
arch/x86/kernel/smpboot.c | 24 ++++++++-------
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++
drivers/hid/hid-logitech-hidpp.c | 2 -
drivers/hid/wacom_wac.c | 6 +--
drivers/hid/wacom_wac.h | 2 -
drivers/nubus/proc.c | 22 ++++++++++---
drivers/thermal/mtk_thermal.c | 14 +-------
include/linux/highmem.h | 24 +++++++++++++++
include/linux/mm.h | 5 ++-
kernel/bpf/verifier.c | 7 +++-
mm/memory.c | 33 ++++++++++++++------
net/can/isotp.c | 5 +--
net/mptcp/protocol.c | 46 +++++++++++++----------------
net/mptcp/subflow.c | 17 ++++++----
scripts/tags.sh | 9 +++++
tools/perf/util/symbol.c | 17 +++++++++-
20 files changed, 158 insertions(+), 88 deletions(-)
Ahmed S. Darwish (1):
scripts/tags.sh: Resolve gtags empty index generation
Bas Nieuwenhuizen (1):
drm/amdgpu: Validate VM ioctl flags.
Ben Hutchings (1):
parisc: Delete redundant register definitions in <asm/assembly.h>
Borislav Petkov (AMD) (1):
x86/microcode/AMD: Load late on both threads too
Finn Thain (1):
nubus: Partially revert proc_create_single_data() conversion
Greg Kroah-Hartman (1):
Linux 5.15.120
Jane Chu (1):
mm, hwpoison: when copy-on-write hits poison, take page offline
Jason Gerecke (1):
HID: wacom: Use ktime_t rather than int when dealing with timestamps
Krister Johansen (2):
bpf: ensure main program has an extable
perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
Mike Hommey (1):
HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.
Oliver Hartkopp (1):
can: isotp: isotp_sendmsg(): fix return error fix on TX path
Paolo Abeni (2):
mptcp: fix possible divide by zero in recvmsg()
mptcp: consolidate fallback and non fallback state machine
Philip Yang (1):
drm/amdgpu: Set vmbo destroy after pt bo is created
Ricardo Cañuelo (1):
Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe"
Thomas Gleixner (1):
x86/smp: Use dedicated cache-line for mwait_play_dead()
Tony Luck (1):
mm, hwpoison: try to recover from copy-on write faults
This is the start of the stable review cycle for the 6.3.12 release.
There are 13 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.3.12-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.3.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.3.12-rc1
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Demi Marie Obenour <demi(a)invisiblethingslab.com>
dm ioctl: Avoid double-fetch of version
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Mike Kravetz <mike.kravetz(a)oracle.com>
hugetlb: revert use of page_cache_next_miss()
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Dan Williams <dan.j.williams(a)intel.com>
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Jeff Layton <jlayton(a)kernel.org>
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Aric Cyr <aric.cyr(a)amd.com>
drm/amd/display: Do not update DRR while BW optimizations pending
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
drivers/cxl/core/pci.c | 27 +++-------------
drivers/cxl/cxl.h | 1 -
drivers/cxl/port.c | 14 +++------
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/gpu/drm/amd/display/dc/core/dc.c | 48 +++++++++++++++++------------
drivers/md/dm-ioctl.c | 33 ++++++++++++--------
drivers/nubus/proc.c | 22 ++++++++++---
drivers/pci/pci-acpi.c | 53 ++++++++++++++++++++++++--------
fs/hugetlbfs/inode.c | 8 ++---
fs/nfs/inode.c | 2 +-
include/linux/mm.h | 4 ++-
mm/hugetlb.c | 12 ++++----
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/testing/cxl/Kbuild | 1 -
tools/testing/cxl/test/mock.c | 15 ---------
18 files changed, 156 insertions(+), 115 deletions(-)
The patch titled
Subject: fork: lock VMAs of the parent process when forking
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fork-lock-vmas-of-the-parent-process-when-forking-v3.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: fork: lock VMAs of the parent process when forking
Date: Wed, 5 Jul 2023 10:12:11 -0700
Patch series "Avoid memory corruption caused by per-VMA locks", v3.
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking while
forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork. I
tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem. This fix can potentially regress some
fork-heavy workloads. Kernel build time did not show noticeable
regression on a 56-core machine while a stress test mapping 10000 VMAs and
forking 5000 times in a tight loop shows ~5% regression. If such fork
time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should
restore its performance. Further optimizations are possible if this
regression proves to be problematic.
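The stress test itself is not included in this posting. A minimal userspace sketch of that kind of measurement, assuming single-page anonymous mappings and children that exit immediately, could look like the following. The helper names map_many_vmas() and time_forks() are hypothetical and are not taken from the author's test.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: map nr_vmas single-page anonymous regions.
 * Neighbouring mappings alternate protections so that adjacent regions
 * are unlikely to be merged into one VMA. The mappings are deliberately
 * left in place. Returns the number of regions actually mapped.
 */
size_t map_many_vmas(size_t nr_vmas)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t i;

	assert(page > 0);
	for (i = 0; i < nr_vmas; i++) {
		int prot = (i & 1) ? PROT_READ : PROT_READ | PROT_WRITE;
		void *p = mmap(NULL, (size_t)page, prot,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			break;
		if (prot & PROT_WRITE)
			*(volatile char *)p = 1;	/* fault the page in */
	}
	return i;
}

/*
 * Hypothetical helper: fork nr_forks times in a tight loop. Each child
 * exits at once and is reaped before the next fork. Returns the elapsed
 * wall-clock time in seconds, or -1.0 on error.
 */
double time_forks(unsigned int nr_forks)
{
	struct timespec start, end;
	unsigned int i;

	if (clock_gettime(CLOCK_MONOTONIC, &start))
		return -1.0;
	for (i = 0; i < nr_forks; i++) {
		pid_t pid = fork();
		int status;

		if (pid < 0)
			return -1.0;
		if (pid == 0)
			_exit(0);
		if (waitpid(pid, &status, 0) != pid)
			return -1.0;
	}
	if (clock_gettime(CLOCK_MONOTONIC, &end))
		return -1.0;
	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling map_many_vmas(10000) followed by time_forks(5000) on kernels built with and without CONFIG_PER_VMA_LOCK would give two timings to compare. The numbers depend on the machine and on vm.max_map_count, so this is only a rough way to approximate the scenario described above.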
This patch (of 2):
When forking a child process, the parent write-protects an anonymous page
and COW-shares it with the child being forked using copy_present_pte().
The parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because it is COW-shared with the child), this might lead to
stale writable TLB entries targeting the wrong (old) page. A similar issue
happened in the past with userfaultfd (see the flush_tlb_page() call inside
do_wp_page()).
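The COW sharing that this paragraph relies on can be observed from userspace. The sketch below is only an illustration of the expected semantics, not the reproducer from the bug report: the parent writes to a private anonymous page after fork, and the child must still see the old value. The function name child_view_after_parent_write() is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the byte the child reads from a private
 * anonymous page after the parent has overwritten its own copy, or -1
 * on error. With working COW the child still sees the pre-fork value.
 */
int child_view_after_parent_write(void)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *buf;
	int sync_pipe[2], status;
	char token = 0;
	ssize_t n;
	pid_t pid;

	buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED || pipe(sync_pipe))
		return -1;
	buf[0] = 1;		/* fault the page in before forking */

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: wait until the parent has written, then report. */
		close(sync_pipe[1]);
		if (read(sync_pipe[0], &token, 1) != 1)
			_exit(255);
		_exit(buf[0]);
	}

	close(sync_pipe[0]);
	buf[0] = 2;		/* write fault: the parent's page is replaced */
	n = write(sync_pipe[1], &token, 1);
	close(sync_pipe[1]);
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) || n != 1)
		return -1;
	assert(buf[0] == 2);	/* the parent sees its own write */
	munmap((void *)buf, (size_t)page);
	return WEXITSTATUS(status);
}
```

On a correctly behaving kernel the function returns 1. The corruption discussed here arises when a second thread in the parent takes that write fault concurrently with fork, which this single-threaded sketch does not attempt to provoke.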
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue. This
fix can potentially regress some fork-heavy workloads. Kernel build time
did not show noticeable regression on a 56-core machine while a stress
test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5%
regression. If such fork time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations
are possible if this regression proves to be problematic.
Link: https://lkml.kernel.org/r/20230705171213.2843068-2-surenb@google.com
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chriscli(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Greg Thelen <gthelen(a)google.com>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Laurent Dufour <ldufour(a)linux.ibm.com>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Michel Lespinasse <michel(a)lespinasse.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Minchan Kim <minchan(a)google.com>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <peterz(a)infradead.org>
Cc: Punit Agrawal <punit.agrawal(a)bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/kernel/fork.c~fork-lock-vmas-of-the-parent-process-when-forking-v3
+++ a/kernel/fork.c
@@ -658,6 +658,12 @@ static __latent_entropy int dup_mmap(str
 		retval = -EINTR;
 		goto fail_uprobe_end;
 	}
+#ifdef CONFIG_PER_VMA_LOCK
+	/* Disallow any page faults before calling flush_cache_dup_mm */
+	for_each_vma(old_vmi, mpnt)
+		vma_start_write(mpnt);
+	vma_iter_init(&old_vmi, oldmm, 0);
+#endif
 	flush_cache_dup_mm(oldmm);
 	uprobe_dup_mmap(oldmm, mm);
 	/*
_
Patches currently in -mm which might be from surenb(a)google.com are
fork-lock-vmas-of-the-parent-process-when-forking-v3.patch
mm-disable-config_per_vma_lock-until-its-fixed.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
The quilt patch titled
Subject: fork: lock VMAs of the parent process when forking
has been removed from the -mm tree. Its filename was
fork-lock-vmas-of-the-parent-process-when-forking.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: fork: lock VMAs of the parent process when forking
Date: Tue, 4 Jul 2023 23:37:10 -0700
Patch series "Avoid memory corruption caused by per-VMA locks", v2.
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking while
forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork. I
tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disabled per-VMA locks until the fix is tested and verified.
This patch (of 2):
When forking a child process, the parent write-protects an anonymous page
and COW-shares it with the child being forked using copy_present_pte().
The parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because it is COW-shared with the child), this might lead to
stale writable TLB entries targeting the wrong (old) page. A similar issue
happened in the past with userfaultfd (see the flush_tlb_page() call inside
do_wp_page()).
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue. This
fix can potentially regress some fork-heavy workloads. Kernel build time
did not show noticeable regression on a 56-core machine while a stress
test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5%
regression. If such fork time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations
are possible if this regression proves to be problematic.
Link: https://lkml.kernel.org/r/20230705063711.2670599-1-surenb@google.com
Link: https://lkml.kernel.org/r/20230705063711.2670599-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte <holger(a)applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-as…
Reported-by: Jacob Young <jacobly.alt(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Bagas Sanjaya <bagasdotme(a)gmail.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Laurent Dufour <ldufour(a)linux.ibm.com>
Cc: <regressions(a)lists.linux.dev>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chriscli(a)google.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Greg Thelen <gthelen(a)google.com>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Michel Lespinasse <michel(a)lespinasse.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Minchan Kim <minchan(a)google.com>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <peterz(a)infradead.org>
Cc: Punit Agrawal <punit.agrawal(a)bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/fork.c~fork-lock-vmas-of-the-parent-process-when-forking
+++ a/kernel/fork.c
@@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(str
 	for_each_vma(old_vmi, mpnt) {
 		struct file *file;
 
+		vma_start_write(mpnt);
 		if (mpnt->vm_flags & VM_DONTCOPY) {
 			vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
 			continue;
_
Patches currently in -mm which might be from surenb(a)google.com are
fork-lock-vmas-of-the-parent-process-when-forking-v3.patch
mm-disable-config_per_vma_lock-until-its-fixed.patch
swap-remove-remnants-of-polling-from-read_swap_cache_async.patch
mm-add-missing-vm_fault_result_trace-name-for-vm_fault_completed.patch
mm-drop-per-vma-lock-when-returning-vm_fault_retry-or-vm_fault_completed.patch
mm-change-folio_lock_or_retry-to-use-vm_fault-directly.patch
mm-handle-swap-page-faults-under-per-vma-lock.patch
mm-handle-userfaults-under-vma-lock.patch
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking
while forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork.
I tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Patch 2/2 disabled per-VMA locks until the fix is tested and verified.
Both patches apply cleanly over Linus' ToT and stable 6.4.y branch.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
Suren Baghdasaryan (2):
fork: lock VMAs of the parent process when forking
mm: disable CONFIG_PER_VMA_LOCK until its fixed
kernel/fork.c | 1 +
mm/Kconfig | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)
--
2.41.0.255.g8b1d071c50-goog