In v5.16 btrfs had a big rework (originally considered as a refactor
only) on defrag, which brings quite some regressions.
With those regressions, extra scrutiny is brought to defrag code, and
some existing bugs are exposed.
For v5.15, there are those patches need to be backported:
(Tracked through github issue:
https://github.com/btrfs/linux/issues/422)
- Already upstreamed:
"btrfs: don't hold CPU for too long when defragging a file"
The first patch.
- In maintainer's tree
btrfs: defrag: don't try to merge regular extents with preallocated extents
btrfs: defrag: don't defrag extents which are already at max capacity
btrfs: defrag: remove an ambiguous condition for rejection
Those will be backported when they get merged upstream.
- One special fix for v5.15
The 2nd patch.
The problem is, that patch has no upstream commit, since v5.16
reworked the defrag code and fix the problem unintentionally.
It's not a good idea to bring the whole rework (along with its bugs)
to stable kernel.
So here I crafted a small fix for v5.15 only.
Not sure if this is conflicted with any stable policy.
Qu Wenruo (2):
btrfs: don't hold CPU for too long when defragging a file
btrfs: defrag: use the same cluster size for defrag ioctl and
autodefrag
fs/btrfs/ioctl.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
--
2.35.1
The patch titled
Subject: userfaultfd: mark uffd_wp regardless of VM_WRITE flag
has been added to the -mm tree. Its filename is
userfaultfd-mark-uffd_wp-regardless-of-vm_write-flag.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/userfaultfd-mark-uffd_wp-regardle…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/userfaultfd-mark-uffd_wp-regardle…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Nadav Amit <namit(a)vmware.com>
Subject: userfaultfd: mark uffd_wp regardless of VM_WRITE flag
When a PTE is set by UFFD operations such as UFFDIO_COPY, the PTE is
currently only marked as write-protected if the VMA has VM_WRITE flag set.
This seems incorrect or at least would be unexpected by the users.
Consider the following sequence of operations that are being performed on
a certain page:
mprotect(PROT_READ)
UFFDIO_COPY(UFFDIO_COPY_MODE_WP)
mprotect(PROT_READ|PROT_WRITE)
At this point the user would expect to still get UFFD notification when
the page is accessed for write, but the user would not get one, since the
PTE was not marked as UFFD_WP during UFFDIO_COPY.
Fix it by always marking PTEs as UFFD_WP regardless on the
write-permission in the VMA flags.
Link: https://lkml.kernel.org/r/20220217211602.2769-1-namit@vmware.com
Fixes: 292924b26024 ("userfaultfd: wp: apply _PAGE_UFFD_WP bit")
Signed-off-by: Nadav Amit <namit(a)vmware.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Mike Rapoport <rppt(a)linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/userfaultfd.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
--- a/mm/userfaultfd.c~userfaultfd-mark-uffd_wp-regardless-of-vm_write-flag
+++ a/mm/userfaultfd.c
@@ -72,12 +72,15 @@ int mfill_atomic_install_pte(struct mm_s
_dst_pte = pte_mkdirty(_dst_pte);
if (page_in_cache && !vm_shared)
writable = false;
- if (writable) {
- if (wp_copy)
- _dst_pte = pte_mkuffd_wp(_dst_pte);
- else
- _dst_pte = pte_mkwrite(_dst_pte);
- }
+
+ /*
+ * Always mark a PTE as write-protected when needed, regardless of
+ * VM_WRITE, which the user might change.
+ */
+ if (wp_copy)
+ _dst_pte = pte_mkuffd_wp(_dst_pte);
+ else if (writable)
+ _dst_pte = pte_mkwrite(_dst_pte);
dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
_
Patches currently in -mm which might be from namit(a)vmware.com are
userfaultfd-mark-uffd_wp-regardless-of-vm_write-flag.patch
Hi Greg,
Please consider cherry-pick this patch series into 5.16.x stable. It
includes a fix to a bug in 5.16 stable which allows a user with cap_bpf
privileges to get root privileges. The patch that fixes the bug is
patch 7/9: bpf: Make per_cpu_ptr return rdonly
The rest are the depedences required by the fix patch. This patchset has
been merged in mainline v5.17. The patches were not planned to backport
because of its complex dependences.
Tested by compile, build and run through a subset of bpf test_progs.
Hao Luo (9):
bpf: Introduce composable reg, ret and arg types.
bpf: Replace ARG_XXX_OR_NULL with ARG_XXX | PTR_MAYBE_NULL
bpf: Replace RET_XXX_OR_NULL with RET_XXX | PTR_MAYBE_NULL
bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL
bpf: Introduce MEM_RDONLY flag
bpf: Convert PTR_TO_MEM_OR_NULL to composable types.
bpf: Make per_cpu_ptr return rdonly PTR_TO_MEM.
bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.
bpf/selftests: Test PTR_TO_RDONLY_MEM
include/linux/bpf.h | 101 +++-
include/linux/bpf_verifier.h | 17 +
kernel/bpf/btf.c | 12 +-
kernel/bpf/cgroup.c | 2 +-
kernel/bpf/helpers.c | 12 +-
kernel/bpf/map_iter.c | 4 +-
kernel/bpf/ringbuf.c | 2 +-
kernel/bpf/syscall.c | 2 +-
kernel/bpf/verifier.c | 488 +++++++++---------
kernel/trace/bpf_trace.c | 26 +-
net/core/bpf_sk_storage.c | 2 +-
net/core/filter.c | 64 +--
net/core/sock_map.c | 2 +-
.../selftests/bpf/prog_tests/ksyms_btf.c | 14 +
.../bpf/progs/test_ksyms_btf_write_check.c | 29 ++
15 files changed, 444 insertions(+), 333 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_btf_write_check.c
--
2.35.1.265.g69c8d7142f-goog
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 23e7b1bfed61e301853b5e35472820d919498278 Mon Sep 17 00:00:00 2001
From: Guillaume Nault <gnault(a)redhat.com>
Date: Mon, 10 Jan 2022 14:43:06 +0100
Subject: [PATCH] xfrm: Don't accidentally set RTO_ONLINK in decode_session4()
Similar to commit 94e2238969e8 ("xfrm4: strip ECN bits from tos field"),
clear the ECN bits from iph->tos when setting ->flowi4_tos.
This ensures that the last bit of ->flowi4_tos is cleared, so
ip_route_output_key_hash() isn't going to restrict the scope of the
route lookup.
Use ~INET_ECN_MASK instead of IPTOS_RT_MASK, because we have no reason
to clear the high order bits.
Found by code inspection, compile tested only.
Fixes: 4da3089f2b58 ("[IPSEC]: Use TOS when doing tunnel lookups")
Signed-off-by: Guillaume Nault <gnault(a)redhat.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index dccb8f3318ef..04d1ce9b510f 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -31,6 +31,7 @@
#include <linux/if_tunnel.h>
#include <net/dst.h>
#include <net/flow.h>
+#include <net/inet_ecn.h>
#include <net/xfrm.h>
#include <net/ip.h>
#include <net/gre.h>
@@ -3295,7 +3296,7 @@ decode_session4(struct sk_buff *skb, struct flowi *fl, bool reverse)
fl4->flowi4_proto = iph->protocol;
fl4->daddr = reverse ? iph->saddr : iph->daddr;
fl4->saddr = reverse ? iph->daddr : iph->saddr;
- fl4->flowi4_tos = iph->tos;
+ fl4->flowi4_tos = iph->tos & ~INET_ECN_MASK;
if (!ip_is_fragment(iph)) {
switch (iph->protocol) {