From: Mel Gorman <mgorman(a)suse.de>
Subject: mm: vmscan: remove deadlock due to throttling failing to make progress
A soft lockup bug in kcompactd was reported in a private bugzilla with
the following visible in dmesg;
[15980.045209][ C33] watchdog: BUG: soft lockup - CPU#33 stuck for 26s! [kcompactd0:479]
[16008.044989][ C33] watchdog: BUG: soft lockup - CPU#33 stuck for 52s! [kcompactd0:479]
[16036.044768][ C33] watchdog: BUG: soft lockup - CPU#33 stuck for 78s! [kcompactd0:479]
[16064.044548][ C33] watchdog: BUG: soft lockup - CPU#33 stuck for 104s! [kcompactd0:479]
The machine had 256G of RAM with no swap and an earlier failed allocation
indicated that node 0 where kcompactd was run was potentially
unreclaimable;
Node 0 active_anon:29355112kB inactive_anon:2913528kB active_file:0kB
inactive_file:0kB unevictable:64kB isolated(anon):0kB isolated(file):0kB
mapped:8kB dirty:0kB writeback:0kB shmem:26780kB shmem_thp:
0kB shmem_pmdmapped: 0kB anon_thp: 23480320kB writeback_tmp:0kB
kernel_stack:2272kB pagetables:24500kB all_unreclaimable? yes
Vlastimil Babka investigated a crash dump and found that a task migrating pages
was trying to drain PCP lists;
PID: 52922 TASK: ffff969f820e5000 CPU: 19 COMMAND: "kworker/u128:3"
#0 [ffffaf4e4f4c3848] __schedule at ffffffffb840116d
#1 [ffffaf4e4f4c3908] schedule at ffffffffb8401e81
#2 [ffffaf4e4f4c3918] schedule_timeout at ffffffffb84066e8
#3 [ffffaf4e4f4c3990] wait_for_completion at ffffffffb8403072
#4 [ffffaf4e4f4c39d0] __flush_work at ffffffffb7ac3e4d
#5 [ffffaf4e4f4c3a48] __drain_all_pages at ffffffffb7cb707c
#6 [ffffaf4e4f4c3a80] __alloc_pages_slowpath.constprop.114 at ffffffffb7cbd9dd
#7 [ffffaf4e4f4c3b60] __alloc_pages at ffffffffb7cbe4f5
#8 [ffffaf4e4f4c3bc0] alloc_migration_target at ffffffffb7cf329c
#9 [ffffaf4e4f4c3bf0] migrate_pages at ffffffffb7cf6d15
10 [ffffaf4e4f4c3cb0] migrate_to_node at ffffffffb7cdb5aa
11 [ffffaf4e4f4c3da8] do_migrate_pages at ffffffffb7cdcf26
12 [ffffaf4e4f4c3e88] cpuset_migrate_mm_workfn at ffffffffb7b859d2
13 [ffffaf4e4f4c3e98] process_one_work at ffffffffb7ac45f3
14 [ffffaf4e4f4c3ed8] worker_thread at ffffffffb7ac47fd
15 [ffffaf4e4f4c3f10] kthread at ffffffffb7acbdc6
16 [ffffaf4e4f4c3f50] ret_from_fork at ffffffffb7a047e2
This failure is specific to CONFIG_PREEMPT=n builds. The root of the
problem is that kcompact0 is not rescheduling on a CPU while a task that
has isolated a large number of the pages from the LRU is waiting on
kcompact0 to reschedule so the pages can be released. While
shrink_inactive_list() only loops once around too_many_isolated, reclaim
can continue without rescheduling if sc->skipped_deactivate == 1 which
could happen if there was no file LRU and the inactive anon list was not
low.
Link: https://lkml.kernel.org/r/20220203100326.GD3301@suse.de
Fixes: d818fca1cac3 ("mm/vmscan: throttle reclaim and compaction when too may pages are isolated")
Signed-off-by: Mel Gorman <mgorman(a)suse.de>
Debugged-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/vmscan.c~mm-vmscan-remove-deadlock-due-to-throttling-failing-to-make-progress
+++ a/mm/vmscan.c
@@ -1066,8 +1066,10 @@ void reclaim_throttle(pg_data_t *pgdat,
* forward progress (e.g. journalling workqueues or kthreads).
*/
if (!current_is_kswapd() &&
- current->flags & (PF_IO_WORKER|PF_KTHREAD))
+ current->flags & (PF_IO_WORKER|PF_KTHREAD)) {
+ cond_resched();
return;
+ }
/*
* These figures are pulled out of thin air.
_
The patch titled
Subject: fs-proc-task_mmuc-dont-read-mapcount-for-migration-entry-v4
has been removed from the -mm tree. Its filename was
fs-proc-task_mmuc-dont-read-mapcount-for-migration-entry-v4.patch
This patch was dropped because it was folded into fs-proc-task_mmuc-dont-read-mapcount-for-migration-entry.patch
------------------------------------------------------
From: Yang Shi <shy828301(a)gmail.com>
Subject: fs-proc-task_mmuc-dont-read-mapcount-for-migration-entry-v4
v4: * s/Treated/Treat per David
* Collected acked-by tag from David
v3: * Fixed the fix tag, the one used by v2 was not accurate
* Added comment about the risk calling page_mapcount() per David
* Fix pagemap
Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Yang Shi <shy828301(a)gmail.com>
Reported-by: syzbot+1f52b3a18d5633fa7f82(a)syzkaller.appspotmail.com
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/fs/proc/task_mmu.c~fs-proc-task_mmuc-dont-read-mapcount-for-migration-entry-v4
+++ a/fs/proc/task_mmu.c
@@ -469,9 +469,12 @@ static void smaps_account(struct mem_siz
* If any subpage of the compound page mapped with PTE it would elevate
* page_count().
*
- * Treated regular migration entries as mapcount == 1 without reading
- * mapcount since calling page_mapcount() for migration entries is
- * racy against THP splitting.
+ * The page_mapcount() is called to get a snapshot of the mapcount.
+ * Without holding the page lock this snapshot can be slightly wrong as
+ * we cannot always read the mapcount atomically. It is not safe to
+ * call page_mapcount() even with PTL held if the page is not mapped,
+ * especially for migration entries. Treat regular migration entries
+ * as mapcount == 1.
*/
if ((page_count(page) == 1) || migration) {
smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty,
@@ -1393,6 +1396,7 @@ static pagemap_entry_t pte_to_pagemap_en
{
u64 frame = 0, flags = 0;
struct page *page = NULL;
+ bool migration = false;
if (pte_present(pte)) {
if (pm->show_pfn)
@@ -1414,13 +1418,14 @@ static pagemap_entry_t pte_to_pagemap_en
frame = swp_type(entry) |
(swp_offset(entry) << MAX_SWAPFILES_SHIFT);
flags |= PM_SWAP;
+ migration = is_migration_entry(entry);
if (is_pfn_swap_entry(entry))
page = pfn_swap_entry_to_page(entry);
}
if (page && !PageAnon(page))
flags |= PM_FILE;
- if (page && page_mapcount(page) == 1)
+ if (page && !migration && page_mapcount(page) == 1)
flags |= PM_MMAP_EXCLUSIVE;
if (vma->vm_flags & VM_SOFTDIRTY)
flags |= PM_SOFT_DIRTY;
@@ -1436,6 +1441,7 @@ static int pagemap_pmd_range(pmd_t *pmdp
spinlock_t *ptl;
pte_t *pte, *orig_pte;
int err = 0;
+ bool migration = false;
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
ptl = pmd_trans_huge_lock(pmdp, vma);
@@ -1476,11 +1482,12 @@ static int pagemap_pmd_range(pmd_t *pmdp
if (pmd_swp_uffd_wp(pmd))
flags |= PM_UFFD_WP;
VM_BUG_ON(!is_pmd_migration_entry(pmd));
+ migration = is_migration_entry(entry);
page = pfn_swap_entry_to_page(entry);
}
#endif
- if (page && page_mapcount(page) == 1)
+ if (page && !migration && page_mapcount(page) == 1)
flags |= PM_MMAP_EXCLUSIVE;
for (; addr != end; addr += PAGE_SIZE) {
_
Patches currently in -mm which might be from shy828301(a)gmail.com are
fs-proc-task_mmuc-dont-read-mapcount-for-migration-entry.patch
The patch titled
Subject: hugetlbfs: fix a truncation issue in hugepages parameter
has been added to the -mm tree. Its filename is
hugetlbfs-fix-a-truncation-issue-in-hugepages-parameter.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/hugetlbfs-fix-a-truncation-issue-…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/hugetlbfs-fix-a-truncation-issue-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Liu Yuntao <liuyuntao10(a)huawei.com>
Subject: hugetlbfs: fix a truncation issue in hugepages parameter
When we specify a large number for node in hugepages parameter, it may be
parsed to another number due to truncation in this statement:
node = tmp;
For example, add following parameter in command line:
hugepagesz=1G hugepages=4294967297:5
and kernel will allocate 5 hugepages for node 1 instead of ignoring it.
I move the validation check earlier to fix this issue, and slightly
simplifies the condition here.
Link: https://lkml.kernel.org/r/20220209134018.8242-1-liuyuntao10@huawei.com
Fixes: b5389086ad7be0 ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Signed-off-by: Liu Yuntao <liuyuntao10(a)huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/hugetlb.c~hugetlbfs-fix-a-truncation-issue-in-hugepages-parameter
+++ a/mm/hugetlb.c
@@ -4159,10 +4159,10 @@ static int __init hugepages_setup(char *
pr_warn("HugeTLB: architecture can't support node specific alloc, ignoring!\n");
return 0;
}
+ if (tmp >= nr_online_nodes)
+ goto invalid;
node = tmp;
p += count + 1;
- if (node < 0 || node >= nr_online_nodes)
- goto invalid;
/* Parse hugepages */
if (sscanf(p, "%lu%n", &tmp, &count) != 1)
goto invalid;
_
Patches currently in -mm which might be from liuyuntao10(a)huawei.com are
hugetlbfs-fix-a-truncation-issue-in-hugepages-parameter.patch
It was reported that some perf event setup can make fork failed on
ARM64. It was the case of a group of mixed hw and sw events. The ARM
PMU code checks if all the events in a group belong to the same PMU
except for software events. But it didn't set the event_caps of
inherited events and no longer identify them as software events.
Therefore the test failed in a child process.
A simple reproducer is:
$ perf stat -e '{cycles,cs,instructions}' perf bench sched messaging
# Running 'sched/messaging' benchmark:
perf: fork(): Invalid argument
The perf stat was fine but the perf bench failed in fork(). Let's
inherit the event caps from the parent.
Cc: Will Deacon <will(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Namhyung Kim <namhyung(a)kernel.org>
---
kernel/events/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index fc18664f49b0..e28f63ae625b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11560,6 +11560,9 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
event->state = PERF_EVENT_STATE_INACTIVE;
+ if (parent_event)
+ event->event_caps = parent_event->event_caps;
+
if (event->attr.sigtrap)
atomic_set(&event->event_limit, 1);
--
2.35.1.265.g69c8d7142f-goog
The patch titled
Subject: selftests/exec: add non-regular to TEST_GEN_PROGS
has been added to the -mm tree. Its filename is
selftests-exec-add-non-regular-to-test_gen_progs.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/selftests-exec-add-non-regular-to…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/selftests-exec-add-non-regular-to…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Subject: selftests/exec: add non-regular to TEST_GEN_PROGS
non-regular file needs to be compiled and then copied to the output
directory. Remove it from TEST_PROGS and add it to TEST_GEN_PROGS. This
removes error thrown by rsync when non-regular object isn't found:
rsync: [sender] link_stat "/linux/tools/testing/selftests/exec/non-regular" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Link: https://lkml.kernel.org/r/20220210171323.1304501-1-usama.anjum@collabora.com
Fixes: 0f71241a8e32 ("selftests/exec: add file type errno tests")
Signed-off-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Reported-by: "kernelci.org bot" <bot(a)kernelci.org>
Reviewed-by: Shuah Khan <skhan(a)linuxfoundation.org>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: Eric Biederman <ebiederm(a)xmission.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/tools/testing/selftests/exec/Makefile~selftests-exec-add-non-regular-to-test_gen_progs
+++ a/tools/testing/selftests/exec/Makefile
@@ -3,8 +3,8 @@ CFLAGS = -Wall
CFLAGS += -Wno-nonnull
CFLAGS += -D_GNU_SOURCE
-TEST_PROGS := binfmt_script non-regular
-TEST_GEN_PROGS := execveat load_address_4096 load_address_2097152 load_address_16777216
+TEST_PROGS := binfmt_script
+TEST_GEN_PROGS := execveat load_address_4096 load_address_2097152 load_address_16777216 non-regular
TEST_GEN_FILES := execveat.symlink execveat.denatured script subdir
# Makefile is a run-time dependency, since it's accessed by the execveat test
TEST_FILES := Makefile
_
Patches currently in -mm which might be from usama.anjum(a)collabora.com are
selftests-exec-add-non-regular-to-test_gen_progs.patch
selftests-set-the-build-variable-to-absolute-path.patch
selftests-add-and-export-a-kernel-uapi-headers-path.patch
selftests-correct-the-headers-install-path.patch
selftests-futex-add-the-uapi-headers-include-variable.patch
selftests-kvm-add-the-uapi-headers-include-variable.patch
selftests-landlock-add-the-uapi-headers-include-variable.patch
selftests-net-add-the-uapi-headers-include-variable.patch
selftests-mptcp-add-the-uapi-headers-include-variable.patch
selftests-vm-add-the-uapi-headers-include-variable.patch
selftests-vm-remove-dependecy-from-internal-kernel-macros.patch
From: Daniel Borkmann <daniel(a)iogearbox.net>
commit 08389d888287c3823f80b0216766b71e17f0aba5 upstream.
Add a kconfig knob which allows for unprivileged bpf to be disabled by default.
If set, the knob sets /proc/sys/kernel/unprivileged_bpf_disabled to value of 2.
This still allows a transition of 2 -> {0,1} through an admin. Similarly,
this also still keeps 1 -> {1} behavior intact, so that once set to permanently
disabled, it cannot be undone aside from a reboot.
We've also added extra2 with max of 2 for the procfs handler, so that an admin
still has a chance to toggle between 0 <-> 2.
Either way, as an additional alternative, applications can make use of CAP_BPF
that we added a while ago.
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Link: https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.162076…
[fllinden(a)amazon.com: backported to 4.14]
Signed-off-by: Frank van der Linden <fllinden(a)amazon.com>
---
Documentation/sysctl/kernel.txt | 21 +++++++++++++++++++++
init/Kconfig | 10 ++++++++++
kernel/bpf/syscall.c | 3 ++-
kernel/sysctl.c | 29 +++++++++++++++++++++++++----
4 files changed, 58 insertions(+), 5 deletions(-)
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 694968c7523c..3c8f5bfdf6da 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -91,6 +91,7 @@ show up in /proc/sys/kernel:
- sysctl_writes_strict
- tainted
- threads-max
+- unprivileged_bpf_disabled
- unknown_nmi_panic
- watchdog
- watchdog_thresh
@@ -999,6 +1000,26 @@ available RAM pages threads-max is reduced accordingly.
==============================================================
+unprivileged_bpf_disabled:
+
+Writing 1 to this entry will disable unprivileged calls to bpf();
+once disabled, calling bpf() without CAP_SYS_ADMIN will return
+-EPERM. Once set to 1, this can't be cleared from the running kernel
+anymore.
+
+Writing 2 to this entry will also disable unprivileged calls to bpf(),
+however, an admin can still change this setting later on, if needed, by
+writing 0 or 1 to this entry.
+
+If BPF_UNPRIV_DEFAULT_OFF is enabled in the kernel config, then this
+entry will default to 2 instead of 0.
+
+ 0 - Unprivileged calls to bpf() are enabled
+ 1 - Unprivileged calls to bpf() are disabled without recovery
+ 2 - Unprivileged calls to bpf() are disabled
+
+==============================================================
+
unknown_nmi_panic:
The value in this file affects behavior of handling NMI. When the
diff --git a/init/Kconfig b/init/Kconfig
index be58f0449c68..c87858c434cc 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1378,6 +1378,16 @@ config ADVISE_SYSCALLS
applications use these syscalls, you can disable this option to save
space.
+config BPF_UNPRIV_DEFAULT_OFF
+ bool "Disable unprivileged BPF by default"
+ depends on BPF_SYSCALL
+ help
+ Disables unprivileged BPF by default by setting the corresponding
+ /proc/sys/kernel/unprivileged_bpf_disabled knob to 2. An admin can
+ still reenable it by setting it to 0 later on, or permanently
+ disable it by setting it to 1 (from which no other transition to
+ 0 is possible anymore).
+
config USERFAULTFD
bool "Enable userfaultfd() system call"
select ANON_INODES
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 21073682061d..59d44f1ad958 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -37,7 +37,8 @@ static DEFINE_SPINLOCK(prog_idr_lock);
static DEFINE_IDR(map_idr);
static DEFINE_SPINLOCK(map_idr_lock);
-int sysctl_unprivileged_bpf_disabled __read_mostly;
+int sysctl_unprivileged_bpf_disabled __read_mostly =
+ IS_BUILTIN(CONFIG_BPF_UNPRIV_DEFAULT_OFF) ? 2 : 0;
static const struct bpf_map_ops * const bpf_map_types[] = {
#define BPF_PROG_TYPE(_id, _ops)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 74fc3a9d1923..c9a3e61c88f8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -242,6 +242,28 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
#endif
+#ifdef CONFIG_BPF_SYSCALL
+static int bpf_unpriv_handler(struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+{
+ int ret, unpriv_enable = *(int *)table->data;
+ bool locked_state = unpriv_enable == 1;
+ struct ctl_table tmp = *table;
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ tmp.data = &unpriv_enable;
+ ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
+ if (write && !ret) {
+ if (locked_state && unpriv_enable != 1)
+ return -EPERM;
+ *(int *)table->data = unpriv_enable;
+ }
+ return ret;
+}
+#endif
+
static struct ctl_table kern_table[];
static struct ctl_table vm_table[];
static struct ctl_table fs_table[];
@@ -1201,10 +1223,9 @@ static struct ctl_table kern_table[] = {
.data = &sysctl_unprivileged_bpf_disabled,
.maxlen = sizeof(sysctl_unprivileged_bpf_disabled),
.mode = 0644,
- /* only handle a transition from default "0" to "1" */
- .proc_handler = proc_dointvec_minmax,
- .extra1 = &one,
- .extra2 = &one,
+ .proc_handler = bpf_unpriv_handler,
+ .extra1 = &zero,
+ .extra2 = &two,
},
#endif
#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
--
2.32.0
From: Daniel Borkmann <daniel(a)iogearbox.net>
commit 08389d888287c3823f80b0216766b71e17f0aba5 upstream.
Add a kconfig knob which allows for unprivileged bpf to be disabled by default.
If set, the knob sets /proc/sys/kernel/unprivileged_bpf_disabled to value of 2.
This still allows a transition of 2 -> {0,1} through an admin. Similarly,
this also still keeps 1 -> {1} behavior intact, so that once set to permanently
disabled, it cannot be undone aside from a reboot.
We've also added extra2 with max of 2 for the procfs handler, so that an admin
still has a chance to toggle between 0 <-> 2.
Either way, as an additional alternative, applications can make use of CAP_BPF
that we added a while ago.
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Link: https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.162076…
[fllinden(a)amazon.com: backported to 4.19]
Signed-off-by: Frank van der Linden <fllinden(a)amazon.com>
---
Documentation/sysctl/kernel.txt | 21 +++++++++++++++++++++
init/Kconfig | 10 ++++++++++
kernel/bpf/syscall.c | 3 ++-
kernel/sysctl.c | 29 +++++++++++++++++++++++++----
4 files changed, 58 insertions(+), 5 deletions(-)
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 37a679501ddc..8bd3b0153959 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -94,6 +94,7 @@ show up in /proc/sys/kernel:
- sysctl_writes_strict
- tainted
- threads-max
+- unprivileged_bpf_disabled
- unknown_nmi_panic
- watchdog
- watchdog_thresh
@@ -1041,6 +1042,26 @@ available RAM pages threads-max is reduced accordingly.
==============================================================
+unprivileged_bpf_disabled:
+
+Writing 1 to this entry will disable unprivileged calls to bpf();
+once disabled, calling bpf() without CAP_SYS_ADMIN will return
+-EPERM. Once set to 1, this can't be cleared from the running kernel
+anymore.
+
+Writing 2 to this entry will also disable unprivileged calls to bpf(),
+however, an admin can still change this setting later on, if needed, by
+writing 0 or 1 to this entry.
+
+If BPF_UNPRIV_DEFAULT_OFF is enabled in the kernel config, then this
+entry will default to 2 instead of 0.
+
+ 0 - Unprivileged calls to bpf() are enabled
+ 1 - Unprivileged calls to bpf() are disabled without recovery
+ 2 - Unprivileged calls to bpf() are disabled
+
+==============================================================
+
unknown_nmi_panic:
The value in this file affects behavior of handling NMI. When the
diff --git a/init/Kconfig b/init/Kconfig
index b56a125b5a76..0fe4f60c974d 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1474,6 +1474,16 @@ config BPF_JIT_ALWAYS_ON
Enables BPF JIT and removes BPF interpreter to avoid
speculative execution of BPF instructions by the interpreter
+config BPF_UNPRIV_DEFAULT_OFF
+ bool "Disable unprivileged BPF by default"
+ depends on BPF_SYSCALL
+ help
+ Disables unprivileged BPF by default by setting the corresponding
+ /proc/sys/kernel/unprivileged_bpf_disabled knob to 2. An admin can
+ still reenable it by setting it to 0 later on, or permanently
+ disable it by setting it to 1 (from which no other transition to
+ 0 is possible anymore).
+
config USERFAULTFD
bool "Enable userfaultfd() system call"
select ANON_INODES
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 353a8d672302..e940c1f65938 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -48,7 +48,8 @@ static DEFINE_SPINLOCK(prog_idr_lock);
static DEFINE_IDR(map_idr);
static DEFINE_SPINLOCK(map_idr_lock);
-int sysctl_unprivileged_bpf_disabled __read_mostly;
+int sysctl_unprivileged_bpf_disabled __read_mostly =
+ IS_BUILTIN(CONFIG_BPF_UNPRIV_DEFAULT_OFF) ? 2 : 0;
static const struct bpf_map_ops * const bpf_map_types[] = {
#define BPF_PROG_TYPE(_id, _ops)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index a5d75bc38eea..03af4a493aff 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -250,6 +250,28 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
#endif
+#ifdef CONFIG_BPF_SYSCALL
+static int bpf_unpriv_handler(struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+{
+ int ret, unpriv_enable = *(int *)table->data;
+ bool locked_state = unpriv_enable == 1;
+ struct ctl_table tmp = *table;
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ tmp.data = &unpriv_enable;
+ ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
+ if (write && !ret) {
+ if (locked_state && unpriv_enable != 1)
+ return -EPERM;
+ *(int *)table->data = unpriv_enable;
+ }
+ return ret;
+}
+#endif
+
static struct ctl_table kern_table[];
static struct ctl_table vm_table[];
static struct ctl_table fs_table[];
@@ -1220,10 +1242,9 @@ static struct ctl_table kern_table[] = {
.data = &sysctl_unprivileged_bpf_disabled,
.maxlen = sizeof(sysctl_unprivileged_bpf_disabled),
.mode = 0644,
- /* only handle a transition from default "0" to "1" */
- .proc_handler = proc_dointvec_minmax,
- .extra1 = &one,
- .extra2 = &one,
+ .proc_handler = bpf_unpriv_handler,
+ .extra1 = &zero,
+ .extra2 = &two,
},
#endif
#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
--
2.32.0
The upstream commit b400c2f4f4c5 ("net: axienet: Wait for PhyRstCmplt
after core reset") add new instructions in the `__axienet_device_reset`
function to wait until PhyRstCmplt bit is set, and return a non zero-exit
value if this exceeds a timeout.
However, its backported version in 4.9 (commit 9f1a3c13342b ("net:
axienet: Wait for PhyRstCmplt after core reset")) insert these
instructions in `axienet_dma_bd_init` instead.
Unlike upstream, the version of `__axienet_device_reset` currently present
in branch stable/linux-4.9.y does not return an integer for error codes.
Therefore the backport cannot be as simple as moving the inserted
instructions in the right place.
Where and how should we backport the patch in this case ?
Should we simply revert it instead ?
- Guillaume Bertholon
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Upstream
b400c2f4f4c53c86594dd57098970d97d488bfde
%%%%%%%%%
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 9c5b24a..3a2d7e8 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -516,6 +516,16 @@ static int __axienet_device_reset(struct axienet_local *lp)
return ret;
}
+ /* Wait for PhyRstCmplt bit to be set, indicating the PHY reset has finished */
+ ret = read_poll_timeout(axienet_ior, value,
+ value & XAE_INT_PHYRSTCMPLT_MASK,
+ DELAY_OF_ONE_MILLISEC, 50000, false, lp,
+ XAE_IS_OFFSET);
+ if (ret) {
+ dev_err(lp->dev, "%s: timeout waiting for PhyRstCmplt\n", __func__);
+ return ret;
+ }
+
return 0;
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Backported
9f1a3c13342b4d96b9baa337ec5fb7d453828709
%%%%%%%%%
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 46fcf3e..cc5399f 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -278,6 +278,16 @@ static int axienet_dma_bd_init(struct net_device *ndev)
axienet_dma_out32(lp, XAXIDMA_TX_CR_OFFSET,
cr | XAXIDMA_CR_RUNSTOP_MASK);
+ /* Wait for PhyRstCmplt bit to be set, indicating the PHY reset has finished */
+ ret = read_poll_timeout(axienet_ior, value,
+ value & XAE_INT_PHYRSTCMPLT_MASK,
+ DELAY_OF_ONE_MILLISEC, 50000, false, lp,
+ XAE_IS_OFFSET);
+ if (ret) {
+ dev_err(lp->dev, "%s: timeout waiting for PhyRstCmplt\n", __func__);
+ return ret;
+ }
+
return 0;
out:
axienet_dma_bd_release(ndev);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 097f1eefedeab528cecbd35586dfe293853ffb17 Mon Sep 17 00:00:00 2001
From: Tom Zanussi <zanussi(a)kernel.org>
Date: Thu, 27 Jan 2022 15:44:17 -0600
Subject: [PATCH] tracing: Propagate is_signed to expression
During expression parsing, a new expression field is created which
should inherit the properties of the operands, such as size and
is_signed.
is_signed propagation was missing, causing spurious errors with signed
operands. Add it in parse_expr() and parse_unary() to fix the problem.
Link: https://lkml.kernel.org/r/f4dac08742fd7a0920bf80a73c6c44042f5eaa40.16433197…
Cc: stable(a)vger.kernel.org
Fixes: 100719dcef447 ("tracing: Add simple expression support to hist triggers")
Reported-by: Yordan Karadzhov <ykaradzhov(a)vmware.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215513
Signed-off-by: Tom Zanussi <zanussi(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index b894d68082ea..ada87bfb5bb8 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -2503,6 +2503,8 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data,
(HIST_FIELD_FL_TIMESTAMP | HIST_FIELD_FL_TIMESTAMP_USECS);
expr->fn = hist_field_unary_minus;
expr->operands[0] = operand1;
+ expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = FIELD_OP_UNARY_MINUS;
expr->name = expr_str(expr, 0);
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
@@ -2719,6 +2721,7 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
/* The operand sizes should be the same, so just pick one */
expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = field_op;
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
We need to apply 2685c77b80a8 ("thermal/drivers/int340x: Fix RFIM mailbox write commands")
to the 5.15-stable tree but it does not apply. This patch fix the write
operation for mailbox command for RFIM (cmd = 0x08) which is ignored.
This results in failing to store RFI restriction.
To apply this fix, we need to backport below set of three patches on
5.15-stable tree because this fix depended on these three patches.
Without these three dependent patches, we cannot directly apply
fix (fourth) patch.
Backport three patches:
[1] c4fcf1ada4ae ("thermal/drivers/int340x: Improve the tcc offset saving for suspend/resume")
[2] aeb58c860dc5 ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses")
[3] 994a04a20b03 ("thermal: int340x: Limit Kconfig to 64-bit")
Fix RFIM patch:
[4] 2685c77b80a8 ("thermal/drivers/int340x: Fix RFIM mailbox write commands")
---
Changes in V2 from V1:
- Added upstream commit id from Linus's tree for all four patches.
---
*** BLURB HERE ***
Antoine Tenart (1):
thermal/drivers/int340x: Improve the tcc offset saving for
suspend/resume
Arnd Bergmann (1):
thermal: int340x: Limit Kconfig to 64-bit
Srinivas Pandruvada (1):
thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM
responses
Sumeet Pawnikar (1):
thermal/drivers/int340x: Fix RFIM mailbox write commands
drivers/thermal/intel/int340x_thermal/Kconfig | 4 +-
.../intel/int340x_thermal/int3401_thermal.c | 8 +-
.../processor_thermal_device.c | 36 ++++--
.../processor_thermal_device.h | 4 +-
.../processor_thermal_device_pci.c | 18 ++-
.../processor_thermal_device_pci_legacy.c | 8 +-
.../int340x_thermal/processor_thermal_mbox.c | 104 +++++++++++-------
.../int340x_thermal/processor_thermal_rfim.c | 23 ++--
8 files changed, 139 insertions(+), 66 deletions(-)
--
2.17.1
Hello,
Please apply the patch "nvme: Fix parsing of ANA log page" to 5.4.
The commit ID in Linus's tree is:
64fab7290dc3561729bbc1e35895a517eb2e549e
The patch was originally submitted on the linux-nvme mailing list, but
for reasons unknown to me it never landed on 5.4 - this thread indicates
it should have been accepted for 5.4.
https://lore.kernel.org/linux-nvme/1572303408-37913-1-git-send-email-psajee…
Without the patch, we perform the check
WARN_ON_ONCE(offset > ctrl->ana_log_size - sizeof(*desc))
at the end of the enclosing loop. This check only makes sense if we are
about to read another nvme_ana_group_desc from the ana_log_buf, but
that's not the case at the end of the last iteration of the loop. In the
last iteration, the warning fires and the function nvme_parse_ana_log
fails. When nvme native multipath is enabled, this translates into
failure to establish a connection to the controller.
The patch fixes the issue by moving the above check to a correct
position within the loop body.
Thanks,
Uday Shankar
We need to apply 2685c77b80a8 ("thermal/drivers/int340x: Fix RFIM mailbox write commands")
to the 5.15-stable tree but it does not apply. This patch fix the write
operation for mailbox command for RFIM (cmd = 0x08) which is ignored.
This results in failing to store RFI restriction.
To apply this fix, we need to backport below set of three patches on
5.15-stable tree because this fix depended on these three patches.
Without these three dependent patches, we cannot directly apply
fix (fourth) patch.
Backport three patches:
[1] c4fcf1ada4ae ("thermal/drivers/int340x: Improve the tcc offset saving for suspend/resume")
[2] 994a04a20b03 ("thermal: int340x: Limit Kconfig to 64-bit")
[3] aeb58c860dc5 ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses")
Fix RFIM patch:
[4] 2685c77b80a8 ("thermal/drivers/int340x: Fix RFIM mailbox write commands")
*** BLURB HERE ***
Antoine Tenart (1):
thermal/drivers/int340x: Improve the tcc offset saving for
suspend/resume
Arnd Bergmann (1):
thermal: int340x: Limit Kconfig to 64-bit
Srinivas Pandruvada (1):
thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM
responses
Sumeet Pawnikar (1):
thermal/drivers/int340x: Fix RFIM mailbox write commands
drivers/thermal/intel/int340x_thermal/Kconfig | 4 +-
.../intel/int340x_thermal/int3401_thermal.c | 8 +-
.../processor_thermal_device.c | 36 ++++--
.../processor_thermal_device.h | 4 +-
.../processor_thermal_device_pci.c | 18 ++-
.../processor_thermal_device_pci_legacy.c | 8 +-
.../int340x_thermal/processor_thermal_mbox.c | 104 +++++++++++-------
.../int340x_thermal/processor_thermal_rfim.c | 23 ++--
8 files changed, 139 insertions(+), 66 deletions(-)
--
2.17.1
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f965021c8bc38965ecb1924f570c4842b33d408 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 24 Jan 2022 15:50:31 -0500
Subject: [PATCH] NFSD: COMMIT operations must not return NFS?ERR_INVAL
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Reviewed-by: Bruce Fields <bfields(a)fieldses.org>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index aca38ed1526e..52ad1972cc33 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -663,15 +663,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp)
argp->count,
(unsigned long long) argp->offset);
- if (argp->offset > NFS_OFFSET_MAX) {
- resp->status = nfserr_inval;
- goto out;
- }
-
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset,
argp->count, resp->verf);
-out:
return rpc_success;
}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0cccceb105e7..91600e71be19 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1114,42 +1114,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
}
#ifdef CONFIG_NFSD_V3
-/*
- * Commit all pending writes to stable storage.
+/**
+ * nfsd_commit - Commit pending writes to stable storage
+ * @rqstp: RPC request being processed
+ * @fhp: NFS filehandle
+ * @offset: raw offset from beginning of file
+ * @count: raw count of bytes to sync
+ * @verf: filled in with the server's current write verifier
*
- * Note: we only guarantee that data that lies within the range specified
- * by the 'offset' and 'count' parameters will be synced.
+ * Note: we guarantee that data that lies within the range specified
+ * by the 'offset' and 'count' parameters will be synced. The server
+ * is permitted to sync data that lies outside this range at the
+ * same time.
*
* Unfortunately we cannot lock the file to make sure we return full WCC
* data to the client, as locking happens lower down in the filesystem.
+ *
+ * Return values:
+ * An nfsstat value in network byte order.
*/
__be32
-nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, unsigned long count, __be32 *verf)
+nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ u32 count, __be32 *verf)
{
+ u64 maxbytes;
+ loff_t start, end;
struct nfsd_net *nn;
struct nfsd_file *nf;
- loff_t end = LLONG_MAX;
- __be32 err = nfserr_inval;
-
- if (offset < 0)
- goto out;
- if (count != 0) {
- end = offset + (loff_t)count - 1;
- if (end < offset)
- goto out;
- }
+ __be32 err;
err = nfsd_file_acquire(rqstp, fhp,
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
if (err)
goto out;
+
+ /*
+ * Convert the client-provided (offset, count) range to a
+ * (start, end) range. If the client-provided range falls
+ * outside the maximum file size of the underlying FS,
+ * clamp the sync range appropriately.
+ */
+ start = 0;
+ end = LLONG_MAX;
+ maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes;
+ if (offset < maxbytes) {
+ start = offset;
+ if (count && (offset + count - 1 < maxbytes))
+ end = offset + count - 1;
+ }
+
nn = net_generic(nf->nf_net, nfsd_net_id);
if (EX_ISSYNC(fhp->fh_export)) {
errseq_t since = READ_ONCE(nf->nf_file->f_wb_err);
int err2;
- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
+ err2 = vfs_fsync_range(nf->nf_file, start, end, 0);
switch (err2) {
case 0:
nfsd_copy_write_verifier(verf, nn);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 9f56dcb22ff7..2c43d10e3cab 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -74,8 +74,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
struct svc_fh *res, int createmode,
u32 *verifier, bool *truncp, bool *created);
-__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
- loff_t, unsigned long, __be32 *verf);
+__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp,
+ u64 offset, u32 count, __be32 *verf);
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
__be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f965021c8bc38965ecb1924f570c4842b33d408 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 24 Jan 2022 15:50:31 -0500
Subject: [PATCH] NFSD: COMMIT operations must not return NFS?ERR_INVAL
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Reviewed-by: Bruce Fields <bfields(a)fieldses.org>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index aca38ed1526e..52ad1972cc33 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -663,15 +663,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp)
argp->count,
(unsigned long long) argp->offset);
- if (argp->offset > NFS_OFFSET_MAX) {
- resp->status = nfserr_inval;
- goto out;
- }
-
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset,
argp->count, resp->verf);
-out:
return rpc_success;
}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0cccceb105e7..91600e71be19 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1114,42 +1114,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
}
#ifdef CONFIG_NFSD_V3
-/*
- * Commit all pending writes to stable storage.
+/**
+ * nfsd_commit - Commit pending writes to stable storage
+ * @rqstp: RPC request being processed
+ * @fhp: NFS filehandle
+ * @offset: raw offset from beginning of file
+ * @count: raw count of bytes to sync
+ * @verf: filled in with the server's current write verifier
*
- * Note: we only guarantee that data that lies within the range specified
- * by the 'offset' and 'count' parameters will be synced.
+ * Note: we guarantee that data that lies within the range specified
+ * by the 'offset' and 'count' parameters will be synced. The server
+ * is permitted to sync data that lies outside this range at the
+ * same time.
*
* Unfortunately we cannot lock the file to make sure we return full WCC
* data to the client, as locking happens lower down in the filesystem.
+ *
+ * Return values:
+ * An nfsstat value in network byte order.
*/
__be32
-nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, unsigned long count, __be32 *verf)
+nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ u32 count, __be32 *verf)
{
+ u64 maxbytes;
+ loff_t start, end;
struct nfsd_net *nn;
struct nfsd_file *nf;
- loff_t end = LLONG_MAX;
- __be32 err = nfserr_inval;
-
- if (offset < 0)
- goto out;
- if (count != 0) {
- end = offset + (loff_t)count - 1;
- if (end < offset)
- goto out;
- }
+ __be32 err;
err = nfsd_file_acquire(rqstp, fhp,
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
if (err)
goto out;
+
+ /*
+ * Convert the client-provided (offset, count) range to a
+ * (start, end) range. If the client-provided range falls
+ * outside the maximum file size of the underlying FS,
+ * clamp the sync range appropriately.
+ */
+ start = 0;
+ end = LLONG_MAX;
+ maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes;
+ if (offset < maxbytes) {
+ start = offset;
+ if (count && (offset + count - 1 < maxbytes))
+ end = offset + count - 1;
+ }
+
nn = net_generic(nf->nf_net, nfsd_net_id);
if (EX_ISSYNC(fhp->fh_export)) {
errseq_t since = READ_ONCE(nf->nf_file->f_wb_err);
int err2;
- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
+ err2 = vfs_fsync_range(nf->nf_file, start, end, 0);
switch (err2) {
case 0:
nfsd_copy_write_verifier(verf, nn);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 9f56dcb22ff7..2c43d10e3cab 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -74,8 +74,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
struct svc_fh *res, int createmode,
u32 *verifier, bool *truncp, bool *created);
-__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
- loff_t, unsigned long, __be32 *verf);
+__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp,
+ u64 offset, u32 count, __be32 *verf);
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
__be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f965021c8bc38965ecb1924f570c4842b33d408 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 24 Jan 2022 15:50:31 -0500
Subject: [PATCH] NFSD: COMMIT operations must not return NFS?ERR_INVAL
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Reviewed-by: Bruce Fields <bfields(a)fieldses.org>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index aca38ed1526e..52ad1972cc33 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -663,15 +663,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp)
argp->count,
(unsigned long long) argp->offset);
- if (argp->offset > NFS_OFFSET_MAX) {
- resp->status = nfserr_inval;
- goto out;
- }
-
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset,
argp->count, resp->verf);
-out:
return rpc_success;
}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0cccceb105e7..91600e71be19 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1114,42 +1114,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
}
#ifdef CONFIG_NFSD_V3
-/*
- * Commit all pending writes to stable storage.
+/**
+ * nfsd_commit - Commit pending writes to stable storage
+ * @rqstp: RPC request being processed
+ * @fhp: NFS filehandle
+ * @offset: raw offset from beginning of file
+ * @count: raw count of bytes to sync
+ * @verf: filled in with the server's current write verifier
*
- * Note: we only guarantee that data that lies within the range specified
- * by the 'offset' and 'count' parameters will be synced.
+ * Note: we guarantee that data that lies within the range specified
+ * by the 'offset' and 'count' parameters will be synced. The server
+ * is permitted to sync data that lies outside this range at the
+ * same time.
*
* Unfortunately we cannot lock the file to make sure we return full WCC
* data to the client, as locking happens lower down in the filesystem.
+ *
+ * Return values:
+ * An nfsstat value in network byte order.
*/
__be32
-nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, unsigned long count, __be32 *verf)
+nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ u32 count, __be32 *verf)
{
+ u64 maxbytes;
+ loff_t start, end;
struct nfsd_net *nn;
struct nfsd_file *nf;
- loff_t end = LLONG_MAX;
- __be32 err = nfserr_inval;
-
- if (offset < 0)
- goto out;
- if (count != 0) {
- end = offset + (loff_t)count - 1;
- if (end < offset)
- goto out;
- }
+ __be32 err;
err = nfsd_file_acquire(rqstp, fhp,
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
if (err)
goto out;
+
+ /*
+ * Convert the client-provided (offset, count) range to a
+ * (start, end) range. If the client-provided range falls
+ * outside the maximum file size of the underlying FS,
+ * clamp the sync range appropriately.
+ */
+ start = 0;
+ end = LLONG_MAX;
+ maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes;
+ if (offset < maxbytes) {
+ start = offset;
+ if (count && (offset + count - 1 < maxbytes))
+ end = offset + count - 1;
+ }
+
nn = net_generic(nf->nf_net, nfsd_net_id);
if (EX_ISSYNC(fhp->fh_export)) {
errseq_t since = READ_ONCE(nf->nf_file->f_wb_err);
int err2;
- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
+ err2 = vfs_fsync_range(nf->nf_file, start, end, 0);
switch (err2) {
case 0:
nfsd_copy_write_verifier(verf, nn);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 9f56dcb22ff7..2c43d10e3cab 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -74,8 +74,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
struct svc_fh *res, int createmode,
u32 *verifier, bool *truncp, bool *created);
-__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
- loff_t, unsigned long, __be32 *verf);
+__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp,
+ u64 offset, u32 count, __be32 *verf);
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
__be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f965021c8bc38965ecb1924f570c4842b33d408 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 24 Jan 2022 15:50:31 -0500
Subject: [PATCH] NFSD: COMMIT operations must not return NFS?ERR_INVAL
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Reviewed-by: Bruce Fields <bfields(a)fieldses.org>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index aca38ed1526e..52ad1972cc33 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -663,15 +663,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp)
argp->count,
(unsigned long long) argp->offset);
- if (argp->offset > NFS_OFFSET_MAX) {
- resp->status = nfserr_inval;
- goto out;
- }
-
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset,
argp->count, resp->verf);
-out:
return rpc_success;
}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0cccceb105e7..91600e71be19 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1114,42 +1114,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
}
#ifdef CONFIG_NFSD_V3
-/*
- * Commit all pending writes to stable storage.
+/**
+ * nfsd_commit - Commit pending writes to stable storage
+ * @rqstp: RPC request being processed
+ * @fhp: NFS filehandle
+ * @offset: raw offset from beginning of file
+ * @count: raw count of bytes to sync
+ * @verf: filled in with the server's current write verifier
*
- * Note: we only guarantee that data that lies within the range specified
- * by the 'offset' and 'count' parameters will be synced.
+ * Note: we guarantee that data that lies within the range specified
+ * by the 'offset' and 'count' parameters will be synced. The server
+ * is permitted to sync data that lies outside this range at the
+ * same time.
*
* Unfortunately we cannot lock the file to make sure we return full WCC
* data to the client, as locking happens lower down in the filesystem.
+ *
+ * Return values:
+ * An nfsstat value in network byte order.
*/
__be32
-nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, unsigned long count, __be32 *verf)
+nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ u32 count, __be32 *verf)
{
+ u64 maxbytes;
+ loff_t start, end;
struct nfsd_net *nn;
struct nfsd_file *nf;
- loff_t end = LLONG_MAX;
- __be32 err = nfserr_inval;
-
- if (offset < 0)
- goto out;
- if (count != 0) {
- end = offset + (loff_t)count - 1;
- if (end < offset)
- goto out;
- }
+ __be32 err;
err = nfsd_file_acquire(rqstp, fhp,
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
if (err)
goto out;
+
+ /*
+ * Convert the client-provided (offset, count) range to a
+ * (start, end) range. If the client-provided range falls
+ * outside the maximum file size of the underlying FS,
+ * clamp the sync range appropriately.
+ */
+ start = 0;
+ end = LLONG_MAX;
+ maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes;
+ if (offset < maxbytes) {
+ start = offset;
+ if (count && (offset + count - 1 < maxbytes))
+ end = offset + count - 1;
+ }
+
nn = net_generic(nf->nf_net, nfsd_net_id);
if (EX_ISSYNC(fhp->fh_export)) {
errseq_t since = READ_ONCE(nf->nf_file->f_wb_err);
int err2;
- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
+ err2 = vfs_fsync_range(nf->nf_file, start, end, 0);
switch (err2) {
case 0:
nfsd_copy_write_verifier(verf, nn);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 9f56dcb22ff7..2c43d10e3cab 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -74,8 +74,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
struct svc_fh *res, int createmode,
u32 *verifier, bool *truncp, bool *created);
-__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
- loff_t, unsigned long, __be32 *verf);
+__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp,
+ u64 offset, u32 count, __be32 *verf);
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
__be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f965021c8bc38965ecb1924f570c4842b33d408 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 24 Jan 2022 15:50:31 -0500
Subject: [PATCH] NFSD: COMMIT operations must not return NFS?ERR_INVAL
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Reviewed-by: Bruce Fields <bfields(a)fieldses.org>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index aca38ed1526e..52ad1972cc33 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -663,15 +663,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp)
argp->count,
(unsigned long long) argp->offset);
- if (argp->offset > NFS_OFFSET_MAX) {
- resp->status = nfserr_inval;
- goto out;
- }
-
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset,
argp->count, resp->verf);
-out:
return rpc_success;
}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0cccceb105e7..91600e71be19 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1114,42 +1114,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
}
#ifdef CONFIG_NFSD_V3
-/*
- * Commit all pending writes to stable storage.
+/**
+ * nfsd_commit - Commit pending writes to stable storage
+ * @rqstp: RPC request being processed
+ * @fhp: NFS filehandle
+ * @offset: raw offset from beginning of file
+ * @count: raw count of bytes to sync
+ * @verf: filled in with the server's current write verifier
*
- * Note: we only guarantee that data that lies within the range specified
- * by the 'offset' and 'count' parameters will be synced.
+ * Note: we guarantee that data that lies within the range specified
+ * by the 'offset' and 'count' parameters will be synced. The server
+ * is permitted to sync data that lies outside this range at the
+ * same time.
*
* Unfortunately we cannot lock the file to make sure we return full WCC
* data to the client, as locking happens lower down in the filesystem.
+ *
+ * Return values:
+ * An nfsstat value in network byte order.
*/
__be32
-nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, unsigned long count, __be32 *verf)
+nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ u32 count, __be32 *verf)
{
+ u64 maxbytes;
+ loff_t start, end;
struct nfsd_net *nn;
struct nfsd_file *nf;
- loff_t end = LLONG_MAX;
- __be32 err = nfserr_inval;
-
- if (offset < 0)
- goto out;
- if (count != 0) {
- end = offset + (loff_t)count - 1;
- if (end < offset)
- goto out;
- }
+ __be32 err;
err = nfsd_file_acquire(rqstp, fhp,
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
if (err)
goto out;
+
+ /*
+ * Convert the client-provided (offset, count) range to a
+ * (start, end) range. If the client-provided range falls
+ * outside the maximum file size of the underlying FS,
+ * clamp the sync range appropriately.
+ */
+ start = 0;
+ end = LLONG_MAX;
+ maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes;
+ if (offset < maxbytes) {
+ start = offset;
+ if (count && (offset + count - 1 < maxbytes))
+ end = offset + count - 1;
+ }
+
nn = net_generic(nf->nf_net, nfsd_net_id);
if (EX_ISSYNC(fhp->fh_export)) {
errseq_t since = READ_ONCE(nf->nf_file->f_wb_err);
int err2;
- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
+ err2 = vfs_fsync_range(nf->nf_file, start, end, 0);
switch (err2) {
case 0:
nfsd_copy_write_verifier(verf, nn);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 9f56dcb22ff7..2c43d10e3cab 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -74,8 +74,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
struct svc_fh *res, int createmode,
u32 *verifier, bool *truncp, bool *created);
-__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
- loff_t, unsigned long, __be32 *verf);
+__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp,
+ u64 offset, u32 count, __be32 *verf);
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
__be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f965021c8bc38965ecb1924f570c4842b33d408 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 24 Jan 2022 15:50:31 -0500
Subject: [PATCH] NFSD: COMMIT operations must not return NFS?ERR_INVAL
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Reviewed-by: Bruce Fields <bfields(a)fieldses.org>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index aca38ed1526e..52ad1972cc33 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -663,15 +663,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp)
argp->count,
(unsigned long long) argp->offset);
- if (argp->offset > NFS_OFFSET_MAX) {
- resp->status = nfserr_inval;
- goto out;
- }
-
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset,
argp->count, resp->verf);
-out:
return rpc_success;
}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0cccceb105e7..91600e71be19 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1114,42 +1114,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
}
#ifdef CONFIG_NFSD_V3
-/*
- * Commit all pending writes to stable storage.
+/**
+ * nfsd_commit - Commit pending writes to stable storage
+ * @rqstp: RPC request being processed
+ * @fhp: NFS filehandle
+ * @offset: raw offset from beginning of file
+ * @count: raw count of bytes to sync
+ * @verf: filled in with the server's current write verifier
*
- * Note: we only guarantee that data that lies within the range specified
- * by the 'offset' and 'count' parameters will be synced.
+ * Note: we guarantee that data that lies within the range specified
+ * by the 'offset' and 'count' parameters will be synced. The server
+ * is permitted to sync data that lies outside this range at the
+ * same time.
*
* Unfortunately we cannot lock the file to make sure we return full WCC
* data to the client, as locking happens lower down in the filesystem.
+ *
+ * Return values:
+ * An nfsstat value in network byte order.
*/
__be32
-nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, unsigned long count, __be32 *verf)
+nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ u32 count, __be32 *verf)
{
+ u64 maxbytes;
+ loff_t start, end;
struct nfsd_net *nn;
struct nfsd_file *nf;
- loff_t end = LLONG_MAX;
- __be32 err = nfserr_inval;
-
- if (offset < 0)
- goto out;
- if (count != 0) {
- end = offset + (loff_t)count - 1;
- if (end < offset)
- goto out;
- }
+ __be32 err;
err = nfsd_file_acquire(rqstp, fhp,
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
if (err)
goto out;
+
+ /*
+ * Convert the client-provided (offset, count) range to a
+ * (start, end) range. If the client-provided range falls
+ * outside the maximum file size of the underlying FS,
+ * clamp the sync range appropriately.
+ */
+ start = 0;
+ end = LLONG_MAX;
+ maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes;
+ if (offset < maxbytes) {
+ start = offset;
+ if (count && (offset + count - 1 < maxbytes))
+ end = offset + count - 1;
+ }
+
nn = net_generic(nf->nf_net, nfsd_net_id);
if (EX_ISSYNC(fhp->fh_export)) {
errseq_t since = READ_ONCE(nf->nf_file->f_wb_err);
int err2;
- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
+ err2 = vfs_fsync_range(nf->nf_file, start, end, 0);
switch (err2) {
case 0:
nfsd_copy_write_verifier(verf, nn);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 9f56dcb22ff7..2c43d10e3cab 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -74,8 +74,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
struct svc_fh *res, int createmode,
u32 *verifier, bool *truncp, bool *created);
-__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
- loff_t, unsigned long, __be32 *verf);
+__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp,
+ u64 offset, u32 count, __be32 *verf);
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
__be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f965021c8bc38965ecb1924f570c4842b33d408 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 24 Jan 2022 15:50:31 -0500
Subject: [PATCH] NFSD: COMMIT operations must not return NFS?ERR_INVAL
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Reviewed-by: Bruce Fields <bfields(a)fieldses.org>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index aca38ed1526e..52ad1972cc33 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -663,15 +663,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp)
argp->count,
(unsigned long long) argp->offset);
- if (argp->offset > NFS_OFFSET_MAX) {
- resp->status = nfserr_inval;
- goto out;
- }
-
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset,
argp->count, resp->verf);
-out:
return rpc_success;
}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0cccceb105e7..91600e71be19 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1114,42 +1114,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
}
#ifdef CONFIG_NFSD_V3
-/*
- * Commit all pending writes to stable storage.
+/**
+ * nfsd_commit - Commit pending writes to stable storage
+ * @rqstp: RPC request being processed
+ * @fhp: NFS filehandle
+ * @offset: raw offset from beginning of file
+ * @count: raw count of bytes to sync
+ * @verf: filled in with the server's current write verifier
*
- * Note: we only guarantee that data that lies within the range specified
- * by the 'offset' and 'count' parameters will be synced.
+ * Note: we guarantee that data that lies within the range specified
+ * by the 'offset' and 'count' parameters will be synced. The server
+ * is permitted to sync data that lies outside this range at the
+ * same time.
*
* Unfortunately we cannot lock the file to make sure we return full WCC
* data to the client, as locking happens lower down in the filesystem.
+ *
+ * Return values:
+ * An nfsstat value in network byte order.
*/
__be32
-nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
- loff_t offset, unsigned long count, __be32 *verf)
+nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ u32 count, __be32 *verf)
{
+ u64 maxbytes;
+ loff_t start, end;
struct nfsd_net *nn;
struct nfsd_file *nf;
- loff_t end = LLONG_MAX;
- __be32 err = nfserr_inval;
-
- if (offset < 0)
- goto out;
- if (count != 0) {
- end = offset + (loff_t)count - 1;
- if (end < offset)
- goto out;
- }
+ __be32 err;
err = nfsd_file_acquire(rqstp, fhp,
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
if (err)
goto out;
+
+ /*
+ * Convert the client-provided (offset, count) range to a
+ * (start, end) range. If the client-provided range falls
+ * outside the maximum file size of the underlying FS,
+ * clamp the sync range appropriately.
+ */
+ start = 0;
+ end = LLONG_MAX;
+ maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes;
+ if (offset < maxbytes) {
+ start = offset;
+ if (count && (offset + count - 1 < maxbytes))
+ end = offset + count - 1;
+ }
+
nn = net_generic(nf->nf_net, nfsd_net_id);
if (EX_ISSYNC(fhp->fh_export)) {
errseq_t since = READ_ONCE(nf->nf_file->f_wb_err);
int err2;
- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
+ err2 = vfs_fsync_range(nf->nf_file, start, end, 0);
switch (err2) {
case 0:
nfsd_copy_write_verifier(verf, nn);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 9f56dcb22ff7..2c43d10e3cab 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -74,8 +74,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
struct svc_fh *res, int createmode,
u32 *verifier, bool *truncp, bool *created);
-__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *,
- loff_t, unsigned long, __be32 *verf);
+__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp,
+ u64 offset, u32 count, __be32 *verf);
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
__be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Fri, 4 Feb 2022 15:19:34 -0500
Subject: [PATCH] NFSD: Fix the behavior of READ near OFFSET_MAX
Dan Aloni reports:
> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
> the RPC read layers") on the client, a read of 0xfff is aligned up
> to server rsize of 0x1000.
>
> As a result, in a test where the server has a file of size
> 0x7fffffffffffffff, and the client tries to read from the offset
> 0x7ffffffffffff000, the read causes loff_t overflow in the server
> and it returns an NFS code of EINVAL to the client. The client as
> a result indefinitely retries the request.
The Linux NFS client does not handle NFS?ERR_INVAL, even though all
NFS specifications permit servers to return that status code for a
READ.
Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed
and return a short result. Set the EOF flag in the result to prevent
the client from retrying the READ request. This behavior appears to
be consistent with Solaris NFS servers.
Note that NFSv3 and NFSv4 use u64 offset values on the wire. These
must be converted to loff_t internally before use -- an implicit
type cast is not adequate for this purpose. Otherwise VFS checks
against sb->s_maxbytes do not work properly.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 2c681785186f..b5a52528f19f 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -150,13 +150,17 @@ nfsd3_proc_read(struct svc_rqst *rqstp)
unsigned int len;
int v;
- argp->count = min_t(u32, argp->count, max_blocksize);
-
dprintk("nfsd: READ(3) %s %lu bytes at %Lu\n",
SVCFH_fmt(&argp->fh),
(unsigned long) argp->count,
(unsigned long long) argp->offset);
+ argp->count = min_t(u32, argp->count, max_blocksize);
+ if (argp->offset > (u64)OFFSET_MAX)
+ argp->offset = (u64)OFFSET_MAX;
+ if (argp->offset + argp->count > (u64)OFFSET_MAX)
+ argp->count = (u64)OFFSET_MAX - argp->offset;
+
v = 0;
len = argp->count;
resp->pages = rqstp->rq_next_page;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index ed1ee25647be..71d735b125a0 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -782,12 +782,16 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
read->rd_nf = NULL;
- if (read->rd_offset >= OFFSET_MAX)
- return nfserr_inval;
trace_nfsd_read_start(rqstp, &cstate->current_fh,
read->rd_offset, read->rd_length);
+ read->rd_length = min_t(u32, read->rd_length, svc_max_payload(rqstp));
+ if (read->rd_offset > (u64)OFFSET_MAX)
+ read->rd_offset = (u64)OFFSET_MAX;
+ if (read->rd_offset + read->rd_length > (u64)OFFSET_MAX)
+ read->rd_length = (u64)OFFSET_MAX - read->rd_offset;
+
/*
* If we do a zero copy read, then a client will see read data
* that reflects the state of the file *after* performing the
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 899de438e529..f5e3430bb6ff 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3986,10 +3986,8 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
}
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
if (file->f_op->splice_read &&
test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags))
@@ -4826,10 +4824,8 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
count = maxcount;
eof = read->rd_offset >= i_size_read(file_inode(file));
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Fri, 4 Feb 2022 15:19:34 -0500
Subject: [PATCH] NFSD: Fix the behavior of READ near OFFSET_MAX
Dan Aloni reports:
> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
> the RPC read layers") on the client, a read of 0xfff is aligned up
> to server rsize of 0x1000.
>
> As a result, in a test where the server has a file of size
> 0x7fffffffffffffff, and the client tries to read from the offset
> 0x7ffffffffffff000, the read causes loff_t overflow in the server
> and it returns an NFS code of EINVAL to the client. The client as
> a result indefinitely retries the request.
The Linux NFS client does not handle NFS?ERR_INVAL, even though all
NFS specifications permit servers to return that status code for a
READ.
Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed
and return a short result. Set the EOF flag in the result to prevent
the client from retrying the READ request. This behavior appears to
be consistent with Solaris NFS servers.
Note that NFSv3 and NFSv4 use u64 offset values on the wire. These
must be converted to loff_t internally before use -- an implicit
type cast is not adequate for this purpose. Otherwise VFS checks
against sb->s_maxbytes do not work properly.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 2c681785186f..b5a52528f19f 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -150,13 +150,17 @@ nfsd3_proc_read(struct svc_rqst *rqstp)
unsigned int len;
int v;
- argp->count = min_t(u32, argp->count, max_blocksize);
-
dprintk("nfsd: READ(3) %s %lu bytes at %Lu\n",
SVCFH_fmt(&argp->fh),
(unsigned long) argp->count,
(unsigned long long) argp->offset);
+ argp->count = min_t(u32, argp->count, max_blocksize);
+ if (argp->offset > (u64)OFFSET_MAX)
+ argp->offset = (u64)OFFSET_MAX;
+ if (argp->offset + argp->count > (u64)OFFSET_MAX)
+ argp->count = (u64)OFFSET_MAX - argp->offset;
+
v = 0;
len = argp->count;
resp->pages = rqstp->rq_next_page;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index ed1ee25647be..71d735b125a0 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -782,12 +782,16 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
read->rd_nf = NULL;
- if (read->rd_offset >= OFFSET_MAX)
- return nfserr_inval;
trace_nfsd_read_start(rqstp, &cstate->current_fh,
read->rd_offset, read->rd_length);
+ read->rd_length = min_t(u32, read->rd_length, svc_max_payload(rqstp));
+ if (read->rd_offset > (u64)OFFSET_MAX)
+ read->rd_offset = (u64)OFFSET_MAX;
+ if (read->rd_offset + read->rd_length > (u64)OFFSET_MAX)
+ read->rd_length = (u64)OFFSET_MAX - read->rd_offset;
+
/*
* If we do a zero copy read, then a client will see read data
* that reflects the state of the file *after* performing the
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 899de438e529..f5e3430bb6ff 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3986,10 +3986,8 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
}
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
if (file->f_op->splice_read &&
test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags))
@@ -4826,10 +4824,8 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
count = maxcount;
eof = read->rd_offset >= i_size_read(file_inode(file));
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Fri, 4 Feb 2022 15:19:34 -0500
Subject: [PATCH] NFSD: Fix the behavior of READ near OFFSET_MAX
Dan Aloni reports:
> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
> the RPC read layers") on the client, a read of 0xfff is aligned up
> to server rsize of 0x1000.
>
> As a result, in a test where the server has a file of size
> 0x7fffffffffffffff, and the client tries to read from the offset
> 0x7ffffffffffff000, the read causes loff_t overflow in the server
> and it returns an NFS code of EINVAL to the client. The client as
> a result indefinitely retries the request.
The Linux NFS client does not handle NFS?ERR_INVAL, even though all
NFS specifications permit servers to return that status code for a
READ.
Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed
and return a short result. Set the EOF flag in the result to prevent
the client from retrying the READ request. This behavior appears to
be consistent with Solaris NFS servers.
Note that NFSv3 and NFSv4 use u64 offset values on the wire. These
must be converted to loff_t internally before use -- an implicit
type cast is not adequate for this purpose. Otherwise VFS checks
against sb->s_maxbytes do not work properly.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 2c681785186f..b5a52528f19f 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -150,13 +150,17 @@ nfsd3_proc_read(struct svc_rqst *rqstp)
unsigned int len;
int v;
- argp->count = min_t(u32, argp->count, max_blocksize);
-
dprintk("nfsd: READ(3) %s %lu bytes at %Lu\n",
SVCFH_fmt(&argp->fh),
(unsigned long) argp->count,
(unsigned long long) argp->offset);
+ argp->count = min_t(u32, argp->count, max_blocksize);
+ if (argp->offset > (u64)OFFSET_MAX)
+ argp->offset = (u64)OFFSET_MAX;
+ if (argp->offset + argp->count > (u64)OFFSET_MAX)
+ argp->count = (u64)OFFSET_MAX - argp->offset;
+
v = 0;
len = argp->count;
resp->pages = rqstp->rq_next_page;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index ed1ee25647be..71d735b125a0 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -782,12 +782,16 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
read->rd_nf = NULL;
- if (read->rd_offset >= OFFSET_MAX)
- return nfserr_inval;
trace_nfsd_read_start(rqstp, &cstate->current_fh,
read->rd_offset, read->rd_length);
+ read->rd_length = min_t(u32, read->rd_length, svc_max_payload(rqstp));
+ if (read->rd_offset > (u64)OFFSET_MAX)
+ read->rd_offset = (u64)OFFSET_MAX;
+ if (read->rd_offset + read->rd_length > (u64)OFFSET_MAX)
+ read->rd_length = (u64)OFFSET_MAX - read->rd_offset;
+
/*
* If we do a zero copy read, then a client will see read data
* that reflects the state of the file *after* performing the
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 899de438e529..f5e3430bb6ff 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3986,10 +3986,8 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
}
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
if (file->f_op->splice_read &&
test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags))
@@ -4826,10 +4824,8 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
count = maxcount;
eof = read->rd_offset >= i_size_read(file_inode(file));
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Fri, 4 Feb 2022 15:19:34 -0500
Subject: [PATCH] NFSD: Fix the behavior of READ near OFFSET_MAX
Dan Aloni reports:
> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
> the RPC read layers") on the client, a read of 0xfff is aligned up
> to server rsize of 0x1000.
>
> As a result, in a test where the server has a file of size
> 0x7fffffffffffffff, and the client tries to read from the offset
> 0x7ffffffffffff000, the read causes loff_t overflow in the server
> and it returns an NFS code of EINVAL to the client. The client as
> a result indefinitely retries the request.
The Linux NFS client does not handle NFS?ERR_INVAL, even though all
NFS specifications permit servers to return that status code for a
READ.
Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed
and return a short result. Set the EOF flag in the result to prevent
the client from retrying the READ request. This behavior appears to
be consistent with Solaris NFS servers.
Note that NFSv3 and NFSv4 use u64 offset values on the wire. These
must be converted to loff_t internally before use -- an implicit
type cast is not adequate for this purpose. Otherwise VFS checks
against sb->s_maxbytes do not work properly.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 2c681785186f..b5a52528f19f 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -150,13 +150,17 @@ nfsd3_proc_read(struct svc_rqst *rqstp)
unsigned int len;
int v;
- argp->count = min_t(u32, argp->count, max_blocksize);
-
dprintk("nfsd: READ(3) %s %lu bytes at %Lu\n",
SVCFH_fmt(&argp->fh),
(unsigned long) argp->count,
(unsigned long long) argp->offset);
+ argp->count = min_t(u32, argp->count, max_blocksize);
+ if (argp->offset > (u64)OFFSET_MAX)
+ argp->offset = (u64)OFFSET_MAX;
+ if (argp->offset + argp->count > (u64)OFFSET_MAX)
+ argp->count = (u64)OFFSET_MAX - argp->offset;
+
v = 0;
len = argp->count;
resp->pages = rqstp->rq_next_page;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index ed1ee25647be..71d735b125a0 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -782,12 +782,16 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
read->rd_nf = NULL;
- if (read->rd_offset >= OFFSET_MAX)
- return nfserr_inval;
trace_nfsd_read_start(rqstp, &cstate->current_fh,
read->rd_offset, read->rd_length);
+ read->rd_length = min_t(u32, read->rd_length, svc_max_payload(rqstp));
+ if (read->rd_offset > (u64)OFFSET_MAX)
+ read->rd_offset = (u64)OFFSET_MAX;
+ if (read->rd_offset + read->rd_length > (u64)OFFSET_MAX)
+ read->rd_length = (u64)OFFSET_MAX - read->rd_offset;
+
/*
* If we do a zero copy read, then a client will see read data
* that reflects the state of the file *after* performing the
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 899de438e529..f5e3430bb6ff 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3986,10 +3986,8 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
}
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
if (file->f_op->splice_read &&
test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags))
@@ -4826,10 +4824,8 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
count = maxcount;
eof = read->rd_offset >= i_size_read(file_inode(file));
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Fri, 4 Feb 2022 15:19:34 -0500
Subject: [PATCH] NFSD: Fix the behavior of READ near OFFSET_MAX
Dan Aloni reports:
> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
> the RPC read layers") on the client, a read of 0xfff is aligned up
> to server rsize of 0x1000.
>
> As a result, in a test where the server has a file of size
> 0x7fffffffffffffff, and the client tries to read from the offset
> 0x7ffffffffffff000, the read causes loff_t overflow in the server
> and it returns an NFS code of EINVAL to the client. The client as
> a result indefinitely retries the request.
The Linux NFS client does not handle NFS?ERR_INVAL, even though all
NFS specifications permit servers to return that status code for a
READ.
Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed
and return a short result. Set the EOF flag in the result to prevent
the client from retrying the READ request. This behavior appears to
be consistent with Solaris NFS servers.
Note that NFSv3 and NFSv4 use u64 offset values on the wire. These
must be converted to loff_t internally before use -- an implicit
type cast is not adequate for this purpose. Otherwise VFS checks
against sb->s_maxbytes do not work properly.
Reported-by: Dan Aloni <dan.aloni(a)vastdata.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 2c681785186f..b5a52528f19f 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -150,13 +150,17 @@ nfsd3_proc_read(struct svc_rqst *rqstp)
unsigned int len;
int v;
- argp->count = min_t(u32, argp->count, max_blocksize);
-
dprintk("nfsd: READ(3) %s %lu bytes at %Lu\n",
SVCFH_fmt(&argp->fh),
(unsigned long) argp->count,
(unsigned long long) argp->offset);
+ argp->count = min_t(u32, argp->count, max_blocksize);
+ if (argp->offset > (u64)OFFSET_MAX)
+ argp->offset = (u64)OFFSET_MAX;
+ if (argp->offset + argp->count > (u64)OFFSET_MAX)
+ argp->count = (u64)OFFSET_MAX - argp->offset;
+
v = 0;
len = argp->count;
resp->pages = rqstp->rq_next_page;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index ed1ee25647be..71d735b125a0 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -782,12 +782,16 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
read->rd_nf = NULL;
- if (read->rd_offset >= OFFSET_MAX)
- return nfserr_inval;
trace_nfsd_read_start(rqstp, &cstate->current_fh,
read->rd_offset, read->rd_length);
+ read->rd_length = min_t(u32, read->rd_length, svc_max_payload(rqstp));
+ if (read->rd_offset > (u64)OFFSET_MAX)
+ read->rd_offset = (u64)OFFSET_MAX;
+ if (read->rd_offset + read->rd_length > (u64)OFFSET_MAX)
+ read->rd_length = (u64)OFFSET_MAX - read->rd_offset;
+
/*
* If we do a zero copy read, then a client will see read data
* that reflects the state of the file *after* performing the
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 899de438e529..f5e3430bb6ff 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3986,10 +3986,8 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
}
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
if (file->f_op->splice_read &&
test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags))
@@ -4826,10 +4824,8 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;
xdr_commit_encode(xdr);
- maxcount = svc_max_payload(resp->rqstp);
- maxcount = min_t(unsigned long, maxcount,
+ maxcount = min_t(unsigned long, read->rd_length,
(xdr->buf->buflen - xdr->buf->len));
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
count = maxcount;
eof = read->rd_offset >= i_size_read(file_inode(file));
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6faac3f58c7c4176b66f63def17a34232a17b0e Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 31 Jan 2022 13:01:53 -0500
Subject: [PATCH] NFSD: Fix ia_size underflow
iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.
Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 99c2b9dfbb10..0cccceb105e7 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -435,6 +435,10 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
.ia_size = iap->ia_size,
};
+ host_err = -EFBIG;
+ if (iap->ia_size < 0)
+ goto out_unlock;
+
host_err = notify_change(&init_user_ns, dentry, &size_attr, NULL);
if (host_err)
goto out_unlock;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6faac3f58c7c4176b66f63def17a34232a17b0e Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 31 Jan 2022 13:01:53 -0500
Subject: [PATCH] NFSD: Fix ia_size underflow
iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.
Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 99c2b9dfbb10..0cccceb105e7 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -435,6 +435,10 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
.ia_size = iap->ia_size,
};
+ host_err = -EFBIG;
+ if (iap->ia_size < 0)
+ goto out_unlock;
+
host_err = notify_change(&init_user_ns, dentry, &size_attr, NULL);
if (host_err)
goto out_unlock;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6faac3f58c7c4176b66f63def17a34232a17b0e Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 31 Jan 2022 13:01:53 -0500
Subject: [PATCH] NFSD: Fix ia_size underflow
iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.
Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 99c2b9dfbb10..0cccceb105e7 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -435,6 +435,10 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
.ia_size = iap->ia_size,
};
+ host_err = -EFBIG;
+ if (iap->ia_size < 0)
+ goto out_unlock;
+
host_err = notify_change(&init_user_ns, dentry, &size_attr, NULL);
if (host_err)
goto out_unlock;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6faac3f58c7c4176b66f63def17a34232a17b0e Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 31 Jan 2022 13:01:53 -0500
Subject: [PATCH] NFSD: Fix ia_size underflow
iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.
Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 99c2b9dfbb10..0cccceb105e7 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -435,6 +435,10 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
.ia_size = iap->ia_size,
};
+ host_err = -EFBIG;
+ if (iap->ia_size < 0)
+ goto out_unlock;
+
host_err = notify_change(&init_user_ns, dentry, &size_attr, NULL);
if (host_err)
goto out_unlock;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6faac3f58c7c4176b66f63def17a34232a17b0e Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Mon, 31 Jan 2022 13:01:53 -0500
Subject: [PATCH] NFSD: Fix ia_size underflow
iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.
Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 99c2b9dfbb10..0cccceb105e7 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -435,6 +435,10 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
.ia_size = iap->ia_size,
};
+ host_err = -EFBIG;
+ if (iap->ia_size < 0)
+ goto out_unlock;
+
host_err = notify_change(&init_user_ns, dentry, &size_attr, NULL);
if (host_err)
goto out_unlock;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a648fdeb7c0e17177a2280344d015dba3fbe3314 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Tue, 25 Jan 2022 15:59:57 -0500
Subject: [PATCH] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.
Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().
Note that RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. We believe that NFSv3 reference implementations
also return NFS3ERR_FBIG when ia_size is too large.
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 7c45ba4db61b..2e47a07029f1 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr,
if (xdr_stream_decode_u64(xdr, &newsize) < 0)
return false;
iap->ia_valid |= ATTR_SIZE;
- iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
+ iap->ia_size = newsize;
}
if (xdr_stream_decode_u32(xdr, &set_it) < 0)
return false;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a648fdeb7c0e17177a2280344d015dba3fbe3314 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Tue, 25 Jan 2022 15:59:57 -0500
Subject: [PATCH] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.
Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().
Note that RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. We believe that NFSv3 reference implementations
also return NFS3ERR_FBIG when ia_size is too large.
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 7c45ba4db61b..2e47a07029f1 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr,
if (xdr_stream_decode_u64(xdr, &newsize) < 0)
return false;
iap->ia_valid |= ATTR_SIZE;
- iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
+ iap->ia_size = newsize;
}
if (xdr_stream_decode_u32(xdr, &set_it) < 0)
return false;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a648fdeb7c0e17177a2280344d015dba3fbe3314 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Tue, 25 Jan 2022 15:59:57 -0500
Subject: [PATCH] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.
Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().
Note that RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. We believe that NFSv3 reference implementations
also return NFS3ERR_FBIG when ia_size is too large.
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 7c45ba4db61b..2e47a07029f1 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr,
if (xdr_stream_decode_u64(xdr, &newsize) < 0)
return false;
iap->ia_valid |= ATTR_SIZE;
- iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
+ iap->ia_size = newsize;
}
if (xdr_stream_decode_u32(xdr, &set_it) < 0)
return false;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a648fdeb7c0e17177a2280344d015dba3fbe3314 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Tue, 25 Jan 2022 15:59:57 -0500
Subject: [PATCH] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.
Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().
Note that RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. We believe that NFSv3 reference implementations
also return NFS3ERR_FBIG when ia_size is too large.
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 7c45ba4db61b..2e47a07029f1 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr,
if (xdr_stream_decode_u64(xdr, &newsize) < 0)
return false;
iap->ia_valid |= ATTR_SIZE;
- iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
+ iap->ia_size = newsize;
}
if (xdr_stream_decode_u32(xdr, &set_it) < 0)
return false;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a648fdeb7c0e17177a2280344d015dba3fbe3314 Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Tue, 25 Jan 2022 15:59:57 -0500
Subject: [PATCH] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.
Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().
Note that RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. We believe that NFSv3 reference implementations
also return NFS3ERR_FBIG when ia_size is too large.
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 7c45ba4db61b..2e47a07029f1 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr,
if (xdr_stream_decode_u64(xdr, &newsize) < 0)
return false;
iap->ia_valid |= ATTR_SIZE;
- iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
+ iap->ia_size = newsize;
}
if (xdr_stream_decode_u32(xdr, &set_it) < 0)
return false;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aec12836e7196e4d360b2cbf20cf7aa5139ad2ec Mon Sep 17 00:00:00 2001
From: Pavel Parkhomenko <Pavel.Parkhomenko(a)baikalelectronics.ru>
Date: Sun, 6 Feb 2022 00:49:51 +0300
Subject: [PATCH] net: phy: marvell: Fix MDI-x polarity setting in
88e1118-compatible PHYs
When setting up autonegotiation for 88E1118R and compatible PHYs,
a software reset of PHY is issued before setting up polarity.
This is incorrect as changes of MDI Crossover Mode bits are
disruptive to the normal operation and must be followed by a
software reset to take effect. Let's patch m88e1118_config_aneg()
to fix the issue mentioned before by invoking software reset
of the PHY just after setting up MDI-x polarity.
Fixes: 605f196efbf8 ("phy: Add support for Marvell 88E1118 PHY")
Signed-off-by: Pavel Parkhomenko <Pavel.Parkhomenko(a)baikalelectronics.ru>
Reviewed-by: Serge Semin <fancer.lancer(a)gmail.com>
Suggested-by: Andrew Lunn <andrew(a)lunn.ch>
Cc: stable(a)vger.kernel.org
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index fa71fb7a66b5..ab063961ac00 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -1213,16 +1213,15 @@ static int m88e1118_config_aneg(struct phy_device *phydev)
{
int err;
- err = genphy_soft_reset(phydev);
+ err = marvell_set_polarity(phydev, phydev->mdix_ctrl);
if (err < 0)
return err;
- err = marvell_set_polarity(phydev, phydev->mdix_ctrl);
+ err = genphy_config_aneg(phydev);
if (err < 0)
return err;
- err = genphy_config_aneg(phydev);
- return 0;
+ return genphy_soft_reset(phydev);
}
static int m88e1118_config_init(struct phy_device *phydev)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8375dfac4f683e1b2c5956d919d36aeedad46699 Mon Sep 17 00:00:00 2001
From: Oliver Hartkopp <socketcan(a)hartkopp.net>
Date: Wed, 9 Feb 2022 08:36:01 +0100
Subject: [PATCH] can: isotp: fix error path in isotp_sendmsg() to unlock wait
queue
Commit 43a08c3bdac4 ("can: isotp: isotp_sendmsg(): fix TX buffer concurrent
access in isotp_sendmsg()") introduced a new locking scheme that may render
the userspace application in a locking state when an error is detected.
This issue shows up under high load on simultaneously running isotp channels
with identical configuration which is against the ISO specification and
therefore breaks any reasonable PDU communication anyway.
Fixes: 43a08c3bdac4 ("can: isotp: isotp_sendmsg(): fix TX buffer concurrent access in isotp_sendmsg()")
Link: https://lore.kernel.org/all/20220209073601.25728-1-socketcan@hartkopp.net
Cc: stable(a)vger.kernel.org
Cc: Ziyang Xuan <william.xuanziyang(a)huawei.com>
Signed-off-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9149e8d8aefc..d2a430b6a13b 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -887,7 +887,7 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
if (!size || size > MAX_MSG_LENGTH) {
err = -EINVAL;
- goto err_out;
+ goto err_out_drop;
}
/* take care of a potential SF_DL ESC offset for TX_DL > 8 */
@@ -897,24 +897,24 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
if ((so->opt.flags & CAN_ISOTP_SF_BROADCAST) &&
(size > so->tx.ll_dl - SF_PCI_SZ4 - ae - off)) {
err = -EINVAL;
- goto err_out;
+ goto err_out_drop;
}
err = memcpy_from_msg(so->tx.buf, msg, size);
if (err < 0)
- goto err_out;
+ goto err_out_drop;
dev = dev_get_by_index(sock_net(sk), so->ifindex);
if (!dev) {
err = -ENXIO;
- goto err_out;
+ goto err_out_drop;
}
skb = sock_alloc_send_skb(sk, so->ll.mtu + sizeof(struct can_skb_priv),
msg->msg_flags & MSG_DONTWAIT, &err);
if (!skb) {
dev_put(dev);
- goto err_out;
+ goto err_out_drop;
}
can_skb_reserve(skb);
@@ -976,7 +976,7 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
if (err) {
pr_notice_once("can-isotp: %s: can_send_ret %pe\n",
__func__, ERR_PTR(err));
- goto err_out;
+ goto err_out_drop;
}
if (wait_tx_done) {
@@ -989,6 +989,9 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
return size;
+err_out_drop:
+ /* drop this PDU and unlock a potential wait queue */
+ old_state = ISOTP_IDLE;
err_out:
so->tx.state = old_state;
if (so->tx.state == ISOTP_IDLE)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bb8e52e4906f148c2faf6656b5106cf7233e9301 Mon Sep 17 00:00:00 2001
From: Roberto Sassu <roberto.sassu(a)huawei.com>
Date: Mon, 31 Jan 2022 18:11:39 +0100
Subject: [PATCH] ima: Allow template selection with ima_template[_fmt]= after
ima_hash=
Commit c2426d2ad5027 ("ima: added support for new kernel cmdline parameter
ima_template_fmt") introduced an additional check on the ima_template
variable to avoid multiple template selection.
Unfortunately, ima_template could be also set by the setup function of the
ima_hash= parameter, when it calls ima_template_desc_current(). This causes
attempts to choose a new template with ima_template= or with
ima_template_fmt=, after ima_hash=, to be ignored.
Achieve the goal of the commit mentioned with the new static variable
template_setup_done, so that template selection requests after ima_hash=
are not ignored.
Finally, call ima_init_template_list(), if not already done, to initialize
the list of templates before lookup_template_desc() is called.
Reported-by: Guo Zihua <guozihua(a)huawei.com>
Signed-off-by: Roberto Sassu <roberto.sassu(a)huawei.com>
Cc: stable(a)vger.kernel.org
Fixes: c2426d2ad5027 ("ima: added support for new kernel cmdline parameter ima_template_fmt")
Signed-off-by: Mimi Zohar <zohar(a)linux.ibm.com>
diff --git a/security/integrity/ima/ima_template.c b/security/integrity/ima/ima_template.c
index 694560396be0..db1ad6d7a57f 100644
--- a/security/integrity/ima/ima_template.c
+++ b/security/integrity/ima/ima_template.c
@@ -29,6 +29,7 @@ static struct ima_template_desc builtin_templates[] = {
static LIST_HEAD(defined_templates);
static DEFINE_SPINLOCK(template_list);
+static int template_setup_done;
static const struct ima_template_field supported_fields[] = {
{.field_id = "d", .field_init = ima_eventdigest_init,
@@ -101,10 +102,11 @@ static int __init ima_template_setup(char *str)
struct ima_template_desc *template_desc;
int template_len = strlen(str);
- if (ima_template)
+ if (template_setup_done)
return 1;
- ima_init_template_list();
+ if (!ima_template)
+ ima_init_template_list();
/*
* Verify that a template with the supplied name exists.
@@ -128,6 +130,7 @@ static int __init ima_template_setup(char *str)
}
ima_template = template_desc;
+ template_setup_done = 1;
return 1;
}
__setup("ima_template=", ima_template_setup);
@@ -136,7 +139,7 @@ static int __init ima_template_fmt_setup(char *str)
{
int num_templates = ARRAY_SIZE(builtin_templates);
- if (ima_template)
+ if (template_setup_done)
return 1;
if (template_desc_init_fields(str, NULL, NULL) < 0) {
@@ -147,6 +150,7 @@ static int __init ima_template_fmt_setup(char *str)
builtin_templates[num_templates - 1].fmt = str;
ima_template = builtin_templates + num_templates - 1;
+ template_setup_done = 1;
return 1;
}
Dzień dobry,
dostrzegam możliwość współpracy z Państwa firmą.
Świadczymy kompleksową obsługę inwestycji w fotowoltaikę, która obniża koszty energii elektrycznej nawet o 90%.
Czy są Państwo zainteresowani weryfikacją wstępnych propozycji?
Pozdrawiam
Norbert Karecki
I'm announcing the release of the 5.16.9 kernel.
All users of the 5.16 kernel series must upgrade.
The updated 5.16.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.16.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/s390/kvm/kvm-s390.c | 2 ++
crypto/algapi.c | 1 +
crypto/api.c | 1 -
drivers/ata/libata-core.c | 14 ++++++--------
drivers/mmc/host/moxart-mmc.c | 2 +-
fs/ksmbd/smb2pdu.c | 2 +-
include/linux/ata.h | 2 +-
net/tipc/link.c | 9 +++++++--
net/tipc/monitor.c | 2 ++
10 files changed, 22 insertions(+), 15 deletions(-)
Damien Le Moal (1):
ata: libata-core: Fix ata_dev_config_cpr()
Greg Kroah-Hartman (2):
moxart: fix potential use-after-free on remove path
Linux 5.16.9
Herbert Xu (1):
crypto: api - Move cryptomgr soft dependency into algapi
Janis Schoetterl-Glausch (1):
KVM: s390: Return error on SIDA memop on normal guest
Jon Maloy (1):
tipc: improve size validations for received domain records
Namjae Jeon (1):
ksmbd: fix SMB 3.11 posix extension mount failure
I'm announcing the release of the 5.15.23 kernel.
All users of the 5.15 kernel series must upgrade.
The updated 5.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.15.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/arm64/include/asm/cputype.h | 2 ++
arch/s390/kvm/kvm-s390.c | 2 ++
crypto/algapi.c | 1 +
crypto/api.c | 1 -
drivers/mmc/host/moxart-mmc.c | 2 +-
fs/ksmbd/smb2pdu.c | 2 +-
net/tipc/link.c | 9 +++++++--
net/tipc/monitor.c | 2 ++
9 files changed, 17 insertions(+), 6 deletions(-)
Anshuman Khandual (1):
arm64: Add Cortex-A510 CPU part definition
Greg Kroah-Hartman (2):
moxart: fix potential use-after-free on remove path
Linux 5.15.23
Herbert Xu (1):
crypto: api - Move cryptomgr soft dependency into algapi
Janis Schoetterl-Glausch (1):
KVM: s390: Return error on SIDA memop on normal guest
Jon Maloy (1):
tipc: improve size validations for received domain records
Namjae Jeon (1):
ksmbd: fix SMB 3.11 posix extension mount failure
I'm announcing the release of the 5.10.100 kernel.
All users of the 5.10 kernel series must upgrade.
The updated 5.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.10.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/s390/kvm/kvm-s390.c | 2 ++
crypto/algapi.c | 1 +
crypto/api.c | 1 -
drivers/mmc/host/moxart-mmc.c | 2 +-
net/tipc/link.c | 9 +++++++--
net/tipc/monitor.c | 2 ++
7 files changed, 14 insertions(+), 5 deletions(-)
Greg Kroah-Hartman (2):
moxart: fix potential use-after-free on remove path
Linux 5.10.100
Herbert Xu (1):
crypto: api - Move cryptomgr soft dependency into algapi
Janis Schoetterl-Glausch (1):
KVM: s390: Return error on SIDA memop on normal guest
Jon Maloy (1):
tipc: improve size validations for received domain records
I'm announcing the release of the 5.4.179 kernel.
All users of the 5.4 kernel series must upgrade.
The updated 5.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
drivers/mmc/host/moxart-mmc.c | 2 +-
net/tipc/link.c | 10 +++++++---
net/tipc/monitor.c | 2 ++
4 files changed, 11 insertions(+), 5 deletions(-)
Greg Kroah-Hartman (2):
moxart: fix potential use-after-free on remove path
Linux 5.4.179
Jon Maloy (1):
tipc: improve size validations for received domain records
I'm announcing the release of the 4.19.229 kernel.
All users of the 4.19 kernel series must upgrade.
The updated 4.19.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.19.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
drivers/mmc/host/moxart-mmc.c | 2 +-
kernel/cgroup/cgroup-v1.c | 24 ++++++++++++++++++++++++
net/tipc/link.c | 5 ++++-
net/tipc/monitor.c | 2 ++
5 files changed, 32 insertions(+), 3 deletions(-)
Eric W. Biederman (1):
cgroup-v1: Require capabilities to set release_agent
Greg Kroah-Hartman (2):
moxart: fix potential use-after-free on remove path
Linux 4.19.229
Jon Maloy (1):
tipc: improve size validations for received domain records
I'm announcing the release of the 4.14.266 kernel.
All users of the 4.14 kernel series must upgrade.
The updated 4.14.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.14.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/mcheck/mce.c | 2 +-
drivers/mmc/host/moxart-mmc.c | 2 +-
kernel/cgroup/cgroup-v1.c | 24 ++++++++++++++++++++++++
net/tipc/link.c | 5 ++++-
net/tipc/monitor.c | 2 ++
6 files changed, 33 insertions(+), 4 deletions(-)
Eric W. Biederman (1):
cgroup-v1: Require capabilities to set release_agent
Greg Kroah-Hartman (2):
moxart: fix potential use-after-free on remove path
Linux 4.14.266
Jon Maloy (1):
tipc: improve size validations for received domain records
luofei (1):
x86/mm, mm/hwpoison: Fix the unmap kernel 1:1 pages check condition
I'm announcing the release of the 4.9.301 kernel.
All users of the 4.9 kernel series must upgrade.
The updated 4.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
drivers/mmc/host/moxart-mmc.c | 2 +-
kernel/cgroup.c | 26 ++++++++++++++++++++++++++
net/tipc/link.c | 5 ++++-
net/tipc/monitor.c | 2 ++
5 files changed, 34 insertions(+), 3 deletions(-)
Eric W. Biederman (1):
cgroup-v1: Require capabilities to set release_agent
Greg Kroah-Hartman (2):
moxart: fix potential use-after-free on remove path
Linux 4.9.301
Jon Maloy (1):
tipc: improve size validations for received domain records
The patch titled
Subject: mm/hugetlb: fix kernel crash with hugetlb mremap
has been added to the -mm tree. Its filename is
mm-hugetlb-fix-kernel-crash-with-hugetlb-mremap.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.ibm.com>
Subject: mm/hugetlb: fix kernel crash with hugetlb mremap
This fixes the below crash:
kernel BUG at include/linux/mm.h:2373!
cpu 0x5d: Vector: 700 (Program Check) at [c00000003c6e76e0]
pc: c000000000581a54: pmd_to_page+0x54/0x80
lr: c00000000058d184: move_hugetlb_page_tables+0x4e4/0x5b0
sp: c00000003c6e7980
msr: 9000000000029033
current = 0xc00000003bd8d980
paca = 0xc000200fff610100 irqmask: 0x03 irq_happened: 0x01
pid = 9349, comm = hugepage-mremap
kernel BUG at include/linux/mm.h:2373!
[link register ] c00000000058d184 move_hugetlb_page_tables+0x4e4/0x5b0
[c00000003c6e7980] c00000000058cecc move_hugetlb_page_tables+0x22c/0x5b0 (unreliable)
[c00000003c6e7a90] c00000000053b78c move_page_tables+0xdbc/0x1010
[c00000003c6e7bd0] c00000000053bc34 move_vma+0x254/0x5f0
[c00000003c6e7c90] c00000000053c790 sys_mremap+0x7c0/0x900
[c00000003c6e7db0] c00000000002c450 system_call_exception+0x160/0x2c0
the kernel can't use huge_pte_offset before it set the pte entry because a
page table lookup check for huge PTE bit in the page table to
differentiate between a huge pte entry and a pointer to pte page. A
huge_pte_alloc won't mark the page table entry huge and hence kernel
should not use huge_pte_offset after a huge_pte_alloc.
Link: https://lkml.kernel.org/r/20220211063221.99293-1-aneesh.kumar@linux.ibm.com
Fixes: 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed vma")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Mina Almasry <almasrymina(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/hugetlb.c~mm-hugetlb-fix-kernel-crash-with-hugetlb-mremap
+++ a/mm/hugetlb.c
@@ -4851,14 +4851,13 @@ again:
}
static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
- unsigned long new_addr, pte_t *src_pte)
+ unsigned long new_addr, pte_t *src_pte, pte_t *dst_pte)
{
struct hstate *h = hstate_vma(vma);
struct mm_struct *mm = vma->vm_mm;
- pte_t *dst_pte, pte;
spinlock_t *src_ptl, *dst_ptl;
+ pte_t pte;
- dst_pte = huge_pte_offset(mm, new_addr, huge_page_size(h));
dst_ptl = huge_pte_lock(h, mm, dst_pte);
src_ptl = huge_pte_lockptr(h, mm, src_pte);
@@ -4917,7 +4916,7 @@ int move_hugetlb_page_tables(struct vm_a
if (!dst_pte)
break;
- move_huge_pte(vma, old_addr, new_addr, src_pte);
+ move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte);
}
flush_tlb_range(vma, old_end - len, old_end);
mmu_notifier_invalidate_range_end(&range);
_
Patches currently in -mm which might be from aneesh.kumar(a)linux.ibm.com are
mm-hugetlb-fix-kernel-crash-with-hugetlb-mremap.patch
This is the start of the stable review cycle for the 5.16.9 release.
There are 5 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Feb 2022 19:12:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.16.9-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.16.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.16.9-rc1
Herbert Xu <herbert(a)gondor.apana.org.au>
crypto: api - Move cryptomgr soft dependency into algapi
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix SMB 3.11 posix extension mount failure
Janis Schoetterl-Glausch <scgl(a)linux.ibm.com>
KVM: s390: Return error on SIDA memop on normal guest
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
moxart: fix potential use-after-free on remove path
Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
ata: libata-core: Fix ata_dev_config_cpr()
-------------
Diffstat:
Makefile | 4 ++--
arch/s390/kvm/kvm-s390.c | 2 ++
crypto/algapi.c | 1 +
crypto/api.c | 1 -
drivers/ata/libata-core.c | 14 ++++++--------
drivers/mmc/host/moxart-mmc.c | 2 +-
fs/ksmbd/smb2pdu.c | 2 +-
include/linux/ata.h | 2 +-
8 files changed, 14 insertions(+), 14 deletions(-)
I urgently seek your service to represent me in investing in
your region / country and you will be rewarded for your service without
affecting your present job with very little time invested in it, which you will
be communicated in details upon response.
My dearest regards
Seyba Daniel
In the rework of btrfs_defrag_file(), we always call
defrag_one_cluster() and increase the offset by cluster size, which is
only 256K.
But there are cases where we have a large extent (e.g. 128M) which
doesn't need to be defragged at all.
Before the refactor, we can directly skip the range, but now we have to
scan that extent map again and again until the cluster moves after the
non-target extent.
Fix the problem by allow defrag_one_cluster() to increase
btrfs_defrag_ctrl::last_scanned to the end of an extent, if and only if
the last extent of the cluster is not a target.
The test script looks like this:
mkfs.btrfs -f $dev > /dev/null
mount $dev $mnt
# As btrfs ioctl uses 32M as extent_threshold
xfs_io -f -c "pwrite 0 64M" $mnt/file1
sync
# Some fragemented range to defrag
xfs_io -s -c "pwrite 65548k 4k" \
-c "pwrite 65544k 4k" \
-c "pwrite 65540k 4k" \
-c "pwrite 65536k 4k" \
$mnt/file1
sync
echo "=== before ==="
xfs_io -c "fiemap -v" $mnt/file1
echo "=== after ==="
btrfs fi defrag $mnt/file1
sync
xfs_io -c "fiemap -v" $mnt/file1
umount $mnt
With extra ftrace put into defrag_one_cluster(), before the patch it
would result tons of loops:
(As defrag_one_cluster() is inlined, the function name is its caller)
btrfs-126062 [005] ..... 4682.816026: btrfs_defrag_file: r/i=5/257 start=0 len=262144
btrfs-126062 [005] ..... 4682.816027: btrfs_defrag_file: r/i=5/257 start=262144 len=262144
btrfs-126062 [005] ..... 4682.816028: btrfs_defrag_file: r/i=5/257 start=524288 len=262144
btrfs-126062 [005] ..... 4682.816028: btrfs_defrag_file: r/i=5/257 start=786432 len=262144
btrfs-126062 [005] ..... 4682.816028: btrfs_defrag_file: r/i=5/257 start=1048576 len=262144
...
btrfs-126062 [005] ..... 4682.816043: btrfs_defrag_file: r/i=5/257 start=67108864 len=262144
But with this patch there will be just one loop, then directly to the
end of the extent:
btrfs-130471 [014] ..... 5434.029558: defrag_one_cluster: r/i=5/257 start=0 len=262144
btrfs-130471 [014] ..... 5434.029559: defrag_one_cluster: r/i=5/257 start=67108864 len=16384
Cc: stable(a)vger.kernel.org # 5.16
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
---
fs/btrfs/ioctl.c | 50 ++++++++++++++++++++++++++++++++++++++----------
1 file changed, 40 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e7a284239393..fa4d29026275 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1204,9 +1204,11 @@ struct defrag_target_range {
*/
static int defrag_collect_targets(struct btrfs_inode *inode,
u64 start, u64 len, u32 extent_thresh,
- u64 newer_than, bool do_compress,
- bool locked, struct list_head *target_list)
+ u64 newer_than, bool do_compress, bool locked,
+ struct list_head *target_list,
+ u64 *last_scanned_ret)
{
+ bool last_is_target = false;
u64 cur = start;
int ret = 0;
@@ -1216,6 +1218,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
bool next_mergeable = true;
u64 range_len;
+ last_is_target = false;
em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
if (!em)
break;
@@ -1298,6 +1301,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
}
add:
+ last_is_target = true;
range_len = min(extent_map_end(em), start + len) - cur;
/*
* This one is a good target, check if it can be merged into
@@ -1341,6 +1345,17 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
kfree(entry);
}
}
+ if (!ret && last_scanned_ret) {
+ /*
+ * If the last extent is not a target, the caller can skip to
+ * the end of that extent.
+ * Otherwise, we can only go the end of the spcified range.
+ */
+ if (!last_is_target)
+ *last_scanned_ret = max(cur, *last_scanned_ret);
+ else
+ *last_scanned_ret = max(start + len, *last_scanned_ret);
+ }
return ret;
}
@@ -1400,7 +1415,8 @@ static int defrag_one_locked_target(struct btrfs_inode *inode,
}
static int defrag_one_range(struct btrfs_inode *inode, u64 start, u32 len,
- u32 extent_thresh, u64 newer_than, bool do_compress)
+ u32 extent_thresh, u64 newer_than, bool do_compress,
+ u64 *last_scanned_ret)
{
struct extent_state *cached_state = NULL;
struct defrag_target_range *entry;
@@ -1446,7 +1462,7 @@ static int defrag_one_range(struct btrfs_inode *inode, u64 start, u32 len,
*/
ret = defrag_collect_targets(inode, start, len, extent_thresh,
newer_than, do_compress, true,
- &target_list);
+ &target_list, last_scanned_ret);
if (ret < 0)
goto unlock_extent;
@@ -1481,7 +1497,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
u64 start, u32 len, u32 extent_thresh,
u64 newer_than, bool do_compress,
unsigned long *sectors_defragged,
- unsigned long max_sectors)
+ unsigned long max_sectors,
+ u64 *last_scanned_ret)
{
const u32 sectorsize = inode->root->fs_info->sectorsize;
struct defrag_target_range *entry;
@@ -1491,7 +1508,7 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
ret = defrag_collect_targets(inode, start, len, extent_thresh,
newer_than, do_compress, false,
- &target_list);
+ &target_list, NULL);
if (ret < 0)
goto out;
@@ -1508,6 +1525,15 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
range_len = min_t(u32, range_len,
(max_sectors - *sectors_defragged) * sectorsize);
+ /*
+ * If defrag_one_range() has updated last_scanned_ret,
+ * our range may already be invalid (e.g. hole punched).
+ * Skip if our range is before last_scanned_ret, as there is
+ * no need to defrag the range anymore.
+ */
+ if (entry->start + range_len <= *last_scanned_ret)
+ continue;
+
if (ra)
page_cache_sync_readahead(inode->vfs_inode.i_mapping,
ra, NULL, entry->start >> PAGE_SHIFT,
@@ -1520,7 +1546,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
* accounting.
*/
ret = defrag_one_range(inode, entry->start, range_len,
- extent_thresh, newer_than, do_compress);
+ extent_thresh, newer_than, do_compress,
+ last_scanned_ret);
if (ret < 0)
break;
*sectors_defragged += range_len >>
@@ -1531,6 +1558,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
list_del_init(&entry->list);
kfree(entry);
}
+ if (ret >= 0)
+ *last_scanned_ret = max(*last_scanned_ret, start + len);
return ret;
}
@@ -1616,6 +1645,7 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
while (cur < last_byte) {
const unsigned long prev_sectors_defragged = sectors_defragged;
+ u64 last_scanned = cur;
u64 cluster_end;
if (btrfs_defrag_cancelled(fs_info)) {
@@ -1642,8 +1672,8 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
BTRFS_I(inode)->defrag_compress = compress_type;
ret = defrag_one_cluster(BTRFS_I(inode), ra, cur,
cluster_end + 1 - cur, extent_thresh,
- newer_than, do_compress,
- §ors_defragged, max_to_defrag);
+ newer_than, do_compress, §ors_defragged,
+ max_to_defrag, &last_scanned);
if (sectors_defragged > prev_sectors_defragged)
balance_dirty_pages_ratelimited(inode->i_mapping);
@@ -1651,7 +1681,7 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
btrfs_inode_unlock(inode, 0);
if (ret < 0)
break;
- cur = cluster_end + 1;
+ cur = max(cluster_end + 1, last_scanned);
if (ret > 0) {
ret = 0;
break;
--
2.35.0
This is the start of the stable review cycle for the 5.15.23 release.
There are 5 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Feb 2022 19:12:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.23-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.23-rc1
Herbert Xu <herbert(a)gondor.apana.org.au>
crypto: api - Move cryptomgr soft dependency into algapi
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix SMB 3.11 posix extension mount failure
Janis Schoetterl-Glausch <scgl(a)linux.ibm.com>
KVM: s390: Return error on SIDA memop on normal guest
Anshuman Khandual <anshuman.khandual(a)arm.com>
arm64: Add Cortex-A510 CPU part definition
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
moxart: fix potential use-after-free on remove path
-------------
Diffstat:
Makefile | 4 ++--
arch/arm64/include/asm/cputype.h | 2 ++
arch/s390/kvm/kvm-s390.c | 2 ++
crypto/algapi.c | 1 +
crypto/api.c | 1 -
drivers/mmc/host/moxart-mmc.c | 2 +-
fs/ksmbd/smb2pdu.c | 2 +-
7 files changed, 9 insertions(+), 5 deletions(-)
This is the start of the stable review cycle for the 5.10.100 release.
There are 3 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Feb 2022 19:12:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.100-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.100-rc1
Herbert Xu <herbert(a)gondor.apana.org.au>
crypto: api - Move cryptomgr soft dependency into algapi
Janis Schoetterl-Glausch <scgl(a)linux.ibm.com>
KVM: s390: Return error on SIDA memop on normal guest
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
moxart: fix potential use-after-free on remove path
-------------
Diffstat:
Makefile | 4 ++--
arch/s390/kvm/kvm-s390.c | 2 ++
crypto/algapi.c | 1 +
crypto/api.c | 1 -
drivers/mmc/host/moxart-mmc.c | 2 +-
5 files changed, 6 insertions(+), 4 deletions(-)
Vijay reported that the "unclobbered_vdso_oversubscribed" selftest
triggers the softlockup detector.
Actual SGX systems have 128GB of enclave memory or more. The
"unclobbered_vdso_oversubscribed" selftest creates one enclave which
consumes all of the enclave memory on the system. Tearing down such a
large enclave takes around a minute, most of it in the loop where
the EREMOVE instruction is applied to each individual 4k enclave page.
Spending one minute in a loop triggers the softlockup detector.
Add a cond_resched() to give other tasks a chance to run and placate
the softlockup detector.
Cc: stable(a)vger.kernel.org
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Reported-by: Vijay Dhanraj <vijay.dhanraj(a)intel.com>
Acked-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Tested-by: Jarkko Sakkinen <jarkko(a)kernel.org> (kselftest as sanity check)
Signed-off-by: Reinette Chatre <reinette.chatre(a)intel.com>
---
Softlockup message:
watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [test_sgx:11502]
Kernel panic - not syncing: softlockup: hung tasks
<snip>
sgx_encl_release+0x86/0x1c0
sgx_release+0x11c/0x130
__fput+0xb0/0x280
____fput+0xe/0x10
task_work_run+0x6c/0xc0
exit_to_user_mode_prepare+0x1eb/0x1f0
syscall_exit_to_user_mode+0x1d/0x50
do_syscall_64+0x46/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Changes since V2:
- V2: https://lore.kernel.org/lkml/b5e9f218064aa76e3026f778e1ad0a1d823e3db8.16431…
- Add Jarkko's "Reviewed-by" and "Tested-by" tags.
Changes since V1:
- V1: https://lore.kernel.org/lkml/1aa037705e5aa209d8b7a075873c6b4190327436.16425…
- Add comment provided by Jarkko.
arch/x86/kernel/cpu/sgx/encl.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..48afe96ae0f0 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -410,6 +410,8 @@ void sgx_encl_release(struct kref *ref)
}
kfree(entry);
+ /* Invoke scheduler to prevent soft lockups. */
+ cond_resched();
}
xa_destroy(&encl->page_array);
--
2.25.1
This is the start of the stable review cycle for the 4.19.229 release.
There are 2 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Feb 2022 19:12:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.229-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.229-rc1
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
moxart: fix potential use-after-free on remove path
Eric W. Biederman <ebiederm(a)xmission.com>
cgroup-v1: Require capabilities to set release_agent
-------------
Diffstat:
Makefile | 4 ++--
drivers/mmc/host/moxart-mmc.c | 2 +-
kernel/cgroup/cgroup-v1.c | 24 ++++++++++++++++++++++++
3 files changed, 27 insertions(+), 3 deletions(-)
This is the start of the stable review cycle for the 5.4.179 release.
There are 1 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Feb 2022 19:12:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.179-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.179-rc1
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
moxart: fix potential use-after-free on remove path
-------------
Diffstat:
Makefile | 4 ++--
drivers/mmc/host/moxart-mmc.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
This is the start of the stable review cycle for the 4.14.266 release.
There are 3 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Feb 2022 19:12:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.266-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.266-rc1
luofei <luofei(a)unicloud.com>
x86/mm, mm/hwpoison: Fix the unmap kernel 1:1 pages check condition
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
moxart: fix potential use-after-free on remove path
Eric W. Biederman <ebiederm(a)xmission.com>
cgroup-v1: Require capabilities to set release_agent
-------------
Diffstat:
Makefile | 4 ++--
arch/x86/kernel/cpu/mcheck/mce.c | 2 +-
drivers/mmc/host/moxart-mmc.c | 2 +-
kernel/cgroup/cgroup-v1.c | 24 ++++++++++++++++++++++++
4 files changed, 28 insertions(+), 4 deletions(-)
This is the start of the stable review cycle for the 4.9.301 release.
There are 2 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 11 Feb 2022 19:12:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.301-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.301-rc1
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
moxart: fix potential use-after-free on remove path
Eric W. Biederman <ebiederm(a)xmission.com>
cgroup-v1: Require capabilities to set release_agent
-------------
Diffstat:
Makefile | 4 ++--
drivers/mmc/host/moxart-mmc.c | 2 +-
kernel/cgroup.c | 26 ++++++++++++++++++++++++++
3 files changed, 29 insertions(+), 3 deletions(-)
We found a failure when used iperf tool for wifi performance testing,
there are some MSIs received while clearing the interrupt status,
these MSIs cannot be serviced.
The interrupt status can be cleared even the MSI status still remaining,
as an edge-triggered interrupts, its interrupt status should be cleared
before dispatching to the handler of device.
Signed-off-by: qizhong cheng <qizhong.cheng(a)mediatek.com>
---
v2:
- Update the subject line.
- Improve the commit log and code comments.
drivers/pci/controller/pcie-mediatek.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pcie-mediatek.c b/drivers/pci/controller/pcie-mediatek.c
index 2f3f974977a3..2856d74b2513 100644
--- a/drivers/pci/controller/pcie-mediatek.c
+++ b/drivers/pci/controller/pcie-mediatek.c
@@ -624,12 +624,17 @@ static void mtk_pcie_intr_handler(struct irq_desc *desc)
if (status & MSI_STATUS){
unsigned long imsi_status;
+ /*
+ * The interrupt status can be cleared even the MSI
+ * status still remaining, hence as an edge-triggered
+ * interrupts, its interrupt status should be cleared
+ * before dispatching handler.
+ */
+ writel(MSI_STATUS, port->base + PCIE_INT_STATUS);
while ((imsi_status = readl(port->base + PCIE_IMSI_STATUS))) {
for_each_set_bit(bit, &imsi_status, MTK_MSI_IRQS_NUM)
generic_handle_domain_irq(port->inner_domain, bit);
}
- /* Clear MSI interrupt status */
- writel(MSI_STATUS, port->base + PCIE_INT_STATUS);
}
}
--
2.25.1
A kernel exception was hit when trying to dump /proc/lockdep_chains after
lockdep report "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!":
Unable to handle kernel paging request at virtual address 00054005450e05c3
...
00054005450e05c3] address between user and kernel address ranges
...
pc : [0xffffffece769b3a8] string+0x50/0x10c
lr : [0xffffffece769ac88] vsnprintf+0x468/0x69c
...
Call trace:
string+0x50/0x10c
vsnprintf+0x468/0x69c
seq_printf+0x8c/0xd8
print_name+0x64/0xf4
lc_show+0xb8/0x128
seq_read_iter+0x3cc/0x5fc
proc_reg_read_iter+0xdc/0x1d4
The cause of the problem is the function lock_chain_get_class() will
shift lock_classes index by 1, but the index don't need to be shifted
anymore since commit 01bb6f0af992 ("locking/lockdep: Change the range of
class_idx in held_lock struct") already change the index to start from
0.
The lock_classes[-1] located at chain_hlocks array. When printing
lock_classes[-1] after the chain_hlocks entries are modified, the
exception happened.
The output of lockdep_chains are incorrect due to this problem too.
Fixes: f611e8cf98ec ("lockdep: Take read/write status in consideration when generate chainkey")
Signed-off-by: Cheng Jui Wang <cheng-jui.wang(a)mediatek.com>
---
kernel/locking/lockdep.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 4a882f83aeb9..f8a0212189ca 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3462,7 +3462,7 @@ struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
u16 chain_hlock = chain_hlocks[chain->base + i];
unsigned int class_idx = chain_hlock_class_idx(chain_hlock);
- return lock_classes + class_idx - 1;
+ return lock_classes + class_idx;
}
/*
@@ -3530,7 +3530,7 @@ static void print_chain_keys_chain(struct lock_chain *chain)
hlock_id = chain_hlocks[chain->base + i];
chain_key = print_chain_key_iteration(hlock_id, chain_key);
- print_lock_name(lock_classes + chain_hlock_class_idx(hlock_id) - 1);
+ print_lock_name(lock_classes + chain_hlock_class_idx(hlock_id));
printk("\n");
}
}
--
2.18.0
The mapping from enum port to whatever port numbering scheme is used by
the SWSCI Display Power State Notification is odd, and the memory of it
has faded. In any case, the parameter only has space for ports numbered
[0..4], and UBSAN reports bit shift beyond it when the platform has port
F or more.
Since the SWSCI functionality is supposed to be obsolete for new
platforms (i.e. ones that might have port F or more), just bail out
early if the mapped and mangled port number is beyond what the Display
Power State Notification can support.
Fixes: 9c4b0a683193 ("drm/i915: add opregion function to notify bios of encoder enable/disable")
Cc: <stable(a)vger.kernel.org> # v3.13+
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi(a)intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4800
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
---
drivers/gpu/drm/i915/display/intel_opregion.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
index af9d30f56cc1..ad1afe9df6c3 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.c
+++ b/drivers/gpu/drm/i915/display/intel_opregion.c
@@ -363,6 +363,21 @@ int intel_opregion_notify_encoder(struct intel_encoder *intel_encoder,
port++;
}
+ /*
+ * The port numbering and mapping here is bizarre. The now-obsolete
+ * swsci spec supports ports numbered [0..4]. Port E is handled as a
+ * special case, but port F and beyond are not. The functionality is
+ * supposed to be obsolete for new platforms. Just bail out if the port
+ * number is out of bounds after mapping.
+ */
+ if (port > 4) {
+ drm_dbg_kms(&dev_priv->drm,
+ "[ENCODER:%d:%s] port %c (index %u) out of bounds for display power state notification\n",
+ intel_encoder->base.base.id, intel_encoder->base.name,
+ port_name(intel_encoder->port), port);
+ return -EINVAL;
+ }
+
if (!enable)
parm |= 4 << 8;
--
2.30.2
In rare cases the display is flipped or mirrored. This was observed more
often in a low temperature environment. A clean reset on init_display()
should help to get registers in a sane state.
Fixes: ef8f317795da (staging: fbtft: use init function instead of init sequence)
Cc: stable(a)vger.kernel.org
Signed-off-by: Oliver Graute <oliver.graute(a)kococonnector.com>
---
drivers/staging/fbtft/fb_st7789v.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/staging/fbtft/fb_st7789v.c b/drivers/staging/fbtft/fb_st7789v.c
index 3a280cc1892c..0a2dbed9ffc7 100644
--- a/drivers/staging/fbtft/fb_st7789v.c
+++ b/drivers/staging/fbtft/fb_st7789v.c
@@ -82,6 +82,8 @@ enum st7789v_command {
{
int rc;
+ par->fbtftops.reset(par);
+
rc = init_tearing_effect_line(par);
if (rc)
return rc;
--
2.17.1
From: Hongyu Xie <xiehongyu1(a)kylinos.cn>
xhci_check_args returns 4 types of value, -ENODEV, -EINVAL, 1 and 0.
xhci_urb_enqueue and xhci_check_streams_endpoint return -EINVAL if
the return value of xhci_check_args <= 0.
This will cause a problem.
For example, r8152_submit_rx calling usb_submit_urb in
drivers/net/usb/r8152.c.
r8152_submit_rx will never get -ENODEV after submiting an urb
when xHC is halted,
because xhci_urb_enqueue returns -EINVAL in the very beginning.
Fixes: 203a86613fb3 ("xhci: Avoid NULL pointer deref when host dies.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hongyu Xie <xiehongyu1(a)kylinos.cn>
---
v2: keep return value to -EINVAL for roothub urbs
drivers/usb/host/xhci.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index dc357cabb265..948546b98af0 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1604,9 +1604,12 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
struct urb_priv *urb_priv;
int num_tds;
- if (!urb || xhci_check_args(hcd, urb->dev, urb->ep,
- true, true, __func__) <= 0)
+ if (!urb)
return -EINVAL;
+ ret = xhci_check_args(hcd, urb->dev, urb->ep,
+ true, true, __func__);
+ if (ret <= 0)
+ return ret ? ret : -EINVAL;
slot_id = urb->dev->slot_id;
ep_index = xhci_get_endpoint_index(&urb->ep->desc);
@@ -3323,7 +3326,7 @@ static int xhci_check_streams_endpoint(struct xhci_hcd *xhci,
return -EINVAL;
ret = xhci_check_args(xhci_to_hcd(xhci), udev, ep, 1, true, __func__);
if (ret <= 0)
- return -EINVAL;
+ return ret ? ret : -EINVAL;
if (usb_ss_max_streams(&ep->ss_ep_comp) == 0) {
xhci_warn(xhci, "WARN: SuperSpeed Endpoint Companion"
" descriptor for ep 0x%x does not support streams\n",
--
2.25.1
From: Johannes Berg <johannes.berg(a)intel.com>
If no firmware was present at all (or, presumably, all of the
firmware files failed to parse), we end up unbinding by calling
device_release_driver(), which calls remove(), which then in
iwlwifi calls iwl_drv_stop(), freeing the 'drv' struct. However
the new code I added will still erroneously access it after it
was freed.
Set 'failure=false' in this case to avoid the access, all data
was already freed anyway.
Cc: stable(a)vger.kernel.org
Reported-by: Stefan Agner <stefan(a)agner.ch>
Reported-by: Wolfgang Walter <linux(a)stwm.de>
Reported-by: Jason Self <jason(a)bluehome.net>
Reported-by: Dominik Behr <dominik(a)dominikbehr.com>
Reported-by: Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
Fixes: ab07506b0454 ("iwlwifi: fix leaks/bad data after failed firmware load")
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
---
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
index 83e3b731ad29..6651e78b39ec 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
@@ -1707,6 +1707,8 @@ static void iwl_req_fw_callback(const struct firmware *ucode_raw, void *context)
out_unbind:
complete(&drv->request_firmware_complete);
device_release_driver(drv->trans->dev);
+ /* drv has just been freed by the release */
+ failure = false;
free:
if (failure)
iwl_dealloc_ucode(drv);
--
2.34.1
From: Huacai Chen <chenhc(a)lemote.com>
commit f4d3da72a76a9ce5f57bba64788931686a9dc333 upstream.
In Mesa, dev_info.gart_page_size is used for alignment and it was
set to AMDGPU_GPU_PAGE_SIZE(4KB). However, the page table of AMDGPU
driver requires an alignment on CPU pages. So, for non-4KB page system,
gart_page_size should be max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE).
Signed-off-by: Rui Wang <wangr(a)lemote.com>
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
Link: https://github.com/loongson-community/linux-stable/commit/caa9c0a1
[Xi: rebased for drm-next, use max_t for checkpatch,
and reworded commit message.]
Signed-off-by: Xi Ruoyao <xry111(a)mengyan1223.wang>
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1549
Tested-by: Dan Horák <dan(a)danny.cz>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
[Salvatore Bonaccorso: Backport to 5.10.y which does not contain
a5a52a43eac0 ("drm/amd/amdgpu/amdgpu_kms: Remove 'struct
drm_amdgpu_info_device dev_info' from the stack") which removes dev_info
from the stack and places it on the heap.]
Tested-by: Timothy Pearson <tpearson(a)raptorengineering.com>
Signed-off-by: Salvatore Bonaccorso <carnil(a)debian.org>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index efda38349a03..917b94002f4b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -766,9 +766,9 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
dev_info.high_va_offset = AMDGPU_GMC_HOLE_END;
dev_info.high_va_max = AMDGPU_GMC_HOLE_END | vm_size;
}
- dev_info.virtual_address_alignment = max((int)PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE);
+ dev_info.virtual_address_alignment = max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE);
dev_info.pte_fragment_size = (1 << adev->vm_manager.fragment_size) * AMDGPU_GPU_PAGE_SIZE;
- dev_info.gart_page_size = AMDGPU_GPU_PAGE_SIZE;
+ dev_info.gart_page_size = max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE);
dev_info.cu_active_number = adev->gfx.cu_info.number;
dev_info.cu_ao_mask = adev->gfx.cu_info.ao_cu_mask;
dev_info.ce_ram_size = adev->gfx.ce_ram_size;
--
2.34.1
There is a hung_task issue with running generic/068 on an SMR
device. The hang occurs while a process is trying to thaw the
filesystem. The process is trying to take sb->s_umount to thaw the
FS. The lock is held by fsstress, which calls btrfs_sync_fs() and is
waiting for an ordered extent to finish. However, as the FS is frozen,
the ordered extent never finish.
Having an ordered extent while the FS is frozen is the root cause of
the hang. The ordered extent is initiated from btrfs_relocate_chunk()
which is called from btrfs_reclaim_bgs_work().
This commit add sb_*_write() around btrfs_relocate_chunk() call
site. For the usual "btrfs balance" command, we already call it with
mnt_want_file() in btrfs_ioctl_balance().
Additionally, add an ASSERT in btrfs_relocate_chunk() to check it is
properly called.
Fixes: 18bb8bbf13c1 ("btrfs: zoned: automatically reclaim zones")
Cc: stable(a)vger.kernel.org # 5.13+
Link: https://github.com/naota/linux/issues/56
Signed-off-by: Naohiro Aota <naohiro.aota(a)wdc.com>
---
fs/btrfs/block-group.c | 8 +++++++-
fs/btrfs/volumes.c | 6 ++++++
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 68feabc83a27..f9ca58a90630 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1513,8 +1513,12 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
if (!test_bit(BTRFS_FS_OPEN, &fs_info->flags))
return;
- if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE))
+ sb_start_write(fs_info->sb);
+
+ if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) {
+ sb_end_write(fs_info->sb);
return;
+ }
/*
* Long running balances can keep us blocked here for eternity, so
@@ -1522,6 +1526,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
*/
if (!mutex_trylock(&fs_info->reclaim_bgs_lock)) {
btrfs_exclop_finish(fs_info);
+ sb_end_write(fs_info->sb);
return;
}
@@ -1596,6 +1601,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
spin_unlock(&fs_info->unused_bgs_lock);
mutex_unlock(&fs_info->reclaim_bgs_lock);
btrfs_exclop_finish(fs_info);
+ sb_end_write(fs_info->sb);
}
void btrfs_reclaim_bgs(struct btrfs_fs_info *fs_info)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b07d382d53a8..e415fb4a698e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3251,6 +3251,9 @@ int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
u64 length;
int ret;
+ /* Assert we called sb_start_write(), not to race with FS freezing */
+ sb_assert_write_started(fs_info->sb);
+
/*
* Prevent races with automatic removal of unused block groups.
* After we relocate and before we remove the chunk with offset
@@ -8299,10 +8302,12 @@ static int relocating_repair_kthread(void *data)
target = cache->start;
btrfs_put_block_group(cache);
+ sb_start_write(fs_info->sb);
if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) {
btrfs_info(fs_info,
"zoned: skip relocating block group %llu to repair: EBUSY",
target);
+ sb_end_write(fs_info->sb);
return -EBUSY;
}
@@ -8330,6 +8335,7 @@ static int relocating_repair_kthread(void *data)
btrfs_put_block_group(cache);
mutex_unlock(&fs_info->reclaim_bgs_lock);
btrfs_exclop_finish(fs_info);
+ sb_end_write(fs_info->sb);
return ret;
}
--
2.35.1
On 11/24/21 8:28 AM, Jens Axboe wrote:
> On 11/23/21 8:27 PM, Daniel Black wrote:
>> On Mon, Nov 15, 2021 at 7:55 AM Jens Axboe <axboe(a)kernel.dk> wrote:
>>>
>>> On 11/14/21 1:33 PM, Daniel Black wrote:
>>>> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe(a)kernel.dk> wrote:
>>>>>
>>>>> Alright, give this one a go if you can. Against -git, but will apply to
>>>>> 5.15 as well.
>>>>
>>>>
>>>> Works. Thank you very much.
>>>>
>>>> https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.s…
>>>>
>>>> Tested-by: Marko Mäkelä <marko.makela(a)mariadb.com>
>>>
>>> The patch is already upstream (and in the 5.15 stable queue), and I
>>> provided 5.14 patches too.
>>
>> Jens,
>>
>> I'm getting the same reproducer on 5.14.20
>> (https://bugzilla.redhat.com/show_bug.cgi?id=2018882#c3) though the
>> backport change logs indicate 5.14.19 has the patch.
>>
>> Anything missing?
>
> We might also need another patch that isn't in stable, I'm attaching
> it here. Any chance you can run 5.14.20/21 with this applied? If not,
> I'll do some sanity checking here and push it to -stable.
Looks good to me - Greg, would you mind queueing this up for
5.14-stable?
--
Jens Axboe
The user pointer was being illegally dereferenced directly to get the
open_how flags data in audit_match_perm. Use the previously saved flags
data elsewhere in the context instead.
Coverage is provided by the audit-testsuite syscalls_file test case.
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/c96031b4-b76d-d82c-e232-1cccbbf71946@suse.com
Fixes: 1c30e3af8a79 ("audit: add support for the openat2 syscall")
Reported-by: Jeff Mahoney <jeffm(a)suse.com>
Signed-off-by: Richard Guy Briggs <rgb(a)redhat.com>
---
kernel/auditsc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index fce5d43a933f..81ab510a7be4 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -185,7 +185,7 @@ static int audit_match_perm(struct audit_context *ctx, int mask)
case AUDITSC_EXECVE:
return mask & AUDIT_PERM_EXEC;
case AUDITSC_OPENAT2:
- return mask & ACC_MODE((u32)((struct open_how *)ctx->argv[2])->flags);
+ return mask & ACC_MODE((u32)(ctx->openat2.flags));
default:
return 0;
}
--
2.27.0
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 097f1eefedeab528cecbd35586dfe293853ffb17 Mon Sep 17 00:00:00 2001
From: Tom Zanussi <zanussi(a)kernel.org>
Date: Thu, 27 Jan 2022 15:44:17 -0600
Subject: [PATCH] tracing: Propagate is_signed to expression
During expression parsing, a new expression field is created which
should inherit the properties of the operands, such as size and
is_signed.
is_signed propagation was missing, causing spurious errors with signed
operands. Add it in parse_expr() and parse_unary() to fix the problem.
Link: https://lkml.kernel.org/r/f4dac08742fd7a0920bf80a73c6c44042f5eaa40.16433197…
Cc: stable(a)vger.kernel.org
Fixes: 100719dcef447 ("tracing: Add simple expression support to hist triggers")
Reported-by: Yordan Karadzhov <ykaradzhov(a)vmware.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215513
Signed-off-by: Tom Zanussi <zanussi(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index b894d68082ea..ada87bfb5bb8 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -2503,6 +2503,8 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data,
(HIST_FIELD_FL_TIMESTAMP | HIST_FIELD_FL_TIMESTAMP_USECS);
expr->fn = hist_field_unary_minus;
expr->operands[0] = operand1;
+ expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = FIELD_OP_UNARY_MINUS;
expr->name = expr_str(expr, 0);
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
@@ -2719,6 +2721,7 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data,
/* The operand sizes should be the same, so just pick one */
expr->size = operand1->size;
+ expr->is_signed = operand1->is_signed;
expr->operator = field_op;
expr->type = kstrdup_const(operand1->type, GFP_KERNEL);
This is the start of the stable review cycle for the 4.19.228 release.
There are 86 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 09 Feb 2022 10:37:42 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.228-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.228-rc1
Ritesh Harjani <riteshh(a)linux.ibm.com>
ext4: fix error handling in ext4_restore_inline_data()
Sergey Shtylyov <s.shtylyov(a)omp.ru>
EDAC/xgene: Fix deferred probing
Sergey Shtylyov <s.shtylyov(a)omp.ru>
EDAC/altera: Fix deferred probing
Riwen Lu <luriwen(a)kylinos.cn>
rtc: cmos: Evaluate century appropriate
Muhammad Usama Anjum <usama.anjum(a)collabora.com>
selftests: futex: Use variable MAKE instead of make
Dai Ngo <dai.ngo(a)oracle.com>
nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client.
John Meneghini <jmeneghi(a)redhat.com>
scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe
Dan Carpenter <dan.carpenter(a)oracle.com>
ASoC: max9759: fix underflow in speaker_gain_control_put()
Jiasheng Jiang <jiasheng(a)iscas.ac.cn>
ASoC: cpcap: Check for NULL pointer after calling of_get_child_by_name
Miaoqian Lin <linmq006(a)gmail.com>
ASoC: fsl: Add missing error handling in pcm030_fabric_probe
Dan Carpenter <dan.carpenter(a)oracle.com>
drm/i915/overlay: Prevent divide by zero bugs in scaling
Yannick Vignon <yannick.vignon(a)nxp.com>
net: stmmac: ensure PTP time register reads are consistent
Lior Nahmanson <liorna(a)nvidia.com>
net: macsec: Verify that send_sci is on when setting Tx sci explicitly
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: Return meaningful error codes from the netlink helpers
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: ca8210: Stop leaking skb's
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: mcr20a: Fix lifs/sifs periods
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: hwsim: Ensure proper channel selection at probe time
Miaoqian Lin <linmq006(a)gmail.com>
spi: meson-spicc: add IRQ check in meson_spicc_probe
Benjamin Gaignard <benjamin.gaignard(a)collabora.com>
spi: mediatek: Avoid NULL pointer crash in interrupt
Kamal Dasu <kdasu.kdev(a)gmail.com>
spi: bcm-qspi: check for valid cs before applying chip select
Joerg Roedel <jroedel(a)suse.de>
iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()
Guoqing Jiang <guoqing.jiang(a)linux.dev>
iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping()
Leon Romanovsky <leonro(a)nvidia.com>
RDMA/mlx4: Don't continue event handler after memory allocation failure
Guenter Roeck <linux(a)roeck-us.net>
Revert "ASoC: mediatek: Check for error clk pointer"
Martin K. Petersen <martin.petersen(a)oracle.com>
block: bio-integrity: Advance seed correctly for larger interval sizes
Nick Lopez <github(a)glowingmonkey.org>
drm/nouveau: fix off by one in BIOS boundary checking
Christian Lachner <gladiac(a)gmail.com>
ALSA: hda/realtek: Fix silent output on Gigabyte X570 Aorus Xtreme after reboot from Windows
Christian Lachner <gladiac(a)gmail.com>
ALSA: hda/realtek: Fix silent output on Gigabyte X570S Aorus Master (newer chipset)
Christian Lachner <gladiac(a)gmail.com>
ALSA: hda/realtek: Add missing fixup-model entry for Gigabyte X570 ALC1220 quirks
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_xr_sx()
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_volsw_sx()
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_volsw()
Paul Moore <paul(a)paul-moore.com>
audit: improve audit queue handling when "audit=1" on cmdline
Eric Dumazet <edumazet(a)google.com>
af_packet: fix data-race in packet_setsockopt / packet_setsockopt
Eric Dumazet <edumazet(a)google.com>
rtnetlink: make sure to refresh master_dev/m_ops in __rtnl_newlink()
Shyam Sundar S K <Shyam-sundar.S-k(a)amd.com>
net: amd-xgbe: Fix skb data length underflow
Raju Rangoju <Raju.Rangoju(a)amd.com>
net: amd-xgbe: ensure to reset the tx_timer_active flag
Georgi Valkov <gvalkov(a)abv.bg>
ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
Eric Dumazet <edumazet(a)google.com>
tcp: fix possible socket leaks in internal pacing mode
Florian Westphal <fw(a)strlen.de>
netfilter: nat: limit port clash resolution attempts
Florian Westphal <fw(a)strlen.de>
netfilter: nat: remove l4 protocol port rovers
Eric Dumazet <edumazet(a)google.com>
ipv4: tcp: send zero IPID in SYNACK messages
Eric Dumazet <edumazet(a)google.com>
ipv4: raw: lock the socket in raw_bind()
Hangyu Hua <hbh25y(a)gmail.com>
yam: fix a memory leak in yam_siocdevprivate()
Sukadev Bhattiprolu <sukadev(a)linux.ibm.com>
ibmvnic: don't spin in tasklet
Sukadev Bhattiprolu <sukadev(a)linux.ibm.com>
ibmvnic: init ->running_cap_crqs early
Marek Behún <kabel(a)kernel.org>
phylib: fix potential use-after-free
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Ensure the server has an up to date ctime before renaming
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Ensure the server has an up to date ctime before hardlinking
Eric Dumazet <edumazet(a)google.com>
ipv6: annotate accesses to fn->fn_sernum
José Expósito <jose.exposito89(a)gmail.com>
drm/msm/dsi: invalid parameter check in msm_dsi_phy_enable
Xianting Tian <xianting.tian(a)linux.alibaba.com>
drm/msm: Fix wrong size calculation
Jianguo Wu <wujianguo(a)chinatelecom.cn>
net-procfs: show net devices bound packet types
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFSv4: nfs_atomic_open() can race when looking up a non-regular file
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFSv4: Handle case where the lookup of a directory fails
Guenter Roeck <linux(a)roeck-us.net>
hwmon: (lm90) Reduce maximum conversion rate for G781
Eric Dumazet <edumazet(a)google.com>
ipv4: avoid using shared IP generator for connected sockets
Xin Long <lucien.xin(a)gmail.com>
ping: fix the sk_bound_dev_if match in ping_lookup
Congyu Liu <liu3101(a)purdue.edu>
net: fix information leakage in /proc/net/ptype
Ido Schimmel <idosch(a)nvidia.com>
ipv6_tunnel: Rate limit warning messages
John Meneghini <jmeneghi(a)redhat.com>
scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put()
Matthias Kaehlcke <mka(a)chromium.org>
rpmsg: char: Fix race between the release of rpmsg_eptdev and cdev
Sujit Kautkar <sujitka(a)chromium.org>
rpmsg: char: Fix race between the release of rpmsg_ctrldev and cdev
Joe Damato <jdamato(a)fastly.com>
i40e: fix unsigned stat widths
Sylwester Dziedziuch <sylwesterx.dziedziuch(a)intel.com>
i40e: Fix queues reservation for XDP
Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
i40e: Fix issue when maximum queues is exceeded
Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
i40e: Increase delay to 1 s after global EMP reset
Christophe Leroy <christophe.leroy(a)csgroup.eu>
powerpc/32: Fix boot failure with GCC latent entropy plugin
Marek Behún <kabel(a)kernel.org>
net: sfp: ignore disabled SFP node
Badhri Jagan Sridharan <badhri(a)google.com>
usb: typec: tcpm: Do not disconnect while receiving VBUS off
Alan Stern <stern(a)rowland.harvard.edu>
USB: core: Fix hang in usb_kill_urb by adding memory barriers
Pavankumar Kondeti <quic_pkondeti(a)quicinc.com>
usb: gadget: f_sourcesink: Fix isoc transfer for USB_SPEED_SUPER_PLUS
Jon Hunter <jonathanh(a)nvidia.com>
usb: common: ulpi: Fix crash in ulpi_match()
Alan Stern <stern(a)rowland.harvard.edu>
usb-storage: Add unusual-devs entry for VL817 USB-SATA bridge
Cameron Williams <cang1(a)live.co.uk>
tty: Add support for Brainboxes UC cards.
daniel.starke(a)siemens.com <daniel.starke(a)siemens.com>
tty: n_gsm: fix SW flow control encoding/handling
Valentin Caron <valentin.caron(a)foss.st.com>
serial: stm32: fix software flow control transfer
Robert Hancock <robert.hancock(a)calian.com>
serial: 8250: of: Fix mapped region size when using reg-offset property
Pablo Neira Ayuso <pablo(a)netfilter.org>
netfilter: nft_payload: do not update layer 4 checksum when mangling fragments
Lucas Stach <l.stach(a)pengutronix.de>
drm/etnaviv: relax submit size limits
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
PM: wakeup: simplify the output logic of pm_show_wakelocks()
Jan Kara <jack(a)suse.cz>
udf: Fix NULL ptr deref when converting from inline format
Jan Kara <jack(a)suse.cz>
udf: Restore i_lenAlloc when inode expansion fails
Steffen Maier <maier(a)linux.ibm.com>
scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP devices
Vasily Gorbik <gor(a)linux.ibm.com>
s390/hypfs: include z/VM guests with access control group set
Brian Gix <brian.gix(a)intel.com>
Bluetooth: refactor malicious adv data check
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/kernel/Makefile | 1 +
arch/powerpc/lib/Makefile | 3 +
arch/s390/hypfs/hypfs_vm.c | 6 +-
block/bio-integrity.c | 2 +-
drivers/edac/altera_edac.c | 2 +-
drivers/edac/xgene_edac.c | 2 +-
drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 4 +-
drivers/gpu/drm/i915/intel_overlay.c | 3 +
drivers/gpu/drm/msm/dsi/phy/dsi_phy.c | 4 +-
drivers/gpu/drm/msm/msm_drv.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c | 2 +-
drivers/hwmon/lm90.c | 2 +-
drivers/infiniband/hw/mlx4/main.c | 2 +-
drivers/iommu/amd_iommu_init.c | 2 +
drivers/iommu/intel_irq_remapping.c | 13 ++-
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 14 ++-
drivers/net/ethernet/ibm/ibmvnic.c | 112 +++++++++++++--------
drivers/net/ethernet/intel/i40e/i40e.h | 9 +-
drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 2 +-
drivers/net/ethernet/intel/i40e/i40e_main.c | 44 ++++----
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 59 +++++++++++
.../net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c | 19 ++--
drivers/net/hamradio/yam.c | 4 +-
drivers/net/ieee802154/ca8210.c | 1 +
drivers/net/ieee802154/mac802154_hwsim.c | 1 +
drivers/net/ieee802154/mcr20a.c | 4 +-
drivers/net/macsec.c | 9 ++
drivers/net/phy/phy_device.c | 6 +-
drivers/net/phy/phylink.c | 5 +
drivers/net/usb/ipheth.c | 6 +-
drivers/rpmsg/rpmsg_char.c | 22 +---
drivers/rtc/rtc-mc146818-lib.c | 2 +-
drivers/s390/scsi/zfcp_fc.c | 13 ++-
drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 41 ++++----
drivers/soc/mediatek/mtk-scpsys.c | 15 +--
drivers/spi/spi-bcm-qspi.c | 2 +-
drivers/spi/spi-meson-spicc.c | 5 +
drivers/spi/spi-mt65xx.c | 2 +-
drivers/tty/n_gsm.c | 4 +-
drivers/tty/serial/8250/8250_of.c | 11 +-
drivers/tty/serial/8250/8250_pci.c | 100 +++++++++++++++++-
drivers/tty/serial/stm32-usart.c | 2 +-
drivers/usb/common/ulpi.c | 7 +-
drivers/usb/core/hcd.c | 14 +++
drivers/usb/core/urb.c | 12 +++
drivers/usb/gadget/function/f_sourcesink.c | 1 +
drivers/usb/storage/unusual_devs.h | 10 ++
drivers/usb/typec/tcpm.c | 3 +-
fs/ext4/inline.c | 10 +-
fs/nfs/dir.c | 22 ++++
fs/nfsd/nfs4state.c | 4 +-
fs/udf/inode.c | 9 +-
include/linux/netdevice.h | 1 +
include/net/ip.h | 21 ++--
include/net/ip6_fib.h | 2 +-
include/net/netfilter/nf_nat_l4proto.h | 2 +-
kernel/audit.c | 62 ++++++++----
kernel/power/wakelock.c | 12 +--
net/bluetooth/hci_event.c | 10 +-
net/core/net-procfs.c | 38 ++++++-
net/core/rtnetlink.c | 6 +-
net/ieee802154/nl802154.c | 8 +-
net/ipv4/ip_output.c | 11 +-
net/ipv4/ping.c | 3 +-
net/ipv4/raw.c | 5 +-
net/ipv4/tcp_output.c | 31 ++++--
net/ipv6/ip6_fib.c | 23 +++--
net/ipv6/ip6_tunnel.c | 8 +-
net/ipv6/route.c | 2 +-
net/netfilter/nf_nat_proto_common.c | 37 ++++---
net/netfilter/nf_nat_proto_dccp.c | 5 +-
net/netfilter/nf_nat_proto_sctp.c | 5 +-
net/netfilter/nf_nat_proto_tcp.c | 5 +-
net/netfilter/nf_nat_proto_udp.c | 10 +-
net/netfilter/nft_payload.c | 3 +
net/packet/af_packet.c | 10 +-
sound/pci/hda/patch_realtek.c | 5 +-
sound/soc/codecs/cpcap.c | 2 +
sound/soc/codecs/max9759.c | 3 +-
sound/soc/fsl/pcm030-audio-fabric.c | 11 +-
sound/soc/soc-ops.c | 29 +++++-
tools/testing/selftests/futex/Makefile | 4 +-
83 files changed, 731 insertions(+), 303 deletions(-)
From: Miquel Raynal <miquel.raynal(a)bootlin.com>
[ Upstream commit e5ce576d45bf72fd0e3dc37eff897bfcc488f6a9 ]
Upon error the ieee802154_xmit_complete() helper is not called. Only
ieee802154_wake_queue() is called manually. In the Tx case we then leak
the skb structure.
Free the skb structure upon error before returning when appropriate.
As the 'is_tx = 0' cannot be moved in the complete handler because of a
possible race between the delay in switching to STATE_RX_AACK_ON and a
new interrupt, we introduce an intermediate 'was_tx' boolean just for
this purpose.
There is no Fixes tag applying here, many changes have been made on this
area and the issue kind of always existed.
Suggested-by: Alexander Aring <alex.aring(a)gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Acked-by: Alexander Aring <aahringo(a)redhat.com>
Link: https://lore.kernel.org/r/20220125121426.848337-4-miquel.raynal@bootlin.com
Signed-off-by: Stefan Schmidt <stefan(a)datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ieee802154/at86rf230.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
index ce3b7fb7eda09..80c8e9abb402e 100644
--- a/drivers/net/ieee802154/at86rf230.c
+++ b/drivers/net/ieee802154/at86rf230.c
@@ -108,6 +108,7 @@ struct at86rf230_local {
unsigned long cal_timeout;
bool is_tx;
bool is_tx_from_off;
+ bool was_tx;
u8 tx_retry;
struct sk_buff *tx_skb;
struct at86rf230_state_change tx;
@@ -351,7 +352,11 @@ at86rf230_async_error_recover_complete(void *context)
if (ctx->free)
kfree(ctx);
- ieee802154_wake_queue(lp->hw);
+ if (lp->was_tx) {
+ lp->was_tx = 0;
+ dev_kfree_skb_any(lp->tx_skb);
+ ieee802154_wake_queue(lp->hw);
+ }
}
static void
@@ -360,7 +365,11 @@ at86rf230_async_error_recover(void *context)
struct at86rf230_state_change *ctx = context;
struct at86rf230_local *lp = ctx->lp;
- lp->is_tx = 0;
+ if (lp->is_tx) {
+ lp->was_tx = 1;
+ lp->is_tx = 0;
+ }
+
at86rf230_async_state_change(lp, ctx, STATE_RX_AACK_ON,
at86rf230_async_error_recover_complete);
}
--
2.34.1
From: Miquel Raynal <miquel.raynal(a)bootlin.com>
[ Upstream commit e5ce576d45bf72fd0e3dc37eff897bfcc488f6a9 ]
Upon error the ieee802154_xmit_complete() helper is not called. Only
ieee802154_wake_queue() is called manually. In the Tx case we then leak
the skb structure.
Free the skb structure upon error before returning when appropriate.
As the 'is_tx = 0' cannot be moved in the complete handler because of a
possible race between the delay in switching to STATE_RX_AACK_ON and a
new interrupt, we introduce an intermediate 'was_tx' boolean just for
this purpose.
There is no Fixes tag applying here, many changes have been made on this
area and the issue kind of always existed.
Suggested-by: Alexander Aring <alex.aring(a)gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Acked-by: Alexander Aring <aahringo(a)redhat.com>
Link: https://lore.kernel.org/r/20220125121426.848337-4-miquel.raynal@bootlin.com
Signed-off-by: Stefan Schmidt <stefan(a)datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ieee802154/at86rf230.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
index 5c48bdb6f6787..c2667c71a0cd1 100644
--- a/drivers/net/ieee802154/at86rf230.c
+++ b/drivers/net/ieee802154/at86rf230.c
@@ -108,6 +108,7 @@ struct at86rf230_local {
unsigned long cal_timeout;
bool is_tx;
bool is_tx_from_off;
+ bool was_tx;
u8 tx_retry;
struct sk_buff *tx_skb;
struct at86rf230_state_change tx;
@@ -351,7 +352,11 @@ at86rf230_async_error_recover_complete(void *context)
if (ctx->free)
kfree(ctx);
- ieee802154_wake_queue(lp->hw);
+ if (lp->was_tx) {
+ lp->was_tx = 0;
+ dev_kfree_skb_any(lp->tx_skb);
+ ieee802154_wake_queue(lp->hw);
+ }
}
static void
@@ -360,7 +365,11 @@ at86rf230_async_error_recover(void *context)
struct at86rf230_state_change *ctx = context;
struct at86rf230_local *lp = ctx->lp;
- lp->is_tx = 0;
+ if (lp->is_tx) {
+ lp->was_tx = 1;
+ lp->is_tx = 0;
+ }
+
at86rf230_async_state_change(lp, ctx, STATE_RX_AACK_ON,
at86rf230_async_error_recover_complete);
}
--
2.34.1
From: Miquel Raynal <miquel.raynal(a)bootlin.com>
[ Upstream commit e5ce576d45bf72fd0e3dc37eff897bfcc488f6a9 ]
Upon error the ieee802154_xmit_complete() helper is not called. Only
ieee802154_wake_queue() is called manually. In the Tx case we then leak
the skb structure.
Free the skb structure upon error before returning when appropriate.
As the 'is_tx = 0' cannot be moved in the complete handler because of a
possible race between the delay in switching to STATE_RX_AACK_ON and a
new interrupt, we introduce an intermediate 'was_tx' boolean just for
this purpose.
There is no Fixes tag applying here, many changes have been made on this
area and the issue kind of always existed.
Suggested-by: Alexander Aring <alex.aring(a)gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Acked-by: Alexander Aring <aahringo(a)redhat.com>
Link: https://lore.kernel.org/r/20220125121426.848337-4-miquel.raynal@bootlin.com
Signed-off-by: Stefan Schmidt <stefan(a)datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ieee802154/at86rf230.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
index 3d9e915798668..1bc09b6c308f8 100644
--- a/drivers/net/ieee802154/at86rf230.c
+++ b/drivers/net/ieee802154/at86rf230.c
@@ -108,6 +108,7 @@ struct at86rf230_local {
unsigned long cal_timeout;
bool is_tx;
bool is_tx_from_off;
+ bool was_tx;
u8 tx_retry;
struct sk_buff *tx_skb;
struct at86rf230_state_change tx;
@@ -351,7 +352,11 @@ at86rf230_async_error_recover_complete(void *context)
if (ctx->free)
kfree(ctx);
- ieee802154_wake_queue(lp->hw);
+ if (lp->was_tx) {
+ lp->was_tx = 0;
+ dev_kfree_skb_any(lp->tx_skb);
+ ieee802154_wake_queue(lp->hw);
+ }
}
static void
@@ -360,7 +365,11 @@ at86rf230_async_error_recover(void *context)
struct at86rf230_state_change *ctx = context;
struct at86rf230_local *lp = ctx->lp;
- lp->is_tx = 0;
+ if (lp->is_tx) {
+ lp->was_tx = 1;
+ lp->is_tx = 0;
+ }
+
at86rf230_async_state_change(lp, ctx, STATE_RX_AACK_ON,
at86rf230_async_error_recover_complete);
}
--
2.34.1
[ Upstream commit fd0e786d9d09024f67bd71ec094b110237dc3840 ]
This commit solves the problem of unmap kernel 1:1 pages
unconditionally, it appears in Linus's tree 4.16 and later
versions, and is backported to 4.14.x and 4.15.x stable branches.
But the backported patch has its logic reversed when calling
memory_failure() to determine whether it needs to unmap the
kernel page. Only when memory_failure() returns successfully,
the kernel page can be unmapped.
Signed-off-by: luofei <luofei(a)unicloud.com>
Cc: stable(a)vger.kernel.org #v4.14.x
Cc: stable(a)vger.kernel.org #v4.15.x
---
arch/x86/kernel/cpu/mcheck/mce.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 95c09db1bba2..d8399a689165 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -589,7 +589,7 @@ static int srao_decode_notifier(struct notifier_block *nb, unsigned long val,
if (mce_usable_address(mce) && (mce->severity == MCE_AO_SEVERITY)) {
pfn = mce->addr >> PAGE_SHIFT;
- if (memory_failure(pfn, MCE_VECTOR, 0))
+ if (!memory_failure(pfn, MCE_VECTOR, 0))
mce_unmap_kpfn(pfn);
}
--
2.27.0
On 2/7/22 23:04, Damien Le Moal wrote:
> Linus,
>
> The following changes since commit dfd42facf1e4ada021b939b4e19c935dcdd55566:
>
> Linux 5.17-rc3 (2022-02-06 12:20:50 -0800)
>
> are available in the Git repository at:
>
> ssh://git@gitolite.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata tags/ata-5.17-rc4
>
> for you to fetch changes up to fda17afc6166e975bec1197bd94cd2a3317bce3f:
>
> ata: libata-core: Fix ata_dev_config_cpr() (2022-02-07 22:38:02 +0900)
>
> ----------------------------------------------------------------
> ata fixes for 5.17-rc4
>
> A single patch in this pull request, from me, to fix a bug that is
> causing boot issues in the field (reports of problems with Fedora 35).
> The bug affects mostly old-ish drives that have issues with read log
> page command handling.
>
> ----------------------------------------------------------------
> Damien Le Moal (1):
> ata: libata-core: Fix ata_dev_config_cpr()
Greg,
Could you please pick-up this patch for 5.16 as soon as possible ?
The bug it fixes is affecting some Fedora users, among others.
The commit ID in Linus tree is:
commit fda17afc6166e975bec1197bd94cd2a3317bce3f
Author: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Mon Feb 7 11:27:53 2022 +0900
ata: libata-core: Fix ata_dev_config_cpr()
Thanks !
>
> drivers/ata/libata-core.c | 14 ++++++--------
> include/linux/ata.h | 2 +-
> 2 files changed, 7 insertions(+), 9 deletions(-)
--
Damien Le Moal
Western Digital Research
From: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
On big endian architectures the mhi debugfs files which report pm state
give "Invalid State" for all states. This is caused by using
find_last_bit which takes an unsigned long* while the state is passed in
as an enum mhi_pm_state which will be of int size.
Fix by using __fls to pass the value of state instead of find_last_bit.
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions")
Signed-off-by: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
Reviewed-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Hemant Kumar <hemantk(a)codeaurora.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
drivers/bus/mhi/core/init.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
index 046f407dc5d6..0d588b60929e 100644
--- a/drivers/bus/mhi/core/init.c
+++ b/drivers/bus/mhi/core/init.c
@@ -79,10 +79,12 @@ static const char * const mhi_pm_state_str[] = {
const char *to_mhi_pm_state_str(enum mhi_pm_state state)
{
- unsigned long pm_state = state;
- int index = find_last_bit(&pm_state, 32);
+ int index;
- if (index >= ARRAY_SIZE(mhi_pm_state_str))
+ if (state)
+ index = __fls(state);
+
+ if (!state || index >= ARRAY_SIZE(mhi_pm_state_str))
return "Invalid State";
return mhi_pm_state_str[index];
@@ -789,7 +791,6 @@ static int parse_ch_cfg(struct mhi_controller *mhi_cntrl,
mhi_chan->offload_ch = ch_cfg->offload_channel;
mhi_chan->db_cfg.reset_req = ch_cfg->doorbell_mode_switch;
mhi_chan->pre_alloc = ch_cfg->auto_queue;
- mhi_chan->wake_capable = ch_cfg->wake_capable;
/*
* If MHI host allocates buffers, then the channel direction
--
2.25.1
From: Oliver Hartkopp <socketcan(a)hartkopp.net>
When receiving a CAN frame the current code logic does not consider
concurrently receiving processes which do not show up in real world
usage.
Ziyang Xuan writes:
The following syz problem is one of the scenarios. so->rx.len is
changed by isotp_rcv_ff() during isotp_rcv_cf(), so->rx.len equals
0 before alloc_skb() and equals 4096 after alloc_skb(). That will
trigger skb_over_panic() in skb_put().
=======================================================
CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.16.0-rc8-syzkaller #0
RIP: 0010:skb_panic+0x16c/0x16e net/core/skbuff.c:113
Call Trace:
<TASK>
skb_over_panic net/core/skbuff.c:118 [inline]
skb_put.cold+0x24/0x24 net/core/skbuff.c:1990
isotp_rcv_cf net/can/isotp.c:570 [inline]
isotp_rcv+0xa38/0x1e30 net/can/isotp.c:668
deliver net/can/af_can.c:574 [inline]
can_rcv_filter+0x445/0x8d0 net/can/af_can.c:635
can_receive+0x31d/0x580 net/can/af_can.c:665
can_rcv+0x120/0x1c0 net/can/af_can.c:696
__netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5465
__netif_receive_skb+0x24/0x1b0 net/core/dev.c:5579
Therefore we make sure the state changes and data structures stay
consistent at CAN frame reception time by adding a spin_lock in
isotp_rcv(). This fixes the issue reported by syzkaller but does not
affect real world operation.
Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol")
Link: https://lore.kernel.org/linux-can/d7e69278-d741-c706-65e1-e87623d9a8e8@huaw…
Link: https://lore.kernel.org/all/20220208200026.13783-1-socketcan@hartkopp.net
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+4c63f36709a642f801c5(a)syzkaller.appspotmail.com
Reported-by: Ziyang Xuan <william.xuanziyang(a)huawei.com>
Signed-off-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
net/can/isotp.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 02cbcb2ecf0d..9149e8d8aefc 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -56,6 +56,7 @@
#include <linux/module.h>
#include <linux/init.h>
#include <linux/interrupt.h>
+#include <linux/spinlock.h>
#include <linux/hrtimer.h>
#include <linux/wait.h>
#include <linux/uio.h>
@@ -145,6 +146,7 @@ struct isotp_sock {
struct tpcon rx, tx;
struct list_head notifier;
wait_queue_head_t wait;
+ spinlock_t rx_lock; /* protect single thread state machine */
};
static LIST_HEAD(isotp_notifier_list);
@@ -615,11 +617,17 @@ static void isotp_rcv(struct sk_buff *skb, void *data)
n_pci_type = cf->data[ae] & 0xF0;
+ /* Make sure the state changes and data structures stay consistent at
+ * CAN frame reception time. This locking is not needed in real world
+ * use cases but the inconsistency can be triggered with syzkaller.
+ */
+ spin_lock(&so->rx_lock);
+
if (so->opt.flags & CAN_ISOTP_HALF_DUPLEX) {
/* check rx/tx path half duplex expectations */
if ((so->tx.state != ISOTP_IDLE && n_pci_type != N_PCI_FC) ||
(so->rx.state != ISOTP_IDLE && n_pci_type == N_PCI_FC))
- return;
+ goto out_unlock;
}
switch (n_pci_type) {
@@ -668,6 +676,9 @@ static void isotp_rcv(struct sk_buff *skb, void *data)
isotp_rcv_cf(sk, cf, ae, skb);
break;
}
+
+out_unlock:
+ spin_unlock(&so->rx_lock);
}
static void isotp_fill_dataframe(struct canfd_frame *cf, struct isotp_sock *so,
@@ -1444,6 +1455,7 @@ static int isotp_init(struct sock *sk)
so->txtimer.function = isotp_tx_timer_handler;
init_waitqueue_head(&so->wait);
+ spin_lock_init(&so->rx_lock);
spin_lock(&isotp_notifier_lock);
list_add_tail(&so->notifier, &isotp_notifier_list);
base-commit: 7db788ad627aabff2b74d4f1a3b68516d0fee0d7
--
2.34.1
From: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
On big endian architectures the mhi debugfs files which report pm state
give "Invalid State" for all states. This is caused by using
find_last_bit which takes an unsigned long* while the state is passed in
as an enum mhi_pm_state which will be of int size.
Fix by using __fls to pass the value of state instead of find_last_bit.
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions")
Signed-off-by: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
Reviewed-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Hemant Kumar <hemantk(a)codeaurora.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
drivers/bus/mhi/core/init.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
index 046f407dc5d6..0d588b60929e 100644
--- a/drivers/bus/mhi/core/init.c
+++ b/drivers/bus/mhi/core/init.c
@@ -79,10 +79,12 @@ static const char * const mhi_pm_state_str[] = {
const char *to_mhi_pm_state_str(enum mhi_pm_state state)
{
- unsigned long pm_state = state;
- int index = find_last_bit(&pm_state, 32);
+ int index;
- if (index >= ARRAY_SIZE(mhi_pm_state_str))
+ if (state)
+ index = __fls(state);
+
+ if (!state || index >= ARRAY_SIZE(mhi_pm_state_str))
return "Invalid State";
return mhi_pm_state_str[index];
@@ -789,7 +791,6 @@ static int parse_ch_cfg(struct mhi_controller *mhi_cntrl,
mhi_chan->offload_ch = ch_cfg->offload_channel;
mhi_chan->db_cfg.reset_req = ch_cfg->doorbell_mode_switch;
mhi_chan->pre_alloc = ch_cfg->auto_queue;
- mhi_chan->wake_capable = ch_cfg->wake_capable;
/*
* If MHI host allocates buffers, then the channel direction
--
2.25.1
This is the start of the stable review cycle for the 5.4.178 release.
There are 44 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 09 Feb 2022 10:37:42 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.178-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.178-rc1
Waiman Long <longman(a)redhat.com>
cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning
Ritesh Harjani <riteshh(a)linux.ibm.com>
ext4: fix error handling in ext4_restore_inline_data()
Sergey Shtylyov <s.shtylyov(a)omp.ru>
EDAC/xgene: Fix deferred probing
Sergey Shtylyov <s.shtylyov(a)omp.ru>
EDAC/altera: Fix deferred probing
Riwen Lu <luriwen(a)kylinos.cn>
rtc: cmos: Evaluate century appropriate
Muhammad Usama Anjum <usama.anjum(a)collabora.com>
selftests: futex: Use variable MAKE instead of make
Dai Ngo <dai.ngo(a)oracle.com>
nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client.
John Meneghini <jmeneghi(a)redhat.com>
scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe
Florian Fainelli <f.fainelli(a)gmail.com>
pinctrl: bcm2835: Fix a few error paths
Dan Carpenter <dan.carpenter(a)oracle.com>
ASoC: max9759: fix underflow in speaker_gain_control_put()
Jiasheng Jiang <jiasheng(a)iscas.ac.cn>
ASoC: cpcap: Check for NULL pointer after calling of_get_child_by_name
Robert Hancock <robert.hancock(a)calian.com>
ASoC: xilinx: xlnx_formatter_pcm: Make buffer bytes multiple of period bytes
Miaoqian Lin <linmq006(a)gmail.com>
ASoC: fsl: Add missing error handling in pcm030_fabric_probe
Dan Carpenter <dan.carpenter(a)oracle.com>
drm/i915/overlay: Prevent divide by zero bugs in scaling
Yannick Vignon <yannick.vignon(a)nxp.com>
net: stmmac: ensure PTP time register reads are consistent
Camel Guo <camelg(a)axis.com>
net: stmmac: dump gmac4 DMA registers correctly
Lior Nahmanson <liorna(a)nvidia.com>
net: macsec: Verify that send_sci is on when setting Tx sci explicitly
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: Return meaningful error codes from the netlink helpers
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: ca8210: Stop leaking skb's
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: mcr20a: Fix lifs/sifs periods
Miquel Raynal <miquel.raynal(a)bootlin.com>
net: ieee802154: hwsim: Ensure proper channel selection at probe time
Miaoqian Lin <linmq006(a)gmail.com>
spi: meson-spicc: add IRQ check in meson_spicc_probe
Benjamin Gaignard <benjamin.gaignard(a)collabora.com>
spi: mediatek: Avoid NULL pointer crash in interrupt
Kamal Dasu <kdasu.kdev(a)gmail.com>
spi: bcm-qspi: check for valid cs before applying chip select
Joerg Roedel <jroedel(a)suse.de>
iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()
Guoqing Jiang <guoqing.jiang(a)linux.dev>
iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping()
Leon Romanovsky <leonro(a)nvidia.com>
RDMA/mlx4: Don't continue event handler after memory allocation failure
Bernard Metzler <bmt(a)zurich.ibm.com>
RDMA/siw: Fix broken RDMA Read Fence/Resume logic.
Mike Marciniszyn <mike.marciniszyn(a)cornelisnetworks.com>
IB/rdmavt: Validate remote_addr during loopback atomic tests
Yutian Yang <nglaive(a)gmail.com>
memcg: charge fs_context and legacy_fs_context
Guenter Roeck <linux(a)roeck-us.net>
Revert "ASoC: mediatek: Check for error clk pointer"
Martin K. Petersen <martin.petersen(a)oracle.com>
block: bio-integrity: Advance seed correctly for larger interval sizes
Lang Yu <lang.yu(a)amd.com>
mm/kmemleak: avoid scanning potential huge holes
Nick Lopez <github(a)glowingmonkey.org>
drm/nouveau: fix off by one in BIOS boundary checking
Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
btrfs: fix deadlock between quota disable and qgroup rescan worker
Christian Lachner <gladiac(a)gmail.com>
ALSA: hda/realtek: Fix silent output on Gigabyte X570 Aorus Xtreme after reboot from Windows
Christian Lachner <gladiac(a)gmail.com>
ALSA: hda/realtek: Fix silent output on Gigabyte X570S Aorus Master (newer chipset)
Christian Lachner <gladiac(a)gmail.com>
ALSA: hda/realtek: Add missing fixup-model entry for Gigabyte X570 ALC1220 quirks
Albert Geantă <albertgeanta(a)gmail.com>
ALSA: hda/realtek: Add quirk for ASUS GU603
Takashi Iwai <tiwai(a)suse.de>
ALSA: usb-audio: Simplify quirk entries with a macro
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_xr_sx()
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_volsw_sx()
Mark Brown <broonie(a)kernel.org>
ASoC: ops: Reject out of bounds values in snd_soc_put_volsw()
Paul Moore <paul(a)paul-moore.com>
audit: improve audit queue handling when "audit=1" on cmdline
-------------
Diffstat:
Makefile | 4 +-
block/bio-integrity.c | 2 +-
drivers/edac/altera_edac.c | 2 +-
drivers/edac/xgene_edac.c | 2 +-
drivers/gpu/drm/i915/display/intel_overlay.c | 3 ++
drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c | 2 +-
drivers/infiniband/hw/mlx4/main.c | 2 +-
drivers/infiniband/sw/rdmavt/qp.c | 2 +
drivers/infiniband/sw/siw/siw.h | 7 +--
drivers/infiniband/sw/siw/siw_qp_rx.c | 20 +++----
drivers/iommu/amd_iommu_init.c | 2 +
drivers/iommu/intel_irq_remapping.c | 13 +++--
drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h | 1 +
.../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 19 ++++++-
.../net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c | 19 ++++---
drivers/net/ieee802154/ca8210.c | 1 +
drivers/net/ieee802154/mac802154_hwsim.c | 1 +
drivers/net/ieee802154/mcr20a.c | 4 +-
drivers/net/macsec.c | 9 ++++
drivers/pinctrl/bcm/pinctrl-bcm2835.c | 23 +++++---
drivers/rtc/rtc-mc146818-lib.c | 2 +-
drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 21 +++++---
drivers/soc/mediatek/mtk-scpsys.c | 15 ++----
drivers/spi/spi-bcm-qspi.c | 2 +-
drivers/spi/spi-meson-spicc.c | 5 ++
drivers/spi/spi-mt65xx.c | 2 +-
fs/btrfs/qgroup.c | 21 +++++++-
fs/ext4/inline.c | 10 +++-
fs/fs_context.c | 4 +-
fs/nfsd/nfs4state.c | 4 +-
kernel/audit.c | 62 +++++++++++++++-------
kernel/cgroup/cpuset.c | 10 ++++
mm/kmemleak.c | 13 ++---
net/ieee802154/nl802154.c | 8 +--
sound/pci/hda/patch_realtek.c | 6 ++-
sound/soc/codecs/cpcap.c | 2 +
sound/soc/codecs/max9759.c | 3 +-
sound/soc/fsl/pcm030-audio-fabric.c | 11 ++--
sound/soc/soc-ops.c | 29 ++++++++--
sound/soc/xilinx/xlnx_formatter_pcm.c | 27 ++++++++--
sound/usb/quirks-table.h | 10 ++++
tools/testing/selftests/futex/Makefile | 4 +-
42 files changed, 295 insertions(+), 114 deletions(-)